MS Azure Speech-to-Text Python start_continuous_recognition does not stop at the end of the stream
Stack Overflow user
Asked 2022-11-13 23:16:34
Answers: 1 · Views: 11 · Followers: 0 · Votes: 0

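I am using the MS Azure Speech-to-Text service with Python.

My input, data, is a byte string containing only a few seconds of audio. I expect the cloud service to stop processing the audio once the stream ends and to return the recognized text. Instead, it takes about 5 minutes before the recognized event fires.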

            speech_config = speechsdk.SpeechConfig(subscription=API_KEY,
                                                   region="westeurope",
                                                   speech_recognition_language='de-DE')
            stream = PushAudioInputStream(stream_format=
                                          AudioStreamFormat(samples_per_second=sample_rate, bits_per_sample=SAMPLE_WIDTH * 8,
                                                            compressed_stream_format=speechsdk.AudioStreamContainerFormat.FLAC))
            audio_input = speechsdk.AudioConfig(stream=stream)
            stream.write(data)
            speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_input)
            speech_recognizer.start_continuous_recognition()
            done = False

            def stop_recognition(evt):
                logger.debug("Stopped MS Azure recognition: %s", evt)
                nonlocal done
                done = True

            def recognized(evt):
                logger.info("Recognized MS Azure transcript: %s", evt)
                nonlocal text
                text += " " + evt.result.text

            speech_recognizer.recognizing.connect(lambda evt: print('RECOGNIZING: {}'.format(evt)))
            speech_recognizer.recognized.connect(lambda evt: print('RECOGNIZED: {}'.format(evt)))
            speech_recognizer.session_started.connect(lambda evt: print('SESSION STARTED: {}'.format(evt)))
            speech_recognizer.session_stopped.connect(lambda evt: print('SESSION STOPPED {}'.format(evt)))
            speech_recognizer.canceled.connect(lambda evt: print('CANCELED {}'.format(evt)))

            speech_recognizer.recognized.connect(recognized)
            speech_recognizer.session_stopped.connect(stop_recognition)
            speech_recognizer.canceled.connect(stop_recognition)
            while not done:
                time.sleep(.5)
            speech_recognizer.stop_continuous_recognition()

Instead, I see a 5-minute delay:

2022-11-13 23:58:19,504 - speech_processing.speech_recognition.speech_recognition - DEBUG - Sending 192000 bytes (6 sec) for recognition
RECOGNIZING: SpeechRecognitionEventArgs(session_id=2e4c92f4fed6498f8f5260199bdcc5d7, result=SpeechRecognitionResult(result_id=50e5c478cdc34e0a8ced3867be493bc3, text="telefon", reason=ResultReason.RecognizingSpeech))
RECOGNIZING: SpeechRecognitionEventArgs(session_id=2e4c92f4fed6498f8f5260199bdcc5d7, result=SpeechRecognitionResult(result_id=d1448833ac8f40ef9c1ebc4cae488bcd, text="telefonspeicher", reason=ResultReason.RecognizingSpeech))
RECOGNIZING: SpeechRecognitionEventArgs(session_id=2e4c92f4fed6498f8f5260199bdcc5d7, result=SpeechRecognitionResult(result_id=cdf9f074c13b4a2c94960ec147db765c, text="telefon speichere", reason=ResultReason.RecognizingSpeech))
RECOGNIZING: SpeechRecognitionEventArgs(session_id=2e4c92f4fed6498f8f5260199bdcc5d7, result=SpeechRecognitionResult(result_id=548133156bb44dc8ae08fd0848fa8ec5, text="telefon speichere als", reason=ResultReason.RecognizingSpeech))
RECOGNIZING: SpeechRecognitionEventArgs(session_id=2e4c92f4fed6498f8f5260199bdcc5d7, result=SpeechRecognitionResult(result_id=c03970619f1e42278b2a2ef19ee4f1fe, text="telefon speichere als bärbel", reason=ResultReason.RecognizingSpeech))
RECOGNIZING: SpeechRecognitionEventArgs(session_id=2e4c92f4fed6498f8f5260199bdcc5d7, result=SpeechRecognitionResult(result_id=ff5f6a18d1e4409cab2661582cb8a693, text="telefon speichere als bärbel 0", reason=ResultReason.RecognizingSpeech))
RECOGNIZING: SpeechRecognitionEventArgs(session_id=2e4c92f4fed6498f8f5260199bdcc5d7, result=SpeechRecognitionResult(result_id=a3cb8f82c62b4235abc2fea2696342f8, text="telefon speichere als bärbel 03", reason=ResultReason.RecognizingSpeech))
RECOGNIZING: SpeechRecognitionEventArgs(session_id=2e4c92f4fed6498f8f5260199bdcc5d7, result=SpeechRecognitionResult(result_id=cafc805031654aa4865a4fe1b742d1cd, text="telefon speichere als bärbel 038", reason=ResultReason.RecognizingSpeech))
RECOGNIZING: SpeechRecognitionEventArgs(session_id=2e4c92f4fed6498f8f5260199bdcc5d7, result=SpeechRecognitionResult(result_id=69782c9485244e3191846b924adb3807, text="telefon speichere als bärbel 0385", reason=ResultReason.RecognizingSpeech))
RECOGNIZED: SpeechRecognitionEventArgs(session_id=2e4c92f4fed6498f8f5260199bdcc5d7, result=SpeechRecognitionResult(result_id=9d92890d52d84b7f926a6977d6324ca1, text="Telefon speichere als Bärbel 0385.", reason=ResultReason.RecognizedSpeech))
2022-11-14 00:03:26,487 - speech_processing.speech_recognition.speech_recognition - INFO - Recognized MS Azure transcript: SpeechRecognitionEventArgs(session_id=2e4c92f4fed6498f8f5260199bdcc5d7, result=SpeechRecognitionResult(result_id=9d92890d52d84b7f926a6977d6324ca1, text="Telefon speichere als Bärbel 0385.", reason=ResultReason.RecognizedSpeech))
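
For reference, the size of the gap can be read off the first and last timestamped lines of the log above. The following is a minimal sketch using only the standard library; the variable names are illustrative, and the two timestamp strings are copied from the log.

```python
from datetime import datetime

# Format used by the logger lines in the output above.
LOG_FORMAT = "%Y-%m-%d %H:%M:%S,%f"

# Timestamps copied from the "Sending ... bytes" line and the
# "Recognized MS Azure transcript" line.
sent_at = datetime.strptime("2022-11-13 23:58:19,504", LOG_FORMAT)
recognized_at = datetime.strptime("2022-11-14 00:03:26,487", LOG_FORMAT)

delay = recognized_at - sent_at
delay_seconds = delay.total_seconds()
print(f"Delay until the transcript was logged: {delay_seconds:.3f} s")
```

That is roughly 307 seconds for about 6 seconds of audio, which matches the "about 5 minutes" described in the question.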

1 Answer

Stack Overflow user

Answered 2022-11-13 23:42:31

I found my mistake:

The stream has to be closed:

stream.write(data)
stream.close()
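
Closing the push stream is what tells the recognizer that no more audio will arrive. One way to make sure the close is never forgotten is a small helper that writes the bytes and closes the stream in a `finally` block. The sketch below is an illustration under stated assumptions, not part of the original answer: `push_and_close` and `chunk_size` are hypothetical names, and the helper relies only on the stream having `write(bytes)` and `close()` methods, as `PushAudioInputStream` does.

```python
def push_and_close(stream, data: bytes, chunk_size: int = 4096) -> int:
    """Write all of `data` to a push-style stream, then close it.

    `stream` is assumed to expose write(bytes) and close(), as
    PushAudioInputStream does. Returns the number of bytes written.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    written = 0
    try:
        # Feed the audio in fixed-size chunks (hypothetical default size).
        for start in range(0, len(data), chunk_size):
            chunk = data[start:start + chunk_size]
            stream.write(chunk)
            written += len(chunk)
    finally:
        # Closing marks the end of the audio, even if a write failed.
        stream.close()
    return written
```

In the code from the question, the call `stream.write(data)` would then become `push_and_close(stream, data)`.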
Votes: 1
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/74425524
