I can generate a WAV file of "Mary Had a Little Lamb" using the code below. But when I try to generate an MP3, it fails.
#https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/get-started-text-to-speech?tabs=script%2Cwindowsinstall&pivots=programming-language-python
import azure.cognitiveservices.speech as speechsdk
languageCode = 'en-US'
ssmlGender = 'MALE'
voicName = 'en-US-JennyNeural'
speakingRate = '-5%'
pitch = '-10%'
voiceStyle = 'newscast'
azureKey = 'FAKE KEY'
azureRegion = 'FAKE REGION'
#############################################################
#audioOuputFile = './audioFiles/test.wav'
audioOuputFile = './audioFiles/test.mp3'
#############################################################
txt = 'Mary had a little lamb it\'s fleece was white as snow.'
txt+= 'And everywhere that Mary went, the lamb was sure to go,'
txt+= 'It followed her to school one day,'
txt+= 'That was against the rule,'
txt+= 'It made the children laugh and play,'
txt+= 'To see a lamb at school.'
head1 = f'<speak xmlns="http://www.w3.org/2001/10/synthesis" xmlns:mstts="http://www.w3.org/2001/mstts" xmlns:emo="http://www.w3.org/2009/10/emotionml" version="1.0" xml:lang="{languageCode}">'
head2 = f'<voice name="{voicName}">'
head3 =f'<mstts:express-as style="{voiceStyle}">'
head4 = f'<prosody rate="{speakingRate}" pitch="{pitch}">'
tail= '</prosody></mstts:express-as></voice></speak>'
ssml = head1 + head2 + head3 + head4 + txt + tail
print('this is the ssml======================================')
print(ssml)
print('end ssml======================================')
print()
speech_config = speechsdk.SpeechConfig(subscription=azureKey, region=azureRegion)
audio_config = speechsdk.AudioConfig(filename=audioOuputFile)
#HERE IS THE PROBLEM
#Without this statement everything works fine
#Can produce a wav file
speech_config.set_speech_synthesis_output_format(SpeechSynthesisOutputFormat["Audio16Khz128KBitRateMonoMp3"])
synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=audio_config)
synthesizer.speak_ssml_async(ssml)
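Here is the console output:
(envo) D:\py_new\tts>python ttsTest3.py
this is the ssml======================================
<speak xmlns="http://www.w3.org/2001/10/synthesis" xmlns:mstts="http://www.w3.org/2001/mstts" xmlns:emo="http://www.w3.org/2009/10/emotionml" version="1.0" xml:lang="en-US"><voice name="en-US-JennyNeural"><mstts:express-as style="newscast"><prosody rate="-5%" pitch="-10%">Mary had a little lamb it's fleece was white as snow.And everywhere that Mary went, the lamb was sure to go,It followed her to school one day,That was against the rule,It made the children laugh and play,To see a lamb at school.</prosody></mstts:express-as></voice></speak>
end ssml======================================

Traceback (most recent call last):
  File "D:\py_new\tts\ttsTest3.py", line 45, in <module>
    speech_config.set_speech_synthesis_output_format(SpeechSynthesisOutputFormat["Audio16Khz128KBitRateMonoMp3"])
NameError: name 'SpeechSynthesisOutputFormat' is not defined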
(envo) D:\py_new\tts>
Note the error: NameError: name 'SpeechSynthesisOutputFormat' is not defined
Compare with: Customize audio format
at:
In Node.js this all works fine, but I also need to be able to do it in Python.
Posted on 2022-03-20 13:37:25
Try this:
speech_config.set_speech_synthesis_output_format(speechsdk.SpeechSynthesisOutputFormat.Audio16Khz32KBitRateMonoMp3)
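For anyone wondering why the qualified name fixes the NameError: `import azure.cognitiveservices.speech as speechsdk` binds only the alias `speechsdk` in the script, so `SpeechSynthesisOutputFormat` has to be reached through that alias. The sketch below shows the same mechanism with a standard-library stand-in (`http.HTTPStatus` plays the role of the SDK enum, purely for illustration, so it runs without an Azure key). It also assumes the SDK's `SpeechSynthesisOutputFormat` is an ordinary Python `Enum`, which I believe it is; in that case the bracket lookup by name from the question would work too, once it is qualified with `speechsdk.`.

```python
import http as h  # stand-in for: import azure.cognitiveservices.speech as speechsdk

# Only the alias `h` is bound in this module. Names defined inside the
# package (here HTTPStatus) are reachable only through that alias.
try:
    HTTPStatus["OK"]  # bare name, like SpeechSynthesisOutputFormat[...] in the question
    bare_name_works = True
except NameError:
    bare_name_works = False

# Qualified access works, both as an attribute and as a lookup by member name.
by_attribute = h.HTTPStatus.OK
by_name = h.HTTPStatus["OK"]
same_member = by_attribute is by_name

print(bare_name_works, same_member)
```

An alternative, if you prefer the bare name, is to import it explicitly with `from azure.cognitiveservices.speech import SpeechSynthesisOutputFormat`.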
Posted on 2022-01-08 18:07:38
You need to configure the audio as shown below:
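# Imports inferred from the names used below.
from azure.cognitiveservices.speech import SpeechConfig, SpeechSynthesizer
from azure.cognitiveservices.speech.audio import AudioOutputConfig

# SPEECH_KEY, LOCATION_AREA, LANGUAGE_LOCATION_HINDI and MALE_VOICE_NAME_HINDI
# are assumed to be defined elsewhere.
def hindi_text_to_speech_azure(hindi_text):
    speech_config = SpeechConfig(subscription=SPEECH_KEY, region=LOCATION_AREA)
    # Note: if only language is set, the default voice of that language is chosen.
    speech_config.speech_synthesis_language = LANGUAGE_LOCATION_HINDI  # e.g. "de-DE"
    # The voice setting will overwrite language setting.
    # The voice setting will not overwrite the voice element in input SSML.
    speech_config.speech_synthesis_voice_name = MALE_VOICE_NAME_HINDI
    audio_config = AudioOutputConfig(
        filename="{name}.mp3".format(name=hindi_text[:30]))
    synthesizer = SpeechSynthesizer(
        speech_config=speech_config, audio_config=audio_config)
    # .get() blocks until synthesis has finished, so the file is complete on return.
    synthesizer.speak_text_async(hindi_text).get()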
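One caveat with the function above, offered as a hedged note since I have not run it: as far as I know, the file extension does not select the encoding. The encoding comes from `set_speech_synthesis_output_format`, and without that call the SDK writes RIFF/WAV data by default, even into a file named `.mp3`. A quick way to check what actually landed on disk is to look at the first few bytes. The helper names below are my own, and the check is only a heuristic, not a full parser.

```python
def sniff_audio_container(data: bytes) -> str:
    """Guess the audio container from the leading bytes (heuristic only)."""
    # WAV files start with a RIFF chunk whose form type is "WAVE".
    if data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "wav"
    # MP3 files may start with an ID3v2 tag ...
    if data[:3] == b"ID3":
        return "mp3"
    # ... or directly with an MPEG frame header: 11 sync bits set,
    # and layer bits 01, which means Layer III.
    if (len(data) >= 2 and data[0] == 0xFF
            and (data[1] & 0xE0) == 0xE0 and (data[1] & 0x06) == 0x02):
        return "mp3"
    return "unknown"


def sniff_audio_file(path: str) -> str:
    """Read just enough of the file to classify it."""
    with open(path, "rb") as f:
        return sniff_audio_container(f.read(12))
```

If `sniff_audio_file("test.mp3")` returns `"wav"`, the output format was most likely never set on the `SpeechConfig`.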
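Try this.
One thing, though it is not really a problem, is where I am stuck: the file is saved locally, but I want to upload it to a server (default storage) instead of local storage. Any ideas?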
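On the upload question, one possible approach (a sketch I have not run against the service) is to skip the file entirely: pass `audio_config=None` to the synthesizer and read the bytes from `result.audio_data`, then hand them to whatever upload call your server side uses. The function names `synthesize_to_bytes` and `as_upload_stream` are my own, and the storage API is deliberately left out because the post does not say which one is in use.

```python
import io


def synthesize_to_bytes(text, key, region):
    """Synthesize `text` to MP3 bytes in memory (requires the Azure Speech SDK)."""
    # Imported here so this module still loads where the SDK is not installed.
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(subscription=key, region=region)
    speech_config.set_speech_synthesis_output_format(
        speechsdk.SpeechSynthesisOutputFormat.Audio16Khz32KBitRateMonoMp3)
    # audio_config=None: no file or speaker output, audio stays in the result.
    synthesizer = speechsdk.SpeechSynthesizer(
        speech_config=speech_config, audio_config=None)
    result = synthesizer.speak_text_async(text).get()
    if result.reason != speechsdk.ResultReason.SynthesizingAudioCompleted:
        raise RuntimeError(f"synthesis did not complete: {result.reason}")
    return result.audio_data


def as_upload_stream(audio_bytes, filename):
    """Wrap bytes in a file-like object, since many upload APIs expect one."""
    stream = io.BytesIO(audio_bytes)
    stream.name = filename  # some clients read the file name from this attribute
    return stream
```

Usage would look roughly like `stream = as_upload_stream(synthesize_to_bytes(text, key, region), "speech.mp3")`, followed by your own upload call with `stream`.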
https://stackoverflow.com/questions/70629234