Pinyin Evaluation Mode

Last updated: 2024-11-15 17:59:23

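Evaluation mode description

Evaluation requirements: up to 30 pinyin syllables per request. The audio may be at most 60 seconds long.
Evaluation dimensions: returns word-level accuracy and word-level fluency, plus phoneme-level accuracy.
Evaluation features: tone detection and specified pronunciation.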

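Request parameters

Main request parameters:
Parameter    Type       Description
ref_text     String     Text to be evaluated, written as pinyin followed by a tone number. For example, shǎn is written as shan3 (shan, third tone).
eval_mode    Integer    Evaluation mode. 8: pinyin evaluation mode.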
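
The tone-number notation described for ref_text can be produced from tone-marked pinyin mechanically. The following is a minimal sketch, not part of the service: the function name `to_numbered_pinyin` is hypothetical, and how the service expects neutral-tone syllables or ü to be written is not stated on this page, so both are passed through unchanged here.

```python
import unicodedata

# Combining diacritics that mark the four Mandarin tones.
_TONE_MARKS = {
    "\u0304": "1",  # macron (first tone)
    "\u0301": "2",  # acute  (second tone)
    "\u030c": "3",  # caron  (third tone)
    "\u0300": "4",  # grave  (fourth tone)
}


def to_numbered_pinyin(syllable: str) -> str:
    """Convert one tone-marked syllable, e.g. 'shǎn', to numbered form 'shan3'."""
    tone = ""
    letters = []
    # NFD splits an accented vowel into the base letter plus combining marks.
    for ch in unicodedata.normalize("NFD", syllable):
        if ch in _TONE_MARKS:
            tone = _TONE_MARKS[ch]
        else:
            letters.append(ch)
    # Recompose so that a letter such as 'ü' stays a single code point.
    # A syllable without a tone mark is returned without a digit (assumption).
    return unicodedata.normalize("NFC", "".join(letters)) + tone


print(to_numbered_pinyin("shǎn"))  # shan3
```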
Request example
# The parameters below are the query parameters of the WebSocket connection URL, listed one per line, e.g. soe.cloud.tencent.com/soe/api/1306***?eval_mode=0&voice_format=1&...
server_engine_type=16k_zh
eval_mode=8
ref_text="shan3"
score_coeff=1.000000
voice_format=1
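
Since the parameters above are the expanded query string of the connection URL, they can be reassembled with a standard URL encoder. This is a sketch under assumptions: `build_query` is a hypothetical helper, the ref_text value is passed without the surrounding quotes shown in the listing, and any further parameters hidden behind the `&...` in the example URL (for instance authentication) are not covered.

```python
from urllib.parse import urlencode


def build_query(ref_text: str, eval_mode: int = 8) -> str:
    """Return the query string for the parameters shown in the request example."""
    params = {
        "server_engine_type": "16k_zh",
        "eval_mode": eval_mode,       # 8 = pinyin evaluation mode
        "ref_text": ref_text,         # percent-encoded by urlencode when needed
        "score_coeff": "1.000000",
        "voice_format": 1,
    }
    return urlencode(params)


print(build_query("shan3"))
# server_engine_type=16k_zh&eval_mode=8&ref_text=shan3&score_coeff=1.000000&voice_format=1
```

Encoding matters once ref_text carries command markup: braces, colons and equals signs are percent-encoded by `urlencode`.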

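Return results

Main return fields:
Parameter                        Type       Description
SuggestedScore                   Float      Suggested score
PronAccuracy                     Float      Overall accuracy
PronFluency                      Float      Overall fluency
Words.PronAccuracy               Float      Word accuracy
Words.PronFluency                Float      Word fluency
Words.MatchTag                   Integer    How the audio for the current word matches the text
Words.PhoneInfos.PronAccuracy    Float      Phoneme accuracy
Words.PhoneInfos.MatchTag        Integer    How the audio for the current phoneme matches the text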
Response example
{ "code": 0, "message": "06104bab-98fa-4b4b-ab9e-f1800fcaebf4_5", "voice_id": "06104bab-98fa-4b4b-ab9e-f1800fcaebf4", "result": { "SuggestedScore": 75.57745361328125, "PronAccuracy": 75.57745361328125, "PronFluency": 0.9382091164588928, "PronCompletion": 1, "Words": [ { "MemBeginTime": 400, "MemEndTime": 870, "PronAccuracy": 75.57745361328125, "PronFluency": 0.9382091164588928, "ReferenceWord": "", "Word": "shan3", "MatchTag": 0, "KeywordTag": 0, "PhoneInfos": [ { "MemBeginTime": 400, "MemEndTime": 650, "PronAccuracy": 94.27549743652344, "DetectedStress": false, "Phone": "sh", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 }, { "MemBeginTime": 650, "MemEndTime": 870, "PronAccuracy": 66.22842407226562, "DetectedStress": false, "Phone": "an3", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 } ], "Tone": { "Valid": false, "RefTone": -1, "HypothesisTone": -1 } } ], "SentenceId": -1, "RefTextId": -1, "KeyWordHits": null, "UnKeyWordHits": null }, "final": 1 }
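
A response shaped like the example above can be reduced to the scores listed in the return-fields table. The sketch below is illustrative only: `summarize` is a hypothetical helper, and treating any `code` other than 0 as a failure is an assumption based on the examples on this page.

```python
import json


def summarize(response_text: str) -> dict:
    """Extract the overall, word-level and phoneme-level scores from one response."""
    msg = json.loads(response_text)
    if msg.get("code") != 0:  # assumption: 0 means success
        raise ValueError(f"evaluation failed: {msg.get('message')}")
    result = msg["result"]
    return {
        "suggested_score": result["SuggestedScore"],
        "words": [
            {
                "word": w["Word"],
                "accuracy": w["PronAccuracy"],
                "fluency": w["PronFluency"],
                # One (phoneme, accuracy) pair per entry in PhoneInfos.
                "phones": [(p["Phone"], p["PronAccuracy"]) for p in w["PhoneInfos"]],
            }
            for w in result["Words"]
        ],
    }
```

For the example response, `summarize(...)["words"][0]["phones"]` would contain one pair for `sh` and one for `an3`, which makes it easy to see that the final `an3` scored lower than the initial.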

Tone evaluation

Pinyin evaluation supports tone detection. To enable it, prefix the pinyin with {::cmd{F_TDET=true}}. Only 4 pinyin syllables are supported.
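
Building the tone-detection ref_text is a matter of prepending the command to the numbered pinyin. The sketch below makes two assumptions that this page does not confirm: the 4-syllable limit is read as "at most 4", and several syllables are joined with a space (the `sep` parameter). The helper name `tone_ref_text` is hypothetical.

```python
from typing import Sequence

TONE_DETECT_CMD = "{::cmd{F_TDET=true}}"  # command prefix taken from this page
MAX_TONE_SYLLABLES = 4                    # limit stated on this page


def tone_ref_text(syllables: Sequence[str], sep: str = " ") -> str:
    """Prefix numbered-pinyin syllables with the tone-detection command.

    `sep` is an assumption: only a single-syllable example is documented,
    so the separator between several syllables is not confirmed.
    """
    if not 1 <= len(syllables) <= MAX_TONE_SYLLABLES:
        raise ValueError(
            f"expected 1 to {MAX_TONE_SYLLABLES} syllables, got {len(syllables)}"
        )
    return TONE_DETECT_CMD + sep.join(syllables)


print(tone_ref_text(["shan3"]))  # {::cmd{F_TDET=true}}shan3
```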

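Request parameters

Main request parameters:
Parameter    Type       Description
ref_text     String     Text to be evaluated, written as pinyin followed by a tone number. For example, shǎn is written as shan3 (shan, third tone).
eval_mode    Integer    Evaluation mode. 8: pinyin evaluation mode.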
Request example
# The parameters below are the query parameters of the WebSocket connection URL, listed one per line, e.g. soe.cloud.tencent.com/soe/api/1306***?eval_mode=0&voice_format=1&...
server_engine_type=16k_zh
eval_mode=8
ref_text="{::cmd{F_TDET=true}}shan3"
score_coeff=1.000000
voice_format=1

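Return results

Main return fields:
Parameter                          Type      Description
SuggestedScore                     Float     Suggested score
PronAccuracy                       Float     Overall accuracy
PronFluency                        Float     Overall fluency
Words.PronAccuracy                 Float     Word accuracy
Words.PronFluency                  Float     Word fluency
Words.PhoneInfos.PronAccuracy      Float     Phoneme accuracy
Words.PhoneInfos.Phone             String    Phoneme corresponding to the current audio data
Words.PhoneInfos.ReferencePhone    String    Phoneme corresponding to the current evaluated text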
Response example
{ "code": 0, "message": "cb626c1d-944c-4e2c-aba6-3f4c79a89abd_5", "voice_id": "cb626c1d-944c-4e2c-aba6-3f4c79a89abd", "result": { "SuggestedScore": 74.7906723022461, "PronAccuracy": 74.7906723022461, "PronFluency": 0.9533305168151855, "PronCompletion": 1, "Words": [ { "MemBeginTime": 460, "MemEndTime": 850, "PronAccuracy": 74.7906723022461, "PronFluency": 0.9533305168151855, "ReferenceWord": "shan3", "Word": "shan3", "MatchTag": 0, "KeywordTag": 0, "PhoneInfos": [ { "MemBeginTime": 460, "MemEndTime": 650, "PronAccuracy": 95.47496795654297, "DetectedStress": false, "Phone": "sh", "ReferencePhone": "sh", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 }, { "MemBeginTime": 650, "MemEndTime": 850, "PronAccuracy": 64.44853210449219, "DetectedStress": false, "Phone": "an3", "ReferencePhone": "an3", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 } ], "Tone": { "Valid": false, "RefTone": -1, "HypothesisTone": -1 } } ], "SentenceId": -1, "RefTextId": -1, "KeyWordHits": null, "UnKeyWordHits": null }, "final": 1 }
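
With tone detection enabled, each phoneme carries both `Phone` (from the audio) and `ReferencePhone` (from the text), so the two can be compared. The following sketch assumes, without confirmation from this page, that a mispronounced phoneme or tone shows up as a difference between these two strings; `phone_mismatches` is a hypothetical helper that takes the parsed `result` object.

```python
def phone_mismatches(result: dict) -> list:
    """List (word, reference_phone, detected_phone) where the two phonemes differ."""
    mismatches = []
    for word in result.get("Words", []):
        for phone in word.get("PhoneInfos", []):
            ref = phone.get("ReferencePhone", "")
            got = phone.get("Phone", "")
            # Skip entries with an empty ReferencePhone: there is nothing to compare.
            if ref and got != ref:
                mismatches.append((word.get("Word", ""), ref, got))
    return mismatches
```

For the example response above, every `Phone` equals its `ReferencePhone`, so the function would return an empty list.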

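Specified pronunciation

Specified pronunciation requires tone detection to be enabled. Use the form {::cmd{F_TDET=true}} Chinese characters {::pron{p1,p2..},{p3,p4..}..} to specify the pronunciation. The pronunciation unit is pinyin.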
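
The markup can be assembled from the text and one or more pronunciation groups. This is a sketch that only mirrors the documented template; the helper names `pron_tag` and `specified_pron_ref_text` are hypothetical, and what several groups mean to the engine is not explained on this page.

```python
from typing import Sequence

TONE_DETECT_CMD = "{::cmd{F_TDET=true}}"  # tone detection must be enabled


def pron_tag(groups: Sequence[Sequence[str]]) -> str:
    """Render groups such as [["p1", "p2"], ["p3", "p4"]] as {::pron{p1,p2},{p3,p4}}."""
    body = ",".join("{" + ",".join(group) + "}" for group in groups)
    return "{::pron" + body + "}"


def specified_pron_ref_text(text: str, groups: Sequence[Sequence[str]]) -> str:
    """Build a ref_text that places the pronunciation tag directly after `text`."""
    return TONE_DETECT_CMD + text + pron_tag(groups)


print(specified_pron_ref_text("清平乐", [["yue4"]]))
# {::cmd{F_TDET=true}}清平乐{::pron{yue4}}
```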

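Request parameters

Main request parameters:
Parameter    Type       Description
ref_text     String     Text to be evaluated. Use | to separate multiple branches.
eval_mode    Integer    Evaluation mode. 8: pinyin evaluation mode.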
Request example
# The parameters below are the query parameters of the WebSocket connection URL, listed one per line, e.g. soe.cloud.tencent.com/soe/api/1306***?eval_mode=0&voice_format=1&...
server_engine_type=16k_zh
eval_mode=8
ref_text="{::cmd{F_TDET=true}}清平乐{::pron{yue4}}"
score_coeff=1.000000
voice_format=1

Return results

Main return fields:
Parameter                                        Type       Description
SentenceInfoSet.Words.PronAccuracy               Float      Word accuracy
SentenceInfoSet.Words.PronFluency                Float      Word fluency
SentenceInfoSet.Words.MatchTag                   Integer    How the audio for the current word matches the text
SentenceInfoSet.Words.PhoneInfos.PronAccuracy    Float      Phoneme accuracy
SentenceInfoSet.Words.PhoneInfos.MatchTag        Integer    How the audio for the current phoneme matches the text
Response example
{ "code": 0, "message": "a560bdc1-9cca-4d0b-b470-91e7e3e7c27e_6", "voice_id": "a560bdc1-9cca-4d0b-b470-91e7e3e7c27e", "result": { "SuggestedScore": 99.31161499023438, "PronAccuracy": 99.31161499023438, "PronFluency": 0.9807323813438416, "PronCompletion": 1, "Words": [ { "MemBeginTime": 280, "MemEndTime": 590, "PronAccuracy": 99.23593139648438, "PronFluency": 0.9863236546516418, "ReferenceWord": "qing1", "Word": "清", "MatchTag": 0, "KeywordTag": 0, "PhoneInfos": [ { "MemBeginTime": 280, "MemEndTime": 470, "PronAccuracy": 99.28839111328125, "DetectedStress": false, "Phone": "q", "ReferencePhone": "q", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 }, { "MemBeginTime": 470, "MemEndTime": 590, "PronAccuracy": 99.20970153808594, "DetectedStress": false, "Phone": "ing1", "ReferencePhone": "ing1", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 } ], "Tone": { "Valid": false, "RefTone": -1, "HypothesisTone": -1 } }, { "MemBeginTime": 590, "MemEndTime": 780, "PronAccuracy": 99.42112731933594, "PronFluency": 1, "ReferenceWord": "ping2", "Word": "平", "MatchTag": 0, "KeywordTag": 0, "PhoneInfos": [ { "MemBeginTime": 590, "MemEndTime": 650, "PronAccuracy": 99.52622985839844, "DetectedStress": false, "Phone": "p", "ReferencePhone": "p", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 }, { "MemBeginTime": 650, "MemEndTime": 780, "PronAccuracy": 99.36857604980469, "DetectedStress": false, "Phone": "ing2", "ReferencePhone": "ing2", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 } ], "Tone": { "Valid": false, "RefTone": -1, "HypothesisTone": -1 } }, { "MemBeginTime": 780, "MemEndTime": 1190, "PronAccuracy": 99.27776336669922, "PronFluency": 0.9558734893798828, "ReferenceWord": "yue4", "Word": "乐", "MatchTag": 0, "KeywordTag": 0, "PhoneInfos": [ { "MemBeginTime": 780, "MemEndTime": 830, "PronAccuracy": 99.16903686523438, "DetectedStress": false, "Phone": "y", "ReferencePhone": "y", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 }, { "MemBeginTime": 830, "MemEndTime": 1190, "PronAccuracy": 99.3321304321289, "DetectedStress": false, "Phone": "ve4", "ReferencePhone": "ve4", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 } ], "Tone": { "Valid": false, "RefTone": -1, "HypothesisTone": -1 } } ], "SentenceId": -1, "RefTextId": -1, "KeyWordHits": null, "UnKeyWordHits": null }, "final": 1 }
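
In the example response, each character appears in `Word` and the pinyin it was scored against appears in `ReferenceWord`, so a client can check that a specified pronunciation took effect. The sketch below relies on that reading of the example (the field is not described in the table above), and the helper names are hypothetical.

```python
def reference_readings(result: dict) -> list:
    """Pair each evaluated character with the pinyin in its ReferenceWord field."""
    return [(w["Word"], w["ReferenceWord"]) for w in result["Words"]]


def reading_applied(result: dict, char: str, expected: str) -> bool:
    """Return True if `char` was scored against the pinyin `expected`."""
    return any(
        word == char and ref == expected
        for word, ref in reference_readings(result)
    )
```

Applied to the `result` object of the example, `reading_applied(result, "乐", "yue4")` would be true, reflecting the {::pron{yue4}} tag in the request.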