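Evaluation mode description
Evaluation requirements: multiple groups of evaluation text are supported, each group no longer than 30 characters. Audio data may be at most 60 seconds long.
Evaluation dimensions: word-level accuracy, fluency, and completeness can be returned; phoneme-level accuracy can be returned.
Evaluation features: multiple text groups, phoneme-to-letter mapping, phoneme-to-IPA conversion.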
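Request parameters
Main request parameters:
Parameter | Type | Description |
ref_text | String | Text to be evaluated. Use the "|" character to split it into multiple branches |
eval_mode | Integer | Evaluation mode. 6: sentence multi-branch evaluation mode |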
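The "|" separator and the per-group length limit can be checked on the client before a request is sent. The sketch below is only an illustration: the function names are hypothetical, it assumes spaces around "|" are not significant, and it assumes the 30-unit limit counts characters (pass a different `measure` if the service counts words instead).

```python
def split_branches(ref_text: str) -> list[str]:
    """Split a multi-branch ref_text on "|" and trim surrounding spaces."""
    return [branch.strip() for branch in ref_text.split("|")]


def oversized_branches(ref_text: str, max_len: int = 30, measure=len) -> list[str]:
    """Return the branches whose measured length exceeds max_len.

    Assumption: the limit is counted in characters (measure=len).
    """
    return [b for b in split_branches(ref_text) if measure(b) > max_len]


ref_text = "i go to school by bus | i go to school by train | i go to school by car"
print(split_branches(ref_text))
print(oversized_branches(ref_text))  # an empty list means every branch fits
```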
Request example
# The parameters below are the WebSocket connection URL written out one per line, e.g. soe.cloud.tencent.com/soe/api/1306***?eval_mode=0&voice_format=1&...
server_engine_type=16k_en
eval_mode=6
# Streaming is recommended for long audio
rec_mode=0
# Show intermediate results while streaming
sentence_info_enabled=1
ref_text="i go to school by bus | i go to school by train | i go to school by car"
score_coeff=1.000000
voice_format=1
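A minimal sketch of how the parameters in this example could be assembled into a URL query string. `build_query` is a hypothetical helper, not part of any SDK. It covers only the parameters shown here; whatever the elided "..." part of the URL contains is not reproduced. It also assumes the quotes around ref_text in the example are for display only.

```python
from urllib.parse import parse_qs, quote, urlencode


def build_query(branches: list[str], **overrides) -> str:
    """Build the query string for a multi-branch request (hypothetical helper)."""
    params = {
        "server_engine_type": "16k_en",
        "eval_mode": 6,                       # sentence multi-branch evaluation mode
        "rec_mode": 0,                        # streaming
        "sentence_info_enabled": 1,           # show intermediate results
        "ref_text": " | ".join(branches),     # branches separated by "|"
        "score_coeff": "1.000000",
        "voice_format": 1,
    }
    params.update(overrides)
    # quote_via=quote percent-encodes spaces as %20 and "|" as %7C
    return urlencode(params, quote_via=quote)


query = build_query([
    "i go to school by bus",
    "i go to school by train",
    "i go to school by car",
])
print(query)
print(parse_qs(query)["ref_text"][0])  # decodes back to the original ref_text
```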
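Response fields
Main response fields:
Parameter | Type | Description |
SuggestedScore | Float | Suggested score |
PronAccuracy | Float | Overall accuracy |
PronFluency | Float | Overall fluency |
PronCompletion | Float | Overall completeness |
Words.PronAccuracy | Float | Word accuracy |
Words.PronFluency | Float | Word fluency |
Words.MatchTag | Integer | How the audio of the current word matches the text |
Words.PhoneInfos.PronAccuracy | Float | Phoneme accuracy |
Words.PhoneInfos.MatchTag | Integer | How the audio of the current phoneme matches the text |
RefTextId | Integer | Index of the matched candidate text |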
Response example
{ "code": 0, "message": "7ed2d079-1e81-4bcc-be4b-45078d10c77c_11", "voice_id": "7ed2d079-1e81-4bcc-be4b-45078d10c77c", "result": { "SuggestedScore": 98.80253601074219, "PronAccuracy": 98.80253601074219, "PronFluency": 0.9656875133514404, "PronCompletion": 1, "Words": [ { "MemBeginTime": 210, "MemEndTime": 350, "PronAccuracy": 96.01470947265625, "PronFluency": 0.9704024791717529, "ReferenceWord": "i_0", "Word": "i", "MatchTag": 0, "KeywordTag": 0, "PhoneInfos": [ { "MemBeginTime": 210, "MemEndTime": 350, "PronAccuracy": 96.01470947265625, "DetectedStress": false, "Phone": "ay", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 } ], "Tone": { "Valid": false, "RefTone": -1, "HypothesisTone": -1 } }, { "MemBeginTime": 350, "MemEndTime": 530, "PronAccuracy": 99.027587890625, "PronFluency": 0.9832581281661987, "ReferenceWord": "go_1", "Word": "go", "MatchTag": 0, "KeywordTag": 0, "PhoneInfos": [ { "MemBeginTime": 350, "MemEndTime": 430, "PronAccuracy": 99.00126647949219, "DetectedStress": false, "Phone": "g", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 }, { "MemBeginTime": 430, "MemEndTime": 530, "PronAccuracy": 99.05391693115234, "DetectedStress": false, "Phone": "ow", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 } ], "Tone": { "Valid": false, "RefTone": -1, "HypothesisTone": -1 } }, { "MemBeginTime": 530, "MemEndTime": 630, "PronAccuracy": 98.93866729736328, "PronFluency": 1, "ReferenceWord": "to_2", "Word": "to", "MatchTag": 0, "KeywordTag": 0, "PhoneInfos": [ { "MemBeginTime": 530, "MemEndTime": 590, "PronAccuracy": 98.98005676269531, "DetectedStress": false, "Phone": "t", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 }, { "MemBeginTime": 590, "MemEndTime": 630, "PronAccuracy": 98.89727783203125, "DetectedStress": false, "Phone": "ah", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 } ], "Tone": { "Valid": false, 
"RefTone": -1, "HypothesisTone": -1 } }, { "MemBeginTime": 630, "MemEndTime": 990, "PronAccuracy": 99.04634094238281, "PronFluency": 0.9832895994186401, "ReferenceWord": "school_3", "Word": "school", "MatchTag": 0, "KeywordTag": 0, "PhoneInfos": [ { "MemBeginTime": 630, "MemEndTime": 730, "PronAccuracy": 99.14324951171875, "DetectedStress": false, "Phone": "s", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 }, { "MemBeginTime": 730, "MemEndTime": 810, "PronAccuracy": 99.23045349121094, "DetectedStress": false, "Phone": "k", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 }, { "MemBeginTime": 810, "MemEndTime": 910, "PronAccuracy": 99.08352661132812, "DetectedStress": false, "Phone": "uw", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 }, { "MemBeginTime": 910, "MemEndTime": 990, "PronAccuracy": 98.7281265258789, "DetectedStress": false, "Phone": "l", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 } ], "Tone": { "Valid": false, "RefTone": -1, "HypothesisTone": -1 } }, { "MemBeginTime": 990, "MemEndTime": 1150, "PronAccuracy": 99.0534439086914, "PronFluency": 0.9959675073623657, "ReferenceWord": "by_4", "Word": "by", "MatchTag": 0, "KeywordTag": 0, "PhoneInfos": [ { "MemBeginTime": 990, "MemEndTime": 1050, "PronAccuracy": 98.91246795654297, "DetectedStress": false, "Phone": "b", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 }, { "MemBeginTime": 1050, "MemEndTime": 1150, "PronAccuracy": 99.19442749023438, "DetectedStress": false, "Phone": "ay", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 } ], "Tone": { "Valid": false, "RefTone": -1, "HypothesisTone": -1 } }, { "MemBeginTime": 1150, "MemEndTime": 1730, "PronAccuracy": 98.99871063232422, "PronFluency": 0.8612076640129089, "ReferenceWord": "bus_5", "Word": "bus", "MatchTag": 0, "KeywordTag": 0, "PhoneInfos": [ { "MemBeginTime": 1150, "MemEndTime": 
1250, "PronAccuracy": 98.90721893310547, "DetectedStress": false, "Phone": "b", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 }, { "MemBeginTime": 1250, "MemEndTime": 1470, "PronAccuracy": 99.09923553466797, "DetectedStress": false, "Phone": "ah", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 }, { "MemBeginTime": 1470, "MemEndTime": 1730, "PronAccuracy": 98.98968505859375, "DetectedStress": false, "Phone": "s", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 } ], "Tone": { "Valid": false, "RefTone": -1, "HypothesisTone": -1 } } ], "SentenceId": 0, "RefTextId": 0, "KeyWordHits": null, "UnKeyWordHits": null }, "final": 1 }
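A response like the one above could be read as sketched below, to find which branch was matched and collect the word-level scores. `summarize` is a hypothetical helper. It assumes that `code` 0 indicates success and that `RefTextId` is a zero-based index into the branches in the order they appear in ref_text; the example, where `RefTextId` is 0 and the recognized words end in "bus", is consistent with that but does not prove it. The sample message is trimmed from the example to a single word.

```python
import json


def summarize(message: str, branches: list[str]) -> dict:
    """Extract the matched branch and word scores from one result message."""
    payload = json.loads(message)
    if payload.get("code") != 0:  # assumption: 0 means success
        raise RuntimeError(f"evaluation failed: {payload.get('message')}")
    result = payload["result"]
    return {
        # assumption: RefTextId indexes the branches from zero
        "matched_text": branches[result["RefTextId"]],
        "suggested_score": result["SuggestedScore"],
        "words": [
            (w["Word"], w["PronAccuracy"], w["MatchTag"])
            for w in result["Words"]
        ],
    }


# Trimmed copy of the response example (first word only).
sample = (
    '{"code": 0, "result": {"SuggestedScore": 98.80253601074219, '
    '"Words": [{"Word": "i", "PronAccuracy": 96.01470947265625, "MatchTag": 0}], '
    '"RefTextId": 0}, "final": 1}'
)
branches = ["i go to school by bus", "i go to school by train", "i go to school by car"]
summary = summarize(sample, branches)
print(summary["matched_text"], summary["suggested_score"])
```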
Phoneme-to-letter mapping
This feature marks the letters that each phoneme maps to. Phoneme-to-letter mapping structure: {::cmd{F_P2L=true}} + evaluation text.
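The "command + evaluation text" structure is plain string concatenation, which can be sketched as follows. `with_command` is a hypothetical helper name. It only builds the prefix format given in this document and makes no claim about whether several commands can be combined in one ref_text.

```python
def with_command(flag: str, text: str) -> str:
    """Prefix the evaluation text with a {::cmd{FLAG=true}} command."""
    return "{::cmd{" + flag + "=true}}" + text


ref_text = with_command("F_P2L", "I would like to eat an orange")
print(ref_text)
```

The same helper produces the phoneme-to-IPA form when called with "F_IPA".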
Request parameters
Main request parameters:
Parameter | Type | Description |
ref_text | String | Text to be evaluated |
eval_mode | Integer | Evaluation mode. 1: sentence evaluation mode |
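Request example
# The parameters below are the WebSocket connection URL written out one per line, e.g. soe.cloud.tencent.com/soe/api/1306***?eval_mode=0&voice_format=1&...
server_engine_type=16k_en
eval_mode=6
# Streaming is recommended for long audio
rec_mode=0
# Show intermediate results while streaming
sentence_info_enabled=1
ref_text="{::cmd{F_P2L=true}}I would like to eat an orange"
score_coeff=1.000000
voice_format=1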
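Response fields
Main response fields:
Parameter | Type | Description |
Words.Word | String | Current word |
Words.PhoneInfos.Phone | String | Phoneme of the current word |
Words.PhoneInfos.ReferenceLetter | String | Letter that the phoneme of the current word maps to |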
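The three fields in this table can be collected per word as sketched below. `phone_letter_pairs` is a hypothetical helper. The input is trimmed from the multi-branch response example earlier in this document, a request that did not use F_P2L and in which `ReferenceLetter` is an empty string, so the output here shows empty letters.

```python
def phone_letter_pairs(result: dict) -> list[tuple[str, list[tuple[str, str]]]]:
    """Return (word, [(phone, reference_letter), ...]) for each word in a result."""
    pairs = []
    for word in result.get("Words", []):
        phones = [
            (p["Phone"], p.get("ReferenceLetter", ""))
            for p in word.get("PhoneInfos", [])
        ]
        pairs.append((word["Word"], phones))
    return pairs


# Trimmed from the earlier multi-branch response example (word "go").
result = {
    "Words": [
        {
            "Word": "go",
            "PhoneInfos": [
                {"Phone": "g", "ReferenceLetter": ""},
                {"Phone": "ow", "ReferenceLetter": ""},
            ],
        }
    ]
}
print(phone_letter_pairs(result))
```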
Response example
Phoneme-to-IPA conversion
Returned phonemes are 智聆 (SOE) phonemes by default. Use {::cmd{F_IPA=true}} + word to enable phoneme-to-IPA conversion.
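Request parameters
Main request parameters:
Parameter | Type | Description |
ref_text | String | Text to be evaluated |
eval_mode | Integer | Evaluation mode. 6: sentence multi-branch evaluation mode |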
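Request example
# The parameters below are the WebSocket connection URL written out one per line, e.g. soe.cloud.tencent.com/soe/api/1306***?eval_mode=0&voice_format=1&...
server_engine_type=16k_en
eval_mode=6
# Streaming is recommended for long audio
rec_mode=0
# Show intermediate results while streaming
sentence_info_enabled=1
ref_text="{::cmd{F_IPA=true}}I would like to eat an orange"
score_coeff=1.000000
voice_format=1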
Response fields
Response fields:
Parameter | Type | Description |
Words.PhoneInfos.Phone | String | Current phoneme |
Response example