Sentence Evaluation Mode

Last updated: 2024-11-15 17:59:22

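Evaluation mode description

Requirements: accepts a sentence or phrase of no more than 30 words. Audio may be at most 60 seconds long.
Evaluation dimensions: returns word-level accuracy, fluency, and completeness, plus phoneme-level accuracy.
Features: real-time evaluation, original words, phoneme-to-letter mapping, phoneme-to-IPA conversion, specified pronunciation, and specified IPA.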
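
The limits above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the service API: the function name is hypothetical, and it assumes words are counted by splitting on whitespace, which may differ from how the service counts them.

```python
MAX_WORDS = 30          # word limit stated in the requirements above
MAX_AUDIO_SECONDS = 60  # audio length limit stated in the requirements above


def check_sentence_input(ref_text: str, audio_seconds: float) -> list[str]:
    """Return a list of problems; an empty list means the input looks acceptable."""
    problems = []
    # Assumption: a "word" is any whitespace-separated token.
    word_count = len(ref_text.split())
    if word_count == 0:
        problems.append("ref_text is empty")
    elif word_count > MAX_WORDS:
        problems.append(f"ref_text has {word_count} words (limit {MAX_WORDS})")
    if audio_seconds > MAX_AUDIO_SECONDS:
        problems.append(f"audio is {audio_seconds:.1f}s (limit {MAX_AUDIO_SECONDS}s)")
    return problems


print(check_sentence_input("i go to school by bus", audio_seconds=1.73))
```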

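Request parameters

Main request parameters:
Parameter | Type | Description
ref_text | String | Text to evaluate. No more than 30 words.
eval_mode | Integer | Evaluation mode. 1: sentence evaluation mode.
Request example
# The parameters below are the WebSocket connection URL query string, expanded one per line, e.g. soe.cloud.tencent.com/soe/api/1306***?eval_mode=0&voice_format=1&...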
server_engine_type=16k_en
eval_mode=1
# Streaming
rec_mode=0
# Show intermediate results while streaming
sentence_info_enabled=1
ref_text="i go to school by bus"
score_coeff=1.000000
voice_format=1
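
Because the parameters above are the query string of the connection URL written one per line, they can be joined back into a URL programmatically. The sketch below uses only the Python standard library. It assumes ref_text is sent without the surrounding quotes and percent-encoded, and it leaves out any authentication or signing parameters the service may require; the endpoint string is copied verbatim from the comment above, including its elided part.

```python
from urllib.parse import quote, urlencode

# Endpoint as shown in the example comment; the "***" is elided in the source.
ENDPOINT = "soe.cloud.tencent.com/soe/api/1306***"


def build_query(params: dict) -> str:
    """Percent-encode parameters into a query string (spaces become %20)."""
    return urlencode(params, quote_via=quote)


params = {
    "server_engine_type": "16k_en",
    "eval_mode": 1,               # 1: sentence evaluation mode
    "rec_mode": 0,                # streaming
    "sentence_info_enabled": 1,   # show intermediate results while streaming
    "ref_text": "i go to school by bus",
    "score_coeff": "1.000000",
    "voice_format": 1,
}

query = build_query(params)
print(f"{ENDPOINT}?{query}")
```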

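Response

Main response fields:
Parameter | Type | Description
SuggestedScore | Float | Suggested score
PronAccuracy | Float | Overall accuracy
PronFluency | Float | Overall fluency
PronCompletion | Float | Overall completeness
Words.PronAccuracy | Float | Word-level accuracy
Words.PronFluency | Float | Word-level fluency
Words.MatchTag | Integer | How the audio for the current word matches the text
Words.PhoneInfos.PronAccuracy | Float | Phoneme-level accuracy
Words.PhoneInfos.MatchTag | Integer | How the audio for the current phoneme matches the text
Response example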
{ "code": 0, "message": "6ef67f85-830e-4075-87a2-0ffda1843607_10", "voice_id": "6ef67f85-830e-4075-87a2-0ffda1843607", "result": { "SuggestedScore": 98.80253601074219, "PronAccuracy": 98.80253601074219, "PronFluency": 0.9656875133514404, "PronCompletion": 1, "Words": [ { "MemBeginTime": 210, "MemEndTime": 350, "PronAccuracy": 96.01470947265625, "PronFluency": 0.9704024791717529, "ReferenceWord": "i_0", "Word": "i", "MatchTag": 0, "KeywordTag": 0, "PhoneInfos": [ { "MemBeginTime": 210, "MemEndTime": 350, "PronAccuracy": 96.01470947265625, "DetectedStress": false, "Phone": "ay", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 } ], "Tone": { "Valid": false, "RefTone": -1, "HypothesisTone": -1 } }, { "MemBeginTime": 350, "MemEndTime": 530, "PronAccuracy": 99.027587890625, "PronFluency": 0.9832581281661987, "ReferenceWord": "go_1", "Word": "go", "MatchTag": 0, "KeywordTag": 0, "PhoneInfos": [ { "MemBeginTime": 350, "MemEndTime": 430, "PronAccuracy": 99.00126647949219, "DetectedStress": false, "Phone": "g", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 }, { "MemBeginTime": 430, "MemEndTime": 530, "PronAccuracy": 99.05391693115234, "DetectedStress": false, "Phone": "ow", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 } ], "Tone": { "Valid": false, "RefTone": -1, "HypothesisTone": -1 } }, { "MemBeginTime": 530, "MemEndTime": 630, "PronAccuracy": 98.93866729736328, "PronFluency": 1, "ReferenceWord": "to_2", "Word": "to", "MatchTag": 0, "KeywordTag": 0, "PhoneInfos": [ { "MemBeginTime": 530, "MemEndTime": 590, "PronAccuracy": 98.98005676269531, "DetectedStress": false, "Phone": "t", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 }, { "MemBeginTime": 590, "MemEndTime": 630, "PronAccuracy": 98.89727783203125, "DetectedStress": false, "Phone": "ah", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 } ], "Tone": { "Valid": false, 
"RefTone": -1, "HypothesisTone": -1 } }, { "MemBeginTime": 630, "MemEndTime": 990, "PronAccuracy": 99.04634094238281, "PronFluency": 0.9832895994186401, "ReferenceWord": "school_3", "Word": "school", "MatchTag": 0, "KeywordTag": 0, "PhoneInfos": [ { "MemBeginTime": 630, "MemEndTime": 730, "PronAccuracy": 99.14324951171875, "DetectedStress": false, "Phone": "s", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 }, { "MemBeginTime": 730, "MemEndTime": 810, "PronAccuracy": 99.23045349121094, "DetectedStress": false, "Phone": "k", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 }, { "MemBeginTime": 810, "MemEndTime": 910, "PronAccuracy": 99.08352661132812, "DetectedStress": false, "Phone": "uw", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 }, { "MemBeginTime": 910, "MemEndTime": 990, "PronAccuracy": 98.7281265258789, "DetectedStress": false, "Phone": "l", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 } ], "Tone": { "Valid": false, "RefTone": -1, "HypothesisTone": -1 } }, { "MemBeginTime": 990, "MemEndTime": 1150, "PronAccuracy": 99.0534439086914, "PronFluency": 0.9959675073623657, "ReferenceWord": "by_4", "Word": "by", "MatchTag": 0, "KeywordTag": 0, "PhoneInfos": [ { "MemBeginTime": 990, "MemEndTime": 1050, "PronAccuracy": 98.91246795654297, "DetectedStress": false, "Phone": "b", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 }, { "MemBeginTime": 1050, "MemEndTime": 1150, "PronAccuracy": 99.19442749023438, "DetectedStress": false, "Phone": "ay", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 } ], "Tone": { "Valid": false, "RefTone": -1, "HypothesisTone": -1 } }, { "MemBeginTime": 1150, "MemEndTime": 1730, "PronAccuracy": 98.99871063232422, "PronFluency": 0.8612076640129089, "ReferenceWord": "bus_5", "Word": "bus", "MatchTag": 0, "KeywordTag": 0, "PhoneInfos": [ { "MemBeginTime": 1150, "MemEndTime": 
1250, "PronAccuracy": 98.90721893310547, "DetectedStress": false, "Phone": "b", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 }, { "MemBeginTime": 1250, "MemEndTime": 1470, "PronAccuracy": 99.09923553466797, "DetectedStress": false, "Phone": "ah", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 }, { "MemBeginTime": 1470, "MemEndTime": 1730, "PronAccuracy": 98.98968505859375, "DetectedStress": false, "Phone": "s", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 } ], "Tone": { "Valid": false, "RefTone": -1, "HypothesisTone": -1 } } ], "SentenceId": -1, "ref_textId": -1, "KeyWordHits": null, "UnKeyWordHits": null }, "final": 1 }
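
A response like the one above can be reduced to the per-word and per-phoneme scores with a few lines of Python. This is an illustrative sketch only: the function name is hypothetical, the sample is the first word of the example above with unused fields trimmed, and the code assumes that code 0 indicates success, as in the example.

```python
import json


def summarize(response_text: str) -> dict:
    """Extract overall, word-level, and phoneme-level scores from a response."""
    msg = json.loads(response_text)
    if msg.get("code") != 0:  # assumption: 0 means success, as in the example
        raise ValueError(f"evaluation failed: {msg.get('code')} {msg.get('message')}")
    result = msg["result"]
    words = []
    for w in result.get("Words") or []:
        phones = [(p["Phone"], p["PronAccuracy"]) for p in w.get("PhoneInfos") or []]
        words.append({
            "word": w["Word"],
            "accuracy": w["PronAccuracy"],
            "fluency": w["PronFluency"],
            "match_tag": w["MatchTag"],
            "phones": phones,
        })
    return {
        "final": msg.get("final") == 1,
        "suggested_score": result["SuggestedScore"],
        "accuracy": result["PronAccuracy"],
        "fluency": result["PronFluency"],
        "completion": result["PronCompletion"],
        "words": words,
    }


# Trimmed copy of the example response (first word only).
sample = """
{"code": 0, "voice_id": "6ef67f85-830e-4075-87a2-0ffda1843607",
 "result": {"SuggestedScore": 98.80253601074219, "PronAccuracy": 98.80253601074219,
            "PronFluency": 0.9656875133514404, "PronCompletion": 1,
            "Words": [{"PronAccuracy": 96.01470947265625, "PronFluency": 0.9704024791717529,
                       "Word": "i", "MatchTag": 0,
                       "PhoneInfos": [{"PronAccuracy": 96.01470947265625,
                                       "Phone": "ay", "MatchTag": 0}]}]},
 "final": 1}
"""

summary = summarize(sample)
for w in summary["words"]:
    print(w["word"], w["accuracy"], w["phones"])
```

In the example, accuracy values appear to be on a 0 to 100 scale while fluency and completeness fall between 0 and 1, so they should not be averaged together without rescaling.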

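Original words

When text containing digits is evaluated in English, the digits are converted to the corresponding English words. The original number and its word index can be read from the corresponding response field.

Request parameters

Main request parameters:
Parameter | Type | Description
ref_text | String | Text to evaluate. No more than 30 words.
eval_mode | Integer | Evaluation mode. 1: sentence evaluation mode.
Request example
# The parameters below are the WebSocket connection URL query string, expanded one per line, e.g. soe.cloud.tencent.com/soe/api/1306***?eval_mode=0&voice_format=1&...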
server_engine_type=16k_en
eval_mode=1
rec_mode=0
ref_text="61"
score_coeff=1.000000
voice_format=1

Response

Main response fields:
Parameter | Type | Description
Words.ReferenceWord | String | Original text of the current word
Response example
{ "code": 0, "message": "9f6268ff-0a94-493f-9e37-fe89b92cd659_7", "voice_id": "9f6268ff-0a94-493f-9e37-fe89b92cd659", "result": { "SuggestedScore": 98.6801986694336, "PronAccuracy": 98.6801986694336, "PronFluency": 0.9605352878570557, "PronCompletion": 1, "Words": [ { "MemBeginTime": 390, "MemEndTime": 890, "PronAccuracy": 98.81883239746094, "PronFluency": 0.9803668856620789, "ReferenceWord": "61_0", "Word": "sixty", "MatchTag": 0, "KeywordTag": 0, "PhoneInfos": [ { "MemBeginTime": 390, "MemEndTime": 570, "PronAccuracy": 97.81999969482422, "DetectedStress": false, "Phone": "s", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 }, { "MemBeginTime": 570, "MemEndTime": 630, "PronAccuracy": 99.2169418334961, "DetectedStress": false, "Phone": "ih", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 }, { "MemBeginTime": 630, "MemEndTime": 690, "PronAccuracy": 99.1970443725586, "DetectedStress": false, "Phone": "k", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 }, { "MemBeginTime": 690, "MemEndTime": 750, "PronAccuracy": 99.1033706665039, "DetectedStress": false, "Phone": "s", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 }, { "MemBeginTime": 750, "MemEndTime": 840, "PronAccuracy": 98.88194274902344, "DetectedStress": false, "Phone": "t", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 }, { "MemBeginTime": 840, "MemEndTime": 890, "PronAccuracy": 98.69371795654297, "DetectedStress": false, "Phone": "iy", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 } ], "Tone": { "Valid": false, "RefTone": -1, "HypothesisTone": -1 } }, { "MemBeginTime": 890, "MemEndTime": 1330, "PronAccuracy": 98.47223663330078, "PronFluency": 0.9407037496566772, "ReferenceWord": "61_0", "Word": "one", "MatchTag": 0, "KeywordTag": 0, "PhoneInfos": [ { "MemBeginTime": 890, "MemEndTime": 1010, "PronAccuracy": 98.7368392944336, 
"DetectedStress": false, "Phone": "hh", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 }, { "MemBeginTime": 1010, "MemEndTime": 1050, "PronAccuracy": 97.58154296875, "DetectedStress": false, "Phone": "w", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 }, { "MemBeginTime": 1050, "MemEndTime": 1190, "PronAccuracy": 99.07205963134766, "DetectedStress": false, "Phone": "ah", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 }, { "MemBeginTime": 1190, "MemEndTime": 1330, "PronAccuracy": 98.49849700927734, "DetectedStress": false, "Phone": "n", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 } ], "Tone": { "Valid": false, "RefTone": -1, "HypothesisTone": -1 } } ], "SentenceId": -1, "ref_textId": -1, "KeyWordHits": null, "UnKeyWordHits": null }, "final": 1 }

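In the example above, "61" is returned as two words, "sixty" and "one", and both carry ReferenceWord "61_0". The sketch below regroups words under their original token. It assumes ReferenceWord has the form original text, underscore, word index, which is how the examples in this document look; the function names are hypothetical.

```python
def split_reference_word(reference_word: str) -> tuple[str, int]:
    """Split e.g. "61_0" into ("61", 0), using the last underscore as separator."""
    text, _, index = reference_word.rpartition("_")
    return text, int(index)


def group_by_original(words: list[dict]) -> dict:
    """Map (original text, index) to the list of evaluated words it expanded to."""
    grouped = {}
    for w in words:
        key = split_reference_word(w["ReferenceWord"])
        grouped.setdefault(key, []).append(w["Word"])
    return grouped


# Word entries from the example response, reduced to the two fields used here.
words = [
    {"ReferenceWord": "61_0", "Word": "sixty"},
    {"ReferenceWord": "61_0", "Word": "one"},
]
print(group_by_original(words))
```
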
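Phoneme-to-letter mapping

This feature marks the letters that each phoneme maps to. Syntax: {::cmd{F_P2L=true}} followed by the text to evaluate.

Request parameters

Main request parameters:
Parameter | Type | Description
ref_text | String | Text to evaluate.
eval_mode | Integer | Evaluation mode. 1: sentence evaluation mode.
Request example
# The parameters below are the WebSocket connection URL query string, expanded one per line, e.g. soe.cloud.tencent.com/soe/api/1306***?eval_mode=0&voice_format=1&...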
server_engine_type=16k_en
eval_mode=1
rec_mode=0
ref_text="{::cmd{F_P2L=true}}I would like to eat an orange"
score_coeff=1.000000
voice_format=1

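Response

Main response fields:
Parameter | Type | Description
Words.Word | String | Current word
Words.PhoneInfos.Phone | String | Phoneme of the current word
Words.PhoneInfos.ReferenceLetter | String | Letter(s) that the phoneme maps to in the current word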
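
As a rough sketch of how this feature might be used from client code, the helpers below prepend the command to ref_text and pair each returned phoneme with its ReferenceLetter. Both function names are hypothetical, and the second one only assumes the field names listed in the table above.

```python
def with_command(ref_text: str, flag: str) -> str:
    """Prefix ref_text with a {::cmd{FLAG=true}} command, e.g. flag="F_P2L"."""
    return "{::cmd{" + flag + "=true}}" + ref_text


def phoneme_letters(word: dict) -> list[tuple[str, str]]:
    """Return (Phone, ReferenceLetter) pairs for one entry of result["Words"]."""
    return [(p["Phone"], p["ReferenceLetter"]) for p in word["PhoneInfos"]]


ref_text = with_command("I would like to eat an orange", "F_P2L")
print(ref_text)
```

When the command is omitted, ReferenceLetter is an empty string, as the response examples in the earlier sections show.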


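Phoneme-to-IPA conversion

Phonemes are returned as Zhiling (智聆) phonemes by default. Use {::cmd{F_IPA=true}} followed by the text to evaluate to enable phoneme-to-IPA conversion.

Request parameters

Main request parameters:
Parameter | Type | Description
ref_text | String | Text to evaluate.
eval_mode | Integer | Evaluation mode. 1: sentence evaluation mode.
Request example
# The parameters below are the WebSocket connection URL query string, expanded one per line, e.g. soe.cloud.tencent.com/soe/api/1306***?eval_mode=0&voice_format=1&...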
server_engine_type=16k_en
eval_mode=1
rec_mode=0
ref_text="{::cmd{F_IPA=true}}I would like to eat an orange"
score_coeff=1.000000
voice_format=1

Response

Response fields:
Parameter | Type | Description
Words.PhoneInfos.Phone | String | Current phoneme


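Specified pronunciation

Use word{::pron{p1,p2..},{p3,p4..}..} to specify a pronunciation. IPA symbols must first be converted to Zhiling (智聆) phonemes using the phoneme mapping table; see Phoneme Mapping Table -- Zhiling phonemes.

Request parameters

Main request parameters:
Parameter | Type | Description
ref_text | String | Text to evaluate.
eval_mode | Integer | Evaluation mode. 1: sentence evaluation mode.
Request example
# The parameters below are the WebSocket connection URL query string, expanded one per line, e.g. soe.cloud.tencent.com/soe/api/1306***?eval_mode=0&voice_format=1&...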
server_engine_type=16k_en
eval_mode=1
rec_mode=0
ref_text="I would{::pron{w,uh,d}} like to eat{::pron{iy,t}} an orange"
score_coeff=1.000000
voice_format=1
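
The annotation syntax is easy to mistype by hand, so it can help to generate it. The following sketch builds the {::pron...} suffix from one or more phoneme sequences, following the pattern word{::pron{p1,p2..},{p3,p4..}..} described above. The helper name is hypothetical and the phoneme names are not validated against the mapping table.

```python
def with_pron(word: str, alternatives: list[list[str]]) -> str:
    """Append a {::pron...} annotation listing one or more pronunciations."""
    if not alternatives or any(not alt for alt in alternatives):
        raise ValueError("each pronunciation needs at least one phoneme")
    # Each pronunciation becomes {p1,p2,...}; several are joined with commas.
    groups = ",".join("{" + ",".join(alt) + "}" for alt in alternatives)
    return word + "{::pron" + groups + "}"


ref_text = " ".join([
    "I",
    with_pron("would", [["w", "uh", "d"]]),
    "like", "to",
    with_pron("eat", [["iy", "t"]]),
    "an", "orange",
])
print(ref_text)
```

Built this way, ref_text is identical to the value in the request example above.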

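Response

Main response fields:
Parameter | Type | Description
SuggestedScore | Float | Suggested score
PronAccuracy | Float | Overall accuracy
PronFluency | Float | Overall fluency
Words.PronAccuracy | Float | Word-level accuracy
Words.PronFluency | Float | Word-level fluency
Words.PhoneInfos.PronAccuracy | Float | Phoneme-level accuracy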


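Specified IPA

Use word{::ipapron{p1,p2..},{p3,p4..}..} to specify a pronunciation in IPA; see Phoneme Mapping Table -- IPA. Even when the pronunciation is specified in IPA, results are still returned as Zhiling (智聆) phonemes; phoneme-to-IPA conversion must also be enabled for IPA to be returned.

Request parameters

Main request parameters:
Parameter | Type | Description
ref_text | String | Text to evaluate.
eval_mode | Integer | Evaluation mode. 1: sentence evaluation mode.
Request example
# The parameters below are the WebSocket connection URL query string, expanded one per line, e.g. soe.cloud.tencent.com/soe/api/1306***?eval_mode=0&voice_format=1&...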
server_engine_type=16k_en
eval_mode=1
rec_mode=0
ref_text="I would{::ipapron{w,ʊ,d}} like to eat{::ipapron{i,t}} an orange"
score_coeff=1.000000
voice_format=1
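
Comparing this request with the one in the specified-pronunciation section (would as w,uh,d and eat as iy,t) suggests how the two notations line up for these two words. The sketch below encodes only that inferred correspondence. It is not the phoneme mapping table, which remains the authoritative source, and the names used are hypothetical.

```python
# Partial table inferred solely from the two "I would ... eat" examples in this
# document; any symbol not listed here must be looked up in the mapping table.
PARTIAL_IPA_TO_ZHILING = {"w": "w", "ʊ": "uh", "d": "d", "i": "iy", "t": "t"}


def ipa_to_zhiling(ipa_symbols: list[str]) -> list[str]:
    """Translate IPA symbols to Zhiling phonemes, failing loudly on unknown ones."""
    missing = [s for s in ipa_symbols if s not in PARTIAL_IPA_TO_ZHILING]
    if missing:
        raise KeyError(f"no mapping known for: {missing}")
    return [PARTIAL_IPA_TO_ZHILING[s] for s in ipa_symbols]


print(ipa_to_zhiling(["w", "ʊ", "d"]))
print(ipa_to_zhiling(["i", "t"]))
```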

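Response

Main response fields:
Parameter | Type | Description
SuggestedScore | Float | Suggested score
PronAccuracy | Float | Overall accuracy
PronFluency | Float | Overall fluency
Words.PronAccuracy | Float | Word-level accuracy
Words.PronFluency | Float | Word-level fluency
Words.PhoneInfos.PronAccuracy | Float | Phoneme-level accuracy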