Sentence Multi-Branch Evaluation Mode

Last updated: 2024-11-15 17:59:23

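Evaluation mode description

Evaluation requirements: supports multiple groups of reference text, each group at most 30 characters (字). Audio may be at most 60 seconds long.
Evaluation dimensions: returns word-level accuracy, fluency, and completeness; returns phoneme-level accuracy.
Evaluation features: multiple text groups, phoneme-to-letter mapping, phoneme-to-IPA conversion.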

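Request parameters

Main request parameters:

Parameter   Type     Description
ref_text    String   Reference text to evaluate. Use | to split it into multiple branches.
eval_mode   Integer  Evaluation mode. 6: sentence multi-branch evaluation mode.
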
Request example
# The parameters below are the WebSocket connection URL query string, expanded one per line, e.g. soe.cloud.tencent.com/soe/api/1306***?eval_mode=0&voice_format=1&...
server_engine_type=16k_en
eval_mode=6
# Streaming is recommended for long audio
rec_mode=0
# Return intermediate results while streaming
sentence_info_enabled=1
ref_text="i go to school by bus | i go to school by train | i go to school by car"
score_coeff=1.000000
voice_format=1
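
Because the request parameters travel in the query string of the WebSocket URL, it can help to see them assembled in code. The following is a minimal Python sketch, not an official client: the helper name `build_query` and the `branches` list are illustrative, and the host, path, app ID, and any authentication or signature parameters the service may require are deliberately left out.

```python
from urllib.parse import urlencode, quote

def build_query(branches, eval_mode=6):
    """Join candidate sentences with '|' and percent-encode the parameters."""
    # ref_text separates branches with '|', as in the request example.
    ref_text = " | ".join(b.strip() for b in branches)
    params = {
        "server_engine_type": "16k_en",
        "eval_mode": eval_mode,
        "rec_mode": 0,                 # streaming, as in the request example
        "sentence_info_enabled": 1,    # intermediate results while streaming
        "ref_text": ref_text,
        "score_coeff": 1.0,
        "voice_format": 1,
    }
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return urlencode(params, quote_via=quote)

query = build_query([
    "i go to school by bus",
    "i go to school by train",
    "i go to school by car",
])
print(query)
```

The resulting string would be appended after the `?` of the connection URL. Whether the service expects additional parameters beyond those shown on this page is not covered by this sketch.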

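Response

Main response fields:

Parameter                       Type     Description
SuggestedScore                  Float    Suggested score
PronAccuracy                    Float    Overall accuracy
PronFluency                     Float    Overall fluency
PronCompletion                  Float    Overall completeness
Words.PronAccuracy              Float    Word accuracy
Words.PronFluency               Float    Word fluency
Words.MatchTag                  Integer  How the audio for the current word matches the text
Words.PhoneInfos.PronAccuracy   Float    Phoneme accuracy
Words.PhoneInfos.MatchTag       Integer  How the audio for the current phoneme matches the text
RefTextId                       Integer  Index of the matched candidate text

Response example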
{ "code": 0, "message": "7ed2d079-1e81-4bcc-be4b-45078d10c77c_11", "voice_id": "7ed2d079-1e81-4bcc-be4b-45078d10c77c", "result": { "SuggestedScore": 98.80253601074219, "PronAccuracy": 98.80253601074219, "PronFluency": 0.9656875133514404, "PronCompletion": 1, "Words": [ { "MemBeginTime": 210, "MemEndTime": 350, "PronAccuracy": 96.01470947265625, "PronFluency": 0.9704024791717529, "ReferenceWord": "i_0", "Word": "i", "MatchTag": 0, "KeywordTag": 0, "PhoneInfos": [ { "MemBeginTime": 210, "MemEndTime": 350, "PronAccuracy": 96.01470947265625, "DetectedStress": false, "Phone": "ay", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 } ], "Tone": { "Valid": false, "RefTone": -1, "HypothesisTone": -1 } }, { "MemBeginTime": 350, "MemEndTime": 530, "PronAccuracy": 99.027587890625, "PronFluency": 0.9832581281661987, "ReferenceWord": "go_1", "Word": "go", "MatchTag": 0, "KeywordTag": 0, "PhoneInfos": [ { "MemBeginTime": 350, "MemEndTime": 430, "PronAccuracy": 99.00126647949219, "DetectedStress": false, "Phone": "g", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 }, { "MemBeginTime": 430, "MemEndTime": 530, "PronAccuracy": 99.05391693115234, "DetectedStress": false, "Phone": "ow", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 } ], "Tone": { "Valid": false, "RefTone": -1, "HypothesisTone": -1 } }, { "MemBeginTime": 530, "MemEndTime": 630, "PronAccuracy": 98.93866729736328, "PronFluency": 1, "ReferenceWord": "to_2", "Word": "to", "MatchTag": 0, "KeywordTag": 0, "PhoneInfos": [ { "MemBeginTime": 530, "MemEndTime": 590, "PronAccuracy": 98.98005676269531, "DetectedStress": false, "Phone": "t", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 }, { "MemBeginTime": 590, "MemEndTime": 630, "PronAccuracy": 98.89727783203125, "DetectedStress": false, "Phone": "ah", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 } ], "Tone": { "Valid": false, 
"RefTone": -1, "HypothesisTone": -1 } }, { "MemBeginTime": 630, "MemEndTime": 990, "PronAccuracy": 99.04634094238281, "PronFluency": 0.9832895994186401, "ReferenceWord": "school_3", "Word": "school", "MatchTag": 0, "KeywordTag": 0, "PhoneInfos": [ { "MemBeginTime": 630, "MemEndTime": 730, "PronAccuracy": 99.14324951171875, "DetectedStress": false, "Phone": "s", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 }, { "MemBeginTime": 730, "MemEndTime": 810, "PronAccuracy": 99.23045349121094, "DetectedStress": false, "Phone": "k", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 }, { "MemBeginTime": 810, "MemEndTime": 910, "PronAccuracy": 99.08352661132812, "DetectedStress": false, "Phone": "uw", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 }, { "MemBeginTime": 910, "MemEndTime": 990, "PronAccuracy": 98.7281265258789, "DetectedStress": false, "Phone": "l", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 } ], "Tone": { "Valid": false, "RefTone": -1, "HypothesisTone": -1 } }, { "MemBeginTime": 990, "MemEndTime": 1150, "PronAccuracy": 99.0534439086914, "PronFluency": 0.9959675073623657, "ReferenceWord": "by_4", "Word": "by", "MatchTag": 0, "KeywordTag": 0, "PhoneInfos": [ { "MemBeginTime": 990, "MemEndTime": 1050, "PronAccuracy": 98.91246795654297, "DetectedStress": false, "Phone": "b", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 }, { "MemBeginTime": 1050, "MemEndTime": 1150, "PronAccuracy": 99.19442749023438, "DetectedStress": false, "Phone": "ay", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 } ], "Tone": { "Valid": false, "RefTone": -1, "HypothesisTone": -1 } }, { "MemBeginTime": 1150, "MemEndTime": 1730, "PronAccuracy": 98.99871063232422, "PronFluency": 0.8612076640129089, "ReferenceWord": "bus_5", "Word": "bus", "MatchTag": 0, "KeywordTag": 0, "PhoneInfos": [ { "MemBeginTime": 1150, "MemEndTime": 
1250, "PronAccuracy": 98.90721893310547, "DetectedStress": false, "Phone": "b", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 }, { "MemBeginTime": 1250, "MemEndTime": 1470, "PronAccuracy": 99.09923553466797, "DetectedStress": false, "Phone": "ah", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 }, { "MemBeginTime": 1470, "MemEndTime": 1730, "PronAccuracy": 98.98968505859375, "DetectedStress": false, "Phone": "s", "ReferencePhone": "", "ReferenceLetter": "", "Stress": false, "MatchTag": 0 } ], "Tone": { "Valid": false, "RefTone": -1, "HypothesisTone": -1 } } ], "SentenceId": 0, "RefTextId": 0, "KeyWordHits": null, "UnKeyWordHits": null }, "final": 1 }
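
To show how `RefTextId` ties the response back to the branches in `ref_text`, here is a small Python sketch. It assumes that `RefTextId` is a 0-based index in branch order (consistent with the example above, where `RefTextId` is 0 and the recognized words match the first branch) and that `code` equal to 0 means success; both are inferences from the example, not documented guarantees. The function name `summarize` and the trimmed `sample` message are illustrative.

```python
import json

def summarize(message, branches):
    """Return (matched branch, suggested score, per-word accuracy)."""
    data = json.loads(message)
    # Assumption: a non-zero code signals an error.
    if data.get("code") != 0:
        raise RuntimeError(f"evaluation failed: {data.get('message')}")
    result = data["result"]
    # Assumption: RefTextId indexes the '|'-separated branches from 0.
    matched = branches[result["RefTextId"]]
    words = [(w["Word"], round(w["PronAccuracy"], 2)) for w in result["Words"]]
    return matched, result["SuggestedScore"], words

branches = [
    "i go to school by bus",
    "i go to school by train",
    "i go to school by car",
]

# A trimmed-down message that keeps only the fields used here;
# the values are copied from the response example.
sample = json.dumps({
    "code": 0,
    "result": {
        "SuggestedScore": 98.80253601074219,
        "RefTextId": 0,
        "Words": [
            {"Word": "i", "PronAccuracy": 96.01470947265625},
            {"Word": "go", "PronAccuracy": 99.027587890625},
        ],
    },
    "final": 1,
})

matched, score, words = summarize(sample, branches)
print(matched, score, words)
```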

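Phoneme-to-letter mapping

This feature marks the letters that each phoneme maps to. To enable it, prefix the reference text with the command: {::cmd{F_P2L=true}} + reference text.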

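Request parameters

Main request parameters:

Parameter   Type     Description
ref_text    String   Reference text to evaluate
eval_mode   Integer  Evaluation mode. 6: sentence multi-branch evaluation mode.
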
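Request example
# The parameters below are the WebSocket connection URL query string, expanded one per line, e.g. soe.cloud.tencent.com/soe/api/1306***?eval_mode=0&voice_format=1&...
server_engine_type=16k_en
eval_mode=6
# Streaming is recommended for long audio
rec_mode=0
# Return intermediate results while streaming
sentence_info_enabled=1
ref_text="{::cmd{F_P2L=true}}I would like to eat an orange"
score_coeff=1.000000
voice_format=1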

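Response

Main response fields:

Parameter                          Type    Description
Words.Word                         String  Current word
Words.PhoneInfos.Phone             String  Phoneme of the current word
Words.PhoneInfos.ReferenceLetter   String  Letter(s) that the phoneme maps to in the current word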
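
As a rough illustration of how these fields fit together, the Python sketch below prefixes the text with the F_P2L command and then pairs each phoneme with its mapped letters. The helper names `with_p2l` and `phone_letter_pairs` are hypothetical, and the values in `sample_words` are made up purely to exercise the code (they are not real service output); only the field names and the command syntax come from this page.

```python
def with_p2l(text):
    """Prefix reference text with the phoneme-to-letter command."""
    return "{::cmd{F_P2L=true}}" + text

def phone_letter_pairs(words):
    """List (word, [(phone, letters), ...]) for each entry of result['Words']."""
    return [
        (w["Word"], [(p["Phone"], p["ReferenceLetter"]) for p in w["PhoneInfos"]])
        for w in words
    ]

ref_text = with_p2l("I would like to eat an orange")

# Illustrative placeholder data, NOT actual output from the service.
sample_words = [
    {
        "Word": "eat",
        "PhoneInfos": [
            {"Phone": "iy", "ReferenceLetter": "ea"},
            {"Phone": "t", "ReferenceLetter": "t"},
        ],
    }
]

for word, pairs in phone_letter_pairs(sample_words):
    print(word, pairs)
```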
Response example
Refer to the corresponding example in the Sentence Evaluation Mode documentation.

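Phoneme-to-IPA conversion

By default, phonemes are returned in the 智聆 (SOE) phoneme set. To enable phoneme-to-IPA conversion, prefix the text with the command: {::cmd{F_IPA=true}} + the word(s) to evaluate.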

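Request parameters

Main request parameters:

Parameter   Type     Description
ref_text    String   Reference text to evaluate
eval_mode   Integer  Evaluation mode. 6: sentence multi-branch evaluation mode.
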
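Request example
# The parameters below are the WebSocket connection URL query string, expanded one per line, e.g. soe.cloud.tencent.com/soe/api/1306***?eval_mode=0&voice_format=1&...
server_engine_type=16k_en
eval_mode=6
# Streaming is recommended for long audio
rec_mode=0
# Return intermediate results while streaming
sentence_info_enabled=1
ref_text="{::cmd{F_IPA=true}}I would like to eat an orange"
score_coeff=1.000000
voice_format=1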

Response

Response fields:

Parameter                Type    Description
Words.PhoneInfos.Phone   String  Current phoneme, returned in IPA when F_IPA is enabled
Response example
Refer to the corresponding example in the Sentence Evaluation Mode documentation.