Introducing SensoryCloud.ai Part 3: Speech-to-Text & Accuracy

用户6026865
Published 2022-04-02 17:12:36
Included in the column: VoiceVista语音智能

When evaluating speech-to-text (STT) solutions, businesses face many options and varying degrees of marketing hype.

However, when it comes down to choosing an STT partner, one factor tends to outweigh the others: accuracy. In fact, in a 2022 survey from Opus Research, 90% of respondents indicated that increased accuracy was the critical enabler to expanding speech technology use across their businesses.

State-of-the-art accuracy was a top requirement for Sensory when we launched the SensoryCloud.ai solution. To demonstrate the performance of SensoryCloud speech-to-text, we hired a third-party company to run a word error rate (WER) test and compare our STT solution with the other offerings in their system.
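For readers unfamiliar with the metric, WER is conventionally the number of word substitutions, deletions, and insertions needed to turn the recognizer output into the reference transcript, divided by the number of words in the reference. The sketch below illustrates that standard definition only; it is not the test house's tooling. The function name `word_error_rate` and the text normalization (lowercasing and splitting on whitespace) are assumptions made for illustration, since the article does not say how transcripts were normalized.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    # Assumed normalization: lowercase and split on whitespace.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One substituted word out of five reference words.
print(word_error_rate("turn on the kitchen light",
                      "turn on the kitchen lights"))  # 0.2
```

Because the denominator is the reference length, a hypothesis with many inserted words can produce a WER above 1.0.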

The results are shown in the tables below. Each engine was provided with hours of audio and text transcripts for WER calculations. The audio was played back in two scenarios: 1) relatively clearly spoken and free of background noise, and 2) with added noise to create SNR 10.

The third-party test house used identical test data (podcasts and various other audio files) for every engine, and no company had access to customized language models (which can improve performance in known domains where data is accessible).

Scenario 1: relatively clearly spoken and free of background noise

Scenario 2: added noise to create SNR 10

Performance in normal conditions, as shown above, reveals that the SensoryCloud STT engine provides best-in-class accuracy. The same WER testing was performed with a mix of added noise files (TV, radio, babble, car, and office) to create SNR 10, and Sensory remained near the top in performance (as shown, one company did outperform Sensory in this test). However, these results predate the addition of our new noise-robust front end, so we expect that within this quarter Sensory will be the most accurate in both quiet and noise!
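To make the noisy condition concrete, here is a minimal sketch of one common way to mix a noise recording into speech at a target signal-to-noise ratio. It is not the test house's procedure, which the article does not describe. It assumes "SNR 10" means 10 dB, that SNR is measured as the ratio of average signal power to average noise power over the whole clip, and that audio is represented as plain lists of float samples. The function name `mix_at_snr` is hypothetical.

```python
import math


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` and add it to `speech` so the mix has the given SNR in dB."""
    if not speech or not noise:
        raise ValueError("speech and noise must both be non-empty")

    # Loop the noise clip so it covers the whole speech clip.
    tiled = [noise[i % len(noise)] for i in range(len(speech))]

    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in tiled) / len(tiled)
    if noise_power == 0:
        raise ValueError("noise clip is silent")

    # SNR_dB = 10 * log10(speech_power / (gain^2 * noise_power)),
    # solved for the noise gain.
    gain = math.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, tiled)]


# Synthetic stand-ins for real recordings (assumed data, for illustration).
speech = [math.sin(0.1 * i) for i in range(1000)]
noise = [((i * 37) % 11 - 5) / 5 for i in range(300)]
mixed = mix_at_snr(speech, noise, snr_db=10.0)
```

At 10 dB the speech carries ten times the average power of the added noise, which is a demanding but realistic condition for sources such as TV, babble, or car noise.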

As mentioned, no customized language models were used to match the domain of the test data. However, Sensory can customize language models to substantially outperform the broader-domain STT models. An earlier study by Sensory’s Vocalize showed that, by deploying customized language models, our embedded engines (e.g. TrulyNatural) could match or beat Google and Amazon in a microwave domain with over 50,000 possible commands.

So, if you are looking for the highest accuracy and the flexibility to work with your team to build a customized solution, then SensoryCloud’s speech-to-text is the best choice in large vocabulary natural language speech recognition! We invite you to subscribe to our blog and stay up to date on all the services offered by SensoryCloud: Speech-to-Text, Wake Word Verification, Sound ID, Face & Voice Biometrics, and Text-to-Speech.

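This article is part of the Tencent Cloud self-media syndication program and is shared from a WeChat official account.
Originally published: 2022-03-07. In case of infringement, contact cloudcommunity@tencent.com for removal.

This article is shared from the SmellLikeAISpirit WeChat official account.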
