Introducing SensoryCloud.ai Part 3: Speech-to-Text & Accuracy

User 6026865

Published 2022-04-02 17:12:36

Businesses evaluating speech-to-text (STT) solutions face many options and varying degrees of marketing hype.

However, when it comes down to choosing an STT partner, one factor tends to outweigh the others: accuracy. In fact, in a 2022 survey from Opus Research, 90% of respondents indicated that increased accuracy was the critical enabler for expanding speech technology use across their businesses.

State-of-the-art accuracy was a top requirement for Sensory when we launched the SensoryCloud.ai solution. To demonstrate the performance of SensoryCloud speech-to-text, we hired a third-party company to perform a word error rate (WER) test and compare our STT solution to other offerings in their system.
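For readers unfamiliar with the metric, WER is conventionally defined as (substitutions + deletions + insertions) divided by the number of words in the reference transcript. The sketch below shows one common way to compute it with a word-level edit distance. It is an illustration only: the function name `word_error_rate` and the lowercase, whitespace-based tokenization are assumptions, not a description of the scoring pipeline the test house actually used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference word count.

    Hypothetical helper for illustration. Text normalization here is just
    lowercasing and whitespace splitting, which is an assumption.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion (reference word missed)
                curr[j - 1] + 1,                  # insertion (extra hypothesis word)
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One dropped word out of four reference words.
    print(word_error_rate("turn on the microwave", "turn the microwave"))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.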

The results are shown in the table below. Each engine was provided with hours of audio and text transcripts for WER calculations. The audio was played back in two scenarios: 1) relatively clearly spoken and free of background noise, and 2) with noise added to create an SNR of 10.

The third-party test house used identical test data (podcasts and various other audio files) for every engine, and no company had access to customized language models (which can improve performance in known domains where there is accessible data).
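To make the noisy scenario concrete, here is a minimal sketch of how a noise recording can be scaled and added to clean speech to hit a target signal-to-noise ratio. It assumes that "SNR 10" means 10 dB and that SNR is measured as the ratio of average signal power to average noise power. The function name `mix_at_snr` and the use of plain Python lists for samples are hypothetical; the article does not describe the test house's actual mixing procedure.

```python
import math


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech so the result has the requested SNR in dB.

    Hypothetical helper for illustration; samples are plain floats.
    """
    if not speech:
        raise ValueError("speech must contain at least one sample")
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[: len(speech)]

    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    if noise_power == 0:
        raise ValueError("noise must not be silent")

    # SNR_dB = 10 * log10(speech_power / scaled_noise_power), solved for the
    # noise power we need, then converted to an amplitude gain.
    target_noise_power = speech_power / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / noise_power)
    return [s + gain * n for s, n in zip(speech, noise)]


if __name__ == "__main__":
    # Synthetic stand-ins for real audio: a sine "speech" and a square-wave "noise".
    speech = [math.sin(0.1 * i) for i in range(1000)]
    noise = [0.5 if i % 2 == 0 else -0.5 for i in range(1000)]
    noisy = mix_at_snr(speech, noise, snr_db=10)
    print(len(noisy))
```

A lower SNR value means louder noise relative to the speech, so recognition generally gets harder as the target SNR drops.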

[Table: WER results, scenario 1 (relatively clearly spoken and free of background noise)]

[Table: WER results, scenario 2 (added noise to create SNR 10)]

Performance in normal conditions, as shown above, reveals that the SensoryCloud STT engine provides best-in-class accuracy. The same WER testing was performed with a mix of added noise files (TV, radio, babble, car, and office) to create an SNR of 10, and Sensory continued to be near the top in performance (as shown, one company did outperform Sensory in this test). However, these results predate the addition of our new noise-robust front end, so we expect that within this quarter Sensory will be the most accurate in both quiet and noisy conditions.

As mentioned, no customized language models were used to match the domain of the test data. However, Sensory can customize language models to substantially outperform broad-domain STT models. An earlier study by Sensory’s Vocalize showed that, by deploying customized language models, our embedded engines (e.g. TrulyNatural) could match or beat Google and Amazon in a microwave domain with over 50,000 possible commands.

So, if you are looking for the highest accuracy and the flexibility to work with your team to build a customized solution, SensoryCloud’s speech-to-text is the best choice in large-vocabulary, natural-language speech recognition. We invite you to subscribe to our blog and stay up to date on all the services offered by SensoryCloud: Speech-to-Text, Wake Word Verification, Sound ID, Face & Voice Biometrics, and Text-to-Speech.

Originally published: 2022-03-07. In case of infringement, contact cloudcommunity@tencent.com for removal.

This article is shared from the SmellLikeAISpirit WeChat official account.
