
Release Notes

Last updated: 2026-02-11 10:06:32

October 2024

Update
Description
Release Date
References
HarmonyOS NEXT client SDK supports single-sentence recognition and ultra-fast recording file recognition
The client adds a HarmonyOS NEXT SDK that supports the single-sentence recognition API and the ultra-fast recording file recognition API.
2024-10-12

May 2024

Update
Description
Release Date
References
HarmonyOS NEXT client SDK supports real-time speech recognition
The client adds a HarmonyOS NEXT SDK that supports the real-time speech recognition API.
2024-05-09

March 2024

Update
Description
Release Date
References
ASR large model V2.0 released
ASR large model V2.0 is upgraded to the Mandarin-Dialect-English large model, which supports mixed recognition of Mandarin, English, and 27 Chinese dialects with a single engine and adds support for real-time speech recognition.
2024-03-28



February 2024

Update
Description
Release Date
References
ASR large model V1.0 released
ASR large model V1.0 supports a Mandarin large model and a Chinese dialect large model. The number of model parameters is significantly increased, which enhances language model performance and greatly improves recognition accuracy for mixed-dialect speech and for low-quality audio with high noise, heavy echo, low volume, or distant voices. You can compare the recognition results of the general version and the large model version.
2024-02-01



January 2024

Update
Description
Release Date
References
SDK for Flutter updated
Added a temporary key authentication method, fixed some callback information errors, and updated the iOS Framework version to 3.1.15.
2024-01-19
The experience module supports history query.
Historical recognition records are stored in the cloud for up to 7 days. Recognition results can be downloaded, allowing users to view the recognized content.
2024-01-29



October 2023

Update
Description
Release Date
References
Real-time speech recognition supports a forced segmentation parameter.
Real-time speech recognition supports a forced segmentation parameter. Users can configure it to force a voice activity detection (VAD) cut during continuous speech or noise, so that the audio is segmented even when no natural pause occurs.
2023-10-31
Super hotword feature launched
When the weight of a hotword is set to 11, the hotword is upgraded to a super hotword to improve its recognition accuracy. Set only important and necessary hotwords to 11, because setting too many hotwords to 11 reduces the overall character accuracy rate.
2023-10-30
The Flutter client SDK adds single-sentence recognition and ultra-fast recording file recognition APIs.
The SDK for Flutter adds the single-sentence recognition API and the ultra-fast recording file recognition API.
2023-10-26
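
The super hotword rule above (weight 11, used sparingly) can be checked before a hotword list is submitted. The following is a minimal sketch only: the `{word: weight}` mapping, the function names, and the `max_share` threshold are hypothetical and are not part of the ASR API; only the weight value 11 comes from the note above.

```python
SUPER_HOTWORD_WEIGHT = 11  # from the release note: weight 11 marks a super hotword


def split_hotwords(hotwords):
    """Split a {word: weight} mapping into (regular, super) word lists."""
    regular, super_words = [], []
    for word, weight in hotwords.items():
        if weight == SUPER_HOTWORD_WEIGHT:
            super_words.append(word)
        else:
            regular.append(word)
    return regular, super_words


def super_share_ok(hotwords, max_share=0.2):
    """Return True when the share of super hotwords is at most max_share.

    max_share is a hypothetical threshold chosen for illustration,
    not a documented limit of the service.
    """
    if not hotwords:
        return True
    _, super_words = split_hotwords(hotwords)
    return len(super_words) / len(hotwords) <= max_share
```

A check like this only flags lists in which too many entries are promoted to weight 11; the right threshold depends on the vocabulary and should be tuned against real recognition results.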

September 2023

Update
Description
Release Date
References
Recording file recognition supports oral-to-written transcription.
Recording file recognition supports oral-to-written transcription and allows complex post-processing of transcription results to filter out modal particles and repeated words.
2023-09-26


August 2023

Update
Description
Release Date
References
Client SDK supports echo cancellation
The iOS and Android client SDKs provide an echo cancellation API, which customers can enable or disable as needed.
2023-08-31
ASR products support Hindi
ASR products have added new language capabilities, now supporting Hindi.
2023-08-31
ASR products support Spanish.
ASR products have added new language capabilities, now supporting Spanish.
2023-08-03



July 2023

Update
Description
Release Date
References
The console has launched new Cloud Access Management.
You can configure sub-account permissions by first creating a JSON policy and then associating the policy with users or user groups.
2023-07-27
ASR products support Arabic.
ASR products have added new language capabilities, now supporting Arabic.
2023-07-18
The official website experience page supports recording file recognition.
The experience page on the official website supports uploading recording files for recognition, showcasing product capabilities more comprehensively.
2023-07-13


June 2023

Update
Description
Release Date
References
ASR+ product series released
Speaker identification and virtual number human detection are newly launched. Speaker identification can be used in scenarios such as login locks and identity verification; virtual number human detection can be used in intelligent outbound calling scenarios.
2023-06-28
ASR+ product series


April 2023

Update
Description
Release Date
References
Offline and Online Speech Recognition SDK released
The Offline and Online Speech Recognition SDK integrates the online APIs into the offline SDK, enabling a hybrid recognition mode that automatically switches between offline and online recognition based on network conditions.
2023-04-28
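
The hybrid mode described above amounts to a per-request choice between two recognizers. The sketch below illustrates one possible decision rule and is not the SDK's implementation: the `Mode` enum, the `choose_mode` function, and the 500 ms latency cutoff are all hypothetical.

```python
from enum import Enum


class Mode(Enum):
    ONLINE = "online"
    OFFLINE = "offline"


def choose_mode(network_available, latency_ms=None, max_latency_ms=500):
    """Pick a recognition mode from the current network conditions.

    max_latency_ms is a hypothetical cutoff: above it, the connection is
    treated as too slow and the offline recognizer is used instead.
    """
    if not network_available:
        return Mode.OFFLINE
    if latency_ms is not None and latency_ms > max_latency_ms:
        return Mode.OFFLINE
    return Mode.ONLINE
```

Falling back to the offline recognizer whenever the network is missing or slow keeps recognition available at all times, at the cost of whatever accuracy gap exists between the two engines.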



March 2023

Update
Description
Release Date
References
ASR products support Vietnamese, Malay, Indonesian, Filipino, Portuguese, and Turkish
ASR products have added new language capabilities, now supporting Vietnamese, Malay, Indonesian, Filipino, Portuguese, and Turkish.
2023-03-24


February 2023

Update
Description
Release Date
References
Recording file recognition supports emotion recognition.
Configuring the emotion recognition parameter (EmotionRecognition) in the recording file recognition API makes the API output emotion tags such as happy, angry, and sad.
2023-02-28



December 2022

Update
Description
Release Date
References
Real-time speech recognition, recording file recognition premium version, and single-sentence recognition support the purchase of QPS/concurrency add-ons.
QPS/concurrency add-ons can scale out the existing QPS/concurrency.
2022-12-26


November 2022

Update
Description
Release Date
References
Real-time speech recognition, recording file recognition, and single-sentence recognition support the hotword enhanced version.
The hotword enhanced version effectively improves the hit rate of hotwords in recognition results. It applies only to the Chinese 8k and 16k engines.
2022-11-30
Real-time speech recognition, recording file recognition, and single-sentence recognition support mixed models of Chinese, English, and Cantonese.
Chinese, English, and Cantonese can be recognized in a mixed manner without switching.
2022-11-28


October 2022

Update
Description
Release Date
References
Recording file recognition supports emotion energy detection and silence duration detection.
All language engines support emotion energy detection. Silence duration detection supports detecting the silence duration between the current sentence and the previous one, measured in seconds.
2022-10-29
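
Silence duration, as described above, is the gap between the end of the previous sentence and the start of the current one. The sketch below shows that arithmetic under stated assumptions: the `(start, end)` tuple layout and the function name are hypothetical, and reporting 0.0 for the first sentence is a choice made here, not documented behavior.

```python
def silence_durations(sentences):
    """Return the silence before each sentence, in seconds.

    sentences is a list of (start_s, end_s) tuples in time order.
    The first sentence has no predecessor, so its silence is reported
    as 0.0 (an assumption of this sketch).
    """
    gaps = []
    previous_end = None
    for start_s, end_s in sentences:
        if previous_end is None:
            gaps.append(0.0)
        else:
            # Overlapping sentences would give a negative gap; clamp to zero.
            gaps.append(max(0.0, start_s - previous_end))
        previous_end = end_s
    return gaps
```

For example, sentences spanning 0.0 to 1.5 s and 2.0 to 3.0 s are separated by 0.5 s of silence.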


September 2022

Update
Description
Release Date
References
Optimized the Mandarin Chinese model under the 8k engine
The recognition rate, accuracy, and performance of 8k_zh (Chinese 8k engine) have been improved, making it better suited to phone call scenarios.
2022-09-27
Optimized the multi-dialect model under the 16k engine
The 16k_zh_dialect engine (Chinese 16k multi-dialect engine) has resolved multiple recognition issues, significantly improving the recognition accuracy of some dialects.
2022-09-26
The multi-dialect model under the 16k engine supports speaker separation
The 16k_zh_dialect engine (Chinese 16k multi-dialect engine) supports the speaker separation feature.
2022-09-26


December 2021

Update
Description
Release Date
References
ASR products support 23 dialects including Sichuanese and Wuhanese.
In addition to Mandarin, English, Cantonese, Japanese, and Shanghainese, the following dialects have been added: Sichuanese, Wuhanese, Guiyang dialect, Kunming dialect, Xi'an dialect, Zhengzhou dialect, Taiyuan dialect, Lanzhou dialect, Yinchuan dialect, Xining dialect, Nanjing dialect, Hefei dialect, Nanchang dialect, Changsha dialect, Suzhou dialect, Hangzhou dialect, Jinan dialect, Tianjin dialect, Shijiazhuang dialect, Heilongjiang dialect, Jilin dialect, and Liaoning dialect.
2021-12-03


February 2021

Update
Description
Release Date
References
Speech recognition releases multiple industry models
Real-time speech recognition supports education, medical, gaming, and court industry models. Recording file recognition supports education and medical industry models.
2021-02-01


January 2021

Update
Description
Release Date
References
Comprehensive upgrade of the audio formats supported by speech recognition
Recording file recognition and real-time speech recognition now support a broader range of audio formats.
2021-01-21
Real-Time Speech Recognition SDK fully supports the WebSocket protocol
Real-Time Speech Recognition server-side, client, and front-end SDKs all support the WebSocket protocol
2021-01-21
Async voice stream recognition subproduct released
Recognizes audio streams over live streaming protocols, returns recognition results in quasi-real time, and supports dedicated models for audio/video scenarios, which makes it suitable for live stream quality inspection.
2021-01-15
Ultra-fast recording file recognition subproduct released
Quickly recognizes large recording files and returns recognition results in quasi-real time, which is ideal for scenarios such as audio/video subtitling and quasi-real-time quality inspection and analysis.
2021-01-15


November 2020

Update
Description
Release Date
References
Recording file recognition supports automatic speaker separation
The 16k_zh_video engine model now supports the speaker separation feature, with automatic separation and specified speaker number separation available for both phone call and non-phone call scenarios.
2020-11-27


October 2020

Update
Description
Release Date
References
ASR access layer supports WebSocket Protocol
This API service uses the WebSocket Protocol to recognize real-time audio streams, synchronously returning recognition results to achieve the effect of "text output while speaking."
2020-10-10


September 2020

Update
Description
Release Date
References
ASR Access Control
Permission management for ASR at the operation and resource levels is implemented through Tencent Cloud's Cloud Access Management (CAM) product.
2020-09-16
ASR phone call scenario supports English model
ASR adds a new 8k English model for phone calls, suitable for speech-to-text conversion in English call scenarios.
2020-09-09
Technical guide for integrating real-time speech recognition through client TRTC is online
Users who need both real-time audio/video and speech recognition can access real-time speech recognition through TRTC.
2020-09-07


August 2020

Update
Description
Release Date
References
ASR products support the Shanghai dialect
ASR products enhance language and dialect capabilities, adding support for the Shanghai dialect
2020-08-21
ASR products support Japanese
ASR products enhance language and dialect capabilities, adding support for Japanese
2020-08-04
Enhanced punctuation in ASR recognition results
The punctuation capability of ASR recognition results is upgraded, adding support for the Chinese comma, question mark, and exclamation point.
2020-08-01


July 2020

Update
Description
Release Date
References
Recording file recognition supports separation of three or more speakers
The phone call scenario supports mono separation of 2 speakers, and the non-phone call scenario supports mono separation of 2 to 10 speakers.
2020-07-28
Real-time speech recognition supports the OPUS format
OPUS is a low-latency, high-fidelity open-source audio codec suitable for network transmission and a mainstream audio streaming format. This support makes it easier for customers who use OPUS to access real-time speech recognition.
2020-07-02


June 2020

Update
Description
Release Date
References
Recording file recognition supports longer audio
When recording file recognition is used with the audio URL upload method, the duration limit for audio submitted by URL is extended from 1 hour to 5 hours.
2020-06-18
Real-time speech recognition supports word-level timestamp feature
Real-time speech recognition supports the word-level timestamp feature, suitable for scenarios with strict latency requirements such as loading subtitles generated by speech recognition.
2020-06-05


April 2020

Update
Description
Release Date
References
Users can choose how numerals in recognition results are converted
Users can choose to have numerals output as Chinese numerals or intelligently converted to Arabic numerals.
2020-04-24
Recording file recognition product launched with audio and video domain model
For audio transcription in the audio and video field (semi-far-field, with background music), it has industry-leading recognition precision.
2020-04-07


March 2020

Update
Description
Release Date
References
User-selectable filtering of dirty words, modal particles, and sentence-ending punctuation released for speech recognition results.
Users can choose whether to filter dirty words, modal particles, and sentence-ending punctuation based on their usage scenarios.
2020-03-16


February 2020

Update
Description
Release Date
References
The ASR product supports creating hotwords through the console.
Adding hotwords can significantly improve the recognition accuracy of proprietary words.
2020-02-25


January 2020

Update
Description
Release Date
References
Pricing strategy updated for real-time speech recognition and single-sentence recognition
The updated billing strategy determines the product price based on usage tiers; the more you use, the lower the unit price.
2020-01-01
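
Tiered billing of the kind described above can be illustrated with a short calculation. This is a sketch under stated assumptions: the tier boundaries and prices are hypothetical values invented for the example, and it assumes that the unit price of the tier reached applies to the whole usage, which the note above does not specify.

```python
# (minimum usage in hours, unit price per hour), sorted by minimum usage.
# All boundaries and prices are hypothetical, for illustration only.
TIERS = [(0, 3.0), (1000, 2.5), (10000, 2.0)]


def unit_price(usage_hours, tiers=TIERS):
    """Return the unit price of the highest tier that the usage reaches."""
    price = tiers[0][1]
    for min_usage, tier_price in tiers:
        if usage_hours >= min_usage:
            price = tier_price
    return price


def total_cost(usage_hours, tiers=TIERS):
    """Total cost, assuming the reached tier's price applies to all usage."""
    return usage_hours * unit_price(usage_hours, tiers)
```

With these made-up tiers, 500 hours are billed at 3.0 per hour and 2,000 hours at 2.5 per hour, so the unit price falls as usage grows.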


December 2019

Update
Description
Release Date
References
Recording file recognition supports Serverless Cloud Function (SCF) access method 
For users storing audio files on Tencent Cloud COS, using the SCF access method can significantly reduce initial integration development work.
2019-12-18
ASR product launches a beta self-learning model
Supports custom optimization through the language model self-learning tool, effectively improving ASR accuracy in specific domains or industries.
2019-12-10
ASR products support prepaid purchase methods.
Tencent Cloud ASR offers both prepaid and postpaid billing modes.
2019-12-06


November 2019

Update
Description
Release Date
References
Real-time speech recognition and single-sentence recognition support English and Cantonese.
Real-time speech recognition and single-sentence recognition enhance language and dialect capabilities, adding support for English and Cantonese.
2019-11-13