October 2024
Update | Description | Release Date | References |
The client adds a HarmonyOS NEXT SDK that supports single-sentence recognition and ultra-fast recording file recognition APIs. | The client adds a HarmonyOS NEXT SDK that supports the single-sentence recognition API and the ultra-fast recording file recognition API. | 2024-10-12 |
May 2024
Update | Description | Release Date | References |
The client adds a HarmonyOS NEXT SDK that supports the real-time speech recognition API. | The client adds a HarmonyOS NEXT SDK that supports the real-time Automatic Speech Recognition API. | 2024-05-09 |
March 2024
Update | Description | Release Date | References |
ASR large model V2.0 released | ASR large model V2.0 is upgraded to the General-Pinyin-English Large Model, which supports mixed recognition of Mandarin, English, and 27 Chinese dialects with one engine and adds support for real-time speech recognition. | 2024-03-28 |
February 2024
Update | Description | Release Date | References |
ASR large model V1.0 released | ASR large model V1.0 supports the Mandarin large model and the Chinese dialect large model. The number of model parameters is significantly increased, which enhances language model performance and greatly improves recognition accuracy in difficult conditions such as mixed dialects, high noise, high echo, low voice, and distant voice. | 2024-02-01 |
January 2024
Update | Description | Release Date | References |
SDK for Flutter update | Added a temporary key authentication method, fixed some callback information errors, and updated the iOS Framework version to 3.1.15. | 2024-01-19 | |
The experience module supports history query. | Historical recognition records are stored on the cloud for up to 7 days. Recognition results can be downloaded, allowing users to view Optical Character Recognition (OCR) content. | 2024-01-29 |
October 2023
Update | Description | Release Date | References |
Real-time speech recognition supports a forced segmentation parameter. | Real-time speech recognition supports a forced segmentation parameter. Users can configure it to force a VAD segment break during continuous speech or noise. | 2023-10-31 | |
New super hotword feature launched | When the hotword weight is set to 11, the hotword is upgraded to a super hotword to improve recognition accuracy. Set only important and necessary hotwords to 11, because setting too many hotwords to 11 reduces the overall character accuracy rate. | 2023-10-30 | |
The Flutter client adds single-sentence recognition and ultra-fast recording file recognition APIs. | The SDK for Flutter client adds the single-sentence recognition API and the ultra-fast recording file recognition API. | 2023-10-26 |
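
The super hotword rule from the 2023-10-30 entry can be sketched as a small check run before a hotword list is submitted. This is an illustration only: the `word|weight` serialization, the helper names, and the `max_super` threshold are assumptions made for this example and are not confirmed by these release notes. Consult the ASR API documentation for the actual hotword format and limits.

```python
from typing import Dict, List

SUPER_WEIGHT = 11  # per the release note: weight 11 marks a super hotword


def split_hotwords(hotwords: Dict[str, int]) -> Dict[str, List[str]]:
    """Group hotwords into regular and super hotwords by weight."""
    groups: Dict[str, List[str]] = {"regular": [], "super": []}
    for word, weight in hotwords.items():
        key = "super" if weight == SUPER_WEIGHT else "regular"
        groups[key].append(word)
    return groups


def too_many_super(hotwords: Dict[str, int], max_super: int = 3) -> bool:
    """Return True when more hotwords use weight 11 than max_super allows.

    max_super is a hypothetical project-level limit, not an ASR service limit.
    """
    return len(split_hotwords(hotwords)["super"]) > max_super


def to_hotword_list(hotwords: Dict[str, int]) -> str:
    """Serialize as comma-separated "word|weight" pairs (assumed format)."""
    return ",".join(f"{word}|{weight}" for word, weight in hotwords.items())


if __name__ == "__main__":
    words = {"Tencent Cloud": 11, "speech recognition": 5, "ASR": 10}
    print(split_hotwords(words))
    print(too_many_super(words))
    print(to_hotword_list(words))
```

Keeping the threshold in one place makes it easy to review how many terms have been promoted to weight 11 before overall accuracy starts to suffer.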
September 2023
Update | Description | Release Date | References |
Recording file recognition supports oral-to-written transcription. | Recording file recognition supports oral-to-written transcription and allows complex post-processing of transcription results to filter out modal particles and repeated words. | 2023-09-26 |
August 2023
Update | Description | Release Date | References |
Client SDK supports echo cancellation | The iOS and Android clients support an echo cancellation API, which customers can enable or disable. | 2023-08-31 | |
ASR products support Hindi | ASR products have added new language capabilities, now supporting Hindi. | 2023-08-31 | |
ASR products support Spanish. | ASR products have added new language capabilities, now supporting Spanish. | 2023-08-03 |
July 2023
Update | Description | Release Date | References |
The console has launched new Cloud Access Management. | You can configure sub-account permissions by first creating a JSON policy and then associating the policy with users or user groups. | 2023-07-27 | |
ASR products support Arabic. | ASR products have added new language capabilities, now supporting Arabic. | 2023-07-18 | |
The official website experience supports recording file recognition. | The official website experience page supports uploading recording files for recognition, showcasing product capabilities more comprehensively. | 2023-07-13 |
June 2023
Update | Description | Release Date | References |
ASR+ product series published | Newly launched speaker identification and virtual number human detection. Speaker identification can be used in scenarios such as login locks and identity verification; virtual number human detection can be used in intelligent outbound calling scenarios. | 2023-06-28 | ASR+ product series |
April 2023
Update | Description | Release Date | References |
Offline and Online Speech Recognition SDK published | The offline and online SDK supports online APIs within the offline SDK, enabling a hybrid recognition mode that automatically switches between offline and online versions based on network conditions. | 2023-04-28 |
March 2023
Update | Description | Release Date | References |
ASR products support Vietnamese, Malay, Indonesian, Filipino, Portuguese, and Turkish | ASR products have added new language capabilities, now supporting Vietnamese, Malay, Indonesian, Filipino, Portuguese, and Turkish. | 2023-03-24 |
February 2023
Update | Description | Release Date | References |
Recording file recognition supports emotion recognition. | Configuring the emotion recognition parameter (EmotionRecognition) in the recording file recognition API outputs emotion tags such as happy, angry, and sad. | 2023-02-28 |
December 2022
Update | Description | Release Date | References |
Real-time speech recognition, recording file recognition premium version, and single-sentence recognition support the purchase of QPS/concurrency add-ons. | QPS/concurrency add-ons can scale out the existing QPS/concurrency. | 2022-12-26 |
November 2022
Update | Description | Release Date | References |
Real-time speech recognition, recording file recognition, and single-sentence recognition support the hotword enhanced version. | The hotword enhanced version can effectively improve the hit rate of hotwords in the recognition results. It applies only to the Chinese 8k and 16k engines. | 2022-11-30 | |
Real-time speech recognition, recording file recognition, and single-sentence recognition support mixed models of Chinese, English, and Cantonese. | Chinese, English, and Cantonese can be recognized in a mixed manner without switching. | 2022-11-28 |
October 2022
Update | Description | Release Date | References |
Recording file recognition supports emotion energy detection and silence duration detection. | All language engines support emotion energy detection. Silence duration detection supports detecting the silence duration between the current sentence and the previous one, measured in seconds. | 2022-10-29 |
September 2022
Update | Description | Release Date | References |
Optimize the Chinese Mandarin model under the 8k engine | The recognition rate, accuracy, and performance of 8k_zh (Chinese 8k engine) have been improved, better adapting to the phone call scenario | 2022-09-27 | |
Optimize the multi-dialect model under the 16k engine | The 16k_zh_dialect (Chinese 16k multi-dialect engine) has resolved multiple recognition issues, significantly improving the recognition accuracy of some dialects | 2022-09-26 | |
Optimize the multi-dialect model under the 16k engine | The 16k_zh_dialect (Chinese 16k multi-dialect engine) supports speaker separation feature. | 2022-09-26 |
December 2021
Update | Description | Release Date | References |
ASR products support 23 dialects including Sichuanese and Wuhanese. | In addition to the existing support for Mandarin, English, Cantonese, Japanese, and Shanghainese, the following have been added: Sichuanese, Wuhanese, Guiyang dialect, Kunming dialect, Xi'an dialect, Zhengzhou dialect, Taiyuan dialect, Lanzhou dialect, Yinchuan dialect, Xining dialect, Nanjing dialect, Hefei dialect, Nanchang dialect, Changsha dialect, Suzhou dialect, Hangzhou dialect, Jinan dialect, Tianjin dialect, Shijiazhuang dialect, Heilongjiang dialect, Jilin dialect, and Liaoning dialect. | 2021-12-03 |
February 2021
Update | Description | Release Date | References |
Speech recognition releases multiple industry models | Real-time speech recognition supports education, medical, gaming, and court industry models. Recording file recognition supports education and medical industry models. | 2021-02-01 |
January 2021
Update | Description | Release Date | References |
Speech recognition comprehensively upgrades supported audio formats | The audio formats supported by recording file recognition and real-time speech recognition have been comprehensively upgraded and expanded | 2021-01-21 | |
Real-Time Speech Recognition SDK fully supports the WebSocket protocol | Real-Time Speech Recognition server-side, client, and front-end SDKs all support the WebSocket protocol | 2021-01-21 | |
Async voice stream recognition subproduct published | Recognizes audio streams over live streaming protocols to return recognition results in quasi-real time and supports dedicated models for audio/video scenarios, which is fit for live stream quality inspection. | 2021-01-15 | |
Ultrafast recording file recognition subproduct published | Quickly recognizes large recording files and returns recognition results in semi-real time, which is ideal for scenarios such as audio/video subtitling and quasi-real-time quality inspection and analysis. | 2021-01-15 |
November 2020
Update | Description | Release Date | References |
Recording file recognition supports automatic speaker separation | The 16k_zh_video engine model now supports the speaker separation feature, with automatic separation and specified speaker number separation available for both phone call and non-phone call scenarios. | 2020-11-27 |
October 2020
Update | Description | Release Date | References |
ASR access layer supports WebSocket Protocol | This API service uses the WebSocket Protocol to recognize real-time audio streams, synchronously returning recognition results to achieve the effect of "text output while speaking." | 2020-10-10 |
September 2020
Update | Description | Release Date | References |
ASR Access Control | Implement permission management for ASR operation and resource dimensions through Tencent Cloud's CAM (Cloud Access Management) product. | 2020-09-16 | |
ASR phone call scenario supports English model | ASR adds a new 8k English model for phone calls, suitable for Speech-to-Text Conversion in English call scenarios | 2020-09-09 | |
Technical guide for integrating real-time speech recognition through client-side TRTC is online | Users who need both real-time audio and video and speech recognition can access real-time speech recognition through TRTC. | 2020-09-07 |
August 2020
Update | Description | Release Date | References |
ASR products support the Shanghai dialect | ASR products enhance language and dialect capabilities, adding support for the Shanghai dialect | 2020-08-21 | |
ASR products support Japanese | ASR products enhance language and dialect capabilities, adding support for Japanese | 2020-08-04 | |
Punctuation capability enhancement in ASR product results | After the punctuation ability upgrade in ASR product results, support for the Chinese comma, question mark, and exclamation point is added | 2020-08-01 |
July 2020
Update | Description | Release Date | References |
Recording file recognition supports separation of three or more speakers | The phone call scenario supports mono 2-speaker separation; the non-phone call scenario supports mono 2-10 speaker separation | 2020-07-28 | |
Real-time speech recognition supports OPUS format | OPUS is a low-latency, high-fidelity open-source audio codec suitable for network transmission and a mainstream audio streaming format. This update better supports customers who use this format to access real-time speech recognition | 2020-07-02 |
June 2020
Update | Description | Release Date | References |
Recording file recognition extends the supported audio duration | When recording file recognition is used with the audio URL upload method, the duration limit is extended from 1 hour to 5 hours | 2020-06-18 | |
Real-time speech recognition supports word-level timestamp feature | Real-time speech recognition supports a word-level timestamp feature, suitable for loading subtitles through speech recognition where latency requirements are strict | 2020-06-05 |
April 2020
Update | Description | Release Date | References |
Users can choose how numerals in recognition results are converted | Users can choose to convert numerals to Chinese numerals or to intelligently convert them to Arabic numerals | 2020-04-24 | |
Recording file recognition product launched with audio and video domain model | For audio transcription in the audio and video field (semi-far-field, with background music), it has industry-leading recognition precision. | 2020-04-07 |
March 2020
Update | Description | Release Date | References |
User-selectable dirty word filtering, modal particle filtering, and sentence-ending punctuation filtering released for speech recognition results. | Users can choose whether to filter dirty words, modal particles, and sentence-ending punctuation based on their usage scenarios. | 2020-03-16 |
February 2020
Update | Description | Release Date | References |
The ASR product supports creating hotwords through the console. | Adding hotwords can significantly improve the recognition accuracy of proprietary words. | 2020-02-25 |
January 2020
Update | Description | Release Date | References |
Real-time speech recognition and single-sentence recognition product price strategy update | The updated billing strategy determines the product price based on usage tiers; the more you use, the lower the unit price. | 2020-01-01 |
December 2019
Update | Description | Release Date | References |
Recording file recognition supports Serverless Cloud Function (SCF) access method | For users storing audio files on Tencent Cloud COS, using the SCF access method can significantly reduce initial integration development work. | 2019-12-18 | - |
ASR product launches beta self-learning model | Supports custom optimization through the language model self-learning tool, effectively improving ASR accuracy in specific domains or industries. | 2019-12-10 | |
ASR products support prepaid purchase methods. | Tencent Cloud ASR offers both prepaid and postpaid billing modes. | 2019-12-06 |
November 2019
Update | Description | Release Date | References |
Real-time speech recognition and single-sentence recognition support English and Cantonese. | Real-time speech recognition and single-sentence recognition enhance language and dialect capabilities, adding support for English and Cantonese. | 2019-11-13 |