Overview
General Optical Character Recognition (General OCR) is built on industry-leading deep learning technology and provides a range of services, including general printed text recognition, high-accuracy general printed text recognition, general handwritten text recognition, English recognition, and table recognition. It recognizes text in images and converts it into editable text. Typical scenarios include photo scanning, paper document digitization, and e-commerce ad moderation, where it greatly improves information processing efficiency.
Product Features
General Print Recognition
Recognizes text across entire images in a variety of scenarios and layouts. The language can be detected automatically or specified manually (recommended). In addition to Chinese and English, it supports Japanese, Korean, Spanish, French, German, Portuguese, Vietnamese, Malay, Russian, Italian, Dutch, Swedish, Finnish, Danish, Norwegian, Hungarian, and Thai.
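The choice between automatic detection and a manually selected language can be sketched as a small helper that builds a request option. This is only an illustration: the `language` field, the `"auto"` value, and the ISO 639-1 style codes below are assumptions made for the example, not the service's documented parameter names or values.

```python
# Languages listed in the feature description, keyed by ISO 639-1 code.
# The codes, the "language" field, and the "auto" value are illustrative
# assumptions, not the service's documented parameters.
SUPPORTED_LANGUAGES = {
    "zh": "Chinese", "en": "English", "ja": "Japanese", "ko": "Korean",
    "es": "Spanish", "fr": "French", "de": "German", "pt": "Portuguese",
    "vi": "Vietnamese", "ms": "Malay", "ru": "Russian", "it": "Italian",
    "nl": "Dutch", "sv": "Swedish", "fi": "Finnish", "da": "Danish",
    "no": "Norwegian", "hu": "Hungarian", "th": "Thai",
}


def build_language_option(language=None):
    """Return a request fragment that selects a language or auto-detection."""
    if language is None:
        # No explicit choice: fall back to automatic language detection.
        return {"language": "auto"}
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language code: {language!r}")
    # Explicit choice, which the feature description recommends.
    return {"language": language}
```

Validating the code on the client side before sending a request makes a mistyped language fail early instead of silently falling back to another mode.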
General Print Recognition (High-Precision)
Detects and recognizes text across entire images, returning the position of each text box and its content. Compared with the general printed text recognition API, it offers higher accuracy and recall and covers a wider range of scenarios. Application scenarios include printed text recognition, online image text recognition, ad image text recognition, street view store sign text recognition, menu text recognition, video title text recognition, and avatar text recognition.
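Because the result pairs each piece of text with a box position, a client can reassemble the detections into reading order. The sketch below assumes a hypothetical result shape, a list of dictionaries with a `text` string and a `box` holding `x` and `y` pixel coordinates of the top-left corner; the actual response format may differ.

```python
def reading_order(detections, line_tolerance=10):
    """Join detected text top-to-bottom, then left-to-right within a line.

    `detections` is a hypothetical structure used for illustration:
    [{"text": str, "box": {"x": int, "y": int}}, ...]
    """
    # Sort by top edge so boxes on the same visual line end up adjacent.
    ordered = sorted(detections, key=lambda d: d["box"]["y"])

    lines = []
    for det in ordered:
        # A box joins the current line if its top edge is within the
        # tolerance of the first box on that line; otherwise it starts
        # a new line.
        if lines and abs(det["box"]["y"] - lines[-1][0]["box"]["y"]) <= line_tolerance:
            lines[-1].append(det)
        else:
            lines.append([det])

    # Within each line, order boxes by their left edge.
    return "\n".join(
        " ".join(d["text"] for d in sorted(line, key=lambda d: d["box"]["x"]))
        for line in lines
    )
```

The fixed pixel tolerance is a simplification; for images with widely varying font sizes, a tolerance derived from box height would likely group lines more reliably.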
General Handwriting Recognition
Recognizes handwritten Chinese and English text, letters, numbers, and common characters across entire images in a variety of scenarios and layouts. Recognition has been enhanced to handle irregular, messy, and blurry handwriting. Typical applications include handwritten document entry in banking, insurance, and finance, and digitizing notes in education.
English Recognition
Detects and recognizes English text in images, returning the position of each text box and its content. It recognizes English words, letters, numbers, and common characters in a variety of scenarios and layouts, covering both printed and handwritten text. Typical applications include digitizing English notes and recognizing English exam answer sheets.
Table Recognition (V2)
Detects and recognizes regular tables, wireless (borderless) tables, and multiple tables in Chinese and English images and PDFs, as well as wired (bordered) tables in Japanese. It returns the text content of each cell, handles rotated table images, and can save the recognition results in Excel format.
Table Recognition (V3)
Detects and recognizes regular tables, wireless tables, and multiple tables in Chinese and English images and PDFs. It returns the text content of each cell, handles rotated table images, and can save the recognition results in Excel format. Recognition quality is better than that of Table Recognition (V2), and it covers a wider range of scenarios. It performs better on difficult tables such as wireless tables and nested tables (wireless tables within wired tables), and it resists interference from stamps and broken table lines. It suits customers who need higher accuracy and recall from the API.
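The per-cell text content described for table recognition can be rebuilt into rows on the client side. The sketch below assumes a hypothetical cell shape with zero-based `row` and `col` indices and a `text` string, and it ignores merged cells. It writes CSV, which spreadsheet tools can open, rather than the service's own Excel output.

```python
import csv
import io


def cells_to_rows(cells):
    """Arrange recognized cells into a rectangular grid of strings.

    `cells` is a hypothetical structure used for illustration:
    [{"row": int, "col": int, "text": str}, ...]
    """
    if not cells:
        return []
    n_rows = max(c["row"] for c in cells) + 1
    n_cols = max(c["col"] for c in cells) + 1
    # Cells that were not recognized stay as empty strings.
    grid = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    for c in cells:
        grid[c["row"]][c["col"]] = c["text"]
    return grid


def rows_to_csv(rows):
    """Serialize the grid as CSV text."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()
```

Filling the grid from explicit row and column indices keeps the output rectangular even when some cells are missing from the result.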
Advertisement Text Recognition
Detects and recognizes text in advertisement product images, returning the position of each text box and its content. It supports Chinese and English, horizontal and vertical text, text rotated by 90, 180, or 270 degrees, and tilted text. The recall and accuracy of text recognition can exceed 96%.