Tesseract is an open-source OCR project whose development has been sponsored by Google. See the project README for details.
https://github.com/tesseract-ocr/tesseract
https://github.com/tesseract-ocr/tesseract/wiki/Compiling-%E2%80%93-GitInstallation
First, install the required libraries:
apt-get install autoconf-archive automake g++ libtool libleptonica-dev make pkg-config
Then run:
cd tesseract-ocr
./autogen.sh
./configure
make
sudo make install
sudo ldconfig
The configure step fails with this error:
configure: error: Leptonica 1.74 or higher is required. Try to install libleptonica-dev package.
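The locally installed Leptonica turned out to be version 1.73. The documentation gives the following explanation: version 1.74 is not packaged and has to be downloaded and built from source.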
Tesseract versions and the minimum version of Leptonica required:

Tesseract  Leptonica  Ubuntu
4.00       1.74.2     Must build from source
3.05       1.74.0     Must build from source
3.04       1.71       Ubuntu 16.04 <http://packages.ubuntu.com/xenial/libtesseract3>
3.03       1.70       Ubuntu 14.04 <http://packages.ubuntu.com/trusty/libtesseract3>
3.02       1.69       Ubuntu 12.04 <http://packages.ubuntu.com/precise/libtesseract3>
3.01       1.67
wget http://www.leptonica.com/source/leptonica-1.74.4.tar.gz
tar xvf leptonica-1.74.4.tar.gz
cd leptonica-1.74.4
./configure
make
sudo make install
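Before re-running configure, it can help to confirm that the Leptonica version now meets the minimum from the table. The following is only a minimal sketch: the helper names version_ge and check_leptonica are made up for this illustration, and the comparison assumes a sort that supports -V (version sort), as GNU coreutils does.

```shell
#!/bin/sh
# version_ge A B: succeed (exit status 0) when version A >= version B.
# Assumes `sort -V` (version sort) is available, as in GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_leptonica INSTALLED REQUIRED (hypothetical helper for this post):
# report whether the installed Leptonica satisfies the required minimum.
check_leptonica() {
    installed="$1"
    required="$2"
    if version_ge "$installed" "$required"; then
        echo "ok: Leptonica $installed >= $required"
    else
        echo "too old: Leptonica $installed < $required, build from source"
        return 1
    fi
}
```

For example, check_leptonica 1.73 1.74 reports the situation hit above. If pkg-config can see the library, check_leptonica "$(pkg-config --modversion lept)" 1.74 should test the installed copy; this assumes Leptonica registered itself under the pkg-config name lept and that a copy installed under /usr/local is on pkg-config's search path.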
Once that succeeds, continue with the Tesseract installation. Then test it on an image:
tesseract digits1.png result -l chi_sim
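Command arguments: digits1.png is the input image, result is the output base name (the text is written to result.txt), and -l chi_sim selects the Simplified Chinese language data.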
This fails with the following error:
Error opening data file /usr/local/share//tessdata/chi_sim.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory.
Failed loading language 'chi_sim'
Tesseract couldn't load any languages!
Could not initialize tesseract.
The data path needs to be set:
export TESSDATA_PREFIX=/usr/local/share/tessdata/
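Then download the data for the languages you need from git@github.com:tesseract-ocr/tessdata.git. For Chinese, download the files whose names start with chi. Copy the data into the TESSDATA_PREFIX directory and run the recognition command again. The recognized text looks like this: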
0 电 话 18663778972
全 国 朝 号 2012127
&) H: 02 04 12 13 16 26 标 | 标标 _