Recognizing Text in Images with Python

R0A1NG

Published 2022-02-19 10:08:26

Part of the column: R0A1NG 技术分享 (R0A1NG Tech Sharing)

Installing Tesseract

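Download the installer from https://digi.bib.uni-mannheim.de/tesseract/. If you tick the option to download additional language packs during installation, the download fails because the download address is blocked in mainland China. Either use a proxy, or leave that option unticked and download the language packs separately from https://tesseract-ocr.github.io/tessdoc/Data-Files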

Download the language packs you need (chi_sim is Simplified Chinese), then move the downloaded files into the C:\Program Files\Tesseract-OCR\tessdata directory.
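The copy step above can also be scripted. The following is a minimal sketch, not part of the original article: the function name install_traineddata and the Downloads location are hypothetical, and the tessdata path is simply the default quoted above.

```python
import shutil
from pathlib import Path

# Default install location quoted in the article; adjust to your setup.
TESSDATA_DIR = Path(r"C:\Program Files\Tesseract-OCR\tessdata")


def install_traineddata(src, tessdata_dir=TESSDATA_DIR):
    """Copy a downloaded .traineddata file into the tessdata directory."""
    src = Path(src)
    tessdata_dir = Path(tessdata_dir)
    if src.suffix != ".traineddata":
        raise ValueError(f"not a .traineddata file: {src}")
    if not tessdata_dir.is_dir():
        raise FileNotFoundError(f"tessdata directory not found: {tessdata_dir}")
    # shutil.copy2 returns the path of the newly created file.
    return Path(shutil.copy2(src, tessdata_dir))


if __name__ == "__main__":
    # Hypothetical download location (an assumption, not from the article).
    downloaded = Path.home() / "Downloads" / "chi_sim.traineddata"
    if downloaded.is_file() and TESSDATA_DIR.is_dir():
        print("copied to", install_traineddata(downloaded))
    else:
        print("nothing copied; check both paths")
```

Writing under C:\Program Files usually requires an administrator prompt, so the script may need to be run elevated.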

Open a command prompt (cmd) and run tesseract --list-langs

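If chi_sim appears in the list, the Chinese language pack was installed successfully. If the command is not found, you need to add the Tesseract installation directory to the PATH environment variable yourself.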
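The same check can be done from Python. This is a rough sketch under a few assumptions: the helper names parse_langs and installed_langs are made up for illustration, and the header line of the --list-langs output is assumed to start with "List of", which may differ between Tesseract versions.

```python
import shutil
import subprocess


def parse_langs(output):
    """Extract language codes from `tesseract --list-langs` output."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    # Skip the header line, e.g. 'List of available languages (2):'.
    return [line for line in lines if not line.startswith("List of")]


def installed_langs():
    """Return installed language codes, or None if tesseract is not on PATH."""
    exe = shutil.which("tesseract")
    if exe is None:
        return None
    result = subprocess.run([exe, "--list-langs"], capture_output=True, text=True)
    # Depending on the version, the list may go to stdout or stderr.
    return parse_langs(result.stdout + result.stderr)


if __name__ == "__main__":
    langs = installed_langs()
    if langs is None:
        print("tesseract not found; add its install directory to PATH")
    elif "chi_sim" in langs:
        print("chi_sim is installed")
    else:
        print("chi_sim is missing; installed:", langs)
```

A None result corresponds to the "command not found" case, which points at the PATH environment variable rather than at the language pack.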

Python script

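First install the required modules:

pip install pillow
pip install pytesseract

Then go to the pytesseract package inside your Python installation directory (mine, for example, is E:\python3\Lib\site-packages\pytesseract), open pytesseract.py, find tesseract_cmd = 'tesseract', and change it to tesseract_cmd = 'C:\\Program Files\\Tesseract-OCR\\tesseract.exe'. Use the path of your own Tesseract installation.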
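As an alternative to editing the installed library file, pytesseract lets you set pytesseract.pytesseract.tesseract_cmd at runtime, so the change lives in your own script and survives a reinstall of the package. The sketch below is one possible way to do that; find_tesseract and configure_pytesseract are hypothetical helper names, and the candidate path is just the one quoted above.

```python
import os
import shutil

# Install path quoted in the article; change it to match your machine.
DEFAULT_CANDIDATES = [r"C:\Program Files\Tesseract-OCR\tesseract.exe"]


def find_tesseract(candidates=DEFAULT_CANDIDATES):
    """Return the first existing candidate path, else whatever is on PATH."""
    for path in candidates:
        if os.path.isfile(path):
            return path
    return shutil.which("tesseract")  # None if nothing is found


def configure_pytesseract():
    """Point pytesseract at the executable without editing pytesseract.py."""
    import pytesseract  # imported lazily so find_tesseract works without it

    exe = find_tesseract()
    if exe is None:
        raise FileNotFoundError("tesseract executable not found")
    pytesseract.pytesseract.tesseract_cmd = exe
    return exe
```

Calling configure_pytesseract() once at the start of a script, before any OCR call, should be enough.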

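from PIL import Image
import pytesseract

# Open the image to recognize (2.jpg in the current directory)
img = Image.open('2.jpg')

# Run OCR with the Simplified Chinese language pack
text = pytesseract.image_to_string(img, lang='chi_sim')

print(text)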
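If an image mixes Chinese and English, Tesseract accepts several language codes joined with "+". The following sketch wraps the script above in a reusable function; join_langs and ocr_image are illustrative names, and it assumes the eng language pack is installed alongside chi_sim.

```python
def join_langs(*codes):
    """Join language codes with '+', the form Tesseract accepts for several languages."""
    if not codes:
        raise ValueError("at least one language code is required")
    return "+".join(codes)


def ocr_image(path, *codes):
    """Run OCR on one image file and return the text with outer whitespace stripped."""
    from PIL import Image
    import pytesseract

    # The context manager closes the file handle once OCR is done.
    with Image.open(path) as img:
        return pytesseract.image_to_string(img, lang=join_langs(*codes)).strip()


if __name__ == "__main__":
    import os

    # 2.jpg is the sample file name used in the article.
    if os.path.exists("2.jpg"):
        print(ocr_image("2.jpg", "chi_sim", "eng"))
```

Passing a single code, as in ocr_image("2.jpg", "chi_sim"), behaves like the original script apart from the stripped whitespace.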

Originally published in June 2021 on the author's personal site/blog.
