Q: How do I use trained data with pytesseract?

Stack Overflow user

Asked 2017-05-25 22:59:11

Answers: 1 · Views: 12K · Followers: 0 · Votes: 6

I trained a new font with this tool, http://trainyourtesseract.com/, and I would like to use it with pytesseract. The tool gives me a *.traineddata file.

At the moment I am using this simple script:

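try:
    import Image
except ImportError:
    from PIL import Image
import pytesseract as tes

# boxes=True asks for character bounding boxes; newer pytesseract releases
# provide image_to_boxes() for this.
results = tes.image_to_string(Image.open('./test.jpg'), boxes=True)

# Append the OCR output to a file and close it when done.
with open('parsing.text', 'a') as file:
    file.write(results)
print(results)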

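How can I use my trained data file so that the Python script can read the new font?

Thanks!

Edit #1: I now know that a *.traineddata file can be used with Tesseract as a command-line program. My question stays the same: how do I use the trained data from Python?

Edit #2: The answer to my question is here: "How to access the command line for Tesseract from Python?"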
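
For readers following the command-line route mentioned in the edits, here is a minimal sketch of calling the Tesseract CLI from Python. It assumes the `tesseract` executable is on the PATH; `myfont` and the `tessdata_dir` value are hypothetical placeholders, not names from the question.

```python
import subprocess


def build_tesseract_command(image_path, lang, tessdata_dir=None):
    """Build the argument list for a Tesseract CLI call that prints text to stdout."""
    # "stdout" as the output base makes Tesseract print the result instead of writing a file.
    cmd = ["tesseract", image_path, "stdout", "-l", lang]
    if tessdata_dir is not None:
        # Directory that contains <lang>.traineddata (hypothetical location).
        cmd += ["--tessdata-dir", tessdata_dir]
    return cmd


def run_tesseract(image_path, lang, tessdata_dir=None):
    """Run Tesseract and return the recognized text (requires the tesseract binary)."""
    cmd = build_tesseract_command(image_path, lang, tessdata_dir)
    completed = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return completed.stdout
```

Calling `run_tesseract("./test.jpg", "myfont", "./tessdata")` would then look for `./tessdata/myfont.traineddata`, assuming that file exists.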

ocr

tesseract

python-tesseract

1 Answer

Stack Overflow user

Accepted answer

Answered 2017-05-26 22:10:11

Here is an example of pytesseract.image_to_string() with options.

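from PIL import Image
import pytesseract

pytesseract.image_to_string(Image.open("./imagesStackoverflow/xyz-small-gray.png"),
                            lang="eng", boxes=False,
                            # adjacent string literals are joined into one config string
                            config="--psm 4 --oem 3 "
                                   "-c tessedit_char_whitelist=-01234567890XYZ:")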

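To use your own trained language data, replace "eng" in lang="eng" with the name of your language, that is, the file name of your .traineddata file without the extension.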
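
As a rough sketch of that substitution, the helper below derives the `lang` value from a .traineddata path and points Tesseract at the containing folder via `--tessdata-dir`. The function names and the `myfont.traineddata` file are hypothetical, and the sketch assumes Tesseract is not already configured to find the file in its default tessdata directory.

```python
from pathlib import Path


def tesseract_args(traineddata_path, psm=4):
    """Derive the lang name and a config string from a .traineddata file path."""
    path = Path(traineddata_path)
    if path.suffix != ".traineddata":
        raise ValueError(f"expected a .traineddata file, got {path.name!r}")
    lang = path.stem  # "myfont.traineddata" -> "myfont"
    # --tessdata-dir tells Tesseract where to look for <lang>.traineddata
    config = f'--tessdata-dir "{path.parent}" --psm {psm}'
    return lang, config


def ocr_with_custom_font(image_path, traineddata_path):
    """Run OCR using the custom trained data (needs Pillow, pytesseract, and Tesseract)."""
    # Imported here so tesseract_args() works without these packages installed.
    from PIL import Image
    import pytesseract

    lang, config = tesseract_args(traineddata_path)
    return pytesseract.image_to_string(Image.open(image_path), lang=lang, config=config)
```

For example, `ocr_with_custom_font("./test.jpg", "./tessdata/myfont.traineddata")` would pass `lang="myfont"` to pytesseract. If the file is copied into Tesseract's own tessdata directory instead, the `--tessdata-dir` part can be dropped.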

Votes: 7

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/44183679
