
You must set the "page segmentation mode" to "single char".

For example, in Android you do the following:

<pre><code>api.setPageSegMode(TessBaseAPI.PageSegMode.PSM_SINGLE_CHAR);
</code></pre>


Python code for that configuration looks like this:
<pre><code>import cv2
import pytesseract

img = cv2.imread(&quot;path to some image&quot;)
# Whitelist alphanumerics and treat the image as a single character (--psm 10).
text = pytesseract.image_to_string(
    img,
    config=(&quot;-c tessedit_char_whitelist=abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789&quot;
            &quot; --psm 10&quot;
            &quot; -l osd&quot;))
print(text)
</code></pre>
The <code>--psm</code> flag defines the page segmentation mode.
According to the Tesseract documentation, <code>10</code> means:
<blockquote>
Treat the image as a single character.
</blockquote>
So to recognize a single character, you just need the <code>--psm 10</code> flag.
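The configuration string above can also be assembled by a small helper, which keeps the whitelist and the segmentation mode in one place. This is only a sketch: `build_single_char_config` and `recognize_single_char` are hypothetical names, not part of pytesseract, and the default language `eng` is an assumption about which traineddata file is installed.

```python
import string

# Tesseract documents PSM 10 as "Treat the image as a single character."
PSM_SINGLE_CHAR = 10


def build_single_char_config(whitelist=None, psm=PSM_SINGLE_CHAR):
    """Build a pytesseract config string for single-character recognition.

    Hypothetical helper for illustration; not part of pytesseract.
    """
    if whitelist is None:
        # a-z, A-Z, 0-9, the same set used in the answer above
        whitelist = string.ascii_letters + string.digits
    if any(ch.isspace() for ch in whitelist):
        # Whitespace would split the value into separate command-line arguments.
        raise ValueError("whitelist must not contain whitespace")
    return f"-c tessedit_char_whitelist={whitelist} --psm {psm}"


def recognize_single_char(image, lang="eng"):
    """Run Tesseract on an image that is expected to hold one character."""
    # Imported lazily so the config helper works without Tesseract installed.
    import pytesseract

    text = pytesseract.image_to_string(
        image, lang=lang, config=build_single_char_config()
    )
    return text.strip()
```

Calling `recognize_single_char(img)` with the image loaded earlier should behave like the snippet above, except that the language is passed through the `lang` argument rather than inside the config string.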


You need to set Tesseract's page segmentation mode to "single character."


Have you seen this?

<a href="https://code.google.com/p/tesseract-ocr/issues/detail?id=581" rel="nofollow noreferrer">https://code.google.com/p/tesseract-ocr/issues/detail?id=581</a>

The bug list shows it as "no longer an issue".

<ul>
<li>Be sure to have high resolution images.</li>
<li>If you are resizing the image, be sure to keep a high DPI and don't shrink it too much.</li>
<li>Be sure to <a href="https://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract" rel="nofollow noreferrer">train your Tesseract system</a>.</li>
<li>Call <code>baseApi.setVariable("tessedit_char_whitelist", "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz");</code> before the <code>init Tesseract</code> code.</li>
<li>Also, you may look into <a href="https://stackoverflow.com/questions/316068/what-is-the-ideal-font-for-ocr">which font to use with OCR</a></li>
</ul>
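As a rough illustration of the resizing advice in the list above, the sketch below computes how much to enlarge an image so that it reaches a target DPI, and never shrinks it. The 300 DPI target is an assumption taken from Tesseract's general image-quality guidance, and `upscale_factor` and `scaled_size` are hypothetical helper names.

```python
import math

# Assumed target; Tesseract's guidance suggests roughly 300 DPI or more.
TARGET_DPI = 300


def upscale_factor(source_dpi, target_dpi=TARGET_DPI):
    """Return the factor to enlarge an image by so it reaches target_dpi.

    Never returns less than 1.0, since the advice is to avoid shrinking.
    """
    if source_dpi <= 0:
        raise ValueError("source_dpi must be positive")
    return max(1.0, target_dpi / source_dpi)


def scaled_size(width, height, source_dpi, target_dpi=TARGET_DPI):
    """Return the (width, height) in pixels after applying upscale_factor."""
    factor = upscale_factor(source_dpi, target_dpi)
    return math.ceil(width * factor), math.ceil(height * factor)
```

The resulting size could then be passed to whatever resize routine the image library provides, for example `cv2.resize` in the OpenCV setup used earlier.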

How to reproduce:

<ol>
<li>Create a new image in Paint (any size).</li>
<li>Add the letter A to this image.</li>
<li>Try to recognize it -> Tesseract will not find any letters.</li>
<li>Copy and paste this letter 5-6 times into the image.</li>
<li>Try to recognize it again -> Tesseract will find all the letters.</li>
</ol>

Why?

Tesseract does not recognize single characters
