首页
学习
活动
专区
圈层
工具
发布
首页
学习
活动
专区
圈层
工具
MCP广场
社区首页 >问答首页 >使用pytesseract在一串单词周围画一个矩形

使用pytesseract在一串单词周围画一个矩形
EN

Stack Overflow用户
提问于 2022-10-27 11:03:42
回答 1查看 37关注 0票数 -1

这是我的形象:

我能认出以下几个字:

我需要检查图像中是否有带有特定文本的一行,并用矩形突出显示这一行。例如。我检查是否有“时间,这是最坏的”。然后我希望看到:

我怎样才能做到这一点?

我的代码:

代码语言:javascript
运行
复制
import cv2 as cv
from pytesseract import pytesseract, Output
unchanged_image = cv.imread("i1Abv.png", cv.IMREAD_UNCHANGED)
initial_image = cv.imread("i1Abv.png", 0)
cv.imshow('', initial_image)
cv.waitKey(0)

ret, image = cv.threshold(initial_image, 100, 255, cv.THRESH_BINARY)

cv.imshow('', image)
cv.waitKey(0)

results = pytesseract.image_to_data(image, output_type=Output.DICT, config="--psm 6")
for i in range(0, len(results["text"])):
    # extract the bounding box coordinates of the text region from
    # the current result
    x = results["left"][i]
    y = results["top"][i]
    w = results["width"][i]
    h = results["height"][i]
    # extract the OCR text itself along with the confidence of the
    # text localization
    text = results["text"][i]
    conf = float(results["conf"][i])

    # filter out weak confidence text localizations
    if conf > 10:

        # strip out non-ASCII text, so we can draw the text on the image
        # using OpenCV, then draw a bounding box around the text along
        # with the text itself
        cv.rectangle(unchanged_image, (x, y), (x + w, y + h), (0, 255, 0), 2)
        cv.putText(img=unchanged_image, text=text, org=(x, y), fontFace=cv.FONT_HERSHEY_COMPLEX,
                   fontScale=0.3, color=(36, 0, 255), thickness=1)

cv.imshow('', unchanged_image)
cv.waitKey(0)
EN

回答 1

Stack Overflow用户

回答已采纳

发布于 2022-10-27 19:00:02

pytesseract.image_to_data为每个可识别的文本块提供line_num。您可以将所有识别信息按line_num分组,然后将单词连接到一个文本行中。此外,您还需要找到一个边框,其中包括每个单词的所有文本框(您可以使用lefttopwidthheight获得它们)。你可以通过cv2.boundingRect找到它

结果:

代码:

代码语言:javascript
运行
复制
import cv2
import numpy as np
import pytesseract

original_image = cv2.imread("text_line.png")
gray = cv2.cvtColor(original_image, cv2.COLOR_BGR2GRAY)
target_text = "times, it was the worst"

df = pytesseract.image_to_data(gray, lang="eng", config="--psm 6", output_type=pytesseract.Output.DATAFRAME)

# group recognized words by lines
for line_num, words_per_line in df.groupby("line_num"):
    # filter out words with a low confidence
    words_per_line = words_per_line[words_per_line["conf"] >= 5]
    if not len(words_per_line):
        continue

    words = words_per_line["text"].values
    line = " ".join(words)
    print(f"{line_num} '{line}'")

    if target_text in line:
        print("Found a line with specified text")
        word_boxes = []
        for left, top, width, height in words_per_line[["left", "top", "width", "height"]].values:
            word_boxes.append((left, top))
            word_boxes.append((left + width, top + height))

        x, y, w, h = cv2.boundingRect(np.array(word_boxes))
        cv2.rectangle(original_image, (x, y), (x + w, y + h), color=(255, 0, 255), thickness=3)

cv2.imwrite("out.jpg", original_image)
票数 2
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/74221064

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档