Q: How to crop specific parts of an image using PaddleOCR bounding box coordinates

Stack Overflow user

Asked 2022-10-21 09:53:34

Answers 1 · Views 139 · Followers 0 · Votes 0

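I am currently working on detecting Vietnamese text in images.

For detection I use PaddleOCR, because I need line-by-line detection. Its detection results are essentially perfect on my images. PaddleOCR can also do recognition, but it was not trained on Vietnamese, so its recognition results are not reliable.

For recognition I will therefore use VietOCR, which gives me correct results. The problem with VietOCR is that it only works when it is given a cropped image of a single text line, not the full image.

My plan is to crop the image into multiple pieces using the bounding box coordinates generated by PaddleOCR.

I am using PaddleOCR for text detection.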

Sample Code 

from paddleocr import PaddleOCR,draw_ocr
# Paddleocr supports Chinese, English, French, German, Korean and Japanese.
# You can set the parameter `lang` as `ch`, `en`, `french`, `german`, `korean`, `japan`
# to switch the language model in order.
ocr = PaddleOCR(use_angle_cls=True) # need to run only once to download and load model into memory
img_path = '/content/im1502.jpg'
result = ocr.ocr(img_path, cls=True)
resultss = result
for idx in range(len(result)):
    res = result[idx]
    for line in res:
        print(line)

# draw result
from PIL import Image
result = result[0]
results = result
image = Image.open(img_path).convert('RGB')
boxes = [line[0] for line in result]
txts = [line[1][0] for line in result]
scores = [line[1][1] for line in result]
im_show = draw_ocr(image, boxes, txts=None, scores=None, font_path='/path/to/PaddleOCR/doc/fonts/simfang.ttf')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')

Bounding box image and result

[[[108.0, 78.0], [289.0, 93.0], [286.0, 131.0], [104.0, 116.0]],
 [[51.0, 230.0], [267.0, 235.0], [266.0, 272.0], [50.0, 267.0]],
 [[17.0, 304.0], [343.0, 304.0], [343.0, 340.0], [17.0, 340.0]]]
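Each entry above is a quadrilateral given as four [x, y] corners. As a minimal sketch, assuming an axis-aligned crop is acceptable for slightly tilted lines, the corners can be reduced to a (left, top, right, bottom) rectangle. The helper name `quad_to_rect` is hypothetical and not part of PaddleOCR.

```python
import math

def quad_to_rect(quad):
    """Return (left, top, right, bottom) integer bounds enclosing a 4-point box."""
    xs = [x for x, _ in quad]
    ys = [y for _, y in quad]
    # Floor the minimum and ceil the maximum so no part of the text is cut off.
    return (math.floor(min(xs)), math.floor(min(ys)),
            math.ceil(max(xs)), math.ceil(max(ys)))

# The three boxes printed above.
boxes = [
    [[108.0, 78.0], [289.0, 93.0], [286.0, 131.0], [104.0, 116.0]],
    [[51.0, 230.0], [267.0, 235.0], [266.0, 272.0], [50.0, 267.0]],
    [[17.0, 304.0], [343.0, 304.0], [343.0, 340.0], [17.0, 340.0]],
]
rects = [quad_to_rect(quad) for quad in boxes]
print(rects)
# [(104, 78, 289, 131), (50, 230, 267, 272), (17, 304, 343, 340)]
```

The tuple order matches the (left, upper, right, lower) box that PIL's `Image.crop` expects, so `image.crop(rect)` on the PIL image opened earlier would give one crop per text line.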

Recognition with VietOCR

import matplotlib.pyplot as plt
from PIL import Image

from vietocr.tool.predictor import Predictor
from vietocr.tool.config import Cfg
config = Cfg.load_config_from_name('vgg_transformer')

# config['weights'] = './weights/transformerocr.pth'
#config['weights'] = 'https://drive.google.com/uc?id=13327Y1tz1ohsm5YZMyXVMPIOjoOA0OaA'
config['cnn']['pretrained']=False
#config['device'] = 'cuda:0'
config['predictor']['beamsearch']=False

detector = Predictor(config)

img = '/content/im1502.jpg'
img = Image.open(img)
plt.imshow(img)
result = detector.predict(img)
result

The result is

SẢNH CHUNG

So can anyone help me with how to crop the image using the PaddleOCR bounding box coordinates?

ocr

bounding-box

paddle-paddle

python

python-imaging-library

1 Answer

Stack Overflow user

Answered 2022-10-27 06:55:21

Thanks, I figured it out. Here `boxes` is the list of bounding boxes from PaddleOCR.

import cv2
import numpy as np

# `boxes` and `img_path` come from the PaddleOCR code in the question.
img = cv2.imread(img_path)
height, width = img.shape[:2]

crops = []  # one cropped image per detected text line
for quad in boxes:
    # Four corner points as an int32 array of shape (4, 2).
    pts = np.array(quad).astype(np.int32).reshape(-1, 2)
    # Use a fresh mask per box so earlier boxes do not leak into this crop.
    mask = np.zeros((height, width), dtype=np.uint8)
    cv2.fillPoly(mask, [pts], 255)
    # Keep only the pixels inside the polygon; everything else becomes black.
    res = cv2.bitwise_and(img, img, mask=mask)
    x, y, w, h = cv2.boundingRect(pts)  # enclosing upright rectangle
    crops.append(res[y:y + h, x:x + w])

Each cropped region is the part of img inside one bounding box.

Votes 0

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/74151879
