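I am currently working on detecting Vietnamese text in images.
For the detection step I use PaddleOCR, because I need line-by-line detection. PaddleOCR's detection gives me 100% correct results. PaddleOCR can also do recognition, but it was not trained on Vietnamese, so its recognition results are not 100% correct.
For recognition I therefore want to use VietOCR, which gives 100% correct results. The problem is that VietOCR only works when it is given a cropped image of a text line, not the full image.
My plan is to crop the image into multiple pieces using the bounding-box coordinates produced by PaddleOCR.
This is how I am using PaddleOCR for text detection.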
Sample Code
from paddleocr import PaddleOCR, draw_ocr

# PaddleOCR supports Chinese, English, French, German, Korean and Japanese.
# You can set the parameter `lang` as `ch`, `en`, `french`, `german`, `korean`, `japan`
# to switch the language model in order.
ocr = PaddleOCR(use_angle_cls=True)  # need to run only once to download and load model into memory
img_path = '/content/im1502.jpg'
result = ocr.ocr(img_path, cls=True)
for idx in range(len(result)):
    res = result[idx]
    for line in res:
        print(line)

# draw result
from PIL import Image
result = result[0]  # detections for the first (and only) image
image = Image.open(img_path).convert('RGB')
boxes = [line[0] for line in result]      # four corner points per text line
txts = [line[1][0] for line in result]    # recognized text
scores = [line[1][1] for line in result]  # recognition confidence
im_show = draw_ocr(image, boxes, txts=None, scores=None, font_path='/path/to/PaddleOCR/doc/fonts/simfang.ttf')
im_show = Image.fromarray(im_show)
im_show.save('result.jpg')
Detected bounding boxes:
[[[108.0, 78.0], [289.0, 93.0], [286.0, 131.0], [104.0, 116.0]],
[[51.0, 230.0], [267.0, 235.0], [266.0, 272.0], [50.0, 267.0]],
[[17.0, 304.0], [343.0, 304.0], [343.0, 340.0], [17.0, 340.0]]]
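Each box above is a list of four [x, y] corner points, and the quadrilateral may be slightly tilted. As a minimal sketch of one way to turn such a box into a crop region, the snippet below computes the axis-aligned rectangle that encloses the four corners. The helper name quad_to_rect is hypothetical, and the 400 x 400 image size is a placeholder, since the question does not give the real size.

```python
import math


def quad_to_rect(quad, width, height):
    """Axis-aligned (left, top, right, bottom) box enclosing a 4-point quad,
    clamped to an image of the given width and height."""
    xs = [p[0] for p in quad]
    ys = [p[1] for p in quad]
    left = max(0, math.floor(min(xs)))
    top = max(0, math.floor(min(ys)))
    right = min(width, math.ceil(max(xs)))
    bottom = min(height, math.ceil(max(ys)))
    return left, top, right, bottom


# Boxes copied from the detection output in the question
boxes = [
    [[108.0, 78.0], [289.0, 93.0], [286.0, 131.0], [104.0, 116.0]],
    [[51.0, 230.0], [267.0, 235.0], [266.0, 272.0], [50.0, 267.0]],
    [[17.0, 304.0], [343.0, 304.0], [343.0, 340.0], [17.0, 340.0]],
]

# Placeholder image size (assumption): replace with the real width and height
rects = [quad_to_rect(q, 400, 400) for q in boxes]
print(rects)
# [(104, 78, 289, 131), (50, 230, 267, 272), (17, 304, 343, 340)]
```

A tuple in this form is what PIL's Image.crop expects, so image.crop(rect) should give one text-line image per box. Compared with a polygon mask, this keeps the background pixels in the corners around a tilted box.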
Recognition with VietOCR
import matplotlib.pyplot as plt
from PIL import Image
from vietocr.tool.predictor import Predictor
from vietocr.tool.config import Cfg

config = Cfg.load_config_from_name('vgg_transformer')
# config['weights'] = './weights/transformerocr.pth'
# config['weights'] = 'https://drive.google.com/uc?id=13327Y1tz1ohsm5YZMyXVMPIOjoOA0OaA'
config['cnn']['pretrained'] = False
# config['device'] = 'cuda:0'
config['predictor']['beamsearch'] = False
detector = Predictor(config)

img = '/content/im1502.jpg'
img = Image.open(img)
plt.imshow(img)
result = detector.predict(img)
result
The result is:
SẢNH CHUNG
So can anyone help me with how to crop the image using the PaddleOCR bounding-box coordinates?
Posted on 2022-10-27 06:55:21
Thanks, I figured it out. Here, boxes is the list of bounding boxes from PaddleOCR.
import cv2
import numpy as np

img = cv2.imread(img_path)
height, width = img.shape[:2]
crops = []  # one cropped image per detected text line
for quad in boxes:
    box = np.array(quad).astype(np.int32).reshape(-1, 2)
    points = np.array([box])
    # Use a fresh mask for each box so earlier boxes do not leak into this crop
    mask = np.zeros((height, width), dtype=np.uint8)
    cv2.fillPoly(mask, points, 255)
    res = cv2.bitwise_and(img, img, mask=mask)
    x, y, w, h = cv2.boundingRect(box)  # returns (x, y, w, h) of the rect
    crops.append(res[y:y + h, x:x + w])
Each cropped box is a part of img.
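To finish the pipeline, the cropped arrays still have to be handed to VietOCR. The question's code calls detector.predict with a PIL image, while cv2.imread returns BGR NumPy arrays, so a conversion step is needed. The following is a rough sketch under these assumptions: the crops are collected in a list (called crops here, a name chosen for this sketch), and predictor is any object with a predict(pil_image) method like the question's detector. The helper names bgr_to_rgb and recognize_crops are hypothetical.

```python
import numpy as np


def bgr_to_rgb(crop):
    """Reverse the channel order of an H x W x 3 array (OpenCV loads BGR)."""
    return np.ascontiguousarray(crop[:, :, ::-1])


def recognize_crops(crops, predictor):
    """Run predictor.predict on each crop and return the texts in order.

    `predictor` is assumed to expose predict(pil_image), like the VietOCR
    `detector` object created in the question.
    """
    from PIL import Image  # imported here so bgr_to_rgb needs only NumPy

    texts = []
    for crop in crops:
        if crop.size == 0:  # skip degenerate boxes that gave an empty crop
            continue
        texts.append(predictor.predict(Image.fromarray(bgr_to_rgb(crop))))
    return texts
```

With the objects from the question and this answer, the call would look like texts = recognize_crops(crops, detector), giving one recognized string per detected line.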
https://stackoverflow.com/questions/74151879