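I'm new to pytesseract and OCR. From searching online I learned that it is a tool for extracting text from images, but I had no prior experience with it. Now I'm getting this error: tesseract is not installed or it's not in your PATH. See README file for more information.
I don't know how to fix this. I tried various solutions I found online, but unfortunately none of them worked.
Error: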
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
~/.local/lib/python3.9/site-packages/pytesseract/pytesseract.py in run_tesseract(input_filename, output_filename_base, extension, lang, config, nice, timeout)
254 try:
--> 255 proc = subprocess.Popen(cmd_args, **subprocess_args())
256 except OSError as e:
/opt/conda/lib/python3.9/subprocess.py in __init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags, restore_signals, start_new_session, pass_fds, user, group, extra_groups, encoding, errors, text, umask)
950
--> 951 self._execute_child(args, executable, preexec_fn, close_fds,
952 pass_fds, cwd, env,
/opt/conda/lib/python3.9/subprocess.py in _execute_child(self, args, executable, preexec_fn, close_fds, pass_fds, cwd, env, startupinfo, creationflags, shell, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite, restore_signals, gid, gids, uid, umask, start_new_session)
1822 err_msg = os.strerror(errno_num)
-> 1823 raise child_exception_type(errno_num, err_msg, err_filename)
1824 raise child_exception_type(err_msg)
FileNotFoundError: [Errno 2] No such file or directory: 'tesseract'
During handling of the above exception, another exception occurred:
TesseractNotFoundError Traceback (most recent call last)
<ipython-input-7-96e86f1cd397> in <module>
1 img = cv2.imread("Zähler NSHV KTL-Durchlaufanlage-1.jpg")
----> 2 data = pytesseract.image_to_string(img)
3 print(data)
4 # plt.imshow(img)
~/.local/lib/python3.9/site-packages/pytesseract/pytesseract.py in image_to_string(image, lang, config, nice, output_type, timeout)
407 args = [image, 'txt', lang, config, nice, timeout]
408
--> 409 return {
410 Output.BYTES: lambda: run_and_get_output(*(args + [True])),
411 Output.DICT: lambda: {'text': run_and_get_output(*args)},
~/.local/lib/python3.9/site-packages/pytesseract/pytesseract.py in <lambda>()
410 Output.BYTES: lambda: run_and_get_output(*(args + [True])),
411 Output.DICT: lambda: {'text': run_and_get_output(*args)},
--> 412 Output.STRING: lambda: run_and_get_output(*args),
413 }[output_type]()
414
~/.local/lib/python3.9/site-packages/pytesseract/pytesseract.py in run_and_get_output(image, extension, lang, config, nice, timeout, return_bytes)
285 }
286
--> 287 run_tesseract(**kwargs)
288 filename = kwargs['output_filename_base'] + extsep + extension
289 with open(filename, 'rb') as output_file:
~/.local/lib/python3.9/site-packages/pytesseract/pytesseract.py in run_tesseract(input_filename, output_filename_base, extension, lang, config, nice, timeout)
257 if e.errno != ENOENT:
258 raise e
--> 259 raise TesseractNotFoundError()
260
261 with timeout_manager(proc, timeout) as error_string:
TesseractNotFoundError: tesseract is not installed or it's not in your PATH. See README file for more information.
Corresponding code:
!pip install tesseract
import pytesseract
import cv2
from PIL import Image
import matplotlib.pyplot as plt
img = cv2.imread("meter.jpg")
data = pytesseract.image_to_string(img)
print(data)
# plt.imshow(img)
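I should mention that I'm using Jupyter. More precisely, I'm working on my university's JupyterHub. I also read online that this can be solved using 'cmd'. If so, please tell me how to do that, or do I have to contact the university admin to get this fixed? Any help is much appreciated!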
Posted on 2021-06-22 14:03:25
The likely cause of this error is that you installed pytesseract with pip but did not install the Tesseract binary itself. If so, you can install it as follows:
On Linux:
sudo apt update
sudo apt install tesseract-ocr
sudo apt install libtesseract-dev
On Windows: download the installer, then set the path to the binary in your code:
pytesseract.pytesseract.tesseract_cmd = 'C:\\Program Files (x86)\\Tesseract-OCR\\tesseract.exe'
On Mac:
brew install tesseract
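Before or after installing, it can help to confirm whether the binary is actually visible to the Python process running the notebook. The following is a minimal sketch using only the standard library. The helper names `find_tesseract` and `tesseract_version` are hypothetical and not part of pytesseract.

```python
import shutil
import subprocess


def find_tesseract(executable="tesseract"):
    """Return the full path of the binary if it is on PATH, else None."""
    return shutil.which(executable)


def tesseract_version(path):
    """Run `<path> --version` and return the first line it prints."""
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    # Depending on the release, the version may go to stdout or stderr,
    # so fall back to stderr when stdout is empty.
    output = result.stdout or result.stderr
    return output.splitlines()[0] if output else ""


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        # This is the situation that makes pytesseract raise
        # TesseractNotFoundError.
        print("tesseract was not found on PATH; install the binary first")
    else:
        print(path, "->", tesseract_version(path))
```

If `find_tesseract()` returns `None`, the binary is either not installed or installed somewhere outside PATH. In the second case, assigning its full path to `pytesseract.pytesseract.tesseract_cmd` should be enough.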
Posted on 2022-09-03 12:30:05
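On Windows, if Tesseract was installed for the current user only, the binary will be inside the user's folder, for example: C:\Users\<User.Name>\AppData\Local\Tesseract-OCR\tesseract.exe
Setting that path in the code the same way works fine.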
pytesseract.pytesseract.tesseract_cmd = r'C:\Users\John.Doe\AppData\Local\Tesseract-OCR\tesseract.exe'
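To avoid hardcoding a user name, the two Windows locations mentioned in this thread can be built from environment variables. This is only a sketch: `candidate_tesseract_paths` and `first_existing` are hypothetical helper names, and it assumes the installer used its default `Tesseract-OCR` folder name.

```python
import os
from pathlib import PureWindowsPath


def candidate_tesseract_paths(env=None):
    """Build the likely Windows install paths from environment variables."""
    env = os.environ if env is None else env
    candidates = []
    # Per-user install first, then the system-wide 32-bit location.
    for var in ("LOCALAPPDATA", "ProgramFiles(x86)"):
        base = env.get(var)
        if base:
            path = PureWindowsPath(base) / "Tesseract-OCR" / "tesseract.exe"
            candidates.append(str(path))
    return candidates


def first_existing(paths):
    """Return the first path that points to an existing file, else None."""
    for path in paths:
        if os.path.isfile(path):
            return path
    return None


if __name__ == "__main__":
    found = first_existing(candidate_tesseract_paths())
    # If a path is found, it can be assigned to
    # pytesseract.pytesseract.tesseract_cmd as shown in the answers.
    print(found or "no Tesseract binary found in the expected locations")
```

On Linux or macOS these environment variables are normally unset, so the list is empty and the script simply reports that nothing was found.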
https://stackoverflow.com/questions/68084044