Encrypting many PDFs with Python using PyPDF2_Merging PDF files with Python PyPDF2_Printing PDFs in Python without pyPDF2 - Tencent Cloud Developer Community

python, pypdf2

I want to read a PDF file. It is book.pdf, which is password-protected (256-bit AES encryption). I know the password. I am using Jupyter Notebook. I get an error: import PyPDF2 pdfReader = PyPDF2.PdfFileReader(open('book.pdf', 'rb')) pdfReader.decrypt('333') pdfReader.getPage(0) --------------------------------------------------------------------------- N

Views: 3 | Asked: 2018-06-08 | Votes: 8

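1 answer

How do I read and print the contents of a PDF in Python 2.7?

pdf, python

I am using the PyPDF2 library and have opened a PDF file. file = open("C:\\Users\\ZJ\\S40rooms.pdf",'rb') What do I need to know in order to read the contents of the PDF? I need to learn about all the functions in PyPDF2 so that I can use them later. Also, regarding searching within a PDF in Python 2.7: my PDF contains a table, and to make searching easier I need to separate each column.

Views: 0 | Asked: 2016-03-24 | Votes: 0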

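1 answer

PyPDF2 PdfReadError: Could not read Boolean object

python, pypdf2

I get the following error when reading certain PDF files with PyPDF2. Because the files are confidential I cannot share them, but I can try to provide information that helps solve the problem. Stack trace - inputpdf = PdfFileReader(open(pdfpath, "rb"), strict=False) File "/home/tata/.virtualenvs/obu/local/lib/python2.7/site-packages/PyPDF2/pdf.py", line 1084, in __init__ self.read(stream) File "/

Views: 7 | Asked: 2017-11-14 | Votes: 0

Answer accepted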

1 answer

How to detect and eliminate this error: Could not read malformed PDF file

reportlab, pypdf2

I have been struggling with this code and end up with the error shown at the end. from pdfrw import PdfWriter from PyPDF2 import PdfFileWriter, PdfFileReader import io import csv from reportlab.pdfgen import canvas packet = io.BytesIO(b"F4_II.txt") c = canvas.Canvas(packet) packet.seek(0) new_pdf = PdfFileReader(packet) template = PdfFileReader(

Views: 24 | Asked: 2020-05-25 | Votes: 0

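2 answers

How to decrypt an encrypted PDF file from Python

python, pdf, encryption

I have a PDF file and the associated password. I want to convert the encrypted file into a clear (decrypted) version using only Python. I found some Python modules (pyPdf2, PDFMiner) for working with PDF files, but none of them can handle encryption. Has anyone already done this?

Views: 4 | Asked: 2016-08-08 | Votes: 2

Answer accepted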

2 answers

extractText() function in pyPDF2 throws an error

python, pdf, python-3.x, pypdf

I am trying to extract text from a PDF so that I can analyze it, but when I try to extract the text from a page I get the following error. Traceback (most recent call last): File "C:\Program Files (x86)\eclipse\plugins\org.python.pydev_2.7.4.2013051601\pysrc\pydevd_comm.py", line 765, in doIt result = pydevd_vars.evaluateExpression(self.thread_id, self.frame_id, self.expression,

Views: 1 | Asked: 2013-06-02 | Votes: 1

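1 answer

PyPDF2 PdfFileWriter has no attribute stream

python, pdf, pypdf2

I am trying to split a PDF into its pages and save each page as a new PDF. I tried the approach from a previous question without success, and also tried a PyPDF2 splitting example, again without success. Edit: I can see in my files that it successfully writes the first page and then creates the PDF for the second page, but that file is empty. Here is the code I am trying to run: from PyPDF2 import PdfFileWriter, PdfFileReader inputpdf = PdfFileReader(open("my_pdf.pdf", "rb")) for i in range(inputpdf.numPages): output = Pd

Views: 0 | Asked: 2016-10-21 | Votes: 4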

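1 answer

The PyPDF2 module and encrypted PDF files

pypdf2

I am currently using PyPDF2 to process PDF files in Python. When I run a script that loads some PDF files and extracts some keywords from them, it fails with: PdfReadError: File has not been decrypted So, to get around this, I added: if pathObj.isEncrypted: pathObj.decrypt('') However, I am then faced with: NotImplementedError: only algorithm code 1 and 2 are supported Now, I more or less understand what these errors are telling me. What I do not understand is that my PDF is not encrypted. Does anyone

Views: 3 | Asked: 2019-10-21 | Votes: 0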

1 answer

Error when using PyPdf2 PdfFileMerger in Python

python, python-3.x, exception, pypdf2, pypdf

I have been creating a Python program that merges multiple PDF files using PyPdf2. Here is the code: import os from PyPDF2 import PdfFileMerger source_dir = os.getcwd() merger = PdfFileMerger() for item in os.listdir(source_dir): if item.endswith('pdf'): merger.append(item) merger.write('completed_file.pdf') merger.close() When running

Views: 3 | Asked: 2020-11-17 | Votes: 2

Answer accepted
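A minimal sketch of the directory merge described in this question, assuming PyPDF2 2.x or 3.x is installed (where `PdfMerger` is the newer name for `PdfFileMerger`). The helper names `list_pdfs` and `merge_directory` are hypothetical. The sketch leaves the output file out of the inputs, because a script that writes `completed_file.pdf` into the directory it scans will pick that file up on a later run. Whether that is the cause of the truncated error in the question is not known.

```python
import os


def list_pdfs(source_dir, exclude=()):
    """Return sorted paths of the .pdf files in source_dir, skipping names in exclude."""
    skip = {name.lower() for name in exclude}
    return [
        os.path.join(source_dir, name)
        for name in sorted(os.listdir(source_dir))
        if name.lower().endswith(".pdf") and name.lower() not in skip
    ]


def merge_directory(source_dir, output_name="completed_file.pdf"):
    """Merge every PDF in source_dir into output_name (hypothetical helper)."""
    # Imported lazily so list_pdfs stays usable without PyPDF2 installed.
    from PyPDF2 import PdfMerger  # assumption: PyPDF2 2.x or 3.x

    merger = PdfMerger()
    try:
        for path in list_pdfs(source_dir, exclude=[output_name]):
            merger.append(path)
        merger.write(os.path.join(source_dir, output_name))
    finally:
        merger.close()  # release the input files even if a merge step fails
```

Matching on `".pdf"` rather than `"pdf"` also avoids picking up files whose names merely end in those three letters.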

1 answer

AssertionError in Python when saving a PDF file generated with PyPDF2

python, python-3.x, pdf, pypdf2

I want to split the pages of a given PDF into separate PDFs. Below is the code I wrote, but when saving the files with the open() and .write() functions I get this error: AssertionError from PyPDF2 import PdfFileReader, PdfFileWriter pdf = PdfFileReader("input.pdf") # this is the source pdf for page in range(pdf.getNumPages()): pdf_writer = PdfFileWriter() pdf_writer.addPage(p

Views: 16 | Asked: 2021-07-07 | Votes: 0
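A minimal sketch of a one-page-per-file split, assuming PyPDF2 2.x or 3.x, where `PdfReader`, `PdfWriter`, `reader.pages` and `writer.add_page` are the current names for the older `PdfFileReader`, `PdfFileWriter`, `getNumPages()` and `addPage()` used in the question. The helpers `page_filename` and `split_pdf` are hypothetical names. It is not a diagnosis of the AssertionError, whose traceback is truncated above.

```python
from pathlib import Path


def page_filename(source, page_number, out_dir="."):
    """Build an output path such as out_dir/input_page_001.pdf (1-based)."""
    stem = Path(source).stem
    return Path(out_dir) / f"{stem}_page_{page_number:03d}.pdf"


def split_pdf(source, out_dir="."):
    """Write every page of source to its own PDF and return the paths written."""
    from PyPDF2 import PdfReader, PdfWriter  # assumption: PyPDF2 2.x or 3.x

    reader = PdfReader(source)
    written = []
    for number, page in enumerate(reader.pages, start=1):
        writer = PdfWriter()  # a fresh writer for each page
        writer.add_page(page)
        target = page_filename(source, number, out_dir)
        with open(target, "wb") as handle:  # output must be opened in binary mode
            writer.write(handle)
        written.append(target)
    return written
```

Opening each output inside a `with` block ensures the file is flushed and closed before the next page is processed.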

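1 answer

PyPDF2 IndexError: index out of range

python, pypdf2

First of all, I am very new to Python and PyPDF. I am trying to collect all the fields of a PDF into one data set. Eventually I want to gather thousands of PDFs that all share the same baseline structure (a form) and put them into a PDF. On a PDF without a digital certificate/signature I was able to get this code working fine. However, when I run the code on a PDF that has a digital certificate/signature, an error occurs. I do not really need the digital signature/certificate part of the document, so I think the simplest approach would be to skip that PDF field. However, I do not know how to do that, because the PyPDF2 package looks at every field. Code: import os import PyPDF2 as pypdf import pandas as pd directory

Views: 9 | Asked: 2022-08-15 | Votes: 0

Answer accepted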

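1 answer

Unable to find the page count of a PDF using Python 3.x: DependencyError: PyCryptodome is required for AES algorithm

python-3.x, encryption

I am performing data validation on files downloaded from URLs. One of the validation checks involves checking the number of pages in a PDF. I was using the PyPDF2 package and the PdfFileReader module until I ran into a PDF with 256-bit AES encryption that has a permissions password but no open password. I have no access to any passwords, because these files come from manufacturer websites, so my conclusion is that for now I should just check whether the PDF is encrypted and, if so, skip it for the time being. But whether I try to retrieve the page count or check whether the PDF is encrypted, I get the following error: DependencyError: PyCryptodome is required for AES algorithm This error occurs in the if statement on line 6, even though pycryptodome is installed and the AES module is imported,

Views: 9 | Asked: 2022-08-29 | Votes: 0

Answer accepted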

1 answer

pyPDF2 TypeError when trying to run an example from the lib

python, python-3.x, pypdf

I got the pyPDF2 library from here: When trying to run the script "Example 1:" from there, I see: PyPDF2 python versions (2.5 - 3.3) compatibility branch Traceback (most recent call last): File "1.py", line 6, in <module> input1 = PdfFileReader(open("document1.pdf", "rb")) File "C:\Python33\lib\site-packages\PyPDF2

Views: 3 | Asked: 2013-10-04 | Votes: 0

Answer accepted

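1 answer

How to read this PDF format in Python using PyPDF2

python, pdfminer, pypdf2, poppler

I tried to read this PDF with PyPDF2 or Pdfminer, but PyPDF2 says the file has not been decrypted, and pdfminer says it can unpack this pdf. Could someone let me know how to do this in a Python 3 Windows environment? I cannot use Poppler, because I cannot install Poppler on this Windows machine.

Views: 0 | Asked: 2018-04-13 | Votes: 0

Answer accepted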

2 answers

Cannot merge PDFs with py2pdf - ValueError

python, pdf

I am trying to merge PDF files that I downloaded from Google, and I get the following error: ValueError: invalid literal for int() with base 10: b'F-1.4' This does not happen when I merge PDFs that I generated with Keynote. The full error is as follows: Traceback (most recent call last): File "weekly_meeting.py", line 36, in <module> file_path = sort_pdf(path) File "weekly_meeting.py"

Views: 2 | Asked: 2019-01-13 | Votes: 1

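1 answer

Merging PDFs from a Python list of WindowsPath paths

python, pdf

I have an Excel file with data in rows and columns. From each row I take the file names and merge them into a single PDF file (simply put, one PDF per row). Here is an example of such a list: ['1', '112238', '112239', '112240', '112337', '112338']. The first element of the Python list is the name of the output PDF, and the other elements are the names of files that should exist in a directory called Files. from pathlib import Path import pandas as pd from PyPDF2 import PdfF

Views: 4 | Asked: 2022-04-12 | Votes: 0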
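The row layout described in this question (first element is the output name, the rest are input names in a `Files` directory) can be sketched as follows. This is a hedged illustration only: the helper names `plan_merge` and `merge_row` are hypothetical, it assumes the names in the row lack the `.pdf` extension, and it assumes PyPDF2 2.x or 3.x for `PdfMerger`.

```python
from pathlib import Path


def plan_merge(row, files_dir="Files", out_dir="."):
    """Split one row into (output path, existing inputs, missing inputs)."""
    output_name, *input_names = [str(item).strip() for item in row]
    output = Path(out_dir) / f"{output_name}.pdf"
    # Assumption: the row holds bare names, so ".pdf" is appended here.
    candidates = [Path(files_dir) / f"{name}.pdf" for name in input_names]
    existing = [p for p in candidates if p.is_file()]
    missing = [p for p in candidates if not p.is_file()]
    return output, existing, missing


def merge_row(row, files_dir="Files", out_dir="."):
    """Merge the inputs named in row into one PDF (hypothetical helper)."""
    from PyPDF2 import PdfMerger  # assumption: PyPDF2 2.x or 3.x

    output, existing, missing = plan_merge(row, files_dir, out_dir)
    merger = PdfMerger()
    try:
        for path in existing:
            merger.append(str(path))  # pass a plain string rather than a Path object
        merger.write(str(output))
    finally:
        merger.close()
    return output, missing
```

Returning the missing inputs instead of raising lets the caller decide whether a row with absent files should be skipped or reported.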

2 answers

PyPDF watermark returns an error

python-2.7, pypdf

Hi, I am trying to watermark a PDF file using pypdf2, but I get this error and I do not know what went wrong. I get the following error: Traceback (most recent call last): File "test.py", line 13, in <module> page.mergePage(watermark.getPage(0)) File "C:\Python27\site-packages\PyPDF2\pdf.py", line 1594, in mergePage self._mergePage(page2) File "

Views: 2 | Asked: 2013-11-26 | Votes: 0

Answer accepted

1 answer

Reading a table from a PDF using Python

python, pdf, import

I have the following PDF. I have tried and tried and tried again to read the table from the PDF. Below I list everything I have used so far. I tried tabula # import tabula # # Read pdf into DataFrame # df = tabula.read_pdf(r"pdf\10027183.pdf") # doesn't work I tried PyPDF2 # read pdf # import PyPDF2 # # creating an object # pdfFileObj = open(r"pdf\10027183.pdf", 'r

Views: 2 | Asked: 2019-12-14 | Votes: 2

1 answer

Merging PDF files with Python and PyPDF2 throws a TypeError

python, pdf, pypdf2

I am using Python 3.6.5 to merge PDFs together, but I ran into a problem. The code below raises a 'TypeError: 'NumberObject' object is not subscriptable' error. What am I doing wrong? When I comment out the line with merger.append, it prints the file paths correctly. import webbrowser import os from PyPDF2 import PdfFileMerger, PdfFileReader path = 'C:/test/pdfs' merger = PdfFileMerger() for pd

Views: 0 | Asked: 2018-04-06 | Votes: 4

1 answer

Why does "import pyPDF2" not work after installation?

python, pypdf2

This is my code, and I get this error: ModuleNotFoundError: No module named 'pyPDF2'. I have already run pip install pyPDF2. If I try again, it says: C:\Users\nicks\Desktop\Coding Projects\Python\Pdf to Audio>pip install PyPDF2 Requirement already satisfied: PyPDF2 in c:\users\nicks\appdata\local\packages\pythonsoftwarefoundation.python.

Views: 20 | Asked: 2022-03-13 | Votes: 0

2 answers

Complete clone of a PDF file with PyPDF2

python, pdf, pypdf2

I am trying to copy an entire PDF using PyPDF2. The following code copies the content but not the PDF's outline. I run it as python test.py <input pdf> <output dest>. This is the code I have so far. from PyPDF2 import PdfFileWriter, PdfFileReader import sys import os.path def main(argv): if not os.path.isfile(argv[0]) and \ not os.path.isfile(argv[1]): print("

Views: 0 | Asked: 2018-02-04 | Votes: 1

1 answer

NotImplementedError using the PyPDF2 module in Python 3

python, python-3.x, pypdf2, pypdf

I have been creating a program in Python that merges 2 PDF files into a single file. Here is the code: import os from PyPDF2 import PdfFileMerger source_dir = os.getcwd() merger = PdfFileMerger() for item in os.listdir(source_dir): if item.endswith('pdf'): merger.append(item) merger.write('completed_file.pdf') merger.close() When run

Views: 2 | Asked: 2020-11-16 | Votes: 0

1 answer

Getting TypeError: ord() expected string of length 1, but int found

python, python-3.x, pypdf2

The code is from PyPDF2 import PdfFileReader with open('HTTP_Book.pdf','rb') as file: pdf=PdfFileReader(file) pagedd=pdf.getPage(0) print(pagedd.extractText()) This code raises the error shown below: TypeError: ord() expected string of length 1, but int found I searched online and found this, but it did not help much. I know the background of this error, but I am not sure what it

Views: 0 | Asked: 2019-05-05 | Votes: 6

Answer accepted

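1 answer

ImportError: No module named 'PyPDF2'

python, pypdf2

Python newbie here; actually new to programming in general, so please bear with me. On Ubuntu 20.04 (yes, also new to Linux) with Python 3.8.2, I am trying to run a script that uses PyPDF2. I can install it fine with the following command: sudo apt-get install python3-pypdf2 and I can import it from the command line without any error: import PyPDF2 However, when I try to import it from PyCharm, it produces a ModuleNotFoundError: Traceback (most recent call last): File "/home/surista/.confi

Views: 11 | Asked: 2020-05-07 | Votes: 0

Answer accepted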

1 answer

Replacing at least 100 pages of a PDF with pages from another PDF

python, pdf

Here is an example of the code: import PyPDF2 import numpy as np # creating a pdf file object pdfFileObj = open('original.pdf' , 'rb') pdfFileObj_1 = open('tutorial.pdf', 'rb') # creating a pdf reader object pdfReader = PyPDF2.PdfFileReader(pdfFileObj) pdfReader_1 = PyPDF2.PdfFileReader(

Views: 0 | Asked: 2018-09-18 | Votes: 1

Answer accepted

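2 answers

Python & PDF parsing: are there any modern, powerful, well-maintained open-source libraries?

python, pdf, ocr, scraping, parser

I am looking for a powerful, well-maintained and well-documented Python library for PDF parsing (mainly for extracting and parsing data from various types of PDFs with differing or unpredictable structure, including with the help of reliable and powerful OCR). So far I am aware of the following major projects: PDFMiner: https://github.com/euske/pdfminer (last commit 11 days ago) PDFMiner.six: https://github.com/pdfminer/pdfminer.six (last commit 3 days ago; seems to be the most actively maintained project) In my opinion the PDFMiner API is a bit too complicated to use; here is a good example. PyPDF2: https://github.com/

Views: 0 | Asked: 2019-11-14 | Votes: 4

Answer accepted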

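6 answers

How do I decrypt a PDF using PyPDF2?

python, pdf, encryption, pypdf2

I am currently using PyPDF2 as a dependency. I have come across some encrypted files and handle them in the usual way (in the code below): from PyPDF2 import PdfReader reader = PdfReader(pdf_filepath) if reader.is_encrypted: reader.decrypt("") print(len(reader.pages)) My file path looks something like "~/blah/FDJKL492019 21490,LFS.pdf". PDF.decrypt("") returns 1, which means it succeeded. However, when it hits prin

Views: 2 | Asked: 2014-10-07 | Votes: 20

Answer accepted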
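A small sketch of the "check, then decrypt" pattern from this question, written against any object that exposes `is_encrypted` and `decrypt()` the way `PyPDF2.PdfReader` does. The helper name `try_passwords` is hypothetical. It assumes `decrypt()` returns a falsy value (0) for a wrong password, which matches the "returns 1 on success" remark in the question.

```python
def try_passwords(reader, candidates=("",)):
    """Return the first password that unlocks reader, or None if none works.

    reader is any object with is_encrypted and decrypt(), e.g. PyPDF2.PdfReader.
    An unencrypted reader yields "" because nothing needs unlocking.
    """
    if not reader.is_encrypted:
        return ""
    for password in candidates:
        # decrypt() is assumed to return 0 (falsy) when the password is wrong
        if reader.decrypt(password):
            return password
    return None


# Usage sketch (needs PyPDF2 2.x or 3.x; AES-encrypted files also need PyCryptodome):
# from PyPDF2 import PdfReader
# reader = PdfReader("encrypted.pdf")
# if try_passwords(reader, ["", "333"]) is None:
#     raise SystemExit("no candidate password worked")
# print(len(reader.pages))
```

Checking the return value before touching `reader.pages` turns a later "file has not been decrypted" error into an explicit failure at the point where the password was rejected.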

4 answers

PyPDF2 write does not work on some PDF files (Python 3.5.1)

python, python-3.x, pdf, reportlab, pypdf2

First of all, I am using Python 3.5.1 (32-bit). I wrote the following program, which uses PyPDF2 and reportlab to add page numbers to every page of my PDF files: #import modules from os import listdir from PyPDF2 import PdfFileWriter, PdfFileReader import io from reportlab.pdfgen import canvas from reportlab.lib.pagesizes import A4 #initial values of variable declarations PDFlist=[] X

Views: 1 | Asked: 2017-08-31 | Votes: 12

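1 answer

pyPDF2 "Stream has ended unexpectedly"

python, visual-studio-code, error-handling, pypdf2

This is my first Python code. The writer raises an error, which seems to happen at random while looping through the PDFs. try: except: pass will not work, because it would just skip the problem file and produce no output for it. strict=False does not seem to apply to the writer. Error: PdfReadWarning: Multiple definitions in dictionary at byte 0x6eb54 for key /PageMode [generic.py:587] PdfReadWarning: Multiple definitions in dictionary at byte 0x75740 for key

Views: 11 | Asked: 2022-03-31 | Votes: 0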

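1 answer

Problem appending files with PyPDF2

python

I need to write a script that converts images to PDFs and merges them into one. I tried using img2pdf and PyPDF2, but I am getting errors. Could someone take a look and tell me what I am doing wrong? import img2pdf import os from PyPDF2 import PdfFileReader, PdfFileMerger, PdfFileWriter merger = PdfFileMerger() path = input() for root,dir,files in os.walk(path): for eachfile in files: if "pdf"

Views: 2 | Asked: 2015-12-17 | Votes: 1

Answer accepted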

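1 answer

Encrypting many PDFs with Python using PyPDF2

python, pdf, encryption, permissions, pypdf2

I am trying to write a Python program that loops over all files in a folder, selects those with the '.pdf' extension, and encrypts them with restricted permissions. I am using this version of the PyPDF2 library: https://github.com/vchatterji/PyPDF2 (a modification of the original PyPDF2 that also allows setting permissions). I have tested it with a single PDF file and it works fine. I want the original PDF file to be deleted and the encrypted file to keep the same name. Here is my code: import os import PyPDF2 directory = './' for filename in os.listdir(directory)

Views: 17 | Asked: 2019-03-21 | Votes: 1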

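2 answers

PyPDF2: why do I get an IndexError? List index out of range

python-3.x, pypdf2

I am working through Al Sweigart's book "Automate the Boring Stuff" and have run into an IndexError that has me stumped. I am using PyPDF2 to open an encrypted PDF document. I know the book is from 2015, so I went to check whether I had missed anything, and everything seems to be the same, at least as far as I can tell. So I do not know what is going wrong here. This is my code. import PyPDF2 pdfReader = PyPDF2.PdfFileReader(open('encrypted.pdf', 'rb')) pdfReader.isEncrypted True pdfReader.getPage(0) Traceback (most re

Views: 2 | Asked: 2018-06-20 | Votes: 2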

2 answers

Error when merging two PDF files with PyPDF2

python, python-2.7, pypdf2

I have searched for this problem many times but have not found an exact solution, which is why I am asking this question. This is my code for merging two PDF files in Python using PyPDF2: import os from PyPDF2 import PdfFileReader, PdfFileMerger files_dir = "/Users/ajayvictor/" pdf_files = [f for f in os.listdir(files_dir) if f.endswith("pdf")] merger = PdfFileMerger() for filename in pd

Views: 0 | Asked: 2017-04-22 | Votes: 1

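2 answers

How to extract data from a PDF using Python

python, pycharm

I would like to know how to extract data from a PDF using Python in PyCharm. I tried writing code by importing PyPDF2 in PyCharm, but it does not show any result.

Views: 2 | Asked: 2022-02-22 | Votes: -3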

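1 answer

Extracting PDF metadata in Python 3

python, metadata

What is the best module or simple script for getting metadata from a PDF file? Everything seems to be for Python 2.7, or else the module does not work. I need it to work with Python 3.4.2. https://pypi.python.org/pypi/pdfminer/ = Python 2.7 With PyPDF2, using print(input1.getDocumentInfo()), I keep getting the error: raise utils.PdfReadError("file has not been decrypted") PyPDF2.utils.PdfReadError: file has not been de

Views: 0 | Asked: 2015-05-31 | Votes: 1

Answer accepted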

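1 answer

Lambda: looping a function through the elements of a list

python, pandas, function, pdf, lambda

How can I adjust this code so that the function loops through the list models_2? If the function uses models it works; if I change it to models_2, it gives the following error: AttributeError: 'float' object has no attribute 'find' This is my data, from Excel, with all cells formatted as "Text". MOD1 MOD2 MOD3 MOD4 0 File1.pdf File3.pdf File1.pdf File3.pdf 1 File2.pdf NaN File2.pdf File3.pdf 2 File3.pdf NaN

Views: 6 | Asked: 2022-05-25 | Votes: 0

Answer accepted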

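2 answers

'OSError: [Errno 22] Invalid argument' when merging files with PyPDF2

python, pdf, pypdf2, oserror

I just want to merge some PDF files with Python, more specifically with PyPDF2. It should be simple, but for some reason I get an error that I do not understand at all. While looking for a solution I found that other people have this problem too. However, no satisfactory solution has been posted. My code for merging the files: from PyPDF2 import PdfFileMerger def merge(self, work_files, destination_file): pdf_merger = PdfFileMerger() for pdf in work_files: pdf_merger.append(pdf)

Views: 0 | Asked: 2020-05-28 | Votes: 0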

1 answer

Python 2 imp - ValueError: Attempted relative import in non-package

python, python-2.7, import

I am trying to import a library using the imp library. The code looks like this: import imp PyPDF2 = imp.load_source('PyPDF2', '/Python Projects/econquizz/PyPDF2-master/PyPDF2/__init__.py') pdfFileObj = open('./Chapters/Chapter-3_5.pdf', 'rb') When I try to run this code, I get the following error: Traceback (most recent call last): File "brains_3.py"

Views: 0 | Asked: 2019-05-13 | Votes: 1

Answer accepted

1 answer

"import pyPDF2" results in "ModuleNotFoundError"

python, python-3.x, windows, pypdf2, modulenotfounderror

Problem summary: In the Python interpreter I typed import pyPDF2 and got a ModuleNotFoundError, even though I have installed the pyPDF2 module: >>> import pyPDF2 Traceback (most recent call last): File "<stdin>", line 1, in <module> ModuleNotFoundError: No module named 'pyPDF2' What I have tried: I am on Windows 10. I am new to Python. I have installed Python 3.8.3

Views: 468 | Asked: 2020-05-28 | Votes: 0
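The traceback in this question uses the spelling `pyPDF2`, while the package's import name is `PyPDF2`. pip treats package names case-insensitively, but Python import names are case-sensitive, so `pip install pyPDF2` can succeed while `import pyPDF2` fails. The sketch below checks both spellings without raising; `module_available` is a hypothetical helper name, and it says nothing about other possible causes such as pip installing into a different interpreter.

```python
import importlib.util


def module_available(name):
    """Return True if `import name` would find a top-level module."""
    return importlib.util.find_spec(name) is not None


# Compare the correctly cased import name with the spelling from the traceback.
for candidate in ("PyPDF2", "pyPDF2"):
    status = "found" if module_available(candidate) else "not found"
    print(f"{candidate}: {status}")
```

If `PyPDF2` is reported as not found either, running `python -m pip install PyPDF2` with the same interpreter that runs the script is a common next check.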

1 answer

Reading an online PDF file in Python and separating the data into columns - OSError

python

I have a problem getting a web-based PDF file into Python. Below is the code I wrote: import PyPDF2 import pandas as pd from PyPDF2 import PdfReader reader = PdfReader(r"http://www.meteo.gov.lk/images/mergepdf/20221004MERGED.pdf") text = "" for page in reader.pages: text += page.extract_text() + "\n" This gives me an error O

Views: 10 | Asked: 2022-10-05 | Votes: -2
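`PdfReader` takes a local path or a file-like object, so a URL string is presumably treated as a file name here, which would explain an OSError. A minimal sketch of one workaround is to download the bytes first and wrap them in `io.BytesIO`. The helper names `fetch_pdf_stream` and `extract_text_from_url` are hypothetical, the `opener` parameter exists only to make the download step replaceable, and PyPDF2 2.x or 3.x is assumed.

```python
import io
from urllib.request import urlopen


def fetch_pdf_stream(url, opener=urlopen):
    """Download url and return its bytes wrapped in a seekable BytesIO."""
    with opener(url) as response:
        return io.BytesIO(response.read())


def extract_text_from_url(url):
    """Return the text of every page of the PDF at url (hypothetical helper)."""
    from PyPDF2 import PdfReader  # assumption: PyPDF2 2.x or 3.x

    reader = PdfReader(fetch_pdf_stream(url))
    # extract_text() may return an empty result for image-only pages
    return "\n".join(page.extract_text() or "" for page in reader.pages)
```

Because the PDF lives only in memory, nothing is written to disk, which also suits the case of reading many remote files in a loop.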

1 answer

Assertion error when reading a PDF file with Python - PyPDF2

python, pdf, python-3.6, pypdf2

When I try to read a PDF file, I get the following error. Code: from PyPDF2 import PdfFileReader import os os.chdir("Path to dir") pdf_document = 'sample.pdf' pdf = PdfFileReader(pdf_document,'rb') #Error here Error: Traceback (most recent call last): File "/home/krishna/PycharmProjects/sample/sample.py", l

Views: 45 | Asked: 2020-05-21 | Votes: 0

1 answer

Module PyPDF2 has no attribute 'PdfFileReader'

python-3.x

I am working through the book "Automate the Boring Stuff with Python", but when I try to run this simple script I get an error. import PyPDF2 pdfFileObj = open('meetingminutes.pdf', 'rb') pdfReader = PyPDF2.PdfFileReader(pdfFileObj) The full error message is: runfile('D:/Python files/PyPDF2/PyPDF2.py', wdir='D:/Python files/PyPDF2') Traceback (most recent ca

Views: 2 | Asked: 2019-02-21 | Votes: 2

Answer accepted

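1 answer

PyPDF2 gives me an Invalid argument error

pypdf2

I am trying to parse text from a PDF file. While following a how-to tutorial for PyPDF2, I got the following error. I searched but could not find anything in the end. Any help would be appreciated. Traceback (most recent call last): File "D:/text_recognizer/main.py", line 4, in <module> inputStream = PyPDF2.PdfFileReader(input) File "D:\KimKanna's Class\python27\lib\site-packages\PyPDF2\p

Views: 0 | Asked: 2017-09-12 | Votes: 2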

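5 answers

Converting PDF to CSV using Python 3.7 Anaconda

python, anaconda

I am trying to convert the PDF file "January2019" to a CSV file. The original PDF contains tables only on certain pages, and I am trying to extract those tables. I followed a tutorial, but when I enter: import PyPDF2 PDFfilename = "January2019.pdf" pfr = PyPDF2.PdfFileReader(open(January2019, "rb")) the output is ModuleNotFoundError: No module named 'PyPDF2'... PS: I am very new to Python and programming. Any advice would be appreciated!

Views: 0 | Asked: 2019-03-13 | Votes: 0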

1 answer

PyPDF2.utils.PdfReadError: file has not been decrypted

python, python-3.x, pypdf2

I have been learning Python PyPDF2, and this is the code from geeksforgeeks.org # importing required modules import PyPDF2 # creating a pdf file object pdfFileObj = open('English.pdf', 'rb') # creating a pdf reader object pdfReader = PyPDF2.PdfFileReader(pdfFileObj) # printing number of pages in pdf file print(pdfRead

Views: 5 | Asked: 2021-03-09 | Votes: 1

Answer accepted

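1 answer

How do I set up the path to use PyPDF2 with Python 2.7?

python, python-2.7

How do I set up Python's path to use PyPDF2? Regarding PyPDF2, do I need to download it and add it to Python? I am a Python beginner, and I need to learn how to read text from PDF files. (Please help me :)

Views: 1 | Asked: 2016-03-23 | Votes: 1

Answer accepted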

3 answers

PyPDF2: Stream has ended unexpectedly

python, python-3.x, pdf, pypdf, pypdf2

I have a Python script that uses PyPDF2 to reverse the page order of a PDF. from PyPDF2 import PdfFileWriter, PdfFileReader output = PdfFileWriter() rpage = [] name = input("What's the file called?") filename = name.split('.', 1) input1 = PdfFileReader(open(name,'rb'), strict = False) pages = list(range(1,i

Views: 5 | Asked: 2017-03-03 | Votes: 0

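2 answers

PyPDF2 ignores the content and only gets the watermark

python, pypdf2

I have thousands of PDF files like this. I am trying to convert them to plain text using PyPDF2 (code below). But PyPDF2 apparently only "sees" the watermark, not the content itself. What can I do here? import os import PyPDF2 path_to_pdfs = '/path/to/pdf/files/' for filename in os.listdir(path_to_pdfs): if '.pdf' in filename.lower(): with open(path_to_pdfs + filename, mode = 'rb')

Views: 0 | Asked: 2018-06-14 | Votes: 1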

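2 answers

Using a PDF from the web directly in Python?

python, pdf, urllib, pypdf

I am trying to use Python to read .pdf files directly from the web instead of saving them all to my computer. All I need is the text from the .pdf files, and I will be reading a lot of them (~60k), so I would prefer not to have to save them all. I know how to save a .pdf from the internet using urllib and open it with PyPDF2. I would like to skip the step of saving to a file. import urllib, PyPDF2 urllib.urlopen('https://bitcoin.org/bitcoin.pdf') wFile = urllib.urlopen('https://bitcoin.org/bitcoin.pdf') lFile

Views: 0 | Asked: 2014-04-18 | Votes: 2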

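1 answer

How do I get the "page size" metadata of a PDF file using Python?

python, scanning, pypdf2, page-size

I am trying to use the PyPDF2 module in Python 3 but cannot display the "page size" property. I would like to know what the paper size was before it was scanned into the PDF file. Something like this: import PyPDF2 pdf=PdfFileReader("sample.pdf","rb") print(pdf.getNumPages()) But I am looking for a different Python function than, for example, getNumPages(). The following commands print some metadata, but not the page size: pdf_info=pdf.getDocumentInfo() print(pdf_info)

Views: 7 | Asked: 2017-09-15 | Votes: 3

Answer accepted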
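Page size is stored per page (in the page's media box) rather than in the document information dictionary, which is why `getDocumentInfo()` does not show it. The sketch below reads the media box of each page and converts from PDF points (1/72 inch by default) to millimetres. It assumes PyPDF2 2.x or 3.x, where a page exposes `mediabox.width` and `mediabox.height`. The helper names are hypothetical, and page rotation is ignored.

```python
POINTS_PER_INCH = 72.0
MM_PER_INCH = 25.4


def points_to_mm(points):
    """Convert PDF user-space units (1/72 inch by default) to millimetres."""
    return float(points) / POINTS_PER_INCH * MM_PER_INCH


def page_sizes_mm(path):
    """Return a (width_mm, height_mm) tuple for every page of the PDF at path."""
    from PyPDF2 import PdfReader  # assumption: PyPDF2 2.x or 3.x

    reader = PdfReader(path)
    return [
        (points_to_mm(page.mediabox.width), points_to_mm(page.mediabox.height))
        for page in reader.pages
    ]
```

An A4 scan would be expected to come out near 210 x 297 mm, though scanners often add or crop a few millimetres.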