How do I rename a PDF file using text extracted from that PDF?

Stack Overflow user
Asked 2014-07-21 06:58:36
3 answers · 6.1K views · 0 followers · Score: 1

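I am trying to use Python to rename PDF files based on part of each file's contents. Here is the situation.

The PDF files are commercial invoices containing the words "Commercial Invoice" and "Department". I want to rename each file to the commercial invoice value followed by the department, for example "353624 HR".

Here is what I have so far: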

from StringIO import StringIO
import pyPdf
import os

# a function here
def getPDFContent(path):
    content = ""
    num_pages = 10
    p = file(path, "rb")
    pdf = pyPdf.PdfFileReader(p)
    for i in range(0, num_pages):
        content += pdf.getPage(i).extractText() + "\n"
        content = " ".join(content.replace(u"\xa0", " ").strip().split())     
        return content 

# name of the source PDF file
PDF_name = '222'

# picking texts from the PDF file
pdfContent = StringIO(getPDFContent("C:\\" + PDF_name + ".pdf").encode("ascii", "ignore"))
for line in pdfContent:
    aaa = line.find(' Commercial Invoice ')
    CIN = line[aaa + 28: aaa + 38]
    bbb = line.find('Department')
    Dpt = line [bbb+20 : bbb+26]

    final_name = str(CIN + " " + Dpt)
    
print final_name

f = open("C:\\" + PDF_name + ".pdf")
f.close()

os.rename("C:\\" + PDF_name + ".pdf", "C:\\" + final_name + ".pdf")

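It works up to the point of printing the extracted text (`print final_name`), but the last step, renaming the file, fails with "WindowsError: [Error 32] The process cannot access the file because it is being used by another process".

What is going wrong here? Is the file never actually being closed?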


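3 Answers

Stack Overflow user

Accepted answer

Posted 2014-07-21 07:38:12

In def getPDFContent(path), you open the file with p = file(path, "rb") but never close it. Once the content has been copied out, you need to close the file.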

p.close()

Put it after the for loop, but still inside the function (and before the function returns).
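The same idea can also be expressed with a `with` block, which closes the handle automatically even if extraction raises. The following is only a minimal sketch under stated assumptions, not the asker's code: `extract_then_rename`, `read_text`, and `build_name` are hypothetical names introduced here, and `read_text` stands in for whatever PDF text-extraction step is used.

```python
import os


def extract_then_rename(src_path, read_text, build_name):
    """Read text from src_path, close the file, then rename it.

    read_text  -- hypothetical stand-in for the PDF extraction step; it
                  receives the open binary file object and returns a string.
    build_name -- hypothetical callable mapping the extracted text to the
                  new base name (without extension).
    """
    # The with block guarantees the handle is closed before os.rename runs,
    # which is what avoids the "file is being used by another process" error.
    with open(src_path, "rb") as handle:
        text = read_text(handle)

    folder, old_name = os.path.split(src_path)
    extension = os.path.splitext(old_name)[1]
    dst_path = os.path.join(folder, build_name(text) + extension)

    # The file is closed at this point, so the rename is safe on Windows.
    os.rename(src_path, dst_path)
    return dst_path
```

In the question's setting, `read_text` would wrap the page-by-page extraction loop and `build_name` would wrap the search for the invoice value and department. All of the text has to be extracted inside the `with` block, because the reader may load pages lazily from the open file.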

Score: 1

Stack Overflow user

Posted 2014-07-21 07:01:59

Add C:\\ to PDF_name in the last line.

Score: -1

Stack Overflow user

Posted 2022-05-13 14:36:46

This can be done with mouse_event and fixed cursor positions. The code is below:

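' Win32 API declarations the macro relies on (assumed; not in the original post).
' PtrSafe and LongPtr require VBA7 (Office 2010 or later).
Private Declare PtrSafe Function SetCursorPos Lib "user32" (ByVal x As Long, ByVal y As Long) As Long
Private Declare PtrSafe Sub mouse_event Lib "user32" (ByVal dwFlags As Long, ByVal dx As Long, ByVal dy As Long, ByVal cButtons As Long, ByVal dwExtraInfo As LongPtr)
Private Const MOUSEEVENTF_LEFTDOWN As Long = &H2
Private Const MOUSEEVENTF_LEFTUP As Long = &H4

Sub Run_report1()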

'
' Run_report Macro
'
' Keyboard Shortcut: Ctrl+Shift+G
'
Application.Wait Now + TimeValue("0:00:01")
SendKeys "%{Tab}", True
Application.Wait Now + TimeValue("0:00:01")


Dim i As Integer
i = 1
Do Until i > 8
Application.Wait Now + TimeValue("0:00:01")


SetCursorPos 309, 253

mouse_event MOUSEEVENTF_LEFTDOWN, 0, 0, 0, 0
mouse_event MOUSEEVENTF_LEFTUP, 0, 0, 0, 0

Application.Wait Now + TimeValue("0:00:01")

SendKeys "{Enter}", True


Application.Wait Now + TimeValue("0:00:03")

SetCursorPos 794, 771

mouse_event MOUSEEVENTF_LEFTDOWN, 0, 0, 0, 0
mouse_event MOUSEEVENTF_LEFTUP, 0, 0, 0, 0
mouse_event MOUSEEVENTF_LEFTDOWN, 0, 0, 0, 0
mouse_event MOUSEEVENTF_LEFTUP, 0, 0, 0, 0

Application.Wait Now + TimeValue("0:00:01")


SetCursorPos 1068, 728

mouse_event MOUSEEVENTF_LEFTDOWN, 0, 0, 0, 0
mouse_event MOUSEEVENTF_LEFTUP, 0, 0, 0, 0

Application.Wait Now + TimeValue("0:00:01")

SetCursorPos 746, 94

mouse_event MOUSEEVENTF_LEFTDOWN, 0, 0, 0, 0
mouse_event MOUSEEVENTF_LEFTUP, 0, 0, 0, 0

Application.Wait Now + TimeValue("0:00:01")


SendKeys "%{Tab}", True

Application.Wait Now + TimeValue("0:00:01")

SetCursorPos 309, 253

mouse_event MOUSEEVENTF_LEFTDOWN, 0, 0, 0, 0
mouse_event MOUSEEVENTF_LEFTUP, 0, 0, 0, 0

Application.Wait Now + TimeValue("0:00:01")

SendKeys "^V", True
SendKeys "{Enter}", True

Application.Wait Now + TimeValue("0:00:01")

SendKeys "{F5}", True

SendKeys "{PGUP}", True
SendKeys "{PGUP}", True
SendKeys "{PGUP}", True
SendKeys "{PGUP}", True
SendKeys "{PGUP}", True
SendKeys "{PGUP}", True
SendKeys "{PGUP}", True
SendKeys "{PGUP}", True
SendKeys "{PGUP}", True
SendKeys "{PGUP}", True
SendKeys "{PGUP}", True
SendKeys "{PGUP}", True
SendKeys "{PGUP}", True
SendKeys "{PGUP}", True
SendKeys "{PGUP}", True

i = i + 1
Loop

MsgBox "Task Completed"

End Sub
Score: -1
Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud Xiaowei's IT-domain engine.
Original link:

https://stackoverflow.com/questions/24859261
