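I'm new to Python. I've started writing a script with Beautiful Soup that processes HTML files.
Everything is processed correctly, but instead of printing each article I now want to save it in a new folder named nowe.
After processing, I need all the articles to end up in that one folder, or alternatively in a CSV file.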
from bs4 import BeautifulSoup
import glob
import os, os.path
path = '/home/darek/Dokumenty/pliki/'
path_out = '/home/darek/Dokumenty/pliki/nowe'
for filename in glob.glob(os.path.join(path, '*.html',)):
    f = filename
    tresc = open(f)
    soup = BeautifulSoup(tresc, 'html.parser')
    article = soup.find('div',class_='post')
    tagi = soup.find('div', class_='ph_social_share_box ph_social_share_box_bottom')
    fout = open( +filename, "w")
    fout.close()
    print(article)
My error log:
File "/home/darek/Dokumenty/parser.py", line 21, in <module>
fout = open( +filename, "w")
TypeError: bad operand type for unary +: 'str'
This is the version that works when I only print the article:
from bs4 import BeautifulSoup
import glob
import os, os.path
path = '/home/darek/Dokumenty/pliki/'
path_out = '/home/darek/Dokumenty/pliki/nowe'
for filename in glob.glob(os.path.join(path, '*.html',)):
    f = filename
    content = open(f)
    soup = BeautifulSoup(content, 'html.parser')
    article = soup.find('div',class_='post')
    tags = soup.find('div', class_='ph_social_share_box ph_social_share_box_bottom')
    print(article)
Any idea why I can't write the result to a file?
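Answered 2018-08-18 00:33:44
Remove the "+" in this line: fout = open( +filename, "w"). There it is parsed as a unary plus applied to the string filename, which is what raises the TypeError.
"w" means "open the file in write mode". A "+" belongs inside the mode string: "w+" opens the file for both reading and writing. With either mode the file is created if it does not exist and emptied if it does. Plain "w" is enough for writing alone, but with "w+" the line would be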
fout = open(filename, "w+")
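To illustrate the difference between the two modes, here is a minimal sketch that writes to a throwaway file in a temporary directory. The file name demo.txt and the variable names are made up for the illustration and are not part of the question's script.

```python
import os
import tempfile

# Work in a throwaway directory so nothing real is overwritten.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.txt")  # hypothetical file name

    # "w" creates the file (or empties an existing one) and allows writing only.
    with open(demo_path, "w") as fout:
        fout.write("first")

    # "w+" also empties the file, but the same handle can be read back.
    with open(demo_path, "w+") as fout:
        fout.write("second")
        fout.seek(0)              # rewind before reading
        read_back = fout.read()

    # Re-open in the default read mode to see what ended up on disk.
    with open(demo_path) as fin:
        on_disk = fin.read()

print(read_back)
print(on_disk)
```

Because both modes empty the file on open, passing the input path itself to open(..., "w") or open(..., "w+") would overwrite the HTML file that was just parsed.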
Answered 2018-08-18 00:49:55
Change the following code block:
fout = open( +filename, "w")
fout.close()
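to this:
fout = open( filename, "w")  # note: filename is the input path, so this overwrites the source file
fout.write(str(article))  # I assume here that article is what you want to write; write() needs a string, not a Tag
fout.close()
tresc.close()  # You never closed this, so the file handle was leaked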
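Neither answer covers the folder part of the question, so here is a rough sketch of one way to do it. It assumes the goal is one output file per input file, saved in the nowe folder under the same file name. The helper save_article is a hypothetical name, not something from Beautiful Soup or from the original script.

```python
import os


def save_article(article, source_path, out_dir):
    """Write str(article) into out_dir, reusing the source file's name.

    Returns the path written, or None when there is nothing to save
    (soup.find returns None if no matching element exists).
    """
    if article is None:
        return None
    # Create the output folder on first use; do nothing if it already exists.
    os.makedirs(out_dir, exist_ok=True)
    target = os.path.join(out_dir, os.path.basename(source_path))
    # The with-block closes the file even if write() raises.
    with open(target, "w", encoding="utf-8") as fout:
        fout.write(str(article))  # a bs4 Tag has to be converted to str first
    return target
```

Inside the loop from the question, the call would be save_article(article, filename, path_out) in place of the open/close pair. The originals in pliki stay untouched because the target path is built from path_out instead of from filename.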
https://stackoverflow.com/questions/51899594