I'm having trouble with how the scraped content is displayed. My program:
#! /usr/bin/python
import urllib
import re
url = "http://yahoo.com"
pattern = '''<span class="medium item-label".*?>(.*)</span>'''
website = urllib.urlopen(url)
pageContent = website.read()
result = re.findall(pattern, pageContent)
for record in result:
    print record
Output:
Masked teen killed by dad
First look in &#39;Hotel of Doom&#39;
Ex-NFL QB&#39;s sad condition
Reporter ignores warning
Romney&#39;s low bar for debates
So my question is: what should I add to my code so that &#39; is converted into the character it represents?
Posted on 2012-09-29 03:30:06
In Python 2:
In [16]: text = 'Ex-NFL QB&#39;s sad condition'
In [17]: import HTMLParser
In [18]: parser = HTMLParser.HTMLParser()
In [19]: parser.unescape(text)
Out[19]: u"Ex-NFL QB's sad condition"
In Python 3:
import html
# html.unescape is available from Python 3.4 onward; the older
# html.parser.HTMLParser().unescape(text) was removed in Python 3.9.
html.unescape(text)
https://stackoverflow.com/questions/12646177
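To tie the answer back to the question's loop, here is a minimal Python 3 sketch that decodes each regex match before printing it. It assumes Python 3.4 or later. The `page_content` string is a made-up sample standing in for the downloaded page (the `id` attributes are invented), `extract_headlines` is a hypothetical helper name, and the capture group is made non-greedy, which differs slightly from the pattern in the question.

```python
import html
import re

# Hypothetical sample standing in for the result of website.read().
page_content = (
    '<span class="medium item-label" id="a">Ex-NFL QB&#39;s sad condition</span>\n'
    '<span class="medium item-label" id="b">First look in &#39;Hotel of Doom&#39;</span>\n'
)

# Non-greedy capture so each match stops at the first closing tag.
pattern = r'<span class="medium item-label".*?>(.*?)</span>'


def extract_headlines(content):
    """Return the captured span texts with HTML entities decoded."""
    return [html.unescape(raw) for raw in re.findall(pattern, content)]


for headline in extract_headlines(page_content):
    print(headline)
```

Decoding after matching keeps the regex working on the raw markup, so only the captured text is unescaped.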