I have a "problem" with BeautifulSoup, specifically when using it together with the re module. Here is the issue:
import re
from bs4 import BeautifulSoup
string = """
<div id="my_id">
<ul>
<li>something</li>
<li class="color12">something</li>
<li class="color45">something else</li>
</ul>
</div>
"""
soup = BeautifulSoup(string)
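li = soup.find_all('li', {'class': re.compile(r'color(\d+)')})
for ele in li:
    print(ele['class'])  # prints the colorXXXX class value, but I would like to get only the XXXX part

But I only want to extract the digits after "color". Is there any way to do that directly, or am I obliged to use something like the following?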
match = re.search(r'color(\d+)', str(ele['class']))
if match:
    print(match.group(1))

Thanks for your help :)
Posted on 2012-11-15 23:34:29
You have to apply the regular expression again. Just store it in a variable and reuse it:
colorpattern = re.compile(r'color(\d+)')
li = soup.find_all('li', {'class': colorpattern})
for ele in li:
    # bs4 returns the multi-valued class attribute as a list, so join it before searching
    print(colorpattern.search(' '.join(ele['class'])).group(1))

https://stackoverflow.com/questions/13400774
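If an element can carry several classes, or might have no matching class at all, it may be safer to check each class name separately instead of calling .group(1) directly on the search result. The sketch below is one way to do that using only the re module. The helper name color_number and the constant COLOR_PATTERN are made up for illustration; neither is part of BeautifulSoup.

```python
import re

# Compile once so the same pattern can be reused for filtering and for extraction.
COLOR_PATTERN = re.compile(r'color(\d+)')


def color_number(classes):
    """Return the digits after 'color' from a class value, or None if absent.

    `classes` may be a list of class names (what bs4 returns for 'class'),
    a plain space-separated string, or None.
    """
    if isinstance(classes, str):
        classes = classes.split()
    for name in classes or []:
        match = COLOR_PATTERN.search(name)
        if match:
            return match.group(1)
    return None


print(color_number(['color12']))          # 12
print(color_number(['item', 'color45']))  # 45
print(color_number(['plain']))            # None
```

Inside the loop from the answer this would be called as color_number(ele.get('class')), assuming ele is a bs4 Tag. Returning None instead of raising lets the caller decide how to handle elements without a color class.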