I am scraping a web page and it works fine, except for one lookup where the search built with re.compile() returns an empty list []. Here is my code:
dob = soup.find(text = re.compile('Date of Birth')).findNext('td').text
print(dob)
father_name = soup.find(text = re.compile("Father's Name")).findNext('td').text
print(father_name)
mob_no_parent = soup.find(text = re.compile("Mobile Number")).findNext('td').text
print(mob_no_parent)
mob_no_student = soup.findAll(text = re.compile("Mobile Number(Student)"))
print(mob_no_student)
email = soup.find(text = re.compile("E - Mail Address")).findNext('td').text
print(email)
p_address = soup.find(text = re.compile("PermanentAddress")).findNext('td').text
print(p_address)

The code above works for every field except this one:
mob_no_student = soup.findAll(text = re.compile("Mobile Number(Student)"))
print(mob_no_student)

The line above returns [].
Here is my HTML:
<td align="left" width="50%" class="inner_padding_even"> Registration No </td>
<td align="left" width="50%" class="inner_padding_even">CPT0000</td>
</tr>
<tr>
<td align="left" width="50%" class="inner_padding_odd"> Name of Candidate</td>
<td align="left" width="50%" class="inner_padding_odd"><font face=arial size=2>KKKKKKK B.</font></td>
</tr>
<tr>
<td align="left" class="inner_padding_even"> Date of Birth</td>
<td align="left" class="inner_padding_even">16.11.1900</td>
</tr>
<tr>
<td align="left" class="inner_padding_even"> Father's Name</td>
<td align="left" class="inner_padding_even">BBBBBBBB.</td>
</tr>
<tr>
<td align="left" class="inner_padding_even"> Mobile Number</font>(Parent)</td>
<td align="left" class="inner_padding_even">99999999999</td>
</tr>
<tr>
<td align="left" class="inner_padding_odd"> Mobile Number(Student)</td>
<td align="left" class="inner_padding_odd">9999999999</td>
</tr>
<tr>
<td align="left" class="inner_padding_even"> E - Mail Address</td>
<td align="left" class="inner_padding_even">keyansgm@gmail.com</td>
</tr>
<tr>
<td width="50%" align="left" class="inner_padding_even"> Permanent Address</td>
<td width="50%" align="left" class="inner_padding_even">Blah blah</td>
</tr>

What am I missing here?
Posted on 2016-12-17 17:24:31
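In a regex, parentheses must be escaped to match them literally. Unescaped, they delimit a capturing group, so the pattern "Mobile Number(Student)" looks for the text "Mobile NumberStudent" and never matches the label in your HTML.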
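To see the difference in isolation, here is a minimal sketch that uses only the standard `re` module, with no BeautifulSoup involved. The variable names (`label`, `unescaped`, `escaped`, `auto`) are made up for illustration, and `label` is assumed to be the text node exactly as it appears in the posted HTML. It also shows `re.escape`, which is one way to avoid writing the backslashes by hand.

```python
import re

# Assumed: the text node as it appears in the posted HTML (leading space included).
label = " Mobile Number(Student)"

# Unescaped: "(Student)" is read as a capturing group, so the pattern
# describes the text "Mobile NumberStudent", without any parentheses.
unescaped = re.compile("Mobile Number(Student)")
print(unescaped.search(label))                    # no match for the real label
print(unescaped.search("Mobile NumberStudent"))   # matches this text instead

# Escaped by hand; the raw string keeps the backslashes intact.
escaped = re.compile(r"Mobile Number\(Student\)")
print(escaped.search(label) is not None)

# re.escape builds an equivalent literal pattern automatically.
auto = re.compile(re.escape("Mobile Number(Student)"))
print(auto.search(label) is not None)
```

If the label text comes from a variable rather than a hand-written pattern, `re.escape` is probably the safer choice, since it handles any other special characters as well.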
Try this:
mob_no_student = soup.findAll(text = re.compile(r"Mobile Number\(Student\)"))

https://stackoverflow.com/questions/41201086