I am trying to parse the following table, which is UTF-8 encoded (this is only part of it):
<table cellspacing="0" cellpadding="3" border="0" id="ctl00_SPWebPartManager1_g_c001c0d9_0cb8_4b0f_b75a_7cc3b6f7d790_ctl00_HistoryData1_gridHistoryData_DataGrid1" style="width:100%;border-collapse:collapse;">
<tr class="gridHeader" valign="top">
<td class="titleGridRegNoB" align="center" valign="top"><span dir=RTL>שווי שוק (אלפי ש"ח)</span></td><td class="titleGridReg" align="center" valign="top">הון רשום למסחר</td><td class="titleGridReg" align="center" valign="top">שער נמוך</td><td class="titleGridReg" align="center" valign="top">שער גבוה</td><td class="titleGridReg" align="center" valign="top">שער בסיס</td><td class="titleGridReg" align="center" valign="top">שער פתיחה</td><td class="titleGridReg" align="center" valign="top"><span dir="rtl">שער נעילה (באגורות)</span>
</td><td class="titleGridReg" align="center" valign="top">שער נעילה מתואם</td><td class="titleGridReg" align="center" valign="top">תאריך</td>
</tr><tr onmouseover="this.style.backgroundColor='#FDF1D7'" onmouseout="this.style.backgroundColor='#ffffff'">
My code is:
html = br.response().read().decode('utf-8')
soup = BeautifulSoup(html)
table_id = "ctl00_SPWebPartManager1_g_c001c0d9_0cb8_4b0f_b75a_7cc3b6f7d790_ctl00_HistoryData1_gridHistoryData_DataGrid1"
table = soup.findall("table", id=table_id)
I get the following error:
TypeError: 'NoneType' object is not callable
Answered 2013-10-25 15:06:13
Since you are only looking the table up by its id, you can search by the id alone and nothing else, because ids are unique.
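As a minimal sketch of the id-only lookup (the inline HTML and the id `demo_grid` are made up for illustration and are not from the question's page), the snippet below also shows what is probably behind the `TypeError`: BeautifulSoup treats an unknown attribute such as `findall` as a tag name, so `soup.findall` evaluates to `None`, and calling it fails. The method is spelled `find_all`.

```python
from bs4 import BeautifulSoup

# Tiny stand-in document; the id "demo_grid" is hypothetical.
html = '<table id="demo_grid"><tr><td>cell</td></tr></table>'
soup = BeautifulSoup(html, "html.parser")

# Unknown attribute names are treated as tag names, so soup.findall is the
# same as soup.find("findall"), which is None when no <findall> tag exists.
# Calling that None is what raises "'NoneType' object is not callable".
missing = soup.findall

# Looking up by id alone returns the first matching element (or None).
table = soup.find(id="demo_grid")

# find_all is the real method name; it returns a list of every match.
tables = soup.find_all("table", id="demo_grid")
```

Here `find` is enough because only one element is expected; `find_all` is only needed when you want every match.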
Update
Using your paste:
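# encoding=utf-8
from bs4 import BeautifulSoup
import requests

# Fetch the raw paste of the page over HTTP.
data = requests.get('https://dpaste.de/EWCK/raw/')
# Name the parser explicitly so bs4 does not have to guess one.
soup = BeautifulSoup(data.text, 'html.parser')
# Look up the table by its id; prints None if no such table is found.
print(soup.find("table",
                id="ctl00_SPWebPartManager1_g_c001c0d9_0cb8_4b0f_b75a_7cc3b6f7d790_ctl00_HistoryData1_gridHistoryData_DataGrid1"))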
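I used Python's requests library to fetch the data from the web page, the same data you were trying to get. The code above works and finds the table with the correct ID. Try changing your code so that it does not call .decode('utf-8') and just uses br.response().read().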
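A rough sketch of what passing undecoded bytes looks like, assuming the page really is UTF-8: Beautifulsoup accepts a byte string and works out the encoding itself, and `from_encoding` can be supplied as a hint. The bytes here are built inline to stand in for `br.response().read()`, and the id `demo_cell` is hypothetical.

```python
from bs4 import BeautifulSoup

# Stand-in for br.response().read(): raw UTF-8 bytes, not yet decoded.
raw_bytes = '<table><tr><td id="demo_cell">תאריך</td></tr></table>'.encode("utf-8")

# Hand the bytes straight to BeautifulSoup; from_encoding is an optional
# hint and only has an effect when the input is bytes.
soup = BeautifulSoup(raw_bytes, "html.parser", from_encoding="utf-8")

# The parsed tree holds ordinary Unicode strings.
cell_text = soup.find(id="demo_cell").get_text()
```

Decoding first, as in the question, also gives Beautifulsoup valid input, so this is mostly a simplification rather than a required fix.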
https://stackoverflow.com/questions/19593340