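I am trying to scrape the text of span tags, but I get the error "ResultSet object has no attribute 'find_all'". I think I need a second loop inside the last one, but I can't figure out how to do it.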
from bs4 import BeautifulSoup
import requests

urls = []
soups = []
divs = []

for i in range(20):
    i=i+1
    url = "https://www.autoscout24.de/lst?sort=standard&desc=0&cy=D&atype=C&ustate=N%2CU&powertype=ps&ocs_listing=include&adage=1&page=" + str(i)
    urls.append(url)

for url in urls:
    page = requests.get(url)
    soups.append(BeautifulSoup(page.content, "html.parser"))

for soup in range(len(soups)):
    divs.append(soups[soup].find_all("div", class_="VehicleDetailTable_container__mUUbY"))

for div in range(len(divs)):
    mileage = divs[div].find_all("span", class_="VehicleDetailTable_item__koEV4")[0].text
    year = divs[div].find_all("span", class_="VehicleDetailTable_item__koEV4")[1].text
    print(mileage)
    print(year)
    print()
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/tmp/ipykernel_17/4044669827.py in <module>
     21
     22 for div in range(len(divs)):
---> 23     mileage = divs[div].find_all("span", class_="VehicleDetailTable_item__koEV4")[0].text
     24     year = divs[div].find_all("span", class_="VehicleDetailTable_item__koEV4")[1].text
     25     print(mileage)

/opt/conda/lib/python3.7/site-packages/bs4/element.py in __getattr__(self, key)
   2288         """Raise a helpful exception to explain a common code fix."""
   2289         raise AttributeError(
-> 2290             "ResultSet object has no attribute '%s'. You're probably treating a list of elements like a single element. Did you call find_all() when you meant to call find()?" % key
   2291         )

AttributeError: ResultSet object has no attribute 'find_all'. You're probably treating a list of elements like a single element. Did you call find_all() when you meant to call find()?
Posted on 2022-09-16 00:21:04
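Try to avoid all these lists and loops. A single loop over the pages is enough, and it also eliminates the error. Otherwise you would have to iterate over each of your ResultSets as well, which works but is not good practice.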
for i in range(1,21):
    url = f"https://www.autoscout24.de/lst?sort=standard&desc=0&cy=D&atype=C&ustate=N%2CU&powertype=ps&ocs_listing=include&adage=1&page={i}"
    page = requests.get(url)
    soup = BeautifulSoup(page.content, "html.parser")

    for div in soup.find_all("div", class_="VehicleDetailTable_container__mUUbY"):
        mileage = div.find_all("span", class_="VehicleDetailTable_item__koEV4")[0].text
        year = div.find_all("span", class_="VehicleDetailTable_item__koEV4")[1].text
        print(mileage)
        print(year)
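
For comparison, the alternative of iterating over the ResultSets themselves might look roughly like the sketch below. It runs against a small hard-coded HTML snippet so that it works offline; the markup and the class names `container` and `item` are assumptions made for illustration and are not the real page structure.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for two result pages (not the real site HTML).
PAGES = [
    '<div class="container"><span class="item">10,000 km</span>'
    '<span class="item">01/2020</span></div>',
    '<div class="container"><span class="item">25,500 km</span>'
    '<span class="item">06/2018</span></div>',
]

# One ResultSet per page, mirroring the question's `divs` list.
result_sets = [
    BeautifulSoup(html, "html.parser").find_all("div", class_="container")
    for html in PAGES
]

# A ResultSet is a list of tags, so calling find_all() on it raises AttributeError.
try:
    result_sets[0].find_all("span", class_="item")
    raised = False
except AttributeError:
    raised = True

# The "second loop": iterate each ResultSet to reach the individual Tag objects.
rows = []
for result_set in result_sets:
    for div in result_set:
        spans = div.find_all("span", class_="item")
        rows.append((spans[0].text, spans[1].text))

print(raised)
print(rows)
```

This shows why the extra loop is needed when the results are collected first: `find_all` is only available on a soup or a tag, not on the list that `find_all` returns.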
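Be aware that indexing the spans by position can give you text that does not match your expectations when mileage or year is not available for a listing. It would be better to scrape the detail pages instead.

Example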
from bs4 import BeautifulSoup
import requests

# Walk result pages 1 to 20 and parse each one as it is fetched
for i in range(1,21):
    url = f"https://www.autoscout24.de/lst?sort=standard&desc=0&cy=D&atype=C&ustate=N%2CU&powertype=ps&ocs_listing=include&adage=1&page={i}"
    page = requests.get(url)
    soup = BeautifulSoup(page.content, "html.parser")

    # Each div is a single Tag, so find_all() can be called on it directly
    for div in soup.find_all("div", class_="VehicleDetailTable_container__mUUbY"):
        mileage = div.find_all("span", class_="VehicleDetailTable_item__koEV4")[0].text
        year = div.find_all("span", class_="VehicleDetailTable_item__koEV4")[1].text
        print(mileage)
        print(year)
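
The caveat about positional indexing can be softened by checking how many spans were found before indexing into them. The following is a minimal sketch under stated assumptions: the HTML is hard-coded and hypothetical, the class names `container` and `item` are placeholders, and `extract_pair` is a helper name invented here. It only detects that a span is missing; it cannot tell which field is absent.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the second listing lacks its second span.
HTML = """
<div class="container"><span class="item">10,000 km</span><span class="item">01/2020</span></div>
<div class="container"><span class="item">25,500 km</span></div>
"""


def extract_pair(div):
    """Return (mileage, year), or None when fewer than two spans are present."""
    spans = div.find_all("span", class_="item")
    if len(spans) < 2:
        # Positions are unreliable here, so skip the listing rather than guess.
        return None
    return spans[0].get_text(strip=True), spans[1].get_text(strip=True)


soup = BeautifulSoup(HTML, "html.parser")
results = [extract_pair(div) for div in soup.find_all("div", class_="container")]
print(results)
```

Returning `None` for incomplete listings avoids both an `IndexError` and silently mislabelled values; the caller can then decide whether to drop those listings or fetch the detail page.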
https://stackoverflow.com/questions/73741809