
How to scrape multiple pages at once and avoid: 'ResultSet' object has no attribute 'find_all'?

Stack Overflow user
Asked 2022-09-16 08:11:13
Answers 1 · Views 38 · Followers 0 · Votes 0

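I'm trying to scrape the text of span tags, but I get the error "ResultSet object has no attribute 'find_all'". I think I need a second loop inside the last one, but I can't figure out how to write it.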

from bs4 import BeautifulSoup
import requests

urls = []
soups = []
divs = []

for i in range(20):
    i=i+1
    url = "https://www.autoscout24.de/lst?sort=standard&desc=0&cy=D&atype=C&ustate=N%2CU&powertype=ps&ocs_listing=include&adage=1&page=" + str(i)
    urls.append(url)

for url in urls:
    page = requests.get(url)
    soups.append(BeautifulSoup(page.content, "html.parser"))

for soup in range(len(soups)):
    divs.append(soups[soup].find_all("div", class_="VehicleDetailTable_container__mUUbY"))
    
for div in range(len(divs)):
    mileage = divs[div].find_all("span", class_="VehicleDetailTable_item__koEV4")[0].text
    year = divs[div].find_all("span", class_="VehicleDetailTable_item__koEV4")[1].text
    print(mileage)
    print(year)
    print()
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/tmp/ipykernel_17/4044669827.py in <module>
     21 
     22 for div in range(len(divs)):
---> 23     mileage = divs[div].find_all("span", class_="VehicleDetailTable_item__koEV4")[0].text
     24     year = divs[div].find_all("span", class_="VehicleDetailTable_item__koEV4")[1].text
     25     print(mileage)

/opt/conda/lib/python3.7/site-packages/bs4/element.py in __getattr__(self, key)
   2288         """Raise a helpful exception to explain a common code fix."""
   2289         raise AttributeError(
-> 2290             "ResultSet object has no attribute '%s'. You're probably treating a list of elements like a single element. Did you call find_all() when you meant to call find()?" % key
   2291         )

AttributeError: ResultSet object has no attribute 'find_all'. You're probably treating a list of elements like a single element. Did you call find_all() when you meant to call find()?

1 Answer

Stack Overflow user

Accepted answer

Posted 2022-09-16 08:21:04

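Try to avoid all these lists and loops. A single loop is enough, and it eliminates the error: each item in `divs` is a ResultSet (a list of tags), not a single tag, so it has no `find_all`. Otherwise you would have to iterate over your ResultSets as well, which works but is not good practice.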

from bs4 import BeautifulSoup
import requests

for i in range(1, 21):
    url = f"https://www.autoscout24.de/lst?sort=standard&desc=0&cy=D&atype=C&ustate=N%2CU&powertype=ps&ocs_listing=include&adage=1&page={i}"

    page = requests.get(url)
    soup = BeautifulSoup(page.content, "html.parser")

    # Iterating the ResultSet yields single tags, and each tag supports find_all().
    for div in soup.find_all("div", class_="VehicleDetailTable_container__mUUbY"):
        mileage = div.find_all("span", class_="VehicleDetailTable_item__koEV4")[0].text
        year = div.find_all("span", class_="VehicleDetailTable_item__koEV4")[1].text
        print(mileage)
        print(year)
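
The alternative mentioned above, iterating each ResultSet additionally, can be sketched offline to show where the error comes from. This is only an illustration: the inline HTML and the `card` / `item` class names are made-up stand-ins for the real pages, which are not reproduced here.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in markup for two result pages (not the real site HTML).
PAGE_1 = """
<div class="card"><span class="item">10,000 km</span><span class="item">01/2020</span></div>
<div class="card"><span class="item">25,500 km</span><span class="item">06/2018</span></div>
"""
PAGE_2 = """
<div class="card"><span class="item">90,000 km</span><span class="item">03/2015</span></div>
"""

soups = [BeautifulSoup(html, "html.parser") for html in (PAGE_1, PAGE_2)]

# Each find_all() call returns a ResultSet, so divs is a list of ResultSets.
divs = [soup.find_all("div", class_="card") for soup in soups]

# Calling find_all() on a ResultSet reproduces the error from the question.
try:
    divs[0].find_all("span", class_="item")
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

# The fix with nested loops: outer loop over ResultSets, inner loop over tags.
rows = []
for result_set in divs:
    for div in result_set:
        spans = div.find_all("span", class_="item")
        rows.append((spans[0].text, spans[1].text))

print(error_message)
print(rows)
```

The nested loop works, but it keeps every parsed page in memory and needs two extra lists; processing each page as soon as it is fetched, as in the answer's code, avoids both.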

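Be aware that relying on these indexes can return text you do not expect when a listing has no mileage or year available. It would be better to scrape the detail pages instead.

Example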

from bs4 import BeautifulSoup
import requests

for i in range(1, 21):
    url = f"https://www.autoscout24.de/lst?sort=standard&desc=0&cy=D&atype=C&ustate=N%2CU&powertype=ps&ocs_listing=include&adage=1&page={i}"

    page = requests.get(url)
    soup = BeautifulSoup(page.content, "html.parser")

    # One loop per page: each div is a single tag, so find_all() works on it.
    for div in soup.find_all("div", class_="VehicleDetailTable_container__mUUbY"):
        # Look up the spans once instead of once per field.
        spans = div.find_all("span", class_="VehicleDetailTable_item__koEV4")
        mileage = spans[0].text
        year = spans[1].text
        print(mileage)
        print(year)
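
One way to soften the indexing caveat is to check the position before reading it. The sketch below is a minimal illustration under assumptions: `text_at` is a hypothetical helper, and the inline HTML with its `card` / `item` class names is invented sample data, not the real page markup.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup: the second listing has no year span.
SAMPLE = """
<div class="card"><span class="item">10,000 km</span><span class="item">01/2020</span></div>
<div class="card"><span class="item">25,500 km</span></div>
"""


def text_at(tags, index, default=None):
    """Return the stripped text of tags[index], or default if that position is missing."""
    if 0 <= index < len(tags):
        return tags[index].get_text(strip=True)
    return default


soup = BeautifulSoup(SAMPLE, "html.parser")

listings = []
for div in soup.find_all("div", class_="card"):
    spans = div.find_all("span", class_="item")
    listings.append({"mileage": text_at(spans, 0), "year": text_at(spans, 1)})

print(listings)
```

This only prevents an `IndexError` when trailing spans are absent. If an earlier field such as mileage is missing, the remaining values shift forward and land under the wrong key, which is why scraping the detail pages is still the more reliable route.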
Votes 0
Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/73741809
