Q: BeautifulSoup returns 'NoneType' on any soup command

Stack Overflow user

Asked 2021-05-27 06:42:08

Answers: 1 · Views: 225 · Followers: 0 · Votes: 1

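I am using BeautifulSoup to scrape The Wall Street Journal, but it never seems to find the element with id="top-news", which is always present on the home page. I have tried find(), find_all(), and various other methods; the lookup returns None, so any method I then call on the result fails with a 'NoneType' error.

I am trying to extract metadata about the top news articles, mainly the article title and URL. The metadata for each article sits under a class named "WSJTheme--headline--7VCzo7Ay", but I only want the ones inside the "top-news" element.

Here is my code: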

import requests
from bs4 import BeautifulSoup
from shutil import copyfile

URL = 'https://www.wsj.com'
page = requests.get(URL)

soup = BeautifulSoup(page.content, 'html.parser')
results = soup.find(id='top-news')

topArticles = results.find_all('div', class_='WSJTheme--headline--7VCzo7Ay ')
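For context, the symptom described above can be reproduced offline. The snippet below is a minimal sketch: the HTML string is a made-up stand-in for a response that lacks the expected markup, not the actual page WSJ served. It shows that find() returns None when nothing matches, and that the error only surfaces on the next method call.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for a response that does not contain the element.
html = "<html><body><p>Access denied</p></body></html>"

soup = BeautifulSoup(html, "html.parser")
results = soup.find(id="top-news")  # no element has this id
print(results)  # None

try:
    # Calling a method on None is what actually raises the error.
    results.find_all("div")
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)
```

Printing page.status_code or the first few hundred characters of page.text is a quick way to check whether the server sent the expected page at all.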

python

web-scraping

beautifulsoup

1 Answer

Stack Overflow user

Accepted answer

Answered 2021-05-27 07:34:13

Specify a User-Agent header to get the correct response from the server:

import requests
from bs4 import BeautifulSoup


url = "https://www.wsj.com/"

headers = {
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:88.0) Gecko/20100101 Firefox/88.0"
}

soup = BeautifulSoup(requests.get(url, headers=headers).content, "html.parser")

for headline in soup.select('#top-news span[class*="headline"]'):
    print(headline.text)

Prints:

Oil Giants Dealt Defeats as Climate Pressures Intensify
At Least Eight Killed in San Jose Shooting
HSBC to Exit Most U.S. Retail Banking
Amazon-MGM Deal Marks Win for Hedge Funds
Cities Reverse Defunding the Police Amid Rising Crime
Federal Prosecutors Have Asked Banks for Information About Archegos Meltdown
Why a Grand Plan to Vaccinate the World Against Covid Unraveled
Inside the Israel-Hamas Conflict and One of Its Deadliest Hours in Gaza
Eric Carle, ‘The Very Hungry Caterpillar’ Author, Dies at 91
Wynn May Face U.S. Action for Role in China’s Push to Expel Businessman
Walmart to Sell New Line of Gap-Branded Homegoods
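
Since the question also asked for each article's URL, the same scoped-selector idea can be extended to return title and link pairs, with a guard for the case where the container is missing. This is only a sketch under stated assumptions: the function name extract_top_headlines, the sample markup, and the class suffixes in it are hypothetical, and the real WSJ markup may differ or change.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hypothetical markup loosely modelled on the class names in the question.
SAMPLE_HTML = """
<div id="top-news">
  <h3 class="WSJTheme--headline--abc123">
    <a href="/articles/first"><span class="WSJTheme--headlineText--x1">First story</span></a>
  </h3>
  <h3 class="WSJTheme--headline--abc123">
    <a href="https://www.wsj.com/articles/second"><span class="WSJTheme--headlineText--x1">Second story</span></a>
  </h3>
</div>
<div id="other">
  <h3 class="WSJTheme--headline--abc123"><a href="/articles/other">Other story</a></h3>
</div>
"""


def extract_top_headlines(html, base_url="https://www.wsj.com"):
    """Return (title, url) pairs for links inside the #top-news element."""
    soup = BeautifulSoup(html, "html.parser")
    container = soup.find(id="top-news")
    if container is None:
        # The server returned a page without the expected element.
        return []
    articles = []
    # Match on a substring of the class, since the trailing hash may change.
    for link in container.select('[class*="headline"] a[href]'):
        title = link.get_text(strip=True)
        url = urljoin(base_url, link["href"])  # resolve relative links
        articles.append((title, url))
    return articles


for title, url in extract_top_headlines(SAMPLE_HTML):
    print(title, url)
```

With the answer's code, the live page could be passed in as extract_top_headlines(requests.get(url, headers=headers).text); an empty list then signals that the top-news container was not in the response.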

Votes: 1

Original page content provided by Stack Overflow.

Original link: https://stackoverflow.com/questions/67716937
