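What I'm trying to do:
I'm trying to write a script that scrapes product information from a website.
Currently, the program uses a for-loop to scrape each product's price and unique ID.
The for-loop contains two if-statements to stop it from scraping NoneTypes.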
import requests
from bs4 import BeautifulSoup
def average(price_list):
    return sum(price_list) / len(price_list)
# Requests search data from Website
page_link = 'URL'
page_response = requests.get(page_link, timeout=5)  # gets the webpage (search) from Website
page_content = BeautifulSoup(page_response.content, 'html.parser')  # turns the webpage it just retrieved into a BeautifulSoup-object
# Selects the product listings from page content so we can work with these
product_listings = page_content.find_all("div", {"class": "unit flex align-items-stretch result-item"})
prices = []  # Creates a list to add the prices to
uids = [] # Creates a list to store the unique ids
for product in product_listings:
## UIDS 
    if product.find('a')['id'] is not None:
        uid = product.find('a')['id']
        uids.append(uid)
# PRICES
    if product.find('p', class_ = 'result-price man milk word-break') is not None:# assures that the loop only finds the prices
        price = int(product.p.text[:-2].replace(u'\xa0', ''))  # makes a temporary variable where the last two chars of the string (,-) and whitespace are removed, turns into int
        prices.append(price)  # adds the price to the list
The problem:
On the line if product.find('a')['id'] is not None: I get Exception has occurred: TypeError 'NoneType' object is not subscriptable.
However, if I run print(product.find('a')['id']), I get the value I'm looking for, which confuses me a lot. Doesn't that mean the value is not a NoneType?
Also, if product.find('p', class_ = 'result-price man milk word-break') is not None: works flawlessly.
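What I've tried:
I've tried assigning product.find('p', class_ = 'result-price man milk word-break') to a variable and then using it in the for-loop, but that didn't work. I've also done my fair share of Googling, without success. The problem might be that I'm relatively new to programming and don't know exactly what to search for. I did find many answers to seemingly related problems, but none of them worked in my code.
Any help would be greatly appreciated!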
Answered on 2018-12-01 21:49:14
Just add an intermediate step:
res = product.find('a')
if res is not None and res['id'] is not None:
    uids.append(res['id'])
This way, if find() returns None because no matching element was found, you won't try to subscript a NoneType.
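As for why the print seemed to work: most likely the loop prints the id of every listing that has a link and only crashes when it reaches the first listing without one. Here is a minimal sketch of that situation. The markup in SAMPLE_HTML, the result-item class and the collect_uids helper are all made up for illustration and are not taken from the site in the question. The sketch also uses link.get('id') rather than res['id'], on the assumption that some links may lack an id attribute; subscripting a tag with a missing attribute raises KeyError, while .get() returns None.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the second listing has no <a> tag at all,
# and the third has an <a> tag without an id attribute.
SAMPLE_HTML = """
<div class="result-item"><a id="uid-1" href="#">First</a></div>
<div class="result-item"><span>No link here</span></div>
<div class="result-item"><a href="#">Link without id</a></div>
<div class="result-item"><a id="uid-4" href="#">Fourth</a></div>
"""


def collect_uids(html):
    """Return the id of the first <a> in each listing, skipping listings without one."""
    soup = BeautifulSoup(html, "html.parser")
    uids = []
    for product in soup.find_all("div", class_="result-item"):
        link = product.find("a")  # None when the listing contains no <a>
        if link is None:
            continue  # nothing to subscript, so move on to the next listing
        uid = link.get("id")  # .get() returns None instead of raising KeyError
        if uid is not None:
            uids.append(uid)
    return uids


uids = collect_uids(SAMPLE_HTML)
print(uids)  # ['uid-1', 'uid-4']
```

Without the link is None check, the loop would handle the first listing fine and then raise the TypeError on the second one, which matches what the question describes.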
https://stackoverflow.com/questions/53574373