Q: Python 3 web scraping & Beautiful Soup

Stack Overflow user

Asked 2018-06-05 19:13:58

Answers: 2 | Views: 246 | Followers: 0 | Votes: 2

I am just getting started with Python and Beautiful Soup.

I am practising with the following code:

import requests  
r = requests.get('https://www.autobarn.com.au/car-care-touring-accessories/car-care/washes?dir=asc&limit=48&order=name')

from bs4 import BeautifulSoup  
soup = BeautifulSoup(r.text, 'lxml')  
results = soup.find_all('div', class_='product-details')

records = []  
for result in results:  
    SKU = result.find('small',class_='text-muted').text.strip()
    DESC = result.find('strong').text.strip().upper()
    PRICE = result.find ('span',class_='price')
    URL = result.find('a')['href']
    records.append((SKU, DESC, PRICE, URL))

import pandas as pd  
df = pd.DataFrame(records, columns=['SKU','DESCRIPTION', 'RRP', 'URL'])  
df.to_csv('d:\\WEB SCRAPE TEST 4.csv', index=False, encoding='utf-8')

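This works well for getting the information I want.

However, for the price, it drags in all of the surrounding HTML.

For example: <span class="price" id="product-price-1242"><span class="price">$6.99

This seems to be caused by two identical tags, one directly inside the other: <span class="price"><span class="price">

Although I could clean up the price data in the CSV file, is there a way to improve the code so that it retrieves only the price?

Thanks in advance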

python-3.x

web-scraping

beautifulsoup

2 Answers

Stack Overflow user

Accepted answer

Posted 2018-06-06 05:47:19

Try this. It should solve the problem:

import requests
from bs4 import BeautifulSoup  

url = 'https://www.autobarn.com.au/car-care-touring-accessories/car-care/washes?dir=asc&limit=48&order=name'

r = requests.get(url)
soup = BeautifulSoup(r.text, 'lxml')  
for result in soup.find_all('div', class_='product-details'): 
    SKU = result.find('small',class_='text-muted').text.strip()
    DESC = result.find('strong').text.strip().upper() 

    try:
        PRICE = result.select_one("[id^='product-price-'] span").text
    except AttributeError:
        # select_one() returned None: this block has no price element
        PRICE = ""

    URL = result.find('a')['href']
    print(SKU, DESC, PRICE, URL)
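
To see why the original code wrote markup into the CSV, here is a minimal offline sketch. The HTML fragment is an assumption modelled on the snippet quoted in the question, not the real page, and it uses the built-in 'html.parser' so that lxml is not required. It shows that find() returns a Tag (whose string form is the full markup), while reading the text of the matched element gives only the price.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment modelled on the markup quoted in the question;
# the real page may be structured differently.
html = """
<div class="product-details">
  <span class="price" id="product-price-1242"><span class="price">$6.99</span></span>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
result = soup.find("div", class_="product-details")

# find() returns a Tag object; converting it to a string yields its markup,
# which is what ends up in the CSV when the Tag itself is stored.
outer = result.find("span", class_="price")
as_markup = str(outer)

# The selector matches the element whose id starts with "product-price-",
# then descends to the span inside it; .text keeps only the text content.
inner = result.select_one("[id^='product-price-'] span")
price = inner.text.strip()

# Reading the text of the outer Tag gives the same value in this fragment.
price_from_outer = outer.get_text(strip=True)

print(as_markup)
print(price, price_from_outer)
```

In this fragment the nesting is not the cause of the problem by itself: storing the Tag rather than its text is, so either span would give the price once .text or get_text() is used.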

Votes: 0

Stack Overflow user

Posted 2018-06-05 20:46:26

You can do it like this:

PRICE = result.find('span',class_='price').find('span',class_='price').text

You also have to decide how to handle the case where no price is available. Perhaps something like this:

if result.find('span',class_='price') is None:
    PRICE = "N/A"
else:
    PRICE = result.find('span',class_='price').find('span',class_='price').text
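
The same check can be wrapped in a small helper so that each lookup runs only once. This is a sketch under assumptions: extract_price and its default argument are hypothetical names introduced here, and the HTML is an invented fragment modelled on the question, with a second product block that has no price.

```python
from bs4 import BeautifulSoup


def extract_price(result, default="N/A"):
    """Return the price text of one product block, or `default` if none is found."""
    outer = result.find("span", class_="price")
    if outer is None:
        return default
    # Prefer the nested span when it exists; otherwise use the outer one.
    inner = outer.find("span", class_="price")
    target = inner if inner is not None else outer
    return target.get_text(strip=True)


# Hypothetical markup: one block with a price, one without.
html = """
<div class="product-details">
  <span class="price" id="product-price-1242"><span class="price">$6.99</span></span>
</div>
<div class="product-details"><strong>No price here</strong></div>
"""

soup = BeautifulSoup(html, "html.parser")
prices = [extract_price(div) for div in soup.find_all("div", class_="product-details")]
print(prices)  # ['$6.99', 'N/A']
```

In the question's loop, this would be called as PRICE = extract_price(result) before appending to records.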

Votes: 0

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/50698698
