find_elements CSS_Selector Python

Stack Overflow user
Asked on 2022-09-21 00:13:41
1 answer, 77 views, 0 followers, -1 votes

I realize that Selenium has removed some attributes, and my code no longer works with the each_item.find_element(By.CSS_SELECTOR, ...) statements:

import time

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait

# driver, pagenum and data_adi are assumed to be defined elsewhere in the script.
for i in range(pagenum):
    driver.get(f"https://www.adiglobaldistribution.us/search?attributes=dd1a8f50-5ac8-ec11-a837-000d3a006ffb&page={i}&criteria=Tp-link")
    time.sleep(5)
    wait = WebDriverWait(driver, 10)
    search_items = driver.find_elements(By.CSS_SELECTOR, "[class='rd-thumb-details-price']")

    for each_item in search_items:
        item_title = each_item.find_element(By.CSS_SELECTOR, "span[class='rd-item-name-desc']").text
        item_name = each_item.find_element(By.CSS_SELECTOR, "span[class='item-num-mfg']").text[7:]
        item_link = each_item.find_element(By.CSS_SELECTOR, "div[class='item-thumb'] a").get_attribute('href')
        item_price = each_item.find_element(By.CSS_SELECTOR, "div[class='rd-item-price rd-item-price--list']").text[2:].replace("\n", ".")
        item_stock = each_item.find_element(By.CSS_SELECTOR, "div[class='rd-item-price']").text[19:]

        table = {"title": item_title, "name": item_name, "Price": item_price, "Stock": item_stock, "link": item_link}
        data_adi.append(table)

Error:


1 Answer

Stack Overflow user

Answered on 2022-09-21 10:04:05

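You may be approaching this the wrong way. The products are loaded into the page by JavaScript after the page itself has loaded, so you can scrape the API endpoint directly and avoid the complexity (and slowness) of Selenium. Here is a solution based on requests and pandas that scrapes the API endpoint (which can be found in Dev Tools under the Network tab):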

import requests
import pandas as pd

headers = {
    'User-Agent': "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36"
    }

full_df = pd.DataFrame()
for x in range(1, 4):
    r = requests.get(f'https://www.adiglobaldistribution.us/api/v2/adiglobalproducts/?applyPersonalization=true&boostIds=&categoryId=16231864-9ed5-4536-a8b3-ae870078e9f7&expand=pricing,brand&getAllAttributeFacets=false&hasMarketingTileContent=false&includeAttributes=IncludeOnProduct&includeSuggestions=false&makeBrandUrls=false&page={x}&pageSize=36&previouslyPurchasedProducts=false&query=&searchWithin=&sort=Bestseller', headers=headers)
    df = pd.json_normalize(r.json()['products'])
    full_df = pd.concat([full_df, df], axis=0, ignore_index=True)
# print([x for x in full_df.columns])
print(full_df[['basicListPrice', 'modelNumber', 'name', 'properties.countrY_OF_ORIGIN', 'productDetailUrl', 'properties.minimuM_QTY', 'properties.onsalenow']])

Result printed in the terminal:

basicListPrice  modelNumber name    properties.countrY_OF_ORIGIN    productDetailUrl    properties.minimuM_QTY  properties.onsalenow
0   51.99   TL-SG1005P  TP-Link TL-SG1005P 5-Port Gigabit Desktop Switch with 4-Port PoE    China   /Catalog/shop-brands/tp-link/FP-TLSG1005P   1   0
1   81.99   C7  TP-Link ARCHER C7 AC1750 Wireless Dual Band Gigabit Router  China   /Catalog/shop-brands/tp-link/FP-ARCHERC7    1   0
2   18.99   TL-POE150S  TP-Link TL-POE150S PoE Injector, IEEE 802.3af Compliant China   /Catalog/shop-brands/tp-link/FP-TLPOE150S   1   0
3   19.99   TL-WR841N   TP-Link TL-WR841N 300Mbps Wireless N Router China   /Catalog/shop-brands/tp-link/FP-TLWR841N    1   0
4   43.99   TL-PA4010 KIT   TP-Link TL-PA4010KIT AV600 600Mbps Powerline Starter Kit    China   /Catalog/shop-brands/tp-link/FP-TLPA4010K   1   0
... ... ... ... ... ... ... ...
85  76.99   TL-SL1311MP TP-Link TL-SL1311MP 8-Port 10/100mbps + 3-Port Gigabit Desktop Switch With 8-Port PoE+      /Catalog/shop-brands/tp-link/FP-TSL1311MP   1   0
86  35.99   C20 TP-Link ARCHER C20 IEEE 802.11ac Ethernet Wireless Router   China   /Catalog/shop-brands/tp-link/FP-ARCHERC20   1   0
87  29.99   TL-WR802N   TP-Link TL-WR802N 300Mbps Wireless N Nano Router, Pocket Size   China   /Catalog/shop-brands/tp-link/FP-TLWR802N    1   0
88  100.99  EAP610  TP-Link EAP610_V2 AX1800 CEILING MOUNT WI-FI 6" China   /Catalog/shop-brands/tp-link/FP-EAP610V2    1   0
89  130.99  EAP650  TP-Link EAP650 AX3000 Ceiling Mount Wi-Fi 6 Access Point    China   /Catalog/shop-brands/tp-link/FP-EAP650  1   0
90 rows × 7 columns
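
The `productDetailUrl` values in this output are relative paths, whereas the question collected full links in `item_link`. A minimal sketch of turning them into absolute URLs with the standard library, assuming the site root is the host that appears in the URLs above (the two sample paths are copied from the printed result; `SITE_ROOT` is a name introduced here for illustration):

```python
from urllib.parse import urljoin

# Assumption: product pages live under the same host as the API URL in the answer.
SITE_ROOT = "https://www.adiglobaldistribution.us"

# Two relative paths copied from the printed DataFrame.
relative_paths = [
    "/Catalog/shop-brands/tp-link/FP-TLSG1005P",
    "/Catalog/shop-brands/tp-link/FP-ARCHERC7",
]

# urljoin resolves each relative path against the site root.
full_links = [urljoin(SITE_ROOT, path) for path in relative_paths]
for link in full_links:
    print(link)
```

With the DataFrame from the answer, the same call could be applied to the whole column, for example via `full_df['productDetailUrl'].map(lambda p: urljoin(SITE_ROOT, p))`.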

You can inspect the JSON response further and see whether it contains more information you need. Relevant pandas documentation: https://pandas.pydata.org/docs/reference/api/pandas.json_normalize.html
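
To see why some columns have dotted names such as `properties.minimuM_QTY`, it may help to run `pd.json_normalize` on a small nested structure. The sketch below uses made-up records; the field names are illustrative assumptions and not the real API schema:

```python
import pandas as pd

# Made-up records loosely shaped like the 'products' list in the JSON response.
# Field names here are assumptions for illustration only.
products = [
    {"name": "Switch A", "basicListPrice": 51.99,
     "properties": {"minimum_qty": "1", "onsalenow": "0"}},
    {"name": "Router B", "basicListPrice": 81.99,
     "properties": {"minimum_qty": "1"}},
]

# Nested dictionaries are flattened; parent and child keys are joined with "."
df = pd.json_normalize(products)
print(list(df.columns))
print(df[["name", "properties.minimum_qty"]])
```

Keys missing from a record (here `onsalenow` in the second one) show up as NaN, which is worth keeping in mind when selecting columns from the real response.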

For the requests documentation, see https://requests.readthedocs.io/en/latest/
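
The long query string in the answer's URL can also be assembled from a dictionary, which makes the `page` value easier to vary. This is only a sketch using the standard library: the parameter names and values are copied from the URL in the answer, while `BASE_URL` and `build_page_url` are hypothetical names introduced here.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Endpoint path copied from the answer's URL.
BASE_URL = "https://www.adiglobaldistribution.us/api/v2/adiglobalproducts/"


def build_page_url(page, page_size=36):
    """Return the API URL for one page of results (hypothetical helper)."""
    params = {
        "applyPersonalization": "true",
        "boostIds": "",
        "categoryId": "16231864-9ed5-4536-a8b3-ae870078e9f7",
        "expand": "pricing,brand",
        "getAllAttributeFacets": "false",
        "hasMarketingTileContent": "false",
        "includeAttributes": "IncludeOnProduct",
        "includeSuggestions": "false",
        "makeBrandUrls": "false",
        "page": page,
        "pageSize": page_size,
        "previouslyPurchasedProducts": "false",
        "query": "",
        "searchWithin": "",
        "sort": "Bestseller",
    }
    # safe="," leaves the comma in "pricing,brand" unescaped, as in the original URL.
    return BASE_URL + "?" + urlencode(params, safe=",")


url = build_page_url(2)
print(url)
# Read the page number back out of the generated URL.
print(parse_qs(urlsplit(url).query)["page"])
```

When using requests, the same dictionary can be passed as `params=` to `requests.get` instead of building the string by hand.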

Votes: 2
Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud's Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/73793786
