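I am trying to load the videos page of a YouTube channel and parse it to extract information about the latest videos. I want to avoid the API because it has a daily usage quota. The problem is that Selenium does not seem to load the full HTML of the page when I print driver.page_source: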
from bs4 import BeautifulSoup
from selenium.webdriver import Chrome
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.chrome.options import Options
driver = Chrome(executable_path='chromedriver')
driver.get('https://www.youtube.com/c/Oxylabs/videos')
# Agree to youtube cookie popup
try:
    consent = driver.find_element_by_xpath(
        "//*[contains(text(), 'I agree')]")
    consent.click()
except:
    pass
# Parse html
WebDriverWait(driver,100).until(EC.visibility_of_element_located((By.XPATH, '//*[@id="show-more-button"]')))
print(driver.page_source)
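As shown above, I tried using WebDriverWait, which results in a TimeoutException. However, the following XPath (/html, the end of the webpage) does not cause a TimeoutException: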
WebDriverWait(driver,100).until(EC.visibility_of_element_located((By.XPATH, '/html')))
However, this does not load the full HTML either. I also tried time.sleep(100) instead of WebDriverWait, but that also produced incomplete HTML. Any help would be greatly appreciated.
Answered on 2022-01-09 06:01:12
The element you are waiting for is not on the page, which is why the wait times out:
//*[@id="show-more-button"]
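One quick way to confirm this is to ask for all matches instead of waiting for one. The sketch below is only an illustration: `element_exists` is a hypothetical helper name, and it assumes a Selenium driver whose `find_elements` accepts the locator strategy `"xpath"` (the value of `By.XPATH`).

```python
def element_exists(driver, xpath):
    """Return True if at least one element currently matches the XPath.

    `driver` is assumed to be a Selenium WebDriver (or anything that
    exposes a compatible find_elements method).
    """
    # find_elements returns an empty list when nothing matches,
    # so no exception handling is needed here.
    matches = driver.find_elements("xpath", xpath)
    return len(matches) > 0
```

Calling `element_exists(driver, '//*[@id="show-more-button"]')` right after `driver.get(...)` should return False if the element really is absent, in which case no timeout value will make the wait succeed.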
Have you tried scrolling to the bottom of the page, or waiting for a different element?
# 'element' must be a WebElement located beforehand, e.g. with driver.find_element(...)
driver.execute_script("arguments[0].scrollIntoView();", element)
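If there is no convenient element to scroll to, a common alternative is to keep scrolling to the bottom until the page height stops growing, then read `page_source`. The following is a minimal sketch of that idea, not a guaranteed fix: the function names, the pause length, and the round limit are assumptions chosen for illustration, and `fetch_full_page_source` assumes that `webdriver.Chrome()` can locate a ChromeDriver on your machine.

```python
import time


def scroll_until_stable(driver, pause=2.0, max_rounds=20):
    """Scroll to the bottom until the document height stops changing.

    Returns the number of scrolls performed. `pause` gives lazily
    loaded content time to arrive; both defaults are guesses.
    """
    height_script = "return document.documentElement.scrollHeight"
    last_height = driver.execute_script(height_script)
    rounds = 0
    while rounds < max_rounds:
        driver.execute_script(
            "window.scrollTo(0, document.documentElement.scrollHeight);")
        rounds += 1
        time.sleep(pause)
        new_height = driver.execute_script(height_script)
        if new_height == last_height:
            break  # nothing new was loaded by the last scroll
        last_height = new_height
    return rounds


def fetch_full_page_source(url):
    """Hypothetical helper: open the URL, scroll, and return the HTML."""
    from selenium import webdriver  # imported here so the helper above stays dependency-free

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        scroll_until_stable(driver)
        return driver.page_source
    finally:
        driver.quit()
```

For the page in the question this would be called as `fetch_full_page_source('https://www.youtube.com/c/Oxylabs/videos')`. The cookie consent step from the question would still need to run before scrolling if the popup appears.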
https://stackoverflow.com/questions/70638776