Q: Trying to make a for loop break when it hits a specific row in an HTML table

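Stack Overflow user

Asked 2018-07-22 06:08:46

Answers: 1 | Views: 71 | Followers: 0 | Votes: 1

I'm trying to scrape a web table from the website shown in my code. Basically, I only want to scrape today's games, so the for loop should stop when it reaches the part of the HTML table that holds the next day's games. I've tried Googling this but still can't seem to solve it. Any help would be greatly appreciated. My code is posted below.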

import time

from bs4 import BeautifulSoup
from selenium import webdriver

url='http://www.oddsportal.com/baseball/usa/mlb/'
driver = webdriver.Chrome()
driver.get(url)
time.sleep(5)

driver.find_element_by_id('user-header-timezone-expander').click() #get to est timezone
time.sleep(2)
driver.find_element_by_xpath("//*[contains(text(), 'GMT - 4')]").click() #get to est timezone
time.sleep(2)

content=driver.page_source

soup=BeautifulSoup(content,'lxml')


file_dates = []
todays_games=soup.find('table',{'class':'table-main'})
dummy_row=soup.find_all(attrs={'class':'table-dummyrow'})

for games in todays_games.select('td.table-time.datet'): #gets the time of the game
    games= [games.text]
    file_dates.append(games)

    if dummy_row==dummy_row[1]: #I want the for loop to break when it hits the gray header titled "Tomorrow, 22 Jul" on the webpage
        break

print(file_dates)  #still returns every game on the website though

python-3.x

selenium

beautifulsoup

1 Answer

Stack Overflow user

Accepted answer

Posted 2018-07-22 06:34:34

To get only today's game times, you can try the following code:

games = [td.text for td in driver.find_elements_by_xpath('//table[@id="tournamentTable"]//td[contains(@class, "datet") '
                                                     'and following::span[starts-with(., "Tomorrow,")]]')]
print(games)

If you still want to use bs4, try:

file_dates = []
todays_games=soup.find('table',{'class':'table-main'})

for games in todays_games.select('tr')[2:]:
    if games.select('td.datet'):
        file_dates.append(games.select('td.datet')[0].text)
    if games.select('th'):
        break
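
The bs4 loop above can be tried without a browser. The following is only a minimal sketch: `SAMPLE_HTML` is a hypothetical, heavily simplified stand-in for the real table (whose markup may differ), and the helper name `times_before_next_header` is made up for illustration. It assumes, like the `[2:]` slice above, that the first two rows are header rows and that the next header row (`th`) marks the start of tomorrow's games. The header check runs before the time is collected and depends on the current row, so the result of the test changes as the loop advances.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified markup; the real page has more rows and columns.
SAMPLE_HTML = """
<table class="table-main">
  <tr><th>Baseball - USA - MLB</th></tr>
  <tr><th>Today, 21 Jul</th></tr>
  <tr><td class="table-time datet">13:05</td><td>Team A - Team B</td></tr>
  <tr><td class="table-time datet">19:10</td><td>Team C - Team D</td></tr>
  <tr><th>Tomorrow, 22 Jul</th></tr>
  <tr><td class="table-time datet">13:35</td><td>Team E - Team F</td></tr>
</table>
"""


def times_before_next_header(html, skip_rows=2):
    """Collect game times until the first header row after the skipped rows."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", {"class": "table-main"})
    times = []
    for row in table.select("tr")[skip_rows:]:
        # A row containing a <th> is assumed to be the next date header.
        if row.find("th"):
            break
        cell = row.select_one("td.datet")
        if cell:
            times.append(cell.text.strip())
    return times


print(times_before_next_header(SAMPLE_HTML))
```

On the sample markup this collects the two times listed under the "Today" header and stops at the "Tomorrow" row. With the real page, `driver.page_source` would be passed in place of `SAMPLE_HTML`.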

Votes: 1

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud Xiaowei's IT-domain engine.

Original link:

https://stackoverflow.com/questions/51460504
