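I'm using Python 3.4, lxml, and requests to scrape Google Trends.
In this example, I'm trying to retrieve the text "Johnny Depp" located between these span tags. I'm new to the lxml module and XPath syntax, and I can't figure out what I'm doing wrong.
Thanks in advance.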
HTML:
<span class="hottrends-single-trend-title ellipsis-maker-inner">Johnny Depp</span>
Code:
from lxml import html
import requests
page = requests.get('https://trends.google.com/trends/hottrends')
tree = html.fromstring(page.content)
#This will create a list of trends:
trends = tree.xpath('//span[@class="hottrends-single-trend-title ellipsis-maker-inner"]/text()')
print('Trends: ', trends)
Result:

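Posted on 2017-06-24 12:59:06
Using the corresponding RSS feed, you can use lxml's XML parser, or even xml.etree from the standard library, since XML is much simpler than HTML. Given the RSS, you just iterate over the item elements and print each title, for example (although the top result is no longer 'Johnny Depp'):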
>>> from lxml import etree as ET
>>> import requests
>>> page = requests.get('https://trends.google.com/trends/hottrends/atom/feed?pn=p1')
>>> root = ET.fromstring(page.content)
>>> for trend in root.xpath('//item'):
...     print(trend.find('title').text)
...
spinner
Old Navy Flip Flop Sale
You Get Me
Johnny Depp
NHL Draft
GLOW
Despicable Me 3
Blake Griffin
Robert Del Naja
DJ Khaled Grateful
Bella Thorne
Tubelight
interstellar
Camila Cabello
Mexico vs Russia
Frank Mason
Bam Adebayo
TJ Leaf
the house
Dwyane Wade
https://stackoverflow.com/questions/44731421
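The answer notes that xml.etree from the standard library would also work. Below is a minimal sketch of that variant. To stay self-contained it parses an inline sample instead of fetching the live URL. The sample XML and the helper name trend_titles are assumptions for illustration, and the real feed may differ (for instance, it may use namespaces that need handling).

```python
import xml.etree.ElementTree as ET

# Hypothetical sample shaped like an RSS feed; not the real feed's contents.
SAMPLE_FEED = """<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Hot Trends</title>
    <item><title>Johnny Depp</title></item>
    <item><title>NHL Draft</title></item>
  </channel>
</rss>"""


def trend_titles(xml_bytes):
    """Return the text of the <title> child of every <item> element."""
    root = ET.fromstring(xml_bytes)
    # iter('item') walks the whole tree, much like the XPath '//item'.
    return [item.findtext('title') for item in root.iter('item')]


print(trend_titles(SAMPLE_FEED.encode('utf-8')))
```

With a live feed, the bytes in page.content from requests.get(...) could be passed to trend_titles in place of the sample.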