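I'm trying to use XPath to extract the age at death from this HTML. My problem is that the age has no class name or selector of its own to grab it by. Is there a way to get the third div with the stat class, then the third link (href) inside it, and then the age that follows the span inside that link?
This is what I have so far, but it doesn't work: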
=IMPORTXML(B3,"//div[@class='stat'][3]")
HTML:
<div class="is-flex">
<div class="col-xs-6 col-md-12">
<div class="stat box">
<i class="icn icn-birthday"></i>
<h6> Birthday </h6>
<a href="/february26.html"><span class="hidden-sm">February</span><span class="hidden-xs hidden-md hidden-lg">Feb</span> 26</a>, <a href="/year/1932.html">1932</a>
</div>
</div>
<div class="col-xs-6 col-md-12">
<div class="stat box">
<i class="icn icn-birthplace"></i>
<h6>Birthplace</h6>
Kingsland,
<a href="/birthplace/arkansas.html"> AR </a>
</div>
</div>
<div class="col-xs-6 col-md-12">
<div class="stat box">
<i class="icn icn-age"></i>
<h6>Death Date</h6><a href="/deceased/day/september12.html">Sep 12</a>, <a href="/deceased/2003.html">2003</a> (<a href="/deceased/age/71.html"><span class="hidden-sm">age </span>71</a>)
</div>
</div>
<div class="col-xs-6 col-md-12">
<div class="stat box">
<i class="icn icn-horiscope"></i>
<h6>Birth Sign</h6><a href="/astrology/pisces.html">Pisces</a>
</div>
</div>
</div>
Posted on 2019-05-15 03:36:40
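The class attribute in your HTML is 'stat box', so the exact comparison @class='stat' matches nothing. This XPath expression compares against the full value instead: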
//div[@class='stat box'][1]/a[3]/text()
It should output:
71
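To see why the positional expression lands on the age, here is a minimal Python sketch that walks the same path by hand. It assumes a trimmed, simplified copy of the question's markup, and the names FRAGMENT, own_text and third_link_text are made up for illustration. Every stat box is the first div inside its own column wrapper, so [1] keeps all of them, and only the Death Date box has a third link.

```python
import xml.etree.ElementTree as ET

# Trimmed, simplified copy of the question's markup (two of the four boxes).
# It is well-formed, so the standard-library XML parser accepts it.
FRAGMENT = """<div class="is-flex">
<div class="col-xs-6 col-md-12">
<div class="stat box">
<h6> Birthday </h6>
<a href="/february26.html">February 26</a>, <a href="/year/1932.html">1932</a>
</div>
</div>
<div class="col-xs-6 col-md-12">
<div class="stat box">
<h6>Death Date</h6><a href="/deceased/day/september12.html">Sep 12</a>, <a href="/deceased/2003.html">2003</a> (<a href="/deceased/age/71.html"><span class="hidden-sm">age </span>71</a>)
</div>
</div>
</div>"""


def own_text(elem):
    """Return the text nodes that are direct children of elem, like XPath text()."""
    parts = [elem.text] + [child.tail for child in elem]
    return [p for p in parts if p]


def third_link_text(fragment):
    """Hypothetical helper mirroring //div[@class='stat box'][1]/a[3]/text()."""
    root = ET.fromstring(fragment)
    results = []
    # Each stat box is the first div of its wrapper, so [1] filters nothing out.
    for box in root.findall(".//div[@class='stat box']"):
        links = box.findall("a")
        # Only a box with at least three links contributes an a[3].
        if len(links) >= 3:
            results.extend(own_text(links[2]))
    return results


print(third_link_text(FRAGMENT))
```

The text "age " belongs to the span, not to the link itself, which is why text() on the link yields only the number.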
Posted on 2019-05-15 03:38:09
You can select the age value with the following XPath 1.0 expression:
=IMPORTXML(B3,"//div[contains(@class,'stat') and contains(h6,'Death Date')]/a[contains(@href,'/deceased/age')]/span/following-sibling::text()")
It returns 71, possibly with some surrounding whitespace.
To strip the leading and trailing whitespace, use:
=IMPORTXML(B3,"normalize-space(//div[contains(@class,'stat') and contains(h6,'Death Date')]/a[contains(@href,'/deceased/age')]/span/following::text())")
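The content-based selection can also be checked outside Sheets. The following is a rough Python sketch, not the IMPORTXML implementation: it assumes only the Death Date box from the question, with extra whitespace deliberately added around the age to exercise the clean-up step. BOX, normalize_space and death_age are hypothetical names.

```python
import xml.etree.ElementTree as ET

# Death Date box from the question. The whitespace around "71" is an
# assumption added here so the normalization has something to remove.
BOX = """<div class="stat box">
<i class="icn icn-age"></i>
<h6>Death Date</h6><a href="/deceased/day/september12.html">Sep 12</a>, <a href="/deceased/2003.html">2003</a> (<a href="/deceased/age/71.html"><span class="hidden-sm">age </span> 71
</a>)
</div>"""


def normalize_space(text):
    """Approximate XPath normalize-space(): trim the ends, collapse inner runs.

    str.split() treats a slightly wider set of characters as whitespace
    than XPath does; that difference does not matter for this input.
    """
    return " ".join(text.split())


def death_age(fragment):
    """Hypothetical helper: find the age by content instead of by position."""
    root = ET.fromstring(fragment)
    for box in root.iter("div"):
        heading = box.find("h6")
        # contains(@class,'stat') and contains(h6,'Death Date')
        if "stat" not in box.get("class", "") or heading is None:
            continue
        if "Death Date" not in "".join(heading.itertext()):
            continue
        for link in box.findall("a"):
            # a[contains(@href,'/deceased/age')]
            if "/deceased/age" not in link.get("href", ""):
                continue
            span = link.find("span")
            # The text right after the span is stored as the span's tail.
            if span is not None and span.tail:
                return normalize_space(span.tail)
    return None


print(death_age(BOX))
```

Matching on the heading text and the href keeps working if the boxes are reordered, which a purely positional expression would not.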
https://stackoverflow.com/questions/56137262