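I'm trying to extract the text containing abbreviated degree information (e.g. MA, BA) that sits between the bracket nodes in the table below. I can use XPath to extract all of the nodes (including the brackets), iterate over them, and add some logic, but I'm curious whether there is a more efficient way to extract the text between the brackets.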
xpath("//tr/td[@class='infobox-data']//following-sibling::node()")
<table>
<tbody>
<tr>
<th scope="row" class="infobox-label">
<a href="/wiki/Alma_mater" title="Alma mater">Alma mater
</a>
</th>
<td class="infobox-data">
<a href="/wiki/University_of_Alberta" title="University of Alberta">University of Alberta
</a>
" ("
<a href="/wiki/Bachelor_of_Arts" title="Bachelor of Arts">BA
</a>
")"
<br>
<a href="/wiki/Hertford_College,_Oxford" title="Hertford College, Oxford">Hertford College, Oxford
</a>
" ("
<a href="/wiki/Master_of_Arts_(Oxford,_Cambridge,_and_Dublin)" title="Master of Arts (Oxford, Cambridge, and Dublin)">MA
</a>
","
<a href="/wiki/Bachelor_of_Civil_Law" title="Bachelor of Civil Law">BCL
</a>
")"
</td>
</tr>
</tbody>
</table>
Posted on 2022-04-13 20:57:35
//text()[contains(.,')')]/preceding-sibling::a[1]/normalize-space(text())
Source: https://stackoverflow.com/questions/71863688
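The expression in the answer ends with a function call as its last path step, which is XPath 2.0 syntax, so it will not run in XPath 1.0-only engines such as lxml. It also appears to pick only the link nearest each closing bracket, so MA may be missed when several degrees share one pair of brackets. As an alternative, here is a minimal sketch of the "iterate and add some logic" approach in a single pass, using only Python's standard `html.parser`. The names `DegreeExtractor`, `extract_degrees`, and `SAMPLE` are hypothetical, and `SAMPLE` is a condensed copy of the question's markup that assumes the brackets are bare text nodes (without the quotation marks that browser dev tools display).

```python
from html.parser import HTMLParser


class DegreeExtractor(HTMLParser):
    """Collect link text found between '(' and ')' inside td.infobox-data."""

    def __init__(self):
        super().__init__()
        self.in_cell = False  # inside <td class="infobox-data">
        self.in_link = False  # inside an <a> within that cell
        self.depth = 0        # bracket depth, counted from text outside links
        self.degrees = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "td" and "infobox-data" in classes:
            self.in_cell = True
        elif tag == "a" and self.in_cell:
            self.in_link = True

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False
            self.depth = 0
        elif tag == "a":
            self.in_link = False

    def handle_data(self, data):
        if not self.in_cell:
            return
        if self.in_link:
            # Keep link text only while we are inside a bracket pair.
            if self.depth > 0 and data.strip():
                self.degrees.append(data.strip())
        else:
            # Bare text nodes carry the brackets; update the depth.
            self.depth = max(0, self.depth + data.count("(") - data.count(")"))


def extract_degrees(html):
    parser = DegreeExtractor()
    parser.feed(html)
    parser.close()
    return parser.degrees


# Condensed version of the table from the question (an assumption, see above).
SAMPLE = """
<table><tbody><tr>
<th scope="row" class="infobox-label"><a href="/wiki/Alma_mater">Alma mater</a></th>
<td class="infobox-data">
<a href="/wiki/University_of_Alberta">University of Alberta</a> (<a href="/wiki/Bachelor_of_Arts">BA</a>)<br>
<a href="/wiki/Hertford_College,_Oxford">Hertford College, Oxford</a> (<a href="/wiki/Master_of_Arts_(Oxford,_Cambridge,_and_Dublin)">MA</a>, <a href="/wiki/Bachelor_of_Civil_Law">BCL</a>)
</td></tr></tbody></table>
"""

print(extract_degrees(SAMPLE))  # ['BA', 'MA', 'BCL']
```

Brackets are counted only in text outside links, so parentheses inside link text or inside `href`/`title` attributes do not affect the result.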