Is it possible to get a specific table using XPath?

Stack Overflow user
Asked 2018-06-04 18:20:18
Answers: 1 | Views: 97 | Followers: 0 | Votes: 2

Is it possible to use XPath to get a specific table by its table tag, such as the one below? Thanks!

<table border="0" cellspacing="1" cellpadding="1" class="bigborder" width="1050">

*The table tag above can be found at this URL. Thanks again!

import requests
from lxml import html

req = requests.get("url")
raw_html = html.fromstring(req.text)
tr = raw_html.xpath('//*[@id="innerContent"]/table/tbody/tr/td[2]/form/table[4]//tr/text()')
print("".join([x.replace("\t", "").replace("\r\n","").strip() for x in tr]))

Output: None

Expected output:

491 12 20/03/2016 ST / "Turf" / "A         " 1200 G 4 12 052 C S Shum K K
Chiong 9-1/4 92 112 8  11  12 1.11.59 1067 TT/B

456 09 06/03/2016 ST / "Turf" / "C         " 1200 G 4 8 052 C S Shum G 
Lerena 8-3/4 16 126 9  10  9 1.11.42 1078 TT1/B1

1 Answer

Stack Overflow user

Accepted answer

Posted 2018-06-05 07:52:46

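The HTML has no element whose id attribute is "innerContent", yet your XPath relies on it. Instead, select a list of tr elements and extract the string from each one. As written, your expression looks for text nodes directly under the tr tags, and those do not exist; the text sits inside the td cells. You can get all the rows like this: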

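import re

import requests
from lxml import html

URL = "<URL>"  # placeholder: the page URL is not given in the post

req = requests.get(URL)
raw_html = html.fromstring(req.text)
# Rows of the fourth table inside the form.
lines = raw_html.xpath('//form/table[4]/tr')
for line in lines:
    # string() concatenates all descendant text of the row;
    # \s+ collapses tabs, carriage returns, and newlines into single spaces.
    print(re.sub(r'\s+', ' ', line.xpath("string()")).strip())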

Output:

RaceIndex Pla. Date RC/Track/Course Dist. G RaceClass Dr Rtg. Trainer Jockey LBW Win Odds Act.Wt. RunningPosition Finish Time Declar.Horse Wt. Gear VideoReplay
17/18 Season
264 06 13/12/2017 HV / "Turf" / "C " 1650 G 5 2 013 C W Chang W M Lai 6-1/2 15 113 4 4 3 6 1.41.76 1115 TT/B-
181 13 11/11/2017 ST / "Turf" / "A " 1400 GF 5 8 016 C W Chang W M Lai 6 89 113 13 13 12 13 1.23.58 1109 TT/B2
138 09 25/10/2017 HV / "Turf" / "C+3 " 1650 GF 5 10 018 C W Chang W M Lai 3-1/2 37 113 10 11 10 9 1.40.77 1100 TT
068 11 27/09/2017 HV / "Turf" / "C+3 " 1650 GF 5 7 020 C W Chang W M Lai 8 24 113 4 5 5 11 1.41.93 1102 TT
031 04 13/09/2017 HV / "Turf" / "B " 1650 GF 5 7 020 C W Chang W M Lai 1-3/4 45 114 7 8 7 4 1.41.43 1099 TT/B-
013 11 06/09/2017 HV / "Turf" / "A " 1650 G 5 11 020 C W Chang W M Lai 8-1/2 16 113 11 11 11 11 1.42.61 1110 TT/B
16/17 Season
707 02 07/06/2017 HV / "Turf" / "A " 1650 G 5 4 016 C W Chang W M Lai 2 31 113 10 10 8 2 1.40.41 1084 TT/B
589 12 23/04/2017 ST / "AWT" / "-" 1200 GD 5 12 020 C W Chang W M Lai 12 39 113 11 12 12 1.11.33 1082 TT/CP-/B2
481 10 12/03/2017 ST / "AWT" / "-" 1650 GD 5 13 023 C S Shum H T Mo 12-1/2 24 108 14 14 13 10 1.40.66 1068 TT/CP
390 13 05/02/2017 ST / "Turf" / "C " 1400 G 5 8 026 C S Shum H T Mo 5-3/4 14 111 11 11 12 13 1.23.93 1074 TT/CP
344 04 18/01/2017 ST / "AWT" / "-" 1200 GD 5 12 028 C S Shum H T Mo 5-1/2 60 112 11 11 4 1.10.46 1077 TT/B-/CP1
286 10 27/12/2016 ST / "Turf" / "A+3 " 1200 G 5 3 030 C S Shum O Murphy 7 11 123 5 5 10 1.12.18 1075 TT/B
231 09 04/12/2016 ST / "AWT" / "-" 1200 GD 5 6 033 C S Shum Z Purton 4 19 126 6 7 9 1.09.65 1066 TT/B
223 11 30/11/2016 HV / "Turf" / "A " 1200 G 5 8 035 C S Shum N Rawiller 5 10 130 10 11 11 1.11.47 1062 TT/B
213 07 27/11/2016 ST / "Turf" / "C " 1400 G 5 6 035 C S Shum Z Purton 2-1/2 11 128 6 7 6 7 1.23.50 1079 TT/B
103 14 16/10/2016 ST / "Turf" / "C " 1600 GF 5 14 035 C S Shum N Rawiller 25-3/4 11 128 1 2 6 14 1.39.40 1078 TT/B
049 07 25/09/2016 ST / "Turf" / "A " 1400 GF 5 9 036 C S Shum N Rawiller 1-3/4 11 129 7 9 9 7 1.23.23 1077 TT/B
001 05 03/09/2016 ST / "Turf" / "B " 1200 G 5 7 036 C S Shum N Rawiller 3-1/2 70 125 9 9 5 1.09.89 1086 TT/B
15/16 Season
639 12 14/05/2016 ST / "AWT" / "-" 1650 WS 4 8 042 C S Shum H N Wong 20-1/2 99 108 6 5 9 12 1.41.86 1043 TT/B
605 13 01/05/2016 ST / "Turf" / "B " 1400 G 4 10 046 C S Shum M L Yeung 8-1/4 99 117 6 4 5 13 1.24.19 1053 TT/B
527 09 03/04/2016 ST / "Turf" / "B+2 " 1400 G 4 13 049 C S Shum C Schofield 5-3/4 99 122 12 13 12 9 1.23.16 1065 TT/B
491 12 20/03/2016 ST / "Turf" / "A " 1200 G 4 12 052 C S Shum K K Chiong 9-1/4 92 112 8 11 12 1.11.59 1067 TT/B
456 09 06/03/2016 ST / "Turf" / "C " 1200 G 4 8 052 C S Shum G Lerena 8-3/4 16 126 9 10 9 1.11.42 1078 TT1/B1
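
To see why the original expression came back empty, here is a small sketch on made-up markup (the real page is not included in the post, so `SAMPLE` and its cell values are assumptions). It contrasts `tr/text()`, which only returns text nodes that are direct children of a row, with `string()`, which concatenates the text of every descendant of the row.

```python
from lxml import html

# Hypothetical sample markup; the real page was not included in the post.
SAMPLE = """
<form>
  <table class="bigborder">
    <tr>
      <td>491</td>
      <td>12</td>
      <td>20/03/2016</td>
    </tr>
    <tr>
      <td>456</td>
      <td>09</td>
      <td>06/03/2016</td>
    </tr>
  </table>
</form>
"""

doc = html.fromstring(SAMPLE)

# text() on a tr only sees the row's own direct text nodes
# (at most the whitespace between cells), never the td contents.
direct_text = doc.xpath('//table/tr/text()')
print(repr("".join(direct_text).strip()))

# string() on each tr concatenates the text of all descendants;
# split()/join collapses the whitespace between cells.
row_text = [" ".join(tr.xpath("string()").split()) for tr in doc.xpath('//table/tr')]
for row in row_text:
    print(row)
```

Stripping the `text()` result leaves an empty string, which is consistent with the empty output in the question, while the `string()` version yields one readable line per row.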
Votes: 1
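
As for the question in the title, a table can also be selected by its attributes rather than by position. The sketch below is not from the accepted answer: it assumes that `class="bigborder"` identifies the wanted table on the page, and it runs against made-up markup (`SAMPLE`, including the extra `nav` table, is hypothetical).

```python
from lxml import html

# Hypothetical markup: two tables, only one carries the attributes from the question.
SAMPLE = """
<div>
  <table class="nav"><tr><td>menu</td></tr></table>
  <table border="0" cellspacing="1" cellpadding="1" class="bigborder" width="1050">
    <tr><td>491</td><td>12</td></tr>
    <tr><td>456</td><td>09</td></tr>
  </table>
</div>
"""

doc = html.fromstring(SAMPLE)

# Match on the class attribute instead of counting tables by position.
tables = doc.xpath('//table[@class="bigborder"]')

# .//tr also matches rows that are wrapped in a tbody in the source.
rows = [
    [td.text_content().strip() for td in tr.xpath("./td")]
    for tr in tables[0].xpath(".//tr")
]
for row in rows:
    print(row)
```

If several tables on the real page share that class, the predicate would match all of them, and a positional index or a second attribute test would still be needed. For elements with more than one class name, `contains(@class, "bigborder")` is the usual looser test.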
Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud's Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/50678331
