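I have been trying to scrape a Java-based website. However, the table data is split across many separate tables. I need to pull the name and role out of this table.
Unfortunately, I can't share the URL because it is an internal website, but I have attached the HTML code, with the names changed for security reasons.
Here is the website code: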
<div role="presentation" style="position: absolute; left: 0px; top: 0px;">
<div class="dojoxGridRow" role="row" aria-selected="false" idref="admin" style="">
<table class="dojoxGridRowTable" border="0" cellspacing="0" cellpadding="0" role="presentation">
<tbody>
<tr>
<td tabindex="-1" role="gridcell" class="dojoxGridCell" idx="0" style="width:14em;" hilite="1" fieldname="name">admin</td>
<td tabindex="-1" role="gridcell" class="dojoxGridCell" idx="1" style="width:12em;" hilite="1" fieldname="state">
<div class="stateCouple userState" data-state="locked">
<div class="sprite warning13 mar5r"></div>
Locked
</div>
</td>
<td tabindex="-1" role="gridcell" class="dojoxGridCell" idx="2" style="width:14em;" hilite="1" fieldname="role"><span class="hasMapTooltip" tooltipkey="admin">Administrator</span></td>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="3" style="display:none;width:10em;" hilite="1" fieldname="scope">*</td>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="4" style="display:none;width:16em;" hilite="1" fieldname="lastAuthenticatedTime">08/07/13 07:17:49 PM</td>
</tr>
</tbody>
</table>
</div>
<div class="dojoxGridRow dojoxGridRowOdd" role="row" aria-selected="false" idref="user1" style="">
<table class="dojoxGridRowTable" border="0" cellspacing="0" cellpadding="0" role="presentation">
<tbody>
<tr>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="0" style="width:14em;" hilite="1" fieldname="name">user1</td>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="1" style="width:12em;" hilite="1" fieldname="state">
<div class="stateCouple userState" data-state="connected">
<div class="sprite checkMark13 mar5r"></div>
Connected (2)
</div>
</td>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="2" style="width:14em;" hilite="1" fieldname="role"><span class="hasMapTooltip" tooltipkey="admin">Administrator</span></td>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="3" style="display:none;width:10em;" hilite="1" fieldname="scope">*</td>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="4" style="display:none;width:16em;" hilite="1" fieldname="lastAuthenticatedTime">07/04/22 03:37:32 PM</td>
</tr>
</tbody>
</table>
</div>
<div class="dojoxGridRow" role="row" aria-selected="false" idref="user2" style="">
<table class="dojoxGridRowTable" border="0" cellspacing="0" cellpadding="0" role="presentation">
<tbody>
<tr>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="0" style="width:14em;" hilite="1" fieldname="name">user2</td>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="1" style="width:12em;" hilite="1" fieldname="state">
<div class="stateCouple userState" data-state="disconnected">
<div class="sprite disabledMark13 mar5r"></div>
Disconnected
</div>
</td>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="2" style="width:14em;" hilite="1" fieldname="role"><span class="hasMapTooltip" tooltipkey="admin">Administrator</span></td>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="3" style="display:none;width:10em;" hilite="1" fieldname="scope">*</td>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="4" style="display:none;width:16em;" hilite="1" fieldname="lastAuthenticatedTime">06/27/22 09:55:30 AM</td>
</tr>
</tbody>
</table>
</div>
<div class="dojoxGridRow dojoxGridRowOdd" role="row" aria-selected="false" idref="user3" style="">
<table class="dojoxGridRowTable" border="0" cellspacing="0" cellpadding="0" role="presentation">
<tbody>
<tr>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="0" style="width:14em;" hilite="1" fieldname="name">user3</td>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="1" style="width:12em;" hilite="1" fieldname="state">
<div class="stateCouple userState" data-state="disconnected">
<div class="sprite disabledMark13 mar5r"></div>
Disconnected
</div>
</td>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="2" style="width:14em;" hilite="1" fieldname="role"><span class="hasMapTooltip" tooltipkey="admin">Administrator</span></td>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="3" style="display:none;width:10em;" hilite="1" fieldname="scope">*</td>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="4" style="display:none;width:16em;" hilite="1" fieldname="lastAuthenticatedTime">12/18/19 03:56:05 PM</td>
</tr>
</tbody>
</table>
</div>
<div class="dojoxGridRow" role="row" aria-selected="false" idref="user4" style="">
<table class="dojoxGridRowTable" border="0" cellspacing="0" cellpadding="0" role="presentation">
<tbody>
<tr>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="0" style="width:14em;" hilite="1" fieldname="name">user4</td>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="1" style="width:12em;" hilite="1" fieldname="state">
<div class="stateCouple userState" data-state="disconnected">
<div class="sprite disabledMark13 mar5r"></div>
Disconnected
</div>
</td>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="2" style="width:14em;" hilite="1" fieldname="role"><span class="hasMapTooltip" tooltipkey="admin">Administrator</span></td>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="3" style="display:none;width:10em;" hilite="1" fieldname="scope">*</td>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="4" style="display:none;width:16em;" hilite="1" fieldname="lastAuthenticatedTime">05/20/22 05:49:45 PM</td>
</tr>
</tbody>
</table>
</div>
<div class="dojoxGridRow dojoxGridRowOdd" role="row" aria-selected="false" idref="user5" style="">
<table class="dojoxGridRowTable" border="0" cellspacing="0" cellpadding="0" role="presentation">
<tbody>
<tr>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="0" style="width:14em;" hilite="1" fieldname="name">user5</td>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="1" style="width:12em;" hilite="1" fieldname="state">
<div class="stateCouple userState" data-state="disconnected">
<div class="sprite disabledMark13 mar5r"></div>
Disconnected
</div>
</td>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="2" style="width:14em;" hilite="1" fieldname="role"><span class="hasMapTooltip" tooltipkey="admin">Administrator</span></td>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="3" style="display:none;width:10em;" hilite="1" fieldname="scope">*</td>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="4" style="display:none;width:16em;" hilite="1" fieldname="lastAuthenticatedTime">05/19/22 12:16:31 PM</td>
</tr>
</tbody>
</table>
</div>
<div class="dojoxGridRow" role="row" aria-selected="false" idref="user6" style="">
<table class="dojoxGridRowTable" border="0" cellspacing="0" cellpadding="0" role="presentation">
<tbody>
<tr>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="0" style="width:14em;" hilite="1" fieldname="name">user6</td>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="1" style="width:12em;" hilite="1" fieldname="state">
<div class="stateCouple userState" data-state="disconnected">
<div class="sprite disabledMark13 mar5r"></div>
Disconnected
</div>
</td>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="2" style="width:14em;" hilite="1" fieldname="role"><span class="hasMapTooltip" tooltipkey="admin">Administrator</span></td>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="3" style="display:none;width:10em;" hilite="1" fieldname="scope">*</td>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="4" style="display:none;width:16em;" hilite="1" fieldname="lastAuthenticatedTime">07/01/22 03:24:16 PM</td>
</tr>
</tbody>
</table>
</div>
<div class="dojoxGridRow dojoxGridRowOdd" role="row" aria-selected="false" idref="secadmin" style="">
<table class="dojoxGridRowTable" border="0" cellspacing="0" cellpadding="0" role="presentation">
<tbody>
<tr>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="0" style="width:14em;" hilite="1" fieldname="name">secadmin</td>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="1" style="width:12em;" hilite="1" fieldname="state">
<div class="stateCouple userState" data-state="locked">
<div class="sprite warning13 mar5r"></div>
Locked
</div>
</td>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="2" style="width:14em;" hilite="1" fieldname="role"><span class="hasMapTooltip" tooltipkey="security_admin">Security administrator</span></td>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="3" style="display:none;width:10em;" hilite="1" fieldname="scope">*</td>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="4" style="display:none;width:16em;" hilite="1" fieldname="lastAuthenticatedTime">06/07/21 03:28:40 PM</td>
</tr>
</tbody>
</table>
</div>
<div class="dojoxGridRow" role="row" aria-selected="false" idref="tpcuser" style="">
<table class="dojoxGridRowTable" border="0" cellspacing="0" cellpadding="0" role="presentation">
<tbody>
<tr>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="0" style="width:14em;" hilite="1" fieldname="name">tpcuser</td>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="1" style="width:12em;" hilite="1" fieldname="state">
<div class="stateCouple userState" data-state="disconnected">
<div class="sprite disabledMark13 mar5r"></div>
Disconnected
</div>
</td>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="2" style="width:14em;" hilite="1" fieldname="role"><span class="hasMapTooltip" tooltipkey="monitor">Monitor</span></td>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="3" style="display:none;width:10em;" hilite="1" fieldname="scope">PUBLIC</td>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="4" style="display:none;width:16em;" hilite="1" fieldname="lastAuthenticatedTime">03/03/21 06:00:33 PM</td>
</tr>
</tbody>
</table>
</div>
<div class="dojoxGridRow dojoxGridRowOdd" role="row" aria-selected="false" idref="user6" style="">
<table class="dojoxGridRowTable" border="0" cellspacing="0" cellpadding="0" role="presentation">
<tbody>
<tr>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="0" style="width:14em;" hilite="1" fieldname="name">user6</td>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="1" style="width:12em;" hilite="1" fieldname="state">
<div class="stateCouple userState" data-state="disconnected">
<div class="sprite disabledMark13 mar5r"></div>
Disconnected
</div>
</td>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="2" style="width:14em;" hilite="1" fieldname="role"><span class="hasMapTooltip" tooltipkey="admin">Administrator</span></td>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="3" style="display:none;width:10em;" hilite="1" fieldname="scope">*</td>
<td tabindex="-1" role="gridcell" class="dojoxGridCell " idx="4" style="display:none;width:16em;" hilite="1" fieldname="lastAuthenticatedTime">05/10/22 12:39:54 PM</td>
</tr>
</tbody>
</table>
</div>
</div>
Users may be added in the future, so I can't use a fixed number for the loop.
My code is as follows:
#Scrape the user table
Username = driver.find_elements(By.XPATH, value="/html/body/div[7]/div/div[3]/div[1]/div/div/div[2]/div/div[3]/div/div/div/div/div[1]/table/tbody/tr/td[1]")
Role = driver.find_elements(By.XPATH, value="/html/body/div[7]/div/div[3]/div[1]/div/div/div[2]/div/div[3]/div/div/div/div/div[1]/table/tbody/tr/td[3]")
for i in range(len(Username)):
    if Username[i].text in userdict:
        UserGet = userdict.get(Username[i].text)
        print("Company Name|F|DS0000|Company Role|" + UserGet + "|enabled|||" + Role[i].text)
        List.append("Company Name|F|DS0000|Company Role|" + UserGet + "|enabled|||" + Role[i].text)
    else:
        print(Username[i].text + " Not in User Dictionary")
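I hope my question makes sense, and thank you for any help provided.
Edit:
Name 1 = /html/body/div[7]/div/div[3]/div[1]/div/div/div[2]/div/div[3]/div/div/div/div/div[1]/table/tbody/tr/td[1]
Name 2 = /html/body/div[7]/div/div[3]/div[1]/div/div/div[2]/div/div[3]/div/div/div/div/div[2]/table/tbody/tr/td[1]
Somehow I need to loop through that last div index, but with an end point.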
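Posted on 2022-07-04 10:02:46
If you correct the XPath values, it should give the expected result. I tested the expressions using this site, http://xpather.com/, but I'm not very good with dictionaries in Python, so I couldn't properly test the code. If there are any problems, let me know and I'll try to fix them.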
from selenium.webdriver.common.by import By

#Scrape the user table
# driver, userdict and List are assumed to come from the question's surrounding script.
# The relative XPaths match every row, so no fixed row count is needed.
Username = driver.find_elements(By.XPATH, value="//td[@role][1]")
Role = driver.find_elements(By.XPATH, value="//td/span[@tooltipkey]")
for i in range(len(Username)):
    if Username[i].text in userdict:
        UserGet = userdict.get(Username[i].text)
        print("Company Name|F|DS0000|Company Role|" + UserGet + "|enabled|||" + Role[i].text)
        List.append("Company Name|F|DS0000|Company Role|" + UserGet + "|enabled|||" + Role[i].text)
    else:
        print(Username[i].text + " Not in User Dictionary")
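One caveat with collecting names and roles as two separate lists is that they are matched up purely by index, so a row without a role span would shift every later pairing. Below is a minimal sketch of a per-row alternative, assuming each user sits in a div with role="row" and that the cells keep their fieldname attributes, as in the posted markup. It uses the standard library's xml.etree.ElementTree on a trimmed copy of two rows only so that it can run without a browser; SAMPLE and extract_users are illustrative names, not part of the original code.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of two rows from the question's markup
# (attributes that are not used for the lookup were dropped).
SAMPLE = """
<div role="presentation">
  <div class="dojoxGridRow" role="row" idref="admin">
    <table><tbody><tr>
      <td fieldname="name">admin</td>
      <td fieldname="state"><div data-state="locked">Locked</div></td>
      <td fieldname="role"><span tooltipkey="admin">Administrator</span></td>
    </tr></tbody></table>
  </div>
  <div class="dojoxGridRow dojoxGridRowOdd" role="row" idref="tpcuser">
    <table><tbody><tr>
      <td fieldname="name">tpcuser</td>
      <td fieldname="state"><div data-state="disconnected">Disconnected</div></td>
      <td fieldname="role"><span tooltipkey="monitor">Monitor</span></td>
    </tr></tbody></table>
  </div>
</div>
"""


def extract_users(markup):
    """Return (name, role) pairs, one per grid row, however many rows exist."""
    root = ET.fromstring(markup)
    pairs = []
    # Find every row first, then look up both cells relative to that row,
    # so a name can never be paired with another row's role.
    for row in root.findall(".//div[@role='row']"):
        name_cell = row.find(".//td[@fieldname='name']")
        role_cell = row.find(".//td[@fieldname='role']")
        if name_cell is None or role_cell is None:
            continue  # skip incomplete rows instead of misaligning the lists
        name = "".join(name_cell.itertext()).strip()
        role = "".join(role_cell.itertext()).strip()
        pairs.append((name, role))
    return pairs


users = extract_users(SAMPLE)
print(users)  # [('admin', 'Administrator'), ('tpcuser', 'Monitor')]
```

With Selenium, the same idea would presumably be to collect the rows with driver.find_elements(By.XPATH, "//div[@role='row']") and then call row.find_element(By.XPATH, ".//td[@fieldname='name']") and row.find_element(By.XPATH, ".//td[@fieldname='role']") on each row. That variant has not been tested against the real page.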
https://stackoverflow.com/questions/72852682