我想使用VBA从网站中提取数据。我已经设法提取了输出的一部分,但不是完整的。
有人能给我一些关于我的代码的提示吗?
HTML代码为:
<div id="lc">
<div class="mbox0px">
<ul class="menu country-list">
<li class="head">Countries</li>
<li id="lmenu_17">
<a href="/soccer/england/" onclick="return cjs.dic.get('Helper_Menu').lmenu(17,req_url,1);">England</a>
<ul class="submenu hidden" data-ajax="true">
<li class="" data-mt="1_17_nwPDBpVc">
<a href="/soccer/england/premier-league/">Premier League</a>
</li>
<li class="" data-mt="1_198_2DSCa5fE">
<a href="/soccer/england/championship/">Championship</a>
</li>
<li class="" data-mt="1_198_rJSMG3H0">
<a href="/soccer/england/league-one/">League One</a>
</li>
结果应该在每个连续的单元格中,如下所示:
England
Premier League
Championship
League One
我的VBA代码如下:
Sub Get_Link_Name()
Dim URL As String
Dim ie As New InternetExplorer
Dim HTMLdoc As HTMLDocument
Dim NewCollection As Object: Set dictObj = New Collection
Dim tRowID As String
URL = "http://www.flashscore.ro/"
With ie
.navigate URL
.Visible = True
Do Until .readyState = READYSTATE_COMPLETE: DoEvents: Loop
Set HTMLdoc = .document
End With
With HTMLdoc
Set tblSet = .getElementById("lc")
Set mTbl = tblSet.getElementsByTagName("ul")(4)
Set tRows = mTbl.getElementsByTagName("li")
With NewCollection
i = 2
For Each trow In tRows
tRowText = trow.innerText
ActiveSheet.Cells(i, 1) = tRowText
i = i + 1
'MsgBox tRowText
Next trow
End With
End With
Set ie = Nothing
MsgBox "Process Completed"
End Sub
发布于 2018-06-21 16:11:22
我猜你想知道英格兰的实际结果。
网址是以https://www.flashscore.ro/fotbal/
为基础字符串(路径)设置的,后跟国家,例如"anglia"
,代表英格兰,然后是联赛/锦标赛名称。
因此,您可以使用以下命令导航到每个所需的结果表:
循环将其中一个表写出到工作表的一个示例是使用英超联赛表,该表只需通过"https://www.flashscore.ro/fotbal/anglia/
获得。
你可能需要对每个链接进行调整,但它向你展示了如何抓取表,如何循环表的行和列,如何在分数前添加"'“以阻止它被视为日期或数学计算等等。
网站示例:
表中的代码输出(示例):
VBA:
Option Explicit
Public Sub GetInfo()
Dim ie As New InternetExplorer, html As HTMLDocument, sResponse As String, hTable As Object
Application.ScreenUpdating = False
With ie
.Visible = True
.navigate "https://www.flashscore.ro/fotbal/anglia/"
While .Busy Or .readyState < 4: DoEvents: Wend
Do: DoEvents: On Error Resume Next: Set hTable = .document.getElementsByTagName("table")(0).getElementsByTagName("tbody")(0): On Error GoTo 0: Loop While hTable Is Nothing
Dim tRow As Object, tCell As Object, tCurr As Object, r As Long, c As Long
r = 1
With ActiveSheet
For Each tRow In hTable.Rows
For Each tCell In tRow.Cells
c = c + 1
Select Case c
Case 5
.Cells(r, c) = "'" & tCell.innerText
Case Else
.Cells(r, c) = tCell.innerText
End Select
Next tCell
c = 0: r = r + 1
Next tRow
End With
'Quit '<== Remember to quit application
End With
Application.ScreenUpdating = True
End Sub
https://stackoverflow.com/questions/41703931
复制相似问题