首页
学习
活动
专区
工具
TVP
发布
社区首页 >问答首页 >用VBA从HTML页面获取"ul class“值和"li class”值

用VBA从HTML页面获取"ul class“值和"li class”值
EN

Stack Overflow用户
提问于 2017-01-18 01:59:04
回答 1查看 2.3K关注 0票数 1

我想使用VBA从网站中提取数据。我已经设法提取了输出的一部分,但不是完整的。

有人能给我一些关于我的代码的提示吗?

HTML代码为:

代码语言:javascript
复制
<div id="lc">
  <div class="mbox0px">
    <ul class="menu country-list">
      <li class="head">Countries</li>
      <li id="lmenu_17">
        <a href="/soccer/england/" onclick="return cjs.dic.get('Helper_Menu').lmenu(17,req_url,1);">England</a>
        <ul class="submenu hidden" data-ajax="true">
          <li class="" data-mt="1_17_nwPDBpVc">
            <a href="/soccer/england/premier-league/">Premier League</a>
          </li>
          <li class="" data-mt="1_198_2DSCa5fE">
            <a href="/soccer/england/championship/">Championship</a>
          </li>
          <li class="" data-mt="1_198_rJSMG3H0">
            <a href="/soccer/england/league-one/">League One</a>
        </li>

结果应该在每个连续的单元格中,如下所示:

代码语言:javascript
复制
England
Premier League
Championship
League One

我的VBA代码如下:

代码语言:javascript
复制
Sub Get_Link_Name()

Dim URL As String
Dim ie As New InternetExplorer
Dim HTMLdoc As HTMLDocument
Dim NewCollection As Object: Set dictObj = New Collection
Dim tRowID As String

URL = "http://www.flashscore.ro/"

With ie
    .navigate URL
    .Visible = True
    Do Until .readyState = READYSTATE_COMPLETE: DoEvents: Loop
    Set HTMLdoc = .document
End With


With HTMLdoc

    Set tblSet = .getElementById("lc")
    Set mTbl = tblSet.getElementsByTagName("ul")(4)
    Set tRows = mTbl.getElementsByTagName("li")

    With NewCollection

        i = 2
        For Each trow In tRows

                tRowText = trow.innerText
                ActiveSheet.Cells(i, 1) = tRowText
                i = i + 1
                'MsgBox tRowText

        Next trow
    End With
End With

Set ie = Nothing
MsgBox "Process Completed"

End Sub
EN

回答 1

Stack Overflow用户

发布于 2018-06-21 16:11:22

我猜你想知道英格兰的实际结果。

网址是以https://www.flashscore.ro/fotbal/为基础字符串(路径)设置的,后跟国家,例如"anglia",代表英格兰,然后是联赛/锦标赛名称。

因此,您可以使用以下命令导航到每个所需的结果表:

  1. https://www.flashscore.ro/fotbal/anglia/premier-league/
  2. https://www.flashscore.ro/fotbal/anglia/championship/
  3. https://www.flashscore.ro/fotbal/anglia/league-one/
  4. https://www.flashscore.ro/fotbal/anglia/league-two/

循环将其中一个表写出到工作表的一个示例是使用英超联赛表,该表只需通过"https://www.flashscore.ro/fotbal/anglia/获得。

你可能需要对每个链接进行调整,但它向你展示了如何抓取表,如何循环表的行和列,如何在分数前添加"'“以阻止它被视为日期或数学计算等等。

网站示例:

表中的代码输出(示例):

VBA:

代码语言:javascript
复制
Option Explicit

Public Sub GetInfo()
    Dim ie As New InternetExplorer, html As HTMLDocument, sResponse As String, hTable As Object
    Application.ScreenUpdating = False
    With ie
        .Visible = True
        .navigate "https://www.flashscore.ro/fotbal/anglia/"

        While .Busy Or .readyState < 4: DoEvents: Wend

        Do: DoEvents: On Error Resume Next: Set hTable = .document.getElementsByTagName("table")(0).getElementsByTagName("tbody")(0): On Error GoTo 0: Loop While hTable Is Nothing

        Dim tRow As Object, tCell As Object, tCurr As Object, r As Long, c As Long
        r = 1
        With ActiveSheet
            For Each tRow In hTable.Rows
                For Each tCell In tRow.Cells
                    c = c + 1
                    Select Case c
                    Case 5
                        .Cells(r, c) = "'" & tCell.innerText
                    Case Else
                        .Cells(r, c) = tCell.innerText
                    End Select
                Next tCell
                c = 0: r = r + 1
            Next tRow
        End With
        'Quit '<== Remember to quit application
    End With
    Application.ScreenUpdating = True
End Sub
票数 1
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/41703931

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档