Excel VBA scraping CSS siblings

Stack Overflow user
Asked 2020-10-02 05:06:43
Answers: 1, Views: 105, Followers: 0, Votes: 0

I'm trying to scrape data from a set of web pages like this one: https://www.cookcountyassessor.com/pin/14333230200000/print

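Most of the data appears to sit in elements that share one CSS class, "detail-row--detail", which occurs on many sibling elements (the data labels are in "detail-row--label"). So the first data item is in detail-row--detail:eq(0), the second in detail-row--detail:eq(1), and so on. My VBA picks up the first detail-row--detail but none of the subsequent ones.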

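Below is a simplified snippet of my code. The cell TargetURL contains the URL above. The range CSSRange contains 3 values: "print-pint", "address" and "detail-row--detail" (without the quotes). The MsgBox (for testing only) correctly returns the values for the first 2 CSSRange items, which have no multiple siblings. For the third CSS item, which has 31 siblings, the For Each loop runs the correct number of times but returns the value of the first sibling every time. Any suggestions on how to get the values of the subsequent siblings?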

Sub SnippetForStackOverflow()
'Be sure to load Tools > References "Microsoft Internet Controls" & "Microsoft HTML Object Library"

Dim ShtSource As Worksheet
Dim CSSRange As Range
Dim TargetURL As Range
Dim rng As Range
Dim n As Integer
Dim webpage As HTMLDocument
Dim element As IHTMLElement
Dim Output As String
Dim ie As InternetExplorer

'Get things ready
    Set ShtSource = Sheets("PINforVBA")
    Set TargetURL = ShtSource.Range("$B$2")
    Set CSSRange = ShtSource.Range("$B$5:$B$7")

'Open IE in memory, go to site
    Set ie = New InternetExplorer
    ie.Visible = True 'SET AS FALSE UNLESS DEBUGGING
    ie.navigate (TargetURL.Value)
    Do While ie.readyState <> READYSTATE_COMPLETE
        Application.StatusBar = "Loading Web page …"
        DoEvents
    Loop
    Set webpage = ie.document

'Scrape desired elements
    For Each rng In CSSRange
        For Each element In webpage.getElementsByClassName(rng.Value)
            n = n + 1
            Output = webpage.getElementsByClassName(rng.Value)(0).innerText
            MsgBox (n & ": " & Output)
        Next
        n = 0
    Next

'Wrap it up
    ie.Quit
    Set ie = Nothing

End Sub

1 Answer

Stack Overflow user

Accepted answer

Posted 2020-10-03 05:58:25

I have commented what I changed in the code, including resetting the status bar.

Sub SnippetForStackOverflow()
'Be sure to load Tools > References "Microsoft Internet Controls" & "Microsoft HTML Object Library"

Dim ShtSource As Worksheet
Dim CSSRange As Range
Dim TargetURL As Range
Dim rng As Range
Dim n As Integer
Dim webpage As HTMLDocument
Dim element As IHTMLElement
Dim Output As String
Dim ie As InternetExplorer

  'Get things ready
  Set ShtSource = Sheets("PINforVBA")
  Set TargetURL = ShtSource.Range("$B$2")
  Set CSSRange = ShtSource.Range("$B$5:$B$7")
  
  'Open IE in memory, go to site
  Set ie = New InternetExplorer
  ie.Visible = True 'SET AS FALSE UNLESS DEBUGGING
  ie.navigate (TargetURL.Value)
  Do While ie.readyState <> READYSTATE_COMPLETE
    Application.StatusBar = "Loading Web page …"
    DoEvents
  Loop
  Set webpage = ie.document
  
  'Scrape desired elements
  For Each rng In CSSRange
    For Each element In webpage.getElementsByClassName(rng.Value)
      n = n + 1
      
      'The original line below queried the whole document again and always
      'took item (0), which is why the first element was returned every time
      'Output = webpage.getElementsByClassName(rng.Value)(0).innerText
      
      'Reading from the loop variable returns the current element instead
      Output = element.innerText
      MsgBox (n & ": " & Output)
    Next
    n = 0
  Next
  
  'Wrap it up
  ie.Quit
  Set ie = Nothing
  
  'Reset status bar
  Application.StatusBar = False
End Sub
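
If you want the positional access the question describes with :eq(0), :eq(1), and so on, the collection can also be indexed directly. The following is only a sketch, not part of the tested code above: the helper name ListDetailRows is hypothetical, and it assumes the page lists the "detail-row--label" and "detail-row--detail" elements in the same order, which I have not verified for every page.

```vba
'Hypothetical helper, not part of the original answer.
'Requires Tools > References "Microsoft HTML Object Library"
Sub ListDetailRows(ByVal webpage As HTMLDocument)

Dim labels As IHTMLElementCollection
Dim details As IHTMLElementCollection
Dim i As Long

  'Fetch each collection once instead of once per loop pass
  Set labels = webpage.getElementsByClassName("detail-row--label")
  Set details = webpage.getElementsByClassName("detail-row--detail")

  'Collections are zero-based: item 0 is the first sibling, item 1 the second
  For i = 0 To details.Length - 1
    If i < labels.Length Then
      'ASSUMPTION: label i belongs to detail i (same order on the page)
      Debug.Print i & ": " & labels.Item(i).innerText & " = " & details.Item(i).innerText
    Else
      Debug.Print i & ": " & details.Item(i).innerText
    End If
  Next i

End Sub
```

It could be called as ListDetailRows webpage right after Set webpage = ie.document. Output goes to the Immediate window (Ctrl+G) rather than a MsgBox, so 31 items do not need 31 clicks.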
Votes: 0
Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link: https://stackoverflow.com/questions/64163199
