VBA - Scraping an HTML table that has no id

Stack Overflow user
Asked 2018-07-20 00:56:01
1 answer, 1.3K views, 0 followers, 1 vote

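I am trying to get data from an HTML table with VBA. The table appears after selecting a value from a list box, filling in a text box, and clicking a button. However, the site's URL does not change.

My program does fill in the box, select the list box value, and click the search button, but after that I cannot get the data from the table.

I need the values of the cells of the table at the end of the page (the second one).

Here is the URL of the page

Code: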

Sub Info()

Dim enlace As String
Dim id As String
Dim lista
Dim rut As Integer
Dim i As Integer
Dim largo As Integer

largo = Worksheets("Lista").Cells(rows.Count, 1).End(xlUp).Row

id = Worksheets("Lista").Cells(2, 1).Value
lista = Split(id, "-")
rut = lista(0)
enlace = "http://www.cmfchile.cl/institucional/mercados/entidad.php?auth=&send=&mercado=V&rut=" & rut & "&grupo=&tipoentidad=FINRE&vig=VI&row=AAAw+cAAhAABP4MAAz&control=svs&pestania=1"

Set objIE = CreateObject("InternetExplorer.application")
objIE.Visible = False
objIE.Navigate (enlace)
Do
    If objIE.ReadyState = 4 Then
        objIE.Visible = False
        Exit Do
    Else
        DoEvents
        End If
Loop

Dim button_name As String
button_name = "Aportantes"

Set link = objIE.document.getElementsByTagName("A")
For Each Hyperlink In link
If InStr(Hyperlink.innerText, button_name) > 0 Then
    Hyperlink.Click
Exit For
End If
Next

Dim nuevoLink As String
nuevoLink = Hyperlink

objIE.Quit

Set ie = CreateObject("InternetExplorer.application")
ie.Visible = False
ie.Navigate (nuevoLink)
Do
    If ie.ReadyState = 4 Then
        ie.Visible = False
        Exit Do
    Else
        DoEvents
        End If
Loop

Dim sem As String
Dim ano As Integer
sem = "03"
ano = 2018

Dim aportantes As Object
Dim cuotas_emitidas As Object

ie.document.getElementById("semestre").Value = sem
ie.document.getElementById("aa").Value = ano
Set elems = ie.document.getElementsByTagName("input")
For Each e In elems
If (e.getAttribute("value") = "Consultar") Then
    e.Click
    ''HERE IS THE PROBLEM
    Set aportantes = ie.document.getElementsByTagName("table")(1).getElementsByTagName("tr")(0).getElementsByTagName("tr")(1)
    ThisWorkbook.Worksheets("Lista").Cells(i, 4).Value = aportantes
    Set cuotas_emitidas = ie.document.getElementsByTagName("table")(1).getElementsByTagName("tr")(1).getElementsByTagName("tr")(1).innerText
    ThisWorkbook.Worksheets("Lista").Cells(i, 5).Value = cuotas_emitidas
End If
Next e
End Sub

HTML:

<table>
 <tbody>
    <tr>
    <td class="fondoOscuro">2.01.60 TOTAL APORTANTES</td>
    <td>58</td>
  </tr>

  <tr>
    <td class="fondoOscuro">2.01.70 CUOTAS EMITIDAS</td>
    <td>20000000 </td>
  </tr>
  <tr>
    <td class="fondoOscuro">2.01.71 CUOTAS PAGADAS</td>
    <td>7691000</td>

  </tr>
  <tr>
    <td class="fondoOscuro">2.01.72 CUOTAS SUSCRITAS Y NO PAGADAS</td>
    <td>0 </td>
  </tr>
  <tr>
    <td class="fondoOscuro">2.01.73 NUMERO DE CUOTAS CON PROMESA DE SUSCRIPCION Y PAGO</td>
    <td>0  </td>
  </tr>
  <tr>
    <td class="fondoOscuro">2.01.74 NUMERO DE CONTRATOS DE PROMESAS DE SUSCRIPCION Y PAGO</td>
    <td>0</td>
  </tr>
  <tr>
    <td class="fondoOscuro">2.01.75 NUMERO DE PROMITENTES SUSCRIPTORES DE CUOTAS</td>
    <td>0 </td>
  </tr>
  <tr>
    <td class="fondoOscuro">2.01.80 VALOR LIBRO DE LA CUOTA</td>
    <td>1.0059 </td>
  </tr>
</tbody></table>


1 Answer

Stack Overflow user

Answered 2018-07-20 04:02:58

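You have already got a great answer. The thing is, when QHarr decides to step in, he hardly leaves anyone else room to contribute. Still, the following script should save you some extra time. I used IE to get the page source and then applied a faster method to handle the rest. I parsed the relevant tabular data that is populated for the year 2016. Feel free to change the year to suit your requirement.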

Sub ScrapeTabularInfo()
    Dim IE As New InternetExplorer, Html As HTMLDocument
    Dim Htmldoc As New HTMLDocument, post As Object, elem As Object
    Dim trow As Object, R&, C&

    With IE
        .Visible = False
        .navigate "http://www.cmfchile.cl/institucional/mercados/entidad.php?auth=&send=&mercado=V&rut=9278&grupo=&tipoentidad=FINRE&vig=VI&row=AAAw%20cAAhAABP4MAAz&control=svs&pestania=27"
        While .Busy Or .readyState < 4: DoEvents: Wend
        Set Html = .document
        Html.querySelector("#aa").innerText = 2016
        Html.querySelector("input[value='Consultar']").Click
        Do: Set post = Html.getElementsByTagName("table")(1): DoEvents: Loop While post Is Nothing
    End With

    Htmldoc.body.innerHTML = Html.DocumentElement.outerHTML

    For Each elem In Htmldoc.getElementsByTagName("table")(1).Rows
        For Each trow In elem.Cells
            C = C + 1: Cells(R + 1, C) = trow.innerText
        Next trow
        C = 0: R = R + 1
    Next elem
    IE.Quit
End Sub
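
If only the two values from the question are needed (total aportantes and cuotas emitidas), they can be read by row and cell index instead of looping over the whole table. The nested `getElementsByTagName("tr")` calls in the question look for a row inside a row, which the posted HTML does not contain; the values sit in `td` cells. The following is a minimal sketch, not a tested solution. It assumes the results table is still the second `<table>` on the page (index 1) and that its rows are ordered as in the posted HTML. `ReadTwoCells`, its `doc` parameter, and the target row 2 on the "Lista" sheet are illustrative choices, not part of the original code.

```vba
' Sketch only: doc is assumed to be the loaded page (e.g. IE.document),
' read after the results table has appeared.
Sub ReadTwoCells(ByVal doc As Object)
    Dim tbl As Object
    Set tbl = doc.getElementsByTagName("table")(1)

    ' Per the posted HTML: row 0 = "2.01.60 TOTAL APORTANTES",
    ' row 1 = "2.01.70 CUOTAS EMITIDAS"; cell 0 is the label, cell 1 the value.
    With ThisWorkbook.Worksheets("Lista")
        ' Row 2 is an assumption (the question reads the id from Cells(2, 1)).
        .Cells(2, 4).Value = Trim$(tbl.Rows(0).Cells(1).innerText)
        .Cells(2, 5).Value = Trim$(tbl.Rows(1).Cells(1).innerText)
    End With
End Sub
```

It could be called as `ReadTwoCells Html` right after the wait loop in the script above. `Trim$` is there because several values in the posted HTML carry trailing spaces.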

The best approach here would be to go for the POST request, which you already have a demo of.
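
For orientation, a POST-based version might look roughly like the sketch below. The real endpoint and form fields are not shown in this thread, so `postUrl` and `formBody` are placeholders that would have to be copied from the request the browser actually sends (for example from the Network tab of the developer tools). `FetchResultHtml` is an illustrative name.

```vba
' Sketch only: postUrl and formBody are placeholders; take the real values
' from the request the page sends when "Consultar" is clicked.
Function FetchResultHtml(ByVal postUrl As String, ByVal formBody As String) As Object
    Dim http As Object, doc As Object

    Set http = CreateObject("MSXML2.XMLHTTP")
    http.Open "POST", postUrl, False   ' False = synchronous request
    http.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
    http.send formBody

    ' Parse the response in memory, without opening a browser window.
    Set doc = CreateObject("htmlfile")
    doc.body.innerHTML = http.responseText
    Set FetchResultHtml = doc
End Function
```

A call could then look like `Set doc = FetchResultHtml(url, "semestre=03&aa=2018")`, where the field names are only guessed from the element ids `semestre` and `aa` used in the question and may differ from what the form really submits. The returned document can be walked with `getElementsByTagName("table")` in the same way as the IE document.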

References to add to the library (assuming you have IE9 or later so that .querySelector() works):

Microsoft Internet Controls
Microsoft HTML Object Library
Votes: 1

Original page content provided by Stack Overflow; translation support provided by Tencent Cloud's Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/51428160
