首页
学习
活动
专区
工具
TVP
发布
社区首页 >问答首页 >需要Excel VBA来浏览网站和下载特定文件

需要Excel VBA来浏览网站和下载特定文件
EN

Stack Overflow用户
提问于 2019-06-13 02:03:41
回答 1查看 195关注 0票数 1

尝试理解如何以特定的方式与网站进行交互。这是我正在编写的一个更大的代码的一部分,它将遍历ContractorID列表。下面是我需要做的事情:

UFR导航到这个网站:https://ufr.osd.state.ma.us/WebAccess/SearchDetails.asp?ContractorID=042786217&FilingYear=2018&nOrgPage=7&Year=2018

  • Find点击上面写着“
  1. 备案”的链接。(如果不存在,则结束sub)
  2. 在接下来的页面中,找到“文档类别”下标识为"UFR Excel模板“的链接,然后单击它。(在本例中,链接显示为"15-UFR18.xls",但是由于没有一致的链接命名方案,因此正确的链接必须始终由前面提到的“文档类别”下的标签来标识。如果该链接不存在,请在接下来的页面上退出下载,单击顶部的“

”链接,并将文件保存在以下文件路径下(此时将创建):C:\Documents\042786217\2018. (

  • )

编辑:下面的代码让我点击了download按钮,然后我看到了Open/Save/Cancel对话框。差不多了,只需要弄清楚如何将文件保存到特定的路径中。

代码语言:javascript
复制
Option Explicit
Sub UFRScraper()

    If MsgBox("UFR Scraper will run now. Do you wish to continue?", vbYesNo) = vbNo Then Exit Sub

    Dim IE As Object
    Dim objElement As Object
    Dim objCollection As Object
    Dim ele As Object
    Dim tbl_Providers As ListObject: Set tbl_Providers = ThisWorkbook.Worksheets("tbl_ProviderList").ListObjects("tbl_Providers")
    Dim FEIN As String: FEIN = ""
    Dim FEINList As Range: Set FEINList = tbl_Providers.ListColumns("FEIN").DataBodyRange
    Dim ProviderName As String: ProviderName = ""
    Dim ProviderNames As Range: Set ProviderNames = tbl_Providers.ListColumns("Provider Name").DataBodyRange
    Dim FiscalYear As String: FiscalYear = ""
    Dim urlUFRDetails As String: urlUFRDetails = ""
    Dim i As Integer

    ' Create InternetExplorer Object
    Set IE = CreateObject("InternetExplorer.Application")

    ' Show (True)/Hide (False) IE
    IE.Visible = True

    i = 1
    For i = 1 To 3 'Limited to 3 during testing. Change when ready.
        FEIN = FEINList(i, 1)
        ProviderName = ProviderNames(i, 1)

        urlUFRDetails = "https://ufr.osd.state.ma.us/WebAccess/SearchDetails.asp?ContractorID=" & FEIN & "&FilingYear=2018&nOrgPage=1&Year=2018"

        IE.Navigate urlUFRDetails

        ' Wait while IE loading...
        'IE ReadyState = 4 signifies the webpage has loaded (the first loop is set to avoid inadvertently skipping over the second loop)
        Do While IE.ReadyState = 4: DoEvents: Loop   'Do While
        Do Until IE.ReadyState = 4: DoEvents: Loop   'Do Until


        'Step 2 is done here
        Dim filingFound As Boolean: filingFound = False
        For Each ele In IE.Document.getElementsByTagName("a")
            If ele.innerText = "UFR Filing with Audited Financials" Then
                filingFound = True
                IE.Navigate ele.href
                Do While IE.ReadyState = 4: DoEvents: Loop   'Do While
                Do Until IE.ReadyState = 4: DoEvents: Loop   'Do Until
                Exit For
            End If
        Next ele

        If filingFound = False Then
            GoTo Skip
        End If


        'Step 3
        Dim j As Integer: j = 0
        Dim UFRFileFound As Boolean: UFRFileFound = False
        For Each ele In IE.Document.getElementsByTagName("li")
            j = j + 1
            If ele.innerText = "UFR Excel Template" Then
                UFRFileFound = True
                IE.Navigate "https://ufr.osd.state.ma.us/WebAccess/documentviewact.asp?counter=" & j - 4
                Do While IE.ReadyState = 4: DoEvents: Loop   'Do While
                Do Until IE.ReadyState = 4: DoEvents: Loop   'Do Until
                Exit For
            End If
        Next ele

        If UFRFileFound = False Then
            GoTo Skip
        End If


        'Step 4
        IE.Document.getElementById("LinkButton2").Click

        '**Built in wait time to avoid accidentally overloading server with repeated quick requests during development and testing**
Skip:
        Application.Wait (Now + TimeValue("0:00:03"))
        MsgBox "Loop " & i & " complete."

    Next i

    'Unload IE
    IE.Quit
    Set IE = Nothing
    Set objElement = Nothing
    Set objCollection = Nothing

    MsgBox "Process complete!"

End Sub
EN

回答 1

Stack Overflow用户

发布于 2019-06-13 11:19:45

我已经用一些很长的方法尝试了第三步。但无法提供完整的下载代码,因为(在一次成功的手动尝试之后)目前甚至手动下载尝试也会导致消息“无法检索文件”(可能是服务器端约束)

代码只将您带到包含xlx文件的href的单元格。

代码语言:javascript
复制
 Dim doc As HTMLDocument
        Dim Tbl As HTMLTable, Cel As HTMLTableCell, Rw As HTMLTableRow, Col As HTMLTableCol
        Set doc = IE.document

        For Each ele In IE.document.getElementsByClassName("boxedContent")
            For Each Tbl In ele.getElementsByTagName("table")
               For Each Rw In Tbl.Rows
                    For Each Cel In Rw.Cells
                    'Debug.Print Cel.innerText
                        If InStr(1, Cel.innerText, "UFR Excel Template") > 0 Then
                        Debug.Print Rw.Cells(2).innerText & " - " & Rw.Cells(2).innerHTML
                        End If
                    Next
               Next Rw
            Next Tbl
        Next

一旦href可用,就可以使用PtrSafe函数、WinHTTPrequest或其他方法来下载文件。欢迎并渴望从@QHarr等专家那里了解更有效的答案。

票数 1
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/56568004

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档