
Scraping an .aspx page: POST request results are not populated

Stack Overflow user
Asked on 2019-07-19 21:49:08
Answers: 1 · Views: 83 · Followers: 0 · Votes: 0

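I am trying to scrape a web page, but I only want the results for one specific Reserve Bank (New York). I did some research on scraping .aspx pages, and I believe I have captured all the variables needed in the POST request, but I still cannot get it to work.

I added the various elements of the request body that are visible in the browser's Inspect Element panel. I keep getting no results, as if the search on the page were never executed.

I can scrape the non-searchable page (https://www.federalreserve.gov/apps/h2a/h2a.aspx) without any problem, and my results look like this: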

Applicant: Alberto Joseph Safra, David Joseph Safra and Esther Safra Dayan, Sao Palo, Brazil and Jacob Joseph Safra, Geneva, Switzerland;, Activity: to acquire voting shares of SNBNY Holdings Limited, Gibraltar, Gibraltar and thereby indirectly acquire Safra National Bank of New York, New York, New York., Law: CIBC, Reserve Bank: St. Louis, End of Comment Period: 04/16/2019 

Applicant: American National Bankshares, Inc.,, Activity: to acquire HomeTown Bankshares Corporation, and thereby indirectly acquire HomeTown Bank, both in Roanoke, Virginia ... engage in mortgage lending, also applied to acquire at least 49 percent of HomeTown Residential Mortgage, LLC, Virginia Beach, VA., Law: 3, Reserve Bank: Richmond, End of Comment Period: 02/28/2019 

Applicant: Ameris Bancorp, Moultrie, Georgia;, Activity: to merge with Fidelity Southern Corporation, and thereby indirectly acquire Fidelity Bank, both of Atlanta, Georgia., Law: 3, Reserve Bank: Atlanta, End of Comment Period: 03/14/2019 

Applicant: Amsterdam Bancshares, Inc., Amsterdam, Missouri;, Activity: to acquire 100 percent of the voting shares of S.T.D. Investments, Inc., and thereby indirectly acquire Bank of Minden, both of Mindenmines, Missouri., Law: 3, Reserve Bank: Kansas City, End of Comment Period: 01/04/2019 

Applicant: Amy Beth Windle Oakley, Cookeville, Tennessee, and Mark Edward Copeland, Ooltewah, Tennessee;, Activity: to become members of the Windle/Copeland Family Control Group and thereby retain shares of Overton Financial Services, Inc., and its subsidiary, Union Bank and Trust Company, both of Livingston, Tennessee., Law: CIBC, Reserve Bank: Atlanta, End of Comment Period: 12/27/2018 

Applicant: Anderson W. Chandler Trust A Indenture dated July 25, 1996, and Cathleen Chandler Stevenson, individually, and as trustee, both of Dallas, Texas; Activity: to retain voting shares of Fidelity as members of the Anderson W. Chandler Family Control Group., Law: CIBC, Reserve Bank: Kansas City, End of Comment Period: 06/20/2019 

Applicant: Arthur Haag Sherman, the Sherman 2018 Irrevocable Trust, Sherman Tectonic FLP LP, and Sherman Family Holdings LLC, all of Houston, Texas;, Activity: as a group acting in concert, to acquire shares of T Acquisition, Inc., and thereby indirectly acquire T Bank, National Association, both of Dallas, Texas., Law: CIBC, Reserve Bank: Dallas, End of Comment Period: 12/10/2018 

Applicant: BancFirst Corporation, Oklahoma City, Oklahoma;, Activity: to acquire voting shares of Pegasus Bank, Dallas, Texas., Law: 3, Reserve Bank: Kansas City, End of Comment Period: 06/07/2019 

Applicant: BankFirst Capital Corporation, Macon, Mississippi;, Activity: to merge with FNB Bancshares of Central Alabama, Inc., and thereby indirectly acquire FNB of Central Alabama, both in Aliceville, Alabama., Law: 3, Reserve Bank: St. Louis, End of Comment Period: 02/28/2019 

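Because I only want results for the Federal Reserve Bank of New York, I want to scrape the searchable URL (https://www.federalreserve.gov/apps/h2a/h2asearch.aspx). I also tried banks other than New York, but none of them produced results with my code. Searching for New York on the web page itself does return results, which is what makes me believe something is wrong with my POST request. Here is my code: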

import requests
from bs4 import BeautifulSoup



headers = {'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.36'}

print('Scraping the Latest H2A Release...')

url1 = 'https://www.federalreserve.gov/apps/h2a/h2asearch.aspx'
r1 = requests.get(url=url1, headers=headers)
soup1 = BeautifulSoup(r1.text,'html.parser')
viewstate = soup1.findAll("input", {"type": "hidden", "name": "__VIEWSTATE"})
eventvalidation = soup1.findAll("input", {"type": "hidden", "name": "__EVENTVALIDATION"})
stategenerator = soup1.findAll("input", {"type": "hidden", "name": "__VIEWSTATEGENERATOR"})


item_request_body = {
"__ASYNCPOST": "true",
"__EVENTARGUMENT": "",
"__EVENTTARGET": "",
"__EVENTVALIDATION": eventvalidation[0]['value'],
"__VIEWSTATE": viewstate[0]['value'],
"__VIEWSTATEGENERATOR": stategenerator[0]['value'],
"ctl00%24bodyMaster%24applicantTextBox":" ",
"ctl00%24bodyMaster%24districtDropDownList": "2",
"ctl00%24bodyMaster%24ScriptManager1": "ctl00%24bodyMaster%24mainUpdatePanel%7Cctl00%24bodyMaster%24searchButton",
"ctl00%24bodyMaster%24searchButton": "Search",
"ctl00%24bodyMaster%24sectionDropDownBox": "ALL",
"ctl00%24bodyMaster%24targetTextBox": ""
}

url = 'https://www.federalreserve.gov/apps/h2a/h2asearch.aspx'
r2 = requests.post(url=url, data=item_request_body, cookies=r1.cookies, headers=headers)
soup = BeautifulSoup(r2.text, 'html.parser')

mylist5 = []
for tr in soup.find_all('tr')[2:]:
    tds = tr.find_all('td')
    output5 = ("Applicant: %s, Activity: %s, Law: %s, Reserve Bank: %s, End of Comment Period: %s \r\n" % (tds[0].text, tds[1].text, tds[2].text, tds[3].text, tds[4].text))
    mylist5.append(output5)
    print(mylist5)
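
One detail in the request body above that may be worth checking: the field names are already percent-encoded (`%24` in place of `$`). When a dict is passed as `data=`, requests form-encodes it again, so the server would likely receive `%2524` and not recognize the fields. The sketch below uses the standard library's `urlencode` to show that effect on one field name from the question. It is only a plausible contributor to the empty results, not a confirmed diagnosis.

```python
from urllib.parse import urlencode

# Field name as written in the question (already percent-encoded).
pre_encoded = {"ctl00%24bodyMaster%24districtDropDownList": "2"}
# The same field with a literal "$", as the name appears in the page's HTML.
literal = {"ctl00$bodyMaster$districtDropDownList": "2"}

# Form-encoding escapes "%" itself as "%25", so the first key is encoded twice.
body_pre_encoded = urlencode(pre_encoded)
body_literal = urlencode(literal)

print(body_pre_encoded)  # ctl00%2524bodyMaster%2524districtDropDownList=2
print(body_literal)      # ctl00%24bodyMaster%24districtDropDownList=2
```

Writing the keys with a literal `$` and letting the HTTP library do the encoding avoids the double escape.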

1 Answer

Stack Overflow user

Accepted answer

Answered on 2019-07-20 01:53:14

I hope the script below lets you parse the content generated after selecting New York. Give it a try.

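import requests
from bs4 import BeautifulSoup

url = 'https://www.federalreserve.gov/apps/h2a/h2asearch.aspx'

# A Session keeps the cookies from the GET and sends them with the POST.
with requests.Session() as s:
    r = s.get(url)
    soup = BeautifulSoup(r.text,'lxml')
    # Copy every named <input> (including __VIEWSTATE and __EVENTVALIDATION) into the payload.
    payload = {item['name']:item.get('value','') for item in soup.select('input[name]')}
    # Search criteria: all sections, district 2 (the value the question uses for New York).
    payload['ctl00$bodyMaster$sectionDropDownBox'] = 'ALL'
    payload['ctl00$bodyMaster$districtDropDownList'] = '2'
    # Remove the Clear button's field; a browser only sends the button that was clicked.
    del payload['ctl00$bodyMaster$clearButton']
    res = s.post(url,data=payload)
    sauce = BeautifulSoup(res.text,'lxml')
    # Print the header and data cells of each row in the results table.
    for items in sauce.select("table.pubtables tr"):
        data = [item.get_text(strip=True) for item in items.select("th,td")]
        print(data)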
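
The payload-building step in the script above can be tried offline. The sketch below is a minimal illustration that swaps BeautifulSoup for the standard library's `HTMLParser` and runs on a made-up HTML fragment. The `InputCollector` class, the `SAMPLE_HTML` string, and its hidden-field values are assumptions for illustration only. The field names are the ones that appear in the question and the answer.

```python
from html.parser import HTMLParser


class InputCollector(HTMLParser):
    """Collect name/value pairs from every <input> tag that has a name."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr_map = dict(attrs)
        name = attr_map.get("name")
        if name:
            # Inputs without a value attribute default to an empty string.
            self.fields[name] = attr_map.get("value") or ""


# Made-up sample; the real page's hidden values are long opaque strings.
SAMPLE_HTML = """
<form>
  <input type="hidden" name="__VIEWSTATE" value="abc123">
  <input type="hidden" name="__EVENTVALIDATION" value="xyz789">
  <input type="text" name="ctl00$bodyMaster$applicantTextBox">
  <input type="submit" name="ctl00$bodyMaster$searchButton" value="Search">
  <input type="submit" name="ctl00$bodyMaster$clearButton" value="Clear">
</form>
"""

collector = InputCollector()
collector.feed(SAMPLE_HTML)
payload = collector.fields

# Override the search criteria and drop the button that was not clicked.
payload["ctl00$bodyMaster$sectionDropDownBox"] = "ALL"
payload["ctl00$bodyMaster$districtDropDownList"] = "2"
payload.pop("ctl00$bodyMaster$clearButton", None)

print(sorted(payload))
```

Collecting the fields from the live page on every run, as the script above does, means the hidden values never have to be hard-coded.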
Votes: 0
Page content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/57113897