Q: Building POST data for an .aspx site in Python

Stack Overflow user

Asked 2018-07-27 04:23:03

1 answer · 70 views · 0 followers · 0 votes

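I'm new to .NET and Python, but I want to write a program that scrapes an .aspx site and processes its content (the HTML is enough). I tried several Python libraries, but all I ever get is the site's first page. It seems I'm building the POST data incorrectly; I don't know the correct form of the data, or what should and shouldn't be included.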

http://nastenka.lesy.sk/EZOZV/Publish/ObjednavkyZverejnenie.aspx?YR=2018

import requests, urllib, urllib2

r = requests.get("http://nastenka.lesy.sk/EZOZV/Publish/ObjednavkyZverejnenie.aspx?YR=2018")
content = r.text
print content

start_index = content.find('id="__VIEWSTATE"') + 24
sliced_vs = content[start_index:content.find('"',start_index)]

start_index = content.find('id="__VIEWSTATEGENERATOR"') + 33
sliced_vsg = content[start_index:content.find('"',start_index)]

start_index = content.find('id="__VIEWSTATEENCRYPTED"') + 33
sliced_vse = content[start_index:content.find('"',start_index)]

start_index = content.find('id="__EVENTVALIDATION"') + 30
sliced_EV = content[start_index:content.find('"',start_index)]

form_data = {'__EVENTTARGET': 'gvZverejnenie',
      '__EVENTARGUMENT': 'Page$2',
      '__VIEWSTATE': sliced_vs,
      '__VIEWSTATEGENERATOR': sliced_vsg,
      '__VIEWSTATEENCRYPTED': sliced_vse,
      '__EVENTVALIDATION': sliced_EV}

data_encoded = urllib.urlencode(form_data)


r = requests.post('http://nastenka.lesy.sk/EZOZV/Publish/ObjednavkyZverejnenie.aspx?YR=2018',data=data_encoded)
content = r.text
print content

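For example, in the code above I try to fetch the second page ('Page$2'). I always get the same result, only with different ViewState and EventValidation values. Where is the problem?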

python

asp.net

post

1 Answer

Stack Overflow user

Accepted answer

Posted 2018-07-27 06:20:31

This code requires selenium and chromedriver to drive Google Chrome. It turns out there are 476 pages in total (for the URL you provided).

from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By

# Run Chrome without a visible window
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--headless')

driver = webdriver.Chrome(options=chrome_options)
driver.get('http://nastenka.lesy.sk/EZOZV/Publish/ObjednavkyZverejnenie.aspx?YR=2018')

# Save the first page as loaded
with open('page_1.html', 'w', encoding='utf-8') as f:
    f.write(driver.page_source)

page_num = 2
while True:
    try:
        # Pager link whose text is the next page number
        element = driver.find_element(By.LINK_TEXT, str(page_num))
    except NoSuchElementException:
        # Number not shown: fall back to the '...' link that advances the pager
        elements = driver.find_elements(By.LINK_TEXT, '...')
        if len(elements) == 0:
            break  # less than 11 pages total
        elif len(elements) == 1 and page_num > 12:
            break  # last page
        element = elements[-1]

    element.click()

    with open('page_{}.html'.format(page_num), 'w', encoding='utf-8') as f:
        f.write(driver.page_source)

    page_num += 1

driver.quit()
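
If driving a browser is more than you need, the postback can also be built by hand. The following is a minimal Python 3 sketch, not verified against this site. It collects every hidden input from the page instead of slicing the HTML at fixed offsets, then adds the two event fields from the question. The names HiddenInputParser and build_postback_data are hypothetical helpers introduced here for illustration.

```python
from html.parser import HTMLParser


class HiddenInputParser(HTMLParser):
    """Collect the name/value pair of every <input type="hidden"> in a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # Also invoked for self-closing tags such as <input ... />
        if tag != "input":
            return
        attrs = dict(attrs)
        is_hidden = (attrs.get("type") or "").lower() == "hidden"
        if is_hidden and attrs.get("name"):
            self.fields[attrs["name"]] = attrs.get("value") or ""


def build_postback_data(html, event_target, event_argument):
    """Return form data for a WebForms postback (hypothetical helper).

    Starts from all hidden fields found in `html` (__VIEWSTATE,
    __EVENTVALIDATION, ...) and sets the two event fields on top.
    """
    parser = HiddenInputParser()
    parser.feed(html)
    data = dict(parser.fields)
    data["__EVENTTARGET"] = event_target
    data["__EVENTARGUMENT"] = event_argument
    return data
```

With requests, the resulting dict can be passed directly, as in requests.post(url, data=data). When data is a dict, requests form-encodes it and sets the Content-Type header to application/x-www-form-urlencoded; when data is an already-encoded string, as in the question, it does not set that header. That may be why the server treated the request as a first visit, though this is an assumption rather than something confirmed for this site. Using a requests.Session for both the GET and the POST also keeps any cookies the site sets.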

0 votes

Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud's Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/51546975
