I am trying to get Scrapy to fill in the following HTML form using FormRequest.from_response:
<form class="form-horizontal" method="POST" role="form">
<div class="form-group">
<label class="col-sm-3 control-label" for="inputEmail3"> Username </label>
<div class="col-sm-9">
<input class="form-control" value="" maxlength="32" name="pun" />
</div>
</div>
<div class="form-group">
<label class="col-sm-3 control-label" for="inputEmail3"> Passphrase </label>
<div class="col-sm-9">
<input class="form-control" type="password" value="" maxlength="10000" name="ak" />
</div>
</div>
</form>
</div>
<div align="right">
<input id="send" type="submit" value="Login" name="login" />
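</div>
I followed the tutorial here, but the code that fills in the "ak" and "pun" fields does not work. Any ideas or suggestions? Thanks.
Edit: this is what I have so far: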
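import logging

from scrapy.http import FormRequest
from scrapy.linkextractors import LinkExtractor
from scrapy.spiders import CrawlSpider, Rule


class TestSpider(CrawlSpider):
    name = "test1"
    allowed_domains = ['...']
    start_urls = [
        '...'
    ]
    # A tuple keeps the rules in a defined order (the original used a set)
    rules = (Rule(LinkExtractor(), callback='parse_items', follow=True),)

    def parse_items(self, response):
        # Pre-fill the form found in the response and submit it
        return [FormRequest.from_response(response,
                                          formdata={"pun": '...', "ak": '...'},
                                          callback=self.after_login)]

    def after_login(self, response):
        # Check that the login succeeded before going on
        if b"authentication failed" in response.body:
            self.log("Login failed", level=logging.ERROR)
            return
        # Crawl contents ...

Answered 2015-04-16 08:55:33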
The submit button must be inside the <form> tag.
Try this:
<form class="form-horizontal" method="POST" role="form">
<div class="form-group">
<label class="col-sm-3 control-label" for="inputEmail3"> Username </label>
<div class="col-sm-9">
<input class="form-control" value="" maxlength="32" name="pun" />
</div>
</div>
<div class="form-group">
<label class="col-sm-3 control-label" for="inputEmail3"> Passphrase </label>
<div class="col-sm-9">
<input class="form-control" type="password" value="" maxlength="10000" name="ak" />
</div>
</div>
<div align="right">
<input id="send" type="submit" value="Login" name="login" />
</div>
</form>
https://stackoverflow.com/questions/29669751
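To see the difference between the two versions of the markup, here is a minimal sketch that lists which named inputs sit inside a <form> element and which sit outside it. It uses only Python's standard html.parser and is not how Scrapy resolves forms internally. The names FormFieldCollector and input_names, and the trimmed-down HTML strings, are made up for this illustration.

```python
from html.parser import HTMLParser


class FormFieldCollector(HTMLParser):
    """Collect names of <input> elements inside and outside a <form>."""

    def __init__(self):
        super().__init__()
        self.form_depth = 0      # greater than 0 while inside a <form>
        self.inside_form = []    # input names found inside a form
        self.outside_form = []   # input names found outside any form

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <input ... /> also end up here.
        if tag == "form":
            self.form_depth += 1
        elif tag == "input":
            name = dict(attrs).get("name")
            if name:
                target = self.inside_form if self.form_depth else self.outside_form
                target.append(name)

    def handle_endtag(self, tag):
        if tag == "form" and self.form_depth:
            self.form_depth -= 1


def input_names(html):
    """Return (names inside a form, names outside any form)."""
    parser = FormFieldCollector()
    parser.feed(html)
    parser.close()
    return parser.inside_form, parser.outside_form


# Hypothetical, shortened copies of the markup from the question and answer.
QUESTION_HTML = """
<form method="POST">
  <input name="pun" />
  <input type="password" name="ak" />
</form>
<div align="right">
  <input id="send" type="submit" value="Login" name="login" />
</div>
"""

ANSWER_HTML = """
<form method="POST">
  <input name="pun" />
  <input type="password" name="ak" />
  <div align="right">
    <input id="send" type="submit" value="Login" name="login" />
  </div>
</form>
"""

if __name__ == "__main__":
    print(input_names(QUESTION_HTML))
    print(input_names(ANSWER_HTML))
```

With the question's markup, the "login" input is reported outside the form; with the answer's markup, it is reported inside, next to "pun" and "ak".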