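This is the inspect-element markup for the "View More" button on a website. I can scrape the data that is displayed on the page, but I also want to scrape the items hidden behind the View More button. How can I do that?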
<div id="view-more" class="p20px pt10px">
<div id="view-more-loader" class="tac"></div>
<a href="javascript:void(0);" onclick="add_more_product_classified();$('#load_more_a_id').hide();" class="xxxxlarge ffrc lightbginfo gbiwb bdr darkbdrinfo p10px20px db w180px m0a tac" id="load_more_a_id" style="display: block;"><b class="icon-refresh xsmall mr5px"></b>View More Products..</a>
</div>
My Scrapy code:
import scrapy


class DummymartSpider(scrapy.Spider):
    name = 'dummymart'
    # Must match the domain used in start_urls; otherwise any follow-up
    # requests are dropped as offsite.
    allowed_domains = ['dummymart.com']
    start_urls = ['https://www.dummymart.com/catalog/car-dvd-player_cid100001018.html']

    def parse(self, response):
        # Each XPath returns one list per field, in page order.
        Product = response.xpath('//div[@class="attr"]/h2/a/@title').extract()
        Company = response.xpath('//div[@class="supplier"]/p/a/@title').extract()
        Country = response.xpath('//*[@class="location a-color-secondary"]/span/text()').extract()
        Category = response.xpath('//*[@class="attr category hide--mobile"]/span/a/text()').extract()

        # Pair the fields up by position and emit one item per product.
        for item in zip(Product, Company, Country, Category):
            scraped_info = {
                'Product': item[0],
                'Company': item[1],
                'Country': item[2],
                'Category': item[3],
            }
            yield scraped_info
Posted on 2018-08-16 04:56:53
For a problem like this, the usual solution is:
This blog post may help you.
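One common way to handle a button like this, sketched here under assumptions: the `onclick` handler `add_more_product_classified()` most likely sends a background (AJAX) request for the next batch of products. If the browser's network tab shows such a request when the button is clicked, the spider can request that URL directly, page after page, until an empty batch comes back. The endpoint URL, the `page` query parameter, and the helper names `with_page` and `crawl_pages` below are all hypothetical; the real request has to be read from the network tab.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def with_page(url, page, param="page"):
    """Return `url` with the query parameter `param` set to `page`.

    The parameter name "page" is an assumption; use whatever the
    site's real "view more" request sends.
    """
    parts = urlparse(url)
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query[param] = str(page)
    return urlunparse(parts._replace(query=urlencode(query)))


def crawl_pages(fetch, first_url, max_pages=50):
    """Yield items page by page until a page comes back empty.

    `fetch(url)` is any callable that returns the list of items
    found at `url` (hypothetical; in a spider this role is played
    by the parse callback).
    """
    for page in range(1, max_pages + 1):
        items = fetch(with_page(first_url, page))
        if not items:
            break  # no more products behind the button
        yield from items
```

In a Scrapy spider the same idea would look like this: at the end of `parse`, if the current response contained any products, yield a new `scrapy.Request` for `with_page(response.url, next_page)` with `callback=self.parse`, and stop when a response yields no products. If the site loads the extra items purely through JavaScript with no separate request to replicate, a browser-rendering tool would be needed instead.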
https://stackoverflow.com/questions/51861991