Here is the sample code:
    boxes = sel.xpath("//div[@class='lister-item mode-advanced']")
    for box in boxes:
        link = box.xpath(".//div[@class='lister-item-image float-left']/a/@href").extract_first()
        absolute_url = response.urljoin(link)
        yield SeleniumRequest(url=absolute_url, callback=self.parse_example)

Please tell me how this can be done:
    if all items (links from boxes) have been sent to self.parse_example:
        ... do something
    else:
        pass

Maybe I need to use a while loop? If so, how do I do that?
Posted on 2020-03-29 10:38:48
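You may want to use the meta dictionary to pass all the links along to the callback, which then performs the action once the list is empty: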
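    # assumes: from scrapy_selenium import SeleniumRequest

    def thing_one(self, response):
        # `sel` is the selector object from the question's code
        boxes = sel.xpath("//div[@class='lister-item mode-advanced']")
        box_urls = []
        for box in boxes:
            link = box.xpath(".//div[@class='lister-item-image float-left']/a/@href").extract_first()
            absolute_url = response.urljoin(link)
            box_urls.append(absolute_url)
        if not box_urls:
            return  # no links found, nothing to chain
        # use one of them to get started; the rest travel along in meta
        box0 = box_urls.pop()
        yield SeleniumRequest(url=box0, callback=self.parse_example,
                              meta={'box_urls': box_urls})

    def parse_example(self, response):
        box_urls = response.meta.get('box_urls')
        if not box_urls:
            self.log('I have seen them all, now time for action!')
            return  # stop here: popping from an empty list would raise IndexError
        box = box_urls.pop()
        yield SeleniumRequest(url=box, callback=self.parse_example,
                              meta={'box_urls': box_urls})

This has the unfortunate side effect of making the requests run serially, but the pleasant side effect of not needing any external coordination system, such as a database or something worse.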
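To see the chaining behave without running a crawler, here is a minimal pure-Python sketch of the same idea. Everything in it is hypothetical and only for illustration: `FakeRequest`, `FakeResponse`, `ChainDemo` and `run` are stand-ins that I made up, not Scrapy or scrapy-selenium classes, and the tiny `run` loop only imitates how a scheduler would feed each response back into its callback. Unlike the code above, it takes URLs from the front of the list (`pop(0)`) so that the visiting order matches the input order.

```python
from collections import deque


class FakeRequest:
    """Hypothetical stand-in for a request: URL, callback and meta dict."""

    def __init__(self, url, callback, meta):
        self.url = url
        self.callback = callback
        self.meta = meta


class FakeResponse:
    """Hypothetical stand-in for a response carrying the request's meta."""

    def __init__(self, url, meta):
        self.url = url
        self.meta = meta


class ChainDemo:
    def __init__(self):
        self.visited = []      # URLs handled so far, in order
        self.finished = False  # set once the remaining list is empty

    def start(self, urls):
        remaining = list(urls)
        if not remaining:
            self.finished = True
            return
        first = remaining.pop(0)
        # The remaining URLs travel along with the request in meta.
        yield FakeRequest(first, self.parse_example, {'box_urls': remaining})

    def parse_example(self, response):
        self.visited.append(response.url)
        box_urls = response.meta.get('box_urls')
        if not box_urls:
            # Nothing left: this is the "all items were processed" moment.
            self.finished = True
            return
        next_url = box_urls.pop(0)
        yield FakeRequest(next_url, self.parse_example, {'box_urls': box_urls})


def run(demo, urls):
    """Toy scheduler: hand each request's response to its callback."""
    queue = deque(demo.start(urls))
    while queue:
        request = queue.popleft()
        response = FakeResponse(request.url, request.meta)
        queue.extend(request.callback(response))
    return demo


if __name__ == '__main__':
    demo = run(ChainDemo(), ['a', 'b', 'c'])
    print(demo.visited, demo.finished)
```

Because each callback yields at most one follow-up request, only one request is ever pending at a time, which is exactly the serial behaviour mentioned above. The "all done" branch runs once, in the callback that finds the list empty.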
https://stackoverflow.com/questions/60908198