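Is there any way to set a timeout for the robotparser.read() function in Python 3.3.0, the way urllib.request.urlopen accepts one?
The default 60-second timeout is a bit too long.
(I am teaching myself Python.)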
Python 3.3.0 - robotparser
Python 3.3.0 - urllib.request
Posted on 2013-03-06 06:34:26
No. You either have to set a global default timeout with socket.setdefaulttimeout(), or subclass RobotFileParser to add a custom timeout:
    from urllib.robotparser import RobotFileParser
    import urllib.error
    import urllib.request

    class TimoutRobotFileParser(RobotFileParser):
        def __init__(self, url='', timeout=60):
            super().__init__(url)
            # Timeout in seconds, passed to urlopen() in read().
            self.timeout = timeout

        def read(self):
            """Reads the robots.txt URL and feeds it to the parser."""
            try:
                f = urllib.request.urlopen(self.url, timeout=self.timeout)
            except urllib.error.HTTPError as err:
                # 401/403: access denied, so treat every path as disallowed.
                if err.code in (401, 403):
                    self.disallow_all = True
                # Any other 4xx/5xx: no usable robots.txt, so allow everything.
                elif err.code >= 400:
                    self.allow_all = True
            else:
                raw = f.read()
                self.parse(raw.decode("utf-8").splitlines())
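The first option, socket.setdefaulttimeout(), needs no subclass. Here is a minimal sketch of it, assuming you only want the timeout in effect while robots.txt is being fetched. The names default_socket_timeout and read_robots are hypothetical helpers made up for this example, not part of the standard library.

```python
import socket
from contextlib import contextmanager
from urllib.robotparser import RobotFileParser


@contextmanager
def default_socket_timeout(seconds):
    """Temporarily set the global socket timeout, then restore the old value."""
    previous = socket.getdefaulttimeout()
    socket.setdefaulttimeout(seconds)
    try:
        yield
    finally:
        # Restore the previous default even if the body raised.
        socket.setdefaulttimeout(previous)


def read_robots(url, timeout=10):
    """Fetch and parse the robots.txt at `url` with a bounded wait.

    Hypothetical helper: `timeout` is in seconds and applies to new
    sockets created while the fetch is running.
    """
    parser = RobotFileParser(url)
    with default_socket_timeout(timeout):
        parser.read()
    return parser
```

You would call it as read_robots("http://example.com/robots.txt", timeout=5) and then use parser.can_fetch() as usual. Keep in mind that the default timeout is process-wide, so sockets opened by other threads during the fetch pick it up too, and that a timeout or connection failure is raised to the caller rather than handled inside read().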
https://stackoverflow.com/questions/15235374