I am using a standard pattern to wrap retry behavior around requests calls in Python:
import requests
from requests.adapters import HTTPAdapter
from requests.packages.urllib3.util.retry import Retry

retry_strategy = Retry(
    total=HTTP_RETRY_LIMIT,
    status_forcelist=HTTP_RETRY_CODES,
    method_whitelist=HTTP_RETRY_METHODS,  # deprecated in favor of allowed_methods since urllib3 1.26
    backoff_factor=HTTP_BACKOFF_FACTOR
)
adapter = HTTPAdapter(max_retries=retry_strategy)
http = requests.Session()
http.mount("https://", adapter)
http.mount("http://", adapter)
...
try:
    response = http.get(... some request params ...)
except requests.exceptions.RetryError as err:
    # Do logic with err to perform error handling & logging.
    ...
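Unfortunately, the documentation on RetryError explains nothing, and when I intercept the exception object as above, err.response is None. You can call str(err) to get the exception's message string, but recovering specific response details from it would require unreasonable string parsing, and even if you were willing to try, the message actually elides the details you need. For example, deliberately calling a site that returns a 400 (not that you would really retry on that; it is only for debugging) gives the message "(Caused by ResponseError('too many 400 error responses'))", which drops the actual response details, such as the site's own descriptive text about the nature of the 400 error (which may be critical for deciding how to handle it, or even just for logging the error).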
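What I want is to receive the response of the last unsuccessful retry attempt and use the status code and description of that specific failure to decide the handling logic. I want the call to be robust behind retries, but when I finally handle the error I still need to know the underlying failure, not just "too many retries".
Is it possible to extract this information from the exception raised for retries?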
Posted on 2022-09-02 06:22:26
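We cannot get a response in every exception, because the request may not have been sent yet, or the request or response may never have reached its destination. For example, these exceptions carry no response: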
urllib3.exceptions.ConnectTimeoutError
urllib3.exceptions.SSLError
urllib3.exceptions.NewConnectionError
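urllib3.util.Retry has a parameter named raise_on_status, which defaults to True. If it is False, urllib3.exceptions.MaxRetryError is not raised when status-based retries are exhausted, and if no exception is raised, a response is guaranteed to have arrived. It then becomes easy to call response.raise_for_status in the else block of the try block, wrapped in another try.
I have changed except RetryError to except Exception so that other exceptions are caught as well.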
import requests
from requests.adapters import HTTPAdapter
from requests.packages.urllib3.util.retry import Retry
from requests.exceptions import RetryError

# DEFAULT_ALLOWED_METHODS = frozenset({'DELETE', 'GET', 'HEAD', 'OPTIONS', 'PUT', 'TRACE'})
# Default methods to be used for allowed_methods
# RETRY_AFTER_STATUS_CODES = frozenset({413, 429, 503})
# Default status codes to be used for status_forcelist
HTTP_RETRY_LIMIT = 3
HTTP_BACKOFF_FACTOR = 0.2

retry_strategy = Retry(
    total=HTTP_RETRY_LIMIT,
    backoff_factor=HTTP_BACKOFF_FACTOR,
    raise_on_status=False,
)
adapter = HTTPAdapter(max_retries=retry_strategy)
http = requests.Session()
http.mount("https://", adapter)
http.mount("http://", adapter)

try:
    response = http.get("https://httpbin.org/status/503")
except Exception as err:
    print(err)
else:
    try:
        response.raise_for_status()
    except Exception as e:
        # Do logic with e to perform error handling & logging.
        print(response.reason)
        # Or
        # print(e.response.reason)
    else:
        print(response.text)
Test:
# url = https://httpbin.org/user-agent
➜ python requests_retry.py
{
  "user-agent": "python-requests/2.28.1"
}
# url = https://httpbin.org/status/503
➜ python requests_retry.py
SERVICE UNAVAILABLE
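Once the final response is available (as in the else branch above), the handling decision can be factored into a small helper. The following is a minimal sketch and not part of the original answer: summarize_failure, its body_limit parameter and the shape of the returned dict are hypothetical, and it relies only on the status_code, reason and text attributes that a requests.Response provides. A stand-in object is used so the example runs without network access.

```python
from types import SimpleNamespace


def summarize_failure(response, body_limit=200):
    """Collect the details of a failed response for logging or dispatch.

    `response` only needs `status_code`, `reason` and `text` attributes,
    which a requests.Response provides. The name and return shape are
    hypothetical, chosen for illustration.
    """
    if response.status_code >= 500:
        kind = "server_error"
    elif response.status_code >= 400:
        kind = "client_error"
    else:
        kind = "ok"
    return {
        "kind": kind,
        "status_code": response.status_code,
        "reason": response.reason,
        # Keep only the start of the body so log lines stay short.
        "body": response.text[:body_limit],
    }


# Stand-in for the response left after the last retry (no network needed).
fake = SimpleNamespace(status_code=503, reason="SERVICE UNAVAILABLE", text="try later")
summary = summarize_failure(fake)
print(summary["kind"], summary["status_code"], summary["reason"])
```

With a real session, the same helper could be called on response inside the except branch that follows response.raise_for_status().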
Posted on 2022-08-30 15:38:31
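It is not directly supported by the library:
Retry does not attach the response to MaxRetryError (see urllib3's util/retry.py, lines 486-512).
A similar issue was raised in the psf/requests repo: issue #4455, about the response object not being captured on error.
It can be achieved by subclassing Retry so that it attaches the response to MaxRetryError: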
from requests.adapters import MaxRetryError, Retry

class MyRetry(Retry):
    def increment(self, *args, **kwargs):
        try:
            return super().increment(*args, **kwargs)
        except MaxRetryError as ex:
            response = kwargs.get('response')
            if response is not None:
                # Read and cache the body so it stays available on the exception.
                response.read(cache_content=True)
                ex.response = response
            raise
Usage:
# retry_strategy = Retry(
retry_strategy = MyRetry(
...
    # Do logic with err to perform error handling & logging.
    print(err.args[0].response.status)
    print(err.args[0].response.data)
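Because connection-level failures never produce a response, it can help to read the attached response defensively instead of assuming err.args[0].response exists. This is a minimal sketch, not part of the original answer: attached_response is a hypothetical helper, and it assumes the exception came from a Retry subclass such as MyRetry above, which sets .response on the MaxRetryError that requests passes as the first argument of its own exception. Plain stand-in exceptions are used so the example runs without requests installed.

```python
def attached_response(err):
    """Return the response attached to the wrapped exception, or None.

    Assumes err.args[0] is the underlying MaxRetryError and that a Retry
    subclass (such as MyRetry above) has set `.response` on it.
    """
    cause = err.args[0] if err.args else None
    return getattr(cause, "response", None)


# Stand-ins for the real exceptions, for illustration only.
class FakeCause(Exception):
    pass


cause = FakeCause("too many 503 error responses")
cause.response = "last response"

with_response = attached_response(Exception(cause))
without_response = attached_response(Exception("connection refused"))
print(with_response)
print(without_response)
```

In the except block, a None result would then mean that only the exception message is available for handling and logging.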
Posted on 2022-09-01 07:38:12
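As aaron has already pointed out, the error you are trying to catch is not the same as the one the library actually raises. It also depends heavily on the version of the library in use, since they seem to have changed things around Retry as well (it is also available via from requests.adapters import Retry, along with RetryError).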
Working code
The code below was tested with requests=2.27.1 and python=3.7.12, with Retry taken from urllib3:
import requests
from requests.adapters import HTTPAdapter
from requests.packages.urllib3.util.retry import Retry

HTTP_RETRY_LIMIT = 1
HTTP_RETRY_CODES = [403, 400, 401, 429, 500, 502, 503, 504]
HTTP_RETRY_METHODS = ['HEAD', 'GET', 'OPTIONS', 'TRACE', 'POST']
HTTP_BACKOFF_FACTOR = 1

retry_strategy = Retry(
    total=HTTP_RETRY_LIMIT,
    status_forcelist=HTTP_RETRY_CODES,
    allowed_methods=HTTP_RETRY_METHODS,  # changed to allowed_methods
    backoff_factor=HTTP_BACKOFF_FACTOR
)
adapter = HTTPAdapter(max_retries=retry_strategy)
http = requests.Session()
http.mount("https://", adapter)
http.mount("http://", adapter)

try:
    response = http.get('https://www.howtogeek.com/wp-content/uploads/2018/06/')
except (requests.exceptions.RetryError, requests.exceptions.ConnectionError) as err:
    # Do logic with err to perform error handling & logging.
    print(err)
    print(err.args[0].reason)
I do get:
requests.exceptions.RetryError: HTTPSConnectionPool(host='www.howtogeek.com', port=443): Max retries exceeded with url: /wp-content/uploads/2018/06/ (Caused by ResponseError('too many 403 error responses'))
too many 403 error responses
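With sys.exc_info()
If that is not enough, you can import the traceback package or use sys.exc_info() (index 0, 1 or 2); there is more on this in other Stack Overflow answers. In your case, you would do something like this: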
import traceback, sys

try:
    response = http.get('https://www.howtogeek.com/wp-content/uploads/2018/06/')
except (requests.exceptions.RetryError, requests.exceptions.ConnectionError) as err:
    # Do logic with err to perform error handling & logging.
    print(sys.exc_info()[0])  # just the class of the exception, check the link for more info
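It returns the exception class, which you can use to decide how to handle the error; this can also be combined with catching the generic Exception.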
<class 'requests.exceptions.ConnectionError'>
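As a rough illustration of combining sys.exc_info() with a generic except Exception, the sketch below reads the class and message of whatever exception is currently being handled. It is not from the original answer: classify_current_exception is a hypothetical helper, and Python's built-in ConnectionError is used only as a stand-in for requests.exceptions.ConnectionError (a different class) so that the example runs without any network access.

```python
import sys


def classify_current_exception():
    """Return (class name, message) of the exception being handled.

    Must be called while an exception is being handled, i.e. from
    inside an except block or a function called from one.
    """
    exc_type, exc_value, _traceback = sys.exc_info()
    return exc_type.__name__, str(exc_value)


try:
    # Built-in ConnectionError as a stand-in for the requests exception.
    raise ConnectionError("simulated failure")
except Exception:
    name, message = classify_current_exception()

print(name, message)
```

The class name can then drive a lookup table of handlers, which keeps the except clause itself generic.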
This gives you a lot of control, because you can do info = sys.exc_info()[1] and get the actual exception object. You can then access its details like this:
print(info.request.url)
print(info.request.headers)
# and probably most important for you
print(info.args[0].reason) # urllib3.exceptions.ResponseError('too many 403 error responses')
and get the information you need:
https://www.howtogeek.com/wp-content/uploads/2018/06/
{'User-Agent': 'python-requests/2.27.1', 'Accept-Encoding': 'gzip, deflate, br', 'Accept': '*/*', 'Connection': 'keep-alive'}
too many 403 error responses
Alternatively, the full traceback gives more information (how useful it is depends on how you parse it):
print(traceback.format_exc()) # Returns full stack trace, might not be most useful in your case
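If only the reason is available, the status code can at least be pulled out of it. The sketch below assumes the message format seen in the output above ('too many 403 error responses'), which is not a documented contract and may differ between urllib3 versions; status_from_reason is a hypothetical helper. It recovers the status code only, not the body or headers of the failed response.

```python
import re

# Pattern taken from the message shown in the output above; treat it as
# an assumption, since the wording is not a documented contract.
_STATUS_IN_REASON = re.compile(r"too many (\d{3}) error responses")


def status_from_reason(reason):
    """Return the HTTP status code embedded in a retry reason, or None."""
    match = _STATUS_IN_REASON.search(str(reason))
    return int(match.group(1)) if match else None


found = status_from_reason("too many 403 error responses")
missing = status_from_reason("Connection refused")
print(found)
print(missing)
```

In the code above this would be called as status_from_reason(info.args[0].reason), with a None result treated as a failure that has no HTTP status at all.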
https://stackoverflow.com/questions/73394472