
Python scraping: Error 54 'Connection reset by peer'

Stack Overflow user
Asked 2020-08-05 19:39:35
1 answer · 3.9K views · 0 followers · 2 votes

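I wrote a simple script that fetches HTML from multiple websites. It ran without problems until yesterday, when it suddenly started throwing the following exception.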

Traceback (most recent call last):
  File "crowling.py", line 45, in <module>
    result = requests.get(url)
  File "/Users/gen/.pyenv/versions/3.7.1/lib/python3.7/site-packages/requests/api.py", line 76, in get
    return request('get', url, params=params, **kwargs)
  File "/Users/gen/.pyenv/versions/3.7.1/lib/python3.7/site-packages/requests/api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "/Users/gen/.pyenv/versions/3.7.1/lib/python3.7/site-packages/requests/sessions.py", line 530, in request
    resp = self.send(prep, **send_kwargs)
  File "/Users/gen/.pyenv/versions/3.7.1/lib/python3.7/site-packages/requests/sessions.py", line 685, in send
    r.content
  File "/Users/gen/.pyenv/versions/3.7.1/lib/python3.7/site-packages/requests/models.py", line 829, in content
    self._content = b''.join(self.iter_content(CONTENT_CHUNK_SIZE)) or b''
  File "/Users/gen/.pyenv/versions/3.7.1/lib/python3.7/site-packages/requests/models.py", line 754, in generate
    raise ChunkedEncodingError(e)
requests.exceptions.ChunkedEncodingError: ("Connection broken: ConnectionResetError(54, 'Connection reset by peer')", ConnectionResetError(54, 'Connection reset by peer'))

The main part of the script is shown below.

c = 0
#urls is the list of urls as strings
for url in urls:
    result = requests.get(url)
    c += 1
    with open('htmls/p{}.html'.format(c),'w',encoding='UTF-8') as f:
        f.write(result.text)

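The list urls is generated by other code of mine, and I have checked that the URLs are correct. The exception also does not occur at a consistent point: sometimes the script stops after scraping 20 pages, and sometimes it gets to 80 before stopping. Since the exception appeared suddenly without any change to the code, I suspect it is caused by the internet connection. However, I want to make sure the script runs reliably. What are the possible causes of this error?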


1 Answer

Stack Overflow user

Accepted answer

Answered 2020-08-05 20:23:58

If you are sure the URLs are correct and this is an intermittent connection problem, you can retry the connection after a failure:

import time

import requests
from requests.exceptions import ChunkedEncodingError

c = 0
# urls is the list of URLs as strings
for url in urls:
    trycnt = 3  # maximum number of attempts
    while trycnt > 0:
        try:
            result = requests.get(url)
            c += 1
            with open('htmls/p{}.html'.format(c), 'w', encoding='UTF-8') as f:
                f.write(result.text)
            trycnt = 0  # success, leave the retry loop
        except ChunkedEncodingError as ex:
            trycnt -= 1  # one attempt used up
            if trycnt <= 0:
                print("Failed to retrieve: " + url + "\n" + str(ex))  # done retrying
            else:
                time.sleep(0.5)  # wait half a second, then retry
    # go to next URL
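
The same retry idea can be pulled out into a small reusable helper. The following is only a sketch and not part of the original answer: the name `retry_call`, its parameters, and the doubling delay between attempts are assumptions chosen for illustration. It uses only the standard library, so the caller decides which exceptions count as retryable.

```python
import time


def retry_call(func, exceptions, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call func() until it succeeds or `attempts` tries have been used.

    Hypothetical helper: waits base_delay, 2*base_delay, 4*base_delay, ...
    between tries and re-raises the last exception if every try fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return func()
        except exceptions:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller see the error
            sleep(base_delay * 2 ** attempt)  # back off before the next try
```

With requests, it could be called as `result = retry_call(lambda: requests.get(url, timeout=10), (ChunkedEncodingError, ConnectionError))`, with both exception classes imported from `requests.exceptions`. Writing the file after the call returns keeps the counter and the file write out of the retried section.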
Votes: 2
Original content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/63264392
