
Q: How to handle urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='127.0.0.1', port=58408): Max retries exceeded with url

Stack Overflow user

Asked 2020-11-09 04:03:59

2 answers · 25.3K views · 0 followers · 10 votes

I am trying to scrape a few web pages with selenium and use the results, but when I run the function twice, the second call fails with

[WinError 10061] No connection could be made because the target machine actively refused it

My approach is:

import os
import re
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.chrome.options import Options
from bs4 import BeautifulSoup as soup

opts = webdriver.ChromeOptions()
opts.binary_location = os.environ.get('GOOGLE_CHROME_BIN', None)
opts.add_argument("--headless")
opts.add_argument("--disable-dev-shm-usage")
opts.add_argument("--no-sandbox")
browser = webdriver.Chrome(executable_path="CHROME_DRIVER PATH", options=opts)

lst =[]
def search(st):
    for i in range(1,3):
        url = "https://gogoanime.so/anime-list.html?page=" + str(i)
        browser.get(url)
        req = browser.page_source
        sou = soup(req, "html.parser")
        title = sou.find('ul', class_ = "listing")
        title = title.find_all("li")
        for j in range(len(title)):
            lst.append(title[j].getText().lower()[1:])
    browser.quit()
    print(len(lst))
    
search("a")
search("a")

Output:

272
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='127.0.0.1', port=58408): Max retries exceeded with url: /session/4b3cb270d1b5b867257dcb1cee49b368/url (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x000001D5B378FA60>: Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it'))

Tags: beautifulsoup, webdriver, selenium, selenium-webdriver, web-scraping

2 Answers

Stack Overflow user

Accepted answer

Posted 2020-11-09 13:13:14

This error message...

raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='127.0.0.1', port=58408): Max retries exceeded with url: /session/4b3cb270d1b5b867257dcb1cee49b368/url (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x000001D5B378FA60>: Failed to establish a new connection: [WinError 10061] No connection could be made because the target machine actively refused it'))

...implies that the attempt to establish a new connection failed, which raised MaxRetryError.

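A few things:

First and foremost, as per the discussion "max-retries-exceeded exceptions are confusing", the traceback is somewhat misleading. Requests wraps the exception for the user's convenience. The original exception is part of the message displayed.
Requests never retries (it sets retries=0 for urllib3's HTTPConnectionPool), so the error would have been much more canonical without the MaxRetryError and HTTPConnectionPool keywords. So an ideal traceback would have been: ConnectionError(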
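
Since the original exception is buried inside the wrapper, it can help to unwrap it programmatically. The following is a minimal sketch, not part of the original answer: `root_cause` and `WrapperError` are hypothetical names, and `WrapperError` only mimics a wrapper that stores the inner exception in a `reason` attribute (urllib3's MaxRetryError exposes one). No real library is imported.

```python
def root_cause(exc):
    """Follow `reason`, `__cause__`, and `__context__` links to the innermost exception."""
    seen = set()
    while id(exc) not in seen:
        seen.add(id(exc))
        inner = getattr(exc, "reason", None)
        # `reason` may be a plain string on some exception types; only follow exceptions.
        if not isinstance(inner, BaseException):
            inner = exc.__cause__ or exc.__context__
        if inner is None:
            break
        exc = inner
    return exc


class WrapperError(Exception):
    """Hypothetical wrapper that carries the underlying error in `reason`."""

    def __init__(self, message, reason):
        super().__init__(f"{message} (Caused by {reason!r})")
        self.reason = reason


try:
    try:
        raise ConnectionRefusedError(10061, "actively refused")
    except OSError as low_level:
        raise WrapperError("Max retries exceeded", low_level)
except WrapperError as err:
    message = str(err)        # the wrapper text, with the original error embedded
    cause = root_cause(err)   # the low-level ConnectionRefusedError

print(type(cause).__name__)
print(message)
```

Logging the unwrapped cause next to the wrapper message makes it clearer that the real problem is a refused connection rather than an exhausted retry budget.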

Root cause and solution

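Once you have initiated the webdriver session, within def search(st) you invoke get() to access a URL, and afterwards you also invoke browser.quit(), which calls the /shutdown endpoint. The webdriver and web client instances are then destroyed completely, closing all pages/tabs/windows. Hence no connection exists anymore.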

In this situation, when search() is called a second time, the first browser.get() inside the loop finds no active connection. Hence you see the error.
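
The "actively refused" part can be reproduced without Selenium. This is a minimal stdlib-only sketch in which a plain TCP listener stands in for the local driver process; that stand-in and the helper name `can_connect` are assumptions made for illustration, not Selenium's own code.

```python
import socket


def can_connect(port, host="127.0.0.1", timeout=1.0):
    """Return True if a TCP connection to host:port succeeds, else False."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError (WinError 10061 on Windows) and timeouts.
        return False


# Stand-in for the local driver process: a listener on an ephemeral port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
port = listener.getsockname()[1]

before_quit = can_connect(port)  # something is listening, so this connects
listener.close()                 # analogous to quit() shutting the driver down
after_quit = can_connect(port)   # nothing is listening on that port any more

print(before_quit, after_quit)
```

The second attempt fails for the same reason as the second search() call: the client still targets 127.0.0.1 on the old port, but the process that served it is gone.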

So a simple solution is to remove the line browser.quit() and invoke browser.get(url) within the same browsing context.

Conclusion

Once you upgrade to Selenium 3.14.1, you will be able to set the timeout, see canonical tracebacks, and take the required action.

References

You can find a relevant detailed discussion in:

MaxRetryError: HTTPConnectionPool: Max retries exceeded (Caused by ProtocolError('Connection aborted.', error(111, 'Connection refused')))


9 votes

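Stack Overflow user

Posted 2022-03-18 08:25:18

Problem

The driver was asked to fetch a URL after it had already quit. Make sure the driver is not quit before you have retrieved the content.

Solution

In your code, when search("a") executes, the driver retrieves the URLs, returns the content, and then quits.

When search() runs again, the driver no longer exists, so it cannot navigate to the URL.

You need to remove browser.quit() from the function and add it at the end of the script.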

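# Assumes `browser` and `soup` are created/imported as in the question.
lst = []

def search(st):
    # Note: `st` is unused, as in the original code.
    for i in range(1, 3):
        url = "https://gogoanime.so/anime-list.html?page=" + str(i)
        browser.get(url)
        sou = soup(browser.page_source, "html.parser")
        listing = sou.find("ul", class_="listing")
        for item in listing.find_all("li"):
            # Drop the first character of each entry, as the original does.
            lst.append(item.getText().lower()[1:])
    print(len(lst))

search("a")
search("a")
browser.quit()  # quit once, after every call that needs the driver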
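
If an exception in one of the calls could skip the final browser.quit(), one option is to tie the driver's lifetime to a context manager. This is a minimal sketch under stated assumptions: `managed_driver` and `StubDriver` are hypothetical names, and `StubDriver` merely imitates the get()/quit() behaviour described above so the example runs without a browser.

```python
from contextlib import contextmanager


class StubDriver:
    """Hypothetical stand-in for webdriver.Chrome, for illustration only."""

    def __init__(self):
        self.alive = True
        self.visited = []

    def get(self, url):
        if not self.alive:
            raise ConnectionRefusedError("driver session already closed")
        self.visited.append(url)

    def quit(self):
        self.alive = False


@contextmanager
def managed_driver(factory):
    """Create a driver, hand it to the caller, and quit it exactly once at the end."""
    driver = factory()
    try:
        yield driver
    finally:
        driver.quit()


def search(driver, pages=range(1, 3)):
    # The driver is passed in, so search() never decides when the session ends.
    for page in pages:
        driver.get("https://gogoanime.so/anime-list.html?page=" + str(page))


with managed_driver(StubDriver) as browser:
    search(browser)
    search(browser)  # works: the session is still open

print(len(browser.visited), browser.alive)
```

With a real browser the factory would be something like `lambda: webdriver.Chrome(options=opts)`, where `opts` is configured as in the question.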

1 vote

Original page content provided by Stack Overflow; translation support provided by Tencent Cloud's Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/64745726
