Understanding the proxies dictionary in Python's requests module

Stack Overflow user

Asked on 2018-09-26 05:59:19


I wrote the script below earlier today and have a question about this line:

proxies = {"https": "https://" + current_socket}

According to the documentation here:

http://docs.python-requests.org/en/master/user/advanced/#proxies

a dictionary with two key/value pairs is passed:

proxies = {
    'http': 'http://10.10.1.10:3128',
    'https': 'http://10.10.1.10:1080',
}

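Can this dictionary contain more than these two keys (or fewer)? In other words, how long can this dictionary be? Can I add more sockets, or is it only meant to hold those two key/value pairs?

I also didn't bother to specify an "http" key in my code (https only). However, some proxies support only HTTP and not HTTPS. Will that raise an exception (and trigger the next recursion level), or will the non-HTTPS socket be used anyway?

Occasionally I see "hit next recursion level" and the code takes longer to run. But I don't know whether that is caused by a bad socket or by the missing http key/value pair in the dictionary shown above. Other times it runs quickly on the first try.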

#! /usr/bin/python3
import requests
import re


class ProxyRequests:
    def __init__(self, url):
        self.sockets = []
        self.url = url
        self.proxy = ""
        self.request = ""
        self.__acquire_sockets()

    # get a list of US sockets from us-proxy.org
    def __acquire_sockets(self):
        r = requests.get("https://www.us-proxy.org/")
        matches = re.findall(r"<td>\d+.\d+.\d+.\d+</td><td>\d+</td>", r.text)
        revised_list = [m1.replace("<td>", "") for m1 in matches]
        for socket_str in revised_list:
            self.sockets.append(socket_str[:-5].replace("</td>", ":"))
        self.__proxy_request()

    # recursively try socket until successful
    def __proxy_request(self):
        if len(self.sockets) > 0:
            current_socket = self.sockets.pop(0)
            proxies = {"https": "https://" + current_socket}
            try:
                request = requests.get(self.url, proxies=proxies)
                self.request = request.text
            except:
                print('hit next recursion level')
                self.__proxy_request()

    def __str__(self):
        return str(self.request)


if __name__ == "__main__":
    r = ProxyRequests("https://www.somesite.com")
    print(r)

1 Answer

Stack Overflow user

Answered on 2018-09-26 15:55:21

Here is a proxy socket that does not support HTTPS, used for a request to a site that requires https:

>>> import requests
>>> proxies = {
...     'https': 'https://54.174.104.38:1080'
... }
>>> r = requests.get('https://www.urbandictionary.com', proxies=proxies)

This results in:

    raise ProxyError(e, request=request)
requests.exceptions.ProxyError: HTTPSConnectionPool(host='www.urbandictionary.com', port=443): Max retries exceeded with url:
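The ProxyError above is the kind of failure that the bare `except:` in the question's script swallows before recursing. A minimal sketch of the same "try sockets until one works" idea, written as a loop with a timeout, might look like the following. The function name `fetch_via_proxies`, the injectable `get` parameter, and the 10-second default timeout are assumptions made for illustration; they are not part of the original script or of requests.

```python
def fetch_via_proxies(url, sockets, get=None, timeout=10):
    """Try each "host:port" proxy in turn.

    Returns (socket, response) for the first proxy that works,
    or (None, None) if every proxy fails.
    """
    if get is None:
        # Imported lazily so the loop can be exercised with a stub getter.
        import requests
        get = requests.get

    for sock in sockets:
        # Assumption: the listed proxies are plain HTTP proxies, so the
        # proxy URL uses "http://" even for the "https" key.
        proxies = {"http": "http://" + sock, "https": "http://" + sock}
        try:
            response = get(url, proxies=proxies, timeout=timeout)
        except OSError as exc:
            # requests' exceptions (ProxyError, timeouts, ...) derive from
            # RequestException, which subclasses IOError/OSError.
            print("proxy %s failed: %s" % (sock, type(exc).__name__))
            continue
        return sock, response
    return None, None
```

A loop avoids growing the call stack by one frame per dead proxy, and the timeout keeps a single unresponsive proxy from stalling the whole run, which may be one reason the original script is sometimes slow.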

If I try a proxy socket that does support https (and request an https site), I get a valid response:

>>> proxies = {
...     'https': '165.227.63.207:8080'
... }

>>> r = requests.get('https://www.urbandictionary.com', proxies=proxies)
>>> r.headers
{'Accept-Ranges': 'bytes', 'X-Cache-Hits': '89', 'Content-Type': 'text/html; charset=utf-8', 'Content-Encoding': 'gzip', 'Content-Length': '22976', 'Cache-Control': 'max-age=3600', 'X-Cache': 'HIT', 'Last-Modified': 'Sat, 04 Aug 2018 14:49:04 GMT', 'ETag': '"1b285bd6418a910ff525d2165320a18b"', 'Age': '34268', 'Vary': 'Accept-Encoding,Fastly-SSL', 'X-Timer': 'S1533430595.339135,VS0,VE0', 'Via': '1.1 varnish', 'x-amz-meta-surrogate-control': 'max-age=864000', 'X-Served-By': 'cache-sjc3149-SJC', 'Connection': 'keep-alive', 'Date': 'Sun, 05 Aug 2018 00:56:35 GMT'}

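However, in the first example my dictionary explicitly forced the connection through the proxy to use https (the "https://" prefix in the value). That is why the exception was raised.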
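To make the roles of the keys and values more concrete: the key decides which outgoing requests use a proxy, and the scheme inside the value is how the client talks to the proxy itself. The sketch below is a simplified stand-in for the key lookup as I understand it from the requests documentation (`scheme://host` first, then the bare scheme, then `all://host`, then `all`). `build_proxies` and `pick_proxy` are hypothetical helpers written for illustration, not functions provided by requests.

```python
from urllib.parse import urlparse


def build_proxies(sock, proxy_scheme="http"):
    """Build a proxies dict that routes both http and https URLs through one proxy."""
    proxy_url = "%s://%s" % (proxy_scheme, sock)
    return {"http": proxy_url, "https": proxy_url}


def pick_proxy(url, proxies):
    """Return the proxy URL that would apply to `url`, or None for a direct connection."""
    parts = urlparse(url)
    candidates = [
        "%s://%s" % (parts.scheme, parts.hostname),  # most specific: scheme + host
        parts.scheme,                                # e.g. "http" or "https"
        "all://%s" % parts.hostname,
        "all",
    ]
    for key in candidates:
        if key in proxies:
            return proxies[key]
    return None


https_only = {"https": "http://10.10.1.10:1080"}
print(pick_proxy("https://www.somesite.com", https_only))  # http://10.10.1.10:1080
print(pick_proxy("http://www.somesite.com", https_only))   # None
```

Under this reading, the dictionary may hold fewer keys (a missing "http" key means plain http URLs are not matched by the dictionary) or more keys (per-host entries such as "http://10.20.1.128"), but each key maps to a single proxy, so a list of candidate sockets still has to be tried one at a time. As far as I know, requests may also fall back to proxy environment variables for schemes the dictionary does not cover, so a missing key does not always guarantee a direct connection.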


Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud Xiaowei's IT-domain engine.

Original link:

https://stackoverflow.com/questions/-100002752
