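I tried the sample provided in the documentation of the requests library for Python.
With async.map(rs) I get the response codes, but I want the content of each page requested. This, for example, does not work:
out = async.map(rs)
print out[0].content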
Answered 2012-08-14 17:47:10
async is now an independent module: grequests.
See here: https://github.com/kennethreitz/grequests
And there: Ideal method for sending multiple HTTP requests over Python?
Installation:
$ pip install grequests
Usage:
Build a stack:
import grequests
urls = [
    'http://www.heroku.com',
    'http://tablib.org',
    'http://httpbin.org',
    'http://python-requests.org',
    'http://kennethreitz.com'
]
rs = (grequests.get(u) for u in urls)
Send the stack:
grequests.map(rs)
The result looks like:
[<Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>, <Response [200]>]
grequests does not seem to set a limit on concurrent requests, i.e. when multiple requests are sent to the same server.
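To get at the page bodies rather than just the status codes, each item returned by grequests.map can be read like an ordinary requests response. The sketch below is only an illustration: `page_contents` is a hypothetical helper name, and it assumes that a failed request shows up as None in the result list, which is my understanding of recent grequests versions. The commented usage also assumes the `size` argument of grequests.map, which caps how many requests run at once.

```python
def page_contents(responses):
    """Return the body of each response, or None where the request failed.

    `responses` is expected to be the list returned by grequests.map();
    this helper name is hypothetical and not part of grequests.
    """
    return [r.content if r is not None else None for r in responses]


# Usage with grequests (third-party, needs network access), left commented out:
#
# import grequests
# rs = (grequests.get(u) for u in urls)
# responses = grequests.map(rs, size=5)   # assumption: at most 5 requests at a time
# for body in page_contents(responses):
#     print(body)
```

Keeping the extraction in a small function makes the None case explicit, so one failed URL does not raise an AttributeError while reading the others.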
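Answered 2015-11-18 18:08:01
I tested both requests-futures and grequests. Grequests is faster, but it brings monkey patching and additional dependency problems. requests-futures is several times slower than grequests. I decided to write my own simple wrapper around requests using ThreadPoolExecutor; it is almost as fast as grequests, but without external dependencies.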
import concurrent.futures

import requests

def get_urls():
    return ["url1", "url2"]

def load_url(url, timeout):
    return requests.get(url, timeout=timeout)

# Counters for failed and successful requests
resp_err = 0
resp_ok = 0

with concurrent.futures.ThreadPoolExecutor(max_workers=20) as executor:
    # Map each future back to the URL it was submitted for
    future_to_url = {executor.submit(load_url, url, 10): url for url in get_urls()}
    for future in concurrent.futures.as_completed(future_to_url):
        url = future_to_url[future]
        try:
            data = future.result()
        except Exception as exc:
            resp_err = resp_err + 1
        else:
            resp_ok = resp_ok + 1
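The ThreadPoolExecutor snippet in this answer only counts successes and failures. Since the question asks for the content of each page, here is a minimal sketch of how the same pattern could keep each body, keyed by URL. The names `fetch_all`, `default_fetch` and the `fetch` parameter are hypothetical and not part of the original answer; `fetch` is pluggable only so the helper can be tried with a stub instead of real network calls.

```python
import concurrent.futures


def default_fetch(url, timeout=10):
    # Imported lazily so fetch_all can be exercised with a stub fetcher
    # on a machine where requests is not installed.
    import requests
    return requests.get(url, timeout=timeout).content


def fetch_all(urls, fetch=default_fetch, max_workers=20):
    """Fetch all URLs concurrently.

    Returns (results, errors): results maps url -> body,
    errors maps url -> the exception raised for that url.
    """
    results, errors = {}, {}
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Map each future back to the URL it was submitted for
        future_to_url = {executor.submit(fetch, url): url for url in urls}
        for future in concurrent.futures.as_completed(future_to_url):
            url = future_to_url[future]
            try:
                results[url] = future.result()
            except Exception as exc:
                errors[url] = exc
    return results, errors
```

Calling `results, errors = fetch_all(get_urls())` would then give the page bodies in `results`, with the number of failures available as `len(errors)`.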
Answered 2014-05-28 10:49:00
Maybe requests-futures is another choice.
from requests_futures.sessions import FuturesSession
session = FuturesSession()
# first request is started in background
future_one = session.get('http://httpbin.org/get')
# second request is started immediately
future_two = session.get('http://httpbin.org/get?foo=bar')
# wait for the first request to complete, if it hasn't already
response_one = future_one.result()
print('response one status: {0}'.format(response_one.status_code))
print(response_one.content)
# wait for the second request to complete, if it hasn't already
response_two = future_two.result()
print('response two status: {0}'.format(response_two.status_code))
print(response_two.content)
It is also recommended in the official documentation. If you don't want to involve gevent, it is a good option.
https://stackoverflow.com/questions/9110593