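I'm trying to port the logic in the code below, which requests a Google search, to aiohttp. My solution looks equivalent, but for some reason cookies aren't being set as expected. Any help?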

import os
from http.cookiejar import LWPCookieJar
from urllib.request import Request, urlopen

USER_AGENT = 'Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)'

# home_folder is defined elsewhere in the original code (not shown here).
cookie_jar = LWPCookieJar(os.path.join(home_folder, '.google-cookie'))
cookie_jar.load()


def get_page(url, user_agent=None, verify_ssl=True):
    if user_agent is None:
        user_agent = USER_AGENT
    request = Request(url)
    request.add_header('User-Agent', user_agent)
    # Attach any stored cookies to the outgoing request.
    cookie_jar.add_cookie_header(request)
    response = urlopen(request)
    # Store the cookies set by the response.
    cookie_jar.extract_cookies(response, request)
    html = response.read()
    response.close()
    try:
        cookie_jar.save()
    except Exception:
        pass
    return html

My solution is:

import aiohttp

USER_AGENT = 'Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)'

abs_cookie_jar = aiohttp.CookieJar()
abs_cookie_jar.load('.aiogoogle-cookie')


async def get_page(url, user_agent=None, verify_ssl=True):
    if user_agent is None:
        user_agent = USER_AGENT
    async with aiohttp.ClientSession(headers={'User-Agent': user_agent}, cookie_jar=abs_cookie_jar) as session:
        response = await session.get(url)
        if response.cookies:
            abs_cookie_jar.update_cookies(cookies=response.cookies)
            abs_cookie_jar.save('.aiogoogle-cookie')
        html = await response.text()
    return html

Posted on 2021-08-21 16:44:35
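What happens is that when you go to google.com, you get redirected. As a result, three HTTP requests are made, with response codes 301, 302, and 200 (you can confirm this by inspecting the response.history attribute, which holds the redirect responses).
The Set-Cookie header is sent with the first response, but what you have in the response variable is the last one, which does not contain Set-Cookie.
The update step in your implementation, abs_cookie_jar.update_cookies(cookies=response.cookies), is not needed, because aiohttp updates the jar automatically for every request; see the source.
How to fix your solution: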
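
To make it easier to see which hop in the redirect chain carried the cookie, here is a minimal sketch. It relies only on three attributes that an aiohttp response exposes: status, cookies (a SimpleCookie), and history. The names cookies_by_hop and fake_response and the cookie value are hypothetical, introduced just for illustration; the stand-in objects let the sketch run without network access.

```python
from http.cookies import SimpleCookie
from types import SimpleNamespace


def cookies_by_hop(response):
    """Return (status, cookie names) for every hop, redirects first."""
    # response.history holds the intermediate redirect responses in order;
    # the response object itself is the final hop.
    hops = [*response.history, response]
    return [(hop.status, sorted(hop.cookies.keys())) for hop in hops]


def fake_response(status, set_cookie=None, history=()):
    """Hypothetical stand-in mimicking the attributes used above."""
    cookies = SimpleCookie()
    if set_cookie:
        cookies.load(set_cookie)
    return SimpleNamespace(status=status, cookies=cookies, history=tuple(history))


# Simulate the 301 -> 302 -> 200 chain; the cookie value is made up.
first = fake_response(301, 'NID=example; Path=/')
second = fake_response(302)
final = fake_response(200, history=[first, second])

print(cookies_by_hop(final))  # [(301, ['NID']), (302, []), (200, [])]
```

With a real request, you would pass the object returned by session.get(url) to cookies_by_hop instead of the stand-in. An empty list for the final hop is consistent with response.cookies being empty even though the session's jar has been populated.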

import asyncio
import os

import aiohttp

USER_AGENT = 'Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)'
COOKIE_FILE = '.aiogoogle-cookie'


async def get_page(url, user_agent=None, verify_ssl=True):
    if user_agent is None:
        user_agent = USER_AGENT
    # Create the jar inside the coroutine so it is built while the event loop is running.
    abs_cookie_jar = aiohttp.CookieJar()
    # load() fails if the file does not exist yet (e.g. on the first run).
    if os.path.exists(COOKIE_FILE):
        abs_cookie_jar.load(COOKIE_FILE)
    async with aiohttp.ClientSession(headers={'User-Agent': user_agent}, cookie_jar=abs_cookie_jar) as session:
        response = await session.get(url)
        html = await response.text()
        # Display the redirect responses
        for resp in response.history:
            print(resp)
        # Print the cookies in a human-readable format
        for cookie in abs_cookie_jar:
            print(cookie)
        # Save the jar, which already contains the response cookies
        abs_cookie_jar.save(COOKIE_FILE)
    return html


asyncio.run(get_page('https://google.com'))

https://stackoverflow.com/questions/68701758