
Error when passing a URL to requests.get()

Stack Overflow user
Asked 2022-09-30 02:18:24
1 answer · 54 views · 0 followers · 0 votes

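I have been working on a program that takes URLs from a .csv file and counts the number of words on each web page. The URLs come from the rows under the "Article" column of a pandas DataFrame. Each URL is passed to requests.get(url), and the result is assigned to a variable. From my research into the error, the problem occurs when the URL is passed to requests.get().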

import pandas as pd

def file_input(file):
    # Takes a .csv file from the user; fields may be separated by ";" or ","
    df = pd.read_csv(file, sep='[;,]', engine='python')
    for i in range(len(df)):
        df.at[i, "Word Count"] = word_counter(df.at[i, "Article"])
import requests

def word_counter(url):
    # Keeps track of the page's word count
    count = 0
    # requests.get(url) takes the URL string and fetches the web page
    page = requests.get(url)

Here is the error:

Traceback (most recent call last):
  File "/home/runner/Article-Word-counter/venv/lib/python3.8/site-packages/urllib3/response.py", line 406, in _decode
    data = self._decoder.decompress(data)
  File "/home/runner/Article-Word-counter/venv/lib/python3.8/site-packages/urllib3/response.py", line 93, in decompress
    ret += self._obj.decompress(data)
zlib.error: Error -3 while decompressing data: incorrect header check

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/runner/Article-Word-counter/venv/lib/python3.8/site-packages/requests/models.py", line 816, in generate
    yield from self.raw.stream(chunk_size, decode_content=True)
  File "/home/runner/Article-Word-counter/venv/lib/python3.8/site-packages/urllib3/response.py", line 627, in stream
    data = self.read(amt=amt, decode_content=decode_content)
  File "/home/runner/Article-Word-counter/venv/lib/python3.8/site-packages/urllib3/response.py", line 599, in read
    data = self._decode(data, decode_content, flush_decoder)
  File "/home/runner/Article-Word-counter/venv/lib/python3.8/site-packages/urllib3/response.py", line 409, in _decode
    raise DecodeError(
urllib3.exceptions.DecodeError: ('Received response with content-encoding: gzip, but failed to decode it.', error('Error -3 while decompressing data: incorrect header check'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "main.py", line 59, in <module>
    main()
  File "main.py", line 44, in main
    file_input(file)
  File "main.py", line 35, in file_input
    df.at[i, "Word Count"] = word_counter(df.at[i, "Article"])
  File "main.py", line 13, in word_counter
    page = requests.get(anything)
  File "/home/runner/Article-Word-counter/venv/lib/python3.8/site-packages/requests/api.py", line 73, in get
    return request("get", url, params=params, **kwargs)
  File "/home/runner/Article-Word-counter/venv/lib/python3.8/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
  File "/home/runner/Article-Word-counter/venv/lib/python3.8/site-packages/requests/sessions.py", line 587, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/runner/Article-Word-counter/venv/lib/python3.8/site-packages/requests/sessions.py", line 745, in send
    r.content
  File "/home/runner/Article-Word-counter/venv/lib/python3.8/site-packages/requests/models.py", line 899, in content
    self._content = b"".join(self.iter_content(CONTENT_CHUNK_SIZE)) or b""
  File "/home/runner/Article-Word-counter/venv/lib/python3.8/site-packages/requests/models.py", line 820, in generate
    raise ContentDecodingError(e)
requests.exceptions.ContentDecodingError: ('Received response with content-encoding: gzip, but failed to decode it.', error('Error -3 while decompressing data: incorrect header check'))

1 Answer

Stack Overflow user

Accepted answer

Answered 2022-09-30 04:34:11

requests.exceptions.ContentDecodingError: ('Received response with content-encoding: gzip, but failed to decode it.', error('Error -3 while decompressing data: incorrect header check'))

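The server's response declares that it is gzip-encoded, but requests cannot decode it as gzip. This could be a server misconfiguration or something more subtle. Try requesting an uncompressed response by specifying the Accept-Encoding header (though the server may not honor your request):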

headers = { 'Accept-Encoding': 'identity' }
page = requests.get(url, headers=headers)
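
One way to apply this without changing every request is to retry with `Accept-Encoding: identity` only when decoding fails. The following is a minimal sketch under stated assumptions, not part of the original answer: the helper name `get_with_identity_fallback`, its `session` parameter, and the 10-second timeout are hypothetical, and the server may still ignore the header.

```python
import requests

# Header from the answer: ask the server not to compress the body.
IDENTITY_HEADERS = {"Accept-Encoding": "identity"}


def get_with_identity_fallback(url, session=None, timeout=10):
    """Fetch url; if the body cannot be decoded, retry asking for no compression.

    Hypothetical helper for illustration. `session` is expected to behave
    like a requests.Session (it only needs a compatible .get method).
    """
    session = session or requests.Session()
    try:
        # Normal request: requests advertises gzip/deflate support by default.
        return session.get(url, timeout=timeout)
    except requests.exceptions.ContentDecodingError:
        # The declared Content-Encoding did not match the body, so retry
        # once and request an uncompressed ("identity") response.
        return session.get(url, headers=IDENTITY_HEADERS, timeout=timeout)
```

In the question's `word_counter`, `page = get_with_identity_fallback(url)` could stand in for `page = requests.get(url)`. If the retry fails as well, the exception still propagates, so the caller can decide whether to skip that row.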

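You can also check whether the URL is accessible with another tool (such as curl) or a web browser. In addition, you can explicitly inspect the raw response to see what the server actually sent. But contacting the site's administrator may be the real solution.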
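
To inspect the raw response, one option is to stream the body and read its first bytes without letting urllib3 decode them. This is a rough sketch rather than a definitive diagnostic: the helper names `peek_raw_body` and `looks_like_gzip`, the 16-byte sample size, and the timeout are assumptions made for illustration. A real gzip stream starts with the bytes `1f 8b`, so anything else alongside a `Content-Encoding: gzip` header points to a mislabeled response.

```python
import requests

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def looks_like_gzip(data):
    """Return True if the bytes start with the gzip magic number."""
    return data[:2] == GZIP_MAGIC


def peek_raw_body(url, n=16, timeout=10):
    """Return (Content-Encoding header, first n undecoded body bytes).

    stream=True stops requests from reading and decoding the body eagerly,
    which is the step that raised ContentDecodingError in the traceback.
    """
    with requests.get(url, stream=True, timeout=timeout) as resp:
        # decode_content=False returns the bytes exactly as sent on the wire.
        raw = resp.raw.read(n, decode_content=False)
        return resp.headers.get("Content-Encoding"), raw
```

For example, `encoding, head = peek_raw_body(url)` followed by `print(encoding, head, looks_like_gzip(head))` shows whether the declared encoding matches the bytes the server actually sent.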

Votes: 0
Original content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/73903352
