Download a file from the web in Python 3?

Content sourced from Stack Overflow, translated and used under the CC BY-SA 3.0 license


I am creating a program that reads a JAR (Java) file from a web server, using the URL specified in the corresponding JAD file.

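I have managed to extract the URL of the JAR file from the JAD file (every JAD file contains the URL to its JAR file), but as you may imagine, the extracted value is a string according to type().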

Here is the relevant function:

def downloadFile(URL=None):
    import httplib2
    h = httplib2.Http(".cache")
    resp, content = h.request(URL, "GET")
    return content

downloadFile(URL_from_file)

Answer:

I use the requests package whenever I need anything related to HTTP requests, because its API is very easy to get started with:

First, install requests:

$ pip install requests

Then the code:

from requests import get  # to make GET request


def download(url, file_name):
    # open in binary mode
    with open(file_name, "wb") as file:
        # get request
        response = get(url)
        # write to file
        file.write(response.content)

Answer:

If you want to get the contents of a web page into a variable, just read the response of urllib.request.urlopen:

import urllib.request
...
url = 'http://example.com/'
response = urllib.request.urlopen(url)
data = response.read()      # a `bytes` object
text = data.decode('utf-8') # a `str`; this step can't be used if data is binary
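
The hard-coded 'utf-8' above may not match what the server actually sends. A minimal sketch, assuming the response headers may declare a charset: `fetch_text` and `default_encoding` are hypothetical names introduced here, and the fallback encoding is an assumption rather than something the original answer specifies.

```python
import urllib.request


def fetch_text(url, default_encoding="utf-8"):
    """Return the body at `url` decoded to `str` (hypothetical helper)."""
    with urllib.request.urlopen(url) as response:
        data = response.read()  # a `bytes` object
        # Use the charset from the Content-Type header when present;
        # otherwise fall back to the assumed default.
        charset = response.headers.get_content_charset() or default_encoding
    return data.decode(charset)
```

As in the snippet above, this only makes sense for text; binary data should stay as `bytes`.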

The easiest way to download and save a file is to use the urllib.request.urlretrieve function:

import urllib.request
...
# Download the file from `url` and save it locally under `file_name`:
urllib.request.urlretrieve(url, file_name)

import urllib.request
...
# Download the file from `url`, save it in a temporary directory and get the
# path to it (e.g. '/tmp/tmpb48zma.txt') in the `file_name` variable:
file_name, headers = urllib.request.urlretrieve(url)
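
urlretrieve also accepts an optional reporthook callback, which can be used for rough progress reporting. The following is a sketch only: `download_with_progress` and `report` are hypothetical names, and the percentage is approximate because the hook receives block counts rather than exact byte counts.

```python
import urllib.request


def download_with_progress(url, file_name):
    """Download `url` to `file_name`, printing rough progress (sketch)."""

    def report(block_num, block_size, total_size):
        # total_size is -1 when the server does not report a length.
        if total_size > 0:
            done = min(block_num * block_size, total_size)
            print(f"{done * 100 // total_size}% of {total_size} bytes")

    path, headers = urllib.request.urlretrieve(url, file_name, reporthook=report)
    return path
```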

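Note that urlretrieve is documented as a legacy interface. The alternative is to use the urllib.request.urlopen function, which returns a file-like object representing the HTTP response, and copy that object to a real file using shutil.copyfileobj: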

import urllib.request
import shutil
...
# Download the file from `url` and save it locally under `file_name`:
with urllib.request.urlopen(url) as response, open(file_name, 'wb') as out_file:
    shutil.copyfileobj(response, out_file)
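
For illustration, the copy that shutil.copyfileobj performs can be written as an explicit loop, which keeps memory use bounded and leaves room for per-chunk work. This is a sketch under assumptions: `download_in_chunks` and `chunk_size` are hypothetical names, and the 64 KiB default is an arbitrary choice.

```python
import urllib.request


def download_in_chunks(url, file_name, chunk_size=64 * 1024):
    """Stream `url` to `file_name` chunk by chunk; return bytes written."""
    total = 0
    with urllib.request.urlopen(url) as response, open(file_name, "wb") as out_file:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:  # an empty bytes object signals end of stream
                break
            out_file.write(chunk)
            total += len(chunk)
    return total
```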

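If this seems too complicated, you may want to simplify things and store the whole download in a bytes object, then write it to a file. But this works well only for small files, since the entire download is held in memory.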

import urllib.request
...
# Download the file from `url` and save it locally under `file_name`:
with urllib.request.urlopen(url) as response, open(file_name, 'wb') as out_file:
    data = response.read() # a `bytes` object
    out_file.write(data)

It is also possible to decompress .gz (and possibly other formats) data on the fly:

import urllib.request
import gzip
...
# Read the first 64 bytes of the file inside the .gz archive located at `url`
url = 'http://example.com/something.gz'
with urllib.request.urlopen(url) as response:
    with gzip.GzipFile(fileobj=response) as uncompressed:
        file_header = uncompressed.read(64) # a `bytes` object
        # Or do anything shown above using `uncompressed` instead of `response`.
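
Combining this with the shutil.copyfileobj approach, the decompressed stream could be saved straight to disk. A minimal sketch, assuming the resource at `url` really is gzip-compressed; `download_and_gunzip` is a hypothetical name, not part of any library.

```python
import gzip
import shutil
import urllib.request


def download_and_gunzip(url, file_name):
    """Fetch a .gz resource and write its decompressed content to `file_name`."""
    with urllib.request.urlopen(url) as response:
        # Wrap the response so that reads return decompressed bytes.
        with gzip.GzipFile(fileobj=response, mode="rb") as uncompressed:
            with open(file_name, "wb") as out_file:
                shutil.copyfileobj(uncompressed, out_file)
```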
