Q: FastAPI UploadFile is slow compared to Flask

Stack Overflow user

Asked 2020-12-17 14:42:25

Answers: 1 | Views: 3.5K | Followers: 0 | Votes: 6

I have created an endpoint, as shown below:

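from fastapi import FastAPI, File, UploadFile

app = FastAPI()

@app.post("/report/upload")
def create_upload_files(files: UploadFile = File(...)):
    try:
        # Write the uploaded bytes to a local file named after the upload
        with open(files.filename, 'wb+') as wf:
            wf.write(files.file.read())
    except Exception as e:
        return {"error": str(e)}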

It is launched with uvicorn:

../venv/bin/uvicorn test_upload:app --host=0.0.0.0 --port=5000 --reload

I am running some tests: uploading a file of about 100 MB with requests takes about 128 seconds:

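import binascii
import sys
import time

import requests

# Read the file named on the command line and hex-encode its contents
with open(sys.argv[1], "rb") as fh:
    f = fh.read()
hex_convert = binascii.hexlify(f)
items = {"files": hex_convert.decode()}
start = time.time()
r = requests.post("http://192.168.0.90:5000/report/upload", files=items)
end = time.time() - start
print(end)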

I tested the same upload script against an API endpoint built with Flask, and it took about 0.5 seconds:

from flask import Flask, render_template, request
app = Flask(__name__)


@app.route('/uploader', methods = ['GET', 'POST'])
def upload_file():
   if request.method == 'POST':
      f = request.files['file']
      f.save(f.filename)
      return 'file uploaded successfully'

if __name__ == '__main__':
    app.run(host="192.168.0.90",port=9000)

Am I doing something wrong?

Tags: python, file-upload, upload, fastapi, starlette

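1 Answer

Stack Overflow user

Answered 2022-01-11 13:20:20

You can write the file(s) either with synchronous writing, after defining the endpoint with def (as shown in the answer linked from the original post), or with asynchronous writing (using aiofiles), after defining the endpoint with async def. The UploadFile methods are async methods, so you need to await them. Examples are given below. For more details on def vs async def, and on how they may affect your API's performance (depending on the tasks performed inside the endpoints), see the answer linked from the original post.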

Upload a Single File

app.py

from fastapi import FastAPI, File, UploadFile
import aiofiles

app = FastAPI()

@app.post("/upload")
async def upload(file: UploadFile = File(...)):
    try:
        # Read the whole upload into memory, then write it out asynchronously
        contents = await file.read()
        async with aiofiles.open(file.filename, 'wb') as f:
            await f.write(contents)
    except Exception:
        return {"message": "There was an error uploading the file"}
    finally:
        await file.close()

    return {"message": f"Successfully uploaded {file.filename}"}

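Read the File in Chunks

Alternatively, you can read the file in chunks in an async fashion, to avoid loading the entire file into memory. If, for example, you have 8GB of RAM, you cannot load a 50GB file (not to mention that the available RAM is always less than the total amount installed, since the operating system and other applications running on your machine use some of it). In that case, you should load the file into memory in chunks and process the data one chunk at a time. This method may, however, take longer to complete, depending on the chunk size you choose; below, it is 1024 * 1024 bytes (= 1MB). You can adjust the chunk size as needed.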

from fastapi import FastAPI, File, UploadFile
import aiofiles

app = FastAPI()

@app.post("/upload")
async def upload(file: UploadFile = File(...)):
    try:
        async with aiofiles.open(file.filename, 'wb') as f:
            # Read and write 1MB at a time until read() returns no more data
            while contents := await file.read(1024 * 1024):
                await f.write(contents)
    except Exception:
        return {"message": "There was an error uploading the file"}
    finally:
        await file.close()

    return {"message": f"Successfully uploaded {file.filename}"}

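Alternatively, you can use shutil.copyfileobj(), which copies the contents of one file-like object to another file-like object (see also the answer linked from the original post). By default, the data is read in chunks, with a default buffer (chunk) size of 1MB (i.e., 1024 * 1024 bytes) on Windows and 64KB on other platforms (see the shutil source code). You can specify the buffer size by passing the optional length parameter. Note: if a negative length value is passed, the entire contents of the file are read instead; see also the f.read() documentation, which .copyfileobj() uses under the hood. Looking at the source code of .copyfileobj(), there is nothing really different from the previous approach to reading and writing the file contents. However, .copyfileobj() uses blocking I/O operations behind the scenes, which would block the entire server if used inside an async def endpoint. To avoid that, you can use Starlette's run_in_threadpool() to run all the needed functions in a separate thread (which is then awaited), so that the main thread (where coroutines run) does not get blocked. FastAPI uses the very same function internally when you call the async methods of the UploadFile object (i.e., .write(), .read(), .close(), etc.); see the source code. Example: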

from fastapi import FastAPI, File, UploadFile
from fastapi.concurrency import run_in_threadpool
import shutil

app = FastAPI()

@app.post("/upload")
async def upload(file: UploadFile = File(...)):
    f = None
    try:
        # Run the blocking open/copy calls in a worker thread
        f = await run_in_threadpool(open, file.filename, 'wb')
        await run_in_threadpool(shutil.copyfileobj, file.file, f)
    except Exception:
        return {"message": "There was an error uploading the file"}
    finally:
        if f is not None:
            await run_in_threadpool(f.close)
        await file.close()

    return {"message": f"Successfully uploaded {file.filename}"}
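
The optional length argument and the idea of pushing the blocking copy onto a worker thread can also be tried outside FastAPI. The following is only a minimal sketch using the standard library: in-memory io.BytesIO objects stand in for the uploaded file and the destination file, and asyncio.to_thread (Python 3.9+) is used as a stand-in for Starlette's run_in_threadpool(), so it is an analogue and not what FastAPI itself runs. The helper name copy_in_thread is hypothetical.

```python
import asyncio
import io
import shutil

CHUNK_SIZE = 1024 * 1024  # 1MB, the same chunk size used earlier in this answer


async def copy_in_thread(src, dst, length=CHUNK_SIZE):
    """Copy src to dst in `length`-sized chunks without blocking the event loop.

    asyncio.to_thread is a standard-library stand-in for Starlette's
    run_in_threadpool(); copy_in_thread is a hypothetical helper name.
    """
    await asyncio.to_thread(shutil.copyfileobj, src, dst, length)


# In-memory stand-ins for the uploaded file and the destination file.
payload = b"x" * (3 * CHUNK_SIZE + 123)
src = io.BytesIO(payload)
dst = io.BytesIO()

asyncio.run(copy_in_thread(src, dst))
copied = dst.getvalue()
print(len(copied) == len(payload))
```

Passing a smaller or larger length only changes how many read/write calls the copy needs; the copied bytes are the same.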

test.py

import requests

url = 'http://127.0.0.1:8000/upload'
file = {'file': open('images/1.png', 'rb')}
resp = requests.post(url=url, files=file) 
print(resp.json())

Upload Multiple Files

app.py

from typing import List

from fastapi import FastAPI, File, UploadFile
import aiofiles

app = FastAPI()

@app.post("/upload")
async def upload(files: List[UploadFile] = File(...)):
    for file in files:
        try:
            # Read each upload fully, then write it out asynchronously
            contents = await file.read()
            async with aiofiles.open(file.filename, 'wb') as f:
                await f.write(contents)
        except Exception:
            return {"message": "There was an error uploading the file(s)"}
        finally:
            await file.close()

    return {"message": f"Successfully uploaded {[file.filename for file in files]}"}

Read the Files in Chunks

To read the file(s) in chunks, see the approaches described earlier in this answer.

test.py

import requests

url = 'http://127.0.0.1:8000/upload'
files = [('files', open('images/1.png', 'rb')), ('files', open('images/2.png', 'rb'))]
resp = requests.post(url=url, files=files) 
print(resp.json())

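Update

Digging into the source code, it seems that the latest versions of Starlette (which FastAPI uses underneath) use a SpooledTemporaryFile (for the UploadFile data structure) with the max_size attribute set to 1MB (1024 * 1024 bytes). This is in contrast to older versions, where max_size was left at its default value of 0 bytes (the original answer links to one such version).

In the past, the data was fully loaded into memory regardless of the file's size (which could cause problems when a file did not fit into RAM). In the latest version, the data is spooled in memory until the file size exceeds max_size (i.e., 1MB), at which point the contents are written to disk, more specifically to the OS's temporary directory. (Note: this also means that the maximum size of a file you can upload is bound by the storage available to the system's temporary directory. If enough storage for your needs is available on your system, there is nothing to worry about; otherwise, see the answer linked from the original post on how to change the default temporary directory.) Thus, the process of writing the file multiple times (initially loading the data into RAM, then, if the data exceeds 1MB, writing the file to the temporary directory, then reading the file back from the temporary directory using file.read(), and finally writing the file to a permanent directory) is what makes uploading a file slower than with the Flask framework, as the OP noted in their question (though the difference in time is not that big, just a few seconds, depending on the size of the file).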
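
The rollover from memory to disk can be observed directly with the standard library. This is only a small sketch, independent of Starlette: it creates a SpooledTemporaryFile with the same 1MB max_size and checks the private CPython attribute _rolled, which is an implementation detail read here purely for illustration.

```python
import tempfile

MAX_SIZE = 1024 * 1024  # 1MB, the threshold described above

with tempfile.SpooledTemporaryFile(max_size=MAX_SIZE) as spooled:
    # Writing exactly max_size bytes keeps the data in memory.
    spooled.write(b"a" * MAX_SIZE)
    # _rolled is a private CPython attribute, read here only for illustration.
    in_memory_before = not spooled._rolled

    # One more byte pushes the size past max_size and triggers the rollover to disk.
    spooled.write(b"b")
    on_disk_after = spooled._rolled

    # Reading the data back now goes through the temporary file on disk.
    spooled.seek(0)
    total = len(spooled.read())

print(in_memory_before, on_disk_after, total)
```

Once the file has rolled over, every later read of the upload is a read from the temporary directory, which is the extra disk round trip described above.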

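Solution

The solution (if you need to upload files considerably larger than 1MB and the upload time matters to you) is to access the request body as a stream. As per the Starlette documentation, if you access .stream(), the byte chunks are provided without storing the entire body in memory (and later in the temporary directory, if the body contains file data exceeding 1MB). An example is given below, in which the upload time is recorded on the client side; it ends up being the same as with the Flask example given in the OP's question.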

app.py

from fastapi import FastAPI, Request
import aiofiles

app = FastAPI()

@app.post('/upload')
async def upload(request: Request):
    try:
        # The client sends the target file name in a custom 'filename' header
        filename = request.headers['filename']
        async with aiofiles.open(filename, 'wb') as f:
            # Write each chunk to disk as soon as it arrives
            async for chunk in request.stream():
                await f.write(chunk)
    except Exception:
        return {"message": "There was an error uploading the file"}

    return {"message": f"Successfully uploaded {filename}"}

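If your application does not need to save the file to disk, and all you need is to load the file directly into memory, you can simply use the following (make sure your RAM has enough space to hold the accumulated data):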

from fastapi import FastAPI, Request

app = FastAPI()

@app.post('/upload')
async def upload(request: Request):
    body = b''
    try:
        filename = request.headers['filename']
        # Accumulate the streamed chunks in memory
        async for chunk in request.stream():
            body += chunk
    except Exception:
        return {"message": "There was an error uploading the file"}

    #print(body.decode())
    return {"message": f"Successfully uploaded {filename}"}
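
For large bodies, one possible refinement of the in-memory variant is to accumulate the chunks in a bytearray, which grows in place, whereas body += chunk on a bytes object builds a new object on every iteration. The sketch below is an assumption-laden illustration rather than part of the original answer: collect_body is a hypothetical helper, and a plain list of byte strings stands in for the chunks the request stream would yield.

```python
def collect_body(chunks):
    """Join an iterable of byte chunks into one bytes object.

    collect_body is a hypothetical helper; the bytearray grows in place
    instead of creating a new bytes object for every chunk.
    """
    buffer = bytearray()
    for chunk in chunks:
        buffer.extend(chunk)
    return bytes(buffer)


# Simulated chunks standing in for the streamed request body.
body = collect_body([b"hello ", b"world", b""])
print(body)
```

Inside the endpoint, the same pattern would be buffer.extend(chunk) within the async for loop over request.stream().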

test.py

import requests
import time

with open("images/1.png", "rb") as f:
    data = f.read()
   
url = 'http://127.0.0.1:8000/upload'
headers = {'filename': '1.png'}

start = time.time()
resp = requests.post(url=url, data=data, headers=headers)
end = time.time() - start

print(f'Elapsed time is {end} seconds.', '\n')
print(resp.json())

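For more details and code examples (on uploading multiple files together with Form/JSON data), please have a look at the answer linked from the original post.

Votes: 9

Original page content provided by Stack Overflow.

Original link: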

https://stackoverflow.com/questions/65342833
