Open-Source AIGC Learning: Running a Text-to-Video Model Locally

Original | Author: 平常心 | Last modified: 2024-03-15 16:59:39

Published in the column: 个人总结系列

1. Model Download

For the general download procedure, see the earlier article in this series: Open-Source AIGC Learning: Running a Text-to-Image Model Locally.

Code language: shell

1. Model address
See Hugging Face: https://huggingface.co/cerspense/zeroscope_v2_576w

2. Download and copy the model
pipe = DiffusionPipeline.from_pretrained("cerspense/zeroscope_v2_576w", torch_dtype=torch.float16)
Like snapshot_download, this downloads the model into the current user's .cache directory.

# cp -r .cache/huggingface/hub/models--cerspense--zeroscope_v2_576w /mnt/d/aigc_model/hub/
On a personal PC the destination can be a local disk path; in a real deployment, copy the auto-downloaded model to a mounted (distributed NAS) path. Loading the model from the NAS path in Python differs slightly from loading it via the default cache path, as the next section shows.
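
Alternatively (a minimal sketch, assuming the huggingface_hub package is installed), the snapshot can be downloaded straight into a target directory, skipping the manual copy; the destination path below is illustrative:

Code language: python

from huggingface_hub import snapshot_download

# Download the repository snapshot straight into a chosen directory
# (placeholder path; point it at your local disk or NAS mount instead).
local_path = snapshot_download(
    repo_id="cerspense/zeroscope_v2_576w",
    local_dir="/mnt/d/aigc_model/zeroscope_v2_576w",
)
print(local_path)  # this path can be passed to DiffusionPipeline.from_pretrained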

2. Python Code Development

Code language: python
import torch
from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler
from diffusers.utils import export_to_video

# Load zeroscope_v2_576w from the local snapshot copied earlier (fp16 weights)
pipe = DiffusionPipeline.from_pretrained("/mnt/d/aigc_model/hub/models--cerspense--zeroscope_v2_576w/snapshots/6963642a64dbefa93663d1ecebb4ceda2d9ecb28", torch_dtype=torch.float16)
# Swap in the multistep DPM-Solver scheduler for faster sampling
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
# Offload model components to the CPU between forward passes to reduce VRAM usage
pipe.enable_model_cpu_offload()

prompt = "Darth Vader is surfing on waves"
# Generate 24 frames at 576x320 with 10 denoising steps
video_frames = pipe(prompt, num_inference_steps=10, height=320, width=576, num_frames=24).frames
# Write the frames to a temporary .mp4 file and return its path
video_path = export_to_video(video_frames)

Running this raised an exception: the GPU ran out of memory (OOM).

Code language: text
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 2.11 GiB. GPU 0 has a total capacity of 8.00 GiB of which 147.00 MiB is free. 
Including non-PyTorch memory, this process has 17179869184.00 GiB memory in use. Of the allocated memory 5.04 GiB is allocated by PyTorch, and 631.85 MiB is reserved by PyTorch but unallocated. 
If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  
See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)

The corresponding fix is to set PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True, which lets the CUDA caching allocator grow existing memory segments instead of fragmenting into new ones.

Code language: python
>>> import torch
>>> torch.cuda.device_count()
1
>>> torch.cuda.get_device_name(0)
'NVIDIA GeForce RTX 4060 Laptop GPU'

import os
# max_split_size_mb is an alternative allocator option (left here commented out)
# os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:4000"
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

Set via os.environ like this, the option applies only to the current Python process; with it in place, re-running the generation code completes normally.
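
A general PyTorch caveat, independent of this model: the allocator reads PYTORCH_CUDA_ALLOC_CONF when the first CUDA allocation happens, so the variable must be set before anything touches the GPU. A minimal sketch of placing it at the very top of a script:

Code language: python

import os

# Must be set before the first CUDA allocation; the top of the script is safest.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

import torch  # importing torch afterwards removes any ordering doubt

print(torch.cuda.is_available())

Exporting the variable in the shell before launching Python achieves the same effect.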

Code language: python
prompt = "A beautiful woman running on the beach"
video_frames = pipe(prompt, num_inference_steps=10, height=320, width=576, num_frames=24).frames
video_path = export_to_video(video_frames)
# 查看当前生成的视频路径
print(video_path) 
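
To check whether the allocator setting actually leaves headroom on an 8 GiB card, peak VRAM can be read after generation. A small sketch using the standard torch.cuda counters:

Code language: python

import torch

# Peak memory actually allocated to tensors during the run
print(f"peak allocated: {torch.cuda.max_memory_allocated() / 1024**3:.2f} GiB")
# Peak memory reserved by the caching allocator (includes cached, unused blocks)
print(f"peak reserved:  {torch.cuda.max_memory_reserved() / 1024**3:.2f} GiB")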

The results are underwhelming: the generated video is noticeably lacking in naturalness and frame-to-frame coherence. Next, try a different text-to-video model, ali-vilab/text-to-video-ms-1.7b.

Code language: python
import torch
from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler
from diffusers.utils import export_to_video

# Load text-to-video-ms-1.7b from the local snapshot (fp16 variant)
pipe = DiffusionPipeline.from_pretrained("/mnt/d/aigc_model/hub/models--damo-vilab--text-to-video-ms-1.7b/snapshots/8227dddca75a8561bf858d604cc5dae52b954d01", torch_dtype=torch.float16, variant="fp16")
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
# Keep the whole model on the GPU this time
pipe.to("cuda")
# pipe.enable_model_cpu_offload()

prompt = "A beautiful woman running on the beach"
video_frames = pipe(prompt, num_inference_steps=25).frames
video_path = export_to_video(video_frames)
print(video_path)

# Save to a specified location instead of a temporary file
video_path = export_to_video(video_frames, "/mnt/d/result.mp4")

The results are somewhat better than those of the earlier zeroscope_v2_576w model.
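
When comparing the two models on the same prompt, fixing the random seed makes runs repeatable (a standard diffusers pattern; the seed value here is arbitrary):

Code language: python

import torch

# A fixed generator makes each run deterministic, so model comparisons are fairer
generator = torch.Generator(device="cuda").manual_seed(42)
video_frames = pipe(prompt, num_inference_steps=25, generator=generator).frames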

Original statement: this article was published on the Tencent Cloud Developer Community with the author's authorization and may not be reproduced without permission.

If there is any infringement, please contact cloudcommunity@tencent.com for removal.
