banner
hughie

hughie

热爱技术的小菜鸟, 记录一下所学所感

StableVideo - Text-driven Consistent Perception Diffusion Video Editing

Introduction#

This article briefly introduces StableVideo.

StableVideo is a text-driven consistency-aware diffusion video editing tool.

31-stablevideo-overview


Main Content#

1. What is StableVideo#

It can modify objects in a video while maintaining their appearance consistency over time.

2. StableVideo Features#

Introduce temporal dependency into the existing text-driven diffusion model to solve this problem, allowing the model to generate temporally consistent appearance for edited objects.

3. Using StableVideo#

  • Download and install project dependencies
git clone https://github.com/rese1f/StableVideo.git
conda create -n stablevideo python=3.11
pip install -r requirements.txt
(optional) pip install xformers 

The project also provides a CPU-only version.

4. Conclusion#

Modifying videos directly using SD results in some differences in each frame and inconsistency in time. This project mitigates the side effects of temporal inconsistency by adding a temporal dependency module, making it worth a try.


Finally#

References:

Official Project


Disclaimer#

This article is for personal learning purposes only.
This article is synchronized with HBlog.

Loading...
Ownership of this post data is guaranteed by blockchain and smart contracts to the creator alone.