StyleMaster

arXiv 2024

¹Hong Kong University of Science and Technology ²Kuaishou Technology

† Intern at KwaiVGI, Kuaishou Technology ✉ Corresponding Author

TL;DR: A high-quality video stylization method that enables both stylized video generation and video translation.

Video Style Transfer with Different Styles

Stylized Video Generation with Different Prompts

Stylized Video Generation with Different Style Images

Method

We first obtain patch features and an image embedding of the style image from CLIP. We then select the patches that are less similar to the text prompt as texture guidance, and use a global projection module to transform the image embedding into global style descriptions. The global projection module is trained through contrastive learning on a contrastive dataset constructed by model illusion. The style information is then injected into the model through decoupled cross-attention. A motion adapter and a gray tile ControlNet are used to enhance dynamic quality and to enable content control, respectively.
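The patch selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `keep_ratio` parameter, and the toy 2-D vectors are all hypothetical, and the page does not state whether selection uses a ratio or a threshold. In practice the patch features and the text embedding would come from CLIP.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_style_patches(patch_feats, text_emb, keep_ratio=0.5):
    """Return indices of the patches least similar to the text prompt.

    keep_ratio is an assumed hyperparameter (fraction of patches kept);
    the source does not specify the actual selection rule.
    """
    sims = [cosine(p, text_emb) for p in patch_feats]
    n_keep = max(1, int(len(patch_feats) * keep_ratio))
    # Sort patch indices by ascending similarity and keep the lowest ones.
    order = sorted(range(len(sims)), key=lambda i: sims[i])
    return sorted(order[:n_keep])


# Toy stand-ins for CLIP embeddings (hypothetical values).
text_emb = [1.0, 0.0]
patch_feats = [[1.0, 0.1], [0.0, 1.0], [0.9, 0.2], [-0.2, 1.0]]
kept = select_style_patches(patch_feats, text_emb, keep_ratio=0.5)
```

Here patches 0 and 2 point in nearly the same direction as the text embedding, so they are treated as content-related and dropped, while patches 1 and 3 are kept as texture guidance.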

Stylize Your Video with Artistic Generation and Translation

Comparison with Stylized Video Generation Methods

Image Style Transfer Result
