Dynamic Attention-Guided Video Generation from Text with Multi-Scale Synthesis and LoRA Optimization
Abstract: This paper introduces DAG-VSNet, a new architecture for high-quality text-to-video generation. The model uses a Dynamic Text Encoder (DTE), Multi-Scale Video Generator (MS-VG), Dynamic ...