FlashMotion: Few-Step Controllable Video Generation with Trajectory Guidance
Quanhao Li¹, Zhen Xing¹, Rui Wang¹, Haidong Cao¹, Qi Dai², Daoguo Dong¹ and Zuxuan Wu¹
¹ Fudan University; ² Microsoft Research Asia
Recent advances in trajectory-controllable video generation have achieved remarkable progress. Previous methods mainly use adapter-based architectures for precise motion control along predefined trajectories. However, all of these methods rely on a multi-step denoising process, leading to substantial time redundancy and computational overhead. While existing video distillation methods successfully distill multi-step generators into few-step ones, directly applying these approaches to trajectory-controllable video generation results in noticeable degradation in both video quality and trajectory accuracy. To bridge this gap, we introduce FlashMotion, a novel training framework designed for few-step trajectory-controllable video generation. We first train a trajectory adapter on a multi-step video generator for precise trajectory control. Then, we distill the generator into a few-step version to accelerate video generation. Finally, we finetune the adapter using a hybrid strategy that combines diffusion and adversarial objectives, aligning it with the few-step generator to produce high-quality, trajectory-accurate videos. For evaluation, we introduce FlashBench, a benchmark for long-sequence trajectory-controllable video generation that measures both video quality and trajectory accuracy across varying numbers of foreground objects. Experiments on two adapter architectures show that FlashMotion surpasses existing video distillation methods and previous multi-step models in both visual quality and trajectory consistency.
- 2026/03/13 🔥🔥 We released FlashMotion, including its training code, inference code, model weights, and the evaluation benchmark.
- 2026/02 🔥🔥🔥 FlashMotion has been accepted by CVPR 2026!
- 💡 Abstract
- 📣 Updates
- 📑 Table of Contents
- ✅ TODO List
- 🐍 Installation
- 📦 Model Weights
- ⛽️ Dataset Preparation
- 🔄 Inference
- 🏎️ Train
- 🤝 Acknowledgements
- 📚 Contact
- Release our inference code and model weights
- Release our training code
- Release our evaluation benchmark
# Clone this repository.
git clone https://github.com/quanhaol/FlashMotion
cd FlashMotion
# Install requirements
conda create -n flashmotion python=3.10 -y
conda activate flashmotion
pip install -r requirements.txt
pip install flash-attn --no-build-isolation
python setup.py develop

After downloading, the model weights should be organized as follows:
FlashMotion
└── ckpts
    ├── FastGenerator
    │   └── model.pt
    ├── SlowAdapter
    │   ├── ResNet
    │   │   └── model.pt
    │   └── ControlNet
    │       └── model.pt
    └── FastAdapter
        ├── ResNet
        │   └── model.pt
        └── ControlNet
            └── model.pt
Please use the following commands to download the model weights:
pip install "huggingface_hub[hf_transfer]"
HF_HUB_ENABLE_HF_TRANSFER=1 huggingface-cli download quanhaol/FlashMotion --local-dir ckpts
HF_HUB_ENABLE_HF_TRANSFER=1 huggingface-cli download Wan-AI/Wan2.2-TI2V-5B --local-dir wan_models/Wan2.2-TI2V-5B
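After both downloads finish, you can quickly confirm that the weights match the layout above. This is a minimal optional check, using only the paths from the tree and commands shown here:

# Optional sanity check: each adapter/generator directory should contain a model.pt
find ckpts -name model.pt
ls wan_models/Wan2.2-TI2V-5B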
All three training stages of FlashMotion use MagicData, an open-source dataset built for trajectory-controllable video generation. Please follow this README to download and extract the data to a proper path on your machine.

The dataset structure can be organized as follows:
MagicData
├── videos
│ ├── videoid_1.mp4
│ ├── videoid_2.mp4
│ ├── ...
├── masks
│ ├── videoid_1
│ │ ├── annotated_frame_00000.png
│ │ ├── annotated_frame_00001.png
│ │ ├── ...
│ ├── videoid_2
│ │ ├── ...
├── boxs
│ ├── videoid_1
│ │ ├── annotated_frame_00000.png
│ │ ├── annotated_frame_00001.png
│ │ ├── ...
│ ├── videoid_2
│ │ ├── ...
├── MagicData.csv # detailed information of each video
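Since each video must be paired with per-frame mask and box annotations, a small shell check can catch missing directories early. This is a minimal sketch assuming the layout above; DATA_ROOT is a placeholder you should point at your extracted dataset:

# Verify every videos/<id>.mp4 has matching masks/<id>/ and boxs/<id>/ directories
DATA_ROOT=MagicData
for video in "$DATA_ROOT"/videos/*.mp4; do
  vid=$(basename "$video" .mp4)
  [ -d "$DATA_ROOT/masks/$vid" ] || echo "missing masks for $vid"
  [ -d "$DATA_ROOT/boxs/$vid" ] || echo "missing boxs for $vid"
done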
Inference requires around 42 GiB of GPU memory with the ResNet FastAdapter and around 50 GiB with the ControlNet FastAdapter, both measured on a single NVIDIA A100 GPU.
⚡️⚡️⚡️ Denoising a video takes only 11 seconds with the ResNet adapter and around 24 seconds with the ControlNet adapter.
We provide demo scripts for both types of trajectory adapter.
# Demo inference script of each adapter type
bash running_scripts/inference/i2v_control_fewstep_controlnet.sh
bash running_scripts/inference/i2v_control_fewstep_resnet.sh

We also provide a sample input image and trajectory maps in ./assets.
Feel free to replace --prompt, --image, and --trajectory with your own input prompt, input image, and trajectory maps.
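For example, a customized run might look like the sketch below. The flag names come from the note above, but the prompt and paths are placeholders; depending on how the demo script forwards its arguments, you may need to edit these values inside the script instead of passing them on the command line:

# Hypothetical invocation with custom inputs (prompt and paths are placeholders)
bash running_scripts/inference/i2v_control_fewstep_resnet.sh \
  --prompt "a red car driving along a coastal road" \
  --image assets/my_image.png \
  --trajectory assets/my_trajectory_maps/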
Note: If you want to build your own trajectory maps, please refer to the box trajectory construction pipeline introduced in MagicMotion.
We provide scripts for all three training stages of FlashMotion: training the SlowAdapter, the FastGenerator, and the FastAdapter.
In this stage, we first train the SlowAdapter using the mask annotations in MagicData, and then finetune it using bounding boxes as the trajectory-map conditions.
# Demo training script of SlowAdapter
bash running_scripts/train/stage1_mask.sh
bash running_scripts/train/stage1_box.sh

In this stage, we distill the Wan2.2-TI2V-5B model into a 4-step image-to-video generation model, named the FastGenerator.
# Demo training script of FastGenerator
bash running_scripts/train/stage2.sh

In this stage, we train the FastAdapter to fit the FastGenerator and enable few-step trajectory-controllable video generation.
# Demo training script of FastAdapter
bash running_scripts/train/stage3.sh

We would like to express our gratitude to the following open-source projects that have been instrumental in the development of our project:
- Wan: An open-source base video generation model.
- Self-Forcing and CausVid: Two pioneering frameworks for distilling video generation models.
- MagicMotion: An open-source trajectory-controllable video generation framework.
- Wan2.2-TI2V-5B-Turbo: An open-source step-distillation image-to-video generation framework that distills the Wan2.2-TI2V-5B model into 4 steps.
Special thanks to the contributors of these libraries for their hard work and dedication!
If you have any suggestions or find our work helpful, feel free to contact us:
Email: liqh24@m.fudan.edu.cn
If you find our work useful, please consider starring this GitHub repository and citing our paper:
@misc{li2026flashmotionfewstepcontrollablevideo,
title={FlashMotion: Few-Step Controllable Video Generation with Trajectory Guidance},
author={Quanhao Li and Zhen Xing and Rui Wang and Haidong Cao and Qi Dai and Daoguo Dong and Zuxuan Wu},
year={2026},
eprint={2603.12146},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2603.12146},
}