DicFace: Dirichlet-Constrained Variational Codebook Learning for Temporally Coherent Video Face Restoration
- 2025/06/17: Paper submitted to arXiv.
- 2025/06/16: Released inference scripts.
| Status | Milestone | ETA |
|---|---|---|
| Done | Inference code release | 2025-6-16 |
| Done | Model weight release (baidu-link) | 2025-6-16 |
| Done | Paper submitted to arXiv | 2025-6-17 |
| Planned | Test data release | 2025-6-20 |
| Planned | Training code release | 2025-6-22 |
- System requirements: PyTorch >= 2.4.1, Python == 3.10
- Tested on: A800 GPUs, Python 3.10, PyTorch 2.4.1, CUDA 12.1
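The requirements above can be checked programmatically before running inference. The following is a minimal sketch and not part of the DicFace repository: the helper names `parse_version` and `check_environment` are hypothetical, and the parser only handles plain numeric versions with an optional local suffix such as `+cu121`.

```python
import re


def parse_version(text):
    """Turn a version string such as '2.4.1+cu121' into a tuple like (2, 4, 1)."""
    # Drop a local build suffix ("+cu121") and keep the first three numeric parts.
    numeric = text.split("+", 1)[0]
    return tuple(int(part) for part in re.findall(r"\d+", numeric)[:3])


def check_environment(python_version, torch_version):
    """Return a list of problems; an empty list means the stated requirements are met."""
    problems = []
    # Stated requirement: python == 3.10 (major.minor).
    if tuple(python_version[:2]) != (3, 10):
        problems.append("Python 3.10 is required")
    # Stated requirement: PyTorch >= 2.4.1.
    if torch_version is None:
        problems.append("PyTorch is not installed")
    elif parse_version(torch_version) < (2, 4, 1):
        problems.append("PyTorch >= 2.4.1 is required")
    return problems
```

In an installed environment, this could be called as `check_environment(sys.version_info, torch.__version__)` after importing `sys` and `torch`.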
Download the code:
git clone https://github.com/fudan-generative-vision/DicFace
cd DicFace

Create a conda environment:
conda create -n DicFace python=3.10
conda activate DicFace

Install PyTorch:
conda install pytorch==2.4.1 torchvision==0.19.1 torchaudio==2.4.1 pytorch-cuda=12.1 -c pytorch -c nvidia

Install packages with pip:
pip install -r requirements.txt
python basicsr/setup.py develop
conda install -c conda-forge dlib

The pre-trained weights have been uploaded to Baidu Netdisk. Please download them from the link.
File Structure of Pretrained Models

The downloaded .ckpts directory contains the following pre-trained models:
.ckpts
|-- CodeFormer  # CodeFormer-related models
|   |-- bfr_100k.pth  # Blind Face Restoration model
|   |-- color_100k.pth  # Color Restoration model
|   `-- inpainting_100k.pth  # Image Inpainting model
|-- dlib  # dlib face-related models
|   |-- mmod_human_face_detector.dat  # Human face detector
|   `-- shape_predictor_5_face_landmarks.dat  # 5-point face landmark predictor
|-- facelib  # Face processing library models
|   |-- detection_Resnet50_Final.pth  # ResNet50 face detector
|   |-- detection_mobilenet0.25_Final.pth  # MobileNet0.25 face detector
|   |-- parsing_parsenet.pth  # Face parsing model
|   |-- yolov5l-face.pth  # YOLOv5l face detection model
|   `-- yolov5n-face.pth  # YOLOv5n face detection model
|-- realesrgan  # Real-ESRGAN super-resolution model
|   `-- RealESRGAN_x2plus.pth  # 2x super-resolution enhancement model
`-- vgg  # VGG feature extraction model
    `-- vgg.pth  # VGG network pre-trained weights
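A quick way to confirm that a download is complete is to compare the directory against the listing above. This is a hedged sketch rather than a script shipped with the repository: `missing_checkpoints` is a hypothetical helper, the default root `.ckpts` is taken from the listing, and `vgg.pth` is assumed to sit inside the `vgg` directory.

```python
from pathlib import Path

# Relative paths copied from the directory listing above.
EXPECTED_FILES = [
    "CodeFormer/bfr_100k.pth",
    "CodeFormer/color_100k.pth",
    "CodeFormer/inpainting_100k.pth",
    "dlib/mmod_human_face_detector.dat",
    "dlib/shape_predictor_5_face_landmarks.dat",
    "facelib/detection_Resnet50_Final.pth",
    "facelib/detection_mobilenet0.25_Final.pth",
    "facelib/parsing_parsenet.pth",
    "facelib/yolov5l-face.pth",
    "facelib/yolov5n-face.pth",
    "realesrgan/RealESRGAN_x2plus.pth",
    "vgg/vgg.pth",  # assumed location: nested under vgg/
]


def missing_checkpoints(root=".ckpts"):
    """Return the expected files that are absent under `root`, in listing order."""
    base = Path(root)
    return [rel for rel in EXPECTED_FILES if not (base / rel).is_file()]
```

Calling `missing_checkpoints()` from the repository root returns an empty list when every file in the listing is present.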
python scripts/inference.py \
-i /path/to/video \
-o /path/to/output_folder \
--max_length 10 \
--save_video_fps 24 \
--ckpt_path /bfr/bfr_weight.pth \
--bg_upsampler realesrgan \
--save_video
# or, if your videos have already been aligned
python scripts/inference.py \
-i /path/to/video \
-o /path/to/output_folder \
--max_length 10 \
--save_video_fps 24 \
--ckpt_path /bfr/bfr_weight.pth \
--save_video \
--has_aligned

The colorization and inpainting tasks currently support only aligned face inputs. Non-aligned faces may lead to unsatisfactory results.
# for colorization task
python scripts/inference_color_and_inpainting.py \
-i /path/to/video_warped \
-o /path/to/output_folder \
--max_length 10 \
--save_video_fps 24 \
--ckpt_path /colorization/colorization_weight.pth \
--bg_upsampler realesrgan \
--save_video \
--has_aligned
# for inpainting task
python scripts/inference_color_and_inpainting.py \
-i /path/to/video_warped \
-o /path/to/output_folder \
--max_length 10 \
--save_video_fps 24 \
--ckpt_path /inpainting/inpainting_weight.pth \
--bg_upsampler realesrgan \
--save_video \
--has_aligned

Our test data link: https://pan.baidu.com/s/1zMp3fnf6LvlRT9CAoL1OUw?pwd=drhh
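The inference commands above differ only in the script name, the checkpoint path, and two optional flags, so they can be assembled from one function when processing many videos. The sketch below is an illustration under that assumption: `build_inference_command` is a hypothetical helper, it uses only the flags that appear in the commands above, and the paths are the same placeholders.

```python
import shlex


def build_inference_command(input_path, output_dir, ckpt_path, *,
                            script="scripts/inference.py", max_length=10,
                            fps=24, bg_upsampler=None, has_aligned=False):
    """Assemble the argument list for one of the inference commands shown above."""
    cmd = [
        "python", script,
        "-i", str(input_path),
        "-o", str(output_dir),
        "--max_length", str(max_length),
        "--save_video_fps", str(fps),
        "--ckpt_path", str(ckpt_path),
    ]
    # Optional background upsampler, e.g. "realesrgan".
    if bg_upsampler:
        cmd += ["--bg_upsampler", bg_upsampler]
    cmd.append("--save_video")
    # Pass --has_aligned only when the input faces are already aligned.
    if has_aligned:
        cmd.append("--has_aligned")
    return cmd


# Example: the colorization command, with the same placeholder paths.
color_cmd = build_inference_command(
    "/path/to/video_warped", "/path/to/output_folder",
    "/colorization/colorization_weight.pth",
    script="scripts/inference_color_and_inpainting.py",
    bg_upsampler="realesrgan", has_aligned=True,
)
print(shlex.join(color_cmd))
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)`, which avoids shell quoting issues with paths that contain spaces.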
If you find our work useful for your research, please consider citing the paper:
@misc{chen2025dicfacedirichletconstrainedvariationalcodebook,
title={DicFace: Dirichlet-Constrained Variational Codebook Learning for Temporally Coherent Video Face Restoration},
author={Yan Chen and Hanlin Shang and Ce Liu and Yuxuan Chen and Hui Li and Weihao Yuan and Hao Zhu and Zilong Dong and Siyu Zhu},
year={2025},
eprint={2506.13355},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2506.13355},
}