This repository contains the code for TGQ, which performs efficient low-bit quantization of diffusion transformers. It includes utilities to calibrate, quantize, test, and analyze quantized models.
- `quant_main.py` - Main entry script for running experiments and evaluation.
- `quantization.py` - Quantization algorithms and helper functions (TT-SVD, activation quantizers, helpers).
- `datautils.py` - Calibration dataset loader and gradient utilities.
- `models/` - Model definitions and helpers.
- `pretrained_models/` - Suggested place for large pretrained weights.
- `cali_data/` - (ignored) Calibration data `.pth` files used by the calibration routines.
- `final_paper_results/` - Output/results from experiments.
- `evaluations/` - Evaluation utilities and data references (follows the ADM TensorFlow evaluation suite; see https://github.com/openai/guided-diffusion/tree/main/evaluations for environment setup).
- `utils/` - Download & logger helpers.
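For orientation, the low-bit weight/activation quantization driven by the `--wbits`/`--abits` flags maps floating-point tensors to a small number of integer levels. The sketch below shows plain symmetric uniform fake-quantization; it is illustrative only and not the exact TGQ/TT-SVD algorithm implemented in `quantization.py`:

```python
import torch

def uniform_quantize(x: torch.Tensor, n_bits: int) -> torch.Tensor:
    """Symmetric uniform fake-quantization: round to n_bits signed levels, then dequantize."""
    qmax = 2 ** (n_bits - 1) - 1                  # e.g. 7 for 4-bit signed
    scale = x.abs().max().clamp(min=1e-8) / qmax  # per-tensor scale
    q = torch.round(x / scale).clamp(-qmax - 1, qmax)
    return q * scale                              # dequantized ("fake-quant") tensor

w = torch.randn(256, 256)
w4 = uniform_quantize(w, n_bits=4)                # roughly what --wbits 4 requests
print("mean abs error:", (w - w4).abs().mean().item())
```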
- Conda (Miniconda or Anaconda) installed and available on your PATH.
- A GPU with a supported CUDA driver is recommended for reasonable speed; a CPU-only setup also works but is significantly slower.
Create and activate an environment with Python 3.10 (example):

```bash
conda create -n tgq python=3.10 -y
conda activate tgq
```
Install the Python dependencies:

```bash
pip install -r requirements.txt
```

- Put large pretrained files (for example, `DiT-XL-2-256x256.pt`, safetensors, VAE weights) in `pretrained_models/`.
- Put calibration data (e.g. `cali_data_256.pth`, `cali_data_512.pth`) in `cali_data/`.
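With the default names used in this README, the resulting layout looks roughly like this (illustrative; only the files you actually use need to exist):

```
pretrained_models/
├── DiT-XL-2-256x256.pt
└── sd-vae-ft-mse/        # or sd-vae-ft-ema/
cali_data/
├── cali_data_256.pth
└── cali_data_512.pth
```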
Auto-download: if the `pretrained_models/` directory or the SD-VAE folder `pretrained_models/sd-vae-ft-mse` (or `sd-vae-ft-ema`) is missing, the code will attempt to auto-download the SD-VAE from Hugging Face the first time it is needed. This requires the `huggingface-hub` Python package (see Dependencies below). If the auto-download fails, place the VAE files manually under `pretrained_models/sd-vae-ft-<mse|ema>/`.
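If the automatic download fails, you can fetch the weights manually with `huggingface_hub`. A minimal sketch, assuming the standard `stabilityai/sd-vae-ft-mse` Hugging Face repository holds the VAE you need:

```python
from huggingface_hub import snapshot_download

# Assumption: the SD-VAE weights live in the standard Hugging Face repo
# "stabilityai/sd-vae-ft-mse" (use "stabilityai/sd-vae-ft-ema" for the EMA variant).
snapshot_download(
    repo_id="stabilityai/sd-vae-ft-mse",
    local_dir="pretrained_models/sd-vae-ft-mse",
)
```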
- The project entrypoint is `quant_main.py`. Use the `--help` flag to see the available options and arguments:
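```bash
python quant_main.py --help
```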
Example script:

```bash
python quant_main.py --wbits 4 --abits 8 --num-fid-samples 1000 --cfg-scale 1.5 --image-size 256
```

## Generate calibration data
Before quantization you should generate calibration samples. Run the helper script to sample from a pretrained DiT and save calibration tensors to `cali_data/`:

```bash
python collect_cali_data.py --image-size 256 --num-cali-data 256 --batch-size 16
```

This will create `cali_data/cali_data_256.pth` (or `cali_data_512.pth` for 512-size images). Use the generated file when running quantization.
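To sanity-check the output, you can load the file and inspect it. A generic sketch; the exact structure of the saved object is defined by `collect_cali_data.py`:

```python
import torch

# Load the saved calibration object and report shapes/types of its contents.
cali = torch.load("cali_data/cali_data_256.pth", map_location="cpu")
if isinstance(cali, dict):
    for key, val in cali.items():
        print(key, getattr(val, "shape", type(val)))
else:
    print(type(cali), getattr(cali, "shape", None))
```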
## Dependencies (additional)
- To enable automatic SD-VAE downloads from Hugging Face, install `huggingface-hub`:

```bash
pip install huggingface-hub
```

- GPU memory: quantization and calibration can be memory-hungry. If you encounter OOMs, reduce `batch_size` or use fewer calibration samples.
- CUDA/toolkit mismatch: if PyTorch cannot see the GPU, make sure the NVIDIA driver and CUDA toolkit are compatible with the PyTorch CUDA build you installed (see the diagnostic sketch after this list).
- If you run into import errors for missing packages, install them via `pip install <package>`.
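A quick way to check whether your PyTorch build can see the GPU:

```python
import torch

print("torch version:", torch.__version__)
print("built with CUDA:", torch.version.cuda)        # None on CPU-only builds
print("GPU visible:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))
```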