This repository contains the code for TGQ, which performs efficient low-bit quantization of diffusion transformers. It includes utilities to calibrate, quantize, test, and analyze quantized models.
- `quant_main.py` - Main entry script for running experiments and evaluation.
- `quantization.py` - Quantization algorithms and helper functions (TT-SVD, activation quantizers, helpers).
- `datautils.py` - Calibration dataset loader and gradient utilities.
- `models/` - Model definitions and helpers.
- `pretrained_models/` - Suggested place for large pretrained weights.
- `cali_data/` - (ignored) Calibration data `.pth` files used by the calibration routines.
- `final_paper_results/` - Output/results from experiments.
- `evaluations/` - Evaluation utilities and data references (follows the ADM TensorFlow evaluation suite; see https://github.com/openai/guided-diffusion/tree/main/evaluations for environment setup).
- `utils/` - Download & logger helpers.
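For orientation, the low-bit weight/activation quantization driven by the `--wbits`/`--abits` flags maps floating-point tensors to a small number of integer levels. The sketch below shows plain symmetric uniform fake-quantization; it is illustrative only and not the exact TGQ/TT-SVD algorithm implemented in `quantization.py`:

```python
import torch

def uniform_quantize(x: torch.Tensor, n_bits: int) -> torch.Tensor:
    """Symmetric uniform fake-quantization: round to n_bits signed levels, then dequantize."""
    qmax = 2 ** (n_bits - 1) - 1                  # e.g. 7 for 4-bit signed
    scale = x.abs().max().clamp(min=1e-8) / qmax  # per-tensor scale
    q = torch.round(x / scale).clamp(-qmax - 1, qmax)
    return q * scale                              # dequantized ("fake-quant") tensor

w = torch.randn(256, 256)
w4 = uniform_quantize(w, n_bits=4)                # roughly what --wbits 4 requests
print("mean abs error:", (w - w4).abs().mean().item())
```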
- Conda (Miniconda or Anaconda) installed and available on your PATH.
- A GPU with a supported CUDA driver is recommended for reasonable speed; a CPU-only setup also works but is significantly slower.
Create and activate an environment with Python 3.10 (example):

```bash
conda create -n tgq python=3.10 -y
conda activate tgq
```
Install the Python dependencies:

```bash
pip install -r requirements.txt
```

- Put large pretrained files (for example, `DiT-XL-2-256x256.pt`, safetensors, VAE weights) in `pretrained_models/`.
- Put calibration data (e.g. `cali_data_256.pth`, `cali_data_512.pth`) in `cali_data/`.
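With the default names used in this README, the resulting layout looks roughly like this (illustrative; only the files you actually use need to exist):

```
pretrained_models/
├── DiT-XL-2-256x256.pt
└── sd-vae-ft-mse/        # or sd-vae-ft-ema/
cali_data/
├── cali_data_256.pth
└── cali_data_512.pth
```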
Auto-download: if the `pretrained_models/` directory or the SD-VAE folder `pretrained_models/sd-vae-ft-mse` (or `sd-vae-ft-ema`) is missing, the code will attempt to auto-download the SD-VAE from Hugging Face the first time it is needed. This requires the `huggingface-hub` Python package (see Dependencies below). If the auto-download fails, place the VAE files manually under `pretrained_models/sd-vae-ft-<mse|ema>/`.
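If the automatic download fails, you can fetch the weights manually with `huggingface_hub`. A minimal sketch, assuming the standard `stabilityai/sd-vae-ft-mse` Hugging Face repository holds the VAE you need:

```python
from huggingface_hub import snapshot_download

# Assumption: the SD-VAE weights live in the standard Hugging Face repo
# "stabilityai/sd-vae-ft-mse" (use "stabilityai/sd-vae-ft-ema" for the EMA variant).
snapshot_download(
    repo_id="stabilityai/sd-vae-ft-mse",
    local_dir="pretrained_models/sd-vae-ft-mse",
)
```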
- The project entrypoint is `quant_main.py`. Use the `--help` flag to see the available options and arguments:
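```bash
python quant_main.py --help
```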
Example script:

```bash
python quant_main.py --wbits 4 --abits 8 --num-fid-samples 1000 --cfg-scale 1.5 --image-size 256
```

## Generate calibration data
Before quantization you should generate calibration samples. Run the helper script to sample from a pretrained DiT and save calibration tensors to `cali_data/`:

```bash
python collect_cali_data.py --image-size 256 --num-cali-data 256 --batch-size 16
```

This will create `cali_data/cali_data_256.pth` (or `cali_data_512.pth` for 512-size images). Use the generated file when running quantization.
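To sanity-check the output, you can load the file and inspect it. A generic sketch; the exact structure of the saved object is defined by `collect_cali_data.py`:

```python
import torch

# Load the saved calibration object and report shapes/types of its contents.
cali = torch.load("cali_data/cali_data_256.pth", map_location="cpu")
if isinstance(cali, dict):
    for key, val in cali.items():
        print(key, getattr(val, "shape", type(val)))
else:
    print(type(cali), getattr(cali, "shape", None))
```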
## Dependencies (additional)
- To enable automatic SD-VAE downloads from Hugging Face, install `huggingface-hub`:

```bash
pip install huggingface-hub
```

- GPU memory: quantization and calibration can be memory-hungry. If you encounter OOMs, reduce `batch_size` or use fewer calibration samples.
- CUDA/toolkit mismatch: if PyTorch cannot see the GPU, make sure the NVIDIA driver and CUDA toolkit are compatible with the PyTorch CUDA build you installed (see the diagnostic sketch after this list).
- If you run into import errors for missing packages, install them via `pip install <package>`.
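A quick way to check whether your PyTorch build can see the GPU:

```python
import torch

print("torch version:", torch.__version__)
print("built with CUDA:", torch.version.cuda)        # None on CPU-only builds
print("GPU visible:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))
```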