
Course Project for CSE676 - Fall 2025 University at Buffalo


protyayofficial/TGQ


TGQ - Tensor train informed Grouping for efficient low-bit Quantization

This repository contains the code for TGQ, which performs efficient low-bit quantization of diffusion transformers. It includes utilities to calibrate, quantize, test, and analyze quantized models.

Repository layout

  • quant_main.py - Main entry script for running experiments and evaluation.
  • quantization.py - Quantization algorithms and helper functions (TT-SVD, activation quantizers, helpers).
  • datautils.py - Calibration dataset loader and gradient utilities.
  • models/ - Model definitions and helpers.
  • pretrained_models/ - Suggested place for large pretrained weights.
  • cali_data/ - (ignored) Calibration data .pth files used by the calibration routines.
  • final_paper_results/ - Output/results from experiments.
  • evaluations/ - Evaluation utilities and data references (follows the ADM TensorFlow evaluation suite; see https://github.com/openai/guided-diffusion/tree/main/evaluations for environment setup).
  • utils/ - Download & logger helpers.
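
quantization.py builds on TT-SVD (tensor-train decomposition via repeated truncated SVDs). As background, here is a generic TT-SVD sketch in NumPy; it is the textbook algorithm, not necessarily the repository's exact implementation:

```python
import numpy as np

def tt_svd(tensor, max_rank):
    """Textbook TT-SVD: split an N-D array into tensor-train cores
    via repeated truncated SVDs. `max_rank` caps every TT rank."""
    dims = tensor.shape
    cores = []
    rank = 1
    mat = tensor
    for d in dims[:-1]:
        # Fold the current unfolding so rows carry (rank * d) entries.
        mat = mat.reshape(rank * d, -1)
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        r = min(max_rank, len(s))
        cores.append(u[:, :r].reshape(rank, d, r))
        # Push the remaining factor to the right and continue.
        mat = np.diag(s[:r]) @ vt[:r]
        rank = r
    cores.append(mat.reshape(rank, dims[-1], 1))
    return cores
```

Contracting the cores back together reproduces the original tensor when max_rank is large enough; truncating max_rank yields the low-rank grouping structure that can inform a quantizer.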

Quick prerequisites

  • Conda (Miniconda or Anaconda) installed and available on your PATH.
  • A GPU with a supported CUDA driver is recommended for reasonable speed; CPU-only execution also works but is significantly slower.

Create a conda environment (recommended)

Create and activate an environment with Python 3.10 (example):

conda create -n tgq python=3.10 -y
conda activate tgq
pip install -r requirements.txt

Prepare data & pretrained weights

  • Put large pretrained files (for example, DiT-XL-2-256x256.pt, safetensors, VAE weights) in pretrained_models/.
  • Put calibration data (e.g. cali_data_256.pth, cali_data_512.pth) in cali_data/.

Auto-download: If the pretrained_models/ directory or the SD-VAE folder pretrained_models/sd-vae-ft-mse (or sd-vae-ft-ema) is missing, the code will attempt to auto-download the SD VAE from Hugging Face the first time it is needed. This requires the Python package huggingface-hub to be installed (see Dependencies below). If auto-download fails, place the VAE files manually under pretrained_models/sd-vae-ft-<mse|ema>/.
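
The fallback logic above can be pictured as a small helper; the function name and the exact Hugging Face repo id (stabilityai/sd-vae-ft-mse) are assumptions for illustration, not the repository's actual code:

```python
import os

def resolve_vae_path(root="pretrained_models", variant="mse"):
    """Return the local SD-VAE folder if it exists, else None so the
    caller can fall back to an auto-download, e.g. via
    huggingface_hub.snapshot_download("stabilityai/sd-vae-ft-mse")."""
    path = os.path.join(root, f"sd-vae-ft-{variant}")
    return path if os.path.isdir(path) else None
```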

Running the code

  • The project entrypoint is quant_main.py. Use the --help flag to see available options and arguments:

Example Script:

python quant_main.py --wbits 4 --abits 8 --num-fid-samples 1000 --cfg-scale 1.5 --image-size 256
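
Here, --wbits and --abits set the weight and activation bit-widths (W4A8 above). For intuition only, symmetric uniform fake-quantization to b bits looks like the sketch below; TGQ's actual grouped, TT-informed quantizers in quantization.py are more involved:

```python
def quantize_dequantize(x, bits):
    """Fake-quantize a list of floats with a symmetric per-tensor scale."""
    qmax = 2 ** (bits - 1) - 1
    scale = max(abs(v) for v in x) / qmax or 1.0  # guard against all-zero input
    # Round to the integer grid, clamp to the signed range, then rescale.
    q = [max(-qmax - 1, min(qmax, round(v / scale))) for v in x]
    return [v * scale for v in q]
```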

Generate calibration data

Before quantization you should generate calibration samples. Run the helper script to sample from a pretrained DiT and save calibration tensors to cali_data/:

python collect_cali_data.py --image-size 256 --num-cali-data 256 --batch-size 16

This will create cali_data/cali_data_256.pth (or cali_data_512.pth for 512-size images). Use the generated file when running quantization.
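
The sampling loop splits --num-cali-data into chunks of --batch-size; a minimal sketch of that batching arithmetic (hypothetical helper, not the script's actual code):

```python
def batch_sizes(num_cali_data, batch_size):
    """Per-batch sample counts that cover num_cali_data exactly."""
    sizes = []
    remaining = num_cali_data
    while remaining > 0:
        # Final batch may be smaller than batch_size.
        sizes.append(min(batch_size, remaining))
        remaining -= sizes[-1]
    return sizes
```

With the example flags, 256 samples are drawn in 16 batches of 16 before being saved under cali_data/.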

Dependencies (additional)

  • To enable automatic SD-VAE downloads from Hugging Face, install huggingface-hub:
pip install huggingface-hub

Tips and troubleshooting

  • GPU memory: quantization and calibration can be memory-hungry. If you encounter OOMs, reduce the batch size or use fewer calibration samples.
  • CUDA/toolkit mismatch: If PyTorch cannot see the GPU, make sure the NVIDIA driver and CUDA toolkit are compatible with the PyTorch CUDA build you installed.
  • If you run into import errors for missing packages, install them via pip install <package>.
