Official code repository for the paper Continuous Diffusion Model for Language Modeling (NeurIPS 2025)
We provide an implementation of the Riemannian Diffusion Language Model (RDLM) for language modeling tasks.
Create an environment with Python 3.9 and PyTorch 2.3.1, then install the requirements with the following command:

```sh
pip install -r requirements.txt
```
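For reference, a minimal setup with conda might look like the sketch below; the environment name `rdlm` and the plain `pip` install of PyTorch are assumptions, so adapt them to your system (e.g., pick the wheel matching your CUDA toolkit):

```sh
# Hypothetical environment setup; adjust to your hardware and CUDA version.
conda create -n rdlm python=3.9 -y
conda activate rdlm
pip install torch==2.3.1
pip install -r requirements.txt
```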
The configurations are provided in the `configs/` directory in YAML format.

- To use a new dataset, refer to `configs/exp`.
- To use a new model architecture, refer to `configs/model`.
- To use a new type of generative process, refer to `configs/sde`.
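These config groups are selected on the command line in the training commands below. For instance, the following sketch (reusing only options that appear elsewhere in this README) picks one option from each group and overrides a single field:

```sh
# Select a config from each group, then override an individual field.
CUDA_VISIBLE_DEVICES=0 python main.py ngpus=1 exp=text8 sde=mixture sde.step_thr=0.35 scheduler=geometric
```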
Datasets are automatically downloaded when running the training script.

- The data cache directory is set to `data/`. This can be modified via `data.cache_dir` in the config file.
- To add a new dataset or modify the dataset/tokenizer settings, please refer to `data.py`.
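Assuming `data.cache_dir` can also be overridden from the command line like the other options in this README, redirecting the cache might look like this (the path is a placeholder):

```sh
CUDA_VISIBLE_DEVICES=0 python main.py ngpus=1 exp=text8 data.cache_dir=/path/to/cache
```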
To run on the Text8 dataset, use the following command:
```sh
CUDA_VISIBLE_DEVICES=0 python main.py \
ngpus=1 \
training.accum=1 \
exp=text8 \
sde=mixture \
sde.step_thr=0.35 \
scheduler=geometric \
scheduler.weight_type=step \
scheduler.left=0.3 \
scheduler.right=0.6
```

`ngpus` is the number of GPUs used for training and `training.accum` is the number of gradient accumulation steps.
Modify these two hyperparameters to fit your hardware.
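For example, the same Text8 run spread over two GPUs with two gradient accumulation steps might look like the sketch below; whether this preserves the effective batch size depends on how the training script scales the per-GPU batch, so treat it as an illustration:

```sh
# Hypothetical split across two GPUs with gradient accumulation.
CUDA_VISIBLE_DEVICES=0,1 python main.py \
ngpus=2 \
training.accum=2 \
exp=text8 \
sde=mixture \
sde.step_thr=0.35 \
scheduler=geometric \
scheduler.weight_type=step \
scheduler.left=0.3 \
scheduler.right=0.6
```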
Similarly, to run on the One Billion Words (LM1B) dataset, use the following command:
```sh
CUDA_VISIBLE_DEVICES=0,1,2,3 python main.py \
ngpus=4 \
training.accum=1 \
exp=lm1b \
tokens=3 \
sde=mixture \
sde.rho_scale=1.14 \
sde.step_thr=0.38 \
scheduler=geometric \
scheduler.weight_type=step \
scheduler.left=0.3 \
scheduler.right=0.75
```

Run the following command to generate samples and evaluate them:
```sh
CUDA_VISIBLE_DEVICES=0 python main.py \
ngpus=1 \
run_mode=sample \
server=sample \
exp=sample_lm1b \
"model_path='PATH_TO_MODEL_CHECKPOINT'" \
seed=0
```
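Assuming `seed` is what controls the sampling randomness, generating several independent sample sets is a matter of sweeping it (a sketch; `PATH_TO_MODEL_CHECKPOINT` is the same placeholder as above):

```sh
# Generate three sample sets with different seeds.
for s in 0 1 2; do
    CUDA_VISIBLE_DEVICES=0 python main.py \
        ngpus=1 \
        run_mode=sample \
        server=sample \
        exp=sample_lm1b \
        "model_path='PATH_TO_MODEL_CHECKPOINT'" \
        seed=$s
done
```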
The checkpoints for the models trained on the Text8 and LM1B datasets are available in this Google Drive folder.

- Download `checkpoint.pth` and pass the path to the downloaded file as `PATH_TO_MODEL_CHECKPOINT`. Use the command provided in the above section to generate and evaluate the samples.
- Additional files `sde.pkl` and `config.yaml` are provided for reproducibility and further analysis.
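To sanity-check a downloaded checkpoint against these artifacts, you can inspect them directly; the sketch below assumes the files sit in the current directory, and unpickling `sde.pkl` will need the repository's class definitions importable:

```sh
cat config.yaml   # training configuration recorded for this checkpoint
# Unpickle the SDE object (run from the repository root so its classes resolve).
python -c "import pickle; print(type(pickle.load(open('sde.pkl', 'rb'))))"
```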
If you find the code provided with our paper useful in your work, we kindly request that you cite our paper:
```bibtex
@inproceedings{jo2025RDLM,
  author    = {Jaehyeong Jo and Sung Ju Hwang},
  title     = {Continuous Diffusion Model for Language Modeling},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
}
```