pip install uv
uv sync
Nothing to do. By default, get_streaming_dataset in train.py streams the dataset from DKYoon/SlimPajama-6B, so no manual download is required.
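For readers who want to inspect the default data outside of training, the following is a minimal sketch of what streaming that dataset could look like with the Hugging Face `datasets` library. It is not the code in train.py: the helper names `load_default_stream` and `peek` are hypothetical, and the `train` split name is an assumption.

```python
from itertools import islice


def load_default_stream(split: str = "train"):
    """Hypothetical helper: open DKYoon/SlimPajama-6B in streaming mode.

    The split name is an assumption; the real get_streaming_dataset in
    train.py may pass different arguments.
    """
    # Imported lazily so that defining the helper does not need the library.
    from datasets import load_dataset

    # streaming=True returns an iterable dataset that fetches examples on
    # demand instead of downloading the whole corpus first.
    return load_dataset("DKYoon/SlimPajama-6B", split=split, streaming=True)


def peek(n: int = 3, split: str = "train"):
    """Return the first n examples of the stream as a list of dicts."""
    return list(islice(load_default_stream(split), n))
```

Calling `peek()` needs network access and the `datasets` package; it only pulls the first few examples rather than the full dataset.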
bash train.sh arm_700m # or: bdm_700m, mdm_700m, udm_700m
To evaluate a model, set the MODEL_PATH environment variable to your checkpoint directory and run eval.sh.
IMPORTANT: eval.sh detects the model architecture from the folder name in MODEL_PATH, so keep the run folder name unchanged when you move or copy checkpoints.
# Example for an AR model
MODEL_PATH=ar_700m/checkpoint-500 bash eval.sh
# Example for a Masked Diffusion model
MODEL_PATH=mdm_700m/checkpoint-1000 bash eval.sh
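To illustrate why the folder name matters, here is a minimal sketch of folder-name-based detection. It is an assumption about the general idea, not the actual logic in eval.sh: the helper `guess_architecture` is hypothetical, and it simply takes the text before the first underscore in the run folder name.

```python
from pathlib import PurePosixPath


def guess_architecture(model_path: str) -> str:
    """Hypothetical helper: infer an architecture tag from a checkpoint path.

    Assumes the run folder is named '<arch>_<size>', as in the examples
    above; the real eval.sh may use a different rule.
    """
    for part in PurePosixPath(model_path).parts:
        # The first path component containing an underscore is treated as
        # the run folder; 'checkpoint-1000' has none and is skipped.
        if "_" in part:
            return part.split("_", 1)[0]
    raise ValueError(f"no '<arch>_<size>' folder found in {model_path!r}")
```

Under this assumption, renaming `mdm_700m` to something like `run1` would make detection fail, which is why the folder name should stay intact.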