Supervised-Fine-Tuning

Overreview

This project focuses on fine-tuning the Pythia-160M model for conversational AI using domain-specific dialogue datasets deita-6k-v0.

Implement Details

Dataset loading

The dataset is downloaded and stored using datasets.

from datasets import load_dataset
dataset = load_dataset("hkust-nlp/deita-6k-v0")
dataset.save_to_disk("./deita-6k-v0")

Dataset preprocessing

In the preprocessing stage, we need to format conversational data using Mistral-7B-Instruct-v0.1 chat template and pack multi-turn dialogues to meet the training context length (2048) The key column formatted_text is extracted and tokenized using Hugging Face's AutoTokenizer. Padding and truncation ensure input sequences stay within 2048 tokens. Split the preprocessed dataset into two parts, leaving the last 100 instances as test dataset.

Model Fine-Tuning

The original pre-trained model is downloaded on huggingface and loaded from a local path.

Training is conducted on multi-GPU (CUDA_VISIBLE_DEVICES) using DeepSpeed for memory optimization.

Uses DeepSpeed for multi-GPU training.

Runs for 3 epochs with batch size 32 per GPU.

The learning rate is 2e-5, optimized for Pythia-160M.

Evaluation and checkpointing occur at the end of each epoch.

The Trainer API from transformers is used for training.

The DataCollatorForLanguageModeling ensures proper tokenization and padding.

Fine-tuning begins with trainer.train(), logging loss and evaluation metrics.

CUDA_VISIBLE_DEVICES=0,1 torchrun --standalone --nproc_per_node=2 train.py

Inference and Evaluation

After training, BLEU scores are computed to evaluate model performance. The model generates predictions for the test set. Predictions are compared to reference responses using BLEU scoring. In my experiment: the Average BLEU Score is 0.8371

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
README.md		README.md
process.py		process.py
train.py		train.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Supervised-Fine-Tuning

Overreview

Implement Details

Dataset loading

Dataset preprocessing

Model Fine-Tuning

Inference and Evaluation

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Supervised-Fine-Tuning

Overreview

Implement Details

Dataset loading

Dataset preprocessing

Model Fine-Tuning

Inference and Evaluation

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages