
Benchmarking deep learning optimization with nanoGPT


This benchmark is dedicated to evaluating new deep learning optimization methods on the nanoGPT architecture. The optimization problem is defined as in the original nanoGPT speedrun (see modded-nanogpt):

  • Training and validation are performed on FineWeb -- do not change the dataloaders.
  • Training stops once the validation loss drops below 3.28 (still TODO); see the sketch after this list.
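
For reference, the stopping rule amounts to something like the following training-loop fragment. This is an illustrative sketch, not code from this repository: `train_until_target`, `evaluate`, and the loaders are hypothetical names, and the nanoGPT-style forward is assumed to return `(logits, loss)`.

import torch

TARGET_VAL_LOSS = 3.28  # speedrun target on the FineWeb validation set


@torch.no_grad()
def evaluate(model, val_loader, max_batches=20):
    # Average loss over a few validation batches (hypothetical helper).
    model.eval()
    losses = [model(x, y)[1].item()
              for _, (x, y) in zip(range(max_batches), val_loader)]
    model.train()
    return sum(losses) / len(losses)


def train_until_target(model, optimizer, train_loader, val_loader,
                       eval_interval=100, target=TARGET_VAL_LOSS):
    # Run optimizer steps until the validation loss drops below `target`.
    for step, (x, y) in enumerate(train_loader):
        optimizer.zero_grad(set_to_none=True)
        _, loss = model(x, y)  # nanoGPT-style forward: (logits, loss)
        loss.backward()
        optimizer.step()
        if step % eval_interval == 0 and evaluate(model, val_loader) < target:
            break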

For now, the repository contains a single solver, Adam, and runs on CPU. The dataloaders work but use a fixed sequence length of 128 tokens. We use the original model code from nanoGPT (GPT-2 from llm.c), together with the simple dataloader from `modded-nanogpt`_.
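
A benchopt solver wrapping Adam could look roughly like the sketch below. This is a sketch assuming benchopt's callback sampling strategy; the `set_objective` arguments and the hyper-parameter grid are placeholders, since the real interface is whatever this benchmark's objective passes through `get_objective()`.

import torch
from benchopt import BaseSolver


class Solver(BaseSolver):
    # Plain Adam baseline (sketch; learning-rate grid is illustrative).
    name = 'Adam'
    sampling_strategy = 'callback'
    parameters = {'lr': [3e-4]}

    def set_objective(self, model, train_loader):
        # Placeholder arguments: the actual names are defined by the
        # benchmark's Objective.get_objective().
        self.model, self.train_loader = model, train_loader
        self.optimizer = torch.optim.Adam(model.parameters(), lr=self.lr)

    def run(self, cb):
        data = iter(self.train_loader)
        while cb():  # benchopt records intermediate results and stops here
            try:
                x, y = next(data)
            except StopIteration:
                data = iter(self.train_loader)
                x, y = next(data)
            self.optimizer.zero_grad(set_to_none=True)
            _, loss = self.model(x, y)  # nanoGPT-style forward
            loss.backward()
            self.optimizer.step()

    def get_result(self):
        return dict(model=self.model)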

TODO:

  • Tweak the dataloaders to make them more efficient and less error-prone.
  • Decide whether to add improvements to the architecture (QK-norm, rotary embeddings, etc.); see the QK-norm sketch below.
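
As an example of what QK-norm would look like, here is a minimal PyTorch sketch (one common variant, normalizing queries and keys to unit L2 norm per head; the exact formulation used in modded-nanogpt may differ):

import torch
import torch.nn.functional as F


def qk_norm_attention(q, k, v):
    # q, k, v: (batch, n_heads, seq_len, head_dim).
    # Normalizing q and k bounds the attention logits, which tends to
    # stabilize training; a learnable temperature often replaces the
    # default 1/sqrt(head_dim) scaling in this setting.
    q = F.normalize(q, dim=-1)
    k = F.normalize(k, dim=-1)
    return F.scaled_dot_product_attention(q, k, v, is_causal=True)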

Install

This benchmark can be run using the following commands:

$ pip install -U benchopt
$ git clone https://github.com/tomMoral/benchmark_nanogpt
$ benchopt run benchmark_nanogpt

Apart from the problem, options can be passed to benchopt run to restrict the benchmark to some solvers or datasets, e.g.:

$ benchopt run benchmark_nanogpt -s solver1 -d dataset2 --max-runs 10 --n-repetitions 10

Use benchopt run -h for more details about these options, or visit https://benchopt.github.io/api.html.
