sidsurakanti/tiny-ml-lib

Overview

A lightweight deep learning framework built from scratch on raw CUDA, with a friendly Python API that covers the core components needed to build and train deep learning models.

Features

  • Fully connected layer
  • Convolutional layer
  • GPU acceleration (~15x faster than plain NumPy, and on par with PyTorch at small batch sizes)
  • Flatten layer
  • Max pooling layer
  • ReLU activation
  • Softmax layer
  • Model save/load
  • Cross Entropy Loss & MSE Loss
  • Model & sequential classes
  • Training & eval loop
  • Mini-batching
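
Here's a rough sketch of what wiring these pieces together might look like. The class names come from the model printouts below (Model, Conv2d, ReLU, Flatten, Linear, CrossEntropyLoss), but the import path and constructor signatures are assumptions, not the library's confirmed API:

# Hypothetical sketch -- layer names match the printouts below, but the
# import path and constructor signatures are guesses.
import numpy as np
from tinyml import Model, Conv2d, ReLU, Flatten, Linear, CrossEntropyLoss

# Stand-in data with the MNIST shapes shown in the logs: (60000, 784) / (60000,)
X_train = np.random.rand(60000, 784).astype(np.float32)
y_train = np.random.randint(0, 10, 60000)

model = Model(
    Conv2d(1, 5, kernel_size=5),  # (1, 28, 28) -> (5, 24, 24)
    ReLU(),
    Flatten(),
    Linear(2880, 128),
    ReLU(),
    Linear(128, 10),
    loss=CrossEntropyLoss(),
)

model.train(X_train, y_train, epochs=10, batch_size=64)  # mini-batched training loop
model.eval(X_train, y_train)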

Usage

Convolutional NN on MNIST

>>> py main.py
Input shape: (60000, 784)
Labels shape: (60000,)

Model(
  [0] Conv2d       ((1, 28, 28) → (5, 24, 24))
  [1] ReLU
  [2] Flatten
  [3] Linear       (2880 → 128)
  [4] ReLU
  [5] Linear       (128 → 10)
  Loss: CrossEntropyLoss
  Total parameters: 373,063
)

TRAINING...
EPOCH 1/10, Loss: 0.1227
...
EPOCH 10/10, Loss: 0.0347
Time spent training: 437.89s

EVALUATING...
Sample labels: [9 2 9 8 9 7 1 2 4 3]
Sample preds: [9 2 9 8 9 7 1 2 4 3]
Accuracy: 98.32%

Save weights? (y/n) >>> y
File name? (empty for default) >>> cnn-weights
Saved model weights to cnn-weights.pkl

With MaxPool (GPU ver.)

Model(
  [0] Conv2d       ((1, 28, 28) → (5, 24, 24))
  [1] ReLU
  [2] MaxPool # on gpu
  [3] Flatten
  [4] Linear       (720 → 128)
  [5] ReLU
  [6] Linear       (128 → 10)
  Loss: CrossEntropyLoss
  Total parameters: 96,583
  Device: CPU
)

TRAINING...
EPOCH 1/5, Loss: 1.7549
...
EPOCH 5/5, Loss: 0.4145
Finished in: 492.62s # CPU time 1200s

EVALUATING...
Sample labels: [8 5 6 4 2 4 2 4 1 3]
Sample preds: [8 5 6 4 4 4 2 4 1 3]
Accuracy: 89.95%

MLP on MNIST (GPU)

>>> py main.py
Input shape: (60000, 784)
Labels shape: (60000,)
Model(
  [0] Linear       (784 → 512)
  [1] ReLU
  [2] Linear       (512 → 512)
  [3] ReLU
  [4] Linear       (512 → 512)
  [5] ReLU
  [6] Linear       (512 → 10)
  Loss: CrossEntropyLoss
  Total parameters: 932,362
  Device: GPU
)

TRAINING...
EPOCH 1/10, Loss: 0.5499
...
EPOCH 10/10, Loss: 0.2297
Finished in: 9.51s

EVALUATING...
Sample labels: [7 3 1 1 0 8 0 8 6 4]
Sample preds: [7 3 1 1 0 0 0 8 6 4]
Accuracy: 95.70%

Save weights? (y/n) >>> n

With pretrained weights:

Loaded model weights from mlp-weights.pkl

EVALUATING...
Sample labels: [2 0 1 9 6 5 5 6 7 8]
Sample preds: [2 0 1 9 6 5 5 6 7 8]
Accuracy: 98.13%
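
The prompts above hint at a simple pickle-based round trip for weights. A minimal sketch of how that flow might look, assuming save/load method names inferred from the printed messages:

# Hypothetical: method names inferred from the "Saved/Loaded model weights" logs.
model.save("mlp-weights")      # writes mlp-weights.pkl
model.load("mlp-weights.pkl")  # restores weights, ready for eval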

Benchmarks

Note

This library doesn't have autograd (yet), graph tracing, mixed precision, tensor cores, cuDNN, cuBLAS, or any of the fancy stuff PyTorch does.

It only runs "faster" because it's lightweight.

Still beats PyTorch at batch sizes <= 512 on MNIST though, so it's a win in my book.

All benchmarks were run on an RTX 4060, training a simple MNIST NN from scratch using this library's GPU backend.

Model: Linear(784 → 512) → ReLU → Linear(512 → 512) → ReLU → Linear(512 → 512) → ReLU → Linear(512 → 10)

Loss: CrossEntropy

Optimizer: SGD, lr=0.1

Epochs: 10

Batch Size   Framework   Time (10 Epochs)
64           PyTorch     27.2s
64           This lib    20.2s
512          PyTorch     9.7s
512          This lib    9.5s
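
For reference, the PyTorch side of the comparison presumably looks something like this. This is a sketch of the stated setup (synthetic stand-in data), not the exact benchmark script:

import time
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

device = "cuda"

# Synthetic stand-in for the MNIST tensors shown above: (60000, 784) / (60000,)
X = torch.randn(60000, 784)
y = torch.randint(0, 10, (60000,))
loader = DataLoader(TensorDataset(X, y), batch_size=512, shuffle=True)

model = nn.Sequential(
    nn.Linear(784, 512), nn.ReLU(),
    nn.Linear(512, 512), nn.ReLU(),
    nn.Linear(512, 512), nn.ReLU(),
    nn.Linear(512, 10),
).to(device)

loss_fn = nn.CrossEntropyLoss()
opt = torch.optim.SGD(model.parameters(), lr=0.1)

start = time.time()
for epoch in range(10):
    for xb, yb in loader:
        xb, yb = xb.to(device), yb.to(device)
        opt.zero_grad()
        loss_fn(model(xb), yb).backward()
        opt.step()
torch.cuda.synchronize()  # include pending GPU work in the timing
print(f"Finished in: {time.time() - start:.2f}s")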

Why make this?

  • GPU programming seemed like a really fun problem space
  • Wanted to implement a bunch of things on my own
  • Wanted to experiment with building a framework and learn cool stuff

Stack

  • CUDA
  • Python
  • NumPy

Getting Started

Prerequisites

  • Python 3.10+
  • pip
  • CUDA Toolkit
  • CMake
  • gcc or g++

Installation

Clone the repo:

git clone https://github.com/sidsurakanti/tiny-ml-lib.git
cd tiny-ml-lib

Create a virtual environment (optional but recommended):

python3 -m venv venv
source venv/bin/activate  # windows: venv\Scripts\activate

Install dependencies:

pip install -r requirements.txt

Build the core CUDA lib:

mkdir build && cd build
cmake .. && make && make install
cd ..
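
Under the hood, a shared library built like this is typically loaded from Python via ctypes. A purely hypothetical sketch (the library path and matmul symbol are placeholders, not this repo's actual names):

import ctypes
import numpy as np

# Placeholder library/symbol names; the real ones come from the CMake build.
lib = ctypes.CDLL("libtinyml.so")
lib.matmul.argtypes = [ctypes.POINTER(ctypes.c_float)] * 3 + [ctypes.c_int] * 3

a = np.random.rand(64, 784).astype(np.float32)
b = np.random.rand(784, 512).astype(np.float32)
out = np.empty((64, 512), dtype=np.float32)

ptr = lambda x: x.ctypes.data_as(ctypes.POINTER(ctypes.c_float))
lib.matmul(ptr(a), ptr(b), ptr(out), 64, 784, 512)  # out = a @ b on the GPU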

Run it:

python3 main.py

or

python main.py

Roadmap

  • MLP basic functionality
  • Add Conv2d
  • Add pooling layer
  • Add weight inits
  • CUDA remake
  • Add more activations, etc.

Support

Need help? Ping me on Discord.
