NeuralForge

A neural network built from scratch.


Most people use neural networks without understanding what happens inside them. This project is the opposite of that.

NeuralForge is a handwritten digit classifier trained on MNIST — written line by line to understand every step: how data flows through layers, how loss is calculated, how backpropagation adjusts weights, and how a model that knows nothing becomes one that's right 98% of the time.


How it works

A grayscale 28×28 image enters the network as 784 numbers. Three fully connected layers progressively compress that information — 784 → 128 → 64 → 10 — until the final output represents the model's confidence for each digit (0–9). The highest value wins.

During training, the model sees 60,000 handwritten digits in batches of 64. After each batch, it measures how wrong it was (CrossEntropyLoss), traces that error back through the network (backpropagation), and nudges every weight slightly in the right direction (Adam optimizer).
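The batch-level mechanics described above can be sketched in a few lines of PyTorch. This is a minimal, illustrative sketch, not code from the notebook: the single-layer model and the random batch are stand-ins so the loss → backprop → optimizer step stays readable.

```python
import torch
from torch import nn, optim

model = nn.Linear(784, 10)  # stand-in for the full MLP, to keep the step readable
criterion = nn.CrossEntropyLoss()           # measures how wrong the batch was
optimizer = optim.Adam(model.parameters())  # nudges the weights after each batch

batch = torch.randn(64, 784)           # one batch of 64 flattened images (random here)
targets = torch.randint(0, 10, (64,))  # one true digit label per image

optimizer.zero_grad()                    # clear gradients left over from the last batch
loss = criterion(model(batch), targets)  # how wrong was it?
loss.backward()                          # backpropagation: trace the error back through the network
optimizer.step()                         # nudge every weight slightly in the right direction
```

In the real run this step repeats for every batch of the 60,000 training images.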

Training runs in multiple rounds. After each round, the model is evaluated on 10,000 images it has never seen. If accuracy improves, the best model is saved — if not, the previous best is kept. This is early stopping with checkpointing: a standard technique to avoid overfitting and always preserve the best version of the model.
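The checkpointing logic amounts to a compare-and-save after each evaluation. A minimal sketch, with stand-ins for the model and the evaluation (the hard-coded accuracies mirror the training output below; a real `evaluate()` would run the model over the 10,000 test images):

```python
import torch
from torch import nn

model = nn.Linear(784, 10)  # stand-in; the real model is the three-layer MLP

# Stand-in evaluation results; the real evaluate() would measure test accuracy.
accuracies = iter([97.91, 98.26, 97.85])

def evaluate(model):
    return next(accuracies)

best_accuracy = 0.0
for round_idx in range(3):      # one evaluation after each training round
    accuracy = evaluate(model)
    if accuracy > best_accuracy:
        best_accuracy = accuracy
        torch.save(model.state_dict(), "best_model.pth")  # checkpoint the best weights
        print(f"New best model saved: {accuracy:.2f}%")
    else:
        print(f"No improvement. Best so far: {best_accuracy:.2f}%")
```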

Peak accuracy reached: 98.26%.

Input       Hidden 1    Hidden 2    Output
 784   →      128    →    64    →    10
        ReLU        ReLU
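In PyTorch, the diagram above corresponds to roughly the following. This is a sketch of the described architecture, not the notebook's exact code; variable names are illustrative.

```python
import torch
from torch import nn

model = nn.Sequential(
    nn.Flatten(),                    # 28x28 grayscale image -> 784 numbers
    nn.Linear(784, 128), nn.ReLU(),  # Hidden 1
    nn.Linear(128, 64), nn.ReLU(),   # Hidden 2
    nn.Linear(64, 10),               # one confidence score per digit 0-9
)

image = torch.randn(1, 28, 28)      # a dummy grayscale image (batch of 1 below)
scores = model(image.unsqueeze(0))  # shape (1, 10)
prediction = scores.argmax(dim=1)   # the highest value wins
```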

Environment

Training runs on Kaggle Notebooks with GPU acceleration rather than locally. Kaggle provides free T4 GPU access, a stable environment, and persistent storage for model checkpoints, making it well suited to iterative ML experimentation without requiring local GPU hardware.


Project structure

NeuralForge/
├── notebook-mnist.ipynb   # Full notebook: data loading, model, training, evaluation
└── best_model.pth         # Best weights found (auto-generated, not tracked)

Training output

No model found, starting from scratch.
Epoch 1/9 - Loss: 0.0278
...
Accuracy: 97.91%
New best model saved: 97.91%
Epoch 1/9 - Loss: 0.0005
...
Accuracy: 98.26%
New best model saved: 98.26%
Epoch 1/9 - Loss: 0.0001
...
Accuracy: 97.85%
No improvement. Best so far: 98.26%

What I learned

  • How a neural network learns through backpropagation
  • What loss is and why it doesn't always reflect real performance
  • The difference between a model that memorizes and one that generalizes — overfitting
  • Why training loss can keep falling while test accuracy stops improving
  • How to implement checkpointing to always preserve the best model
  • That ML has inherent randomness — two identical runs rarely produce the same result

What's next

This MLP hits its ceiling around 98%. The architecture flattens images to 784 numbers, losing all spatial information. The next step is a CNN (Convolutional Neural Network) — designed to understand images as images, not flat vectors. The goal: break 99%.
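A possible starting point for that next step (not part of this repo yet, so purely a hedged sketch: layer sizes and names are assumptions, not decisions already made) might look like:

```python
import torch
from torch import nn

# A small CNN that keeps the 2-D structure of the image
# and only flattens at the very end.
cnn = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),   # 28x28 -> 28x28, 16 filters
    nn.MaxPool2d(2),                                         # 28x28 -> 14x14
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),  # 14x14 -> 14x14, 32 filters
    nn.MaxPool2d(2),                                         # 14x14 -> 7x7
    nn.Flatten(),
    nn.Linear(32 * 7 * 7, 10),                               # one score per digit
)

scores = cnn(torch.randn(1, 1, 28, 28))  # shape (1, 10)
```

The training loop, loss, and checkpointing from this project would carry over unchanged; only the model definition swaps out.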

