NeuralForge

A neural network built from scratch.


Most people use neural networks without understanding what happens inside them. This project is the opposite of that.

NeuralForge is a handwritten digit classifier trained on MNIST — written line by line to understand every step: how data flows through layers, how loss is calculated, how backpropagation adjusts weights, and how a model that knows nothing becomes one that's right 98% of the time.


How it works

A grayscale 28×28 image enters the network as 784 numbers. Three fully connected layers progressively compress that information — 784 → 128 → 64 → 10 — until the final output represents the model's confidence for each digit (0–9). The highest value wins.

During training, the model sees 60,000 handwritten digits in batches of 64. After each batch, it measures how wrong it was (CrossEntropyLoss), traces that error back through the network (backpropagation), and nudges every weight slightly in the right direction (Adam optimizer).
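The batch-level mechanics described above can be sketched in a few lines of PyTorch. This is a minimal, illustrative sketch, not code from the notebook: the single-layer model and the random batch are stand-ins so the loss → backprop → optimizer step stays readable.

```python
import torch
from torch import nn, optim

model = nn.Linear(784, 10)  # stand-in for the full MLP, to keep the step readable
criterion = nn.CrossEntropyLoss()           # measures how wrong the batch was
optimizer = optim.Adam(model.parameters())  # nudges the weights after each batch

batch = torch.randn(64, 784)           # one batch of 64 flattened images (random here)
targets = torch.randint(0, 10, (64,))  # one true digit label per image

optimizer.zero_grad()                    # clear gradients left over from the last batch
loss = criterion(model(batch), targets)  # how wrong was it?
loss.backward()                          # backpropagation: trace the error back through the network
optimizer.step()                         # nudge every weight slightly in the right direction
```

In the real run this step repeats for every batch of the 60,000 training images.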

Training runs in multiple rounds. After each round, the model is evaluated on 10,000 images it has never seen. If accuracy improves, the best model is saved — if not, the previous best is kept. This is early stopping with checkpointing: a standard technique to avoid overfitting and always preserve the best version of the model.
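The checkpointing logic amounts to a compare-and-save after each evaluation. A minimal sketch, with stand-ins for the model and the evaluation (the hard-coded accuracies mirror the training output below; a real `evaluate()` would run the model over the 10,000 test images):

```python
import torch
from torch import nn

model = nn.Linear(784, 10)  # stand-in; the real model is the three-layer MLP

# Stand-in evaluation results; the real evaluate() would measure test accuracy.
accuracies = iter([97.91, 98.26, 97.85])

def evaluate(model):
    return next(accuracies)

best_accuracy = 0.0
for round_idx in range(3):      # one evaluation after each training round
    accuracy = evaluate(model)
    if accuracy > best_accuracy:
        best_accuracy = accuracy
        torch.save(model.state_dict(), "best_model.pth")  # checkpoint the best weights
        print(f"New best model saved: {accuracy:.2f}%")
    else:
        print(f"No improvement. Best so far: {best_accuracy:.2f}%")
```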

Peak accuracy reached: 98.26%.

Input       Hidden 1    Hidden 2    Output
 784   →      128    →    64    →    10
        ReLU        ReLU
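In PyTorch, the diagram above corresponds to roughly the following. This is a sketch of the described architecture, not the notebook's exact code; variable names are illustrative.

```python
import torch
from torch import nn

model = nn.Sequential(
    nn.Flatten(),                    # 28x28 grayscale image -> 784 numbers
    nn.Linear(784, 128), nn.ReLU(),  # Hidden 1
    nn.Linear(128, 64), nn.ReLU(),   # Hidden 2
    nn.Linear(64, 10),               # one confidence score per digit 0-9
)

image = torch.randn(1, 28, 28)      # a dummy grayscale image (batch of 1 below)
scores = model(image.unsqueeze(0))  # shape (1, 10)
prediction = scores.argmax(dim=1)   # the highest value wins
```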

Environment

Training runs on Kaggle Notebooks with GPU acceleration rather than locally. Kaggle provides free T4 GPU access, a stable environment, and persistent storage for model checkpoints, making it well suited to iterative ML experimentation without requiring local GPU hardware.


Project structure

NeuralForge/
├── notebook-mnist.ipynb   # Full notebook: data loading, model, training, evaluation
└── best_model.pth         # Best weights found (auto-generated, not tracked)

Training output

No model found, starting from scratch.
Epoch 1/9 - Loss: 0.0278
...
Accuracy: 97.91%
New best model saved: 97.91%
Epoch 1/9 - Loss: 0.0005
...
Accuracy: 98.26%
New best model saved: 98.26%
Epoch 1/9 - Loss: 0.0001
...
Accuracy: 97.85%
No improvement. Best so far: 98.26%

What I learned

  • How a neural network learns through backpropagation
  • What loss is and why it doesn't always reflect real performance
  • The difference between a model that memorizes and one that generalizes — overfitting
  • Why training loss can keep falling while test accuracy stops improving
  • How to implement checkpointing to always preserve the best model
  • That ML has inherent randomness — two identical runs rarely produce the same result

What's next

This MLP hits its ceiling around 98%. The architecture flattens images to 784 numbers, losing all spatial information. The next step is a CNN (Convolutional Neural Network) — designed to understand images as images, not flat vectors. The goal: break 99%.
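A possible starting point for that next step (not part of this repo yet, so purely a hedged sketch: layer sizes and names are assumptions, not decisions already made) might look like:

```python
import torch
from torch import nn

# A small CNN that keeps the 2-D structure of the image
# and only flattens at the very end.
cnn = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),   # 28x28 -> 28x28, 16 filters
    nn.MaxPool2d(2),                                         # 28x28 -> 14x14
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),  # 14x14 -> 14x14, 32 filters
    nn.MaxPool2d(2),                                         # 14x14 -> 7x7
    nn.Flatten(),
    nn.Linear(32 * 7 * 7, 10),                               # one score per digit
)

scores = cnn(torch.randn(1, 1, 28, 28))  # shape (1, 10)
```

The training loop, loss, and checkpointing from this project would carry over unchanged; only the model definition swaps out.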

