network_general.py: trains and tests the model, then saves it as "weights.npz" in the same folder for manual use.
manual_check.py: lets the model guess digits we wrote ourselves.
weights.npz: stores the sets of weights and biases that make up the trained model; manual_check.py loads this file.
MNIST_dataset: this folder has 4 files, 2 for training the model and 2 for testing it; each pair contains the images and their labels. This folder is used to train and test the model.
images: contains 3 folders, each with a set of 10 digits I wrote, used in manual_check.py.
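As a rough illustration of what reading the dataset can look like, here is a minimal sketch that assumes the 4 files are in the original MNIST IDX binary format (a big-endian header followed by raw bytes). That format, and the function names `parse_idx_images` and `parse_idx_labels`, are assumptions for this example and may not match what network_general.py actually does.

```python
import struct

import numpy as np


def parse_idx_images(raw: bytes) -> np.ndarray:
    """Parse IDX image bytes into a (count, rows * cols) float array in [0, 1]."""
    # Header: magic number, image count, rows, cols (four big-endian uint32).
    magic, count, rows, cols = struct.unpack(">IIII", raw[:16])
    if magic != 2051:
        raise ValueError(f"not an IDX image file (magic={magic})")
    pixels = np.frombuffer(raw, dtype=np.uint8, count=count * rows * cols, offset=16)
    # Flatten each image to one row and scale pixel values from 0..255 to 0..1.
    return pixels.reshape(count, rows * cols).astype(np.float32) / 255.0


def parse_idx_labels(raw: bytes) -> np.ndarray:
    """Parse IDX label bytes into a (count,) array of digit labels."""
    # Header: magic number, label count (two big-endian uint32).
    magic, count = struct.unpack(">II", raw[:8])
    if magic != 2049:
        raise ValueError(f"not an IDX label file (magic={magic})")
    return np.frombuffer(raw, dtype=np.uint8, count=count, offset=8)
```

To use it, read a file with `open(path, "rb").read()` and pass the bytes to the matching function; the path depends on how the folder is laid out on your machine.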
Neural networks are a topic I have wanted to learn for a long time. This project uses advanced math of the kind taught in courses like Calculus 2 or Linear Algebra. Fortunately I had a slight background in those topics, but I still learned new things such as gradients, multi-variable functions, partial derivatives and more.
Simply put, training the model means the program loops through the images and tweaks the weight and bias values in order to minimize the loss function.
A lower loss means more correct and more confident predictions.
The loss function can be imagined as a nice view full of mountains and valleys, where our job is to keep stepping in the direction of steepest descent, toward the valleys where the loss is lowest. The size of each step is controlled by our "learning_rate" variable.
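The mountains-and-valleys picture can be tried out on a toy example. The sketch below is not the project's loss; it uses a made-up bowl-shaped function of two variables (`loss`, `gradient` and `descend` are illustrative names) just to show how repeated steps against the gradient, scaled by `learning_rate`, walk down to the lowest point.

```python
def loss(x, y):
    # A simple bowl with its lowest point (loss 0) at x = 3, y = -1.
    return (x - 3.0) ** 2 + (y + 1.0) ** 2


def gradient(x, y):
    # Partial derivatives of the loss with respect to x and y.
    return 2.0 * (x - 3.0), 2.0 * (y + 1.0)


def descend(x, y, learning_rate=0.1, steps=100):
    for _ in range(steps):
        gx, gy = gradient(x, y)
        # The gradient points uphill, so step in the opposite direction.
        x -= learning_rate * gx
        y -= learning_rate * gy
    return x, y


x, y = descend(0.0, 0.0)
print(x, y, loss(x, y))
```

A larger `learning_rate` takes bigger steps and gets there faster, but if it is too large the steps overshoot the valley and the loss can grow instead of shrink.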
I made an organized PDF file (I am very proud of it). It is in the project folder; use it to visualize the system and to follow the calculation of all the derivatives.
(Before running, just make sure the folder paths are correct.)
First, the weights and biases are initialized randomly. The constant NEURONS can be changed: the higher it is, the more precise the model tends to be, but the longer it takes to train.
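A minimal sketch of this initialization step could look like the following. The layer shapes (784 inputs for 28x28 images, one hidden layer of NEURONS units, 10 outputs), the 0.01 scale, the example value of NEURONS and the helper name `init_params` are all assumptions for illustration, not necessarily what network_general.py uses.

```python
import numpy as np

INPUT_SIZE = 28 * 28  # assumed: one input per pixel of a 28x28 image
OUTPUT_SIZE = 10      # one output per digit 0-9
NEURONS = 64          # example value only; the hidden layer size to tune


def init_params(neurons=NEURONS, seed=None):
    """Return small random weights and biases for a two-layer network."""
    rng = np.random.default_rng(seed)
    return {
        "w0": rng.standard_normal((INPUT_SIZE, neurons)) * 0.01,
        "b0": rng.standard_normal(neurons) * 0.01,
        "w1": rng.standard_normal((neurons, OUTPUT_SIZE)) * 0.01,
        "b1": rng.standard_normal(OUTPUT_SIZE) * 0.01,
    }


params = init_params(seed=0)
for name, value in params.items():
    print(name, value.shape)
```

Raising NEURONS makes w0 and w1 larger, which is why training takes longer: every image has to be multiplied through bigger matrices.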
Second, we run through the images and train the model. In the backpropagation process we calculate the gradients, which are vectors pointing in the direction of steepest ascent, so we move w0, b0, w1 and b1 in the negative (opposite) direction of their gradients, which makes them produce less loss.
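To make the update rule concrete, here is a hedged sketch of one training step for a small two-layer network. It assumes a ReLU hidden layer, a softmax output and cross-entropy loss; those choices, the tiny random data and the names `forward` and `train_step` are assumptions for this example, so the project's own activations and derivatives (see the PDF) may differ.

```python
import numpy as np


def forward(params, x):
    z0 = x @ params["w0"] + params["b0"]       # hidden layer pre-activation
    a0 = np.maximum(z0, 0.0)                   # ReLU (assumed activation)
    z1 = a0 @ params["w1"] + params["b1"]      # output layer pre-activation
    z1 = z1 - z1.max(axis=1, keepdims=True)    # shift for numerical stability
    exp = np.exp(z1)
    probs = exp / exp.sum(axis=1, keepdims=True)  # softmax probabilities
    return z0, a0, probs


def train_step(params, x, y_onehot, learning_rate):
    """Run one gradient descent step in place; return the loss before the update."""
    n = x.shape[0]
    z0, a0, probs = forward(params, x)
    loss = -np.mean(np.sum(y_onehot * np.log(probs + 1e-12), axis=1))

    # Backpropagation: apply the chain rule from the output back to the input.
    dz1 = (probs - y_onehot) / n
    grads = {"w1": a0.T @ dz1, "b1": dz1.sum(axis=0)}
    dz0 = (dz1 @ params["w1"].T) * (z0 > 0)    # ReLU passes gradient where z0 > 0
    grads["w0"] = x.T @ dz0
    grads["b0"] = dz0.sum(axis=0)

    # Each gradient points uphill, so move every parameter the opposite way.
    for name in params:
        params[name] -= learning_rate * grads[name]
    return float(loss)


# Tiny made-up problem: 8 samples, 5 inputs, 4 hidden neurons, 3 classes.
rng = np.random.default_rng(0)
x = rng.random((8, 5))
y_onehot = np.eye(3)[rng.integers(0, 3, size=8)]
params = {
    "w0": rng.standard_normal((5, 4)) * 0.5, "b0": np.zeros(4),
    "w1": rng.standard_normal((4, 3)) * 0.5, "b1": np.zeros(3),
}
losses = [train_step(params, x, y_onehot, learning_rate=0.1) for _ in range(50)]
print(losses[0], losses[-1])
```

The printed pair shows the loss at the first and last step, which is a quick way to check that the updates really move downhill.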
Then we test the model to make sure it is accurate, and finally we save its weights and biases for later use.
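The test-and-save step can be sketched roughly as below. `np.savez` and `np.load` are the standard NumPy calls for .npz files, but the key names stored inside the file and the helper names `accuracy`, `save_model` and `load_model` are assumptions for illustration. The example writes to a temporary folder so it does not touch the real weights.npz.

```python
import os
import tempfile

import numpy as np


def accuracy(predicted_labels, true_labels):
    """Fraction of predictions that match the true labels."""
    return float(np.mean(np.asarray(predicted_labels) == np.asarray(true_labels)))


def save_model(path, w0, b0, w1, b1):
    # Store each array under its own name inside a single .npz archive.
    np.savez(path, w0=w0, b0=b0, w1=w1, b1=b1)


def load_model(path):
    # Read every stored array back into a plain dictionary.
    with np.load(path) as data:
        return {name: data[name] for name in data.files}


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "weights.npz")
    save_model(path, np.ones((4, 3)), np.zeros(3), np.ones((3, 2)), np.zeros(2))
    restored = load_model(path)
    print(sorted(restored))
```

A script like manual_check.py would then only need the loading half, followed by a forward pass on the handwritten digit.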
Dr. Aviv Censor - Mathematics lecturer at the Technion, whose calculus and linear algebra playlists were an absolute delight to learn from.
Dr. Raj Abhijit Dandekar - whose YouTube lecture series on building Neural Networks from scratch was an invaluable guide throughout this project.
Neural Networks Playlist
Claude AI (Anthropic) & Gemini (Google) - AI assistants that helped debug, explain, and refine ideas during development.