CUDA Neural Network for MNIST Classification

A neural network for MNIST digit classification, implemented with CUDA and cuBLAS. The project demonstrates GPU-accelerated deep learning and reaches 97.5% accuracy on the MNIST test set and 98.5% on the training set.

Architecture

Input Layer: 784 neurons (28×28 flattened MNIST images)
Hidden Layer: 128 neurons with ReLU activation
Output Layer: 10 neurons with Softmax activation (digit classes 0-9)
Optimizer: SGD with momentum
Loss Function: Cross-entropy loss
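
The layer sizes above imply the following forward pass for a single image. This is a minimal host-side C++ sketch for reference only, not the project's CUDA/cuBLAS code. The function names (`dense`, `softmax_inplace`, `forward`), the bias vectors, and the row-major weight layout are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Layer sizes taken from the architecture list.
constexpr int kInputDim = 784;   // 28 x 28 flattened image
constexpr int kHiddenDim = 128;
constexpr int kOutputDim = 10;

// Numerically stable softmax: subtract the largest logit before exponentiating.
void softmax_inplace(std::vector<float>& z) {
    assert(!z.empty());
    const float max_z = *std::max_element(z.begin(), z.end());
    float sum = 0.0f;
    for (float& v : z) {
        v = std::exp(v - max_z);
        sum += v;
    }
    for (float& v : z) v /= sum;
}

// y = W x + b, with W stored row-major as rows x cols (an assumption of this sketch).
std::vector<float> dense(const std::vector<float>& W, const std::vector<float>& b,
                         const std::vector<float>& x, int rows, int cols) {
    assert(W.size() == static_cast<std::size_t>(rows) * cols);
    assert(b.size() == static_cast<std::size_t>(rows));
    assert(x.size() == static_cast<std::size_t>(cols));
    std::vector<float> y(rows);
    for (int r = 0; r < rows; ++r) {
        float acc = b[r];
        for (int c = 0; c < cols; ++c) {
            acc += W[static_cast<std::size_t>(r) * cols + c] * x[c];
        }
        y[r] = acc;
    }
    return y;
}

// 784 -> 128 (ReLU) -> 10 (softmax); returns the ten class probabilities.
std::vector<float> forward(const std::vector<float>& W1, const std::vector<float>& b1,
                           const std::vector<float>& W2, const std::vector<float>& b2,
                           const std::vector<float>& x) {
    std::vector<float> h = dense(W1, b1, x, kHiddenDim, kInputDim);
    for (float& v : h) v = std::max(v, 0.0f);  // ReLU
    std::vector<float> p = dense(W2, b2, h, kOutputDim, kHiddenDim);
    softmax_inplace(p);
    return p;
}
```

With biases included as assumed here, this network has 784 × 128 + 128 + 128 × 10 + 10 = 101,898 trainable parameters.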

Project Structure

├── utils.cuh              # Header file with utility function declarations
├── utils.cu               # Utility functions implementation
├── train.cu               # Training code
├── test.cu                # Testing code
├── create_mnist_dataset.py # Python script to generate binary data files from MNIST
├── Makefile              # Build config
└── README.md             # Read this.

🔧 Dependencies

System Requirements

NVIDIA GPU with CUDA support
CUDA Toolkit (tested with CUDA 11.0+)
cuBLAS library
GCC/G++ compiler
Python 3.x (for data preparation)

Note: This project was created in Coursera's Lab Environment.

Python Dependencies

pip install torch torchvision numpy

Getting Started

1. Prepare the MNIST Dataset

python create_mnist_dataset.py

This creates:

train_images.bin (60,000 × 784 float32)
train_labels.bin (60,000 × 10 float32)
test_images.bin (10,000 × 784 float32)
test_labels.bin (10,000 × 10 float32)
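
The shapes listed above suggest headerless float32 files, with labels stored as one-hot rows of 10 values. A loader under those assumptions could look like the sketch below. The helper names `load_float_matrix` and `label_of` are hypothetical, and the files are assumed to use the machine's native byte order; the project's own loading code in utils.cu may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Reads rows * cols raw float32 values from a binary file.
// Assumes no header and native byte order (both are assumptions of this sketch).
std::vector<float> load_float_matrix(const std::string& path,
                                     std::size_t rows, std::size_t cols) {
    std::ifstream in(path, std::ios::binary);
    if (!in) throw std::runtime_error("cannot open " + path);
    std::vector<float> data(rows * cols);
    const std::size_t bytes = data.size() * sizeof(float);
    in.read(reinterpret_cast<char*>(data.data()), static_cast<std::streamsize>(bytes));
    if (static_cast<std::size_t>(in.gcount()) != bytes) {
        throw std::runtime_error("unexpected size for " + path);
    }
    return data;
}

// Recovers the digit from a one-hot label row by taking the argmax.
int label_of(const std::vector<float>& labels, std::size_t sample,
             std::size_t num_classes = 10) {
    assert((sample + 1) * num_classes <= labels.size());
    std::size_t best = 0;
    for (std::size_t c = 1; c < num_classes; ++c) {
        if (labels[sample * num_classes + c] > labels[sample * num_classes + best]) {
            best = c;
        }
    }
    return static_cast<int>(best);
}
```

For example, `load_float_matrix("train_images.bin", 60000, 784)` expects a file of 60,000 × 784 × 4 = 188,160,000 bytes and throws if the file is shorter.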

2. Build the Project

make clean build

This compiles:

utils.o - Utility functions object file
train.exe - Training executable
test.exe - Testing executable

3. Train the Model

./train.exe

4. Test the Model

./test.exe

Model Features

CUDA Kernels

ReLU Activation: GPU-accelerated forward and backward pass
Cross-entropy Gradient: Parallel gradient computation
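
The math behind these two kernels is element-wise, which is why it parallelizes well: a GPU kernel can assign one array element to each thread. The host-side C++ reference below sketches that math only; it is not the project's kernel code. The function names and the division by batch size (a mean rather than a sum over the batch) are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// ReLU forward: out[i] = max(z[i], 0).
void relu_forward(const std::vector<float>& z, std::vector<float>& out) {
    out.resize(z.size());
    for (std::size_t i = 0; i < z.size(); ++i) {
        out[i] = z[i] > 0.0f ? z[i] : 0.0f;
    }
}

// ReLU backward: keep the upstream gradient only where the pre-activation was positive.
void relu_backward(const std::vector<float>& z, std::vector<float>& grad) {
    assert(z.size() == grad.size());
    for (std::size_t i = 0; i < z.size(); ++i) {
        if (z[i] <= 0.0f) grad[i] = 0.0f;
    }
}

// Gradient of the mean cross-entropy loss with respect to the logits,
// for a softmax output layer: dL/dz = (probs - one_hot_labels) / batch_size.
void softmax_xent_grad(const std::vector<float>& probs,
                       const std::vector<float>& labels,
                       std::size_t batch_size,
                       std::vector<float>& grad) {
    assert(batch_size > 0);
    assert(probs.size() == labels.size());
    grad.resize(probs.size());
    const float inv_batch = 1.0f / static_cast<float>(batch_size);
    for (std::size_t i = 0; i < probs.size(); ++i) {
        grad[i] = (probs[i] - labels[i]) * inv_batch;
    }
}
```

Combining softmax and cross-entropy into the single expression `probs - labels` avoids computing the softmax Jacobian explicitly.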

Training Features

Batch Processing: Configurable batch size (default: 64)
Momentum: SGD with momentum (default: 0.9)
Gradient Clipping: Clips gradients to a maximum norm of 1.0 to prevent gradient explosion
Weight Persistence: Automatic save/load of trained weights
He Initialization: Proper weight initialization for ReLU networks
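
As a rough illustration of three of the features above, here is a host-side C++ sketch of norm-based gradient clipping, a classic momentum update, and He initialization. The function names are hypothetical, and several details are assumptions rather than facts about train.cu: clipping is applied to one gradient buffer at a time, the momentum form is `v = momentum * v - lr * grad`, and He initialization draws from a normal distribution.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Rescales grad in place so its L2 norm is at most max_norm.
// Returns the norm measured before clipping.
float clip_by_norm(std::vector<float>& grad, float max_norm) {
    double sum_sq = 0.0;
    for (float g : grad) sum_sq += static_cast<double>(g) * g;
    const float norm = static_cast<float>(std::sqrt(sum_sq));
    if (norm > max_norm) {
        const float scale = max_norm / norm;
        for (float& g : grad) g *= scale;
    }
    return norm;
}

// Classic momentum update: v = momentum * v - lr * grad; w += v.
void sgd_momentum_step(std::vector<float>& w, std::vector<float>& v,
                       const std::vector<float>& grad, float lr, float momentum) {
    assert(w.size() == v.size() && w.size() == grad.size());
    for (std::size_t i = 0; i < w.size(); ++i) {
        v[i] = momentum * v[i] - lr * grad[i];
        w[i] += v[i];
    }
}

// He (Kaiming) normal initialization: zero mean, standard deviation sqrt(2 / fan_in).
std::vector<float> he_init(std::size_t fan_in, std::size_t fan_out, unsigned seed) {
    assert(fan_in > 0);
    std::mt19937 rng(seed);
    std::normal_distribution<float> dist(0.0f, std::sqrt(2.0f / static_cast<float>(fan_in)));
    std::vector<float> w(fan_in * fan_out);
    for (float& x : w) x = dist(rng);
    return w;
}
```

With the defaults listed above, a training step would call `clip_by_norm(grad, 1.0f)` followed by `sgd_momentum_step(w, v, grad, 0.005f, 0.9f)`.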

Performance Optimizations

cuBLAS Integration: Optimized matrix operations
Memory Management: Efficient GPU memory allocation
Column-major Storage: cuBLAS-compatible weight layout
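
cuBLAS follows the Fortran convention and treats matrices as column-major, so element (row, col) of a matrix with leading dimension `ld` lives at offset `row + col * ld`. The host-side C++ sketch below illustrates that indexing with a naive reference multiply; the helper names are hypothetical, and the project itself would pass such buffers to cuBLAS rather than loop on the CPU.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Offset of element (row, col) in a column-major matrix with leading dimension ld.
inline std::size_t col_major(std::size_t row, std::size_t col, std::size_t ld) {
    return row + col * ld;
}

// Reference C = A * B with every matrix stored column-major.
// A is m x k, B is k x n, and the returned C is m x n.
std::vector<float> gemm_col_major(const std::vector<float>& A,
                                  const std::vector<float>& B,
                                  std::size_t m, std::size_t k, std::size_t n) {
    assert(A.size() == m * k);
    assert(B.size() == k * n);
    std::vector<float> C(m * n, 0.0f);
    for (std::size_t j = 0; j < n; ++j) {
        for (std::size_t p = 0; p < k; ++p) {
            const float b = B[col_major(p, j, k)];
            for (std::size_t i = 0; i < m; ++i) {
                // The innermost loop walks down a column, which is contiguous in memory.
                C[col_major(i, j, m)] += A[col_major(i, p, m)] * b;
            }
        }
    }
    return C;
}
```

A row-major M × N buffer occupies the same memory as a column-major N × M matrix, which is why storing weights column-major from the start avoids extra transposes when calling cuBLAS.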

Configuration

Key parameters in train.cu:

const int batch_size = 64;        // Training batch size
const int hidden_dim = 128;       // Hidden layer neurons
const float learning_rate = 0.005f; // Learning rate
const float momentum = 0.9f;      // Momentum coefficient
const int epochs = 20;             // Training epochs
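
To see how these values interact with the dataset size, the small sketch below computes how many batches one epoch contains. The `EpochPlan` struct and `plan_epoch` function are hypothetical helpers; whether train.cu processes or drops the final partial batch is not stated here.

```cpp
#include <cassert>
#include <cstddef>

// Number of full batches in one epoch and the size of the leftover partial batch.
struct EpochPlan {
    std::size_t full_batches;
    std::size_t remainder;
};

EpochPlan plan_epoch(std::size_t num_samples, std::size_t batch_size) {
    assert(batch_size > 0);
    return EpochPlan{num_samples / batch_size, num_samples % batch_size};
}
```

With 60,000 training images and a batch size of 64, `plan_epoch(60000, 64)` gives 937 full batches plus a remainder of 32 samples. Over 20 epochs that is 18,740 weight updates if the partial batch is dropped, or 18,760 if it is processed.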

License

This project is licensed under the MIT License - see the LICENSE file for details. Feel free to modify and extend for your own learning and research.

Contributing

Suggestions for improvements:

Add more activation functions
Implement different optimizers (Adam, RMSprop)
Add regularization techniques (dropout, weight decay)
Support for different architectures
Visualization tools for training progress
