Authors: Sandy Tam, Christina Vo
This project explores how we can use machine learning models, specifically a CNN and logistic regression, to distinguish between cat breeds. By collecting, labeling, and training on a dataset, we will investigate what balance of dataset size, complexity, and generalization leads to the best-performing model. Furthermore, we will analyze which visual features and data characteristics most influence model decisions.
For our project outline, we will first search for datasets, specifically cat breed datasets containing images and characteristics. Once we have the datasets, we plan to train two models, a CNN on the image dataset and Logistic Regression on the characteristics dataset, to see which model performs best.
Sandy: I will gather high quality, diverse cat breed data from various sources such as Kaggle. Then, I will process the data to ensure consistency and split it into training and testing sets. I plan to use data augmentation techniques to increase the diversity of the dataset. This step is to prevent overfitting and allow the CNN to distinguish subtle differences between the different cat breeds. I will also manually procure and process some images for the validation set.
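The train/test split described above could be sketched as follows. This is a minimal, standard-library-only illustration, not the preprocessing notebook itself: the function name `stratified_split`, the 20% test fraction, and the toy file names are all assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_split(samples, test_fraction=0.2, seed=0):
    """Split (path, breed) pairs so each breed keeps roughly the same train/test ratio."""
    by_breed = defaultdict(list)
    for path, breed in samples:
        by_breed[breed].append(path)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for breed in sorted(by_breed):
        paths = sorted(by_breed[breed])
        rng.shuffle(paths)
        # Hold out at least one image per breed when the breed has more than one image.
        n_test = max(1, round(len(paths) * test_fraction)) if len(paths) > 1 else 0
        test += [(p, breed) for p in paths[:n_test]]
        train += [(p, breed) for p in paths[n_test:]]
    return train, test


# Toy file names standing in for real images (hypothetical).
samples = [(f"{breed}/{i}.jpg", breed) for breed in ("bengal", "siamese") for i in range(10)]
train, test = stratified_split(samples)
print(len(train), len(test))
```

Splitting per breed rather than over the whole list keeps rare breeds from ending up entirely in one of the two sets.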
Christina: When looking for cat breed datasets, there are two kinds of data I want to find. The first dataset needs to contain images so that breeds can be compared visually. The second needs to contain breed characteristics so that we can find the relationships that identify a breed.
Sandy: I plan to design and implement a simple CNN using PyTorch. The initial model will consist of two convolutional layers, followed by pooling layers, fully connected layers, and a softmax output layer. I will experiment with various kernel sizes, activation functions, and optimizers to find the best-performing model. To address training efficiency and overfitting, I will incorporate batch normalization and dropout layers. The model's performance will be evaluated using metrics such as accuracy and F1-score.
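As a rough illustration of the architecture described above, here is a minimal PyTorch sketch. It is not the model in cnn.ipynb: the class name `SimpleCatCNN`, the channel counts (16 and 32), the hidden size (128), and the choice of five classes are assumptions made for the example. The network returns raw logits because `nn.CrossEntropyLoss` applies the softmax internally during training; the softmax is applied explicitly only when probabilities are needed.

```python
import torch
from torch import nn


class SimpleCatCNN(nn.Module):
    """Two conv blocks followed by a small fully connected classifier."""

    def __init__(self, num_classes: int, kernel_size: int = 5, dropout: float = 0.5):
        super().__init__()
        padding = kernel_size // 2  # keeps height/width unchanged for odd kernel sizes
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size, padding=padding),
            nn.BatchNorm2d(16),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size, padding=padding),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(4),  # fixed 4x4 map, so any input resolution works
            nn.Flatten(),
            nn.Linear(32 * 4 * 4, 128),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(128, num_classes),
        )

    def forward(self, x):
        # Returns raw logits; nn.CrossEntropyLoss applies log-softmax internally.
        return self.classifier(self.features(x))


model = SimpleCatCNN(num_classes=5)
model.eval()  # disables dropout and uses running batch-norm statistics
with torch.no_grad():
    logits = model(torch.randn(2, 3, 224, 224))
    probs = torch.softmax(logits, dim=1)  # per-breed probabilities at prediction time
print(logits.shape, probs.sum(dim=1))
```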
Christina: I plan on using a Logistic Regression model for the characteristics dataset, as it can help find the relationships between features such as fur color, fur length, body weight, and other traits that identify the breed. I will evaluate the model with a train-test split and normalize the features so that the model performs well.
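The normalization step could look roughly like the sketch below. It assumes the features have already been converted to numbers (for example, a fur length code), and the toy values and the helper name `standardize` are made up for illustration. The key point is that the mean and standard deviation come from the training rows only, so no information from the test set leaks into training.

```python
import numpy as np


def standardize(train_X, test_X):
    """Scale features to zero mean and unit variance using training statistics only."""
    mean = train_X.mean(axis=0)
    std = train_X.std(axis=0)
    std[std == 0] = 1.0  # leave constant columns unchanged instead of dividing by zero
    return (train_X - mean) / std, (test_X - mean) / std


# Toy numeric features (hypothetical): [body weight in kg, fur length code]
X = np.array([[3.5, 0.0], [4.0, 1.0], [5.5, 2.0], [6.0, 1.0], [4.5, 0.0]])
rng = np.random.default_rng(1)
order = rng.permutation(len(X))  # shuffle before splitting
train_idx, test_idx = order[:4], order[4:]
train_X, test_X = standardize(X[train_idx], X[test_idx])
print(train_X.mean(axis=0).round(6), train_X.std(axis=0).round(6))
```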
To run the CNN model, you'll need the following Python packages:
- Python 3.x (recommended 3.8 or higher)
- PyTorch - Deep learning framework
pip install torch torchvision
- NumPy - Numerical computing
pip install numpy
- Matplotlib - Plotting and visualization
pip install matplotlib
If you have a compatible NVIDIA GPU and want to use CUDA acceleration:
- CUDA Toolkit (compatible with your PyTorch version)
- cuDNN
To install PyTorch with CUDA support, visit PyTorch's official website and select your configuration.
Run the preprocessing notebook and ensure your data is organized in the following structure:
CNN/
├── data/
│ ├── train/
│ │ ├── breed1/
│ │ ├── breed2/
│ │ └── ...
│ └── test/
│ ├── breed1/
│ ├── breed2/
│ └── ...
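Before training, it can help to confirm that the folders match this layout. The following is a small optional check, not part of the project notebooks; the helper names and the set of accepted image suffixes are assumptions.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed image formats


def count_images(data_dir):
    """Return {split: {breed: image_count}} for the train/ and test/ folders."""
    counts = {}
    for split in ("train", "test"):
        split_dir = Path(data_dir) / split
        if not split_dir.is_dir():
            raise FileNotFoundError(f"missing folder: {split_dir}")
        counts[split] = {
            breed_dir.name: sum(
                1 for f in breed_dir.iterdir() if f.suffix.lower() in IMAGE_SUFFIXES
            )
            for breed_dir in sorted(split_dir.iterdir())
            if breed_dir.is_dir()
        }
    return counts


def breeds_missing_from_test(counts):
    """Breeds that have a training folder but no test folder."""
    return sorted(set(counts["train"]) - set(counts["test"]))
```

Calling `count_images("CNN/data")` from the repository root would return the per-breed image counts, and `breeds_missing_from_test` flags breeds the model would be trained on but never evaluated on.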
Open cnn.ipynb and adjust the hyperparameters as needed:
- BATCH_SIZE: Number of samples per batch (default: 32)
- LEARNING_RATE: Learning rate for optimizer (default: 0.0001)
- NUM_EPOCHS: Number of training epochs (default: 50)
- KERNEL_SIZE: Convolutional kernel size (default: 5)
- TARGET_SIZE: Image resize dimensions (default: 224x224)
You can run the CNN model in two ways:
Option A: Using Jupyter Notebook (Recommended)
- Navigate to the CNN directory
- Launch Jupyter Notebook:
jupyter notebook
- Open cnn.ipynb
- Run all cells sequentially
Option B: Converting to Python Script
If you prefer to run it as a script:
- Convert the notebook to a Python script:
jupyter nbconvert --to script cnn.ipynb
- Run the script:
python cnn.py
The model automatically saves checkpoints during training. To resume from the last checkpoint:
- Set START_FRESH = False in the configuration cell
- The model will automatically load the most recent checkpoint from the checkpoints/ directory
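For readers curious how "most recent checkpoint" can be resolved, here is a minimal sketch. It is an assumption about the mechanism, not the notebook's actual code: the `*.pth` pattern and the use of file modification time are guesses, and the commented-out `torch.load` line is only a reminder that the saved contents depend on how the checkpoint was written.

```python
from pathlib import Path


def latest_checkpoint(checkpoint_dir="checkpoints", pattern="*.pth"):
    """Return the most recently modified checkpoint file, or None if there is none."""
    directory = Path(checkpoint_dir)
    if not directory.is_dir():
        return None
    candidates = list(directory.glob(pattern))
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)


START_FRESH = False
checkpoint = None if START_FRESH else latest_checkpoint()
if checkpoint is None:
    print("No checkpoint found; training starts from the beginning.")
else:
    print(f"Resuming from {checkpoint}")
    # state = torch.load(checkpoint, map_location="cpu")  # keys depend on how it was saved
```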
After training completes:
- The trained model is saved as cat_breed_cnn.pth
- Training history is saved as training_history.json
- Checkpoints are saved in the checkpoints/ directory
- Training/validation loss plots are displayed in the notebook
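The saved history can also be inspected outside the notebook. The sketch below assumes the JSON holds one list of per-epoch values under keys named `train_loss` and `val_loss`; those key names and the numbers are made up for illustration, so check the real file before relying on them.

```python
import json


def best_epoch(history, key="val_loss"):
    """Return (1-based epoch, value) where the given loss is lowest."""
    values = history[key]
    index = min(range(len(values)), key=values.__getitem__)
    return index + 1, values[index]


# Stand-in for the contents of training_history.json; key names and values are assumed.
history = json.loads('{"train_loss": [1.9, 1.2, 0.8, 0.5], "val_loss": [1.8, 1.3, 1.1, 1.4]}')
epoch, loss = best_epoch(history)
print(f"lowest validation loss {loss} at epoch {epoch}")
```

A validation loss that starts rising while the training loss keeps falling, as in the toy numbers above, is a common sign of overfitting.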
The model automatically detects and uses available hardware:
- GPU (CUDA/MPS): If available, training will use GPU acceleration
- CPU: Falls back to CPU if no GPU is detected
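A device check along these lines is a common PyTorch pattern; the sketch below is an illustration of the fallback order described above, not necessarily the exact code in the notebook. The helper name `pick_device` is an assumption.

```python
import torch


def pick_device():
    """Prefer CUDA, then Apple MPS, then fall back to CPU."""
    if torch.cuda.is_available():
        return torch.device("cuda")
    mps = getattr(torch.backends, "mps", None)  # absent on older PyTorch builds
    if mps is not None and mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")


device = pick_device()
x = torch.zeros(2, 2, device=device)  # tensors and the model must live on the same device
print("Using device:", device)
```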
To force CPU usage or check device:
device = torch.device("cpu") # Force CPU
# or
print("Using device:", device) # Check current device
- Out of Memory Error: Reduce BATCH_SIZE
- Slow Training: Check if GPU is being utilized, or reduce image resolution
- Poor Performance: Try adjusting learning rate, increasing epochs, or experimenting with different kernel sizes
- Checkpoint Issues: Delete the checkpoints/ folder to start completely fresh
To run the Logistic Regression model, you will need the following packages:
- Python 3.x (recommended 3.8 or higher)
- NumPy
pip install numpy
- Matplotlib
pip install matplotlib
- Pandas
pip install pandas
- Scikit-learn
pip install scikit-learn
Run preprocessing.ipynb and ensure that the cleaned dataset is organized in the following structure:
LogisticRegression/
├── data/
│ ├── cat_breeds_dirty
│ └── cat_breeds_clean
Open Cat Breed Classifier (LR).ipynb and change the parameters as needed to test the model:
- learning_rate: Learning rate of the optimizer (default: 0.1)
- n_iters: Number of iterations (default: 1000)
- random_seed: Seed for the random number generator (default: 1)
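To show how these three parameters typically interact, here is a NumPy sketch of multiclass logistic regression trained with full-batch gradient descent. It is an illustration under assumptions, not the notebook's implementation: the function names, the softmax (multiclass) formulation, and the synthetic three-cluster data are all made up for the example.

```python
import numpy as np


def train_softmax_regression(X, y, n_classes, learning_rate=0.1, n_iters=1000, random_seed=1):
    """Fit multiclass logistic regression with full-batch gradient descent."""
    rng = np.random.default_rng(random_seed)  # the seed only affects the initial weights
    n_samples, n_features = X.shape
    W = rng.normal(scale=0.01, size=(n_features, n_classes))
    b = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]  # one-hot targets
    for _ in range(n_iters):
        logits = X @ W + b
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = (probs - Y) / n_samples  # gradient of mean cross-entropy w.r.t. logits
        W -= learning_rate * (X.T @ grad)
        b -= learning_rate * grad.sum(axis=0)
    return W, b


def predict(X, W, b):
    """Return the index of the highest-scoring class for each row."""
    return np.argmax(X @ W + b, axis=1)


# Toy, well-separated data: three clusters standing in for three breeds.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
y = np.repeat(np.arange(3), 30)
X = centers[y] + rng.normal(scale=0.5, size=(90, 2))
W, b = train_softmax_regression(X, y, n_classes=3)
print("training accuracy:", (predict(X, W, b) == y).mean())
```

A larger learning_rate takes bigger steps per iteration and may need fewer iterations, but it can also overshoot; n_iters bounds how long the optimizer runs.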
Run the model using Jupyter Notebook
- Navigate to the LogisticRegression directory
- Launch Jupyter Notebook:
jupyter notebook
- Open Cat Breed Classifier (LR).ipynb
- Run all cells sequentially
Results will be under the last cell of Cat Breed Classifier (LR).ipynb
Week of 10/21 and 10/23 - Search for Datasets
Week of 10/28 and 10/30 - Clean and Preprocess Datasets
Week of 11/4 and 11/6 - Code and Train Models
Week of 11/11 and 11/13 - Analyze Results
Week of 11/18 and 11/20 - Create Presentation and Final Project Meetings
Week of 11/25 and 11/27 - Prepare for Project Presentation
Week of 12/2 and 12/4 - Final Project Presentations
If we were to continue working on this project, we would pursue the following enhancements:
- Transfer Learning: Implement pre-trained models (ResNet, VGG16, EfficientNet) and fine-tune them on our cat breed dataset to leverage learned features from larger datasets
- Attention Mechanisms: Add attention layers to help the model focus on distinguishing features like ear shape, face structure, and fur patterns
- Better Dataset: Expand the dataset with more images per breed, use higher quality images, and include rarer breeds
- Web Interface: Create a user-friendly web application for breed prediction with confidence scores
- Age and Gender Prediction: Extend model to predict additional cat characteristics
- Better Dataset: Expand the dataset to cover more breeds
- Input Variables: Add more input features, such as eye color and weight
- Hyperparameter Tuning: Vary the hyperparameters, such as the number of iterations, and test which values perform best
- Model: See whether a different model can outperform Logistic Regression