NOTE: Ignore the commit history. I am experimenting with automating model training through GitHub Actions and with logging experiments and models to WandB 🗿
This project is my journey into learning MLOps, since my past professional experience made little use of it. The model is trained on a publicly available trash classification dataset.
Available features:
- Cross Validation: Runs cross-validation with pre-defined folds, learning rate (LR), and batch size. A plot of loss and accuracy is saved to help decide whether the chosen hyperparameters are worth using. Sample cross-validation results:

- Full Training: Once a set of hyperparameters is found to work for the deep learning model (ConvNextV2), the model can be trained on the whole training dataset. WandB logging is optional and records several metrics, including the confusion matrix for both the training and validation datasets. A confusion matrix plot and a sample of misclassified data for each possible classification are saved. Example of a misclassification visualization:

- Plot Misclassified Manually: Shows which data was classified as X when it should have been classified as Y.

- GradCAM: Shows which features the model 'sees' when it assigns the data to some class X. Example of GradCAM on Metal data that was classified as Glass:

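For readers new to GradCAM, the arithmetic behind the heatmap can be sketched without any framework. This is a minimal illustration, not the code used in this repository: the function name `grad_cam` and the nested-list inputs are assumptions made for the example. In practice the activations and gradients would come from a convolutional layer of the model.

```python
def grad_cam(activations, gradients):
    """Compute a Grad-CAM heatmap for one convolutional layer.

    activations[k][i][j]: feature map k at spatial position (i, j).
    gradients[k][i][j]: gradient of the target class score with respect
    to that activation.
    Returns an H x W heatmap scaled to the range [0, 1].
    """
    height, width = len(activations[0]), len(activations[0][0])

    # Channel weight = spatial average of the gradients of that channel.
    weights = [
        sum(sum(row) for row in grad_map) / (height * width)
        for grad_map in gradients
    ]

    # Weighted sum of the feature maps, followed by ReLU.
    cam = [
        [
            max(0.0, sum(w * fmap[i][j] for w, fmap in zip(weights, activations)))
            for j in range(width)
        ]
        for i in range(height)
    ]

    # Normalise so the strongest location is 1.0 (skipped for an all-zero map).
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[value / peak for value in row] for row in cam]
    return cam
```

The resulting heatmap is usually upsampled to the input size and overlaid on the image, so bright regions mark the areas that pushed the model towards the target class.
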
WandB is used for logging the metrics, and it is optional. If you want to use it, make sure you are logged in to WandB on your local machine and change the name in the config to the one you want.
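
One way to keep the logging optional is to start a run only when the config enables it. The sketch below is an assumption about how this could look, not this repository's actual code: the config keys `use_wandb` and `wandb_project` and the helper name are hypothetical. `wandb.init` is the library's real entry point, and it expects that `wandb login` has already been run on the machine.

```python
def init_wandb_if_enabled(config):
    """Start a WandB run only when the config asks for it; otherwise return None."""
    if not config.get("use_wandb", False):
        return None  # logging disabled: training proceeds without WandB

    # Imported lazily so the package is only required when logging is enabled.
    import wandb

    # Requires a prior `wandb login` on this machine.
    return wandb.init(project=config["wandb_project"], config=config)


# Hypothetical config with logging switched off; the values are placeholders.
run = init_wandb_if_enabled({"use_wandb": False, "lr": 1e-4, "batch_size": 32})
```

With logging switched off, the helper returns `None`, so the training loop can guard every logging call with a simple `if run is not None:` check.
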
The best-performing model is available in the model registry on WandB. Note: Selecting the best-performing model should be automated, but since my time is currently limited, I choose the model manually on the website.
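
The selection step mentioned in the note could be automated along these lines. This is only a sketch under stated assumptions: each run is represented as a plain dictionary, and the metric name `val_accuracy` and the function name `select_best_run` are hypothetical. Fetching the runs from WandB and promoting the winner in the registry are left out.

```python
def select_best_run(runs, metric="val_accuracy"):
    """Return the run with the highest value of `metric`.

    Runs that do not report the metric are ignored.
    """
    scored = [run for run in runs if metric in run]
    if not scored:
        raise ValueError(f"no run reports the metric {metric!r}")
    return max(scored, key=lambda run: run[metric])
```

Choosing on a validation metric rather than a training metric avoids promoting a model that has merely overfitted the training data.
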

The best model can be accessed through Hugging Face.
