
mamadinho/trashnet-mlops


TrashNet-MLOps

NOTE: Ignore the commit history. I am trying to automate model training with GitHub Actions and to log experiments and models with Weights & Biases (W&B) 🗿

A journey of learning MLOps, since my past professional experience did not make much use of it. A publicly available Trash Classification Dataset is used as the training dataset.

Available features:

  • Cross Validation: Runs cross validation with pre-defined folds, learning rate (LR), and batch size. A plot of loss and accuracy is saved to help decide whether the chosen hyperparameters are worth pursuing. Sample of cross validation results: image
  • Full Training: Once a set of hyperparameters works for the deep learning model (ConvNeXt V2), we can train on the whole training dataset. W&B logging is optional and records several metrics, including the confusion matrix for both the training and validation datasets. A plot of the confusion matrix and a sample of misclassified data for each possible classification are saved. Example of the misclassified visualization: image
  • Plot Misclassified Manually: Shows which data is classified as X when it should be classified as Y. image
  • GradCAM: Shows which features the model 'sees' when it assigns the data to some class X. Example of GradCAM on Metal data that was classified as Glass: image
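The confusion matrix and the "classified as X when it should be Y" lookup described above can be sketched in a few lines. This is a minimal illustration, not the code in this repository: the function names `confusion_counts` and `misclassified_as` are hypothetical, and the labels are toy data.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Count how often each (true label, predicted label) pair occurs."""
    return Counter(zip(y_true, y_pred))


def misclassified_as(y_true, y_pred, true_label, pred_label):
    """Return indices of samples whose true class is `true_label`
    but which were predicted as `pred_label`."""
    return [
        i
        for i, (t, p) in enumerate(zip(y_true, y_pred))
        if t == true_label and p == pred_label
    ]


# Toy labels for illustration only (not real results).
y_true = ["metal", "glass", "metal", "paper", "metal"]
y_pred = ["glass", "glass", "metal", "paper", "glass"]

counts = confusion_counts(y_true, y_pred)
print(counts[("metal", "glass")])                          # Metal predicted as Glass
print(misclassified_as(y_true, y_pred, "metal", "glass"))  # indices to plot
```

The returned indices can then be used to pick the images to plot for a given (Y, X) pair.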

W&B is used for logging the metrics, and it is optional. If you want to use it, make sure you are logged in to W&B on your local machine and change the name in the config to the one that you want.
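One way to keep Weights & Biases (W&B) logging optional is to import `wandb` only when it is enabled. The sketch below assumes this pattern; the `MetricLogger` class, the project name, and the metric values are hypothetical placeholders, not this repository's actual code or config. It relies only on `wandb.init`, `wandb.log`, and `wandb.finish`.

```python
class MetricLogger:
    """Keeps metrics in memory and, if enabled, also sends them to W&B."""

    def __init__(self, use_wandb=False, project="my-project", config=None):
        self.history = []
        self.run = None
        if use_wandb:
            # Imported lazily so wandb is only required when logging is on.
            import wandb

            self._wandb = wandb
            self.run = wandb.init(project=project, config=config or {})

    def log(self, metrics, step=None):
        # Always keep a local copy; forward to W&B only if a run is active.
        self.history.append((step, dict(metrics)))
        if self.run is not None:
            self._wandb.log(metrics, step=step)

    def finish(self):
        if self.run is not None:
            self._wandb.finish()


# Placeholder values for illustration only.
logger = MetricLogger(use_wandb=False)
logger.log({"train_loss": 0.42, "val_accuracy": 0.91}, step=1)
logger.finish()
```

With `use_wandb=True`, running `wandb login` beforehand is what "logged in on your local machine" refers to.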

image

The best performing model is available in the W&B model registry. Note: Selecting the best performing model should be automated, but since my time is currently limited, I choose the model manually through the website. image
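The automated selection mentioned in the note could start from something like the sketch below. It assumes each run has already been summarized as a dict with a validation metric; the `select_best_run` helper, the `val_accuracy` key, and the run names are hypothetical, and fetching runs from W&B or promoting a model in the registry is left out.

```python
def select_best_run(runs, metric="val_accuracy", higher_is_better=True):
    """Pick the run with the best value of `metric`, skipping runs without it."""
    scored = [r for r in runs if metric in r]
    if not scored:
        raise ValueError(f"no run reports the metric {metric!r}")
    pick = max if higher_is_better else min
    return pick(scored, key=lambda r: r[metric])


# Made-up run summaries for illustration only.
runs = [
    {"name": "run-a", "val_accuracy": 0.88},
    {"name": "run-b", "val_accuracy": 0.93},
    {"name": "run-c"},  # e.g. a crashed run with no metric
]

best = select_best_run(runs)
print(best["name"])
```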

The best model can also be accessed through Hugging Face.
