This project implements a machine learning–based recommendation system for fitness supplements using real-world observational data. The system predicts expected weight change, body fat change, and performance improvement for gym users based on their profile, training habits, and supplement usage, and then recommends the most suitable supplement aligned with a user’s fitness goal.
Key Features:
- Predicts weight, body fat, and performance changes
- Recommends supplements based on user goals
- Emphasizes interpretability and explainability
Dataset:
- Source: Kaggle – Fitness Supplement Effect Tracking Dataset
- Size: 3,788 observations, 17 original features
- Time period: January–December 2023
The dataset contains anonymized questionnaire and fitness app–tracking data, including:
- Demographics
- Training frequency and type
- Diet type and fitness level
- Supplement usage details
- Observed changes in weight, body fat percentage, and performance
⚠️ Note: The dataset is not included in this repository due to licensing and size considerations. You must download it manually from Kaggle.
The objective is to build a Digital Personal Trainer that recommends supplements aligned with a user’s primary fitness goal:
- Muscle gain
- Fat loss
- Performance improvement
Recommendations are based on observed data patterns rather than arbitrary rules and include interpretable insights.
Given a user profile and supplement configuration, the system predicts:
- Weight change (kg)
- Body fat change (%)
- Performance improvement (%)
The recommendation engine evaluates all supplements and selects the optimal option based on the user’s goal.
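The selection step described above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the goal keys, the target names, the `rank_supplements` helper, and the rule that maps each goal to a single predicted target are all assumptions, and `toy_predict` returns placeholder numbers rather than model output.

```python
from typing import Callable, Dict, List, Tuple

Prediction = Dict[str, float]

# Hypothetical mapping: goal -> (predicted target to optimize, direction).
# +1 means "higher is better", -1 means "lower is better".
GOAL_OBJECTIVES: Dict[str, Tuple[str, int]] = {
    "muscle_gain": ("weight_change_kg", +1),
    "fat_loss": ("body_fat_change_pct", -1),
    "performance": ("performance_improvement_pct", +1),
}


def rank_supplements(
    profile: dict,
    supplements: List[str],
    predict: Callable[[dict, str], Prediction],
    goal: str,
) -> List[Tuple[str, Prediction]]:
    """Predict outcomes for every supplement and rank them for the given goal."""
    target, direction = GOAL_OBJECTIVES[goal]
    scored = [(name, predict(profile, name)) for name in supplements]
    # Best candidate first: multiply by direction so one sort handles both cases.
    return sorted(scored, key=lambda item: direction * item[1][target], reverse=True)


def toy_predict(profile: dict, supplement: str) -> Prediction:
    """Stand-in for the trained regressors; the values are placeholders only."""
    table = {
        "supplement_a": {"weight_change_kg": 1.0, "body_fat_change_pct": -0.2,
                         "performance_improvement_pct": 3.0},
        "supplement_b": {"weight_change_kg": 0.2, "body_fat_change_pct": -1.0,
                         "performance_improvement_pct": 2.0},
        "supplement_c": {"weight_change_kg": 0.1, "body_fat_change_pct": -0.1,
                         "performance_improvement_pct": 5.0},
    }
    return table[supplement]


candidates = ["supplement_a", "supplement_b", "supplement_c"]
ranking = rank_supplements({"age": 30}, candidates, toy_predict, "fat_loss")
best_name, best_prediction = ranking[0]
```

Returning the full ranking rather than only the top entry keeps the predicted outcomes for every supplement available for display next to the recommendation.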
Preprocessing:
- Textual age and height ranges were converted to numeric values
- Supplement and supplement type were merged into a single categorical feature
- Categorical variables were encoded using one-hot encoding
- Numerical variables were kept on their natural scale
- The core variables contained no missing values
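The first three preprocessing steps could look roughly like the sketch below. It uses only the standard library so it runs anywhere; in the notebook the same steps would more likely operate on a pandas DataFrame (for example with `pandas.get_dummies` for the one-hot step). The range format (`"18-24"`), the midpoint rule, the merged label format, and all function names are assumptions for illustration.

```python
import re
from typing import Dict, List


def range_to_midpoint(text: str) -> float:
    """Average the numbers found in a textual range, e.g. '18-24' -> 21.0.

    A value with a single number (such as '55+') maps to that number.
    """
    numbers = [float(n) for n in re.findall(r"\d+(?:\.\d+)?", text)]
    if not numbers:
        raise ValueError(f"no numeric value found in {text!r}")
    return sum(numbers) / len(numbers)


def merge_supplement(name: str, kind: str) -> str:
    """Combine supplement name and supplement type into one categorical label."""
    return f"{name.strip()} ({kind.strip()})"


def one_hot(values: List[str]) -> List[Dict[str, int]]:
    """One-hot encode a categorical column: one 0/1 indicator per category."""
    categories = sorted(set(values))
    return [{c: int(v == c) for c in categories} for v in values]


ages = [range_to_midpoint(a) for a in ["18-24", "25-34"]]
label = merge_supplement("Creatine", "Monohydrate")
encoded = one_hot(["vegan", "omnivore", "vegan"])
```

Taking the midpoint is only one way to make a range numeric; the lower bound or an ordinal rank would also work, and the choice mainly matters for the linear model.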
Three regression algorithms were trained for each target variable:
- Linear Regression
- Random Forest Regressor
- Gradient Boosting Regressor
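Model evaluation used 5-fold cross-validation with the following metrics:
- MAE (mean absolute error)
- RMSE (root mean squared error)
- R² (coefficient of determination)
- Adjusted R²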
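For reference, the four metrics can be written out in a few lines of plain Python. These are the standard textbook definitions, not code taken from the notebook; in particular, which sample count and feature count are passed to `adjusted_r2` (for example the size of the evaluation fold and the number of columns after one-hot encoding) is an assumption left to the caller.

```python
import math
from typing import Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def adjusted_r2(r2_value: float, n_samples: int, n_features: int) -> float:
    """Adjusted R²: penalizes R² for the number of predictors used."""
    return 1.0 - (1.0 - r2_value) * (n_samples - 1) / (n_samples - n_features - 1)


y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 5.0]
scores = {"mae": mae(y_true, y_pred), "rmse": rmse(y_true, y_pred), "r2": r2(y_true, y_pred)}
```

Adjusted R² is the metric most affected by one-hot encoding, because every added indicator column raises the predictor count in the denominator.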
Key results:
- Random Forest consistently achieved the best performance across all targets
- Tree-based models significantly outperformed Linear Regression
- Recommendations aligned with real-world fitness practices:
  - Creatine → muscle gain
  - Pre-workout → performance boost
  - L-carnitine → fat loss
An interactive Jupyter widget allows users to input their profile and receive ranked supplement recommendations with predicted outcomes.
Repository structure:
.
├── project_ML.ipynb # Main Jupyter Notebook (analysis, modeling, evaluation)
├── README.md # Project documentation
└── .gitignore
Getting started:
- Clone the repository:
    git clone https://github.com/edbajric/SupplementRecsML.git
    cd SupplementRecsML
- Download the dataset:
  - Go to the Kaggle page for the Fitness Supplement Effect Tracking Dataset
  - Download the CSV file
  - Place it in the project directory (or update the notebook path accordingly)
- Open the notebook:
    jupyter notebook project_ML.ipynb
- Run all cells from top to bottom to reproduce preprocessing, model training, evaluation, and recommendations.
The project uses standard Python data science libraries, including:
- pandas
- numpy
- scikit-learn
- matplotlib / seaborn
- ipywidgets
(Exact versions are not pinned; standard recent versions are sufficient.)
Limitations:
- The dataset is observational, not experimental
- Many influential factors are not captured (sleep, genetics, injuries, caloric intake, etc.)
- Predictions should be treated as supportive guidance, not medical or nutritional advice
- Recommendations do not replace professional consultation
References:
- Kaggle. (2023). Fitness Supplement Effect Tracking Dataset.
- P. C. Magalhães et al. (2025). Machine learning classification of consumption habits of creatine supplements in gym goers. RBNE – Brazilian Journal of Sports Nutrition, 19(114):1–13.
- J. Wang, C. He, and Z. Long. (2023). Establishing a machine learning model for predicting nutritional risk through facial feature recognition. Frontiers in Nutrition, 10:1219193. doi:10.3389/fnut.2023.1219193