🤖 Machine Learning — Bhavya Kansal

A Live, Production-Grade ML Knowledge Repository

From raw data to trained models — every algorithm, every experiment, documented.

🏢 Maintained and updated by: Bhavya Kansal | 🌐 Visit: bhavyakansal.dev | 📍 Patiala, Punjab, India

📋 Table of Contents

About This Repository
Who Is This For?
Tech Stack
Notebook Index
Getting Started
Repository Roadmap
Datasets
Acknowledgements
Contributing
Legal & License
Contact

🧠 About This Repository

This is not just a notebook dump; it is a structured, continuously updated ML knowledge base maintained by Bhavya Kansal, an AI/ML Engineer and Developer. The repository was built to provide a structured, beginner-to-advanced understanding of Machine Learning.

Every notebook in this repository:

Is written from scratch with clean, readable code
Covers theory + implementation — not just copy-paste code
Is beginner-friendly — designed so anyone can open it and understand it
Reflects real internship and coursework experiments, not toy examples

This repository is actively maintained and updated regularly with new algorithms, projects, and experiments as the learning journey progresses.

👥 Who Is This For?

Audience	How This Helps
🎓 Beginners	Learn ML concepts step-by-step with clean, documented code
🔬 Students	Reference implementations for assignments and understanding
💼 Practitioners	Quick refresher notebooks for standard algorithms
🧑‍💻 Developers	Baseline scikit-learn patterns to build production models from

⚙️ Tech Stack

Tool	Purpose
Python	Core programming language
NumPy	Numerical computing & array ops
Pandas	Data manipulation & analysis
Matplotlib	Data visualization
Seaborn	Statistical data visualization
scikit-learn	Machine Learning algorithms
Jupyter	Interactive notebooks

📓 Notebook Index

All notebooks are self-contained and can be opened directly on GitHub or run locally. Click any notebook name to open it.

📦 Data Preprocessing & Feature Engineering

The foundation of every ML pipeline — cleaning, transforming, and preparing raw data.

#	Notebook	Concepts Covered
1	Data Preprocessing	Missing values, data cleaning, pipelines
2	Scikit-Learn Data Preprocessing	sklearn preprocessing transformers
3	Encoding	Label encoding, One-Hot encoding
4	Feature Scaling	StandardScaler, MinMaxScaler, normalization
5	Function Transformation	Log, sqrt, box-cox transformations
6	Feature Elimination	Removing irrelevant/redundant features
7	Outlier Detection	IQR, Z-score, visualizing outliers
8	Outlier Analysis (Custom)	Custom outlier detection experiments
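
As a rough illustration of the IQR and Z-score rules listed for the outlier notebooks, here is a minimal NumPy sketch. It is not taken from the notebooks themselves; the function names, the sample data, and the cutoffs (k=1.5, threshold=2.0) are assumptions chosen for the example.

```python
import numpy as np


def iqr_outlier_mask(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (k=1.5 is a common convention)."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return (values < q1 - k * iqr) | (values > q3 + k * iqr)


def zscore_outlier_mask(values, threshold=2.0):
    """Flag values whose absolute z-score exceeds the threshold."""
    values = np.asarray(values, dtype=float)
    std = values.std()  # population standard deviation (ddof=0)
    if std == 0:  # constant column: nothing can be an outlier
        return np.zeros(values.shape, dtype=bool)
    return np.abs((values - values.mean()) / std) > threshold


# Hypothetical sample: seven ordinary readings and one extreme value.
data = [10, 12, 11, 13, 12, 11, 10, 95]
iqr_mask = iqr_outlier_mask(data)
z_mask = zscore_outlier_mask(data)
print("IQR outliers:    ", np.asarray(data)[iqr_mask])
print("Z-score outliers:", np.asarray(data)[z_mask])
```

With the population standard deviation, the largest z-score a single point can reach in a sample of n values is sqrt(n - 1), roughly 2.65 for these eight points. The common cutoff of 3 would therefore flag nothing here, which is why the sketch uses 2.0.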

📊 Data Analysis & Visualization

Exploring data with NumPy, Pandas, and the classic Iris dataset.

#	Notebook	Concepts Covered
1	NumPy Basics	Arrays, operations, broadcasting
2	Pandas Project	DataFrames, groupby, EDA workflow
3	NumPy with Iris Dataset	NumPy analysis on real dataset
4	Iris Data Exploration	EDA, pairplots, correlation heatmaps
5	ML Fundamentals	Core ML workflow introduction
6	Train-Test Split	Proper data splitting strategies
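
To make the train-test split entry concrete, the sketch below splits the Iris dataset with stratification so each class keeps the same share in both parts. It is an illustrative example, not code copied from the notebook; the 80/20 ratio and random_state=42 are assumptions.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Iris ships with scikit-learn: 150 samples, 4 features, 3 balanced classes.
X, y = load_iris(return_X_y=True)

# stratify=y keeps the class proportions identical in train and test.
X_train, X_test, y_train, y_test = train_test_split(
    X,
    y,
    test_size=0.2,    # assumed ratio for this example
    stratify=y,
    random_state=42,  # fixed seed so the split is reproducible
)

print("train:", X_train.shape, "class counts:", np.bincount(y_train))
print("test: ", X_test.shape, "class counts:", np.bincount(y_test))
```

Without stratify, a random split of a small dataset can leave one class under-represented in the test set, which makes the reported accuracy less trustworthy.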

📈 Regression Algorithms

Predicting continuous values — from simple lines to complex polynomial curves.

#	Notebook	Concepts Covered
1	Simple Linear Regression	OLS, slope/intercept, R² score
2	Multiple Linear Regression	Multi-feature regression, multicollinearity
3	Polynomial Regression	Degree tuning, overfitting demo
4	KNN Regression	K-Neighbors for continuous output
5	Ridge Regularisation	L2 penalty, reducing overfitting
6	Lasso Regularisation	L1 penalty, automatic feature selection
7	Polynomial Logistic	Logistic with polynomial features
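
The difference between the L2 and L1 penalties listed above can be seen on synthetic data. The following is a minimal sketch under stated assumptions: the data, the true coefficients, and the alpha values are invented for the example and do not come from the notebooks.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

# Synthetic data: only the first two of six features influence the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
true_coef = np.array([4.0, -3.0, 0.0, 0.0, 0.0, 0.0])
y = X @ true_coef + rng.normal(scale=0.5, size=200)

# Ridge (L2) shrinks every coefficient but leaves them non-zero.
ridge = Ridge(alpha=1.0).fit(X, y)

# Lasso (L1) can drive coefficients of irrelevant features exactly to zero.
lasso = Lasso(alpha=0.5).fit(X, y)

print("Ridge coefficients:", np.round(ridge.coef_, 3))
print("Lasso coefficients:", np.round(lasso.coef_, 3))
print("Features kept by Lasso:", np.flatnonzero(lasso.coef_))
```

The exact zeros are what "automatic feature selection" refers to: features with a zero coefficient drop out of the model entirely.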

🔷 Classification Algorithms

Teaching machines to sort, label, and decide.

#	Notebook	Concepts Covered
1	Logistic Regression — Part 1	Binary classification, sigmoid, threshold
2	Logistic Regression — Part 2	Advanced logistic, multi-solver comparison
3	KNN Classification	K selection, euclidean distance, boundaries
4	Decision Tree Classification	Gini, entropy, tree visualization
5	Multiclass Classification	OvR, OvO strategies
6	Naive Bayes	Gaussian NB, conditional probability
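
The sigmoid and threshold ideas behind logistic regression can be sketched in a few lines of NumPy. This is an illustrative example rather than the notebook code; the scores and the alternative threshold of 0.8 are assumptions.

```python
import numpy as np


def sigmoid(z):
    """Map any real-valued score to a probability in (0, 1)."""
    z = np.asarray(z, dtype=float)
    return 1.0 / (1.0 + np.exp(-z))


def classify(scores, threshold=0.5):
    """Predict class 1 when the sigmoid probability reaches the threshold."""
    return (sigmoid(scores) >= threshold).astype(int)


# Hypothetical linear scores (w.x + b) for four samples.
scores = np.array([-2.0, 0.0, 1.0, 3.0])

print("probabilities: ", np.round(sigmoid(scores), 3))
print("threshold 0.5: ", classify(scores))
print("threshold 0.8: ", classify(scores, threshold=0.8))
```

Raising the threshold makes the model more conservative about predicting the positive class, trading recall for precision.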

🧮 Support Vector Machines (SVM)

Margin maximization — one of the most powerful classical ML algorithms.

#	Notebook	Concepts Covered
1	Linear SVM	Hard/soft margin, C parameter
2	Polynomial SVM	Kernel trick with polynomial kernel
3	SVM Regression	SVR, epsilon-tube, continuous prediction
4	Polynomial Regression SVM	Combining polynomial features with SVR

🌲 Tree-Based Models

Decision boundaries built like trees — interpretable and powerful.

#	Notebook	Concepts Covered
1	Decision Tree Regression	MSE-based splits, tree depth control
2	Pre & Post Pruning	ccp_alpha, max_depth, preventing overfit
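
Pre-pruning (max_depth) and post-pruning (ccp_alpha) can be compared side by side. The sketch below uses a classifier on the built-in Iris dataset purely for illustration; the dataset choice and the values max_depth=2 and ccp_alpha=0.02 are assumptions, not settings from the notebooks.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Unrestricted tree: grows until every leaf is pure.
full_tree = DecisionTreeClassifier(random_state=0).fit(X, y)

# Pre-pruning: stop growth early by capping the depth.
pre_pruned = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# Post-pruning: grow fully, then remove branches whose benefit is below ccp_alpha.
post_pruned = DecisionTreeClassifier(ccp_alpha=0.02, random_state=0).fit(X, y)

for name, tree in [("full", full_tree), ("pre-pruned", pre_pruned), ("post-pruned", post_pruned)]:
    print(f"{name:12s} depth={tree.get_depth()} nodes={tree.tree_.node_count}")

# Candidate alpha values for post-pruning, from weakest to strongest pruning.
path = full_tree.cost_complexity_pruning_path(X, y)
print("candidate ccp_alphas:", path.ccp_alphas)
```

In practice the ccp_alpha value would usually be picked from the candidate list with cross-validation rather than fixed by hand.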

🎯 Model Evaluation & Optimization

Because building a model is only half the job.

#	Notebook	Concepts Covered
1	Confusion Matrix	TP/FP/FN/TN, precision, recall, F1
2	Cross Validation	K-Fold, StratifiedKFold, LOOCV
3	Dataset Imbalance	SMOTE, class_weight, oversampling
4	Hyperparameter Tuning	GridSearchCV, RandomizedSearchCV
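
The confusion-matrix quantities listed above follow directly from counting. Here is a minimal pure-Python sketch; the labels are made up for the example and the helper names are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(tp, fp, fn):
    """Derive precision, recall, and F1 from the counts (0.0 when undefined)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical labels: 4 positives, 6 negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Accuracy for this example is 0.7, yet precision is only 0.6. That gap is the reason these metrics matter on imbalanced data.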

🚀 Getting Started

Prerequisites

Make sure you have Python 3.x installed. Then install the required libraries:

pip install numpy pandas matplotlib seaborn scikit-learn jupyter
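
After installing, a quick way to confirm the libraries are available is to query their installed versions. The helper below is a small sketch using only the standard library; the function name is hypothetical.

```python
from importlib import metadata

# Package names as passed to pip in the install command.
REQUIRED = ["numpy", "pandas", "matplotlib", "seaborn", "scikit-learn", "jupyter"]


def installed_versions(packages):
    """Return {package: version string, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


versions = installed_versions(REQUIRED)
for name, version in versions.items():
    print(f"{name:14s} {version or 'NOT INSTALLED'}")
```

Any package reported as NOT INSTALLED can be added by re-running the pip command for that package alone.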

Running Locally

# 1. Clone this repository
git clone https://github.com/BhavyaKansal20/MachineLearning.git

# 2. Navigate into the folder
cd MachineLearning

# 3. Launch Jupyter Notebook
jupyter notebook

Then open any .ipynb file from the Jupyter interface in your browser.

Running on Google Colab

Open any notebook on GitHub and change the URL domain from github.com to colab.research.google.com/github.
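
The URL change described above can be sketched as a small helper. The function name is hypothetical, and the branch name main in the sample URL is an assumption; replace it with the repository's actual default branch.

```python
GITHUB_PREFIX = "https://github.com/"
COLAB_PREFIX = "https://colab.research.google.com/github/"


def github_to_colab(url):
    """Swap the github.com domain for colab.research.google.com/github."""
    if not url.startswith(GITHUB_PREFIX):
        raise ValueError(f"not a github.com URL: {url}")
    return COLAB_PREFIX + url[len(GITHUB_PREFIX):]


# The branch name "main" is assumed here for illustration.
notebook_url = (
    "https://github.com/BhavyaKansal20/MachineLearning/blob/main/Confusion_Matrix.ipynb"
)
colab_url = github_to_colab(notebook_url)
print(colab_url)
```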

🗺️ Repository Roadmap

This repository is actively growing. Upcoming additions:

Unsupervised Learning (K-Means, DBSCAN, Hierarchical Clustering)
Dimensionality Reduction (PCA, t-SNE, LDA)
Ensemble Methods (Random Forest, Gradient Boosting, XGBoost)
Neural Networks (ANN from scratch with NumPy)
NLP Basics (TF-IDF, Bag of Words)
End-to-End ML Projects with real-world datasets
Model deployment notebooks (Flask + Render)

⭐ Star the repo to bookmark it, and Watch it to get notified when new notebooks are added!

📂 Datasets

Datasets used in these notebooks are maintained in a separate dedicated repository to keep this repo clean and lightweight.

🔗 Dataset Repository: Datasets

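Some notebooks use built-in scikit-learn datasets (Iris, Boston, etc.), which require no external download. Note that the Boston loader (load_boston) was removed in scikit-learn 1.2, so notebooks that rely on it need an older scikit-learn release.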

🤝 Contributing

Contributions, improvements, and suggestions are warmly welcome!

How to contribute:

Fork this repository
Create a new branch: git checkout -b feature/your-topic
Add your notebook or improvement
Commit your changes: git commit -m "Add: XGBoost notebook"
Push to your branch: git push origin feature/your-topic
Open a Pull Request with a clear description

Please read the CONTRIBUTING.md and CODE_OF_CONDUCT.md before submitting.

⚖️ Legal & License

MIT License

MIT License

Copyright (c) 2026 Bhavya Kansal

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

See the full LICENSE file.

📚 Educational Use Disclaimer

All notebooks and code in this repository are intended strictly for educational and learning purposes. The implementations are for conceptual clarity and skill development, not production deployment without thorough validation.

📂 Dataset Attribution

Datasets used across these notebooks may be sourced from:

Custom datasets of Bhavya Kansal
Scikit-learn built-in datasets (BSD License)
UCI Machine Learning Repository (varies per dataset)
Publicly available open-source data

Refer to individual notebooks for specific dataset sources and their respective licenses. All datasets can be found in the dedicated repository: Datasets

🔐 Security Policy

For responsible disclosure of any security concerns, please refer to the SECURITY.md file.

📬 Contact

Bhavya Kansal | AI/ML Developer | Researcher & Collaborator | Jai Shri Ram 🙏❤️

📍 Patiala, Punjab, India

If this repository helped you learn something new — leave a ⭐

It keeps this project alive and motivates more content to be added.

Built with ❤️ & 🧠 in Patiala, Punjab, India
