BhavyaKansal20/MachineLearning

Machine Learning Banner


🤖 Machine Learning - Bhavya Kansal

A Live, Production-Grade ML Knowledge Repository

From raw data to trained models: every algorithm, every experiment, documented.



🏢 Maintained and updated by: Bhavya Kansal | 🌐 Visit: bhavyakansal.dev | 📍 Patiala, Punjab, India



🧠 About This Repository

This is not just a notebook dump. It is a structured, continuously updated ML knowledge base maintained by Bhavya Kansal, an AI/ML engineer and developer, who built it to give a structured, beginner-to-advanced understanding of machine learning.

Every notebook in this repository:

  • Is written from scratch with clean, readable code
  • Covers theory + implementation, not just copy-paste code
  • Is beginner-friendly โ€” designed so anyone can open it and understand it
  • Reflects real internship and coursework experiments, not toy examples

This repository is actively maintained and updated regularly with new algorithms, projects, and experiments as the learning journey progresses.


👥 Who Is This For?

| Audience | How This Helps |
|----------|----------------|
| 🎓 Beginners | Learn ML concepts step-by-step with clean, documented code |
| 🔬 Students | Reference implementations for assignments and understanding |
| 💼 Practitioners | Quick refresher notebooks for standard algorithms |
| 🧑‍💻 Developers | Baseline scikit-learn patterns to build production models from |

⚙️ Tech Stack

| Tool | Purpose |
|------|---------|
| Python | Core programming language |
| NumPy | Numerical computing & array ops |
| Pandas | Data manipulation & analysis |
| Matplotlib | Data visualization |
| Seaborn | Statistical data visualization |
| Scikit-learn | Machine Learning algorithms |
| Jupyter | Interactive notebooks |

📓 Notebook Index

All notebooks are self-contained and can be opened directly on GitHub or run locally. Click any notebook name to open it.


📦 Data Preprocessing & Feature Engineering

The foundation of every ML pipeline: cleaning, transforming, and preparing raw data.

| # | Notebook | Concepts Covered |
|---|----------|------------------|
| 1 | Data Preprocessing | Missing values, data cleaning, pipelines |
| 2 | Scikit-Learn Data Preprocessing | sklearn preprocessing transformers |
| 3 | Encoding | Label encoding, One-Hot encoding |
| 4 | Feature Scaling | StandardScaler, MinMaxScaler, normalization |
| 5 | Function Transformation | Log, sqrt, Box-Cox transformations |
| 6 | Feature Elimination | Removing irrelevant/redundant features |
| 7 | Outlier Detection | IQR, Z-score, visualizing outliers |
| 8 | Outlier Analysis (Custom) | Custom outlier detection experiments |
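
To give a feel for how several of the topics above fit together, here is a minimal sketch that imputes a missing value, scales a numeric column, and one-hot encodes a categorical column in a single scikit-learn pipeline. The toy DataFrame and its column names are assumptions made for illustration; they are not taken from the notebooks.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data: one numeric column with a gap, one categorical column.
df = pd.DataFrame({
    "age": [22.0, 35.0, np.nan, 58.0],
    "city": ["Patiala", "Delhi", "Delhi", "Patiala"],
})

# Numeric branch: fill the missing value with the median, then standardize.
numeric = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# sparse_threshold=0 forces a dense NumPy array as output.
preprocess = ColumnTransformer(
    [
        ("num", numeric, ["age"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["city"]),
    ],
    sparse_threshold=0,
)

X = preprocess.fit_transform(df)
print(X.shape)  # (4, 3): one scaled numeric column plus one column per city
print(X)
```

Keeping the steps inside one `ColumnTransformer` means the same fitted statistics (median, mean, standard deviation, category list) can later be applied to unseen data with `preprocess.transform(...)`.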

📊 Data Analysis & Visualization

Exploring data with NumPy, Pandas, and the classic Iris dataset.

| # | Notebook | Concepts Covered |
|---|----------|------------------|
| 1 | NumPy Basics | Arrays, operations, broadcasting |
| 2 | Pandas Project | DataFrames, groupby, EDA workflow |
| 3 | NumPy with Iris Dataset | NumPy analysis on real dataset |
| 4 | Iris Data Exploration | EDA, pairplots, correlation heatmaps |
| 5 | ML Fundamentals | Core ML workflow introduction |
| 6 | Train-Test Split | Proper data splitting strategies |
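
As a small illustration of the train-test split topic, the sketch below performs a stratified, reproducible split of the built-in Iris dataset. The 80/20 ratio and the `random_state` value are arbitrary choices for this example, not settings taken from the notebooks.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# stratify=y keeps the class proportions identical in both splits;
# random_state makes the split reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

print(X_train.shape, X_test.shape)                 # (120, 4) (30, 4)
print(np.bincount(y_train), np.bincount(y_test))   # [40 40 40] [10 10 10]
```

Without `stratify`, a small test set can end up with uneven class counts, which makes accuracy figures harder to compare between runs.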

📈 Regression Algorithms

Predicting continuous values, from simple lines to complex polynomial curves.

| # | Notebook | Concepts Covered |
|---|----------|------------------|
| 1 | Simple Linear Regression | OLS, slope/intercept, R² score |
| 2 | Multiple Linear Regression | Multi-feature regression, multicollinearity |
| 3 | Polynomial Regression | Degree tuning, overfitting demo |
| 4 | KNN Regression | K-Neighbors for continuous output |
| 5 | Ridge Regularisation | L2 penalty, reducing overfitting |
| 6 | Lasso Regularisation | L1 penalty, automatic feature selection |
| 7 | Polynomial Logistic | Logistic with polynomial features |
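
The difference between the L2 (Ridge) and L1 (Lasso) penalties listed above is easiest to see on synthetic data where only some features matter. The following is a sketch under assumed settings (the data, the `alpha` values, and the seed are all illustrative):

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
# Only the first two features drive the target; the other three are irrelevant.
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

ridge = Ridge(alpha=1.0).fit(X, y)
lasso = Lasso(alpha=0.1).fit(X, y)

# Ridge shrinks every coefficient but leaves the irrelevant ones small and nonzero;
# Lasso can drive them to exactly zero, which acts as feature selection.
print("ridge:", np.round(ridge.coef_, 3))
print("lasso:", np.round(lasso.coef_, 3))
```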

🔷 Classification Algorithms

Teaching machines to sort, label, and decide.

| # | Notebook | Concepts Covered |
|---|----------|------------------|
| 1 | Logistic Regression - Part 1 | Binary classification, sigmoid, threshold |
| 2 | Logistic Regression - Part 2 | Advanced logistic, multi-solver comparison |
| 3 | KNN Classification | K selection, Euclidean distance, boundaries |
| 4 | Decision Tree Classification | Gini, entropy, tree visualization |
| 5 | Multiclass Classification | OvR, OvO strategies |
| 6 | Naive Bayes | Gaussian NB, conditional probability |
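
To connect the sigmoid and threshold ideas from the logistic regression notebooks, here is a minimal sketch on synthetic data. The dataset parameters and the 0.8 threshold are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, n_features=4, random_state=0)
clf = LogisticRegression().fit(X, y)

# The linear score z = w.x + b is squashed by the sigmoid into a probability.
z = clf.decision_function(X)
p_manual = 1.0 / (1.0 + np.exp(-z))
p_model = clf.predict_proba(X)[:, 1]
print(np.allclose(p_manual, p_model))

# Thresholding at 0.5 mirrors the default decision rule; a stricter threshold
# predicts the positive class less often, trading recall for precision.
pred_default = (p_model >= 0.5).astype(int)
pred_strict = (p_model >= 0.8).astype(int)
print(pred_default.sum(), pred_strict.sum())
```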

🧮 Support Vector Machines (SVM)

Margin maximization: one of the most powerful classical ML algorithms.

| # | Notebook | Concepts Covered |
|---|----------|------------------|
| 1 | Linear SVM | Hard/soft margin, C parameter |
| 2 | Polynomial SVM | Kernel trick with polynomial kernel |
| 3 | SVM Regression | SVR, epsilon-tube, continuous prediction |
| 4 | Polynomial Regression SVM | Combining polynomial features with SVR |
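
A quick way to see why the kernel trick matters is to compare a linear SVM with a polynomial-kernel SVM on data that is not linearly separable. The sketch below uses scikit-learn's `make_circles` generator; the dataset and hyperparameters are assumptions for this example, not the notebooks' own setup.

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Two concentric rings: no straight line can separate the classes.
X, y = make_circles(n_samples=300, factor=0.4, noise=0.05, random_state=0)

linear = SVC(kernel="linear", C=1.0).fit(X, y)
# A degree-2 polynomial kernel can express a circular decision boundary.
poly = SVC(kernel="poly", degree=2, coef0=1.0, C=1.0).fit(X, y)

print("linear training accuracy:", linear.score(X, y))
print("poly training accuracy:  ", poly.score(X, y))
```

The scores here are training accuracies, used only to show separability; a held-out split is still needed to judge generalization.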

🌲 Tree-Based Models

Decision boundaries built like trees: interpretable and powerful.

| # | Notebook | Concepts Covered |
|---|----------|------------------|
| 1 | Decision Tree Regression | MSE-based splits, tree depth control |
| 2 | Pre & Post Pruning | ccp_alpha, max_depth, preventing overfit |
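
Pre-pruning and post-pruning can be contrasted in a few lines. This sketch assumes a classifier on the Iris dataset for simplicity (the same `max_depth` and `ccp_alpha` parameters exist on `DecisionTreeRegressor`); the split and seeds are illustrative.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

# Pre-pruning: stop growth early by capping the depth.
shallow = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X_train, y_train)
print("pre-pruned depth:", shallow.get_depth())

# Post-pruning: grow a full tree, compute the cost-complexity path,
# then refit with each candidate ccp_alpha.
full = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
path = full.cost_complexity_pruning_path(X_train, y_train)
# Guard against tiny negative values caused by floating-point error.
alphas = np.clip(path.ccp_alphas, 0.0, None)

leaf_counts = []
for alpha in alphas:
    tree = DecisionTreeClassifier(random_state=0, ccp_alpha=alpha)
    tree.fit(X_train, y_train)
    leaf_counts.append(tree.get_n_leaves())
    print(f"alpha={alpha:.4f}  leaves={tree.get_n_leaves()}  "
          f"test_acc={tree.score(X_test, y_test):.3f}")
```

Larger `ccp_alpha` values remove more of the tree, so the leaf count falls as alpha grows; in practice the value is usually chosen by cross-validation rather than by test accuracy.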

🎯 Model Evaluation & Optimization

Because building a model is only half the job.

| # | Notebook | Concepts Covered |
|---|----------|------------------|
| 1 | Confusion Matrix | TP/FP/FN/TN, precision, recall, F1 |
| 2 | Cross Validation | K-Fold, StratifiedKFold, LOOCV |
| 3 | Dataset Imbalance | SMOTE, class_weight, oversampling |
| 4 | Hyperparameter Tuning | GridSearchCV, RandomizedSearchCV |
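
Two of the evaluation topics above can be sketched briefly: reading precision, recall, and F1 off a confusion matrix, and tuning a hyperparameter with stratified cross-validation. The hand-written labels, the KNN model, and the parameter grid are assumptions chosen to keep the example small.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import confusion_matrix, f1_score, precision_score, recall_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.neighbors import KNeighborsClassifier

# Part 1: confusion matrix on hypothetical labels.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# For binary labels, ravel() yields the cells in the order TN, FP, FN, TP.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(tp, fp, fn, tn)                   # 3 1 1 3
print(precision_score(y_true, y_pred))  # 0.75 = TP / (TP + FP)
print(recall_score(y_true, y_pred))     # 0.75 = TP / (TP + FN)
print(f1_score(y_true, y_pred))         # 0.75

# Part 2: grid search over k with stratified 5-fold cross-validation.
X, y = load_iris(return_X_y=True)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(
    KNeighborsClassifier(),
    param_grid={"n_neighbors": [1, 3, 5, 7, 9]},
    cv=cv,
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```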

🚀 Getting Started

Prerequisites

Make sure you have Python 3.x installed. Then install the required libraries:

pip install numpy pandas matplotlib seaborn scikit-learn jupyter

Running Locally

# 1. Clone this repository
git clone https://github.com/BhavyaKansal20/MachineLearning.git

# 2. Navigate into the folder
cd MachineLearning

# 3. Launch Jupyter Notebook
jupyter notebook

Then open any .ipynb file from the Jupyter interface in your browser.

Running on Google Colab

Click the badge below or open any notebook on GitHub and change the URL domain from github.com to colab.research.google.com/github:

Open in Colab
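
The URL rewrite described above can also be done programmatically. The helper below is a small sketch; the function name `github_to_colab` is hypothetical, and the branch name and notebook path in the example are placeholders rather than real files in this repository.

```python
from urllib.parse import urlparse


def github_to_colab(url: str) -> str:
    """Rewrite a github.com notebook URL into its Colab equivalent."""
    parsed = urlparse(url)
    if parsed.netloc != "github.com" or not parsed.path.endswith(".ipynb"):
        raise ValueError("expected a github.com URL pointing at an .ipynb file")
    # Colab serves GitHub notebooks under /github/<owner>/<repo>/blob/<branch>/<path>.
    return f"https://colab.research.google.com/github{parsed.path}"


# Placeholder branch and notebook path, for illustration only.
colab_url = github_to_colab(
    "https://github.com/BhavyaKansal20/MachineLearning/blob/main/example.ipynb"
)
print(colab_url)
```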


🗺️ Repository Roadmap

This repository is actively growing. Upcoming additions:

  • Unsupervised Learning (K-Means, DBSCAN, Hierarchical Clustering)
  • Dimensionality Reduction (PCA, t-SNE, LDA)
  • Ensemble Methods (Random Forest, Gradient Boosting, XGBoost)
  • Neural Networks (ANN from scratch with NumPy)
  • NLP Basics (TF-IDF, Bag of Words)
  • End-to-End ML Projects with real-world datasets
  • Model deployment notebooks (Flask + Render)

⭐ Star the repo to bookmark it, and watch it to get notified when new notebooks are added!


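📂 Datasets

Datasets used in these notebooks are maintained in a separate dedicated repository to keep this repo clean and lightweight.

🔗 Dataset Repository: Datasets

Some notebooks use built-in scikit-learn datasets (Iris, Boston, etc.), which require no external download. Note that the Boston housing loader (load_boston) was removed in scikit-learn 1.2, so notebooks that rely on it need an older scikit-learn release or a substitute dataset.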


🤝 Contributing

Contributions, improvements, and suggestions are warmly welcome!

How to contribute:

  1. Fork this repository
  2. Create a new branch: git checkout -b feature/your-topic
  3. Add your notebook or improvement
  4. Commit your changes: git commit -m "Add: XGBoost notebook"
  5. Push to your branch: git push origin feature/your-topic
  6. Open a Pull Request with a clear description

Please read the CONTRIBUTING.md and CODE_OF_CONDUCT.md before submitting.


⚖️ Legal & License

MIT License

Copyright (c) 2026 Bhavya Kansal

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

See the full LICENSE file.


📚 Educational Use Disclaimer

All notebooks and code in this repository are intended strictly for educational and learning purposes. The implementations are for conceptual clarity and skill development, not production deployment without thorough validation.


📂 Dataset Attribution

Datasets used across these notebooks may be sourced from:

Refer to individual notebooks for specific dataset sources and their respective licenses. All datasets can be checked in this repository: Datasets


🔐 Security Policy

For responsible disclosure of any security concerns, please refer to the SECURITY.md file.


📬 Contact

Bhavya Kansal | AI/ML Developer | Researcher & Collaborator | Jai Shri Ram 🙏❤️

Visit GitHub

📍 Patiala, Punjab, India


If this repository helped you learn something new, leave a ⭐

It keeps this project alive and motivates more content to be added.


© 2026 Bhavya Kansal · All Rights Reserved

Built with ❤️ & 🧠 in Patiala, Punjab, India

About

This repository contains machine learning Jupyter notebooks from beginner to advanced level, organized in folders by type. The datasets used in these notebooks are available in my Dataset Repository.
