Curated CSV datasets for machine learning, analytics, and educational workflows
My Datasets Repository contains practical CSV files for:
- Machine learning model training
- EDA and feature engineering
- Classification and regression practice
- Academic projects and interview preparation
Maintainer: Bhavya Kansal
Portfolio: https://bhavyakansal.dev
GitHub: https://github.com/BhavyaKansal20
| Metric | Value |
|---|---|
| Total datasets | 26 |
| File format | CSV |
| Approx. storage | 4.02 MB |
| Largest dataset | House Prices.csv |
| Smallest dataset | Placement2.csv |
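
The figures above could be recomputed from a local clone with a short script. This is a minimal sketch, assuming all CSV files sit in a single folder; the function name `summarize_csv_folder` is hypothetical and not part of this repository.

```python
from pathlib import Path


def summarize_csv_folder(folder):
    """Return dataset count, total size in MB, and largest/smallest CSV in a folder."""
    # Only top-level *.csv files are considered (assumption: a flat folder layout).
    files = sorted(Path(folder).glob("*.csv"))
    if not files:
        raise ValueError(f"No CSV files found in {folder!r}")

    # Map each file name to its size in bytes.
    sizes = {f.name: f.stat().st_size for f in files}

    return {
        "total_datasets": len(sizes),
        "approx_storage_mb": round(sum(sizes.values()) / (1024 * 1024), 2),
        "largest": max(sizes, key=sizes.get),
        "smallest": min(sizes, key=sizes.get),
    }
```

Calling `summarize_csv_folder(".")` from the root of a local clone would return a dictionary with the same four metrics listed in the table.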
To download a dataset with Python, follow these steps:
- Open any CSV file in this repository
- Click the Raw button (top-right above file preview)
- Copy the URL of that raw file
- Create a local folder on your desktop
- Create a Python file inside that folder
- Paste the script below and run it
- The CSV will be downloaded locally
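```python
import requests
import pandas as pd

# Paste the raw file URL you copied from GitHub
url = "{paste_raw_url_here}"

# Download the file and stop early if the request failed
res = requests.get(url, allow_redirects=True, timeout=30)
res.raise_for_status()

# Save the CSV locally
with open("download_file_name.csv", "wb") as file:
    file.write(res.content)

# Load the saved file and preview the first rows
df = pd.read_csv("download_file_name.csv")
print(df.head())
```

| Dataset | Rows | Columns | Size (KB) | Primary Use |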
|---|---|---|---|---|
| Boston.csv | 506 | 15 | 36.8 | Regression |
| Crop.csv | 620 | 12 | 38.4 | Classification |
| DBSCAN_DATA.csv | 500 | 2 | 18.0 | Clustering |
| House Prices.csv | 21613 | 21 | 2190.0 | Pricing Regression |
| Placement2.csv | 100 | 3 | 1.0 | Placement Classification |
| Salary Data.csv | 375 | 6 | 18.9 | Salary Regression |
| Salary.csv | 375 | 3 | 4.8 | Salary Regression |
| Social_Network_Ads.csv | 400 | 5 | 10.7 | Binary Classification |
| Titanic-Dataset.csv | 891 | 12 | 59.8 | Survival Classification |
| bitcoin.csv | 2785 | 7 | 180.7 | Time Series Analysis |
| breast-cancer.csv | 569 | 32 | 121.7 | Medical Classification |
| car data.csv | 301 | 9 | 16.8 | Price Prediction |
| car.csv | 301 | 9 | 16.8 | Price Prediction |
| diabetes.csv | 768 | 9 | 22.6 | Medical Classification |
| houseprice.csv | 21613 | 13 | 1008.5 | House Price Regression |
| iris copy.csv | 150 | 5 | 4.6 | Multiclass Classification |
| iris.csv | 150 | 5 | 4.6 | Multiclass Classification |
| loan.csv | 614 | 13 | 37.1 | Loan Risk Classification |
| medical_data.csv | 4240 | 16 | 187.3 | Healthcare Analytics |
| placement.csv | 200 | 2 | 2.1 | Placement Insights |
| polynomial.csv | 200 | 2 | 2.1 | Curve Fitting |
| polynomial1.csv | 200 | 2 | 2.1 | Curve Fitting |
| polynomial2.csv | 200 | 2 | 2.1 | Curve Fitting |
| polynomial_classification.csv | 10000 | 2 | 117.1 | Decision Boundary Classification |
| student_placement.csv | 1000 | 3 | 12.5 | Student Placement |
| unlabeled_iris.csv | 150 | 4 | 2.5 | Unsupervised Practice |
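
Several file names in the table contain spaces (for example `House Prices.csv`), so a raw URL typed by hand needs percent-encoding. The sketch below shows one way to build such a URL and to compare a loaded dataset's shape with the table. The helper names `build_raw_url` and `shape_matches` are hypothetical, and the repository and branch names are placeholders you would need to replace with the real ones.

```python
from urllib.parse import quote


def build_raw_url(owner, repo, branch, file_name):
    """Build a raw.githubusercontent.com URL, percent-encoding the file name."""
    return (
        f"https://raw.githubusercontent.com/{owner}/{repo}/{branch}/"
        f"{quote(file_name)}"
    )


def shape_matches(shape, expected_rows, expected_columns):
    """Check a (rows, columns) shape against the values listed in the table."""
    return tuple(shape) == (expected_rows, expected_columns)


# Example usage (needs pandas and network access; repo and branch are placeholders):
# import pandas as pd
# url = build_raw_url("BhavyaKansal20", "<repo-name>", "<branch>", "House Prices.csv")
# df = pd.read_csv(url)  # pandas can read a CSV directly from a URL
# print(shape_matches(df.shape, 21613, 21))
```

Reading straight from the URL skips the intermediate file on disk, which may be convenient for quick experiments; saving a local copy as in the download script is still the better choice for repeated or offline use.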
- UCI Machine Learning Repository
- Kaggle Datasets
- Google Dataset Search
- OpenML
- Hugging Face Datasets
- Papers with Code - Datasets
- Zenodo
- data.world
- Awesome Public Datasets
- India AI Datasets
- Data.gov.in
- Data.gov (USA)
- EU Open Data Portal
- Canada Open Government
- Australia Data Portal
- COCO Dataset
- ImageNet
- Common Crawl
- Project Gutenberg
- PhysioNet
- MIMIC-III
- OpenSLR
- OpenStreetMap (Geofabrik)
| Name | Domain | Link |
|---|---|---|
| UCI ML Repo | General | Open |
| Kaggle | General | Open |
| IndiaAI | Govt (India) | Open |
| Data.gov.in | Govt (India) | Open |
| Data.gov | Govt (USA) | Open |
| Hugging Face | NLP/ML | Open |
| Papers with Code | Benchmarks | Open |
| Zenodo | Research | Open |
These datasets can be used for:
- Machine learning projects
- Data analysis and visualization
- Educational and tutorial workflows
If you want to add more datasets, open a PR and include:
- File name
- Short description
- Source/reference (if public)
- Intended use case
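
Part of this checklist could be prepared automatically before opening a PR. The following is a minimal sketch, assuming the new file is a UTF-8 CSV with a header row; the function name `catalog_row` is hypothetical. It prints a row in the same column order as the dataset table in this README (Dataset, Rows, Columns, Size (KB), Primary Use).

```python
import csv
from pathlib import Path


def catalog_row(path, primary_use):
    """Format one Markdown table row describing a CSV file."""
    path = Path(path)
    with path.open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])        # first line is treated as the header
        rows = sum(1 for _ in reader)    # remaining lines are counted as data rows
    size_kb = round(path.stat().st_size / 1024, 1)
    return f"| {path.name} | {rows} | {len(header)} | {size_kb} | {primary_use} |"
```

The description and source reference still need to be written by hand, since they cannot be derived from the file itself.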
This repository uses the BSD-2-Clause license. See LICENSE for complete terms.
Important: If any dataset comes from an external source, follow its original license and attribution requirements.
- Website: https://bhavyakansal.dev
- GitHub: https://github.com/BhavyaKansal20
If this repository helped you, consider starring ⭐️ it.
