House Sale Price Prediction

AID232 — Machine Learning Final Project

Student: Anusara Ranasinghe
Index Number: ADAI2401014
Qualification: DLC Higher Diploma in AI and Data Science
Module: AID232 - Machine Learning
Lecturer: Mr. Ravidu Bandara

Dataset Reference

Ames Housing Dataset — Dean De Cock (2011)
Source: https://www.kaggle.com/datasets/prevek18/ames-housing-dataset
Records: 2,930 residential sales from Ames, Iowa (2006-2010)
Features: 80 physical and locational attributes

Problem Description

This is a supervised regression task: predict the sale price of a residential house from its 80 physical and locational features. The data are the 2,930 sales from Ames, Iowa (2006-2010) described in the Dataset Reference.

Task (T): Predict sale price (continuous USD value)
Experience (E): 2,930 labeled house sale records with 80 features
Performance (P): RMSE and R-squared on a held-out 20% test set
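
For reference, the two performance measures named above can be computed by hand as in the minimal sketch below. The function names and the four prices are made up for illustration only; they are not results from this project, and the project code may well use scikit-learn's metric functions instead.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the target (USD here)."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Illustrative prices only (assumption), not taken from the Ames data.
actual = [200000.0, 150000.0, 320000.0, 180000.0]
predicted = [210000.0, 140000.0, 300000.0, 185000.0]

print(f"RMSE: {rmse(actual, predicted):,.0f} USD")
print(f"R^2:  {r_squared(actual, predicted):.3f}")
```

RMSE is reported in dollars, so it reads directly as a typical prediction error, while R-squared gives the share of price variance the model explains.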

Model Used

Primary model: Lasso Regression (L1 Regularisation)
Compared against: Linear Regression, Ridge Regression (L2), Polynomial Regression (degree=2)

Selection rationale: Lasso performs automatic feature selection by shrinking the coefficients of weak features to exactly zero, which is particularly useful with the 200+ one-hot encoded features in this dataset. The L1 penalty reduces overfitting, and the resulting sparse model stays easy to interpret.
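
The feature-selection effect can be seen on synthetic data. The sketch below is not the project's code: the data, the number of columns and the `alpha` value are all assumptions chosen for illustration. It fits ordinary least squares and Lasso on the same matrix and counts how many coefficients each sets to exactly zero.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for an encoded feature matrix: 200 rows, 50 columns,
# of which only the first 5 actually influence the target.
X = rng.normal(size=(200, 50))
true_coef = np.zeros(50)
true_coef[:5] = [5.0, -4.0, 3.0, -2.0, 1.5]
y = X @ true_coef + rng.normal(scale=0.5, size=200)

# Lasso penalises coefficient size, so features should share a common scale.
X_scaled = StandardScaler().fit_transform(X)

ols = LinearRegression().fit(X_scaled, y)
lasso = Lasso(alpha=0.1).fit(X_scaled, y)  # alpha is an illustrative value

n_zero_ols = int(np.sum(ols.coef_ == 0))
n_zero_lasso = int(np.sum(lasso.coef_ == 0))
print(f"OLS coefficients exactly zero:   {n_zero_ols} of 50")
print(f"Lasso coefficients exactly zero: {n_zero_lasso} of 50")
```

In practice `alpha` would be tuned, for example by cross-validation, rather than fixed by hand.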

Why not other algorithms?

  • Logistic Regression: classification algorithm that outputs class probabilities, cannot predict continuous price
  • SVM: only classification variant was covered in the module
  • Naive Bayes: classification algorithm, incompatible with regression output
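
A comparison of the four regression models named in this section, scored with RMSE and R-squared on a held-out 20% split, might look like the sketch below. It runs on synthetic data; the hyperparameters, the random seed and the `models` / `results` names are assumptions for illustration and do not reproduce what `src/model.py` does.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

rng = np.random.default_rng(42)

# Synthetic regression problem with one mild non-linear term.
X = rng.normal(size=(300, 8))
y = (3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.5 * X[:, 2] ** 2
     + rng.normal(scale=0.3, size=300))

# Hold out 20% of the rows for testing, as in the performance definition.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Illustrative hyperparameters; real values would come from tuning.
models = {
    "Linear": make_pipeline(StandardScaler(), LinearRegression()),
    "Ridge (L2)": make_pipeline(StandardScaler(), Ridge(alpha=1.0)),
    "Lasso (L1)": make_pipeline(StandardScaler(), Lasso(alpha=0.01)),
    "Polynomial (degree=2)": make_pipeline(
        StandardScaler(), PolynomialFeatures(degree=2), LinearRegression()
    ),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    rmse = float(np.sqrt(mean_squared_error(y_test, pred)))
    r2 = float(r2_score(y_test, pred))
    results[name] = (rmse, r2)
    print(f"{name:<22} RMSE={rmse:.3f}  R2={r2:.3f}")
```

Wrapping each model in a pipeline keeps the scaler fitted on the training rows only, so no information from the test set leaks into training.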

How to Run

  1. Open the project in PyCharm Professional (or any Python IDE)

  2. Install dependencies:

pip install pandas numpy matplotlib seaborn scikit-learn scipy

  3. Place AmesHousing.csv inside the data/ folder

  4. Run the main script:

python src/model.py

Or in PyCharm: Right-click src/model.py -> Run

  5. To explore the step-by-step notebooks, open the notebooks/ folder and run the cells of each notebook in order:
    • 01_EDA.ipynb — Exploratory Data Analysis
    • 02_Preprocessing.ipynb — Data cleaning and preparation
    • 03_ModelBuilding.ipynb — Train and tune all 4 models
    • 04_Evaluation.ipynb — Results visualisation and comparison
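
Before running the script, a quick check that the files from the steps above are in place can save a confusing traceback. The helper below is a hypothetical convenience sketch, not part of the repository; `missing_files` and `REQUIRED` are names invented here, and the paths are the ones listed in this README.

```python
from pathlib import Path

# Paths taken from this README, relative to the project root.
REQUIRED = ["data/AmesHousing.csv", "src/model.py"]


def missing_files(project_root="."):
    """Return the required paths that do not exist under project_root."""
    root = Path(project_root)
    return [rel for rel in REQUIRED if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All required files found; run: python src/model.py")
```

Run it from the project root so the relative paths resolve correctly.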

Project Structure

HousePricePrediction/
├── data/
│   └── AmesHousing.csv
├── notebooks/
│   ├── 01_EDA.ipynb
│   ├── 02_Preprocessing.ipynb
│   ├── 03_ModelBuilding.ipynb
│   └── 04_Evaluation.ipynb
├── src/
│   └── model.py
├── outputs/              (generated charts)
├── presentation/
│   └── slides.pptx
└── README.md
