Skip to content

Shree-2516/RetailSense360

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

python -m venv .venv ..venv\Scripts\Activate.ps1


RetailSense360

RetailSense360 is a retail analytics project that combines data preprocessing, customer segmentation (RFM), market basket analysis, customer lifetime value (LTV) modeling, SQL business analysis, and a Power BI dashboard.

Project Objectives

  • Clean and standardize retail customer shopping data.
  • Build customer segments using RFM and clustering.
  • Discover product affinity rules for cross-sell opportunities.
  • Train an LTV model for customer value estimation.
  • Enable SQL-driven business analysis on PostgreSQL.
  • Visualize KPIs and insights in Power BI.

Repository Structure

RetailSense3601/
|- data/
|  |- raw/
|  |  |- customer_shopping_behavior.csv
|  |- processed/
|  |  |- cleaned_data.csv
|  |  |- rfm_output.csv
|- src/
|  |- data_preprocessing.py
|  |- db_connection.py
|  |- rfm_analysis.py
|  |- market_basket.py
|  |- ltv_model.py
|- notebooks/
|  |- 01_EDA.ipynb
|  |- 02_RFM_Analysis.ipynb
|  |- 03_Market_Basket.ipynb
|  |- 04_LTV_Prediction.ipynb
|- models/
|  |- ltv_model.pkl
|- reports/
|  |- Problem Overview.pdf
|  |- sqlquries.sql
|- Dashboard/
|  |- Retail Sense 360 Dashboard.pbix
|- .env
|- .gitignore
|- README.md

Tech Stack

  • Python (pandas, scikit-learn, SQLAlchemy, joblib, python-dotenv)
  • PostgreSQL (data storage and SQL analytics)
  • Jupyter Notebook (EDA and analysis workflow)
  • Power BI (interactive dashboarding)
  • MLXtend (Apriori and association rules for market basket analysis)

Setup

  1. Create and activate virtual environment:
python -m venv .venv
.\.venv\Scripts\Activate.ps1
  1. Install dependencies:
pip install pandas scikit-learn joblib sqlalchemy python-dotenv psycopg2-binary mlxtend jupyter
  1. Configure environment variables in .env:
DB_USER=your_user
DB_PASSWORD=your_password
DB_HOST=localhost
DB_PORT=5432
DB_NAME=your_database

Execution Workflow

  1. Run preprocessing on raw data to generate clean features (see src/data_preprocessing.py and notebooks).
  2. Perform EDA and analysis through notebooks in notebooks/.
  3. Generate RFM and segmentation using src/rfm_analysis.py.
  4. Run association-rule mining using src/market_basket.py.
  5. Train LTV model using src/ltv_model.py (model saved as models/ltv_model.pkl).
  6. Upload processed CSV to PostgreSQL using:
python src/db_connection.py
  1. Execute business SQL queries from reports/sqlquries.sql.
  2. Open Dashboard/Retail Sense 360 Dashboard.pbix in Power BI Desktop for reporting.

Python Module Summary

  • data_preprocessing.py: Cleans nulls/duplicates, normalizes columns, engineers frequency and monetary features.
  • rfm_analysis.py: Computes customer-level RFM metrics and applies KMeans clustering.
  • market_basket.py: Builds Apriori frequent itemsets and association rules with robust empty-case handling.
  • ltv_model.py: Trains RandomForestRegressor for LTV prediction and exports trained model.
  • db_connection.py: Loads DB credentials from .env, creates SQLAlchemy engine, and uploads processed CSV to PostgreSQL.

SQL Analytics Coverage

The SQL script includes analysis for:

  • Revenue by gender, category, location, payment method, and season
  • Subscriber vs non-subscriber spend comparison
  • Discount and promo-code behavior
  • Top products by rating and discount usage
  • Repeat buyers and customer segmentation logic
  • Window functions (running totals, ranking, lag analysis)

Dashboard

Dashboard/Retail Sense 360 Dashboard.pbix contains the Power BI dashboard layer to present key business and customer insights derived from processed data and SQL analysis.

Notes and Recommendations

  • Keep raw data immutable under data/raw/.
  • Regenerate data/processed/cleaned_data.csv after preprocessing changes.
  • Re-train models/ltv_model.pkl when feature logic or source data changes.
  • Consider adding a requirements.txt file for reproducible environment setup.
  • Consider adding a single main.py pipeline entrypoint for end-to-end execution.

About

Analyze Customer Behavior And Making PowerBi Dashboard

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors