vishanshm/Project-Customer_Segmentation_Clustering

Project: Smart Segmentation - Unlocking Customer Personas with AI 🛍️🤖

Project Objective

This project uses unsupervised machine learning to identify distinct customer segments in a retail dataset. Segmenting customers by age, annual income, and spending score gives marketing teams concrete groups to target with tailored campaigns.

Core Concepts Covered

  • Unsupervised Learning: Discovering hidden structures in data without the need for pre-defined labels.
  • Clustering Fundamentals & K-Means: A deep dive into how the K-Means algorithm groups similar data points by minimizing intra-cluster variance.
  • The Elbow Method: A heuristic for choosing the number of clusters (k): plot the Within-Cluster Sum of Squares (WCSS) against k and pick the point where further increases in k yield only small reductions.
  • Multi-dimensional EDA: Using 2D and 3D visualizations to explore complex relationships between features.
  • Hierarchical Clustering: Implementing an alternative clustering strategy and using Dendrograms to validate the optimal number of segments.
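To make the K-Means and WCSS ideas above concrete, here is a minimal NumPy sketch of the standard Lloyd's iteration. It is an illustration only, not the project's code: the `kmeans` helper, its parameters, and the synthetic two-blob data are all assumptions made for this example.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm (hypothetical helper): returns (labels, centroids, wcss)."""
    rng = np.random.default_rng(seed)
    # Initialise centroids with k distinct points drawn from the data.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each centroid moves to the mean of its points
        # (an empty cluster keeps its previous centroid).
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        converged = np.allclose(new_centroids, centroids)
        centroids = new_centroids
        if converged:
            break
    # Final assignment so labels and centroids are consistent with each other.
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    # WCSS: sum of squared distances from each point to its own centroid.
    wcss = float(((X - centroids[labels]) ** 2).sum())
    return labels, centroids, wcss

# Synthetic stand-in data: two well-separated blobs, not the retail dataset.
rng = np.random.default_rng(42)
X = np.vstack([
    rng.normal(loc=[20, 80], scale=3, size=(50, 2)),
    rng.normal(loc=[80, 20], scale=3, size=(50, 2)),
])
labels, centroids, wcss = kmeans(X, k=2)
print(centroids.round(1), round(wcss, 1))
```

Each iteration can only lower (or keep) the WCSS, which is why the quantity serves as the objective that the Elbow Method later inspects across different values of k.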

Key Steps Undertaken

1. In-Depth Exploratory Data Analysis (EDA)

  • Analyzed distributions of Age, Annual Income, and Spending Scores.
  • Utilized 2D and 3D scatter plots to visualize natural groupings and correlations before modeling.
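The EDA step could look roughly like the sketch below. The column names (`Age`, `Annual Income`, `Spending Score`) and the randomly generated data are placeholders assumed for illustration, and Matplotlib is used for the 3D view here only to keep the example short; Plotly Express would serve the same purpose interactively.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic placeholder data; column names are assumptions, not the real schema.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Age": rng.integers(18, 70, size=200),
    "Annual Income": rng.normal(60, 25, size=200).clip(15, 140),
    "Spending Score": rng.integers(1, 101, size=200),
})
features = ["Age", "Annual Income", "Spending Score"]

# Distribution summary and pairwise correlations for the three features.
summary = df[features].describe()
corr = df[features].corr()
print(summary.round(1))
print(corr.round(2))

fig = plt.figure(figsize=(10, 4))

# 2D view: income against spending score.
ax2d = fig.add_subplot(1, 2, 1)
ax2d.scatter(df["Annual Income"], df["Spending Score"], s=12)
ax2d.set_xlabel("Annual Income")
ax2d.set_ylabel("Spending Score")

# 3D view: all three features at once.
ax3d = fig.add_subplot(1, 2, 2, projection="3d")
ax3d.scatter(df["Age"], df["Annual Income"], df["Spending Score"], s=12)
ax3d.set_xlabel("Age")
ax3d.set_ylabel("Annual Income")
ax3d.set_zlabel("Spending Score")
```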

2. Multi-Model Building

Recognizing that segmentation is not a one-size-fits-all process, this project develops two distinct models:

  • Income-based Model: Segments customers based on Annual Income vs. Spending Score.
  • Age-based Model: Segments customers based on Age vs. Spending Score.
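A rough sketch of fitting the two models side by side with scikit-learn follows. The `fit_segments` helper, the column names, the random data, and the cluster counts (`k=5` and `k=4`) are all placeholders chosen for the example; the README does not state the values the project settled on.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Synthetic placeholder data; column names are assumptions.
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "Age": rng.integers(18, 70, size=200),
    "Annual Income": rng.integers(15, 140, size=200),
    "Spending Score": rng.integers(1, 101, size=200),
})

def fit_segments(frame, columns, k, seed=42):
    """Hypothetical helper: fit K-Means on the given columns, return (model, labels)."""
    model = KMeans(n_clusters=k, n_init=10, random_state=seed)
    labels = model.fit_predict(frame[columns])
    return model, labels

# Income-based model: Annual Income vs. Spending Score (k is a placeholder).
income_model, df["income_segment"] = fit_segments(
    df, ["Annual Income", "Spending Score"], k=5
)

# Age-based model: Age vs. Spending Score (k is a placeholder).
age_model, df["age_segment"] = fit_segments(
    df, ["Age", "Spending Score"], k=4
)

# Cross-tabulating the two labelings shows how differently they split the customers.
print(pd.crosstab(df["income_segment"], df["age_segment"]))
```

Keeping the two labelings as separate columns on the same frame makes it easy to compare how a single customer is classified under each view.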

3. Optimization and Validation

  • Applied the Elbow Method to choose the cluster count for each model from its WCSS curve rather than by guesswork.
  • Introduced Hierarchical Clustering as a secondary check, using a dendrogram to see whether it suggests the same number of segments.
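Both checks can be sketched in a few lines. This example assumes scikit-learn for the WCSS curve and SciPy's `scipy.cluster.hierarchy` for the dendrogram; the README does not say which library drew the dendrogram, and the data here are synthetic blobs from `make_blobs`, not the retail dataset.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, fcluster, linkage
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic stand-in data with four generated groups.
X, _ = make_blobs(n_samples=200, centers=4, cluster_std=1.0, random_state=7)

# Elbow Method: record WCSS (scikit-learn calls it inertia_) for k = 1..10.
ks = range(1, 11)
wcss = [
    KMeans(n_clusters=k, n_init=10, random_state=42).fit(X).inertia_
    for k in ks
]
# Reduction in WCSS gained by each extra cluster; the "elbow" is where these
# reductions become small compared with the earlier ones.
drops = -np.diff(wcss)
for k, w in zip(ks, wcss):
    print(k, round(w, 1))

# Hierarchical check: Ward linkage builds the merge tree behind a dendrogram.
Z = linkage(X, method="ward")
tree = dendrogram(Z, no_plot=True)  # set no_plot=False to draw with Matplotlib

# Cut the tree into at most 4 flat clusters to compare with the K-Means choice.
hier_labels = fcluster(Z, t=4, criterion="maxclust")
```

In the drawn dendrogram, long vertical gaps between successive merges suggest a natural place to cut; if that cut yields the same count as the elbow, the two methods agree.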

4. Data-Driven Personas

Translated abstract clusters into quantitative personas (e.g., "Target Customers," "Sensible Spenders," "Careless Consumers"), providing precise insights for targeted marketing.
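One way to turn cluster labels into personas is to profile each cluster by its feature means and then attach a name. In the sketch below the data are synthetic, and the naming rule in `name_persona` (including the thresholds of 50) is purely a hypothetical illustration; the README lists the persona names but not how they were defined.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Synthetic stand-in data: three groups of 40 customers each.
rng = np.random.default_rng(3)
centers = [(90, 85), (90, 15), (25, 80)]  # (income, score) centres, made up
points = np.vstack([rng.normal(loc=c, scale=5, size=(40, 2)) for c in centers])
cols = ["Annual Income", "Spending Score"]  # assumed column names
df = pd.DataFrame(points, columns=cols)

df["segment"] = KMeans(n_clusters=3, n_init=10, random_state=42).fit_predict(df[cols])

# Quantitative profile: mean of each feature plus the size of each cluster.
profile = df.groupby("segment")[cols].mean().round(1)
profile["Customers"] = df.groupby("segment").size()

def name_persona(row, income_cut=50, score_cut=50):
    """Hypothetical naming rule; cut-offs are assumptions for this example."""
    high_income = row["Annual Income"] >= income_cut
    high_spend = row["Spending Score"] >= score_cut
    if high_income and high_spend:
        return "Target Customers"
    if high_income:
        return "Sensible Spenders"
    if high_spend:
        return "Careless Consumers"
    return "Unlabeled"

profile["Persona"] = profile.apply(name_persona, axis=1)
print(profile)
```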

Technologies Used

  • Python
  • Pandas & NumPy for data manipulation.
  • Scikit-Learn for K-Means and Hierarchical Clustering algorithms.
  • Seaborn & Matplotlib for statistical data visualization.
  • Plotly Express for interactive 3D cluster exploration.

Conclusion

This project demonstrates how different clustering approaches uncover different facets of customer behavior. By combining K-Means and Hierarchical methods, we achieve a more nuanced understanding of the customer base, moving beyond simple demographics to behavior-based insights.

