Building a Modern Data Warehouse
SQL Data Warehouse & Analytics Project (MySQL)
Project Overview
This project demonstrates a complete end-to-end data warehousing and analytics solution built using MySQL. It covers the full lifecycle, from ingesting raw CSV data to building an analytics-ready data model that supports business insights.
It is designed as a portfolio piece to showcase practical data engineering and SQL analytics skills, following industry practices such as layered architecture and clean data modeling.
Architecture Overview
The project follows a Medallion Architecture approach:
Bronze Layer: Raw data ingestion from source CSV files
Silver Layer: Cleaned, validated, and standardized data
Gold Layer: Analytics-ready tables for reporting and insights
Each layer is implemented using separate schemas inside MySQL to clearly separate responsibilities.
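A minimal sketch of what that layer separation could look like is shown here. The schema names `bronze`, `silver`, and `gold` are assumptions for illustration; the actual names are defined in `scripts/init_database.sql`. Note that in MySQL, `CREATE SCHEMA` is a synonym for `CREATE DATABASE`, so each layer ends up as its own database on the server.

```sql
-- Assumed layer names; the project's own script may use different ones.
CREATE SCHEMA IF NOT EXISTS bronze;  -- raw data, loaded as-is from CSV
CREATE SCHEMA IF NOT EXISTS silver;  -- cleaned and standardized data
CREATE SCHEMA IF NOT EXISTS gold;    -- analytics-ready tables and views

-- Tables are then addressed with a schema-qualified name,
-- for example bronze.some_table or gold.some_table.
SHOW DATABASES;
```

Keeping one schema per layer makes it easy to grant read access to Gold only, while Bronze and Silver stay internal to the pipeline.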
Tech Stack
Database: MySQL
Language: SQL
Version Control: Git & GitHub
Data Sources: CSV files (ERP & CRM systems)
Project Structure
    sql-data-warehouse-project/
    │
    ├── datasets/                  # Raw CSV input files
    │
    ├── scripts/                   # SQL scripts (executed in order)
    │   ├── init_database.sql      # Create database & schemas
    │   ├── bronze_tables.sql      # Raw ingestion tables
    │   ├── silver_tables.sql      # Cleaned & transformed tables
    │   └── gold_tables.sql        # Analytics-ready tables
    │
    ├── tests/                     # Data quality & validation queries
    │
    ├── docs/                      # Architecture & documentation
    │
    ├── README.md
    └── .gitignore
Project Requirements
Building the Data Warehouse (Data Engineering)
Objective
Develop a modern MySQL-based data warehouse that consolidates sales data from multiple source systems to enable analytical reporting and informed decision-making.
Specifications
Data Sources: Import data from two source systems (ERP and CRM) provided as CSV files.
Data Quality: Cleanse and resolve data quality issues before analysis.
Integration: Combine both sources into a unified, analysis-friendly data model.
Scope: Focus on the latest snapshot of data (no historization).
Documentation: Provide clear documentation to support analytics and business understanding.
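As a rough illustration of how these specifications might translate into SQL, the sketch below loads one CSV file into a Bronze table, then cleanses and deduplicates it into Silver, and finishes with a simple quality check. The file name `datasets/crm_customers.csv`, the schema names, and all table and column names are hypothetical, not the project's actual definitions. It assumes MySQL 8.0 or later (for `ROW_NUMBER()`) and that `local_infile` is enabled on both client and server.

```sql
-- Hypothetical schemas, tables, and columns, for illustration only.
CREATE SCHEMA IF NOT EXISTS bronze;
CREATE SCHEMA IF NOT EXISTS silver;

-- Bronze: mirrors the CSV layout, no constraints.
CREATE TABLE IF NOT EXISTS bronze.crm_customers (
    customer_id INT,
    first_name  VARCHAR(100),
    last_name   VARCHAR(100),
    created_at  DATE
);

-- Full reload: only the latest snapshot is kept (no historization).
TRUNCATE TABLE bronze.crm_customers;

LOAD DATA LOCAL INFILE 'datasets/crm_customers.csv'
INTO TABLE bronze.crm_customers
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES;  -- skip the header row

-- Silver: same shape, but cleaned and deduplicated.
CREATE TABLE IF NOT EXISTS silver.crm_customers (
    customer_id INT,
    first_name  VARCHAR(100),
    last_name   VARCHAR(100),
    created_at  DATE
);

TRUNCATE TABLE silver.crm_customers;

INSERT INTO silver.crm_customers (customer_id, first_name, last_name, created_at)
SELECT
    customer_id,
    NULLIF(TRIM(first_name), ''),  -- strip whitespace, empty string becomes NULL
    NULLIF(TRIM(last_name), ''),
    created_at
FROM (
    SELECT
        customer_id, first_name, last_name, created_at,
        -- rank duplicates so the most recent record per customer is rn = 1
        ROW_NUMBER() OVER (
            PARTITION BY customer_id
            ORDER BY created_at DESC
        ) AS rn
    FROM bronze.crm_customers
    WHERE customer_id IS NOT NULL
) AS ranked
WHERE rn = 1;

-- Data quality check: this query is expected to return no rows.
SELECT customer_id, COUNT(*) AS occurrences
FROM silver.crm_customers
GROUP BY customer_id
HAVING COUNT(*) > 1 OR customer_id IS NULL;
```

The truncate-and-reload pattern is one simple way to honor the "latest snapshot" scope; an incremental load would be a different design.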
BI: Analytics & Reporting (Data Analytics)
Objective
Develop SQL-based analytics to deliver insights into:
Customer Behavior
Product Performance
Sales Trends
These insights support data-driven decision-making and demonstrate real-world analytical use cases.
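To make these use cases concrete, here is a small sketch of two such queries: a monthly sales trend and a product ranking by revenue. The `gold.fact_sales` table and its columns are assumptions for this example; the real Gold model is defined in `scripts/gold_tables.sql` and may differ. The ranking query assumes MySQL 8.0 or later for window functions.

```sql
-- Hypothetical Gold fact table, created here only so the queries can run.
CREATE SCHEMA IF NOT EXISTS gold;
CREATE TABLE IF NOT EXISTS gold.fact_sales (
    order_number VARCHAR(50),
    customer_key INT,
    product_key  INT,
    order_date   DATE,
    sales_amount DECIMAL(12,2),
    quantity     INT
);

-- Sales trends: revenue, orders, and active customers per month.
SELECT
    DATE_FORMAT(order_date, '%Y-%m') AS order_month,
    SUM(sales_amount)                AS total_sales,
    COUNT(DISTINCT order_number)     AS total_orders,
    COUNT(DISTINCT customer_key)     AS active_customers
FROM gold.fact_sales
WHERE order_date IS NOT NULL
GROUP BY DATE_FORMAT(order_date, '%Y-%m')
ORDER BY order_month;

-- Product performance: top 10 products ranked by total revenue.
SELECT
    product_key,
    SUM(sales_amount)                             AS total_sales,
    RANK() OVER (ORDER BY SUM(sales_amount) DESC) AS sales_rank
FROM gold.fact_sales
GROUP BY product_key
ORDER BY sales_rank
LIMIT 10;
```

Customer behavior questions (for example, spend per customer) follow the same pattern, grouping by `customer_key` instead.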
SQL scripts must be executed in the following order:
1. scripts/init_database.sql
2. scripts/bronze_tables.sql
3. scripts/silver_tables.sql
4. scripts/gold_tables.sql
5. tests/data_quality_checks.sql
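One possible way to run the scripts in this order is from inside the `mysql` command-line client, using its `source` command. This is a sketch that assumes the client was started from the repository root, so the relative paths resolve; `source` is a client command rather than a SQL statement, so it will not work from tools that send raw SQL to the server.

```sql
-- Run inside the mysql client, started from the repository root.
source scripts/init_database.sql
source scripts/bronze_tables.sql
source scripts/silver_tables.sql
source scripts/gold_tables.sql
source tests/data_quality_checks.sql
```

Each script relies on objects created by the previous one, which is why the order matters.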
License
This project is licensed under the MIT License. You are free to use, modify, and share this project with proper attribution.
About Me
Hi! I'm Minnu Thomas, an aspiring Data Engineer focused on building strong foundations in SQL, data warehousing, and analytics. This project reflects my hands-on learning journey and my goal of becoming job-ready for data engineering roles.