minnu-et/sql_data_warehouse_project


sql_data_warehouse_project

Building a modern data warehouse

📊 SQL Data Warehouse & Analytics Project (MySQL)

🚀 Project Overview

This project demonstrates a complete end-to-end data warehousing and analytics solution built using MySQL. It covers the full lifecycle, from ingesting raw CSV data to building an analytics-ready data model that supports business insights.

It is designed as a portfolio piece to showcase practical data engineering and SQL analytics skills, following industry practices such as layered architecture and clean data modeling.

🏗️ Architecture Overview

The project follows a Medallion Architecture approach:

Bronze Layer – Raw data ingestion from source CSV files

Silver Layer – Cleaned, validated, and standardized data

Gold Layer – Analytics-ready tables for reporting and insights

Each layer is implemented using separate schemas inside MySQL to clearly separate responsibilities.
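
As a minimal sketch of this layering (not necessarily what `scripts/init_database.sql` contains), the three layers could be created as three MySQL databases, since MySQL treats a schema and a database as the same thing. The names `bronze`, `silver`, and `gold` are assumptions made for illustration.

```sql
-- Assumption: one MySQL database (schema) per layer; the names are illustrative.
CREATE DATABASE IF NOT EXISTS bronze;
CREATE DATABASE IF NOT EXISTS silver;
CREATE DATABASE IF NOT EXISTS gold;

-- Confirm that the three schemas exist.
SELECT schema_name
FROM information_schema.schemata
WHERE schema_name IN ('bronze', 'silver', 'gold');
```

Keeping each layer in its own schema means a table can be referenced by layer, for example `bronze.some_table` versus `silver.some_table`, and privileges can be granted per layer.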

🧰 Tech Stack

Database: MySQL

Language: SQL

Version Control: Git & GitHub

Data Sources: CSV files (ERP & CRM systems)

🗂️ Project Structure

sql-data-warehouse-project/
│
├── datasets/                 # Raw CSV input files
│
├── scripts/                  # SQL scripts (executed in order)
│   ├── init_database.sql     # Create database & schemas
│   ├── bronze_tables.sql     # Raw ingestion tables
│   ├── silver_tables.sql     # Cleaned & transformed tables
│   ├── gold_tables.sql       # Analytics-ready tables
│
├── tests/                    # Data quality & validation queries
│
├── docs/                     # Architecture & documentation
│
├── README.md
└── .gitignore

🎯 Project Requirements

🔧 Building the Data Warehouse (Data Engineering)

Objective

Develop a modern MySQL-based data warehouse that consolidates sales data from multiple source systems to enable analytical reporting and informed decision-making.

Specifications

Data Sources: Import data from two source systems (ERP and CRM) provided as CSV files.

Data Quality: Cleanse and resolve data quality issues before analysis.

Integration: Combine both sources into a unified, analysis-friendly data model.

Scope: Focus on the latest snapshot of data (no historization).

Documentation: Provide clear documentation to support analytics and business understanding.
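
The specifications above could translate into a load-and-cleanse flow along the following lines. This is a sketch under stated assumptions, not the repository's actual scripts: the table `crm_customers`, its columns, and the CSV path are hypothetical, the `bronze` and `silver` schemas are assumed to exist already, `LOAD DATA LOCAL INFILE` requires `local_infile` to be enabled on both client and server, and the window function needs MySQL 8.0 or later.

```sql
-- Hypothetical raw table: every column is text so that no source row is rejected on load.
CREATE TABLE IF NOT EXISTS bronze.crm_customers (
    customer_id   VARCHAR(50),
    first_name    VARCHAR(100),
    last_name     VARCHAR(100),
    gender        VARCHAR(20),
    created_date  VARCHAR(20)
);

-- Latest snapshot only (no historization): empty the table, then reload the whole file.
TRUNCATE TABLE bronze.crm_customers;

-- Hypothetical file path; adjust the delimiters to match the real CSV.
LOAD DATA LOCAL INFILE 'datasets/crm_customers.csv'
INTO TABLE bronze.crm_customers
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES;

-- Hypothetical cleaned table with proper data types.
CREATE TABLE IF NOT EXISTS silver.crm_customers (
    customer_id   INT PRIMARY KEY,
    first_name    VARCHAR(100),
    last_name     VARCHAR(100),
    gender        VARCHAR(10),
    created_date  DATE
);

TRUNCATE TABLE silver.crm_customers;

-- Cleanse: trim text, standardize gender codes, cast types,
-- and keep only the most recent row per customer_id.
-- Assumes dates arrive as YYYY-MM-DD; malformed values may be rejected under strict SQL mode.
INSERT INTO silver.crm_customers (customer_id, first_name, last_name, gender, created_date)
SELECT
    CAST(customer_id AS UNSIGNED),
    TRIM(first_name),
    TRIM(last_name),
    CASE UPPER(TRIM(gender))
        WHEN 'M' THEN 'Male'
        WHEN 'F' THEN 'Female'
        ELSE 'n/a'
    END,
    STR_TO_DATE(NULLIF(TRIM(created_date), ''), '%Y-%m-%d')
FROM (
    SELECT
        b.*,
        ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY created_date DESC) AS rn
    FROM bronze.crm_customers AS b
    WHERE customer_id REGEXP '^[0-9]+$'   -- drop rows without a numeric key
) AS ranked
WHERE rn = 1;
```

Truncating before each load is one simple way to honor the "latest snapshot" scope: every run replaces the previous contents instead of accumulating history.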

📈 BI: Analytics & Reporting (Data Analytics)

Objective

Develop SQL-based analytics to deliver insights into:

Customer Behavior

Product Performance

Sales Trends

These insights support data-driven decision-making and demonstrate real-world analytical use cases.
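
To make the three focus areas concrete, queries of the following shape could run against the gold layer. The tables `gold.fact_sales`, `gold.dim_products`, and `gold.dim_customers`, along with all column names, are hypothetical and only illustrate a typical star schema; the repository's actual gold tables may differ.

```sql
-- Sales trends: revenue and order count per month.
SELECT
    DATE_FORMAT(order_date, '%Y-%m')  AS order_month,
    SUM(sales_amount)                 AS total_sales,
    COUNT(DISTINCT order_number)      AS total_orders
FROM gold.fact_sales
WHERE order_date IS NOT NULL
GROUP BY DATE_FORMAT(order_date, '%Y-%m')
ORDER BY order_month;

-- Product performance: top 5 products by revenue.
SELECT
    p.product_name,
    SUM(f.sales_amount) AS total_sales
FROM gold.fact_sales AS f
JOIN gold.dim_products AS p
    ON p.product_key = f.product_key
GROUP BY p.product_name
ORDER BY total_sales DESC
LIMIT 5;

-- Customer behavior: number of orders and total spend per customer.
SELECT
    c.customer_key,
    COUNT(DISTINCT f.order_number) AS total_orders,
    SUM(f.sales_amount)            AS total_spend
FROM gold.fact_sales AS f
JOIN gold.dim_customers AS c
    ON c.customer_key = f.customer_key
GROUP BY c.customer_key
ORDER BY total_spend DESC;
```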

▢️ Execution Order

SQL scripts must be executed in the following order:

scripts/init_database.sql

scripts/bronze_tables.sql

scripts/silver_tables.sql

scripts/gold_tables.sql

tests/data_quality_checks.sql
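
One way to run the scripts in this order is the `SOURCE` command of the `mysql` command-line client. The sketch below assumes the client was started from the repository root, so the relative paths resolve, and that the connected user is allowed to create databases.

```sql
-- Run inside the mysql command-line client, started from the repository root.
SOURCE scripts/init_database.sql;
SOURCE scripts/bronze_tables.sql;
SOURCE scripts/silver_tables.sql;
SOURCE scripts/gold_tables.sql;

-- Validate the result once all layers are built.
SOURCE tests/data_quality_checks.sql;
```

The order matters because each layer reads from the one before it: the silver scripts expect the bronze tables to exist, and the gold scripts expect the silver tables.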

📜 License

This project is licensed under the MIT License. You are free to use, modify, and share this project with proper attribution.

👋 About Me

Hi! I'm Minnu Thomas, an aspiring Data Engineer focused on building strong foundations in SQL, data warehousing, and analytics. This project reflects my hands-on learning journey and my goal of becoming job-ready for data engineering roles.
