Joyan9/learning_data_engineering

Learning Data Engineering 🚀

This repository documents my journey in learning data engineering concepts, tools, and best practices. It contains structured notes, hands-on exercises, and references to key topics in the field.

📂 Repository Structure

  • Apache Airflow
  • Apache Spark
  • Data Engineering Best Practices
  • Data Engineering Fundamental Concepts
  • Data Modelling – Dimensional modelling techniques, scenario-based examples, and interview questions.
  • Data Processing – Comparison of batch vs. stream processing techniques.
  • Python Concepts – Key Python programming concepts useful for data engineering.
  • DBT/SQL for Modularity – SQL refactoring techniques and best practices using dbt.
  • dlt (data load tool) – Open-source Python library that loads data from multiple sources and reduces boilerplate code, with features such as automatic schema detection and unnesting of nested data.

🔥 Current Focus

  • Completing the Data Engineering ZoomCamp 2025 coursework.
  • Deep diving into workflow orchestration and modern data stack tools.
