- Python (version >= 3.8)
- Docker
- Make
- k3d
- Helm
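
As a quick sanity check before setup, the prerequisites listed above can be verified with a small script. This is only a sketch: the executable names (`docker`, `make`, `k3d`, `helm`) are assumed to match what is on your `PATH`, and the helper names `missing_tools` and `python_ok` are hypothetical and not part of this repository.

```python
import shutil
import sys

# Assumed executable names for the prerequisites; adjust if yours differ.
REQUIRED_TOOLS = ["docker", "make", "k3d", "helm"]
MIN_PYTHON = (3, 8)


def missing_tools(tools, which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def python_ok(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    problems = missing_tools(REQUIRED_TOOLS)
    if not python_ok():
        problems.append("python>=3.8")
    if problems:
        print("Missing prerequisites: " + ", ".join(problems))
        sys.exit(1)
    print("All prerequisites found.")
```

The script only checks that each executable is present; it does not check tool versions or whether the Docker daemon is running.
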
This project builds a real-time analytics data platform using modern technologies: Apache Kafka, Apache Flink, Apache Iceberg, Trino, and Apache Pinot. These tools are widely adopted in production at large-scale companies such as Netflix, Uber, LinkedIn, and Airbnb to handle high-volume, low-latency data processing and analytics.
The goal of this project is to understand how these components work together as a complete data stack, covering data ingestion, stream processing, storage, and real-time querying. It also demonstrates how to deploy, run, and test the platform locally using docker-compose and Kubernetes via k3d with Helm charts.