Phase 5: Airflow Orchestration + Spark Batch Jobs #12

Merged
AndrewAct merged 6 commits into main from dev on May 17, 2026


Conversation

@AndrewAct

Owner

No description provided.

AndrewAct and others added 6 commits May 17, 2026 01:05
…rializer

Replace SimpleStringSchema with a custom KafkaRecordDeserializationSchema in
both normalize.py and cdc_symbol_config.py so that kafka_partition and
kafka_offset written to Iceberg silver/bronze layers reflect the actual Kafka
record position instead of the placeholder -1.

The -1 fallback path in iceberg_reader.py is retained for backward
compatibility with rows written before this fix.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
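
To make the commit message above concrete, here is a minimal sketch of the two ideas it describes: attaching the record's partition and offset to each row, and treating `-1` as "position unknown" for rows written before the fix. It is plain Python, not the PyFlink code from this PR. The names `deserialize_record`, `resolve_position`, `KafkaPosition`, and the `payload` field are hypothetical; only `kafka_partition`, `kafka_offset`, and the `-1` placeholder come from the commit message.

```python
from dataclasses import dataclass
from typing import Optional

# Placeholder written by the old value-only deserialization path (per the commit message).
PLACEHOLDER = -1


@dataclass(frozen=True)
class KafkaPosition:
    """Hypothetical container for a record's position in Kafka."""
    partition: int
    offset: int


def deserialize_record(value: bytes, partition: int, offset: int) -> dict:
    """Hypothetical record-level deserializer.

    A value-only schema sees just `value`; a record-level schema also sees the
    record metadata, so the real partition and offset can be stored with the row.
    """
    return {
        "payload": value.decode("utf-8"),
        "kafka_partition": partition,
        "kafka_offset": offset,
    }


def resolve_position(kafka_partition: int, kafka_offset: int) -> Optional[KafkaPosition]:
    """Hypothetical reader-side fallback.

    Rows written before the fix carry -1 in both columns; return None for them
    so callers can tell "position unknown" apart from a real position.
    """
    if kafka_partition == PLACEHOLDER or kafka_offset == PLACEHOLDER:
        return None
    return KafkaPosition(kafka_partition, kafka_offset)


row = deserialize_record(b'{"symbol": "BTCUSDT"}', partition=2, offset=7)
new_row_position = resolve_position(row["kafka_partition"], row["kafka_offset"])
legacy_row_position = resolve_position(PLACEHOLDER, PLACEHOLDER)
```

Returning `None` for legacy rows, rather than passing `-1` through, is one possible design; the actual handling in iceberg_reader.py is not shown in this PR conversation.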
fix: store real kafka partition+offset in Iceberg via KafkaRecordDeserializer
- Add airflow/dags: backfill_ohlcv_1m, compact_iceberg_tables,
  expire_iceberg_snapshots, freshness_sla_check, run_dbt_models,
  run_great_expectations
- Add spark/jobs: backfill_ohlcv, compact_tables, spark_lib
- Add airflow/tests and spark/tests; include both in pytest testpaths
- Rename flink/jobs/lib → flink_lib, spark/jobs/lib → spark_lib to
  eliminate bare `lib` package collision across services
- Fix all ruff lint errors in new airflow/spark files

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
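
The rename in the list above addresses a general Python import behavior: when two directories on `sys.path` each contain a package with the same name, the first one found wins and the other is silently shadowed. The following sketch reproduces that in a temporary directory. The directory layout mirrors the `flink/jobs` and `spark/jobs` paths named in the commit, but the `OWNER` attribute and the helper functions are hypothetical and exist only for the demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_service(root: Path, service: str, package: str) -> Path:
    """Create <root>/<service>/jobs/<package>/__init__.py and return the jobs dir."""
    jobs_dir = root / service / "jobs"
    pkg_dir = jobs_dir / package
    pkg_dir.mkdir(parents=True)
    # OWNER is a hypothetical marker recording which service the package belongs to.
    (pkg_dir / "__init__.py").write_text(f"OWNER = {service!r}\n")
    return jobs_dir


def import_owner(paths, package: str) -> str:
    """Import `package` with `paths` prepended to sys.path and return its OWNER."""
    saved_path = list(sys.path)
    sys.modules.pop(package, None)
    sys.path[:0] = [str(p) for p in paths]
    importlib.invalidate_caches()
    try:
        return importlib.import_module(package).OWNER
    finally:
        # Restore the interpreter state so the demo leaves no side effects.
        sys.path[:] = saved_path
        sys.modules.pop(package, None)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    # Before the rename: both services expose a bare `lib` package.
    flink_old = make_service(root / "old", "flink", "lib")
    spark_old = make_service(root / "old", "spark", "lib")
    collided = import_owner([flink_old, spark_old], "lib")

    # After the rename: each service has a uniquely named package.
    flink_new = make_service(root / "new", "flink", "flink_lib")
    spark_new = make_service(root / "new", "spark", "spark_lib")
    flink_owner = import_owner([flink_new, spark_new], "flink_lib")
    spark_owner = import_owner([flink_new, spark_new], "spark_lib")
```

With both old directories on the path, `import lib` resolves to whichever comes first (here the Flink copy), so Spark code would load the wrong package. With distinct names, each import resolves to its own service regardless of path order, which matters once a single pytest run collects tests from several services.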
AndrewAct merged commit 747cf28 into main on May 17, 2026
6 checks passed
1 participant