
Phase 4: FastAPI + dbt + Prometheus + Grafana + Load Testing #7

Merged: AndrewAct merged 1 commit into main from dev on May 17, 2026
@AndrewAct (Owner)

What changed

API layer (api/)

  • New FastAPI service with 7 endpoints: /health, /ready, /ohlcv/{symbol}, /spread/{symbol}, /liquidity/{symbol}, /pipeline/lag,
    /symbols
  • Background poller (poller.py) queries Trino every 30s and publishes live market data as Prometheus Gauges — mid price, spread bps,
    order book imbalance, staleness, health score per symbol
  • Prometheus middleware for request rate, P95 latency, error rate by endpoint
  • Pydantic v2 request/response models throughout; all Trino queries in .sql files (no inline SQL in business logic)
  • Docker healthcheck fixed: installed curl in python:3.13-slim image via apt-get
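
For reviewers unfamiliar with the per-symbol gauges, here is a minimal sketch of how mid price, spread in basis points, and order book imbalance are commonly derived from a top-of-book snapshot. The PR does not show the formulas used by poller.py, so the definitions below are the conventional ones and are an assumption; TopOfBook and the function names are hypothetical and do not come from the codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TopOfBook:
    """Hypothetical best bid/ask snapshot for one symbol."""
    bid: float
    ask: float
    bid_qty: float
    ask_qty: float


def mid_price(book: TopOfBook) -> float:
    # Midpoint between best bid and best ask.
    return (book.bid + book.ask) / 2.0


def spread_bps(book: TopOfBook) -> float:
    # Bid-ask spread relative to the mid price, in basis points (1 bp = 0.01%).
    return (book.ask - book.bid) / mid_price(book) * 10_000.0


def imbalance(book: TopOfBook) -> float:
    # Ranges from -1 (all resting size on the ask) to +1 (all on the bid).
    # Assumed convention: an empty book reports 0.0 rather than raising.
    total = book.bid_qty + book.ask_qty
    if total == 0:
        return 0.0
    return (book.bid_qty - book.ask_qty) / total
```

In a poller, values like these would be written to one gauge per metric, labelled by symbol, on each polling cycle.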

dbt (dbt/)

  • Gold layer models: mart_ohlcv, mart_liquidity, mart_exchange_health
  • dbt-runner init container waits for Flink to create the normalized schema before running (polling loop prevents race condition on
    cold start)
  • Freshness thresholds corrected: FRESH ≤ 60s / WARN ≤ 120s, aligned with Flink's checkpoint interval (was 30s/60s, which caused health
    score to oscillate with every checkpoint cycle)
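
The threshold change can be illustrated with a small sketch. It assumes that data becomes visible only when a checkpoint commits, so the observed data age climbs from 0 up to the checkpoint interval and then resets. The PR does not state the interval; the 45s used below is hypothetical, as are the function names and the "STALE" label for anything beyond the WARN bound.

```python
def classify_freshness(age_seconds: float,
                       fresh_max: float = 60.0,
                       warn_max: float = 120.0) -> str:
    """Map data age to a status using the thresholds from this PR."""
    if age_seconds <= fresh_max:
        return "FRESH"
    if age_seconds <= warn_max:
        return "WARN"
    return "STALE"  # hypothetical label; the PR names only FRESH and WARN


def statuses_over_cycle(interval_s: int, fresh_max: float,
                        warn_max: float, step_s: int = 5) -> set[str]:
    """Collect every status seen while age sweeps from 0 to interval_s."""
    ages = range(0, interval_s + 1, step_s)
    return {classify_freshness(age, fresh_max, warn_max) for age in ages}


# Hypothetical 45s checkpoint interval (not stated in the PR).
old = statuses_over_cycle(45, fresh_max=30, warn_max=60)    # flips status
new = statuses_over_cycle(45, fresh_max=60, warn_max=120)   # stays stable
```

Under that assumption, the old 30s/60s thresholds report both FRESH and WARN within a single cycle, while the 60s/120s thresholds stay on one status.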

Monitoring (infra/config/grafana/, infra/config/prometheus/)

  • Replaced prom/prometheus:v2.55.1 with victoriametrics/victoria-metrics:v1.102.1 — prom/prometheus crashes on Apple Silicon Docker
    with a SIGBUS in NewActiveQueryTracker (mmap issue with linuxkit kernel); VictoriaMetrics is a drop-in Prometheus-compatible
    replacement without the mmap dependency
  • Grafana dashboard: 17 panels — API ops metrics (request rate, P95, error rate), live market prices for 5 symbols, bid-ask spread
    trend, order book imbalance, pipeline health scores
  • k6 dashboard: 8 panels — VUs, request rate, check pass rate, error rate, latency percentiles, per-endpoint P95, per-assertion pass
    rate
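
As a reference for reading the P95 panels, the sketch below computes a per-endpoint 95th percentile from raw latency samples using the nearest-rank method. This is only an illustration of what the number means: dashboards backed by Prometheus-style histograms typically estimate percentiles from bucket counts instead, and the PR does not say which query the panels use. All names here are hypothetical.

```python
import math
from collections import defaultdict
from typing import Iterable


def percentile_nearest_rank(samples: list[float], pct: int) -> float:
    """Smallest sample such that at least pct% of samples are <= it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def p95_by_endpoint(
    observations: Iterable[tuple[str, float]],
) -> dict[str, float]:
    """Group (endpoint, latency_ms) pairs and take P95 per endpoint."""
    grouped: dict[str, list[float]] = defaultdict(list)
    for endpoint, latency_ms in observations:
        grouped[endpoint].append(latency_ms)
    return {ep: percentile_nearest_rank(vals, 95)
            for ep, vals in grouped.items()}
```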

Load testing (k6/)

  • make load-gen runs k6 via docker run --network ticksense_default, exercising all 6 endpoints with per-endpoint tags and per-assertion
    checks
  • Results stream to VictoriaMetrics via Prometheus remote write and appear in the k6 Grafana dashboard in real time

Test infrastructure

  • Removed pythonpath from pytest config — editable installs (.pth files) own import paths; explicit pythonpath caused namespace-package
    shadowing where pytest cached workspace member directories as namespace packages before the real src/ packages were reachable
  • Removed __init__.py from api/tests/ (rootless layout); renamed test_models.py → test_api_models.py to eliminate the basename collision
    with ingest/tests/unit/test_models.py
  • 179 tests, 92% coverage
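
The shadowing described above can be reproduced outside pytest with the standard import machinery. The sketch below is a simplified stand-in, not the project's actual layout: demo_member is a hypothetical workspace member whose real package lives under demo_member/src/. If the workspace root reaches sys.path first, the bare demo_member/ directory is imported as a namespace package and cached in sys.modules, so later making src/ reachable does not help.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def demo_shadowing() -> tuple[bool, bool]:
    """Return (early import was a namespace package, real package hidden)."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Real package: <root>/demo_member/src/demo_member/__init__.py
        pkg_dir = root / "demo_member" / "src" / "demo_member"
        pkg_dir.mkdir(parents=True)
        (pkg_dir / "__init__.py").write_text("REAL = True\n")

        saved_path = list(sys.path)
        try:
            # Step 1: only the workspace root is importable, so the bare
            # directory <root>/demo_member/ becomes a namespace package.
            sys.path.insert(0, str(root))
            importlib.invalidate_caches()
            early = importlib.import_module("demo_member")
            early_is_namespace = getattr(early, "__file__", None) is None

            # Step 2: the real src/ directory becomes reachable, but the
            # cached namespace module in sys.modules still wins.
            sys.path.insert(0, str(root / "demo_member" / "src"))
            importlib.invalidate_caches()
            late = importlib.import_module("demo_member")
            still_shadowed = not hasattr(late, "REAL")
            return early_is_namespace, still_shadowed
        finally:
            # Restore interpreter state so the demo has no side effects.
            sys.path[:] = saved_path
            sys.modules.pop("demo_member", None)
```

Leaving import paths to the editable installs avoids step 1 entirely, because the workspace root is never placed on sys.path ahead of the real packages.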

Docs

  • ROADMAP.md: Phase 4 marked complete
  • DEBUGGING_PHASE4.md: two new entries — Docker healthcheck curl fix; pytest namespace-package shadowing (full root cause + rule of
    thumb)
  • MARKET_CONCEPTS.md / MARKET_CONCEPTS_ZH.md: bilingual glossary for bid/ask/spread/imbalance
  • docs/BLOG_IDEAS.md, docs/VIDEO_SCRIPT.md: 6-article Medium series outline and bilingual 10–15 min demo video script

AndrewAct merged commit 3ad0de5 into main on May 17, 2026
6 checks passed