feat(tiering): rename compression to tiering with per-chunk LSN durability gate #25

Merged
taran-dbx merged 9 commits into main from feat/compression-to-tiering on Jun 1, 2026
@taran-dbx
Collaborator

Summary

Reworks the old "compression" feature into tiering: a tiering policy drops cold ChronoTable partitions to reclaim Lakebase storage, but only once the data is provably durable in the Unity Catalog Managed Table via Lakebase CDF. Pure Lakebase SQL — the Spark/Delta-export path is gone (CDF owns the UC copy).

Clean break, no upgrade migration — ships as a fresh release.

API

  • add_compression_policy → add_tiering_policy(table, after, [schema]) (drops the segment_by/order_by params)
  • compress_chunk/decompress_chunk → tier_chunk (returns BOOLEAN: TRUE dropped, FALSE deferred) / untier_chunk
  • show_/remove_compression_policy → show_/remove_tiering_policy
  • New show_tiering_status() + lakets_tiering_* Prometheus metrics
  • compression policy type → tiering; compressed chunk status removed (active → tiered); _chronotable_registry.compression_enabled → tiering_enabled; _chunk_metadata gains last_write_lsn
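
Because this is a clean break with no upgrade migration, callers have to rename their own calls. As a rough aid, the sketch below scans SQL text for the removed function names listed above and reports the replacement for each. The helper name `find_removed_calls` is hypothetical and not part of this PR. A plain rename is not enough for add_compression_policy, since its segment_by/order_by arguments must also be dropped by hand.

```python
import re

# Old name -> new name, taken from the API list in this PR description.
RENAMES = {
    "add_compression_policy": "add_tiering_policy",
    "show_compression_policy": "show_tiering_policy",
    "remove_compression_policy": "remove_tiering_policy",
    "compress_chunk": "tier_chunk",
    "decompress_chunk": "untier_chunk",
}

# Word boundaries keep "compress_chunk" from matching inside "decompress_chunk".
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, RENAMES)) + r")\b")


def find_removed_calls(sql: str) -> list:
    """Return (old_name, new_name) pairs for each removed function found in sql."""
    return [(m.group(1), RENAMES[m.group(1)]) for m in _PATTERN.finditer(sql)]
```

Running it over a script or notebook export gives a checklist of call sites to update; it does not rewrite anything itself.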

Durability gate (the core of this PR)

A partition is dropped only when its CDF shadow is STREAMING in wal2delta.tables and committed_lsn >= chunk.last_write_lsn.

The first design compared committed_lsn to pg_current_wal_lsn() (the global WAL head). Live testing on a CDF-enabled Lakebase instance disproved that: a per-table committed_lsn does not advance while the shadow is idle — it freezes at the last flush while the global head keeps climbing from unrelated activity. So the head comparison never passes for a cold (idle) chunk, which is exactly what we want to evict. The gate now compares against each chunk's own last_write_lsn, stamped by statement-level transition-table triggers (INSERT/UPDATE/DELETE) on the ChronoTable parent — a write to the hot chunk never bumps a cold chunk's watermark. Fail-closed everywhere: missing CDF, non-STREAMING shadow, NULL watermark, or behind-chunk all defer.
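
The gate logic can be sketched in a few lines of Python. This is an illustration of the decision rule described above, not the SQL that ships in tier_chunk; the function names `parse_lsn` and `can_drop_chunk` are hypothetical. The only outside fact it relies on is PostgreSQL's textual pg_lsn format, two hexadecimal numbers separated by a slash, where the first is the high 32 bits.

```python
from typing import Optional


def parse_lsn(text: str) -> int:
    """Convert a pg_lsn string such as '16/B374D848' to a 64-bit integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def can_drop_chunk(
    shadow_state: Optional[str],
    committed_lsn: Optional[str],
    last_write_lsn: Optional[str],
) -> bool:
    """Return True only when the chunk is provably durable; defer otherwise."""
    # Missing CDF row (None) or a shadow that is not STREAMING: defer.
    if shadow_state != "STREAMING":
        return False
    # A NULL on either side means durability cannot be proven: defer.
    if committed_lsn is None or last_write_lsn is None:
        return False
    # Compare against the chunk's own watermark, not the global WAL head.
    return parse_lsn(committed_lsn) >= parse_lsn(last_write_lsn)
```

With an idle shadow frozen at committed_lsn `0/5000`, a cold chunk last written at `0/4000` passes, whereas comparing the same committed_lsn to a global head at `0/9000` would fail indefinitely, which is the behavior live testing exposed.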

CDF is a documented prerequisite (enabled on the lakets_cdf schema via Databricks); enable_sync now warns when wal2delta is absent.

Also

  • compression_job.py → tiering_job.py (Spark-free, calls tier_chunk); bundle job lakets_compression → lakets_tiering
  • Full docs rewrite (page renamed compression-and-retention.md → tiering-and-retention.md; site build green)
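
For orientation, the shape of a Spark-free job loop might look like the sketch below. It is not the contents of tiering_job.py: the names `run_tiering` and `TieringResult` are hypothetical, and `tier_chunk` is passed in as a callable so the sketch does not assume the SQL function's argument list. In a real job that callable would presumably wrap a `SELECT lakets.tier_chunk(...)` call.

```python
from typing import Callable, Iterable, List, NamedTuple


class TieringResult(NamedTuple):
    dropped: List[str]
    deferred: List[str]


def run_tiering(
    chunks: Iterable[str], tier_chunk: Callable[[str], bool]
) -> TieringResult:
    """Call tier_chunk once per eligible chunk and record the outcome."""
    dropped: List[str] = []
    deferred: List[str] = []
    for chunk in chunks:
        # tier_chunk returns TRUE when the partition was dropped and
        # FALSE when the durability gate deferred it.
        if tier_chunk(chunk):
            dropped.append(chunk)
        else:
            deferred.append(chunk)
    return TieringResult(dropped, deferred)
```

A deferred chunk is not an error: it stays in place and is simply retried on the next run once CDF has caught up.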

Test plan

  • tests/test_tiering.sql — 8 cases: fail-closed gate, policy CRUD, eligibility, tier_chunk/untier_chunk, show_tiering_status, per-chunk write-stamping trigger
  • Full SQL suite green on a clean install (tiering, monitoring, retention, rollup, shadow-sync, …)
  • Live integration on a CDF-enabled instance: caught-up cold chunk drops (partition gone, status tiered); LSN-behind and not-STREAMING both defer with the partition intact; hot-chunk writes leave cold watermarks untouched
  • cd website && npm run build passes (strict broken-link check)
  • Reviewer note: tests/test_ingest.sql has a pre-existing date-drift failure (hardcoded 2026-03-25 timestamps, unrelated to this PR)

taran-dbx added 9 commits June 1, 2026 13:58
Rename add/remove/show_compression_policy -> *_tiering_policy (drop
Delta-era segment_by/order_by). Replace compress/decompress_chunk with
tier_chunk (drops partition only when shadow STREAMING and committed_lsn
>= pg_current_wal_lsn()) and untier_chunk. Add _get_chunks_to_tier
eligibility filter and show_tiering_status observability function.

…_* metrics

Emit lakets_tiering_pending_chunks, _tiered_chunks_total, and _caught_up
(CDF durability gate) per table. Drop the dead compressed_chunks column
from chunk_health now that 'compressed' status is removed.

Rewrite the job to call lakets.tier_chunk per eligible chunk (the gated
drop and metadata transition live in SQL), dropping the Spark/Delta
optimize path entirely. Rename compression_job.py -> tiering_job.py,
lakets_compression -> lakets_tiering in the bundle, and repoint the
python-pattern tests at the new Spark-free job.

Live CDF testing disproved the original gate's assumption: a per-table
committed_lsn does NOT advance while the shadow is idle, so it never
catches pg_current_wal_lsn() (which keeps moving from unrelated DB
activity). Cold chunks -- exactly what we want to evict -- could never
pass committed_lsn >= pg_current_wal_lsn().

Redesign:
- Add _chunk_metadata.last_write_lsn, stamped by statement-level
  transition-table triggers on the ChronoTable parent (one each for
  INSERT/UPDATE/DELETE; a transition table allows only one event). The
  row->chunk mapping is recomputed via date_bin so a write to the hot
  chunk never bumps a cold chunk's watermark.
- tier_chunk now drops iff shadow STREAMING AND committed_lsn >=
  chunk.last_write_lsn (NULL watermark = cannot prove durable = defer).
- add_tiering_policy installs the triggers and backfills existing active
  chunks with the current WAL head (conservative upper bound).
- show_tiering_status.caught_up / cdf_lag_bytes and the lakets_tiering_*
  metrics now reflect the per-chunk gate (metrics sourced from
  show_tiering_status for a single source of truth).
- enable_sync warns when Lakebase CDF (wal2delta) is absent, since CDF is
  a prerequisite enabled on the lakets_cdf schema, not by LakeTS.

Verified live on lakets-tiering-test: caught-up cold chunk drops
(partition gone, status tiered); LSN-behind and not-streaming both defer
with the partition intact; hot-chunk writes leave cold watermarks
untouched. New TEST 8 locks in the trigger behavior.
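
The row->chunk mapping in the redesign above can be illustrated in Python. This is a sketch of the idea only: the real stamping happens in the SQL triggers, and the names `stamp_watermarks`, the one-day stride, and the origin used below are assumptions for illustration. `date_bin` here mirrors the PostgreSQL function of the same name for the simple case of a fixed-length stride.

```python
from datetime import datetime, timedelta
from typing import Dict, Iterable


def date_bin(stride: timedelta, ts: datetime, origin: datetime) -> datetime:
    """Start of the stride-wide bin containing ts, aligned to origin."""
    return origin + ((ts - origin) // stride) * stride


def stamp_watermarks(
    watermarks: Dict[datetime, int],
    written_rows: Iterable[datetime],
    current_lsn: int,
    stride: timedelta,
    origin: datetime,
) -> Dict[datetime, int]:
    """Set last_write_lsn only for chunks that the statement touched."""
    touched = {date_bin(stride, ts, origin) for ts in written_rows}
    for chunk_start in touched:
        watermarks[chunk_start] = current_lsn
    return watermarks
```

Because only the chunks returned by the binning step are updated, a statement that writes rows into today's chunk leaves every older chunk's watermark at its previous value, so a cold chunk's gate depends solely on its own last write.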
…itoring

Rename guides/how-it-works/compression-and-retention.md -> tiering-and-retention.md
and rewrite all docs to the tiering model: add_tiering_policy (no
segment_by/order_by), tier_chunk (BOOLEAN, defers), untier_chunk,
show_tiering_status, lakets_tiering_* metrics. Document the per-chunk
durability gate (committed_lsn >= chunk.last_write_lsn) and that Lakebase
CDF on the lakets_cdf schema is a prerequisite. Drop Spark/Z-ORDER/_archive
wording from the workflow-jobs reference. Update metadata-tables
(tiering_enabled, active/tiered/dropped, last_write_lsn) and fix the
sidebar + menu-icon selectors for the renamed page.

Remove the last compression remnants: drop the unused _chunk_metadata
compressed_at column, drop 'compressed' from the rollup freshness
classifier, retire tests/test_compression.sql (replaced by
test_tiering.sql), and convert test_monitoring T5 to assert tiered
chunks. Update README feature table + example and the CHANGELOG
Unreleased section to the tiering model and per-chunk LSN gate.
taran-dbx merged commit 179468f into main on Jun 1, 2026
10 checks passed
taran-dbx deleted the feat/compression-to-tiering branch on June 1, 2026 13:51