databricks-solutions/lakets

LakeTS — Time-Series Toolkit for Databricks Lakebase

LakeTS turns Databricks Lakebase (managed PostgreSQL 17) into a time-series database: automatic time-based partitioning, incremental RollUps, a last-value cache, policy-driven lifecycle tiering, and one-call sync to Unity Catalog via Lakebase CDF. It is pure PL/pgSQL — no custom extensions required — with optional Databricks jobs for scheduled maintenance.

Install

# Single-file install (recommended) — from a published release
curl -LO https://github.com/databricks-solutions/lakets/releases/latest/download/lakets.sql
psql -q -h <host> -U <user> -d <database> -f lakets.sql

# Or from source
git clone https://github.com/databricks-solutions/lakets.git
psql -q -h <host> -U <user> -d <database> -f lakets/sql/99_install.sql
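
After either install path, a quick sanity check can confirm that the objects landed in the database. The query below is a minimal sketch that uses only standard PostgreSQL catalogs; it assumes the toolkit installs into a schema named lakets, which is inferred from the lakets.* calls in this README rather than taken from the project's documentation.

```sql
-- Post-install check using standard PostgreSQL catalogs only.
-- Assumption: LakeTS objects live in a schema named 'lakets'
-- (inferred from the lakets.* function calls in this README).
SELECT n.nspname    AS schema_name,
       count(p.oid) AS routine_count   -- functions, procedures, and aggregates in the schema
FROM pg_namespace AS n
LEFT JOIN pg_proc AS p
       ON p.pronamespace = n.oid
WHERE n.nspname = 'lakets'
GROUP BY n.nspname;
```

An empty result means the schema does not exist in the connected database; a row with a routine count of zero suggests the install script did not complete.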

Quick start

-- Partition a table by time
CREATE TABLE metrics (time TIMESTAMPTZ NOT NULL, device TEXT, cpu FLOAT8);
SELECT lakets.create_chronotable('metrics', 'time', '1 day');

-- Query with time-series functions
SELECT lakets.time_bucket('1 hour'::interval, time) AS hour,
       avg(cpu), lakets.first(cpu, time), lakets.last(cpu, time)
FROM metrics GROUP BY 1 ORDER BY 1;

-- Lifecycle: keep data resident, then drop once it is durable in Unity Catalog
SELECT lakets.add_tiering_policy('metrics', '7 days');
SELECT lakets.add_retention_policy('metrics', '90 days');

-- Mirror to Unity Catalog (Lakebase CDF)
SELECT lakets.enable_sync('metrics');
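
To see what the bucketing and first/last aggregation in the query above compute, the same shape can be approximated with PostgreSQL built-ins on inline sample rows. This is an illustrative sketch, not LakeTS's implementation: it assumes lakets.time_bucket aligns timestamps to fixed-width bins much like date_bin does, and that lakets.first and lakets.last return the value with the earliest and latest timestamp in each group. The sample readings and the origin timestamp are made up for the example.

```sql
-- Approximation of hourly bucketing with built-ins (PostgreSQL 14+ for date_bin).
-- Assumption: this mirrors the intent of lakets.time_bucket / first / last;
-- it does not call or reproduce the LakeTS functions themselves.
SELECT date_bin('1 hour'::interval, t.ts,
                TIMESTAMPTZ '2000-01-01 00:00:00+00') AS hour,      -- bin start
       avg(t.cpu)                                      AS avg_cpu,
       (array_agg(t.cpu ORDER BY t.ts ASC))[1]         AS first_cpu, -- earliest reading in the bin
       (array_agg(t.cpu ORDER BY t.ts DESC))[1]        AS last_cpu   -- latest reading in the bin
FROM (VALUES
        (TIMESTAMPTZ '2024-01-01 10:05:00+00', 0.20::float8),
        (TIMESTAMPTZ '2024-01-01 10:45:00+00', 0.40::float8),
        (TIMESTAMPTZ '2024-01-01 11:10:00+00', 0.90::float8)
     ) AS t(ts, cpu)
GROUP BY 1
ORDER BY 1;
```

The three sample readings fall into two hourly bins: the first two share the 10:00 bin and the third lands in the 11:00 bin.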

Documentation

Full documentation is published at https://databricks-solutions.github.io/lakets/:

  • Getting started — install, create ChronoTables, run your first query.
  • How it works — partitioning, RollUps, tiering, and Lakebase CDF internals.
  • How-to guides — RollUps, lifecycle, LVC, alerts, bulk ingest, sync to UC, upgrading.
  • Reference — every function, aggregate, trigger, and metadata table.

Requirements

  • Databricks workspace with Lakebase (PostgreSQL 17+)
  • For scheduled maintenance jobs: a Databricks serverless runtime with the Python dependencies installed (pip install -r requirements.txt)

Contributing

PRs to main are validated by CI (SQL lint, Python lint, secret scan, unit tests) — see .github/workflows/ci.yml. Licensed under the Databricks License.

About

LakeTS — Time Series Toolkit for Databricks Lakebase. Pure SQL (PL/pgSQL) functions delivering ChronoTables, RollUps, gap-filling, and Lakehouse Sync on a hot (Lakebase) + cold (Delta) tier — no custom extensions.
