feat(jobs): authenticate to Lakebase via M2M OAuth as a service principal #29

Merged: taran-dbx merged 1 commit into main from docs/lakebase-m2m-auth-service-principal on Jun 1, 2026

@taran-dbx
Collaborator

Summary

Adopts the Databricks psycopg3 connection pattern for the maintenance jobs and makes the service-principal execution + Lakebase permission model explicit, per the M2M OAuth docs.

Changes

Authentication (databricks/workflows/lakebase_utils.py)

  • The OAuth credential is now minted inside the connect() of a psycopg.Connection subclass (the documented psycopg3 pattern), so every physical connection gets a fresh, unexpired token, which handles the ~1 h token rotation transparently.
  • Postgres role defaults to the running identity (current_user) and is overridable via the LAKETS_PG_ROLE env var (for when the Lakebase role name differs from the SP application ID).
  • M2M auth uses the SDK's default WorkspaceClient(), which resolves the job's service principal automatically inside Databricks and reads DATABRICKS_HOST/DATABRICKS_CLIENT_ID/DATABRICKS_CLIENT_SECRET for runs outside Databricks.
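
For reviewers who have not seen the pattern, here is a rough sketch of the two ideas above: minting the credential inside connect(), and resolving the Postgres role from LAKETS_PG_ROLE. It is not the code in lakebase_utils.py. The names resolve_pg_role, token_minting_connection, and RecordingConnection are hypothetical, and RecordingConnection is a stand-in for psycopg.Connection so the example runs without a database or workspace.

```python
import os
from typing import Any, Callable, Mapping, Optional, Type


def resolve_pg_role(default_identity: Callable[[], str],
                    env: Optional[Mapping[str, str]] = None) -> str:
    """Return LAKETS_PG_ROLE when set and non-empty, else the running identity."""
    env = os.environ if env is None else env
    override = env.get("LAKETS_PG_ROLE", "").strip()
    return override or default_identity()


def token_minting_connection(base: Type[Any],
                             mint_token: Callable[[], str],
                             role: str) -> Type[Any]:
    """Build a subclass of `base` whose connect() injects a fresh credential."""

    class TokenConnection(base):
        @classmethod
        def connect(cls, conninfo: str = "", **kwargs: Any):
            # Minted on every physical connect, so a connection reopened
            # after the ~1 h token lifetime still gets a valid password.
            kwargs["user"] = role
            kwargs["password"] = mint_token()
            return super().connect(conninfo, **kwargs)

    return TokenConnection


class RecordingConnection:
    """Hypothetical stand-in for psycopg.Connection; records what it was given."""

    def __init__(self, conninfo: str, kwargs: dict):
        self.conninfo = conninfo
        self.kwargs = kwargs

    @classmethod
    def connect(cls, conninfo: str = "", **kwargs: Any):
        return cls(conninfo, kwargs)


# Demo with fake tokens: each connect() call receives a newly minted one.
tokens = iter(["token-1", "token-2"])
role = resolve_pg_role(lambda: "sp-application-id", env={})
Conn = token_minting_connection(RecordingConnection, lambda: next(tokens), role)
first = Conn.connect("dbname=example sslmode=require")
second = Conn.connect("dbname=example sslmode=require")
```

In the real module the base class would presumably be psycopg.Connection and mint_token would request a database credential through WorkspaceClient(). The exact SDK call is not shown in this PR description, so it is left as an injected callable here.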

Bundle (databricks/bundles/databricks.yml)

  • The prod target runs every job as a service principal via run_as: { service_principal_name: ${var.service_principal_name} }.

Dependencies

  • Bumped databricks-sdk to >=0.56.0 (M2M requirement) in both requirements.txt and the bundle libraries.

Docs (website/docs/reference/workflow-jobs.md)

  • New Authentication & permissions section: M2M OAuth flow, deploying as the SP, the env vars for external runs, and the Lakebase Postgres role + GRANTs the service principal needs to create/drop partitions, refresh RollUps, and enforce retention.

Test plan

  • lakebase_utils.py byte-compiles
  • Bundle parses; prod.run_as resolves; SDK pin is >=0.56.0
  • tests/test_python_patterns.py passes (11/11)
  • databricks bundle deploy -t prod --var="service_principal_name=<sp>" runs each job end-to-end against a Lakebase instance where the SP holds a granted Postgres role
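
The first two checks in the test plan can be scripted. The sketch below is one possible way to do it and is not part of the repository's test suite. The helper names byte_compiles and sdk_pin_at_least are hypothetical, and the pin check assumes a plain `databricks-sdk>=X.Y.Z` line in requirements.txt.

```python
import py_compile
import re
from typing import Tuple


def byte_compiles(path: str) -> bool:
    """Return True if the Python file at `path` compiles to bytecode."""
    try:
        py_compile.compile(path, doraise=True)
        return True
    except py_compile.PyCompileError:
        return False


def sdk_pin_at_least(requirements_text: str,
                     minimum: Tuple[int, int, int] = (0, 56, 0)) -> bool:
    """Return True if a `databricks-sdk>=X.Y.Z` line pins at least `minimum`."""
    pattern = re.compile(r"\s*databricks-sdk\s*>=\s*(\d+)\.(\d+)\.(\d+)")
    for line in requirements_text.splitlines():
        match = pattern.match(line)
        if match:
            return tuple(int(part) for part in match.groups()) >= minimum
    # No matching pin found at all.
    return False
```

For example, `byte_compiles("databricks/workflows/lakebase_utils.py")` covers the first bullet when run from the repository root.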

…ipal

Adopt the documented psycopg3 connection pattern and make the
service-principal execution + permission model explicit.

- lakebase_utils: mint the OAuth credential inside a psycopg.Connection
  subclass connect() (fresh token per physical connect, handles ~1h
  rotation); Postgres role overridable via LAKETS_PG_ROLE; M2M auth via
  the SDK's default WorkspaceClient (resolves the job's SP, or env-var
  client_id/secret for external runs)
- bundle: prod target runs jobs as a service principal via run_as
  (service_principal_name variable)
- deps: bump databricks-sdk to >=0.56.0 (M2M requirement) in
  requirements.txt and the bundle libraries
- docs: workflow-jobs gains an "Authentication & permissions" section
  covering M2M OAuth, run_as SP, and the Lakebase Postgres role/grants
  the SP needs
taran-dbx merged commit cba3911 into main on Jun 1, 2026
9 checks passed
taran-dbx deleted the docs/lakebase-m2m-auth-service-principal branch on June 1, 2026, 14:55
github-actions bot added the documentation, dependencies, and databricks-workflows labels on Jun 1, 2026