feat(jobs): authenticate to Lakebase via M2M OAuth as a service principal #29
Merged
…ipal

Adopt the documented psycopg3 connection pattern and make the service-principal execution + permission model explicit.

- lakebase_utils: mint the OAuth credential inside a psycopg.Connection subclass connect() (fresh token per physical connect, handles ~1h rotation); Postgres role overridable via LAKETS_PG_ROLE; M2M auth via the SDK's default WorkspaceClient (resolves the job's SP, or env-var client_id/secret for external runs)
- bundle: prod target runs jobs as a service principal via run_as (service_principal_name variable)
- deps: bump databricks-sdk to >=0.56.0 (M2M requirement) in requirements.txt and the bundle libraries
- docs: workflow-jobs gains an "Authentication & permissions" section covering M2M OAuth, run_as SP, and the Lakebase Postgres role/grants the SP needs
Summary
Adopts the Databricks psycopg3 connection pattern for the maintenance jobs and makes the service-principal execution + Lakebase permission model explicit, per the M2M OAuth docs.
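As a rough illustration of the connection pattern named above, the sketch below builds a connection subclass whose `connect()` mints a token on every call and passes it as the password. It is not the code from `lakebase_utils.py`: the factory name `make_token_connection_class`, the `mint_token` callable, and the choice to inject the base class (so the sketch runs without `psycopg` installed) are all assumptions made for this example.

```python
from typing import Any, Callable


def make_token_connection_class(base_cls: type, mint_token: Callable[[], str]) -> type:
    """Return a subclass of base_cls whose connect() injects a fresh token.

    base_cls is assumed to expose a psycopg3-style classmethod
    connect(conninfo="", **kwargs); in a real job it would be psycopg.Connection.
    mint_token is a hypothetical callable that returns a short-lived OAuth token.
    """

    class TokenConnection(base_cls):
        @classmethod
        def connect(cls, conninfo: str = "", **kwargs: Any):
            # Mint the token at connect time, not at import time, so every
            # physical connection starts with a non-expired credential.
            kwargs["password"] = mint_token()
            return super().connect(conninfo, **kwargs)

    return TokenConnection
```

In a job this would presumably be wired up with `psycopg.Connection` as `base_cls` and a `mint_token` that asks the SDK's `WorkspaceClient` for a database credential; the exact SDK call is not shown in this description, so treat that part as an assumption. Because the token is requested inside `connect()`, a pool that opens new physical connections later also receives a current token.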
Changes
Authentication (`databricks/workflows/lakebase_utils.py`)
- Mint the OAuth credential inside a `psycopg.Connection` subclass's `connect()` (the documented psycopg3 pattern), so every physical connect gets a fresh, non-expired token, handling the ~1 h token rotation transparently.
- Postgres role: … (`current_user`), overridable via the `LAKETS_PG_ROLE` env var (for when the Lakebase role name differs from the SP application ID).
- M2M auth via the SDK's default `WorkspaceClient()`, which resolves the job's service principal automatically inside Databricks, or reads `DATABRICKS_HOST` / `DATABRICKS_CLIENT_ID` / `DATABRICKS_CLIENT_SECRET` for runs outside Databricks.

Bundle (`databricks/bundles/databricks.yml`)
- The prod target runs jobs as a service principal via `run_as: { service_principal_name: ${var.service_principal_name} }`.

Dependencies
- Bump `databricks-sdk` to `>=0.56.0` (M2M requirement) in both `requirements.txt` and the bundle `libraries`.

Docs (`website/docs/reference/workflow-jobs.md`)
- Adds an "Authentication & permissions" section covering M2M OAuth, the `run_as` SP, and the Lakebase Postgres role/grants the SP needs.

Test plan
- `lakebase_utils.py` byte-compiles
- `prod.run_as` resolves; SDK pin is `>=0.56.0`
- `tests/test_python_patterns.py` passes (11/11)
- `databricks bundle deploy -t prod --var="service_principal_name=<sp>"` runs each job end-to-end against a Lakebase instance where the SP holds a granted Postgres role
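
For the end-to-end run outside Databricks, a small preflight can catch configuration gaps before any connection is attempted. The sketch below is illustrative only: `resolve_pg_role` and `missing_m2m_vars` are hypothetical helper names, not functions from `lakebase_utils.py`, and the assumption that an empty `LAKETS_PG_ROLE` should fall back to the default is a choice made for this example.

```python
import os
from typing import List, Mapping, Optional

# Environment variables the description lists for runs outside Databricks.
EXTERNAL_M2M_VARS = (
    "DATABRICKS_HOST",
    "DATABRICKS_CLIENT_ID",
    "DATABRICKS_CLIENT_SECRET",
)


def resolve_pg_role(default_role: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the LAKETS_PG_ROLE override if set and non-blank, else default_role."""
    env = os.environ if env is None else env
    override = (env.get("LAKETS_PG_ROLE") or "").strip()
    return override or default_role


def missing_m2m_vars(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """List the external-run M2M variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in EXTERNAL_M2M_VARS if not env.get(name)]
```

Passing `env` explicitly keeps both helpers easy to test; called without it, they read the process environment.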