Commit cb02ade

Add Genie space examples (#158)
## Summary

The Databricks CLI supports the `genie_space` bundle resource since [databricks/cli#5282](databricks/cli#5282) (upcoming v1.3.0 release, direct deployment engine only). This PR adds examples for it:

- **`knowledge_base/genie_space_nyc_taxi`**: a minimal bundle that deploys a Genie space for the `samples.nyctaxi.trips` table. The README also covers importing an existing space with `databricks bundle generate genie-space` and keeping the local `.geniespace.json` in sync with UI edits.
- **`knowledge_base/app_with_genie_space`**: a Databricks app that answers questions through the Genie Conversation API. The bundle declares the Genie space as an app resource (granting the app's service principal `CAN_RUN`) and injects the space ID into the app via `valueFrom`.

## Test plan

- [x] `databricks bundle validate` passes for all bundles with a CLI build that includes databricks/cli#5282 (verified `file_path` is read and inlined into `serialized_space`, dev-mode prefixes, default `parent_path`, and permissions)
- [x] The resource/JSON layout matches what `databricks bundle generate genie-space` produces (`resources/<key>.genie_space.yml` + `src/<key>.geniespace.json`)
- [x] `ruff format --check` passes on the new app code
- [x] Genie space deployment and the app resource wiring were verified against a live workspace during development of databricks/cli#5282

Deployed app:

<img width="964" height="379" alt="Screenshot 2026-06-10 at 16 37 20" src="https://github.com/user-attachments/assets/5104ced3-de79-4baa-869f-68ffb4272e3e" />

This pull request and its description were written by Isaac.
1 parent d53214e commit cb02ade

14 files changed

Lines changed: 442 additions & 0 deletions


Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+.databricks
Lines changed: 52 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,52 @@
1+
# Databricks app using a Genie space
2+
3+
This example demonstrates how to define a Databricks app that uses a Genie space in a Declarative Automation Bundle.
4+
5+
It deploys a Genie space for the `samples.nyctaxi.trips` [table](https://docs.databricks.com/aws/en/discover/databricks-datasets#nyctaxi) and a Flask app that lets users ask the space questions in natural language through the [Genie Conversation API](https://docs.databricks.com/genie/conversation-api.html).
6+
7+
For more information about Databricks Apps, see the [documentation](https://docs.databricks.com/aws/en/dev-tools/databricks-apps).
8+
For more information about Genie, see the [documentation](https://docs.databricks.com/genie/index.html).
9+
10+
## Prerequisites
11+
12+
* Databricks CLI v1.3.0 or above.
13+
* Genie spaces can only be deployed with the [direct deployment engine](https://docs.databricks.com/dev-tools/bundles/direct) (`engine: direct`), which is the default for new deployments since CLI v1.3.0.
14+
15+
## Usage
16+
17+
1. Modify `databricks.yml`:
18+
- Update the `host` field to your Databricks workspace URL
19+
- Update the `warehouse` field to the name of your SQL warehouse
20+
21+
2. Deploy the bundle:
22+
```sh
23+
databricks bundle deploy
24+
```
25+
26+
3. Run the app:
27+
```sh
28+
databricks bundle run genie_assistant
29+
```
30+
31+
4. Open the app in your browser:
32+
```sh
33+
databricks bundle open genie_assistant
34+
```
35+
Alternatively, run `databricks bundle summary` to display its URL.
36+
37+
## How it works
38+
39+
* `resources/nyc_taxi_genie.genie_space.yml` defines the Genie space, with its data sources, instructions, and sample questions stored in `src/nyc_taxi_genie.geniespace.json`.
40+
* `resources/genie_assistant.app.yml` declares the Genie space as an app resource. This grants the app's service principal `CAN_RUN` permission on the space:
41+
```yaml
42+
resources:
43+
- name: "genie-space"
44+
genie_space:
45+
name: "NYC Taxi Trip Analysis"
46+
space_id: ${resources.genie_spaces.nyc_taxi_genie.space_id}
47+
permission: CAN_RUN
48+
```
49+
* The `config` block in `resources/genie_assistant.app.yml` injects the space ID into the app as the `GENIE_SPACE_ID` environment variable using `value_from: "genie-space"`.
50+
* `app/app.py` sends each question to the space with `w.genie.start_conversation_and_wait(...)` and renders the text answer or the generated SQL and its results.
51+
52+
Note that the app queries Genie with its own service principal identity: in addition to the `CAN_RUN` permission on the space granted by the bundle, the service principal must be able to use the SQL warehouse and read the tables that back the space. If access to the `samples` catalog is restricted for service principals in your workspace, point the space at a table the app can read.
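Reviewer sketch (not part of this commit): the README states that the space ID reaches the app through the `GENIE_SPACE_ID` environment variable. If that wiring is missing, `os.getenv` silently returns `None`. A minimal fail-fast check might look like the following; the helper name `require_space_id` and the error message are hypothetical, and only the variable name comes from the README.

```python
import os


def require_space_id(env=os.environ):
    """Return the injected Genie space ID, or raise if it is missing.

    `require_space_id` is a hypothetical helper for illustration; the only
    name taken from the example is the GENIE_SPACE_ID environment variable.
    """
    space_id = env.get("GENIE_SPACE_ID", "").strip()
    if not space_id:
        raise RuntimeError(
            "GENIE_SPACE_ID is not set; check the app resource and env wiring"
        )
    return space_id
```

Calling this once at startup would surface a misconfigured deployment as a clear error instead of a failed Genie request on the first question.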
Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
+import os
+
+from databricks.sdk import WorkspaceClient
+from flask import Flask, render_template, request
+
+app = Flask(__name__)
+
+w = WorkspaceClient()
+
+# The space ID is injected from the "genie-space" resource declared in genie_assistant.app.yml.
+space_id = os.getenv("GENIE_SPACE_ID")
+
+
+@app.route("/", methods=["GET", "POST"])
+def home():
+    question = None
+    answer = None
+    sql = None
+    columns = []
+    rows = []
+
+    if request.method == "POST":
+        question = request.form["question"]
+
+        # Start a new conversation in the Genie space and wait for the answer.
+        # Use w.genie.create_message_and_wait(...) to ask follow-up questions
+        # in the same conversation.
+        message = w.genie.start_conversation_and_wait(space_id, question)
+
+        for attachment in message.attachments or []:
+            # Genie answers either with plain text...
+            if attachment.text:
+                answer = attachment.text.content
+
+            # ...or with a generated SQL query and its result set.
+            if attachment.query:
+                answer = attachment.query.description
+                sql = attachment.query.query
+                result = w.genie.get_message_attachment_query_result(
+                    space_id,
+                    message.conversation_id,
+                    message.id,
+                    attachment.attachment_id,
+                )
+                statement = result.statement_response
+                columns = [column.name for column in statement.manifest.schema.columns]
+                rows = statement.result.data_array or []
+
+    return render_template(
+        "index.html",
+        question=question,
+        answer=answer,
+        sql=sql,
+        columns=columns,
+        rows=rows,
+    )
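Reviewer sketch (not part of this commit): the comment in `home()` points to `w.genie.create_message_and_wait(...)` for follow-up questions. One possible shape for that, assuming the caller keeps the `conversation_id` from the first message, is below. The helper names `ask` and `extract_answer` are hypothetical; the two `w.genie` calls and the attachment fields are the ones the app already uses or names.

```python
def ask(w, space_id, question, conversation_id=None):
    """Ask Genie a question, continuing a conversation when an ID is given.

    `w` is expected to be a databricks.sdk.WorkspaceClient; it is passed in
    so that this sketch does not create a client on import.
    """
    if conversation_id is None:
        # First question: start a new conversation in the space.
        return w.genie.start_conversation_and_wait(space_id, question)
    # Follow-up: add a message to the existing conversation.
    return w.genie.create_message_and_wait(space_id, conversation_id, question)


def extract_answer(message):
    """Return (answer, sql) from a Genie message's attachments."""
    answer, sql = None, None
    for attachment in message.attachments or []:
        if attachment.text:
            answer = attachment.text.content
        if attachment.query:
            answer = attachment.query.description
            sql = attachment.query.query
    return answer, sql
```

A caller would store `message.conversation_id` from the first reply (for example in the user's session) and pass it back on the next request. How the app persists that ID is left open here.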
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+databricks-sdk>=0.60.0
+flask
Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
+<html>
+  <head>
+    <title>Genie space app managed by DABs</title>
+  </head>
+  <body>
+    <h1>Ask Genie about NYC taxi trips</h1>
+    <form method="post">
+      <input type="text" name="question" placeholder="e.g. What is the average fare per trip?" size="60" required>
+      <button type="submit">Ask</button>
+    </form>
+
+    {% if question %}
+      <h2>{{ question }}</h2>
+      {% if answer %}
+        <p>{{ answer }}</p>
+      {% endif %}
+      {% if sql %}
+        <pre>{{ sql }}</pre>
+      {% endif %}
+      {% if columns %}
+        <table border="1" cellpadding="4">
+          <tr>
+            {% for column in columns %}<th>{{ column }}</th>{% endfor %}
+          </tr>
+          {% for row in rows %}
+            <tr>
+              {% for value in row %}<td>{{ value }}</td>{% endfor %}
+            </tr>
+          {% endfor %}
+        </table>
+      {% endif %}
+    {% endif %}
+  </body>
+</html>
Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+bundle:
+  name: app_with_genie_space
+
+  # Genie spaces can only be deployed with the direct deployment engine.
+  # The direct engine is the default for new deployments since Databricks CLI v1.3.0.
+  engine: direct
+
+include:
+  - resources/*.yml
+
+variables:
+  # The "warehouse_id" variable is used to reference the warehouse used by the Genie space.
+  warehouse_id:
+    lookup:
+      # Replace this with the name of your SQL warehouse.
+      warehouse: "Shared Unity Catalog Serverless"
+
+workspace:
+  host: https://myworkspace.databricks.com
+
+targets:
+  dev:
+    default: true
+    mode: development
Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+resources:
+  apps:
+    genie_assistant:
+      name: "genie-assistant"
+      description: "An app that answers questions using a Genie space"
+      source_code_path: ../app
+
+      # The app configuration: the command to start the app and its environment.
+      config:
+        command: ["flask", "--app", "app", "run"]
+        env:
+          # The value is injected by the Databricks Apps runtime from the app
+          # resource named "genie-space" declared below.
+          - name: GENIE_SPACE_ID
+            value_from: "genie-space"
+
+      # The resources which this app has access to:
+      resources:
+        - name: "genie-space"
+          description: "The Genie space that the app sends questions to"
+          genie_space:
+            name: "NYC Taxi Trip Analysis"
+            space_id: ${resources.genie_spaces.nyc_taxi_genie.space_id}
+            permission: CAN_RUN
Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+resources:
+  genie_spaces:
+    nyc_taxi_genie:
+      title: "NYC Taxi Trip Analysis"
+      description: "Ask questions about NYC taxi trip data in natural language"
+
+      # The serialized definition of the Genie space: its data sources,
+      # instructions, and sample questions.
+      file_path: ../src/nyc_taxi_genie.geniespace.json
+
+      # The warehouse used to run the queries that Genie generates.
+      warehouse_id: ${var.warehouse_id}
Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
+{
+  "version": 2,
+  "config": {
+    "sample_questions": [
+      {
+        "id": "11111111111111111111111111111111",
+        "question": ["What is the average fare per trip?"]
+      },
+      {
+        "id": "22222222222222222222222222222222",
+        "question": ["How many trips were longer than 10 miles?"]
+      }
+    ]
+  },
+  "data_sources": {
+    "tables": [
+      {
+        "identifier": "samples.nyctaxi.trips",
+        "column_configs": [
+          { "column_name": "dropoff_zip" },
+          { "column_name": "fare_amount" },
+          { "column_name": "pickup_zip" },
+          { "column_name": "tpep_dropoff_datetime" },
+          { "column_name": "tpep_pickup_datetime" },
+          { "column_name": "trip_distance" }
+        ]
+      }
+    ]
+  },
+  "instructions": {
+    "text_instructions": [
+      {
+        "id": "33333333333333333333333333333333",
+        "content": [
+          "This Genie space answers questions about NYC taxi trips.\n",
+          "All data is in the samples.nyctaxi.trips table.\n",
+          "Fare amounts are in USD. When asked about revenue, use SUM(fare_amount)."
+        ]
+      }
+    ],
+    "example_question_sqls": [
+      {
+        "id": "44444444444444444444444444444444",
+        "question": ["What was the total revenue per pickup zip code?"],
+        "sql": [
+          "SELECT\n",
+          "  pickup_zip,\n",
+          "  SUM(fare_amount) AS total_revenue\n",
+          "FROM samples.nyctaxi.trips\n",
+          "GROUP BY pickup_zip\n",
+          "ORDER BY total_revenue DESC"
+        ]
+      }
+    ]
+  }
+}
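Reviewer sketch (not part of this commit): in the file above, multi-line values such as `sql` and `content` are stored as arrays of string fragments that carry their own `\n`. A small script can join them back for review. This assumes only the layout visible in this one file, not a documented schema, and the helper name `example_sqls` is hypothetical.

```python
import json


def example_sqls(serialized_space):
    """Map each example question to its SQL, joined into one string.

    Assumes the layout seen in this commit's .geniespace.json: "question"
    and "sql" are lists of string fragments that concatenate directly.
    """
    space = json.loads(serialized_space)
    examples = space.get("instructions", {}).get("example_question_sqls", [])
    return {"".join(e["question"]): "".join(e["sql"]) for e in examples}


if __name__ == "__main__":
    # Path as named in the README; adjust to where the bundle is checked out.
    with open("src/nyc_taxi_genie.geniespace.json") as f:
        for question, sql in example_sqls(f.read()).items():
            print(f"-- {question}\n{sql}\n")
```

Printing the joined SQL makes it easier to paste an example query into a SQL editor and confirm it runs against the warehouse before deploying.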
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+.databricks
