A Python utility to synchronize semantic definitions from your dbt project (semantic models and metrics) to your Eppo instance using Eppo's bulk metrics sync API (`/api/v1/metrics/sync`).
- Parses dbt Artifacts: Reads `metrics:` and `semantic_models:` definitions from `.yml` files, and uses dbt's `manifest.json` artifact to access model metadata and compiled SQL.
- Maps Concepts: Translates dbt semantic models into Eppo `fact_sources` (including compiled SQL) and dbt metrics into Eppo `metrics` definitions.
- Eppo Bulk Sync API Integration: Generates a single payload and sends it to the Eppo `/api/v1/metrics/sync` endpoint.
- Dry Run Mode: Allows previewing the generated bulk payload without actually sending it to Eppo.
Before you begin, ensure you have the following:
- Python: Version 3.9 or higher.
- Poetry: For managing dependencies and the virtual environment (recommended for development). Install via `pip install poetry`.
- dbt Project: A dbt project with semantic layer definitions (`metrics:`, `semantic_models:`) defined in YAML files.
- dbt `manifest.json`: You need the `manifest.json` artifact generated by dbt. Run `dbt parse` or `dbt compile` in your dbt project to generate/update this file (usually located in the `target/` directory). This tool requires the manifest to get model relationships and compiled SQL.
- Eppo API Key: Generate an API key from your Eppo instance. Admins can create and manage REST API keys by visiting Admin > API Keys.
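For reference, the definitions the parser reads are standard dbt semantic-layer YAML. A minimal illustrative example (model and field names here are hypothetical, and the exact schema depends on your dbt version):

```yaml
# models/marts/orders.yml (illustrative)
semantic_models:
  - name: orders
    model: ref('orders')
    entities:
      - name: user
        type: primary
        expr: user_id
    dimensions:
      - name: ordered_at
        type: time
    measures:
      - name: order_total
        expr: amount
        agg: sum

metrics:
  - name: total_order_value
    type: sum
    type_params:
      measure: order_total
```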
- Clone the repository:
  ```bash
  git clone <your-repository-url>
  cd dbt-eppo-sync
  ```
- Install dependencies using Poetry:

  ```bash
  poetry install
  ```

  This creates a virtual environment, installs all necessary packages, and makes the `dbt-eppo-sync` command available.
The tool is configured via command-line arguments.
Required Arguments:
- `--dbt-project-dir`: Path to the root directory of your dbt project (containing `dbt_project.yml`). The parser uses this to locate dbt definition files.
- `--manifest-path`: Path to the dbt `manifest.json` file (e.g., `./your_dbt_project/target/manifest.json`).
- `--eppo-api-key`: Your Eppo API key. Recommendation: use the `EPPO_API_KEY` environment variable for your API key:

  ```bash
  export EPPO_API_KEY="your_actual_api_key"
  # The tool will pick this up automatically
  ```

Optional Arguments:

- `--sync-tag`: A string tag to identify this sync operation in Eppo logs or UI. Defaults to `dbt-sync-<timestamp>`.
- `--dry-run`: A flag to perform parsing and mapping but print the payload instead of sending it to Eppo.
- `--eppo-base-url`: Override the default Eppo API base URL (`https://eppo.cloud/api`).
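The precedence between the flag and the environment variable can be sketched in plain Python (this mirrors the behavior described above; the tool's actual CLI wiring may differ):

```python
import argparse
import os

def resolve_api_key(argv=None):
    """Resolve the Eppo API key: an explicit --eppo-api-key flag wins,
    then the EPPO_API_KEY environment variable is used as a fallback."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--eppo-api-key", default=None)
    args, _ = parser.parse_known_args(argv)
    api_key = args.eppo_api_key or os.environ.get("EPPO_API_KEY")
    if not api_key:
        raise SystemExit("No API key: pass --eppo-api-key or set EPPO_API_KEY")
    return api_key
```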
Run the sync command using `poetry run`, which executes the command within the project's virtual environment managed by Poetry.
Basic Sync:

```bash
# Ensure EPPO_API_KEY environment variable is set
poetry run dbt-eppo-sync \
  --dbt-project-dir "/path/to/your/dbt/project" \
  --manifest-path "/path/to/your/dbt/project/target/manifest.json" \
  # Optional: --sync-tag "my-custom-tag"
```

Alternatively, provide the API key directly (less secure):
```bash
poetry run dbt-eppo-sync \
  --dbt-project-dir "/path/to/your/dbt/project" \
  --manifest-path "/path/to/your/dbt/project/target/manifest.json" \
  --eppo-api-key "your_api_key_here"
```

Dry Run:
To generate the bulk payload and print it without sending it to Eppo, use the `--dry-run` flag:
```bash
# Ensure EPPO_API_KEY environment variable is set (or use --eppo-api-key)
poetry run dbt-eppo-sync \
  --dbt-project-dir "/path/to/your/dbt/project" \
  --manifest-path "/path/to/your/dbt/project/target/manifest.json" \
  --dry-run
```

This tool maps dbt artifacts to the structure required by Eppo's `/api/v1/metrics/sync` endpoint:
- dbt Semantic Model -> Eppo `fact_source`:
  - The `name` of the semantic model becomes the `fact_source.name`.
  - The compiled SQL for the underlying dbt model (extracted from `manifest.json` based on the semantic model's `model` reference) is placed in `fact_source.sql`.
  - The dbt primary entity (`type: 'primary'`) is mapped to `fact_source.entities`, using the entity's `name` as `entity_name` and `expr` as `column`. Other entity types are currently ignored.
  - dbt `measures` are mapped to `fact_source.facts`. Specifically:
    - `measure.name` -> `fact.name`
    - `measure.expr` -> `fact.column` (if `expr` exists)
    - `measure.description` -> `fact.description`
    - `measure.meta.eppo_desired_change` -> `fact.desired_change` (defaults to `'increase'` if the meta tag is absent)
  - dbt `dimensions` are mapped to `fact_source.properties`. Specifically:
    - `dimension.name` -> `property.name`
    - `dimension.expr` -> `property.column`
    - `dimension.description` -> `property.description`
  - A `timestamp_column` is required by Eppo and is automatically inferred by looking for a dimension with `type: 'time'` or common names like `timestamp`, `event_timestamp`, `created_at`. An error is raised if none can be found.
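As an illustration, the fact-source mapping above can be sketched in plain Python. The dict shapes are simplified assumptions for illustration, not the tool's actual internal types:

```python
# Sketch of the semantic-model -> fact_source mapping described above.
COMMON_TIME_NAMES = {"timestamp", "event_timestamp", "created_at"}

def infer_timestamp_column(dimensions):
    """Prefer a dimension with type 'time'; fall back to common names."""
    for dim in dimensions:
        if dim.get("type") == "time":
            return dim.get("expr") or dim["name"]
    for dim in dimensions:
        if dim["name"] in COMMON_TIME_NAMES:
            return dim.get("expr") or dim["name"]
    raise ValueError("No timestamp column could be inferred")

def to_fact_source(semantic_model, compiled_sql):
    """Map a parsed dbt semantic model (as a dict) to an Eppo fact_source."""
    primary = [e for e in semantic_model.get("entities", [])
               if e.get("type") == "primary"]  # other entity types ignored
    return {
        "name": semantic_model["name"],
        "sql": compiled_sql,  # compiled SQL pulled from manifest.json
        "entities": [{"entity_name": e["name"], "column": e["expr"]}
                     for e in primary],
        "facts": [
            {
                "name": m["name"],
                "column": m.get("expr"),
                "description": m.get("description"),
                "desired_change": m.get("meta", {}).get(
                    "eppo_desired_change", "increase"),
            }
            for m in semantic_model.get("measures", [])
        ],
        "properties": [
            {"name": d["name"], "column": d.get("expr"),
             "description": d.get("description")}
            for d in semantic_model.get("dimensions", [])
        ],
        "timestamp_column": infer_timestamp_column(
            semantic_model.get("dimensions", [])),
    }
```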
- dbt Metric -> Eppo `metric`:
  - The dbt metric `name` becomes the Eppo `metric.name`. The `label` field is currently ignored for naming.
  - The dbt metric `type` determines the Eppo `metric.type` and structure:
    - dbt `sum`, `count`, `count_distinct` map to Eppo `type: simple`, with the `operation` set to `sum`, `count`, or `distinct_entity` respectively. The `fact_name` links back to the corresponding measure in the source semantic model.
    - dbt `average` maps to Eppo `type: ratio`, constructing the `numerator` and `denominator` objects linked to the appropriate measure-derived `fact_name`s.
    - dbt `percentile` maps to Eppo `type: percentile`, constructing the `percentile` object linked to the appropriate `fact_name` and including the `percentile_value`.
  - The primary `entity` for the Eppo metric is derived from the primary entity of the source dbt semantic model.
  - Basic dbt `filter` expressions matching the pattern `{{ Dimension('dimension_name') }} = 'value'` or `!= 'value'` are translated to Eppo `filters` on the corresponding `fact_property`. More complex filters are currently ignored with a warning.
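The metric-type and filter translation rules above can likewise be sketched. The dict shapes, operation names, and regex are assumptions based on the rules described, not the exact code in `mapper.py`:

```python
import re

# Assumed mapping from dbt aggregation types to Eppo simple-metric operations.
SIMPLE_OPS = {"sum": "sum", "count": "count", "count_distinct": "distinct_entity"}

# Matches e.g. {{ Dimension('status') }} = 'completed'  or  != 'refunded'
FILTER_RE = re.compile(
    r"\{\{\s*Dimension\('(?P<dim>\w+)'\)\s*\}\}\s*(?P<op>!?=)\s*'(?P<value>[^']*)'"
)

def translate_filter(expr):
    """Translate a simple dbt filter expression to an Eppo-style filter dict.
    Operation names here ('equals'/'not_equals') are illustrative assumptions."""
    match = FILTER_RE.fullmatch(expr.strip())
    if not match:
        return None  # complex filters are skipped with a warning
    return {
        "fact_property": match["dim"],
        "operation": "equals" if match["op"] == "=" else "not_equals",
        "values": [match["value"]],
    }

def to_metric(dbt_metric, fact_name):
    """Map a dbt metric dict to an Eppo metric dict (simplified)."""
    kind = dbt_metric["type"]
    if kind in SIMPLE_OPS:
        return {"name": dbt_metric["name"], "type": "simple",
                "fact_name": fact_name, "operation": SIMPLE_OPS[kind]}
    if kind == "average":  # average becomes a sum/count ratio
        return {"name": dbt_metric["name"], "type": "ratio",
                "numerator": {"fact_name": fact_name, "operation": "sum"},
                "denominator": {"fact_name": fact_name, "operation": "count"}}
    raise NotImplementedError(f"type {kind!r} not sketched here")
```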
Important: The accuracy of the mapping depends on the structure of your dbt metrics and semantic models matching the expectations outlined above. Complex dbt features (e.g., intricate filters, certain derived metric types not listed) or specific Eppo features (e.g., `threshold`, `conversion`, `retention` operations, funnel metrics) may require adjustments to the mapping logic in `mapper.py` or may not be fully supported yet. Always review the generated payload (using `--dry-run`), consult Eppo's API docs (https://eppo.cloud/api/docs#/Metrics%20Sync/syncMetrics), and/or reach out to Eppo Support.
- Follow the Installation steps using Poetry.
- Activate the virtual environment:

  ```bash
  $ eval $(poetry env activate)
  (test-project-for-test) $  # Virtualenv entered
  ```
- Run tests (once implemented):

  ```bash
  pytest
  ```

- Make your changes and contribute!
- Have an enhancement request, idea, or notice a bug? Create a GitHub Issue!
This project is licensed under the MIT License - see the LICENSE file for details (or specify your chosen license).
- 400 Bad Request Error with "SQL validation failed": If you encounter a 400 error and the detailed response from Eppo indicates an SQL validation failure (often mentioning "Unexpected token"), check the SQL queries being sent in the payload (use the `--dry-run` option).
  - Cause: This commonly occurs if your dbt project is configured to use non-standard SQL quoting (like backticks `` ` ``) for identifiers, especially when using Snowflake. Eppo's SQL validator might not recognize this quoting style.
  - Solution: Review your dbt project's quoting configuration (e.g., in `dbt_project.yml`) and ensure it generates standard SQL identifiers (usually double-quoted `" "` if quoting is needed, or unquoted). You may need to adjust settings related to `quoting` strategies for databases, schemas, and identifiers.
  - Workaround: You could manually edit the `compiled_code` in your `manifest.json` before running the sync tool, but configuring dbt correctly is the recommended long-term fix.
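For example, dbt's project-level quoting behavior is controlled by the `quoting` block in `dbt_project.yml`; disabling it typically produces unquoted identifiers (adjust to your adapter's requirements):

```yaml
# dbt_project.yml
quoting:
  database: false
  schema: false
  identifier: false
```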