diff --git a/doc/developer/catalog-ontology.md b/doc/developer/catalog-ontology.md new file mode 100644 index 0000000000000..e351370746a19 --- /dev/null +++ b/doc/developer/catalog-ontology.md @@ -0,0 +1,120 @@ +# Catalog Ontology Views + +Generates four built-in views in `mz_internal` that describe the structure and +relationships of the Materialize system catalog. Designed to help LLMs, +diagnostic tools, and developers discover the right tables, join paths, and ID +types when writing catalog queries. + +## Views + +| View | Columns | Purpose | +|---|---|---| +| `mz_internal.mz_ontology_entity_types` | `name, relation, properties, description` | What kinds of things exist. `properties` jsonb has `{"primary_key": ["id"]}`. | +| `mz_internal.mz_ontology_semantic_types` | `name, sql_type, description` | Typed ID domains and other semantic column types (CatalogItemId, GlobalId, ByteCount, etc.) | +| `mz_internal.mz_ontology_properties` | `entity_type, column_name, semantic_type, description` | Maps every column to its semantic type and describes what it means. | +| `mz_internal.mz_ontology_link_types` | `name, source_entity, target_entity, properties, description` | Named relationships between entity types. | + +The views are generated at startup by `generate_views()`, which enumerates all +builtins that have `ontology: Some(...)` annotations and extracts metadata from +their `RelationDesc`, column comments, and semantic type annotations. + +## How it works + +1. **Entity types** — one row per builtin with an `Ontology` annotation. The + `relation` column is `schema.table_name`, `properties` contains primary key + info extracted from `RelationDesc::typ().keys`. + +2. **Semantic types** — a static reference table of 20 ID/value domains + (e.g., `CatalogItemId`, `GlobalId`, `ReplicaId`, `ByteCount`). + +3. **Properties** — one row per column per annotated entity. Joins against + `mz_columns` at runtime to discover column names and types. 
Semantic type + annotations come from `RelationDesc::get_semantic_type()`. Column + descriptions come from `mz_comments`. + +4. **Link types** — one row per `OntologyLink` on each annotated entity. + The `properties` JSONB column contains structured relationship metadata + (kind, source_column, target_column, cardinality, source_id_type, etc.). + +## Link type properties + +The `properties` JSONB in `mz_ontology_link_types` uses a `"kind"` field: + +- `"foreign_key"` — column-level join with `source_column`, `target_column`, `cardinality` +- `"measures"` — a measurement/metric relationship +- `"depends_on"` — a dependency relationship +- `"maps_to"` — an ID mapping (e.g., CatalogItemId to GlobalId) +- `"union"` — a UNION view that includes another entity type + +Common keys in the properties JSONB: + +| Key | Description | +|---|---| +| `kind` | Relationship kind: `foreign_key`, `measures`, `depends_on`, `maps_to`, or `union`. | +| `source_column` | Column name on the source entity used for the join. | +| `target_column` | Column name on the target entity used for the join. | +| `cardinality` | Join cardinality: `many_to_one` or `one_to_one` (the two variants of the `Cardinality` enum). | +| `nullable` | `true` if the FK column can be NULL (optional relationship). Omitted when `false`. | +| `source_id_type` | Semantic ID type of the source column (e.g., `CatalogItemId`, `GlobalId`). | +| `requires_mapping` | Mapping table needed to bridge ID namespaces (e.g., `mz_internal.mz_object_global_ids`). | +| `from_type` | Source semantic ID type for `maps_to` links (e.g., `CatalogItemId`). | +| `to_type` | Target semantic ID type for `maps_to` links (e.g., `GlobalId`). | +| `via` | Intermediate table or view used to perform a mapping or indirect join. | +| `metric` | Name of the metric or statistic measured by a `measures` link (e.g., `cpu_time_ns`, `materialization_lag`). | +| `discriminator_column` | Column on the `union` view that identifies the member type (e.g., `type`). 
| +| `discriminator_value` | Value in `discriminator_column` that selects the specific member entity. | +| `note` | Free-text clarification for unusual join semantics or caveats. | + +## For LLMs + +If connected to a Materialize instance, query these views **before** writing +catalog queries. They help find the right tables, correct join paths, and avoid +the GlobalId/CatalogItemId trap. + +### Key queries + +**Find all entities related to X:** +```sql +SELECT l.name, l.source_entity, l.target_entity, + l.properties->>'source_id_type' AS id_type +FROM mz_internal.mz_ontology_link_types l +WHERE l.source_entity = 'X' OR l.target_entity = 'X'; +``` + +**Discover columns and types for entity Z:** +```sql +SELECT p.column_name, p.semantic_type, p.description +FROM mz_internal.mz_ontology_properties p +WHERE p.entity_type = 'Z' +ORDER BY p.column_name; +``` + +**Look up the actual table name for an entity:** +```sql +SELECT name, relation FROM mz_internal.mz_ontology_entity_types WHERE name = 'mv'; +-- mv -> mz_catalog.mz_materialized_views +``` + +### GlobalId vs CatalogItemId + +Many `object_id` columns in `mz_internal` and `mz_introspection` use +**GlobalId**, not **CatalogItemId**. Both are `text`, both look like `u42`, +but they are different ID namespaces. A direct join to `mz_objects.id` +(CatalogItemId) will silently return wrong results after ALTER operations. + +Check `mz_ontology_properties.semantic_type` before writing joins. If the +types differ, bridge through `mz_internal.mz_object_global_ids`. 
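
The bridging rule above can be sketched as a query. This is a minimal, hedged example rather than a canonical one: it assumes `mz_internal.mz_object_global_ids` exposes `id` (CatalogItemId) and `global_id` (GlobalId) columns and that `mz_catalog.mz_objects` has `id` and `name`. Check both against `mz_ontology_properties` on your instance before relying on it.

```sql
-- Resolve the GlobalId stored in mz_compute_dependencies.object_id
-- to its catalog object.
-- Assumption: mz_object_global_ids has the columns (id, global_id).
SELECT o.id   AS catalog_item_id,
       o.name AS object_name,
       d.dependency_id            -- still a GlobalId; bridge it the same way if needed
FROM mz_internal.mz_compute_dependencies AS d
JOIN mz_internal.mz_object_global_ids AS g
  ON g.global_id = d.object_id    -- GlobalId compared with GlobalId
JOIN mz_catalog.mz_objects AS o
  ON o.id = g.id;                 -- CatalogItemId compared with CatalogItemId
```

Because both joins are inner joins, any `object_id` without an entry in the mapping table is dropped from the result; use a `LEFT JOIN` if those rows should be kept.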
+ +## Stats + +- ~117 entity types (mz_catalog + mz_internal + mz_introspection) +- 20 semantic types +- ~450 column properties +- ~150 named relationships + +## Related files + +- `src/catalog/src/builtin.rs` — `Ontology` and `OntologyLink` struct definitions, per-builtin annotations +- `src/repr/src/relation.rs` — `semantic_types` field on `RelationDesc` +- `src/storage-client/src/healthcheck.rs` — semantic type annotations on status history tables +- `misc/ontology/` — SQL files for loading the same data as user-space tables diff --git a/doc/user/content/reference/system-catalog/mz_internal.md b/doc/user/content/reference/system-catalog/mz_internal.md index 1e3c5f8679f34..549fd37728c11 100644 --- a/doc/user/content/reference/system-catalog/mz_internal.md +++ b/doc/user/content/reference/system-catalog/mz_internal.md @@ -338,7 +338,7 @@ SQL objects that don't exist in the compute layer (such as views) are omitted. | Field | Type | Meaning | | ----------- | -------- | -------- | -| `object_id` | [`text`] | The ID of a compute object. Corresponds to [`mz_catalog.mz_indexes.id`](../mz_catalog#mz_indexes), [`mz_catalog.mz_materialized_views.id`](../mz_catalog#mz_materialized_views), or [`mz_internal.mz_subscriptions`](#mz_subscriptions). | +| `object_id` | [`text`] | The ID of a compute object. Corresponds to [`mz_catalog.mz_indexes.id`](../mz_catalog#mz_indexes), [`mz_catalog.mz_materialized_views.id`](../mz_catalog#mz_materialized_views), or [`mz_internal.mz_subscriptions.id`](#mz_subscriptions). | | `dependency_id` | [`text`] | The ID of a compute dependency. Corresponds to [`mz_catalog.mz_indexes.id`](../mz_catalog#mz_indexes), [`mz_catalog.mz_materialized_views.id`](../mz_catalog#mz_materialized_views), [`mz_catalog.mz_sources.id`](../mz_catalog#mz_sources), or [`mz_catalog.mz_tables.id`](../mz_catalog#mz_tables). | ## `mz_compute_hydration_statuses` @@ -658,6 +658,10 @@ system. The view can be accessed by Materialize _superusers_. 
| `object_id` | [`text`] | The ID of the materialized view or index. Corresponds to [`mz_objects.id`](../mz_catalog/#mz_objects). For global notices, this column is `NULL`. | | `created_at` | [`timestamp with time zone`] | The time at which the notice was created. Note that some notices are re-created on `environmentd` restart. | + + + + ## `mz_notices_redacted` @@ -835,7 +839,7 @@ in the system. | Field | Type | Meaning | | -----------------| ----------------------| -------- | | `name` | [`text`] | The name of the network policy rule. Can be combined with `policy_id` to form a unique identifier. | -| `policy_id` | [`text`] | The ID the network policy the rule is part of. Corresponds to [`mz_network_policy_rules.id`](#mz_network_policy_rules). | +| `policy_id` | [`text`] | The ID the network policy the rule is part of. Corresponds to [`mz_internal.mz_network_policies.id`](#mz_network_policies). | | `action` | [`text`] | The action of the rule. `allow` is the only supported action. | | `address` | [`text`] | The address the rule will take action on. | | `direction` | [`text`] | The direction of traffic the rule applies to. `ingress` is the only supported direction. | diff --git a/doc/user/content/reference/system-catalog/mz_introspection.md b/doc/user/content/reference/system-catalog/mz_introspection.md index f4b189cc04c67..f238ce5e8240b 100644 --- a/doc/user/content/reference/system-catalog/mz_introspection.md +++ b/doc/user/content/reference/system-catalog/mz_introspection.md @@ -107,7 +107,7 @@ The `mz_compute_exports` view describes the objects exported by [dataflows][data | Field | Type | Meaning | | -------------- |-----------| -------- | -| `export_id` | [`text`] | The ID of the index, materialized view, or subscription exported by the dataflow. 
Corresponds to [`mz_catalog.mz_indexes.id`](../mz_catalog#mz_indexes), [`mz_catalog.mz_materialized_views.id`](../mz_catalog#mz_materialized_views), or [`mz_internal.mz_subscriptions`](../mz_internal#mz_subscriptions). | +| `export_id` | [`text`] | The ID of the index, materialized view, or subscription exported by the dataflow. Corresponds to [`mz_catalog.mz_indexes.id`](../mz_catalog#mz_indexes), [`mz_catalog.mz_materialized_views.id`](../mz_catalog#mz_materialized_views), or [`mz_internal.mz_subscriptions.id`](../mz_internal#mz_subscriptions). | | `dataflow_id` | [`uint8`] | The ID of the dataflow. Corresponds to [`mz_dataflows.id`](#mz_dataflows). | diff --git a/src/adapter/src/catalog/open/builtin_schema_migration_tests.rs b/src/adapter/src/catalog/open/builtin_schema_migration_tests.rs index b7d8e56f7de29..eeacfea1c459f 100644 --- a/src/adapter/src/catalog/open/builtin_schema_migration_tests.rs +++ b/src/adapter/src/catalog/open/builtin_schema_migration_tests.rs @@ -285,6 +285,7 @@ fn make_builtin_table(name: String) -> (SystemObjectDescription, &'static Builti column_comments: BTreeMap::new(), is_retained_metrics_object: false, access: Vec::new(), + ontology: None, }; let builtin = leak(Builtin::Table(leak(builtin))); @@ -309,6 +310,7 @@ fn make_builtin_source(name: String) -> (SystemObjectDescription, &'static Built column_comments: BTreeMap::new(), is_retained_metrics_object: false, access: Vec::new(), + ontology: None, }; let builtin = leak(Builtin::Source(leak(builtin))); diff --git a/src/catalog/src/builtin.rs b/src/catalog/src/builtin.rs index e02b14e3e1d9d..51110437f228b 100644 --- a/src/catalog/src/builtin.rs +++ b/src/catalog/src/builtin.rs @@ -24,6 +24,7 @@ mod builtin; pub mod notice; +mod ontology; use std::collections::BTreeMap; use std::hash::Hash; @@ -40,7 +41,7 @@ use mz_repr::namespaces::{ MZ_UNSAFE_SCHEMA, PG_CATALOG_SCHEMA, }; use mz_repr::role_id::RoleId; -use mz_repr::{RelationDesc, SqlRelationType, SqlScalarType}; +use 
mz_repr::{RelationDesc, SemanticType, SqlRelationType, SqlScalarType}; use mz_sql::catalog::RoleAttributesRaw; use mz_sql::catalog::{ CatalogItemType, CatalogType, CatalogTypeDetails, CatalogTypePgMetadata, NameReference, @@ -161,6 +162,315 @@ pub struct BuiltinLog { pub access: Vec, } +/// Ontology metadata for a builtin catalog object. +/// +/// When present on a builtin, it marks it as an ontology entity with an explicit +/// `entity_name`, `description`, and optional per-column semantic type annotations. +/// +/// ## Why `column_semantic_types` lives here and not in `RelationDesc` +/// +/// Semantic types are pure catalog-level metadata: they annotate what an ID +/// column *means* (e.g. "this is a ClusterId") without affecting the Arrow +/// data type used for encoding. Keeping them in `RelationDesc` would cause +/// persist schema mismatches during zero-downtime upgrades: the old binary +/// registers a schema without semantic types, the new binary tries to register +/// a schema with them, and `register_schema` returns `None` because the schemas +/// are not `PartialEq`. Since the only consumers of semantic types are the +/// ontology views (which already have access to `Ontology`), storing them here +/// is both correct and avoids the schema-evolution problem entirely. +#[derive(Clone, Hash, Debug, PartialEq, Eq)] +pub struct Ontology { + /// The ontology entity name (e.g., "database", "table", "mv"). Names a + /// single row of this relation, so prefer singular event/object nouns + /// (e.g., "replica_status_event" not "replica_status_history"). + pub entity_name: &'static str, + /// One-line description of this entity. + pub description: &'static str, + /// Relationships originating from this entity (foreign keys, unions, + /// mappings, dependencies, metrics). + pub links: &'static [OntologyLink], + /// Per-column semantic type annotations: `(column_name, SemanticType)`. + /// Only columns that carry a meaningful semantic type need to appear here. 
+ pub column_semantic_types: &'static [(&'static str, SemanticType)], +} + +/// Cardinality of an ontology link. +#[derive( + Clone, + Copy, + Debug, + Hash, + PartialEq, + Eq, + serde::Serialize, + serde::Deserialize +)] +#[serde(rename_all = "snake_case")] +pub enum Cardinality { + OneToOne, + ManyToOne, +} + +/// Helper used by serde to skip serializing `false` boolean fields. +fn is_false(v: &bool) -> bool { + !v +} + +/// Typed properties for an ontology link. Serialized to the `properties` JSONB +/// column in `mz_ontology_link_types`. The `kind` field is inlined from the +/// enum variant name via `#[serde(tag = "kind")]`. +#[derive(Clone, Copy, Debug, Hash, PartialEq, Eq, serde::Serialize)] +#[serde(tag = "kind", rename_all = "snake_case")] +pub enum LinkProperties { + /// A foreign-key relationship: `source_column` in the source entity + /// references `target_column` in the target entity. + ForeignKey { + /// Column in the source entity that holds the reference. + source_column: &'static str, + /// Column in the target entity being referenced (usually `id`). + target_column: &'static str, + /// How many source rows may reference a single target row. + cardinality: Cardinality, + /// Semantic type of the source column, if it carries an ID that + /// requires type-aware resolution (e.g. `CatalogItemId`, `GlobalId`). + #[serde(skip_serializing_if = "Option::is_none")] + source_id_type: Option<SemanticType>, + /// Intermediate mapping relation needed when `source_id_type` does not + /// directly match the target entity's ID type (e.g. + /// `mz_internal.mz_object_global_ids` to go from `GlobalId` to catalog + /// object). + #[serde(skip_serializing_if = "Option::is_none")] + requires_mapping: Option<&'static str>, + /// True when the source column may be NULL (the reference is optional). + #[serde(default, skip_serializing_if = "is_false")] + nullable: bool, + /// Free-form annotation for cases that need extra context. 
+ #[serde(skip_serializing_if = "Option::is_none")] + note: Option<&'static str>, + }, + /// A union relationship: the source entity is a superset view that includes + /// the target entity, optionally filtered by a discriminator column/value. + Union { + /// Column used to discriminate between subtypes (e.g. `type`). + #[serde(skip_serializing_if = "Option::is_none")] + discriminator_column: Option<&'static str>, + /// Value of `discriminator_column` that selects the target entity. + #[serde(skip_serializing_if = "Option::is_none")] + discriminator_value: Option<&'static str>, + /// Free-form annotation for cases that need extra context. + #[serde(skip_serializing_if = "Option::is_none")] + note: Option<&'static str>, + }, + /// A mapping relationship: the source entity maps to the target entity, + /// optionally via an intermediate table and/or with an ID-type conversion. + MapsTo { + /// Column in the source entity that holds the ID to map from. + #[serde(skip_serializing_if = "Option::is_none")] + source_column: Option<&'static str>, + /// Column in the target entity being mapped to. + #[serde(skip_serializing_if = "Option::is_none")] + target_column: Option<&'static str>, + /// Intermediate relation used to perform the mapping. + #[serde(skip_serializing_if = "Option::is_none")] + via: Option<&'static str>, + /// Semantic type of the source ID before mapping. + #[serde(skip_serializing_if = "Option::is_none")] + from_type: Option<SemanticType>, + /// Semantic type of the target ID after mapping. + #[serde(skip_serializing_if = "Option::is_none")] + to_type: Option<SemanticType>, + /// Free-form annotation for cases that need extra context. + #[serde(skip_serializing_if = "Option::is_none")] + note: Option<&'static str>, + }, + /// A dependency relationship: the source entity directly depends on the + /// target entity (e.g. a materialization that references an object). + DependsOn { + /// Column in the source entity that holds the dependency ID. 
+ source_column: &'static str, + /// Column in the target entity being depended upon (usually `id`). + target_column: &'static str, + /// Semantic type of the source column. + #[serde(skip_serializing_if = "Option::is_none")] + source_id_type: Option<SemanticType>, + }, + /// A metric relationship: the source entity records measurements of a named + /// metric on the target entity. + Measures { + /// Column in the source entity that references the target entity. + source_column: &'static str, + /// Column in the target entity being measured (usually `id`). + target_column: &'static str, + /// Name of the metric being measured (e.g. `cpu_time_ns`). + metric: &'static str, + /// Semantic type of the source column, if ID-type resolution is needed. + #[serde(skip_serializing_if = "Option::is_none")] + source_id_type: Option<SemanticType>, + /// Intermediate mapping relation needed when the source ID type differs + /// from the target entity's ID type. + #[serde(skip_serializing_if = "Option::is_none")] + requires_mapping: Option<&'static str>, + /// Free-form annotation for cases that need extra context. + #[serde(skip_serializing_if = "Option::is_none")] + note: Option<&'static str>, + }, +} + +impl LinkProperties { + /// Basic foreign-key link with no optional fields set. + pub const fn fk( + source_column: &'static str, + target_column: &'static str, + cardinality: Cardinality, + ) -> Self { + Self::ForeignKey { + source_column, + target_column, + cardinality, + source_id_type: None, + requires_mapping: None, + nullable: false, + note: None, + } + } + + /// Foreign-key link where the source column may be NULL. + pub const fn fk_nullable( + source_column: &'static str, + target_column: &'static str, + cardinality: Cardinality, + ) -> Self { + Self::ForeignKey { + source_column, + target_column, + cardinality, + source_id_type: None, + requires_mapping: None, + nullable: true, + note: None, + } + } + + /// Foreign-key link whose source column carries a typed ID (e.g. 
+ /// `CatalogItemId`) but does not require an intermediate mapping table. + pub const fn fk_typed( + source_column: &'static str, + target_column: &'static str, + cardinality: Cardinality, + source_id_type: mz_repr::SemanticType, + ) -> Self { + Self::ForeignKey { + source_column, + target_column, + cardinality, + source_id_type: Some(source_id_type), + requires_mapping: None, + nullable: false, + note: None, + } + } + + /// Foreign-key link whose source column carries a typed ID that requires + /// an intermediate mapping table to resolve (e.g. `GlobalId` → + /// `mz_internal.mz_object_global_ids`). + pub const fn fk_mapped( + source_column: &'static str, + target_column: &'static str, + cardinality: Cardinality, + source_id_type: mz_repr::SemanticType, + requires_mapping: &'static str, + ) -> Self { + Self::ForeignKey { + source_column, + target_column, + cardinality, + source_id_type: Some(source_id_type), + requires_mapping: Some(requires_mapping), + nullable: false, + note: None, + } + } + + /// Union link filtered by a discriminator column/value pair. + pub const fn union_disc( + discriminator_column: &'static str, + discriminator_value: &'static str, + ) -> Self { + Self::Union { + discriminator_column: Some(discriminator_column), + discriminator_value: Some(discriminator_value), + note: None, + } + } + + /// Basic measures link with no optional fields set. + pub const fn measures( + source_column: &'static str, + target_column: &'static str, + metric: &'static str, + ) -> Self { + Self::Measures { + source_column, + target_column, + metric, + source_id_type: None, + requires_mapping: None, + note: None, + } + } + + /// Measures link whose source ID requires an intermediate mapping table. 
+ pub const fn measures_mapped( + source_column: &'static str, + target_column: &'static str, + metric: &'static str, + source_id_type: mz_repr::SemanticType, + requires_mapping: &'static str, + ) -> Self { + Self::Measures { + source_column, + target_column, + metric, + source_id_type: Some(source_id_type), + requires_mapping: Some(requires_mapping), + note: None, + } + } +} + +/// A directed relationship from one ontology entity to another. +/// +/// Each link has a `name` (the relationship label, e.g. `"owned_by"`), a +/// `target` entity name, and a `properties` variant that captures the +/// *kind* of relationship. Choosing the right variant matters: +/// +/// - [`LinkProperties::ForeignKey`]: the source entity has a column whose +/// value is an ID that directly references a row in the target entity. +/// Use this when there is an explicit FK column (e.g. `schema_id` -> +/// `schema`). +/// - [`LinkProperties::DependsOn`]: the source entity logically depends on +/// the target entity, but the relationship is a graph edge rather than a +/// simple column reference (e.g. `mz_compute_dependencies` records that a +/// dataflow depends on an object). Use this for dependency-graph tables, +/// **not** `ForeignKey`. +/// - [`LinkProperties::Union`]: the source entity is a superset view that +/// contains the target entity as a subset, optionally filtered by a +/// discriminator column. +/// - [`LinkProperties::MapsTo`]: the source entity provides an ID translation +/// to the target entity, possibly via an intermediate table or across ID +/// namespaces. +/// - [`LinkProperties::Measures`]: the source entity records metric +/// measurements about the target entity. +#[derive(Clone, Debug, Hash, PartialEq, Eq)] +pub struct OntologyLink { + /// Relationship name (e.g., "owned_by", "in_schema"). + pub name: &'static str, + /// Target entity name (e.g., "role", "schema"). + pub target: &'static str, + /// Typed properties for the `properties` JSONB column. 
+ pub properties: LinkProperties, +} + #[derive(Clone, Hash, Debug, PartialEq, Eq)] pub struct BuiltinTable { pub name: &'static str, @@ -173,6 +483,8 @@ pub struct BuiltinTable { pub is_retained_metrics_object: bool, /// ACL items to apply to the object pub access: Vec<MzAclItem>, + /// Ontology metadata. None means this builtin is not an ontology entity. + pub ontology: Option<Ontology>, } #[derive(Clone, Debug, PartialEq, Eq)] @@ -188,6 +500,8 @@ pub struct BuiltinSource { pub is_retained_metrics_object: bool, /// ACL items to apply to the object pub access: Vec<MzAclItem>, + /// Ontology metadata. None means this builtin is not an ontology entity. + pub ontology: Option<Ontology>, } #[derive(Hash, Debug)] @@ -200,6 +514,8 @@ pub struct BuiltinView { pub sql: &'static str, /// ACL items to apply to the object pub access: Vec<MzAclItem>, + /// Ontology metadata. None means this builtin is not an ontology entity. + pub ontology: Option<Ontology>, } impl BuiltinView { @@ -224,6 +540,8 @@ pub struct BuiltinMaterializedView { pub is_retained_metrics_object: bool, /// ACL items to apply to the object pub access: Vec<MzAclItem>, + /// Ontology metadata. None means this builtin is not an ontology entity. + pub ontology: Option<Ontology>, } impl BuiltinMaterializedView { @@ -1773,6 +2091,7 @@ pub static MZ_CATALOG_RAW: LazyLock<BuiltinSource> = LazyLock::new(|| BuiltinSou is_retained_metrics_object: false, // The raw catalog contains unredacted SQL statements, so we limit access to the system user. 
access: vec![], + ontology: None, }); pub static MZ_CATALOG_RAW_DESCRIPTION: LazyLock = @@ -2068,6 +2387,18 @@ pub static MZ_ICEBERG_SINKS: LazyLock = LazyLock::new(|| BuiltinTa ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "iceberg_sink", + description: "Iceberg-specific sink configuration (namespace, table)", + links: &const { + [OntologyLink { + name: "details_of", + target: "sink", + properties: LinkProperties::fk("id", "id", Cardinality::OneToOne), + }] + }, + column_semantic_types: &[("id", SemanticType::CatalogItemId)], + }), }); pub static MZ_KAFKA_SINKS: LazyLock = LazyLock::new(|| BuiltinTable { @@ -2088,6 +2419,18 @@ pub static MZ_KAFKA_SINKS: LazyLock = LazyLock::new(|| BuiltinTabl ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "kafka_sink", + description: "Kafka-specific sink configuration (topic)", + links: &const { + [OntologyLink { + name: "details_of", + target: "sink", + properties: LinkProperties::fk("id", "id", Cardinality::OneToOne), + }] + }, + column_semantic_types: &[("id", SemanticType::CatalogItemId)], + }), }); pub static MZ_KAFKA_CONNECTIONS: LazyLock = LazyLock::new(|| BuiltinTable { name: "mz_kafka_connections", @@ -2114,6 +2457,18 @@ pub static MZ_KAFKA_CONNECTIONS: LazyLock = LazyLock::new(|| Built ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "kafka_connection", + description: "Kafka-specific connection configuration (brokers, progress topic)", + links: &const { + [OntologyLink { + name: "details_of", + target: "connection", + properties: LinkProperties::fk("id", "id", Cardinality::OneToOne), + }] + }, + column_semantic_types: &[("id", SemanticType::CatalogItemId)], + }), }); pub static MZ_KAFKA_SOURCES: LazyLock = LazyLock::new(|| BuiltinTable { name: "mz_kafka_sources", @@ -2140,6 +2495,18 @@ pub static MZ_KAFKA_SOURCES: LazyLock = 
LazyLock::new(|| BuiltinTa ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "kafka_source", + description: "Kafka-specific source configuration (topic, group ID)", + links: &const { + [OntologyLink { + name: "details_of", + target: "source", + properties: LinkProperties::fk("id", "id", Cardinality::OneToOne), + }] + }, + column_semantic_types: &[("id", SemanticType::CatalogItemId)], + }), }); pub static MZ_POSTGRES_SOURCES: LazyLock = LazyLock::new(|| BuiltinTable { name: "mz_postgres_sources", @@ -2166,6 +2533,18 @@ pub static MZ_POSTGRES_SOURCES: LazyLock = LazyLock::new(|| Builti ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "postgres_source", + description: "Postgres source-level details", + links: &const { + [OntologyLink { + name: "details_of", + target: "source", + properties: LinkProperties::fk("id", "id", Cardinality::OneToOne), + }] + }, + column_semantic_types: &[("id", SemanticType::CatalogItemId)], + }), }); pub static MZ_POSTGRES_SOURCE_TABLES: LazyLock = LazyLock::new(|| BuiltinTable { name: "mz_postgres_source_tables", @@ -2192,6 +2571,18 @@ pub static MZ_POSTGRES_SOURCE_TABLES: LazyLock = LazyLock::new(|| ]), is_retained_metrics_object: true, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "postgres_source_table", + description: "Postgres source table-level details", + links: &const { + [OntologyLink { + name: "describes_source_table", + target: "table", + properties: LinkProperties::fk("id", "id", Cardinality::OneToOne), + }] + }, + column_semantic_types: &[("id", SemanticType::CatalogItemId)], + }), }); pub static MZ_MYSQL_SOURCE_TABLES: LazyLock = LazyLock::new(|| BuiltinTable { name: "mz_mysql_source_tables", @@ -2218,6 +2609,18 @@ pub static MZ_MYSQL_SOURCE_TABLES: LazyLock = LazyLock::new(|| Bui ]), is_retained_metrics_object: true, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + 
entity_name: "mysql_source_table", + description: "MySQL source table-level details", + links: &const { + [OntologyLink { + name: "describes_source_table", + target: "table", + properties: LinkProperties::fk("id", "id", Cardinality::OneToOne), + }] + }, + column_semantic_types: &[("id", SemanticType::CatalogItemId)], + }), }); pub static MZ_SQL_SERVER_SOURCE_TABLES: LazyLock = LazyLock::new(|| BuiltinTable { name: "mz_sql_server_source_tables", @@ -2244,6 +2647,18 @@ pub static MZ_SQL_SERVER_SOURCE_TABLES: LazyLock = LazyLock::new(| ]), is_retained_metrics_object: true, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "sql_server_source_table", + description: "SQL Server source table-level details", + links: &const { + [OntologyLink { + name: "describes_source_table", + target: "table", + properties: LinkProperties::fk("id", "id", Cardinality::OneToOne), + }] + }, + column_semantic_types: &[("id", SemanticType::CatalogItemId)], + }), }); pub static MZ_KAFKA_SOURCE_TABLES: LazyLock = LazyLock::new(|| BuiltinTable { name: "mz_kafka_source_tables", @@ -2277,6 +2692,18 @@ pub static MZ_KAFKA_SOURCE_TABLES: LazyLock = LazyLock::new(|| Bui ]), is_retained_metrics_object: true, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "kafka_source_table", + description: "Kafka source table-level details", + links: &const { + [OntologyLink { + name: "describes_source_table", + target: "table", + properties: LinkProperties::fk("id", "id", Cardinality::OneToOne), + }] + }, + column_semantic_types: &[("id", SemanticType::CatalogItemId)], + }), }); pub static MZ_OBJECT_DEPENDENCIES: LazyLock = LazyLock::new(|| BuiltinTable { name: "mz_object_dependencies", @@ -2301,6 +2728,40 @@ pub static MZ_OBJECT_DEPENDENCIES: LazyLock = LazyLock::new(|| Bui ]), is_retained_metrics_object: true, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "object_dependency", + description: "A dependency edge: one object depends on another", 
+ links: &const { + [ + OntologyLink { + name: "dependent_object", + target: "object", + properties: LinkProperties::fk_typed( + "object_id", + "id", + Cardinality::ManyToOne, + mz_repr::SemanticType::CatalogItemId, + ), + }, + OntologyLink { + name: "referenced_object", + target: "object", + properties: LinkProperties::fk_typed( + "referenced_object_id", + "id", + Cardinality::ManyToOne, + mz_repr::SemanticType::CatalogItemId, + ), + }, + ] + }, + column_semantic_types: &const { + [ + ("object_id", SemanticType::CatalogItemId), + ("referenced_object_id", SemanticType::CatalogItemId), + ] + }, + }), }); pub static MZ_COMPUTE_DEPENDENCIES: LazyLock = LazyLock::new(|| BuiltinSource { name: "mz_compute_dependencies", @@ -2314,7 +2775,7 @@ pub static MZ_COMPUTE_DEPENDENCIES: LazyLock = LazyLock::new(|| B column_comments: BTreeMap::from_iter([ ( "object_id", - "The ID of a compute object. Corresponds to `mz_catalog.mz_indexes.id`, `mz_catalog.mz_materialized_views.id`, or `mz_internal.mz_subscriptions`.", + "The ID of a compute object. 
Corresponds to `mz_catalog.mz_indexes.id`, `mz_catalog.mz_materialized_views.id`, or `mz_internal.mz_subscriptions.id`.", ), ( "dependency_id", @@ -2323,6 +2784,42 @@ pub static MZ_COMPUTE_DEPENDENCIES: LazyLock = LazyLock::new(|| B ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "compute_dependency", + description: "Dependency edge from a compute object (index, materialized view, or subscription) to one of the sources of its data", + links: &const { + [ + OntologyLink { + name: "dependent_compute_object", + target: "object", + properties: LinkProperties::fk_mapped( + "object_id", + "id", + Cardinality::ManyToOne, + mz_repr::SemanticType::GlobalId, + "mz_internal.mz_object_global_ids", + ), + }, + OntologyLink { + name: "compute_dependency_target", + target: "object", + properties: LinkProperties::fk_mapped( + "dependency_id", + "id", + Cardinality::ManyToOne, + mz_repr::SemanticType::GlobalId, + "mz_internal.mz_object_global_ids", + ), + }, + ] + }, + column_semantic_types: &const { + [ + ("object_id", SemanticType::GlobalId), + ("dependency_id", SemanticType::GlobalId), + ] + }, + }), }); pub static MZ_DATABASES: LazyLock = @@ -2371,6 +2868,24 @@ FROM mz_internal.mz_catalog_raw WHERE data->>'kind' = 'Database'", is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "database", + description: "A top-level namespace that contains schemas", + links: &const { + [OntologyLink { + name: "owned_by", + target: "role", + properties: LinkProperties::fk("owner_id", "id", Cardinality::ManyToOne), + }] + }, + column_semantic_types: &const { + [ + ("id", SemanticType::DatabaseId), + ("oid", SemanticType::OID), + ("owner_id", SemanticType::RoleId), + ] + }, + }), }); pub static MZ_SCHEMAS: LazyLock = @@ -2427,6 +2942,36 @@ FROM mz_internal.mz_catalog_raw WHERE data->>'kind' = 'Schema'", is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: 
Some(Ontology { + entity_name: "schema", + description: "A namespace within a database that contains objects", + links: &const { + [ + OntologyLink { + name: "in_database", + target: "database", + properties: LinkProperties::fk_nullable( + "database_id", + "id", + Cardinality::ManyToOne, + ), + }, + OntologyLink { + name: "owned_by", + target: "role", + properties: LinkProperties::fk("owner_id", "id", Cardinality::ManyToOne), + }, + ] + }, + column_semantic_types: &const { + [ + ("id", SemanticType::SchemaId), + ("oid", SemanticType::OID), + ("database_id", SemanticType::DatabaseId), + ("owner_id", SemanticType::RoleId), + ] + }, + }), }); pub static MZ_COLUMNS: LazyLock = LazyLock::new(|| BuiltinTable { @@ -2464,6 +3009,31 @@ pub static MZ_COLUMNS: LazyLock = LazyLock::new(|| BuiltinTable { ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "column", + description: "A column of a relation, with its name, position, type, and nullability", + links: &const { + [OntologyLink { + name: "belongs_to_relation", + target: "object", + properties: LinkProperties::ForeignKey { + source_column: "id", + target_column: "id", + cardinality: Cardinality::ManyToOne, + source_id_type: None, + requires_mapping: None, + nullable: false, + note: Some("id in mz_columns is the relation ID, not a unique column ID"), + }, + }] + }, + column_semantic_types: &const { + [ + ("id", SemanticType::CatalogItemId), + ("type_oid", SemanticType::OID), + ] + }, + }), }); pub static MZ_INDEXES: LazyLock = LazyLock::new(|| BuiltinTable { name: "mz_indexes", @@ -2505,6 +3075,40 @@ pub static MZ_INDEXES: LazyLock = LazyLock::new(|| BuiltinTable { ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "index", + description: "An in-memory index on a relation for fast lookups", + links: &const { + [ + OntologyLink { + name: "owned_by", + target: "role", + properties: 
LinkProperties::fk("owner_id", "id", Cardinality::ManyToOne), + }, + OntologyLink { + name: "runs_on_cluster", + target: "cluster", + properties: LinkProperties::fk("cluster_id", "id", Cardinality::ManyToOne), + }, + OntologyLink { + name: "indexes_relation", + target: "relation", + properties: LinkProperties::fk("on_id", "id", Cardinality::ManyToOne), + }, + ] + }, + column_semantic_types: &const { + [ + ("id", SemanticType::CatalogItemId), + ("oid", SemanticType::OID), + ("on_id", SemanticType::CatalogItemId), + ("cluster_id", SemanticType::ClusterId), + ("owner_id", SemanticType::RoleId), + ("create_sql", SemanticType::SqlDefinition), + ("redacted_create_sql", SemanticType::RedactedSqlDefinition), + ] + }, + }), }); pub static MZ_INDEX_COLUMNS: LazyLock = LazyLock::new(|| BuiltinTable { name: "mz_index_columns", @@ -2541,6 +3145,18 @@ pub static MZ_INDEX_COLUMNS: LazyLock = LazyLock::new(|| BuiltinTa ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "index_column", + description: "A column or expression in an index, with its position", + links: &const { + [OntologyLink { + name: "belongs_to_index", + target: "index", + properties: LinkProperties::fk("index_id", "id", Cardinality::ManyToOne), + }] + }, + column_semantic_types: &[("index_id", SemanticType::CatalogItemId)], + }), }); pub static MZ_TABLES: LazyLock = LazyLock::new(|| BuiltinTable { name: "mz_tables", @@ -2587,6 +3203,44 @@ pub static MZ_TABLES: LazyLock = LazyLock::new(|| BuiltinTable { ]), is_retained_metrics_object: true, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "table", + description: "A user-writable table that can be inserted into and updated", + links: &const { + [ + OntologyLink { + name: "in_schema", + target: "schema", + properties: LinkProperties::fk("schema_id", "id", Cardinality::ManyToOne), + }, + OntologyLink { + name: "owned_by", + target: "role", + properties: LinkProperties::fk("owner_id", 
"id", Cardinality::ManyToOne), + }, + OntologyLink { + name: "created_by_source", + target: "source", + properties: LinkProperties::fk_nullable( + "source_id", + "id", + Cardinality::ManyToOne, + ), + }, + ] + }, + column_semantic_types: &const { + [ + ("id", SemanticType::CatalogItemId), + ("oid", SemanticType::OID), + ("schema_id", SemanticType::SchemaId), + ("owner_id", SemanticType::RoleId), + ("create_sql", SemanticType::SqlDefinition), + ("redacted_create_sql", SemanticType::RedactedSqlDefinition), + ("source_id", SemanticType::CatalogItemId), + ] + }, + }), }); pub static MZ_CONNECTIONS: LazyLock = LazyLock::new(|| { @@ -2665,6 +3319,23 @@ WHERE mz_internal.parse_catalog_create_sql(data->'value'->'definition'->'V1'->>'create_sql')->>'type' = 'connection'", is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "connection", + description: "A reusable connection configuration to an external system", + links: &const { [ + OntologyLink { + name: "in_schema", + target: "schema", + properties: LinkProperties::fk("schema_id", "id", Cardinality::ManyToOne), + }, + OntologyLink { + name: "owned_by", + target: "role", + properties: LinkProperties::fk("owner_id", "id", Cardinality::ManyToOne), + }, + ] }, + column_semantic_types: &const {[("id", SemanticType::CatalogItemId), ("oid", SemanticType::OID), ("schema_id", SemanticType::SchemaId), ("type", SemanticType::ConnectionType), ("owner_id", SemanticType::RoleId), ("create_sql", SemanticType::SqlDefinition), ("redacted_create_sql", SemanticType::RedactedSqlDefinition)]}, + }), } }); @@ -2690,6 +3361,18 @@ pub static MZ_SSH_TUNNEL_CONNECTIONS: LazyLock = LazyLock::new(|| ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "ssh_tunnel_connection", + description: "SSH tunnel connection with public keys", + links: &const { + [OntologyLink { + name: "details_of", + target: "connection", + properties: 
LinkProperties::fk("id", "id", Cardinality::OneToOne), + }] + }, + column_semantic_types: &[("id", SemanticType::CatalogItemId)], + }), }); pub static MZ_SOURCES: LazyLock = LazyLock::new(|| BuiltinTable { name: "mz_sources", @@ -2763,6 +3446,55 @@ pub static MZ_SOURCES: LazyLock = LazyLock::new(|| BuiltinTable { ]), is_retained_metrics_object: true, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "source", + description: "An external data source ingested into Materialize (e.g., Kafka, Postgres)", + links: &const { + [ + OntologyLink { + name: "in_schema", + target: "schema", + properties: LinkProperties::fk("schema_id", "id", Cardinality::ManyToOne), + }, + OntologyLink { + name: "owned_by", + target: "role", + properties: LinkProperties::fk("owner_id", "id", Cardinality::ManyToOne), + }, + OntologyLink { + name: "runs_on_cluster", + target: "cluster", + properties: LinkProperties::fk_nullable( + "cluster_id", + "id", + Cardinality::ManyToOne, + ), + }, + OntologyLink { + name: "uses_connection", + target: "connection", + properties: LinkProperties::fk_nullable( + "connection_id", + "id", + Cardinality::ManyToOne, + ), + }, + ] + }, + column_semantic_types: &const { + [ + ("id", SemanticType::CatalogItemId), + ("oid", SemanticType::OID), + ("schema_id", SemanticType::SchemaId), + ("type", SemanticType::SourceType), + ("connection_id", SemanticType::CatalogItemId), + ("cluster_id", SemanticType::ClusterId), + ("owner_id", SemanticType::RoleId), + ("create_sql", SemanticType::SqlDefinition), + ("redacted_create_sql", SemanticType::RedactedSqlDefinition), + ] + }, + }), }); pub static MZ_SINKS: LazyLock = LazyLock::new(|| { BuiltinTable { @@ -2836,6 +3568,50 @@ pub static MZ_SINKS: LazyLock = LazyLock::new(|| { ]), is_retained_metrics_object: true, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "sink", + description: "An export of data from Materialize to an external system", + links: &const { + [ + OntologyLink { + 
name: "in_schema", + target: "schema", + properties: LinkProperties::fk("schema_id", "id", Cardinality::ManyToOne), + }, + OntologyLink { + name: "owned_by", + target: "role", + properties: LinkProperties::fk("owner_id", "id", Cardinality::ManyToOne), + }, + OntologyLink { + name: "runs_on_cluster", + target: "cluster", + properties: LinkProperties::fk("cluster_id", "id", Cardinality::ManyToOne), + }, + OntologyLink { + name: "uses_connection", + target: "connection", + properties: LinkProperties::fk_nullable( + "connection_id", + "id", + Cardinality::ManyToOne, + ), + }, + ] + }, + column_semantic_types: &const { + [ + ("id", SemanticType::CatalogItemId), + ("oid", SemanticType::OID), + ("schema_id", SemanticType::SchemaId), + ("connection_id", SemanticType::CatalogItemId), + ("cluster_id", SemanticType::ClusterId), + ("owner_id", SemanticType::RoleId), + ("create_sql", SemanticType::SqlDefinition), + ("redacted_create_sql", SemanticType::RedactedSqlDefinition), + ] + }, + }), } }); pub static MZ_VIEWS: LazyLock = LazyLock::new(|| BuiltinTable { @@ -2880,6 +3656,35 @@ pub static MZ_VIEWS: LazyLock = LazyLock::new(|| BuiltinTable { ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "view", + description: "A non-materialized view defined by a SQL query", + links: &const { + [ + OntologyLink { + name: "in_schema", + target: "schema", + properties: LinkProperties::fk("schema_id", "id", Cardinality::ManyToOne), + }, + OntologyLink { + name: "owned_by", + target: "role", + properties: LinkProperties::fk("owner_id", "id", Cardinality::ManyToOne), + }, + ] + }, + column_semantic_types: &const { + [ + ("id", SemanticType::CatalogItemId), + ("oid", SemanticType::OID), + ("schema_id", SemanticType::SchemaId), + ("definition", SemanticType::SqlDefinition), + ("owner_id", SemanticType::RoleId), + ("create_sql", SemanticType::SqlDefinition), + ("redacted_create_sql", SemanticType::RedactedSqlDefinition), + ] + }, + }), 
}); pub static MZ_MATERIALIZED_VIEWS: LazyLock = LazyLock::new(|| { @@ -3005,6 +3810,16 @@ UNION ALL SELECT * FROM builtin_mvs").into_boxed_str()), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "mv", + description: "A materialized view maintained incrementally on a cluster", + links: &const { [ + OntologyLink { name: "in_schema", target: "schema", properties: LinkProperties::fk("schema_id", "id", Cardinality::ManyToOne) }, + OntologyLink { name: "owned_by", target: "role", properties: LinkProperties::fk("owner_id", "id", Cardinality::ManyToOne) }, + OntologyLink { name: "runs_on_cluster", target: "cluster", properties: LinkProperties::fk("cluster_id", "id", Cardinality::ManyToOne) }, + ] }, + column_semantic_types: &const {[("id", SemanticType::CatalogItemId), ("oid", SemanticType::OID), ("schema_id", SemanticType::SchemaId), ("cluster_id", SemanticType::ClusterId), ("definition", SemanticType::SqlDefinition), ("owner_id", SemanticType::RoleId), ("create_sql", SemanticType::SqlDefinition), ("redacted_create_sql", SemanticType::RedactedSqlDefinition)]}, + }), } }); @@ -3053,6 +3868,7 @@ pub static MZ_MATERIALIZED_VIEW_REFRESH_STRATEGIES: LazyLock = Laz ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: None, } }); pub static MZ_TYPES: LazyLock = LazyLock::new(|| BuiltinTable { @@ -3097,6 +3913,34 @@ pub static MZ_TYPES: LazyLock = LazyLock::new(|| BuiltinTable { ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "type", + description: "A named data type (base, array, list, map, or pseudo)", + links: &const { + [ + OntologyLink { + name: "in_schema", + target: "schema", + properties: LinkProperties::fk("schema_id", "id", Cardinality::ManyToOne), + }, + OntologyLink { + name: "owned_by", + target: "role", + properties: LinkProperties::fk("owner_id", "id", Cardinality::ManyToOne), + }, + ] + }, + column_semantic_types: 
&const { + [ + ("id", SemanticType::CatalogItemId), + ("oid", SemanticType::OID), + ("schema_id", SemanticType::SchemaId), + ("owner_id", SemanticType::RoleId), + ("create_sql", SemanticType::SqlDefinition), + ("redacted_create_sql", SemanticType::RedactedSqlDefinition), + ] + }, + }), }); pub static MZ_NETWORK_POLICIES: LazyLock = LazyLock::new(|| { @@ -3148,6 +3992,24 @@ FROM mz_internal.mz_catalog_raw WHERE data->>'kind' = 'NetworkPolicy'", is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "network_policy", + description: "Network access policies", + links: &const { + [OntologyLink { + name: "owned_by", + target: "role", + properties: LinkProperties::fk("owner_id", "id", Cardinality::ManyToOne), + }] + }, + column_semantic_types: &const { + [ + ("id", SemanticType::NetworkPolicyId), + ("owner_id", SemanticType::RoleId), + ("oid", SemanticType::OID), + ] + }, + }), } }); @@ -3170,7 +4032,7 @@ pub static MZ_NETWORK_POLICY_RULES: LazyLock = LazyLock ), ( "policy_id", - "The ID the network policy the rule is part of. Corresponds to `mz_network_policy_rules.id`.", + "The ID of the network policy the rule is part of. 
Corresponds to `mz_internal.mz_network_policies.id`.", ), ( "action", @@ -3203,6 +4065,18 @@ FROM WHERE data->>'kind' = 'NetworkPolicy'", is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "network_policy_rule", + description: "Individual rules within a network policy", + links: &const { + [OntologyLink { + name: "belongs_to_policy", + target: "network_policy", + properties: LinkProperties::fk("policy_id", "id", Cardinality::ManyToOne), + }] + }, + column_semantic_types: &[], + }), } }); @@ -3220,6 +4094,7 @@ pub static MZ_TYPE_PG_METADATA: LazyLock = LazyLock::new(|| Builti column_comments: BTreeMap::new(), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_ARRAY_TYPES: LazyLock = LazyLock::new(|| BuiltinTable { name: "mz_array_types", @@ -3235,6 +4110,30 @@ pub static MZ_ARRAY_TYPES: LazyLock = LazyLock::new(|| BuiltinTabl ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "array_type", + description: "An array type with its element type", + links: &const { + [ + OntologyLink { + name: "detail_of", + target: "type", + properties: LinkProperties::fk("id", "id", Cardinality::OneToOne), + }, + OntologyLink { + name: "has_element_type", + target: "type", + properties: LinkProperties::fk("element_id", "id", Cardinality::ManyToOne), + }, + ] + }, + column_semantic_types: &const { + [ + ("id", SemanticType::CatalogItemId), + ("element_id", SemanticType::CatalogItemId), + ] + }, + }), }); pub static MZ_BASE_TYPES: LazyLock = LazyLock::new(|| BuiltinTable { name: "mz_base_types", @@ -3246,6 +4145,12 @@ pub static MZ_BASE_TYPES: LazyLock = LazyLock::new(|| BuiltinTable column_comments: BTreeMap::from_iter([("id", "The ID of the type.")]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "base_type", + description: "A primitive/base data type", + links: 
&const { [] }, + column_semantic_types: &[("id", SemanticType::CatalogItemId)], + }), }); pub static MZ_LIST_TYPES: LazyLock = LazyLock::new(|| BuiltinTable { name: "mz_list_types", @@ -3273,6 +4178,30 @@ pub static MZ_LIST_TYPES: LazyLock = LazyLock::new(|| BuiltinTable ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "list_type", + description: "A list type with its element type", + links: &const { + [ + OntologyLink { + name: "detail_of", + target: "type", + properties: LinkProperties::fk("id", "id", Cardinality::OneToOne), + }, + OntologyLink { + name: "has_element_type", + target: "type", + properties: LinkProperties::fk("element_id", "id", Cardinality::ManyToOne), + }, + ] + }, + column_semantic_types: &const { + [ + ("id", SemanticType::CatalogItemId), + ("element_id", SemanticType::CatalogItemId), + ] + }, + }), }); pub static MZ_MAP_TYPES: LazyLock = LazyLock::new(|| BuiltinTable { name: "mz_map_types", @@ -3314,6 +4243,36 @@ pub static MZ_MAP_TYPES: LazyLock = LazyLock::new(|| BuiltinTable ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "map_type", + description: "A map type with its key and value types", + links: &const { + [ + OntologyLink { + name: "detail_of", + target: "type", + properties: LinkProperties::fk("id", "id", Cardinality::OneToOne), + }, + OntologyLink { + name: "has_key_type", + target: "type", + properties: LinkProperties::fk("key_id", "id", Cardinality::ManyToOne), + }, + OntologyLink { + name: "has_value_type", + target: "type", + properties: LinkProperties::fk("value_id", "id", Cardinality::ManyToOne), + }, + ] + }, + column_semantic_types: &const { + [ + ("id", SemanticType::CatalogItemId), + ("key_id", SemanticType::CatalogItemId), + ("value_id", SemanticType::CatalogItemId), + ] + }, + }), }); pub static MZ_ROLES: LazyLock = LazyLock::new(|| BuiltinTable { name: "mz_roles", @@ -3342,6 +4301,12 @@ pub static 
MZ_ROLES: LazyLock = LazyLock::new(|| BuiltinTable { ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "role", + description: "A user or role for authentication and access control", + links: &const { [] }, + column_semantic_types: &const { [("id", SemanticType::RoleId), ("oid", SemanticType::OID)] }, + }), }); pub static MZ_ROLE_MEMBERS: LazyLock = LazyLock::new(|| { @@ -3385,6 +4350,36 @@ FROM WHERE data->>'kind' = 'Role'", is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "role_membership", + description: "A membership grant: one role is a member of another role", + links: &const { + [ + OntologyLink { + name: "group_role", + target: "role", + properties: LinkProperties::fk("role_id", "id", Cardinality::ManyToOne), + }, + OntologyLink { + name: "member_role", + target: "role", + properties: LinkProperties::fk("member", "id", Cardinality::ManyToOne), + }, + OntologyLink { + name: "granted_by", + target: "role", + properties: LinkProperties::fk("grantor", "id", Cardinality::ManyToOne), + }, + ] + }, + column_semantic_types: &const { + [ + ("role_id", SemanticType::RoleId), + ("member", SemanticType::RoleId), + ("grantor", SemanticType::RoleId), + ] + }, + }), } }); @@ -3413,6 +4408,18 @@ pub static MZ_ROLE_PARAMETERS: LazyLock = LazyLock::new(|| Builtin ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "role_parameter", + description: "A session parameter default set for a role", + links: &const { + [OntologyLink { + name: "default_parameter_setting_of", + target: "role", + properties: LinkProperties::fk("role_id", "id", Cardinality::ManyToOne), + }] + }, + column_semantic_types: &[("role_id", SemanticType::RoleId)], + }), }); pub static MZ_ROLE_AUTH: LazyLock = LazyLock::new(|| BuiltinTable { name: "mz_role_auth", @@ -3444,6 +4451,7 @@ pub static MZ_ROLE_AUTH: LazyLock = LazyLock::new(|| 
BuiltinTable ]), is_retained_metrics_object: false, access: vec![rbac::owner_privilege(ObjectType::Table, MZ_SYSTEM_ROLE_ID)], + ontology: None, }); pub static MZ_PSEUDO_TYPES: LazyLock = LazyLock::new(|| BuiltinTable { name: "mz_pseudo_types", @@ -3455,6 +4463,12 @@ pub static MZ_PSEUDO_TYPES: LazyLock = LazyLock::new(|| BuiltinTab column_comments: BTreeMap::from_iter([("id", "The ID of the type.")]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "pseudo_type", + description: "A pseudo-type used in function signatures", + links: &const { [] }, + column_semantic_types: &[("id", SemanticType::CatalogItemId)], + }), }); pub static MZ_FUNCTIONS: LazyLock = LazyLock::new(|| { BuiltinTable { @@ -3509,6 +4523,52 @@ pub static MZ_FUNCTIONS: LazyLock = LazyLock::new(|| { ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "function", + description: "A built-in or user-defined function", + links: &const { + [ + OntologyLink { + name: "in_schema", + target: "schema", + properties: LinkProperties::fk("schema_id", "id", Cardinality::ManyToOne), + }, + OntologyLink { + name: "owned_by", + target: "role", + properties: LinkProperties::fk("owner_id", "id", Cardinality::ManyToOne), + }, + OntologyLink { + name: "returns_type", + target: "type", + properties: LinkProperties::fk_nullable( + "return_type_id", + "id", + Cardinality::ManyToOne, + ), + }, + OntologyLink { + name: "has_variadic_arg_type", + target: "type", + properties: LinkProperties::fk_nullable( + "variadic_argument_type_id", + "id", + Cardinality::ManyToOne, + ), + }, + ] + }, + column_semantic_types: &const { + [ + ("id", SemanticType::CatalogItemId), + ("oid", SemanticType::OID), + ("schema_id", SemanticType::SchemaId), + ("variadic_argument_type_id", SemanticType::CatalogItemId), + ("return_type_id", SemanticType::CatalogItemId), + ("owner_id", SemanticType::RoleId), + ] + }, + }), } }); pub 
static MZ_OPERATORS: LazyLock = LazyLock::new(|| BuiltinTable { @@ -3527,6 +4587,27 @@ pub static MZ_OPERATORS: LazyLock = LazyLock::new(|| BuiltinTable column_comments: BTreeMap::new(), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "operator", + description: "A built-in SQL operator", + links: &const { + [OntologyLink { + name: "returns_type", + target: "type", + properties: LinkProperties::fk_nullable( + "return_type_id", + "id", + Cardinality::ManyToOne, + ), + }] + }, + column_semantic_types: &const { + [ + ("oid", SemanticType::OID), + ("return_type_id", SemanticType::CatalogItemId), + ] + }, + }), }); pub static MZ_AGGREGATES: LazyLock = LazyLock::new(|| BuiltinTable { name: "mz_aggregates", @@ -3540,6 +4621,12 @@ pub static MZ_AGGREGATES: LazyLock = LazyLock::new(|| BuiltinTable column_comments: BTreeMap::new(), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "aggregate", + description: "Aggregate function metadata", + links: &const { [] }, + column_semantic_types: &[("oid", SemanticType::OID)], + }), }); pub static MZ_CLUSTERS: LazyLock = LazyLock::new(|| BuiltinTable { @@ -3615,6 +4702,30 @@ pub static MZ_CLUSTERS: LazyLock = LazyLock::new(|| BuiltinTable { ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "cluster", + description: "A compute cluster that runs dataflows for sources, sinks, MVs, and indexes", + links: &const { + [ + OntologyLink { + name: "owned_by", + target: "role", + properties: LinkProperties::fk("owner_id", "id", Cardinality::ManyToOne), + }, + OntologyLink { + name: "has_size", + target: "replica_size", + properties: LinkProperties::fk_nullable("size", "size", Cardinality::ManyToOne), + }, + ] + }, + column_semantic_types: &const { + [ + ("id", SemanticType::ClusterId), + ("owner_id", SemanticType::RoleId), + ] + }, + }), }); pub static 
MZ_CLUSTER_WORKLOAD_CLASSES: LazyLock = @@ -3642,6 +4753,7 @@ FROM mz_internal.mz_catalog_raw WHERE data->>'kind' = 'Cluster'", is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: None, }); pub const MZ_CLUSTER_WORKLOAD_CLASSES_IND: BuiltinIndex = BuiltinIndex { @@ -3678,6 +4790,18 @@ pub static MZ_CLUSTER_SCHEDULES: LazyLock = LazyLock::new(|| Built ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "cluster_schedule", + description: "Cluster scheduling configuration", + links: &const { + [OntologyLink { + name: "belongs_to_cluster", + target: "cluster", + properties: LinkProperties::fk("cluster_id", "id", Cardinality::ManyToOne), + }] + }, + column_semantic_types: &[("cluster_id", SemanticType::ClusterId)], + }), }); pub static MZ_SECRETS: LazyLock = LazyLock::new(|| { @@ -3733,6 +4857,23 @@ WHERE mz_internal.parse_catalog_create_sql(data->'value'->'definition'->'V1'->>'create_sql')->>'type' = 'secret'", is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "secret", + description: "A user-defined secret containing sensitive configuration (e.g., credentials)", + links: &const { [ + OntologyLink { + name: "in_schema", + target: "schema", + properties: LinkProperties::fk("schema_id", "id", Cardinality::ManyToOne), + }, + OntologyLink { + name: "owned_by", + target: "role", + properties: LinkProperties::fk("owner_id", "id", Cardinality::ManyToOne), + }, + ] }, + column_semantic_types: &const {[("id", SemanticType::CatalogItemId), ("oid", SemanticType::OID), ("schema_id", SemanticType::SchemaId), ("owner_id", SemanticType::RoleId)]}, + }), } }); @@ -3774,6 +4915,36 @@ pub static MZ_CLUSTER_REPLICAS: LazyLock = LazyLock::new(|| Builti ]), is_retained_metrics_object: true, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "replica", + description: "A physical replica of a cluster providing fault tolerance", + 
links: &const { + [ + OntologyLink { + name: "owned_by", + target: "role", + properties: LinkProperties::fk("owner_id", "id", Cardinality::ManyToOne), + }, + OntologyLink { + name: "belongs_to_cluster", + target: "cluster", + properties: LinkProperties::fk("cluster_id", "id", Cardinality::ManyToOne), + }, + OntologyLink { + name: "has_size", + target: "replica_size", + properties: LinkProperties::fk_nullable("size", "size", Cardinality::ManyToOne), + }, + ] + }, + column_semantic_types: &const { + [ + ("id", SemanticType::ReplicaId), + ("cluster_id", SemanticType::ClusterId), + ("owner_id", SemanticType::RoleId), + ] + }, + }), }); pub static MZ_INTERNAL_CLUSTER_REPLICAS: LazyLock = @@ -3801,6 +4972,7 @@ WHERE (data->'value'->'config'->'location'->'Managed'->>'internal')::bool = true", is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_PENDING_CLUSTER_REPLICAS: LazyLock = @@ -3828,6 +5000,7 @@ WHERE (data->'value'->'config'->'location'->'Managed'->>'pending')::bool = true", is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_CLUSTER_REPLICA_STATUS_HISTORY: LazyLock = LazyLock::new(|| { @@ -3855,6 +5028,23 @@ pub static MZ_CLUSTER_REPLICA_STATUS_HISTORY: LazyLock = LazyLock ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "replica_status_event", + description: "Historical replica status events (ready, not-ready, etc.)", + links: &const { + [OntologyLink { + name: "status_event_of_replica", + target: "replica", + properties: LinkProperties::fk_typed( + "replica_id", + "id", + Cardinality::ManyToOne, + mz_repr::SemanticType::ReplicaId, + ), + }] + }, + column_semantic_types: &[("replica_id", SemanticType::ReplicaId)], + }), } }); @@ -3907,6 +5097,28 @@ FROM mz_internal.mz_cluster_replica_status_history JOIN mz_cluster_replicas r ON r.id = replica_id ORDER BY replica_id, process_id, occurred_at DESC", access: 
vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "replica_status", + description: "Current status of each replica", + links: &const { + [OntologyLink { + name: "status_of_replica", + target: "replica", + properties: LinkProperties::fk_typed( + "replica_id", + "id", + Cardinality::ManyToOne, + mz_repr::SemanticType::ReplicaId, + ), + }] + }, + column_semantic_types: &const { + [ + ("replica_id", SemanticType::ReplicaId), + ("updated_at", SemanticType::WallclockTimestamp), + ] + }, + }), }); pub static MZ_CLUSTER_REPLICA_SIZES: LazyLock = LazyLock::new(|| BuiltinTable { @@ -3948,6 +5160,18 @@ pub static MZ_CLUSTER_REPLICA_SIZES: LazyLock = LazyLock::new(|| B ]), is_retained_metrics_object: true, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "replica_size", + description: "Available cluster replica sizes with CPU, memory, and credit cost", + links: &const { [] }, + column_semantic_types: &const { + [ + ("memory_bytes", SemanticType::ByteCount), + ("disk_bytes", SemanticType::ByteCount), + ("credits_per_hour", SemanticType::CreditRate), + ] + }, + }), }); pub static MZ_AUDIT_EVENTS: LazyLock = LazyLock::new(|| BuiltinTable { @@ -3994,6 +5218,17 @@ pub static MZ_AUDIT_EVENTS: LazyLock = LazyLock::new(|| BuiltinTab ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "audit_event", + description: "An audit log entry recording a DDL operation", + links: &const { [] }, + column_semantic_types: &const { + [ + ("object_type", SemanticType::ObjectType), + ("occurred_at", SemanticType::WallclockTimestamp), + ] + }, + }), }); pub static MZ_SOURCE_STATUS_HISTORY: LazyLock = LazyLock::new(|| BuiltinSource { @@ -4030,6 +5265,41 @@ pub static MZ_SOURCE_STATUS_HISTORY: LazyLock = LazyLock::new(|| ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "source_status_event", + description: "Historical source status events", + links: 
&const { + [ + OntologyLink { + name: "status_event_of_source", + target: "source", + properties: LinkProperties::fk_mapped( + "source_id", + "id", + Cardinality::ManyToOne, + mz_repr::SemanticType::GlobalId, + "mz_internal.mz_object_global_ids", + ), + }, + OntologyLink { + name: "on_replica", + target: "replica", + properties: LinkProperties::fk_nullable( + "replica_id", + "id", + Cardinality::ManyToOne, + ), + }, + ] + }, + column_semantic_types: &const { + [ + ("occurred_at", SemanticType::WallclockTimestamp), + ("source_id", SemanticType::GlobalId), + ("replica_id", SemanticType::ReplicaId), + ] + }, + }), }); pub static MZ_AWS_PRIVATELINK_CONNECTION_STATUS_HISTORY: LazyLock = LazyLock::new( @@ -4054,6 +5324,7 @@ pub static MZ_AWS_PRIVATELINK_CONNECTION_STATUS_HISTORY: LazyLock ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: None, }, ); @@ -4113,6 +5384,18 @@ pub static MZ_AWS_PRIVATELINK_CONNECTION_STATUSES: LazyLock = LazyL JOIN mz_catalog.mz_connections AS conns ON conns.id = latest_events.connection_id", access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "privatelink_status", + description: "PrivateLink connection health status", + links: &const { + [OntologyLink { + name: "status_of", + target: "connection", + properties: LinkProperties::fk("id", "id", Cardinality::OneToOne), + }] + }, + column_semantic_types: &[("id", SemanticType::CatalogItemId)], + }), } }); @@ -4126,6 +5409,7 @@ pub static MZ_STATEMENT_EXECUTION_HISTORY: LazyLock = column_comments: BTreeMap::new(), is_retained_metrics_object: false, access: vec![MONITOR_SELECT], + ontology: None, }); pub static MZ_STATEMENT_EXECUTION_HISTORY_REDACTED: LazyLock = LazyLock::new(|| { @@ -4163,6 +5447,7 @@ transient_index_id, mz_version, began_at, finished_at, finished_status, result_size, rows_returned, execution_strategy FROM mz_internal.mz_statement_execution_history", access: vec![SUPPORT_SELECT, ANALYTICS_SELECT, MONITOR_REDACTED_SELECT, 
MONITOR_SELECT], + ontology: None, } }); @@ -4181,6 +5466,7 @@ pub static MZ_PREPARED_STATEMENT_HISTORY: LazyLock = MONITOR_REDACTED_SELECT, MONITOR_SELECT, ], + ontology: None, }); pub static MZ_SQL_TEXT: LazyLock = LazyLock::new(|| BuiltinSource { @@ -4192,6 +5478,12 @@ pub static MZ_SQL_TEXT: LazyLock = LazyLock::new(|| BuiltinSource column_comments: BTreeMap::new(), is_retained_metrics_object: false, access: vec![MONITOR_SELECT], + ontology: Some(Ontology { + entity_name: "sql_text", + description: "Raw SQL text of executed statements", + links: &const { [] }, + column_semantic_types: &[], + }), }); pub static MZ_SQL_TEXT_REDACTED: LazyLock = LazyLock::new(|| BuiltinView { @@ -4210,6 +5502,7 @@ pub static MZ_SQL_TEXT_REDACTED: LazyLock = LazyLock::new(|| Builti SUPPORT_SELECT, ANALYTICS_SELECT, ], + ontology: None, }); pub static MZ_RECENT_SQL_TEXT: LazyLock = LazyLock::new(|| { @@ -4230,6 +5523,12 @@ pub static MZ_RECENT_SQL_TEXT: LazyLock = LazyLock::new(|| { column_comments: BTreeMap::new(), sql: "SELECT DISTINCT sql_hash, sql, redacted_sql FROM mz_internal.mz_sql_text WHERE prepared_day + INTERVAL '4 days' >= mz_now()", access: vec![MONITOR_SELECT], + ontology: Some(Ontology { + entity_name: "recent_sql_text", + description: "Recent SQL text (indexed, last ~3-4 days)", + links: &const { [] }, + column_semantic_types: &[("sql", SemanticType::SqlDefinition)], + }), } }); @@ -4249,6 +5548,7 @@ pub static MZ_RECENT_SQL_TEXT_REDACTED: LazyLock = LazyLock::new(|| SUPPORT_SELECT, ANALYTICS_SELECT, ], + ontology: None, }); pub static MZ_RECENT_SQL_TEXT_IND: LazyLock = LazyLock::new(|| BuiltinIndex { @@ -4285,6 +5585,18 @@ pub static MZ_SESSION_HISTORY: LazyLock = LazyLock::new(|| Builti ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "session", + description: "Historical session connection events", + links: &const { + [OntologyLink { + name: "active_as", + target: "active_session", + properties: 
LinkProperties::fk_nullable("session_id", "id", Cardinality::ManyToOne), + }] + }, + column_semantic_types: &[("connected_at", SemanticType::WallclockTimestamp)], + }), }); pub static MZ_ACTIVITY_LOG_THINNED: LazyLock = LazyLock::new(|| { @@ -4338,6 +5650,7 @@ FROM mz_internal.mz_statement_execution_history mseh, WHERE mseh.prepared_statement_id = mpsh.id AND mpsh.session_id = msh.session_id", access: vec![MONITOR_SELECT], + ontology: None, } }); @@ -4385,6 +5698,7 @@ pub static MZ_RECENT_ACTIVITY_LOG_THINNED: LazyLock = LazyLock::new "SELECT * FROM mz_internal.mz_activity_log_thinned WHERE prepared_at + INTERVAL '1 day' > mz_now() AND began_at + INTERVAL '1 day' > mz_now() AND connected_at + INTERVAL '2 days' > mz_now()", access: vec![MONITOR_SELECT], + ontology: None, } }); @@ -4584,6 +5898,62 @@ FROM mz_internal.mz_recent_activity_log_thinned mralt, mz_internal.mz_recent_sql_text mrst WHERE mralt.sql_hash = mrst.sql_hash", access: vec![MONITOR_SELECT], + ontology: Some(Ontology { + entity_name: "activity_log", + description: "Recent query activity with execution stats", + links: &const { + [ + OntologyLink { + name: "in_session", + target: "session", + properties: LinkProperties::fk("session_id", "id", Cardinality::ManyToOne), + }, + OntologyLink { + name: "in_active_session", + target: "active_session", + properties: LinkProperties::fk_nullable( + "session_id", + "id", + Cardinality::ManyToOne, + ), + }, + OntologyLink { + name: "ran_on_cluster", + target: "cluster", + properties: LinkProperties::fk_nullable( + "cluster_id", + "id", + Cardinality::ManyToOne, + ), + }, + OntologyLink { + name: "used_transient_index", + target: "object", + properties: LinkProperties::ForeignKey { + source_column: "transient_index_id", + target_column: "id", + cardinality: Cardinality::ManyToOne, + source_id_type: Some(mz_repr::SemanticType::GlobalId), + requires_mapping: Some("mz_internal.mz_object_global_ids"), + nullable: true, + note: None, + }, + }, + ] + }, + 
column_semantic_types: &const { + [ + ("cluster_id", SemanticType::ClusterId), + ("execution_timestamp", SemanticType::MzTimestamp), + ("transient_index_id", SemanticType::GlobalId), + ("began_at", SemanticType::WallclockTimestamp), + ("finished_at", SemanticType::WallclockTimestamp), + ("prepared_at", SemanticType::WallclockTimestamp), + ("connected_at", SemanticType::WallclockTimestamp), + ("sql", SemanticType::SqlDefinition), + ] + }, + }), }); pub static MZ_RECENT_ACTIVITY_LOG_REDACTED: LazyLock = LazyLock::new(|| { @@ -4635,6 +6005,7 @@ FROM mz_internal.mz_recent_activity_log_thinned mralt, mz_internal.mz_recent_sql_text mrst WHERE mralt.sql_hash = mrst.sql_hash", access: vec![MONITOR_SELECT, MONITOR_REDACTED_SELECT, SUPPORT_SELECT, ANALYTICS_SELECT], + ontology: None, } }); @@ -4673,6 +6044,22 @@ pub static MZ_STATEMENT_LIFECYCLE_HISTORY: LazyLock = LazyLock::n MONITOR_REDACTED_SELECT, MONITOR_SELECT, ], + ontology: Some(Ontology { + entity_name: "statement_lifecycle_event", + description: "Statement lifecycle events (parse, bind, execute)", + links: &const { + [OntologyLink { + name: "for_execution", + target: "activity_log", + properties: LinkProperties::fk( + "statement_id", + "execution_id", + Cardinality::ManyToOne, + ), + }] + }, + column_semantic_types: &[], + }), } }); @@ -4855,6 +6242,24 @@ SELECT FROM combined WHERE id NOT LIKE 's%';", access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "source_status", + description: "Current source status (running, stalled, etc.)", + links: &const { + [OntologyLink { + name: "status_of_source", + target: "source", + properties: LinkProperties::fk("id", "id", Cardinality::OneToOne), + }] + }, + column_semantic_types: &const { + [ + ("id", SemanticType::CatalogItemId), + ("type", SemanticType::SourceType), + ("last_status_change_at", SemanticType::WallclockTimestamp), + ] + }, + }), }); pub static MZ_SINK_STATUS_HISTORY: LazyLock = LazyLock::new(|| BuiltinSource { @@ -4891,6 +6296,41 @@ pub 
static MZ_SINK_STATUS_HISTORY: LazyLock = LazyLock::new(|| Bu ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "sink_status_event", + description: "Historical sink status events", + links: &const { + [ + OntologyLink { + name: "status_event_of_sink", + target: "sink", + properties: LinkProperties::fk_mapped( + "sink_id", + "id", + Cardinality::ManyToOne, + mz_repr::SemanticType::GlobalId, + "mz_internal.mz_object_global_ids", + ), + }, + OntologyLink { + name: "on_replica", + target: "replica", + properties: LinkProperties::fk_nullable( + "replica_id", + "id", + Cardinality::ManyToOne, + ), + }, + ] + }, + column_semantic_types: &const { + [ + ("occurred_at", SemanticType::WallclockTimestamp), + ("sink_id", SemanticType::GlobalId), + ("replica_id", SemanticType::ReplicaId), + ] + }, + }), }); pub static MZ_SINK_STATUSES: LazyLock = LazyLock::new(|| BuiltinView { @@ -4996,6 +6436,28 @@ WHERE -- This is a convenient way to filter out system sinks, like the status_history table itself. 
mz_sinks.id NOT LIKE 's%'", access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "sink_status", + description: "Current sink status", + links: &const { + [OntologyLink { + name: "status_of_sink", + target: "sink", + properties: LinkProperties::fk_typed( + "id", + "id", + Cardinality::OneToOne, + mz_repr::SemanticType::CatalogItemId, + ), + }] + }, + column_semantic_types: &const { + [ + ("id", SemanticType::CatalogItemId), + ("last_status_change_at", SemanticType::WallclockTimestamp), + ] + }, + }), }); pub static MZ_STORAGE_USAGE_BY_SHARD_DESCRIPTION: LazyLock = @@ -5021,6 +6483,18 @@ pub static MZ_STORAGE_USAGE_BY_SHARD: LazyLock = LazyLock::new(|| column_comments: BTreeMap::new(), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "storage_usage_by_shard", + description: "Storage usage broken down by shard", + links: &const { [] }, + column_semantic_types: &const { + [ + ("shard_id", SemanticType::ShardId), + ("size_bytes", SemanticType::ByteCount), + ("collection_timestamp", SemanticType::WallclockTimestamp), + ] + }, + }), }); pub static MZ_EGRESS_IPS: LazyLock = LazyLock::new(|| BuiltinTable { @@ -5042,6 +6516,12 @@ pub static MZ_EGRESS_IPS: LazyLock = LazyLock::new(|| BuiltinTable ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "egress_ip", + description: "IP addresses used for outbound connections from Materialize", + links: &const { [] }, + column_semantic_types: &[], + }), }); pub static MZ_AWS_PRIVATELINK_CONNECTIONS: LazyLock = @@ -5062,6 +6542,18 @@ pub static MZ_AWS_PRIVATELINK_CONNECTIONS: LazyLock = ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "aws_privatelink_connection", + description: "AWS PrivateLink connection configuration", + links: &const { + [OntologyLink { + name: "details_of", + target: "connection", + properties: LinkProperties::fk("id", 
"id", Cardinality::OneToOne), + }] + }, + column_semantic_types: &[("id", SemanticType::CatalogItemId)], + }), }); pub static MZ_AWS_CONNECTIONS: LazyLock = LazyLock::new(|| BuiltinTable { @@ -5142,6 +6634,18 @@ pub static MZ_AWS_CONNECTIONS: LazyLock = LazyLock::new(|| Builtin ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "aws_connection", + description: "AWS connection configuration details", + links: &const { + [OntologyLink { + name: "details_of", + target: "connection", + properties: LinkProperties::fk("id", "id", Cardinality::OneToOne), + }] + }, + column_semantic_types: &[], + }), }); pub static MZ_CLUSTER_REPLICA_METRICS_HISTORY: LazyLock = @@ -5172,6 +6676,7 @@ pub static MZ_CLUSTER_REPLICA_METRICS_HISTORY: LazyLock = ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_CLUSTER_REPLICA_METRICS: LazyLock = LazyLock::new(|| BuiltinView { @@ -5217,6 +6722,31 @@ FROM mz_internal.mz_cluster_replica_metrics_history JOIN mz_cluster_replicas r ON r.id = replica_id ORDER BY replica_id, process_id, occurred_at DESC", access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "replica_metrics", + description: "CPU and memory metrics per replica", + links: &const { + [OntologyLink { + name: "metrics_of_replica", + target: "replica", + properties: LinkProperties::fk_typed( + "replica_id", + "id", + Cardinality::OneToOne, + mz_repr::SemanticType::CatalogItemId, + ), + }] + }, + column_semantic_types: &const { + [ + ("replica_id", SemanticType::ReplicaId), + ("memory_bytes", SemanticType::ByteCount), + ("disk_bytes", SemanticType::ByteCount), + ("heap_bytes", SemanticType::ByteCount), + ("heap_limit", SemanticType::ByteCount), + ] + }, + }), }); pub static MZ_CLUSTER_REPLICA_FRONTIERS: LazyLock = @@ -5243,6 +6773,7 @@ pub static MZ_CLUSTER_REPLICA_FRONTIERS: LazyLock = ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + 
ontology: None, }); pub static MZ_CLUSTER_REPLICA_FRONTIERS_IND: LazyLock = @@ -5280,6 +6811,30 @@ pub static MZ_FRONTIERS: LazyLock = LazyLock::new(|| BuiltinSourc ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "frontier", + description: "Current read/write frontiers for sources, sinks, tables, materialized views, indexes, and subscriptions", + links: &const { + [OntologyLink { + name: "frontier_of", + target: "object", + properties: LinkProperties::fk_mapped( + "object_id", + "id", + Cardinality::ManyToOne, + mz_repr::SemanticType::GlobalId, + "mz_internal.mz_object_global_ids", + ), + }] + }, + column_semantic_types: &const { + [ + ("object_id", SemanticType::GlobalId), + ("read_frontier", SemanticType::MzTimestamp), + ("write_frontier", SemanticType::MzTimestamp), + ] + }, + }), }); /// DEPRECATED and scheduled for removal! Use `mz_frontiers` instead. @@ -5297,6 +6852,7 @@ SELECT object_id, write_frontier AS time FROM mz_internal.mz_frontiers WHERE write_frontier IS NOT NULL", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_WALLCLOCK_LAG_HISTORY: LazyLock = LazyLock::new(|| BuiltinSource { @@ -5325,6 +6881,41 @@ pub static MZ_WALLCLOCK_LAG_HISTORY: LazyLock = LazyLock::new(|| ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "wallclock_lag_event", + description: "Historical wallclock lag per object", + links: &const { + [ + OntologyLink { + name: "measures_lag_of", + target: "object", + properties: LinkProperties::measures_mapped( + "object_id", + "id", + "wallclock_lag", + mz_repr::SemanticType::GlobalId, + "mz_internal.mz_object_global_ids", + ), + }, + OntologyLink { + name: "on_replica", + target: "replica", + properties: LinkProperties::fk_nullable( + "replica_id", + "id", + Cardinality::ManyToOne, + ), + }, + ] + }, + column_semantic_types: &const { + [ + ("object_id", SemanticType::GlobalId), + ("replica_id", 
SemanticType::ReplicaId), + ("occurred_at", SemanticType::WallclockTimestamp), + ] + }, + }), }); pub static MZ_WALLCLOCK_GLOBAL_LAG_HISTORY: LazyLock = LazyLock::new(|| BuiltinView { @@ -5370,6 +6961,23 @@ FROM times_binned GROUP BY object_id, occurred_at OPTIONS (AGGREGATE INPUT GROUP SIZE = 1)", access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "wallclock_global_lag_event", + description: "Historical global wallclock lag", + links: &const { + [OntologyLink { + name: "lag_of", + target: "object_global_id", + properties: LinkProperties::fk("object_id", "global_id", Cardinality::ManyToOne), + }] + }, + column_semantic_types: &const { + [ + ("object_id", SemanticType::GlobalId), + ("occurred_at", SemanticType::WallclockTimestamp), + ] + }, + }), }); pub static MZ_WALLCLOCK_GLOBAL_LAG_RECENT_HISTORY: LazyLock = LazyLock::new(|| { @@ -5405,6 +7013,7 @@ SELECT object_id, lag, occurred_at FROM mz_internal.mz_wallclock_global_lag_history WHERE occurred_at + '1 day' > mz_now()", access: vec![PUBLIC_SELECT], + ontology: None, } }); @@ -5433,6 +7042,24 @@ FROM mz_internal.mz_wallclock_global_lag_recent_history WHERE occurred_at + '5 minutes' > mz_now() ORDER BY object_id, occurred_at DESC", access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "wallclock_global_lag", + description: "Current wallclock lag aggregated across replicas", + links: &const { + [OntologyLink { + name: "measures_global_lag_of", + target: "object", + properties: LinkProperties::measures_mapped( + "object_id", + "id", + "wallclock_lag_global", + mz_repr::SemanticType::GlobalId, + "mz_internal.mz_object_global_ids", + ), + }] + }, + column_semantic_types: &[("object_id", SemanticType::GlobalId)], + }), }); pub static MZ_WALLCLOCK_GLOBAL_LAG_HISTOGRAM_RAW: LazyLock = @@ -5445,6 +7072,7 @@ pub static MZ_WALLCLOCK_GLOBAL_LAG_HISTOGRAM_RAW: LazyLock = data_source: IntrospectionType::WallclockLagHistogram.into(), is_retained_metrics_object: false, access: 
vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_WALLCLOCK_GLOBAL_LAG_HISTOGRAM: LazyLock = @@ -5473,6 +7101,7 @@ SELECT *, count(*) AS count FROM mz_internal.mz_wallclock_global_lag_histogram_raw GROUP BY period_start, period_end, object_id, lag_seconds, labels", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_MATERIALIZED_VIEW_REFRESHES: LazyLock = LazyLock::new(|| { @@ -5510,6 +7139,7 @@ pub static MZ_MATERIALIZED_VIEW_REFRESHES: LazyLock = LazyLock::n ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: None, } }); @@ -5555,6 +7185,39 @@ pub static MZ_SUBSCRIPTIONS: LazyLock = LazyLock::new(|| BuiltinTa ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "subscription", + description: "Active SUBSCRIBE operations", + links: &const { + [ + OntologyLink { + name: "uses_session", + target: "session", + properties: LinkProperties::fk("session_id", "id", Cardinality::ManyToOne), + }, + OntologyLink { + name: "in_active_session", + target: "active_session", + properties: LinkProperties::fk_nullable( + "session_id", + "id", + Cardinality::ManyToOne, + ), + }, + OntologyLink { + name: "belongs_to_cluster", + target: "cluster", + properties: LinkProperties::fk("cluster_id", "id", Cardinality::ManyToOne), + }, + ] + }, + column_semantic_types: &const { + [ + ("id", SemanticType::CatalogItemId), + ("cluster_id", SemanticType::ClusterId), + ] + }, + }), }); pub static MZ_SESSIONS: LazyLock = LazyLock::new(|| BuiltinTable { @@ -5592,6 +7255,18 @@ pub static MZ_SESSIONS: LazyLock = LazyLock::new(|| BuiltinTable { ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "active_session", + description: "Currently active sessions", + links: &const { + [OntologyLink { + name: "logged_in_as", + target: "role", + properties: LinkProperties::fk("role_id", "id", Cardinality::ManyToOne), + }] + }, + 
column_semantic_types: &[("role_id", SemanticType::RoleId)], + }), }); pub static MZ_DEFAULT_PRIVILEGES: LazyLock = LazyLock::new(|| BuiltinTable { @@ -5631,6 +7306,51 @@ pub static MZ_DEFAULT_PRIVILEGES: LazyLock = LazyLock::new(|| Buil ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "default_privilege", + description: "A default privilege rule applied to newly created objects", + links: &const { + [ + OntologyLink { + name: "default_priv_for_role", + target: "role", + properties: LinkProperties::fk("role_id", "id", Cardinality::ManyToOne), + }, + OntologyLink { + name: "default_priv_in_database", + target: "database", + properties: LinkProperties::fk_nullable( + "database_id", + "id", + Cardinality::ManyToOne, + ), + }, + OntologyLink { + name: "default_priv_in_schema", + target: "schema", + properties: LinkProperties::fk_nullable( + "schema_id", + "id", + Cardinality::ManyToOne, + ), + }, + OntologyLink { + name: "default_priv_granted_to", + target: "role", + properties: LinkProperties::fk("grantee", "id", Cardinality::ManyToOne), + }, + ] + }, + column_semantic_types: &const { + [ + ("role_id", SemanticType::RoleId), + ("database_id", SemanticType::DatabaseId), + ("schema_id", SemanticType::SchemaId), + ("object_type", SemanticType::ObjectType), + ("grantee", SemanticType::RoleId), + ] + }, + }), }); pub static MZ_SYSTEM_PRIVILEGES: LazyLock = LazyLock::new(|| BuiltinTable { @@ -5646,6 +7366,12 @@ pub static MZ_SYSTEM_PRIVILEGES: LazyLock = LazyLock::new(|| Built )]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "system_privilege", + description: "A system-level privilege grant", + links: &const { [] }, + column_semantic_types: &[], + }), }); pub static MZ_COMMENTS: LazyLock = LazyLock::new(|| BuiltinTable { @@ -5675,6 +7401,28 @@ pub static MZ_COMMENTS: LazyLock = LazyLock::new(|| BuiltinTable { ]), is_retained_metrics_object: false, 
access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "comment", + description: "A COMMENT ON annotation for a catalog object or column", + links: &const { + [OntologyLink { + name: "comment_on", + target: "object", + properties: LinkProperties::fk_typed( + "id", + "id", + Cardinality::ManyToOne, + mz_repr::SemanticType::CatalogItemId, + ), + }] + }, + column_semantic_types: &const { + [ + ("id", SemanticType::CatalogItemId), + ("object_type", SemanticType::ObjectType), + ] + }, + }), }); pub static MZ_SOURCE_REFERENCES: LazyLock = LazyLock::new(|| BuiltinTable { @@ -5697,6 +7445,18 @@ pub static MZ_SOURCE_REFERENCES: LazyLock = LazyLock::new(|| Built column_comments: BTreeMap::new(), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "source_reference", + description: "External references tracked by sources", + links: &const { + [OntologyLink { + name: "references_source", + target: "source", + properties: LinkProperties::fk("source_id", "id", Cardinality::ManyToOne), + }] + }, + column_semantic_types: &[("source_id", SemanticType::CatalogItemId)], + }), }); pub static MZ_WEBHOOKS_SOURCES: LazyLock = LazyLock::new(|| BuiltinTable { @@ -5721,6 +7481,18 @@ pub static MZ_WEBHOOKS_SOURCES: LazyLock = LazyLock::new(|| Builti ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "webhook_source", + description: "Webhook source configuration", + links: &const { + [OntologyLink { + name: "details_of", + target: "source", + properties: LinkProperties::fk("id", "id", Cardinality::OneToOne), + }] + }, + column_semantic_types: &[("id", SemanticType::CatalogItemId)], + }), }); pub static MZ_HISTORY_RETENTION_STRATEGIES: LazyLock = LazyLock::new(|| { @@ -5746,6 +7518,12 @@ pub static MZ_HISTORY_RETENTION_STRATEGIES: LazyLock = LazyLock::n ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: 
"history_retention", + description: "History retention strategy for an object", + links: &const { [] }, + column_semantic_types: &[("id", SemanticType::CatalogItemId)], + }), } }); @@ -5787,6 +7565,12 @@ pub static MZ_LICENSE_KEYS: LazyLock = LazyLock::new(|| BuiltinTab ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "license_key", + description: "License key metadata", + links: &const { [] }, + column_semantic_types: &[("id", SemanticType::CatalogItemId)], + }), }); pub static MZ_REPLACEMENTS: LazyLock = LazyLock::new(|| BuiltinTable { @@ -5809,6 +7593,25 @@ pub static MZ_REPLACEMENTS: LazyLock = LazyLock::new(|| BuiltinTab ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "replacement", + description: "A record of an object replacement (ALTER ... SWAP)", + links: &const { + [ + OntologyLink { + name: "replacement_object", + target: "object", + properties: LinkProperties::fk("id", "id", Cardinality::ManyToOne), + }, + OntologyLink { + name: "replacement_target", + target: "object", + properties: LinkProperties::fk("target_id", "id", Cardinality::ManyToOne), + }, + ] + }, + column_semantic_types: &[("id", SemanticType::CatalogItemId)], + }), }); // These will be replaced with per-replica tables once source/sink multiplexing on @@ -5822,6 +7625,7 @@ pub static MZ_SOURCE_STATISTICS_RAW: LazyLock = LazyLock::new(|| column_comments: BTreeMap::new(), is_retained_metrics_object: true, access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_SINK_STATISTICS_RAW: LazyLock = LazyLock::new(|| BuiltinSource { name: "mz_sink_statistics_raw", @@ -5832,6 +7636,7 @@ pub static MZ_SINK_STATISTICS_RAW: LazyLock = LazyLock::new(|| Bu column_comments: BTreeMap::new(), is_retained_metrics_object: true, access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_STORAGE_SHARDS: LazyLock = LazyLock::new(|| BuiltinSource { @@ -5846,6 +7651,29 @@ pub 
static MZ_STORAGE_SHARDS: LazyLock = LazyLock::new(|| Builtin column_comments: BTreeMap::new(), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "storage_shard", + description: "Persist shards used by storage objects", + links: &const { + [OntologyLink { + name: "shard_of", + target: "object", + properties: LinkProperties::fk_mapped( + "object_id", + "id", + Cardinality::ManyToOne, + mz_repr::SemanticType::GlobalId, + "mz_internal.mz_object_global_ids", + ), + }] + }, + column_semantic_types: &const { + [ + ("object_id", SemanticType::GlobalId), + ("shard_id", SemanticType::ShardId), + ] + }, + }), }); pub static MZ_STORAGE_USAGE: LazyLock = LazyLock::new(|| BuiltinView { @@ -5885,6 +7713,24 @@ FROM JOIN mz_internal.mz_storage_usage_by_shard USING (shard_id) GROUP BY object_id, collection_timestamp", access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "storage_usage", + description: "Historical storage usage per object over time", + links: &const { + [OntologyLink { + name: "storage_usage_of", + target: "object", + properties: LinkProperties::fk("object_id", "id", Cardinality::ManyToOne), + }] + }, + column_semantic_types: &const { + [ + ("object_id", SemanticType::CatalogItemId), + ("size_bytes", SemanticType::ByteCount), + ("collection_timestamp", SemanticType::WallclockTimestamp), + ] + }, + }), }); pub static MZ_RECENT_STORAGE_USAGE: LazyLock = LazyLock::new(|| { @@ -5929,6 +7775,14 @@ FROM AND most_recent_collection_timestamp_by_shard.collection_timestamp = recent_storage_usage_by_shard.collection_timestamp GROUP BY object_id", access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "recent_storage", + description: "Most recent storage usage snapshot per object", + links: &const { [ + OntologyLink { name: "recent_storage_of", target: "object", properties: LinkProperties::fk("object_id", "id", Cardinality::OneToOne) }, + ] }, + column_semantic_types: &const 
{[("object_id", SemanticType::CatalogItemId), ("size_bytes", SemanticType::ByteCount)]}, + }), } }); @@ -5971,6 +7825,17 @@ UNION ALL SELECT id, oid, schema_id, name, 'source', owner_id, cluster_id, privi UNION ALL SELECT id, oid, schema_id, name, 'view', owner_id, NULL::text, privileges FROM mz_catalog.mz_views UNION ALL SELECT id, oid, schema_id, name, 'materialized-view', owner_id, cluster_id, privileges FROM mz_catalog.mz_materialized_views", access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "relation", + description: "Union of all relation types: tables, sources, views, MVs (convenience view)", + links: &const { [ + OntologyLink { name: "union_includes", target: "table", properties: LinkProperties::union_disc("type", "table") }, + OntologyLink { name: "union_includes", target: "source", properties: LinkProperties::union_disc("type", "source") }, + OntologyLink { name: "union_includes", target: "view", properties: LinkProperties::union_disc("type", "view") }, + OntologyLink { name: "union_includes", target: "mv", properties: LinkProperties::union_disc("type", "materialized-view") }, + ] }, + column_semantic_types: &const {[("id", SemanticType::CatalogItemId), ("oid", SemanticType::OID), ("schema_id", SemanticType::SchemaId), ("type", SemanticType::ObjectType), ("owner_id", SemanticType::RoleId), ("cluster_id", SemanticType::ClusterId)]}, + }), } }); @@ -5999,6 +7864,7 @@ pub static MZ_OBJECTS_ID_NAMESPACE_TYPES: LazyLock = LazyLock::new( ) AS _ (object_type)"#, access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_OBJECT_OID_ALIAS: LazyLock = LazyLock::new(|| BuiltinView { @@ -6027,6 +7893,7 @@ pub static MZ_OBJECT_OID_ALIAS: LazyLock = LazyLock::new(|| Builtin ) AS _ (object_type, oid_alias);", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_OBJECTS: LazyLock = LazyLock::new(|| { @@ -6071,6 +7938,29 @@ UNION ALL UNION ALL SELECT id, oid, schema_id, name, 'secret', owner_id, NULL::text, privileges FROM 
mz_catalog.mz_secrets", access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "object", + description: "Union of all object types: relations, indexes, connections, etc. (convenience view)", + links: &const { [ + OntologyLink { name: "union_includes", target: "relation", properties: LinkProperties::Union { + discriminator_column: None, + discriminator_value: None, + note: Some("mz_objects includes all relations plus indexes, connections, secrets, types, functions"), + } }, + OntologyLink { name: "union_includes", target: "index", properties: LinkProperties::union_disc("type", "index") }, + OntologyLink { name: "union_includes", target: "connection", properties: LinkProperties::union_disc("type", "connection") }, + OntologyLink { name: "union_includes", target: "secret", properties: LinkProperties::union_disc("type", "secret") }, + OntologyLink { name: "maps_to_global_id", target: "object", properties: LinkProperties::MapsTo { + source_column: None, + target_column: None, + via: Some("mz_internal.mz_object_global_ids"), + from_type: Some(mz_repr::SemanticType::CatalogItemId), + to_type: Some(mz_repr::SemanticType::GlobalId), + note: Some("A CatalogItemId (SQL layer) maps to one or more GlobalIds (runtime layer)."), + } }, + ] }, + column_semantic_types: &const {[("id", SemanticType::CatalogItemId), ("oid", SemanticType::OID), ("schema_id", SemanticType::SchemaId), ("type", SemanticType::ObjectType), ("owner_id", SemanticType::RoleId), ("cluster_id", SemanticType::ClusterId)]}, + }), } }); @@ -6130,6 +8020,43 @@ pub static MZ_OBJECT_FULLY_QUALIFIED_NAMES: LazyLock = LazyLock::ne -- LEFT JOIN accounts for objects in the ambient database. 
LEFT JOIN mz_catalog.mz_databases db ON db.id = sc.database_id", access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "object_fqn", + description: "Fully qualified name (database.schema.name) for objects", + links: &const { + [ + OntologyLink { + name: "details_of", + target: "object", + properties: LinkProperties::fk("id", "id", Cardinality::OneToOne), + }, + OntologyLink { + name: "in_schema", + target: "schema", + properties: LinkProperties::fk("schema_id", "id", Cardinality::ManyToOne), + }, + OntologyLink { + name: "in_database", + target: "database", + properties: LinkProperties::fk("database_id", "id", Cardinality::ManyToOne), + }, + OntologyLink { + name: "belongs_to_cluster", + target: "cluster", + properties: LinkProperties::fk("cluster_id", "id", Cardinality::ManyToOne), + }, + ] + }, + column_semantic_types: &const { + [ + ("id", SemanticType::CatalogItemId), + ("object_type", SemanticType::ObjectType), + ("schema_id", SemanticType::SchemaId), + ("database_id", SemanticType::DatabaseId), + ("cluster_id", SemanticType::ClusterId), + ] + }, + }), }); pub static MZ_OBJECT_GLOBAL_IDS: LazyLock = LazyLock::new(|| BuiltinTable { @@ -6149,6 +8076,25 @@ pub static MZ_OBJECT_GLOBAL_IDS: LazyLock = LazyLock::new(|| Built ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "object_global_id", + description: "Mapping between CatalogItemId (SQL layer) and GlobalId (runtime layer)", + links: &const { + [OntologyLink { + name: "has_global_id", + target: "object", + properties: LinkProperties::MapsTo { + source_column: Some("id"), + target_column: Some("id"), + via: None, + from_type: Some(mz_repr::SemanticType::CatalogItemId), + to_type: Some(mz_repr::SemanticType::GlobalId), + note: None, + }, + }] + }, + column_semantic_types: &[("id", SemanticType::CatalogItemId)], + }), }); // TODO (SangJunBak): Remove once mz_object_history is released and used in the Console 
https://github.com/MaterializeInc/console/issues/3342 @@ -6195,6 +8141,28 @@ pub static MZ_OBJECT_LIFETIMES: LazyLock = LazyLock::new(|| Builtin FROM mz_catalog.mz_audit_events a WHERE a.event_type = 'create' OR a.event_type = 'drop'", access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "object_lifetime_event", + description: "Create or drop lifecycle event for a catalog object", + links: &const { + [OntologyLink { + name: "lifetime_event_of", + target: "object", + properties: LinkProperties::fk_typed( + "id", + "id", + Cardinality::ManyToOne, + mz_repr::SemanticType::CatalogItemId, + ), + }] + }, + column_semantic_types: &const { + [ + ("id", SemanticType::CatalogItemId), + ("object_type", SemanticType::ObjectType), + ] + }, + }), }); pub static MZ_OBJECT_HISTORY: LazyLock = LazyLock::new(|| BuiltinView { @@ -6280,6 +8248,18 @@ pub static MZ_OBJECT_HISTORY: LazyLock = LazyLock::new(|| BuiltinVi ) SELECT * FROM user_object_history UNION ALL (SELECT * FROM built_in_objects)"#, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "object_history", + description: "Historical record of object creation and drops", + links: &const { + [OntologyLink { + name: "history_of", + target: "object", + properties: LinkProperties::fk("id", "id", Cardinality::ManyToOne), + }] + }, + column_semantic_types: &[("id", SemanticType::CatalogItemId)], + }), }); pub static MZ_DATAFLOWS_PER_WORKER: LazyLock = LazyLock::new(|| BuiltinView { @@ -6304,6 +8284,7 @@ WHERE addrs.worker_id = ops.worker_id AND mz_catalog.list_length(addrs.address) = 1", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_DATAFLOWS: LazyLock = LazyLock::new(|| BuiltinView { @@ -6323,6 +8304,12 @@ SELECT id, name FROM mz_introspection.mz_dataflows_per_worker WHERE worker_id = 0::uint8", access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "dataflow", + description: "Dataflow instances", + links: &const { [] }, + column_semantic_types: &[], + }), 
}); pub static MZ_DATAFLOW_ADDRESSES: LazyLock = LazyLock::new(|| BuiltinView { @@ -6356,6 +8343,18 @@ SELECT id, address FROM mz_introspection.mz_dataflow_addresses_per_worker WHERE worker_id = 0::uint8", access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "dataflow_address", + description: "Address (scope path) of dataflow operators", + links: &const { + [OntologyLink { + name: "address_of_operator", + target: "dataflow_operator", + properties: LinkProperties::fk("id", "id", Cardinality::OneToOne), + }] + }, + column_semantic_types: &[], + }), }); pub static MZ_DATAFLOW_CHANNELS: LazyLock = LazyLock::new(|| BuiltinView { @@ -6390,6 +8389,27 @@ SELECT id, from_index, from_port, to_index, to_port, type FROM mz_introspection.mz_dataflow_channels_per_worker WHERE worker_id = 0::uint8", access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "dataflow_channel", + description: "Communication channels between operators", + links: &const { + [OntologyLink { + name: "channel_in_dataflow", + target: "dataflow", + properties: LinkProperties::MapsTo { + source_column: None, + target_column: None, + via: Some("mz_introspection.mz_dataflow_operator_dataflows"), + from_type: None, + to_type: None, + note: Some( + "Channels do not have a direct dataflow_id. 
Use mz_dataflow_addresses to find the parent scope, then correlate with mz_dataflow_operator_dataflows.", + ), + }, + }] + }, + column_semantic_types: &[], + }), }); pub static MZ_DATAFLOW_OPERATORS: LazyLock = LazyLock::new(|| BuiltinView { @@ -6410,6 +8430,12 @@ SELECT id, name FROM mz_introspection.mz_dataflow_operators_per_worker WHERE worker_id = 0::uint8", access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "dataflow_operator", + description: "Operators within dataflows", + links: &const { [] }, + column_semantic_types: &[], + }), }); pub static MZ_DATAFLOW_GLOBAL_IDS: LazyLock = LazyLock::new(|| BuiltinView { @@ -6430,6 +8456,7 @@ SELECT id, global_id FROM mz_introspection.mz_compute_dataflow_global_ids_per_worker WHERE worker_id = 0::uint8", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_MAPPABLE_OBJECTS: LazyLock = LazyLock::new(|| BuiltinView { @@ -6455,6 +8482,12 @@ FROM mz_catalog.mz_objects mo JOIN mz_introspection.mz_dataflow_global_ids mgi ON (mce.dataflow_id = mgi.id) LEFT JOIN mz_catalog.mz_databases md ON (ms.database_id = md.id);", access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "mappable_object", + description: "Objects that can be mapped to dataflow operators", + links: &const { [] }, + column_semantic_types: &[("global_id", SemanticType::GlobalId)], + }), }); pub static MZ_LIR_MAPPING: LazyLock = LazyLock::new(|| BuiltinView { @@ -6497,6 +8530,12 @@ SELECT global_id, lir_id, operator, parent_lir_id, nesting, operator_id_start, o FROM mz_introspection.mz_compute_lir_mapping_per_worker WHERE worker_id = 0::uint8", access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "lir_mapping", + description: "LIR (low-level IR) to dataflow operator mapping", + links: &const { [] }, + column_semantic_types: &[("global_id", SemanticType::GlobalId)], + }), }); pub static MZ_DATAFLOW_OPERATOR_DATAFLOWS_PER_WORKER: LazyLock = @@ -6528,6 +8567,7 @@ WHERE dfs.id = addrs.address[1] AND 
dfs.worker_id = addrs.worker_id", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_DATAFLOW_OPERATOR_DATAFLOWS: LazyLock = LazyLock::new(|| BuiltinView { @@ -6560,6 +8600,18 @@ SELECT id, name, dataflow_id, dataflow_name FROM mz_introspection.mz_dataflow_operator_dataflows_per_worker WHERE worker_id = 0::uint8", access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "dataflow_operator_dataflow", + description: "Mapping of operators to their parent dataflow", + links: &const { + [OntologyLink { + name: "operator_in_dataflow", + target: "dataflow", + properties: LinkProperties::fk("dataflow_id", "id", Cardinality::ManyToOne), + }] + }, + column_semantic_types: &[], + }), }); pub static MZ_OBJECT_TRANSITIVE_DEPENDENCIES: LazyLock = LazyLock::new(|| { @@ -6594,6 +8646,40 @@ WITH MUTUALLY RECURSIVE ) SELECT object_id, referenced_object_id FROM reach;", access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "transitive_dependency", + description: "Transitive closure of object dependencies — all direct and indirect dependencies", + links: &const { + [ + OntologyLink { + name: "transitively_dependent_object", + target: "object", + properties: LinkProperties::fk_typed( + "object_id", + "id", + Cardinality::ManyToOne, + mz_repr::SemanticType::CatalogItemId, + ), + }, + OntologyLink { + name: "transitively_referenced_object", + target: "object", + properties: LinkProperties::fk_typed( + "referenced_object_id", + "id", + Cardinality::ManyToOne, + mz_repr::SemanticType::CatalogItemId, + ), + }, + ] + }, + column_semantic_types: &const { + [ + ("object_id", SemanticType::CatalogItemId), + ("referenced_object_id", SemanticType::CatalogItemId), + ] + }, + }), } }); @@ -6609,7 +8695,7 @@ pub static MZ_COMPUTE_EXPORTS: LazyLock = LazyLock::new(|| BuiltinV column_comments: BTreeMap::from_iter([ ( "export_id", - "The ID of the index, materialized view, or subscription exported by the dataflow. 
Corresponds to `mz_catalog.mz_indexes.id`, `mz_catalog.mz_materialized_views.id`, or `mz_internal.mz_subscriptions`.", + "The ID of the index, materialized view, or subscription exported by the dataflow. Corresponds to `mz_catalog.mz_indexes.id`, `mz_catalog.mz_materialized_views.id`, or `mz_internal.mz_subscriptions.id`.", ), ( "dataflow_id", @@ -6621,6 +8707,40 @@ SELECT export_id, dataflow_id FROM mz_introspection.mz_compute_exports_per_worker WHERE worker_id = 0::uint8", access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "compute_export", + description: "Compute exports (maintained collections)", + links: &const { + [ + OntologyLink { + name: "export_of", + target: "object", + properties: LinkProperties::fk_mapped( + "export_id", + "id", + Cardinality::ManyToOne, + mz_repr::SemanticType::GlobalId, + "mz_internal.mz_object_global_ids", + ), + }, + OntologyLink { + name: "introspection_uses_global_id", + target: "object_global_id", + properties: LinkProperties::MapsTo { + source_column: None, + target_column: None, + via: None, + from_type: None, + to_type: None, + note: Some( + "mz_introspection tables use GlobalId. 
To join with mz_catalog tables (which use CatalogItemId), go through mz_internal.mz_object_global_ids.", + ), + }, + }, + ] + }, + column_semantic_types: &[("export_id", SemanticType::GlobalId)], + }), }); pub static MZ_COMPUTE_FRONTIERS: LazyLock = LazyLock::new(|| BuiltinView { @@ -6647,6 +8767,29 @@ pub static MZ_COMPUTE_FRONTIERS: LazyLock = LazyLock::new(|| Builti FROM mz_introspection.mz_compute_frontiers_per_worker GROUP BY export_id", access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "compute_frontier", + description: "Per-replica compute frontiers", + links: &const { + [OntologyLink { + name: "compute_frontier_of", + target: "object", + properties: LinkProperties::fk_mapped( + "export_id", + "id", + Cardinality::ManyToOne, + mz_repr::SemanticType::GlobalId, + "mz_internal.mz_object_global_ids", + ), + }] + }, + column_semantic_types: &const { + [ + ("export_id", SemanticType::GlobalId), + ("time", SemanticType::MzTimestamp), + ] + }, + }), }); pub static MZ_DATAFLOW_CHANNEL_OPERATORS_PER_WORKER: LazyLock = @@ -6715,6 +8858,7 @@ FROM channel_operator_addresses coa coa.worker_id = to_ops.worker_id ", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_DATAFLOW_CHANNEL_OPERATORS: LazyLock = LazyLock::new(|| BuiltinView { @@ -6771,6 +8915,7 @@ SELECT id, from_operator_id, from_operator_address, to_operator_id, to_operator_ FROM mz_introspection.mz_dataflow_channel_operators_per_worker WHERE worker_id = 0::uint8", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_COMPUTE_IMPORT_FRONTIERS: LazyLock = LazyLock::new(|| BuiltinView { @@ -6802,6 +8947,30 @@ pub static MZ_COMPUTE_IMPORT_FRONTIERS: LazyLock = LazyLock::new(|| FROM mz_introspection.mz_compute_import_frontiers_per_worker GROUP BY export_id, import_id", access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "compute_import_frontier", + description: "Import frontiers for compute dependencies", + links: &const { + [OntologyLink { + name: 
"compute_import_frontier_of", + target: "object", + properties: LinkProperties::fk_mapped( + "export_id", + "id", + Cardinality::ManyToOne, + mz_repr::SemanticType::GlobalId, + "mz_internal.mz_object_global_ids", + ), + }] + }, + column_semantic_types: &const { + [ + ("export_id", SemanticType::GlobalId), + ("import_id", SemanticType::GlobalId), + ("time", SemanticType::MzTimestamp), + ] + }, + }), }); pub static MZ_RECORDS_PER_DATAFLOW_OPERATOR_PER_WORKER: LazyLock = @@ -6838,6 +9007,7 @@ FROM dod.id = ar_size.operator_id AND dod.worker_id = ar_size.worker_id", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_RECORDS_PER_DATAFLOW_OPERATOR: LazyLock = @@ -6891,6 +9061,7 @@ SELECT FROM mz_introspection.mz_records_per_dataflow_operator_per_worker GROUP BY id, name, dataflow_id", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_RECORDS_PER_DATAFLOW_PER_WORKER: LazyLock = @@ -6931,6 +9102,7 @@ GROUP BY dfs.name, rdo.worker_id", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_RECORDS_PER_DATAFLOW: LazyLock = LazyLock::new(|| BuiltinView { @@ -6980,6 +9152,18 @@ GROUP BY id, name", access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "records_per_dataflow", + description: "Record counts aggregated per dataflow", + links: &const { + [OntologyLink { + name: "details_of", + target: "dataflow", + properties: LinkProperties::fk("id", "id", Cardinality::OneToOne), + }] + }, + column_semantic_types: &[], + }), }); /// Peeled version of `PG_NAMESPACE`: @@ -7013,6 +9197,7 @@ FROM mz_catalog.mz_schemas s LEFT JOIN mz_catalog.mz_databases d ON d.id = s.database_id JOIN mz_catalog.mz_roles role_owner ON role_owner.id = s.owner_id", access: vec![PUBLIC_SELECT], + ontology: None, }); pub const PG_NAMESPACE_ALL_DATABASES_IND: BuiltinIndex = BuiltinIndex { @@ -7044,6 +9229,7 @@ SELECT FROM mz_internal.pg_namespace_all_databases WHERE database_name IS NULL OR database_name = pg_catalog.current_database();", access: 
vec![PUBLIC_SELECT], + ontology: None, }); /// Peeled version of `PG_CLASS`: @@ -7151,6 +9337,7 @@ JOIN mz_catalog.mz_schemas ON mz_schemas.id = class_objects.schema_id LEFT JOIN mz_catalog.mz_databases d ON d.id = mz_schemas.database_id JOIN mz_catalog.mz_roles role_owner ON role_owner.id = class_objects.owner_id", access: vec![PUBLIC_SELECT], + ontology: None, } }); @@ -7206,6 +9393,7 @@ FROM mz_internal.pg_class_all_databases WHERE database_name IS NULL OR database_name = pg_catalog.current_database(); ", access: vec![PUBLIC_SELECT], + ontology: None, } }); @@ -7271,6 +9459,7 @@ FROM mz_internal.mz_object_dependencies JOIN current_objects objects ON object_id = objects.id JOIN current_objects dependents ON referenced_object_id = dependents.id", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static PG_DATABASE: LazyLock = LazyLock::new(|| BuiltinView { @@ -7307,6 +9496,7 @@ pub static PG_DATABASE: LazyLock = LazyLock::new(|| BuiltinView { FROM mz_catalog.mz_databases d JOIN mz_catalog.mz_roles role_owner ON role_owner.id = d.owner_id", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static PG_INDEX: LazyLock = LazyLock::new(|| { @@ -7373,6 +9563,7 @@ LEFT JOIN mz_catalog.mz_databases d ON d.id = mz_schemas.database_id WHERE mz_schemas.database_id IS NULL OR d.name = pg_catalog.current_database() GROUP BY mz_indexes.oid, mz_relations.oid", access: vec![PUBLIC_SELECT], + ontology: None, } }); @@ -7403,6 +9594,7 @@ JOIN mz_catalog.mz_schemas s ON s.id = r.schema_id LEFT JOIN mz_catalog.mz_databases d ON d.id = s.database_id WHERE s.database_id IS NULL OR d.name = current_database()", access: vec![PUBLIC_SELECT], + ontology: None, }); /// Peeled version of `PG_DESCRIPTION`: @@ -7466,6 +9658,7 @@ pub static PG_DESCRIPTION_ALL_DATABASES: LazyLock = LazyLock::new(| mz_internal.mz_comments AS cmt ON mz_objects.id = cmt.id AND lower(mz_objects.type) = lower(cmt.object_type) )", access: vec![PUBLIC_SELECT], + ontology: None, } }); @@ -7504,6 +9697,7 @@ WHERE 
(oid_database_name IS NULL OR oid_database_name = pg_catalog.current_database()) AND (class_database_name IS NULL OR class_database_name = pg_catalog.current_database());", access: vec![PUBLIC_SELECT], + ontology: None, }); /// Peeled version of `PG_TYPE`: @@ -7622,6 +9816,7 @@ FROM LEFT JOIN mz_catalog.mz_databases d ON d.id = mz_schemas.database_id JOIN mz_catalog.mz_roles role_owner ON role_owner.id = mz_types.owner_id", access: vec![PUBLIC_SELECT], + ontology: None, } }); @@ -7665,6 +9860,7 @@ pub static PG_TYPE: LazyLock = LazyLock::new(|| BuiltinView { FROM mz_internal.pg_type_all_databases WHERE database_name IS NULL OR database_name = pg_catalog.current_database();", access: vec![PUBLIC_SELECT], + ontology: None, }); /// Peeled version of `PG_ATTRIBUTE`: @@ -7732,6 +9928,7 @@ LEFT JOIN mz_catalog.mz_databases d ON d.id = mz_schemas.database_id", // Since this depends on pg_type, its id must be higher due to initialization // ordering. access: vec![PUBLIC_SELECT], + ontology: None, } }); @@ -7780,6 +9977,7 @@ WHERE // Since this depends on pg_type, its id must be higher due to initialization // ordering. 
access: vec![PUBLIC_SELECT], + ontology: None, } }); @@ -7810,6 +10008,7 @@ JOIN mz_catalog.mz_types AS ret_type ON mz_functions.return_type_id = ret_type.i JOIN mz_catalog.mz_roles role_owner ON role_owner.id = mz_functions.owner_id WHERE mz_schemas.database_id IS NULL OR d.name = pg_catalog.current_database()", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static PG_OPERATOR: LazyLock = LazyLock::new(|| BuiltinView { @@ -7847,6 +10046,7 @@ JOIN mz_catalog.mz_types AS ret_type ON mz_operators.return_type_id = ret_type.i JOIN mz_catalog.mz_types AS right_type ON mz_operators.argument_type_ids[1] = right_type.id WHERE array_length(mz_operators.argument_type_ids, 1) = 1", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static PG_RANGE: LazyLock = LazyLock::new(|| BuiltinView { @@ -7864,6 +10064,7 @@ pub static PG_RANGE: LazyLock = LazyLock::new(|| BuiltinView { NULL::pg_catalog.oid AS rngsubtype WHERE false", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static PG_ENUM: LazyLock = LazyLock::new(|| BuiltinView { @@ -7885,6 +10086,7 @@ pub static PG_ENUM: LazyLock = LazyLock::new(|| BuiltinView { NULL::pg_catalog.text AS enumlabel WHERE false", access: vec![PUBLIC_SELECT], + ontology: None, }); /// Peeled version of `PG_ATTRDEF`: @@ -7913,6 +10115,7 @@ FROM mz_catalog.mz_columns JOIN mz_catalog.mz_objects ON mz_columns.id = mz_objects.id WHERE default IS NOT NULL", access: vec![PUBLIC_SELECT], + ontology: None, }); pub const PG_ATTRDEF_ALL_DATABASES_IND: BuiltinIndex = BuiltinIndex { @@ -7946,6 +10149,7 @@ SELECT FROM mz_internal.pg_attrdef_all_databases JOIN mz_catalog.mz_databases d ON (d.id IS NULL OR d.name = pg_catalog.current_database());", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static PG_SETTINGS: LazyLock = LazyLock::new(|| BuiltinView { @@ -7964,6 +10168,7 @@ FROM (VALUES ('max_index_keys'::pg_catalog.text, '1000'::pg_catalog.text) ) AS _ (name, setting)", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static 
PG_AUTH_MEMBERS: LazyLock = LazyLock::new(|| BuiltinView { @@ -7988,6 +10193,7 @@ JOIN mz_catalog.mz_roles role ON membership.role_id = role.id JOIN mz_catalog.mz_roles member ON membership.member = member.id JOIN mz_catalog.mz_roles grantor ON membership.grantor = grantor.id", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static PG_EVENT_TRIGGER: LazyLock = LazyLock::new(|| BuiltinView { @@ -8018,6 +10224,7 @@ pub static PG_EVENT_TRIGGER: LazyLock = LazyLock::new(|| BuiltinVie NULL::pg_catalog.text[] AS evttags WHERE false", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static PG_LANGUAGE: LazyLock = LazyLock::new(|| BuiltinView { @@ -8052,6 +10259,7 @@ pub static PG_LANGUAGE: LazyLock = LazyLock::new(|| BuiltinView { NULL::pg_catalog.text[] AS lanacl WHERE false", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static PG_SHDESCRIPTION: LazyLock = LazyLock::new(|| BuiltinView { @@ -8071,6 +10279,7 @@ pub static PG_SHDESCRIPTION: LazyLock = LazyLock::new(|| BuiltinVie NULL::pg_catalog.text AS description WHERE false", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static PG_TIMEZONE_ABBREVS: LazyLock = LazyLock::new(|| { @@ -8093,6 +10302,7 @@ pub static PG_TIMEZONE_ABBREVS: LazyLock = LazyLock::new(|| { AS is_dst FROM mz_catalog.mz_timezone_abbreviations", access: vec![PUBLIC_SELECT], + ontology: None, } }); @@ -8117,6 +10327,7 @@ pub static PG_TIMEZONE_NAMES: LazyLock = LazyLock::new(|| BuiltinVi AS is_dst FROM mz_catalog.mz_timezone_names", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_TIMEZONE_ABBREVIATIONS: LazyLock = LazyLock::new(|| BuiltinView { @@ -8151,6 +10362,7 @@ pub static MZ_TIMEZONE_ABBREVIATIONS: LazyLock = LazyLock::new(|| B ) .leak(), access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_TIMEZONE_NAMES: LazyLock = LazyLock::new(|| BuiltinView { @@ -8168,6 +10380,7 @@ pub static MZ_TIMEZONE_NAMES: LazyLock = LazyLock::new(|| BuiltinVi ) .leak(), access: vec![PUBLIC_SELECT], + ontology: 
None, }); pub static MZ_PEEK_DURATIONS_HISTOGRAM_PER_WORKER: LazyLock = @@ -8190,6 +10403,7 @@ FROM GROUP BY worker_id, type, duration_ns", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_PEEK_DURATIONS_HISTOGRAM: LazyLock = LazyLock::new(|| BuiltinView { @@ -8226,6 +10440,12 @@ SELECT FROM mz_introspection.mz_peek_durations_histogram_per_worker GROUP BY type, duration_ns", access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "peek_duration", + description: "Histogram of SELECT query durations", + links: &const { [] }, + column_semantic_types: &[], + }), }); pub static MZ_SCHEDULING_ELAPSED_PER_WORKER: LazyLock = @@ -8247,6 +10467,7 @@ FROM GROUP BY id, worker_id", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_SCHEDULING_ELAPSED: LazyLock = LazyLock::new(|| BuiltinView { @@ -8281,6 +10502,18 @@ SELECT FROM mz_introspection.mz_scheduling_elapsed_per_worker GROUP BY id", access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "scheduling_elapsed", + description: "CPU time spent per operator", + links: &const { + [OntologyLink { + name: "elapsed_for_operator", + target: "dataflow_operator", + properties: LinkProperties::measures("id", "id", "cpu_time_ns"), + }] + }, + column_semantic_types: &[], + }), }); pub static MZ_COMPUTE_OPERATOR_DURATIONS_HISTOGRAM_PER_WORKER: LazyLock = @@ -8303,6 +10536,7 @@ FROM GROUP BY id, worker_id, duration_ns", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_COMPUTE_OPERATOR_DURATIONS_HISTOGRAM: LazyLock = @@ -8344,6 +10578,7 @@ SELECT FROM mz_introspection.mz_compute_operator_durations_histogram_per_worker GROUP BY id, duration_ns", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_SCHEDULING_PARKS_HISTOGRAM_PER_WORKER: LazyLock = @@ -8366,6 +10601,7 @@ FROM GROUP BY worker_id, slept_for_ns, requested_ns", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_SCHEDULING_PARKS_HISTOGRAM: LazyLock = LazyLock::new(|| 
BuiltinView { @@ -8406,6 +10642,12 @@ SELECT FROM mz_introspection.mz_scheduling_parks_histogram_per_worker GROUP BY slept_for_ns, requested_ns", access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "scheduling_parks", + description: "Histogram of operator park durations", + links: &const { [] }, + column_semantic_types: &[], + }), }); pub static MZ_COMPUTE_ERROR_COUNTS_PER_WORKER: LazyLock = @@ -8449,6 +10691,7 @@ WITH MUTUALLY RECURSIVE ) SELECT * FROM all_errors", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_COMPUTE_ERROR_COUNTS: LazyLock = LazyLock::new(|| BuiltinView { @@ -8484,6 +10727,18 @@ FROM mz_introspection.mz_compute_error_counts_per_worker GROUP BY export_id HAVING pg_catalog.sum(count) != 0", access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "compute_error_count", + description: "Error counts per compute collection", + links: &const { + [OntologyLink { + name: "errors_in", + target: "compute_export", + properties: LinkProperties::fk("export_id", "export_id", Cardinality::OneToOne), + }] + }, + column_semantic_types: &[("export_id", SemanticType::GlobalId)], + }), }); pub static MZ_COMPUTE_ERROR_COUNTS_RAW_UNIFIED: LazyLock = @@ -8506,6 +10761,7 @@ pub static MZ_COMPUTE_ERROR_COUNTS_RAW_UNIFIED: LazyLock = column_comments: BTreeMap::new(), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_COMPUTE_HYDRATION_TIMES: LazyLock = LazyLock::new(|| BuiltinSource { @@ -8521,6 +10777,17 @@ pub static MZ_COMPUTE_HYDRATION_TIMES: LazyLock = LazyLock::new(| column_comments: BTreeMap::new(), is_retained_metrics_object: true, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "compute_hydration_time", + description: "Time to hydrate compute objects", + links: &const { [] }, + column_semantic_types: &const { + [ + ("replica_id", SemanticType::ReplicaId), + ("object_id", SemanticType::CatalogItemId), + ] + }, + }), }); pub static 
MZ_COMPUTE_HYDRATION_TIMES_IND: LazyLock = @@ -8586,6 +10853,17 @@ SELECT * FROM dataflows UNION ALL SELECT * FROM complete_mvs", access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "compute_hydration_status_view", + description: "Computed hydration status per compute object", + links: &const { [] }, + column_semantic_types: &const { + [ + ("object_id", SemanticType::GlobalId), + ("replica_id", SemanticType::ReplicaId), + ] + }, + }), }); pub static MZ_COMPUTE_OPERATOR_HYDRATION_STATUSES: LazyLock = LazyLock::new(|| { @@ -8618,6 +10896,17 @@ pub static MZ_COMPUTE_OPERATOR_HYDRATION_STATUSES: LazyLock = Laz ]), is_retained_metrics_object: false, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "compute_hydration_status", + description: "Hydration status per compute operator", + links: &const { [] }, + column_semantic_types: &const { + [ + ("replica_id", SemanticType::ReplicaId), + ("object_id", SemanticType::CatalogItemId), + ] + }, + }), } }); @@ -8694,6 +10983,7 @@ JOIN received_cte USING (channel_id, from_worker_id, to_worker_id) JOIN batch_sent_cte USING (channel_id, from_worker_id, to_worker_id) JOIN batch_received_cte USING (channel_id, from_worker_id, to_worker_id)", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_MESSAGE_COUNTS: LazyLock = LazyLock::new(|| BuiltinView { @@ -8752,6 +11042,18 @@ SELECT FROM mz_introspection.mz_message_counts_per_worker GROUP BY channel_id", access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "message_count", + description: "Inter-worker message counts", + links: &const { + [OntologyLink { + name: "counts_for", + target: "dataflow_channel", + properties: LinkProperties::fk("channel_id", "id", Cardinality::OneToOne), + }] + }, + column_semantic_types: &[], + }), }); pub static MZ_ACTIVE_PEEKS: LazyLock = LazyLock::new(|| BuiltinView { @@ -8782,6 +11084,17 @@ SELECT id, object_id, type, time FROM mz_introspection.mz_active_peeks_per_worker WHERE 
worker_id = 0::uint8", access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "active_peek", + description: "Currently executing SELECT queries", + links: &const { [] }, + column_semantic_types: &const { + [ + ("object_id", SemanticType::GlobalId), + ("time", SemanticType::MzTimestamp), + ] + }, + }), }); pub static MZ_DATAFLOW_OPERATOR_REACHABILITY_PER_WORKER: LazyLock = @@ -8821,6 +11134,7 @@ WHERE AND addr2.worker_id = reachability.worker_id GROUP BY addr2.id, reachability.worker_id, port, update_type, time", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_DATAFLOW_OPERATOR_REACHABILITY: LazyLock = @@ -8853,6 +11167,7 @@ SELECT FROM mz_introspection.mz_dataflow_operator_reachability_per_worker GROUP BY id, port, update_type, time", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_ARRANGEMENT_SIZES_PER_WORKER: LazyLock = LazyLock::new(|| { @@ -9012,6 +11327,7 @@ WHERE OR allocations IS NOT NULL ", access: vec![PUBLIC_SELECT], + ontology: None, } }); @@ -9056,6 +11372,27 @@ SELECT FROM mz_introspection.mz_arrangement_sizes_per_worker GROUP BY operator_id", access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "arrangement_size", + description: "Aggregated arrangement sizes (records, batches, bytes)", + links: &const { + [OntologyLink { + name: "arrangement_of_operator", + target: "dataflow_operator", + properties: LinkProperties::Measures { + source_column: "operator_id", + target_column: "id", + metric: "arrangement_size", + source_id_type: None, + requires_mapping: None, + note: Some( + "Both IDs are local uint64 operator IDs within a dataflow, not GlobalIds.", + ), + }, + }] + }, + column_semantic_types: &[], + }), }); pub static MZ_ARRANGEMENT_SHARING_PER_WORKER: LazyLock = @@ -9078,6 +11415,7 @@ SELECT FROM mz_introspection.mz_arrangement_sharing_raw GROUP BY operator_id, worker_id", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_ARRANGEMENT_SHARING: LazyLock = 
LazyLock::new(|| BuiltinView { @@ -9104,6 +11442,18 @@ SELECT operator_id, count FROM mz_introspection.mz_arrangement_sharing_per_worker WHERE worker_id = 0::uint8", access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "arrangement_sharing", + description: "Arrangement sharing between operators", + links: &const { + [OntologyLink { + name: "shared_by", + target: "dataflow_operator", + properties: LinkProperties::fk("operator_id", "id", Cardinality::OneToOne), + }] + }, + column_semantic_types: &[], + }), }); pub static MZ_CLUSTER_REPLICA_UTILIZATION: LazyLock = LazyLock::new(|| BuiltinView { @@ -9151,6 +11501,23 @@ FROM JOIN mz_catalog.mz_cluster_replica_sizes AS s ON r.size = s.size JOIN mz_internal.mz_cluster_replica_metrics AS m ON m.replica_id = r.id", access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "replica_utilization", + description: "Computed utilization metrics per replica", + links: &const { + [OntologyLink { + name: "utilization_of_replica", + target: "replica", + properties: LinkProperties::fk_typed( + "replica_id", + "id", + Cardinality::OneToOne, + mz_repr::SemanticType::ReplicaId, + ), + }] + }, + column_semantic_types: &[("replica_id", SemanticType::ReplicaId)], + }), }); pub static MZ_CLUSTER_REPLICA_UTILIZATION_HISTORY: LazyLock = @@ -9208,6 +11575,7 @@ FROM JOIN mz_catalog.mz_cluster_replica_sizes AS s ON r.size = s.size JOIN mz_internal.mz_cluster_replica_metrics_history AS m ON m.replica_id = r.id", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_DATAFLOW_OPERATOR_PARENTS_PER_WORKER: LazyLock = @@ -9242,6 +11610,7 @@ FROM parent_addrs AS pa ON pa.parent_address = oa.address AND pa.worker_id = oa.worker_id", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_DATAFLOW_OPERATOR_PARENTS: LazyLock = LazyLock::new(|| BuiltinView { @@ -9267,6 +11636,7 @@ SELECT id, parent_id FROM mz_introspection.mz_dataflow_operator_parents_per_worker WHERE worker_id = 0::uint8", access:
vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_DATAFLOW_ARRANGEMENT_SIZES: LazyLock = LazyLock::new(|| BuiltinView { @@ -9321,6 +11691,7 @@ LEFT JOIN mz_introspection.mz_arrangement_sizes AS mas ON mdod.id = mas.operator_id GROUP BY mdod.dataflow_id, mdod.dataflow_name", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_EXPECTED_GROUP_SIZE_ADVICE: LazyLock = LazyLock::new(|| BuiltinView { @@ -9506,6 +11877,12 @@ pub static MZ_EXPECTED_GROUP_SIZE_ADVICE: LazyLock = LazyLock::new( JOIN mz_introspection.mz_dataflow_operator_dataflows dod ON dod.dataflow_id = c.dataflow_id AND dod.id = c.region_id", access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "group_size_advice", + description: "Advice on expected group sizes for reduce operators", + links: &const { [] }, + column_semantic_types: &[], + }), }); pub static MZ_INDEX_ADVICE: LazyLock = LazyLock::new(|| { @@ -9839,6 +12216,7 @@ SELECT h.justification AS referenced_object_ids FROM hints_resolved_ids AS h", access: vec![PUBLIC_SELECT], + ontology: None, } }); @@ -9923,6 +12301,7 @@ pub static PG_CONSTRAINT: LazyLock = LazyLock::new(|| BuiltinView { NULL::pg_catalog.text as conbin WHERE false", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static PG_TABLES: LazyLock = LazyLock::new(|| BuiltinView { @@ -9943,6 +12322,7 @@ FROM pg_catalog.pg_class c LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace WHERE c.relkind IN ('r', 'p')", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static PG_TABLESPACE: LazyLock = LazyLock::new(|| BuiltinView { @@ -9978,6 +12358,7 @@ pub static PG_TABLESPACE: LazyLock = LazyLock::new(|| BuiltinView { ) AS _ (oid, spcname, spcowner, spcacl, spcoptions) ", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static PG_ACCESS_METHODS: LazyLock = LazyLock::new(|| BuiltinView { @@ -9999,6 +12380,7 @@ SELECT NULL::pg_catalog.oid AS oid, NULL::pg_catalog.\"char\" AS amtype WHERE false", access: vec![PUBLIC_SELECT], + 
ontology: None, }); pub static PG_ROLES: LazyLock = LazyLock::new(|| BuiltinView { @@ -10048,6 +12430,7 @@ pub static PG_ROLES: LazyLock = LazyLock::new(|| BuiltinView { oid FROM pg_catalog.pg_authid ai", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static PG_USER: LazyLock = LazyLock::new(|| BuiltinView { @@ -10091,6 +12474,7 @@ SELECT FROM pg_catalog.pg_authid ai WHERE rolcanlogin", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static PG_VIEWS: LazyLock = LazyLock::new(|| BuiltinView { @@ -10115,6 +12499,7 @@ LEFT JOIN mz_catalog.mz_databases d ON d.id = s.database_id JOIN mz_catalog.mz_roles role_owner ON role_owner.id = v.owner_id WHERE s.database_id IS NULL OR d.name = current_database()", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static PG_MATVIEWS: LazyLock = LazyLock::new(|| BuiltinView { @@ -10139,6 +12524,7 @@ LEFT JOIN mz_catalog.mz_databases d ON d.id = s.database_id JOIN mz_catalog.mz_roles role_owner ON role_owner.id = m.owner_id WHERE s.database_id IS NULL OR d.name = current_database()", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static INFORMATION_SCHEMA_APPLICABLE_ROLES: LazyLock = @@ -10163,6 +12549,7 @@ JOIN mz_catalog.mz_roles role ON membership.role_id = role.id JOIN mz_catalog.mz_roles member ON membership.member = member.id WHERE mz_catalog.mz_is_superuser() OR pg_has_role(current_role, member.oid, 'USAGE')", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static INFORMATION_SCHEMA_COLUMNS: LazyLock = LazyLock::new(|| BuiltinView { @@ -10205,6 +12592,7 @@ JOIN mz_catalog.mz_schemas s ON s.id = o.schema_id LEFT JOIN mz_catalog.mz_databases d ON d.id = s.database_id WHERE s.database_id IS NULL OR d.name = current_database()", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static INFORMATION_SCHEMA_ENABLED_ROLES: LazyLock = @@ -10221,6 +12609,7 @@ SELECT name AS role_name FROM mz_catalog.mz_roles WHERE mz_catalog.mz_is_superuser() OR pg_has_role(current_role, oid, 'USAGE')", access: 
vec![PUBLIC_SELECT], + ontology: None, }); pub static INFORMATION_SCHEMA_ROLE_TABLE_GRANTS: LazyLock = LazyLock::new(|| { @@ -10246,6 +12635,7 @@ WHERE grantor IN (SELECT role_name FROM information_schema.enabled_roles) OR grantee IN (SELECT role_name FROM information_schema.enabled_roles)", access: vec![PUBLIC_SELECT], + ontology: None, } }); @@ -10282,6 +12672,7 @@ pub static INFORMATION_SCHEMA_KEY_COLUMN_USAGE: LazyLock = NULL::integer AS position_in_unique_constraint WHERE false", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static INFORMATION_SCHEMA_REFERENTIAL_CONSTRAINTS: LazyLock = @@ -10323,6 +12714,7 @@ pub static INFORMATION_SCHEMA_REFERENTIAL_CONSTRAINTS: LazyLock = NULL::text AS delete_rule WHERE false", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static INFORMATION_SCHEMA_ROUTINES: LazyLock = LazyLock::new(|| BuiltinView { @@ -10348,6 +12740,7 @@ JOIN mz_catalog.mz_schemas s ON s.id = f.schema_id LEFT JOIN mz_catalog.mz_databases d ON d.id = s.database_id WHERE s.database_id IS NULL OR d.name = current_database()", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static INFORMATION_SCHEMA_SCHEMATA: LazyLock = LazyLock::new(|| BuiltinView { @@ -10367,6 +12760,7 @@ FROM mz_catalog.mz_schemas s LEFT JOIN mz_catalog.mz_databases d ON d.id = s.database_id WHERE s.database_id IS NULL OR d.name = current_database()", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static INFORMATION_SCHEMA_TABLES: LazyLock = LazyLock::new(|| BuiltinView { @@ -10394,6 +12788,7 @@ JOIN mz_catalog.mz_schemas s ON s.id = r.schema_id LEFT JOIN mz_catalog.mz_databases d ON d.id = s.database_id WHERE s.database_id IS NULL OR d.name = current_database()", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static INFORMATION_SCHEMA_TABLE_CONSTRAINTS: LazyLock = @@ -10430,6 +12825,7 @@ pub static INFORMATION_SCHEMA_TABLE_CONSTRAINTS: LazyLock = NULL::text AS nulls_distinct WHERE false", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static 
INFORMATION_SCHEMA_TABLE_PRIVILEGES: LazyLock = LazyLock::new(|| { @@ -10499,6 +12895,7 @@ WHERE OR pg_has_role(current_role, grantor, 'USAGE') END", access: vec![PUBLIC_SELECT], + ontology: None, } }); @@ -10550,6 +12947,7 @@ pub static INFORMATION_SCHEMA_TRIGGERS: LazyLock = LazyLock::new(|| NULL::text AS action_reference_new_table WHERE FALSE", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static INFORMATION_SCHEMA_VIEWS: LazyLock = LazyLock::new(|| BuiltinView { @@ -10573,6 +12971,7 @@ JOIN mz_catalog.mz_schemas s ON s.id = v.schema_id LEFT JOIN mz_catalog.mz_databases d ON d.id = s.database_id WHERE s.database_id IS NULL OR d.name = current_database()", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static INFORMATION_SCHEMA_CHARACTER_SETS: LazyLock = @@ -10617,6 +13016,7 @@ pub static INFORMATION_SCHEMA_CHARACTER_SETS: LazyLock = 'pg_catalog' as default_collate_schema, 'en_US.utf8' as default_collate_name", access: vec![PUBLIC_SELECT], + ontology: None, }); // MZ doesn't support COLLATE so the table is filled with NULLs and made empty. pg_database hard @@ -10653,6 +13053,7 @@ SELECT NULL::pg_catalog.text AS collversion WHERE false", access: vec![PUBLIC_SELECT], + ontology: None, }); // MZ doesn't support row level security policies so the table is filled in with NULLs and made empty. @@ -10687,6 +13088,7 @@ SELECT NULL::pg_catalog.text AS polwithcheck WHERE false", access: vec![PUBLIC_SELECT], + ontology: None, }); // MZ doesn't support table inheritance so the table is filled in with NULLs and made empty. 
@@ -10710,6 +13112,7 @@ SELECT NULL::pg_catalog.bool AS inhdetachpending WHERE false", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static PG_LOCKS: LazyLock = LazyLock::new(|| BuiltinView { @@ -10760,6 +13163,7 @@ SELECT NULL::pg_catalog.timestamptz AS waitstart WHERE false", access: vec![PUBLIC_SELECT], + ontology: None, }); /// Peeled version of `PG_AUTHID`: Excludes the columns rolcreaterole and rolcreatedb, to make this @@ -10802,6 +13206,7 @@ SELECT FROM mz_catalog.mz_roles r LEFT JOIN mz_catalog.mz_role_auth a ON r.oid = a.role_oid"#, access: vec![rbac::owner_privilege(ObjectType::Table, MZ_SYSTEM_ROLE_ID)], + ontology: None, }); pub const PG_AUTHID_CORE_IND: BuiltinIndex = BuiltinIndex { @@ -10875,6 +13280,7 @@ SELECT FROM mz_internal.pg_authid_core LEFT JOIN extra USING (oid)"#, access: vec![rbac::owner_privilege(ObjectType::Table, MZ_SYSTEM_ROLE_ID)], + ontology: None, }); pub static PG_AGGREGATE: LazyLock = LazyLock::new(|| BuiltinView { @@ -10936,6 +13342,7 @@ pub static PG_AGGREGATE: LazyLock = LazyLock::new(|| BuiltinView { NULL::pg_catalog.text as aggminitval FROM mz_internal.mz_aggregates a", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static PG_TRIGGER: LazyLock = LazyLock::new(|| BuiltinView { @@ -10991,6 +13398,7 @@ pub static PG_TRIGGER: LazyLock = LazyLock::new(|| BuiltinView { WHERE false ", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static PG_REWRITE: LazyLock = LazyLock::new(|| BuiltinView { @@ -11024,6 +13432,7 @@ pub static PG_REWRITE: LazyLock = LazyLock::new(|| BuiltinView { WHERE false ", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static PG_EXTENSION: LazyLock = LazyLock::new(|| BuiltinView { @@ -11061,6 +13470,7 @@ pub static PG_EXTENSION: LazyLock = LazyLock::new(|| BuiltinView { WHERE false ", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_SHOW_ALL_OBJECTS: LazyLock = LazyLock::new(|| BuiltinView { @@ -11084,6 +13494,7 @@ pub static MZ_SHOW_ALL_OBJECTS: LazyLock = 
LazyLock::new(|| Builtin LEFT JOIN comments ON objs.id = comments.id WHERE (comments.object_type = objs.type OR comments.object_type IS NULL)", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_SHOW_CLUSTERS: LazyLock = LazyLock::new(|| { @@ -11117,6 +13528,7 @@ pub static MZ_SHOW_CLUSTERS: LazyLock = LazyLock::new(|| { FROM clusters LEFT JOIN comments ON clusters.id = comments.id", access: vec![PUBLIC_SELECT], + ontology: None, } }); @@ -11139,6 +13551,7 @@ pub static MZ_SHOW_SECRETS: LazyLock = LazyLock::new(|| BuiltinView FROM mz_catalog.mz_secrets secrets LEFT JOIN comments ON secrets.id = comments.id", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_SHOW_COLUMNS: LazyLock = LazyLock::new(|| BuiltinView { @@ -11160,6 +13573,7 @@ pub static MZ_SHOW_COLUMNS: LazyLock = LazyLock::new(|| BuiltinView LEFT JOIN mz_internal.mz_comments comments ON columns.id = comments.id AND columns.position = comments.object_sub_id", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_SHOW_DATABASES: LazyLock = LazyLock::new(|| BuiltinView { @@ -11180,6 +13594,7 @@ pub static MZ_SHOW_DATABASES: LazyLock = LazyLock::new(|| BuiltinVi FROM mz_catalog.mz_databases databases LEFT JOIN comments ON databases.id = comments.id", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_SHOW_SCHEMAS: LazyLock = LazyLock::new(|| BuiltinView { @@ -11201,6 +13616,7 @@ pub static MZ_SHOW_SCHEMAS: LazyLock = LazyLock::new(|| BuiltinView FROM mz_catalog.mz_schemas schemas LEFT JOIN comments ON schemas.id = comments.id", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_SHOW_ROLES: LazyLock = LazyLock::new(|| BuiltinView { @@ -11223,6 +13639,7 @@ pub static MZ_SHOW_ROLES: LazyLock = LazyLock::new(|| BuiltinView { WHERE roles.id NOT LIKE 's%' AND roles.id NOT LIKE 'g%'", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_SHOW_TABLES: LazyLock = LazyLock::new(|| BuiltinView { @@ -11245,6 +13662,7 @@ pub static 
MZ_SHOW_TABLES: LazyLock = LazyLock::new(|| BuiltinView FROM mz_catalog.mz_tables tables LEFT JOIN comments ON tables.id = comments.id", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_SHOW_VIEWS: LazyLock = LazyLock::new(|| BuiltinView { @@ -11266,6 +13684,7 @@ pub static MZ_SHOW_VIEWS: LazyLock = LazyLock::new(|| BuiltinView { FROM mz_catalog.mz_views views LEFT JOIN comments ON views.id = comments.id", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_SHOW_TYPES: LazyLock = LazyLock::new(|| BuiltinView { @@ -11287,6 +13706,7 @@ pub static MZ_SHOW_TYPES: LazyLock = LazyLock::new(|| BuiltinView { FROM mz_catalog.mz_types types LEFT JOIN comments ON types.id = comments.id", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_SHOW_CONNECTIONS: LazyLock = LazyLock::new(|| BuiltinView { @@ -11309,6 +13729,7 @@ pub static MZ_SHOW_CONNECTIONS: LazyLock = LazyLock::new(|| Builtin FROM mz_catalog.mz_connections connections LEFT JOIN comments ON connections.id = comments.id", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_SHOW_SOURCES: LazyLock = LazyLock::new(|| BuiltinView { @@ -11346,6 +13767,7 @@ FROM ON clusters.id = sources.cluster_id LEFT JOIN comments ON sources.id = comments.id", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_SHOW_SINKS: LazyLock = LazyLock::new(|| BuiltinView { @@ -11383,6 +13805,7 @@ FROM ON clusters.id = sinks.cluster_id LEFT JOIN comments ON sinks.id = comments.id", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_SHOW_MATERIALIZED_VIEWS: LazyLock = LazyLock::new(|| BuiltinView { @@ -11417,6 +13840,7 @@ FROM JOIN mz_catalog.mz_clusters AS clusters ON clusters.id = mviews.cluster_id LEFT JOIN comments ON mviews.id = comments.id", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_SHOW_INDEXES: LazyLock = LazyLock::new(|| BuiltinView { @@ -11477,6 +13901,7 @@ FROM ON idxs.id = keys.id LEFT JOIN comments ON idxs.id = comments.id", 
access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_SHOW_CLUSTER_REPLICAS: LazyLock = LazyLock::new(|| BuiltinView { @@ -11518,6 +13943,7 @@ FROM WHERE (comments.object_type = 'cluster-replica' OR comments.object_type IS NULL) ORDER BY 1, 2"#, access: vec![PUBLIC_SELECT], + ontology: None, }); /// Lightweight data product discovery for MCP (Model Context Protocol). @@ -11571,6 +13997,7 @@ WHERE op.privilege_type = 'SELECT' AND s.name NOT IN ('mz_catalog', 'mz_internal', 'pg_catalog', 'information_schema', 'mz_introspection') "#, access: vec![PUBLIC_SELECT], + ontology: None, }); /// Full data product details with JSON Schema for MCP agents. @@ -11690,6 +14117,7 @@ GROUP BY 1, 2, 3 ) "#, access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_SHOW_ROLE_MEMBERS: LazyLock = LazyLock::new(|| BuiltinView { @@ -11719,6 +14147,7 @@ JOIN mz_catalog.mz_roles r2 ON r2.id = rm.member JOIN mz_catalog.mz_roles r3 ON r3.id = rm.grantor ORDER BY role"#, access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_SHOW_MY_ROLE_MEMBERS: LazyLock = LazyLock::new(|| BuiltinView { @@ -11742,6 +14171,7 @@ pub static MZ_SHOW_MY_ROLE_MEMBERS: LazyLock = LazyLock::new(|| Bui FROM mz_internal.mz_show_role_members WHERE pg_has_role(member, 'USAGE')"#, access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_SHOW_SYSTEM_PRIVILEGES: LazyLock = LazyLock::new(|| BuiltinView { @@ -11772,6 +14202,7 @@ LEFT JOIN mz_catalog.mz_roles grantor ON privileges.grantor = grantor.id LEFT JOIN mz_catalog.mz_roles grantee ON privileges.grantee = grantee.id WHERE privileges.grantee NOT LIKE 's%'"#, access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_SHOW_MY_SYSTEM_PRIVILEGES: LazyLock = LazyLock::new(|| BuiltinView { @@ -11796,6 +14227,7 @@ WHERE ELSE pg_has_role(grantee, 'USAGE') END"#, access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_SHOW_CLUSTER_PRIVILEGES: LazyLock = LazyLock::new(|| BuiltinView { @@ -11830,6 +14262,7 @@ LEFT JOIN 
mz_catalog.mz_roles grantor ON privileges.grantor = grantor.id LEFT JOIN mz_catalog.mz_roles grantee ON privileges.grantee = grantee.id WHERE privileges.grantee NOT LIKE 's%'"#, access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_SHOW_MY_CLUSTER_PRIVILEGES: LazyLock = LazyLock::new(|| BuiltinView { @@ -11856,6 +14289,7 @@ WHERE ELSE pg_has_role(grantee, 'USAGE') END"#, access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_SHOW_DATABASE_PRIVILEGES: LazyLock = LazyLock::new(|| BuiltinView { @@ -11890,6 +14324,7 @@ LEFT JOIN mz_catalog.mz_roles grantor ON privileges.grantor = grantor.id LEFT JOIN mz_catalog.mz_roles grantee ON privileges.grantee = grantee.id WHERE privileges.grantee NOT LIKE 's%'"#, access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_SHOW_MY_DATABASE_PRIVILEGES: LazyLock = LazyLock::new(|| BuiltinView { @@ -11916,6 +14351,7 @@ WHERE ELSE pg_has_role(grantee, 'USAGE') END"#, access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_SHOW_SCHEMA_PRIVILEGES: LazyLock = LazyLock::new(|| BuiltinView { @@ -11957,6 +14393,7 @@ LEFT JOIN mz_catalog.mz_roles grantee ON privileges.grantee = grantee.id LEFT JOIN mz_catalog.mz_databases databases ON privileges.database_id = databases.id WHERE privileges.grantee NOT LIKE 's%'"#, access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_SHOW_MY_SCHEMA_PRIVILEGES: LazyLock = LazyLock::new(|| BuiltinView { @@ -11988,6 +14425,7 @@ WHERE ELSE pg_has_role(grantee, 'USAGE') END"#, access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_SHOW_OBJECT_PRIVILEGES: LazyLock = LazyLock::new(|| BuiltinView { @@ -12039,6 +14477,7 @@ LEFT JOIN mz_catalog.mz_schemas schemas ON privileges.schema_id = schemas.id LEFT JOIN mz_catalog.mz_databases databases ON schemas.database_id = databases.id WHERE privileges.grantee NOT LIKE 's%'"#, access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_SHOW_MY_OBJECT_PRIVILEGES: LazyLock = LazyLock::new(|| BuiltinView { @@ 
-12077,6 +14516,7 @@ WHERE ELSE pg_has_role(grantee, 'USAGE') END"#, access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_SHOW_ALL_PRIVILEGES: LazyLock = LazyLock::new(|| BuiltinView { @@ -12122,6 +14562,7 @@ UNION ALL SELECT grantor, grantee, database, schema, name, object_type, privilege_type FROM mz_internal.mz_show_object_privileges"#, access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_SHOW_ALL_MY_PRIVILEGES: LazyLock = LazyLock::new(|| BuiltinView { @@ -12160,6 +14601,7 @@ WHERE ELSE pg_has_role(grantee, 'USAGE') END"#, access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_SHOW_DEFAULT_PRIVILEGES: LazyLock = LazyLock::new(|| BuiltinView { @@ -12219,6 +14661,7 @@ WHERE defaults.grantee NOT LIKE 's%' AND defaults.database_id IS NULL OR defaults.database_id NOT LIKE 's%' AND defaults.schema_id IS NULL OR defaults.schema_id NOT LIKE 's%'"#, access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_SHOW_MY_DEFAULT_PRIVILEGES: LazyLock = LazyLock::new(|| BuiltinView { @@ -12264,6 +14707,7 @@ WHERE ELSE pg_has_role(grantee, 'USAGE') END"#, access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_SHOW_NETWORK_POLICIES: LazyLock = LazyLock::new(|| BuiltinView { @@ -12298,6 +14742,7 @@ AND policy.id NOT LIKE 'g%' GROUP BY policy.name, comments.comment;", access: vec![PUBLIC_SELECT], + ontology: None, }); pub static MZ_CLUSTER_REPLICA_HISTORY: LazyLock = LazyLock::new(|| BuiltinView { @@ -12389,6 +14834,12 @@ pub static MZ_CLUSTER_REPLICA_HISTORY: LazyLock = LazyLock::new(|| mz_catalog.mz_cluster_replica_sizes ON mz_cluster_replica_sizes.size = creates.size"#, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "replica_history", + description: "Historical record of replica creation/drops", + links: &const { [] }, + column_semantic_types: &[], + }), }); pub static MZ_CLUSTER_REPLICA_NAME_HISTORY: LazyLock = LazyLock::new(|| BuiltinView { @@ -12455,6 +14906,12 @@ UNION ALL SELECT * FROM 
system_replicas"#, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "replica_name_history", + description: "Historical replica names", + links: &const { [] }, + column_semantic_types: &[("id", SemanticType::CatalogItemId)], + }), }); pub static MZ_HYDRATION_STATUSES: LazyLock = LazyLock::new(|| BuiltinView { @@ -12540,6 +14997,35 @@ SELECT * FROM sources UNION ALL SELECT * FROM sinks"#, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "hydration_status", + description: "Overall hydration status per object", + links: &const { + [ + OntologyLink { + name: "hydration_of", + target: "object", + properties: LinkProperties::fk_typed( + "object_id", + "id", + Cardinality::OneToOne, + mz_repr::SemanticType::CatalogItemId, + ), + }, + OntologyLink { + name: "hydration_on_replica", + target: "replica", + properties: LinkProperties::fk("replica_id", "id", Cardinality::ManyToOne), + }, + ] + }, + column_semantic_types: &const { + [ + ("object_id", SemanticType::CatalogItemId), + ("replica_id", SemanticType::ReplicaId), + ] + }, + }), }); pub const MZ_HYDRATION_STATUSES_IND: BuiltinIndex = BuiltinIndex { @@ -12578,6 +15064,34 @@ FROM mz_internal.mz_object_dependencies d JOIN mz_catalog.mz_sinks s ON (s.id = d.object_id) JOIN mz_catalog.mz_relations r ON (r.id = d.referenced_object_id)", access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "materialization_dep", + description: "Dependencies between materializations", + links: &const { + [ + OntologyLink { + name: "materialization_depends_on", + target: "object", + properties: LinkProperties::DependsOn { + source_column: "object_id", + target_column: "id", + source_id_type: Some(mz_repr::SemanticType::CatalogItemId), + }, + }, + OntologyLink { + name: "materialization_dependency", + target: "object", + properties: LinkProperties::fk("dependency_id", "id", Cardinality::ManyToOne), + }, + ] + }, + column_semantic_types: &const { + [ + ("object_id", 
SemanticType::CatalogItemId), + ("dependency_id", SemanticType::CatalogItemId), + ] + }, + }), }); pub static MZ_MATERIALIZATION_LAG: LazyLock = LazyLock::new(|| BuiltinView { @@ -12701,6 +15215,44 @@ FROM materialization_times m JOIN input_times i USING (id) JOIN root_times r USING (id)", access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "materialization_lag", + description: "Lag between a materialization and its inputs", + links: &const { + [ + OntologyLink { + name: "measures_materialization_lag", + target: "object", + properties: LinkProperties::measures("object_id", "id", "materialization_lag"), + }, + OntologyLink { + name: "slowest_local_input", + target: "object", + properties: LinkProperties::fk( + "slowest_local_input_id", + "id", + Cardinality::ManyToOne, + ), + }, + OntologyLink { + name: "slowest_global_input", + target: "object", + properties: LinkProperties::fk( + "slowest_global_input_id", + "id", + Cardinality::ManyToOne, + ), + }, + ] + }, + column_semantic_types: &const { + [ + ("object_id", SemanticType::CatalogItemId), + ("slowest_local_input_id", SemanticType::CatalogItemId), + ("slowest_global_input_id", SemanticType::CatalogItemId), + ] + }, + }), }); /** @@ -12971,6 +15523,7 @@ CROSS JOIN LATERAL ( ) AS replica_name_history LEFT JOIN replica_offline_event_history USING (bucket_start, replica_id)"#, access: vec![PUBLIC_SELECT], + ontology: None, } }); @@ -13101,6 +15654,29 @@ dropped_clusters ( SELECT * FROM mz_cluster_deployment_lineage"#, access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "cluster_deployment", + description: "Cluster deployment lineage information", + links: &const { + [ + OntologyLink { + name: "deployment_of", + target: "cluster", + properties: LinkProperties::fk("cluster_id", "id", Cardinality::ManyToOne), + }, + OntologyLink { + name: "current_deployment", + target: "cluster", + properties: LinkProperties::fk( + "current_deployment_cluster_id", + "id", + 
Cardinality::ManyToOne, + ), + }, + ] + }, + column_semantic_types: &[], + }), }); pub const MZ_SHOW_DATABASES_IND: BuiltinIndex = BuiltinIndex { @@ -13545,6 +16121,7 @@ FROM mz_internal.mz_source_statistics_raw JOIN report_paths USING (id) GROUP BY report_paths.report_id, replica_id", access: vec![PUBLIC_SELECT], + ontology: None, }); pub const MZ_SOURCE_STATISTICS_WITH_HISTORY_IND: BuiltinIndex = BuiltinIndex { @@ -13650,6 +16227,31 @@ pub static MZ_SOURCE_STATISTICS: LazyLock = LazyLock::new(|| { ]), sql: "SELECT * FROM mz_internal.mz_source_statistics_with_history WHERE length(id) > 0", access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "source_statistics", + description: "Aggregated source ingestion statistics", + links: &const { + [OntologyLink { + name: "statistics_of_source", + target: "source", + properties: LinkProperties::measures("id", "id", "ingestion_statistics"), + }] + }, + column_semantic_types: &const { + [ + ("id", SemanticType::CatalogItemId), + ("replica_id", SemanticType::ReplicaId), + ("messages_received", SemanticType::RecordCount), + ("bytes_received", SemanticType::ByteCount), + ("updates_staged", SemanticType::RecordCount), + ("updates_committed", SemanticType::RecordCount), + ("records_indexed", SemanticType::RecordCount), + ("bytes_indexed", SemanticType::ByteCount), + ("snapshot_records_known", SemanticType::RecordCount), + ("snapshot_records_staged", SemanticType::RecordCount), + ] + }, + }), } }); @@ -13712,6 +16314,27 @@ SELECT FROM mz_internal.mz_sink_statistics_raw GROUP BY id, replica_id", access: vec![PUBLIC_SELECT], + ontology: Some(Ontology { + entity_name: "sink_statistics", + description: "Aggregated sink export statistics", + links: &const { + [OntologyLink { + name: "statistics_of_sink", + target: "sink", + properties: LinkProperties::measures("id", "id", "export_statistics"), + }] + }, + column_semantic_types: &const { + [ + ("id", SemanticType::CatalogItemId), + ("replica_id", 
SemanticType::ReplicaId), + ("messages_staged", SemanticType::RecordCount), + ("messages_committed", SemanticType::RecordCount), + ("bytes_staged", SemanticType::ByteCount), + ("bytes_committed", SemanticType::ByteCount), + ] + }, + }), }); pub const MZ_SINK_STATISTICS_IND: BuiltinIndex = BuiltinIndex { @@ -14537,6 +17160,9 @@ pub static BUILTINS_STATIC: LazyLock<Vec<Builtin<NameReference>>> = LazyLock::ne builtin_items.extend(notice::builtins()); + // Generate ontology views by enumerating existing builtins. + builtin_items.extend(ontology::generate_views(&builtin_items)); + // Generate builtin relations reporting builtin objects last, since they need a complete view // of all other builtins. let mut builtin_builtins = builtin::builtins(&builtin_items).collect(); @@ -14793,4 +17419,332 @@ mod tests { violations.join("\n"), ); } + + /// Validates ontology metadata consistency: + /// - Every link target references an entity that exists. + /// - No duplicate entity names. + /// - Every annotated builtin has a non-empty entity_name and description. + #[mz_ore::test] + #[cfg_attr(miri, ignore)] + fn test_ontology_consistency() { + // Collect all entity names from builtins with ontology annotations.
+ let mut entity_names: BTreeSet<String> = BTreeSet::new(); + let mut duplicate_entities = Vec::new(); + + for builtin in BUILTINS_STATIC.iter() { + let ontology = match builtin { + Builtin::Table(t) => t.ontology.as_ref(), + Builtin::View(v) => v.ontology.as_ref(), + Builtin::MaterializedView(mv) => mv.ontology.as_ref(), + Builtin::Source(s) => s.ontology.as_ref(), + _ => None, + }; + if let Some(ont) = ontology { + assert!( + !ont.entity_name.is_empty(), + "builtin {} has empty ontology entity_name", + builtin.name() + ); + assert!( + !ont.description.is_empty(), + "builtin {} ({}) has empty ontology description", + builtin.name(), + ont.entity_name + ); + if !entity_names.insert(ont.entity_name.to_string()) { + duplicate_entities.push(format!( + "duplicate entity_name {:?} on builtin {}", + ont.entity_name, + builtin.name() + )); + } + } + } + assert!( + duplicate_entities.is_empty(), + "ontology has duplicate entity names:\n{}", + duplicate_entities.join("\n"), + ); + + // Validate link targets reference existing entities. + let mut bad_targets = Vec::new(); + for builtin in BUILTINS_STATIC.iter() { + let ontology = match builtin { + Builtin::Table(t) => t.ontology.as_ref(), + Builtin::View(v) => v.ontology.as_ref(), + Builtin::MaterializedView(mv) => mv.ontology.as_ref(), + Builtin::Source(s) => s.ontology.as_ref(), + _ => None, + }; + if let Some(ont) = ontology { + for link in ont.links { + if !entity_names.contains(link.target) { + bad_targets.push(format!( + "entity {:?} link {:?} targets {:?} which is not a known entity", + ont.entity_name, link.name, link.target + )); + } + } + } + } + assert!( + bad_targets.is_empty(), + "ontology has links targeting unknown entities:\n{}", + bad_targets.join("\n"), + ); + + // Semantic type annotations are typed (SemanticType enum), so validity + // is guaranteed at compile time — no runtime check needed.
+ + // Validate that every "reference" column (one whose semantic type implies + // a FK relationship) is covered by an OntologyLink on entities that + // already have at least one FK-style link. + // + // Scope: only entities that have started FK annotation (at least one + // link with a source_column). Entities with only union/maps_to links, + // or no links at all, are not yet fully annotated and are skipped to + // avoid noise. + // + // Exemptions: + // - Column at index 0 named "id": almost always the entity's own PK, + // not a FK (e.g. mz_objects.id, mz_functions.id). + // - Columns in the relation's declared key set. + // + // "Reference" types are ID types that imply a FK. Discriminators + // (ObjectType, ConnectionType, SourceType), OID, and metric types + // (ByteCount, etc.) are excluded. + let reference_sem_types: BTreeSet<SemanticType> = BTreeSet::from([ + SemanticType::CatalogItemId, + SemanticType::GlobalId, + SemanticType::ClusterId, + SemanticType::ReplicaId, + SemanticType::SchemaId, + SemanticType::DatabaseId, + SemanticType::RoleId, + SemanticType::NetworkPolicyId, + ]); + + let mut uncovered_fk_cols = Vec::new(); + for builtin in BUILTINS_STATIC.iter() { + let (name, desc, ontology) = match builtin { + Builtin::Table(t) => (t.name, &t.desc, t.ontology.as_ref()), + Builtin::View(v) => (v.name, &v.desc, v.ontology.as_ref()), + Builtin::MaterializedView(mv) => (mv.name, &mv.desc, mv.ontology.as_ref()), + Builtin::Source(s) => (s.name, &s.desc, s.ontology.as_ref()), + _ => continue, + }; + let Some(ont) = ontology else { continue }; + + // Collect all source_column values declared by existing links. + let linked_cols: BTreeSet<&str> = ont + .links + .iter() + .filter_map(|link| match &link.properties { + LinkProperties::ForeignKey { source_column, .. } => Some(*source_column), + LinkProperties::Measures { source_column, .. } => Some(*source_column), + LinkProperties::DependsOn { source_column, ..
} => Some(*source_column), + LinkProperties::MapsTo { + source_column: Some(sc), + .. + } => Some(*sc), + _ => None, + }) + .collect(); + + // Skip entities that have no FK-style links yet — they are either + // unannotated or use only union/maps_to links. Only enforce + // coverage on entities that have started FK annotation. + if linked_cols.is_empty() { + continue; + } + + // Column indices that are part of the declared key set. + let pk_indices: BTreeSet<usize> = desc.typ().keys.iter().flatten().copied().collect(); + + for (col_name, sem) in ont.column_semantic_types { + if !reference_sem_types.contains(sem) { + continue; + } + let Some(idx) = desc.iter_names().position(|n| n.as_str() == *col_name) else { + continue; + }; + // Exempt the entity's own primary identifier: column 0 named + // "id" is by convention the entity's own PK (not a FK), even + // when no explicit with_key() is declared on the relation. + if idx == 0 && *col_name == "id" { + continue; + } + if pk_indices.contains(&idx) { + continue; + } + if linked_cols.contains(*col_name) { + continue; + } + uncovered_fk_cols.push(format!( + "entity {:?} (builtin {}) column {:?} has semantic type {:?} but no OntologyLink covers it (add a link with source_column: {:?})", + ont.entity_name, name, col_name, sem, col_name + )); + } + } + assert!( + uncovered_fk_cols.is_empty(), + "ontology entities have FK-typed columns with no OntologyLink:\n{}", + uncovered_fk_cols.join("\n"), + ); + + // Validate that every source_column in a link actually names a column + // in the entity's RelationDesc. This catches stale annotations after + // column renames or removals. With typed LinkProperties this is mostly + // belt-and-suspenders since the type system enforces field presence, but + // we still need to verify the string value matches a real column.
+ let mut bad_source_cols = Vec::new(); + for builtin in BUILTINS_STATIC.iter() { + let (name, desc, ontology) = match builtin { + Builtin::Table(t) => (t.name, &t.desc, t.ontology.as_ref()), + Builtin::View(v) => (v.name, &v.desc, v.ontology.as_ref()), + Builtin::MaterializedView(mv) => (mv.name, &mv.desc, mv.ontology.as_ref()), + Builtin::Source(s) => (s.name, &s.desc, s.ontology.as_ref()), + _ => continue, + }; + let Some(ont) = ontology else { continue }; + + let col_names: BTreeSet<&str> = desc.iter_names().map(|c| c.as_str()).collect(); + + for link in ont.links { + let source_col = match &link.properties { + LinkProperties::ForeignKey { source_column, .. } => Some(*source_column), + LinkProperties::Measures { source_column, .. } => Some(*source_column), + LinkProperties::DependsOn { source_column, .. } => Some(*source_column), + LinkProperties::MapsTo { + source_column: Some(sc), + .. + } => Some(*sc), + _ => None, + }; + let Some(col) = source_col else { continue }; + if !col_names.contains(col) { + bad_source_cols.push(format!( + "entity {:?} (builtin {}) link {:?} references source_column {:?} which does not exist in the relation", + ont.entity_name, name, link.name, col + )); + } + } + } + assert!( + bad_source_cols.is_empty(), + "ontology links reference non-existent source_columns:\n{}", + bad_source_cols.join("\n"), + ); + + // Sanity check: we have a reasonable number of annotated entities. + assert!( + entity_names.len() > 90, + "expected > 90 ontology entities, found {}", + entity_names.len() + ); + } + + /// Verify that `LinkProperties` serializes to the same JSON that the old + /// hand-written `properties_json` strings contained. One representative + /// case per constructor/variant is enough — the important thing is that + /// field names, enum tag values, and skip-if-None/false behaviour are all + /// correct. 
+ #[mz_ore::test] + fn test_link_properties_serialization() { + let check = |props: LinkProperties, expected: &str| { + let got = serde_json::to_string(&props).expect("serialize"); + let got_val: serde_json::Value = serde_json::from_str(&got).expect("parse got"); + let exp_val: serde_json::Value = + serde_json::from_str(expected).expect("parse expected"); + assert_eq!(got_val, exp_val, "mismatch for {expected}"); + }; + + // fk — basic, no optional fields + check( + LinkProperties::fk("owner_id", "id", Cardinality::ManyToOne), + r#"{"kind":"foreign_key","source_column":"owner_id","target_column":"id","cardinality":"many_to_one"}"#, + ); + // fk — one_to_one cardinality + check( + LinkProperties::fk("id", "id", Cardinality::OneToOne), + r#"{"kind":"foreign_key","source_column":"id","target_column":"id","cardinality":"one_to_one"}"#, + ); + // fk_nullable — nullable field present and true + check( + LinkProperties::fk_nullable("database_id", "id", Cardinality::ManyToOne), + r#"{"kind":"foreign_key","source_column":"database_id","target_column":"id","cardinality":"many_to_one","nullable":true}"#, + ); + // fk_typed — source_id_type present, requires_mapping absent + check( + LinkProperties::fk_typed( + "replica_id", + "id", + Cardinality::ManyToOne, + mz_repr::SemanticType::CatalogItemId, + ), + r#"{"kind":"foreign_key","source_column":"replica_id","target_column":"id","cardinality":"many_to_one","source_id_type":"CatalogItemId"}"#, + ); + // fk_mapped — source_id_type + requires_mapping both present + check( + LinkProperties::fk_mapped( + "object_id", + "id", + Cardinality::ManyToOne, + mz_repr::SemanticType::GlobalId, + "mz_internal.mz_object_global_ids", + ), + r#"{"kind":"foreign_key","source_column":"object_id","target_column":"id","cardinality":"many_to_one","source_id_type":"GlobalId","requires_mapping":"mz_internal.mz_object_global_ids"}"#, + ); + // union_disc — discriminator fields present, note absent + check( + LinkProperties::union_disc("type", "table"), 
+ r#"{"kind":"union","discriminator_column":"type","discriminator_value":"table"}"#, + ); + // Union — note only, discriminator absent + check( + LinkProperties::Union { + discriminator_column: None, + discriminator_value: None, + note: Some("example note"), + }, + r#"{"kind":"union","note":"example note"}"#, + ); + // measures — basic + check( + LinkProperties::measures("id", "id", "cpu_time_ns"), + r#"{"kind":"measures","source_column":"id","target_column":"id","metric":"cpu_time_ns"}"#, + ); + // measures_mapped — source_id_type + requires_mapping present + check( + LinkProperties::measures_mapped( + "object_id", + "id", + "wallclock_lag", + mz_repr::SemanticType::GlobalId, + "mz_internal.mz_object_global_ids", + ), + r#"{"kind":"measures","source_column":"object_id","target_column":"id","metric":"wallclock_lag","source_id_type":"GlobalId","requires_mapping":"mz_internal.mz_object_global_ids"}"#, + ); + // DependsOn + check( + LinkProperties::DependsOn { + source_column: "object_id", + target_column: "id", + source_id_type: Some(mz_repr::SemanticType::CatalogItemId), + }, + r#"{"kind":"depends_on","source_column":"object_id","target_column":"id","source_id_type":"CatalogItemId"}"#, + ); + // MapsTo — via + from_type + to_type + check( + LinkProperties::MapsTo { + source_column: None, + target_column: None, + via: Some("mz_internal.mz_object_global_ids"), + from_type: Some(mz_repr::SemanticType::CatalogItemId), + to_type: Some(mz_repr::SemanticType::GlobalId), + note: None, + }, + r#"{"kind":"maps_to","via":"mz_internal.mz_object_global_ids","from_type":"CatalogItemId","to_type":"GlobalId"}"#, + ); + } } diff --git a/src/catalog/src/builtin/builtin.rs b/src/catalog/src/builtin/builtin.rs index 6930532672a3f..3a808a91890d5 100644 --- a/src/catalog/src/builtin/builtin.rs +++ b/src/catalog/src/builtin/builtin.rs @@ -102,6 +102,7 @@ FROM (VALUES {values}) AS v(oid, schema_name, name, cluster_name, definition, pr column_comments: Default::default(), sql: 
Box::leak(sql.into_boxed_str()), access: vec![PUBLIC_SELECT], + ontology: None, } } diff --git a/src/catalog/src/builtin/notice.rs b/src/catalog/src/builtin/notice.rs index c285524a73334..9c8d05d9c8981 100644 --- a/src/catalog/src/builtin/notice.rs +++ b/src/catalog/src/builtin/notice.rs @@ -54,6 +54,7 @@ pub static MZ_OPTIMIZER_NOTICES: LazyLock = LazyLock::new(|| { column_comments: BTreeMap::new(), is_retained_metrics_object: false, access: vec![MONITOR_SELECT], + ontology: None, } }); @@ -139,6 +140,7 @@ FROM mz_internal.mz_optimizer_notices n ", access: vec![MONITOR_SELECT], + ontology: None, }); /// A redacted version of [`MZ_NOTICES`] that is made safe to be viewed by @@ -203,6 +205,7 @@ FROM mz_internal.mz_notices ", access: vec![SUPPORT_SELECT, MONITOR_REDACTED_SELECT, MONITOR_SELECT], + ontology: None, }); pub const MZ_NOTICES_IND: BuiltinIndex = BuiltinIndex { diff --git a/src/catalog/src/builtin/ontology.rs b/src/catalog/src/builtin/ontology.rs new file mode 100644 index 0000000000000..08c6a332e75f5 --- /dev/null +++ b/src/catalog/src/builtin/ontology.rs @@ -0,0 +1,459 @@ +// Copyright Materialize, Inc. and contributors. All rights reserved. +// +// Use of this software is governed by the Business Source License +// included in the LICENSE file. +// +// As of the Change Date specified in that file, in accordance with +// the Business Source License, use of this software will be governed +// by the Apache License, Version 2.0. + +//! Catalog ontology views derived from existing builtin definitions. +//! +//! Enumerates builtins that have `ontology: Some(...)` and generates 4 views: +//! - entity_types: from ontology.description + RelationDesc::keys() +//! - properties: from mz_columns + mz_comments + semantic type inference +//! - semantic_types: small const reference data +//! 
- link_types: from ontology.links on each builtin + +use std::collections::BTreeMap; + +use mz_pgrepr::oid; +use mz_repr::namespaces::MZ_INTERNAL_SCHEMA; +use mz_repr::{RelationDesc, SemanticType, SqlScalarType}; +use mz_sql::catalog::NameReference; + +use super::{Builtin, BuiltinView, Ontology, PUBLIC_SELECT}; + +pub(super) fn generate_views(builtins: &[Builtin<NameReference>]) -> Vec<Builtin<NameReference>> { + let infos: Vec<_> = builtins + .iter() + .filter_map(|b| { + let (name, schema, desc, ontology) = match b { + Builtin::Table(t) => (t.name, t.schema, &t.desc, t.ontology.as_ref()?), + Builtin::View(v) => (v.name, v.schema, &v.desc, v.ontology.as_ref()?), + Builtin::MaterializedView(mv) => { + (mv.name, mv.schema, &mv.desc, mv.ontology.as_ref()?) + } + Builtin::Source(s) => (s.name, s.schema, &s.desc, s.ontology.as_ref()?), + _ => return None, + }; + let entity_name = ontology.entity_name.to_string(); + Some(Info { + table_name: name, + schema_name: schema, + entity_name, + desc, + ontology, + }) + }) + .collect(); + + vec![ + Builtin::View(leak(entity_types_view(&infos))), + Builtin::View(leak(semantic_types_view())), + Builtin::View(leak(properties_view(&infos))), + Builtin::View(leak(link_types_view(&infos))), + ] +} + +/// Leak a `BuiltinView` to get a `&'static` reference. Called exactly 4 times +/// at startup (one per ontology view). These views live for the entire process +/// lifetime (same as `LazyLock<&'static BuiltinView>` used by other builtins), +/// so the leak is intentional and bounded. +fn leak(v: BuiltinView) -> &'static BuiltinView { + Box::leak(Box::new(v)) +} + +struct Info<'a> { + table_name: &'static str, + schema_name: &'static str, + entity_name: String, + desc: &'a RelationDesc, + ontology: &'a Ontology, +} + +/// A single typed SQL literal for use inside a VALUES list. +enum Lit { + /// A text string: rendered as `'escaped'`. + Str(String), + /// A JSONB value: rendered as `'escaped'::jsonb`. + Json(String), + /// SQL NULL.
+ Null, +} + +impl Lit { + fn render(&self) -> String { + match self { + Lit::Str(s) => format!("'{}'", esc(s)), + Lit::Json(s) => format!("'{}'::jsonb", esc(s)), + Lit::Null => "NULL".to_string(), + } + } +} + +/// Map a `SqlScalarType` to the SQL type name used in cast expressions. +fn sql_type_name(ty: &SqlScalarType) -> &'static str { + match ty { + SqlScalarType::String => "text", + SqlScalarType::Jsonb => "jsonb", + SqlScalarType::Oid => "oid", + SqlScalarType::UInt64 => "uint8", + SqlScalarType::Numeric { .. } => "numeric", + SqlScalarType::MzTimestamp => "mz_timestamp", + SqlScalarType::TimestampTz { .. } => "timestamp with time zone", + other => panic!("unsupported SqlScalarType in ontology view: {other:?}"), + } +} + +/// Escape single quotes for SQL string literals. Only safe for trusted +/// compile-time constants (entity names, descriptions, link JSON from +/// `Ontology` annotations) — never use with user-supplied input. +fn esc(s: &str) -> String { + s.replace('\'', "''") +} + +/// Render rows into a SQL `VALUES (r1c1,r1c2,...),(r2c1,...)` fragment. +/// Used when a VALUES list appears as a subquery inside a larger SQL string +/// rather than as the top-level source of a `values_view`. +fn values_sql(rows: &[Vec<Lit>]) -> String { + rows.iter() + .map(|row| { + let lits: Vec<String> = row.iter().map(Lit::render).collect(); + format!("({})", lits.join(",")) + }) + .collect::<Vec<_>>() + .join(",") +} + +/// Build an ontology view from a static VALUES list. Each row is a `Vec<Lit>`; +/// all escaping and type-casting is handled here so callers never touch SQL +/// string formatting directly.
+fn values_view( + name: &'static str, + oid: u32, + cols: &[(&'static str, SqlScalarType, bool)], + keys: &[Vec<usize>], + rows: Vec<Vec<Lit>>, +) -> BuiltinView { + let col_names: Vec<&str> = cols.iter().map(|(n, _, _)| *n).collect(); + let cast_exprs: Vec<String> = cols + .iter() + .map(|(n, ty, _)| format!("{n}::{}", sql_type_name(ty))) + .collect(); + + let vals: Vec<String> = rows + .iter() + .map(|row| { + let lits: Vec<String> = row.iter().map(Lit::render).collect(); + format!("({})", lits.join(",")) + }) + .collect(); + + let sql = format!( + "SELECT {casts} FROM (VALUES {vals}) AS t({cols})", + casts = cast_exprs.join(","), + vals = vals.join(","), + cols = col_names.join(","), + ); + + let mut b = RelationDesc::builder(); + for (n, ty, nullable) in cols { + b = b.with_column(*n, ty.clone().nullable(*nullable)); + } + let mut desc = b.finish(); + for key in keys { + desc = desc.with_key(key.clone()); + } + BuiltinView { + name, + schema: MZ_INTERNAL_SCHEMA, + oid, + desc, + column_comments: BTreeMap::new(), + sql: Box::leak(sql.into_boxed_str()), + access: vec![PUBLIC_SELECT], + ontology: None, + } +} + +/// Extract all keys from a `RelationDesc` and return a `Lit::Json` with shape: +/// `{"primary_key": ["id"], "alternate_keys": [["oid"]]}`. +/// `primary_key` is the first declared key; `alternate_keys` contains any +/// additional unique keys. Returns `Lit::Null` if no keys are defined.
+fn pk_lit(desc: &RelationDesc) -> Lit { + let all_keys = &desc.typ().keys; + let Some((first, rest)) = all_keys.split_first() else { + return Lit::Null; + }; + let fmt_key = |key: &Vec<usize>| -> String { + let cols: Vec<_> = key + .iter() + .map(|&i| serde_json::to_string(desc.get_name(i).as_str()).expect("valid utf-8")) + .collect(); + format!("[{}]", cols.join(", ")) + }; + let primary = fmt_key(first); + let json = if rest.is_empty() { + format!("{{\"primary_key\": {primary}}}") + } else { + let alts: Vec<_> = rest.iter().map(fmt_key).collect(); + format!( + "{{\"primary_key\": {primary}, \"alternate_keys\": [{}]}}", + alts.join(", ") + ) + }; + Lit::Json(json) +} + +// ── View builders ──────────────────────────────────────────── + +fn entity_types_view(infos: &[Info]) -> BuiltinView { + let rows = infos + .iter() + .map(|i| { + vec![ + Lit::Str(i.entity_name.clone()), + Lit::Str(format!("{}.{}", i.schema_name, i.table_name)), + pk_lit(i.desc), + Lit::Str(i.ontology.description.to_string()), + ] + }) + .collect(); + values_view( + "mz_ontology_entity_types", + oid::VIEW_MZ_ONTOLOGY_ENTITY_TYPES_OID, + &[ + ("name", SqlScalarType::String, false), + ("relation", SqlScalarType::String, false), + ("properties", SqlScalarType::Jsonb, true), + ("description", SqlScalarType::String, false), + ], + &[vec![0], vec![1], vec![3]], + rows, + ) +} + +fn semantic_types_view() -> BuiltinView { + let rows = SEMANTIC_TYPE_DEFS + .iter() + .map(|(n, t, d)| { + vec![ + Lit::Str(n.to_string()), + Lit::Str(t.to_string()), + Lit::Str(d.to_string()), + ] + }) + .collect(); + values_view( + "mz_ontology_semantic_types", + oid::VIEW_MZ_ONTOLOGY_SEMANTIC_TYPES_OID, + &[ + ("name", SqlScalarType::String, false), + ("sql_type", SqlScalarType::String, false), + ("description", SqlScalarType::String, false), + ], + &[vec![0], vec![2]], + rows, + ) +} + +/// Build the `mz_ontology_properties` view: one row per column of every +/// annotated builtin relation.
+/// +/// The generated SQL works in two halves: +/// +/// 1. **Column discovery** — An inline VALUES list (`ent`) maps each entity to +/// its (schema, table) pair. This is joined through `mz_schemas` → +/// `mz_objects` → `mz_columns` so the view always reflects the live catalog +/// (column additions/removals are picked up automatically). +/// +/// 2. **Annotation enrichment** — A second VALUES list (`ann`) carries the +/// semantic-type annotations from `Ontology::column_semantic_types`. +/// Column descriptions come from `mz_comments`. Both are LEFT JOINed so +/// columns without annotations or comments still appear (with NULLs). +fn properties_view(infos: &[Info]) -> BuiltinView { + let mut ent: Vec<Vec<Lit>> = Vec::new(); + let mut ann: Vec<Vec<Lit>> = Vec::new(); + for i in infos { + ent.push(vec![ + Lit::Str(i.schema_name.to_string()), + Lit::Str(i.table_name.to_string()), + Lit::Str(i.entity_name.clone()), + ]); + for (col_name, sem) in i.ontology.column_semantic_types { + ann.push(vec![ + Lit::Str(i.entity_name.clone()), + Lit::Str(col_name.to_string()), + Lit::Str(sem.to_string()), + ]); + } + } + let sql = format!( + "SELECT ent.entity_name AS entity_type,col.name AS column_name,\ + ann.semantic_type::text AS semantic_type,cmt.comment AS description \ + FROM (VALUES {ent}) AS ent(schema_name,table_name,entity_name) \ + JOIN mz_catalog.mz_schemas s ON s.name=ent.schema_name \ + JOIN mz_catalog.mz_objects o ON o.schema_id=s.id AND o.name=ent.table_name \ + JOIN mz_catalog.mz_columns col ON col.id=o.id \ + LEFT JOIN mz_internal.mz_comments cmt ON cmt.id=o.id AND cmt.object_sub_id=col.position \ + LEFT JOIN (VALUES {ann}) AS ann(entity_name,column_name,semantic_type) \ + ON ann.entity_name=ent.entity_name AND ann.column_name=col.name", + ent = values_sql(&ent), + ann = values_sql(&ann), + ); + + let mut b = RelationDesc::builder(); + for (n, ty, nullable) in &[ + ("entity_type", SqlScalarType::String, false), + ("column_name", SqlScalarType::String, false), +
("semantic_type", SqlScalarType::String, true), + ("description", SqlScalarType::String, true), + ] { + b = b.with_column(*n, ty.clone().nullable(*nullable)); + } + BuiltinView { + name: "mz_ontology_properties", + schema: MZ_INTERNAL_SCHEMA, + oid: oid::VIEW_MZ_ONTOLOGY_PROPERTIES_OID, + desc: b.finish(), + column_comments: BTreeMap::new(), + sql: Box::leak(sql.into_boxed_str()), + access: vec![PUBLIC_SELECT], + ontology: None, + } +} + +fn link_types_view(infos: &[Info]) -> BuiltinView { + let rows = infos + .iter() + .flat_map(|i| { + i.ontology.links.iter().map(move |l| { + vec![ + Lit::Str(l.name.to_string()), + Lit::Str(i.entity_name.clone()), + Lit::Str(l.target.to_string()), + Lit::Json( + serde_json::to_string(&l.properties) + .expect("LinkProperties is serializable"), + ), + Lit::Null, + ] + }) + }) + .collect(); + values_view( + "mz_ontology_link_types", + oid::VIEW_MZ_ONTOLOGY_LINK_TYPES_OID, + &[ + ("name", SqlScalarType::String, false), + ("source_entity", SqlScalarType::String, false), + ("target_entity", SqlScalarType::String, false), + ("properties", SqlScalarType::Jsonb, false), + ("description", SqlScalarType::String, true), + ], + &[], + rows, + ) +} + +// ── Semantic type reference data ───────────────────────────── + +pub(super) const SEMANTIC_TYPE_DEFS: &[(SemanticType, &str, &str)] = &[ + ( + SemanticType::CatalogItemId, + "text", + "SQL-layer object ID. Format: s{n}/u{n}.", + ), + ( + SemanticType::GlobalId, + "text", + "Runtime ID used by compute/storage. Format: s{n}/u{n}/si{n}.", + ), + ( + SemanticType::ClusterId, + "text", + "Cluster ID. Format: s{n}/u{n}.", + ), + ( + SemanticType::ReplicaId, + "text", + "Cluster replica ID. Format: s{n}/u{n}.", + ), + ( + SemanticType::SchemaId, + "text", + "Schema ID. Format: s{n}/u{n}.", + ), + ( + SemanticType::DatabaseId, + "text", + "Database ID. Format: s{n}/u{n}.", + ), + ( + SemanticType::RoleId, + "text", + "Role ID. 
Format: s{n}/g{n}/u{n}/p.", + ), + ( + SemanticType::NetworkPolicyId, + "text", + "Network policy ID. Format: s{n}/u{n}.", + ), + ( + SemanticType::ShardId, + "text", + "Persist shard ID. Format: s{uuid}.", + ), + ( + SemanticType::OID, + "oid", + "PostgreSQL-compatible object identifier.", + ), + ( + SemanticType::ObjectType, + "text", + "Catalog object type discriminator (e.g., table, view, source, sink, index, materialized-view).", + ), + ( + SemanticType::ConnectionType, + "text", + "Connection type discriminator (e.g., kafka, postgres, mysql, ssh-tunnel).", + ), + ( + SemanticType::SourceType, + "text", + "Source type discriminator (e.g., kafka, postgres, mysql, webhook).", + ), + ( + SemanticType::MzTimestamp, + "mz_timestamp", + "Internal logical timestamp (8-byte unsigned integer).", + ), + ( + SemanticType::WallclockTimestamp, + "timestamp with time zone", + "Wall clock timestamp.", + ), + (SemanticType::ByteCount, "uint8", "A count of bytes."), + ( + SemanticType::RecordCount, + "uint8", + "A count of records/rows.", + ), + ( + SemanticType::CreditRate, + "numeric", + "Credits consumed per hour.", + ), + ( + SemanticType::SqlDefinition, + "text", + "A SQL CREATE statement.", + ), + ( + SemanticType::RedactedSqlDefinition, + "text", + "A redacted SQL CREATE statement.", + ), +]; diff --git a/src/pgrepr-consts/src/oid.rs b/src/pgrepr-consts/src/oid.rs index ded0875d9db04..c4905027cc46f 100644 --- a/src/pgrepr-consts/src/oid.rs +++ b/src/pgrepr-consts/src/oid.rs @@ -787,3 +787,7 @@ pub const FUNC_PARSE_CATALOG_CREATE_SQL_OID: u32 = 17073; pub const FUNC_REDACT_SQL_OID: u32 = 17074; pub const FUNC_REPEAT_ROW_NON_NEGATIVE_OID: u32 = 17075; pub const ROLE_MZ_JWT_SYNC_OID: u32 = 17076; +pub const VIEW_MZ_ONTOLOGY_ENTITY_TYPES_OID: u32 = 17077; +pub const VIEW_MZ_ONTOLOGY_SEMANTIC_TYPES_OID: u32 = 17078; +pub const VIEW_MZ_ONTOLOGY_PROPERTIES_OID: u32 = 17079; +pub const VIEW_MZ_ONTOLOGY_LINK_TYPES_OID: u32 = 17080; diff --git a/src/repr/src/lib.rs 
b/src/repr/src/lib.rs index 3ad159243e42a..32f98521eece2 100644 --- a/src/repr/src/lib.rs +++ b/src/repr/src/lib.rs @@ -56,9 +56,9 @@ pub use crate::relation::{ ColumnDiff, ColumnIndex, ColumnName, KeyDiff, NotNullViolation, PropRelationDescDiff, ProtoColumnName, ProtoColumnType, ProtoRelationDesc, ProtoRelationType, RelationDesc, RelationDescBuilder, RelationDescDiff, RelationVersion, RelationVersionSelector, - ReprColumnType, ReprRelationType, SqlColumnType, SqlRelationType, UNKNOWN_COLUMN_NAME, - VersionedRelationDesc, arb_relation_desc_diff, arb_relation_desc_projection, - arb_row_for_relation, + ReprColumnType, ReprRelationType, SemanticType, SqlColumnType, SqlRelationType, + UNKNOWN_COLUMN_NAME, VersionedRelationDesc, arb_relation_desc_diff, + arb_relation_desc_projection, arb_row_for_relation, }; pub use crate::row::encode::{RowColumnarDecoder, RowColumnarEncoder, preserves_order}; pub use crate::row::iter::{IntoRowIterator, RowIterator}; diff --git a/src/repr/src/relation.rs b/src/repr/src/relation.rs index 7e4a3ff71839c..5c6e784ff8889 100644 --- a/src/repr/src/relation.rs +++ b/src/repr/src/relation.rs @@ -822,6 +822,74 @@ impl RustType for RelationVersion { } } +/// Semantic type annotation for a column in a builtin catalog relation. +/// +/// These are compile-time metadata used by the catalog ontology layer to +/// describe the meaning of a column (e.g., that it contains a catalog item ID +/// or a role ID). Possible values correspond to the entries in +/// `SEMANTIC_TYPE_DEFS` in the `mz-catalog` crate. 
+#[derive( + Clone, + Copy, + Debug, + PartialEq, + Eq, + PartialOrd, + Ord, + Hash, + serde::Serialize +)] +pub enum SemanticType { + CatalogItemId, + GlobalId, + ClusterId, + ReplicaId, + SchemaId, + DatabaseId, + RoleId, + NetworkPolicyId, + ShardId, + OID, + ObjectType, + ConnectionType, + SourceType, + MzTimestamp, + WallclockTimestamp, + ByteCount, + RecordCount, + CreditRate, + SqlDefinition, + RedactedSqlDefinition, +} + +impl fmt::Display for SemanticType { + fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { + let s = match self { + SemanticType::CatalogItemId => "CatalogItemId", + SemanticType::GlobalId => "GlobalId", + SemanticType::ClusterId => "ClusterId", + SemanticType::ReplicaId => "ReplicaId", + SemanticType::SchemaId => "SchemaId", + SemanticType::DatabaseId => "DatabaseId", + SemanticType::RoleId => "RoleId", + SemanticType::NetworkPolicyId => "NetworkPolicyId", + SemanticType::ShardId => "ShardId", + SemanticType::OID => "OID", + SemanticType::ObjectType => "ObjectType", + SemanticType::ConnectionType => "ConnectionType", + SemanticType::SourceType => "SourceType", + SemanticType::MzTimestamp => "MzTimestamp", + SemanticType::WallclockTimestamp => "WallclockTimestamp", + SemanticType::ByteCount => "ByteCount", + SemanticType::RecordCount => "RecordCount", + SemanticType::CreditRate => "CreditRate", + SemanticType::SqlDefinition => "SqlDefinition", + SemanticType::RedactedSqlDefinition => "RedactedSqlDefinition", + }; + f.write_str(s) + } +} + /// Metadata (other than type) for a column in a [`RelationDesc`]. #[derive(Clone, Debug, Eq, PartialEq, Serialize, Deserialize, Hash, MzReflect)] struct ColumnMetadata { @@ -902,7 +970,7 @@ struct ColumnMetadata { /// the index in [`SqlRelationType`] that corresponds to a given column, and the /// version at which this column was added or dropped. 
/// -#[derive(Clone, Debug, Eq, PartialEq, Serialize, Deserialize, Hash, MzReflect)] +#[derive(Clone, Debug, PartialEq, Eq, Hash, Serialize, Deserialize, MzReflect)] pub struct RelationDesc { typ: SqlRelationType, metadata: BTreeMap, diff --git a/test/sqllogictest/autogenerated/mz_internal.slt b/test/sqllogictest/autogenerated/mz_internal.slt index ea87e6201290c..f0d440add2708 100644 --- a/test/sqllogictest/autogenerated/mz_internal.slt +++ b/test/sqllogictest/autogenerated/mz_internal.slt @@ -225,7 +225,7 @@ comment text The␠comment␠itself. query TTT SELECT name, type, comment FROM objects WHERE schema = 'mz_internal' AND object = 'mz_compute_dependencies' ORDER BY position ---- -object_id text The␠ID␠of␠a␠compute␠object.␠Corresponds␠to␠`mz_catalog.mz_indexes.id`,␠`mz_catalog.mz_materialized_views.id`,␠or␠`mz_internal.mz_subscriptions`. +object_id text The␠ID␠of␠a␠compute␠object.␠Corresponds␠to␠`mz_catalog.mz_indexes.id`,␠`mz_catalog.mz_materialized_views.id`,␠or␠`mz_internal.mz_subscriptions.id`. dependency_id text The␠ID␠of␠a␠compute␠dependency.␠Corresponds␠to␠`mz_catalog.mz_indexes.id`,␠`mz_catalog.mz_materialized_views.id`,␠`mz_catalog.mz_sources.id`,␠or␠`mz_catalog.mz_tables.id`. query TTT @@ -464,7 +464,7 @@ query TTT SELECT name, type, comment FROM objects WHERE schema = 'mz_internal' AND object = 'mz_network_policy_rules' ORDER BY position ---- name text The␠name␠of␠the␠network␠policy␠rule.␠Can␠be␠combined␠with␠`policy_id`␠to␠form␠a␠unique␠identifier. -policy_id text The␠ID␠the␠network␠policy␠the␠rule␠is␠part␠of.␠Corresponds␠to␠`mz_network_policy_rules.id`. +policy_id text The␠ID␠the␠network␠policy␠the␠rule␠is␠part␠of.␠Corresponds␠to␠`mz_internal.mz_network_policies.id`. action text The␠action␠of␠the␠rule.␠`allow`␠is␠the␠only␠supported␠action. address text The␠address␠the␠rule␠will␠take␠action␠on. direction text The␠direction␠of␠traffic␠the␠rule␠applies␠to.␠`ingress`␠is␠the␠only␠supported␠direction. 
@@ -787,6 +787,10 @@ mz_object_lifetimes mz_object_oid_alias mz_object_transitive_dependencies mz_objects_id_namespace_types +mz_ontology_entity_types +mz_ontology_link_types +mz_ontology_properties +mz_ontology_semantic_types mz_optimizer_notices mz_pending_cluster_replicas mz_postgres_source_tables diff --git a/test/sqllogictest/autogenerated/mz_introspection.slt b/test/sqllogictest/autogenerated/mz_introspection.slt index c30ccfa704691..a914ba4344620 100644 --- a/test/sqllogictest/autogenerated/mz_introspection.slt +++ b/test/sqllogictest/autogenerated/mz_introspection.slt @@ -68,7 +68,7 @@ count numeric The␠count␠of␠errors␠present␠in␠this␠dataflow␠exp query TTT SELECT name, type, comment FROM objects WHERE schema = 'mz_introspection' AND object = 'mz_compute_exports' ORDER BY position ---- -export_id text The␠ID␠of␠the␠index,␠materialized␠view,␠or␠subscription␠exported␠by␠the␠dataflow.␠Corresponds␠to␠`mz_catalog.mz_indexes.id`,␠`mz_catalog.mz_materialized_views.id`,␠or␠`mz_internal.mz_subscriptions`. +export_id text The␠ID␠of␠the␠index,␠materialized␠view,␠or␠subscription␠exported␠by␠the␠dataflow.␠Corresponds␠to␠`mz_catalog.mz_indexes.id`,␠`mz_catalog.mz_materialized_views.id`,␠or␠`mz_internal.mz_subscriptions.id`. dataflow_id uint8 The␠ID␠of␠the␠dataflow.␠Corresponds␠to␠`mz_dataflows.id`. 
query TTT diff --git a/test/sqllogictest/information_schema_tables.slt b/test/sqllogictest/information_schema_tables.slt index a3d24064b5116..ea3473555eeca 100644 --- a/test/sqllogictest/information_schema_tables.slt +++ b/test/sqllogictest/information_schema_tables.slt @@ -477,6 +477,22 @@ mz_objects_id_namespace_types VIEW materialize mz_internal +mz_ontology_entity_types +VIEW +materialize +mz_internal +mz_ontology_link_types +VIEW +materialize +mz_internal +mz_ontology_properties +VIEW +materialize +mz_internal +mz_ontology_semantic_types +VIEW +materialize +mz_internal mz_optimizer_notices BASE TABLE materialize diff --git a/test/sqllogictest/mz_ontology.slt b/test/sqllogictest/mz_ontology.slt new file mode 100644 index 0000000000000..9f1470bfabbd0 --- /dev/null +++ b/test/sqllogictest/mz_ontology.slt @@ -0,0 +1,148 @@ +# Copyright Materialize, Inc. and contributors. All rights reserved. +# +# Use of this software is governed by the Business Source License +# included in the LICENSE file at the root of this repository. +# +# As of the Change Date specified in that file, in accordance with +# the Business Source License, use of this software will be governed +# by the Apache License, Version 2.0. + +# Smoke tests for the mz_ontology built-in views in mz_internal. + +mode cockroach + +# Verify the four ontology views exist and have the expected columns. + +query TTTT +SELECT name, relation, properties->>'primary_key', description +FROM mz_internal.mz_ontology_entity_types +WHERE name = 'cluster' +---- +cluster mz_catalog.mz_clusters ["id"] A␠compute␠cluster␠that␠runs␠dataflows␠for␠sources,␠sinks,␠MVs,␠and␠indexes + +query TTT +SELECT name, sql_type, description +FROM mz_internal.mz_ontology_semantic_types +WHERE name = 'CatalogItemId' +---- +CatalogItemId text SQL-layer␠object␠ID.␠Format:␠s{n}/u{n}. 
+ +query TTTT +SELECT entity_type, column_name, semantic_type, description +FROM mz_internal.mz_ontology_properties +WHERE entity_type = 'cluster' AND column_name = 'id' +---- +cluster id ClusterId Materialize's␠unique␠ID␠for␠the␠cluster. + +query TTTTT +SELECT name, source_entity, target_entity, properties->>'kind', description +FROM mz_internal.mz_ontology_link_types +WHERE name = 'belongs_to_cluster' AND source_entity = 'replica' +---- +belongs_to_cluster replica cluster foreign_key NULL + +# Verify basic row counts are in expected ranges. + +query B +SELECT count(*) > 90 FROM mz_internal.mz_ontology_entity_types +---- +true + +query B +SELECT count(*) >= 15 FROM mz_internal.mz_ontology_semantic_types +---- +true + +query B +SELECT count(*) > 300 FROM mz_internal.mz_ontology_properties +---- +true + +query B +SELECT count(*) > 80 FROM mz_internal.mz_ontology_link_types +---- +true + +# Verify referential integrity: every entity_type in properties exists in entity_types. + +query I +SELECT count(*) +FROM mz_internal.mz_ontology_properties p +WHERE NOT EXISTS ( + SELECT 1 FROM mz_internal.mz_ontology_entity_types e + WHERE e.name = p.entity_type +) +---- +0 + +# Verify referential integrity: every semantic_type in properties exists in semantic_types. + +query I +SELECT count(*) +FROM mz_internal.mz_ontology_properties p +WHERE p.semantic_type IS NOT NULL + AND NOT EXISTS ( + SELECT 1 FROM mz_internal.mz_ontology_semantic_types s + WHERE s.name = p.semantic_type +) +---- +0 + +# Verify referential integrity: link type source/target entities exist in entity_types. 
+ +query I +SELECT count(*) +FROM mz_internal.mz_ontology_link_types l +WHERE NOT EXISTS ( + SELECT 1 FROM mz_internal.mz_ontology_entity_types e + WHERE e.name = l.source_entity +) +---- +0 + +query I +SELECT count(*) +FROM mz_internal.mz_ontology_link_types l +WHERE NOT EXISTS ( + SELECT 1 FROM mz_internal.mz_ontology_entity_types e + WHERE e.name = l.target_entity +) +---- +0 + +# Verify link_types properties column has structured JSON (not empty). + +query B +SELECT count(*) > 0 +FROM mz_internal.mz_ontology_link_types +WHERE properties->>'kind' IS NOT NULL +---- +true + +# Verify semantic type annotations are populated in properties. + +query B +SELECT count(*) > 200 +FROM mz_internal.mz_ontology_properties +WHERE semantic_type IS NOT NULL +---- +true + +# Verify key entity types exist. + +query T rowsort +SELECT name FROM mz_internal.mz_ontology_entity_types +WHERE name IN ('database', 'schema', 'role', 'cluster', 'replica', 'table', 'source', 'view', 'mv', 'index', 'sink', 'connection') +---- +cluster +connection +database +index +mv +replica +role +schema +sink +source +table +view diff --git a/test/sqllogictest/oid.slt b/test/sqllogictest/oid.slt index 75443f5286fcf..a9190b231a8e0 100644 --- a/test/sqllogictest/oid.slt +++ b/test/sqllogictest/oid.slt @@ -1181,3 +1181,7 @@ SELECT oid, name FROM mz_objects WHERE id LIKE 's%' AND oid < 20000 ORDER BY oid 17073 parse_catalog_create_sql 17074 redact_sql 17075 repeat_row_non_negative +17077 mz_ontology_entity_types +17078 mz_ontology_semantic_types +17079 mz_ontology_properties +17080 mz_ontology_link_types diff --git a/test/testdrive/catalog.td b/test/testdrive/catalog.td index 85e6ea6dfa101..7522066a62aef 100644 --- a/test/testdrive/catalog.td +++ b/test/testdrive/catalog.td @@ -653,6 +653,10 @@ mz_object_lifetimes "" mz_object_oid_alias "" mz_object_transitive_dependencies "" mz_objects_id_namespace_types "" +mz_ontology_entity_types "" +mz_ontology_link_types "" +mz_ontology_properties "" 
+mz_ontology_semantic_types "" mz_recent_activity_log "" mz_recent_activity_log_thinned "" mz_recent_activity_log_redacted ""
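
Reviewer sketch (not part of the patch): the `Lit` / `esc` / `values_sql` helpers in `src/catalog/src/builtin/ontology.rs` can be exercised on their own to see what the generated `VALUES` fragments look like. The block below copies those three items from the diff as a standalone file, with the generic parameters written out in full. The `example_fragment` function and its sample rows are hypothetical, added only for illustration; no Materialize crates are assumed.

```rust
/// A single typed SQL literal, copied from the `Lit` enum in the diff.
enum Lit {
    /// Rendered as `'escaped'`.
    Str(String),
    /// Rendered as `'escaped'::jsonb`.
    Json(String),
    /// Rendered as `NULL`.
    Null,
}

impl Lit {
    fn render(&self) -> String {
        match self {
            Lit::Str(s) => format!("'{}'", esc(s)),
            Lit::Json(s) => format!("'{}'::jsonb", esc(s)),
            Lit::Null => "NULL".to_string(),
        }
    }
}

/// Double every single quote. As in the diff, this is only suitable for
/// trusted compile-time constants, not for user-supplied input.
fn esc(s: &str) -> String {
    s.replace('\'', "''")
}

/// Render rows as `(c1,c2,...),(c1,c2,...)`. The caller supplies the
/// `VALUES` keyword itself, as `properties_view` does in the diff.
fn values_sql(rows: &[Vec<Lit>]) -> String {
    rows.iter()
        .map(|row| {
            let lits: Vec<String> = row.iter().map(Lit::render).collect();
            format!("({})", lits.join(","))
        })
        .collect::<Vec<_>>()
        .join(",")
}

/// Hypothetical sample input (not taken from the patch): one ordinary row
/// and one row whose text contains a single quote.
fn example_fragment() -> String {
    let rows = vec![
        vec![
            Lit::Str("cluster".to_string()),
            Lit::Json(r#"{"primary_key": ["id"]}"#.to_string()),
            Lit::Null,
        ],
        vec![
            Lit::Str("it's".to_string()),
            Lit::Json("{}".to_string()),
            Lit::Null,
        ],
    ];
    values_sql(&rows)
}
```

Calling `example_fragment()` yields `('cluster','{"primary_key": ["id"]}'::jsonb,NULL),('it''s','{}'::jsonb,NULL)`. The embedded quote in the second row is doubled, and the JSONB cast is attached per literal, so the outer `SELECT` built by `values_view` only has to cast whole columns.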