Generate schema v2 catalog fields from ddp sync catalog #17

@eric-tramel

Parent epic: #15

Depends on: #16, #21

Why

The catalog must remain generated, not hand-edited. Today ddp sync catalog renders schema v1 from package metadata, installed data_designer.plugins entry points, and direct data-designer dependency constraints. It needs to render the concrete schema v2 contract from #16 using the repo-level tap metadata from #21.

Implementation

  • Set CATALOG_SCHEMA_VERSION = 2.
  • Extend CatalogEntry with:
    • source: dict[str, object]
    • docs_url: str
  • Generate default NVIDIA entries with this source object because [tool.ddp.tap].default-source = "pypi":
      {"type": "pypi", "package": "<project.name>"}
  • Generate docs.url as:
      f"{docs_base_url.rstrip('/')}/plugins/{normalize_docs_slug(project_name)}/"
    where normalize_docs_slug should match the existing plugin docs slug logic.
  • Preserve current v1 fields exactly unless #16 (Define the schema v2 tap catalog contract) says otherwise: name, plugin_type, description, package, entry_point, and compatibility.
  • Keep compatibility.data_designer.requirement, specifier, and marker; the PDF example omits requirement, but the current field is useful for explanations and should remain.
  • Add duplicate runtime plugin-name detection before rendering JSON.
  • Add source-object validation for the default pypi source.
  • Regenerate catalog/plugins.json.
  • Update devtools/ddp/tests/test_catalog.py expected output to schema v2.
  • Add a multi-entry fixture test where two entry points in one package share identical package and source metadata.
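
The source, docs URL, validation, and duplicate-detection steps above could be sketched roughly as follows. This is a minimal illustration, not the actual ddp implementation: the function names build_source, build_docs_url, validate_pypi_source, and find_duplicate_plugin_names are hypothetical, and the body of normalize_docs_slug assumes PEP 503-style normalization, whereas the real helper should reuse the existing plugin docs slug logic.

```python
import re
from collections import Counter

# Mirrors [tool.ddp.tap].default-source = "pypi" from the tap metadata.
DEFAULT_SOURCE_TYPE = "pypi"


def normalize_docs_slug(project_name: str) -> str:
    # ASSUMPTION: PEP 503-style normalization (runs of -, _, . become one "-",
    # lowercased). The real helper should match the existing docs slug logic.
    return re.sub(r"[-_.]+", "-", project_name).lower()


def build_source(project_name: str) -> dict[str, object]:
    # Default source object for an entry published on PyPI.
    return {"type": DEFAULT_SOURCE_TYPE, "package": project_name}


def build_docs_url(docs_base_url: str, project_name: str) -> str:
    # Same expression as in the issue text; tolerates a trailing slash on the base.
    return f"{docs_base_url.rstrip('/')}/plugins/{normalize_docs_slug(project_name)}/"


def validate_pypi_source(source: dict[str, object], package_name: str) -> None:
    # Hypothetical validation: the default source must be pypi and must
    # point at the same distribution as package.name.
    if source.get("type") != "pypi":
        raise ValueError(f"unsupported source type: {source.get('type')!r}")
    if source.get("package") != package_name:
        raise ValueError(
            f"source.package {source.get('package')!r} != package.name {package_name!r}"
        )


def find_duplicate_plugin_names(names: list[str]) -> list[str]:
    # Returns the runtime plugin names that occur more than once, sorted so
    # the resulting error message is deterministic.
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)
```

A generator could call find_duplicate_plugin_names on all entry-point names before rendering JSON and abort when the result is non-empty, so a stale or conflicting catalog is never written.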

Expected generated entry

For the template plugin, schema v2 output should be structurally equivalent to:

{
  "name": "text-transform",
  "plugin_type": "column-generator",
  "description": "Template Data Designer plugin — text transform column generator",
  "package": {
    "name": "data-designer-template",
    "version": "0.1.0",
    "path": "plugins/data-designer-template"
  },
  "entry_point": {
    "group": "data_designer.plugins",
    "name": "text-transform",
    "value": "data_designer_template.plugin:plugin"
  },
  "compatibility": {
    "python": {"specifier": ">=3.10"},
    "data_designer": {
      "requirement": "data-designer>=0.5.7",
      "specifier": ">=0.5.7",
      "marker": null
    }
  },
  "source": {
    "type": "pypi",
    "package": "data-designer-template"
  },
  "docs": {
    "url": "https://nvidia-nemo.github.io/DataDesignerPlugins/plugins/data-designer-template/"
  }
}

Acceptance criteria

  • make catalog emits schema_version: 2.
  • Every generated plugin entry has source.type = "pypi", source.package = package.name, and a docs.url derived from [tool.ddp.tap].docs-base-url.
  • Existing compatibility fields remain generated from direct versioned data-designer dependencies.
  • Duplicate runtime plugin names fail catalog generation.
  • Multi-entry packages render as multiple entries sharing one package/source identity.
  • uv run pytest devtools/ddp/tests/test_catalog.py -q passes.
  • make check-catalog fails when checked-in catalog output is stale.
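
Several of these criteria can be expressed as invariants over the rendered catalog. The sketch below is one possible shape for such a check, not the project's test code: check_catalog is a hypothetical name, and the top-level "plugins" key is an assumption about the catalog layout, which this issue does not spell out.

```python
def check_catalog(catalog: dict, docs_base_url: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means the checks pass."""
    problems: list[str] = []

    if catalog.get("schema_version") != 2:
        problems.append("schema_version must be 2")

    docs_prefix = docs_base_url.rstrip("/") + "/plugins/"
    seen_names: set[str] = set()

    # ASSUMPTION: entries live under a top-level "plugins" list.
    for entry in catalog.get("plugins", []):
        name = entry["name"]
        if name in seen_names:
            problems.append(f"duplicate runtime plugin name: {name}")
        seen_names.add(name)

        source = entry.get("source", {})
        if source.get("type") != "pypi":
            problems.append(f"{name}: source.type must be 'pypi'")
        if source.get("package") != entry["package"]["name"]:
            problems.append(f"{name}: source.package must equal package.name")

        docs_url = entry.get("docs", {}).get("url", "")
        if not docs_url.startswith(docs_prefix):
            problems.append(f"{name}: docs.url is not derived from docs-base-url")

    return problems
```

A multi-entry package passes this check as long as its entries have distinct plugin names; sharing the same package and source objects is expected rather than flagged.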
