diff --git a/.agents/agents/docs-searcher.md b/.agents/agents/docs-searcher.md index 814254189..f150cdf7f 100644 --- a/.agents/agents/docs-searcher.md +++ b/.agents/agents/docs-searcher.md @@ -63,7 +63,7 @@ Brief summary of what was found and any recommendations for the user. - Only include results that are actually relevant to the search topic - If no relevant documentation is found, clearly state that - Keep excerpts concise but include enough context to be useful -- Prioritize user guides and examples over API reference when both exist +- Prioritize user guides, concepts, tutorials, and recipes according to the user's task - If the docs/ folder doesn't exist or is empty, report that clearly ## Search Strategy diff --git a/.agents/recipes/code-quality/recipe.md b/.agents/recipes/code-quality/recipe.md index 268e0adff..a8e0f8577 100644 --- a/.agents/recipes/code-quality/recipe.md +++ b/.agents/recipes/code-quality/recipe.md @@ -152,7 +152,7 @@ Examples of things to test (pick 2-3 per run, and invent new ones): - Column names with special characters or very long strings - Recently changed validators (check `git log --oneline -10 -- packages/*/src/data_designer/config/`) -**API reference:** +**Useful imports:** ```python from data_designer.config.config_builder import DataDesignerConfigBuilder diff --git a/.agents/recipes/docs-and-references/recipe.md b/.agents/recipes/docs-and-references/recipe.md index 45a54b324..e640a47be 100644 --- a/.agents/recipes/docs-and-references/recipe.md +++ b/.agents/recipes/docs-and-references/recipe.md @@ -101,9 +101,6 @@ Review for accuracy against the current code: the most recent 3-5 posts for references to functions, classes, or architecture that have since been modified. -**Code reference** (`docs/code_reference/`): -- Check that autodoc module paths point to modules that still exist. - **Prioritize by risk of drift**: pages with the most code symbols referenced are most likely to be stale. Don't read every page - sample 5-10 high-value pages and flag patterns. diff --git a/.agents/recipes/test-health/recipe.md b/.agents/recipes/test-health/recipe.md index 2224684ff..d47ffa963 100644 --- a/.agents/recipes/test-health/recipe.md +++ b/.agents/recipes/test-health/recipe.md @@ -208,7 +208,7 @@ without at least one provider configured. Stick to config-layer checks (`DataDesignerConfigBuilder.build()`, column type resolution) which do not require providers. -**API reference** for writing checks: +**Useful imports** for writing checks: ```python from data_designer.config.config_builder import DataDesignerConfigBuilder diff --git a/.agents/skills/datadesigner-docs/SKILL.md b/.agents/skills/datadesigner-docs/SKILL.md index d2a49ae1c..68019f382 100644 --- a/.agents/skills/datadesigner-docs/SKILL.md +++ b/.agents/skills/datadesigner-docs/SKILL.md @@ -4,8 +4,8 @@ description: > Maintain the NeMo Data Designer Fern docs site under fern/. Use for any documentation change. Triggered by: "edit docs", "add doc page", "update docs", "rename page", "fix broken link", "add redirect", "preview docs", - "publish docs", "regenerate notebooks", "update dev note", "add API - reference", any request that touches `fern/`. + "publish docs", "regenerate notebooks", "update dev note", any request + that touches `fern/`. --- # Data Designer Docs Maintenance @@ -16,7 +16,7 @@ Current URL: **`datadesigner.docs.buildwithfern.com/nemo/datadesigner`** (see `i ## Scope Rule -**ALL doc edits happen under `fern/`.** The legacy `docs/` directory is the original MkDocs source. `docs/notebook_source/*.py` remains canonical for notebook code, but **do not add new top-level prose pages under `docs/`**. Concept pages, recipes, plugins, code reference, and Dev Notes prose live under `fern/versions/latest/pages/`. +**ALL doc edits happen under `fern/`.** The legacy `docs/` directory is the original MkDocs source. `docs/notebook_source/*.py` remains canonical for notebook code, but **do not add new top-level prose pages under `docs/`**. Concept pages, recipes, plugins, and Dev Notes prose live under `fern/versions/latest/pages/`. ## Versioning Model @@ -39,7 +39,7 @@ For future Fern-native releases, do not copy page trees by hand on `main`. The r ``` fern/ ├── README.md ← maintainer cheat sheet -├── docs.yml ← title, theme, versions:, libraries:, redirects, custom-domain +├── docs.yml ← title, theme, versions:, redirects, custom-domain ├── fern.config.json ← organization + fern-api version pin ├── main.css ← bundled NVIDIA theme CSS ├── assets/ ← logos, favicon, recipe assets, devnote post images (shared) @@ -56,7 +56,6 @@ fern/ │ └── devnotes/ ← .authors.yml, authors-data.ts, per-post trajectory data ├── scripts/ │ └── ipynb-to-fern-json.py ← .ipynb → fern/components/notebooks/*.{json,ts} -├── code-reference/ ← gitignored; populated by `fern docs md generate` └── versions/ ├── latest.yml ← authoring navigation tree └── latest/pages/ ← authoring MDX content @@ -401,47 +400,6 @@ import notebook from "@/components/notebooks/1-the-basics"; The converter (`fern/scripts/ipynb-to-fern-json.py`) **auto-strips the leading Colab badge cell** — `` renders its own banner from the `colabUrl` prop. Don't manually re-add it. -## Python API Reference (`libraries:`) - -`docs.yml` keeps a Fern-native `libraries:` block for the config package. Local generation uses `py2fern` through `make generate-fern-api-reference` and writes multiple gitignored trees under `fern/code-reference/`: - -- `data-designer/` for `data_designer.config` -- `interface/` for `data_designer.interface` -- `engine/seed-readers/` -- `engine/processors/` -- `engine/mcp/` -- `engine/column-generators/` - -To populate locally: - -```bash -make generate-fern-api-reference -``` - -This does not require Fern auth. Re-run when the upstream Python source changes. If you need to compare with Fern's native generator, use `make generate-fern-api-reference-native` with Fern auth. - -The generated trees are wired into `versions/latest.yml` under `Code Reference`: - -- `Config` contains prose pages plus `Config API` from `../code-reference/data-designer/data_designer/config` -- `Interface` contains prose pages plus `Interface API` from `../code-reference/interface/data_designer/interface` -- `Engine Extension API` contains prose pages plus the seed reader, processor, MCP runtime, and column generator API folders - -There is no `Topic Overviews` section. Prose reference pages live beside the generated folders under `fern/versions/latest/pages/code_reference/`. - -To add another generated package, update the `generate-fern-api-reference` target and add the matching `folder:` entry under the right `Code Reference` section. Only add a `libraries:` entry when Fern's native generator should know about that source: - -```yaml -libraries: - data-designer: - input: - git: https://github.com/NVIDIA-NeMo/DataDesigner - subpath: packages/data-designer-config/src/data_designer/config - output: { path: ./code-reference/data-designer } - lang: python -``` - -Pyright needs a regular Python package (with `__init__.py`). The `data_designer` namespace itself is PEP 420 (no `__init__.py`), so always point at a sub-package one level deeper. - ## MDX Gotchas (the ones that bit during migration) | Pattern | Problem | Fix | @@ -467,14 +425,6 @@ fern docs dev # localhost:3000 hot-reload preview `fern check` must pass before commit. The local broken-link checker has known false positives — it computes URLs from file paths instead of from slugified nav titles, so cross-section absolute links sometimes flag incorrectly. Spot-check by clicking through the dev server. -To generate the API reference for local preview: - -```bash -make generate-fern-api-reference # py2fern; populates fern/code-reference/ (gitignored) -``` - -If the "Python API" sidebar folder is empty, you forgot this step. - ## Commit & Preview ```bash diff --git a/.github/workflows/docs-preview.yml b/.github/workflows/docs-preview.yml index b9e40ef0c..8a61fb8e7 100644 --- a/.github/workflows/docs-preview.yml +++ b/.github/workflows/docs-preview.yml @@ -88,8 +88,7 @@ jobs: cd "$FERN_PREVIEW_ROOT" make check-fern-docs \ DOCS_PYTHON="$GITHUB_WORKSPACE/.venv/bin/python" \ - DOCS_JUPYTEXT="$GITHUB_WORKSPACE/.venv/bin/jupytext" \ - DOCS_PY2FERN="$GITHUB_WORKSPACE/.venv/bin/py2fern" + DOCS_JUPYTEXT="$GITHUB_WORKSPACE/.venv/bin/jupytext" - name: Skip hosted previews for fork PRs if: github.event.pull_request.head.repo.full_name != github.repository diff --git a/.gitignore b/.gitignore index e7f8718b7..f1f5a010e 100644 --- a/.gitignore +++ b/.gitignore @@ -107,10 +107,6 @@ packages/data-designer/README.md # Claude worktrees .claude/worktrees/ -# Fern libraries output — generated by `fern docs md generate` from `libraries:` in fern/docs.yml. -# Regenerate locally (no token needed) or in CI before publishing. -fern/code-reference/ - # Fern notebook output - generated by `make generate-fern-notebooks`. fern/components/notebooks/*.json fern/components/notebooks/*.ts diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index c996d3c7f..f352759c3 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -79,7 +79,7 @@ Data Designer is migrating from MkDocs to Fern over several releases. Until the - Use `make serve-docs-locally` to preview the legacy MkDocs site. - Use `make check-fern-docs` to regenerate local Fern artifacts and validate the Fern site. - Fern release publishing snapshots versioned docs into the CI-managed `docs-website` branch automatically. -- Do not commit generated Fern API reference or notebook artifacts. +- Do not commit generated notebook artifacts. --- diff --git a/Makefile b/Makefile index 962662014..5ba6c3d3c 100644 --- a/Makefile +++ b/Makefile @@ -80,8 +80,6 @@ help: @echo " generate-colab-notebooks - Generate Colab-compatible notebooks" @echo " generate-fern-notebooks - Convert docs/notebook_source/*.py → fern/components/notebooks/{json,ts}" @echo " generate-fern-notebooks-with-outputs - Full pipeline: execute notebooks (needs API key), colabify, convert to Fern" - @echo " generate-fern-api-reference - Generate local Fern API reference with py2fern" - @echo " generate-fern-api-reference-native - Generate Fern API reference with Fern CLI (requires auth)" @echo " install-docs-deps - Install docs and notebook dependencies" @echo " prepare-fern-release VERSION=X.Y.Z - Add or refresh Fern version files for release preview" @echo " check-fern-release-version VERSION=X.Y.Z - Verify Fern has a version entry for release publishing" @@ -474,15 +472,6 @@ DOCS_PYTHON_VERSION ?= 3.13 DOCS_PYTHON ?= .venv/bin/python DOCS_JUPYTEXT ?= .venv/bin/jupytext DOCS_MKDOCS ?= .venv/bin/mkdocs -DOCS_PY2FERN ?= .venv/bin/py2fern -FERN_API_REFERENCE_OUTPUT ?= fern/code-reference -FERN_API_REFERENCE_CONFIG_OUTPUT ?= $(FERN_API_REFERENCE_OUTPUT)/data-designer -FERN_API_REFERENCE_CONFIG_SOURCE ?= packages/data-designer-config/src/data_designer/config -FERN_API_REFERENCE_INTERFACE_SOURCE ?= packages/data-designer/src/data_designer/interface -FERN_API_REFERENCE_ENGINE_COLUMN_GENERATORS_SOURCE ?= packages/data-designer-engine/src/data_designer/engine/column_generators/generators/base.py -FERN_API_REFERENCE_ENGINE_MCP_SOURCE ?= packages/data-designer-engine/src/data_designer/engine/mcp -FERN_API_REFERENCE_ENGINE_PROCESSORS_SOURCE ?= packages/data-designer-engine/src/data_designer/engine/processing/processors -FERN_API_REFERENCE_ENGINE_SEED_READERS_SOURCE ?= packages/data-designer-engine/src/data_designer/engine/resources/seed_reader.py FERN_VERSION ?= $(shell jq -r .version fern/fern.config.json) FERN ?= npx -y fern-api@$(FERN_VERSION) @@ -501,21 +490,6 @@ serve-docs-locally: @echo "📝 Building and serving docs (Python $(DOCS_PYTHON_VERSION))..." $(DOCS_MKDOCS) serve --livereload -generate-fern-api-reference: - @echo "📚 Generating Fern API reference with py2fern ($(DOCS_PY2FERN))..." - @rm -rf $(FERN_API_REFERENCE_OUTPUT) - $(DOCS_PY2FERN) write $(FERN_API_REFERENCE_CONFIG_SOURCE) --module data_designer.config --output $(FERN_API_REFERENCE_CONFIG_OUTPUT) --clean - $(DOCS_PY2FERN) write $(FERN_API_REFERENCE_INTERFACE_SOURCE) --module data_designer.interface --output $(FERN_API_REFERENCE_OUTPUT)/interface --clean - $(DOCS_PY2FERN) write $(FERN_API_REFERENCE_ENGINE_COLUMN_GENERATORS_SOURCE) --module data_designer.engine.column_generators.generators.base --output $(FERN_API_REFERENCE_OUTPUT)/engine/column-generators --clean - $(DOCS_PY2FERN) write $(FERN_API_REFERENCE_ENGINE_MCP_SOURCE) --module data_designer.engine.mcp --output $(FERN_API_REFERENCE_OUTPUT)/engine/mcp --clean - $(DOCS_PY2FERN) write $(FERN_API_REFERENCE_ENGINE_PROCESSORS_SOURCE) --module data_designer.engine.processing.processors --output $(FERN_API_REFERENCE_OUTPUT)/engine/processors --clean - $(DOCS_PY2FERN) write $(FERN_API_REFERENCE_ENGINE_SEED_READERS_SOURCE) --module data_designer.engine.resources.seed_reader --output $(FERN_API_REFERENCE_OUTPUT)/engine/seed-readers --clean - $(DOCS_PYTHON) fern/scripts/normalize-py2fern-indexes.py $(FERN_API_REFERENCE_OUTPUT) - -generate-fern-api-reference-native: - @echo "📚 Generating Fern API reference with Fern CLI..." - cd fern && $(FERN) docs md generate - prepare-fern-release: ifndef VERSION $(error VERSION is required, e.g. make prepare-fern-release VERSION=0.5.10) @@ -528,7 +502,7 @@ ifndef VERSION endif $(DOCS_PYTHON) fern/scripts/fern-release-version.py check --version $(VERSION) $(if $(REQUIRE_LATEST),--require-latest-matches-release,) -prepare-fern-docs: generate-fern-api-reference generate-fern-notebooks +prepare-fern-docs: generate-fern-notebooks @echo "✅ Fern local artifacts ready" check-fern-docs: prepare-fern-docs @@ -759,7 +733,7 @@ clean-test-coverage: coverage coverage-config coverage-engine coverage-interface \ format format-check format-check-config format-check-engine format-check-interface \ format-config format-engine format-interface \ - generate-colab-notebooks generate-fern-api-reference generate-fern-api-reference-native generate-fern-notebooks generate-fern-notebooks-with-outputs help \ + generate-colab-notebooks generate-fern-notebooks generate-fern-notebooks-with-outputs help \ install install-dev install-dev-notebooks install-dev-recipes install-docs-deps \ lint lint-config lint-engine lint-fix lint-fix-config lint-fix-engine lint-fix-interface lint-interface \ perf-import perf-import-runtime prepare-fern-docs prepare-fern-release publish serve-docs-locally serve-fern-docs-locally show-versions \ diff --git a/docs/code_reference/config/analysis.md b/docs/code_reference/config/analysis.md deleted file mode 100644 index fa59221a0..000000000 --- a/docs/code_reference/config/analysis.md +++ /dev/null @@ -1,31 +0,0 @@ -# Analysis - -Profiling result objects and report helpers returned after generation. - -## Column Statistics - -`DataDesigner.create()` and `DataDesigner.preview()` run the dataset profiler after generation. The profiler computes statistics for each configured column; side-effect columns are recorded separately in `DatasetProfilerResults.side_effect_column_names`. - -Statistics result classes store computed metrics for each column type and format those metrics for reports. - -::: data_designer.config.analysis.column_statistics - -## Column Profilers - -Column profilers are optional analysis tools that provide deeper insights into specific column types. Currently, the only column profiler available is the Judge Score Profiler. - -Profiler result classes store computed profiler output and format it for reports. - -::: data_designer.config.analysis.column_profilers - -## Dataset Profiler - -The [DatasetProfilerResults](#data_designer.config.analysis.dataset_profiler.DatasetProfilerResults) class stores profiling results for a generated dataset. It aggregates column-level statistics, side-effect column names, and optional profiler results, and provides methods to: - -- Compute dataset-level metrics (completion percentage, column type summary) -- Filter statistics by column type -- Generate formatted analysis reports via the `to_report()` method - -Reports can be displayed in the console or exported to HTML/SVG formats. - -::: data_designer.config.analysis.dataset_profiler diff --git a/docs/code_reference/config/column_configs.md b/docs/code_reference/config/column_configs.md deleted file mode 100644 index 4ff2e8f2f..000000000 --- a/docs/code_reference/config/column_configs.md +++ /dev/null @@ -1,18 +0,0 @@ -# Column Configurations - -Column configs declare Data Designer's built-in column types. Each configuration inherits from [SingleColumnConfig](#data_designer.config.base.SingleColumnConfig), which provides shared arguments like the column `name`, whether to `drop` the column after generation, and the `column_type`. - -For column generator implementation classes, see [column_generators](../engine/column_generators.md). - -!!! info "`column_type` is a discriminator field" - The `column_type` argument is used to identify column types when deserializing the [Data Designer Config](data_designer_config.md) from JSON/YAML. It acts as the discriminator in a [discriminated union](https://docs.pydantic.dev/latest/concepts/unions/#discriminated-unions), allowing Pydantic to automatically determine which column configuration class to instantiate. - -## `SingleColumnConfig` {#data_designer.config.base.SingleColumnConfig} - -::: data_designer.config.base.SingleColumnConfig - options: - show_root_toc_entry: false - -## Column configurations - -::: data_designer.config.column_configs diff --git a/docs/code_reference/config/config_builder.md b/docs/code_reference/config/config_builder.md deleted file mode 100644 index 1aad978ae..000000000 --- a/docs/code_reference/config/config_builder.md +++ /dev/null @@ -1,10 +0,0 @@ -# Data Designer's Config Builder - -Use [DataDesignerConfigBuilder](#data_designer.config.config_builder.DataDesignerConfigBuilder) to construct [DataDesignerConfig](data_designer_config.md#data_designer.config.data_designer_config.DataDesignerConfig) objects. The builder accumulates model configs, tool configs, column configs, constraints, seed settings, processors, and profilers. - -Inputs can come from scratch, a `dict`, [BuilderConfig](#data_designer.config.config_builder.BuilderConfig), a local YAML/JSON file, or an HTTP(S) YAML/JSON URL via [`from_config()`](#data_designer.config.config_builder.DataDesignerConfigBuilder.from_config). Use [`build()`](#data_designer.config.config_builder.DataDesignerConfigBuilder.build) to create a [DataDesignerConfig](data_designer_config.md#data_designer.config.data_designer_config.DataDesignerConfig), or [`write_config()`](#data_designer.config.config_builder.DataDesignerConfigBuilder.write_config) to serialize the current builder config to YAML or JSON. - -!!! info "Model config loading" - [DataDesignerConfigBuilder](#data_designer.config.config_builder.DataDesignerConfigBuilder) accepts model configs as a list of [ModelConfig](models.md#data_designer.config.models.ModelConfig) objects, a YAML/JSON config path, or `None`. When `model_configs=None`, the builder loads default model configs if Data Designer can run locally; otherwise initialization raises BuilderConfigurationError. Model configs define the aliases referenced by model-backed columns such as [`LLMTextColumnConfig`](column_configs.md#data_designer.config.column_configs.LLMTextColumnConfig), [`LLMCodeColumnConfig`](column_configs.md#data_designer.config.column_configs.LLMCodeColumnConfig), [`LLMStructuredColumnConfig`](column_configs.md#data_designer.config.column_configs.LLMStructuredColumnConfig), [`LLMJudgeColumnConfig`](column_configs.md#data_designer.config.column_configs.LLMJudgeColumnConfig), [`EmbeddingColumnConfig`](column_configs.md#data_designer.config.column_configs.EmbeddingColumnConfig), and [`ImageColumnConfig`](column_configs.md#data_designer.config.column_configs.ImageColumnConfig). - -::: data_designer.config.config_builder diff --git a/docs/code_reference/config/data_designer_config.md b/docs/code_reference/config/data_designer_config.md deleted file mode 100644 index d6329a9fa..000000000 --- a/docs/code_reference/config/data_designer_config.md +++ /dev/null @@ -1,7 +0,0 @@ -# Data Designer Configuration - -[DataDesignerConfig](#data_designer.config.data_designer_config.DataDesignerConfig) is the top-level configuration object passed to Data Designer. It declares the columns to generate and may include model configs, tool configs, seed settings, sampler constraints, processors, and profiler configs. - -Prefer [DataDesignerConfigBuilder](config_builder.md#data_designer.config.config_builder.DataDesignerConfigBuilder) for programmatic construction. Direct [DataDesignerConfig](#data_designer.config.data_designer_config.DataDesignerConfig) instantiation is also supported. - -::: data_designer.config.data_designer_config diff --git a/docs/code_reference/config/index.md b/docs/code_reference/config/index.md deleted file mode 100644 index 1ec8b4de0..000000000 --- a/docs/code_reference/config/index.md +++ /dev/null @@ -1,7 +0,0 @@ -# Config Package - -The `data-designer-config` package provides `data_designer.config`, the configuration layer of Data Designer. It contains the objects used to describe dataset structure, model access, tool access, seed data, sampler parameters, validators, processors, run settings, plugin registrations, and analysis results. - -This package is the base of the dependency chain. Engine and interface code consume these config objects, but config objects do not execute generation directly. - -For programmatic configuration work, start with [config_builder](config_builder.md) and [data_designer_config](data_designer_config.md). Use the narrower pages for exact constructor fields for columns, models, MCP tools, seeds, processors, samplers, validators, or profiling results. diff --git a/docs/code_reference/config/mcp.md b/docs/code_reference/config/mcp.md deleted file mode 100644 index 49b6f5cfb..000000000 --- a/docs/code_reference/config/mcp.md +++ /dev/null @@ -1,16 +0,0 @@ -# MCP Configuration - -MCP config objects tell Data Designer which Model Context Protocol providers exist and which tools an LLM column may use. - -[MCPProvider](#data_designer.config.mcp.MCPProvider) configures remote MCP servers via SSE or Streamable HTTP transport. [LocalStdioMCPProvider](#data_designer.config.mcp.LocalStdioMCPProvider) configures local MCP servers as subprocesses via stdio transport. [ToolConfig](#data_designer.config.mcp.ToolConfig) sets which tools are available for LLM columns and how they are constrained. - -For MCP execution internals, see [Engine MCP](../engine/mcp.md). Related guides: - -- **[MCP Providers](../../concepts/mcp/mcp-providers.md)** - Configure local or remote MCP providers -- **[Tool Configs](../../concepts/mcp/tool-configs.md)** - Define tool permissions and limits -- **[Enabling Tools](../../concepts/mcp/enabling-tools.md)** - Use tools in LLM columns -- **[Traces](../../concepts/traces.md)** - Capture full conversation history - -## API Reference - -::: data_designer.config.mcp diff --git a/docs/code_reference/config/models.md b/docs/code_reference/config/models.md deleted file mode 100644 index e14e8cfdb..000000000 --- a/docs/code_reference/config/models.md +++ /dev/null @@ -1,12 +0,0 @@ -# Models - -[ModelProvider](#data_designer.config.models.ModelProvider) stores connection and authentication details for model providers. [ModelConfig](#data_designer.config.models.ModelConfig) stores a model alias, model identifier, provider settings, and inference parameters. [Inference Parameters](../../concepts/models/inference-parameters.md) control model behavior. Chat-completion parameters include `temperature`, `top_p`, and `max_tokens`; `temperature` and `top_p` can be fixed values or configured distributions. [ImageContext](#data_designer.config.models.ImageContext) provides image inputs to multimodal models, and [ImageInferenceParams](#data_designer.config.models.ImageInferenceParams) configures image generation models. - -Related guides: - -- **[Model Providers](../../concepts/models/model-providers.md)** -- **[Model Configs](../../concepts/models/model-configs.md)** -- **[Image Context](../../notebooks/4-providing-images-as-context.ipynb)** -- **[Generating Images](../../notebooks/5-generating-images.ipynb)** - -::: data_designer.config.models diff --git a/docs/code_reference/config/plugins.md b/docs/code_reference/config/plugins.md deleted file mode 100644 index 93f4533de..000000000 --- a/docs/code_reference/config/plugins.md +++ /dev/null @@ -1,17 +0,0 @@ -# Plugins - -Plugin packages register [Plugin](#data_designer.plugins.plugin.Plugin) objects through entry points in the `data_designer.plugins` group. A plugin registration ties a config class to its implementation class and declares its [PluginType](#data_designer.plugins.plugin.PluginType). - -Related pages: [Build Your Own](../../plugins/build_your_own.md), [Column Generators](../engine/column_generators.md), [Seed Readers](../engine/seed_readers.md), [Engine Processors](../engine/processors.md), and [Processor Configurations](processors.md). - -## `Plugin` {#data_designer.plugins.plugin.Plugin} - -::: data_designer.plugins.plugin.Plugin - options: - show_root_toc_entry: false - -## `PluginType` {#data_designer.plugins.plugin.PluginType} - -::: data_designer.plugins.plugin.PluginType - options: - show_root_toc_entry: false diff --git a/docs/code_reference/config/processors.md b/docs/code_reference/config/processors.md deleted file mode 100644 index a1795643b..000000000 --- a/docs/code_reference/config/processors.md +++ /dev/null @@ -1,7 +0,0 @@ -# Processor Configurations - -Processor configs request data transformations after generation. Add them to a `DataDesignerConfig` or `DataDesignerConfigBuilder`; the engine later compiles them into runtime processor implementations. - -Related pages: [engine processors](../engine/processors.md) and [Build Your Own](../../plugins/build_your_own.md). - -::: data_designer.config.processors diff --git a/docs/code_reference/config/run_config.md b/docs/code_reference/config/run_config.md deleted file mode 100644 index f39dbb7f3..000000000 --- a/docs/code_reference/config/run_config.md +++ /dev/null @@ -1,29 +0,0 @@ -# Run Config - -`RunConfig` controls dataset generation behavior, including early shutdown thresholds, -batch sizing, non-inference worker concurrency, and the Jinja rendering engine used by -the runtime. - -`JinjaRenderingEngine.SECURE` is the default. Set `JinjaRenderingEngine.NATIVE` -when you want Jinja2's broader built-in sandbox behavior instead of Data Designer's -hardened renderer. - -For guidance on when to use each mode, see [Security](../../concepts/security.md). - -## Usage - -```python -import data_designer.config as dd -from data_designer.interface import DataDesigner - -data_designer = DataDesigner() -data_designer.set_run_config(dd.RunConfig( - buffer_size=500, - max_conversation_restarts=3, - jinja_rendering_engine=dd.JinjaRenderingEngine.NATIVE, -)) -``` - -## API Reference - -::: data_designer.config.run_config diff --git a/docs/code_reference/config/sampler_params.md b/docs/code_reference/config/sampler_params.md deleted file mode 100644 index 751fc604d..000000000 --- a/docs/code_reference/config/sampler_params.md +++ /dev/null @@ -1,12 +0,0 @@ -# Sampler Parameters - -Sampler parameter classes configure Data Designer's built-in samplers. Use them in [SamplerColumnConfig](column_configs.md#data_designer.config.column_configs.SamplerColumnConfig) to specify how sampled column values are generated. - -!!! tip "Displaying available samplers and their parameters" - The config builder has an `info` attribute that can be used to display the - available sampler types and their parameters: - ```python - config_builder.info.display("samplers") - ``` - -::: data_designer.config.sampler_params diff --git a/docs/code_reference/config/seeds.md b/docs/code_reference/config/seeds.md deleted file mode 100644 index a3b77ac64..000000000 --- a/docs/code_reference/config/seeds.md +++ /dev/null @@ -1,19 +0,0 @@ -# Seeds - -Seed configs declare existing data used as input during generation. A [SeedConfig](#data_designer.config.seed.SeedConfig) combines a seed source with optional row sampling and selection settings. Seed source objects declare where seed data comes from; the engine reads them through seed readers. - -Use these objects with `DataDesignerConfigBuilder.with_seed_dataset()`. Related pages: [Seed Datasets](../../concepts/seed-datasets.md) and [seed readers](../engine/seed_readers.md). - -Built-in seed sources include local files, Hugging Face paths, in-memory DataFrames, directories, file contents, and agent rollout traces. Plugin seed sources can extend the same discriminated union through the plugin system. - -## Seed Config - -::: data_designer.config.seed - -## Built-In Seed Sources - -::: data_designer.config.seed_source - -## DataFrame Seed Source - -::: data_designer.config.seed_source_dataframe diff --git a/docs/code_reference/config/validator_params.md b/docs/code_reference/config/validator_params.md deleted file mode 100644 index c69773da6..000000000 --- a/docs/code_reference/config/validator_params.md +++ /dev/null @@ -1,6 +0,0 @@ -# Validator Parameters - -`ValidationColumnConfig` selects a validator with `validator_type` and configures it with `validator_params`. -The `validator_type` field can be `code`, `local_callable`, or `remote`. The matching `validator_params` objects are: - -::: data_designer.config.validator_params diff --git a/docs/code_reference/engine/column_generators.md b/docs/code_reference/engine/column_generators.md deleted file mode 100644 index b2aff0ce1..000000000 --- a/docs/code_reference/engine/column_generators.md +++ /dev/null @@ -1,53 +0,0 @@ -# Column Generators - -Column generators execute column generation in the Data Designer engine. A generator receives the upstream data needed for its task, returns row or batch data with generated values added, and reports the generation strategy the scheduler should use. - -Related pages: [column_configs](../config/column_configs.md), [Build Your Own](../../plugins/build_your_own.md), [Using Models in Plugins](../../plugins/models.md), and [Custom Columns](../../concepts/custom_columns.md). - -## Configuration - -User-facing column configs inherit from [SingleColumnConfig](../config/column_configs.md#data_designer.config.base.SingleColumnConfig) and define a unique `column_type` discriminator. During compilation, the engine may group related configs into multi-column configs for generators that create sampler or seed columns together. - -## Generation strategy - -Column generator base classes return [GenerationStrategy](../config/column_configs.md#data_designer.config.column_configs.GenerationStrategy) values to tell the engine whether they run per row or over a full batch. - -## Implementation bases - -Generators that operate on a full batch can inherit from [ColumnGeneratorFullColumn](#data_designer.engine.column_generators.generators.base.ColumnGeneratorFullColumn). Row-oriented non-model generators can inherit from [ColumnGeneratorCellByCell](#data_designer.engine.column_generators.generators.base.ColumnGeneratorCellByCell). Generators that create initial rows use [FromScratchColumnGenerator](#data_designer.engine.column_generators.generators.base.FromScratchColumnGenerator). Model-backed plugin generators should use [ColumnGeneratorWithModelRegistry](#data_designer.engine.column_generators.generators.base.ColumnGeneratorWithModelRegistry) or [ColumnGeneratorWithModel](#data_designer.engine.column_generators.generators.base.ColumnGeneratorWithModel); see [Using Models in Plugins](../../plugins/models.md) for authoring guidance. - -### `ColumnGenerator` {#data_designer.engine.column_generators.generators.base.ColumnGenerator} - -::: data_designer.engine.column_generators.generators.base.ColumnGenerator - options: - show_root_toc_entry: false - -### `ColumnGeneratorFullColumn` {#data_designer.engine.column_generators.generators.base.ColumnGeneratorFullColumn} - -::: data_designer.engine.column_generators.generators.base.ColumnGeneratorFullColumn - options: - show_root_toc_entry: false - -### `ColumnGeneratorCellByCell` {#data_designer.engine.column_generators.generators.base.ColumnGeneratorCellByCell} - -::: data_designer.engine.column_generators.generators.base.ColumnGeneratorCellByCell - options: - show_root_toc_entry: false - -### `FromScratchColumnGenerator` {#data_designer.engine.column_generators.generators.base.FromScratchColumnGenerator} - -::: data_designer.engine.column_generators.generators.base.FromScratchColumnGenerator - options: - show_root_toc_entry: false - -### `ColumnGeneratorWithModelRegistry` {#data_designer.engine.column_generators.generators.base.ColumnGeneratorWithModelRegistry} - -::: data_designer.engine.column_generators.generators.base.ColumnGeneratorWithModelRegistry - options: - show_root_toc_entry: false - -### `ColumnGeneratorWithModel` {#data_designer.engine.column_generators.generators.base.ColumnGeneratorWithModel} - -::: data_designer.engine.column_generators.generators.base.ColumnGeneratorWithModel - options: - show_root_toc_entry: false diff --git a/docs/code_reference/engine/index.md b/docs/code_reference/engine/index.md deleted file mode 100644 index 06dfa4e6d..000000000 --- a/docs/code_reference/engine/index.md +++ /dev/null @@ -1,5 +0,0 @@ -# Engine Package - -The `data-designer-engine` package provides `data_designer.engine`, the runtime layer of Data Designer. It consumes `data_designer.config` objects and maps them to execution behavior through generators, seed readers, processors, registries, model access, and MCP tool execution. - -This package sits between config and interface: it depends on config, and the public interface calls into it. Use these pages for plugin implementation contracts, registry behavior, seed reader internals, processor execution, column generator bases, and MCP runtime behavior. diff --git a/docs/code_reference/engine/mcp.md b/docs/code_reference/engine/mcp.md deleted file mode 100644 index a9b333b97..000000000 --- a/docs/code_reference/engine/mcp.md +++ /dev/null @@ -1,94 +0,0 @@ -# Engine MCP - -Execution-time MCP registries, facades, session handling, schema discovery, and tool calls. - -For user-facing provider and tool config objects, see [MCP configuration](../config/mcp.md). - -## Parallel Structure - -| Model layer | MCP layer | Purpose | -|-------------|-----------|---------| -| `ModelProviderRegistry` | `MCPProviderRegistry` | Holds provider configurations. | -| `ModelRegistry` | `MCPRegistry` | Manages configs by alias and lazily creates facades. | -| `ModelFacade` | `MCPFacade` | Provides a lightweight runtime facade scoped to one config. | -| `ModelConfig.alias` | `ToolConfig.tool_alias` | Alias referenced by column configs. | - -## Registry - -### `MCPToolDefinition` {#data_designer.engine.mcp.registry.MCPToolDefinition} - -::: data_designer.engine.mcp.registry.MCPToolDefinition - options: - show_root_toc_entry: false - -### `MCPToolResult` {#data_designer.engine.mcp.registry.MCPToolResult} - -::: data_designer.engine.mcp.registry.MCPToolResult - options: - show_root_toc_entry: false - -### `MCPRegistry` {#data_designer.engine.mcp.registry.MCPRegistry} - -::: data_designer.engine.mcp.registry.MCPRegistry - options: - show_root_toc_entry: false - -### `create_mcp_registry` {#data_designer.engine.mcp.factory.create_mcp_registry} - -::: data_designer.engine.mcp.factory.create_mcp_registry - options: - show_root_toc_entry: false - -## Facade - -`ModelFacade.generate()` accepts a `tool_alias` parameter. When it is provided, `ModelFacade` looks up the matching `MCPFacade` from `MCPRegistry`, fetches tool schemas, passes them to the model, processes tool calls after each completion, tracks tool-call turns, and returns messages that include tool results for trace capture. - -### `MCPFacade` {#data_designer.engine.mcp.facade.MCPFacade} - -::: data_designer.engine.mcp.facade.MCPFacade - options: - show_root_toc_entry: false - -## I/O Service - -The I/O service owns a background event loop, pools MCP sessions by provider config, coalesces concurrent tool schema lookups, and executes parallel tool calls. - -### `MCPIOService` {#data_designer.engine.mcp.io.MCPIOService} - -::: data_designer.engine.mcp.io.MCPIOService - options: - show_root_toc_entry: false - -### Runtime Helpers - -::: data_designer.engine.mcp.io.list_tools - options: - show_root_toc_entry: false - -::: data_designer.engine.mcp.io.list_tool_names - options: - show_root_toc_entry: false - -::: data_designer.engine.mcp.io.call_tools - options: - show_root_toc_entry: false - -::: data_designer.engine.mcp.io.clear_provider_caches - options: - show_root_toc_entry: false - -::: data_designer.engine.mcp.io.clear_tools_cache - options: - show_root_toc_entry: false - -::: data_designer.engine.mcp.io.get_cache_info - options: - show_root_toc_entry: false - -::: data_designer.engine.mcp.io.clear_session_pool - options: - show_root_toc_entry: false - -::: data_designer.engine.mcp.io.get_session_pool_info - options: - show_root_toc_entry: false diff --git a/docs/code_reference/engine/processors.md b/docs/code_reference/engine/processors.md deleted file mode 100644 index e11653ead..000000000 --- a/docs/code_reference/engine/processors.md +++ /dev/null @@ -1,43 +0,0 @@ -# Engine Processor Implementations - -Runtime processor classes and processor registry helpers. - -Plugin processors inherit from [Processor](#data_designer.engine.processing.processors.base.Processor) and override one or more callback methods: `process_before_batch`, `process_after_batch`, or `process_after_generation`. - -For user-facing processor config objects, see [processor configurations](../config/processors.md). - -## Base Contract - -### `Processor` {#data_designer.engine.processing.processors.base.Processor} - -::: data_designer.engine.processing.processors.base.Processor - options: - show_root_toc_entry: false - -## Built-In Implementations - -### `DropColumnsProcessor` {#data_designer.engine.processing.processors.drop_columns.DropColumnsProcessor} - -::: data_designer.engine.processing.processors.drop_columns.DropColumnsProcessor - options: - show_root_toc_entry: false - -### `SchemaTransformProcessor` {#data_designer.engine.processing.processors.schema_transform.SchemaTransformProcessor} - -::: data_designer.engine.processing.processors.schema_transform.SchemaTransformProcessor - options: - show_root_toc_entry: false - -## Registry - -### `ProcessorRegistry` {#data_designer.engine.processing.processors.registry.ProcessorRegistry} - -::: data_designer.engine.processing.processors.registry.ProcessorRegistry - options: - show_root_toc_entry: false - -### `create_default_processor_registry` {#data_designer.engine.processing.processors.registry.create_default_processor_registry} - -::: data_designer.engine.processing.processors.registry.create_default_processor_registry - options: - show_root_toc_entry: false diff --git a/docs/code_reference/engine/seed_readers.md b/docs/code_reference/engine/seed_readers.md deleted file mode 100644 index 5f6294a34..000000000 --- a/docs/code_reference/engine/seed_readers.md +++ /dev/null @@ -1,101 +0,0 @@ -# Seed Readers - -Seed readers are engine-side adapters that turn a configured seed source into tabular seed rows. The engine attaches a `SeedSource` and secret resolver, asks the reader for column names and dataset size, then streams batches into generation. - -Related pages: [seeds](../config/seeds.md), [Seed Datasets](../../concepts/seed-datasets.md), and [Build Your Own](../../plugins/build_your_own.md). - -## Core Contracts - -### `SeedReader` {#data_designer.engine.resources.seed_reader.SeedReader} - -::: data_designer.engine.resources.seed_reader.SeedReader - options: - show_root_toc_entry: false - -### `FileSystemSeedReader` {#data_designer.engine.resources.seed_reader.FileSystemSeedReader} - -::: data_designer.engine.resources.seed_reader.FileSystemSeedReader - options: - show_root_toc_entry: false - -### `SeedReaderFileSystemContext` {#data_designer.engine.resources.seed_reader.SeedReaderFileSystemContext} - -::: data_designer.engine.resources.seed_reader.SeedReaderFileSystemContext - options: - show_root_toc_entry: false - -### `SeedReaderBatch` {#data_designer.engine.resources.seed_reader.SeedReaderBatch} - -::: data_designer.engine.resources.seed_reader.SeedReaderBatch - options: - show_root_toc_entry: false - -### `SeedReaderBatchReader` {#data_designer.engine.resources.seed_reader.SeedReaderBatchReader} - -::: data_designer.engine.resources.seed_reader.SeedReaderBatchReader - options: - show_root_toc_entry: false - -### `PandasSeedReaderBatch` {#data_designer.engine.resources.seed_reader.PandasSeedReaderBatch} - -::: data_designer.engine.resources.seed_reader.PandasSeedReaderBatch - options: - show_root_toc_entry: false - -### `create_seed_reader_output_dataframe` {#data_designer.engine.resources.seed_reader.create_seed_reader_output_dataframe} - -::: data_designer.engine.resources.seed_reader.create_seed_reader_output_dataframe - options: - show_root_toc_entry: false - -## Built-In Readers - -### `LocalFileSeedReader` {#data_designer.engine.resources.seed_reader.LocalFileSeedReader} - -::: data_designer.engine.resources.seed_reader.LocalFileSeedReader - options: - show_root_toc_entry: false - -### `HuggingFaceSeedReader` {#data_designer.engine.resources.seed_reader.HuggingFaceSeedReader} - -::: data_designer.engine.resources.seed_reader.HuggingFaceSeedReader - options: - show_root_toc_entry: false - -### `DataFrameSeedReader` {#data_designer.engine.resources.seed_reader.DataFrameSeedReader} - -::: data_designer.engine.resources.seed_reader.DataFrameSeedReader - options: - show_root_toc_entry: false - -### `DirectorySeedReader` {#data_designer.engine.resources.seed_reader.DirectorySeedReader} - -::: data_designer.engine.resources.seed_reader.DirectorySeedReader - options: - show_root_toc_entry: false - -### `FileContentsSeedReader` {#data_designer.engine.resources.seed_reader.FileContentsSeedReader} - -::: data_designer.engine.resources.seed_reader.FileContentsSeedReader - options: - show_root_toc_entry: false - -### `AgentRolloutSeedReader` {#data_designer.engine.resources.seed_reader.AgentRolloutSeedReader} - -::: data_designer.engine.resources.seed_reader.AgentRolloutSeedReader - options: - show_root_toc_entry: false - -## Registry and Errors - -### `SeedReaderRegistry` {#data_designer.engine.resources.seed_reader.SeedReaderRegistry} - -::: data_designer.engine.resources.seed_reader.SeedReaderRegistry - options: - show_root_toc_entry: false - -### `SeedReaderError` {#data_designer.engine.resources.seed_reader.SeedReaderError} - -::: data_designer.engine.resources.seed_reader.SeedReaderError - options: - show_root_toc_entry: false diff --git a/docs/code_reference/index.md b/docs/code_reference/index.md deleted file mode 100644 index 5263b0ffe..000000000 --- a/docs/code_reference/index.md +++ /dev/null @@ -1,11 +0,0 @@ -# Code Reference - -Data Designer is implemented as three installable packages that share the `data_designer` namespace. The packages are layered: user-facing interface code calls the engine, and the engine consumes declarative config objects. - -| Package | Namespace | Role | -|---------|-----------|------| -| [`data-designer-config`](config/index.md) | `data_designer.config` | Configuration schemas, builder APIs, plugin registration objects, and result schemas. | -| [`data-designer-engine`](engine/index.md) | `data_designer.engine` | Runtime contracts and implementations for generation, seed reading, processing, and MCP tool execution. | -| [`data-designer`](interface/index.md) | `data_designer.interface` | Public entry points for previewing, creating, and inspecting generated datasets. | - -The dependency direction is `interface -> engine -> config`. Config objects describe what should happen, engine objects implement how it happens, and interface objects expose the supported public API. diff --git a/docs/code_reference/interface/data_designer.md b/docs/code_reference/interface/data_designer.md deleted file mode 100644 index 050ba6242..000000000 --- a/docs/code_reference/interface/data_designer.md +++ /dev/null @@ -1,11 +0,0 @@ -# DataDesigner Interface - -[DataDesigner](#data_designer.interface.data_designer.DataDesigner) validates configs, generates in-memory previews, creates persisted datasets, lists configured MCP tools, and exposes default model settings. - -For runtime settings passed through `set_run_config()`, see [run_config](../config/run_config.md). For persisted creation results returned by `create()`, see [results](results.md). - -## `DataDesigner` {#data_designer.interface.data_designer.DataDesigner} - -::: data_designer.interface.data_designer.DataDesigner - options: - show_root_toc_entry: false diff --git a/docs/code_reference/interface/errors.md b/docs/code_reference/interface/errors.md deleted file mode 100644 index a969cf8fe..000000000 --- a/docs/code_reference/interface/errors.md +++ /dev/null @@ -1,29 +0,0 @@ -# Interface Errors - -Interface errors represent failures surfaced at the public API boundary. DataDesignerGenerationError wraps dataset generation failures from `create()` and `preview()`, DataDesignerEarlyShutdownError identifies generation runs that terminate early without producing records, and DataDesignerProfilingError wraps profiling failures from those methods. These errors inherit from `data_designer.errors.DataDesignerError`, allowing callers to catch either specific interface failures or the project-wide base error type. - -The package-level `data_designer.interface` export lazily exposes [DataDesignerGenerationError](#data_designer.interface.errors.DataDesignerGenerationError), [DataDesignerEarlyShutdownError](#data_designer.interface.errors.DataDesignerEarlyShutdownError), and [DataDesignerProfilingError](#data_designer.interface.errors.DataDesignerProfilingError). [InvalidBufferValueError](#data_designer.interface.errors.InvalidBufferValueError) is defined in this module. - -## `DataDesignerGenerationError` {#data_designer.interface.errors.DataDesignerGenerationError} - -::: data_designer.interface.errors.DataDesignerGenerationError - options: - show_root_toc_entry: false - -## `DataDesignerEarlyShutdownError` {#data_designer.interface.errors.DataDesignerEarlyShutdownError} - -::: data_designer.interface.errors.DataDesignerEarlyShutdownError - options: - show_root_toc_entry: false - -## `DataDesignerProfilingError` {#data_designer.interface.errors.DataDesignerProfilingError} - -::: data_designer.interface.errors.DataDesignerProfilingError - options: - show_root_toc_entry: false - -## `InvalidBufferValueError` {#data_designer.interface.errors.InvalidBufferValueError} - -::: data_designer.interface.errors.InvalidBufferValueError - options: - show_root_toc_entry: false diff --git a/docs/code_reference/interface/index.md b/docs/code_reference/interface/index.md deleted file mode 100644 index e43caa783..000000000 --- a/docs/code_reference/interface/index.md +++ /dev/null @@ -1,7 +0,0 @@ -# Interface Package - -The `data-designer` package provides the top-level user-facing package surface. This section covers `data_designer.interface`, which contains `DataDesigner`, persisted dataset creation results, and interface-level errors. - -This package sits above engine and config. `DataDesigner` accepts Data Designer configs, calls the runtime layer, and returns preview or persisted creation results. - -Start with [DataDesigner](data_designer.md) for previewing, creating, and inspecting datasets from a config. Use [results](results.md) for the object returned by persisted dataset creation, and [errors](errors.md) for exceptions surfaced at the public API boundary. diff --git a/docs/code_reference/interface/results.md b/docs/code_reference/interface/results.md deleted file mode 100644 index 044ca6ccf..000000000 --- a/docs/code_reference/interface/results.md +++ /dev/null @@ -1,11 +0,0 @@ -# Dataset Creation Results - -[DatasetCreationResults](#data_designer.interface.results.DatasetCreationResults) is returned by [DataDesigner.create()](data_designer.md#data_designer.interface.data_designer.DataDesigner.create). It provides access to persisted creation artifacts, including the generated dataset, profiling analysis, processor outputs, task traces, dataset metadata, and Hugging Face Hub upload support. - -Preview generation uses the in-memory `data_designer.config.preview_results.PreviewResults` object returned by [DataDesigner.preview()](data_designer.md#data_designer.interface.data_designer.DataDesigner.preview). Persisted dataset creation uses [DatasetCreationResults](#data_designer.interface.results.DatasetCreationResults). - -## `DatasetCreationResults` {#data_designer.interface.results.DatasetCreationResults} - -::: data_designer.interface.results.DatasetCreationResults - options: - show_root_toc_entry: false diff --git a/docs/concepts/columns.md b/docs/concepts/columns.md index 45b87d174..16fbc329d 100644 --- a/docs/concepts/columns.md +++ b/docs/concepts/columns.md @@ -213,4 +213,4 @@ Computed property listing columns created implicitly alongside the primary colum - `{name}__trace`: Created when `with_trace` is not `TraceType.NONE` on the column. - `{name}__reasoning_content`: Created when `extract_reasoning_content=True` on the column. -For detailed information on each column type, refer to the [column configuration code reference](../code_reference/config/column_configs.md). +For examples of column type usage, see the tutorials and recipe pages. diff --git a/docs/concepts/custom_columns.md b/docs/concepts/custom_columns.md index 3d9ae3954..22aec2ab7 100644 --- a/docs/concepts/custom_columns.md +++ b/docs/concepts/custom_columns.md @@ -191,5 +191,4 @@ Mocking only `generate()` will silently no-op under the async engine because the ## See Also -- [Column Configs Reference](../code_reference/config/column_configs.md) - [Plugins Overview](../plugins/overview.md) diff --git a/docs/concepts/models/model-configs.md b/docs/concepts/models/model-configs.md index 888a7bdca..891460e53 100644 --- a/docs/concepts/models/model-configs.md +++ b/docs/concepts/models/model-configs.md @@ -143,5 +143,4 @@ model_config = dd.ModelConfig( - **[Default Model Settings](default-model-settings.md)**: Pre-configured model settings included with Data Designer - **[Custom Model Settings](custom-model-settings.md)**: Learn how to create custom providers and model configurations - **[Configure Model Settings With the CLI](configure-model-settings-with-the-cli.md)**: Use the CLI to manage model settings -- **[Column Configurations](../../code_reference/config/column_configs.md)**: Learn how to use models in column configurations - **[Architecture & Performance](../architecture-and-performance.md)**: Understanding separation of concerns and optimizing concurrency diff --git a/docs/concepts/person_sampling.md b/docs/concepts/person_sampling.md index 3c9e5eaf6..6bc78600a 100644 --- a/docs/concepts/person_sampling.md +++ b/docs/concepts/person_sampling.md @@ -40,7 +40,7 @@ config_builder.add_column( ) ``` -For mor details, see the documentation for [`SamplerColumnConfig`](../code_reference/config/column_configs.md#data_designer.config.column_configs.SamplerColumnConfig) and [`PersonFromFakerSamplerParams`](../code_reference/config/sampler_params.md#data_designer.config.sampler_params.PersonFromFakerSamplerParams). +Use `SamplerColumnConfig` with `PersonFromFakerSamplerParams` when you need locale-aware synthetic person fields. --- @@ -161,7 +161,7 @@ config_builder.add_column( ) ``` -For more details, see the documentation for [`SamplerColumnConfig`](../code_reference/config/column_configs.md#data_designer.config.column_configs.SamplerColumnConfig) and [`PersonSamplerParams`](../code_reference/config/sampler_params.md#data_designer.config.sampler_params.PersonSamplerParams). +Use `SamplerColumnConfig` with `PersonSamplerParams` when you need richer personas from curated datasets. ### Available Data Fields diff --git a/docs/concepts/security.md b/docs/concepts/security.md index 6b365befd..360730cf6 100644 --- a/docs/concepts/security.md +++ b/docs/concepts/security.md @@ -200,4 +200,3 @@ For example, this is often reasonable in a notebook, local script, or other sing ## Related Reading - [Deployment Options](deployment-options.md) -- [Run Config Reference](../code_reference/config/run_config.md) diff --git a/docs/concepts/tool_use_and_mcp.md b/docs/concepts/tool_use_and_mcp.md index ec2771f3f..7bb39e7ae 100644 --- a/docs/concepts/tool_use_and_mcp.md +++ b/docs/concepts/tool_use_and_mcp.md @@ -63,7 +63,3 @@ builder.add_column( ## Example See the [PDF Q&A Recipe](../recipes/mcp_and_tooluse/pdf_qa.md) for a complete working example. - -## Code Reference - -For config objects, see [MCP Configuration Reference](../code_reference/config/mcp.md). For runtime internals, see [Engine MCP Reference](../code_reference/engine/mcp.md). diff --git a/docs/concepts/validators.md b/docs/concepts/validators.md index 043694ee7..521f16ccd 100644 --- a/docs/concepts/validators.md +++ b/docs/concepts/validators.md @@ -286,10 +286,6 @@ builder.add_column( The `target_columns` parameter specifies which columns to validate. All target columns are passed to the validator together (except for code validators, which process each column separately). -### Configuration Parameters - -See more about parameters used to instantiate `ValidationColumnConfig` in the [code reference](../code_reference/config/column_configs.md#data_designer.config.column_configs.ValidationColumnConfig). - ### Batch Size Considerations Larger batch sizes improve efficiency but consume more memory: @@ -327,7 +323,3 @@ builder.add_column( ``` **Note**: Code validators always process each target column separately, even when multiple columns are specified. Local callable and remote validators receive all target columns together. - -## See Also - -- [Validator Parameters Reference](../code_reference/config/validator_params.md): Configuration object schemas diff --git a/docs/css/mkdocstrings.css b/docs/css/mkdocstrings.css deleted file mode 100644 index 56ba05c64..000000000 --- a/docs/css/mkdocstrings.css +++ /dev/null @@ -1,132 +0,0 @@ -/* Indentation. */ -div.doc-contents:not(.first) { - padding-left: 25px; - border-left: .05rem solid var(--md-typeset-table-color); - } - - /* Mark external links as such. */ - a.external::after, - a.autorefs-external::after { - /* https://primer.style/octicons/arrow-up-right-24 */ - mask-image: url('data:image/svg+xml,'); - -webkit-mask-image: url('data:image/svg+xml,'); - content: ' '; - - display: inline-block; - vertical-align: middle; - position: relative; - - height: 1em; - width: 1em; - background-color: currentColor; - } - - a.external:hover::after, - a.autorefs-external:hover::after { - background-color: var(--md-accent-fg-color); - } - - /* Tree-like output for backlinks. */ - .doc-backlink-list { - --tree-clr: var(--md-default-fg-color); - --tree-font-size: 1rem; - --tree-item-height: 1; - --tree-offset: 1rem; - --tree-thickness: 1px; - --tree-style: solid; - display: grid; - list-style: none !important; - } - - .doc-backlink-list li > span:first-child { - text-indent: .3rem; - } - .doc-backlink-list li { - padding-inline-start: var(--tree-offset); - border-left: var(--tree-thickness) var(--tree-style) var(--tree-clr); - position: relative; - margin-left: 0 !important; - - &:last-child { - border-color: transparent; - } - &::before{ - content: ''; - position: absolute; - top: calc(var(--tree-item-height) / 2 * -1 * var(--tree-font-size) + var(--tree-thickness)); - left: calc(var(--tree-thickness) * -1); - width: calc(var(--tree-offset) + var(--tree-thickness) * 2); - height: calc(var(--tree-item-height) * var(--tree-font-size)); - border-left: var(--tree-thickness) var(--tree-style) var(--tree-clr); - border-bottom: var(--tree-thickness) var(--tree-style) var(--tree-clr); - } - &::after{ - content: ''; - position: absolute; - border-radius: 50%; - background-color: var(--tree-clr); - top: calc(var(--tree-item-height) / 2 * 1rem); - left: var(--tree-offset) ; - translate: calc(var(--tree-thickness) * -1) calc(var(--tree-thickness) * -1); - } - } - - .doc-symbol-toc.doc-symbol-module::after { - content: "module"; - } - - .doc-symbol-toc.doc-symbol-method::after { - content: "method"; - } - - /* Keep API section tables readable when Python type annotations are long. */ - div.doc-contents:has(table:has(thead th:nth-child(3))) { - overflow-x: auto; - } - - div.doc-contents table:has(thead th:nth-child(3)) { - table-layout: fixed; - width: 100%; - min-width: 42rem; - } - - div.doc-contents table:has(thead th:nth-child(3)) td { - vertical-align: top; - } - - div.doc-contents table:has(thead th:nth-child(3)) code { - white-space: normal; - overflow-wrap: anywhere; - word-break: normal; - } - - /* Attributes: Name, Type, Description. */ - div.doc-contents table:has(thead th:nth-child(3)):not(:has(thead th:nth-child(4))) th:nth-child(1), - div.doc-contents table:has(thead th:nth-child(3)):not(:has(thead th:nth-child(4))) td:nth-child(1) { - width: clamp(9rem, 18%, 12rem); - } - - div.doc-contents table:has(thead th:nth-child(3)):not(:has(thead th:nth-child(4))) th:nth-child(2), - div.doc-contents table:has(thead th:nth-child(3)):not(:has(thead th:nth-child(4))) td:nth-child(2) { - width: clamp(16rem, 38%, 34rem); - } - - /* Parameters: Name, Type, Description, Default. */ - div.doc-contents table:has(thead th:nth-child(4)) { - min-width: 54rem; - } - - div.doc-contents table:has(thead th:nth-child(4)) th:nth-child(1), - div.doc-contents table:has(thead th:nth-child(4)) td:nth-child(1) { - width: clamp(9rem, 16%, 11rem); - } - - div.doc-contents table:has(thead th:nth-child(4)) th:nth-child(2), - div.doc-contents table:has(thead th:nth-child(4)) td:nth-child(2) { - width: clamp(16rem, 32%, 28rem); - } - - div.doc-contents table:has(thead th:nth-child(4)) th:nth-child(4), - div.doc-contents table:has(thead th:nth-child(4)) td:nth-child(4) { - width: clamp(5rem, 9%, 7rem); - } diff --git a/docs/css/style.css b/docs/css/style.css index d19b92cd1..19a10be3e 100644 --- a/docs/css/style.css +++ b/docs/css/style.css @@ -86,12 +86,12 @@ div.output_subarea pre, width: 10rem !important; } -/* Hide right sidebar (TOC) by default, JavaScript will show it on Code Reference pages */ +/* Hide right sidebar (TOC) by default, JavaScript will show it on Concepts and Plugins pages */ .md-sidebar.md-sidebar--secondary { display: none !important; } -/* Show TOC on Code Reference pages (controlled by JavaScript) */ +/* Show TOC on selected pages (controlled by JavaScript) */ body.show-toc .md-sidebar.md-sidebar--secondary { display: block !important; } diff --git a/docs/js/toc-toggle.js b/docs/js/toc-toggle.js index 22f7e079a..6b2187813 100644 --- a/docs/js/toc-toggle.js +++ b/docs/js/toc-toggle.js @@ -1,23 +1,20 @@ // Wrap in a check to ensure document$ exists if (typeof document$ !== "undefined") { document$.subscribe(function() { - // Check if this is a Code Reference page (contains mkdocstrings content) - const isCodeReferencePage = document.querySelector(".doc.doc-contents"); - // Check if this is a Concepts page (URL contains /concepts/) const isConceptsPage = window.location.pathname.includes("/concepts/"); // Check if this is a Plugins page (URL contains /plugins/) const isPluginsPage = window.location.pathname.includes("/plugins/"); - if (isCodeReferencePage || isConceptsPage || isPluginsPage) { - // Show TOC for Code Reference, Concepts, and Plugins pages by adding class to body + if (isConceptsPage || isPluginsPage) { + // Show TOC for Concepts and Plugins pages by adding class to body document.body.classList.add("show-toc"); - console.log("Code Reference, Concepts, or Plugins page detected - showing TOC"); + console.log("Concepts or Plugins page detected - showing TOC"); } else { // Hide TOC for all other pages by removing class from body document.body.classList.remove("show-toc"); - console.log("Non-Code Reference/Concepts/Plugins page - hiding TOC"); + console.log("Non-Concepts/Plugins page - hiding TOC"); } }); } else { diff --git a/docs/notebook_source/_README.md b/docs/notebook_source/_README.md index 97bcdf8cb..a51336e53 100644 --- a/docs/notebook_source/_README.md +++ b/docs/notebook_source/_README.md @@ -131,12 +131,3 @@ Understanding these concepts will help you make the most of the tutorials: - **[Columns](../concepts/columns.md)** - Learn about different column types (Sampler, LLM, Expression, Validation, etc.) - **[Validators](../concepts/validators.md)** - Understand how to validate generated data with Python, SQL, and remote validators - **[Person Sampling](../concepts/person_sampling.md)** - Learn how to sample realistic person data with demographic attributes - -### Code Reference - -Quick reference guides for the main configuration objects: - -- **[column_configs](../code_reference/config/column_configs.md)** - All column configuration types -- **[config_builder](../code_reference/config/config_builder.md)** - The `DataDesignerConfigBuilder` API -- **[data_designer_config](../code_reference/config/data_designer_config.md)** - Main configuration schema -- **[validator_params](../code_reference/config/validator_params.md)** - Validator configuration options diff --git a/docs/plugins/build_your_own.md b/docs/plugins/build_your_own.md index 649b8bdd7..896f471c4 100644 --- a/docs/plugins/build_your_own.md +++ b/docs/plugins/build_your_own.md @@ -100,7 +100,7 @@ data-designer-my-plugin/ index-multiplier = "data_designer_index_multiplier.plugin:plugin" ``` - For the generator implementation contract, see [Column Generators](../code_reference/engine/column_generators.md). For inline custom functions, see [Custom Columns](../concepts/custom_columns.md). + For inline custom functions, see [Custom Columns](../concepts/custom_columns.md). === "Seed reader" @@ -196,7 +196,7 @@ data-designer-my-plugin/ prefixed-text-files = "data_designer_prefixed_text_seed_reader.plugin:plugin" ``` - For the engine API behind this example, see [Seed Readers](../code_reference/engine/seed_readers.md). + This pattern works for any directory-backed seed reader. === "Processor" @@ -265,7 +265,7 @@ data-designer-my-plugin/ regex-filter = "data_designer_regex_filter.plugin:plugin" ``` - For callback selection and processor execution details, see [Processors](../concepts/processors.md). For the engine API behind this example, see [Engine Processors code reference](../code_reference/engine/processors.md). + For callback selection and processor execution details, see [Processors](../concepts/processors.md). ## Install and use locally diff --git a/docs/plugins/models.md b/docs/plugins/models.md index c3bb25228..e069ea2b4 100644 --- a/docs/plugins/models.md +++ b/docs/plugins/models.md @@ -191,4 +191,4 @@ The built-in model-backed generators use these same hooks: - `ImageCellGenerator` uses `ColumnGeneratorWithModel`, renders a prompt, calls the facade's image methods, and writes generated media through the artifact storage supplied by the same `ResourceProvider`. - `CustomColumnGenerator` is the inline-function counterpart: when users declare `model_aliases`, it builds a `models` dict from `resource_provider.model_registry`. Packaged plugins usually use `ColumnGeneratorWithModel` or `ColumnGeneratorWithModelRegistry` directly instead of recreating that dict. -See [Column Generators](../code_reference/engine/column_generators.md) for the full base-class API and [Custom Model Settings](../concepts/models/custom-model-settings.md) for configuring model aliases. +See [Custom Model Settings](../concepts/models/custom-model-settings.md) for configuring model aliases. diff --git a/docs/plugins/overview.md b/docs/plugins/overview.md index b071966c8..e05de3f35 100644 --- a/docs/plugins/overview.md +++ b/docs/plugins/overview.md @@ -6,9 +6,9 @@ Plugins let you add new object types to Data Designer without modifying the core Data Designer supports three plugin types: -- **Column generator plugins**: Custom [column generators](../code_reference/engine/column_generators.md) you pass to the config builder's [add_column](../code_reference/config/config_builder.md#data_designer.config.config_builder.DataDesignerConfigBuilder.add_column) method. -- **Seed reader plugins**: Custom [seed readers](../code_reference/engine/seed_readers.md) that load data from new sources, such as databases, cloud storage, or custom file formats. -- **Processor plugins**: Custom [processor implementations](../code_reference/engine/processors.md) configured by processor config objects that transform data before batches, after batches, or after generation completes. Pass them to the config builder's [add_processor](../code_reference/config/config_builder.md#data_designer.config.config_builder.DataDesignerConfigBuilder.add_processor) method. +- **Column generator plugins**: Custom column types you pass to the config builder's `add_column` method. +- **Seed reader plugins**: Custom seed readers that load data from new sources, such as databases, cloud storage, or custom file formats. +- **Processor plugins**: Custom processor implementations that transform data before batches, after batches, or after generation completes. Pass them to the config builder's `add_processor` method. ## Use an Installed Plugin diff --git a/fern/AGENTS.md b/fern/AGENTS.md index c1d90db86..1dfd3d2cd 100644 --- a/fern/AGENTS.md +++ b/fern/AGENTS.md @@ -12,8 +12,6 @@ This folder contains the Fern docs site for NeMo Data Designer. Use `fern/README ## Generated Artifacts -- `make generate-fern-api-reference` creates gitignored API reference files in `fern/code-reference/` for `data_designer.config`, `data_designer.interface`, and curated engine extension modules. -- `py2fern` only descends into Python packages. Add `__init__.py` to any new subdirectory whose modules should appear in the API reference. - `make generate-fern-notebooks` creates gitignored notebook files in `fern/components/notebooks/`. - `docs/notebook_source/*.py` is the notebook source of truth. - `docs/colab_notebooks/` is only for Colab links, not Fern input. diff --git a/fern/README.md b/fern/README.md index c6aad488a..289ea00a5 100644 --- a/fern/README.md +++ b/fern/README.md @@ -10,7 +10,7 @@ Data Designer is moving from MkDocs to Fern over several releases. During that t - Keep Fern working in parallel for local checks and hosted validation. - Treat `docs/` as the docs source of truth unless a page has already been intentionally moved to Fern-only MDX. - Treat `docs/notebook_source/*.py` as the notebook source of truth. -- Keep generated Fern API reference and notebook artifacts gitignored. +- Keep generated Fern notebook artifacts gitignored. ## Prerequisites @@ -21,23 +21,9 @@ npm install -g fern-api ## First-time setup -Two pre-render steps are needed before the dev server has all content. Both produce gitignored files and are safe to rerun. +One pre-render step is needed before the dev server has all tutorial content. It produces gitignored files and is safe to rerun. -### 1. Python API reference (gitignored - must regenerate) - -`make generate-fern-api-reference` uses `py2fern` to extract API docs from local Python source. The output lands in `fern/code-reference/` (gitignored), preserving the existing Config API folder and adding Interface and curated Engine extension API folders. - -```bash -make generate-fern-api-reference -``` - -`py2fern` only descends into Python packages. Add `__init__.py` to any new subdirectory whose modules should appear in the API reference. - -The `libraries:` block in [`docs.yml`](docs.yml) still documents the Fern-native config generator. Run `make generate-fern-api-reference-native` only when you want the Fern CLI output and have Fern auth. - -Re-run when the upstream package source changes. - -### 2. Notebook tutorials (gitignored - regenerate on clone) +### Notebook tutorials (gitignored - regenerate on clone) Each tutorial source file is converted to a JSON+TS pair in `fern/components/notebooks/`, then rendered through the `` component on the wrapper MDX page. Output is gitignored; regenerate it after cloning and after changing `docs/notebook_source/*.py`. @@ -57,7 +43,7 @@ make serve-fern-docs-locally # → http://localhost:3000 ``` -`serve-fern-docs-locally` generates Fern API reference and notebook artifacts before starting `fern docs dev`. It does not publish. +`serve-fern-docs-locally` generates notebook artifacts before starting `fern docs dev`. It does not publish. ## CI and publishing @@ -114,7 +100,7 @@ Dev Notes publishing mirrors MkDocs: it patches only the Dev Notes nav and pages ``` fern/ ├── README.md ← this file -├── docs.yml ← title, colors, versions:, libraries:, redirects, custom domain +├── docs.yml ← title, colors, versions:, redirects, custom domain ├── fern.config.json ← organization, fern-api version pin ├── main.css ← bundled NVIDIA theme CSS ├── assets/ ← logos, favicon, recipe assets, devnote post images @@ -131,7 +117,6 @@ fern/ │ └── devnotes/ ← .authors.yml, authors-data.ts, per-post trajectory data ├── scripts/ │ └── ipynb-to-fern-json.py ← .ipynb → fern/components/notebooks/*.{json,ts} -├── code-reference/ ← gitignored; populated by `make generate-fern-api-reference` └── versions/ ├── latest.yml ← authoring navigation tree └── latest/pages/ ← authoring MDX content @@ -154,11 +139,9 @@ Support and CI targets: | Command | Purpose | |---------|---------| | `make install-docs-deps` | Install docs and notebook dependencies | -| `make generate-fern-api-reference` | Generate local Fern API reference with `py2fern` | -| `make generate-fern-api-reference-native` | Generate Fern API reference with Fern CLI (requires Fern auth) | | `make generate-fern-notebooks` | Refresh gitignored notebook output from `docs/notebook_source/*.py` | -| `make prepare-fern-docs` | Generate local Fern artifacts | -| `make check-fern-docs` | Generate local Fern artifacts and run `fern check` | +| `make prepare-fern-docs` | Generate local Fern notebook artifacts | +| `make check-fern-docs` | Generate local Fern notebook artifacts and run `fern check` | Raw Fern CLI commands, normally wrapped by Make: @@ -166,5 +149,4 @@ Raw Fern CLI commands, normally wrapped by Make: |---------|---------| | `fern docs dev` | Local preview at `http://localhost:3000` | | `fern check` | Validate `docs.yml` and MDX | -| `fern docs md generate` | Generate library API docs with Fern CLI (requires Fern auth) | | `fern generate --docs --preview` | Hosted preview on `*.docs.buildwithfern.com` (needs Fern token) | diff --git a/fern/docs.yml b/fern/docs.yml index 7c7058865..160580d4d 100644 --- a/fern/docs.yml +++ b/fern/docs.yml @@ -49,15 +49,6 @@ versions: path: versions/latest.yml slug: latest -libraries: - data-designer-config: - input: - git: https://github.com/NVIDIA-NeMo/DataDesigner - subpath: packages/data-designer-config/src/data_designer/config - output: - path: ./code-reference/data-designer - lang: python - redirects: # ---- Section-only landing ---- # Mirrors Curator's /home -> /home/welcome convention. @@ -217,7 +208,7 @@ redirects: # ---- MkDocs path renames (most specific first) ---- # The legacy site lived at https://nvidia-nemo.github.io/DataDesigner/ and used - # MkDocs-Material's directory-URL conventions (mkdocstrings + blog plugin + + # MkDocs-Material's directory-URL conventions (blog plugin + # mkdocs-jupyter). The Fern migration changed several path segments because # Fern slugifies section/page titles. These rules catch search-engine indexed # links and copy-pasted bookmarks that arrive at the new host. @@ -331,57 +322,6 @@ redirects: destination: "/nemo/datadesigner/plugins/file-system-seed-reader-plugins" - source: "/nemo/datadesigner/plugins/example" destination: "/nemo/datadesigner/plugins/example-plugin" - # Code Reference: mkdocstrings tree -> Fern package-shaped sections. - # Underscored page names get kebab'd at the page-slug level too (Fern's title - # slugifier drops underscores), so the snake_case modules need per-page rules. - - source: "/nemo/datadesigner/code_reference" - destination: "/nemo/datadesigner/code-reference/overview" - - source: "/nemo/datadesigner/code_reference/config" - destination: "/nemo/datadesigner/code-reference/config/overview" - - source: "/nemo/datadesigner/code_reference/config/column_configs" - destination: "/nemo/datadesigner/code-reference/config/column-configs" - - source: "/nemo/datadesigner/code_reference/config/config_builder" - destination: "/nemo/datadesigner/code-reference/config/config-builder" - - source: "/nemo/datadesigner/code_reference/config/data_designer_config" - destination: "/nemo/datadesigner/code-reference/config/data-designer-config" - - source: "/nemo/datadesigner/code_reference/config/run_config" - destination: "/nemo/datadesigner/code-reference/config/run-config" - - source: "/nemo/datadesigner/code_reference/config/sampler_params" - destination: "/nemo/datadesigner/code-reference/config/sampler-params" - - source: "/nemo/datadesigner/code_reference/config/validator_params" - destination: "/nemo/datadesigner/code-reference/config/validator-params" - - source: "/nemo/datadesigner/code_reference/config/:module*" - destination: "/nemo/datadesigner/code-reference/config/:module*" - - source: "/nemo/datadesigner/code_reference/interface" - destination: "/nemo/datadesigner/code-reference/interface/overview" - - source: "/nemo/datadesigner/code_reference/interface/data_designer" - destination: "/nemo/datadesigner/code-reference/interface/data-designer" - - source: "/nemo/datadesigner/code_reference/interface/:module*" - destination: "/nemo/datadesigner/code-reference/interface/:module*" - - source: "/nemo/datadesigner/code_reference/engine" - destination: "/nemo/datadesigner/code-reference/engine-extension-api/overview" - - source: "/nemo/datadesigner/code_reference/engine/column_generators" - destination: "/nemo/datadesigner/code-reference/engine-extension-api/column-generators" - - source: "/nemo/datadesigner/code_reference/engine/seed_readers" - destination: "/nemo/datadesigner/code-reference/engine-extension-api/seed-readers" - - source: "/nemo/datadesigner/code_reference/engine/:module*" - destination: "/nemo/datadesigner/code-reference/engine-extension-api/:module*" - - source: "/nemo/datadesigner/code_reference/column_configs" - destination: "/nemo/datadesigner/code-reference/config/column-configs" - - source: "/nemo/datadesigner/code_reference/config_builder" - destination: "/nemo/datadesigner/code-reference/config/config-builder" - - source: "/nemo/datadesigner/code_reference/data_designer_config" - destination: "/nemo/datadesigner/code-reference/config/data-designer-config" - - source: "/nemo/datadesigner/code_reference/run_config" - destination: "/nemo/datadesigner/code-reference/config/run-config" - - source: "/nemo/datadesigner/code_reference/sampler_params" - destination: "/nemo/datadesigner/code-reference/config/sampler-params" - - source: "/nemo/datadesigner/code_reference/validator_params" - destination: "/nemo/datadesigner/code-reference/config/validator-params" - # Modules whose page slug already matches the filename (no underscores): - - source: "/nemo/datadesigner/code_reference/:module*" - destination: "/nemo/datadesigner/code-reference/config/:module*" - # Dev Notes: mkdocs-material blog plugin URL shape. # Section title "Dev Notes" -> /dev-notes; intermediate posts/ directory dropped. # Most posts kept their filename slug, but two were retitled during migration diff --git a/fern/scripts/fern-published-branch.py b/fern/scripts/fern-published-branch.py index e74020713..a28be833c 100644 --- a/fern/scripts/fern-published-branch.py +++ b/fern/scripts/fern-published-branch.py @@ -15,8 +15,8 @@ from pathlib import Path DEVNOTES_SECTION_RE = re.compile(r"^ - section:\s+Dev Notes\s*$") -CODE_REFERENCE_SECTION_RE = re.compile(r"^ - section:\s+Code Reference\s*$") -CODE_REFERENCE_PAGE_ROOT_RE = re.compile(r"path:\s+\./([^/]+)/pages/code_reference/") +RETIRED_REFERENCE_SECTION_RE = re.compile(r"^ - section:\s+" + re.escape("Code " + "Reference") + r"\s*$") +RETIRED_REFERENCE_DIR = "code" + "_reference" NAV_PATH_RE = re.compile(r"^(\s*path:\s+)\./([^#\s]+)(.*)$") REDIRECT_VERSION_RE = re.compile( r'^\s*destination:\s+["\']/nemo/datadesigner/((?:v[0-9][^/"\']*)|older-versions)(?:/|["\'])' @@ -46,43 +46,16 @@ "fern/styles/metrics-table.css", "fern/styles/trajectory-viewer.css", ] -CONFIG_CODE_REFERENCE_PAGES = [ - "analysis.mdx", - "column_configs.mdx", - "config_builder.mdx", - "data_designer_config.mdx", - "mcp.mdx", - "models.mdx", - "processors.mdx", - "run_config.mdx", - "sampler_params.mdx", - "validator_params.mdx", -] -CODE_REFERENCE_STRUCTURE_PAGES = [ - "index.mdx", - "config/index.mdx", - "config/seeds.mdx", - "engine/column_generators.mdx", - "engine/index.mdx", - "engine/mcp.mdx", - "engine/processors.mdx", - "engine/seed_readers.mdx", - "interface/data_designer.mdx", - "interface/errors.mdx", - "interface/index.mdx", - "interface/results.mdx", -] -CODE_REFERENCE_LINK_REPLACEMENTS = [ - ("/code-reference/topic-overviews/data-designer-config", "/code-reference/config/data-designer-config"), - ("/code-reference/topic-overviews/column-configs", "/code-reference/config/column-configs"), - ("/code-reference/topic-overviews/config-builder", "/code-reference/config/config-builder"), - ("/code-reference/topic-overviews/run-config", "/code-reference/config/run-config"), - ("/code-reference/topic-overviews/sampler-params", "/code-reference/config/sampler-params"), - ("/code-reference/topic-overviews/validator-params", "/code-reference/config/validator-params"), - ("/code-reference/topic-overviews/models", "/code-reference/config/models"), - ("/code-reference/topic-overviews/mcp", "/code-reference/config/mcp"), - ("/code-reference/topic-overviews/processors", "/code-reference/config/processors"), - ("/code-reference/topic-overviews/analysis", "/code-reference/config/analysis"), +RETIRED_REFERENCE_CLEAN_PAGE_PATHS = [ + "concepts/columns.mdx", + "concepts/custom_columns.mdx", + "concepts/models/model-configs.mdx", + "concepts/person_sampling.mdx", + "concepts/security.mdx", + "concepts/tool_use_and_mcp.mdx", + "concepts/validators.mdx", + "plugins/example.mdx", + "plugins/overview.mdx", ] @@ -231,16 +204,6 @@ def copy_path(source: Path, target: Path) -> None: shutil.copy2(source, target) -def copy_mdx_with_link_rewrites(source: Path, target: Path) -> None: - if not source.exists(): - return - target.parent.mkdir(parents=True, exist_ok=True) - content = source.read_text() - for old, new in CODE_REFERENCE_LINK_REPLACEMENTS: - content = content.replace(old, new) - target.write_text(content) - - def clear_published_tree(root: Path) -> None: root.mkdir(parents=True, exist_ok=True) for path in root.iterdir(): @@ -291,54 +254,36 @@ def replace_navigation_section(path: Path, section_re: re.Pattern[str], block: l path.write_text("".join(lines)) -def code_reference_page_root(block: list[str]) -> str | None: - for line in block: - match = CODE_REFERENCE_PAGE_ROOT_RE.search(line) - if match: - return match.group(1) - return None - - -def rewrite_code_reference_block(block: list[str], page_root: str) -> list[str]: - return [line.replace("./latest/pages/code_reference/", f"./{page_root}/pages/code_reference/") for line in block] - - -def sync_code_reference_pages(source_root: Path, published_root: Path, page_root: str) -> None: - source_base = source_root / "fern" / "versions" / "latest" / "pages" / "code_reference" - target_base = published_root / "fern" / "versions" / page_root / "pages" / "code_reference" - if not source_base.exists() or not target_base.exists(): +def remove_navigation_section(path: Path, section_re: re.Pattern[str]) -> None: + lines = path.read_text().splitlines(keepends=True) + start = next((i for i, line in enumerate(lines) if section_re.match(line)), -1) + if start == -1: return + end = start + 1 + while end < len(lines): + if lines[end].startswith(" - ") and lines[end].strip(): + break + end += 1 + lines[start:end] = [] + path.write_text("".join(lines)) - for rel_path in CODE_REFERENCE_STRUCTURE_PAGES: - copy_mdx_with_link_rewrites(source_base / rel_path, target_base / rel_path) - - for filename in CONFIG_CODE_REFERENCE_PAGES: - flat_source = target_base / filename - nested_source = target_base / "config" / filename - latest_source = source_base / "config" / filename - source = flat_source if flat_source.exists() else nested_source if nested_source.exists() else latest_source - copy_mdx_with_link_rewrites(source, target_base / "config" / filename) - - -def sync_code_reference_archive(source_root: Path, published_root: Path) -> None: - source_nav = source_root / "fern" / "versions" / "latest.yml" - if not source_nav.exists(): - return - source_block = extract_navigation_section(source_nav, CODE_REFERENCE_SECTION_RE) +def remove_retired_reference_archive(source_root: Path, published_root: Path) -> None: versions_dir = published_root / "fern" / "versions" for nav in sorted(path for path in versions_dir.glob("*.yml") if path.name != "latest.yml"): - try: - current_block = extract_navigation_section(nav, CODE_REFERENCE_SECTION_RE) - except PublishedBranchError: - continue - page_root = code_reference_page_root(current_block) - if page_root is None: - continue - sync_code_reference_pages(source_root, published_root, page_root) - replace_navigation_section( - nav, CODE_REFERENCE_SECTION_RE, rewrite_code_reference_block(source_block, page_root) - ) + remove_navigation_section(nav, RETIRED_REFERENCE_SECTION_RE) + + for path in sorted(versions_dir.glob(f"*/pages/{RETIRED_REFERENCE_DIR}")): + if path.is_dir(): + shutil.rmtree(path) + + source_pages = source_root / "fern" / "versions" / "latest" / "pages" + for pages_dir in sorted(versions_dir.glob("v*/pages")): + for rel_path in RETIRED_REFERENCE_CLEAN_PAGE_PATHS: + source_file = source_pages / rel_path + target_file = pages_dir / rel_path + if source_file.exists() and target_file.exists(): + copy_path(source_file, target_file) def materialize_version_nav_pages(published_root: Path) -> None: @@ -389,7 +334,7 @@ def sync_source(args: argparse.Namespace) -> int: merge_preserved_versions( source_root / "fern" / "versions", published_root / "fern" / "versions", preserved_versions ) - sync_code_reference_archive(source_root, published_root) + remove_retired_reference_archive(source_root, published_root) materialize_version_nav_pages(published_root) restore_versions_block(published_root / "fern" / "docs.yml", preserved_versions_block) validate_redirect_targets(published_root) diff --git a/fern/scripts/normalize-py2fern-indexes.py b/fern/scripts/normalize-py2fern-indexes.py deleted file mode 100644 index d0a1776d4..000000000 --- a/fern/scripts/normalize-py2fern-indexes.py +++ /dev/null @@ -1,49 +0,0 @@ -#!/usr/bin/env python3 -# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. -# SPDX-License-Identifier: Apache-2.0 - -"""Convert py2fern self-named module pages into Fern folder overview pages.""" - -from __future__ import annotations - -import argparse -import re -from pathlib import Path - - -def normalized(value: str) -> str: - return re.sub(r"[^a-z0-9]+", "-", value.lower()).strip("-") - - -def normalize_root(root: Path) -> int: - renamed = 0 - for path in sorted(root.rglob("*.mdx")): - if path.name == "index.mdx": - continue - if normalized(path.stem) != normalized(path.parent.name): - continue - - target = path.with_name("index.mdx") - if target.exists(): - raise FileExistsError(f"Cannot rename {path}: {target} already exists") - path.rename(target) - renamed += 1 - return renamed - - -def main() -> int: - parser = argparse.ArgumentParser(description=__doc__) - parser.add_argument("roots", nargs="+", type=Path, help="py2fern output roots to normalize") - args = parser.parse_args() - - count = 0 - for root in args.roots: - if not root.exists(): - raise FileNotFoundError(root) - count += normalize_root(root) - print(f"Normalized {count} py2fern pages to index.mdx") - return 0 - - -if __name__ == "__main__": - raise SystemExit(main()) diff --git a/fern/versions/latest.yml b/fern/versions/latest.yml index 75ed0f350..07c1f6ee1 100644 --- a/fern/versions/latest.yml +++ b/fern/versions/latest.yml @@ -139,70 +139,6 @@ navigation: path: ./latest/pages/plugins/filesystem_seed_reader.mdx - page: Discover path: ./latest/pages/plugins/discover.mdx - - section: Code Reference - contents: - - page: Overview - path: ./latest/pages/code_reference/index.mdx - - section: Config - contents: - - page: Overview - path: ./latest/pages/code_reference/config/index.mdx - - page: models - path: ./latest/pages/code_reference/config/models.mdx - - page: mcp - path: ./latest/pages/code_reference/config/mcp.mdx - - page: column_configs - path: ./latest/pages/code_reference/config/column_configs.mdx - - page: config_builder - path: ./latest/pages/code_reference/config/config_builder.mdx - - page: data_designer_config - path: ./latest/pages/code_reference/config/data_designer_config.mdx - - page: run_config - path: ./latest/pages/code_reference/config/run_config.mdx - - page: sampler_params - path: ./latest/pages/code_reference/config/sampler_params.mdx - - page: validator_params - path: ./latest/pages/code_reference/config/validator_params.mdx - - page: seeds - path: ./latest/pages/code_reference/config/seeds.mdx - - page: processors - path: ./latest/pages/code_reference/config/processors.mdx - - page: analysis - path: ./latest/pages/code_reference/config/analysis.mdx - - folder: ../code-reference/data-designer/data_designer/config - title: Config API - - section: Interface - contents: - - page: Overview - path: ./latest/pages/code_reference/interface/index.mdx - - page: data_designer - path: ./latest/pages/code_reference/interface/data_designer.mdx - - page: results - path: ./latest/pages/code_reference/interface/results.mdx - - page: errors - path: ./latest/pages/code_reference/interface/errors.mdx - - folder: ../code-reference/interface/data_designer/interface - title: Interface API - - section: Engine Extension API - contents: - - page: Overview - path: ./latest/pages/code_reference/engine/index.mdx - - page: seed_readers - path: ./latest/pages/code_reference/engine/seed_readers.mdx - - page: processors - path: ./latest/pages/code_reference/engine/processors.mdx - - page: mcp - path: ./latest/pages/code_reference/engine/mcp.mdx - - page: column_generators - path: ./latest/pages/code_reference/engine/column_generators.mdx - - folder: ../code-reference/engine/seed-readers/data_designer/engine/resources/seed_reader - title: Seed Reader API - - folder: ../code-reference/engine/processors/data_designer/engine/processing/processors - title: Processor API - - folder: ../code-reference/engine/mcp/data_designer/engine/mcp - title: MCP Runtime API - - folder: ../code-reference/engine/column-generators/data_designer/engine/column_generators/generators/base - title: Column Generator API - section: Dev Notes contents: - page: Overview diff --git a/fern/versions/latest/pages/code_reference/config/analysis.mdx b/fern/versions/latest/pages/code_reference/config/analysis.mdx deleted file mode 100644 index a6072a932..000000000 --- a/fern/versions/latest/pages/code_reference/config/analysis.mdx +++ /dev/null @@ -1,30 +0,0 @@ ---- -title: "Analysis" -description: "" -position: 10 ---- -The `analysis` modules provide tools for profiling and analyzing generated datasets. It includes statistics tracking, column profiling, and reporting capabilities. - -## Column Statistics - -Column statistics are automatically computed for every column after generation. They provide basic metrics specific to the column type. For example, LLM columns track token usage statistics, sampler columns track distribution information, and validation columns track validation success rates. - -The classes below are result objects that store the computed statistics for each column type and provide methods for formatting these results for display in reports. - - -## Column Profilers - -Column profilers are optional analysis tools that provide deeper insights into specific column types. Currently, the only column profiler available is the Judge Score Profiler. - -The classes below are result objects that store the computed profiler results and provide methods for formatting these results for display in reports. - - -## Dataset Profiler - -The [DatasetProfilerResults](#data_designer.config.analysis.dataset_profiler.DatasetProfilerResults) class contains complete profiling results for a generated dataset. It aggregates column-level statistics, metadata, and profiler results, and provides methods to: - -- Compute dataset-level metrics (completion percentage, column type summary) -- Filter statistics by column type -- Generate formatted analysis reports via the `to_report()` method - -Reports can be displayed in the console or exported to HTML/SVG formats. diff --git a/fern/versions/latest/pages/code_reference/config/column_configs.mdx b/fern/versions/latest/pages/code_reference/config/column_configs.mdx deleted file mode 100644 index f46c1a125..000000000 --- a/fern/versions/latest/pages/code_reference/config/column_configs.mdx +++ /dev/null @@ -1,11 +0,0 @@ ---- -title: "Column Configurations" -description: "" -position: 3 ---- -The `column_configs` module defines configuration objects for all Data Designer column types. Each configuration inherits from [SingleColumnConfig](#data_designer.config.base.SingleColumnConfig), which provides shared arguments like the column `name`, whether to `drop` the column after generation, and the `column_type`. - - -`column_type` is a discriminator field -The `column_type` argument is used to identify column types when deserializing the [Data Designer Configuration](/code-reference/config/data-designer-config) from JSON/YAML. It acts as the discriminator in a [discriminated union](https://docs.pydantic.dev/latest/concepts/unions/#discriminated-unions), allowing Pydantic to automatically determine which column configuration class to instantiate. - diff --git a/fern/versions/latest/pages/code_reference/config/config_builder.mdx b/fern/versions/latest/pages/code_reference/config/config_builder.mdx deleted file mode 100644 index 14d9a2d51..000000000 --- a/fern/versions/latest/pages/code_reference/config/config_builder.mdx +++ /dev/null @@ -1,13 +0,0 @@ ---- -title: "Data Designer's Config Builder" -description: "" -position: 4 ---- -The `config_builder` module provides a high-level interface for constructing Data Designer configurations through the [DataDesignerConfigBuilder](#data_designer.config.config_builder.DataDesignerConfigBuilder) class, enabling programmatic creation of [DataDesignerConfig](/code-reference/config/data-designer-config#data_designer.config.data_designer_config.DataDesignerConfig) objects by incrementally adding column configurations, constraints, processors, and profilers. - -You can use the builder to create Data Designer configurations from scratch or from existing configurations stored in YAML/JSON files via [`from_config()`](#data_designer.config.config_builder.DataDesignerConfigBuilder.from_config). The builder includes validation capabilities to catch configuration errors early and can work with seed datasets from local sources or external datastores. Once configured, use [`build()`](#data_designer.config.config_builder.DataDesignerConfigBuilder.build) to generate the final configuration object or [`write_config()`](#data_designer.config.config_builder.DataDesignerConfigBuilder.write_config) to serialize it to disk. - - -Model configs are required -[DataDesignerConfigBuilder](#data_designer.config.config_builder.DataDesignerConfigBuilder) requires a list of model configurations at initialization. This tells the builder which model aliases can be referenced by LLM-generated columns (such as [`LLMTextColumnConfig`](/code-reference/config/column-configs#data_designer.config.column_configs.LLMTextColumnConfig), [`LLMCodeColumnConfig`](/code-reference/config/column-configs#data_designer.config.column_configs.LLMCodeColumnConfig), [`LLMStructuredColumnConfig`](/code-reference/config/column-configs#data_designer.config.column_configs.LLMStructuredColumnConfig), and [`LLMJudgeColumnConfig`](/code-reference/config/column-configs#data_designer.config.column_configs.LLMJudgeColumnConfig)). Each model configuration specifies the model alias, model provider, model ID, and inference parameters that will be used during data generation. - diff --git a/fern/versions/latest/pages/code_reference/config/data_designer_config.mdx b/fern/versions/latest/pages/code_reference/config/data_designer_config.mdx deleted file mode 100644 index 2ef7dd739..000000000 --- a/fern/versions/latest/pages/code_reference/config/data_designer_config.mdx +++ /dev/null @@ -1,8 +0,0 @@ ---- -title: "Data Designer Configuration" -description: "" -position: 5 ---- -[DataDesignerConfig](#data_designer.config.data_designer_config.DataDesignerConfig) is the main configuration object for builder datasets with Data Designer. It is a declarative configuration for defining the dataset you want to generate column-by-column, including options for dataset post-processing, validation, and profiling. - -Generally, you should use the [DataDesignerConfigBuilder](/code-reference/config/config-builder#data_designer.config.config_builder.DataDesignerConfigBuilder) to build your configuration, but you can also build it manually by instantiating the [DataDesignerConfig](#data_designer.config.data_designer_config.DataDesignerConfig) class directly. diff --git a/fern/versions/latest/pages/code_reference/config/index.mdx b/fern/versions/latest/pages/code_reference/config/index.mdx deleted file mode 100644 index 646de123b..000000000 --- a/fern/versions/latest/pages/code_reference/config/index.mdx +++ /dev/null @@ -1,11 +0,0 @@ ---- -title: "Config Package" -description: "" -position: 1 ---- - -The `data-designer-config` package provides `data_designer.config`, the configuration layer of Data Designer. It contains the objects used to describe dataset structure, model access, tool access, seed data, sampler parameters, validators, processors, run settings, plugin registrations, and analysis results. - -This package is the base of the dependency chain. Engine and interface code consume these config objects, but config objects do not execute generation directly. - -For programmatic configuration work, start with [config_builder](config-builder) and [data_designer_config](data-designer-config). Use the narrower pages for exact constructor fields for columns, models, MCP tools, seeds, processors, samplers, validators, or profiling results. diff --git a/fern/versions/latest/pages/code_reference/config/mcp.mdx b/fern/versions/latest/pages/code_reference/config/mcp.mdx deleted file mode 100644 index e27ae5d24..000000000 --- a/fern/versions/latest/pages/code_reference/config/mcp.mdx +++ /dev/null @@ -1,105 +0,0 @@ ---- -title: "MCP (Model Context Protocol)" -description: "" -position: 2 ---- -The `mcp` module defines configuration and execution classes for tool use via MCP (Model Context Protocol). - -## Configuration Classes - -[MCPProvider](#data_designer.config.mcp.MCPProvider) configures remote MCP servers via SSE or Streamable HTTP transport. [LocalStdioMCPProvider](#data_designer.config.mcp.LocalStdioMCPProvider) configures local MCP servers as subprocesses via stdio transport. [ToolConfig](#data_designer.config.mcp.ToolConfig) defines which tools are available for LLM columns and how they are constrained. - -For user-facing guides, see: - -- **[MCP Providers](/concepts/tool-use-and-mcp/mcp-providers)** - Configure local or remote MCP providers -- **[Tool Configurations](/concepts/tool-use-and-mcp/tool-configs)** - Define tool permissions and limits -- **[Enabling Tools on Columns](/concepts/tool-use-and-mcp/enabling-tools)** - Use tools in LLM columns -- **[Message Traces](/concepts/traces)** - Capture full conversation history - -## Internal Architecture - -### Parallel Structure - -| Model Layer | MCP Layer | Purpose | -|-------------|-----------|---------| -| `ModelProviderRegistry` | `MCPProviderRegistry` | Holds provider configurations | -| `ModelRegistry` | `MCPRegistry` | Manages configs by alias, lazy facade creation | -| `ModelFacade` | `MCPFacade` | Lightweight facade scoped to specific config | -| `ModelConfig.alias` | `ToolConfig.tool_alias` | Alias for referencing in column configs | - -### MCPProviderRegistry - -Holds MCP provider configurations. Can be empty (MCP is optional). Created first during resource initialization. - -### MCPRegistry - -The central registry for tool configurations: - -- Holds `ToolConfig` instances by `tool_alias` -- Lazily creates `MCPFacade` instances via `get_mcp(tool_alias)` -- Manages shared connection pool and tool cache across all facades -- Validates that tool configs reference valid providers - -### MCPFacade - -A lightweight facade scoped to a specific `ToolConfig`. Key methods: - -| Method | Description | -|--------|-------------| -| `tool_call_count(response)` | Count tool calls in a completion response | -| `has_tool_calls(response)` | Check if response contains tool calls | -| `get_tool_schemas()` | Get OpenAI-format tool schemas for this config | -| `process_completion_response(response)` | Execute tool calls and return messages | -| `refuse_completion_response(response)` | Refuse tool calls gracefully (budget exhaustion) | - -Properties: `tool_alias`, `providers`, `max_tool_call_turns`, `allow_tools`, `timeout_sec` - -### I/O Layer (mcp/io.py) - -The `io.py` module provides low-level MCP communication with performance optimizations: - -**Single event loop architecture:** -All MCP operations funnel through a dedicated background daemon thread running an asyncio event loop. This allows: - -- Efficient concurrent I/O without per-thread event loop overhead -- Natural session sharing across all worker threads -- Clean async implementation for parallel tool calls - -**Session pooling:** -MCP sessions are created lazily and kept alive for the program's duration: - -- One session per provider (keyed by serialized config) -- No per-call connection/handshake overhead -- Graceful cleanup on program exit via `atexit` handler - -**Request coalescing:** -The `list_tools` operation uses request coalescing to prevent thundering herd: - -- When multiple workers request tools from the same provider simultaneously -- Only one request is made; others wait for the cached result -- Uses asyncio.Lock per provider key - -**Parallel tool execution:** -The `call_tools_parallel()` function executes multiple tool calls concurrently via `asyncio.gather()`. This is used by MCPFacade when the model returns parallel tool calls in a single response. - -### Integration with ModelFacade.generate() - -The `ModelFacade.generate()` method accepts an optional `tool_alias` parameter: - -```python -output, messages = model_facade.generate( - prompt="Search and answer...", - parser=my_parser, - tool_alias="my-tools", # Enables tool calling for this generation -) -``` - -When `tool_alias` is provided: - -1. `ModelFacade` looks up the `MCPFacade` from `MCPRegistry` -2. Tool schemas are fetched and passed to the LLM -3. After each completion, `MCPFacade` processes tool calls -4. Turn counting tracks iterations; refusal kicks in when budget exhausted -5. Messages (including tool results) are returned for trace capture - -## Config Module diff --git a/fern/versions/latest/pages/code_reference/config/models.mdx b/fern/versions/latest/pages/code_reference/config/models.mdx deleted file mode 100644 index a9c6da403..000000000 --- a/fern/versions/latest/pages/code_reference/config/models.mdx +++ /dev/null @@ -1,13 +0,0 @@ ---- -title: "Models" -description: "" -position: 1 ---- -The `models` module defines configuration objects for model-based generation. [ModelProvider](#data_designer.config.models.ModelProvider) specifies connection and authentication details for custom providers. [ModelConfig](#data_designer.config.models.ModelConfig) encapsulates model details including the model alias, identifier, and inference parameters. [Inference Parameters](/concepts/models/inference-parameters) controls model behavior through settings like `temperature`, `top_p`, and `max_tokens`, with support for both fixed values and distribution-based sampling. The module includes [ImageContext](#data_designer.config.models.ImageContext) for providing image inputs to multimodal models, and [ImageInferenceParams](#data_designer.config.models.ImageInferenceParams) for configuring image generation models. - -For more information on how they are used, see below: - -- **[Model Providers](/concepts/models/model-providers)** -- **[Model Configurations](/concepts/models/model-configs)** -- **[Image Context](/tutorials/providing-images-as-context)** -- **[Generating Images](/tutorials/generating-images)** diff --git a/fern/versions/latest/pages/code_reference/config/processors.mdx b/fern/versions/latest/pages/code_reference/config/processors.mdx deleted file mode 100644 index 1770e65c7..000000000 --- a/fern/versions/latest/pages/code_reference/config/processors.mdx +++ /dev/null @@ -1,6 +0,0 @@ ---- -title: "Processors" -description: "" -position: 9 ---- -The `processors` module defines configuration objects for post-generation data transformations. Processors run after column generation and can modify the dataset schema or content before output. diff --git a/fern/versions/latest/pages/code_reference/config/run_config.mdx b/fern/versions/latest/pages/code_reference/config/run_config.mdx deleted file mode 100644 index e4118cb86..000000000 --- a/fern/versions/latest/pages/code_reference/config/run_config.mdx +++ /dev/null @@ -1,30 +0,0 @@ ---- -title: "Run Config" -description: "" -position: 6 ---- -The `run_config` module defines runtime settings that control dataset generation behavior, -including early shutdown thresholds, batch sizing, non-inference worker concurrency, -and the Jinja rendering engine used by the runtime. - -`JinjaRenderingEngine.SECURE` is the default. Set `JinjaRenderingEngine.NATIVE` -when you want Jinja2's broader built-in sandbox behavior instead of Data Designer's -hardened renderer. - -For guidance on when to use each mode, see [Security](/concepts/security). - -## Usage - -```python -import data_designer.config as dd -from data_designer.interface import DataDesigner - -data_designer = DataDesigner() -data_designer.set_run_config(dd.RunConfig( - buffer_size=500, - max_conversation_restarts=3, - jinja_rendering_engine=dd.JinjaRenderingEngine.NATIVE, -)) -``` - -## API Reference diff --git a/fern/versions/latest/pages/code_reference/config/sampler_params.mdx b/fern/versions/latest/pages/code_reference/config/sampler_params.mdx deleted file mode 100644 index 7346ea5e3..000000000 --- a/fern/versions/latest/pages/code_reference/config/sampler_params.mdx +++ /dev/null @@ -1,15 +0,0 @@ ---- -title: "Sampler Parameters" -description: "" -position: 7 ---- -The `sampler_params` module defines parameter configuration objects for all Data Designer sampler types. Sampler parameters are used within the [SamplerColumnConfig](/code-reference/config/column-configs#data_designer.config.column_configs.SamplerColumnConfig) to specify how values should be generated for sampled columns. - - -Displaying available samplers and their parameters -The config builder has an `info` attribute that can be used to display the -available sampler types and their parameters: -```python -config_builder.info.display("samplers") -``` - diff --git a/fern/versions/latest/pages/code_reference/config/seeds.mdx b/fern/versions/latest/pages/code_reference/config/seeds.mdx deleted file mode 100644 index 5e0d75bd8..000000000 --- a/fern/versions/latest/pages/code_reference/config/seeds.mdx +++ /dev/null @@ -1,11 +0,0 @@ ---- -title: "Seeds" -description: "" -position: 12 ---- - -Seed configs declare existing data used as input during generation. A `SeedConfig` combines a seed source with optional row sampling and selection settings. Seed source objects declare where seed data comes from; the engine reads them through seed readers. - -Use these objects with `DataDesignerConfigBuilder.with_seed_dataset()`. Related pages: [Seed Datasets](/concepts/seed-datasets) and [seed readers](/code-reference/engine-extension-api/seed-readers). - -Built-in seed sources include local files, Hugging Face paths, in-memory DataFrames, directories, file contents, and agent rollout traces. Plugin seed sources can extend the same discriminated union through the plugin system. diff --git a/fern/versions/latest/pages/code_reference/config/validator_params.mdx b/fern/versions/latest/pages/code_reference/config/validator_params.mdx deleted file mode 100644 index 8aa89f863..000000000 --- a/fern/versions/latest/pages/code_reference/config/validator_params.mdx +++ /dev/null @@ -1,7 +0,0 @@ ---- -title: "Validator Parameters" -description: "" -position: 8 ---- -When creating a `ValidationColumnConfig`, two parameters are used to define the validator: `validator_type` and `validator_config`. -The `validator_type` parameter can be set to either `code`, `local_callable` or `remote`. The `validator_config` accompanying each of these is, respectively: diff --git a/fern/versions/latest/pages/code_reference/engine/column_generators.mdx b/fern/versions/latest/pages/code_reference/engine/column_generators.mdx deleted file mode 100644 index 976e830c4..000000000 --- a/fern/versions/latest/pages/code_reference/engine/column_generators.mdx +++ /dev/null @@ -1,13 +0,0 @@ ---- -title: "Column Generators" -description: "" -position: 5 ---- - -Column generators execute column generation in the Data Designer engine. A generator receives the upstream data needed for its task, returns row or batch data with generated values added, and reports the generation strategy the scheduler should use. - -Related pages: [column_configs](/code-reference/config/column-configs), [FileSystemSeedReader Plugins](/plugins/file-system-seed-reader-plugins), and [Custom Columns](/concepts/custom-columns). - -User-facing column configs inherit from `SingleColumnConfig` and define a unique `column_type` discriminator. During compilation, the engine may group related configs into multi-column configs for generators that create sampler or seed columns together. - -Generators that operate on a full batch can inherit from `ColumnGeneratorFullColumn`. Row-oriented non-model generators can inherit from `ColumnGeneratorCellByCell`. Generators that create initial rows use `FromScratchColumnGenerator`. Model-backed plugin generators should use `ColumnGeneratorWithModelRegistry` or `ColumnGeneratorWithModel`. diff --git a/fern/versions/latest/pages/code_reference/engine/index.mdx b/fern/versions/latest/pages/code_reference/engine/index.mdx deleted file mode 100644 index a2040fab4..000000000 --- a/fern/versions/latest/pages/code_reference/engine/index.mdx +++ /dev/null @@ -1,9 +0,0 @@ ---- -title: "Engine Extension API" -description: "" -position: 1 ---- - -The `data-designer-engine` package provides the runtime layer of Data Designer. It consumes `data_designer.config` objects and maps them to execution behavior through generators, seed readers, processors, registries, model access, and MCP tool execution. - -This section is intentionally curated. Use it for plugin implementation contracts, registry behavior, seed reader internals, processor execution, column generator bases, and MCP runtime behavior. It does not expose every internal engine module. diff --git a/fern/versions/latest/pages/code_reference/engine/mcp.mdx b/fern/versions/latest/pages/code_reference/engine/mcp.mdx deleted file mode 100644 index bfa01aed0..000000000 --- a/fern/versions/latest/pages/code_reference/engine/mcp.mdx +++ /dev/null @@ -1,16 +0,0 @@ ---- -title: "Engine MCP" -description: "" -position: 4 ---- - -Execution-time MCP registries, facades, session handling, schema discovery, and tool calls. - -For user-facing provider and tool config objects, see [MCP configuration](/code-reference/config/mcp). - -| Model layer | MCP layer | Purpose | -|-------------|-----------|---------| -| `ModelProviderRegistry` | `MCPProviderRegistry` | Holds provider configurations. | -| `ModelRegistry` | `MCPRegistry` | Manages configs by alias and lazily creates facades. | -| `ModelFacade` | `MCPFacade` | Provides a lightweight runtime facade scoped to one config. | -| `ModelConfig.alias` | `ToolConfig.tool_alias` | Alias referenced by column configs. | diff --git a/fern/versions/latest/pages/code_reference/engine/processors.mdx b/fern/versions/latest/pages/code_reference/engine/processors.mdx deleted file mode 100644 index e88473996..000000000 --- a/fern/versions/latest/pages/code_reference/engine/processors.mdx +++ /dev/null @@ -1,11 +0,0 @@ ---- -title: "Engine Processor Implementations" -description: "" -position: 3 ---- - -Runtime processor classes and processor registry helpers. - -Plugin processors inherit from `Processor` and override one or more callback methods: `process_before_batch`, `process_after_batch`, or `process_after_generation`. - -For user-facing processor config objects, see [processor configurations](/code-reference/config/processors). diff --git a/fern/versions/latest/pages/code_reference/engine/seed_readers.mdx b/fern/versions/latest/pages/code_reference/engine/seed_readers.mdx deleted file mode 100644 index 6b549d613..000000000 --- a/fern/versions/latest/pages/code_reference/engine/seed_readers.mdx +++ /dev/null @@ -1,9 +0,0 @@ ---- -title: "Seed Readers" -description: "" -position: 2 ---- - -Seed readers are engine-side adapters that turn a configured seed source into tabular seed rows. The engine attaches a `SeedSource` and secret resolver, asks the reader for column names and dataset size, then streams batches into generation. - -Related pages: [seeds](/code-reference/config/seeds), [Seed Datasets](/concepts/seed-datasets), and [FileSystemSeedReader Plugins](/plugins/file-system-seed-reader-plugins). diff --git a/fern/versions/latest/pages/code_reference/index.mdx b/fern/versions/latest/pages/code_reference/index.mdx deleted file mode 100644 index 4777cc3f3..000000000 --- a/fern/versions/latest/pages/code_reference/index.mdx +++ /dev/null @@ -1,15 +0,0 @@ ---- -title: "Code Reference" -description: "" -position: 1 ---- - -Data Designer is implemented as three installable packages that share the `data_designer` namespace. - -| Package | Namespace | Role | -|---------|-----------|------| -| `data-designer-config` | `data_designer.config` | Configuration schemas, builder APIs, plugin registration objects, and result schemas. | -| `data-designer-engine` | `data_designer.engine` | Runtime extension contracts for generation, seed reading, processing, and MCP tool execution. | -| `data-designer` | `data_designer.interface` | Public entry points for previewing, creating, and inspecting generated datasets. | - -The dependency direction is `interface -> engine -> config`. Config objects describe what should happen, engine objects implement how it happens, and interface objects expose the supported public API. diff --git a/fern/versions/latest/pages/code_reference/interface/data_designer.mdx b/fern/versions/latest/pages/code_reference/interface/data_designer.mdx deleted file mode 100644 index fe02b5907..000000000 --- a/fern/versions/latest/pages/code_reference/interface/data_designer.mdx +++ /dev/null @@ -1,9 +0,0 @@ ---- -title: "DataDesigner Interface" -description: "" -position: 2 ---- - -`DataDesigner` validates configs, generates in-memory previews, creates persisted datasets, lists configured MCP tools, and exposes default model settings. - -For runtime settings passed through `set_run_config()`, see [run_config](/code-reference/config/run-config). For persisted creation results returned by `create()`, see [results](results). diff --git a/fern/versions/latest/pages/code_reference/interface/errors.mdx b/fern/versions/latest/pages/code_reference/interface/errors.mdx deleted file mode 100644 index bba594e19..000000000 --- a/fern/versions/latest/pages/code_reference/interface/errors.mdx +++ /dev/null @@ -1,9 +0,0 @@ ---- -title: "Interface Errors" -description: "" -position: 4 ---- - -Interface errors represent failures surfaced at the public API boundary. `DataDesignerGenerationError` wraps dataset generation failures from `create()` and `preview()`, `DataDesignerEarlyShutdownError` identifies generation runs that terminate early without producing records, and `DataDesignerProfilingError` wraps profiling failures from those methods. - -These errors inherit from `data_designer.errors.DataDesignerError`, allowing callers to catch either specific interface failures or the project-wide base error type. diff --git a/fern/versions/latest/pages/code_reference/interface/index.mdx b/fern/versions/latest/pages/code_reference/interface/index.mdx deleted file mode 100644 index fa3dc27de..000000000 --- a/fern/versions/latest/pages/code_reference/interface/index.mdx +++ /dev/null @@ -1,11 +0,0 @@ ---- -title: "Interface Package" -description: "" -position: 1 ---- - -The `data-designer` package provides the top-level user-facing package surface. This section covers `data_designer.interface`, which contains `DataDesigner`, persisted dataset creation results, and interface-level errors. - -This package sits above engine and config. `DataDesigner` accepts Data Designer configs, calls the runtime layer, and returns preview or persisted creation results. - -Start with [data_designer](data-designer) for previewing, creating, and inspecting datasets from a config. Use [results](results) for the object returned by persisted dataset creation, and [errors](errors) for exceptions surfaced at the public API boundary. diff --git a/fern/versions/latest/pages/code_reference/interface/results.mdx b/fern/versions/latest/pages/code_reference/interface/results.mdx deleted file mode 100644 index 917f999bd..000000000 --- a/fern/versions/latest/pages/code_reference/interface/results.mdx +++ /dev/null @@ -1,9 +0,0 @@ ---- -title: "Dataset Creation Results" -description: "" -position: 3 ---- - -`DatasetCreationResults` is returned by `DataDesigner.create()`. It provides access to persisted creation artifacts, including the generated dataset, profiling analysis, processor outputs, task traces, dataset metadata, and Hugging Face Hub upload support. - -Preview generation uses the in-memory `data_designer.config.preview_results.PreviewResults` object returned by `DataDesigner.preview()`. Persisted dataset creation uses `DatasetCreationResults`. diff --git a/fern/versions/latest/pages/concepts/columns.mdx b/fern/versions/latest/pages/concepts/columns.mdx index 9f4e82527..daab64cfa 100644 --- a/fern/versions/latest/pages/concepts/columns.mdx +++ b/fern/versions/latest/pages/concepts/columns.mdx @@ -234,4 +234,4 @@ Computed property listing columns created implicitly alongside the primary colum - `{name}__trace`: Created when `with_trace` is not `TraceType.NONE` on the column. - `{name}__reasoning_content`: Created when `extract_reasoning_content=True` on the column. -For detailed information on each column type, refer to the [column configuration code reference](/code-reference/config/column-configs). +For examples of column type usage, see the tutorials and recipe pages. diff --git a/fern/versions/latest/pages/concepts/custom_columns.mdx b/fern/versions/latest/pages/concepts/custom_columns.mdx index 7867572ee..82b84053d 100644 --- a/fern/versions/latest/pages/concepts/custom_columns.mdx +++ b/fern/versions/latest/pages/concepts/custom_columns.mdx @@ -195,5 +195,4 @@ Mocking only `generate()` will silently no-op under the async engine because the ## See Also -- [Column Configs Reference](/code-reference/config/column-configs) - [Plugins Overview](/plugins/overview) diff --git a/fern/versions/latest/pages/concepts/models/model-configs.mdx b/fern/versions/latest/pages/concepts/models/model-configs.mdx index a784b5746..11834f035 100644 --- a/fern/versions/latest/pages/concepts/models/model-configs.mdx +++ b/fern/versions/latest/pages/concepts/models/model-configs.mdx @@ -150,5 +150,4 @@ Note that skipping health checks means errors will only be discovered during act - **[Default Model Settings](/concepts/models/default-model-settings)**: Pre-configured model settings included with Data Designer - **[Custom Model Settings](/concepts/models/custom-model-settings)**: Learn how to create custom providers and model configurations - **[Configure Model Settings With the CLI](/concepts/models/configure-with-the-cli)**: Use the CLI to manage model settings -- **[Column Configurations](/code-reference/config/column-configs)**: Learn how to use models in column configurations - **[Architecture & Performance](/concepts/architecture-and-performance)**: Understanding separation of concerns and optimizing concurrency diff --git a/fern/versions/latest/pages/concepts/person_sampling.mdx b/fern/versions/latest/pages/concepts/person_sampling.mdx index 6b1088419..206d0f47e 100644 --- a/fern/versions/latest/pages/concepts/person_sampling.mdx +++ b/fern/versions/latest/pages/concepts/person_sampling.mdx @@ -43,7 +43,7 @@ config_builder.add_column( ) ``` -For mor details, see the documentation for [`SamplerColumnConfig`](/code-reference/config/column-configs#data_designer.config.column_configs.SamplerColumnConfig) and [`PersonFromFakerSamplerParams`](/code-reference/config/sampler-params#data_designer.config.sampler_params.PersonFromFakerSamplerParams). +Use `SamplerColumnConfig` with `PersonFromFakerSamplerParams` when you need locale-aware synthetic person fields. --- @@ -164,7 +164,7 @@ config_builder.add_column( ) ``` -For more details, see the documentation for [`SamplerColumnConfig`](/code-reference/config/column-configs#data_designer.config.column_configs.SamplerColumnConfig) and [`PersonSamplerParams`](/code-reference/config/sampler-params#data_designer.config.sampler_params.PersonSamplerParams). +Use `SamplerColumnConfig` with `PersonSamplerParams` when you need richer personas from curated datasets. ### Available Data Fields diff --git a/fern/versions/latest/pages/concepts/security.mdx b/fern/versions/latest/pages/concepts/security.mdx index 24b413b75..8651b10e2 100644 --- a/fern/versions/latest/pages/concepts/security.mdx +++ b/fern/versions/latest/pages/concepts/security.mdx @@ -205,4 +205,3 @@ For example, this is often reasonable in a notebook, local script, or other sing ## Related Reading - [Deployment Options: Library vs. Microservice](/concepts/deployment-options) -- [Run Config Reference](/code-reference/config/run-config) diff --git a/fern/versions/latest/pages/concepts/tool_use_and_mcp.mdx b/fern/versions/latest/pages/concepts/tool_use_and_mcp.mdx index 7f8133a37..7ea2ee8a4 100644 --- a/fern/versions/latest/pages/concepts/tool_use_and_mcp.mdx +++ b/fern/versions/latest/pages/concepts/tool_use_and_mcp.mdx @@ -66,7 +66,3 @@ builder.add_column( ## Example See the [PDF Q&A Recipe](/recipes/mcp-and-tool-use/pdf-document-qa) for a complete working example. - -## Code Reference - -For internal architecture and API documentation, see [MCP Code Reference](/code-reference/config/mcp). diff --git a/fern/versions/latest/pages/concepts/validators.mdx b/fern/versions/latest/pages/concepts/validators.mdx index 0aeeae9ea..471f6e4dc 100644 --- a/fern/versions/latest/pages/concepts/validators.mdx +++ b/fern/versions/latest/pages/concepts/validators.mdx @@ -295,10 +295,6 @@ builder.add_column( The `target_columns` parameter specifies which columns to validate. All target columns are passed to the validator together (except for code validators, which process each column separately). -### Configuration Parameters - -See more about parameters used to instantiate `ValidationColumnConfig` in the [code reference](/code-reference/config/column-configs#data_designer.config.column_configs.ValidationColumnConfig). - ### Batch Size Considerations Larger batch sizes improve efficiency but consume more memory: @@ -336,7 +332,3 @@ builder.add_column( ``` **Note**: Code validators always process each target column separately, even when multiple columns are specified. Local callable and remote validators receive all target columns together. - -## See Also - -- [Validator Parameters Reference](/code-reference/config/validator-params): Configuration object schemas diff --git a/fern/versions/latest/pages/plugins/example.mdx b/fern/versions/latest/pages/plugins/example.mdx index 5a41b9774..77aeced33 100644 --- a/fern/versions/latest/pages/plugins/example.mdx +++ b/fern/versions/latest/pages/plugins/example.mdx @@ -39,7 +39,7 @@ data-designer-index-multiplier/ ### Step 2: Create the config class -The configuration class defines what parameters users can set when using your plugin. For column generator plugins, it must inherit from [SingleColumnConfig](/code-reference/config/column-configs#data_designer.config.column_configs.SingleColumnConfig) and include a [discriminator field](https://docs.pydantic.dev/latest/concepts/unions/#discriminated-unions). +The configuration class defines what parameters users can set when using your plugin. For column generator plugins, it must inherit from `SingleColumnConfig` and include a [discriminator field](https://docs.pydantic.dev/latest/concepts/unions/#discriminated-unions). Create `src/data_designer_index_multiplier/config.py`: diff --git a/fern/versions/latest/pages/plugins/overview.mdx b/fern/versions/latest/pages/plugins/overview.mdx index 9dc364a96..c03a5db35 100644 --- a/fern/versions/latest/pages/plugins/overview.mdx +++ b/fern/versions/latest/pages/plugins/overview.mdx @@ -10,9 +10,9 @@ Plugins let you add new object types to Data Designer without modifying the core Data Designer supports three plugin types: -- **Column generator plugins**: Custom column types you pass to the config builder's [add_column](/code-reference/config/config-builder#data_designer.config.config_builder.DataDesignerConfigBuilder.add_column) method. +- **Column generator plugins**: Custom column types you pass to the config builder's `add_column` method. - **Seed reader plugins**: Custom seed dataset readers that let you load data from new sources, such as databases, cloud storage, or custom file formats. -- **Processor plugins**: Custom processors that transform data before batches, after batches, or after generation completes. Pass them to the config builder's [add_processor](/code-reference/config/config-builder#data_designer.config.config_builder.DataDesignerConfigBuilder.add_processor) method. +- **Processor plugins**: Custom processors that transform data before batches, after batches, or after generation completes. Pass them to the config builder's `add_processor` method. ## How do you use plugins? diff --git a/mkdocs.yml b/mkdocs.yml index 89d425dfd..a5df6f103 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -72,34 +72,6 @@ nav: - Build Your Own: plugins/build_your_own.md - Using Models: plugins/models.md - Discover Plugins: plugins/discover.md - - Code Reference: - - Overview: code_reference/index.md - # Keep module reference pages ordered alphabetically by nav label within each package group. - - Config: - - Overview: code_reference/config/index.md - - analysis: code_reference/config/analysis.md - - column_configs: code_reference/config/column_configs.md - - config_builder: code_reference/config/config_builder.md - - data_designer_config: code_reference/config/data_designer_config.md - - mcp: code_reference/config/mcp.md - - models: code_reference/config/models.md - - plugins: code_reference/config/plugins.md - - processors: code_reference/config/processors.md - - run_config: code_reference/config/run_config.md - - sampler_params: code_reference/config/sampler_params.md - - seeds: code_reference/config/seeds.md - - validator_params: code_reference/config/validator_params.md - - Engine: - - Overview: code_reference/engine/index.md - - column_generators: code_reference/engine/column_generators.md - - mcp: code_reference/engine/mcp.md - - processors: code_reference/engine/processors.md - - seed_readers: code_reference/engine/seed_readers.md - - Interface: - - Overview: code_reference/interface/index.md - - data_designer: code_reference/interface/data_designer.md - - errors: code_reference/interface/errors.md - - results: code_reference/interface/results.md - Dev Notes: # NOTE: Order is most recent -> oldest (so sidebar shows recent first!) - devnotes/index.md @@ -154,9 +126,6 @@ extra: default: latest watch: - - packages/data-designer-config/src/data_designer - - packages/data-designer-engine/src/data_designer - - packages/data-designer/src/data_designer - docs/ plugins: @@ -179,28 +148,9 @@ plugins: include_source: True ignore: - "assets/recipes/**/*.py" - - mkdocstrings: - handlers: - python: - paths: - - packages/data-designer-config/src - - packages/data-designer-engine/src - - packages/data-designer/src - options: - show_symbol_type_heading: true - show_symbol_type_toc: true - show_root_toc_entry: true - show_object_full_path: false - filters: ["!^_"] - docstring_options: - ignore_init_summary: false - merge_init_into_class: true - docstring_section_style: table - summary: true extra_css: - css/style.css - - css/mkdocstrings.css extra_javascript: - js/toc-toggle.js diff --git a/pyproject.toml b/pyproject.toml index 828de9cd3..628832b77 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -51,10 +51,7 @@ docs = [ "mkdocs-material>=9.6.22,<10", "mkdocs-redirects>=1.2.2,<2", "mkdocs>=1.6.1,<2", - "mkdocstrings-python>=1.18.2,<2", - "mkdocstrings>=0.30.1,<1", "nbconvert>=7.17.1,<8", # 7.17.1 fixes security advisory pulled in by mkdocs-jupyter - "py2fern==0.1.6", "pymdown-extensions>=10.21.2,<11", ] notebooks = [ diff --git a/uv.lock b/uv.lock index d21c51538..b3e910024 100644 --- a/uv.lock +++ b/uv.lock @@ -269,18 +269,6 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/ed/c9/d7977eaacb9df673210491da99e6a247e93df98c715fc43fd136ce1d3d33/arrow-1.4.0-py3-none-any.whl", hash = "sha256:749f0769958ebdc79c173ff0b0670d59051a535fa26e8eba02953dc19eb43205", size = 68797, upload-time = "2025-10-18T17:46:45.663Z" }, ] -[[package]] -name = "astroid" -version = "3.3.11" -source = { registry = "https://pypi.org/simple" } -dependencies = [ - { name = "typing-extensions", marker = "python_full_version < '3.11'" }, -] -sdist = { url = "https://files.pythonhosted.org/packages/18/74/dfb75f9ccd592bbedb175d4a32fc643cf569d7c218508bfbd6ea7ef9c091/astroid-3.3.11.tar.gz", hash = "sha256:1e5a5011af2920c7c67a53f65d536d65bfa7116feeaf2354d8b94f29573bb0ce", size = 400439, upload-time = "2025-07-13T18:04:23.177Z" } -wheels = [ - { url = "https://files.pythonhosted.org/packages/af/0f/3b8fdc946b4d9cc8cc1e8af42c4e409468c84441b933d037e101b3d72d86/astroid-3.3.11-py3-none-any.whl", hash = "sha256:54c760ae8322ece1abd213057c4b5bba7c49818853fc901ef09719a60dbf9dec", size = 275612, upload-time = "2025-07-13T18:04:21.07Z" }, -] - [[package]] name = "asttokens" version = "3.0.1" @@ -934,10 +922,7 @@ docs = [ { name = "mkdocs-jupyter" }, { name = "mkdocs-material" }, { name = "mkdocs-redirects" }, - { name = "mkdocstrings" }, - { name = "mkdocstrings-python" }, { name = "nbconvert" }, - { name = "py2fern" }, { name = "pymdown-extensions" }, ] notebooks = [ @@ -977,10 +962,7 @@ docs = [ { name = "mkdocs-jupyter", specifier = ">=0.25.1,<1" }, { name = "mkdocs-material", specifier = ">=9.6.22,<10" }, { name = "mkdocs-redirects", specifier = ">=1.2.2,<2" }, - { name = "mkdocstrings", specifier = ">=0.30.1,<1" }, - { name = "mkdocstrings-python", specifier = ">=1.18.2,<2" }, { name = "nbconvert", specifier = ">=7.17.1,<8" }, - { name = "py2fern", specifier = "==0.1.6" }, { name = "pymdown-extensions", specifier = ">=10.21.2,<11" }, ] notebooks = [ @@ -1113,15 +1095,6 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/ba/5a/18ad964b0086c6e62e2e7500f7edc89e3faa45033c71c1893d34eed2b2de/dnspython-2.8.0-py3-none-any.whl", hash = "sha256:01d9bbc4a2d76bf0db7c1f729812ded6d912bd318d3b1cf81d30c0f845dbf3af", size = 331094, upload-time = "2025-09-07T18:57:58.071Z" }, ] -[[package]] -name = "docstring-parser" -version = "0.18.0" -source = { registry = "https://pypi.org/simple" } -sdist = { url = "https://files.pythonhosted.org/packages/e0/4d/f332313098c1de1b2d2ff91cf2674415cc7cddab2ca1b01ae29774bd5fdf/docstring_parser-0.18.0.tar.gz", hash = "sha256:292510982205c12b1248696f44959db3cdd1740237a968ea1e2e7a900eeb2015", size = 29341, upload-time = "2026-04-14T04:09:19.867Z" } -wheels = [ - { url = "https://files.pythonhosted.org/packages/a7/5f/ed01f9a3cdffbd5a008556fc7b2a08ddb1cc6ace7effa7340604b1d16699/docstring_parser-0.18.0-py3-none-any.whl", hash = "sha256:b3fcbed555c47d8479be0796ef7e19c2670d428d72e96da63f3a40122860374b", size = 22484, upload-time = "2026-04-14T04:09:18.638Z" }, -] - [[package]] name = "duckdb" version = "1.5.0" @@ -1384,41 +1357,6 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/f7/ec/67fbef5d497f86283db54c22eec6f6140243aae73265799baaaa19cd17fb/ghp_import-2.1.0-py3-none-any.whl", hash = "sha256:8337dd7b50877f163d4c0289bc1f1c7f127550241988d568c1db512c4324a619", size = 11034, upload-time = "2022-05-02T15:47:14.552Z" }, ] -[[package]] -name = "griffe" -version = "2.0.0" -source = { registry = "https://pypi.org/simple" } -dependencies = [ - { name = "griffecli" }, - { name = "griffelib" }, -] -sdist = { url = "https://files.pythonhosted.org/packages/04/56/28a0accac339c164b52a92c6cfc45a903acc0c174caa5c1713803467b533/griffe-2.0.0.tar.gz", hash = "sha256:c68979cd8395422083a51ea7cf02f9c119d889646d99b7b656ee43725de1b80f", size = 293906, upload-time = "2026-03-23T21:06:53.402Z" } -wheels = [ - { url = "https://files.pythonhosted.org/packages/8b/94/ee21d41e7eb4f823b94603b9d40f86d3c7fde80eacc2c3c71845476dddaa/griffe-2.0.0-py3-none-any.whl", hash = "sha256:5418081135a391c3e6e757a7f3f156f1a1a746cc7b4023868ff7d5e2f9a980aa", size = 5214, upload-time = "2026-02-09T19:09:44.105Z" }, -] - -[[package]] -name = "griffecli" -version = "2.0.0" -source = { registry = "https://pypi.org/simple" } -dependencies = [ - { name = "colorama" }, - { name = "griffelib" }, -] -sdist = { url = "https://files.pythonhosted.org/packages/a4/f8/2e129fd4a86e52e58eefe664de05e7d502decf766e7316cc9e70fdec3e18/griffecli-2.0.0.tar.gz", hash = "sha256:312fa5ebb4ce6afc786356e2d0ce85b06c1c20d45abc42d74f0cda65e159f6ef", size = 56213, upload-time = "2026-03-23T21:06:54.8Z" } -wheels = [ - { url = "https://files.pythonhosted.org/packages/e6/ed/d93f7a447bbf7a935d8868e9617cbe1cadf9ee9ee6bd275d3040fbf93d60/griffecli-2.0.0-py3-none-any.whl", hash = "sha256:9f7cd9ee9b21d55e91689358978d2385ae65c22f307a63fb3269acf3f21e643d", size = 9345, upload-time = "2026-02-09T19:09:42.554Z" }, -] - -[[package]] -name = "griffelib" -version = "2.0.0" -source = { registry = "https://pypi.org/simple" } -sdist = { url = "https://files.pythonhosted.org/packages/ad/06/eccbd311c9e2b3ca45dbc063b93134c57a1ccc7607c5e545264ad092c4a9/griffelib-2.0.0.tar.gz", hash = "sha256:e504d637a089f5cab9b5daf18f7645970509bf4f53eda8d79ed71cce8bd97934", size = 166312, upload-time = "2026-03-23T21:06:55.954Z" } -wheels = [ - { url = "https://files.pythonhosted.org/packages/4d/51/c936033e16d12b627ea334aaaaf42229c37620d0f15593456ab69ab48161/griffelib-2.0.0-py3-none-any.whl", hash = "sha256:01284878c966508b6d6f1dbff9b6fa607bc062d8261c5c7253cb285b06422a7f", size = 142004, upload-time = "2026-02-09T19:09:40.561Z" }, -] - [[package]] name = "h11" version = "0.16.0" @@ -2504,20 +2442,6 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/22/5b/dbc6a8cddc9cfa9c4971d59fb12bb8d42e161b7e7f8cc89e49137c5b279c/mkdocs-1.6.1-py3-none-any.whl", hash = "sha256:db91759624d1647f3f34aa0c3f327dd2601beae39a366d6e064c03468d35c20e", size = 3864451, upload-time = "2024-08-30T12:24:05.054Z" }, ] -[[package]] -name = "mkdocs-autorefs" -version = "1.4.4" -source = { registry = "https://pypi.org/simple" } -dependencies = [ - { name = "markdown" }, - { name = "markupsafe" }, - { name = "mkdocs" }, -] -sdist = { url = "https://files.pythonhosted.org/packages/52/c0/f641843de3f612a6b48253f39244165acff36657a91cc903633d456ae1ac/mkdocs_autorefs-1.4.4.tar.gz", hash = "sha256:d54a284f27a7346b9c38f1f852177940c222da508e66edc816a0fa55fc6da197", size = 56588, upload-time = "2026-02-10T15:23:55.105Z" } -wheels = [ - { url = "https://files.pythonhosted.org/packages/28/de/a3e710469772c6a89595fc52816da05c1e164b4c866a89e3cb82fb1b67c5/mkdocs_autorefs-1.4.4-py3-none-any.whl", hash = "sha256:834ef5408d827071ad1bc69e0f39704fa34c7fc05bc8e1c72b227dfdc5c76089", size = 25530, upload-time = "2026-02-10T15:23:53.817Z" }, -] - [[package]] name = "mkdocs-get-deps" version = "0.2.2" @@ -2592,38 +2516,6 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/c4/ec/38443b1f2a3821bbcb24e46cd8ba979154417794d54baf949fefde1c2146/mkdocs_redirects-1.2.2-py3-none-any.whl", hash = "sha256:7dbfa5647b79a3589da4401403d69494bd1f4ad03b9c15136720367e1f340ed5", size = 6142, upload-time = "2024-11-07T14:57:19.143Z" }, ] -[[package]] -name = "mkdocstrings" -version = "0.30.1" -source = { registry = "https://pypi.org/simple" } -dependencies = [ - { name = "jinja2" }, - { name = "markdown" }, - { name = "markupsafe" }, - { name = "mkdocs" }, - { name = "mkdocs-autorefs" }, - { name = "pymdown-extensions" }, -] -sdist = { url = "https://files.pythonhosted.org/packages/c5/33/2fa3243439f794e685d3e694590d28469a9b8ea733af4b48c250a3ffc9a0/mkdocstrings-0.30.1.tar.gz", hash = "sha256:84a007aae9b707fb0aebfc9da23db4b26fc9ab562eb56e335e9ec480cb19744f", size = 106350, upload-time = "2025-09-19T10:49:26.446Z" } -wheels = [ - { url = "https://files.pythonhosted.org/packages/7b/2c/f0dc4e1ee7f618f5bff7e05898d20bf8b6e7fa612038f768bfa295f136a4/mkdocstrings-0.30.1-py3-none-any.whl", hash = "sha256:41bd71f284ca4d44a668816193e4025c950b002252081e387433656ae9a70a82", size = 36704, upload-time = "2025-09-19T10:49:24.805Z" }, -] - -[[package]] -name = "mkdocstrings-python" -version = "1.19.0" -source = { registry = "https://pypi.org/simple" } -dependencies = [ - { name = "griffe" }, - { name = "mkdocs-autorefs" }, - { name = "mkdocstrings" }, - { name = "typing-extensions", marker = "python_full_version < '3.11'" }, -] -sdist = { url = "https://files.pythonhosted.org/packages/75/1c/3af8413919b0839b96a78f60e8bd0dfd26c844d3717eeb77f80b43f5be1c/mkdocstrings_python-1.19.0.tar.gz", hash = "sha256:917aac66cf121243c11db5b89f66b0ded6c53ec0de5318ff5e22424eb2f2e57c", size = 204010, upload-time = "2025-11-10T13:30:55.915Z" } -wheels = [ - { url = "https://files.pythonhosted.org/packages/98/5c/2597cef67b6947b15c47f8dba967a0baf19fbdfdc86f6e4a8ba7af8b581a/mkdocstrings_python-1.19.0-py3-none-any.whl", hash = "sha256:395c1032af8f005234170575cc0c5d4d20980846623b623b35594281be4a3059", size = 143417, upload-time = "2025-11-10T13:30:54.164Z" }, -] - [[package]] name = "multidict" version = "6.7.1" @@ -3498,23 +3390,6 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/8e/37/efad0257dc6e593a18957422533ff0f87ede7c9c6ea010a2177d738fb82f/pure_eval-0.2.3-py3-none-any.whl", hash = "sha256:1db8e35b67b3d218d818ae653e27f06c3aa420901fa7b081ca98cbedc874e0d0", size = 11842, upload-time = "2024-07-21T12:58:20.04Z" }, ] -[[package]] -name = "py2fern" -version = "0.1.6" -source = { registry = "https://pypi.org/simple" } -dependencies = [ - { name = "astroid" }, - { name = "docstring-parser" }, - { name = "pyyaml" }, - { name = "tomli", marker = "python_full_version < '3.11'" }, - { name = "typer" }, - { name = "typing-extensions" }, -] -sdist = { url = "https://files.pythonhosted.org/packages/21/28/761a80183ce7d51f5acf6357dbf55b3d34f7f9b454dc0c97eedc8230c946/py2fern-0.1.6.tar.gz", hash = "sha256:8f26a89313fd0d852c06f8a1a84ce0016afb513efc97eacef92fe30f9dbae655", size = 65915, upload-time = "2025-11-24T03:30:27.356Z" } -wheels = [ - { url = "https://files.pythonhosted.org/packages/4b/7b/f81028a600ac1b7004059f101a611a05fe506258f9fb71afe1bae576d9de/py2fern-0.1.6-py3-none-any.whl", hash = "sha256:dfa3c5c854c27cfa322b47d3e0b45b49ab1ad28e1bf551c5134763ba290ea73f", size = 64221, upload-time = "2025-11-24T03:30:26.114Z" }, -] - [[package]] name = "pyarrow" version = "19.0.1"