@@ -65,14 +65,6 @@ data-designer config providers

**Delete all providers**: Remove all providers and their associated models.

**Change default provider**: Set which provider is used by default. This option is only available when multiple providers are configured.

!!! warning "Deprecated: 'Change default provider' workflow"
The "Change default provider" workflow is **deprecated** and will be removed in a future
release alongside the registry-level default. Specify `provider=` explicitly on each
`ModelConfig` instead — the workflow now emits a `DeprecationWarning` when entered.
See [issue #589](https://github.com/NVIDIA-NeMo/DataDesigner/issues/589).

## Managing Model Configurations

Run the interactive model configuration command:
@@ -117,7 +109,6 @@ data-designer config list
This command displays:

- **Model Providers**: All configured providers with their endpoints (API keys are masked)
- **Default Provider**: The currently selected default provider _(deprecated; see issue #589)_
- **Model Configurations**: All configured models with their settings

## Resetting Configurations
7 changes: 0 additions & 7 deletions docs/concepts/models/custom-model-settings.md
@@ -90,13 +90,6 @@ preview_result.display_sample_record()
!!! note "Default Providers Always Available"
When you only specify `model_configs`, the default model providers (NVIDIA, OpenAI, and OpenRouter) are still available. You only need to create custom providers if you want to connect to different endpoints or modify provider settings.

!!! warning "Always specify `provider=` on `ModelConfig`"
Leaving `provider` unset (or passing `provider=None`) on `ModelConfig` is **deprecated**.
The legacy "implicit default provider" routing — used when `provider` is omitted — emits
a `DeprecationWarning` and will be removed in a future release. Always reference the
intended provider by name, as the examples below do. See
[issue #589](https://github.com/NVIDIA-NeMo/DataDesigner/issues/589).

!!! tip "Mixing Custom and Default Models"
When you provide custom `model_configs` to `DataDesignerConfigBuilder`, they **replace** the defaults entirely. To use custom model configs in addition to the default configs, use the `add_model_config` method:

7 changes: 0 additions & 7 deletions docs/concepts/models/default-model-settings.md
@@ -110,13 +110,6 @@ Both methods operate on the same files, ensuring consistency across your entire
!!! warning "Hosted Provider Data Handling"
The default model providers call hosted endpoints operated by NVIDIA, OpenAI, OpenRouter, or their upstream providers. Provider terms and privacy practices apply independently of Data Designer, and free or trial endpoints may log request data for security, operations, or product improvement. Do not submit confidential information or personal data, including faces, voices, screenshots, regulated data, or other sensitive content, unless the selected provider and endpoint are approved for your use case.

!!! warning "Deprecated: implicit default provider routing"
The `default:` key in `~/.data-designer/model_providers.yaml` and the registry-level
"default provider" concept are **deprecated** and will be removed in a future release.
Specify `provider=` explicitly on every `ModelConfig` instead — the built-in defaults
above already do this, and a `DeprecationWarning` is now emitted whenever the legacy
routing is exercised. See [issue #589](https://github.com/NVIDIA-NeMo/DataDesigner/issues/589).

!!! tip "Environment Variables"
Store your API keys in environment variables rather than hardcoding them in your scripts:

5 changes: 4 additions & 1 deletion docs/concepts/models/model-configs.md
@@ -15,9 +15,12 @@ The `ModelConfig` class has the following fields:
| `alias` | `str` | Yes | Unique identifier for this model configuration (e.g., `"my-text-model"`, `"reasoning-model"`) |
| `model` | `str` | Yes | Model identifier as recognized by the provider (e.g., `"nvidia/nemotron-3-nano-30b-a3b"`, `"gpt-4"`) |
| `inference_parameters` | `InferenceParamsT` | No | Controls model behavior during generation. Use `ChatCompletionInferenceParams` for text/code/structured generation or `EmbeddingInferenceParams` for embeddings. Defaults to `ChatCompletionInferenceParams()` if not provided. The generation type is automatically determined by the inference parameters type. See [Inference Parameters](inference-parameters.md) for details. |
| `provider` | `str` | No | Reference to the name of the Provider to use (e.g., `"nvidia"`, `"openai"`, `"openrouter"`). If not specified, one set as the default provider, which may resolve to the first provider if there are more than one |
| `provider` | `str` | Yes | Reference to the name of the Provider to use (e.g., `"nvidia"`, `"openai"`, `"openrouter"`). |
| `skip_health_check` | `bool` | No | Whether to skip the health check for this model. Defaults to `False`. Set to `True` to skip health checks when you know the model is accessible or want to defer validation. |

!!! warning "Upgrade note"
Every `ModelConfig` must now specify `provider`. Existing `model_configs.yaml` entries from older releases that omit `provider` or set it to `null` must be updated with an explicit provider name before loading. Agent tooling that parses `data-designer agent context` should read each model alias item's `provider` field; the top-level `default_provider` and per-item `configured_provider` / `effective_provider` fields are no longer emitted.
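
One way to carry out the upgrade described above is to patch the parsed YAML before loading it. The sketch below is illustrative only. It assumes the file parses to a mapping with a top-level `model_configs` list (the shape used in this PR's test fixtures). `add_missing_providers` and the `provider_by_alias` mapping are hypothetical names, not part of Data Designer.

```python
from copy import deepcopy
from typing import Any


def add_missing_providers(
    raw_config: dict[str, Any],
    provider_by_alias: dict[str, str],
) -> dict[str, Any]:
    """Return a copy of raw_config in which every model entry has a provider.

    Hypothetical helper: entries that omit `provider` (or set it to None) get
    the provider name listed for their alias in provider_by_alias.
    """
    updated = deepcopy(raw_config)
    unresolved = []
    for entry in updated.get("model_configs", []):
        if entry.get("provider") is None:
            alias = entry.get("alias", "<unknown>")
            if alias in provider_by_alias:
                entry["provider"] = provider_by_alias[alias]
            else:
                unresolved.append(alias)
    if unresolved:
        # Fail loudly instead of guessing a provider.
        raise ValueError(f"No provider given for alias(es): {', '.join(unresolved)}")
    return updated


# Two legacy-style entries: one omits `provider`, one sets it to null.
legacy = {
    "model_configs": [
        {"alias": "my_own_code_model", "model": "openai/meta/llama-3.3-70b-instruct"},
        {"alias": "reasoning-model", "model": "gpt-4", "provider": None},
    ]
}
migrated = add_missing_providers(
    legacy,
    {"my_own_code_model": "openai", "reasoning-model": "openai"},
)
```

The helper copies its input, so the legacy data stays untouched, and it refuses to pick a provider on the caller's behalf: any alias without an explicit mapping raises an error.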


## Examples

8 changes: 0 additions & 8 deletions docs/concepts/models/model-providers.md
@@ -6,14 +6,6 @@ Model providers are external services that host and serve models. Data Designer

A `ModelProvider` defines how Data Designer connects to a provider's API endpoint. When you create a `ModelConfig`, you reference a provider by name, and Data Designer uses that provider's settings to make API calls to the appropriate endpoint.

!!! warning "Deprecated: implicit default provider routing"
Earlier versions of Data Designer let you omit `provider=` on `ModelConfig` and
fall back to a registry-level default — including the `default:` key in
`~/.data-designer/model_providers.yaml`. That implicit routing is **deprecated**
and will be removed in a future release. Always reference a provider by name on
every `ModelConfig`. A `DeprecationWarning` is now emitted when the legacy path
is exercised. See [issue #589](https://github.com/NVIDIA-NeMo/DataDesigner/issues/589).

## ModelProvider Configuration

The `ModelProvider` class has the following fields:
@@ -74,12 +74,6 @@ data-designer config providers

**Delete all providers**: Remove all providers and their associated models.

**Change default provider**: Set which provider is used by default. This option is only available when multiple providers are configured.

<Warning title="Deprecated: 'Change default provider' workflow">
The "Change default provider" workflow is **deprecated** and will be removed in a future release alongside the registry-level default. Specify `provider=` explicitly on each `ModelConfig` instead — the workflow now emits a `DeprecationWarning` when entered. See [issue #589](https://github.com/NVIDIA-NeMo/DataDesigner/issues/589).
</Warning>

## Managing Model Configurations

Run the interactive model configuration command:
@@ -128,7 +122,6 @@ data-designer config list
This command displays:

- **Model Providers**: All configured providers with their endpoints (API keys are masked)
- **Default Provider**: The currently selected default provider _(deprecated; see [issue #589](https://github.com/NVIDIA-NeMo/DataDesigner/issues/589))_
- **Model Configurations**: All configured models with their settings

## Resetting Configurations
@@ -94,9 +94,9 @@ preview_result.display_sample_record()
When you only specify `model_configs`, the default model providers (NVIDIA, OpenAI, and OpenRouter) are still available. You only need to create custom providers if you want to connect to different endpoints or modify provider settings.
</Note>

<Warning title="Always specify `provider=` on `ModelConfig`">
Leaving `provider` unset (or passing `provider=None`) on `ModelConfig` is **deprecated**. The legacy "implicit default provider" routing — used when `provider` is omitted — emits a `DeprecationWarning` and will be removed in a future release. Always reference the intended provider by name, as the examples below do. See [issue #589](https://github.com/NVIDIA-NeMo/DataDesigner/issues/589).
</Warning>
<Note title="Provider is required">
Every custom `ModelConfig` must reference the intended provider by name. The examples below use the built-in `nvidia` provider.
</Note>

<Tip title="Mixing Custom and Default Models">
When you provide custom `model_configs` to `DataDesignerConfigBuilder`, they **replace** the defaults entirely. To use custom model configs in addition to the default configs, use the `add_model_config` method:
@@ -116,10 +116,6 @@ Both methods operate on the same files, ensuring consistency across your entire
The default model providers call hosted endpoints operated by NVIDIA, OpenAI, OpenRouter, or their upstream providers. Provider terms and privacy practices apply independently of Data Designer, and free or trial endpoints may log request data for security, operations, or product improvement. Do not submit confidential information or personal data, including faces, voices, screenshots, regulated data, or other sensitive content, unless the selected provider and endpoint are approved for your use case.
</Warning>

<Warning title="Deprecated: implicit default provider routing">
The `default:` key in `~/.data-designer/model_providers.yaml` and the registry-level "default provider" concept are **deprecated** and will be removed in a future release. Specify `provider=` explicitly on every `ModelConfig` instead — the built-in defaults above already do this, and a `DeprecationWarning` is now emitted whenever the legacy routing is exercised. See [issue #589](https://github.com/NVIDIA-NeMo/DataDesigner/issues/589).
</Warning>

<Tip title="Environment Variables">
Store your API keys in environment variables rather than hardcoding them in your scripts:

@@ -18,9 +18,13 @@ The `ModelConfig` class has the following fields:
| `alias` | `str` | Yes | Unique identifier for this model configuration (e.g., `"my-text-model"`, `"reasoning-model"`) |
| `model` | `str` | Yes | Model identifier as recognized by the provider (e.g., `"nvidia/nemotron-3-nano-30b-a3b"`, `"gpt-4"`) |
| `inference_parameters` | `InferenceParamsT` | No | Controls model behavior during generation. Use `ChatCompletionInferenceParams` for text/code/structured generation or `EmbeddingInferenceParams` for embeddings. Defaults to `ChatCompletionInferenceParams()` if not provided. The generation type is automatically determined by the inference parameters type. See [Inference Parameters](/concepts/models/inference-parameters) for details. |
| `provider` | `str` | No | Reference to the name of the Provider to use (e.g., `"nvidia"`, `"openai"`, `"openrouter"`). If not specified, one set as the default provider, which may resolve to the first provider if there are more than one |
| `provider` | `str` | Yes | Reference to the name of the Provider to use (e.g., `"nvidia"`, `"openai"`, `"openrouter"`). |
| `skip_health_check` | `bool` | No | Whether to skip the health check for this model. Defaults to `False`. Set to `True` to skip health checks when you know the model is accessible or want to defer validation. |

<Warning title="Upgrade note">
Every `ModelConfig` must now specify `provider`. Existing `model_configs.yaml` entries from older releases that omit `provider` or set it to `null` must be updated with an explicit provider name before loading. Agent tooling that parses `data-designer agent context` should read each model alias item's `provider` field; the top-level `default_provider` and per-item `configured_provider` / `effective_provider` fields are no longer emitted.
</Warning>


## Examples

@@ -9,10 +9,6 @@ Model providers are external services that host and serve models. Data Designer

A `ModelProvider` defines how Data Designer connects to a provider's API endpoint. When you create a `ModelConfig`, you reference a provider by name, and Data Designer uses that provider's settings to make API calls to the appropriate endpoint.

<Warning title="Deprecated: implicit default provider routing">
Earlier versions of Data Designer let you omit `provider=` on `ModelConfig` and fall back to a registry-level default — including the `default:` key in `~/.data-designer/model_providers.yaml`. That implicit routing is **deprecated** and will be removed in a future release. Always reference a provider by name on every `ModelConfig`. A `DeprecationWarning` is now emitted when the legacy path is exercised. See [issue #589](https://github.com/NVIDIA-NeMo/DataDesigner/issues/589).
</Warning>

## ModelProvider Configuration

The `ModelProvider` class has the following fields:
@@ -24,7 +24,6 @@
PREDEFINED_PROVIDERS_MODEL_MAP,
)
from data_designer.config.utils.io_helpers import load_config_file, save_config_file
from data_designer.config.utils.warning_helpers import warn_at_caller

logger = logging.getLogger(__name__)

@@ -95,31 +94,6 @@ def get_default_providers() -> list[ModelProvider]:
return []


def get_default_provider_name() -> str | None:
"""Return the YAML's ``default:`` provider name, if set.

Deprecated: this function and the underlying YAML key are deprecated and
will be removed in a future release. Specify ``provider=`` explicitly on
each ``ModelConfig`` instead. See issue #589.
"""
default = _get_default_providers_file_content(MODEL_PROVIDERS_FILE_PATH).get("default")
if default is not None:
# ``warn_at_caller`` (rather than ``warnings.warn(stacklevel=2)``) so the
# warning attributes to the user's call site rather than this library
# module. The only real call path is ``DataDesigner.__init__``, which
# is itself a ``data_designer`` frame; under default Python filters,
# library-attributed ``DeprecationWarning`` entries are silenced
# (``ignore::DeprecationWarning``), so library attribution = invisible
# warning. See PR #594 review.
warn_at_caller(
f"The 'default:' key in {MODEL_PROVIDERS_FILE_PATH} is deprecated and will "
"be removed in a future release. Remove it and specify provider= explicitly "
"on each ModelConfig instead. See issue #589.",
DeprecationWarning,
)
return default


def resolve_seed_default_model_settings() -> None:
if not MODEL_CONFIGS_FILE_PATH.exists():
logger.debug(
@@ -31,7 +31,6 @@
load_image_path_to_base64,
)
from data_designer.config.utils.io_helpers import smart_load_yaml
from data_designer.config.utils.warning_helpers import warn_at_caller

logger = logging.getLogger(__name__)

@@ -504,17 +503,15 @@ class ModelConfig(ConfigBase):
model: Model identifier (e.g., from build.nvidia.com or other providers).
inference_parameters: Inference parameters for the model (temperature, top_p, max_tokens, etc.).
The generation_type is determined by the type of inference_parameters.
provider: Name of the model provider. Required in a future release. Leaving
``provider`` unset (or ``None``) currently routes through the registry's
implicit default and is **deprecated**; specify ``provider=`` explicitly.
See issue #589.
provider: Name of the model provider. Must match the ``name`` field of a
``ModelProvider`` registered with the surrounding ``DataDesigner`` instance.
skip_health_check: Whether to skip the health check for this model. Defaults to False.
"""

alias: str
model: str
inference_parameters: InferenceParamsT = Field(default_factory=ChatCompletionInferenceParams)
provider: str | None = None
provider: str
Contributor

Claude and Codex both caught a migration edge case here: once `provider` is required, an existing `model_configs.yaml` with a single provider-less entry makes `ModelRepository.load()` return `None`, so CLI and agent flows can treat the whole model file as missing. Could we add a migration/error path in `ModelRepository.load()` so users see "add provider to alias X" instead of silently losing the registry?
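
A rough sketch of the error path suggested in the comment above, using a plain dataclass as a stand-in for `ModelConfig`. `MissingProviderError`, `ModelEntry`, and `load_model_entries` are hypothetical names. How `ModelRepository.load()` is actually structured is not shown in this diff, so this only illustrates reporting the offending aliases instead of returning `None`.

```python
from dataclasses import dataclass
from typing import Any


class MissingProviderError(ValueError):
    """Hypothetical error that names every alias lacking a provider."""


@dataclass
class ModelEntry:
    # Stand-in for ModelConfig; only the fields relevant to this check.
    alias: str
    model: str
    provider: str


def load_model_entries(raw_entries: list[dict[str, Any]]) -> list[ModelEntry]:
    """Build entries, or raise one error listing all provider-less aliases."""
    missing = [e.get("alias", "<unknown>") for e in raw_entries if not e.get("provider")]
    if missing:
        raise MissingProviderError(
            "model_configs.yaml: add provider to alias(es): " + ", ".join(missing)
        )
    return [
        ModelEntry(alias=e["alias"], model=e["model"], provider=e["provider"])
        for e in raw_entries
    ]


entries = [
    {"alias": "test-model", "model": "openai/meta/llama-3.3-70b-instruct", "provider": "openai"},
    {"alias": "legacy-model", "model": "gpt-4"},  # provider omitted, as in older files
]
message = ""
try:
    load_model_entries(entries)
except MissingProviderError as exc:
    message = str(exc)
```

Collecting all offending aliases before raising lets a user fix the file in one pass rather than one entry at a time.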

skip_health_check: bool = False

@property
@@ -539,22 +536,6 @@ def _convert_inference_parameters(cls, value: Any) -> Any:
return ChatCompletionInferenceParams(**value)
return value

@model_validator(mode="after")
def _warn_on_implicit_provider(self) -> Self:
if self.provider is None:
# Use ``warn_at_caller`` so the warning is attributed to the user's
# ``ModelConfig(...)`` / ``model_validate(...)`` call rather than a
# pydantic-internal frame. Without this, every call dedupes to the
# same pydantic line and only the first emission is shown. See
# PR #594 review.
warn_at_caller(
f"ModelConfig.provider=None is deprecated and will be required in a future release. "
f"Specify provider= explicitly on ModelConfig(alias={self.alias!r}, ...). "
"See issue #589.",
DeprecationWarning,
)
return self


class ModelProvider(ConfigBase):
"""Configuration for a custom model provider.
@@ -32,6 +32,7 @@ def stub_data_designer_config_str() -> str:
model_configs:
- alias: my_own_code_model
model: openai/meta/llama-3.3-70b-instruct
provider: openai
inference_parameters:
temperature:
distribution_type: uniform
@@ -196,6 +196,7 @@ def test_from_config_auto_wraps_bare_dict() -> None:
{
"alias": "test-model",
"model": "openai/meta/llama-3.3-70b-instruct",
"provider": "openai",
}
],
"columns": [
@@ -219,6 +220,7 @@ def test_from_config_passthrough_when_already_wrapped() -> None:
{
"alias": "test-model",
"model": "openai/meta/llama-3.3-70b-instruct",
"provider": "openai",
}
],
"columns": [
@@ -253,6 +255,7 @@ def test_from_config_auto_wraps_bare_json_file() -> None:
{
"alias": "test-model",
"model": "openai/meta/llama-3.3-70b-instruct",
"provider": "openai",
}
],
"columns": [