chore(config): consolidate config maintenance to `schema_overlay.yaml` by webern · Pull Request #1828 · DataDog/saluki

webern · 2026-06-05T16:18:22Z

Human Summary

This PR consolidates the ledger files and the config registry into a single source of truth, the
overlay (schema_overlay.yaml). The config_registry is now generated from the overlay and lives in
datadog-agent/config-testsupport. The smoke tests remain and pass, but edits to the config
registry now flow through the overlay file.

Code for the runtime classification of unsupported configuration for ADP is generated separately by
datadog-agent/config. This allows for a slimmer and purpose-built structure for the production
binary that does not have all of the test support metadata in it.

The documentation page that we have been maintaining by hand, dogstatsd.md, is now also generated.
I went to great lengths to preserve its character as much as possible, though it had drifted from
reality in some cases. Order of the main sections was preserved, but order of the prose edits was
not. Unfortunately, this is about as much as I could reduce the diff. Going forward, it will diff
mechanically with edits to the overlay.

I handled the consolidation and conflict resolution of our data with painstaking care. The config
registry was considered more authoritative than documentation or the ledger. I have been
incorporating changes to the data from recent PRs, but we should be vigilant about PRs that
land while this one is open.

A git check has been added to CI to catch in-tree code gen diffs that haven't been checked in.

AI Summary

Consolidate ADP's config metadata into a single schema_overlay.yaml and generate the config
registry + documentation from it at build time.

Before: Config metadata lived in multiple files (ignored_keys.yaml, known-configs.json,
known-configs-not-applicable.json, per-key annotation structs in config_registry/datadog/*.rs,
and dogstatsd.md). Updating a single key required touching up to five files with no
synchronization between them.

After: One hand-maintained YAML file (schema_overlay.yaml) is the single source of truth.
build.rs enforces completeness (every core_schema.yaml key must appear in the overlay) and
generates:

The runtime classifier dataset (yaml_path, aliases, support_level, pipeline_affinity, default)
The test-support annotation constants
The configuration documentation (dogstatsd.md)
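
The completeness rule described above (every core_schema.yaml key must appear in the overlay) can be illustrated with a minimal sketch. This is not the PR's build.rs: the function names `missing_overlay_keys` and `check_completeness` are hypothetical, and the keys are passed in as plain string slices rather than parsed from YAML so that the example stays standard-library only.

```rust
use std::collections::BTreeSet;

/// Returns the schema keys that have no overlay entry, sorted and
/// de-duplicated so the resulting error message is stable across builds.
/// (Hypothetical helper; the real build.rs may be structured differently.)
fn missing_overlay_keys(schema_keys: &[&str], overlay_keys: &[&str]) -> Vec<String> {
    let overlay: BTreeSet<&str> = overlay_keys.iter().copied().collect();
    let mut missing: Vec<String> = schema_keys
        .iter()
        .filter(|k| !overlay.contains(*k))
        .map(|k| k.to_string())
        .collect();
    missing.sort();
    missing.dedup();
    missing
}

/// Hypothetical build-time gate: Ok when every schema key is covered,
/// otherwise an error listing the uncovered keys.
fn check_completeness(schema_keys: &[&str], overlay_keys: &[&str]) -> Result<(), String> {
    let missing = missing_overlay_keys(schema_keys, overlay_keys);
    if missing.is_empty() {
        Ok(())
    } else {
        Err(format!(
            "schema_overlay.yaml is missing {} key(s): {}",
            missing.len(),
            missing.join(", ")
        ))
    }
}
```

In a build script, an `Err` from a check like this would typically be turned into a panic so the build fails until the overlay is updated. Extra overlay keys that are absent from the schema are deliberately not treated as errors in this sketch.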

New crates

datadog-agent/config-overlay-model — Leaf crate with Rust types for deserializing and
validating schema_overlay.yaml. Enforces structural invariants at load time.
datadog-agent/config — Prod crate. Owns the overlay file, core_schema.yaml, and a
build.rs that generates the slim classifier array.
datadog-agent/config-testsupport — Test crate. Generates full annotation constants, the
smoke test runner, and the configuration documentation from the same overlay.

What was deleted

docs/agent-data-plane/configuration/dogstatsd/known-configs.json
docs/agent-data-plane/configuration/dogstatsd/known-configs-not-applicable.json
Hand-written annotation structs in config_registry/datadog/*.rs (replaced by generated code)
saluki-components/src/config_registry/generated/ directory

Key design decisions

The overlay has four sections: supported, unsupported, investigate, ignored.
unsupported entries carry mandatory severity (low/medium/high) for runtime warn/error behavior
and mandatory planned (bool) for documentation rendering.
investigate is a parking lot for keys not yet classified. Severity is optional there; when present,
the key flows through the classifier as Incompatible.
Documentation is generated from a .tmpl template with in-place output for clean PR diffs.
saluki-components no longer has a config_registry module. Code that previously used
crate::config_registry::* now imports from datadog_agent_config_testsupport::config_registry
directly.
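
The section rules above can be modeled roughly as follows. This is a sketch under stated assumptions, not the types in config-overlay-model: the names `OverlayEntry`, `Severity`, and `incompatible_severity` are hypothetical, and the treatment of supported and ignored keys (no runtime warning) is an assumption rather than something the PR states.

```rust
/// Severity levels named in the PR description (low/medium/high).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Severity {
    Low,
    Medium,
    High,
}

/// One overlay entry, tagged by the section it lives in.
/// Hypothetical shape: the real model crate may differ.
#[derive(Debug, Clone, PartialEq, Eq)]
enum OverlayEntry {
    Supported,
    /// Severity and `planned` are both mandatory for unsupported keys.
    Unsupported { severity: Severity, planned: bool },
    /// Severity is optional for keys that still need investigation.
    Investigate { severity: Option<Severity> },
    Ignored,
}

/// Returns the severity to use for an Incompatible classifier entry,
/// or None when the key should not produce one.
fn incompatible_severity(entry: &OverlayEntry) -> Option<Severity> {
    match entry {
        OverlayEntry::Unsupported { severity, .. } => Some(*severity),
        // An investigate key only reaches the classifier when it has a severity.
        OverlayEntry::Investigate { severity } => *severity,
        // Assumption: supported and ignored keys never warn at runtime.
        OverlayEntry::Supported | OverlayEntry::Ignored => None,
    }
}
```

Making `severity` a plain field on `Unsupported` and an `Option` on `Investigate` lets the type system carry the mandatory/optional distinction, so a missing severity on an unsupported key fails at deserialization time instead of at runtime.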

Change Type

New feature
Non-functional (chore, refactoring, docs)

How did you test this PR?

A lot of painstaking work with the data by hand.
Kept the smoke tests alive and they pass.
Unit tests pass.
I will fix any CI integration or correctness failures.
Added a diff check after build/test in CI to make sure in-tree generated files don't have a diff.

References

pr-commenter · 2026-06-05T16:23:34Z

Binary Size Analysis (Agent Data Plane)

Baseline: 76d9a1e · Comparison: 817e405 · diff
Analysis Configuration: stripped binaries · Pass/Fail Threshold: +5%
Sizes: 38.09 MiB (baseline) vs 38.04 MiB (comparison)
Size Change: -42.80 KiB (-0.11%)

✅ Binary size difference within threshold
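
As a rough check of the arithmetic in this report, the percentage can be recomputed from the baseline size and the reported delta. The function names are hypothetical, and the assumption that only growth counts against the +5% threshold is mine, not something the report states.

```rust
/// Percentage change of a binary relative to its baseline size.
/// Both arguments are in KiB; a negative delta means the binary shrank.
fn percent_change(baseline_kib: f64, delta_kib: f64) -> f64 {
    delta_kib / baseline_kib * 100.0
}

/// Assumed pass rule: the change passes unless it grows the binary
/// by more than `threshold_pct` percent.
fn within_threshold(pct: f64, threshold_pct: f64) -> bool {
    pct <= threshold_pct
}
```

With the numbers above, `percent_change(38.09 * 1024.0, -42.80)` comes out at roughly -0.11, matching the reported size change, and it passes a 5.0 threshold.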

Changes by Module

Module	File Size	Symbols
`tracing`	-29.40 KiB	156
`saluki_components::sources::otlp`	+22.19 KiB	239
`alloc`	+20.02 KiB	2682
`prost`	-17.00 KiB	551
`resource_accounting::groups::Tracked`	+16.95 KiB	31
`axum`	-12.08 KiB	514
`figment`	-11.96 KiB	722
`tonic_prost`	-11.33 KiB	78
`serde_json`	+10.44 KiB	331
`hashbrown`	+10.37 KiB	1112
`anon.168724101e883a32e713fefb5452135d.103.llvm.17918517674286795052`	+9.07 KiB	1
`agent_data_plane::main`	-7.90 KiB	2
`saluki_core::runtime::process`	+7.66 KiB	18
`hyper_timeout`	-7.25 KiB	9
`hyper`	-7.24 KiB	590
`saluki_io::net::util`	-7.08 KiB	175
`saluki_components::config_registry::datadog`	-7.02 KiB	33
`saluki_components::transforms::trace_obfuscation`	+6.65 KiB	68
`anyhow`	-6.49 KiB	1703
`h2`	-6.24 KiB	849

Detailed Symbol Changes

    FILE SIZE        VM SIZE    
 --------------  -------------- 
  [NEW]  +148Ki  [NEW]  +148Ki    agent_data_plane::cli::run::handle_run_command::_{{closure}}::h89b52f4572e4d450
  [NEW] +84.7Ki  [NEW] +84.5Ki    agent_data_plane::cli::dogstatsd::handle_dogstatsd_command::_{{closure}}::h56bd0ab477aa9104
  [NEW] +67.2Ki  [NEW] +67.0Ki    agent_data_plane::cli::run::create_topology::_{{closure}}::h216473a8d538f1f6
  [NEW] +65.6Ki  [NEW] +65.4Ki    saluki_core::topology::built::BuiltTopology::spawn::_{{closure}}::h5f7de4f9f12ee780
  [NEW] +58.1Ki  [NEW] +57.9Ki    saluki_core::topology::blueprint::TopologyBlueprint::build::_{{closure}}::h0d021cb542de231b
  [NEW] +56.8Ki  [NEW] +56.7Ki    agent_data_plane::cli::debug::handle_debug_command::_{{closure}}::hd0e8ec767fa496c5
  [NEW] +49.7Ki  [NEW] +49.4Ki    agent_data_plane::main::_{{closure}}::h32369461b5b1a268
  [NEW] +48.0Ki  [NEW] +47.9Ki    saluki_components::common::datadog::io::run_endpoint_io_loop::_{{closure}}::h232e30a9699f04b3
  [NEW] +39.8Ki  [NEW] +39.6Ki    agent_data_plane::internal::env::workload::build_collector::_{{closure}}::h6b8bb7e77f5258e7
  [NEW] +39.7Ki  [NEW] +39.5Ki    _<saluki_components::forwarders::otlp::OtlpForwarder as saluki_core::components::forwarders::Forwarder>::run::_{{closure}}::h87fde4231f18454f
  [DEL] -41.2Ki  [DEL] -41.0Ki    _<saluki_components::forwarders::otlp::OtlpForwarder as saluki_core::components::forwarders::Forwarder>::run::_{{closure}}::hc7a1cfd111a617ce
  [DEL] -48.0Ki  [DEL] -47.9Ki    saluki_components::common::datadog::io::run_endpoint_io_loop::_{{closure}}::h1b85364575686d85
  [DEL] -49.2Ki  [DEL] -48.9Ki    agent_data_plane::main::_{{closure}}::h6bd0f50150c0d309
  [DEL] -52.7Ki  [DEL] -52.5Ki    saluki_core::topology::blueprint::TopologyBlueprint::build::_{{closure}}::h29437ed3139f6330
  [DEL] -55.7Ki  [DEL] -55.6Ki    core::ops::function::FnOnce::call_once::h53ed267cc9bd6b1f
  -0.1% -31.1Ki  -0.3% -56.6Ki    [48003 Others]
  [DEL] -57.1Ki  [DEL] -56.9Ki    agent_data_plane::cli::debug::handle_debug_command::_{{closure}}::hbe57146ffcca7c4f
  [DEL] -64.5Ki  [DEL] -64.3Ki    saluki_core::topology::built::BuiltTopology::spawn::_{{closure}}::h980424f2554a3447
  [DEL] -67.2Ki  [DEL] -67.1Ki    agent_data_plane::cli::run::create_topology::_{{closure}}::hce0c9f9bb1ba32fc
  [DEL] -84.6Ki  [DEL] -84.4Ki    agent_data_plane::cli::dogstatsd::handle_dogstatsd_command::_{{closure}}::hd79ecc7f50168053
  [DEL]  -149Ki  [DEL]  -149Ki    agent_data_plane::cli::run::handle_run_command::_{{closure}}::h252f83b0a8475f58
  -0.1% -42.8Ki  -0.2% -68.5Ki    TOTAL

pr-commenter · 2026-06-05T16:38:42Z

Regression Detector (Agent Data Plane)

Run ID: 19e30d0a-2cd1-4fa6-af78-a718a3ed019c
Baseline: 76d9a1e6 · Comparison: 817e4051 · diff

Optimization Goals: ✅ No significant changes detected

Fine details of change detection per experiment (35)

Experiments configured erratic: true are tagged (ignored) and skipped when determining which experiments regressed or improved. Experiments which are detected as erratic at runtime are tagged (erratic) to flag that the run's sample dispersion was high, but their regression / improvement signal still counts.

experiment	goal	Δ mean %	links
otlp_ingest_logs_5mb_memory (ignored)	memory	⚪ +7.98	metrics profiles logs
otlp_ingest_traces_5mb_cpu (erratic)	cpu	⚪ +2.22	metrics profiles logs
dsd_uds_100mb_3k_contexts_cpu (erratic)	cpu	⚪ +1.31	metrics profiles logs
otlp_ingest_traces_ottl_transform_5mb_cpu (erratic)	cpu	⚪ +1.21	metrics profiles logs
dsd_uds_500mb_3k_contexts_throughput	throughput	⚪ -0.82	metrics profiles logs
dsd_uds_500mb_3k_contexts_cpu (erratic)	cpu	⚪ +0.77	metrics profiles logs
quality_gates_rss_idle	memory	⚪ +0.50	metrics profiles logs
otlp_ingest_metrics_5mb_cpu (erratic)	cpu	⚪ +0.50	metrics profiles logs
otlp_ingest_traces_ottl_transform_5mb_throughput	throughput	⚪ -0.38	metrics profiles logs
dsd_uds_500mb_3k_contexts_memory	memory	⚪ +0.37	metrics profiles logs
quality_gates_rss_dsd_low	memory	⚪ +0.34	metrics profiles logs
otlp_ingest_traces_5mb_throughput	throughput	⚪ -0.19	metrics profiles logs
quality_gates_rss_dsd_heavy	memory	⚪ +0.10	metrics profiles logs
dsd_uds_10mb_3k_contexts_memory	memory	⚪ +0.09	metrics profiles logs
otlp_ingest_traces_ottl_filtering_5mb_throughput	throughput	⚪ -0.09	metrics profiles logs
dsd_uds_100mb_3k_contexts_memory	memory	⚪ +0.03	metrics profiles logs
otlp_ingest_metrics_5mb_throughput	throughput	⚪ -0.01	metrics profiles logs
dsd_uds_10mb_3k_contexts_throughput	throughput	⚪ -0.01	metrics profiles logs
dsd_uds_1mb_3k_contexts_throughput	throughput	⚪ -0.00	metrics profiles logs
dsd_uds_100mb_3k_contexts_throughput	throughput	⚪ -0.00	metrics profiles logs
dsd_uds_512kb_3k_contexts_throughput	throughput	⚪ +0.00	metrics profiles logs
otlp_ingest_logs_5mb_throughput (ignored)	throughput	⚪ +0.00	metrics profiles logs
otlp_ingest_traces_ottl_filtering_5mb_memory	memory	⚪ -0.04	metrics profiles logs
quality_gates_rss_dsd_medium	memory	⚪ -0.04	metrics profiles logs
otlp_ingest_traces_ottl_transform_5mb_memory	memory	⚪ -0.05	metrics profiles logs
otlp_ingest_traces_5mb_memory	memory	⚪ -0.11	metrics profiles logs
dsd_uds_1mb_3k_contexts_memory	memory	⚪ -0.11	metrics profiles logs
quality_gates_rss_dsd_ultraheavy	memory	⚪ -0.17	metrics profiles logs
dsd_uds_10mb_3k_contexts_cpu (erratic)	cpu	⚪ -0.28	metrics profiles logs
dsd_uds_1mb_3k_contexts_cpu (erratic)	cpu	⚪ -0.44	metrics profiles logs
dsd_uds_512kb_3k_contexts_memory	memory	⚪ -0.77	metrics profiles logs
otlp_ingest_traces_ottl_filtering_5mb_cpu (erratic)	cpu	⚪ -0.77	metrics profiles logs
otlp_ingest_logs_5mb_cpu (ignored)	cpu	⚪ -0.88	metrics profiles logs
otlp_ingest_metrics_5mb_memory	memory	⚪ -1.10	metrics profiles logs
dsd_uds_512kb_3k_contexts_cpu (erratic)	cpu	⚪ -4.51	metrics profiles logs

Bounds Checks: ✅ Passed (5)

experiment	check	replicates	observed	links
quality_gates_rss_dsd_heavy	memory_usage	10/10	✅ 131 MiB ≤ 140 MiB	metrics profiles logs
quality_gates_rss_dsd_low	memory_usage	10/10	✅ 41.3 MiB ≤ 50 MiB	metrics profiles logs
quality_gates_rss_dsd_medium	memory_usage	10/10	✅ 64.6 MiB ≤ 75 MiB	metrics profiles logs
quality_gates_rss_dsd_ultraheavy	memory_usage	10/10	✅ 191 MiB ≤ 200 MiB	metrics profiles logs
quality_gates_rss_idle	memory_usage	10/10	✅ 27.1 MiB ≤ 40 MiB	metrics profiles logs

Explanation

A change is flagged as a regression when |Δ mean %| > 5.00% in the regressing direction for its optimization goal AND SMP marks the experiment as a regression (is_regression: true). Improvements use the matching criteria for the improving direction. Experiments configured erratic: true (tagged (ignored)) are skipped outright; experiments detected as erratic at runtime (tagged (erratic)) still count, since that flag describes sample dispersion rather than directional certainty. The Δ mean % cell is colored accordingly: 🟢 = improvement, 🔴 = regression, ⚪ = neutral. Reduction in CPU or memory is an improvement; reduction in ingress throughput is a regression.
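
The decision rule in this explanation can be written out as a small function. This is a sketch of the stated rule only, not the detector's code: the type and function names are hypothetical, and `smp_improvement` stands in for whatever flag SMP uses on the improving side, which the report does not name. Experiments tagged (ignored) are assumed to be filtered out before this function is called.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Goal {
    Cpu,
    Memory,
    Throughput,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Verdict {
    Regression,
    Improvement,
    Neutral,
}

/// Applies the 5% rule: the change must exceed the threshold in the
/// relevant direction AND carry the matching SMP flag.
fn classify_change(
    goal: Goal,
    delta_mean_pct: f64,
    smp_regression: bool,
    smp_improvement: bool,
) -> Verdict {
    const THRESHOLD_PCT: f64 = 5.0;
    // More CPU or memory is worse; less ingress throughput is worse.
    let regressing_sign = match goal {
        Goal::Cpu | Goal::Memory => 1.0,
        Goal::Throughput => -1.0,
    };
    // Positive means the metric moved in the regressing direction.
    let signed = delta_mean_pct * regressing_sign;
    if signed > THRESHOLD_PCT && smp_regression {
        Verdict::Regression
    } else if signed < -THRESHOLD_PCT && smp_improvement {
        Verdict::Improvement
    } else {
        Verdict::Neutral
    }
}
```

For example, the -4.51 CPU delta on dsd_uds_512kb_3k_contexts_cpu stays neutral under this rule because its magnitude is below 5%, regardless of the SMP flags.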

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f88a143a11

chatgpt-codex-connector · 2026-06-05T16:56:14Z

+        writeln!(out, "        yaml_path: \"{}\",", yaml_path).unwrap();
+        writeln!(out, "        aliases: &[],").unwrap();
+        writeln!(out, "        support_level: SupportLevel::Incompatible({}),", severity).unwrap();
+        writeln!(out, "        pipeline_affinity: PipelineAffinity::CrossCutting,").unwrap();


Preserve pipeline affinity for investigate entries

For investigate entries with a severity, this hard-codes every generated classifier entry as cross-cutting. That changes runtime filtering in check_and_warn_config: keys such as autoscaling.failover.*, cluster_agent.enabled, and telemetry.dogstatsd_* were previously scoped to DogStatsD/Checks, but now they are considered active even for OTLP- or traces-only runs, so those configurations emit incompatible-key warnings even when the affected metrics pipelines are disabled. Carry the overlay pipeline affinity through these generated entries instead of defaulting them to CrossCutting.
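
One way the suggestion above could look in the generator is sketched here. It is illustrative only and is not code from this PR: `InvestigateEntry`, `emit_investigate`, and the `DogStatsD` and `Checks` variant names are hypothetical, and the severity is passed as an already-rendered string because the diff does not show its type.

```rust
use std::fmt::Write;

/// Hypothetical affinity values; only CrossCutting appears in the diff.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PipelineAffinity {
    CrossCutting,
    DogStatsD,
    Checks,
}

impl PipelineAffinity {
    /// Variant name as it would be written into generated source.
    fn variant_name(self) -> &'static str {
        match self {
            PipelineAffinity::CrossCutting => "CrossCutting",
            PipelineAffinity::DogStatsD => "DogStatsD",
            PipelineAffinity::Checks => "Checks",
        }
    }
}

/// Hypothetical view of an investigate entry in the overlay.
struct InvestigateEntry<'a> {
    yaml_path: &'a str,
    severity: Option<&'a str>,
    affinity: Option<PipelineAffinity>,
}

/// Writes the classifier fields for one entry, using the entry's own
/// affinity and falling back to CrossCutting only when none is set.
fn emit_investigate(out: &mut String, e: &InvestigateEntry<'_>) -> std::fmt::Result {
    // Entries without a severity produce no classifier output.
    let Some(severity) = e.severity else {
        return Ok(());
    };
    let affinity = e.affinity.unwrap_or(PipelineAffinity::CrossCutting);
    writeln!(out, "        yaml_path: \"{}\",", e.yaml_path)?;
    writeln!(out, "        aliases: &[],")?;
    writeln!(out, "        support_level: SupportLevel::Incompatible({}),", severity)?;
    writeln!(out, "        pipeline_affinity: PipelineAffinity::{},", affinity.variant_name())?;
    Ok(())
}
```

Returning `std::fmt::Result` and using `?` instead of `.unwrap()` is a stylistic choice in this sketch; writing to a `String` does not fail in practice.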

webern · 2026-06-06 (edited)

I'm going to say this is fine since I don't want to keep tweaking the data model forever. Basically, I added investigate as a section when I realized that sometimes we don't quite know whether something needs to be supported or not. Then I realized that we sometimes warn on things that we still need to "investigate", so I added the ability to set a severity, but made it optional.

I think if we know enough to say what the severity is, and we really need to assign specific pipelines, we should probably move the key up to the unsupported section and leave investigate for the keys we know less about.

Not a bad AI catch, though.

webern changed the title ~~chore(config): consolidate config maintenance to schema_overlay.yaml~~ chore(config): consolidate config maintenance to schema_overlay.yaml Jun 5, 2026

webern marked this pull request as ready for review June 5, 2026 16:51

webern requested a review from a team as a code owner June 5, 2026 16:51

chatgpt-codex-connector Bot reviewed Jun 5, 2026

webern added 4 commits June 6, 2026 09:00

big squash yo

e593dd3

chore(licenses): sync third-party license file with dd-rust-license-tool v1.0.6

10ec6bb

remove an oops file

082cc7c

fix heroku_dyno and dogstatsd_telemetry_enabled_listener_id severity

817e405

webern force-pushed the matt.briggs/config branch from 11594e4 to 817e405 June 6, 2026 07:00

webern wants to merge 4 commits into main from matt.briggs/config
