chore(config): consolidate config maintenance to schema_overlay.yaml#1828

Open
webern wants to merge 4 commits into main from matt.briggs/config

Conversation

@webern
Contributor

@webern webern commented Jun 5, 2026

Human Summary

This PR consolidates the ledger files and the config registry into a single source of truth, the
overlay (schema_overlay.yaml). The config registry is now generated from the overlay and has moved
to datadog-agent/config-testsupport. The smoke tests remain and pass, but edits to the config
registry now flow through the overlay file.

Code for the runtime classification of unsupported configuration for ADP is generated separately by
datadog-agent/config. This allows for a slimmer and purpose-built structure for the production
binary that does not have all of the test support metadata in it.

The documentation page that we have been maintaining by hand, dogstatsd.md, is now also generated.
I went to great lengths to preserve its character as much as possible, though it had drifted from
reality in some cases. Order of the main sections was preserved, but order of the prose edits was
not. Unfortunately, this is about as much as I could reduce the diff. Going forward, it will diff
mechanically with edits to the overlay.

I handled the consolidation and conflict resolution of our data with painstaking care. The config
registry was considered more authoritative than the documentation or the ledger. I have been
incorporating data changes from recent PRs, but we should be vigilant about PRs that land while
this one is open.

A git check has been added to CI to catch in-tree code gen diffs that haven't been checked in.

AI Summary

Consolidate ADP's config metadata into a single schema_overlay.yaml and generate the config
registry + documentation from it at build time.

Before: Config metadata lived in multiple files (ignored_keys.yaml, known-configs.json,
known-configs-not-applicable.json, per-key annotation structs in config_registry/datadog/*.rs,
and dogstatsd.md). Updating a single key required touching up to five files with no
synchronization between them.

After: One hand-maintained YAML file (schema_overlay.yaml) is the single source of truth.
build.rs enforces completeness (every core_schema.yaml key must appear in the overlay) and
generates:

  • The runtime classifier dataset (yaml_path, aliases, support_level, pipeline_affinity, default)
  • The test-support annotation constants
  • The configuration documentation (dogstatsd.md)
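The completeness rule described above (every core_schema.yaml key must appear in the overlay) can be sketched roughly as follows. This is a minimal illustration, not the PR's build.rs: the function names `missing_from_overlay` and `check_completeness` are hypothetical, and both inputs are assumed to be already-extracted key lists rather than parsed YAML.

```rust
use std::collections::BTreeSet;

/// Returns the schema keys that do not appear in the overlay, sorted.
/// Hypothetical helper: inputs are plain key lists, not parsed YAML.
fn missing_from_overlay(schema_keys: &[&str], overlay_keys: &[&str]) -> Vec<String> {
    let overlay: BTreeSet<&str> = overlay_keys.iter().copied().collect();
    let missing: BTreeSet<&str> = schema_keys
        .iter()
        .copied()
        .filter(|k| !overlay.contains(k))
        .collect();
    missing.into_iter().map(String::from).collect()
}

/// Fails with a readable message when any schema key is unclassified.
/// A build script could turn the Err into a build failure.
fn check_completeness(schema_keys: &[&str], overlay_keys: &[&str]) -> Result<(), String> {
    let missing = missing_from_overlay(schema_keys, overlay_keys);
    if missing.is_empty() {
        Ok(())
    } else {
        Err(format!(
            "keys missing from schema_overlay.yaml: {}",
            missing.join(", ")
        ))
    }
}
```

Extra keys in the overlay are tolerated in this sketch; whether the real check also rejects them is not stated in the PR description.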

New crates

  • datadog-agent/config-overlay-model — Leaf crate with Rust types for deserializing and
    validating schema_overlay.yaml. Enforces structural invariants at load time.
  • datadog-agent/config — Prod crate. Owns the overlay file, core_schema.yaml, and a
    build.rs that generates the slim classifier array.
  • datadog-agent/config-testsupport — Test crate. Generates full annotation constants, the
    smoke test runner, and the configuration documentation from the same overlay.

What was deleted

  • docs/agent-data-plane/configuration/dogstatsd/known-configs.json
  • docs/agent-data-plane/configuration/dogstatsd/known-configs-not-applicable.json
  • Hand-written annotation structs in config_registry/datadog/*.rs (replaced by generated code)
  • saluki-components/src/config_registry/generated/ directory

Key design decisions

  • The overlay has four sections: supported, unsupported, investigate, ignored.
  • unsupported entries carry mandatory severity (low/medium/high) for runtime warn/error behavior
    and mandatory planned (bool) for documentation rendering.
  • investigate is a parking lot for keys not yet classified. Severity is optional there; when
    present, the key flows through the classifier as Incompatible.
  • Documentation is generated from a .tmpl template with in-place output for clean PR diffs.
  • saluki-components no longer has a config_registry module. Code that previously used
    crate::config_registry::* now imports from datadog_agent_config_testsupport::config_registry
    directly.
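To make the section rules above concrete, here is a minimal sketch of how the four overlay sections might be modeled. The type and function names (`Severity`, `OverlayEntry`, `incompatible_severity`) are assumptions for illustration and are not taken from the config-overlay-model crate; the treatment of supported and ignored entries is also assumed.

```rust
/// Severity levels named in the PR description (low/medium/high).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Severity {
    Low,
    Medium,
    High,
}

/// Hypothetical model of one overlay entry, one variant per section.
#[derive(Debug, Clone, PartialEq, Eq)]
enum OverlayEntry {
    Supported,
    /// Both fields are mandatory for unsupported entries.
    Unsupported { severity: Severity, planned: bool },
    /// Severity is optional for keys still under investigation.
    Investigate { severity: Option<Severity> },
    Ignored,
}

/// Returns the severity to report if the entry should reach the runtime
/// classifier as Incompatible, or None if it should not.
fn incompatible_severity(entry: &OverlayEntry) -> Option<Severity> {
    match entry {
        OverlayEntry::Unsupported { severity, .. } => Some(*severity),
        // Only investigate entries that carry a severity are classified.
        OverlayEntry::Investigate { severity } => *severity,
        // Assumption: supported and ignored keys never warn.
        OverlayEntry::Supported | OverlayEntry::Ignored => None,
    }
}
```

Encoding the mandatory fields as non-optional struct fields means a missing severity on an unsupported entry cannot be represented at all, which is one way to get the "structural invariants at load time" the description mentions.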

Change Type

  • New feature
  • Non-functional (chore, refactoring, docs)

How did you test this PR?

  • A lot of painstaking work with the data by hand.
  • Kept the smoke tests alive and they pass.
  • Unit tests pass.
  • I will fix any CI integration or correctness failures.
  • Added a diff check after build/test in CI to make sure in-tree generated files don't have a diff.

References

@webern webern changed the title chore(config): consolidate config maintenance to schema_overlay.yaml chore(config): consolidate config maintenance to schema_overlay.yaml Jun 5, 2026
@dd-octo-sts dd-octo-sts Bot added area/components Sources, transforms, and destinations. area/ci CI/CD, automated testing, etc. source/dogstatsd DogStatsD source. transform/aggregate Aggregate transform. transform/dogstatsd-mapper DogStatsD Mapper synchronous transform. area/docs Reference documentation. decoders/otlp OTLP decoder. encoder/datadog-events Datadog events encoder. encoder/datadog-logs Datadog Logs encoder. encoder/datadog-metrics Datadog Metrics encoder. encoder/datadog-service-checks Datadog Service Checks encoder. encoder/datadog-stats Datadog APM Stats encoder. encoder/datadog-traces Datadog Traces encoder. relay/otlp OTLP relay. source/otlp OTLP source. transform/trace-obfuscation Trace Obfuscation synchronous transform. labels Jun 5, 2026
@pr-commenter

pr-commenter Bot commented Jun 5, 2026

Binary Size Analysis (Agent Data Plane)

Baseline: 76d9a1e · Comparison: 817e405 · diff
Analysis Configuration: stripped binaries · Pass/Fail Threshold: +5%
Sizes: 38.09 MiB (baseline) vs 38.04 MiB (comparison)
Size Change: -42.80 KiB (-0.11%)

✅ Binary size difference within threshold

Changes by Module
| Module | File Size | Symbols |
|---|---|---|
| tracing | -29.40 KiB | 156 |
| saluki_components::sources::otlp | +22.19 KiB | 239 |
| alloc | +20.02 KiB | 2682 |
| prost | -17.00 KiB | 551 |
| resource_accounting::groups::Tracked | +16.95 KiB | 31 |
| axum | -12.08 KiB | 514 |
| figment | -11.96 KiB | 722 |
| tonic_prost | -11.33 KiB | 78 |
| serde_json | +10.44 KiB | 331 |
| hashbrown | +10.37 KiB | 1112 |
| anon.168724101e883a32e713fefb5452135d.103.llvm.17918517674286795052 | +9.07 KiB | 1 |
| agent_data_plane::main | -7.90 KiB | 2 |
| saluki_core::runtime::process | +7.66 KiB | 18 |
| hyper_timeout | -7.25 KiB | 9 |
| hyper | -7.24 KiB | 590 |
| saluki_io::net::util | -7.08 KiB | 175 |
| saluki_components::config_registry::datadog | -7.02 KiB | 33 |
| saluki_components::transforms::trace_obfuscation | +6.65 KiB | 68 |
| anyhow | -6.49 KiB | 1703 |
| h2 | -6.24 KiB | 849 |
Detailed Symbol Changes
    FILE SIZE        VM SIZE    
 --------------  -------------- 
  [NEW]  +148Ki  [NEW]  +148Ki    agent_data_plane::cli::run::handle_run_command::_{{closure}}::h89b52f4572e4d450
  [NEW] +84.7Ki  [NEW] +84.5Ki    agent_data_plane::cli::dogstatsd::handle_dogstatsd_command::_{{closure}}::h56bd0ab477aa9104
  [NEW] +67.2Ki  [NEW] +67.0Ki    agent_data_plane::cli::run::create_topology::_{{closure}}::h216473a8d538f1f6
  [NEW] +65.6Ki  [NEW] +65.4Ki    saluki_core::topology::built::BuiltTopology::spawn::_{{closure}}::h5f7de4f9f12ee780
  [NEW] +58.1Ki  [NEW] +57.9Ki    saluki_core::topology::blueprint::TopologyBlueprint::build::_{{closure}}::h0d021cb542de231b
  [NEW] +56.8Ki  [NEW] +56.7Ki    agent_data_plane::cli::debug::handle_debug_command::_{{closure}}::hd0e8ec767fa496c5
  [NEW] +49.7Ki  [NEW] +49.4Ki    agent_data_plane::main::_{{closure}}::h32369461b5b1a268
  [NEW] +48.0Ki  [NEW] +47.9Ki    saluki_components::common::datadog::io::run_endpoint_io_loop::_{{closure}}::h232e30a9699f04b3
  [NEW] +39.8Ki  [NEW] +39.6Ki    agent_data_plane::internal::env::workload::build_collector::_{{closure}}::h6b8bb7e77f5258e7
  [NEW] +39.7Ki  [NEW] +39.5Ki    _<saluki_components::forwarders::otlp::OtlpForwarder as saluki_core::components::forwarders::Forwarder>::run::_{{closure}}::h87fde4231f18454f
  [DEL] -41.2Ki  [DEL] -41.0Ki    _<saluki_components::forwarders::otlp::OtlpForwarder as saluki_core::components::forwarders::Forwarder>::run::_{{closure}}::hc7a1cfd111a617ce
  [DEL] -48.0Ki  [DEL] -47.9Ki    saluki_components::common::datadog::io::run_endpoint_io_loop::_{{closure}}::h1b85364575686d85
  [DEL] -49.2Ki  [DEL] -48.9Ki    agent_data_plane::main::_{{closure}}::h6bd0f50150c0d309
  [DEL] -52.7Ki  [DEL] -52.5Ki    saluki_core::topology::blueprint::TopologyBlueprint::build::_{{closure}}::h29437ed3139f6330
  [DEL] -55.7Ki  [DEL] -55.6Ki    core::ops::function::FnOnce::call_once::h53ed267cc9bd6b1f
  -0.1% -31.1Ki  -0.3% -56.6Ki    [48003 Others]
  [DEL] -57.1Ki  [DEL] -56.9Ki    agent_data_plane::cli::debug::handle_debug_command::_{{closure}}::hbe57146ffcca7c4f
  [DEL] -64.5Ki  [DEL] -64.3Ki    saluki_core::topology::built::BuiltTopology::spawn::_{{closure}}::h980424f2554a3447
  [DEL] -67.2Ki  [DEL] -67.1Ki    agent_data_plane::cli::run::create_topology::_{{closure}}::hce0c9f9bb1ba32fc
  [DEL] -84.6Ki  [DEL] -84.4Ki    agent_data_plane::cli::dogstatsd::handle_dogstatsd_command::_{{closure}}::hd79ecc7f50168053
  [DEL]  -149Ki  [DEL]  -149Ki    agent_data_plane::cli::run::handle_run_command::_{{closure}}::h252f83b0a8475f58
  -0.1% -42.8Ki  -0.2% -68.5Ki    TOTAL

@pr-commenter

pr-commenter Bot commented Jun 5, 2026

Regression Detector (Agent Data Plane)

Run ID: 19e30d0a-2cd1-4fa6-af78-a718a3ed019c
Baseline: 76d9a1e6 · Comparison: 817e4051 · diff

Optimization Goals: ✅ No significant changes detected

Fine details of change detection per experiment (35)

Experiments configured erratic: true are tagged (ignored) and skipped when determining which experiments regressed or improved. Experiments which are detected as erratic at runtime are tagged (erratic) to flag that the run's sample dispersion was high, but their regression / improvement signal still counts.

| experiment | goal | Δ mean % | links |
|---|---|---|---|
| otlp_ingest_logs_5mb_memory (ignored) | memory | ⚪ +7.98 | metrics profiles logs |
| otlp_ingest_traces_5mb_cpu (erratic) | cpu | ⚪ +2.22 | metrics profiles logs |
| dsd_uds_100mb_3k_contexts_cpu (erratic) | cpu | ⚪ +1.31 | metrics profiles logs |
| otlp_ingest_traces_ottl_transform_5mb_cpu (erratic) | cpu | ⚪ +1.21 | metrics profiles logs |
| dsd_uds_500mb_3k_contexts_throughput | throughput | ⚪ -0.82 | metrics profiles logs |
| dsd_uds_500mb_3k_contexts_cpu (erratic) | cpu | ⚪ +0.77 | metrics profiles logs |
| quality_gates_rss_idle | memory | ⚪ +0.50 | metrics profiles logs |
| otlp_ingest_metrics_5mb_cpu (erratic) | cpu | ⚪ +0.50 | metrics profiles logs |
| otlp_ingest_traces_ottl_transform_5mb_throughput | throughput | ⚪ -0.38 | metrics profiles logs |
| dsd_uds_500mb_3k_contexts_memory | memory | ⚪ +0.37 | metrics profiles logs |
| quality_gates_rss_dsd_low | memory | ⚪ +0.34 | metrics profiles logs |
| otlp_ingest_traces_5mb_throughput | throughput | ⚪ -0.19 | metrics profiles logs |
| quality_gates_rss_dsd_heavy | memory | ⚪ +0.10 | metrics profiles logs |
| dsd_uds_10mb_3k_contexts_memory | memory | ⚪ +0.09 | metrics profiles logs |
| otlp_ingest_traces_ottl_filtering_5mb_throughput | throughput | ⚪ -0.09 | metrics profiles logs |
| dsd_uds_100mb_3k_contexts_memory | memory | ⚪ +0.03 | metrics profiles logs |
| otlp_ingest_metrics_5mb_throughput | throughput | ⚪ -0.01 | metrics profiles logs |
| dsd_uds_10mb_3k_contexts_throughput | throughput | ⚪ -0.01 | metrics profiles logs |
| dsd_uds_1mb_3k_contexts_throughput | throughput | ⚪ -0.00 | metrics profiles logs |
| dsd_uds_100mb_3k_contexts_throughput | throughput | ⚪ -0.00 | metrics profiles logs |
| dsd_uds_512kb_3k_contexts_throughput | throughput | ⚪ +0.00 | metrics profiles logs |
| otlp_ingest_logs_5mb_throughput (ignored) | throughput | ⚪ +0.00 | metrics profiles logs |
| otlp_ingest_traces_ottl_filtering_5mb_memory | memory | ⚪ -0.04 | metrics profiles logs |
| quality_gates_rss_dsd_medium | memory | ⚪ -0.04 | metrics profiles logs |
| otlp_ingest_traces_ottl_transform_5mb_memory | memory | ⚪ -0.05 | metrics profiles logs |
| otlp_ingest_traces_5mb_memory | memory | ⚪ -0.11 | metrics profiles logs |
| dsd_uds_1mb_3k_contexts_memory | memory | ⚪ -0.11 | metrics profiles logs |
| quality_gates_rss_dsd_ultraheavy | memory | ⚪ -0.17 | metrics profiles logs |
| dsd_uds_10mb_3k_contexts_cpu (erratic) | cpu | ⚪ -0.28 | metrics profiles logs |
| dsd_uds_1mb_3k_contexts_cpu (erratic) | cpu | ⚪ -0.44 | metrics profiles logs |
| dsd_uds_512kb_3k_contexts_memory | memory | ⚪ -0.77 | metrics profiles logs |
| otlp_ingest_traces_ottl_filtering_5mb_cpu (erratic) | cpu | ⚪ -0.77 | metrics profiles logs |
| otlp_ingest_logs_5mb_cpu (ignored) | cpu | ⚪ -0.88 | metrics profiles logs |
| otlp_ingest_metrics_5mb_memory | memory | ⚪ -1.10 | metrics profiles logs |
| dsd_uds_512kb_3k_contexts_cpu (erratic) | cpu | ⚪ -4.51 | metrics profiles logs |
Bounds Checks: ✅ Passed (5)
| experiment | check | replicates | observed | links |
|---|---|---|---|---|
| quality_gates_rss_dsd_heavy | memory_usage | 10/10 | ✅ 131 MiB ≤ 140 MiB | metrics profiles logs |
| quality_gates_rss_dsd_low | memory_usage | 10/10 | ✅ 41.3 MiB ≤ 50 MiB | metrics profiles logs |
| quality_gates_rss_dsd_medium | memory_usage | 10/10 | ✅ 64.6 MiB ≤ 75 MiB | metrics profiles logs |
| quality_gates_rss_dsd_ultraheavy | memory_usage | 10/10 | ✅ 191 MiB ≤ 200 MiB | metrics profiles logs |
| quality_gates_rss_idle | memory_usage | 10/10 | ✅ 27.1 MiB ≤ 40 MiB | metrics profiles logs |
Explanation

A change is flagged as a regression when |Δ mean %| > 5.00% in the regressing direction for its optimization goal AND SMP marks the experiment as a regression (is_regression: true). Improvements use the matching criteria for the improving direction. Experiments configured erratic: true (tagged (ignored)) are skipped outright; experiments detected as erratic at runtime (tagged (erratic)) still count, since that flag describes sample dispersion rather than directional certainty. The Δ mean % cell is colored accordingly: 🟢 = improvement, 🔴 = regression, ⚪ = neutral. Reduction in CPU or memory is an improvement; reduction in ingress throughput is a regression.
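The flagging rule in this explanation can be written out as a small decision function. This is only a sketch of the rule as worded above, not the detector's actual code: the names `Goal`, `Verdict`, and `verdict` are hypothetical, and `smp_is_improvement` is an assumed counterpart to the `is_regression` flag that the text mentions.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Goal {
    Cpu,
    Memory,
    Throughput,
}

#[derive(Debug, PartialEq, Eq)]
enum Verdict {
    Improvement,
    Regression,
    Neutral,
}

/// Threshold on |Δ mean %| from the explanation text.
const THRESHOLD_PCT: f64 = 5.0;

/// Returns None for experiments configured `erratic: true` (skipped outright).
/// Runtime-detected erratic experiments are not special-cased: they still count.
fn verdict(
    goal: Goal,
    delta_mean_pct: f64,
    smp_is_regression: bool,
    smp_is_improvement: bool, // assumed flag, mirroring is_regression
    configured_erratic: bool,
) -> Option<Verdict> {
    if configured_erratic {
        return None;
    }
    // More CPU or memory is worse; less ingress throughput is worse.
    let (worse, better) = match goal {
        Goal::Cpu | Goal::Memory => (
            delta_mean_pct > THRESHOLD_PCT,
            delta_mean_pct < -THRESHOLD_PCT,
        ),
        Goal::Throughput => (
            delta_mean_pct < -THRESHOLD_PCT,
            delta_mean_pct > THRESHOLD_PCT,
        ),
    };
    Some(if worse && smp_is_regression {
        Verdict::Regression
    } else if better && smp_is_improvement {
        Verdict::Improvement
    } else {
        Verdict::Neutral
    })
}
```

Under this reading, the +7.98 memory row in the table is skipped because the experiment is configured erratic, and the -4.51 CPU row stays neutral because it does not cross the 5.00% threshold.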

@webern webern marked this pull request as ready for review June 5, 2026 16:51
@webern webern requested a review from a team as a code owner June 5, 2026 16:51

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f88a143a11

writeln!(out, " yaml_path: \"{}\",", yaml_path).unwrap();
writeln!(out, " aliases: &[],").unwrap();
writeln!(out, " support_level: SupportLevel::Incompatible({}),", severity).unwrap();
writeln!(out, " pipeline_affinity: PipelineAffinity::CrossCutting,").unwrap();

P2 Badge Preserve pipeline affinity for investigate entries

For investigate entries with a severity, this hard-codes every generated classifier entry as cross-cutting. That changes runtime filtering in check_and_warn_config: keys such as autoscaling.failover.*, cluster_agent.enabled, and telemetry.dogstatsd_* were previously scoped to DogStatsD/Checks, but now they are considered active even for OTLP- or traces-only runs, so those configurations emit incompatible-key warnings even when the affected metrics pipelines are disabled. Carry the overlay pipeline affinity through these generated entries instead of defaulting them to CrossCutting.
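For illustration, the change this suggestion asks for might look roughly like the sketch below. It is not code from the PR: the function name `emit_investigate_entry` and the optional `affinity` parameter are hypothetical, and any affinity variant name other than `CrossCutting` is an assumption.

```rust
use std::fmt::Write;

/// Emits the fields of one generated classifier entry for an investigate key.
/// The affinity comes from the overlay entry when present and falls back to
/// CrossCutting only when the overlay gives none (hypothetical behavior).
fn emit_investigate_entry(
    out: &mut String,
    yaml_path: &str,
    severity: &str,
    affinity: Option<&str>,
) {
    let affinity = affinity.unwrap_or("CrossCutting");
    writeln!(out, "    yaml_path: \"{}\",", yaml_path).unwrap();
    writeln!(out, "    aliases: &[],").unwrap();
    writeln!(out, "    support_level: SupportLevel::Incompatible({}),", severity).unwrap();
    writeln!(out, "    pipeline_affinity: PipelineAffinity::{},", affinity).unwrap();
}
```

Keeping `CrossCutting` as the fallback would preserve the current generated output for overlay entries that do not declare an affinity.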

Contributor Author

@webern webern Jun 6, 2026

I'm going to say this is fine since I don't want to keep tweaking the data model forever. Basically I added investigate as a section when I realized that sometimes the reality is we don't quite know whether something needs to be supported or not. Then I realized that sometimes we are warning on things that we need to "investigate" so I added the ability to add a severity, but made it optional.

I think if we know enough to say what the severity is, and we really need to assign specific pipelines, we should probably move it up to the unsupported sections and leave investigate for the keys we know less about.

Not a bad AI catch, though.

@webern webern force-pushed the matt.briggs/config branch from 11594e4 to 817e405 Compare June 6, 2026 07:00
Development

Successfully merging this pull request may close these issues.

Separate prod from test/build concerns in the config registry.

1 participant