docs(antithesis): Antithesis research scratchbook and bug ledger #1768

Merged: blt merged 1 commit into main from blt/antithesis-research on Jun 2, 2026
Conversation

@blt blt commented May 29, 2026

Summary

The Antithesis research artifacts for agent-data-plane, under test/antithesis/scratchbook/. This is
the analysis behind the harness and the bug repros:

  • A SUT analysis of the DogStatsD data path and runtime.
  • A property catalog (35 properties) with per-property evidence files.
  • A deployment topology, a portfolio evaluation, property relationships, and a bug ledger that maps
    each discovered defect to how it is reproduced.

Docs only, no code. The internal design-partner codename is scrubbed and an internal Antithesis run
id is redacted; Confluence links and Jira references are kept.

Change Type

  • [ ] Bug fix
  • [ ] New feature
  • [x] Non-functional (chore, refactoring, docs)
  • [ ] Performance

How did you test this PR?

Docs only. check-docs is unaffected (it builds the docs/ site; these notes live under test/).
No code paths change.

References

  • Builds on the harness PR (test/antithesis/) in this stack.
  • The failing bug repros it catalogs are in the bug-tests PR in this stack.
  • Internal context (kept per repo norms) is in the Confluence/Jira links in each artifact's
    frontmatter.

@dd-octo-sts dd-octo-sts Bot added the area/test All things testing: unit/integration, correctness, SMP regression, etc. label May 29, 2026
pr-commenter Bot commented May 29, 2026

Binary Size Analysis (Agent Data Plane)

Baseline: 9d9e29d · Comparison: 1b642b8 · diff
Analysis Configuration: stripped binaries · Pass/Fail Threshold: +5%
Sizes: 38.04 MiB (baseline) vs 38.04 MiB (comparison)
Size Change: +16 B (+0.00%)

✅ Binary size difference within threshold

Changes by Module
| Module | File Size | Symbols |
| --- | --- | --- |
| anon.85240eacea40817b540ad191ce7e90d0.1.llvm.15412738400374065525 | +130 B | 1 |
| anon.85240eacea40817b540ad191ce7e90d0.1.llvm.956545143103123932 | -128 B | 1 |
| anon.85240eacea40817b540ad191ce7e90d0.4.llvm.15412738400374065525 | +115 B | 1 |
| anon.85240eacea40817b540ad191ce7e90d0.4.llvm.956545143103123932 | -113 B | 1 |
| anon.85240eacea40817b540ad191ce7e90d0.3.llvm.15412738400374065525 | +109 B | 1 |
| anon.85240eacea40817b540ad191ce7e90d0.3.llvm.956545143103123932 | -107 B | 1 |
| anon.85240eacea40817b540ad191ce7e90d0.0.llvm.15412738400374065525 | +97 B | 1 |
| anon.85240eacea40817b540ad191ce7e90d0.2.llvm.15412738400374065525 | +95 B | 1 |
| anon.85240eacea40817b540ad191ce7e90d0.0.llvm.956545143103123932 | -95 B | 1 |
| anon.85240eacea40817b540ad191ce7e90d0.2.llvm.956545143103123932 | -93 B | 1 |
| [Unmapped] | +6 B | 1 |

Detailed Symbol Changes
    FILE SIZE        VM SIZE    
 --------------  -------------- 
  [NEW]    +130  [NEW]     +40    anon.85240eacea40817b540ad191ce7e90d0.1.llvm.15412738400374065525
  [NEW]    +115  [NEW]     +25    anon.85240eacea40817b540ad191ce7e90d0.4.llvm.15412738400374065525
  [NEW]    +109  [NEW]     +19    anon.85240eacea40817b540ad191ce7e90d0.3.llvm.15412738400374065525
  [NEW]     +97  [NEW]      +7    anon.85240eacea40817b540ad191ce7e90d0.0.llvm.15412738400374065525
  [NEW]     +95  [NEW]      +5    anon.85240eacea40817b540ad191ce7e90d0.2.llvm.15412738400374065525
  +0.1%      +6  [ = ]       0    [Unmapped]
  [DEL]     -93  [DEL]      -5    anon.85240eacea40817b540ad191ce7e90d0.2.llvm.956545143103123932
  [DEL]     -95  [DEL]      -7    anon.85240eacea40817b540ad191ce7e90d0.0.llvm.956545143103123932
  [DEL]    -107  [DEL]     -19    anon.85240eacea40817b540ad191ce7e90d0.3.llvm.956545143103123932
  [DEL]    -113  [DEL]     -25    anon.85240eacea40817b540ad191ce7e90d0.4.llvm.956545143103123932
  [DEL]    -128  [DEL]     -40    anon.85240eacea40817b540ad191ce7e90d0.1.llvm.956545143103123932
  +0.0%     +16  [ = ]       0    TOTAL
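
The totals in this report can be cross-checked by hand. The following is a minimal sketch, not the bot's actual implementation: it sums the per-module file-size deltas listed above and expresses the result as a percentage of the reported baseline. The function name `size_change` and the `<=` comparison against the threshold are assumptions made for illustration, and the baseline is the rounded 38.04 MiB figure from the report.

```python
# Per-module file-size deltas in bytes, copied from the report above.
FILE_SIZE_DELTAS = [130, -128, 115, -113, 109, -107, 97, 95, -95, -93, 6]
BASELINE_MIB = 38.04   # reported baseline size (rounded in the report)
THRESHOLD_PCT = 5.0    # pass/fail threshold from the report


def size_change(deltas, baseline_mib):
    """Return (total delta in bytes, delta as a percentage of the baseline)."""
    total = sum(deltas)
    baseline_bytes = baseline_mib * 1024 * 1024
    return total, 100.0 * total / baseline_bytes


total_bytes, pct = size_change(FILE_SIZE_DELTAS, BASELINE_MIB)
# Assumption: the check passes when the change does not exceed the threshold.
within_threshold = pct <= THRESHOLD_PCT
print(f"{total_bytes:+d} B ({pct:+.2f}%), within threshold: {within_threshold}")
```

The paired `[NEW]` and `[DEL]` rows nearly cancel, which is why a docs-only change still shows a nonzero but negligible delta.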

@blt blt changed the title docs(antithesis): research scratchbook (SUT analysis, property catalog, bug ledger) docs(agent-data-plane): Antithesis research scratchbook and bug ledger May 29, 2026
pr-commenter Bot commented May 29, 2026

Regression Detector (Agent Data Plane)

Run ID: 09d39a36-790a-4abd-b4d5-f129010bb357
Baseline: 9d9e29d0 · Comparison: 1b642b8d · diff

Optimization Goals: ✅ No significant changes detected

Fine details of change detection per experiment (35)

Experiments configured erratic: true are tagged (ignored) and skipped when determining which experiments regressed or improved. Experiments which are detected as erratic at runtime are tagged (erratic) to flag that the run's sample dispersion was high, but their regression / improvement signal still counts.

| experiment | goal | Δ mean % | links |
| --- | --- | --- | --- |
| dsd_uds_1mb_3k_contexts_cpu (erratic) | cpu | ⚪ +7.89 | metrics profiles logs |
| dsd_uds_500mb_3k_contexts_cpu (erratic) | cpu | ⚪ +2.66 | metrics profiles logs |
| otlp_ingest_metrics_5mb_memory | memory | ⚪ +1.36 | metrics profiles logs |
| otlp_ingest_logs_5mb_cpu (ignored) | cpu | ⚪ +1.20 | metrics profiles logs |
| otlp_ingest_traces_ottl_transform_5mb_cpu (erratic) | cpu | ⚪ +1.14 | metrics profiles logs |
| dsd_uds_500mb_3k_contexts_memory | memory | ⚪ +0.63 | metrics profiles logs |
| otlp_ingest_traces_5mb_cpu (erratic) | cpu | ⚪ +0.34 | metrics profiles logs |
| otlp_ingest_traces_ottl_transform_5mb_memory | memory | ⚪ +0.28 | metrics profiles logs |
| quality_gates_rss_dsd_ultraheavy | memory | ⚪ +0.23 | metrics profiles logs |
| dsd_uds_100mb_3k_contexts_memory | memory | ⚪ +0.17 | metrics profiles logs |
| quality_gates_rss_idle | memory | ⚪ +0.13 | metrics profiles logs |
| quality_gates_rss_dsd_medium | memory | ⚪ +0.12 | metrics profiles logs |
| dsd_uds_10mb_3k_contexts_memory | memory | ⚪ +0.07 | metrics profiles logs |
| otlp_ingest_traces_ottl_transform_5mb_throughput | throughput | ⚪ -0.05 | metrics profiles logs |
| otlp_ingest_traces_ottl_filtering_5mb_throughput | throughput | ⚪ -0.02 | metrics profiles logs |
| otlp_ingest_traces_5mb_throughput | throughput | ⚪ -0.02 | metrics profiles logs |
| quality_gates_rss_dsd_heavy | memory | ⚪ +0.01 | metrics profiles logs |
| dsd_uds_1mb_3k_contexts_throughput | throughput | ⚪ +0.00 | metrics profiles logs |
| dsd_uds_512kb_3k_contexts_throughput | throughput | ⚪ +0.00 | metrics profiles logs |
| otlp_ingest_logs_5mb_throughput (ignored) | throughput | ⚪ +0.01 | metrics profiles logs |
| dsd_uds_100mb_3k_contexts_throughput | throughput | ⚪ +0.01 | metrics profiles logs |
| otlp_ingest_metrics_5mb_throughput | throughput | ⚪ +0.01 | metrics profiles logs |
| dsd_uds_10mb_3k_contexts_throughput | throughput | ⚪ +0.02 | metrics profiles logs |
| otlp_ingest_traces_5mb_memory | memory | ⚪ -0.08 | metrics profiles logs |
| otlp_ingest_traces_ottl_filtering_5mb_memory | memory | ⚪ -0.08 | metrics profiles logs |
| dsd_uds_1mb_3k_contexts_memory | memory | ⚪ -0.16 | metrics profiles logs |
| dsd_uds_100mb_3k_contexts_cpu (erratic) | cpu | ⚪ -0.28 | metrics profiles logs |
| otlp_ingest_traces_ottl_filtering_5mb_cpu (erratic) | cpu | ⚪ -0.30 | metrics profiles logs |
| dsd_uds_512kb_3k_contexts_memory | memory | ⚪ -0.31 | metrics profiles logs |
| quality_gates_rss_dsd_low | memory | ⚪ -0.31 | metrics profiles logs |
| dsd_uds_500mb_3k_contexts_throughput | throughput | ⚪ +0.60 | metrics profiles logs |
| otlp_ingest_metrics_5mb_cpu (erratic) | cpu | ⚪ -2.10 | metrics profiles logs |
| dsd_uds_10mb_3k_contexts_cpu (erratic) | cpu | 🟢 -5.36 | metrics profiles logs |
| otlp_ingest_logs_5mb_memory (ignored) | memory | ⚪ -5.74 | metrics profiles logs |
| dsd_uds_512kb_3k_contexts_cpu (erratic) | cpu | 🟢 -6.67 | metrics profiles logs |

Bounds Checks: ✅ Passed (5)
| experiment | check | replicates | observed | links |
| --- | --- | --- | --- | --- |
| quality_gates_rss_dsd_heavy | memory_usage | 10/10 | ✅ 125 MiB ≤ 140 MiB | metrics profiles logs |
| quality_gates_rss_dsd_low | memory_usage | 10/10 | ✅ 39.7 MiB ≤ 50 MiB | metrics profiles logs |
| quality_gates_rss_dsd_medium | memory_usage | 10/10 | ✅ 60.3 MiB ≤ 75 MiB | metrics profiles logs |
| quality_gates_rss_dsd_ultraheavy | memory_usage | 10/10 | ✅ 182 MiB ≤ 200 MiB | metrics profiles logs |
| quality_gates_rss_idle | memory_usage | 10/10 | ✅ 26.8 MiB ≤ 40 MiB | metrics profiles logs |

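
As a rough illustration of what a bounds check of this shape evaluates, the sketch below compares each observed RSS figure with its limit and reports the remaining headroom. The numbers are copied from the table above; the dictionary layout and the helper name `check_bounds` are hypothetical and do not reflect how SMP stores or evaluates its checks.

```python
# (observed MiB, limit MiB) per experiment, copied from the bounds-check table.
BOUNDS = {
    "quality_gates_rss_dsd_heavy": (125.0, 140.0),
    "quality_gates_rss_dsd_low": (39.7, 50.0),
    "quality_gates_rss_dsd_medium": (60.3, 75.0),
    "quality_gates_rss_dsd_ultraheavy": (182.0, 200.0),
    "quality_gates_rss_idle": (26.8, 40.0),
}


def check_bounds(bounds):
    """Return {name: (passed, headroom in MiB)} for observed <= limit checks."""
    return {
        name: (observed <= limit, round(limit - observed, 1))
        for name, (observed, limit) in bounds.items()
    }


results = check_bounds(BOUNDS)
for name, (passed, headroom) in results.items():
    print(f"{name}: {'pass' if passed else 'FAIL'}, headroom {headroom} MiB")
```
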
Explanation

A change is flagged as a regression when |Δ mean %| > 5.00% in the regressing direction for its optimization goal AND SMP marks the experiment as a regression (is_regression: true). Improvements use the matching criteria for the improving direction. Experiments configured erratic: true (tagged (ignored)) are skipped outright; experiments detected as erratic at runtime (tagged (erratic)) still count, since that flag describes sample dispersion rather than directional certainty. The Δ mean % cell is colored accordingly: 🟢 = improvement, 🔴 = regression, ⚪ = neutral. Reduction in CPU or memory is an improvement; reduction in ingress throughput is a regression.
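
The rule described above can be sketched in a few lines. This is an illustrative reading of the explanation, not the detector's real code: the `Experiment` type, the `classify` function, and the `smp_is_improvement` field (a mirror of `is_regression` for the improving direction) are assumptions, and the flag values in the usage example are chosen only to be consistent with the colors shown in the table.

```python
from dataclasses import dataclass

THRESHOLD_PCT = 5.00

# Goals where a decrease is good; for throughput an increase is good.
LOWER_IS_BETTER = {"cpu", "memory"}


@dataclass
class Experiment:
    name: str
    goal: str                          # "cpu", "memory", or "throughput"
    delta_mean_pct: float
    configured_erratic: bool = False   # tagged (ignored) in the report
    smp_is_regression: bool = False    # SMP's is_regression flag
    smp_is_improvement: bool = False   # hypothetical mirror of is_regression


def classify(exp: Experiment) -> str:
    """Return 'ignored', 'regression', 'improvement', or 'neutral'."""
    if exp.configured_erratic:
        return "ignored"               # skipped outright
    if abs(exp.delta_mean_pct) <= THRESHOLD_PCT:
        return "neutral"
    decreased = exp.delta_mean_pct < 0
    improving = decreased if exp.goal in LOWER_IS_BETTER else not decreased
    if improving and exp.smp_is_improvement:
        return "improvement"
    if not improving and exp.smp_is_regression:
        return "regression"
    return "neutral"                   # large delta, but SMP did not confirm it


examples = [
    Experiment("dsd_uds_10mb_3k_contexts_cpu", "cpu", -5.36, smp_is_improvement=True),
    Experiment("dsd_uds_1mb_3k_contexts_cpu", "cpu", 7.89),
    Experiment("otlp_ingest_logs_5mb_memory", "memory", -5.74, configured_erratic=True),
]
verdicts = [classify(e) for e in examples]
print(verdicts)
```

Both conditions must hold for a flag, which is why a +7.89% CPU delta can still appear as neutral when SMP does not mark it as a regression.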

@blt blt force-pushed the blt/antithesis-research branch from c269079 to e840a73 Compare May 29, 2026 20:04
@blt blt force-pushed the blt/antithesis-harness branch from 6b01012 to c9699d1 Compare May 29, 2026 20:04
@blt blt force-pushed the blt/antithesis-research branch from e840a73 to 8712b69 Compare May 29, 2026 20:33
@blt blt force-pushed the blt/antithesis-harness branch 2 times, most recently from 25934ae to 330a22f Compare May 29, 2026 20:34
@blt blt force-pushed the blt/antithesis-research branch from 8712b69 to 3265657 Compare May 29, 2026 20:34
@blt blt force-pushed the blt/antithesis-harness branch from 330a22f to f8bab09 Compare May 29, 2026 20:38
@blt blt force-pushed the blt/antithesis-research branch from 3265657 to af46baf Compare May 29, 2026 20:38
@blt blt force-pushed the blt/antithesis-harness branch from f8bab09 to d048536 Compare May 29, 2026 20:46
@blt blt force-pushed the blt/antithesis-research branch from af46baf to 5cb8545 Compare May 29, 2026 20:46
@blt blt force-pushed the blt/antithesis-harness branch from d048536 to e4defed Compare May 29, 2026 20:55
@blt blt force-pushed the blt/antithesis-research branch 2 times, most recently from d604067 to a649f24 Compare May 29, 2026 21:18
@blt blt force-pushed the blt/antithesis-harness branch from e4defed to 9994924 Compare May 29, 2026 21:18
@blt blt force-pushed the blt/antithesis-research branch from a649f24 to 249a646 Compare May 29, 2026 22:36
@blt blt force-pushed the blt/antithesis-harness branch 2 times, most recently from e46f656 to fb9e655 Compare May 30, 2026 00:43
@blt blt force-pushed the blt/antithesis-research branch from 249a646 to 05efef3 Compare May 30, 2026 00:43
@blt blt force-pushed the blt/antithesis-harness branch from fb9e655 to 4d533a8 Compare May 30, 2026 00:47
@blt blt force-pushed the blt/antithesis-research branch from 05efef3 to 08ac10a Compare May 30, 2026 00:47
@blt blt force-pushed the blt/antithesis-harness branch from 4d533a8 to ac323be Compare May 30, 2026 00:49
Copilot AI left a comment

Pull request overview

Copilot reviewed 46 out of 46 changed files in this pull request and generated 2 comments.

correctness, lifecycle/config, untrusted-input parsing, concurrency, and **transform & enrichment
correctness** (Category G, added after evaluation — ADP as a *transformer*, not just a transport).

> **Evaluation note (2026-05-28):** a 4-lens portfolio evaluation added 8 properties (G1 events/
Comment on lines +30 to +35
- **Dominance:** `rss-bounded-under-cardinality` is the **roll-up** — it observes the aggregate
outcome (RSS ≤ grant). The other four explain *why* it does or doesn't hold:
`aggregate-context-limit-enforced` and `interner-full-bounded` are the two designed bounds;
`interner-full-bounded` (heap-on default) and `memory-limiter-survives-rss-read-failure` are the
two leaks that make the roll-up fail. If `rss-bounded` passes, the sub-properties likely hold; if
it fails, the sub-properties localize the cause. Test the roll-up *and* the components.
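
The roll-up relationship in the excerpt above can be illustrated with a small sketch. It assumes each property reduces to a pass/fail result per run; the `diagnose` function and the sample results are hypothetical, and only the property names come from the excerpt.

```python
def diagnose(rollup_passed: bool, components: dict[str, bool]) -> str:
    """Combine a roll-up property result with its component results."""
    failing = sorted(name for name, ok in components.items() if not ok)
    if rollup_passed and not failing:
        return "ok"
    if rollup_passed:
        # "Likely hold" is not a guarantee, so components are checked too.
        return "roll-up holds but components fail: " + ", ".join(failing)
    if failing:
        return "roll-up fails; localized to: " + ", ".join(failing)
    return "roll-up fails; no component explains it"


# Hypothetical results for one run; the roll-up is rss-bounded-under-cardinality.
components = {
    "aggregate-context-limit-enforced": True,
    "interner-full-bounded": False,
    "memory-limiter-survives-rss-read-failure": True,
}
verdict = diagnose(rollup_passed=False, components=components)
print(verdict)
```

Checking the components even when the roll-up passes matches the excerpt's advice to test the roll-up and the components together.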
@blt blt force-pushed the blt/antithesis-harness branch from e8a5058 to 377ffd5 Compare June 1, 2026 18:17
@blt blt force-pushed the blt/antithesis-research branch from e540987 to 8d3fbbf Compare June 1, 2026 18:17
@blt blt changed the title docs(agent-data-plane): Antithesis research scratchbook and bug ledger docs(antithesis): Antithesis research scratchbook and bug ledger Jun 2, 2026
@blt blt changed the base branch from blt/antithesis-harness to graphite-base/1768 June 2, 2026 12:21
Copilot AI review requested due to automatic review settings June 2, 2026 12:22
@blt blt force-pushed the graphite-base/1768 branch from 377ffd5 to 9d9e29d Compare June 2, 2026 12:22
@blt blt force-pushed the blt/antithesis-research branch from 8d3fbbf to b230025 Compare June 2, 2026 12:22
@graphite-app graphite-app Bot changed the base branch from graphite-base/1768 to main June 2, 2026 12:22
@blt blt force-pushed the blt/antithesis-research branch from b230025 to 1b642b8 Compare June 2, 2026 12:22
Copilot AI left a comment

Pull request overview

Copilot reviewed 46 out of 46 changed files in this pull request and generated 17 comments.

@@ -0,0 +1,159 @@
# rss-bounded-under-cardinality
@@ -0,0 +1,150 @@
# retry-queue-bounded-under-outage
@@ -0,0 +1,129 @@
# interner-full-bounded
@@ -0,0 +1,121 @@
# aggregate-context-limit-enforced
@@ -0,0 +1,60 @@
# config-runtime-update-not-revalidated
type: Safety (Always) + Reachability
priority: High
status: assertion-missing
sut_commit: 042f41db3bd97118c38981765fd49696fce9d318
type: Safety (Reachability / Unreachable)
priority: Medium
status: assertion-missing
sut_commit: fc4bb29728814ddf9321572b954ec28f58faeb53
type: Liveness
priority: High
status: assertion-missing
sut_commit: 042f41db3bd97118c38981765fd49696fce9d318
Comment on lines +31 to +35
outcome (RSS ≤ grant). The other four explain *why* it does or doesn't hold:
`aggregate-context-limit-enforced` and `interner-full-bounded` are the two designed bounds;
`interner-full-bounded` (heap-on default) and `memory-limiter-survives-rss-read-failure` are the
two leaks that make the roll-up fail. If `rss-bounded` passes, the sub-properties likely hold; if
it fails, the sub-properties localize the cause. Test the roll-up *and* the components.
correctness, lifecycle/config, untrusted-input parsing, concurrency, and **transform & enrichment
correctness** (Category G, added after evaluation — ADP as a *transformer*, not just a transport).

> **Evaluation note (2026-05-28):** a 4-lens portfolio evaluation added 8 properties (G1 events/
@tobz tobz left a comment

Given what we discussed yesterday about how we expect to use this and have it be kept up-to-date, fine with merging it. 👍🏻

@blt blt merged commit ba3b71f into main Jun 2, 2026
80 of 81 checks passed
blt commented Jun 2, 2026

Merge activity

  • Jun 2, 4:24 PM UTC: @blt merged this pull request with Graphite.

@blt blt deleted the blt/antithesis-research branch June 2, 2026 16:24
dd-octo-sts Bot pushed a commit that referenced this pull request Jun 2, 2026
## Summary

The Antithesis research artifacts for agent-data-plane, under `test/antithesis/scratchbook/`. This is
the analysis behind the harness and the bug repros:

- A SUT analysis of the DogStatsD data path and runtime.
- A property catalog (35 properties) with per-property evidence files.
- A deployment topology, a portfolio evaluation, property relationships, and a bug ledger that maps
  each discovered defect to how it is reproduced.

Docs only, no code. The internal design-partner codename is scrubbed and an internal Antithesis run
id is redacted; Confluence links and Jira references are kept.

## Change Type
- [ ] Bug fix
- [ ] New feature
- [x] Non-functional (chore, refactoring, docs)
- [ ] Performance

## How did you test this PR?

Docs only. `check-docs` is unaffected (it builds the `docs/` site; these notes live under `test/`).
No code paths change.

## References

- Builds on the harness PR (`test/antithesis/`) in this stack.
- The failing bug repros it catalogs are in the bug-tests PR in this stack.
- Internal context (kept per repo norms) is in the Confluence/Jira links in each artifact's
  frontmatter.

Commit ba3b71f
@blt blt mentioned this pull request Jun 5, 2026
4 tasks