antithesis: scaffold harness, multi-scenario layout, broadened testdrive corpus by DAlperin · Pull Request #36437 · MaterializeInc/materialize

DAlperin · 2026-05-07T02:39:17Z

Brings up the Antithesis Test Composer harness for Materialize. The branch covers research artifacts, the test-driver mzbuild image, scenario infrastructure, workload helpers (testdrive runner, corpus lists, the upsert_sources prototype), and a few small upstream fixes that were prerequisites.

This is a draft because validation still needs to run on real x86 (see Known limitations) and we still need to wire ANTITHESIS_REPOSITORY into the launch pipeline. Functionally, the harness is ready to ship to Antithesis as the first base scenario.

What's in here

`antithesis/`

antithesis/
  AGENTS.md                           directory map + scenarios table
  scratchbook/                        SUT analysis, deployment topology,
                                      test-driver integration plan,
                                      scenario strategy, existing-assertions
                                      inventory
  configs/<scenario>/
    mzcompose.py                      source-of-truth composition
    docker-compose.yaml               generated artifact (snouty consumes)
  bin/render-compose-yaml.py          renders configs/<scenario>/docker-compose.yaml
                                      from mzcompose.py and layers on
                                      platform / hostname / container_name /
                                      NO_COLOR Antithesis attributes
  test-driver/                        mzbuild image: MZFROM testdrive +
                                      Python + Antithesis SDK + the workload
                                      tree at /opt/antithesis/test/v1/ +
                                      curated test/testdrive corpus at
                                      /opt/materialize/td/testdrive/
  test/v1/
    helper_bootstrap.py               shared sys.path injector
    testdrive_{sql,kafka,
               load_generator,
               recovery}/             area templates; each picks a random .td
                                      from materialize.antithesis.testdrive_corpus
    upsert_sources/                   randomized helper-driven upsert workload
                                      with expected-state model
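
As a rough illustration of how an area template might pick its file, the sketch below draws one `.td` name from a bucket. The bucket contents and the `pick_td` and `unique_files` helpers are hypothetical; only the bucket names mirror `testdrive_corpus`. It uses the standard-library `random` module for simplicity, which may not be what the real templates use.

```python
import random

# Hypothetical bucket contents; the real lists live in
# materialize.antithesis.testdrive_corpus and are much longer.
CORPUS: dict[str, list[str]] = {
    "BASE_SQL": ["select.td", "joins.td", "shared.td"],
    "BASE_KAFKA": ["kafka-upsert.td", "shared.td"],
}


def pick_td(area: str, rng: random.Random) -> str:
    """Return one .td file name from the bucket for `area`."""
    bucket = CORPUS[area]
    if not bucket:
        raise ValueError(f"empty corpus bucket: {area}")
    return rng.choice(bucket)


def unique_files() -> set[str]:
    """Collect file names across buckets; a repeated name counts once."""
    return set().union(*CORPUS.values())
```

In this toy data one name is listed in two buckets, so the unique count (4) is lower than the sum of the bucket sizes (5).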

`misc/python/materialize/antithesis/`

| Module | Role |
| --- | --- |
| `sdk.py` | SDK wrapper with local fallbacks. The `antithesis` package coexists with our scaffolding directory because both are namespace packages. |
| `testdrive_config.py` | Shared `TestdriveConfig` dataclass (env-driven). |
| `td_runner.py` | Generic `.td` runner: subprocess + tolerated-failure retry + `reachable()` on success and on tolerated failure. |
| `testdrive_corpus.py` | Curated lists of base-compatible `.td` files split into 4 area buckets: `BASE_SQL` (22), `BASE_KAFKA` (35), `BASE_LOAD_GENERATOR` (12), `BASE_RECOVERY` (8); 72 unique files. |
| `upsert_sources.py` | Prototype workload helper: randomized Kafka UPSERT writes with an expected-state model. |
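
The runner pattern described for `td_runner.py` (subprocess, retry on tolerated failures, a reachability signal on both outcomes) could look roughly like the sketch below. This is not the code in this PR: `run_with_tolerated_retry`, its parameters, and the substring-based notion of a "tolerated" failure are assumptions made for illustration, and the `reachable` argument is a plain callable standing in for the SDK wrapper.

```python
import subprocess
import sys
import time
from typing import Any, Callable, Sequence

# Stand-in type for the SDK's reachability hook (hypothetical shape).
Reachable = Callable[[str, dict[str, Any]], None]


def run_with_tolerated_retry(
    cmd: Sequence[str],
    tolerated: Sequence[str],
    reachable: Reachable,
    attempts: int = 3,
    delay_s: float = 0.0,
) -> bool:
    """Run `cmd`, retrying when its output contains a tolerated marker.

    Returns True on success and False if every attempt ended in a
    tolerated failure. Any other failure raises RuntimeError.
    """
    for attempt in range(1, attempts + 1):
        proc = subprocess.run(cmd, capture_output=True, text=True)
        if proc.returncode == 0:
            reachable("run succeeded", {"attempt": attempt})
            return True
        output = proc.stdout + proc.stderr
        if any(marker in output for marker in tolerated):
            reachable("run hit a tolerated failure", {"attempt": attempt})
            time.sleep(delay_s)
            continue
        raise RuntimeError(
            f"untolerated failure (exit {proc.returncode}): {output}"
        )
    return False


# Usage with a recorder in place of the real SDK call.
events: list[str] = []
record: Reachable = lambda message, details: events.append(message)
ok = run_with_tolerated_retry([sys.executable, "-c", "pass"], [], record)
```

Passing the hook in as a parameter keeps the sketch runnable without the SDK installed, which is similar in spirit to the local fallbacks mentioned for `sdk.py`.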

Why per-scenario configs

antithesis/scratchbook/scenario-strategy.md documents the design with citations to the Antithesis docs (snouty docs CLI). Short version: incompatible topologies (e.g. base SQL+Kafka vs MySQL CDC with multithreaded replicas) get separate config dirs and separate snouty launch invocations; incompatible workloads on the same topology coexist as multiple test templates inside one config (Antithesis selects exactly one template per execution history).

The base scenario is the only one wired up here; mysql_mt_replicas for the SS-95 ticket is the planned next addition.

Why no per-area `eventually_*` recovery checks

Earlier drafts of this branch had per-area eventually_* commands. They either reduced to tautologies (SELECT count(*) >= 0 always passes if pgwire is up) or to a generic CREATE/INSERT/SELECT/DROP round-trip that was only weakly correlated with the chosen .td's actual semantics. When the singleton picks a random .td you can't write a useful recovery property without knowing what state was created.

Real recovery properties belong either in SUT-side Rust assertions or in scenario-specific eventually_* commands tied to a specific workload — which is exactly the shape upsert_sources/eventually_* already has (it writes a sentinel and waits for it). The scratchbook entry warns against re-adding generic per-area recovery checks.
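
To make the contrast concrete, here is a minimal sketch of a sentinel-style check, assuming an in-memory stand-in for the source. `wait_for_sentinel` and `FakeSource` are hypothetical names and are not taken from `upsert_sources/`; the point is only that the check is tied to state the workload itself wrote.

```python
import time
from typing import Callable


def wait_for_sentinel(
    read_keys: Callable[[], set[str]],
    sentinel: str,
    timeout_s: float = 5.0,
    poll_s: float = 0.01,
) -> bool:
    """Poll until `sentinel` is visible or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if sentinel in read_keys():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)


class FakeSource:
    """In-memory stand-in whose writes become visible after a few reads."""

    def __init__(self, lag_reads: int) -> None:
        self._pending: list[str] = []
        self._visible: set[str] = set()
        self._lag = lag_reads

    def write(self, key: str) -> None:
        self._pending.append(key)

    def read(self) -> set[str]:
        if self._lag > 0:
            self._lag -= 1  # simulate ingestion lag
        else:
            self._visible.update(self._pending)
            self._pending.clear()
        return set(self._visible)
```

Unlike a `count(*) >= 0` check, this returns `False` when the written key never becomes visible, so it can actually fail after a fault.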

Upstream fixes pulled in

These could ideally be split out as their own PRs. They were prerequisites for the Antithesis work to function and are limited in scope.

- `misc/python/materialize/cli/mzcompose.py`: the shtab Enum-choices workaround broke `--arch` and `--sanitizer` on Python 3.13. argparse's post-conversion `member in choices` check failed because choices were member names while `type=Enum` returned member objects. Switched to `list(action.choices)` (Enum members) so argparse and shtab are both happy. Includes the regenerated bash/zsh shell completions.
- `misc/python/materialize/mzbuild.py`: the `Copy` pre-image plugin was unused upstream and crashed when used: `Copy.inputs()` returned paths relative to the source dir, but the mzbuild fingerprinter expected paths relative to the repo root, so `os.lstat(rd.root / rel_path)` hit `FileNotFoundError`. Made `Copy.inputs()` repo-root-relative and `Copy.run()` strip the source prefix before computing the destination inside the build context.
- `misc/python/materialize/mzcompose/service.py`: added `container_name` to the `ServiceConfig` TypedDict (a real Compose field). Antithesis requires it for log/fault attribution.
- `ci/test/lint-main/checks/check-mzcompose-files.sh`: exclude `antithesis/configs/*/mzcompose.py` from the "unused in any CI pipeline file" check; Antithesis runs are submitted via snouty, not Buildkite.
- `ci/builder/requirements.txt`: added `antithesis==0.2.0` so the SDK resolves locally for type-checking and ad-hoc imports. It coexists with our `antithesis/` scaffolding directory because both are namespace packages and the merge picks up submodules from site-packages.
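
The argparse mismatch in the first item can be reproduced in isolation. The sketch below uses a hypothetical `Arch` enum and a `parse_arch` helper rather than the real definitions in `mzcompose.py`; it only shows the general mechanism, where argparse converts the argument with `type` first and then tests the converted value against `choices`.

```python
import argparse
from enum import Enum


class Arch(Enum):
    # Hypothetical members, for illustration only.
    X86_64 = "x86_64"
    AARCH64 = "aarch64"


def parse_arch(choices, argv):
    """Parse --arch; return the Enum member, or None if argparse rejects it."""
    parser = argparse.ArgumentParser(exit_on_error=False)
    parser.add_argument("--arch", type=Arch, choices=choices)
    try:
        return parser.parse_args(argv).arch
    except (argparse.ArgumentError, SystemExit):
        return None


# Choices given as member names (strings): the converted value
# Arch.X86_64 is not equal to any string, so the check fails.
rejected = parse_arch([a.name for a in Arch], ["--arch", "x86_64"])

# Choices given as Enum members: the membership check succeeds.
accepted = parse_arch(list(Arch), ["--arch", "x86_64"])
```

Here `rejected` is `None` and `accepted` is `Arch.X86_64`, which matches the direction of the fix described above (choices as Enum members).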

Known limitations

- On Apple Silicon, `snouty validate antithesis/configs/base` fails end-to-end because the amd64 `materialized` image's `clusterd` child segfaults under Rosetta during lgalloc init (`unix_wait_status(11)` → container `Exited (139)`). Run validate on Linux/x86 instead. Documented in `antithesis/AGENTS.md`.
- On Apple Silicon, `bin/mzimage acquire --arch x86_64` currently fails to link with `ld.lld: error: undefined symbol: getauxval`: the `materializeinc/crosstools/x86_64-unknown-linux-gnu` homebrew formula ships glibc 2.12.1, but Rust 1.95's stdlib references `getauxval`, which needs glibc 2.16+. Workaround:
  ```
  bin/ci-builder run stable bin/mzimage acquire --arch x86_64 antithesis-test-driver
  ```
  This uses the Docker builder's current glibc. The homebrew formula needs an upstream update; CI is unaffected (Linux hosts route through ci-builder by default).
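
For readers unfamiliar with the exit status in the first item: `Exited (139)` is consistent with the common convention of reporting death by signal as 128 plus the signal number, and signal 11 is SIGSEGV. A tiny check, with `exit_code_for_signal` as a hypothetical helper name:

```python
import signal


def exit_code_for_signal(signum: int) -> int:
    """Exit status conventionally reported for a process killed by `signum`."""
    return 128 + int(signum)


segv_exit = exit_code_for_signal(signal.SIGSEGV)
```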

Next steps

- SS-95: `mysql_mt_replicas` scenario per `scratchbook/test-driver-integration.md`.
- Tier 2 scenarios: `pg_cdc`, `mysql_cdc`, `sql_server_cdc`, `s3_copy`.
- Tier 3 structural refactors: parallel-workload regression scenario (gate `Database.create()` Kafka/CSR/AWS/PG/MySQL/SQLServer/Iceberg CREATE CONNECTIONs on flags), zippy execution adapter.
- Per-scenario template gating via `ANTITHESIS_SCENARIO` env in the test-driver entrypoint, so a scenario only sees its compatible templates.
- SUT-side Rust assertions where they're justified (rare/dangerous internal states, branch outcomes).
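
The planned template gating is not implemented in this PR. As one possible shape, and purely as a sketch, the entrypoint could filter templates by the `ANTITHESIS_SCENARIO` variable. The mapping, the `enabled_templates` helper, the `mysql_cdc_workload` template name, and the default of `base` are all assumptions; the other template names follow the directory layout above.

```python
import os
from collections.abc import Mapping

# Hypothetical mapping from scenario name to compatible templates.
TEMPLATES_BY_SCENARIO: dict[str, frozenset[str]] = {
    "base": frozenset(
        {
            "testdrive_sql",
            "testdrive_kafka",
            "testdrive_load_generator",
            "testdrive_recovery",
            "upsert_sources",
        }
    ),
    # Invented template name, for illustration only.
    "mysql_mt_replicas": frozenset({"mysql_cdc_workload"}),
}


def enabled_templates(
    available: list[str], env: Mapping[str, str] = os.environ
) -> list[str]:
    """Keep only the templates compatible with the selected scenario."""
    scenario = env.get("ANTITHESIS_SCENARIO", "base")
    try:
        allowed = TEMPLATES_BY_SCENARIO[scenario]
    except KeyError:
        raise ValueError(f"unknown scenario: {scenario!r}") from None
    return [name for name in available if name in allowed]
```

Failing on an unknown scenario name, rather than silently enabling everything, is a design choice of this sketch and not something the PR specifies.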

Tests

Lint passes locally (`bin/lint`); the audited test commands import cleanly under `bin/pyactivate`. The compose YAML validates via `docker compose config --quiet`. End-to-end `snouty validate` is blocked on the local toolchain limitations above.

🤖 Generated with Claude Code


DAlperin wants to merge 1 commit into `MaterializeInc:main` from `DAlperin:dov/antithesis-harness`
