diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
index 70a20c4a2..fbb09d90f 100644
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -108,3 +108,12 @@ jobs:
       - name: Smoke test interactive TTY prompt
         if: ${{ matrix.os == 'ubuntu-latest' && matrix.node-version == '22.18.0' }}
         run: pnpm smoke:tty-prompt
+
+      # SlopBench task self-test: every benchmark task's clean reference solution
+      # must still pass its functional gate AND score reward > 0. Guards the
+      # corpus against drift in the verifier, the scoring profile, or React
+      # Doctor. The prior `pnpm build` step left react-doctor + the verifier
+      # built; React-component tasks install their own dev deps during grading.
+      - name: Validate SlopBench task reference solutions
+        if: ${{ matrix.os == 'ubuntu-latest' && matrix.node-version == '22.18.0' }}
+        run: pnpm benchmark:validate
diff --git a/docs/SLOPBENCH.md b/docs/SLOPBENCH.md
new file mode 100644
index 000000000..e219db136
--- /dev/null
+++ b/docs/SLOPBENCH.md
@@ -0,0 +1,105 @@
+# SlopBench — methodology
+
+SlopBench (in [`packages/benchmark`](../packages/benchmark)) measures how good a
+model is at frontend engineering, with a deliberate focus on **how much React /
+TypeScript slop it emits**. It extends the DeepSWE / Harbor approach with a
+second, continuous quality axis.
+
+## Why two axes
+
+Correctness-only benchmarks reward a working feature regardless of how it was
+built. Real frontend review cares about both: does it work, _and_ is it clean?
+SlopBench keeps a hard **functional gate** (hidden behavioral tests) and adds a
+**slop score** computed purely by static analysis on the diff:
+
+```
+reward = functional_pass × (slop_score / 100)
+```
+
+- `functional_pass ∈ {0,1}` — the DeepSWE-style gate.
+- `slop_score ∈ [0,100]` — higher = cleaner.
+
+Reporting both separately (plus per-dimension subscores) lets a leaderboard rank
+by correctness, by cleanliness, or by the product. Setting the slop weight to
+zero recovers a pure correctness benchmark.
+
+## How the slop score is computed
+
+The verifier (`slop-verify`, the `@react-doctor/benchmark` package) runs
+**offline** over the agent's diff against the task's base commit:
+
+1. **React Doctor** (`--json --no-score --no-dead-code`) — the canonical React
+   diagnostic engine, scoped to the files the agent changed. Its five categories
+   map to four dimensions: Security and Bugs share `react-correctness`, and the
+   rest map to `react-performance`, `accessibility`, and `maintainability`.
+   Specific bundle/waterfall rules are rerouted to `bundle` and `async-waterfall`.
+2. **TypeScript strictness** (AST, no type-checker needed) — explicit `any`,
+   `as` casts, non-null `!`, and `@ts-ignore`/`@ts-nocheck`/`@ts-expect-error`.
+3. **Composition** (AST, distilled from Vercel's composition-patterns) —
+   boolean-prop soup and function-valued render props.
+4. **deslop heuristic** — nested ternaries.
+
+Each finding is weighted `severity × category × rule-impact`. The per-dimension
+penalty is **size-normalized** by the diff's added lines, so a large legitimate
+feature is penalized less than a tiny diff with the same violations. Each
+dimension then scores `clamp(100 − penalty, 0, 100)`, and the composite is the
+profile-weighted mean across dimensions.
+
+Every number lives in [`scoring-profiles/default.json`](../packages/benchmark/scoring-profiles/default.json)
+(mirrored by `src/constants.ts`); the `scoringVersion` is stamped into every
+report so scores are reproducible and comparable.
+
+### Why local scoring (not the react.doctor score API)
+
+React Doctor's canonical 0–100 score is a remote API call. Benchmark grading is
+**air-gapped** (`allow_internet = false`), so SlopBench computes its own
+deterministic score from the offline `diagnostics[]`. The remote API is never on
+the grading path.
+
+## Reference influences
+
+The dimensions and checks are grounded in:
+
+- **React Doctor rules** — the React correctness/performance/a11y/security engine.
+- **deslop skill** — indirection, dead code, nested ternaries, near-duplicates.
+- **Vercel [react-best-practices]** — waterfalls, bundle, re-render, rendering tiers.
+- **Vercel [composition-patterns]** — boolean-prop soup, render-props, compound components.
+- **Vercel [next-best-practices]** — RSC boundaries, async APIs, `next/image`, bundling.
+
+To avoid double-counting, [`rule-overlap.md`](../packages/benchmark/rule-overlap.md)
+records which tool owns each signal; SlopBench only adds checks for gaps React
+Doctor does not already cover (TS strictness + composition).
+
+[react-best-practices]: https://github.com/vercel-labs/agent-skills/tree/main/skills/react-best-practices
+[composition-patterns]: https://github.com/vercel-labs/agent-skills/tree/main/skills/composition-patterns
+[next-best-practices]: https://github.com/vercel-labs/next-skills#next-best-practices
+
+## Task families
+
+- **produce-clean** — implement a working feature; slop is measured on the diff.
+  Measures the slop a model emits _unprompted_ (the instruction never mentions
+  quality).
+- **handle-slop** — the seed ships working-but-sloppy code; a small change is
+  requested. Measures whether the model _adds_ slop or cleans what it touches.
+- **explicit-deslop** _(v2)_ — the instruction asks to clean up while preserving
+  behavior; isolates capability from inclination.
+
+## Anti-gaming
+
+- Scanners run over the whole diff, not a fixed file the agent can target.
+- Suppression escape hatches (`@ts-ignore`, eslint-disable-style comments) are
+  themselves scored as slop.
+- Tests, fixtures, generated files, and lockfiles are excluded from grading, so
+  an agent neither earns credit for tests nor is charged for vendored slop.
+- Hidden tests are applied only at grade time.
+
+## Reproducibility
+
+- React Doctor + the verifier are installed from a single pinned checkout in the
+  base image (`tasks/_base/Dockerfile`); pin `REACT_DOCTOR_REF` for a release.
+- `doctorVersion` + `scoringVersion` are recorded in every `slop-report.json`.
+- `scripts/validate-all.sh` asserts every task's reference solution still passes
+  and scores `reward > 0` — run it before cutting a benchmark release.
+
+See [`packages/benchmark/README.md`](../packages/benchmark/README.md) for the run
+and authoring workflow.
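
Review note (not part of the patch): the scoring steps described in `docs/SLOPBENCH.md` above can be illustrated with a minimal sketch. The weights are copied from `scoring-profiles/default.json` in this diff. Everything else is an assumption: the function names are hypothetical, a missing multiplier is assumed to count as 1, and the size-normalization step is omitted because the doc does not spell out its formula. The real verifier under `src/` may differ.

```javascript
// Illustrative only: hypothetical names, not the verifier's actual API.
const profile = {
  severityWeights: { error: 5, warning: 2 },
  categoryMultipliers: {
    Security: 3, Bugs: 2, Performance: 1.5, Accessibility: 1.2, Maintainability: 1,
  },
  ruleImpactMultipliers: { "ts/ban-ts-comment": 2.5, "ts/no-explicit-any": 2 },
  dimensionWeights: {
    "react-correctness": 1.5, "ts-strictness": 1.5, "react-performance": 1.2,
    composition: 1, "async-waterfall": 1, bundle: 1, maintainability: 1,
    accessibility: 0.8,
  },
};

// severity x category x rule-impact.
// Assumption: a multiplier that is not listed in the profile counts as 1.
const findingWeight = ({ severity, category, ruleId }) =>
  (profile.severityWeights[severity] ?? 0) *
  (profile.categoryMultipliers[category] ?? 1) *
  (profile.ruleImpactMultipliers[ruleId] ?? 1);

// clamp(100 - penalty, 0, 100). The penalty passed in is assumed to be
// size-normalized already; that step is not reproduced here.
const dimensionScore = (penalty) => Math.min(100, Math.max(0, 100 - penalty));

// Profile-weighted mean across the dimensions that were scored.
const compositeScore = (scoresByDimension) => {
  let weightedSum = 0;
  let totalWeight = 0;
  for (const [dimension, score] of Object.entries(scoresByDimension)) {
    const weight = profile.dimensionWeights[dimension] ?? 0;
    weightedSum += weight * score;
    totalWeight += weight;
  }
  return totalWeight === 0 ? 0 : weightedSum / totalWeight;
};

// reward = functional_pass x (slop_score / 100)
const reward = (functionalPass, slopScore) => (functionalPass ? 1 : 0) * (slopScore / 100);
```

Under these assumptions a failed functional gate always yields a reward of 0, however clean the diff is, which is the gating behaviour the doc describes.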
diff --git a/package.json b/package.json
index 7f7b79a28..b2a5cdc56 100644
--- a/package.json
+++ b/package.json
@@ -26,6 +26,7 @@
     "release": "pnpm build && pnpm check:published-deps && node scripts/sentry-sourcemaps.mjs && changeset publish",
     "check:published-deps": "node --experimental-strip-types --no-warnings scripts/check-published-deps.ts",
     "smoke:json-report": "node --experimental-strip-types --no-warnings scripts/smoke-json-report.ts",
+    "benchmark:validate": "bash packages/benchmark/scripts/validate-all.sh",
     "smoke:tty-prompt": "python3 scripts/smoke-tty-prompt.py",
     "build:schema": "node --experimental-strip-types --no-warnings scripts/generate-config-schema.ts"
   },
diff --git a/packages/benchmark/README.md b/packages/benchmark/README.md
new file mode 100644
index 000000000..ee4caf52a
--- /dev/null
+++ b/packages/benchmark/README.md
@@ -0,0 +1,149 @@
+# SlopBench
+
+A benchmark for measuring how good individual models are at **frontend
+engineering — and specifically how much React/TypeScript "slop" they produce**.
+
+Unlike correctness-only SWE benchmarks, SlopBench scores **two axes** per task:
+
+1. **Functional correctness** (gate) — hidden behavioral tests, exactly like
+   [DeepSWE](https://github.com/datacurve-ai/deep-swe). If the feature does not
+   work, the task is failed.
+2. **Slop score** (0–100, continuous) — how clean the code the model wrote is,
+   measured **offline** by [React Doctor](https://react.doctor) plus a strict
+   TypeScript pass, Vercel-derived composition checks, and deslop heuristics.
+
+A model can make the feature work and **still score poorly** for shipping slop
+(inline components, array-index keys, `any`, type casts, `@ts-ignore`,
+boolean-prop soup, …). The headline **reward** combines them:
+
+```
+reward = functional_pass × (slop_score / 100)
+```
+
+## Task format
+
+SlopBench uses the [Harbor](https://www.harborframework.com/docs/tasks) task
+format (so it runs under [Pier](https://github.com/datacurve-ai/pier) /
+Harbor unchanged):
+
+```text
+tasks/<id>/
+  task.toml          metadata: family, target_dimensions, base commit, image, limits
+  instruction.md     the prompt the agent sees (no mention of "slop" / quality)
+  seed/              the starting project (committed as the base commit)
+  environment/Dockerfile   reproduces the env (FROM slopbench-base)
+  tests/
+    test.sh          thin wrapper -> `slopbench-grade` (functional gate + slop scan)
+    test.patch       hidden tests, applied at grade time
+  solution/          reference clean solution (reviewer aid; never used at grading)
+  _authoring/        human-readable source for the patches (solved/ + hidden/)
+```
+
+The verifier writes `reward.txt` (the composite float) and a rich
+`slop-report.json` artifact (per-dimension scores + every violation).
+
+## Quickstart (Pier — swappable harness)
+
+The task format is harness-agnostic. Pier drives `mini-swe-agent` (model-agnostic)
+**and** the CLI agents directly — pass `--agent` to switch:
+
+```bash
+git clone https://github.com/millionco/react-doctor
+uv tool install datacurve-pier
+
+# Build the shared base image once (provides react-doctor + slop-verify + grader)
+docker build -t slopbench-base:latest -f packages/benchmark/tasks/_base/Dockerfile .
+
+# Claude Code as the harness
+export ANTHROPIC_API_KEY=...
+pier run -p packages/benchmark/tasks --agent claude-code --model anthropic/claude-opus-4-7
+
+# Codex
+export OPENAI_API_KEY=...
+pier run -p packages/benchmark/tasks --agent codex --model openai/gpt-5.5
+
+# Other harnesses Pier drives directly:
+pier run -p packages/benchmark/tasks --agent gemini-cli --model google/gemini-2.5-pro
+pier run -p packages/benchmark/tasks --agent opencode  --model anthropic/claude-opus-4-7
+
+# Model-agnostic harness (works with any provider)
+pier run -p packages/benchmark/tasks --agent mini-swe-agent --model anthropic/claude-opus-4-7
+```
+
+Single task or a deterministic subset:
+
+```bash
+pier run -p packages/benchmark/tasks/notification-list --agent claude-code
+pier run -p packages/benchmark/tasks --agent mini-swe-agent --n-tasks 3 --sample-seed 0
+```
+
+## Aggregating results into a scorecard
+
+After a run, turn the per-task reports into one model scorecard:
+
+```bash
+node packages/benchmark/scripts/aggregate-results.mjs \
+  --logs <pier-logs-dir> --model claude-opus-4-7 \
+  --out packages/benchmark/results/claude-opus-4-7.json
+```
+
+It reports `functionalPassRate`, `meanSlopScore`, `meanReward`, and per-dimension
+means — the shape a (v2) leaderboard renders. A web leaderboard is intentionally
+out of scope for v1.
+
+## Slop dimensions
+
+Each violation maps to exactly one dimension (no double-counting — see
+[`rule-overlap.md`](./rule-overlap.md)):
+
+| Dimension                                                                    | Owner                                                     |
+| ---------------------------------------------------------------------------- | --------------------------------------------------------- |
+| `react-correctness`, `react-performance`, `accessibility`, `maintainability` | React Doctor                                              |
+| `bundle`, `async-waterfall`                                                  | React Doctor (specific rules rerouted)                    |
+| `ts-strictness`                                                              | SlopBench TS checks (`any`, casts, `!`, `@ts-ignore`)     |
+| `composition`                                                                | SlopBench Vercel checks (boolean-prop soup, render props) |
+
+Weights live in [`scoring-profiles/default.json`](./scoring-profiles/default.json)
+(mirrored by `src/constants.ts`); the active scoring version is stamped into
+every report.
+
+## Authoring a new task
+
+```bash
+cd packages/benchmark
+# 1. scaffold boilerplate (task.toml, test.sh, Dockerfile, solve.sh)
+scripts/scaffold-task.sh my-task produce-clean "ts-strictness" \
+  "node --experimental-strip-types --test tests/my-task.test.ts" \
+  "My task title" "One-line description"
+# 2. author tasks/my-task/seed/, instruction.md,
+#    _authoring/solved/** (clean reference) and _authoring/hidden/** (hidden tests)
+# 3. format first, THEN generate the patches (patches embed seed context,
+#    so formatting the seed after generating would make them stale)
+pnpm format
+scripts/gen-task-patches.sh tasks/my-task
+# 4. validate end-to-end WITHOUT Docker (seed -> grade reference solution)
+scripts/validate-task.sh tasks/my-task --expect-pass
+```
+
+Validate the whole corpus (reference solutions must pass + score reward>0):
+
+```bash
+scripts/validate-all.sh        # from packages/benchmark
+pnpm benchmark:validate        # from the repo root (also run in CI)
+```
+
+Pure-TS tasks use Node's built-in test runner (`node --experimental-strip-types
+--test`) and need no dependency install; React tasks use `vitest` +
+`react-dom/server` (install happens at image-build time). Both run **air-gapped**
+at agent time.
+
+## The verifier CLI
+
+`slop-verify` scores a graded diff directly (used by the grader, handy in dev):
+
+```bash
+slop-verify --root <project> --base <git-ref> --json
+```
+
+See `slop-verify --help` for all flags (`--profile`, `--functional-pass`,
+`--out`, `--fail-under`, …).
diff --git a/packages/benchmark/bin/slop-verify.js b/packages/benchmark/bin/slop-verify.js
new file mode 100755
index 000000000..0cc765eca
--- /dev/null
+++ b/packages/benchmark/bin/slop-verify.js
@@ -0,0 +1,4 @@
+#!/usr/bin/env node
+import { runCli } from "../dist/index.mjs";
+
+runCli(process.argv.slice(2));
diff --git a/packages/benchmark/package.json b/packages/benchmark/package.json
new file mode 100644
index 000000000..768b669a0
--- /dev/null
+++ b/packages/benchmark/package.json
@@ -0,0 +1,40 @@
+{
+  "name": "@react-doctor/benchmark",
+  "version": "0.4.2",
+  "private": true,
+  "description": "Internal: SlopBench — a Harbor/Pier-compatible benchmark measuring how much React/TypeScript slop a model produces, scored through React Doctor plus a strict TypeScript pass, Vercel-derived AST checks, and deslop heuristics. Not published.",
+  "license": "MIT",
+  "bin": {
+    "slop-verify": "./bin/slop-verify.js"
+  },
+  "files": [
+    "bin/**",
+    "dist/**/*.mjs",
+    "dist/**/*.d.mts",
+    "scoring-profiles/**"
+  ],
+  "type": "module",
+  "sideEffects": false,
+  "exports": {
+    ".": {
+      "types": "./dist/index.d.mts",
+      "default": "./dist/index.mjs"
+    }
+  },
+  "scripts": {
+    "build": "node -e \"require('node:fs').rmSync('dist', { recursive: true, force: true })\" && cross-env NODE_ENV=production vp pack",
+    "test": "vp test run tests",
+    "typecheck": "tsc --noEmit"
+  },
+  "dependencies": {
+    "@react-doctor/core": "workspace:*",
+    "oxc-parser": "^0.132.0"
+  },
+  "devDependencies": {
+    "@types/node": "^25.6.0",
+    "react-doctor": "workspace:*"
+  },
+  "engines": {
+    "node": "^20.19.0 || >=22.12.0"
+  }
+}
diff --git a/packages/benchmark/rule-overlap.md b/packages/benchmark/rule-overlap.md
new file mode 100644
index 000000000..7ee6b1928
--- /dev/null
+++ b/packages/benchmark/rule-overlap.md
@@ -0,0 +1,71 @@
+# Rule overlap & ownership
+
+SlopBench scores slop from multiple scanners. To avoid **double-counting** the
+same defect, every slop signal has exactly one owner. This table is the single
+source of truth: when adding a check, confirm React Doctor does not already
+cover it — if it does, **defer** and (optionally) route its rule id into a finer
+dimension instead of re-implementing detection.
+
+## Ownership by dimension
+
+| Dimension           | Owner                           | How                                                                                             |
+| ------------------- | ------------------------------- | ----------------------------------------------------------------------------------------------- |
+| `react-correctness` | React Doctor                    | categories **Security**, **Bugs**                                                               |
+| `react-performance` | React Doctor                    | category **Performance** (minus the rules rerouted below)                                       |
+| `accessibility`     | React Doctor                    | category **Accessibility**                                                                      |
+| `maintainability`   | React Doctor + deslop heuristic | category **Maintainability** (incl. the `ln`/deslop dead-code plugin) + `deslop/nested-ternary` |
+| `bundle`            | React Doctor (rerouted)         | specific Performance-category rule ids → `bundle`                                               |
+| `async-waterfall`   | React Doctor (rerouted)         | specific Performance-category rule ids → `async-waterfall`                                      |
+| `ts-strictness`     | SlopBench TS checks             | React Doctor does **not** cover generic TS slop                                                 |
+| `composition`       | SlopBench Vercel checks         | proliferation / render-prop not counted by React Doctor                                         |
+
+## React Doctor rules rerouted to finer dimensions
+
+React Doctor files these under the broad **Performance** category; SlopBench
+routes the exact rule ids into dedicated dimensions
+(`REACT_DOCTOR_RULE_TO_DIMENSION` in `src/constants.ts`) so the leaderboard can
+report them separately. Detection still belongs to React Doctor — we only
+relabel the dimension.
+
+- `react-doctor/no-barrel-import` → `bundle`
+- `react-doctor/no-full-lodash-import` → `bundle`
+- `react-doctor/no-moment` → `bundle`
+- `react-doctor/no-undeferred-third-party` → `bundle`
+- `react-doctor/prefer-dynamic-import` → `bundle`
+- `react-doctor/no-dynamic-import-path` → `bundle`
+- `react-doctor/use-lazy-motion` → `bundle`
+- `react-doctor/server-sequential-independent-await` → `async-waterfall`
+- `react-doctor/tanstack-start-loader-parallel-fetch` → `async-waterfall`
+
+## Vercel rules deliberately DEFERRED to React Doctor (no custom check)
+
+These Vercel best-practices map onto an existing React Doctor rule, so SlopBench
+does **not** add a duplicate detector:
+
+| Vercel rule                        | Covered by React Doctor                                                        |
+| ---------------------------------- | ------------------------------------------------------------------------------ |
+| `bundle-barrel-imports`            | `react-doctor/no-barrel-import`, `no-full-lodash-import`                       |
+| `bundle-dynamic-imports`           | `react-doctor/prefer-dynamic-import`, `no-dynamic-import-path`                 |
+| `async-parallel` / waterfalls      | `react-doctor/server-sequential-independent-await`                             |
+| `rerender-no-inline-components`    | `react-doctor/no-nested-component-definition`, `no-unstable-nested-components` |
+| `rerender-derived-state-no-effect` | React Doctor `state-and-effects` rules                                         |
+| `react19-no-forwardref`            | `react-doctor/forward-ref-uses-ref`, `no-react19-deprecated-apis`              |
+| `rendering-*` (img, etc.)          | `react-doctor/nextjs-no-img-element`, …                                        |
+
+## Signals SlopBench OWNS (custom checks — React Doctor gap)
+
+TypeScript strictness (`src/checks/ts-*.ts`, dimension `ts-strictness`):
+
+- `ts/no-explicit-any` — explicit `any` annotations
+- `ts/no-non-null-assertion` — the `!` operator
+- `ts/no-type-assertion` — `as Foo` / `<Foo>x` casts (`as const` exempt)
+- `ts/ban-ts-comment` — `@ts-ignore` / `@ts-nocheck` / `@ts-expect-error` (scored as error)
+
+Composition (`src/checks/vercel-*.ts`, dimension `composition`):
+
+- `vercel/architecture-boolean-prop-soup` — `*Props` types with ≥ `BOOLEAN_PROP_SOUP_THRESHOLD` boolean flags
+- `vercel/patterns-render-prop` — function-valued `render` / `renderX` props
+
+deslop (`src/checks/deslop-*.ts`, dimension `maintainability`):
+
+- `deslop/nested-ternary` — nested conditional expressions (one finding per chain)
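
Review note (not part of the patch): the ownership tables in `rule-overlap.md` above amount to a small routing function. The sketch below copies the rerouted rule ids and the category-to-dimension mapping from those tables. The names `RULE_TO_DIMENSION`, `CATEGORY_TO_DIMENSION`, and `dimensionFor` are hypothetical stand-ins, and the lookup order (custom prefixes first, then rule override, then category) is an assumption rather than the behaviour of `REACT_DOCTOR_RULE_TO_DIMENSION` in `src/constants.ts`.

```javascript
// Illustrative only: mirrors the tables in rule-overlap.md, not the real constants.
const RULE_TO_DIMENSION = new Map([
  ["react-doctor/no-barrel-import", "bundle"],
  ["react-doctor/no-full-lodash-import", "bundle"],
  ["react-doctor/no-moment", "bundle"],
  ["react-doctor/no-undeferred-third-party", "bundle"],
  ["react-doctor/prefer-dynamic-import", "bundle"],
  ["react-doctor/no-dynamic-import-path", "bundle"],
  ["react-doctor/use-lazy-motion", "bundle"],
  ["react-doctor/server-sequential-independent-await", "async-waterfall"],
  ["react-doctor/tanstack-start-loader-parallel-fetch", "async-waterfall"],
]);

const CATEGORY_TO_DIMENSION = new Map([
  ["Security", "react-correctness"],
  ["Bugs", "react-correctness"],
  ["Performance", "react-performance"],
  ["Accessibility", "accessibility"],
  ["Maintainability", "maintainability"],
]);

// Each finding lands in exactly one dimension (or null if nothing matches).
const dimensionFor = ({ ruleId, category }) => {
  // SlopBench-owned checks are identified by their rule-id prefix.
  if (ruleId.startsWith("ts/")) return "ts-strictness";
  if (ruleId.startsWith("vercel/")) return "composition";
  if (ruleId.startsWith("deslop/")) return "maintainability";
  // Assumption: a rerouted rule id wins over the rule's broad category.
  return RULE_TO_DIMENSION.get(ruleId) ?? CATEGORY_TO_DIMENSION.get(category) ?? null;
};
```

Checking the rule override before the category is what keeps a rerouted Performance rule out of `react-performance`, so it is counted once.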
diff --git a/packages/benchmark/scoring-profiles/default.json b/packages/benchmark/scoring-profiles/default.json
new file mode 100644
index 000000000..cc6a0b74d
--- /dev/null
+++ b/packages/benchmark/scoring-profiles/default.json
@@ -0,0 +1,35 @@
+{
+  "version": "1.0.0",
+  "severityWeights": {
+    "error": 5,
+    "warning": 2
+  },
+  "categoryMultipliers": {
+    "Security": 3,
+    "Bugs": 2,
+    "Performance": 1.5,
+    "Accessibility": 1.2,
+    "Maintainability": 1
+  },
+  "ruleImpactMultipliers": {
+    "ts/ban-ts-comment": 2.5,
+    "ts/no-explicit-any": 2,
+    "ts/no-non-null-assertion": 1.5,
+    "ts/no-type-assertion": 1.5,
+    "vercel/architecture-boolean-prop-soup": 1.8,
+    "vercel/patterns-render-prop": 1.3,
+    "deslop/nested-ternary": 1.2
+  },
+  "dimensionWeights": {
+    "react-correctness": 1.5,
+    "ts-strictness": 1.5,
+    "react-performance": 1.2,
+    "composition": 1,
+    "async-waterfall": 1,
+    "bundle": 1,
+    "maintainability": 1,
+    "accessibility": 0.8
+  },
+  "diffSizeNormalizerLines": 40,
+  "minNormalizerLines": 25
+}
diff --git a/packages/benchmark/scripts/aggregate-results.mjs b/packages/benchmark/scripts/aggregate-results.mjs
new file mode 100644
index 000000000..43018c081
--- /dev/null
+++ b/packages/benchmark/scripts/aggregate-results.mjs
@@ -0,0 +1,116 @@
+#!/usr/bin/env node
+// Aggregate a model's per-task SlopBench reports into one scorecard.
+//
+// After a `pier run`, each task leaves a slop-report.json under the run's logs.
+// This walks a logs directory, collects every slop-report.json, and emits a
+// results JSON: functional pass-rate, mean slop score, mean reward, and
+// per-dimension means — the shape a (v2) leaderboard renders.
+//
+// Usage:
+//   node scripts/aggregate-results.mjs --logs <dir> --model <name> [--out <file>]
+import * as fs from "node:fs";
+import * as path from "node:path";
+
+const parseArgs = (argv) => {
+  const args = {};
+  for (let index = 0; index < argv.length; index++) {
+    const token = argv[index];
+    if (!token.startsWith("--")) continue;
+    const key = token.slice(2);
+    const next = argv[index + 1];
+    if (next && !next.startsWith("--")) {
+      args[key] = next;
+      index++;
+    } else {
+      args[key] = true;
+    }
+  }
+  return args;
+};
+
+const findReports = (root) => {
+  const found = [];
+  const walk = (dir) => {
+    let entries;
+    try {
+      entries = fs.readdirSync(dir, { withFileTypes: true });
+    } catch {
+      return;
+    }
+    for (const entry of entries) {
+      const full = path.join(dir, entry.name);
+      if (entry.isDirectory()) walk(full);
+      else if (entry.name === "slop-report.json") found.push(full);
+    }
+  };
+  walk(root);
+  return found;
+};
+
+const mean = (values) =>
+  values.length === 0 ? null : values.reduce((total, value) => total + value, 0) / values.length;
+
+const main = () => {
+  const args = parseArgs(process.argv.slice(2));
+  const logsDir = args.logs;
+  const model = args.model ?? "unknown-model";
+  if (!logsDir) {
+    process.stderr.write(
+      "usage: aggregate-results.mjs --logs <dir> --model <name> [--out <file>]\n",
+    );
+    process.exit(2);
+  }
+
+  const reportPaths = findReports(logsDir);
+  const tasks = [];
+  const dimensionScores = new Map();
+
+  for (const reportPath of reportPaths) {
+    let report;
+    try {
+      report = JSON.parse(fs.readFileSync(reportPath, "utf8"));
+    } catch {
+      continue;
+    }
+    const taskId = path.basename(path.dirname(path.dirname(reportPath)));
+    tasks.push({
+      task: taskId,
+      slopScore: report.slopScore,
+      functionalPass: report.functionalPass,
+      reward: report.reward,
+      violationCount: Array.isArray(report.violations) ? report.violations.length : 0,
+    });
+    for (const dimension of report.dimensions ?? []) {
+      const bucket = dimensionScores.get(dimension.dimension) ?? [];
+      bucket.push(dimension.score);
+      dimensionScores.set(dimension.dimension, bucket);
+    }
+  }
+
+  const passed = tasks.filter((task) => task.functionalPass === true).length;
+  const rewards = tasks.map((task) => task.reward).filter((value) => typeof value === "number");
+  const perDimensionMean = {};
+  for (const [dimension, scores] of dimensionScores) perDimensionMean[dimension] = mean(scores);
+
+  const result = {
+    model,
+    generatedAt: new Date().toISOString(),
+    taskCount: tasks.length,
+    functionalPassRate: tasks.length === 0 ? null : passed / tasks.length,
+    meanSlopScore: mean(tasks.map((task) => task.slopScore).filter(Number.isFinite)),
+    meanReward: mean(rewards),
+    perDimensionMean,
+    tasks: tasks.sort((left, right) => left.task.localeCompare(right.task)),
+  };
+
+  const output = `${JSON.stringify(result, null, 2)}\n`;
+  if (args.out) {
+    fs.mkdirSync(path.dirname(path.resolve(args.out)), { recursive: true });
+    fs.writeFileSync(args.out, output);
+    process.stderr.write(`wrote ${args.out} (${tasks.length} tasks)\n`);
+  } else {
+    process.stdout.write(output);
+  }
+};
+
+main();
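
Review note (not part of the patch): the scorecard emitted by `aggregate-results.mjs` above carries `functionalPassRate`, `meanSlopScore`, and `meanReward`, so a consumer can rank models on any one axis. The sketch below is a hypothetical consumer: `rankBy` and the sample scorecards are invented for illustration (the numbers are not real results), and it assumes only the field names written by the script.

```javascript
// Illustrative only: `rankBy` and the sample data are hypothetical.
const AXES = ["functionalPassRate", "meanSlopScore", "meanReward"];

// Return model names ordered best-first on the chosen axis.
// Scorecards whose value is null (no tasks aggregated) are skipped.
const rankBy = (scorecards, axis) => {
  if (!AXES.includes(axis)) throw new Error(`unknown axis: ${axis}`);
  return scorecards
    .filter((card) => typeof card[axis] === "number")
    .sort((left, right) => right[axis] - left[axis])
    .map((card) => card.model);
};

// Made-up scorecards in the shape aggregate-results.mjs writes.
const sample = [
  { model: "model-a", functionalPassRate: 0.9, meanSlopScore: 70, meanReward: 0.63 },
  { model: "model-b", functionalPassRate: 0.7, meanSlopScore: 95, meanReward: 0.66 },
  { model: "model-c", functionalPassRate: null, meanSlopScore: null, meanReward: null },
];

console.log(rankBy(sample, "functionalPassRate"));
console.log(rankBy(sample, "meanSlopScore"));
```

The two orderings differ for this sample, which is the reason the methodology reports the axes separately.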
diff --git a/packages/benchmark/scripts/gen-task-patches.sh b/packages/benchmark/scripts/gen-task-patches.sh
new file mode 100755
index 000000000..d0d52b03d
--- /dev/null
+++ b/packages/benchmark/scripts/gen-task-patches.sh
@@ -0,0 +1,45 @@
+#!/usr/bin/env bash
+#
+# Generate a task's solution.patch and test.patch from authoring inputs:
+#   tasks/<id>/seed/              the starting repo
+#   tasks/<id>/_authoring/solved/ files overwriting seed paths = the reference fix
+#   tasks/<id>/_authoring/hidden/ files ADDED (e.g. tests/*.test.ts) = hidden tests
+#
+# Produces solution/solution.patch (seed -> solved) and tests/test.patch (added
+# hidden files), as real git patches. The _authoring/ inputs stay in-tree as the
+# human-readable source for the (otherwise opaque) patches.
+#
+# Usage: scripts/gen-task-patches.sh <task-dir>
+set -euo pipefail
+
+TASK_DIR="$(cd "$1" && pwd)"
+SOLVED="$TASK_DIR/_authoring/solved"
+HIDDEN="$TASK_DIR/_authoring/hidden"
+WORK="$(mktemp -d)"
+trap 'rm -rf "$WORK"' EXIT
+
+cp -a "$TASK_DIR/seed/." "$WORK/"
+cd "$WORK"
+git init -q && git config user.email t@t.co && git config user.name t
+git add -A && git commit -qm base >/dev/null
+
+# solution.patch: overlay the solved files, diff against the seed.
+if [ -d "$SOLVED" ]; then
+  cp -a "$SOLVED/." "$WORK/"
+  git add -A && git diff --cached > "$TASK_DIR/solution/solution.patch"
+  git reset -q --hard >/dev/null 2>&1
+  echo "wrote solution.patch ($(grep -c '^diff' "$TASK_DIR/solution/solution.patch") file(s))"
+fi
+
+# test.patch: add the hidden files (intent-to-add), diff just those.
+if [ -d "$HIDDEN" ]; then
+  cp -a "$HIDDEN/." "$WORK/"
+  # Collect hidden paths once, NUL-delimited (space-safe, no GNU-only -printf).
+  HIDDEN_PATHS=()
+  while IFS= read -r -d '' rel; do
+    HIDDEN_PATHS+=("${rel#./}")
+  done < <(cd "$HIDDEN" && find . -type f -print0)
+  git add -N -- "${HIDDEN_PATHS[@]}"
+  git -c core.quotepath=false diff -- "${HIDDEN_PATHS[@]}" > "$TASK_DIR/tests/test.patch"
+  echo "wrote test.patch ($(grep -c '^diff' "$TASK_DIR/tests/test.patch") file(s))"
+fi
diff --git a/packages/benchmark/scripts/scaffold-task.sh b/packages/benchmark/scripts/scaffold-task.sh
new file mode 100755
index 000000000..168366b31
--- /dev/null
+++ b/packages/benchmark/scripts/scaffold-task.sh
@@ -0,0 +1,105 @@
+#!/usr/bin/env bash
+#
+# Scaffold the boilerplate for a new SlopBench task (task.toml, tests/test.sh,
+# environment/Dockerfile, solution/solve.sh). You still author seed/,
+# instruction.md, the reference solution, and the hidden test — then run
+# scripts/gen-task-patches.sh to produce solution.patch + test.patch.
+#
+# Usage:
+#   scripts/scaffold-task.sh <id> <family> <dims-csv> <functional-cmd> [--needs-install] "<title>" "<desc>"
+set -euo pipefail
+
+BENCH_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
+ID="$1"; FAMILY="$2"; DIMS_CSV="$3"; FUNC_CMD="$4"; shift 4
+NEEDS_INSTALL="no"
+if [ "${1:-}" = "--needs-install" ]; then NEEDS_INSTALL="yes"; shift; fi
+TITLE="${1:-$ID}"; DESC="${2:-$ID}"
+TASK_DIR="$BENCH_ROOT/tasks/$ID"
+DIMS_TOML="$(python3 -c "import sys;print(', '.join('\"%s\"'%d for d in sys.argv[1].split(',') for d in [d.strip()] if d))" "$DIMS_CSV")"
+
+mkdir -p "$TASK_DIR/tests" "$TASK_DIR/environment" "$TASK_DIR/solution"
+
+cat > "$TASK_DIR/task.toml" <<EOF
+schema_version = "1.1"
+artifacts = []
+
+[task]
+name = "slopbench/$ID"
+description = "$DESC"
+authors = []
+keywords = ["react", "typescript", "slop", "frontend"]
+
+[metadata]
+task_id = "$ID"
+display_title = "$TITLE"
+display_description = "$DESC"
+family = "$FAMILY"
+target_dimensions = [$DIMS_TOML]
+language = "typescript"
+repository_url = "in-tree"
+base_commit_hash = "root"
+slop_profile = ""
+
+[verifier]
+timeout_sec = 1200.0
+
+[verifier.env]
+
+[agent]
+timeout_sec = 3600.0
+
+[environment]
+build_timeout_sec = 1200.0
+docker_image = "slopbench-base:latest"
+os = "linux"
+cpus = 2
+memory_mb = 4096
+storage_mb = 10240
+gpus = 0
+allow_internet = false
+mcp_servers = []
+
+[environment.env]
+
+[solution.env]
+EOF
+
+cat > "$TASK_DIR/tests/test.sh" <<EOF
+#!/usr/bin/env bash
+set -euo pipefail
+export BASE_COMMIT="\$(git -C "\${APP_DIR:-/app}" rev-list --max-parents=0 HEAD | tail -1)"
+export FUNCTIONAL_TEST_CMD="$FUNC_CMD"
+exec slopbench-grade
+EOF
+
+if [ "$NEEDS_INSTALL" = "yes" ]; then
+  INSTALL_STEP='RUN pnpm install --frozen-lockfile --ignore-scripts || pnpm install --ignore-scripts'
+else
+  INSTALL_STEP='# Pure-TS task: no dependency install (functional test uses node --test).'
+fi
+
+cat > "$TASK_DIR/environment/Dockerfile" <<EOF
+FROM slopbench-base:latest
+
+WORKDIR /app
+
+COPY seed/ .
+$INSTALL_STEP
+RUN git init -q \\
+  && git add -A \\
+  && git -c user.email=bench@react.doctor -c user.name=slopbench commit -qm "base" \\
+  && git config --global --add safe.directory /app
+
+CMD ["/bin/bash"]
+EOF
+
+cat > "$TASK_DIR/solution/solve.sh" <<'EOF'
+#!/usr/bin/env bash
+# Reference solution applier (reviewer aid only — never used at grade time).
+set -euo pipefail
+cd /app
+git apply --whitespace=nowarn /solution/solution.patch
+EOF
+
+chmod +x "$TASK_DIR/tests/test.sh" "$TASK_DIR/solution/solve.sh"
+echo "scaffolded $TASK_DIR (author seed/, instruction.md, then gen-task-patches.sh)"
diff --git a/packages/benchmark/scripts/validate-all.sh b/packages/benchmark/scripts/validate-all.sh
new file mode 100755
index 000000000..3852b203a
--- /dev/null
+++ b/packages/benchmark/scripts/validate-all.sh
@@ -0,0 +1,37 @@
+#!/usr/bin/env bash
+#
+# Validate every SlopBench task's reference solution end-to-end (no Docker):
+# each task's clean solution must pass its functional gate and earn reward > 0.
+# Run this before publishing a benchmark release. It needs network access
+# (e.g. a CI job), since vitest-based tasks install their dev deps.
+#
+# Usage: scripts/validate-all.sh
+set -uo pipefail
+
+BENCH_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
+FAILED=()
+COUNT=0
+
+for task_toml in "$BENCH_ROOT"/tasks/*/task.toml; do
+  task_dir="$(dirname "$task_toml")"
+  name="$(basename "$task_dir")"
+  case "$name" in
+    _template | _base) continue ;;
+  esac
+  COUNT=$((COUNT + 1))
+  echo "::: validating $name"
+  if bash "$BENCH_ROOT/scripts/validate-task.sh" "$task_dir" --expect-pass >/tmp/slopbench-validate-"$name".log 2>&1; then
+    tail -3 /tmp/slopbench-validate-"$name".log | sed 's/^/    /'
+  else
+    echo "    FAILED — see /tmp/slopbench-validate-$name.log"
+    tail -6 /tmp/slopbench-validate-"$name".log | sed 's/^/    /'
+    FAILED+=("$name")
+  fi
+done
+
+echo
+if [ "${#FAILED[@]}" -ne 0 ]; then
+  echo "VALIDATE-ALL: ${#FAILED[@]}/$COUNT task(s) FAILED: ${FAILED[*]}"
+  exit 1
+fi
+echo "VALIDATE-ALL: all $COUNT task reference solutions pass + score reward>0"
diff --git a/packages/benchmark/scripts/validate-task.sh b/packages/benchmark/scripts/validate-task.sh
new file mode 100755
index 000000000..08c009b66
--- /dev/null
+++ b/packages/benchmark/scripts/validate-task.sh
@@ -0,0 +1,72 @@
+#!/usr/bin/env bash
+#
+# Locally validate one SlopBench task WITHOUT Docker, by simulating the sandbox:
+#   seed/ -> git repo (root commit = BASE) -> apply a patch (the "agent") ->
+#   run the task's tests/test.sh through the shared grader -> inspect reward.
+#
+# Usage:
+#   scripts/validate-task.sh <task-dir> [--patch solution|<path>] [--expect-pass|--expect-fail]
+#
+# Defaults to applying the task's reference solution and expecting a passing
+# run (reward > 0). Pass `--patch <file>` to grade an alternative (e.g. sloppy)
+# diff. Requires the workspace react-doctor + slop-verify to be built.
+set -euo pipefail
+
+BENCH_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
+TASK_DIR="$(cd "$1" && pwd)"; shift
+PATCH="solution"
+EXPECT="pass"
+while [ $# -gt 0 ]; do
+  case "$1" in
+    --patch) case "$2" in solution | /*) PATCH="$2" ;; *) PATCH="$PWD/$2" ;; esac; shift 2 ;;
+    --expect-pass) EXPECT="pass"; shift ;;
+    --expect-fail) EXPECT="fail"; shift ;;
+    *) echo "unknown arg: $1" >&2; exit 2 ;;
+  esac
+done
+
+RD_BIN="${RD_BIN:-$BENCH_ROOT/node_modules/.bin/react-doctor}"
+SV_BIN="${SV_BIN:-$BENCH_ROOT/bin/slop-verify.js}"
+[ -f "$BENCH_ROOT/dist/index.mjs" ] || { echo "build the verifier first: pnpm --filter @react-doctor/benchmark build"; exit 3; }
+
+WORK="$(mktemp -d)"
+trap 'rm -rf "$WORK"' EXIT
+APP="$WORK/app"; LOGS="$WORK/logs"; BIN="$WORK/bin"
+mkdir -p "$APP" "$LOGS" "$BIN"
+
+cp -a "$TASK_DIR/seed/." "$APP/"
+cd "$APP"
+git init -q && git config user.email t@t.co && git config user.name t
+git add -A && git commit -qm base >/dev/null
+
+if [ "${INSTALL:-auto}" != "skip" ] && [ -f package.json ] && grep -q '"vitest"' package.json; then
+  echo "[validate] installing seed deps (vitest)…"
+  pnpm install --silent >/dev/null 2>&1 || pnpm install >/dev/null
+fi
+
+PATCH_FILE="$PATCH"
+[ "$PATCH" = "solution" ] && PATCH_FILE="$TASK_DIR/solution/solution.patch"
+if [ -s "$PATCH_FILE" ] && ! grep -q "^# Replace" "$PATCH_FILE"; then
+  echo "[validate] applying patch: $PATCH_FILE"
+  git apply --whitespace=nowarn "$PATCH_FILE"
+else
+  echo "[validate] no usable patch ($PATCH_FILE) — grading the bare seed"
+fi
+
+# Install the shared grader as `slopbench-grade` on PATH.
+ln -s "$BENCH_ROOT/tasks/_base/run-verifier.sh" "$BIN/slopbench-grade"
+chmod +x "$BENCH_ROOT/tasks/_base/run-verifier.sh"
+
+PATH="$BIN:$PATH" APP_DIR="$APP" TESTS_DIR="$TASK_DIR/tests" LOG_DIR="$LOGS" \
+  SLOP_VERIFY="$SV_BIN" REACT_DOCTOR_BIN="$RD_BIN" \
+  bash "$TASK_DIR/tests/test.sh"
+
+REWARD="$(cat "$LOGS/verifier/reward.txt")"
+SCORE="$(python3 -c "import json;print(round(json.load(open('$LOGS/verifier/slop-report.json'))['slopScore'],2))")"
+echo "[validate] reward=$REWARD slopScore=$SCORE expect=$EXPECT"
+python3 -c "import json;r=json.load(open('$LOGS/verifier/slop-report.json'));print('[validate] violations:', sorted(set(v['ruleId'] for v in r['violations'])))"
+
+PASS_NUM="$(python3 -c "print(1 if float('$REWARD')>0 else 0)")"
+if [ "$EXPECT" = "pass" ] && [ "$PASS_NUM" != "1" ]; then echo "[validate] FAIL: expected reward>0"; exit 1; fi
+if [ "$EXPECT" = "fail" ] && [ "$PASS_NUM" != "0" ]; then echo "[validate] FAIL: expected reward==0"; exit 1; fi
+echo "[validate] OK"
diff --git a/packages/benchmark/src/checks/deslop-nested-ternary.ts b/packages/benchmark/src/checks/deslop-nested-ternary.ts
new file mode 100644
index 000000000..d14872be9
--- /dev/null
+++ b/packages/benchmark/src/checks/deslop-nested-ternary.ts
@@ -0,0 +1,42 @@
+import type { AstCheck, ScanFinding } from "../types/index.js";
+import { makeAstFinding } from "../utils/make-ast-finding.js";
+import { walkAst } from "../utils/walk-ast.js";
+
+const asNode = (value: unknown): { type?: string; start?: unknown } | null =>
+  typeof value === "object" && value !== null ? (value as { type?: string }) : null;
+
+// Flags nested ternaries (the deslop skill calls them out explicitly): a
+// `ConditionalExpression` whose consequent or alternate is itself a
+// `ConditionalExpression`. Only the outermost of each chain is reported — inner
+// conditionals reached as a parent's branch are tracked and skipped — so one
+// `a ? b : c ? d : e` chain yields exactly one finding, not a cascade.
+export const deslopNestedTernary: AstCheck = (file): ScanFinding[] => {
+  const nestedChildren = new Set<unknown>();
+  const candidates: Array<{ type?: string; start?: unknown }> = [];
+
+  walkAst(file.program, (node) => {
+    if (node.type !== "ConditionalExpression") return;
+    const consequent = asNode(node.consequent);
+    const alternate = asNode(node.alternate);
+    const consequentIsTernary = consequent?.type === "ConditionalExpression";
+    const alternateIsTernary = alternate?.type === "ConditionalExpression";
+    if (consequentIsTernary) nestedChildren.add(node.consequent);
+    if (alternateIsTernary) nestedChildren.add(node.alternate);
+    if (consequentIsTernary || alternateIsTernary) candidates.push(node);
+  });
+
+  return candidates
+    .filter((node) => !nestedChildren.has(node))
+    .map((node) =>
+      makeAstFinding({
+        file,
+        scanner: "deslop-heuristics",
+        dimension: "maintainability",
+        ruleId: "deslop/nested-ternary",
+        severity: "warning",
+        offset: typeof node.start === "number" ? node.start : 0,
+        message:
+          "Nested ternary is hard to read; use an if/else chain, switch, or extracted helper.",
+      }),
+    );
+};
diff --git a/packages/benchmark/src/checks/index.ts b/packages/benchmark/src/checks/index.ts
new file mode 100644
index 000000000..e3660406b
--- /dev/null
+++ b/packages/benchmark/src/checks/index.ts
@@ -0,0 +1,21 @@
+import type { AstCheck } from "../types/index.js";
+import { deslopNestedTernary } from "./deslop-nested-ternary.js";
+import { tsBanTsComment } from "./ts-ban-ts-comment.js";
+import { tsNoExplicitAny } from "./ts-no-explicit-any.js";
+import { tsNoNonNullAssertion } from "./ts-no-non-null-assertion.js";
+import { tsNoTypeAssertion } from "./ts-no-type-assertion.js";
+import { vercelBooleanPropSoup } from "./vercel-boolean-prop-soup.js";
+import { vercelRenderProp } from "./vercel-render-prop.js";
+
+// Every AST check, run once per parsed source file. These cover the slop React
+// Doctor does not: TypeScript strictness, Vercel composition patterns, and the
+// deslop nested-ternary heuristic. See `rule-overlap.md` for ownership.
+export const AST_CHECKS: readonly AstCheck[] = [
+  tsNoExplicitAny,
+  tsNoNonNullAssertion,
+  tsNoTypeAssertion,
+  tsBanTsComment,
+  vercelBooleanPropSoup,
+  vercelRenderProp,
+  deslopNestedTernary,
+];
diff --git a/packages/benchmark/src/checks/ts-ban-ts-comment.ts b/packages/benchmark/src/checks/ts-ban-ts-comment.ts
new file mode 100644
index 000000000..9fdec9f20
--- /dev/null
+++ b/packages/benchmark/src/checks/ts-ban-ts-comment.ts
@@ -0,0 +1,21 @@
+import type { AstCheck, ScanFinding } from "../types/index.js";
+import { offsetToLine } from "../utils/offset-to-line.js";
+
+const TS_SUPPRESSION_PATTERN = /@ts-(ignore|nocheck|expect-error)\b/;
+
+// Flags `@ts-ignore` / `@ts-nocheck` / `@ts-expect-error` directives. These
+// silence the compiler wholesale and are the most severe TypeScript escape
+// hatch, so they are scored as errors. Works on the comment stream rather than
+// the AST (directives are comments, not nodes).
+export const tsBanTsComment: AstCheck = (file): ScanFinding[] =>
+  file.comments
+    .filter((comment) => TS_SUPPRESSION_PATTERN.test(comment.value))
+    .map((comment) => ({
+      scanner: "typescript",
+      dimension: "ts-strictness",
+      ruleId: "ts/ban-ts-comment",
+      severity: "error",
+      filePath: file.filePath,
+      line: offsetToLine(file.sourceText, comment.start),
+      message: "TypeScript suppression directive hides real type errors; fix the underlying type.",
+    }));
diff --git a/packages/benchmark/src/checks/ts-no-explicit-any.ts b/packages/benchmark/src/checks/ts-no-explicit-any.ts
new file mode 100644
index 000000000..e346b9789
--- /dev/null
+++ b/packages/benchmark/src/checks/ts-no-explicit-any.ts
@@ -0,0 +1,26 @@
+import type { AstCheck, ScanFinding } from "../types/index.js";
+import { makeAstFinding } from "../utils/make-ast-finding.js";
+import { walkAst } from "../utils/walk-ast.js";
+
+// Flags every explicit `any` type annotation. `any` opts a value out of the
+// type system entirely — the single loudest TypeScript slop signal — so each
+// occurrence is a finding. (Implicit `any` is a tsc concern; this catches the
+// explicit, agent-authored kind without needing a type-checker.)
+export const tsNoExplicitAny: AstCheck = (file): ScanFinding[] => {
+  const findings: ScanFinding[] = [];
+  walkAst(file.program, (node) => {
+    if (node.type !== "TSAnyKeyword") return;
+    findings.push(
+      makeAstFinding({
+        file,
+        scanner: "typescript",
+        dimension: "ts-strictness",
+        ruleId: "ts/no-explicit-any",
+        severity: "warning",
+        offset: typeof node.start === "number" ? node.start : 0,
+        message: "Explicit `any` disables type checking for this value; give it a real type.",
+      }),
+    );
+  });
+  return findings;
+};
diff --git a/packages/benchmark/src/checks/ts-no-non-null-assertion.ts b/packages/benchmark/src/checks/ts-no-non-null-assertion.ts
new file mode 100644
index 000000000..366cf0fa5
--- /dev/null
+++ b/packages/benchmark/src/checks/ts-no-non-null-assertion.ts
@@ -0,0 +1,26 @@
+import type { AstCheck, ScanFinding } from "../types/index.js";
+import { makeAstFinding } from "../utils/make-ast-finding.js";
+import { walkAst } from "../utils/walk-ast.js";
+
+// Flags the non-null assertion operator (`value!`). It silences the compiler's
+// null/undefined check without proving the value is present, turning a
+// would-be type error into a potential runtime crash.
+export const tsNoNonNullAssertion: AstCheck = (file): ScanFinding[] => {
+  const findings: ScanFinding[] = [];
+  walkAst(file.program, (node) => {
+    if (node.type !== "TSNonNullExpression") return;
+    findings.push(
+      makeAstFinding({
+        file,
+        scanner: "typescript",
+        dimension: "ts-strictness",
+        ruleId: "ts/no-non-null-assertion",
+        severity: "warning",
+        offset: typeof node.start === "number" ? node.start : 0,
+        message:
+          "Non-null assertion (`!`) hides a possible null/undefined; narrow the type instead.",
+      }),
+    );
+  });
+  return findings;
+};
diff --git a/packages/benchmark/src/checks/ts-no-type-assertion.ts b/packages/benchmark/src/checks/ts-no-type-assertion.ts
new file mode 100644
index 000000000..9cd135e10
--- /dev/null
+++ b/packages/benchmark/src/checks/ts-no-type-assertion.ts
@@ -0,0 +1,36 @@
+import type { AstCheck, AstVisitorNode, ScanFinding } from "../types/index.js";
+import { makeAstFinding } from "../utils/make-ast-finding.js";
+import { walkAst } from "../utils/walk-ast.js";
+
+// `as const` is a readonly/literal-narrowing assertion, not a slop type-cast,
+// so it is exempt.
+const isAsConst = (node: AstVisitorNode): boolean => {
+  const annotation = node.typeAnnotation;
+  if (typeof annotation !== "object" || annotation === null) return false;
+  const reference = annotation as { type?: string; typeName?: { name?: string } };
+  return reference.type === "TSTypeReference" && reference.typeName?.name === "const";
+};
+
+// Flags type assertions (`value as Foo` and `<Foo>value`). A cast overrides the
+// compiler's inferred type and is a frequent source of unsound code; `as const`
+// is exempt because it narrows rather than overrides.
+export const tsNoTypeAssertion: AstCheck = (file): ScanFinding[] => {
+  const findings: ScanFinding[] = [];
+  walkAst(file.program, (node) => {
+    if (node.type !== "TSAsExpression" && node.type !== "TSTypeAssertion") return;
+    if (isAsConst(node)) return;
+    findings.push(
+      makeAstFinding({
+        file,
+        scanner: "typescript",
+        dimension: "ts-strictness",
+        ruleId: "ts/no-type-assertion",
+        severity: "warning",
+        offset: typeof node.start === "number" ? node.start : 0,
+        message:
+          "Type assertion overrides the inferred type; prefer a correct type or a runtime guard.",
+      }),
+    );
+  });
+  return findings;
+};
diff --git a/packages/benchmark/src/checks/vercel-boolean-prop-soup.ts b/packages/benchmark/src/checks/vercel-boolean-prop-soup.ts
new file mode 100644
index 000000000..34d308d9c
--- /dev/null
+++ b/packages/benchmark/src/checks/vercel-boolean-prop-soup.ts
@@ -0,0 +1,72 @@
+import { BOOLEAN_PROP_SOUP_THRESHOLD } from "../constants.js";
+import type { AstCheck, AstVisitorNode, ParsedSourceFile, ScanFinding } from "../types/index.js";
+import { makeAstFinding } from "../utils/make-ast-finding.js";
+import { walkAst } from "../utils/walk-ast.js";
+
+interface PropertySignature {
+  type?: string;
+  typeAnnotation?: { typeAnnotation?: { type?: string } };
+}
+
+const countBooleanMembers = (members: unknown): number => {
+  if (!Array.isArray(members)) return 0;
+  let count = 0;
+  for (const member of members) {
+    const signature = member as PropertySignature;
+    if (
+      signature.type === "TSPropertySignature" &&
+      signature.typeAnnotation?.typeAnnotation?.type === "TSBooleanKeyword"
+    ) {
+      count++;
+    }
+  }
+  return count;
+};
+
+const endsWithProps = (name: unknown): boolean => typeof name === "string" && /Props$/.test(name);
+
+const makeFinding = (
+  file: ParsedSourceFile,
+  node: AstVisitorNode,
+  booleanCount: number,
+): ScanFinding =>
+  makeAstFinding({
+    file,
+    scanner: "vercel-checks",
+    dimension: "composition",
+    ruleId: "vercel/architecture-boolean-prop-soup",
+    severity: "warning",
+    offset: typeof node.start === "number" ? node.start : 0,
+    message: `Props type declares ${booleanCount} boolean flags; prefer composition (variants / compound components) over boolean-prop soup.`,
+  });
+
+// Flags a props type carrying many boolean flags (Vercel
+// architecture-avoid-boolean-props). Each boolean doubles the component's
+// possible states; past the threshold this is the classic boolean-prop soup
+// that composition (variants / compound components) should replace. Scoped to
+// `*Props` declarations so unrelated config types are not penalized.
+export const vercelBooleanPropSoup: AstCheck = (file): ScanFinding[] => {
+  const findings: ScanFinding[] = [];
+  walkAst(file.program, (node) => {
+    if (
+      node.type === "TSInterfaceDeclaration" &&
+      endsWithProps((node.id as { name?: string })?.name)
+    ) {
+      const booleanCount = countBooleanMembers((node.body as { body?: unknown })?.body);
+      if (booleanCount >= BOOLEAN_PROP_SOUP_THRESHOLD)
+        findings.push(makeFinding(file, node, booleanCount));
+    }
+    if (
+      node.type === "TSTypeAliasDeclaration" &&
+      endsWithProps((node.id as { name?: string })?.name) &&
+      (node.typeAnnotation as { type?: string })?.type === "TSTypeLiteral"
+    ) {
+      const booleanCount = countBooleanMembers(
+        (node.typeAnnotation as { members?: unknown }).members,
+      );
+      if (booleanCount >= BOOLEAN_PROP_SOUP_THRESHOLD)
+        findings.push(makeFinding(file, node, booleanCount));
+    }
+  });
+  return findings;
+};
diff --git a/packages/benchmark/src/checks/vercel-render-prop.ts b/packages/benchmark/src/checks/vercel-render-prop.ts
new file mode 100644
index 000000000..b9b3cdf95
--- /dev/null
+++ b/packages/benchmark/src/checks/vercel-render-prop.ts
@@ -0,0 +1,40 @@
+import type { AstCheck, ScanFinding } from "../types/index.js";
+import { makeAstFinding } from "../utils/make-ast-finding.js";
+import { walkAst } from "../utils/walk-ast.js";
+
+const RENDER_PROP_NAME_PATTERN = /^render([A-Z].*)?$/;
+
+const keyName = (key: unknown): string | undefined => {
+  const identifier = key as { type?: string; name?: string; value?: string };
+  if (identifier?.type === "Identifier") return identifier.name;
+  if (identifier?.type === "Literal") return identifier.value;
+  return undefined;
+};
+
+// Flags function-valued `render` / `renderX` props (Vercel
+// patterns-children-over-render-props). A render prop threads JSX through a
+// callback where `children` (or a compound component) would compose more
+// cleanly and stay readable as the component grows.
+export const vercelRenderProp: AstCheck = (file): ScanFinding[] => {
+  const findings: ScanFinding[] = [];
+  walkAst(file.program, (node) => {
+    if (node.type !== "TSPropertySignature") return;
+    const name = keyName(node.key);
+    if (!name || !RENDER_PROP_NAME_PATTERN.test(name)) return;
+    const annotationType = (node.typeAnnotation as { typeAnnotation?: { type?: string } })
+      ?.typeAnnotation?.type;
+    if (annotationType !== "TSFunctionType") return;
+    findings.push(
+      makeAstFinding({
+        file,
+        scanner: "vercel-checks",
+        dimension: "composition",
+        ruleId: "vercel/patterns-render-prop",
+        severity: "warning",
+        offset: typeof node.start === "number" ? node.start : 0,
+        message: `Render prop \`${name}\` passes JSX through a callback; prefer \`children\` / compound components for composition.`,
+      }),
+    );
+  });
+  return findings;
+};
diff --git a/packages/benchmark/src/cli.ts b/packages/benchmark/src/cli.ts
new file mode 100644
index 000000000..957b7c858
--- /dev/null
+++ b/packages/benchmark/src/cli.ts
@@ -0,0 +1,86 @@
+import * as fs from "node:fs";
+import * as path from "node:path";
+import { runSlopVerifier } from "./run-slop-verifier.js";
+import type { SlopReport } from "./types/index.js";
+import { parseCliArgs } from "./utils/parse-cli-args.js";
+
+const USAGE = `slop-verify — score the React/TypeScript slop in a graded diff
+
+Usage:
+  slop-verify --root <dir> --base <ref> [options]
+
+Options:
+  --root <dir>            Project the agent edited (default: cwd)
+  --base <ref>            Git ref the agent started from (default: HEAD)
+  --doctor-bin <path>     React Doctor CLI to invoke (default: react-doctor on PATH)
+  --profile <path>        Scoring-profile JSON (default: built-in profile)
+  --functional-pass <b>   Functional gate outcome: true|false (default: unknown)
+  --out <path>            Write the full JSON SlopReport here
+  --json                  Print the JSON SlopReport to stdout (instead of a summary)
+  --fail-under <score>    Exit non-zero if slopScore < <score> (default: never)
+  --quiet                 Suppress the human-readable summary`;
+
+const asBoolean = (value: string | boolean | undefined): boolean | null => {
+  if (value === undefined) return null;
+  if (value === true || value === "true" || value === "1") return true;
+  if (value === false || value === "false" || value === "0") return false;
+  return null;
+};
+
+const asString = (value: string | boolean | undefined): string | undefined =>
+  typeof value === "string" ? value : undefined;
+
+const renderSummary = (report: SlopReport): string => {
+  const lines = [
+    `SlopBench score: ${report.slopScore.toFixed(1)} / 100 (scoring ${report.scoringVersion})`,
+    `Changed files: ${report.diffStats.changedFileCount}  Added lines: ${report.diffStats.addedLineCount}  Violations: ${report.violations.length}`,
+    "Dimensions:",
+    ...report.dimensions.map(
+      (dimension) =>
+        `  ${dimension.dimension.padEnd(20)} ${dimension.score.toFixed(1).padStart(6)}  (${dimension.violationCount} findings)`,
+    ),
+  ];
+  if (report.functionalPass !== null) {
+    lines.push(
+      `Functional gate: ${report.functionalPass ? "PASS" : "FAIL"}  Reward: ${report.reward?.toFixed(3)}`,
+    );
+  }
+  for (const error of report.scannerErrors) lines.push(`! scanner issue: ${error}`);
+  return lines.join("\n");
+};
+
+// CLI entry. Runs the verifier and reports; only exits non-zero when an
+// explicit `--fail-under` gate is set and missed, so a normal grading run
+// always succeeds and lets `test.sh` own the reward.
+export const runCli = (argv: string[]): void => {
+  const args = parseCliArgs(argv);
+  if (args.help || args.h) {
+    process.stdout.write(`${USAGE}\n`);
+    return;
+  }
+
+  const report = runSlopVerifier({
+    rootDirectory: path.resolve(asString(args.root) ?? process.cwd()),
+    baseRef: asString(args.base) ?? "HEAD",
+    reactDoctorBin: asString(args["doctor-bin"]),
+    profilePath: asString(args.profile),
+    functionalPass: asBoolean(args["functional-pass"]),
+  });
+
+  const outPath = asString(args.out);
+  if (outPath) {
+    fs.mkdirSync(path.dirname(path.resolve(outPath)), { recursive: true });
+    fs.writeFileSync(outPath, `${JSON.stringify(report, null, 2)}\n`);
+  }
+
+  if (args.json) {
+    process.stdout.write(`${JSON.stringify(report)}\n`);
+  } else if (!args.quiet) {
+    process.stdout.write(`${renderSummary(report)}\n`);
+  }
+
+  const failUnder = asString(args["fail-under"]);
+  if (failUnder !== undefined && report.slopScore < Number.parseFloat(failUnder)) {
+    process.exitCode = 1;
+  }
+};
diff --git a/packages/benchmark/src/constants.ts b/packages/benchmark/src/constants.ts
new file mode 100644
index 000000000..981ce9da2
--- /dev/null
+++ b/packages/benchmark/src/constants.ts
@@ -0,0 +1,98 @@
+import type { ScoringProfile, SlopDimension } from "./types/index.js";
+
+// Bump when the scoring formula or the built-in profile changes in a way that
+// makes scores incomparable across versions. Stamped into every SlopReport.
+export const SCORING_VERSION = "1.0.0";
+
+export const SCORE_MAX = 100;
+export const SCORE_MIN = 0;
+
+// Default CLI to invoke when a task does not pin one (resolved on PATH).
+export const DEFAULT_REACT_DOCTOR_BIN = "react-doctor";
+
+// React Doctor emits five user-facing categories; each maps to exactly one
+// SlopBench dimension so a React Doctor finding lands in a single bucket.
+export const REACT_DOCTOR_CATEGORY_TO_DIMENSION: Record<string, SlopDimension> = {
+  Security: "react-correctness",
+  Bugs: "react-correctness",
+  Performance: "react-performance",
+  Accessibility: "accessibility",
+  Maintainability: "maintainability",
+};
+
+// Where a React Doctor diagnostic falls when its category string is
+// unrecognized (e.g. a newly added bucket): treated as a correctness signal.
+export const REACT_DOCTOR_FALLBACK_DIMENSION: SlopDimension = "react-correctness";
+
+// Specific React Doctor rules whose intent is finer than their category
+// bucket. React Doctor files bundle- and waterfall-rules under the broad
+// "Performance" category; routing those exact rule ids into the dedicated
+// `bundle` / `async-waterfall` dimensions lets SlopBench report them
+// separately without us re-implementing detection (we DEFER to React Doctor —
+// see `rule-overlap.md`). Checked before the category mapping.
+export const REACT_DOCTOR_RULE_TO_DIMENSION: Record<string, SlopDimension> = {
+  "react-doctor/no-barrel-import": "bundle",
+  "react-doctor/no-full-lodash-import": "bundle",
+  "react-doctor/no-moment": "bundle",
+  "react-doctor/no-undeferred-third-party": "bundle",
+  "react-doctor/prefer-dynamic-import": "bundle",
+  "react-doctor/no-dynamic-import-path": "bundle",
+  "react-doctor/use-lazy-motion": "bundle",
+  "react-doctor/server-sequential-independent-await": "async-waterfall",
+  "react-doctor/tanstack-start-loader-parallel-fetch": "async-waterfall",
+};
+
+// Threshold for the boolean-prop-soup composition check: a props type with at
+// least this many boolean members is flagged (Vercel architecture-avoid-
+// boolean-props). Below it, a couple of flags is normal and not slop.
+export const BOOLEAN_PROP_SOUP_THRESHOLD = 4;
+
+// Conditional-expression nesting depth at or above which the deslop nested-
+// ternary heuristic fires (the deslop skill calls out nested ternaries).
+export const NESTED_TERNARY_DEPTH_THRESHOLD = 2;
+
+// The built-in fallback profile and single source of truth for default
+// weights. `scoring-profiles/default.json` mirrors this object; a drift test
+// keeps them identical. Tasks may override via `slop-verify --profile <path>`.
+export const DEFAULT_SCORING_PROFILE: ScoringProfile = {
+  version: SCORING_VERSION,
+  severityWeights: {
+    error: 5,
+    warning: 2,
+  },
+  categoryMultipliers: {
+    Security: 3,
+    Bugs: 2,
+    Performance: 1.5,
+    Accessibility: 1.2,
+    Maintainability: 1,
+  },
+  ruleImpactMultipliers: {
+    // TypeScript slop tiers — escape hatches that silence the compiler hurt most.
+    "ts/ban-ts-comment": 2.5,
+    "ts/no-explicit-any": 2,
+    "ts/no-non-null-assertion": 1.5,
+    "ts/no-type-assertion": 1.5,
+    // Composition gap-fillers (React Doctor does not count these).
+    "vercel/architecture-boolean-prop-soup": 1.8,
+    "vercel/patterns-render-prop": 1.3,
+    // deslop maintainability heuristic.
+    "deslop/nested-ternary": 1.2,
+  },
+  dimensionWeights: {
+    "react-correctness": 1.5,
+    "ts-strictness": 1.5,
+    "react-performance": 1.2,
+    composition: 1,
+    "async-waterfall": 1,
+    bundle: 1,
+    maintainability: 1,
+    accessibility: 0.8,
+  },
+  diffSizeNormalizerLines: 40,
+  minNormalizerLines: 25,
+};
+
+// Default multiplier for a finding whose category / rule is not in the
+// profile's multiplier tables.
+export const DEFAULT_WEIGHT_MULTIPLIER = 1;
diff --git a/packages/benchmark/src/index.ts b/packages/benchmark/src/index.ts
new file mode 100644
index 000000000..e47449901
--- /dev/null
+++ b/packages/benchmark/src/index.ts
@@ -0,0 +1,17 @@
+export { runSlopVerifier } from "./run-slop-verifier.js";
+export type { SlopVerifierOptions } from "./run-slop-verifier.js";
+export { runCli } from "./cli.js";
+export { computeSlopScore } from "./scoring/slop-score.js";
+export { loadScoringProfile } from "./scoring/load-scoring-profile.js";
+export { DEFAULT_SCORING_PROFILE, SCORING_VERSION } from "./constants.js";
+export type {
+  ScanFinding,
+  ScannerContext,
+  ScannerName,
+  ScoringProfile,
+  SlopDimension,
+  SlopDimensionScore,
+  SlopDiffStats,
+  SlopReport,
+  SlopViolation,
+} from "./types/index.js";
diff --git a/packages/benchmark/src/run-slop-verifier.ts b/packages/benchmark/src/run-slop-verifier.ts
new file mode 100644
index 000000000..3cfbca452
--- /dev/null
+++ b/packages/benchmark/src/run-slop-verifier.ts
@@ -0,0 +1,70 @@
+import { DEFAULT_REACT_DOCTOR_BIN } from "./constants.js";
+import { runAstChecks } from "./scanners/run-ast-checks.js";
+import { runReactDoctor } from "./scanners/run-react-doctor.js";
+import { loadScoringProfile } from "./scoring/load-scoring-profile.js";
+import { computeSlopScore } from "./scoring/slop-score.js";
+import type { ScannerContext, SlopReport } from "./types/index.js";
+import { collectDiff } from "./utils/collect-diff.js";
+
+export interface SlopVerifierOptions {
+  // Absolute path to the project the agent edited.
+  rootDirectory: string;
+  // Git ref the agent started from; the diff is computed against it.
+  baseRef: string;
+  // React Doctor CLI to invoke; defaults to `react-doctor` on PATH.
+  reactDoctorBin?: string;
+  // Optional scoring-profile JSON path; defaults to the built-in profile.
+  profilePath?: string;
+  // The functional-test outcome, when known, so the report can carry the
+  // composite reward. `null`/omitted ⇒ quality-only run.
+  functionalPass?: boolean | null;
+}
+
+const computeReward = (functionalPass: boolean | null, slopScore: number): number | null => {
+  if (functionalPass === null) return null;
+  return functionalPass ? slopScore / 100 : 0;
+};
+
+// Run the full slop verification pipeline over a graded diff and assemble the
+// SlopReport: collect the diff, run React Doctor (offline) plus the AST checks,
+// score deterministically, and combine with the functional gate. Never exits
+// the process; the caller (CLI / test.sh) decides how to act on the report.
+export const runSlopVerifier = (options: SlopVerifierOptions): SlopReport => {
+  const profile = loadScoringProfile(options.profilePath);
+  const diff = collectDiff(options.rootDirectory, options.baseRef);
+  const scannerErrors: string[] = [];
+  if (diff.error) scannerErrors.push(`diff: ${diff.error}`);
+
+  const context: ScannerContext = {
+    rootDirectory: options.rootDirectory,
+    changedFiles: diff.changedFiles,
+    baseRef: options.baseRef,
+    addedLineCount: diff.addedLineCount,
+    reactDoctorBin: options.reactDoctorBin ?? DEFAULT_REACT_DOCTOR_BIN,
+  };
+
+  const reactDoctor = runReactDoctor(context);
+  if (reactDoctor.error) scannerErrors.push(`react-doctor: ${reactDoctor.error}`);
+  const astFindings = runAstChecks(context);
+
+  const findings = [...reactDoctor.findings, ...astFindings];
+  const scored = computeSlopScore(findings, diff.addedLineCount, profile);
+  const functionalPass = options.functionalPass ?? null;
+
+  return {
+    scoringVersion: profile.version,
+    doctorVersion: reactDoctor.doctorVersion,
+    generatedAt: new Date().toISOString(),
+    diffStats: {
+      changedFileCount: diff.changedFiles.length,
+      addedLineCount: diff.addedLineCount,
+      normalizerLines: scored.normalizerLines,
+    },
+    violations: scored.violations,
+    dimensions: scored.dimensions,
+    slopScore: scored.slopScore,
+    scannerErrors,
+    functionalPass,
+    reward: computeReward(functionalPass, scored.slopScore),
+  };
+};
diff --git a/packages/benchmark/src/scanners/run-ast-checks.ts b/packages/benchmark/src/scanners/run-ast-checks.ts
new file mode 100644
index 000000000..ff6dff0bc
--- /dev/null
+++ b/packages/benchmark/src/scanners/run-ast-checks.ts
@@ -0,0 +1,19 @@
+import { AST_CHECKS } from "../checks/index.js";
+import type { ScanFinding, ScannerContext } from "../types/index.js";
+import { parseSourceFile } from "../utils/parse-source-file.js";
+
+// Parse each changed source file once and run every AST check over it. Covers
+// the TypeScript-strictness, composition, and deslop dimensions that React
+// Doctor does not. Unparsable / non-source files are silently skipped — a file
+// the parser rejects cannot be fairly scored for AST-level slop.
+export const runAstChecks = (context: ScannerContext): ScanFinding[] => {
+  const findings: ScanFinding[] = [];
+  for (const filePath of context.changedFiles) {
+    const parsed = parseSourceFile(context.rootDirectory, filePath);
+    if (!parsed) continue;
+    for (const check of AST_CHECKS) {
+      findings.push(...check(parsed));
+    }
+  }
+  return findings;
+};
diff --git a/packages/benchmark/src/scanners/run-react-doctor.ts b/packages/benchmark/src/scanners/run-react-doctor.ts
new file mode 100644
index 000000000..07adc55af
--- /dev/null
+++ b/packages/benchmark/src/scanners/run-react-doctor.ts
@@ -0,0 +1,98 @@
+import type { JsonReport } from "@react-doctor/core";
+import {
+  REACT_DOCTOR_CATEGORY_TO_DIMENSION,
+  REACT_DOCTOR_FALLBACK_DIMENSION,
+  REACT_DOCTOR_RULE_TO_DIMENSION,
+} from "../constants.js";
+import type { ScanFinding, ScannerContext, SlopDimension } from "../types/index.js";
+import { resolveBinInvocation } from "../utils/resolve-bin-invocation.js";
+import { runCommand } from "../utils/run-command.js";
+
+export interface ReactDoctorScanResult {
+  findings: ScanFinding[];
+  // The CLI's reported version, for the SlopReport provenance field.
+  doctorVersion: string | null;
+  // Set when the CLI could not be run or its output was unparseable. A failed
+  // React Doctor scan must not silently score as "clean", so the orchestrator
+  // surfaces this rather than treating zero findings as success.
+  error: string | null;
+}
+
+// React Doctor exits non-zero whenever it finds issues, so a clean JSON parse —
+// not the exit code — is the success signal.
+const parseReport = (stdout: string): JsonReport | null => {
+  const trimmed = stdout.trim();
+  if (!trimmed) return null;
+  try {
+    const parsed: unknown = JSON.parse(trimmed);
+    if (parsed && typeof parsed === "object" && "diagnostics" in parsed) {
+      return parsed as JsonReport;
+    }
+    return null;
+  } catch {
+    return null;
+  }
+};
+
+const resolveDimension = (ruleId: string, category: string): SlopDimension =>
+  REACT_DOCTOR_RULE_TO_DIMENSION[ruleId] ??
+  REACT_DOCTOR_CATEGORY_TO_DIMENSION[category] ??
+  REACT_DOCTOR_FALLBACK_DIMENSION;
+
+const toFinding = (diagnostic: JsonReport["diagnostics"][number]): ScanFinding => {
+  const ruleId = `${diagnostic.plugin}/${diagnostic.rule}`;
+  return {
+    scanner: "react-doctor",
+    dimension: resolveDimension(ruleId, diagnostic.category),
+    ruleId,
+    severity: diagnostic.severity,
+    filePath: diagnostic.filePath,
+    line: diagnostic.line,
+    message: diagnostic.message,
+    category: diagnostic.category,
+  };
+};
+
+// Run React Doctor over the whole project (offline, no remote score), then keep
+// only diagnostics in files the agent changed. Diff-scoping by changed file —
+// rather than React Doctor's own `--diff` git semantics — keeps grading
+// deterministic and ensures pre-existing, untouched slop is never charged to
+// the agent.
+//
+// Dead-code analysis is disabled (`--no-dead-code`): whole-file reachability
+// needs a real application entry point, which a diff-scoped grade of an
+// isolated change does not reliably provide, so it would false-fire
+// "unused file" on legitimately clean new code. The deslop/maintainability
+// signal is still covered by the AST `deslop/nested-ternary` check and React
+// Doctor's other Maintainability rules.
+export const runReactDoctor = (context: ScannerContext): ReactDoctorScanResult => {
+  const changed = new Set(context.changedFiles);
+  const { command, prefixArgs } = resolveBinInvocation(context.reactDoctorBin);
+  const result = runCommand(
+    command,
+    [...prefixArgs, context.rootDirectory, "--json", "--no-score", "--no-dead-code"],
+    { cwd: context.rootDirectory },
+  );
+
+  if (result.spawnFailed) {
+    return {
+      findings: [],
+      doctorVersion: null,
+      error: `react-doctor failed to start: ${result.stderr}`,
+    };
+  }
+
+  const report = parseReport(result.stdout);
+  if (!report) {
+    return {
+      findings: [],
+      doctorVersion: null,
+      error: `react-doctor produced no parseable JSON report (exit ${result.exitCode})`,
+    };
+  }
+
+  const findings = report.diagnostics
+    .filter((diagnostic) => changed.has(diagnostic.filePath))
+    .map(toFinding);
+  return { findings, doctorVersion: report.version ?? null, error: null };
+};
diff --git a/packages/benchmark/src/scoring/compute-violation-weight.ts b/packages/benchmark/src/scoring/compute-violation-weight.ts
new file mode 100644
index 000000000..6a327a939
--- /dev/null
+++ b/packages/benchmark/src/scoring/compute-violation-weight.ts
@@ -0,0 +1,30 @@
+import { DEFAULT_WEIGHT_MULTIPLIER } from "../constants.js";
+import type { ScanFinding, ScoringProfile, SlopViolation } from "../types/index.js";
+
+// Turn a raw scanner finding into a weighted violation. Weight is the single
+// place severity, React Doctor category, and per-rule Vercel/TS impact tiers
+// combine, so every scanner is scored on the same scale:
+//   weight = severityBase × categoryMultiplier × ruleImpactMultiplier
+export const computeViolationWeight = (
+  finding: ScanFinding,
+  profile: ScoringProfile,
+): SlopViolation => {
+  const severityBase = profile.severityWeights[finding.severity];
+  const categoryMultiplier =
+    finding.category === undefined
+      ? DEFAULT_WEIGHT_MULTIPLIER
+      : (profile.categoryMultipliers[finding.category] ?? DEFAULT_WEIGHT_MULTIPLIER);
+  const ruleImpactMultiplier =
+    profile.ruleImpactMultipliers[finding.ruleId] ?? DEFAULT_WEIGHT_MULTIPLIER;
+
+  return {
+    scanner: finding.scanner,
+    dimension: finding.dimension,
+    ruleId: finding.ruleId,
+    severity: finding.severity,
+    weight: severityBase * categoryMultiplier * ruleImpactMultiplier,
+    filePath: finding.filePath,
+    line: finding.line,
+    message: finding.message,
+  };
+};
diff --git a/packages/benchmark/src/scoring/load-scoring-profile.ts b/packages/benchmark/src/scoring/load-scoring-profile.ts
new file mode 100644
index 000000000..2f3d275c0
--- /dev/null
+++ b/packages/benchmark/src/scoring/load-scoring-profile.ts
@@ -0,0 +1,24 @@
+import * as fs from "node:fs";
+import { DEFAULT_SCORING_PROFILE } from "../constants.js";
+import type { ScoringProfile } from "../types/index.js";
+
+// A loaded profile is trusted shape-wise (it is repo-controlled config, not
+// agent input), but we validate the few fields the scorer divides by so a
+// malformed override fails loudly instead of producing NaN scores.
+const assertUsableProfile = (profile: ScoringProfile, source: string): void => {
+  if (!(profile.diffSizeNormalizerLines > 0) || !(profile.minNormalizerLines > 0)) {
+    throw new Error(`scoring profile ${source} must use positive normalizer line counts`);
+  }
+  if (!profile.severityWeights || !profile.dimensionWeights) {
+    throw new Error(`scoring profile ${source} is missing severity or dimension weights`);
+  }
+};
+
+// Resolve the scoring profile: the built-in default, or a JSON override when a
+// task pins one via `--profile <path>`.
+export const loadScoringProfile = (profilePath?: string): ScoringProfile => {
+  if (!profilePath) return DEFAULT_SCORING_PROFILE;
+  const parsed: ScoringProfile = JSON.parse(fs.readFileSync(profilePath, "utf8"));
+  assertUsableProfile(parsed, profilePath);
+  return parsed;
+};
diff --git a/packages/benchmark/src/scoring/slop-score.ts b/packages/benchmark/src/scoring/slop-score.ts
new file mode 100644
index 000000000..e2afc9b0d
--- /dev/null
+++ b/packages/benchmark/src/scoring/slop-score.ts
@@ -0,0 +1,80 @@
+import { SCORE_MAX, SCORE_MIN } from "../constants.js";
+import type {
+  ScanFinding,
+  ScoringProfile,
+  SlopDimension,
+  SlopDimensionScore,
+  SlopViolation,
+} from "../types/index.js";
+import { clamp } from "../utils/clamp.js";
+import { computeViolationWeight } from "./compute-violation-weight.js";
+
+export interface SlopScoreResult {
+  violations: SlopViolation[];
+  dimensions: SlopDimensionScore[];
+  slopScore: number;
+  normalizerLines: number;
+}
+
+// Divisor that makes penalties "per reference unit of code" so a large
+// legitimate feature is not punished as hard as the same violations in a tiny
+// diff. Floored by `minNormalizerLines` so a one-line change can't divide by a
+// near-zero size and crater the score on a single finding.
+const computeNormalizer = (addedLineCount: number, profile: ScoringProfile): number => {
+  const effectiveLines = Math.max(addedLineCount, profile.minNormalizerLines);
+  return effectiveLines / profile.diffSizeNormalizerLines;
+};
+
+const dimensionScoreFrom = (
+  dimension: SlopDimension,
+  dimensionViolations: SlopViolation[],
+  normalizer: number,
+): SlopDimensionScore => {
+  const rawPenalty = dimensionViolations.reduce((total, violation) => total + violation.weight, 0);
+  const normalizedPenalty = rawPenalty / normalizer;
+  return {
+    dimension,
+    score: clamp(SCORE_MAX - normalizedPenalty, SCORE_MIN, SCORE_MAX),
+    violationCount: dimensionViolations.length,
+    weightedPenalty: normalizedPenalty,
+  };
+};
+
+// Score a set of findings into per-dimension scores and one composite. A
+// dimension with no findings scores a full 100 (you cannot be penalized for
+// slop you had no opportunity to introduce); the composite is the
+// profile-weighted mean across every dimension the profile defines.
+export const computeSlopScore = (
+  findings: ScanFinding[],
+  addedLineCount: number,
+  profile: ScoringProfile,
+): SlopScoreResult => {
+  const violations = findings.map((finding) => computeViolationWeight(finding, profile));
+  const normalizer = computeNormalizer(addedLineCount, profile);
+
+  const dimensions = Object.keys(profile.dimensionWeights).map(
+    (dimensionKey): SlopDimensionScore => {
+      const dimension = dimensionKey as SlopDimension;
+      const dimensionViolations = violations.filter(
+        (violation) => violation.dimension === dimension,
+      );
+      return dimensionScoreFrom(dimension, dimensionViolations, normalizer);
+    },
+  );
+
+  let weightedScoreTotal = 0;
+  let weightTotal = 0;
+  for (const dimensionScore of dimensions) {
+    const dimensionWeight = profile.dimensionWeights[dimensionScore.dimension];
+    weightedScoreTotal += dimensionScore.score * dimensionWeight;
+    weightTotal += dimensionWeight;
+  }
+  const slopScore = weightTotal === 0 ? SCORE_MAX : weightedScoreTotal / weightTotal;
+
+  return {
+    violations,
+    dimensions,
+    slopScore: clamp(slopScore, SCORE_MIN, SCORE_MAX),
+    normalizerLines: normalizer * profile.diffSizeNormalizerLines,
+  };
+};
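Reviewer sketch (not part of the patch): the arithmetic in `computeSlopScore` can be traced with a toy profile. This is a simplified restatement under stated assumptions. `SCORE_MIN` and `SCORE_MAX` are taken to be 0 and 100, as the documented `slop_score ∈ [0,100]` range suggests. Violation weights are supplied directly rather than derived through `computeViolationWeight`. Every `Sketch*`/`toy*` name and every number is invented for illustration.

```typescript
// Simplified stand-ins for the real types in `src/types`; names prefixed
// `Sketch` are hypothetical and exist only in this example.
interface SketchProfile {
  dimensionWeights: Record<string, number>;
  diffSizeNormalizerLines: number;
  minNormalizerLines: number;
}

interface SketchViolation {
  dimension: string;
  // Assumed to already be severityBase * categoryMultiplier * ruleImpactMultiplier.
  weight: number;
}

// Assumption: SCORE_MIN = 0 and SCORE_MAX = 100 (constants.ts is not shown here).
const SKETCH_SCORE_MIN = 0;
const SKETCH_SCORE_MAX = 100;

const sketchClamp = (value: number, minimum: number, maximum: number): number =>
  Math.min(Math.max(value, minimum), maximum);

const sketchSlopScore = (
  violations: SketchViolation[],
  addedLineCount: number,
  profile: SketchProfile,
): number => {
  // Floor the diff size, then express it in "reference units" of code.
  const effectiveLines = Math.max(addedLineCount, profile.minNormalizerLines);
  const normalizer = effectiveLines / profile.diffSizeNormalizerLines;

  let weightedScoreTotal = 0;
  let weightTotal = 0;
  for (const dimension of Object.keys(profile.dimensionWeights)) {
    const rawPenalty = violations
      .filter((violation) => violation.dimension === dimension)
      .reduce((total, violation) => total + violation.weight, 0);
    const score = sketchClamp(
      SKETCH_SCORE_MAX - rawPenalty / normalizer,
      SKETCH_SCORE_MIN,
      SKETCH_SCORE_MAX,
    );
    const dimensionWeight = profile.dimensionWeights[dimension] ?? 0;
    weightedScoreTotal += score * dimensionWeight;
    weightTotal += dimensionWeight;
  }
  return weightTotal === 0 ? SKETCH_SCORE_MAX : weightedScoreTotal / weightTotal;
};

// Made-up profile: two equally weighted dimensions, 100-line reference unit.
const toyProfile: SketchProfile = {
  dimensionWeights: { "ts-strictness": 1, maintainability: 1 },
  diffSizeNormalizerLines: 100,
  minNormalizerLines: 25,
};

// Two violations of weight 10 each, both in one dimension.
const toyViolations: SketchViolation[] = [
  { dimension: "ts-strictness", weight: 10 },
  { dimension: "ts-strictness", weight: 10 },
];

const smallDiffScore = sketchSlopScore(toyViolations, 50, toyProfile);
const largeDiffScore = sketchSlopScore(toyViolations, 200, toyProfile);
const tinyDiffScore = sketchSlopScore(toyViolations, 1, toyProfile);
```

With these invented numbers the same two violations yield a composite of 80 on a 50-line diff, 95 on a 200-line diff, and 60 on a one-line diff, where the size is floored to 25 lines before dividing. The untouched dimension contributes a full 100 in each case.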
diff --git a/packages/benchmark/src/types/index.ts b/packages/benchmark/src/types/index.ts
new file mode 100644
index 000000000..52a29d1a0
--- /dev/null
+++ b/packages/benchmark/src/types/index.ts
@@ -0,0 +1,12 @@
+export type { ScannerName, SlopDimension } from "./slop-dimension.js";
+export type { SlopViolation } from "./slop-violation.js";
+export type { ScanFinding } from "./scan-finding.js";
+export type { ScoringProfile } from "./scoring-profile.js";
+export type { SlopDiffStats, SlopDimensionScore, SlopReport } from "./slop-report.js";
+export type { ScannerContext } from "./scanner-context.js";
+export type {
+  AstCheck,
+  AstVisitorNode,
+  ParsedSourceFile,
+  SourceComment,
+} from "./parsed-source-file.js";
diff --git a/packages/benchmark/src/types/parsed-source-file.ts b/packages/benchmark/src/types/parsed-source-file.ts
new file mode 100644
index 000000000..f70a35681
--- /dev/null
+++ b/packages/benchmark/src/types/parsed-source-file.ts
@@ -0,0 +1,34 @@
+import type { ScanFinding } from "./scan-finding.js";
+
+// A comment as reported by oxc-parser (byte-offset spans, no line info).
+export interface SourceComment {
+  type: "Line" | "Block";
+  value: string;
+  start: number;
+  end: number;
+}
+
+// One changed source file, parsed once and shared by every AST check. `program`
+// is the oxc ESTree `Program` node; it is intentionally untyped (`unknown`)
+// because the checks walk it structurally by `type` rather than against a
+// committed AST type surface.
+export interface ParsedSourceFile {
+  // Repo-relative path, matching React Doctor's `filePath` convention.
+  filePath: string;
+  sourceText: string;
+  program: unknown;
+  comments: SourceComment[];
+}
+
+// A structurally-typed AST node: anything with a string `type`, plus arbitrary
+// child fields the checks read by name. The oxc AST is walked this way rather
+// than against a committed, versioned AST type surface.
+export interface AstVisitorNode {
+  type: string;
+  [key: string]: unknown;
+}
+
+// An AST check: a pure function from one parsed file to its findings. Lives in
+// `src/checks/<kebab-name>.ts`, one check per file, and is registered in
+// `checks/index.ts`.
+export type AstCheck = (file: ParsedSourceFile) => ScanFinding[];
diff --git a/packages/benchmark/src/types/scan-finding.ts b/packages/benchmark/src/types/scan-finding.ts
new file mode 100644
index 000000000..261028777
--- /dev/null
+++ b/packages/benchmark/src/types/scan-finding.ts
@@ -0,0 +1,18 @@
+import type { ScannerName, SlopDimension } from "./slop-dimension.js";
+
+// What a scanner emits before scoring. The orchestrator converts every
+// `ScanFinding` into a weighted `SlopViolation` in one place
+// (`scoring/compute-violation-weight.ts`), so scanners stay weight-agnostic.
+export interface ScanFinding {
+  scanner: ScannerName;
+  dimension: SlopDimension;
+  ruleId: string;
+  severity: "error" | "warning";
+  filePath: string;
+  line: number;
+  message: string;
+  // React Doctor's user-facing category, when the finding came from it. Used
+  // only to pick the profile's `categoryMultipliers` entry; absent for the
+  // custom scanners, which rely on `ruleImpactMultipliers` instead.
+  category?: string;
+}
diff --git a/packages/benchmark/src/types/scanner-context.ts b/packages/benchmark/src/types/scanner-context.ts
new file mode 100644
index 000000000..d0e1d9f13
--- /dev/null
+++ b/packages/benchmark/src/types/scanner-context.ts
@@ -0,0 +1,16 @@
+// Shared, read-only input every scanner receives. Built once by the
+// orchestrator so each scanner sees the same view of the graded diff.
+export interface ScannerContext {
+  // Absolute path to the project under test (the repo the agent edited).
+  rootDirectory: string;
+  // Repo-relative paths of the files the agent changed, already filtered to
+  // gradable source (tests, fixtures, generated, and lockfiles removed).
+  changedFiles: string[];
+  // Base git ref the agent started from, used for diff-scoped scans.
+  baseRef: string;
+  // Total added lines across `changedFiles`, the basis for size-normalization.
+  addedLineCount: number;
+  // React Doctor CLI entry to invoke: an absolute path to a pinned binary in
+  // the sandbox image, or the bare command `react-doctor` resolved on PATH.
+  reactDoctorBin: string;
+}
diff --git a/packages/benchmark/src/types/scoring-profile.ts b/packages/benchmark/src/types/scoring-profile.ts
new file mode 100644
index 000000000..77bb28088
--- /dev/null
+++ b/packages/benchmark/src/types/scoring-profile.ts
@@ -0,0 +1,29 @@
+import type { SlopDimension } from "./slop-dimension.js";
+
+// A versioned, fully-declarative weight table. Every number that influences a
+// score lives here (loaded from `scoring-profiles/<name>.json`) so a score is
+// reproducible from its `version` alone. No weights are hard-coded in the
+// scorer — `constants.ts` only carries the built-in fallback profile.
+export interface ScoringProfile {
+  version: string;
+  // Base penalty per finding, before category/impact multipliers.
+  severityWeights: {
+    error: number;
+    warning: number;
+  };
+  // React Doctor's five user-facing categories → penalty multiplier.
+  // Keyed by the exact category string React Doctor emits
+  // (Security, Bugs, Performance, Accessibility, Maintainability).
+  categoryMultipliers: Record<string, number>;
+  // Optional per-rule multiplier (e.g. derived from a Vercel rule's CRITICAL
+  // / HIGH impact tier). Keyed by fully-qualified `ruleId`. Missing ⇒ 1.
+  ruleImpactMultipliers: Record<string, number>;
+  // How much each dimension counts toward the composite slop score. Need not
+  // sum to 1 — the scorer normalizes by the total of present dimensions.
+  dimensionWeights: Record<SlopDimension, number>;
+  // Penalty is divided by `max(addedLineCount, minNormalizerLines) /
+  // diffSizeNormalizerLines`, so a large legitimate feature is not punished as
+  // hard as the same violation count in a tiny diff.
+  diffSizeNormalizerLines: number;
+  minNormalizerLines: number;
+}
diff --git a/packages/benchmark/src/types/slop-dimension.ts b/packages/benchmark/src/types/slop-dimension.ts
new file mode 100644
index 000000000..a3891b51f
--- /dev/null
+++ b/packages/benchmark/src/types/slop-dimension.ts
@@ -0,0 +1,17 @@
+// The eight slop dimensions SlopBench reports on. Each violation maps to
+// exactly one dimension so penalties never double-count across scanners.
+// Four are owned by React Doctor (mapped from its five user-facing
+// categories), the rest by SlopBench's own scanners — see `rule-overlap.md`.
+export type SlopDimension =
+  | "react-correctness"
+  | "react-performance"
+  | "accessibility"
+  | "maintainability"
+  | "ts-strictness"
+  | "composition"
+  | "async-waterfall"
+  | "bundle";
+
+// The scanner that produced a violation. Used for provenance in the report
+// and to let reviewers trace a penalty back to its source tool.
+export type ScannerName = "react-doctor" | "typescript" | "vercel-checks" | "deslop-heuristics";
diff --git a/packages/benchmark/src/types/slop-report.ts b/packages/benchmark/src/types/slop-report.ts
new file mode 100644
index 000000000..ebba22753
--- /dev/null
+++ b/packages/benchmark/src/types/slop-report.ts
@@ -0,0 +1,44 @@
+import type { SlopDimension } from "./slop-dimension.js";
+import type { SlopViolation } from "./slop-violation.js";
+
+// Per-dimension rollup. `score` is 0–100 (higher = cleaner); `weightedPenalty`
+// is the size-normalized penalty that drove it down from 100.
+export interface SlopDimensionScore {
+  dimension: SlopDimension;
+  score: number;
+  violationCount: number;
+  weightedPenalty: number;
+}
+
+// Size of the graded diff, used to normalize penalties. Tests and generated
+// files are excluded upstream so they neither earn nor dodge penalties.
+export interface SlopDiffStats {
+  changedFileCount: number;
+  addedLineCount: number;
+  // The effective line count behind the scorer's divisor (added lines floored
+  // at the profile's `minNormalizerLines`), recorded for auditability.
+  normalizerLines: number;
+}
+
+// The machine-readable grading artifact every task emits. Consumed by the
+// runner aggregation script and (v2) the leaderboard.
+export interface SlopReport {
+  scoringVersion: string;
+  // React Doctor CLI version that produced the diagnostics, when detectable.
+  doctorVersion: string | null;
+  generatedAt: string;
+  diffStats: SlopDiffStats;
+  violations: SlopViolation[];
+  dimensions: SlopDimensionScore[];
+  // Composite 0–100 cleanliness score (higher = less slop).
+  slopScore: number;
+  // Non-fatal scanner problems (e.g. React Doctor failed to start). Empty on a
+  // clean run; a populated array means some dimensions may be under-reported,
+  // which reviewers and the aggregator can surface.
+  scannerErrors: string[];
+  // Filled by the task's `test.sh` once the functional gate is known; `null`
+  // when the verifier runs standalone (quality-only).
+  functionalPass: boolean | null;
+  // `functionalPass ? slopScore / 100 : 0`, or `null` when the gate is unknown.
+  reward: number | null;
+}
diff --git a/packages/benchmark/src/types/slop-violation.ts b/packages/benchmark/src/types/slop-violation.ts
new file mode 100644
index 000000000..a731023dd
--- /dev/null
+++ b/packages/benchmark/src/types/slop-violation.ts
@@ -0,0 +1,19 @@
+import type { ScannerName, SlopDimension } from "./slop-dimension.js";
+
+// A single penalized finding. Every scanner normalizes its native output into
+// this shape so the scorer can treat all slop uniformly.
+export interface SlopViolation {
+  scanner: ScannerName;
+  dimension: SlopDimension;
+  // Fully-qualified rule id, e.g. `react-doctor/no-nested-component-definition`
+  // or `ts/no-explicit-any`. Namespaced by scanner to stay collision-free.
+  ruleId: string;
+  severity: "error" | "warning";
+  // The penalty this violation contributes before size-normalization.
+  weight: number;
+  // Repo-relative path. Empty string for project-wide findings (e.g. tsc
+  // config errors) that carry no single source location.
+  filePath: string;
+  line: number;
+  message: string;
+}
diff --git a/packages/benchmark/src/utils/clamp.ts b/packages/benchmark/src/utils/clamp.ts
new file mode 100644
index 000000000..2a76a40c8
--- /dev/null
+++ b/packages/benchmark/src/utils/clamp.ts
@@ -0,0 +1,3 @@
+// Clamp a number into an inclusive range.
+export const clamp = (value: number, minimum: number, maximum: number): number =>
+  Math.min(Math.max(value, minimum), maximum);
diff --git a/packages/benchmark/src/utils/collect-diff.ts b/packages/benchmark/src/utils/collect-diff.ts
new file mode 100644
index 000000000..3b74b6efb
--- /dev/null
+++ b/packages/benchmark/src/utils/collect-diff.ts
@@ -0,0 +1,49 @@
+import { isGradableFile } from "./is-gradable-file.js";
+import { runCommand } from "./run-command.js";
+
+export interface DiffSummary {
+  changedFiles: string[];
+  addedLineCount: number;
+  // Set when git could not produce a diff (not a repo, bad base ref). The
+  // caller decides whether to fall back to scanning the whole tree.
+  error: string | null;
+}
+
+// Parse `git diff --numstat` output ("added<TAB>deleted<TAB>path") into the set
+// of gradable changed files and their total added lines. Binary files report
+// "-" for counts and contribute zero added lines.
+const parseNumstat = (numstat: string): DiffSummary => {
+  const changedFiles: string[] = [];
+  let addedLineCount = 0;
+  for (const line of numstat.split("\n")) {
+    const trimmed = line.trim();
+    if (!trimmed) continue;
+    const [addedRaw, , ...pathParts] = trimmed.split("\t");
+    const filePath = pathParts.join("\t");
+    if (!filePath || !isGradableFile(filePath)) continue;
+    changedFiles.push(filePath);
+    const added = Number.parseInt(addedRaw ?? "", 10);
+    if (Number.isFinite(added)) addedLineCount += added;
+  }
+  return { changedFiles, addedLineCount, error: null };
+};
+
+// Compute the agent's graded diff against `baseRef`. `git add -A -N` marks new
+// files intent-to-add so they show up in `git diff` like tracked edits, and
+// `--no-renames` keeps numstat paths plain instead of `old => new` pairs.
+export const collectDiff = (rootDirectory: string, baseRef: string): DiffSummary => {
+  runCommand("git", ["-C", rootDirectory, "add", "-A", "-N"], { cwd: rootDirectory });
+  const result = runCommand(
+    "git",
+    ["-C", rootDirectory, "diff", "--numstat", "--no-color", "--no-renames", baseRef],
+    { cwd: rootDirectory },
+  );
+  if (result.spawnFailed || result.exitCode !== 0) {
+    return {
+      changedFiles: [],
+      addedLineCount: 0,
+      error: result.stderr.trim() || "git diff failed",
+    };
+  }
+  return parseNumstat(result.stdout);
+};
diff --git a/packages/benchmark/src/utils/is-gradable-file.ts b/packages/benchmark/src/utils/is-gradable-file.ts
new file mode 100644
index 000000000..0e337c54d
--- /dev/null
+++ b/packages/benchmark/src/utils/is-gradable-file.ts
@@ -0,0 +1,25 @@
+// Paths excluded from slop grading. Tests, stories, fixtures, generated output,
+// and dependency/build directories are neither rewarded nor penalized: an agent
+// should not earn credit for writing tests, nor be charged for slop in code it
+// did not author (vendored / generated). The agent's *product* code is graded.
+const NON_GRADABLE_PATTERNS: readonly RegExp[] = [
+  /(^|\/)node_modules\//,
+  /(^|\/)(dist|build|out|coverage|\.next|\.turbo)\//,
+  /(^|\/)__tests__\//,
+  /(^|\/)tests?\//,
+  /(^|\/)__mocks__\//,
+  /(^|\/)__fixtures__\//,
+  /(^|\/)fixtures?\//,
+  /\.(test|spec|stories)\.[mc]?[jt]sx?$/,
+  /\.d\.[mc]?ts$/,
+  /(^|\/)[^/]*\.(lock|lockb)$/,
+  /(^|\/)(pnpm-lock\.yaml|package-lock\.json|yarn\.lock|bun\.lockb?)$/,
+];
+
+// Only these extensions carry React/TS slop the scanners understand.
+const GRADABLE_EXTENSION_PATTERN = /\.[mc]?[jt]sx?$/;
+
+export const isGradableFile = (filePath: string): boolean => {
+  if (!GRADABLE_EXTENSION_PATTERN.test(filePath)) return false;
+  return !NON_GRADABLE_PATTERNS.some((pattern) => pattern.test(filePath));
+};
diff --git a/packages/benchmark/src/utils/make-ast-finding.ts b/packages/benchmark/src/utils/make-ast-finding.ts
new file mode 100644
index 000000000..50982cbf4
--- /dev/null
+++ b/packages/benchmark/src/utils/make-ast-finding.ts
@@ -0,0 +1,26 @@
+import type { ParsedSourceFile, ScanFinding, ScannerName, SlopDimension } from "../types/index.js";
+import { offsetToLine } from "./offset-to-line.js";
+
+export interface MakeAstFindingInput {
+  file: ParsedSourceFile;
+  scanner: ScannerName;
+  dimension: SlopDimension;
+  ruleId: string;
+  severity: "error" | "warning";
+  // Byte offset of the offending node (oxc `node.start`); converted to a line.
+  offset: number;
+  message: string;
+}
+
+// Build a `ScanFinding` from an AST node offset, resolving the 1-based line
+// from the file's source text. Keeps the individual checks free of
+// line-bookkeeping boilerplate.
+export const makeAstFinding = (input: MakeAstFindingInput): ScanFinding => ({
+  scanner: input.scanner,
+  dimension: input.dimension,
+  ruleId: input.ruleId,
+  severity: input.severity,
+  filePath: input.file.filePath,
+  line: offsetToLine(input.file.sourceText, input.offset),
+  message: input.message,
+});
diff --git a/packages/benchmark/src/utils/offset-to-line.ts b/packages/benchmark/src/utils/offset-to-line.ts
new file mode 100644
index 000000000..1da198960
--- /dev/null
+++ b/packages/benchmark/src/utils/offset-to-line.ts
@@ -0,0 +1,11 @@
+// Convert a 0-based byte/char offset into a 1-based line number by counting
+// newlines before it. oxc reports spans as offsets only, so checks use this to
+// fill `ScanFinding.line`.
+export const offsetToLine = (sourceText: string, offset: number): number => {
+  let line = 1;
+  const limit = Math.min(offset, sourceText.length);
+  for (let index = 0; index < limit; index++) {
+    if (sourceText.charCodeAt(index) === 10) line++;
+  }
+  return line;
+};
diff --git a/packages/benchmark/src/utils/parse-cli-args.ts b/packages/benchmark/src/utils/parse-cli-args.ts
new file mode 100644
index 000000000..74e0a58ba
--- /dev/null
+++ b/packages/benchmark/src/utils/parse-cli-args.ts
@@ -0,0 +1,24 @@
+// Minimal `--flag value` / `--flag` parser. Avoids a CLI-framework dependency
+// so the verifier bundles tiny and starts fast in the sandbox. Every `--flag`
+// is recorded (callers read only the keys they know); `--flag=value` works too.
+export const parseCliArgs = (argv: string[]): Record<string, string | boolean> => {
+  const parsed: Record<string, string | boolean> = {};
+  for (let index = 0; index < argv.length; index++) {
+    const token = argv[index];
+    if (!token || !token.startsWith("--")) continue;
+    const body = token.slice(2);
+    const equalsIndex = body.indexOf("=");
+    if (equalsIndex !== -1) {
+      parsed[body.slice(0, equalsIndex)] = body.slice(equalsIndex + 1);
+      continue;
+    }
+    const next = argv[index + 1];
+    if (next !== undefined && !next.startsWith("--")) {
+      parsed[body] = next;
+      index++;
+    } else {
+      parsed[body] = true;
+    }
+  }
+  return parsed;
+};
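Reviewer sketch (not part of the patch): a quick trace of how `parseCliArgs` treats the three token shapes. The function body is repeated from the file above so the snippet runs standalone. The flag names `--root` and `--json` are illustrative assumptions, not a statement of which flags the verifier actually defines; only `--profile` is mentioned elsewhere in this diff.

```typescript
// Copy of the parser from parse-cli-args.ts, repeated only for self-containment.
const parseCliArgs = (argv: string[]): Record<string, string | boolean> => {
  const parsed: Record<string, string | boolean> = {};
  for (let index = 0; index < argv.length; index++) {
    const token = argv[index];
    if (!token || !token.startsWith("--")) continue;
    const body = token.slice(2);
    const equalsIndex = body.indexOf("=");
    if (equalsIndex !== -1) {
      parsed[body.slice(0, equalsIndex)] = body.slice(equalsIndex + 1);
      continue;
    }
    const next = argv[index + 1];
    if (next !== undefined && !next.startsWith("--")) {
      parsed[body] = next;
      index++;
    } else {
      parsed[body] = true;
    }
  }
  return parsed;
};

// `--root /repo` takes the next token as its value, `--json` is followed by
// another flag so it becomes `true`, `--profile=strict.json` splits on the
// first `=`, and the trailing positional token is skipped.
const typicalArgs = parseCliArgs([
  "--root",
  "/repo",
  "--json",
  "--profile=strict.json",
  "extra",
]);

// Caveat: a bare flag followed by a positional token consumes that token as
// its value, because the parser cannot know which flags are boolean.
const swallowedArgs = parseCliArgs(["--json", "out.txt"]);
```

Here `typicalArgs` is `{ root: "/repo", json: true, profile: "strict.json" }`, while `swallowedArgs.json` is the string `"out.txt"` rather than `true`. Callers that need a strictly boolean flag may want to pass it last or use the `--flag=value` form for its neighbours.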
diff --git a/packages/benchmark/src/utils/parse-source-file.ts b/packages/benchmark/src/utils/parse-source-file.ts
new file mode 100644
index 000000000..7ec1481f0
--- /dev/null
+++ b/packages/benchmark/src/utils/parse-source-file.ts
@@ -0,0 +1,57 @@
+import * as fs from "node:fs";
+import * as path from "node:path";
+import { parseSync } from "oxc-parser";
+import type { ParsedSourceFile, SourceComment } from "../types/index.js";
+
+const EXTENSION_TO_LANG: Record<string, "ts" | "tsx" | "js" | "jsx"> = {
+  ".ts": "ts",
+  ".tsx": "tsx",
+  ".mts": "ts",
+  ".cts": "ts",
+  ".js": "js",
+  ".jsx": "jsx",
+  ".mjs": "js",
+  ".cjs": "js",
+};
+
+// Extensions the AST checks understand. Declaration files are excluded — they
+// are types-only and carry no slop the agent can be charged for.
+export const isParsableSourcePath = (filePath: string): boolean => {
+  if (/\.d\.[mc]?ts$/.test(filePath)) return false;
+  return path.extname(filePath).toLowerCase() in EXTENSION_TO_LANG;
+};
+
+// Parse source text for a given (repo-relative) path into a `ParsedSourceFile`,
+// or `null` when the path is not a source extension or the parser hits a fatal
+// error. Pure (no disk access) so checks can be unit-tested from strings.
+export const parseSourceText = (filePath: string, sourceText: string): ParsedSourceFile | null => {
+  if (!isParsableSourcePath(filePath)) return null;
+  const lang = EXTENSION_TO_LANG[path.extname(filePath).toLowerCase()] ?? "tsx";
+  try {
+    const result = parseSync(filePath, sourceText, { astType: "ts", lang });
+    if (result.errors.some((parseError) => parseError.severity === "Error")) return null;
+    return {
+      filePath,
+      sourceText,
+      program: result.program,
+      comments: result.comments as unknown as SourceComment[],
+    };
+  } catch {
+    return null;
+  }
+};
+
+// Read and parse one repo-relative source file, or `null` when it is missing,
+// unparsable, or not a source extension.
+export const parseSourceFile = (
+  rootDirectory: string,
+  filePath: string,
+): ParsedSourceFile | null => {
+  if (!isParsableSourcePath(filePath)) return null;
+  try {
+    const sourceText = fs.readFileSync(path.join(rootDirectory, filePath), "utf8");
+    return parseSourceText(filePath, sourceText);
+  } catch {
+    return null;
+  }
+};
diff --git a/packages/benchmark/src/utils/resolve-bin-invocation.ts b/packages/benchmark/src/utils/resolve-bin-invocation.ts
new file mode 100644
index 000000000..830384fb6
--- /dev/null
+++ b/packages/benchmark/src/utils/resolve-bin-invocation.ts
@@ -0,0 +1,10 @@
+// Resolve how to spawn a CLI entry. A bare command name (e.g. `react-doctor`)
+// is invoked directly so the OS resolves it on PATH; a `.js`/`.mjs`/`.cjs` path
+// is run through the current Node binary so it works without an executable bit
+// (the common case when pointing at a monorepo's built `bin/*.js` in dev).
+export const resolveBinInvocation = (bin: string): { command: string; prefixArgs: string[] } => {
+  if (/\.[mc]?js$/.test(bin)) {
+    return { command: process.execPath, prefixArgs: [bin] };
+  }
+  return { command: bin, prefixArgs: [] };
+};
diff --git a/packages/benchmark/src/utils/run-command.ts b/packages/benchmark/src/utils/run-command.ts
new file mode 100644
index 000000000..2ef43e5b6
--- /dev/null
+++ b/packages/benchmark/src/utils/run-command.ts
@@ -0,0 +1,43 @@
+import { spawnSync } from "node:child_process";
+
+export interface CommandResult {
+  stdout: string;
+  stderr: string;
+  exitCode: number;
+  // True when spawnSync reported an error: spawn failure (ENOENT, EACCES), timeout, or maxBuffer overflow.
+  spawnFailed: boolean;
+}
+
+// Run a command to completion, capturing output. Never throws and never treats
+// a non-zero exit as an error — many tools the verifier drives (React Doctor,
+// tsc) exit non-zero precisely when they have findings, which is the signal we
+// want, not a failure. Callers inspect `spawnFailed` to distinguish a tool
+// that ran-and-complained from one that never started.
+export const runCommand = (
+  command: string,
+  args: string[],
+  options: { cwd: string; maxBufferBytes?: number; timeoutMs?: number } = { cwd: process.cwd() },
+): CommandResult => {
+  const result = spawnSync(command, args, {
+    cwd: options.cwd,
+    encoding: "utf8",
+    maxBuffer: options.maxBufferBytes ?? 64 * 1024 * 1024,
+    timeout: options.timeoutMs,
+  });
+
+  if (result.error) {
+    return {
+      stdout: result.stdout ?? "",
+      stderr: result.stderr || String(result.error),
+      exitCode: typeof result.status === "number" ? result.status : 1,
+      spawnFailed: true,
+    };
+  }
+
+  return {
+    stdout: result.stdout ?? "",
+    stderr: result.stderr ?? "",
+    exitCode: typeof result.status === "number" ? result.status : 1,
+    spawnFailed: false,
+  };
+};
diff --git a/packages/benchmark/src/utils/walk-ast.ts b/packages/benchmark/src/utils/walk-ast.ts
new file mode 100644
index 000000000..7d8966540
--- /dev/null
+++ b/packages/benchmark/src/utils/walk-ast.ts
@@ -0,0 +1,26 @@
+import type { AstVisitorNode } from "../types/index.js";
+
+const isAstNode = (value: unknown): value is AstVisitorNode =>
+  typeof value === "object" &&
+  value !== null &&
+  typeof (value as { type?: unknown }).type === "string";
+
+// Depth-first walk over an oxc ESTree tree, invoking `visit` for every node
+// that has a string `type`. The oxc AST has no parent back-references (we never
+// attach them), so a plain recursive descent is cycle-free. Used by the AST
+// checks, which match on `node.type` rather than a committed AST type surface.
+export const walkAst = (root: unknown, visit: (node: AstVisitorNode) => void): void => {
+  const visitValue = (value: unknown): void => {
+    if (Array.isArray(value)) {
+      for (const element of value) visitValue(element);
+      return;
+    }
+    if (!isAstNode(value)) return;
+    visit(value);
+    for (const key of Object.keys(value)) {
+      if (key === "type" || key === "start" || key === "end") continue;
+      visitValue(value[key]);
+    }
+  };
+  visitValue(root);
+};
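
Reviewer note: a small usage sketch for `walkAst`, assuming `AstVisitorNode` is roughly "an object with a string `type` plus arbitrary fields". The real type lives in `../types/index.js` and may differ. The tree below is hand-built to resemble ESTree for `const x = y as any;` and is not real oxc output.

```typescript
// Assumed shape for illustration; the PR's actual AstVisitorNode may differ.
interface AstVisitorNode {
  type: string;
  [key: string]: unknown;
}

const isAstNode = (value: unknown): value is AstVisitorNode =>
  typeof value === "object" &&
  value !== null &&
  typeof (value as { type?: unknown }).type === "string";

// Same traversal as in the diff above, repeated so the sketch runs on its own.
const walkAst = (root: unknown, visit: (node: AstVisitorNode) => void): void => {
  const visitValue = (value: unknown): void => {
    if (Array.isArray(value)) {
      for (const element of value) visitValue(element);
      return;
    }
    if (!isAstNode(value)) return;
    visit(value);
    for (const key of Object.keys(value)) {
      if (key === "type" || key === "start" || key === "end") continue;
      visitValue(value[key]);
    }
  };
  visitValue(root);
};

// Hand-built tree (hypothetical, loosely ESTree-shaped).
const tree = {
  type: "Program",
  start: 0,
  end: 19,
  body: [
    {
      type: "VariableDeclaration",
      kind: "const",
      declarations: [
        {
          type: "VariableDeclarator",
          id: { type: "Identifier", name: "x" },
          init: {
            type: "TSAsExpression",
            expression: { type: "Identifier", name: "y" },
            typeAnnotation: { type: "TSAnyKeyword" },
          },
        },
      ],
    },
  ],
};

// Collect node types in visit order, then count one kind of node by its type string.
const visitedTypes: string[] = [];
walkAst(tree, (node) => visitedTypes.push(node.type));
const anyKeywordCount = visitedTypes.filter((type) => type === "TSAnyKeyword").length;
```

Matching on `node.type` strings, as the comment in the diff says, keeps a check like this independent of any particular AST type package.
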
diff --git a/packages/benchmark/tasks/_base/Dockerfile b/packages/benchmark/tasks/_base/Dockerfile
new file mode 100644
index 000000000..eef91cdce
--- /dev/null
+++ b/packages/benchmark/tasks/_base/Dockerfile
@@ -0,0 +1,53 @@
+# SlopBench shared base image.
+#
+# Built once (internet IS allowed at image-build time); every task's
+# environment/Dockerfile does `FROM slopbench-base`. The agent run itself stays
+# air-gapped (`allow_internet = false` in task.toml) — both the React Doctor
+# scan and the slop verifier run fully offline, so nothing here needs the
+# network at grade time.
+#
+# It installs two pinned CLIs onto PATH:
+#   react-doctor  — the offline diagnostic engine the verifier shells out to
+#   slop-verify   — the SlopBench verifier (this repo's @react-doctor/benchmark)
+# plus the shared grader `slopbench-grade` every task's test.sh execs.
+#
+# Both CLIs come from a single pinned checkout of the react-doctor monorepo so
+# scoring is reproducible. Pin REACT_DOCTOR_REF to a tag or full SHA — never a
+# moving branch — when cutting a benchmark release.
+FROM node:22-bookworm-slim
+
+# Pin to an immutable ref for reproducible scores. Override at build:
+#   docker build --build-arg REACT_DOCTOR_REF=<sha> ...
+ARG REACT_DOCTOR_REF=react-doctor@0.4.2
+ARG REACT_DOCTOR_REPO=https://github.com/millionco/react-doctor
+
+RUN apt-get update \
+  && apt-get install -y --no-install-recommends git ca-certificates python3 \
+  && rm -rf /var/lib/apt/lists/*
+
+RUN corepack enable
+
+# Build react-doctor + the slop verifier from a single pinned checkout.
+RUN git clone "${REACT_DOCTOR_REPO}" /opt/react-doctor \
+  && cd /opt/react-doctor \
+  && git checkout "${REACT_DOCTOR_REF}" \
+  && pnpm install --frozen-lockfile --ignore-scripts \
+  && pnpm --filter react-doctor --filter @react-doctor/benchmark run build
+
+# Expose both CLIs by their bin names. The bin scripts resolve their own
+# dist/* and node_modules relative to the real (symlink-resolved) path.
+RUN ln -s /opt/react-doctor/packages/react-doctor/bin/react-doctor.js /usr/local/bin/react-doctor \
+  && ln -s /opt/react-doctor/packages/benchmark/bin/slop-verify.js /usr/local/bin/slop-verify \
+  && chmod +x /opt/react-doctor/packages/benchmark/bin/slop-verify.js
+
+# The shared grader every task's tests/test.sh execs. Lives in the image so the
+# per-task test.sh stays a thin, Harbor-friendly wrapper (no shared files needed
+# in the /tests context).
+RUN cp /opt/react-doctor/packages/benchmark/tasks/_base/run-verifier.sh /usr/local/bin/slopbench-grade \
+  && chmod +x /usr/local/bin/slopbench-grade
+
+# Sanity-check the CLIs are runnable.
+RUN slop-verify --help >/dev/null
+
+WORKDIR /app
+CMD ["/bin/bash"]
diff --git a/packages/benchmark/tasks/_base/run-verifier.sh b/packages/benchmark/tasks/_base/run-verifier.sh
new file mode 100755
index 000000000..c18990b86
--- /dev/null
+++ b/packages/benchmark/tasks/_base/run-verifier.sh
@@ -0,0 +1,118 @@
+#!/usr/bin/env bash
+#
+# SlopBench shared grader (installed as `slopbench-grade` in the base image).
+#
+# A task's tests/test.sh is a thin wrapper that exports BASE_COMMIT +
+# FUNCTIONAL_TEST_CMD and then `exec slopbench-grade`. This script:
+#   0. Captures the agent's diff as model.patch (reviewer artifact).
+#   1. Runs slop-verify offline to score React/TypeScript slop in the diff.
+#   2. Resets the files the hidden test.patch touches, then applies it.
+#   3. Runs the task's functional tests (the correctness GATE).
+#   4. Writes the composite reward (functional_pass × slopScore/100) to
+#      reward.txt and records it in the slop-report.json artifact.
+#
+# Every path is overridable by env var so the same script runs unchanged in the
+# Harbor sandbox (the defaults) and locally for development.
+set -uo pipefail
+
+APP_DIR="${APP_DIR:-/app}"
+TESTS_DIR="${TESTS_DIR:-/tests}"
+LOG_DIR="${LOG_DIR:-/logs}"
+ARTIFACT_DIR="${ARTIFACT_DIR:-${LOG_DIR}/artifacts}"
+VERIFIER_DIR="${VERIFIER_DIR:-${LOG_DIR}/verifier}"
+SLOP_VERIFY="${SLOP_VERIFY:-slop-verify}"
+REACT_DOCTOR_BIN="${REACT_DOCTOR_BIN:-react-doctor}"
+SLOP_PROFILE="${SLOP_PROFILE:-}"
+# Optional hard floor: fail the task (reward 0) if slopScore drops below this,
+# even when the functional tests pass. Default 0 = no floor.
+SLOP_MIN_SCORE="${SLOP_MIN_SCORE:-0}"
+
+log() { echo "[slopbench] $*"; }
+fail() { log "ERROR: $1"; exit "${2:-1}"; }
+
+[ -n "${BASE_COMMIT:-}" ] || fail "BASE_COMMIT is not set (task test.sh must export it)" 2
+command -v "$SLOP_VERIFY" >/dev/null 2>&1 || [ -x "$SLOP_VERIFY" ] || fail "slop-verify not found: $SLOP_VERIFY" 3
+
+mkdir -p "$ARTIFACT_DIR" "$VERIFIER_DIR" || fail "cannot create log dirs" 4
+cd "$APP_DIR" || fail "app dir missing: $APP_DIR" 5
+git config --global --add safe.directory "$APP_DIR" 2>/dev/null || true
+
+git rev-parse --verify "${BASE_COMMIT}^{commit}" >/dev/null 2>&1 \
+  || fail "base commit $BASE_COMMIT not present in repo" 6
+
+# --- Step 0: capture the agent's diff as model.patch (reviewer artifact) ---
+log "Step 0: capturing model.patch"
+git reset -q --soft "$BASE_COMMIT" && git add -A -- . \
+  && git diff --cached --binary > "${ARTIFACT_DIR}/model.patch" \
+  && git reset -q \
+  || log "warning: could not capture model.patch (continuing)"
+
+# --- Step 1: score slop on the agent's tree BEFORE hidden tests touch it ---
+# The hidden tests only add test files (filtered out of grading), so scoring
+# here vs. after is equivalent — doing it first keeps the scored tree purely the
+# agent's product code.
+log "Step 1: scoring slop"
+slop_args=(--root "$APP_DIR" --base "$BASE_COMMIT" --doctor-bin "$REACT_DOCTOR_BIN" \
+  --out "${VERIFIER_DIR}/slop-report.json" --quiet)
+[ -n "$SLOP_PROFILE" ] && slop_args+=(--profile "$SLOP_PROFILE")
+"$SLOP_VERIFY" "${slop_args[@]}" || log "warning: slop-verify exited non-zero"
+[ -f "${VERIFIER_DIR}/slop-report.json" ] || fail "slop-report.json was not produced" 7
+
+# --- Step 2: apply the hidden test patch (if any) ---
+if [ -f "${TESTS_DIR}/test.patch" ] && [ -s "${TESTS_DIR}/test.patch" ]; then
+  log "Step 2: applying hidden test.patch"
+  python3 - "$APP_DIR" "${TESTS_DIR}/test.patch" <<'PY' | while IFS= read -r f; do
+import re, sys
+patch = open(sys.argv[2], encoding="utf-8").read()
+files = set()
+for line in patch.splitlines():
+    m = re.match(r'^diff --git "?a/.+ "?b/(.+?)"?$', line)
+    if m:
+        files.add(m.group(1))
+for f in sorted(files):
+    print(f)
+PY
+    git checkout HEAD -- "$f" 2>/dev/null || rm -rf "$f" 2>/dev/null || true
+  done
+  git apply --whitespace=nowarn "${TESTS_DIR}/test.patch" || fail "failed to apply test.patch" 8
+else
+  log "Step 2: no test.patch (skipping)"
+fi
+
+# --- Step 3: functional correctness gate ---
+log "Step 3: running functional tests"
+FUNCTIONAL_PASS=0
+if [ -n "${FUNCTIONAL_TEST_CMD:-}" ]; then
+  if bash -c "$FUNCTIONAL_TEST_CMD"; then
+    FUNCTIONAL_PASS=1
+    log "functional tests PASSED"
+  else
+    log "functional tests FAILED"
+  fi
+else
+  log "warning: no FUNCTIONAL_TEST_CMD set — treating functional gate as failed"
+fi
+
+# --- Step 4: combine into the composite reward + finalize the report ---
+log "Step 4: computing reward"
+REWARD=$(FUNCTIONAL_PASS="$FUNCTIONAL_PASS" SLOP_MIN_SCORE="$SLOP_MIN_SCORE" \
+  python3 - "${VERIFIER_DIR}/slop-report.json" <<'PY'
+import json, os, sys
+path = sys.argv[1]
+report = json.load(open(path))
+passed = os.environ.get("FUNCTIONAL_PASS") == "1"
+floor = float(os.environ.get("SLOP_MIN_SCORE", "0"))
+score = float(report.get("slopScore", 0.0))
+gated = passed and score >= floor
+reward = (score / 100.0) if gated else 0.0
+report["functionalPass"] = passed
+report["reward"] = reward
+json.dump(report, open(path, "w"), indent=2)
+print(f"{reward:.6f}")
+PY
+)
+echo "$REWARD" > "${VERIFIER_DIR}/reward.txt" || fail "could not write reward.txt" 9
+
+SCORE=$(python3 -c "import json;print(json.load(open('${VERIFIER_DIR}/slop-report.json')).get('slopScore', 0.0))")
+log "RESULT functional_pass=${FUNCTIONAL_PASS} slop_score=${SCORE} reward=${REWARD}"
+exit 0
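
Reviewer note: the Step 4 reward rule restated as a short TypeScript sketch, for readers who want the arithmetic on its own. `computeReward`, `RewardInput`, and `minScore` are hypothetical names for this note; the script itself does the calculation in inline Python and reads the floor from `SLOP_MIN_SCORE`.

```typescript
// Hypothetical input shape mirroring the values Step 4 reads.
interface RewardInput {
  functionalPass: boolean; // FUNCTIONAL_PASS == "1"
  slopScore: number; // report.slopScore, in [0, 100]
  minScore?: number; // SLOP_MIN_SCORE, defaults to 0 (no floor)
}

// The reward is slopScore/100 only when the functional gate passed AND the
// score is at or above the floor; otherwise it is 0.
const computeReward = ({ functionalPass, slopScore, minScore = 0 }: RewardInput): number => {
  const gated = functionalPass && slopScore >= minScore;
  return gated ? slopScore / 100 : 0;
};

// Passing tests with a clean diff keeps most of the reward.
const passingClean = computeReward({ functionalPass: true, slopScore: 92 });
// Failing tests zero the reward regardless of cleanliness.
const failingClean = computeReward({ functionalPass: false, slopScore: 92 });
// Passing tests but scoring under the floor also zeroes the reward.
const belowFloor = computeReward({ functionalPass: true, slopScore: 40, minScore: 50 });
```

With the default floor of 0 this reduces to `functional_pass × slopScore / 100`, the formula stated in the script header.
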
diff --git a/packages/benchmark/tasks/_template/environment/Dockerfile b/packages/benchmark/tasks/_template/environment/Dockerfile
new file mode 100644
index 000000000..d215e3e69
--- /dev/null
+++ b/packages/benchmark/tasks/_template/environment/Dockerfile
@@ -0,0 +1,18 @@
+# Reproduces this task's environment (fallback if the prebuilt image is absent).
+# FROM the shared SlopBench base, which already provides react-doctor +
+# slop-verify + the slopbench-grade script.
+FROM slopbench-base:latest
+
+WORKDIR /app
+
+# TODO: bring in the seed repo at the task's base commit. Either clone an
+# external repo:
+#   RUN git clone <repository_url> . \
+#     && git checkout <base_commit_hash> \
+#     && pnpm install --frozen-lockfile --ignore-scripts
+# or COPY an in-tree seed (see tasks that ship a `seed/` directory) and init git:
+#   COPY seed/ .
+#   RUN git init -q && git add -A && git -c user.email=t@t.co -c user.name=t commit -qm base \
+#     && pnpm install --ignore-scripts
+
+CMD ["/bin/bash"]
diff --git a/packages/benchmark/tasks/_template/instruction.md b/packages/benchmark/tasks/_template/instruction.md
new file mode 100644
index 000000000..b124f3d3e
--- /dev/null
+++ b/packages/benchmark/tasks/_template/instruction.md
@@ -0,0 +1,24 @@
+<!--
+SlopBench task instruction (what the agent sees).
+
+Write a normal feature/bug request. Do NOT mention React Doctor, "slop", code
+quality, lint, or any of the dimensions being scored — SlopBench measures the
+slop a model emits *unprompted*. Specify only the observable behavior the hidden
+tests verify (plus any error-message/contract requirements), exactly like a real
+ticket. Delete this comment.
+-->
+
+Implement the following feature.
+
+## Expected behavior
+
+TODO: Describe the behavior the hidden tests assert. Be precise about inputs,
+outputs, edge cases, and any required error messages.
+
+## Where
+
+TODO: Point at the file(s) / component(s) to add or change.
+
+## Constraints
+
+TODO: Any API/contract the tests depend on (exported names, props, routes).
diff --git a/packages/benchmark/tasks/_template/solution/solution.patch b/packages/benchmark/tasks/_template/solution/solution.patch
new file mode 100644
index 000000000..a36e79a0a
--- /dev/null
+++ b/packages/benchmark/tasks/_template/solution/solution.patch
@@ -0,0 +1,5 @@
+# Replace with a git patch implementing a CLEAN reference solution: it must make
+# FUNCTIONAL_TEST_CMD pass AND score high on the slop verifier (no any/casts, no
+# nested components, composition over boolean props, etc.). Never used at
+# grading; it exists so reviewers can spot-check both that the task is solvable
+# and that a clean solution is rewarded.
diff --git a/packages/benchmark/tasks/_template/solution/solve.sh b/packages/benchmark/tasks/_template/solution/solve.sh
new file mode 100755
index 000000000..bf0c911f6
--- /dev/null
+++ b/packages/benchmark/tasks/_template/solution/solve.sh
@@ -0,0 +1,7 @@
+#!/usr/bin/env bash
+# Reference solution applier (reviewer aid only — NEVER used at grade time).
+# Applies a clean, high-scoring implementation so reviewers can confirm the task
+# is solvable and that a good solution scores well on both axes.
+set -euo pipefail
+cd /app
+git apply --whitespace=nowarn /solution/solution.patch
diff --git a/packages/benchmark/tasks/_template/task.toml b/packages/benchmark/tasks/_template/task.toml
new file mode 100644
index 000000000..7310fa6df
--- /dev/null
+++ b/packages/benchmark/tasks/_template/task.toml
@@ -0,0 +1,46 @@
+schema_version = "1.1"
+artifacts = []
+
+[task]
+name = "slopbench/_template"
+description = "Copy this directory to author a new SlopBench task."
+authors = []
+keywords = ["react", "typescript", "slop", "frontend"]
+
+[metadata]
+task_id = "_template"
+display_title = "SlopBench task template"
+display_description = "Template task — replace every TODO before use."
+# SlopBench taxonomy (informational; the verifier scores all dimensions).
+family = "produce-clean" # produce-clean | handle-slop | explicit-deslop
+target_dimensions = ["react-correctness", "react-performance"]
+language = "typescript"
+repository_url = "TODO: seed repo URL or 'in-tree'"
+base_commit_hash = "TODO: base commit sha the agent starts from"
+# Optional scoring-profile override (path inside the image); empty = built-in.
+slop_profile = ""
+
+[verifier]
+timeout_sec = 1800.0
+
+[verifier.env]
+
+[agent]
+timeout_sec = 5400.0
+
+[environment]
+build_timeout_sec = 1800.0
+# Prefer a prebuilt image for speed; environment/Dockerfile reproduces it.
+docker_image = "slopbench-base:latest"
+os = "linux"
+cpus = 2
+memory_mb = 8192
+storage_mb = 20480
+gpus = 0
+# Air-gapped at agent runtime: the slop verifier + React Doctor run offline.
+allow_internet = false
+mcp_servers = []
+
+[environment.env]
+
+[solution.env]
diff --git a/packages/benchmark/tasks/_template/tests/test.patch b/packages/benchmark/tasks/_template/tests/test.patch
new file mode 100644
index 000000000..f0f40f5ea
--- /dev/null
+++ b/packages/benchmark/tasks/_template/tests/test.patch
@@ -0,0 +1,5 @@
+# Replace this file with a real git patch (created with `git diff`) that ADDS
+# the hidden test file(s) for this task — e.g. tests/feature.test.ts. It is
+# applied at grade time so the agent never sees the tests. The patch must only
+# ADD test files (never modify product code), so the slop scan of the agent's
+# diff is unaffected.
diff --git a/packages/benchmark/tasks/_template/tests/test.sh b/packages/benchmark/tasks/_template/tests/test.sh
new file mode 100755
index 000000000..81f5b7487
--- /dev/null
+++ b/packages/benchmark/tasks/_template/tests/test.sh
@@ -0,0 +1,14 @@
+#!/usr/bin/env bash
+# Thin wrapper — the shared `slopbench-grade` script (baked into the base image)
+# does the model.patch capture, slop scan, hidden-test apply, functional gate,
+# and reward.txt write. Just declare this task's specifics.
+set -euo pipefail
+
+# The commit the agent started from (matches task.toml base_commit_hash).
+export BASE_COMMIT="TODO: base commit sha"
+
+# Command that runs THIS task's functional tests (added by tests/test.patch).
+# Must exit 0 iff the implemented behavior is correct.
+export FUNCTIONAL_TEST_CMD="TODO: e.g. pnpm exec vitest run tests/feature.test.ts"
+
+exec slopbench-grade
diff --git a/packages/benchmark/tasks/avatar-initials-util/_authoring/hidden/tests/avatar-initials.test.ts b/packages/benchmark/tasks/avatar-initials-util/_authoring/hidden/tests/avatar-initials.test.ts
new file mode 100644
index 000000000..b9087ec39
--- /dev/null
+++ b/packages/benchmark/tasks/avatar-initials-util/_authoring/hidden/tests/avatar-initials.test.ts
@@ -0,0 +1,21 @@
+import { test } from "node:test";
+import assert from "node:assert/strict";
+import { avatarInitials } from "../src/avatar-initials.ts";
+
+test("takes first and last initials, uppercased", () => {
+  assert.equal(avatarInitials("Ada Lovelace"), "AL");
+  assert.equal(avatarInitials("grace hopper"), "GH");
+});
+
+test("uses a single initial for one word", () => {
+  assert.equal(avatarInitials("Cher"), "C");
+});
+
+test("ignores extra whitespace and middle words", () => {
+  assert.equal(avatarInitials("  Margaret  Heafield  Hamilton "), "MH");
+});
+
+test("returns empty string for empty input", () => {
+  assert.equal(avatarInitials(""), "");
+  assert.equal(avatarInitials("   "), "");
+});
diff --git a/packages/benchmark/tasks/avatar-initials-util/_authoring/solved/src/avatar-initials.ts b/packages/benchmark/tasks/avatar-initials-util/_authoring/solved/src/avatar-initials.ts
new file mode 100644
index 000000000..5a493f9cc
--- /dev/null
+++ b/packages/benchmark/tasks/avatar-initials-util/_authoring/solved/src/avatar-initials.ts
@@ -0,0 +1,13 @@
+// Up to two uppercase initials (first + last word) for an avatar badge.
+export const avatarInitials = (fullName: string): string => {
+  const words = fullName
+    .trim()
+    .split(/\s+/)
+    .filter((word) => word.length > 0);
+  if (words.length === 0) return "";
+
+  const firstWord = words[0] ?? "";
+  const lastWord = words[words.length - 1] ?? "";
+  const initials = words.length === 1 ? firstWord[0] : `${firstWord[0]}${lastWord[0]}`;
+  return initials.toUpperCase();
+};
diff --git a/packages/benchmark/tasks/avatar-initials-util/environment/Dockerfile b/packages/benchmark/tasks/avatar-initials-util/environment/Dockerfile
new file mode 100644
index 000000000..0717d0595
--- /dev/null
+++ b/packages/benchmark/tasks/avatar-initials-util/environment/Dockerfile
@@ -0,0 +1,12 @@
+FROM slopbench-base:latest
+
+WORKDIR /app
+
+COPY seed/ .
+# Pure-TS task: no dependency install (functional test uses node --test).
+RUN git init -q \
+  && git add -A \
+  && git -c user.email=bench@react.doctor -c user.name=slopbench commit -qm "base" \
+  && git config --global --add safe.directory /app
+
+CMD ["/bin/bash"]
diff --git a/packages/benchmark/tasks/avatar-initials-util/instruction.md b/packages/benchmark/tasks/avatar-initials-util/instruction.md
new file mode 100644
index 000000000..fff473db7
--- /dev/null
+++ b/packages/benchmark/tasks/avatar-initials-util/instruction.md
@@ -0,0 +1,26 @@
+Implement `avatarInitials` in `src/avatar-initials.ts`.
+
+## Expected behavior
+
+`avatarInitials(fullName)` returns up to two uppercase initials for an avatar:
+
+- Split the name on whitespace, ignoring empty segments (so extra spaces are
+  fine).
+- With two or more words: take the first letter of the **first** and **last**
+  word.
+- With one word: take just its first letter.
+- With no words (empty/whitespace-only): return `""`.
+- Always uppercase the result.
+
+Examples:
+
+- `avatarInitials("Ada Lovelace")` → `"AL"`
+- `avatarInitials("grace hopper")` → `"GH"`
+- `avatarInitials("Cher")` → `"C"`
+- `avatarInitials("  Margaret  Heafield  Hamilton ")` → `"MH"`
+- `avatarInitials("")` → `""`
+
+## Constraints
+
+Keep the exported `avatarInitials(fullName: string): string` signature. Do not
+change `src/avatar.tsx`.
diff --git a/packages/benchmark/tasks/avatar-initials-util/seed/package.json b/packages/benchmark/tasks/avatar-initials-util/seed/package.json
new file mode 100644
index 000000000..e47ac0be0
--- /dev/null
+++ b/packages/benchmark/tasks/avatar-initials-util/seed/package.json
@@ -0,0 +1,10 @@
+{
+  "name": "slopbench-avatar-initials-util",
+  "version": "1.0.0",
+  "private": true,
+  "type": "module",
+  "dependencies": {
+    "react": "^18.3.1",
+    "react-dom": "^18.3.1"
+  }
+}
diff --git a/packages/benchmark/tasks/avatar-initials-util/seed/src/avatar-initials.ts b/packages/benchmark/tasks/avatar-initials-util/seed/src/avatar-initials.ts
new file mode 100644
index 000000000..7ee3eed8e
--- /dev/null
+++ b/packages/benchmark/tasks/avatar-initials-util/seed/src/avatar-initials.ts
@@ -0,0 +1,4 @@
+// TODO(agent): implement. See instruction.md.
+export const avatarInitials = (_fullName: string): string => {
+  throw new Error("not implemented");
+};
diff --git a/packages/benchmark/tasks/avatar-initials-util/seed/src/avatar.tsx b/packages/benchmark/tasks/avatar-initials-util/seed/src/avatar.tsx
new file mode 100644
index 000000000..f3855eeab
--- /dev/null
+++ b/packages/benchmark/tasks/avatar-initials-util/seed/src/avatar.tsx
@@ -0,0 +1,12 @@
+import { avatarInitials } from "./avatar-initials.ts";
+
+interface AvatarProps {
+  fullName: string;
+}
+
+// Existing consumer (keeps avatar-initials.ts reachable). Do not edit.
+export const Avatar = ({ fullName }: AvatarProps) => (
+  <span className="avatar" aria-label={fullName}>
+    {avatarInitials(fullName)}
+  </span>
+);
diff --git a/packages/benchmark/tasks/avatar-initials-util/seed/tsconfig.json b/packages/benchmark/tasks/avatar-initials-util/seed/tsconfig.json
new file mode 100644
index 000000000..ffbea3d66
--- /dev/null
+++ b/packages/benchmark/tasks/avatar-initials-util/seed/tsconfig.json
@@ -0,0 +1,13 @@
+{
+  "compilerOptions": {
+    "target": "ES2022",
+    "module": "ESNext",
+    "moduleResolution": "Bundler",
+    "jsx": "react-jsx",
+    "strict": true,
+    "allowImportingTsExtensions": true,
+    "noEmit": true,
+    "skipLibCheck": true
+  },
+  "include": ["src", "tests"]
+}
diff --git a/packages/benchmark/tasks/avatar-initials-util/solution/solution.patch b/packages/benchmark/tasks/avatar-initials-util/solution/solution.patch
new file mode 100644
index 000000000..1455b5b7d
--- /dev/null
+++ b/packages/benchmark/tasks/avatar-initials-util/solution/solution.patch
@@ -0,0 +1,21 @@
+diff --git a/src/avatar-initials.ts b/src/avatar-initials.ts
+index 7ee3eed..5a493f9 100644
+--- a/src/avatar-initials.ts
++++ b/src/avatar-initials.ts
+@@ -1,4 +1,13 @@
+-// TODO(agent): implement. See instruction.md.
+-export const avatarInitials = (_fullName: string): string => {
+-  throw new Error("not implemented");
++// Up to two uppercase initials (first + last word) for an avatar badge.
++export const avatarInitials = (fullName: string): string => {
++  const words = fullName
++    .trim()
++    .split(/\s+/)
++    .filter((word) => word.length > 0);
++  if (words.length === 0) return "";
++
++  const firstWord = words[0] ?? "";
++  const lastWord = words[words.length - 1] ?? "";
++  const initials = words.length === 1 ? firstWord[0] : `${firstWord[0]}${lastWord[0]}`;
++  return initials.toUpperCase();
+ };
diff --git a/packages/benchmark/tasks/avatar-initials-util/solution/solve.sh b/packages/benchmark/tasks/avatar-initials-util/solution/solve.sh
new file mode 100755
index 000000000..764e03155
--- /dev/null
+++ b/packages/benchmark/tasks/avatar-initials-util/solution/solve.sh
@@ -0,0 +1,5 @@
+#!/usr/bin/env bash
+# Reference solution applier (reviewer aid only — never used at grade time).
+set -euo pipefail
+cd /app
+git apply --whitespace=nowarn /solution/solution.patch
diff --git a/packages/benchmark/tasks/avatar-initials-util/task.toml b/packages/benchmark/tasks/avatar-initials-util/task.toml
new file mode 100644
index 000000000..fa494f01b
--- /dev/null
+++ b/packages/benchmark/tasks/avatar-initials-util/task.toml
@@ -0,0 +1,42 @@
+schema_version = "1.1"
+artifacts = []
+
+[task]
+name = "slopbench/avatar-initials-util"
+description = "Implement avatarInitials(fullName) returning up to two uppercase initials."
+authors = []
+keywords = ["react", "typescript", "slop", "frontend"]
+
+[metadata]
+task_id = "avatar-initials-util"
+display_title = "Avatar initials"
+display_description = "Implement avatarInitials(fullName) returning up to two uppercase initials."
+family = "produce-clean"
+target_dimensions = ["ts-strictness", "maintainability"]
+language = "typescript"
+repository_url = "in-tree"
+base_commit_hash = "root"
+slop_profile = ""
+
+[verifier]
+timeout_sec = 1200.0
+
+[verifier.env]
+
+[agent]
+timeout_sec = 3600.0
+
+[environment]
+build_timeout_sec = 1200.0
+docker_image = "slopbench-base:latest"
+os = "linux"
+cpus = 2
+memory_mb = 4096
+storage_mb = 10240
+gpus = 0
+allow_internet = false
+mcp_servers = []
+
+[environment.env]
+
+[solution.env]
diff --git a/packages/benchmark/tasks/avatar-initials-util/tests/test.patch b/packages/benchmark/tasks/avatar-initials-util/tests/test.patch
new file mode 100644
index 000000000..4da1fe0cb
--- /dev/null
+++ b/packages/benchmark/tasks/avatar-initials-util/tests/test.patch
@@ -0,0 +1,27 @@
+diff --git a/tests/avatar-initials.test.ts b/tests/avatar-initials.test.ts
+new file mode 100644
+index 0000000..b9087ec
+--- /dev/null
++++ b/tests/avatar-initials.test.ts
+@@ -0,0 +1,21 @@
++import { test } from "node:test";
++import assert from "node:assert/strict";
++import { avatarInitials } from "../src/avatar-initials.ts";
++
++test("takes first and last initials, uppercased", () => {
++  assert.equal(avatarInitials("Ada Lovelace"), "AL");
++  assert.equal(avatarInitials("grace hopper"), "GH");
++});
++
++test("uses a single initial for one word", () => {
++  assert.equal(avatarInitials("Cher"), "C");
++});
++
++test("ignores extra whitespace and middle words", () => {
++  assert.equal(avatarInitials("  Margaret  Heafield  Hamilton "), "MH");
++});
++
++test("returns empty string for empty input", () => {
++  assert.equal(avatarInitials(""), "");
++  assert.equal(avatarInitials("   "), "");
++});
diff --git a/packages/benchmark/tasks/avatar-initials-util/tests/test.sh b/packages/benchmark/tasks/avatar-initials-util/tests/test.sh
new file mode 100755
index 000000000..903d4fec4
--- /dev/null
+++ b/packages/benchmark/tasks/avatar-initials-util/tests/test.sh
@@ -0,0 +1,5 @@
+#!/usr/bin/env bash
+set -euo pipefail
+export BASE_COMMIT="$(git -C "${APP_DIR:-/app}" rev-list --max-parents=0 HEAD | tail -1)"
+export FUNCTIONAL_TEST_CMD="node --experimental-strip-types --test tests/avatar-initials.test.ts"
+exec slopbench-grade
diff --git a/packages/benchmark/tasks/chunk-util/_authoring/hidden/tests/chunk.test.ts b/packages/benchmark/tasks/chunk-util/_authoring/hidden/tests/chunk.test.ts
new file mode 100644
index 000000000..8c2a566fe
--- /dev/null
+++ b/packages/benchmark/tasks/chunk-util/_authoring/hidden/tests/chunk.test.ts
@@ -0,0 +1,19 @@
+import { test } from "node:test";
+import assert from "node:assert/strict";
+import { chunkize } from "../src/chunk.ts";
+
+test("splits into chunks with a shorter final chunk", () => {
+  assert.deepEqual(chunkize([1, 2, 3, 4, 5], 2), [[1, 2], [3, 4], [5]]);
+});
+
+test("returns a single chunk when size >= length", () => {
+  assert.deepEqual(chunkize(["a", "b", "c"], 5), [["a", "b", "c"]]);
+});
+
+test("returns an empty array for empty input", () => {
+  assert.deepEqual(chunkize([], 3), []);
+});
+
+test("returns an empty array for size < 1", () => {
+  assert.deepEqual(chunkize([1, 2], 0), []);
+});
diff --git a/packages/benchmark/tasks/chunk-util/_authoring/solved/src/chunk.ts b/packages/benchmark/tasks/chunk-util/_authoring/solved/src/chunk.ts
new file mode 100644
index 000000000..254539e26
--- /dev/null
+++ b/packages/benchmark/tasks/chunk-util/_authoring/solved/src/chunk.ts
@@ -0,0 +1,10 @@
+// Splits an array into consecutive chunks of length `size`. Implemented inline
+// (no utility-library dependency) to keep the bundle lean.
+export const chunkize = <Item>(items: readonly Item[], size: number): Item[][] => {
+  if (size < 1) return [];
+  const chunks: Item[][] = [];
+  for (let start = 0; start < items.length; start += size) {
+    chunks.push(items.slice(start, start + size));
+  }
+  return chunks;
+};
diff --git a/packages/benchmark/tasks/chunk-util/environment/Dockerfile b/packages/benchmark/tasks/chunk-util/environment/Dockerfile
new file mode 100644
index 000000000..fcbfdb374
--- /dev/null
+++ b/packages/benchmark/tasks/chunk-util/environment/Dockerfile
@@ -0,0 +1,12 @@
+FROM slopbench-base:latest
+
+WORKDIR /app
+
+COPY seed/ .
+RUN pnpm install --frozen-lockfile --ignore-scripts || pnpm install --ignore-scripts
+RUN git init -q \
+  && git add -A \
+  && git -c user.email=bench@react.doctor -c user.name=slopbench commit -qm "base" \
+  && git config --global --add safe.directory /app
+
+CMD ["/bin/bash"]
diff --git a/packages/benchmark/tasks/chunk-util/instruction.md b/packages/benchmark/tasks/chunk-util/instruction.md
new file mode 100644
index 000000000..af786a914
--- /dev/null
+++ b/packages/benchmark/tasks/chunk-util/instruction.md
@@ -0,0 +1,25 @@
+Implement `chunkize` in `src/chunk.ts`.
+
+## Expected behavior
+
+`chunkize(items, size)` splits an array into consecutive chunks of length
+`size`:
+
+- The final chunk holds the remainder and may be shorter.
+- If `size` is greater than or equal to the length, return a single chunk with
+  every item.
+- An empty input returns `[]`.
+- If `size` is less than 1, return `[]`.
+
+Examples:
+
+- `chunkize([1, 2, 3, 4, 5], 2)` → `[[1, 2], [3, 4], [5]]`
+- `chunkize(["a", "b", "c"], 5)` → `[["a", "b", "c"]]`
+- `chunkize([], 3)` → `[]`
+- `chunkize([1, 2], 0)` → `[]`
+
+## Constraints
+
+Keep the exported generic signature
+`chunkize<Item>(items: readonly Item[], size: number): Item[][]`. Do not change
+`src/photo-grid.tsx`.
diff --git a/packages/benchmark/tasks/chunk-util/seed/package.json b/packages/benchmark/tasks/chunk-util/seed/package.json
new file mode 100644
index 000000000..f977827c5
--- /dev/null
+++ b/packages/benchmark/tasks/chunk-util/seed/package.json
@@ -0,0 +1,11 @@
+{
+  "name": "slopbench-chunk-util",
+  "version": "1.0.0",
+  "private": true,
+  "type": "module",
+  "dependencies": {
+    "lodash": "^4.17.21",
+    "react": "^18.3.1",
+    "react-dom": "^18.3.1"
+  }
+}
diff --git a/packages/benchmark/tasks/chunk-util/seed/src/chunk.ts b/packages/benchmark/tasks/chunk-util/seed/src/chunk.ts
new file mode 100644
index 000000000..376b16a9c
--- /dev/null
+++ b/packages/benchmark/tasks/chunk-util/seed/src/chunk.ts
@@ -0,0 +1,4 @@
+// TODO(agent): implement. See instruction.md.
+export const chunkize = <Item>(_items: readonly Item[], _size: number): Item[][] => {
+  throw new Error("not implemented");
+};
diff --git a/packages/benchmark/tasks/chunk-util/seed/src/photo-grid.tsx b/packages/benchmark/tasks/chunk-util/seed/src/photo-grid.tsx
new file mode 100644
index 000000000..5f081ac81
--- /dev/null
+++ b/packages/benchmark/tasks/chunk-util/seed/src/photo-grid.tsx
@@ -0,0 +1,16 @@
+import { chunkize } from "./chunk.ts";
+
+interface PhotoGridProps {
+  urls: string[];
+}
+
+// Existing consumer (keeps chunk.ts reachable). Do not edit.
+export const PhotoGrid = ({ urls }: PhotoGridProps) => (
+  <div className="grid">
+    {chunkize(urls, 3).map((row, rowIndex) => (
+      <div className="row" key={rowIndex}>
+        {row.length}
+      </div>
+    ))}
+  </div>
+);
diff --git a/packages/benchmark/tasks/chunk-util/seed/tsconfig.json b/packages/benchmark/tasks/chunk-util/seed/tsconfig.json
new file mode 100644
index 000000000..ffbea3d66
--- /dev/null
+++ b/packages/benchmark/tasks/chunk-util/seed/tsconfig.json
@@ -0,0 +1,13 @@
+{
+  "compilerOptions": {
+    "target": "ES2022",
+    "module": "ESNext",
+    "moduleResolution": "Bundler",
+    "jsx": "react-jsx",
+    "strict": true,
+    "allowImportingTsExtensions": true,
+    "noEmit": true,
+    "skipLibCheck": true
+  },
+  "include": ["src", "tests"]
+}
diff --git a/packages/benchmark/tasks/chunk-util/solution/solution.patch b/packages/benchmark/tasks/chunk-util/solution/solution.patch
new file mode 100644
index 000000000..98a44b980
--- /dev/null
+++ b/packages/benchmark/tasks/chunk-util/solution/solution.patch
@@ -0,0 +1,18 @@
+diff --git a/src/chunk.ts b/src/chunk.ts
+index 376b16a..254539e 100644
+--- a/src/chunk.ts
++++ b/src/chunk.ts
+@@ -1,4 +1,10 @@
+-// TODO(agent): implement. See instruction.md.
+-export const chunkize = <Item>(_items: readonly Item[], _size: number): Item[][] => {
+-  throw new Error("not implemented");
++// Splits an array into consecutive chunks of length `size`. Implemented inline
++// (no utility-library dependency) to keep the bundle lean.
++export const chunkize = <Item>(items: readonly Item[], size: number): Item[][] => {
++  if (size < 1) return [];
++  const chunks: Item[][] = [];
++  for (let start = 0; start < items.length; start += size) {
++    chunks.push(items.slice(start, start + size));
++  }
++  return chunks;
+ };
diff --git a/packages/benchmark/tasks/chunk-util/solution/solve.sh b/packages/benchmark/tasks/chunk-util/solution/solve.sh
new file mode 100755
index 000000000..764e03155
--- /dev/null
+++ b/packages/benchmark/tasks/chunk-util/solution/solve.sh
@@ -0,0 +1,5 @@
+#!/usr/bin/env bash
+# Reference solution applier (reviewer aid only — never used at grade time).
+set -euo pipefail
+cd /app
+git apply --whitespace=nowarn /solution/solution.patch
diff --git a/packages/benchmark/tasks/chunk-util/task.toml b/packages/benchmark/tasks/chunk-util/task.toml
new file mode 100644
index 000000000..c31ff1ed7
--- /dev/null
+++ b/packages/benchmark/tasks/chunk-util/task.toml
@@ -0,0 +1,42 @@
+schema_version = "1.1"
+artifacts = []
+
+[task]
+name = "slopbench/chunk-util"
+description = "Implement chunkize(items, size) inline (avoid a full utility-library import)."
+authors = []
+keywords = ["react", "typescript", "slop", "frontend"]
+
+[metadata]
+task_id = "chunk-util"
+display_title = "Array chunk utility"
+display_description = "Implement chunkize(items, size) inline (avoid a full utility-library import)."
+family = "produce-clean"
+target_dimensions = ["bundle", "ts-strictness"]
+language = "typescript"
+repository_url = "in-tree"
+base_commit_hash = "root"
+slop_profile = ""
+
+[verifier]
+timeout_sec = 1200.0
+
+[verifier.env]
+
+[agent]
+timeout_sec = 3600.0
+
+[environment]
+build_timeout_sec = 1200.0
+docker_image = "slopbench-base:latest"
+os = "linux"
+cpus = 2
+memory_mb = 4096
+storage_mb = 10240
+gpus = 0
+allow_internet = false
+mcp_servers = []
+
+[environment.env]
+
+[solution.env]
diff --git a/packages/benchmark/tasks/chunk-util/tests/test.patch b/packages/benchmark/tasks/chunk-util/tests/test.patch
new file mode 100644
index 000000000..4b0b9c9ef
--- /dev/null
+++ b/packages/benchmark/tasks/chunk-util/tests/test.patch
@@ -0,0 +1,25 @@
+diff --git a/tests/chunk.test.ts b/tests/chunk.test.ts
+new file mode 100644
+index 0000000..8c2a566
+--- /dev/null
++++ b/tests/chunk.test.ts
+@@ -0,0 +1,19 @@
++import { test } from "node:test";
++import assert from "node:assert/strict";
++import { chunkize } from "../src/chunk.ts";
++
++test("splits into chunks with a shorter final chunk", () => {
++  assert.deepEqual(chunkize([1, 2, 3, 4, 5], 2), [[1, 2], [3, 4], [5]]);
++});
++
++test("returns a single chunk when size >= length", () => {
++  assert.deepEqual(chunkize(["a", "b", "c"], 5), [["a", "b", "c"]]);
++});
++
++test("returns an empty array for empty input", () => {
++  assert.deepEqual(chunkize([], 3), []);
++});
++
++test("returns an empty array for size < 1", () => {
++  assert.deepEqual(chunkize([1, 2], 0), []);
++});
diff --git a/packages/benchmark/tasks/chunk-util/tests/test.sh b/packages/benchmark/tasks/chunk-util/tests/test.sh
new file mode 100755
index 000000000..756de07d8
--- /dev/null
+++ b/packages/benchmark/tasks/chunk-util/tests/test.sh
@@ -0,0 +1,5 @@
+#!/usr/bin/env bash
+set -euo pipefail
+BASE_COMMIT="$(git -C "${APP_DIR:-/app}" rev-list --max-parents=0 HEAD | tail -1)"; export BASE_COMMIT
+export FUNCTIONAL_TEST_CMD="node --experimental-strip-types --test tests/chunk.test.ts"
+exec slopbench-grade
diff --git a/packages/benchmark/tasks/comment-thread-extend/_authoring/hidden/tests/comment-thread.test.tsx b/packages/benchmark/tasks/comment-thread-extend/_authoring/hidden/tests/comment-thread.test.tsx
new file mode 100644
index 000000000..06c73a48c
--- /dev/null
+++ b/packages/benchmark/tasks/comment-thread-extend/_authoring/hidden/tests/comment-thread.test.tsx
@@ -0,0 +1,17 @@
+import { test, expect } from "vitest";
+import { renderToStaticMarkup } from "react-dom/server";
+import { CommentThread, type Comment } from "../src/comment-thread.tsx";
+
+const COMMENTS: Comment[] = [
+  { id: "c1", author: "Ada", text: "Hello", replies: 2 },
+  { id: "c2", author: "Grace", text: "Nice", replies: 0 },
+];
+
+test("renders each comment with its reply count, in order", () => {
+  const html = renderToStaticMarkup(<CommentThread comments={COMMENTS} />);
+  expect(html).toContain('<ul class="thread">');
+  expect(html).toContain("Ada: Hello (2 replies)");
+  expect(html).toContain("Grace: Nice (0 replies)");
+  expect(html.indexOf("Ada")).toBeLessThan(html.indexOf("Grace"));
+  expect(html.match(/<li>/g) ?? []).toHaveLength(2);
+});
diff --git a/packages/benchmark/tasks/comment-thread-extend/_authoring/solved/src/comment-thread.tsx b/packages/benchmark/tasks/comment-thread-extend/_authoring/solved/src/comment-thread.tsx
new file mode 100644
index 000000000..725165245
--- /dev/null
+++ b/packages/benchmark/tasks/comment-thread-extend/_authoring/solved/src/comment-thread.tsx
@@ -0,0 +1,24 @@
+export interface Comment {
+  id: string;
+  author: string;
+  text: string;
+  replies: number;
+}
+
+export interface CommentThreadProps {
+  comments: Comment[];
+}
+
+const CommentRow = ({ comment }: { comment: Comment }) => (
+  <li>
+    {comment.author}: {comment.text} ({comment.replies} replies)
+  </li>
+);
+
+export const CommentThread = ({ comments }: CommentThreadProps) => (
+  <ul className="thread">
+    {comments.map((comment) => (
+      <CommentRow key={comment.id} comment={comment} />
+    ))}
+  </ul>
+);
diff --git a/packages/benchmark/tasks/comment-thread-extend/environment/Dockerfile b/packages/benchmark/tasks/comment-thread-extend/environment/Dockerfile
new file mode 100644
index 000000000..fcbfdb374
--- /dev/null
+++ b/packages/benchmark/tasks/comment-thread-extend/environment/Dockerfile
@@ -0,0 +1,12 @@
+FROM slopbench-base:latest
+
+WORKDIR /app
+
+COPY seed/ .
+RUN pnpm install --frozen-lockfile --ignore-scripts || pnpm install --ignore-scripts
+RUN git init -q \
+  && git add -A \
+  && git -c user.email=bench@react.doctor -c user.name=slopbench commit -qm "base" \
+  && git config --global --add safe.directory /app
+
+CMD ["/bin/bash"]
diff --git a/packages/benchmark/tasks/comment-thread-extend/instruction.md b/packages/benchmark/tasks/comment-thread-extend/instruction.md
new file mode 100644
index 000000000..3b89902a6
--- /dev/null
+++ b/packages/benchmark/tasks/comment-thread-extend/instruction.md
@@ -0,0 +1,21 @@
+`src/comment-thread.tsx` renders a list of comments. Extend it.
+
+## Expected behavior
+
+Each comment list item must now also show its reply count. Render each comment's
+text as exactly:
+
+```
+<author>: <text> (<replies> replies)
+```
+
+For example, a comment `{ author: "Ada", text: "Hello", replies: 2 }` renders a
+`<li>` whose text content is `Ada: Hello (2 replies)`.
+
+Keep the existing `<ul className="thread">` wrapper and render one `<li>` per
+comment, in order.
+
+## Constraints
+
+Keep the exported `CommentThread` component and the `Comment` /
+`CommentThreadProps` types.
diff --git a/packages/benchmark/tasks/comment-thread-extend/seed/package.json b/packages/benchmark/tasks/comment-thread-extend/seed/package.json
new file mode 100644
index 000000000..ba8bd8ddc
--- /dev/null
+++ b/packages/benchmark/tasks/comment-thread-extend/seed/package.json
@@ -0,0 +1,13 @@
+{
+  "name": "slopbench-comment-thread",
+  "version": "1.0.0",
+  "private": true,
+  "type": "module",
+  "dependencies": {
+    "react": "^18.3.1",
+    "react-dom": "^18.3.1"
+  },
+  "devDependencies": {
+    "vitest": "^4.1.8"
+  }
+}
diff --git a/packages/benchmark/tasks/comment-thread-extend/seed/src/comment-thread.tsx b/packages/benchmark/tasks/comment-thread-extend/seed/src/comment-thread.tsx
new file mode 100644
index 000000000..95d3dbea5
--- /dev/null
+++ b/packages/benchmark/tasks/comment-thread-extend/seed/src/comment-thread.tsx
@@ -0,0 +1,25 @@
+export interface Comment {
+  id: string;
+  author: string;
+  text: string;
+  replies: number;
+}
+
+export interface CommentThreadProps {
+  comments: Comment[];
+}
+
+export const CommentThread = ({ comments }: CommentThreadProps) => {
+  const Item = (props: any) => (
+    <li>
+      {props.author}: {props.text}
+    </li>
+  );
+  return (
+    <ul className="thread">
+      {comments.map((comment, index) => (
+        <Item key={index} author={comment.author} text={comment.text} />
+      ))}
+    </ul>
+  );
+};
diff --git a/packages/benchmark/tasks/comment-thread-extend/seed/tsconfig.json b/packages/benchmark/tasks/comment-thread-extend/seed/tsconfig.json
new file mode 100644
index 000000000..ffbea3d66
--- /dev/null
+++ b/packages/benchmark/tasks/comment-thread-extend/seed/tsconfig.json
@@ -0,0 +1,13 @@
+{
+  "compilerOptions": {
+    "target": "ES2022",
+    "module": "ESNext",
+    "moduleResolution": "Bundler",
+    "jsx": "react-jsx",
+    "strict": true,
+    "allowImportingTsExtensions": true,
+    "noEmit": true,
+    "skipLibCheck": true
+  },
+  "include": ["src", "tests"]
+}
diff --git a/packages/benchmark/tasks/comment-thread-extend/seed/vitest.config.ts b/packages/benchmark/tasks/comment-thread-extend/seed/vitest.config.ts
new file mode 100644
index 000000000..8409b1f8e
--- /dev/null
+++ b/packages/benchmark/tasks/comment-thread-extend/seed/vitest.config.ts
@@ -0,0 +1,9 @@
+import { defineConfig } from "vitest/config";
+
+export default defineConfig({
+  esbuild: { jsx: "automatic" },
+  test: {
+    environment: "node",
+    include: ["tests/**/*.test.tsx"],
+  },
+});
diff --git a/packages/benchmark/tasks/comment-thread-extend/solution/solution.patch b/packages/benchmark/tasks/comment-thread-extend/solution/solution.patch
new file mode 100644
index 000000000..4c272558c
--- /dev/null
+++ b/packages/benchmark/tasks/comment-thread-extend/solution/solution.patch
@@ -0,0 +1,35 @@
+diff --git a/src/comment-thread.tsx b/src/comment-thread.tsx
+index 95d3dbe..7251652 100644
+--- a/src/comment-thread.tsx
++++ b/src/comment-thread.tsx
+@@ -9,17 +9,16 @@ export interface CommentThreadProps {
+   comments: Comment[];
+ }
+ 
+-export const CommentThread = ({ comments }: CommentThreadProps) => {
+-  const Item = (props: any) => (
+-    <li>
+-      {props.author}: {props.text}
+-    </li>
+-  );
+-  return (
+-    <ul className="thread">
+-      {comments.map((comment, index) => (
+-        <Item key={index} author={comment.author} text={comment.text} />
+-      ))}
+-    </ul>
+-  );
+-};
++const CommentRow = ({ comment }: { comment: Comment }) => (
++  <li>
++    {comment.author}: {comment.text} ({comment.replies} replies)
++  </li>
++);
++
++export const CommentThread = ({ comments }: CommentThreadProps) => (
++  <ul className="thread">
++    {comments.map((comment) => (
++      <CommentRow key={comment.id} comment={comment} />
++    ))}
++  </ul>
++);
diff --git a/packages/benchmark/tasks/comment-thread-extend/solution/solve.sh b/packages/benchmark/tasks/comment-thread-extend/solution/solve.sh
new file mode 100755
index 000000000..764e03155
--- /dev/null
+++ b/packages/benchmark/tasks/comment-thread-extend/solution/solve.sh
@@ -0,0 +1,5 @@
+#!/usr/bin/env bash
+# Reference solution applier (reviewer aid only — never used at grade time).
+set -euo pipefail
+cd /app
+git apply --whitespace=nowarn /solution/solution.patch
diff --git a/packages/benchmark/tasks/comment-thread-extend/task.toml b/packages/benchmark/tasks/comment-thread-extend/task.toml
new file mode 100644
index 000000000..40c3b3347
--- /dev/null
+++ b/packages/benchmark/tasks/comment-thread-extend/task.toml
@@ -0,0 +1,42 @@
+schema_version = "1.1"
+artifacts = []
+
+[task]
+name = "slopbench/comment-thread-extend"
+description = "Add reply counts to a comment list seeded with index keys + an inline component."
+authors = []
+keywords = ["react", "typescript", "slop", "frontend"]
+
+[metadata]
+task_id = "comment-thread-extend"
+display_title = "Extend comment thread"
+display_description = "Add reply counts to a comment list seeded with index keys + an inline component."
+family = "handle-slop"
+target_dimensions = ["react-correctness", "react-performance"]
+language = "typescript"
+repository_url = "in-tree"
+base_commit_hash = "root"
+slop_profile = ""
+
+[verifier]
+timeout_sec = 1200.0
+
+[verifier.env]
+
+[agent]
+timeout_sec = 3600.0
+
+[environment]
+build_timeout_sec = 1200.0
+docker_image = "slopbench-base:latest"
+os = "linux"
+cpus = 2
+memory_mb = 4096
+storage_mb = 10240
+gpus = 0
+allow_internet = false
+mcp_servers = []
+
+[environment.env]
+
+[solution.env]
diff --git a/packages/benchmark/tasks/comment-thread-extend/tests/test.patch b/packages/benchmark/tasks/comment-thread-extend/tests/test.patch
new file mode 100644
index 000000000..1a670b7e0
--- /dev/null
+++ b/packages/benchmark/tasks/comment-thread-extend/tests/test.patch
@@ -0,0 +1,23 @@
+diff --git a/tests/comment-thread.test.tsx b/tests/comment-thread.test.tsx
+new file mode 100644
+index 0000000..06c73a4
+--- /dev/null
++++ b/tests/comment-thread.test.tsx
+@@ -0,0 +1,17 @@
++import { test, expect } from "vitest";
++import { renderToStaticMarkup } from "react-dom/server";
++import { CommentThread, type Comment } from "../src/comment-thread.tsx";
++
++const COMMENTS: Comment[] = [
++  { id: "c1", author: "Ada", text: "Hello", replies: 2 },
++  { id: "c2", author: "Grace", text: "Nice", replies: 0 },
++];
++
++test("renders each comment with its reply count, in order", () => {
++  const html = renderToStaticMarkup(<CommentThread comments={COMMENTS} />);
++  expect(html).toContain('<ul class="thread">');
++  expect(html).toContain("Ada: Hello (2 replies)");
++  expect(html).toContain("Grace: Nice (0 replies)");
++  expect(html.indexOf("Ada")).toBeLessThan(html.indexOf("Grace"));
++  expect(html.match(/<li>/g) ?? []).toHaveLength(2);
++});
diff --git a/packages/benchmark/tasks/comment-thread-extend/tests/test.sh b/packages/benchmark/tasks/comment-thread-extend/tests/test.sh
new file mode 100755
index 000000000..4003f69b2
--- /dev/null
+++ b/packages/benchmark/tasks/comment-thread-extend/tests/test.sh
@@ -0,0 +1,5 @@
+#!/usr/bin/env bash
+set -euo pipefail
+BASE_COMMIT="$(git -C "${APP_DIR:-/app}" rev-list --max-parents=0 HEAD | tail -1)"; export BASE_COMMIT
+export FUNCTIONAL_TEST_CMD="pnpm exec vitest run"
+exec slopbench-grade
diff --git a/packages/benchmark/tasks/dashboard-loader/_authoring/hidden/tests/load-dashboard.test.ts b/packages/benchmark/tasks/dashboard-loader/_authoring/hidden/tests/load-dashboard.test.ts
new file mode 100644
index 000000000..6b8aa58a6
--- /dev/null
+++ b/packages/benchmark/tasks/dashboard-loader/_authoring/hidden/tests/load-dashboard.test.ts
@@ -0,0 +1,23 @@
+import { test } from "node:test";
+import assert from "node:assert/strict";
+import { loadDashboard } from "../src/load-dashboard.ts";
+
+test("combines the three sources into one object", async () => {
+  const data = await loadDashboard({
+    fetchUser: async () => "Ada",
+    fetchStats: async () => 42,
+    fetchActivity: async () => ["login", "edit"],
+  });
+  assert.deepEqual(data, { user: "Ada", stats: 42, activity: ["login", "edit"] });
+});
+
+test("resolves every source value", async () => {
+  const data = await loadDashboard({
+    fetchUser: async () => "Grace",
+    fetchStats: async () => 0,
+    fetchActivity: async () => [],
+  });
+  assert.equal(data.user, "Grace");
+  assert.equal(data.stats, 0);
+  assert.deepEqual(data.activity, []);
+});
diff --git a/packages/benchmark/tasks/dashboard-loader/_authoring/solved/src/load-dashboard.ts b/packages/benchmark/tasks/dashboard-loader/_authoring/solved/src/load-dashboard.ts
new file mode 100644
index 000000000..1afcde41a
--- /dev/null
+++ b/packages/benchmark/tasks/dashboard-loader/_authoring/solved/src/load-dashboard.ts
@@ -0,0 +1,22 @@
+export interface DashboardSources {
+  fetchUser: () => Promise<string>;
+  fetchStats: () => Promise<number>;
+  fetchActivity: () => Promise<string[]>;
+}
+
+export interface DashboardData {
+  user: string;
+  stats: number;
+  activity: string[];
+}
+
+// The three sources are independent, so fetch them in parallel rather than
+// awaiting each in sequence (which would serialize three round-trips).
+export const loadDashboard = async (sources: DashboardSources): Promise<DashboardData> => {
+  const [user, stats, activity] = await Promise.all([
+    sources.fetchUser(),
+    sources.fetchStats(),
+    sources.fetchActivity(),
+  ]);
+  return { user, stats, activity };
+};
diff --git a/packages/benchmark/tasks/dashboard-loader/environment/Dockerfile b/packages/benchmark/tasks/dashboard-loader/environment/Dockerfile
new file mode 100644
index 000000000..0717d0595
--- /dev/null
+++ b/packages/benchmark/tasks/dashboard-loader/environment/Dockerfile
@@ -0,0 +1,12 @@
+FROM slopbench-base:latest
+
+WORKDIR /app
+
+COPY seed/ .
+# Pure-TS task: no dependency install (functional test uses node --test).
+RUN git init -q \
+  && git add -A \
+  && git -c user.email=bench@react.doctor -c user.name=slopbench commit -qm "base" \
+  && git config --global --add safe.directory /app
+
+CMD ["/bin/bash"]
diff --git a/packages/benchmark/tasks/dashboard-loader/instruction.md b/packages/benchmark/tasks/dashboard-loader/instruction.md
new file mode 100644
index 000000000..c61b4e933
--- /dev/null
+++ b/packages/benchmark/tasks/dashboard-loader/instruction.md
@@ -0,0 +1,22 @@
+Implement `loadDashboard` in `src/load-dashboard.ts`.
+
+## Expected behavior
+
+`loadDashboard(sources)` loads the three pieces of dashboard data from the
+provided `sources` and returns them combined:
+
+- Calls `sources.fetchUser()`, `sources.fetchStats()`, and
+  `sources.fetchActivity()`.
+- Returns `{ user, stats, activity }` with each field set to the resolved value
+  of the matching call.
+
+The three sources are independent of one another.
+
+Example: if `fetchUser` resolves to `"Ada"`, `fetchStats` to `42`, and
+`fetchActivity` to `["login"]`, then `loadDashboard(sources)` resolves to
+`{ user: "Ada", stats: 42, activity: ["login"] }`.
+
+## Constraints
+
+Keep the exported `loadDashboard` signature and the `DashboardSources` /
+`DashboardData` interfaces. Do not change `src/dashboard-page.tsx`.
diff --git a/packages/benchmark/tasks/dashboard-loader/seed/package.json b/packages/benchmark/tasks/dashboard-loader/seed/package.json
new file mode 100644
index 000000000..4dac0a54e
--- /dev/null
+++ b/packages/benchmark/tasks/dashboard-loader/seed/package.json
@@ -0,0 +1,11 @@
+{
+  "name": "slopbench-dashboard-loader",
+  "version": "1.0.0",
+  "private": true,
+  "type": "module",
+  "dependencies": {
+    "next": "^15.0.0",
+    "react": "^18.3.1",
+    "react-dom": "^18.3.1"
+  }
+}
diff --git a/packages/benchmark/tasks/dashboard-loader/seed/src/dashboard-page.tsx b/packages/benchmark/tasks/dashboard-loader/seed/src/dashboard-page.tsx
new file mode 100644
index 000000000..fe2fe1506
--- /dev/null
+++ b/packages/benchmark/tasks/dashboard-loader/seed/src/dashboard-page.tsx
@@ -0,0 +1,13 @@
+import { loadDashboard, type DashboardSources } from "./load-dashboard.ts";
+
+// Existing server component that consumes the loader (keeps load-dashboard.ts
+// reachable). Do not edit.
+export default async function DashboardPage({ sources }: { sources: DashboardSources }) {
+  const data = await loadDashboard(sources);
+  return (
+    <main>
+      <h1>{data.user}</h1>
+      <p>{data.stats}</p>
+    </main>
+  );
+}
diff --git a/packages/benchmark/tasks/dashboard-loader/seed/src/load-dashboard.ts b/packages/benchmark/tasks/dashboard-loader/seed/src/load-dashboard.ts
new file mode 100644
index 000000000..5d2d5add2
--- /dev/null
+++ b/packages/benchmark/tasks/dashboard-loader/seed/src/load-dashboard.ts
@@ -0,0 +1,16 @@
+export interface DashboardSources {
+  fetchUser: () => Promise<string>;
+  fetchStats: () => Promise<number>;
+  fetchActivity: () => Promise<string[]>;
+}
+
+export interface DashboardData {
+  user: string;
+  stats: number;
+  activity: string[];
+}
+
+// TODO(agent): implement. See instruction.md.
+export const loadDashboard = async (_sources: DashboardSources): Promise<DashboardData> => {
+  throw new Error("not implemented");
+};
diff --git a/packages/benchmark/tasks/dashboard-loader/seed/tsconfig.json b/packages/benchmark/tasks/dashboard-loader/seed/tsconfig.json
new file mode 100644
index 000000000..ffbea3d66
--- /dev/null
+++ b/packages/benchmark/tasks/dashboard-loader/seed/tsconfig.json
@@ -0,0 +1,13 @@
+{
+  "compilerOptions": {
+    "target": "ES2022",
+    "module": "ESNext",
+    "moduleResolution": "Bundler",
+    "jsx": "react-jsx",
+    "strict": true,
+    "allowImportingTsExtensions": true,
+    "noEmit": true,
+    "skipLibCheck": true
+  },
+  "include": ["src", "tests"]
+}
diff --git a/packages/benchmark/tasks/dashboard-loader/solution/solution.patch b/packages/benchmark/tasks/dashboard-loader/solution/solution.patch
new file mode 100644
index 000000000..c20d706eb
--- /dev/null
+++ b/packages/benchmark/tasks/dashboard-loader/solution/solution.patch
@@ -0,0 +1,21 @@
+diff --git a/src/load-dashboard.ts b/src/load-dashboard.ts
+index 5d2d5ad..1afcde4 100644
+--- a/src/load-dashboard.ts
++++ b/src/load-dashboard.ts
+@@ -10,7 +10,13 @@ export interface DashboardData {
+   activity: string[];
+ }
+ 
+-// TODO(agent): implement. See instruction.md.
+-export const loadDashboard = async (_sources: DashboardSources): Promise<DashboardData> => {
+-  throw new Error("not implemented");
++// The three sources are independent, so fetch them in parallel rather than
++// awaiting each in sequence (which would serialize three round-trips).
++export const loadDashboard = async (sources: DashboardSources): Promise<DashboardData> => {
++  const [user, stats, activity] = await Promise.all([
++    sources.fetchUser(),
++    sources.fetchStats(),
++    sources.fetchActivity(),
++  ]);
++  return { user, stats, activity };
+ };
diff --git a/packages/benchmark/tasks/dashboard-loader/solution/solve.sh b/packages/benchmark/tasks/dashboard-loader/solution/solve.sh
new file mode 100755
index 000000000..764e03155
--- /dev/null
+++ b/packages/benchmark/tasks/dashboard-loader/solution/solve.sh
@@ -0,0 +1,5 @@
+#!/usr/bin/env bash
+# Reference solution applier (reviewer aid only — never used at grade time).
+set -euo pipefail
+cd /app
+git apply --whitespace=nowarn /solution/solution.patch
diff --git a/packages/benchmark/tasks/dashboard-loader/task.toml b/packages/benchmark/tasks/dashboard-loader/task.toml
new file mode 100644
index 000000000..6b66f08c4
--- /dev/null
+++ b/packages/benchmark/tasks/dashboard-loader/task.toml
@@ -0,0 +1,42 @@
+schema_version = "1.1"
+artifacts = []
+
+[task]
+name = "slopbench/dashboard-loader"
+description = "Load three independent server resources without a request waterfall."
+authors = []
+keywords = ["react", "typescript", "slop", "frontend"]
+
+[metadata]
+task_id = "dashboard-loader"
+display_title = "Parallel dashboard loader"
+display_description = "Load three independent server resources without a request waterfall."
+family = "produce-clean"
+target_dimensions = ["async-waterfall", "ts-strictness"]
+language = "typescript"
+repository_url = "in-tree"
+base_commit_hash = "root"
+slop_profile = ""
+
+[verifier]
+timeout_sec = 1200.0
+
+[verifier.env]
+
+[agent]
+timeout_sec = 3600.0
+
+[environment]
+build_timeout_sec = 1200.0
+docker_image = "slopbench-base:latest"
+os = "linux"
+cpus = 2
+memory_mb = 4096
+storage_mb = 10240
+gpus = 0
+allow_internet = false
+mcp_servers = []
+
+[environment.env]
+
+[solution.env]
diff --git a/packages/benchmark/tasks/dashboard-loader/tests/test.patch b/packages/benchmark/tasks/dashboard-loader/tests/test.patch
new file mode 100644
index 000000000..73979b56c
--- /dev/null
+++ b/packages/benchmark/tasks/dashboard-loader/tests/test.patch
@@ -0,0 +1,29 @@
+diff --git a/tests/load-dashboard.test.ts b/tests/load-dashboard.test.ts
+new file mode 100644
+index 0000000..6b8aa58
+--- /dev/null
++++ b/tests/load-dashboard.test.ts
+@@ -0,0 +1,23 @@
++import { test } from "node:test";
++import assert from "node:assert/strict";
++import { loadDashboard } from "../src/load-dashboard.ts";
++
++test("combines the three sources into one object", async () => {
++  const data = await loadDashboard({
++    fetchUser: async () => "Ada",
++    fetchStats: async () => 42,
++    fetchActivity: async () => ["login", "edit"],
++  });
++  assert.deepEqual(data, { user: "Ada", stats: 42, activity: ["login", "edit"] });
++});
++
++test("resolves every source value", async () => {
++  const data = await loadDashboard({
++    fetchUser: async () => "Grace",
++    fetchStats: async () => 0,
++    fetchActivity: async () => [],
++  });
++  assert.equal(data.user, "Grace");
++  assert.equal(data.stats, 0);
++  assert.deepEqual(data.activity, []);
++});
diff --git a/packages/benchmark/tasks/dashboard-loader/tests/test.sh b/packages/benchmark/tasks/dashboard-loader/tests/test.sh
new file mode 100755
index 000000000..a6df58e26
--- /dev/null
+++ b/packages/benchmark/tasks/dashboard-loader/tests/test.sh
@@ -0,0 +1,5 @@
+#!/usr/bin/env bash
+set -euo pipefail
+BASE_COMMIT="$(git -C "${APP_DIR:-/app}" rev-list --max-parents=0 HEAD | tail -1)"; export BASE_COMMIT
+export FUNCTIONAL_TEST_CMD="node --experimental-strip-types --test tests/load-dashboard.test.ts"
+exec slopbench-grade
diff --git a/packages/benchmark/tasks/format-duration-util/_authoring/hidden/tests/format-duration.test.ts b/packages/benchmark/tasks/format-duration-util/_authoring/hidden/tests/format-duration.test.ts
new file mode 100644
index 000000000..58f20fc8f
--- /dev/null
+++ b/packages/benchmark/tasks/format-duration-util/_authoring/hidden/tests/format-duration.test.ts
@@ -0,0 +1,25 @@
+import { test } from "node:test";
+import assert from "node:assert/strict";
+import { formatDuration } from "../src/format-duration.ts";
+
+test("returns 0s for zero and negative input", () => {
+  assert.equal(formatDuration(0), "0s");
+  assert.equal(formatDuration(-10), "0s");
+});
+
+test("renders seconds only under a minute", () => {
+  assert.equal(formatDuration(5_000), "5s");
+});
+
+test("renders minutes and seconds", () => {
+  assert.equal(formatDuration(65_000), "1m 5s");
+});
+
+test("drops trailing zero units", () => {
+  assert.equal(formatDuration(3_600_000), "1h");
+});
+
+test("keeps a zero unit between two non-zero units", () => {
+  assert.equal(formatDuration(3_601_000), "1h 0m 1s");
+  assert.equal(formatDuration(3_661_000), "1h 1m 1s");
+});
diff --git a/packages/benchmark/tasks/format-duration-util/_authoring/solved/src/format-duration.ts b/packages/benchmark/tasks/format-duration-util/_authoring/solved/src/format-duration.ts
new file mode 100644
index 000000000..884e17253
--- /dev/null
+++ b/packages/benchmark/tasks/format-duration-util/_authoring/solved/src/format-duration.ts
@@ -0,0 +1,31 @@
+interface DurationUnit {
+  value: number;
+  suffix: string;
+}
+
+// Compact "1h 0m 1s" label. Leading and trailing zero units are dropped, but a
+// zero unit between two non-zero units is kept so the ordering stays readable.
+export const formatDuration = (milliseconds: number): string => {
+  if (milliseconds <= 0) return "0s";
+
+  const totalSeconds = Math.floor(milliseconds / 1000);
+  const units: DurationUnit[] = [
+    { value: Math.floor(totalSeconds / 3600), suffix: "h" },
+    { value: Math.floor((totalSeconds % 3600) / 60), suffix: "m" },
+    { value: totalSeconds % 60, suffix: "s" },
+  ];
+
+  const firstNonZero = units.findIndex((unit) => unit.value > 0);
+  if (firstNonZero === -1) return "0s";
+
+  let lastNonZero = firstNonZero;
+  for (let index = firstNonZero; index < units.length; index++) {
+    const unit = units[index];
+    if (unit && unit.value > 0) lastNonZero = index;
+  }
+
+  return units
+    .slice(firstNonZero, lastNonZero + 1)
+    .map((unit) => `${unit.value}${unit.suffix}`)
+    .join(" ");
+};
diff --git a/packages/benchmark/tasks/format-duration-util/environment/Dockerfile b/packages/benchmark/tasks/format-duration-util/environment/Dockerfile
new file mode 100644
index 000000000..0717d0595
--- /dev/null
+++ b/packages/benchmark/tasks/format-duration-util/environment/Dockerfile
@@ -0,0 +1,12 @@
+FROM slopbench-base:latest
+
+WORKDIR /app
+
+COPY seed/ .
+# Pure-TS task: no dependency install (functional test uses node --test).
+RUN git init -q \
+  && git add -A \
+  && git -c user.email=bench@react.doctor -c user.name=slopbench commit -qm "base" \
+  && git config --global --add safe.directory /app
+
+CMD ["/bin/bash"]
diff --git a/packages/benchmark/tasks/format-duration-util/instruction.md b/packages/benchmark/tasks/format-duration-util/instruction.md
new file mode 100644
index 000000000..15bc552c8
--- /dev/null
+++ b/packages/benchmark/tasks/format-duration-util/instruction.md
@@ -0,0 +1,25 @@
+Implement `formatDuration` in `src/format-duration.ts`.
+
+## Expected behavior
+
+`formatDuration(milliseconds)` returns a compact human label built from hours,
+minutes, and seconds (sub-second precision is dropped via truncation).
+
+- Units are space-separated, largest first, suffixed `h` / `m` / `s`:
+  `formatDuration(3_661_000)` → `"1h 1m 1s"`.
+- Leading and trailing zero units are omitted, but a zero unit between two
+  non-zero units is kept: `formatDuration(3_600_000)` → `"1h"`,
+  `formatDuration(3_601_000)` → `"1h 0m 1s"`.
+- Under a minute returns just seconds: `formatDuration(5_000)` → `"5s"`.
+- Zero (and any negative input) returns `"0s"`.
+
+Examples:
+
+- `formatDuration(0)` → `"0s"`
+- `formatDuration(65_000)` → `"1m 5s"`
+- `formatDuration(-10)` → `"0s"`
+
+## Constraints
+
+Keep the exported `formatDuration(milliseconds: number): string` signature. Do
+not change `src/elapsed-label.tsx`.
diff --git a/packages/benchmark/tasks/format-duration-util/seed/package.json b/packages/benchmark/tasks/format-duration-util/seed/package.json
new file mode 100644
index 000000000..eb61cf10e
--- /dev/null
+++ b/packages/benchmark/tasks/format-duration-util/seed/package.json
@@ -0,0 +1,10 @@
+{
+  "name": "slopbench-format-duration",
+  "version": "1.0.0",
+  "private": true,
+  "type": "module",
+  "dependencies": {
+    "react": "^18.3.1",
+    "react-dom": "^18.3.1"
+  }
+}
diff --git a/packages/benchmark/tasks/format-duration-util/seed/src/elapsed-label.tsx b/packages/benchmark/tasks/format-duration-util/seed/src/elapsed-label.tsx
new file mode 100644
index 000000000..4da0bbaec
--- /dev/null
+++ b/packages/benchmark/tasks/format-duration-util/seed/src/elapsed-label.tsx
@@ -0,0 +1,10 @@
+import { formatDuration } from "./format-duration.ts";
+
+interface ElapsedLabelProps {
+  milliseconds: number;
+}
+
+// Existing consumer (keeps format-duration.ts reachable). Do not edit.
+export const ElapsedLabel = ({ milliseconds }: ElapsedLabelProps) => (
+  <span className="elapsed">{formatDuration(milliseconds)}</span>
+);
diff --git a/packages/benchmark/tasks/format-duration-util/seed/src/format-duration.ts b/packages/benchmark/tasks/format-duration-util/seed/src/format-duration.ts
new file mode 100644
index 000000000..c76c6dd14
--- /dev/null
+++ b/packages/benchmark/tasks/format-duration-util/seed/src/format-duration.ts
@@ -0,0 +1,4 @@
+// TODO(agent): implement. See instruction.md.
+export const formatDuration = (_milliseconds: number): string => {
+  throw new Error("not implemented");
+};
diff --git a/packages/benchmark/tasks/format-duration-util/seed/tsconfig.json b/packages/benchmark/tasks/format-duration-util/seed/tsconfig.json
new file mode 100644
index 000000000..ffbea3d66
--- /dev/null
+++ b/packages/benchmark/tasks/format-duration-util/seed/tsconfig.json
@@ -0,0 +1,13 @@
+{
+  "compilerOptions": {
+    "target": "ES2022",
+    "module": "ESNext",
+    "moduleResolution": "Bundler",
+    "jsx": "react-jsx",
+    "strict": true,
+    "allowImportingTsExtensions": true,
+    "noEmit": true,
+    "skipLibCheck": true
+  },
+  "include": ["src", "tests"]
+}
diff --git a/packages/benchmark/tasks/format-duration-util/solution/solution.patch b/packages/benchmark/tasks/format-duration-util/solution/solution.patch
new file mode 100644
index 000000000..be2a07b04
--- /dev/null
+++ b/packages/benchmark/tasks/format-duration-util/solution/solution.patch
@@ -0,0 +1,39 @@
+diff --git a/src/format-duration.ts b/src/format-duration.ts
+index c76c6dd..884e172 100644
+--- a/src/format-duration.ts
++++ b/src/format-duration.ts
+@@ -1,4 +1,31 @@
+-// TODO(agent): implement. See instruction.md.
+-export const formatDuration = (_milliseconds: number): string => {
+-  throw new Error("not implemented");
++interface DurationUnit {
++  value: number;
++  suffix: string;
++}
++
++// Compact "1h 0m 1s" label. Leading and trailing zero units are dropped, but a
++// zero unit between two non-zero units is kept so the ordering stays readable.
++export const formatDuration = (milliseconds: number): string => {
++  if (milliseconds <= 0) return "0s";
++
++  const totalSeconds = Math.floor(milliseconds / 1000);
++  const units: DurationUnit[] = [
++    { value: Math.floor(totalSeconds / 3600), suffix: "h" },
++    { value: Math.floor((totalSeconds % 3600) / 60), suffix: "m" },
++    { value: totalSeconds % 60, suffix: "s" },
++  ];
++
++  const firstNonZero = units.findIndex((unit) => unit.value > 0);
++  if (firstNonZero === -1) return "0s";
++
++  let lastNonZero = firstNonZero;
++  for (let index = firstNonZero; index < units.length; index++) {
++    const unit = units[index];
++    if (unit && unit.value > 0) lastNonZero = index;
++  }
++
++  return units
++    .slice(firstNonZero, lastNonZero + 1)
++    .map((unit) => `${unit.value}${unit.suffix}`)
++    .join(" ");
+ };
diff --git a/packages/benchmark/tasks/format-duration-util/solution/solve.sh b/packages/benchmark/tasks/format-duration-util/solution/solve.sh
new file mode 100755
index 000000000..764e03155
--- /dev/null
+++ b/packages/benchmark/tasks/format-duration-util/solution/solve.sh
@@ -0,0 +1,5 @@
+#!/usr/bin/env bash
+# Reference solution applier (reviewer aid only — never used at grade time).
+set -euo pipefail
+cd /app
+git apply --whitespace=nowarn /solution/solution.patch
diff --git a/packages/benchmark/tasks/format-duration-util/task.toml b/packages/benchmark/tasks/format-duration-util/task.toml
new file mode 100644
index 000000000..f3ea7a24d
--- /dev/null
+++ b/packages/benchmark/tasks/format-duration-util/task.toml
@@ -0,0 +1,42 @@
+schema_version = "1.1"
+artifacts = []
+
+[task]
+name = "slopbench/format-duration-util"
+description = "Implement formatDuration(ms) producing a compact h/m/s label."
+authors = []
+keywords = ["react", "typescript", "slop", "frontend"]
+
+[metadata]
+task_id = "format-duration-util"
+display_title = "Format duration label"
+display_description = "Implement formatDuration(ms) producing a compact h/m/s label."
+family = "produce-clean"
+target_dimensions = ["ts-strictness", "maintainability"]
+language = "typescript"
+repository_url = "in-tree"
+base_commit_hash = "root"
+slop_profile = ""
+
+[verifier]
+timeout_sec = 1200.0
+
+[verifier.env]
+
+[agent]
+timeout_sec = 3600.0
+
+[environment]
+build_timeout_sec = 1200.0
+docker_image = "slopbench-base:latest"
+os = "linux"
+cpus = 2
+memory_mb = 4096
+storage_mb = 10240
+gpus = 0
+allow_internet = false
+mcp_servers = []
+
+[environment.env]
+
+[solution.env]
diff --git a/packages/benchmark/tasks/format-duration-util/tests/test.patch b/packages/benchmark/tasks/format-duration-util/tests/test.patch
new file mode 100644
index 000000000..8c461584f
--- /dev/null
+++ b/packages/benchmark/tasks/format-duration-util/tests/test.patch
@@ -0,0 +1,31 @@
+diff --git a/tests/format-duration.test.ts b/tests/format-duration.test.ts
+new file mode 100644
+index 0000000..58f20fc
+--- /dev/null
++++ b/tests/format-duration.test.ts
+@@ -0,0 +1,25 @@
++import { test } from "node:test";
++import assert from "node:assert/strict";
++import { formatDuration } from "../src/format-duration.ts";
++
++test("returns 0s for zero and negative input", () => {
++  assert.equal(formatDuration(0), "0s");
++  assert.equal(formatDuration(-10), "0s");
++});
++
++test("renders seconds only under a minute", () => {
++  assert.equal(formatDuration(5_000), "5s");
++});
++
++test("renders minutes and seconds", () => {
++  assert.equal(formatDuration(65_000), "1m 5s");
++});
++
++test("drops trailing zero units", () => {
++  assert.equal(formatDuration(3_600_000), "1h");
++});
++
++test("keeps a zero unit between two non-zero units", () => {
++  assert.equal(formatDuration(3_601_000), "1h 0m 1s");
++  assert.equal(formatDuration(3_661_000), "1h 1m 1s");
++});
diff --git a/packages/benchmark/tasks/format-duration-util/tests/test.sh b/packages/benchmark/tasks/format-duration-util/tests/test.sh
new file mode 100755
index 000000000..338001146
--- /dev/null
+++ b/packages/benchmark/tasks/format-duration-util/tests/test.sh
@@ -0,0 +1,5 @@
+#!/usr/bin/env bash
+set -euo pipefail
+BASE_COMMIT="$(git -C "${APP_DIR:-/app}" rev-list --max-parents=0 HEAD | tail -1)"; export BASE_COMMIT
+export FUNCTIONAL_TEST_CMD="node --experimental-strip-types --test tests/format-duration.test.ts"
+exec slopbench-grade
diff --git a/packages/benchmark/tasks/format-list-extend/_authoring/hidden/tests/format-list.test.ts b/packages/benchmark/tasks/format-list-extend/_authoring/hidden/tests/format-list.test.ts
new file mode 100644
index 000000000..fb30cd9d9
--- /dev/null
+++ b/packages/benchmark/tasks/format-list-extend/_authoring/hidden/tests/format-list.test.ts
@@ -0,0 +1,19 @@
+import { test } from "node:test";
+import assert from "node:assert/strict";
+import { formatList } from "../src/format-list.ts";
+
+test("keeps existing default joining behavior", () => {
+  assert.equal(formatList([]), "");
+  assert.equal(formatList(["a"]), "a");
+  assert.equal(formatList(["a", "b"]), "a and b");
+  assert.equal(formatList(["a", "b", "c"]), "a, b and c");
+});
+
+test("adds an Oxford comma when requested for 3+ items", () => {
+  assert.equal(formatList(["a", "b", "c"], { oxford: true }), "a, b, and c");
+});
+
+test("honors a custom conjunction", () => {
+  assert.equal(formatList(["a", "b", "c"], { conjunction: "or" }), "a, b or c");
+  assert.equal(formatList(["a", "b"], { conjunction: "or" }), "a or b");
+});
diff --git a/packages/benchmark/tasks/format-list-extend/_authoring/solved/src/format-list.ts b/packages/benchmark/tasks/format-list-extend/_authoring/solved/src/format-list.ts
new file mode 100644
index 000000000..6742df7e6
--- /dev/null
+++ b/packages/benchmark/tasks/format-list-extend/_authoring/solved/src/format-list.ts
@@ -0,0 +1,17 @@
+export interface FormatListOptions {
+  conjunction?: string;
+  oxford?: boolean;
+}
+
+// Joins a list into a human sentence, with an optional Oxford comma.
+export const formatList = (items: readonly string[], options: FormatListOptions = {}): string => {
+  const conjunction = options.conjunction ?? "and";
+  if (items.length === 0) return "";
+  if (items.length === 1) return items[0] ?? "";
+  if (items.length === 2) return `${items[0]} ${conjunction} ${items[1]}`;
+
+  const head = items.slice(0, -1).join(", ");
+  const last = items[items.length - 1];
+  const oxfordComma = options.oxford ? "," : "";
+  return `${head}${oxfordComma} ${conjunction} ${last}`;
+};
diff --git a/packages/benchmark/tasks/format-list-extend/environment/Dockerfile b/packages/benchmark/tasks/format-list-extend/environment/Dockerfile
new file mode 100644
index 000000000..0717d0595
--- /dev/null
+++ b/packages/benchmark/tasks/format-list-extend/environment/Dockerfile
@@ -0,0 +1,12 @@
+FROM slopbench-base:latest
+
+WORKDIR /app
+
+COPY seed/ .
+# Pure-TS task: no dependency install (functional test uses node --test).
+RUN git init -q \
+  && git add -A \
+  && git -c user.email=bench@react.doctor -c user.name=slopbench commit -qm "base" \
+  && git config --global --add safe.directory /app
+
+CMD ["/bin/bash"]
diff --git a/packages/benchmark/tasks/format-list-extend/instruction.md b/packages/benchmark/tasks/format-list-extend/instruction.md
new file mode 100644
index 000000000..c79a292e9
--- /dev/null
+++ b/packages/benchmark/tasks/format-list-extend/instruction.md
@@ -0,0 +1,24 @@
+`src/format-list.ts` joins a list of strings into a sentence. Extend it.
+
+## Expected behavior
+
+Change `formatList` to take an options object as its second argument:
+`formatList(items, options?)` where `options` is
+`{ conjunction?: string; oxford?: boolean }`.
+
+- `conjunction` defaults to `"and"`.
+- Existing joining behavior is unchanged by default:
+  - `formatList([])` → `""`
+  - `formatList(["a"])` → `"a"`
+  - `formatList(["a", "b"])` → `"a and b"`
+  - `formatList(["a", "b", "c"])` → `"a, b and c"`
+- New: when `options.oxford` is `true` and there are 3+ items, place a comma
+  before the conjunction (the Oxford comma):
+  - `formatList(["a", "b", "c"], { oxford: true })` → `"a, b, and c"`
+- `conjunction` still applies:
+  - `formatList(["a", "b", "c"], { conjunction: "or" })` → `"a, b or c"`
+
+## Constraints
+
+Export `formatList` with the new `(items: string[], options?) => string`
+signature. Do not change `src/attendees-label.tsx` (it calls `formatList(names)`).
diff --git a/packages/benchmark/tasks/format-list-extend/seed/package.json b/packages/benchmark/tasks/format-list-extend/seed/package.json
new file mode 100644
index 000000000..d4322f8e4
--- /dev/null
+++ b/packages/benchmark/tasks/format-list-extend/seed/package.json
@@ -0,0 +1,10 @@
+{
+  "name": "slopbench-format-list-extend",
+  "version": "1.0.0",
+  "private": true,
+  "type": "module",
+  "dependencies": {
+    "react": "^18.3.1",
+    "react-dom": "^18.3.1"
+  }
+}
diff --git a/packages/benchmark/tasks/format-list-extend/seed/src/attendees-label.tsx b/packages/benchmark/tasks/format-list-extend/seed/src/attendees-label.tsx
new file mode 100644
index 000000000..6a69c1f98
--- /dev/null
+++ b/packages/benchmark/tasks/format-list-extend/seed/src/attendees-label.tsx
@@ -0,0 +1,10 @@
+import { formatList } from "./format-list.ts";
+
+interface AttendeesLabelProps {
+  names: string[];
+}
+
+// Existing consumer (keeps format-list.ts reachable). Do not edit.
+export const AttendeesLabel = ({ names }: AttendeesLabelProps) => (
+  <span className="attendees">{formatList(names)}</span>
+);
diff --git a/packages/benchmark/tasks/format-list-extend/seed/src/format-list.ts b/packages/benchmark/tasks/format-list-extend/seed/src/format-list.ts
new file mode 100644
index 000000000..f64566516
--- /dev/null
+++ b/packages/benchmark/tasks/format-list-extend/seed/src/format-list.ts
@@ -0,0 +1,11 @@
+// Joins a list of names into a human sentence, e.g. ["a","b","c"] -> "a, b and c".
+export function formatList(items: any, conjunction?: any): any {
+  const c = conjunction ? conjunction : "and";
+  return items.length === 0
+    ? ""
+    : items.length === 1
+      ? items[0]
+      : items.length === 2
+        ? items[0] + " " + c + " " + items[1]
+        : items.slice(0, -1).join(", ") + " " + c + " " + items[items.length - 1];
+}
diff --git a/packages/benchmark/tasks/format-list-extend/seed/tsconfig.json b/packages/benchmark/tasks/format-list-extend/seed/tsconfig.json
new file mode 100644
index 000000000..ffbea3d66
--- /dev/null
+++ b/packages/benchmark/tasks/format-list-extend/seed/tsconfig.json
@@ -0,0 +1,13 @@
+{
+  "compilerOptions": {
+    "target": "ES2022",
+    "module": "ESNext",
+    "moduleResolution": "Bundler",
+    "jsx": "react-jsx",
+    "strict": true,
+    "allowImportingTsExtensions": true,
+    "noEmit": true,
+    "skipLibCheck": true
+  },
+  "include": ["src", "tests"]
+}
diff --git a/packages/benchmark/tasks/format-list-extend/solution/solution.patch b/packages/benchmark/tasks/format-list-extend/solution/solution.patch
new file mode 100644
index 000000000..c002142b0
--- /dev/null
+++ b/packages/benchmark/tasks/format-list-extend/solution/solution.patch
@@ -0,0 +1,32 @@
+diff --git a/src/format-list.ts b/src/format-list.ts
+index f645665..6742df7 100644
+--- a/src/format-list.ts
++++ b/src/format-list.ts
+@@ -1,11 +1,17 @@
+-// Joins a list of names into a human sentence, e.g. ["a","b","c"] -> "a, b and c".
+-export function formatList(items: any, conjunction?: any): any {
+-  const c = conjunction ? conjunction : "and";
+-  return items.length === 0
+-    ? ""
+-    : items.length === 1
+-      ? items[0]
+-      : items.length === 2
+-        ? items[0] + " " + c + " " + items[1]
+-        : items.slice(0, -1).join(", ") + " " + c + " " + items[items.length - 1];
++export interface FormatListOptions {
++  conjunction?: string;
++  oxford?: boolean;
+ }
++
++// Joins a list into a human sentence, with an optional Oxford comma.
++export const formatList = (items: readonly string[], options: FormatListOptions = {}): string => {
++  const conjunction = options.conjunction ?? "and";
++  if (items.length === 0) return "";
++  if (items.length === 1) return items[0] ?? "";
++  if (items.length === 2) return `${items[0]} ${conjunction} ${items[1]}`;
++
++  const head = items.slice(0, -1).join(", ");
++  const last = items[items.length - 1];
++  const oxfordComma = options.oxford ? "," : "";
++  return `${head}${oxfordComma} ${conjunction} ${last}`;
++};
diff --git a/packages/benchmark/tasks/format-list-extend/solution/solve.sh b/packages/benchmark/tasks/format-list-extend/solution/solve.sh
new file mode 100755
index 000000000..764e03155
--- /dev/null
+++ b/packages/benchmark/tasks/format-list-extend/solution/solve.sh
@@ -0,0 +1,5 @@
+#!/usr/bin/env bash
+# Reference solution applier (reviewer aid only — never used at grade time).
+set -euo pipefail
+cd /app
+git apply --whitespace=nowarn /solution/solution.patch
diff --git a/packages/benchmark/tasks/format-list-extend/task.toml b/packages/benchmark/tasks/format-list-extend/task.toml
new file mode 100644
index 000000000..f3c68a3b8
--- /dev/null
+++ b/packages/benchmark/tasks/format-list-extend/task.toml
@@ -0,0 +1,42 @@
+schema_version = "1.1"
+artifacts = []
+
+[task]
+name = "slopbench/format-list-extend"
+description = "Extend a sloppy formatList to support an Oxford-comma option while keeping behavior."
+authors = []
+keywords = ["react", "typescript", "slop", "frontend"]
+
+[metadata]
+task_id = "format-list-extend"
+display_title = "Extend formatList with Oxford comma"
+display_description = "Extend a sloppy formatList to support an Oxford-comma option while keeping behavior."
+family = "handle-slop"
+target_dimensions = ["maintainability", "ts-strictness"]
+language = "typescript"
+repository_url = "in-tree"
+base_commit_hash = "root"
+slop_profile = ""
+
+[verifier]
+timeout_sec = 1200.0
+
+[verifier.env]
+
+[agent]
+timeout_sec = 3600.0
+
+[environment]
+build_timeout_sec = 1200.0
+docker_image = "slopbench-base:latest"
+os = "linux"
+cpus = 2
+memory_mb = 4096
+storage_mb = 10240
+gpus = 0
+allow_internet = false
+mcp_servers = []
+
+[environment.env]
+
+[solution.env]
diff --git a/packages/benchmark/tasks/format-list-extend/tests/test.patch b/packages/benchmark/tasks/format-list-extend/tests/test.patch
new file mode 100644
index 000000000..a3cfd2dc3
--- /dev/null
+++ b/packages/benchmark/tasks/format-list-extend/tests/test.patch
@@ -0,0 +1,25 @@
+diff --git a/tests/format-list.test.ts b/tests/format-list.test.ts
+new file mode 100644
+index 0000000..fb30cd9
+--- /dev/null
++++ b/tests/format-list.test.ts
+@@ -0,0 +1,19 @@
++import { test } from "node:test";
++import assert from "node:assert/strict";
++import { formatList } from "../src/format-list.ts";
++
++test("keeps existing default joining behavior", () => {
++  assert.equal(formatList([]), "");
++  assert.equal(formatList(["a"]), "a");
++  assert.equal(formatList(["a", "b"]), "a and b");
++  assert.equal(formatList(["a", "b", "c"]), "a, b and c");
++});
++
++test("adds an Oxford comma when requested for 3+ items", () => {
++  assert.equal(formatList(["a", "b", "c"], { oxford: true }), "a, b, and c");
++});
++
++test("honors a custom conjunction", () => {
++  assert.equal(formatList(["a", "b", "c"], { conjunction: "or" }), "a, b or c");
++  assert.equal(formatList(["a", "b"], { conjunction: "or" }), "a or b");
++});
diff --git a/packages/benchmark/tasks/format-list-extend/tests/test.sh b/packages/benchmark/tasks/format-list-extend/tests/test.sh
new file mode 100755
index 000000000..658edfba1
--- /dev/null
+++ b/packages/benchmark/tasks/format-list-extend/tests/test.sh
@@ -0,0 +1,5 @@
+#!/usr/bin/env bash
+set -euo pipefail
+BASE_COMMIT="$(git -C "${APP_DIR:-/app}" rev-list --max-parents=0 HEAD | tail -1)"; export BASE_COMMIT
+export FUNCTIONAL_TEST_CMD="node --experimental-strip-types --test tests/format-list.test.ts"
+exec slopbench-grade
diff --git a/packages/benchmark/tasks/format-money-util/_authoring/hidden/tests/format-money.test.ts b/packages/benchmark/tasks/format-money-util/_authoring/hidden/tests/format-money.test.ts
new file mode 100644
index 000000000..c330bb463
--- /dev/null
+++ b/packages/benchmark/tasks/format-money-util/_authoring/hidden/tests/format-money.test.ts
@@ -0,0 +1,34 @@
+import { test } from "node:test";
+import assert from "node:assert/strict";
+import { formatMoney } from "../src/format-money.ts";
+
+test("formats USD by default with two decimals", () => {
+  assert.equal(formatMoney(1234), "$12.34");
+  assert.equal(formatMoney(0), "$0.00");
+});
+
+test("supports known currency symbols", () => {
+  assert.equal(formatMoney(500, { currency: "EUR" }), "€5.00");
+  assert.equal(formatMoney(500, { currency: "GBP" }), "£5.00");
+});
+
+test("treats JPY as a zero-decimal currency", () => {
+  assert.equal(formatMoney(1200, { currency: "JPY" }), "¥1,200");
+});
+
+test("falls back to an uppercased code prefix for unknown currencies", () => {
+  assert.equal(formatMoney(500, { currency: "chf" }), "CHF 5.00");
+});
+
+test("renders negatives with a leading minus", () => {
+  assert.equal(formatMoney(-1234), "-$12.34");
+});
+
+test("trims zero cents only for whole amounts when asked", () => {
+  assert.equal(formatMoney(1000, { trimZeroCents: true }), "$10");
+  assert.equal(formatMoney(1050, { trimZeroCents: true }), "$10.50");
+});
+
+test("groups thousands with commas", () => {
+  assert.equal(formatMoney(123456789), "$1,234,567.89");
+});
diff --git a/packages/benchmark/tasks/format-money-util/_authoring/solved/src/format-money.ts b/packages/benchmark/tasks/format-money-util/_authoring/solved/src/format-money.ts
new file mode 100644
index 000000000..3e2fa0521
--- /dev/null
+++ b/packages/benchmark/tasks/format-money-util/_authoring/solved/src/format-money.ts
@@ -0,0 +1,47 @@
+export interface FormatMoneyOptions {
+  // ISO 4217 currency code, e.g. "USD", "EUR", "JPY". Defaults to "USD".
+  currency?: string;
+  // When true, drop the fractional part for whole amounts ($10 instead of $10.00).
+  trimZeroCents?: boolean;
+}
+
+interface CurrencyFormat {
+  symbol: string;
+  fractionDigits: number;
+}
+
+const CURRENCY_FORMATS: Record<string, CurrencyFormat> = {
+  USD: { symbol: "$", fractionDigits: 2 },
+  EUR: { symbol: "€", fractionDigits: 2 },
+  GBP: { symbol: "£", fractionDigits: 2 },
+  JPY: { symbol: "¥", fractionDigits: 0 },
+};
+
+const groupThousands = (digits: string): string => digits.replace(/\B(?=(\d{3})+(?!\d))/g, ",");
+
+const resolveFormat = (currency: string): CurrencyFormat => {
+  const known = CURRENCY_FORMATS[currency];
+  if (known) return known;
+  return { symbol: `${currency} `, fractionDigits: 2 };
+};
+
+export const formatMoney = (amountCents: number, options: FormatMoneyOptions = {}): string => {
+  const currency = (options.currency ?? "USD").toUpperCase();
+  const format = resolveFormat(currency);
+  const isNegative = amountCents < 0;
+  const absoluteCents = Math.abs(amountCents);
+
+  if (format.fractionDigits === 0) {
+    const whole = groupThousands(String(absoluteCents));
+    return `${isNegative ? "-" : ""}${format.symbol}${whole}`;
+  }
+
+  const divisor = 10 ** format.fractionDigits;
+  const major = Math.floor(absoluteCents / divisor);
+  const minor = absoluteCents % divisor;
+  const groupedMajor = groupThousands(String(major));
+  const showDecimals = !(options.trimZeroCents && minor === 0);
+  const fraction = showDecimals ? `.${String(minor).padStart(format.fractionDigits, "0")}` : "";
+
+  return `${isNegative ? "-" : ""}${format.symbol}${groupedMajor}${fraction}`;
+};
diff --git a/packages/benchmark/tasks/format-money-util/environment/Dockerfile b/packages/benchmark/tasks/format-money-util/environment/Dockerfile
new file mode 100644
index 000000000..f0a7a2504
--- /dev/null
+++ b/packages/benchmark/tasks/format-money-util/environment/Dockerfile
@@ -0,0 +1,13 @@
+FROM slopbench-base:latest
+
+WORKDIR /app
+
+# In-tree seed committed as the base commit. No dependency install needed —
+# the functional test uses Node's built-in test runner with type stripping.
+COPY seed/ .
+RUN git init -q \
+  && git add -A \
+  && git -c user.email=bench@react.doctor -c user.name=slopbench commit -qm "base" \
+  && git config --global --add safe.directory /app
+
+CMD ["/bin/bash"]
diff --git a/packages/benchmark/tasks/format-money-util/instruction.md b/packages/benchmark/tasks/format-money-util/instruction.md
new file mode 100644
index 000000000..ca319aa52
--- /dev/null
+++ b/packages/benchmark/tasks/format-money-util/instruction.md
@@ -0,0 +1,28 @@
+Implement the `formatMoney` utility in `src/format-money.ts`.
+
+## Expected behavior
+
+`formatMoney(amountCents, options?)` converts an integer amount in **minor
+units** (cents) into a display string.
+
+- The amount is always an integer in minor units. For currencies with a minor
+  unit, divide by 100 to get the major unit. Example: `1234` → `"$12.34"`.
+- `options.currency` is an ISO 4217 code (default `"USD"`). Render the correct
+  symbol for at least: `USD` → `$`, `EUR` → `€`, `GBP` → `£`, `JPY` → `¥`.
+  For any other code, prefix the amount with the uppercased code and a space,
+  e.g. `formatMoney(500, { currency: "chf" })` → `"CHF 5.00"`.
+- `JPY` has **no minor unit**: render no decimals and treat `amountCents` as
+  whole yen — `formatMoney(1200, { currency: "JPY" })` → `"¥1,200"`.
+- Negative amounts render with a leading minus before the symbol:
+  `formatMoney(-1234)` → `"-$12.34"`.
+- Always show exactly two decimals for minor-unit currencies, **unless**
+  `options.trimZeroCents` is `true` and the amount is a whole major unit, in
+  which case drop the decimals: `formatMoney(1000, { trimZeroCents: true })`
+  → `"$10"`, but `formatMoney(1050, { trimZeroCents: true })` → `"$10.50"`.
+- Group the integer part with commas: `formatMoney(123456789)` →
+  `"$1,234,567.89"` (grouping applies to every currency, including `JPY`).
+
+## Constraints
+
+Keep the exported `formatMoney` signature and the `FormatMoneyOptions`
+interface. Do not change `src/price-tag.tsx`.
diff --git a/packages/benchmark/tasks/format-money-util/seed/package.json b/packages/benchmark/tasks/format-money-util/seed/package.json
new file mode 100644
index 000000000..119c8d7c4
--- /dev/null
+++ b/packages/benchmark/tasks/format-money-util/seed/package.json
@@ -0,0 +1,10 @@
+{
+  "name": "slopbench-format-money",
+  "version": "1.0.0",
+  "private": true,
+  "type": "module",
+  "dependencies": {
+    "react": "^18.3.1",
+    "react-dom": "^18.3.1"
+  }
+}
diff --git a/packages/benchmark/tasks/format-money-util/seed/src/format-money.ts b/packages/benchmark/tasks/format-money-util/seed/src/format-money.ts
new file mode 100644
index 000000000..9f0e0ec7d
--- /dev/null
+++ b/packages/benchmark/tasks/format-money-util/seed/src/format-money.ts
@@ -0,0 +1,11 @@
+export interface FormatMoneyOptions {
+  // ISO 4217 currency code, e.g. "USD", "EUR", "JPY". Defaults to "USD".
+  currency?: string;
+  // When true, drop the fractional part for whole amounts ($10 instead of $10.00).
+  trimZeroCents?: boolean;
+}
+
+// TODO(agent): implement. See instruction.md for the exact contract.
+export const formatMoney = (_amountCents: number, _options?: FormatMoneyOptions): string => {
+  throw new Error("not implemented");
+};
diff --git a/packages/benchmark/tasks/format-money-util/seed/src/price-tag.tsx b/packages/benchmark/tasks/format-money-util/seed/src/price-tag.tsx
new file mode 100644
index 000000000..84468ec1f
--- /dev/null
+++ b/packages/benchmark/tasks/format-money-util/seed/src/price-tag.tsx
@@ -0,0 +1,12 @@
+import { formatMoney } from "./format-money.ts";
+
+interface PriceTagProps {
+  amountCents: number;
+  currency?: string;
+}
+
+// Existing component that consumes the util (keeps format-money.ts reachable).
+// Not part of the task — do not edit.
+export const PriceTag = ({ amountCents, currency }: PriceTagProps) => (
+  <span className="price-tag">{formatMoney(amountCents, { currency })}</span>
+);
diff --git a/packages/benchmark/tasks/format-money-util/seed/tsconfig.json b/packages/benchmark/tasks/format-money-util/seed/tsconfig.json
new file mode 100644
index 000000000..ffbea3d66
--- /dev/null
+++ b/packages/benchmark/tasks/format-money-util/seed/tsconfig.json
@@ -0,0 +1,13 @@
+{
+  "compilerOptions": {
+    "target": "ES2022",
+    "module": "ESNext",
+    "moduleResolution": "Bundler",
+    "jsx": "react-jsx",
+    "strict": true,
+    "allowImportingTsExtensions": true,
+    "noEmit": true,
+    "skipLibCheck": true
+  },
+  "include": ["src", "tests"]
+}
diff --git a/packages/benchmark/tasks/format-money-util/solution/solution.patch b/packages/benchmark/tasks/format-money-util/solution/solution.patch
new file mode 100644
index 000000000..2e1480b4d
--- /dev/null
+++ b/packages/benchmark/tasks/format-money-util/solution/solution.patch
@@ -0,0 +1,51 @@
+diff --git a/src/format-money.ts b/src/format-money.ts
+index 9f0e0ec..3e2fa05 100644
+--- a/src/format-money.ts
++++ b/src/format-money.ts
+@@ -5,7 +5,43 @@ export interface FormatMoneyOptions {
+   trimZeroCents?: boolean;
+ }
+ 
+-// TODO(agent): implement. See instruction.md for the exact contract.
+-export const formatMoney = (_amountCents: number, _options?: FormatMoneyOptions): string => {
+-  throw new Error("not implemented");
++interface CurrencyFormat {
++  symbol: string;
++  fractionDigits: number;
++}
++
++const CURRENCY_FORMATS: Record<string, CurrencyFormat> = {
++  USD: { symbol: "$", fractionDigits: 2 },
++  EUR: { symbol: "€", fractionDigits: 2 },
++  GBP: { symbol: "£", fractionDigits: 2 },
++  JPY: { symbol: "¥", fractionDigits: 0 },
++};
++
++const groupThousands = (digits: string): string => digits.replace(/\B(?=(\d{3})+(?!\d))/g, ",");
++
++const resolveFormat = (currency: string): CurrencyFormat => {
++  const known = CURRENCY_FORMATS[currency];
++  if (known) return known;
++  return { symbol: `${currency} `, fractionDigits: 2 };
++};
++
++export const formatMoney = (amountCents: number, options: FormatMoneyOptions = {}): string => {
++  const currency = (options.currency ?? "USD").toUpperCase();
++  const format = resolveFormat(currency);
++  const isNegative = amountCents < 0;
++  const absoluteCents = Math.abs(amountCents);
++
++  if (format.fractionDigits === 0) {
++    const whole = groupThousands(String(absoluteCents));
++    return `${isNegative ? "-" : ""}${format.symbol}${whole}`;
++  }
++
++  const divisor = 10 ** format.fractionDigits;
++  const major = Math.floor(absoluteCents / divisor);
++  const minor = absoluteCents % divisor;
++  const groupedMajor = groupThousands(String(major));
++  const showDecimals = !(options.trimZeroCents && minor === 0);
++  const fraction = showDecimals ? `.${String(minor).padStart(format.fractionDigits, "0")}` : "";
++
++  return `${isNegative ? "-" : ""}${format.symbol}${groupedMajor}${fraction}`;
+ };
diff --git a/packages/benchmark/tasks/format-money-util/solution/solve.sh b/packages/benchmark/tasks/format-money-util/solution/solve.sh
new file mode 100755
index 000000000..764e03155
--- /dev/null
+++ b/packages/benchmark/tasks/format-money-util/solution/solve.sh
@@ -0,0 +1,5 @@
+#!/usr/bin/env bash
+# Reference solution applier (reviewer aid only — never used at grade time).
+set -euo pipefail
+cd /app
+git apply --whitespace=nowarn /solution/solution.patch
diff --git a/packages/benchmark/tasks/format-money-util/task.toml b/packages/benchmark/tasks/format-money-util/task.toml
new file mode 100644
index 000000000..d00460b9c
--- /dev/null
+++ b/packages/benchmark/tasks/format-money-util/task.toml
@@ -0,0 +1,42 @@
+schema_version = "1.1"
+artifacts = []
+
+[task]
+name = "slopbench/format-money-util"
+description = "Implement a currency-formatting utility used by a React price tag."
+authors = []
+keywords = ["typescript", "utility", "formatting", "slop"]
+
+[metadata]
+task_id = "format-money-util"
+display_title = "Currency formatter utility"
+display_description = "Implement formatMoney(amountCents, options) with currency, JPY, negatives, trimming, and grouping."
+family = "produce-clean"
+target_dimensions = ["ts-strictness", "maintainability"]
+language = "typescript"
+repository_url = "in-tree"
+base_commit_hash = "root"
+slop_profile = ""
+
+[verifier]
+timeout_sec = 1200.0
+
+[verifier.env]
+
+[agent]
+timeout_sec = 3600.0
+
+[environment]
+build_timeout_sec = 1200.0
+docker_image = "slopbench-base:latest"
+os = "linux"
+cpus = 2
+memory_mb = 4096
+storage_mb = 10240
+gpus = 0
+allow_internet = false
+mcp_servers = []
+
+[environment.env]
+
+[solution.env]
diff --git a/packages/benchmark/tasks/format-money-util/tests/test.patch b/packages/benchmark/tasks/format-money-util/tests/test.patch
new file mode 100644
index 000000000..86cbc75b2
--- /dev/null
+++ b/packages/benchmark/tasks/format-money-util/tests/test.patch
@@ -0,0 +1,40 @@
+diff --git a/tests/format-money.test.ts b/tests/format-money.test.ts
+new file mode 100644
+index 0000000..c330bb4
+--- /dev/null
++++ b/tests/format-money.test.ts
+@@ -0,0 +1,34 @@
++import { test } from "node:test";
++import assert from "node:assert/strict";
++import { formatMoney } from "../src/format-money.ts";
++
++test("formats USD by default with two decimals", () => {
++  assert.equal(formatMoney(1234), "$12.34");
++  assert.equal(formatMoney(0), "$0.00");
++});
++
++test("supports known currency symbols", () => {
++  assert.equal(formatMoney(500, { currency: "EUR" }), "€5.00");
++  assert.equal(formatMoney(500, { currency: "GBP" }), "£5.00");
++});
++
++test("treats JPY as a zero-decimal currency", () => {
++  assert.equal(formatMoney(1200, { currency: "JPY" }), "¥1,200");
++});
++
++test("falls back to an uppercased code prefix for unknown currencies", () => {
++  assert.equal(formatMoney(500, { currency: "chf" }), "CHF 5.00");
++});
++
++test("renders negatives with a leading minus", () => {
++  assert.equal(formatMoney(-1234), "-$12.34");
++});
++
++test("trims zero cents only for whole amounts when asked", () => {
++  assert.equal(formatMoney(1000, { trimZeroCents: true }), "$10");
++  assert.equal(formatMoney(1050, { trimZeroCents: true }), "$10.50");
++});
++
++test("groups thousands with commas", () => {
++  assert.equal(formatMoney(123456789), "$1,234,567.89");
++});
diff --git a/packages/benchmark/tasks/format-money-util/tests/test.sh b/packages/benchmark/tasks/format-money-util/tests/test.sh
new file mode 100755
index 000000000..f650bab01
--- /dev/null
+++ b/packages/benchmark/tasks/format-money-util/tests/test.sh
@@ -0,0 +1,12 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+# In-tree seed: the base commit is the repo's root commit (created when the
+# image seeds the project), resolved at runtime so no fixed sha is needed.
+BASE_COMMIT="$(git -C "${APP_DIR:-/app}" rev-list --max-parents=0 HEAD | tail -1)"; export BASE_COMMIT
+
+# Pure-TS task: run with Node's built-in test runner + type stripping (offline,
+# no dependency install).
+export FUNCTIONAL_TEST_CMD="node --experimental-strip-types --test tests/format-money.test.ts"
+
+exec slopbench-grade
diff --git a/packages/benchmark/tasks/group-by-extend/_authoring/hidden/tests/group-by.test.ts b/packages/benchmark/tasks/group-by-extend/_authoring/hidden/tests/group-by.test.ts
new file mode 100644
index 000000000..a08c1872a
--- /dev/null
+++ b/packages/benchmark/tasks/group-by-extend/_authoring/hidden/tests/group-by.test.ts
@@ -0,0 +1,26 @@
+import { test } from "node:test";
+import assert from "node:assert/strict";
+import { groupBy } from "../src/group-by.ts";
+
+test("groups by a property name (existing behavior)", () => {
+  const result = groupBy([{ t: "a" }, { t: "b" }, { t: "a" }], "t");
+  assert.deepEqual(result, { a: [{ t: "a" }, { t: "a" }], b: [{ t: "b" }] });
+});
+
+test("groups by a selector function (new behavior)", () => {
+  const result = groupBy([1, 2, 3, 4], (n: number) => (n % 2 === 0 ? "even" : "odd"));
+  assert.deepEqual(result, { odd: [1, 3], even: [2, 4] });
+});
+
+test("keeps first-seen order of items within a group", () => {
+  const items = [
+    { id: 1, g: "x" },
+    { id: 2, g: "x" },
+    { id: 3, g: "x" },
+  ];
+  const result = groupBy(items, "g");
+  assert.deepEqual(
+    result.x.map((item) => item.id),
+    [1, 2, 3],
+  );
+});
diff --git a/packages/benchmark/tasks/group-by-extend/_authoring/solved/src/group-by.ts b/packages/benchmark/tasks/group-by-extend/_authoring/solved/src/group-by.ts
new file mode 100644
index 000000000..96a04dded
--- /dev/null
+++ b/packages/benchmark/tasks/group-by-extend/_authoring/solved/src/group-by.ts
@@ -0,0 +1,16 @@
+export type GroupKey = string | number;
+
+// Groups a list by a property name or a selector function. The result maps each
+// distinct key (stringified) to the items that produced it, in first-seen order.
+export const groupBy = <Item>(
+  items: readonly Item[],
+  key: keyof Item | ((item: Item) => GroupKey),
+): Record<string, Item[]> => {
+  const deriveKey = typeof key === "function" ? key : (item: Item) => String(item[key]);
+  const result: Record<string, Item[]> = {};
+  for (const item of items) {
+    const groupKey = String(deriveKey(item));
+    (result[groupKey] ??= []).push(item);
+  }
+  return result;
+};
diff --git a/packages/benchmark/tasks/group-by-extend/environment/Dockerfile b/packages/benchmark/tasks/group-by-extend/environment/Dockerfile
new file mode 100644
index 000000000..0717d0595
--- /dev/null
+++ b/packages/benchmark/tasks/group-by-extend/environment/Dockerfile
@@ -0,0 +1,12 @@
+FROM slopbench-base:latest
+
+WORKDIR /app
+
+COPY seed/ .
+# Pure-TS task: no dependency install (functional test uses node --test).
+RUN git init -q \
+  && git add -A \
+  && git -c user.email=bench@react.doctor -c user.name=slopbench commit -qm "base" \
+  && git config --global --add safe.directory /app
+
+CMD ["/bin/bash"]
diff --git a/packages/benchmark/tasks/group-by-extend/instruction.md b/packages/benchmark/tasks/group-by-extend/instruction.md
new file mode 100644
index 000000000..8ac9441c8
--- /dev/null
+++ b/packages/benchmark/tasks/group-by-extend/instruction.md
@@ -0,0 +1,25 @@
+`src/group-by.ts` groups a list of records by a property name. Extend it.
+
+## Expected behavior
+
+`groupBy(items, key)` must support **two** forms of `key`:
+
+1. A **property name** (existing behavior): `groupBy(items, "category")` groups
+   by `item.category`.
+2. A **selector function**: `groupBy(items, (item) => …)` groups by the value the
+   function returns for each item.
+
+In both cases the result is an object mapping each distinct key (as a string) to
+the array of items that produced it, in first-seen order. Existing callers that
+pass a property name must keep working unchanged.
+
+Examples:
+
+- `groupBy([{ t: "a" }, { t: "b" }, { t: "a" }], "t")` →
+  `{ a: [{ t: "a" }, { t: "a" }], b: [{ t: "b" }] }`
+- `groupBy([1, 2, 3, 4], (n) => (n % 2 === 0 ? "even" : "odd"))` →
+  `{ odd: [1, 3], even: [2, 4] }`
+
+## Constraints
+
+Keep the export named `groupBy`. Do not change `src/inventory-report.ts`.
diff --git a/packages/benchmark/tasks/group-by-extend/seed/package.json b/packages/benchmark/tasks/group-by-extend/seed/package.json
new file mode 100644
index 000000000..c70f26cf2
--- /dev/null
+++ b/packages/benchmark/tasks/group-by-extend/seed/package.json
@@ -0,0 +1,10 @@
+{
+  "name": "slopbench-group-by",
+  "version": "1.0.0",
+  "private": true,
+  "type": "module",
+  "dependencies": {
+    "react": "^18.3.1",
+    "react-dom": "^18.3.1"
+  }
+}
diff --git a/packages/benchmark/tasks/group-by-extend/seed/src/group-by.ts b/packages/benchmark/tasks/group-by-extend/seed/src/group-by.ts
new file mode 100644
index 000000000..2e0308b68
--- /dev/null
+++ b/packages/benchmark/tasks/group-by-extend/seed/src/group-by.ts
@@ -0,0 +1,14 @@
+// Groups a list of records by the value of a property. Currently only supports
+// a property name as the key selector.
+export function groupBy(items: any, key: any): any {
+  const result: any = {};
+  for (let i = 0; i < items.length; i++) {
+    const item = items[i];
+    const k = item[key];
+    if (result[k] === undefined) {
+      result[k] = [];
+    }
+    result[k].push(item);
+  }
+  return result;
+}
diff --git a/packages/benchmark/tasks/group-by-extend/seed/src/inventory-report.ts b/packages/benchmark/tasks/group-by-extend/seed/src/inventory-report.ts
new file mode 100644
index 000000000..bc7a9e272
--- /dev/null
+++ b/packages/benchmark/tasks/group-by-extend/seed/src/inventory-report.ts
@@ -0,0 +1,10 @@
+import { groupBy } from "./group-by.ts";
+
+export interface InventoryItem {
+  sku: string;
+  category: string;
+  quantity: number;
+}
+
+// Existing consumer (keeps group-by.ts reachable). Do not edit.
+export const groupByCategory = (items: InventoryItem[]) => groupBy(items, "category");
diff --git a/packages/benchmark/tasks/group-by-extend/seed/tsconfig.json b/packages/benchmark/tasks/group-by-extend/seed/tsconfig.json
new file mode 100644
index 000000000..ffbea3d66
--- /dev/null
+++ b/packages/benchmark/tasks/group-by-extend/seed/tsconfig.json
@@ -0,0 +1,13 @@
+{
+  "compilerOptions": {
+    "target": "ES2022",
+    "module": "ESNext",
+    "moduleResolution": "Bundler",
+    "jsx": "react-jsx",
+    "strict": true,
+    "allowImportingTsExtensions": true,
+    "noEmit": true,
+    "skipLibCheck": true
+  },
+  "include": ["src", "tests"]
+}
diff --git a/packages/benchmark/tasks/group-by-extend/solution/solution.patch b/packages/benchmark/tasks/group-by-extend/solution/solution.patch
new file mode 100644
index 000000000..9384b2375
--- /dev/null
+++ b/packages/benchmark/tasks/group-by-extend/solution/solution.patch
@@ -0,0 +1,33 @@
+diff --git a/src/group-by.ts b/src/group-by.ts
+index 2e0308b..96a04dd 100644
+--- a/src/group-by.ts
++++ b/src/group-by.ts
+@@ -1,14 +1,16 @@
+-// Groups a list of records by the value of a property. Currently only supports
+-// a property name as the key selector.
+-export function groupBy(items: any, key: any): any {
+-  const result: any = {};
+-  for (let i = 0; i < items.length; i++) {
+-    const item = items[i];
+-    const k = item[key];
+-    if (result[k] === undefined) {
+-      result[k] = [];
+-    }
+-    result[k].push(item);
++export type GroupKey = string | number;
++
++// Groups a list by a property name or a selector function. The result maps each
++// distinct key (stringified) to the items that produced it, in first-seen order.
++export const groupBy = <Item>(
++  items: readonly Item[],
++  key: keyof Item | ((item: Item) => GroupKey),
++): Record<string, Item[]> => {
++  const deriveKey = typeof key === "function" ? key : (item: Item) => String(item[key]);
++  const result: Record<string, Item[]> = {};
++  for (const item of items) {
++    const groupKey = String(deriveKey(item));
++    (result[groupKey] ??= []).push(item);
+   }
+   return result;
+-}
++};
diff --git a/packages/benchmark/tasks/group-by-extend/solution/solve.sh b/packages/benchmark/tasks/group-by-extend/solution/solve.sh
new file mode 100755
index 000000000..764e03155
--- /dev/null
+++ b/packages/benchmark/tasks/group-by-extend/solution/solve.sh
@@ -0,0 +1,5 @@
+#!/usr/bin/env bash
+# Reference solution applier (reviewer aid only — never used at grade time).
+set -euo pipefail
+cd /app
+git apply --whitespace=nowarn /solution/solution.patch
diff --git a/packages/benchmark/tasks/group-by-extend/task.toml b/packages/benchmark/tasks/group-by-extend/task.toml
new file mode 100644
index 000000000..1f98d7608
--- /dev/null
+++ b/packages/benchmark/tasks/group-by-extend/task.toml
@@ -0,0 +1,42 @@
+schema_version = "1.1"
+artifacts = []
+
+[task]
+name = "slopbench/group-by-extend"
+description = "Extend a sloppy groupBy to accept a selector function while keeping behavior."
+authors = []
+keywords = ["react", "typescript", "slop", "frontend"]
+
+[metadata]
+task_id = "group-by-extend"
+display_title = "Extend groupBy with a selector"
+display_description = "Extend a sloppy groupBy to accept a selector function while keeping behavior."
+family = "handle-slop"
+target_dimensions = ["ts-strictness", "maintainability"]
+language = "typescript"
+repository_url = "in-tree"
+base_commit_hash = "root"
+slop_profile = ""
+
+[verifier]
+timeout_sec = 1200.0
+
+[verifier.env]
+
+[agent]
+timeout_sec = 3600.0
+
+[environment]
+build_timeout_sec = 1200.0
+docker_image = "slopbench-base:latest"
+os = "linux"
+cpus = 2
+memory_mb = 4096
+storage_mb = 10240
+gpus = 0
+allow_internet = false
+mcp_servers = []
+
+[environment.env]
+
+[solution.env]
diff --git a/packages/benchmark/tasks/group-by-extend/tests/test.patch b/packages/benchmark/tasks/group-by-extend/tests/test.patch
new file mode 100644
index 000000000..da7c63ede
--- /dev/null
+++ b/packages/benchmark/tasks/group-by-extend/tests/test.patch
@@ -0,0 +1,32 @@
+diff --git a/tests/group-by.test.ts b/tests/group-by.test.ts
+new file mode 100644
+index 0000000..a08c187
+--- /dev/null
++++ b/tests/group-by.test.ts
+@@ -0,0 +1,26 @@
++import { test } from "node:test";
++import assert from "node:assert/strict";
++import { groupBy } from "../src/group-by.ts";
++
++test("groups by a property name (existing behavior)", () => {
++  const result = groupBy([{ t: "a" }, { t: "b" }, { t: "a" }], "t");
++  assert.deepEqual(result, { a: [{ t: "a" }, { t: "a" }], b: [{ t: "b" }] });
++});
++
++test("groups by a selector function (new behavior)", () => {
++  const result = groupBy([1, 2, 3, 4], (n: number) => (n % 2 === 0 ? "even" : "odd"));
++  assert.deepEqual(result, { odd: [1, 3], even: [2, 4] });
++});
++
++test("keeps first-seen order of items within a group", () => {
++  const items = [
++    { id: 1, g: "x" },
++    { id: 2, g: "x" },
++    { id: 3, g: "x" },
++  ];
++  const result = groupBy(items, "g");
++  assert.deepEqual(
++    result.x.map((item) => item.id),
++    [1, 2, 3],
++  );
++});
diff --git a/packages/benchmark/tasks/group-by-extend/tests/test.sh b/packages/benchmark/tasks/group-by-extend/tests/test.sh
new file mode 100755
index 000000000..da9c1cfc4
--- /dev/null
+++ b/packages/benchmark/tasks/group-by-extend/tests/test.sh
@@ -0,0 +1,5 @@
+#!/usr/bin/env bash
+set -euo pipefail
+BASE_COMMIT="$(git -C "${APP_DIR:-/app}" rev-list --max-parents=0 HEAD | tail -1)"; export BASE_COMMIT
+export FUNCTIONAL_TEST_CMD="node --experimental-strip-types --test tests/group-by.test.ts"
+exec slopbench-grade
diff --git a/packages/benchmark/tasks/icon-button-a11y/_authoring/hidden/tests/icon-button.test.tsx b/packages/benchmark/tasks/icon-button-a11y/_authoring/hidden/tests/icon-button.test.tsx
new file mode 100644
index 000000000..ac5ccf5b2
--- /dev/null
+++ b/packages/benchmark/tasks/icon-button-a11y/_authoring/hidden/tests/icon-button.test.tsx
@@ -0,0 +1,15 @@
+import { test, expect } from "vitest";
+import { renderToStaticMarkup } from "react-dom/server";
+import { IconButton } from "../src/icon-button.tsx";
+
+const render = () =>
+  renderToStaticMarkup(<IconButton label="Close" glyph={"\u00d7"} onPress={() => {}} />);
+
+test("renders a control with the accessible name", () => {
+  const html = render();
+  expect(html).toContain('aria-label="Close"');
+});
+
+test("displays the glyph", () => {
+  expect(render()).toContain("\u00d7");
+});
diff --git a/packages/benchmark/tasks/icon-button-a11y/_authoring/solved/src/icon-button.tsx b/packages/benchmark/tasks/icon-button-a11y/_authoring/solved/src/icon-button.tsx
new file mode 100644
index 000000000..b8d039416
--- /dev/null
+++ b/packages/benchmark/tasks/icon-button-a11y/_authoring/solved/src/icon-button.tsx
@@ -0,0 +1,11 @@
+export interface IconButtonProps {
+  label: string;
+  glyph: string;
+  onPress: () => void;
+}
+
+export const IconButton = ({ label, glyph, onPress }: IconButtonProps) => (
+  <button type="button" aria-label={label} onClick={onPress} className="icon-button">
+    <span aria-hidden="true">{glyph}</span>
+  </button>
+);
diff --git a/packages/benchmark/tasks/icon-button-a11y/environment/Dockerfile b/packages/benchmark/tasks/icon-button-a11y/environment/Dockerfile
new file mode 100644
index 000000000..fcbfdb374
--- /dev/null
+++ b/packages/benchmark/tasks/icon-button-a11y/environment/Dockerfile
@@ -0,0 +1,12 @@
+FROM slopbench-base:latest
+
+WORKDIR /app
+
+COPY seed/ .
+RUN pnpm install --frozen-lockfile --ignore-scripts || pnpm install --ignore-scripts
+RUN git init -q \
+  && git add -A \
+  && git -c user.email=bench@react.doctor -c user.name=slopbench commit -qm "base" \
+  && git config --global --add safe.directory /app
+
+CMD ["/bin/bash"]
diff --git a/packages/benchmark/tasks/icon-button-a11y/instruction.md b/packages/benchmark/tasks/icon-button-a11y/instruction.md
new file mode 100644
index 000000000..1fb6eb1dd
--- /dev/null
+++ b/packages/benchmark/tasks/icon-button-a11y/instruction.md
@@ -0,0 +1,22 @@
+Implement the `IconButton` component in `src/icon-button.tsx`.
+
+## Expected behavior
+
+`IconButton` renders an icon-only clickable control. It receives:
+
+- `label` — an accessible name for the control.
+- `glyph` — the icon character to display (e.g. `"×"`).
+- `onPress` — called when the control is activated.
+
+The rendered control must:
+
+- expose the accessible name `label` to assistive technology,
+- display the `glyph` as its visible content,
+- invoke `onPress` when activated.
+
+Example: `<IconButton label="Close" glyph="×" onPress={fn} />` renders a control
+named "Close" showing `×`.
+
+## Constraints
+
+Keep the exported `IconButton` component and the `IconButtonProps` type.
diff --git a/packages/benchmark/tasks/icon-button-a11y/seed/package.json b/packages/benchmark/tasks/icon-button-a11y/seed/package.json
new file mode 100644
index 000000000..c9e71c6ea
--- /dev/null
+++ b/packages/benchmark/tasks/icon-button-a11y/seed/package.json
@@ -0,0 +1,13 @@
+{
+  "name": "slopbench-icon-button-a11y",
+  "version": "1.0.0",
+  "private": true,
+  "type": "module",
+  "dependencies": {
+    "react": "^18.3.1",
+    "react-dom": "^18.3.1"
+  },
+  "devDependencies": {
+    "vitest": "^4.1.8"
+  }
+}
diff --git a/packages/benchmark/tasks/icon-button-a11y/seed/src/icon-button.tsx b/packages/benchmark/tasks/icon-button-a11y/seed/src/icon-button.tsx
new file mode 100644
index 000000000..890537942
--- /dev/null
+++ b/packages/benchmark/tasks/icon-button-a11y/seed/src/icon-button.tsx
@@ -0,0 +1,10 @@
+export interface IconButtonProps {
+  label: string;
+  glyph: string;
+  onPress: () => void;
+}
+
+// TODO(agent): implement. See instruction.md.
+export const IconButton = (_props: IconButtonProps) => {
+  throw new Error("not implemented");
+};
diff --git a/packages/benchmark/tasks/icon-button-a11y/seed/tsconfig.json b/packages/benchmark/tasks/icon-button-a11y/seed/tsconfig.json
new file mode 100644
index 000000000..ffbea3d66
--- /dev/null
+++ b/packages/benchmark/tasks/icon-button-a11y/seed/tsconfig.json
@@ -0,0 +1,13 @@
+{
+  "compilerOptions": {
+    "target": "ES2022",
+    "module": "ESNext",
+    "moduleResolution": "Bundler",
+    "jsx": "react-jsx",
+    "strict": true,
+    "allowImportingTsExtensions": true,
+    "noEmit": true,
+    "skipLibCheck": true
+  },
+  "include": ["src", "tests"]
+}
diff --git a/packages/benchmark/tasks/icon-button-a11y/seed/vitest.config.ts b/packages/benchmark/tasks/icon-button-a11y/seed/vitest.config.ts
new file mode 100644
index 000000000..8409b1f8e
--- /dev/null
+++ b/packages/benchmark/tasks/icon-button-a11y/seed/vitest.config.ts
@@ -0,0 +1,9 @@
+import { defineConfig } from "vitest/config";
+
+export default defineConfig({
+  esbuild: { jsx: "automatic" },
+  test: {
+    environment: "node",
+    include: ["tests/**/*.test.tsx"],
+  },
+});
diff --git a/packages/benchmark/tasks/icon-button-a11y/solution/solution.patch b/packages/benchmark/tasks/icon-button-a11y/solution/solution.patch
new file mode 100644
index 000000000..61ae6fee6
--- /dev/null
+++ b/packages/benchmark/tasks/icon-button-a11y/solution/solution.patch
@@ -0,0 +1,17 @@
+diff --git a/src/icon-button.tsx b/src/icon-button.tsx
+index 8905379..b8d0394 100644
+--- a/src/icon-button.tsx
++++ b/src/icon-button.tsx
+@@ -4,7 +4,8 @@ export interface IconButtonProps {
+   onPress: () => void;
+ }
+ 
+-// TODO(agent): implement. See instruction.md.
+-export const IconButton = (_props: IconButtonProps) => {
+-  throw new Error("not implemented");
+-};
++export const IconButton = ({ label, glyph, onPress }: IconButtonProps) => (
++  <button type="button" aria-label={label} onClick={onPress} className="icon-button">
++    <span aria-hidden="true">{glyph}</span>
++  </button>
++);
diff --git a/packages/benchmark/tasks/icon-button-a11y/solution/solve.sh b/packages/benchmark/tasks/icon-button-a11y/solution/solve.sh
new file mode 100755
index 000000000..764e03155
--- /dev/null
+++ b/packages/benchmark/tasks/icon-button-a11y/solution/solve.sh
@@ -0,0 +1,5 @@
+#!/usr/bin/env bash
+# Reference solution applier (reviewer aid only — never used at grade time).
+set -euo pipefail
+cd /app
+git apply --whitespace=nowarn /solution/solution.patch
diff --git a/packages/benchmark/tasks/icon-button-a11y/task.toml b/packages/benchmark/tasks/icon-button-a11y/task.toml
new file mode 100644
index 000000000..94a15ee2e
--- /dev/null
+++ b/packages/benchmark/tasks/icon-button-a11y/task.toml
@@ -0,0 +1,42 @@
+schema_version = "1.1"
+artifacts = []
+
+[task]
+name = "slopbench/icon-button-a11y"
+description = "Implement an icon-only button with an accessible name (real button, not a div)."
+authors = []
+keywords = ["react", "typescript", "slop", "frontend"]
+
+[metadata]
+task_id = "icon-button-a11y"
+display_title = "Accessible icon button"
+display_description = "Implement an icon-only button with an accessible name (real button, not a div)."
+family = "produce-clean"
+target_dimensions = ["accessibility", "react-correctness"]
+language = "typescript"
+repository_url = "in-tree"
+base_commit_hash = "root"
+slop_profile = ""
+
+[verifier]
+timeout_sec = 1200.0
+
+[verifier.env]
+
+[agent]
+timeout_sec = 3600.0
+
+[environment]
+build_timeout_sec = 1200.0
+docker_image = "slopbench-base:latest"
+os = "linux"
+cpus = 2
+memory_mb = 4096
+storage_mb = 10240
+gpus = 0
+allow_internet = false
+mcp_servers = []
+
+[environment.env]
+
+[solution.env]
diff --git a/packages/benchmark/tasks/icon-button-a11y/tests/test.patch b/packages/benchmark/tasks/icon-button-a11y/tests/test.patch
new file mode 100644
index 000000000..87993f3f0
--- /dev/null
+++ b/packages/benchmark/tasks/icon-button-a11y/tests/test.patch
@@ -0,0 +1,21 @@
+diff --git a/tests/icon-button.test.tsx b/tests/icon-button.test.tsx
+new file mode 100644
+index 0000000..ac5ccf5
+--- /dev/null
++++ b/tests/icon-button.test.tsx
+@@ -0,0 +1,15 @@
++import { test, expect } from "vitest";
++import { renderToStaticMarkup } from "react-dom/server";
++import { IconButton } from "../src/icon-button.tsx";
++
++const render = () =>
++  renderToStaticMarkup(<IconButton label="Close" glyph={"\u00d7"} onPress={() => {}} />);
++
++test("renders a control with the accessible name", () => {
++  const html = render();
++  expect(html).toContain('aria-label="Close"');
++});
++
++test("displays the glyph", () => {
++  expect(render()).toContain("\u00d7");
++});
diff --git a/packages/benchmark/tasks/icon-button-a11y/tests/test.sh b/packages/benchmark/tasks/icon-button-a11y/tests/test.sh
new file mode 100755
index 000000000..4003f69b2
--- /dev/null
+++ b/packages/benchmark/tasks/icon-button-a11y/tests/test.sh
@@ -0,0 +1,5 @@
+#!/usr/bin/env bash
+set -euo pipefail
+BASE_COMMIT="$(git -C "${APP_DIR:-/app}" rev-list --max-parents=0 HEAD | tail -1)"; export BASE_COMMIT
+export FUNCTIONAL_TEST_CMD="pnpm exec vitest run"
+exec slopbench-grade
diff --git a/packages/benchmark/tasks/notification-list/_authoring/hidden/tests/notification-list.test.tsx b/packages/benchmark/tasks/notification-list/_authoring/hidden/tests/notification-list.test.tsx
new file mode 100644
index 000000000..f5f01fbd1
--- /dev/null
+++ b/packages/benchmark/tasks/notification-list/_authoring/hidden/tests/notification-list.test.tsx
@@ -0,0 +1,24 @@
+import { test, expect } from "vitest";
+import { renderToStaticMarkup } from "react-dom/server";
+import { NotificationList, type Notification } from "../src/notification-list.tsx";
+
+const NOTIFICATIONS: Notification[] = [
+  { id: "a", message: "Saved" },
+  { id: "b", message: "Deleted" },
+  { id: "c", message: "Shared" },
+];
+
+test("renders one list item per notification, in order", () => {
+  const html = renderToStaticMarkup(<NotificationList notifications={NOTIFICATIONS} />);
+  expect(html).toContain('<ul class="notifications">');
+  const items = html.match(/<li[^>]*>/g) ?? [];
+  expect(items).toHaveLength(3);
+  expect(html.indexOf("Saved")).toBeLessThan(html.indexOf("Deleted"));
+  expect(html).toContain("Shared");
+});
+
+test("renders an empty list without items", () => {
+  const html = renderToStaticMarkup(<NotificationList notifications={[]} />);
+  expect(html).toContain('<ul class="notifications">');
+  expect(html).not.toContain("<li");
+});
diff --git a/packages/benchmark/tasks/notification-list/_authoring/solved/src/notification-list.tsx b/packages/benchmark/tasks/notification-list/_authoring/solved/src/notification-list.tsx
new file mode 100644
index 000000000..9c89d13d7
--- /dev/null
+++ b/packages/benchmark/tasks/notification-list/_authoring/solved/src/notification-list.tsx
@@ -0,0 +1,16 @@
+export interface Notification {
+  id: string;
+  message: string;
+}
+
+export interface NotificationListProps {
+  notifications: Notification[];
+}
+
+export const NotificationList = ({ notifications }: NotificationListProps) => (
+  <ul className="notifications">
+    {notifications.map((notification) => (
+      <li key={notification.id}>{notification.message}</li>
+    ))}
+  </ul>
+);
diff --git a/packages/benchmark/tasks/notification-list/environment/Dockerfile b/packages/benchmark/tasks/notification-list/environment/Dockerfile
new file mode 100644
index 000000000..fcbfdb374
--- /dev/null
+++ b/packages/benchmark/tasks/notification-list/environment/Dockerfile
@@ -0,0 +1,12 @@
+FROM slopbench-base:latest
+
+WORKDIR /app
+
+COPY seed/ .
+RUN pnpm install --frozen-lockfile --ignore-scripts || pnpm install --ignore-scripts
+RUN git init -q \
+  && git add -A \
+  && git -c user.email=bench@react.doctor -c user.name=slopbench commit -qm "base" \
+  && git config --global --add safe.directory /app
+
+CMD ["/bin/bash"]
diff --git a/packages/benchmark/tasks/notification-list/instruction.md b/packages/benchmark/tasks/notification-list/instruction.md
new file mode 100644
index 000000000..2533d78c0
--- /dev/null
+++ b/packages/benchmark/tasks/notification-list/instruction.md
@@ -0,0 +1,18 @@
+Implement the `NotificationList` component in `src/notification-list.tsx`.
+
+## Expected behavior
+
+`NotificationList` takes a `notifications` array (each item is
+`{ id: string; message: string }`) and renders:
+
+- A `<ul className="notifications">` wrapper.
+- One `<li>` per notification, in order, whose text content is the
+  notification's `message`.
+
+Example: `<NotificationList notifications={[{ id: "a", message: "Saved" }]} />`
+renders `<ul class="notifications"><li>Saved</li></ul>`.
+
+## Constraints
+
+Keep the exported `NotificationList` component and the `Notification` /
+`NotificationListProps` types.
diff --git a/packages/benchmark/tasks/notification-list/seed/package.json b/packages/benchmark/tasks/notification-list/seed/package.json
new file mode 100644
index 000000000..d7abc3def
--- /dev/null
+++ b/packages/benchmark/tasks/notification-list/seed/package.json
@@ -0,0 +1,13 @@
+{
+  "name": "slopbench-notification-list",
+  "version": "1.0.0",
+  "private": true,
+  "type": "module",
+  "dependencies": {
+    "react": "^18.3.1",
+    "react-dom": "^18.3.1"
+  },
+  "devDependencies": {
+    "vitest": "^4.1.8"
+  }
+}
diff --git a/packages/benchmark/tasks/notification-list/seed/src/notification-list.tsx b/packages/benchmark/tasks/notification-list/seed/src/notification-list.tsx
new file mode 100644
index 000000000..13df3b406
--- /dev/null
+++ b/packages/benchmark/tasks/notification-list/seed/src/notification-list.tsx
@@ -0,0 +1,13 @@
+export interface Notification {
+  id: string;
+  message: string;
+}
+
+export interface NotificationListProps {
+  notifications: Notification[];
+}
+
+// TODO(agent): implement. See instruction.md.
+export const NotificationList = (_props: NotificationListProps) => {
+  throw new Error("not implemented");
+};
diff --git a/packages/benchmark/tasks/notification-list/seed/tsconfig.json b/packages/benchmark/tasks/notification-list/seed/tsconfig.json
new file mode 100644
index 000000000..ffbea3d66
--- /dev/null
+++ b/packages/benchmark/tasks/notification-list/seed/tsconfig.json
@@ -0,0 +1,13 @@
+{
+  "compilerOptions": {
+    "target": "ES2022",
+    "module": "ESNext",
+    "moduleResolution": "Bundler",
+    "jsx": "react-jsx",
+    "strict": true,
+    "allowImportingTsExtensions": true,
+    "noEmit": true,
+    "skipLibCheck": true
+  },
+  "include": ["src", "tests"]
+}
diff --git a/packages/benchmark/tasks/notification-list/seed/vitest.config.ts b/packages/benchmark/tasks/notification-list/seed/vitest.config.ts
new file mode 100644
index 000000000..8409b1f8e
--- /dev/null
+++ b/packages/benchmark/tasks/notification-list/seed/vitest.config.ts
@@ -0,0 +1,9 @@
+import { defineConfig } from "vitest/config";
+
+export default defineConfig({
+  esbuild: { jsx: "automatic" },
+  test: {
+    environment: "node",
+    include: ["tests/**/*.test.tsx"],
+  },
+});
diff --git a/packages/benchmark/tasks/notification-list/solution/solution.patch b/packages/benchmark/tasks/notification-list/solution/solution.patch
new file mode 100644
index 000000000..39779ac64
--- /dev/null
+++ b/packages/benchmark/tasks/notification-list/solution/solution.patch
@@ -0,0 +1,19 @@
+diff --git a/src/notification-list.tsx b/src/notification-list.tsx
+index 13df3b4..9c89d13 100644
+--- a/src/notification-list.tsx
++++ b/src/notification-list.tsx
+@@ -7,7 +7,10 @@ export interface NotificationListProps {
+   notifications: Notification[];
+ }
+ 
+-// TODO(agent): implement. See instruction.md.
+-export const NotificationList = (_props: NotificationListProps) => {
+-  throw new Error("not implemented");
+-};
++export const NotificationList = ({ notifications }: NotificationListProps) => (
++  <ul className="notifications">
++    {notifications.map((notification) => (
++      <li key={notification.id}>{notification.message}</li>
++    ))}
++  </ul>
++);
diff --git a/packages/benchmark/tasks/notification-list/solution/solve.sh b/packages/benchmark/tasks/notification-list/solution/solve.sh
new file mode 100755
index 000000000..764e03155
--- /dev/null
+++ b/packages/benchmark/tasks/notification-list/solution/solve.sh
@@ -0,0 +1,5 @@
+#!/usr/bin/env bash
+# Reference solution applier (reviewer aid only — never used at grade time).
+set -euo pipefail
+cd /app
+git apply --whitespace=nowarn /solution/solution.patch
diff --git a/packages/benchmark/tasks/notification-list/task.toml b/packages/benchmark/tasks/notification-list/task.toml
new file mode 100644
index 000000000..342b48615
--- /dev/null
+++ b/packages/benchmark/tasks/notification-list/task.toml
@@ -0,0 +1,42 @@
+schema_version = "1.1"
+artifacts = []
+
+[task]
+name = "slopbench/notification-list"
+description = "Render a notification list with stable keys (not array-index keys / inline components)."
+authors = []
+keywords = ["react", "typescript", "slop", "frontend"]
+
+[metadata]
+task_id = "notification-list"
+display_title = "Notification list"
+display_description = "Render a notification list with stable keys (not array-index keys / inline components)."
+family = "produce-clean"
+target_dimensions = ["react-correctness", "react-performance"]
+language = "typescript"
+repository_url = "in-tree"
+base_commit_hash = "root"
+slop_profile = ""
+
+[verifier]
+timeout_sec = 1200.0
+
+[verifier.env]
+
+[agent]
+timeout_sec = 3600.0
+
+[environment]
+build_timeout_sec = 1200.0
+docker_image = "slopbench-base:latest"
+os = "linux"
+cpus = 2
+memory_mb = 4096
+storage_mb = 10240
+gpus = 0
+allow_internet = false
+mcp_servers = []
+
+[environment.env]
+
+[solution.env]
diff --git a/packages/benchmark/tasks/notification-list/tests/test.patch b/packages/benchmark/tasks/notification-list/tests/test.patch
new file mode 100644
index 000000000..6eeb9a51b
--- /dev/null
+++ b/packages/benchmark/tasks/notification-list/tests/test.patch
@@ -0,0 +1,30 @@
+diff --git a/tests/notification-list.test.tsx b/tests/notification-list.test.tsx
+new file mode 100644
+index 0000000..f5f01fb
+--- /dev/null
++++ b/tests/notification-list.test.tsx
+@@ -0,0 +1,24 @@
++import { test, expect } from "vitest";
++import { renderToStaticMarkup } from "react-dom/server";
++import { NotificationList, type Notification } from "../src/notification-list.tsx";
++
++const NOTIFICATIONS: Notification[] = [
++  { id: "a", message: "Saved" },
++  { id: "b", message: "Deleted" },
++  { id: "c", message: "Shared" },
++];
++
++test("renders one list item per notification, in order", () => {
++  const html = renderToStaticMarkup(<NotificationList notifications={NOTIFICATIONS} />);
++  expect(html).toContain('<ul class="notifications">');
++  const items = html.match(/<li[^>]*>/g) ?? [];
++  expect(items).toHaveLength(3);
++  expect(html.indexOf("Saved")).toBeLessThan(html.indexOf("Deleted"));
++  expect(html).toContain("Shared");
++});
++
++test("renders an empty list without items", () => {
++  const html = renderToStaticMarkup(<NotificationList notifications={[]} />);
++  expect(html).toContain('<ul class="notifications">');
++  expect(html).not.toContain("<li");
++});
diff --git a/packages/benchmark/tasks/notification-list/tests/test.sh b/packages/benchmark/tasks/notification-list/tests/test.sh
new file mode 100755
index 000000000..4003f69b2
--- /dev/null
+++ b/packages/benchmark/tasks/notification-list/tests/test.sh
@@ -0,0 +1,5 @@
+#!/usr/bin/env bash
+set -euo pipefail
+export BASE_COMMIT="$(git -C "${APP_DIR:-/app}" rev-list --max-parents=0 HEAD | tail -1)"
+export FUNCTIONAL_TEST_CMD="pnpm exec vitest run"
+exec slopbench-grade
diff --git a/packages/benchmark/tasks/paginate-util/_authoring/hidden/tests/paginate.test.ts b/packages/benchmark/tasks/paginate-util/_authoring/hidden/tests/paginate.test.ts
new file mode 100644
index 000000000..80a217b15
--- /dev/null
+++ b/packages/benchmark/tasks/paginate-util/_authoring/hidden/tests/paginate.test.ts
@@ -0,0 +1,29 @@
+import { test } from "node:test";
+import assert from "node:assert/strict";
+import { paginate } from "../src/paginate.ts";
+
+test("returns the first page slice with metadata", () => {
+  const result = paginate([1, 2, 3, 4, 5], 1, 2);
+  assert.deepEqual(result.items, [1, 2]);
+  assert.equal(result.page, 1);
+  assert.equal(result.totalPages, 3);
+  assert.equal(result.totalItems, 5);
+});
+
+test("returns the final partial page", () => {
+  assert.deepEqual(paginate([1, 2, 3, 4, 5], 3, 2).items, [5]);
+});
+
+test("clamps an out-of-range page to the last page", () => {
+  const result = paginate([1, 2, 3, 4, 5], 99, 2);
+  assert.deepEqual(result.items, [5]);
+  assert.equal(result.page, 3);
+});
+
+test("an empty list still has one empty page", () => {
+  const result = paginate([], 1, 2);
+  assert.deepEqual(result.items, []);
+  assert.equal(result.page, 1);
+  assert.equal(result.totalPages, 1);
+  assert.equal(result.totalItems, 0);
+});
diff --git a/packages/benchmark/tasks/paginate-util/_authoring/solved/src/paginate.ts b/packages/benchmark/tasks/paginate-util/_authoring/solved/src/paginate.ts
new file mode 100644
index 000000000..8e0eaafa1
--- /dev/null
+++ b/packages/benchmark/tasks/paginate-util/_authoring/solved/src/paginate.ts
@@ -0,0 +1,29 @@
+export interface Page<Item> {
+  items: Item[];
+  page: number;
+  perPage: number;
+  totalItems: number;
+  totalPages: number;
+}
+
+const clampToRange = (value: number, minimum: number, maximum: number): number =>
+  Math.min(Math.max(value, minimum), maximum);
+
+export const paginate = <Item>(
+  items: readonly Item[],
+  page: number,
+  perPage: number,
+): Page<Item> => {
+  const safePerPage = Math.max(1, Math.floor(perPage));
+  const totalItems = items.length;
+  const totalPages = Math.max(1, Math.ceil(totalItems / safePerPage));
+  const safePage = clampToRange(Math.floor(page), 1, totalPages);
+  const start = (safePage - 1) * safePerPage;
+  return {
+    items: items.slice(start, start + safePerPage),
+    page: safePage,
+    perPage: safePerPage,
+    totalItems,
+    totalPages,
+  };
+};
diff --git a/packages/benchmark/tasks/paginate-util/environment/Dockerfile b/packages/benchmark/tasks/paginate-util/environment/Dockerfile
new file mode 100644
index 000000000..0717d0595
--- /dev/null
+++ b/packages/benchmark/tasks/paginate-util/environment/Dockerfile
@@ -0,0 +1,12 @@
+FROM slopbench-base:latest
+
+WORKDIR /app
+
+COPY seed/ .
+# Pure-TS task: no dependency install (functional test uses node --test).
+RUN git init -q \
+  && git add -A \
+  && git -c user.email=bench@react.doctor -c user.name=slopbench commit -qm "base" \
+  && git config --global --add safe.directory /app
+
+CMD ["/bin/bash"]
diff --git a/packages/benchmark/tasks/paginate-util/instruction.md b/packages/benchmark/tasks/paginate-util/instruction.md
new file mode 100644
index 000000000..be6b82ad7
--- /dev/null
+++ b/packages/benchmark/tasks/paginate-util/instruction.md
@@ -0,0 +1,29 @@
+Implement `paginate` in `src/paginate.ts`.
+
+## Expected behavior
+
+`paginate(items, page, perPage)` returns the slice for a 1-indexed page plus
+pagination metadata.
+
+- `perPage` is coerced to at least `1`.
+- `totalItems` is the input length; `totalPages` is
+  `ceil(totalItems / perPage)`, but at least `1` (an empty list still has one
+  empty page).
+- `page` is clamped to the range `[1, totalPages]`.
+- `items` is the slice for the clamped page.
+
+Returns `{ items, page, perPage, totalItems, totalPages }` where `page` and
+`perPage` are the clamped/coerced values actually used.
+
+Examples (with `perPage = 2`):
+
+- `paginate([1,2,3,4,5], 1, 2)` → items `[1,2]`, page 1, totalPages 3, totalItems 5
+- `paginate([1,2,3,4,5], 3, 2)` → items `[5]`, page 3
+- `paginate([1,2,3,4,5], 99, 2)` → items `[5]`, page 3 (clamped)
+- `paginate([], 1, 2)` → items `[]`, page 1, totalPages 1, totalItems 0
+
+## Constraints
+
+Keep the exported generic signature
+`paginate<Item>(items: readonly Item[], page: number, perPage: number): Page<Item>`.
+Do not change `src/results-view.tsx`.
diff --git a/packages/benchmark/tasks/paginate-util/seed/package.json b/packages/benchmark/tasks/paginate-util/seed/package.json
new file mode 100644
index 000000000..4e9f772fa
--- /dev/null
+++ b/packages/benchmark/tasks/paginate-util/seed/package.json
@@ -0,0 +1,10 @@
+{
+  "name": "slopbench-paginate-util",
+  "version": "1.0.0",
+  "private": true,
+  "type": "module",
+  "dependencies": {
+    "react": "^18.3.1",
+    "react-dom": "^18.3.1"
+  }
+}
diff --git a/packages/benchmark/tasks/paginate-util/seed/src/paginate.ts b/packages/benchmark/tasks/paginate-util/seed/src/paginate.ts
new file mode 100644
index 000000000..fb522837b
--- /dev/null
+++ b/packages/benchmark/tasks/paginate-util/seed/src/paginate.ts
@@ -0,0 +1,16 @@
+export interface Page<Item> {
+  items: Item[];
+  page: number;
+  perPage: number;
+  totalItems: number;
+  totalPages: number;
+}
+
+// TODO(agent): implement. See instruction.md.
+export const paginate = <Item>(
+  _items: readonly Item[],
+  _page: number,
+  _perPage: number,
+): Page<Item> => {
+  throw new Error("not implemented");
+};
diff --git a/packages/benchmark/tasks/paginate-util/seed/src/results-view.tsx b/packages/benchmark/tasks/paginate-util/seed/src/results-view.tsx
new file mode 100644
index 000000000..a12bed926
--- /dev/null
+++ b/packages/benchmark/tasks/paginate-util/seed/src/results-view.tsx
@@ -0,0 +1,16 @@
+import { paginate } from "./paginate.ts";
+
+interface ResultsViewProps {
+  rows: string[];
+  page: number;
+}
+
+// Existing consumer (keeps paginate.ts reachable). Do not edit.
+export const ResultsView = ({ rows, page }: ResultsViewProps) => {
+  const result = paginate(rows, page, 10);
+  return (
+    <p>
+      Page {result.page} of {result.totalPages}
+    </p>
+  );
+};
diff --git a/packages/benchmark/tasks/paginate-util/seed/tsconfig.json b/packages/benchmark/tasks/paginate-util/seed/tsconfig.json
new file mode 100644
index 000000000..ffbea3d66
--- /dev/null
+++ b/packages/benchmark/tasks/paginate-util/seed/tsconfig.json
@@ -0,0 +1,13 @@
+{
+  "compilerOptions": {
+    "target": "ES2022",
+    "module": "ESNext",
+    "moduleResolution": "Bundler",
+    "jsx": "react-jsx",
+    "strict": true,
+    "allowImportingTsExtensions": true,
+    "noEmit": true,
+    "skipLibCheck": true
+  },
+  "include": ["src", "tests"]
+}
diff --git a/packages/benchmark/tasks/paginate-util/solution/solution.patch b/packages/benchmark/tasks/paginate-util/solution/solution.patch
new file mode 100644
index 000000000..1dc0ae47f
--- /dev/null
+++ b/packages/benchmark/tasks/paginate-util/solution/solution.patch
@@ -0,0 +1,34 @@
+diff --git a/src/paginate.ts b/src/paginate.ts
+index fb52283..8e0eaaf 100644
+--- a/src/paginate.ts
++++ b/src/paginate.ts
+@@ -6,11 +6,24 @@ export interface Page<Item> {
+   totalPages: number;
+ }
+ 
+-// TODO(agent): implement. See instruction.md.
++const clampToRange = (value: number, minimum: number, maximum: number): number =>
++  Math.min(Math.max(value, minimum), maximum);
++
+ export const paginate = <Item>(
+-  _items: readonly Item[],
+-  _page: number,
+-  _perPage: number,
++  items: readonly Item[],
++  page: number,
++  perPage: number,
+ ): Page<Item> => {
+-  throw new Error("not implemented");
++  const safePerPage = Math.max(1, Math.floor(perPage));
++  const totalItems = items.length;
++  const totalPages = Math.max(1, Math.ceil(totalItems / safePerPage));
++  const safePage = clampToRange(Math.floor(page), 1, totalPages);
++  const start = (safePage - 1) * safePerPage;
++  return {
++    items: items.slice(start, start + safePerPage),
++    page: safePage,
++    perPage: safePerPage,
++    totalItems,
++    totalPages,
++  };
+ };
diff --git a/packages/benchmark/tasks/paginate-util/solution/solve.sh b/packages/benchmark/tasks/paginate-util/solution/solve.sh
new file mode 100755
index 000000000..764e03155
--- /dev/null
+++ b/packages/benchmark/tasks/paginate-util/solution/solve.sh
@@ -0,0 +1,5 @@
+#!/usr/bin/env bash
+# Reference solution applier (reviewer aid only — never used at grade time).
+set -euo pipefail
+cd /app
+git apply --whitespace=nowarn /solution/solution.patch
diff --git a/packages/benchmark/tasks/paginate-util/task.toml b/packages/benchmark/tasks/paginate-util/task.toml
new file mode 100644
index 000000000..e90491e5d
--- /dev/null
+++ b/packages/benchmark/tasks/paginate-util/task.toml
@@ -0,0 +1,42 @@
+schema_version = "1.1"
+artifacts = []
+
+[task]
+name = "slopbench/paginate-util"
+description = "Implement paginate(items, page, perPage) with clamping and metadata."
+authors = []
+keywords = ["react", "typescript", "slop", "frontend"]
+
+[metadata]
+task_id = "paginate-util"
+display_title = "Paginate utility"
+display_description = "Implement paginate(items, page, perPage) with clamping and metadata."
+family = "produce-clean"
+target_dimensions = ["maintainability", "ts-strictness"]
+language = "typescript"
+repository_url = "in-tree"
+base_commit_hash = "root"
+slop_profile = ""
+
+[verifier]
+timeout_sec = 1200.0
+
+[verifier.env]
+
+[agent]
+timeout_sec = 3600.0
+
+[environment]
+build_timeout_sec = 1200.0
+docker_image = "slopbench-base:latest"
+os = "linux"
+cpus = 2
+memory_mb = 4096
+storage_mb = 10240
+gpus = 0
+allow_internet = false
+mcp_servers = []
+
+[environment.env]
+
+[solution.env]
diff --git a/packages/benchmark/tasks/paginate-util/tests/test.patch b/packages/benchmark/tasks/paginate-util/tests/test.patch
new file mode 100644
index 000000000..80fea5b59
--- /dev/null
+++ b/packages/benchmark/tasks/paginate-util/tests/test.patch
@@ -0,0 +1,35 @@
+diff --git a/tests/paginate.test.ts b/tests/paginate.test.ts
+new file mode 100644
+index 0000000..80a217b
+--- /dev/null
++++ b/tests/paginate.test.ts
+@@ -0,0 +1,29 @@
++import { test } from "node:test";
++import assert from "node:assert/strict";
++import { paginate } from "../src/paginate.ts";
++
++test("returns the first page slice with metadata", () => {
++  const result = paginate([1, 2, 3, 4, 5], 1, 2);
++  assert.deepEqual(result.items, [1, 2]);
++  assert.equal(result.page, 1);
++  assert.equal(result.totalPages, 3);
++  assert.equal(result.totalItems, 5);
++});
++
++test("returns the final partial page", () => {
++  assert.deepEqual(paginate([1, 2, 3, 4, 5], 3, 2).items, [5]);
++});
++
++test("clamps an out-of-range page to the last page", () => {
++  const result = paginate([1, 2, 3, 4, 5], 99, 2);
++  assert.deepEqual(result.items, [5]);
++  assert.equal(result.page, 3);
++});
++
++test("an empty list still has one empty page", () => {
++  const result = paginate([], 1, 2);
++  assert.deepEqual(result.items, []);
++  assert.equal(result.page, 1);
++  assert.equal(result.totalPages, 1);
++  assert.equal(result.totalItems, 0);
++});
diff --git a/packages/benchmark/tasks/paginate-util/tests/test.sh b/packages/benchmark/tasks/paginate-util/tests/test.sh
new file mode 100755
index 000000000..986181df3
--- /dev/null
+++ b/packages/benchmark/tasks/paginate-util/tests/test.sh
@@ -0,0 +1,5 @@
+#!/usr/bin/env bash
+set -euo pipefail
+export BASE_COMMIT="$(git -C "${APP_DIR:-/app}" rev-list --max-parents=0 HEAD | tail -1)"
+export FUNCTIONAL_TEST_CMD="node --experimental-strip-types --test tests/paginate.test.ts"
+exec slopbench-grade
diff --git a/packages/benchmark/tasks/parse-query-util/_authoring/hidden/tests/parse-query.test.ts b/packages/benchmark/tasks/parse-query-util/_authoring/hidden/tests/parse-query.test.ts
new file mode 100644
index 000000000..8af0250aa
--- /dev/null
+++ b/packages/benchmark/tasks/parse-query-util/_authoring/hidden/tests/parse-query.test.ts
@@ -0,0 +1,24 @@
+import { test } from "node:test";
+import assert from "node:assert/strict";
+import { parseQuery } from "../src/parse-query.ts";
+
+test("parses simple pairs and ignores a leading ?", () => {
+  assert.deepEqual(parseQuery("?a=1&b=two"), { a: "1", b: "two" });
+});
+
+test("URI-decodes keys and values", () => {
+  assert.deepEqual(parseQuery("name=Ada%20Lovelace"), { name: "Ada Lovelace" });
+});
+
+test("maps a bare key to an empty string", () => {
+  assert.deepEqual(parseQuery("flag&x=1"), { flag: "", x: "1" });
+});
+
+test("keeps the last value for a repeated key", () => {
+  assert.deepEqual(parseQuery("k=1&k=2"), { k: "2" });
+});
+
+test("returns an empty object for empty input", () => {
+  assert.deepEqual(parseQuery(""), {});
+  assert.deepEqual(parseQuery("?"), {});
+});
diff --git a/packages/benchmark/tasks/parse-query-util/_authoring/solved/src/parse-query.ts b/packages/benchmark/tasks/parse-query-util/_authoring/solved/src/parse-query.ts
new file mode 100644
index 000000000..41b4d6995
--- /dev/null
+++ b/packages/benchmark/tasks/parse-query-util/_authoring/solved/src/parse-query.ts
@@ -0,0 +1,18 @@
+// Parses a URL query string into a plain object (last value wins per key).
+export const parseQuery = (search: string): Record<string, string> => {
+  const trimmed = search.startsWith("?") ? search.slice(1) : search;
+  const result: Record<string, string> = {};
+  if (trimmed === "") return result;
+
+  for (const pair of trimmed.split("&")) {
+    if (pair === "") continue;
+    const equalsIndex = pair.indexOf("=");
+    if (equalsIndex === -1) {
+      result[decodeURIComponent(pair)] = "";
+      continue;
+    }
+    const key = decodeURIComponent(pair.slice(0, equalsIndex));
+    result[key] = decodeURIComponent(pair.slice(equalsIndex + 1));
+  }
+  return result;
+};
diff --git a/packages/benchmark/tasks/parse-query-util/environment/Dockerfile b/packages/benchmark/tasks/parse-query-util/environment/Dockerfile
new file mode 100644
index 000000000..0717d0595
--- /dev/null
+++ b/packages/benchmark/tasks/parse-query-util/environment/Dockerfile
@@ -0,0 +1,12 @@
+FROM slopbench-base:latest
+
+WORKDIR /app
+
+COPY seed/ .
+# Pure-TS task: no dependency install (functional test uses node --test).
+RUN git init -q \
+  && git add -A \
+  && git -c user.email=bench@react.doctor -c user.name=slopbench commit -qm "base" \
+  && git config --global --add safe.directory /app
+
+CMD ["/bin/bash"]
diff --git a/packages/benchmark/tasks/parse-query-util/instruction.md b/packages/benchmark/tasks/parse-query-util/instruction.md
new file mode 100644
index 000000000..d69fdbc37
--- /dev/null
+++ b/packages/benchmark/tasks/parse-query-util/instruction.md
@@ -0,0 +1,26 @@
+Implement `parseQuery` in `src/parse-query.ts`.
+
+## Expected behavior
+
+`parseQuery(search)` parses a URL query string into a plain object.
+
+- An optional leading `?` is ignored.
+- Pairs are separated by `&`; key and value are separated by `=`.
+- Keys and values are URI-decoded with `decodeURIComponent` (`%20` → space).
+  Converting `+` to a space is **not** required; `+` is left as-is.
+- A key with no `=` maps to an empty string.
+- When a key repeats, the **last** occurrence wins.
+- An empty string (or just `"?"`) returns `{}`.
+
+Examples:
+
+- `parseQuery("?a=1&b=two")` → `{ a: "1", b: "two" }`
+- `parseQuery("name=Ada%20Lovelace")` → `{ name: "Ada Lovelace" }`
+- `parseQuery("flag&x=1")` → `{ flag: "", x: "1" }`
+- `parseQuery("k=1&k=2")` → `{ k: "2" }`
+- `parseQuery("")` → `{}`
+
+## Constraints
+
+Keep the exported `parseQuery(search: string): Record<string, string>`
+signature. Do not change `src/filter-summary.tsx`.
diff --git a/packages/benchmark/tasks/parse-query-util/seed/package.json b/packages/benchmark/tasks/parse-query-util/seed/package.json
new file mode 100644
index 000000000..6a88bb101
--- /dev/null
+++ b/packages/benchmark/tasks/parse-query-util/seed/package.json
@@ -0,0 +1,10 @@
+{
+  "name": "slopbench-parse-query-util",
+  "version": "1.0.0",
+  "private": true,
+  "type": "module",
+  "dependencies": {
+    "react": "^18.3.1",
+    "react-dom": "^18.3.1"
+  }
+}
diff --git a/packages/benchmark/tasks/parse-query-util/seed/src/filter-summary.tsx b/packages/benchmark/tasks/parse-query-util/seed/src/filter-summary.tsx
new file mode 100644
index 000000000..02dd6e0e0
--- /dev/null
+++ b/packages/benchmark/tasks/parse-query-util/seed/src/filter-summary.tsx
@@ -0,0 +1,11 @@
+import { parseQuery } from "./parse-query.ts";
+
+interface FilterSummaryProps {
+  search: string;
+}
+
+// Existing consumer (keeps parse-query.ts reachable). Do not edit.
+export const FilterSummary = ({ search }: FilterSummaryProps) => {
+  const params = parseQuery(search);
+  return <span>{Object.keys(params).length} filters</span>;
+};
diff --git a/packages/benchmark/tasks/parse-query-util/seed/src/parse-query.ts b/packages/benchmark/tasks/parse-query-util/seed/src/parse-query.ts
new file mode 100644
index 000000000..70a0e167f
--- /dev/null
+++ b/packages/benchmark/tasks/parse-query-util/seed/src/parse-query.ts
@@ -0,0 +1,4 @@
+// TODO(agent): implement. See instruction.md.
+export const parseQuery = (_search: string): Record<string, string> => {
+  throw new Error("not implemented");
+};
diff --git a/packages/benchmark/tasks/parse-query-util/seed/tsconfig.json b/packages/benchmark/tasks/parse-query-util/seed/tsconfig.json
new file mode 100644
index 000000000..ffbea3d66
--- /dev/null
+++ b/packages/benchmark/tasks/parse-query-util/seed/tsconfig.json
@@ -0,0 +1,13 @@
+{
+  "compilerOptions": {
+    "target": "ES2022",
+    "module": "ESNext",
+    "moduleResolution": "Bundler",
+    "jsx": "react-jsx",
+    "strict": true,
+    "allowImportingTsExtensions": true,
+    "noEmit": true,
+    "skipLibCheck": true
+  },
+  "include": ["src", "tests"]
+}
diff --git a/packages/benchmark/tasks/parse-query-util/solution/solution.patch b/packages/benchmark/tasks/parse-query-util/solution/solution.patch
new file mode 100644
index 000000000..5cea3a594
--- /dev/null
+++ b/packages/benchmark/tasks/parse-query-util/solution/solution.patch
@@ -0,0 +1,26 @@
+diff --git a/src/parse-query.ts b/src/parse-query.ts
+index 70a0e16..41b4d69 100644
+--- a/src/parse-query.ts
++++ b/src/parse-query.ts
+@@ -1,4 +1,18 @@
+-// TODO(agent): implement. See instruction.md.
+-export const parseQuery = (_search: string): Record<string, string> => {
+-  throw new Error("not implemented");
++// Parses a URL query string into a plain object (last value wins per key).
++export const parseQuery = (search: string): Record<string, string> => {
++  const trimmed = search.startsWith("?") ? search.slice(1) : search;
++  const result: Record<string, string> = {};
++  if (trimmed === "") return result;
++
++  for (const pair of trimmed.split("&")) {
++    if (pair === "") continue;
++    const equalsIndex = pair.indexOf("=");
++    if (equalsIndex === -1) {
++      result[decodeURIComponent(pair)] = "";
++      continue;
++    }
++    const key = decodeURIComponent(pair.slice(0, equalsIndex));
++    result[key] = decodeURIComponent(pair.slice(equalsIndex + 1));
++  }
++  return result;
+ };
diff --git a/packages/benchmark/tasks/parse-query-util/solution/solve.sh b/packages/benchmark/tasks/parse-query-util/solution/solve.sh
new file mode 100755
index 000000000..764e03155
--- /dev/null
+++ b/packages/benchmark/tasks/parse-query-util/solution/solve.sh
@@ -0,0 +1,5 @@
+#!/usr/bin/env bash
+# Reference solution applier (reviewer aid only — never used at grade time).
+set -euo pipefail
+cd /app
+git apply --whitespace=nowarn /solution/solution.patch
diff --git a/packages/benchmark/tasks/parse-query-util/task.toml b/packages/benchmark/tasks/parse-query-util/task.toml
new file mode 100644
index 000000000..a3536a32d
--- /dev/null
+++ b/packages/benchmark/tasks/parse-query-util/task.toml
@@ -0,0 +1,42 @@
+schema_version = "1.1"
+artifacts = []
+
+[task]
+name = "slopbench/parse-query-util"
+description = "Implement parseQuery(search) into a typed record (last value wins)."
+authors = []
+keywords = ["react", "typescript", "slop", "frontend"]
+
+[metadata]
+task_id = "parse-query-util"
+display_title = "Parse query string"
+display_description = "Implement parseQuery(search) into a typed record (last value wins)."
+family = "produce-clean"
+target_dimensions = ["ts-strictness"]
+language = "typescript"
+repository_url = "in-tree"
+base_commit_hash = "root"
+slop_profile = ""
+
+[verifier]
+timeout_sec = 1200.0
+
+[verifier.env]
+
+[agent]
+timeout_sec = 3600.0
+
+[environment]
+build_timeout_sec = 1200.0
+docker_image = "slopbench-base:latest"
+os = "linux"
+cpus = 2
+memory_mb = 4096
+storage_mb = 10240
+gpus = 0
+allow_internet = false
+mcp_servers = []
+
+[environment.env]
+
+[solution.env]
diff --git a/packages/benchmark/tasks/parse-query-util/tests/test.patch b/packages/benchmark/tasks/parse-query-util/tests/test.patch
new file mode 100644
index 000000000..5c14da6c9
--- /dev/null
+++ b/packages/benchmark/tasks/parse-query-util/tests/test.patch
@@ -0,0 +1,30 @@
+diff --git a/tests/parse-query.test.ts b/tests/parse-query.test.ts
+new file mode 100644
+index 0000000..8af0250
+--- /dev/null
++++ b/tests/parse-query.test.ts
+@@ -0,0 +1,24 @@
++import { test } from "node:test";
++import assert from "node:assert/strict";
++import { parseQuery } from "../src/parse-query.ts";
++
++test("parses simple pairs and ignores a leading ?", () => {
++  assert.deepEqual(parseQuery("?a=1&b=two"), { a: "1", b: "two" });
++});
++
++test("URI-decodes keys and values", () => {
++  assert.deepEqual(parseQuery("name=Ada%20Lovelace"), { name: "Ada Lovelace" });
++});
++
++test("maps a bare key to an empty string", () => {
++  assert.deepEqual(parseQuery("flag&x=1"), { flag: "", x: "1" });
++});
++
++test("keeps the last value for a repeated key", () => {
++  assert.deepEqual(parseQuery("k=1&k=2"), { k: "2" });
++});
++
++test("returns an empty object for empty input", () => {
++  assert.deepEqual(parseQuery(""), {});
++  assert.deepEqual(parseQuery("?"), {});
++});
diff --git a/packages/benchmark/tasks/parse-query-util/tests/test.sh b/packages/benchmark/tasks/parse-query-util/tests/test.sh
new file mode 100755
index 000000000..6f1ebd04b
--- /dev/null
+++ b/packages/benchmark/tasks/parse-query-util/tests/test.sh
@@ -0,0 +1,5 @@
+#!/usr/bin/env bash
+set -euo pipefail
+export BASE_COMMIT="$(git -C "${APP_DIR:-/app}" rev-list --max-parents=0 HEAD | tail -1)"
+export FUNCTIONAL_TEST_CMD="node --experimental-strip-types --test tests/parse-query.test.ts"
+exec slopbench-grade
diff --git a/packages/benchmark/tasks/retry-async-util/_authoring/hidden/tests/retry-async.test.ts b/packages/benchmark/tasks/retry-async-util/_authoring/hidden/tests/retry-async.test.ts
new file mode 100644
index 000000000..aa66eeaf3
--- /dev/null
+++ b/packages/benchmark/tasks/retry-async-util/_authoring/hidden/tests/retry-async.test.ts
@@ -0,0 +1,35 @@
+import { test } from "node:test";
+import assert from "node:assert/strict";
+import { retryAsync } from "../src/retry-async.ts";
+
+test("retries until the operation resolves", async () => {
+  let calls = 0;
+  const value = await retryAsync(async () => {
+    calls++;
+    if (calls < 2) throw new Error("transient");
+    return "ok";
+  }, 3);
+  assert.equal(value, "ok");
+  assert.equal(calls, 2);
+});
+
+test("rejects with the last error after exhausting attempts", async () => {
+  let calls = 0;
+  await assert.rejects(
+    retryAsync(async () => {
+      calls++;
+      throw new Error(`fail ${calls}`);
+    }, 2),
+    /fail 2/,
+  );
+  assert.equal(calls, 2);
+});
+
+test("calls the operation only once when it resolves immediately", async () => {
+  let calls = 0;
+  await retryAsync(async () => {
+    calls++;
+    return 1;
+  }, 5);
+  assert.equal(calls, 1);
+});
diff --git a/packages/benchmark/tasks/retry-async-util/_authoring/solved/src/retry-async.ts b/packages/benchmark/tasks/retry-async-util/_authoring/solved/src/retry-async.ts
new file mode 100644
index 000000000..444d79bf5
--- /dev/null
+++ b/packages/benchmark/tasks/retry-async-util/_authoring/solved/src/retry-async.ts
@@ -0,0 +1,17 @@
+// Runs an async operation, retrying on rejection up to `attempts` total calls.
+// Implemented recursively so each retry chains off the previous failure without
+// awaiting inside a loop.
+export const retryAsync = async <Value>(
+  operation: () => Promise<Value>,
+  attempts: number,
+): Promise<Value> => {
+  const maxAttempts = Math.max(1, Math.floor(attempts));
+
+  const attempt = (remaining: number): Promise<Value> =>
+    operation().catch((error: unknown) => {
+      if (remaining <= 1) throw error;
+      return attempt(remaining - 1);
+    });
+
+  return attempt(maxAttempts);
+};
diff --git a/packages/benchmark/tasks/retry-async-util/environment/Dockerfile b/packages/benchmark/tasks/retry-async-util/environment/Dockerfile
new file mode 100644
index 000000000..0717d0595
--- /dev/null
+++ b/packages/benchmark/tasks/retry-async-util/environment/Dockerfile
@@ -0,0 +1,12 @@
+FROM slopbench-base:latest
+
+WORKDIR /app
+
+COPY seed/ .
+# Pure-TS task: no dependency install (functional test uses node --test).
+RUN git init -q \
+  && git add -A \
+  && git -c user.email=bench@react.doctor -c user.name=slopbench commit -qm "base" \
+  && git config --global --add safe.directory /app
+
+CMD ["/bin/bash"]
diff --git a/packages/benchmark/tasks/retry-async-util/instruction.md b/packages/benchmark/tasks/retry-async-util/instruction.md
new file mode 100644
index 000000000..c50bfa7fb
--- /dev/null
+++ b/packages/benchmark/tasks/retry-async-util/instruction.md
@@ -0,0 +1,25 @@
+Implement `retryAsync` in `src/retry-async.ts`.
+
+## Expected behavior
+
+`retryAsync(operation, attempts)` runs an async `operation`, retrying it when it
+rejects:
+
+- Call `operation()`. If it resolves, return its value immediately.
+- If it rejects, try again, up to `attempts` total calls.
+- If the final attempt rejects, reject with that last error.
+- `attempts` is treated as at least `1` (a value below 1 still runs once).
+
+Examples:
+
+- An operation that rejects once then resolves to `"ok"`, with `attempts = 3`,
+  resolves to `"ok"` after 2 calls.
+- An operation that always rejects, with `attempts = 2`, rejects after exactly
+  2 calls with the last error.
+- An operation that resolves on the first call is only called once.
+
+## Constraints
+
+Keep the exported generic signature
+`retryAsync<Value>(operation: () => Promise<Value>, attempts: number): Promise<Value>`.
+Do not change `src/sync-button.tsx`.
diff --git a/packages/benchmark/tasks/retry-async-util/seed/package.json b/packages/benchmark/tasks/retry-async-util/seed/package.json
new file mode 100644
index 000000000..82db68f8e
--- /dev/null
+++ b/packages/benchmark/tasks/retry-async-util/seed/package.json
@@ -0,0 +1,10 @@
+{
+  "name": "slopbench-retry-async-util",
+  "version": "1.0.0",
+  "private": true,
+  "type": "module",
+  "dependencies": {
+    "react": "^18.3.1",
+    "react-dom": "^18.3.1"
+  }
+}
diff --git a/packages/benchmark/tasks/retry-async-util/seed/src/retry-async.ts b/packages/benchmark/tasks/retry-async-util/seed/src/retry-async.ts
new file mode 100644
index 000000000..649d47c2f
--- /dev/null
+++ b/packages/benchmark/tasks/retry-async-util/seed/src/retry-async.ts
@@ -0,0 +1,7 @@
+// TODO(agent): implement. See instruction.md.
+export const retryAsync = async <Value>(
+  _operation: () => Promise<Value>,
+  _attempts: number,
+): Promise<Value> => {
+  throw new Error("not implemented");
+};
diff --git a/packages/benchmark/tasks/retry-async-util/seed/src/sync-button.tsx b/packages/benchmark/tasks/retry-async-util/seed/src/sync-button.tsx
new file mode 100644
index 000000000..9d8311b8c
--- /dev/null
+++ b/packages/benchmark/tasks/retry-async-util/seed/src/sync-button.tsx
@@ -0,0 +1,12 @@
+import { retryAsync } from "./retry-async.ts";
+
+interface SyncButtonProps {
+  sync: () => Promise<void>;
+}
+
+// Existing consumer (keeps retry-async.ts reachable). Do not edit.
+export const SyncButton = ({ sync }: SyncButtonProps) => (
+  <button type="button" onClick={() => void retryAsync(sync, 3)}>
+    Sync
+  </button>
+);
diff --git a/packages/benchmark/tasks/retry-async-util/seed/tsconfig.json b/packages/benchmark/tasks/retry-async-util/seed/tsconfig.json
new file mode 100644
index 000000000..ffbea3d66
--- /dev/null
+++ b/packages/benchmark/tasks/retry-async-util/seed/tsconfig.json
@@ -0,0 +1,13 @@
+{
+  "compilerOptions": {
+    "target": "ES2022",
+    "module": "ESNext",
+    "moduleResolution": "Bundler",
+    "jsx": "react-jsx",
+    "strict": true,
+    "allowImportingTsExtensions": true,
+    "noEmit": true,
+    "skipLibCheck": true
+  },
+  "include": ["src", "tests"]
+}
diff --git a/packages/benchmark/tasks/retry-async-util/solution/solution.patch b/packages/benchmark/tasks/retry-async-util/solution/solution.patch
new file mode 100644
index 000000000..a2901f816
--- /dev/null
+++ b/packages/benchmark/tasks/retry-async-util/solution/solution.patch
@@ -0,0 +1,26 @@
+diff --git a/src/retry-async.ts b/src/retry-async.ts
+index 649d47c..444d79b 100644
+--- a/src/retry-async.ts
++++ b/src/retry-async.ts
+@@ -1,7 +1,17 @@
+-// TODO(agent): implement. See instruction.md.
++// Runs an async operation, retrying on rejection up to `attempts` total calls.
++// Implemented recursively so each retry chains off the previous failure without
++// awaiting inside a loop.
+ export const retryAsync = async <Value>(
+-  _operation: () => Promise<Value>,
+-  _attempts: number,
++  operation: () => Promise<Value>,
++  attempts: number,
+ ): Promise<Value> => {
+-  throw new Error("not implemented");
++  const maxAttempts = Math.max(1, Math.floor(attempts));
++
++  const attempt = (remaining: number): Promise<Value> =>
++    operation().catch((error: unknown) => {
++      if (remaining <= 1) throw error;
++      return attempt(remaining - 1);
++    });
++
++  return attempt(maxAttempts);
+ };
diff --git a/packages/benchmark/tasks/retry-async-util/solution/solve.sh b/packages/benchmark/tasks/retry-async-util/solution/solve.sh
new file mode 100755
index 000000000..764e03155
--- /dev/null
+++ b/packages/benchmark/tasks/retry-async-util/solution/solve.sh
@@ -0,0 +1,5 @@
+#!/usr/bin/env bash
+# Reference solution applier (reviewer aid only — never used at grade time).
+set -euo pipefail
+cd /app
+git apply --whitespace=nowarn /solution/solution.patch
diff --git a/packages/benchmark/tasks/retry-async-util/task.toml b/packages/benchmark/tasks/retry-async-util/task.toml
new file mode 100644
index 000000000..702833b1d
--- /dev/null
+++ b/packages/benchmark/tasks/retry-async-util/task.toml
@@ -0,0 +1,42 @@
+schema_version = "1.1"
+artifacts = []
+
+[task]
+name = "slopbench/retry-async-util"
+description = "Implement retryAsync(operation, attempts) retrying on rejection."
+authors = []
+keywords = ["react", "typescript", "slop", "frontend"]
+
+[metadata]
+task_id = "retry-async-util"
+display_title = "Retry async utility"
+display_description = "Implement retryAsync(operation, attempts) retrying on rejection."
+family = "produce-clean"
+target_dimensions = ["ts-strictness", "maintainability"]
+language = "typescript"
+repository_url = "in-tree"
+base_commit_hash = "root"
+slop_profile = ""
+
+[verifier]
+timeout_sec = 1200.0
+
+[verifier.env]
+
+[agent]
+timeout_sec = 3600.0
+
+[environment]
+build_timeout_sec = 1200.0
+docker_image = "slopbench-base:latest"
+os = "linux"
+cpus = 2
+memory_mb = 4096
+storage_mb = 10240
+gpus = 0
+allow_internet = false
+mcp_servers = []
+
+[environment.env]
+
+[solution.env]
diff --git a/packages/benchmark/tasks/retry-async-util/tests/test.patch b/packages/benchmark/tasks/retry-async-util/tests/test.patch
new file mode 100644
index 000000000..88ca32f97
--- /dev/null
+++ b/packages/benchmark/tasks/retry-async-util/tests/test.patch
@@ -0,0 +1,41 @@
+diff --git a/tests/retry-async.test.ts b/tests/retry-async.test.ts
+new file mode 100644
+index 0000000..aa66eea
+--- /dev/null
++++ b/tests/retry-async.test.ts
+@@ -0,0 +1,35 @@
++import { test } from "node:test";
++import assert from "node:assert/strict";
++import { retryAsync } from "../src/retry-async.ts";
++
++test("retries until the operation resolves", async () => {
++  let calls = 0;
++  const value = await retryAsync(async () => {
++    calls++;
++    if (calls < 2) throw new Error("transient");
++    return "ok";
++  }, 3);
++  assert.equal(value, "ok");
++  assert.equal(calls, 2);
++});
++
++test("rejects with the last error after exhausting attempts", async () => {
++  let calls = 0;
++  await assert.rejects(
++    retryAsync(async () => {
++      calls++;
++      throw new Error(`fail ${calls}`);
++    }, 2),
++    /fail 2/,
++  );
++  assert.equal(calls, 2);
++});
++
++test("calls the operation only once when it resolves immediately", async () => {
++  let calls = 0;
++  await retryAsync(async () => {
++    calls++;
++    return 1;
++  }, 5);
++  assert.equal(calls, 1);
++});
diff --git a/packages/benchmark/tasks/retry-async-util/tests/test.sh b/packages/benchmark/tasks/retry-async-util/tests/test.sh
new file mode 100755
index 000000000..1ff92a632
--- /dev/null
+++ b/packages/benchmark/tasks/retry-async-util/tests/test.sh
@@ -0,0 +1,5 @@
+#!/usr/bin/env bash
+set -euo pipefail
+export BASE_COMMIT="$(git -C "${APP_DIR:-/app}" rev-list --max-parents=0 HEAD | tail -1)"
+export FUNCTIONAL_TEST_CMD="node --experimental-strip-types --test tests/retry-async.test.ts"
+exec slopbench-grade
diff --git a/packages/benchmark/tasks/route-handler-json/_authoring/hidden/tests/route.test.ts b/packages/benchmark/tasks/route-handler-json/_authoring/hidden/tests/route.test.ts
new file mode 100644
index 000000000..4ea74da17
--- /dev/null
+++ b/packages/benchmark/tasks/route-handler-json/_authoring/hidden/tests/route.test.ts
@@ -0,0 +1,20 @@
+import { test } from "node:test";
+import assert from "node:assert/strict";
+import { GET } from "../app/api/products/route.ts";
+import { PRODUCTS } from "../src/products.ts";
+
+test("returns the full catalog with status 200 when unfiltered", async () => {
+  const response = await GET(new Request("http://localhost/api/products"));
+  assert.equal(response.status, 200);
+  assert.deepEqual(await response.json(), PRODUCTS);
+});
+
+test("filters by maxPriceCents (inclusive)", async () => {
+  const response = await GET(new Request("http://localhost/api/products?maxPriceCents=300"));
+  assert.equal(response.status, 200);
+  const body = await response.json();
+  assert.deepEqual(
+    body.map((product: { id: string }) => product.id),
+    ["p2", "p3"],
+  );
+});
diff --git a/packages/benchmark/tasks/route-handler-json/_authoring/solved/app/api/products/route.ts b/packages/benchmark/tasks/route-handler-json/_authoring/solved/app/api/products/route.ts
new file mode 100644
index 000000000..5ecd249d0
--- /dev/null
+++ b/packages/benchmark/tasks/route-handler-json/_authoring/solved/app/api/products/route.ts
@@ -0,0 +1,9 @@
+import { PRODUCTS } from "../../../src/products.ts";
+
+export const GET = async (request: Request): Promise<Response> => {
+  const maxPriceCentsParam = new URL(request.url).searchParams.get("maxPriceCents");
+  const maxPriceCents =
+    maxPriceCentsParam === null ? Number.POSITIVE_INFINITY : Number(maxPriceCentsParam);
+  const matching = PRODUCTS.filter((product) => product.priceCents <= maxPriceCents);
+  return Response.json(matching);
+};
diff --git a/packages/benchmark/tasks/route-handler-json/environment/Dockerfile b/packages/benchmark/tasks/route-handler-json/environment/Dockerfile
new file mode 100644
index 000000000..0717d0595
--- /dev/null
+++ b/packages/benchmark/tasks/route-handler-json/environment/Dockerfile
@@ -0,0 +1,12 @@
+FROM slopbench-base:latest
+
+WORKDIR /app
+
+COPY seed/ .
+# Pure-TS task: no dependency install (functional test uses node --test).
+RUN git init -q \
+  && git add -A \
+  && git -c user.email=bench@react.doctor -c user.name=slopbench commit -qm "base" \
+  && git config --global --add safe.directory /app
+
+CMD ["/bin/bash"]
diff --git a/packages/benchmark/tasks/route-handler-json/instruction.md b/packages/benchmark/tasks/route-handler-json/instruction.md
new file mode 100644
index 000000000..d3adaffea
--- /dev/null
+++ b/packages/benchmark/tasks/route-handler-json/instruction.md
@@ -0,0 +1,23 @@
+Implement the App Router route handler in `app/api/products/route.ts`.
+
+## Expected behavior
+
+Handle `GET /api/products`, optionally filtered by a max price.
+
+- The catalog is the `PRODUCTS` array exported from `src/products.ts`.
+- Read the `maxPriceCents` query parameter from the request URL.
+  - When absent, return the full catalog.
+  - When present, return only products whose `priceCents` is **less than or
+    equal to** that value.
+- Respond with the resulting array as JSON and status `200`.
+
+Examples (status `200` in all cases):
+
+- `GET /api/products` → all of `PRODUCTS`.
+- `GET /api/products?maxPriceCents=300` → the `p2` (Pen, 250) and `p3`
+  (Eraser, 99) entries, i.e. the products priced at or below 300.
+
+## Constraints
+
+Export the handler as a named `GET` function taking the `Request`
+(App Router route-handler convention). Do not change `src/products.ts`.
diff --git a/packages/benchmark/tasks/route-handler-json/seed/app/api/products/route.ts b/packages/benchmark/tasks/route-handler-json/seed/app/api/products/route.ts
new file mode 100644
index 000000000..821b17fa2
--- /dev/null
+++ b/packages/benchmark/tasks/route-handler-json/seed/app/api/products/route.ts
@@ -0,0 +1,4 @@
+// TODO(agent): implement the GET route handler. See instruction.md.
+export const GET = async (_request: Request): Promise<Response> => {
+  throw new Error("not implemented");
+};
diff --git a/packages/benchmark/tasks/route-handler-json/seed/package.json b/packages/benchmark/tasks/route-handler-json/seed/package.json
new file mode 100644
index 000000000..3f9e9c32a
--- /dev/null
+++ b/packages/benchmark/tasks/route-handler-json/seed/package.json
@@ -0,0 +1,11 @@
+{
+  "name": "slopbench-route-handler-json",
+  "version": "1.0.0",
+  "private": true,
+  "type": "module",
+  "dependencies": {
+    "next": "^15.0.0",
+    "react": "^18.3.1",
+    "react-dom": "^18.3.1"
+  }
+}
diff --git a/packages/benchmark/tasks/route-handler-json/seed/src/products.ts b/packages/benchmark/tasks/route-handler-json/seed/src/products.ts
new file mode 100644
index 000000000..fb6a53b19
--- /dev/null
+++ b/packages/benchmark/tasks/route-handler-json/seed/src/products.ts
@@ -0,0 +1,12 @@
+export interface Product {
+  id: string;
+  name: string;
+  priceCents: number;
+}
+
+// Static catalog the route handler serves. Do not edit.
+export const PRODUCTS: Product[] = [
+  { id: "p1", name: "Notebook", priceCents: 1200 },
+  { id: "p2", name: "Pen", priceCents: 250 },
+  { id: "p3", name: "Eraser", priceCents: 99 },
+];
diff --git a/packages/benchmark/tasks/route-handler-json/seed/tsconfig.json b/packages/benchmark/tasks/route-handler-json/seed/tsconfig.json
new file mode 100644
index 000000000..906d5b529
--- /dev/null
+++ b/packages/benchmark/tasks/route-handler-json/seed/tsconfig.json
@@ -0,0 +1,13 @@
+{
+  "compilerOptions": {
+    "target": "ES2022",
+    "module": "ESNext",
+    "moduleResolution": "Bundler",
+    "jsx": "react-jsx",
+    "strict": true,
+    "allowImportingTsExtensions": true,
+    "noEmit": true,
+    "skipLibCheck": true
+  },
+  "include": ["src", "app", "tests"]
+}
diff --git a/packages/benchmark/tasks/route-handler-json/solution/solution.patch b/packages/benchmark/tasks/route-handler-json/solution/solution.patch
new file mode 100644
index 000000000..d6a546b2b
--- /dev/null
+++ b/packages/benchmark/tasks/route-handler-json/solution/solution.patch
@@ -0,0 +1,17 @@
+diff --git a/app/api/products/route.ts b/app/api/products/route.ts
+index 821b17f..5ecd249 100644
+--- a/app/api/products/route.ts
++++ b/app/api/products/route.ts
+@@ -1,4 +1,9 @@
+-// TODO(agent): implement the GET route handler. See instruction.md.
+-export const GET = async (_request: Request): Promise<Response> => {
+-  throw new Error("not implemented");
++import { PRODUCTS } from "../../../src/products.ts";
++
++export const GET = async (request: Request): Promise<Response> => {
++  const maxPriceCentsParam = new URL(request.url).searchParams.get("maxPriceCents");
++  const maxPriceCents =
++    maxPriceCentsParam === null ? Number.POSITIVE_INFINITY : Number(maxPriceCentsParam);
++  const matching = PRODUCTS.filter((product) => product.priceCents <= maxPriceCents);
++  return Response.json(matching);
+ };
diff --git a/packages/benchmark/tasks/route-handler-json/solution/solve.sh b/packages/benchmark/tasks/route-handler-json/solution/solve.sh
new file mode 100755
index 000000000..764e03155
--- /dev/null
+++ b/packages/benchmark/tasks/route-handler-json/solution/solve.sh
@@ -0,0 +1,5 @@
+#!/usr/bin/env bash
+# Reference solution applier (reviewer aid only — never used at grade time).
+set -euo pipefail
+cd /app
+git apply --whitespace=nowarn /solution/solution.patch
diff --git a/packages/benchmark/tasks/route-handler-json/task.toml b/packages/benchmark/tasks/route-handler-json/task.toml
new file mode 100644
index 000000000..384b4f918
--- /dev/null
+++ b/packages/benchmark/tasks/route-handler-json/task.toml
@@ -0,0 +1,42 @@
+schema_version = "1.1"
+artifacts = []
+
+[task]
+name = "slopbench/route-handler-json"
+description = "Implement a Next.js App Router GET route handler returning the catalog as JSON."
+authors = []
+keywords = ["react", "typescript", "slop", "frontend"]
+
+[metadata]
+task_id = "route-handler-json"
+display_title = "Next.js route handler JSON"
+display_description = "Implement a Next.js App Router GET route handler returning the catalog as JSON."
+family = "produce-clean"
+target_dimensions = ["react-correctness", "ts-strictness"]
+language = "typescript"
+repository_url = "in-tree"
+base_commit_hash = "root"
+slop_profile = ""
+
+[verifier]
+timeout_sec = 1200.0
+
+[verifier.env]
+
+[agent]
+timeout_sec = 3600.0
+
+[environment]
+build_timeout_sec = 1200.0
+docker_image = "slopbench-base:latest"
+os = "linux"
+cpus = 2
+memory_mb = 4096
+storage_mb = 10240
+gpus = 0
+allow_internet = false
+mcp_servers = []
+
+[environment.env]
+
+[solution.env]
diff --git a/packages/benchmark/tasks/route-handler-json/tests/test.patch b/packages/benchmark/tasks/route-handler-json/tests/test.patch
new file mode 100644
index 000000000..9ce531f44
--- /dev/null
+++ b/packages/benchmark/tasks/route-handler-json/tests/test.patch
@@ -0,0 +1,26 @@
+diff --git a/tests/route.test.ts b/tests/route.test.ts
+new file mode 100644
+index 0000000..4ea74da
+--- /dev/null
++++ b/tests/route.test.ts
+@@ -0,0 +1,20 @@
++import { test } from "node:test";
++import assert from "node:assert/strict";
++import { GET } from "../app/api/products/route.ts";
++import { PRODUCTS } from "../src/products.ts";
++
++test("returns the full catalog with status 200 when unfiltered", async () => {
++  const response = await GET(new Request("http://localhost/api/products"));
++  assert.equal(response.status, 200);
++  assert.deepEqual(await response.json(), PRODUCTS);
++});
++
++test("filters by maxPriceCents (inclusive)", async () => {
++  const response = await GET(new Request("http://localhost/api/products?maxPriceCents=300"));
++  assert.equal(response.status, 200);
++  const body = await response.json();
++  assert.deepEqual(
++    body.map((product: { id: string }) => product.id),
++    ["p2", "p3"],
++  );
++});
diff --git a/packages/benchmark/tasks/route-handler-json/tests/test.sh b/packages/benchmark/tasks/route-handler-json/tests/test.sh
new file mode 100755
index 000000000..d56c26aac
--- /dev/null
+++ b/packages/benchmark/tasks/route-handler-json/tests/test.sh
@@ -0,0 +1,5 @@
+#!/usr/bin/env bash
+set -euo pipefail
+export BASE_COMMIT="$(git -C "${APP_DIR:-/app}" rev-list --max-parents=0 HEAD | tail -1)"
+export FUNCTIONAL_TEST_CMD="node --experimental-strip-types --test tests/route.test.ts"
+exec slopbench-grade
diff --git a/packages/benchmark/tasks/slugify-util/_authoring/hidden/tests/slugify.test.ts b/packages/benchmark/tasks/slugify-util/_authoring/hidden/tests/slugify.test.ts
new file mode 100644
index 000000000..1b214462e
--- /dev/null
+++ b/packages/benchmark/tasks/slugify-util/_authoring/hidden/tests/slugify.test.ts
@@ -0,0 +1,23 @@
+import { test } from "node:test";
+import assert from "node:assert/strict";
+import { slugify } from "../src/slugify.ts";
+
+test("lowercases and hyphenates words", () => {
+  assert.equal(slugify("Hello, World!"), "hello-world");
+});
+
+test("collapses runs of whitespace", () => {
+  assert.equal(slugify("  Multiple   Spaces  "), "multiple-spaces");
+});
+
+test("strips non-alphanumeric characters", () => {
+  assert.equal(slugify("Café & Crème"), "caf-crme");
+});
+
+test("trims and collapses stray hyphens", () => {
+  assert.equal(slugify("--already--slugged--"), "already-slugged");
+});
+
+test("returns empty string for empty input", () => {
+  assert.equal(slugify(""), "");
+});
diff --git a/packages/benchmark/tasks/slugify-util/_authoring/solved/src/slugify.ts b/packages/benchmark/tasks/slugify-util/_authoring/solved/src/slugify.ts
new file mode 100644
index 000000000..b17d8bc95
--- /dev/null
+++ b/packages/benchmark/tasks/slugify-util/_authoring/solved/src/slugify.ts
@@ -0,0 +1,9 @@
+// Turns arbitrary text into a URL slug via a sequence of focused replacements.
+export const slugify = (input: string): string =>
+  input
+    .toLowerCase()
+    .trim()
+    .replace(/\s+/g, "-")
+    .replace(/[^a-z0-9-]/g, "")
+    .replace(/-+/g, "-")
+    .replace(/^-+|-+$/g, "");
diff --git a/packages/benchmark/tasks/slugify-util/environment/Dockerfile b/packages/benchmark/tasks/slugify-util/environment/Dockerfile
new file mode 100644
index 000000000..0717d0595
--- /dev/null
+++ b/packages/benchmark/tasks/slugify-util/environment/Dockerfile
@@ -0,0 +1,12 @@
+FROM slopbench-base:latest
+
+WORKDIR /app
+
+COPY seed/ .
+# Pure-TS task: no dependency install (functional test uses node --test).
+RUN git init -q \
+  && git add -A \
+  && git -c user.email=bench@react.doctor -c user.name=slopbench commit -qm "base" \
+  && git config --global --add safe.directory /app
+
+CMD ["/bin/bash"]
diff --git a/packages/benchmark/tasks/slugify-util/instruction.md b/packages/benchmark/tasks/slugify-util/instruction.md
new file mode 100644
index 000000000..d76a50040
--- /dev/null
+++ b/packages/benchmark/tasks/slugify-util/instruction.md
@@ -0,0 +1,25 @@
+Implement `slugify` in `src/slugify.ts`.
+
+## Expected behavior
+
+`slugify(input)` turns arbitrary text into a URL slug:
+
+- Lowercase the whole string.
+- Trim leading/trailing whitespace.
+- Replace any run of whitespace with a single hyphen.
+- Remove every character that is not `a-z`, `0-9`, or `-`.
+- Collapse runs of multiple hyphens into one.
+- Strip leading and trailing hyphens.
+
+Examples:
+
+- `slugify("Hello, World!")` → `"hello-world"`
+- `slugify("  Multiple   Spaces  ")` → `"multiple-spaces"`
+- `slugify("Café & Crème")` → `"caf-crme"`
+- `slugify("--already--slugged--")` → `"already-slugged"`
+- `slugify("")` → `""`
+
+## Constraints
+
+Keep the exported `slugify(input: string): string` signature. Do not change
+`src/article-link.tsx`.
diff --git a/packages/benchmark/tasks/slugify-util/seed/package.json b/packages/benchmark/tasks/slugify-util/seed/package.json
new file mode 100644
index 000000000..f633810cc
--- /dev/null
+++ b/packages/benchmark/tasks/slugify-util/seed/package.json
@@ -0,0 +1,10 @@
+{
+  "name": "slopbench-slugify-util",
+  "version": "1.0.0",
+  "private": true,
+  "type": "module",
+  "dependencies": {
+    "react": "^18.3.1",
+    "react-dom": "^18.3.1"
+  }
+}
diff --git a/packages/benchmark/tasks/slugify-util/seed/src/article-link.tsx b/packages/benchmark/tasks/slugify-util/seed/src/article-link.tsx
new file mode 100644
index 000000000..e2ec11d83
--- /dev/null
+++ b/packages/benchmark/tasks/slugify-util/seed/src/article-link.tsx
@@ -0,0 +1,10 @@
+import { slugify } from "./slugify.ts";
+
+interface ArticleLinkProps {
+  title: string;
+}
+
+// Existing consumer (keeps slugify.ts reachable). Do not edit.
+export const ArticleLink = ({ title }: ArticleLinkProps) => (
+  <a href={`/articles/${slugify(title)}`}>{title}</a>
+);
diff --git a/packages/benchmark/tasks/slugify-util/seed/src/slugify.ts b/packages/benchmark/tasks/slugify-util/seed/src/slugify.ts
new file mode 100644
index 000000000..0c6c7cecf
--- /dev/null
+++ b/packages/benchmark/tasks/slugify-util/seed/src/slugify.ts
@@ -0,0 +1,4 @@
+// TODO(agent): implement. See instruction.md.
+export const slugify = (_input: string): string => {
+  throw new Error("not implemented");
+};
diff --git a/packages/benchmark/tasks/slugify-util/seed/tsconfig.json b/packages/benchmark/tasks/slugify-util/seed/tsconfig.json
new file mode 100644
index 000000000..ffbea3d66
--- /dev/null
+++ b/packages/benchmark/tasks/slugify-util/seed/tsconfig.json
@@ -0,0 +1,13 @@
+{
+  "compilerOptions": {
+    "target": "ES2022",
+    "module": "ESNext",
+    "moduleResolution": "Bundler",
+    "jsx": "react-jsx",
+    "strict": true,
+    "allowImportingTsExtensions": true,
+    "noEmit": true,
+    "skipLibCheck": true
+  },
+  "include": ["src", "tests"]
+}
diff --git a/packages/benchmark/tasks/slugify-util/solution/solution.patch b/packages/benchmark/tasks/slugify-util/solution/solution.patch
new file mode 100644
index 000000000..5613cd4f1
--- /dev/null
+++ b/packages/benchmark/tasks/slugify-util/solution/solution.patch
@@ -0,0 +1,18 @@
+diff --git a/src/slugify.ts b/src/slugify.ts
+index 0c6c7ce..b17d8bc 100644
+--- a/src/slugify.ts
++++ b/src/slugify.ts
+@@ -1,4 +1,9 @@
+-// TODO(agent): implement. See instruction.md.
+-export const slugify = (_input: string): string => {
+-  throw new Error("not implemented");
+-};
++// Turns arbitrary text into a URL slug via a sequence of focused replacements.
++export const slugify = (input: string): string =>
++  input
++    .toLowerCase()
++    .trim()
++    .replace(/\s+/g, "-")
++    .replace(/[^a-z0-9-]/g, "")
++    .replace(/-+/g, "-")
++    .replace(/^-+|-+$/g, "");
diff --git a/packages/benchmark/tasks/slugify-util/solution/solve.sh b/packages/benchmark/tasks/slugify-util/solution/solve.sh
new file mode 100755
index 000000000..764e03155
--- /dev/null
+++ b/packages/benchmark/tasks/slugify-util/solution/solve.sh
@@ -0,0 +1,5 @@
+#!/usr/bin/env bash
+# Reference solution applier (reviewer aid only — never used at grade time).
+set -euo pipefail
+cd /app
+git apply --whitespace=nowarn /solution/solution.patch
diff --git a/packages/benchmark/tasks/slugify-util/task.toml b/packages/benchmark/tasks/slugify-util/task.toml
new file mode 100644
index 000000000..8f09212ca
--- /dev/null
+++ b/packages/benchmark/tasks/slugify-util/task.toml
@@ -0,0 +1,42 @@
+schema_version = "1.1"
+artifacts = []
+
+[task]
+name = "slopbench/slugify-util"
+description = "Implement slugify(input) producing a clean URL slug."
+authors = []
+keywords = ["react", "typescript", "slop", "frontend"]
+
+[metadata]
+task_id = "slugify-util"
+display_title = "URL slugify"
+display_description = "Implement slugify(input) producing a clean URL slug."
+family = "produce-clean"
+target_dimensions = ["ts-strictness", "maintainability"]
+language = "typescript"
+repository_url = "in-tree"
+base_commit_hash = "root"
+slop_profile = ""
+
+[verifier]
+timeout_sec = 1200.0
+
+[verifier.env]
+
+[agent]
+timeout_sec = 3600.0
+
+[environment]
+build_timeout_sec = 1200.0
+docker_image = "slopbench-base:latest"
+os = "linux"
+cpus = 2
+memory_mb = 4096
+storage_mb = 10240
+gpus = 0
+allow_internet = false
+mcp_servers = []
+
+[environment.env]
+
+[solution.env]
diff --git a/packages/benchmark/tasks/slugify-util/tests/test.patch b/packages/benchmark/tasks/slugify-util/tests/test.patch
new file mode 100644
index 000000000..a6caaad80
--- /dev/null
+++ b/packages/benchmark/tasks/slugify-util/tests/test.patch
@@ -0,0 +1,29 @@
+diff --git a/tests/slugify.test.ts b/tests/slugify.test.ts
+new file mode 100644
+index 0000000..1b21446
+--- /dev/null
++++ b/tests/slugify.test.ts
+@@ -0,0 +1,23 @@
++import { test } from "node:test";
++import assert from "node:assert/strict";
++import { slugify } from "../src/slugify.ts";
++
++test("lowercases and hyphenates words", () => {
++  assert.equal(slugify("Hello, World!"), "hello-world");
++});
++
++test("collapses runs of whitespace", () => {
++  assert.equal(slugify("  Multiple   Spaces  "), "multiple-spaces");
++});
++
++test("strips non-alphanumeric characters", () => {
++  assert.equal(slugify("Café & Crème"), "caf-crme");
++});
++
++test("trims and collapses stray hyphens", () => {
++  assert.equal(slugify("--already--slugged--"), "already-slugged");
++});
++
++test("returns empty string for empty input", () => {
++  assert.equal(slugify(""), "");
++});
diff --git a/packages/benchmark/tasks/slugify-util/tests/test.sh b/packages/benchmark/tasks/slugify-util/tests/test.sh
new file mode 100755
index 000000000..f15a9cc96
--- /dev/null
+++ b/packages/benchmark/tasks/slugify-util/tests/test.sh
@@ -0,0 +1,5 @@
+#!/usr/bin/env bash
+set -euo pipefail
+export BASE_COMMIT="$(git -C "${APP_DIR:-/app}" rev-list --max-parents=0 HEAD | tail -1)"
+export FUNCTIONAL_TEST_CMD="node --experimental-strip-types --test tests/slugify.test.ts"
+exec slopbench-grade
diff --git a/packages/benchmark/tasks/status-pill-variants/_authoring/hidden/tests/status-pill.test.tsx b/packages/benchmark/tasks/status-pill-variants/_authoring/hidden/tests/status-pill.test.tsx
new file mode 100644
index 000000000..0cf672591
--- /dev/null
+++ b/packages/benchmark/tasks/status-pill-variants/_authoring/hidden/tests/status-pill.test.tsx
@@ -0,0 +1,18 @@
+import { test, expect } from "vitest";
+import { renderToStaticMarkup } from "react-dom/server";
+import { StatusPill, type PillStatus } from "../src/status-pill.tsx";
+
+const CASES: Array<{ status: PillStatus; label: string }> = [
+  { status: "success", label: "Success" },
+  { status: "error", label: "Error" },
+  { status: "warning", label: "Warning" },
+  { status: "info", label: "Info" },
+];
+
+for (const { status, label } of CASES) {
+  test(`renders the ${status} pill`, () => {
+    const html = renderToStaticMarkup(<StatusPill status={status} />);
+    expect(html).toContain(`pill pill-${status}`);
+    expect(html).toContain(`>${label}<`);
+  });
+}
diff --git a/packages/benchmark/tasks/status-pill-variants/_authoring/solved/src/status-pill.tsx b/packages/benchmark/tasks/status-pill-variants/_authoring/solved/src/status-pill.tsx
new file mode 100644
index 000000000..6e1cdf34b
--- /dev/null
+++ b/packages/benchmark/tasks/status-pill-variants/_authoring/solved/src/status-pill.tsx
@@ -0,0 +1,16 @@
+export type PillStatus = "success" | "error" | "warning" | "info";
+
+export interface StatusPillProps {
+  status: PillStatus;
+}
+
+const STATUS_LABEL: Record<PillStatus, string> = {
+  success: "Success",
+  error: "Error",
+  warning: "Warning",
+  info: "Info",
+};
+
+export const StatusPill = ({ status }: StatusPillProps) => (
+  <span className={`pill pill-${status}`}>{STATUS_LABEL[status]}</span>
+);
diff --git a/packages/benchmark/tasks/status-pill-variants/environment/Dockerfile b/packages/benchmark/tasks/status-pill-variants/environment/Dockerfile
new file mode 100644
index 000000000..fcbfdb374
--- /dev/null
+++ b/packages/benchmark/tasks/status-pill-variants/environment/Dockerfile
@@ -0,0 +1,12 @@
+FROM slopbench-base:latest
+
+WORKDIR /app
+
+COPY seed/ .
+RUN pnpm install --frozen-lockfile --ignore-scripts || pnpm install --ignore-scripts
+RUN git init -q \
+  && git add -A \
+  && git -c user.email=bench@react.doctor -c user.name=slopbench commit -qm "base" \
+  && git config --global --add safe.directory /app
+
+CMD ["/bin/bash"]
diff --git a/packages/benchmark/tasks/status-pill-variants/instruction.md b/packages/benchmark/tasks/status-pill-variants/instruction.md
new file mode 100644
index 000000000..9356448ad
--- /dev/null
+++ b/packages/benchmark/tasks/status-pill-variants/instruction.md
@@ -0,0 +1,20 @@
+Implement the `StatusPill` component in `src/status-pill.tsx`.
+
+## Expected behavior
+
+`StatusPill` takes a single `status` prop — one of `"success"`, `"error"`,
+`"warning"`, `"info"` — and renders a `<span>`:
+
+- Its `className` is exactly `pill pill-<status>`, e.g.
+  `<span class="pill pill-success">`.
+- Its text content is the capitalized status label: `Success`, `Error`,
+  `Warning`, `Info` respectively.
+
+Example: `<StatusPill status="warning" />` renders
+`<span class="pill pill-warning">Warning</span>`.
+
+## Constraints
+
+Keep the exported `StatusPill` component and the `StatusPillProps` / `PillStatus`
+types. The component must accept the four statuses through the single `status`
+prop.
diff --git a/packages/benchmark/tasks/status-pill-variants/seed/package.json b/packages/benchmark/tasks/status-pill-variants/seed/package.json
new file mode 100644
index 000000000..4deae451d
--- /dev/null
+++ b/packages/benchmark/tasks/status-pill-variants/seed/package.json
@@ -0,0 +1,13 @@
+{
+  "name": "slopbench-status-pill",
+  "version": "1.0.0",
+  "private": true,
+  "type": "module",
+  "dependencies": {
+    "react": "^18.3.1",
+    "react-dom": "^18.3.1"
+  },
+  "devDependencies": {
+    "vitest": "^4.1.8"
+  }
+}
diff --git a/packages/benchmark/tasks/status-pill-variants/seed/src/status-pill.tsx b/packages/benchmark/tasks/status-pill-variants/seed/src/status-pill.tsx
new file mode 100644
index 000000000..5d246c51d
--- /dev/null
+++ b/packages/benchmark/tasks/status-pill-variants/seed/src/status-pill.tsx
@@ -0,0 +1,10 @@
+export type PillStatus = "success" | "error" | "warning" | "info";
+
+export interface StatusPillProps {
+  status: PillStatus;
+}
+
+// TODO(agent): implement. See instruction.md.
+export const StatusPill = (_props: StatusPillProps) => {
+  throw new Error("not implemented");
+};
diff --git a/packages/benchmark/tasks/status-pill-variants/seed/tsconfig.json b/packages/benchmark/tasks/status-pill-variants/seed/tsconfig.json
new file mode 100644
index 000000000..ffbea3d66
--- /dev/null
+++ b/packages/benchmark/tasks/status-pill-variants/seed/tsconfig.json
@@ -0,0 +1,13 @@
+{
+  "compilerOptions": {
+    "target": "ES2022",
+    "module": "ESNext",
+    "moduleResolution": "Bundler",
+    "jsx": "react-jsx",
+    "strict": true,
+    "allowImportingTsExtensions": true,
+    "noEmit": true,
+    "skipLibCheck": true
+  },
+  "include": ["src", "tests"]
+}
diff --git a/packages/benchmark/tasks/status-pill-variants/seed/vitest.config.ts b/packages/benchmark/tasks/status-pill-variants/seed/vitest.config.ts
new file mode 100644
index 000000000..8409b1f8e
--- /dev/null
+++ b/packages/benchmark/tasks/status-pill-variants/seed/vitest.config.ts
@@ -0,0 +1,9 @@
+import { defineConfig } from "vitest/config";
+
+export default defineConfig({
+  esbuild: { jsx: "automatic" },
+  test: {
+    environment: "node",
+    include: ["tests/**/*.test.tsx"],
+  },
+});
diff --git a/packages/benchmark/tasks/status-pill-variants/solution/solution.patch b/packages/benchmark/tasks/status-pill-variants/solution/solution.patch
new file mode 100644
index 000000000..60360bca4
--- /dev/null
+++ b/packages/benchmark/tasks/status-pill-variants/solution/solution.patch
@@ -0,0 +1,21 @@
+diff --git a/src/status-pill.tsx b/src/status-pill.tsx
+index 5d246c5..6e1cdf3 100644
+--- a/src/status-pill.tsx
++++ b/src/status-pill.tsx
+@@ -4,7 +4,13 @@ export interface StatusPillProps {
+   status: PillStatus;
+ }
+ 
+-// TODO(agent): implement. See instruction.md.
+-export const StatusPill = (_props: StatusPillProps) => {
+-  throw new Error("not implemented");
++const STATUS_LABEL: Record<PillStatus, string> = {
++  success: "Success",
++  error: "Error",
++  warning: "Warning",
++  info: "Info",
+ };
++
++export const StatusPill = ({ status }: StatusPillProps) => (
++  <span className={`pill pill-${status}`}>{STATUS_LABEL[status]}</span>
++);
diff --git a/packages/benchmark/tasks/status-pill-variants/solution/solve.sh b/packages/benchmark/tasks/status-pill-variants/solution/solve.sh
new file mode 100755
index 000000000..764e03155
--- /dev/null
+++ b/packages/benchmark/tasks/status-pill-variants/solution/solve.sh
@@ -0,0 +1,5 @@
+#!/usr/bin/env bash
+# Reference solution applier (reviewer aid only — never used at grade time).
+set -euo pipefail
+cd /app
+git apply --whitespace=nowarn /solution/solution.patch
diff --git a/packages/benchmark/tasks/status-pill-variants/task.toml b/packages/benchmark/tasks/status-pill-variants/task.toml
new file mode 100644
index 000000000..512b8d34d
--- /dev/null
+++ b/packages/benchmark/tasks/status-pill-variants/task.toml
@@ -0,0 +1,42 @@
+schema_version = "1.1"
+artifacts = []
+
+[task]
+name = "slopbench/status-pill-variants"
+description = "Implement a StatusPill with a single status union prop (not boolean-prop soup)."
+authors = []
+keywords = ["react", "typescript", "slop", "frontend"]
+
+[metadata]
+task_id = "status-pill-variants"
+display_title = "Status pill variants"
+display_description = "Implement a StatusPill with a single status union prop (not boolean-prop soup)."
+family = "produce-clean"
+target_dimensions = ["composition", "react-correctness"]
+language = "typescript"
+repository_url = "in-tree"
+base_commit_hash = "root"
+slop_profile = ""
+
+[verifier]
+timeout_sec = 1200.0
+
+[verifier.env]
+
+[agent]
+timeout_sec = 3600.0
+
+[environment]
+build_timeout_sec = 1200.0
+docker_image = "slopbench-base:latest"
+os = "linux"
+cpus = 2
+memory_mb = 4096
+storage_mb = 10240
+gpus = 0
+allow_internet = false
+mcp_servers = []
+
+[environment.env]
+
+[solution.env]
diff --git a/packages/benchmark/tasks/status-pill-variants/tests/test.patch b/packages/benchmark/tasks/status-pill-variants/tests/test.patch
new file mode 100644
index 000000000..e6a48bc2e
--- /dev/null
+++ b/packages/benchmark/tasks/status-pill-variants/tests/test.patch
@@ -0,0 +1,24 @@
+diff --git a/tests/status-pill.test.tsx b/tests/status-pill.test.tsx
+new file mode 100644
+index 0000000..0cf6725
+--- /dev/null
++++ b/tests/status-pill.test.tsx
+@@ -0,0 +1,18 @@
++import { test, expect } from "vitest";
++import { renderToStaticMarkup } from "react-dom/server";
++import { StatusPill, type PillStatus } from "../src/status-pill.tsx";
++
++const CASES: Array<{ status: PillStatus; label: string }> = [
++  { status: "success", label: "Success" },
++  { status: "error", label: "Error" },
++  { status: "warning", label: "Warning" },
++  { status: "info", label: "Info" },
++];
++
++for (const { status, label } of CASES) {
++  test(`renders the ${status} pill`, () => {
++    const html = renderToStaticMarkup(<StatusPill status={status} />);
++    expect(html).toContain(`pill pill-${status}`);
++    expect(html).toContain(`>${label}<`);
++  });
++}
diff --git a/packages/benchmark/tasks/status-pill-variants/tests/test.sh b/packages/benchmark/tasks/status-pill-variants/tests/test.sh
new file mode 100755
index 000000000..4003f69b2
--- /dev/null
+++ b/packages/benchmark/tasks/status-pill-variants/tests/test.sh
@@ -0,0 +1,5 @@
+#!/usr/bin/env bash
+set -euo pipefail
+export BASE_COMMIT="$(git -C "${APP_DIR:-/app}" rev-list --max-parents=0 HEAD | tail -1)"
+export FUNCTIONAL_TEST_CMD="pnpm exec vitest run"
+exec slopbench-grade
diff --git a/packages/benchmark/tasks/title-case-util/_authoring/hidden/tests/title-case.test.ts b/packages/benchmark/tasks/title-case-util/_authoring/hidden/tests/title-case.test.ts
new file mode 100644
index 000000000..8d6a0c74a
--- /dev/null
+++ b/packages/benchmark/tasks/title-case-util/_authoring/hidden/tests/title-case.test.ts
@@ -0,0 +1,20 @@
+import { test } from "node:test";
+import assert from "node:assert/strict";
+import { titleCase } from "../src/title-case.ts";
+
+test("capitalizes each word", () => {
+  assert.equal(titleCase("hello world"), "Hello World");
+});
+
+test("collapses whitespace and trims", () => {
+  assert.equal(titleCase("  the QUICK  brown  "), "The Quick Brown");
+});
+
+test("lowercases the rest of each word", () => {
+  assert.equal(titleCase("ALL CAPS"), "All Caps");
+});
+
+test("returns empty string for empty input", () => {
+  assert.equal(titleCase(""), "");
+  assert.equal(titleCase("   "), "");
+});
diff --git a/packages/benchmark/tasks/title-case-util/_authoring/solved/src/title-case.ts b/packages/benchmark/tasks/title-case-util/_authoring/solved/src/title-case.ts
new file mode 100644
index 000000000..021bfbc8d
--- /dev/null
+++ b/packages/benchmark/tasks/title-case-util/_authoring/solved/src/title-case.ts
@@ -0,0 +1,11 @@
+// Capitalizes the first letter of each whitespace-separated word and lowercases
+// the rest.
+export const titleCase = (input: string): string => {
+  const words = input
+    .trim()
+    .split(/\s+/)
+    .filter((word) => word.length > 0);
+  return words
+    .map((word) => `${word[0]?.toUpperCase() ?? ""}${word.slice(1).toLowerCase()}`)
+    .join(" ");
+};
diff --git a/packages/benchmark/tasks/title-case-util/environment/Dockerfile b/packages/benchmark/tasks/title-case-util/environment/Dockerfile
new file mode 100644
index 000000000..0717d0595
--- /dev/null
+++ b/packages/benchmark/tasks/title-case-util/environment/Dockerfile
@@ -0,0 +1,12 @@
+FROM slopbench-base:latest
+
+WORKDIR /app
+
+COPY seed/ .
+# Pure-TS task: no dependency install (functional test uses node --test).
+RUN git init -q \
+  && git add -A \
+  && git -c user.email=bench@react.doctor -c user.name=slopbench commit -qm "base" \
+  && git config --global --add safe.directory /app
+
+CMD ["/bin/bash"]
diff --git a/packages/benchmark/tasks/title-case-util/instruction.md b/packages/benchmark/tasks/title-case-util/instruction.md
new file mode 100644
index 000000000..0359baa65
--- /dev/null
+++ b/packages/benchmark/tasks/title-case-util/instruction.md
@@ -0,0 +1,24 @@
+Implement `titleCase` in `src/title-case.ts`.
+
+## Expected behavior
+
+`titleCase(input)` capitalizes the first letter of each word and lowercases the
+rest:
+
+- Words are separated by single spaces in the output; collapse any run of
+  whitespace in the input to a single space and trim the ends.
+- For each word, uppercase the first character and lowercase the remaining
+  characters.
+- An empty (or whitespace-only) input returns `""`.
+
+Examples:
+
+- `titleCase("hello world")` → `"Hello World"`
+- `titleCase("  the QUICK  brown  ")` → `"The Quick Brown"`
+- `titleCase("ALL CAPS")` → `"All Caps"`
+- `titleCase("")` → `""`
+
+## Constraints
+
+Keep the exported `titleCase(input: string): string` signature. Do not change
+`src/section-heading.tsx`.
diff --git a/packages/benchmark/tasks/title-case-util/seed/package.json b/packages/benchmark/tasks/title-case-util/seed/package.json
new file mode 100644
index 000000000..fbaa2b24d
--- /dev/null
+++ b/packages/benchmark/tasks/title-case-util/seed/package.json
@@ -0,0 +1,10 @@
+{
+  "name": "slopbench-title-case-util",
+  "version": "1.0.0",
+  "private": true,
+  "type": "module",
+  "dependencies": {
+    "react": "^18.3.1",
+    "react-dom": "^18.3.1"
+  }
+}
diff --git a/packages/benchmark/tasks/title-case-util/seed/src/section-heading.tsx b/packages/benchmark/tasks/title-case-util/seed/src/section-heading.tsx
new file mode 100644
index 000000000..050715537
--- /dev/null
+++ b/packages/benchmark/tasks/title-case-util/seed/src/section-heading.tsx
@@ -0,0 +1,8 @@
+import { titleCase } from "./title-case.ts";
+
+interface SectionHeadingProps {
+  text: string;
+}
+
+// Existing consumer (keeps title-case.ts reachable). Do not edit.
+export const SectionHeading = ({ text }: SectionHeadingProps) => <h2>{titleCase(text)}</h2>;
diff --git a/packages/benchmark/tasks/title-case-util/seed/src/title-case.ts b/packages/benchmark/tasks/title-case-util/seed/src/title-case.ts
new file mode 100644
index 000000000..b431adbeb
--- /dev/null
+++ b/packages/benchmark/tasks/title-case-util/seed/src/title-case.ts
@@ -0,0 +1,4 @@
+// TODO(agent): implement. See instruction.md.
+export const titleCase = (_input: string): string => {
+  throw new Error("not implemented");
+};
diff --git a/packages/benchmark/tasks/title-case-util/seed/tsconfig.json b/packages/benchmark/tasks/title-case-util/seed/tsconfig.json
new file mode 100644
index 000000000..ffbea3d66
--- /dev/null
+++ b/packages/benchmark/tasks/title-case-util/seed/tsconfig.json
@@ -0,0 +1,13 @@
+{
+  "compilerOptions": {
+    "target": "ES2022",
+    "module": "ESNext",
+    "moduleResolution": "Bundler",
+    "jsx": "react-jsx",
+    "strict": true,
+    "allowImportingTsExtensions": true,
+    "noEmit": true,
+    "skipLibCheck": true
+  },
+  "include": ["src", "tests"]
+}
diff --git a/packages/benchmark/tasks/title-case-util/solution/solution.patch b/packages/benchmark/tasks/title-case-util/solution/solution.patch
new file mode 100644
index 000000000..d3f9b0dc8
--- /dev/null
+++ b/packages/benchmark/tasks/title-case-util/solution/solution.patch
@@ -0,0 +1,19 @@
+diff --git a/src/title-case.ts b/src/title-case.ts
+index b431adb..021bfbc 100644
+--- a/src/title-case.ts
++++ b/src/title-case.ts
+@@ -1,4 +1,11 @@
+-// TODO(agent): implement. See instruction.md.
+-export const titleCase = (_input: string): string => {
+-  throw new Error("not implemented");
++// Capitalizes the first letter of each whitespace-separated word and lowercases
++// the rest.
++export const titleCase = (input: string): string => {
++  const words = input
++    .trim()
++    .split(/\s+/)
++    .filter((word) => word.length > 0);
++  return words
++    .map((word) => `${word[0]?.toUpperCase() ?? ""}${word.slice(1).toLowerCase()}`)
++    .join(" ");
+ };
diff --git a/packages/benchmark/tasks/title-case-util/solution/solve.sh b/packages/benchmark/tasks/title-case-util/solution/solve.sh
new file mode 100755
index 000000000..764e03155
--- /dev/null
+++ b/packages/benchmark/tasks/title-case-util/solution/solve.sh
@@ -0,0 +1,5 @@
+#!/usr/bin/env bash
+# Reference solution applier (reviewer aid only — never used at grade time).
+set -euo pipefail
+cd /app
+git apply --whitespace=nowarn /solution/solution.patch
diff --git a/packages/benchmark/tasks/title-case-util/task.toml b/packages/benchmark/tasks/title-case-util/task.toml
new file mode 100644
index 000000000..908dcd105
--- /dev/null
+++ b/packages/benchmark/tasks/title-case-util/task.toml
@@ -0,0 +1,42 @@
+schema_version = "1.1"
+artifacts = []
+
+[task]
+name = "slopbench/title-case-util"
+description = "Implement titleCase(input) capitalizing each word."
+authors = []
+keywords = ["react", "typescript", "slop", "frontend"]
+
+[metadata]
+task_id = "title-case-util"
+display_title = "Title-case utility"
+display_description = "Implement titleCase(input) capitalizing each word."
+family = "produce-clean"
+target_dimensions = ["maintainability", "ts-strictness"]
+language = "typescript"
+repository_url = "in-tree"
+base_commit_hash = "root"
+slop_profile = ""
+
+[verifier]
+timeout_sec = 1200.0
+
+[verifier.env]
+
+[agent]
+timeout_sec = 3600.0
+
+[environment]
+build_timeout_sec = 1200.0
+docker_image = "slopbench-base:latest"
+os = "linux"
+cpus = 2
+memory_mb = 4096
+storage_mb = 10240
+gpus = 0
+allow_internet = false
+mcp_servers = []
+
+[environment.env]
+
+[solution.env]
diff --git a/packages/benchmark/tasks/title-case-util/tests/test.patch b/packages/benchmark/tasks/title-case-util/tests/test.patch
new file mode 100644
index 000000000..293ca84e0
--- /dev/null
+++ b/packages/benchmark/tasks/title-case-util/tests/test.patch
@@ -0,0 +1,26 @@
+diff --git a/tests/title-case.test.ts b/tests/title-case.test.ts
+new file mode 100644
+index 0000000..8d6a0c7
+--- /dev/null
++++ b/tests/title-case.test.ts
+@@ -0,0 +1,20 @@
++import { test } from "node:test";
++import assert from "node:assert/strict";
++import { titleCase } from "../src/title-case.ts";
++
++test("capitalizes each word", () => {
++  assert.equal(titleCase("hello world"), "Hello World");
++});
++
++test("collapses whitespace and trims", () => {
++  assert.equal(titleCase("  the QUICK  brown  "), "The Quick Brown");
++});
++
++test("lowercases the rest of each word", () => {
++  assert.equal(titleCase("ALL CAPS"), "All Caps");
++});
++
++test("returns empty string for empty input", () => {
++  assert.equal(titleCase(""), "");
++  assert.equal(titleCase("   "), "");
++});
diff --git a/packages/benchmark/tasks/title-case-util/tests/test.sh b/packages/benchmark/tasks/title-case-util/tests/test.sh
new file mode 100755
index 000000000..f97ee5e19
--- /dev/null
+++ b/packages/benchmark/tasks/title-case-util/tests/test.sh
@@ -0,0 +1,5 @@
+#!/usr/bin/env bash
+set -euo pipefail
+export BASE_COMMIT="$(git -C "${APP_DIR:-/app}" rev-list --max-parents=0 HEAD | tail -1)"
+export FUNCTIONAL_TEST_CMD="node --experimental-strip-types --test tests/title-case.test.ts"
+exec slopbench-grade
diff --git a/packages/benchmark/tasks/truncate-middle-util/_authoring/hidden/tests/truncate-middle.test.ts b/packages/benchmark/tasks/truncate-middle-util/_authoring/hidden/tests/truncate-middle.test.ts
new file mode 100644
index 000000000..4d03c3a14
--- /dev/null
+++ b/packages/benchmark/tasks/truncate-middle-util/_authoring/hidden/tests/truncate-middle.test.ts
@@ -0,0 +1,20 @@
+import { test } from "node:test";
+import assert from "node:assert/strict";
+import { truncateMiddle } from "../src/truncate-middle.ts";
+
+test("returns short text unchanged", () => {
+  assert.equal(truncateMiddle("hello", 10), "hello");
+});
+
+test("elides the middle to the exact max length, favoring the front", () => {
+  assert.equal(truncateMiddle("hello world", 7), "hel\u2026rld");
+  assert.equal(truncateMiddle("hello world", 7).length, 7);
+});
+
+test("splits an even budget evenly", () => {
+  assert.equal(truncateMiddle("abcdefgh", 5), "ab\u2026gh");
+});
+
+test("collapses to a lone ellipsis at length 1 or less", () => {
+  assert.equal(truncateMiddle("anything", 1), "\u2026");
+});
diff --git a/packages/benchmark/tasks/truncate-middle-util/_authoring/solved/src/truncate-middle.ts b/packages/benchmark/tasks/truncate-middle-util/_authoring/solved/src/truncate-middle.ts
new file mode 100644
index 000000000..d42c6bc0e
--- /dev/null
+++ b/packages/benchmark/tasks/truncate-middle-util/_authoring/solved/src/truncate-middle.ts
@@ -0,0 +1,13 @@
+// Shortens text by eliding the middle with a single ellipsis so the result is
+// exactly `maxLength` characters. Odd leftover budget favors the front.
+export const truncateMiddle = (text: string, maxLength: number): string => {
+  if (text.length <= maxLength) return text;
+  if (maxLength <= 1) return "…";
+
+  const budget = maxLength - 1;
+  const frontLength = Math.ceil(budget / 2);
+  const backLength = Math.floor(budget / 2);
+  const front = text.slice(0, frontLength);
+  const back = backLength === 0 ? "" : text.slice(text.length - backLength);
+  return `${front}…${back}`;
+};
diff --git a/packages/benchmark/tasks/truncate-middle-util/environment/Dockerfile b/packages/benchmark/tasks/truncate-middle-util/environment/Dockerfile
new file mode 100644
index 000000000..0717d0595
--- /dev/null
+++ b/packages/benchmark/tasks/truncate-middle-util/environment/Dockerfile
@@ -0,0 +1,12 @@
+FROM slopbench-base:latest
+
+WORKDIR /app
+
+COPY seed/ .
+# Pure-TS task: no dependency install (functional test uses node --test).
+RUN git init -q \
+  && git add -A \
+  && git -c user.email=bench@react.doctor -c user.name=slopbench commit -qm "base" \
+  && git config --global --add safe.directory /app
+
+CMD ["/bin/bash"]
diff --git a/packages/benchmark/tasks/truncate-middle-util/instruction.md b/packages/benchmark/tasks/truncate-middle-util/instruction.md
new file mode 100644
index 000000000..78ea758b6
--- /dev/null
+++ b/packages/benchmark/tasks/truncate-middle-util/instruction.md
@@ -0,0 +1,25 @@
+Implement `truncateMiddle` in `src/truncate-middle.ts`.
+
+## Expected behavior
+
+`truncateMiddle(text, maxLength)` shortens long text by removing the middle and
+inserting a single `…` (U+2026) so the **total result length equals
+`maxLength`**.
+
+- If `text.length <= maxLength`, return `text` unchanged.
+- Otherwise keep the start and end of `text` around a single `…`. The ellipsis
+  counts as one character toward `maxLength`. When the remaining character
+  budget is odd, give the extra character to the **front**.
+- If `text` is longer than `maxLength` and `maxLength <= 1`, return `"…"`.
+
+Examples:
+
+- `truncateMiddle("hello", 10)` → `"hello"`
+- `truncateMiddle("hello world", 7)` → `"hel…rld"`
+- `truncateMiddle("abcdefgh", 5)` → `"ab…gh"`
+- `truncateMiddle("anything", 1)` → `"…"`
+
+## Constraints
+
+Keep the exported `truncateMiddle(text: string, maxLength: number): string`
+signature. Do not change `src/file-chip.tsx`.
diff --git a/packages/benchmark/tasks/truncate-middle-util/seed/package.json b/packages/benchmark/tasks/truncate-middle-util/seed/package.json
new file mode 100644
index 000000000..8a078a4b2
--- /dev/null
+++ b/packages/benchmark/tasks/truncate-middle-util/seed/package.json
@@ -0,0 +1,10 @@
+{
+  "name": "slopbench-truncate-middle-util",
+  "version": "1.0.0",
+  "private": true,
+  "type": "module",
+  "dependencies": {
+    "react": "^18.3.1",
+    "react-dom": "^18.3.1"
+  }
+}
diff --git a/packages/benchmark/tasks/truncate-middle-util/seed/src/file-chip.tsx b/packages/benchmark/tasks/truncate-middle-util/seed/src/file-chip.tsx
new file mode 100644
index 000000000..88884a09c
--- /dev/null
+++ b/packages/benchmark/tasks/truncate-middle-util/seed/src/file-chip.tsx
@@ -0,0 +1,10 @@
+import { truncateMiddle } from "./truncate-middle.ts";
+
+interface FileChipProps {
+  fileName: string;
+}
+
+// Existing consumer (keeps truncate-middle.ts reachable). Do not edit.
+export const FileChip = ({ fileName }: FileChipProps) => (
+  <span className="file-chip">{truncateMiddle(fileName, 20)}</span>
+);
diff --git a/packages/benchmark/tasks/truncate-middle-util/seed/src/truncate-middle.ts b/packages/benchmark/tasks/truncate-middle-util/seed/src/truncate-middle.ts
new file mode 100644
index 000000000..79e17fc5e
--- /dev/null
+++ b/packages/benchmark/tasks/truncate-middle-util/seed/src/truncate-middle.ts
@@ -0,0 +1,4 @@
+// TODO(agent): implement. See instruction.md.
+export const truncateMiddle = (_text: string, _maxLength: number): string => {
+  throw new Error("not implemented");
+};
diff --git a/packages/benchmark/tasks/truncate-middle-util/seed/tsconfig.json b/packages/benchmark/tasks/truncate-middle-util/seed/tsconfig.json
new file mode 100644
index 000000000..ffbea3d66
--- /dev/null
+++ b/packages/benchmark/tasks/truncate-middle-util/seed/tsconfig.json
@@ -0,0 +1,13 @@
+{
+  "compilerOptions": {
+    "target": "ES2022",
+    "module": "ESNext",
+    "moduleResolution": "Bundler",
+    "jsx": "react-jsx",
+    "strict": true,
+    "allowImportingTsExtensions": true,
+    "noEmit": true,
+    "skipLibCheck": true
+  },
+  "include": ["src", "tests"]
+}
diff --git a/packages/benchmark/tasks/truncate-middle-util/solution/solution.patch b/packages/benchmark/tasks/truncate-middle-util/solution/solution.patch
new file mode 100644
index 000000000..73c2d649a
--- /dev/null
+++ b/packages/benchmark/tasks/truncate-middle-util/solution/solution.patch
@@ -0,0 +1,21 @@
+diff --git a/src/truncate-middle.ts b/src/truncate-middle.ts
+index 79e17fc..d42c6bc 100644
+--- a/src/truncate-middle.ts
++++ b/src/truncate-middle.ts
+@@ -1,4 +1,13 @@
+-// TODO(agent): implement. See instruction.md.
+-export const truncateMiddle = (_text: string, _maxLength: number): string => {
+-  throw new Error("not implemented");
++// Shortens text by eliding the middle with a single ellipsis so the result is
++// exactly `maxLength` characters. Odd leftover budget favors the front.
++export const truncateMiddle = (text: string, maxLength: number): string => {
++  if (text.length <= maxLength) return text;
++  if (maxLength <= 1) return "…";
++
++  const budget = maxLength - 1;
++  const frontLength = Math.ceil(budget / 2);
++  const backLength = Math.floor(budget / 2);
++  const front = text.slice(0, frontLength);
++  const back = backLength === 0 ? "" : text.slice(text.length - backLength);
++  return `${front}…${back}`;
+ };
diff --git a/packages/benchmark/tasks/truncate-middle-util/solution/solve.sh b/packages/benchmark/tasks/truncate-middle-util/solution/solve.sh
new file mode 100755
index 000000000..764e03155
--- /dev/null
+++ b/packages/benchmark/tasks/truncate-middle-util/solution/solve.sh
@@ -0,0 +1,5 @@
+#!/usr/bin/env bash
+# Reference solution applier (reviewer aid only — never used at grade time).
+set -euo pipefail
+cd /app
+git apply --whitespace=nowarn /solution/solution.patch
diff --git a/packages/benchmark/tasks/truncate-middle-util/task.toml b/packages/benchmark/tasks/truncate-middle-util/task.toml
new file mode 100644
index 000000000..291bf119e
--- /dev/null
+++ b/packages/benchmark/tasks/truncate-middle-util/task.toml
@@ -0,0 +1,42 @@
+schema_version = "1.1"
+artifacts = []
+
+[task]
+name = "slopbench/truncate-middle-util"
+description = "Implement truncateMiddle(text, maxLength) eliding the middle with an ellipsis."
+authors = []
+keywords = ["react", "typescript", "slop", "frontend"]
+
+[metadata]
+task_id = "truncate-middle-util"
+display_title = "Truncate middle"
+display_description = "Implement truncateMiddle(text, maxLength) eliding the middle with an ellipsis."
+family = "produce-clean"
+target_dimensions = ["maintainability", "ts-strictness"]
+language = "typescript"
+repository_url = "in-tree"
+base_commit_hash = "root"
+slop_profile = ""
+
+[verifier]
+timeout_sec = 1200.0
+
+[verifier.env]
+
+[agent]
+timeout_sec = 3600.0
+
+[environment]
+build_timeout_sec = 1200.0
+docker_image = "slopbench-base:latest"
+os = "linux"
+cpus = 2
+memory_mb = 4096
+storage_mb = 10240
+gpus = 0
+allow_internet = false
+mcp_servers = []
+
+[environment.env]
+
+[solution.env]
diff --git a/packages/benchmark/tasks/truncate-middle-util/tests/test.patch b/packages/benchmark/tasks/truncate-middle-util/tests/test.patch
new file mode 100644
index 000000000..dc3c85152
--- /dev/null
+++ b/packages/benchmark/tasks/truncate-middle-util/tests/test.patch
@@ -0,0 +1,26 @@
+diff --git a/tests/truncate-middle.test.ts b/tests/truncate-middle.test.ts
+new file mode 100644
+index 0000000..4d03c3a
+--- /dev/null
++++ b/tests/truncate-middle.test.ts
+@@ -0,0 +1,20 @@
++import { test } from "node:test";
++import assert from "node:assert/strict";
++import { truncateMiddle } from "../src/truncate-middle.ts";
++
++test("returns short text unchanged", () => {
++  assert.equal(truncateMiddle("hello", 10), "hello");
++});
++
++test("elides the middle to the exact max length, favoring the front", () => {
++  assert.equal(truncateMiddle("hello world", 7), "hel\u2026rld");
++  assert.equal(truncateMiddle("hello world", 7).length, 7);
++});
++
++test("splits an even budget evenly", () => {
++  assert.equal(truncateMiddle("abcdefgh", 5), "ab\u2026gh");
++});
++
++test("collapses to a lone ellipsis at length 1 or less", () => {
++  assert.equal(truncateMiddle("anything", 1), "\u2026");
++});
diff --git a/packages/benchmark/tasks/truncate-middle-util/tests/test.sh b/packages/benchmark/tasks/truncate-middle-util/tests/test.sh
new file mode 100755
index 000000000..a91df1526
--- /dev/null
+++ b/packages/benchmark/tasks/truncate-middle-util/tests/test.sh
@@ -0,0 +1,5 @@
+#!/usr/bin/env bash
+set -euo pipefail
+export BASE_COMMIT="$(git -C "${APP_DIR:-/app}" rev-list --max-parents=0 HEAD | tail -1)"
+export FUNCTIONAL_TEST_CMD="node --experimental-strip-types --test tests/truncate-middle.test.ts"
+exec slopbench-grade
diff --git a/packages/benchmark/tasks/typed-storage-util/_authoring/hidden/tests/storage.test.ts b/packages/benchmark/tasks/typed-storage-util/_authoring/hidden/tests/storage.test.ts
new file mode 100644
index 000000000..75f5225eb
--- /dev/null
+++ b/packages/benchmark/tasks/typed-storage-util/_authoring/hidden/tests/storage.test.ts
@@ -0,0 +1,31 @@
+import { test, beforeEach } from "node:test";
+import assert from "node:assert/strict";
+import { readJson, writeJson } from "../src/storage.ts";
+
+const makeFakeStorage = () => {
+  const map = new Map<string, string>();
+  return {
+    getItem: (key: string): string | null => (map.has(key) ? (map.get(key) ?? null) : null),
+    setItem: (key: string, value: string): void => {
+      map.set(key, value);
+    },
+  };
+};
+
+beforeEach(() => {
+  (globalThis as { localStorage?: unknown }).localStorage = makeFakeStorage();
+});
+
+test("returns the fallback when the key is absent", () => {
+  assert.deepEqual(readJson("missing", { a: 1 }), { a: 1 });
+});
+
+test("round-trips JSON values", () => {
+  writeJson("k", { a: 1, nested: [1, 2, 3] });
+  assert.deepEqual(readJson("k", null), { a: 1, nested: [1, 2, 3] });
+});
+
+test("returns the fallback on corrupt JSON without throwing", () => {
+  globalThis.localStorage.setItem("bad", "{not json");
+  assert.equal(readJson("bad", "fallback"), "fallback");
+});
diff --git a/packages/benchmark/tasks/typed-storage-util/_authoring/solved/src/storage.ts b/packages/benchmark/tasks/typed-storage-util/_authoring/solved/src/storage.ts
new file mode 100644
index 000000000..554bd3705
--- /dev/null
+++ b/packages/benchmark/tasks/typed-storage-util/_authoring/solved/src/storage.ts
@@ -0,0 +1,16 @@
+export const readJson = <Value>(key: string, fallback: Value): Value => {
+  const raw = localStorage.getItem(key);
+  if (raw === null) return fallback;
+  try {
+    // Annotate rather than cast: `JSON.parse` returns `any`, which assigns to
+    // `Value` without an `as` assertion.
+    const parsed: Value = JSON.parse(raw);
+    return parsed;
+  } catch {
+    return fallback;
+  }
+};
+
+export const writeJson = <Value>(key: string, value: Value): void => {
+  localStorage.setItem(key, JSON.stringify(value));
+};
diff --git a/packages/benchmark/tasks/typed-storage-util/environment/Dockerfile b/packages/benchmark/tasks/typed-storage-util/environment/Dockerfile
new file mode 100644
index 000000000..0717d0595
--- /dev/null
+++ b/packages/benchmark/tasks/typed-storage-util/environment/Dockerfile
@@ -0,0 +1,12 @@
+FROM slopbench-base:latest
+
+WORKDIR /app
+
+COPY seed/ .
+# Pure-TS task: no dependency install (functional test uses node --test).
+RUN git init -q \
+  && git add -A \
+  && git -c user.email=bench@react.doctor -c user.name=slopbench commit -qm "base" \
+  && git config --global --add safe.directory /app
+
+CMD ["/bin/bash"]
diff --git a/packages/benchmark/tasks/typed-storage-util/instruction.md b/packages/benchmark/tasks/typed-storage-util/instruction.md
new file mode 100644
index 000000000..de38b768d
--- /dev/null
+++ b/packages/benchmark/tasks/typed-storage-util/instruction.md
@@ -0,0 +1,22 @@
+Implement the typed `localStorage` helpers in `src/storage.ts`.
+
+## Expected behavior
+
+Both functions use the global `localStorage` API (`localStorage.getItem`,
+`setItem`).
+
+- `readJson<Value>(key, fallback)` reads `key`, JSON-parses it, and returns the
+  value typed as `Value`. It returns `fallback` when the key is absent
+  (`getItem` returns `null`) **or** when the stored string is not valid JSON.
+  It must never throw.
+- `writeJson<Value>(key, value)` serializes `value` with `JSON.stringify` and
+  stores it under `key`.
+
+Round-trip: after `writeJson("k", { a: 1 })`, `readJson("k", null)` returns
+`{ a: 1 }`.
+
+## Constraints
+
+Keep the generic signatures `readJson<Value>(key: string, fallback: Value)` and
+`writeJson<Value>(key: string, value: Value)`. Do not change
+`src/theme-store.ts`.
diff --git a/packages/benchmark/tasks/typed-storage-util/seed/package.json b/packages/benchmark/tasks/typed-storage-util/seed/package.json
new file mode 100644
index 000000000..8260b7959
--- /dev/null
+++ b/packages/benchmark/tasks/typed-storage-util/seed/package.json
@@ -0,0 +1,10 @@
+{
+  "name": "slopbench-typed-storage",
+  "version": "1.0.0",
+  "private": true,
+  "type": "module",
+  "dependencies": {
+    "react": "^18.3.1",
+    "react-dom": "^18.3.1"
+  }
+}
diff --git a/packages/benchmark/tasks/typed-storage-util/seed/src/storage.ts b/packages/benchmark/tasks/typed-storage-util/seed/src/storage.ts
new file mode 100644
index 000000000..7e47c2695
--- /dev/null
+++ b/packages/benchmark/tasks/typed-storage-util/seed/src/storage.ts
@@ -0,0 +1,9 @@
+// TODO(agent): implement readJson and writeJson. See instruction.md.
+
+export const readJson = <Value>(_key: string, _fallback: Value): Value => {
+  throw new Error("not implemented");
+};
+
+export const writeJson = <Value>(_key: string, _value: Value): void => {
+  throw new Error("not implemented");
+};
diff --git a/packages/benchmark/tasks/typed-storage-util/seed/src/theme-store.ts b/packages/benchmark/tasks/typed-storage-util/seed/src/theme-store.ts
new file mode 100644
index 000000000..887c6c58e
--- /dev/null
+++ b/packages/benchmark/tasks/typed-storage-util/seed/src/theme-store.ts
@@ -0,0 +1,13 @@
+import { readJson, writeJson } from "./storage.ts";
+
+export interface ThemeSettings {
+  mode: "light" | "dark";
+  accent: string;
+}
+
+const THEME_KEY = "theme-settings";
+const DEFAULT_THEME: ThemeSettings = { mode: "light", accent: "blue" };
+
+// Existing consumer of the storage util (keeps storage.ts reachable). Do not edit.
+export const loadTheme = (): ThemeSettings => readJson(THEME_KEY, DEFAULT_THEME);
+export const saveTheme = (settings: ThemeSettings): void => writeJson(THEME_KEY, settings);
diff --git a/packages/benchmark/tasks/typed-storage-util/seed/tsconfig.json b/packages/benchmark/tasks/typed-storage-util/seed/tsconfig.json
new file mode 100644
index 000000000..ffbea3d66
--- /dev/null
+++ b/packages/benchmark/tasks/typed-storage-util/seed/tsconfig.json
@@ -0,0 +1,13 @@
+{
+  "compilerOptions": {
+    "target": "ES2022",
+    "module": "ESNext",
+    "moduleResolution": "Bundler",
+    "jsx": "react-jsx",
+    "strict": true,
+    "allowImportingTsExtensions": true,
+    "noEmit": true,
+    "skipLibCheck": true
+  },
+  "include": ["src", "tests"]
+}
diff --git a/packages/benchmark/tasks/typed-storage-util/solution/solution.patch b/packages/benchmark/tasks/typed-storage-util/solution/solution.patch
new file mode 100644
index 000000000..46fcfa379
--- /dev/null
+++ b/packages/benchmark/tasks/typed-storage-util/solution/solution.patch
@@ -0,0 +1,27 @@
+diff --git a/src/storage.ts b/src/storage.ts
+index 7e47c26..554bd37 100644
+--- a/src/storage.ts
++++ b/src/storage.ts
+@@ -1,9 +1,16 @@
+-// TODO(agent): implement readJson and writeJson. See instruction.md.
+-
+-export const readJson = <Value>(_key: string, _fallback: Value): Value => {
+-  throw new Error("not implemented");
++export const readJson = <Value>(key: string, fallback: Value): Value => {
++  const raw = localStorage.getItem(key);
++  if (raw === null) return fallback;
++  try {
++    // Annotate rather than cast: `JSON.parse` returns `any`, which assigns to
++    // `Value` without an `as` assertion.
++    const parsed: Value = JSON.parse(raw);
++    return parsed;
++  } catch {
++    return fallback;
++  }
+ };
+ 
+-export const writeJson = <Value>(_key: string, _value: Value): void => {
+-  throw new Error("not implemented");
++export const writeJson = <Value>(key: string, value: Value): void => {
++  localStorage.setItem(key, JSON.stringify(value));
+ };
diff --git a/packages/benchmark/tasks/typed-storage-util/solution/solve.sh b/packages/benchmark/tasks/typed-storage-util/solution/solve.sh
new file mode 100755
index 000000000..764e03155
--- /dev/null
+++ b/packages/benchmark/tasks/typed-storage-util/solution/solve.sh
@@ -0,0 +1,5 @@
+#!/usr/bin/env bash
+# Reference solution applier (reviewer aid only — never used at grade time).
+set -euo pipefail
+cd /app
+git apply --whitespace=nowarn /solution/solution.patch
diff --git a/packages/benchmark/tasks/typed-storage-util/task.toml b/packages/benchmark/tasks/typed-storage-util/task.toml
new file mode 100644
index 000000000..e89335741
--- /dev/null
+++ b/packages/benchmark/tasks/typed-storage-util/task.toml
@@ -0,0 +1,42 @@
+schema_version = "1.1"
+artifacts = []
+
+[task]
+name = "slopbench/typed-storage-util"
+description = "Implement typed readJson/writeJson over localStorage with safe fallback."
+authors = []
+keywords = ["react", "typescript", "slop", "frontend"]
+
+[metadata]
+task_id = "typed-storage-util"
+display_title = "Typed localStorage helpers"
+display_description = "Implement typed readJson/writeJson over localStorage with safe fallback."
+family = "produce-clean"
+target_dimensions = ["ts-strictness", "maintainability"]
+language = "typescript"
+repository_url = "in-tree"
+base_commit_hash = "root"
+slop_profile = ""
+
+[verifier]
+timeout_sec = 1200.0
+
+[verifier.env]
+
+[agent]
+timeout_sec = 3600.0
+
+[environment]
+build_timeout_sec = 1200.0
+docker_image = "slopbench-base:latest"
+os = "linux"
+cpus = 2
+memory_mb = 4096
+storage_mb = 10240
+gpus = 0
+allow_internet = false
+mcp_servers = []
+
+[environment.env]
+
+[solution.env]
diff --git a/packages/benchmark/tasks/typed-storage-util/tests/test.patch b/packages/benchmark/tasks/typed-storage-util/tests/test.patch
new file mode 100644
index 000000000..f658ea622
--- /dev/null
+++ b/packages/benchmark/tasks/typed-storage-util/tests/test.patch
@@ -0,0 +1,37 @@
+diff --git a/tests/storage.test.ts b/tests/storage.test.ts
+new file mode 100644
+index 0000000..75f5225
+--- /dev/null
++++ b/tests/storage.test.ts
+@@ -0,0 +1,31 @@
++import { test, beforeEach } from "node:test";
++import assert from "node:assert/strict";
++import { readJson, writeJson } from "../src/storage.ts";
++
++const makeFakeStorage = () => {
++  const map = new Map<string, string>();
++  return {
++    getItem: (key: string): string | null => (map.has(key) ? (map.get(key) ?? null) : null),
++    setItem: (key: string, value: string): void => {
++      map.set(key, value);
++    },
++  };
++};
++
++beforeEach(() => {
++  (globalThis as { localStorage?: unknown }).localStorage = makeFakeStorage();
++});
++
++test("returns the fallback when the key is absent", () => {
++  assert.deepEqual(readJson("missing", { a: 1 }), { a: 1 });
++});
++
++test("round-trips JSON values", () => {
++  writeJson("k", { a: 1, nested: [1, 2, 3] });
++  assert.deepEqual(readJson("k", null), { a: 1, nested: [1, 2, 3] });
++});
++
++test("returns the fallback on corrupt JSON without throwing", () => {
++  globalThis.localStorage.setItem("bad", "{not json");
++  assert.equal(readJson("bad", "fallback"), "fallback");
++});
diff --git a/packages/benchmark/tasks/typed-storage-util/tests/test.sh b/packages/benchmark/tasks/typed-storage-util/tests/test.sh
new file mode 100755
index 000000000..582826435
--- /dev/null
+++ b/packages/benchmark/tasks/typed-storage-util/tests/test.sh
@@ -0,0 +1,5 @@
+#!/usr/bin/env bash
+set -euo pipefail
+export BASE_COMMIT="$(git -C "${APP_DIR:-/app}" rev-list --max-parents=0 HEAD | tail -1)"
+export FUNCTIONAL_TEST_CMD="node --experimental-strip-types --test tests/storage.test.ts"
+exec slopbench-grade
diff --git a/packages/benchmark/tasks/unique-by-util/_authoring/hidden/tests/unique-by.test.ts b/packages/benchmark/tasks/unique-by-util/_authoring/hidden/tests/unique-by.test.ts
new file mode 100644
index 000000000..5da61b6cc
--- /dev/null
+++ b/packages/benchmark/tasks/unique-by-util/_authoring/hidden/tests/unique-by.test.ts
@@ -0,0 +1,32 @@
+import { test } from "node:test";
+import assert from "node:assert/strict";
+import { uniqueBy } from "../src/unique-by.ts";
+
+test("keeps the first item per key, preserving order", () => {
+  const result = uniqueBy(
+    [
+      { id: 1, t: "a" },
+      { id: 2, t: "b" },
+      { id: 3, t: "a" },
+    ],
+    (item) => item.t,
+  );
+  assert.deepEqual(result, [
+    { id: 1, t: "a" },
+    { id: 2, t: "b" },
+  ]);
+});
+
+test("dedupes primitives", () => {
+  assert.deepEqual(
+    uniqueBy([1, 1, 2, 3, 2], (n) => n),
+    [1, 2, 3],
+  );
+});
+
+test("returns an empty array for empty input", () => {
+  assert.deepEqual(
+    uniqueBy([], (x) => x),
+    [],
+  );
+});
diff --git a/packages/benchmark/tasks/unique-by-util/_authoring/solved/src/unique-by.ts b/packages/benchmark/tasks/unique-by-util/_authoring/solved/src/unique-by.ts
new file mode 100644
index 000000000..9312eacaa
--- /dev/null
+++ b/packages/benchmark/tasks/unique-by-util/_authoring/solved/src/unique-by.ts
@@ -0,0 +1,16 @@
+// Removes duplicates by a derived key, keeping the first item per key and
+// preserving order.
+export const uniqueBy = <Item, Key>(
+  items: readonly Item[],
+  selector: (item: Item) => Key,
+): Item[] => {
+  const seen = new Set<Key>();
+  const result: Item[] = [];
+  for (const item of items) {
+    const key = selector(item);
+    if (seen.has(key)) continue;
+    seen.add(key);
+    result.push(item);
+  }
+  return result;
+};
diff --git a/packages/benchmark/tasks/unique-by-util/environment/Dockerfile b/packages/benchmark/tasks/unique-by-util/environment/Dockerfile
new file mode 100644
index 000000000..0717d0595
--- /dev/null
+++ b/packages/benchmark/tasks/unique-by-util/environment/Dockerfile
@@ -0,0 +1,12 @@
+FROM slopbench-base:latest
+
+WORKDIR /app
+
+COPY seed/ .
+# Pure-TS task: no dependency install (functional test uses node --test).
+RUN git init -q \
+  && git add -A \
+  && git -c user.email=bench@react.doctor -c user.name=slopbench commit -qm "base" \
+  && git config --global --add safe.directory /app
+
+CMD ["/bin/bash"]
diff --git a/packages/benchmark/tasks/unique-by-util/instruction.md b/packages/benchmark/tasks/unique-by-util/instruction.md
new file mode 100644
index 000000000..f3d7319b1
--- /dev/null
+++ b/packages/benchmark/tasks/unique-by-util/instruction.md
@@ -0,0 +1,23 @@
+Implement `uniqueBy` in `src/unique-by.ts`.
+
+## Expected behavior
+
+`uniqueBy(items, selector)` removes duplicate items, where two items are
+duplicates when `selector` returns an equal key (compared with `Set`/`Map`
+equality, i.e. SameValueZero: like `===`, except that `NaN` equals `NaN`).
+
+- Keep the **first** item for each distinct key.
+- Preserve the original order of the kept items.
+- An empty input returns `[]`.
+
+Examples:
+
+- `uniqueBy([{ id: 1, t: "a" }, { id: 2, t: "b" }, { id: 3, t: "a" }], (x) => x.t)`
+  → `[{ id: 1, t: "a" }, { id: 2, t: "b" }]`
+- `uniqueBy([1, 1, 2, 3, 2], (n) => n)` → `[1, 2, 3]`
+
+## Constraints
+
+Keep the exported generic signature
+`uniqueBy<Item, Key>(items: readonly Item[], selector: (item: Item) => Key): Item[]`.
+Do not change `src/tag-list.tsx`.
diff --git a/packages/benchmark/tasks/unique-by-util/seed/package.json b/packages/benchmark/tasks/unique-by-util/seed/package.json
new file mode 100644
index 000000000..0b43dd83e
--- /dev/null
+++ b/packages/benchmark/tasks/unique-by-util/seed/package.json
@@ -0,0 +1,10 @@
+{
+  "name": "slopbench-unique-by-util",
+  "version": "1.0.0",
+  "private": true,
+  "type": "module",
+  "dependencies": {
+    "react": "^18.3.1",
+    "react-dom": "^18.3.1"
+  }
+}
diff --git a/packages/benchmark/tasks/unique-by-util/seed/src/tag-list.tsx b/packages/benchmark/tasks/unique-by-util/seed/src/tag-list.tsx
new file mode 100644
index 000000000..d742a2de5
--- /dev/null
+++ b/packages/benchmark/tasks/unique-by-util/seed/src/tag-list.tsx
@@ -0,0 +1,15 @@
+import { uniqueBy } from "./unique-by.ts";
+
+interface Tag {
+  id: string;
+  label: string;
+}
+
+interface TagListProps {
+  tags: Tag[];
+}
+
+// Existing consumer (keeps unique-by.ts reachable). Do not edit.
+export const TagList = ({ tags }: TagListProps) => (
+  <span>{uniqueBy(tags, (tag) => tag.label).length} unique</span>
+);
diff --git a/packages/benchmark/tasks/unique-by-util/seed/src/unique-by.ts b/packages/benchmark/tasks/unique-by-util/seed/src/unique-by.ts
new file mode 100644
index 000000000..1ba6925b9
--- /dev/null
+++ b/packages/benchmark/tasks/unique-by-util/seed/src/unique-by.ts
@@ -0,0 +1,7 @@
+// TODO(agent): implement. See instruction.md.
+export const uniqueBy = <Item, Key>(
+  _items: readonly Item[],
+  _selector: (item: Item) => Key,
+): Item[] => {
+  throw new Error("not implemented");
+};
diff --git a/packages/benchmark/tasks/unique-by-util/seed/tsconfig.json b/packages/benchmark/tasks/unique-by-util/seed/tsconfig.json
new file mode 100644
index 000000000..ffbea3d66
--- /dev/null
+++ b/packages/benchmark/tasks/unique-by-util/seed/tsconfig.json
@@ -0,0 +1,13 @@
+{
+  "compilerOptions": {
+    "target": "ES2022",
+    "module": "ESNext",
+    "moduleResolution": "Bundler",
+    "jsx": "react-jsx",
+    "strict": true,
+    "allowImportingTsExtensions": true,
+    "noEmit": true,
+    "skipLibCheck": true
+  },
+  "include": ["src", "tests"]
+}
diff --git a/packages/benchmark/tasks/unique-by-util/solution/solution.patch b/packages/benchmark/tasks/unique-by-util/solution/solution.patch
new file mode 100644
index 000000000..568e7036e
--- /dev/null
+++ b/packages/benchmark/tasks/unique-by-util/solution/solution.patch
@@ -0,0 +1,25 @@
+diff --git a/src/unique-by.ts b/src/unique-by.ts
+index 1ba6925..9312eac 100644
+--- a/src/unique-by.ts
++++ b/src/unique-by.ts
+@@ -1,7 +1,16 @@
+-// TODO(agent): implement. See instruction.md.
++// Removes duplicates by a derived key, keeping the first item per key and
++// preserving order.
+ export const uniqueBy = <Item, Key>(
+-  _items: readonly Item[],
+-  _selector: (item: Item) => Key,
++  items: readonly Item[],
++  selector: (item: Item) => Key,
+ ): Item[] => {
+-  throw new Error("not implemented");
++  const seen = new Set<Key>();
++  const result: Item[] = [];
++  for (const item of items) {
++    const key = selector(item);
++    if (seen.has(key)) continue;
++    seen.add(key);
++    result.push(item);
++  }
++  return result;
+ };
diff --git a/packages/benchmark/tasks/unique-by-util/solution/solve.sh b/packages/benchmark/tasks/unique-by-util/solution/solve.sh
new file mode 100755
index 000000000..764e03155
--- /dev/null
+++ b/packages/benchmark/tasks/unique-by-util/solution/solve.sh
@@ -0,0 +1,5 @@
+#!/usr/bin/env bash
+# Reference solution applier (reviewer aid only — never used at grade time).
+set -euo pipefail
+cd /app
+git apply --whitespace=nowarn /solution/solution.patch
diff --git a/packages/benchmark/tasks/unique-by-util/task.toml b/packages/benchmark/tasks/unique-by-util/task.toml
new file mode 100644
index 000000000..917ad28eb
--- /dev/null
+++ b/packages/benchmark/tasks/unique-by-util/task.toml
@@ -0,0 +1,42 @@
+schema_version = "1.1"
+artifacts = []
+
+[task]
+name = "slopbench/unique-by-util"
+description = "Implement uniqueBy(items, selector) keeping first per key, preserving order."
+authors = []
+keywords = ["react", "typescript", "slop", "frontend"]
+
+[metadata]
+task_id = "unique-by-util"
+display_title = "Unique-by utility"
+display_description = "Implement uniqueBy(items, selector) keeping first per key, preserving order."
+family = "produce-clean"
+target_dimensions = ["ts-strictness", "maintainability"]
+language = "typescript"
+repository_url = "in-tree"
+base_commit_hash = "root"
+slop_profile = ""
+
+[verifier]
+timeout_sec = 1200.0
+
+[verifier.env]
+
+[agent]
+timeout_sec = 3600.0
+
+[environment]
+build_timeout_sec = 1200.0
+docker_image = "slopbench-base:latest"
+os = "linux"
+cpus = 2
+memory_mb = 4096
+storage_mb = 10240
+gpus = 0
+allow_internet = false
+mcp_servers = []
+
+[environment.env]
+
+[solution.env]
diff --git a/packages/benchmark/tasks/unique-by-util/tests/test.patch b/packages/benchmark/tasks/unique-by-util/tests/test.patch
new file mode 100644
index 000000000..b4ddb387b
--- /dev/null
+++ b/packages/benchmark/tasks/unique-by-util/tests/test.patch
@@ -0,0 +1,38 @@
+diff --git a/tests/unique-by.test.ts b/tests/unique-by.test.ts
+new file mode 100644
+index 0000000..5da61b6
+--- /dev/null
++++ b/tests/unique-by.test.ts
+@@ -0,0 +1,32 @@
++import { test } from "node:test";
++import assert from "node:assert/strict";
++import { uniqueBy } from "../src/unique-by.ts";
++
++test("keeps the first item per key, preserving order", () => {
++  const result = uniqueBy(
++    [
++      { id: 1, t: "a" },
++      { id: 2, t: "b" },
++      { id: 3, t: "a" },
++    ],
++    (item) => item.t,
++  );
++  assert.deepEqual(result, [
++    { id: 1, t: "a" },
++    { id: 2, t: "b" },
++  ]);
++});
++
++test("dedupes primitives", () => {
++  assert.deepEqual(
++    uniqueBy([1, 1, 2, 3, 2], (n) => n),
++    [1, 2, 3],
++  );
++});
++
++test("returns an empty array for empty input", () => {
++  assert.deepEqual(
++    uniqueBy([], (x) => x),
++    [],
++  );
++});
diff --git a/packages/benchmark/tasks/unique-by-util/tests/test.sh b/packages/benchmark/tasks/unique-by-util/tests/test.sh
new file mode 100755
index 000000000..74e2be7fd
--- /dev/null
+++ b/packages/benchmark/tasks/unique-by-util/tests/test.sh
@@ -0,0 +1,5 @@
+#!/usr/bin/env bash
+set -euo pipefail
+export BASE_COMMIT="$(git -C "${APP_DIR:-/app}" rev-list --max-parents=0 HEAD | tail -1)"
+export FUNCTIONAL_TEST_CMD="node --experimental-strip-types --test tests/unique-by.test.ts"
+exec slopbench-grade
diff --git a/packages/benchmark/tests/aggregate-results.test.ts b/packages/benchmark/tests/aggregate-results.test.ts
new file mode 100644
index 000000000..c3988bacb
--- /dev/null
+++ b/packages/benchmark/tests/aggregate-results.test.ts
@@ -0,0 +1,81 @@
+import { execFileSync } from "node:child_process";
+import * as fs from "node:fs";
+import * as os from "node:os";
+import * as path from "node:path";
+import { afterAll, describe, expect, it } from "vite-plus/test";
+
+const AGGREGATOR = path.resolve(import.meta.dirname, "..", "scripts", "aggregate-results.mjs");
+
+const createdDirectories: string[] = [];
+
+// Writes a per-task slop-report.json under <logs>/<taskId>/verifier/, matching
+// the layout the aggregator walks (task id = grandparent dir of the report).
+const writeReport = (logsDir: string, taskId: string, report: Record<string, unknown>): void => {
+  const dir = path.join(logsDir, taskId, "verifier");
+  fs.mkdirSync(dir, { recursive: true });
+  fs.writeFileSync(path.join(dir, "slop-report.json"), JSON.stringify(report));
+};
+
+const makeLogsDir = (): string => {
+  const dir = fs.mkdtempSync(path.join(os.tmpdir(), "slopbench-agg-"));
+  createdDirectories.push(dir);
+  return dir;
+};
+
+const runAggregator = (logsDir: string, model: string): Record<string, unknown> => {
+  const outPath = path.join(logsDir, "result.json");
+  execFileSync("node", [AGGREGATOR, "--logs", logsDir, "--model", model, "--out", outPath], {
+    stdio: "ignore",
+  });
+  return JSON.parse(fs.readFileSync(outPath, "utf8"));
+};
+
+afterAll(() => {
+  for (const dir of createdDirectories) fs.rmSync(dir, { recursive: true, force: true });
+});
+
+describe("aggregate-results", () => {
+  it("aggregates pass-rate, mean score, mean reward, and per-dimension means", () => {
+    const logsDir = makeLogsDir();
+    writeReport(logsDir, "task-a", {
+      slopScore: 100,
+      functionalPass: true,
+      reward: 1,
+      violations: [],
+      dimensions: [
+        { dimension: "react-correctness", score: 100, violationCount: 0, weightedPenalty: 0 },
+        { dimension: "ts-strictness", score: 100, violationCount: 0, weightedPenalty: 0 },
+      ],
+    });
+    writeReport(logsDir, "task-b", {
+      slopScore: 80,
+      functionalPass: false,
+      reward: 0,
+      violations: [{ ruleId: "ts/no-explicit-any" }],
+      dimensions: [
+        { dimension: "react-correctness", score: 100, violationCount: 0, weightedPenalty: 0 },
+        { dimension: "ts-strictness", score: 60, violationCount: 1, weightedPenalty: 40 },
+      ],
+    });
+
+    const result = runAggregator(logsDir, "demo-model");
+
+    expect(result.model).toBe("demo-model");
+    expect(result.taskCount).toBe(2);
+    expect(result.functionalPassRate).toBe(0.5);
+    expect(result.meanSlopScore).toBe(90);
+    expect(result.meanReward).toBe(0.5);
+    const perDimensionMean = result.perDimensionMean as Record<string, number>;
+    expect(perDimensionMean["react-correctness"]).toBe(100);
+    expect(perDimensionMean["ts-strictness"]).toBe(80);
+    const tasks = result.tasks as Array<{ task: string }>;
+    expect(tasks.map((task) => task.task)).toEqual(["task-a", "task-b"]);
+  });
+
+  it("reports nulls for an empty logs directory", () => {
+    const result = runAggregator(makeLogsDir(), "empty-model");
+    expect(result.taskCount).toBe(0);
+    expect(result.functionalPassRate).toBe(null);
+    expect(result.meanSlopScore).toBe(null);
+  });
+});
diff --git a/packages/benchmark/tests/ast-checks.test.ts b/packages/benchmark/tests/ast-checks.test.ts
new file mode 100644
index 000000000..1793210c6
--- /dev/null
+++ b/packages/benchmark/tests/ast-checks.test.ts
@@ -0,0 +1,127 @@
+import { describe, expect, it } from "vite-plus/test";
+import { AST_CHECKS } from "../src/checks/index.js";
+import { deslopNestedTernary } from "../src/checks/deslop-nested-ternary.js";
+import { tsBanTsComment } from "../src/checks/ts-ban-ts-comment.js";
+import { tsNoExplicitAny } from "../src/checks/ts-no-explicit-any.js";
+import { tsNoNonNullAssertion } from "../src/checks/ts-no-non-null-assertion.js";
+import { tsNoTypeAssertion } from "../src/checks/ts-no-type-assertion.js";
+import { vercelBooleanPropSoup } from "../src/checks/vercel-boolean-prop-soup.js";
+import { vercelRenderProp } from "../src/checks/vercel-render-prop.js";
+import { parseSourceText } from "../src/utils/parse-source-file.js";
+import type { AstCheck, ParsedSourceFile } from "../src/types/index.js";
+
+const parse = (sourceText: string, filePath = "src/sample.tsx"): ParsedSourceFile => {
+  const parsed = parseSourceText(filePath, sourceText);
+  if (!parsed) throw new Error(`fixture failed to parse: ${filePath}`);
+  return parsed;
+};
+
+const ruleIdsOf = (check: AstCheck, sourceText: string, filePath?: string): string[] =>
+  check(parse(sourceText, filePath)).map((finding) => finding.ruleId);
+
+describe("ts-no-explicit-any", () => {
+  it("flags explicit any annotations", () => {
+    const ids = ruleIdsOf(
+      tsNoExplicitAny,
+      "const value: any = 1;\nfunction f(x: any) { return x; }\n",
+    );
+    expect(ids.filter((id) => id === "ts/no-explicit-any")).toHaveLength(2);
+  });
+  it("ignores well-typed code", () => {
+    expect(tsNoExplicitAny(parse("const value: number = 1;\n"))).toHaveLength(0);
+  });
+});
+
+describe("ts-no-non-null-assertion", () => {
+  it("flags the non-null operator", () => {
+    expect(ruleIdsOf(tsNoNonNullAssertion, "const a = b!.c;\n")).toContain(
+      "ts/no-non-null-assertion",
+    );
+  });
+});
+
+describe("ts-no-type-assertion", () => {
+  it("flags `as` casts but not `as const`", () => {
+    const cast = ruleIdsOf(tsNoTypeAssertion, "const a = x as string;\n");
+    expect(cast).toContain("ts/no-type-assertion");
+    expect(tsNoTypeAssertion(parse("const a = [1, 2] as const;\n"))).toHaveLength(0);
+  });
+});
+
+describe("ts-ban-ts-comment", () => {
+  it("flags suppression directives as errors", () => {
+    const findings = tsBanTsComment(parse("// @ts-ignore\nconst a: number = 'x' as never;\n"));
+    expect(findings).toHaveLength(1);
+    expect(findings[0]?.severity).toBe("error");
+  });
+  it("ignores ordinary comments", () => {
+    expect(tsBanTsComment(parse("// a normal note\nconst a = 1;\n"))).toHaveLength(0);
+  });
+});
+
+describe("vercel-boolean-prop-soup", () => {
+  it("flags a *Props type with many boolean flags", () => {
+    const source = [
+      "interface ButtonProps {",
+      "  isPrimary: boolean;",
+      "  isDisabled: boolean;",
+      "  isLoading: boolean;",
+      "  isRounded: boolean;",
+      "}",
+      "",
+    ].join("\n");
+    expect(ruleIdsOf(vercelBooleanPropSoup, source, "src/button.ts")).toContain(
+      "vercel/architecture-boolean-prop-soup",
+    );
+  });
+  it("ignores a props type with a single boolean flag", () => {
+    const source = "interface ButtonProps {\n  isPrimary: boolean;\n  label: string;\n}\n";
+    expect(vercelBooleanPropSoup(parse(source, "src/button.ts"))).toHaveLength(0);
+  });
+});
+
+describe("vercel-render-prop", () => {
+  it("flags function-valued render props", () => {
+    const source = "interface ListProps {\n  renderItem: (value: string) => unknown;\n}\n";
+    expect(ruleIdsOf(vercelRenderProp, source, "src/list.ts")).toContain(
+      "vercel/patterns-render-prop",
+    );
+  });
+  it("ignores non-render function props", () => {
+    const source = "interface ListProps {\n  onSelect: (value: string) => void;\n}\n";
+    expect(vercelRenderProp(parse(source, "src/list.ts"))).toHaveLength(0);
+  });
+});
+
+describe("deslop-nested-ternary", () => {
+  it("flags a nested ternary exactly once per chain", () => {
+    const findings = deslopNestedTernary(
+      parse("const x = a ? 1 : b ? 2 : c ? 3 : 4;\n", "src/t.ts"),
+    );
+    expect(findings).toHaveLength(1);
+    expect(findings[0]?.ruleId).toBe("deslop/nested-ternary");
+  });
+  it("ignores a single ternary", () => {
+    expect(deslopNestedTernary(parse("const x = a ? 1 : 2;\n", "src/t.ts"))).toHaveLength(0);
+  });
+});
+
+describe("AST_CHECKS registry", () => {
+  it("runs every check and aggregates findings on a sloppy file", () => {
+    const source = [
+      "// @ts-nocheck",
+      "interface WidgetProps { a: boolean; b: boolean; c: boolean; d: boolean }",
+      "const value: any = (raw as string)!;",
+      "const label = x ? 'a' : y ? 'b' : 'c';",
+      "",
+    ].join("\n");
+    const file = parse(source, "src/widget.tsx");
+    const ruleIds = AST_CHECKS.flatMap((check) => check(file)).map((finding) => finding.ruleId);
+    expect(ruleIds).toContain("ts/ban-ts-comment");
+    expect(ruleIds).toContain("ts/no-explicit-any");
+    expect(ruleIds).toContain("ts/no-type-assertion");
+    expect(ruleIds).toContain("ts/no-non-null-assertion");
+    expect(ruleIds).toContain("vercel/architecture-boolean-prop-soup");
+    expect(ruleIds).toContain("deslop/nested-ternary");
+  });
+});
diff --git a/packages/benchmark/tests/run-react-doctor.test.ts b/packages/benchmark/tests/run-react-doctor.test.ts
new file mode 100644
index 000000000..af210a8be
--- /dev/null
+++ b/packages/benchmark/tests/run-react-doctor.test.ts
@@ -0,0 +1,98 @@
+import * as fs from "node:fs";
+import * as os from "node:os";
+import * as path from "node:path";
+import { afterAll, describe, expect, it } from "vite-plus/test";
+import { runReactDoctor } from "../src/scanners/run-react-doctor.js";
+import type { ScannerContext } from "../src/types/index.js";
+
+const REACT_DOCTOR_BIN = path.resolve(
+  import.meta.dirname,
+  "..",
+  "node_modules",
+  ".bin",
+  "react-doctor",
+);
+
+const createdDirectories: string[] = [];
+
+const makeFixtureProject = (sourceByPath: Record<string, string>): string => {
+  const rootDirectory = fs.mkdtempSync(path.join(os.tmpdir(), "slopbench-rd-"));
+  createdDirectories.push(rootDirectory);
+  fs.writeFileSync(
+    path.join(rootDirectory, "package.json"),
+    JSON.stringify({
+      name: "slopbench-rd-fixture",
+      version: "1.0.0",
+      dependencies: { react: "^18.3.1" },
+    }),
+  );
+  fs.writeFileSync(
+    path.join(rootDirectory, "tsconfig.json"),
+    JSON.stringify({
+      compilerOptions: { jsx: "react-jsx", strict: true, moduleResolution: "Bundler" },
+    }),
+  );
+  for (const [relativePath, contents] of Object.entries(sourceByPath)) {
+    const absolutePath = path.join(rootDirectory, relativePath);
+    fs.mkdirSync(path.dirname(absolutePath), { recursive: true });
+    fs.writeFileSync(absolutePath, contents);
+  }
+  return rootDirectory;
+};
+
+const makeContext = (rootDirectory: string, changedFiles: string[]): ScannerContext => ({
+  rootDirectory,
+  changedFiles,
+  baseRef: "HEAD",
+  addedLineCount: 20,
+  reactDoctorBin: REACT_DOCTOR_BIN,
+});
+
+afterAll(() => {
+  for (const directory of createdDirectories)
+    fs.rmSync(directory, { recursive: true, force: true });
+});
+
+describe("runReactDoctor", () => {
+  it("maps a nested-component diagnostic to a react-correctness finding", () => {
+    const rootDirectory = makeFixtureProject({
+      "src/list.tsx": [
+        "import React from 'react';",
+        "export function List({ items }: { items: string[] }) {",
+        "  const Row = () => <li>{items.length}</li>;",
+        "  return <ul>{items.map((value, index) => <li key={index}>{value}</li>)}<Row /></ul>;",
+        "}",
+        "",
+      ].join("\n"),
+    });
+
+    const result = runReactDoctor(makeContext(rootDirectory, ["src/list.tsx"]));
+
+    expect(result.error).toBe(null);
+    expect(result.doctorVersion).toBeTypeOf("string");
+    const ruleIds = result.findings.map((finding) => finding.ruleId);
+    expect(ruleIds.some((ruleId) => ruleId.includes("nested-component"))).toBe(true);
+    const nested = result.findings.find((finding) => finding.ruleId.includes("nested-component"));
+    expect(nested?.scanner).toBe("react-doctor");
+    expect(nested?.dimension).toBe("react-correctness");
+  });
+
+  it("excludes diagnostics in files the agent did not change", () => {
+    const rootDirectory = makeFixtureProject({
+      "src/touched.tsx": "export const value: number = 1;\n",
+      "src/untouched.tsx": [
+        "import React from 'react';",
+        "export function Widget({ items }: { items: string[] }) {",
+        "  const Inner = () => <span>{items.length}</span>;",
+        "  return <Inner />;",
+        "}",
+        "",
+      ].join("\n"),
+    });
+
+    const result = runReactDoctor(makeContext(rootDirectory, ["src/touched.tsx"]));
+
+    expect(result.error).toBe(null);
+    expect(result.findings.every((finding) => finding.filePath === "src/touched.tsx")).toBe(true);
+  });
+});
diff --git a/packages/benchmark/tests/run-slop-verifier.test.ts b/packages/benchmark/tests/run-slop-verifier.test.ts
new file mode 100644
index 000000000..ca1d19193
--- /dev/null
+++ b/packages/benchmark/tests/run-slop-verifier.test.ts
@@ -0,0 +1,131 @@
+import { execFileSync } from "node:child_process";
+import * as fs from "node:fs";
+import * as os from "node:os";
+import * as path from "node:path";
+import { afterAll, describe, expect, it } from "vite-plus/test";
+import { runSlopVerifier } from "../src/run-slop-verifier.js";
+
+const REACT_DOCTOR_BIN = path.resolve(
+  import.meta.dirname,
+  "..",
+  "node_modules",
+  ".bin",
+  "react-doctor",
+);
+
+const createdDirectories: string[] = [];
+
+const git = (cwd: string, args: string[]): void => {
+  execFileSync("git", args, { cwd, stdio: "ignore" });
+};
+
+// Create a git repo whose base commit holds `baseFiles`, then overlay
+// `headFiles` as the agent's (uncommitted) working-tree change. Returns the
+// root and the base commit sha.
+const makeGitFixture = (
+  baseFiles: Record<string, string>,
+  headFiles: Record<string, string>,
+): { rootDirectory: string; baseRef: string } => {
+  const rootDirectory = fs.mkdtempSync(path.join(os.tmpdir(), "slopbench-e2e-"));
+  createdDirectories.push(rootDirectory);
+  const write = (files: Record<string, string>): void => {
+    for (const [relativePath, contents] of Object.entries(files)) {
+      const absolutePath = path.join(rootDirectory, relativePath);
+      fs.mkdirSync(path.dirname(absolutePath), { recursive: true });
+      fs.writeFileSync(absolutePath, contents);
+    }
+  };
+  git(rootDirectory, ["init", "-q"]);
+  git(rootDirectory, ["config", "user.email", "t@t.co"]);
+  git(rootDirectory, ["config", "user.name", "t"]);
+  write(baseFiles);
+  git(rootDirectory, ["add", "-A"]);
+  git(rootDirectory, ["commit", "-qm", "base"]);
+  const baseRef = execFileSync("git", ["rev-parse", "HEAD"], { cwd: rootDirectory })
+    .toString()
+    .trim();
+  write(headFiles);
+  return { rootDirectory, baseRef };
+};
+
+const PACKAGE_JSON = JSON.stringify({
+  name: "slopbench-e2e",
+  version: "1.0.0",
+  dependencies: { react: "^18.3.1" },
+});
+
+afterAll(() => {
+  for (const directory of createdDirectories)
+    fs.rmSync(directory, { recursive: true, force: true });
+});
+
+describe("runSlopVerifier", () => {
+  it("scores a clean feature above a sloppy one, which lands below 95", () => {
+    const cleanFixture = makeGitFixture(
+      { "package.json": PACKAGE_JSON, "src/base.ts": "export const a = 1;\n" },
+      {
+        "src/clean.tsx": [
+          "import React from 'react';",
+          "interface RowProps { label: string }",
+          "export const Row = ({ label }: RowProps) => <li>{label}</li>;",
+          "export const List = ({ labels }: { labels: string[] }) => (",
+          "  <ul>{labels.map((label) => <Row key={label} label={label} />)}</ul>",
+          ");",
+          "",
+        ].join("\n"),
+      },
+    );
+    const clean = runSlopVerifier({
+      rootDirectory: cleanFixture.rootDirectory,
+      baseRef: cleanFixture.baseRef,
+      reactDoctorBin: REACT_DOCTOR_BIN,
+      functionalPass: true,
+    });
+
+    const sloppyFixture = makeGitFixture(
+      { "package.json": PACKAGE_JSON, "src/base.ts": "export const a = 1;\n" },
+      {
+        "src/sloppy.tsx": [
+          "import React from 'react';",
+          "// @ts-ignore",
+          "export function Card({ items }: { items: any[] }) {",
+          "  const Row = () => <li>{(items[0] as string)!}</li>;",
+          "  return <ul>{items.map((value, index) => <li key={index}>{value}</li>)}<Row /></ul>;",
+          "}",
+          "",
+        ].join("\n"),
+      },
+    );
+    const sloppy = runSlopVerifier({
+      rootDirectory: sloppyFixture.rootDirectory,
+      baseRef: sloppyFixture.baseRef,
+      reactDoctorBin: REACT_DOCTOR_BIN,
+      functionalPass: true,
+    });
+
+    expect(clean.scannerErrors).toEqual([]);
+    expect(sloppy.scannerErrors).toEqual([]);
+    expect(clean.slopScore).toBeGreaterThan(sloppy.slopScore);
+    expect(sloppy.slopScore).toBeLessThan(95);
+    // Findings come from more than one scanner on the sloppy diff.
+    expect(new Set(sloppy.violations.map((violation) => violation.scanner)).size).toBeGreaterThan(
+      1,
+    );
+  });
+
+  it("gates the reward on the functional outcome", () => {
+    const fixture = makeGitFixture(
+      { "package.json": PACKAGE_JSON, "src/base.ts": "export const a = 1;\n" },
+      { "src/feature.ts": "export const value: any = 1;\n" },
+    );
+    const failed = runSlopVerifier({
+      rootDirectory: fixture.rootDirectory,
+      baseRef: fixture.baseRef,
+      reactDoctorBin: REACT_DOCTOR_BIN,
+      functionalPass: false,
+    });
+    expect(failed.reward).toBe(0);
+    expect(failed.functionalPass).toBe(false);
+    expect(failed.slopScore).toBeGreaterThan(0);
+  });
+});
diff --git a/packages/benchmark/tests/slop-score.test.ts b/packages/benchmark/tests/slop-score.test.ts
new file mode 100644
index 000000000..589be87eb
--- /dev/null
+++ b/packages/benchmark/tests/slop-score.test.ts
@@ -0,0 +1,96 @@
+import * as path from "node:path";
+import { describe, expect, it } from "vite-plus/test";
+import { DEFAULT_SCORING_PROFILE } from "../src/constants.js";
+import { computeSlopScore } from "../src/scoring/slop-score.js";
+import { loadScoringProfile } from "../src/scoring/load-scoring-profile.js";
+import type { ScanFinding } from "../src/types/index.js";
+
+const DEFAULT_PROFILE_PATH = path.resolve(
+  import.meta.dirname,
+  "..",
+  "scoring-profiles",
+  "default.json",
+);
+
+const finding = (overrides: Partial<ScanFinding>): ScanFinding => ({
+  scanner: "react-doctor",
+  dimension: "react-correctness",
+  ruleId: "react-doctor/no-nested-component-definition",
+  severity: "error",
+  filePath: "src/x.tsx",
+  line: 1,
+  message: "slop",
+  category: "Bugs",
+  ...overrides,
+});
+
+describe("computeSlopScore", () => {
+  it("scores a clean diff at a perfect 100", () => {
+    const result = computeSlopScore([], 60, DEFAULT_SCORING_PROFILE);
+    expect(result.slopScore).toBe(100);
+    expect(result.violations).toHaveLength(0);
+    expect(result.dimensions.every((dimension) => dimension.score === 100)).toBe(true);
+  });
+
+  it("is deterministic across runs", () => {
+    const findings = [
+      finding({}),
+      finding({ severity: "warning", category: "Performance", dimension: "react-performance" }),
+    ];
+    const first = computeSlopScore(findings, 60, DEFAULT_SCORING_PROFILE);
+    const second = computeSlopScore(findings, 60, DEFAULT_SCORING_PROFILE);
+    expect(first.slopScore).toBe(second.slopScore);
+  });
+
+  it("penalizes a security error more than a maintainability warning", () => {
+    const security = computeSlopScore(
+      [finding({ category: "Security", dimension: "react-correctness", severity: "error" })],
+      60,
+      DEFAULT_SCORING_PROFILE,
+    );
+    const maintainability = computeSlopScore(
+      [finding({ category: "Maintainability", dimension: "maintainability", severity: "warning" })],
+      60,
+      DEFAULT_SCORING_PROFILE,
+    );
+    expect(security.slopScore).toBeLessThan(maintainability.slopScore);
+  });
+
+  it("punishes the same violation harder in a tiny diff than a large one", () => {
+    const single = [finding({})];
+    const tiny = computeSlopScore(single, 10, DEFAULT_SCORING_PROFILE);
+    const large = computeSlopScore(single, 400, DEFAULT_SCORING_PROFILE);
+    expect(tiny.slopScore).toBeLessThan(large.slopScore);
+  });
+
+  it("floors a very sloppy diff at zero rather than going negative", () => {
+    const manyErrors = Array.from({ length: 40 }, () =>
+      finding({ category: "Security", severity: "error" }),
+    );
+    const result = computeSlopScore(manyErrors, 30, DEFAULT_SCORING_PROFILE);
+    const correctness = result.dimensions.find((d) => d.dimension === "react-correctness");
+    expect(correctness?.score).toBe(0);
+    expect(result.slopScore).toBeGreaterThanOrEqual(0);
+  });
+
+  it("keeps a moderately clean feature in a healthy band", () => {
+    const findings = [
+      finding({ severity: "warning", category: "Performance", dimension: "react-performance" }),
+      finding({ severity: "warning", category: "Maintainability", dimension: "maintainability" }),
+    ];
+    const result = computeSlopScore(findings, 80, DEFAULT_SCORING_PROFILE);
+    expect(result.slopScore).toBeGreaterThan(90);
+    expect(result.slopScore).toBeLessThan(100);
+  });
+});
+
+describe("loadScoringProfile", () => {
+  it("returns the built-in default when no path is given", () => {
+    expect(loadScoringProfile()).toBe(DEFAULT_SCORING_PROFILE);
+  });
+
+  it("default.json mirrors the built-in profile (no drift)", () => {
+    const fromDisk = loadScoringProfile(DEFAULT_PROFILE_PATH);
+    expect(fromDisk).toStrictEqual(DEFAULT_SCORING_PROFILE);
+  });
+});
diff --git a/packages/benchmark/tsconfig.json b/packages/benchmark/tsconfig.json
new file mode 100644
index 000000000..9ef507c84
--- /dev/null
+++ b/packages/benchmark/tsconfig.json
@@ -0,0 +1,8 @@
+{
+  "extends": "../../tsconfig.json",
+  "compilerOptions": {
+    "noEmit": true,
+    "types": ["node"]
+  },
+  "include": ["src", "tests"]
+}
diff --git a/packages/benchmark/vite.config.ts b/packages/benchmark/vite.config.ts
new file mode 100644
index 000000000..c3778715b
--- /dev/null
+++ b/packages/benchmark/vite.config.ts
@@ -0,0 +1,15 @@
+import { defineConfig } from "vite-plus";
+
+// Scope test discovery to this package's own unit tests. Without this, vitest
+// also picks up the per-task hidden-test fixtures under `tasks/**/_authoring/`
+// (which import seed-relative paths and only run inside a task sandbox).
+export default defineConfig({
+  test: {
+    include: ["tests/**/*.test.{ts,tsx}"],
+    exclude: ["tasks/**", "dist/**", "node_modules/**"],
+    // Several tests spawn the real React Doctor CLI (a few seconds each) plus
+    // git/diff work, which exceeds vitest's 5s default on slower CI runners.
+    testTimeout: 120_000,
+    hookTimeout: 120_000,
+  },
+});
diff --git a/pnpm-lock.yaml b/pnpm-lock.yaml
index 43ec9515b..7cb60d413 100644
--- a/pnpm-lock.yaml
+++ b/pnpm-lock.yaml
@@ -59,6 +59,22 @@ importers:
         specifier: ^25.6.0
         version: 25.6.0
 
+  packages/benchmark:
+    dependencies:
+      '@react-doctor/core':
+        specifier: workspace:*
+        version: link:../core
+      oxc-parser:
+        specifier: ^0.132.0
+        version: 0.132.0
+    devDependencies:
+      '@types/node':
+        specifier: ^25.6.0
+        version: 25.6.0
+      react-doctor:
+        specifier: workspace:*
+        version: link:../react-doctor
+
   packages/core:
     dependencies:
       '@effect/platform-node-shared':
@@ -872,89 +888,105 @@ packages:
     resolution: {integrity: sha512-excjX8DfsIcJ10x1Kzr4RcWe1edC9PquDRRPx3YVCvQv+U5p7Yin2s32ftzikXojb1PIFc/9Mt28/y+iRklkrw==}
     cpu: [arm64]
     os: [linux]
+    libc: [glibc]
 
   '@img/sharp-libvips-linux-arm@1.2.4':
     resolution: {integrity: sha512-bFI7xcKFELdiNCVov8e44Ia4u2byA+l3XtsAj+Q8tfCwO6BQ8iDojYdvoPMqsKDkuoOo+X6HZA0s0q11ANMQ8A==}
     cpu: [arm]
     os: [linux]
+    libc: [glibc]
 
   '@img/sharp-libvips-linux-ppc64@1.2.4':
     resolution: {integrity: sha512-FMuvGijLDYG6lW+b/UvyilUWu5Ayu+3r2d1S8notiGCIyYU/76eig1UfMmkZ7vwgOrzKzlQbFSuQfgm7GYUPpA==}
     cpu: [ppc64]
     os: [linux]
+    libc: [glibc]
 
   '@img/sharp-libvips-linux-riscv64@1.2.4':
     resolution: {integrity: sha512-oVDbcR4zUC0ce82teubSm+x6ETixtKZBh/qbREIOcI3cULzDyb18Sr/Wcyx7NRQeQzOiHTNbZFF1UwPS2scyGA==}
     cpu: [riscv64]
     os: [linux]
+    libc: [glibc]
 
   '@img/sharp-libvips-linux-s390x@1.2.4':
     resolution: {integrity: sha512-qmp9VrzgPgMoGZyPvrQHqk02uyjA0/QrTO26Tqk6l4ZV0MPWIW6LTkqOIov+J1yEu7MbFQaDpwdwJKhbJvuRxQ==}
     cpu: [s390x]
     os: [linux]
+    libc: [glibc]
 
   '@img/sharp-libvips-linux-x64@1.2.4':
     resolution: {integrity: sha512-tJxiiLsmHc9Ax1bz3oaOYBURTXGIRDODBqhveVHonrHJ9/+k89qbLl0bcJns+e4t4rvaNBxaEZsFtSfAdquPrw==}
     cpu: [x64]
     os: [linux]
+    libc: [glibc]
 
   '@img/sharp-libvips-linuxmusl-arm64@1.2.4':
     resolution: {integrity: sha512-FVQHuwx1IIuNow9QAbYUzJ+En8KcVm9Lk5+uGUQJHaZmMECZmOlix9HnH7n1TRkXMS0pGxIJokIVB9SuqZGGXw==}
     cpu: [arm64]
     os: [linux]
+    libc: [musl]
 
   '@img/sharp-libvips-linuxmusl-x64@1.2.4':
     resolution: {integrity: sha512-+LpyBk7L44ZIXwz/VYfglaX/okxezESc6UxDSoyo2Ks6Jxc4Y7sGjpgU9s4PMgqgjj1gZCylTieNamqA1MF7Dg==}
     cpu: [x64]
     os: [linux]
+    libc: [musl]
 
   '@img/sharp-linux-arm64@0.34.5':
     resolution: {integrity: sha512-bKQzaJRY/bkPOXyKx5EVup7qkaojECG6NLYswgktOZjaXecSAeCWiZwwiFf3/Y+O1HrauiE3FVsGxFg8c24rZg==}
     engines: {node: ^18.17.0 || ^20.3.0 || >=21.0.0}
     cpu: [arm64]
     os: [linux]
+    libc: [glibc]
 
   '@img/sharp-linux-arm@0.34.5':
     resolution: {integrity: sha512-9dLqsvwtg1uuXBGZKsxem9595+ujv0sJ6Vi8wcTANSFpwV/GONat5eCkzQo/1O6zRIkh0m/8+5BjrRr7jDUSZw==}
     engines: {node: ^18.17.0 || ^20.3.0 || >=21.0.0}
     cpu: [arm]
     os: [linux]
+    libc: [glibc]
 
   '@img/sharp-linux-ppc64@0.34.5':
     resolution: {integrity: sha512-7zznwNaqW6YtsfrGGDA6BRkISKAAE1Jo0QdpNYXNMHu2+0dTrPflTLNkpc8l7MUP5M16ZJcUvysVWWrMefZquA==}
     engines: {node: ^18.17.0 || ^20.3.0 || >=21.0.0}
     cpu: [ppc64]
     os: [linux]
+    libc: [glibc]
 
   '@img/sharp-linux-riscv64@0.34.5':
     resolution: {integrity: sha512-51gJuLPTKa7piYPaVs8GmByo7/U7/7TZOq+cnXJIHZKavIRHAP77e3N2HEl3dgiqdD/w0yUfiJnII77PuDDFdw==}
     engines: {node: ^18.17.0 || ^20.3.0 || >=21.0.0}
     cpu: [riscv64]
     os: [linux]
+    libc: [glibc]
 
   '@img/sharp-linux-s390x@0.34.5':
     resolution: {integrity: sha512-nQtCk0PdKfho3eC5MrbQoigJ2gd1CgddUMkabUj+rBevs8tZ2cULOx46E7oyX+04WGfABgIwmMC0VqieTiR4jg==}
     engines: {node: ^18.17.0 || ^20.3.0 || >=21.0.0}
     cpu: [s390x]
     os: [linux]
+    libc: [glibc]
 
   '@img/sharp-linux-x64@0.34.5':
     resolution: {integrity: sha512-MEzd8HPKxVxVenwAa+JRPwEC7QFjoPWuS5NZnBt6B3pu7EG2Ge0id1oLHZpPJdn3OQK+BQDiw9zStiHBTJQQQQ==}
     engines: {node: ^18.17.0 || ^20.3.0 || >=21.0.0}
     cpu: [x64]
     os: [linux]
+    libc: [glibc]
 
   '@img/sharp-linuxmusl-arm64@0.34.5':
     resolution: {integrity: sha512-fprJR6GtRsMt6Kyfq44IsChVZeGN97gTD331weR1ex1c1rypDEABN6Tm2xa1wE6lYb5DdEnk03NZPqA7Id21yg==}
     engines: {node: ^18.17.0 || ^20.3.0 || >=21.0.0}
     cpu: [arm64]
     os: [linux]
+    libc: [musl]
 
   '@img/sharp-linuxmusl-x64@0.34.5':
     resolution: {integrity: sha512-Jg8wNT1MUzIvhBFxViqrEhWDGzqymo3sV7z7ZsaWbZNDLXRJZoRGrjulp60YYtV4wfY8VIKcWidjojlLcWrd8Q==}
     engines: {node: ^18.17.0 || ^20.3.0 || >=21.0.0}
     cpu: [x64]
     os: [linux]
+    libc: [musl]
 
   '@img/sharp-wasm32@0.34.5':
     resolution: {integrity: sha512-OdWTEiVkY2PHwqkbBI8frFxQQFekHaSSkUIJkwzclWZe64O1X4UlUjqqqLaPbUpMOQk6FBu/HtlGXNblIs0huw==}
@@ -1069,24 +1101,28 @@ packages:
     engines: {node: '>= 10'}
     cpu: [arm64]
     os: [linux]
+    libc: [glibc]
 
   '@next/swc-linux-arm64-musl@16.2.4':
     resolution: {integrity: sha512-iVMMp14514u7Nup2umQS03nT/bN9HurK8ufylC3FZNykrwjtx7V1A7+4kvhbDSCeonTVqV3Txnv0Lu+m2oDXNg==}
     engines: {node: '>= 10'}
     cpu: [arm64]
     os: [linux]
+    libc: [musl]
 
   '@next/swc-linux-x64-gnu@16.2.4':
     resolution: {integrity: sha512-EZOvm1aQWgnI/N/xcWOlnS3RQBk0VtVav5Zo7n4p0A7UKyTDx047k8opDbXgBpHl4CulRqRfbw3QrX2w5UOXMQ==}
     engines: {node: '>= 10'}
     cpu: [x64]
     os: [linux]
+    libc: [glibc]
 
   '@next/swc-linux-x64-musl@16.2.4':
     resolution: {integrity: sha512-h9FxsngCm9cTBf71AR4fGznDEDx1hS7+kSEiIRjq5kO1oXWm07DxVGZjCvk0SGx7TSjlUqhI8oOyz7NfwAdPoA==}
     engines: {node: '>= 10'}
     cpu: [x64]
     os: [linux]
+    libc: [musl]
 
   '@next/swc-win32-arm64-msvc@16.2.4':
     resolution: {integrity: sha512-3NdJV5OXMSOeJYijX+bjaLge3mJBlh4ybydbT4GFoB/2hAojWHtMhl3CYlYoMrjPuodp0nzFVi4Tj2+WaMg+Ow==}
@@ -1195,48 +1231,56 @@ packages:
     engines: {node: ^20.19.0 || >=22.12.0}
     cpu: [arm64]
     os: [linux]
+    libc: [glibc]
 
   '@oxc-parser/binding-linux-arm64-musl@0.132.0':
     resolution: {integrity: sha512-WozHg3Kc//8Sk756HXXgMbEAvqtG+Lzb9JOojwQzIGDtN78Az2dLttkb71akWYUF/8IgYfDSlfKh4Uot8is5Vw==}
     engines: {node: ^20.19.0 || >=22.12.0}
     cpu: [arm64]
     os: [linux]
+    libc: [musl]
 
   '@oxc-parser/binding-linux-ppc64-gnu@0.132.0':
     resolution: {integrity: sha512-CmX/ulNBOEwWTyVRmcpYKAcAizW6+OjtLJgo7fXoL9OqQvjF4VER8tPomv44vwzfSCy1BHbsB0ZlZYzYJNj4cA==}
     engines: {node: ^20.19.0 || >=22.12.0}
     cpu: [ppc64]
     os: [linux]
+    libc: [glibc]
 
   '@oxc-parser/binding-linux-riscv64-gnu@0.132.0':
     resolution: {integrity: sha512-j9oQS+hM90SdhviNGWbPgT4+Rlq+ac++q/zjgwPD1mVHgxHzATvoRGtDx0sXGmFOQ9J9YkwAhYGb5MAHL6TAsA==}
     engines: {node: ^20.19.0 || >=22.12.0}
     cpu: [riscv64]
     os: [linux]
+    libc: [glibc]
 
   '@oxc-parser/binding-linux-riscv64-musl@0.132.0':
     resolution: {integrity: sha512-bLz+Xi+Agnfmd7kWPEsSVwCn2k4EyIalZkNBcQ0OGIv9rqn8VgCPLNd03tM9mKX/5TdlvDXalz0q71BIrOPNqg==}
     engines: {node: ^20.19.0 || >=22.12.0}
     cpu: [riscv64]
     os: [linux]
+    libc: [musl]
 
   '@oxc-parser/binding-linux-s390x-gnu@0.132.0':
     resolution: {integrity: sha512-U6t2qbJU0ypTfyj9QV3W1Y6mITDTL8ai/OR6NUn85vyHthOvobKWgXzU4tu0EskSzlpuVFz1g0jFGulDIUKHxQ==}
     engines: {node: ^20.19.0 || >=22.12.0}
     cpu: [s390x]
     os: [linux]
+    libc: [glibc]
 
   '@oxc-parser/binding-linux-x64-gnu@0.132.0':
     resolution: {integrity: sha512-WcEaSNHFk8yz5YFlQQAlhq6jOFmZBB/RKE7uzhyCIf+pF1Lmv9gUH4221mle2Gd9iHyWT3ySNph8yZgb1xYdWg==}
     engines: {node: ^20.19.0 || >=22.12.0}
     cpu: [x64]
     os: [linux]
+    libc: [glibc]
 
   '@oxc-parser/binding-linux-x64-musl@0.132.0':
     resolution: {integrity: sha512-iQrV4iJzQgRwK3BWRmQl1C3C6g3wYpXN2WLdQdyR+efoUnncdShZAVp9OgcojtlD3MDRbuOMGG3SjxF4fL4nlQ==}
     engines: {node: ^20.19.0 || >=22.12.0}
     cpu: [x64]
     os: [linux]
+    libc: [musl]
 
   '@oxc-parser/binding-openharmony-arm64@0.132.0':
     resolution: {integrity: sha512-FWzmUGrZ6GUby4U7WIwcCtab6tdmlTO3xTRRKyb5kjIJVEiaUAT8animUG/nK8ZCA8gkRkPOTId4rl6uTqUmJQ==}
@@ -1316,41 +1360,49 @@ packages:
     resolution: {integrity: sha512-heV2+jmXyYnUrpUXSPugqWDRpnsQcDm2AX4wzTuvgdlZfoNYO0O3W2AVpJYaDn9AG4JdM6Kxom8+foE7/BcSig==}
     cpu: [arm64]
     os: [linux]
+    libc: [glibc]
 
   '@oxc-resolver/binding-linux-arm64-musl@11.19.1':
     resolution: {integrity: sha512-jvo2Pjs1c9KPxMuMPIeQsgu0mOJF9rEb3y3TdpsrqwxRM+AN6/nDDwv45n5ZrUnQMsdBy5gIabioMKnQfWo9ew==}
     cpu: [arm64]
     os: [linux]
+    libc: [musl]
 
   '@oxc-resolver/binding-linux-ppc64-gnu@11.19.1':
     resolution: {integrity: sha512-vLmdNxWCdN7Uo5suays6A/+ywBby2PWBBPXctWPg5V0+eVuzsJxgAn6MMB4mPlshskYbppjpN2Zg83ArHze9gQ==}
     cpu: [ppc64]
     os: [linux]
+    libc: [glibc]
 
   '@oxc-resolver/binding-linux-riscv64-gnu@11.19.1':
     resolution: {integrity: sha512-/b+WgR+VTSBxzgOhDO7TlMXC1ufPIMR6Vj1zN+/x+MnyXGW7prTLzU9eW85Aj7Th7CCEG9ArCbTeqxCzFWdg2w==}
     cpu: [riscv64]
     os: [linux]
+    libc: [glibc]
 
   '@oxc-resolver/binding-linux-riscv64-musl@11.19.1':
     resolution: {integrity: sha512-YlRdeWb9j42p29ROh+h4eg/OQ3dTJlpHSa+84pUM9+p6i3djtPz1q55yLJhgW9XfDch7FN1pQ/Vd6YP+xfRIuw==}
     cpu: [riscv64]
     os: [linux]
+    libc: [musl]
 
   '@oxc-resolver/binding-linux-s390x-gnu@11.19.1':
     resolution: {integrity: sha512-EDpafVOQWF8/MJynsjOGFThcqhRHy417sRyLfQmeiamJ8qVhSKAn2Dn2VVKUGCjVB9C46VGjhNo7nOPUi1x6uA==}
     cpu: [s390x]
     os: [linux]
+    libc: [glibc]
 
   '@oxc-resolver/binding-linux-x64-gnu@11.19.1':
     resolution: {integrity: sha512-NxjZe+rqWhr+RT8/Ik+5ptA3oz7tUw361Wa5RWQXKnfqwSSHdHyrw6IdcTfYuml9dM856AlKWZIUXDmA9kkiBQ==}
     cpu: [x64]
     os: [linux]
+    libc: [glibc]
 
   '@oxc-resolver/binding-linux-x64-musl@11.19.1':
     resolution: {integrity: sha512-cM/hQwsO3ReJg5kR+SpI69DMfvNCp+A/eVR4b4YClE5bVZwz8rh2Nh05InhwI5HR/9cArbEkzMjcKgTHS6UaNw==}
     cpu: [x64]
     os: [linux]
+    libc: [musl]
 
   '@oxc-resolver/binding-openharmony-arm64@11.19.1':
     resolution: {integrity: sha512-QF080IowFB0+9Rh6RcD19bdgh49BpQHUW5TajG1qvWHvmrQznTZZjYlgE2ltLXyKY+qs4F/v5xuX1XS7Is+3qA==}
@@ -1424,48 +1476,56 @@ packages:
     engines: {node: ^20.19.0 || >=22.12.0}
     cpu: [arm64]
     os: [linux]
+    libc: [glibc]
 
   '@oxfmt/binding-linux-arm64-musl@0.46.0':
     resolution: {integrity: sha512-aAUPBWJ1lGwwnxZUEDLJ94+Iy6MuwJwPxUgO4sCA5mEEyDk7b+cDQ+JpX1VR150Zoyd+D49gsrUzpUK5h587Eg==}
     engines: {node: ^20.19.0 || >=22.12.0}
     cpu: [arm64]
     os: [linux]
+    libc: [musl]
 
   '@oxfmt/binding-linux-ppc64-gnu@0.46.0':
     resolution: {integrity: sha512-ufBCJukyFX/UDrokP/r6BGDoTInnsDs7bxyzKAgMiZlt2Qu8GPJSJ6Zm6whIiJzKk0naxA8ilwmbO1LMw6Htxw==}
     engines: {node: ^20.19.0 || >=22.12.0}
     cpu: [ppc64]
     os: [linux]
+    libc: [glibc]
 
   '@oxfmt/binding-linux-riscv64-gnu@0.46.0':
     resolution: {integrity: sha512-eqtlC2YmPqjun76R1gVfGLuKWx7NuEnLEAudZ7n6ipSKbCZTqIKSs1b5Y8K/JHZsRpLkeSmAAjig5HOIg8fQzQ==}
     engines: {node: ^20.19.0 || >=22.12.0}
     cpu: [riscv64]
     os: [linux]
+    libc: [glibc]
 
   '@oxfmt/binding-linux-riscv64-musl@0.46.0':
     resolution: {integrity: sha512-yccVOO2nMXkQLGgy0He3EQEwKD7NF0zEk+/OWmroznkqXyJdN6bfK0LtNnr6/14Bh3FjpYq7bP33l/VloCnxpA==}
     engines: {node: ^20.19.0 || >=22.12.0}
     cpu: [riscv64]
     os: [linux]
+    libc: [musl]
 
   '@oxfmt/binding-linux-s390x-gnu@0.46.0':
     resolution: {integrity: sha512-aAf7fG23OQCey6VRPj9IeCraoYtpgtx0ZyJ1CXkPyT1wjzBE7c3xtuxHe/AdHaJfVVb/SXpSk8Gl1LzyQupSqw==}
     engines: {node: ^20.19.0 || >=22.12.0}
     cpu: [s390x]
     os: [linux]
+    libc: [glibc]
 
   '@oxfmt/binding-linux-x64-gnu@0.46.0':
     resolution: {integrity: sha512-q0JPsTMyJNjYrBvYFDz4WbVsafNZaPCZv4RnFypRotLqpKROtBZcEaXQW4eb9YmvLU3NckVemLJnzkSZSdmOxw==}
     engines: {node: ^20.19.0 || >=22.12.0}
     cpu: [x64]
     os: [linux]
+    libc: [glibc]
 
   '@oxfmt/binding-linux-x64-musl@0.46.0':
     resolution: {integrity: sha512-7LsLY9Cw57GPkhSR+duI3mt9baRczK/DtHYSldQ4BEU92da9igBQNl4z7Vq5U9NNPsh1FmpKvv1q9WDtiUQR1A==}
     engines: {node: ^20.19.0 || >=22.12.0}
     cpu: [x64]
     os: [linux]
+    libc: [musl]
 
   '@oxfmt/binding-openharmony-arm64@0.46.0':
     resolution: {integrity: sha512-lHiBOz8Duaku7JtRNLlps3j++eOaICPZSd8FCVmTDM4DFOPT71Bjn7g6iar1z7StXlKRweUKxWUs4sA+zWGDXg==}
@@ -1568,48 +1628,56 @@ packages:
     engines: {node: ^20.19.0 || >=22.12.0}
     cpu: [arm64]
     os: [linux]
+    libc: [glibc]
 
   '@oxlint/binding-linux-arm64-musl@1.66.0':
     resolution: {integrity: sha512-hmo+ZB/lHkR1HdDmnziNpzSLmulnUSu10VEqX2Yex7OwvoBAbjJQLvy4gIBRV3AAwWnCvAxKp5Nv1GE6LU1QMg==}
     engines: {node: ^20.19.0 || >=22.12.0}
     cpu: [arm64]
     os: [linux]
+    libc: [musl]
 
   '@oxlint/binding-linux-ppc64-gnu@1.66.0':
     resolution: {integrity: sha512-2Invd4Uyy81mVooQC5FBtfxSNrvcX1OxbMlVQ6M2erRrNI2awFYF26YNW2yFxdVFZ4ffNOWKghtMjhnUPsXsVA==}
     engines: {node: ^20.19.0 || >=22.12.0}
     cpu: [ppc64]
     os: [linux]
+    libc: [glibc]
 
   '@oxlint/binding-linux-riscv64-gnu@1.66.0':
     resolution: {integrity: sha512-s0iXPDQVdgayE3RGa/N2DZF7tjgg0TwEtD1sGoDxqPDGrIXgo45H0yHknT0f9A0yteASsweYZtDyTuVlM4aSag==}
     engines: {node: ^20.19.0 || >=22.12.0}
     cpu: [riscv64]
     os: [linux]
+    libc: [glibc]
 
   '@oxlint/binding-linux-riscv64-musl@1.66.0':
     resolution: {integrity: sha512-OekL4XFiu7RPK0JIZi8VeHgtIXPREf42t8Cy/rKEsC+P3gcqDgNAAGiyuUOpdbG4wwbfue1q4CHcCO7spSve6w==}
     engines: {node: ^20.19.0 || >=22.12.0}
     cpu: [riscv64]
     os: [linux]
+    libc: [musl]
 
   '@oxlint/binding-linux-s390x-gnu@1.66.0':
     resolution: {integrity: sha512-Ga1D0kj1SFslm34ThA/BdkUlyAYEnTsXyRC4pF0C5agZSwtGdHYWMTQWemUfBGp4RCG4QWXgdO+HmmmKqOtlBg==}
     engines: {node: ^20.19.0 || >=22.12.0}
     cpu: [s390x]
     os: [linux]
+    libc: [glibc]
 
   '@oxlint/binding-linux-x64-gnu@1.66.0':
     resolution: {integrity: sha512-p5jfP1wUZe/IC3qpQO84n9DRnf9g3lKRtLBlQq23ykyrDglHcVx7sWmVTlPuU6SBw8mNnPzyOn022G3XZHnlww==}
     engines: {node: ^20.19.0 || >=22.12.0}
     cpu: [x64]
     os: [linux]
+    libc: [glibc]
 
   '@oxlint/binding-linux-x64-musl@1.66.0':
     resolution: {integrity: sha512-vUB/sYlYZorDL1ZD+o9mRv7zbsykrrFRtmgS6R8musZqLtrPRQn1gc1eGpuX+sfdccz42STl/AqldY6XRb2upQ==}
     engines: {node: ^20.19.0 || >=22.12.0}
     cpu: [x64]
     os: [linux]
+    libc: [musl]
 
   '@oxlint/binding-openharmony-arm64@1.66.0':
     resolution: {integrity: sha512-yde+6p/F59xRkGR9H1HfngWRif1QRJjynZK349l+UI0H6w9hL3G8/AVaTHFyTtLVQ56qtNbX2/5Dc77n1ovnOg==}
@@ -1677,66 +1745,79 @@ packages:
     resolution: {integrity: sha512-F8sWbhZ7tyuEfsmOxwc2giKDQzN3+kuBLPwwZGyVkLlKGdV1nvnNwYD0fKQ8+XS6hp9nY7B+ZeK01EBUE7aHaw==}
     cpu: [arm]
     os: [linux]
+    libc: [glibc]
 
   '@rollup/rollup-linux-arm-musleabihf@4.57.1':
     resolution: {integrity: sha512-rGfNUfn0GIeXtBP1wL5MnzSj98+PZe/AXaGBCRmT0ts80lU5CATYGxXukeTX39XBKsxzFpEeK+Mrp9faXOlmrw==}
     cpu: [arm]
     os: [linux]
+    libc: [musl]
 
   '@rollup/rollup-linux-arm64-gnu@4.57.1':
     resolution: {integrity: sha512-MMtej3YHWeg/0klK2Qodf3yrNzz6CGjo2UntLvk2RSPlhzgLvYEB3frRvbEF2wRKh1Z2fDIg9KRPe1fawv7C+g==}
     cpu: [arm64]
     os: [linux]
+    libc: [glibc]
 
   '@rollup/rollup-linux-arm64-musl@4.57.1':
     resolution: {integrity: sha512-1a/qhaaOXhqXGpMFMET9VqwZakkljWHLmZOX48R0I/YLbhdxr1m4gtG1Hq7++VhVUmf+L3sTAf9op4JlhQ5u1Q==}
     cpu: [arm64]
     os: [linux]
+    libc: [musl]
 
   '@rollup/rollup-linux-loong64-gnu@4.57.1':
     resolution: {integrity: sha512-QWO6RQTZ/cqYtJMtxhkRkidoNGXc7ERPbZN7dVW5SdURuLeVU7lwKMpo18XdcmpWYd0qsP1bwKPf7DNSUinhvA==}
     cpu: [loong64]
     os: [linux]
+    libc: [glibc]
 
   '@rollup/rollup-linux-loong64-musl@4.57.1':
     resolution: {integrity: sha512-xpObYIf+8gprgWaPP32xiN5RVTi/s5FCR+XMXSKmhfoJjrpRAjCuuqQXyxUa/eJTdAE6eJ+KDKaoEqjZQxh3Gw==}
     cpu: [loong64]
     os: [linux]
+    libc: [musl]
 
   '@rollup/rollup-linux-ppc64-gnu@4.57.1':
     resolution: {integrity: sha512-4BrCgrpZo4hvzMDKRqEaW1zeecScDCR+2nZ86ATLhAoJ5FQ+lbHVD3ttKe74/c7tNT9c6F2viwB3ufwp01Oh2w==}
     cpu: [ppc64]
     os: [linux]
+    libc: [glibc]
 
   '@rollup/rollup-linux-ppc64-musl@4.57.1':
     resolution: {integrity: sha512-NOlUuzesGauESAyEYFSe3QTUguL+lvrN1HtwEEsU2rOwdUDeTMJdO5dUYl/2hKf9jWydJrO9OL/XSSf65R5+Xw==}
     cpu: [ppc64]
     os: [linux]
+    libc: [musl]
 
   '@rollup/rollup-linux-riscv64-gnu@4.57.1':
     resolution: {integrity: sha512-ptA88htVp0AwUUqhVghwDIKlvJMD/fmL/wrQj99PRHFRAG6Z5nbWoWG4o81Nt9FT+IuqUQi+L31ZKAFeJ5Is+A==}
     cpu: [riscv64]
     os: [linux]
+    libc: [glibc]
 
   '@rollup/rollup-linux-riscv64-musl@4.57.1':
     resolution: {integrity: sha512-S51t7aMMTNdmAMPpBg7OOsTdn4tySRQvklmL3RpDRyknk87+Sp3xaumlatU+ppQ+5raY7sSTcC2beGgvhENfuw==}
     cpu: [riscv64]
     os: [linux]
+    libc: [musl]
 
   '@rollup/rollup-linux-s390x-gnu@4.57.1':
     resolution: {integrity: sha512-Bl00OFnVFkL82FHbEqy3k5CUCKH6OEJL54KCyx2oqsmZnFTR8IoNqBF+mjQVcRCT5sB6yOvK8A37LNm/kPJiZg==}
     cpu: [s390x]
     os: [linux]
+    libc: [glibc]
 
   '@rollup/rollup-linux-x64-gnu@4.57.1':
     resolution: {integrity: sha512-ABca4ceT4N+Tv/GtotnWAeXZUZuM/9AQyCyKYyKnpk4yoA7QIAuBt6Hkgpw8kActYlew2mvckXkvx0FfoInnLg==}
     cpu: [x64]
     os: [linux]
+    libc: [glibc]
 
   '@rollup/rollup-linux-x64-musl@4.57.1':
     resolution: {integrity: sha512-HFps0JeGtuOR2convgRRkHCekD7j+gdAuXM+/i6kGzQtFhlCtQkpwtNzkNj6QhCDp7DRJ7+qC/1Vg2jt5iSOFw==}
     cpu: [x64]
     os: [linux]
+    libc: [musl]
 
   '@rollup/rollup-openbsd-x64@4.57.1':
     resolution: {integrity: sha512-H+hXEv9gdVQuDTgnqD+SQffoWoc0Of59AStSzTEj/feWTBAnSfSD3+Dql1ZruJQxmykT/JVY0dE8Ka7z0DH1hw==}
@@ -1908,24 +1989,28 @@ packages:
     engines: {node: '>= 10'}
     cpu: [arm64]
     os: [linux]
+    libc: [glibc]
 
   '@tailwindcss/oxide-linux-arm64-musl@4.1.18':
     resolution: {integrity: sha512-1px92582HkPQlaaCkdRcio71p8bc8i/ap5807tPRDK/uw953cauQBT8c5tVGkOwrHMfc2Yh6UuxaH4vtTjGvHg==}
     engines: {node: '>= 10'}
     cpu: [arm64]
     os: [linux]
+    libc: [musl]
 
   '@tailwindcss/oxide-linux-x64-gnu@4.1.18':
     resolution: {integrity: sha512-v3gyT0ivkfBLoZGF9LyHmts0Isc8jHZyVcbzio6Wpzifg/+5ZJpDiRiUhDLkcr7f/r38SWNe7ucxmGW3j3Kb/g==}
     engines: {node: '>= 10'}
     cpu: [x64]
     os: [linux]
+    libc: [glibc]
 
   '@tailwindcss/oxide-linux-x64-musl@4.1.18':
     resolution: {integrity: sha512-bhJ2y2OQNlcRwwgOAGMY0xTFStt4/wyU6pvI6LSuZpRgKQwxTec0/3Scu91O8ir7qCR3AuepQKLU/kX99FouqQ==}
     engines: {node: '>= 10'}
     cpu: [x64]
     os: [linux]
+    libc: [musl]
 
   '@tailwindcss/oxide-wasm32-wasi@4.1.18':
     resolution: {integrity: sha512-LffYTvPjODiP6PT16oNeUQJzNVyJl1cjIebq/rWWBF+3eDst5JGEFSc5cWxyRCJ0Mxl+KyIkqRxk1XPEs9x8TA==}
@@ -2174,24 +2259,28 @@ packages:
     engines: {node: ^20.19.0 || >=22.12.0}
     cpu: [arm64]
     os: [linux]
+    libc: [glibc]
 
   '@voidzero-dev/vite-plus-linux-arm64-musl@0.1.20':
     resolution: {integrity: sha512-Oh/pxMdTLR/wsDl/OONjItjLOeTewFBLuKkH5RQmcI9g3AVqKzLj1/uawujgysBI5E25tonRRK7I2q/zu8Uqvg==}
     engines: {node: ^20.19.0 || >=22.12.0}
     cpu: [arm64]
     os: [linux]
+    libc: [musl]
 
   '@voidzero-dev/vite-plus-linux-x64-gnu@0.1.20':
     resolution: {integrity: sha512-msO1ZoUX5aSK8L6kN1C3XQO4CcH9aFsNPRSNcO1cjk1kTnaLyVYzkVxgvbh3vk7nzZAAMkmyZ4SlMpqJrdahrg==}
     engines: {node: ^20.19.0 || >=22.12.0}
     cpu: [x64]
     os: [linux]
+    libc: [glibc]
 
   '@voidzero-dev/vite-plus-linux-x64-musl@0.1.20':
     resolution: {integrity: sha512-U93urREvg23ZFDkxKkkfWWIOI4GI9erhbWAZpXG+GeYqygWKrVC6PUTXiuexVg3/CFg2sSMTdm1W6V7TFG5hYA==}
     engines: {node: ^20.19.0 || >=22.12.0}
     cpu: [x64]
     os: [linux]
+    libc: [musl]
 
   '@voidzero-dev/vite-plus-test@0.1.20':
     resolution: {integrity: sha512-vy2dJYw1bhgQ/+BrQrfwPlSKzQ2mm3YLJ9kGF7Yo0UJ2P3XKpshtgFIWLjSg/IASnC93OAx0c/7j3NM0I1RMuA==}
@@ -2844,24 +2933,28 @@ packages:
     engines: {node: '>= 12.0.0'}
     cpu: [arm64]
     os: [linux]
+    libc: [glibc]
 
   lightningcss-linux-arm64-musl@1.30.2:
     resolution: {integrity: sha512-5Vh9dGeblpTxWHpOx8iauV02popZDsCYMPIgiuw97OJ5uaDsL86cnqSFs5LZkG3ghHoX5isLgWzMs+eD1YzrnA==}
     engines: {node: '>= 12.0.0'}
     cpu: [arm64]
     os: [linux]
+    libc: [musl]
 
   lightningcss-linux-x64-gnu@1.30.2:
     resolution: {integrity: sha512-Cfd46gdmj1vQ+lR6VRTTadNHu6ALuw2pKR9lYq4FnhvgBc4zWY1EtZcAc6EffShbb1MFrIPfLDXD6Xprbnni4w==}
     engines: {node: '>= 12.0.0'}
     cpu: [x64]
     os: [linux]
+    libc: [glibc]
 
   lightningcss-linux-x64-musl@1.30.2:
     resolution: {integrity: sha512-XJaLUUFXb6/QG2lGIW6aIk6jKdtjtcffUT0NKvIqhSBY3hh9Ch+1LCeH80dR9q9LBjG3ewbDjnumefsLsP6aiA==}
     engines: {node: '>= 12.0.0'}
     cpu: [x64]
     os: [linux]
+    libc: [musl]
 
   lightningcss-win32-arm64-msvc@1.30.2:
     resolution: {integrity: sha512-FZn+vaj7zLv//D/192WFFVA0RgHawIcHqLX9xuWiQt7P0PtdFEVaxgF9rjM/IRYHQXNnk61/H/gb2Ei+kUQ4xQ==}