proposal: benchmark harness vs OPA / Cedar by kanywst · Pull Request #8 · 0-draft/zopa

kanywst · 2026-05-09T12:48:11Z

Design doc only. Tracks the compiled-policy benchmark item from ROADMAP near term.

See docs/proposals/benchmark-harness.md.

Summary:

Reproducible harness comparing zopa, OPA WASM SDK, OPA HTTP sidecar, and Cedar native.
Metrics: p50/p95/p99 latency, memory floor, cold-start, throughput.
Fixture set covers static / header-eq / nested-every / deep-AST cases. Each fixture carries Rego, Cedar, and zopa AST forms; runner asserts decision parity before timing.
CI runs a smoke pass per PR; a nightly full bench on main commits its results.
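The "assert decision parity before timing" step could look roughly like the sketch below. It is only an illustration of the idea: `assertDecisionParity`, the adapter names, and the fixture shape are hypothetical and not taken from the proposal, and the stub engines stand in for real zopa / OPA / Cedar calls.

```javascript
// Hypothetical engine adapters: each maps a fixture to a boolean decision.
// Real adapters would call zopa.wasm, the OPA WASM SDK, an OPA sidecar, or Cedar.
function assertDecisionParity(fixture, engines) {
  const decisions = Object.entries(engines).map(([name, evaluate]) => ({
    name,
    allow: evaluate(fixture),
  }));

  // Every engine must return the same answer as the first one.
  const expected = decisions[0].allow;
  const mismatched = decisions.filter((d) => d.allow !== expected);
  if (mismatched.length > 0) {
    const detail = decisions.map((d) => `${d.name}=${d.allow}`).join(", ");
    throw new Error(`decision mismatch on ${fixture.name}: ${detail}`);
  }
  return expected;
}

// Stand-in fixture and engines, for illustration only (assumed shapes).
const parityFixture = { name: "02_header_eq", input: { method: "GET" } };
const stubEngines = {
  zopa: (f) => f.input.method === "GET",
  opaWasm: (f) => f.input.method === "GET",
  cedar: (f) => f.input.method === "GET",
};

const agreed = assertDecisionParity(parityFixture, stubEngines);
console.log(`${parityFixture.name}: all engines agree (allow=${agreed})`);
```

Failing before any timing starts keeps a fixture whose policies have drifted apart from ever producing a latency number.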

Status: design only, no implementation.

Summary by CodeRabbit

New Features
- WebAssembly latency benchmarking with p50/p95/p99 and mean metrics
- Initial benchmark fixtures covering simple policy scenarios
- A Node-based benchmark runner that executes fixture suites and prints results
Documentation
- Added benchmark framework proposal and a README documenting scope, runner behavior, and example outputs
Chores
- CI updated to run unit tests and a smoke benchmark; benchmark step integrated into build workflow

coderabbitai · 2026-05-09T12:48:16Z


ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c43694bb-ecb7-4409-87d3-531b2286f232

📥 Commits

Reviewing files that changed from the base of the PR and between d7b933c and e6efca1.

📒 Files selected for processing (3)

.github/workflows/ci.yml
bench/fixtures/02_header_eq.json
bench/run.mjs

📝 Walkthrough


This PR establishes a latency benchmark harness for zopa.wasm: design proposal, JSON fixtures, a Node.js runner that measures percentile latencies, build/CI integration, and user-facing README with run instructions.

Changes

Benchmark Harness

| Layer / File(s) | Summary |
| --- | --- |
| Design Proposal `docs/proposals/benchmark-harness.md` | Architecture, metrics, fixture format, CI plan, and directory layout for reproducible latency benchmarking. |
| Benchmark Fixtures `bench/fixtures/01_static.json`, `bench/fixtures/02_header_eq.json` | JSON fixtures encoding input, AST, and reference Rego/Cedar policies for baseline and header-equality cases. |
| Benchmark Runner `bench/run.mjs` | Node ESM runner that instantiates `zopa.wasm`, allocates and frees memory, performs warmups and measured iterations, computes p50/p95/p99/mean, and prints a results table. |
| Build & CI Integration `build.zig`, `.github/workflows/ci.yml` | `build.zig` adds a `bench` step (`node bench/run.mjs`). CI adds `test-unit` and a `bench (smoke)` job that depends on `build` and runs `zig build bench` with Node.js setup. |
| User Documentation `bench/README.md` | Documents scope (zopa-only), layout, build/run commands, iteration strategy, and example output with measurement notes. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes


🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'proposal: benchmark harness vs OPA / Cedar' accurately summarizes the main change: a new benchmark harness proposal comparing zopa against OPA and Cedar implementations. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


New 'zig build bench' target runs bench/run.mjs against the freshly built zopa.wasm. It reports p50/p95/p99/mean in microseconds for each fixture in bench/fixtures/.

Two starter fixtures:
- 01_static (literal allow:true)
- 02_header_eq (input.method == 'GET')

Each fixture also carries the equivalent Rego/Cedar source for cross-engine runs added later.

Cross-engine comparison (OPA WASM SDK, OPA HTTP, Cedar) is deferred per docs/proposals/benchmark-harness.md until the OPA conformance harness lands; without 'same answer' assertions a head-to-head latency number isn't honest.

Sample numbers from a local M-series Mac, --release=small:
- 01_static: p50 1.54us, p99 3.46us
- 02_header_eq: p50 4.42us, p99 5.17us
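For readers unfamiliar with how a p50/p95/p99/mean row is derived from raw samples, here is a minimal sketch. It assumes the nearest-rank percentile definition and per-call latencies already expressed in microseconds; the actual method in bench/run.mjs may differ, and `percentile`, `summarize`, and `formatRow` are illustrative names, not the runner's own.

```javascript
// Nearest-rank percentile over an ascending-sorted array (assumed definition).
function percentile(sorted, p) {
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

// Summarize per-call latencies (microseconds) as p50/p95/p99/mean.
function summarize(samplesUs) {
  const sorted = [...samplesUs].sort((a, b) => a - b);
  const total = sorted.reduce((sum, v) => sum + v, 0);
  return {
    p50: percentile(sorted, 50),
    p95: percentile(sorted, 95),
    p99: percentile(sorted, 99),
    mean: total / sorted.length,
  };
}

// Render one results-table row for a fixture.
function formatRow(name, stats) {
  const cells = ["p50", "p95", "p99", "mean"].map(
    (key) => `${key} ${stats[key].toFixed(2)}us`
  );
  return `${name.padEnd(16)}${cells.join("  ")}`;
}

// Synthetic samples (1..100 us), not measurements from any real run.
const syntheticSamples = Array.from({ length: 100 }, (_, i) => i + 1);
const syntheticStats = summarize(syntheticSamples);
console.log(formatRow("synthetic", syntheticStats));
```

Reporting tail percentiles alongside the mean matters here because a handful of slow calls (GC pauses, scheduler noise) can shift the mean without changing the typical latency.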

- test-unit: runs 'zig build test-unit' so host-side Zig unit tests are exercised in CI (matches feat/string-builtins and feat/composite-ref-iteration).
- bench: runs 'zig build bench' (Node-driven, ~22000 evaluations total over 2 fixtures) and prints the latency table to the run log. Not gated; the goal is visibility, not perf-regression enforcement (CI runners have variable noise floors).

- bench/fixtures/02_header_eq.json: rewrite Cedar policy from 'action == Action::"GET"' (matches an action named GET) to 'action when context.method == "GET"' (checks the request context method attribute, matching the Rego/AST semantics). Cosmetic for now since cross-engine bench is deferred.
- bench/run.mjs: validate evaluate() return value during warmup; -1 (parse / depth-cap / unknown-node failures) now raises so the benchmark never times an error path. The hot ITERS loop stays branch-free since warmup already certified the fixture.
- bench/run.mjs: comment block above the proxy stub env explaining the stubs always succeed and would mask errors if reached, plus the harness-drift check.
- .github/workflows/ci.yml: drop 'needs: build' from bench job. 'zig build bench' rebuilds via the install step anyway, so the job dep adds wait time without saving work.

Skipped (false positive): bench/README.md reference to docs/proposals/opa-conformance-harness.md is intentional. Cross-engine bench depends on conformance landing first; the link points to the right proposal.
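The warmup-validation change described above can be pictured with the sketch below. It is not the code from bench/run.mjs: `certifyThenTime` and the stub `evaluate` functions are hypothetical, and only the -1 error convention comes from the commit message. Timing uses the global `performance.now()`, which is an assumption about the clock the runner uses.

```javascript
// Per the commit message, -1 signals parse / depth-cap / unknown-node failures.
const EVAL_ERROR = -1;

// Validate during warmup, then time a hot loop that never inspects the result.
function certifyThenTime(evaluate, warmup, iters) {
  for (let i = 0; i < warmup; i++) {
    const rc = evaluate();
    if (rc === EVAL_ERROR) {
      throw new Error(`evaluate() returned ${rc} during warmup (iteration ${i})`);
    }
  }

  // Hot loop: no validation branch, since warmup already certified the fixture.
  const samplesUs = new Float64Array(iters);
  for (let i = 0; i < iters; i++) {
    const start = performance.now();
    evaluate();
    samplesUs[i] = (performance.now() - start) * 1000; // milliseconds -> microseconds
  }
  return samplesUs;
}

// Stand-ins for the wasm export: one healthy fixture, one that fails to parse.
const healthyEvaluate = () => 1;
const brokenEvaluate = () => EVAL_ERROR;

const timedSamples = certifyThenTime(healthyEvaluate, 10, 100);
console.log(`collected ${timedSamples.length} samples`);

try {
  certifyThenTime(brokenEvaluate, 10, 100);
} catch (err) {
  console.log(`rejected before timing: ${err.message}`);
}
```

The trade-off is that a fixture which starts failing only after warmup would go unnoticed, which is acceptable when evaluation is deterministic for a fixed input.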


kanywst marked this pull request as ready for review May 9, 2026 13:41


kanywst added 3 commits May 9, 2026 23:41

docs(proposal): benchmark harness vs OPA / Cedar

0e6ac62

kanywst force-pushed the feat/benchmark-harness branch from 3d1f406 to d7b933c Compare May 9, 2026 14:41

kanywst merged commit 6ce0b51 into main May 9, 2026
11 checks passed

kanywst deleted the feat/benchmark-harness branch May 9, 2026 14:49

kanywst merged 4 commits into main from feat/benchmark-harness
