proposal: benchmark harness vs OPA / Cedar #8

Merged
kanywst merged 4 commits into main from feat/benchmark-harness
May 9, 2026

@kanywst kanywst commented May 9, 2026

Member

Design doc only. Tracks the compiled-policy benchmark item from the near-term section of ROADMAP.

See docs/proposals/benchmark-harness.md.

Summary:

  • Reproducible harness comparing zopa, OPA WASM SDK, OPA HTTP sidecar, and Cedar native.
  • Metrics: p50/p95/p99 latency, memory floor, cold-start, throughput.
  • Fixture set covers static / header-eq / nested-every / deep-AST cases. Each fixture carries Rego, Cedar, and zopa AST forms; runner asserts decision parity before timing.
  • CI runs a smoke pass on every PR; a nightly full benchmark run on main commits its results.

Status: design only, no implementation.
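
The "assert decision parity before timing" step in the summary could look roughly like the sketch below. The engine adapters, the `fixture` shape, and the function name are all hypothetical stand-ins; the proposal does not specify them, and real adapters would call zopa.wasm, the OPA WASM SDK, the OPA HTTP sidecar, or Cedar.

```javascript
// Hypothetical helper: run one fixture through every engine adapter and
// refuse to continue if any two engines disagree on the decision.
function assertDecisionParity(fixture, engines) {
  const decisions = Object.entries(engines).map(([name, evaluate]) => ({
    name,
    decision: evaluate(fixture),
  }));
  const [reference, ...rest] = decisions;
  const mismatches = rest.filter((d) => d.decision !== reference.decision);
  if (mismatches.length > 0) {
    const detail = decisions.map((d) => `${d.name}=${d.decision}`).join(", ");
    throw new Error(`decision mismatch for ${fixture.name}: ${detail}`);
  }
  return reference.decision;
}

// Stand-in adapters (assumptions): each one just mirrors the header-eq rule.
const fixture = { name: "02_header_eq", input: { method: "GET" } };
const engines = {
  zopa: (f) => f.input.method === "GET",
  opaWasm: (f) => f.input.method === "GET",
  cedar: (f) => f.input.method === "GET",
};
const agreed = assertDecisionParity(fixture, engines);
```

Only fixtures that pass this check would move on to the timed phase, so a latency number is never reported for engines that return different answers.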

Summary by CodeRabbit

  • New Features

    • WebAssembly latency benchmarking with p50/p95/p99 and mean metrics
    • Initial benchmark fixtures covering simple policy scenarios
    • A Node-based benchmark runner that executes fixture suites and prints results
  • Documentation

    • Added benchmark framework proposal and a README documenting scope, runner behavior, and example outputs
  • Chores

    • CI updated to run unit tests and a smoke benchmark; benchmark step integrated into build workflow

coderabbitai Bot commented May 9, 2026

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c43694bb-ecb7-4409-87d3-531b2286f232

📥 Commits

Reviewing files that changed from the base of the PR and between d7b933c and e6efca1.

📒 Files selected for processing (3)
  • .github/workflows/ci.yml
  • bench/fixtures/02_header_eq.json
  • bench/run.mjs
📝 Walkthrough

This PR establishes a latency benchmark harness for zopa.wasm: a design proposal, JSON fixtures, a Node.js runner that measures percentile latencies, build/CI integration, and a user-facing README with run instructions.

Changes

Benchmark Harness

| Layer | File(s) | Summary |
|---|---|---|
| Design Proposal | docs/proposals/benchmark-harness.md | Architecture, metrics, fixture format, CI plan, and directory layout for reproducible latency benchmarking. |
| Benchmark Fixtures | bench/fixtures/01_static.json, bench/fixtures/02_header_eq.json | JSON fixtures encoding input, AST, and reference Rego/Cedar policies for baseline and header-equality cases. |
| Benchmark Runner | bench/run.mjs | Node ESM runner that instantiates zopa.wasm, allocates and frees memory, performs warmups and measured iterations, computes p50/p95/p99/mean, and prints a results table. |
| Build & CI Integration | build.zig, .github/workflows/ci.yml | build.zig adds a bench step (node bench/run.mjs). CI adds a test-unit job and a bench (smoke) job that depends on build and runs zig build bench with Node.js set up. |
| User Documentation | bench/README.md | Documents scope (zopa-only), layout, build/run commands, iteration strategy, and example output with measurement notes. |
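
For readers unfamiliar with how a runner might turn raw samples into p50/p95/p99/mean, here is a minimal sketch using the nearest-rank percentile method. This is an assumption about one reasonable approach, not a copy of bench/run.mjs; the function names are illustrative.

```javascript
// Nearest-rank percentile over an ascending-sorted array of samples.
function percentile(sorted, p) {
  if (sorted.length === 0) throw new Error("no samples");
  const rank = Math.ceil((p / 100) * sorted.length); // 1-based rank
  return sorted[Math.max(rank, 1) - 1];
}

// Summarize latency samples (microseconds) into the four reported metrics.
function summarize(samplesUs) {
  // Typed arrays sort numerically by default, unlike plain Array#sort.
  const sorted = Float64Array.from(samplesUs).sort();
  let sum = 0;
  for (const s of sorted) sum += s;
  return {
    p50: percentile(sorted, 50),
    p95: percentile(sorted, 95),
    p99: percentile(sorted, 99),
    mean: sum / sorted.length,
  };
}

const stats = summarize([5, 1, 4, 2, 3, 10, 7, 6, 9, 8]);
console.log(stats);
```

With only ten samples the p95 and p99 both land on the largest value, which is one reason a real run needs many more iterations before the tail percentiles mean anything.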

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 In circuits where the WASM hums,
I count the hops and tiny drums.
Warmups first, then timed delights,
p50, p95, through nights.
A carrot chart of latency crumbs.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'proposal: benchmark harness vs OPA / Cedar' accurately summarizes the main change: a new benchmark harness proposal comparing zopa against OPA and Cedar implementations. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

gemini-code-assist[bot]

This comment was marked as resolved.

@kanywst kanywst marked this pull request as ready for review May 9, 2026 13:41
coderabbitai[bot]

This comment was marked as resolved.

kanywst added 3 commits May 9, 2026 23:41
New 'zig build bench' target runs bench/run.mjs against the
freshly built zopa.wasm. Reports p50/p95/p99/mean in microseconds
across each fixture in bench/fixtures/.

Two starter fixtures: 01_static (literal allow:true) and
02_header_eq (input.method == 'GET'). Each fixture also carries
the equivalent rego/cedar source for cross-engine runs added
later.

Cross-engine comparison (OPA WASM SDK, OPA HTTP, Cedar) is
deferred per docs/proposals/benchmark-harness.md until the OPA
conformance harness lands; without 'same answer' assertions a
head-to-head latency number isn't honest.

Sample numbers from a local M-series Mac, --release=small:
  01_static     p50 1.54us  p99 3.46us
  02_header_eq  p50 4.42us  p99 5.17us

- test-unit: runs 'zig build test-unit' so host-side Zig unit tests
  are exercised in CI (matches feat/string-builtins and
  feat/composite-ref-iteration).
- bench: runs 'zig build bench' (Node-driven, ~22000 evaluations
  total over 2 fixtures) and prints the latency table to the run
  log. Not gated; the goal is visibility, not perf-regression
  enforcement (CI runners have variable noise floors).
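
As a rough check on the "~22000 evaluations" figure: the commit message gives only the total and the fixture count, so the per-fixture split below (1,000 warmup plus 10,000 timed iterations) is an assumption chosen to match that total, not a value taken from bench/run.mjs.

```javascript
// Hypothetical per-fixture split; only the ~22000 total over 2 fixtures
// is stated in the commit message.
const WARMUP = 1_000;
const ITERS = 10_000;

// Every fixture pays for its warmup and its timed iterations.
function totalEvaluations(fixtureCount, warmup = WARMUP, iters = ITERS) {
  return fixtureCount * (warmup + iters);
}

const total = totalEvaluations(2);
console.log(total); // 22000
```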
@kanywst kanywst force-pushed the feat/benchmark-harness branch from 3d1f406 to d7b933c Compare May 9, 2026 14:41
- bench/fixtures/02_header_eq.json: rewrite Cedar policy from
  'action == Action::"GET"' (matches an action named GET) to
  'action when context.method == "GET"' (checks the request
  context method attribute, matching the Rego/AST semantics).
  Cosmetic for now since cross-engine bench is deferred.

- bench/run.mjs: validate evaluate() return value during warmup;
  -1 (parse / depth-cap / unknown-node failures) now raises so the
  benchmark never times an error path. The hot ITERS loop stays
  branch-free since warmup already certified the fixture.

- bench/run.mjs: comment block above the proxy stub env explaining
  the stubs always succeed and would mask errors if reached, plus
  the harness-drift check.

- .github/workflows/ci.yml: drop 'needs: build' from bench job.
  'zig build bench' rebuilds via the install step anyway, so the
  job dep adds wait time without saving work.

Skipped (false positive): bench/README.md reference to
docs/proposals/opa-conformance-harness.md is intentional. Cross-engine
bench depends on conformance landing first; the link points to the
right proposal.
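
The warmup-validation change described above (check the return value during warmup, keep the timed loop free of that check) might be structured along these lines. This is a sketch under assumptions: `evaluate` stands in for the wasm export, -1 is treated as the failure code as the commit message states, and the option names are made up for illustration.

```javascript
// Run warmup with validation, then a timed loop with no validation branch.
function bench(evaluate, { warmup, iters }) {
  // Warmup: every call is checked so an error path is never timed later.
  for (let i = 0; i < warmup; i++) {
    const rc = evaluate();
    if (rc === -1) {
      throw new Error(`evaluate() returned -1 during warmup (iteration ${i})`);
    }
  }
  // Timed loop: no return-value check; warmup already certified the fixture.
  const samplesUs = new Float64Array(iters);
  for (let i = 0; i < iters; i++) {
    const t0 = performance.now();
    evaluate();
    samplesUs[i] = (performance.now() - t0) * 1000; // milliseconds to microseconds
  }
  return samplesUs;
}

// Stand-in evaluators (assumptions): one always allows, one always fails.
const samples = bench(() => 1, { warmup: 10, iters: 100 });
let warmupRejected = false;
try {
  bench(() => -1, { warmup: 10, iters: 100 });
} catch (e) {
  warmupRejected = true;
}
```

The trade-off is that a fixture which starts failing only after warmup would go unnoticed, which is acceptable when evaluation is deterministic for a fixed input.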
@kanywst kanywst merged commit 6ce0b51 into main May 9, 2026
11 checks passed
@kanywst kanywst deleted the feat/benchmark-harness branch May 9, 2026 14:49