161 changes: 161 additions & 0 deletions docs/proposals/body-aware-policies.md
@@ -0,0 +1,161 @@
# Body-aware policies

Status: Proposed (draft PR, design doc only).
Tracking: ROADMAP.md → Near term.

## Motivation

Today zopa decides allow/deny purely from request headers. That covers
authn-style checks ("Authorization must match a SPIFFE pattern",
"method must be GET"), but it can't reason about the body.

Use cases that need body access:

- Reject requests where a JSON body field falls outside a numeric range
(`input.body.amount > 10000` → deny).
- Block form posts that lack a CSRF nonce field.
- Refuse requests whose body matches a deny-listed substring (cheap WAF).
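
As an illustration, the first use case could be written as an `allow_body` policy in the AST shape that the `src/eval.zig` tests in this PR use. This is a sketch rather than a shipped policy: the 10000 threshold is taken from the example above, and the default-allow rule is an assumption about how unmatched bodies should be treated.

```json
{
  "type": "module",
  "rules": [
    {
      "type": "rule",
      "name": "allow_body",
      "default": true,
      "value": { "type": "value", "value": true }
    },
    {
      "type": "rule",
      "name": "allow_body",
      "body": [
        {
          "type": "gt",
          "left": { "type": "ref", "path": ["input", "body", "amount"] },
          "right": { "type": "value", "value": 10000 }
        }
      ],
      "value": { "type": "value", "value": false }
    }
  ]
}
```

Going by the test cases in this PR, a body such as `{"amount": 25000}` would be expected to trigger the second rule and deny, while `{"amount": 250}` would fall through to the default and allow.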

Currently `proxy_on_request_body` is a no-op (`src/proxy_wasm.zig`)
because by the time it fires, the request pseudo-headers (`:method`,
`:path`, `:authority`) have already been cleared from the header map.
A rule that references both `input.method` and `input.body.amount`
cannot be evaluated in either callback alone.

## Goals

1. Implement `proxy_on_request_body`:
   - Wait for `end_of_stream`; v1 inspects at most `max_body_bytes` of
     the body.
   - Read the body via `proxy_get_buffer_bytes(BufferType.HttpRequestBody)`.
   - Build an `input` JSON containing the parsed body (`body`) and the
     raw bytes (`body_raw`).
   - Evaluate the AST against a separate target rule `allow_body`.
   - On deny, call `proxy_send_local_response(403)` and return Pause;
     on allow, return Continue.

### v1 vs v2

**v1 (this PR)**: body callback always runs against `allow_body`
when it fires, with body-only input (no request snapshot, no
opt-in flag, no per-prefix optimization). Header-side `allow` keeps
its existing flat input untouched.

**v2 (deferred)**:

- Per-context snapshot of `:method` / `:path` / `:authority` /
selected headers in `proxy_on_request_headers`, surfaced to the
body rule as `input.method` etc.
- Opt-in plugin config flag `require_body_eval: true` so hosts
that don't need body inspection skip buffering. When the flag
is on, the header phase still evaluates `allow` and short-circuits
on header-only deny decisions before the body fires (saves CPU
and the buffering cost on rejected requests).
- Static AST analysis (see `streaming-evaluation.md`) can flip the
flag automatically when the policy references `input.body.*`.

## Non-goals

- Streaming evaluation (tracked separately in
`streaming-evaluation.md`). v1 buffers up to `max_body_bytes`.
- Mutating the body. zopa stays decision-only.
- Binary body parsers (protobuf, msgpack). v1 is JSON-only via
`src/json.zig`. Other shapes get the raw byte slice as
`input.body_raw`.

## Design sketch

### Per-context state (v2 only)

The v2 snapshot would look like:

```zig
const RequestContext = struct {
    method: ?[]const u8 = null,
    path: ?[]const u8 = null,
    authority: ?[]const u8 = null,
    headers: ?json.Value = null,
};
```

with an `AutoHashMap(u32, *RequestContext)` keyed by `context_id`. The
naive design holds the map in `host_allocator`, which means every
field needs a manual free and `proxy_on_done` has to deep-free the
inner string slices and the parsed `json.Value` tree.

A cleaner approach (recommended, captured here so v2 starts from the
right shape): give each context its **own arena**, allocated lazily
on the first header callback and reset or freed in `proxy_on_done`.
A single arena teardown then releases the whole snapshot, with no
per-field frees.
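
A minimal sketch of the per-context arena idea, assuming a trimmed-down `RequestContext` that embeds its own `std.heap.ArenaAllocator`. The helper names `getOrCreate`, `snapshot`, and `release` are hypothetical and do not exist in zopa; `gpa` stands in for whatever backing allocator v2 ends up using.

```zig
const std = @import("std");

/// Hypothetical v2 per-context state: the arena owns every snapshot string.
const RequestContext = struct {
    arena: std.heap.ArenaAllocator,
    method: ?[]const u8 = null,
    path: ?[]const u8 = null,
};

const ContextMap = std.AutoHashMap(u32, *RequestContext);

/// Return the context for `id`, creating it (and its arena) on first use.
fn getOrCreate(map: *ContextMap, gpa: std.mem.Allocator, id: u32) !*RequestContext {
    if (map.get(id)) |existing| return existing;
    const ctx = try gpa.create(RequestContext);
    errdefer gpa.destroy(ctx);
    ctx.* = .{ .arena = std.heap.ArenaAllocator.init(gpa) };
    try map.put(id, ctx);
    return ctx;
}

/// Copy the pseudo-header values into the context arena so they outlive
/// the header callback.
fn snapshot(ctx: *RequestContext, method: []const u8, path: []const u8) !void {
    const a = ctx.arena.allocator();
    ctx.method = try a.dupe(u8, method);
    ctx.path = try a.dupe(u8, path);
}

/// Drop the context: one arena deinit frees every snapshot string.
fn release(map: *ContextMap, gpa: std.mem.Allocator, id: u32) void {
    const entry = map.fetchRemove(id) orelse return;
    entry.value.arena.deinit();
    gpa.destroy(entry.value);
}
```

In this shape the header callback would call `getOrCreate` followed by `snapshot`, and `proxy_on_done` would call `release`, so no field needs its own free.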

**v1 has no per-context state.** Header / body callbacks operate
independently and the body rule only sees the body itself.

### Input shape

v1 (this PR) — body-only:

```json
{
  "body": { "amount": 250 },
  "body_raw": "{\"amount\":250}"
}
```

v2 (deferred) — body plus the request snapshot:

```json
{
  "method": "POST",
  "path": "/orders",
  "headers": { "...": "..." },
  "body": { "amount": 250 },
  "body_raw": "{\"amount\":250}"
}
```

`body` is set to JSON `null` (not `undefined` -- that is not a JSON
value) when the body fails to parse as JSON, when the body is empty,
or when truncation by `max_body_bytes` leaves bytes that no longer
parse. In every case `body_raw` carries whatever bytes the host
returned (capped). Rego-style policies that want to distinguish "no
body" from "non-JSON body" can branch on `body_raw`: both cases have
`body == null`, but only an empty body has `body_raw == ""`.
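
For illustration, a rule that denies requests with no body could compare `body_raw` against the empty string. The fragment below is a sketch modeled on the `body_raw` test in this PR, with `""` substituted for the test's literal; whether `eq` against an empty string behaves as intended is an assumption, not something the PR tests.

```json
{
  "type": "rule",
  "name": "allow_body",
  "body": [
    {
      "type": "eq",
      "left": { "type": "ref", "path": ["input", "body_raw"] },
      "right": { "type": "value", "value": "" }
    }
  ],
  "value": { "type": "value", "value": false }
}
```

Note that the v1 callback in this PR returns Continue before evaluating when `body_size <= 0`, so a rule like this would only become reachable if that early return changes.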

### Buffer limit

v1 hardcodes `max_body_bytes = 64 * 1024`. v2 will lift this into
the plugin config alongside `require_body_eval`. When the body is
larger than the cap, `proxy_get_buffer_bytes(start=0, max=cap)`
already truncates on the host side -- v1 does not re-truncate, so
`body_raw` length is always `<= cap` and `body` is `null` whenever
the truncated bytes do not parse as a complete JSON document.
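
The cap arithmetic can be sketched as below. `cappedLen` and `isTruncated` are hypothetical helper names used only for illustration; the constant mirrors the v1 value stated above.

```zig
const std = @import("std");

// Mirrors the v1 constant from this proposal (64 KiB = 65536 bytes).
const max_body_bytes: usize = 64 * 1024;

/// Number of bytes to request from the host for a body of `body_size` bytes.
fn cappedLen(body_size: usize) usize {
    return @min(body_size, max_body_bytes);
}

/// True when the host-side read would drop trailing bytes, in which case
/// the parsed `body` is expected to end up `null`.
fn isTruncated(body_size: usize) bool {
    return body_size > max_body_bytes;
}
```

A 100 KiB body would therefore be read as its first 65536 bytes, while a body of exactly 64 KiB is read whole.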

## API impact

- `proxy_on_request_body` returns `Action.Pause` on deny only (sends
403 first via `proxy_send_local_response`). Allow returns Continue.
- New target rule name `allow_body` joins `allow` and `allow_response`.
- New AST refs become valid under `allow_body`: `input.body.<path>`,
`input.body_raw`.
- Existing `allow` policies continue to work unchanged (request-side
input shape stays flat with `input.method` / `input.path` /
`input.headers`).

## Test plan

- Node integration test: drive `evaluate` with a synthetic input that
includes `body`, verify `ref` resolves into the body subtree.
- Envoy integration test: extend `examples/envoy/run.sh` with a POST
case that depends on a body field.
- wasmtime test (v2, once the per-context snapshot exists): simulate
  `proxy_on_request_headers` then `proxy_on_request_body`, check the
  snapshot survives between callbacks.

## Open questions

- How to surface non-JSON bodies (`application/x-www-form-urlencoded`,
binary protocols)? Either a small parser in `src/json.zig` or push
the burden to the host through a richer input ABI.
- Right default for `max_body_bytes`? 64 KiB feels small for GraphQL,
large for control-plane chatter.
- Should `require_body_eval` be inferred from the policy AST (does it
reference `input.body`)? Static AST analysis would flip the flag
ergonomically.
65 changes: 59 additions & 6 deletions src/eval.zig
@@ -46,9 +46,9 @@ const Scope = struct {
///
/// Targets the default package ("") and the default rule ("allow").
/// Use `evaluateWithTarget` to pick a non-default rule (e.g.
/// "allow_response" or "allow_body"), or `evaluateAddressed` to
/// dispatch into a specific `package.rule` pair within a
/// `{"type":"modules", ...}` bundle.
pub fn evaluate(
arena: *std.heap.ArenaAllocator,
input_json: []const u8,
@@ -58,9 +58,10 @@ pub fn evaluate(
}

/// Run a single evaluation against `target_rule` in the default
/// package (""). Used by the proxy-wasm shim to route phase-specific
/// callbacks: `allow_response` for the response phase and
/// `allow_body` for the body phase, while the request-headers phase
/// stays on `allow`.
pub fn evaluateWithTarget(
arena: *std.heap.ArenaAllocator,
input_json: []const u8,
@@ -612,6 +613,58 @@ test "evaluateWithTarget: allow target preserves default behaviour" {
try testing.expect(try runWithTarget("{}", policy, "allow"));
}

test "evaluateWithTarget: allow_body fires on amount > limit" {
    const policy =
        "{\"type\":\"module\",\"rules\":[" ++
        "{\"type\":\"rule\",\"name\":\"allow_body\",\"default\":true," ++
        "\"value\":{\"type\":\"value\",\"value\":true}}," ++
        "{\"type\":\"rule\",\"name\":\"allow_body\",\"body\":[" ++
        "{\"type\":\"gt\"," ++
        "\"left\":{\"type\":\"ref\",\"path\":[\"input\",\"body\",\"amount\"]}," ++
        "\"right\":{\"type\":\"value\",\"value\":1000}}]," ++
        "\"value\":{\"type\":\"value\",\"value\":false}}" ++
        "]}";

    // Body amount over limit -> rule fires returning false -> deny.
    try testing.expect(!(try runWithTarget(
        "{\"body\":{\"amount\":5000},\"body_raw\":\"...\"}",
        policy,
        "allow_body",
    )));

    // Body amount under limit -> default rule wins -> allow.
    try testing.expect(try runWithTarget(
        "{\"body\":{\"amount\":50},\"body_raw\":\"...\"}",
        policy,
        "allow_body",
    ));
}

test "evaluateWithTarget: body_raw fallback when body parse fails" {
    // Policy targets body_raw directly so a non-JSON body is still
    // policy-checkable.
    const policy =
        "{\"type\":\"module\",\"rules\":[" ++
        "{\"type\":\"rule\",\"name\":\"allow_body\",\"body\":[" ++
        "{\"type\":\"eq\"," ++
        "\"left\":{\"type\":\"ref\",\"path\":[\"input\",\"body_raw\"]}," ++
        "\"right\":{\"type\":\"value\",\"value\":\"BLOCKED\"}}]," ++
        "\"value\":{\"type\":\"value\",\"value\":false}}," ++
        "{\"type\":\"rule\",\"name\":\"allow_body\",\"default\":true," ++
        "\"value\":{\"type\":\"value\",\"value\":true}}" ++
        "]}";
    try testing.expect(!(try runWithTarget(
        "{\"body\":null,\"body_raw\":\"BLOCKED\"}",
        policy,
        "allow_body",
    )));
    try testing.expect(try runWithTarget(
        "{\"body\":null,\"body_raw\":\"ok\"}",
        policy,
        "allow_body",
    ));
}

test "evaluate: every+some over arrays" {
const policy =
"{\"type\":\"every\",\"var\":\"req\"," ++
7 changes: 4 additions & 3 deletions src/main.zig
@@ -53,9 +53,10 @@ export fn evaluate(
}

/// Run one evaluation against an explicit target rule. Same return
/// codes as `evaluate`. Hosts that want to drive a non-default rule
/// (`allow_response` for the response phase, `allow_body` for the
/// body phase, or any other target name) call this instead of the
/// default `evaluate`.
export fn evaluate_target(
input_ptr: [*]const u8,
input_len: usize,
88 changes: 83 additions & 5 deletions src/proxy_wasm.zig
@@ -4,8 +4,9 @@
//! `proxy_on_context_create`, `proxy_on_request_headers`,
//! `proxy_on_request_body`, `proxy_on_response_headers`,
//! `proxy_on_done`. Request headers fire the "allow" target rule;
//! request body fires "allow_body" with `{"body": <parsed-json>,
//! "body_raw": <string>}` once the host signals end of stream;
//! response headers fire "allow_response" with `{"response":{...}}`.
//!
//! Configuration: the policy AST JSON arrives via
//! `proxy_on_configure`. We copy it into `host_allocator` so it
@@ -16,8 +17,9 @@
//! `malloc`. We `hostFree` them once consumed.

const std = @import("std");
const eval = @import("eval.zig");
const json = @import("json.zig");
const memory = @import("memory.zig");

// ABI version negotiation: one empty export per supported version.

@@ -144,12 +146,88 @@ export fn proxy_on_request_headers(_: i32, _: i32, _: i32) i32 {
    return action_continue;
}

/// Evaluate against the request body once the host signals end of
/// stream. Until then we return `Continue` so streaming chunks pass
/// through; the final fragment triggers the eval. Body input shape
/// is `{"body": <parsed-or-null>, "body_raw": <string>}`.
///
/// Hosts clear `:method` / `:path` from the header map by the time
/// this fires (Envoy/wamr behaviour), so a body rule that needs
/// header context must depend on a snapshot taken in
/// `proxy_on_request_headers`. Per-context snapshot plumbing is
/// tracked in ROADMAP.md; v1 surfaces only the body itself.
export fn proxy_on_request_body(_: i32, body_size: i32, end_of_stream: i32) i32 {
    if (end_of_stream == 0) return action_continue;
    if (body_size <= 0) return action_continue;
    const policy = configured_policy orelse return action_continue;
    if (!evaluateBodyAt(@intCast(body_size), policy)) {
        denyWithStatus(403);
        return action_pause;
    }
    return action_continue;
}

const body_target_rule: []const u8 = "allow_body";
const max_body_bytes: usize = 64 * 1024;

fn evaluateBodyAt(body_size: usize, policy: []const u8) bool {
    const arena = memory.requestArena();
    defer memory.resetRequestArena();
    const allocator = arena.allocator();

    // Never ask the host for more than `max_body_bytes`.
    const cap: usize = @min(body_size, max_body_bytes);
    // Fail closed: an allocation or evaluation error denies the request.
    const body_bytes = readBodyBytes(allocator, cap) catch return false;
    const input_bytes = buildBodyInput(allocator, body_bytes) catch return false;
    return eval.evaluateWithTarget(arena, input_bytes, policy, body_target_rule) catch false;
}

/// Pull the request body from the host. Returns an empty slice on
/// host error so the caller sees a body of "" rather than failing
/// the request outright.
fn readBodyBytes(allocator: std.mem.Allocator, cap: usize) ![]const u8 {
    var data: ?[*]u8 = null;
    var data_size: usize = 0;
    const status = proxy_get_buffer_bytes(
        buffer_type_http_request_body,
        0,
        cap,
        &data,
        &data_size,
    );
    if (status != status_ok) return &[_]u8{};
    if (data_size == 0) return &[_]u8{};
    const ptr = data orelse return &[_]u8{};
    defer memory.hostFree(ptr);
    return try allocator.dupe(u8, ptr[0..data_size]);
}

/// Build `{"body": <parsed-json-or-null>, "body_raw": <string>}`. We
/// try to parse the body as JSON; if it fails, `body` is null and
/// the policy can still match against `body_raw` (e.g. with the
/// `contains` builtin). The parsed copy is dropped on the next
/// arena reset, so this only costs one transient walk.
fn buildBodyInput(allocator: std.mem.Allocator, body: []const u8) ![]u8 {
    const parsed_ok = blk: {
        _ = json.parse(allocator, body) catch break :blk false;
        break :blk true;
    };

    var buf: std.ArrayList(u8) = .empty;
    defer buf.deinit(allocator);

    try buf.appendSlice(allocator, "{\"body\":");
    if (parsed_ok and body.len > 0) {
        try buf.appendSlice(allocator, body);
    } else {
        try buf.appendSlice(allocator, "null");
    }
    try buf.appendSlice(allocator, ",\"body_raw\":");
    try appendJsonString(allocator, &buf, body);
    try buf.append(allocator, '}');

    return try allocator.dupe(u8, buf.items);
}

/// Evaluate against response status + headers under the
/// `allow_response` target rule. Deny replaces the response with a
/// 503; allow lets the upstream response through unchanged.