**File:** QA_REPORT.md (new file, +36 lines)
# QA_REPORT: openai-pii-guard scoping PR

**Reviewer:** queue-worker self-QA (autonomous; subagent spawn skipped under 30-min budget — pattern matches PR #395 scoping precedent).
**Verdict:** PASS

## Scope match

WORK_PLAN claimed: docs-only scoping PR mirroring PR #395; build/wait/punt recommendation. Diff matches: only `docs/scoping/openai-pii-guard.md` plus repo-root proof artifacts (`WORK_PLAN.md`, `RESEARCH.md`, `QA_REPORT.md`). No feature code. No SDK changes. No roadmap edits.

## Repo boundary check (`CLAUDE.md`)

- SDK stays zero-dep: confirmed (no code changes).
- No paid/dashboard features added: confirmed.
- No business-sensitive plans or outreach data added: confirmed.
- Recommendation aligns with NORTHSTAR (runtime enforcement) and ROADMAP "Later" bucket placement: confirmed.

## Denylist

- `.github/workflows/`: not touched.
- `.env*`: not touched.
- `supabase/migrations/`: N/A (Python repo).
- `security/`: not touched.
- Stripe/Clerk/auth config: not touched.
- Secrets: none added.

## Test coverage

N/A — docs-only PR. No test regression possible.

## Issues

None blocking. One observation: the scoping doc is honest about evidence quality (OpenAI model details unverified). That honesty is the point — the recommendation stands on the roadmap-prioritization argument, not on speculative model claims.

## Verdict

PASS. Ready for draft PR. Do NOT auto-merge — task is "scoping recommendation logged", which the queue task explicitly defines as an acceptable terminal state.
**File:** RESEARCH.md (new file, +35 lines)
# RESEARCH: OpenAI PII Detection Model

## Source signal

- Origin: `Knowledge/sources/2026-04-26-openai-pii-detection-model.md` (vault).
- Public pointer: r/LocalLLaMA thread referencing a new OpenAI PII detection / masking model. Marketing copy and exact pricing not yet on platform.openai.com docs at compile time.

## What we know vs. what's hand-waving

| Claim | Source quality | Notes |
|---|---|---|
| OpenAI shipped a PII model | Reddit thread + OP screenshot | Plausible, not verified against official changelog from this environment |
| Designed for detection + masking | Reddit OP description | Consistent with what a "purpose-built" model in this niche would do |
| Pricing | Not stated | Must wait for official docs |
| Latency | Not stated | Must wait for official docs or local benchmark |
| Accuracy vs Presidio / regex | Not benchmarked | Eval-only path requires building or finding a labeled corpus |

## Existing AgentGuard repo evidence

- `ops/03-ROADMAP_NOW_NEXT_LATER.md` lists `ContentGuard - detect PII/sensitive data in agent outputs` in the **Later** bucket. Already framed as a regex-based, no-deps guard class raising `ContentViolation`.
- `ops/00-NORTHSTAR.md` non-goals exclude "framework" and "full observability" but do NOT exclude content/PII enforcement at runtime — runtime enforcement is exactly the wedge.
- `CLAUDE.md` repo boundary: "SDK stays free, MIT, and zero-dependency". Any OpenAI backend MUST be an optional extra.
- The roadmap's Now bucket is "coding-agent positioning + MCP registry readiness + install-to-first-guard proof hardening". PII work is a distraction from the current two-week focus.

## Patterns to copy

- PR #395 (`docs/scoping/managed-agents-memory.md`) — same shape: scoping doc, draft PR, no feature code, build/wait/punt recommendation. Worked. Repeat.

## Decision logic

1. PIIGuard belongs in the AgentGuard vision (it's runtime safety, not framework or observability).
2. It's already explicitly "Later" in roadmap.
3. OpenAI shipping a backend option does NOT change priority — it just means **when** we build PIIGuard, OpenAI is one of N backends.
4. The Now bucket already has 4 active items. The 1-2 Builders rule plus Now-bucket discipline says no.
5. Recommendation: **wait**. Re-evaluate when one of the Now items completes OR when a real customer asks for PII enforcement.
**File:** WORK_PLAN.md (new file, +29 lines)
# WORK_PLAN: OpenAI PII Detection Model Eval (PIIGuard scoping)

## Problem

OpenAI shipped a purpose-built PII detection / masking model (Reddit pointer 2026-04-26). Two questions for AgentGuard: (1) should we ship a `PIIGuard` runtime guard now, with OpenAI as one backend option, or (2) wait until the OpenAI roadmap clarifies whether they will ship a full safety SDK that subsumes this niche? Either answer requires a real eval.

## Approach

Mirror the pattern used for PR #395 (managed-agents-memory scoping): produce a docs/scoping/ markdown that captures pricing, latency, accuracy claims for the OpenAI model, sketches a `PIIGuard` interface (pluggable backend: OpenAI / Presidio / regex), and lands a one-line build/wait/punt recommendation. No feature code in this PR.

This is consistent with the queue task itself, which explicitly permits "or explicit 'wait' decision logged in this task with reason."

## Files likely to touch

- `docs/scoping/openai-pii-guard.md` (new; the only substantive content in this PR)
- `WORK_PLAN.md`, `RESEARCH.md`, `QA_REPORT.md` (proof artifacts at repo root, committed per queue-worker workflow)

## Done criteria

- Scoping doc exists with: model facts, pricing, latency, accuracy notes, PIIGuard interface sketch, build/wait/punt recommendation.
- Draft PR open against bmdhodl/agent47.
- `Knowledge/entities/openai.md` updated with PII model row (separate vault commit, not in this PR).
- Queue task moved to `Complete/` with `merged=false` / `verified=skipped`.

## Risks / assumptions

- ContentGuard sits in roadmap "Later" bucket. Recommending "wait" aligns with current Now/Next focus (coding-agent positioning + MCP registry readiness).
- OpenAI's official launch material is thin — recommendation should be honest about evidence quality.
- Repo boundary in CLAUDE.md: SDK stays zero-dependency. Any future PIIGuard-OpenAI backend must be an extra (`agentguard47[openai-pii]`), not a hard dep.
**File:** docs/scoping/openai-pii-guard.md (new file, +104 lines)
# Scoping: OpenAI PII Detection Model + AgentGuard PIIGuard

**Status:** scoping only — no feature code in this PR.
**Date:** 2026-04-27
**Author:** queue-worker (autonomous)
**Source signal:** r/LocalLLaMA thread, 2026-04-26, pointing at a new OpenAI purpose-built PII detection / masking model.

## Why this matters for AgentGuard

AgentGuard's wedge is runtime enforcement: hard stops on budget, loops, retries, timeouts. PII / content enforcement is the natural neighbor — same shape (intercept → evaluate → raise), different signal. The roadmap already lists `ContentGuard` in `ops/03-ROADMAP_NOW_NEXT_LATER.md` "Later" bucket, framed as a regex-based, no-deps guard class.
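The intercept → evaluate → raise shape can be sketched in a few lines. Everything below is illustrative, not the shipped API: the class names, the `detect` method, and the two toy patterns are assumptions for this doc.

```python
import re


class ContentViolation(Exception):
    """Raised when a guarded output contains disallowed content."""


class RegexPIIBackend:
    # Deliberately minimal patterns for illustration; a real backend
    # needs far broader coverage and tuned false-positive handling.
    PATTERNS = {
        "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
        "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    }

    def detect(self, text):
        """Return the list of PII categories found in text."""
        return [cat for cat, pat in self.PATTERNS.items() if pat.search(text)]


def guard_output(text, backend):
    """Intercept -> evaluate -> raise: same shape as the budget/loop guards."""
    hits = backend.detect(text)
    if hits:
        raise ContentViolation(f"PII detected: {hits}")
    return text
```

Swapping `RegexPIIBackend` for a Presidio- or OpenAI-backed class changes the signal, not the shape; that is the sense in which PII enforcement is a "natural neighbor" of the existing guards.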

OpenAI shipping a dedicated model is two things at once:

1. **Opportunity.** A higher-accuracy backend option for a future PIIGuard / ContentGuard, in addition to regex and Presidio.
2. **Competitive signal.** If OpenAI later ships a full safety SDK around this model, the PIIGuard niche shrinks. Doesn't kill AgentGuard (we're cross-provider, runtime, zero-dep), but narrows the addressable market for PII specifically.

Either reaction starts with an eval. This doc captures what we know, what we'd build, and the recommendation.

## What we actually know about the OpenAI model

Honest answer: not enough.

| Question | Answer |
|---|---|
| Model name / endpoint | Not confirmed from this environment |
| Pricing | Not confirmed |
| Latency p50 / p95 | Not benchmarked |
| Accuracy vs Presidio / regex | Not benchmarked |
| Streaming support | Unknown |
| PII categories covered | Unknown (likely: name, email, phone, SSN, address, payment) |
| Deployment shape | Likely API call; possibly available via batch |

Before any feature code: someone benchmarks this on a labeled corpus against Presidio and a baseline regex set. That's a half-day of work that should NOT happen until PIIGuard moves to the "Now" bucket.
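A minimal sketch of what that half-day harness could look like, scoring only the regex baseline. The corpus, patterns, and category names are invented for illustration; Presidio and OpenAI backends would plug into `score` the same way once verified.

```python
import re

# Toy labeled corpus: (text, set of PII categories present).
# A real eval needs hundreds of labeled samples per category.
CORPUS = [
    ("contact me at jane@example.com", {"email"}),
    ("SSN is 123-45-6789", {"ssn"}),
    ("call 555-867-5309 tomorrow", {"phone"}),
    ("no sensitive data here", set()),
]

# Baseline regex set; hypothetical, not the shipped patterns.
BASELINE = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}


def score(corpus, patterns):
    """Micro-averaged precision/recall of a pattern set over a labeled corpus."""
    tp = fp = fn = 0
    for text, truth in corpus:
        found = {cat for cat, pat in patterns.items() if pat.search(text)}
        tp += len(found & truth)
        fp += len(found - truth)
        fn += len(truth - found)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    return precision, recall
```

The interesting output is the delta between backends on the same corpus, not the absolute numbers; a regex baseline that already hits the target recall would itself be an argument against paying per-call for the OpenAI model.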

## PIIGuard interface sketch

Non-binding. Captures shape so future work doesn't start from zero.

```python
from agentguard import PIIGuard, RegexPIIBackend

guard = PIIGuard(
    backend=RegexPIIBackend(),  # default; zero-dep
    on_detect="raise",          # or "redact" or "warn"
    categories=["email", "phone", "ssn"],
    sample_rate=1.0,            # check every output
)

# Backends ship as extras to keep core zero-dep:
# pip install agentguard47[presidio] -> PresidioPIIBackend
# pip install agentguard47[openai-pii] -> OpenAIPIIBackend
```

Constraints (from `CLAUDE.md` repo boundary):

- Core SDK stays zero-dependency. `RegexPIIBackend` is the default and ships in core.
- `PresidioPIIBackend` is an optional extra — never a hard dep.
- `OpenAIPIIBackend` is an optional extra — never a hard dep.
- `PIIGuard` raises `ContentViolation` (already named in the Later bucket spec).

Cost dimension: an OpenAI backend introduces an extra API call per guarded output. PIIGuard SHOULD be wrappable in a `BudgetGuard` or its own `PIIBudgetGuard` so users don't accidentally turn safety into a runaway bill. This is the Inception-style guard-the-guard problem and matches how `BudgetAwareEscalation` already works.
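The guard-the-guard idea can be sketched as a budget wrapper around any per-call backend. The cost model, class names, and `detect(text)` protocol here are assumptions for this doc, not the shipped SDK.

```python
class PIIBudgetExceeded(Exception):
    """Raised before a PII check would push spend past the cap."""


class BudgetedBackend:
    """Wrap a per-call PII backend with a hard USD spending cap.

    Any object with a detect(text) method works; the flat
    cost_per_call_usd model is a simplification for illustration.
    """

    def __init__(self, backend, max_usd, cost_per_call_usd):
        self.backend = backend
        self.max_usd = max_usd
        self.cost_per_call_usd = cost_per_call_usd
        self.spent_usd = 0.0

    def detect(self, text):
        # Check the cap BEFORE spending, so the guard fails closed.
        if self.spent_usd + self.cost_per_call_usd > self.max_usd:
            raise PIIBudgetExceeded(
                f"PII checks would exceed ${self.max_usd:.2f} cap"
            )
        self.spent_usd += self.cost_per_call_usd
        return self.backend.detect(text)
```

Because the wrapper exposes the same `detect` surface as the backend it wraps, `PIIGuard` would not need to know whether its backend is budgeted.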

## Build / Wait / Punt — recommendation

**Wait.** Re-evaluate when one of these signals fires:

1. A real customer asks for PII enforcement at runtime (not a hypothetical).
2. ContentGuard / PIIGuard moves to the "Now" or "Next" bucket in `ops/03-ROADMAP_NOW_NEXT_LATER.md`.
3. OpenAI publishes official model + pricing docs AND someone wants to benchmark.

Reasons to wait, not punt:

- PIIGuard fits North Star (runtime enforcement, not framework, not observability).
- OpenAI's model materially raises the ceiling on accuracy for a future backend, so the option is more valuable now than it was 6 months ago.
- The right time is after current "Now" items (coding-agent positioning, MCP registry readiness, install-to-first-guard) are done — those have direct adoption payoff. PIIGuard is a feature add, not a wedge sharpener.

Reasons NOT to ship now:

- Now bucket has 4 active items. Adding a 5th violates the focus rule.
- No real customer ask in hand at compile time.
- Building against an unbenchmarked model = vibe-coded backend. Worse than having no backend.
- The hardest part of PIIGuard isn't the OpenAI integration; it's the interface, redaction policy, and on_detect contract. None of that is unblocked by OpenAI's model.

## What this PR is NOT

- Not a feature commitment.
- Not a deprecation of the regex / Presidio path.
- Not a benchmark.
- Not a public roadmap change.

Updating `ops/03-ROADMAP_NOW_NEXT_LATER.md` to promote ContentGuard out of "Later" is a separate decision, made by Patrick, not by this scoping doc.

## Resume path

If/when this gets greenlit:

1. Promote ContentGuard / PIIGuard from "Later" to "Next" or "Now" in `ops/03-ROADMAP_NOW_NEXT_LATER.md`.
2. Verify OpenAI model name, pricing, latency from official docs.
3. Build a labeled-corpus benchmark covering the 6 standard PII categories. Score: regex baseline vs Presidio vs OpenAI.
4. Implement `PIIGuard` with `RegexPIIBackend` first (in core, zero-dep).
5. Add `agentguard47[presidio]` extra second.
6. Add `agentguard47[openai-pii]` extra third, only if benchmark justifies it.
7. Document `on_detect` contract: raise, redact, warn, sample_rate, categories.
8. Test coverage parity with `BudgetGuard` (≥90% line coverage on the guard class).
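For step 7, the `on_detect` contract from the interface sketch earlier in this doc could be pinned down with something like the following. The mode names come from the sketch; the exception name, regex, and redaction token are assumptions.

```python
import re
import warnings


class ContentViolation(Exception):
    """Raised when on_detect='raise' and PII is found."""


EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")


def apply_on_detect(text, mode):
    """Illustrate the three on_detect modes on a single email pattern."""
    if not EMAIL.search(text):
        return text
    if mode == "raise":
        raise ContentViolation("email detected in guarded output")
    if mode == "redact":
        return EMAIL.sub("[REDACTED:email]", text)
    if mode == "warn":
        warnings.warn("email detected in guarded output")
        return text
    raise ValueError(f"unknown on_detect mode: {mode!r}")
```

Whatever the final contract looks like, it should be fixed before any backend work starts, since it is the part users program against.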