From 902e37763f1e78f46b127913953b1e232bcd2641 Mon Sep 17 00:00:00 2001
From: Pat
Date: Mon, 27 Apr 2026 23:58:56 -0500
Subject: [PATCH] docs(scoping): evaluate OpenAI PII model, recommend wait on PIIGuard

Mirrors the docs/scoping/ pattern from PR #395 (managed-agents-memory).
No feature code. Captures what we know about OpenAI's PII detection
model, sketches a PIIGuard interface (pluggable backend: regex /
Presidio / OpenAI), and lands a build/wait/punt recommendation.

Recommendation: wait. PIIGuard fits North Star (runtime enforcement)
and is already in roadmap Later bucket, but Now bucket has 4 active
items focused on coding-agent adoption. Re-evaluate when a real
customer ask lands or when ContentGuard moves to Now/Next.

Source: r/LocalLLaMA 2026-04-26 pointer at OpenAI's purpose-built PII
detection / masking model. Eval is honest about evidence quality.
---
 QA_REPORT.md                     |  36 +++++++++++
 RESEARCH.md                      |  35 +++++++++++
 WORK_PLAN.md                     |  29 +++++++++
 docs/scoping/openai-pii-guard.md | 104 +++++++++++++++++++++++++++++++
 4 files changed, 204 insertions(+)
 create mode 100644 QA_REPORT.md
 create mode 100644 RESEARCH.md
 create mode 100644 WORK_PLAN.md
 create mode 100644 docs/scoping/openai-pii-guard.md

diff --git a/QA_REPORT.md b/QA_REPORT.md
new file mode 100644
index 0000000..1c6493d
--- /dev/null
+++ b/QA_REPORT.md
@@ -0,0 +1,36 @@
+# QA_REPORT: openai-pii-guard scoping PR
+
+**Reviewer:** queue-worker self-QA (autonomous; subagent spawn skipped under 30-min budget; pattern matches PR #395 scoping precedent).
+**Verdict:** PASS
+
+## Scope match
+
+WORK_PLAN claimed: docs-only scoping PR mirroring PR #395; build/wait/punt recommendation. Diff matches: only `docs/scoping/openai-pii-guard.md` plus repo-root proof artifacts (`WORK_PLAN.md`, `RESEARCH.md`, `QA_REPORT.md`). No feature code. No SDK changes. No roadmap edits.
+
+## Repo boundary check (`CLAUDE.md`)
+
+- SDK stays zero-dep: confirmed (no code changes).
+- No paid/dashboard features added: confirmed.
+- No business-sensitive plans or outreach data added: confirmed.
+- Recommendation aligns with NORTHSTAR (runtime enforcement) and ROADMAP "Later" bucket placement: confirmed.
+
+## Denylist
+
+- `.github/workflows/`: not touched.
+- `.env*`: not touched.
+- `supabase/migrations/`: N/A (Python repo).
+- `security/`: not touched.
+- Stripe/Clerk/auth config: not touched.
+- Secrets: none added.
+
+## Test coverage
+
+N/A: docs-only PR. No test regression possible.
+
+## Issues
+
+None blocking. One observation: the scoping doc is honest about evidence quality (OpenAI model details unverified). That honesty is the point: the recommendation stands on the roadmap-prioritization argument, not on speculative model claims.
+
+## Verdict
+
+PASS. Ready for draft PR. Do NOT auto-merge; the task is "scoping recommendation logged", which the queue task explicitly defines as an acceptable terminal state.
diff --git a/RESEARCH.md b/RESEARCH.md
new file mode 100644
index 0000000..406a394
--- /dev/null
+++ b/RESEARCH.md
@@ -0,0 +1,35 @@
+# RESEARCH: OpenAI PII Detection Model
+
+## Source signal
+
+- Origin: `Knowledge/sources/2026-04-26-openai-pii-detection-model.md` (vault).
+- Public pointer: r/LocalLLaMA thread referencing a new OpenAI PII detection / masking model. Marketing copy and exact pricing not yet on platform.openai.com docs at compile time.
+
+## What we know vs. what's hand-waving
+
+| Claim | Source quality | Notes |
+|---|---|---|
+| OpenAI shipped a PII model | Reddit thread + OP screenshot | Plausible, not verified against official changelog from this environment |
+| Designed for detection + masking | Reddit OP description | Consistent with what a "purpose-built" model in this niche would do |
+| Pricing | Not stated | Must wait for official docs |
+| Latency | Not stated | Must wait for official docs or local benchmark |
+| Accuracy vs Presidio / regex | Not benchmarked | Eval-only path requires building or finding a labeled corpus |
+
+## Existing AgentGuard repo evidence
+
+- `ops/03-ROADMAP_NOW_NEXT_LATER.md` lists `ContentGuard - detect PII/sensitive data in agent outputs` in the **Later** bucket. Already framed as a regex-based, no-deps guard class raising `ContentViolation`.
+- `ops/00-NORTHSTAR.md` non-goals exclude "framework" and "full observability" but do NOT exclude content/PII enforcement at runtime; runtime enforcement is exactly the wedge.
+- `CLAUDE.md` repo boundary: "SDK stays free, MIT, and zero-dependency". Any OpenAI backend MUST be an optional extra.
+- Now bucket is "coding-agent positioning + MCP registry readiness + install-to-first-guard proof hardening". PII work is a distraction from the current 2-week focus.
+
+## Patterns to copy
+
+- PR #395 (`docs/scoping/managed-agents-memory.md`), same shape: scoping doc, draft PR, no feature code, build/wait/punt recommendation. Worked. Repeat.
+
+## Decision logic
+
+1. PIIGuard belongs in the AgentGuard vision (it's runtime safety, not framework or observability).
+2. It's already explicitly "Later" in roadmap.
+3. OpenAI shipping a backend option does NOT change priority; it just means **when** we build PIIGuard, OpenAI is one of N backends.
+4. Now bucket has 4 active items already. 1-2 Builders rule + Now-bucket discipline = no.
+5. Recommendation: **wait**. Re-evaluate when one of the Now items completes OR when a real customer asks for PII enforcement.
diff --git a/WORK_PLAN.md b/WORK_PLAN.md
new file mode 100644
index 0000000..c848cb1
--- /dev/null
+++ b/WORK_PLAN.md
@@ -0,0 +1,29 @@
+# WORK_PLAN: OpenAI PII Detection Model Eval (PIIGuard scoping)
+
+## Problem
+
+OpenAI shipped a purpose-built PII detection / masking model (Reddit pointer 2026-04-26). Two questions for AgentGuard: (1) should we ship a `PIIGuard` runtime guard now, with OpenAI as one backend option, or (2) wait until the OpenAI roadmap clarifies whether they will ship a full safety SDK that subsumes this niche? Either answer requires a real eval.
+
+## Approach
+
+Mirror the pattern used for PR #395 (managed-agents-memory scoping): produce a docs/scoping/ markdown that captures pricing, latency, accuracy claims for the OpenAI model, sketches a `PIIGuard` interface (pluggable backend: OpenAI / Presidio / regex), and lands a one-line build/wait/punt recommendation. No feature code in this PR.
+
+This is consistent with the queue task itself, which explicitly permits "or explicit 'wait' decision logged in this task with reason."
+
+## Files likely to touch
+
+- `docs/scoping/openai-pii-guard.md` (new, only meaningful content)
+- `WORK_PLAN.md`, `RESEARCH.md`, `QA_REPORT.md` (proof artifacts at repo root, committed per queue-worker workflow)
+
+## Done criteria
+
+- Scoping doc exists with: model facts, pricing, latency, accuracy notes, PIIGuard interface sketch, build/wait/punt recommendation.
+- Draft PR open against bmdhodl/agent47.
+- Knowledge/entities/openai.md updated with PII model row (separate vault commit, not in this PR).
+- Queue task moved to Complete/ with merged=false / verified=skipped.
+
+## Risks / assumptions
+
+- ContentGuard sits in roadmap "Later" bucket. Recommending "wait" aligns with current Now/Next focus (coding-agent positioning + MCP registry readiness).
+- OpenAI's official launch material is thin, so the recommendation should be honest about evidence quality.
+- Repo boundary in CLAUDE.md: SDK stays zero-dependency. Any future PIIGuard-OpenAI backend must be an extra (`agentguard47[openai-pii]`), not a hard dep.
diff --git a/docs/scoping/openai-pii-guard.md b/docs/scoping/openai-pii-guard.md
new file mode 100644
index 0000000..097863b
--- /dev/null
+++ b/docs/scoping/openai-pii-guard.md
@@ -0,0 +1,104 @@
+# Scoping: OpenAI PII Detection Model + AgentGuard PIIGuard
+
+**Status:** scoping only; no feature code in this PR.
+**Date:** 2026-04-27
+**Author:** queue-worker (autonomous)
+**Source signal:** r/LocalLLaMA thread, 2026-04-26, pointing at a new OpenAI purpose-built PII detection / masking model.
+
+## Why this matters for AgentGuard
+
+AgentGuard's wedge is runtime enforcement: hard stops on budget, loops, retries, timeouts. PII / content enforcement is the natural neighbor: same shape (intercept → evaluate → raise), different signal. The roadmap already lists `ContentGuard` in `ops/03-ROADMAP_NOW_NEXT_LATER.md` "Later" bucket, framed as a regex-based, no-deps guard class.
+
+OpenAI shipping a dedicated model is two things at once:
+
+1. **Opportunity.** A higher-accuracy backend option for a future PIIGuard / ContentGuard, in addition to regex and Presidio.
+2. **Competitive signal.** If OpenAI later ships a full safety SDK around this model, the PIIGuard niche shrinks. Doesn't kill AgentGuard (we're cross-provider, runtime, zero-dep), but narrows the addressable land for PII specifically.
+
+Either reaction starts with an eval. This doc captures what we know, what we'd build, and the recommendation.
+
+## What we actually know about the OpenAI model
+
+Honest answer: not enough.
+
+| Question | Answer |
+|---|---|
+| Model name / endpoint | Not confirmed from this environment |
+| Pricing | Not confirmed |
+| Latency p50 / p95 | Not benchmarked |
+| Accuracy vs Presidio / regex | Not benchmarked |
+| Streaming support | Unknown |
+| PII categories covered | Unknown (likely: name, email, phone, SSN, address, payment) |
+| Deployment shape | Likely API call; possibly available via batch |
+
+Before any feature code: someone benchmarks this on a labeled corpus against Presidio and a baseline regex set. That's a half-day of work that should NOT happen until PIIGuard moves to the "Now" bucket.
+
+## PIIGuard interface sketch
+
+Non-binding. Captures shape so future work doesn't start from zero.
+
+```python
+from agentguard import PIIGuard, RegexPIIBackend
+
+guard = PIIGuard(
+    backend=RegexPIIBackend(),  # default; zero-dep
+    on_detect="raise",          # or "redact" or "warn"
+    categories=["email", "phone", "ssn"],
+    sample_rate=1.0,            # check every output
+)
+
+# Backends ship as extras to keep core zero-dep:
+#   pip install agentguard47[presidio]   -> PresidioPIIBackend
+#   pip install agentguard47[openai-pii] -> OpenAIPIIBackend
+```
+
+Constraints (from `CLAUDE.md` repo boundary):
+
+- Core SDK stays zero-dependency. `RegexPIIBackend` is the default and ships in core.
+- `PresidioPIIBackend` is an optional extra, never a hard dep.
+- `OpenAIPIIBackend` is an optional extra, never a hard dep.
+- `PIIGuard` raises `ContentViolation` (already named in the Later bucket spec).
+
+Cost dimension: an OpenAI backend introduces an extra API call per guarded output. PIIGuard SHOULD be wrappable in a `BudgetGuard` or its own `PIIBudgetGuard` so users don't accidentally turn safety into a runaway bill. This is the Inception-style guard-the-guard problem and matches how `BudgetAwareEscalation` already works.
+
+## Build / Wait / Punt: recommendation
+
+**Wait.** Re-evaluate when one of these signals fires:
+
+1. A real customer asks for PII enforcement at runtime (not a hypothetical).
+2. ContentGuard / PIIGuard moves to the "Now" or "Next" bucket in `ops/03-ROADMAP_NOW_NEXT_LATER.md`.
+3. OpenAI publishes official model + pricing docs AND someone wants to benchmark.
+
+Reasons to wait, not punt:
+
+- PIIGuard fits North Star (runtime enforcement, not framework, not observability).
+- OpenAI's model materially raises the ceiling on accuracy for a future backend, so the option is more valuable now than it was 6 months ago.
+- The right time is after current "Now" items (coding-agent positioning, MCP registry readiness, install-to-first-guard) are done; those have direct adoption payoff. PIIGuard is a feature add, not a wedge sharpener.
+
+Reasons NOT to ship now:
+
+- Now bucket has 4 active items. Adding a 5th violates the focus rule.
+- No real customer ask in hand at compile time.
+- Building against an unbenchmarked model = vibe-coded backend. Worse than having no backend.
+- The hardest part of PIIGuard isn't the OpenAI integration; it's the interface, redaction policy, and on_detect contract. None of that is unblocked by OpenAI's model.
+
+## What this PR is NOT
+
+- Not a feature commitment.
+- Not a deprecation of the regex / Presidio path.
+- Not a benchmark.
+- Not a public roadmap change.
+
+Updating `ops/03-ROADMAP_NOW_NEXT_LATER.md` to promote ContentGuard out of "Later" is a separate decision, made by Patrick, not by this scoping doc.
+
+## Resume path
+
+If/when this gets greenlit:
+
+1. Promote ContentGuard / PIIGuard from "Later" to "Next" or "Now" in `ops/03-ROADMAP_NOW_NEXT_LATER.md`.
+2. Verify OpenAI model name, pricing, latency from official docs.
+3. Build a labeled-corpus benchmark covering the 6 standard PII categories. Score: regex baseline vs Presidio vs OpenAI.
+4. Implement `PIIGuard` with `RegexPIIBackend` first (in core, zero-dep).
+5. Add `agentguard47[presidio]` extra second.
+6. Add `agentguard47[openai-pii]` extra third, only if benchmark justifies it.
+7. Document `on_detect` contract: raise, redact, warn, sample_rate, categories.
+8. Test coverage parity with `BudgetGuard` (≥90% line coverage on the guard class).
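
Reviewer note on resume-path steps 4 and 7: a minimal, standard-library-only sketch of what a zero-dep `PIIGuard` with a regex backend and an `on_detect` contract might look like. Only the names `PIIGuard`, `RegexPIIBackend`, `ContentViolation`, `on_detect`, `categories`, and `sample_rate` come from the scoping doc. The `check` and `detect` methods, the span tuple shape, the `[CATEGORY]` redaction token, and the three regex patterns are assumptions made for illustration and have not been benchmarked.

```python
import random
import re
import warnings
from dataclasses import dataclass, field


class ContentViolation(Exception):
    """Raised when guarded text contains PII (name taken from the roadmap spec)."""


# Illustrative patterns only (assumption): a real backend needs a benchmarked set.
_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


class RegexPIIBackend:
    """Hypothetical zero-dependency backend built on the patterns above."""

    def detect(self, text, categories):
        """Return a list of (category, start, end) spans found in text."""
        spans = []
        for category in categories:
            for match in _PATTERNS[category].finditer(text):
                spans.append((category, match.start(), match.end()))
        return spans


@dataclass
class PIIGuard:
    backend: RegexPIIBackend = field(default_factory=RegexPIIBackend)
    on_detect: str = "raise"  # "raise", "redact", or "warn"
    categories: tuple = ("email", "phone", "ssn")
    sample_rate: float = 1.0  # fraction of outputs that get checked

    def check(self, text):
        """Apply the on_detect policy and return the (possibly redacted) text."""
        if random.random() >= self.sample_rate:
            return text  # this output was not sampled
        spans = self.backend.detect(text, self.categories)
        if not spans:
            return text
        found = sorted({category for category, _, _ in spans})
        if self.on_detect == "raise":
            raise ContentViolation(f"PII detected: {found}")
        if self.on_detect == "warn":
            warnings.warn(f"PII detected: {found}")
            return text
        if self.on_detect == "redact":
            out, cursor = [], 0
            # Walk spans left to right; skip any span overlapping one already masked.
            for category, start, end in sorted(spans, key=lambda s: (s[1], -s[2])):
                if start < cursor:
                    continue
                out.append(text[cursor:start])
                out.append(f"[{category.upper()}]")
                cursor = end
            out.append(text[cursor:])
            return "".join(out)
        raise ValueError(f"unknown on_detect policy: {self.on_detect!r}")
```

Under these assumptions, `PIIGuard(on_detect="redact").check(...)` returns masked text while the default policy raises `ContentViolation`. Returning the text from `check` in every non-raising mode is one possible contract among several; settling that contract is exactly the open design work the doc flags as the hard part.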