Productization Events 121-124: PTSP + Signed Surface backbone + practice visibility layer + 'a way to think' framing by junjslee · Pull Request #76 · junjslee/episteme

junjslee · 2026-05-12T07:29:25Z

Summary

This PR consolidates four Events (121-124) of the productization cycle that opened after Events 119-120 closed the per-task A/B depth-measurement path. All work is additive only; the soak-protected core/hooks/reasoning_surface_guard.py hot path and reasoning-surface@1 schema are untouched.

Primary identity reframe (Event 123): episteme is a way to think — 생각의 틀 — operationalized at the file system level. The five-stage cognitive practice (Frame → Decompose → Execute → Verify → Handoff) authored in core/memory/global/cognitive_profile.md + workflow_policy.md is the product. The signed Reasoning Surface, typed PTSP ledgers, pre-tool-use gate, standalone verifier, and Regulator Evidence Packet are scaffolding for the practice and residue from it. See docs/THE_WAY_TO_THINK.md.

Empirical anchor: the MIRROR benchmark (arXiv 2604.19809) — "providing models with their own calibration scores produces no significant improvement; only architectural constraint is effective." External constraint reduces LLM Confident Failure Rate from 0.60 to 0.14 across 5 frontier models. episteme is that external constraint at the operator decision layer.

What lands by Event

Event 121 — Phase 3 backbone (6b90618)

core/ptsp/ — Provenance-Tagged Step Pipeline (typed Fact/Inference/Unknown/Assumption ledgers + Promotion Gate, Invariants I1-I5) counters arXiv 2509.09677 self-conditioning
core/signing/ — Ed25519 signing + JCS canonicalization + RFC 3161 TSA shape + Sigstore Rekor inclusion-proof shape + canonical-surface envelope; zero-dep with HMAC-SHA256 fallback structurally distinguishable from production Ed25519
src/episteme/verify/ — standalone auditor verifier CLI, deterministic exit codes 0/10/11/12/13/14/20/21/30/64
docs/PRODUCTIZATION_PLAN.md + docs/COMPLIANCE_CROSSWALK.md
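
The signing idea in the Event 121 list (canonicalize, sign, and keep the test-mode fallback structurally distinguishable) can be illustrated with a minimal sketch. This is not the code in core/signing/: the envelope field names (`body`, `alg`, `sig`) and the `TEST-HMAC-SHA256` tag are assumptions made for illustration, and the canonicalization only approximates JCS (RFC 8785) for simple ASCII-keyed payloads.

```python
import hashlib
import hmac
import json

# Hypothetical algorithm tag; the real envelope's field names are not shown in this PR.
TEST_ALG = "TEST-HMAC-SHA256"


def canonicalize(body: dict) -> bytes:
    """Approximate JCS: sorted keys, no insignificant whitespace, UTF-8.

    Full JCS also fixes number formatting and sorts keys by UTF-16 code
    units; this sketch is only adequate for simple ASCII-keyed payloads.
    """
    text = json.dumps(body, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False)
    return text.encode("utf-8")


def sign_test_mode(body: dict, key: bytes) -> dict:
    """Wrap a body in an envelope signed with the HMAC fallback."""
    digest = hmac.new(key, canonicalize(body), hashlib.sha256).hexdigest()
    return {"body": body, "alg": TEST_ALG, "sig": digest}


def verify_test_mode(envelope: dict, key: bytes) -> bool:
    """Return True only for an untampered test-mode envelope."""
    if envelope.get("alg") != TEST_ALG:
        # Not a test-mode signature; a real verifier would route to Ed25519.
        return False
    expected = hmac.new(key, canonicalize(envelope["body"]),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, str(envelope.get("sig", "")))
```

Because the algorithm tag travels inside the envelope, a viewer can count test-mode signatures separately from production ones, and any change to the body after signing makes verification fail.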

Event 122 — Practical realization + framing audit (b47dbfc)

src/episteme/surface/ — operator UX CLI (episteme surface author / sign / show / list / status / verify)
src/episteme/evidence/ — terminal-first viewer + Regulator Evidence Packet ZIP exporter (episteme evidence posture / register / show / alerts / packet build)
src/episteme/hooks/signed_surface_validator.py — opt-in PreToolUse hook running additively alongside reasoning_surface_guard.py
src/episteme/adapters/hermes.py — Hermes substrate bridge for signed-surface@1.0
pyproject [signing] PyNaCl optional dependency
CLI top-level wiring (episteme surface | evidence | verify)
Mid-Event audit corrected 5 category-error classes (audience misallocation, positioning treated as validated, substrate-neutrality erosion, commercialization assumption, hedging dishonesty)
33 new tests

Event 123 — The way to think framing (1477fb4)

docs/THE_WAY_TO_THINK.md — primary identity doc (~2400 words). Every claim traces to operator-authored cognitive_profile.md / workflow_policy.md, the foundational mental models (Kahneman / Dalio / Boyd / Munger), or external research (MIRROR + long-horizon papers)
README.md + README.ko.md hero rewritten to lead with 생각의 틀
PRODUCTIZATION_PLAN § 0 + COMPLIANCE_CROSSWALK preamble + MARKETING_COPY_DRAFT three positionings rewritten to derive from THE_WAY_TO_THINK.md

Event 124 — Practice visibility layer (772a5ce)

core/practice/cognitive_moves.py — source-of-truth registry: 5 stages × N cognitive moves, each with named System-1 failure counter + schema-field mapping + doc anchor
core/practice/quality.py — observe_surface() / observe_surfaces() returning gap observations (not a single-score grade — anti-gaming discipline)
src/episteme/_ui.py — zero-dep ANSI primitives (boxes, health indicators, sparklines, kv-table) with TTY + NO_COLOR + EPISTEME_NO_RICH detection
src/episteme/practice/ — episteme practice walk | retro | demo subcommand group
Hook error JSON now carries cognitive_move metadata (id + name + stage + counters + doc_anchor)
episteme surface author --interactive prompts now render with cognitive-move-name preamble + System-1 counter; brief practice-quality preview after authoring
episteme evidence posture panel upgraded with health indicators (green / yellow / red on signed % / chain breaks / test-mode signatures)
64 new tests
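
To make the "gap observations, not a single-score grade" idea from the Event 124 list concrete, here is a small sketch. The severity names come from the commit message for this Event; the function name `observe_gaps`, the gap codes, and the body fields `unknowns` and `disconfirmation` are hypothetical and are not taken from core/practice/quality.py.

```python
from dataclasses import dataclass

# Severity levels named in the Event 124 commit message.
SEVERITIES = ("critical", "warn", "advisory", "info")


@dataclass(frozen=True)
class Observation:
    severity: str  # one of SEVERITIES
    code: str      # hypothetical gap code
    message: str


def observe_gaps(body: dict) -> list[Observation]:
    """Return named gaps for one surface body; never a numeric score."""
    observations = []
    if not body.get("unknowns"):  # hypothetical field name
        observations.append(Observation(
            "critical", "no-unknowns",
            "No unknowns were named before acting."))
    if not str(body.get("disconfirmation", "")).strip():  # hypothetical field
        observations.append(Observation(
            "warn", "no-disconfirmation",
            "No condition was stated that would show the plan is wrong."))
    return observations
```

Returning a list of named gaps leaves nothing to optimize except the practice itself: an empty list means no gap was observed, not that a score was maximized.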

Tests

pytest -q → 1050 passed, 54 subtests, zero regressions across all four Events.

Discipline preserved

Kernel dependencies = [] zero-dep posture preserved (PyNaCl is [signing] extra; _ui.py is stdlib-only)
Soak-protected core/hooks/reasoning_surface_guard.py + kernel/ docs untouched
No marketing copy cites the 70.3% CFR-reduction number bare — all citations point at arXiv 2604.19809 directly with the 0.60 → 0.14 number attributed to the paper, not to us
Practice-quality scoring exposed only via episteme practice retro (anti-gaming discipline)
Operator-controlled signing key; signing key is structurally out of the agent's reach

What this PR does NOT include

README.es.md + README.zh.md i18n parity translations (mechanical follow-up after EN + KO confirmed)
kernel/ docs cross-references to THE_WAY_TO_THINK.md (soak-protected; separate Event)
Web landing (epistemekernel.com) update (operator-gated production deploy)
Live Sigstore Rekor integration (deferred to Event 125 candidate)
OSF pre-registration submission (draft at docs/OSF_PRE_REGISTRATION_DRAFT.md; gated on Phase 2 recruitment)
Probe 1 / Probe 2 / Probe 3 outreach delivery (operator-gated)
Promotion of signed-surface@1.0 to default kernel path (gated on <100ms hot-path timing)

Try it (60 seconds)

episteme practice walk          # narrated 5-stage walkthrough
episteme practice demo          # worked-example surface body, narrated
episteme surface author -i      # author a real signed surface
episteme evidence posture       # view your audit trail

Test plan

All new modules pass pytest -q with zero regressions on the existing 986-test baseline (now 1050)
episteme practice walk names all 5 stages + cites operator-authored source docs
episteme practice demo --format json produces a body that validates against validate_surface_body() (i.e., it's surface sign-able)
Hook block JSON carries cognitive-move metadata; exit code remains 2 (Claude Code block contract)
EPISTEME_NO_RICH=1 falls back to plain ASCII; NO_COLOR respected per https://no-color.org
Standalone episteme verify round-trip across signed surfaces, mutations detected
Hermes signed-surface bridge writes ~/.hermes/SIGNED_SURFACE_PROTOCOL.md + schema reference
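
The terminal-fallback item in the test plan can be sketched as follows. This is an illustration under stated assumptions, not the code in src/episteme/_ui.py: the function names are hypothetical, and the sketch collapses NO_COLOR and EPISTEME_NO_RICH into a single switch, whereas the real module may treat color and Unicode separately. The NO_COLOR rule (present and non-empty disables color) follows https://no-color.org, and the glyphs follow the commit message for Event 124.

```python
import os
import sys

# level -> (ANSI color prefix, ASCII fallback), per the Event 124 commit message
_GLYPHS = {
    "green": ("\033[32m", "[+]"),
    "yellow": ("\033[33m", "[~]"),
    "red": ("\033[31m", "[!]"),
}


def rich_enabled(stream=None, env=None) -> bool:
    """Decide whether to emit ANSI color and Unicode glyphs."""
    stream = sys.stdout if stream is None else stream
    env = os.environ if env is None else env
    if env.get("NO_COLOR"):          # present and non-empty disables color
        return False
    if env.get("EPISTEME_NO_RICH"):  # assumption: any non-empty value opts out
        return False
    isatty = getattr(stream, "isatty", None)
    return bool(isatty and isatty())  # pipes and files get plain output


def health_indicator(level: str, rich: bool) -> str:
    """Render a colored dot, or the plain ASCII fallback."""
    ansi, ascii_fallback = _GLYPHS[level]
    return f"{ansi}●\033[0m" if rich else ascii_fallback
```

Passing the stream and environment as parameters keeps the detection testable without touching the real terminal.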

…fier CLI

Event 121 pivots positioning from "cognitive tool" to Compliance Evidence Layer. The reframe survives the Event 119-120 saturation finding: model-output depth lift is structurally hard to demonstrate at frontier model strength; operator Calibration-Lift on irreversible decisions (MIRROR-aligned CFR reduction) is the new load-bearing value claim. EU AI Act Article 12 high-risk obligations apply 2026-08-02 (84-day regulatory tailwind window).

What lands here (load-bearing slice; strictly additive, parallel-track):
- core/ptsp/: Provenance-Tagged Step Pipeline countering the self-conditioning effect (arXiv 2509.09677). Typed Fact/Inference/Unknown/Assumption ledgers, Promotion Gate (Invariants I1-I5), JCS canonicalization, typed-tag context injection.
- core/signing/: Cryptographic signing primitives. Zero-hard-dependency Ed25519 compat layer (PyNaCl when available, structurally-tagged HMAC-SHA256 fallback for tests), Signed Reasoning Surface envelope schema with sign + verify, RFC 3161 TSA token shape, Sigstore Rekor inclusion-proof shape. Live TSA/Rekor integration deferred behind operator choice.
- src/episteme/verify/: Standalone auditor CLI with deterministic exit codes (0/10/11/12/13/14/20/21/30/64). Single/batch/chain modes. No runtime dependency on episteme; auditors can ship just this module + Python stdlib.
- docs/PRODUCTIZATION_PLAN.md: Phase 3-5 master plan.
- docs/COMPLIANCE_CROSSWALK.md: Field-by-field regulatory mapping: EU AI Act (Art. 12/13/14/19/72), NIST AI RMF + GenAI Profile (NIST AI 600-1), Financial-services framework set (SR 11-7, OCC, EBA, MAS, OSFI, FINRA, SEC 17a-4(f)).

Discipline:
- Soak-protected kernel surfaces untouched. reasoning_surface_guard.py and reasoning-surface@1 schema continue to govern the live kernel; the new signed-surface@1.0 schema is parallel and opt-in.
- Loss-averse asymmetry posture: local commit only, no push, no PR, no publish, no Probe 1 outreach, no commercial entity action.
- Marketing copy that cites the 70.3% CFR-reduction number must hedge it as a MIRROR-aligned design target with OSF pre-registration link until the productive-run dataset exists.

Tests: pytest -q → 953 passed, 54 subtests passed, zero regressions. New: 34 tests across tests/test_ptsp_promotion.py (15), tests/test_signing_canonical_surface.py (9), tests/test_verify_cli.py (10).

Deferred (each with its own future Event + surface):
- Live Sigstore Rekor integration
- CCO dashboard MVP front-end (operator-gated on Probe 1 outcome)
- Phase 2 productive run dataset
- Probe 1 EU AI Governance outreach delivery
- Promotion of signed-surface@1.0 to default-path
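
The typed-ledger and Promotion Gate idea described for core/ptsp/ can be sketched as below. The four tags come from the commit message; everything else is hypothetical. In particular, the single rule enforced here (nothing becomes a Fact without attached evidence) is an illustrative stand-in, since Invariants I1-I5 are not spelled out in this PR.

```python
from dataclasses import dataclass, field
from enum import Enum


class Tag(Enum):
    FACT = "fact"
    INFERENCE = "inference"
    UNKNOWN = "unknown"
    ASSUMPTION = "assumption"


@dataclass
class Entry:
    tag: Tag
    claim: str
    evidence: list[str] = field(default_factory=list)


class PromotionError(ValueError):
    """Raised when an entry fails the (hypothetical) gate rule."""


def promote_to_fact(entry: Entry) -> Entry:
    """Hypothetical gate: only entries with evidence may become facts."""
    if entry.tag is Tag.FACT:
        return entry
    if not entry.evidence:
        raise PromotionError(
            f"cannot promote {entry.tag.value!r} without evidence: {entry.claim}")
    return Entry(Tag.FACT, entry.claim, list(entry.evidence))


def render_context(entries: list[Entry]) -> str:
    """Typed-tag context injection: prefix each claim with its tag."""
    return "\n".join(f"[{e.tag.name}] {e.claim}" for e in entries)
```

Keeping the tag on every line of injected context means a model re-reading its own earlier steps sees which statements were verified and which were guesses, which is the property the self-conditioning counter relies on.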

Mid-Event 122, the operator flagged two corrections that reshaped the build:
1. LangSmith/Langfuse adapter was the wrong category (observability ≠ substrate). Pivoted Task #15 to Hermes signed-surface bridge, a peer substrate to Claude Code per pyproject keywords.
2. Broader doc reconsideration. Audit returned 5 category-error classes in PRODUCTIZATION_PLAN.md; all 5 corrected in this Event:
- audience misallocation (CCO weighted higher than operator)
- positioning treated as validated (Compliance Evidence Layer was a hypothesis, not a measurement)
- substrate-neutrality erosion (Skills Marketplace = Claude-only; LangSmith = observability)
- commercialization assumption (Day-90 forced commercial-or-die)
- hedging dishonesty (70.3% with OSF-link that doesn't exist)

Empirical anchor verified mid-Event: MIRROR benchmark (arXiv 2604.19809) finding: "providing models with their own calibration scores produces no significant improvement; only architectural constraint is effective." External constraint reduces LLM CFR from 0.60 to 0.14 across 5 frontier models. This is the load-bearing rationale for episteme's structural mechanism over procedural prompting.

Framing rewrite (docs):
- docs/PRODUCTIZATION_PLAN.md: full rewrite: rationale chain (Korean essay → MIRROR → architectural constraint → each artifact's role); three positioning hypotheses (Compliance / Operator Audit Trail / Pre-Action Reasoning Commitment) under structured probes, none declared validated; four-outcome Day-90 matrix (added operator-first OSS sustain); layer diagram replacing differentiation matrix; honest substrate-coverage table (Claude full / Hermes partial / Codex name-only / Cursor name-only / opencode name-only)
- docs/COMPLIANCE_CROSSWALK.md: reframed preamble as downstream structural mapping, not primary positioning

Practical realization layer (additive build):
- src/episteme/surface/: operator UX: author / sign / show / list / status / verify (already committed in chkpt 92252fb)
- src/episteme/evidence/: auditor viewer + Regulator Evidence Packet exporter: posture / register / show / alerts / packet build (already committed in chkpt 92252fb)
- src/episteme/hooks/signed_surface_validator.py: opt-in PreToolUse hook running additively alongside reasoning_surface_guard.py (already committed in chkpt 92252fb)
- src/episteme/adapters/hermes.py: extended with signed-surface bridge: SIGNED_SURFACE_PROTOCOL.md + schema reference + governance addendum to ~/.hermes/ for substrate parity
- src/episteme/cli.py: narrow Edit at entry point: top-level surface/evidence/verify subcommands pre-dispatched before main argparse so submodule flags pass through cleanly

Auxiliary docs:
- docs/HOW_TO_AUTHOR_SIGNED_SURFACE.md: developer-facing walkthrough
- docs/HOW_TO_VERIFY_EVIDENCE_PACKET.md: auditor-facing walkthrough
- docs/LIVE_REKOR_DECISION.md: Sigstore public vs self-hosted vs hybrid vs none; operator decision matrix
- docs/OSF_PRE_REGISTRATION_DRAFT.md: Phase 2 trial pre-reg draft ready for OSF submission
- docs/MARKETING_COPY_DRAFT.md: three positioning candidate copies (A/B/C); none landed in README

Tests: pytest -q → 986 passed, 54 subtests, zero regressions. New (33 tests, 4 files):
- tests/test_surface_cli.py 11
- tests/test_evidence_cli.py 7
- tests/test_signed_surface_validator_hook.py 9
- tests/test_e2e_evidence_pipeline.py 6 (incl. Hermes-bridge artifact)

Dropped in Event 122 (per framing audit):
- Skills Marketplace bundle task: Claude-only distribution; premature
- LangSmith / Langfuse adapters: observability not substrate; deferred until Probe 1 signal arrives

Discipline:
- Kernel zero-dep posture preserved (PyNaCl optional; test-mode HMAC structurally tagged)
- Soak-protected core/hooks/reasoning_surface_guard.py untouched
- Local commit only; no push, no PR, no publish, no Probe outreach delivery, no OSF submission, no commercial entity action
- No marketing copy cites the 70.3% number bare; all citations point at arXiv 2604.19809 directly
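
The "pre-dispatched before main argparse" entry-point pattern mentioned for src/episteme/cli.py can be sketched like this. It is a minimal illustration, not the actual cli.py: the handler, its `action` argument, and the `doctor` command used in the usage note are hypothetical. The point it shows is that the subcommand group receives the remaining argv untouched, so its own flags never reach the main parser.

```python
import argparse


def _surface_main(argv: list[str]) -> int:
    """Hypothetical stand-in for a submodule's own CLI entry point."""
    parser = argparse.ArgumentParser(prog="episteme surface")
    parser.add_argument("action")
    parser.add_argument("-i", "--interactive", action="store_true")
    args = parser.parse_args(argv)
    print(f"surface {args.action} interactive={args.interactive}")
    return 0


# Subcommand groups that own their full argument parsing.
PRE_DISPATCH = {"surface": _surface_main}


def main(argv: list[str]) -> int:
    # Hand off before the main parser ever sees the submodule's flags.
    if argv and argv[0] in PRE_DISPATCH:
        return PRE_DISPATCH[argv[0]](list(argv[1:]))
    parser = argparse.ArgumentParser(prog="episteme")
    parser.add_argument("command", nargs="?")
    args = parser.parse_args(argv)
    print(f"core command: {args.command}")
    return 0
```

With this shape, `main(["surface", "author", "-i"])` is handled entirely by the surface parser, while `main(["doctor"])` falls through to the main parser; without the pre-dispatch, the main parser would reject `-i` as an unrecognized argument.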

The deeper framing correction. Event 122 fixed 5 category-error classes but still described episteme as a forcing function / artifact / compliance evidence layer. The operator's deeper correction, paraphrased from the original English and Korean: the product should be "the way to think," and that should be written down somewhere; fix everything, work out what we actually need to do and how best to frame it, then implement it.

The product is a way to think (생각의 틀): the five-stage cognitive practice (Frame → Decompose → Execute → Verify → Handoff) authored in core/memory/global/cognitive_profile.md + workflow_policy.md. The signed Reasoning Surface, the typed PTSP ledgers, the pre-tool-use gate, the standalone verifier, and the Regulator Evidence Packet are scaffolding for the practice and residue from it. Without the practice they are theater. With the practice they are how the practice survives at frontier model strength.

Landed (additive only, no code changes):
- docs/THE_WAY_TO_THINK.md: NEW primary identity doc (~2400 words). Operationalized index of the practice. Every claim traces to operator-authored cognitive_profile.md / workflow_policy.md, the foundational mental models (Kahneman / Dalio / Boyd / Munger), or external research (MIRROR arXiv 2604.19809; long-horizon arXiv 2509.09677). Names which artifact implements which cognitive move.
- README.md: hero rewrite + section header rewrite. Lead with 생각의 틀 / "a way to think." Deep content (ABCD blueprints, protocol synthesis, install) preserved unchanged.
- README.ko.md: parity translation of the new hero + section header. 생각의 틀 kept as operator-coined load-bearing phrase.
- docs/PRODUCTIZATION_PLAN.md: § 0 rewrite. New § 0.1 ("the thing itself") points at THE_WAY_TO_THINK.md as primary identity. Compliance / packet / signed surface demoted to consequences of the practice. § 0.5 expanded with "not a prompt template" + "not an AI safety system" rows.
- docs/COMPLIANCE_CROSSWALK.md: preamble. Explicitly framed as residue of the practice. Structural fit with AI Act / NIST / FS-framework obligations is consequence-of-being-right, not goal.
- docs/MARKETING_COPY_DRAFT.md: preamble + three positioning headlines. Three positionings reframed as three audience-facing surfaces of ONE practice, not three separate identities. Each headline leads with the practice.
- docs/PLAN.md / docs/PROGRESS.md / docs/NEXT_STEPS.md: synced with Event 123 entries.

Deferred to follow-up Events (operator-gated):
- README.es.md + README.zh.md i18n parity (mechanical translation; batch under operator review after EN + KO confirmed)
- kernel/ DESIGN_V1_0_*.md and adjacent docs cross-reference to THE_WAY_TO_THINK.md
- web/ + epistemekernel.com landing-page rewrite (irreversible production deploy; operator-gated)

Tests: pytest -q → 986 passed, 54 subtests, zero regressions. No code changes; the 986-test suite is correct as enforcement geometry for the practice and continues to pass without intervention.

Discipline:
- Code untouched. Kernel-protected surfaces untouched.
- Local commit only; no push.
- No AI co-author trailer.
- Every claim in THE_WAY_TO_THINK.md traces to a named source.

Operator authorized overnight autonomous continuation with three asks: (1) verify code reflects the Event 123 "way to think" framing, (2) design UX "high-grade unique and useful (visually, making it easier to understand and use)," (3) realize the product as a real thing that can be created. Subsequently lifted the loss-averse irreversible gate conditionally: "it can be irreversible if that is the right direction for us."

What landed (additive only; soak-protected kernel untouched):
- core/practice/cognitive_moves.py: source-of-truth registry of 5 stages × N cognitive moves with name + description + System-1 failure counter + schema-field mapping + doc anchor. Referenced by hook errors + practice CLI + quality observer.
- core/practice/quality.py: observe_surface() / observe_surfaces(): gap observations against cognitive-move discipline. Severity: critical / warn / advisory / info. NOT a single-score grade (anti-gaming discipline; would induce optimizing for the score rather than the practice).
- src/episteme/_ui.py: zero-dep ANSI primitives: boxes, colored headers, health indicators (● green/yellow/red, ASCII fallback [+]/[~]/[!]), sparklines (Unicode-block, ASCII fallback), progress, kv-table. TTY + NO_COLOR + EPISTEME_NO_RICH detection. Stdlib only; kernel zero-dep posture preserved.
- src/episteme/practice/: episteme practice walk | retro | demo subcommand group. walk = narrated 5-stage walkthrough with each cognitive move + System-1 counter. demo = worked-example surface body (narrated or JSON-only; JSON output validates against the surface builder so it's surface-sign-able). retro = practice retrospective with gap observations over time window.
- src/episteme/cli.py: episteme practice registered at top-level (pre-argparse-dispatch pattern matching surface/evidence/verify).
- src/episteme/hooks/signed_surface_validator.py: hook error JSON now includes a cognitive_move block (move_id + name + stage + counters + doc_anchor). Exit codes unchanged; the model + operator can now read failures as named cognitive-move violations rather than schema-field violations.
- src/episteme/surface/_cli.py (interactive path only): each prompt rendered with cognitive-move name + System-1 failure-counter preamble. Brief practice-quality preview after authoring shows which gaps episteme practice retro will surface. Non-interactive flags structurally untouched.
- src/episteme/evidence/_viewer.py: upgraded posture panel: boxed sections with health indicators (signed % / chain breaks / test-mode-sig count colored green/yellow/red by threshold). JSON output unchanged for scripting.

Tests: pytest -q → 1050 passed, 54 subtests, zero regressions. (Was 986 baseline + 64 new across 5 files.)
- tests/test_practice_cognitive_moves.py: 14 tests: registry consistency, helper fns, doc anchors, every move has a named System-1 counter
- tests/test_practice_quality.py: 11 tests: observation severity levels, retrospective aggregation, scenario-specific gap codes
- tests/test_ui_rendering.py: 24 tests: env-var detection, color forcing, health indicators (regular + inverse), boxes, headers, sparklines, progress, kv-table, Renderer dataclass
- tests/test_practice_cli.py: 11 tests: walk names all 5 stages + foundational models + source docs; demo narrated/JSON modes; demo JSON is validate_surface_body-valid; retro empty + populated; top-level CLI dispatch
- tests/test_hook_cognitive_move_messages.py: 4 tests: hook errors carry correct cognitive_move metadata; hook still returns exit 2 (Claude Code block contract)

Discipline:
- Kernel zero-dep posture preserved
- Soak-protected core/hooks/reasoning_surface_guard.py + kernel/ docs untouched
- Practice quality scoring exposed ONLY via practice retro (anti-gaming discipline; not numeric score)
- No AI co-author trailer
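
The hook-error change described for signed_surface_validator.py (a cognitive_move block attached to the error JSON, with the block exit code unchanged) can be sketched as follows. The five keys of the block and exit code 2 come from this PR; the registry entry, the outer `error` and `field` keys, and the doc anchor string are hypothetical placeholders rather than the real payload.

```python
import json
from dataclasses import asdict, dataclass

BLOCK_EXIT_CODE = 2  # Claude Code block contract, per the test plan


@dataclass(frozen=True)
class CognitiveMove:
    move_id: str
    name: str
    stage: str
    counters: str    # the System-1 failure this move counters
    doc_anchor: str


# Hypothetical registry entry keyed by schema field; values are placeholders.
REGISTRY = {
    "unknowns": CognitiveMove(
        move_id="frame.name-unknowns",
        name="Name the unknowns",
        stage="Frame",
        counters="overconfidence",
        doc_anchor="docs/THE_WAY_TO_THINK.md",
    ),
}


def build_block(field_name: str, reason: str) -> tuple[int, str]:
    """Return (exit_code, json_text) for a blocked tool call."""
    payload = {"error": reason, "field": field_name}
    move = REGISTRY.get(field_name)
    if move is not None:
        # Attach the named move so the failure reads as a practice gap.
        payload["cognitive_move"] = asdict(move)
    return BLOCK_EXIT_CODE, json.dumps(payload, ensure_ascii=False)
```

Because the exit code and the existing keys stay the same, a consumer that only understands the old payload keeps working, while a reader of the new block sees which move was skipped rather than just which field failed validation.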

vercel · 2026-05-12T07:29:30Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
episteme	Ready	Preview, Comment	May 12, 2026 5:21pm

Two docs contained operationally sensitive material that shouldn't live in the public repo:
- docs/MARKETING_COPY_DRAFT.md: positioning hypotheses currently under Probe testing + literal cold-outreach script + probe deployment table mapping copy → channel → audience. Public exposure would let target audiences see the framing experiment being run.
- docs/PRODUCTIZATION_PLAN.md: mixed: § 0 rationale + Phase 3-4 technical scope (public-appropriate) BUT Phase 5 GTM probes (Probe 1 cold-outreach contact targeting + literal email template + Day-90 commercial-spin-off matrix + anti-self-deception protocol) are sensitive. Moved whole-file private; operator can pull a public-implementation-summary back later if desired.

Both moved to ~/episteme-private/docs/ matching the existing fully-private pattern of cp-*.md / POSTURE.md / NARRATIVE.md / DECISION_STORY.md / etc. (No symlink stubs: those are reserved for PLAN.md / PROGRESS.md / NEXT_STEPS.md, which need public placeholders for the hook chain to find authoritative docs.)

Cross-references updated:
- docs/THE_WAY_TO_THINK.md (2 refs): replaced direct link with honest "GTM strategy held in operator's private notes" phrasing
- docs/HOW_TO_VERIFY_EVIDENCE_PACKET.md (2 refs): replaced with inline descriptions that don't depend on the private docs
- docs/OSF_PRE_REGISTRATION_DRAFT.md (1 ref): submission checklist item updated to point at private notes

No code changes; pytest suite unaffected.

junjslee added 30 commits May 8, 2026 16:04

chkpt: 2026-05-08T16:04:08

83b9de8

chkpt: 2026-05-08T16:04:24

6cbfdb9

chkpt: 2026-05-08T16:04:47

39ebbe6

chkpt: 2026-05-08T16:05:24

8db1956

chkpt: 2026-05-08T16:12:04

ef2c3e5

chkpt: 2026-05-08T16:12:29

2573f12

chkpt: 2026-05-08T16:12:36

f804cfb

chkpt: 2026-05-08T16:13:36

2f11982

chkpt: 2026-05-08T16:25:22

948767d

chkpt: 2026-05-08T16:25:33

4beab98

chkpt: 2026-05-08T16:25:46

e03ba90

chkpt: 2026-05-08T16:25:58

0bd8800

chkpt: 2026-05-08T16:26:56

8eeec4f

chkpt: 2026-05-08T16:31:26

502f66e

chkpt: 2026-05-08T16:34:18

2a7d8d9

chkpt: 2026-05-08T16:34:35

df25068

chkpt: 2026-05-08T16:34:43

d0a6246

chkpt: 2026-05-08T16:35:00

ac78d8d

chkpt: 2026-05-08T16:35:08

9e082de

chkpt: 2026-05-08T16:36:12

b7ca08d

chkpt: 2026-05-08T20:38:46

dfd4608

chkpt: 2026-05-08T21:12:22

3b371b7

chkpt: 2026-05-08T21:13:24

7e3b497

chkpt: 2026-05-08T21:13:32

5d89f78

chkpt: 2026-05-08T21:13:39

8bb118b

chkpt: 2026-05-08T21:13:48

903de14

chkpt: 2026-05-08T21:13:56

c379695

chkpt: 2026-05-08T21:14:20

fd46e01

chkpt: 2026-05-08T21:14:29

9f45433

chkpt: 2026-05-08T21:14:37

d7cbde3

junjslee added 10 commits May 8, 2026 21:14

chkpt: 2026-05-08T21:14:46

fc05438

chkpt: 2026-05-08T21:14:54

4a91c01

chkpt: 2026-05-08T21:15:03

42fc75e

chkpt: 2026-05-08T21:16:25

87c3be6

chkpt: 2026-05-09T15:00:17

edbaca6

chkpt: 2026-05-12T01:24:46

92252fb

vercel Bot deployed to Preview May 12, 2026 17:21 View deployment

junjslee merged commit 9829d3f into master May 12, 2026
5 checks passed

junjslee deleted the productization-events-121-124-way-to-think branch May 12, 2026 17:21

junjslee merged 41 commits into master from productization-events-121-124-way-to-think
