Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
14 changes: 14 additions & 0 deletions DECISIONS.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,20 @@ Active architectural and behavioral decisions for Agent Native PM.

When this file exceeds 50 entries or 30 KB, archive older entries to `DECISIONS_ARCHIVE.md`. The most recent archival pass was on 2026-04-22.

## 2026-04-27: Phase 6c PR-2 Copilot follow-up — error_kind split + connector parity [agent:backend-architect]

- **Context**: Copilot review on PR #27 raised six line-level findings post-merge-review-pipeline. Two of the six (a confidence-error sentinel split and a malformed-source error_kind split) and one critic-round-4 mandatory finding (connector-side parser drift) shipped together as a single follow-up commit.
- **Decision**: (1) `audit.ErrConfidenceNotAllowed` is a NEW sentinel for the non-router-with-confidence case; `audit.ErrInvalidConfidence` retains its router-specific text and is reserved for nil/out-of-range Confidence on `actor_kind=router` rows. Callers that need to distinguish between the two failure modes now have correct `errors.Is` semantics. (2) `models.ErrorKindRoleDispatchMalformed = "role_dispatch_malformed"` is a NEW dispatch error_kind for tasks whose source string is missing the `role_dispatch:<role>` suffix entirely (legacy tasks from before suffixes were required, or programming errors that emit just `role_dispatch`). It is structurally distinct from `role_not_found` (well-formed id, absent from catalog). The remediation hint points operators at the task source field rather than the role catalog. (3) `parseRoleIDFromSource` returns `(roleID, hasSuffix)` — empty suffix counts as malformed. (4) The connector-side parser in `connector/service.go RunOnceTask` was previously emitting `ErrorKind: "unknown"` for both branches; it now emits `role_dispatch_malformed` and `role_not_found` to mirror the server, so operators see consistent diagnostics regardless of which path drained the task. (5) Frontend `availableRolesError: string | null` is a sibling state to `availableRoles`; on `/api/roles` fetch failure the hook keeps `availableRoles=null` (preserving the loading sentinel) and populates the error string. PlanningTab plumbs both into `CandidateReviewPanel` and `CandidateRoleEditor`, which render an explicit "Failed to load roles: …" alert in the dropdown + chip area.
- **Constraints introduced**: (a) `parseRoleIDFromSource(source) → (string, bool)` is the new helper signature in `internal/store/task_store.go`; the previous single-return signature is retracted (only one caller existed). (b) Server and connector parsers MUST stay in lockstep; godoc on the server-side helper documents this and connector tests reference the same `models.ErrorKind*` constants. (c) `actor_audit.actor_kind` enum comment in migration 030 now lists all five values (user, api_key, router, system, connector); embedded `;` characters are forbidden in migration comments because the SQLite parser splits on them (re-tripped during this fix). (d) Phase 6c is greenfield — no live deployment carries `tasks` rows with `source = "role_dispatch"` (no colon), so the new `role_dispatch_malformed` error_kind has no historical backfill obligation; future operators encountering this kind in an observed task should investigate the candidate-applier code path that produced the malformed source.
- **Source**: PR #27 Copilot review (six line-level findings) + critic-round-4 (M1 connector parser drift, M2 backfill question, S1 missing malformed-branch test, S2 missing frontend-error-UI test, S4 server-push catalog assumption). All M+S findings addressed in commit subsequent to `72e066d`.

## 2026-04-26: Phase 6c PR-2 implementation refinements (post review-pipeline) [agent:backend-architect]

- **Context**: Phase 6c PR-2 implementation went through critic + risk-reviewer + /security-review pipelines. Risk-reviewer raised three HIGH findings (H1 audit table had no read endpoint; H2 the 2026-04-25 DECISIONS entry described an `UpdateExecutionRole(ctx, id, role, actor)` API name that diverged from the shipped `Update(id, req, actor)` shape; H3 stale-role tasks auto-fail silently with no operator-visible signal). Critic raised four MAJOR fixes that overlap (confidence rule asymmetric on non-router actors, api-key vs session-user actor confusion, frontend stale-warning races the catalog fetch, drain-loop unbounded under poisoned queues).
- **Decision**: PR-2 absorbs the H1/H2/M1/M3 + critic-mandatory fixes inline rather than deferring. Concretely: (1) `BacklogCandidate.execution_role_authoring` is a server-populated read projection of the latest `actor_audit` row, populated by `BacklogCandidateStore.EnrichWithAuthoring(ctx, candidates)` in the list / get-by-id / patch / apply handlers. Pre-Phase-6c rows have no audit history and surface as `null` — backfill is intentionally not done because the historical authoring is unrecoverable. (2) `audit.ActorInfo.Confidence` is rejected when `actor_kind != "router"` — non-router rows MUST have nil confidence so downstream "router decisions ≥ X" queries don't silently include user-set rows. (3) Two new `ActorKind` values: `api_key` (automation auth) joins the existing user/router/system/connector enum so the audit trail can distinguish session callers from API-key callers; the apply and PATCH endpoints route through `buildAuthoringActor(r, rationale)` which inspects `APIKeyFromContext` first then `UserFromContext`. (4) `ClaimNextDispatchTask`'s stale-role drain loop is bounded by `staleDrainCap = 16` per call — a poisoned queue (catalog rename hits N tasks) drains in batches of 16, yielding control between batches so a connector poll cannot wedge for O(N) writes. (5) Frontend `availableRoles` is now `RoleInfo[] | null`; the `null` sentinel suppresses the stale-role warning until `/api/roles` resolves, eliminating the false-positive flash on mount. The CandidateRoleEditor's stale-check additionally consults the static `KNOWN_ROLE_IDS` mirror so the chip's stale state is unambiguous even before the runtime catalog arrives.
- **Constraints introduced (additive to the 2026-04-25 entry)**: (a) `BacklogCandidateStore.Update(id, req, actor)` is the canonical mutator — the 2026-04-25 entry's `UpdateExecutionRole(...)` name was a pre-implementation sketch and is hereby retracted. (b) `ApplyToTaskWithMode(id, mode, executionRole, actor)` carries the role in its argument list (not implicitly in the candidate row); the candidate's `execution_role` column is updated and audited inside the apply transaction when (and only when) it changes. (c) Stale-role tasks are auto-failed by `claim-next-task` with `error_kind=role_not_found` and an `actor_kind=system` audit row; operators detect them by inspecting `tasks.dispatch_status='failed'` until Phase 6c PR-4 surfaces this in the activity stream. (d) `actor_audit` has no foreign-key cascade to subject rows: this is intentional (history-preserving append-only ledger), and supersedes the 2026-04-25 entry's "cascade-delete with subject row" line which was incorrect.
- **Source**: PR-2 branch `phase6c-pr2-authoring-audit` review pipeline — critic round 1 findings #1/#3/#5/#6, risk-reviewer findings H1/H2/H3/M1, /security-review (zero findings).

## 2026-04-25: Phase 6c v5.1 — L0 safety + authoring lifecycle + LLM router (suggest-only) + activity visibility [agent:feature-planner]

- **Context**: Phase 6b shipped role-dispatch end-to-end in code but the user-facing path is broken in three independent ways: (1) `execution_role` has no UI authoring surface, so the role_dispatch radio is permanently disabled (catch-22); (2) "auto-dispatch" is in name only — operators must still manually pick a role for every task; (3) connector activity during long task execution (backend-architect can run 90 min) is invisible to the frontend, making dogfood debugging impossible. Phase 5 §(g)/§(h) also flagged subprocess sandboxing as a Phase 6 blocker. The user explicitly rejected simple / phase-staged solutions: "一律不考慮簡單作法 我希望是完善的改動 而不是臨時性處理" (memory: `feedback_no_simple_approach`). Closing all four gaps in one phase is therefore in scope, even at ~15-day total cost.
Expand Down
1 change: 1 addition & 0 deletions backend/cmd/server/main.go
Original file line number Diff line number Diff line change
Expand Up @@ -212,6 +212,7 @@ func main() {
RemoteModelsHandler: remoteModelsHandler,
MetaHandler: metaHandler,
HealthHandler: healthHandler,
RolesHandler: handlers.NewRolesHandler(),
AuthMiddleware: sessionAuthMiddleware,
APIKeyMiddleware: apiKeyAuthMiddleware,
LocalModeMiddleware: localModeMiddleware,
Expand Down
3 changes: 3 additions & 0 deletions backend/db/migrations/030_authoring_audit.down.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
DROP INDEX IF EXISTS idx_actor_audit_subject_field;
DROP INDEX IF EXISTS idx_actor_audit_subject;
DROP TABLE IF EXISTS actor_audit;
34 changes: 34 additions & 0 deletions backend/db/migrations/030_authoring_audit.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,34 @@
-- Phase 6c PR-2: generic actor_audit table for execution_role
-- authoring lifecycle (and any future field that needs audit). This
-- table is the single source of truth for who set the field, when,
-- and with what rationale. The backlog_candidates.execution_role
-- column carries only the current value, the trail lives here.
--
-- See docs/phase6c-plan.md section 3.2.1 (v5.1) and the matching
-- DECISIONS entry constraints (a) (b) (h).
--
-- subject_kind values: backlog_candidate, task, planning_run, connector
-- field examples: execution_role, status, po_decision (caller chooses)
-- actor_kind values: user, api_key, router, system, connector
-- user session-authenticated human operator
-- api_key automation-authenticated request via API key
-- router reserved for Phase 6d auto-apply (LLM router) -- NO writer in 6c
-- PR-3 suggest writes "user" after operator confirms
-- system server-side enforcement (e.g. claim-next-task stale-role)
-- connector connector-initiated change
-- confidence: 0.0-1.0, only set when actor_kind=router
CREATE TABLE actor_audit (
id TEXT PRIMARY KEY,
subject_kind TEXT NOT NULL,
subject_id TEXT NOT NULL,
field TEXT NOT NULL,
old_value TEXT,
new_value TEXT,
actor_kind TEXT NOT NULL,
actor_id TEXT,
rationale TEXT,
confidence REAL,
created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE INDEX idx_actor_audit_subject ON actor_audit(subject_kind, subject_id, created_at DESC);
CREATE INDEX idx_actor_audit_subject_field ON actor_audit(subject_kind, subject_id, field, created_at DESC);
Loading
Loading