Evolith Tracker — Security Specification

Bilingual Navigation: English (this document) · Versión en Español

Document Status: Draft
Type: Security Specification
Satellite: Evolith Tracker
Upstream: Evolith Core
Date: 2026-06-07
Author: Architect Agent (security focus) + Governance Auditor (BMAD)
Resolves: GAP-006 / REM-005


1. Purpose & Scope

This document defines the security architecture and controls for Evolith Tracker. It is the authoritative reference for authentication, authorization, tenant isolation, data protection, secrets, input validation, threat modeling, and security testing.

It derives its mandates from existing governance and design:

Source Mandate
Global Rules R-08 "Authentication designs must explicitly show both IDP and Internal flows"
Global Rules R-15 Multi-tenancy: application-layer isolation primary, DB-native (RLS) secondary failsafe
Tech Standards Dual-path auth (External IDP + Internal Credentials); RLS for tenant isolation
TAD §16, §18, §25 UMS auth, RLS, PII-safe logging, fail-closed permission guard
PRD BR-006 (tenant isolation absolute); STRIDE (EPIC-001, Could Have)

Decisions marked ⊕ PROPOSAL require human (PO/Architect/Security) approval and are NOT yet adopted. They are options surfaced by this audit, not policy.

Out of scope: penetration test execution reports (operational), compliance certification (SOC2/ISO — future), and UMS's own internal security (owned by the UMS product).


2. Security Architecture Overview

Evolith Tracker delegates identity entirely to UMS and enforces authorization locally as a fail-closed policy engine. Tenant isolation is defense-in-depth: application-layer scoping (primary) backed by PostgreSQL Row-Level Security (secondary failsafe).

flowchart LR
    subgraph Client
      U[User / BMAD Agent]
    end
    subgraph Edge
      GW[API Gateway / BFF]
    end
    subgraph Tracker[Tracker Monolith]
      MW[TenantContext + Auth Middleware]
      G[TrackerPermissionGuard fail-closed]
      H[Command/Query Handlers]
    end
    subgraph External
      UMS[UMS SaaS - AuthN/AuthZ]
    end
    DB[(PostgreSQL + RLS)]

    U -->|Bearer JWT| GW
    GW -->|validate JWT via JWKS| UMS
    GW --> MW
    MW --> G
    G -->|resolve effective permissions| UMS
    G --> H
    H -->|tenant-scoped, RLS-enforced| DB

Trust boundaries:

Boundary Control
Client → Edge TLS 1.2+; bearer JWT required
Edge → UMS JWT signature verified against UMS JWKS; short cache of JWKS keys
Middleware → Guard Tenant context established before any handler runs
Guard → Handler Fail-closed: no explicit grant ⇒ deny
Handler → DB Tenant-scoped queries + RLS failsafe

3. Authentication (AuthN)

Per R-08, both paths are explicit:

Path Flow
External IDP (primary) UMS is the IdP. User authenticates at UMS; UMS issues a signed JWT. The Tracker never holds credentials.
Internal / service Service-to-service and CLI/MCP agents present a token issued by UMS (client-credentials style). The Tracker validates it identically — there is no separate local credential store.

3.1 Token Validation

// Edge/middleware — validate every request's bearer token against UMS JWKS
interface ITokenValidator {
  validate(token: string): Promise<Result<TokenClaims, AuthError>>;
}

@Injectable()
export class UmsJwtValidator implements ITokenValidator {
  // JWKS keys cached with short TTL; rotation honored via `kid` lookup + cache miss refetch
  async validate(token: string): Promise<Result<TokenClaims, AuthError>> {
    // 1. Parse header, resolve `kid` → public key from cached JWKS
    // 2. Verify signature (RS256/ES256 — never `none`, never HS with shared secret)
    // 3. Verify `iss` == UMS issuer, `aud` includes 'evolith-tracker', `exp`/`nbf`
    // 4. Return canonical TokenClaims { userId, tenantId, sessionId }
  }
}

AuthN rules:

# Rule
AN-1 Only asymmetric signatures (RS256/ES256) accepted; alg: none and HS256 are rejected outright
AN-2 iss, aud, exp, nbf are always verified; clock skew tolerance ≤ 60s
AN-3 JWKS keys cached with a short TTL; unknown kid triggers a single JWKS refetch (key rotation support)
AN-4 No token contents are trusted for authorization beyond identity — permissions are resolved fresh from UMS (see §4)
AN-5 Tokens are never logged, cached in plaintext, or persisted (see §7, §8)
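Rules AN-1 and AN-2 can be illustrated with a small pure function. This is a minimal sketch, not the Tracker's implementation: the `JwtHeader` and `JwtClaims` shapes, the function name, and the example issuer are assumptions, and the real UMS claim names are not confirmed in this document. It covers only the header and registered-claim checks; signature verification against the JWKS key must still run before any claim is trusted.

```typescript
// Hypothetical shapes: the real UMS token contract is not confirmed here.
type JwtHeader = { alg: string; kid?: string };
type JwtClaims = { iss: string; aud: string | string[]; exp: number; nbf?: number };
type Check = { ok: true } | { ok: false; reason: string };

const ALLOWED_ALGS: readonly string[] = ["RS256", "ES256"]; // AN-1
const MAX_SKEW_SECONDS = 60; // AN-2

// Checks the header algorithm and the registered claims only.
// Signature verification (key resolved by `kid`) is a separate, mandatory step.
function checkHeaderAndClaims(
  header: JwtHeader,
  claims: JwtClaims,
  expected: { issuer: string; audience: string },
  nowSeconds: number,
): Check {
  if (!ALLOWED_ALGS.includes(header.alg)) {
    return { ok: false, reason: `alg ${header.alg} not allowed` };
  }
  if (claims.iss !== expected.issuer) return { ok: false, reason: "iss mismatch" };

  // `aud` may be a single string or an array of strings.
  const audiences = Array.isArray(claims.aud) ? claims.aud : [claims.aud];
  if (!audiences.includes(expected.audience)) return { ok: false, reason: "aud mismatch" };

  // Allow at most MAX_SKEW_SECONDS of clock drift in either direction.
  if (nowSeconds > claims.exp + MAX_SKEW_SECONDS) return { ok: false, reason: "expired" };
  if (claims.nbf !== undefined && nowSeconds + MAX_SKEW_SECONDS < claims.nbf) {
    return { ok: false, reason: "not yet valid" };
  }
  return { ok: true };
}
```

Passing the current time as a parameter keeps the skew boundaries easy to cover in the token-rejection tests listed in §11.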

3.2 Token Lifecycle ⊕ PROPOSAL

Concern Proposed policy Decision owner
Access token TTL ⊕ 15 minutes Security + UMS alignment
Refresh ⊕ Handled by UMS; Tracker triggers re-auth on 401, never mints tokens Architect
Session revocation ⊕ Honor UMS token introspection / short TTL so revocation propagates within one TTL window Security

These values must be confirmed against the actual UMS token contract (TAD gap: "UMS Authorization Graph schema not inspected"). Until confirmed, they are proposals.


4. Authorization (AuthZ)

Authorization is fail-closed and resolved from the UMS authorization graph, mapped to Tracker-canonical permissions (TAD §18).

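import { CanActivate, ExecutionContext, Injectable, Logger } from '@nestjs/common';
import { Reflector } from '@nestjs/core';

@Injectable()
export class TrackerPermissionGuard implements CanActivate {
  constructor(
    private readonly reflector: Reflector,
    private readonly umsAdapter: IUmsAdapter, // UMS port (TAD §18); the type name is assumed here
  ) {}

  async canActivate(context: ExecutionContext): Promise<boolean> {
    const handler = context.getHandler();
    const required = this.reflector.get<TrackerPermission>('requiredPermission', handler);
    if (!required) {
      // AZ-2: a handler is public only if explicitly marked; 'isPublic' is an assumed metadata key
      return this.reflector.get<boolean>('isPublic', handler) === true;
    }

    try {
      const req = context.switchToHttp().getRequest();
      const permissions = await this.umsAdapter.getEffectivePermissions(req.user.id, {
        tenantId: req.tenantId,
        initiativeId: req.initiativeId,
      });
      return permissions.includes(required); // FAIL-CLOSED: absence of grant ⇒ false
    } catch (err) {
      // AZ-4: resolution failure ⇒ deny. Returning false makes Nest respond 403, not 500.
      Logger.error('UMS permission resolution failed', err instanceof Error ? err.stack : undefined);
      return false;
    }
  }
}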

AuthZ rules:

# Rule
AZ-1 Fail-closed by default. No grant ⇒ deny. There is no "allow on error" path.
AZ-2 A handler with no @RequirePermission must be explicitly intended as public; the default posture is "permission required".
AZ-3 Permissions are scoped (tenant + resource). A tracker:initiative:approve grant in tenant A never applies in tenant B.
AZ-4 UMS-graph resolution failures result in deny (not allow); errors are logged and surfaced as 403, never 500-with-access.
AZ-5 Authorization is enforced server-side. Frontend permission-driven UI (TAD §22) is UX only — never the security boundary.
AZ-6 The Web, CLI, and MCP surfaces share the same guard and permission model (BR-008) — no surface has a privileged bypass.
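The scoping in AZ-3 and the deny-on-error behavior in AZ-1 and AZ-4 can be sketched independently of any framework. The `Grant` and `Scope` shapes and both function names are hypothetical, since the UMS authorization graph schema has not been inspected; treat this as an illustration of the rules rather than the adapter's actual logic.

```typescript
// Hypothetical grant shape; the real UMS graph schema is not confirmed here.
type Grant = { permission: string; tenantId: string; resourceId?: string };
type Scope = { tenantId: string; resourceId?: string };

// AZ-3: a grant applies only inside its own tenant, and only to its own
// resource when the grant is resource-specific.
function isGranted(grants: readonly Grant[], required: string, scope: Scope): boolean {
  return grants.some(
    (g) =>
      g.permission === required &&
      g.tenantId === scope.tenantId &&
      (g.resourceId === undefined || g.resourceId === scope.resourceId),
  );
}

// AZ-1 / AZ-4: any failure while resolving grants is treated as a deny.
async function authorize(
  resolveGrants: () => Promise<readonly Grant[]>,
  required: string,
  scope: Scope,
): Promise<boolean> {
  try {
    return isGranted(await resolveGrants(), required, scope);
  } catch {
    return false; // there is no "allow on error" path
  }
}
```

Because `authorize` can only return `true` from a successful resolution that contains a matching grant, a timeout or malformed response from the resolver cannot widen access.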

4.1 MCP / Agent Authorization

Per PRD §5.3, the Tracker is authoritative over agents. Security implications:

# Rule
MCP-1 An agent cannot self-assign work, skip a gate, or override an upstream constraint via MCP (PRD MCP principle)
MCP-2 Agent tokens carry the same scoped permissions as humans; BR-007 (agent = human for gate evaluation) does not mean elevated access
MCP-3 Every agent action is logged to the audit trail (BR-009) with attribution
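To make the attribution requirement in MCP-3 concrete, the following in-memory sketch shows one possible shape for an attributed, append-only record. The field names and the `AppendOnlyAuditLog` class are assumptions for illustration only; the real trail lives in the `tracker_audit` tables, whose schema is not shown in this document.

```typescript
type ActorKind = "human" | "agent";

// Hypothetical record shape; the actual tracker_audit columns may differ.
type AuditRecord = Readonly<{
  at: string; // ISO-8601 timestamp
  actorId: string;
  actorKind: ActorKind;
  tenantId: string;
  action: string;
}>;

class AppendOnlyAuditLog {
  private readonly records: AuditRecord[] = [];

  // The only write operation: there is no update or delete method.
  append(entry: Omit<AuditRecord, "at">, now: Date = new Date()): AuditRecord {
    const record: AuditRecord = Object.freeze({ ...entry, at: now.toISOString() });
    this.records.push(record);
    return record;
  }

  // Returns a copy so callers cannot mutate or truncate the history.
  list(): readonly AuditRecord[] {
    return [...this.records];
  }
}
```

Agents and humans go through the same `append` call, so an agent action is recorded with the same fields as a human one and differs only in `actorKind`.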

5. Tenant Isolation (Defense-in-Depth)

Per R-15 and BR-006, isolation is enforced at two layers:

Layer Mechanism Role
Application (primary) TenantContextMiddleware derives tenantId from the validated token and scopes every query First line — every repository query is tenant-filtered
Database (secondary failsafe) PostgreSQL RLS policy on app.current_tenant_id Catches any application-layer scoping bug; a missing WHERE tenant_id cannot leak data

ALTER TABLE tracker_discovery.initiatives ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON tracker_discovery.initiatives
  USING (tenant_id = current_setting('app.current_tenant_id')::UUID);

Isolation rules:

# Rule
TI-1 app.current_tenant_id is set per request inside the transaction, from the token — never from a client-supplied header/body
TI-2 RLS is enabled on every tenant-scoped table; a new table without RLS fails the security review
TI-3 Cross-tenant access is impossible by design; there is no "admin sees all tenants" query path in Phase 1
TI-4 Tenant isolation is tested at the integration layer (see Test Strategy §4.3) — both layers asserted
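TI-5 The DB role used by the app is non-superuser and non-BYPASSRLS, and does not own the tenant-scoped tables (or those tables set FORCE ROW LEVEL SECURITY), because PostgreSQL exempts table owners from RLS by default; this ensures RLS cannot be silently skipped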
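TI-1 can be sketched as a small transaction wrapper. The `TxRunner` interface and `withTenant` function are hypothetical stand-ins (a TypeORM QueryRunner or a `pg` client could be adapted to this shape); the one PostgreSQL fact relied on is that `set_config(name, value, true)` applies the setting for the current transaction only.

```typescript
// Minimal stand-in for a transactional query runner (hypothetical interface).
interface TxRunner {
  query(sql: string, params?: unknown[]): Promise<unknown>;
}

const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Runs `work` in a transaction whose RLS tenant comes from the validated
// token (TI-1). Because the setting is transaction-local, a pooled
// connection cannot carry one tenant's id into the next request.
async function withTenant<T>(
  runner: TxRunner,
  tenantIdFromToken: string,
  work: (runner: TxRunner) => Promise<T>,
): Promise<T> {
  if (!UUID_RE.test(tenantIdFromToken)) throw new Error("invalid tenant id");
  await runner.query("BEGIN");
  try {
    await runner.query("SELECT set_config('app.current_tenant_id', $1, true)", [
      tenantIdFromToken,
    ]);
    const result = await work(runner);
    await runner.query("COMMIT");
    return result;
  } catch (err) {
    await runner.query("ROLLBACK");
    throw err;
  }
}
```

If the setting is never applied, `current_setting('app.current_tenant_id')` in the policy raises an error rather than returning rows, which keeps the database layer fail-closed as well.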

6. STRIDE Threat Model

Threat Vector Mitigation Residual risk
S — Spoofing Forged/replayed JWT Asymmetric signature verification, iss/aud/exp checks, JWKS rotation (AN-1..3) Low
S — Spoofing Agent impersonating another agent/user via MCP Scoped UMS tokens; no self-assignment (MCP-1..2) Low
T — Tampering Modifying initiative/contract state out of band Fail-closed guards; Unit of Work atomicity; optimistic concurrency (xmin) Low
T — Tampering Webhook payload forgery (GitHub/Jira ACL) ⊕ Verify webhook signatures (HMAC) at the ACL boundary Medium until ⊕ adopted
R — Repudiation Agent/human denies an action Append-only audit trail (tracker_audit), timestamped + attributed (BR-009) Low
I — Information Disclosure Cross-tenant data leak Two-layer isolation: app scoping + RLS failsafe (§5) Low
I — Information Disclosure PII/secrets in logs PII-safe structured logging; never log tokens/email/PII (§7) Low
D — Denial of Service Request flooding, expensive queries ⊕ Rate limiting + query timeouts (§9) Medium until ⊕ adopted
E — Elevation of Privilege Accessing higher-scope operation Fail-closed, scoped permissions (AZ-1..6); non-superuser DB role (TI-5) Low
E — Elevation of Privilege Frontend bypass of hidden actions Server-side enforcement is the boundary; UI gating is UX only (AZ-5) Low
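The proposed webhook signature check (still ⊕, not adopted) could look roughly like the sketch below, which uses Node's built-in `crypto` module. The `sha256=<hex>` header format is modeled on GitHub's `X-Hub-Signature-256`; other providers may sign differently, and the function names and secret handling are assumptions.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Computes the expected header value for a raw request body.
function signPayload(secret: string, rawBody: string): string {
  return "sha256=" + createHmac("sha256", secret).update(rawBody, "utf8").digest("hex");
}

// Returns true only when the supplied signature matches the HMAC of the raw body.
function verifyWebhookSignature(
  secret: string,
  rawBody: string,
  signatureHeader: string | undefined,
): boolean {
  if (!signatureHeader) return false; // fail-closed on a missing header
  const expected = Buffer.from(signPayload(secret, rawBody), "utf8");
  const received = Buffer.from(signatureHeader, "utf8");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

The check must run on the raw request bytes before JSON parsing, since re-serializing a parsed body can change whitespace or key order and invalidate the signature.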

7. Data Protection & Logging

Per TAD §25, logging is PII-safe by construction.

# Rule
DP-1 Never log: tokens, passwords, email, or any PII. Use internal userId (UUID) only.
DP-2 Logs are structured (JSON) with correlationId, tenantId, userId, action, durationMs
DP-3 Data in transit: TLS 1.2+ everywhere (client→edge, edge→UMS, app→DB)
DP-4 Data at rest: ⊕ PROPOSAL — PostgreSQL volume encryption + encrypted backups (cloud-provider managed)
DP-5 tracker_audit is append-only; no UPDATE/DELETE grants on audit tables
DP-6 Error responses never leak stack traces or internal identifiers to clients (generic problem+json)
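One way to make DP-1 and DP-2 hold "by construction" is to allowlist log fields rather than blocklist sensitive ones. The sketch below assumes the five field names from DP-2; the function name and types are hypothetical and not taken from the TAD.

```typescript
// Fields the logger may emit (DP-2). Anything else is dropped, so a newly
// introduced PII field is excluded by default rather than leaked by default.
const LOGGABLE_FIELDS = ["correlationId", "tenantId", "userId", "action", "durationMs"] as const;
type LoggableField = (typeof LOGGABLE_FIELDS)[number];
type LogEntry = Partial<Record<LoggableField, string | number>>;

function toSafeLogEntry(raw: Record<string, unknown>): LogEntry {
  const entry: LogEntry = {};
  for (const field of LOGGABLE_FIELDS) {
    const value = raw[field];
    // Only primitives are copied; nested objects could smuggle PII.
    if (typeof value === "string" || typeof value === "number") entry[field] = value;
  }
  return entry;
}
```

A call such as `toSafeLogEntry({ userId, tenantId, action, email, token })` yields an entry containing only the first three fields, which is the behavior the PII-safe logging unit test in §11 would assert.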

8. Secrets Management ⊕ PROPOSAL

Concern Proposed approach Decision owner
Secret storage ⊕ Cloud secret manager (AWS Secrets Manager / Azure Key Vault) — not committed .env files in any non-local environment DevOps + Security
Local dev .env (gitignored) acceptable for local only
Injection Secrets injected as runtime env vars from the secret manager; never baked into images DevOps
Rotation ⊕ Periodic rotation of DB credentials and any service tokens Security
Scope DB app role is least-privilege, non-superuser, non-BYPASSRLS (TI-5) Architect

These are proposals consistent with the TAD's env-var usage and the Docker/Helm sections; the concrete secret manager choice is a DevOps decision (relates to GAP-007 CI/CD spec).
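Whichever secret manager is chosen, the application side can stay the same: read injected environment variables once at startup and refuse to boot if any are missing. A minimal sketch, assuming hypothetical variable names such as `DATABASE_URL` and `UMS_JWKS_URI` that are not defined anywhere in this document:

```typescript
type Env = Readonly<Record<string, string | undefined>>;

// Returns the requested secrets or throws, listing only the missing names.
function requireSecrets<K extends string>(env: Env, names: readonly K[]): Record<K, string> {
  const missing = names.filter((n) => !env[n]);
  if (missing.length > 0) {
    // Report names only; never echo secret values (DP-1).
    throw new Error(`Missing required secrets: ${missing.join(", ")}`);
  }
  const out = {} as Record<K, string>;
  for (const n of names) out[n] = env[n] as string;
  return out;
}
```

At startup this would be called with `process.env`, so a misconfigured deployment fails immediately instead of running with an empty credential.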


9. Input Validation, Rate Limiting & Hardening

Control Specification
Input validation class-validator (DTOs) + zod (boundaries) — layer-appropriate per TAD §10; reject unknown fields
Injection defense TypeORM parameterized queries only; no string-concatenated SQL; JSONB inputs validated before persistence
Rate limiting ⊕ PROPOSAL — per-tenant + per-IP rate limits at the gateway/BFF; stricter limits on auth and mutation endpoints
Query timeouts ⊕ PROPOSAL — statement timeout on DB sessions to bound expensive queries (DoS mitigation)
Security headers ⊕ PROPOSAL — HSTS, X-Content-Type-Options, X-Frame-Options/CSP at the edge
CORS Allowlist of known frontend origins (Shell Host + remotes); no wildcard in production
Mass assignment DTOs whitelist permitted fields; aggregates are constructed via factories, not direct body binding
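The rate-limiting proposal could be prototyped with a fixed-window counter. This sketch is an assumption-laden illustration: it keys on the (tenant, IP) pair for brevity, whereas separate per-tenant and per-IP limits would use two limiter instances, and the thresholds remain an open decision (SEC-D3). State is in-memory, so it suits a single gateway instance only.

```typescript
// Fixed-window counter keyed by tenant and client IP (hypothetical class).
class FixedWindowRateLimiter {
  private readonly windows = new Map<string, { start: number; count: number }>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed, false if it should receive a 429.
  allow(tenantId: string, ip: string, nowMs: number = Date.now()): boolean {
    const key = `${tenantId}|${ip}`;
    const current = this.windows.get(key);
    // Start a new window on first sight or once the previous one has elapsed.
    if (!current || nowMs - current.start >= this.windowMs) {
      this.windows.set(key, { start: nowMs, count: 1 });
      return true;
    }
    if (current.count >= this.limit) return false;
    current.count += 1;
    return true;
  }
}
```

Stricter limits on auth and mutation endpoints can be modeled by instantiating the limiter with a lower `limit` for those routes.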

10. OWASP Top 10 (2021) Coverage

OWASP Coverage in Tracker
A01 Broken Access Control Fail-closed guards, scoped permissions, server-side enforcement (§4)
A02 Cryptographic Failures Asymmetric JWT, TLS 1.2+, ⊕ at-rest encryption (§3, §7)
A03 Injection Parameterized TypeORM, class-validator/zod (§9)
A04 Insecure Design Hexagonal isolation, fail-closed default, threat model (§6)
A05 Security Misconfiguration Non-superuser DB role, RLS mandatory, ⊕ security headers (§5, §9)
A06 Vulnerable Components ⊕ PROPOSAL — dependency scanning (SCA) in CI (relates to GAP-007)
A07 Auth Failures UMS IdP, strong token validation, ⊕ short TTL + revocation (§3)
A08 Data Integrity Failures Unit of Work atomicity, optimistic concurrency, ⊕ webhook signature verification (§6)
A09 Logging/Monitoring Failures PII-safe structured logs, append-only audit, OpenTelemetry traces (§7)
A10 SSRF ⊕ PROPOSAL — egress allowlist for ACL adapters (GitHub/Jira/Core/UMS)

11. Security Testing

Per the Test Strategy:

Test Layer Asserts
Fail-closed authorization Presentation e2e Missing permission ⇒ 403 (§4)
Scoped permission Integration Grant in tenant A denied in tenant B (AZ-3)
Tenant isolation (RLS) Integration Tenant B cannot read Tenant A rows (TI-4)
Token rejection Unit/integration alg: none, expired, wrong aud all rejected (AN-1..2)
PII-safe logging Unit Logger redacts/omits token, email, PII (DP-1)
Audit trail Integration Every mutating action appends an attributed audit record (DP-5, BR-009)

⊕ PROPOSAL — add SAST + dependency (SCA) scanning to CI (GAP-007 CI/CD spec).


12. Open Security Decisions (Require Human Approval)

ID Decision Owner Blocking
SEC-D1 Token TTL / refresh / revocation values Security + UMS Confirm against UMS contract
SEC-D2 Secret manager choice (AWS/Azure/other) DevOps + Security Production deploy
SEC-D3 Rate-limiting thresholds (per-tenant/IP) Architect + Security Production hardening
SEC-D4 Webhook signature verification (HMAC) for ACLs Architect Before enabling GitHub/Jira ingestion
SEC-D5 At-rest encryption + backup encryption approach DevOps Production deploy
SEC-D6 Security headers / CSP policy Frontend + Security Production deploy
SEC-D7 Egress allowlist (SSRF) + SCA/SAST in CI DevOps + Security CI/CD spec (GAP-007)

Per audit rules, these are registered as proposals — not approved decisions. They should be ratified by the relevant owners and, where architectural, registered in DECISIONS.md.


References