diff --git a/CHANGELOG.md b/CHANGELOG.md
index b91b45a..8fe614c 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -23,6 +23,7 @@ the first consumer-visible behaviour change and will drive the next SDK version
 - **Automation-detector wording**: the anti-tamper flag now reads "Anti-tamper signals" (not "Automation detected") when the combined confidence is weak — e.g. devtools open in a dev environment — so a human isn't labelled a bot. The machine-readable `code` (`automation_suspected`) is unchanged; the reason text also drops the `tamper.` prefix for readability.
 
 ### Internal
+- **Docs: GDPR & consent** ([ADR-0004](docs/adr/0004-consent-and-data-lifecycle.md)): new [GDPR & consent integration guide](docs/integrations/gdpr-consent.md) (controller/processor split, CMP wiring per mode, lawful-basis guidance, data-subject rights, DPA stub); OpenAPI updated with the snapshot consent fields, the `LawfulBasis` schema, and the `DELETE`/`export` identity paths.
 - **Retention sweeper** ([ADR-0004](docs/adr/0004-consent-and-data-lifecycle.md)): a daily BullMQ repeatable job in the worker deletes identities (and, by cascade, their snapshots/drifts/risk/links) whose `last_seen` is older than their project's `retention_days`; projects with a null `retention_days` keep data indefinitely. `sweepRetention()` is idempotent and unit-tested against a real DB.
 - **Data-subject endpoints** ([ADR-0004](docs/adr/0004-consent-and-data-lifecycle.md)): `DELETE /v1/identity/:id` (GDPR Art. 17 — erases the identity; snapshots/drifts/risk assessments/account links cascade) and `GET /v1/identity/:id/export` (Art. 20 — the full bundle, including each snapshot's consent provenance). Erasure is strictly key-gated (a non-GET never reaches the route via an admin session); export is readable by key or admin session like the other reads.
 - **Consent provenance + IP minimization on the server** ([ADR-0004](docs/adr/0004-consent-and-data-lifecycle.md), migration 012): snapshots now record the lawful basis, consent version, and grant time the SDK forwards (falling back to a per-project `lawful_basis_default`). The stored `client_ip` is **network-truncated by default** (`/24` IPv4, `/48` IPv6 via `minimizeIp`) — still city-accurate for impossible-travel, while the full IP is used only transiently in the request path (so the anonymizer detector is unaffected). A per-project `store_full_ip` flag keeps the full address for operators with a documented basis. New `projects.retention_days` column lands here for the upcoming retention sweeper.
diff --git a/README.md b/README.md
index 4c71bbd..6400b38 100644
--- a/README.md
+++ b/README.md
@@ -231,6 +231,7 @@ The repo-root `docker-compose.yml` is the **local dev** stack instead — it bui
 - [Concepts](docs/concepts.md) — probabilistic identity, drift, confidence, risk
 - [Signal Reference](docs/signals.md) — every collected signal, stability class, GDPR notes
 - [Persistence Policies](docs/persistence-policies.md) — storage scopes, compliance guide
+- [GDPR & Consent](docs/integrations/gdpr-consent.md) — privacy-by-default consent gate, CMP wiring, lawful basis, data-subject rights
 - [REST API](docs/openapi.yaml) — OpenAPI 3 spec (the authoritative contract; lint with `pnpm openapi:lint`)
 - [OTel Bridge](docs/integrations/otel-bridge.md) — tracing integration guide
 - [Migrating from FingerprintJS](docs/migrating-from-fingerprintjs.md)
diff --git a/docs/integrations/gdpr-consent.md b/docs/integrations/gdpr-consent.md
new file mode 100644
index 0000000..8453fdf
--- /dev/null
+++ b/docs/integrations/gdpr-consent.md
@@ -0,0 +1,94 @@
+# GDPR & consent integration guide
+
+Scent is built **privacy-by-default**: the SDK collects, persists, and transmits
+**nothing** until consent is granted. It *enforces* consent but never renders a banner —
+because in almost every deployment **you are the data controller** and you already own
+the consent experience. This guide shows how to wire Scent into it. The architecture
+rationale is recorded in [ADR-0004](../adr/0004-consent-and-data-lifecycle.md).
+
+## Who is responsible for what
+
+| | Role | Owns |
+|---|---|---|
+| **You** (the site embedding Scent) | **Data controller** | The privacy notice, the lawful basis, the consent prompt (your CMP), and the user relationship. |
+| **Scent** | Processor (hosted) / tool (self-host) | Enforcing the gate you configure, recording the basis you declare, and honouring deletion. |
+
+Consequence: **the SDK ships no consent UI.** It reads consent from your existing CMP
+(or an explicit call) and gates everything on it.
+
+## The two gates (don't conflate them)
+
+1. **ePrivacy Art. 5(3)** — reading/writing on the device (the fingerprint signals *and*
+   Scent's `localStorage`/`IndexedDB`/cookie persistence) needs **prior opt-in consent**
+   *unless* it is "strictly necessary for a service the user requested."
+2. **GDPR** — processing the resulting personal data (fingerprint + IP) needs a **lawful
+   basis**: consent, or **legitimate interest** (fraud prevention is a recognised LI).
+
+A login-security / account-takeover use case the *user themselves* initiated has a
+credible "strictly necessary" argument under 5(3); analytics-style scoring does not.
+**You** decide which applies and declare it — Scent records it; it does not adjudicate.
+
+## Wiring consent
+
+Pick the mode that matches your CMP. Collection stays off until the resolver reports
+granted (fail-closed).
+
+```ts
+import { init } from '@tindalabs/scent-sdk';
+
+// 1) Manual — you flip it after your own banner resolves (default mode).
+const scent = init({ apiKey });
+scent.setConsent(true); // ...and scent.setConsent(false) to revoke
+
+// 2) Callback — Scent asks your CMP on each observe() (sync or async).
+init({ apiKey, consent: { mode: 'callback', resolve: () => myCmp.hasConsent('analytics') } });
+
+// 3) IAB TCF v2 — reads window.__tcfapi (Purpose 1: store/access on device).
+init({ apiKey, consent: { mode: 'tcf' } });
+
+// 4) Google Consent Mode — reads analytics_storage / ad_storage from the dataLayer.
+init({ apiKey, consent: { mode: 'gcm' } });
+```
+
+### Declare the lawful basis
+
+```ts
+init({
+  apiKey,
+  basis: 'legitimate_interest', // 'consent' (default) | 'legitimate_interest' | 'strictly_necessary'
+  consentVersion: 'privacy-policy-2026-01',
+});
+```
+
+`basis`, `consentVersion`, and the grant time are attached to every snapshot and stored
+immutably server-side, so you can demonstrate consent (GDPR Art. 7(1)).
+
+## Data-subject rights
+
+**Client** — `scent.forget()` purges every local storage layer and returns the cleared
+identity id (use it to also delete server-side):
+
+```ts
+const id = await scent.forget();
+if (id) await fetch(`${api}/v1/identity/${id}`, { method: 'DELETE', headers: { 'X-Api-Key': key } });
+```
+
+**Server**
+- `DELETE /v1/identity/:id` — erasure (Art. 17); snapshots/drifts/risk/links cascade. **Key-gated.**
+- `GET /v1/identity/:id/export` — portability (Art. 20); the full bundle as JSON.
+
+## Data minimisation (defaults)
+
+- **Client IP is network-truncated at rest** (`/24` IPv4, `/48` IPv6) — still city-accurate
+  for impossible-travel, with the host bits dropped. Set the project's `store_full_ip`
+  only with a documented basis.
+- **Retention**: set a project's `retention_days` and a daily sweep erases identities idle
+  longer than that (cascading). Null = keep indefinitely.
+
+## DPA (template stub)
+
+For the hosted tier, Tindalabs acts as your **processor**. A Data Processing Agreement
+should cover: subject-matter & duration; nature/purpose (probabilistic identity & fraud
+signals); categories of data (device signals, truncated IP, linked account ids);
+sub-processors (the hosting provider); security measures; deletion on termination; and
+audit rights. *(Contact for the current DPA; this is not legal advice.)*
diff --git a/docs/openapi.yaml b/docs/openapi.yaml
index 29b17e7..4ecfef9 100644
--- a/docs/openapi.yaml
+++ b/docs/openapi.yaml
@@ -152,6 +152,48 @@ paths:
         "401": { $ref: "#/components/responses/Unauthorized" }
         "403": { $ref: "#/components/responses/Forbidden" }
         "404": { $ref: "#/components/responses/NotFound" }
+    delete:
+      tags: [Identity]
+      summary: Erase an identity (GDPR Art. 17)
+      description: >
+        Deletes the identity and everything held about it — snapshots, drifts, risk
+        assessments, cluster merges, and account links cascade. Strictly key-gated:
+        an admin session can only read, never erase.
+      security: [{ ApiKeyAuth: [] }]
+      parameters:
+        - { $ref: "#/components/parameters/IdentityId" }
+      responses:
+        "204": { description: Identity erased }
+        "401": { $ref: "#/components/responses/Unauthorized" }
+        "404": { $ref: "#/components/responses/NotFound" }
+
+  /v1/identity/{id}/export:
+    get:
+      tags: [Identity]
+      summary: Export everything held about an identity (GDPR Art. 20)
+      description: >
+        Returns the identity record plus its snapshots (with consent provenance),
+        drifts, risk assessments, and linked accounts as one JSON bundle.
+      security: [{ ApiKeyAuth: [] }, { AdminSession: [] }]
+      parameters:
+        - { $ref: "#/components/parameters/IdentityId" }
+        - { $ref: "#/components/parameters/ProjectIdHeader" }
+      responses:
+        "200":
+          description: Full data-subject export bundle
+          content:
+            application/json:
+              schema:
+                type: object
+                properties:
+                  identity: { $ref: "#/components/schemas/Identity" }
+                  snapshots: { type: array, items: { type: object } }
+                  drifts: { type: array, items: { $ref: "#/components/schemas/Drift" } }
+                  riskAssessments: { type: array, items: { type: object } }
+                  accounts: { type: array, items: { type: object } }
+        "401": { $ref: "#/components/responses/Unauthorized" }
+        "403": { $ref: "#/components/responses/Forbidden" }
+        "404": { $ref: "#/components/responses/NotFound" }
 
   /v1/identity/{id}/timeline:
     get:
@@ -974,6 +1016,13 @@ components:
       type: string
       enum: [conservative, balanced, aggressive, forensic]
 
+    LawfulBasis:
+      type: string
+      enum: [consent, legitimate_interest, strictly_necessary]
+      description: >
+        The GDPR lawful basis the controller asserts for the snapshot. The server
+        records it; it does not adjudicate legality. See ADR-0004.
+
     ConfidenceBand:
       type: string
       enum: [high, medium, low, unknown]
@@ -998,6 +1047,18 @@ components:
         traceparent:
           type: string
           description: Optional W3C Trace Context header.
+        lawfulBasis:
+          $ref: "#/components/schemas/LawfulBasis"
+        consentVersion:
+          type: string
+          maxLength: 128
+          description: >
+            The controller's consent-policy version, forwarded by the SDK for
+            accountability. Optional.
+        consentedAt:
+          type: string
+          format: date-time
+          description: When the data subject's consent was granted. Optional.
 
     EventsBatch:
       type: object
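
Reviewer note on the IP minimisation described in the CHANGELOG entry and the guide (`/24` for IPv4, `/48` for IPv6): the truncation could look roughly like the sketch below. This is an illustration only, not the project's `minimizeIp`, whose implementation is not part of this diff. The names `truncateIpv4`, `truncateIpv6`, and `truncateIp` are hypothetical, and the sketch assumes plain textual addresses with no zone ids or IPv4-mapped IPv6 forms.

```typescript
// Hypothetical sketch of network truncation; NOT the project's actual `minimizeIp`.
// Assumes plain textual addresses (no zone ids, no IPv4-mapped IPv6 forms).

/** Keep the /24 network of an IPv4 address by zeroing the last octet. */
function truncateIpv4(ip: string): string {
  const octets = ip.split('.');
  const valid =
    octets.length === 4 &&
    octets.every((o) => /^\d{1,3}$/.test(o) && Number(o) <= 255);
  if (!valid) throw new Error(`not an IPv4 address: ${ip}`);
  return `${octets[0]}.${octets[1]}.${octets[2]}.0`;
}

/** Keep the /48 prefix of an IPv6 address: the first three 16-bit groups. */
function truncateIpv6(ip: string): string {
  const halves = ip.split('::');
  if (halves.length > 2) throw new Error(`not an IPv6 address: ${ip}`);
  const head = halves[0] ? halves[0].split(':') : [];
  const tail = halves[1] ? halves[1].split(':') : [];
  // "::" stands for however many zero groups are needed to reach eight.
  const fill = 8 - head.length - tail.length;
  const fillOk = halves.length === 2 ? fill >= 1 : fill === 0;
  if (!fillOk) throw new Error(`not an IPv6 address: ${ip}`);
  const groups = [...head, ...new Array<string>(fill).fill('0'), ...tail];
  if (!groups.every((g) => /^[0-9a-fA-F]{1,4}$/.test(g))) {
    throw new Error(`not an IPv6 address: ${ip}`);
  }
  // Normalise case and leading zeros, then drop everything past the third group.
  const prefix = groups.slice(0, 3).map((g) => parseInt(g, 16).toString(16));
  return `${prefix.join(':')}::`;
}

/** Truncate either family; a colon is taken to mean IPv6. */
function truncateIp(ip: string): string {
  return ip.includes(':') ? truncateIpv6(ip) : truncateIpv4(ip);
}
```

With this sketch, `truncateIp('203.0.113.57')` returns `203.0.113.0` and `truncateIp('2001:db8:abcd:12::1')` returns `2001:db8:abcd::`. The full address would still be available in the request path for the anonymizer detector, as the CHANGELOG entry describes; only the stored value is truncated.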