
feat: automated compliance monitoring with scheduled re-classification #825

Open
madsysharma wants to merge 12 commits into SdSarthak:main from madsysharma:feat/auto-compliance-monitoring

@madsysharma
Contributor

@madsysharma madsysharma commented May 29, 2026

feat(compliance): scheduled drift monitoring with notifications & webhooks

Closes issue #82

Summary

This PR implements automated compliance monitoring that re-runs the EU AI Act risk classifier on a nightly schedule against every AI system with monitoring enabled. When the result diverges from the stored classification (risk level, status, or the classifier version itself), a ComplianceDriftEvent row is persisted, an in-app notification is created for the owner, and an optional per-system webhook is delivered with an HMAC-SHA256 signature.

Rationale

Owners shouldn't have to remember to manually re-classify after a regulation update, a questionnaire-logic change, or a slow drift in how their system is used. The job runs at 02:00 UTC by default, is cheap when no drift is detected (a re-classification and a quick comparison), and gives operators an audit trail.

What's in the PR

New backend modules

  • backend/app/models/compliance_drift_event.py: one row per detected drift. Stores before/after risk + status, the drift type (risk_change, status_change, classifier_version_change, mixed), the classifier version that produced it, and webhook delivery outcome.

  • backend/app/modules/compliance/monitor.py: the scanning logic. Iterates monitored systems in chunks, re-builds a RiskClassificationRequest from each system's stored questionnaire_responses, calls the existing classify_risk(), and diffs against current state. Acquires a Postgres advisory lock so multi-replica deployments don't run the scan twice.

  • backend/app/modules/compliance/notifier.py: in-app + webhook dispatch. Webhook bodies are signed with HMAC-SHA256 (X-AegisAI-Signature: sha256=<hex>), use canonical JSON for deterministic signing, and retry on 5xx / network errors with exponential backoff (1–10s, 3 attempts). 4xx responses are terminal.

  • backend/app/core/scheduler.py: APScheduler AsyncIOScheduler started in the FastAPI lifespan. Cron expression in env var COMPLIANCE_MONITOR_CRON (default 0 2 * * *; empty disables).

  • backend/app/api/v1/compliance_monitoring.py: five endpoints:

    • GET /ai-systems/{id}/monitoring: read settings
    • PATCH /ai-systems/{id}/monitoring: toggle + webhook URL
    • POST /ai-systems/{id}/monitoring/rotate-secret: generate HMAC secret (returned exactly once)
    • GET /ai-systems/{id}/drift-events: paginated history
    • POST /admin/compliance/scan: manual trigger
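The diff step that assigns a drift type could be sketched as follows. This is a minimal illustration, not the code in monitor.py: the Snapshot class and drift_type function are hypothetical names, and the sketch assumes that "mixed" means more than one of the compared fields changed in the same scan.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical view of the three fields the monitor compares."""
    risk_level: str
    status: str
    classifier_version: str


def drift_type(stored: Snapshot, fresh: Snapshot) -> Optional[str]:
    """Return the drift type for a scan result, or None when nothing changed."""
    changed = []
    if stored.risk_level != fresh.risk_level:
        changed.append("risk_change")
    if stored.status != fresh.status:
        changed.append("status_change")
    if stored.classifier_version != fresh.classifier_version:
        changed.append("classifier_version_change")
    if not changed:
        return None
    # Assumption: "mixed" covers any scan where several fields changed at once.
    return changed[0] if len(changed) == 1 else "mixed"
```

Returning None for the no-drift case keeps the caller simple: a drift event row is written only when the function returns a type.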

Schema changes

AISystem gains three columns: monitoring_enabled (default true), webhook_url, webhook_secret. New table compliance_drift_events with FK + cascade delete from ai_systems. The existing Notification model (already had COMPLIANCE_DRIFT in its enum, just not registered) is now wired up via User.notifications back-population.

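The repo uses Base.metadata.create_all() at startup, so the new compliance_drift_events table is created automatically on first boot. Note that create_all() only creates missing tables and does not alter existing ones, so an existing ai_systems table will not gain the three new columns without a manual ALTER TABLE or a fresh database. There's no Alembic migration in this PR, since none of the existing models has one either; I would recommend following up by introducing Alembic across the whole project rather than adding a one-off migration here.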

Tests

backend/tests/test_compliance_monitor.py: 12 tests across four classes:

  • TestMonitorScan: no-drift case, risk-change case, disabled systems skipped, systems with no questionnaire skipped, classifier-version-change case.
  • TestInAppNotifications: notification row created, fields correct, event flagged notified_in_app=True.
  • TestWebhookDispatch: HMAC signature matches; retry-on-5xx succeeds on third attempt; 4xx is terminal; no webhook when URL unset.
  • TestMonitoringEndpoints: PATCH settings, rotate-secret shows secret exactly once, drift events list paginates, returns 404 for another user's system (no enumeration leak).
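The retry behaviour exercised by TestWebhookDispatch can be illustrated with a dependency-free sketch. The PR itself uses tenacity; the loop below is a stand-in written for illustration, with a hypothetical deliver_with_retry function, an injected send callable in place of a real HTTP client, and OSError standing in for network errors.

```python
import time
from typing import Callable, Optional, Tuple


def deliver_with_retry(
    send: Callable[[], int],
    attempts: int = 3,
    min_wait: float = 1.0,
    max_wait: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[Optional[int], Optional[str]]:
    """Call send() until it succeeds, hits a terminal 4xx, or attempts run out.

    send() returns an HTTP status code, or raises OSError on a network failure.
    Returns (last_status_code, error_message); error_message is None on success.
    """
    status: Optional[int] = None
    error: Optional[str] = None
    for attempt in range(1, attempts + 1):
        try:
            status, error = send(), None
        except OSError as exc:
            status, error = None, str(exc)
        if status is not None and status < 500:
            # Below 400: delivered. 4xx: terminal, retrying will not help.
            return status, (None if status < 400 else f"terminal HTTP {status}")
        if status is not None:
            error = f"HTTP {status}"
        if attempt < attempts:
            # Exponential backoff clamped to the [min_wait, max_wait] window.
            sleep(min(max_wait, min_wait * 2 ** (attempt - 1)))
    return status, error
```

Passing sleep as a parameter lets a test record the waits instead of actually sleeping.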

All scenarios were validated standalone in 0.4s. Full pytest collection relies on the conftest fixtures db_session, client, test_user, other_user, and auth_headers, which already exist in the repo's tests/conftest.py.

Docs

docs/compliance/drift-monitoring.md: schedule + cron config, per-system settings, drift type taxonomy, full webhook payload schema, signature verification snippets in Python + Node.js, retry semantics, operational concerns, manual-trigger example.
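For orientation, signing and verifying a webhook body might look like the sketch below. It is not copied from the docs file: the function names are hypothetical, and it assumes the secret is UTF-8 encoded and that canonical JSON means sorted keys with compact separators. Only the header format (sha256=<hex>) comes from this PR description.

```python
import hashlib
import hmac
import json


def canonical_body(payload: dict) -> bytes:
    """Serialize with sorted keys and no whitespace so the bytes are deterministic."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(secret: str, body: bytes) -> str:
    """Build the value for the X-AegisAI-Signature header."""
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    return f"sha256={digest}"


def verify(secret: str, raw_body: bytes, header_value: str) -> bool:
    """Receiver-side check against the raw request bytes, in constant time."""
    expected = sign(secret, raw_body)
    return hmac.compare_digest(expected.encode("utf-8"), header_value.encode("utf-8"))
```

A receiver should verify against the raw request bytes rather than a re-serialized copy of the parsed JSON, so that its own encoder settings cannot change the result.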

Design decisions

  • APScheduler over Celery Beat. No Celery worker is deployed today. Adding one for a nightly job would be more infra than warranted. The in-process scheduler is started/stopped in the lifespan handler so uvicorn's --reload doesn't leak schedulers.

  • pg_try_advisory_lock for multi-replica safety. Each replica's scheduler fires independently. The advisory lock guarantees exactly one replica runs the scan; the others log a skip and exit. On non-Postgres backends (SQLite in tests) the lock helper is a no-op.

  • Classifier indirection (_resolve_classifier). The classifier lives in app.api.v1.classification, which transitively imports langchain. That is a problem for unit tests that want to stub it without installing the full RAG stack. The monitor calls _resolve_classifier()(request) instead of importing classify_risk at module load; tests override _resolve_classifier.

  • Update risk_level after drift detection. Otherwise every scan would re-flag the same change forever. The drift event row retains the old -> new values for the audit trail; the system row moves to the new baseline.

  • Bump CLASSIFIER_VERSION to trigger re-review notifications. When the questionnaire logic changes (classify_risk() in classification.py), bumping the version constant emits a classifier_version_change drift event for every monitored system on the next scan, even when the actual result didn't change. This is the audit trail signal owners need to re-review their classification in light of new logic.

  • Webhook secret returned exactly once. POST /rotate-secret is the only path that returns the secret in the response. PATCH doesn't, GET doesn't. This prevents the common mistake of fetching monitoring settings and accidentally logging the secret in client code.

  • Canonical JSON for webhook bodies. Sorted keys, no whitespace. This is required for the HMAC to be reproducible: without it, two JSON encoders can serialize the same payload to different bytes and therefore compute different signatures.
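The advisory-lock guard described in this list could be sketched roughly as below. This is an assumption-laden illustration rather than the PR's helper: scan_lock and SCAN_LOCK_KEY are hypothetical names, the connection is any DB-API style object using the %s parameter style (as psycopg does), and the is_postgres flag stands in for whatever dialect check the real code performs. pg_try_advisory_lock and pg_advisory_unlock are the actual Postgres functions.

```python
from contextlib import contextmanager
from typing import Any, Iterator

SCAN_LOCK_KEY = 82500  # hypothetical application-wide lock key


@contextmanager
def scan_lock(conn: Any, is_postgres: bool = True, key: int = SCAN_LOCK_KEY) -> Iterator[bool]:
    """Yield True if this process may run the scan, False if another holds the lock."""
    if not is_postgres:
        # No-op on other backends (for example SQLite in tests).
        yield True
        return
    cur = conn.cursor()
    try:
        # Non-blocking: returns false immediately when another session holds the key.
        cur.execute("SELECT pg_try_advisory_lock(%s)", (key,))
        acquired = bool(cur.fetchone()[0])
        try:
            yield acquired
        finally:
            if acquired:
                # Session-level locks must be released on the same connection.
                cur.execute("SELECT pg_advisory_unlock(%s)", (key,))
    finally:
        cur.close()
```

A caller would wrap the scan in `with scan_lock(conn) as acquired:` and log a skip when acquired is False.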

Testing

docker compose exec backend pytest tests/test_compliance_monitor.py -v
# 12 passed

Manual end-to-end:

# 1. Enable monitoring with a webhook
curl -X PATCH http://localhost:8000/api/v1/ai-systems/1/monitoring \
  -H "Authorization: Bearer $token" \
  -H "Content-Type: application/json" \
  -d '{"monitoring_enabled": true, "webhook_url": "https://webhook.site/<uuid>"}'

# 2. Generate the HMAC secret
curl -X POST http://localhost:8000/api/v1/ai-systems/1/monitoring/rotate-secret \
  -H "Authorization: Bearer $token"
# {"webhook_secret":"..."} <- save this

# 3. Force a risk-level change to trigger drift, then scan
curl -X POST http://localhost:8000/api/v1/admin/compliance/scan \
  -H "Authorization: Bearer $token"
# {"systems_scanned":1,"events_created":1,"duration_ms":12.5}

# 4. Inspect the event history
curl http://localhost:8000/api/v1/ai-systems/1/drift-events \
  -H "Authorization: Bearer $token"

# 5. Check the webhook arrived at webhook.site with the right signature

Out-of-scope

  • Webhook re-delivery job. A drift event whose webhook delivery exhausted the in-scan retries is not re-attempted on subsequent scans. The state is recorded on the event row (webhook_response_code, webhook_error) for now; a periodic re-delivery sweep is a follow-up.

  • Admin role for /admin/compliance/scan. Currently any authenticated user can trigger a scan. The monitor only touches systems with monitoring_enabled=True, so the operation is effectively scoped, but a proper admin role / RBAC layer is worthwhile and not in this PR.

  • Frontend toggle + drift events list. A minimal UI was outside the scope of this PR to keep the diff manageable. The backend API is fully usable today via curl; UI follow-up tracked separately.

  • Alembic introduction. This PR uses create_all() like every other model in the repo. I would recommend a parallel project-wide effort to introduce Alembic rather than mixing the two patterns here.

Compatibility

  • No breaking changes. All existing endpoints unchanged.
  • One new env var (optional, sensible default): COMPLIANCE_MONITOR_CRON.
  • Two new dependencies: APScheduler==3.10.4, tenacity==8.2.3.
  • Notification and ComplianceDriftEvent tables are auto-created.
  • Existing AI systems default to monitoring_enabled=True; owners who don't want it can opt out via the PATCH endpoint.

@madsysharma
Contributor Author

Hi @SdSarthak , please review my PR for this issue. Thanks.

@madsysharma
Contributor Author

Hi @SdSarthak , please review this updated PR. Thank you.

Owner

@SdSarthak SdSarthak left a comment


This PR has several blocking issues:

  1. Destructive migration — The migration drops the rag_queries table in upgrade(). This is a destructive operation that would wipe query history in production. This must not be in the migration.

  2. Drops unique constraint — The migration removes uq_ai_system_owner_name added by a recent merge. This is a regression. If this PR doesn't depend on that constraint being removed, please regenerate the migration from a fresh alembic revision --autogenerate.

  3. Missing model — The code imports from app.models.compliance_drift_event import ComplianceDriftEvent but this model file does not exist. The server will fail to start with an ImportError.

  4. Breaking API change — Changing PUT /{system_id} to PATCH breaks existing clients without a versioning path.

  5. Missing apscheduler dependency — If the scheduler uses apscheduler, it must be added to requirements.txt.

Please: (a) regenerate the migration without destructive ops, (b) add the missing ComplianceDriftEvent model, (c) keep the PUT endpoint or add PATCH as a separate route, (d) verify all dependencies are in requirements.txt.

@SdSarthak SdSarthak added the labels gssoc:approved (GSSoC approved contribution, required for points to count), level:advanced (Advanced difficulty task), and type:feature (New feature) May 31, 2026
@madsysharma madsysharma force-pushed the feat/auto-compliance-monitoring branch from e75c130 to 895f5c4 Compare June 1, 2026 20:40
@madsysharma madsysharma requested a review from SdSarthak June 1, 2026 20:41
@madsysharma
Contributor Author

Hi @SdSarthak , please check this updated PR. Not sure why you're seeing the outdated versions of it, but some of the issues you mentioned were already covered in the previous commits. Thank you.

@madsysharma madsysharma force-pushed the feat/auto-compliance-monitoring branch from d8ef223 to b023a36 Compare June 2, 2026 00:29
