Security

This repository is a learning and demonstration project. Treat the findings below as a threat model and hardening checklist, not as certification.

If you discover a security issue in this codebase, report sensitive findings privately to the maintainer. Open a public issue only for non-sensitive items such as documentation-only fixes.

OWASP baseline

Mappings are qualitative and based on code inspection. Earlier revisions used OWASP Top 10 (2021) numbering; this document is maintained against OWASP Top 10:2025. Dependency advisories change continuously, so regularly run pip-audit (see CI and the commands below), enable Dependabot, or check uv.lock against OSV.

OWASP Top 10:2025 — review of this codebase

A01:2025 — Broken Access Control — high (if exposed beyond localhost)

Findings

tenant_id / source_type / metadata_filter are optional filters on POST /retrieve — omitting them returns hits across all rows (src/rag/main.py). When REQUIRE_TENANT_ID=true, tenant_id becomes mandatory (400 if missing).
When RAG_API_KEY / API_KEY is set, all authenticated callers share one global secret: there is no RBAC, no per-tenant API keys, and no JWT scopes. With no API key configured, every route except GET /health and GET /ready allows anonymous full access.
GET /health and GET /ready are always unauthenticated (for probes); /ready does not grant data access but confirms DB/pgvector/table readiness.
Ingest can overwrite chunks by (doc_id, chunk_index) for any declared tenant string — multi-tenancy is client-declared, not cryptographically enforced.
PATCH /config/runtime-search, POST /tuner/*, POST /telemetry/ingest-backlog/clear affect global tuning/telemetry state for the process.
OpenAPI /docs and /redoc expose the attack surface — disable with DISABLE_OPENAPI_UI=true when exposing the API beyond trusted networks.

Mitigations (production-oriented)

Put the API behind an API gateway / reverse proxy with mTLS or OAuth2/JWT, per-route scopes, and network policies.
Enforce tenant isolation (required tenant context + server-side checks, or PostgreSQL row-level security).
Disable or protect /docs on untrusted networks.
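
As a rough illustration of the server-side tenant check listed above, the filter for POST /retrieve could be built so that a missing tenant is rejected before any SQL runs. This is a minimal sketch, not the code in src/rag/main.py: the helper name, the column names, and the %s placeholder style are assumptions. In a hardened deployment the tenant value would come from the authenticated identity rather than from the request body.

```python
from typing import Any, Optional


class MissingTenantError(ValueError):
    """Raised when tenant context is required but absent (map to HTTP 400)."""


def build_retrieve_filter(
    tenant_id: Optional[str],
    source_type: Optional[str] = None,
    require_tenant: bool = True,
) -> tuple[str, list[Any]]:
    """Return a WHERE clause and its bound parameters.

    Hypothetical helper: column names and placeholder style are assumptions.
    """
    if require_tenant and not tenant_id:
        raise MissingTenantError("tenant_id is required")

    clauses: list[str] = []
    params: list[Any] = []
    if tenant_id:
        clauses.append("tenant_id = %s")
        params.append(tenant_id)
    if source_type:
        clauses.append("source_type = %s")
        params.append(source_type)

    # Values never enter the SQL text; they travel as bound parameters.
    where = " AND ".join(clauses) if clauses else "TRUE"
    return where, params
```

With require_tenant=True the unfiltered "all rows" path is unreachable, which is the behaviour REQUIRE_TENANT_ID=true is described as enforcing.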

A02:2025 — Security Misconfiguration — high (default stack)

Findings

Postgres published on host port 5433 in docker-compose.yml — avoid unintended exposure beyond localhost.
Default DB credentials (rag / rag) are documented for dev — not for the internet.
Optional CORS allowlist via CORS_ORIGINS (comma-separated). When unset, no allowlist is configured and FastAPI defaults apply (FastAPI sends no CORS headers unless CORSMiddleware is added).
Optional in-process rate limit via RATE_LIMIT_PER_MINUTE (per client IP, single worker only). For real deployments prefer an API gateway or reverse-proxy token bucket (see snippet below).
Positive control: GET /ready checks connectivity, vector extension, and public.chunks — helps orchestrators avoid routing traffic before migrations (does not replace auth).
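
A comma-separated allowlist such as CORS_ORIGINS is easy to widen by accident. The sketch below shows one way to parse it strictly; the function name and the rejection rules are assumptions, not the project's actual parser.

```python
from typing import Optional
from urllib.parse import urlsplit


def parse_cors_origins(raw: Optional[str]) -> list[str]:
    """Parse a comma-separated allowlist into exact origins.

    Rejects "*" and anything that is not scheme://host[:port], so a typo
    cannot silently widen the allowlist.
    """
    if not raw:
        return []
    origins: list[str] = []
    for item in raw.split(","):
        origin = item.strip().rstrip("/")
        if not origin:
            continue  # tolerate trailing commas
        parts = urlsplit(origin)
        if parts.scheme not in ("http", "https") or not parts.netloc or parts.path:
            raise ValueError(f"invalid CORS origin: {item.strip()!r}")
        origins.append(origin)
    return origins
```

Failing at startup on a malformed value is a design choice here: a loud configuration error is usually safer than a permissive fallback.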

Mitigations

Bind services to internal networks in real deployments; do not expose Postgres publicly.
Add rate limiting (proxy or middleware), security headers, and explicit CORS when browser clients exist. Example nginx rate-limit zones (tune the rates to your traffic; TLS is configured separately):

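# http context: one shared-memory zone per route group, keyed by client IP.
limit_req_zone $binary_remote_addr zone=rag_ingest:10m rate=10r/s;
limit_req_zone $binary_remote_addr zone=rag_retrieve:10m rate=30r/s;

# server context: burst absorbs short spikes; nodelay serves burst requests
# immediately instead of queueing them.
location /ingest/ {
  limit_req zone=rag_ingest burst=20 nodelay;
  proxy_pass http://127.0.0.1:8000;
}
location /retrieve {
  limit_req zone=rag_retrieve burst=50 nodelay;
  proxy_pass http://127.0.0.1:8000;
}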

Caddy (rate limiting needs a plugin or separate layer; example shows TLS reverse proxy to the API):

rag.example.com {
  encode gzip
  reverse_proxy 127.0.0.1:8000
}

Traefik (Docker labels or file provider; example routes.yml fragment for the file provider, which Traefik treats as dynamic configuration):

http:
  routers:
    rag:
      rule: PathPrefix(`/`)
      service: rag-api
      entryPoints:
        - websecure
      tls: {}
  services:
    rag-api:
      loadBalancer:
        servers:
          - url: "http://host.docker.internal:8000"

Tune entryPoints, tls, and servers for your network; add middleware rate limits in Traefik v3 if needed.

Use secrets management and strong credentials outside demos.
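
For the in-process limit described in the findings (per client IP, single worker), a token bucket is one common shape. The class below is a minimal sketch under that assumption; the class name, the burst parameter, and the injectable clock are illustrative and not taken from the project. State lives in one process and grows with the number of distinct clients, which is why a proxy-level limit remains the better choice for real deployments.

```python
import time
from typing import Callable


class TokenBucketLimiter:
    """Per-client token bucket; state lives in this process only."""

    def __init__(
        self,
        per_minute: int,
        burst: int,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = per_minute / 60.0   # tokens refilled per second
        self.capacity = float(burst)    # maximum tokens a client can hold
        self.clock = clock
        # client_ip -> (tokens remaining, time of last update)
        self._state: dict[str, tuple[float, float]] = {}

    def allow(self, client_ip: str) -> bool:
        now = self.clock()
        tokens, last = self._state.get(client_ip, (self.capacity, now))
        # Refill for the elapsed time, never above capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.rate)
        if tokens < 1.0:
            self._state[client_ip] = (tokens, now)
            return False  # the caller would answer HTTP 429
        self._state[client_ip] = (tokens - 1.0, now)
        return True
```

Passing the clock in keeps the limiter testable without sleeping.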

A03:2025 — Software Supply Chain Failures — ongoing

Findings

Dependencies are pinned in uv.lock; disclosed CVEs can appear at any time.
Docker images should use pinned bases and be scanned (Trivy, Grype, etc.) before production deploy—recommended outside this minimal CI.

Mitigations

CI runs pip-audit on exported locked runtime dependencies and Bandit on src/rag and scripts (see .github/workflows/ci.yml).
Enable Dependabot or Renovate; rebuild images after upgrades.

Example (matches CI; writes the export to a temporary file instead of using process substitution, for portability):

uv sync --group dev
uv export --frozen --no-dev --no-emit-project --no-hashes -o /tmp/deps-audit.txt
uv run pip-audit -r /tmp/deps-audit.txt

A04:2025 — Cryptographic Failures — medium (deployment-dependent)

Findings

DATABASE_URL may contain plain-text credentials (.env, Compose env).
No TLS in the app—terminate HTTPS at the edge if needed.
No encryption-at-rest configured here (use disk encryption / managed DB).

Mitigations

Store secrets outside git; rotate passwords.
Use sslmode=require, or preferably verify-full, for Postgres clients where supported.
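
Both mitigations can be checked at startup. The sketch below inspects a DATABASE_URL for a TLS-enforcing sslmode and redacts the password before the URL is logged. The function names are hypothetical; the sslmode values are the standard libpq ones.

```python
from urllib.parse import parse_qs, urlsplit, urlunsplit

# libpq modes that refuse an unencrypted connection.
STRICT_SSLMODES = {"require", "verify-ca", "verify-full"}


def uses_tls(database_url: str) -> bool:
    """True if the URL's sslmode query parameter forces TLS."""
    query = parse_qs(urlsplit(database_url).query)
    return query.get("sslmode", [""])[-1] in STRICT_SSLMODES


def redact_password(database_url: str) -> str:
    """Return the URL with any password replaced, for safe logging."""
    parts = urlsplit(database_url)
    if parts.password is None:
        return database_url
    host = parts.hostname or ""
    if ":" in host:               # IPv6 literal needs its brackets back
        host = f"[{host}]"
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    netloc = f"{parts.username or ''}:***@{host}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```

Note that sslmode=prefer is deliberately not treated as TLS-enforcing, since it falls back to plain text when the server does not offer TLS.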

A05:2025 — Injection — lower for SQL; availability still matters

Findings

SQL: Ingest uses parameterized executemany; retrieve uses a fixed SQL template with bound parameters for filters. Embedding literals are produced by embedding user text on the server; raw input is not concatenated into SQL.
YAML: yaml.safe_load (src/rag/config_loader.py).
scripts/migrate.py does not invoke shells with user-controlled strings.
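
The pattern described above (parameterized executemany for ingest, fixed SQL with bound filter values for retrieve) can be demonstrated in isolation. This sketch uses the standard-library sqlite3 module as a stand-in for Postgres, and the table layout is an assumption rather than the project's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE chunks (doc_id TEXT, chunk_index INTEGER, tenant_id TEXT,"
    " content TEXT, PRIMARY KEY (doc_id, chunk_index))"
)

rows = [
    ("doc-1", 0, "acme", "hello"),
    # Hostile content is stored as data; it is never parsed as SQL.
    ("doc-1", 1, "acme", "'); DROP TABLE chunks; --"),
]
conn.executemany(
    "INSERT INTO chunks (doc_id, chunk_index, tenant_id, content) VALUES (?, ?, ?, ?)",
    rows,
)


def retrieve(tenant_id: str) -> list[tuple[str, int, str]]:
    """Fixed SQL text; the filter value travels as a bound parameter."""
    cur = conn.execute(
        "SELECT doc_id, chunk_index, content FROM chunks"
        " WHERE tenant_id = ? ORDER BY chunk_index",
        (tenant_id,),
    )
    return cur.fetchall()
```

A classic injection string passed as tenant_id simply matches no rows, because it is compared as a value instead of being spliced into the statement.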

Residual risks

POST /ingest/chunks batch size is capped by MAX_INGEST_CHUNKS_PER_REQUEST (default 500, HTTP 413) — reduces memory/DB abuse vs unbounded batches; large content strings per row can still stress resources unless further bounded.

Mitigations

Add Pydantic max_length on text fields if you need tighter bounds.
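
The two bounds discussed here (batch length and per-row content size) can be sketched without any framework. This is a dependency-free stand-in, not the project's validation code: with Pydantic the content bound would typically be declared as Field(max_length=...) on the model. MAX_CONTENT_CHARS and the 422 status are assumptions; the 500 default and the 413 status come from the text above.

```python
from dataclasses import dataclass
from typing import Optional

MAX_INGEST_CHUNKS_PER_REQUEST = 500   # default named in this document
MAX_CONTENT_CHARS = 20_000            # hypothetical bound, tune per workload


@dataclass(frozen=True)
class Rejection:
    status_code: int
    detail: str


def check_ingest_batch(chunks: list[dict]) -> Optional[Rejection]:
    """Return a Rejection if the batch breaks a size bound, else None."""
    if len(chunks) > MAX_INGEST_CHUNKS_PER_REQUEST:
        return Rejection(413, "too many chunks in one request")
    for i, chunk in enumerate(chunks):
        content = chunk.get("content")
        if not isinstance(content, str):
            return Rejection(422, f"chunk {i}: content must be a string")
        if len(content) > MAX_CONTENT_CHARS:
            return Rejection(422, f"chunk {i}: content too long")
    return None
```

Together the two limits bound the worst-case payload at roughly MAX_INGEST_CHUNKS_PER_REQUEST times MAX_CONTENT_CHARS characters per request.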

A06:2025 — Insecure Design — high (by intent for a demo)

Findings

Global in-memory tuner / telemetry — callers affect shared process state.
Demo embeddings are deterministic hashes — not a security boundary for semantic secrecy.
SSRF: The app does not fetch user-supplied URLs today.

Watchouts

Future features such as “fetch URL and embed” need strict URL validation (scheme/host allowlists, block RFC1918/metadata URLs).
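
If such a feature is ever added, the validation could start from something like the sketch below. It is illustrative only: the allowlist contents and function names are hypothetical, and it does not perform DNS resolution. A real fetcher would also need to check every resolved address with is_internal_address before connecting, and to disable or re-validate redirects.

```python
import ipaddress
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"https"}
ALLOWED_HOSTS = {"docs.example.com"}   # hypothetical allowlist


def is_internal_address(address: str) -> bool:
    """True for RFC 1918, loopback, link-local (incl. 169.254.169.254), etc."""
    ip = ipaddress.ip_address(address)
    return (
        ip.is_private or ip.is_loopback or ip.is_link_local
        or ip.is_reserved or ip.is_multicast or ip.is_unspecified
    )


def is_fetch_allowed(url: str) -> bool:
    """Allow only https URLs whose named host is on the allowlist."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False
    if parts.scheme not in ALLOWED_SCHEMES or parts.username or parts.password:
        return False
    host = (parts.hostname or "").rstrip(".")
    try:
        ipaddress.ip_address(host)
        return False   # IP literals never pass; only named hosts do
    except ValueError:
        pass
    return host in ALLOWED_HOSTS
```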

Mitigations

Separate the admin/tuning plane from the data plane in production, and add quotas, auditing, and a real identity model.

A07:2025 — Authentication Failures — critical gap when API key unset (if network-exposed)

Findings

With RAG_API_KEY / API_KEY unset (demo default), there are no sessions or per-user credentials — full anonymous access except probe routes.
With an API key set, clients may send either Authorization: Bearer <token> or X-API-Key. If the Authorization header uses the Bearer prefix, only that token is validated (an empty token is rejected) and X-API-Key is ignored. There is no OAuth2, no MFA, and no in-app key rotation.
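
The header precedence described above can be written down compactly. The function below is a sketch of that rule, not the project's middleware: the function name is hypothetical, header names are assumed to be lower-cased already, and the constant-time comparison via hmac.compare_digest is an added assumption about how the check should be done.

```python
import hmac
from typing import Mapping


def is_authorized(headers: Mapping[str, str], expected_key: str) -> bool:
    """Apply the Bearer-first precedence; header names assumed lower-cased."""
    auth = headers.get("authorization", "")
    scheme, _, token = auth.partition(" ")
    if scheme.lower() == "bearer":
        candidate = token.strip()   # Bearer wins; X-API-Key is ignored
    else:
        candidate = headers.get("x-api-key", "")
    if not candidate:
        return False                # empty or missing token is rejected
    # Constant-time comparison avoids leaking the match length via timing.
    return hmac.compare_digest(candidate.encode(), expected_key.encode())
```

Note the third case in particular: an empty Bearer token fails even when a valid X-API-Key accompanies it, which matches the behaviour described in the findings.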

Mitigations

Prefer gateway-managed OAuth2/JWT, mTLS, or short-lived tokens for production; treat the env-based shared secret as a minimal stub only.

A08:2025 — Software or Data Integrity Failures — low in-repo

Findings

No unsigned auto-update or arbitrary deserialization paths in core handlers.

Mitigations

Sign releases / pin image digests; extend CI with container scanners when you publish images.

A09:2025 — Security Logging and Alerting Failures — medium

Findings

Operational telemetry exists for ingest/retrieve (src/rag/telemetry.py).
401 responses from optional API-key middleware give a minimal auth signal but no structured security audit stream or SIEM integration.

Mitigations

Log identity, policy decisions, and anomalies to your logging stack.
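
One low-effort way to get a structured audit stream is to emit one JSON object per security-relevant event. The sketch below uses only the standard library; the logger name, field names, and event vocabulary are assumptions, and credentials should never be placed in these records.

```python
import json
import logging
from datetime import datetime, timezone
from typing import Optional

audit_logger = logging.getLogger("rag.security.audit")   # hypothetical name


def audit_event(
    event: str,
    *,
    client_ip: str,
    route: str,
    outcome: str,
    tenant_id: Optional[str] = None,
) -> str:
    """Emit one JSON line per security-relevant event and return it."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "event": event,
        "client_ip": client_ip,
        "route": route,
        "outcome": outcome,
        "tenant_id": tenant_id,
    }
    line = json.dumps(record, sort_keys=True)
    audit_logger.info(line)
    return line
```

A call such as audit_event("auth_failure", client_ip="203.0.113.7", route="/retrieve", outcome="denied") produces a line that a log shipper or SIEM can parse without custom regexes.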

A10:2025 — Mishandling of Exceptional Conditions — medium

Findings

POST /ingest/chunks maps database failures to HTTPException(detail=str(exc)) — may leak internal errors (driver messages, constraint names) to clients (src/rag/main.py).
GET /ready returns detail strings on failure paths for operators—typically lower risk than authenticated multi-tenant APIs but still worth generic messages if exposed broadly.

Mitigations

Return generic messages to clients; log exc server-side only.
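
A minimal sketch of that split, assuming a hypothetical helper and logger name: the full exception is logged server-side with a correlation id, and the client receives only a generic message plus that id. In a FastAPI handler the returned body could be passed as the detail of an HTTPException with status 500 in place of str(exc).

```python
import logging
import uuid

logger = logging.getLogger("rag.errors")   # hypothetical logger name


def to_client_error(exc: Exception) -> dict:
    """Log full details server-side; return only a generic body plus an id."""
    error_id = uuid.uuid4().hex
    logger.error(
        "ingest failed error_id=%s",
        error_id,
        exc_info=(type(exc), exc, exc.__traceback__),
    )
    # Nothing from the exception text reaches the client.
    return {"detail": "Internal error while storing chunks.", "error_id": error_id}
```

The error_id lets an operator find the matching log entry when a client reports a failure, without exposing driver messages or constraint names.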

Additional: Availability / abuse

POST /ingest/chunks batch length is capped (see MAX_INGEST_CHUNKS_PER_REQUEST); POST /retrieve caps k ≤ 200.
No rate limit is active in-app unless RATE_LIMIT_PER_MINUTE is set (per client IP, single worker only); add one at the proxy or as middleware if the API is reachable by untrusted clients.

Quick hardening checklist (before any public deploy)

Authentication + authorization stronger than a single optional shared secret; require tenant context where multi-tenant.
TLS and private networking for API and database.
Remove or restrict /docs; generic error messages to clients (A10).
Secrets out of git; strong DB credentials; Postgres not on the public internet.
Rate limits and request/body size limits (proxy + app).
Dependency scanning (CI pip-audit) + static analysis (Bandit) + image scanning on release (Trivy/similar).
Protect tuning/admin endpoints from untrusted callers.
GET /ready for readiness only — combine with auth and network controls, not instead of them.


Security: geozelos/rag-pgvector-tuning
