Skip to content

🔐 Security / SAML XML parsing surface — centralisation, audit gate, observability #32

@docJerem

Description

@docJerem

Context

This issue tracks a multi-PR initiative to make the SAML/XML parsing
surface of ex_saml auditable, hardened, and observable in a single
place
, gated by a CI check before each release.

Background:

The root cause across these incidents is structural: :xmerl_scan was
called in four different files with slightly different option shapes,
error handling, and encoding conversion. Any audit or hardening had to be
done four times; any drift was invisible.

Goal

Three pillars, delivered as four PRs:

  1. Centralisation — a single hardened entry point for XML parsing, so
    any audit reduces to one grep / one AST visitor.
  2. Audit gate — a CI-blocking check that fails the build if a new
    :xmerl_scan.* call appears anywhere outside the hardened wrapper.
  3. Observability — a configurable hook on parse failures so consumers
    (SIEM, OCSF, telemetry) can forward events without ex_saml owning
    the formatting.

"Done" looks like: every release pipeline runs mix security.check_release
and fails fast on a new unhardened parsing site; production parse failures
surface to the consumer's SIEM via the configured hook without any code
change in ex_saml.

PLAN

  • PR 1 — ExSaml.Core.Xml.SafeXml wrapper + tuple API + :on_error hook (✨ Feature / SafeXml — single hardened xmerl_scan entry point #29)

    • Single entry point for :xmerl_scan.string, always forces
      allow_entities: false, namespace_conformant: true, quiet: true.
    • Tuple-only API: scan/1 :: {:ok, xml()} | {:error, :invalid_xml}.
    • Cascades the tuple through Binding.decode_response/2,
      Sp.decrypt_assertion/2, Helper.decode_saml_payload/2.
    • Centralises the binary→charlist conversion that 🐛 Fix / UTF-8 characters in SAMLResponse rejected by xmerl_scan #22 had to patch
      site-by-site.
    • Error hook: config :ex_saml, :safe_xml, on_error: fn ctx -> … end
      — handler receives %{reason, kind, payload}, exceptions from the
      handler are swallowed.
  • PR 2 — AST-based audit in mix security.check_release

    • Replace the regex-window source scan drafted in (closed) ✨ Feature / Mix security.check_release task #15 with an
      AST visitor (Code.string_to_quoted/1).
    • Core invariant: any reference to :xmerl_scan.* in lib/ outside
      lib/ex_saml/core/xml/safe_xml.ex is a violation.
    • Fail loud (non-zero exit) when a check targets a file that does not
      exist — the previous draft returned [] silently, which is a false
      negative.
  • PR 3 — Config invariants at ExSaml.SP.setup/1

    • Migrate the open-redirect and unsolicited-target checks from
      "regex over sp_handler.ex" (which targets an app-side file not in
      this lib) to structural invariants raised at init:
      • raise if trusted_fingerprints is :any and
        Mix.env() == :prod.
      • raise if allowed_target_urls is not set when unsolicited
        IdP-initiated responses are accepted.
    • A raise at init is strictly more reliable than a grep — the
      consumer cannot ship without seeing the failure.
  • PR 4 — CI wiring + ExUnit coverage of the audit task

    • Add a Security audit step in .github/workflows/ci.yml running
      mix security.check_release.
    • ExUnit tests for Mix.Tasks.Security.CheckRelease:
      • fixture "good code" → 0 violations
      • fixture "new unhardened :xmerl_scan call" → 1 violation
      • fixture "targeted file does not exist" → error, not silent zero
    • Wire the task into the release pipeline ahead of mix hex.publish.

Non-goals

  • Cryptographic verification of <ds:Signature> against a trust anchor —
    lives downstream of parsing, orthogonal to this initiative.
  • Semantic validation of metadata structure — covered by ✨ Feature / SAML metadata validation #17 (different
    layer: :xmerl_xpath on an already-parsed tree, no new :xmerl_scan
    sites).
  • Replacing xmerl with a different XML parser.
  • OCSF event formatting — ex_saml exposes the :on_error hook as a
    primitive; OCSF formatting belongs in a downstream consumer
    (e.g. elixir-ocsf).

Cohérence avec l'existant

References

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions