Skip to content

♻️ Refactor / Metadata / Propagate :xmlElement narrowing through parse/1 boundary for dialyzer #33

@docJerem

Description

@docJerem

Context

While shipping #17 PR 2 (#31), the new ExSaml.Metadata.Certificate.violations/2 and ExSaml.Metadata.Signature.violations/2 initially carried a precise @spec:

@spec violations(:xmerl.xmlElement(), keyword()) :: [ValidationResult.violation()]

Dialyzer rejected this with:

lib/ex_saml/metadata.ex:211:8:no_return
Function check_entity_descriptor/1 has no local return.

lib/ex_saml/metadata.ex:214:19:call
The function call will not succeed.

ExSaml.Metadata.Certificate.violations(
  _root ::
    {:xmlDocument, _} | {:xmlComment, _, _, _, _} | {:xmlNsNode, _, _, _, _}
    | {:xmlPI, _, _, _, _} | {:xmlText, _, _, _, _, _}
    | {:xmlAttribute, _, _, _, _, _, _, _, _, _}
    | {:xmlElement, _, _, _, _, _, _, _, _, _, _, _},
  [...]
)

breaks the contract
(:xmerl.xmlElement(), :elixir.keyword()) :: [ExSaml.Metadata.ValidationResult.violation()]

The workaround in #31 was to drop the @spec on both functions to match the existing convention (none of the defp helpers in ExSaml.Metadata carry one either). That made dialyzer green, but at the cost of losing the type contract on a public function of a module.

Root cause

The XXE-safe parse/1 in lib/ex_saml/metadata.ex already narrows the parsed root to an :xmlElement:

defp parse(xml) do
  charlist = :binary.bin_to_list(xml)

  try do
    {root, _rest} =
      :xmerl_scan.string(charlist,
        namespace_conformant: true,
        allow_entities: false,
        quiet: true
      )

    if Record.is_record(root, :xmlElement) do
      {:ok, root}
    else
      {:error, invalid_xml_violation()}
    end
  rescue
    _ -> {:error, invalid_xml_violation()}
  catch
    _kind, _reason -> {:error, invalid_xml_violation()}
  end
end

Two reasons the narrowing does not survive:

  1. The if Record.is_record(root, :xmlElement) branch successfully narrows root to a 12-tuple shape inside the if, but Dialyzer's success-typing does not propagate that constraint across function boundaries (parse/1 returns {:ok, root} where root is rebound from the broader :xmerl_scan.string/2 return type).
  2. Even when the call site pattern-matches against the record explicitly (xml_element() = root), the inferred shape is {:xmlElement, _, _, ..., _} (12 term() fields), which Dialyzer still considers broader than :xmerl.xmlElement() — likely because the field defaults in the xmerl record definition (expanded_name = [], language = "", namespace = #xmlNamespace{}, …) get interpreted as type constraints during record-type expansion.

So the contract :xmerl.xmlElement() is strictly narrower than whatever Dialyzer can prove at the call site, regardless of how the caller pattern-matches.

Goal

Refactor parse/1 (and possibly run_rules/1 / check_entity_descriptor/1) so that the inferred type at the rule call sites unifies with :xmerl.xmlElement(). Once that holds, re-introduce the precise @spec on:

  • ExSaml.Metadata.Certificate.violations/2
  • ExSaml.Metadata.Signature.violations/2

and on any subsequent rule module added by #17 PR 3 / 4.

Approaches to evaluate

The right shape is open to discussion; below are starting points, not a prescription.

  1. Return a typed wrapper from parse/1

    Wrap the success path in an explicit record or struct, so the success-typing flows trivially:

    defp parse(xml) do
      case :xmerl_scan.string(...) do
        {xml_element() = root, _rest} -> {:ok, root}
        _ -> {:error, invalid_xml_violation()}
      end
    rescue
      ...
    end

    The leading pattern in the case head should be enough for Dialyzer to flow the narrowed type out.

  2. Reuse ✨ Feature / SafeXml — single hardened xmerl_scan entry point #29's SafeXml.scan/1 once that PR merges

    The new wrapper already centralises the hardened parse and returns {:ok, root} | {:error, :invalid_xml}. Migrating parse/1 onto it removes the duplicated try/rescue/catch and gives a single place to encode the post-parse Record.is_record/2 narrowing — possibly as a helper SafeXml.scan_root_element/1 that already returns {:ok, xmerl_xml_element()}.

  3. Define a local @type root :: ... alias in ExSaml.Metadata

    Define a shape that Dialyzer can actually prove (e.g. the 12-tuple) and use it on every rule module's @spec. Less precise than :xmerl.xmlElement() but documents intent and survives Dialyzer.

The best of the three is probably (1) + (2) combined once #29 lands: refactor parse/1 to call SafeXml.scan/1 and rely on the wrapper's narrowed return type.

Acceptance criteria

  • mix dialyzer is green with a precise @spec on Certificate.violations/2 and Signature.violations/2, without adding any new entry to .dialyzer_ignore.exs.
  • The same approach is applied to any rule module added during ✨ Feature / SAML metadata validation #17 PRs 3 and 4.
  • No regression on the 220-test metadata suite.

Related

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions