Skip to content

Harden ORT FlatBuffer model loader against malformed buffers#28186

Open
tianleiwu wants to merge 5 commits intomicrosoft:mainfrom
tianleiwu:tlwu/harden-ort-loader
Open

Harden ORT FlatBuffer model loader against malformed buffers#28186
tianleiwu wants to merge 5 commits intomicrosoft:mainfrom
tianleiwu:tlwu/harden-ort-loader

Conversation

@tianleiwu
Copy link
Copy Markdown
Contributor

Description

Harden the .ort FlatBuffer model loader to reject malformed buffers early instead of dereferencing null pointers, allocating attacker-controlled sizes, or accessing out-of-bounds memory.

Changes

Null table offset validation (flatbuffers_utils.h/.cc)

  • Add ValidateRequiredTableOffsets<T>() template that scans a FlatBuffer vector of table offsets and rejects any null (zero) entries before the caller dereferences them.
  • Apply to opset imports, initializers, sparse initializers, node args, nodes, node edges, and metadata properties.

Initializer raw_data size validation (graph_flatbuffers_utils.cc)

  • After reading shape and type, compute expected byte count and compare against actual raw_data size. Reject mismatches with a descriptive error.

Node index hardening (graph.cc)

  • Compute required_node_slot_count by scanning actual node/edge indices instead of trusting the serialized max_node_index field, which could be attacker-controlled and cause oversized allocation.
  • Reject duplicate node indices, dangling NodeEdge references to missing nodes, duplicate NodeArg names, and unknown NodeArg references in graph inputs/outputs.

Regression tests (ort_model_only_test.cc)

  • RejectsInitializerRawDataSizeMismatch: crafted buffer with wrong raw_data size
  • RejectsNullNodeArgTableEntry: buffer with zeroed-out node arg offset
  • RejectsDanglingNodeEdge: buffer with a NodeEdge pointing to a non-existent node

Motivation

These checks defend against crafted .ort files that could cause null-pointer dereferences, excessive memory allocation, or out-of-bounds access during model loading.

Testing

  • git diff --check passes (no whitespace issues)
  • Incremental build of touched core objects succeeds
  • New regression tests exercise each validation path

Copy link
Copy Markdown
Contributor Author

@tianleiwu tianleiwu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Summary

Well-targeted security hardening of the .ort FlatBuffer model loader. The changes add systematic null-offset detection for FlatBuffer vectors of tables, validate initializer raw data sizes, and replace the trust-the-metadata node sizing with a scan-based approach. Tests exercise the new error paths. Overall code quality is high and the defensive strategy is sound.

The ValidateRequiredTableOffsets template correctly uses ReadScalar<uoffset_t> + zero-check — the only reliable approach since Vector<Offset<T>>::Get(i) on a zero offset returns a non-null dangling pointer rather than nullptr. Good that it follows the existing pattern from LoadInitializerOrtFormat.

The key improvement in graph.cc — replacing nodes_.resize(fbs_graph.max_node_index()) with scan-based required_node_slot_count — eliminates a straightforward memory amplification attack via crafted max_node_index. The duplicate node index and dangling NodeEdge detection add integrity validation that was previously missing. The gsl::not_null → explicit nullptr check change for graph inputs/outputs is also good: returns a descriptive error instead of aborting.

See inline comments for suggestions.

Comment thread onnxruntime/core/graph/graph.cc
Comment thread onnxruntime/core/graph/graph_flatbuffers_utils.cc Outdated
Comment thread onnxruntime/test/framework/ort_model_only_test.cc
@tianleiwu tianleiwu requested review from Copilot and skottmckay May 2, 2026 07:39
@tianleiwu tianleiwu requested a review from edgchen1 May 2, 2026 07:40
@tianleiwu tianleiwu marked this pull request as draft May 2, 2026 07:40
tianleiwu added 3 commits May 2, 2026 00:42
- Add ValidateRequiredTableOffsets<T>() to reject null table offsets
  in FlatBuffer vectors before dereferencing
- Validate initializer raw_data byte size matches shape/type expectation
  in LoadInitializerOrtFormat
- Compute required node slot count from actual node/edge indices instead
  of trusting the attacker-controlled max_node_index field
- Reject dangling NodeEdge references to missing nodes
- Reject duplicate NodeArg names and duplicate node indices
- Fail on unknown NodeArg references in graph inputs/outputs
- Add regression tests for malformed .ort buffers:
  raw_data size mismatch, null node arg entry, dangling node edge
- Add sanity bound on required_node_slot_count to reject crafted buffers
  where a single node with a huge index could cause multi-GB allocation.
  Cap at max(1024, 2 * total_entries).
- Read tensor name from fbs_tensor.name() instead of the partially-populated
  protobuf initializer for the raw data size mismatch error message.
- Add RejectsAdversarialLargeNodeIndex test to exercise the new bound.
- Move huge_index variable inside lambda to fix -Wunused-lambda-capture
  error on Clang (macOS, wasm, Android NDK builds)
- Use kOrtModelVersion instead of ORT_VERSION for the ORT format version
  field so minimal builds accept the test model (version 6 vs "1.26.0")
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Hardens the ORT FlatBuffer (.ort) model loader against malformed/crafted buffers by adding additional validation during deserialization, and introduces regression tests that exercise the new failure paths.

Changes:

  • Add ValidateRequiredTableOffsets<T>() and use it to reject null (zero) table offsets in FlatBuffer vectors before dereferencing.
  • Validate initializer raw_data byte size against expected size computed from shape + type.
  • Harden graph loading against attacker-controlled node indexing (oversized allocations), plus additional structural validation; add regression tests for these scenarios.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
onnxruntime/test/framework/ort_model_only_test.cc Adds crafted-buffer regression tests for new validation paths.
onnxruntime/core/graph/model.cc Validates metadata property table offsets before iterating.
onnxruntime/core/graph/graph_flatbuffers_utils.cc Rejects initializer raw_data size mismatches early with a descriptive error.
onnxruntime/core/graph/graph.cc Adds table-offset validation, duplicate NodeArg detection, node index allocation hardening, and additional graph consistency checks.
onnxruntime/core/flatbuffers/flatbuffers_utils.h / .cc Introduces and applies ValidateRequiredTableOffsets<T>() for required table vectors (e.g., opset imports).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread onnxruntime/core/graph/graph.cc Outdated
Comment thread onnxruntime/core/graph/graph.cc
… indices

- Increase slot_cap multiplier from 2x to 64x with absolute cap of 10M to
  accommodate legitimately sparse node index ranges produced by ORT graph
  optimizations (ReleaseNode leaves holes).
- Add bounds + nullptr checks for EdgeEnd.node_index in LoadEdgesFromOrtFormat
  to prevent ORT_ENFORCE crash or nullptr dereference on malformed buffers.
- Add RejectsInvalidEdgeEndNodeIndex regression test.
- Update stale comment at nodes_.resize() site.
@tianleiwu tianleiwu force-pushed the tlwu/harden-ort-loader branch from 9340b82 to 95172db Compare May 2, 2026 07:54
@tianleiwu tianleiwu marked this pull request as ready for review May 2, 2026 07:55
…test

The test created a node with null inputs/outputs/implicit_inputs fields,
which caused Node::LoadFromOrtFormat to fail with 'fbs_node_arg_names
cannot be null' before reaching the edge validation logic under test.

Provide explicit empty vectors so the node loads successfully and the
edge referencing non-existent node 99 triggers the intended
'out-of-range node index' error.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants