Harden ORT FlatBuffer model loader against malformed buffers#28186
Harden ORT FlatBuffer model loader against malformed buffers#28186tianleiwu wants to merge 5 commits intomicrosoft:mainfrom
Conversation
tianleiwu
left a comment
There was a problem hiding this comment.
Review Summary
Well-targeted security hardening of the .ort FlatBuffer model loader. The changes add systematic null-offset detection for FlatBuffer vectors of tables, validate initializer raw data sizes, and replace the trust-the-metadata node sizing with a scan-based approach. Tests exercise the new error paths. Overall code quality is high and the defensive strategy is sound.
The ValidateRequiredTableOffsets template correctly uses ReadScalar<uoffset_t> + zero-check — the only reliable approach since Vector<Offset<T>>::Get(i) on a zero offset returns a non-null dangling pointer rather than nullptr. Good that it follows the existing pattern from LoadInitializerOrtFormat.
The key improvement in graph.cc — replacing nodes_.resize(fbs_graph.max_node_index()) with scan-based required_node_slot_count — eliminates a straightforward memory amplification attack via crafted max_node_index. The duplicate node index and dangling NodeEdge detection add integrity validation that was previously missing. The gsl::not_null → explicit nullptr check change for graph inputs/outputs is also good: returns a descriptive error instead of aborting.
See inline comments for suggestions.
- Add ValidateRequiredTableOffsets<T>() to reject null table offsets in FlatBuffer vectors before dereferencing - Validate initializer raw_data byte size matches shape/type expectation in LoadInitializerOrtFormat - Compute required node slot count from actual node/edge indices instead of trusting the attacker-controlled max_node_index field - Reject dangling NodeEdge references to missing nodes - Reject duplicate NodeArg names and duplicate node indices - Fail on unknown NodeArg references in graph inputs/outputs - Add regression tests for malformed .ort buffers: raw_data size mismatch, null node arg entry, dangling node edge
- Add sanity bound on required_node_slot_count to reject crafted buffers where a single node with a huge index could cause multi-GB allocation. Cap at max(1024, 2 * total_entries). - Read tensor name from fbs_tensor.name() instead of the partially-populated protobuf initializer for the raw data size mismatch error message. - Add RejectsAdversarialLargeNodeIndex test to exercise the new bound.
- Move huge_index variable inside lambda to fix -Wunused-lambda-capture error on Clang (macOS, wasm, Android NDK builds) - Use kOrtModelVersion instead of ORT_VERSION for the ORT format version field so minimal builds accept the test model (version 6 vs "1.26.0")
There was a problem hiding this comment.
Pull request overview
Hardens the ORT FlatBuffer (.ort) model loader against malformed/crafted buffers by adding additional validation during deserialization, and introduces regression tests that exercise the new failure paths.
Changes:
- Add
ValidateRequiredTableOffsets<T>()and use it to reject null (zero) table offsets in FlatBuffer vectors before dereferencing. - Validate initializer
raw_databyte size against expected size computed from shape + type. - Harden graph loading against attacker-controlled node indexing (oversized allocations), plus additional structural validation; add regression tests for these scenarios.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| onnxruntime/test/framework/ort_model_only_test.cc | Adds crafted-buffer regression tests for new validation paths. |
| onnxruntime/core/graph/model.cc | Validates metadata property table offsets before iterating. |
| onnxruntime/core/graph/graph_flatbuffers_utils.cc | Rejects initializer raw_data size mismatches early with a descriptive error. |
| onnxruntime/core/graph/graph.cc | Adds table-offset validation, duplicate NodeArg detection, node index allocation hardening, and additional graph consistency checks. |
| onnxruntime/core/flatbuffers/flatbuffers_utils.h / .cc | Introduces and applies ValidateRequiredTableOffsets<T>() for required table vectors (e.g., opset imports). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
… indices - Increase slot_cap multiplier from 2x to 64x with absolute cap of 10M to accommodate legitimately sparse node index ranges produced by ORT graph optimizations (ReleaseNode leaves holes). - Add bounds + nullptr checks for EdgeEnd.node_index in LoadEdgesFromOrtFormat to prevent ORT_ENFORCE crash or nullptr dereference on malformed buffers. - Add RejectsInvalidEdgeEndNodeIndex regression test. - Update stale comment at nodes_.resize() site.
9340b82 to
95172db
Compare
…test The test created a node with null inputs/outputs/implicit_inputs fields, which caused Node::LoadFromOrtFormat to fail with 'fbs_node_arg_names cannot be null' before reaching the edge validation logic under test. Provide explicit empty vectors so the node loads successfully and the edge referencing non-existent node 99 triggers the intended 'out-of-range node index' error.
Description
Harden the
.ortFlatBuffer model loader to reject malformed buffers early instead of dereferencing null pointers, allocating attacker-controlled sizes, or accessing out-of-bounds memory.Changes
Null table offset validation (
flatbuffers_utils.h/.cc)ValidateRequiredTableOffsets<T>()template that scans a FlatBuffer vector of table offsets and rejects any null (zero) entries before the caller dereferences them.Initializer raw_data size validation (
graph_flatbuffers_utils.cc)raw_datasize. Reject mismatches with a descriptive error.Node index hardening (
graph.cc)required_node_slot_countby scanning actual node/edge indices instead of trusting the serializedmax_node_indexfield, which could be attacker-controlled and cause oversized allocation.NodeEdgereferences to missing nodes, duplicateNodeArgnames, and unknownNodeArgreferences in graph inputs/outputs.Regression tests (
ort_model_only_test.cc)RejectsInitializerRawDataSizeMismatch: crafted buffer with wrong raw_data sizeRejectsNullNodeArgTableEntry: buffer with zeroed-out node arg offsetRejectsDanglingNodeEdge: buffer with a NodeEdge pointing to a non-existent nodeMotivation
These checks defend against crafted
.ortfiles that could cause null-pointer dereferences, excessive memory allocation, or out-of-bounds access during model loading.Testing
git diff --checkpasses (no whitespace issues)