Prevent double-free in OrtModelEditorApi ownership transfer#28123
Prevent double-free in OrtModelEditorApi ownership transfer#28123
Conversation
Add guards to reject duplicate ownership transfers in AddInitializerToGraph, AddNodeToGraph, AddGraphToModel, SetGraphInputs, and SetGraphOutputs. This prevents double-free and use-after-free when the same pointer is passed to these APIs more than once. - Add owned_ flag to ModelEditorNode to track ownership state - Add pointer tracking set to ModelEditorGraph for initializer OrtValues - Add duplicate name checks across initializer and external_initializer maps - Add duplicate pointer checks in SetGraphInputs/SetGraphOutputs arrays - Add null parameter validation for crash paths - Add regression tests for all ownership-transfer guards Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR hardens the OrtModelEditorApi C API against ownership-transfer misuse that can lead to double-free/use-after-free by adding explicit duplicate detection and ownership guards, and adds regression tests to cover the new rejection paths.
Changes:
- Add duplicate pointer detection in
SetGraphInputs/SetGraphOutputsand ownership-transfer guards inAddInitializerToGraph,AddNodeToGraph, andAddGraphToModel. - Extend internal model editor types to track ownership (
ModelEditorNode::owned_,ModelEditorGraph::owned_initializer_ptrs_). - Add regression tests covering duplicate name/pointer/node/graph scenarios.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
onnxruntime/test/shared_lib/test_model_builder_api.cc |
Adds regression tests to validate the new ownership-transfer rejection behavior. |
onnxruntime/core/session/model_editor_c_api.cc |
Implements duplicate pointer checks and ownership-transfer guards in the C API entrypoints. |
onnxruntime/core/graph/model_editor_api_types.h |
Adds ownership-tracking fields used to prevent double-ownership of nodes/initializers. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Check for duplicate pointers in the input array to prevent double-free | ||
| onnxruntime::InlinedHashSet<const OrtValueInfo*> seen; | ||
| for (size_t i = 0; i < inputs_len; ++i) { | ||
| if (inputs[i] != nullptr && !seen.insert(inputs[i]).second) { | ||
| return OrtApis::CreateStatus(ORT_INVALID_ARGUMENT, |
Copilot
AI
Apr 17, 2026
•
There was a problem hiding this comment.
SetGraphInputs iterates inputs[i] before validating that inputs itself is non-null. If a caller passes inputs_len > 0 with inputs == nullptr, this will dereference a null pointer and crash. Add an explicit check (and similarly consider guarding ort_graph before calling ToInternal, since ToInternal(nullptr) also dereferences). #Resolved
There was a problem hiding this comment.
The null-check gap for the inputs array pointer is pre-existing -- the original loop on the next line (if (inputs[i] == nullptr)) has the same crash behavior. Our diff only added a duplicate-detection loop inside the existing function body. The In SAL annotation on this parameter requires non-null from the caller. Happy to address in a follow-up hardening PR that adds null checks consistently across all ModelEditorApi functions.
…sfer Move owned_initializer_ptrs_ insertion before the map assignment so that if the map allocation throws, ownership has not been transferred yet and the caller still safely owns the pointer. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
The PR guards against adding a second graph to the same model, but does not guard against adding the same graph pointer to two different models. This is the exact same class of bug being fixed elsewhere: // Current code only checks:
if (model->graph != nullptr) { return error; }
model->graph = std::unique_ptr<OrtGraph>(graph); // takes ownershipIf a caller does: AddGraphToModel(model1, graph); // model1 owns graph
AddGraphToModel(model2, graph); // model2 also wraps graph → double-freeFix: Add an owned_ flag to ModelEditorGraph] (identical to the ModelEditorNode::owned_ approach), and check it before taking ownership. This is the same pattern already applied to nodes. #Resolved |
|
If e is null, e->graph_ir_api is UB. The PR adds explicit null checks for ort_node, tensor, name, and model, but does not add a null check for ort_graph before calling ToInternal(ort_graph) in AddInitializerToGraph, AddNodeToGraph, SetGraphInputs, or SetGraphOutputs. While these parameters are annotated In (contract: non-null), the PR is selectively adding null checks for other In parameters, creating an inconsistency. Suggestion: Either add null checks for ort_graph in these functions (for consistency with the defensive style of this PR), or don't — but be consistent. #Resolved |
…eToGraph AddInitializerToGraph: use try_emplace + reset pattern with catch rollback. All validation moved before any state mutation. If map insertion throws, tracking set is rolled back and caller retains ownership. AddNodeToGraph: reserve before emplace_back so reallocation failure does not delete the node via a temporary unique_ptr. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…Graph Address review feedback: - Add ort_graph null checks before ToInternal in all 4 functions - Add inputs/outputs array null checks in SetGraphInputs/SetGraphOutputs - Add owned_ flag to ModelEditorGraph to prevent same graph added to two models - Add regression tests for null graph and cross-model double-free Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
Addressed all review feedback in latest commits:
|
|
The scenario: CreateNode→ caller holds node The PR adds the It does not use it in
ORT_API(void, OrtApis::ReleaseNode, _Frees_ptr_opt_ OrtNode* node) {
if (auto* me_node = ModelEditorNode::ToInternal(node); me_node != nullptr && me_node->owned_) {
// Already owned by a graph — caller should not release.
return; // or log a warning
}
delete node;
}
``` #Resolved |
|
SetGraphInputsnulls out inputs[i] after ownership transfer, but the caller typically has the original pointer stored separately: OrtValueInfo* x_info = ...; // caller creates this
OrtValueInfo* inputs[] = {x_info};
SetGraphInputs(graph, inputs, 1); // inputs[0] = nullptr, but x_info still valid
ReleaseValueInfo(x_info); // double-free — graph also owns itThe difference from nodes/graphs is that ModelEditorValueInfo does not have an owned_ flag. The PR added owned_ to ModelEditorNode and ModelEditorGraph but missed ModelEditorValueInfo. #Resolved |
- Add owned_ flag to ModelEditorValueInfo (matching Node and Graph) - SetGraphInputs/SetGraphOutputs: reject already-owned ValueInfos, set owned_ after transfer - ReleaseValueInfo/ReleaseNode/ReleaseGraph: check owned_ before deleting - Add 4 regression tests for release-after-ownership scenarios Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
Addressed both comments in latest commit (004550f): Release-side guards (comment 1):
Tests: 4 new regression tests covering release-after-ownership for Node, Graph, and ValueInfo, plus rejection of already-owned ValueInfo reuse across graphs. |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // owned by a unique_ptr and the caller still safely owns it. | ||
| node->id = graph->nodes.size(); | ||
| graph->nodes.push_back(std::unique_ptr<onnxruntime::ModelEditorNode>(node)); // take ownership | ||
| graph->nodes.reserve(graph->nodes.size() + 1); |
There was a problem hiding this comment.
AddNodeToGraph calls graph->nodes.reserve(graph->nodes.size() + 1) on every insertion. Once nodes reaches capacity, this forces a reallocation for every additional node (capacity grows by 1 each time), turning appends into O(n^2) behavior for larger graphs. Consider only reserving when size() == capacity() and growing geometrically (e.g., max(capacity*2, size+1)), or otherwise preserve amortized growth while keeping the exception-safety goal.
| graph->nodes.reserve(graph->nodes.size() + 1); | |
| if (graph->nodes.size() == graph->nodes.capacity()) { | |
| const size_t current_capacity = graph->nodes.capacity(); | |
| const size_t new_capacity = current_capacity == 0 ? 1 : current_capacity * 2; | |
| graph->nodes.reserve(new_capacity); | |
| } |
There was a problem hiding this comment.
CreateGraph already does g->nodes.reserve(64) at creation, so the per-call reserve is a no-op until 65+ nodes. For realistic model building scenarios this seems unlikely ?
…puts/SetGraphOutputs
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.
Comments suppressed due to low confidence (1)
include/onnxruntime/core/session/onnxruntime_c_api.h:7749
- The updated
AddInitializerToGraphdocs claim the underlying tensor data "remains valid as long as either the caller or the graph holds a reference". That is only true when theOrtValueowns its buffer (e.g., ORT-allocated or created with a deleter). ForCreateTensorWithDataAsOrtValuethe external buffer lifetime is not reference-counted, so the caller must keep the backing memory alive independently. Please reword to avoid implying refcount-managed lifetime for externally-owned buffers.
* ORT will copy the OrtValue internally. The caller retains ownership of the OrtValue and should
* release it with ReleaseValue when done. The underlying tensor data is shared via reference counting,
* so the copy is cheap and the data remains valid as long as either the caller or the graph holds a reference.
*
* Two options:
*
* Allocated memory:
* Use CreateTensorAsOrtValue (allocates memory) and populate the tensor with the data.
* Set `data_is_external` to false.
*
* Pre-existing memory:
* Use CreateTensorWithDataAsOrtValue or CreateTensorWithDataAndDeleterAsOrtValue to create an OrtValue
* with a tensor that contains a pointer to the existing data.
* Set `data_is_external` to true.
*
* The pointer must remain valid for the duration of the inference session.
* If using CreateTensorWithDataAsOrtValue you are responsible for freeing the memory after the inference session
* is released.
* If using CreateTensorWithDataAndDeleterAsOrtValue, ORT will free the memory using the provided deleter as
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| @@ -125,6 +156,7 @@ ORT_API_STATUS_IMPL(OrtModelEditorAPI::SetGraphInputs, _In_ OrtGraph* ort_graph, | |||
| } | |||
|
|
|||
| graph->inputs.push_back(std::unique_ptr<onnxruntime::ModelEditorValueInfo>(input)); // take ownership | |||
| input->owned_ = true; | |||
| inputs[i] = nullptr; | |||
There was a problem hiding this comment.
SetGraphInputs clears graph->inputs before validating that every entry is non-null and a valid ModelEditorValueInfo. If an invalid/null entry appears later in the array, the function can return an error after partially transferring ownership and/or after deleting the previously-set inputs, leaving the graph in a partially-updated state despite failure. Consider fully validating all entries up-front (non-null, correct variant, not already owned) and only then clearing/replacing the graph inputs so the operation is atomic on success.
…_ SAL annotation on AddInitializerToGraph
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.
Comments suppressed due to low confidence (1)
include/onnxruntime/core/session/onnxruntime_c_api.h:7764
- The updated
AddInitializerToGraphdoc is internally inconsistent about lifetime/copying:
- It says “underlying tensor data is shared via reference counting … remains valid as long as either the caller or the graph holds a reference”, but a few lines later it says external-data pointers must remain valid for the duration of the inference session (which is the real requirement for
CreateTensorWithDataAsOrtValuewhere the buffer is user-owned). - The
data_is_externalparam description says the data “should not be copied”, but the implementation now copies theOrtValuein all cases;data_is_externalcontrols whether the initializer is recorded as external in the model.
Please tighten the wording to reflect that ORT copies theOrtValue(shared ownership of the Tensor object) but does not necessarily manage the lifetime of user-provided buffers unless a deleter/allocator-backed tensor is used.
* ORT will copy the OrtValue internally. The caller retains ownership of the OrtValue and should
* release it with ReleaseValue when done. The underlying tensor data is shared via reference counting,
* so the copy is cheap and the data remains valid as long as either the caller or the graph holds a reference.
*
* Two options:
*
* Allocated memory:
* Use CreateTensorAsOrtValue (allocates memory) and populate the tensor with the data.
* Set `data_is_external` to false.
*
* Pre-existing memory:
* Use CreateTensorWithDataAsOrtValue or CreateTensorWithDataAndDeleterAsOrtValue to create an OrtValue
* with a tensor that contains a pointer to the existing data.
* Set `data_is_external` to true.
*
* The pointer must remain valid for the duration of the inference session.
* If using CreateTensorWithDataAsOrtValue you are responsible for freeing the memory after the inference session
* is released.
* If using CreateTensorWithDataAndDeleterAsOrtValue, ORT will free the memory using the provided deleter as
* soon as the OrtValue is no longer in use.
*
* NOTE: A tensor containing pre-existing memory MUST have 128 bytes of data or more.
* For smaller tensors use CreateTensorAsOrtValue.
*
* ONNX shape inferencing does not support external data. An initializer involved in shape inferencing is
* typically small (a single value or limited by the rank of a tensor) and uses less than 128 bytes of
* memory, so this limit acts as a simple catch-all rule to avoid issues.
* e.g. Reshape's `shape`, Clip's `min` and `max`, various ops `axes`.
*
* \param[in] graph The OrtGraph instance to update.
* \param[in] name The value name for the initializer.
* \param[in] tensor The OrtValue instance containing the tensor data.
* \param[in] data_is_external Set to true if the data is external and should not be copied.
*
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
onnxruntime_cxx_api.h:3483 This comment is now wrong — the graph no longer takes ownership. It should say something like "Graph copies the OrtValue internally". We can also add constness here for Value. |
…sor to ort_value, assert in Release, fix includes and docs
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…ease (no-op) variants
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.
Comments suppressed due to low confidence (2)
onnxruntime/core/session/model_editor_c_api.cc:1
- This still mutates
graph->inputsbefore all per-element validation is complete. If any later entry is null or an invalid variant, the function returns an error after clearing the old inputs and possibly transferring some new ones, so a failed call can leave the graph partially updated. Validate every element first, then replace the vector only after the whole batch is known to be valid.
// Copyright (c) Microsoft Corporation. All rights reserved.
onnxruntime/core/session/model_editor_c_api.cc:1
- Like
SetGraphInputs, this path clears the existing outputs and starts transferring ownership before finishing validation of the whole array. A null or invalid entry later in the list makes the call fail after the graph has already been modified, which is especially risky for a public C API because callers cannot roll the state back themselves.
// Copyright (c) Microsoft Corporation. All rights reserved.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
updated comment to "Graph copies the OrtValue internally" |
changed parameter to const Value&. |
Description
The OrtModelEditorApi C API functions (AddNodeToGraph, AddGraphToModel, SetGraphInputs/SetGraphOutputs) take raw pointers and wrap them in unique_ptr to transfer ownership. Without guards, callers can pass the same pointer twice or call Release after ownership transfer, causing double-free on destruction.
Changes