Handle addrList allocation failures safely in RF24Mesh master nodes by Copilot · Pull Request #266 · nRF24/RF24Mesh

Copilot · 2026-05-23T11:49:53Z

Inspect PR feedback that explicitly mentions @copilot
Run clang-format-14 on the modified PR file (RF24Mesh.cpp)
Run clang-tidy-14 against RF24Mesh.cpp and review diagnostics
Commit any required formatting updates

Original prompt

Create a pull request in repository nRF24/RF24Mesh against base branch master that hardens allocation failure handling for the master node address list in RF24Mesh.cpp.

Scope:

Modify RF24Mesh.cpp.
In ESBMesh<network_t, radio_t>::begin(...), check the result of the initial malloc() used to allocate addrList.
If allocation fails, return false and do not set addrMemAllocated = true.
In ESBMesh<network_t, radio_t>::setAddress(...), replace the direct assignment of realloc() back to addrList with a temporary pointer.
If realloc() fails, return early without losing the original addrList pointer.
Add a defensive null check before writing the new entry.

Constraints:

Keep the change minimal and localized.
Preserve existing behavior during successful allocations.
Do not refactor unrelated code.
Follow the repository's existing coding style.

Rationale:

The current code assigns realloc() directly to addrList, which can lose the original allocation on failure and lead to a memory leak plus null dereference.
The initial malloc() in begin() is also unchecked.

Use this PR metadata:
Title: Handle addrList allocation failures safely in RF24Mesh master nodes

Body:

Summary

This PR hardens dynamic memory handling for the master node's addrList in RF24Mesh.

Specifically:

checks the initial malloc() in begin()
avoids overwriting addrList directly with realloc() in setAddress()
preserves the existing allocation if realloc() fails
avoids dereferencing a null addrList after allocation failure

Problem

RF24Mesh allocates the master node's address list dynamically:

initial allocation in begin()
growth allocation in setAddress()

Before this change, setAddress() assigned realloc() directly back to addrList:

addrList = (addrListStruct*)realloc(addrList, ...);

If realloc() fails, it returns NULL while the original allocation remains valid. Assigning that return value directly to addrList loses the only pointer to the original allocation, which creates:

a memory leak of the old buffer
a likely null dereference on the next access to addrList

Similarly, the initial malloc() in begin() was not checked before loadDHCP() and later use.

Changes

`begin()`

check the result of malloc()
return false if allocation fails
only set addrMemAllocated = true after successful allocation

`setAddress()`

use a temporary pointer for realloc()
return early if reallocation fails
keep the original addrList untouched on failure
add a defensive null check before writing the new entry

Why this is useful

On many embedded targets allocation failure may be rare, but this still improves correctness and makes failure behavior safer and more predictable, especially on:

Linux builds
constrained-memory systems
long-running master nodes with growing addrList state

Behavior impact

There should be no behavior change during normal operation.

On allocation failure:

begin() now fails cleanly instead of proceeding with an invalid addrList
setAddress() now leaves the existing address list intact instead of losing it

Files changed

RF24Mesh.cpp

Notes

This PR intentionally keeps the current allocation model and only hardens failure handling. It does not change the overall ownership/lifetime design of addrList.

The following is the prior conversation context from the user's chat exploration (may be truncated):

User: Can you find any memory leaks in RF24Mesh?

Assistant: Yes — I found at least one real memory-management bug that can leak memory and also lose the only pointer to allocated storage.

Most likely leak: unchecked realloc() in setAddress().

if (addrListTop > 0 && addrListTop % MESH_MEM_ALLOC_SIZE == 0) {
    addrList = (addrListStruct*)realloc(addrList, (addrListTop + MESH_MEM_ALLOC_SIZE) * sizeof(addrListStruct));
}
addrList[addrListTop].address = address;
addrList[addrListTop++].nodeID = nodeID;

Why this is a bug

If realloc() fails, it returns NULL and the original pointer is not freed. But this code assigns the return value directly back to addrList, so on failure:

the original allocation becomes unreachable from the program
that memory is effectively leaked
the next line dereferences addrList, which can also crash

So this is both:

a memory leak on allocation failure
a null-pointer dereference risk

Safer pattern

Use a temporary pointer:

auto* newList = (addrListStruct*)realloc(addrList, newSize);
if (!newList) {
    // handle allocation failure
    return;
}
addrList = newList;

Another memory-lifetime concern

The master node allocates addrList with malloc() in begin():

</details>



<!-- START COPILOT CODING AGENT SUFFIX -->

*This pull request was created from Copilot chat.*
>

2bndy5 · 2026-05-23T11:53:23Z

Woah! Was that prompt generated? It is ridiculously long.

TMRh20 · 2026-05-23T11:54:40Z

Woah! Was that prompt generated? It is ridiculously long.

What is ridiculously long? This PR is all prompt generated.

2bndy5 · 2026-05-23T11:56:53Z

Click "Original prompt" in the description (at the bottom), and you'll see what I mean.

TMRh20 · 2026-05-23T11:58:45Z

lol, i see. Yeah, I just asked it to find memory leaks in the RF24Mesh repo lol.

@copilot run clang-tidy and clang-format on this pr

2bndy5 · 2026-05-23T12:02:18Z

I think Anthropic (& some exclusive friends) has been doing the same for a lot of package registries recently. I've seen a lot of CVE alerts about python, npm, and rust deps lately.

Copilot · 2026-05-23T12:03:08Z

lol, i see. Yeah, I just asked it to find memory leaks in the RF24Mesh repo lol.

@copilot run clang-tidy and clang-format on this pr
...

Ran both locally on this PR.

clang-format-14: applied and committed in 67f0e6f.
clang-tidy-14 on RF24Mesh.cpp: executed with checks (bugprone-*, clang-analyzer-*); it reports one environment dependency error (RF24/RF24_config.h not found in this sandbox) plus existing warnings outside this change set.

2bndy5 · 2026-05-23T12:05:25Z

We may need to add an AGENTS.md file in these repos, so that they know how to setup the build env (including RF24* deps and their git HEAD refs).

TMRh20 · 2026-05-23T12:06:20Z

I have no idea how to do that :p

2bndy5 · 2026-05-23T12:08:51Z

It is basically like a CONTRIBUTING.md doc for any AI agents: totally free-form, no standard convention.

github-actions · 2026-05-23T12:14:28Z

Memory usage change @ `55a6ce3`

Board	Flash	%	RAM for global variables	%
arduino:avr:nano	🔺 0 - +20	0 - +0.07	0 - 0	0 - 0
arduino:samd:mkrzero	🔺 +4 - +16	0 - +0.01	0 - 0	0 - 0

Click for full report per board

`arduino:avr:nano`

Sketch	Flash	%
examples/RF24Mesh_Example	0	0
examples/RF24Mesh_Example_Master_Statics	+16	+0.05
examples/RF24Mesh_Example_Master_To_Nodes	+20	+0.07
examples/RF24Mesh_Example_Node2Node	0	0
examples/RF24Mesh_Example_Node2NodeExtra	0	0
examples/RF24Mesh_SerialConfig	+8	+0.03
examples/RF24Mesh_Example_Master	+20	+0.07

`arduino:samd:mkrzero`

Sketch	Flash	%
examples/RF24Mesh_Example	+4	0
examples/RF24Mesh_Example_Master_Statics	+12	0
examples/RF24Mesh_Example_Master_To_Nodes	+16	+0.01
examples/RF24Mesh_Example_Node2Node	+4	0
examples/RF24Mesh_Example_Node2NodeExtra	+4	0
examples/RF24Mesh_SerialConfig	+4	0
examples/RF24Mesh_Example_Master	+16	+0.01

TMRh20 · 2026-05-23T12:19:06Z

Would you want to tackle that, or are you leaving it up to me?

2bndy5 · 2026-05-23T12:26:04Z

I was just thinking about how I'd make copilot do it for all the RF24 repos at once... I think I can do it in a VS Code workspace with all the local RF24 repos added.

PS - I've been rewriting the arduino/compile-sketches action in rust. When I'm done, it should be a lot faster (parallel compilation, no need for a complicated CI workflow matrix per board, and stuff). The current arduino/compile-sketches action we're using hasn't been updated in couple years.

2bndy5 · 2026-05-23T15:00:41Z

Hmm, having an agent build RF24Ethernet is very different from other Linux-compatible RF24 repos. It would have to use arduino-cli, and I doubt we can use clang-tidy unless the arduino-cli fixed their compilation database support.

I think I'll skip that one for now. Are you planning on tasking AI to analyze RF24Ethernet?

TMRh20 · 2026-05-23T15:46:10Z

Are you planning on tasking AI to analyze RF24Ethernet?

Already done to find that previous memory leak, this time I used it just to confirm what I already suspected and found another one. Its just a matter of testing and figuring out lwIP process now without crashing the MCU. This whole memory leak stuff is what is killing me as far as a timeframe for addressing these other PRs,

2bndy5 · 2026-05-23T15:52:17Z

You heavily rely on calling begin() a lot too. That's not something I was considering in my past code contributions/reviews.

Almost done with the AGENTS.md and clang-tidy configs. I usually waste about 3 prompts before I throw my hands in the air and just say aloud, "Damn it, I'll just do it myself!"

2bndy5 · 2026-05-23T15:57:57Z

This whole memory leak stuff is what is killing me ...

Another reason why I F&#$ing ❤️ Rust (🦀).

TMRh20 · 2026-06-03T13:23:18Z

@copilot are you able to modify this PR to suit the changes in #248 that now exist in the master branch?

Copilot · 2026-06-03T13:31:06Z

@copilot are you able to modify this PR to suit the changes in #248 that now exist in the master branch?

Updated this PR to match the #248 master changes. The allocation-failure handling is now applied in the new code path (setNodeID() + setAddress()), with the update in commit e9b0033.

2bndy5

It is a good idea to check the result of (re)malloc calls, and only proceed normally when successfully allocated the requested mem block.

If this wasn't embedded C++, then I'd insist that an exception is thrown to propagate alloc errors to the caller. Unfortunately, throwing exceptions in embedded C++ isn't really a thing, at least not for most implementations (I think the ESP32 core actually allows exceptions).

I say this because if alloc fails, this patch will silently drop any incoming requests for newly assigned mesh addresses. As it stands now, this patch just prevents data loss of previously assigned addresses.

Is there some kind of error handling elsewhere that I am missing? I think this is heavily relying on child nodes to interpret a master's response, be it silent, good, or somehow erroneous/unexpected.
I'm guessing the child node's double-check of a newly assigned address would surface this alloc problem in an obscure way. Didn't we add a Error message type in the reserved system message types?

TMRh20 · 2026-06-04T09:26:29Z

I'm running this code on all of my Linux/Gateway nodes with no apparent issues so far.

I think this is heavily relying on child nodes to interpret a master's response, be it silent, good, or somehow erroneous/unexpected.

The child nodes should either see themselves de-allocated or allocated with the wrong address, meaning they should just continue retrying to renew their addresses. There should be nothing new to interpret.

I'm not sure how to test this fully, other than hard-coding it to fail once or twice, but in a working situation, there should be no real "unexpected" behaviour that I can see.

2bndy5 · 2026-06-04T09:29:31Z

I'm envisioning a DoS when mem is saturated on master node.

TMRh20 · 2026-06-04T09:34:21Z

What else? If memory is saturated, the saveDHCP and actual list updating is handled prior to re-allocation, so existing nodes would have no problems, just any additional nodes would experience a DOS condition.

2bndy5 · 2026-06-04T09:46:02Z

yeah, that's what I'm thinking too. I was just thinking it would be better to have a message saying "we're at capacity" to prospecting child nodes, instead of letting them clog the bandwidth trying to reconnect. I'm probably overthinking again.

TMRh20 · 2026-06-04T09:52:42Z

Haha, yeah I'd agree with that expectation for Linux devices, but for embedded systems I think that is probably overkill. There should typically be a timeout on address renewals, so it's not that bad, but yeah a large number of nodes trying to connect constantly is a burden & could even create a DOS for the whole system.

TMRh20 · 2026-06-06T11:13:23Z

Wondering if you approve this PR or have more concerns @2bndy5 ?

2bndy5

Sorry. thought my last review was an approval

Initial plan

db7b6aa

Copilot AI assigned Copilot and TMRh20 May 23, 2026

Copilot started work on behalf of TMRh20 May 23, 2026 11:49 View session

Handle addrList allocation failures safely

6755336

Copilot AI changed the title ~~[WIP] Fix allocation handling for master node address list in RF24Mesh~~ Handle addrList allocation failures safely in RF24Mesh master nodes May 23, 2026

Copilot AI requested a review from TMRh20 May 23, 2026 11:53

Copilot finished work on behalf of TMRh20 May 23, 2026 11:53

Copilot started work on behalf of TMRh20 May 23, 2026 11:58 View session

Apply clang-format to allocation handling changes

67f0e6f

Copilot finished work on behalf of TMRh20 May 23, 2026 12:03

Copilot started work on behalf of TMRh20 June 3, 2026 13:23 View session

Align allocation checks with master setNodeID flow

2e89475

Copilot AI added 2 commits June 3, 2026 13:29

Clarify master nodeID check in setNodeID

6c76175

Initialize addrListTop on master addrList allocation

e9b0033

Copilot finished work on behalf of TMRh20 June 3, 2026 13:31

Merge branch 'master' into copilot/harden-allocation-failure-handling

55a6ce3

TMRh20 marked this pull request as ready for review June 4, 2026 07:30

TMRh20 requested a review from 2bndy5 June 4, 2026 07:30

TMRh20 approved these changes Jun 4, 2026

View reviewed changes

2bndy5 reviewed Jun 4, 2026

View reviewed changes

TMRh20 approved these changes Jun 6, 2026

View reviewed changes

2bndy5 approved these changes Jun 6, 2026

View reviewed changes

TMRh20 merged commit 7dc5f18 into master Jun 6, 2026
57 checks passed

TMRh20 deleted the copilot/harden-allocation-failure-handling branch June 6, 2026 11:27

Conversation

Copilot AI commented May 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Problem

Changes

begin()

setAddress()

Why this is useful

Behavior impact

Files changed

Notes

Why this is a bug

Safer pattern

Another memory-lifetime concern

Uh oh!

2bndy5 commented May 23, 2026

Uh oh!

TMRh20 commented May 23, 2026

Uh oh!

2bndy5 commented May 23, 2026

Uh oh!

TMRh20 commented May 23, 2026

Uh oh!

2bndy5 commented May 23, 2026

Uh oh!

Copilot AI commented May 23, 2026

Uh oh!

2bndy5 commented May 23, 2026

Uh oh!

TMRh20 commented May 23, 2026

Uh oh!

2bndy5 commented May 23, 2026

Uh oh!

github-actions Bot commented May 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Memory usage change @ 55a6ce3

arduino:avr:nano

arduino:samd:mkrzero

Uh oh!

TMRh20 commented May 23, 2026

Uh oh!

2bndy5 commented May 23, 2026

Uh oh!

2bndy5 commented May 23, 2026

Uh oh!

TMRh20 commented May 23, 2026

Uh oh!

2bndy5 commented May 23, 2026

Uh oh!

2bndy5 commented May 23, 2026

Uh oh!

TMRh20 commented Jun 3, 2026

Uh oh!

Copilot AI commented Jun 3, 2026

Uh oh!

2bndy5 left a comment

Choose a reason for hiding this comment

Uh oh!

TMRh20 commented Jun 4, 2026

Uh oh!

2bndy5 commented Jun 4, 2026

Uh oh!

TMRh20 commented Jun 4, 2026

Uh oh!

2bndy5 commented Jun 4, 2026

Uh oh!

TMRh20 commented Jun 4, 2026

Uh oh!

TMRh20 commented Jun 6, 2026

Uh oh!

2bndy5 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Copilot AI commented May 23, 2026 •

edited

Loading

`begin()`

`setAddress()`

github-actions Bot commented May 23, 2026 •

edited

Loading

Memory usage change @ `55a6ce3`

`arduino:avr:nano`

`arduino:samd:mkrzero`