fix: gate ErlNifResourceTypeInit members/dyncall on nif_version_2_16 to fix OTP 22/23 heap overflow #725

Merged: filmor merged 1 commit into rusterlium:master from lmth:fix-erl-nif-resource-type-init-otp22-overflow on May 2, 2026

lmth commented May 2, 2026

Summary

Gates ErlNifResourceTypeInit.members and ErlNifResourceTypeInit.dyncall on #[cfg(feature = "nif_version_2_16")] to fix a heap overflow when running on OTP 22/23.

Root cause

On NIF API versions < 2.16 (OTP 22 and 23), ErlNifResourceTypeInit only has three fields:

// OTP 22/23 -- sizeof = 24 bytes
typedef struct {
    ErlNifResourceDtor* dtor;
    ErlNifResourceStop* stop;
    ErlNifResourceDown* down;
} ErlNifResourceTypeInit;

The fields members and dyncall were added in NIF 2.16 / OTP 24. Since PR #358 (May 2021), Rustler has included both fields in the Rust struct unconditionally, making it 40 bytes regardless of the active NIF version.
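
The size difference described above can be checked with a small Rust sketch. The names `Callback`, `LegacyInit`, and `FullInit` are hypothetical stand-ins, not Rustler's actual types, and the callback signature is simplified; only the field count and `#[repr(C)]` layout matter here. The assumption is a target where `int` is no wider than a pointer, which gives 24 and 40 bytes on 64-bit platforms.

```rust
use std::ffi::c_void;
use std::mem::size_of;
use std::os::raw::c_int;

// Hypothetical, simplified callback type: pointer-sized like the real
// ErlNifResourceDtor*/Stop*/Down* pointers, but not their real signature.
type Callback = Option<unsafe extern "C" fn(*mut c_void)>;

/// Layout that OTP 22/23 (NIF < 2.16) expects: three pointers.
#[repr(C)]
pub struct LegacyInit {
    pub dtor: Callback,
    pub stop: Callback,
    pub down: Callback,
}

/// Layout with the two fields added in NIF 2.16 / OTP 24.
#[repr(C)]
pub struct FullInit {
    pub dtor: Callback,
    pub stop: Callback,
    pub down: Callback,
    pub members: c_int, // padded up to pointer alignment before `dyncall`
    pub dyncall: Callback,
}

/// Number of extra bytes the newer layout occupies compared to the old one.
pub fn size_mismatch() -> usize {
    size_of::<FullInit>() - size_of::<LegacyInit>()
}
```

On a 64-bit target `size_mismatch()` is the 16 bytes that end up outside the space OTP 22/23 reserves for the struct.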

When enif_open_resource_type_x is called on OTP 22/23, it does:

sys_memcpy(&ort->new_callbacks, init, sizeof(ErlNifResourceTypeInit));

where the destination field new_callbacks has OTP's own size of 24 bytes, but init points to Rustler's 40-byte struct and the copy covers all 40 bytes, so OTP writes 16 bytes past the end of the new_callbacks field, overflowing into the next heap allocation.

  • On OTP debug builds: ASSERT fires immediately on startup.
  • On OTP release builds: silent heap corruption with ~35-40% crash rate (timing-dependent heisenbug).
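
To make the overflow concrete, here is a safe Rust simulation rather than OTP's actual code. A byte vector stands in for the heap: the first `dest_size` bytes play the role of `new_callbacks`, and the following guard bytes play the role of the next allocation. The function name `clobbered_bytes`, the guard value, and the 32-byte neighbour size are all illustrative assumptions.

```rust
const GUARD: u8 = 0xAA;
const NEIGHBOUR_SIZE: usize = 32;

/// Copies `src_size` bytes into a simulated field of `dest_size` bytes and
/// returns how many bytes of the neighbouring region were overwritten.
/// Panics if `src_size` exceeds the whole simulated region.
pub fn clobbered_bytes(dest_size: usize, src_size: usize) -> usize {
    // Simulated heap: the destination field followed by a neighbour full of guard bytes.
    let mut heap = vec![GUARD; dest_size + NEIGHBOUR_SIZE];
    // Simulated caller-provided init struct; its contents differ from GUARD.
    let init = vec![0x11u8; src_size];

    // The copy length is taken from the source size, not the destination size.
    heap[..src_size].copy_from_slice(&init);

    // Count neighbour bytes that no longer hold the guard value.
    heap[dest_size..].iter().filter(|&&b| b != GUARD).count()
}
```

Calling `clobbered_bytes(24, 40)` models the mismatched case (16 neighbour bytes overwritten), while `clobbered_bytes(24, 24)` models the matching case (none overwritten).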

Fix

Gate both fields on #[cfg(feature = "nif_version_2_16")] in rustler/src/sys/types.rs and update the companion struct-literal initialisers in rustler/src/resource/registration.rs accordingly.

The struct is then 24 bytes on NIF < 2.16 and 40 bytes on NIF >= 2.16, matching what OTP expects.
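
A minimal sketch of the gating approach, assuming a Cargo feature named `nif_version_2_16` as in the PR. The struct and helper names below (`ResourceTypeInit`, `Callback`, `empty_init`, `expected_size`) are hypothetical and simplified, not the code in rustler/src/sys/types.rs or rustler/src/resource/registration.rs; the `members: 0` value is only a placeholder.

```rust
use std::ffi::c_void;
use std::mem::size_of;
use std::os::raw::c_int;

// Hypothetical, simplified callback type (pointer-sized).
type Callback = Option<unsafe extern "C" fn(*mut c_void)>;

#[repr(C)]
pub struct ResourceTypeInit {
    pub dtor: Callback,
    pub stop: Callback,
    pub down: Callback,
    // Only present when building against NIF >= 2.16.
    #[cfg(feature = "nif_version_2_16")]
    pub members: c_int,
    #[cfg(feature = "nif_version_2_16")]
    pub dyncall: Callback,
}

/// Struct literals must gate the same fields, otherwise the build fails
/// when the feature is off.
pub fn empty_init() -> ResourceTypeInit {
    ResourceTypeInit {
        dtor: None,
        stop: None,
        down: None,
        #[cfg(feature = "nif_version_2_16")]
        members: 0, // placeholder value for illustration
        #[cfg(feature = "nif_version_2_16")]
        dyncall: None,
    }
}

/// Size the struct should have for the active feature set:
/// three pointer-sized slots without the feature, five with it.
pub fn expected_size() -> usize {
    let slots = if cfg!(feature = "nif_version_2_16") { 5 } else { 3 };
    slots * size_of::<usize>()
}
```

Because the gate is resolved at compile time, the struct layout and every initialiser stay in step with the selected NIF version.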

Why this matters

Rustler's own README documents NIF 2.15 (OTP 22) as the default supported version. The bug makes every NIF compiled with default settings silently corrupt the heap on OTP 22/23.

On NIF versions < 2.16 (OTP < 24), ErlNifResourceTypeInit only has
three fields (dtor, stop, down = 24 bytes). The fields `members` and
`dyncall` were added in NIF 2.16 / OTP 24.

Since PR rusterlium#358 (May 2021) these two fields were added to the Rust struct
unconditionally, making it 40 bytes on all NIF versions. When running
on OTP 22 or 23, `enif_open_resource_type_x` calls:

    sys_memcpy(&ort->new_callbacks, init, sizeof(ErlNifResourceTypeInit))

where OTP's own sizeof is 24 bytes, but it copies 40 bytes from the
NIF's struct — a 16-byte heap overflow into the next allocation.

On OTP debug builds this triggers an ASSERT immediately. On release
builds it causes silent, timing-dependent heap corruption with a
~35-40% crash rate.

The fix gates both fields on `#[cfg(feature = "nif_version_2_16")]`
so the struct size matches what OTP expects for each NIF API level.
Rustler's own README documents NIF 2.15 (OTP 22) as the default, so
this version must work without heap corruption.

Companion struct-literal initialisers in resource/registration.rs are
gated accordingly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
filmor commented May 2, 2026

Thank you for the fix. OTP's behaviour here is really odd :)

The comment is also a bit off. The behaviour is only broken in OTP<24. From then onwards (erlang/otp#4741) the members are copied individually instead of "blindly" memcpy'ing everything over.

filmor merged commit 7c81903 into rusterlium:master on May 2, 2026
93 of 192 checks passed
lmth commented May 2, 2026

> Thank you for the fix. OTP's behaviour here is really odd :)
>
> The comment is also a bit off. The behaviour is only broken in OTP<24. From then onwards (erlang/otp#4741) the members are copied individually instead of "blindly" memcpy'ing everything over.

Well, thank you! That was fast.
(I ran into this when backporting a NIF that was working perfectly on OTP-27, to an older branch of our product which runs on OTP-22. Obviously, when fixing it locally, I must also bring it upstream.)
I really appreciate the rustler project.
