Add macOS (Apple Silicon) support #1

Draft
samtalki wants to merge 1 commit into djturizo:main from samtalki:add-macos-apple-silicon
samtalki commented Jun 9, 2026

Summary

Builds and runs on arm64 macOS (Apple Silicon) alongside the existing Linux x86 path. All 13 unit-test operations pass on arm64.

Changes

  • Per-architecture CMake flags. arm64 drops the x86 ISA flags (-maes, -mavx*, -mpclmul, -mrdseed) and the -no-pie flag, all of which Apple clang rejects, and relies on baseline NEON; Linux x86 keeps the original set unchanged.
  • emp-tool discovery. emp-tool's config exports plain variables rather than an imported target, so its include/library dirs are now wired in explicitly (EMP-TOOL_INCLUDE_DIRS / EMP-TOOL_LIBRARIES). emp can then live in any prefix, not only /usr/local; this also helps Linux when emp is installed elsewhere.
  • Randomness probe. The x86 rdseed probe is skipped on Apple/arm64, forcing random_device (EMP_USE_RANDOM_DEVICE).
  • LFSR intrinsics. nsc-lfsr.h pulls the SSE intrinsics it needs from a vendored sse2neon.h (DLTcollab/sse2neon, MIT) on arm64, and the unused 128-state LFSR is guarded out (its __m128i shift does not compile under sse2neon).
  • gid counter. The gid block counter used __m128i lane arithmetic (gid++, gid += s), which sse2neon's int64x2_t rejects. A small portable helper reproduces the per-lane semantics; behavior is identical on x86.
  • Build script and docs. Adds compile-macos.sh and a macOS section in the README.
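
To illustrate the gid counter change described above, here is a minimal sketch of what a portable per-lane helper could look like. It is not the helper from this PR: the names `Block128`, `gid_add`, `gid_inc`, and `gid_selfcheck` are hypothetical, the 128-bit block is modeled as two plain 64-bit lanes rather than a real `__m128i`, and it assumes that `gid++` and `gid += s` are meant to add the same scalar to each 64-bit lane independently.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a 128-bit block viewed as two 64-bit lanes.
// The real code uses __m128i; this model only illustrates the semantics.
struct Block128 {
    std::uint64_t lane[2];
};

// Hypothetical helper: add the scalar s to every lane.
// Each lane wraps modulo 2^64 on its own; no carry crosses between lanes.
inline Block128 gid_add(Block128 gid, std::uint64_t s) {
    gid.lane[0] += s;
    gid.lane[1] += s;
    return gid;
}

// Hypothetical helper: per-lane increment, the assumed meaning of gid++.
inline Block128 gid_inc(Block128 gid) {
    return gid_add(gid, 1);
}

// Small usage check: both lanes advance together and stay independent.
inline void gid_selfcheck() {
    Block128 g{{5, 7}};
    g = gid_inc(g);        // lanes become 6 and 8
    assert(g.lane[0] == 6 && g.lane[1] == 8);
    g = gid_add(g, 10);    // lanes become 16 and 18
    assert(g.lane[0] == 16 && g.lane[1] == 18);
}
```

With real intrinsics, the same per-lane addition could plausibly be written as `_mm_add_epi64(gid, _mm_set1_epi64x(s))`, since both calls exist in SSE2 and are provided by sse2neon; whether the PR's helper takes that route is not stated here.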

Prerequisite

emp-tool and emp-ot must be built from the 0.3.0 tag. Later versions move Ferret to ot_extension/ and drop the emp-ot/ferret/ferret_cot.h API this code targets. The README has the exact build commands.

Scope

Targets Apple Silicon (arm64). Intel macOS is not covered, but the CMake logic is structured so that a branch for Intel Macs can be added without rework. Linux x86 behavior is unchanged (the non-arm flag set is byte-for-byte identical to the original).

Testing

Ran ./compile-macos.sh, then ./unit-tests-linux.sh, on an M-series Mac (macOS 26, Apple clang 21): operations 1–13 all report SUCCESS, including log (the tightest ULP check), which exercises the sse2neon LFSR path end to end.

Build and run on arm64 macOS alongside the existing Linux x86 path.

CMake now selects flags per architecture: arm64 drops the x86 ISA flags
(-maes, -mavx*, -mpclmul, -mrdseed, -no-pie) that Apple clang rejects and
relies on baseline NEON, while Linux x86 keeps the original set. emp-tool's
exported variables (EMP-TOOL_INCLUDE_DIRS/LIBRARIES) are wired in explicitly so
emp can live in any prefix, and the x86 rdseed probe is skipped on Apple/arm64
(forcing random_device).

nsc-lfsr.h pulls the SSE intrinsics it needs from a vendored sse2neon.h on arm64
and guards the unused 128-state LFSR, whose __m128i shift does not compile under
sse2neon. The gid block counter relied on __m128i lane arithmetic; a small
portable helper reproduces the per-lane semantics.

Adds compile-macos.sh and documents the emp 0.3.0 prerequisite. All 13
unit-test operations pass on arm64.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>