Add macOS (Apple Silicon) support #1
Draft
samtalki wants to merge 1 commit into
Build and run on arm64 macOS alongside the existing Linux x86 path. CMake now selects flags per architecture: arm64 drops the x86 ISA flags (-maes, -mavx*, -mpclmul, -mrdseed, -no-pie) that Apple clang rejects and relies on baseline NEON, while Linux x86 keeps the original set. emp-tool's exported variables (EMP-TOOL_INCLUDE_DIRS/LIBRARIES) are wired in explicitly so emp can live in any prefix, and the x86 rdseed probe is skipped on Apple/arm64 (forcing random_device). nsc-lfsr.h pulls the SSE intrinsics it needs from a vendored sse2neon.h on arm64 and guards the unused 128-state LFSR, whose __m128i shift does not compile under sse2neon. The gid block counter relied on __m128i lane arithmetic; a small portable helper reproduces the per-lane semantics. Adds compile-macos.sh and documents the emp 0.3.0 prerequisite. All 13 unit-test operations pass on arm64.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Summary
Builds and runs on arm64 macOS (Apple Silicon) alongside the existing Linux x86 path. All 13 unit-test operations pass on arm64.
Changes
- CMake selects flags per architecture. arm64 drops the x86 ISA flags (-maes, -mavx*, -mpclmul, -mrdseed, -no-pie) that Apple clang rejects and relies on baseline NEON; Linux x86 keeps the original set unchanged.
- emp-tool's exported variables (EMP-TOOL_INCLUDE_DIRS / EMP-TOOL_LIBRARIES) are wired in explicitly. emp can then live in any prefix, not only /usr/local; this also helps Linux when emp is installed elsewhere.
- The x86 rdseed probe is skipped on Apple/arm64, forcing random_device (EMP_USE_RANDOM_DEVICE).
- nsc-lfsr.h pulls the SSE intrinsics it needs from a vendored sse2neon.h (DLTcollab/sse2neon, MIT) on arm64, and the unused 128-state LFSR is guarded out (its __m128i shift does not compile under sse2neon).
- The gid block counter used __m128i lane arithmetic (gid++, gid += s), which sse2neon's int64x2_t rejects. A small portable helper reproduces the per-lane semantics; behavior is identical on x86.
- Adds compile-macos.sh and a macOS section in the README.
Prerequisite
emp-tool and emp-ot must be built from the 0.3.0 tag. Later versions move Ferret to ot_extension/ and drop the emp-ot/ferret/ferret_cot.h API this code targets. The README has the exact build commands.
Scope
Targets Apple Silicon (arm64). Intel macOS is not covered, but the CMake is structured so that a branch for Intel Macs can be added without rework. Linux x86 behavior is unchanged (the non-arm flag set is byte-for-byte identical to the original).
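For reviewers who want to see the same three-way split from the C++ side, here is a minimal sketch using compiler-predefined macros. The PR itself makes this decision in CMake; the function name `target_branch` and the returned labels are hypothetical and only illustrate which platforms fall into which branch.

```cpp
#include <string>

// Hypothetical helper: reports which platform branch this translation unit
// was compiled for, using macros that GCC and clang predefine.
inline std::string target_branch() {
#if defined(__APPLE__) && (defined(__aarch64__) || defined(__arm64__))
    return "macos-arm64";    // the branch this PR adds
#elif defined(__APPLE__) && defined(__x86_64__)
    return "macos-x86_64";   // Intel macOS: not covered by this PR
#elif defined(__linux__) && defined(__x86_64__)
    return "linux-x86_64";   // existing path, unchanged
#else
    return "other";
#endif
}
```

An Intel macOS branch would slot in at the second case without touching the other two.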
Testing
./compile-macos.sh then ./unit-tests-linux.sh on an M-series Mac (macOS 26, Apple clang 21): operations 1–13 all report SUCCESS, including log (the tightest ULP check), which exercises the sse2neon LFSR path end to end.
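The gid counter change listed under Changes can be pictured with a short sketch. It is not the helper from this PR: the names `Block128`, `add_per_lane`, and `increment_per_lane` are hypothetical, and it assumes the block is treated as two independent 64-bit lanes, which is how `gid++` and `gid += s` behave on a compiler vector type.

```cpp
#include <array>
#include <cstdint>

// Hypothetical stand-in for a 128-bit block viewed as two 64-bit lanes.
struct Block128 {
    std::array<std::uint64_t, 2> lane;
};

// Add the same scalar to each lane. Each lane wraps modulo 2^64 on its own;
// no carry propagates from lane 0 into lane 1.
inline Block128 add_per_lane(Block128 b, std::uint64_t s) {
    b.lane[0] += s;
    b.lane[1] += s;
    return b;
}

// Per-lane equivalent of gid++ (assumed semantics: every lane gains 1).
inline Block128 increment_per_lane(Block128 b) {
    return add_per_lane(b, 1);
}
```

With intrinsics, the same effect would typically come from `_mm_add_epi64` applied to the block and `_mm_set1_epi64x(s)`, both of which are available in SSE2 and in sse2neon. Whether the PR's helper takes that route is not stated here.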