Parent
#58 (Phase 1 MVP: Evidence-Informed Alignment with Deferred Filtering)
What to build
Implement SSE4.2 SIMD-accelerated diagonal fill for WFA in phraya-align (x86_64 only). Use runtime CPU feature detection via is_x86_feature_detected!("sse4.2") to dispatch between SIMD and scalar implementations. Document all unsafe blocks with SAFETY invariants.
The SIMD diagonal fill processes multiple cells per instruction, improving throughput on x86_64 CPUs that support SSE4.2. The scalar fallback produces the same results on CPUs without SSE4.2.
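A minimal sketch of the dispatch pattern described above, offered as an illustration and not as the phraya-align implementation. The function names (fill_scalar, fill_sse42, fill_diagonals) and the slice-based signature are hypothetical, and the recurrence out[i] = max(ins[i] + 1, mis[i] + 1, del[i]) is a simplified edit-distance-style stand-in for the real WFA kernel from #70. It uses _mm_max_epi32, an SSE4.1 intrinsic that is available whenever SSE4.2 is enabled.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;

/// Scalar reference: out[i] = max(ins[i] + 1, mis[i] + 1, del[i]).
/// Wrapping adds mirror the wrapping behaviour of the SIMD add.
pub fn fill_scalar(ins: &[i32], mis: &[i32], del: &[i32], out: &mut [i32]) {
    let n = out.len().min(ins.len()).min(mis.len()).min(del.len());
    for i in 0..n {
        out[i] = ins[i].wrapping_add(1).max(mis[i].wrapping_add(1)).max(del[i]);
    }
}

/// SSE path: four i32 cells per iteration, scalar code for the tail.
///
/// SAFETY: the caller must ensure the CPU supports SSE4.2.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "sse4.2")]
unsafe fn fill_sse42(ins: &[i32], mis: &[i32], del: &[i32], out: &mut [i32]) {
    let n = out.len().min(ins.len()).min(mis.len()).min(del.len());
    let mut i = 0;
    while i + 4 <= n {
        // SAFETY: i + 4 <= n and n is at most the length of every slice,
        // so each unaligned 16-byte load and store stays in bounds.
        unsafe {
            let one = _mm_set1_epi32(1);
            let a = _mm_add_epi32(_mm_loadu_si128(ins.as_ptr().add(i) as *const __m128i), one);
            let b = _mm_add_epi32(_mm_loadu_si128(mis.as_ptr().add(i) as *const __m128i), one);
            let c = _mm_loadu_si128(del.as_ptr().add(i) as *const __m128i);
            let m = _mm_max_epi32(_mm_max_epi32(a, b), c);
            _mm_storeu_si128(out.as_mut_ptr().add(i) as *mut __m128i, m);
        }
        i += 4;
    }
    // Fewer than four cells remain; finish them with the scalar code.
    fill_scalar(&ins[i..n], &mis[i..n], &del[i..n], &mut out[i..n]);
}

/// Picks the SIMD path at runtime when SSE4.2 is present, else the scalar path.
pub fn fill_diagonals(ins: &[i32], mis: &[i32], del: &[i32], out: &mut [i32]) {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("sse4.2") {
            // SAFETY: the runtime check above confirmed SSE4.2 support.
            unsafe { fill_sse42(ins, mis, del, out) };
            return;
        }
    }
    fill_scalar(ins, mis, del, out);
}
```

Keeping the scalar function public makes it usable as a reference in tests: run both paths on the same input and compare the outputs, including lengths that are not a multiple of four so the tail handling is exercised.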
Acceptance criteria
Blocked by
#70 (WFA base implementation)