Add fltflt rounding and fmod functions by tbensonatl · Pull Request #1129 · NVIDIA/MatX

tbensonatl · 2026-02-27T22:15:20Z

Add support for the following float-float (fltflt) functions:

Round toward nearest, with ties toward even
Truncate toward zero
Round toward negative infinity (floor)
fmod (floating point remainder)
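For reference, the four operations listed above have standard-library counterparts on plain `double`. The sketch below only illustrates the intended semantics using `<cmath>`; it does not use the fltflt functions, and the `RoundingDemo` and `round_all` names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper type that collects the three rounding results for one input.
struct RoundingDemo {
  double nearest_even;    // round to nearest, ties toward even
  double toward_zero;     // truncate toward zero
  double toward_neg_inf;  // round toward negative infinity (floor)
};

// std::nearbyint honors the current rounding mode, which defaults to
// round-to-nearest with ties toward even. std::round would instead
// round ties away from zero.
inline RoundingDemo round_all(double x) {
  return {std::nearbyint(x), std::trunc(x), std::floor(x)};
}

// std::fmod returns the remainder of x / y with the sign of x.
inline double remainder_like_fmod(double x, double y) {
  return std::fmod(x, y);
}
```

With these helpers, `round_all(2.5)` yields 2, 2, 2 and `round_all(-2.5)` yields -2, -2, -3, while `remainder_like_fmod(-7.5, 2.0)` yields -1.5.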

Also included are new unit tests and benchmarks for the newly introduced functions.

Also add fltflt_add_same_sign(), which is more efficient than fltflt_add() for the case where we know both inputs have the same sign.

Signed-off-by: Thomas Benson <tbenson@nvidia.com>

copy-pr-bot · 2026-02-27T22:15:23Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

greptile-apps · 2026-02-27T22:19:06Z

Greptile Summary

Adds comprehensive float-float arithmetic rounding and modulo functions with proper IEEE-754 semantics. The implementation includes fltflt_round_to_nearest() with banker's rounding (round half to even), fltflt_round_toward_zero() for truncation, fltflt_floor() for rounding toward negative infinity, and fltflt_fmod() for floating-point remainder.

Key implementation details:

Proper handling of the fast path (|hi| < 2^23) where hi may have fractional parts
Correct treatment of the slow path (|hi| >= 2^23) where hi is already an integer
Edge case handling for when hi is exactly an integer but lo has opposite sign
Round-to-nearest uses tie-breaking to even when exactly at 0.5
Constructors modernized with constexpr and = default
Added fltflt_add_same_sign() optimization (11 FLOPs vs 20 FLOPs)
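The interaction between the hi and lo components can be illustrated with a minimal floor sketch. This is not the PR's code: `FloatPair`, `floor_pair`, and `value_of` are hypothetical names, the pair is assumed to be normalized (|lo| <= ulp(hi)/2), and the branch tests whether hi is an integer rather than comparing |hi| against 2^23.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a float-float value: the represented number is hi + lo.
// Assumes the pair is normalized, i.e. |lo| <= ulp(hi) / 2.
struct FloatPair {
  float hi;
  float lo;
};

// Sketch of floor() on a float-float pair.
inline FloatPair floor_pair(FloatPair x) {
  const float fh = std::floor(x.hi);
  if (fh != x.hi) {
    // hi has a fractional part, so it lies at least one ulp(hi) away from
    // every integer. lo is too small to cross that gap, so it is discarded.
    return {fh, 0.0f};
  }
  // hi is already an integer, so floor(hi + lo) == hi + floor(lo).
  // A negative lo under a positive integer hi pulls the result down by one.
  const float fl = std::floor(x.lo);
  // Renormalize with a fast two-sum, valid here because |hi| >= |fl|.
  const float s = fh + fl;
  const float e = fl - (s - fh);
  return {s, e};
}

// Evaluate the pair in double precision for inspection.
inline double value_of(FloatPair x) {
  return static_cast<double>(x.hi) + static_cast<double>(x.lo);
}
```

For example, `floor_pair({3.0f, -1e-9f})` represents a value just below 3 and yields 2, and `floor_pair({33554432.0f, -0.5f})` yields 33554431, which needs both components because it is not representable in a single float.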

Testing and benchmarking:

Comprehensive unit tests covering edge cases, tie-breaking, boundary conditions, and large values
Performance benchmarks added for all new functions comparing fltflt against float/double baselines

Confidence Score: 5/5

Safe to merge - well-implemented mathematical functions with comprehensive tests
Implementation is mathematically sound with proper IEEE-754 semantics, comprehensive unit tests covering edge cases, and follows existing code patterns
No files require special attention

Important Files Changed

Filename	Overview
include/matx/kernels/fltflt.h	Adds fltflt rounding functions (round_to_nearest, round_toward_zero, floor) and fmod with proper tie-breaking and edge case handling
test/00_misc/FloatFloatTests.cu	Comprehensive unit tests for new rounding functions covering edge cases, tie-breaking, and boundary conditions
bench/00_misc/fltflt_arithmetic.cu	Adds benchmark kernels for round, fmod, trunc, floor, and cast operations with proper ILP and timing instrumentation
bench/scripts/run_fltflt_benchmarks.py	Updates benchmark list to include new functions in test execution and reporting

Last reviewed commit: 0651eb6

greptile-apps

4 files reviewed, no comments

cliffburdick · 2026-02-27T22:31:26Z

bench/00_misc/fltflt_arithmetic.cu

+  if (idx < size) {
+    T val[ILP_FACTOR];
+    //const T init_val = static_cast<T>(std::numbers::e);
+    const T init_val = static_cast<T>(33554432.5);


constexpr?

cliffburdick · 2026-02-27T23:01:36Z

include/matx/kernels/fltflt.h

+// given in Algorithm 14.1 of "Handbook of Floating-Point Arithmetic" by Muller et al. Rather
+// than include a conditional on the magnitude of a and b to use fltflt_fast_two_sum(), we
+// use fltflt_two_sum() at the cost of more FLOPs but without a branch.
+static __MATX_HOST__ __MATX_DEVICE__ __MATX_INLINE__ fltflt fltflt_add_same_sign(fltflt a, fltflt b) {


How would someone use this function?

Currently you would likely call it directly in a kernel. I plan on switching the sar_bp operator to use this function when computing a distance, because in that case we are adding squared values and thus know they are non-negative. I will also benchmark whether it is worth adding the conditionals required to call this function when appropriate in operator+(), but I suspect the answer is no.
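A minimal sketch of the squared-distance use case, under stated assumptions: `FloatPair`, `two_sum`, `fast_two_sum`, `two_prod`, `add_same_sign`, and `squared_distance` are hypothetical names, and the addition shown is the well-known "sloppy" double-word add (one branch-free two-sum on the high parts, a plain sum of the low parts, then renormalization), which is only accurate when both inputs share a sign. Its 6 + 2 + 3 = 11 FLOPs happen to match the count quoted in the review summary, but the PR's actual implementation may differ.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a float-float value: the represented number is hi + lo.
struct FloatPair {
  float hi;
  float lo;
};

// Branch-free two-sum: s + e == a + b exactly (6 FLOPs).
inline FloatPair two_sum(float a, float b) {
  const float s = a + b;
  const float bb = s - a;
  const float e = (a - (s - bb)) + (b - bb);
  return {s, e};
}

// Fast two-sum: requires |a| >= |b| (3 FLOPs).
inline FloatPair fast_two_sum(float a, float b) {
  const float s = a + b;
  const float e = b - (s - a);
  return {s, e};
}

// Exact product of two floats as a pair, using a fused multiply-add.
inline FloatPair two_prod(float a, float b) {
  const float p = a * b;
  const float e = std::fma(a, b, -p);
  return {p, e};
}

// Sloppy double-word addition; assumes x and y have the same sign.
inline FloatPair add_same_sign(FloatPair x, FloatPair y) {
  const FloatPair s = two_sum(x.hi, y.hi);  // exact sum of the high parts
  const float v = x.lo + y.lo;              // low parts, rounded once
  const float w = s.lo + v;                 // fold in the high-part error
  return fast_two_sum(s.hi, w);             // renormalize
}

// Squares are non-negative, so the same-sign precondition always holds.
inline FloatPair squared_distance(float dx, float dy) {
  return add_same_sign(two_prod(dx, dx), two_prod(dy, dy));
}
```

Because every term in a sum of squares is non-negative, a caller in this situation never needs a runtime sign check before choosing the cheaper addition.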

cliffburdick · 2026-02-27T23:01:55Z

/build

tbensonatl added 4 commits February 7, 2026 17:39

Add fltflt round, trunc, and fmod functions

0d15692

Also add fltflt_add_same_sign(), which is more efficient than fltflt_add() for the case where we know both inputs have the same sign. Signed-off-by: Thomas Benson <tbenson@nvidia.com>

Add fltflt_floor() function, unit tests, and floor benchmark

ce50521

Signed-off-by: Thomas Benson <tbenson@nvidia.com>

Fix fltflt ctor for gcc 8.5

3ade505

Signed-off-by: Thomas Benson <tbenson@nvidia.com>

Documentation updates for fltflt header

0651eb6

Signed-off-by: Thomas Benson <tbenson@nvidia.com>

tbensonatl requested a review from cliffburdick February 27, 2026 22:15

tbensonatl self-assigned this Feb 27, 2026

greptile-apps bot reviewed Feb 27, 2026

cliffburdick reviewed Feb 27, 2026

cliffburdick approved these changes Feb 27, 2026

tbensonatl wants to merge 4 commits into main from feature/add-fltflt-round-fmod

2 participants
