Batches the quat to vec, rotation, and mat compose functions for upst… by apasarkar · Pull Request #108 · pygfx/pylinalg

apasarkar · 2026-04-15T14:15:21Z

This PR adds faster, vectorized (batched) versions of some key functions that are regularly used in fastplotlib. Specifically, it updates the function that computes quaternions from pairs of vectors, and the function that composes a transformation matrix from translation vectors, quaternions, and scaling vectors.

Edit: also includes updated tests for checking that batching works.
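To illustrate what batching means for the compose step, here is a minimal sketch of a broadcast-friendly version. It is not the code from this PR: the name compose_batched_sketch is hypothetical, and it assumes unit quaternions stored in (x, y, z, w) order.

```python
import numpy as np


def compose_batched_sketch(translation, rotation, scaling):
    """Hypothetical helper: build (..., 4, 4) TRS matrices with broadcasting.

    Assumes unit quaternions in (x, y, z, w) order; not the PR's implementation.
    """
    t = np.asarray(translation, dtype=float)
    q = np.asarray(rotation, dtype=float)
    s = np.asarray(scaling, dtype=float)

    # Leading (batch) dimensions of all three inputs must broadcast together.
    batch_shape = np.broadcast_shapes(t.shape[:-1], q.shape[:-1], s.shape[:-1])

    # Split the quaternion components along the last axis.
    x, y, z, w = np.moveaxis(q, -1, 0)

    # Standard rotation matrix of a unit quaternion, filled entry by entry.
    rot = np.empty(q.shape[:-1] + (3, 3))
    rot[..., 0, 0] = 1 - 2 * (y * y + z * z)
    rot[..., 0, 1] = 2 * (x * y - z * w)
    rot[..., 0, 2] = 2 * (x * z + y * w)
    rot[..., 1, 0] = 2 * (x * y + z * w)
    rot[..., 1, 1] = 1 - 2 * (x * x + z * z)
    rot[..., 1, 2] = 2 * (y * z - x * w)
    rot[..., 2, 0] = 2 * (x * z - y * w)
    rot[..., 2, 1] = 2 * (y * z + x * w)
    rot[..., 2, 2] = 1 - 2 * (x * x + y * y)

    out = np.zeros(batch_shape + (4, 4))
    out[..., :3, :3] = rot * s[..., None, :]  # R @ diag(s): scale each column
    out[..., :3, 3] = t  # translation goes in the last column
    out[..., 3, 3] = 1.0
    return out


# A single transform and a batch of five go through the same code path.
single = compose_batched_sketch([1, 2, 3], [0, 0, 0, 1], [2, 2, 2])
batch = compose_batched_sketch(np.zeros((5, 3)), [0, 0, 0, 1], np.ones((5, 3)))
print(single.shape, batch.shape)
```

The same function handles one transform or many because every operation works on the trailing axis only. The cost is a fixed amount of array bookkeeping per call, which matters most when only a single transform is composed.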



almarklein · 2026-04-15T18:39:49Z

Thanks for this contribution! Some general comments:

You reduced and removed some docstrings, not sure why?
This project is formatted with ruff. You can run ruff format to autoformat the code, and ruff check to check for linting errors.
Would be good to have some benchmarks to verify that the code has not become significantly slower due to these changes, when used without the batching.


apasarkar · 2026-04-15T21:48:36Z

Sounds good, thanks @almarklein! To address the comments:

Yep good call, I've reintroduced those comments.
Formatted with ruff in latest commits
Batching experiments are below for the two key functions modified (mat_compose and quat_from_vecs)

Code for profiling mat_compose:

import timeit

import numpy as np
import pandas as pd

# Assumed to be in scope: mat_compose (the implementation copied from main)
# and mat_batch_compose (the batched implementation from this PR).


def benchmark_mat_compose(
    ns=(1, 10, 100, 1_000, 10_000),
    n_repeat=1,
    n_number=20,
    seed=0,
):
    """
    Benchmark mat_compose (scalar loop) vs mat_batch_compose (vectorized).

    Parameters
    ----------
    ns : sequence of int
        Batch sizes to test.
    n_repeat : int
        Number of timeit repeats (best-of is taken).
    n_number : int
        Number of calls per timeit repeat.
    seed : int
        RNG seed for reproducibility.

    Returns
    -------
    pd.DataFrame indexed by n, with columns: loop_ms, batch_ms, speedup
    """
    rng = np.random.default_rng(seed)

    rows = []
    for n in ns:
        translations = rng.random((n, 3))
        rotations = rng.random((n, 4))
        # Normalize so that each row is a unit quaternion.
        rotations /= np.linalg.norm(rotations, axis=1, keepdims=True)
        scalings = rng.random((n, 3)) + 0.1

        def loop_fn():
            for i in range(n):
                mat_compose(translations[i], rotations[i], scalings[i])

        def batch_fn():
            mat_batch_compose(translations, rotations, scalings)

        # Best-of-n_repeat time per call, in milliseconds.
        t_loop = min(timeit.repeat(loop_fn, repeat=n_repeat, number=n_number))
        t_loop = t_loop / n_number * 1e3
        t_batch = min(timeit.repeat(batch_fn, repeat=n_repeat, number=n_number))
        t_batch = t_batch / n_number * 1e3
        speedup = t_loop / t_batch

        rows.append(
            dict(
                n=n,
                loop_ms=round(t_loop, 4),
                batch_ms=round(t_batch, 4),
                speedup=round(speedup, 1),
            )
        )
        print(
            f"n={n:>7,} | loop {t_loop:8.3f} ms | "
            f"batch {t_batch:8.3f} ms | speedup {speedup:.1f}x"
        )

    return pd.DataFrame(rows).set_index("n")

The relative runtimes here are:

n=      1 | loop    0.024 ms | batch    0.092 ms | speedup 0.3x
n=     10 | loop    0.235 ms | batch    0.095 ms | speedup 2.5x
n=    100 | loop    1.623 ms | batch    0.049 ms | speedup 33.3x
n=  1,000 | loop   13.604 ms | batch    0.154 ms | speedup 88.5x
n= 10,000 | loop  132.154 ms | batch    1.278 ms | speedup 103.4x

So mat_compose looks good w.r.t. overheads observable at n = 1.
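For reference, the ratios implied by the table can be recomputed directly from the reported timings. The sketch below uses only the numbers printed above; the variable and function names are illustrative.

```python
# Timings copied from the table above: n -> (loop_ms, batch_ms).
reported = {
    1: (0.024, 0.092),
    10: (0.235, 0.095),
    100: (1.623, 0.049),
    1_000: (13.604, 0.154),
    10_000: (132.154, 1.278),
}


def summarize(timings):
    """Return speedup and per-item batch cost (in microseconds) for each n."""
    rows = []
    for n, (loop_ms, batch_ms) in timings.items():
        rows.append(
            {
                "n": n,
                "speedup": loop_ms / batch_ms,
                "batch_us_per_item": batch_ms / n * 1e3,
            }
        )
    return rows


summary = summarize(reported)
for row in summary:
    print(
        f"n={row['n']:>6,} | speedup {row['speedup']:6.1f}x | "
        f"batch cost {row['batch_us_per_item']:8.3f} us/item"
    )

# For a single transform the ratio goes the other way: batch time / loop time.
n1_slowdown = reported[1][1] / reported[1][0]
print(f"n=1: batched call takes {n1_slowdown:.1f}x the time of the scalar call")
```

Because the table values are rounded, ratios recomputed this way can differ slightly from the printed speedups.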

I ran a similar experiment for the quat_from_vecs code:



import timeit

import numpy as np

# Assumed to be in scope: quat_from_vecs (the implementation copied from main)
# and quat_from_vecs_batch (the batched implementation from this PR).


def benchmark_quat_from_vecs_singleton(n_repeat=5, n_number=1000, seed=0):
    """
    Compare quat_from_vecs (scalar) vs quat_from_vecs_batch on singleton [3] inputs.
    Goal: verify batch overhead is not significant for a single vector pair.
    """
    rng = np.random.default_rng(seed)

    def random_unit_vec():
        v = rng.random(3)
        return v / np.linalg.norm(v)

    src, tgt = random_unit_vec(), random_unit_vec()

    # Best-of-n_repeat time per call, in milliseconds.
    t_scalar = min(timeit.repeat(
        lambda: quat_from_vecs(src, tgt),
        repeat=n_repeat, number=n_number
    )) / n_number * 1e3

    t_batch = min(timeit.repeat(
        lambda: quat_from_vecs_batch(src, tgt),
        repeat=n_repeat, number=n_number
    )) / n_number * 1e3

    # Ratio > 1 means the batched version is slower for a single pair.
    overhead = t_batch / t_scalar
    print(
        f"scalar: {t_scalar:>10.4f} ms | batch: {t_batch:>10.4f} ms | "
        f"overhead: {overhead:.2f}x"
    )

Each call takes ~0.1 ms in both versions, so the batching logic does not seem to change the runtime for a single vector pair.

Korijn · 2026-04-15T22:18:57Z

I think in particular we are eager to see a comparison with the implementation on the main branch :) to avoid a potential performance regression

apasarkar · 2026-04-15T22:27:22Z

Hi @Korijn! The comparison above is w.r.t. the code on the main branch. (I cut/pasted the code on main right now and compared that with the new batched code).

Korijn · 2026-04-16T07:18:46Z

So then do I correctly understand that the n=1 case became 3x slower? It's on the hot path for pygfx if I am not mistaken, can you address this?

E.g. the explicit float typecast in asarray adds some overhead you can potentially do without... There are more ways to get it done.
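The asarray point can be checked with a small micro-benchmark. The sketch below is illustrative only: the helper names are hypothetical, and it measures NumPy's conversion call in isolation rather than anything in pylinalg. Any difference it reports depends on the machine and the NumPy version.

```python
import timeit

import numpy as np

# Input that is already a float64 array, as is common on a hot path.
vec = np.array([1.0, 2.0, 3.0])


def convert_plain(v):
    """Conversion without an explicit dtype."""
    return np.asarray(v)


def convert_with_dtype(v):
    """Conversion with an explicit float typecast."""
    return np.asarray(v, dtype=float)


def time_us(fn, number=20_000, repeat=5):
    """Best-of-repeat time per call, in microseconds."""
    best = min(timeit.repeat(lambda: fn(vec), repeat=repeat, number=number))
    return best / number * 1e6


t_plain = time_us(convert_plain)
t_typed = time_us(convert_with_dtype)
print(f"asarray(v):              {t_plain:.3f} us per call")
print(f"asarray(v, dtype=float): {t_typed:.3f} us per call")
```

For an input that is already float64, both calls return the same array object without copying, so whatever gap shows up is pure call overhead. Whether it is large enough to matter for the n=1 case would need to be measured in the actual functions.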

Batches the quat to vec, rotation, and mat compose functions for upstream use in graphics libraries

7734695

apasarkar requested a review from Korijn as a code owner April 15, 2026 14:15

Includes updated broadcasting tests just for the quat to vec and mat compose functions

9f6ccce

Adds back missing docs and also makes some minor improvements to mat compose

69d4118

ruff formatting

4851a00

apasarkar wants to merge 4 commits into pygfx:main from apasarkar:vectorize_quat_funcs
