Add enumerate to parallel iterators #127

Merged: orxfun merged 1 commit into orxfun:main from TechnoPorg:push-lmpnoynmpxpy on Feb 12, 2026

@TechnoPorg
Contributor

Closes #35 as part of my ongoing effort to port wild to orx-parallel (wild-linker/wild#1413). Let me know if you think the tests I've added here are sufficient.

If you have time, I'd also appreciate your input on the PR I linked above. Performance seems to have gone down in some cases when using orx-parallel, so I'm likely not using it optimally.

Owner

@orxfun orxfun left a comment

Unfortunately, this does not provide the desired behavior. Please try the following test:

#[test]
fn enumerate_on_filter() {
    let input: Vec<_> = (0..10).collect();

    // # seq
    use crate::*;
    let par = input.iter().copied().filter(|x| x % 2 == 0);
    let enum_par = par.enumerate();
    let seq_result: Vec<_> = enum_par.collect();

    // # rayon: does not compile
    // use rayon::iter::*;
    // let par = input.par_iter().copied().filter(|x| x % 2 == 0);
    // let enum_par = par.enumerate();
    // let rayon_result: Vec<_> = enum_par.collect();

    // # orx
    let par = input.par().copied().filter(|x| x % 2 == 0);
    let enum_par = par.enumerate();
    let orx_result: Vec<_> = enum_par.collect();

    assert_eq!(orx_result, seq_result);
}

@orxfun
Owner

orxfun commented Jan 19, 2026

Hi @TechnoPorg. Thanks a lot for the PR. Unfortunately, we cannot get the required behavior with this. There are two potential fixes.

Why it doesn't work

It works whenever we are using par or map, but not when we use xap. I am pretty sure you know the difference between xap and map, but for completeness:

  • map is 1-to-1. This means that if we know the length of the input, then we know exactly the length of the output. For enumerate, we don't need to know the exact length, but it is important to know that these two lengths are the same.
  • xap is 1-to-n. Here n can be 0, 1, or more than 1. In the filter example above, it is 1-to-(0 or 1). As a result, we don't know the number of elements in the output without actually running the computation.

ConcurrentIter::enumerate is always one-to-one: it indexes the inputs. It is a necessary step, but on its own it is not sufficient to index the outputs of the parallel computation.
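
To make the mismatch concrete, here is a minimal sequential sketch using only the standard library (no orx-parallel types; the function names are made up for illustration). Indices attached at the source follow input positions, while the test above expects indices that follow output positions:

```rust
/// Attaches indices at the source, then filters: indices follow INPUT positions.
fn index_then_filter(input: &[u32]) -> Vec<(usize, u32)> {
    input
        .iter()
        .copied()
        .enumerate()
        .filter(|(_, x)| x % 2 == 0)
        .collect()
}

/// Filters first, then attaches indices: indices follow OUTPUT positions.
/// This is what sequential `filter(..).enumerate()` produces.
fn filter_then_index(input: &[u32]) -> Vec<(usize, u32)> {
    input
        .iter()
        .copied()
        .filter(|x| x % 2 == 0)
        .enumerate()
        .collect()
}
```

For the input 0..10, the first function yields (0, 0), (2, 2), (4, 4), (6, 6), (8, 8) and the second yields (0, 0), (1, 2), (2, 4), (3, 6), (4, 8). The two agree only when the transformation in between is 1-to-1.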

Potential Fix 1 (Hard)

We can still post-compute the indices. This requires some performance optimizations such that:

  • we do not spend too much time on it or we do not allocate data to achieve this, and
  • we do not spend any time on this whenever we do not enumerate (zero-cost).

I have some ideas on this and plan to test them out, but I have to implement them and look at the benchmarks to be sure.
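
For readers unfamiliar with the idea of post-computing indices, the following is one conceivable shape of it, written with std::thread only. It is an illustrative assumption, not the approach planned for orx-parallel: each chunk is filtered independently, and output indices are assigned afterwards from a running offset over the chunk output lengths. Note that this naive version allocates one intermediate vector per chunk, which is exactly the kind of cost the constraints above aim to avoid.

```rust
use std::thread;

/// Example predicate used in the usage note below.
fn is_even(x: &u32) -> bool {
    x % 2 == 0
}

/// Filters `input` in parallel chunks, then assigns output indices afterwards.
/// `chunk_size` must be non-zero (`chunks` panics on zero).
fn par_filter_enumerate(
    input: &[u32],
    chunk_size: usize,
    keep: fn(&u32) -> bool,
) -> Vec<(usize, u32)> {
    // Phase 1: one thread per chunk. Output lengths are unknown until this is done.
    let per_chunk: Vec<Vec<u32>> = thread::scope(|s| {
        let handles: Vec<_> = input
            .chunks(chunk_size)
            .map(|chunk| s.spawn(move || chunk.iter().copied().filter(keep).collect::<Vec<u32>>()))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    });

    // Phase 2: a running offset (prefix sum of chunk output lengths)
    // turns chunk-local positions into global output indices.
    let mut result = Vec::new();
    let mut offset = 0;
    for chunk in per_chunk {
        let len = chunk.len();
        result.extend(chunk.into_iter().enumerate().map(|(i, x)| (offset + i, x)));
        offset += len;
    }
    result
}
```

Calling par_filter_enumerate(&input, 3, is_even) on the input 0..10 gives the same pairs as the sequential filter followed by enumerate.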

Potential Fix 2 (Easy)

We need another trait, say X, such that X: ParIter. The par and map variants implement it, but the xap variants do not.

enumerate would then be a method defined only on X, as an additional capability.

In this case, indices from ConcurrentIter::enumerate will be valid for parallel iterator indices, and we could get the expected results.

Notice that in the example test above, the rayon code does not compile. Rayon takes this same approach, where X is called IndexedParallelIterator. I would recommend thinking of another name that reflects the 1-to-1 vs 1-to-n mapping nature of the computation, but naming is hard.
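
A minimal sketch of the trait-gating idea, with toy types that run sequentially. All names here (ParLike, OneToOne, Source, Filtered) are hypothetical and do not correspond to orx-parallel's actual types:

```rust
/// Hypothetical stand-in for the general parallel iterator trait.
trait ParLike: Sized {
    type Item;
    /// Runs the computation (sequentially here, for brevity).
    fn run(self) -> Vec<Self::Item>;
}

/// Hypothetical `X`: implemented only where each input yields exactly one output,
/// so output positions coincide with input positions.
trait OneToOne: ParLike {
    fn enumerate(self) -> Vec<(usize, Self::Item)> {
        self.run().into_iter().enumerate().collect()
    }
}

/// 1-to-1 source.
struct Source(Vec<u32>);

/// 1-to-(0 or 1) transformation.
struct Filtered<P>(Source, P);

impl ParLike for Source {
    type Item = u32;
    fn run(self) -> Vec<u32> {
        self.0
    }
}

impl OneToOne for Source {}

impl<P: Fn(&u32) -> bool> ParLike for Filtered<P> {
    type Item = u32;
    fn run(self) -> Vec<u32> {
        let Filtered(source, keep) = self;
        source.0.into_iter().filter(|x| keep(x)).collect()
    }
}

// Deliberately no `impl OneToOne for Filtered<P>`:
// calling `enumerate` on a `Filtered` value is a compile error.
```

With this layout, Source(vec![7, 8, 9]).enumerate() compiles, while enumerate on a Filtered value is rejected at compile time, which mirrors the rayon behavior shown in the test above.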

Wrap up:)

I will work on Potential Fix 1 and am trying to plan time for it.

If you are interested, you might take over Potential Fix 2. This could already enable enumerate in many use cases. And since you have done almost all the work here, you might consider adding the new trait on this branch / PR. Please let me know what you think.

@orxfun
Owner

orxfun commented Jan 19, 2026

And on wild-linker/wild#1413, firstly, thank you for all your great work!
It is a very interesting computational use case for me, and it would be great to dive deeper. To make sure I understand everything correctly, it would be nice to have a quick catch-up to go over the changes and benchmark results. If it works for you, we could arrange a time slot over email.

@TechnoPorg
Contributor Author

I've redone this PR with Potential Fix 2 (the additional trait) as proposed above and a significantly reduced scope.

The new trait is called ExactSizeParIter, and Par is currently the only implementer. This means that enumerate needs to be the first call in a method chain, which I thought was more elegant than refining the ParIter trait implementation with #[allow(refining_impl_trait)] fn ... -> impl ParIter + ExactSizeParIter on every method of Par and Map. For the same reason, I also didn't add a variant to the fallible or using types. I can certainly change it if you would prefer otherwise!
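
For context on the refinement being avoided here, this is a small sketch of what refining a return-position impl Trait looks like, using std iterators only. The trait and type names (Source, FirstThree) are made up for illustration. The impl promises more than the trait does, and rustc's refining_impl_trait lint warns about that unless it is explicitly allowed:

```rust
/// Hypothetical trait whose method promises only `impl Iterator`.
trait Source {
    fn items(&self) -> impl Iterator<Item = usize>;
}

struct FirstThree;

impl Source for FirstThree {
    // The impl adds `ExactSizeIterator`, which the trait signature does not promise.
    // This "refines" the trait method, so the lint has to be allowed explicitly.
    #[allow(refining_impl_trait)]
    fn items(&self) -> impl Iterator<Item = usize> + ExactSizeIterator {
        0..3usize
    }
}

/// Works on the concrete type only: `len` comes from the refined bound.
fn exact_len(source: &FirstThree) -> usize {
    source.items().len()
}

/// Generic callers see only what the trait promises, so they have to count.
fn generic_len<S: Source>(source: &S) -> usize {
    source.items().count()
}
```

Repeating this attribute and the widened return type on every method of every 1-to-1 variant is the boilerplate the comment above refers to.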

@orxfun
Owner

orxfun commented Jan 29, 2026

Thanks a lot @TechnoPorg !

That's a very good point. I hadn't thought of this complexity of implementing it on Map. I agree that calling enumerate first is more elegant. I think we should merge this PR like this to enable the functionality.

On the other hand, it is still a limitation which might be more problematic in some cases. One example is not being able to write input.par().copied().enumerate(), and instead needing input.par().enumerate().map(|(a, b)| (a, *b)). We have the functionality, but it is not as nice as it could be. Together with your note above, we can create a new issue and try to find an elegant way to handle this.

I have one last comment on the name. It is actually not about knowing the exact size of the iterator that enables enumerating. It is the one-to-one match between input and output, even when we don't know the length of the input. For instance, the following works nicely on this branch:

#[test]
fn enumerate_on_non_exact_size_iter() {
    let input: Vec<_> = (0..5).collect();

    // not an exact-sized iter
    let iter = input.into_iter().filter(|x| *x != 2);
    assert_ne!(iter.size_hint().0, iter.size_hint().1.unwrap());

    // not an exact-sized par
    let par = iter.iter_into_par();

    // we can still enumerate
    let values: Vec<_> = par.enumerate().collect();
    assert_eq!(values, vec![(0, 0), (1, 1), (2, 3), (3, 4)]);
}

Since it doesn't match the definition of the standard ExactSizeIterator, it might be misleading. Do you have another name in mind?

One thought:

This is a minimal trait. The standard library has similar traits whose names are simply verbs, such as Copy, Clone or Debug. Maybe ParEnumerate? I am not exactly sure; it would be great to hear your ideas.

@TechnoPorg
Contributor Author

Sorry for the delay in getting back to you on this!

I like your name suggestion of ParEnumerate. The only other thought I had was ConstantSizeParIter, but that feels unnecessarily verbose.

As for the method chaining limitation, both the standard library and rayon return named types (Enumerate<Self>, Map<Self, F>, etc.) from trait methods, rather than impl ParIter. Doing this would require quite a large change to orx-parallel's API, but would make this use case and likely many others more elegant. What are your thoughts on that idea?
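
A rough sketch of why named return types help, again with toy sequential types. Only the name ParEnumerate comes from this thread; the trait bodies and the Par, Source, Map and Filter definitions are assumptions made for illustration. Because map returns the named type Map<Self, F>, the marker trait can be implemented for Map whenever its input is enumerable, so enumerate no longer has to be the first call in the chain:

```rust
/// Toy stand-in for a parallel iterator (runs sequentially).
trait Par: Sized {
    type Item;
    fn run(self) -> Vec<Self::Item>;

    // Named return types instead of `impl Par`.
    fn map<O, F: Fn(Self::Item) -> O>(self, f: F) -> Map<Self, F> {
        Map { inner: self, f }
    }
    fn filter<P: Fn(&Self::Item) -> bool>(self, p: P) -> Filter<Self, P> {
        Filter { inner: self, p }
    }
}

/// Marker for 1-to-1 computations; only these can be enumerated.
trait ParEnumerate: Par {
    fn enumerate(self) -> Vec<(usize, Self::Item)> {
        self.run().into_iter().enumerate().collect()
    }
}

struct Source(Vec<u32>);
struct Map<I, F> { inner: I, f: F }
struct Filter<I, P> { inner: I, p: P }

impl Par for Source {
    type Item = u32;
    fn run(self) -> Vec<u32> { self.0 }
}

impl<I: Par, O, F: Fn(I::Item) -> O> Par for Map<I, F> {
    type Item = O;
    fn run(self) -> Vec<O> {
        let f = self.f;
        self.inner.run().into_iter().map(f).collect()
    }
}

impl<I: Par, P: Fn(&I::Item) -> bool> Par for Filter<I, P> {
    type Item = I::Item;
    fn run(self) -> Vec<I::Item> {
        let p = self.p;
        self.inner.run().into_iter().filter(|x| p(x)).collect()
    }
}

// The marker propagates through `Map`, but there is no impl for `Filter`.
impl ParEnumerate for Source {}
impl<I: ParEnumerate, O, F: Fn(I::Item) -> O> ParEnumerate for Map<I, F> {}
```

Here Source(vec![10, 20, 30]).map(|x| x + 1).enumerate() compiles, while the same chain with filter in place of map does not.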

@orxfun
Owner

orxfun commented Feb 8, 2026

For the naming, I am a bit more inclined towards ParEnumerate, as you said, due to its simplicity.

Thanks a lot for pointing to the solution for extending enumeration from par to map. I agree that we should try this solution and return explicit types as much as possible. I think it is completely possible, since we can use impl in place of composed functions. For instance, the map transformation on ParMap<I, O, M1, R> returns ParMap<I, Out, impl Fn(<I as ConcurrentIter>::Item) -> Out, R>. We cannot write out the explicit type of the returned composed function, but we can get away with impl.

So yes, I fully agree with the direction you suggest. But as you said, it will be a larger refactoring. I recommend finishing and merging this PR first to enable the functionality, and doing the larger refactoring on a separate branch.
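
The pattern described above, a named outer type with impl only in the closure position, can be sketched as follows. ParMapLike is a hypothetical two-parameter simplification over a std Iterator; it is not the real ParMap, which has more type parameters and works on a ConcurrentIter:

```rust
/// Hypothetical simplification: a source iterator plus the mapping applied so far.
struct ParMapLike<I, F> {
    iter: I,
    f: F,
}

impl<I, O, F> ParMapLike<I, F>
where
    I: Iterator,
    F: Fn(I::Item) -> O,
{
    /// Returns the same named outer type. Only the composed closure,
    /// whose type cannot be written out, is hidden behind `impl Fn`.
    fn map<Out>(self, g: impl Fn(O) -> Out) -> ParMapLike<I, impl Fn(I::Item) -> Out> {
        let f = self.f;
        ParMapLike {
            iter: self.iter,
            f: move |x: I::Item| g(f(x)),
        }
    }

    /// Runs the composed mapping (sequentially here, for brevity).
    fn collect(self) -> Vec<O> {
        let f = self.f;
        self.iter.map(f).collect()
    }
}
```

Starting from ParMapLike { iter: 0..3u32, f: |x: u32| x + 1 }, calling map(|y: u32| y * 10) and then collect gives 10, 20, 30. Since the outer type stays nameable, a marker trait could be implemented for it regardless of which closure it holds.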

@TechnoPorg TechnoPorg marked this pull request as ready for review February 11, 2026 06:13
@TechnoPorg
Contributor Author

> So yes, I fully agree with the direction you suggest. But as you said, it will be a larger refactoring. I recommend finishing and merging this PR first to enable the functionality, and doing the larger refactoring on a separate branch.

That makes sense. I've just pushed what I hope will be the final version of this PR implementing the basic functionality, and I'll open an issue now to track the eventual refactor.

Owner

@orxfun orxfun left a comment

Thanks a lot!

@orxfun orxfun merged commit f3cc814 into orxfun:main Feb 12, 2026
4 checks passed
@TechnoPorg TechnoPorg deleted the push-lmpnoynmpxpy branch February 12, 2026 16:57
@TechnoPorg
Contributor Author

Happy to help!

Unfortunately, though, I will not be able to contribute much for the next few months, as things have gotten quite busy on my end.

@orxfun
Owner

orxfun commented Feb 12, 2026

You helped a lot @TechnoPorg, and it is straightforward to take over the rest of enumeration, thanks to your work. Feel free to visit again when you need a break, I think we'll have more and hopefully fun issues in this repo.

Development

Successfully merging this pull request may close these issues.

Implement enumerate

2 participants