Conversation
There was a problem hiding this comment.
I don't see how this is useful. If there are parameters, document them on the types or functions that take them. Or if that's not practical (because they show up similarly in multiple places) document them in the module docs.
arthurp
left a comment
There was a problem hiding this comment.
I noticed a few things that appear to be a problem from the previous code. @aneeshdurg can you look at them?
Given there appear to be some inconsistencies. I think it's pretty important to factor out the boiler plate around the tests. If you have trouble getting the Rust lambdas to work for you, tell me and I can help. There are some potentially confusing trait bounds that come up around lambdas. The refactored boiler plate should be usable in all the benchmarks, new and old.
| .cpu_affinity(cpu_set) | ||
| .spawn(); | ||
| } | ||
| } |
There was a problem hiding this comment.
I think you could extract the loop and other shared boiler plate (for creating and placing the threads and looping) into a function which takes impl FnMut() as an argument and calls it inside the loop. The benchmarks would then pass a closure with their specific produce call. You might be able to use the same one for consume too.
(It would need to take a couple of extra parameters, like the prefix for the name in the printed message.)
| let producer = q.attach_ref_producer().unwrap(); | ||
| move || { | ||
| barrier.fetch_sub(1, Ordering::Acquire); | ||
| while barrier.load(Ordering::Relaxed) > 0 {} |
There was a problem hiding this comment.
I think, this should actually have call to spin_loop in it. It helps the CPU optimize spin loops in multiple different ways. @aneeshdurg? Did you leave this out intentionally?
There was a problem hiding this comment.
I can't remember if it was intentional or if I just didn't bother. Worth adding it and seeing what happens.
There was a problem hiding this comment.
It is unlikely to change all that much. But it is the "correct" way to do it. If we are trying to get the last 0.5% at some point we can try turning it off again.
| let barrier = barrier.clone(); | ||
| move || { | ||
| barrier.fetch_sub(1, Ordering::Acquire); | ||
| // ostd::task::Task::yield_now(); |
There was a problem hiding this comment.
Remove commented code unless there is a specific reason for it.
| .cpu_affinity(cpu_set) | ||
| .spawn(); | ||
|
|
||
| // Start all consumers |
There was a problem hiding this comment.
Comment is wrong. It's strong observers.
| let weak_observer = q.attach_consumer().unwrap(); | ||
| let barrier = barrier.clone(); | ||
| move || { | ||
| barrier.fetch_sub(1, Ordering::Acquire); | ||
| while barrier.load(Ordering::Relaxed) > 0 {} | ||
| let mut cnt = 0; | ||
| for _ in 0..(2 * N_MESSAGES_PER_THREAD) { | ||
| cnt += weak_observer.consume(); |
There was a problem hiding this comment.
@aneeshdurg. What's going on with the consumer named "weak_observer"? Was this in there for any of our benchmark results?
There was a problem hiding this comment.
This was so I could run the MPMC baseline in the same function. If you see the if above, the OQueue implementations will use weak observers while the normal MPMC queue uses a consumer.
There was a problem hiding this comment.
This could use a comment in the code, and probably a different variable name.
There was a problem hiding this comment.
Yea. Since the code isn't actually reused with a macro or anything. Let's definitely use the correct name and a comment. This is and or maybe should be the topic for another PR.
aneeshdurg
left a comment
There was a problem hiding this comment.
The benchmarks themselves look good!
| input: Option<OQueueNewBenchmarkInput>, | ||
| } | ||
|
|
||
| struct OQueueNewBenchmarkInput { |
There was a problem hiding this comment.
Instead of calling the "new" impl "new", I'd rather we called the old impl "legacy" for consistency,
There was a problem hiding this comment.
@aanyas72 You added "legacy" but didn't remove "new" as was implied here. The reason to do this is that when we remove legacy, we don't have to change the name of all the "new" stuff. This matches the naming approach in ostd.
| WeakObs, | ||
| StrongObs, |
| // Start all producers | ||
| for tid in 0..n_threads_per_type { | ||
| let mut cpu_set = ostd::cpu::set::CpuSet::new_empty(); | ||
| cpu_set.add(ostd::cpu::CpuId::try_from(tid + 1).unwrap()); |
There was a problem hiding this comment.
Ideally, we should assert in the benchmark setup that there are sufficient CPUs attached. Can be a separate PR though.
This does NOT enable the same feature for LTP. Problems with it's launcher prevented an easy way to do it, and the LTP is out of date, so a correct solution will wait until/if we update.
This removes our dependency on core2 and replaces it with no_std_io2. Initially there was an upstream issue with libflate carrying a core2 dependency, but libflate fixed that with a new 2.3 release. NOTE: For context, the maintainer of core2 yanked it from crates.io. This caused all unlocked builds with dependencies on it to fail, not just for us, but across the internet.
Previously, these error types had context, but did not display it. This meant enhanced logs didn't get stack traces even when they were being captured. This corrects that. A global via the `ostd_error` macro may be possible, but would not be trivial to implement and Snafu recommends explicit error messages anyway.
This changes the data capture machinery: * Use [CBOR](https://cbor.io/). It is a self describing format generally similar to JSON. This is selected not because the format is better than others, but because the Rust library supports no-std and is stable. I evaluated message pack, and BSON. CBOR has a Python implementation. Notably, we are not generating CBOR lists, instead we are generating a series of CBOR values directly into the file. * Update the output format to include a CBOR-based directory of the files in the partition image. * Added a global function to create new files within the output images. This is provided for both new and legacy OQueues. * Run environment support for the above. * A python library for reading the data generated by all the above.
| mut setup: Setup, | ||
| on_complete: Done, | ||
| ) where | ||
| Setup: FnMut() -> Work, | ||
| Work: FnMut() + Send + 'static, | ||
| Done: Fn() + Send + Sync + 'static, |
There was a problem hiding this comment.
You can use an impl type here: on_complete: impl Fn() + Send + Sync + 'static. You may be able to do the same for Setup but it's return type may make that more complex.
| let barrier = barrier.clone(); | ||
| let on_complete = on_complete.clone(); | ||
| let completed = completed.clone(); |
There was a problem hiding this comment.
Why clone these? They are only used once.
There was a problem hiding this comment.
If I try to use the default values it gives an error: error[E0382]: use of moved value: on_complete
There was a problem hiding this comment.
Oh. Huh. I don't see the other usage, but whatever. The clone is fine.
|
@aanyas72 I think you are going to need to clean up your merge. You conflict now. However, I think a local git rebase will correctly fix it: Git rebase is smart enough to drop the commits from main that you ended up putting in the middle of your history (it detected them as cherry-picks which is more or less correct). Once you have the cleaned up branch. I'll take a final look and approve it. |
|
Sounds good, I'll get that done by tomorrow |
#214) This adds a scheduling event OQueue and support for capturing it. Unlike most OQueues this one must be enabled by a feature. See the documentation file.
This adds path metadata to OQueues in a way that it can be accessed via the OQueueRef.
arthurp
left a comment
There was a problem hiding this comment.
In the future do the rebase and force push again. This diff is confusing, and git rebase main fixes it entirely.
Overall, I'd like you to carefully read this and look for inconsistencies and things that could be cleaned up. I think you missed some things. See the comments, but also look for similar patterns elsewhere in the code.
| WeakObs, | ||
| StrongObs, |
| OQueueRef::new_anonymous(2 << 20) | ||
| } | ||
|
|
||
| fn get_consumable_oq(&self) -> ConsumableOQueueRef<u64> { |
| input: Option<OQueueNewBenchmarkInput>, | ||
| } | ||
|
|
||
| struct OQueueNewBenchmarkInput { |
There was a problem hiding this comment.
@aanyas72 You added "legacy" but didn't remove "new" as was implied here. The reason to do this is that when we remove legacy, we don't have to change the name of all the "new" stuff. This matches the naming approach in ostd.
| } | ||
|
|
||
| fn strong_obs_bench( | ||
| fn weak_obs_bench<Q: ConsumableOQueue<u64>>( |
There was a problem hiding this comment.
weak_observer_bench or weak_observer_benchmark
| let mut cpu_set = ostd::cpu::set::CpuSet::new_empty(); | ||
| cpu_set.add(ostd::cpu::CpuId::try_from(tid + 1).unwrap()); | ||
| ThreadOptions::new({ | ||
| let completed = completed.clone(); | ||
| let producer = q.attach_value_producer().unwrap(); | ||
| let barrier = barrier.clone(); | ||
| move || { | ||
| barrier.fetch_sub(1, Ordering::Acquire); | ||
| while barrier.load(Ordering::Relaxed) > 0 {} | ||
| for _ in 0..(2 * N_MESSAGES_PER_THREAD) { | ||
| producer.produce(0); | ||
| } | ||
| completed.fetch_add(1, Ordering::Relaxed); | ||
| } | ||
| }) | ||
| .cpu_affinity(cpu_set) | ||
| .spawn(); |
There was a problem hiding this comment.
Why don't these and similar benchmarks in the file not use the new thread spawning and loop handling utilities?
I added benchmarks for mixed_bench, consume_bench, produce_bench, weak_obs_bench, and strong_obs_bench. I will add the fixes for the scaling benchmarks next.