Add code review for PR #4 (multipass performance optimizations) by Copilot · Pull Request #5 · KaHIP/FREIGHT

Copilot · 2026-04-02T14:26:38Z

PR #4 introduces ~4.3x multipass speedup via mmap-based I/O caching, bitset connectivity, and reduced per-node overhead. This PR documents a full review identifying correctness and safety issues in those changes.

Critical: Race conditions in parallel loops (`freight.cpp`)

nodes_moved unsynchronized write:

#pragma omp parallel for schedule(runtime)
for (LongNodeID curr_node = 0; ...) {
    ...
    if (config.restream_number > 0 && block != old_block_for_move_check)
        nodes_moved++;  // data race — no atomic/critical
}

Fix: #pragma omp atomic or std::atomic<LongNodeID>.

first_pass_bits / first_pass_connectivity concurrent RMW:
Multiple threads can race on the same net_idx slot (bit-OR) and on the shared double accumulator. Fix: std::atomic<uint64_t> for bitset words and #pragma omp atomic for the accumulator, or accumulate connectivity in a post-parallel pass over the bitset.

Medium issues

open() unchecked before fstat() (both graph_io_stream.h): fd == -1 is silently passed to fstat, leaving st.st_size indeterminate before the mmap call.
Duplicate if/else bodies in OMP critical guards (both vertex_partitioning.h): both branches contain the same call — only the pragma differs; the duplication should be collapsed.
PGO CMake flags are GCC/Clang-only (both CMakeLists.txt): -fprofile-generate/-fprofile-use need a CMAKE_CXX_COMPILER_ID MATCHES "GNU|Clang" guard.
evaluate_cut/evaluate_connectivity lack default initializers in partition_config.h — UB if the struct is used before standard() runs.
Stale .size() call in prefetch guard (graph_io_stream.cpp): guard uses line_numbers.size() instead of the pre-computed line_size.
best_cut/best_con/best_qap/best_pins set but never read — the intended skip of the final re-evaluation is incomplete.

Minor

Removal of config.remaining_stream_nodes-- in readNodeOnePass_netl lacks a comment explaining why it is safe in all modes.
cursor = curr_node alias is redundant; use curr_node directly.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

https://api.github.com/graphql
- Triggering command: /usr/bin/gh gh auth status (http block)
https://api.github.com/repos/KaHIP/FREIGHT/pulls/4
- Triggering command: /usr/bin/curl curl -s -H Authorization: token ****** REDACTED (http block)

If you need me to access, download, or install something from one of these locations, you can either:

Configure Actions setup steps to set up my environment, which run before the firewall is enabled
Add the appropriate URLs or hosts to the custom allowlist in this repository's Copilot coding agent settings (admins only)

Original prompt

review the latesr pr and add comments

codacy-production · 2026-04-02T14:27:35Z

Not up to standards ⛔

🔴 Issues 8 minor

Alerts:
⚠ 8 issues (≤ 0 issues of at least minor severity)

Results:
8 new issues

Category Results

BestPractice 1 minor

CodeStyle 7 minor

View in Codacy

_{TIP This summary will be updated as you push new changes. Give us feedback}

Copilot

Copilot wasn't able to review any files in this pull request.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Agent-Logs-Url: https://github.com/KaHIP/FREIGHT/sessions/70ec7d09-80a2-4157-ba10-86a0c3d6872c Co-authored-by: schulzchristian <5701578+schulzchristian@users.noreply.github.com>

… freight_graphs ~4.3x speedup for multipass streaming partitioning while preserving bit-identical partition output. Applied to both hypergraph and graph streaming partitioners. I/O optimizations: - Cache binary input across multipass iterations (ram_stream) - Skip readFirstLineStream on subsequent ram_stream passes - Use mmap with MAP_POPULATE for zero-copy file loading (macOS compat) - Bulk-read + parse from char buffer with pre-reserved vectors - Skip per-node gettimeofday syscalls in inner loop Evaluation optimizations (hypergraphs): - Use cached data for evaluation instead of re-reading file - Replace unordered_set with flat bitset for connectivity tracking - Skip unused metric per mode (cut-only or connectivity-only) - Specialize bitset check for k<=64 (single word) - Cheap CUT_NET count for first-pass cut evaluation - Compute first-pass connectivity incrementally (sequential post-pass) - Use local scratch vector for inter-pass eval to avoid save/restore - Skip evaluation when partition has converged Processing optimizations: - Bypass OMP critical sections when single-threaded - Eliminate valid_neighboring_nets vector for ram_stream (hypergraphs) - Prefetch stream_edges_assign entries in hot loops (hypergraphs) - Remove -fno-builtin-malloc flags - Use memset for all_blocks_to_keys initialization - Skip dead computations for default config - Add PGO support to CMakeLists.txt (GCC/Clang only) Safety fixes (from Copilot review #5): - Add #pragma omp atomic for nodes_moved in parallel loop - Move first-pass bitset computation out of parallel region - Check open() return before fstat() in mmap code - Add default initializers for evaluate_cut/evaluate_connectivity - Guard PGO CMake flags with compiler ID check

Initial plan

c5bc73e

Copilot AI assigned Copilot and schulzchristian Apr 2, 2026

Copilot started work on behalf of schulzchristian April 2, 2026 14:26 View session

schulzchristian requested a review from Copilot April 2, 2026 14:27

Copilot AI reviewed Apr 2, 2026

View reviewed changes

Add detailed code review for PR #4 (multipass performance optimizations)

0e306ab

Agent-Logs-Url: https://github.com/KaHIP/FREIGHT/sessions/70ec7d09-80a2-4157-ba10-86a0c3d6872c Co-authored-by: schulzchristian <5701578+schulzchristian@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Review the latest pull request and add comments~~ Add code review for PR #4 (multipass performance optimizations) Apr 2, 2026

Copilot finished work on behalf of schulzchristian April 2, 2026 14:35

Copilot AI requested a review from schulzchristian April 2, 2026 14:35

schulzchristian mentioned this pull request Apr 2, 2026

Multipass performance optimizations (~4.3x speedup) #4

Merged

5 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add code review for PR #4 (multipass performance optimizations)#5

Add code review for PR #4 (multipass performance optimizations)#5
Copilot wants to merge 2 commits into
mainfrom
copilot/review-latest-pr-comments

Copilot AI commented Apr 2, 2026 •

edited

Loading

Uh oh!

codacy-production Bot commented Apr 2, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

Copilot AI commented Apr 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Critical: Race conditions in parallel loops (freight.cpp)

Medium issues

Minor

I tried to connect to the following addresses, but was blocked by firewall rules:

Uh oh!

codacy-production Bot commented Apr 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Not up to standards ⛔

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Copilot AI commented Apr 2, 2026 •

edited

Loading

Critical: Race conditions in parallel loops (`freight.cpp`)

codacy-production Bot commented Apr 2, 2026 •

edited

Loading