Skip to content

Add code review for PR #4 (multipass performance optimizations)#5

Draft
Copilot wants to merge 2 commits into
mainfrom
copilot/review-latest-pr-comments
Draft

Add code review for PR #4 (multipass performance optimizations)#5
Copilot wants to merge 2 commits into
mainfrom
copilot/review-latest-pr-comments

Conversation

Copy link
Copy Markdown

Copilot AI commented Apr 2, 2026

PR #4 introduces ~4.3x multipass speedup via mmap-based I/O caching, bitset connectivity, and reduced per-node overhead. This PR documents a full review identifying correctness and safety issues in those changes.

Critical: Race conditions in parallel loops (freight.cpp)

nodes_moved unsynchronized write:

#pragma omp parallel for schedule(runtime)
for (LongNodeID curr_node = 0; ...) {
    ...
    if (config.restream_number > 0 && block != old_block_for_move_check)
        nodes_moved++;  // data race — no atomic/critical
}

Fix: #pragma omp atomic or std::atomic<LongNodeID>.

first_pass_bits / first_pass_connectivity concurrent RMW:
Multiple threads can race on the same net_idx slot (bit-OR) and on the shared double accumulator. Fix: std::atomic<uint64_t> for bitset words and #pragma omp atomic for the accumulator, or accumulate connectivity in a post-parallel pass over the bitset.

Medium issues

  • open() unchecked before fstat() (both graph_io_stream.h): fd == -1 is silently passed to fstat, leaving st.st_size indeterminate before the mmap call.
  • Duplicate if/else bodies in OMP critical guards (both vertex_partitioning.h): both branches contain the same call — only the pragma differs; the duplication should be collapsed.
  • PGO CMake flags are GCC/Clang-only (both CMakeLists.txt): -fprofile-generate/-fprofile-use need a CMAKE_CXX_COMPILER_ID MATCHES "GNU|Clang" guard.
  • evaluate_cut/evaluate_connectivity lack default initializers in partition_config.h — UB if the struct is used before standard() runs.
  • Stale .size() call in prefetch guard (graph_io_stream.cpp): guard uses line_numbers.size() instead of the pre-computed line_size.
  • best_cut/best_con/best_qap/best_pins set but never read — the intended skip of the final re-evaluation is incomplete.

Minor

  • Removal of config.remaining_stream_nodes-- in readNodeOnePass_netl lacks a comment explaining why it is safe in all modes.
  • cursor = curr_node alias is redundant; use curr_node directly.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • https://api.github.com/graphql
    • Triggering command: /usr/bin/gh gh auth status (http block)
  • https://api.github.com/repos/KaHIP/FREIGHT/pulls/4
    • Triggering command: /usr/bin/curl curl -s -H Authorization: token ****** REDACTED (http block)

If you need me to access, download, or install something from one of these locations, you can either:

Original prompt

review the latesr pr and add comments

@codacy-production
Copy link
Copy Markdown

codacy-production Bot commented Apr 2, 2026

Not up to standards ⛔

🔴 Issues 8 minor

Alerts:
⚠ 8 issues (≤ 0 issues of at least minor severity)

Results:
8 new issues

Category Results
BestPractice 1 minor
CodeStyle 7 minor

View in Codacy

TIP This summary will be updated as you push new changes. Give us feedback

@schulzchristian schulzchristian requested a review from Copilot April 2, 2026 14:27
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot wasn't able to review any files in this pull request.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Agent-Logs-Url: https://github.com/KaHIP/FREIGHT/sessions/70ec7d09-80a2-4157-ba10-86a0c3d6872c

Co-authored-by: schulzchristian <5701578+schulzchristian@users.noreply.github.com>
Copilot AI changed the title [WIP] Review the latest pull request and add comments Add code review for PR #4 (multipass performance optimizations) Apr 2, 2026
Copilot AI requested a review from schulzchristian April 2, 2026 14:35
schulzchristian added a commit that referenced this pull request Apr 2, 2026
… freight_graphs

~4.3x speedup for multipass streaming partitioning while preserving
bit-identical partition output. Applied to both hypergraph and graph
streaming partitioners.

I/O optimizations:
- Cache binary input across multipass iterations (ram_stream)
- Skip readFirstLineStream on subsequent ram_stream passes
- Use mmap with MAP_POPULATE for zero-copy file loading (macOS compat)
- Bulk-read + parse from char buffer with pre-reserved vectors
- Skip per-node gettimeofday syscalls in inner loop

Evaluation optimizations (hypergraphs):
- Use cached data for evaluation instead of re-reading file
- Replace unordered_set with flat bitset for connectivity tracking
- Skip unused metric per mode (cut-only or connectivity-only)
- Specialize bitset check for k<=64 (single word)
- Cheap CUT_NET count for first-pass cut evaluation
- Compute first-pass connectivity incrementally (sequential post-pass)
- Use local scratch vector for inter-pass eval to avoid save/restore
- Skip evaluation when partition has converged

Processing optimizations:
- Bypass OMP critical sections when single-threaded
- Eliminate valid_neighboring_nets vector for ram_stream (hypergraphs)
- Prefetch stream_edges_assign entries in hot loops (hypergraphs)
- Remove -fno-builtin-malloc flags
- Use memset for all_blocks_to_keys initialization
- Skip dead computations for default config
- Add PGO support to CMakeLists.txt (GCC/Clang only)

Safety fixes (from Copilot review #5):
- Add #pragma omp atomic for nodes_moved in parallel loop
- Move first-pass bitset computation out of parallel region
- Check open() return before fstat() in mmap code
- Add default initializers for evaluate_cut/evaluate_connectivity
- Guard PGO CMake flags with compiler ID check
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants