Larger tage align by jensen-yan · Pull Request #846 · OpenXiangShan/GEM5

jensen-yan · 2026-04-28T07:36:35Z

Summary by CodeRabbit

Release Notes

New Features
- Added loop predictor override capability to force loop predictions when valid
- Added shadow overflow tables for extended branch prediction capacity
- Enabled row-bundle mode for branch predictor table organization
- Introduced position-based index mixing for enhanced table indexing
- Expanded TAGE branch predictor associativity configuration options

Change-Id: Ia31f46f482243e08f5052ff3f7a4cabe212c4626

Change-Id: Idfc615700d8b7e216e3520efa293d00f11c9a19c

coderabbitai · 2026-04-28T07:36:54Z

Warning

Rate limit exceeded

@jensen-yan has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 46 minutes and 46 seconds before requesting another review.

To continue reviewing without waiting, purchase usage credits in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 87f21553-761f-4418-87c4-5c450bae8dd9

📥 Commits

Reviewing files that changed from the base of the PR and between 506afd1 and 9d8ea49.

📒 Files selected for processing (4)

configs/example/kmhv3.py
src/cpu/pred/BranchPredictor.py
src/cpu/pred/btb/btb_tage.cc
src/cpu/pred/btb/btb_tage.hh

📝 Walkthrough

Walkthrough

The PR extends the BTBTAGE branch predictor with loop predictor integration, shadow overflow tables for sparse entries, row-bundle mode for TAGE tables, and position-aware indexing. New configuration parameters control these features, and the kmhv3 example configuration is updated to enable row-bundle tables.

Changes

Cohort / File(s)	Summary
Configuration Setup `configs/example/kmhv3.py`	Enables row-bundle support with explicit TAGE table configuration by setting `rowBundleTables` parameter in the `DecoupledBPUWithBTB` branch predictor setup.
Parameter Definitions `src/cpu/pred/BranchPredictor.py`	Adds `forceLoopUse` parameter to `LoopPredictor` and extends `BTBTAGE` with 9 new parameters for loop predictor integration, position-aware indexing, shadow overflow tables, and row-bundle support.
Loop Predictor Implementation `src/cpu/pred/loop_predictor.cc`, `src/cpu/pred/loop_predictor.hh`	Adds `forceLoopUse` configuration to allow loop predictions to override normal control flow when valid, and initializes this parameter in the predictor constructor.
Core Predictor Data Structures `src/cpu/pred/btb/btb_tage.hh`	Extends `TageEntry` and `TageTableInfo` with row-bundle subentry support; adds `shadowTageTable` storage; introduces `getTageIndex` overload and `getShadowIndex` helper; integrates loop predictor metadata via `TageMeta` with `LoopPredictionMeta`; adds configuration fields for position mixing, shadow overflow, and row-bundle modes; expands statistics tracking.
Core Predictor Logic `src/cpu/pred/btb/btb_tage.cc`	Implements loop predictor lookup/override path with dedicated statistics; adds shadow overflow table search on miss and allocation tracking; introduces row-bundle secondary subentry handling with position-disambiguated matching; adds position mixing in index generation; updates counter/useful access through new `TageTableInfo` accessors; expands statistics suite for new features.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~65 minutes

Possibly related PRs

cpu-o3: Refactor numWays to support per-table configuration #793: Modifies BTBTAGE to support per-table associativity (numWays) configuration, directly relevant to table-level parameter plumbing.
Tage1 align #615: Modifies BTBTAGE's tag/indexing and position-related prediction paths, directly related to position-aware indexing and API changes added here.
cpu-o3: update tage index #627: Makes code-level changes to BTBTAGE's index/tag/PC handling by adding position-mixing and related indexing features.

Suggested labels

perf, align-kmhv3

Suggested reviewers

CJ362ff
Yakkhini
tastynoob

Poem

🐰 Hop, hop, the TAGE grows!
With shadows deep and row-bundles in rows,
Loop predictors now override with flair,
Position-aware indexing fills the air!
KMHv3 stands taller, enhanced with care ✨

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 36.67% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check	❓ Inconclusive	The title 'Larger tage align' is vague and does not clearly convey the substantial changes made to branch predictor functionality.	Consider using a more descriptive title that summarizes the primary changes, such as 'Add loop predictor override and shadow overflow support to BTBTAGE' or 'Extend BTBTAGE with row-bundle and loop predictor integration'.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch larger-tage-align

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/cpu/pred/btb/btb_tage.cc`:
- Around line 386-406: The shadow-table match/allocate path currently compares
only tag and thus ignores row-bundle's position-specific subentries when both
enableShadowOverflow and enableRowBundle are enabled; update the shadow logic
(in the code around shadowTageTable lookup, getShadowIndex, getShadowNumWays,
matching_entry, matching_way, matching_subentry, and index assignment) to mirror
the main-table position/subentry handling: compute the correct subentry for
row-bundle entries, include position in the equality/matching check (or use the
same getSubentry/getPosition logic used by the main table), and set
matching_subentry accordingly so updates target the correct sub-slot;
alternatively, add an explicit guard that rejects/avoids using the shadow path
when enableRowBundle && enableShadowOverflow to prevent incorrect aliasing.
- Around line 1158-1172: The code calls loopPredictor->updateStats(...) for
every conditional branch that has loop metadata even when the loop predictor
didn't provide a prediction; change the logic in the block that handles
predMeta->loopPreds (the loop_it handling) to only call
LoopPredictor::updateStats(...) when loop_it->second.loopInfo->loopPredUsed is
true (i.e., move or guard updateStats behind the loopPredUsed check that
increments tageStats.loopPredUsedCorrect/Wrong), while still calling
loopPredictor->condBranchUpdate(...) as before; use the existing symbols
(enableLoopPredictor, has_original_pred, predMeta->loopPreds, loop_it, loopInfo,
loopPredUsed, updateStats, condBranchUpdate, instShiftAmt) to locate and modify
the code.
- Around line 497-514: The code overwrites the original TAGE prediction in
pred.taken before saving the pre-override direction, causing
LoopPredictor::loopUpdate() (when useDirectionBit is enabled) to see the
post-override value and mis-train for BTBTAGE; preserve the original TAGE
direction by storing the initial TAGE result (tage_pred_taken) into
loop_info->predTaken before calling loopPredict or otherwise ensure
meta->loopPreds[btb_entry.pc] receives the pre-override direction
(tage_pred_taken) along with the loop info; update the block around loopPredict,
pred.taken, loop_info->predTaken, and meta->loopPreds so loop_info->predTaken
contains the TAGE direction used before loopPredict flips pred.taken.
- Around line 1704-1715: The stats increments for shadowMainProvider,
shadowAltProvider, rowBundleSecondaryMainProvider and
rowBundleSecondaryAltProvider in updateStatsWithTagePrediction() are executed
both at lookup and at resolved update, causing double-counting; guard these
increments with the lookup-only condition by checking the when_pred variable
(e.g. only run them when when_pred indicates the initial prediction/lookup
phase) and leave other stats updates unchanged—locate the block using
pred.mainInfo / pred.altInfo and wrap the four increments in an if (when_pred ==
<lookup-phase>) { ... } using the existing when_pred enum/constant in this file.

In `@src/cpu/pred/loop_predictor.cc`:
- Around line 292-295: The branch info flag bi->loopPredUsed can remain stale
when the loop predictor is valid-but-not-used (or invalid), so reset it to false
before the override check: set bi->loopPredUsed = false prior to the if that
checks (forceLoopUse || loopUseCounter >= 0) && bi->loopPredValid, then keep the
existing assignment bi->loopPredUsed = true inside that block so the flag
correctly reflects whether the loop prediction was actually used; reference
symbols: bi->loopPredUsed, bi->loopPredValid, forceLoopUse, loopUseCounter,
pred_taken.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d1b3fc58-e287-47e9-8d12-c7b0e377e5bd

📥 Commits

Reviewing files that changed from the base of the PR and between b813b57 and 506afd1.

📒 Files selected for processing (6)

configs/example/kmhv3.py
src/cpu/pred/BranchPredictor.py
src/cpu/pred/btb/btb_tage.cc
src/cpu/pred/btb/btb_tage.hh
src/cpu/pred/loop_predictor.cc
src/cpu/pred/loop_predictor.hh

coderabbitai · 2026-04-28T07:42:54Z

+    if (pred.mainInfo.found && pred.mainInfo.inShadow) {
+        shadowMainProvider++;
+    }
+    if (pred.altInfo.found && pred.altInfo.inShadow) {
+        shadowAltProvider++;
+    }
+    if (pred.mainInfo.found && pred.mainInfo.subentry == 1) {
+        rowBundleSecondaryMainProvider++;
+    }
+    if (pred.altInfo.found && pred.altInfo.subentry == 1) {
+        rowBundleSecondaryAltProvider++;
+    }


⚠️ Potential issue | 🟡 Minor

Gate the new provider-shape stats on when_pred.

updateStatsWithTagePrediction() runs during lookup and again during resolved update. These increments happen before the when_pred split, so shadowMainProvider, shadowAltProvider, and the row-bundle secondary-provider stats are double-counted.

Suggested fix

- if (pred.mainInfo.found && pred.mainInfo.inShadow) { - shadowMainProvider++; - } - if (pred.altInfo.found && pred.altInfo.inShadow) { - shadowAltProvider++; - } - if (pred.mainInfo.found && pred.mainInfo.subentry == 1) { - rowBundleSecondaryMainProvider++; - } - if (pred.altInfo.found && pred.altInfo.subentry == 1) { - rowBundleSecondaryAltProvider++; - } if (when_pred) { + if (pred.mainInfo.found && pred.mainInfo.inShadow) { + shadowMainProvider++; + } + if (pred.altInfo.found && pred.altInfo.inShadow) { + shadowAltProvider++; + } + if (pred.mainInfo.found && pred.mainInfo.subentry == 1) { + rowBundleSecondaryMainProvider++; + } + if (pred.altInfo.found && pred.altInfo.subentry == 1) { + rowBundleSecondaryAltProvider++; + } if (hit) {

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/cpu/pred/btb/btb_tage.cc` around lines 1704 - 1715, The stats increments for shadowMainProvider, shadowAltProvider, rowBundleSecondaryMainProvider and rowBundleSecondaryAltProvider in updateStatsWithTagePrediction() are executed both at lookup and at resolved update, causing double-counting; guard these increments with the lookup-only condition by checking the when_pred variable (e.g. only run them when when_pred indicates the initial prediction/lookup phase) and leave other stats updates unchanged—locate the block using pred.mainInfo / pred.altInfo and wrap the four increments in an if (when_pred == <lookup-phase>) { ... } using the existing when_pred enum/constant in this file.

coderabbitai · 2026-04-28T07:42:54Z

+        if ((forceLoopUse || loopUseCounter >= 0) && bi->loopPredValid) {
            pred_taken = bi->loopPred;
            bi->loopPredUsed = true;
        }


⚠️ Potential issue | 🟠 Major

Reset loopPredUsed before the override check.

This path only writes bi->loopPredUsed on success. When the loop predictor is valid-but-not-used, or invalid, the flag keeps its prior/uninitialized value, and BTBTAGE now reads it for stats/update decisions.

Suggested fix

bool pred_taken = prev_pred_taken; if (cond_branch) { // loop prediction bi->loopPred = getLoop(branch_pc, bi, useSpeculation, instShiftAmt); + bi->loopPredUsed = false; if ((forceLoopUse || loopUseCounter >= 0) && bi->loopPredValid) { pred_taken = bi->loopPred; bi->loopPredUsed = true; }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/cpu/pred/loop_predictor.cc` around lines 292 - 295, The branch info flag bi->loopPredUsed can remain stale when the loop predictor is valid-but-not-used (or invalid), so reset it to false before the override check: set bi->loopPredUsed = false prior to the if that checks (forceLoopUse || loopUseCounter >= 0) && bi->loopPredValid, then keep the existing assignment bi->loopPredUsed = true inside that block so the flag correctly reflects whether the loop prediction was actually used; reference symbols: bi->loopPredUsed, bi->loopPredValid, forceLoopUse, loopUseCounter, pred_taken.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 506afd1505

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-28T07:43:43Z

            tageStats.updateStatsWithTagePrediction(pred, true);
            results.push_back({btb_entry.pc, pred.taken || btb_entry.alwaysTaken});
-            tageInfoForMgscs[btb_entry.pc].tage_pred_taken = pred.taken;
+            tageInfoForMgscs[btb_entry.pc].tage_pred_taken = tage_pred_taken;


Feed loop-adjusted direction into MGSC handoff

This line passes the pre-loop TAGE result to MGSC (tage_pred_taken captured before loopPredict), so when MGSC is enabled its fallback path (src/cpu/pred/btb/btb_mgsc.cc, taken = ... ? ... : tage_info.tage_pred_taken) ignores any loop-predictor override. In practice, enabling BTBTAGE.enableLoopPredictor will not affect final branch direction whenever MGSC does not select SC, because MGSC rewrites condTakens using this stale value.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-04-28T07:43:43Z

+            const unsigned shadowWays = getShadowNumWays(i);
+            for (unsigned way = 0; way < shadowWays; way++) {
+                auto &entry = shadowTageTable[i][shadowIndex][way];
+                if (entry.valid && tag == entry.tag) {


Disambiguate shadow hits by branch position

Shadow lookup matches only on tag, but row-bundle mode removes position from the tag (tagPosition = 0 for bundled tables). If enableRowBundle and enableShadowOverflow are both used on the same tables, different branch positions in the same fetch block can alias to the same shadow tag and be treated as a hit, producing wrong directions and updating the wrong shadow entry.

Useful? React with 👍 / 👎.

XiangShanRobot · 2026-04-28T08:15:29Z

[Generated by GEM5 Performance Robot]
commit: 506afd1
workflow: gem5 Align BTB Performance Test(0.3c)

Align BTB Performance

Overall Score

	PR	Master	Diff(%)
Score	18.82	18.72	+0.54 🟢

XiangShanRobot · 2026-04-28T08:15:44Z

[Generated by GEM5 Performance Robot]
commit: 506afd1
workflow: gem5 Align BTB Performance Test(0.3c)

Align BTB Performance

Overall Score

	PR	Master	Diff(%)
Score	18.82	18.72	+0.54 🟢

Change-Id: I135a9fab8186bc4d41be1cc6aca0e8d0bfa9da2d

Change-Id: Ie46db37ec481b93f9735bf8606304f17c9059655

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9d8ea492aa

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-07T10:11:05Z

+            const Addr shadowIndex = getShadowIndex(
+                index, i, position, startPC, indexFoldedHist[i].get());


Use snapshot history for shadow lookup

When generateSinglePrediction() is called with predMeta on the update/recompute path, the main index and tag above use the prediction-time folded histories, but this shadow lookup still hashes the live indexFoldedHist. With enableShadowOverflow plus useAltHashForShadowIndex, any speculative history change between lookup and update makes recomputation probe a different shadow set, so shadow providers are missed and the update can train/allocate as if no provider existed.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-05-07T10:11:05Z

+            const Addr shadowIndex = getShadowIndex(
+                newIndex, ti, position, startPC, indexFoldedHist[ti].get());


Allocate shadow entries with prediction history

This shadow allocation uses the live folded history even though the primary allocation just above is indexed with meta->indexFoldedHist. When useAltHashForShadowIndex is enabled, updates occur after history has advanced, so entries placed in the shadow table are keyed by update-time history and later predictions for the same branch/history will probe a different shadow index, effectively losing the allocation.

Useful? React with 👍 / 👎.

github-actions · 2026-05-07T12:27:05Z

🚀 Coremark Smoke Test Results

Branch	IPC	Change
Base (`xs-dev`)	`2.3130`	-
This PR	`2.3237`	📈 `+0.0107` (`+0.46%`)

✅ Difftest smoke test passed!

XiangShanRobot · 2026-05-07T13:11:06Z

[Generated by GEM5 Performance Robot]
commit: 9d8ea49
workflow: gem5 Align BTB Performance Test(0.3c)

Align BTB Performance

Overall Score

	PR	Master	Diff(%)
Score	18.73	18.73	+0.01 🟢

[Generated by GEM5 Performance Robot]
commit: 9d8ea49
workflow: gem5 Align BTB Performance Test(0.3c)

Align BTB Performance

Overall Score

	PR	Previous Commit	Diff(%)
Score	18.73	18.82	-0.49 🔴

jensen-yan added 2 commits April 28, 2026 15:01

configs: Enable low4x BTBTAGE ways

19629ba

Change-Id: Ia31f46f482243e08f5052ff3f7a4cabe212c4626

cpu: Enable low4x row-bundle BTBTAGE

506afd1

Change-Id: Idfc615700d8b7e216e3520efa293d00f11c9a19c

coderabbitai Bot reviewed Apr 28, 2026

View reviewed changes

chatgpt-codex-connector Bot reviewed Apr 28, 2026

View reviewed changes

jensen-yan added 2 commits May 7, 2026 18:08

cpu: Add alternate BTBTAGE shadow index hash

4906c57

Change-Id: I135a9fab8186bc4d41be1cc6aca0e8d0bfa9da2d

configs: Test row-bundle without low4x

9d8ea49

Change-Id: Ie46db37ec481b93f9735bf8606304f17c9059655

chatgpt-codex-connector Bot reviewed May 7, 2026

View reviewed changes

		const Addr shadowIndex = getShadowIndex(
		index, i, position, startPC, indexFoldedHist[i].get());

		const Addr shadowIndex = getShadowIndex(
		newIndex, ti, position, startPC, indexFoldedHist[ti].get());

Conversation

jensen-yan commented Apr 28, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Release Notes

Uh oh!

coderabbitai Bot commented Apr 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rate limit exceeded

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

Poem

❌ Failed checks (1 warning, 1 inconclusive)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Apr 28, 2026

Choose a reason for hiding this comment

Uh oh!

XiangShanRobot commented Apr 28, 2026

Align BTB Performance

Overall Score

Uh oh!

XiangShanRobot commented Apr 28, 2026

Align BTB Performance

Overall Score

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot May 7, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot May 7, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot commented May 7, 2026

🚀 Coremark Smoke Test Results

Uh oh!

XiangShanRobot commented May 7, 2026

Align BTB Performance

Overall Score

Align BTB Performance

Overall Score

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

jensen-yan commented Apr 28, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 28, 2026 •

edited

Loading