Skip to content

Add CUTE matrix controller abstraction#862

Open
tastynoob wants to merge 1 commit into
xs-devfrom
xsai-cute-abstract-ai-config
Open

Add CUTE matrix controller abstraction#862
tastynoob wants to merge 1 commit into
xs-devfrom
xsai-cute-abstract-ai-config

Conversation

@tastynoob
Copy link
Copy Markdown
Collaborator

@tastynoob tastynoob commented May 25, 2026

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced command-line options to configure matrix operation timing parameters (cycles, latencies, pipeline settings) for fine-tuned performance analysis
    • Added new AI-optimized simulation configuration for matrix-related performance evaluation
  • Documentation

    • Updated simulation documentation with enhanced timing models and validation information for matrix operations

Review Change Stack

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 25, 2026

📝 Walkthrough

Walkthrough

This PR refactors RISC-V matrix operation support from inline ISA state to a dedicated MatrixController component, adds matrix timing configuration parameters, implements timing-aware scheduling with LocalMMU/L2 pipeline modeling, includes comprehensive unit tests, and integrates into a new AI simulation configuration script.

Changes

Matrix control plane refactoring and timing configuration

Layer / File(s) Summary
Configuration options and ISA parameter wiring
configs/common/Options.py, src/arch/riscv/RiscvISA.py, configs/common/CacheConfig.py, configs/example/se.py
Added matrix timing command-line knobs (issue intervals, load/store/zero/compute/release cycles, LocalMMU/L2 pipeline latencies). Extended RiscvISA class with corresponding Param.Unsigned fields. Integrated timing-option application into config cache setup. Updated se.py to track explicitly-provided options and avoid overriding them with defaults.
MatrixController public API and architecture
src/matrix/matrix_controller.hh
Defined compile-time constants, task/memory enums (TaskOp, FuType, MemPort, TaskEvent, ElemWidth), public Statistics/TimingConfig/ControlSnapshot structs, and public API for matrix operations, token lifecycle, serialization, and checkpoint integration. Declared private state structures and helper methods.
MatrixController core implementation and timing scheduling
src/matrix/matrix_controller.cc
Implemented statistics setup, reset/config methods, token and register accessors, functional matrix operations (load/store/zero/mmacc), task scheduling with decoded FIFO, memory pipeline with LocalMMU arbitration and request/response slot reservation, timing calculations, checkpoint serialization with backward compatibility, and token retirement with per-token/per-source ready ticks.
MatrixController unit tests and statistics infrastructure
src/matrix/SConscript, src/matrix/matrix_controller.test.cc, src/matrix/matrix_stats_test_stub.cc
Added comprehensive GoogleTest cases covering timing behavior, memory pipeline scheduling, LocalMMU source allocation/release, token lifecycle, and checkpoint integration. Implemented statistics test stub for unit test infrastructure (Info/InfoAccess/Group lifecycle and stat management).
RiscvISA integration with MatrixController
src/arch/riscv/isa.hh, src/arch/riscv/isa.cc
Updated RiscvISA header to replace inline matrix state with std::unique_ptr<MatrixController>, added destructor, and updated matrix-related method signatures to accept ExecContext*. Updated isa.cc to construct and configure MatrixController in ISA constructor, removed old matrix blob helpers, delegated operations to controller, and updated checkpoint serialization. Removed old matrix.hh/matrix.cc files.
RISC-V decoder and instruction classification updates
src/arch/riscv/isa/decoder.isa, src/arch/riscv/isa/includes.isa, src/arch/riscv/insts/SConscript
Updated matrix instruction handlers to pass ExecContext to ISA functions. Updated operation classification: tile setters to IntAluOp, loads to MemReadOp, stores to MemWriteOp, mmacc to IntMultOp. Removed matrix.hh includes and matrix.cc from build.
AI ideal KMHV3 configuration with matrix timing defaults
configs/example/ai_idealkmhv3.py
Added new configuration script with AI_MATRIX_TIMING_DEFAULTS dictionary, setAiMatrixTimingDefaults() and setAiKmhV3IdealParams() helpers, and m5_main entrypoint to initialize XIANGSHAN system with ideal CPU/cache parameters and apply matrix timing defaults.
SE matrix smoke documentation updates
docs/Gem5_Docs/xsai/se_matrix_smoke.md
Updated documentation to reflect MatrixController-based architecture, detailed minimum control plane/timing skeleton, memory access modeling (functional-first with fixed ready ticks), behavioral CUTE-to-L2/LocalMMU pipeline specification, RiscvISA analytic timing constants and command-line parameters, observable statistics items, verification approach with unit tests, and known limitations/future work for timing calibration.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • OpenXiangShan/GEM5#819: The main PR is related because both modify RISC-V matrix support in RiscvISA/matrix instruction plumbing—specifically this PR replaces the SE matrix state in RiscvISA with a MatrixController and changes matrix control/compute method signatures (e.g., matrixAcquire/matrixRelease to take ExecContext*), which is directly connected to the matrix smoke implementation work in PR #819.

Suggested labels

perf

Suggested reviewers

  • jensen-yan
  • Yakkhini
  • Ergou-ren

Poem

🐰 Whiskers twitch with glee
Matrix state now runs so free,
Controller takes the lead—
Timing tuned, performance freed! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 1.54% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: introducing a new CUTE matrix controller abstraction across the codebase.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch xsai-cute-abstract-ai-config

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (3)
configs/example/ai_idealkmhv3.py (1)

43-51: ⚡ Quick win

Rename new helper functions to lower_snake_case.

setAiMatrixTimingDefaults and setAiKmhV3IdealParams should be snake_case, with matching updates at call sites (Lines 175 and 185).

Proposed rename
-def setAiMatrixTimingDefaults(args):
+def set_ai_matrix_timing_defaults(args):
@@
-def setAiKmhV3IdealParams(args, system):
+def set_ai_kmh_v3_ideal_params(args, system):
@@
-    setAiMatrixTimingDefaults(args)
+    set_ai_matrix_timing_defaults(args)
@@
-    setAiKmhV3IdealParams(args, test_sys)
+    set_ai_kmh_v3_ideal_params(args, test_sys)

As per coding guidelines, "Functions and methods should use lower_snake_case naming convention".

Also applies to: 51-51, 161-161, 175-175, 185-185

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@configs/example/ai_idealkmhv3.py` around lines 43 - 51, Rename the two
functions to follow lower_snake_case: change setAiMatrixTimingDefaults ->
set_ai_matrix_timing_defaults and setAiKmhV3IdealParams ->
set_ai_kmh_v3_ideal_params, then update every call/reference to those symbols
(e.g., any occurrences of setAiMatrixTimingDefaults(...) and
setAiKmhV3IdealParams(...)) to the new names; ensure any imports/exports or
other usages in this module are updated to match the new identifiers so there
are no unresolved references.
configs/example/se.py (1)

127-143: ⚡ Quick win

Use lower_snake_case for helper function names.

Line 127 (explicitOptionDests) and Line 143 (setDefaultArgs) should be renamed to snake_case to match the project naming rule; update the Line 187 call accordingly.

Proposed rename
-def explicitOptionDests(parser, argv):
+def explicit_option_dests(parser, argv):
@@
-def setDefaultArgs(args, explicit_options):
+def set_default_args(args, explicit_options):
@@
-setDefaultArgs(args, explicitOptionDests(parser, sys.argv))
+set_default_args(args, explicit_option_dests(parser, sys.argv))

As per coding guidelines, "Functions and methods should use lower_snake_case naming convention".

Also applies to: 187-187

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@configs/example/se.py` around lines 127 - 143, Rename the helper functions
explicitOptionDests and setDefaultArgs to lower_snake_case (e.g.,
explicit_option_dests and set_default_args) and update every reference/call site
(including the call currently at line 187) to use the new names; specifically
change the function definitions for explicitOptionDests and setDefaultArgs and
any usage of those symbols so the module follows the project's snake_case
convention.
src/matrix/matrix_controller.test.cc (1)

16-26: ⚡ Quick win

Rename helper to lower_snake_case for consistency with repo rules.

testTimingConfig() should follow the same function naming convention as production code.

As per coding guidelines, Functions and methods should use lower_snake_case naming convention.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/matrix/matrix_controller.test.cc` around lines 16 - 26, Rename the helper
function testTimingConfig to follow lower_snake_case (e.g., test_timing_config)
and update all callsites accordingly; the function returns a
MatrixController::TimingConfig, so keep the same body and return type but change
the symbol name (testTimingConfig -> test_timing_config) in the definition and
any tests or files that reference it to comply with the project's naming
convention.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/matrix/matrix_controller.cc`:
- Around line 985-1026: MatrixController::executeLoad and
MatrixController::executeStore ignore task.transpose and always perform row-wise
copies, producing incorrect results for transposed tasks; add a defensive check
at the start of both functions (executeLoad and executeStore) that panics (or
returns an error) if task.transpose is true with a clear message like
"functional load/store does not support transpose", so the unsupported path is
caught until proper transpose-layout support is implemented; locate the checks
near the start of the functions (before using accReg/abReg,
rows/cols/stride/bytesPerRow and calls to matrixReadBlob/matrixWriteBlob) and
use the existing panic_if mechanism to enforce this.

In `@src/matrix/matrix_controller.hh`:
- Around line 29-50: Rename all constants declared in MatrixController to
ALL_CAPS with underscores (e.g., TokenCount -> TOKEN_COUNT, MatrixRegCount ->
MATRIX_REG_COUNT, OutsideDataWidthBits -> OUTSIDE_DATA_WIDTH_BITS,
ReduceWidthBytes -> REDUCE_WIDTH_BYTES, MatrixMaxM -> MATRIX_MAX_M,
MatrixABRegBytes -> MATRIX_AB_REG_BYTES, etc.) and rename methods to
lower_snake_case (e.g., setTimingConfig -> set_timing_config) to match
repository naming rules; update every use site accordingly (the declaration
block shown and the rest of the file/range referenced 240-516) and run a
project-wide search/replace for each unique symbol to avoid breaking references.

---

Nitpick comments:
In `@configs/example/ai_idealkmhv3.py`:
- Around line 43-51: Rename the two functions to follow lower_snake_case: change
setAiMatrixTimingDefaults -> set_ai_matrix_timing_defaults and
setAiKmhV3IdealParams -> set_ai_kmh_v3_ideal_params, then update every
call/reference to those symbols (e.g., any occurrences of
setAiMatrixTimingDefaults(...) and setAiKmhV3IdealParams(...)) to the new names;
ensure any imports/exports or other usages in this module are updated to match
the new identifiers so there are no unresolved references.

In `@configs/example/se.py`:
- Around line 127-143: Rename the helper functions explicitOptionDests and
setDefaultArgs to lower_snake_case (e.g., explicit_option_dests and
set_default_args) and update every reference/call site (including the call
currently at line 187) to use the new names; specifically change the function
definitions for explicitOptionDests and setDefaultArgs and any usage of those
symbols so the module follows the project's snake_case convention.

In `@src/matrix/matrix_controller.test.cc`:
- Around line 16-26: Rename the helper function testTimingConfig to follow
lower_snake_case (e.g., test_timing_config) and update all callsites
accordingly; the function returns a MatrixController::TimingConfig, so keep the
same body and return type but change the symbol name (testTimingConfig ->
test_timing_config) in the definition and any tests or files that reference it
to comply with the project's naming convention.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4eda3e4e-2389-4d31-9c3f-af61d504eac8

📥 Commits

Reviewing files that changed from the base of the PR and between 7930d36 and c85d705.

📒 Files selected for processing (18)
  • configs/common/CacheConfig.py
  • configs/common/Options.py
  • configs/example/ai_idealkmhv3.py
  • configs/example/se.py
  • docs/Gem5_Docs/xsai/se_matrix_smoke.md
  • src/arch/riscv/RiscvISA.py
  • src/arch/riscv/insts/SConscript
  • src/arch/riscv/insts/matrix.cc
  • src/arch/riscv/insts/matrix.hh
  • src/arch/riscv/isa.cc
  • src/arch/riscv/isa.hh
  • src/arch/riscv/isa/decoder.isa
  • src/arch/riscv/isa/includes.isa
  • src/matrix/SConscript
  • src/matrix/matrix_controller.cc
  • src/matrix/matrix_controller.hh
  • src/matrix/matrix_controller.test.cc
  • src/matrix/matrix_stats_test_stub.cc
💤 Files with no reviewable changes (4)
  • src/arch/riscv/insts/SConscript
  • src/arch/riscv/insts/matrix.hh
  • src/arch/riscv/isa/includes.isa
  • src/arch/riscv/insts/matrix.cc

Comment on lines +985 to +1026
MatrixController::executeLoad(ExecContext *xc, const TaskDesc &task)
{
#ifdef UNIT_TEST
panic("MatrixController memory loads are not linked in unit tests");
#else
ThreadContext *tc = xc->tcBase();

if (task.destIsAcc) {
panic_if(task.elemWidth != ElemWidth::E32,
"C matrix functional load currently supports e32 only");
panic_if(task.rows > MatrixMaxM || task.cols > MatrixMaxN,
"C matrix load shape exceeds accumulator register capacity");

auto &dst_reg = accReg(task.destReg);
for (uint32_t row = 0; row < task.rows; ++row) {
auto *dst =
reinterpret_cast<uint8_t *>(&dst_reg[row * MatrixMaxN]);
Fault fault = matrixReadBlob(tc, task.base + row * task.stride,
dst, task.bytesPerRow);
if (fault != NoFault) {
return fault;
}
}
return NoFault;
}

panic_if(task.elemWidth != ElemWidth::E8,
"AB matrix functional load currently supports e8 only");
panic_if(task.rows > MatrixMaxM || task.cols > MatrixMaxK,
"AB matrix load shape exceeds tile register capacity");

auto &dst_reg = abReg(task.destReg);
for (uint32_t row = 0; row < task.rows; ++row) {
auto *dst = reinterpret_cast<uint8_t *>(&dst_reg[row * MatrixMaxK]);
Fault fault = matrixReadBlob(tc, task.base + row * task.stride, dst,
task.bytesPerRow);
if (fault != NoFault) {
return fault;
}
}
return NoFault;
#endif
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Transpose flag is ignored in functional load/store path.

task.transpose is propagated into task construction and timing, but executeLoad/executeStore always perform non-transposed row-wise blob copies. This can silently produce wrong data for transpose operations.

Suggested defensive fix until transpose layout is implemented
 Fault
 MatrixController::executeLoad(ExecContext *xc, const TaskDesc &task)
 {
 `#ifdef` UNIT_TEST
     panic("MatrixController memory loads are not linked in unit tests");
 `#else`
+    panic_if(task.transpose,
+        "matrix functional transpose load is not implemented");
     ThreadContext *tc = xc->tcBase();
 Fault
 MatrixController::executeStore(ExecContext *xc, const TaskDesc &task)
 {
 `#ifdef` UNIT_TEST
     panic("MatrixController memory stores are not linked in unit tests");
 `#else`
+    panic_if(task.transpose,
+        "matrix functional transpose store is not implemented");
     ThreadContext *tc = xc->tcBase();

Also applies to: 1030-1073

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/matrix/matrix_controller.cc` around lines 985 - 1026,
MatrixController::executeLoad and MatrixController::executeStore ignore
task.transpose and always perform row-wise copies, producing incorrect results
for transposed tasks; add a defensive check at the start of both functions
(executeLoad and executeStore) that panics (or returns an error) if
task.transpose is true with a clear message like "functional load/store does not
support transpose", so the unsupported path is caught until proper
transpose-layout support is implemented; locate the checks near the start of the
functions (before using accReg/abReg, rows/cols/stride/bytesPerRow and calls to
matrixReadBlob/matrixWriteBlob) and use the existing panic_if mechanism to
enforce this.

Comment on lines +29 to +50
static constexpr uint32_t TokenCount = 32;
static constexpr uint32_t MatrixRegCount = 4;
static constexpr uint32_t DecodedQueueDepth = 8;
static constexpr uint32_t MicroTaskFifoSlots = 4;
static constexpr uint32_t LocalMmuSourceCount = 64;
static constexpr uint32_t OutsideDataWidthBits = 512;
static constexpr uint32_t OutsideDataWidthBytes =
OutsideDataWidthBits / 8;
static constexpr uint32_t ReduceWidthBytes = 64;
static constexpr uint32_t ResultWidthBytes = 4;
static constexpr uint32_t MatrixMN = 4;

static constexpr uint32_t MatrixMaxM = 128;
static constexpr uint32_t MatrixMaxK = 64;
static constexpr uint32_t MatrixMaxN = 128;
static constexpr uint32_t MatrixABRegBytes = MatrixMaxM * MatrixMaxK;
static constexpr uint32_t MatrixAccElems = MatrixMaxM * MatrixMaxN;

static constexpr uint8_t DefaultAReg = 0;
static constexpr uint8_t DefaultBReg = 1;
static constexpr uint8_t DefaultAccReg = 0;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift

Align new MatrixController API/constant names with repository naming rules.

The newly introduced constants and method names are mostly UpperCamelCase/lowerCamelCase (for example, TokenCount, setTimingConfig), which conflicts with the configured naming policy for this repo.

As per coding guidelines, Functions and methods should use lower_snake_case naming convention and Constants should use ALL_CAPS naming convention.

Also applies to: 240-516

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/matrix/matrix_controller.hh` around lines 29 - 50, Rename all constants
declared in MatrixController to ALL_CAPS with underscores (e.g., TokenCount ->
TOKEN_COUNT, MatrixRegCount -> MATRIX_REG_COUNT, OutsideDataWidthBits ->
OUTSIDE_DATA_WIDTH_BITS, ReduceWidthBytes -> REDUCE_WIDTH_BYTES, MatrixMaxM ->
MATRIX_MAX_M, MatrixABRegBytes -> MATRIX_AB_REG_BYTES, etc.) and rename methods
to lower_snake_case (e.g., setTimingConfig -> set_timing_config) to match
repository naming rules; update every use site accordingly (the declaration
block shown and the rest of the file/range referenced 240-516) and run a
project-wide search/replace for each unique symbol to avoid breaking references.

@github-actions
Copy link
Copy Markdown

🚀 Coremark Smoke Test Results

Branch IPC Change
Base (xs-dev) 2.3130 -
This PR 2.3130 ➡️ 0.0000 (0.00%)

✅ Difftest smoke test passed!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant