Skip to content

[tracing] Add kernel tracing subsystem and integration validation#401

Open
xuchang-vivo wants to merge 4 commits intovivoblueos:mainfrom
xuchang-vivo:xc/add_tracing_event
Open

[tracing] Add kernel tracing subsystem and integration validation#401
xuchang-vivo wants to merge 4 commits intovivoblueos:mainfrom
xuchang-vivo:xc/add_tracing_event

Conversation

@xuchang-vivo
Copy link
Copy Markdown
Contributor

@xuchang-vivo xuchang-vivo commented Apr 17, 2026

Summary

This PR introduces the first usable kernel tracing subsystem for BlueOS and validates it on qemu_mps2_an385 debug configuration.

Included in this PR

  • Add kernel tracing core module:
    • kernel/src/tracing/{mod,control,event,buffer,dump}.rs
  • Add trace event hooks at key runtime paths:
    • scheduler, thread lifecycle, IRQ enter/exit, syscall entry/exit, mutex wait/hold, memory alloc/free/fail
  • Add tracing Kconfig and enable tracing in:
    • kconfig/config/qemu_mps2_an385/debug/defconfig (CONFIG_TRACING=y)
  • Add integration test coverage:
    • kernel/tests/test_tracing.rs
    • wired in kernel/tests/integration_test.rs
  • Improve tracing init robustness on constrained Cortex-M stack:
    • avoid large stack temporary in tracing::buffer::init
    • explicitly reset tracing enable state in tracing::control::init
  • Improve allocator-side trace points and keep clippy happy in non-tracing build paths.

Embedded Tracing (Brief Introduction)

In embedded systems, tracing is a low-overhead way to record runtime behavior as structured events, especially for timing-sensitive and concurrency issues that are hard to debug with logs.

Compared with plain printf debugging:

  • Tracing events are structured (event_id, timestamp, cpu/thread context, payload), easier to correlate and visualize.
  • It has controllable overhead (event filtering, sampling, bounded ring buffer).
  • It is better suited for post-mortem or timeline analysis (scheduler/IRQ/lock/memory interactions).

Typical architecture:

  1. Probe points in kernel/runtime hot paths emit events.
  2. Trace control manages enable/disable, config, and counters.
  3. Ring buffer stores events with bounded memory and drop accounting.
  4. Exporter/dump serializes events for host-side analysis tools (e.g. Perfetto adapters in next stage).

This PR implements the kernel-side foundation (probe + control + buffer + dump + tests) as step 1.

Validation

  • Build and run integration tests on qemu_mps2_an385.debug:
    • ninja -C out/qemu_mps2_an385.debug run_integration_test
  • Verified tracing integration test passes and prints parsed dumped events:
    • TraceStart, MmAlloc, MmFree

Notes

  • Shell trace placeholder command is intentionally removed in a separate app-level change and is not part of this kernel PR.
  • Host-side visualization pipeline (Perfetto import path) is left for follow-up PRs.

Reference Links (Embedded Tracing)

Lock-Free Ring Buffer Strategy

To avoid deadlock risks in IRQ/SMP contexts, tracing write path now uses a lock-free per-CPU ring-buffer design:

  • Per-CPU ring: each CPU writes to its own ring to minimize cross-core contention.
  • Sequence reservation: writer gets a unique slot sequence via atomic fetch_add on write_seq.
  • Two-phase publish: writer fills record first, then publishes commit_seq with Release ordering.
  • Reader validation: dump side reads commit_seq with Acquire ordering and only consumes entries where commit_seq == seq.
  • Bounded overwrite: when producer outruns capacity, old entries are overwritten by sequence wrap as expected for ring buffers.

Why this strategy:

  • No spin-mutex in tracing hot path, so IRQ re-entrance will not self-deadlock on the same CPU.
  • Works for SMP concurrent writers without global lock serialization.
  • Keeps trace write path deterministic and low-overhead for embedded runtime.

Current tradeoff:

  • Dump output is currently collected per-CPU and not globally timestamp-merged yet. If needed, a follow-up can add k-way merge by timestamp.

Recorded Events Matrix

Event ID Event Name Category Recorded now? Main hook points
0x0001 TraceStart Trace lifecycle Yes tracing::start()
0x0002 TraceStop Trace lifecycle Yes tracing::stop()
0x0003 TraceDropped Trace lifecycle Not emitted as an event (counter only) Drop count is exported in dump metadata
0x0101 SchedSwitch Scheduler Yes scheduler::switch_current_thread()
0x0102 ThreadCreate Thread lifecycle Yes thread::Builder::build()
0x0103 ThreadExit Thread lifecycle Yes scheduler retire path
0x0104 ThreadWakeup Thread lifecycle Defined, not yet hooked Reserved for wakeup paths
0x0105 ThreadBlock Thread lifecycle Yes scheduler::suspend_me_until()
0x0201 IrqEnter Interrupt Yes IrqTrace::enter()
0x0202 IrqExit Interrupt Yes IrqTrace::leave()
0x0301 SysEnter Syscall Yes arch syscall entry (arm/riscv/aarch64)
0x0302 SysExit Syscall Yes arch syscall exit (arm/riscv/aarch64)
0x0401 LockWaitBegin Synchronization Yes sync::mutex wait path
0x0402 LockWaitEnd Synchronization Yes sync::mutex wait completion/timeout
0x0403 LockHoldBegin Synchronization Yes sync::mutex lock acquisition
0x0404 LockHoldEnd Synchronization Yes sync::mutex unlock path
0x0501 MmAlloc Memory Yes allocator::{malloc,realloc,malloc_align}
0x0502 MmFree Memory Yes allocator::{free,realloc,free_align}
0x0503 MmAllocFail Memory Yes allocator allocation failure paths
0x0601 CounterMemUsedBytes Counter Defined, not yet hooked Reserved for periodic memory usage counter

@xuchang-vivo
Copy link
Copy Markdown
Contributor Author

build_prs

@github-actions
Copy link
Copy Markdown

@github-actions
Copy link
Copy Markdown

❌ Job failed. Failed jobs: check_format (failure), build_and_check_boards (failure), see https://github.com/vivoblueos/kernel/actions/runs/24561455116.

@xuchang-vivo
Copy link
Copy Markdown
Contributor Author

build_prs

@github-actions
Copy link
Copy Markdown

@github-actions
Copy link
Copy Markdown

❌ Job failed. Failed jobs: build_and_check_boards (failure), see https://github.com/vivoblueos/kernel/actions/runs/24561980464.

@xuchang-vivo
Copy link
Copy Markdown
Contributor Author

build_prs

@github-actions
Copy link
Copy Markdown

@github-actions
Copy link
Copy Markdown

✅ All jobs completed successfully, see https://github.com/vivoblueos/kernel/actions/runs/24562282519.

@xuchang-vivo
Copy link
Copy Markdown
Contributor Author

build_prs

@github-actions
Copy link
Copy Markdown

@github-actions
Copy link
Copy Markdown

✅ All jobs completed successfully, see https://github.com/vivoblueos/kernel/actions/runs/24562749518.

Copy link
Copy Markdown
Contributor

@nooopslol nooopslol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add instructions for using the tracing tool, including how to capture traces and how to parse them.

@xuchang-vivo
Copy link
Copy Markdown
Contributor Author

Please add instructions for using the tracing tool, including how to capture traces and how to parse them.

The use cases can be found in test_tracing.rs. Currently, it does not support being triggered from shell commands and needs to be explicitly called in the firmware. Additionally, it has not been connected to an output backend yet. I will add shell commands and the output backend in a subsequent PR. This will need to wait until I have refactored the serial output module.

This PR will serve as a buildblock, with additional tracing features to be added later.

tracing::init();

tracing::start(TraceConfig::default());

let p = allocator::malloc(128);          <--- be recorded
allocator::free(p);                              <--- be recorded

let mut dump = [0u8; 4096];
let dumped = tracing::dump_to_slice(&mut dump);    <--- get packaged data

let event_ids = collect_event_ids(&dump, dumped);    <--- parse packaged data

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants