Skip to content

feat: M8.1c implementation#80

Open
artefactop wants to merge 66 commits into
mainfrom
feat/milestone-8.1c
Open

feat: M8.1c implementation#80
artefactop wants to merge 66 commits into
mainfrom
feat/milestone-8.1c

Conversation

@artefactop

@artefactop artefactop commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

M8.1c implementation.

Summary by CodeRabbit

  • Documentation

    • Expanded benchmarking docs (Fibonacci, eager-destruction) with examples, results, and how-to-run instructions
    • Updated roadmap to mark milestones complete
  • Tests

    • Added ASan and Valgrind leak-detection test harnesses and integration regressions exercising string ownership
    • New shared test utilities for building and running smoke tests
  • Chores

    • CI enhanced to run automated memory-safety checks on Linux/macOS
    • Added Docker-based Linux test runner and related scripts

artefactop added 30 commits June 5, 2026 22:34
Wires the runtime free function so subsequent M8.1c tasks can emit
free calls from codegen. AOT already links libryo_runtime.a;
declare_runtime_fn handles the per-call-site declaration.
Adds BranchId, FreePoint, and OwnershipSidecar types. The ownership
pass now returns the sidecar; codegen accepts it but does not yet
consult it. Pure plumbing — strings still leak.
The leading underscore is Rust's "intentionally unused" marker.
These helpers forward the sidecar to recursive calls — they do
use it. Match the live-name style used by the sibling helpers
(analyze_function, analyze_stmt, etc.).

Code-review follow-up to ef04688.
Backward walk over each function body finds the last reading
instruction for every owner that ends Valid. Schedules a Free
anchored to that instruction in the sidecar. Codegen consumption
lands in a follow-up — strings still leak today.
The u32::MAX / 2 check was duplicated between synthetic_param_ref
and the new backward last-use pass. Extract a named constant and
helper so the encoding lives in one place — protects against
silently breaking the filter if the encoding ever changes.

Code-review follow-up to 9aabc81.
Concat intermediates, formatter outputs, and str-returning calls
that don't become named-binding owners get a Free anchored after
their single consumer. Codegen consumption lands later.
…entry

Adds an inline comment explaining that the named_owners filter is
not redundant with the Valid-state filter (rebind_to_init resurrects
Valid after consume_for_assignment stamps Moved). Also files I-056
tracking the walker-shape duplication between collect_last_uses
and find_consumers.

Code-review follow-ups to 178fb10.
Dead-store warnings now have a paired Free in the sidecar so the
allocation is actually released. Anchored after the VarDecl
instruction. Pre-emptive dedup against free_on_reassign (Task 6)
included for safety. Codegen wiring follows.
Both analyze_var_decl and analyze_assign now go through a single
registration helper. Removes the 2-site drift surface that I-055
flagged. No behavior change.

Code-review follow-up to bc234da.
mut str rebinding pushes the old owner into free_on_reassign,
keyed by the Assign instruction. Old owner only recorded when
it's Valid — skip Borrowed/Moved/missing cases. Codegen consults
this map on Assign lowering (Task 8 follow-up).

Reassignment also clears the old owner's pending_dead_store
entry: the destructor that fires via free_on_reassign is an
observable use, so the prior decl/assign is no longer a dead
store and W0001 must not fire for it.
After every materialised statement, sweep OwnershipSidecar
free_schedule and call ryo_str_free for any unconditional entry
whose anchor TirRef is in inst_values. Statement-anchored Frees
(Task 5 dead-store) are emitted directly via emit_unconditional_frees;
sub-expression-anchored Frees (Task 3 last-use, Task 4 anonymous-
temp) flow through the end-of-stmt sweep so they land after the
consumer (e.g. print's write syscall) has finished using the
buffer.

freed_at (HashSet of free_schedule indices) guards against
double-emission. Scalar-cached targets (e.g. the borrowed-rodata
StrConst arg of __ryo_panic) are skipped — no heap allocation to
release. Targets whose ValueRepr isn't yet cached at sweep time
defer to the next sweep, accommodating a known ownership-pass
interaction where last_use uses post-rebind current_owner.

Branch-gated frees come in Task 9.
collect_last_uses previously resolved Var(name) reads via the
end-of-function current_owner state. After mut reassignment, that
resolved to the post-rebind owner — wrong target for any read
preceding the rebind. Today the bug is masked because both old
and new rebinding owners are rodata StrConsts (cap=0 → no-op
free), but heap-allocated strings would double-free.

Fix: take a per-Var-read snapshot during the forward walk
(`owner_at_read`) and look up that snapshot in collect_last_uses
instead of consulting the post-walk current_owner. Also exclude
owners already covered by free_on_reassign from the last-use
schedule so a pre-rebind owner isn't freed twice.

With the targeting bug fixed, codegen's defer-on-missing arm in
emit_unconditional_frees becomes unreachable and is replaced
with a hard error. Filed I-057 for the related borrowed-scalar
over-schedule (Scalar-skip in codegen retained, see comment).

Code-review follow-up to 3e154bf (Task 7).
Looks up sidecar.free_on_reassign[Assign] and emits
ryo_str_free(old_ptr, old_cap) before storing the new
fat-pointer triple. Old values read from the binding's
StrLocals Variables, not inst_values.
Releases the &ctx.str_locals borrow before declare_str_free needs
&mut ctx.module. Cheap (three Variable newtypes).

Code-review follow-up to 54d52f5.
Ownership pass assigns a BranchId per if/elif/else arm and
schedules branch-gated Frees for owners that end Valid in some
arms but Moved in others. Codegen tracks the active arm via
branch_stack and gates Free emission on branch_active.

Renames the *_unconditional_frees helpers to *_due_frees to
reflect the new behaviour.
The previous "hello" rodata literal had cap=0, making
ryo_str_free a no-op — the test would pass even if the
branch-gated Free emission were missing. Switching to
int_to_str(42) gives a real heap allocation, so the test now
actually exercises the conditional-Free path. ASan in Task 11
will turn this into a hard regression catcher.

Code-review follow-up to 6398be2.
Ownership pass scans loop bodies for break/continue and
schedules unconditional Frees for pre-loop owners still Valid
at the jump (modulo post-loop last-use). Codegen emits the
Frees before the Cranelift jump terminator.

Inside-loop owner break-handling is a known limitation for
M8.1; document if the pattern arises.
Two clarifying comments in schedule_break_continue_frees:
- Note that loop-condition refs also fall under the inside-loop
  body_min gate, harmless today (bool conditions don't allocate).
- Explain why post-loop last-use uses strict `>` not `>=`.

Code-review follow-up to 0e181b9.
Compiles representative str-using programs, relinks the object
file with -fsanitize=address via zig cc, and asserts zero leaks
on execution. RYO_KEEP_OBJ opts out of obj cleanup so the asan
relink can find the artifact.

Tests cover: simple-hello, concat-chain, mut-reassign,
conditional-move, break-loop. All exercise heap allocations so
missed Frees fail the suite under ASan.
Two cleanups from Task 11 review:
- zig_path() now sorts toolchain candidates lex (deterministic
  across filesystems) and returns PathBuf directly instead of
  the misleading Option that always returned Some.
- RYO_KEEP_OBJ now has an inline comment in build_file
  explaining its purpose and pointing at the ASan smoke test
  that consumes it.

Code-review follow-up to d23931f.
OwnershipSidecar previously held a single program-wide
free_schedule, free_on_reassign, and if_branches. But TirRefs
are per-function — each Tir's arena restarts at TirRef(1). The
sweep at end-of-stmt in codegen would walk every entry across
every function, occasionally matching another function's
TirRefs against the current function's inst_values and emitting
wrong frees.

Concrete repro: a str-returning user function called from main
plus a print(msg) afterwards would garble msg's bytes (its
heap buffer was freed prematurely by frees scheduled for the
other function).

Fix: split the sidecar into per-function entries keyed by
function StringId. Codegen looks up the entry for the current
function and threads only that into FunctionContext.

Adds cross_function_str_return_runs_clean to lock the bug down.
All three sub-milestones (8.1a runtime + heap str, 8.1b
ownership pass + move tracking, 8.1c ASAP destruction) are
merged. Strings no longer leak; ASan CI lane gates the
correctness contract.
The M8.1c break/continue Free scheduling deliberately excludes
inside-loop owners — covered in inline comments but not yet in
ISSUES.md. Document as a known limitation for follow-up.
Add self-contained benchmarks/eager_destruction recursive memory benchmark comparing Rust vs Ryo. Refactor benchmarks/fibonacci and standardise hyperfine runners to use --shell=none and local .gitignore rules.
Pre-fix: merge_branches only merged states, current_owner, origin,
and pending_dead_store. temp_owners and named_owners were dropped
on every branch/loop merge, silently leaking strings allocated
inside loop bodies (heap exhaustion in long-running loops).

Post-fix: union both sets across branches so the anonymous-temp
Free pass and the last-use pass see them at function exit.
Pre-fix: a binding declared before a loop and read only inside
the loop body was leaked on normal loop exit. The forward walk
populated owner_at_read[var_ref] = owner during the body pass,
but merge_branches discarded that snapshot, so the reverse-order
last-use pass at function end could not find the read and never
scheduled a Free for the owner.

Post-fix: union owner_at_read across branches with first-write
wins (read TirRefs are unique in the TIR).
artefactop added 11 commits June 8, 2026 16:11
Extract shared emit_frees helper in codegen for emit_due_frees /
sweep_due_frees (was 95% duplicated, including the I-057 hard-error).
Fuse the three sequential branch-iteration loops in merge_branches into
one. Replace the awkward four-arm match in the binding-aware merge with
an explicit take flag, preserving the first-observed-Moved span. Drop
the duplicate reassign_targets rebuild and inline the all_keys union
into a flat_map collect.
Five invariant-gap / scaling issues surfaced during the cleanup pass
that are out of scope for M8.1c but worth tracking:
I-059 panic ABI by name-match, I-060 rebind_to_init lattice
resurrection, I-061 Ownership clone bloat in branch/loop analysis,
I-062 quadratic free_schedule scan, I-063 smoke-harness duplication
plus zig_path drift from toolchain::ensure_zig.
@coderabbitai

coderabbitai Bot commented Jun 8, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

The PR produces an OwnershipSidecar from the ownership pass and threads it through the pipeline into codegen so ryo_str_free calls are emitted at scheduled FreePoints (statement anchors, reassigns, loop exits, branch-gated arms); it adds unit/integration tests, ASan/Valgrind smoke harnesses, CI jobs, container test artifacts, benchmarks, and docs.

Changes

Ownership sidecar & codegen string freeing

Layer / File(s) Summary
Ownership sidecar types & lattice foundation
src/ownership.rs
Adds BranchId, FreePoint, IfBranchIds, OwnershipSidecar, FunctionSidecar and extends the Ownership lattice with metadata (temp_owners, named_owners, owner_at_read, pending_dead_store, next_branch_id).
Ownership pass free scheduling & control flow
src/ownership.rs
Refactors analyze_function and analyzers to thread FunctionSidecar, compute last-uses, map temp consumers, schedule free_schedule and free_on_reassign, allocate per-if BranchIds, and thread sidecar through loops/returns.
Ownership analysis unit & regression tests
src/ownership.rs
Adds unit/regression tests validating dead-store anchoring, __ryo_panic temp exclusion, inside-loop frees, loop-break/continue scheduling, reassign frees, last-use correctness, and BranchId monotonicity.
Codegen sidecar threading & free emission
src/codegen.rs
Threads OwnershipSidecar into compile/compile_and_dump_ir; extends FunctionContext (freed_at, free_by_after, pending_sweep, branch_stack); emits due frees after non-terminating statements and before Break/Continue; frees old heap strings on Assign; gates frees with BranchId push/pop and registers ryo_str_free for JIT.
Pipeline sidecar wiring & CLI
src/pipeline.rs, src/main.rs
lower_and_analyze returns (Vec<Tir>, OwnershipSidecar); ir_command, run_file, and build_file pass sidecar to codegen; build_file conditionally preserves .o when RYO_KEEP_OBJ set; toolchain status gains --path used by tests.

Runtime validation & leak detection

Layer / File(s) Summary
Shared test harness & fixtures
tests/common/mod.rs
Adds build_and_link, Zig/toolchain detection via ryo toolchain status --path, runtime lookup, RYO_FIXTURES table and find_fixture.
ASan smoke test harness & cases
tests/asan_smoke.rs
Linux/macOS ASan harness using common::build_and_link with -fsanitize=address and ASAN/LSAN options; adds fixture tests for strings and ownership regressions.
Valgrind smoke test harness & cases
tests/valgrind_smoke.rs
Linux-only Valgrind harness that skips when valgrind missing, builds via common::build_and_link, runs under Valgrind leak-check settings, and asserts success for multiple fixtures.
Integration runtime tests
tests/integration_tests.rs
Adds tests validating last-use across statements, print/concat/reassign free behavior, branch-gated frees, loop-break free emission, cross-function returns, and branch-id collision regression.
CI jobs & Docker test container
.github/workflows/ci.yml, Dockerfile, run_linux_tests.sh, CLAUDE.md
Adds GitHub Actions asan and valgrind jobs that install/cache Zig and run smoke suites; provides Dockerfile with libunwind-dev and valgrind for container tests; adds run_linux_tests.sh to build/run the container and documents command.

Documentation, benchmarking & issue tracking

Layer / File(s) Summary
Eager destruction benchmark suite & docs
benchmarks/eager_destruction/*
Adds eager-destruction benchmark suite (.ryo and Rust), README comparing RAII vs eager freeing with results, and run_benchmarks.sh measuring RSS and running hyperfine.
Benchmark docs, scripts & gitignore
benchmarks/*, benchmarks/fibonacci/*
Reorganizes benchmark README to directory-level overview, updates Fibonacci README and scripts to check prerequisites, keep stderr visibility, and increase hyperfine warmup to 3 (--shell=none); updates gitignore entries.
Known issues & roadmap
ISSUES.md, docs/dev/implementation_roadmap.md
Adds ownership/codegen correctness issues (I-059,I-060,I-061,I-064–I-066,I-067) and marks Milestone 8.1 complete in roadmap.

Sequence Diagram(s)

sequenceDiagram
  participant CLI as ryo CLI
  participant Pipeline as pipeline::lower_and_analyze
  participant Ownership as ownership::check
  participant Codegen as codegen::compile
  participant Runtime as ryo runtime (ryo_str_free)

  CLI->>Pipeline: parse & lower file
  Pipeline->>Ownership: crate::ownership::check(tirs)
  Ownership-->>Pipeline: OwnershipSidecar (FunctionSidecar per function)
  Pipeline->>Codegen: compile(tirs, pool, sidecar)
  Codegen->>Runtime: emit calls to ryo_str_free at FreePoints (statement, reassign, loop-exit, branch-gated)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • ryolang/ryo#66: Related control-flow handling changes touching break/continue lowering used by free emission logic.
  • ryolang/ryo#63: Adds/adjusts panic/assert builtins; this PR special-cases __ryo_panic in temp scheduling.
  • ryolang/ryo#35: Prior CI workflow changes that overlap with added ASan/Valgrind CI jobs.

Poem

🐰 I hop through TIR and sidecar trails,

FreePoints bloom where last-use prevails.
Codegen listens, pushes BranchIds high,
ASan and Valgrind watch as leaks say goodbye.
A tiny rabbit cheers: no heap left to cry.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name Status Explanation Resolution
Title check ❓ Inconclusive The title 'feat: M8.1c implementation' is vague and generic, using only a milestone identifier without describing the actual technical changes. Consider using a more specific title that describes the primary technical achievement, such as 'feat: implement ASAP string deallocation with ownership sidecar codegen' or 'feat: add owned string free emission with branch-gated scheduling'.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/milestone-8.1c

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 10

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/codegen.rs (1)

615-647: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Flush scheduled frees before return, not just before loop jumps.

Break/Continue already call emit_due_frees() before emitting the terminator, but Return and ReturnVoid return immediately. Ownership can still anchor a FreePoint on a return statement—for example, a pre-if string that stays Valid in the returning arm and is Moved in another arm—so this path currently exits without emitting that cleanup.

Suggested fix
             TirTag::Return => {
                 let operand = match inst.data {
                     TirData::UnOp(o) => o,
                     _ => unreachable!("Return must carry TirData::UnOp"),
                 };
                 if is_str_type(ctx.tir.return_type, ctx.pool) {
                     let sret = ctx.sret_ptr.expect("str-returning fn must have sret_ptr");
                     let repr = Self::eval_inst_str(builder, ctx, operand)?;
                     let (ptr, len, cap) = match repr {
                         ValueRepr::Str { ptr, len, cap } => (ptr, len, cap),
                         _ => unreachable!("str return must produce ValueRepr::Str"),
                     };
                     builder.ins().store(MemFlags::trusted(), ptr, sret, 0);
                     builder.ins().store(MemFlags::trusted(), len, sret, 8);
                     builder.ins().store(MemFlags::trusted(), cap, sret, 16);
+                    Self::emit_due_frees(builder, ctx, r)?;
                     builder.ins().return_(&[]);
                 } else {
                     let val = Self::eval_inst(builder, ctx, operand)?;
+                    Self::emit_due_frees(builder, ctx, r)?;
                     builder.ins().return_(&[val]);
                 }
                 Ok(true)
             }
             TirTag::ReturnVoid => {
+                Self::emit_due_frees(builder, ctx, r)?;
                 let is_main = ctx.pool.str(ctx.tir.name) == "main";
                 if is_main {
                     let zero = builder.ins().iconst(ctx.int_type, 0);
                     builder.ins().return_(&[zero]);
                 } else {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/codegen.rs` around lines 615 - 647, Return paths (TirTag::Return and
TirTag::ReturnVoid) currently emit the return terminator before flushing
scheduled frees; call emit_due_frees(...) to flush any pending FreePoint cleanup
before emitting the return(s). Concretely, in the TirTag::Return arm (both the
string-sret branch and the non-sret branch) and in the TirTag::ReturnVoid arm,
invoke the same emit_due_frees(...) helper (passing the current builder and ctx
as used by Break/Continue) immediately before any builder.ins().return_(...) so
that ownership-based frees are emitted prior to returning.
src/ownership.rs (1)

1586-1601: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Don't clear dead-store fallback on maybe-unexecuted reads.

This treats every syntactic Var read as a definite use. Inside if/while regions that may never execute, removing pending_dead_store here suppresses the only unconditional cleanup path and leaves the owner relying on a last-use anchor in code that might be skipped entirely. s: str = int_to_str(1); if false: print(s) will now finish without any reachable free.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/ownership.rs` around lines 1586 - 1601, The code unconditionally treats
any syntactic Var read as a definite use by calling
own.pending_dead_store.remove(&owner) inside the block guarded by
own.current_owner.get(&name) && needs_tracking(inst.ty, pool); change this so
pending_dead_store is only removed when the read is guaranteed to execute (i.e.,
not in a maybe-unexecuted control-flow region). In practice, add or use the
existing control-flow/execution-certainty predicate (e.g., an
is_definitely_executed / is_guaranteed_path check around the read) and wrap the
pending_dead_store.remove(&owner) call with that predicate (leave
own.origin.insert(r, Some(owner)) and owner_at_read behavior as appropriate), or
alternatively move the removal to the later point where execution is known;
ensure the change references own.current_owner, needs_tracking(inst.ty, pool),
own.pending_dead_store.remove(&owner), own.origin.insert, and own.owner_at_read
so the fix is applied to the correct block.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/ci.yml:
- Around line 84-85: The workflow redirects output with an unquoted variable (>>
$GITHUB_OUTPUT) which can trigger shellword-splitting/globbing issues; update
each “Get Zig version” step to quote the redirection target (use >>
"$GITHUB_OUTPUT") so the shell treats the variable as a single filename; apply
this change to the lines using VERSION=$(grep ... src/toolchain.rs | cut -d'"'
-f2) and echo "version=$VERSION" >> $GITHUB_OUTPUT (and the two other identical
steps).
- Around line 76-78: The workflow currently uses mutable action tags and
unquoted shell expansions which can cause drifting behavior: replace tag
references like actions/checkout@v6, dtolnay/rust-toolchain@stable,
Swatinem/rust-cache@v2 and actions/cache@v4 with their corresponding immutable
full commit SHAs (pin each uses: to the full SHA), add an explicit top-level
permissions: block to the workflow (explicit least-privilege permissions
required by the job), and change any redirection to GITHUB_OUTPUT (instances of
>> $GITHUB_OUTPUT) to quoted form (>> "$GITHUB_OUTPUT") to satisfy shellcheck
SC2086.
- Line 72: Update the CI workflow to harden the asan and valgrind jobs by adding
least-privilege permissions and disabling checkout credential persistence: for
jobs.asan and jobs.valgrind add a permissions block (e.g., permissions:
contents: read) and update the checkout step that uses actions/checkout@v6 to
include with: { persist-credentials: false } so the job cannot access the
GITHUB_TOKEN beyond read-only content access.

In `@benchmarks/eager_destruction/README.md`:
- Around line 67-74: The table header is inconsistent with the actual benchmark
depth: the README table column "Result at 80,000 Depth" conflicts with the
stated run depth and the run_benchmarks.sh script which executes at 50,000; fix
by either renaming the column header to "Result at 50,000 Depth" to match the
existing runs or add and document an additional benchmark run at 80,000 in
run_benchmarks.sh and regenerate the table results, updating the table rows for
Ryo and Rust accordingly; locate the table header text "Result at 80,000 Depth"
in the README and the script invocation in run_benchmarks.sh to make the
consistent change.

In `@benchmarks/eager_destruction/run_benchmarks.sh`:
- Line 18: In run_benchmarks.sh replace the direct "cargo build --release >
/dev/null" invocation so it always runs from the repository root: run the cargo
build inside a subshell that first cd's up to the repo root (two levels up from
benchmarks/eager_destruction) and then executes cargo build --release, while
still redirecting output to /dev/null; locate the line containing the exact
string "cargo build --release > /dev/null" and change it accordingly.

In `@benchmarks/fibonacci/run_benchmarks.sh`:
- Around line 4-13: The prereq check block in run_benchmarks.sh only validates
rustc and hyperfine but the script later uses additional tooling; add the same
fail-fast checks for every tool the script invokes (e.g., go, swiftc, kotlinc,
java, node (or uv if that's the binary used), bun, and any other
language/runtime compilers/runners you call) by adding `if ! command -v <tool>
&> /dev/null; then echo "Error: '<tool>' is not installed or not in PATH."; exit
1; fi` entries into the existing prerequisite section so all required tools are
validated up-front.

In `@Dockerfile`:
- Around line 14-17: The apt install line in the Dockerfile installs
libunwind-dev and valgrind without --no-install-recommends which pulls extra
packages; update the RUN instruction that currently runs "apt-get update &&
apt-get install -y \ libunwind-dev \ valgrind \ && rm -rf /var/lib/apt/lists/*"
to include --no-install-recommends (i.e., apt-get install -y
--no-install-recommends) so only required packages are installed while keeping
the existing apt-get update and rm -rf /var/lib/apt/lists/* steps intact.

In `@docs/dev/implementation_roadmap.md`:
- Line 24: The index link fragment for the heading "Milestone 8.1 —
Heap-Allocated `str` Type & Move Semantics [alpha]" is incorrect; update the
fragment in the checklist link to match the exact generated heading anchor (copy
the anchor shown in the rendered HTML or GitHub preview) — the expected anchor
will look like milestone-81-heap-allocated-str-type-and-move-semantics-alpha
(replace the current fragment with that or the exact rendered variant) so the
MD051 anchor mismatch is resolved.

In `@src/ownership.rs`:
- Around line 1167-1195: collect_last_uses currently overwrites a single owner
-> read mapping so mutually-exclusive branch reads clobber each other; change
last_use to record multiple anchors per owner (e.g., HashMap<TirRefOwner,
Vec<TirRef>> or HashMap<TirRefOwner, HashSet<TirRef>>) and update
collect_last_uses (and any callers) to push/insert instead of replace,
preserving every read site returned by owner_at_read; then adjust the codepath
that consumes last_use (e.g., analyze_if_stmt and any free-emission logic) to
treat multiple anchors as branch-sensitive (emit branch-gated frees tied to each
unique read site or only emit unconditional free when an owner’s last-use sites
occur on all control-flow paths). Ensure references to owner_at_read and
walk_operands remain used to map each read to its originating owner.
- Around line 1048-1070: The speculative first-pass can leave mutations in the
real FunctionSidecar (sidecar) that must be discarded before the forced rewalk;
after doing the first visit_expr/analyze_stmt loop with the scratch sink
(scratch_sink) and before calling visit_expr/analyze_stmt again for the real
sink, restore the sidecar to its pre-pass snapshot (e.g. replace or clone-reset
the live sidecar from the saved entry snapshot) so nested-if free_schedule
entries from the scratch pass are removed; apply the same reset in the other
symmetric block (around lines handling the alternate branch at 1100-1112) where
states_differ triggers a second walk.

---

Outside diff comments:
In `@src/codegen.rs`:
- Around line 615-647: Return paths (TirTag::Return and TirTag::ReturnVoid)
currently emit the return terminator before flushing scheduled frees; call
emit_due_frees(...) to flush any pending FreePoint cleanup before emitting the
return(s). Concretely, in the TirTag::Return arm (both the string-sret branch
and the non-sret branch) and in the TirTag::ReturnVoid arm, invoke the same
emit_due_frees(...) helper (passing the current builder and ctx as used by
Break/Continue) immediately before any builder.ins().return_(...) so that
ownership-based frees are emitted prior to returning.

In `@src/ownership.rs`:
- Around line 1586-1601: The code unconditionally treats any syntactic Var read
as a definite use by calling own.pending_dead_store.remove(&owner) inside the
block guarded by own.current_owner.get(&name) && needs_tracking(inst.ty, pool);
change this so pending_dead_store is only removed when the read is guaranteed to
execute (i.e., not in a maybe-unexecuted control-flow region). In practice, add
or use the existing control-flow/execution-certainty predicate (e.g., an
is_definitely_executed / is_guaranteed_path check around the read) and wrap the
pending_dead_store.remove(&owner) call with that predicate (leave
own.origin.insert(r, Some(owner)) and owner_at_read behavior as appropriate), or
alternatively move the removal to the later point where execution is known;
ensure the change references own.current_owner, needs_tracking(inst.ty, pool),
own.pending_dead_store.remove(&owner), own.origin.insert, and own.owner_at_read
so the fix is applied to the correct block.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 18741f8a-0b5f-4b0c-a6cf-285dadf1d8fd

📥 Commits

Reviewing files that changed from the base of the PR and between a1f1b05 and a0255ba.

📒 Files selected for processing (23)
  • .github/workflows/ci.yml
  • CLAUDE.md
  • Dockerfile
  • ISSUES.md
  • benchmarks/.gitignore
  • benchmarks/README.md
  • benchmarks/eager_destruction/.gitignore
  • benchmarks/eager_destruction/README.md
  • benchmarks/eager_destruction/eager_destruction.rs
  • benchmarks/eager_destruction/eager_destruction.ryo
  • benchmarks/eager_destruction/eager_destruction_manual_drop.rs
  • benchmarks/eager_destruction/run_benchmarks.sh
  • benchmarks/fibonacci/.gitignore
  • benchmarks/fibonacci/README.md
  • benchmarks/fibonacci/run_benchmarks.sh
  • docs/dev/implementation_roadmap.md
  • run_linux_tests.sh
  • src/codegen.rs
  • src/ownership.rs
  • src/pipeline.rs
  • tests/asan_smoke.rs
  • tests/integration_tests.rs
  • tests/valgrind_smoke.rs

Comment thread .github/workflows/ci.yml
- run: cargo build -p ryo-runtime --release
- run: cargo test

asan:

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE=".github/workflows/ci.yml"

echo "== File exists? =="
test -f "$FILE" && echo "Found $FILE" || { echo "Missing $FILE"; exit 1; }

echo "== Relevant lines around 60-120 =="
nl -ba "$FILE" | sed -n '50,130p'

echo "== Search for job-level permissions =="
rg -n "^[[:space:]]*permissions:" "$FILE" || true

echo "== Search for actions/checkout steps and persist-credentials usage =="
rg -n "actions/checkout@v[0-9]+" "$FILE" || true
rg -n "persist-credentials" "$FILE" || true

echo "== Dump asan/valgrind job blocks (approx) =="
# Extract from 'asan:' to 'valgrind:' boundaries using awk heuristics
awk '
  $0 ~ /^[[:space:]]{2}asan:[[:space:]]*$/ {in_as=1; in_val=0}
  $0 ~ /^[[:space:]]{2}valgrind:[[:space:]]*$/ {in_as=0; in_val=1}
  $0 ~ /^[[:space:]]{2}[a-zA-Z0-9_-]+:[[:space:]]*$/ {
    if(in_as||in_val) {
      # If another top-level job begins, stop after printing its contents
      if($0 !~ /^(  (asan|valgrind)):/) {
        # keep as is; no-op
      }
    }
  }
  {print}
' "$FILE" > /tmp/ci.yml.full.txt

# Just print lines matching 'asan:' and 'valgrind:' and a few following lines
echo "---- asan lines ----"
nl -ba "$FILE" | rg -n "asan:" -n
echo "---- valgrind lines ----"
nl -ba "$FILE" | rg -n "valgrind:" -n

Repository: ryolang/ryo

Length of output: 201


🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE=".github/workflows/ci.yml"

echo "== Relevant sections (with line numbers via awk) =="
awk '{printf "%5d:%s\n", NR, $0}' "$FILE" | sed -n '1,200p'

echo "== Search for job-level permissions =="
awk 'match($0,/^[[:space:]]*permissions:/){print NR ":" $0}' "$FILE" || true

echo "== Search for checkout steps =="
awk 'index($0,"actions/checkout@v") {print NR ":" $0}' "$FILE" || true

echo "== Search for persist-credentials usage =="
awk 'index($0,"persist-credentials") {print NR ":" $0}' "$FILE" || true

echo "== Print asan job block (from asan: to next top-level job) =="
awk '
  $0 ~ /^  asan:[[:space:]]*$/ {in=1}
  $0 ~ /^  [a-zA-Z0-9_-]+:[[:space:]]*$/ && $0 !~ /^  asan:[[:space:]]*$/ {if(in){exit}}
  in {print}
' "$FILE" || true

echo "== Print valgrind job block (from valgrind: to next top-level job) =="
awk '
  $0 ~ /^  valgrind:[[:space:]]*$/ {in=1}
  $0 ~ /^  [a-zA-Z0-9_-]+:[[:space:]]*$/ && $0 !~ /^  valgrind:[[:space:]]*$/ {if(in){exit}}
  in {print}
' "$FILE" || true

Repository: ryolang/ryo

Length of output: 5314


🏁 Script executed:

#!/bin/bash
set -euo pipefail
FILE=".github/workflows/ci.yml"

echo "== All occurrences of 'permissions:' =="
rg -n "^[[:space:]]*permissions:" "$FILE" || true

echo "== Occurrences of 'persist-credentials' =="
rg -n "persist-credentials" "$FILE" || true

echo "== asan job: lines 68-110 (context) =="
sed -n '68,110p' "$FILE" | nl -ba -w2 -s': ' || true

echo "== valgrind job: lines 92-140 (context) =="
sed -n '92,140p' "$FILE" | nl -ba -w2 -s': ' || true

Repository: ryolang/ryo

Length of output: 322


Harden CI asan/valgrind jobs to reduce GitHub token exposure.

Add least-privilege job permissions (e.g., contents: read) and set actions/checkout@v6 to use with: { persist-credentials: false } in both jobs.asan and jobs.valgrind.

🧰 Tools
🪛 zizmor (1.25.2)

[warning] 72-96: overly broad permissions (excessive-permissions): default permissions used due to no permissions: block

(excessive-permissions)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/ci.yml at line 72, Update the CI workflow to harden the
asan and valgrind jobs by adding least-privilege permissions and disabling
checkout credential persistence: for jobs.asan and jobs.valgrind add a
permissions block (e.g., permissions: contents: read) and update the checkout
step that uses actions/checkout@v6 to include with: { persist-credentials: false
} so the job cannot access the GITHUB_TOKEN beyond read-only content access.

Source: Linters/SAST tools

Comment thread .github/workflows/ci.yml
Comment on lines +76 to +78
- uses: actions/checkout@v6
- uses: dtolnay/rust-toolchain@stable
- uses: Swatinem/rust-cache@v2

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE=".github/workflows/ci.yml"

# Show relevant sections around the cited lines (60-140)
echo "=== File head (first 120 lines) ==="
nl -ba "$FILE" | sed -n '1,160p'

# Extract all "uses:" lines in the file
echo
echo "=== All uses: lines ==="
rg -n --no-heading '^\s*-\s*uses:' "$FILE" || true

# Check for permissions blocks (workflow/job)
echo
echo "=== permissions: occurrences ==="
rg -n --no-heading '^\s*permissions:' "$FILE" || true
rg -n --no-heading 'permissions:' "$FILE" || true

# Check for GITHUB_OUTPUT usage
echo
echo "=== GITHUB_OUTPUT occurrences ==="
rg -n --no-heading '\$GITHUB_OUTPUT' "$FILE" || true
rg -n --no-heading 'GITHUB_OUTPUT' "$FILE" || true

# Check for shellcheck SC2086-like unquoted variables
echo
echo "=== Possible unquoted variable expansions in run scripts (best-effort) ==="
# Show lines with "echo" and GITHUB_OUTPUT or ">> $GITHUB_OUTPUT"
rg -n --no-heading 'run:\s*\|?|\>\>\s*\$GITHUB_OUTPUT|GITHUB_OUTPUT' "$FILE" || true

# Show jobs section names for context
echo
echo "=== Job names ==="
rg -n --no-heading '^  [a-zA-Z0-9_-]+:\s*$' "$FILE" | head -n 50 || true

Repository: ryolang/ryo

Length of output: 152


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE=".github/workflows/ci.yml"

# Show relevant sections around the cited lines (60-140)
echo "=== File head (first 220 lines) ==="
nl -ba "$FILE" | sed -n '1,220p'

# Extract all "uses:" lines in the file
echo
echo "=== All uses: lines ==="
rg -n --no-heading '^\s*-\s*uses:' "$FILE" || true

# Check for permissions blocks (workflow/job)
echo
echo "=== permissions: occurrences ==="
rg -n --no-heading '^\s*permissions:' "$FILE" || true
rg -n --no-heading 'permissions:' "$FILE" || true

# Check for GITHUB_OUTPUT usage
echo
echo "=== GITHUB_OUTPUT occurrences ==="
rg -n --no-heading '\$GITHUB_OUTPUT' "$FILE" || true
rg -n --no-heading 'GITHUB_OUTPUT' "$FILE" || true

# Check for unquoted $GITHUB_OUTPUT or other shellcheck SC2086 suspects
echo
echo "=== Lines with potential shellcheck SC2086 patterns ==="
rg -n --no-heading '\$[A-Za-z_][A-Za-z0-9_]*' "$FILE" | head -n 200

# Job names for context
echo
echo "=== Job keys ==="
rg -n --no-heading '^  [A-Za-z0-9_-]+:\s*$' "$FILE" || true

Repository: ryolang/ryo

Length of output: 152


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE=".github/workflows/ci.yml"

echo "=== sanity check: file exists + size ==="
ls -l "$FILE"

echo
echo "=== File head (first 220 lines) ==="
awk '{printf "%6d\t%s\n", NR, $0}' "$FILE" | sed -n '1,220p'

echo
echo "=== All uses: lines ==="
rg -n --no-heading '^\s*-\s*uses:' "$FILE" || true

echo
echo "=== permissions: occurrences ==="
rg -n --no-heading '^\s*permissions:' "$FILE" || true
rg -n --no-heading 'permissions:' "$FILE" || true

echo
echo "=== GITHUB_OUTPUT occurrences ==="
rg -n --no-heading '\$GITHUB_OUTPUT' "$FILE" || true
rg -n --no-heading 'GITHUB_OUTPUT' "$FILE" || true

echo
echo "=== Lines containing likely shell variable expansions (context) ==="
rg -n --no-heading '\$[A-Za-z_][A-Za-z0-9_]*' "$FILE" | head -n 200 || true

echo
echo "=== Jobs keys ==="
rg -n --no-heading '^  [A-Za-z0-9_-]+:\s*$' "$FILE" || true

Repository: ryolang/ryo

Length of output: 6189


Pin GitHub Actions to immutable SHAs.

  • .github/workflows/ci.yml uses tag-based actions (actions/checkout@v6, dtolnay/rust-toolchain@stable, Swatinem/rust-cache@v2, actions/cache@v4 at lines 22-23, 32-33, 36, 50-52, 62, 76-78, 88, 102-104, 116); pin each uses: to a full commit SHA.
  • Add an explicit top-level permissions: block (none is defined currently).
  • Quote $GITHUB_OUTPUT in the workflow steps (>> $GITHUB_OUTPUT at lines 59, 85, 113 → >> "$GITHUB_OUTPUT"), aligning with shellcheck SC2086.
Suggested pattern
-      - uses: actions/checkout@v6
+      - uses: actions/checkout@<full_commit_sha>
🧰 Tools
🪛 zizmor (1.25.2)

[warning] 76-76: credential persistence through GitHub Actions artifacts (artipacked): does not set persist-credentials: false

(artipacked)


[error] 76-76: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)

(unpinned-uses)


[error] 77-77: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)

(unpinned-uses)


[error] 78-78: unpinned action reference (unpinned-uses): action is not pinned to a hash (required by blanket policy)

(unpinned-uses)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/ci.yml around lines 76 - 78, The workflow currently uses
mutable action tags and unquoted shell expansions which can cause drifting
behavior: replace tag references like actions/checkout@v6,
dtolnay/rust-toolchain@stable, Swatinem/rust-cache@v2 and actions/cache@v4 with
their corresponding immutable full commit SHAs (pin each uses: to the full SHA),
add an explicit top-level permissions: block to the workflow (explicit
least-privilege permissions required by the job), and change any redirection to
GITHUB_OUTPUT (instances of >> $GITHUB_OUTPUT) to quoted form (>>
"$GITHUB_OUTPUT") to satisfy shellcheck SC2086.

Source: Linters/SAST tools

Comment thread .github/workflows/ci.yml
Comment on lines +84 to +85
VERSION=$(grep -o 'const ZIG_VERSION: &str = "[^"]*"' src/toolchain.rs | cut -d'"' -f2)
echo "version=$VERSION" >> $GITHUB_OUTPUT

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail
sed -n '60,130p' .github/workflows/ci.yml | nl -ba | sed -n '70,140p'

Repository: ryolang/ryo

Length of output: 98


🏁 Script executed:

#!/bin/bash
set -euo pipefail
nl -ba .github/workflows/ci.yml | sed -n '70,130p'

Repository: ryolang/ryo

Length of output: 98


🏁 Script executed:

#!/bin/bash
set -euo pipefail
file=".github/workflows/ci.yml"
rg -n 'echo "version=\$VERSION" >> \$GITHUB_OUTPUT' "$file" || true
# Print a wider context around any matches found by the line numbers above
python3 - <<'PY'
import re, subprocess, os, sys
file = ".github/workflows/ci.yml"
p = subprocess.run(["rg","-n",r'echo "version=\$VERSION" >> \$GITHUB_OUTPUT',file],capture_output=True,text=True)
lines=[]
for line in p.stdout.strip().splitlines():
    m=re.match(r'.*:(\d+):', line)
    if m: lines.append(int(m.group(1)))
if not lines:
    sys.exit(0)
for ln in sorted(set(lines)):
    lo=max(1,ln-6); hi=ln+6
    subprocess.run(["sed","-n",f"{lo},{hi}p",file],check=False)
    print("----")
PY

Repository: ryolang/ryo

Length of output: 221


🏁 Script executed:

#!/bin/bash
set -euo pipefail
file=".github/workflows/ci.yml"

for ln in 59 85 113; do
  echo "===== context around $ln ====="
  lo=$((ln-6))
  hi=$((ln+6))
  sed -n "${lo},${hi}p" "$file"
done

Repository: ryolang/ryo

Length of output: 1675


Quote $GITHUB_OUTPUT redirection targets in .github/workflows/ci.yml

The redirection target should be quoted to avoid SC2086 word-splitting/globbing findings in strict shell lint checks (currently >> $GITHUB_OUTPUT in the three “Get Zig version” steps, around lines 59, 85, and 113).

Proposed fix
-          echo "version=$VERSION" >> $GITHUB_OUTPUT
+          echo "version=$VERSION" >> "$GITHUB_OUTPUT"
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
VERSION=$(grep -o 'const ZIG_VERSION: &str = "[^"]*"' src/toolchain.rs | cut -d'"' -f2)
echo "version=$VERSION" >> $GITHUB_OUTPUT
VERSION=$(grep -o 'const ZIG_VERSION: &str = "[^"]*"' src/toolchain.rs | cut -d'"' -f2)
echo "version=$VERSION" >> "$GITHUB_OUTPUT"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/ci.yml around lines 84 - 85, The workflow redirects output
with an unquoted variable (>> $GITHUB_OUTPUT) which can trigger
shellword-splitting/globbing issues; update each “Get Zig version” step to quote
the redirection target (use >> "$GITHUB_OUTPUT") so the shell treats the
variable as a single filename; apply this change to the lines using
VERSION=$(grep ... src/toolchain.rs | cut -d'"' -f2) and echo "version=$VERSION"
>> $GITHUB_OUTPUT (and the two other identical steps).

Source: Linters/SAST tools

Comment thread benchmarks/eager_destruction/README.md Outdated
echo "Building benchmarks..."
rustc -O eager_destruction.rs -o eager_destruction_rs
rustc -O eager_destruction_manual_drop.rs -o eager_destruction_manual_drop_rs
cargo build --release > /dev/null

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Test whether cargo build works from the benchmark directory
cd benchmarks/eager_destruction
bash -c 'cargo build --release > /dev/null 2>&1'
exit_code=$?
if [ $exit_code -ne 0 ]; then
  echo "FAIL: cargo build --release fails from benchmarks/eager_destruction/ (exit code: $exit_code)"
  echo "Expected: command should run from repo root or use (cd ../.. && cargo build --release)"
else
  echo "PASS: cargo build succeeded"
fi

Repository: ryolang/ryo

Length of output: 226


Fix cargo build --release working directory in eager destruction benchmark script.
benchmarks/eager_destruction/run_benchmarks.sh (line 18) runs cargo build --release > /dev/null without ensuring it’s executed from the repo root; running from benchmarks/eager_destruction/ fails (exit code 1). Change to run from the root, e.g. (cd ../.. && cargo build --release) > /dev/null.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@benchmarks/eager_destruction/run_benchmarks.sh` at line 18, In
run_benchmarks.sh replace the direct "cargo build --release > /dev/null"
invocation so it always runs from the repository root: run the cargo build
inside a subshell that first cd's up to the repo root (two levels up from
benchmarks/eager_destruction) and then executes cargo build --release, while
still redirecting output to /dev/null; locate the line containing the exact
string "cargo build --release > /dev/null" and change it accordingly.

Comment on lines +4 to +13
# Check for prerequisites
if ! command -v rustc &> /dev/null; then
echo "Error: 'rustc' is not installed or not in PATH."
exit 1
fi

if ! command -v hyperfine &> /dev/null; then
echo "Error: 'hyperfine' is not installed or not in PATH. Please install it to run performance benchmarks."
exit 1
fi

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Expand prerequisite checks to match actual tool usage.

Line 4 adds fail-fast checks, but only rustc and hyperfine are validated while the script immediately depends on additional tools (go, swiftc, kotlinc, java, uv, Bun, etc.). This causes delayed, less actionable failures later in execution.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@benchmarks/fibonacci/run_benchmarks.sh` around lines 4 - 13, The prereq check
block in run_benchmarks.sh only validates rustc and hyperfine but the script
later uses additional tooling; add the same fail-fast checks for every tool the
script invokes (e.g., go, swiftc, kotlinc, java, node (or uv if that's the
binary used), bun, and any other language/runtime compilers/runners you call) by
adding `if ! command -v <tool> &> /dev/null; then echo "Error: '<tool>' is not
installed or not in PATH."; exit 1; fi` entries into the existing prerequisite
section so all required tools are validated up-front.

Comment thread Dockerfile
Comment on lines +14 to +17
RUN apt-get update && apt-get install -y \
libunwind-dev \
valgrind \
&& rm -rf /var/lib/apt/lists/*

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail
ls -la
echo "---- Dockerfile (with line numbers) ----"
nl -ba Dockerfile | sed -n '1,120p'
echo "---- Search for apt-get install usage ----"
rg -n "apt-get\s+install" Dockerfile || true
echo "---- Search for no-install-recommends usage ----"
rg -n "no-install-recommends" Dockerfile || true

Repository: ryolang/ryo

Length of output: 1898


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "---- Dockerfile (with line numbers via cat -n) ----"
cat -n Dockerfile | sed -n '1,120p'

echo "---- Search for apt-get install usage ----"
rg -n "apt-get\s+install" Dockerfile || true

echo "---- Search for no-install-recommends usage ----"
rg -n "no-install-recommends" Dockerfile || true

Repository: ryolang/ryo

Length of output: 1726


Use --no-install-recommends for apt installs

The Dockerfile installs libunwind-dev and valgrind via apt-get install -y without --no-install-recommends, which can bring in unnecessary recommended packages.

Proposed fix
-RUN apt-get update && apt-get install -y \
+RUN apt-get update && apt-get install -y --no-install-recommends \
     libunwind-dev \
     valgrind \
     && rm -rf /var/lib/apt/lists/*
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
RUN apt-get update && apt-get install -y \
libunwind-dev \
valgrind \
&& rm -rf /var/lib/apt/lists/*
RUN apt-get update && apt-get install -y --no-install-recommends \
libunwind-dev \
valgrind \
&& rm -rf /var/lib/apt/lists/*
🧰 Tools
🪛 Trivy (0.69.3)

[error] 14-17: 'apt-get' missing '--no-install-recommends'

'--no-install-recommends' flag is missed: 'apt-get update && apt-get install -y libunwind-dev valgrind && rm -rf /var/lib/apt/lists/*'

Rule: DS-0029

Learn more

(IaC/Dockerfile)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Dockerfile` around lines 14 - 17, The apt install line in the Dockerfile
installs libunwind-dev and valgrind without --no-install-recommends which pulls
extra packages; update the RUN instruction that currently runs "apt-get update
&& apt-get install -y \ libunwind-dev \ valgrind \ && rm -rf
/var/lib/apt/lists/*" to include --no-install-recommends (i.e., apt-get install
-y --no-install-recommends) so only required packages are installed while
keeping the existing apt-get update and rm -rf /var/lib/apt/lists/* steps
intact.

Source: Linters/SAST tools

- [x] [Milestone 8b — Conditionals & Logical Operators [alpha] ✅ COMPLETE](#milestone-8b-conditionals--logical-operators-alpha--complete)
- [x] [Milestone 8c — Loops & Loop Control [alpha] ✅ COMPLETE](#milestone-8c-loops--loop-control-alpha--complete)
- [ ] [Milestone 8.1 — Heap-Allocated `str` Type & Move Semantics [alpha]](#milestone-81-heap-allocated-str-type--move-semantics-alpha)
- [x] [Milestone 8.1 — Heap-Allocated `str` Type & Move Semantics [alpha]](#milestone-81-heap-allocated-str-type--move-semantics-alpha)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix the Milestone 8.1 anchor fragment mismatch.

Line 24’s fragment link does not resolve to a valid heading anchor (MD051). Please align the index link target with the exact generated anchor for the Milestone 8.1 heading.

🧰 Tools
🪛 markdownlint-cli2 (0.22.1)

[warning] 24-24: Link fragments should be valid

(MD051, link-fragments)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/dev/implementation_roadmap.md` at line 24, The index link fragment for
the heading "Milestone 8.1 — Heap-Allocated `str` Type & Move Semantics [alpha]"
is incorrect; update the fragment in the checklist link to match the exact
generated heading anchor (copy the anchor shown in the rendered HTML or GitHub
preview) — the expected anchor will look like
milestone-81-heap-allocated-str-type-and-move-semantics-alpha (replace the
current fragment with that or the exact rendered variant) so the MD051 anchor
mismatch is resolved.

Source: Linters/SAST tools

Comment thread src/ownership.rs
Comment on lines 1048 to 1070
let mut scratch_sink = DiagSink::new();
visit_expr(tir, pool, own, &mut scratch_sink, by_name, view.cond);
visit_expr(
tir,
pool,
own,
&mut scratch_sink,
by_name,
sidecar,
view.cond,
);
for stmt in &view.body {
analyze_stmt(tir, pool, own, &mut scratch_sink, by_name, *stmt);
analyze_stmt(tir, pool, own, &mut scratch_sink, by_name, sidecar, *stmt);
}
let after_first = own.clone();

if states_differ(&entry, &after_first) {
*own = merge_two(&entry, &after_first);
visit_expr(tir, pool, own, sink, by_name, view.cond);
visit_expr(tir, pool, own, sink, by_name, sidecar, view.cond);
for stmt in &view.body {
analyze_stmt(tir, pool, own, sink, by_name, *stmt);
analyze_stmt(tir, pool, own, sink, by_name, sidecar, *stmt);
}
let after_second = own.clone();
*own = merge_two(&entry, &after_second);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Discard scratch-pass sidecar mutations before the second loop walk.

Both loop analyzers write into the real FunctionSidecar during the speculative first pass. When states_differ() forces the rewalk, if_branches gets overwritten, but any free_schedule entries appended by nested if bodies in the scratch pass remain behind with stale BranchIds. That leaves unreachable/stale frees in the final sidecar and can miscompile loop bodies that need the second pass.

Also applies to: 1100-1112

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/ownership.rs` around lines 1048 - 1070, The speculative first-pass can
leave mutations in the real FunctionSidecar (sidecar) that must be discarded
before the forced rewalk; after doing the first visit_expr/analyze_stmt loop
with the scratch sink (scratch_sink) and before calling visit_expr/analyze_stmt
again for the real sink, restore the sidecar to its pre-pass snapshot (e.g.
replace or clone-reset the live sidecar from the saved entry snapshot) so
nested-if free_schedule entries from the scratch pass are removed; apply the
same reset in the other symmetric block (around lines handling the alternate
branch at 1100-1112) where states_differ triggers a second walk.

Comment thread src/ownership.rs
artefactop added 15 commits June 8, 2026 18:43
Resolved I-062: Optimize free_schedule traversal in codegen from O(K * S) to O(1) via free_by_after HashMap indexing and pending_sweep unfired cursor, adding O(1) empty-schedule short-circuiting.

Resolved I-063: Unify duplicated fixtures, Zig location discovery, and AOT linking logic between asan_smoke.rs and valgrind_smoke.rs into tests/common/mod.rs. Added --path to ryo toolchain status.

Remove resolved entries I-062 and I-063 from ISSUES.md.
Re-order `analyze_function` so `schedule_loop_exit_frees_in` runs after
the anonymous-temp and dead-store passes have populated
`free_schedule`. Sets up I-058's inside-loop fix, where the jump-site
scheduler needs to see the complete schedule to decide whether an
owner is already covered.

Behaviour-preserving today: the existing `>= body_min` skip means
inside-loop owners are not scheduled at jumps, so no entry depends on
the post-pass state of `free_schedule` yet.
`break_before_last_use` and `continue_before_last_use` each allocate a
heap string inside a loop body and then jump (`break`/`continue`)
before the `print(s)` last-use. Under M8.1c the natural last-use Free
is anchored on `print(s)` and never fires on the jump path, leaking
the allocation. Valgrind on Linux detects the leak; ASan/LSan does
not (Cranelift output is not ASan-instrumented).

Used by the new tests in `valgrind_smoke.rs` and `asan_smoke.rs`.
These tests exercise the fixtures added in the previous commit:

- valgrind_{break,continue}_before_last_use: detect the leak under
  Valgrind on Linux. Fail on main; pass after the ownership-pass fix.
- asan_{break,continue}_before_last_use: parity coverage on macOS and
  Linux. Pass today (LSan can't see Cranelift leaks) and after the
  fix; their job is to catch regressions in the schedule shape.
`break_in_else_arm_sibling_use` allocates `s` in the loop body, then
prints it in the if's then-arm and breaks in the else-arm. The
natural last-use Free anchors on `print(s)`, which never runs on the
break path — so `s` leaks under the lexical-coverage scheduler.

Valgrind on Linux detects the leak; the ASan harness keeps the
parity test for regression coverage on macOS.
Resolves I-058. The break/continue scheduler used to skip any owner
whose `TirRef.raw()` placed it inside the enclosing loop body,
trusting the natural last-use Free to cover it. Two leak shapes
escape that proxy:

  - linear: the natural last-use is AFTER the jump in source order
    (e.g. `for: s = alloc(); if cond: break; print(s)`).
  - cross-branch: the natural last-use is in a SIBLING if-arm
    (e.g. `for: s = alloc(); if cond: print(s) else: break`).

The fix replaces the raw()-based classifier with two reachability
sets computed via walk_operands from the loop's body:
  - inside_loop: every TirRef under the loop body, classifying owners
    as inside-loop vs pre-loop. (Producer refs sit numerically below
    their parent body stmt, so raw() proxies misclassify them.)
  - on_path: the TirRefs evaluated on the path that takes this
    specific jump. A Free covers the jump iff its anchor is in
    on_path. Lexical raw() ordering would falsely classify a sibling
    arm's anchor as covering.

Inside-loop owners get a jump-anchored Free unless covered on this
path. Pre-loop owners keep the defensive emit only when no Free is
scheduled anywhere - freeing a pre-loop owner on continue would
cause a use-after-free on the next iteration.

Verified on Linux via the new Valgrind tests, including the cross-
branch shape.
Four TirBuilder-driven tests pin the schedule shape on the I-058
leak shapes:

- break_before_last_use_schedules_jump_free: Free for the
  inside-loop owner is anchored on the break instruction.
- continue_before_last_use_schedules_jump_free: same for continue.
- break_in_else_arm_sibling_print_schedules_jump_free: the
  cross-branch case where the natural last-use lives in a sibling
  if-arm; verifies path-aware coverage schedules a jump-anchored
  Free distinct from the then-arm's natural Free (anchored on the
  Var read, per collect_last_uses).
- break_after_last_use_does_not_double_schedule: when the natural
  last-use precedes the break, exactly one Free is scheduled and it
  is NOT anchored on break.

Together with the Valgrind tests these catch regressions on all
platforms — Valgrind covers Linux end-to-end, the inline tests cover
macOS dev loops where Valgrind is unreliable.
After the I-058 fix, `schedule_break_continue_frees` learns coverage
exclusively from `sidecar.free_schedule` (via the on_path set). The
`last_use` map was a pass-through parameter from `analyze_function`
to the helper; remove it from the helper's signature and from the
recursive walker. Behaviour-preserving cleanup.
Inside-loop heap owners are now freed on break/continue paths via the
path-aware `schedule_break_continue_frees` pass and Valgrind
regression tests covering both linear and cross-branch shapes. Drop
the entry from ISSUES.md per the file's "resolved entries are
removed" policy and remove the cross-reference comment in
`tests/valgrind_smoke.rs`.

The test-infrastructure half of I-058 (ASan can't see Cranelift-
emitted leaks) is a separate problem; re-file under a new I-XXX when
work begins.
Apply the small, local cleanups surfaced by the post-I-058 review:

- Extract `loop_body(tir, r) -> Option<Vec<TirRef>>` so the
  `WhileLoop`/`ForRange` body-slice match isn't duplicated between
  `collect_loop_body_refs` and `schedule_break_continue_frees`.
- Replace the open-coded `walk_operands { collect_refs_recursive }`
  in `collect_jump_path`'s `_` fallthrough with a direct call to
  `collect_refs_recursive` — the helper already handles the same
  insert-then-walk shape.
- Fix stale `body_min` mention in `collect_loop_body_refs` doc; the
  symbol no longer exists. Rephrase in terms of the actual invariant
  (producer refs sit numerically below their parent body stmt).
- Move `find_fixture` from `tests/asan_smoke.rs` and
  `tests/valgrind_smoke.rs` into `tests/common/mod.rs` so the smoke
  harnesses match the deduped pattern of `RYO_FIXTURES` /
  `build_and_link`.

File four follow-up issues noticed during review:

- I-064: `inside_loop` / `has_any` recomputed per jump.
- I-065: `collect_jump_path` builds throwaway HashSets for arm
  containment.
- I-066: TIR reachability helpers belong next to `walk_operands` in
  `src/tir.rs`.
- I-067: pre-loop defensive emit fires on `continue` jumps too —
  latent UAF if a future producer bypasses both upstream passes.

Behaviour-preserving; all macOS tests green.

| Benchmark Candidate | Language | Execution Strategy | Max Resident Memory (RSS) | Memory Efficiency | Result at 74,556+ Depth (Stack Limit) |
|---------------------|----------|--------------------|---------------------------|-------------------|---------------------------------------|
| **Ryo (AOT)** | Ryo 0.1.0 | Standalone Binary (Eager) | **4.42 MB** | **1.88x more efficient** | **Succeeds (0.00s)** |

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the base line?

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@benchmarks/eager_destruction/README.md`:
- Line 80: The sentence that states "exactly 74,556 recursive calls" is too
absolute; change it to qualify the measurement as environment-specific (for
example: "observed as 74,556 recursive calls on this setup" or give a small
range) and add a short parenthetical like "(observed on our test
machine/toolchain)" so the claim is not presented as universally deterministic;
update the same sentence that compares Rust vs Ryo (the phrase "exactly 74,556
recursive calls") accordingly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: cb7c72b1-94af-4aaa-91c2-a6c5f77c98b1

📥 Commits

Reviewing files that changed from the base of the PR and between a0255ba and 7847f41.

📒 Files selected for processing (8)
  • ISSUES.md
  • benchmarks/eager_destruction/README.md
  • src/codegen.rs
  • src/main.rs
  • src/ownership.rs
  • tests/asan_smoke.rs
  • tests/common/mod.rs
  • tests/valgrind_smoke.rs
✅ Files skipped from review due to trivial changes (1)
  • ISSUES.md
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/codegen.rs
  • src/ownership.rs


### Key Takeaways
1. **Unrivaled Memory Performance:** Ryo's Ahead-Of-Time (AOT) compiled binary achieves the **lowest memory footprint** (4.42 MB), outperforming even Rust's manual `drop` version.
2. **Stack Safety under Deep Recursion:** While Rust **crashes with a stack overflow at exactly 74,556 recursive calls** (even with release-level optimizations `-O` and manual `drop` due to conservative LLVM tail call heuristics), **Ryo runs completely clean up to 260,000 recursive calls** (3.5x deeper than Rust) before reaching the OS stack limit.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Avoid absolute crash-depth wording in benchmark claims.

“exactly 74,556 recursive calls” is environment-specific and likely non-deterministic across machines/toolchains; please qualify it as “observed on this setup” (or a range) to prevent misleading reproducibility expectations.

🧰 Tools
🪛 LanguageTool

[style] ~80-~80: Consider using a different adverb to strengthen your wording.
Context: ... LLVM tail call heuristics), Ryo runs completely clean up to 260,000 recursive calls (...

(COMPLETELY_ENTIRELY)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@benchmarks/eager_destruction/README.md` at line 80, The sentence that states
"exactly 74,556 recursive calls" is too absolute; change it to qualify the
measurement as environment-specific (for example: "observed as 74,556 recursive
calls on this setup" or give a small range) and add a short parenthetical like
"(observed on our test machine/toolchain)" so the claim is not presented as
universally deterministic; update the same sentence that compares Rust vs Ryo
(the phrase "exactly 74,556 recursive calls") accordingly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant