M3: Guaranteed self-tail-call optimization (loop lowering) #37
Open
assapir wants to merge 1 commit into
When a function returns a call to itself in tail position, lower the recursion to a loop instead of a stack-growing `call` + `ret`, so tail self-recursion runs in constant stack and cannot overflow. The language needs this guarantee as `for` is removed and recursion becomes the iteration primitive.

Codegen-only (no surface syntax). Adds a tail-position analysis (`body_has_self_tail_call`) and a parallel tail-aware emitter (`generate_tail_expr` + `generate_tail_if`/`generate_tail_match`/`emit_tail_self_call`) that share one `is_self_tail_call` predicate. Tail position flows through `?`/`|` match arms, `if`/ternary branches, `< >` block tails, and `|>` pipelines.

The function's parameter allocas are reused as loop-carried slots; a tail self-call evaluates all args first (against the current iteration's params), stores them, and branches back to a loop header. Non-tail self-calls and calls to other functions stay ordinary calls (general/mutual tail calls are a deferred follow-up).

- examples/tail_recursion.ql recurses 1,000,000 deep (would overflow without TCO) and exits 16; wired into the examples gate (JIT + native AOT under clang and gcc).
- tests/tail_call_test.rs: deep ternary/match-arm/block recursion runs in constant stack; arg-before-overwrite ordering; non-tail and cross-function calls unaffected; all-arms-recurse modules still verify.
- LANGUAGE.md documents the guarantee and adds a feature-matrix row.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
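The tail-position rule in the commit message can be sketched on a toy AST. This is only an illustration: the `Expr` enum, its variants, and the helper constructors are hypothetical and are not the compiler's real types; only the function names `is_self_tail_call` and `expr_has_self_tail_call` are borrowed from this PR, and their real signatures may differ. The sketch covers `if` branches and block tails; match arms and pipelines would presumably follow the same pattern.

```rust
// Toy AST for illustration only; these are NOT the compiler's real types.
#[allow(dead_code)]
enum Expr {
    Int(i64),
    Var(String),
    Binary(Box<Expr>, char, Box<Expr>),
    Call { callee: String, args: Vec<Expr> },
    If { cond: Box<Expr>, then_branch: Box<Expr>, else_branch: Box<Expr> },
    Block { stmts: Vec<Expr>, tail: Box<Expr> },
}

/// True when `expr` is itself a direct call to the enclosing function.
fn is_self_tail_call(expr: &Expr, current_fn: &str) -> bool {
    matches!(expr, Expr::Call { callee, .. } if callee.as_str() == current_fn)
}

/// Assuming `expr` is in tail position, reports whether a self-call sits in
/// tail position somewhere inside it. Tail position flows into `if` branches
/// and block tails, but not into operands or call arguments.
fn expr_has_self_tail_call(expr: &Expr, current_fn: &str) -> bool {
    match expr {
        Expr::Call { .. } => is_self_tail_call(expr, current_fn),
        Expr::If { then_branch, else_branch, .. } => {
            expr_has_self_tail_call(then_branch, current_fn)
                || expr_has_self_tail_call(else_branch, current_fn)
        }
        Expr::Block { tail, .. } => expr_has_self_tail_call(tail, current_fn),
        // Operands of a binary operator are never in tail position.
        Expr::Int(_) | Expr::Var(_) | Expr::Binary(..) => false,
    }
}

// Hypothetical helper constructors to keep the examples short.
fn var(name: &str) -> Expr {
    Expr::Var(name.to_string())
}
fn bin(lhs: Expr, op: char, rhs: Expr) -> Expr {
    Expr::Binary(Box::new(lhs), op, Box::new(rhs))
}
fn call(name: &str, args: Vec<Expr>) -> Expr {
    Expr::Call { callee: name.to_string(), args }
}

/// Body of `sum(n, acc)`: if n == 0 then acc else sum(n - 1, acc + n).
fn sum_body() -> Expr {
    Expr::If {
        cond: Box::new(bin(var("n"), '=', Expr::Int(0))),
        then_branch: Box::new(var("acc")),
        else_branch: Box::new(call(
            "sum",
            vec![bin(var("n"), '-', Expr::Int(1)), bin(var("acc"), '+', var("n"))],
        )),
    }
}

/// Body of `fact(n)`: if n == 0 then 1 else n * fact(n - 1).
fn fact_body() -> Expr {
    Expr::If {
        cond: Box::new(bin(var("n"), '=', Expr::Int(0))),
        then_branch: Box::new(Expr::Int(1)),
        else_branch: Box::new(bin(
            var("n"),
            '*',
            call("fact", vec![bin(var("n"), '-', Expr::Int(1))]),
        )),
    }
}

fn main() {
    // The self-call is the returned value of the else branch: tail position.
    assert!(expr_has_self_tail_call(&sum_body(), "sum"));
    // The self-call is an operand of `*`: not tail position.
    assert!(!expr_has_self_tail_call(&fact_body(), "fact"));
    println!("ok");
}
```

The `fact` case shows why the multiplication form stays an ordinary call: the result of the recursive call still has to be multiplied after it returns, so the frame cannot be reused.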
What
Guaranteed self-tail-call optimization: when a function returns a call to itself in tail position, codegen lowers the recursion to a loop instead of a stack-growing `call` + `ret`. Tail self-recursion therefore runs in constant stack and cannot overflow, which is the guarantee the language needs as `for` is removed (later M3 wave) and recursion becomes the iteration primitive.

This is codegen-only; there is no surface syntax.
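As a rough picture of what "lowering to a loop" means, the sketch below writes the same computation twice in Rust: once as tail recursion and once as the hand-written loop that the lowering is meant to be equivalent to. The functions `sum_rec` and `sum_loop` are hypothetical illustrations, not code from this PR, and Rust itself does not guarantee tail-call elimination, so the recursive form is only exercised at a shallow depth here.

```rust
/// Tail-recursive form: the self-call is the value returned directly.
fn sum_rec(n: u64, acc: u64) -> u64 {
    if n == 0 {
        acc
    } else {
        sum_rec(n - 1, acc + n)
    }
}

/// Hand-lowered form: the parameters act as loop-carried slots, and the
/// tail self-call becomes "store new argument values, jump to the header".
fn sum_loop(mut n: u64, mut acc: u64) -> u64 {
    loop {
        if n == 0 {
            return acc;
        }
        // Evaluate every argument against the current values first...
        let next_n = n - 1;
        let next_acc = acc + n;
        // ...then overwrite the slots and go around again.
        n = next_n;
        acc = next_acc;
    }
}

fn main() {
    // Shallow depth: safe for the recursive form even without TCO.
    assert_eq!(sum_rec(1_000, 0), sum_loop(1_000, 0));
    // One million iterations run in constant stack in the loop form.
    println!("{}", sum_loop(1_000_000, 0)); // prints 500000500000
}
```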
How
- Tail-position analysis (`body_has_self_tail_call` / `expr_has_self_tail_call` / `is_self_tail_call`): a call is in tail position when it is the value the function returns directly. Tail position flows through `?`/`|` match arms, `if`/ternary branches, `< >` block tails, and `|>` pipelines, and not into an operator operand, a call argument, an array element, etc.
- Tail-aware emitter (`generate_tail_expr` + `generate_tail_if` / `generate_tail_match` / `emit_tail_self_call`): the function's parameter allocas are reused as loop-carried slots. A loop header is branched to from the entry block; a tail self-call evaluates all its args first (against the current iteration's params, so `f(n-1, acc+n)` is correct), stores them into the slots, and `br`s back to the header. The tail emitter mirrors the existing `generate_if` / `generate_match` shape but threads an `Option` (a tail self-call yields `None`, having branched away).
- Non-tail self-calls (`n * fact(n-1)`) and calls to other functions stay ordinary calls. General/mutual tail calls (LLVM `musttail`) are an explicit deferred follow-up.
- […] `is_self_tail_call` is true at codegen time, and the function is always terminated by the final `ret`.

Tests & docs
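One property the tests cover is arg-before-overwrite ordering: all arguments of the tail self-call must be evaluated against the current iteration's parameters before any slot is overwritten. The sketch below illustrates why, using a hypothetical accumulator-style Fibonacci written in Rust (not code from this PR). The `fib_loop_wrong` variant stores each argument as soon as it is computed and therefore reads an already-overwritten slot.

```rust
/// Reference: tail-recursive Fibonacci with two accumulators.
fn fib_rec(n: u32, a: u64, b: u64) -> u64 {
    if n == 0 {
        a
    } else {
        fib_rec(n - 1, b, a + b)
    }
}

/// Correct lowering: compute all new argument values, then store them.
fn fib_loop_correct(mut n: u32, mut a: u64, mut b: u64) -> u64 {
    loop {
        if n == 0 {
            return a;
        }
        let (next_n, next_a, next_b) = (n - 1, b, a + b);
        n = next_n;
        a = next_a;
        b = next_b;
    }
}

/// Incorrect lowering: stores are interleaved with evaluation, so `a + b`
/// is computed after `a` has already been overwritten with `b`.
fn fib_loop_wrong(mut n: u32, mut a: u64, mut b: u64) -> u64 {
    loop {
        if n == 0 {
            return a;
        }
        n -= 1;
        a = b;
        b = a + b; // bug: reads the new `a`, not the current iteration's
    }
}

fn main() {
    assert_eq!(fib_rec(10, 0, 1), 55);
    assert_eq!(fib_loop_correct(10, 0, 1), 55);
    // The interleaved version doubles instead of summing.
    assert_eq!(fib_loop_wrong(10, 0, 1), 512);
    println!("ok");
}
```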
- `examples/tail_recursion.ql`: recurses 1,000,000 deep (would overflow the stack without TCO), exits 16. Wired into `tests/examples_test.rs` so it runs under the JIT and native AOT (clang and gcc).
- `tests/tail_call_test.rs`: deep ternary/match-arm/block recursion runs in constant stack and computes the right value; arg-before-overwrite ordering; non-tail and cross-function calls unaffected; unconditional / all-arms-recurse modules still pass LLVM verification.
- `LANGUAGE.md`: documents the guarantee ("tail self-recursion is optimized to a loop") and adds a feature-matrix row.

Gate
`cargo build`, `cargo test` (incl. the native-AOT examples gate, clang+gcc), `cargo fmt --all -- --check`, and `cargo clippy --all-targets --all-features -- -D warnings` all pass. Ran `/code-review` (no surviving findings) and `/simplify` (one minor consolidation).

🤖 Generated with Claude Code