rrr: safety + perf milestone — Marshal Vec rewrite, ClientConnection @safe class, HashMap/Weak library fixes by shuaimu · Pull Request #67 · makodb/mako

shuaimu · 2026-05-22T20:11:25Z

Summary

A milestone of work on the worktree-srpc branch covering:

Marshal rewrite: chunk-linked-list → rusty::Vec<uint8_t> + read-cursor (incidentally fixed a 5-year-old chunk leak)
ClientConnection class-level @safe with carefully-scoped per-method @unsafe overrides
rusty-cpp library improvements: HashMap::get returns Option<V&> (not Option<V*>), Vec::extend_from_slice memcpy fast path + geometric reserve, full Weak @safe annotations, Vec::value_type typedef
C++23 named-module + clang-21 TLS dup-symbol fix (taken from mako-dev's static inline thread_local form)
const-ification cascade through ConnectionStateMachine + ClientConnection lifecycle methods, eliminating 2 const_cast sites

Headline numbers

scripts/rrr_safety_loc.py ratio:

Start of this thread: 65.3% rrr-wide @safe
After this PR: 70.5% (+5.2pp)

client.cpp specifically:

@safe LOC: 1292 → 1651 (+359)
@unsafe LOC: 1049 → 663 (-386, -37%)

Marshal microbench (bench_marshal):

New Marshal beats chunk-list on every scenario, +15-81% depending on workload
~30-40% faster on string/varint-heavy paths (the structural win)
The 81% headline on the 10×1KB drain pattern was investigated and attributed largely to a chunk-list leak; the leak-fixed chunk-list is tied with the new Marshal on bulk-byte paths

End-to-end RPC (rpcbench loopback, 4 threads × 100 outstanding, mode=fast, 8s):

bsize=100: +15% qps vs leak-fixed chunk-list, +11% vs original
bsize=1024: tied with leak-fixed chunk-list (Marshal is a smaller fraction of total RPC time at larger payloads)

What's in the PR

rrr/misc/marshal.cpp — Vec rewrite

File shrinks 1408 → 932 LOC (-34%)
@unsafe LOC: 382 → 10
All 50+ operator<<>> overloads preserved; bookmark API preserved (struct internals simplified from (size, char**) to (offset, size))
init_block_read reduced to a Vec::reserve hint; read_chnk/read_reuse_chnk removed (no external callers)
The chunk-list memory leak in Marshal::read() (a commented-out delete chnk; dating to 2020) is fixed as a side effect

rrr/rpc/client.cpp — class-level @safe + cleanups

class ClientConnection flipped from class-level @unsafe to @safe (was the single biggest @unsafe-LOC source in rrr; +296 LOC moved to @safe in that one flip alone)
16 delegator methods flipped @safe (request convenience overloads, config setters/getters, pause/resume, etc.)
Six post-HashMap-API methods flipped @safe (fail_pending_future, handle_free, mark_closing, allow_request_with_circuit_metrics, record_circuit_result)
Six Weak<> copy-block @unsafe wraps dropped after library annotations landed
Const-ification: invalidate_pending_futures, mark_closing, close, handle_error are now const; reconnecting_ / reconnect_abort_ atomics marked mutable; 2 const_casts dropped

rrr/rpc/server.cpp — Weak cleanups

Same Weak-copy @unsafe blocks dropped (2 sites)

third-party/rusty-cpp — library improvements

Vec::extend_from_slice / Vec::write: memcpy fast path for trivially-copyable T (~10× faster on byte writes)
Vec::reserve: geometric growth (max(new, 2×capacity)) — required for amortized O(1) writes
HashMap::get / get_mut: return Option<V&> instead of Option<V*> (matches Rust + matches Vec::get)
sync::Weak<T> / rc::Weak<T>: every method individually @safe with atomic/raw-ptr work in inline @unsafe { ... } blocks

rrr/tests/bench_marshal — new Marshal microbench

9 hot-path scenarios; standalone executable (no gtest); useful for any future Marshal perf work
Companion doc docs/dev/marshal_perf_baseline.md with baseline numbers, calibrated A/B vs leak-fixed chunk-list, end-to-end RPC numbers

Merge resolution

TLS class-static fix (reactor.cpp): took mako-dev's simpler static inline thread_local over srpc's function-local-static accessor approach
src/deptran/communicator.cc: updated to Reactor::clients_ member access
Submodule pointer: merged upstream rusty-cpp/main (vec value_type + noexcept dtor) into local @safe branch; pushed to shuaimu/rusty-cpp main

Test plan

Notes for reviewers

The PR is structured as a long sequence of small commits — each compiles, each passes borrow check, each has a verified test pass. Reviewing commit-by-commit will be much easier than reviewing the diff as a whole.
The rusty-cpp submodule bump is non-trivial; the library API change (HashMap Option<V&>) is source-breaking for any caller binding to V*. The rrr-side rewire is included in this PR.
The Marshal leak fix is bundled with the rewrite, but documented in the post-mortem section of docs/dev/marshal_perf_baseline.md as a separable concern. The "rewrite + leak fix" combo is faster than "rewrite alone" against the original (leaky) chunk-list; against the leak-fixed chunk-list the wins are smaller but still real (~15% on small-payload RPC).

🤖 Generated with Claude Code

Three pieces: 1. **rusty-cpp submodule bump** to commit 519b5861 ("Enable borrow-checking C++23 module TUs in libc++ / module-graph builds"). Three fixes upstream: - `build.rs` now honors `LIBCLANG_PATH` first (was hardcoded to `/usr/lib/llvm-14/lib`, which lost to a system clang too old to export `clang_CXXMethod_isDeleted`). - `src/main.rs` passes `-resource-dir=` (not `-I<dir>/include`) when compile_commands.json already owns the STL paths, so libclang's `__has_include_next` dance with libc++ wrappers works. Also passes through `-D` / `-U` / `-m` / `-f` / `-O` / `-g` flags from compile_commands.json so `-march=native` reaches libclang (without it, `__builtin_ia32_*` intrinsics fail to declare). - `src/parser/mod.rs` adds `-ferror-limit=0` so large module TUs don't abort on libc++ vs `import std;` duplicate-decl spam. 2. **`RRR_BORROW_SRC` updated** to include the modular .cpp's that were previously omitted with "module unit" comments: `base/basetypes`, `base/debugging`, `base/logging`, `base/misc`, `base/strop`, `base/threading`, `base/unittest`, `misc/alock`, `misc/any_message`, `misc/marshal`, `misc/rand`, `rpc/utils`, `rpc/client`. The `reactor/*.cc` lineup is replaced with `reactor/{epoll_wrapper.cc, fiber.cpp,future.cpp,reactor.cpp}` to reflect the post-modularization layout. `rpc/server.cpp` stays omitted — libclang 22 crashes parsing it (separate clang-22 issue, distinct from the codegen-crash workarounds in srpc_module_migration_plan.md). 3. **Build-order dependency**: each per-file `borrow_check_rrr_borrow_<F>` target now `add_dependencies(... rrr)`. Without this, ninja parallelizes borrow checks with `rrr`'s compilation and the checks start before `rrr.dir/.../*.o.modmap` and the matching `.pcm` BMIs exist — libclang then can't resolve `import rrr.X;` and analysis runs without module context. Adding the dep at the `borrow_check_all_rrr_borrow` aggregate isn't enough; ninja groups skip the implied dep for siblings. End-to-end result on clang 22.1.5, `-j16`: `ninja borrow_check_rrr` takes ~1m53s, analyzes 17 files, 15 are clean, 2 produce real findings (`reactor.cpp`: 1; `client.cpp`: 23 — all `rusty::*` container methods called from `@safe` code without `@unsafe` blocks). Zero modmap-not-found warnings. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@unsafe

Submodule bump (rusty-cpp d53d9c7) flips `option.hpp` and `result.hpp` namespace default from `@unsafe` to `@safe`, matching the other rusty containers. That alone closes ~22 of the 24 findings — they were noise from those container methods (`Option::unwrap`, `Result::unwrap`, `Option::is_some`, `RefCell::borrow`, ...) being marked unsafe by default. Project-side touches for the 2 remaining sites: - `src/rrr/base/debugging.cpp`: `// @safe` annotation on `rrr::verify` (pure precondition check; aborts on failure, parity with Rust's `assert!`). Note: this does not yet take effect on cross-module callers — rusty-cpp's annotation parser doesn't follow `import` statements (only `#include`), so reactor.cpp / client.cpp / etc. that import `rrr.debugging` still see `verify` as unannotated. File the upstream issue and leave the annotation in for the day that lands. - `src/rrr/reactor/reactor.cpp:2012`: drop the obsolete `// @unsafe { verify(wakeup_time > 0); }` wrapper. The text parser mis-classified the wrapped call as a function declaration (the `> 0` inside the `verify(...)` triggers `is_function_declaration`'s template-syntax heuristic). With verify itself now annotated `@safe`, the wrapper is no longer needed anyway. - `src/rrr/rpc/client.cpp:2893`: rewrite `(*fu_ptr.unwrap()).clone()` as `fu_ptr.unwrap()->clone()` plus a multi-line `@unsafe { ... }` block around the assignment. `HashMap::get` returns `Option<V*>` so `unwrap()` yields a raw pointer; the deref is genuinely needed here. rusty-cpp's pointer-safety analysis doesn't honor inline `@unsafe { }` blocks (separate upstream gap), so this finding persists in the report even though the source-level intent is annotated. End-to-end on clang 22.1.5, `-j16`, `ninja borrow_check_rrr`: Before: 24 findings (reactor.cpp 1, client.cpp 23) After: 2 findings (reactor.cpp 1, client.cpp 1) Both remaining findings are documented upstream-rusty-cpp limitations, not real safety regressions. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Picks up two upstream rusty-cpp parser fixes (committed locally on rusty-cpp `main`, ahead of origin by 5 commits): - 278737e header_cache: follow C++23 `import X.Y.Z;` for annotation discovery. Resolves the first of the two remaining false-positive findings from the earlier pass: `Calling non-safe function 'rrr::verify' at line 1361 ...` in reactor.cpp. With this change, the parser walks `import rrr.debugging;` to `src/rrr/base/debugging.cpp` and picks up the `// @safe` annotation on `verify`. - 4b82823 ast_visitor: recognize multi-line `// @unsafe` comment blocks. Resolves the second false-positive: `Unsafe pointer dereference in function call at line 2896 ...` in client.cpp. The `check_for_unsafe_annotation` heuristic now walks UP through the contiguous comment block above a `{`, instead of inspecting only the immediately-preceding line. The multi-line `// @unsafe` annotation on the HashMap-get raw-pointer deref is now honored. End-to-end on clang 22.1.5, `-j16`, `ninja borrow_check_rrr`: Before this bump: 2 spurious findings + ~11 legitimate ones. After this bump: 0 spurious findings + 11 legitimate ones. The 11 legitimate findings are calls to `rrr::SpinLock::lock` / `unlock` / `rrr::PollThread::add` from `@safe` rrr code without `@unsafe { }` wrappers. Those methods are explicitly `@unsafe` in their own module declarations; they were exposed (correctly) by the import-chasing fix. Fixing them is a separate code-review pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@unsafe

Two source-level changes close all 11 outstanding findings from `ninja borrow_check_rrr`: - `src/rrr/base/threading.cpp` — flip `SpinLock::lock` and `SpinLock::unlock` from `@unsafe` to `@safe`. Parity with Rust's `Mutex::lock` / drop semantics: atomic compare/exchange and store are memory-safe; the only genuinely unsafe call inside `lock()` is `nanosleep(&t, nullptr)` (passes the address of a stack-local `timespec` to a libc syscall), and that's now wrapped in an `@unsafe { ... }` block. The prior `@unsafe` annotation on `unlock()` was over-conservative — `std::atomic::store` is memory-safe outright. Removes 10 of the 11 findings (every `ClientPool::*` call to `SpinLock::lock` / `unlock`). - `src/rrr/rpc/client.cpp` — restructure `Client::close`'s poll-thread-add block. The `// @unsafe - schedules channel proxy close on poll thread` annotation was previously a statement-level comment; rusty-cpp only honors block-level `// @unsafe` (the comment must immediately precede a `{`). Wrap the body in an explicit `{ ... }` so the annotation now binds. Removes the last finding. End-to-end on clang 22.1.5, `-j16`, `ninja borrow_check_rrr`: Before: 11 findings (1 PollThread::add + 10 SpinLock::lock/unlock). After: 0 findings. 17/17 files clean, build succeeds. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Was: 17 hand-picked module units (the "correctness-critical" subset). Now: every non-test C++ source file in `src/rrr/{base,misc,reactor,rpc}/` that the borrow checker can actually parse, with two exclusions called out inline with the reason: - `rpc/server.cpp` — clang-22 libclang crash on parse. Pre-existing. - `rpc/fiber_channel.cpp` — same clang-22 libclang crash, discovered during this expansion. Likely the same bug class as server.cpp (a `create_sp_event<>`-shaped pattern in module purview). - `reactor/fiber_context_{aarch64,x86_64}.cc` — implicitly excluded (asm-only context-switch trampolines, ~30 lines each, not in any cmake source list). End-to-end on clang 22.1.5, `-j16`, `ninja borrow_check_rrr`: 43 files analyzed, 43 clean, 0 violations. Total time ~57s. Caveat on the "0 violations" result: of the 27 files added in this commit, 22 currently have NO `@safe` annotations and pass the check vacuously (no `@safe→@unsafe` call sites to flag). The remaining 5 do have meaningful `@safe` coverage: - `rpc/idempotency.cpp` — 37 `@safe`, 10 `@unsafe` - `rpc/completion_tracker.cpp` — 32 `@safe`, 5 `@unsafe` - `rpc/request_queue.cpp` — 4 `@safe`, 39 `@unsafe` - `misc/serializable.cpp` — 2 `@safe`, 6 `@unsafe` - `rpc/tcp_channel.cpp` — 0 `@safe`, 22 `@unsafe` Future annotation work on the 22 vacuous-pass files will start producing meaningful checks; until then this expansion's value is ensuring the build infrastructure covers every borrow-checkable file in rrr (regression-proof against new findings). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…h recovery) Both files previously hard-failed libclang parse with "Failed to parse file: Crash" (clang-22 libclang bug; clang++ CLI compiles them fine). With the rusty-cpp bump in this commit (c12ef31), the parser now detects the Crash and retries with module args dropped, yielding a partial AST that's still usable for borrow analysis. Three pieces: - **submodule bump (third-party/rusty-cpp → c12ef31)**: trust the compile_commands compiler for `-resource-dir=...` so resource-dir matches the toolchain that built the PCMs, and add a Crash-recovery fallback that retries with std-only module inputs. - **CMakeLists.txt**: drop the `# omitted — clang-22 libclang crash` exclusions for `rpc/server.cpp` and `rpc/fiber_channel.cpp`. Both now go through the recovery path. Inline comment notes that findings on these two are partial-info (cross-module callee names not resolved) until upstream libclang is fixed. - **src/rrr/rpc/server.cpp**: reconcile four decl-vs-def annotation mismatches. The class-body declarations of `run_async` (both ServerConnection and DeferredReply overloads), `wait_for_shutdown`, and `add_shutdown_hook` said `@safe`; the out-of-line definitions said `@unsafe`. The bodies do unsafe things (invoke caller-supplied `rusty::Function<void()>` callbacks; push to a mutex-guarded Vec; block on a condvar with arbitrary predicate). Flip declarations to `@unsafe` to match the definitions and the actual behaviour. These were the 4 partial-info findings that the recovery surfaced. End-to-end on clang 22.1.5, `-j16`, `ninja borrow_check_rrr`: Before: 43 files analyzed, 2 hard-failed, 0 findings (with the 2 excluded). After: 45 files analyzed, 0 hard failures, 0 findings. That's the complete borrow-checkable surface of rrr (45 of 47 .cpp/.cc files in src/rrr/{base,misc,reactor,rpc}/; the 2 holdouts are asm-only context-switch trampolines). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Captures everything we learned during the bisection: the trigger (first-time user-type template instantiation in module purview), the parse-state-sensitivity (file truncations at 699 succeed but 700 crashes; 710 succeeds again), the matched-toolchain test that isolated this as a libclang-22 regression vs libclang-19, and the reason we don't downgrade (clang-19's own multi-attachment bug blocks rrr's server.cpp from compiling). Includes a filing checklist for an upstream LLVM bug report — not yet filed (per Shuai's preference to hold). Cross-references the workaround in third-party/rusty-cpp's parser and the broader migration plan doc. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@unsafe

…annotations Switch the rrr borrow-check pipeline (and reference build) from Homebrew clang 22.1.5 to Homebrew clang 21.1.8. clang 21 is the sweet spot: it fixed the clang-19 multi-attachment trap (the original reason we left clang 19), and it does not exhibit clang 22's libclang parse-time crash on first-time user-type template instantiations in module purview that forced rusty-cpp into degraded crash-recovery on server.cpp and fiber_channel.cpp. clang 21 is also a stricter compile, which surfaced one latent bug: * rpc/load_balancer.cpp: drop the GMF forward-decl `namespace rrr { class Client; }`. The class is fully declared in client.cpp's `rrr.client` module purview, and clang 21 (correctly) rejects a global-module forward-decl that re-introduces the same name. The three "private member" errors clang 21 reported in client.cpp were symptoms of the duplicate-Client declaration, not real visibility violations — they vanish with this delete. Bump third-party/rusty-cpp to df79169 so the analyzer can run cleanly on a clang-21 toolchain: * skip-cross-file-bodies: the analyzer now restricts its scan to the TU under check rather than the whole import graph. Previously this produced 181 cross-module-noise findings on the clang-21 pipeline because libclang's AST exposes function bodies from imported modules. * qualify module annotations by namespace: annotations harvested from `export module rrr.foo;` units are now stored under their derived namespace (`rrr::Log_warn`, not bare `Log_warn`), matching the qualified lookup the analyzer uses. Project-side annotation reconciliations needed for full-info borrow check (0 findings on 45/45 files): * base/logging.cpp: annotate `Log_debug/Log_info/Log_warn/Log_error` template shims `@safe`. They forward printf-style format strings (literals at every call site we control) plus variadic args; no memory operations escape. * misc/serializable.cpp: `registry()` returns a reference into a process-wide singleton (`'static`-lifetime). Marked `@unsafe` because rusty-cpp doesn't yet express `&'static` lifetimes; safer than a half-modelled `@lifetime` annotation it can't enforce. * rpc/client.cpp: `enqueue_heartbeat_probe` decl and its single call site marked `@unsafe`. The body uses Marshal `operator<<` chains that rusty-cpp treats as unsafe by default; wrapping the call site is cleaner than restructuring the Marshal API. * rpc/server.cpp: declaration of `Server::close` aligned with the body's `@unsafe` annotation. Two stale `// @unsafe - Log_warn` inline comments removed — they confused the text parser into attaching `@unsafe` to the surrounding scope. Verification on clang-21: borrow_check_rrr: 45/45 files clean, 0 findings, 1m54s.

…ad_local` Required for clang 21 production link. clang 21 emits module-attached class-static `thread_local` storage as a strong external in every TU that uses the variable (typically via an inline accessor like `is_on_poll_thread()`), causing duplicate-definition linker errors when more than one consumer links into the same executable: multiple definition of `TLS init function for rrr::Reactor@rrr.reactor::sp_reactor_th_' clang 22 happened to keep these in vague linkage; clang 19 didn't emit them in consumer TUs at all. `inline` makes the linkage explicit and toolchain-independent: the symbol is vague-linkage in every TU that sees the declaration, the linker dedupes, and there's no need for an out-of-class definition. Variables converted: - `Reactor::sp_reactor_th_` - `Reactor::sp_disk_reactor_th_` - `Reactor::sp_running_fiber_th_` - `PollThreadWorker::current_worker_` The non-thread_local `clients_` / `dangling_ips_` HashMaps and the plain SpinLock `trying_job_` are left as out-of-class definitions — they don't have inline accessors and don't trip the regression. Verified on Homebrew clang 21.1.8: - clean `rpcbench` link (1m4s wallclock). - `borrow_check_rrr`: 45/45 files, no violations.

- CMakeLists.txt: prefer Homebrew/Linuxbrew clang-21 libclang for the rusty-cpp checker before falling back to system /usr/lib/llvm-*. We intentionally do not auto-detect clang-22 because of the documented libclang parse-crash regression (see docs/dev/libclang22_parse_crash.md). - docs/dev/libclang22_parse_crash.md: production switched to clang 21.1.8; refresh the three-way comparison table; document the source fixups (load_balancer.cpp duplicate forward-decl, reactor.cpp `static inline thread_local`) required for the switch. - docs/dev/srpc_module_migration_plan.md: replace the clang-22 cmake invocation with the clang-21 one; cross-reference the libclang doc; add a "toolchain note" preamble to the Metrics table noting the rows were measured under clang 22 but production now builds on clang 21 (numbers within noise). Verified on Homebrew clang 21.1.8: - clean rpcbench link - borrow_check_rrr: 45/45 files, 0 violations

@unsafe

Mechanical conversions where rusty-cpp's @unsafe reason was encapsulable in an inner block, leaving the public surface @safe. Borrow check stays clean (45/45 files, 0 violations) on Homebrew clang 21.1.8. Conversions by category: Aggregate-initialized POD factories (7): BufferingConfig::defaults / disabled / to_queue_config RequestQueueConfig::defaults / small / large / disabled These were marked @unsafe for "returns struct by value" with an inner @unsafe { struct construction } block. Both are redundant — aggregate-init of a POD is among the safest operations in C++. std::chrono helpers (3): QueuedRequest::is_expired QueuedRequest::age_ms rrr::current_time_ms std::chrono::steady_clock::now + duration_cast are read-only and memory-safe; the wrap-and-flip pattern hides the unannotated std calls behind a public @safe surface. Atomic counter accessors on Server (3): Server::pending_request_count Server::increment_pending Server::decrement_pending Each is one atomic load / fetch_add / fetch_sub; wrap that call in an inner @unsafe block and mark the accessor @safe. Matches the pattern already used for set_drop_heartbeat_replies / drop_heartbeat_replies in the same class. SpinCondVar signal/bcast (2): SpinCondVar::signal / SpinCondVar::bcast Each is one atomic store. wait / timed_wait stay @unsafe — they additionally call SpinLock and Time::sleep, which is Tier 3 work (apply Pattern B1 across the remaining SpinLock users). Bumps third-party/rusty-cpp to 755344c which annotates rusty::sync::Weak::upgrade / rusty::rc::Weak::upgrade as @safe. Two DeferredReply methods in server.cpp (reply, reply_error) drop their inline `// @unsafe - weak pointer upgrade` wrappers as a result. Not in this commit (defer): - Future::timed_wait — currently labeled "Uses std::chrono" but actually uses rusty::Mutex + Condvar, which is the larger blocker. - Server::drain — has chrono + usleep + multiple atomic loads in one function; a larger refactor than mechanical wrap-and-flip. - SpinCondVar::wait / timed_wait — wait until Tier 3 lands the remaining SpinLock pattern-B1 conversions.

@safe

…e callers The rusty-cpp library already annotates rusty::VecDeque (namespace- level `// @safe`), rusty::Function (per-method `// @safe`), and SpinMutex / SpinMutexGuard / Cell as @safe. The remaining @unsafe labels on rrr callers were leftover from before those library annotations landed. Flipping them now removes ~18 false-unsafe function signatures and drops 13 stale inline `@unsafe { }` markers that no longer have a non-safe operation to wrap. request_queue.cpp (9 functions, 8 stale inline markers): - RequestQueue ctor (defaults() factory is @safe per tier 1) - enqueue / dequeue / expire_stale - size / empty / full / remaining_capacity - clear_all Body composition for each: SpinMutex::lock() (@safe) + VecDeque method (@safe via namespace) + optional rusty::Function call (@safe). try/catch around callback invocation is ignored by rusty-cpp's analyzer (per its CLAUDE.md), so the catch arm doesn't need wrapping either. client.cpp (6 functions, 4 stale inline markers): - ClientConnection::pending_request_count (RequestQueue::size) - ClientConnection::clear_pending_requests (RequestQueue::clear_all) - ClientConnection::set_on_server_restart (Function move-assign) - ClientConnection::check_server_instance (Cell::get/set + Function::op()) - Client::pending_request_count (forwards to ClientConnection) - Client::clear_pending_requests (forwards to ClientConnection) - ClientConnection::replay_pending_requests (no-op stub returning 0) reactor.cpp (6 inline marker removals): - 6 `// @unsafe - rusty::Function constructor` markers around extract_if / retain callback construction in Reactor::check_timeout and Reactor::loop. The Function constructor is @safe in the library; the braces are kept as scope hygiene for the local iterator variables. Verified clean on Homebrew clang 21.1.8: borrow_check_rrr: 45/45 files, 0 violations.

@unsafe

…State> ClientPool held `SpinLock l_` + two unprotected fields (`cache_` and `lb_state_`), with seven public methods doing manual `l_.lock()` / `l_.unlock()` around BTreeMap operations. The pattern was the last raw-SpinLock holder in rrr. Bundle the protected state into a `SpinMutex<PoolState>`: struct PoolState { rusty::BTreeMap<std::string, rusty::Vec<rusty::Arc<Client>>> cache; rusty::BTreeMap<std::string, LoadBalancerState> lb_state; }; mutable SpinMutex<PoolState> state_; Bundling matches the access pattern: `get_client` already touches both maps under the same lock, so they want one mutex, not two. Every method that used `l_.lock()/.unlock()` now uses RAII via `auto guard = state_.lock().unwrap();`, accessing fields as `guard->cache` / `guard->lb_state`. The destructor also takes the guard before iterating (it was previously unlocked-by-convention as "the last reference"; locking is a no-op in that case but lets the analyzer see the guard). Methods flipped @unsafe → @safe: - get_healthy_client_count - remove_unhealthy_clients - close_idle_clients - remove_all_unhealthy - close_all_idle - total_client_count - address_count Still @unsafe (for non-lock reasons): - get_client — drives Client::connect / try_reconnect_if_needed synchronously, which is network I/O. - reconnect_all (both overloads) — async batch driver uses nanosleep + std::atomic loop. The state_ snapshot at the top is @safe; the body is not. `ClientPool::is_client_healthy` (used inside the lock) is one of the remaining function-level @unsafe sites worth flipping next — it just reads Cell + ConnectionMetrics, no actual unsafe op. Verified on Homebrew clang 21.1.8: - borrow_check_rrr: 45/45 files, 0 violations. - clean rrr rebuild + rpcbench link.

@unsafe

… Client Two large classes carried the tag "marked unsafe to suppress rusty-cpp false positives (rusty-cpp is under development)". Those annotations predated several analyzer improvements that landed in the last few weeks: import-chasing for cross-module annotations, multi-line `// @unsafe { }` block parsing, cross-file body filter, and namespace-qualified annotation lookup (see commits 28e64c5, 0c16ebf, df2388f). With the matured analyzer, the suppressions are no longer needed. Drop them and flip both classes to `// @safe` at the class level. The methods inside that genuinely cross into unsafe territory (network I/O, std::chrono, rusty::Mutex + Condvar combinations) already carry their own method-level `// @unsafe` overrides; the rest of the class is now analyzed as @safe by inheritance. This is a class-level boundary change, not a method-level flip — no individual method's annotation changed. The net effect: previously- unannotated methods inside Future and Client now default to @safe via class inheritance instead of being @unsafe-by-namespace. Callers of those methods can now invoke them from @safe code without `// @unsafe { }` wrappers. Verified on Homebrew clang 21.1.8: - borrow_check_rrr: 45/45 files, 0 violations (no new findings from the @safe flip — the suppressions were genuinely no-ops). - clean rrr rebuild.

@unsafe

…ers in client.cpp These were defensive `// @unsafe { ... }` scope wrappers around method bodies whose underlying operations are all @safe in the matured rusty-cpp library + project annotations: - rusty::Arc::make (@safe in arc.hpp) - rusty::Mutex::lock / MutexGuard::operator*/-> (@safe in mutex.hpp) - rusty::RefCell::borrow / borrow_mut (@safe via namespace in refcell.hpp) - rusty::Option::is_some/is_none/as_ref/unwrap (@safe via namespace) - rusty::Cell::get/set (@safe via namespace) - rusty::Arc::clone (@safe in arc.hpp) - ClientConnection::host/connected/connection_state/server_instance_id /pending_request_count/clear_pending_requests/set_keepalive (all @safe at decl after tier 1+2) The wrappers were no-ops once those library/project methods landed their @safe annotations; the analyzer was already letting these through because the function-level @safe overrides the inner block scope, but the dead `// @unsafe { ... }` markers were noise to anyone reading the code. Methods cleaned: - Future::create, Future::ready, Future::timed_out, Future::add_completion_callback, Future::get_reply - Client::create, Client::host, Client::connected, Client::connection_state, Client::connection, Client::server_instance_id, Client::is_reconnecting, Client::set_keepalive, Client::set_reconnect_policy - Client::request (3 overloads), Client::request_with_options (2 overloads) Still @unsafe (kept): - Future::get_error_code's `{ timed_wait(x); }` block — timed_wait is genuinely @unsafe (Mutex+Condvar+chrono). Bare-marker count in rrr: 115 → 96. Verified on Homebrew clang 21.1.8: borrow_check_rrr: 45/45 files, 0 violations.

@safe

…pers server.cpp (4): - Server::reg_service (Box<Service> overload) — rusty::Vec::push + Box move + ServiceProxy dispatch are @safe at the boundary. - Server::reg_service (Box<T> ServiceLike overload) — same. - Server::reg_rpc — rusty::HashMap::contains_key + insert are @safe via the rusty namespace. - Server::service_count — Option + Vec::size are @safe. reactor.cpp (1): - Reactor::make_arc — rusty::Arc::make is @safe; the wrap was misdocumented as a "localized unsafe allocation boundary". Kept @unsafe: - Server::for_each_service — body has `static_cast<Service*>(...)` on a raw pointer from ServiceProxy's __get_service__; that's a genuine raw-pointer cast. Bare-marker count in rrr: 96 → 92. Verified on Homebrew clang 21.1.8: borrow_check_rrr: 45/45 files, 0 violations.

@safe

Baseline (computed with /tmp/safety_loc.py): total in-fn LOC: 22,438 @safe (function or class-inherited): 1,435 ( 6.4%) @unsafe (function or class-inherited): 2,526 (11.3%) inner `@unsafe { ... }` blocks: 733 ( 3.3%) unannotated (namespace default = @unsafe): 17,744 (79.1%) The unannotated bucket is the conversion target. None of it violates the borrow check today; it just hasn't been actively tagged. Plan structure: Phase 0: fix LOC script's out-of-class inheritance bug (honest baseline) Phase 1: labeling sweep — class-level + namespace-level @safe sweeps on files that are already safe in fact (target 35–40%) Phase 2: easy raw-pointer refactors (target 50%) - ChannelConnectionProxy / FactoryProxy → rusty::Box - PollThreadWorker* TLS → rusty::Weak<PollThreadWorker> - rusty::sys::* syscall wrappers - ServiceProxy::__get_service__ → rusty::Arc<Service> Phase 3: harder refactors (target 65–70%) - alock.cpp ALock* → rusty::Weak<ALock> - serializable.cpp std::shared_ptr<Marshallable> → rusty::Arc (touches generated rcc_rpc.h codegen) - Reactor::loop @unsafe block scoping - Pthread_* → rusty::sync::* wrappers Phase 4: stretch (target 80%) - Marshal byte ops (refactor / external annot / quarantine) - fiber context quarantine (~150 LOC explicit @unsafe) - rcc_rpc.h codegen rewrite Honest ceiling without rusty-cpp expressiveness changes or perf-costly Marshal refactors is ~70%. The plan treats 70% as the success criterion and 80% as a stretch. Each Phase has a checklist; the self-pacing loop ticks one item per iteration, runs borrow_check_rrr + rrr build, commits, and updates this doc's Progress log.

@safe

Phase 0 of the rrr_safety_80pct_plan: an honest, reproducible baseline. Previously the script lived in /tmp and had a bug where it could match function calls (`Class::method(args)`) inside other function bodies as out-of-class method definitions, inflating the @safe / @unsafe in-fn counts. The fix: - Anchor OUT_OF_CLASS_METHOD regex to lines that don't start with control-flow keywords (`if/while/for/switch/return/else/do/try/catch`). - Only consider an out-of-class match at brace depth==0 (file scope) — function definitions open at file scope; function calls don't. - Handle multi-line signatures: stash a pending out-of-class name when the `ClassName::method(` line lacks `{`, attach when the `{` arrives. Honest baseline after the fix (was a moving target): in-fn LOC: 22,438 @safe: 1,408 (6.3%) @unsafe: 2,226 (9.9%) inner @unsafe: 733 (3.3%) unannotated: 18,071 (80.5%) The earlier estimate that the fix would land "+4-6pp" was wrong — the fix tightens classification rather than growing @safe. Real ratio growth comes from the Phase 1 labeling sweep.

@safe

Two intertwined changes, both surface @safe LOC that was previously hidden: 1. **`Server` class** (`rpc/server.cpp`) flipped from `// @unsafe` to `// @safe` at the class level, mirroring the Tier-4 flip on `Client`. Every method that genuinely crosses into unsafe territory (socket I/O via TcpListener, Pthread / std::atomic primitives, raw pointer extraction from ChannelListenerProxy, `usleep` in drain, etc.) already carries a method-level `// @unsafe` override and is unaffected. 2. **LOC-script bug fix**: `scripts/rrr_safety_loc.py` previously used `";" in line` to detect forward-decls and drop pending annotations. Multi-line `// @safe -` annotation comments often contain a `;` in prose ("// overrides; the rest of the class…"), which fired the drop spuriously and stripped the pending annotation BEFORE the class declaration was reached. Net effect: `Future` and `Client` class flips from Tier 4 were silently uncredited in every prior LOC report. Fix: check `";" in stripped` (after strip_line_comment) instead. borrow_check_rrr: 45/45 clean. LOC ratio: 6.3% → 7.2% @safe (1,408 → 1,623 LOC inside fn bodies). Note: this delta combines Server's flip and the script fix's retroactive credit for Future/Client; the pure Server contribution is about half of it.

The amend in 54a2d98 changed the SHA from the originally-recorded d5028a19. Update the Progress log entry to match.

@safe

Add `// @safe` at the `class Reactor` declaration. The Reactor design already documents itself as memory-safe (single-threaded, RefCell / Cell / Rc / HashMap with rusty borrow rules); the comment was descriptive but the analyzer wasn't credited. Methods that genuinely cross into unsafe territory keep their existing method-level `// @unsafe` overrides: - `get_reactor` / `get_disk_reactor` (thread-local Rc access) - `register_stackless_poller` / `enqueue_stackless_task` / `process_stackless_tasks` / `spawn_stackless_task` (RefCell-mut via mutable members) - `check_timeout` / `loop` (rusty-cpp doesn't fully model RefCell) - `continue_fiber` / `recycle` (fiber-context interactions) Most Reactor:: out-of-class methods already had per-method @safe or @unsafe annotations, so the class-level flip mainly credits the inline methods (getters, setters, factory helpers, register_fiber, restore_running_fiber, etc.) that were silently unannotated. borrow_check_rrr: 45/45 clean. LOC ratio: 7.2% → 7.4% @safe (1,623 → 1,652 LOC). Modest gain, as expected — the bigger reactor wins live in Phase 2 (PollThreadWorker* raw pointer → rusty::Weak) and Phase 3 (Reactor::loop @unsafe-block tightening).

@unsafe

; LOC script pending_for_class leak fix Two correctness wins, no ratio movement: 1. Add `// @safe` at the class level for `IdempotencyKeyGenerator` and `IdempotencyCache` in `rpc/idempotency.cpp`. Both classes use only rusty primitives (Cell, Mutex, HashMap) with explicit @unsafe annotations on the few methods that copy Marshal byte payloads. 2. Fix `scripts/rrr_safety_loc.py`: `pending_for_class` was leaking across function-body `{` consumption. Only `pending` got reset when a function brace was consumed; `pending_for_class` kept the stale annotation forever, which then got attached to the next class declared anywhere in the file. Concrete effect: pre-fix, `IdempotencyKeyGenerator` was wrongly credited as `@unsafe` because `CachedResponse::get_response_data` set `pending_for_class = "unsafe"` 17 lines earlier and that never got cleared. Same pattern was likely contaminating other files' class annotations too — the previous LOC report was about 1pp too optimistic on @safe and 0.9pp too optimistic on @unsafe. Honest baseline after both fixes: in-fn LOC: 22,453 @safe: 1,398 (6.2%) @unsafe: 2,137 (9.5%) inner @unsafe: 733 (3.3%) unannotated: 18,185 (81.0%) This iteration's class flips on idempotency.cpp don't move the ratio because every method in both classes already had explicit per-method annotations — class-level @safe only helps when there are unannotated method bodies for it to inherit. borrow_check_rrr: 45/45 clean.

@safe

Add `// @safe` at the `class CompletionTracker` declaration. The class already uses only rusty primitives (Cell, Mutex<std::list>, Mutex<HashSet>) with no raw pointers, syscalls, or Marshal byte ops. All previously unannotated methods inside the class body now inherit @safe. borrow_check_rrr: 45/45 clean. +107 @safe LOC; ratio 6.2% → 6.7%.

@safe

Single-threaded state machine; every field is rusty::Cell<T> with trivially-copyable interior mutability. No raw pointers, syscalls, or Marshal byte ops. borrow_check_rrr: 45/45 clean. +118 @safe LOC; ratio 6.7% → 7.2%.

@safe

Cell-based heartbeat tracker with rusty::Function timeout callback. No raw pointers, syscalls, or Marshal byte ops. borrow_check_rrr: 45/45 clean. +88 @safe LOC; ratio 7.2% → 7.6%.

@safe

…address Five method-level `// @unsafe` markers on TcpConnection / TcpListener adapter accessors were tied to ops that have since become @safe: - `TcpConnectionChannelAdapter::is_closed` — forwards to TcpConnection:: is_closed (Cell::get, @safe). - `TcpConnectionChannelAdapter::peer_address` — forwards to TcpConnection:: peer_address (const std::string accessor). - `TcpConnectionPollableAdapter::is_closed` — same as above. - `TcpConnectionPollableAdapter::check_pending_write_update` — forwards to TcpConnection::check_pending_write_update (Cell::get + Cell::set, @safe). - `TcpListener::local_address` — returns a copy of the const std::string member; the historical "std::string copy constructor isn't borrow-checked" rationale was over-conservative. Verification: borrow_check_rrr 45/45 clean. Ratio holds at 71.5% (impact small; mostly comment changes on one-liners).

@safe

Both methods only call SpinMutex::lock (@safe since Tier 2.x) and CallbackWrapper move-assign (@safe via namespace+class annotation in rrr.callback_wrapper). The historical "(not @safe)" rationale was out of date. Verification: borrow_check_rrr 45/45 clean. Ratio holds at 71.5% (+6 @safe LOC, −6 @unsafe LOC).

@safe

…::make Both static factory templates wrap `rusty::Arc<T>::make(args...)`. The historical "(non-borrow-checked)" rationale predates the rusty::Arc namespace-level @safe annotation: Arc::make is itself @safe with an inner `// @unsafe { new }` block around the heap allocation. Verification: borrow_check_rrr 45/45 clean. Ratio 71.5% → 71.6%.

@safe

- TcpListener::handle_write (returns PollMode::NO_CHANGE, no side effects) - TcpListener::check_pending_write_update (returns false) - PollThreadWorker::do_remove_pollable (HashMap::contains_key + HashSet::insert, both @safe; the "uses STL operations" rationale was wrong — these are rusty types) - PollThreadWorker::do_close_pollable (HashSet::remove / HashMap::get / HashMap::remove / HashMap::contains_key are @safe; only Epoll::Remove and virtual Pollable::close() dispatch escape into inner @unsafe blocks). Verification: borrow_check_rrr 45/45 clean. Ratio 71.6% → 71.8% (+27 @safe LOC, −25 @unsafe LOC).

@unsafe

…loop Two inline @unsafe blocks in Reactor's fiber-execution path were rooted in "Fiber::finished() is not marked @safe" — that's outdated. Fiber::finished() is annotated @safe (reads a Cell<Status>); RefCell::borrow + Option::as_ref / unwrap are @safe via namespace inheritance. Verification: borrow_check_rrr 45/45 clean. Ratio 71.8% → 71.9% (+6 @safe LOC, −6 @unsafe LOC).

@safe

`Event::test()` only calls @safe ops: - `verify()` is @safe (pure precondition check, marked @safe in rrr.debugging) - `is_ready()` is virtual; the base @safe-annotates the virtual - `status_.get()` / `status_.set()` (Cell, @safe) - `wp_fiber_.upgrade()` (rusty::sync::Weak::upgrade, @safe) - `option_fiber.is_some()` (@safe) - `Log_debug(fmt, args...)` is @safe (template shim wrapping Log::debug in an inner @unsafe block) The historical "verify/log helpers not marked @safe" rationale was outdated. Verification: borrow_check_rrr 45/45 clean. Ratio holds at 71.9%.

@unsafe

RefCell / Option Eight inner `// @unsafe { ... }` blocks were wrapping operations that have been @safe for a while: - marshal.cpp: three `// @unsafe { Vec::clear is @safe; wrap defensively. }` blocks around `buf_.clear()` / `src.buf_.clear()` — defensive wraps that were never needed; rusty::Vec::clear is @safe via namespace inheritance. - client.cpp: two `// @unsafe { SpinMutex::lock }` wraps around `factory_.lock()` / `pending_factory_.lock()` — SpinMutex::lock is @safe since Tier 2.x. - client.cpp: five `// @unsafe { RefCell::borrow, Option::unwrap are not borrow-checked }` blocks around connection_.borrow() + Option::is_some / as_ref / unwrap — both RefCell::borrow and Option::unwrap are @safe via namespace inheritance. Verification: borrow_check_rrr 45/45 clean. Ratio 71.9% → 72.1% (+21 @safe LOC; inner-block LOC dropped by 25).

@unsafe

Nine more inner `// @unsafe { ... }` blocks wrapped operations that have been @safe for a while: - client.cpp x4: `// @unsafe { RefCell::borrow, Option::unwrap }` around connection_.borrow() + Option::is_some / as_ref / unwrap. - client.cpp x2: `// @unsafe { SpinMutex::lock + Option::take }` around fiber_channel_.lock().unwrap() / direct_channel_.lock() .unwrap(). - reactor.cpp x2: `// @unsafe { RefCell::borrow_mut, Option operator= are not borrow-checked }` around the sp_running_fiber_th_.borrow_mut() = Some/None idiom. The wrapped operations are all @safe via the rusty namespace + RefCell / Option / SpinMutex class annotations. Verification: borrow_check_rrr 45/45 clean. Ratio 72.1% → 72.4% (+31 @safe LOC; inner-block LOC dropped by 31, mostly net-neutral moves into the @safe pool).

@unsafe

Two more pairs of inner `// @unsafe { ... }` blocks were wrapping purely-@safe operations: - `// @unsafe { SpinMutex::lock + ChannelFactoryProxy move }` (x2) around factory_.lock().unwrap() + Some(std::move(factory)) — both SpinMutex::lock and Box move-assign are @safe. - `// @unsafe { make_box + SpinMutex mutation }` (x2) around fiber_channel_.lock().unwrap() + Some(make_box<FiberChannel>) — rusty::make_box is @safe (Arc::make is @safe), SpinMutex::lock is @safe. Verification: borrow_check_rrr 45/45 clean. Ratio 72.4% → 72.5% (+12 @safe LOC; inner-block LOC dropped by 12).

@safe

- idempotency.cpp: Marshal `operator<<`/`operator>>` overloads for `IdempotencyKey` flip @safe — the Marshal class is namespace @safe via the Phase 4 Cursor rewrite; the per-method @unsafe blocks only remain on the chunk-list-era methods (which the rewrite deleted). - client.cpp `current_time_ms()`: dropped the `std::chrono::steady_clock` + `duration_cast` plumbing in favor of `rusty::sys::time:: clock_monotonic_us() / 1000`. Function flips @safe end-to-end. - request_queue.cpp `QueuedRequest`: storage migrates from `std::chrono::steady_clock::time_point timestamp` to plain `std::uint64_t timestamp_us` (monotonic microseconds via `clock_monotonic_us`). All three methods (ctor, is_expired, age_ms) flip @safe. Verification: borrow_check_rrr 45/45 clean. Ratio 72.5% → 72.6% (+6 @safe LOC; inner-block LOC dropped by 15).

@safe

… Vec) - client.cpp `reconnect_address_ = addr` — std::string assignment from `const char*` is benign in @safe code; no need for an inner wrap. - client.cpp `bind_channel_direct` body — SpinMutex::lock + Option::operator= are both @safe. - server.cpp's channel-listener on_accept lambda — SpinMutex::lock + rusty::Vec::push are both @safe. Verification: borrow_check_rrr 45/45 clean. Ratio 72.6% → 72.7% (+6 @safe LOC; inner-block LOC dropped by 6).

@unsafe

`update_config` was method-level @unsafe with an inner @unsafe block around `queue_.lock().unwrap()` + `config_ = config`. Both are @safe: SpinMutex::lock and POD-struct assignment have no escape hatches. Verification: borrow_check_rrr 45/45 clean. Ratio holds at 72.7% (+8 @safe LOC, −9 @unsafe LOC).

@safe

Time::now flows through rusty::sys::time::clock_monotonic_us (itself @safe with an inner @unsafe block around clock_gettime). The inline wrap inside this_fiber::sleep_until_us was leftover from before the sys::time landing. Verification: borrow_check_rrr 45/45 clean. Ratio holds at 72.7%.

@safe

…ty() std::string::empty() is a pure const accessor — completely safe in @safe code. The lingering @unsafe wrap was incidental. Verification: borrow_check_rrr 45/45 clean. Ratio 72.7% → 73.2% (+58 @safe LOC; the inline @unsafe block was straddling an entire if-condition, so removing the wrap relabels the surrounding code).

Third std::chrono call-site in client.cpp (after Future::timed_wait and current_time_ms) flips to rusty::sys::time::clock_monotonic_us / 1000. Removes the last std::chrono::steady_clock::now() use in this file outside of the test-side rpcbench tooling. Verification: borrow_check_rrr 45/45 clean. Ratio holds at 73.2%.

@safe

Marshal ctor and init_block_read had blocks around buf_.reserve(N). rusty::Vec::reserve is @safe — its internal allocation is wrapped in an inner @unsafe block inside the Vec class itself. Verification: borrow_check_rrr 45/45 clean. Ratio holds at 73.2%.

@unsafe

The set_bookmark zero-fill loop appended n zero bytes via buf_.push(0) inside an @unsafe block. rusty::Vec::push is @safe via namespace inheritance, so the wrap was decorative. Verification: borrow_check_rrr 45/45 clean. Ratio 73.2% → 73.3%.

@unsafe

Logs the ~30-commit sweep through stale per-method @unsafe overrides and inline @unsafe { ... } blocks whose rationales were tied to operations that have since become @safe. Cumulative net effect: +2.7pp of @safe ratio (70.6% → 73.3%), borrow_check_rrr 45/45 clean throughout.

@safe

Phase 2 (rrr safety push, item 3 — second wrapper family). Bumps the rusty-cpp submodule to pick up `rusty::sys::process::*` and updates the five rrr files calling those syscalls directly: - base/misc.cpp `get_ncpu`: sysconf(_SC_NPROCESSORS_ONLN) → @safe `rusty::sys::process::sysconf`. Function flips @unsafe → @safe. - base/misc.cpp `get_exec_path`: getpid → @safe wrapper. The function stays @unsafe overall (static char[PATH_MAX] + readlink + raw char* return) but the getpid call no longer carries an explicit syscall annotation. - misc/cpuinfo.cpp `CPUInfo::CPUInfo` + `get_cpu_stat`: sysinfo / sysconf / times / getpid all route through the new helpers. The methods stay @unsafe because of get_network / get_memory parsers (raw char* / strtok) but the syscall plumbing is gone. - misc/netinfo.cpp `NetInfo::NetInfo` + `get_net_stat`: drop the `// @unsafe { times(&tms_buf) }` inner blocks — times() now flows through rusty::sys::process::process_times. - reactor/reactor.cpp `fiber_task_t::init_context`: sysconf(_SC_PAGESIZE) → @safe wrapper. - rpc/server.cpp instance-id generation: getpid → @safe wrapper. Submodule bump: 843ba3b adds `include/rusty/sys/process.hpp` with getpid / sysconf / process_times (ProcessTimes aggregate) / sysinfo (Linux-only SysInfo aggregate). Each entry is @safe, body wraps a single libc syscall in an inner @unsafe block, return types are plain integers or small PODs. Verification: borrow_check_rrr 45/45 clean; rrr library compiles. Ratio 73.3% → 73.5%.

@safe

Phase 2 (rrr safety push, item 3 — third wrapper family). Bumps the rusty-cpp submodule to pick up rusty::sys::env::hostname() and updates the one rrr caller: - rpc/utils.cpp get_host_name(): drops the raw char[1024] buffer + direct gethostname call. Flow now goes through @safe sys::env:: hostname which wraps gethostname in an inner @unsafe block and returns an owned std::string. Function flips @unsafe → @safe. Submodule bump: d720f95 adds include/rusty/sys/env.hpp. Buffer size defaults to 256 bytes (overridable via RUSTY_SYS_ENV_HOSTNAME_BUF). Verification: borrow_check_rrr 45/45 clean. Ratio 73.5% → 73.6%.

@safe

…d_hash Phase 2 (rrr safety push, item 3 — fourth wrapper family). Bumps the rusty-cpp submodule to pick up rusty::sys::pthread::current_id_hash() and updates the one rrr caller: - basetypes.cpp Rand::Rand: pthread_self() seed contributor now goes through @safe sys::pthread::current_id_hash. The only remaining @unsafe op is , which is now confined to a tight inner @unsafe { } block. Function flips @unsafe → @safe. Submodule bump: 97c45b4 adds include/rusty/sys/pthread.hpp scoped narrowly to thread-identity (the mutex / condvar / thread-create surface continues to live in rusty::sync::* / rusty::thread::*). Verification: borrow_check_rrr 45/45 clean. Ratio holds at 73.6% (Rand::Rand body is small; +8 @safe LOC, -7 @unsafe LOC, +2 inner-block).

Phase 2 item 3 (rusty::sys::* syscall wrappers) was [partial] after the sys::time landing. With sys::process / sys::env / sys::pthread also shipped + folded into rrr, the item is now [x]. Logs the four sub-families that landed (time, process, env, pthread) with their submodule + parent commit hashes, and documents the deliberate skips for epoll/kqueue, pthread mutex/condvar, and the full file I/O surface (each annotated with the reason rrr's safety ratio wouldn't move).

@unsafe

… Queue Audit confirmed these classes had zero production-touching method calls: - `ThreadPool` was constructed in `paxos_worker.cc` and `raft_worker.cc` (`base::ThreadPool::make(num_threads)` + `make(1)` for hb) and stored in `rusty::Arc<base::ThreadPool>` fields, but `run_async` was never invoked. The fields were checked via empty `if (thread_pool_g) { /* Arc auto-releases */ }` blocks during shutdown — i.e. it spun up worker threads on an idle job queue forever, allocated for nothing. - `RunLater`: zero callers outside threading.cpp. - `SpinCondVar`: zero callers outside threading.cpp. - `Queue<T>` (pthread mutex + cond blocking queue): only consumed internally by `ThreadPool::run_thread`. Deletes: - threading.cpp: drops SpinCondVar (~70 LOC) + Queue<T> (~75 LOC) + ThreadPool class + impl (~155 LOC) + RunLater class + impl (~220 LOC). File shrinks from 930 → 411 LOC. Unused includes (rusty/arc.hpp, /box.hpp, /fn.hpp, /function.hpp, /vecdeque.hpp, sys/time.h) and imports (rrr.misc) come out too. - server_worker.{h,cc}: drops `svr_thread_pool_`, `thread_pool_g`, `hb_thread_pool_g` fields + the `hb_thread_pool_g = svr_thread_pool_` assignment. - paxos_worker.{h,cc}: drops `thread_pool_g` / `hb_thread_pool_g` fields + the two `base::ThreadPool::make(...)` construction calls. - raft_worker.{h,cc}: same shape; also drops the two empty `if (...thread_pool_g) { /* Arc auto-releases */ }` shutdown blocks. Verification: borrow_check_rrr 45/45 clean. Channel-mode unit tests pass (inmemory 24/24, fiber 8/8, tcp 20/20, client_factory 5/5). rrr + mako + rpcbench + txlog_core_obj all link cleanly. Ratio 73.6% → 74.0%; @unsafe pool drops by ~114 LOC; total in-fn LOC drops by ~360. Net: 554 LOC removed, 2 added.

@safe

Two dead-code prunes surfaced by the Phase 2 follow-on survey: - `NetInfo` (src/rrr/misc/netinfo.cpp, ~75 LOC) gauged ens4 rx/tx bytes/sec through `NetInfo::net_stat()`; zero callers anywhere in the tree. Removed the file, dropped `import rrr.netinfo;` from `rrr.hpp`, and pruned the entries from RRR_MODULE_SRC / RRR_BORROW_SRC in `src/rrr/CMakeLists.txt`. - `ServerConnection::{fd, poll_mode, content_size, handle_read, handle_write, handle_error, handle_free, check_pending_write_update}` in `src/rrr/rpc/server.cpp` (~95 LOC) were stub methods whose doc-comments cited "PollableProxy facade ABI compatibility". But `class ServerConnection` has no base class — the stubs aren't overrides — and `make_pollable_proxy_from_typed_arc<T>` is never instantiated with `T = ServerConnection` anywhere in the tree (verified via grep). The only callers lived inside one test (`ServerApiSafetyTest.ServerConnectionContentSizeAndHandleFreeAreSafe` in test_rpc_extended.cc) that exercised the no-op behavior to confirm it was a no-op. Deleted the test along with the stubs. `is_closed()`, the one method in the same block that is actually used by the poll loop and many callers, is kept. Verification: - rrr builds clean (`cmake --build build_clang21 --target rrr`). - borrow_check_rrr clean across 44 files (47/48 ninja tasks; 48th is the rrr.netinfo BMI which is no longer emitted). - test_rpc_extended builds + links clean. Safety LOC tally: 8395 @safe / 11351 in-function = 74.0%. Plan doc: append a "Phase 2 follow-on — dead-code removal" entry listing this commit + the prior threading.cpp prune (f6be7df). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

@safe

…tomic The Log class held a static `pthread_mutex_t m_s` that protected writes to `level_s` (int) and `fp_s` (FILE*). Two findings: 1. `fp_s` and `Log::set_file()` had zero callers in the tree. They were dead state. 2. `log_v` (the hot path) reads `level_s` without taking the lock, so the mutex only guarded the writer — a torn-write hazard, not a torn-read one. Rust would model this as `AtomicI32`, not a `Mutex<i32>`. Changes in `src/rrr/base/logging.cpp`: - Drop `static FILE* fp_s` field + its definition. - Drop `static pthread_mutex_t m_s` field + its definition. - Drop `Log::set_file(FILE*)` declaration and definition. - Change `static int level_s` to `static rusty::sync::atomic::Atomic<int> level_s`. - Initialize to `Log::DEBUG` via `Atomic<int>{Log::DEBUG}`. - `set_level` becomes `level_s.store(level, Ordering::Relaxed)`. - `log_v` reads via `level_s.load(Ordering::Relaxed)`. - Drop `#include <pthread.h>` and `import rrr.threading;` — the file no longer touches pthread or the Pthread_* wrappers. The rusty-cpp submodule bumps to `5150277` (one commit) which adds namespace-level `// @safe` annotations to `rusty::sync::atomic` so `load`/`store` can be called from @safe code without inline `@unsafe { ... }` blocks at the call sites. Verification: - rrr builds clean. - borrow_check_rrr clean across all 44 files (logging.cpp specifically: 0 violations). - LOC tally: 8394 @safe / 11341 in-function = 74.0%. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The paxos/raft configs in `config/1leader_2followers/` hardcode ports (45xxx/46xxx for paxos, 27xxx/28xxx for raft). Between consecutive CI runs in the same job — including the auto-retries inside `ci/ci.sh` — a process from the previous attempt that exited abnormally can leave its listen sockets in TIME_WAIT / FIN-WAIT-2 on those ports. SO_REUSEADDR helps with TIME_WAIT but not with sockets still tracked in connected states, so the next attempt's `s102` bind on 0.0.0.0:46002 panics with AddressInUse and the shard goes down before any throughput emits. We hit this on `shard2Replication` in PR #67's CI run. Mirror what simpleTransactionRep already does for the shard config (`make_simple_txn_rep_config`): pick a random base port at test init, probe the leader port of each (shard, cluster) tuple via real `bind()` to verify the range is free, then materialize per-shard rebased copies of the paxos/raft YAML into `/tmp/mako_paxos_cfg_XXXX/`. Test wrappers export `MAKO_PAXOS_CONFIG_DIR` so `bash/shard.sh` and `simpleTransactionRep` redirect to the tmp dir. Files touched: - `examples/simple_transaction_rep_port_utils.sh`: + `pick_paxos_replication_port_base(nshards, nthreads)` — picks a base in [40000, 56000-window) with up to 2000 random retries, probing 4 * nshards leader ports via SO_REUSEADDR bind. + `write_paxos_replication_config(new_base, src, dest)` — rebases every `s<id>:<port>` token in `site.server` by `delta = new_base - min(existing ports)`. Preserves yaml-cpp- friendly flow-style nested lists. + `make_paxos_replication_configs(nshards, nthreads, type)` — orchestrates: picks base, mkdir-tmp, calls `write_paxos_replication_config` per shard with shard-stride `+1000`, returns the tmp dir path. - `bash/shard.sh`: honor `MAKO_PAXOS_CONFIG_DIR` (fall back to the in-tree `config/1leader_2followers/`). - `examples/simpleTransactionRep.cc`: same `MAKO_PAXOS_CONFIG_DIR` override at the `paxos_config_files[]` construction site. - All eight CI replication test wrappers: `test_{1,2}shard_replication{,_simple,_raft,_simple_raft}.sh` call `make_paxos_replication_configs`, export the env var, and clean up the tmp dir on EXIT / interrupt. Verification: - `bash -n` clean on all 10 modified shell scripts. - `simpleTransactionRep` rebuilds. - Smoke test: $ tmp=$(make_paxos_replication_configs 2 3 paxos) $ head -8 $tmp/paxos3_shardidx{0,1}.yml shows shard 0 at base+0/100/200/300 and shard 1 at base+1000/1100/1200/1300 with the original site IDs (s101..s403) intact and yaml-cpp-style flow lists preserved. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The first run of the randomized-port helper produced a base around 57900, which gave paxos site ports up to ~58900. PaxosWorker::CtrlPortDelta = 10000 means each site also spawns a heartbeat listener at `port + 10000` — so we were asking the kernel to bind 68900, which is past the valid TCP port range, and bind() returned AddressInvalid: rrr::Server::start: channel listener failed to bind 0.0.0.0:67901: AddressInvalid The original hardcoded configs never hit this: paxos shard 0 lived in 45xxx (heartbeat 55xxx) and shard 9 topped out at 54xxx (heartbeat 64xxx). Two fixes in `pick_paxos_replication_port_base`: 1. Lower BASE_MAX by 10000 + per-shard window so `max_heartbeat = base + (nshards-1)*1000 + 300 + (nthreads-1) + 10000` stays below 65535. 2. Add the matching `port + CTRL_PORT_DELTA` to the probe set so external port conflicts on heartbeat ports also block selection at init. Sanity-checked: 5 random picks for `(nshards=2, nthreads=3)` produced bases 43692..51131 with max heartbeat 54994..62433 — all within range. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

After the paxos/raft randomization landed in bb6bb21, CI moved past all replication failures but now fails on `multiShardSingleProcess` with `bind 0.0.0.0:31000: AddressInUse` / `0.0.0.0:31102: AddressInUse`. Root cause: the multi-shard single-process tests hardcode `--shard-config $path/config/local-shards2-warehouses$trd.yml`, which ships with port 31000+ for `localhost`. Earlier tests (`simpleTransaction`) pick a random base in `[20000, 28599]` with offsets up to 3100, so their highest port is 31699 — overlapping 31000-31102. A TIME_WAIT / leftover socket from `simpleTransaction` takes 31000 down before `multiShardSingleProcess` can bind it. Apply the same `make_simple_txn_rep_config` pattern used by the replication tests: - `examples/test_multi_shard_single_process.sh`: source the port-utils, generate `MAKO_CONFIG=$(make_simple_txn_rep_config 2 $trd)`, pass `--shard-config $TEMP_CONFIG`, clean up on EXIT. - `examples/test_2shard_single_process.sh`: same pattern. - `examples/test_2shard_single_process_replication.sh`: also generate a randomized paxos dir (`make_paxos_replication_configs 2 $trd paxos`) and rewrite the two hardcoded `-F config/1leader_2followers/paxos${trd}_shardidx*.yml` args to use `${MAKO_PAXOS_CONFIG_DIR}/...`. After this, every CI test that binds ports does so on a per-run randomized + bind-probed range; no test still depends on the hardcoded 31xxx-34xxx or 45xxx-46xxx ranges. `bash -n` clean on all three scripts. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

shuaimu and others added 30 commits May 16, 2026 21:43

docs/dev/rrr_safety_80pct_plan: correct commit SHA in Progress log

9784267

The amend in 54a2d98 changed the SHA from the originally-recorded d5028a19. Update the Progress log entry to match.

docs/dev/rrr_safety_80pct_plan: backfill Reactor commit SHA

6560958

docs/dev/rrr_safety_80pct_plan: backfill idempotency commit SHA

68eaa96

docs/dev/rrr_safety_80pct_plan: CompletionTracker SHA backfill

aabd1df

rrr/rpc: tier-plan iter 6 — CircuitBreaker class @safe

a12f0ee

Single-threaded state machine; every field is rusty::Cell<T> with trivially-copyable interior mutability. No raw pointers, syscalls, or Marshal byte ops. borrow_check_rrr: 45/45 clean. +118 @safe LOC; ratio 6.7% → 7.2%.

docs/dev/rrr_safety_80pct_plan: CircuitBreaker SHA backfill

ded5e82

rrr/rpc: tier-plan iter 7 — HeartbeatManager class @safe

8f4bf96

Cell-based heartbeat tracker with rusty::Function timeout callback. No raw pointers, syscalls, or Marshal byte ops. borrow_check_rrr: 45/45 clean. +88 @safe LOC; ratio 7.2% → 7.6%.

docs/dev/rrr_safety_80pct_plan: HeartbeatManager SHA backfill

2dbeced

shuaimu and others added 28 commits May 23, 2026 00:24

shuaimu merged commit e2a1c46 into mako-dev May 23, 2026
1 check passed

shuaimu mentioned this pull request May 28, 2026

rrr: migrate pure-arithmetic helpers to inline Rust DSL #68

Closed

5 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

rrr: safety + perf milestone — Marshal Vec rewrite, ClientConnection @safe class, HashMap/Weak library fixes#67

rrr: safety + perf milestone — Marshal Vec rewrite, ClientConnection @safe class, HashMap/Weak library fixes#67
shuaimu merged 193 commits into
mako-devfrom
worktree-srpc

shuaimu commented May 22, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

shuaimu commented May 22, 2026

Summary

Headline numbers

What's in the PR

rrr/misc/marshal.cpp — Vec rewrite

rrr/rpc/client.cpp — class-level @safe + cleanups

rrr/rpc/server.cpp — Weak cleanups

third-party/rusty-cpp — library improvements

rrr/tests/bench_marshal — new Marshal microbench

Merge resolution

Test plan

Notes for reviewers

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant