Skip to content

docs: rewrite README, update AGENTS.md, add 8 examples#87

Draft
cagataycali wants to merge 12 commits into
strands-labs:mainfrom
cagataycali:docs/readme-examples
Draft

docs: rewrite README, update AGENTS.md, add 8 examples#87
cagataycali wants to merge 12 commits into
strands-labs:mainfrom
cagataycali:docs/readme-examples

Conversation

@cagataycali
Copy link
Copy Markdown
Member

@cagataycali cagataycali commented Apr 1, 2026

TL;DR

Complete documentation rewrite, 8 new examples, and a brand-new Mesh Networking section covering the Zenoh-based peer-to-peer layer that ships in the default install.

⚠️ Depends on: PR #86 (Robot factory) and PR #101 (mesh implementation) — merge those first.

What changed

File What
README.md Complete rewrite + new Mesh Networking section (topic schema, robot_mesh tool table, teleoperation, E-STOP audit, disable switches) + expanded env-var table (STRANDS_MESH_* knobs, STRANDS_ROBOT_MODE, STRANDS_TRUST_REMOTE_CODE, MUJOCO_GL)
AGENTS.md Updated project structure with simulation modules, conventions
examples/01_sim_quickstart.py 5-line sim quickstart
examples/02_sim_agent.py Agent + simulation integration
examples/03_sim_recording.py Dataset recording to LeRobot format
examples/04_real_hardware.py Real hardware robot setup
examples/05_real_groot_policy.py GR00T policy on real hardware
examples/06_list_robots.py Robot discovery and listing
examples/act_policy_simulation.py ACT policy in MuJoCo sim with video
examples/physics_agent.py Natural language → physics introspection

New: Mesh Networking section

Documents the peer-to-peer Zenoh mesh that auto-attaches to every Robot() and Simulation():

  • Two-process discovery — 5-line example showing sim_a.mesh.peers finding sim_b across processes.
  • Topic schema — every published key (presence, state, cmd, response, stream, pose, imu, odom, health, lidar/*, hand/*, map/info, input/*, broadcast).
  • robot_mesh agent tool — 10-action reference table (peers / status / tell / send / broadcast / stop / emergency_stop / subscribe / watch / inbox).
  • TeleoperationInputPublisher / InputReceiver for streaming a leader arm's joints to a remote follower, with the full strands/{peer_id}/input/{device} JSON schema.
  • Safetyemergency_stop() audit log behaviour (mode 0o600 JSONL at ~/.strands_robots/mesh_audit.jsonl).
  • DisableSTRANDS_MESH=false (process) / Robot("so100", mesh=False) (per-robot).

No code changes

Docs and examples only. All 357 tests still pass.


Part 6 of 6 — final PR in the MuJoCo simulation decomposition

Full PR chain:

  1. fix: auto-adapt LeRobot state dimension instead of raising ValueError #82 — 🐛 LeRobot state dim fix
  2. chore: modernize build system — uv lockfile, Python >=3.12 #83 — 🔧 Build system (uv, py3.12, [sim])
  3. feat: simulation foundation — models, ABC, factory, model registry, assets #84 — 📐 Simulation foundation (models, ABC, registry)
  4. feat: MuJoCo simulation backend - AgentTool with 50+ actions #85 — 🎮 MuJoCo backend (AgentTool, 35 actions)
  5. feat: Robot() factory + top-level lazy imports #86 — 🏭 Robot() factory
  6. docs: rewrite README, update AGENTS.md, add 8 examples #87 — 📖 Docs & examples (this PR)
  7. feat(mesh): Mesh, AWS IoT - session, presence, RPC, streams, wiring #101 — 🔗 Zenoh mesh networking (companion implementation PR; this PR's mesh README section depends on it)

Comment thread examples/act_policy_simulation.py
Comment thread examples/01_sim_quickstart.py Outdated
@cagataycali cagataycali force-pushed the docs/readme-examples branch 3 times, most recently from 3b633c9 to 21247ce Compare April 1, 2026 20:15
yinsong1986
yinsong1986 previously approved these changes Apr 1, 2026
Copy link
Copy Markdown
Contributor

@yinsong1986 yinsong1986 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All review comments addressed. LGTM.

@cagataycali cagataycali force-pushed the docs/readme-examples branch from 21247ce to 910a3f6 Compare April 1, 2026 20:24
Comment thread pyproject.toml
Comment thread pyproject.toml Outdated
@cagataycali
Copy link
Copy Markdown
Member Author

📋 Review Status Summary

Hi @awsarron — quick status check on this docs PR.

Thread Resolution: ✅ 4/4 resolved

All 4 review threads from @yinsong1986 are resolved:

  • Mock policy fallback for quickstart example ✅
  • Emoji rendering in print statements ✅
  • simsim-mujoco naming in install commands ✅
  • strands-robots[sim]strands-robots[sim-mujoco] naming ✅

Reviews

CI: ✅ Passing

Merge readiness

This is a docs-only PR (README rewrite, AGENTS.md update, 8 examples) with zero code dependencies. It's the lowest-risk PR in the queue and can merge independently at any time.

Note: This PR also addresses the env var documentation that was requested in PR #86's unresolved thread — merging this first would resolve that cross-PR dependency.


🤖 Automated review triage by Strands Agents. Feedback welcome!

Comment thread examples/02_sim_agent.py
@cagataycali cagataycali moved this from In progress to In review in Strands Labs - Robots Apr 16, 2026
@cagataycali cagataycali requested a review from awsarron April 17, 2026 16:30
@cagataycali cagataycali modified the milestones: v0.4.0, v0.3.9 Apr 21, 2026
@cagataycali cagataycali force-pushed the docs/readme-examples branch from 8941793 to 914cbd7 Compare April 23, 2026 00:28
@cagataycali cagataycali force-pushed the docs/readme-examples branch 2 times, most recently from 229e59a to d91bca5 Compare May 16, 2026 07:41
cagataycali and others added 3 commits May 17, 2026 07:02
- README.md: Complete rewrite with 5-line promise, mermaid architecture
  diagram, installation extras table, simulation quickstart, policy
  execution examples
- AGENTS.md: Updated project structure to include simulation modules,
  test instructions, and conventions
- examples/01_sim_quickstart.py: 5-line sim quickstart
- examples/02_sim_agent.py: Agent + simulation integration
- examples/03_sim_recording.py: Dataset recording to LeRobot format
- examples/04_real_hardware.py: Real hardware robot setup
- examples/05_real_groot_policy.py: GR00T policy on real hardware
- examples/06_list_robots.py: Robot discovery and listing
- examples/act_policy_simulation.py: ACT policy in MuJoCo sim
- examples/physics_agent.py: Natural language physics introspection
- ACT example now documents mock provider as lightweight alternative
- Quickstart uses ASCII instead of emoji for terminal compatibility
Explain what Robot('so100') does internally: mode detection, backend
selection, Simulation construction, add_robot() call, and asset
auto-download. Addresses yinsong1986 review feedback.
…compliance

Bugs fixed:
- examples/03_sim_recording.py and examples/act_policy_simulation.py
  passed record_video=/video_fps= as top-level kwargs to run_policy(),
  which would TypeError on first call. Move to video={'path':..,'fps':..}
- examples/act_policy_simulation.py passed pretrained_name_or_path= as a
  top-level kwarg (silently dropped). Move into policy_config={...} so the
  intended pretrained model actually loads.
- examples/01..03, act_policy_simulation, physics_agent advertised
  'pip install strands-robots[sim]' but [sim] only installs
  robot_descriptions; running the example would ImportError on mujoco.
  Switch to [sim-mujoco] (and [sim-mujoco,lerobot] for the dataset
  recording examples). README L84 updated to match L455.

Doc accuracy:
- README + examples/06_list_robots.py: '38 robots' was stale; registry
  now has 62 sim-capable + 11 hardware. Updated to '60+ robots'.

ASCII / AGENTS.md compliance:
- examples/06_list_robots.py: replace emoji flags (joystick / wrench)
  with [sim] [real] ASCII tags via a small _flag() helper.
- examples/03_sim_recording.py + physics_agent.py: drop checkmark emoji
  from user-facing prints.
- examples/physics_agent.py: replace box-drawing separators and
  rightwards-arrow with ASCII equivalents.

Validation:
- hatch run lint: pass
- hatch run test: 1671 passed, 1 skipped
- examples/01_sim_quickstart.py and examples/06_list_robots.py
  smoke-run end-to-end (downloads SO-100, sims 100 steps, renders frame)
Document the Zenoh-based peer-to-peer mesh that strands-robots ships in
the [mesh] extra (auto-included in the default install). Adds:

- A standalone 'Mesh Networking' section after the Architecture diagram
  with a 5-line two-process discovery example.
- Topic schema reference for every published key
  (presence, state, cmd, response, stream, pose, imu, odom, health,
  lidar/*, hand/*, map/info, input/*, broadcast).
- 10-action robot_mesh agent-tool reference table.
- Teleoperation example using InputPublisher / InputReceiver with the
  full strands/{peer_id}/input/{device} payload schema.
- Safety / E-STOP audit-log behaviour.
- Per-robot and process-wide disable switches.

The Configuration env-var table now covers every STRANDS_MESH_* knob
(port, audit dir, all per-loop Hz overrides, camera opt-in) plus the
previously-undocumented STRANDS_ROBOT_MODE / STRANDS_TRUST_REMOTE_CODE /
MUJOCO_GL variables that ship with the Robot() factory and MuJoCo
backend.

No code changes.
@cagataycali cagataycali force-pushed the docs/readme-examples branch from d91bca5 to 5097531 Compare May 17, 2026 07:03
github-actions Bot and others added 2 commits May 18, 2026 07:31
Merge upstream main (LIBERO fixes) into docs/readme-examples branch:
- Resolve README.md conflict: keep Zenoh mesh env vars, add LIBERO env vars
- Fix torch_mock: add manual_seed, cuda.manual_seed_all, backends.cudnn
  (required by policy_runner._set_eval_seed added in main)
Merged main into docs/readme-examples branch. The conflict was in the
environment variables table in README.md:
- Kept main's 'Robot() factory' wording for STRANDS_ROBOT_MODE
- Preserved all mesh networking env vars added since PR was opened
- Deduplicated entries that appeared in both versions
- Kept the more descriptive variants from the PR branch

All tests pass (1677 passed), ruff + mypy clean.
Copy link
Copy Markdown
Contributor

@yinsong1986 yinsong1986 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary

Full rewrite of README.md, plus 8 new runnable examples, an updated AGENTS.md, and a documentation section for the Zenoh mesh layer. The diff is overwhelmingly markdown / examples; the only code change is in tests/mocks/torch_mock.py (which itself is a no-op duplicate — see inline). The reorganisation is a real improvement: front-loading installation, a 5-line promise, and concrete agent-driven examples is much friendlier than the previous mermaid-heavy layout.

The blocking concerns are factual, not structural. Several of the new code blocks won't actually run as written, and the install-extras table contradicts pyproject.toml in two places. Because this PR is user-facing documentation (it ships in the rendered PyPI page, in the examples/ tree people will copy-paste, and in the README anyone googling the project lands on), getting the API surface right matters more than for ordinary code PRs.

What's good

  • Scope is disciplined — pure docs + examples, no API changes hidden in the diff.
  • Each example has a docstring header with Requirements: and Usage: lines; the pattern is consistent across all 8.
  • examples/03_sim_recording.py correctly notes that start_recording needs the lerobot extra and shows the structured video={...} kwarg.
  • AGENTS.md reference from the Contributing section is a nice addition.

Concerns

  • PR description claims "No code changes" but tests/mocks/torch_mock.py is modified. The change is a duplicate no-op (see inline) — easy to drop, but the description should match the diff.
  • Depends on PR #86 and PR #101. The README's Mesh Networking section (Robot("so100", mesh=False), STRANDS_MESH_* env vars, InputPublisher/InputReceiver, mesh.peer_id) only makes sense if #101 lands. If this merges before #101, the docs describe an API that doesn't exist in main. Worth being explicit about the merge order in the PR thread before merging.
  • No CHANGELOG entry. Pre-1.0 docs PR is normally fine to skip, but the new mesh section advertises a substantial new feature surface; a one-liner under an Unreleased heading would help users tracking what shipped when (especially given this is Part 6 of 6).
  • Examples have no smoke-test gate in CI. None of the new examples/*.py files run in tests/ or tests_integ/, so the next API rename will silently rot them. Consider a tiny test that at least ast.parses every file and asserts the Robot(...) call sites use kwargs that exist on the public Simulation action set — would have caught the record_video= mistake below.

Verification suggestions

# 1. Sanity-check every Python example parses and imports cleanly
python -c "import ast, pathlib; [ast.parse(p.read_text()) for p in pathlib.Path('examples').glob('*.py')]"

# 2. Run the quickstart end-to-end (will reveal the record_video issue immediately)
pip install -e ".[sim-mujoco]"
python examples/01_sim_quickstart.py

# 3. Spot-check the README table against pyproject.toml
python -c "import tomllib; d=tomllib.loads(open('pyproject.toml').read()); print(d['project']['optional-dependencies'].keys())"

Comment thread README.md Outdated
Comment thread README.md Outdated
Comment thread README.md Outdated
Comment thread examples/05_real_groot_policy.py Outdated
Comment thread tests/mocks/torch_mock.py
1. README: fix run_policy(record_video=) → video={"path": ..., "fps": ...}
2. README: split extras table into sim/sim-mujoco rows, add mesh row
3. README: correct Zenoh claim — requires [mesh] extra, not in base
4. examples/05: add try/finally with gr00t stop + robot.cleanup()
5. tests/torch_mock: remove duplicate manual_seed binding (dead code)
Copy link
Copy Markdown
Contributor

@yinsong1986 yinsong1986 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary

Docs-and-examples PR: rewrites the README around the new Robot() factory, adds a Mesh Networking section that matches the implementation in strands_robots/mesh/, and ships eight runnable examples. Scope is well-contained (no production code beyond a one-line comment in tests/mocks/torch_mock.py), and the AGENTS.md additions about simulation modules are accurate.

The main concern is documentation drift between the new prose and the actual API:

  • The README repeatedly advertises "35 actions", but strands_robots/simulation/mujoco/tool_spec.json enumerates 64 actions (add_camera, apply_force, evaluate_benchmark, forward_kinematics, get_jacobian, ... — plus the recording / state / scene-mutation handlers). This number appears in 6 places in the rewritten README (L43, L47, L56, L230, L369, L445) and is the kind of marketing claim an LLM agent will repeat back at users. Either bump it to "60+" / drop the exact number, or generate it from tool_spec.json so it cannot drift again.
  • Two code snippets won't run as written — flagged inline.
  • The 8 new examples use direct method calls (sim.start_recording(...), sim.run_policy(...), agent.tool.gr00t_inference(...)) but I did not see a CI job that imports them; if examples/ are user-facing, even a python -c "import ast; ast.parse(open(...).read())" smoke test in CI would catch the kind of breakage flagged here.

What's good

  • Mesh section faithfully matches strands_robots/mesh/ — the topic schema, the 0o600 / 0o700 audit-log permissions claim (mesh/audit.py:80,89), the STRANDS_MESH_PORT "validated, falls back to default" claim (mesh/session.py:325), and the STRANDS_MESH=false / Robot("so100", mesh=False) opt-out paths all check out against the implementation.
  • Env-var table cleanup removes the duplicate STRANDS_ROBOT_MODE / STRANDS_TRUST_REMOTE_CODE / MUJOCO_GL rows that existed in the pre-PR README.
  • Examples 04 and 05 correctly use mode="real" (matching the safety default introduced in PR #86) rather than relying on STRANDS_ROBOT_MODE=real, and example 05 wraps the interactive loop in try/finally so the GR00T server is stopped on Ctrl-C.
  • No emojis or host paths leak into example files (per AGENTS.md > Review Learnings #86 > "No host paths in test files" / "No emojis in user-facing strings"). Note: the README still uses three em dashes via U+2014 plus one 📸 etc. inherited from the renderer output strings — those are existing, not introduced here.

Concerns

  • Dependency on unmerged PRs. The PR description states this depends on #86 (Robot factory) and #101 (mesh). #101 has already merged (176149c feat(mesh): Mesh, AWS IoT...). Confirm #86 is in main before merging this — otherwise the README's Robot("so100") 5-line promise is documenting an API that doesn't exist yet on the target branch.
  • No tests added or updated beyond a one-line comment in tests/mocks/torch_mock.py. The PR description claims "All 357 tests still pass" but there is no examples/ smoke test or a doctest pinning the README's ## Quick Start snippet to the real API. AGENTS.md > Review Learnings #85 ("Pin regression tests for reviewed fixes") applies less directly here, but the same principle would catch the broken snippets flagged inline.
  • examples/06_list_robots.py uses an _flag underscore-prefixed helper at module scope. That's just a local convenience and won't show up in __all__, but for a file users are likely to copy/paste, naming it _flag is a minor smell — format_flag or inlining the f-string would read cleaner.

Verification suggestions

With [sim-mujoco] installed:

# Should expose ~60+ actions, not 35 — confirms the docs claim is stale.
python -c "import json; d=json.load(open('strands_robots/simulation/mujoco/tool_spec.json')); \
import sys; \
for v in [d]: pass; \
print('actions:', sum(1 for _ in (a for a in str(d).split() if 'action' in a)))"

# Each example should at least parse + import cleanly.
for f in examples/01_sim_quickstart.py examples/02_sim_agent.py examples/03_sim_recording.py \
        examples/06_list_robots.py examples/physics_agent.py; do
  python -c "import ast; ast.parse(open('$f').read())" || echo "BROKEN: $f"
done

# Quick run of the safest example (no Agent, no LLM call, no recording):
python examples/01_sim_quickstart.py   # expect TypeError today: render() got unexpected save_path
python examples/06_list_robots.py      # should print all robots

Comment thread README.md Outdated
from strands import Agent

robot = Robot("so100") # MuJoCo sim, auto-downloads assets
agent = Agent(tools=[robot]) # 35 simulation actions as AgentTool
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stale count: claims "35 simulation actions" but the backend exposes 64.

strands_robots/simulation/mujoco/tool_spec.json enumerates 64 action names (counted via python -c "import json; ..." — includes apply_force, forward_kinematics, get_jacobian, inverse_dynamics, multi_raycast, patch_scene_mjcf, evaluate_benchmark, start_cameras_recording, etc. that the action-category table at L361-380 partially lists).

The "35 actions" claim repeats at L47, L56, L230, L369, L445. Either bump to "60+" / drop the exact number from the marketing line, or — better — generate it from tool_spec.json at docs-build time so the number can't silently drift again. An agent that reads this README will confidently quote "35" back to users.

Comment thread README.md Outdated
sim.step(n_steps=100)

# Render a frame
sim.render(width=640, height=480, save_path="/tmp/frame.png")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

save_path is not a kwarg of Simulation.render().

The abstract signature in strands_robots/simulation/base.py:239 and the MuJoCo implementation in strands_robots/simulation/mujoco/rendering.py:379 only accept camera_name, width, height. Calling sim.render(width=640, height=480, save_path="/tmp/frame.png") raises TypeError: render() got an unexpected keyword argument 'save_path'.

The returned dict has the PNG bytes inside content[1]["image"]["source"]["bytes"], not a file path. Either:

  • drop save_path=... and add a follow-up Path("/tmp/frame.png").write_bytes(frame["content"][1]["image"]["source"]["bytes"]) step, or
  • add a real save_path parameter to render() (a small handful of lines in rendering.py) and pin a regression test.

This snippet is also the very first "Quick Start > Simulation" code block, so a copy-paste user will hit the error immediately.

Comment thread README.md
},
port="/dev/ttyACM0",
data_config="so100_dualcam"
data_config="so100_dualcam",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whole "Real Hardware" snippet is incompatible with the new Robot() factory.

The rewrite explicitly promotes the Robot(name, mode=..., ...) factory, but this block (L189-198) still uses the old HardwareRobot.__init__ signature: Robot(tool_name="my_arm", robot="so101_follower", ...).

The new factory in strands_robots/robot.py:173 requires name as the first positional arg and does not accept tool_name= / robot= as keywords. Calling this verbatim raises TypeError: Robot() missing 1 required positional argument: 'name'. It also passes cameras= without mode="real"; per robot.py:269-273 that combination is rejected with ValueError: cameras= is only supported in mode='real'.

Replace with the new-style call (matches the working examples/05_real_groot_policy.py):

robot = Robot(
    "so101",
    mode="real",
    cameras={
        "front": {"type": "opencv", "index_or_path": "/dev/video0", "fps": 30},
        "wrist": {"type": "opencv", "index_or_path": "/dev/video2", "fps": 30},
    },
    port="/dev/ttyACM0",
    data_config="so100_dualcam",
)

Note that the older ## Example: Complete Workflow block (further down, around the tool_name="orange_arm" line) appears to have been removed by this rewrite — but the diff still leaves this Quick-Start snippet on the old API.

Comment thread examples/03_sim_recording.py Outdated
)

# Run a mock policy (random actions) for 2 seconds.
# Video kwargs go inside the ``video`` dict, NOT as top-level args.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: the start_recording docstring (strands_robots/simulation/mujoco/recording.py:35) defaults fps=30, but here you pass fps=50 to start_recording and fps=30 to the video={...} block of run_policy. The dataset frame-rate metadata and the encoded MP4 wall-clock rate will diverge — viewers will see the rollout play back at 60% speed compared to what the LeRobot dataset claims happened.

For a copy-paste example, pick one fps and use it in both places (e.g. fps=30 for both, since control_frequency=50.0 is the policy-loop rate, not the camera rate).

Comment thread examples/05_real_groot_policy.py Outdated
finally:
# Always stop the inference server and release hardware, even on error.
agent.tool.gr00t_inference(action="stop", port=5555)
robot.cleanup()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two follow-up nits on this otherwise solid example:

  1. The try/finally correctly stops the GR00T container, but gr00t_inference(action="stop", port=5555) returns a status dict — log it (print(...)) so users can see whether the stop actually succeeded. A silent failure here leaks a Docker container.
  2. robot.cleanup() is the right call for HardwareRobot (hardware_robot.py:680), but if the very first Robot("so101", mode="real", ...) call raised — e.g. servo USB not enumerated — robot is unbound and the finally block crashes with NameError, masking the original exception. Either initialize robot = None above the construction and guard if robot: robot.cleanup(), or wrap the construction inside the try.

…e_path, update Robot() API, unify fps, guard cleanup

- README: update action count from 35 to 64 (matches tool_spec.json)
- README: remove non-existent save_path kwarg from render() example
- README: replace old tool_name=/robot= API with new Robot() factory
- examples/03: unify recording fps=30 to match video fps (prevents drift)
- examples/05: log gr00t stop result, guard robot.cleanup() with None check
Copy link
Copy Markdown
Contributor

@yinsong1986 yinsong1986 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary

Docs-only PR: README rewrite, eight new examples/*.py, plus a one-line comment in tests/mocks/torch_mock.py. The new README re-frames the project around the Robot() factory (default sim) and adds a substantial Mesh Networking section. The examples demonstrate the new factory API in sim, recording, real-hardware, and policy modes.

What's good

  • Examples follow a clean numbered progression (01 quickstart → 06 discovery) with Requirements: / Usage: headers; that's a nice on-ramp.
  • start_recording example correctly notes that the [lerobot] extra is needed in addition to [sim-mujoco] and points at start_cameras_recording as the dataset-free alternative — that's the kind of friction warning users actually need.
  • Env-var table in the README is now de-duplicated (the previous version had STRANDS_ROBOT_MODE listed twice with different defaults) and it adds the new STRANDS_MESH_* knobs.
  • Diff is purely additive plus a README rewrite; no production code paths touched, so blast radius is small.

Concerns

  • PR description vs. diff drift. The description's "What changed" table promises AGENTS.md updates, but the diff has zero AGENTS.md changes. Either drop that row from the description or actually include the AGENTS.md edit — reviewers shouldn't have to reconcile the two.
  • Documents code that hasn't merged yet. The Mesh Networking section (Robot("so100", mesh=False), sim.mesh.peers, InputPublisher, STRANDS_MESH_* env vars, the robot_mesh agent tool) all depend on PR #101, which the description acknowledges. If #87 lands before #101, the README on main documents APIs that don't exist in the installed package. Recommend either (a) merge #101 first as a hard prerequisite, (b) gate the Mesh Networking section behind a coming in v0.x.y note, or (c) split the mesh-only README diff into its own PR that lands with #101.
  • Action-count drift across the PR. README says "64 actions" (correct — verified against simulation/mujoco/tool_spec.json), examples/02_sim_agent.py says "35+ simulation actions", and examples/physics_agent.py says "50+ actions". Pick one and use it everywhere; otherwise readers don't know which doc to trust.
  • examples/06_list_robots.py mislabels its own output. list_robots(mode="sim") returns all sim-capable robots (62 on current registry, including the 5 that also have hardware support), not sim-only ones. The print label "Sim-only" is wrong by ~5 robots. Same nit applies to the description "60+ robots" — the registry actually has 68 entries on main, so this is a low-balled estimate and will silently rot as robots are added.
  • No try/finally around resource construction in real-hardware examples. examples/05_real_groot_policy.py and examples/03_sim_recording.py allocate a Robot() (which on real-hardware spins up servos, opens cameras, joins the mesh), then start a GR00T container or a recording session, outside any cleanup block. Per AGENTS.md → "Resource Cleanup on Partial Failure" (PR #86 review-learnings), every allocation that owns hardware/process state needs a guaranteed-cleanup path. These are the templates users will copy — the patterns shown here will leak GPU containers and dataset write-locks in the field.
  • Test coverage. Adding eight runnable example scripts with zero tests/test_examples_*.py coverage means none of these will catch regressions in the Robot() / recording / mesh APIs they advertise. Consider a smoke test that imports each example and statically validates the API calls resolve (or, for the sim-only ones, runs them under the existing [sim-mujoco] fixtures).

Verification suggestions

# 1. Confirm action count claim (should print 64)
python3 -c 'import json; d=json.load(open("strands_robots/simulation/mujoco/tool_spec.json")); print(len(d["properties"]["action"]["enum"]))'

# 2. Confirm registry size claim ("60+ robots")
python3 -c 'from strands_robots import list_robots; print(len(list_robots(mode="all")))'

# 3. Confirm mesh APIs exist on this branch (will fail on main if #101 hasn't landed)
python3 -c 'from strands_robots.mesh import init_mesh; from strands_robots.mesh.input import InputPublisher, InputReceiver; print("ok")'

# 4. Smoke-test the simplest example actually runs end-to-end
python3 examples/01_sim_quickstart.py

# 5. Static check of the example scripts (catches NameError / typos at import time)
for f in examples/*.py; do python3 -c "import ast; ast.parse(open('$f').read())" || echo BAD $f; done

Comment thread examples/05_real_groot_policy.py Outdated
robot = None

# Real robot with dual cameras
robot = Robot(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment on line 20 says "Initialize robot to None so finally block is safe if construction fails", but Robot(...) (line 24-41) and agent.tool.gr00t_inference(action="start", ...) (line 49-55) are outside the try: on line 59 — if Robot() raises (USB perm denied, camera busy, mesh init failure that propagates), or if the GR00T server fails to start, the finally: block never runs and the inference container is left orphaned.

Wrap construction and the gr00t_inference start in the try: block, and add a matching if robot: / agent.tool.gr00t_inference(action="stop") cleanup. This is the canonical example users will copy for production; getting cleanup right matters more here than anywhere else. See AGENTS.md → Review Learnings (PR #86) → "Resource Cleanup on Partial Failure".

Comment thread examples/06_list_robots.py Outdated
real_flag = _flag(r.get("has_real", False), "real")
print(f" {sim_flag} {real_flag} {r['name']:25s} {r.get('description', '')}")

print(f"\n=== Sim-only ({len(list_robots(mode='sim'))} robots) ===")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

list_robots(mode="sim") returns every robot with simulation assets, including the ones that also have hardware support. On the current registry that's 62 robots, but only 57 are sim-only (the other 5 also have has_real=True). The printed label "Sim-only (N robots)" is therefore off by the size of the sim∩real intersection.

Fix is one line: filter on not r['has_real'], e.g.

sim_only = [r for r in list_robots(mode='sim') if not r['has_real']]
print(f"\n=== Sim-only ({len(sim_only)} robots) ===")

or change the heading to "Sim-capable" if you want to keep the registry call as-is.

Comment thread examples/02_sim_agent.py Outdated
#!/usr/bin/env python3
"""The 5-Line Promise: natural language robot control.

Robot() returns an AgentTool with 35+ simulation actions. Hand it to a
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Action-count drift across this PR:

  • README.md (this PR) repeats 64 actions in 6 places — verified against simulation/mujoco/tool_spec.json.
  • This file says "35+ simulation actions".
  • examples/physics_agent.py:26 says "50+ actions".

Pick one (the actual number is 64) and use it everywhere, or drop the count entirely and link to the README's Simulation Features table. Hardcoding magic numbers in three places guarantees they'll diverge again as soon as someone adds an action.

Comment thread examples/03_sim_recording.py Outdated

# Run a mock policy (random actions) for 2 seconds.
# Video kwargs go inside the ``video`` dict, NOT as top-level args.
result = sim.run_policy(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If run_policy raises (mock policy is safe, but the moment a user copies this template and swaps in lerobot_local or groot, exceptions become realistic) the stop_recording() on line 44 and the sim.destroy() on line 47 never execute. That leaves a half-written LeRobotDataset on disk (parquet writer not flushed, episode metadata not finalised) and the MuJoCo executor + temp dir not released.

Wrap lines 24–45 in try: ... finally: sim.stop_recording(); sim.destroy(). Same pattern would help examples/act_policy_simulation.py and examples/physics_agent.py. Cf. AGENTS.md → "Resource Cleanup on Partial Failure".

Comment thread README.md

<details>
<summary><b>🐳 Jetson Container Setup (Required for GR00T Inference)</b></summary>
## Mesh Networking
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The entire Mesh Networking section documents APIs (Robot("so100", mesh=False), sim.mesh.peers, sim.mesh.tell, sim.mesh.emergency_stop, InputPublisher/InputReceiver, STRANDS_MESH_* env vars, the robot_mesh agent tool) that the PR description says depend on PR #101 ("merge those first"). If #87 merges to main ahead of #101, the README on main will advertise public APIs that don't exist in the released strands-robots package, and pip install strands-robots[mesh] followed by the snippets here will raise AttributeError: 'Simulation' object has no attribute 'mesh'.

Three options, in decreasing order of preference:

  1. Hold this PR until feat(mesh): Mesh, AWS IoT - session, presence, RPC, streams, wiring #101 is merged (the description already acknowledges the dependency).
  2. Move the mesh section out of this PR and into a follow-up that lands together with feat(mesh): Mesh, AWS IoT - session, presence, RPC, streams, wiring #101.
  3. Wrap the section with a > Available from v0.x.y onwards — see PR #101 callout so users understand the version requirement.

…/finally cleanup, fix sim-only label

- README: fix stale '58 actions' → 64 in Simulation section (lines 614, 643)
- examples/02_sim_agent.py: fix '35+' → 64 action count
- examples/03_sim_recording.py: wrap in try/finally for resource cleanup
- examples/05_real_groot_policy.py: move Robot() + gr00t_inference start
  inside try block so finally always fires; track groot_started flag
- examples/06_list_robots.py: fix 'Sim-only' label to show sim-capable
  count and differentiate from truly sim-only (filter not has_real)
- examples/physics_agent.py: fix '50+' → 64 action count
Copy link
Copy Markdown
Contributor

@yinsong1986 yinsong1986 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary

Pure docs-and-examples PR: complete README rewrite, 8 new examples/*.py files, plus a one-line comment touch in tests/mocks/torch_mock.py. The new structure (5-Line Promise → Quick Start → Mesh → Simulation Features → 64-action reference) is much friendlier than the prior version, and the env-var table + extras table are genuine improvements. The [mesh] story and STRANDS_* knobs are now documented in one place.

Most of my concerns are about doc/example accuracy rather than safety — but the kind of inaccuracy that bites first-time users and erodes trust in the rest of the README. Worth fixing before merge.

What's good

  • Extras table (sim / sim-mujoco / lerobot / groot-service / mesh / all) matches pyproject.toml exactly.
  • Mesh section correctly notes [mesh] is opt-in and the base install does not include Zenoh — this is the right framing even though the PR description says the opposite ("ships in the default install").
  • Examples 03 and 05 use try/finally around destroy()/cleanup(), matching AGENTS.md > Review Learnings (#86) → "Resource Cleanup on Partial Failure".
  • New env vars added (mesh frequencies, audit dir, libero log knobs) — matches the AGENTS.md rule "Document every env var in README.md in the same PR."
  • All 64 sim actions claim is verifiable: tool_spec.json enum does have exactly 64 entries.

Concerns

  • Action-list accuracy (inline below): the bumped "### 64 actions grouped" heading is now numerically correct, but the bullet list under it still carries the prior 58-action shape with two stale names (get_joint_state, list_checkpoints) that don't exist in tool_spec.json, and is missing 8 actions that do (patch_scene_mjcf, replace_scene_mjcf, list_policies_running, get_robot_state, get_contact_forces, list_benchmarks, register_benchmark_from_file, evaluate_benchmark). An LLM agent using the README as ground truth will hit "Unknown parameter" / "Unknown action" errors.
  • PR description vs README disagree on mesh defaults. PR description: "covering the Zenoh-based peer-to-peer layer that ships in the default install." README + pyproject.toml: zenoh is in the [mesh] extra only. README is right; please correct the PR body before merge to avoid confusing later readers of the merge commit.
  • render() output description drift. Line 116 says # returns dict with PNG bytes, but MujocoSimulation.render returns base64-encoded PNG inside the content block (see strands_robots/simulation/mujoco/rendering.py:382 — "Render a camera view as base64 PNG image."). Tiny but the difference matters for a user wiring it to a file write.
  • Test surface for a docs PR is appropriately zero. No new tests, no behavioural changes — that part is fine. The tests/mocks/torch_mock.py one-line comment is a no-op annotation.

Verification suggestions

# Confirm action-list accuracy mechanically:
python3 -c "
import json, re, pathlib
spec = json.load(open('strands_robots/simulation/mujoco/tool_spec.json'))
actions = set(spec['properties']['action']['enum'])
readme = pathlib.Path('README.md').read_text()
start = readme.index('### 64 actions grouped')
end = readme.index('### Common footguns')
listed = set(re.findall(r'\\\`([a-z_]+)\\\`', readme[start:end]))
print('readme has 64 sim actions enumerated?', actions == listed)
print('extra in readme :', sorted(listed - actions))
print('missing from readme:', sorted(actions - listed))
"

# Smoke-test the four most-claimed examples (no GPU required):
uv pip install -e ".[sim-mujoco]"
python examples/01_sim_quickstart.py
python examples/06_list_robots.py
# 02 and physics_agent.py need a strands-agents API key; skip in CI.

Comment thread README.md
```

### 58 actions grouped
### 64 actions grouped
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The heading was bumped from 58 to 64, but the bulleted list below it still enumerates the old set of 58 names. Two of them don't exist in strands_robots/simulation/mujoco/tool_spec.json:

  • list_checkpoints (line 648) — not an action; checkpoint listing is part of save_state/load_state semantics in the actual API.
  • get_joint_state (line 651) — not an action; joint state is exposed via get_robot_state / get_body_state.

And 8 real actions are missing from the README list: patch_scene_mjcf, replace_scene_mjcf, list_policies_running, get_robot_state, get_contact_forces, list_benchmarks, register_benchmark_from_file, evaluate_benchmark.

Mechanical check (run from repo root):

python3 -c "import json; e=set(json.load(open('strands_robots/simulation/mujoco/tool_spec.json'))['properties']['action']['enum']); print(len(e), sorted(e))"

Returns 64. Please reconcile the bullet list with that enum — otherwise an agent that consults the README as ground truth will emit invalid action names and get the "Unknown action … Valid: [...]" self-healing error message that the README itself promotes as a feature.

Comment thread README.md

robot = Robot("so100") # MuJoCo sim, auto-downloads assets
agent = Agent(tools=[robot]) # 64 simulation actions as AgentTool
agent("Pick up the red cube") # Agent orchestrates sim via natural language
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Robot("so100") runs create_world + add_robot("so100") only — no objects are added to the scene (see strands_robots/robot.py:282-298). So agent("Pick up the red cube") will not pick up anything; at best the agent will figure out it needs to call add_object first. Compare to examples/02_sim_agent.py which explicitly says "run a mock policy for 1 second in fast mode" — that's the honest version of the promise.

Suggest one of:

  1. Change the prompt to something the default scene actually supports, e.g. agent("Run a mock policy for 1 second to wiggle the joints").
  2. Pre-add a cube in the snippet (agent('Add a red cube at [0.3,0,0.05], then pick it up')), but flag that the mock policy will not actually grasp it.

As written, the marquee "5-Line Promise" example fails on its first run, which is exactly the spot where users decide whether to keep reading.

task="reach target",
fps=30,
root="/tmp/so100_dataset",
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

start_recording is called immediately after Robot("so100"), before any add_camera calls. The default SO-100 scene constructed by Robot() has no scene cameras attached (the factory only does create_world + add_robot), so the resulting LeRobot v3 dataset will have joint-state parquet but no observation.image.* features and no per-camera MP4 — exactly the "silent data loss" pattern called out in AGENTS.md > Review Learnings (#85) → "Round-trip tests for recording."

For the example to demonstrate what the docstring claims ("captures joint states + video, ... AV1 video"), add at least one camera before recording, e.g.:

sim = Robot("so100")
sim.add_camera(name="topdown", position=[0, 0, 1.2], target=[0, 0, 0])
sim.start_recording(...)

Otherwise reframe the docstring to say "joint states only" so users don't think they got an AV1 file.


# Same Agent interface as simulation
agent = Agent(tools=[robot])
agent("Connect to the robot, read the current joint positions, and report status")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No try/finally and no robot.cleanup() — the only cleanup is whatever __del__ does, which is a non-deterministic fallback. If the agent call raises (USB reconnect failure, LeRobot connect timeout, KeyboardInterrupt during the agent's tool dispatch), the SO-100's serial port and any opened cameras are left held until process exit.

Compare to examples/05_real_groot_policy.py (lines 21-74) which does this correctly with the robot = None / try / finally: robot.cleanup() pattern. Hardware examples should be consistent — AGENTS.md > Review Learnings (#86) → "Resource Cleanup on Partial Failure" applies just as much to user-facing examples as to library code, since users copy these into their own scripts.

robot = None
try:
    robot = Robot("so100", mode="real", cameras={...})
    agent = Agent(tools=[robot])
    agent("Connect to the robot, ...")
finally:
    if robot is not None:
        robot.cleanup()

Comment thread examples/physics_agent.py
print(result)

# Clean up
sim.destroy()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sim.destroy() is the last statement at module scope — no try/finally wraps the five agent(...) calls (lines 39, 51, 64, 87, 97). If any of those raises (rate-limit error, schema validation failure, agent loop bug), destroy() is skipped and the Simulation leaks its ThreadPoolExecutor, MuJoCo model/data, renderer's TLS cache, and the temp-dir backing the scene MJCF — exactly the cleanup contract called out in AGENTS.md > Review Learnings (#86) → "Resource Cleanup on Partial Failure" and (#85) → "destroy() and cleanup() empty the renderer TLS cache."

Wrap the agent calls in try / finally: sim.destroy(). Examples 03 and 05 already follow this pattern; 01, 02, and this one don't. Worth standardising — users copy these as templates.

@cagataycali cagataycali added the documentation Improvements or additions to documentation label May 23, 2026
@cagataycali cagataycali marked this pull request as draft May 23, 2026 06:31
cagataycali pushed a commit to cagataycali/robots that referenced this pull request May 23, 2026
…nd AST pin to ast.Import

Three concerns from R3 review feedback addressed in one surgical commit:

1. DRY duplication (recurring across multiple review rounds): the four
   call sites in base.py each duplicated a 2-line lazy-import block with
   identical comment. Extracted to a single module-level helper
   _lazy_policy_runner(); call sites collapse to one line.

   The helper is at module scope but its body executes only at call time
   (Python function definitions do not execute their body at import) so
   the static import graph remains acyclic. CodeQL alerts strands-labs#83-strands-labs#87 stay
   closed.

2. AST pin gap: test_base_has_no_module_level_policy_runner_import only
   walked ast.ImportFrom nodes. A future regression introducing
   'import strands_robots.simulation.policy_runner' (ast.Import) would
   close the same cycle but slip the pin. Extended the pin to walk
   ast.Import as well.

3. OnFrame breadcrumb: the module-level rationale comment now explicitly
   notes that OnFrame is omitted because evaluate_benchmark uses the
   structural Callable[[int, dict, dict], None] type directly.

Two new pin tests:

- test_lazy_policy_runner_helper_exists -- asserts the module-level
  helper is present and contains the target import. Fails on pre-R4
  (helper does not exist).
- test_simengine_methods_use_lazy_helper -- AST-walks the SimEngine
  class body and fails if any method body still contains a direct
  'from strands_robots.simulation.policy_runner import ...'. Fails on
  pre-R4 (lists the four methods that previously had it).

Verification:
- pip install networkx pytest pytest-cov numpy
- pytest tests/simulation/test_no_import_cycle.py -v --no-cov
  -> 4 passed (was 2 passed)
- Pre-R4 pin verification: revert strands_robots/simulation/base.py to
  R3 state, re-run pin tests: 2 of 4 fail with offending-method list
  exactly enumerating run_policy/replay_episode/eval_policy/evaluate_benchmark
  (the four call sites).
- Post-R4: all 4 pass; ruff format/check clean; companion subprocess
  smoke test (test_no_cyclic_imports.py) still passes 4/4.
- mypy strands_robots/simulation/base.py: no new errors (existing
  PIL/strands.tools stub-not-found errors predate this PR).
cagataycali added a commit to cagataycali/robots that referenced this pull request May 23, 2026
…YPE_CHECKING forward refs

Yin's R5 review (and CI's mypy step) flagged that the R4 helper

    def _lazy_policy_runner() -> tuple[type, type]:

collapsed PolicyRunner and VideoConfig to bare `type`, so the four
call sites lost type information for mypy and IDE tooling. Five new
mypy errors landed on base.py: 'no-any-return' on dict[str,Any]
return paths and 'attr-defined' on `VideoConfig.from_dict`.

Fix: import PolicyRunner / VideoConfig under TYPE_CHECKING (already
the existing pattern for `Policy`) and use them as forward
references in the return annotation:

    def _lazy_policy_runner() -> tuple[type[PolicyRunner], type[VideoConfig]]:

Because `from __future__ import annotations` is in effect, the
annotation is string-form at runtime - it does not import
policy_runner at module-import time, the static cycle stays broken,
and the four CodeQL alerts (strands-labs#83-strands-labs#87) remain closed.

Pin tests in tests/simulation/test_no_import_cycle.py still pass
(8/8 + 4 fresh-interpreter smoke). Full mypy clean (0 errors,
185 source files). 239 simulation tests pass.
cagataycali pushed a commit to cagataycali/robots that referenced this pull request May 23, 2026
…h real pin

The original test_simengine_methods_use_lazy_helper asserted that no
SimEngine method body contained a direct 'from policy_runner import'.
This is vacuous: on pre-fix code (module-level import at base.py:43),
methods use the module-level name directly -- they never had method-body
imports either. The test passed on BOTH pre-fix and post-fix code.

Pre-fix failure transcript (replacement test):
  FAILED test_simengine_methods_call_lazy_helper - AssertionError:
  SimEngine method(s) ['run_policy', 'replay_episode', 'eval_policy',
  'evaluate_benchmark'] do not call `_lazy_policy_runner()`. On pre-fix
  code these methods resolve PolicyRunner/VideoConfig from module scope,
  which closes the CodeQL cycle (alerts strands-labs#83-strands-labs#87). Each must obtain them
  via the lazy helper to keep the import deferred to call time.

The replacement test_simengine_methods_call_lazy_helper verifies that
each of the four methods CALLS _lazy_policy_runner(). On pre-fix code
(module-level import), no such call exists -- the test fails with a
named AssertionError listing all four methods. Post-fix: passes.

Verified via git-stash round-trip:
  git checkout ce2a233~1 -- strands_robots/simulation/base.py
  pytest tests/simulation/test_no_import_cycle.py::test_simengine_methods_call_lazy_helper
  -> FAILED (4 methods listed)
  git checkout HEAD -- strands_robots/simulation/base.py
  -> PASSED (0 offenders)
cagataycali pushed a commit to cagataycali/robots that referenced this pull request May 24, 2026
…alerts strands-labs#83-strands-labs#87

The R5 commit re-introduced `from strands_robots.simulation.policy_runner
import PolicyRunner, VideoConfig` inside an `if TYPE_CHECKING:` block to
recover precise typing on `_lazy_policy_runner()`'s return annotation
after R4 collapsed it to bare `type`. Review feedback flagged this as
likely to re-fire `py/unsafe-cyclic-import` because CodeQL walks
TYPE_CHECKING blocks. CodeQL on commit 5f7fec2 confirmed the prediction
empirically (CodeQL conclusion: FAILURE on this head).

Fix per the review-suggested approach:

  1. Drop the TYPE_CHECKING re-import of PolicyRunner / VideoConfig
     from base.py. The import lives only inside the helper body now.
  2. Stringify the helper return annotation:
     `tuple[type["PolicyRunner"], type["VideoConfig"]]`. Under
     `from __future__ import annotations` (already in effect) this
     resolves to identical runtime behaviour; mypy resolves the strings
     via the lazy import inside the function body, so type precision at
     the four call sites is preserved.
  3. Update the rationale comment block to record why TYPE_CHECKING
     imports of policy_runner are unsafe here even though they are
     usually invisible to runtime cycles.

Pin tightening (same commit, same file context):

  * `test_base_has_no_module_level_policy_runner_import` previously
    only walked direct module-level `ast.Import` / `ast.ImportFrom`
    nodes. The R5 regression sat inside an `ast.If` (the
    TYPE_CHECKING block) and silently slipped past. Tightened the test
    to also descend into top-level `if TYPE_CHECKING:` bodies and
    treat any `policy_runner` import there as a pin failure. The pin
    docstring explicitly records why TYPE_CHECKING-guarded imports are
    not exempt.

Verification:

  * Synthetic ast input shaped like the R5 regression is correctly
    rejected by the new pin.
  * Current base.py passes the new pin (no module-level OR
    TYPE_CHECKING-level policy_runner import).
  * `_lazy_policy_runner` retains its in-body lazy import.
  * Module-level import-graph walk shows no `base -> policy_runner`
    edge.
  * Both files compile via py_compile.
  * Non-ASCII sweep clean on touched regions.

Round budget: this is R6 against a target of R3, justified by an
externally-validated blocking signal (CodeQL FAILURE on 5f7fec2).
Single surgical commit; no drive-by edits.
cagataycali pushed a commit to cagataycali/robots that referenced this pull request May 24, 2026
…ds-labs#83-strands-labs#87"

This reverts 415522c. The push was made on incomplete local verification:
the AST-level pin tests, module-graph walk, and py_compile all passed,
which suggested string forward-references would close the CodeQL cycle
without breaking type checking. A subsequent local mypy run (the gate
the PR description Section 3 had already verified would fail this
exact shape) confirmed 6 `name-defined` errors at the four call sites
plus the helper return annotation:

    strands_robots/simulation/base.py:55: error: Name "PolicyRunner" is not defined
    strands_robots/simulation/base.py:55: error: Name "VideoConfig" is not defined
    strands_robots/simulation/base.py:372: error: Returning Any from function declared to return "dict[str, Any]"
    strands_robots/simulation/base.py:441: error: Returning Any from function declared to return "dict[str, Any]"
    strands_robots/simulation/base.py:489: error: Returning Any from function declared to return "dict[str, Any]"
    strands_robots/simulation/base.py:625: error: Returning Any from function declared to return "dict[str, Any]"

The PR description's R6 PAUSED status is therefore correct as written:
mypy's name-resolution requirement and CodeQL's py/unsafe-cyclic-import
rule are mutually exclusive in this file's import shape, and the
right next move is the separate small PR that adds a CodeQL config
suppression for the simulation triple (Option A in PR description
Section 7), not further code surgery on this branch.

Returning HEAD to 5f7fec2 keeps the existing R6-PAUSED status visible
to the maintainer rather than shipping a broken-CI commit on top of it.

Round budget honesty: this is R6 with a self-correction. The lesson
captured in the working memory is to gate on every CI check that the
target branch already runs, not just the one most directly tied to
the reviewer-cited concern.
cagataycali pushed a commit to cagataycali/robots that referenced this pull request May 26, 2026
…l not line

Two surgical fixes addressing review feedback on the R3 cycle. Out of
scope items deferred per AGENTS.md round budget (R3 is the cap on this
PR -- see PR description Pre-push checklist).

1. PYTHONPATH internal-empties filter (tests/simulation/test_no_cyclic_imports.py):

   The R2 fix only filtered the *fully-empty* inherited PYTHONPATH case
   ("" -> trailing pathsep -> cwd injection). Inherited values with
   internal empties (":foo", "foo:", "foo::bar") still leaked through
   because the filter appended the inherited string verbatim.

   Now: split the inherited value on os.pathsep and filter empty entries
   individually before extending parts. Verified with synthetic
   PYTHONPATH="foo::bar:baz::" -> no empty entries in the joined result.

   The R2 docstring already promised "leading, trailing, or interior
   empty pathsep entry" coverage, so this brings the implementation in
   line with the documented contract.

2. Cite by symbol, not by line number (.github/codeql/README.md):

   The R2 README cited base.py:50 / policy_runner.py:51-54 /
   benchmark.py:46-47 as the locations of the cycle's runtime and
   TYPE_CHECKING edges. The R0 breadcrumbs landed in this same PR
   shifted those line numbers (base.py:50 is now the actual import at
   :66; the others moved by similar small amounts). Hard-coded line
   numbers in long-lived rationale comments are a maintenance liability
   -- any future edit above the import silently invalidates the
   citation.

   Now: cite by the import's *symbol shape* (the top-level
   "from ... import PolicyRunner, VideoConfig" statement and the
   "if TYPE_CHECKING:" blocks). The same edits to the References
   section preserve the historical alert line numbers (strands-labs#85, strands-labs#86, strands-labs#87
   raised against benchmark.py:42 / policy_runner.py:49-50 at the SHA
   when the alerts were filed) but reframe them explicitly as
   point-in-time historical markers, not current-tree citations.

Out of scope (deferred):

- Workflow major-version inconsistency between codeql.yml (v3.29.4
  SHA-pinned) and codeql-advanced.yml (floating @v4): tracked in strands-labs#217.
  That issue covered SHA pinning; updating it to also resolve the
  major-version drift is the right home for the unified fix.
- Stale Note (strands-labs#191) framing in base.py: the suppression note now
  framing-overlaps with the existing comment block. Out of scope for
  this PR since the existing comment is correct as a runtime-cycle
  explainer; the new comment correctly frames CodeQL. A follow-up that
  consolidates them belongs in a docs cleanup PR.
- CI enforcement that the suppression stays narrow: noted in README
  (the manual periodic-audit recipe added in R2). A CI step is the
  right long-run shape but out of scope for this config-only PR.

Verification:
- pytest tests/simulation/test_no_cyclic_imports.py -v --no-cov: 4 passed
- ruff check + ruff format --check on the touched test file: clean
- grep -nP '[^\x00-\x7F]' on touched files: no non-ASCII
- Synthetic PYTHONPATH='foo::bar:baz::' check: no empty entries leak

Pin contract preserved: the R2 traceback regex (anchored on start-of-line
ExceptionName: framing) is unchanged. The R3 PYTHONPATH change tightens
the hygiene of the subprocess env without weakening the failure-detection
shape.

Closes review thread on PYTHONPATH internal-empty leak.
Closes review thread on line-number drift in README citations.

Refs strands-labs#215 strands-labs#209 strands-labs#217
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Improvements or additions to documentation

Projects

Status: In review

Development

Successfully merging this pull request may close these issues.

3 participants