feat(adb-vision): restore live piloting and audit backend helpers by Coldaine · Pull Request #57 · Coldaine/ALAS

Coldaine · 2026-03-05T21:50:35Z

Summary

restore the canonical adb_vision piloting path with reliable screenshot fallbacks and a real diagnose.py health gate
add structured audit logging for nested backend activity, including ADB subprocesses, backend HTTP calls, scrcpy, ffmpeg, and the PowerShell desktop fallback
document the canonical audited surface and add focused audit tests

Why

Raw screencap was an unreliable source of truth on the live MEmu setup, which caused false "VM not running" diagnoses and blind piloting. The logging story was also incomplete because helper processes and backend calls were not being recorded.

Validation

cd adb_vision && uv run pytest test_server.py test_tool_audit.py -q
cd adb_vision && uv run python diagnose.py --json
cd adb_vision && uv run python pilot.py screenshot

Live checks

diagnose.py reported adb_connected: true, focus_matches_game: true, and diagnosis: OK
live screenshot capture succeeded from the new branch after the audit changes

chatgpt-codex-connector · 2026-03-05T21:50:42Z

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

Coldaine · 2026-03-05T21:51:20Z

Superseded by #55, which is the cleaned-up canonical adb_vision PR for this stack.

kilo-code-bot · 2026-03-05T21:54:43Z

adb_vision/diagnose.py

+from emulator import memuc_cli
+from screenshot import take_screenshot
+from tool_audit import audit_cli_call, configure, record_command
+


🔥 The Roast: r"D:\Program Files\Microvirt\MEmu\adb.exe" hardcoded into a diagnostic tool that exists to catch environment problems. There's a beautiful irony here: the first thing diagnose.py does is assume it's running on exactly the machine it was written on. On any other machine this file fails before it can tell you anything is wrong.

server.py already has a _find_adb() function that searches environment variables and known paths. diagnose.py duplicates the ADB path hardcode instead of reusing — which means if the ADB path ever changes, the diagnostic tool becomes the first thing to lie to you.

🩹 The Fix: Import _find_adb from server (or extract it to a shared adb_utils.py), then:

Suggested change

ADB = os.environ.get("ADB_EXECUTABLE") or r"D:\Program Files\Microvirt\MEmu\adb.exe"

At minimum add the env-var escape hatch so this isn't carved in stone.

📏 Severity: warning

kilo-code-bot · 2026-03-05T21:54:44Z

adb_vision/screenshot.py

+    include_metadata: bool = False,
+) -> str | tuple[str, dict[str, str | int]]:
    backends = _resolve_backends(method)
    last_error: Exception | None = None


🔥 The Roast: -> str | tuple[str, dict[str, str | int]] — the classic "flag argument that changes the return type" anti-pattern, wearing a Python type hint as a disguise. You've created a function that either returns a str OR a tuple depending on a boolean you pass in at call time. Every caller now needs to do type narrowing gymnastics, and mypy is going to throw a party in the error log.

🩹 The Fix: Use @overload to give each call-site a typed contract:

@overload async def take_screenshot(..., include_metadata: Literal[False] = ...) -> str: ... @overload async def take_screenshot(..., include_metadata: Literal[True]) -> tuple[str, dict[str, str | int]]: ...

Or simpler: always return the tuple and let callers unpack what they need.

📏 Severity: suggestion

kilo-code-bot · 2026-03-05T21:54:44Z

adb_vision/desktop_capture.ps1

+
+# Restore previous window
+[WinAPI]::SetForegroundWindow($previousWindow) | Out-Null
+


🔥 The Roast: Start-Job fires a background PowerShell job inside a script that's invoked as a one-shot subprocess from Python. When this script exits, the background job dies with it — it's not a persistent daemon, it's a fire-and-forget that gets immediately forgotten by the OS. Your 5-minute auto-delete should be called the "optimistic cleanup feature" because it will never run unless the caller holds the PS session open.

🩹 The Fix: Either use a scheduled task (Register-ScheduledJob/Register-ScheduledTask) for deferred cleanup, or just let the Python caller handle deletion after it's done consuming the file. PS Start-Job only works reliably in interactive sessions, not in -Command one-shots.

📏 Severity: warning

kilo-code-bot · 2026-03-05T21:55:07Z

Code Review Roast 🔥

Verdict: 4 Issues Found | Recommendation: Address before merge

Overview

Severity	Count
🚨 critical	0
⚠️ warning	2
💡 suggestion	1
🤏 nitpick	1

Issue Details (click to expand)

File	Line	Roast
`adb_vision/diagnose.py`	29	Hardcoded Windows ADB path `D:\...\adb.exe` in the diagnostic tool — the tool that exists to catch environment problems fails on any non-dev machine before it can report anything useful. `server.py` already has `_find_adb()`. Use it.
`adb_vision/desktop_capture.ps1`	181	`Start-Job` auto-delete fires a background PS job from a one-shot subprocess — it dies when the script exits. The 5-minute cleanup is aspirational fiction.
`adb_vision/screenshot.py`	320	`-> str \| tuple[str, dict]` return type controlled by a bool flag is the "mystery box" anti-pattern. Use `@overload` or always return the tuple.
`docs/ROADMAP.md`	—	"⏳ Piloting mode (vision + manual recovery) not started" appears directly below "✅ Live piloting path proven on March 5, 2026." These two lines are having a disagreement that only one of them knows about.

🏆 Best part: The tool_audit.py middleware design is actually clean — the contextvars-based parent event tracking for nested ADB/HTTP/backend calls is the right architecture for tracing a multi-layer tool call, and wrapping it in FastMCP middleware so it's transparent to tool implementations is exactly how you'd want this done. I need to sit down.

💀 Worst part: diagnose.py hardcoding the ADB path is the original sin here. A diagnostic tool that assumes the environment instead of probing it is not a diagnostic tool — it's a prayer.

📊 Overall: Like a well-engineered Swiss watch stuffed into a box labeled "definitely not a watch." The audit/tracing architecture is solid, the backend fallback ladder is correct, but the operational tooling (diagnose.py, desktop_capture.ps1) has fresh-off-a-dev-machine energy that'll bite on first deployment to any other box.

Files Reviewed (10 files)

adb_vision/diagnose.py — 1 issue (hardcoded ADB path)
adb_vision/desktop_capture.ps1 — 1 issue (Start-Job lifetime)
adb_vision/screenshot.py — 1 issue (ambiguous return type), plus screenrecord adds ≥2s per attempt in the hot fallback path (worth a comment in the docstring)
adb_vision/tool_audit.py — no issues
adb_vision/pilot.py — no issues
adb_vision/server.py — no issues
adb_vision/test_server.py — no issues
adb_vision/test_tool_audit.py — no issues
docs/ROADMAP.md — 1 doc inconsistency (piloting status)
adb_vision/pyproject.toml — no issues

Fix these issues in Kilo Cloud

Copilot

Pull request overview

Restores the adb_vision live piloting workflow by strengthening screenshot capture fallbacks and adding structured audit logging across MCP tool calls, CLI diagnostics, and nested backend helper activity.

Changes:

Add structured JSONL audit logging (tool_audit.py) with FastMCP middleware + child event support; integrate it into the MCP server and CLIs.
Extend the screenshot “auto” ladder with screenrecord (ffmpeg first-frame extraction) and improve scrcpy discovery/handling; update server contract to include screenshot metadata.
Add pilot.py and a more comprehensive diagnose.py (including a desktop PowerShell capture fallback) plus documentation and targeted tests.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 6 comments.

Show a summary per file

File	Description
docs/dev/logging.md	Documents the audit log and nested helper coverage.
docs/ROADMAP.md	Updates roadmap items to reflect the new fallback ladder + audit logging.
docs/ARCHITECTURE.md	Updates architecture diagram/notes for logging + surface guardrails.
adb_vision/tool_audit.py	Implements structured audit logging + FastMCP middleware.
adb_vision/test_tool_audit.py	Adds unit tests validating audit record writing and parent/child linkage.
adb_vision/test_server.py	Updates tests for new fallback order, scrcpy discovery hook, and screenshot metadata.
adb_vision/server.py	Integrates audit middleware, logs ADB subprocesses, and returns screenshot metadata-derived summaries.
adb_vision/screenshot.py	Adds `screenrecord` backend, adds audit events for HTTP/backend activity, improves scrcpy capture path.
adb_vision/pyproject.toml	Adds dependencies for new capture/diagnostic functionality (`av`, `uiautomator2`, `pymemuc`).
adb_vision/pilot.py	Adds a minimal audited CLI for manual piloting of core tools.
adb_vision/diagnose.py	Adds an audited health gate CLI with backend probing + desktop screenshot fallback.
adb_vision/desktop_capture.ps1	Implements desktop screenshot capture of the MEmu window for fallback diagnostics.
adb_vision/README.md	Updates documented screenshot backend list and `auto` behavior.
TDD_IMPLEMENTATION_PLAN.md	Updates TDD plan status notes to reflect audit + fallback changes.

Copilot · 2026-03-05T21:55:55Z

adb_vision/server.py

    parser.add_argument(
        "--screenshot-method",
        default="auto",
-        choices=["auto", "droidcast", "scrcpy", "u2", "screencap"],
+        choices=["auto", "droidcast", "scrcpy", "u2", "screenrecord", "screencap"],
        help="Default screenshot method",


--screenshot-method is exposed as a CLI option, but its value is never used (the adb_screenshot tool always defaults to method="auto"). This makes the flag misleading; either wire it into a module-level default used by adb_screenshot when no method is provided, or remove the argument to avoid a no-op configuration option.

Copilot · 2026-03-05T21:55:56Z

adb_vision/diagnose.py

+ADB = r"D:\Program Files\Microvirt\MEmu\adb.exe"
+SERIAL = "127.0.0.1:21513"
+GAME_PACKAGE = "com.YoStarEN.AzurLane"
+SCREEN_DIR = Path(__file__).parent / "pilot_screens"
+BLACK_STDDEV_THRESHOLD = 5.0


diagnose.py hardcodes the ADB executable path and serial (ADB = r"D:\...\adb.exe", SERIAL = "127.0.0.1:21513"). That will break on machines with different install locations/ports and makes the diagnostic gate hard to reuse in CI or on other dev setups. Prefer using the same discovery/env-var approach as server.py (e.g., $ADB_SERIAL, $ADB_EXECUTABLE or argparse flags) and defaulting to the current hardcoded values only if nothing else is configured.

Copilot · 2026-03-05T21:55:56Z

adb_vision/diagnose.py

+    stdout_text = stdout.decode(errors="replace").strip()
+    stderr_text = stderr.decode(errors="replace").strip()
+    record_command(
+        command_name="adb.exec",
+        argv=argv,
+        duration_ms=(time.perf_counter() - started) * 1000,
+        status="success",
+        stdout=stdout_text,


Similar to server._adb_run, this _adb_run decodes and logs stdout as text before calling record_command. When diagnosing the screencap backend (binary PNG), this can create extremely large audit records and waste CPU/memory. Passing the raw bytes into record_command (or truncating/printability-checking before decoding) would keep audits lightweight while still capturing the byte count.

Suggested change

stdout_text = stdout.decode(errors="replace").strip()

stderr_text = stderr.decode(errors="replace").strip()

record_command(

command_name="adb.exec",

argv=argv,

duration_ms=(time.perf_counter() - started) * 1000,

status="success",

stdout=stdout_text,

# Avoid decoding large or binary stdout payloads (e.g., screencap PNG) into audit logs.

# For small, printable outputs we decode; otherwise we log a compact summary.

if stdout:

# Threshold chosen to keep audits lightweight while still logging typical text output.

if len(stdout) <= 4096:

decoded = stdout.decode(errors="replace").strip()

if decoded.isprintable():

stdout_preview = decoded

else:

stdout_preview = f"<{len(stdout)} bytes binary data>"

else:

stdout_preview = f"<{len(stdout)} bytes (truncated for logging)>"

else:

stdout_preview = ""

stderr_text = stderr.decode(errors="replace").strip()

record_command(

command_name="adb.exec",

argv=argv,

duration_ms=(time.perf_counter() - started) * 1000,

status="success",

stdout=stdout_preview,

Copilot · 2026-03-05T21:55:56Z

adb_vision/screenshot.py

+async def _capture_screenrecord(*, adb_run: AdbRunFn, serial: str, adb_exe: str) -> str:
+    ffmpeg_exe = shutil.which("ffmpeg")
+    if not ffmpeg_exe:
+        raise RuntimeError("ffmpeg not found in PATH; required for screenrecord fallback")
+


The new screenrecord backend is a significant behavior addition (adb shell screenrecord + pull + ffmpeg decode), but it currently only has coverage via the fallback-order test that mocks _capture_screenrecord. Consider adding a focused unit test for _capture_screenrecord that mocks adb_run, _run_logged_process, and Path.read_bytes() to verify the happy path and the key failure modes (ffmpeg missing, non-PNG output, cleanup of the remote file).

Copilot · 2026-03-05T21:55:56Z

adb_vision/pilot.py

+SCREEN_DIR = Path(__file__).parent / "pilot_screens"
+
+
+def _save_image(raw: bytes, stem: str) -> str:
+    SCREEN_DIR.mkdir(exist_ok=True)
+    path = SCREEN_DIR / stem
+    path.write_bytes(raw)


pilot.py writes screenshots into adb_vision/pilot_screens/ under the repo. That directory isn’t currently ignored (unlike mcp_screenshots/), so it’s easy to accidentally commit runtime artifacts. Consider either reusing the existing mcp_screenshots/ directory or adding pilot_screens/ to adb_vision/.gitignore (or the root .gitignore).

Copilot · 2026-03-05T21:55:56Z

adb_vision/server.py

+    stdout_text = stdout.decode(errors="replace").strip()
+    stderr_text = stderr.decode(errors="replace").strip()
+    record_command(
+        command_name="adb.exec",
+        argv=argv,


_adb_run decodes and logs stdout/stderr as text. For commands that return binary (notably adb exec-out screencap -p used by the screenshot backend), this can produce a huge mostly-unreadable string and bloat/slow the audit log path. Consider passing the raw bytes into record_command (it already summarizes binary safely) and only decoding for known-text commands or after a size/printability check.

feat(adb-vision): restore live piloting and audit backend helpers

098a7b4

Copilot AI review requested due to automatic review settings March 5, 2026 21:50

Copilot started reviewing on behalf of Coldaine March 5, 2026 21:51 View session

Coldaine closed this Mar 5, 2026

kilo-code-bot bot reviewed Mar 5, 2026

View reviewed changes

Copilot AI reviewed Mar 5, 2026

View reviewed changes


	ADB = os.environ.get("ADB_EXECUTABLE") or r"D:\Program Files\Microvirt\MEmu\adb.exe"


		# Restore previous window
		[WinAPI]::SetForegroundWindow($previousWindow) \| Out-Null

Conversation

Coldaine commented Mar 5, 2026

Summary

Why

Validation

Live checks

Uh oh!

chatgpt-codex-connector bot commented Mar 5, 2026

Uh oh!

Coldaine commented Mar 5, 2026

Uh oh!

kilo-code-bot bot Mar 5, 2026

Choose a reason for hiding this comment

Uh oh!

kilo-code-bot bot Mar 5, 2026

Choose a reason for hiding this comment

Uh oh!

kilo-code-bot bot Mar 5, 2026

Choose a reason for hiding this comment

Uh oh!

kilo-code-bot bot commented Mar 5, 2026

Code Review Roast 🔥

Overview

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants