fix(launcher): explicit [headroom] route_upstream toggle, default off#84
Merged
Merged
Conversation
Pre-fix _should_route_helix_upstream_via_headroom returned True for any remote (non-loopback) upstream as long as HELIX_HEADROOM_ROUTE_UPSTREAM_AUTO wasn't explicitly set falsy. So an operator with cfg.server.upstream = "https://api.openai.com/v1" cfg.headroom.enabled = false (default!) cfg.headroom.installed = no (no headroom-ai package) would have the launcher rewrite HELIX_SERVER_UPSTREAM to http://127.0.0.1:8787 and start helix pointing at a Headroom proxy that was never started — every chat call then failed with ECONNREFUSED, with no clear diagnostic line tying it back to the routing decision. Adds a separate [headroom] route_upstream bool (default false) gating the rewrite. enabled / route_upstream are distinct concerns: lifecycle vs chat-redirect. Operators can now run the proxy + dashboard without the chat redirect, or explicitly opt the redirect in without changing the proxy lifecycle. HELIX_HEADROOM_ROUTE_UPSTREAM_AUTO remains as a per-launch override. Precedence: 1. auto_override=False (env=0) -> off 2. auto_override=True (env=1) -> on (even with route_upstream=false) 3. auto_override=None (unset) -> defer to cfg.headroom.route_upstream The previous test that pinned the bug (test_remote_upstream_routes_helix_via_headroom) is replaced with four tests covering the new precedence rules + the loopback-stays-direct invariant. Tests: 125 passed / 7 skipped across launcher + config + headroom test modules. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2ee5699 to
c5d9eb8
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Adds a config-level toggle to disable the launcher's chat-upstream rewriting through the Headroom proxy. Surfaced from operator pain — running with Headroom not installed locally was occasionally producing 30s+ stalls / ECONNREFUSED on chat calls because the launcher was preemptively redirecting helix's upstream to a proxy that hadn't started.
The bug this fixes
_should_route_helix_upstream_via_headroom(launcher/app.py:287) was returningTruewhenever:cfg.server.upstreamparsed cleanly,cfg.server.upstreamwas not a loopback host (so any remote API),cfg.server.upstreamwasn't already pointing at the configured Headroom port.It did not check
cfg.headroom.enabled, nor any "is Headroom actually going to be available?" predicate. So a config like:with
headroom-ainot installed (pip show headroom-ai→ not found) would silently:HELIX_SERVER_UPSTREAM = http://127.0.0.1:8787andOPENAI_TARGET_API_URL = <real upstream>in env (launcher/app.py:327-333).is_headroom_installed()returned False atlauncher/app.py:363-365.load_configthen readHELIX_SERVER_UPSTREAMatconfig.py:616-617and setcfg.server.upstream = http://127.0.0.1:8787./v1/chat/completionscall dialled a dead local port. ECONNREFUSED with no clear log line connecting the failure to the routing decision.The existing test
test_remote_upstream_routes_helix_via_headroom(tests/test_launcher_app.py:339) pinned this exact buggy behavior — constructed a config with default-disabled headroom + remote upstream and asserted routing was ON.What changed
New
[headroom] route_upstreambool, defaultFalse(config.py:HeadroomConfig). Wired through the TOML parser atconfig.py:783-804. Documented inline inhelix.toml.Routing decision rewritten (
launcher/app.py:287) with explicit precedence:auto_override=False(fromHELIX_HEADROOM_ROUTE_UPSTREAM_AUTO=0) → off, always.auto_override=True(fromHELIX_HEADROOM_ROUTE_UPSTREAM_AUTO=1) → on, even if config says off.auto_override=None(env unset) → defer tocfg.headroom.route_upstream(default off).enabledandroute_upstreamare now separate concerns.enabledcontrols the proxy lifecycle (start / adopt the process);route_upstreamcontrols whether helix's chat upstream is rewritten to dial the proxy. An operator can now:enabled=true, route_upstream=false).enabled=false, route_upstream=true).Tests rewritten (
tests/test_launcher_app.py:338):test_remote_upstream_does_not_route_when_route_upstream_disabled— the new default-off contract.test_remote_upstream_routes_when_route_upstream_opted_in— explicit opt-in still works.test_env_var_false_forces_off_even_when_config_opted_in— per-launch kill switch.test_env_var_true_forces_on_even_when_config_disabled— per-launch opt-in symmetric.test_local_ollama_upstream_stays_direct— loopback never gets rewritten, regardless.Per-launch override (immediate workaround, also works post-merge)
This was already implemented (
_env_truthyatlauncher/app.py:275-280); the new code preserves it via theauto_overridearg.What this does NOT change
headroom-ailibrary import path (headroom_bridge.py/compress_text) is untouched. That surface has correct graceful-degradation already —is_headroom_available()probes the import + theHELIX_DISABLE_HEADROOMenv opt-out and falls back tocontent[:target_chars]truncation when the library isn't installed._maybe_build_headroom. Operators with an externally-managed Headroom can still have the launcher adopt it. The decoupling above means orphan adoption no longer implicitly enables upstream rewriting — operators who want that should setroute_upstream = trueexplicitly.Test plan
pytest tests/test_launcher_app.py tests/test_config.py— 54/54 pass.pytest tests/test_launcher_app.py tests/test_launcher_supervisor.py tests/test_headroom_supervisor.py tests/test_config.py tests/test_headroom_bridge.py— 125/125 pass, 7 skipped.helix-launcherin a fresh shell to confirmHELIX_SERVER_UPSTREAMis NOT set whenroute_upstreamdefaults to false.Related