feat(daimonion): impingement-native voice Phase 1#3727
Conversation
…ngements Introduces AudioPerceptionBackend — first step in dissolving the ConversationPipeline sidecar. Operator speech now enters the impingement field (strength = vad_confidence × speaker_posterior) and competes on force with exploration and narration. Non-operator speech emits weaker audio.scene impingements. Writes to /dev/shm/hapax-dmn/impingements.jsonl. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds AudioPerceptionBackend that STT-transcribes audio, resolves speaker/presence, emits calibrated impingements to an in-memory queue and JSONL bus, integrates with CpalRunner for preemption and transcript-forwarding, swaps resident STT to Parakeet ONNX, adjusts VAD/conversation flow, and adds TTS remote support plus config/docs updates. ChangesAudio Perception Backend
Runner integration and playback control
Resident STT (ONNX Parakeet)
Conversation pipeline & buffer
Daemon wiring, loops, configs, deps, and ops
Sequence Diagram(s)sequenceDiagram
participant Mic as Microphone
participant AP as AudioPerceptionBackend
participant STT as ResidentSTT (onnx_asr)
participant SID as SpeakerID
participant PRES as PresenceProvider
participant BUS as Impingement JSONL Bus
participant CPAL as CpalRunner
participant PIPE as ConversationPipeline
Mic->>AP: PCM audio frame (utterance)
AP->>STT: transcribe(audio) -> transcript
alt speaker-id available
AP->>SID: identify(transcript/audio) -> speaker, posterior
else fallback
AP->>PRES: presence_posterior() -> posterior
end
AP->>BUS: write impingement JSONL (source,type,content,strength)
AP->>CPAL: enqueue impingement (in-memory)
CPAL->>AP: drain/process impingements
CPAL->>PIPE: process_utterance_from_transcript(transcript) [on operator impingement]
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
Auto-fix classified CI failure categories as |
… preemption - AudioPerceptionBackend wired into CPAL runner — utterances route through impingement bus instead of direct _process_utterance() - process_impingement handles audio.operator_speech before adapter — operator speech is NOT subject to surfacing thresholds - ResourceArbiter claims audio_output at conversation priority (100), preempts exploration (15) with _kill_inflight_playback - Exploration surfacing checks arbiter before claiming audio_output - process_utterance_from_transcript added to ConversationPipeline for impingement-routed transcripts Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (1)
agents/hapax_daimonion/audio_perception.py (1)
94-97: ⚡ Quick winCreate the bus parent directory before append to prevent silent persistence loss.
If
/dev/shm/hapax-dmnis missing, writes fail and impingements are only kept in-memory. Creating parent dirs before open makes persistence more robust.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@agents/hapax_daimonion/audio_perception.py` around lines 94 - 97, Ensure the bus parent directory is created before attempting to append to IMPINGEMENT_BUS to avoid silent persistence loss: before calling IMPINGEMENT_BUS.open("a") in the impingement write block (the code that currently uses IMPINGEMENT_BUS.open("a") and catches OSError), call the directory-creation step (e.g. IMPINGEMENT_BUS.parent.mkdir(parents=True, exist_ok=True)) and handle/log any exceptions appropriately so that writes no longer silently fail when /dev/shm/hapax-dmn is missing.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@agents/hapax_daimonion/audio_perception.py`:
- Around line 35-36: The tier method currently returns the raw string "FAST";
change it to return the PerceptionTier enum value and update the signature:
import PerceptionTier, change def tier(self) -> str: to def tier(self) ->
PerceptionTier:, and return PerceptionTier.FAST in the tier() method so callers
receive the enum rather than a plain string (ensure the PerceptionTier import is
added at the top of the module).
- Around line 68-69: The emitted impingement record uses enum names
("PATTERN_MATCH"/"STATISTICAL_DEVIATION") but the consumer expects canonical
lowercase values; update the dict construction that sets the "type" field (the
expression using is_operator) to emit "pattern_match" when is_operator is true
and "statistical_deviation" otherwise (or use a small mapping from the enum/flag
to these canonical strings) so ImpingementConsumer/Impingement will accept the
records; keep the rest (e.g., strength rounding) unchanged.
- Around line 84-90: The log statement in audio_perception.py currently includes
raw transcript text at INFO level; change the log.info call that references
log.info(..., imp["source"], imp["strength"], speaker, transcript) so it only
logs high-level metadata (imp["source"], imp["strength"], speaker) at INFO and
either redact the transcript or move the full transcript text to a log.debug
call that references the same transcript variable for diagnostics; ensure you do
not print transcript at INFO and keep the log message context clear while using
the existing variables (imp, speaker, transcript).
In `@tests/hapax_daimonion/test_audio_perception.py`:
- Around line 3-4: Remove the unused imports causing Ruff F401 by deleting
"time" and "patch" from the import line in the test module; keep the necessary
AsyncMock and MagicMock imports and adjust any code that was intended to use
time/patch to use alternatives or mocks already present (look for the import
statement that currently reads "import time\nfrom unittest.mock import
AsyncMock, MagicMock, patch" or a single-line variant and remove the unused
names "time" and "patch").
---
Nitpick comments:
In `@agents/hapax_daimonion/audio_perception.py`:
- Around line 94-97: Ensure the bus parent directory is created before
attempting to append to IMPINGEMENT_BUS to avoid silent persistence loss: before
calling IMPINGEMENT_BUS.open("a") in the impingement write block (the code that
currently uses IMPINGEMENT_BUS.open("a") and catches OSError), call the
directory-creation step (e.g. IMPINGEMENT_BUS.parent.mkdir(parents=True,
exist_ok=True)) and handle/log any exceptions appropriately so that writes no
longer silently fail when /dev/shm/hapax-dmn is missing.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: e3221f3c-3e6c-47f3-9c80-95bf66f5739a
📒 Files selected for processing (2)
agents/hapax_daimonion/audio_perception.pytests/hapax_daimonion/test_audio_perception.py
| def tier(self) -> str: | ||
| return "FAST" |
There was a problem hiding this comment.
Use PerceptionTier enum for tier instead of a raw string.
Line 36 returns "FAST" as str, but existing perception backends expose a PerceptionTier value. This can break tier-based orchestration/comparisons.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@agents/hapax_daimonion/audio_perception.py` around lines 35 - 36, The tier
method currently returns the raw string "FAST"; change it to return the
PerceptionTier enum value and update the signature: import PerceptionTier,
change def tier(self) -> str: to def tier(self) -> PerceptionTier:, and return
PerceptionTier.FAST in the tier() method so callers receive the enum rather than
a plain string (ensure the PerceptionTier import is added at the top of the
module).
| log.info( | ||
| "Audio impingement: source=%s strength=%.2f speaker=%s transcript=%.40s", | ||
| imp["source"], | ||
| imp["strength"], | ||
| speaker, | ||
| transcript, | ||
| ) |
There was a problem hiding this comment.
Avoid logging transcript content at INFO level.
Line 85 logs raw transcript text, which can include sensitive user speech/PII. Keep high-level metadata at INFO and move/redact text for DEBUG-only diagnostics.
Suggested fix
- log.info(
- "Audio impingement: source=%s strength=%.2f speaker=%s transcript=%.40s",
- imp["source"],
- imp["strength"],
- speaker,
- transcript,
- )
+ log.info(
+ "Audio impingement: source=%s strength=%.2f speaker=%s",
+ imp["source"],
+ imp["strength"],
+ speaker,
+ )📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| log.info( | |
| "Audio impingement: source=%s strength=%.2f speaker=%s transcript=%.40s", | |
| imp["source"], | |
| imp["strength"], | |
| speaker, | |
| transcript, | |
| ) | |
| log.info( | |
| "Audio impingement: source=%s strength=%.2f speaker=%s", | |
| imp["source"], | |
| imp["strength"], | |
| speaker, | |
| ) |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@agents/hapax_daimonion/audio_perception.py` around lines 84 - 90, The log
statement in audio_perception.py currently includes raw transcript text at INFO
level; change the log.info call that references log.info(..., imp["source"],
imp["strength"], speaker, transcript) so it only logs high-level metadata
(imp["source"], imp["strength"], speaker) at INFO and either redact the
transcript or move the full transcript text to a log.debug call that references
the same transcript variable for diagnostics; ensure you do not print transcript
at INFO and keep the log message context clear while using the existing
variables (imp, speaker, transcript).
| import time | ||
| from unittest.mock import AsyncMock, MagicMock, patch |
There was a problem hiding this comment.
Remove unused imports to fix Ruff F401.
Line 3 (time) and Line 4 (patch) are unused and currently fail lint.
🧰 Tools
🪛 GitHub Check: lint
[failure] 4-4: ruff (F401)
tests/hapax_daimonion/test_audio_perception.py:4:49: F401 unittest.mock.patch imported but unused
help: Remove unused import: unittest.mock.patch
[failure] 3-3: ruff (F401)
tests/hapax_daimonion/test_audio_perception.py:3:8: F401 time imported but unused
help: Remove unused import: time
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/hapax_daimonion/test_audio_perception.py` around lines 3 - 4, Remove
the unused imports causing Ruff F401 by deleting "time" and "patch" from the
import line in the test module; keep the necessary AsyncMock and MagicMock
imports and adjust any code that was intended to use time/patch to use
alternatives or mocks already present (look for the import statement that
currently reads "import time\nfrom unittest.mock import AsyncMock, MagicMock,
patch" or a single-line variant and remove the unused names "time" and "patch").
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5141e11e71
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| "id": str(uuid.uuid4()), | ||
| "timestamp": time.time(), | ||
| "source": "audio.operator_speech" if is_operator else "audio.scene", | ||
| "type": "PATTERN_MATCH" if is_operator else "STATISTICAL_DEVIATION", |
There was a problem hiding this comment.
Emit valid impingement type enum strings
Use the canonical ImpingementType wire values here (e.g. pattern_match, statistical_deviation) instead of uppercase names. The DMN consumers parse JSONL via Impingement.model_validate_json, so these uppercase values fail enum validation and the line is treated as malformed/skipped; in practice that drops every speech impingement before it reaches process_impingement/affordance routing.
Useful? React with 👍 / 👎.
| speaker_confidence = 0.0 | ||
| if self._speaker_id is not None: | ||
| try: | ||
| speaker, speaker_confidence = self._speaker_id.identify(audio_bytes) |
There was a problem hiding this comment.
Normalize speaker-ID API before operator classification
This call assumes identify accepts raw PCM bytes and returns a (speaker, confidence) tuple, but the repo’s shipped SpeakerIdentifier (agents/hapax_daimonion/speaker_id.py) expects an embedding and returns SpeakerResult. If this backend is wired to the daemon’s _speaker_identifier from agents/hapax_daimonion/init_audio.py, the call raises and the exception path forces speaker='unknown', so operator speech is never recognized and always emitted on the low-strength ambient path.
Useful? React with 👍 / 👎.
| OPERATOR_SPEAKER_THRESHOLD = 0.60 | ||
|
|
||
|
|
||
| class AudioPerceptionBackend: |
There was a problem hiding this comment.
Register the backend so audio impingements can run
This new backend is never instantiated or registered in the daemon backend wiring, so none of its process_utterance/bus-emission logic runs in production. I checked the runtime registration list in agents/hapax_daimonion/init_backends.py and there is no AudioPerceptionBackend entry, which makes this feature path effectively dead code despite the new implementation and tests.
Useful? React with 👍 / 👎.
…nt impingement Phase 1 of dissolving the ConversationPipeline sidecar. Operator speech now enters the impingement field via AudioPerceptionBackend instead of bypassing the cascade through a hardwired conversation pipeline. Architecture: - AudioPerceptionBackend emits audio.operator_speech impingements (strength ~0.76, PATTERN_MATCH type) from VAD + STT + presence - CPAL runner handles operator speech before the impingement adapter, bypassing surfacing thresholds - ResourceArbiter wired into audio output: conversation(100) preempts exploration(15) with in-flight playback kill - Exploration surfacing runs as background task, no longer blocks the impingement consumer loop - ConversationBuffer speaking/cooldown gates removed — XVF3800 hardware AEC handles echo rejection at device level Infrastructure fixes: - PyTorch 2.12.0+cu130 for Blackwell sm_120 (Chatterbox on CUDA) - GPU UUID updated in fish config post-migration - Ollama routed to appendix (192.168.68.50) - Parakeet TDT STT replacing Whisper - Session timeout disabled (pipeline never stops) - Utterance drain replaced with keep-latest queue - Audio queue increased to 1000 frames (30s) for GIL contention - Blocking TTS in proactive_delivery_loop wrapped in executor - ImpingementType enum case fix (PATTERN_MATCH → pattern_match) - Presence provider score() property call fix - ReSpeaker XVF3800 duplex profile + hardware AEC reference feed Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Auto-fix classified CI failure categories as |
There was a problem hiding this comment.
Actionable comments posted: 11
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
agents/hapax_daimonion/conversation_pipeline.py (1)
366-379:⚠️ Potential issue | 🟠 Major | ⚡ Quick winSend the refreshed system context here, not the static base prompt.
_update_system_context()updatesself.messages[0]["content"], but this call now sendsself.system_prompt. That drops the live conversation thread, environment/policy bands, salience, and grounding directives from spontaneous-speech generation.Suggested fix
- messages = [ + system_content = ( + self.messages[0]["content"] if self.messages else self.system_prompt + ) + messages = [ { "role": "system", - "content": f"{spontaneous_envelope}\n\n{self.system_prompt}", + "content": f"{spontaneous_envelope}\n\n{system_content}", }, {"role": "user", "content": prompt}, ]🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@agents/hapax_daimonion/conversation_pipeline.py` around lines 366 - 379, _update_system_context() mutates the live system content in self.messages[0]["content"], but the spontaneous-speech phase is currently sending the static self.system_prompt instead, losing updated context; change the messages construction to use the refreshed system entry (e.g., combine spontaneous_envelope with self.messages[0]["content"] or replace the "content" field with f"{spontaneous_envelope}\n\n{self.messages[0]['content']}") so the spontaneous_envelope and subsequent messages use the live conversation/policy/salience grounding produced by _update_system_context(), keeping render_envelope, spontaneous_envelope, and the messages list semantics intact.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@agents/hapax_daimonion/config.py`:
- Line 107: Replace the hardcoded llm_model default with the centralized alias
from shared.config: import the shared alias constant (for example
DEFAULT_LLM_ALIAS) from shared.config and set llm_model: str = DEFAULT_LLM_ALIAS
in agents/hapax_daimonion/config.py (keep the llm_model symbol and its type hint
unchanged); update any module imports accordingly so the default is sourced from
shared/config.py rather than the local literal.
In `@agents/hapax_daimonion/conversation_pipeline.py`:
- Around line 463-484: The transcript path currently mutates
_conversation_thread directly and calls litellm.acompletion without updating the
canonical conversation state; update this block to record the user and assistant
turns in the same structures the rest of the class uses: add the user message to
self.messages (matching the message dict shape used elsewhere), increment
turn_count, call litellm.acompletion with that message list, then when a
response is received append the assistant turn both to self.messages and to
_conversation_thread (or ensure _conversation_thread entries match the canonical
format), and only then call _speak_sentence; ensure you reference and update the
symbols _conversation_thread, self.messages, and turn_count and keep the
litellm.acompletion/ _speak_sentence flow intact.
- Around line 460-463: Replace the local import of LITELLM_BASE so this branch
uses the canonical shared config: change the import that currently does "from
agents.hapax_daimonion.config import LITELLM_BASE as _base" to import
LITELLM_BASE from shared.config (i.e., use "from shared.config import
LITELLM_BASE as _base") so LITELLM_BASE and MODELS are sourced from the same
place as the rest of the method; ensure the existing use of MODELS["local-fast"]
and assignment to grounded_model remains unchanged in conversation_pipeline.py.
In `@agents/hapax_daimonion/cpal/runner.py`:
- Around line 985-992: When the pipeline handles a transcript via
self._pipeline.process_utterance_from_transcript(...) you must record a response
speech event the same way the other response path does so dialog-active
suppression and refractory/echo cooldown logic see the reply; after the await
returns (inside the try block) set self._last_speech_end to the current time and
append a response event into self._recent_speech_events (matching the same event
shape used elsewhere: timestamp and a marker that this was a response speech) so
downstream suppression logic treats this transcript-routed path as having
spoken; do this inside the try before returning and keep the existing
exception/logging behavior for the except branch.
- Around line 1392-1404: The code currently stamps `_last_speech_end` and
appends a SpeechEvent(EXPLORATION) in the `finally` block regardless of whether
playback actually occurred; change this so the timestamp/update and
`_recent_speech_events.append(SpeechEvent(...))` happen only after successful
playback (i.e., when `_buffer.set_speaking(False)` is executed following actual
audio output and the utterance was not dropped/silenced). Move the
`_last_speech_end = time.monotonic()` and the `SpeechEvent` append out of the
`finally` and into the success path (after whatever method confirms playback
finished or that `effect.narrative` produced audible output), keep the exception
logging (`log.debug(..., exc_info=True)`) in the `except` block, and ensure
`_buffer.set_speaking(False)` still runs in `finally` to clear state even on
failure; reference symbols: `_buffer.set_speaking`, `_last_speech_end`,
`_recent_speech_events`, `SpeechEvent`, `SpeechEventKind.EXPLORATION`,
`effect.narrative`, and `"exploration_surfacing"`.
- Around line 507-515: Replace the hardcoded vad_confidence=0.95 in the call to
self._audio_perception.process_utterance with the real computed vad_prob (the
variable computed earlier), i.e. pass vad_confidence=vad_prob; ensure vad_prob
is in scope at this branch (or use a sensible fallback if necessary) so
_audio_perception.process_utterance receives the actual VAD probability rather
than a constant.
- Around line 980-984: Replace the INFO-level log that prints live operator
speech so transcripts are not exposed: in the log.info call that references
impingement and transcript (the call using getattr(impingement, "strength", 0)
and transcript), either change it to log.debug or redact the transcript (e.g.,
log a placeholder like "<REDACTED>" or only the transcript length/hash) while
keeping the strength value; ensure any transcript content is never logged at
INFO level and move any sensitive debug output to DEBUG-only logging.
In `@agents/hapax_daimonion/model_router.py`:
- Around line 57-59: The current mapping in the router collapses ModelTier.FAST,
ModelTier.STRONG, and ModelTier.CAPABLE all to the same "local-fast" route which
disables capability escalation and governance/tooling paths; update the
tier->route mapping in model_router.py so FAST continues to map to "local-fast"
while STRONG and CAPABLE map to their intended higher-capability routes (e.g.,
"local-strong"/"remote-capable" or whatever route names are used in your routing
table), ensuring any governance/tooling-specific route keys used elsewhere
(search for ModelTier, the tier->route dict, and any code that branches on those
route names) are preserved and exercised by STRONG/CAPABLE mappings.
In `@agents/hapax_daimonion/run_inner.py`:
- Around line 460-465: The loop over other_imps incorrectly matches exploration
sources by the exact literal "exploration", so variants like
"exploration.apperception" still get processed; update the suppression inside
the for imp in other_imps loop (the conditional that checks operator_imps and
inspects getattr(imp, "source", "")) to treat any source in the exploration
family as a match (e.g., use startswith("exploration") or split on '.' and
compare the first segment) while still preserving the existing check for
"endogenous.narrative_drive" so those are skipped when operator_imps is present.
In `@docs/superpowers/plans/2026-05-23-impingement-native-voice.md`:
- Line 13: The document jumps from the top-level "#" heading to "### Task 1:
AudioPerceptionBackend — Impingement Emitter", which violates the MD001 heading
increment rule; change the "### Task 1: AudioPerceptionBackend — Impingement
Emitter" heading to "## Task 1: AudioPerceptionBackend — Impingement Emitter"
(or insert an intermediate "##" section before it) so headings increment
properly and markdown lint passes.
In `@shared/config.py`:
- Line 42: The config default for OLLAMA_URL is currently a private LAN IP which
breaks setups without that host; change the default to a portable localhost URL
by updating the OLLAMA_URL constant (the variable name OLLAMA_URL in
shared/config.py) to use "http://localhost:11434" while still allowing overrides
via os.environ.get("OLLAMA_HOST", ...); ensure only the default value is changed
and environment variable behavior remains the same.
---
Outside diff comments:
In `@agents/hapax_daimonion/conversation_pipeline.py`:
- Around line 366-379: _update_system_context() mutates the live system content
in self.messages[0]["content"], but the spontaneous-speech phase is currently
sending the static self.system_prompt instead, losing updated context; change
the messages construction to use the refreshed system entry (e.g., combine
spontaneous_envelope with self.messages[0]["content"] or replace the "content"
field with f"{spontaneous_envelope}\n\n{self.messages[0]['content']}") so the
spontaneous_envelope and subsequent messages use the live
conversation/policy/salience grounding produced by _update_system_context(),
keeping render_envelope, spontaneous_envelope, and the messages list semantics
intact.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: f640afa4-269c-421e-a5ed-80d0c040a489
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock
📒 Files selected for processing (18)
agents/hapax_daimonion/audio_input.pyagents/hapax_daimonion/audio_perception.pyagents/hapax_daimonion/config.pyagents/hapax_daimonion/conversation_buffer.pyagents/hapax_daimonion/conversation_pipeline.pyagents/hapax_daimonion/cpal/runner.pyagents/hapax_daimonion/model_router.pyagents/hapax_daimonion/resident_stt.pyagents/hapax_daimonion/run_inner.pyagents/hapax_daimonion/run_loops_aux.pyagents/hapax_daimonion/session_events.pydocs/superpowers/plans/2026-05-23-impingement-native-voice.mddocs/superpowers/specs/2026-05-23-impingement-native-voice-design.mdpyproject.tomlshared/config.pysystemd/units/hapax-daimonion.service.d/gpu-pin.conftests/hapax_daimonion/test_audio_perception.pytests/hapax_daimonion/test_cpal_runner.py
| # Backends | ||
| backend: str = "local" # "local" or "gemini" | ||
| llm_model: str = "gemini-flash" | ||
| llm_model: str = "local-fast" |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift
Centralize the LLM default alias through shared config
This hardcoded alias in daemon-local config can drift from centralized route aliases. Please source this default from shared/config.py (or a small shared alias module) instead of keeping a local literal.
As per coding guidelines, **/*.py: Use shared/config.py for model aliases and LiteLLM/Qdrant configuration.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@agents/hapax_daimonion/config.py` at line 107, Replace the hardcoded
llm_model default with the centralized alias from shared.config: import the
shared alias constant (for example DEFAULT_LLM_ALIAS) from shared.config and set
llm_model: str = DEFAULT_LLM_ALIAS in agents/hapax_daimonion/config.py (keep the
llm_model symbol and its type hint unchanged); update any module imports
accordingly so the default is sourced from shared/config.py rather than the
local literal.
| grounded_model = MODELS["local-fast"] | ||
| self._conversation_thread.append({"role": "user", "content": transcript}) | ||
|
|
||
| try: | ||
| response = await asyncio.wait_for( | ||
| litellm.acompletion( | ||
| model=f"openai/{grounded_model}", | ||
| messages=self.messages + [{"role": "user", "content": transcript}], | ||
| max_tokens=200, | ||
| temperature=0.7, | ||
| api_key=__import__("os").environ.get("LITELLM_API_KEY", ""), | ||
| api_base=_base, | ||
| timeout=10, | ||
| ), | ||
| timeout=15, | ||
| ) | ||
|
|
||
| text = response.choices[0].message.content.strip() | ||
| if text: | ||
| log.info("Conversation response: %s", text[:80]) | ||
| self._conversation_thread.append({"role": "assistant", "content": text}) | ||
| await self._speak_sentence(text) |
There was a problem hiding this comment.
This transcript path bypasses the pipeline’s normal conversation state updates.
It appends raw dicts to _conversation_thread and never records the user/assistant turns in self.messages or increments turn_count. The rest of the class rebuilds prompt state from self.messages and treats _conversation_thread as structured thread entries, so impingement-routed turns become effectively stateless and can break later thread rendering.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@agents/hapax_daimonion/conversation_pipeline.py` around lines 463 - 484, The
transcript path currently mutates _conversation_thread directly and calls
litellm.acompletion without updating the canonical conversation state; update
this block to record the user and assistant turns in the same structures the
rest of the class uses: add the user message to self.messages (matching the
message dict shape used elsewhere), increment turn_count, call
litellm.acompletion with that message list, then when a response is received
append the assistant turn both to self.messages and to _conversation_thread (or
ensure _conversation_thread entries match the canonical format), and only then
call _speak_sentence; ensure you reference and update the symbols
_conversation_thread, self.messages, and turn_count and keep the
litellm.acompletion/ _speak_sentence flow intact.
| if utterance is not None: | ||
| if self._audio_perception is not None: | ||
| asyncio.create_task( | ||
| self._audio_perception.process_utterance( | ||
| audio_bytes=utterance, | ||
| vad_confidence=0.95, | ||
| duration_s=len(utterance) / (16000 * 2), | ||
| energy_db=-12.0, | ||
| ) |
There was a problem hiding this comment.
Pass the real VAD confidence into audio_perception.
vad_prob is computed a few lines above, but this path hardcodes vad_confidence=0.95. That makes operator-speech impingement strength almost constant and weakens the salience ordering this PR is trying to introduce.
Suggested fix
self._audio_perception.process_utterance(
audio_bytes=utterance,
- vad_confidence=0.95,
+ vad_confidence=vad_prob,
duration_s=len(utterance) / (16000 * 2),
energy_db=-12.0,
)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if utterance is not None: | |
| if self._audio_perception is not None: | |
| asyncio.create_task( | |
| self._audio_perception.process_utterance( | |
| audio_bytes=utterance, | |
| vad_confidence=0.95, | |
| duration_s=len(utterance) / (16000 * 2), | |
| energy_db=-12.0, | |
| ) | |
| if utterance is not None: | |
| if self._audio_perception is not None: | |
| asyncio.create_task( | |
| self._audio_perception.process_utterance( | |
| audio_bytes=utterance, | |
| vad_confidence=vad_prob, | |
| duration_s=len(utterance) / (16000 * 2), | |
| energy_db=-12.0, | |
| ) |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@agents/hapax_daimonion/cpal/runner.py` around lines 507 - 515, Replace the
hardcoded vad_confidence=0.95 in the call to
self._audio_perception.process_utterance with the real computed vad_prob (the
variable computed earlier), i.e. pass vad_confidence=vad_prob; ensure vad_prob
is in scope at this branch (or use a sensible fallback if necessary) so
_audio_perception.process_utterance receives the actual VAD probability rather
than a constant.
| log.info( | ||
| "CPAL: operator speech impingement (strength=%.2f): %.40s", | ||
| getattr(impingement, "strength", 0), | ||
| transcript, | ||
| ) |
There was a problem hiding this comment.
Avoid logging operator transcript text at INFO.
This path now writes live user speech into normal service logs. Even a 40-character preview can contain secrets or PII, so keep transcript content redacted or DEBUG-only.
Suggested fix
- log.info(
- "CPAL: operator speech impingement (strength=%.2f): %.40s",
- getattr(impingement, "strength", 0),
- transcript,
- )
+ log.info(
+ "CPAL: operator speech impingement (strength=%.2f)",
+ getattr(impingement, "strength", 0),
+ )
+ log.debug("CPAL: operator speech transcript=%r", transcript)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@agents/hapax_daimonion/cpal/runner.py` around lines 980 - 984, Replace the
INFO-level log that prints live operator speech so transcripts are not exposed:
in the log.info call that references impingement and transcript (the call using
getattr(impingement, "strength", 0) and transcript), either change it to
log.debug or redact the transcript (e.g., log a placeholder like "<REDACTED>" or
only the transcript length/hash) while keeping the strength value; ensure any
transcript content is never logged at INFO level and move any sensitive debug
output to DEBUG-only logging.
| except Exception: | ||
| log.debug("Exploration speech failed", exc_info=True) | ||
| finally: | ||
| self._buffer.set_speaking(False) | ||
| self._last_speech_end = time.monotonic() | ||
| self._recent_speech_events.append( | ||
| SpeechEvent( | ||
| kind=SpeechEventKind.EXPLORATION, | ||
| timestamp=self._last_speech_end, | ||
| source_path="exploration_surfacing", | ||
| text_preview=effect.narrative[:40], | ||
| ) | ||
| ) |
There was a problem hiding this comment.
Only stamp exploration speech after actual playback.
The finally block always updates _last_speech_end and appends an EXPLORATION event, even when generation failed, routing dropped the utterance, or the LLM chose silence. That creates false refractory suppression for later impingements.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@agents/hapax_daimonion/cpal/runner.py` around lines 1392 - 1404, The code
currently stamps `_last_speech_end` and appends a SpeechEvent(EXPLORATION) in
the `finally` block regardless of whether playback actually occurred; change
this so the timestamp/update and
`_recent_speech_events.append(SpeechEvent(...))` happen only after successful
playback (i.e., when `_buffer.set_speaking(False)` is executed following actual
audio output and the utterance was not dropped/silenced). Move the
`_last_speech_end = time.monotonic()` and the `SpeechEvent` append out of the
`finally` and into the success path (after whatever method confirms playback
finished or that `effect.narrative` produced audible output), keep the exception
logging (`log.debug(..., exc_info=True)`) in the `except` block, and ensure
`_buffer.set_speaking(False)` still runs in `finally` to clear state even on
failure; reference symbols: `_buffer.set_speaking`, `_last_speech_end`,
`_recent_speech_events`, `SpeechEvent`, `SpeechEventKind.EXPLORATION`,
`effect.narrative`, and `"exploration_surfacing"`.
| ModelTier.FAST: "local-fast", | ||
| ModelTier.STRONG: "local-fast", | ||
| ModelTier.CAPABLE: "local-fast", |
There was a problem hiding this comment.
Tier-route collapse removes capability escalation
Mapping FAST/STRONG/CAPABLE to the same local route defeats the router’s escalation logic, including governance/tooling paths that are supposed to use more capable models.
Suggested fix
TIER_ROUTES: dict[ModelTier, str] = {
ModelTier.CANNED: "", # no LLM call
ModelTier.LOCAL: "local-fast",
- ModelTier.FAST: "local-fast",
- ModelTier.STRONG: "local-fast",
- ModelTier.CAPABLE: "local-fast",
+ ModelTier.FAST: "gemini-flash",
+ ModelTier.STRONG: "claude-sonnet",
+ ModelTier.CAPABLE: "claude-opus",
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@agents/hapax_daimonion/model_router.py` around lines 57 - 59, The current
mapping in the router collapses ModelTier.FAST, ModelTier.STRONG, and
ModelTier.CAPABLE all to the same "local-fast" route which disables capability
escalation and governance/tooling paths; update the tier->route mapping in
model_router.py so FAST continues to map to "local-fast" while STRONG and
CAPABLE map to their intended higher-capability routes (e.g.,
"local-strong"/"remote-capable" or whatever route names are used in your routing
table), ensuring any governance/tooling-specific route keys used elsewhere
(search for ModelTier, the tier->route dict, and any code that branches on those
route names) are preserved and exercised by STRONG/CAPABLE mappings.
| for imp in other_imps: | ||
| # Skip if operator speech was just processed (it preempted) | ||
| if operator_imps: | ||
| source = getattr(imp, "source", "") | ||
| if source in ("exploration", "endogenous.narrative_drive"): | ||
| continue |
There was a problem hiding this comment.
Match exploration sources by family, not exact literals.
This suppression only catches "exploration", but the supplied CPAL tests already use "exploration.apperception". That source will still be processed after operator speech, so the preemption path can double-surface exploration right after a live utterance.
Suggested fix
- if source in ("exploration", "endogenous.narrative_drive"):
+ if source.startswith("exploration") or source == "endogenous.narrative_drive":
continue🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@agents/hapax_daimonion/run_inner.py` around lines 460 - 465, The loop over
other_imps incorrectly matches exploration sources by the exact literal
"exploration", so variants like "exploration.apperception" still get processed;
update the suppression inside the for imp in other_imps loop (the conditional
that checks operator_imps and inspects getattr(imp, "source", "")) to treat any
source in the exploration family as a match (e.g., use startswith("exploration")
or split on '.' and compare the first segment) while still preserving the
existing check for "endogenous.narrative_drive" so those are skipped when
operator_imps is present.
|
|
||
| --- | ||
|
|
||
| ### Task 1: AudioPerceptionBackend — Impingement Emitter |
There was a problem hiding this comment.
Fix heading level jump to satisfy markdown lint.
Line 13 starts at ### immediately after the document-level # heading, which violates MD001 heading increment rules.
Suggested diff
-### Task 1: AudioPerceptionBackend — Impingement Emitter
+## Implementation Tasks
+
+### Task 1: AudioPerceptionBackend — Impingement Emitter🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 13-13: Heading levels should only increment by one level at a time
Expected: h2; Actual: h3
(MD001, heading-increment)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@docs/superpowers/plans/2026-05-23-impingement-native-voice.md` at line 13,
The document jumps from the top-level "#" heading to "### Task 1:
AudioPerceptionBackend — Impingement Emitter", which violates the MD001 heading
increment rule; change the "### Task 1: AudioPerceptionBackend — Impingement
Emitter" heading to "## Task 1: AudioPerceptionBackend — Impingement Emitter"
(or insert an intermediate "##" section before it) so headings increment
properly and markdown lint passes.
| ) | ||
| QDRANT_URL: str = os.environ.get("QDRANT_URL", "http://localhost:6333") | ||
| OLLAMA_URL: str = os.environ.get("OLLAMA_HOST", "http://localhost:11434") | ||
| OLLAMA_URL: str = os.environ.get("OLLAMA_HOST", "http://192.168.68.50:11434") |
There was a problem hiding this comment.
Avoid a host-specific Ollama default
Defaulting to a private LAN IP makes embeddings fail in environments that don’t have that host. Keep the default portable (localhost) and rely on env override for host-specific setups.
Suggested fix
-OLLAMA_URL: str = os.environ.get("OLLAMA_HOST", "http://192.168.68.50:11434")
+OLLAMA_URL: str = os.environ.get("OLLAMA_HOST", "http://localhost:11434")📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| OLLAMA_URL: str = os.environ.get("OLLAMA_HOST", "http://192.168.68.50:11434") | |
| OLLAMA_URL: str = os.environ.get("OLLAMA_HOST", "http://localhost:11434") |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@shared/config.py` at line 42, The config default for OLLAMA_URL is currently
a private LAN IP which breaks setups without that host; change the default to a
portable localhost URL by updating the OLLAMA_URL constant (the variable name
OLLAMA_URL in shared/config.py) to use "http://localhost:11434" while still
allowing overrides via os.environ.get("OLLAMA_HOST", ...); ensure only the
default value is changed and environment variable behavior remains the same.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: cc6cff97ed
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| from shared.config import MODELS | ||
|
|
||
| grounded_model = MODELS["local-fast"] | ||
| self._conversation_thread.append({"role": "user", "content": transcript}) |
There was a problem hiding this comment.
Append ThreadEntry objects in transcript path
process_utterance_from_transcript pushes plain dicts into _conversation_thread, but _update_system_context later passes that list to _render_thread, which dereferences ThreadEntry attributes like e.is_repair/e.user_text. After the first operator-impingement turn, the next context refresh raises AttributeError before the LLM call, so subsequent operator speech in this path gets dropped.
Useful? React with 👍 / 👎.
| ) | ||
| QDRANT_URL: str = os.environ.get("QDRANT_URL", "http://localhost:6333") | ||
| OLLAMA_URL: str = os.environ.get("OLLAMA_HOST", "http://localhost:11434") | ||
| OLLAMA_URL: str = os.environ.get("OLLAMA_HOST", "http://192.168.68.50:11434") |
There was a problem hiding this comment.
Keep Ollama fallback host local by default
This changes the default OLLAMA_URL to a machine-specific LAN IP when OLLAMA_HOST is unset. In environments that rely on defaults, embedding calls will attempt that external host and fail, which breaks local embedding-dependent flows (the same module’s Ollama client builder reads OLLAMA_URL). This is a portability regression compared with the repo’s other localhost defaults.
Useful? React with 👍 / 👎.
| log.info("Conversation response: %s", text[:80]) | ||
| self._conversation_thread.append({"role": "assistant", "content": text}) | ||
| await self._speak_sentence(text) |
There was a problem hiding this comment.
Mark transcript-path TTS as speaking
The new operator-impingement path speaks via _speak_sentence without ever setting buffer.set_speaking(True), unlike the normal utterance path which enables the speaking gate before response playback. Because audio ingestion is always-on, this allows Hapax’s own TTS audio to be re-captured as fresh speech during these responses, creating false operator impingements and self-trigger loops under active playback.
Useful? React with 👍 / 👎.
| if text: | ||
| _level = log.debug if speculative else log.info | ||
| _level( | ||
| 'STT: "%s" (%.1fs audio, lang=%s)', | ||
| "STT: \"%s\" (%.1fs audio)", | ||
| text, | ||
| len(audio) / sample_rate, | ||
| info.language, | ||
| ) | ||
|
|
||
| if not speculative: | ||
| self._extract_prosody(audio, sample_rate, all_words) | ||
|
|
||
| return text |
There was a problem hiding this comment.
Restore prosody publication after STT transcription
This refactor removed the only call site that writes prosody features after successful transcription. _extract_prosody remains defined but is now dead code, so /dev/shm prosody state is never refreshed from new utterances; components that inject speech prosody into context receive stale/empty data, reducing behavior quality tied to operator tone.
Useful? React with 👍 / 👎.
| if self._audio_output is not None and hasattr(self._audio_output, "kill"): | ||
| self._audio_output.kill() | ||
| log.info("CPAL: killed in-flight playback for preemption") |
There was a problem hiding this comment.
Implement real playback cancellation for preemption
The new preemption path calls _kill_inflight_playback, but it only invokes self._audio_output.kill() when that method exists. The shipped PwAudioOutput has no kill method, so this branch never cancels active playback; it only clears the speaking flag. In practice, operator speech cannot reliably interrupt an in-flight exploration/autonomous clip, which breaks the intended preemption behavior.
Useful? React with 👍 / 👎.
| if source in ("exploration", "endogenous.narrative_drive"): | ||
| continue |
There was a problem hiding this comment.
Skip autonomous narration after operator preemption
When operator speech is present in a poll batch, this filter only skips exploration and endogenous.narrative_drive impingements; autonomous_narrative continues to be processed in the same cycle. Because operator handling does not set _processing_utterance or acquire the speech lock, the autonomous path can still synthesize/play immediately after the operator claim, causing operator preemption to be undermined by overlapping follow-on narration.
Useful? React with 👍 / 👎.
|
Auto-fix classified CI failure categories as |
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@agents/hapax_daimonion/tts_serve_remote.py`:
- Line 16: Remove the unused imports causing Ruff F401 by deleting the top-level
imports of struct and sys from tts_serve_remote.py; search for any references to
the symbols "struct" and "sys" in the file (e.g., in functions like any audio
packing/CLI handling) and only restore/import them if actually used, then re-run
the linter to confirm the F401 errors are resolved.
- Around line 82-90: Move the environment setup so HAPAX_CHATTERBOX_DEVICE is
set before importing agents.hapax_daimonion.tts: ensure
os.environ.setdefault("HAPAX_CHATTERBOX_DEVICE", device) runs prior to the
import and instantiation of TTSManager (the current TTSManager import and tts =
TTSManager() must occur after setting the env var) so the module-level
resolution in agents.hapax_daimonion.tts sees the correct device passed to
serve(device=...).
In `@agents/hapax_daimonion/tts.py`:
- Around line 140-170: The _synthesize_remote method can return truncated PCM
when the remote closes early; change the receiving logic so after the read loop
it verifies len(pcm) == pcm_len and if not raises a RuntimeError (e.g., "Remote
TTS truncated payload") instead of returning partial data; keep existing socket
closing in the finally block, use the existing symbols pcm_len, pcm, sock and
the recv loop in _synthesize_remote, and ensure the error includes contextual
info (remote host/port and expected vs received lengths).
- Around line 57-58: TTSManager's constructor currently converts
HAPAX_TTS_REMOTE_PORT with int(...) which will raise on non-numeric values and
prevent fallback; change the initialization of self._remote_port to parse the
environment variable defensively: read env via
os.environ.get("HAPAX_TTS_REMOTE_PORT"), attempt int(...) inside a try/except
(catch ValueError and TypeError), and on failure set the port to the default
9851 (and optionally log a warning). Update references to _remote_port
accordingly so invalid env values no longer crash initialization.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 841a2f0c-60b8-435d-8e3b-314727fea381
📒 Files selected for processing (3)
agents/hapax_daimonion/tts.pyagents/hapax_daimonion/tts_serve_remote.pysystemd/units/hapax-tts-remote.service
| import asyncio | ||
| import json | ||
| import logging | ||
| import struct |
There was a problem hiding this comment.
Remove unused imports blocking lint CI.
struct and sys are unused and currently fail Ruff (F401), which is already breaking the pipeline.
Proposed fix
import asyncio
import json
import logging
-import struct
@@
if __name__ == "__main__":
import os
- import sysAlso applies to: 107-107
🧰 Tools
🪛 GitHub Actions: CI / 2_lint.txt
[error] 16-16: ruff check failed (F401): struct imported but unused
🪛 GitHub Actions: CI / lint
[error] 16-16: Ruff (check): F401 struct imported but unused
🪛 GitHub Check: lint
[failure] 16-16: ruff (F401)
agents/hapax_daimonion/tts_serve_remote.py:16:8: F401 struct imported but unused
help: Remove unused import: struct
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@agents/hapax_daimonion/tts_serve_remote.py` at line 16, Remove the unused
imports causing Ruff F401 by deleting the top-level imports of struct and sys
from tts_serve_remote.py; search for any references to the symbols "struct" and
"sys" in the file (e.g., in functions like any audio packing/CLI handling) and
only restore/import them if actually used, then re-run the linter to confirm the
F401 errors are resolved.
| from agents.hapax_daimonion.tts import TTSManager | ||
|
|
||
| tts = TTSManager() | ||
| log.info("Loading Chatterbox on %s...", device) | ||
|
|
||
| import os | ||
|
|
||
| os.environ.setdefault("HAPAX_CHATTERBOX_DEVICE", device) | ||
|
|
There was a problem hiding this comment.
Set device env before importing TTSManager.
device is configured after importing agents.hapax_daimonion.tts, but that module resolves _CHATTERBOX_DEVICE at import time. This can silently ignore the serve(device=...) argument.
Proposed fix
async def serve(port: int = DEFAULT_PORT, device: str = "cuda:0") -> None:
- from agents.hapax_daimonion.tts import TTSManager
-
- tts = TTSManager()
- log.info("Loading Chatterbox on %s...", device)
-
import os
os.environ.setdefault("HAPAX_CHATTERBOX_DEVICE", device)
+ from agents.hapax_daimonion.tts import TTSManager
+
+ tts = TTSManager()
+ log.info("Loading Chatterbox on %s...", device)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| from agents.hapax_daimonion.tts import TTSManager | |
| tts = TTSManager() | |
| log.info("Loading Chatterbox on %s...", device) | |
| import os | |
| os.environ.setdefault("HAPAX_CHATTERBOX_DEVICE", device) | |
| async def serve(port: int = DEFAULT_PORT, device: str = "cuda:0") -> None: | |
| import os | |
| os.environ.setdefault("HAPAX_CHATTERBOX_DEVICE", device) | |
| from agents.hapax_daimonion.tts import TTSManager | |
| tts = TTSManager() | |
| log.info("Loading Chatterbox on %s...", device) |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@agents/hapax_daimonion/tts_serve_remote.py` around lines 82 - 90, Move the
environment setup so HAPAX_CHATTERBOX_DEVICE is set before importing
agents.hapax_daimonion.tts: ensure
os.environ.setdefault("HAPAX_CHATTERBOX_DEVICE", device) runs prior to the
import and instantiation of TTSManager (the current TTSManager import and tts =
TTSManager() must occur after setting the env var) so the module-level
resolution in agents.hapax_daimonion.tts sees the correct device passed to
serve(device=...).
| self._remote_host = os.environ.get("HAPAX_TTS_REMOTE_HOST") | ||
| self._remote_port = int(os.environ.get("HAPAX_TTS_REMOTE_PORT", "9851")) |
There was a problem hiding this comment.
Guard remote port parsing to prevent init-time crashes.
A non-numeric HAPAX_TTS_REMOTE_PORT will raise during TTSManager construction and bypass all fallback behavior. Default safely when parsing fails.
Proposed fix
self._backend = "chatterbox"
self._remote_host = os.environ.get("HAPAX_TTS_REMOTE_HOST")
- self._remote_port = int(os.environ.get("HAPAX_TTS_REMOTE_PORT", "9851"))
+ raw_remote_port = os.environ.get("HAPAX_TTS_REMOTE_PORT", "9851")
+ try:
+ self._remote_port = int(raw_remote_port)
+ except ValueError:
+ log.warning(
+ "Invalid HAPAX_TTS_REMOTE_PORT=%r; defaulting to 9851",
+ raw_remote_port,
+ )
+ self._remote_port = 9851🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@agents/hapax_daimonion/tts.py` around lines 57 - 58, TTSManager's constructor
currently converts HAPAX_TTS_REMOTE_PORT with int(...) which will raise on
non-numeric values and prevent fallback; change the initialization of
self._remote_port to parse the environment variable defensively: read env via
os.environ.get("HAPAX_TTS_REMOTE_PORT"), attempt int(...) inside a try/except
(catch ValueError and TypeError), and on failure set the port to the default
9851 (and optionally log a warning). Update references to _remote_port
accordingly so invalid env values no longer crash initialization.
| def _synthesize_remote(self, text: str, use_case: str = "conversation") -> bytes: | ||
| import json | ||
| import socket | ||
|
|
||
| request = json.dumps({"text": text, "use_case": use_case}) + "\n" | ||
| sock = socket.create_connection((self._remote_host, self._remote_port), timeout=60) | ||
| try: | ||
| sock.sendall(request.encode()) | ||
| buf = b"" | ||
| while b"\n" not in buf: | ||
| chunk = sock.recv(4096) | ||
| if not chunk: | ||
| raise RuntimeError("Remote TTS connection closed before header") | ||
| buf += chunk | ||
| header_line, remainder = buf.split(b"\n", 1) | ||
| header = json.loads(header_line) | ||
| if header.get("status") != "ok": | ||
| raise RuntimeError(f"Remote TTS error: {header.get('error', 'unknown')}") | ||
| pcm_len = header["pcm_len"] | ||
| pcm = remainder | ||
| while len(pcm) < pcm_len: | ||
| chunk = sock.recv(min(65536, pcm_len - len(pcm))) | ||
| if not chunk: | ||
| break | ||
| pcm += chunk | ||
| log.info( | ||
| "Remote TTS: %d bytes from %s:%d", len(pcm), self._remote_host, self._remote_port | ||
| ) | ||
| return pcm | ||
| finally: | ||
| sock.close() |
There was a problem hiding this comment.
Fail on truncated PCM payloads instead of returning partial audio.
If the remote side closes early, the loop exits and returns undersized PCM. That silently corrupts output and violates the declared pcm_len framing.
Proposed fix
pcm_len = header["pcm_len"]
pcm = remainder
while len(pcm) < pcm_len:
chunk = sock.recv(min(65536, pcm_len - len(pcm)))
if not chunk:
- break
+ raise RuntimeError(
+ f"Remote TTS truncated payload: expected {pcm_len} bytes, got {len(pcm)}"
+ )
pcm += chunk
+ if len(pcm) > pcm_len:
+ pcm = pcm[:pcm_len]
log.info(
"Remote TTS: %d bytes from %s:%d", len(pcm), self._remote_host, self._remote_port
)
return pcm🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@agents/hapax_daimonion/tts.py` around lines 140 - 170, The _synthesize_remote
method can return truncated PCM when the remote closes early; change the
receiving logic so after the read loop it verifies len(pcm) == pcm_len and if
not raises a RuntimeError (e.g., "Remote TTS truncated payload") instead of
returning partial data; keep existing socket closing in the finally block, use
the existing symbols pcm_len, pcm, sock and the recv loop in _synthesize_remote,
and ensure the error includes contextual info (remote host/port and expected vs
received lengths).
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 40a551fc98
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| messages=messages, | ||
| max_tokens=80, | ||
| temperature=0.7, | ||
| api_key=__import__("os").environ.get("LITELLM_API_KEY", ""), |
There was a problem hiding this comment.
Keep LiteLLM API key fallback non-empty
These new litellm.acompletion calls use os.environ.get("LITELLM_API_KEY", ""), which sends an empty API key when the env var is unset. In this codebase, the main conversation path intentionally falls back to "not-set" to satisfy LiteLLM gateway auth expectations; using "" here can cause 401/auth failures specifically on spontaneous speech and transcript-driven operator responses in environments that rely on the fallback behavior.
Useful? React with 👍 / 👎.
| if not chunk: | ||
| break |
There was a problem hiding this comment.
Raise on incomplete remote TTS PCM reads
The remote TTS client stops reading when recv() returns empty but then returns whatever bytes were received, even if shorter than the advertised pcm_len. If the socket closes mid-transfer (network hiccup, server restart), this path silently returns truncated PCM instead of failing over, which can produce corrupted or clipped playback while appearing successful.
Useful? React with 👍 / 👎.
| asyncio.create_task( | ||
| self._audio_perception.process_utterance( | ||
| audio_bytes=utterance, | ||
| vad_confidence=0.95, | ||
| duration_s=len(utterance) / (16000 * 2), |
There was a problem hiding this comment.
Serialize audio-perception utterance processing
This now launches process_utterance as an unconstrained background task for every detected utterance, with no in-flight guard or queueing. Under sustained speech, STT calls can pile up and execute concurrently, which can overwhelm the single STT backend and emit delayed/out-of-order operator impingements compared with arrival order; the prior path only processed one utterance at a time and queued the next.
Useful? React with 👍 / 👎.
| response = await asyncio.wait_for( | ||
| litellm.acompletion( | ||
| model=f"openai/{grounded_model}", | ||
| messages=self.messages + [{"role": "user", "content": transcript}], |
There was a problem hiding this comment.
Persist transcript-path turns in message history
This transcript-only path sends the user utterance in the request, but it never appends either the user turn or assistant reply to self.messages. As a result, subsequent transcript-driven turns lose immediate conversational context (and tool-call continuity) even within the same session, unlike the main process_utterance flow that persists both roles to message history.
Useful? React with 👍 / 👎.
| asyncio.create_task( | ||
| self._audio_perception.process_utterance( | ||
| audio_bytes=utterance, | ||
| vad_confidence=0.95, |
There was a problem hiding this comment.
Use measured VAD confidence for speech impingements
The CPAL tick now forwards a hardcoded vad_confidence=0.95 for every utterance, so downstream strength scoring treats all detected clips as near-certain speech regardless of actual VAD probability. Because operator impingement strength is vad_confidence * speaker_confidence, this inflates low-quality/ambiguous audio into high-priority operator events and can trigger unnecessary preemption of ongoing exploration output.
Useful? React with 👍 / 👎.
TTS: - Kokoro 82M on GPU as primary (40ms synthesis, 312MB VRAM) - Chatterbox remote server on appendix as fallback - HAPAX_TTS_BACKEND env var selects primary backend - Voice identity lives in S-4 hardware effects, not TTS model Conversation: - Exploration surfacing and autonomous narration disabled - Compositor director speech disabled - _speak_sentence fires as background task (non-blocking) - _update_system_context skipped for speed - Direct TabbyAPI call (bypasses LiteLLM proxy overhead) - Message alternation fix for Command-R - LLM call timing instrumentation Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Auto-fix classified CI failure categories as |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
agents/hapax_daimonion/conversation_pipeline.py (1)
401-416:⚠️ Potential issue | 🟠 Major | ⚡ Quick winInitialize
_llm_startbefore logging elapsed time.Line 415 reads
_llm_start, but this branch never sets it. Any successful spontaneous-speech LLM response will raise here and drop the utterance.Suggested fix
with metrics_ctx: + _llm_start = time.monotonic() log.info("spontaneous_speech: LLM call start (model=openai/%s)", grounded_model) response = await asyncio.wait_for( litellm.acompletion(🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@agents/hapax_daimonion/conversation_pipeline.py` around lines 401 - 416, The branch that calls litellm.acompletion uses _llm_start when computing elapsed time but never initializes it; set _llm_start = time.monotonic() (or __import__("time").monotonic()) immediately before logging "spontaneous_speech: LLM call start..." and before awaiting litellm.acompletion so that the subsequent computation (_llm_elapsed = __import__("time").monotonic() - _llm_start) is valid; ensure the same initialization is present for this spontaneous-speech branch around the litellm.acompletion call and that the log.info("LLM call took...") remains unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@agents/hapax_daimonion/cpal/runner.py`:
- Around line 1246-1248: The early return on effect.should_surface prevents any
exploration routing, arbiter claiming, or background speech tasks from ever
running; remove the unconditional "return" when effect.should_surface is true in
the block handling the Effect object (refer to effect.should_surface) and
instead let the normal exploration path execute: perform routing/arbiter claim
logic and schedule the background speech task (the same flow used when
exploration is allowed) while ensuring the existing “operator speech preempts
exploration” and “speak during exploration → exploration interrupted” acceptance
checks still run after scheduling; update any surrounding comments to reflect
that effect.should_surface enables the exploration-surface code path rather than
short-circuiting it.
---
Outside diff comments:
In `@agents/hapax_daimonion/conversation_pipeline.py`:
- Around line 401-416: The branch that calls litellm.acompletion uses _llm_start
when computing elapsed time but never initializes it; set _llm_start =
time.monotonic() (or __import__("time").monotonic()) immediately before logging
"spontaneous_speech: LLM call start..." and before awaiting litellm.acompletion
so that the subsequent computation (_llm_elapsed =
__import__("time").monotonic() - _llm_start) is valid; ensure the same
initialization is present for this spontaneous-speech branch around the
litellm.acompletion call and that the log.info("LLM call took...") remains
unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: ac6a47e7-faea-40db-87c2-02b95a3fe434
📒 Files selected for processing (4)
agents/hapax_daimonion/conversation_pipeline.pyagents/hapax_daimonion/cpal/runner.pyagents/hapax_daimonion/run_loops_aux.pyagents/hapax_daimonion/tts.py
| if effect.should_surface: | ||
| return # Exploration surfacing disabled — conversation-first architecture | ||
| # Refractory inhibition for exploration surfacing. |
There was a problem hiding this comment.
This guard disables the exploration path the PR is supposed to preempt.
Lines 1246-1247 return before any routing, arbiter claim, or background speech task runs, so the acceptance cases “operator speech preempts exploration” and “speak during exploration → exploration interrupted” cannot happen in this build.
Suggested fix
- if effect.should_surface:
- return # Exploration surfacing disabled — conversation-first architecture
+ if effect.should_surface:📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if effect.should_surface: | |
| return # Exploration surfacing disabled — conversation-first architecture | |
| # Refractory inhibition for exploration surfacing. | |
| if effect.should_surface: | |
| # Refractory inhibition for exploration surfacing. |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@agents/hapax_daimonion/cpal/runner.py` around lines 1246 - 1248, The early
return on effect.should_surface prevents any exploration routing, arbiter
claiming, or background speech tasks from ever running; remove the unconditional
"return" when effect.should_surface is true in the block handling the Effect
object (refer to effect.should_surface) and instead let the normal exploration
path execute: perform routing/arbiter claim logic and schedule the background
speech task (the same flow used when exploration is allowed) while ensuring the
existing “operator speech preempts exploration” and “speak during exploration →
exploration interrupted” acceptance checks still run after scheduling; update
any surrounding comments to reflect that effect.should_surface enables the
exploration-surface code path rather than short-circuiting it.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f4be83ab55
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| ) | ||
|
|
||
| text = response.choices[0].message.content.strip() | ||
| _llm_elapsed = __import__("time").monotonic() - _llm_start |
There was a problem hiding this comment.
Define LLM timing start before spontaneous speech call
generate_spontaneous_speech computes _llm_elapsed using _llm_start, but this variable is never assigned in the function, so every successful completion path raises NameError at runtime and drops spontaneous speech into the exception handler. This makes exploration/impingement surfacing fail even when the LLM call itself succeeds.
Useful? React with 👍 / 👎.
| from shared.config import MODELS | ||
|
|
||
| grounded_model = MODELS["local-fast"] | ||
| self._conversation_thread.append({"role": "user", "content": transcript}) |
There was a problem hiding this comment.
Reuse utterance safety filters in transcript conversation path
This new transcript path sends STT text directly to generation without going through _process_utterance_inner, so it skips the existing echo/strip safeguards (_is_echo, _strip_echo_prefix, duplicate rejection) that protect against self-capture. In the same commit the speaking gates were relaxed, so recaptured TTS transcripts can now be treated as operator input and trigger self-reinforcing turns.
Useful? React with 👍 / 👎.
| try: | ||
| import httpx | ||
|
|
||
| _tabby_url = "http://localhost:5000/v1/chat/completions" |
There was a problem hiding this comment.
Use configured LLM base URL for transcript requests
The transcript-driven path hardcodes http://localhost:5000/v1/chat/completions instead of using configured routing (LITELLM_BASE/model settings), so deployments that rely on non-local or overridden endpoints will fail only on this path while normal conversation continues to work. This creates a production-only split-brain behavior that's hard to diagnose.
Useful? React with 👍 / 👎.
| async with self._speech_lock: | ||
| try: | ||
| await self._pipeline.generate_spontaneous_speech( | ||
| impingement, | ||
| register_hint=register_hint, | ||
| destination_target=destination_target, |
There was a problem hiding this comment.
Set speaking gate while exploration speech is synthesizing
The background exploration path now calls generate_spontaneous_speech without ever setting buffer.set_speaking(True), but it still clears the flag in finally. With this commit’s always-on VAD behavior, exploration playback can be re-captured as fresh speech and fed back into the impingement loop, causing false speech events and self-trigger churn during active narration.
Useful? React with 👍 / 👎.
| resp = await client.post( | ||
| _tabby_url, | ||
| json={ | ||
| "model": "command-r", |
There was a problem hiding this comment.
Respect configured model route in transcript LLM calls
This transcript path hardcodes "model": "command-r" instead of using the configured model alias/router (local-fast / MODELS / daemon config). Environments that override model IDs or route names will break specifically for transcript-driven turns, even though the primary conversation path remains configurable.
Useful? React with 👍 / 👎.
| return | ||
|
|
||
| if effect.should_surface: | ||
| return # Exploration surfacing disabled — conversation-first architecture |
There was a problem hiding this comment.
Remove unconditional return that blocks exploration surfacing
In process_impingement, if effect.should_surface: immediately returns, which makes the entire exploration surfacing branch below unreachable (refractory check, destination decision, arbiter claim, and pipeline speech task). As a result, surfaced exploration impingements are always dropped instead of being deferred/preempted as intended.
Useful? React with 👍 / 👎.
| "temperature": 0.7, | ||
| }, | ||
| ) | ||
| response = resp.json() |
There was a problem hiding this comment.
Raise on non-2xx transcript LLM responses
The transcript path parses resp.json() without checking HTTP status, so 4xx/5xx responses are treated as normal payloads and then silently dropped by the generic exception/choices checks. This hides transport/auth failures and causes operator utterances to be ignored without a clear failure signal.
Useful? React with 👍 / 👎.
… optimization - Phenomenal context (stimmung/temporal/perception) injected from /dev/shm (~15ms) - Conversation history with proper alternation (last 10 turns) - Echo rejection via _is_echo() + immediate _recent_tts_texts registration - httpx client cached (eliminates per-request TCP setup) - Exploration surfacing + autonomous narration disabled - Compositor director speech disabled - Conversation window suppresses exploration for 120s after operator speech Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Auto-fix classified CI failure categories as |
- Older conversation turns compressed via LLMLingua-2 at 0.33 rate (~3x token savings) - Compressor pre-warmed at startup (BERT-base, CPU) - Echo rejection registers response text immediately (before TTS fires) - No context excised — full system prompt + phenomenal + compressed history Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Auto-fix classified CI failure categories as |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@agents/hapax_daimonion/conversation_pipeline.py`:
- Around line 485-493: The slice of self._conversation_thread used to build
history may contain ThreadEntry objects (see ThreadEntry type and
process_utterance) but the code treats entries as dicts; convert each
ThreadEntry to the LLM message dict shape before using history. In the block
that builds history (referencing _conversation_thread, history, messages,
system_block, transcript) map/transform ThreadEntry instances to {"role": ...,
"content": ...} (preserving role and content fields) before checking
history[-1].get("role") and before messages.extend(history), so messages only
contains plain dicts expected by the LLM call.
- Around line 500-511: The _httpx_client is created lazily in
conversation_pipeline (symbol: _httpx_client) but never closed, the endpoint is
hardcoded, and response status isn't validated; to fix, initialize
self._httpx_client: httpx.AsyncClient | None = None in __init__, replace the
hardcoded URL with the configured base (use the same config key as
_voice_litellm_base or load from shared/config.py model aliases), add an HTTP
status check after the post and raise/log on non-2xx before parsing, and close
the client in stop() by awaiting _httpx_client.aclose() and setting it to None
to avoid connection pool leaks (keep function names: stop(),
generate_spontaneous_speech, and attribute _voice_litellm_base as references).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 83df2dfe-fc63-462b-a4f6-12a0e5a88187
📒 Files selected for processing (1)
agents/hapax_daimonion/conversation_pipeline.py
| history = [] | ||
| for entry in self._conversation_thread[-10:]: | ||
| history.append(entry) | ||
| if history and history[-1].get("role") == "user": | ||
| history.pop() | ||
|
|
||
| messages = [{"role": "system", "content": system_block}] | ||
| messages.extend(history) | ||
| messages.append({"role": "user", "content": transcript}) |
There was a problem hiding this comment.
Type mismatch: _conversation_thread may contain ThreadEntry objects, not dicts.
The code assumes _conversation_thread entries are dicts with "role" keys, but process_utterance (line 1169) appends ThreadEntry objects. When both paths are used in a session:
history[-1].get("role")raisesAttributeErroronThreadEntrymessages.extend(history)adds invalid structures to the LLM request
🔧 Suggested fix: Convert ThreadEntry to message dicts
history = []
for entry in self._conversation_thread[-10:]:
- history.append(entry)
-if history and history[-1].get("role") == "user":
- history.pop()
+ if isinstance(entry, dict):
+ history.append(entry)
+ elif hasattr(entry, 'user_text') and hasattr(entry, 'response_summary'):
+ # Convert ThreadEntry to message pair
+ history.append({"role": "user", "content": entry.user_text})
+ if entry.response_summary:
+ history.append({"role": "assistant", "content": entry.response_summary})
+if history and history[-1].get("role") == "user":
+ history.pop()🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@agents/hapax_daimonion/conversation_pipeline.py` around lines 485 - 493, The
slice of self._conversation_thread used to build history may contain ThreadEntry
objects (see ThreadEntry type and process_utterance) but the code treats entries
as dicts; convert each ThreadEntry to the LLM message dict shape before using
history. In the block that builds history (referencing _conversation_thread,
history, messages, system_block, transcript) map/transform ThreadEntry instances
to {"role": ..., "content": ...} (preserving role and content fields) before
checking history[-1].get("role") and before messages.extend(history), so
messages only contains plain dicts expected by the LLM call.
| if not hasattr(self, '_httpx_client') or self._httpx_client is None: | ||
| self._httpx_client = httpx.AsyncClient(timeout=15.0) | ||
| _llm_start = time.monotonic() | ||
| resp = await self._httpx_client.post( | ||
| "http://localhost:5000/v1/chat/completions", | ||
| json={ | ||
| "model": "command-r", | ||
| "messages": messages, | ||
| "max_tokens": 120, | ||
| "temperature": 0.7, | ||
| }, | ||
| ) |
There was a problem hiding this comment.
httpx client leaks resources and endpoint is hardcoded.
_httpx_clientis created lazily but never closed instop()— connection pool leak- Endpoint
http://localhost:5000is hardcoded; should use configuration likegenerate_spontaneous_speechuses_voice_litellm_base(line 409) - No HTTP status check before parsing response — 4xx/5xx errors silently mishandled
🔧 Suggested fix
Declare in __init__:
self._httpx_client: httpx.AsyncClient | None = NoneClose in stop():
if self._httpx_client is not None:
await self._httpx_client.aclose()
self._httpx_client = NoneUse configuration and add error handling:
-resp = await self._httpx_client.post(
- "http://localhost:5000/v1/chat/completions",
+resp = await self._httpx_client.post(
+ f"{_voice_litellm_base}/v1/chat/completions",
json={
"model": "command-r",
"messages": messages,
"max_tokens": 120,
"temperature": 0.7,
},
)
+resp.raise_for_status()
response = resp.json()As per coding guidelines, **/*.py: Use shared/config.py for model aliases and LiteLLM/Qdrant configuration.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@agents/hapax_daimonion/conversation_pipeline.py` around lines 500 - 511, The
_httpx_client is created lazily in conversation_pipeline (symbol: _httpx_client)
but never closed, the endpoint is hardcoded, and response status isn't
validated; to fix, initialize self._httpx_client: httpx.AsyncClient | None =
None in __init__, replace the hardcoded URL with the configured base (use the
same config key as _voice_litellm_base or load from shared/config.py model
aliases), add an HTTP status check after the post and raise/log on non-2xx
before parsing, and close the client in stop() by awaiting
_httpx_client.aclose() and setting it to None to avoid connection pool leaks
(keep function names: stop(), generate_spontaneous_speech, and attribute
_voice_litellm_base as references).
No context excised — full system prompt with policy, environment, goals, health, nudges, imagination, phenomenal state, grounding directive, and effort calibration. Runs in thread executor so event loop stays free. LLMLingua-2 compresses older conversation history at 0.33 rate. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Auto-fix classified CI failure categories as |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 26ac5a2056
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| import onnx_asr | ||
|
|
||
| log.info("Loading Parakeet model %s on %s...", self._model_name, self._device) | ||
| self._model = onnx_asr.load_model(self._model_name) |
There was a problem hiding this comment.
Add missing onnx_asr runtime dependency
This new STT loader hard-depends on onnx_asr, but the same commit does not add that package to project dependencies/lock. In environments created from pyproject.toml + uv.lock, ResidentSTT.load() will raise ModuleNotFoundError, leaving _model unset and making speech transcription unavailable for the daemon.
Useful? React with 👍 / 👎.
| if self._audio_perception is not None: | ||
| asyncio.create_task( | ||
| self._audio_perception.process_utterance( | ||
| audio_bytes=utterance, |
There was a problem hiding this comment.
Fall back when audio perception cannot process utterances
The new utterance routing sends all captured speech to AudioPerceptionBackend whenever that object exists, but it never checks backend readiness (available()) and never falls back to _process_utterance. If STT fails to load (or is otherwise unavailable), this branch still consumes utterances and the legacy path is skipped, so operator speech is dropped instead of being processed.
Useful? React with 👍 / 👎.
| history = [] | ||
| for entry in self._conversation_thread[-10:]: | ||
| history.append(entry) | ||
| if history and history[-1].get("role") == "user": |
There was a problem hiding this comment.
Normalize thread entries before transcript history checks
process_utterance_from_transcript treats conversation history entries as dicts, but the normal conversation path appends ThreadEntry objects to _conversation_thread. Once any regular turn exists, this .get("role") call raises AttributeError (and messages.extend(history) would also be non-serializable), so transcript-driven operator turns fail before the LLM request.
Useful? React with 👍 / 👎.
| # Skip if operator speech was just processed (it preempted) | ||
| if operator_imps: | ||
| source = getattr(imp, "source", "") | ||
| if source in ("exploration", "endogenous.narrative_drive"): |
There was a problem hiding this comment.
Match exploration source prefixes in preemption filter
The operator-preemption filter only skips impingements whose source is exactly "exploration", but exploration producers emit prefixed sources such as "exploration.boredom" (see agents/reverie/mixer.py). In mixed batches, those exploration impingements still run immediately after operator speech despite the preemption comment, which weakens the intended operator-priority behavior.
Useful? React with 👍 / 👎.
|
Governed stale-PR reconciliation note (task |
Summary
Spec: docs/superpowers/specs/2026-05-23-impingement-native-voice-design.md
Test plan
🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Behavior Changes
Configuration
Tests
Documentation