feat(voice): core voice pipeline + UI #450
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the review settings.
📝 Walkthrough

Adds a full real-time voice feature: new voice extension, STT/TTS service factories, agent bridge and pipeline, session managers (memory/Redis), WebSocket REST endpoints, frontend UI/client/stores, configuration and dependency updates, tests, and supporting tooling and safety fixes.
Sequence Diagram

```mermaid
sequenceDiagram
    participant User as User
    participant FE as Frontend (VoiceClient)
    participant WS as WebSocket
    participant BE as VoiceEndpoint
    participant SM as SessionManager
    participant Bridge as AgentBridge
    participant STT as STT Service
    participant Agent as Agent
    participant TTS as TTS Service
    User->>FE: Start voice session
    FE->>BE: POST /voice/session/start
    BE->>SM: create_session(context_id)
    SM-->>BE: {session_id, token?}
    BE-->>FE: {session_id, ws_url}
    FE->>WS: Connect ws_url (optional token)
    WS->>BE: Accept WebSocket
    BE->>SM: update_state(active)
    loop session active
        User->>FE: Speak (mic audio frames)
        FE->>WS: Send PCM frames
        WS->>BE: Receive audio frames
        BE->>STT: Stream/process frames
        STT-->>BE: Transcript chunk
        BE->>Bridge: process_transcription(text)
        Bridge->>Agent: manifest_run(context)
        Agent-->>Bridge: Response (sync/stream)
        Bridge->>TTS: Request synthesis
        TTS-->>Bridge: Audio chunks
        Bridge-->>BE: Emit transcripts + audio
        BE->>WS: Send agent audio + events
        WS-->>FE: Playback audio, show transcript
    end
    User->>FE: End session
    FE->>WS: Send stop
    WS->>BE: Close
    BE->>SM: end_session(session_id)
    BE-->>FE: Session ended
```
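The session lifecycle in the diagram can be sketched as a minimal state machine. This is an illustrative sketch only: the state names ("created", "active", "ended") and method names are assumptions for clarity, not the actual SessionManager API.

```python
# Illustrative sketch of the voice session lifecycle from the diagram.
# State names and methods are assumptions, not the real SessionManager API.
class VoiceSession:
    def __init__(self, session_id: str, context_id: str):
        self.session_id = session_id
        self.context_id = context_id
        self.state = "created"  # after POST /voice/session/start

    def activate(self) -> None:
        # WebSocket accepted -> update_state(active)
        if self.state != "created":
            raise RuntimeError(f"cannot activate from state {self.state!r}")
        self.state = "active"

    def end(self) -> None:
        # Client sent stop or the socket closed -> end_session(session_id)
        self.state = "ended"


session = VoiceSession("sess-1", "ctx-1")
session.activate()
session.end()
```

Audio streaming and STT/TTS happen only while the session is in the "active" state; everything else in the diagram is setup or teardown around it.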
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
Actionable comments posted: 16
Note
Due to the large number of review comments, Critical and Major severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
tests/unit/test_minimax_example.py (1)
192-250: ⚠️ Potential issue | 🟠 Major

Move live API tests out of `tests/unit/` (or mark/split as integration).

`TestMiniMaxIntegration` performs real network calls, which makes this suite non-unit, flaky, and environment-dependent. Please relocate these tests to an integration test path (or gate them with an explicit integration marker) and keep `tests/unit/` deterministic.

Suggested refactor sketch
```diff
-class TestMiniMaxIntegration:
+@pytest.mark.integration
+class TestMiniMaxIntegration:
```

```diff
-# tests/unit/test_minimax_example.py
+# tests/integration/test_minimax_example_integration.py
```

As per coding guidelines, "`tests/unit/**/*.py`: Place unit tests in `tests/unit/` directory with pytest and `@pytest.mark.asyncio` decorator for async tests".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/test_minimax_example.py` around lines 192-250: these tests (test_minimax_api_connection, test_minimax_chat_completion, test_minimax_m27_highspeed_model) perform real network calls and must be treated as integration tests: either move these functions (or the TestMiniMaxIntegration class containing them) out of the tests/unit suite into your integration test folder (e.g., tests/integration/) or mark them with an explicit pytest marker (e.g., decorate each function or the class with `@pytest.mark.integration`) and add the "integration" marker to pytest.ini; ensure any CI/test runners run integration tests separately so tests/unit remains deterministic.

frontend/src/lib/utils/agentMessageHandler.ts (2)
363-370: ⚠️ Potential issue | 🟡 Minor

Response body already consumed before fallback text read.

If `response.json()` throws a parse error at line 364, the response body has already been consumed. The subsequent `response.text()` at line 367 will fail or return empty because the body stream can only be read once.

🐛 Proposed fix
```diff
 let result;
 try {
   result = await response.json();
 } catch (parseError) {
   console.error('Failed to parse response JSON:', parseError);
-  const text = await response.text();
-  console.error('Response text:', text);
+  // Note: body already consumed by json() attempt, cannot read text
+  console.error('Response body was not valid JSON');
   throw new Error('Failed to parse response');
 }
```

Alternatively, read as text first and then parse:
```typescript
const text = await response.text();
let result;
try {
  result = JSON.parse(text);
} catch (parseError) {
  console.error('Failed to parse response JSON:', parseError);
  console.error('Response text:', text);
  throw new Error('Failed to parse response');
}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@frontend/src/lib/utils/agentMessageHandler.ts` around lines 363 - 370, The current try/catch around response.json() in agentMessageHandler.ts consumes the response body before attempting to read fallback text, causing response.text() to fail; change the logic to read the body as text first (const text = await response.text()), then attempt to parse with JSON.parse(text) inside a try/catch, logging both the parse error and the raw text on failure and rethrowing a descriptive Error, replacing the existing response.json() + response.text() sequence and preserving the result variable usage.
11-11: ⚠️ Potential issue | 🟠 Major

Hardcoded localhost URL will break in production deployments.

The `AGENT_BASE_URL` is hardcoded to `http://localhost:3773` on line 11. This same pattern also appears in `paymentHandler.ts` (line 6). Both files use this URL directly in fetch calls, which will fail when deployed outside localhost.

The codebase already has an established pattern for client-side configuration via `PUBLIC_*` environment variables (see `PublicConfig.svelte.ts` and `.env.example`). The agent URL should follow this same pattern: define `PUBLIC_AGENT_BASE_URL` as a configurable environment variable instead of hardcoding it.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@frontend/src/lib/utils/agentMessageHandler.ts` at line 11, Replace the hardcoded AGENT_BASE_URL constant with a configurable PUBLIC_AGENT_BASE_URL environment value: update the AGENT_BASE_URL declaration in agentMessageHandler.ts to read the client-side public config/env var used across the app (the same pattern as PublicConfig.svelte.ts), and make the same change in paymentHandler.ts (replace the literal 'http://localhost:3773' with the PUBLIC_AGENT_BASE_URL variable). Also ensure .env.example and the public config export include PUBLIC_AGENT_BASE_URL so the runtime build picks it up.
🟡 Minor comments (8)
tests/integration/grpc/test_grpc_e2e.py-198-200 (1)
198-200: ⚠️ Potential issue | 🟡 Minor

Fail fast on JSON-RPC `error` payloads instead of polling through timeout.

When `tasks/get` returns an error, continuing to poll hides the root cause and delays failure diagnostics.

🔧 Proposed patch
```diff
-        if payload.get("error"):
-            time.sleep(0.2)
-            continue
+        if payload.get("error"):
+            raise AssertionError(
+                f"tasks/get returned error for {task_id}: {payload['error']}"
+            )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/integration/grpc/test_grpc_e2e.py` around lines 198-200: the loop currently swallows JSON-RPC error payloads by doing "if payload.get('error'): time.sleep(0.2); continue"; change this to fail fast: when payload.get('error') is truthy, immediately raise or call pytest.fail with the payload/error details (include the full payload or payload['error'] in the message) instead of sleeping/continuing, so tests that call the tasks/get RPC surface the root cause promptly.

tests/integration/grpc/test_grpc_e2e.py-184-186 (1)
184-186: ⚠️ Potential issue | 🟡 Minor

Narrow the blind exception handlers in polling.

Catching `Exception` here can mask real regressions and keeps Ruff BLE001 warnings active. Prefer scoped exceptions (`httpx.HTTPError`, `ValueError`) so unexpected failures surface immediately.

🔧 Proposed patch
```diff
-    except Exception:
+    except httpx.HTTPError:
         time.sleep(0.2)
         continue
@@
-    except Exception:
+    except ValueError:
         time.sleep(0.2)
         continue
```

Also applies to: 194-196
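As a self-contained illustration of the narrowed pattern (the `poll` helper is hypothetical, and `ValueError` stands in for the `httpx.HTTPError` the real test would catch): only the expected transient error is retried, and anything else propagates immediately.

```python
import time


# Hypothetical helper showing the narrowed-exception polling pattern:
# retry only the expected transient error type; let anything else escape.
def poll(fetch, attempts=5, delay=0.0):
    last_exc = None
    for _ in range(attempts):
        try:
            return fetch()
        except ValueError as exc:  # expected/transient; httpx.HTTPError in the real test
            last_exc = exc
            time.sleep(delay)
    raise AssertionError(f"polling exhausted after {attempts} attempts: {last_exc}")
```

With this shape, a `RuntimeError` raised inside `fetch` fails the test at the first occurrence instead of being silently retried until the timeout.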
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/integration/grpc/test_grpc_e2e.py` around lines 184-186: replace the broad "except Exception:" blind handlers in the polling loops in test_grpc_e2e.py with narrow, specific exceptions so real errors surface; update the two catch blocks (the one around the polling/wait retry at the snippet and the similar block at lines ~194-196) to catch only expected errors such as httpx.HTTPError and ValueError (and/or TimeoutError if the code awaits network/timeouts), add the necessary import for httpx at the top of the test file, and leave the existing retry behavior (time.sleep(0.2); continue) unchanged for these specific exceptions so unknown exceptions will still propagate and fail the test.

tests/unit/server/scheduler/test_memory_scheduler.py-110-131 (1)
110-131: ⚠️ Potential issue | 🟡 Minor

This test doesn't exercise the capacity path it claims to.

Monkeypatching `send()` to return immediately bypasses `fail_after()` and queue saturation entirely, and the payload here doesn't match real `TaskSendParams` (message, UUID values). Either rename this to a simple "invokes send" test or actually fill the buffer and unblock it from another task.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/server/scheduler/test_memory_scheduler.py` around lines 110-131: the test currently monkeypatches scheduler._write_stream.send to return immediately and uses a fake payload, so it doesn't exercise capacity/queue saturation; change it to either rename the test to "invokes send" or rewrite it to block the send path and use a real TaskSendParams-like payload so run_task must wait: create an asyncio.Event (or similar) and monkeypatch scheduler._write_stream.send to await the event before returning (so the first run_task fills the buffer), then start a second run_task call and assert it blocks until you set the event, and use real UUIDs/messages matching TaskSendParams; reference scheduler.run_task and scheduler._write_stream.send when making the changes.

bindu/penguin/config_validator.py-149-157 (1)
149-157: ⚠️ Potential issue | 🟡 Minor

Use `parsed.hostname` instead of `not parsed.netloc` for URL host validation.

The current check accepts malformed URLs: `urlparse("http://:80")` returns a non-empty `netloc` of `':80'` but `hostname=None`. This allows invalid deployment URLs to pass validation and fail later. Additionally, line 149 checks if the stripped value is non-empty but parses the unstripped `deployment_url`, which preserves leading/trailing whitespace in the parsed result.
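The edge case is easy to verify with the standard library alone:

```python
from urllib.parse import urlparse

# For a URL with an empty host, netloc is truthy but hostname is None,
# so a `not parsed.netloc` check lets the malformed URL through.
parsed = urlparse("http://:80")
print(repr(parsed.netloc))    # ':80' -> truthy, passes a netloc check
print(repr(parsed.hostname))  # None  -> correctly rejected by a hostname check
```

A well-formed URL like `https://example.com` yields `hostname == "example.com"`, so the stricter check does not reject valid input.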
parsed.hostnameinstead:deployment_url = deployment_config.get("url") if not isinstance(deployment_url, str) or not deployment_url.strip(): raise ConfigError( "ConfigError: 'deployment.url' is a required field in the agent configuration." ) deployment_url = deployment_url.strip() # Strip before parsing parsed = urlparse(deployment_url) if parsed.scheme not in {"http", "https"} or not parsed.hostname: # Use hostname raise ConfigError( f"ConfigError: 'deployment.url' must be a valid http(s) URL, got '{deployment_url}'." )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/penguin/config_validator.py` around lines 149-157: the deployment URL validation currently parses the raw value and checks parsed.netloc, which accepts malformed hosts and also leaves surrounding whitespace; update the logic in the validation around deployment_config.get("url") so you strip the string before parsing (assign back to deployment_url), then call urlparse(deployment_url) and validate using parsed.hostname instead of parsed.netloc, keeping the same ConfigError messages (raise ConfigError in the same places) so invalid or whitespace-padded URLs like "http://:80" are rejected.

bindu/extensions/__init__.py-37-41 (1)
37-41: ⚠️ Potential issue | 🟡 Minor

Voice extension docs are too provider-specific.
This text reads as Deepgram+ElevenLabs only, while voice providers are configurable. Consider documenting it as provider-based STT/TTS to avoid stale guidance.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/extensions/__init__.py` around lines 37-41: the extension docs currently call out Deepgram and ElevenLabs specifically in the "Voice (Real-time Voice Conversations)" section; update the wording to be provider-agnostic by describing it as configurable STT and TTS providers used via the Pipecat pipeline, and note that providers can be chosen via the voice config passed to bindufy() or via VOICE__ENABLED environment settings; keep the mention of Pipecat, WebSocket, and bidirectional voice conversations but remove vendor-specific names so the text describes "provider-based STT/TTS" instead of Deepgram/ElevenLabs.

frontend/src/lib/stores/chat.ts-259-264 (1)
259-264: ⚠️ Potential issue | 🟡 Minor

Local user echo can drop text in multipart messages.

Only the first `kind === 'text'` part is rendered locally, so additional text parts are lost in the UI.
- const textPart = parts.find((p) => p.kind === 'text'); - // Only add message if text part exists and has non-empty text - if (textPart?.text) { - addMessage(textPart.text, 'user', taskId); - } + const text = parts + .filter((p) => p?.kind === 'text' && typeof p.text === 'string') + .map((p) => p.text) + .join('\n') + .trim(); + if (text) { + addMessage(text, 'user', taskId); + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@frontend/src/lib/stores/chat.ts` around lines 259-264: the local user-echo currently only renders the first text part by using parts.find(...) and textPart, which drops any additional parts; change this to collect all parts where p.kind === 'text' (e.g., parts.filter(...)) and concatenate their .text values (preserving order and skipping empty strings) and pass the combined string into addMessage(text, 'user', taskId) so all text segments in multipart messages are shown locally.

frontend/.env.example-91-93 (1)
91-93: ⚠️ Potential issue | 🟡 Minor

Fix formatting: remove spaces around equal signs.

Static analysis correctly flags spaces around `=` in these lines. `.env` files should not have spaces around the equal sign as some parsers may include the space in the value.

🔧 Proposed fix
```diff
-COOKIE_SAMESITE= # can be "lax", "strict", "none" or left empty
-COOKIE_SECURE= # set to true to only allow cookies over https
-TRUSTED_EMAIL_HEADER= # header to use to get the user email, only use if you know what you are doing
+COOKIE_SAMESITE=# can be "lax", "strict", "none" or left empty
+COOKIE_SECURE=# set to true to only allow cookies over https
+TRUSTED_EMAIL_HEADER=# header to use to get the user email, only use if you know what you are doing
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@frontend/.env.example` around lines 91-93: the .env example lines have spaces around the equals which can be interpreted as part of the value; update the three entries COOKIE_SAMESITE, COOKIE_SECURE, and TRUSTED_EMAIL_HEADER to remove spaces around the '=' so they read as KEY= (no spaces) to ensure .env parsers treat values correctly and static analysis stops flagging them.

bindu/extensions/voice/session_factory.py-117-124 (1)
117-124: ⚠️ Potential issue | 🟡 Minor

Remove redundant cleanup after `__aenter__` failure; the manager already handles it internally.

The `RedisVoiceSessionManager.__aenter__` method catches `redis.RedisError`, closes the client connection (line 144), and cleans up state before raising `ConnectionError`. Calling `__aexit__` afterward is unnecessary; it will perform no additional cleanup since `self._redis_client` is already `None` and there is no cleanup task to cancel.

🛡️ Simplified cleanup
```diff
     try:
         await manager.__aenter__()
     except Exception:
+        # __aenter__ handles its own cleanup on failure
         raise
     return manager
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/extensions/voice/session_factory.py` around lines 117 - 124, The try/except around awaiting manager.__aenter__ performs redundant cleanup by calling await manager.__aexit__ after __aenter__ raises; remove that extra call and let RedisVoiceSessionManager.__aenter__ propagate its ConnectionError because __aenter__ already closes the client and resets state. Update the block that currently catches Exception around await manager.__aenter__ to simply await and let exceptions propagate (or re-raise without calling manager.__aexit__), ensuring references to manager.__aenter__, manager.__aexit__, and the RedisVoiceSessionManager behavior remain correct.
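A minimal sketch of the convention this finding relies on (names are illustrative, not the real `RedisVoiceSessionManager`): an `__aenter__` that fails cleans up after itself before raising, so callers need no compensating `__aexit__` call.

```python
import asyncio


class Manager:
    """Illustrative async context manager that cleans up on __aenter__ failure."""

    def __init__(self, fail: bool = False):
        self._fail = fail
        self.client = None

    async def __aenter__(self):
        self.client = object()  # pretend to open a connection
        if self._fail:
            self.client = None  # internal cleanup before raising
            raise ConnectionError("could not connect")
        return self

    async def __aexit__(self, *exc_info):
        self.client = None


async def create_manager(fail: bool = False) -> Manager:
    manager = Manager(fail)
    # No try/except with a compensating __aexit__ here:
    # __aenter__ already handled its own cleanup on failure.
    await manager.__aenter__()
    return manager
```

Because the failure path leaves the instance fully reset, the factory can simply let the exception propagate, which is exactly the simplification the comment proposes.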
🧹 Nitpick comments (24)
frontend/src/lib/migrations/lock.ts (1)
18-18: Remove `as Partial<Semaphore>` cast and use `satisfies Semaphore` instead.

The object being inserted provides all required fields (`_id`, `key`, `createdAt`, `updatedAt`, `deleteAt`) and fully satisfies the `Semaphore` interface. The `Partial` cast weakens type safety without benefit. Use `satisfies Semaphore` to preserve compile-time schema validation while keeping type inference clean.

♻️ Proposed fix
```diff
-  const insert = await collections.semaphores.insertOne({
+  const lockDoc = {
     _id: id,
     key,
     createdAt: new Date(),
     updatedAt: new Date(),
     deleteAt: new Date(Date.now() + 1000 * 60 * 3), // 3 minutes
-  } as Partial<Semaphore>);
+  } satisfies Semaphore;
+
+  const insert = await collections.semaphores.insertOne(lockDoc);
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@frontend/src/lib/migrations/lock.ts` at line 18: replace the weak cast "as Partial<Semaphore>" with the stricter "satisfies Semaphore" on the object being inserted in lock.ts: locate the code that constructs the semaphore object (the literal containing _id, key, createdAt, updatedAt, deleteAt) and change the type assertion so the literal satisfies the Semaphore interface (e.g., use "satisfies Semaphore" instead of "as Partial<Semaphore>") to preserve full compile-time schema validation and keep type inference intact for the Semaphore type.

frontend/src/lib/server/database.ts (3)
81-85: Redundant filter check after sorting.

The `matched` array is already filtered on line 67. The loop on lines 81-85 re-applies `matchesFilter` unnecessarily. You can simplify by returning the first element directly.

♻️ Suggested simplification
```diff
     }
   }
-  for (const doc of matched) {
-    if (this.matchesFilter(doc, filter)) {
-      return doc;
-    }
-  }
-  return null;
+  return matched[0] ?? null;
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@frontend/src/lib/server/database.ts` around lines 81 - 85, The loop that re-checks each item with this.matchesFilter is redundant because matched is already filtered; replace the for-loop over matched (using matched, matchesFilter, filter) with a direct return of the first element (e.g. return matched[0]) and preserve the original empty-array behavior (return undefined/null) so the function's semantics don't change.
496-512: Note: Both `on` and `once` schedule emissions independently.

When registering handlers via both `on` and `once` for the same event (e.g., "data"), each registration triggers its own `setTimeout` for emission. This could cause duplicate data delivery to multiple handlers. For this in-memory mock used in tests, this is likely acceptable, but worth noting if real-world usage patterns involve multiple listener registrations.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@frontend/src/lib/server/database.ts` around lines 496 - 512, The on/once handlers in downloadStream currently schedule separate setTimeout emissions each time (functions on and once use addListener and then schedule emit), causing duplicate deliveries; modify the logic in downloadStream so emission scheduling is done only once per event (e.g., maintain a Set like scheduledEmits or check for existing listeners before scheduling) and have on and once only add listeners (using addListener/wrapped) while relying on a single scheduled emit per event (use emit(file.data / new Error...) only if not already scheduled) to prevent duplicate emissions to multiple registrations.
152-159: Minor: Redundant variable assignment.

The `const updated = pipelineUpdate;` on line 158 is unnecessary since `pipelineUpdate` already holds the final result.

♻️ Suggested simplification
```diff
   const pipelineUpdate = Array.isArray(update)
     ? update.reduce(
         (acc, stage) => this.applyUpdate(acc, stage),
         doc
       )
     : this.applyUpdate(doc, update);
-  const updated = pipelineUpdate;
-  this.data.set(id, updated);
+  this.data.set(id, pipelineUpdate);
   count++;
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@frontend/src/lib/server/database.ts` around lines 152-159: remove the redundant assignment "const updated = pipelineUpdate;" and use pipelineUpdate directly when saving to the map; update the block in the method that computes pipelineUpdate (the Array.isArray(update) ? ... : this.applyUpdate(...) logic) so that you call this.data.set(id, pipelineUpdate) directly and delete the unused "updated" binding, ensuring no other references to "updated" remain in the surrounding code (look for applyUpdate, pipelineUpdate, updated, and this.data.set).

tests/unit/penguin/test_bindufy.py (1)
186-198: Consider moving this test inside the `TestBindufyUtilities` class.

The new test validates an important error path. However, it's defined as a standalone function while all other tests in this file are methods of `TestBindufyUtilities`. Moving it inside the class would maintain consistency.

♻️ Suggested refactor for consistency
```diff
-def test_bindufy_non_callable_handler_raises_clear_error():
-    """bindufy should fail fast with a clear handler validation message."""
-    config = {
-        "author": "test@example.com",
-        "name": "Test Agent",
-        "deployment": {"url": "http://localhost:3773"},
-    }
-
-    with pytest.raises(
-        TypeError,
-        match="callable function or coroutine function",
-    ):
-        bindufy(config=config, handler=cast(Any, "not_callable"), run_server=False)
+class TestBindufyUtilities:
+    # ... existing tests ...
+
+    def test_bindufy_non_callable_handler_raises_clear_error(self):
+        """bindufy should fail fast with a clear handler validation message."""
+        config = {
+            "author": "test@example.com",
+            "name": "Test Agent",
+            "deployment": {"url": "http://localhost:3773"},
+        }
+
+        with pytest.raises(
+            TypeError,
+            match="callable function or coroutine function",
+        ):
+            bindufy(config=config, handler=cast(Any, "not_callable"), run_server=False)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/penguin/test_bindufy.py` around lines 186-198: move the standalone test function test_bindufy_non_callable_handler_raises_clear_error into the existing TestBindufyUtilities test class to keep test structure consistent; locate the function and cut it into the TestBindufyUtilities class body, preserving its name, docstring, config setup and the pytest.raises assertion that calls bindufy(config=..., handler=cast(Any, "not_callable"), run_server=False), and ensure indentation and any imports (pytest, cast, Any) remain correct so the test continues to run as a class method.

frontend/src/lib/components/voice/LiveTranscript.svelte (1)
11-24: Add live-region semantics for real-time transcript updates.

This panel updates dynamically; adding `role="log"` + `aria-live="polite"` improves screen-reader usability.

Proposed patch
```diff
-<div class="flex max-h-72 flex-col gap-2 overflow-y-auto rounded-lg border border-gray-200 bg-white/80 p-3 text-sm dark:border-gray-700 dark:bg-gray-900/80">
+<div
+  role="log"
+  aria-live="polite"
+  aria-relevant="additions text"
+  class="flex max-h-72 flex-col gap-2 overflow-y-auto rounded-lg border border-gray-200 bg-white/80 p-3 text-sm dark:border-gray-700 dark:bg-gray-900/80"
+>
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@frontend/src/lib/components/voice/LiveTranscript.svelte` around lines 11-24: the transcript container in the LiveTranscript.svelte component needs live-region semantics so screen readers get updates: update the top-level div that renders the transcript items (the element wrapping the {`#if` items.length === 0} block) to include role="log" and aria-live="polite" (and optionally aria-atomic="false") so dynamic transcript entries from the items array are announced appropriately by assistive tech.

tests/unit/extensions/voice/test_voice_websocket_integration.py (2)
25-74: Extract pipecat stubs to a shared fixture.

The pipecat module stubbing code (lines 25-74) is duplicated almost identically in both tests. Consider extracting this to a shared pytest fixture to reduce duplication and improve maintainability.
♻️ Example fixture extraction
```python
@pytest.fixture
def mock_pipecat_modules(monkeypatch):
    """Stub Pipecat modules for WebSocket testing."""
    fastapi_mod = types.ModuleType("pipecat.transports.websocket.fastapi")

    class FastAPIWebsocketParams:
        def __init__(self, **kwargs):
            self.kwargs = kwargs

    class FastAPIWebsocketTransport:
        def __init__(self, websocket, params):
            self._ws = websocket
            self._params = params

        def input(self):
            return object()

        def output(self):
            return object()

    fastapi_mod.FastAPIWebsocketTransport = FastAPIWebsocketTransport
    fastapi_mod.FastAPIWebsocketParams = FastAPIWebsocketParams
    # ... rest of module stubs
    monkeypatch.setitem(sys.modules, "pipecat.transports.websocket.fastapi", fastapi_mod)
    # ... rest of setitems
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/extensions/voice/test_voice_websocket_integration.py` around lines 25 - 74, Extract the repeated pipecat stubbing into a single pytest fixture (e.g., mock_pipecat_modules) that builds the same ModuleType objects and classes (FastAPIWebsocketParams, FastAPIWebsocketTransport, Pipeline, PipelineTask, PipelineRunner) and performs the monkeypatch.setitem calls on sys.modules for "pipecat.transports.websocket.fastapi", "pipecat.pipeline.pipeline", "pipecat.pipeline.task", and "pipecat.pipeline.runner"; then replace the duplicated block in both tests with a fixture parameter in the test signature so tests reuse the fixture instead of redefining the stubs inline.
98-112: Consider using pytest fixtures for app_settings modification.

Directly modifying `module.app_settings.voice.*` with manual restoration in `finally` blocks is error-prone and could cause test pollution if tests run in parallel. Consider using `monkeypatch.setattr` for these settings modifications, which automatically restores values after each test.

♻️ Suggested approach using monkeypatch
```diff
-    original_required = module.app_settings.voice.session_auth_required
-    original_timeout = module.app_settings.voice.session_timeout
-    original_enabled = module.app_settings.voice.enabled
-    original_stt = module.app_settings.voice.stt_api_key
-    original_tts = module.app_settings.voice.tts_api_key
-    try:
-        module.app_settings.voice.session_auth_required = True
-        module.app_settings.voice.session_timeout = 300
-        module.app_settings.voice.enabled = True
-        module.app_settings.voice.stt_api_key = (
-            "unit-test-stt-token"  # pragma: allowlist secret
-        )
-        module.app_settings.voice.tts_api_key = (
-            "unit-test-tts-token"  # pragma: allowlist secret
-        )
+    monkeypatch.setattr(module.app_settings.voice, "session_auth_required", True)
+    monkeypatch.setattr(module.app_settings.voice, "session_timeout", 300)
+    monkeypatch.setattr(module.app_settings.voice, "enabled", True)
+    monkeypatch.setattr(module.app_settings.voice, "stt_api_key", "unit-test-stt-token")
+    monkeypatch.setattr(module.app_settings.voice, "tts_api_key", "unit-test-tts-token")
```

Then remove the `finally` block for restoration.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/extensions/voice/test_voice_websocket_integration.py` around lines 98-112: the test directly mutates module.app_settings.voice attributes (module.app_settings.voice.session_auth_required, session_timeout, enabled, stt_api_key, tts_api_key) and restores them in a finally block; replace those manual mutations with pytest's monkeypatch.setattr calls for each attribute so values are automatically reverted after the test (use monkeypatch.setattr(module.app_settings.voice, "session_auth_required", True), etc.), and remove the finally restoration block.

bindu/utils/worker/messages.py (1)
83-86: Consider handling missing file structure more gracefully.

When `part.get("file")` returns `None` or an empty dict, the code falls through to the MIME type check and logs an "Unsupported MIME type" warning with an empty string. Consider adding explicit handling for malformed file parts.

♻️ Suggested improvement
```diff
     file_info = part.get("file") or {}
+    if not file_info:
+        logger.warning("File part missing 'file' structure")
+        processed_parts.append({
+            "kind": "text",
+            "text": "[System: Malformed file upload - missing file data]",
+        })
+        continue
     mime_type = file_info.get("mimeType", "")
     file_name = file_info.get("name", "uploaded file")
     base64_data = file_info.get("bytes") or file_info.get("data", "")
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/utils/worker/messages.py` around lines 83-86: the code currently assumes part.get("file") yields a valid dict and proceeds to mime_type/file_name/base64_data extraction, which causes vague "Unsupported MIME type" warnings when the file structure is missing; update the handler around variables part, file_info, mime_type, file_name, and base64_data to explicitly detect a missing or malformed file_info (e.g., file_info is None or lacks bytes/data) and log a clear "Malformed file part" (including file_name or part identifier) and skip processing/return early before performing the MIME type check, so subsequent logic only runs when base64_data and mime_type are present and valid.

bindu/server/applications.py (1)
113-115: Consider moving the import to the top of the file.

The `get_voice_extension_from_capabilities` import is placed inside `__init__` alongside where it's used. While this works for lazy loading, it's inconsistent with `get_x402_extension_from_capabilities`, which is imported at the top (line 40). For consistency, consider moving this import to the module level alongside the x402 utility import.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/server/applications.py` around lines 113-115: move the local import of get_voice_extension_from_capabilities out of the __init__-time block and add it to the module-level imports alongside get_x402_extension_from_capabilities so imports are consistent; update the top of the file to import get_voice_extension_from_capabilities and remove the inline "from bindu.utils import get_voice_extension_from_capabilities" before the call that assigns voice_ext = get_voice_extension_from_capabilities(manifest).

tests/unit/extensions/voice/test_session_manager.py (1)
99-104: Consider adding a test for `update_state` on a non-existent session.

The test verifies `update_state` works on an existing session, but doesn't test the behavior when called with a non-existent session ID. This edge case should be covered to document expected behavior (silent no-op or exception).

🧪 Suggested additional test
```python
@pytest.mark.asyncio
async def test_update_state_nonexistent_session(self, manager):
    # Should not raise (or raise specific exception - document expected behavior)
    await manager.update_state("nonexistent", "active")
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/extensions/voice/test_session_manager.py` around lines 99 - 104, Add a new async pytest test named test_update_state_nonexistent_session that calls SessionManager.update_state with a non-existent session id (e.g., "nonexistent") and asserts the documented behavior: either that it completes without raising (no-op) or that it raises the specific exception your implementation uses; also verify via SessionManager.get_session that no session was created/changed. Reference the manager fixture and the update_state and get_session methods when adding this test so the edge case is covered and behavior is explicitly asserted.

bindu/extensions/voice/pipeline_builder.py (1)
39-48: Update docstring to document all callback parameters.

The docstring mentions `on_user_transcript` and `on_agent_response` but omits `on_state_change` and `on_agent_transcript`, which are also accepted parameters.

📝 Suggested docstring update

```diff
 Args:
     voice_ext: Voice agent extension with STT/TTS config.
     manifest_run: The agent manifest's ``run`` callable.
     context_id: A2A context ID for this session.
+    on_state_change: Optional callback for pipeline state changes.
     on_user_transcript: Optional callback for user transcript events.
     on_agent_response: Optional callback for agent response events.
+    on_agent_transcript: Optional callback for agent transcript events.

 Returns:
     Dictionary with ``stt``, ``tts``, ``bridge``, and ``vad`` components.
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/extensions/voice/pipeline_builder.py` around lines 39 - 48, Docstring for the voice pipeline function in bindu/extensions/voice/pipeline_builder.py currently documents on_user_transcript and on_agent_response but omits the on_state_change and on_agent_transcript callback parameters; update the function's docstring (the block above the pipeline builder function) to list and briefly describe all four callbacks (on_user_transcript, on_agent_transcript, on_agent_response, and on_state_change), including their expected signatures and when they're invoked, while preserving the existing Args and Returns sections and references to voice_ext, manifest_run, and context_id.

frontend/src/lib/components/chat/ChatInput.svelte (1)
172-182: Consider using a toast notification instead of alert().

Using `alert()` for error display is blocking and provides a poor user experience. If the codebase has a toast/notification system, consider using that instead for non-blocking error feedback.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@frontend/src/lib/components/chat/ChatInput.svelte` around lines 172 - 182, Replace the blocking alert fallback in the getFileParts() catch block with the app's non-blocking toast/notification API: instead of alert(...), call the project's toast function (e.g., showToast/notify/toast) with the same error message, keep the existing console.error, fileInputEl reset and files = [] logic, and ensure you import or reference the toast helper at the top of ChatInput.svelte so the notification displays the message `Error reading file: ${err instanceof Error ? err.message : String(err)}` in a non-blocking way.

frontend/src/lib/components/voice/VoiceCallPanel.svelte (2)
66-68: Redundant Float32Array copy.
`samples` is already a `Float32Array` from `pcm16ToFloat32()`, so creating a new `Float32Array(samples)` is an unnecessary copy.

♻️ Suggested fix

```diff
 const samples = pcm16ToFloat32(pcmBuffer);
 if (!samples.length) {
   return;
 }
 const audioBuffer = ctx.createBuffer(1, samples.length, sampleRate);
-const channelData = new Float32Array(samples);
-audioBuffer.copyToChannel(channelData, 0);
+audioBuffer.copyToChannel(samples, 0);
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@frontend/src/lib/components/voice/VoiceCallPanel.svelte` around lines 66 - 68, The code is creating an unnecessary copy with new Float32Array(samples); update the block that creates audioBuffer and copies channel data to use the existing Float32Array returned by pcm16ToFloat32() directly: remove the intermediate channelData variable and call audioBuffer.copyToChannel(samples, 0) (keeping ctx.createBuffer(...) and the samples variable intact).
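For reference, the per-sample scaling behind `pcm16ToFloat32` can be sketched outside the browser. This is an illustrative Python version (the real conversion lives in the frontend's TypeScript); the asymmetric divisors mirror the int16 range discussed in the review:

```python
import struct


def pcm16_to_float(pcm_bytes: bytes) -> list[float]:
    """Decode little-endian signed 16-bit PCM into floats in [-1.0, 1.0]."""
    count = len(pcm_bytes) // 2
    samples = struct.unpack(f"<{count}h", pcm_bytes[: count * 2])
    # Negative samples divide by 0x8000, positive by 0x7FFF, matching the
    # asymmetric int16 range (-32768..32767).
    return [s / 0x8000 if s < 0 else s / 0x7FFF for s in samples]


data = struct.pack("<3h", -32768, 0, 32767)
print(pcm16_to_float(data))  # [-1.0, 0.0, 1.0]
```

Because the decoded values are already floats in the target range, copying them again before playback (the `new Float32Array(samples)` flagged above) adds no information.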
146-165: Consider awaiting AudioContext.close() to ensure proper cleanup.
`playbackContext.close()` returns a Promise. While the current code uses `void` to suppress the floating-promise warning, not awaiting could potentially leave cleanup incomplete if another effect runs before closure finishes.

♻️ Suggested improvement

```diff
 $effect(() => {
   if (!["idle", "ended", "error"].includes($voiceState)) {
     return;
   }
   playbackNextTime = 0;
   if (playbackContext) {
-    void playbackContext.close();
+    playbackContext.close().catch(() => {
+      // Ignore close errors during cleanup
+    });
     playbackContext = null;
   }
```

Alternatively, if synchronous cleanup is needed, consider tracking the close promise and preventing new contexts until it resolves.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@frontend/src/lib/components/voice/VoiceCallPanel.svelte` around lines 146 - 165, The effect currently calls playbackContext.close() with void, which discards the Promise and can leave cleanup incomplete; change it to await the close call (or capture and await its Promise) before setting playbackContext = null so the AudioContext fully closes before further actions run, and if needed add a temporary flag or store the close Promise to prevent creating a new context while close is pending (refer to $effect, playbackContext, and playbackContext.close()) while leaving the existing timeout/clearing logic for fallbackSpeechTimer and lastFallbackSpoken unchanged.

frontend/src/lib/components/chat/ChatWindow.svelte (1)
174-176: Consider adding a comment clarifying the intentional consumption pattern.

The empty `for await` loop consumes the async generator to completion but discards individual updates. While the existing comment hints at this, a clearer explanation would help future maintainers understand this is intentional fire-and-forget streaming consumption.

```diff
 try {
   for await (const _update of sendAgentMessage(message, contextId, { fileParts })) {
-    // Process updates if needed
+    // Intentionally consume generator to completion; individual updates
+    // are handled elsewhere (e.g., via store subscriptions or callbacks).
   }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@frontend/src/lib/components/chat/ChatWindow.svelte` around lines 174 - 176, The empty for-await loop over sendAgentMessage(message, contextId, { fileParts }) intentionally consumes the async generator to completion while discarding per-update values; update the code by replacing the vague comment with a concise clarifying comment that states this is deliberate "fire-and-forget" consumption (e.g., to ensure streaming completes and side-effects run) and note that updates are intentionally ignored so maintainers don't attempt to process yielded values in sendAgentMessage.

bindu/extensions/voice/service_factory.py (1)
87-103: Broad exception catch in fallback logic may mask unexpected errors.

Catching `Exception` on line 89 will catch any error, including programming bugs (e.g., `AttributeError`, `TypeError`). Consider catching more specific exceptions, or at least logging the exception type to aid debugging.

♻️ Suggested improvement

```diff
 try:
     return _create_tts_service_for_provider(provider, config)
-except Exception as primary_error:
+except (ValueError, ImportError) as primary_error:
     if fallback_provider not in {"", "none", provider}:
         logger.warning(
             "Primary TTS provider failed; attempting fallback",
             provider=provider,
             fallback_provider=fallback_provider,
             error=str(primary_error),
+            error_type=type(primary_error).__name__,
         )
```

This narrows the fallback trigger to configuration/import issues rather than all exceptions.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/extensions/voice/service_factory.py` around lines 87 - 103, The try/except around _create_tts_service_for_provider(provider, config) is catching all Exception (primary_error) and may hide programming errors; change it to catch only expected errors (e.g., ImportError, ConfigurationError, or a specific TTSSetupError) or at minimum log the exception type before attempting the fallback; keep the fallback behavior using _create_tts_service_for_provider(fallback_provider, config) and when catching fallback_error raise a RuntimeError from fallback_error as before, and ensure logger.warning includes type(primary_error).__name__ (and/or re-raise non-expected exceptions) so unexpected exceptions like TypeError/AttributeError aren't silently handled.

bindu/extensions/voice/redis_session_manager.py (2)
340-367: Consider using a pipeline for get_active_count to reduce round-trips.

The current implementation makes N+1 Redis calls (1 `SMEMBERS` + N `GET` calls), which could be slow with many sessions. A Redis pipeline would batch these into fewer round-trips.

♻️ Proposed optimization using pipeline

```diff
 async def get_active_count(self) -> int:
     """Return the number of sessions that are not ended."""
     if not self._redis_client:
         return 0
     try:
         members = await self._redis_client.smembers(REDIS_ACTIVE_SET_KEY)
         if not members:
             return 0
         count = 0
         stale_members: list[str] = []
-        for key in members:
-            data = await self._redis_client.get(key)
-            if data:
-                session = self._deserialize_session(key.split(":")[-1], data)
-                if session.state != "ended":
-                    count += 1
-            else:
-                stale_members.append(key)
+        # Batch GET calls using a pipeline
+        async with self._redis_client.pipeline() as pipe:
+            for key in members:
+                pipe.get(key)
+            results = await pipe.execute()
+
+        for key, data in zip(members, results):
+            if data:
+                session = self._deserialize_session(key.split(":")[-1], data)
+                if session.state != "ended":
+                    count += 1
+            else:
+                stale_members.append(key)
         if stale_members:
             await self._redis_client.srem(REDIS_ACTIVE_SET_KEY, *stale_members)
         return count
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/extensions/voice/redis_session_manager.py` around lines 340 - 367, The get_active_count method currently does SMEMBERS then N separate GETs causing N+1 round-trips; change it to batch the GETs with a Redis pipeline on self._redis_client (use pipeline() or multi/execute variants available) after retrieving members from REDIS_ACTIVE_SET_KEY, enqueue GET for each member, execute the pipeline to get all values in one round-trip, then iterate results to count non-ended sessions via _deserialize_session(session_id, data) and collect stale_members; if stale_members non-empty, call srem on REDIS_ACTIVE_SET_KEY (can also be pipelined) to remove them, and preserve existing RedisError handling and return behavior.
401-441: Same optimization opportunity applies to _expire_timed_out_sessions.

This method also iterates members with individual `GET` calls. Consider using a pipeline here as well for consistency with any optimization applied to `get_active_count`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/extensions/voice/redis_session_manager.py` around lines 401 - 441, The _expire_timed_out_sessions method currently fetches each session key with individual GET calls which is inefficient; change it to batch-fetch session data via a pipeline (or MGET) using the same _redis_client so you first call smembers(REDIS_ACTIVE_SET_KEY), then pipeline.get or mget for all members, deserialize each result with _deserialize_session, compute expirations, and finally perform deletes/evalsha/srem in the pipeline as well (respecting _delete_session_script_sha) to minimize round-trips and keep logic in the same function.

bindu/settings.py (1)
1107-1111: Consider using None instead of an empty string for the redis_url default.

The `redis_url` defaults to `""` (empty string), but `session_factory.py` checks for truthiness (`if not redis_url`). While this works, using `None` would be more explicit and consistent with other optional URL fields like `postgres_url` in `StorageSettings`.

♻️ Proposed fix

```diff
 # Session storage backend (for multi-worker compatibility)
 session_backend: Literal["memory", "redis"] = "memory"
-redis_url: str = ""  # e.g., "redis://localhost:6379/0"
+redis_url: str | None = None  # e.g., "redis://localhost:6379/0"
 redis_session_ttl: int = 300  # seconds, TTL for session keys in Redis
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/settings.py` around lines 1107 - 1111, The redis_url default is an empty string which relies on truthiness checks in session_factory.py; change the redis_url declaration in Settings (redis_url: str = "") to use Optional[str] with a default of None (e.g., redis_url: Optional[str] = None) so it's explicit and consistent with other optional URL fields like StorageSettings.postgres_url, and update any type imports if needed; keep the existing truthiness usage in session_factory.py (if not redis_url) which will continue to work with None.

bindu/extensions/voice/agent_bridge.py (1)
33-39: Consider moving timeout constants to VoiceSettings for consistency.

While these defaults work as fallbacks, they duplicate values that are also in `VoiceSettings` (`agent_timeout_secs`, `utterance_timeout_secs`). The coding guidelines emphasize using `app_settings` for configuration values.

However, since these are used as constructor defaults when `voice_settings=None` (e.g., in unit tests), keeping them as module constants is acceptable for testability.

The curly quote in line 38 (') flagged by Ruff is a false positive - this is intentional typography in user-facing text.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/extensions/voice/agent_bridge.py` around lines 33 - 39, Move the timeout defaults into VoiceSettings and use those settings in the AgentBridge constructor instead of module-level duplicates: update AgentBridge (constructor/init) to read agent_timeout_secs and utterance_timeout_secs from the provided voice_settings (VoiceSettings) and fall back to the existing module constants MAX_HISTORY_TURNS, DEFAULT_FIRST_TOKEN_TIMEOUT_SECONDS, DEFAULT_TOTAL_RESPONSE_TIMEOUT_SECONDS when voice_settings is None (to preserve testability); remove redundant hard-coded timeout values in AgentBridge and ensure references to DEFAULT_CANCELLATION_GRACE_SECONDS and DEFAULT_THINKING_TEXT/DEFAULT_TIMEOUT_FALLBACK_TEXT remain as module-level UI fallbacks, and keep the curly quote in DEFAULT_TIMEOUT_FALLBACK_TEXT unchanged.

frontend/src/lib/services/voice-client.ts (2)
58-61: Unsafe type assertion to access baseUrl.

The double cast `(agentAPI as unknown as { baseUrl?: string })` is a code smell indicating the type definitions may be incomplete. Consider either:
- Exporting `baseUrl` properly from `agent-api`
- Adding it to the `agentAPI` type definition
If
agentAPIshould exposebaseUrl, update its type definition. Otherwise, create a dedicated function:// In agent-api.ts or a shared config module export function getAgentBaseUrl(): string { return agentAPI.baseUrl ?? "http://localhost:3773"; }Then use it consistently:
-const baseUrl = - (agentAPI as unknown as { baseUrl?: string }).baseUrl || "http://localhost:3773"; +const baseUrl = getAgentBaseUrl();🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@frontend/src/lib/services/voice-client.ts` around lines 58 - 61, The double type-assertion accessing baseUrl in startSession is unsafe; either add baseUrl to the agentAPI type definition or expose a small accessor from the agent-api module (e.g., getAgentBaseUrl) that returns agentAPI.baseUrl ?? "http://localhost:3773", then replace the inline cast in startSession with that accessor (or the correctly typed agentAPI.baseUrl) to remove the (agentAPI as unknown as { baseUrl?: string }) hack and centralize the default URL logic.
451-458: Remove unused function convertFloat32ToPcm16.

This function is defined at module level but never called anywhere in the codebase. The PCM16 conversion is already handled inside the AudioWorklet processor (lines 349-354).
Proposed fix
```diff
-function convertFloat32ToPcm16(input: Float32Array): ArrayBuffer {
-  const pcm = new Int16Array(input.length);
-  for (let index = 0; index < input.length; index += 1) {
-    const sample = Math.max(-1, Math.min(1, input[index] ?? 0));
-    pcm[index] = sample < 0 ? sample * 0x8000 : sample * 0x7fff;
-  }
-  return pcm.buffer;
-}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@frontend/src/lib/services/voice-client.ts` around lines 451 - 458, Remove the unused module-level function convertFloat32ToPcm16: delete its entire definition since PCM16 conversion is already performed in the AudioWorklet processor; ensure there are no remaining references (imports/exports/tests) to convertFloat32ToPcm16 and run the build/type-check to confirm nothing else depends on it.

bindu/utils/notifications.py (1)
38-70: Remove unused function _resolve_and_check_ip.

This function is defined but never called anywhere in the codebase. The `_resolve_and_validate_destination` static method handles the SSRF defense logic and is the one actually invoked by the validation flow.

♻️ Proposed fix

```diff
-def _resolve_and_check_ip(hostname: str) -> str:
-    """Resolve hostname to an IP address and verify it is not in a blocked range.
-
-    Returns the resolved IP address string so callers can connect directly to it,
-    preventing a DNS-rebinding attack where a second resolution (inside urlopen) could
-    return a different—potentially internal—address.
-
-    Args:
-        hostname: The hostname to resolve and validate.
-
-    Returns:
-        The resolved IP address as a string.
-
-    Raises:
-        ValueError: If the hostname cannot be resolved or resolves to a blocked range.
-    """
-    try:
-        resolved_ip = str(socket.getaddrinfo(hostname, None)[0][4][0])
-        addr = ipaddress.ip_address(resolved_ip)
-    except (socket.gaierror, ValueError) as exc:
-        raise ValueError(
-            f"Push notification URL hostname could not be resolved: {exc}"
-        ) from exc
-
-    for blocked in _BLOCKED_NETWORKS:
-        if addr in blocked:
-            raise ValueError(
-                f"Push notification URL resolves to a blocked address range "
-                f"({addr} is in {blocked}). Internal addresses are not allowed."
-            )
-
-    return resolved_ip
-
-
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/utils/notifications.py` around lines 38 - 70, The function _resolve_and_check_ip is unused and duplicates SSRF logic handled by the _resolve_and_validate_destination static method; remove the _resolve_and_check_ip function entirely to avoid dead code and potential confusion, and ensure any other code that might intend to perform hostname resolution uses _resolve_and_validate_destination instead (search for references to _resolve_and_check_ip and replace calls with _resolve_and_validate_destination if any exist).
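The resolve-once-and-pin defense described above (resolve the hostname a single time, validate the address, and connect to the pinned IP so a second DNS lookup cannot be rebound to an internal address) can be sketched as follows. This is an illustrative simplification, not the project's `_resolve_and_validate_destination`; the blocked ranges and the injectable `resolver` parameter are assumptions added so the sketch runs offline:

```python
import ipaddress
import socket

# Illustrative private/loopback ranges; the real module defines its own list
_BLOCKED_NETWORKS = [
    ipaddress.ip_network("127.0.0.0/8"),
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
    ipaddress.ip_network("169.254.0.0/16"),
]


def resolve_and_validate(hostname: str, resolver=None) -> str:
    """Resolve once, reject blocked ranges, and return the pinned IP string."""
    resolver = resolver or (lambda h: socket.getaddrinfo(h, None)[0][4][0])
    try:
        resolved_ip = str(resolver(hostname))
        addr = ipaddress.ip_address(resolved_ip)
    except (socket.gaierror, ValueError) as exc:
        raise ValueError(f"hostname could not be resolved: {exc}") from exc
    for blocked in _BLOCKED_NETWORKS:
        if addr in blocked:
            raise ValueError(f"{addr} is in blocked range {blocked}")
    return resolved_ip  # callers connect to this IP, not the hostname


# Injected resolver keeps the example deterministic and offline
print(resolve_and_validate("example.com", resolver=lambda h: "93.184.216.34"))
```

Connecting to the returned IP (rather than re-resolving the hostname inside the HTTP client) is what closes the DNS-rebinding window.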
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: edac1545-2eb2-4d66-b8ae-876c272f33d7
⛔ Files ignored due to path filters (4)
- `bindu/grpc/generated/agent_handler_pb2.py` is excluded by `!**/generated/**`
- `bindu/grpc/generated/agent_handler_pb2.pyi` is excluded by `!**/generated/**`
- `bindu/grpc/generated/agent_handler_pb2_grpc.py` is excluded by `!**/generated/**`
- `uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (93)
- .github/workflows/ci.yml
- .gitignore
- .pre-commit-config.yaml
- bindu/common/protocol/types.py
- bindu/extensions/__init__.py
- bindu/extensions/did/did_agent_extension.py
- bindu/extensions/voice/__init__.py
- bindu/extensions/voice/agent_bridge.py
- bindu/extensions/voice/audio_config.py
- bindu/extensions/voice/pipeline_builder.py
- bindu/extensions/voice/redis_session_manager.py
- bindu/extensions/voice/service_factory.py
- bindu/extensions/voice/session_factory.py
- bindu/extensions/voice/session_manager.py
- bindu/extensions/voice/voice_agent_extension.py
- bindu/penguin/bindufy.py
- bindu/penguin/config_validator.py
- bindu/server/applications.py
- bindu/server/endpoints/utils.py
- bindu/server/endpoints/voice_endpoints.py
- bindu/server/handlers/message_handlers.py
- bindu/server/metrics.py
- bindu/server/scheduler/memory_scheduler.py
- bindu/server/workers/base.py
- bindu/settings.py
- bindu/utils/__init__.py
- bindu/utils/capabilities.py
- bindu/utils/config/settings.py
- bindu/utils/logging.py
- bindu/utils/notifications.py
- bindu/utils/retry.py
- bindu/utils/task_telemetry.py
- bindu/utils/worker/messages.py
- bindu/utils/worker/parts.py
- frontend/.env.example
- frontend/.gitignore
- frontend/src/lib/buildPrompt.ts
- frontend/src/lib/components/ShareConversationModal.svelte
- frontend/src/lib/components/chat/ChatInput.svelte
- frontend/src/lib/components/chat/ChatMessage.svelte
- frontend/src/lib/components/chat/ChatWindow.svelte
- frontend/src/lib/components/chat/ContextList.svelte
- frontend/src/lib/components/voice/LiveTranscript.svelte
- frontend/src/lib/components/voice/VoiceCallButton.svelte
- frontend/src/lib/components/voice/VoiceCallPanel.svelte
- frontend/src/lib/constants/mime.ts
- frontend/src/lib/jobs/refresh-conversation-stats.ts
- frontend/src/lib/migrations/lock.ts
- frontend/src/lib/migrations/routines/02-update-assistants-models.ts
- frontend/src/lib/migrations/routines/10-update-reports-assistantid.ts
- frontend/src/lib/server/config.ts
- frontend/src/lib/server/database.ts
- frontend/src/lib/server/files/downloadFile.ts
- frontend/src/lib/server/files/uploadFile.ts
- frontend/src/lib/server/models.ts
- frontend/src/lib/services/voice-client.ts
- frontend/src/lib/stores/chat.ts
- frontend/src/lib/stores/voice.ts
- frontend/src/lib/types/ConvSidebar.ts
- frontend/src/lib/types/Model.ts
- frontend/src/lib/types/Session.ts
- frontend/src/lib/utils/agentMessageHandler.ts
- frontend/src/lib/utils/tree/addChildren.spec.ts
- frontend/src/lib/utils/tree/addSibling.spec.ts
- frontend/src/lib/utils/tree/buildSubtree.spec.ts
- frontend/src/lib/utils/tree/convertLegacyConversation.spec.ts
- frontend/src/routes/+layout.svelte
- frontend/src/routes/+page.svelte
- frontend/src/routes/api/v2/conversations/[id]/message/[messageId]/+server.ts
- frontend/src/routes/api/v2/export/+server.ts
- frontend/src/routes/conversation/[id]/+page.svelte
- frontend/src/routes/conversation/[id]/+server.ts
- frontend/src/routes/settings/(nav)/authentication/+page.svelte
- frontend/src/routes/settings/(nav)/negotiation/+page.svelte
- pyproject.toml
- pytest.ini
- scripts/test_voice_providers.sh
- tests/conftest_stubs.py
- tests/integration/grpc/test_grpc_e2e.py
- tests/unit/extensions/__init__.py
- tests/unit/extensions/voice/__init__.py
- tests/unit/extensions/voice/test_agent_bridge.py
- tests/unit/extensions/voice/test_service_factory.py
- tests/unit/extensions/voice/test_session_manager.py
- tests/unit/extensions/voice/test_voice_endpoints.py
- tests/unit/extensions/voice/test_voice_extension.py
- tests/unit/extensions/voice/test_voice_websocket_integration.py
- tests/unit/penguin/test_bindufy.py
- tests/unit/penguin/test_config_validator.py
- tests/unit/server/scheduler/test_memory_scheduler.py
- tests/unit/test_minimax_example.py
- tests/unit/utils/test_notifications.py
- tests/unit/utils/worker/test_messages.py
```python
voice_config = validated_config.get("voice")
if voice_config and isinstance(voice_config, dict):
    from bindu.extensions.voice import VoiceAgentExtension
```
Voice config gate can silently skip initialization.
`if voice_config and isinstance(voice_config, dict)` skips valid empty dict configs and also ignores non-dict truthy values instead of failing fast.
Proposed patch
- voice_config = validated_config.get("voice")
- if voice_config and isinstance(voice_config, dict):
+ voice_config = validated_config.get("voice")
+ if voice_config is not None:
+ if not isinstance(voice_config, dict):
+ raise ValueError("Invalid voice configuration: 'voice' must be an object")
          from bindu.extensions.voice import VoiceAgentExtension

As per coding guidelines: "Validate all external input and use type hints for input validation in Python files".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@bindu/penguin/bindufy.py` around lines 497 - 499, The current conditional
around voice_config in bindufy.py silently skips valid empty dicts and ignores
non-dict values instead of failing fast; change the logic to explicitly check
for the presence of the "voice" key in validated_config, assign voice_config =
validated_config["voice"], then validate its type with isinstance(voice_config,
dict) and raise a clear TypeError (or ValueError) if it is not a dict so invalid
types fail fast; only after successful type validation (including allowing an
empty dict) proceed to import/use VoiceAgentExtension.
```python
class ConfigError(ValueError):
    """Raised when bindufy configuration is invalid."""
```
ConfigError is still not the single failure type for config validation.
validate_and_process() now introduces a dedicated exception, but later branches in the same flow still raise bare ValueError (_validate_field_types(), _validate_auth_config(), _validate_hydra_config()). Callers catching ConfigError will still miss a large class of invalid-config failures.
🔧 Suggested fix
- processed_config = cls._process_complex_fields(processed_config)
-
- # Validate field types
- cls._validate_field_types(processed_config)
+ try:
+ processed_config = cls._process_complex_fields(processed_config)
+ cls._validate_field_types(processed_config)
+ except ConfigError:
+ raise
+ except ValueError as exc:
+        raise ConfigError(f"ConfigError: {exc}") from exc

Also applies to: 72-94
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@bindu/penguin/config_validator.py` around lines 16 - 17, The code introduces
ConfigError but other validation functions still raise ValueError, so callers
catching ConfigError miss many failures; update the validation flow so all
config validation failures raise ConfigError instead of ValueError by changing
the exception types thrown in _validate_field_types(), _validate_auth_config(),
and _validate_hydra_config() (and any other validation helpers called by
validate_and_process()) to raise ConfigError with descriptive messages; ensure
validate_and_process() continues to either raise or re-raise ConfigError so all
invalid-config paths use the same exception class.
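The wrap-and-rethrow pattern in the suggested fix looks like this in isolation. The validator below is a stand-in for helpers like `_validate_field_types`, which still raise bare `ValueError`:

```python
class ConfigError(ValueError):
    """Single failure type for configuration validation."""


def _validate_field_types(config: dict) -> None:
    # Stand-in for a legacy helper that still raises bare ValueError
    if not isinstance(config.get("name"), str):
        raise ValueError("'name' must be a string")


def validate_and_process(config: dict) -> dict:
    try:
        _validate_field_types(config)
    except ConfigError:
        raise  # already the canonical type, don't double-wrap
    except ValueError as exc:
        raise ConfigError(f"ConfigError: {exc}") from exc
    return config


try:
    validate_and_process({"name": 42})
except ConfigError as err:
    print(err)  # ConfigError: 'name' must be a string
```

Note the `except ConfigError: raise` arm must come before `except ValueError`, since `ConfigError` subclasses `ValueError`; callers can then catch `ConfigError` alone and see every invalid-config path.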
```python
"""Voice session REST + WebSocket endpoints.

Provides:
    POST /voice/session/start → Start a new voice session
    DELETE /voice/session/{session_id} → End a voice session
    GET /voice/session/{session_id}/status → Get session status
    WS /ws/voice/{session_id} → Bidirectional audio stream
"""

from __future__ import annotations

import json
import asyncio
import importlib
import secrets
import time
from dataclasses import dataclass
from typing import TYPE_CHECKING, Any
from uuid import uuid4

from fastapi import HTTPException
from starlette.requests import Request
from starlette.responses import JSONResponse, Response
from starlette.websockets import WebSocket, WebSocketDisconnect, WebSocketState

from bindu.settings import app_settings
from bindu.utils.logging import get_logger
from bindu.server.endpoints.utils import handle_endpoint_errors

if TYPE_CHECKING:
    from bindu.server.applications import BinduApplication

logger = get_logger("bindu.server.endpoints.voice")

_BACKGROUND_TASKS: set[asyncio.Task[None]] = set()
_VOICE_RATE_LIMIT_LOCK = asyncio.Lock()
_VOICE_RATE_LIMIT_IP_BUCKET: dict[str, list[float]] = {}
_VOICE_RATE_LIMIT_REDIS_LOCK = asyncio.Lock()
_VOICE_RATE_LIMIT_REDIS_CLIENT: Any | None = None

try:
    import redis.asyncio as _redis_async  # type: ignore[import-not-found]

    _REDIS_AVAILABLE = True
except Exception:  # pragma: no cover
    _redis_async = None  # type: ignore[assignment]
    _REDIS_AVAILABLE = False


_RATE_LIMIT_LUA = """
-- Sliding-window rate limit using a sorted set.
-- KEYS[1] = zset key
-- ARGV[1] = now (seconds)
-- ARGV[2] = cutoff (seconds)
-- ARGV[3] = member (unique)
-- ARGV[4] = limit (int)
redis.call('ZREMRANGEBYSCORE', KEYS[1], 0, tonumber(ARGV[2]))
redis.call('ZADD', KEYS[1], tonumber(ARGV[1]), ARGV[3])
local count = redis.call('ZCARD', KEYS[1])
redis.call('EXPIRE', KEYS[1], 120)
if count > tonumber(ARGV[4]) then
    return 0
end
return 1
"""


async def _get_rate_limit_redis_client() -> Any | None:
    """Lazy init a Redis client for rate limiting (best-effort)."""
    global _VOICE_RATE_LIMIT_REDIS_CLIENT
    if _VOICE_RATE_LIMIT_REDIS_CLIENT is not None:
        return _VOICE_RATE_LIMIT_REDIS_CLIENT

    if not _REDIS_AVAILABLE:
        return None

    redis_url = app_settings.voice.redis_url
    if not redis_url:
        return None

    async with _VOICE_RATE_LIMIT_REDIS_LOCK:
        if _VOICE_RATE_LIMIT_REDIS_CLIENT is not None:
            return _VOICE_RATE_LIMIT_REDIS_CLIENT
        try:
            _VOICE_RATE_LIMIT_REDIS_CLIENT = _redis_async.from_url(  # type: ignore[union-attr]
                redis_url,
                encoding="utf-8",
                decode_responses=True,
            )
            return _VOICE_RATE_LIMIT_REDIS_CLIENT
        except Exception:
            logger.exception("Failed to initialize Redis rate limiter client")
            _VOICE_RATE_LIMIT_REDIS_CLIENT = None
            return None


async def _rate_limit_allow_ip(
    ip: str,
    *,
    limit_per_minute: int,
    now: float | None = None,
) -> bool:
    """Allow an IP through the sliding-window rate limiter."""
    if limit_per_minute <= 0:
        return True
    t = float(time.time() if now is None else now)
    cutoff = t - 60.0

    if app_settings.voice.rate_limit_backend == "redis":
        client = await _get_rate_limit_redis_client()
        if client is not None:
            key = f"voice:rate_limit:ip:{ip}"
            member = f"{t}:{time.time_ns()}"
            try:
                allowed = await client.eval(
                    _RATE_LIMIT_LUA,
                    1,
                    key,
                    t,
                    cutoff,
                    member,
                    int(limit_per_minute),
                )
                return bool(allowed)
            except Exception:
                logger.exception("Redis rate limiter failed; falling back to memory")

    async with _VOICE_RATE_LIMIT_LOCK:
        window = _VOICE_RATE_LIMIT_IP_BUCKET.get(ip, [])
        window = [ts for ts in window if ts >= cutoff]

        if not window:
            _VOICE_RATE_LIMIT_IP_BUCKET.pop(ip, None)

        if len(window) >= limit_per_minute:
            if window:
                _VOICE_RATE_LIMIT_IP_BUCKET[ip] = window
            return False
        window.append(t)
        _VOICE_RATE_LIMIT_IP_BUCKET[ip] = window
        return True


@dataclass
class _VoiceControlState:
    muted: bool = False
    stopped: bool = False
    suppress_audio_until: float = 0.0


class _FilteredWebSocket:
    """WebSocket wrapper that filters inbound frames.

    Used to keep Pipecat's transport focused on audio frames while this endpoint
    consumes and handles JSON control messages (start/mute/unmute/stop/etc).
    """

    def __init__(self, websocket: WebSocket, queue: asyncio.Queue[dict[str, Any]]):
        self._ws = websocket
        self._queue = queue

    def __getattr__(self, name: str) -> Any:
        return getattr(self._ws, name)

    async def receive(self) -> dict[str, Any]:
        msg = await self._queue.get()
        msg_type = msg.get("type", "unknown")
        data_bytes = msg.get("bytes")
        logger.info(
            f"FilteredWebSocket.receive: type={msg_type}, bytes={len(data_bytes) if data_bytes else 0}"
        )
        return msg

    async def receive_text(self) -> str:
        message = await self.receive()
        if message.get("type") == "websocket.disconnect":
            raise WebSocketDisconnect(code=message.get("code", 1000))
        text = message.get("text")
        if text is None:
            raise RuntimeError("Expected text WebSocket message")
        return text

    async def receive_bytes(self) -> bytes:
        message = await self.receive()
        if message.get("type") == "websocket.disconnect":
            logger.info("FilteredWebSocket.receive_bytes: disconnect received")
            raise WebSocketDisconnect(code=message.get("code", 1000))
        data = message.get("bytes")
        logger.debug(
            f"FilteredWebSocket.receive_bytes: got {len(data) if data else 0} bytes"
        )
        if data is None:
            raise RuntimeError("Expected bytes WebSocket message")
        return data


try:
    from pipecat.serializers.base_serializer import (
        FrameSerializer as _PipecatFrameSerializer,
    )
except Exception:  # pragma: no cover

    class _PipecatFrameSerializer:  # type: ignore[too-many-ancestors]
        def __init__(self, *args: Any, **kwargs: Any) -> None:
            pass


class _RawAudioFrameSerializer(_PipecatFrameSerializer):
    """Serializer for raw PCM audio over WebSocket.

    Converts inbound binary frames into Pipecat input audio frames and sends
    outbound audio frames as raw bytes for browser playback.
    """

    def __init__(self, sample_rate: int, num_channels: int):
        super().__init__()
        self._sample_rate = sample_rate
        self._num_channels = num_channels

    async def setup(self, _frame: Any) -> None:
        return
```
|
|
||
| async def serialize(self, frame: Any) -> str | bytes | None: | ||
| from pipecat.frames.frames import ( | ||
| OutputAudioRawFrame, | ||
| OutputTransportMessageFrame, | ||
| OutputTransportMessageUrgentFrame, | ||
| ) | ||
|
|
||
| if isinstance(frame, OutputAudioRawFrame): | ||
| return frame.audio | ||
|
|
||
| if isinstance( | ||
| frame, (OutputTransportMessageFrame, OutputTransportMessageUrgentFrame) | ||
| ): | ||
| return json.dumps(frame.message) | ||
|
|
||
| return None | ||
|
|
||
| async def deserialize(self, data: str | bytes) -> Any | None: | ||
| from pipecat.frames.frames import InputAudioRawFrame, InputTransportMessageFrame | ||
|
|
||
| if isinstance(data, (bytes, bytearray)): | ||
| return InputAudioRawFrame( | ||
| audio=bytes(data), | ||
| sample_rate=self._sample_rate, | ||
| num_channels=self._num_channels, | ||
| ) | ||
|
|
||
| if isinstance(data, str): | ||
| try: | ||
| payload = json.loads(data) | ||
| except json.JSONDecodeError: | ||
| return None | ||
| return InputTransportMessageFrame(message=payload) | ||
|
|
||
| return None | ||
|
|
||
|
|
||
| async def _send_json( | ||
| websocket: Any, | ||
| payload: dict[str, Any], | ||
| send_lock: asyncio.Lock | None = None, | ||
| ) -> bool: | ||
| """Send a JSON payload over a WebSocket safely. | ||
|
|
||
| If a lock is provided, it is used to serialize concurrent send_text calls. | ||
| """ | ||
| try: | ||
| client_state = getattr(websocket, "client_state", None) | ||
| application_state = getattr(websocket, "application_state", None) | ||
| if ( | ||
| client_state == WebSocketState.DISCONNECTED | ||
| or application_state == WebSocketState.DISCONNECTED | ||
| ): | ||
| return False | ||
| except Exception: | ||
| # If state inspection fails, still attempt the send below. | ||
| pass | ||
|
|
||
| data = json.dumps(payload) | ||
| try: | ||
| if send_lock is None: | ||
| await websocket.send_text(data) | ||
| return True | ||
| async with send_lock: | ||
| await websocket.send_text(data) | ||
| return True | ||
| except (WebSocketDisconnect, RuntimeError): | ||
| # RuntimeError is raised by Starlette after a close frame is sent. | ||
| return False | ||
|
|
||
|
|
||
| def _trim_overlap_text(previous: str, current: str) -> str: | ||
| """Trim repeated word-overlap between adjacent transcript chunks. | ||
|
|
||
| Returns only the delta portion of ``current`` that does not duplicate | ||
| the suffix of ``previous``. | ||
| """ | ||
| prev = (previous or "").strip() | ||
| curr = (current or "").strip() | ||
| if not prev: | ||
| return curr | ||
| if prev == curr: | ||
| return "" | ||
|
|
||
| prev_tokens = prev.split() | ||
| curr_tokens = curr.split() | ||
| max_overlap = min(len(prev_tokens), len(curr_tokens)) | ||
|
|
||
| for overlap in range(max_overlap, 0, -1): | ||
| if prev_tokens[-overlap:] == curr_tokens[:overlap]: | ||
| return " ".join(curr_tokens[overlap:]).strip() | ||
|
|
||
| return curr | ||
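The word-level overlap trim can be exercised in isolation. This is a self-contained copy of the function above, scanning from the largest possible overlap down so the longest shared run wins:

```python
def trim_overlap_text(previous: str, current: str) -> str:
    """Return only the part of ``current`` not duplicating the end of ``previous``."""
    prev = (previous or "").strip()
    curr = (current or "").strip()
    if not prev:
        return curr
    if prev == curr:
        return ""

    prev_tokens = prev.split()
    curr_tokens = curr.split()
    # Try the longest overlap first so "a b" + "a b c" yields just "c".
    for overlap in range(min(len(prev_tokens), len(curr_tokens)), 0, -1):
        if prev_tokens[-overlap:] == curr_tokens[:overlap]:
            return " ".join(curr_tokens[overlap:]).strip()
    return curr
```

For example, `trim_overlap_text("hello there", "there friend")` keeps only `"friend"`, since `"there"` repeats the suffix of the previous chunk.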
|
|
||
|
|
||
| def _extract_bearer_token(value: str | None) -> str | None: | ||
| if not value: | ||
| return None | ||
| parts = value.split() | ||
| if len(parts) == 2 and parts[0].lower() == "bearer": | ||
| token = parts[1].strip() | ||
| return token or None | ||
| return None | ||
|
|
||
|
|
||
| def _extract_ws_session_token(websocket: WebSocket) -> str | None: | ||
| """Extract session token from WS headers (subprotocol or Authorization).""" | ||
| subprotocols = websocket.headers.get("sec-websocket-protocol") | ||
| if subprotocols: | ||
| # Starlette exposes the raw comma-separated list. | ||
| for item in subprotocols.split(","): | ||
| token = item.strip() | ||
| if token: | ||
| return token | ||
|
|
||
| return _extract_bearer_token(websocket.headers.get("authorization")) | ||
|
|
||
|
|
||
| def _has_ws_subprotocols(websocket: WebSocket) -> bool: | ||
| return bool(websocket.headers.get("sec-websocket-protocol")) | ||
|
|
||
|
|
||
| def _classify_voice_pipeline_error(exc: Exception) -> tuple[str, int]: | ||
| """Map internal pipeline exceptions to a user-facing error + close code.""" | ||
| module = type(exc).__module__ | ||
| name = type(exc).__name__ | ||
| message = str(exc) | ||
|
|
||
| if isinstance(exc, asyncio.TimeoutError) or name == "TimeoutError": | ||
| return ("Voice pipeline timed out", 1011) | ||
|
|
||
| if module.startswith("websockets") or "ConnectionClosed" in name: | ||
| return ("Voice provider connection closed", 1011) | ||
|
|
||
| lowered = message.lower() | ||
| if "deepgram" in lowered and ("disconnect" in lowered or "closed" in lowered): | ||
| return ("Deepgram connection closed", 1011) | ||
| if "elevenlabs" in lowered and ("disconnect" in lowered or "closed" in lowered): | ||
| return ("ElevenLabs connection closed", 1011) | ||
|
|
||
| return ("Voice pipeline error", 1011) | ||
|
|
||
|
|
||
| def _voice_preflight_error() -> str | None: | ||
| """Return a user-facing error if voice is not runnable in this process.""" | ||
| # Ensure optional dependency group is installed. | ||
| try: | ||
| importlib.import_module("pipecat") | ||
| except Exception: | ||
| return "Voice dependencies are not installed. Install with: pip install 'bindu[voice]'" | ||
|
|
||
| # Ensure provider keys exist (provider-dependent). | ||
| if app_settings.voice.stt_provider == "deepgram" and not app_settings.voice.stt_api_key: | ||
| return "VOICE__STT_API_KEY is required when VOICE__STT_PROVIDER=deepgram" | ||
|
|
||
| tts_provider = app_settings.voice.tts_provider | ||
| if tts_provider == "elevenlabs" and not app_settings.voice.tts_api_key: | ||
| return "VOICE__TTS_API_KEY is required when VOICE__TTS_PROVIDER=elevenlabs" | ||
| if tts_provider == "azure": | ||
| if not app_settings.voice.azure_tts_api_key: | ||
| return "VOICE__AZURE_TTS_API_KEY is required when VOICE__TTS_PROVIDER=azure" | ||
| if not app_settings.voice.azure_tts_region: | ||
| return "VOICE__AZURE_TTS_REGION is required when VOICE__TTS_PROVIDER=azure" | ||
| if tts_provider not in {"elevenlabs", "piper", "azure"}: | ||
| return f"Unsupported VOICE__TTS_PROVIDER={tts_provider!r}" | ||
|
|
||
| return None | ||
|
|
||
|
|
||
| async def _send_error_and_close( | ||
| websocket: WebSocket, | ||
| message: str, | ||
| *, | ||
| send_lock: asyncio.Lock, | ||
| close_code: int = 1008, | ||
| ) -> None: | ||
| try: | ||
| await _send_json(websocket, {"type": "error", "message": message}, send_lock) | ||
| finally: | ||
| try: | ||
| await websocket.close(code=close_code, reason=message) | ||
| except Exception: | ||
| pass | ||
|
|
||
|
|
||
| async def _voice_control_reader( | ||
| websocket: WebSocket, | ||
| inbound_queue: asyncio.Queue[dict[str, Any]], | ||
| control: _VoiceControlState, | ||
| *, | ||
| vad_enabled: bool, | ||
| send_lock: asyncio.Lock, | ||
| on_user_text: Any | None = None, | ||
| ) -> None: | ||
| """Read from the real WebSocket and push only audio frames to the queue.""" | ||
| max_binary_frame_bytes = 64 * 1024 | ||
| max_frames_per_second = 50 | ||
| max_frames_in_flight = 10 | ||
|
|
||
| window_started_at = time.monotonic() | ||
| window_count = 0 | ||
|
|
||
| while True: | ||
| message: dict[str, Any] = await websocket.receive() | ||
| message_type = message.get("type") | ||
|
|
||
| if message_type == "websocket.disconnect": | ||
| await inbound_queue.put(message) | ||
| return | ||
|
|
||
| if message_type != "websocket.receive": | ||
| await inbound_queue.put(message) | ||
| continue | ||
|
|
||
| text = message.get("text") | ||
| if text is not None: | ||
| try: | ||
| payload = json.loads(text) | ||
| except json.JSONDecodeError: | ||
| await _send_error_and_close( | ||
| websocket, | ||
| "Malformed JSON control frame", | ||
| send_lock=send_lock, | ||
| ) | ||
| return | ||
|
|
||
| frame_type = payload.get("type") | ||
| if frame_type == "mute": | ||
| control.muted = True | ||
| await _send_json( | ||
| websocket, {"type": "state", "state": "muted"}, send_lock | ||
| ) | ||
| continue | ||
| if frame_type == "unmute": | ||
| control.muted = False | ||
| await _send_json( | ||
| websocket, {"type": "state", "state": "listening"}, send_lock | ||
| ) | ||
| continue | ||
| if frame_type == "stop": | ||
| control.stopped = True | ||
| await _send_json( | ||
| websocket, {"type": "state", "state": "ended"}, send_lock | ||
| ) | ||
| try: | ||
| await websocket.close() | ||
| except Exception: | ||
| logger.exception("Failed to close WebSocket after stop frame") | ||
| return | ||
| if frame_type == "user_text": | ||
| user_text = payload.get("text") | ||
| if isinstance(user_text, str) and user_text.strip() and on_user_text: | ||
| try: | ||
| await on_user_text(user_text.strip()) | ||
| except Exception: | ||
| logger.exception("Failed to handle user_text control frame") | ||
| continue | ||
| if frame_type in {"start", "commit_turn"}: | ||
| # If VAD is disabled, the transport/STT may rely on explicit turn boundary | ||
| # control frames (e.g. commit_turn). Forward these to the transport. | ||
| if not vad_enabled: | ||
| await inbound_queue.put(message) | ||
| continue | ||
|
|
||
| # Unknown control frame: ignore to preserve forward-compat. | ||
| continue | ||
|
|
||
| data = message.get("bytes") | ||
| if data is None: | ||
| logger.debug("Voice control reader: no bytes in message, skipping") | ||
| continue | ||
|
|
||
| logger.info(f"Voice control reader: received audio frame, {len(data)} bytes") | ||
|
|
||
| now_monotonic = time.monotonic() | ||
| if control.muted or now_monotonic < float(control.suppress_audio_until): | ||
| continue | ||
|
|
||
| if len(data) > max_binary_frame_bytes: | ||
| await _send_error_and_close( | ||
| websocket, | ||
| f"Audio frame too large (max {max_binary_frame_bytes} bytes)", | ||
| send_lock=send_lock, | ||
| ) | ||
| return | ||
|
|
||
| if now_monotonic - window_started_at >= 1.0: | ||
| window_started_at = now_monotonic | ||
| window_count = 0 | ||
| window_count += 1 | ||
| if window_count > max_frames_per_second: | ||
| await _send_error_and_close( | ||
| websocket, | ||
| f"Too many audio frames per second (max {max_frames_per_second})", | ||
| send_lock=send_lock, | ||
| ) | ||
| return | ||
|
|
||
| if inbound_queue.qsize() >= max_frames_in_flight: | ||
| await _send_error_and_close( | ||
| websocket, | ||
| f"Too many audio frames in flight (max {max_frames_in_flight})", | ||
| send_lock=send_lock, | ||
| ) | ||
| return | ||
|
|
||
| await inbound_queue.put(message) | ||
|
|
||
|
|
||
| # --------------------------------------------------------------------------- | ||
| # REST Endpoints | ||
| # --------------------------------------------------------------------------- | ||
|
|
||
|
|
||
| @handle_endpoint_errors("start voice session") | ||
| async def voice_session_start(app: BinduApplication, request: Request) -> Response: | ||
| """Start a new voice session. | ||
|
|
||
| Request body (optional JSON): | ||
| { "context_id": "<uuid>" } | ||
|
|
||
| Returns: | ||
| { "session_id": "...", "ws_url": "ws://host/ws/voice/{session_id}" } | ||
| """ | ||
| session_manager = getattr(app, "_voice_session_manager", None) | ||
| if session_manager is None: | ||
| return JSONResponse( | ||
| {"error": "Voice extension is not enabled"}, status_code=501 | ||
| ) | ||
|
|
||
| if not app_settings.voice.enabled: | ||
| return JSONResponse({"error": "Voice is disabled"}, status_code=501) | ||
|
|
||
| preflight_error = _voice_preflight_error() | ||
| if preflight_error: | ||
| return JSONResponse({"error": preflight_error}, status_code=503) | ||
|
|
||
| # Per-IP rate limit (best-effort; request.client may be missing in tests/proxies) | ||
| client_host = request.client.host if request.client else None | ||
| if client_host: | ||
| allowed = await _rate_limit_allow_ip( | ||
| client_host, | ||
| limit_per_minute=int(app_settings.voice.rate_limit_per_ip_per_minute), | ||
| ) | ||
| if not allowed: | ||
| return JSONResponse({"error": "Rate limit exceeded"}, status_code=429) | ||
|
|
||
| # Parse optional context_id from body | ||
| context_id = str(uuid4()) | ||
| raw_body = await request.body() | ||
| if raw_body: | ||
| try: | ||
| body = json.loads(raw_body) | ||
| except json.JSONDecodeError as exc: | ||
| raise HTTPException( | ||
| status_code=400, detail="Malformed JSON payload" | ||
| ) from exc | ||
|
|
||
| if isinstance(body, dict) and "context_id" in body: | ||
| context_id = str(body["context_id"]) | ||
|
|
||
| session_token: str | None = None | ||
| session_token_expires_at: float | None = None | ||
| if app_settings.voice.session_auth_required: | ||
| session_token = secrets.token_urlsafe(32) | ||
| session_token_expires_at = time.time() + max( | ||
| 1, int(app_settings.voice.session_token_ttl) | ||
| ) | ||
|
|
||
| try: | ||
| session = await session_manager.create_session( | ||
| context_id, | ||
| session_token=session_token, | ||
| session_token_expires_at=session_token_expires_at, | ||
| ) | ||
| except RuntimeError as exc: | ||
| return JSONResponse({"error": str(exc)}, status_code=429) | ||
|
|
||
| # Build WebSocket URL from request | ||
| scheme = "wss" if request.url.scheme == "https" else "ws" | ||
| # Use hostname from request, fallback to client host, or raise error if unavailable | ||
| host = request.url.hostname or (request.client.host if request.client else None) | ||
| if not host: | ||
| return JSONResponse( | ||
| {"error": "Unable to determine host for WebSocket URL"}, | ||
| status_code=400, | ||
| ) | ||
| ws_url = f"{scheme}://{host}" | ||
| if request.url.port: | ||
| ws_url += f":{request.url.port}" | ||
| ws_url += f"/ws/voice/{session.id}" | ||
|
|
||
| return JSONResponse( | ||
| { | ||
| "session_id": session.id, | ||
| "context_id": session.context_id, | ||
| "ws_url": ws_url, | ||
| **({"session_token": session_token} if session_token else {}), | ||
| }, | ||
| status_code=201, | ||
| ) | ||
|
|
||
|
|
||
| @handle_endpoint_errors("end voice session") | ||
| async def voice_session_end(app: BinduApplication, request: Request) -> Response: | ||
| """End a voice session. | ||
|
|
||
| Path params: | ||
| session_id: The voice session ID | ||
|
|
||
| Returns: | ||
| { "status": "ended" } | ||
| """ | ||
| session_manager = getattr(app, "_voice_session_manager", None) | ||
| if session_manager is None: | ||
| return JSONResponse( | ||
| {"error": "Voice extension is not enabled"}, status_code=501 | ||
| ) | ||
|
|
||
| session_id = request.path_params["session_id"] | ||
| session = await session_manager.get_session(session_id) | ||
| if session is None: | ||
| return JSONResponse({"error": "Session not found"}, status_code=404) | ||
|
|
||
| await session_manager.end_session(session_id) | ||
| return JSONResponse({"status": "ended"}) | ||
|
|
||
|
|
||
| @handle_endpoint_errors("voice session status") | ||
| async def voice_session_status(app: BinduApplication, request: Request) -> Response: | ||
| """Get voice session status. | ||
|
|
||
| Path params: | ||
| session_id: The voice session ID | ||
|
|
||
| Returns: | ||
| { "session_id": "...", "state": "...", "duration": 12.3, "context_id": "..." } | ||
| """ | ||
| session_manager = getattr(app, "_voice_session_manager", None) | ||
| if session_manager is None: | ||
| return JSONResponse( | ||
| {"error": "Voice extension is not enabled"}, status_code=501 | ||
| ) | ||
|
|
||
| session_id = request.path_params["session_id"] | ||
| session = await session_manager.get_session(session_id) | ||
| if session is None: | ||
| return JSONResponse({"error": "Session not found"}, status_code=404) | ||
|
|
||
| return JSONResponse( | ||
| { | ||
| "session_id": session.id, | ||
| "context_id": session.context_id, | ||
| "state": session.state, | ||
| "duration": round(session.duration_seconds, 1), | ||
| "task_id": session.task_id, | ||
| } | ||
| ) | ||
|
|
||
|
|
||
| # --------------------------------------------------------------------------- | ||
| # WebSocket Handler | ||
| # --------------------------------------------------------------------------- | ||
|
|
||
|
|
||
| async def voice_websocket(websocket: WebSocket) -> None: | ||
| """Bidirectional voice WebSocket handler using Pipecat pipeline.""" | ||
| app: BinduApplication = websocket.app # type: ignore[assignment] | ||
| session_id = websocket.path_params.get("session_id", "") | ||
|
|
||
| session_manager = getattr(app, "_voice_session_manager", None) | ||
| if session_manager is None: | ||
| await websocket.close(code=1008, reason="Voice extension is not enabled") | ||
| return | ||
| if not app_settings.voice.enabled: | ||
| await websocket.close(code=1008, reason="Voice is disabled") | ||
| return | ||
|
|
||
| preflight_error = _voice_preflight_error() | ||
| if preflight_error: | ||
| await websocket.close(code=1008, reason=preflight_error) | ||
| return | ||
|
|
||
| session = await session_manager.get_session(session_id) | ||
| if session is None: | ||
| await websocket.close(code=1008, reason="Invalid session ID") | ||
| return | ||
|
|
||
| # Per-IP rate limit on websocket connects (best-effort) | ||
| client_host = websocket.client.host if websocket.client else None | ||
| if client_host: | ||
| allowed = await _rate_limit_allow_ip( | ||
| client_host, | ||
| limit_per_minute=int(app_settings.voice.rate_limit_per_ip_per_minute), | ||
| ) | ||
| if not allowed: | ||
| await websocket.close(code=1008, reason="Rate limit exceeded") | ||
| return | ||
|
|
||
| if app_settings.voice.session_auth_required: | ||
| expected = getattr(session, "session_token", None) | ||
| expires_at = getattr(session, "session_token_expires_at", None) | ||
| provided = _extract_ws_session_token(websocket) | ||
|
|
||
| # If the client sent Sec-WebSocket-Protocol, the server should select one. | ||
| # We select the token itself as the negotiated subprotocol. | ||
| if provided and _has_ws_subprotocols(websocket): | ||
| await websocket.accept(subprotocol=provided) | ||
| else: | ||
| await websocket.accept() | ||
| if not provided: | ||
| try: | ||
| provided = (await websocket.receive_text()).strip() | ||
| except Exception: | ||
| await websocket.close(code=1008, reason="Missing session token") | ||
| return | ||
|
|
||
| if not expected or provided != expected: | ||
| await websocket.close(code=1008, reason="Invalid session token") | ||
| return | ||
| if isinstance(expires_at, (int, float)) and time.time() > float(expires_at): | ||
| await websocket.close(code=1008, reason="Session token expired") | ||
| return | ||
| else: | ||
| await websocket.accept() | ||
|
|
||
| await session_manager.update_state(session_id, "active") | ||
|
|
||
| voice_ext = getattr(app, "_voice_ext", None) | ||
| manifest = getattr(app, "manifest", None) | ||
| if voice_ext is None or manifest is None or not hasattr(manifest, "run"): | ||
| await websocket.send_text( | ||
| json.dumps({"type": "error", "message": "Agent not configured for voice"}) | ||
| ) | ||
| await websocket.close(code=1011) | ||
| return | ||
|
|
||
| from bindu.extensions.voice.pipeline_builder import build_voice_pipeline | ||
| from pipecat.transports.websocket.fastapi import ( | ||
| FastAPIWebsocketTransport, | ||
| FastAPIWebsocketParams, | ||
| ) | ||
| from pipecat.pipeline.pipeline import Pipeline | ||
| from pipecat.pipeline.task import PipelineTask | ||
| from pipecat.pipeline.runner import PipelineRunner | ||
|
|
||
| # Notify UI we are listening | ||
| send_lock = asyncio.Lock() | ||
| await _send_json(websocket, {"type": "state", "state": "listening"}, send_lock) | ||
|
|
||
| async def _on_user_transcript(text: str) -> None: | ||
| await _send_json( | ||
| websocket, | ||
| {"type": "transcript", "role": "user", "text": text, "is_final": True}, | ||
| send_lock, | ||
| ) | ||
|
|
||
| async def _on_agent_response(text: str) -> None: | ||
| control.suppress_audio_until = max( | ||
| float(control.suppress_audio_until), time.monotonic() + 0.6 | ||
| ) | ||
| await _send_json( | ||
| websocket, | ||
| {"type": "agent_response", "text": text, "task_id": session.task_id}, | ||
| send_lock, | ||
| ) | ||
|
|
||
| async def _on_state_change(state: str) -> None: | ||
| if state == "agent-speaking": | ||
| control.suppress_audio_until = max( | ||
| float(control.suppress_audio_until), time.monotonic() + 1.0 | ||
| ) | ||
| elif state == "listening": | ||
| control.suppress_audio_until = max( | ||
| float(control.suppress_audio_until), time.monotonic() + 0.35 | ||
| ) | ||
| await _send_json(websocket, {"type": "state", "state": state}, send_lock) | ||
|
|
||
| async def _on_agent_transcript(text: str, is_final: bool) -> None: | ||
| control.suppress_audio_until = max( | ||
| float(control.suppress_audio_until), | ||
| time.monotonic() + (0.6 if is_final else 0.9), | ||
| ) | ||
| logger.info( | ||
| f"on_agent_transcript: got text='{text[:50]}...' is_final={is_final}" | ||
| ) | ||
| await _send_json( | ||
| websocket, | ||
| { | ||
| "type": "transcript", | ||
| "role": "agent", | ||
| "text": text, | ||
| "is_final": is_final, | ||
| }, | ||
| send_lock, | ||
| ) | ||
|
|
||
| components = build_voice_pipeline( | ||
| voice_ext=voice_ext, | ||
| manifest_run=manifest.run, | ||
| context_id=session.context_id, | ||
| on_state_change=_on_state_change, | ||
| on_user_transcript=_on_user_transcript, | ||
| on_agent_response=_on_agent_response, | ||
| on_agent_transcript=_on_agent_transcript, | ||
| ) | ||
|
|
||
| control = _VoiceControlState() | ||
| inbound_queue: asyncio.Queue[dict[str, Any]] = asyncio.Queue(maxsize=10) | ||
| filtered_ws = _FilteredWebSocket(websocket, inbound_queue) | ||
|
|
||
| async def _handle_user_text(text: str) -> None: | ||
| await _send_json( | ||
| websocket, | ||
| {"type": "transcript", "role": "user", "text": text, "is_final": True}, | ||
| send_lock, | ||
| ) | ||
| response = await components["bridge"].process_transcription( | ||
| text, emit_frames=True | ||
| ) | ||
| if response: | ||
| await _on_agent_response(response) | ||
|
|
||
| reader_task: asyncio.Task[Any] | None = None | ||
| runner_task: asyncio.Task[Any] | None = None | ||
|
|
||
| try: | ||
| transport = FastAPIWebsocketTransport( | ||
| websocket=filtered_ws, # type: ignore[arg-type] | ||
| params=FastAPIWebsocketParams( | ||
| audio_in_enabled=True, | ||
| audio_out_enabled=True, | ||
| audio_in_sample_rate=app_settings.voice.sample_rate, | ||
| audio_out_sample_rate=app_settings.voice.sample_rate, | ||
| add_wav_header=False, | ||
| serializer=_RawAudioFrameSerializer( | ||
| sample_rate=app_settings.voice.sample_rate, | ||
| num_channels=app_settings.voice.audio_channels, | ||
| ), | ||
| ), | ||
| ) | ||
|
|
||
| logger.info( | ||
| f"Voice pipeline: transport created, sample_rate={app_settings.voice.sample_rate}" | ||
| ) | ||
| logger.info( | ||
| f"Voice pipeline: components - STT={type(components['stt']).__name__}, " | ||
| f"Bridge={type(components['bridge']).__name__}, TTS={type(components['tts']).__name__}" | ||
| ) | ||
|
|
||
| pipeline_components = [transport.input()] | ||
| if components.get("vad"): | ||
| pipeline_components.append(components["vad"]) | ||
| logger.info("Voice pipeline: VAD enabled and added") | ||
|
|
||
| pipeline_components.extend( | ||
| [ | ||
| components["stt"], | ||
| components["bridge"], | ||
| components["tts"], | ||
| transport.output(), | ||
| ] | ||
| ) | ||
| logger.info( | ||
| f"Voice pipeline: total components in pipeline: {len(pipeline_components)}" | ||
| ) | ||
|
|
||
| pipeline = Pipeline(pipeline_components) | ||
| logger.info("Voice pipeline: Pipeline created successfully") | ||
|
|
||
| task = PipelineTask(pipeline) | ||
| logger.info("Voice pipeline: PipelineTask created, starting runner...") | ||
| runner = PipelineRunner() | ||
|
|
||
| runner_task = asyncio.create_task(runner.run(task)) | ||
|
|
||
| # Start reading control/audio only after pipeline runner is live so | ||
| # user_text cannot emit TTS frames before StartFrame initialization. | ||
| reader_task = asyncio.create_task( | ||
| _voice_control_reader( | ||
| websocket, | ||
| inbound_queue, | ||
| control, | ||
| vad_enabled=app_settings.voice.vad_enabled, | ||
| send_lock=send_lock, | ||
| on_user_text=_handle_user_text, | ||
| ) | ||
| ) | ||
|
|
||
| async with asyncio.timeout(float(app_settings.voice.session_timeout)): | ||
| await runner_task | ||
| except WebSocketDisconnect: | ||
| logger.info(f"Voice WebSocket disconnected: {session_id}") | ||
| except TimeoutError: | ||
| logger.info(f"Voice session timed out: {session_id}") | ||
| if websocket.client_state == WebSocketState.CONNECTED: | ||
| await _send_json( | ||
| websocket, | ||
| {"type": "error", "message": "Voice session timed out"}, | ||
| send_lock, | ||
| ) | ||
| except Exception as e: | ||
| logger.exception(f"Error in voice WebSocket: {session_id}: {e}") | ||
| if websocket.client_state == WebSocketState.CONNECTED: | ||
| user_message, close_code = _classify_voice_pipeline_error(e) | ||
| await _send_json( | ||
| websocket, {"type": "error", "message": user_message}, send_lock | ||
| ) | ||
| try: | ||
| await websocket.close(code=close_code, reason=user_message) | ||
| except Exception: | ||
| pass | ||
| finally: | ||
| if runner_task and not runner_task.done(): | ||
| runner_task.cancel() | ||
| try: | ||
| await runner_task | ||
| except asyncio.CancelledError: | ||
| pass | ||
| except Exception: | ||
| logger.exception("Voice pipeline runner task failed") | ||
|
|
||
| if reader_task and not reader_task.done(): | ||
| reader_task.cancel() | ||
| try: | ||
| await reader_task | ||
| except asyncio.CancelledError: | ||
| pass | ||
| except Exception: | ||
| logger.exception("Voice control reader task failed") | ||
|
|
||
| if websocket.client_state == WebSocketState.CONNECTED: | ||
| try: | ||
| await _send_json( | ||
| websocket, {"type": "state", "state": "ended"}, send_lock | ||
| ) | ||
| except Exception: | ||
| pass | ||
|
|
||
| try: | ||
| await session_manager.update_state(session_id, "ending") | ||
| except Exception: | ||
| pass | ||
| try: | ||
| await components["bridge"].cleanup_background_tasks() | ||
| except Exception: | ||
| pass | ||
| try: | ||
| await session_manager.end_session(session_id) | ||
| except Exception: | ||
| pass | ||
| if websocket.client_state == WebSocketState.CONNECTED: | ||
| try: | ||
| await websocket.close() | ||
| except Exception: | ||
| pass |
🛠️ Refactor suggestion | 🟠 Major
Split this module before it grows further.
At ~1,000 LOC, this file already mixes rate limiting, serializer glue, control parsing, REST handlers, and WebSocket orchestration. That makes lifecycle bugs like the early-return leak below much harder to contain; please extract at least the rate limiter/control reader/REST handlers into separate modules.
As per coding guidelines, "Keep Python files under ~500 lines of code - extract modules when larger".
🧰 Tools
🪛 Ruff (0.15.9)
[warning] 45-45: Do not catch blind exception: Exception
(BLE001)
[warning] 201-201: Do not catch blind exception: Exception
(BLE001)
[error] 277-279: try-except-pass detected, consider logging the exception
(S110)
[warning] 277-277: Do not catch blind exception: Exception
(BLE001)
[warning] 371-371: Do not catch blind exception: Exception
(BLE001)
[error] 404-405: try-except-pass detected, consider logging the exception
(S110)
[warning] 404-404: Do not catch blind exception: Exception
(BLE001)
[warning] 470-470: return inside finally blocks cause exceptions to be silenced
(B012)
[warning] 735-735: Do not catch blind exception: Exception
(BLE001)
[error] 932-933: try-except-pass detected, consider logging the exception
(S110)
[warning] 932-932: Do not catch blind exception: Exception
(BLE001)
[error] 958-959: try-except-pass detected, consider logging the exception
(S110)
[warning] 958-958: Do not catch blind exception: Exception
(BLE001)
[error] 963-964: try-except-pass detected, consider logging the exception
(S110)
[warning] 963-963: Do not catch blind exception: Exception
(BLE001)
[error] 967-968: try-except-pass detected, consider logging the exception
(S110)
[warning] 967-967: Do not catch blind exception: Exception
(BLE001)
[error] 971-972: try-except-pass detected, consider logging the exception
(S110)
[warning] 971-971: Do not catch blind exception: Exception
(BLE001)
[error] 976-977: try-except-pass detected, consider logging the exception
(S110)
[warning] 976-976: Do not catch blind exception: Exception
(BLE001)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@bindu/server/endpoints/voice_endpoints.py` around lines 1 - 977, This file is
too large and mixes unrelated responsibilities; split it into smaller modules by
extracting the rate-limiter logic (functions and globals around
_get_rate_limit_redis_client, _RATE_LIMIT_LUA, _VOICE_RATE_LIMIT_LOCK,
_VOICE_RATE_LIMIT_IP_BUCKET, _VOICE_RATE_LIMIT_REDIS_* and _rate_limit_allow_ip)
into a voice_rate_limiter module, move WebSocket control parsing/queueing and
helper classes (class _VoiceControlState, class _FilteredWebSocket, function
_voice_control_reader, _extract_ws_session_token, _has_ws_subprotocols) into a
voice_control module, and move REST handlers (voice_session_start,
voice_session_end, voice_session_status) into a voice_rest module; keep
serializer and pipeline orchestration (class _RawAudioFrameSerializer,
voice_websocket and pipeline building) in the remaining voice_ws module. Update
all local imports/usages to import the moved symbols (e.g., import
_rate_limit_allow_ip, _voice_control_reader, _FilteredWebSocket,
_RawAudioFrameSerializer, voice_session_start/end/status) from their new modules
and run tests to ensure no circular imports and that session_manager/
app_settings references are preserved. Ensure any module-level globals are
encapsulated or re-exported and adjust tests/import paths accordingly.
| onkeydown={(event) => { | ||
| if (event.key === "Enter" || event.key === " ") { | ||
| event.preventDefault(); | ||
| isTapped = !isTapped; | ||
| } | ||
| }} |
🧩 Analysis chain
🏁 Script executed:
cat -n frontend/src/lib/components/chat/ChatMessage.svelte | sed -n '250,300p'
Repository: GetBindu/Bindu
Length of output: 1958
🏁 Script executed:
cat -n frontend/src/lib/components/chat/ChatMessage.svelte | sed -n '296,350p'
Repository: GetBindu/Bindu
Length of output: 3023
🏁 Script executed:
find . -name "MarkdownRenderer*" -type f
Repository: GetBindu/Bindu
Length of output: 183
🏁 Script executed:
cat -n ./frontend/src/lib/components/chat/MarkdownRenderer.svelte | head -100
Repository: GetBindu/Bindu
Length of output: 2513
🏁 Script executed:
find . -name "MarkdownBlock.svelte" -type f
Repository: GetBindu/Bindu
Length of output: 113
🏁 Script executed:
cat -n ./frontend/src/lib/components/chat/MarkdownBlock.svelte | head -150
Repository: GetBindu/Bindu
Length of output: 830
Gate the wrapper's keyboard handler to prevent blocking nested control activation.
The wrapper div contains rendered markdown that may include links and buttons. When users press Enter or Space on these nested controls, the event bubbles to this handler, which calls preventDefault() and prevents the nested element from being activated. Check event.target !== event.currentTarget to only handle keyboard events fired directly on the wrapper itself.
Suggested fix
onkeydown={(event) => {
+ if (event.target !== event.currentTarget) {
+ return;
+ }
if (event.key === "Enter" || event.key === " ") {
event.preventDefault();
isTapped = !isTapped;
}
 }}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
onkeydown={(event) => {
  if (event.key === "Enter" || event.key === " ") {
    event.preventDefault();
    isTapped = !isTapped;
  }
}}

onkeydown={(event) => {
  if (event.target !== event.currentTarget) {
    return;
  }
  if (event.key === "Enter" || event.key === " ") {
    event.preventDefault();
    isTapped = !isTapped;
  }
}}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@frontend/src/lib/components/chat/ChatMessage.svelte` around lines 274 - 279,
The wrapper div's onkeydown handler is intercepting Enter/Space for nested
interactive elements; update the handler (the onkeydown on the wrapper that
toggles isTapped) to first check that event.target === event.currentTarget (or
event.target !== event.currentTarget and return early) so it only handles
keyboard events originating on the wrapper itself, and only then call
preventDefault() and toggle isTapped; this ensures nested links/buttons can
receive Enter/Space normally.
const startToken = ++startTokenCounter;
const localClient = new VoiceClient();
endVoiceSession() doesn't cancel an in-flight start.
The token checks at Lines 138-149 can never trip today because only startVoiceSession() increments startTokenCounter, and isStarting blocks a second start. If the user closes the panel while startSession() or connect() is pending, that pending path still installs localClient afterward.
🛠️ Minimal fix
export async function endVoiceSession(): Promise<void> {
+ startTokenCounter += 1;
+ isStarting = false;
const active = client;
   client = null;
Also applies to: 138-149, 170-189
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@frontend/src/lib/stores/voice.ts` around lines 104 - 105, endVoiceSession()
currently doesn't prevent a pending startVoiceSession() from installing a
localClient because startTokenCounter is only incremented in startVoiceSession;
fix by making startTokenCounter the single source of truth for cancellation:
increment startTokenCounter (or set a new cancel token) inside
endVoiceSession(), capture the current startToken in startVoiceSession() before
any awaits, and after each await (especially after new VoiceClient() and await
localClient.connect()) compare the captured startToken with the global
startTokenCounter; only assign to the module-level localClient and flip
isStarting when the tokens match. Update checks in startVoiceSession (the
sections around startToken, localClient creation and connection) and in
endVoiceSession to use this token comparison so an in-flight start is ignored if
endVoiceSession was called.
fileStream.on("data", (chunk) => {
  if (chunk instanceof Uint8Array) {
    chunks.push(chunk);
    return;
  }
  reject(new Error("Unexpected chunk type from avatar download stream"));
});
fileStream.on("error", (err) =>
  reject(err ?? new Error("Avatar download failed"))
);
fileStream.on("end", () => resolve(Buffer.concat(chunks)));
Stop stream and cleanup listeners when rejecting avatar download
At Line 134, rejection on unexpected chunk type does not stop the stream or remove listeners. The stream can continue emitting after rejection, causing unnecessary work and brittle behavior.
Suggested fix
const fileBuffer = await new Promise<Buffer>((resolve, reject) => {
const chunks: Uint8Array[] = [];
- fileStream.on("data", (chunk) => {
- if (chunk instanceof Uint8Array) {
- chunks.push(chunk);
- return;
- }
- reject(new Error("Unexpected chunk type from avatar download stream"));
- });
- fileStream.on("error", (err) =>
- reject(err ?? new Error("Avatar download failed"))
- );
- fileStream.on("end", () => resolve(Buffer.concat(chunks)));
+ function cleanup() {
+ fileStream.off("data", onData);
+ fileStream.off("error", onError);
+ fileStream.off("end", onEnd);
+ }
+
+ function onData(chunk: unknown) {
+ if (chunk instanceof Uint8Array) {
+ chunks.push(chunk);
+ return;
+ }
+ cleanup();
+ fileStream.destroy();
+ reject(new Error("Unexpected chunk type from avatar download stream"));
+ }
+
+ function onError(err?: Error | null) {
+ cleanup();
+ reject(err ?? new Error("Avatar download failed"));
+ }
+
+ function onEnd() {
+ cleanup();
+ resolve(Buffer.concat(chunks));
+ }
+
+ fileStream.on("data", onData);
+ fileStream.once("error", onError);
+ fileStream.once("end", onEnd);
 });
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
fileStream.on("data", (chunk) => {
  if (chunk instanceof Uint8Array) {
    chunks.push(chunk);
    return;
  }
  reject(new Error("Unexpected chunk type from avatar download stream"));
});
fileStream.on("error", (err) =>
  reject(err ?? new Error("Avatar download failed"))
);
fileStream.on("end", () => resolve(Buffer.concat(chunks)));

const fileBuffer = await new Promise<Buffer>((resolve, reject) => {
  const chunks: Uint8Array[] = [];
  function cleanup() {
    fileStream.off("data", onData);
    fileStream.off("error", onError);
    fileStream.off("end", onEnd);
  }
  function onData(chunk: unknown) {
    if (chunk instanceof Uint8Array) {
      chunks.push(chunk);
      return;
    }
    cleanup();
    fileStream.destroy();
    reject(new Error("Unexpected chunk type from avatar download stream"));
  }
  function onError(err?: Error | null) {
    cleanup();
    reject(err ?? new Error("Avatar download failed"));
  }
  function onEnd() {
    cleanup();
    resolve(Buffer.concat(chunks));
  }
  fileStream.on("data", onData);
  fileStream.once("error", onError);
  fileStream.once("end", onEnd);
});
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@frontend/src/routes/api/v2/export/`+server.ts around lines 129 - 139, The
rejection path when encountering an unexpected chunk type in the avatar download
stream must stop the stream and remove listeners to avoid continued emissions;
update the handler around fileStream (the "data"/"error"/"end" listeners and the
chunks array) to perform cleanup before rejecting: create a small cleanup step
that removes the attached listeners (or calls fileStream.removeAllListeners) and
destroys the stream (fileStream.destroy(err)) and invoke that cleanup prior to
calling reject(new Error(...)); apply the same cleanup on the "error" and
"end"/"resolve" paths so all outcomes remove listeners and stop the stream.
bfb9c1e to bb0e6fe (Compare)
Actionable comments posted: 9
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
bindu/utils/notifications.py (3)
146-152: ⚠️ Potential issue | 🔴 Critical
Finish the resolved_ip refactor before merging.
Line 152 still uses resolved_ip, but validate_config() no longer returns anything. Even if that were restored, Line 226 still calls _post_once() with the old argument list after its signature was reduced. The send_event() path cannot currently succeed.
Also applies to: 154-175, 211-227
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/utils/notifications.py` around lines 146 - 152, The code still passes a removed/resolved_ip value and calls _post_once with an outdated signature after validate_config() was changed to return nothing; update send_event and any callers (send_event, the block using self.validate_config(config) and the sections that call _post_with_retry and _post_once) to stop expecting or passing resolved_ip: have validate_config(config) only validate, remove resolved_ip from the call to _post_with_retry(config["url"], headers, payload, event) and refactor _post_once(...) call sites to match its new reduced parameter list (use url, headers, payload, event or whatever the current _post_once signature requires), and ensure _post_with_retry delegates to the updated _post_once signature so the send_event path no longer depends on the old resolved_ip argument.
259-286: ⚠️ Potential issue | 🔴 Critical
Add missing urllib imports and fix HTTPError response handling in _post_once().
Lines 265, 271, and 278 reference request.Request(), request.urlopen(), and error.HTTPError without importing them, causing an immediate NameError on any delivery attempt. Additionally, line 282 reads from response, which is only defined in the with block; in the except block, use exc.read() instead.
Fix
+from urllib import error, request
 from urllib.parse import urlparse
...
     except error.HTTPError as exc:
         status = exc.code
         body = b""
         try:
-            body = response.read() or b""
+            body = exc.read() or b""
         except OSError:
             body = b""
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/utils/notifications.py` around lines 259 - 286, Import the missing urllib names and fix the HTTPError handling in _post_once: add imports for urllib.request as request and urllib.error as error (or equivalent) so request.Request and request.urlopen and error.HTTPError are available, and in the except error.HTTPError block call exc.read() (not response.read()) to get the error body, decode it and raise NotificationDeliveryError(status, message) as currently done; ensure this all occurs in the _post_once method that raises NotificationDeliveryError and that the original with request.urlopen(...) as response block remains unchanged.
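The corrected error path can be exercised standalone. The sketch below is a simplified stand-in for the module's _post_once (the function name, 5-second timeout, and return shape are illustrative); its point is that the body of a failed delivery must be read from the HTTPError object, since the response variable from the with block is unbound in the except clause:

```python
from urllib import error, request


def post_once(url: str, payload: bytes, headers: dict) -> tuple[int, bytes]:
    """POST once and return (status, body); on HTTP errors the body
    lives on the exception object, not on the `response` variable."""
    req = request.Request(url, data=payload, headers=headers, method="POST")
    try:
        with request.urlopen(req, timeout=5) as response:
            return response.status, response.read() or b""
    except error.HTTPError as exc:
        try:
            # exc, not response: response was never bound on this path.
            body = exc.read() or b""
        except OSError:
            body = b""
        return exc.code, body
```

HTTPError is itself a file-like response object, so exc.read() yields the error body the server sent.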
141-145: ⚠️ Potential issue | 🔴 Critical
DNS-rebinding SSRF vulnerability: Code doesn't pass resolved IP to urlopen(), allowing independent DNS lookups.
The send_event() docstring (lines 141-145) documents a single-resolution guarantee to prevent DNS-rebinding attacks. However, the implementation fails to enforce this:
- Line 152 references an undefined resolved_ip variable
- validate_config() doesn't return the resolved IP from _resolve_and_validate_destination()
- _post_once() signature (line 259) doesn't have a resolved_ip parameter, despite being called with one at line 226
- Line 271 passes the original url to urlopen(), not a resolved IP
Since urlopen() performs its own DNS lookup (not reusing any pre-resolved IP), a hostname can validate as public at line 261 and rebind to a private address during the connection at line 271, directly violating the documented security mechanism.
This is compounded by other issues: duplicate _resolve_and_check_ip() definitions (lines 39-70, 73-104) and unused imports (urllib.request and urllib.error appear to be missing).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/utils/notifications.py` around lines 141 - 145, The code documents single-resolution DNS to prevent DNS-rebinding but fails to pass the pre-resolved IP through the call chain: fix by returning the resolved IP from _resolve_and_validate_destination() into validate_config() and ensure send_event() passes that resolved_ip into _post_once(); update _post_once(resolved_ip, ...) signature and use the resolved IP when opening the HTTP connection (construct the request to use the resolved IP as the network target and set the original hostname in the Host header so urlopen() does not perform a new DNS lookup). Also remove the duplicate _resolve_and_check_ip() definition and correct imports (ensure urllib.request/urllib.error are imported and unused imports removed). Ensure references to resolved_ip (previously undefined) are removed or replaced with the returned value from validate_config()/ _resolve_and_validate_destination().
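The single-resolution guarantee both findings reference can be enforced by resolving the hostname once, validating that IP, and then connecting to the same IP with the original name carried in the Host header. The sketch below is an illustration of the idea only: build_pinned_request is a hypothetical helper, the is_global check is a simplification of the module's _BLOCKED_NETWORKS, and HTTPS would additionally need SNI/certificate handling that plain urlopen() does not provide.

```python
import ipaddress
import socket
from urllib import request
from urllib.parse import urlparse, urlunparse


def build_pinned_request(url: str) -> request.Request:
    """Resolve the hostname exactly once, validate the resulting IP,
    and rewrite the URL so urlopen() performs no second DNS lookup."""
    parts = urlparse(url)
    host = parts.hostname
    if host is None:
        raise ValueError("URL has no hostname")
    # Single resolution: the IP we validate is the IP we connect to.
    infos = socket.getaddrinfo(host, parts.port or 80, type=socket.SOCK_STREAM)
    ip = infos[0][4][0]
    if not ipaddress.ip_address(ip).is_global:
        raise ValueError(f"refusing to deliver to non-global address {ip}")
    # Pin the connection target to the resolved IP; keep the original
    # hostname in the Host header so virtual hosting still works.
    netloc = ip if parts.port is None else f"{ip}:{parts.port}"
    return request.Request(
        urlunparse(parts._replace(netloc=netloc)), headers={"Host": host}
    )
```

Because the rewritten URL contains only an IP literal, a rebinding DNS record has no second lookup to exploit.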
♻️ Duplicate comments (7)
frontend/src/lib/server/files/uploadFile.ts (1)
18-19: ⚠️ Potential issue | 🔴 Critical
Attach the upload listeners before ending the stream.
upload.end() still runs before the finish/error handlers are registered. If the stream completes quickly, the promise misses the event and hangs until the timeout fires.
Suggested fix
-  upload.write(Buffer.from(buffer));
-  upload.end();
-
   // only return the filename when upload throws a finish event or a 20s time out occurs
   return new Promise((resolve, reject) => {
     const timeoutId = setTimeout(() => {
       if (typeof upload.off === "function") {
         upload.off("finish", handleFinish);
@@
     upload.once("finish", handleFinish);
     upload.once("error", handleError);
+    upload.write(Buffer.from(buffer));
+    upload.end();
   });
Also applies to: 49-50
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@frontend/src/lib/server/files/uploadFile.ts` around lines 18 - 19, The upload stream handlers are being attached after calling upload.end(), which can miss rapid finish/error events; move the event listener registration (upload.on('finish', ...), upload.on('error', ...), or Promise resolve/reject bindings) to before calling upload.write(...) / upload.end() in the upload flow (look for the upload variable and the surrounding upload.write(Buffer.from(buffer)); upload.end(); calls and similarly the block around lines 49-50) so the promise will always observe finish/error events; ensure you attach listeners first, then write and end the stream.
bindu/server/endpoints/voice_endpoints.py (4)
569-581: ⚠️ Potential issue | 🟠 Major
Reject malformed context_id values before calling create_session().
This currently stringifies arbitrary JSON values. null/objects become bogus context IDs, and blank strings fall through to VoiceSessionManager.create_session(), which raises ValueError for non-empty validation and pushes a client input bug into the generic endpoint error path instead of a clean 400.
🛡️ Minimal validation sketch
-from uuid import uuid4
+from uuid import UUID, uuid4
...
-    if isinstance(body, dict) and "context_id" in body:
-        context_id = str(body["context_id"])
+    if isinstance(body, dict) and "context_id" in body:
+        raw_context_id = body["context_id"]
+        if not isinstance(raw_context_id, str) or not raw_context_id.strip():
+            raise HTTPException(
+                status_code=400, detail="context_id must be a non-empty UUID string"
+            )
+        try:
+            context_id = str(UUID(raw_context_id))
+        except ValueError as exc:
+            raise HTTPException(
+                status_code=400, detail="context_id must be a valid UUID"
+            ) from exc
As per coding guidelines, "Validate all external input and use type hints for input validation in Python files".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/server/endpoints/voice_endpoints.py` around lines 569 - 581, The handler currently stringifies any JSON "context_id" (including null, objects, empty strings) and passes it to VoiceSessionManager.create_session(), causing ValueError downstream; update the request parsing to validate and reject invalid context_id values before calling create_session(): after json.loads(raw_body) ensure body["context_id"] exists and is a non-empty string (and optionally match the expected UUID format) and raise HTTPException(status_code=400) for null/empty/non-string/object values so only a validated context_id is forwarded to create_session().
748-827: ⚠️ Potential issue | 🔴 Critical
Start the cleanup scope before the session becomes active.
update_state("active"), the agent configuration check, and build_voice_pipeline() all happen before the only try/finally. Any early return/exception there skips end_session() and can leave the session stuck active until it times out or is manually cleaned up.
♻️ Safer structure
-    await session_manager.update_state(session_id, "active")
-    ...
-    components = build_voice_pipeline(...)
-    ...
-    try:
+    components: dict[str, Any] | None = None
+    try:
+        await session_manager.update_state(session_id, "active")
+        ...
+        components = build_voice_pipeline(...)
         ...
     finally:
-        await components["bridge"].cleanup_background_tasks()
+        if components is not None:
+            await components["bridge"].cleanup_background_tasks()
         await session_manager.end_session(session_id)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/server/endpoints/voice_endpoints.py` around lines 748 - 827, The session is being marked active and the voice pipeline is built before entering the try/finally that calls end_session, so any early return/exception (e.g., agent not configured or build_voice_pipeline failing) can leave the session active; move session_manager.update_state(session_id, "active") and the call to build_voice_pipeline(...) inside the try block that contains the finally/end_session cleanup and perform the agent configuration check (voice_ext and manifest/run) inside that same try so any exception or return will fall through to the finally which calls end_session; reference session_manager.update_state, build_voice_pipeline, manifest.run, voice_ext, and end_session when making this change.
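The restructuring amounts to one rule: once the session is marked active, every fallible step must live inside the try whose finally ends the session. A minimal runnable sketch of that shape (run_session, build_pipeline, and the component keys are illustrative, not the endpoint's real symbols):

```python
import asyncio


async def run_session(session_manager, session_id: str, build_pipeline) -> None:
    """Every step that can fail after the session exists runs inside the
    try, so end_session() executes on every exit path."""
    components = None
    try:
        await session_manager.update_state(session_id, "active")
        components = build_pipeline()  # may raise before any cleanup scope
        await components["run"]()
    finally:
        if components is not None:  # only clean up what was actually built
            await components["cleanup"]()
        await session_manager.end_session(session_id)
```

The None guard keeps the finally block safe when pipeline construction itself fails.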
1-977: 🛠️ Refactor suggestion | 🟠 Major
Split this module before it grows further.
At ~1,000 LOC, this file is already combining rate limiting, REST handlers, websocket control parsing, serializer glue, and pipeline orchestration. That makes lifecycle bugs like the session leak below much harder to isolate, and it's already over the repo's file-size limit.
As per coding guidelines, "Keep Python files under ~500 lines of code - extract modules when larger".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/server/endpoints/voice_endpoints.py` around lines 1 - 977, This file is too large and should be split: extract rate-limiting logic (_get_rate_limit_redis_client, _rate_limit_allow_ip, _RATE_LIMIT_LUA, related globals) into a new voice_rate_limit module; move REST handlers (voice_session_start, voice_session_end, voice_session_status) into voice_rest_endpoints that import the rate limiter; move WebSocket orchestration (voice_websocket, _voice_control_reader, _FilteredWebSocket, _VoiceControlState, _RawAudioFrameSerializer, _send_json, _extract_ws_session_token, _has_ws_subprotocols, _classify_voice_pipeline_error, _voice_preflight_error, _send_error_and_close and helpers like _trim_overlap_text/_extract_bearer_token) into voice_ws_handler; keep small shared helpers (e.g., _trim_overlap_text, _extract_bearer_token) in a voice_utils module; ensure each new module exposes only what others need and update imports in the original package to reference the new modules so lifecycle calls (create_session, update_state, end_session) and pipeline building (build_voice_pipeline) remain intact.
721-737: ⚠️ Potential issue | 🟠 Major
Don't block accepted sockets indefinitely on the fallback token read.
When session_auth_required is on and the client didn't send a token in headers, this accepts the socket and then waits forever on receive_text(). The session timeout at Line 911 has not started yet, so an unauthenticated client can pin accepted connections until something else cleans them up.
As per coding guidelines, "Use bindu.settings.app_settings for all configuration - never hardcode values such as URLs, ports, timeouts, API keys, or feature flags".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/server/endpoints/voice_endpoints.py` around lines 721 - 737, The code accepts a websocket then blocks indefinitely on websocket.receive_text() when no header token is provided; to fix, use a configurable timeout from app_settings (e.g., app_settings.voice.ws_token_read_timeout or similar) and wrap the receive_text call in asyncio.wait_for so unauthenticated clients can't pin open connections: update the branch in voice_endpoints.py where session_auth_required is checked (symbols: session_auth_required, _extract_ws_session_token, websocket.accept, websocket.receive_text) to fetch the timeout from app_settings (do not hardcode), wrap the await websocket.receive_text() with asyncio.wait_for(..., timeout=app_settings.voice.<timeout_field>), handle asyncio.TimeoutError by closing the socket with code 1008 and a clear reason, and add the necessary asyncio import if missing.
frontend/src/lib/stores/voice.ts (2)
170-189: ⚠️ Potential issue | 🟠 Major
Invalidate pending starts when ending the session.
startTokenCounter only changes in startVoiceSession(). If the user hangs up while localClient.startSession() or connect() is still awaiting, that stale path can still assign client = localClient afterward.
🛑 Minimal cancellation fix
 export async function endVoiceSession(): Promise<void> {
+  startTokenCounter += 1;
+  isStarting = false;
   const active = client;
   client = null;
Verify each finding against the current code and only fix it if needed. In `@frontend/src/lib/stores/voice.ts` around lines 170 - 189, endVoiceSession() must invalidate any in-flight start attempts so a stale localClient from startVoiceSession() cannot be assigned to the global client after the user hangs up; increment or change the shared startTokenCounter (the token used in startVoiceSession() to gate assignment) inside endVoiceSession() before awaiting stopSession(), and in startVoiceSession() check that the token from when the async start began still matches startTokenCounter before assigning client = localClient or calling localClient.startSession()/connect() results; reference startTokenCounter, startVoiceSession(), endVoiceSession(), client, localClient, startSession(), and connect() in your change.
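The cancellation pattern behind this fix is language-agnostic: bump a shared generation counter on hang-up, and have the in-flight start re-check its captured token after every await before publishing the client. A Python asyncio sketch of the same idea (class and attribute names are illustrative, mirroring startTokenCounter/client from the store):

```python
import asyncio


class VoiceStarter:
    """Generation-counter cancellation for an async start/stop pair."""

    def __init__(self):
        self.generation = 0
        self.client = None

    async def start(self, connect):
        self.generation += 1
        token = self.generation        # capture before any await
        local = await connect()        # slow connection setup
        if token != self.generation:   # end() ran while we awaited
            return None                # discard the stale client
        self.client = local
        return local

    def end(self):
        self.generation += 1           # invalidate any in-flight start
        self.client = None
```

Because end() bumps the counter, a start that awaited across the hang-up sees a mismatched token and never installs its client.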
107-111: ⚠️ Potential issue | 🟠 Major
Reset mute/audio leftovers before swapping clients.
endVoiceSession() clears isVoiceMuted and latestAgentAudio, but startVoiceSession() does not. Starting a new call after a muted/previous session can leave the UI showing muted while the fresh VoiceClient is live, and the old audio buffer stays around until new audio arrives.
🧹 Minimal reset
   voiceError.set(null);
   transcripts.set([]);
   currentUserTranscript.set("");
   currentAgentTranscript.set("");
+  isVoiceMuted.set(false);
+  latestAgentAudio.set(null);
Verify each finding against the current code and only fix it if needed. In `@frontend/src/lib/stores/voice.ts` around lines 107 - 111, startVoiceSession() currently swaps clients without clearing mute/audio leftovers, causing UI to remain muted and old audio to persist; before or during the client swap in startVoiceSession(), reset the mute and audio stores the same way endVoiceSession() does (clear isVoiceMuted and latestAgentAudio, and any related per-call audio buffers) so the new VoiceClient starts with a clean state — update startVoiceSession() to call the same reset logic used in endVoiceSession() (or extract a small helper) to clear isVoiceMuted and latestAgentAudio when initializing a new session.
🧹 Nitpick comments (5)
.gitignore (1)
226-230: Consider removing redundant patterns.
The root-level patterns en_US-*.onnx and en_US-*.onnx.json will match these files anywhere in the repository, including under examples/voice-agent/. The explicit examples/voice-agent/ patterns on lines 229-230 are redundant and can be removed for cleaner maintenance.
♻️ Proposed cleanup
 # Voice model artifacts downloaded at runtime (Piper)
 en_US-*.onnx
 en_US-*.onnx.json
-examples/voice-agent/en_US-*.onnx
-examples/voice-agent/en_US-*.onnx.json
Verify each finding against the current code and only fix it if needed. In @.gitignore around lines 226 - 230, The .gitignore contains redundant patterns: remove the explicit examples/voice-agent/en_US-*.onnx and examples/voice-agent/en_US-*.onnx.json entries because the root-level patterns en_US-*.onnx and en_US-*.onnx.json already match those files anywhere; keep the root patterns (en_US-*.onnx and en_US-*.onnx.json) and delete the two example-specific lines to clean up maintenance.
frontend/src/lib/stores/chat.ts (1)
229-229: Consider adding a type for the parts parameter.
Using any[] loses type safety. Consider defining or importing a Part type that matches the expected structure (e.g., { kind: 'text' | 'file', ... }).
♻️ Suggested type definition
+type MessagePart = { kind: 'text'; text: string } | { kind: 'file'; file: { bytes: string; mimeType: string; name: string } };
+
-export async function sendMessage(parts: any[]) {
+export async function sendMessage(parts: MessagePart[]) {
Or import from a shared types module if one exists.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@frontend/src/lib/stores/chat.ts` at line 229, The sendMessage function currently accepts parts: any[] which loses type safety; define or import a Part interface (or type) and change the signature to parts: Part[] (or readonly Part[] as appropriate) so callers and the function body get proper typings. Ensure Part covers the expected shape used in the function (e.g., discriminated union with kind: 'text' | 'file', text?: string, fileId?: string, metadata?: Record<string, unknown>, etc.), update any local references inside sendMessage to use the new properties, and import the shared Part type if one exists instead of redefining it.
bindu/server/applications.py (1)
153-154: Consider adding type annotation for _voice_session_manager.
The field is initialized to None without a type hint. Adding an annotation would improve IDE support and static analysis:
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from bindu.extensions.voice.session_factory import SessionManagerBackend

# In __init__:
self._voice_session_manager: SessionManagerBackend | None = None
Verify each finding against the current code and only fix it if needed. In `@bindu/server/applications.py` around lines 153 - 154, Add a precise type annotation for the instance attribute `_voice_session_manager` in the class __init__ so static checkers/IDEs know its type: import TYPE_CHECKING and conditionally import the SessionManagerBackend from bindu.extensions.voice.session_factory, then annotate `self._voice_session_manager` as `SessionManagerBackend | None` (or `Optional[SessionManagerBackend]`) instead of an untyped None; leave the runtime assignment as `None`.tests/unit/extensions/voice/test_voice_websocket_integration.py (1)
tests/unit/extensions/voice/test_voice_websocket_integration.py (1)
114-119: Test session token is appropriately marked for secret scanning.
The session_token="token-abc" is a test fixture value, not a real credential. The Ruff S106 warning is a false positive. The STT/TTS tokens on lines 108-112 already include # pragma: allowlist secret comments.
Consider adding the same pragma for consistency:
session = await manager.create_session(
    "ctx",
    session_token="token-abc",  # pragma: allowlist secret
    session_token_expires_at=1e12,
)
Verify each finding against the current code and only fix it if needed. In `@tests/unit/extensions/voice/test_voice_websocket_integration.py` around lines 114 - 119, The test uses a fake session token literal that triggers a Ruff S106 false positive; update the call to VoiceSessionManager.create_session (the line assigning session via await manager.create_session(...)) to annotate the session_token argument with the same secret allowlist pragma used for STT/TTS tokens (add a trailing comment like "# pragma: allowlist secret" after the session_token argument) so the linter treats it as an allowed test fixture.bindu/extensions/voice/agent_bridge.py (1)
bindu/extensions/voice/agent_bridge.py (1)
490-495: History trimming may drop more messages than necessary for odd-length histories.
When overflow is positive, the logic drops turns_to_drop * 2 messages. However, if the history has an odd number of messages (e.g., user message without assistant response yet), this could over-trim. Consider whether single-message granularity would be more precise.
That said, this is a minor edge case since:
- History typically alternates user/assistant pairs
- The trim only fires when over the limit
- Dropping slightly more is safe (just loses context faster)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/extensions/voice/agent_bridge.py` around lines 490 - 495, _trim_history currently computes turns_to_drop and removes turns_to_drop * 2 messages which can exceed overflow for odd-length histories; change it to compute the exact number of items to remove by using items_to_drop = min(turns_to_drop * 2, overflow) (or simply remove exactly overflow messages if you prefer single-message granularity) and then delete self._conversation_history[:items_to_drop]; keep references to _trim_history, self._conversation_history, _max_history_messages, overflow and turns_to_drop when locating and updating the logic.
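The suggested min() cap can be checked in isolation. trim_history below is a hypothetical free-function version of the bridge's trimming step (the real code mutates self._conversation_history in place); the cap ensures an odd overflow never costs a full extra pair:

```python
def trim_history(history: list, max_messages: int) -> list:
    """Drop oldest messages in user/assistant pairs, capped at the
    actual overflow so odd-length histories are not over-trimmed."""
    overflow = len(history) - max_messages
    if overflow <= 0:
        return history
    turns_to_drop = (overflow + 1) // 2        # pair granularity
    items_to_drop = min(turns_to_drop * 2, overflow)
    return history[items_to_drop:]
```

With the cap in place, an overflow of 1 drops exactly one message instead of rounding up to a two-message turn.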
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@bindu/extensions/did/did_agent_extension.py`:
- Around line 261-266: Wrap the ACL hardening subprocess.run call (the block
using subprocess.run and checking result.returncode) in a try/except that on
failure removes the key files that were created earlier (the key file paths
referenced by path and the corresponding public key file variable) before
re-raising the RuntimeError; specifically catch the failure after
subprocess.run, delete both key files (use the same variables used when writing
the files) in the except block, then re-raise the original RuntimeError so the
caller still sees the error.
- Around line 243-247: The code in did_agent_extension.py currently hard-fails
if os.getenv("USERNAME") is missing; update the ACL principal resolution in the
initialization path (the block that sets username) to try multiple fallbacks and
validate non-empty principal before raising: first try os.getenv("USERNAME"),
then os.getlogin() and getpass.getuser(), and if still empty attempt to derive a
principal from environment (e.g., USER or COMPUTERNAME) or fall back to a
clearly named safe default like "unknown_user" only after logging a warning;
ensure you validate the chosen principal is a non-empty string and raise
RuntimeError from the same location only if all fallbacks produce an
empty/invalid principal, and add a brief explanatory log entry referencing the
ACL hardening step.
In `@bindu/extensions/voice/agent_bridge.py`:
- Line 38: The constant DEFAULT_TIMEOUT_FALLBACK_TEXT contains a curly right
single quotation mark; replace it with a straight ASCII apostrophe so the value
reads "Sorry - I'm having trouble responding right now." Update the
DEFAULT_TIMEOUT_FALLBACK_TEXT definition in agent_bridge.py to use the straight
apostrophe (') to avoid rendering/logging issues.
In `@bindu/utils/notifications.py`:
- Around line 39-70: The file contains a shadowed duplicate of the function
_resolve_and_check_ip — the first definition (the one shown) is overwritten by a
second definition later, which makes edits to this copy inert; remove the
redundant/earlier definition so there is only a single canonical
_resolve_and_check_ip implementation (ensure the remaining version correctly
uses socket.getaddrinfo, ipaddress.ip_address, and checks against
_BLOCKED_NETWORKS and preserves the same error messages and behavior).
In `@frontend/src/lib/server/database.ts`:
- Around line 496-512: The on() and once() implementations currently schedule
setTimeout emissions every time a listener is added (in downloadStream), causing
duplicate emits; change the logic so emissions for "error"/"data"/"end" are
scheduled only once per downloadStream lifecycle instead of per registration:
introduce a per-stream flag (e.g., scheduledEmits or scheduled[event]) or move
the scheduling out of on()/once() into the downloadStream
construction/initialization, keep addListener/wrapped removal behavior as-is,
and use the flag to skip scheduling inside both on() and once() (reference the
on, once, downloadStream, addListener, emit, file identifiers).
- Around line 145-159: The updateMany implementation is incorrectly treating
aggregation-pipeline style updates as simple modifier objects (causing literal
"$field" strings and wrong $unset behavior); change updateMany to explicitly
reject array (pipeline) updates instead of applying them: in updateMany (and any
callers like the migration routine), detect Array.isArray(update) and throw a
clear error (or return a rejected Promise) stating pipeline updates are
unsupported; keep existing applyUpdate and matchesFilter logic for non-array
updates and update usages in migrations to use modifier-form updates (or convert
them) rather than passing arrays.
In `@frontend/src/lib/services/voice-client.ts`:
- Around line 244-260: stopSession can race with async audio setup and allow
startAudioStreaming to install new media resources after hangup; add a
guard/teardown token to prevent new installs once stopping. Concretely:
introduce a boolean or cancellation token (e.g., this.isStopping or
this.stopToken) set at the start of stopSession and checked at the start of
startAudioStreaming and any async work that creates mediaStream/nodes; make
cleanupAudioStreaming clear resources and mark the token as stopped; ensure
startAudioStreaming aborts early if the token indicates stopped (also check
this.sessionId null), and apply the same guard pattern to other async audio
setup paths referenced in the file (the async blocks around startAudioStreaming,
mediaStream and worklet setup).
In `@pyproject.toml`:
- Around line 87-97: The gRPC modules (bindu/grpc/client.py,
bindu/grpc/service.py, bindu/grpc/server.py) import grpc at module level which
will raise ImportError for environments without the [grpc] extra; change those
module-level imports to guarded try/except ImportError blocks (or perform lazy
imports inside the functions/classes like GrpcAgentClient, GrpcAgentService,
GrpcAgentServer) and raise a clear, descriptive ImportError instructing the user
to install the "grpc" extra (or fall back gracefully) so importing these modules
without grpc installed does not crash the program unexpectedly.
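The guarded-import pattern the comment asks for can be sketched like this; `GrpcAgentClient` here is a stub standing in for the real class, and the error message wording is illustrative.

```python
# Guard the optional dependency at import time instead of crashing the module.
try:
    import grpc  # type: ignore
except ImportError:  # the [grpc] extra is not installed
    grpc = None  # type: ignore[assignment]


class GrpcAgentClient:
    """Stub showing the deferred failure: importing this module never raises."""

    def __init__(self, target: str) -> None:
        if grpc is None:
            raise ImportError(
                "gRPC support requires the optional dependency: "
                'install it with `pip install "bindu[grpc]"`.'
            )
        # grpc.insecure_channel is lazy; no connection is opened here.
        self._channel = grpc.insecure_channel(target)
```

Importing the module stays safe either way; the descriptive `ImportError` only fires when the class is actually used without the extra installed.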
In `@tests/integration/grpc/test_grpc_e2e.py`:
- Around line 184-186: The task-polling helper uses broad `except Exception`
blocks that hide real errors; replace the first `except Exception` (around the
network retry) with `except httpx.HTTPError` and the second JSON-related `except
Exception` with `except ValueError`, and propagate/record the caught exception
so the eventual timeout error message includes the last exception
instance/details to aid debugging; update the retry loops in the helper so only
those specific exceptions are swallowed and any other exceptions are allowed to
bubble up.
---
Outside diff comments:
In `@bindu/utils/notifications.py`:
- Around line 146-152: The code still passes a removed/resolved_ip value and
calls _post_once with an outdated signature after validate_config() was changed
to return nothing; update send_event and any callers (send_event, the block
using self.validate_config(config) and the sections that call _post_with_retry
and _post_once) to stop expecting or passing resolved_ip: have
validate_config(config) only validate, remove resolved_ip from the call to
_post_with_retry(config["url"], headers, payload, event) and refactor
_post_once(...) call sites to match its new reduced parameter list (use url,
headers, payload, event or whatever the current _post_once signature requires),
and ensure _post_with_retry delegates to the updated _post_once signature so the
send_event path no longer depends on the old resolved_ip argument.
- Around line 259-286: Import the missing urllib names and fix the HTTPError
handling in _post_once: add imports for urllib.request as request and
urllib.error as error (or equivalent) so request.Request and request.urlopen and
error.HTTPError are available, and in the except error.HTTPError block call
exc.read() (not response.read()) to get the error body, decode it and raise
NotificationDeliveryError(status, message) as currently done; ensure this all
occurs in the _post_once method that raises NotificationDeliveryError and that
the original with request.urlopen(...) as response block remains unchanged.
- Around line 141-145: The code documents single-resolution DNS to prevent
DNS-rebinding but fails to pass the pre-resolved IP through the call chain: fix
by returning the resolved IP from _resolve_and_validate_destination() into
validate_config() and ensure send_event() passes that resolved_ip into
_post_once(); update _post_once(resolved_ip, ...) signature and use the resolved
IP when opening the HTTP connection (construct the request to use the resolved
IP as the network target and set the original hostname in the Host header so
urlopen() does not perform a new DNS lookup). Also remove the duplicate
_resolve_and_check_ip() definition and correct imports (ensure
urllib.request/urllib.error are imported and unused imports removed). Ensure
references to resolved_ip (previously undefined) are removed or replaced with
the returned value from validate_config()/ _resolve_and_validate_destination().
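The "connect to the pre-resolved IP, keep the original Host header" idea can be sketched with stdlib `urllib`. This is illustrative only: `build_pinned_request` is a hypothetical helper, it covers plain HTTP, and HTTPS would additionally need SNI/hostname-verification handling.

```python
import urllib.request
from urllib.parse import urlparse, urlunparse


def build_pinned_request(url: str, resolved_ip: str, data: bytes, headers: dict):
    """Target the already-resolved IP so urlopen() does no second DNS lookup."""
    parts = urlparse(url)
    port = f":{parts.port}" if parts.port else ""
    # Swap the hostname for the pinned IP in the network target...
    pinned_url = urlunparse(parts._replace(netloc=f"{resolved_ip}{port}"))
    req = urllib.request.Request(pinned_url, data=data, headers=dict(headers))
    # ...but keep the original hostname as the virtual-host header.
    req.add_header("Host", parts.hostname)
    return req
```

Because the URL now carries the IP, a DNS-rebinding attacker cannot swap the destination between validation and delivery.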
---
Duplicate comments:
In `@bindu/server/endpoints/voice_endpoints.py`:
- Around line 569-581: The handler currently stringifies any JSON "context_id"
(including null, objects, empty strings) and passes it to
VoiceSessionManager.create_session(), causing ValueError downstream; update the
request parsing to validate and reject invalid context_id values before calling
create_session(): after json.loads(raw_body) ensure body["context_id"] exists
and is a non-empty string (and optionally match the expected UUID format) and
raise HTTPException(status_code=400) for null/empty/non-string/object values so
only a validated context_id is forwarded to create_session().
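A minimal sketch of the validation described above; `parse_context_id` is a hypothetical helper, and it raises `ValueError` where the endpoint would raise `HTTPException(status_code=400)`.

```python
import json
import uuid


def parse_context_id(raw_body: bytes) -> str:
    """Reject null/empty/non-string/object context_id values before session creation."""
    body = json.loads(raw_body)
    context_id = body.get("context_id") if isinstance(body, dict) else None
    if not isinstance(context_id, str) or not context_id.strip():
        raise ValueError("context_id must be a non-empty string")
    uuid.UUID(context_id)  # optional: also enforce the expected UUID format
    return context_id
```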
- Around line 748-827: The session is being marked active and the voice pipeline
is built before entering the try/finally that calls end_session, so any early
return/exception (e.g., agent not configured or build_voice_pipeline failing)
can leave the session active; move session_manager.update_state(session_id,
"active") and the call to build_voice_pipeline(...) inside the try block that
contains the finally/end_session cleanup and perform the agent configuration
check (voice_ext and manifest/run) inside that same try so any exception or
return will fall through to the finally which calls end_session; reference
session_manager.update_state, build_voice_pipeline, manifest.run, voice_ext, and
end_session when making this change.
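The lifecycle shape being requested can be sketched as below: everything that can fail after session creation happens inside the `try`, so the `finally` always releases the session. Names such as `run_voice_session` and `build_pipeline` mirror the comment but are stand-ins, not the real handler.

```python
import asyncio


async def run_voice_session(session_manager, session_id, build_pipeline):
    try:
        # Activation and pipeline construction both sit inside the try,
        # so a failure here still falls through to end_session().
        await session_manager.update_state(session_id, "active")
        pipeline = await build_pipeline()
        await pipeline.run()
    finally:
        await session_manager.end_session(session_id)
```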
- Around line 1-977: This file is too large and should be split: extract
rate-limiting logic (_get_rate_limit_redis_client, _rate_limit_allow_ip,
_RATE_LIMIT_LUA, related globals) into a new voice_rate_limit module; move REST
handlers (voice_session_start, voice_session_end, voice_session_status) into
voice_rest_endpoints that import the rate limiter; move WebSocket orchestration
(voice_websocket, _voice_control_reader, _FilteredWebSocket, _VoiceControlState,
_RawAudioFrameSerializer, _send_json, _extract_ws_session_token,
_has_ws_subprotocols, _classify_voice_pipeline_error, _voice_preflight_error,
_send_error_and_close and helpers like _trim_overlap_text/_extract_bearer_token)
into voice_ws_handler; keep small shared helpers (e.g., _trim_overlap_text,
_extract_bearer_token) in a voice_utils module; ensure each new module exposes
only what others need and update imports in the original package to reference
the new modules so lifecycle calls (create_session, update_state, end_session)
and pipeline building (build_voice_pipeline) remain intact.
- Around line 721-737: The code accepts a websocket then blocks indefinitely on
websocket.receive_text() when no header token is provided; to fix, use a
configurable timeout from app_settings (e.g.,
app_settings.voice.ws_token_read_timeout or similar) and wrap the receive_text
call in asyncio.wait_for so unauthenticated clients can't pin open connections:
update the branch in voice_endpoints.py where session_auth_required is checked
(symbols: session_auth_required, _extract_ws_session_token, websocket.accept,
websocket.receive_text) to fetch the timeout from app_settings (do not
hardcode), wrap the await websocket.receive_text() with asyncio.wait_for(...,
timeout=app_settings.voice.<timeout_field>), handle asyncio.TimeoutError by
closing the socket with code 1008 and a clear reason, and add the necessary
asyncio import if missing.
In `@frontend/src/lib/server/files/uploadFile.ts`:
- Around line 18-19: The upload stream handlers are being attached after calling
upload.end(), which can miss rapid finish/error events; move the event listener
registration (upload.on('finish', ...), upload.on('error', ...), or Promise
resolve/reject bindings) to before calling upload.write(...) / upload.end() in
the upload flow (look for the upload variable and the surrounding
upload.write(Buffer.from(buffer)); upload.end(); calls and similarly the block
around lines 49-50) so the promise will always observe finish/error events;
ensure you attach listeners first, then write and end the stream.
In `@frontend/src/lib/stores/voice.ts`:
- Around line 170-189: endVoiceSession() must invalidate any in-flight start
attempts so a stale localClient from startVoiceSession() cannot be assigned to
the global client after the user hangs up; increment or change the shared
startTokenCounter (the token used in startVoiceSession() to gate assignment)
inside endVoiceSession() before awaiting stopSession(), and in
startVoiceSession() check that the token from when the async start began still
matches startTokenCounter before assigning client = localClient or calling
localClient.startSession()/connect() results; reference startTokenCounter,
startVoiceSession(), endVoiceSession(), client, localClient, startSession(), and
connect() in your change.
- Around line 107-111: startVoiceSession() currently swaps clients without
clearing mute/audio leftovers, causing UI to remain muted and old audio to
persist; before or during the client swap in startVoiceSession(), reset the mute
and audio stores the same way endVoiceSession() does (clear isVoiceMuted and
latestAgentAudio, and any related per-call audio buffers) so the new VoiceClient
starts with a clean state — update startVoiceSession() to call the same reset
logic used in endVoiceSession() (or extract a small helper) to clear
isVoiceMuted and latestAgentAudio when initializing a new session.
---
Nitpick comments:
In @.gitignore:
- Around line 226-230: The .gitignore contains redundant patterns: remove the
explicit examples/voice-agent/en_US-*.onnx and
examples/voice-agent/en_US-*.onnx.json entries because the root-level patterns
en_US-*.onnx and en_US-*.onnx.json already match those files anywhere; keep the
root patterns (en_US-*.onnx and en_US-*.onnx.json) and delete the two
example-specific lines to clean up maintenance.
In `@bindu/extensions/voice/agent_bridge.py`:
- Around line 490-495: _trim_history currently computes turns_to_drop and
removes turns_to_drop * 2 messages which can exceed overflow for odd-length
histories; change it to compute the exact number of items to remove by using
items_to_drop = min(turns_to_drop * 2, overflow) (or simply remove exactly
overflow messages if you prefer single-message granularity) and then delete
self._conversation_history[:items_to_drop]; keep references to _trim_history,
self._conversation_history, _max_history_messages, overflow and turns_to_drop
when locating and updating the logic.
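The `min(turns_to_drop * 2, overflow)` variant above can be sketched in isolation. `BridgeHistory` is a stub; only the trimming arithmetic reflects the suggested fix.

```python
class BridgeHistory:
    def __init__(self, max_history_messages: int) -> None:
        self._max_history_messages = max_history_messages
        self._conversation_history: list[str] = []

    def _trim_history(self) -> None:
        overflow = len(self._conversation_history) - self._max_history_messages
        if overflow <= 0:
            return
        # Drop whole user/assistant turns, but never more items than the overflow.
        turns_to_drop = (overflow + 1) // 2
        items_to_drop = min(turns_to_drop * 2, overflow)
        del self._conversation_history[:items_to_drop]
```

With an odd overflow the cap keeps the history from shrinking below `_max_history_messages`, which the unclamped `turns_to_drop * 2` deletion could do.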
In `@bindu/server/applications.py`:
- Around line 153-154: Add a precise type annotation for the instance attribute
`_voice_session_manager` in the class __init__ so static checkers/IDEs know its
type: import TYPE_CHECKING and conditionally import the SessionManagerBackend
from bindu.extensions.voice.session_factory, then annotate
`self._voice_session_manager` as `SessionManagerBackend | None` (or
`Optional[SessionManagerBackend]`) instead of an untyped None; leave the runtime
assignment as `None`.
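The `TYPE_CHECKING` annotation pattern looks roughly like this; `BinduApplication` is an illustrative stand-in for the real class, and the string annotation keeps the import type-checker-only at runtime.

```python
from typing import TYPE_CHECKING, Optional

if TYPE_CHECKING:  # executed by type checkers only, never at runtime
    from bindu.extensions.voice.session_factory import SessionManagerBackend


class BinduApplication:
    def __init__(self) -> None:
        # Runtime value stays None; the quoted annotation gives IDEs the type.
        self._voice_session_manager: "Optional[SessionManagerBackend]" = None
```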
In `@frontend/src/lib/stores/chat.ts`:
- Line 229: The sendMessage function currently accepts parts: any[] which loses
type safety; define or import a Part interface (or type) and change the
signature to parts: Part[] (or readonly Part[] as appropriate) so callers and
the function body get proper typings. Ensure Part covers the expected shape used
in the function (e.g., discriminated union with kind: 'text' | 'file', text?:
string, fileId?: string, metadata?: Record<string, unknown>, etc.), update any
local references inside sendMessage to use the new properties, and import the
shared Part type if one exists instead of redefining it.
In `@tests/unit/extensions/voice/test_voice_websocket_integration.py`:
- Around line 114-119: The test uses a fake session token literal that triggers
a Ruff S106 false positive; update the call to
VoiceSessionManager.create_session (the line assigning session via await
manager.create_session(...)) to annotate the session_token argument with the
same secret allowlist pragma used for STT/TTS tokens (add a trailing comment
like "# pragma: allowlist secret" after the session_token argument) so the
linter treats it as an allowed test fixture.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 89f4a10a-a05b-4444-a3b4-1959e9ba9a82
⛔ Files ignored due to path filters (4)
- bindu/grpc/generated/agent_handler_pb2.py is excluded by !**/generated/**
- bindu/grpc/generated/agent_handler_pb2.pyi is excluded by !**/generated/**
- bindu/grpc/generated/agent_handler_pb2_grpc.py is excluded by !**/generated/**
- uv.lock is excluded by !**/*.lock
📒 Files selected for processing (87)
- .github/workflows/ci.yml
- .gitignore
- .pre-commit-config.yaml
- bindu/common/protocol/types.py
- bindu/extensions/__init__.py
- bindu/extensions/did/did_agent_extension.py
- bindu/extensions/voice/__init__.py
- bindu/extensions/voice/agent_bridge.py
- bindu/extensions/voice/audio_config.py
- bindu/extensions/voice/pipeline_builder.py
- bindu/extensions/voice/redis_session_manager.py
- bindu/extensions/voice/service_factory.py
- bindu/extensions/voice/session_factory.py
- bindu/extensions/voice/session_manager.py
- bindu/extensions/voice/voice_agent_extension.py
- bindu/penguin/bindufy.py
- bindu/server/applications.py
- bindu/server/endpoints/utils.py
- bindu/server/endpoints/voice_endpoints.py
- bindu/server/metrics.py
- bindu/server/scheduler/memory_scheduler.py
- bindu/server/workers/base.py
- bindu/settings.py
- bindu/utils/__init__.py
- bindu/utils/capabilities.py
- bindu/utils/logging.py
- bindu/utils/notifications.py
- bindu/utils/retry.py
- bindu/utils/worker/messages.py
- bindu/utils/worker/parts.py
- frontend/.env.example
- frontend/.gitignore
- frontend/src/lib/buildPrompt.ts
- frontend/src/lib/components/ShareConversationModal.svelte
- frontend/src/lib/components/chat/ChatInput.svelte
- frontend/src/lib/components/chat/ChatMessage.svelte
- frontend/src/lib/components/chat/ChatWindow.svelte
- frontend/src/lib/components/voice/LiveTranscript.svelte
- frontend/src/lib/components/voice/VoiceCallButton.svelte
- frontend/src/lib/components/voice/VoiceCallPanel.svelte
- frontend/src/lib/constants/mime.ts
- frontend/src/lib/jobs/refresh-conversation-stats.ts
- frontend/src/lib/migrations/lock.ts
- frontend/src/lib/migrations/routines/02-update-assistants-models.ts
- frontend/src/lib/migrations/routines/10-update-reports-assistantid.ts
- frontend/src/lib/server/config.ts
- frontend/src/lib/server/database.ts
- frontend/src/lib/server/files/downloadFile.ts
- frontend/src/lib/server/files/uploadFile.ts
- frontend/src/lib/server/models.ts
- frontend/src/lib/services/voice-client.ts
- frontend/src/lib/stores/chat.ts
- frontend/src/lib/stores/voice.ts
- frontend/src/lib/types/ConvSidebar.ts
- frontend/src/lib/types/Model.ts
- frontend/src/lib/types/Session.ts
- frontend/src/lib/utils/agentMessageHandler.ts
- frontend/src/lib/utils/tree/addChildren.spec.ts
- frontend/src/lib/utils/tree/addSibling.spec.ts
- frontend/src/lib/utils/tree/buildSubtree.spec.ts
- frontend/src/lib/utils/tree/convertLegacyConversation.spec.ts
- frontend/src/routes/+layout.svelte
- frontend/src/routes/+page.svelte
- frontend/src/routes/api/v2/conversations/[id]/message/[messageId]/+server.ts
- frontend/src/routes/api/v2/export/+server.ts
- frontend/src/routes/conversation/[id]/+page.svelte
- frontend/src/routes/conversation/[id]/+server.ts
- frontend/src/routes/settings/(nav)/authentication/+page.svelte
- frontend/src/routes/settings/(nav)/negotiation/+page.svelte
- pyproject.toml
- pytest.ini
- scripts/test_voice_providers.sh
- tests/conftest_stubs.py
- tests/integration/grpc/test_grpc_e2e.py
- tests/unit/extensions/__init__.py
- tests/unit/extensions/voice/__init__.py
- tests/unit/extensions/voice/test_agent_bridge.py
- tests/unit/extensions/voice/test_service_factory.py
- tests/unit/extensions/voice/test_session_manager.py
- tests/unit/extensions/voice/test_voice_endpoints.py
- tests/unit/extensions/voice/test_voice_extension.py
- tests/unit/extensions/voice/test_voice_websocket_integration.py
- tests/unit/penguin/test_bindufy.py
- tests/unit/penguin/test_config_validator.py
- tests/unit/server/scheduler/test_memory_scheduler.py
- tests/unit/utils/test_notifications.py
- tests/unit/utils/worker/test_messages.py
✅ Files skipped from review due to trivial changes (26)
- frontend/src/lib/migrations/routines/10-update-reports-assistantid.ts
- tests/unit/extensions/voice/__init__.py
- frontend/src/lib/buildPrompt.ts
- tests/unit/extensions/__init__.py
- bindu/utils/logging.py
- pytest.ini
- frontend/src/routes/api/v2/conversations/[id]/message/[messageId]/+server.ts
- bindu/server/endpoints/utils.py
- frontend/src/lib/utils/tree/addChildren.spec.ts
- frontend/.gitignore
- frontend/src/routes/settings/(nav)/negotiation/+page.svelte
- tests/unit/penguin/test_config_validator.py
- bindu/utils/__init__.py
- frontend/src/routes/settings/(nav)/authentication/+page.svelte
- tests/unit/penguin/test_bindufy.py
- frontend/src/lib/migrations/routines/02-update-assistants-models.ts
- frontend/src/lib/constants/mime.ts
- frontend/src/lib/server/config.ts
- bindu/extensions/voice/__init__.py
- frontend/src/lib/components/voice/LiveTranscript.svelte
- tests/unit/extensions/voice/test_session_manager.py
- frontend/src/lib/components/voice/VoiceCallButton.svelte
- tests/unit/extensions/voice/test_service_factory.py
- bindu/extensions/voice/voice_agent_extension.py
- bindu/extensions/voice/redis_session_manager.py
- bindu/settings.py
🚧 Files skipped from review as they are similar to previous changes (33)
- frontend/src/lib/components/ShareConversationModal.svelte
- bindu/utils/worker/parts.py
- frontend/src/routes/+page.svelte
- bindu/utils/retry.py
- frontend/src/lib/migrations/lock.ts
- tests/unit/server/scheduler/test_memory_scheduler.py
- frontend/src/lib/utils/tree/buildSubtree.spec.ts
- frontend/src/routes/conversation/[id]/+page.svelte
- frontend/src/lib/types/ConvSidebar.ts
- frontend/src/lib/utils/tree/addSibling.spec.ts
- bindu/extensions/__init__.py
- frontend/src/routes/api/v2/export/+server.ts
- frontend/src/lib/utils/tree/convertLegacyConversation.spec.ts
- frontend/src/lib/types/Session.ts
- frontend/src/lib/types/Model.ts
- frontend/src/routes/conversation/[id]/+server.ts
- frontend/src/lib/server/files/downloadFile.ts
- frontend/src/lib/jobs/refresh-conversation-stats.ts
- bindu/server/metrics.py
- tests/unit/utils/worker/test_messages.py
- bindu/server/workers/base.py
- .pre-commit-config.yaml
- tests/conftest_stubs.py
- frontend/src/lib/server/models.ts
- tests/unit/extensions/voice/test_voice_extension.py
- .github/workflows/ci.yml
- frontend/src/routes/+layout.svelte
- frontend/src/lib/components/chat/ChatMessage.svelte
- bindu/server/scheduler/memory_scheduler.py
- frontend/src/lib/utils/agentMessageHandler.ts
- tests/unit/extensions/voice/test_agent_bridge.py
- bindu/extensions/voice/pipeline_builder.py
- bindu/extensions/voice/audio_config.py
hey @Co-vengers, the Unit Tests (3.12) are failing after 7 seconds; check the CodeRabbit reviews
There was a problem hiding this comment.
Actionable comments posted: 16
Note
Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
frontend/src/lib/components/chat/ChatWindow.svelte (1)
294-301: ⚠️ Potential issue | 🟡 Minor
Keep the MIME fallback consistent for dropped/pasted files.
`file.type` can be empty, and `sendAgentMessage()` drops file parts with blank MIME. This path should match `ChatInput` and provide a fallback so valid files are not silently omitted.
Proposed fix

```diff
   type: "base64",
   value,
-  mime: file.type,
+  mime: file.type || "application/octet-stream",
   name: file.name,
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@frontend/src/lib/components/chat/ChatWindow.svelte` around lines 294 - 301, The derived "sources" creation uses file.type directly which can be empty and causes sendAgentMessage to drop file parts; update the mapping in the sources computation (where file2base64 is used) to provide the same MIME fallback as ChatInput (e.g. use file.type || <ChatInput's fallback> such as 'application/octet-stream' or file.mime) so blank MIME values are replaced before creating the part; ensure this change is applied alongside file2base64 usage and will produce parts accepted by sendAgentMessage.
♻️ Duplicate comments (3)
bindu/extensions/did/did_agent_extension.py (1)
218-229: ⚠️ Potential issue | 🟠 Major
Rollback key files for all ACL-hardening failures.
Line 221 only catches `RuntimeError`; if `subprocess.run()` raises before `_harden_windows_key_file_acl()` converts the failure, the newly written key files are left behind.
Suggested fix

```diff
-        except RuntimeError:
+        except Exception:
             for key_path in (self.private_key_path, self.public_key_path):
                 try:
                     key_path.unlink(missing_ok=True)
                 except OSError:
                     logger.warning(
                         f"Failed to remove key file during ACL hardening rollback: {key_path}"
                     )
             raise
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/extensions/did/did_agent_extension.py` around lines 218 - 229, The rollback currently only runs on RuntimeError, so if subprocess.run or other calls raise OSError/FileNotFoundError (or any other exception) before _harden_windows_key_file_acl converts it, the key files remain; change the except to catch the broader exception class (e.g., except (RuntimeError, OSError, FileNotFoundError) or simply except Exception) around the calls to self._harden_windows_key_file_acl(self.private_key_path) and self._harden_windows_key_file_acl(self.public_key_path) and keep the existing loop that unlinks self.private_key_path and self.public_key_path (with missing_ok=True and the same OSError warning) before re-raising the original exception so all failures trigger the same rollback.bindu/server/endpoints/voice_endpoints.py (1)
50-65: 🛠️ Refactor suggestion | 🟠 Major
Still move voice tuning knobs into `app_settings`.
Redis TTL, audio frame limits, queue depth, and echo-suppression windows remain hardcoded, so production tuning still requires code changes.
As per coding guidelines, "Use `bindu.settings.app_settings` for all configuration - never hardcode values such as URLs, ports, timeouts, API keys, or feature flags" and "NEVER create local config constants - use `app_settings` from `bindu.settings`". Based on learnings, "Use `bindu.settings.app_settings` for all configuration - never hardcode values such as URLs, ports, timeouts, API keys, or feature flags".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/server/endpoints/voice_endpoints.py` around lines 50 - 65, The _RATE_LIMIT_LUA string (and other hardcoded constants like the Redis TTL value 120 inside it, audio frame limits, queue depth, and echo-suppression window values referenced elsewhere) must be moved into bindu.settings.app_settings and referenced from there instead of hardcoding; update the code that builds/uses _RATE_LIMIT_LUA to read TTL and limit thresholds from app_settings (e.g., app_settings.voice.redis_ttl, app_settings.voice.rate_limit, app_settings.voice.frame_limit, app_settings.voice.queue_depth, app_settings.voice.echo_suppression_window) and replace other hardcoded occurrences (the blocks around the other referenced sections) to use these app_settings keys so production tuning can be done via configuration.frontend/src/lib/services/voice-client.ts (1)
310-431: ⚠️ Potential issue | 🟠 Major
Clean up mic resources when audio setup throws.
After `getUserMedia()` succeeds, failures in `AudioContext` resume, worklet registration, node creation, or graph connection can leave the mic/context alive because there is no outer `catch`.
🔒 Proposed cleanup guard

```diff
-    const stream = await navigator.mediaDevices.getUserMedia({
+    let stream: MediaStream | null = null;
+    let audioContext: AudioContext | null = null;
+    try {
+      stream = await navigator.mediaDevices.getUserMedia({
        audio: {
          channelCount: 1,
          sampleRate: 16000,
          echoCancellation: true,
          noiseSuppression: true,
          autoGainControl: true,
        },
-    });
+      });
@@
-    const audioContext = new AudioContextCtor({ sampleRate: desiredSampleRate });
+    audioContext = new AudioContextCtor({ sampleRate: desiredSampleRate });
     await audioContext.resume();
@@
     this.silentGainNode = silentGainNode;
     this.isStreamingAudio = true;
+    } catch (err) {
+      stream?.getTracks().forEach((track) => track.stop());
+      if (audioContext) {
+        void audioContext.close();
+      }
+      throw err;
+    }
   }
```
🟡 Minor comments (6)
bindu/penguin/config_validator.py-154-157 (1)
154-157:⚠️ Potential issue | 🟡 MinorPersist the normalized
deployment.url.Line 154 trims only the local variable, so configs like
" http://localhost:3773 "pass validation butvalidate_and_process()still returns the untrimmed URL. Store the normalized value back into the nested config before building the processed result.Proposed fix
```diff
     deployment_url = deployment_url.strip()
+    deployment_config["url"] = deployment_url
     parsed = urlparse(deployment_url)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/penguin/config_validator.py` around lines 154 - 157, The code trims the local variable deployment_url but does not store the normalized value back into the config, so validate_and_process() can return the original untrimmed string; update the nested config key (e.g., config["deployment"]["url"] or equivalent structure used in validate_and_process()) with deployment_url.strip() before calling urlparse and before building the processed result so the normalized URL is persisted; ensure you still use the trimmed deployment_url for parsing/validation (parsed = urlparse(deployment_url)) and that the returned processed config contains the trimmed value.bindu/utils/worker/messages.py-93-93 (1)
93-93:⚠️ Potential issue | 🟡 MinorRemove dead code path for non-existent
"data"key — it is not defined inFileWithBytestype.The fallback
`file_info.get("bytes") or file_info.get("data", "")` at line 93 accepts a `"data"` key that does not exist in the `FileWithBytes` type definition (per `bindu/common/protocol/types.py`). No producer or serializer writes this key; all tests, examples, and the protocol itself use only `"bytes"`. Remove this dead branch, or if legacy compatibility is required, document it with a comment and update the error message to avoid suggesting `"data"` is a valid alternative.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/utils/worker/messages.py` at line 93, The assignment currently uses a dead fallback for a non-existent "data" key; update the logic in bindu/utils/worker/messages.py to read only the documented FileWithBytes field (use file_info.get("bytes") and remove the file_info.get("data", "") branch), and also adjust any related error/log messages in the same module that mention "data" so they only reference "bytes" (or add a clear comment if you intentionally keep legacy handling). Target the base64_data assignment and any subsequent error handling that references "data" to ensure consistency with the FileWithBytes type.frontend/src/lib/server/database.ts-94-102 (1)
94-102:⚠️ Potential issue | 🟡 Minor
`_id` preservation type-casts a possibly non-ObjectId value.
("_id" in source && source._id ? source._id : new ObjectId()) as ObjectIdblindly casts whatever the caller passed (string, number, plain object) toObjectId, andinsertedIdin the returned result inherits that lie. Downstream code calling ObjectId-specific methods (e.g..toHexString(),.equals(...)) on the returnedinsertedIdwill fail at runtime when a caller seeds a non-ObjectId_id. Either normalize to anObjectIdinstance, or widen the return type toObjectId | stringso callers handle both.🛡️ Proposed normalization
```diff
-  async insertOne(
-    doc: Partial<T> | T
-  ): Promise<{ insertedId: ObjectId; acknowledged: boolean }> {
-    const source = doc as Record<string, unknown>;
-    const id = ("_id" in source && source._id ? source._id : new ObjectId()) as ObjectId;
-    const docWithId = { ...source, _id: id } as T;
-    this.data.set(id.toString(), docWithId);
-    return { insertedId: id, acknowledged: true };
-  }
+  async insertOne(
+    doc: Partial<T> | T
+  ): Promise<{ insertedId: ObjectId; acknowledged: boolean }> {
+    const source = doc as Record<string, unknown>;
+    const existing = "_id" in source ? source._id : undefined;
+    const id =
+      existing && typeof existing === "object" && "toHexString" in existing
+        ? (existing as ObjectId)
+        : existing
+          ? new ObjectId(String(existing))
+          : new ObjectId();
+    const docWithId = { ...source, _id: id } as T;
+    this.data.set(id.toString(), docWithId);
+    return { insertedId: id, acknowledged: true };
+  }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@frontend/src/lib/server/database.ts` around lines 94 - 102, insertOne currently casts any incoming _id to ObjectId which can lie and break callers; change insertOne to normalize the id: if source._id is already an ObjectId use it, else if it's a string/hex try to construct a new ObjectId from that string (wrap in try/catch and fall back to new ObjectId()), otherwise generate a fresh new ObjectId; ensure docWithId._id is that normalized ObjectId, store it in this.data, and return that real ObjectId as insertedId (or if you prefer widen the function signature return type to ObjectId | string, pick one consistent approach and update insertOne's return type accordingly).bindu/utils/notifications.py-251-259 (1)
251-259: ⚠️ Potential issue | 🟡 Minor

Preserve the wrapped `HTTPError` cause.

Add `from exc` to this `raise` statement so traceback context is explicit and Ruff B904 passes. This is consistent with exception chaining used elsewhere in the same function.

🧹 Proposed fix
```diff
-            raise NotificationDeliveryError(status, message or f"HTTP error {status}")
+            raise NotificationDeliveryError(
+                status, message or f"HTTP error {status}"
+            ) from exc
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/utils/notifications.py` around lines 251 - 259, The except block catching error.HTTPError currently raises NotificationDeliveryError without chaining the original exception; update the raise to preserve the cause by using "from exc" so traceback context is explicit (i.e., when handling error.HTTPError as exc, raise NotificationDeliveryError(status, message or f"HTTP error {status}") from exc), keeping consistent with exception chaining used elsewhere and satisfying Ruff B904; locate this in bindu/utils/notifications.py within the HTTPError handling where exc, status, body, message are computed.

tests/unit/extensions/voice/test_voice_websocket_integration.py-112-120 (1)
112-120: ⚠️ Potential issue | 🟡 Minor

Patch provider/auth settings explicitly for deterministic tests.

These tests rely on `_voice_preflight_error()`, but only patch API keys. If the environment sets a different STT/TTS provider or enables session auth, the tests can take a different branch.

🧪 Proposed deterministic setup
```diff
     original_enabled = module.app_settings.voice.enabled
     original_stt = module.app_settings.voice.stt_api_key
     original_tts = module.app_settings.voice.tts_api_key
+    original_stt_provider = module.app_settings.voice.stt_provider
+    original_tts_provider = module.app_settings.voice.tts_provider
@@
     module.app_settings.voice.enabled = True
+    module.app_settings.voice.stt_provider = "deepgram"
+    module.app_settings.voice.tts_provider = "elevenlabs"
@@
     module.app_settings.voice.tts_api_key = original_tts
+    module.app_settings.voice.stt_provider = original_stt_provider
+    module.app_settings.voice.tts_provider = original_tts_provider
```

Apply the same pattern in the timeout test, and explicitly set/restore `session_auth_required = False` there.

Also applies to: 201-208
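The save/patch/restore boilerplate above can be factored into a reusable helper; a sketch assuming a `patched_attrs` context manager (hypothetical, not part of the test suite) over a stand-in settings namespace:

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def patched_attrs(obj, **overrides):
    """Temporarily set attributes on obj, restoring originals on exit,
    even if the body raises."""
    originals = {name: getattr(obj, name) for name in overrides}
    try:
        for name, value in overrides.items():
            setattr(obj, name, value)
        yield obj
    finally:
        for name, value in originals.items():
            setattr(obj, name, value)


# Stand-in for module.app_settings.voice
voice = SimpleNamespace(enabled=False, stt_provider="whisper", session_auth_required=True)

with patched_attrs(voice, enabled=True, stt_provider="deepgram", session_auth_required=False):
    assert voice.stt_provider == "deepgram"
    assert voice.session_auth_required is False

assert voice.enabled is False              # restored
assert voice.stt_provider == "whisper"     # restored
assert voice.session_auth_required is True
```

Because restoration happens in `finally`, a failing assertion inside the block cannot leak patched settings into the next test.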
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/extensions/voice/test_voice_websocket_integration.py` around lines 112 - 120, The test relies on _voice_preflight_error() but only patches STT/TTS API keys; explicitly patch provider/auth settings to make the test deterministic by setting module.app_settings.voice.session_auth_required = False (or restoring it to False in the timeout test) and explicitly set any provider-related fields used by _voice_preflight_error() (e.g., stt_api_key, tts_api_key and provider selection if present) before the assertion, and restore them after; apply the same pattern for the other occurrence around lines referenced (the timeout test and the block at 201-208) so both tests deterministically exercise the same branch.

frontend/src/lib/stores/voice.ts-66-86 (1)
66-86: ⚠️ Potential issue | 🟡 Minor

Preserve turn boundaries for finalized transcripts.
This merges every consecutive event from the same role. If the user sends two finalized utterances in a row, or the agent emits a new finalized response after another agent event, the UI collapses distinct turns.
🧩 Proposed merge guard
```diff
-      if (last && last.role === event.role) {
+      if (last && last.role === event.role && !last.isFinal) {
         const mergedText = mergeTranscriptText(last.text, event.text);
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@frontend/src/lib/stores/voice.ts` around lines 66 - 86, In appendTranscript, avoid merging a new event into the previous item when that previous turn is finalized; currently any consecutive event with the same role gets merged. Change the merge guard so merging only occurs if last exists, last.role === event.role, AND last.isFinal === false (i.e., only merge into non-finalized turns). Use the same merge logic (mergeTranscriptText, building merged with isFinal and ts) and update currentUserTranscript/currentAgentTranscript when merging; otherwise push event as a new transcript entry.
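The guarded merge is straightforward to unit-test in isolation; a Python sketch of the append logic (the dict fields mirror the store's transcript events; `append_transcript` and plain space-joining stand in for the real `mergeTranscriptText`):

```python
def append_transcript(transcripts: list[dict], event: dict) -> list[dict]:
    """Merge into the previous entry only when it has the same role AND is
    not finalized; finalized turns always stay distinct."""
    last = transcripts[-1] if transcripts else None
    if last and last["role"] == event["role"] and not last["isFinal"]:
        merged = {
            **last,
            "text": f'{last["text"]} {event["text"]}',  # simplistic merge for illustration
            "isFinal": event["isFinal"],
        }
        return transcripts[:-1] + [merged]
    return transcripts + [event]


turns: list[dict] = []
for ev in [
    {"role": "user", "text": "book a", "isFinal": False},
    {"role": "user", "text": "table", "isFinal": True},    # merges into the partial
    {"role": "user", "text": "for two", "isFinal": True},  # new turn: last is final
]:
    turns = append_transcript(turns, ev)

assert len(turns) == 2
assert turns[0]["text"] == "book a table" and turns[0]["isFinal"]
```

Without the `not last["isFinal"]` guard, the third event would collapse into the second and the two utterances would render as one turn.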
🧹 Nitpick comments (9)
bindu/utils/worker/messages.py (2)
57-71: `_decode_plain_text` fallback order makes `latin-1` effectively unreachable.

`cp1252` is almost-total: it only fails on five undefined code points (0x81, 0x8D, 0x8F, 0x90, 0x9D). As a result, any byte sequence that isn't valid UTF-8 will almost always be accepted as cp1252, silently misinterpreting real latin-1 files across 0x80–0x9F (e.g., `0x80` becomes `€` instead of a control char). If the intent is "Windows-first, then Unix", this is fine; if latin-1 is meant as a meaningful branch, swap them or drop it.

```diff
-    for encoding in ("utf-8", "cp1252", "latin-1"):
+    for encoding in ("utf-8", "latin-1", "cp1252"):
```

Alternatively, since `latin-1` never raises `UnicodeDecodeError`, consider dropping the loop and using `file_bytes.decode("utf-8", errors="replace")` directly when UTF-8 fails — the current replacement fallback on line 71 is already unreachable in practice.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/utils/worker/messages.py` around lines 57 - 71, The _decode_plain_text function currently tries encodings in ("utf-8","cp1252","latin-1") which makes latin-1 effectively unreachable because cp1252 will decode nearly any bytes; change the fallback strategy in _decode_plain_text: either swap the order to ("utf-8","latin-1","cp1252") if you want ISO-8859-1 to take precedence, or simplify by attempting only UTF-8 and on failure returning file_bytes.decode("utf-8", errors="replace") (i.e., remove the loop) so you avoid silent cp1252 mis-decoding; update the function body of _decode_plain_text accordingly and keep the existing logger calls consistent with the chosen behavior.
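Since the finding hinges on which decoders can fail, here is a runnable demonstration (byte values chosen for illustration) of why a `latin-1` branch after `cp1252` is dead code and what the single-fallback alternative returns:

```python
data = b"\x80\xfe"  # invalid UTF-8; valid in both cp1252 and latin-1

# Step 1: UTF-8 fails, so the loop falls through.
try:
    data.decode("utf-8")
except UnicodeDecodeError:
    pass

# Step 2: cp1252 accepts it (only 0x81, 0x8D, 0x8F, 0x90, 0x9D are undefined),
# mapping 0x80 to the euro sign rather than a C1 control character.
assert data.decode("cp1252") == "\u20ac\u00fe"   # "€þ"

# latin-1 maps every byte 0x00-0xFF, so it can never raise; as a later
# fallback it is unreachable whenever cp1252 already succeeded.
assert data.decode("latin-1") == "\x80\u00fe"

# The suggested simplification: one lossless attempt, then replacement.
text = data.decode("utf-8", errors="replace")
assert text == "\ufffd\ufffd"
```

This is why the review suggests either swapping the order (so latin-1 wins for 0x80–0x9F) or collapsing the chain to a single `errors="replace"` fallback.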
115-134: Dead branch at lines 117–118 — `base64_data` is already guaranteed non-empty here.

Line 95 rejects the part (`continue`) whenever `not base64_data`, so by the time we reach the `try:` block `base64_data` is truthy and `raise ValueError("Missing file bytes")` is unreachable. Minor cleanup:

```diff
     try:
-        # Decode the Base64 payload
-        if not base64_data:
-            raise ValueError("Missing file bytes")
-
-        padding = (
+        # Estimate decoded size before allocating to reject oversized payloads early
+        padding = (
             2 if base64_data.endswith("==") else 1 if base64_data.endswith("=") else 0
         )
```

The post-decode `len(file_bytes) > MAX_FILE_SIZE` check at lines 132–133 is good defense-in-depth against a pathological short-padded payload and should stay.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/utils/worker/messages.py` around lines 115 - 134, Remove the redundant precondition check inside the try block that raises ValueError("Missing file bytes") for base64_data (the variable base64_data is already checked earlier with continue), i.e. delete the if not base64_data: raise ValueError(...) branch and leave the rest of the logic (estimated_size calc, estimated_size > MAX_FILE_SIZE check, base64.b64decode and the post-decode len(file_bytes) > MAX_FILE_SIZE check) intact so defense-in-depth remains.

frontend/src/lib/server/database.ts (1)
458-540: Download-stream refactor correctly de-dupes emissions; one residual ordering caveat.

The `scheduled.{error,data,end}` flags ensure each event is scheduled at most once per stream lifetime, resolving the prior duplicate-delivery bug where each `on()`/`once()` registration queued its own emit. `once` wraps the callback and self-removes via `off`, and `destroy` guards `emit` via the `destroyed` flag. Nice cleanup.

Minor caveat: because each event is scheduled only on its first registration via independent `setTimeout(..., 0)` timers, the emit order is determined by listener-registration order, not by stream semantics. A consumer that registers `on("end", ...)` before `on("data", ...)` will observe `end` before `data`. Typical consumers register `data` first so this is unlikely to bite, but if you want to be defensive you can schedule both `data` and `end` together the first time either is registered (when `file` exists), preserving `data` → `end` order regardless of registration order.

♻️ Optional hardening for emit ordering
```diff
-      if (event === "data" && !scheduled.data && file) {
-        scheduled.data = true;
-        setTimeout(() => emit("data", file.data), 0);
-      }
-      if (event === "end" && !scheduled.end && file) {
-        scheduled.end = true;
-        setTimeout(() => emit("end"), 0);
-      }
+      if ((event === "data" || event === "end") && file && !scheduled.data) {
+        scheduled.data = true;
+        scheduled.end = true;
+        setTimeout(() => {
+          emit("data", file.data);
+          emit("end");
+        }, 0);
+      }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@frontend/src/lib/server/database.ts` around lines 458 - 540, The scheduled flags prevent duplicate emits but can deliver "end" before "data" if listeners register in reverse order; update scheduleInitialEmit inside openDownloadStream (and ensure downloadStream.on/once call it) so that when either "data" or "end" is first registered and file exists you schedule both emits together in the correct order: schedule "data" (set scheduled.data = true and setTimeout to emit data) and then schedule "end" (set scheduled.end = true and setTimeout to emit end), rather than scheduling only the single requested event, so consumers always see data → end regardless of registration order.

frontend/src/lib/services/agent-api.ts (1)
218-220: Redundant nullish fallback.
`AgentAPI.getBaseUrl()` always returns a non-empty string because the constructor (line 26) guarantees `this.baseUrl` is initialized to either the passed value or `'http://localhost:3773'`. The `?? 'http://localhost:3773'` here is dead code and misleads readers into thinking the method can return `null`/`undefined`.

♻️ Proposed simplification
```diff
-export function getAgentBaseUrl(): string {
-  return agentAPI.getBaseUrl() ?? 'http://localhost:3773';
-}
+export function getAgentBaseUrl(): string {
+  return agentAPI.getBaseUrl();
+}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@frontend/src/lib/services/agent-api.ts` around lines 218 - 220, getAgentBaseUrl contains a redundant nullish fallback because AgentAPI.getBaseUrl() is guaranteed to return a non-empty string; remove the "?? 'http://localhost:3773'" and simply return agentAPI.getBaseUrl() directly to avoid dead code and clarify intent (refer to getAgentBaseUrl and AgentAPI.getBaseUrl/AgentAPI constructor to locate the change).

bindu/extensions/voice/service_factory.py (1)
80-104: Redundant runtime validation of pydantic-validated setting.
`app_settings.voice.tts_fallback_provider` is typed as `Literal["none", "elevenlabs", "azure"]` (settings.py line 1067), so pydantic already guarantees it's a str in that set at load time. The `isinstance(..., str)` check and the membership re-validation at lines 81-85 are unreachable in practice; they'll also silently mask any future Literal expansion in settings. Consider trimming to just the direct use, and optionally narrow the fallback `except Exception` at line 100 to `(ImportError, ValueError, RuntimeError)` for consistency with the primary path.

♻️ Proposed simplification
```diff
-    provider = config.tts_provider
-    fallback_provider_raw = app_settings.voice.tts_fallback_provider
-    fallback_provider = (
-        fallback_provider_raw if isinstance(fallback_provider_raw, str) else "none"
-    )
-    if fallback_provider not in {"none", "elevenlabs", "azure"}:
-        fallback_provider = "none"
+    provider = config.tts_provider
+    fallback_provider = app_settings.voice.tts_fallback_provider
     try:
         return _create_tts_service_for_provider(provider, config)
     except (ImportError, ValueError) as primary_error:
-        if fallback_provider not in {"", "none", provider}:
+        if fallback_provider not in {"none", provider}:
             logger.warning(
                 "Primary TTS provider failed; attempting fallback",
                 ...
             )
             try:
                 return _create_tts_service_for_provider(fallback_provider, config)
-            except Exception as fallback_error:
+            except (ImportError, ValueError, RuntimeError) as fallback_error:
                 raise RuntimeError(
                     "TTS setup failed for primary and fallback providers"
                 ) from fallback_error
         raise
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/extensions/voice/service_factory.py` around lines 80 - 104, The current runtime re-validation of app_settings.voice.tts_fallback_provider (variables fallback_provider_raw and fallback_provider) is redundant because Pydantic already enforces Literal["none","elevenlabs","azure"]; remove the isinstance(...) and membership checks and use app_settings.voice.tts_fallback_provider directly as fallback_provider; keep the existing try of _create_tts_service_for_provider(provider, config) and, in the fallback branch that calls _create_tts_service_for_provider(fallback_provider, config), narrow the broad except Exception that catches fallback_error to a specific tuple like (ImportError, ValueError, RuntimeError) to match the primary path and avoid masking other errors; update logger usage to reference provider, fallback_provider, primary_error as before and preserve raising the fallback_error as the cause.

bindu/settings.py (2)
1127-1134: Consider logging the `session_token_ttl` clamp.

Silently clamping `session_token_ttl` to `session_timeout` lets the server start with a config that differs from what the operator set. For auth-sensitive values, a warning log so ops can see the mismatch in startup logs is preferable; otherwise a misconfigured env var is invisible until someone reads the source.

Note that using `logger` inside a pydantic `model_validator` can be tricky during settings bootstrap (circular import with `bindu.utils.logging` if it imports from `bindu.settings`). If that's a concern, emit the warning post-construction at the call site in `bindu/penguin/bindufy.py` where voice is wired up instead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/settings.py` around lines 1127 - 1134, The validator _validate_session_token_ttl in VoiceSettings currently silently clamps session_token_ttl to session_timeout; update it to emit a warning including the original (requested) TTL and the clamped value so operators see the mismatch, e.g., capture the old value before clamping and call a logger.warning with both values and context; if importing the project logger in bindu/settings.py risks a circular import, instead leave the validator as-is but add a post-construction check in the voice wiring code in bindu/penguin/bindufy.py (where the VoiceSettings instance is created/used) that compares session_token_ttl and session_timeout and logs the same warning when clamping is needed.
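A minimal sketch of the suggested clamp-and-warn behavior, kept free of pydantic so it runs standalone; the function name and log wording are illustrative, not the real `VoiceSettings` validator:

```python
import logging

logger = logging.getLogger("voice.settings")


def clamp_session_token_ttl(requested_ttl: int, session_timeout: int) -> int:
    """Clamp the token TTL to the session timeout, logging the mismatch so
    a misconfigured env var is visible in startup logs."""
    if requested_ttl > session_timeout:
        logger.warning(
            "session_token_ttl %ss exceeds session_timeout %ss; clamping",
            requested_ttl,
            session_timeout,
        )
        return session_timeout
    return requested_ttl


assert clamp_session_token_ttl(3600, 600) == 600  # clamped, with a warning
assert clamp_session_token_ttl(300, 600) == 300   # unchanged, silent
```

In a pydantic validator the same branch would call `logger.warning` before returning the clamped value; if the circular-import concern applies, the function can instead be invoked post-construction at the wiring site.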
9-9: Type-hint style inconsistency.

The rest of this file uses PEP 604 `str | None` style (e.g., lines 648, 658, 693, 697-700), but the new `redis_url: Optional[str]` at line 1109 imports and uses `Optional`. For consistency, prefer `str | None` and drop `Optional` from the import.

♻️ Proposed change
```diff
-from typing import Literal, Optional
+from typing import Literal
```

```diff
-    redis_url: Optional[str] = None  # e.g., "redis://localhost:6379/0"
+    redis_url: str | None = None  # e.g., "redis://localhost:6379/0"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/settings.py` at line 9, The import and type-hint for redis_url use Optional which is inconsistent with the file's PEP 604 style; update the import line to remove Optional (keep Literal) and change the annotation for redis_url from Optional[str] to str | None so the code uses the same `str | None` style used elsewhere (locate the import `from typing import Literal, Optional` and the `redis_url: Optional[str]` declaration to modify).

bindu/extensions/voice/redis_session_manager.py (2)
350-371: `zip` across two set iterations — add `strict=True` and materialize.

`members` is the return of `smembers` (a Python `set` when `decode_responses=True`). You iterate it twice: once to queue `pipeline.get(key)` (line 351-352) and once in `zip(members, values)` (line 357). CPython guarantees deterministic iteration order for an unchanged set within a single process, so this happens to work today, but it's brittle — any refactor that interleaves a mutation, or a future change to a non-set return type, can silently misalign keys with values.

Materialize to a list and use `strict=True` (also resolves Ruff B905):

♻️ Proposed fix
```diff
-        members = await self._redis_client.smembers(REDIS_ACTIVE_SET_KEY)
-        if not members:
+        members = list(await self._redis_client.smembers(REDIS_ACTIVE_SET_KEY))
+        if not members:
             return 0
         pipeline = self._redis_client.pipeline()
         for key in members:
             pipeline.get(key)
         values = await pipeline.execute()
         count = 0
         stale_members: list[str] = []
-        for key, data in zip(members, values):
+        for key, data in zip(members, values, strict=True):
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/extensions/voice/redis_session_manager.py` around lines 350 - 371, The bug is that `members` (returned from `smembers`) is a set and is iterated twice which can misalign keys and pipeline `values`; fix by materializing `members` to a list before creating the pipeline and by using zip(members_list, values, strict=True) when pairing keys to results (update references to `members` in the pipeline loop and the for loop to use the new `members_list`); this also satisfies Ruff B905 and prevents brittle iteration-order issues in the function (e.g., the method that contains `pipeline = self._redis_client.pipeline()` and the `zip(members, values)` usage).
405-446: Use a pipeline for bulk reads in the cleanup loop.

`_expire_timed_out_sessions` issues one `GET` per active-set member serially (line 415). On deployments with many concurrent voice sessions this becomes N round-trips on every 30-second tick, plus another N round-trips for the delete loop. Mirroring the pattern used in `get_active_count` (pipelined `GET`s) will keep cleanup scalable, and the deletions can also be pipelined.

Also, the cleanup loop runs every 30s regardless of `session_timeout` — consider making the interval derive from `session_timeout` (e.g., `min(30, session_timeout // 2)`) so short-lived test configs (e.g., `session_timeout=10`) still get timely expiration.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/extensions/voice/redis_session_manager.py` around lines 405 - 446, _expire_timed_out_sessions currently does N serial GETs and N serial deletes; change it to use a Redis pipeline like in get_active_count: after fetching members from REDIS_ACTIVE_SET_KEY, issue a pipelined GET for all keys via self._redis_client.pipeline() / multi_exec and await a single round-trip, map responses back to keys, then collect expired keys; for removals, use the same pipeline to either EVALSHA (when _delete_session_script_sha is set) or DELETE + SREM for each expired key in a single pipeline call to avoid N round-trips. Also make the cleanup tick caller derive its interval from self._session_timeout (e.g., interval = min(30, max(1, self._session_timeout // 2))) instead of a fixed 30s so short session_timeout values expire promptly—update wherever _expire_timed_out_sessions is scheduled to use that computed interval. Use identifiers _expire_timed_out_sessions, REDIS_ACTIVE_SET_KEY, _delete_session_script_sha, _redis_client, and get_active_count to locate the relevant code.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@bindu/extensions/did/did_agent_extension.py`:
- Around line 254-272: Stop using COMPUTERNAME and the "unknown_user" fallback
when resolving username in the ACL hardening code: remove
os.getenv("COMPUTERNAME") from the fallback chain and do not assign
"unknown_user"; instead if username cannot be determined, raise a RuntimeError
to fail closed. Also broaden the exception catch around os.getlogin() to
Exception (so non-OSError exceptions from Python 3.12 don’t abort the chain) and
wrap getpass.getuser() invocation in a try/except (catching Exception) so it
doesn’t raise out and block further fallbacks; keep the final RuntimeError
message from the existing RuntimeError(...) and keep the logger.warning only if
you retain any non-secure fallback (preferably remove that warning since we will
now fail).
- Around line 232-244: When writing the private key in the block that opens
self.private_key_path (inside the recreate_keys flow), call os.fchmod(fd, 0o600)
immediately after obtaining the descriptor (fd) and before writing private_pem
to the file so permissions are tightened on the open descriptor and no sensitive
bytes are exposed; keep the existing os.chmod call if you want to ensure
filesystem metadata is correct but the critical change is os.fchmod(fd, 0o600)
before the with os.fdopen(fd, "wb") write, and apply the analogous pattern for
the public key descriptor (use 0o644) if desired.
In `@bindu/extensions/voice/agent_bridge.py`:
- Around line 33-38: Replace the hardcoded voice defaults (MAX_HISTORY_TURNS,
DEFAULT_FIRST_TOKEN_TIMEOUT_SECONDS, DEFAULT_TOTAL_RESPONSE_TIMEOUT_SECONDS,
DEFAULT_CANCELLATION_GRACE_SECONDS, DEFAULT_THINKING_TEXT,
DEFAULT_TIMEOUT_FALLBACK_TEXT) with values sourced from app_settings.voice:
import app_settings from bindu.settings and use
app_settings.voice.<setting_name> (or getattr(app_settings.voice,
"<setting_name>", <current_constant>)) so the processor falls back to the
centralized configuration when voice_settings is omitted; also update any places
that currently reference the local constants (including the timeouts/strings
used around the voice_settings handling) to read from app_settings.voice instead
of the local constants.
- Around line 197-199: The debug statements in agent_bridge.py currently log raw
transcript text (logger.debug(f"Voice user ({self._context_id}):
{text[:80]}...")) which may contain PII; replace those debug calls (including
the similar call around lines 231-233) to avoid printing the utterance itself
and instead log safe metadata such as self._context_id and the character count
(e.g., "Voice user (context_id): length=NN chars") or a hashed/obfuscated token
if you need to correlate, leaving out the raw text; update both logger.debug
occurrences referencing self._context_id and text to follow this privacy-safe
pattern.
In `@bindu/extensions/voice/redis_session_manager.py`:
- Around line 120-148: The current __aenter__ implementation loads and caches
Lua script SHAs into _create_session_script_sha, _update_session_script_sha and
_delete_session_script_sha using
script_load(_CREATE_SESSION_LUA/_UPDATE_SESSION_LUA/_DELETE_SESSION_LUA);
replace that with redis-py’s client.register_script(...) to obtain AsyncScript
objects (e.g. create_session_script =
self._redis_client.register_script(_CREATE_SESSION_LUA), etc.), remove the
manual SHA caching and script_load calls in __aenter__, and update all places
that call evalsha(...) (the session create/update/delete/expire sites) to use
the AsyncScript.call(...) method so the library will auto-reload scripts on
NoScriptError and handle cache eviction transparently.
In `@bindu/server/endpoints/voice_endpoints.py`:
- Around line 798-800: The current log in the on_agent_transcript handler prints
transcript text (text[:50]) which can contain sensitive data; change the
logger.info call (the one referencing text and is_final) to avoid any transcript
content and instead log privacy-safe metadata such as text length (len(text)),
is_final, and any other context IDs (session_id, agent_id) if available—do not
include the transcript or portions of it in the message. Ensure you update the
logger.info invocation in voice_endpoints.py that mentions text/is_final to
reflect only these metadata fields.
- Around line 977-989: The teardown silently ignores failures—update_state,
bridge cleanup, and end_session can all raise exceptions but are caught without
logging; change each except block to log the error (including session_id and
exception details) instead of passing so failures are visible for debugging.
Specifically around session_manager.update_state(session_id, "ending"),
components["bridge"].cleanup_background_tasks(), and
session_manager.end_session(session_id) replace the bare excepts with logging
calls (e.g., logger.exception or logger.error with exc_info) that include a
descriptive message and the session_id to aid diagnostics.
- Around line 912-928: The current flow awaits only runner_task so reader_task
(from _voice_control_reader) can fail or return without being observed; change
coordination to await both runner_task and reader_task (e.g., use asyncio.gather
or asyncio.wait) and ensure any exception from reader_task is retrieved and
handled—if reader_task errors or returns early, cancel runner_task (runner.run)
and propagate/log the reader exception, and if runner_task errors cancel
reader_task; make these changes around the creation of runner_task and
reader_task so both tasks are awaited and cleaned up before leaving the async
with timeout block.
In `@bindu/utils/notifications.py`:
- Around line 132-139: Update _resolve_and_validate_destination to validate the
URL scheme and tighten port logic: ensure parsed.scheme is present and one of
"http" or "https" (raise ValueError otherwise), and only default port to 80/443
when scheme is http/https; for any other scheme require an explicit port or
reject the URL. This defends calls from _post_once that bypass validate_config
and keeps hostname, scheme, and port handling consistent across validate_config
and _resolve_and_validate_destination.
- Around line 227-244: The URL rewrite currently replaces the hostname with
destination_ip and sets the Host header (variables target_url, parsed,
destination_ip, headers, req, request.urlopen), which breaks TLS hostname
verification and mishandles IPv6 and userinfo; instead, for HTTPS keep the
original hostname in the request URL (so certificate verification and SNI use
it) and use a custom transport/connection that connects to the resolved IP (so
the socket connects to destination_ip while preserving server_hostname for SNI),
for HTTP only construct the netloc and Host header from parsed.hostname and
parsed.port (wrap IPv6 in brackets like [::1] and never include
parsed.username/parsed.password in Host), and remove setting Host to
parsed.netloc; ensure req.add_header uses the sanitized hostname-only Host value
and use the custom transport when calling urlopen with self.timeout.
In `@bindu/utils/worker/messages.py`:
- Line 21: Replace the hardcoded MAX_FILE_SIZE module constant by adding a new
settings entry (e.g., worker.max_file_size) in the central app_settings schema
with a default of 10 * 1024 * 1024 (10 MiB), then import
bindu.settings.app_settings into bindu/utils/worker/messages.py and replace all
uses of MAX_FILE_SIZE with app_settings.worker.max_file_size (including the two
current call sites where MAX_FILE_SIZE is referenced in this module). Remove the
MAX_FILE_SIZE constant declaration and ensure any type conversions/units match
the original behavior so no runtime changes occur.
In `@frontend/src/lib/components/chat/ChatInput.svelte`:
- Around line 36-39: ChatInput currently clears its bound draft (value and
files) immediately after onsubmit resolves, which can lose user input if the
parent aborts or send fails; change the contract so onsubmit returns a boolean
success (Promise<boolean> | boolean) and update the submit handler (the internal
submit function that calls the onsubmit prop) to only clear the bound
value/files when onsubmit resolves true; alternatively, remove automatic
clearing entirely and let the parent clear the bound props—adjust the onsubmit
type signature and the submit handler logic in ChatInput to reflect the chosen
approach (reference the onsubmit prop and the internal handler that currently
clears value/files, including the similar logic at the region around the lines
referenced 184-190).
In `@frontend/src/lib/components/chat/ChatWindow.svelte`:
- Around line 173-176: The loop is discarding updates from sendAgentMessage() so
message/task/final-answer updates never get applied; change the for-await loop
to process each yielded update (don't use the unused _update variable) and apply
them to the component state by invoking the existing onmessage handler if
present (onmessage(update)) or otherwise updating the messages store/array
(e.g., append user/assistant messages, merge task metadata into the related
message, and set final-answer content when received). Ensure you still fully
consume the async iterator but handle each update case so the user message, task
metadata, and assistant response are rendered (referencing sendAgentMessage,
onmessage, messages, and contextId/fileParts to locate the code).
In `@frontend/src/lib/components/voice/VoiceCallPanel.svelte`:
- Around line 68-82: The current logic resets playbackNextTime back to
ctx.currentTime after scheduling (variables: audioBuffer, source, startAt,
playbackNextTime, ctx), which can move the timeline backwards and cause the next
chunk to start while previous audio is still queued and leak playback into STT;
remove or stop moving playbackNextTime backwards — do not set playbackNextTime =
ctx.currentTime after source.start — and instead compute the queued audio
duration (queuedDuration = playbackNextTime - ctx.currentTime) and hold/mute
voice input for that queuedDuration (not only for speech-synthesis fallback) so
the microphone is blocked for the full time audio remains scheduled.
In `@frontend/src/lib/stores/chat.ts`:
- Around line 21-32: The FilePart.file.bytes field must be a base64 string to
avoid JSON serialization loss: update the FilePart type to narrow file.bytes to
string only and modify sendMessage (the function that calls
agentAPI.sendMessage) to detect non-string binary inputs
(Uint8Array/ArrayBuffer/Blob) and convert them to base64 before forwarding;
reference FilePart / FilePart.file.bytes, sendMessage, and agentAPI.sendMessage
(agent-api.ts uses JSON.stringify) so reviewers can find where to change the
type and add the binary->base64 conversion/validation.
In `@frontend/src/lib/utils/agentMessageHandler.ts`:
- Around line 66-73: The FilePart.file.bytes type and any places that accept
fileParts[].value must be narrowed to string (base64) and ensure binary inputs
are converted before JSON.stringify: update the FilePart interface
(FilePart.file.bytes) to accept only string (base64), and in the code paths that
build/send messages (the logic that iterates fileParts / handles value
conversion, referenced around the handlers at lines where fileParts[].value is
used) detect ArrayBuffer/Uint8Array/Blob and convert them to base64 strings
before attaching to the payload; keep the name/shape of FilePart and preserve
mimeType/name but encode bytes to base64 so JSON.stringify sends correct file
data.
---
Outside diff comments:
In `@frontend/src/lib/components/chat/ChatWindow.svelte`:
- Around line 294-301: The derived "sources" creation uses file.type directly
which can be empty and causes sendAgentMessage to drop file parts; update the
mapping in the sources computation (where file2base64 is used) to provide the
same MIME fallback as ChatInput (e.g. use file.type || <ChatInput's fallback>
such as 'application/octet-stream' or file.mime) so blank MIME values are
replaced before creating the part; ensure this change is applied alongside
file2base64 usage and will produce parts accepted by sendAgentMessage.
---
Minor comments:
In `@bindu/penguin/config_validator.py`:
- Around line 154-157: The code trims the local variable deployment_url but does
not store the normalized value back into the config, so validate_and_process()
can return the original untrimmed string; update the nested config key (e.g.,
config["deployment"]["url"] or equivalent structure used in
validate_and_process()) with deployment_url.strip() before calling urlparse and
before building the processed result so the normalized URL is persisted; ensure
you still use the trimmed deployment_url for parsing/validation (parsed =
urlparse(deployment_url)) and that the returned processed config contains the
trimmed value.
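The fix above can be sketched as follows. This is a minimal illustration, assuming the config nests the URL under `config["deployment"]["url"]` (the real key structure in `validate_and_process()` may differ): the trimmed value is written back before parsing so the returned config is normalized.

```python
from urllib.parse import urlparse


def validate_and_process(config: dict) -> dict:
    """Sketch: persist the trimmed deployment URL back into the config.

    The config shape (config["deployment"]["url"]) is an assumption for
    illustration only.
    """
    deployment_url = config["deployment"]["url"].strip()
    # Write the normalized value back so the processed result is trimmed too.
    config["deployment"]["url"] = deployment_url
    parsed = urlparse(deployment_url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        raise ValueError(f"Invalid deployment URL: {deployment_url!r}")
    return config


processed = validate_and_process(
    {"deployment": {"url": "  https://example.com:8080/agent  "}}
)
```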
In `@bindu/utils/notifications.py`:
- Around line 251-259: The except block catching error.HTTPError currently
raises NotificationDeliveryError without chaining the original exception; update
the raise to preserve the cause by using "from exc" so traceback context is
explicit (i.e., when handling error.HTTPError as exc, raise
NotificationDeliveryError(status, message or f"HTTP error {status}") from exc),
keeping consistent with exception chaining used elsewhere and satisfying Ruff
B904; locate this in bindu/utils/notifications.py within the HTTPError handling
where exc, status, body, message are computed.
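The exception-chaining pattern the comment asks for can be shown in isolation. `NotificationDeliveryError` here is a stand-in class with the name from the review; the real one lives in `bindu/utils/notifications.py`:

```python
import urllib.error


class NotificationDeliveryError(Exception):
    """Illustrative stand-in for the project's delivery error."""

    def __init__(self, status: int, message: str) -> None:
        super().__init__(message)
        self.status = status


def deliver() -> None:
    # Simulate the webhook POST failing with an HTTP error.
    try:
        raise urllib.error.HTTPError(
            "https://example.com/webhook", 502, "Bad Gateway", None, None
        )
    except urllib.error.HTTPError as exc:
        status = exc.code
        message = ""
        # "from exc" chains the original HTTPError as __cause__ (Ruff B904).
        raise NotificationDeliveryError(
            status, message or f"HTTP error {status}"
        ) from exc


try:
    deliver()
    caught = None
except NotificationDeliveryError as err:
    caught = err
```

With the `from exc` clause, the traceback shows "The above exception was the direct cause of..." instead of the misleading "During handling of another exception...".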
In `@bindu/utils/worker/messages.py`:
- Line 93: The assignment currently uses a dead fallback for a non-existent
"data" key; update the logic in bindu/utils/worker/messages.py to read only the
documented FileWithBytes field (use file_info.get("bytes") and remove the
file_info.get("data", "") branch), and also adjust any related error/log
messages in the same module that mention "data" so they only reference "bytes"
(or add a clear comment if you intentionally keep legacy handling). Target the
base64_data assignment and any subsequent error handling that references "data"
to ensure consistency with the FileWithBytes type.
In `@frontend/src/lib/server/database.ts`:
- Around line 94-102: insertOne currently casts any incoming _id to ObjectId
which can lie and break callers; change insertOne to normalize the id: if
source._id is already an ObjectId use it, else if it's a string/hex try to
construct a new ObjectId from that string (wrap in try/catch and fall back to
new ObjectId()), otherwise generate a fresh new ObjectId; ensure docWithId._id
is that normalized ObjectId, store it in this.data, and return that real
ObjectId as insertedId (or if you prefer widen the function signature return
type to ObjectId | string, pick one consistent approach and update insertOne's
return type accordingly).
In `@frontend/src/lib/stores/voice.ts`:
- Around line 66-86: In appendTranscript, avoid merging a new event into the
previous item when that previous turn is finalized; currently any consecutive
event with the same role gets merged. Change the merge guard so merging only
occurs if last exists, last.role === event.role, AND last.isFinal === false
(i.e., only merge into non-finalized turns). Use the same merge logic
(mergeTranscriptText, building merged with isFinal and ts) and update
currentUserTranscript/currentAgentTranscript when merging; otherwise push event
as a new transcript entry.
In `@tests/unit/extensions/voice/test_voice_websocket_integration.py`:
- Around line 112-120: The test relies on _voice_preflight_error() but only
patches STT/TTS API keys; explicitly patch provider/auth settings to make the
test deterministic by setting module.app_settings.voice.session_auth_required =
False (or restoring it to False in the timeout test) and explicitly set any
provider-related fields used by _voice_preflight_error() (e.g., stt_api_key,
tts_api_key and provider selection if present) before the assertion, and restore
them after; apply the same pattern for the other occurrence around lines
referenced (the timeout test and the block at 201-208) so both tests
deterministically exercise the same branch.
---
Duplicate comments:
In `@bindu/extensions/did/did_agent_extension.py`:
- Around line 218-229: The rollback currently only runs on RuntimeError, so if
subprocess.run or other calls raise OSError/FileNotFoundError (or any other
exception) before _harden_windows_key_file_acl converts it, the key files
remain; change the except to catch the broader exception class (e.g., except
(RuntimeError, OSError, FileNotFoundError) or simply except Exception) around
the calls to self._harden_windows_key_file_acl(self.private_key_path) and
self._harden_windows_key_file_acl(self.public_key_path) and keep the existing
loop that unlinks self.private_key_path and self.public_key_path (with
missing_ok=True and the same OSError warning) before re-raising the original
exception so all failures trigger the same rollback.
In `@bindu/server/endpoints/voice_endpoints.py`:
- Around line 50-65: The _RATE_LIMIT_LUA string (and other hardcoded constants
like the Redis TTL value 120 inside it, audio frame limits, queue depth, and
echo-suppression window values referenced elsewhere) must be moved into
bindu.settings.app_settings and referenced from there instead of hardcoding;
update the code that builds/uses _RATE_LIMIT_LUA to read TTL and limit
thresholds from app_settings (e.g., app_settings.voice.redis_ttl,
app_settings.voice.rate_limit, app_settings.voice.frame_limit,
app_settings.voice.queue_depth, app_settings.voice.echo_suppression_window) and
replace other hardcoded occurrences (the blocks around the other referenced
sections) to use these app_settings keys so production tuning can be done via
configuration.
---
Nitpick comments:
In `@bindu/extensions/voice/redis_session_manager.py`:
- Around line 350-371: The bug is that `members` (returned from `smembers`) is a
set and is iterated twice which can misalign keys and pipeline `values`; fix by
materializing `members` to a list before creating the pipeline and by using
zip(members_list, values, strict=True) when pairing keys to results (update
references to `members` in the pipeline loop and the for loop to use the new
`members_list`); this also satisfies Ruff B905 and prevents brittle
iteration-order issues in the function (e.g., the method that contains `pipeline
= self._redis_client.pipeline()` and the `zip(members, values)` usage).
- Around line 405-446: _expire_timed_out_sessions currently does N serial GETs
and N serial deletes; change it to use a Redis pipeline like in
get_active_count: after fetching members from REDIS_ACTIVE_SET_KEY, issue a
pipelined GET for all keys via self._redis_client.pipeline() / multi_exec and
await a single round-trip, map responses back to keys, then collect expired
keys; for removals, use the same pipeline to either EVALSHA (when
_delete_session_script_sha is set) or DELETE + SREM for each expired key in a
single pipeline call to avoid N round-trips. Also make the cleanup tick caller
derive its interval from self._session_timeout (e.g., interval = min(30, max(1,
self._session_timeout // 2))) instead of a fixed 30s so short session_timeout
values expire promptly—update wherever _expire_timed_out_sessions is scheduled
to use that computed interval. Use identifiers _expire_timed_out_sessions,
REDIS_ACTIVE_SET_KEY, _delete_session_script_sha, _redis_client, and
get_active_count to locate the relevant code.
In `@bindu/extensions/voice/service_factory.py`:
- Around line 80-104: The current runtime re-validation of
app_settings.voice.tts_fallback_provider (variables fallback_provider_raw and
fallback_provider) is redundant because Pydantic already enforces
Literal["none","elevenlabs","azure"]; remove the isinstance(...) and membership
checks and use app_settings.voice.tts_fallback_provider directly as
fallback_provider; keep the existing try of
_create_tts_service_for_provider(provider, config) and, in the fallback branch
that calls _create_tts_service_for_provider(fallback_provider, config), narrow
the broad except Exception that catches fallback_error to a specific tuple like
(ImportError, ValueError, RuntimeError) to match the primary path and avoid
masking other errors; update logger usage to reference provider,
fallback_provider, primary_error as before and preserve raising the
fallback_error as the cause.
In `@bindu/settings.py`:
- Around line 1127-1134: The validator _validate_session_token_ttl in
VoiceSettings currently silently clamps session_token_ttl to session_timeout;
update it to emit a warning including the original (requested) TTL and the
clamped value so operators see the mismatch, e.g., capture the old value before
clamping and call a logger.warning with both values and context; if importing
the project logger in bindu/settings.py risks a circular import, instead leave
the validator as-is but add a post-construction check in the voice wiring code
in bindu/penguin/bindufy.py (where the VoiceSettings instance is created/used)
that compares session_token_ttl and session_timeout and logs the same warning
when clamping is needed.
- Line 9: The import and type-hint for redis_url use Optional which is
inconsistent with the file's PEP 604 style; update the import line to remove
Optional (keep Literal) and change the annotation for redis_url from
Optional[str] to str | None so the code uses the same `str | None` style used
elsewhere (locate the import `from typing import Literal, Optional` and the
`redis_url: Optional[str]` declaration to modify).
In `@bindu/utils/worker/messages.py`:
- Around line 57-71: The _decode_plain_text function currently tries encodings
in ("utf-8","cp1252","latin-1") which makes latin-1 effectively unreachable
because cp1252 will decode nearly any bytes; change the fallback strategy in
_decode_plain_text: either swap the order to ("utf-8","latin-1","cp1252") if you
want ISO-8859-1 to take precedence, or simplify by attempting only UTF-8 and on
failure returning file_bytes.decode("utf-8", errors="replace") (i.e., remove the
loop) so you avoid silent cp1252 mis-decoding; update the function body of
_decode_plain_text accordingly and keep the existing logger calls consistent
with the chosen behavior.
- Around line 115-134: Remove the redundant precondition check inside the try
block that raises ValueError("Missing file bytes") for base64_data (the variable
base64_data is already checked earlier with continue), i.e. delete the if not
base64_data: raise ValueError(...) branch and leave the rest of the logic
(estimated_size calc, estimated_size > MAX_FILE_SIZE check, base64.b64decode and
the post-decode len(file_bytes) > MAX_FILE_SIZE check) intact so
defense-in-depth remains.
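The checks that remain after removing the redundant branch can be sketched as follows. `MAX_FILE_SIZE` is an illustrative limit, not the project's actual value:

```python
import base64

MAX_FILE_SIZE = 10 * 1024 * 1024  # illustrative limit, not the real setting


def decode_file(base64_data: str) -> bytes:
    # Cheap pre-check: a base64 string of length n decodes to roughly 3n/4 bytes.
    estimated_size = len(base64_data) * 3 // 4
    if estimated_size > MAX_FILE_SIZE:
        raise ValueError("File too large (pre-decode estimate)")
    file_bytes = base64.b64decode(base64_data)
    # Defense in depth: re-check the exact size after decoding.
    if len(file_bytes) > MAX_FILE_SIZE:
        raise ValueError("File too large")
    return file_bytes


payload = decode_file(base64.b64encode(b"hello world").decode())

try:
    decode_file("A" * (MAX_FILE_SIZE * 2))  # estimate exceeds the cap
    rejected = False
except ValueError:
    rejected = True
```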
In `@frontend/src/lib/server/database.ts`:
- Around line 458-540: The scheduled flags prevent duplicate emits but can
deliver "end" before "data" if listeners register in reverse order; update
scheduleInitialEmit inside openDownloadStream (and ensure downloadStream.on/once
call it) so that when either "data" or "end" is first registered and file exists
you schedule both emits together in the correct order: schedule "data" (set
scheduled.data = true and setTimeout to emit data) and then schedule "end" (set
scheduled.end = true and setTimeout to emit end), rather than scheduling only
the single requested event, so consumers always see data → end regardless of
registration order.
In `@frontend/src/lib/services/agent-api.ts`:
- Around line 218-220: getAgentBaseUrl contains a redundant nullish fallback
because AgentAPI.getBaseUrl() is guaranteed to return a non-empty string; remove
the "?? 'http://localhost:3773'" and simply return agentAPI.getBaseUrl()
directly to avoid dead code and clarify intent (refer to getAgentBaseUrl and
AgentAPI.getBaseUrl/AgentAPI constructor to locate the change).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: f71b8a22-7e31-4d5e-be60-7fb777d46e4a
📒 Files selected for processing (40)
- .gitignore
- bindu/extensions/__init__.py
- bindu/extensions/did/did_agent_extension.py
- bindu/extensions/voice/agent_bridge.py
- bindu/extensions/voice/pipeline_builder.py
- bindu/extensions/voice/redis_session_manager.py
- bindu/extensions/voice/service_factory.py
- bindu/extensions/voice/session_factory.py
- bindu/grpc/client.py
- bindu/grpc/server.py
- bindu/grpc/service.py
- bindu/penguin/config_validator.py
- bindu/server/applications.py
- bindu/server/endpoints/voice_endpoints.py
- bindu/settings.py
- bindu/utils/notifications.py
- bindu/utils/worker/messages.py
- frontend/.env.example
- frontend/src/lib/components/chat/ChatInput.svelte
- frontend/src/lib/components/chat/ChatWindow.svelte
- frontend/src/lib/components/voice/LiveTranscript.svelte
- frontend/src/lib/components/voice/VoiceCallPanel.svelte
- frontend/src/lib/migrations/lock.ts
- frontend/src/lib/migrations/routines/10-update-reports-assistantid.ts
- frontend/src/lib/server/config.ts
- frontend/src/lib/server/database.ts
- frontend/src/lib/server/files/uploadFile.ts
- frontend/src/lib/services/agent-api.ts
- frontend/src/lib/services/voice-client.ts
- frontend/src/lib/stores/chat.ts
- frontend/src/lib/stores/voice.ts
- frontend/src/lib/utils/agentMessageHandler.ts
- frontend/src/lib/utils/paymentHandler.ts
- pytest.ini
- tests/integration/grpc/test_grpc_e2e.py
- tests/unit/extensions/voice/test_session_manager.py
- tests/unit/extensions/voice/test_voice_websocket_integration.py
- tests/unit/penguin/test_bindufy.py
- tests/unit/server/scheduler/test_memory_scheduler.py
- tests/unit/test_minimax_example.py
✅ Files skipped from review due to trivial changes (4)
- frontend/src/lib/migrations/lock.ts
- bindu/extensions/__init__.py
- frontend/src/lib/components/voice/LiveTranscript.svelte
- tests/unit/server/scheduler/test_memory_scheduler.py
🚧 Files skipped from review as they are similar to previous changes (7)
- pytest.ini
- frontend/src/lib/migrations/routines/10-update-reports-assistantid.ts
- frontend/src/lib/server/config.ts
- .gitignore
- bindu/extensions/voice/pipeline_builder.py
- bindu/server/applications.py
- tests/unit/extensions/voice/test_session_manager.py
Actionable comments posted: 6
♻️ Duplicate comments (8)
bindu/utils/notifications.py (2)
131-138: ⚠️ Potential issue | 🟠 Major
Validate the URL scheme inside the destination helper too.
`_post_once()` can call `_resolve_and_validate_destination()` directly when `resolved_ip` is omitted, so non-HTTP schemes can bypass `validate_config()` and still reach `urlopen()`. Keep the scheme gate in this helper as the single defensive boundary.
Proposed fix
```diff
 @staticmethod
 def _resolve_and_validate_destination(url: str) -> str:
     parsed = urlparse(url)
+    if parsed.scheme not in {"http", "https"}:
+        raise ValueError("Push notification URL must use http or https scheme.")
+
     hostname = parsed.hostname
     if not hostname:
         raise ValueError("Push notification URL must include a valid hostname.")
```

```bash
#!/bin/bash
# Verify the helper contains its own scheme validation.
python - <<'PY'
from pathlib import Path

source = Path("bindu/utils/notifications.py").read_text()
start = source.index("    def _resolve_and_validate_destination")
end = source.index("\n    @create_retry_decorator", start)
body = source[start:end]
print(body)
if "parsed.scheme not in" not in body and "parsed.scheme" not in body:
    raise SystemExit("Missing scheme validation in _resolve_and_validate_destination")
PY
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/utils/notifications.py` around lines 131 - 138, The helper _resolve_and_validate_destination currently only checks hostname/port but must also validate URL scheme; add a guard in _resolve_and_validate_destination that ensures parsed.scheme is one of the allowed HTTP schemes (e.g., "http" or "https") and raise ValueError for other schemes so callers like _post_once (which may call this helper directly) cannot bypass validate_config(); reference parsed.scheme and parsed.hostname in the same function and return the validated URL string after scheme check.
226-243: ⚠️ Potential issue | 🟠 Major
Avoid rewriting HTTPS URLs to the resolved IP.
For `https://` webhooks, `urllib` verifies TLS against the hostname in `target_url`; the `Host` header does not preserve SNI/certificate validation. This will break normal HTTPS webhooks and still mishandles IPv6 netlocs and userinfo in `parsed.netloc`. Use a transport that connects to `destination_ip` while preserving the original hostname for HTTPS, and only build sanitized `Host` values from `parsed.hostname`/`parsed.port`.

```bash
#!/bin/bash
# Demonstrate the current URL rewrite changes the HTTPS URL host used by urllib.
python - <<'PY'
from urllib.parse import urlparse

parsed = urlparse("https://example.com:8443/webhook")
destination_ip = "203.0.113.10"
target_url = parsed._replace(netloc=f"{destination_ip}:{parsed.port}").geturl()
print("rewritten_url =", target_url)
print("urllib_url_hostname =", urlparse(target_url).hostname)
print("original_hostname =", parsed.hostname)

ipv6_parsed = urlparse("http://example.com:8080/webhook")
ipv6_destination = "2001:db8::10"
bad_netloc = f"{ipv6_destination}:{ipv6_parsed.port}"
print("unbracketed_ipv6_netloc =", bad_netloc)
PY
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/utils/notifications.py` around lines 226 - 243, The code rewrites target_url to the resolved IP and sets Host from parsed.netloc which breaks TLS SNI/cert validation and mishandles IPv6/userinfo; change the logic so that for parsed.scheme == "https" you do NOT replace the hostname in target_url (keep the original parsed.geturl()), and instead open the connection to destination_ip using a transport that preserves server_hostname/SNI (e.g., use http.client.HTTPSConnection(destination_ip, port=parsed.port, context=ssl_ctx, server_hostname=parsed.hostname) or a urllib opener/handler that does the same) while setting the Host header from parsed.hostname and parsed.port (not parsed.netloc), ensuring IPv6 addresses are bracketed and any userinfo is excluded; for plain http you may continue using destination_ip in the netloc but still build Host from parsed.hostname/parsed.port.
bindu/server/endpoints/voice_endpoints.py (6)
801-803: ⚠️ Potential issue | 🟠 Major
Do not log agent transcript text.
This logs spoken agent output content, which can include sensitive user data echoed back by the agent. Log metadata only.
🛡️ Privacy-safe log
```diff
-        logger.info(
-            f"on_agent_transcript: got text='{text[:50]}...' is_final={is_final}"
-        )
+        logger.info(
+            "on_agent_transcript: got %d chars, is_final=%s",
+            len(text),
+            is_final,
+        )
```
Verify each finding against the current code and only fix it if needed. In `@bindu/server/endpoints/voice_endpoints.py` around lines 801 - 803, The logger.info call in on_agent_transcript currently logs the agent's spoken text (logger.info(f"on_agent_transcript: got text='{text[:50]}...' is_final={is_final}")), which can expose sensitive content; change it to avoid logging any transcript text and log only metadata (e.g., is_final, text length, agent_id/session_id) instead—update the logger.info invocation in on_agent_transcript (and any similar uses of logger.info that reference text or text[:50]) to omit the text variable and include only non-sensitive fields like is_final and len(text) or relevant IDs.
974-999: ⚠️ Potential issue | 🟠 Major
Log teardown failures instead of swallowing them.
Several teardown paths still use `except Exception: pass`, so failed state updates, bridge cleanup, session removal, or close calls leave no diagnostics.
🧹 Proposed teardown logging
```diff
 if websocket.client_state == WebSocketState.CONNECTED:
     try:
         await _send_json(
             websocket, {"type": "state", "state": "ended"}, send_lock
         )
     except Exception:
-        pass
+        logger.exception("Failed to send ended state for voice session %s", session_id)
 try:
     await session_manager.update_state(session_id, "ending")
 except Exception:
-    pass
+    logger.exception("Failed to mark voice session %s as ending", session_id)
 try:
     if components is not None:
         await components["bridge"].cleanup_background_tasks()
 except Exception:
-    pass
+    logger.exception("Failed to cleanup voice bridge for session %s", session_id)
 try:
     await session_manager.end_session(session_id)
 except Exception:
-    pass
+    logger.exception("Failed to end voice session %s", session_id)
 if websocket.client_state == WebSocketState.CONNECTED:
     try:
         await websocket.close()
     except Exception:
-        pass
+        logger.exception("Failed to close voice websocket for session %s", session_id)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/server/endpoints/voice_endpoints.py` around lines 974 - 999, Replace the bare "except Exception: pass" blocks in the teardown with logged error handling so failures surface: wrap the calls to _send_json(websocket, ...), session_manager.update_state(session_id, "ending"), components["bridge"].cleanup_background_tasks(), session_manager.end_session(session_id), and websocket.close() in try/except and log the exception (including stack/exception details) instead of silently passing — e.g., use logging.exception or your module logger to emit a descriptive message for each failing operation (refer to websocket.client_state, _send_json, session_manager.update_state, components["bridge"].cleanup_background_tasks, session_manager.end_session, websocket.close).
917-933: ⚠️ Potential issue | 🟠 Major
Wait for the control reader as well as the pipeline runner.
The session still awaits only `runner_task`; if `_voice_control_reader()` returns or fails first, stop/malformed-frame handling may not terminate the runner until timeout, and reader exceptions can be delayed until teardown.
🔧 Proposed task coordination
```diff
-            async with asyncio.timeout(float(app_settings.voice.session_timeout)):
-                await runner_task
+            async with asyncio.timeout(float(app_settings.voice.session_timeout)):
+                done, pending = await asyncio.wait(
+                    {runner_task, reader_task},
+                    return_when=asyncio.FIRST_COMPLETED,
+                )
+                for completed_task in done:
+                    completed_task.result()
+                for pending_task in pending:
+                    pending_task.cancel()
```
Verify each finding against the current code and only fix it if needed. In `@bindu/server/endpoints/voice_endpoints.py` around lines 917 - 933, The code currently only awaits runner_task; also await reader_task and ensure if either task fails or returns first we cancel the other and propagate exceptions: after creating runner_task and reader_task, use asyncio.wait({runner_task, reader_task}, return_when=asyncio.FIRST_EXCEPTION) inside the asyncio.timeout, then cancel any pending tasks (call .cancel()), await them (e.g., await asyncio.gather(*pending, return_exceptions=True)) and re-raise any exception from completed tasks so failures in _voice_control_reader or runner.run are surfaced and the other task is cleaned up.
50-65: ⚠️ Potential issue | 🟠 Major
Move the remaining voice tuning constants into `app_settings`.
The Redis TTL, frame-size/fps/in-flight caps, and echo-suppression windows are still hardcoded. These are deployment knobs and should come from `app_settings.voice` instead of local constants.
As per coding guidelines, "Use `bindu.settings.app_settings` for all configuration - never hardcode values such as URLs, ports, timeouts, API keys, or feature flags" and "NEVER create local config constants - use `app_settings` from `bindu.settings`". Based on learnings, Applies to bindu/**/*.py : Use `bindu.settings.app_settings` for all configuration - never hardcode values such as URLs, ports, timeouts, API keys, or feature flags.
Also applies to: 421-423, 776-799
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/server/endpoints/voice_endpoints.py` around lines 50 - 65, Constants for voice tuning (Redis TTL, frame size/fps/in-flight caps, echo-suppression windows) are hardcoded in voice_endpoints.py (e.g., the _RATE_LIMIT_LUA block and other local constants); move these into bindu.settings.app_settings.voice and read them at runtime instead of local literals. Import app_settings from bindu.settings, add/consume fields like app_settings.voice.redis_ttl, .frame_size, .fps, .max_in_flight, .echo_suppression_window (or similar descriptive names), replace the local constants and usages (including the _RATE_LIMIT_LUA TTL/limit args and any frame/fps/in-flight variables) to reference app_settings.voice values, and ensure defaults/validation are handled where app_settings is loaded.
583-590: ⚠️ Potential issue | 🟠 Major
Validate `context_id` as a UUID before creating the session.
This still accepts any non-empty string even though the endpoint documents `{ "context_id": "<uuid>" }`; malformed IDs will enter the voice/task layer.
🛡️ Proposed validation
```diff
-from uuid import uuid4
+from uuid import UUID, uuid4
 ...
     if isinstance(body, dict) and "context_id" in body:
         raw_context_id = body["context_id"]
         if not isinstance(raw_context_id, str) or not raw_context_id.strip():
             raise HTTPException(
                 status_code=400,
                 detail="context_id must be a non-empty string",
             )
-        context_id = raw_context_id.strip()
+        try:
+            context_id = str(UUID(raw_context_id.strip()))
+        except ValueError as exc:
+            raise HTTPException(
+                status_code=400,
+                detail="context_id must be a valid UUID",
+            ) from exc
```
As per coding guidelines, "Validate all external input and use type hints for input validation in Python files".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/server/endpoints/voice_endpoints.py` around lines 583 - 590, The current check only ensures raw_context_id is a non-empty string but does not verify it's a UUID; update the validation where raw_context_id/context_id is handled (the block using raw_context_id and context_id in voice_endpoints.py) to parse and validate it as a UUID (e.g., use uuid.UUID(raw_context_id, version=4) or appropriate version) inside a try/except and raise HTTPException(status_code=400, detail="context_id must be a valid UUID") on failure; add the necessary import (import uuid) and ensure context_id is set to the canonical string form (str(parsed_uuid).lower()) after successful parsing.
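The UUID-normalization step can be sketched in isolation (a plain `ValueError` stands in for the endpoint's `HTTPException`):

```python
from uuid import UUID, uuid4


def normalize_context_id(raw_context_id: str) -> str:
    """Sketch: accept only well-formed UUIDs, returning the canonical form."""
    try:
        return str(UUID(raw_context_id.strip()))
    except ValueError as exc:
        # In the endpoint this would be HTTPException(status_code=400, ...).
        raise ValueError("context_id must be a valid UUID") from exc


good = normalize_context_id(f"  {uuid4()}  ")

try:
    normalize_context_id("not-a-uuid")
    rejected = False
except ValueError:
    rejected = True
```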
1-999: 🛠️ Refactor suggestion | 🟠 Major
Split this endpoint module before adding more voice logic.
This is still ~1,000 LOC and mixes rate limiting, REST handlers, WebSocket auth/control parsing, pipeline orchestration, serialization, and teardown. That makes lifecycle bugs harder to reason about and keeps violating the repository size rule.
As per coding guidelines, "Keep Python files under ~500 lines of code - extract modules when larger".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/server/endpoints/voice_endpoints.py` around lines 1 - 999, The file is too large and should be split into focused modules: move rate-limit logic (_RATE_LIMIT_LUA, _get_rate_limit_redis_client, _rate_limit_allow_ip, _VOICE_RATE_LIMIT_LOCK, _VOICE_RATE_LIMIT_IP_BUCKET, _VOICE_RATE_LIMIT_REDIS_LOCK, _VOICE_RATE_LIMIT_REDIS_CLIENT) into a new voice_rate_limiter module; move helpers and small utils (_send_json, _trim_overlap_text, _extract_bearer_token, _extract_ws_session_token, _has_ws_subprotocols, _classify_voice_pipeline_error, _voice_preflight_error, _send_error_and_close) into voice_utils; move WebSocket control/transport wrappers and serializing classes (_VoiceControlState, _FilteredWebSocket, _RawAudioFrameSerializer, _voice_control_reader) into voice_ws_helpers or voice_transport; and keep REST handlers (voice_session_start, voice_session_end, voice_session_status) in a smaller rest module while placing the main handler voice_websocket and pipeline orchestration in voice_ws_handler; update imports to reference the new modules (e.g., build_voice_pipeline usage remains in voice_websocket) and re-export or adjust package imports so existing callers (and handle_endpoint_errors decorator) import names unchanged. Ensure tests and imports are updated and that cleanup/teardown calls (components["bridge"].cleanup_background_tasks, session_manager methods) remain accessible from the refactored locations.
🧹 Nitpick comments (1)
bindu/utils/notifications.py (1)
250-258: Chain the wrapped `HTTPError` to clarify exception origins.
The B904 warning is valid here. Preserving `exc` in the exception chain makes delivery failures distinguishable from failures while reading or decoding the error body.
Proposed fix
```diff
-            raise NotificationDeliveryError(status, message or f"HTTP error {status}")
+            raise NotificationDeliveryError(
+                status, message or f"HTTP error {status}"
+            ) from exc
```
Verify each finding against the current code and only fix it if needed. In `@bindu/utils/notifications.py` around lines 250 - 258, The except block catching error.HTTPError currently raises NotificationDeliveryError without chaining the original exc, losing the original traceback; update the raise in that block to include the original exception as the cause (i.e., raise NotificationDeliveryError(status, message or f"HTTP error {status}") from exc) so the HTTPError is preserved in the exception chain and debugging remains clear—look for the except error.HTTPError as exc handler and the NotificationDeliveryError construction to apply this change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@bindu/server/endpoints/voice_endpoints.py`:
- Around line 858-866: components is declared optional so referencing
components["bridge"] inside the nested _handle_user_text causes a type error;
after calling build_voice_pipeline() capture the non-optional bridge into a
local variable (e.g., bridge = components["bridge"] or the returned bridge
value) and close over that variable in _handle_user_text instead of using
components["bridge"]; update the callback to call
bridge.process_transcription(...) (and keep using _send_json and send_lock as
before) so the nested function only references the non-optional bridge.
- Around line 197-208: Replace the broad try/except import of
pipecat.serializers.base_serializer with a TYPE_CHECKING-aware pattern: import
FrameSerializer for typing only inside an if TYPE_CHECKING block and catch
ImportError (not Exception) for runtime; define a runtime no-op fallback class
named _FrameSerializerBase (instead of _PipecatFrameSerializer) and then have
_RawAudioFrameSerializer inherit from _FrameSerializerBase so type checkers see
the real FrameSerializer while runtime uses the fallback when pipecat is
unavailable.
In `@tests/unit/extensions/voice/test_voice_websocket_integration.py`:
- Around line 122-125: The test passes a hardcoded session token to
manager.create_session which triggers Ruff S106; suppress the linter for this
test-only literal by appending the correct pragma to that argument (e.g., change
the session_token line in the create_session call to include "# noqa: S106" or
the project’s configured Ruff noqa token) so Ruff ignores the hardcoded secret
in the test; target the session_token="token-abc" keyword in the
manager.create_session invocation.
- Around line 194-200: The timeout test can be affected by environment-driven
auth (VOICE__SESSION_AUTH_REQUIRED) causing the handler to close for missing
token instead of timing out; update the test to explicitly disable session auth
by setting module.app_settings.voice.session_auth_required = False (save the
original value alongside
original_timeout/original_enabled/original_stt/original_tts) before forcing
session_timeout and enabled, and restore the original session_auth_required in
the finally block to keep test isolation (use the same module.app_settings.voice
reference used in the diff).
- Around line 22-76: The test fixture mock_pipecat_modules currently only
registers leaf modules and causes _voice_preflight_error (called by
voice_websocket) to fail when importlib.import_module("pipecat") is attempted;
update mock_pipecat_modules to also create and register stub modules for the
top-level package and intermediate packages (pipecat, pipecat.transports,
pipecat.transports.websocket, and pipecat.pipeline) in sys.modules and set
appropriate attributes so importlib.import_module and attribute lookups succeed
before registering the existing leaf modules like
pipecat.transports.websocket.fastapi and pipecat.pipeline.pipeline.
---
Duplicate comments:
In `@bindu/server/endpoints/voice_endpoints.py`:
- Around line 801-803: The logger.info call in on_agent_transcript currently
logs the agent's spoken text (logger.info(f"on_agent_transcript: got
text='{text[:50]}...' is_final={is_final}")), which can expose sensitive
content; change it to avoid logging any transcript text and log only metadata
(e.g., is_final, text length, agent_id/session_id) instead—update the
logger.info invocation in on_agent_transcript (and any similar uses of
logger.info that reference text or text[:50]) to omit the text variable and
include only non-sensitive fields like is_final and len(text) or relevant IDs.
- Around line 974-999: Replace the bare "except Exception: pass" blocks in the
teardown with logged error handling so failures surface: wrap the calls to
_send_json(websocket, ...), session_manager.update_state(session_id, "ending"),
components["bridge"].cleanup_background_tasks(),
session_manager.end_session(session_id), and websocket.close() in try/except and
log the exception (including stack/exception details) instead of silently
passing — e.g., use logging.exception or your module logger to emit a
descriptive message for each failing operation (refer to websocket.client_state,
_send_json, session_manager.update_state,
components["bridge"].cleanup_background_tasks, session_manager.end_session,
websocket.close).
- Around line 917-933: The code currently only awaits runner_task; also await
reader_task and ensure if either task fails or returns first we cancel the other
and propagate exceptions: after creating runner_task and reader_task, use
asyncio.wait({runner_task, reader_task}, return_when=asyncio.FIRST_EXCEPTION)
inside the asyncio.timeout, then cancel any pending tasks (call .cancel()),
await them (e.g., await asyncio.gather(*pending, return_exceptions=True)) and
re-raise any exception from completed tasks so failures in _voice_control_reader
or runner.run are surfaced and the other task is cleaned up.
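The wait/cancel/propagate sequence above can be sketched as a runnable pattern; `run_both` is illustrative, not project code:

```python
import asyncio


async def run_both(runner_coro, reader_coro) -> None:
    """Run both coroutines; if either fails, cancel the other and re-raise."""
    runner_task = asyncio.create_task(runner_coro)
    reader_task = asyncio.create_task(reader_coro)
    done, pending = await asyncio.wait(
        {runner_task, reader_task}, return_when=asyncio.FIRST_EXCEPTION
    )
    # Cancel whatever is still running, then await so cancellation completes.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)
    # Retrieve results from completed tasks; re-raises the first failure.
    for task in done:
        task.result()
```

The outer `asyncio.timeout` context in the endpoint would wrap the `asyncio.wait` call the same way it currently wraps `await runner_task`.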
- Around line 50-65: Constants for voice tuning (Redis TTL, frame
size/fps/in-flight caps, echo-suppression windows) are hardcoded in
voice_endpoints.py (e.g., the _RATE_LIMIT_LUA block and other local constants);
move these into bindu.settings.app_settings.voice and read them at runtime
instead of local literals. Import app_settings from bindu.settings, add/consume
fields like app_settings.voice.redis_ttl, .frame_size, .fps, .max_in_flight,
.echo_suppression_window (or similar descriptive names), replace the local
constants and usages (including the _RATE_LIMIT_LUA TTL/limit args and any
frame/fps/in-flight variables) to reference app_settings.voice values, and
ensure defaults/validation are handled where app_settings is loaded.
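One way to host these knobs, sketched with a stdlib dataclass standing in for the project's actual settings model; the field names follow the comment and are assumptions, not the confirmed schema:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceTuningSettings:
    """Voice tuning knobs previously hardcoded in voice_endpoints.py."""

    redis_ttl: int = 60                    # rate-limit key TTL, seconds
    frame_size: int = 320                  # PCM frame size, bytes
    fps: int = 50                          # expected audio frames per second
    max_in_flight: int = 100               # cap on unacknowledged frames
    echo_suppression_window: float = 0.5   # echo-suppression window, seconds

    def __post_init__(self) -> None:
        # Validate at load time so bad env values fail fast, not mid-session.
        for name in ("redis_ttl", "frame_size", "fps", "max_in_flight"):
            if getattr(self, name) <= 0:
                raise ValueError(f"{name} must be positive")
```

Call sites would then read e.g. `app_settings.voice.redis_ttl` where the `_RATE_LIMIT_LUA` TTL argument currently uses a local literal.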
- Around line 583-590: The current check only ensures raw_context_id is a
non-empty string but does not verify it's a UUID; update the validation where
raw_context_id/context_id is handled (the block using raw_context_id and
context_id in voice_endpoints.py) to parse and validate it as a UUID (e.g., use
uuid.UUID(raw_context_id, version=4) or appropriate version) inside a try/except
and raise HTTPException(status_code=400, detail="context_id must be a valid
UUID") on failure; add the necessary import (import uuid) and ensure context_id
is set to the canonical string form (str(parsed_uuid).lower()) after successful
parsing.
- Around line 1-999: The file is too large and should be split into focused
modules: move rate-limit logic (_RATE_LIMIT_LUA, _get_rate_limit_redis_client,
_rate_limit_allow_ip, _VOICE_RATE_LIMIT_LOCK, _VOICE_RATE_LIMIT_IP_BUCKET,
_VOICE_RATE_LIMIT_REDIS_LOCK, _VOICE_RATE_LIMIT_REDIS_CLIENT) into a new
voice_rate_limiter module; move helpers and small utils (_send_json,
_trim_overlap_text, _extract_bearer_token, _extract_ws_session_token,
_has_ws_subprotocols, _classify_voice_pipeline_error, _voice_preflight_error,
_send_error_and_close) into voice_utils; move WebSocket control/transport
wrappers and serializing classes (_VoiceControlState, _FilteredWebSocket,
_RawAudioFrameSerializer, _voice_control_reader) into voice_ws_helpers or
voice_transport; and keep REST handlers (voice_session_start, voice_session_end,
voice_session_status) in a smaller rest module while placing the main handler
voice_websocket and pipeline orchestration in voice_ws_handler; update imports
to reference the new modules (e.g., build_voice_pipeline usage remains in
voice_websocket) and re-export or adjust package imports so existing callers
(and handle_endpoint_errors decorator) import names unchanged. Ensure tests and
imports are updated and that cleanup/teardown calls
(components["bridge"].cleanup_background_tasks, session_manager methods) remain
accessible from the refactored locations.
In `@bindu/utils/notifications.py`:
- Around line 131-138: The helper _resolve_and_validate_destination currently
only checks hostname/port but must also validate URL scheme; add a guard in
_resolve_and_validate_destination that ensures parsed.scheme is one of the
allowed HTTP schemes (e.g., "http" or "https") and raise ValueError for other
schemes so callers like _post_once (which may call this helper directly) cannot
bypass validate_config(); reference parsed.scheme and parsed.hostname in the
same function and return the validated URL string after scheme check.
- Around line 226-243: The code rewrites target_url to the resolved IP and sets
Host from parsed.netloc which breaks TLS SNI/cert validation and mishandles
IPv6/userinfo; change the logic so that for parsed.scheme == "https" you do NOT
replace the hostname in target_url (keep the original parsed.geturl()), and
instead open the connection to destination_ip using a transport that preserves
server_hostname/SNI (e.g., use http.client.HTTPSConnection(destination_ip,
port=parsed.port, context=ssl_ctx, server_hostname=parsed.hostname) or a urllib
opener/handler that does the same) while setting the Host header from
parsed.hostname and parsed.port (not parsed.netloc), ensuring IPv6 addresses are
bracketed and any userinfo is excluded; for plain http you may continue using
destination_ip in the netloc but still build Host from
parsed.hostname/parsed.port.
---
Nitpick comments:
In `@bindu/utils/notifications.py`:
- Around line 250-258: The except block catching error.HTTPError currently
raises NotificationDeliveryError without chaining the original exc, losing the
original traceback; update the raise in that block to include the original
exception as the cause (i.e., raise NotificationDeliveryError(status, message or
f"HTTP error {status}") from exc) so the HTTPError is preserved in the exception
chain and debugging remains clear—look for the except error.HTTPError as exc
handler and the NotificationDeliveryError construction to apply this change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1e218909-b26a-44e9-93a8-65f301a33411
⛔ Files ignored due to path filters (1)
`uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (5)
- .gitignore
- bindu/grpc/service.py
- bindu/server/endpoints/voice_endpoints.py
- bindu/utils/notifications.py
- tests/unit/extensions/voice/test_voice_websocket_integration.py
🚧 Files skipped from review as they are similar to previous changes (2)
- bindu/grpc/service.py
- .gitignore
♻️ Duplicate comments (5)
bindu/server/endpoints/voice_endpoints.py (5)
981-993: ⚠️ Potential issue | 🟠 Major
Log teardown failures before suppressing them.
If `update_state()`, bridge cleanup, or `end_session()` fails, the session can remain active without diagnostics. This repeats an existing review concern.
Teardown logging:

```diff
     try:
         await session_manager.update_state(session_id, "ending")
-    except Exception:
-        pass
+    except Exception:
+        logger.exception("Failed to mark voice session as ending: %s", session_id)
     try:
         if components is not None:
             await components["bridge"].cleanup_background_tasks()
-    except Exception:
-        pass
+    except Exception:
+        logger.exception("Failed to cleanup voice bridge tasks: %s", session_id)
     try:
         await session_manager.end_session(session_id)
-    except Exception:
-        pass
+    except Exception:
+        logger.exception("Failed to end voice session: %s", session_id)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/server/endpoints/voice_endpoints.py` around lines 981 - 993, The teardown currently swallows exceptions silently; change the three try/except blocks around session_manager.update_state(session_id, "ending"), components["bridge"].cleanup_background_tasks(), and session_manager.end_session(session_id) to catch Exception as e and log the error (including exception info/stack) before continuing so failures are recorded for diagnostics; use the existing logger (or processLogger) to emit descriptive messages that include the session_id and the caught exception for each failing call.
916-932: ⚠️ Potential issue | 🟠 Major
Wait for the reader task as well as the runner.
`_voice_control_reader()` can return after malformed JSON, stop, or frame-limit violations while `runner_task` keeps running until `session_timeout`; completed reader exceptions also go unretrieved. This repeats an existing review concern.
Coordinate both tasks:

```diff
-        async with asyncio.timeout(float(app_settings.voice.session_timeout)):
-            await runner_task
+        async with asyncio.timeout(float(app_settings.voice.session_timeout)):
+            done, pending = await asyncio.wait(
+                {runner_task, reader_task},
+                return_when=asyncio.FIRST_COMPLETED,
+            )
+            for completed in done:
+                completed.result()
+            for pending_task in pending:
+                pending_task.cancel()
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/server/endpoints/voice_endpoints.py` around lines 916 - 932, The code starts runner_task but does not wait for reader_task, leaving exceptions unobserved and the runner may outlive the reader; update the section that creates runner_task and reader_task so both tasks are awaited/coordinated: create both runner_task (asyncio.create_task(runner.run(task))) and reader_task (asyncio.create_task(_voice_control_reader(...))) and then use asyncio.wait or asyncio.gather with proper timeout (the existing async with asyncio.timeout(float(app_settings.voice.session_timeout))) to await both; on timeout or completion ensure you cancel the remaining task(s), await them to retrieve exceptions, and handle/log exceptions from _voice_control_reader and runner.run to avoid unhandled task exceptions.
796-803: ⚠️ Potential issue | 🟠 Major
Do not log transcript content.
Line 802 logs the first 50 characters of agent speech; transcripts can contain sensitive user data echoed by the model. This repeats an existing review concern.
Privacy-safe log:

```diff
-            logger.info(
-                f"on_agent_transcript: got text='{text[:50]}...' is_final={is_final}"
-            )
+            logger.info(
+                f"on_agent_transcript: got {len(text)} chars, is_final={is_final}"
+            )
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/server/endpoints/voice_endpoints.py` around lines 796 - 803, The handler _on_agent_transcript logs transcript content via logger.info(f"on_agent_transcript: got text='{text[:50]}...' is_final={is_final}") which may leak sensitive user data; remove or redact the transcript text (use only metadata) and keep the suppress_audio_until update intact. Update the logger call in _on_agent_transcript to log only non-sensitive fields (e.g., is_final and that a transcript was received) or a fixed redacted placeholder instead of text, leaving control.suppress_audio_until logic unchanged.
583-590: ⚠️ Potential issue | 🟠 Major
Validate `context_id` as a UUID, not just a non-empty string.
The docstring declares `{ "context_id": "<uuid>" }`, but any non-empty string is accepted and persisted into the session layer. This repeats an existing review concern. As per coding guidelines, "Validate all external input and use type hints for input validation in Python files".
Validation sketch:

```diff
-from uuid import uuid4
+from uuid import UUID, uuid4
 ...
     if isinstance(body, dict) and "context_id" in body:
         raw_context_id = body["context_id"]
         if not isinstance(raw_context_id, str) or not raw_context_id.strip():
             raise HTTPException(
                 status_code=400,
                 detail="context_id must be a non-empty string",
             )
-        context_id = raw_context_id.strip()
+        try:
+            context_id = str(UUID(raw_context_id.strip()))
+        except ValueError as exc:
+            raise HTTPException(
+                status_code=400,
+                detail="context_id must be a valid UUID",
+            ) from exc
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/server/endpoints/voice_endpoints.py` around lines 583 - 590, The current check accepts any non-empty string for raw_context_id; change it to validate that raw_context_id is a valid UUID: after stripping raw_context_id (the variable in this block), attempt to parse it with uuid.UUID(...) and on failure raise HTTPException(status_code=400, detail="context_id must be a valid UUID"); if parsing succeeds assign the canonical string (or the UUID object, depending on surrounding code expectations) back to context_id. Update the imports to include the uuid module and keep the existing HTTPException usage and variable names (raw_context_id, context_id) so the rest of the handler continues to work.
600-617: ⚠️ Potential issue | 🟠 Major
Validate the WebSocket host before creating the session.
If host resolution fails, this returns `400` after `create_session()` succeeds, leaving an active session behind until cleanup/quota expiry. This repeats an existing review concern.
Ordering fix:

```diff
-    try:
-        session = await session_manager.create_session(
-            context_id,
-            session_token=session_token,
-            session_token_expires_at=session_token_expires_at,
-        )
-    except RuntimeError as exc:
-        return JSONResponse({"error": str(exc)}, status_code=429)
-
     # Build WebSocket URL from request
     scheme = "wss" if request.url.scheme == "https" else "ws"
     # Use hostname from request, fallback to client host, or raise error if unavailable
     host = request.url.hostname or (request.client.host if request.client else None)
     if not host:
         return JSONResponse(
             {"error": "Unable to determine host for WebSocket URL"},
             status_code=400,
         )
+
+    try:
+        session = await session_manager.create_session(
+            context_id,
+            session_token=session_token,
+            session_token_expires_at=session_token_expires_at,
+        )
+    except RuntimeError as exc:
+        return JSONResponse({"error": str(exc)}, status_code=429)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/server/endpoints/voice_endpoints.py` around lines 600 - 617, Compute and validate the WebSocket host (derive scheme and resolve host from request.url.hostname or request.client.host) before calling session_manager.create_session so you can return the 400 error early if host is missing; move the host/scheme resolution and the "if not host" check above the try/except that calls session_manager.create_session (referencing create_session and session_manager) to avoid creating a session when the WebSocket host cannot be determined.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@bindu/server/endpoints/voice_endpoints.py`:
- Around line 981-993: The teardown currently swallows exceptions silently;
change the three try/except blocks around
session_manager.update_state(session_id, "ending"),
components["bridge"].cleanup_background_tasks(), and
session_manager.end_session(session_id) to catch Exception as e and log the
error (including exception info/stack) before continuing so failures are
recorded for diagnostics; use the existing logger (or processLogger) to emit
descriptive messages that include the session_id and the caught exception for
each failing call.
- Around line 916-932: The code starts runner_task but does not wait for
reader_task, leaving exceptions unobserved and the runner may outlive the
reader; update the section that creates runner_task and reader_task so both
tasks are awaited/coordinated: create both runner_task
(asyncio.create_task(runner.run(task))) and reader_task
(asyncio.create_task(_voice_control_reader(...))) and then use asyncio.wait or
asyncio.gather with proper timeout (the existing async with
asyncio.timeout(float(app_settings.voice.session_timeout))) to await both; on
timeout or completion ensure you cancel the remaining task(s), await them to
retrieve exceptions, and handle/log exceptions from _voice_control_reader and
runner.run to avoid unhandled task exceptions.
- Around line 796-803: The handler _on_agent_transcript logs transcript content
via logger.info(f"on_agent_transcript: got text='{text[:50]}...'
is_final={is_final}") which may leak sensitive user data; remove or redact the
transcript text (use only metadata) and keep the suppress_audio_until update
intact. Update the logger call in _on_agent_transcript to log only non-sensitive
fields (e.g., is_final and that a transcript was received) or a fixed redacted
placeholder instead of text, leaving control.suppress_audio_until logic
unchanged.
- Around line 583-590: The current check accepts any non-empty string for
raw_context_id; change it to validate that raw_context_id is a valid UUID: after
stripping raw_context_id (the variable in this block), attempt to parse it with
uuid.UUID(...) and on failure raise HTTPException(status_code=400,
detail="context_id must be a valid UUID"); if parsing succeeds assign the
canonical string (or the UUID object, depending on surrounding code
expectations) back to context_id. Update the imports to include the uuid module
and keep the existing HTTPException usage and variable names (raw_context_id,
context_id) so the rest of the handler continues to work.
- Around line 600-617: Compute and validate the WebSocket host (derive scheme
and resolve host from request.url.hostname or request.client.host) before
calling session_manager.create_session so you can return the 400 error early if
host is missing; move the host/scheme resolution and the "if not host" check
above the try/except that calls session_manager.create_session (referencing
create_session and session_manager) to avoid creating a session when the
WebSocket host cannot be determined.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 4e5aa629-099b-4560-8f8e-877ff71d4c53
📒 Files selected for processing (1)
bindu/server/endpoints/voice_endpoints.py
Summary
Describe the problem and fix in 2–5 bullets:
Change Type (select all that apply)
Scope (select all touched areas)
Linked Issue/PR
User-Visible / Behavior Changes
New `VOICE__*` env vars (provider-dependent keys).
Security Impact (required)
- (Yes/No) Yes
- (Yes/No) Yes
- (Yes/No) Yes
- (Yes/No) No
- (Yes/No) Yes
- Yes, explain risk + mitigation: voice can be disabled via `VOICE__ENABLED=false`.
Environment
Steps to Test
1. `uv sync --dev --extra agents --extra voice`
2. Set `VOICE__STT_API_KEY=...` and TTS provider env (e.g. `VOICE__TTS_PROVIDER=piper`)
3. `POST /voice/session/start` and connect to `WS /ws/voice/{session_id}`
Expected Behavior
Actual Behavior
Evidence (attach at least one)
Human Verification (required)
What you personally verified (not just CI):
Compatibility / Migration
- (Yes/No) Yes (voice is opt-in)
- (Yes/No) Yes
- (Yes/No) No
- Set `VOICE__*` env vars only if enabling voice; otherwise keep `VOICE__ENABLED=false`.
Failure Recovery (if this breaks)
Set `VOICE__ENABLED=false` (and/or revert the PR).
Checklist
- Tests pass (`uv run pytest`)
- Pre-commit hooks pass (`uv run pre-commit run --all-files`)
New Features
Bug Fixes
Improvements