Motivation
Multi-turn chatbots that use Plano's agent orchestration for intent-based routing currently cannot leverage the Responses API's `previous_response_id` for stateful conversations. Users must manage conversation history client-side.
Problem
The state management infrastructure (the `StateStorage` trait, `ResponsesStateProcessor`, and the memory/PostgreSQL backends) is only wired into the direct proxy path (`crates/brightstaff/src/handlers/llm.rs`), not the agent orchestrator path (`crates/brightstaff/src/handlers/agent_chat_completions.rs`).
Technical gaps
- `crates/brightstaff/src/main.rs` — `state_storage` is passed to `llm_chat` but not to `agent_chat`. The `agent_chat` function signature has no `StateStorage` parameter.
- `crates/brightstaff/src/handlers/agent_chat_completions.rs` — `handle_agent_chat_inner()` calls `client_request.get_messages()` early, converting to `Vec<OpenAIMessage>`. For Responses API requests with `InputItem` types (tool results, images), this conversion may lose information. No `previous_response_id` handling exists.
- `crates/brightstaff/src/handlers/pipeline_processor.rs` — `invoke_agent` hardcodes the endpoint URL path to `/v1/chat/completions` when calling downstream agents. The request body is serialized generically via `ProviderRequestType::to_bytes()`, but the URL forces agents to receive calls at the chat completions endpoint regardless of the original request format.
- `crates/brightstaff/src/handlers/response_handler.rs` — `create_streaming_response` in the agent path is a raw byte passthrough with no stream processing, whereas in `llm.rs` responses go through `ObservableStreamProcessor` and optionally `ResponsesStateProcessor`.
- `crates/brightstaff/src/state/response_state_processor.rs` — only instantiated in `llm.rs`; never used in the agent orchestration flow.
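
To make the asymmetry concrete, here is a minimal, self-contained sketch of the current wiring. Everything below (the `StateStorage` method names, `MemoryStorage`, both handler signatures) is a simplified assumption for illustration, not Plano's actual API:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

// Hypothetical, simplified stand-in for the real StateStorage trait.
trait StateStorage: Send + Sync {
    fn load(&self, response_id: &str) -> Option<String>;
    fn store(&self, response_id: &str, output: String);
}

// Minimal in-memory backend, mirroring the "memory" backend mentioned above.
#[derive(Default)]
struct MemoryStorage {
    items: Mutex<HashMap<String, String>>,
}

impl StateStorage for MemoryStorage {
    fn load(&self, response_id: &str) -> Option<String> {
        self.items.lock().unwrap().get(response_id).cloned()
    }
    fn store(&self, response_id: &str, output: String) {
        self.items.lock().unwrap().insert(response_id.to_string(), output);
    }
}

// Direct proxy path: the handler receives the storage handle and can
// resolve previous_response_id into stored conversation state.
fn llm_chat(storage: Arc<dyn StateStorage>, previous_response_id: Option<&str>) -> Option<String> {
    previous_response_id.and_then(|id| storage.load(id))
}

// Agent orchestrator path: no StateStorage parameter, so stored state
// is unreachable and previous_response_id cannot be honored.
fn agent_chat(_previous_response_id: Option<&str>) -> Option<String> {
    None
}
```

Under this sketch, the first step of the proposed solution amounts to giving `agent_chat` the same `Arc<dyn StateStorage>` parameter that `llm_chat` already receives.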
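
For the hardcoded URL in `invoke_agent`, one possible shape is to derive the downstream path from the original request kind instead of fixing it. The `ApiKind` enum and the path strings below are illustrative assumptions, not the existing `ProviderRequestType` API:

```rust
// Hypothetical request-kind tag; in Plano this information would come
// from the already-parsed client request, not a separate enum.
#[derive(Clone, Copy, PartialEq, Debug)]
enum ApiKind {
    ChatCompletions,
    Responses,
}

// Build the downstream agent URL from the request kind, so a Responses
// API request is not silently rerouted to the chat completions endpoint.
fn agent_endpoint(base: &str, kind: ApiKind) -> String {
    let path = match kind {
        ApiKind::ChatCompletions => "/v1/chat/completions",
        ApiKind::Responses => "/v1/responses",
    };
    format!("{base}{path}")
}
```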
Proposed solution (following existing patterns)
- Pass `state_storage` to `agent_chat` from `main.rs` (same pattern as `llm_chat`).
- For Responses API requests with `previous_response_id`: resolve stored state via `StateStorage`, convert `InputItem` → `OpenAIMessage` for `determine_orchestration()` / agent selection.
- For the final agent's response: apply the same stream translation pipeline used in `llm.rs` — translate chat completions SSE into Responses API format (via hermesllm's translation layer), then wrap with `ResponsesStateProcessor` to capture `response_id` and output from the translated `response.completed` event.
For multi-agent chains within a single turn, the state processor should wrap only the final combined response (the orchestrator already distinguishes `is_last_agent`), so intermediate agent responses are not stored individually.
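
The last-agent gating described above can be sketched as follows; `AgentResponse` and the `Vec`-backed store are placeholders standing in for the real `ResponsesStateProcessor` and `StateStorage`, not their actual APIs:

```rust
// Hypothetical, simplified response type for illustration.
struct AgentResponse {
    response_id: String,
    output: String,
}

// Stand-in for what ResponsesStateProcessor does: record response_id
// and output from the translated response.completed event.
fn capture_state(store: &mut Vec<(String, String)>, resp: &AgentResponse) {
    store.push((resp.response_id.clone(), resp.output.clone()));
}

// Wrap only the final agent's response with the state processor. The
// orchestrator already knows is_last_agent, so intermediate responses
// in a multi-agent chain pass through without being stored.
fn handle_agent_response(
    store: &mut Vec<(String, String)>,
    resp: AgentResponse,
    is_last_agent: bool,
) -> AgentResponse {
    if is_last_agent {
        capture_state(store, &resp);
    }
    resp
}
```

With this gating, a chain of N agents in one turn produces exactly one stored entry: the final combined response.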
Related