Problem
The current implementation waits for the complete LLM response before returning anything. Streaming support would improve the user experience for long-running generations.
Challenges
Temporal activities return a single result. Streaming requires a different pattern:
- Signals: Push tokens to workflow as they arrive
- Queries: Poll for partial results
- Heartbeats: Include partial content in heartbeat data
Proposed Approach
Investigate Temporal patterns for streaming:
- Activity with Signals
  - Activity streams tokens and sends signals to workflow
  - Workflow accumulates tokens and can expose them via query
- Event-based Pattern
  - Similar to Pydantic AI's _call_event_stream_handler_activity
  - Buffer events and periodically flush to workflow
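To make the two patterns above concrete, here is a minimal plain-Python sketch that simulates them without a Temporal server. Every name in it (StreamAccumulator, TokenBatcher, fake_llm_stream, stream_generation) is hypothetical and not taken from this codebase or from Pydantic AI. In a real workflow, append_tokens would presumably be a @workflow.signal handler and partial_text a @workflow.query handler; here they are ordinary methods so the batching logic can be run on its own. The sketch flushes by batch size, where a timer-based flush would be the other obvious option.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable, List


class StreamAccumulator:
    """Stand-in for workflow state (hypothetical).

    Assumption: in a Temporal workflow, append_tokens would be a signal
    handler and partial_text a query handler.
    """

    def __init__(self) -> None:
        self._chunks: List[str] = []

    async def append_tokens(self, batch: List[str]) -> None:
        # Accumulate one batch of tokens pushed by the activity.
        self._chunks.extend(batch)

    def partial_text(self) -> str:
        # Read-only view of everything received so far.
        return "".join(self._chunks)


class TokenBatcher:
    """Buffers tokens and sends them in batches to limit send frequency."""

    def __init__(
        self,
        send: Callable[[List[str]], Awaitable[None]],
        max_batch: int = 8,
    ) -> None:
        self._send = send
        self._max_batch = max_batch
        self._buffer: List[str] = []
        self.flush_count = 0

    async def add(self, token: str) -> None:
        self._buffer.append(token)
        if len(self._buffer) >= self._max_batch:
            await self.flush()

    async def flush(self) -> None:
        if not self._buffer:
            return
        # Swap the buffer out before awaiting so no token is sent twice.
        batch, self._buffer = self._buffer, []
        await self._send(batch)
        self.flush_count += 1


async def fake_llm_stream(text: str) -> AsyncIterator[str]:
    # Placeholder for a real streaming LLM call: yields one word at a time.
    for word in text.split(" "):
        await asyncio.sleep(0)
        yield word + " "


async def stream_generation(
    text: str, accumulator: StreamAccumulator, max_batch: int = 8
) -> int:
    """Simulated activity body: stream, batch, and push to the accumulator."""
    batcher = TokenBatcher(accumulator.append_tokens, max_batch)
    async for token in fake_llm_stream(text):
        await batcher.add(token)
    await batcher.flush()  # send whatever is left after the stream ends
    return batcher.flush_count
```

With a real Temporal client, the send callable passed to TokenBatcher would wrap a signal call on the workflow handle instead of calling the accumulator directly. The final flush matters: without it, the tail of the response would never reach the workflow.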
Research Needed
- How does Pydantic AI handle streaming in their Temporal integration?
- What's the overhead of signals for high-frequency token streaming?
- Can we maintain durability while streaming?
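As a starting point for the overhead question, the number of signals depends directly on how many tokens are grouped into each one, and every signal is recorded in the workflow's event history. The helper names and the figures used below are illustrative assumptions, not measurements.

```python
import math


def signals_needed(total_tokens: int, batch_size: int) -> int:
    """Number of signals required to deliver total_tokens in batches."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return math.ceil(total_tokens / batch_size)


def signals_per_second(tokens_per_second: float, batch_size: int) -> float:
    """Average signal rate for a given generation speed and batch size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return tokens_per_second / batch_size
```

For a hypothetical 2000-token response, signals_needed(2000, 1) gives 2000 signals, while signals_needed(2000, 50) gives 40, which is why batching is worth measuring before committing to per-token signals.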
Priority
Medium
References
- Pydantic AI streaming: _agent.py event stream handler
- Temporal signals documentation