Streaming LLM Response Support

## Problem

The current implementation waits for the complete LLM response before returning anything. Streaming partial output would improve the user experience for long-running generations.

## Challenges

A Temporal activity returns a single result when it completes, so streaming partial output requires a different pattern. Candidate mechanisms:
- **Signals**: Push tokens to the workflow as they arrive
- **Queries**: Poll the workflow for partial results
- **Heartbeats**: Include partial content in the activity's heartbeat data
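To make the signal and query options concrete, here is a minimal sketch of the state a workflow might keep for one streamed response. It is plain Python with no Temporal dependency: `StreamAccumulator`, `on_chunk`, `on_done`, and `partial_text` are hypothetical names, and the sequence-number check is an assumption about how resent chunks could be handled, not something this project has decided. In the Temporal Python SDK the two handlers would presumably become methods decorated with `@workflow.signal` and `@workflow.query`.

```python
from dataclasses import dataclass, field


@dataclass
class StreamAccumulator:
    """Hypothetical workflow-side state for one streamed LLM response."""

    chunks: list[str] = field(default_factory=list)
    done: bool = False

    def on_chunk(self, seq: int, text: str) -> None:
        # Stand-in for a signal handler. `seq` is the chunk's index in the stream.
        # Assumption: a retried activity may resend chunks, so skip known indices.
        if seq < len(self.chunks):
            return
        # A real handler would need a policy for gaps; this sketch just rejects them.
        if seq > len(self.chunks):
            raise ValueError(f"gap in stream: expected {len(self.chunks)}, got {seq}")
        self.chunks.append(text)

    def on_done(self) -> None:
        # Stand-in for a final signal marking the end of the stream.
        self.done = True

    def partial_text(self) -> str:
        # Stand-in for a query handler: read-only view of what has arrived so far.
        return "".join(self.chunks)
```

A client polling `partial_text()` sees the response grow as chunks arrive, while the activity's return value can still carry the final, complete text.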

## Proposed Approach

Investigate Temporal patterns for streaming:

1. **Activity with Signals**
   - The activity streams tokens from the LLM and sends them to the workflow as signals
   - The workflow accumulates the tokens and can expose the partial result via a query

2. **Event-based Pattern**
   - Similar to Pydantic AI's `_call_event_stream_handler_activity`
   - Buffer events and flush them to the workflow periodically
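The buffer-and-flush idea in the second pattern can be sketched with `asyncio` alone. This is an illustration under assumptions, not Pydantic AI's implementation: `forward_in_batches`, `send`, `max_tokens`, and `max_delay_s` are hypothetical names, and `send` stands in for whatever delivers a batch to the workflow (for example, a signal sent from inside the activity).

```python
import asyncio
import time
from typing import AsyncIterator, Awaitable, Callable


async def forward_in_batches(
    tokens: AsyncIterator[str],
    send: Callable[[str], Awaitable[None]],
    max_tokens: int = 20,
    max_delay_s: float = 0.25,
) -> str:
    """Forward tokens to `send` in batches and return the complete text."""
    full: list[str] = []
    batch: list[str] = []
    last_flush = time.monotonic()

    async for token in tokens:
        full.append(token)
        batch.append(token)
        # Flush when the batch is large enough or old enough.
        # Limitation: age is only checked when a new token arrives.
        too_big = len(batch) >= max_tokens
        too_old = time.monotonic() - last_flush >= max_delay_s
        if too_big or too_old:
            await send("".join(batch))
            batch.clear()
            last_flush = time.monotonic()

    # Flush whatever is left once the stream ends.
    if batch:
        await send("".join(batch))
    return "".join(full)


async def fake_llm() -> AsyncIterator[str]:
    # Hypothetical token source standing in for a streaming LLM client.
    for token in ["Str", "eam", "ing", " ", "works"]:
        yield token


async def demo() -> tuple[str, list[str]]:
    sent: list[str] = []

    async def send(chunk: str) -> None:
        sent.append(chunk)

    text = await forward_in_batches(fake_llm(), send, max_tokens=2, max_delay_s=60.0)
    return text, sent
```

Because the function still returns the full text, the activity keeps its single result, and the batches are only a best-effort preview of it.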

## Research Needed

- How does Pydantic AI handle streaming in their Temporal integration?
- What's the overhead of signals for high-frequency token streaming?
- Can we maintain durability while streaming?
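As a starting point for the overhead question, the number of signals can be estimated from the token count and the batch size. Each signal delivered to a workflow is recorded as an event in its history, so the count matters. The helper names and the numbers in the comments below are illustrative assumptions, not measurements.

```python
import math


def signals_needed(total_tokens: int, batch_size: int) -> int:
    """Signals required to forward `total_tokens` in batches of `batch_size`."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return math.ceil(total_tokens / batch_size)


def signals_per_second(tokens_per_second: float, batch_size: int) -> float:
    """Steady-state signal rate for a given generation speed."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return tokens_per_second / batch_size


# Illustrative only: a 2,000-token response.
per_token = signals_needed(2000, 1)    # 2000 signals, one per token
batched = signals_needed(2000, 25)     # 80 signals with 25-token batches
```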

## Priority

Medium

## References

- Pydantic AI streaming: `_agent.py` event stream handler
- Temporal signals documentation
