AIT-238 - handle publish failures #3109
base: main
Conversation
Link to the pending `/ai-transport` overview page.
Add intro describing the pattern, its properties, and use cases.
Includes continuous token streams, correlating tokens for distinct responses, and explicit start/end events.
Splits each token streaming approach into distinct patterns and shows both the publish and subscribe side behaviour alongside one another.
Includes hydration with rewind and hydration with persisted history + untilAttach. Describes the pattern for handling in-progress live responses with complete responses loaded from the database.
Add doc explaining streaming tokens with appendMessage and update compaction allowing message-per-response history.
Unifies the token streaming nav for token streaming after rebase.
Refines the intro copy in message-per-response to have structural similarity with the message-per-token page.
Refine the Publishing section of the message-per-response docs. - Include anchor tags on title - Describe the `serial` identifier - Align with stream pattern used in message-per-token docs - Remove duplicate example
Refine the Subscribing section of the message-per-response docs. - Add anchor tag to heading - Describes each action upfront - Uses RANDOM_CHANNEL_NAME
Refine the rewind section of the message-per-response docs. - Include description of allowed rewind parameters - Tweak copy
Refines the history section for the message-per-response docs. - Adds anchor to heading - Uses RANDOM_CHANNEL_NAME - Use message serial in code snippet instead of ID - Tweaks copy
Fix the hydration of in progress responses via rewind by using the responseId in the extras to correlate messages with completed responses loaded from the database.
Fix the hydration of in progress responses using history by obtaining the timestamp of the last completed response loaded from the database and paginating history forwards from that point.
Removes the headers/metadata section, as this covers the specific semantics of extras.headers handling with appends, which is better addressed by the (upcoming) message append pub/sub docs. Instead, a callout is used to describe header mixin semantics in the appropriate place insofar as it relates to the discussion at hand.
Update the token streaming with message per token docs to include a callout describing resume behaviour in case of transient disconnection.
Fix the message per token docs headers to include anchors and align with naming in the message per response page.
Adds an overview page for a Sessions & Identity section which describes the channel-oriented session model and its benefits over the traditional connection-oriented model. Describes how identity relates to session management and how this works in the context of channel-oriented sessions. Shows how to use identified clients to assign a trusted identity to users and obtain this identity from the agent side. Shows how to use Ably capabilities to control which operations authenticated users can perform on which channels. Shows how to use authenticated user claims to associate a role or other attribute with a user. Updates the docs to describe how to handle authentication, capabilities, identity and roles/attributes for agents separately from end users. Describes how to use presence to mark users and agents as online/offline. Includes description of synthetic leaves in the event of abrupt disconnection. Describes how to subscribe to presence to see who is online, and take action when a user is offline across all devices. Adds docs for resuming user and agent sessions, linking to hydration patterns for different token streaming approaches for user resumes and describing agent resume behaviour with message catch-up.
Adds a guide for using the OpenAI SDK to consume streaming events from the Responses API and publish them over Ably using the message per token pattern.
- Uses a further-reading callout instead of note - Removes repeated code initialising Ably client (OpenAI client already instantiated)
Adds an anchor tag to the "Client hydration" heading
Similar to the OpenAI message-per-token guide, but using the message-per-response pattern with appends.
Documents patterns for exposing reasoning output from models along with final output.
Overview page for token streaming in AI Transport --------- Co-authored-by: matt423 <matthew.a423@gmail.com> Co-authored-by: Fiona Corden <fiona.corden@ably.com> Co-authored-by: Paddy Byers <paddy.byers@gmail.com>
Document how to implement human oversight of AI agent actions using Ably channels and capabilities for authorization workflows.
Document how users send prompts to AI agents over Ably channels, including identified clients, message correlation, and handling concurrent prompts.
Co-authored-by: Paddy Byers <paddy.byers@gmail.com>
Details the message-per-response pattern using Ably `appendMessage` for Anthropic SDK.
Adds a page to the Messaging section that describes sending tool calls and results to users over channels. Indicates ability to build generative user interfaces or implement human in the loop workflows.
paddybyers
left a comment
Suggested clarification
### Handling append failures <a id="append-failures"/>

When appending without awaiting, it is possible for an intermediate append to fail while subsequent appends succeed. This creates a gap in the streamed response. For example, if a rate limit is exceeded, a single append may be rejected while the following tokens continue to be accepted.
Suggested change — replace:

> When appending without awaiting, it is possible for an intermediate append to fail while subsequent appends succeed. This creates a gap in the streamed response. For example, if a rate limit is exceeded, a single append may be rejected while the following tokens continue to be accepted.

with:

> The examples above append successive tokens to a response message by pipelining the append operations; that is, the agent publishes each append operation without waiting for prior operations to complete. This is necessary to avoid the append rate being capped by the round-trip time from the agent to the Ably endpoint. However, it means the agent does not await the outcome of each append operation, so the agent can continue to submit append operations after an earlier operation has failed. For example, if a rate limit is exceeded, a single append may be rejected while the following tokens continue to be accepted.
>
> The agent needs to obtain the outcome of each append operation and take corrective action in the event that any operation fails. A simple but effective approach is to ensure that, if streaming of a response fails for any reason, the message is updated with the final complete response text once it is available. Although the streaming experience is disrupted in the case of failure, there is no consistency problem with the final result once the response completes.
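A minimal sketch of the recovery strategy described above: pipeline the append operations, then, if any of them failed, overwrite the message with the full response text. `appendToken` and `setFinalText` are hypothetical stand-ins for the relevant Ably SDK calls, not real API names.

```javascript
// Pipeline appends without awaiting each one, then fall back to the
// complete response text if any append failed. `appendToken` and
// `setFinalText` are hypothetical helpers wrapping the SDK calls.
async function streamWithFallback(tokens, appendToken, setFinalText) {
  const pending = [];
  for (const token of tokens) {
    // Publish without awaiting so throughput is not bound by round-trip time.
    pending.push(appendToken(token));
  }
  // allSettled never rejects, so every outcome (including failures) is observed.
  const outcomes = await Promise.allSettled(pending);
  const anyFailed = outcomes.some((o) => o.status === 'rejected');
  if (anyFailed) {
    // Streaming was disrupted; restore consistency by replacing the
    // message contents with the authoritative full text.
    await setFinalText(tokens.join(''));
  }
  return anyFailed;
}
```

This trades a momentarily disrupted streaming experience for a consistent final result, matching the approach the suggested text describes.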
| To detect failures, keep a reference to each append operation and check for rejections after the stream completes: |
Suggested change — replace:

> To detect failures, keep a reference to each append operation and check for rejections after the stream completes:

with:

> To detect append failures, keep a reference to each append operation and check for rejections after the stream completes:
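A minimal sketch of the detection pattern referred to above: retain the promise returned by each append operation and inspect rejections once the stream completes. The shape of `appendOps` (an array of promises from hypothetical append calls) is an assumption for illustration.

```javascript
// Given the promises returned by each append operation, report which
// appends failed and why, pairing each rejection with its position in
// the stream so the agent knows where the gap is.
async function collectAppendFailures(appendOps) {
  const results = await Promise.allSettled(appendOps);
  return results
    .map((result, index) => ({ result, index }))
    .filter(({ result }) => result.status === 'rejected')
    .map(({ result, index }) => ({ index, reason: result.reason }));
}
```

An empty result means every append succeeded; otherwise the agent can take the corrective action described above for the affected positions.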
### Handling publish failures <a id="publish-failures"/>

When publishing without awaiting, it is possible for an intermediate publish to fail while subsequent publishes succeed. This creates a gap in the streamed response. For example, if a rate limit is exceeded, a single token may be rejected while the following tokens continue to be accepted.
I suggest updating this text in a similar way to that proposed above.
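Applying the same idea to the message-per-token case might look like the following sketch: publish each token without awaiting, then always finish with an event carrying the complete text so subscribers can recover from any gap. `publishToken` and `publishComplete` are hypothetical helpers, not real SDK methods.

```javascript
// Pipeline per-token publishes, count any rejections, and emit the full
// text at the end. Subscribers that observed a gap can replace their
// assembled stream with this authoritative copy.
async function publishTokens(tokens, publishToken, publishComplete) {
  const pending = tokens.map((token) => publishToken(token));
  const outcomes = await Promise.allSettled(pending);
  const failures = outcomes.filter((o) => o.status === 'rejected').length;
  // Publish the complete response regardless of failures, so the final
  // state is consistent even if intermediate tokens were dropped.
  await publishComplete(tokens.join(''));
  return failures;
}
```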
Force-pushed 20df5cb to 01ab0f8
Description
Context: https://ably.atlassian.net/browse/AIT-238
Overall, the questions I think need to be answered in these docs sections are: