[Enhancement] AgentCoreMemorySessionManager: add async_mode setting to prevent event loop blocking #452

@jariy17

Problem

AgentCoreMemorySessionManager makes synchronous boto3 calls on the asyncio event loop hot path when used with Agent.stream_async() inside an async WebSocket server (e.g. AgentCore Runtime). This adds 200–800ms of blocked event loop time per agent turn, directly degrading time-to-first-token (TTFT) for streaming responses.

The four blocking call sites are:

  1. initialize() → read_session(), read_agent(), list_messages() (sync gmdp client calls)
  2. append_message() → create_event() (sync, once per message when batch_size=1)
  3. retrieve_customer_context() → uses ThreadPoolExecutor + as_completed(), but as_completed() still blocks the calling coroutine
  4. sync_agent() → update_agent() → create_event() (sync, per turn)

Because all four run inside the Strands hook lifecycle (BeforeInvocationEvent, MessageAddedEvent, AfterInvocationEvent), callers cannot wrap them without subclassing or forking the SDK.
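To make the blocking effect concrete, here is a minimal, self-contained sketch that does not use the SDK at all. `fake_boto3_call` is a hypothetical stand-in that simply sleeps for 200 ms, and `max_loop_stall` is an illustrative helper that runs a heartbeat task and reports the longest gap between its ticks. The numbers it produces are not measurements of AgentCoreMemorySessionManager; they only show the mechanism.

```python
import asyncio
import time


def fake_boto3_call(delay: float = 0.2) -> str:
    """Hypothetical stand-in for a synchronous boto3 request; it only sleeps."""
    time.sleep(delay)
    return "ok"


async def max_loop_stall(run_call) -> float:
    """Return the longest gap (seconds) between heartbeat ticks while run_call() runs."""
    gaps = []

    async def heartbeat():
        last = time.perf_counter()
        while True:
            await asyncio.sleep(0.01)
            now = time.perf_counter()
            gaps.append(now - last)
            last = now

    task = asyncio.create_task(heartbeat())
    await asyncio.sleep(0.05)  # let the heartbeat tick a few times first
    await run_call()
    await asyncio.sleep(0.05)  # give the heartbeat a chance to record the stall
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return max(gaps)


async def blocking_hook():
    # Sync call made directly on the event loop thread: nothing else can run.
    fake_boto3_call()


async def offloaded_hook():
    # Same call moved to a worker thread (asyncio.to_thread needs Python 3.9+).
    await asyncio.to_thread(fake_boto3_call)


def measure():
    blocked = asyncio.run(max_loop_stall(blocking_hook))
    offloaded = asyncio.run(max_loop_stall(offloaded_hook))
    return blocked, offloaded


if __name__ == "__main__":
    blocked, offloaded = measure()
    print(f"max stall, sync call on loop: {blocked * 1000:.0f} ms")
    print(f"max stall, asyncio.to_thread: {offloaded * 1000:.0f} ms")
```

In the first case the heartbeat cannot tick for the full duration of the simulated call, which is the same stall a streaming response would see while a hook is waiting on boto3.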

Proposed Solution

Add an async_mode configuration setting to AgentCoreMemorySessionManager (or its config class):

class MemorySessionManagerConfig:
    async_mode: bool = False  # default: sync (backwards-compatible)

When async_mode=True, the session manager wraps all 4 blocking call sites with asyncio.to_thread() to offload boto3 calls to a thread pool, keeping the event loop unblocked:

# Pseudocode
if self.config.async_mode:
    await asyncio.to_thread(self._blocking_call, ...)
else:
    self._blocking_call(...)

This is a non-breaking change: async_mode defaults to False, so existing sync users keep the current behavior, and async users opt in by setting async_mode=True.
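A slightly fuller sketch of the proposed switch, runnable on its own. `SketchSessionManager`, `_run`, and `call_threads` are hypothetical names used only for illustration, and `_blocking_call` stands in for a sync boto3 request. The sketch also assumes the affected methods can be awaited by their caller, which the real hook lifecycle may or may not allow as-is.

```python
import asyncio
import threading
from dataclasses import dataclass


@dataclass
class MemorySessionManagerConfig:
    async_mode: bool = False  # default: sync (backwards-compatible)


class SketchSessionManager:
    """Hypothetical stand-in, not the real AgentCoreMemorySessionManager."""

    def __init__(self, config: MemorySessionManagerConfig):
        self.config = config
        self.call_threads = []  # records which thread ran each blocking call

    def _blocking_call(self, payload: str) -> str:
        # Placeholder for a synchronous boto3 call such as create_event().
        self.call_threads.append(threading.current_thread())
        return f"stored:{payload}"

    async def _run(self, func, *args):
        # Single dispatch point so every call site honors async_mode the same way.
        if self.config.async_mode:
            return await asyncio.to_thread(func, *args)  # Python 3.9+
        return func(*args)

    async def append_message(self, message: str) -> str:
        return await self._run(self._blocking_call, message)


async def demo(async_mode: bool):
    manager = SketchSessionManager(MemorySessionManagerConfig(async_mode=async_mode))
    result = await manager.append_message("hi")
    ran_on_loop_thread = manager.call_threads[0] is threading.current_thread()
    return result, ran_on_loop_thread


if __name__ == "__main__":
    print(asyncio.run(demo(False)))
    print(asyncio.run(demo(True)))
```

Routing every call through one helper like `_run` keeps the sync path byte-for-byte identical to today's behavior and limits the async_mode check to a single place.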

Impact

  • Without fix: 200–800ms per turn of event loop blocking → degraded TTFT for streaming agents
  • With fix: Boto3 calls offloaded to thread pool → event loop stays free for I/O

Affected File

src/bedrock_agentcore/memory/integrations/strands/session_manager.py

Acceptance Criteria

  • async_mode: bool = False added to config (backwards-compatible default)
  • All 4 blocking call sites wrapped with asyncio.to_thread when async_mode=True
  • retrieve_customer_context uses asyncio.gather instead of blocking as_completed when async_mode=True
  • Unit tests cover both async_mode=False (existing behavior) and async_mode=True
  • Docs updated with usage example for async WebSocket agents
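For the retrieve_customer_context criterion, one possible shape is sketched below. `search_namespace` is a hypothetical stand-in for a single synchronous retrieval request, and the function signature here is an assumption, not the SDK's actual one.

```python
import asyncio
import time


def search_namespace(namespace: str) -> list[str]:
    """Hypothetical stand-in for one synchronous memory-retrieval request."""
    time.sleep(0.1)
    return [f"{namespace}:memory"]


async def retrieve_customer_context(namespaces: list[str]) -> list[str]:
    # Each lookup runs in a worker thread; gather awaits them all without
    # blocking the event loop, unlike iterating concurrent.futures.as_completed().
    batches = await asyncio.gather(
        *(asyncio.to_thread(search_namespace, ns) for ns in namespaces)
    )
    # Flatten the per-namespace results into one list.
    return [item for batch in batches for item in batch]


if __name__ == "__main__":
    print(asyncio.run(retrieve_customer_context(["facts", "preferences"])))
```

Note that asyncio.gather returns results in input order rather than completion order, and by default the first exception propagates to the caller; passing return_exceptions=True would be needed if partial results should survive a failed lookup.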

Labels

enhancement (New feature or request)
