This repository is a reference implementation of a Rank‑4 agent skeleton:
- Broker Lambda (
POST /agent) for conversations - Heartbeat Lambda (EventBridge
rate(5 minutes)) for background work - Unified DynamoDB state + memory store
- Shared Lambda layer (
agent_core+ deps) + Config layer (prompts) - FastAPI + Jinja2 GUI for deploying/managing stacks and chatting with the agent
This is intentionally a skeleton: you’re expected to adapt the toolset, prompts, planner, and action dispatch.
- Python 3.12
pip- AWS CLI v2
- AWS SAM CLI (
sam) - An AWS account with access to:
- CloudFormation/Lambda/DynamoDB/EventBridge/IAM
- Amazon Bedrock (and the chosen
MODEL_ID) - (Optional) Amazon Polly
You need a configured AWS profile in ~/.aws/credentials (or equivalent).
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txtsource initwyw <AWS_PROFILE> <AWS_REGION>STACK_NAME=<your stack name>
sam build --build-dir ".aws-${STACK_NAME}"
sam deploy --guided --stack-name "$STACK_NAME" --template-file ".aws-${STACK_NAME}/build/template.yaml"Notes:
MODEL_IDis configured via the SAM parameterModelIdParam.- Default model ID is
meta.llama3-1-8b-instruct-v1:0. - (Optional) set
AudioBucketNameto store Polly MP3s to S3 (requires S3 permissions).
After deployment, CloudFormation outputs BrokerEndpointURL like:
https://<api-id>.execute-api.<region>.amazonaws.com/Prod/agent
uvicorn client.gui:app --reload --port 8000Open:
From the home page you can:
- list agent stacks
- select one to chat
- delete a stack
- deploy a stack (runs
sam build+sam deploy)
Once deployed, you can call the Broker:
curl -sS -X POST "$BROKER_URL" \
-H "Content-Type: application/json" \
-H "X-Session-Id: demo-session" \
-d '{"text":"Hello. My name is Major.","channel":"text"}' | jqResponse schema:
{
"reply_text": "...",
"actions": [],
"audio": { "url": "..."} // optional
}The FastAPI/Jinja2 GUI exposes two major workflows:
- Manual interface (stack/deployment management) — browse, create, and delete stacks, or jump to AWS consoles.
- Chat GUI — talk to a deployed broker (text or voice), switch ASR modes, and inspect request/response metadata.
From the home page (/):
- List stacks: Fetches CloudFormation stacks matching the optional
AGENT_RESOURCE_STACK_PREFIX(defaults tomarai-). Stacks are grouped by status (available / creating / deleting / failed) to make it clear what you can select. - Select a stack: Choosing a stack reveals the API endpoint used for chat and a direct AWS console link for the stack so you can debug events/outputs.
- Delete: Removes a selected stack (uses
cf.delete_stackunder the hood). A confirmation prompt appears in the browser before issuing the request. - Deploy new stack (
/deploy):- Enter the stack name and optional
AGENT_RESOURCE_STACK_PREFIXoverride. - The GUI runs
sam build+sam deploy --guidedwith the generated build directory (.aws-<STACK_NAME>). Output streams to the page so you can watch progress. - When deployment finishes, the resulting broker endpoint is shown and the new stack is selectable on the home page.
- Enter the stack name and optional
- Settings (
/settings):- Set
AWS_PROFILE/AWS_REGIONfor subsequent requests. - Toggle “Use s3_bucket in samconfig.toml” to force/skip the
samconfig.tomlS3 bucket (handy when bootstrapping a new environment). - Adjust the stack prefix. The prefix is applied automatically when creating or selecting stacks.
- Set
After selecting a stack, open /chat to talk to the broker:
- Text chat: Enter text and send; responses appear in the thread. If the broker returns actions, they are rendered inline for debugging.
- Voice input: Choose an ASR mode:
- Server (AWS Transcribe Streaming): Requires locally configured AWS credentials with
transcribe:StartStreamTranscription*permissions. The browser streams audio to the FastAPI server, which forwards to AWS Transcribe; final transcripts are sent to the broker. - Browser (Web Speech API): Uses the browser’s built-in recognizer; no AWS creds needed. Device selection is browser-dependent.
- Server (AWS Transcribe Streaming): Requires locally configured AWS credentials with
- Audio output: If the broker returns
audio.url, the GUI provides a play button so you can listen without leaving the page. - Session management: Each chat uses the
X-Session-Idheader with the current stack name by default. You can override the session ID in the UI to simulate multiple conversations. - Endpoint override: Advanced users can paste a broker URL directly to test alternate deployments without switching stacks.
- Debug panel: Expand the request/response JSON to verify payloads, headers, and timing information when troubleshooting.
- This skeleton is built for AWS deployment. The GUI can talk to a deployed stack.
- The chat UI supports two ASR modes:
- Server (AWS Transcribe Streaming): true mic selection, optional speaker routing for playback, push-to-talk, and continuous ambient listening. Audio is captured in the browser and streamed to the local FastAPI server over a websocket; the server streams to AWS Transcribe and (on final) forwards the transcript to the deployed Broker.
- Browser (Web Speech API fallback): uses the built-in browser recognizer (Chrome/Edge); device selection is not guaranteed.
If you use Server (AWS Transcribe) mode, the AWS credentials/profile used to run the GUI must be allowed to start a streaming transcription.
In practice that means granting (at minimum) a Transcribe streaming action to the profile/role,
e.g. transcribe:StartStreamTranscriptionWebSocket (some environments use transcribe:StartStreamTranscription).
This does not change the SAM-deployed stack permissions; it's purely for the local GUI.
A minimal identity policy looks like:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"transcribe:StartStreamTranscriptionWebSocket",
"transcribe:StartStreamTranscription"
],
"Resource": "*"
}
]
}template.yaml # SAM template (DDB + Lambdas + layers + API + schedule)
lambda/broker/app.py # Broker Lambda
lambda/agent_heartbeat/handler.py
layers/shared/ # Shared deps + agent_core (SAM layer with Makefile build)
layers/config/prompts/ # Prompt templates (SAM config layer)
client/ # FastAPI + Jinja2 GUI
Delete the stack (GUI “Delete” button) or via CLI:
aws cloudformation delete-stack --stack-name <STACK_NAME>MIT (see LICENSE).