agentuity · parteeksingh24 · Jan 16, 2026 · Jan 21, 2026 · Jan 21, 2026 · Jan 21, 2026
diff --git a/content/Agents/evaluations.mdx b/content/Agents/evaluations.mdx
@@ -5,6 +5,21 @@ description: Automatically test and validate agent outputs for quality and compl
 
 Evaluations (evals) are automated tests that run after your agent completes. They validate output quality, check compliance, and monitor performance without blocking agent responses.
 
+## Why Evals?
+
+Most evaluation tools test the LLM: did the model respond appropriately? That's fine for chatbots, but agents aren't single LLM calls. They're entire runs with multiple model calls, tool executions, and orchestration working together.
+
+Agent failures can happen anywhere in the run: for example, a tool call that returns bad data or a state bug that corrupts context. Testing only the LLM response misses most of these failures.
+
+Agentuity evals test the whole run: every tool call, state change, and orchestration step. They run on every session in production, so you catch issues with real traffic.
+
+**The result:**
+
+- **Full-run evaluation**: Test the entire agent execution, not just LLM responses
+- **Production monitoring**: Once configured, evals run automatically on every session
+- **Async by default**: Evals don't block responses, so users aren't waiting
+- **Preset library**: Common checks (PII, safety, hallucination) available out of the box
+
 Evals come in two types: **binary** (pass/fail) for yes/no criteria, and **score** (0-1) for quality gradients.
 
 <Callout type="info" title="Where Scores Appear">

diff --git a/content/Agents/standalone-execution.mdx b/content/Agents/standalone-execution.mdx
@@ -16,10 +16,20 @@ import { createAgentContext } from '@agentuity/runtime';
 import chatAgent from '@agent/chat';
 
 const ctx = createAgentContext();
-const result = await ctx.invoke(() => chatAgent.run({ message: 'Hello' }));
+const result = await ctx.run(chatAgent, { message: 'Hello' });
 ```
 
-The `invoke()` method executes your agent with full infrastructure support: tracing, session management, and access to all storage services.
+The `run()` method executes your agent with full infrastructure support: tracing, session management, and access to all storage services.
+
+For agents that don't require input:
+
+```typescript
+const result = await ctx.run(statusAgent);
+```
+
+<Callout type="info" title="Legacy invoke() Method">
+The older `ctx.invoke(() => agent.run(input))` pattern still works, but `ctx.run(agent, input)` is preferred for its cleaner syntax.
+</Callout>
 
 ## Options
 
@@ -45,10 +55,7 @@ await createApp();
 // Run cleanup every hour
 cron.schedule('0 * * * *', async () => {
   const ctx = createAgentContext({ trigger: 'cron' });
-
-  await ctx.invoke(async () => {
-    await cleanupAgent.run({ task: 'expired-sessions' });
-  });
+  await ctx.run(cleanupAgent, { task: 'expired-sessions' });
 });
 ```
 
@@ -58,35 +65,33 @@ For most scheduled tasks, use the [`cron()` middleware](/Routes/cron) instead. I
 
 ## Multiple Agents in Sequence
 
-Run multiple agents within a single `invoke()` call to share the same session and tracing context:
+Run multiple agents in sequence with the same context:
 
 ```typescript
 const ctx = createAgentContext();
 
-const result = await ctx.invoke(async () => {
-  // First agent analyzes the input
-  const analysis = await analyzeAgent.run({ text: userInput });
-
-  // Second agent generates response based on analysis
-  const response = await respondAgent.run({
-    analysis: analysis.summary,
-    sentiment: analysis.sentiment,
-  });
+// First agent analyzes the input
+const analysis = await ctx.run(analyzeAgent, { text: userInput });
 
-  return response;
+// Second agent generates response based on analysis
+const response = await ctx.run(respondAgent, {
+  analysis: analysis.summary,
+  sentiment: analysis.sentiment,
 });
 ```
 
+Each `ctx.run()` call shares the same session and tracing context.
+
 ## Reusing Contexts
 
 Create a context once and reuse it for multiple invocations:
 
 ```typescript
 const ctx = createAgentContext({ trigger: 'websocket' });
 
-// Each invoke() gets its own session and tracing span
+// Each run() gets its own session and tracing span
 websocket.on('message', async (data) => {
-  const result = await ctx.invoke(() => messageAgent.run(data));
+  const result = await ctx.run(messageAgent, data);
   websocket.send(result);
 });
 ```
@@ -104,6 +109,28 @@ Standalone contexts provide the same infrastructure as HTTP request handlers:
 - **Session events**: Start/complete events for observability
 </Callout>
 
+## Detecting Runtime Context
+
+Use `isInsideAgentRuntime()` to check if code is running within the Agentuity runtime:
+
+```typescript
+import { isInsideAgentRuntime, createAgentContext } from '@agentuity/runtime';
+import myAgent from '@agent/my-agent';
+
+async function processRequest(data: unknown) {
+  if (isInsideAgentRuntime()) {
+    // Already in runtime context, call agent directly
+    return myAgent.run(data);
+  }
+
+  // Outside runtime, create context first
+  const ctx = createAgentContext();
+  return ctx.run(myAgent, data);
+}
+```
+
+This is useful for writing utility functions that work both inside agent handlers and in standalone scripts.
+
 ## Next Steps
 
 - [Calling Other Agents](/Agents/calling-other-agents): Agent-to-agent communication patterns

diff --git a/content/Agents/workbench.mdx b/content/Agents/workbench.mdx
@@ -5,6 +5,19 @@ description: Use the built-in development UI to test agents, validate schemas, a
 
 Workbench is a built-in UI for testing your agents during development. It automatically discovers your agents, displays their input/output schemas, and lets you execute them with real inputs.
 
+## Why Workbench?
+
+Testing agents isn't like testing traditional APIs. You need to validate input schemas, see how responses are formatted, test multi-turn conversations, and understand execution timing. Using `curl` or Postman means manually constructing JSON payloads and parsing responses.
+
+Workbench understands your agents. It reads your schemas, generates test forms, maintains conversation threads, and shows execution metrics. When something goes wrong, you see exactly what the agent received and returned.
+
+**Key capabilities:**
+
+- **Schema-aware testing**: Input forms generated from your actual schemas
+- **Thread persistence**: Test multi-turn conversations without manual state tracking
+- **Execution metrics**: See token usage and response times for every request
+- **Quick iteration**: Test prompts display in the UI for one-click execution
+
 ## Enabling Workbench
 
 Add a `workbench` section to your `agentuity.config.ts`:

diff --git a/content/Learn/Cookbook/Patterns/server-utilities.mdx b/content/Learn/Cookbook/Patterns/server-utilities.mdx
@@ -1,6 +1,6 @@
 ---
 title: SDK Utilities for External Apps
-description: Use storage, logging, error handling, and schema utilities from external backends like Next.js or Express
+description: Use storage, queues, logging, and error handling utilities from external backends like Next.js or Express
 ---
 
 Use `@agentuity/server` and `@agentuity/core` utilities in external apps, scripts, or backends that integrate with Agentuity.
@@ -122,6 +122,146 @@ export async function GET(request: NextRequest) {
 }
 ```
 
+## Queue Management
+
+Manage queues programmatically from external apps or scripts using `APIClient`:
+
+```typescript title="lib/agentuity-queues.ts"
+import { APIClient, createLogger, getServiceUrls } from '@agentuity/server';
+
+export const logger = createLogger('info');
+const urls = getServiceUrls(process.env.AGENTUITY_REGION!);
+
+export const client = new APIClient(
+  urls.catalyst,
+  logger,
+  process.env.AGENTUITY_SDK_KEY
+);
+```
+
+### Creating and Managing Queues
+
+```typescript
+import {
+  createQueue,
+  listQueues,
+  deleteQueue,
+  pauseQueue,
+  resumeQueue,
+} from '@agentuity/server';
+import { client } from '@/lib/agentuity-queues';
+
+// Create a worker queue
+const queue = await createQueue(client, {
+  name: 'order-processing',
+  queue_type: 'worker',
+  settings: {
+    default_max_retries: 5,
+    default_visibility_timeout_seconds: 60,
+  },
+});
+
+// List all queues
+const { queues } = await listQueues(client);
+
+// Pause and resume
+await pauseQueue(client, 'order-processing');
+await resumeQueue(client, 'order-processing');
+
+// Delete a queue
+await deleteQueue(client, 'old-queue');
+```
+
+### Dead Letter Queue Operations
+
+```typescript
+import {
+  listDeadLetterMessages,
+  replayDeadLetterMessage,
+  purgeDeadLetter,
+} from '@agentuity/server';
+import { client, logger } from '@/lib/agentuity-queues';
+
+// List failed messages
+const { messages } = await listDeadLetterMessages(client, 'order-processing');
+
+for (const msg of messages) {
+  logger.warn('Failed message', { id: msg.id, reason: msg.failure_reason });
+
+  // Replay back to the queue
+  await replayDeadLetterMessage(client, 'order-processing', msg.id);
+}
+
+// Purge all DLQ messages
+await purgeDeadLetter(client, 'order-processing');
+```
+
+### Webhook Destinations
+
+```typescript
+import { createDestination } from '@agentuity/server';
+import { client } from '@/lib/agentuity-queues';
+
+await createDestination(client, 'order-processing', {
+  destination_type: 'http',
+  config: {
+    url: 'https://api.example.com/webhook/orders',
+    method: 'POST',
+    headers: { 'X-API-Key': 'secret' }, // placeholder value; load real keys from your environment
+    timeout_ms: 30000,
+    retry_policy: {
+      max_attempts: 5,
+      initial_backoff_ms: 1000,
+      max_backoff_ms: 60000,
+      backoff_multiplier: 2.0,
+    },
+  },
+});
+```
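The `retry_policy` fields above look like a capped exponential backoff. The diff does not spell out the exact formula, so the following is only a sketch under that assumption: each wait is `initial_backoff_ms` multiplied by `backoff_multiplier` once per prior retry, capped at `max_backoff_ms`, with no jitter. `RetryPolicy` and `backoffSchedule` are hypothetical names for illustration and are not part of `@agentuity/server`.

```typescript
// Hypothetical helper, NOT an @agentuity/server API.
// Assumes capped exponential backoff without jitter.
interface RetryPolicy {
  max_attempts: number;
  initial_backoff_ms: number;
  max_backoff_ms: number;
  backoff_multiplier: number;
}

function backoffSchedule(policy: RetryPolicy): number[] {
  const delays: number[] = [];
  // There is one wait between each pair of consecutive attempts,
  // so max_attempts attempts produce max_attempts - 1 waits.
  for (let retry = 0; retry < policy.max_attempts - 1; retry++) {
    const raw =
      policy.initial_backoff_ms * Math.pow(policy.backoff_multiplier, retry);
    delays.push(Math.min(raw, policy.max_backoff_ms));
  }
  return delays;
}

// Same values as the destination config above
const delays = backoffSchedule({
  max_attempts: 5,
  initial_backoff_ms: 1000,
  max_backoff_ms: 60000,
  backoff_multiplier: 2.0,
});

console.log(delays); // waits in ms: 1000, 2000, 4000, 8000
```

Under these assumptions the cap is never reached with five attempts; a higher `max_attempts` would eventually flatten the waits at `max_backoff_ms`.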
+
+### HTTP Ingestion Sources
+
+```typescript
+import { createSource } from '@agentuity/server';
+import { client, logger } from '@/lib/agentuity-queues';
+
+const source = await createSource(client, 'webhook-queue', {
+  name: 'stripe-webhooks',
+  description: 'Receives Stripe payment events',
+  auth_type: 'header',
+  auth_value: 'Bearer whsec_...',
+});
+
+// External services POST to this URL
+logger.info('Source created', { url: source.url });
+```
+
+### Pull-Based Consumption
+
+For workers that pull and acknowledge messages:
+
+```typescript
+import { receiveMessage, ackMessage, nackMessage } from '@agentuity/server';
+import { client } from '@/lib/agentuity-queues';
+
+// Receive a message (blocks until available or timeout)
+const message = await receiveMessage(client, 'order-processing');
+
+if (message) {
+  try {
+    await processOrder(message.payload);
+    await ackMessage(client, 'order-processing', message.id);
+  } catch (error) {
+    // Message returns to queue for retry
+    await nackMessage(client, 'order-processing', message.id);
+  }
+}
+```
+
+<Callout type="info" title="CLI for Quick Operations">
+For one-off queue management, use the CLI instead: `agentuity cloud queue create`, `agentuity cloud queue dlq`, etc. See [Queues](/Services/queues) for CLI commands.
+</Callout>
+
 ## Alternative: HTTP Routes
 
 If you want to centralize storage logic in your Agentuity project (for [middleware](/Routes/middleware), sharing across multiple apps, or avoiding SDK key distribution), use [HTTP routes](/Routes/http) instead.
@@ -182,7 +322,8 @@ export default router;
 Add authentication middleware to protect storage endpoints:
 
 ```typescript title="src/api/sessions/route.ts"
-import { createRouter, createMiddleware } from '@agentuity/runtime';
+import { createRouter } from '@agentuity/runtime';
+import { createMiddleware } from 'hono/factory';
 
 const router = createRouter();
 
@@ -330,6 +471,7 @@ const jsonSchema = toJSONSchema(schema);
 
 ## See Also
 
+- [Queues](/Services/queues): Queue concepts and CLI commands
 - [HTTP Routes](/Routes/http): Route creation with `createRouter`
 - [Route Middleware](/Routes/middleware): Authentication patterns
 - [RPC Client](/Frontend/rpc-client): Typed client generation
diff --git a/content/Reference/CLI/configuration.mdx b/content/Reference/CLI/configuration.mdx
@@ -118,6 +118,58 @@ agentuity cloud secret import .env.secrets
 
 **In agents:** Access secrets via `process.env.API_KEY`. Secrets are injected at runtime and never logged.
 
+## Organization-Level Configuration
+
+Set environment variables and secrets at the organization level to share them across all projects in that organization. Use the `--org` flag with any `env` or `secret` command.
+
+### Set Org-Level Variables
+
+```bash
+# Set using your default org
+agentuity cloud env set DATABASE_URL "postgresql://..." --org
+
+# Set for a specific org
+agentuity cloud env set DATABASE_URL "postgresql://..." --org org_abc123
+```
+
+### Set Org-Level Secrets
+
+```bash
+# Set shared secret for default org
+agentuity cloud secret set SHARED_API_KEY "sk_..." --org
+
+# Set for specific org
+agentuity cloud secret set SHARED_API_KEY "sk_..." --org org_abc123
+```
+
+### List Org-Level Values
+
+```bash
+# List org environment variables
+agentuity cloud env list --org
+
+# List org secrets
+agentuity cloud secret list --org
+```
+
+### Get/Delete Org-Level Values
+
+```bash
+# Get an org variable
+agentuity cloud env get DATABASE_URL --org
+
+# Delete an org secret
+agentuity cloud secret delete OLD_KEY --org
+```
+
+<Callout type="info" title="Inheritance">
+Organization-level values are inherited by all projects in that organization. Project-level values take precedence over organization-level values when both are set.
+</Callout>
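The precedence rule in the callout above can be pictured as a simple merge in which project-level keys override organization-level keys. The platform resolves values for you, so this is only an illustrative sketch; `EnvMap`, `resolveEnv`, and the sample values are hypothetical and not part of the Agentuity CLI or SDK.

```typescript
// Hypothetical illustration of org/project precedence, NOT an Agentuity API.
type EnvMap = Record<string, string>;

function resolveEnv(orgValues: EnvMap, projectValues: EnvMap): EnvMap {
  // Later spreads win, so project-level keys override org-level keys,
  // while org-level keys with no project override are inherited as-is.
  return { ...orgValues, ...projectValues };
}

const resolved = resolveEnv(
  { DATABASE_URL: 'postgresql://org-db', SHARED_API_KEY: 'sk_org' },
  { DATABASE_URL: 'postgresql://project-db' }
);

console.log(resolved.DATABASE_URL);   // postgresql://project-db (project wins)
console.log(resolved.SHARED_API_KEY); // sk_org (inherited from the org)
```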
+
+<Callout type="tip" title="Default Organization">
+Set a default organization with `agentuity auth org select` to avoid specifying `--org` on every command. See [Getting Started](/Reference/CLI/getting-started) for details.
+</Callout>
+
 ## API Keys
 
 Create and manage API keys for programmatic access to your project.