diff --git a/docs/content/docs/use-cases/connecting-agents.mdx b/docs/content/docs/use-cases/connecting-agents.mdx
index 86481658..3908c2c4 100644
--- a/docs/content/docs/use-cases/connecting-agents.mdx
+++ b/docs/content/docs/use-cases/connecting-agents.mdx
@@ -251,3 +251,23 @@ Create a graph showing daily cost for workflow_name='meeting-analysis'
```

+
+## Debugging Workflows
+
+When a workflow produces incorrect output, sometimes it can be challenging to identify which agent introduced the problem. Since workflows chain multiple agents together, an error in an early agent can cascade through subsequent agents, making the final output incorrect.
+
+To avoid tracking down the problematic agent manually, you can ask your AI assistant to debug an entire workflow by trace_id:
+
+```
+Debug the workflow with trace_id=aece5cad-1090-47d2-b78b-eb7b014fc97e. The final output is
+incorrect, but I'm not sure which agent caused the problem. Review all completions in this
+workflow and help me identify where the issue was introduced.
+```
+
+Your AI assistant will:
+1. Query all agents' completions with the specified trace_id
+2. Review each agent's input and output in sequence
+3. Identify where the error was introduced
+4. Explain what went wrong and which agent is responsible
+
+This makes it easy to debug complex workflows without manually inspecting each agent's output.
diff --git a/docs/content/docs/use-cases/fundamentals/building.mdx b/docs/content/docs/use-cases/fundamentals/building.mdx
index dce0bb09..670e21e4 100644
--- a/docs/content/docs/use-cases/fundamentals/building.mdx
+++ b/docs/content/docs/use-cases/fundamentals/building.mdx
@@ -55,7 +55,7 @@ Your AI assistant will be able to construct the agent's code, and will be able t
You can also add custom metadata to your agents to help organize and track them. Common use cases include:
- **Workflow tracking**: include a trace_id and workflow_name key. (Learn more about workflows [here](/use-cases/connecting-agents))
-- **User identication**: include a user_id customer_email key.
+- **User identification**: include a user_id or customer_email key.
```
Create a new AnotherAI agent that summarizes emails and include a customer_id metadata key.
@@ -76,7 +76,7 @@ If you prefer to build manually or want to understand the configuration details,
## Testing your Agent
As part of the process of creating your agent with AnotherAI, your AI assistant will automatically create an initial experiment to test your agent's performance. Experiments allow you to systematically compare each of these different parameters of your agent to find the optimal setup for your use case across one or more inputs. You can use experiments to:
-- Compare performance across different models (GPT-4, Claude, Gemini, etc.)
+- Compare performance across different models (e.g., GPT-5, Claude Sonnet 4.5, Gemini 2.0 Flash)
- Test multiple prompt variations to find the most effective approach
- Optimize for specific metrics like cost, speed, and accuracy.
diff --git a/docs/content/docs/use-cases/fundamentals/deployments.mdx b/docs/content/docs/use-cases/fundamentals/deployments.mdx
index 88b9cad2..f28cb36f 100644
--- a/docs/content/docs/use-cases/fundamentals/deployments.mdx
+++ b/docs/content/docs/use-cases/fundamentals/deployments.mdx
@@ -96,12 +96,21 @@ This initial setup requires an AI coding agent with access to your codebase to m
If the version of your agent you want to deploy is already in your IDE, you can also just request to have a deployment created directly, without opening the web app.
1. Ensure your code is already using AnotherAI's base_url and API key.
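+AnotherAI works through an OpenAI-compatible endpoint, so this step usually amounts to pointing your OpenAI client at AnotherAI. A minimal sketch (the base URL and environment variable name below are illustrative; use the values from the [Getting Started guide](/getting-started)):
+
+```python
+import os
+import openai
+
+# Illustrative setup: the base URL and env var name are assumptions,
+# not authoritative values
+client = openai.AsyncOpenAI(
+    base_url="https://api.anotherai.dev/v1",
+    api_key=os.environ["ANOTHERAI_API_KEY"],
+)
+```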
-2. Tell your AI assistant to deploy your agent:
+2. Decide on a deployment_id for your deployment.
+
+A `deployment_id` is a unique identifier for a given deployment. Some common naming patterns include:
+
+- `my-agent:v1`: Incrementing a version number is an easy-to-understand way to keep track of your deployments
+- `my-agent:low-latency-version`: You can also differentiate your deployments by characteristic or purpose. This can be useful if you want to A/B test different versions and easily distinguish between them.
+
+If you're not sure what ID to give your deployment, you can ask your AI assistant to pick a name for you.
+
+3. Tell your AI assistant to deploy your agent:
```
-Deploy the version of anotherai/agent/travel-assistant in my code to production.
-Create a deployment for this version and update to my code to match this version
-and reference the deployment_id.
+Deploy this version of anotherai/agent/travel-assistant: anotherai/version/0c8ad37d1c9f5e9c06b8b643e87dfc8b.
+Deployment ID: travel-assistant:v1. Update my code to use this deployment and reference
+this deployment_id.
```
Your AI assistant will create a version ID for you and deploy it.
@@ -115,19 +124,27 @@ If you find the version you want to deploy is in a completion view on the web ap
2. Copy the version ID (located on the right side of the modal)

+3. Decide on a deployment_id for your deployment.
+
+A `deployment_id` is a unique identifier for a given deployment. Some common naming patterns include:
-3. Paste the version ID into your preferred AI assistant and ask it to deploy:
+- `my-agent:v1`: Incrementing a version number is an easy-to-understand way to keep track of your deployments
+- `my-agent:low-latency-version`: You can also differentiate your deployments by characteristic or purpose. This can be useful if you want to A/B test different versions and easily distinguish between them.
+
+If you're not sure what ID to give your deployment, you can ask your AI assistant to pick a name for you.
+
+4. Paste the version ID into your preferred AI assistant and ask it to deploy:
```
-Deploy anotherai/version/acf2635be31cbd89f9363bfd3b2c6abc to production.
-Create a deployment for this version and update to my code to match this version
-and reference the deployment_id.
+Deploy this version of anotherai/agent/travel-assistant: anotherai/version/0c8ad37d1c9f5e9c06b8b643e87dfc8b.
+Deployment ID: travel-assistant:v1. Update my code to use this deployment and reference
+this deployment_id.
```
-If there is a version of the agent in an experimentyou want to deploy, you can get the version of your agent from the experiments web view.
+If there is a version of the agent in an experiment you want to deploy, you can get the version of your agent from the experiments web view.
1. Locate the experiment that has the version of the agent you want to deploy.
2. Hover over the version number to copy the version ID
@@ -135,12 +152,21 @@ If there is a version of the agent in an experimentyou want to deploy, you can g

3. Open your preferred AI assistant
-4. Request deployment to your preferred environment:
+4. Decide on a deployment_id for your deployment.
+
+A `deployment_id` is a unique identifier for a given deployment. Some common naming patterns include:
+
+- `my-agent:v1`: Incrementing a version number is an easy-to-understand way to keep track of your deployments
+- `my-agent:low-latency-version`: You can also differentiate your deployments by characteristic or purpose. This can be useful if you want to A/B test different versions and easily distinguish between them.
+
+If you're not sure what ID to give your deployment, you can ask your AI assistant to pick a name for you.
+
+5. Request deployment to your preferred environment:
```
-Deploy anotherai/version/acf2635be31cbd89f9363bfd3b2c6abc to production.
-Create a deployment for this version and update to my code to match this version
-and reference the deployment_id.
+Deploy this version of anotherai/agent/travel-assistant: anotherai/version/0c8ad37d1c9f5e9c06b8b643e87dfc8b.
+Deployment ID: travel-assistant:v1. Update my code to use this deployment and reference
+this deployment_id.
```

@@ -220,7 +246,7 @@ Always be helpful, accurate, and culturally sensitive."""
```js
const completion = await openai.chat.completions.create({
// Static components are now stored in the deployment
- model: "anotherai/deployment/travel-assistant:production#1",
+ model: "anotherai/deployment/travel-assistant:v1",
messages: [],
input: {
country: destination,
@@ -233,7 +259,7 @@ const completion = await openai.chat.completions.create({
```python
completion = await openai.chat.completions.create(
# Model, temperature, system message, and agent_id are now in the deployment
- model="anotherai/deployment/travel-assistant:production#1",
+ model="anotherai/deployment/travel-assistant:v1",
messages=[],
extra_body={
"input": {
@@ -249,6 +275,145 @@ completion = await openai.chat.completions.create(
+
+
+
+When a deployment is created, it captures the completion parameters from the existing code. Depending on what was in the code, the deployment will store different parameters, and the code updates needed to reference the deployment will differ.
+
+
+
+
+This is the pattern shown in the "Code Before and After Using Deployments" section above. When your deployment stores a templated prompt with input variables, your code passes empty messages and only provides the input variables.
+
+**When to use this pattern:**
+- You want maximum flexibility - both prompts and models managed outside your codebase
+- Non-technical team members need to iterate on prompts
+- You have a templated prompt with variable substitution
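+For reference, a call under this pattern looks like the following (mirroring the Python example above; the deployment ID and input variable are illustrative):
+
+```python
+# Deployment stores: model, temperature, templated prompt, input variable schema
+# Code provides: only the input variables; messages stay empty
+completion = await client.chat.completions.create(
+    model="anotherai/deployment/travel-assistant:v1",
+    messages=[],
+    extra_body={"input": {"country": "France"}},
+)
+```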
+
+
+
+
+
+If your deployment only stores the model, temperature, response_format, and other completion parameters **without a prompt template or input variables**, your code continues to pass messages normally. This pattern is generally used when working with images, PDFs, audio, or other non-text inputs that cannot be easily templated.
+
+**What the deployment stores:**
+- Model (e.g., `gpt-4o`, `claude-3-5-sonnet`)
+- Temperature, top_p, and other generation parameters
+- Response format (JSON schema)
+- Agent ID
+
+**What your code provides:**
+- All messages (system, user, assistant)
+- Any dynamic content
+
+**Example:**
+
+
+
+```python
+# Deployment stores: model, temperature, response_format, agent_id
+# Code provides: all messages
+completion = await client.chat.completions.create(
+ model="anotherai/deployment/identity-verification:v1",
+ messages=[
+ {
+ "role": "system",
+ "content": "You are an expert in identity verification..."
+ },
+ {
+ "role": "user",
+ "content": [
+ {"type": "image_url", "image_url": {"url": profile_url}},
+ {"type": "image_url", "image_url": {"url": selfie_url}}
+ ]
+ }
+ ]
+)
+```
+
+
+```js
+// Deployment stores: model, temperature, response_format, agent_id
+// Code provides: all messages
+const completion = await client.chat.completions.create({
+ model: "anotherai/deployment/identity-verification:v1",
+ messages: [
+ {
+ role: "system",
+ content: "You are an expert in identity verification..."
+ },
+ {
+ role: "user",
+ content: [
+ {type: "image_url", image_url: {url: profileUrl}},
+ {type: "image_url", image_url: {url: selfieUrl}}
+ ]
+ }
+ ]
+});
+```
+
+
+
+**When to use this pattern:**
+- You want to update models or parameters without code changes
+- You want to keep full control over prompts in your codebase
+- You're working with complex multi-turn conversations or multimodal inputs (images, PDFs, audio)
+
+
+
+
+
+To determine which pattern applies to a deployment, list the deployment and examine its structure:
+
+```
+List deployments for agent document-classifier
+```
+
+Check the deployment response for:
+- **Has `prompt` field with templates (e.g., `{{ variable }}`)?** → Pattern 1 (input variables)
+- **Has `input_variables_schema` field?** → Pattern 1 (input variables)
+- **Only has `model`, `temperature`, `output_schema`?** → Pattern 2 (pass messages)
+
+**Example deployment using Pattern 1:**
+```json
+{
+ "id": "travel-assistant:v1",
+ "version": {
+ "model": "gpt-4o",
+ "temperature": 0.7,
+ "prompt": [
+ {
+ "role": "system",
+ "content": "You are an expert on {{ country }}..."
+ }
+ ],
+ "input_variables_schema": {
+ "type": "object",
+ "properties": {"country": {}}
+ }
+ }
+}
+```
+
+**Example deployment using Pattern 2:**
+```json
+{
+ "id": "identity-verification:v1",
+ "version": {
+ "model": "gpt-4o-latest",
+ "temperature": 1.0,
+ "output_schema": {...}
+ }
+}
+```
+
+
+
+
+
+
+
@@ -260,7 +425,7 @@ The code allows targeting a deployment but still provide completion parameters.
```js
const completion = await openai.chat.completions.create({
- model: "anotherai/deployment/travel-assistant:production#1",
+ model: "anotherai/deployment/travel-assistant:v1",
input: {
country: "France"
}
@@ -279,7 +444,7 @@ We believe that code should be the source of truth which means that in the above
- any provided completion parameter can override the corresponding deployment parameter
- if the override creates a version that is incompatible with the deployment an error is raised.
-Consider a deployment `travel-assistant/production#1` created with:
+Consider a deployment `travel-assistant:v1` created with:
- model: "gpt-4o"
- temperature: 0.5
@@ -290,7 +455,7 @@ Consider a deployment `travel-assistant/production#1` created with:
```js
// Accepted since the version is compatible with the deployment
const completion = await openai.chat.completions.create({
- model: "anotherai/deployment/travel-assistant:production#1",
+ model: "anotherai/deployment/travel-assistant:v1",
input: {
country: "France"
}
@@ -300,7 +465,7 @@ const completion = await openai.chat.completions.create({
// Rejected since the version is incompatible with the deployment
const completion = await openai.chat.completions.create({
-model: "anotherai/deployment/travel-assistant:production#1",
+model: "anotherai/deployment/travel-assistant:v1",
input: {
country: "France"
}
@@ -355,11 +520,11 @@ Copy the new version ID you want to deploy from [AnotherAI](https://anotherai.de
- Ask your preferred AI assistant to update the existing deployment:
+ Ask your preferred AI assistant to update the existing deployment:
- ```
- Update deployment anotherai/deployment/question-answering-agent:production#1 to use
- anotherai/version/a9f1fc5ab11299a9fee5604e51fe7b6e
+ ```
+ Update deployment anotherai/deployment/travel-assistant:v1 to use
+ anotherai/version/a9f1fc5ab11299a9fee5604e51fe7b6e
```
@@ -405,7 +570,7 @@ When a new deployment is created, you will need to update your code to point to
```js
// Before - using old deployment
const completion = await openai.chat.completions.create({
- model: "anotherai/deployment/travel-assistant:production#1",
+ model: "anotherai/deployment/travel-assistant:v1",
input: {
country: "France"
}
@@ -413,7 +578,7 @@ const completion = await openai.chat.completions.create({
// After - using new deployment with breaking changes
const completion = await openai.chat.completions.create({
-model: "anotherai/deployment/travel-assistant:production#2", // new deployment
+model: "anotherai/deployment/travel-assistant:v2", // new deployment
input: {
destination: "France", // variable renamed: country -> destination
traveler_type: "business" // new required variable added
@@ -427,5 +592,5 @@ traveler_type: "business" // new required variable added
-Don't worry if you're unsure if an update version is a breaking change or not: if you ask your AI assistant to update an existing deployment and it cannot because the new version is incompatible, oyur AI assistant will automatically create a new deployment for you. You can create as many deployments as you need.
+Don't worry if you're unsure if an update version is a breaking change or not: if you ask your AI assistant to update an existing deployment and it cannot because the new version is incompatible, your AI assistant will automatically create a new deployment for you. You can create as many deployments as you need.
diff --git a/docs/content/docs/use-cases/fundamentals/evaluating.mdx b/docs/content/docs/use-cases/fundamentals/evaluating.mdx
index 4141caee..97eb31d9 100644
--- a/docs/content/docs/use-cases/fundamentals/evaluating.mdx
+++ b/docs/content/docs/use-cases/fundamentals/evaluating.mdx
@@ -200,9 +200,9 @@ These datasets contain only inputs without predefined expected outputs. Evaluati
### Populating Your Evaluation Dataset
-While there is no one-size-fits-all way to build a dataset, there are a few common ways to collect content for your dataset:
+While there is no one-size-fits-all way to build a dataset, there are a few common ways to collect content for your dataset.
-#### From User Feedback
+#### User Feedback
When users report issues with your agent's outputs, these completions become valuable test cases because they represent a case that your agent is not handling well but should. To add the content of a completion to your dataset:
@@ -213,7 +213,13 @@ When users report issues with your agent's outputs, these completions become val
3. Paste the completion ID into your AI coding agent's chat and ask them to add the completion to your dataset.
- Your AI agent will be able to convert the completion content into the format of your existing dataset entries.
-#### From Production Data
+```
+Get completion anotherai/completion/0199a698-56c2-72e7-4ad8-e6bb1699a0a8 and add
+it to email-rewriter-dataset.json as a new test case.
+```
+Learn more about collecting and using user feedback [here](/use-cases/user-feedback).
+
+#### Production Data
Using data from production completions instead of mocked data ensures that you're testing real-world scenarios. AnotherAI logs all completions from your agents, so you can easily review past completions for important cases to add to your dataset.
@@ -224,6 +230,65 @@ You can browse past completions from your in the AnotherAI web app:
3. Select the agent and scroll down it's page
- You'll be able to see some of the recent completions immediately, but for a full list, select "View all completions"
+#### Non-Text Data (Local files, URLs)
+
+When creating a dataset for non-text data, the inputs can be either local files (e.g., `/Users/username/documents/f1040.pdf`) or public URLs (e.g., `https://www.irs.gov/pub/irs-pdf/f1040.pdf`). Here is the process for creating a dataset from each type of input.
+
+**Public URLs**
+
+If the files you want to use are available via public URLs and you're building your dataset in an IDE like Cursor:
+1. You can paste the URL or URLs directly into your AI assistant's chat and ask it to add them to your dataset.
+
+```
+Add https://www.irs.gov/pub/irs-pdf/f1040.pdf as an input to my tax_form.json dataset.
+```
+2. Your AI assistant will be able to add the URL to your dataset using the correct formatting.
+3. If you want to include an expected output for your URL input (for example, the content of the PDF), you can also ask your AI assistant to view the content and extract the information you need.
+
+```
+Add https://www.irs.gov/pub/irs-pdf/f1040.pdf to my tax_form.json dataset as input.
+View the content of this URL, extract the content of the file and add it to the dataset
+as the input's expected output.
+```
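+Your dataset file might end up looking something like this (the schema is illustrative; your AI assistant may choose different field names to match your existing entries):
+
+```json
+[
+  {
+    "input": {
+      "file_url": "https://www.irs.gov/pub/irs-pdf/f1040.pdf"
+    },
+    "expected_output": "Form 1040: U.S. Individual Income Tax Return..."
+  }
+]
+```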
+
+
+**When working with PDFs:** If you want to add an expected output for your URL input, you will need to use Claude Code to extract the content. At this time, Cursor's AI assistant is not able to view the content of the PDFs.
+
+
+**Local Files**
+
+If the files you want to use are stored locally and you're building your dataset in an IDE like Cursor:
+1. Create a folder for your dataset in your project.
+2. Locate the local files you want to use as inputs
+3. Drag and drop the files directly into the folder you created for your dataset
+4. From there, you can reference the file in your AI assistant chat freely (either by `@[file_name]` or by dragging and dropping it into the chatbox). Your AI assistant will be able to access the file to provide a ground-truth expected output, or add it as an input to an AnotherAI experiment.
+ - Note: At this time, Cursor's AI assistant is not able to view the content of PDFs, even after performing the steps above. If working with PDFs, we recommend using Claude Code.
+
+```
+Create a new experiment of anotherai/agent/food-image-analyzer and find the fastest model
+that can correctly list all food items in
+/Developer/anotherai/datasets/food-images/saturday-breakfast.jpg
+```
+
+
+**A note about image files:** Do not drag and drop local images into your AI assistant's chat and ask the assistant to add them to your dataset for you. While the chat will be able to see the images, it will not be able to reference them by their file path, meaning it cannot correctly add the files to your dataset or as inputs to an AnotherAI experiment.
+
+You can drag and drop local **PDF files** directly into Claude Code, as their file path is preserved.
+
+
+**Existing Datasets**
+
+If you already have a dataset you want to use, and you're using an IDE like Cursor:
+1. Create a new folder for your dataset in your project.
+2. Drag and drop the existing dataset file into the folder you created in step 1.
+3. From there, you can reference your dataset in your AI assistant chat and ask it to use some or all of the dataset as inputs to test your agent.
+
+```
+Create a new experiment of anotherai/agent/food-image-analyzer and use the content
+of @food-analysis-dataset.json as inputs to the experiment.
+```
+
## Evaluating the Results of Running Your Dataset
To evaluate your agent's outputs from running your dataset, you have two approaches:
diff --git a/docs/content/docs/use-cases/image-agents.mdx b/docs/content/docs/use-cases/image-agents.mdx
index bec329ed..11641ced 100644
--- a/docs/content/docs/use-cases/image-agents.mdx
+++ b/docs/content/docs/use-cases/image-agents.mdx
@@ -1,5 +1,204 @@
---
title: Image Agents
+description: Learn how to build, test, and optimize AI agents that process images.
+summary: Build and optimize AI agents that process images with systematic experimentation and testing strategies.
---
-TODO
\ No newline at end of file
+import { Step, Steps } from 'fumadocs-ui/components/steps';
+import { Callout } from 'fumadocs-ui/components/callout';
+import { Accordion, Accordions } from 'fumadocs-ui/components/accordion';
+
+
+Before you begin, make sure you have the AnotherAI MCP server configured with your AI assistant. Your AI assistant needs this connection to create agents, run experiments, and manage deployments. See the [Getting Started guide](/getting-started) for setup instructions.
+
+
+
+Need an extra hand with building agents? We're happy to help. Reach us at [team@workflowai.support](mailto:team@workflowai.support) or on [Slack](https://join.slack.com/t/anotherai-dev/shared_invite/zt-3av2prezr-Lz10~8o~rSRQE72m_PyIJA).
+
+
+In this guide, we'll walk through the process of building an image-based AI agent that analyzes food photos and provides calorie information. The goal is to give the agent an image of a food item or meal and have it return a structured list of detected foods with their respective calorie counts.
+
+This is just one example of an image-based agent. The same process can be applied to image-based agents that perform other tasks as well.
+
+
+
+### Prepare Your Test Data
+
+For text-based agents, it's quite simple for your AI assistant to generate test data for you. For image-based agents, however, AI assistants are inconsistent at generating valid image URLs as input. To produce useful results, we strongly recommend using your own images for testing.
+
+**Types of Image Input Data:**
+
+When it comes to using image data as input, the inputs can be either:
+- Local files (e.g., `/Users/username/images/pizza-margherita.jpg`)
+- Public URLs (e.g., `https://example.com/pizza-margherita.jpg`)
+
+Additionally, these formats can either be:
+- Standalone files (generally recommended when starting the agent development process)
+- Part of a [dataset](/use-cases/fundamentals/evaluating#using-datasets-to-evaluate-your-agents) (generally more complex to put together and used later on as part of [agent evaluations](/use-cases/fundamentals/evaluating)).
+
+Here is the process for using each type of image as input when creating your agent:
+
+
+
+**Quickest Usage Option:**
+
+If you want to use the images immediately for testing and aren't interested in saving the URLs in your codebase or in a dataset for later use (or don't have access to your codebase), simply include the image URL(s) in your [agent creation prompt](/use-cases/image-agents#creating-your-agent) and request that they are used as inputs in the AnotherAI experiment:
+
+```
+Create an AnotherAI agent that provides a list of detected foods and their approximate
+calorie count from a given image. Use the content of https://example.com/pizza-margherita.jpg
+as an input to test the agent with an AnotherAI experiment.
+```
+
+**If you want to save the URLs for later use:**
+1. Create a file in your project for your input image URLs.
+ - Using a JSON file is common, but if there is another format you'd prefer to use (.csv, .txt, etc.), you can use that instead. If you're unsure how to format your dataset, ask your AI assistant to help you.
+ ```
+ I have a bunch of image URLs that I want to use as inputs to test my agent.
+ Help me create a dataset file to store them.
+ ```
+
+2. Paste the image URL(s) directly into your AI assistant's chat and ask it to add them to your file.
+
+```
+Add https://example.com/pizza-margherita.jpg as an input to my food-analysis-dataset.json.
+```
+3. Your AI assistant will be able to add the URL to your dataset using the correct formatting.
+4. (Optional) If you want to include an expected output for your URL input (for this example, the food items and their calorie counts), you can also ask your AI assistant to view the content and extract the information you need.
+
+```
+Add https://example.com/pizza-margherita.jpg to my food-analysis-dataset.json as input.
+Then view the content of this URL, analyze the food items and their calorie counts,
+and add it to the dataset as the input's expected output.
+```
+
+
+
+If the images you want to use are stored locally and you're working in an IDE like Cursor:
+1. Create a folder for the images in your project.
+2. Locate the local files on your computer that you want to use as inputs
+3. Drag and drop the images directly into the folder you created
+
+
+4. From there, you can reference the image in your [agent creation prompt](/use-cases/image-agents#creating-your-agent) (either by `@[file_name]` or by dragging and dropping it into the chatbox) or in subsequent prompts for additional testing.
+5. (Optional) If you want corresponding expected outputs for your inputs, you can also ask your AI assistant to view the content and extract the information you need before proceeding with creating your agent.
+
+```
+View each image in /Developer/anotherai/datasets/food-images, analyze the food items
+and their calorie counts, then create a dataset that contains the image file name as
+the input and the food items and their calorie counts that you've detected as the
+expected output.
+```
+
+
+**A note about image files: Do not drag and drop local images into your AI assistant's chat without adding them to a folder in your project first.**
+
+If you drop the images into the chat directly from your computer file system, the chat will be able to see the images, but it will not be able to reference them by their file path, meaning it cannot correctly add the files to your dataset or as inputs to an AnotherAI experiment.
+
+
+
+
+If you already have a dataset you want to use, and you're using an IDE like Cursor:
+1. Create a new folder for your dataset in your project.
+2. Drag and drop the existing dataset file into the folder you created in step 1.
+3. From there, you can reference your dataset in your AI assistant chat and ask it to use some or all of the dataset as inputs to test your agent, either in your [agent creation prompt](/use-cases/image-agents#creating-your-agent) or in subsequent prompts for additional testing. For example:
+
+```
+Create a new experiment of anotherai/agent/food-analyzer and use the content
+of @food-analysis-dataset.json as inputs to the experiment.
+```
+
+
+
+
+
+
+### Creating your Agent
+
+After preparing your test data, you can move on to building your agent. The easiest way to create a new agent is to ask your preferred AI assistant to build it for you.
+
+**Basic Agent Creation**
+
+Start by describing what your agent should do. For image agents, we recommend referencing the test images you want to use as inputs to the agent in your initial prompt:
+
+```
+Create an AnotherAI agent that provides a list of detected foods and their approximate
+calorie count from a given image. Use the content of /Developer/anotherai/datasets/food-images
+as inputs to test the agent with an AnotherAI experiment.
+```
+
+**Adding Performance Requirements**
+
+If you have other criteria or constraints for your agent, you can include them in your prompt and your AI assistant will use AnotherAI to help you optimize for them. Learn more about different performance requirements and how to include them in your prompt [here](/use-cases/fundamentals/building#adding-performance-requirements).
+
+**Adding Metadata**
+
+You can also add custom metadata to your agents to help organize and track them. Learn more about adding metadata and how metadata can be used [here](/use-cases/fundamentals/building#adding-metadata).
+
+
+
+### Testing your Agent
+
+As part of the process of creating your agent with AnotherAI, your AI assistant will automatically create an initial experiment to test your agent's performance. Experiments allow you to systematically compare different parameters of your agent to find the optimal setup for your use case across one or more inputs. You can use experiments to:
+- Compare performance across different models (e.g., GPT-5, Claude Sonnet 4.5, Gemini 2.0 Flash)
+- Test multiple prompt variations to find the most effective approach
+- Optimize for specific metrics like cost, speed, and accuracy.
+
+If there are additional criteria you want to test, you can always ask your AI assistant to create additional experiments. The most common parameters to experiment with are [prompts](/use-cases/fundamentals/experiments#prompts) and [models](/use-cases/fundamentals/experiments#models); however, you can also experiment with changes to [other parameters like temperature](/use-cases/fundamentals/experiments#other-parameters).
+
+Here's an example of a prompt you might use to find a faster model for your agent:
+
+```
+Create an AnotherAI experiment to help me find a faster model for anotherai/agent/food-analyzer
+that still maintains the same calorie counting accuracy as my current model.
+```
+
+Learn more about experiments and see more examples [here](/use-cases/fundamentals/experiments).
+
+
+If you're testing an agent that has a large system prompt and/or very long inputs, you may encounter token limit issues with the `get_experiment` MCP tool that impact Claude Code's ability to provide accurate insights on your agent.
+
+
+
+In this case, you can manually increase Claude Code's output token limit.
+
+**To set up permanently for all terminal sessions:**
+
+For zsh (default on macOS):
+```bash
+echo 'export MAX_MCP_OUTPUT_TOKENS=150000' >> ~/.zshrc && source ~/.zshrc
+```
+
+For bash:
+```bash
+echo 'export MAX_MCP_OUTPUT_TOKENS=150000' >> ~/.bashrc && source ~/.bashrc
+```
+
+**For temporary use in current session only:**
+```bash
+export MAX_MCP_OUTPUT_TOKENS=150000
+```
+
+**Note:** If you forget or don't realize you need to set a higher limit, you can quit your existing session, run the command to increase the limit, and then use `claude --resume` to continue your previous session with the increased limit applied.
+
+You can learn more about tool output limits for Claude Code in their [documentation](https://docs.claude.com/en/docs/claude-code/mcp#mcp-output-limits-and-warnings).
+
+
+
+
+### Adding Feedback to your Experiments
+
+When reviewing the results of experiments, you can add feedback (annotations) to help your AI coding agent understand what is working and what is not. Your AI coding agent can then use this feedback to create additional experiments with improved versions of your agent.
+
+Learn more about how annotations can be used to improve your agent [here](/use-cases/fundamentals/annotations).
+
+
+
+### Debugging Image Agents
+
+The process of debugging image agents is the same as debugging text-based agents. Learn more about debugging agents [here](/use-cases/fundamentals/debugging).
+
+
diff --git a/docs/content/docs/use-cases/meta.json b/docs/content/docs/use-cases/meta.json
index f1dd0b75..be5cbe39 100644
--- a/docs/content/docs/use-cases/meta.json
+++ b/docs/content/docs/use-cases/meta.json
@@ -8,6 +8,8 @@
"checking-new-models",
"lowering-costs",
"connecting-agents",
- "for-product-managers"
+ "for-product-managers",
+ "image-agents",
+ "pdf-processing-agents"
]
}
diff --git a/docs/content/docs/use-cases/pdf-processing-agents.mdx b/docs/content/docs/use-cases/pdf-processing-agents.mdx
new file mode 100644
index 00000000..3e3be2a6
--- /dev/null
+++ b/docs/content/docs/use-cases/pdf-processing-agents.mdx
@@ -0,0 +1,205 @@
+---
+title: PDF Processing Agents
+description: Learn how to build, test, and optimize AI agents that extract information from PDF documents.
+summary: Build and optimize AI agents that extract structured data from PDF documents with systematic experimentation and testing strategies.
+---
+
+import { Step, Steps } from 'fumadocs-ui/components/steps';
+import { Callout } from 'fumadocs-ui/components/callout';
+import { Accordion, Accordions } from 'fumadocs-ui/components/accordion';
+
+
+Before you begin, make sure you have the AnotherAI MCP server configured with your AI assistant. Your AI assistant needs this connection to create agents, run experiments, and manage deployments. See the [Getting Started guide](/getting-started) for setup instructions.
+
+
+
+Need an extra hand with building agents? We're happy to help. Reach us at [team@workflowai.support](mailto:team@workflowai.support) or on [Slack](https://join.slack.com/t/anotherai-dev/shared_invite/zt-3av2prezr-Lz10~8o~rSRQE72m_PyIJA).
+
+
+In this guide, we'll walk through the process of building a PDF processing AI agent that extracts key information from PDF documents. Our goal with this agent's flow is to give it a PDF document and have it return structured data with the extracted information.
+
+This is just one example of a PDF processing agent. The same process can be applied to PDF processing agents that perform other tasks as well.
+
+
+
+### Prepare Your Test Data
+
+For text-based agents, it's quite simple for your AI assistant to generate test data for you. For PDF processing agents, however, AI assistants are generally not able to generate valid PDF URLs as input. To produce useful results, we strongly recommend using your own PDFs for testing.
+
+**Types of PDF Input Data:**
+
+PDF inputs can be either:
+- Local files (ex. `/Users/username/documents/subscription-invoice.pdf`)
+- Public URLs (ex. `https://example.com/documents/subscription-invoice.pdf`)
+
+Additionally, these inputs can be either:
+- Standalone files (generally recommended when starting the agent development process)
+- Part of a [dataset](/use-cases/fundamentals/evaluating#using-datasets-to-evaluate-your-agents) (generally more complex to put together and used later as part of [agent evaluations](/use-cases/fundamentals/evaluating))
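+For local files, another option (if your setup accepts `data:` URLs as input, which you should verify first) is to base64-encode the PDF yourself. A minimal sketch, with a placeholder file standing in for a real PDF:
+
+```python
+import base64
+from pathlib import Path
+
+def pdf_to_data_url(path: str) -> str:
+    """Encode a local PDF as a data: URL so it can be passed where a URL is expected."""
+    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
+    return f"data:application/pdf;base64,{encoded}"
+
+# Tiny placeholder file standing in for a real PDF.
+Path("subscription-invoice.pdf").write_bytes(b"%PDF-1.4 placeholder")
+url = pdf_to_data_url("subscription-invoice.pdf")
+print(url[:28])  # data:application/pdf;base64,
+```
+
+Keep in mind a data URL is roughly 4/3 the size of the original file, so this works best for small documents.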
+
+Here is the process for using each type of PDF as input when creating your agent:
+
+
+
+**Quickest Usage Option:**
+
+If you want to use the PDFs immediately for testing and aren't interested in saving the URLs in your codebase or in a dataset for later use (or don't have access to your codebase), simply include the PDF URL(s) in your [agent creation prompt](/use-cases/pdf-processing-agents#creating-your-agent) and request that they be used as inputs in the AnotherAI experiment:
+
+```
+Create an AnotherAI agent that extracts key information from PDF documents. Use the content of
+https://example.com/documents/subscription-invoice.pdf as an input to test the agent with an AnotherAI experiment.
+```
+
+**If you want to save the URLs for later use:**
+1. Create a file in your project for your input PDF URLs.
+   - Using a JSON file is common, but another format (.csv, .txt, etc.) works too. If you're unsure how to format your dataset, ask your AI assistant to help you.
+ ```
+ I have a bunch of PDF URLs that I want to use as inputs to test my agent.
+ Help me create a dataset file to store them.
+ ```
+
+2. Paste the PDF URL(s) directly into your AI assistant's chat and ask it to add them to your file.
+
+```
+Add https://example.com/documents/subscription-invoice.pdf
+as an input to my pdf-extraction-dataset.json.
+```
+3. Your AI assistant will add the URL to your dataset using the correct formatting.
+4. (Optional) If you want to include an expected output for your URL input, you can also ask your AI assistant to view the content and extract the information you need.
+
+```
+Add https://example.com/documents/subscription-invoice.pdf to my pdf-extraction-dataset.json as input.
+Then view the content of this URL, extract the key information, and add it to the dataset
+as the input's expected output.
+```
+
+
+**When working with PDFs:** If you want to add an expected output for your URL input, you will need to use Claude Code to extract the content. At this time, Cursor's AI assistant is not able to view the content of PDFs.
+
+
+
+
+If the PDFs you want to use are available as local files and you're working in an IDE like Cursor:
+1. Create a folder for the PDFs in your project.
+2. Locate the local files on your computer that you want to use as inputs.
+3. Drag and drop the PDFs directly into the folder you created.
+
+
+4. From there, you can reference the PDF in your [agent creation prompt](/use-cases/pdf-processing-agents#creating-your-agent) (either by `@[file_name]` or by dragging and dropping it into the chatbox) or subsequent prompts for additional testing.
+5. (Optional) If you want corresponding expected outputs for your inputs, you can also ask your AI assistant to view the content and extract the information you need before creating your agent.
+
+```
+View the content of /Developer/anotherai/datasets/pdf-documents/subscription-invoice.pdf,
+extract the key information, then create a dataset that contains the PDF file as the input
+and the key information as the expected output.
+```
+
+
+**A note about PDF files:** At this time, Cursor's AI assistant is not able to view the content of PDFs, even after performing the steps above. If working with PDFs, we recommend using Claude Code.
+
+
+
+
+If you already have a dataset you want to use, and you're using an IDE like Cursor:
+1. Create a new folder for your dataset in your project.
+2. Drag and drop the existing dataset file into the folder you created in step 1.
+3. From there, you can reference your dataset in your AI assistant chat and request that it use some - or all - of the dataset as inputs to test your agent, either in your [agent creation prompt](/use-cases/pdf-processing-agents#creating-your-agent) or in subsequent prompts for additional testing. For example:
+
+```
+Create a new experiment of anotherai/agent/pdf-extractor and use the content
+of @pdf-extraction-dataset.json as inputs to the experiment.
+```
+
+
+
+Learn more about building and using datasets [here](/use-cases/fundamentals/evaluating#using-datasets-to-evaluate-your-agents).
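+Whichever route you take, a quick sanity check before running experiments can catch malformed entries. A sketch, assuming a simple schema where each case stores its URL under `input.pdf_url` (illustrative, not a required format):
+
+```python
+import json
+from pathlib import Path
+
+# Illustrative dataset; in practice you'd load your real file.
+Path("pdf-extraction-dataset.json").write_text(json.dumps([
+    {"input": {"pdf_url": "https://example.com/documents/subscription-invoice.pdf"}},
+    {"input": {"pdf_url": "https://example.com/documents/receipt.PDF"}},
+]))
+
+cases = json.loads(Path("pdf-extraction-dataset.json").read_text())
+bad = [c for c in cases if not c["input"]["pdf_url"].lower().endswith(".pdf")]
+print(f"{len(cases)} cases, {len(bad)} without a .pdf extension")
+```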
+
+
+
+### Creating your Agent
+
+After preparing your test data, you can move on to building your agent. The easiest way to create a new agent is to ask your preferred AI assistant to build it for you.
+
+**Basic Agent Creation**
+
+Start by describing what your agent should do:
+
+For PDF processing agents, we recommend referencing the test PDFs you want to use as inputs directly in your initial prompt.
+
+```
+Create an AnotherAI agent that extracts key information from PDF documents. Use the content of
+/Developer/anotherai/datasets/pdf-documents/subscription-invoice.pdf as an input to test the agent with an AnotherAI experiment.
+```
+
+**Adding Performance Requirements**
+
+If you have other criteria or constraints for your agent, you can include them in your prompt and your AI assistant will use AnotherAI to help you optimize for them. Learn more about different performance requirements and how to include them in your prompt [here](/use-cases/fundamentals/building#adding-performance-requirements).
+
+**Adding Metadata**
+
+You can also add custom metadata to your agents to help organize and track them. Learn more about adding metadata and how metadata can be used [here](/use-cases/fundamentals/building#adding-metadata).
+
+
+
+### Testing your Agent
+
+As part of creating your agent with AnotherAI, your AI assistant will automatically create an initial experiment to test your agent's performance. Experiments allow you to systematically compare different parameters of your agent across one or more inputs to find the optimal setup for your use case. You can use experiments to:
+- Compare performance across different models (ex. GPT-5, Claude Sonnet 4.5, Gemini 2.0 Flash, etc.)
+- Test multiple prompt variations to find the most effective approach
+- Optimize for specific metrics like cost, speed, and accuracy
+
+If there are additional criteria you want to test, you can always ask your AI assistant to create more experiments. The most common parameters to experiment with are [prompts](/use-cases/fundamentals/experiments#prompts) and [models](/use-cases/fundamentals/experiments#models); however, you can also experiment with changes to [other parameters like temperature](/use-cases/fundamentals/experiments#other-parameters).
+
+Here's an example of a prompt you might use to find a faster model for your agent:
+
+```
+Create an AnotherAI experiment to help me find a faster model for anotherai/agent/pdf-extractor
+that still maintains the same tone and succinct output as my current model.
+```
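+To make the trade-off concrete, the decision an experiment like this supports boils down to picking the fastest model whose quality stays within tolerance of your current baseline. A sketch with purely illustrative numbers:
+
+```python
+# Per-model experiment results (illustrative numbers, not real benchmarks).
+results = {
+    "current-model": {"latency_s": 4.2, "accuracy": 0.93},
+    "candidate-a": {"latency_s": 1.8, "accuracy": 0.92},
+    "candidate-b": {"latency_s": 1.1, "accuracy": 0.84},
+}
+
+baseline = results["current-model"]["accuracy"]
+# Keep models within 2 points of the baseline, then take the fastest.
+acceptable = {m: r for m, r in results.items() if r["accuracy"] >= baseline - 0.02}
+fastest = min(acceptable, key=lambda m: acceptable[m]["latency_s"])
+print(fastest)  # candidate-a
+```
+
+This is exactly the reasoning your AI assistant applies when it summarizes an experiment's results for you.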
+
+Learn more about experiments and see more examples [here](/use-cases/fundamentals/experiments).
+
+
+If you're testing an agent that has a large system prompt and/or very long inputs, you may encounter token limit issues with the `get_experiment` MCP tool that impact Claude Code's ability to provide accurate insights on your agent.
+
+
+
+In this case, you can manually increase Claude Code's output token limit.
+
+**To set up permanently for all terminal sessions:**
+
+For zsh (default on macOS):
+```bash
+echo 'export MAX_MCP_OUTPUT_TOKENS=150000' >> ~/.zshrc && source ~/.zshrc
+```
+
+For bash:
+```bash
+echo 'export MAX_MCP_OUTPUT_TOKENS=150000' >> ~/.bashrc && source ~/.bashrc
+```
+
+**For temporary use in current session only:**
+```bash
+export MAX_MCP_OUTPUT_TOKENS=150000
+```
+
+**Note:** If you forget or don't realize you need to set a higher limit, you can quit your existing session, run the command to increase the limit, and then use `claude --resume` to continue your previous session with the increased limit applied.
+
+You can learn more about tool output limits for Claude Code in their [documentation](https://docs.claude.com/en/docs/claude-code/mcp#mcp-output-limits-and-warnings).
+
+
+
+
+### Adding Feedback to your Experiments
+
+When reviewing the results of experiments, you can add feedback (annotations) to help your AI coding agent understand what is working and what is not. Your AI coding agent can then use this feedback to create additional experiments with improved versions of your agent.
+
+Learn more about how annotations can be used to improve your agent [here](/use-cases/fundamentals/annotations).
+
+
+
+### Debugging PDF Processing Agents
+
+The process of debugging PDF processing agents is the same as debugging text-based agents. Learn more about debugging agents [here](/use-cases/fundamentals/debugging).
+
+
diff --git a/docs/content/docs/use-cases/user-feedback.mdx b/docs/content/docs/use-cases/user-feedback.mdx
index f21ba0bf..ec4668a8 100644
--- a/docs/content/docs/use-cases/user-feedback.mdx
+++ b/docs/content/docs/use-cases/user-feedback.mdx
@@ -129,13 +129,36 @@ Review the user feedback annotations for agent/email-rewriter from the last week
and suggest prompt improvements based on common complaints
```
-Your AI assistant will query the annotations, identify patterns, and propose specific changes to improve user satisfaction.
+Your AI assistant will query the annotations, identify patterns, and give you a report on proposed changes.
+
+
+
+If you agree with the proposed changes, you can ask your AI assistant to create an experiment to test the new version against the old version.

Once you've validated improvements through experiments, you can deploy them instantly without code changes using [deployments](/use-cases/fundamentals/deployments). This allows your team to rapidly iterate on agent improvements based on user feedback - no engineering bottlenecks, no deployment delays.
+
+
+
+
+### Optional: Add feedback cases to your evaluation dataset
+
+If the completion that received user feedback is especially challenging or unique, you may want to add it to your evaluation dataset. This creates a feedback loop where real user experiences continuously improve your agent's reliability, and it ensures future versions of your agent continue to handle that input correctly.
+
+```
+Get completion anotherai/completion/0199a698-56c2-72e7-4ad8-e6bb1699a0a8 and add
+it to email-rewriter-dataset.json as a new test case. Use the completion's
+input variables for the "input" field, the output as "expected_output".
+```
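+Under the hood, the mapping your assistant performs is simple. A sketch, where the completion's field names are illustrative (check the actual shape your MCP tools return):
+
+```python
+def completion_to_case(completion: dict) -> dict:
+    """Map a completion record to a dataset test case.
+    The completion field names here are illustrative, not the exact
+    shape returned by the MCP tools."""
+    return {
+        "input": completion["input"]["variables"],
+        "expected_output": completion["output"],
+    }
+
+# Stand-in for a fetched completion record.
+completion = {
+    "input": {"variables": {"email": "need this fixed asap!!"}},
+    "output": {"rewritten_email": "Could you please address this at your earliest convenience?"},
+}
+case = completion_to_case(completion)
+print(case["input"]["email"])
+```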
+
+**You can also add completions with positive feedback to preserve what works well:**
+- The real-life completions become high-quality test cases that ensure future changes don't break successful patterns
+- Future versions can be evaluated against these proven-good examples
+
+You can learn more about evaluation datasets [here](/use-cases/fundamentals/evaluating).
diff --git a/docs/public/images/cost-optimization-experiment-food-analyzer.png b/docs/public/images/cost-optimization-experiment-food-analyzer.png
new file mode 100644
index 00000000..869ada43
Binary files /dev/null and b/docs/public/images/cost-optimization-experiment-food-analyzer.png differ
diff --git a/docs/public/images/experiment-prompt-enhancement-terminal-food-analyzer.png b/docs/public/images/experiment-prompt-enhancement-terminal-food-analyzer.png
new file mode 100644
index 00000000..f60e8a62
Binary files /dev/null and b/docs/public/images/experiment-prompt-enhancement-terminal-food-analyzer.png differ
diff --git a/docs/public/images/experiment-prompts-comparison-food-analyzer.png b/docs/public/images/experiment-prompts-comparison-food-analyzer.png
new file mode 100644
index 00000000..8773845b
Binary files /dev/null and b/docs/public/images/experiment-prompts-comparison-food-analyzer.png differ
diff --git a/docs/public/images/fast-models-experiment-food-analyzer.png b/docs/public/images/fast-models-experiment-food-analyzer.png
new file mode 100644
index 00000000..f92a37a9
Binary files /dev/null and b/docs/public/images/fast-models-experiment-food-analyzer.png differ
diff --git a/docs/public/images/model-comparison-experiment-food-analyzer.png b/docs/public/images/model-comparison-experiment-food-analyzer.png
new file mode 100644
index 00000000..c5c2a6ef
Binary files /dev/null and b/docs/public/images/model-comparison-experiment-food-analyzer.png differ
diff --git a/docs/public/images/pdf-extraction-fast-models.png b/docs/public/images/pdf-extraction-fast-models.png
new file mode 100644
index 00000000..4bc1f864
Binary files /dev/null and b/docs/public/images/pdf-extraction-fast-models.png differ
diff --git a/docs/public/images/pdf-extraction-low-cost.png b/docs/public/images/pdf-extraction-low-cost.png
new file mode 100644
index 00000000..1e2f34f6
Binary files /dev/null and b/docs/public/images/pdf-extraction-low-cost.png differ
diff --git a/docs/public/images/pdf-extraction-new-model.png b/docs/public/images/pdf-extraction-new-model.png
new file mode 100644
index 00000000..0082eb2c
Binary files /dev/null and b/docs/public/images/pdf-extraction-new-model.png differ
diff --git a/docs/public/images/pdf-extraction-prompt-compare.png b/docs/public/images/pdf-extraction-prompt-compare.png
new file mode 100644
index 00000000..30febaec
Binary files /dev/null and b/docs/public/images/pdf-extraction-prompt-compare.png differ
diff --git a/docs/public/images/pdf-extraction-prompt-enhancement-terminal.png b/docs/public/images/pdf-extraction-prompt-enhancement-terminal.png
new file mode 100644
index 00000000..375e233d
Binary files /dev/null and b/docs/public/images/pdf-extraction-prompt-enhancement-terminal.png differ
diff --git a/docs/public/images/pdf-extraction-temperature-experiment.png b/docs/public/images/pdf-extraction-temperature-experiment.png
new file mode 100644
index 00000000..7538a7b8
Binary files /dev/null and b/docs/public/images/pdf-extraction-temperature-experiment.png differ
diff --git a/docs/public/images/temperature-experiment-food-analyzer.png b/docs/public/images/temperature-experiment-food-analyzer.png
new file mode 100644
index 00000000..6204e3b7
Binary files /dev/null and b/docs/public/images/temperature-experiment-food-analyzer.png differ
diff --git a/docs/public/images/user-feedback-analysis.png b/docs/public/images/user-feedback-analysis.png
new file mode 100644
index 00000000..9bce80ea
Binary files /dev/null and b/docs/public/images/user-feedback-analysis.png differ