23 changes: 18 additions & 5 deletions docs/TESTING.md
@@ -39,6 +39,21 @@ integ-tests/

See [integ-tests/README.md](../integ-tests/README.md) for integration test details.

### E2E Tests

E2E tests live in `e2e-tests/` and verify the full user journey across the AWS boundary — deploy, invoke, status, logs,
traces, and control plane API calls.

```
e2e-tests/
├── e2e-helper.ts # Shared utilities and createE2ESuite() factory
├── strands-bedrock.test.ts
├── langgraph-openai.test.ts
└── ...
```

See [e2e-tests/README.md](../e2e-tests/README.md) for e2e test details.

## Writing Tests

### Imports
@@ -435,14 +450,12 @@ npx playwright install chromium

## Integration Tests

Integration tests require:

- AWS credentials configured
- IAM permissions for CloudFormation operations
- Dedicated test AWS account (recommended)
Integration tests require no AWS credentials. They run the real CLI binary and assert on local files and stdout only.

Run integration tests:

```bash
npm run test:integ
```

See [integ-tests/README.md](../integ-tests/README.md) for full details.
124 changes: 124 additions & 0 deletions e2e-tests/README.md
@@ -0,0 +1,124 @@
# E2E Tests

This directory contains end-to-end tests that verify the full user journey across the AWS boundary. They create, deploy,
invoke, and destroy real AWS resources.

## What E2E Tests Cover

E2E tests verify behaviors that require AWS to confirm they happened:

- **Deployment** — `agentcore deploy` creates a real CloudFormation stack
- **`deployed-state.json`** — after deploy, `agentcore/.cli/deployed-state.json` contains the correct ARNs and IDs for
each deployed resource
- **Live AWS state** — `agentcore status` returns a real resource ARN and `deploymentState: 'deployed'`
- **Live agent behavior** — `agentcore invoke` succeeds against a running agent
- **Observability** — `agentcore logs` returns real CloudWatch entries, `agentcore traces list` returns real trace data
- **Direct control plane API calls** — `pause`, `resume`, and `promote` on AB tests return live execution state from AWS

They do **not** verify config file mutations or CLI input validation. Those belong in `integ-tests/`.

## Prerequisites

- AWS credentials configured (`aws sts get-caller-identity` must succeed)
- `npm`, `git`, and `uv` on PATH
- Sufficient IAM permissions to create/delete CloudFormation stacks
- A dedicated test AWS account (recommended to avoid cost surprises)
- Model-specific API keys set as env vars for non-Bedrock providers (e.g. `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`,
`GOOGLE_API_KEY`)
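
As a rough illustration of the last prerequisite, a suite could check for its provider key before doing any work. This is a minimal sketch, not code from `e2e-helper.ts`: the function name `missingProviderKeys`, the provider labels, and the provider-to-variable mapping are assumptions. Only the three variable names come from the list above.

```typescript
// Hypothetical mapping for illustration. The provider labels are assumed;
// only the env var names appear in this README.
const PROVIDER_ENV_KEYS: Record<string, string[]> = {
  Bedrock: [], // Bedrock uses the AWS credentials, so no API key is listed
  OpenAI: ['OPENAI_API_KEY'],
  Anthropic: ['ANTHROPIC_API_KEY'],
  Gemini: ['GOOGLE_API_KEY'],
};

// Returns the names of required variables that are unset or empty.
// Unknown providers are treated as requiring no keys.
function missingProviderKeys(provider: string, env: Record<string, string | undefined>): string[] {
  const required = PROVIDER_ENV_KEYS[provider] ?? [];
  return required.filter(key => !env[key]);
}
```

Passing `process.env` as the second argument and skipping the suite when the result is non-empty would fit the `it.skipIf(!canRun)` gating used in this directory.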

## Running

```bash
# Run all e2e tests (requires AWS credentials)
npm run test:e2e

# Run a specific file
npx vitest run e2e-tests/strands-bedrock.test.ts
```

E2E tests are not run automatically on every PR. They run on a schedule and before releases.

## Writing E2E Tests

Most framework/model combination tests are a single call to `createE2ESuite()`:

```typescript
import { createE2ESuite } from './e2e-helper.js';

createE2ESuite({
  framework: 'Strands',
  modelProvider: 'Bedrock',
});
```

`createE2ESuite()` generates the full lifecycle suite: `create → deploy → invoke → status → logs → traces → destroy`.

For feature-specific lifecycle tests (AB tests, evals, config bundles), write the suite directly using helpers from
`e2e-helper.ts`:

```typescript
import { baseCanRun, hasAws, runAgentCoreCLI, teardownE2EProject, writeAwsTargets } from './e2e-helper.js';
import { afterAll, beforeAll, describe, expect, it } from 'vitest';

const canRun = baseCanRun && hasAws;

describe.sequential('e2e: my feature lifecycle', () => {
  let projectPath: string;
  const agentName = `E2eMyFeat${String(Date.now()).slice(-8)}`;

  beforeAll(async () => {
    if (!canRun) return;
    // create project, write AWS targets
    await writeAwsTargets(projectPath);
  }, 300000);

  // Always destroy AWS resources — never skip this
  afterAll(async () => {
    if (projectPath && hasAws) {
      await teardownE2EProject(projectPath, agentName, 'Bedrock');
    }
  }, 600000);

  it.skipIf(!canRun)(
    'deploys to AWS successfully',
    async () => {
      const result = await runAgentCoreCLI(['deploy', '--yes', '--json'], projectPath);
      expect(result.exitCode).toBe(0);
      expect(JSON.parse(result.stdout).success).toBe(true);
    },
    600000
  );
});
```

### Key patterns

| Pattern | Why |
| ----------------------------------------- | --------------------------------------------------------------------- |
| `describe.sequential` | Tests depend on each other — deploy must succeed before invoke |
| `it.skipIf(!canRun)` | Gracefully skips when credentials or prerequisites are missing |
| `afterAll(() => teardownE2EProject(...))` | Always destroy AWS resources to avoid cost and leakage |
| `retry(fn, 3, 15000)` | AWS operations are eventually consistent — retries handle cold starts |
| `hasAwsCredentials()` | Gate the entire suite — skip all if no credentials |
| Long timeouts (600000ms) | CloudFormation deploys take minutes, not seconds |
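
The `retry(fn, 3, 15000)` row can be read as "try up to 3 times, waiting 15 seconds between attempts". The sketch below shows one way such a helper might look. It is an assumption for illustration, not the implementation in `e2e-helper.ts`: the parameter names `attempts` and `delayMs` and the fixed-delay strategy are guesses based on the call shape in the table.

```typescript
// Hypothetical retry helper with a fixed delay between attempts.
// The signature is inferred from `retry(fn, 3, 15000)`; the real helper may differ.
async function retry<T>(fn: () => Promise<T>, attempts: number, delayMs: number): Promise<T> {
  if (attempts < 1) {
    throw new RangeError('attempts must be at least 1');
  }
  let lastError: unknown;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      // Return as soon as one attempt succeeds.
      return await fn();
    } catch (error) {
      lastError = error;
      // Wait only if another attempt will follow.
      if (attempt < attempts) {
        await new Promise<void>(resolve => setTimeout(resolve, delayMs));
      }
    }
  }
  // Every attempt failed: surface the most recent error.
  throw lastError;
}
```

With a helper of this shape, a flaky check such as a log query right after deploy could be wrapped as `await retry(() => check(), 3, 15000)`, where `check` is whatever assertion needs AWS to catch up.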

### File naming

Framework/model combination tests: `{framework}-{model}.test.ts`

- `strands-bedrock.test.ts`
- `langgraph-openai.test.ts`

Feature lifecycle tests: describe what the test exercises end-to-end

- `ab-test-target-based.test.ts`
- `dev-lifecycle.test.ts`
- `evals-lifecycle.test.ts`

## Important Notes

- E2E tests create real AWS resources and **will incur costs**
- Always include `teardownE2EProject()` in `afterAll` — never skip cleanup
- Use unique agent names (timestamp suffix) to avoid conflicts with parallel runs
- Stale credential providers older than 30 minutes are cleaned up automatically in `beforeAll` via
`cleanupStaleCredentialProviders()`
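
The timestamp-suffix naming can be factored into a small helper. This is a sketch under assumptions: the name `uniqueAgentName` is hypothetical and does not exist in `e2e-helper.ts`; it only reuses the `String(Date.now()).slice(-8)` expression from the sample suite above.

```typescript
// Hypothetical helper: appends the last 8 digits of a millisecond timestamp to a prefix.
// `now` is injectable so the result is deterministic in tests.
function uniqueAgentName(prefix: string, now: number = Date.now()): string {
  return `${prefix}${String(now).slice(-8)}`;
}
```

Because only the last 8 digits of the millisecond timestamp are kept, the suffix repeats roughly every 27.8 hours (10^8 ms), and two calls within the same millisecond produce the same name. That is usually enough to separate parallel runs, but it is not a guarantee of global uniqueness.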
113 changes: 77 additions & 36 deletions integ-tests/README.md
@@ -1,64 +1,105 @@
# Integration Tests

This directory contains real AWS integration tests that actually deploy resources.
This directory contains integration tests that run the real CLI binary and assert on what it produces locally — no AWS
credentials, no network access, no deployed resources.

## What Integration Tests Cover

Integration tests verify that CLI commands behave correctly by checking:

- **Exit code and stdout** — the command exits `0` on success, non-zero on failure, and `--json` output has the correct
shape
- **`agentcore/agentcore.json`** — the project config was mutated correctly after `add`, `remove`, or `create` commands
- **Scaffolded files** — `app/{agent}/pyproject.toml` contains the right framework dependencies, `app/{agent}/main.py`
exists, `.git/` was initialized
- **Validation behavior** — the CLI rejects invalid input with the right error message before making any network call

They do **not** verify deployments, live AWS state, or agent invocation. Those belong in `e2e-tests/`.

## Prerequisites

- AWS credentials configured
- Sufficient IAM permissions to create/delete CloudFormation stacks
- A dedicated test AWS account (recommended)
- `npm` and `git` on PATH (some tests skip automatically if missing via `describe.skipIf`)
- `uv` on PATH (required for tests that scaffold Python agents)
- No AWS credentials needed

## Running Integration Tests
## Running

```bash
# Run all integration tests
npm run test:integ

# Run a specific test
npx vitest run integ-tests/deploy.test.ts --testTimeout=300000
# Run a specific file
npx vitest run integ-tests/add-remove-gateway.test.ts
```

## Test Naming Convention
## Writing Integration Tests

All integration test files should be prefixed with `integ.`:
```typescript
import { createTestProject, readProjectConfig, runCLI } from '../src/test-utils/index.js';
import type { TestProject } from '../src/test-utils/index.js';
import { afterAll, beforeAll, describe, expect, it } from 'vitest';

- `integ.deploy.ts` - Tests actual deployment
- `integ.invoke.ts` - Tests invoking deployed agents
- `integ.destroy.ts` - Tests stack destruction
- `integ.e2e.ts` - Full end-to-end lifecycle test
describe('integration: add and remove a gateway', () => {
let project: TestProject;

## CI/CD
beforeAll(async () => {
project = await createTestProject({ noAgent: true });
});

Integration tests are NOT run automatically on every PR. They can be triggered:
afterAll(async () => {
await project.cleanup();
});

1. Manually via GitHub Actions workflow_dispatch
2. On a schedule (if configured)
3. Before releases
it('adds a gateway', async () => {
const result = await runCLI(['add', 'gateway', '--name', 'MyGateway', '--json'], project.projectPath);

## Writing Integration Tests
expect(result.exitCode).toBe(0);
expect(JSON.parse(result.stdout).success).toBe(true);

```typescript
import { runCLI } from '../src/test-utils/cli-runner';
import { afterAll, describe, expect, it } from 'vitest';
const config = await readProjectConfig(project.projectPath);
const gateway = config.agentCoreGateways?.find(g => g.name === 'MyGateway');
expect(gateway).toBeTruthy();
});

describe('integ: deploy', () => {
// Use unique stack names to avoid conflicts
const stackName = `test-${Date.now()}`;
it('removes the gateway', async () => {
const result = await runCLI(['remove', 'gateway', '--name', 'MyGateway', '--json'], project.projectPath);

afterAll(async () => {
// ALWAYS clean up - destroy the stack
await runCLI(['destroy', '--target', stackName, '--force'], projectDir);
});
expect(result.exitCode).toBe(0);

it('deploys successfully', async () => {
// Test implementation
const config = await readProjectConfig(project.projectPath);
expect(config.agentCoreGateways?.find(g => g.name === 'MyGateway')).toBeFalsy();
});
});
```

## Important Notes
### Key patterns

| Pattern | Why |
| ----------------------------------- | ------------------------------------------------------- |
| `createTestProject()` | Fast temp project setup — no npm/uv install |
| `runCLI([...args], projectPath)` | Runs the real built CLI binary, not a mock |
| `readProjectConfig(path)` | Reads and parses `agentcore/agentcore.json` |
| `afterAll(() => project.cleanup())` | Always delete the temp directory |
| `--json` flag | Makes stdout machine-readable for assertions |
| Assert exit code first | Fail fast with a useful message before asserting output |
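
The "assert exit code first" and `--json` rows can be combined in one small helper. The following is a minimal sketch, not part of `src/test-utils`: the name `parseJsonOutput` is hypothetical, and the `stderr` field on the result is an assumption (the examples in this README only show `exitCode` and `stdout`).

```typescript
// Result shape assumed from the examples in this README; `stderr` is an assumed extra field.
interface CliResult {
  exitCode: number;
  stdout: string;
  stderr?: string;
}

// Checks the exit code before parsing, so a failed command reports its own
// output instead of a confusing JSON parse error.
function parseJsonOutput<T = unknown>(result: CliResult): T {
  if (result.exitCode !== 0) {
    throw new Error(`CLI exited with code ${result.exitCode}: ${result.stderr || result.stdout}`);
  }
  try {
    return JSON.parse(result.stdout) as T;
  } catch {
    throw new Error(`CLI stdout was not valid JSON: ${result.stdout}`);
  }
}
```

A test could then write `const output = parseJsonOutput<{ success: boolean }>(result)` and assert on `output.success`, keeping the failure message close to the actual cause.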

### File naming

Name files after the feature area, not the command:

- `add-remove-gateway.test.ts` — not `add.test.ts`
- `create-frameworks.test.ts` — not `create.test.ts`
- `lifecycle-config.test.ts` — not `flags.test.ts`

- Integration tests create real AWS resources and may incur costs
- Always include cleanup in `after()` hooks
- Use unique names to avoid conflicts with parallel runs
- Set appropriate timeouts (5-15 minutes for deploy operations)
### No mocking

Integration tests contain zero mocks. The CLI commands tested here make no network calls, so there is nothing to
intercept. The real binary runs against the real filesystem.

## CI/CD

Integration tests are not run automatically on every PR. They can be triggered:

1. Manually via GitHub Actions `workflow_dispatch`
2. On a schedule (if configured)
3. Before releases