Closed
27 commits
2af93f3
fix(e2e): fix E2E test infrastructure for CLI tests
brandonkachen Dec 5, 2025
f6a969e
test(e2e): add comprehensive E2E test coverage for CLI
brandonkachen Dec 5, 2025
46cc212
fix(e2e): use host pg_isready/psql for DB readiness check
brandonkachen Dec 5, 2025
6e791c0
fix(ci): exclude e2e tests from CI test matrix
brandonkachen Dec 5, 2025
0794d3f
Polish CLI exit flow, input handling, and logging
brandonkachen Dec 5, 2025
701db65
Guard agent and SDK integrations when API key is missing
brandonkachen Dec 5, 2025
5acf02f
Add wrap-ansi types and clean up markdown renderer typing
brandonkachen Dec 5, 2025
c6d8789
Merge origin/main
brandonkachen Dec 5, 2025
61b483b
Merge remote-tracking branch 'origin/main' into tuistory-test
brandonkachen Dec 5, 2025
9f0cf96
Stabilize CLI exit handling and E2E expectations
brandonkachen Dec 5, 2025
19a3ed6
docs: add TESTING.md and reorganize CLI e2e tests
brandonkachen Dec 5, 2025
fed44dd
refactor(tests): remove flawed getApiKeyOrSkip pattern
brandonkachen Dec 5, 2025
3e1c475
refactor(tests): fail fast when SDK/Docker prerequisites are missing
brandonkachen Dec 6, 2025
dc2eb81
refactor(tests): remove skipIf patterns and reorganize CLI tests
brandonkachen Dec 6, 2025
1ccfd83
refactor(cli): consolidate analytics flush handling and improve logger
brandonkachen Dec 6, 2025
93dc087
Merge remote-tracking branch 'origin/main' into tuistory-test
brandonkachen Dec 6, 2025
4f29807
Increase tuistory typing delay for CLI e2e stability
brandonkachen Dec 6, 2025
32b9c38
Enforce API key and backend check in SDK e2e tests
brandonkachen Dec 6, 2025
054fdc2
Stabilize SDK backend-dependent e2e tests
brandonkachen Dec 6, 2025
ac1a850
Adjust prompt caching integration timeout
brandonkachen Dec 8, 2025
d53f7bf
Merge origin/main into tuistory-test
brandonkachen Dec 8, 2025
1b60433
Enable sdk knowledge/project e2e without skip gating
brandonkachen Dec 8, 2025
b4480b5
buffbench: accept evalDataPaths and fix examples
brandonkachen Dec 8, 2025
c134bad
Harden CLI/sdk e2e tests
brandonkachen Dec 8, 2025
2c98be4
Merge origin/main into tuistory-test
brandonkachen Dec 8, 2025
24aacee
Align agent validation tests with structured_output rules
brandonkachen Dec 8, 2025
754a24c
Align sdk agent validation test with structured_output rules
brandonkachen Dec 8, 2025
3 changes: 2 additions & 1 deletion .github/workflows/ci.yml
@@ -164,7 +164,8 @@ jobs:
 elif [ "${{ matrix.package }}" = "web" ]; then
 bun run test --runInBand
 else
-find src -name '*.test.ts' ! -name '*.integration.test.ts' | sort | xargs -I {} bun test {}
+# Exclude integration tests and e2e tests (e2e tests require Docker)
+find src -name '*.test.ts' ! -name '*.integration.test.ts' ! -path '*e2e*' | sort | xargs -I {} bun test {}
 fi

# - name: Open interactive debug shell
11 changes: 3 additions & 8 deletions CONTRIBUTING.md
@@ -79,12 +79,14 @@ Before you begin, you'll need to install a few tools:
8. **Start development services**:

**Option A: All-in-one (recommended)**

```bash
bun run dev
# Starts the web server, builds the SDK, and launches the CLI automatically
```

**Option B: Separate terminals (for more control)**

```bash
# Terminal 1 - Web server (start first)
bun run start-web
@@ -223,14 +225,7 @@ wsl --install
 sudo apt-get install tmux
 ```

-Run the proof-of-concept to validate your setup:
-
-```bash
-cd cli
-bun run test:tmux-poc
-```
-
-See [cli/src/__tests__/README.md](cli/src/__tests__/README.md) for comprehensive interactive testing documentation.
+See [cli/src/\_\_tests\_\_/README.md](cli/src/__tests__/README.md) for comprehensive testing documentation.

### Commit Messages

267 changes: 267 additions & 0 deletions TESTING.md
@@ -0,0 +1,267 @@
# Testing Guide

This document explains how testing is organized across the Codebuff monorepo. For detailed, package-specific instructions, see the README files in each package's `__tests__/` directory.

## Test Types by Project

| Project | Unit | Integration | E2E |
| ------- | ------------------------------- | ------------------------- | -------------------------------- |
| **CLI** | Individual functions/components | CLI with mocked backend | Full stack: CLI → SDK → Web → DB |
| **Web** | React components, API handlers | API routes with mocked DB | Real browser via Playwright |
| **SDK** | Client functions, parsing | SDK calls to real API | (covered by CLI E2E) |

## What "E2E" Means Here

The term "end-to-end" means different things for different parts of the system:

### CLI E2E (Full-Stack Testing)

**CLI E2E tests are the most comprehensive** - they test the entire user journey:

```
User launches terminal
→ Types commands
→ CLI renders UI (via terminal emulator)
→ CLI calls SDK
→ SDK calls Web API
→ API queries Database (real Postgres in Docker)
→ Response flows back through the stack to the terminal
```

**Location:** `cli/src/__tests__/e2e/`

**Prerequisites:**

- Docker (for Postgres database)
- SDK built (`cd sdk && bun run build`)
- psql available (for database seeding)
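
The prerequisites above can be checked before any test starts, so a missing tool produces one clear message rather than a confusing failure partway through setup. The following is a minimal sketch under stated assumptions: `findMissingPrerequisites`, `formatMissing`, and the `Prerequisite` type are hypothetical names, not part of the repository, and the SDK build check is left out. The availability probe is injected so the logic can be exercised without Docker installed.

```typescript
// Hypothetical helper, not part of the Codebuff repo: reports which CLI E2E
// prerequisites are missing. `isAvailable` is injected so the caller decides
// how to probe the machine (for example, by looking the command up on PATH).
type Prerequisite = { name: string; hint: string }

const CLI_E2E_PREREQUISITES: Prerequisite[] = [
  { name: 'docker', hint: 'Install and start Docker (used for Postgres)' },
  { name: 'psql', hint: 'Install the Postgres client (used for database seeding)' },
]

function findMissingPrerequisites(
  isAvailable: (name: string) => boolean,
  required: Prerequisite[] = CLI_E2E_PREREQUISITES,
): Prerequisite[] {
  // Keep only the entries whose command could not be found.
  return required.filter((p) => !isAvailable(p.name))
}

function formatMissing(missing: Prerequisite[]): string {
  // One bullet per missing tool, suitable for an error message.
  return missing.map((p) => `- ${p.name}: ${p.hint}`).join('\n')
}
```

A setup hook could call `findMissingPrerequisites` with a real probe and throw an error containing `formatMissing(...)` when the result is non-empty.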

### Web E2E (Browser Testing)

**Web E2E tests exercise the browser experience** using Playwright:

```
Real browser loads page
→ Renders SSR content
→ Hydrates client-side
→ User interactions trigger API calls (mocked or real)
```

**Location:** `web/src/__tests__/e2e/`

**Prerequisites:**

- Playwright installed (`bunx playwright install`)
- Web server running (auto-started by Playwright)

### SDK Integration (API Testing)

**SDK integration tests verify API connectivity:**

```
SDK makes real HTTP calls to the backend
→ Verifies authentication, request/response formats
→ Tests prompt caching, error handling
```

**Location:** `sdk/src/__tests__/*.integration.test.ts`

**Prerequisites:**

- Valid `CODEBUFF_API_KEY` environment variable

## Running Tests

### Quick Start

```bash
# Run all tests in a package
cd cli && bun test
cd web && bun test
cd sdk && bun test

# Run specific test file
bun test path/to/test.ts

# Run with watch mode
bun test --watch
```

### CLI Tests

```bash
cd cli

# Unit tests (fast, no dependencies)
bun test cli-args.test.ts

# UI tests (requires SDK)
bun test cli-ui.test.ts

# E2E tests (requires Docker + SDK built)
bun test e2e/
```

### Web Tests

```bash
cd web

# Unit/integration tests
bun test

# E2E tests with Playwright
bunx playwright test

# E2E with UI mode (interactive debugging)
bunx playwright test --ui
```

### SDK Tests

```bash
cd sdk

# Unit tests
bun test

# Integration tests (requires API key)
CODEBUFF_API_KEY=your-key bun test run.integration.test.ts
```

## Test File Naming Conventions

| Pattern | Type | Example |
| ----------------------- | ---------------------- | ------------------------------------- |
| `*.test.ts` | Unit tests | `cli-args.test.ts` |
| `*.integration.test.ts` | Integration tests | `run.integration.test.ts` |
| `integration/*.test.ts` | Integration tests | `integration/api-integration.test.ts` |
| `e2e/*.test.ts` | E2E tests (Bun) | `e2e/full-stack.test.ts` |
| `*.spec.ts` | E2E tests (Playwright) | `store-ssr.spec.ts` |

Files matching `*integration*.test.ts` or `*e2e*.test.ts` trigger automatic dependency checking (tmux, SDK build status) in the `.bin/bun` wrapper.
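
To make the table concrete, here is a minimal sketch of how a path could be mapped to a test type. `classifyTestFile` and `TestKind` are hypothetical names introduced for illustration; the matching rules of the actual `.bin/bun` wrapper are not shown in this document and may differ.

```typescript
// Hypothetical classifier mirroring the naming table above. It is a sketch,
// not the logic used by the `.bin/bun` wrapper.
type TestKind = 'unit' | 'integration' | 'e2e-bun' | 'e2e-playwright' | 'not-a-test'

function classifyTestFile(path: string): TestKind {
  const segments = path.split('/')
  const file = segments[segments.length - 1] ?? ''
  const dirs = segments.slice(0, -1)

  if (file.endsWith('.spec.ts')) return 'e2e-playwright'
  if (!file.endsWith('.test.ts')) return 'not-a-test'
  // Check the more specific patterns first: every integration and E2E file
  // also matches the plain `*.test.ts` pattern.
  if (dirs.includes('e2e')) return 'e2e-bun'
  if (file.endsWith('.integration.test.ts') || dirs.includes('integration')) {
    return 'integration'
  }
  return 'unit'
}
```

The order of the checks matters: testing for `*.test.ts` first would label every integration and E2E file as a unit test.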

## Directory Structure

```
cli/src/__tests__/
├── e2e/ # Full stack: CLI → SDK → Web → DB
│ ├── README.md # CLI E2E documentation
│ └── full-stack.test.ts
├── integration/ # Tests with mocked backend
├── helpers/ # Test utilities
├── mocks/ # Mock implementations
├── cli-ui.test.ts # CLI UI tests (requires SDK)
├── *.test.ts # Other unit tests
└── README.md # CLI testing overview

web/src/__tests__/
├── e2e/ # Browser tests with Playwright
│ ├── README.md # Web E2E documentation
│ └── *.spec.ts
└── ...

sdk/src/__tests__/
├── *.test.ts # Unit tests
└── *.integration.test.ts # Real API calls
```

## Writing Tests

### Best Practices

1. **Use dependency injection** over mocking modules
2. **Follow naming conventions** for automatic detection
3. **Clean up resources** in `afterEach`/`afterAll`
4. **Add graceful skipping** for missing dependencies
5. **Keep tests focused** - one behavior per test
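
As an illustration of the first practice, the sketch below passes the HTTP function in as a parameter instead of mocking a module. `fetchStatus`, `FetchLike`, and `stub401` are hypothetical names chosen for this example and are not Codebuff APIs.

```typescript
// Hypothetical example of dependency injection. The HTTP function is a
// parameter, so a test can pass a stub instead of replacing the global
// `fetch` or mocking a module.
type FetchLike = (url: string) => Promise<{ status: number }>

async function fetchStatus(
  url: string,
  fetchFn: FetchLike,
): Promise<'ok' | 'unauthorized' | 'error'> {
  const res = await fetchFn(url)
  if (res.status === 401) return 'unauthorized'
  return res.status >= 200 && res.status < 300 ? 'ok' : 'error'
}

// In a test, inject a stub that returns the status under test.
const stub401: FetchLike = async () => ({ status: 401 })
```

Production code would pass `(url) => fetch(url)`, while a test passes `stub401` and asserts on the result. Nothing global is patched, so there is nothing to restore afterwards.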

### Example: CLI Unit Test

```typescript
import { describe, test, expect } from 'bun:test'
// `parseArgs` is the function under test; import it from the module that defines it.

describe('parseArgs', () => {
  test('parses --agent flag', () => {
    const result = parseArgs(['--agent', 'base'])
    expect(result.agent).toBe('base')
  })
})
```

### Example: CLI Integration Test

```typescript
import { describe, test, expect, afterEach, mock } from 'bun:test'

describe('API Integration', () => {
  afterEach(() => {
    mock.restore()
  })

  test('handles 401 responses', async () => {
    // Mock fetch, test error handling
  })
})
```

### Example: CLI E2E Test

```typescript
import { describe, test, expect, beforeAll, afterAll } from 'bun:test'
import { createE2ETestContext } from './test-cli-utils'

// Derive the context type from the factory so the example type-checks
// without assuming which types './test-cli-utils' exports.
type E2ETestContext = Awaited<ReturnType<typeof createE2ETestContext>>

describe('E2E: Chat', () => {
  let ctx: E2ETestContext

  beforeAll(async () => {
    ctx = await createE2ETestContext('chat')
  }, 180000) // 3-minute setup timeout

  afterAll(async () => {
    await ctx?.cleanup()
  })

  test('can type and send message', async () => {
    const session = await ctx.createSession()
    await session.cli.type('hello')
    await session.cli.press('enter')
    // Assert response
  })
})
```

## CI/CD

Tests run automatically in CI. Some tests are skipped when prerequisites aren't met:

- **E2E tests** skip if Docker is unavailable or the SDK is not built
- **Integration tests** skip if tmux is not installed
- **SDK integration tests** skip if no API key is set

## Troubleshooting

### Tests hanging?

- Check that the tmux session isn't waiting for input
- Ensure proper cleanup in `finally` blocks
- Use timeouts for async operations
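
One way to apply the last point is to wrap any promise that might never settle. The `withTimeout` helper below is a hypothetical sketch, not a function from the repository.

```typescript
// Hypothetical helper, not part of the repo: rejects if `promise` has not
// settled within `ms` milliseconds, so a hung operation fails the test
// instead of stalling the whole run.
function withTimeout<T>(promise: Promise<T>, ms: number, label: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms}ms`)),
      ms,
    )
    // Clear the timer on either outcome so it cannot keep the process alive.
    promise.then(
      (value) => {
        clearTimeout(timer)
        resolve(value)
      },
      (error) => {
        clearTimeout(timer)
        reject(error)
      },
    )
  })
}
```

The `label` argument ends up in the error message, which makes it easier to see which step hung when several operations are wrapped in the same test.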

### E2E tests failing?

- Verify Docker is running: `docker info`
- Rebuild SDK: `cd sdk && bun run build`
- Clean up orphaned containers: `docker ps -aq --filter "name=${E2E_CONTAINER_NAME:-manicode-e2e}-" | xargs docker rm -f`

### Playwright tests failing?

- Install browsers: `bunx playwright install`
- Check web server is accessible
- Run with `--debug` for step-by-step execution

## Package-Specific Documentation

- [CLI Testing](cli/src/__tests__/README.md)
- [CLI E2E Testing](cli/src/__tests__/e2e/README.md)
- [Web E2E Testing](web/src/__tests__/e2e/README.md)
- [Evals Framework](evals/README.md)