| 1 | +# Testing Guide |
| 2 | + |
| 3 | +This document explains how testing is organized across the Codebuff monorepo. For detailed, package-specific instructions, see the README files in each package's `__tests__/` directory. |
| 4 | + |
| 5 | +## Test Types by Project |
| 6 | + |
| 7 | +| Project | Unit | Integration | E2E | |
| 8 | +| ------- | ------------------------------- | ------------------------- | -------------------------------- | |
| 9 | +| **CLI** | Individual functions/components | CLI with mocked backend | Full stack: CLI → SDK → Web → DB | |
| 10 | +| **Web** | React components, API handlers | API routes with mocked DB | Real browser via Playwright | |
| 11 | +| **SDK** | Client functions, parsing | SDK calls to real API | (covered by CLI E2E) | |
| 12 | + |
| 13 | +## What "E2E" Means Here |
| 14 | + |
| 15 | +The term "end-to-end" means different things for different parts of the system: |
| 16 | + |
| 17 | +### CLI E2E (Full-Stack Testing) |
| 18 | + |
| 19 | +**CLI E2E tests are the most comprehensive** - they test the entire user journey: |
| 20 | + |
| 21 | +``` |
| 22 | +User launches terminal |
| 23 | + → Types commands |
| 24 | + → CLI renders UI (via terminal emulator) |
| 25 | + → CLI calls SDK |
| 26 | + → SDK calls Web API |
| 27 | + → API queries Database (real Postgres in Docker) |
| 28 | + → Response flows back through the stack to the terminal |
| 29 | +``` |
| 30 | + |
| 31 | +**Location:** `cli/src/__tests__/e2e/` |
| 32 | + |
| 33 | +**Prerequisites:** |
| 34 | + |
| 35 | +- Docker (for Postgres database) |
| 36 | +- SDK built (`cd sdk && bun run build`) |
| 37 | +- psql available (for database seeding) |
| 38 | + |
| 39 | +### Web E2E (Browser Testing) |
| 40 | + |
| 41 | +**Web E2E tests exercise the browser experience** using Playwright: |
| 42 | + |
| 43 | +``` |
| 44 | +Real browser loads page |
| 45 | + → Renders SSR content |
| 46 | + → Hydrates client-side |
| 47 | + → User interactions trigger API calls (mocked or real) |
| 48 | +``` |
| 49 | + |
| 50 | +**Location:** `web/src/__tests__/e2e/` |
| 51 | + |
| 52 | +**Prerequisites:** |
| 53 | + |
| 54 | +- Playwright installed (`bunx playwright install`) |
| 55 | +- Web server running (auto-started by Playwright) |
| 56 | + |
| 57 | +### SDK Integration (API Testing) |
| 58 | + |
| 59 | +**SDK integration tests verify API connectivity:** |
| 60 | + |
| 61 | +``` |
| 62 | +SDK makes real HTTP calls to the backend |
| 63 | + → Verifies authentication, request/response formats |
| 64 | + → Tests prompt caching, error handling |
| 65 | +``` |
| 66 | + |
| 67 | +**Location:** `sdk/src/__tests__/*.integration.test.ts` |
| 68 | + |
| 69 | +**Prerequisites:** |
| 70 | + |
| 71 | +- Valid `CODEBUFF_API_KEY` environment variable |
| 72 | + |
| 73 | +## Running Tests |
| 74 | + |
| 75 | +### Quick Start |
| 76 | + |
| 77 | +```bash |
| 78 | +# Run all tests in a package |
| 79 | +cd cli && bun test |
| 80 | +cd web && bun test |
| 81 | +cd sdk && bun test |
| 82 | + |
| 83 | +# Run specific test file |
| 84 | +bun test path/to/test.ts |
| 85 | + |
| 86 | +# Run with watch mode |
| 87 | +bun test --watch |
| 88 | +``` |
| 89 | + |
| 90 | +### CLI Tests |
| 91 | + |
| 92 | +```bash |
| 93 | +cd cli |
| 94 | + |
| 95 | +# Unit tests (fast, no dependencies) |
| 96 | +bun test cli-args.test.ts |
| 97 | + |
| 98 | +# UI tests (requires SDK) |
| 99 | +bun test cli-ui.test.ts |
| 100 | + |
| 101 | +# E2E tests (requires Docker + SDK built) |
| 102 | +bun test e2e/ |
| 103 | +``` |
| 104 | + |
| 105 | +### Web Tests |
| 106 | + |
| 107 | +```bash |
| 108 | +cd web |
| 109 | + |
| 110 | +# Unit/integration tests |
| 111 | +bun test |
| 112 | + |
| 113 | +# E2E tests with Playwright |
| 114 | +bunx playwright test |
| 115 | + |
| 116 | +# E2E with UI mode (interactive debugging) |
| 117 | +bunx playwright test --ui |
| 118 | +``` |
| 119 | + |
| 120 | +### SDK Tests |
| 121 | + |
| 122 | +```bash |
| 123 | +cd sdk |
| 124 | + |
| 125 | +# Unit tests |
| 126 | +bun test |
| 127 | + |
| 128 | +# Integration tests (requires API key) |
| 129 | +CODEBUFF_API_KEY=your-key bun test run.integration.test.ts |
| 130 | +``` |
| 131 | + |
| 132 | +## Test File Naming Conventions |
| 133 | + |
| 134 | +| Pattern | Type | Example | |
| 135 | +| ----------------------- | ---------------------- | ------------------------------------- | |
| 136 | +| `*.test.ts` | Unit tests | `cli-args.test.ts` | |
| 137 | +| `*.integration.test.ts` | Integration tests | `run.integration.test.ts` | |
| 138 | +| `integration/*.test.ts` | Integration tests | `integration/api-integration.test.ts` | |
| 139 | +| `e2e/*.test.ts` | E2E tests (Bun) | `e2e/full-stack.test.ts` | |
| 140 | +| `*.spec.ts` | E2E tests (Playwright) | `store-ssr.spec.ts` | |
| 141 | + |
| 142 | +Files matching `*integration*.test.ts` or `*e2e*.test.ts` trigger automatic dependency checking (tmux, SDK build status) in the `.bin/bun` wrapper. |
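
As a rough illustration of the naming conventions above, the sketch below classifies a test file path by its pattern. `classifyTestFile` and `TestKind` are hypothetical names, not part of the Codebuff codebase, and the real `.bin/bun` wrapper may match files differently. The function only mirrors the table.

```typescript
// Hypothetical helper: maps a test file path to the type implied by the
// naming table. Illustrative only; not the wrapper's actual logic.
type TestKind = 'unit' | 'integration' | 'e2e' | 'playwright-e2e' | 'unknown'

function classifyTestFile(path: string): TestKind {
  // Normalize Windows separators so directory checks work on any platform.
  const dirs = path.replace(/\\/g, '/').split('/')
  // After pop(), `dirs` holds only the directory segments.
  const fileName = dirs.pop() ?? ''

  if (/\.spec\.ts$/.test(fileName)) return 'playwright-e2e'
  if (!/\.test\.ts$/.test(fileName)) return 'unknown'
  if (dirs.indexOf('e2e') !== -1) return 'e2e'
  if (dirs.indexOf('integration') !== -1) return 'integration'
  if (/\.integration\.test\.ts$/.test(fileName)) return 'integration'
  return 'unit'
}

// Example paths taken from the table above.
const examplePaths = [
  'cli-args.test.ts',
  'run.integration.test.ts',
  'integration/api-integration.test.ts',
  'e2e/full-stack.test.ts',
  'store-ssr.spec.ts',
]
for (const file of examplePaths) {
  console.log(`${file} -> ${classifyTestFile(file)}`)
}
```

Checking the directory before the file suffix keeps a file such as `e2e/foo.integration.test.ts` in the E2E bucket, which is one possible reading of the table rather than a documented rule.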
| 143 | + |
| 144 | +## Directory Structure |
| 145 | + |
| 146 | +``` |
| 147 | +cli/src/__tests__/ |
| 148 | +├── e2e/ # Full stack: CLI → SDK → Web → DB |
| 149 | +│ ├── README.md # CLI E2E documentation |
| 150 | +│ └── full-stack.test.ts |
| 151 | +├── integration/ # Tests with mocked backend |
| 152 | +├── helpers/ # Test utilities |
| 153 | +├── mocks/ # Mock implementations |
| 154 | +├── cli-ui.test.ts # CLI UI tests (requires SDK) |
| 155 | +├── *.test.ts # Other unit tests |
| 156 | +└── README.md # CLI testing overview |
| 157 | + |
| 158 | +web/src/__tests__/ |
| 159 | +├── e2e/ # Browser tests with Playwright |
| 160 | +│ ├── README.md # Web E2E documentation |
| 161 | +│ └── *.spec.ts |
| 162 | +└── ... |
| 163 | + |
| 164 | +sdk/src/__tests__/ |
| 165 | +├── *.test.ts # Unit tests |
| 166 | +└── *.integration.test.ts # Real API calls |
| 167 | +``` |
| 168 | + |
| 169 | +## Writing Tests |
| 170 | + |
| 171 | +### Best Practices |
| 172 | + |
| 173 | +1. **Use dependency injection** over mocking modules |
| 174 | +2. **Follow naming conventions** for automatic detection |
| 175 | +3. **Clean up resources** in `afterEach`/`afterAll` |
| 176 | +4. **Add graceful skipping** for missing dependencies |
| 177 | +5. **Keep tests focused** - one behavior per test |
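
To make the first and fourth practices more concrete, here is a minimal sketch of passing dependencies in as arguments instead of mocking modules. `resolveApiKey` and `ApiKeyDeps` are hypothetical names invented for this example. Only the `CODEBUFF_API_KEY` variable comes from this guide.

```typescript
// Hypothetical example: the function receives its environment and logger
// as arguments, so a test can pass fakes without mocking any module.
interface ApiKeyDeps {
  env: Record<string, string | undefined>
  log: (message: string) => void
}

// Returns the trimmed API key, or null (after logging) when it is missing.
function resolveApiKey(deps: ApiKeyDeps): string | null {
  const key = (deps.env['CODEBUFF_API_KEY'] ?? '').trim()
  if (key === '') {
    deps.log('CODEBUFF_API_KEY is not set; integration tests will be skipped')
    return null
  }
  return key
}

// In a test, inject plain objects instead of touching the real environment.
const messages: string[] = []
const record = (message: string): void => {
  messages.push(message)
}
const missing = resolveApiKey({ env: {}, log: record })
const present = resolveApiKey({ env: { CODEBUFF_API_KEY: ' abc ' }, log: record })
console.log(missing, present, messages.length)
```

Outside of tests, a caller would typically pass the real environment and `console.log`. Because nothing global is patched, there is nothing to restore in `afterEach`.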
| 178 | + |
| 179 | +### Example: CLI Unit Test |
| 180 | + |
| 181 | +```typescript |
| 182 | +import { describe, test, expect } from 'bun:test' |
| 183 | +// parseArgs is assumed to be imported from the module under test (path omitted here) |
| 184 | +describe('parseArgs', () => { |
| 185 | + test('parses --agent flag', () => { |
| 186 | + const result = parseArgs(['--agent', 'base']) |
| 187 | + expect(result.agent).toBe('base') |
| 188 | + }) |
| 189 | +}) |
| 190 | +``` |
| 191 | + |
| 192 | +### Example: CLI Integration Test |
| 193 | + |
| 194 | +```typescript |
| 195 | +import { describe, test, expect, afterEach, mock } from 'bun:test' |
| 196 | + |
| 197 | +describe('API Integration', () => { |
| 198 | + afterEach(() => { |
| 199 | + mock.restore() |
| 200 | + }) |
| 201 | + |
| 202 | + test('handles 401 responses', async () => { |
| 203 | + // Mock fetch, test error handling |
| 204 | + }) |
| 205 | +}) |
| 206 | +``` |
| 207 | + |
| 208 | +### Example: CLI E2E Test |
| 209 | + |
| 210 | +```typescript |
| 211 | +import { describe, test, expect, beforeAll, afterAll } from 'bun:test' |
| 212 | +import { createE2ETestContext, type E2ETestContext } from './test-cli-utils' // assumes the type is exported from the same module |
| 213 | + |
| 214 | +describe('E2E: Chat', () => { |
| 215 | + let ctx: E2ETestContext |
| 216 | + |
| 217 | + beforeAll(async () => { |
| 218 | + ctx = await createE2ETestContext('chat') |
| 219 | + }, 180000) |
| 220 | + |
| 221 | + afterAll(async () => { |
| 222 | + await ctx?.cleanup() |
| 223 | + }) |
| 224 | + |
| 225 | + test('can type and send message', async () => { |
| 226 | + const session = await ctx.createSession() |
| 227 | + await session.cli.type('hello') |
| 228 | + await session.cli.press('enter') |
| 229 | + // Assert response |
| 230 | + }) |
| 231 | +}) |
| 232 | +``` |
| 233 | + |
| 234 | +## CI/CD |
| 235 | + |
| 236 | +Tests run automatically in CI. Some tests are skipped when prerequisites aren't met: |
| 237 | + |
| 238 | +- **E2E tests** skip if Docker is unavailable or the SDK is not built |
| 239 | +- **Integration tests** skip if tmux is not installed |
| 240 | +- **SDK integration tests** skip if no API key is set |
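
The skip rules above can be read as a small decision function. The sketch below is illustrative only: `skipReason`, `SuiteKind`, and `Prerequisites` are hypothetical names, and how the real test setup detects Docker, tmux, or the SDK build is not shown here.

```typescript
// Hypothetical sketch of the skip rules listed above; all names are illustrative.
type SuiteKind = 'e2e' | 'integration' | 'sdk-integration'

interface Prerequisites {
  dockerAvailable: boolean
  sdkBuilt: boolean
  tmuxInstalled: boolean
  apiKeySet: boolean
}

// Returns a human-readable reason to skip, or null when the suite can run.
function skipReason(kind: SuiteKind, prereqs: Prerequisites): string | null {
  if (kind === 'e2e') {
    if (!prereqs.dockerAvailable) return 'Docker unavailable'
    if (!prereqs.sdkBuilt) return 'SDK not built'
    return null
  }
  if (kind === 'integration') {
    return prereqs.tmuxInstalled ? null : 'tmux not installed'
  }
  return prereqs.apiKeySet ? null : 'no API key'
}

// Example: a machine with tmux and a built SDK, but no Docker and no API key.
const laptop: Prerequisites = {
  dockerAvailable: false,
  sdkBuilt: true,
  tmuxInstalled: true,
  apiKeySet: false,
}
const kinds: SuiteKind[] = ['e2e', 'integration', 'sdk-integration']
for (const kind of kinds) {
  console.log(`${kind}: ${skipReason(kind, laptop) ?? 'runs'}`)
}
```

With `bun:test`, a non-null reason could be fed into `test.skipIf(...)` so the suite reports a skip instead of a failure.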
| 241 | + |
| 242 | +## Troubleshooting |
| 243 | + |
| 244 | +### Tests hanging? |
| 245 | + |
| 246 | +- Check that the tmux session isn't waiting for input |
| 247 | +- Ensure proper cleanup in `finally` blocks |
| 248 | +- Use timeouts for async operations |
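
One common way to apply the timeout advice is to race the operation against a timer. The helper below is a minimal sketch under that assumption. `withTimeout` is a hypothetical name, not an existing utility in this repository.

```typescript
// Hypothetical helper: rejects if `work` does not settle within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => {
      reject(new Error(`${label} timed out after ${ms}ms`))
    }, ms)
  })
  // Whichever promise settles first wins. The timer is cleared on both paths
  // so it cannot keep the test process alive after the work finishes.
  return Promise.race([work, timeout]).then(
    (value) => {
      clearTimeout(timer)
      return value
    },
    (error) => {
      clearTimeout(timer)
      throw error
    },
  )
}

// Usage: a short operation that finishes well inside its one-second budget.
const quickWork = new Promise<string>((resolve) => {
  setTimeout(() => resolve('done'), 10)
})
withTimeout(quickWork, 1000, 'quick work').then((value) => console.log(value))
```

Note that this only stops waiting. It does not cancel the underlying operation, so cleanup of sessions or containers still belongs in `afterEach`/`afterAll` or a `finally` block.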
| 249 | + |
| 250 | +### E2E tests failing? |
| 251 | + |
| 252 | +- Verify Docker is running: `docker info` |
| 253 | +- Rebuild SDK: `cd sdk && bun run build` |
| 254 | +- Clean up orphaned containers: `docker ps -aq --filter "name=${E2E_CONTAINER_NAME:-manicode-e2e}-" | xargs docker rm -f` |
| 255 | + |
| 256 | +### Playwright tests failing? |
| 257 | + |
| 258 | +- Install browsers: `bunx playwright install` |
| 259 | +- Check that the web server is accessible |
| 260 | +- Run with `--debug` for step-by-step execution |
| 261 | + |
| 262 | +## Package-Specific Documentation |
| 263 | + |
| 264 | +- [CLI Testing](cli/src/__tests__/README.md) |
| 265 | +- [CLI E2E Testing](cli/src/__tests__/e2e/README.md) |
| 266 | +- [Web E2E Testing](web/src/__tests__/e2e/README.md) |
| 267 | +- [Evals Framework](evals/README.md) |