Open
🦋 Changeset detected
Latest commit: 647e587
The changes in this PR will be included in the next version bump.
Owner
Author
@claude Remove the BACKEND_ONLY_CONSTANT import that's causing the error in CI
Claude finished @mattpocock's task. Task completed.
Fixed the CI error: The error was caused by imports from
The changes have been committed and pushed to the
This was referenced Oct 21, 2025
c69a19d to 4ae1080
This was referenced Nov 9, 2025
- Remove unused DB_LOCATION import from test-utils.ts
- Replace FILES_LOCATION import with local constant in files.test.ts

Co-authored-by: Matt Pocock <mattpocock@users.noreply.github.com>
- Add dotenv as a dependency
- Create env-setup-file module that imports dotenv/config
- Export env-setup-file as 'evalite/env-setup-file'
- Automatically prepend env-setup-file to setupFiles array
- Update documentation to reflect automatic .env loading
- Update example config to remove manual dotenv setup

Fixes #234

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-authored-by: Matt Pocock <mattpocock@users.noreply.github.com>
… precedence

- Add loadVitestSetupFiles() to load setupFiles from vitest.config.ts
- Merge setupFiles from both configs with evalite.config.ts taking precedence
- Add tests for vitest.config.ts setupFiles support and precedence
- setupFiles execution order: env-setup-file -> vitest -> evalite

Co-authored-by: Matt Pocock <mattpocock@users.noreply.github.com>
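The execution order described in this commit could be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `mergeSetupFilesSketch` and its parameters are hypothetical names, and reading "precedence" as "evalite files run last, so they can override earlier setup" is an assumption.

```typescript
// Hypothetical sketch; names and signature are assumptions, not evalite's actual code.
const ENV_SETUP_FILE = "evalite/env-setup-file";

function mergeSetupFilesSketch(
  vitestSetupFiles: string[] = [],
  evaliteSetupFiles: string[] = [],
): string[] {
  // Order from the commit message: env-setup-file -> vitest -> evalite.
  // Files from evalite.config.ts come last, so they run after (and can
  // override) anything set up by the vitest.config.ts files.
  return [ENV_SETUP_FILE, ...vitestSetupFiles, ...evaliteSetupFiles];
}
```

Calling `mergeSetupFilesSketch(["./vitest.setup.ts"], ["./evalite.setup.ts"])` places the env loader first, so `.env` values would already be loaded when the later files run.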
…gorization guidelines; create OUT_OF_SCOPE.md for triage decisions.
…ions, data, observability, experimentation, display, and relationships
…eps for prioritization
…rompts, and remove review prompt
The exported static UI requires a static file server due to absolute asset paths, ES module CORS restrictions, and fetch calls that fail from file://. Updated docs to clarify a static file server is needed.

Files changed:
- apps/evalite-docs/src/content/docs/tips/run-evals-on-ci-cd.mdx
- apps/evalite-docs/src/content/docs/api/cli.mdx

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ot be pushed to remote before task completion
Instead of throwing on unsupported content types (reasoning, file, etc.), handlePromptContent now returns null and processPromptForTracing filters out nulls. This fixes crashes when using AI SDK with thinking models that include reasoning parts in assistant messages.

Key decisions:
- Return null + filter instead of silently converting to avoid data loss
- Generic fix covers all unsupported types, not just reasoning

Files changed:
- packages/evalite/src/ai-sdk.ts (handlePromptContent + processPromptForTracing)
- packages/evalite-tests/tests/ai-sdk-reasoning.test.ts (new test)
- packages/evalite-tests/tests/fixtures/ai-sdk-reasoning/reasoning.eval.ts (new fixture)
- .changeset/0000-handle-reasoning-content.md

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
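The "return null + filter" pattern from this commit might look something like the sketch below. The part shapes, the `Sketch`-suffixed function names, and the traced output type are all assumptions made for illustration. The real handlePromptContent and processPromptForTracing in packages/evalite/src/ai-sdk.ts may differ.

```typescript
// Hypothetical content-part shapes; the real AI SDK types are richer.
type PromptPartSketch =
  | { type: "text"; text: string }
  | { type: "reasoning"; text: string }
  | { type: "file"; data: string };

type TracedPartSketch = { type: "text"; text: string };

// Returns null for any unsupported part type instead of throwing.
function handlePromptContentSketch(
  part: PromptPartSketch,
): TracedPartSketch | null {
  if (part.type === "text") {
    return { type: "text", text: part.text };
  }
  return null; // reasoning, file, and any other unsupported type
}

// Maps every part, then drops the nulls with a type-guard filter.
function processPromptForTracingSketch(
  parts: PromptPartSketch[],
): TracedPartSketch[] {
  return parts
    .map(handlePromptContentSketch)
    .filter((p): p is TracedPartSketch => p !== null);
}
```

Because the fallthrough branch handles every non-text type, a new unsupported part kind would be skipped rather than crash tracing, which matches the "generic fix" decision above.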
… - Issue #386

Added @typescript/native-preview (tsgo v7.0) to root dependencies and updated all typecheck scripts to use tsgo instead of tsc. Build scripts still use tsc for emit.

Key decisions:
- tsgo used for typecheck only; tsc retained for build/emit in packages/evalite
- Removed deprecated baseUrl option from evalite-ui tsconfig (tsgo 7.0 requires paths-only)
- Used caret range for @typescript/native-preview to track latest 7.x dev releases

Files changed:
- package.json: added @typescript/native-preview dependency
- pnpm-lock.yaml: updated lockfile
- packages/evalite/package.json: typecheck script tsc -> tsgo
- packages/evalite-tests/package.json: typecheck script tsc --noEmit -> tsgo --noEmit
- apps/evalite-ui/package.json: typecheck script tsc --noEmit -> tsgo --noEmit
- apps/evalite-ui/tsconfig.json: removed deprecated baseUrl option

All typecheck and tests pass.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
PRD: Issue #370 - evalite export does not fail if threshold is not reached

Key decisions:
- Added scoreThreshold parameter to exportCommand
- Threshold check runs after export completes (works for both auto-run and pre-existing data)
- Passes scoreThreshold to runEvalite during auto-run for console output
- Reuses same threshold semantics as run command (0-100 scale, exit code 1 if below)

Files changed:
- packages/evalite/src/export-static.ts: Added scoreThreshold param and post-export threshold check
- packages/evalite/src/command.ts: Added --threshold CLI flag to export command
- packages/evalite-tests/tests/export-static.test.ts: Added 3 tests for threshold behavior
- .changeset/0000-export-threshold.md: Added changeset

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
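The post-export threshold check could be sketched as below. This is only an illustration of the stated semantics (0-100 scale, exit code 1 if below). `exitCodeForThresholdSketch` is a hypothetical helper, and the assumption that the average score arrives as a fraction between 0 and 1 is not confirmed by the commit message.

```typescript
// Hypothetical helper; not the actual code in export-static.ts.
// Assumption: averageScore is a fraction in [0, 1]; scoreThreshold is on a 0-100 scale.
function exitCodeForThresholdSketch(
  averageScore: number,
  scoreThreshold?: number,
): 0 | 1 {
  // No threshold configured: the export never fails on score.
  if (scoreThreshold === undefined) {
    return 0;
  }
  // Convert the fraction to a percentage and compare against the threshold.
  return averageScore * 100 < scoreThreshold ? 1 : 0;
}
```

Running the check after the export finishes means it would apply equally whether the data came from an auto-run or from a pre-existing results store, as the commit describes.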
…de-code/20260327-102352
…le/claude-code/20260327-102352
…claude-code/20260327-102352
…astle/claude-code/20260327-102352
… Issue #386

Key decisions:
- Replaced tsc with tsgo in evalite build and dev scripts (tsgo supports emit and watch)
- Removed typescript from root dependencies and resolutions override
- Moved typescript to devDependencies (still needed as peer dep for typescript-eslint)
- Removed typescript from evalite-ui devDependencies (not needed by Vite or tsgo)

Files changed:
- package.json: removed typescript from dependencies/resolutions, added to devDependencies
- packages/evalite/package.json: build/dev scripts now use tsgo instead of tsc
- apps/evalite-ui/package.json: removed typescript devDependency
- pnpm-lock.yaml: updated lockfile

All tests (110), typecheck, and lint pass.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
PRD: Issue #223 - Eval duration not being tracked properly

Key decisions:
- Suite duration = wall-clock time from creation to completion (Date.now() - created_at)
- Added optional `duration` field to Suites.UpdateOpts type
- SQLite uses COALESCE to only update duration when provided
- In-memory storage preserves existing duration when not provided (was hardcoded to 0)

Files changed:
- packages/evalite/src/types.ts: Added optional duration to UpdateOpts
- packages/evalite/src/reporter/EvaliteRunner.ts: Compute and pass duration on suite completion
- packages/evalite/src/storage/sqlite.ts: Handle duration in updateSuiteStatus
- packages/evalite/src/storage/in-memory.ts: Use opts.duration instead of hardcoded 0
- packages/evalite-tests/tests/basics.test.ts: Unskipped duration test
- .changeset/0000-eval-duration-tracking.md: Added changeset

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
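A rough sketch of the in-memory side of this change is shown below, mirroring what COALESCE does in SQL: keep the stored duration unless a new one is supplied. The `SuiteSketch` and `UpdateOptsSketch` shapes and both function names are hypothetical, and treating `created_at` as epoch milliseconds is an assumption.

```typescript
// Hypothetical shapes; evalite's real Suites types may differ.
interface SuiteSketch {
  id: number;
  status: "running" | "success" | "fail";
  created_at: number; // assumed: epoch milliseconds
  duration: number; // milliseconds
}

interface UpdateOptsSketch {
  id: number;
  status: SuiteSketch["status"];
  duration?: number; // optional, as described in the commit
}

// Wall-clock duration from creation to completion.
function wallClockDurationSketch(
  createdAt: number,
  now: number = Date.now(),
): number {
  return now - createdAt;
}

// Like SQL's COALESCE(new, existing): only overwrite duration when provided.
function updateSuiteStatusSketch(
  suite: SuiteSketch,
  opts: UpdateOptsSketch,
): SuiteSketch {
  return {
    ...suite,
    status: opts.status,
    duration: opts.duration ?? suite.duration,
  };
}
```

Using `??` rather than `||` keeps an explicitly provided duration of 0 intact, which is the same distinction COALESCE makes between NULL and 0.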
PRD: Issue #354 - Evalite Creates Cache Directory Regardless of cacheEnabled Setting

Key decisions:
- Moved mkdir call from before config loading to after cacheEnabled is resolved
- Guard mkdir with if (cacheEnabled) so directory is only created when needed
- FILES_LOCATION (node_modules/.evalite/files/) is only for caching, so skip when disabled

Files changed:
- packages/evalite/src/run-evalite.ts: Moved and guarded mkdir call
- packages/evalite-tests/tests/cache-dir.test.ts: New test verifying cache dir behavior
- packages/evalite-tests/tests/fixtures/issue-354/issue-354.eval.ts: Minimal test fixture
- .changeset/0000-cache-dir-guard.md: Patch changeset

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…to sandcastle/claude-code/20260327-104811
…astle/claude-code/20260327-104811
…ation tracking

- sandcastle/issue-386-upgrade-to-tsgo: Replaced tsc with tsgo (@typescript/native-preview) for type checking and build
- sandcastle/issue-354-cache-dir-regardless-of-config: Only create cache directory when cacheEnabled is true (fixes #354)
- sandcastle/issue-223-eval-duration-tracking: Fix suite duration tracking so eval durations are stored correctly (fixes #223)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
No description provided.