
v1#240

Open
mattpocock wants to merge 190 commits into main from v1

Conversation

@mattpocock
Owner

No description provided.

@changeset-bot

changeset-bot bot commented Oct 19, 2025

🦋 Changeset detected

Latest commit: 647e587

The changes in this PR will be included in the next version bump.

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@vercel

vercel bot commented Oct 19, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| evalite | Ignored | Ignored | Mar 27, 2026 10:54am |

@pkg-pr-new

pkg-pr-new bot commented Oct 19, 2025


npm i https://pkg.pr.new/evalite@240

commit: 647e587

@mattpocock
Owner Author

@claude Remove the BACKEND_ONLY_CONSTANT import that's causing the error in CI

@claude

claude bot commented Oct 19, 2025

Claude finished @mattpocock's task


Task completed

  • Check CI status to understand the error
  • Read modified files to locate the problematic import
  • Remove the BACKEND_ONLY_CONSTANT imports
  • Commit and push the fix

Fixed the CI error:

The error was caused by imports from evalite/backend-only-constants that referenced exports that no longer exist:

  1. test-utils.ts: Removed unused DB_LOCATION import
  2. files.test.ts: Replaced FILES_LOCATION import with a local constant "./node_modules/.evalite/files"

The changes have been committed and pushed to the v1 branch. CI should now pass!


Job Run | Branch: v1

mattpocock and others added 11 commits November 10, 2025 17:39
- Remove unused DB_LOCATION import from test-utils.ts
- Replace FILES_LOCATION import with local constant in files.test.ts

Co-authored-by: Matt Pocock <mattpocock@users.noreply.github.com>

- Add dotenv as a dependency
- Create env-setup-file module that imports dotenv/config
- Export env-setup-file as 'evalite/env-setup-file'
- Automatically prepend env-setup-file to setupFiles array
- Update documentation to reflect automatic .env loading
- Update example config to remove manual dotenv setup
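
The automatic prepending described above could look roughly like the sketch below. Only the export name `evalite/env-setup-file` comes from the commit message; the helper name `withEnvSetupFile` and the de-duplication step are assumptions made for illustration, not the actual evalite implementation.

```typescript
// Export name taken from the commit message above.
const ENV_SETUP_FILE = "evalite/env-setup-file";

// Hypothetical helper: put the env setup file first so that .env values
// are loaded before any user-provided setup file runs.
function withEnvSetupFile(userSetupFiles: string[] = []): string[] {
  // Assumption: if the user already listed the env setup file, keep a single copy.
  const rest = userSetupFiles.filter((file) => file !== ENV_SETUP_FILE);
  return [ENV_SETUP_FILE, ...rest];
}

console.log(withEnvSetupFile(["./test/setup.ts"]));
```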

Fixes #234

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-authored-by: Matt Pocock <mattpocock@users.noreply.github.com>

… precedence

- Add loadVitestSetupFiles() to load setupFiles from vitest.config.ts
- Merge setupFiles from both configs with evalite.config.ts taking precedence
- Add tests for vitest.config.ts setupFiles support and precedence
- setupFiles execution order: env-setup-file -> vitest -> evalite
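
As a rough illustration of the execution order listed above, the merge could be sketched as follows. The function name `mergeSetupFilesInOrder` is hypothetical, and the sketch assumes that "taking precedence" means the evalite.config.ts files run last, so anything they set up overrides earlier files.

```typescript
// Hypothetical sketch of the documented order: env-setup-file -> vitest -> evalite.
function mergeSetupFilesInOrder(
  vitestSetupFiles: string[],
  evaliteSetupFiles: string[],
): string[] {
  return [
    "evalite/env-setup-file", // loads .env first
    ...vitestSetupFiles, // from vitest.config.ts
    ...evaliteSetupFiles, // from evalite.config.ts, run last (assumed meaning of precedence)
  ];
}

console.log(mergeSetupFilesInOrder(["./vitest.setup.ts"], ["./evalite.setup.ts"]));
```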

Co-authored-by: Matt Pocock <mattpocock@users.noreply.github.com>

mattpocock and others added 30 commits March 26, 2026 20:39

…gorization guidelines; create OUT_OF_SCOPE.md for triage decisions.

…ions, data, observability, experimentation, display, and relationships

The exported static UI requires a static file server due to absolute
asset paths, ES module CORS restrictions, and fetch calls that fail
from file://. Updated docs to clarify a static file server is needed.
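
For readers who want to see what "a static file server" means in practice, here is a minimal sketch built only on Node's standard library. It is not part of evalite; the names `resolveAsset` and `createStaticServer` and the MIME table are assumptions for illustration, and any off-the-shelf static server would serve the same purpose.

```typescript
import { createServer, type Server } from "node:http";
import { readFile } from "node:fs/promises";
import { extname, resolve, sep } from "node:path";

// Small illustrative MIME table; a real server would cover more types.
const MIME_TYPES: Record<string, string> = {
  ".html": "text/html; charset=utf-8",
  ".js": "text/javascript; charset=utf-8",
  ".css": "text/css; charset=utf-8",
  ".json": "application/json; charset=utf-8",
};

// Map an absolute URL path such as "/assets/app.js" to a file inside rootDir.
// Returns null when the path would escape rootDir.
function resolveAsset(rootDir: string, urlPath: string): string | null {
  const root = resolve(rootDir);
  const relative = urlPath === "/" ? "index.html" : urlPath.replace(/^\/+/, "");
  const full = resolve(root, relative);
  return full.startsWith(root + sep) ? full : null;
}

// Build (but do not start) an HTTP server that serves files from rootDir.
function createStaticServer(rootDir: string): Server {
  return createServer(async (req, res) => {
    try {
      const { pathname } = new URL(req.url ?? "/", "http://localhost");
      const file = resolveAsset(rootDir, decodeURIComponent(pathname));
      if (file === null) {
        res.writeHead(403).end();
        return;
      }
      const body = await readFile(file);
      const type = MIME_TYPES[extname(file)] ?? "application/octet-stream";
      res.writeHead(200, { "Content-Type": type });
      res.end(body);
    } catch {
      // Missing file or malformed URL.
      res.writeHead(404).end();
    }
  });
}
```

Calling `createStaticServer("./evalite-export").listen(3000)` (the directory name is a placeholder) would then serve the export over HTTP, which avoids the absolute-path, ES module CORS, and fetch problems that occur under file://.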

Files changed:
- apps/evalite-docs/src/content/docs/tips/run-evals-on-ci-cd.mdx
- apps/evalite-docs/src/content/docs/api/cli.mdx

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ot be pushed to remote before task completion

Instead of throwing on unsupported content types (reasoning, file, etc.),
handlePromptContent now returns null and processPromptForTracing filters
out nulls. This fixes crashes when using AI SDK with thinking models
that include reasoning parts in assistant messages.

Key decisions:
- Return null + filter instead of silently converting to avoid data loss
- Generic fix covers all unsupported types, not just reasoning
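
The "return null + filter" pattern can be sketched as below. The part shapes are simplified stand-ins rather than the AI SDK's real types, and `toTracedPart` and `tracePromptParts` are hypothetical names, not the actual `handlePromptContent` and `processPromptForTracing` code.

```typescript
// Simplified, hypothetical part shapes; the AI SDK's real types are richer.
type PromptPart =
  | { type: "text"; text: string }
  | { type: "reasoning"; text: string }
  | { type: "file"; mediaType: string };

type TracedPart = { type: "text"; text: string };

// Convert one part for tracing, or return null when the type is unsupported.
function toTracedPart(part: PromptPart): TracedPart | null {
  switch (part.type) {
    case "text":
      return { type: "text", text: part.text };
    default:
      // Unsupported types (reasoning, file, ...) are skipped instead of throwing.
      return null;
  }
}

// Convert all parts and drop the nulls, so one unsupported part cannot crash tracing.
function tracePromptParts(parts: PromptPart[]): TracedPart[] {
  return parts
    .map(toTracedPart)
    .filter((part): part is TracedPart => part !== null);
}
```

Because the default branch handles every type that is not explicitly supported, new content types added later are also skipped rather than causing a crash.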

Files changed:
- packages/evalite/src/ai-sdk.ts (handlePromptContent + processPromptForTracing)
- packages/evalite-tests/tests/ai-sdk-reasoning.test.ts (new test)
- packages/evalite-tests/tests/fixtures/ai-sdk-reasoning/reasoning.eval.ts (new fixture)
- .changeset/0000-handle-reasoning-content.md

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

… - Issue #386

Added @typescript/native-preview (tsgo v7.0) to root dependencies and updated all
typecheck scripts to use tsgo instead of tsc. Build scripts still use tsc for emit.

Key decisions:
- tsgo used for typecheck only; tsc retained for build/emit in packages/evalite
- Removed deprecated baseUrl option from evalite-ui tsconfig (tsgo 7.0 requires paths-only)
- Used caret range for @typescript/native-preview to track latest 7.x dev releases

Files changed:
- package.json: added @typescript/native-preview dependency
- pnpm-lock.yaml: updated lockfile
- packages/evalite/package.json: typecheck script tsc -> tsgo
- packages/evalite-tests/package.json: typecheck script tsc --noEmit -> tsgo --noEmit
- apps/evalite-ui/package.json: typecheck script tsc --noEmit -> tsgo --noEmit
- apps/evalite-ui/tsconfig.json: removed deprecated baseUrl option

All typecheck and tests pass.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

PRD: Issue #370 - evalite export does not fail if threshold is not reached

Key decisions:
- Added scoreThreshold parameter to exportCommand
- Threshold check runs after export completes (works for both auto-run and pre-existing data)
- Passes scoreThreshold to runEvalite during auto-run for console output
- Reuses same threshold semantics as run command (0-100 scale, exit code 1 if below)
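
The threshold semantics above (0-100 scale, exit code 1 if below) can be illustrated with a small sketch. The function name `exitCodeForThreshold` is hypothetical, and the sketch assumes the average score has already been expressed as a percentage; the real `exportCommand` may compute and compare it differently.

```typescript
// Hypothetical helper: decide the process exit code after an export completes.
// Both arguments are on the 0-100 scale described in the commit message.
function exitCodeForThreshold(
  averageScorePercent: number,
  scoreThreshold?: number,
): 0 | 1 {
  // No threshold configured: never fail on score.
  if (scoreThreshold === undefined) return 0;
  // Fail only when the score is strictly below the threshold.
  return averageScorePercent < scoreThreshold ? 1 : 0;
}

console.log(exitCodeForThreshold(72.5, 80)); // 1
console.log(exitCodeForThreshold(80, 80)); // 0
```

A caller would typically assign the result to `process.exitCode` once the export has finished, so the check applies to both auto-run and pre-existing data.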

Files changed:
- packages/evalite/src/export-static.ts: Added scoreThreshold param and post-export threshold check
- packages/evalite/src/command.ts: Added --threshold CLI flag to export command
- packages/evalite-tests/tests/export-static.test.ts: Added 3 tests for threshold behavior
- .changeset/0000-export-threshold.md: Added changeset

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

… Issue #386

Key decisions:
- Replaced tsc with tsgo in evalite build and dev scripts (tsgo supports emit and watch)
- Removed typescript from root dependencies and resolutions override
- Moved typescript to devDependencies (still needed as peer dep for typescript-eslint)
- Removed typescript from evalite-ui devDependencies (not needed by Vite or tsgo)

Files changed:
- package.json: removed typescript from dependencies/resolutions, added to devDependencies
- packages/evalite/package.json: build/dev scripts now use tsgo instead of tsc
- apps/evalite-ui/package.json: removed typescript devDependency
- pnpm-lock.yaml: updated lockfile

All tests (110), typecheck, and lint pass.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

PRD: Issue #223 - Eval duration not being tracked properly

Key decisions:
- Suite duration = wall-clock time from creation to completion (Date.now() - created_at)
- Added optional `duration` field to Suites.UpdateOpts type
- SQLite uses COALESCE to only update duration when provided
- In-memory storage preserves existing duration when not provided (was hardcoded to 0)
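
A minimal sketch of these decisions, under stated assumptions: the `SuiteRow` and `SuiteUpdate` shapes are simplified, `created_at` is assumed to be an ISO timestamp string, and the SQL in the comment only illustrates the COALESCE idea rather than quoting the actual query in sqlite.ts.

```typescript
// Simplified, hypothetical shapes; the real Suites types have more fields.
type SuiteRow = { id: number; status: string; created_at: string; duration: number };
type SuiteUpdate = { id: number; status: string; duration?: number };

// Wall-clock duration from creation to completion, in milliseconds.
function computeSuiteDuration(createdAt: string, now: number = Date.now()): number {
  return now - new Date(createdAt).getTime();
}

// In-memory update: keep the existing duration when none is provided.
// The SQLite equivalent of this rule would be along the lines of:
//   UPDATE suites SET status = ?, duration = COALESCE(?, duration) WHERE id = ?
function applySuiteUpdate(suite: SuiteRow, update: SuiteUpdate): SuiteRow {
  return {
    ...suite,
    status: update.status,
    duration: update.duration ?? suite.duration,
  };
}
```

Using `??` rather than `||` matters here: a legitimately provided duration of 0 is kept instead of being replaced by the stored value.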

Files changed:
- packages/evalite/src/types.ts: Added optional duration to UpdateOpts
- packages/evalite/src/reporter/EvaliteRunner.ts: Compute and pass duration on suite completion
- packages/evalite/src/storage/sqlite.ts: Handle duration in updateSuiteStatus
- packages/evalite/src/storage/in-memory.ts: Use opts.duration instead of hardcoded 0
- packages/evalite-tests/tests/basics.test.ts: Unskipped duration test
- .changeset/0000-eval-duration-tracking.md: Added changeset

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

PRD: Issue #354 - Evalite Creates Cache Directory Regardless of cacheEnabled Setting

Key decisions:
- Moved mkdir call from before config loading to after cacheEnabled is resolved
- Guard mkdir with if (cacheEnabled) so directory is only created when needed
- FILES_LOCATION (node_modules/.evalite/files/) is only for caching, so skip when disabled
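
The guard can be sketched as follows. `ensureCacheDir` is a hypothetical helper and the temporary directories exist only for the demonstration; the actual change lives in run-evalite.ts and may be structured differently.

```typescript
import { existsSync, mkdirSync, mkdtempSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: create the cache directory only when caching is enabled.
function ensureCacheDir(cacheEnabled: boolean, dir: string): void {
  if (cacheEnabled) {
    mkdirSync(dir, { recursive: true });
  }
}

// Demonstration in a throwaway temp directory (paths are illustrative only).
const sketchRoot = mkdtempSync(join(tmpdir(), "evalite-sketch-"));
const disabledDir = join(sketchRoot, "disabled", ".evalite", "files");
const enabledDir = join(sketchRoot, "enabled", ".evalite", "files");

ensureCacheDir(false, disabledDir);
ensureCacheDir(true, enabledDir);

console.log(existsSync(disabledDir), existsSync(enabledDir)); // false true
```

The important detail from the commit is ordering: the guard can only work once `cacheEnabled` has been resolved from config, which is why the mkdir call moved to after config loading.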

Files changed:
- packages/evalite/src/run-evalite.ts: Moved and guarded mkdir call
- packages/evalite-tests/tests/cache-dir.test.ts: New test verifying cache dir behavior
- packages/evalite-tests/tests/fixtures/issue-354/issue-354.eval.ts: Minimal test fixture
- .changeset/0000-cache-dir-guard.md: Patch changeset

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ation tracking

- sandcastle/issue-386-upgrade-to-tsgo: Replaced tsc with tsgo (@typescript/native-preview) for type checking and build
- sandcastle/issue-354-cache-dir-regardless-of-config: Only create cache directory when cacheEnabled is true (fixes #354)
- sandcastle/issue-223-eval-duration-tracking: Fix suite duration tracking so eval durations are stored correctly (fixes #223)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

4 participants