60 commits
1141fa1
Enhance Studio development experience with HMR and early-stop features
k-taro56 May 2, 2026
f705338
Implement callback hot-swapping and enhance cleanup logic for gracefu…
k-taro56 May 2, 2026
a2e92ee
Refactor Trainer API to remove replaceCallbacks method and implement …
k-taro56 May 2, 2026
a617a56
Refactor Trainer API to remove public requestEarlyStop method and imp…
k-taro56 May 2, 2026
ef6757f
Refactor Trainer API to ensure internal handling of early-stop and ca…
k-taro56 May 2, 2026
45c49a8
Merge branch 'main' into eng-615
k-taro56 May 2, 2026
8abc594
feat: integrate Rolldown for build and HMR, enhance cleanup handling
k-taro56 May 2, 2026
31f5108
fix(hmr.test): update late subscriber event assertion to handle spuri…
k-taro56 May 3, 2026
55b4a2e
fix(dev): ensure SIGINT handler remains active during token persisten…
k-taro56 May 3, 2026
5016dfc
Merge branch 'main' into eng-615
k-taro56 May 4, 2026
1eedccf
feat: enhance HMR handling and subprocess management
k-taro56 May 4, 2026
e870a3b
feat: enhance HMR support and JSON serialization in config hashing
k-taro56 May 4, 2026
0a459df
feat: improve HMR handling and prevent unnecessary cloud job spawning…
k-taro56 May 4, 2026
a9adf14
feat: enhance stream handling in server and improve inspection logic …
k-taro56 May 4, 2026
b242e2a
feat: add waitForStableEvents function to improve event handling stab…
k-taro56 May 4, 2026
33ac9d0
feat: implement module cache busting functions and tests for improved…
k-taro56 May 4, 2026
c8f5b5b
feat: preserve last successful config hash across ERROR events in HMR
k-taro56 May 4, 2026
1d0cc83
feat: implement cleanup hook detachment and reset functionality for i…
k-taro56 May 4, 2026
bd8461c
feat: inject HMR-enabled flag alongside CSRF token in index.html for …
k-taro56 May 4, 2026
c496870
feat: enhance early-stop handling to support custom trainers in shutd…
k-taro56 May 5, 2026
366ea23
feat: enhance signal handling to prevent crashes on unsupported platf…
k-taro56 May 5, 2026
88e098d
feat: update cache-bust logic to use mtime, ctime, and size for impro…
k-taro56 May 5, 2026
63fef23
feat: enhance HMR coordinator comments for clarity on lazy initializa…
k-taro56 May 5, 2026
2057c7a
feat: update trainer detection logic to support additional export sha…
k-taro56 May 5, 2026
8c72d89
fix: prevent watcher from crashing on undefined result during ERROR e…
k-taro56 May 5, 2026
0f8e55c
refactor: simplify cleanup hook exit logic by directly using in-fligh…
k-taro56 May 5, 2026
414fe37
feat: enhance JSON serialization in hashJobConfig to ensure key conte…
k-taro56 May 5, 2026
7114f9f
feat: improve JSON serialization in hashJobConfig to handle undefined…
k-taro56 May 5, 2026
aee40ff
feat: implement pre-spawn event buffering in RunTraining to handle HM…
k-taro56 May 5, 2026
430be81
fix: resolve early-stop latch issue in trainer to ensure immediate se…
k-taro56 May 5, 2026
c7022c4
feat: implement restart grace window in RunTraining to handle late SS…
k-taro56 May 5, 2026
ac35b20
Merge branch 'main' into eng-615
k-taro56 May 8, 2026
21ffd87
Refactor flushMicrotasks function documentation for clarity and accur…
k-taro56 May 8, 2026
e93488e
Update KillResult documentation in trainRegistry.ts to clarify the me…
k-taro56 May 8, 2026
c71d0d3
Enhance error handling in streamTraining to fail fast on non-2xx resp…
k-taro56 May 8, 2026
4facf01
Improve error handling in HMR subscription to prevent crashes from th…
k-taro56 May 8, 2026
5f1316f
Implement robust error handling in trainer callbacks to ensure early-…
k-taro56 May 8, 2026
8b037c5
Refactor child process management in buildStudioApp to maintain data …
k-taro56 May 8, 2026
ee3d4b0
Merge branch 'main' into eng-615
k-taro56 May 8, 2026
e30ef74
feat: add end-to-end tests for Studio HMR functionality
k-taro56 May 8, 2026
89a8171
feat: implement artifact hash tracking in HMR for accurate rebuild ma…
k-taro56 May 8, 2026
d5b89df
refactor: update comments in HMR test to clarify edit and subscribe s…
k-taro56 May 8, 2026
88612df
refactor: improve header handling in buildStudioApp response
k-taro56 May 9, 2026
eeb7f2a
feat: implement HMR fast path for manifest retrieval
k-taro56 May 9, 2026
bc5485b
feat: implement SIGKILL for user-initiated training cancellations
k-taro56 May 9, 2026
315c55f
fix: ensure early-stop checkpoint artifacts are correctly returned in…
k-taro56 May 9, 2026
40ca9b4
feat: capture cloud-side job ID for user-initiated cancellations
k-taro56 May 9, 2026
8cce117
test: add regression test for cloud job cancellation on SIGKILL
k-taro56 May 9, 2026
6970542
fix: reject early-stop deferred on cancel error to prevent silent fai…
k-taro56 May 9, 2026
8ed9eb3
fix: update exit codes for signal handling in cleanup hooks
k-taro56 May 9, 2026
48a9263
fix: ensure early-stop branch settles on user callback error to preve…
k-taro56 May 9, 2026
b8db73a
fix: implement POSIX exit codes for second signal handling and improv…
k-taro56 May 11, 2026
f65561b
fix: rename spawnArtifactHash to spawnArtifactContentHash for clarity…
k-taro56 May 11, 2026
19e3b84
fix: prevent concurrent token file deletion on failed persist during …
k-taro56 May 18, 2026
77fb46a
fix: improve comments for clarity and consistency across various files
k-taro56 May 18, 2026
d0d035a
fix: unify POSIX exit code handling across runner and CLI, improve er…
k-taro56 May 21, 2026
35c01cb
fix: improve error handling for checkpoint throws and refine cancel b…
k-taro56 May 21, 2026
378096a
fix: enhance token cleanup logic to prevent accidental deletion of co…
k-taro56 May 22, 2026
4e689df
Merge branch 'main' into eng-615
k-taro56 May 22, 2026
8de5cbb
fix: improve error handling and cleanup logic in RunTraining component
k-taro56 May 22, 2026
27 changes: 19 additions & 8 deletions AGENTS.md
@@ -63,25 +63,36 @@ cd my-arkor-app && pnpm dev # Studio at http://127.0.

`arkor dev` generates a 32-byte base64url token per launch ([packages/arkor/src/cli/commands/dev.ts](packages/arkor/src/cli/commands/dev.ts)) and:

1. Passes it to `buildStudioApp({ studioToken })`. The Hono server validates every `/api/*` request via `X-Arkor-Studio-Token` header (or `?studioToken=` query for `EventSource`, which can't set headers). Comparison uses `timingSafeEqual`.
2. Persists it to `~/.arkor/studio-token` (mode 0600) so the SPA dev workflow (`pnpm --filter @arkor/studio-app dev`) can read it via the `arkor-studio-token` Vite plugin in [packages/studio-app/vite.config.ts](packages/studio-app/vite.config.ts), which injects `<meta name="arkor-studio-token">` into `index.html` on each request. Persistence failure must NOT block server start (read-only `$HOME` on Docker, etc.) just warn.
1. Passes it to `buildStudioApp({ studioToken })`. The Hono server validates every `/api/*` request via `X-Arkor-Studio-Token` header (or `?studioToken=` query for `EventSource`, which can't set headers). Comparison uses `timingSafeEqual`. The query-token allow-list lives in `eventStreamPathPattern` in [packages/arkor/src/studio/server.ts](packages/arkor/src/studio/server.ts), currently `/api/jobs/:id/events` and `/api/dev/events`. **Adding to that regex is CSRF-sensitive: each entry must be a GET stream-only route, never a mutation endpoint.**
2. Persists it to `~/.arkor/studio-token` (mode 0600) so the SPA dev workflow (`pnpm --filter @arkor/studio-app dev`) can read it via the `arkor-studio-token` Vite plugin in [packages/studio-app/vite.config.ts](packages/studio-app/vite.config.ts), which injects `<meta name="arkor-studio-token">` into `index.html` on each request. Persistence failure must NOT block server start (read-only `$HOME` on Docker, etc.); just warn.
3. Cleans up on `exit`/SIGINT/SIGTERM/SIGHUP via `unlinkSync`.
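
The token generation and comparison described in the list above can be sketched as follows. This is a minimal illustration using Node's built-in `crypto` module, not the code in `dev.ts` or `server.ts`; the function names `generateStudioToken` and `tokensMatch` are hypothetical.

```typescript
import { randomBytes, timingSafeEqual } from "node:crypto";

// Hypothetical helper: 32 random bytes encoded as base64url
// (43 characters, since Node's "base64url" encoding omits padding).
function generateStudioToken(): string {
  return randomBytes(32).toString("base64url");
}

// Hypothetical helper: constant-time comparison of the expected token with
// whatever the request presented. timingSafeEqual throws when the two
// buffers differ in length, so the lengths are compared first.
function tokensMatch(expected: string, presented: string | undefined): boolean {
  if (presented === undefined) return false;
  const a = Buffer.from(expected, "utf8");
  const b = Buffer.from(presented, "utf8");
  if (a.length !== b.length) return false;
  return timingSafeEqual(a, b);
}
```

The early length check leaks only the token length, which is fixed and public here, so it does not weaken the timing-safe property for the token's content.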

`/api/*` middleware also enforces a host-header allow-list (`127.0.0.1`/`localhost`) for DNS-rebinding defence. **CORS is intentionally NOT configured** the SPA is same-origin so reflecting `*` would let "simple" cross-origin POSTs reach handlers. The token check rejects those; cross-origin tabs cannot read the SPA's `<meta>`.
`/api/*` middleware also enforces a host-header allow-list (`127.0.0.1`/`localhost`) for DNS-rebinding defence. **CORS is intentionally NOT configured**: the SPA is same-origin so reflecting `*` would let "simple" cross-origin POSTs reach handlers. The token check rejects those; cross-origin tabs cannot read the SPA's `<meta>`.

The whole point: prevents another browser tab on the same machine from POSTing `/api/train` (which spawns `arkor train` and dynamically imports user TS RCE-grade).
The whole point: prevents another browser tab on the same machine from POSTing `/api/train` (which spawns `arkor train` and dynamically imports user TS, an RCE-grade exposure).

When touching the Studio server or SPA fetch layer, preserve: token via header for `fetch`, query param for `EventSource`, host-header guard, no CORS, timing-safe compare. The Vite plugin is dev-only (`apply: "serve"`) — running it during `vite build` would bake a stale per-launch token into the production `index.html` and shadow the runtime tag, causing every `/api/*` call to 403.
When touching the Studio server or SPA fetch layer, preserve: token via header for `fetch`, query param for `EventSource`, host-header guard, no CORS, timing-safe compare. The Vite plugin is dev-only (`apply: "serve"`): running it during `vite build` would bake a stale per-launch token into the production `index.html` and shadow the runtime tag, causing every `/api/*` call to 403.
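
To make the invariants in the paragraph above concrete, here is a rough sketch of a request guard that applies them in order: host-header allow-list, header token for ordinary requests, query token only for GET requests on event-stream paths, and a timing-safe compare. Everything named here (`StudioRequest`, `guard`, `EVENT_STREAM_PATHS`) is an assumption for illustration; the regex is only a stand-in for the real `eventStreamPathPattern`, and the actual server is a Hono middleware rather than a pure function.

```typescript
import { timingSafeEqual } from "node:crypto";

// Hypothetical stand-in for `eventStreamPathPattern` in server.ts.
const EVENT_STREAM_PATHS = /^\/api\/(?:jobs\/[^/]+\/events|dev\/events)$/;
const ALLOWED_HOSTS = new Set(["127.0.0.1", "localhost"]);

// Hypothetical request shape, reduced to what the guard needs.
interface StudioRequest {
  method: string;
  url: string; // path plus query, e.g. "/api/dev/events?studioToken=..."
  host?: string; // raw Host header
  headerToken?: string; // value of X-Arkor-Studio-Token, if present
}

function safeEqual(a: string, b: string): boolean {
  const x = Buffer.from(a);
  const y = Buffer.from(b);
  return x.length === y.length && timingSafeEqual(x, y);
}

function guard(req: StudioRequest, expectedToken: string): 200 | 403 {
  // DNS-rebinding defence: only loopback host names pass, with or without a port.
  const hostname = (req.host ?? "").replace(/:\d+$/, "").toLowerCase();
  if (!ALLOWED_HOSTS.has(hostname)) return 403;

  // Header first; the query parameter is honoured only for GET stream routes.
  let presented = req.headerToken;
  if (presented === undefined) {
    const url = new URL(req.url, "http://127.0.0.1");
    if (req.method === "GET" && EVENT_STREAM_PATHS.test(url.pathname)) {
      presented = url.searchParams.get("studioToken") ?? undefined;
    }
  }
  return presented !== undefined && safeEqual(presented, expectedToken) ? 200 : 403;
}
```

Note that a mutation such as `POST /api/train?studioToken=...` is rejected in this sketch even when the query token is correct, which is the CSRF-sensitive property the allow-list exists to protect.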

### HMR + graceful early-stop + callback hot-swap

`arkor dev` keeps a [Rolldown](https://rolldown.rs) watcher over `src/arkor/` ([packages/arkor/src/studio/hmr.ts](packages/arkor/src/studio/hmr.ts)) and pushes rebuild events over `/api/dev/events` (SSE). On each successful build the watcher dynamic-imports the artifact, pulls a `TrainerInspection` snapshot off the discovered trainer (via the cross-realm `Symbol.for("arkor.trainer.inspect")` brand attached in [packages/arkor/src/core/trainerInspection.ts](packages/arkor/src/core/trainerInspection.ts)), and computes a stable `configHash` from the cloud-side `JobConfig`. The SPA re-fetches `/api/manifest` on each event so the Run Training button stays in sync without a browser refresh.
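
One way to get a stable hash out of a config object is to serialize it with sorted keys before hashing, so that property order in the user's source does not change the result. The sketch below assumes that approach and uses SHA-256; it is not the repository's `hashJobConfig`, and the names `stableStringify` and `hashConfig` are hypothetical. Treating `undefined` properties as absent is likewise an assumption.

```typescript
import { createHash } from "node:crypto";

type Json =
  | null
  | boolean
  | number
  | string
  | Json[]
  | { [key: string]: Json | undefined };

// Hypothetical helper: JSON serialization with object keys sorted, so two
// configs that differ only in key order produce the same string.
function stableStringify(value: Json | undefined): string {
  if (value === undefined) return "null";
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) {
    return `[${value.map((v) => stableStringify(v)).join(",")}]`;
  }
  const keys = Object.keys(value)
    .filter((k) => value[k] !== undefined) // assumption: undefined means absent
    .sort();
  const body = keys.map((k) => `${JSON.stringify(k)}:${stableStringify(value[k])}`);
  return `{${body.join(",")}}`;
}

// Hypothetical helper: hex SHA-256 digest of the stable serialization.
function hashConfig(config: Json): string {
  return createHash("sha256").update(stableStringify(config)).digest("hex");
}
```

Array order is preserved on purpose: reordering list entries is treated as a real config change in this sketch.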

When a rebuild lands while a `/api/train`-spawned subprocess is in flight, the server makes a per-child decision in [packages/arkor/src/studio/trainRegistry.ts](packages/arkor/src/studio/trainRegistry.ts):

- **`configHash` matches the spawn-time hash** → SIGUSR2. The child's `installCallbackReloadHandler` re-imports the artifact and rotates the trainer's callback cell via the internal `Symbol.for("arkor.trainer.replaceCallbacks")` brand exposed by [packages/arkor/src/core/trainerInspection.ts](packages/arkor/src/core/trainerInspection.ts). The cloud-side run is untouched. Use this whenever a code change is contained inside the `callbacks: { ... }` object. Don't add a `replaceCallbacks()` method to the public `Trainer` interface: keeping the mutator behind a `Symbol.for` brand is what stops the dev-only HMR primitive from leaking into the SDK's published surface.
- **`configHash` differs (or is null because the new bundle didn't inspect)** → SIGTERM. `installShutdownHandlers` drives the trainer's internal early-stop entry point via the `Symbol.for("arkor.trainer.requestEarlyStop")` brand exposed by [packages/arkor/src/core/trainerInspection.ts](packages/arkor/src/core/trainerInspection.ts), which lets the next `checkpoint.saved` event finish (work preserved) before issuing `cancel()` and exiting cleanly. The SPA auto-restarts the run with the rebuilt artifact via the `restart: true` flag on the SSE event. A second SIGTERM bypasses the early-stop and exits 143 immediately, as an emergency escape hatch for a hung cancel.

Don't replace the SIGTERM-and-let-the-child-handle-it pattern with a SIGKILL escalation in the server: that would orphan Cloud-side jobs (no `cancel()` POST goes out) and waste GPU budget. Don't widen the SIGUSR2 path to "always hot-swap, server-side": the `configHash` check is what guarantees a hot-swap can't silently leave a child running with a stale `JobConfig`. Don't surface `requestEarlyStop()` (or `replaceCallbacks()`) as a method on the public `Trainer` interface: both are dev-only HMR primitives, and keeping them behind `Symbol.for` brands is what stops them from leaking into the published SDK shape; user code that wants similar semantics should compose `abortSignal` + `cancel()` per the cookbook.
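
The per-child decision described above reduces to a small pure function. The sketch below is an illustration only: `decideOnRebuild` and `exitCodeForSignal` are hypothetical names, the real logic lives in `trainRegistry.ts`, and treating a missing spawn-time hash the same as a mismatch is an assumption.

```typescript
type ReloadSignal = "SIGUSR2" | "SIGTERM";

interface RebuildDecision {
  signal: ReloadSignal;
  // Mirrors the `restart: true` flag the SSE event carries on the SIGTERM path.
  restart: boolean;
}

// Hypothetical helper: hot-swap only when both hashes are known and equal;
// anything else falls back to graceful early-stop plus restart.
function decideOnRebuild(
  spawnHash: string | null,
  nextHash: string | null,
): RebuildDecision {
  if (spawnHash !== null && nextHash !== null && spawnHash === nextHash) {
    return { signal: "SIGUSR2", restart: false };
  }
  return { signal: "SIGTERM", restart: true };
}

// POSIX convention: a process ended by signal N reports exit code 128 + N.
// SIGTERM is 15, which gives the 143 mentioned for the second-SIGTERM path.
const SIGNAL_NUMBERS = { SIGHUP: 1, SIGINT: 2, SIGTERM: 15 } as const;

function exitCodeForSignal(signal: keyof typeof SIGNAL_NUMBERS): number {
  return 128 + SIGNAL_NUMBERS[signal];
}
```

Keeping the decision free of side effects makes the "never hot-swap on a changed `JobConfig`" guarantee easy to cover with a unit test.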

### Project entry-point discovery

The CLI/Studio look at `src/arkor/index.ts` in user projects. Discovery in [packages/arkor/src/core/runner.ts](packages/arkor/src/core/runner.ts) accepts (in order): a named `arkor` export from `createArkor({...})`, a bare `trainer` export, a default export holding either an Arkor manifest or a Trainer, or a `default.trainer` nested shape. `createArkor` returns a frozen, opaque manifest tagged with `_kind: "arkor"`; treat it as a value to hand to tooling, not a programmable client.
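
The lookup order in the paragraph above could look roughly like the sketch below. It is not the code in `runner.ts`: `discoverEntry` and `Discovered` are hypothetical names, and because the source does not say how a Trainer is recognised, the trainer check is passed in as a caller-supplied predicate rather than guessed. Only the `_kind: "arkor"` tag comes from the text.

```typescript
interface Discovered {
  source: "arkor" | "trainer" | "default" | "default.trainer";
  kind: "manifest" | "trainer";
  value: object;
}

// The manifest tag is the one documented detail: `_kind: "arkor"`.
function isManifest(v: unknown): v is { _kind: "arkor" } {
  return (
    typeof v === "object" &&
    v !== null &&
    (v as { _kind?: unknown })._kind === "arkor"
  );
}

// Hypothetical helper: walks the export shapes in the documented order and
// returns the first hit, or null when nothing matches.
function discoverEntry(
  mod: Record<string, unknown>,
  isTrainer: (v: unknown) => v is object,
): Discovered | null {
  const named = mod["arkor"];
  if (isManifest(named)) return { source: "arkor", kind: "manifest", value: named };

  const bare = mod["trainer"];
  if (isTrainer(bare)) return { source: "trainer", kind: "trainer", value: bare };

  const def = mod["default"];
  if (isManifest(def)) return { source: "default", kind: "manifest", value: def };
  if (isTrainer(def)) return { source: "default", kind: "trainer", value: def };

  if (typeof def === "object" && def !== null) {
    const nested = (def as { trainer?: unknown }).trainer;
    if (isTrainer(nested)) {
      return { source: "default.trainer", kind: "trainer", value: nested };
    }
  }
  return null;
}
```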

`arkor build` ([packages/arkor/src/cli/commands/build.ts](packages/arkor/src/cli/commands/build.ts)) bundles to `.arkor/build/index.mjs` with esbuild; bare specifiers (e.g. `arkor`, anything in `node_modules`) stay external so the artifact resolves the runtime SDK from the project's installed copy.
`arkor build` ([packages/arkor/src/cli/commands/build.ts](packages/arkor/src/cli/commands/build.ts)) bundles to `.arkor/build/index.mjs` with [Rolldown](https://rolldown.rs); bare specifiers (e.g. `arkor`, anything in `node_modules`) stay external so the artifact resolves the runtime SDK from the project's installed copy. The `transform.target` is derived from `process.versions.node` at build time so the bundle targets the same Node binary that will execute it.

### E2E suite specifics

Both [e2e/cli](e2e/cli) and [e2e/studio](e2e/studio) declare `arkor` (and, for `e2e/cli`, `create-arkor`) as `workspace:*` `devDependencies`, so Turbo's `^build` produces `dist/bin.mjs` exactly once before `#test`/`#test:coverage` runs no `pretest` hooks, no concurrent rebuilds racing on `dist/`. Standalone runs (`pnpm --filter @arkor/e2e-* test`) need a prior `pnpm build`. Every supported Node (≥22.22.0) is in rolldown's compatible range (^20.19 || >=22.12), so the previous "rolldown-incompatible" CI bypass path was removed.
Both [e2e/cli](e2e/cli) and [e2e/studio](e2e/studio) declare `arkor` (and, for `e2e/cli`, `create-arkor`) as `workspace:*` `devDependencies`, so Turbo's `^build` produces `dist/bin.mjs` exactly once before `#test`/`#test:coverage` runs (no `pretest` hooks, no concurrent rebuilds racing on `dist/`). Standalone runs (`pnpm --filter @arkor/e2e-* test`) need a prior `pnpm build`. Every supported Node (≥22.22.0) is in rolldown's compatible range (^20.19 || >=22.12), so the previous "rolldown-incompatible" CI bypass path was removed.

Tests rely on `ARKOR_INTERNAL_SCAFFOLD_ARKOR_SPEC=file:.../packages/arkor` so the scaffolded fixtures install the workspace `arkor` instead of the npm-published one. Both this var and `SKIP_E2E_INSTALL` are declared in [turbo.json](turbo.json) so they pass through Turbo's hash.

@@ -96,7 +107,7 @@ When implementing anything (new feature, SDK/CLI/Studio behaviour change, schema
1. **Docs in both languages.** This repo pairs English/Japanese docs: `README.md` ↔ `README.ja.md`, `CONTRIBUTING.md` ↔ `CONTRIBUTING.ja.md`, and `docs/` ↔ `docs/ja/`. If you edit the English side, update the Japanese side in the same PR. Don't leave Japanese docs to be retro-translated later.
2. **Tests.** Add vitest cases under `packages/*/src/**/*.test.ts` for SDK/CLI/scaffold logic changes. For CLI flow changes, consider an `e2e/cli` scenario.

Don't split these into "docs in a follow-up PR" or "tests later" land them in the same PR. Skip only when the user explicitly says to.
Don't split these into "docs in a follow-up PR" or "tests later"; land them in the same PR. Skip only when the user explicitly says to.

## Non-obvious gotchas

7 changes: 6 additions & 1 deletion docs/concepts/studio.mdx
@@ -14,7 +14,12 @@ Four jobs:
3. **Try a finished model.** A Playground page lets you pick the base model or the final adapter from any completed job and chat with it. The Playground does not load intermediate checkpoints; for mid-run inference, use [`onCheckpoint`](/concepts/lifecycle) callbacks in your trainer.
4. **Publish a model behind a `*.arkor.app` URL.** An Endpoints page creates a per-deployment subdomain that serves OpenAI-compatible chat completions for a chosen adapter or base model, plus the API keys that authenticate calls to it. The same actions are available programmatically via [`CloudApiClient`](/sdk/deployments) — Studio is the interactive surface; the SDK is the lower-level one.

A note on the dev loop: Studio's `/api/manifest` endpoint rebuilds and re-imports your trainer on every request (with a cache-bust query, see `packages/arkor/src/studio/manifest.ts`), but the UI only fetches it when the Run training page mounts. So if you edit `src/arkor/` and stay on the same Run training page, the next click reuses the existing `.arkor/build/index.mjs` and runs your old code. Refresh the page (or run `arkor build` from the terminal) between edits and clicks to pick up the new code reliably.
A note on the dev loop: Studio runs a [Rolldown](https://rolldown.rs) watcher over `src/arkor/` and pushes rebuild notifications to the SPA over a Server-Sent Events stream (`/api/dev/events`). Edit a file, save, and the Run training button updates with the new trainer name without a refresh. If a training run is in flight, the Studio compares the new bundle's cloud-side `JobConfig` hash to the one captured when the run was spawned:

- **Same hash (only callbacks changed).** The runner is signalled with SIGUSR2; it re-imports the rebuilt artifact and rotates the trainer's callback cell in place via an internal HMR brand. The cloud-side training run is untouched, no GPU time is wasted, and the SPA shows a brief "Callbacks hot-swapped" indicator.
- **Different hash (model / dataset / hyperparameters changed).** The runner is signalled with SIGTERM; the trainer's internal early-stop entry point lets the next checkpoint upload finish before issuing `cancel()`, then the SPA re-spawns the run with the rebuilt artifact. The previous Cloud-side job reaches `cancelled` after the checkpoint is uploaded, so the partial work is preserved as an artifact.

If you want this "stop after the next checkpoint" behaviour from your own code (rather than from the dev loop), build it on top of the public [`abortSignal` + `cancel()`](/sdk/trainer-control#abortsignal) pair. The [Early stopping recipe](/cookbook/early-stopping) walks through it.

## Where Studio runs

7 changes: 6 additions & 1 deletion docs/ja/concepts/studio.mdx
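@@ -14,7 +14,12 @@ Studio is the local web UI that starts when you run `arkor dev`.
3. **Try a finished model.** On the Playground page you can pick the base model or the final adapter of any completed job and chat with it. The Playground does not load intermediate checkpoints. For inference during training, use the [`onCheckpoint`](/ja/concepts/lifecycle) callback in your trainer.
4. **Publish a model at a `*.arkor.app` URL.** On the Endpoints page you can create a deployment-specific subdomain that serves OpenAI-compatible chat completions, and issue or revoke its API keys. The same operations are available programmatically through [`CloudApiClient`](/ja/sdk/deployments): Studio is the interactive interface, and the SDK is the lower layer.

A note on the dev loop: Studio's `/api/manifest` endpoint rebuilds and re-imports your trainer on every request (with a cache-bust query; see `packages/arkor/src/studio/manifest.ts`), but the UI fetches it only when the Run training page mounts. If you edit `src/arkor/` and stay on the same Run training page, the next click reuses the existing `.arkor/build/index.mjs` and runs your old code. To pick up the new code reliably, reload the page (or run `arkor build` from the terminal) between editing and clicking.
A note on the dev loop: Studio keeps a [Rolldown](https://rolldown.rs) watcher running over `src/arkor/` and pushes rebuild notifications to the SPA over a Server-Sent Events stream (`/api/dev/events`). Edit a file and save it, and the trainer name shown on the Run training button updates without a reload. If a training run is in progress, Studio compares the rebuilt bundle's Cloud-side `JobConfig` hash with the hash saved at spawn time.

- **Hash matches (only callbacks changed).** Studio sends SIGUSR2 to the runner. The runner re-imports the rebuilt artifact and swaps the trainer's callback cell in place through an internal HMR brand. The Cloud-side training continues untouched, no GPU time is wasted, and the SPA briefly shows "Callbacks hot-swapped".
- **Hash differs (model / dataset / hyperparameters changed).** Studio sends SIGTERM to the runner. The trainer's internal early-stop entry point waits for the next checkpoint upload before firing `cancel()`, and the SPA resubmits the run with the rebuilt artifact. The previous Cloud-side job transitions to `cancelled` after the checkpoint upload completes, so the training done so far is preserved as an artifact.

If you want this "stop at the next checkpoint" behaviour from your own code (rather than from the dev loop), build it by combining the public API's [`abortSignal` + `cancel()`](/ja/sdk/trainer-control#abortsignal). The concrete steps are in the [Early Stopping recipe](/ja/cookbook/early-stopping).

## Where Studio runs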

10 changes: 5 additions & 5 deletions docs/ja/studio/jobs.mdx
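@@ -62,18 +62,18 @@ The Jobs page (`#/jobs`), once on mount and then every 5 seconds

The loss chart is an SVG plot drawn from `training.log` events. The Y axis is scaled by the minimum and maximum values, the X axis is the step number, and up to two series are shown:

- **Training loss** Solid teal line. One vertex per event that contains a numeric `loss`.
- **Eval loss** Dashed pink line (with point markers). Drawn from events that contain a numeric `evalLoss` (usually every `evalSteps`). Because the series is built directly from the events, eval-only frames that carry `evalLoss` but no `loss` are also reflected in the line, legend, and statistics. It does not appear in the legend until at least one eval point has arrived.
- **Training loss**: Solid teal line. One vertex per event that contains a numeric `loss`.
- **Eval loss**: Dashed pink line (with point markers). Drawn from events that contain a numeric `evalLoss` (usually every `evalSteps`). Because the series is built directly from the events, eval-only frames that carry `evalLoss` but no `loss` are also reflected in the line, legend, and statistics. It does not appear in the legend until at least one eval point has arrived.

Hovering shows the nearest step and whichever of `loss` / `evalLoss` exists at that step (eval-only steps show no `loss` value, and vice versa). The chart shows the `Waiting for training.log events…` placeholder until at least one event arrives in which either `loss` or `evalLoss` is numeric. `training.log` frames in which both are null or omitted are not counted.

### Advanced mode (Advanced metrics)

Turning on the **Advanced** toggle in the chart header reveals a per-series statistics panel. Each card shows:

- **Mean loss ± 95% CI** Sample mean of the loss values and the half-width of the 95% confidence interval (Student's t distribution; falls back to z = 1.96 for n > 31).
- **Std dev** and **Variance** Bessel-corrected unbiased estimators (`ddof=1`).
- **p90** and **p95** Linear-interpolation percentiles matching numpy's default.
- **Mean loss ± 95% CI**: Sample mean of the loss values and the half-width of the 95% confidence interval (Student's t distribution; falls back to z = 1.96 for n > 31).
- **Std dev** and **Variance**: Bessel-corrected unbiased estimators (`ddof=1`).
- **p90** and **p95**: Linear-interpolation percentiles matching numpy's default.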
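
The statistics listed above can be computed along the following lines. This is a sketch under stated assumptions, not the Studio implementation: the function names are hypothetical, and the table holds standard two-sided 95% Student's t critical values for 1 to 30 degrees of freedom, which is one plausible way to realise the "t for small n, z = 1.96 for n > 31" rule.

```typescript
// Two-sided 95% critical values of Student's t, indexed by df - 1 (df = 1..30).
const T_975: readonly number[] = [
  12.706, 4.303, 3.182, 2.776, 2.571, 2.447, 2.365, 2.306, 2.262, 2.228,
  2.201, 2.179, 2.160, 2.145, 2.131, 2.120, 2.110, 2.101, 2.093, 2.086,
  2.080, 2.074, 2.069, 2.064, 2.060, 2.056, 2.052, 2.048, 2.045, 2.042,
];

function mean(xs: readonly number[]): number {
  return xs.reduce((sum, x) => sum + x, 0) / xs.length;
}

// Sample variance with Bessel's correction (ddof = 1); NaN below two points.
function sampleVariance(xs: readonly number[]): number {
  if (xs.length < 2) return NaN;
  const m = mean(xs);
  return xs.reduce((sum, x) => sum + (x - m) ** 2, 0) / (xs.length - 1);
}

function sampleStdDev(xs: readonly number[]): number {
  return Math.sqrt(sampleVariance(xs));
}

// Half-width of the 95% confidence interval of the mean.
function ci95HalfWidth(xs: readonly number[]): number {
  const n = xs.length;
  if (n < 2) return NaN;
  const critical = n > 31 ? 1.96 : T_975[n - 2] ?? NaN;
  return critical * Math.sqrt(sampleVariance(xs) / n);
}

// Percentile by linear interpolation between the two closest ranks,
// which is the default method of numpy.percentile.
function percentile(xs: readonly number[], p: number): number {
  const sorted = [...xs].sort((a, b) => a - b);
  const rank = (p / 100) * (sorted.length - 1);
  const lo = Math.floor(rank);
  const hi = Math.ceil(rank);
  const a = sorted[lo] ?? NaN;
  const b = sorted[hi] ?? NaN;
  return a + (rank - lo) * (b - a);
}
```

For the series `[1, 2, 3, 4, 5]`, this gives a mean of 3, a variance of 2.5, a p90 of about 4.6, and a p95 of about 4.8.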

The Eval card stays empty until a `training.log` event containing a numeric `evalLoss` arrives.
