Merged
5 changes: 5 additions & 0 deletions .changeset/rich-keys-live.md
@@ -0,0 +1,5 @@
---
"@cephalization/math": minor
---

feat: Support openai + anthropic models via provider/model pattern
8 changes: 8 additions & 0 deletions .dex/tasks.jsonl

Large diffs are not rendered by default.

73 changes: 73 additions & 0 deletions .math/todo/LEARNINGS.md
@@ -84,3 +84,76 @@ Use this knowledge to avoid repeating mistakes and build on what works.
- Pattern: Use `dexMock.getCalls()` to verify the exact sequence of start/complete calls and their order
- The test verifies: 3 tasks completed in dependency order (task-1 -> task-2 -> task-3), correct call sequence, no max iterations exceeded
- Test runs in ~56ms (well under the 1 second requirement)

## 9p2tu55l

- Added short flag `-m` as alias for `--model` in the parseArgs function
- Design pattern: Created `SHORT_FLAGS` mapping object to make adding new short flags trivial (just add to the map)
- Key change: Modified the "next value" check from `!next.startsWith("--")` to `!next.startsWith("-")` so that short flags are also recognized as flags, not values
- The parser handles short flags that are exactly 2 characters (dash + single letter) - this is intentional to avoid ambiguity
- Unknown short flags pass through using their short key (e.g., `-x value` becomes `{ x: "value" }`)
- Added dedicated unit tests for parseArgs in `src/parse-args.test.ts` since index.ts had no tests
- Pre-existing test failures in prune.test.ts are unrelated (macOS path canonicalization: `/var` vs `/private/var`)
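
The short-flag handling described above can be sketched as follows. This is a condensed, self-contained illustration rather than the project's exact code, and the sample argument lists in the comments are made up. It also shows one consequence of the `!next.startsWith("-")` check: a value that begins with a dash is treated as a flag, not a value.

```typescript
// Map short flags to their long equivalents; adding a flag means adding an entry.
const SHORT_FLAGS: Record<string, string> = { m: "model" };

function parseArgs(args: string[]): Record<string, string | boolean> {
  const parsed: Record<string, string | boolean> = {};
  for (let i = 0; i < args.length; i++) {
    const arg = args[i];
    if (arg === undefined) continue;

    let key: string;
    if (arg.startsWith("--")) {
      key = arg.slice(2);
    } else if (arg.startsWith("-") && arg.length === 2) {
      // Short flag: exactly a dash plus one character.
      const shortKey = arg.slice(1);
      key = SHORT_FLAGS[shortKey] ?? shortKey; // unknown short flags pass through
    } else {
      continue; // positional arguments are ignored in this sketch
    }

    const next = args[i + 1];
    if (next && !next.startsWith("-")) {
      parsed[key] = next; // consume the following token as the value
      i++;
    } else {
      parsed[key] = true; // bare flag, or the next token is itself a flag
    }
  }
  return parsed;
}

// parseArgs(["-m", "openai/gpt-4"])  -> { model: "openai/gpt-4" }
// parseArgs(["-x", "value"])         -> { x: "value" }
// parseArgs(["--pause", "-1"])       -> { pause: true, "1": true }
```

The last call illustrates the trade-off: because any token starting with `-` counts as a flag, a negative number cannot be passed as a value with this parser.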

## dvfozgy9

- Created src/model.ts with model validation utilities for the provider/model-name format
- Key design: Return type union `{ valid: true, model } | { valid: false, error }` provides TypeScript-friendly narrowing with `if (!result.valid)` checks
- Used `as const` for SUPPORTED_PROVIDERS array to enable type-safe provider checking: `(typeof SUPPORTED_PROVIDERS)[number]` derives the union type
- parseModelProvider returns null for invalid input (simple to check), validateModel returns structured error with helpful message
- Edge cases to handle: empty string, missing slash, provider-only with trailing slash, unsupported providers
- Pattern: Use indexOf + slice instead of split for parsing - handles multiple slashes correctly (e.g., "openai/gpt-4/turbo" keeps "gpt-4/turbo" as modelName)
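
A minimal sketch of the validation utilities described above. `src/model.ts` itself is not shown in this diff, so the provider list (taken from the changeset's mention of openai and anthropic), the error messages, the `isSupportedProvider` helper, and the exact return shapes are assumptions made for illustration.

```typescript
// Assumed provider list; `as const` keeps the literal types.
const SUPPORTED_PROVIDERS = ["openai", "anthropic"] as const;
type Provider = (typeof SUPPORTED_PROVIDERS)[number]; // "openai" | "anthropic"

// Discriminated union: checking `valid` narrows to the matching branch.
type ValidationResult =
  | { valid: true; model: string }
  | { valid: false; error: string };

// Hypothetical helper: type guard over the provider list.
function isSupportedProvider(value: string): value is Provider {
  return (SUPPORTED_PROVIDERS as readonly string[]).includes(value);
}

// Returns null for anything that is not "provider/model-name".
function parseModelProvider(
  model: string
): { provider: string; modelName: string } | null {
  const slash = model.indexOf("/");
  if (slash <= 0) return null; // empty string, missing slash, or empty provider
  const provider = model.slice(0, slash);
  const modelName = model.slice(slash + 1); // keeps any later slashes intact
  if (!modelName) return null; // provider-only input such as "openai/"
  return { provider, modelName };
}

function validateModel(model: string): ValidationResult {
  const parsed = parseModelProvider(model);
  if (!parsed) {
    return {
      valid: false,
      error: `Invalid model "${model}": expected provider/model-name`,
    };
  }
  if (!isSupportedProvider(parsed.provider)) {
    return {
      valid: false,
      error: `Unsupported provider "${parsed.provider}". Supported: ${SUPPORTED_PROVIDERS.join(", ")}`,
    };
  }
  return { valid: true, model };
}
```

Splitting only on the first slash is what lets `"openai/gpt-4/turbo"` yield `"gpt-4/turbo"` as the model name, whereas `split("/")` would produce three parts that need rejoining.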

## 9686t3iv

- Integrated model validation into CLI commands by adding a `validateModelOrExit()` helper function
- Key pattern: Centralized validation function handles type narrowing and exit logic - returns `string | undefined` for clean integration with command options
- Applied validation to all 4 commands that accept --model: run, plan, init, iterate
- For `run` command which passes raw options, validation is called separately before `run(options)` since the options object is passed through directly
- Updated help text to show model format requirement, supported providers, and default value - all pulled from existing constants/types
- Gotcha: TypeScript type narrowing requires checking `typeof model !== "string"` rather than `model === undefined || model === true` to handle all boolean cases
- Pre-existing test failures in prune.test.ts are unrelated to this task (macOS symlink path canonicalization issue)

## wjxkvy1t

- Created src/config.ts with Zod schema for iteration configuration
- Zod v4 imports use `import { z } from "zod/v4"` syntax (not just `"zod"`)
- Key pattern: Use `safeParse()` for load operations to handle invalid data gracefully (return null), use `parse()` for save operations to throw on invalid config
- Used synchronous `fs.readFileSync` instead of async Bun.file().text() for simpler return type (no Promise needed)
- `Bun.file(path).size` returns 0 for non-existent files - good way to check file existence before reading
- Tests use `mkdtemp()` for isolated temp directories per test - follows established pattern from LEARNINGS.md
- Pre-existing prune.test.ts failures remain (macOS `/var` vs `/private/var` path issue)

## 3vwxyw1q

- Updated plan command to load persisted model from config with priority: CLI --model > config.model > DEFAULT_MODEL
- Used nullish coalescing (`??`) instead of logical OR (`||`) for proper handling of empty string values
- Pattern: Model resolution at the command layer (src/commands/plan.ts) before calling the business logic function keeps concerns separated
- The resolved model is passed directly to `runPlanningMode()` - the function's internal DEFAULT_MODEL fallback still exists but will never be reached since we now always pass a resolved model
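
A small sketch of why the operator choice matters for the `CLI --model > config.model > DEFAULT_MODEL` chain. The function names and the `DEFAULT_MODEL` value below are placeholders for illustration, not the project's code.

```typescript
// Placeholder default; the real value lives in src/constants.
const DEFAULT_MODEL = "anthropic/claude-opus-4-5";

// `??` falls through only on null or undefined.
function resolveWithNullish(cli?: string, config?: string): string {
  return cli ?? config ?? DEFAULT_MODEL;
}

// `||` falls through on any falsy value, including the empty string.
function resolveWithOr(cli?: string, config?: string): string {
  return cli || config || DEFAULT_MODEL;
}

// With an empty-string CLI value the two differ:
// resolveWithNullish("", "openai/gpt-4") keeps ""
// resolveWithOr("", "openai/gpt-4") silently picks "openai/gpt-4"
```

Keeping the empty string means a later validation step can reject it explicitly instead of a lower-priority source masking it.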

## 4mmqn1x7

- Implemented model loading from config in run command with same priority as plan: CLI --model flag > config.model > DEFAULT_MODEL
- Key design: Created `resolveModel()` function that returns a `{ model, source }` object to enable logging which source was used
- Added `ModelSource` type ("flag" | "config" | "default") and passed `modelSource` through `LoopOptions` to `runLoop()`
- Pattern: Descriptive log output shows model source in parentheses: "Model: anthropic/claude-opus-4-5 (from config)"
- Reused existing `loadIterationConfig()` from src/config.ts - keeps config loading centralized
- Important: Model validation happens in index.ts via `validateModelOrExit()` before `run()` is called - the resolved model is already validated
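
The priority order and source tracking described above might look roughly like this. The signature of `resolveModel()` is an assumption (the real function is not shown in this diff and presumably loads the config itself), `describeModel` is a hypothetical helper, and only the "(from config)" label is confirmed by the notes; the other labels are invented for the sketch.

```typescript
type ModelSource = "flag" | "config" | "default";

interface ResolvedModel {
  model: string;
  source: ModelSource;
}

// Placeholder default; the real value lives in src/constants.
const DEFAULT_MODEL = "anthropic/claude-opus-4-5";

// Priority: CLI --model flag > config.model > DEFAULT_MODEL.
function resolveModel(
  flagModel: string | undefined,
  configModel: string | undefined
): ResolvedModel {
  if (flagModel !== undefined) return { model: flagModel, source: "flag" };
  if (configModel !== undefined) return { model: configModel, source: "config" };
  return { model: DEFAULT_MODEL, source: "default" };
}

// Hypothetical helper that formats the log line with the source in parentheses.
function describeModel({ model, source }: ResolvedModel): string {
  const labels: Record<ModelSource, string> = {
    flag: "from --model flag", // assumed wording
    config: "from config",
    default: "default", // assumed wording
  };
  return `Model: ${model} (${labels[source]})`;
}
```

Returning the source alongside the model keeps the logging decision out of the resolution logic, so the loop can report where the model came from without re-deriving it.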

## qwnnb48t

- Added interactive model prompt to iterate command as step 4 (after archive/backup steps, before planning)
- Key pattern: Used `createInterface` from `node:readline/promises` with a while loop for re-prompting on validation errors
- TTY detection via `process.stdin.isTTY` determines whether to show interactive prompt or use default silently
- Three distinct code paths: (1) --model flag provided → validate and persist, (2) interactive TTY → prompt user, (3) non-interactive → use default
- Empty input on prompt returns undefined (uses default but doesn't persist), valid input persists to config.json
- The resolved model is passed to `runPlanningMode()` if user chooses to plan, ensuring consistency
- Pre-existing prune.test.ts failures (macOS /var vs /private/var) are unrelated to this task

## vzd5ppbt

- Reused the promptForModel pattern from iterate.ts for init.ts - same TTY detection and validation flow
- Model prompt appears after directory creation, before the planning prompt (as specified in task)
- Three code paths identical to iterate: --model flag → validate/persist, interactive → prompt, non-interactive → default
- The init command's existing --model flag support was already passed to runPlanningMode - added validation and persist step before that call
- Pre-existing prune.test.ts failures (macOS /var vs /private/var path canonicalization) remain unrelated to this work
3 changes: 3 additions & 0 deletions bun.lock

Some generated files are not rendered by default.

47 changes: 42 additions & 5 deletions index.ts
@@ -8,6 +8,7 @@ import { plan } from "./src/commands/plan";
import { prune } from "./src/commands/prune";
import { DEFAULT_MODEL } from "./src/constants";
import { migrateTasksToDexIfNeeded } from "./src/migrate-to-dex";
import { validateModel, SUPPORTED_PROVIDERS } from "./src/model";

// ANSI colors
const colors = {
@@ -41,7 +42,10 @@ ${colors.bold}COMMANDS${colors.reset}
${colors.cyan}help${colors.reset} Show this help message

${colors.bold}OPTIONS${colors.reset}
- ${colors.dim}--model <model>${colors.reset} Model to use (default: ${DEFAULT_MODEL})
+ ${colors.dim}-m, --model <model>${colors.reset} Model to use in provider/model format
+ (e.g., openai/gpt-4, anthropic/claude-3-opus)
+ Supported providers: ${SUPPORTED_PROVIDERS.join(", ")}
+ Default: ${DEFAULT_MODEL}
${colors.dim}--max-iterations <n>${colors.reset} Safety limit (default: 100)
${colors.dim}--pause <seconds>${colors.reset} Pause between iterations (default: 3)
${colors.dim}--no-plan${colors.reset} Skip planning mode after init/iterate
@@ -83,6 +87,11 @@ ${colors.bold}EXAMPLES${colors.reset}
`);
}

// Map short flags to their long equivalents
const SHORT_FLAGS: Record<string, string> = {
m: "model",
};

function parseArgs(args: string[]): Record<string, string | boolean> {
const parsed: Record<string, string | boolean> = {};
for (let i = 0; i < args.length; i++) {
@@ -91,17 +100,44 @@ function parseArgs(args: string[]): Record<string, string | boolean> {
if (arg.startsWith("--")) {
const key = arg.slice(2);
const next = args[i + 1];
- if (next && !next.startsWith("--")) {
+ if (next && !next.startsWith("-")) {
parsed[key] = next;
i++;
} else {
parsed[key] = true;
}
} else if (arg.startsWith("-") && arg.length === 2) {
// Short flag like -m
const shortKey = arg.slice(1);
const longKey = SHORT_FLAGS[shortKey] ?? shortKey;
const next = args[i + 1];
if (next && !next.startsWith("-")) {
parsed[longKey] = next;
i++;
} else {
parsed[longKey] = true;
}
}
}
return parsed;
}

/**
* Validate the model argument if provided. Exits with code 1 if invalid.
*/
function validateModelOrExit(model: string | boolean | undefined): string | undefined {
if (typeof model !== "string") {
// No model provided or --model without value, use default
return undefined;
}
const result = validateModel(model);
if (!result.valid) {
console.error(`${colors.red}Error: ${result.error}${colors.reset}`);
process.exit(1);
}
return model;
}

async function main() {
const [command, ...rest] = Bun.argv.slice(2);
const options = parseArgs(rest);
@@ -118,16 +154,17 @@ async function main() {
case "init":
await init({
skipPlan: !!options["no-plan"],
- model: options.model as string,
+ model: validateModelOrExit(options.model),
});
break;
case "plan":
await plan({
- model: options.model as string | undefined,
+ model: validateModelOrExit(options.model),
quick: !!options.quick,
});
break;
case "run":
validateModelOrExit(options.model);
await run(options);
break;
case "status":
@@ -136,7 +173,7 @@ async function main() {
case "iterate":
await iterate({
skipPlan: !!options["no-plan"],
- model: options.model as string,
+ model: validateModelOrExit(options.model),
});
break;
case "prune":
3 changes: 2 additions & 1 deletion package.json
@@ -47,6 +47,7 @@
"@types/react": "^19.2.8",
"@types/react-dom": "^19.2.3",
"react": "^19.2.3",
- "react-dom": "^19.2.3"
+ "react-dom": "^19.2.3",
+ "zod": "^4.3.6"
}
}
88 changes: 87 additions & 1 deletion src/commands/init.ts
@@ -1,19 +1,77 @@
import { existsSync } from "node:fs";
import { mkdir } from "node:fs/promises";
import { join } from "node:path";
import { createInterface } from "node:readline/promises";
import { $ } from "bun";
import { PROMPT_TEMPLATE, LEARNINGS_TEMPLATE } from "../templates";
import { runPlanningMode, askToRunPlanning } from "../plan";
import { getTodoDir } from "../paths";
import { getDexDir, isDexAvailable } from "../dex";
import { DEFAULT_MODEL } from "../constants";
import { validateModel, SUPPORTED_PROVIDERS } from "../model";
import { saveIterationConfig } from "../config";

const colors = {
reset: "\x1b[0m",
green: "\x1b[32m",
yellow: "\x1b[33m",
cyan: "\x1b[36m",
red: "\x1b[31m",
};

/**
* Prompt user for implementation model with validation and re-prompt on error.
* Returns the validated model string or undefined if user skips.
*/
async function promptForModel(todoDir: string): Promise<string | undefined> {
const rl = createInterface({
input: process.stdin,
output: process.stdout,
});

try {
while (true) {
const answer = await rl.question(
`Enter model (${colors.cyan}provider/model${colors.reset}), or press Enter for default [${DEFAULT_MODEL}]: `
);

const trimmed = answer.trim();

// Empty input = use default, don't persist
if (!trimmed) {
console.log(
`${colors.green}✓${colors.reset} Using default model: ${DEFAULT_MODEL}`
);
rl.close();
return undefined;
}

// Validate the input
const result = validateModel(trimmed);
if (!result.valid) {
console.log(`${colors.red}✗${colors.reset} ${result.error}`);
console.log(
`${colors.yellow}Supported providers: ${SUPPORTED_PROVIDERS.join(", ")}${colors.reset}`
);
// Re-prompt
continue;
}

// Valid input - persist to config
saveIterationConfig(todoDir, {
model: trimmed,
createdAt: new Date().toISOString(),
});
console.log(`${colors.green}✓${colors.reset} Model set to: ${trimmed}`);
rl.close();
return trimmed;
}
} catch {
rl.close();
return undefined;
}
}

export async function init(
options: { skipPlan?: boolean; model?: string } = {}
) {
@@ -64,12 +122,40 @@ export async function init(
` ${colors.cyan}PROMPT.md${colors.reset} - System prompt with guardrails`
);
console.log(` ${colors.cyan}LEARNINGS.md${colors.reset} - Knowledge log`);
console.log();

// Model configuration
let resolvedModel: string | undefined = options.model;

if (options.model) {
// If --model flag was provided, validate and persist if valid
const result = validateModel(options.model);
if (!result.valid) {
throw new Error(result.error);
}
saveIterationConfig(todoDir, {
model: options.model,
createdAt: new Date().toISOString(),
});
console.log(
`${colors.green}✓${colors.reset} Using model from --model flag: ${options.model}`
);
} else if (process.stdin.isTTY) {
// Interactive mode: prompt for model
resolvedModel = await promptForModel(todoDir);
} else {
// Non-interactive mode: use default, don't persist
console.log(
`${colors.green}✓${colors.reset} Using default model: ${DEFAULT_MODEL}`
);
}
console.log();

// Ask to run planning mode unless --no-plan flag
if (!options.skipPlan) {
const shouldPlan = await askToRunPlanning();
if (shouldPlan) {
- await runPlanningMode({ todoDir, options: { model: options.model } });
+ await runPlanningMode({ todoDir, options: { model: resolvedModel } });
return;
}
}