feat: project AGENTS.md instructions and DuckDB federation console #66
tobias-gp wants to merge 3 commits into
Let project owners steer the semantic-model authoring agent with an optional, user-authored AGENTS.md at the project root, loaded via the Deep Agents `memory` feature (no custom file reading; missing file is tolerated). Add base system-prompt guidance telling the agent to follow those instructions. Repurpose the AGENTS.md slot: stop auto-generating the model-summary AGENTS.md (which nothing consumed) and remove stale auto-generated files on startup (identified by their `# Semantic Models` header), preserving user-authored files. Update docs and project conventions accordingly. Co-authored-by: Cursor <cursoragent@cursor.com>
🚅 Deployed to the archmax-pr-66 environment in archmax SemLayer
    const removedLegacyAgentsMd = await new SemanticModelFileService(getEnv().projectsDir).cleanupLegacyAgentsMd();
    if (removedLegacyAgentsMd > 0) {
      console.log(`[startup] Removed ${removedLegacyAgentsMd} legacy auto-generated AGENTS.md file(s)`);
    }
Worker loads legacy before cleanup
Medium Severity
Legacy AGENTS.md cleanup runs only during API startup, while the BullMQ worker starts in parallel and creates the authoring agent with memory: ["AGENTS.md"] without running cleanup. Jobs can load the stale auto-generated summary as project instructions until API cleanup finishes, and worker-only runs never remove legacy files.
Reviewed by Cursor Bugbot for commit 81832b0.
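One way to close this gap could be a small per-process gate that the worker awaits before it builds the authoring agent. The sketch below is only an illustration: `createCleanupGate` and `handleAuthoringJob` are hypothetical names, and the cleanup callback stands in for `cleanupLegacyAgentsMd()`. It relies on the cleanup being idempotent, as the PR states, so running it once in each process is safe.

```typescript
// Hypothetical type: an idempotent cleanup that resolves to the number of removed files.
type Cleanup = () => Promise<number>;

// Hypothetical per-process gate: starts the cleanup at most once and
// hands every caller the same promise.
function createCleanupGate(cleanup: Cleanup): Cleanup {
  let pending: Promise<number> | undefined;
  return () => {
    if (pending === undefined) {
      pending = cleanup();
    }
    return pending;
  };
}

// Hypothetical worker job handler: waits for cleanup before the agent
// (and its AGENTS.md memory) is created.
async function handleAuthoringJob<T>(
  ensureCleanup: Cleanup,
  createAgent: () => Promise<T>,
): Promise<T> {
  await ensureCleanup();
  return createAgent();
}
```

Note that a rejected cleanup stays cached in this sketch, so every later job would fail the same way; whether to retry or to proceed without cleanup is a separate design decision.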
    lines.push("", "**Metrics:**", ...model.metrics.map((m) => `- ${m.name}`));
    if (content.startsWith(LEGACY_AGENTS_MD_SIGNATURE)) {
      await unlink(agentsPath).catch(() => {});
      removed++;
Counts removal when unlink fails
Low Severity
cleanupLegacyAgentsMd increments its removed count whenever the legacy header matches, even if unlink fails (errors are swallowed). Startup can log that legacy files were removed while the stale AGENTS.md remains and may still be loaded into the agent.
Reviewed by Cursor Bugbot for commit 81832b0.
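A minimal sketch of how the count could track actual deletions rather than header matches. It is a standalone stand-in for `cleanupLegacyAgentsMd`, not the PR's code: the `FileOps` interface and the `failed` list are hypothetical, and the signature value is assumed from the `# Semantic Models` header named in the description.

```typescript
// Hypothetical seam for file access; in Node these could wrap
// readFile and unlink from "node:fs/promises".
interface FileOps {
  readFile(path: string): Promise<string>;
  unlink(path: string): Promise<void>;
}

// Assumed value, based on the header the PR description mentions.
const LEGACY_AGENTS_MD_SIGNATURE = "# Semantic Models";

// Hypothetical standalone variant: counts a file as removed only when unlink succeeds.
async function cleanupLegacyAgentsMd(
  paths: string[],
  ops: FileOps,
): Promise<{ removed: number; failed: string[] }> {
  let removed = 0;
  const failed: string[] = [];
  for (const agentsPath of paths) {
    let content: string;
    try {
      content = await ops.readFile(agentsPath);
    } catch {
      continue; // a missing or unreadable file is tolerated
    }
    if (!content.startsWith(LEGACY_AGENTS_MD_SIGNATURE)) continue; // user-authored: keep
    try {
      await ops.unlink(agentsPath);
      removed++;
    } catch {
      failed.push(agentsPath); // surfaced to the caller instead of being swallowed
    }
  }
  return { removed, failed };
}
```

With a result shaped like this, startup could log `failed` separately, so the "Removed N" message would no longer overstate what was deleted.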
Security review completed for the changed files in this PR. I did not find any concrete issues against the requested threat surfaces.
Checked:
- MCP endpoint auth: the PR does not change `apps/api/src/mcp/archmax-route.ts`; existing flow authenticates bearer tokens and binds sessions to token/project before `registerArchmaxTools()` runs.
- Query execution sandboxing: the PR does not change `execute_query`; existing `executeScopedQuery()` still checks token model scope, validates SQL via `validateSqlAst(..., { mode: "mcp" })`, materializes model views, opens DuckDB read-only, enforces timeout, and caps results at 1000 rows.
- Admin auth / Better Auth: no Better Auth config changes; existing env validation requires `BETTER_AUTH_SECRET` min length 32 and production cookies are `Secure`, `HttpOnly`, `SameSite=Lax`.
- API input validation: no Hono request handlers or schemas were changed; the new startup cleanup has no request input.
- Environment secrets: no `.env.local` or hardcoded real secrets added; the only new key-like value is `test-key` in a unit-test mock.
- Dependency exposure: no `package.json` or `pnpm-lock.yaml` changes, so no new dependency exposure to audit.
No inline findings.
Sent by Cursor Automation: archmax Security Review
Add an admin console to run read-oriented federated SQL, install/load extensions, and copy setup commands (INSTALL/LOAD/ATTACH). Includes API routes, core service, frontend page, docs, and OpenSpec proposal. Co-authored-by: Cursor <cursoragent@cursor.com>
      if (!CONSOLE_ALLOWED_KEYWORDS.has(keyword)) {
        throw new Error(`Statement type ${keyword} is not allowed; use SELECT, WITH, SHOW, DESCRIBE, or EXPLAIN`);
      }
    }
EXPLAIN ANALYZE bypasses denylist
Medium Severity
Federation console query validation only inspects the first SQL keyword, so statements starting with EXPLAIN pass even when EXPLAIN ANALYZE wraps INSERT, UPDATE, DELETE, or other denied types. That undermines the console’s read-only denylist because EXPLAIN ANALYZE executes the inner statement.
Reviewed by Cursor Bugbot for commit f8e9939.
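One possible tightening, sketched under assumptions: when the leading keyword is `EXPLAIN`, strip it (plus an optional `ANALYZE`) and apply the same allowlist to the wrapped statement. The allowlist contents are taken from the error message above; `leadingKeyword` and `assertReadOnlyStatement` are hypothetical names, and the sketch deliberately rejects anything it does not recognize, including parenthesized `EXPLAIN` options.

```typescript
// Allowlist assumed from the error message quoted in the diff.
const CONSOLE_ALLOWED_KEYWORDS = new Set(["SELECT", "WITH", "SHOW", "DESCRIBE", "EXPLAIN"]);

// Hypothetical helper: first alphabetic token of the statement, upper-cased ("" if none).
function leadingKeyword(sql: string): string {
  const match = /^\s*([A-Za-z]+)/.exec(sql);
  return match?.[1]?.toUpperCase() ?? "";
}

// Hypothetical validator: also checks the statement wrapped by EXPLAIN [ANALYZE].
function assertReadOnlyStatement(sql: string): void {
  const keyword = leadingKeyword(sql);
  if (!CONSOLE_ALLOWED_KEYWORDS.has(keyword)) {
    throw new Error(`Statement type ${keyword} is not allowed; use SELECT, WITH, SHOW, DESCRIBE, or EXPLAIN`);
  }
  if (keyword !== "EXPLAIN") return;

  // Drop "EXPLAIN" and an optional "ANALYZE", then validate what is left.
  const inner = sql.replace(/^\s*EXPLAIN\s+(ANALYZE\s+)?/i, "");
  const innerKeyword = leadingKeyword(inner);
  if (innerKeyword === "EXPLAIN" || !CONSOLE_ALLOWED_KEYWORDS.has(innerKeyword)) {
    throw new Error(`EXPLAIN may only wrap SELECT, WITH, SHOW, or DESCRIBE; got ${innerKeyword || "an unrecognized statement"}`);
  }
}
```

Keyword checks remain a heuristic. An AST-based check, along the lines of the `validateSqlAst` call mentioned in the security review, would likely be more robust if it can be reused for the console.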
Docker image ready: `docker pull ghcr.io/archmaxai/archmax:pr-66`
Remove the setup-commands side panel and the separate extension field. One editor now runs queries and routes INSTALL/LOAD to the extension endpoint, with the Run action in the page header per UI guidelines. Update spec and data-federation docs to match. Co-authored-by: Cursor <cursoragent@cursor.com>
Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.
There are 4 total unresolved issues (including 3 from previous reviews).
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 6060afc.
    const withoutTrailing = trimmed.replace(/;+\s*$/, "");
    if (withoutTrailing.includes(";")) {
      throw new Error("Only a single SQL statement is allowed");
    }
Semicolon inside string rejected
Low Severity
Multi-statement detection uses a raw ; search on the trimmed SQL string, so a single SELECT that contains a semicolon inside a string literal is rejected as multiple statements even though it is one valid statement.
Reviewed by Cursor Bugbot for commit 6060afc.
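A sketch of a quote-aware alternative to the raw `;` search: scan the SQL once, skip single-quoted strings, double-quoted identifiers, and comments, and count only the statements that have content. `countStatements` and `assertSingleStatement` are hypothetical names. The scanner assumes doubled quotes as the escape form and does not handle other literal forms such as dollar-quoted strings.

```typescript
// Hypothetical scanner: counts non-empty statements, ignoring semicolons
// inside quoted strings, quoted identifiers, and comments.
function countStatements(sql: string): number {
  let count = 0;
  let hasContent = false; // current statement has content outside comments
  let i = 0;
  while (i < sql.length) {
    const ch = sql.charAt(i);
    const next = sql.charAt(i + 1);
    if (ch === "'" || ch === '"') {
      hasContent = true;
      i++; // step past the opening quote
      while (i < sql.length) {
        if (sql.charAt(i) === ch) {
          if (sql.charAt(i + 1) === ch) {
            i += 2; // doubled quote is an escaped quote
            continue;
          }
          break;
        }
        i++;
      }
      i++; // step past the closing quote
    } else if (ch === "-" && next === "-") {
      while (i < sql.length && sql.charAt(i) !== "\n") i++; // line comment
    } else if (ch === "/" && next === "*") {
      const end = sql.indexOf("*/", i + 2);
      i = end === -1 ? sql.length : end + 2; // block comment
    } else if (ch === ";") {
      if (hasContent) count++;
      hasContent = false;
      i++;
    } else {
      if (!/\s/.test(ch)) hasContent = true;
      i++;
    }
  }
  return hasContent ? count + 1 : count;
}

// Hypothetical guard mirroring the error message in the diff above.
function assertSingleStatement(sql: string): void {
  if (countStatements(sql) > 1) {
    throw new Error("Only a single SQL statement is allowed");
  }
}
```

Trailing semicolons and trailing comments do not count as extra statements here, which keeps the behavior of the original `replace(/;+\s*$/, "")` step for the common case.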




Summary
Project-root AGENTS.md (semantic model agent)
- Loads an optional, user-authored `AGENTS.md` at the project root into the semantic-model authoring agent via the Deep Agents `memory` feature (no custom file reading; a missing file is tolerated). Adds base system-prompt guidance instructing the agent to follow those project-specific instructions.
- Repurposes the `AGENTS.md` slot: stop auto-generating the model-summary `AGENTS.md` (which nothing in the app ever consumed) and remove `regenerateAgentsMd` + its `write()`/`delete()` call sites.
- On startup, removes legacy auto-generated `AGENTS.md` files (identified by their `# Semantic Models` header), preserving user-authored files. Idempotent.
- Updates docs and project conventions (`apps/docs` semantic-models guide, `openspec/project.md`); OpenSpec change archived with applied spec deltas (`semantic-model-agent` +1, `semantic-models` +1/-1).

DuckDB federation console
- Admin console page (`/$projectId/connections/console`) for ad-hoc federated SQL against the project's DuckDB instance (raw connection slugs; not MCP scoped VIEWs).
- Setup commands: `INSTALL`/`LOAD` pairs, redacted per-connection `ATTACH` examples, and an example federation query.
- Extension statements: `INSTALL <name> [FROM community]` or `LOAD <name>`.
- API routes `GET/POST /api/projects/:projectId/duckdb-console/{setup,query,extensions}`; core service with console-specific SQL validation and credential redaction in errors.
- OpenSpec proposal in `openspec/changes/add-duckdb-federation-console/`.

Test plan
- `pnpm typecheck` (incl. `@archmax/api` build): exits 0
- `pnpm lint`: exits 0
- `npx vitest run`: passes; includes `duckdb-console` unit + API integration tests
- `openspec validate add-duckdb-federation-console --strict`
- Place an `AGENTS.md` in a project root and confirm the agent follows it; confirm a brand-new project (no file) starts without error
- Run `SELECT 1`, copy a setup command, install a community extension

Notes
- Known limitation (see `design.md`): a user-authored file beginning with `# Semantic Models` would be removed by startup cleanup.

Note
Medium Risk
Console runs arbitrary read-oriented SQL against raw federated catalogs (bypassing MCP scoped VIEWs) and can install DuckDB extensions on the API process instance; AGENTS.md content is injected into the agent system prompt.
Overview
This PR adds two operator-facing capabilities and the supporting specs/docs.
Optional project-root `AGENTS.md` for the semantic-model builder: The authoring agent now loads `AGENTS.md` via Deep Agents `memory: ["AGENTS.md"]`, with base prompt guidance to treat it as authoritative project instructions. Auto-generation of the unused model-summary `AGENTS.md` on every model write/delete is removed. API startup runs idempotent cleanup that deletes only legacy files whose content starts with `# Semantic Models`, preserving user-authored files.

DuckDB Federation Console: New Data Federation → Console route and authenticated API (`GET /setup`, `POST /query`, `POST /extensions`) backed by `@archmax/core` services with read-only query allowlists, validated `INSTALL`/`LOAD`, timeouts, and redacted errors/attach examples. The UI is a single SQL textarea: Run sends queries or extension statements (by leading keyword), shows tabular results, and disables run when there are no active connections (setup is used for that gate, not a copy panel in the page). Docs and OpenSpec deltas cover console behavior, sidebar nav, and the `AGENTS.md` convention.

Reviewed by Cursor Bugbot for commit 6060afc. Bugbot is set up for automated code reviews on this repo.