feat: project AGENTS.md instructions and DuckDB federation console by tobias-gp · Pull Request #66 · archmaxai/archmax

tobias-gp · 2026-06-04T11:58:32Z

Summary

Project-root AGENTS.md (semantic model agent)

Load an optional, user-authored AGENTS.md at the project root into the semantic-model authoring agent via the Deep Agents memory feature (no custom file reading; a missing file is tolerated). Adds base system-prompt guidance instructing the agent to follow those project-specific instructions.
Repurpose the AGENTS.md slot: stop auto-generating the model-summary AGENTS.md (which nothing in the app ever consumed) and remove regenerateAgentsMd + its write()/delete() call sites.
On startup, remove stale auto-generated AGENTS.md files (identified by their # Semantic Models header), preserving user-authored files. Idempotent.
Docs + conventions updated (apps/docs semantic-models guide, openspec/project.md); OpenSpec change archived with applied spec deltas (semantic-model-agent +1, semantic-models +1/-1).

DuckDB federation console

Add Data Federation → Console (/$projectId/connections/console) for ad-hoc federated SQL against the project's DuckDB instance (raw connection slugs; not MCP scoped VIEWs).
Setup commands panel: copyable pre-installed INSTALL/LOAD pairs, redacted per-connection ATTACH examples, and an example federation query.
Install / Load control for validated single-statement INSTALL <name> [FROM community] or LOAD <name>.
API: GET/POST /api/projects/:projectId/duckdb-console/{setup,query,extensions}; core service with console-specific SQL validation and credential redaction in errors.
OpenSpec proposal: openspec/changes/add-duckdb-federation-console/.
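
The credential redaction mentioned above for ATTACH examples and error messages might look roughly like the following. This is a minimal sketch: `redactCredentials` and both patterns are hypothetical and are not taken from the PR's core service, which may redact differently.

```typescript
// Hypothetical helper, not the PR's implementation: masks secrets in text
// before it is shown in an ATTACH example or an error message.
function redactCredentials(text: string): string {
  return text
    // URL style: scheme://user:secret@host becomes scheme://user:***@host
    .replace(/(\w+:\/\/[^\s:/@]+:)[^\s@]+(@)/g, "$1***$2")
    // key=value style: password=secret becomes password=***
    .replace(/\b(password|secret|token)\s*=\s*[^\s;'"]+/gi, "$1=***");
}

// The connection strings below are made-up examples.
console.log(redactCredentials("ATTACH 'postgres://app:hunter2@db.internal:5432/sales' AS sales"));
console.log(redactCredentials("host=db.internal password=hunter2 dbname=sales"));
```

Applying the same function to both the setup examples and caught error messages would keep the two redaction paths consistent.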

Test plan

pnpm typecheck (incl. @archmax/api build) — exits 0
pnpm lint — exits 0
npx vitest run — passes; includes duckdb-console unit + API integration tests
openspec validate add-duckdb-federation-console --strict
Manual (AGENTS.md): drop an AGENTS.md in a project root and confirm the agent follows it; confirm a brand-new project (no file) starts without error
Manual (console): open Data Federation → Console, run SELECT 1, copy a setup command, install a community extension

Notes

AGENTS.md edge case (documented in design.md): a user-authored file beginning with # Semantic Models would be removed by startup cleanup.
Console extensions apply to the API process in-memory DuckDB instance; the worker has its own cache until rebuilt (documented in the data-federation guide).
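
The AGENTS.md edge case in the notes above can be illustrated with a small sketch. It assumes the legacy check is a plain prefix match on the `# Semantic Models` header, as the description states; `looksLegacy` and the sample file contents are hypothetical.

```typescript
// Assumed to match the header named in the PR description. The real constant
// (LEGACY_AGENTS_MD_SIGNATURE) lives in the codebase and its value is not shown here.
const LEGACY_SIGNATURE = "# Semantic Models";

// Hypothetical helper: true when startup cleanup would treat the content as legacy.
function looksLegacy(content: string): boolean {
  return content.startsWith(LEGACY_SIGNATURE);
}

const autoGenerated = "# Semantic Models\n\n## orders\n";
const userAuthored = "# Project instructions\n\nPrefer snake_case names.\n";
const unluckyUserFile = "# Semantic Models we care about\n\nAlways add descriptions.\n";

console.log(looksLegacy(autoGenerated));   // true
console.log(looksLegacy(userAuthored));    // false
console.log(looksLegacy(unluckyUserFile)); // true: the documented edge case
```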

Note

Medium Risk
Console runs arbitrary read-oriented SQL against raw federated catalogs (bypassing MCP scoped VIEWs) and can install DuckDB extensions on the API process instance; AGENTS.md content is injected into the agent system prompt.

Overview
This PR adds two operator-facing capabilities and the supporting specs/docs.

Optional project-root AGENTS.md for the semantic-model builder: The authoring agent now loads AGENTS.md via Deep Agents memory: ["AGENTS.md"], with base prompt guidance to treat it as authoritative project instructions. Auto-generation of the unused model-summary AGENTS.md on every model write/delete is removed. API startup runs idempotent cleanup that deletes only legacy files whose content starts with # Semantic Models, preserving user-authored files.

DuckDB Federation Console: New Data Federation → Console route and authenticated API (GET /setup, POST /query, POST /extensions) backed by @archmax/core services with read-only query allowlists, validated INSTALL/LOAD, timeouts, and redacted errors/attach examples. The UI is a single SQL textarea: Run sends queries or extension statements (by leading keyword), shows tabular results, and disables run when there are no active connections (setup is used for that gate, not a copy panel in the page). Docs and OpenSpec deltas cover console behavior, sidebar nav, and the AGENTS.md convention.
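
The routing "by leading keyword" described above could be sketched as follows. The helper names are hypothetical; only the endpoint paths come from the PR description, and the request body shape is not shown in the PR, so it is left out.

```typescript
type ConsoleTarget = "query" | "extensions";

// Hypothetical helper: INSTALL and LOAD go to the extensions endpoint,
// everything else goes to the query endpoint.
function routeStatement(sql: string): ConsoleTarget {
  const keyword = (sql.trim().split(/\s+/)[0] ?? "").toUpperCase();
  return keyword === "INSTALL" || keyword === "LOAD" ? "extensions" : "query";
}

// Builds the API path from the route pattern listed in the PR summary.
function consoleEndpoint(projectId: string, sql: string): string {
  return `/api/projects/${encodeURIComponent(projectId)}/duckdb-console/${routeStatement(sql)}`;
}

console.log(consoleEndpoint("demo", "  install httpfs")); // /api/projects/demo/duckdb-console/extensions
console.log(consoleEndpoint("demo", "SELECT 1"));         // /api/projects/demo/duckdb-console/query
```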

Reviewed by Cursor Bugbot for commit 6060afc. Bugbot is set up for automated code reviews on this repo.

Let project owners steer the semantic-model authoring agent with an optional, user-authored AGENTS.md at the project root, loaded via the Deep Agents `memory` feature (no custom file reading; missing file is tolerated). Add base system-prompt guidance telling the agent to follow those instructions. Repurpose the AGENTS.md slot: stop auto-generating the model-summary AGENTS.md (which nothing consumed) and remove stale auto-generated files on startup (identified by their `# Semantic Models` header), preserving user-authored files. Update docs and project conventions accordingly. Co-authored-by: Cursor <cursoragent@cursor.com>

railway-app · 2026-06-04T11:58:47Z

🚅 Deployed to the archmax-pr-66 environment in archmax SemLayer

| Service | Status | Updated (UTC) |
| --- | --- | --- |
| archmax_standalone_with_volume | ✅ Success | Jun 4, 2026 at 12:23 pm |
| archmax_standalone | ✅ Success | Jun 4, 2026 at 12:22 pm |
| archmax_external_dbs | ✅ Success | Jun 4, 2026 at 12:22 pm |

cursor · 2026-06-04T12:00:11Z

+const removedLegacyAgentsMd = await new SemanticModelFileService(getEnv().projectsDir).cleanupLegacyAgentsMd();
+if (removedLegacyAgentsMd > 0) {
+  console.log(`[startup] Removed ${removedLegacyAgentsMd} legacy auto-generated AGENTS.md file(s)`);
+}


Worker loads legacy before cleanup

Medium Severity

Legacy AGENTS.md cleanup runs only during API startup, while the BullMQ worker starts in parallel and creates the authoring agent with memory: ["AGENTS.md"] without running cleanup. Jobs can load the stale auto-generated summary as project instructions until API cleanup finishes, and worker-only runs never remove legacy files.

Reviewed by Cursor Bugbot for commit 81832b0.
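
One possible way to address the ordering concern above is to have every entry point await the same once-only cleanup before it creates the authoring agent. The sketch below is an assumption about how that could be wired, not the PR's fix; `once` and `ensureLegacyCleanup` are hypothetical names, and the cleanup body is a stand-in.

```typescript
// Hypothetical once-only guard: all callers share a single in-flight promise.
function once<T>(fn: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | undefined;
  return () => (pending ??= fn());
}

let cleanupRuns = 0;

// Stand-in for the real cleanup call (cleanupLegacyAgentsMd in the PR).
const ensureLegacyCleanup = once(async () => {
  cleanupRuns++;
  return 0; // number of files removed in this stand-in
});

// Both the API startup path and a worker job handler would await this
// before constructing the agent with memory: ["AGENTS.md"].
async function beforeCreatingAgent(): Promise<void> {
  await ensureLegacyCleanup();
}
```

If the API and the worker run as separate processes, each process would run the cleanup once; the PR describes the cleanup as idempotent, so that should be harmless.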

cursor · 2026-06-04T12:00:11Z

-        lines.push("", "**Metrics:**", ...model.metrics.map((m) => `- ${m.name}`));
+      if (content.startsWith(LEGACY_AGENTS_MD_SIGNATURE)) {
+        await unlink(agentsPath).catch(() => {});
+        removed++;


Counts removal when unlink fails

Low Severity

cleanupLegacyAgentsMd increments its removed count whenever the legacy header matches, even if unlink fails (errors are swallowed). Startup can log that legacy files were removed while the stale AGENTS.md remains and may still be loaded into the agent.

Reviewed by Cursor Bugbot for commit 81832b0.
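
A minimal sketch of the counting fix suggested above: increment only after the delete resolves. File access is injected here so the example runs without a filesystem; the function shape, the parameter names, and the signature value are assumptions, not the PR's code.

```typescript
// Assumed value, based on the header named in the PR description.
const LEGACY_AGENTS_MD_SIGNATURE = "# Semantic Models";

type ReadFile = (path: string) => Promise<string>;
type Unlink = (path: string) => Promise<void>;

// Hypothetical variant of cleanupLegacyAgentsMd with injected I/O.
async function cleanupLegacyAgentsMd(
  paths: string[],
  readFile: ReadFile,
  unlink: Unlink,
): Promise<number> {
  let removed = 0;
  for (const agentsPath of paths) {
    // A missing or unreadable file is tolerated and simply skipped.
    const content = await readFile(agentsPath).catch(() => null);
    if (content === null || !content.startsWith(LEGACY_AGENTS_MD_SIGNATURE)) continue;
    try {
      await unlink(agentsPath);
      removed++; // counted only after the delete actually succeeded
    } catch {
      // Leave the count unchanged so the startup log stays accurate.
    }
  }
  return removed;
}
```

With this shape, a failed delete no longer inflates the number reported in the `[startup]` log line.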

cursor

Security review completed for the changed files in this PR. I did not find any concrete issues against the requested threat surfaces.

Checked:

MCP endpoint auth: the PR does not change apps/api/src/mcp/archmax-route.ts; existing flow authenticates bearer tokens and binds sessions to token/project before registerArchmaxTools() runs.
Query execution sandboxing: the PR does not change execute_query; existing executeScopedQuery() still checks token model scope, validates SQL via validateSqlAst(..., { mode: "mcp" }), materializes model views, opens DuckDB read-only, enforces timeout, and caps results at 1000 rows.
Admin auth / Better Auth: no Better Auth config changes; existing env validation requires BETTER_AUTH_SECRET min length 32 and production cookies are Secure, HttpOnly, SameSite=Lax.
API input validation: no Hono request handlers or schemas were changed; the new startup cleanup has no request input.
Environment secrets: no .env.local or hardcoded real secrets added; the only new key-like value is test-key in a unit-test mock.
Dependency exposure: no package.json or pnpm-lock.yaml changes, so no new dependency exposure to audit.

No inline findings.

Sent by Cursor Automation: archmax Security Review

Add an admin console to run read-oriented federated SQL, install/load extensions, and copy setup commands (INSTALL/LOAD/ATTACH). Includes API routes, core service, frontend page, docs, and OpenSpec proposal. Co-authored-by: Cursor <cursoragent@cursor.com>

cursor · 2026-06-04T12:04:47Z

+  if (!CONSOLE_ALLOWED_KEYWORDS.has(keyword)) {
+    throw new Error(`Statement type ${keyword} is not allowed; use SELECT, WITH, SHOW, DESCRIBE, or EXPLAIN`);
+  }
+}


EXPLAIN ANALYZE bypasses denylist

Medium Severity

Federation console query validation only inspects the first SQL keyword, so statements starting with EXPLAIN pass even when EXPLAIN ANALYZE wraps INSERT, UPDATE, DELETE, or other denied types. That undermines the console’s read-only denylist because EXPLAIN ANALYZE executes the inner statement.

Reviewed by Cursor Bugbot for commit f8e9939.
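
A minimal sketch of how the first-keyword check could be tightened for the EXPLAIN ANALYZE case described above. The allowlist is taken from the error message in the diff; `ANALYZE_SAFE_KEYWORDS` and the function name are hypothetical. The check is token-based, so it does not account for SQL comments between keywords, and a leading WITH says nothing about what follows the CTE; AST-based validation would be more robust.

```typescript
// Allowlist as listed in the error message of the diff above.
const CONSOLE_ALLOWED_KEYWORDS = new Set(["SELECT", "WITH", "SHOW", "DESCRIBE", "EXPLAIN"]);
// Hypothetical: statement types this sketch lets EXPLAIN ANALYZE wrap.
const ANALYZE_SAFE_KEYWORDS = new Set(["SELECT", "WITH"]);

function validateConsoleKeyword(sql: string): void {
  const tokens = sql.trim().split(/\s+/).map((t) => t.toUpperCase());
  const keyword = tokens[0] ?? "";
  if (!CONSOLE_ALLOWED_KEYWORDS.has(keyword)) {
    throw new Error(`Statement type ${keyword} is not allowed`);
  }
  // EXPLAIN ANALYZE runs the wrapped statement, so inspect that statement too.
  if (keyword === "EXPLAIN" && tokens[1] === "ANALYZE") {
    const inner = tokens[2] ?? "";
    if (!ANALYZE_SAFE_KEYWORDS.has(inner)) {
      throw new Error(`EXPLAIN ANALYZE ${inner} is not allowed`);
    }
  }
}

validateConsoleKeyword("EXPLAIN ANALYZE SELECT 1"); // passes
// validateConsoleKeyword("EXPLAIN ANALYZE DELETE FROM t") would throw
```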

github-actions · 2026-06-04T12:07:58Z

Docker image ready

docker pull ghcr.io/archmaxai/archmax:pr-66

Remove the setup-commands side panel and the separate extension field. One editor now runs queries and routes INSTALL/LOAD to the extension endpoint, with the Run action in the page header per UI guidelines. Update spec and data-federation docs to match. Co-authored-by: Cursor <cursoragent@cursor.com>

cursor

Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.

There are 4 total unresolved issues (including 3 from previous reviews).

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 6060afc.

cursor · 2026-06-04T12:23:34Z

+  const withoutTrailing = trimmed.replace(/;+\s*$/, "");
+  if (withoutTrailing.includes(";")) {
+    throw new Error("Only a single SQL statement is allowed");
+  }


Semicolon inside string rejected

Low Severity

Multi-statement detection uses a raw ; search on the trimmed SQL string, so a single SELECT that contains a semicolon inside a string literal is rejected as multiple statements even though it is one valid statement.

Reviewed by Cursor Bugbot for commit 6060afc.
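
A quote-aware check is one way to avoid the false positive described above. The sketch below is an assumption about how that could be done, not the PR's fix; `hasMultipleStatements` is a hypothetical name, and the scan handles only single- and double-quoted text with doubled-quote escapes (no SQL comments or other quoting forms).

```typescript
// Returns true when a ';' outside quotes is followed by more SQL text.
function hasMultipleStatements(sql: string): boolean {
  const text = sql.trim();
  let quote: string | null = null;
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (quote !== null) {
      if (ch === quote) {
        if (text[i + 1] === quote) i++; // doubled quote is an escaped quote
        else quote = null;              // closing quote
      }
      continue;
    }
    if (ch === "'" || ch === '"') {
      quote = ch;
    } else if (ch === ";" && text.slice(i + 1).replace(/[;\s]/g, "") !== "") {
      return true; // something other than trailing ';' or whitespace follows
    }
  }
  return false;
}

console.log(hasMultipleStatements("SELECT ';' AS s"));    // false
console.log(hasMultipleStatements("SELECT 1; SELECT 2")); // true
console.log(hasMultipleStatements("SELECT 1;;  "));       // false
```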

railway-app Bot temporarily deployed to archmax SemLayer / archmax-pr-66 June 4, 2026 11:58 Destroyed

cursor Bot reviewed Jun 4, 2026

railway-app Bot temporarily deployed to archmax SemLayer / archmax-pr-66 June 4, 2026 12:02 Destroyed

tobias-gp changed the title ~~feat(agent): optional project-root AGENTS.md as agent instructions~~ feat: project AGENTS.md instructions and DuckDB federation console Jun 4, 2026

cursor Bot reviewed Jun 4, 2026

railway-app Bot deployed to archmax SemLayer / archmax-pr-66 June 4, 2026 12:21 View deployment

cursor Bot reviewed Jun 4, 2026

tobias-gp wants to merge 3 commits into main from add-project-agents-md-instructions
