Skip to content

fix(web): reject non-active machine tokens in auth middleware#210

Open
karthikarunapuram8-dot wants to merge 4 commits into
alookai:mainfrom
karthikarunapuram8-dot:fix/auth-reject-non-active-machine-tokens
Open

fix(web): reject non-active machine tokens in auth middleware#210
karthikarunapuram8-dot wants to merge 4 commits into
alookai:mainfrom
karthikarunapuram8-dot:fix/auth-reject-non-active-machine-tokens

Conversation

@karthikarunapuram8-dot

Copy link
Copy Markdown

fix(web): reject non-active machine tokens in auth middleware

Problem

The Bearer-token branch of withAuth admitted any row returned by getMachineTokenByToken, regardless of status. Pending tokens authenticated as the owning user.

Tokens created by POST /api/machine-tokens are stored with status="pending" and only transition to "active" when the daemon calls POST /api/machine-tokens/activate with a hostname and runtimes. Until that activation step, the raw token exists in the DB and is returned to the user (so the UI can show it / pipe it to the daemon).

Because auth.ts did not check mt.status, anyone holding the raw pending token — clipboard sniffer, screen capture, browser extension that scrapes the page, malicious helper running on the same machine — could call any authenticated route as that user. Endpoints that do not require workspaceId (e.g. /api/me, workspace creation, listing/creating additional machine tokens) work without activation. If a token was created and never activated, the window stays open indefinitely; once cached, the KV cache (TTL 900s) keeps the pending state hot.

After activation the token is correctly invalidated in the cache, so legitimate daemons keep working — this change only closes the pre-activation gap.

Fix

  • src/web/src/lib/middleware/auth.ts: after the existing null check, also reject any token whose status is not exactly "active", with the same 401 invalid token response used for unknown tokens. Same shape, so it does not leak whether the token exists.

Tests

  • src/web/src/lib/middleware/auth.test.ts:
    • Existing happy-path mocks now set status: "active" (they previously omitted the field, which masked the bug).
    • New test asserts a status: "pending" token is rejected with 401 invalid token and the wrapped handler is never invoked.

Run: pnpm --filter @alook/web test (1027 passing, +1 vs main).

Verification

  • pnpm typecheck ✓
  • pnpm --filter @alook/web lint ✓ (only pre-existing warning in unrelated file)
  • pnpm --filter @alook/{shared,email-worker,web,cli,ws-do,test-utils} test ✓

Scope

Single-file behavioural change plus test. No schema, query, or API contract changes. No new dependencies.

The Bearer-token auth path looked up the machine token by its raw value
and admitted any row returned by getMachineTokenByToken, regardless of
status. Tokens created by POST /api/machine-tokens are stored with
status="pending" and only transitioned to "active" by the daemon
calling /api/machine-tokens/activate.

This meant a pending token authenticated as the owning user for any
endpoint that did not require a workspace scope (e.g. /api/me, workspace
creation, listing/creating additional machine tokens). Anyone who
obtained the raw token between issuance and activation — clipboard
sniffer, screen capture, browser extension, etc. — could impersonate
the user during that window, and indefinitely if the token was never
activated.

Reject tokens whose status is not exactly "active" with the same
401 "invalid token" response used for unknown tokens. Existing test
mocks updated to set status="active"; new test asserts pending tokens
are refused and the wrapped handler is not invoked.
@karthikarunapuram8-dot karthikarunapuram8-dot requested a review from a team as a code owner May 29, 2026 03:25
@gusye1234

Copy link
Copy Markdown
Contributor

Review — looks correct and well-scoped. One follow-up to flag.

The fix is right: getMachineTokenByToken (src/shared/src/db/queries/machine-token.ts:36) does select status, so mt.status !== "active" is type-safe, and returning the same 401 invalid token shape avoids leaking token existence. Tests cover the new reject path and correctly fix the happy-path mocks that previously masked the bug by omitting status. 👍

Follow-up (not a blocker for this PR): auth.ts is not the only Bearer-token entry point. The same getMachineTokenByToken → accept-any-non-null pattern exists in two other helpers with no status guard:

  • src/web/src/lib/dual-auth.ts:15 (requireAuth)
  • src/web/src/lib/api-auth.ts:16 (withToken)

Both would admit a pending token exactly as auth.ts did before this change. The mitigating factor is that neither helper appears to have any callers in the repo right now (grep finds zero usages outside their own files), so this is a latent gap rather than a live one. Recommend either (a) deleting the two unused helpers, or (b) adding the same status !== "active" check there in a follow-up so the guard doesn't silently regress if they're ever wired up. Worth a one-line comment in the activation flow pointing at the single source of truth.

CI: I don't see any CI checks reported on this branch yet — please make sure the pipeline runs green before merge (the change is small but the auth path is sensitive).

Otherwise this is solid and ready once CI is green. (Note: I only leave review comments — not posting a GitHub approval.)

@karthikarunapuram8-dot

Copy link
Copy Markdown
Author

Thanks for the review.

Opened #225 for the follow-up: deleted both dual-auth.ts and api-auth.ts rather than hardening them, since both have zero callers. withAuth is now the single source of truth for Bearer-token validation, and there's no other path that could quietly reintroduce the pending-token-accepted bug.

On CI: my local checks pass (pnpm typecheck across all six packages, pnpm --filter @alook/web lint, full @alook/web vitest suite at 1027 passing including the new test). I don't have permission to retrigger the GitHub Actions workflow from the fork side, so it likely needs your approval to run on this PR. Happy to rebase or adjust if there's anything else I should do to get the pipeline kicked off.

The POST-then-auth test in machine-token.test.ts was relying on the same
bug this PR closes. It created a fresh token (status="pending" by default)
and immediately used it to authenticate, expecting 200. The buggy withAuth
accepted any non-null row from getMachineTokenByToken, so the assertion
passed by accident. With the fix in place, pending tokens correctly get
401 and the test surfaced the bug instead of hiding it.

This mirrors the unit-test happy-path fix already in this PR: instead of
omitting status (which masks the gap), the test now activates the token
through /api/machine-tokens/activate and then asserts the activated token
authenticates as expected. Same flow a real CLI user would take.

Pattern is identical to the activation tests already in this file.
@karthikarunapuram8-dot

karthikarunapuram8-dot commented May 30, 2026

Copy link
Copy Markdown
Author

Took a look at the E2E failure. It's the same bug class this PR closes, in a different test.

src/web/src/test/e2e/machine-token.test.ts:45 (expected 401 to be 200). The POST /api/machine-tokens creates a new token test creates a fresh token and immediately uses it to authenticate. New tokens are status="pending" until activated. Before this PR's fix, withAuth accepted pending tokens, so the assertion passed by accident. With the fix in place, pending tokens correctly return 401 and the test surfaced the bug it was masking.

Same shape as the unit-test happy-path mocks you already noted ("masked the bug by omitting status"). The E2E version was probably harder to spot locally because it requires APP_URL running.

Fixed in 9891ec4 by adding the activation step before the auth check. The test now does the full create + activate + authenticate flow a real CLI user would take. Pattern matches the activation tests already in the same file. Other 33 e2e test files still pass.

Local checks green at 9891ec4: pnpm --filter @alook/web typecheck exit 0, pnpm --filter @alook/web lint exit 0 (one pre-existing unrelated warning, zero errors).

@gusye1234

Copy link
Copy Markdown
Contributor

Re-review of the follow-up changes — looks good, ready once CI is green.

Verified the two changes since my last pass:

  1. E2E fix (machine-token.test.ts) — the "creates a new token" test now activates the token before asserting auth works. Correct fix for a test that was masking the same bug class:

    • Activate payload { token, hostname, runtimes: [{ type, version }] } matches ActivateTokenRequestSchema and the two existing activation tests in the file.
    • Token is created under ?workspace_id=seed.workspaceId, so activation skips the "Personal" workspace-creation branch — no orphan workspace; the upserted machine/agent_runtime rows are cleaned by cleanupTestData.
    • No cache staleness: token T is never read through withAuth before activation, so cacheKeys.machineToken(T) is never populated with a pending entry; first read is post-activation → fresh active row.
  2. chore(web): remove unused Bearer-token auth helpers #225 — deleting the two unused helpers (dual-auth.ts, api-auth.ts) rather than hardening them is the right call given zero callers. withAuth is now the single source of truth. 👍

Core fix unchanged and still correct (mt.status !== "active" is type-safe; getMachineTokenByToken selects status).

Only open item: GitHub Actions CI still hasn't run on this branch (fork PR needs a maintainer to approve the workflow run). The change is small but the auth path is sensitive — please kick off CI before merge. (As always, I only leave review comments, no GitHub approval.)

@codecov

codecov Bot commented May 31, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@gusye1234

Copy link
Copy Markdown
Contributor

Daily-review ping (logic already reviewed above): no CI checks are currently reported on this branch and the merge state shows UNSTABLE. The code fix and the E2E follow-up both look good per the earlier reviews — it just needs CI to actually run. Suggest a rebase on latest main (or an empty re-push) to re-trigger the workflows so this can go green and merge.

@karthikarunapuram8-dot

Copy link
Copy Markdown
Author

Updated the branch to current main in bca1116 and pushed it to re-trigger CI. The PR diff is still scoped to:

  • src/web/src/lib/middleware/auth.ts
  • src/web/src/lib/middleware/auth.test.ts
  • src/web/src/test/e2e/machine-token.test.ts

Local verification at bca1116 is green:

pnpm install --frozen-lockfile
CI=true pnpm typecheck
CI=true pnpm lint
CI=true pnpm test
CI=true pnpm build
pnpm run db:migrate
pnpm test:e2e
pnpm --filter @alook/cli run test:integration

E2E passed with 35 files and 237 tests. CLI integration passed with 14 tests and 1 skipped.

GitHub created CI run 26723423056 for the new head, but it is still action_required behind the fork approval gate. Approval from this account returns 403 (Must have admin rights to Repository), so it still needs a maintainer approval to start.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants