feat(email): replace Mailgun with Resend + Cloudflare inbound email routing#24
Conversation
- Rename findOutboxMessageByMailgunId → findOutboxMessageByProviderId in memory.js - Rename mailgun_id field → provider_message_id in storage scan and outbox service - Update _findOutboxMessageByMailgunId → _findOutboxMessageByProviderId in outbox.service.js - Update tests 69-71 to use new method and field names - mech.js renamed separately (gitignored private adapter) Related to tasks/0004-prd-email-inbound-outbound.md Task 1.0
- Replace _mailgunRequest/FormData with _resendRequest using Bearer auth + JSON
- addDomain: POST /domains JSON, stores resend_domain_id + provider_state
- _getDomainDnsRecords: parses Resend domain response records[]
- verifyDomain: POST /domains/{resend_domain_id}/verify
- _attemptSend: POST /emails, to as array, stores provider_message_id
- handleWebhook: reads event.type + event.data.email_id
- verifyWebhookSignature: Svix format (svixId, svixTimestamp, svixSignature, rawBody)
- Rename /webhooks/mailgun → /webhooks/resend with express.raw() middleware
- Mount webhook router before express.json() to capture raw body for Svix verification
- Update env var warnings: RESEND_API_KEY, RESEND_WEBHOOK_SECRET
- Update all tests to use _resendRequest stubs, RESEND_* env vars, Resend event format
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add src/utils/email.js with agentEmailAddress() helper
Format: {tenantId}.{agentId}@{domain} (with tenant) or {agentId}@{domain}
Domain defaults to INBOUND_EMAIL_DOMAIN env var or agentdispatch.io
- Add email_address field to GET /api/agents/:agentId response
- Add POST /api/webhooks/email/inbound route
Verifies X-Webhook-Secret against INBOUND_EMAIL_SECRET (timingSafeEqual)
Resolves agent by to_agent + optional to_namespace (tenant match)
Delivers to inbox via inboxService.send() with verify_signature:false
Encodes from_email as email:{local}.at.{domain} to satisfy ADMP validator
- Mount emailInboundRouter in server.js, add INBOUND_EMAIL_SECRET warning
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Scaffold workers/email-ingestion/ with package.json, wrangler.toml, index.ts
- parseRecipient(): parse {namespace}.{agentId}@Domain into structured parts
- acme.alice → namespace=acme, agentId=alice
- acme.alice.v2 → namespace=acme, agentId=alice.v2 (dots preserved)
- alice → namespace=null, agentId=alice
- email() handler: parse MIME with postal-mime, POST to ADMP /webhooks/email/inbound
- Passes X-Webhook-Secret for authentication
- Calls message.setReject() on 404 (unknown agent)
- Logs transient errors without bouncing
- 5 unit tests for parseRecipient (all pass)
- README: deployment guide, secrets setup, email routing config
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Remove MAILGUN_API_KEY, MAILGUN_API_URL, MAILGUN_WEBHOOK_SIGNING_KEY from .env.example - Add RESEND_API_KEY, RESEND_WEBHOOK_SECRET, INBOUND_EMAIL_SECRET, INBOUND_EMAIL_DOMAIN - Add Email section to README covering inbound (Cloudflare Worker) and outbound (Resend) - Update Production Checklist with email env vars Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ndpoints - outbox.service.js: add Svix timestamp freshness check (±5 min window) to verifyWebhookSignature() to prevent replay attacks on /webhooks/resend - email-inbound.js: fail closed when INBOUND_EMAIL_SECRET is not configured (returns 500 SERVER_MISCONFIGURATION instead of silently skipping auth) - email-inbound.js: sanitize 500 error responses (log error, return generic message to avoid leaking internal details) - server.js: update startup warning to reflect new fail-closed behavior - server.test.js: update 4 tests to provide auth headers; add test for SERVER_MISCONFIGURATION when secret is unconfigured Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Agent email address format ({namespace}.{agent_id}@agentdispatch.io)
- How to retrieve email_address from GET /api/agents/:id
- Receiving email: message type, body fields, from-field encoding
- Sending email: three-step domain setup, send endpoint, status values
- Full domain management endpoint reference table
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
Greptile SummaryThis PR replaces the Mailgun integration with Resend for outbound email and introduces a new Cloudflare Worker + ADMP inbound webhook pipeline for receiving email into agent inboxes. The overall architecture is sound and the security hardening additions (Svix timestamp freshness check, fail-closed Two issues need attention before this goes to production:
Confidence Score: 2/5
|
| Filename | Overview |
|---|---|
| src/services/outbox.service.js | Full rewrite from Mailgun to Resend; introduces a critical Svix secret handling bug where the whsec_ prefix is not stripped and base64-decoded before use as the HMAC key, causing webhook signature verification to always fail in production. |
| src/routes/email-inbound.js | New inbound email webhook route; contains a logic bug where tenanted agents are reachable at their un-namespaced address when to_namespace is null, allowing cross-namespace email delivery. |
| workers/email-ingestion/index.ts | New Cloudflare Worker that parses MIME via postal-mime and forwards to ADMP; logic is correct and handles 404 bounces properly; domain parameter in parseRecipient is accepted but not validated against the actual recipient domain, though domain filtering happens upstream at Cloudflare. |
| src/routes/outbox.js | Updated Mailgun → Resend webhook route with correct express.raw() placement before express.json(); minor inconsistency where the 500 error catch block still leaks error.message contrary to the PR's stated security hardening goal. |
| src/utils/email.js | New helper for generating agent email addresses; DEFAULT_DOMAIN is resolved at module load time rather than dynamically, which can cause stale values in tests that override INBOUND_EMAIL_DOMAIN after import. |
Sequence Diagram
sequenceDiagram
participant Sender as Email Sender
participant CF as Cloudflare Email Routing
participant Worker as CF Worker (email-ingestion)
participant ADMP as ADMP Server
participant Resend as Resend API
Note over Sender,ADMP: Inbound Email Flow
Sender->>CF: Send email to {namespace}.{agentId}@agentdispatch.io
CF->>Worker: Forward via Email Routing (catch-all)
Worker->>Worker: parseRecipient() → namespace + agentId
Worker->>Worker: PostalMime.parse(raw MIME)
Worker->>ADMP: POST /api/webhooks/email/inbound<br/>(X-Webhook-Secret header)
ADMP->>ADMP: timingSafeEqual(secret)
ADMP->>ADMP: storage.getAgent(to_agent)
ADMP->>ADMP: inboxService.send(type: email)
ADMP-->>Worker: 200 OK
Worker-->>CF: (complete)
Note over ADMP,Resend: Outbound Email Flow
ADMP->>ADMP: POST /api/agents/:id/outbox/send
ADMP->>ADMP: Verify domain status = verified
ADMP->>Resend: POST /emails (Bearer auth)
Resend-->>ADMP: { id: re_xxx }
ADMP->>ADMP: store provider_message_id
Note over Resend,ADMP: Delivery Status Webhook
Resend->>ADMP: POST /api/webhooks/resend<br/>(svix-id, svix-timestamp, svix-signature)
ADMP->>ADMP: verifyWebhookSignature (HMAC-SHA256)
ADMP->>ADMP: updateOutboxMessage(status)
ADMP-->>Resend: 200 OK
Last reviewed commit: 8b0b3b0
| // --- Agent resolution --- | ||
| let agent = await storage.getAgent(to_agent); | ||
| if (to_namespace && agent && agent.tenant_id !== to_namespace) { | ||
| agent = null; | ||
| } | ||
|
|
||
| if (!agent) { | ||
| return res.status(404).json({ | ||
| error: 'AGENT_NOT_FOUND', | ||
| message: `Agent ${to_agent} not found` | ||
| }); | ||
| } |
There was a problem hiding this comment.
Tenanted agents reachable at their un-namespaced address
When to_namespace is null (email sent to alice@agentdispatch.io), the namespace-mismatch check on line 88 is short-circuited by the to_namespace && guard. An agent that belongs to tenant acme (whose correct address is acme.alice@agentdispatch.io) will still receive mail addressed to alice@agentdispatch.io because the check is skipped entirely.
The fix is to also reject the delivery when the incoming email has no namespace but the resolved agent does have a tenant_id:
let agent = await storage.getAgent(to_agent);
if (agent) {
const namespaceMismatch =
(to_namespace && agent.tenant_id !== to_namespace) ||
(!to_namespace && agent.tenant_id);
if (namespaceMismatch) {
agent = null;
}
}| * Email address helpers for ADMP agent email addresses | ||
| */ | ||
|
|
||
| const DEFAULT_DOMAIN = process.env.INBOUND_EMAIL_DOMAIN || 'agentdispatch.io'; |
There was a problem hiding this comment.
Module-level DEFAULT_DOMAIN snapshot prevents runtime override
DEFAULT_DOMAIN is resolved once at import time. Every other sensitive env var in this codebase (RESEND_API_KEY, RESEND_WEBHOOK_SECRET, INBOUND_EMAIL_SECRET) is read dynamically via a getter function so tests can override them with process.env.XXX = … after import. This constant will silently ignore any post-import change, which can cause agentEmailAddress() to return stale values in tests or multi-tenant scenarios.
| const DEFAULT_DOMAIN = process.env.INBOUND_EMAIL_DOMAIN || 'agentdispatch.io'; | |
| const DEFAULT_DOMAIN = process.env.INBOUND_EMAIL_DOMAIN || 'agentdispatch.io'; |
Consider wrapping in a getter instead:
function getEmailDomain() {
return process.env.INBOUND_EMAIL_DOMAIN || 'agentdispatch.io';
}and using getEmailDomain() inside agentEmailAddress.
Additional Comments (2)
Resend's dashboard provides webhook secrets in Svix format: The correct approach: Without this,
The PR description explicitly states a security goal: "500 catch blocks return generic message; full error logged server-side only." The inbound email route ( res.status(500).json({ error: 'WEBHOOK_FAILED', message: error.message });Resend is a trusted caller, so the risk is low, but consistency with the stated security hardening would suggest logging and returning a generic message here too. |
PR Review: feat(email): replace Mailgun with Resend + Cloudflare inbound email routingOverall this is a solid, well-scoped migration. The provider-agnostic renaming is clean, the fail-closed behavior on missing secrets is exactly right, and the test coverage for the new inbound webhook is thorough. A few items to address before merging: Critical: Potential bug with Svix HMAC secret encodingFile: Resend (via Svix) provides webhook secrets in the format Suggested fix: strip the prefix and base64-decode before passing to createHmac: const keyBytes = Buffer.from(secret.replace(/^whsec_/, ''), 'base64');
const hmac = crypto.createHmac('sha256', keyBytes);Medium:
|
- Inbound email enters review_pending; trusted senders auto-queued - POST /api/webhooks/email/inbound/:messageId/review (approve|reject, reason, model_verdict) - GET/POST/DELETE /api/agents/:agentId/email/trusted-senders - Inbox service: initial_status, bypass_trust_check, system_metadata guard - Docs: README and AGENT-GUIDE updated for policy and endpoints Made-with: Cursor
… setup docs - src/lib/cloudflare.js: ported from circleinbox — DNS CRUD, upsertDnsRecord, provisionResendDns(), Email Routing enable/catch-all API - scripts/provision-resend-dns.js: check + upsert all 4 Resend records (DKIM, SPF MX, SPF TXT, DMARC) against Cloudflare; safe to re-run - docs/EMAIL-SETUP.md: step-by-step operator checklist (env, Worker deploy, Resend DNS in Cloudflare, validation) - .env.example: add CLOUDFLARE_API_TOKEN/ZONE_ID and RESEND_DNS_* vars - workers/email-ingestion/wrangler.toml: remove invalid [[email]] section - README.md: link to EMAIL-SETUP.md from deployment checklist Made-with: Cursor
- outbox.service: strip whsec_ prefix and base64-decode Svix secret before HMAC construction — without this, signature verification always fails when RESEND_WEBHOOK_SECRET is copied directly from the Resend dashboard - email-inbound: add inverse namespace guard — tenanted agents are no longer reachable at their un-namespaced address (alice@ vs acme.alice@) - email-inbound: add namespace mismatch test covering both wrong-namespace and no-namespace-to-tenanted-agent cases - outbox: Resend webhook 500 catch returns generic message (no error.message leak) - email.js: read INBOUND_EMAIL_DOMAIN lazily at call time, not at module load - server: gate RESEND_WEBHOOK_SECRET warning on RESEND_API_KEY being set Made-with: Cursor
Review Feedback AddressedAll blocking and important issues resolved in 63b5b1a. Blocking — Svix whsec_ decoding (outbox.service.js): strip prefix and base64-decode before HMAC — signature verification was always failing in production with a real Resend secret. Blocking — Namespace guard (email-inbound.js): added inverse guard — tenanted agents (acme.alice@) are no longer reachable at bare address (alice@). Added test for wrong-namespace and no-namespace-to-tenanted-agent cases. Minor — Resend webhook 500 returns generic error (no error.message leak), INBOUND_EMAIL_DOMAIN read lazily in email.js, RESEND_WEBHOOK_SECRET warning gated on RESEND_API_KEY being set. All 13 email inbound tests pass. Ready for re-review. |
|
Overall this is a well-structured migration. Provider-agnostic naming, the inbound quarantine design, and test coverage are all solid. That said, I found one critical bug that will break Resend webhook verification in production. CRITICAL: Svix webhook signature key not base64-decoded File: src/services/outbox.service.js, method: verifyWebhookSignature Resend Svix secrets are formatted as whsec_ followed by a base64-encoded key. The actual HMAC key bytes are the base64-decoded payload, not the raw string. The current code passes the raw secret including the prefix directly to crypto.createHmac, which produces the wrong HMAC for every real Resend secret -- verification will always fail in production. Fix: strip the whsec_ prefix and base64-decode the remainder before passing to createHmac. The existing tests pass because they use the plain string 'test-signing-key' with no prefix, so both sides compute the same wrong HMAC. The bug will not surface until tested against a real Resend account. Security: no validation of from_email format in the inbound webhook src/routes/email-inbound.js -- from_email is stored without format validation. A malformed value (no at-sign for example) goes through the replacement unchanged and ends up in the envelope returned to API consumers. A check analogous to SIMPLE_EMAIL_RE in agents.js would harden this boundary. Security: inbound webhook as an agent-existence oracle Once authenticated, anyone with the shared secret can probe arbitrary to_agent values and observe 404 vs 200. Likely acceptable given the Worker is the only expected caller, but worth documenting in operator notes so the shared secret is treated accordingly. Performance: findOutboxMessageByProviderId does a full scan in memory storage. An explicit TODO comment would ensure this is not missed when adding a persistent backend. Minor nits: getZoneId apex extraction via split dot slice(-2) breaks for ccTLD second-levels like foo.co.uk. Low risk for operator tooling but worth a comment. listZones and getDnsRecords have no pagination (50 and 200 record caps respectively). Could silently miss records on large accounts. email-inbound.js creates a second pino() instance at module level rather than sharing the app-wide logger. The Resend webhook catch block in src/routes/outbox.js returns error.message to the caller. The PR description states 500 catch blocks should return a generic message with full error logged server-side only. The email-inbound route gets this right and outbox.js should follow suit. Things done well: Timing-safe comparison for the inbound secret check. Verdict: The Svix HMAC key decoding bug must be fixed before merging -- it will silently break all delivery-status webhook updates in production. The remaining items are improvements rather than blockers. |
|
PR Review: feat(email) - Mailgun to Resend migration Well-structured PR with good security defaults. Several issues to address before merge: BUGS
SECURITY
NOTABLE GAPS
MINOR
WHAT IS DONE WELL
Issues 1 and 2 are concrete bugs to fix before merging. Issue 3 is worth a design discussion on whether the review API is for human operators only or also automated policy engines. |
Summary
mailgun_id→provider_message_idacrossmemory.js,mech.js, andoutbox.service.jsfor provider-agnostic storageoutbox.service.js— Bearer auth + JSON, Resend domain management, Svix webhook signature verification; renamed/webhooks/mailgun→/webhooks/resendworkers/email-ingestion/Worker parses MIME viapostal-mime, resolves{namespace}.{agent_id}@agentdispatch.ioaddresses, and forwards to ADMPPOST /webhooks/email/inboundendpoint delivers email as ADMP messages (type: email) to agent inboxesGET /api/agents/:idnow returnsemail_addressfieldINBOUND_EMAIL_SECRETis unconfigured (was fail-open).env.example,README.md, andAGENT-GUIDE.mdupdated; Mailgun references removedTest plan
bun test src/server.test.js— 158 pass, 15 fail (pre-existing mech.js baseline)bun test workers/email-ingestion/— 5 pass (parseRecipient unit tests)GET /api/agents/:idreturnsemail_addressfieldPOST /webhooks/email/inboundreturns 500 whenINBOUND_EMAIL_SECRETunset/webhooks/resendrejects stale Svix timestamp (>5 min old)cd workers/email-ingestion && wrangler deploywrangler secret put ADMP_URLandwrangler secret put INBOUND_EMAIL_SECRETRelated
tasks/0004-prd-email-inbound-outbound.mdtasks/tasks-0004-prd-email-inbound-outbound.md🤖 Generated with Claude Code