Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
15 changes: 15 additions & 0 deletions REFACTOR-LOG.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
# Refactor Log

## Iteration 1 — 2026-03-28

**Issue:** IDLE mode re-fetches all emails on every notification due to stale `ctx.last_seen_uid`
**Severity:** fragility
**Location:** `backend/app/modules/_shared/email/imap_watch_loop.py`, `fetch_new_emails()` (lines 145, 170)

**Problem:** The `fetch_new_emails` function persists the last processed UID to the database via `callbacks.save_uid(uid, db)` but never updates the in-memory `ctx.last_seen_uid`. In IDLE mode, the same `FetchContext` object is reused across notification cycles (passed from `watch_loop` → `idle_loop` → `fetch_new_emails`). This means the IMAP UID search (`UID {last_seen_uid + 1}:*`) always starts from the original position, causing every previously-processed email to be re-fetched from the IMAP server and re-checked against the dedup table on each IDLE notification. The overhead grows linearly with the number of emails received during an IDLE session. Polling mode is unaffected because it creates a fresh `FetchContext` from the database each cycle.

**Fix:** Added `ctx.last_seen_uid = uid` after each `callbacks.save_uid(uid, db)` call (both the route-skipped path on line 145 and the normal processing path on line 170). This keeps the in-memory context in sync with what was persisted, so subsequent IDLE cycles only fetch genuinely new emails.

**Verification:** All 222 existing tests pass. Added 3 new tests in `tests/test_fetch_new_emails.py` covering: (1) ctx updates after processing emails, (2) ctx updates when emails are skipped by routing, (3) ctx unchanged when no new emails. Full suite: 225 passed.

**Risk:** None identified. The `FetchContext` is a non-frozen dataclass, mutation is safe. The `ctx` is only used within the `fetch_new_emails` → `idle_loop` call chain. Polling mode already creates fresh contexts and is unaffected.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file duplicates information already in the PR description and commit messages. It will go stale quickly and adds maintenance burden. Consider removing it — git history and the PR are the appropriate places for this context.

2 changes: 2 additions & 0 deletions backend/app/modules/_shared/email/imap_watch_loop.py
Original file line number Diff line number Diff line change
Expand Up @@ -143,6 +143,7 @@ async def fetch_new_emails(
route = await callbacks.route_email(sender, db)
if route is None:
await callbacks.save_uid(uid, db)
ctx.last_seen_uid = uid
continue
user_id, source = route

Expand All @@ -168,6 +169,7 @@ async def fetch_new_emails(
)

await callbacks.save_uid(uid, db)
ctx.last_seen_uid = uid

if state:
state.last_scan_at = datetime.now(timezone.utc)
Expand Down
99 changes: 99 additions & 0 deletions backend/tests/test_fetch_new_emails.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,99 @@
"""Tests for fetch_new_emails keeping ctx.last_seen_uid in sync."""

import pytest
from unittest.mock import AsyncMock, MagicMock, patch
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MagicMock and WorkerState (line 11) are imported but never used. Remove them to keep the test file clean.


from app.modules._shared.email.imap_watch_loop import (
FetchContext,
ImapWatcherCallbacks,
fetch_new_emails,
)
from app.modules._shared.email.imap_watcher import WorkerState


def _make_callbacks(route_result=None) -> ImapWatcherCallbacks:
"""Build mock callbacks for testing."""
return ImapWatcherCallbacks(
connect=AsyncMock(),
load_fetch_context=AsyncMock(),
route_email=AsyncMock(return_value=route_result),
save_uid=AsyncMock(),
log_label="test",
)


def _make_imap(uids: list[int], email_bytes: bytes = b"From: a@b.c\nSubject: test\n\nBody"):
"""Build a mock IMAP client that returns the given UIDs and a fixed email."""
imap = AsyncMock()
uid_str = " ".join(str(u) for u in uids).encode()
imap.uid_search = AsyncMock(return_value=("OK", [uid_str] if uids else [b""]))
imap.uid = AsyncMock(return_value=("OK", [bytearray(email_bytes)]))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion (test gap): This mock returns the same email for every UID and doesn't validate the search criteria string. Consider adding an assertion that uid_search is called with the expected UID {last_seen_uid + 1}:* criteria — that's the core invariant this fix protects. Without it, these tests would still pass even if the ctx.last_seen_uid update were removed (the mock doesn't change behavior based on UID range).

Also: for test_ctx_last_seen_uid_updated_after_processing, consider testing with a second call to fetch_new_emails on the same ctx to verify that the updated last_seen_uid actually affects the next IDLE cycle's search range. That would be a true integration-style test of the fix.

return imap


@pytest.mark.asyncio
async def test_ctx_last_seen_uid_updated_after_processing():
"""ctx.last_seen_uid must advance as emails are processed."""
ctx = FetchContext(
last_seen_uid=10,
folder_path="INBOX",
uidvalidity=123,
max_email_age_days=7,
source_info="test",
source_label="test",
account_id=1,
)
callbacks = _make_callbacks(route_result=(1, "test"))
imap = _make_imap([11, 12, 13])
db = AsyncMock()

with patch(
"app.modules._shared.email.imap_watch_loop.check_dedup_and_enqueue",
new_callable=AsyncMock,
return_value=True,
):
await fetch_new_emails(imap, ctx, callbacks, db, state=None)

assert ctx.last_seen_uid == 13


@pytest.mark.asyncio
async def test_ctx_last_seen_uid_updated_on_skipped_route():
"""ctx.last_seen_uid must advance even when route_email returns None (email skipped)."""
ctx = FetchContext(
last_seen_uid=5,
folder_path="INBOX",
uidvalidity=123,
max_email_age_days=7,
source_info="test",
source_label="test",
account_id=1,
)
callbacks = _make_callbacks(route_result=None) # all emails skipped
imap = _make_imap([6, 7])
db = AsyncMock()

await fetch_new_emails(imap, ctx, callbacks, db, state=None)

assert ctx.last_seen_uid == 7


@pytest.mark.asyncio
async def test_ctx_last_seen_uid_unchanged_when_no_new_emails():
"""ctx.last_seen_uid stays the same when there are no new UIDs."""
ctx = FetchContext(
last_seen_uid=10,
folder_path="INBOX",
uidvalidity=123,
max_email_age_days=7,
source_info="test",
source_label="test",
account_id=1,
)
callbacks = _make_callbacks()
imap = _make_imap([])
db = AsyncMock()

await fetch_new_emails(imap, ctx, callbacks, db, state=None)

assert ctx.last_seen_uid == 10
Loading