
feat: opt-in auto-OCR on image download + image-download fix (P13b-3) #141

Merged: blokzdev merged 3 commits into main from claude/p13b3-auto-ocr on Jun 2, 2026

blokzdev (Owner) commented on Jun 2, 2026

What & why

Closes out P13b with opt-in auto-OCR on image downloads — plus a precursor bug fix the maintainer chose to land first, since it's the enabler.

Two commits at the time of writing (a third, the pre-merge sweep, appears in the commit list below):

1. fix: image-only downloads become image items (precursor)

classifyDownloadOutputs routed every image extension (.jpg/.png/.webp/…) to thumb, so an image-only download (an Instagram/X photo, or a carousel) produced an empty media list and therefore no library item at all. Auto-OCR, which scans outputs.media, could never fire, and image downloads in general were silently dropped.

Fix: image files are tentative thumbnails — alongside a video/audio download they stay thumbnails, but when a download has no video/audio, the images are the media (→ image items). Reuses mediaTypeForExt so classification matches the insert loop. The video+thumbnail case is unchanged. This also fixes image downloads app-wide (they now appear with dimensions, on-demand OCR, etc.).
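
The rule above can be sketched as follows. This is a minimal illustration, not the code from the commit: only the names classifyDownloadOutputs and mediaTypeForExt come from the PR, while the DownloadOutputs class, the extension sets, the string type labels, and both function bodies are assumptions made for the example.

```dart
// Hypothetical sketch. DownloadOutputs, the extension sets, and the bodies
// below are illustrative stand-ins, not the project's actual code.

/// Result of splitting a download's output files into media and thumbnails.
class DownloadOutputs {
  DownloadOutputs(this.media, this.thumbs);
  final List<String> media;
  final List<String> thumbs;
}

const _imageExts = {'jpg', 'jpeg', 'png', 'webp'};
const _videoExts = {'mp4', 'mkv', 'webm'};
const _audioExts = {'mp3', 'm4a', 'opus'};

/// Stand-in for the real mediaTypeForExt; returns null for unknown extensions.
String? mediaTypeForExt(String ext) {
  if (_videoExts.contains(ext)) return 'video';
  if (_audioExts.contains(ext)) return 'audio';
  if (_imageExts.contains(ext)) return 'image';
  return null;
}

/// Lower-cased extension without the dot, or '' when the path has none.
String _extOf(String path) {
  final dot = path.lastIndexOf('.');
  return dot < 0 ? '' : path.substring(dot + 1).toLowerCase();
}

DownloadOutputs classifyDownloadOutputs(List<String> paths) {
  final media = <String>[];
  final images = <String>[]; // tentative thumbnails
  for (final path in paths) {
    final type = mediaTypeForExt(_extOf(path));
    if (type == 'video' || type == 'audio') {
      media.add(path);
    } else if (type == 'image') {
      images.add(path);
    }
  }
  // No video/audio: the images are the media. Otherwise they stay thumbnails.
  if (media.isEmpty) return DownloadOutputs(images, const []);
  return DownloadOutputs(media, images);
}
```

Deferring the image decision until all files have been seen is what keeps the video+thumbnail case unchanged: an image only becomes media once the loop has confirmed there is no video or audio beside it.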

2. feat: opt-in auto-OCR on download (P13b-3)

Mirrors P13a-2 auto-summarize. OCR is free + offline (bundled ML Kit) — no model download, no "needs setup" nudge.

  • Settings: autoOcrOnDownload (default off) + setter.
  • Pure shouldAutoOcr (enabled & engine-available & image & not-yet-scanned).
  • Queue: a gated block in _persistCompleted (after auto-summary) scans each image item → updateOcrText (FTS reindexes); ocrCount in _PersistResult; an ai success inbox entry when text is found (no nudge — OCR is always available).
  • Settings UI: an "Image text (OCR) · Auto-scan new image downloads" card in AI & graph settings, shown only where ML Kit OCR runs.
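
As a rough illustration of the pure gate listed above, the four conditions could be combined like this. The PR names only shouldAutoOcr and its four conditions; the parameter names, the string media type, and the choice to treat a null OCR text as "not yet scanned" are assumptions for this sketch.

```dart
/// Hypothetical signature: the PR states the four conditions, not the
/// parameters. A null [existingOcrText] is assumed to mean "never scanned",
/// so an empty string (scanned, no text found) does not trigger a rescan.
bool shouldAutoOcr({
  required bool enabled,
  required bool engineAvailable,
  required String mediaType,
  required String? existingOcrText,
}) {
  return enabled &&
      engineAvailable &&
      mediaType == 'image' &&
      existingOcrText == null;
}
```

Keeping the gate free of I/O is what makes a truth-table test practical: every combination of inputs can be checked without a queue, a database, or the native OCR engine.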

Tests

dart format: clean · flutter analyze: No issues · flutter test: 800 passed.

Coverage includes classifier image cases (image-only → media; carousel → all media; video+image → thumbnail unchanged), the shouldAutoOcr truth table, a settings round-trip, and queue cases (image+text → ocrText + ai entry; default-off no-op; video skipped). The realistic auto-OCR queue test relies on the precursor fix to produce an image item from a .jpg download. No schema/deps change.

Honest notes

  • The classifier bug was a genuine find while planning P13b-3 — image downloads weren't producing items, which would have made auto-OCR a silent no-op. I paused and surfaced it rather than ship an ineffective feature; you chose "fix classification first," so it's the precursor commit here.
  • Owed APK spot-check (VERIFICATION → P13b-3): a real image download → image item + searchable text + inbox entry, offline; the native ML Kit call + image-download classification can't be CI-tested.
  • The video+thumbnail path is covered by an unchanged-behavior test to guard against regressions.

This completes P13b (OCR + translation + auto-OCR). Next subphase is P13c — smart auto-tagging.

https://claude.ai/code/session_013JoYmLCosYt5tQ8qwdbL1T


Generated by Claude Code

claude added 3 commits June 2, 2026 17:45
…13b-3 precursor)

classifyDownloadOutputs routed every image extension to `thumb`, so a
single-image download (a photo or carousel) produced an empty media list —
no library item at all. Now image files are tentative thumbnails: alongside
a video/audio download they stay thumbnails, but when a download has no
video/audio the images ARE the media (→ image items). Reuses mediaTypeForExt
so classification matches the insert loop's type assignment.

This unblocks auto-OCR-on-download (P13b-3) and fixes image downloads
generally (they now appear in the library, with dimensions/OCR/etc.).

https://claude.ai/code/session_013JoYmLCosYt5tQ8qwdbL1T

Follow-up to P13b-1: optionally auto-scan image downloads for text on
completion, so search coverage grows automatically. Opt-in (default off);
mirrors P13a-2 auto-summarize. OCR is free + offline (bundled ML Kit), so
there's no model download or "needs setup" nudge.

- Settings: `autoOcrOnDownload` (default false) + setter.
- Pure `shouldAutoOcr` (enabled & engine-available & image & not-yet-scanned).
- Queue: gated block in `_persistCompleted` (after auto-summary) scans each
  image item via ocrEngine → `updateOcrText` (FTS reindexes); `ocrCount` in
  `_PersistResult`; an `ai` success inbox entry when text is found.
- Settings UI: an "Image text (OCR)" auto-scan card in AI & graph settings,
  shown only where ML Kit OCR runs.
- Tests: shouldAutoOcr truth table, settings round-trip, and queue cases
  (image+text → ocrText + entry; default-off no-op; video skipped). The
  realistic image test relies on the precursor classifier fix.
- Docs: P13-PLAN P13b-3 status, VERIFICATION P13b-3 + image-download fix.
  No schema/deps change.
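
The gated queue step described in this commit might look roughly like the sketch below. Only autoOcrOnDownload, updateOcrText, and ocrCount are names from the commit message; the Item and OcrEngine types, the autoOcrItems function, and the decision to store an empty result so the item counts as scanned are all assumptions, and the real block lives inside `_persistCompleted` rather than in a standalone function.

```dart
// Hypothetical types and function. This is not the project's queue code.

class Item {
  Item(this.id, this.type, this.path, {this.ocrText});
  final int id;
  final String type; // 'image', 'video', or 'audio'
  final String path;
  String? ocrText; // null is assumed to mean "not yet scanned"
}

/// Assumed minimal engine interface standing in for the ML Kit wrapper.
abstract class OcrEngine {
  bool get isAvailable;
  Future<String> recognize(String path);
}

/// Scans each eligible image item and returns how many yielded text.
Future<int> autoOcrItems({
  required bool autoOcrOnDownload,
  required OcrEngine engine,
  required List<Item> items,
  required Future<void> Function(int id, String text) updateOcrText,
}) async {
  // Gate: the feature is opt-in and only runs where the engine is available.
  if (!autoOcrOnDownload || !engine.isAvailable) return 0;
  var ocrCount = 0;
  for (final item in items) {
    // Skip non-images and items that already have a scan result.
    if (item.type != 'image' || item.ocrText != null) continue;
    final text = (await engine.recognize(item.path)).trim();
    // Assumption: persist even an empty result so the item is not rescanned.
    await updateOcrText(item.id, text);
    item.ocrText = text;
    if (text.isNotEmpty) ocrCount++;
  }
  return ocrCount;
}
```

A caller could then add the `ai` inbox entry only when the returned count is greater than zero, which matches the "entry when text is found" behavior described above.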

https://claude.ai/code/session_013JoYmLCosYt5tQ8qwdbL1T

…s (P13b-3 sweep)

Pre-merge sweep of the image-download/classification work:

- MediaThumb now falls back to the image FILE for `image` items with a null
  thumbnail (they were showing a movie-icon placeholder in the grid,
  dashboard, collections, hero shuttle, and related strips). Typed fallback
  icon for images is now image_outlined.
- classifyDownloadOutputs collapses an image + its yt-dlp `--write-thumbnail`
  sidecar to ONE item: with no video/audio, the largest image is the media
  and the next-largest is its thumbnail (carousels expand to one task/folder
  per photo, so multiple images here = photo + thumbnail). Prevents a
  duplicate image item and gives image items a real thumbnail.
- Quick wins in _persistCompleted: auto-transcribe skips image items (no
  wasted whisper transcode of a photo); durationSec gated to non-image.
- Tests: classifier photo+thumbnail collapse (real temp files), MediaThumb
  image null-thumb renders Image.file (not the movie icon), queue cases
  (image+thumbnail → one item with a thumbnail; whisper skipped on images).
- Docs: VERIFICATION (thumbnail rendering, single-item, export), BACKLOG
  (unconditional --write-thumbnail; non-mediaTypeForExt image formats),
  P13-PLAN P13b-3 sweep note.
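
The photo + sidecar collapse described in this commit can be sketched as a simple size ranking. This is an illustration under assumptions: the commit says "largest" without defining it, so file size in bytes is assumed here, and ImageFile, PhotoPick, and collapsePhotoAndThumbnail are hypothetical names rather than the project's code.

```dart
// Hypothetical sketch. "Largest" is assumed to mean file size in bytes.

class ImageFile {
  const ImageFile(this.path, this.sizeBytes);
  final String path;
  final int sizeBytes;
}

class PhotoPick {
  const PhotoPick(this.media, this.thumbnail);
  final ImageFile media;
  final ImageFile? thumbnail;
}

/// For an image-only download: the largest image is the media and the
/// next-largest (if any) is its thumbnail. Returns null for an empty list.
PhotoPick? collapsePhotoAndThumbnail(List<ImageFile> images) {
  if (images.isEmpty) return null;
  // Sort a copy, largest first, so the caller's list is left untouched.
  final sorted = [...images]
    ..sort((a, b) => b.sizeBytes.compareTo(a.sizeBytes));
  return PhotoPick(sorted[0], sorted.length > 1 ? sorted[1] : null);
}
```

Any images beyond the top two are ignored in this sketch, which relies on the commit's note that carousels expand to one task/folder per photo, so multiple images in one download are expected to be a photo and its thumbnail.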

https://claude.ai/code/session_013JoYmLCosYt5tQ8qwdbL1T
@blokzdev blokzdev merged commit a9cf525 into main Jun 2, 2026
1 check passed
@blokzdev blokzdev deleted the claude/p13b3-auto-ocr branch June 2, 2026 19:40