⚡ Implement lazy loading for PDF content to improve performance #29
- Replaced eager PDF text extraction with `PdfLazyList` in `ContentRepository`, enabling O(1) initialization time.
- Updated `ContentResult` and `ChapterContent` to support pre-calculated text/image counts, avoiding eager iteration of lazy lists.
- Modified `ReaderViewModel` to pass these counts to `ChapterContent`.
- Changed PDF content granularity to one `ContentElement.Text` per PDF page (with internal paragraph separation preserved) to support lazy access by index.
- This change prevents the application from blocking during the loading of large PDF documents.

Co-authored-by: Aatricks <113598245+Aatricks@users.noreply.github.com>
This PR implements lazy loading for PDF documents in `ContentRepository`. Previously, opening a large PDF would block execution while the application extracted text from every page, split it into paragraphs, and created a large list of `ContentElement` objects.

The new implementation uses `PdfLazyList`, a custom `AbstractList` that opens the PDF document and extracts text for a specific page only when requested by the UI (e.g., by `LazyColumn` or `HorizontalPager`). This reduces the initialization complexity from O(N) (where N is the number of pages) to O(1).

Key changes:
- `ContentRepository.kt`: Added `PdfLazyList` and updated `loadPdfContent` to use it.
- `ContentResult.kt` & `ChapterContent.kt`: Added support for passing pre-calculated `textCount` and `imageCount` to avoid iterating the lazy list for metadata.
- `ReaderViewModel.kt`: Updated to propagate these counts.

Note: PDF content is now returned as one text block per page (preserving paragraph breaks via double newlines), which aligns better with page-based navigation and enables the lazy loading strategy.
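For readers unfamiliar with the pattern, the following is a minimal sketch of a lazily populated list, not the actual `PdfLazyList` from this PR. The names `LazyPageList`, `ChapterSketch`, and the `extractPage` callback are hypothetical stand-ins, and the per-page cache is an assumption that the PR description does not confirm. It illustrates why construction can be O(1) and why a pre-calculated count avoids touching any page.

```kotlin
// Hypothetical sketch: LazyPageList stands in for the PR's PdfLazyList.
// extractPage is an assumed callback that would wrap real PDF text extraction.
class LazyPageList(
    private val pageCount: Int,
    private val extractPage: (Int) -> String,
) : AbstractList<String>() {

    // Assumption: extracted pages are cached so repeated reads do no extra work.
    private val cache = HashMap<Int, String>()

    // The size is known up front, so no page has to be read to report it.
    override val size: Int get() = pageCount

    override fun get(index: Int): String {
        if (index !in 0 until pageCount) {
            throw IndexOutOfBoundsException("index=$index, size=$pageCount")
        }
        // Extraction happens only on first access to this page.
        return cache.getOrPut(index) { extractPage(index) }
    }
}

// Hypothetical holder mirroring the idea of a pre-calculated textCount.
// A plain class is used because a data class would iterate the list
// in equals/hashCode/toString.
class ChapterSketch(val elements: List<String>, val textCount: Int)

fun main() {
    var extractions = 0
    val pages = LazyPageList(pageCount = 1_000) { index ->
        extractions++
        "Text of page ${index + 1}"
    }

    // The count comes from the known page total, not from iterating the list.
    val chapter = ChapterSketch(elements = pages, textCount = pages.size)
    println(extractions)          // prints 0: nothing has been extracted yet

    println(chapter.elements[41]) // prints "Text of page 42"
    println(extractions)          // prints 1: only the requested page was read
}
```

In this sketch, anything that walks the whole list (for example `count { ... }` or `toList()`) would still trigger extraction of every page, which is the cost the pre-calculated counts are meant to avoid.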
PR created automatically by Jules for task 141534413232369332 started by @Aatricks