[BUG] Heavy decoding permanently stops after fast scrolling (HeavyLanePool IO semaphore leak on cancellation) #85

@Huangzhiqiang

Describe the bug

When rapidly scrolling / flipping through images (faster than the prefetch/cache can handle), the app correctly stops decoding images that are no longer visible — this is expected behavior.

However, after stopping the fast scrolling and then trying to view any new image (even slowly), decoding never resumes. The image remains blank / white / unloaded forever until the application is restarted. This effectively deadlocks the heavy decoding pipeline.

To Reproduce

  1. Open QuickView with a folder containing large / heavy images (e.g. JXL).
  2. Scroll / arrow-key / mouse-wheel very quickly through 20–50+ images in rapid succession (faster than ~100–200 ms per image).
  3. Stop scrolling and try to view any single image slowly (wait 5–10 seconds or more).
  4. Observed: No decoding starts — image stays unloaded. Worker threads appear stuck / no CPU/disk activity for heavy decoding.
  5. Only way to recover: restart the application.

Expected behavior

After fast scrolling stops, decoding of the current (and prefetched) images should resume normally within a short time.

Suspected root cause

From reading the code around commit 57ebfd7 (HeavyLanePool logic):

  • HeavyLanePool uses a std::counting_semaphore (m_ioSemaphore) to limit concurrent IO / heavy decoding operations.
  • When fast navigation happens, CancelOthers() (or similar) calls request_stop() on busy workers / tasks.
  • However, workers that acquired the semaphore via .acquire() before being cancelled do not release it on stop.
  • Each cancelled worker that still held a slot therefore lowers the available count permanently → after enough cancellations the count reaches 0 and never recovers.
  • New tasks submitted after the scrolling stops then block forever trying to acquire the semaphore → the decoding pipeline is deadlocked.

(Note: This analysis / diagnosis was generated with the assistance of Grok AI based on the described symptoms and typical patterns in the provided commit / class name.)

Suggested fix

Add RAII-based automatic release of the IO semaphore in the worker loop.

Example minimal patch concept:

// In HeavyLanePool (private helper); needs <semaphore> (C++20)
template <typename Semaphore>  // deduced from the type of m_ioSemaphore
struct ScopedIOSlot {
    Semaphore& sem;
    bool acquired = false;
    explicit ScopedIOSlot(Semaphore& s) : sem(s), acquired(s.try_acquire()) {}  // or a blocking .acquire()
    ~ScopedIOSlot() { if (acquired) sem.release(); }
    ScopedIOSlot(const ScopedIOSlot&) = delete;             // a copy would release twice
    ScopedIOSlot& operator=(const ScopedIOSlot&) = delete;
};

// In WorkerLoop, around the decoding section:
{
    ScopedIOSlot slot(m_ioSemaphore);
    if (!slot.acquired) continue;  // or wait/retry logic

    // ... actual heavy decode work ...
}  // ← auto release here, even on early return / exception / stop
