Feature Request: GPU VRAM Allocation Release #393

@JesterEE

I run dsnote on an NVIDIA RTX 3070 GPU with 8 GB of VRAM. I often keep the application running in the background (on the taskbar) so I can quickly perform speech-to-text via global keybindings. After triggering the first Listen event, I can see my GPU memory allocation increase as the selected STT model is loaded, as intended. dsnote then keeps the model in memory, which is great: it lets me use the STT backend fluidly as I work.
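For anyone who wants to reproduce this observation, here is a minimal sketch that lists per-process GPU memory by calling `nvidia-smi`. The helper names `parse_compute_apps` and `vram_by_process` are made up for illustration, and the assumption that the process name contains "dsnote" may not hold inside a Flatpak sandbox.

```python
import subprocess

# Per-process GPU memory as CSV; with "nounits" the memory column is plain MiB.
QUERY = [
    "nvidia-smi",
    "--query-compute-apps=pid,process_name,used_memory",
    "--format=csv,noheader,nounits",
]


def parse_compute_apps(csv_text):
    """Parse nvidia-smi CSV rows into (pid, process_name, used_mib) tuples."""
    rows = []
    for line in csv_text.splitlines():
        if line.count(",") < 2:
            continue  # blank or malformed line
        pid, rest = line.split(",", 1)
        name, used = rest.rsplit(",", 1)
        pid, name, used = pid.strip(), name.strip(), used.strip()
        if not pid.isdigit() or not used.isdigit():
            continue  # e.g. "[N/A]" when the driver cannot report usage
        rows.append((int(pid), name, int(used)))
    return rows


def vram_by_process(match="dsnote"):
    """Return rows whose process name contains `match` (an assumed name)."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return [row for row in parse_compute_apps(out) if match in row[1]]
```

Calling `vram_by_process()` before and after the first Listen event should show the allocation appear and then stay in place while the process is alive.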

In my experience, the Linux kernel does a pretty bad job of managing GPU memory allocation, and matters only get worse with proprietary drivers and kernel extensions (NVIDIA) doing as they please behind closed doors. From my testing, dsnote keeps the STT model in VRAM indefinitely while the process is active. In my case, this is significant: roughly 20-25% of my GPU's memory. When I move from a productivity workload where I use dsnote extensively to other, GPU-heavy workloads where I don't, I'd like better control over dsnote's memory allocation so I can free up these key resources. Right now, I restart dsnote when I switch workloads. That is inconvenient, and it is especially bad when I forget and the machine locks up. Ideally, the kernel would do its job and schedule resources appropriately (shifting non-critical VRAM use to system RAM), but until the kernel does this correctly in a hardware-agnostic way, I would like to request a few features, settings, and UI enhancements that give the user better control over dsnote's memory allocation.

My enhancement requests are:

  1. An idle timeout, with a user-defined setting that can be toggled on or off, after which dsnote releases its VRAM allocation once a set period has passed without use. On the next event, dsnote would reload the model into VRAM to process the job, and the timer would restart.
  2. A "Stop" or "Release" button next to the Listen button to manually unload the model from VRAM.
  3. A "Stop" or "Release" option in the right-click context menu on the taskbar to manually unload the model from VRAM.
  4. An externally accessible action for manually unloading the model from VRAM.
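The behavior behind these requests can be sketched roughly as follows. This is only an illustration of the idle-timeout idea, not a suggestion of how dsnote is or should be implemented: `IdleModelHolder`, `load`, `unload`, and `idle_timeout_s` are hypothetical names, and the real application would use its own timer and model-loading machinery.

```python
import threading


class IdleModelHolder:
    """Keeps a model loaded and releases it after a period without use.

    Hypothetical sketch: `load` returns a model object and `unload` frees it.
    `idle_timeout_s=None` disables the automatic release (the setting toggle).
    """

    def __init__(self, load, unload, idle_timeout_s):
        self._load = load
        self._unload = unload
        self._idle_timeout_s = idle_timeout_s
        self._model = None
        self._timer = None
        self._lock = threading.Lock()

    @property
    def loaded(self):
        with self._lock:
            return self._model is not None

    def run(self, job):
        """Load the model if needed, run the job, then restart the idle timer."""
        with self._lock:
            self._cancel_timer()
            if self._model is None:
                self._model = self._load()
            try:
                return job(self._model)
            finally:
                self._arm_timer()

    def release(self):
        """Manual unload, as a button, menu entry, or external action would do."""
        with self._lock:
            self._cancel_timer()
            self._unload_locked()

    def _arm_timer(self):
        if self._idle_timeout_s is None:
            return
        # The callback receives its own timer so stale timers can be ignored.
        timer = threading.Timer(self._idle_timeout_s, lambda: self._on_timeout(timer))
        timer.daemon = True
        self._timer = timer
        timer.start()

    def _cancel_timer(self):
        if self._timer is not None:
            self._timer.cancel()
            self._timer = None

    def _on_timeout(self, timer):
        with self._lock:
            if self._timer is timer:  # skip if a newer job re-armed the timer
                self._timer = None
                self._unload_locked()

    def _unload_locked(self):
        if self._model is not None:
            self._unload(self._model)
            self._model = None
```

In this sketch, requests 2 through 4 all map onto the same `release()` call; only the trigger differs (button, context menu, or external action). The lock is held for the duration of a job, so a manual release waits for the job in progress instead of unloading the model underneath it.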

Thank you for considering my feature request, and for the great software! dsnote really has changed the way I interact with my computer while I work. I am thankful it exists and for your efforts in supporting it!

About my system:
Speech Note 4.8.4 (flatpak) w/ Speech Note NVIDIA
Operating System: Bazzite 43
KDE Plasma Version: 6.6.4
KDE Frameworks Version: 6.25.0
Qt Version: 6.10.3
Kernel Version: 6.17.7-ba29.fc43.x86_64 (64-bit)
Graphics Platform: Wayland
Processors: 16 × AMD Ryzen 9 5950X 16-Core Processor
Memory: 32 GiB of RAM (31.3 GiB usable)
Graphics Processor: NVIDIA GeForce RTX 3070
