Skip to content

Wayland: resolve session lock client crashes on sleep, unlock, and output changes#829

Open
Luke458 wants to merge 6 commits into
quickshell-mirror:masterfrom
Luke458:fix-session-lock-crash
Open

Wayland: resolve session lock client crashes on sleep, unlock, and output changes#829
Luke458 wants to merge 6 commits into
quickshell-mirror:masterfrom
Luke458:fix-session-lock-crash

Conversation

@Luke458
Copy link
Copy Markdown

@Luke458 Luke458 commented May 28, 2026

Under the ext-session-lock-v1 protocol, calling unlock_and_destroy immediately invalidates and destroys all associated lock surface objects on the compositor.

In WlSessionLock::unlock(), the manager was unlocked before deleting the surfaces. Because surface->deleteLater() is asynchronous, the destructors ran in the next event loop tick and sent ext_session_lock_surface_v1.destroy requests to the compositor for object IDs that the compositor had already discarded, leading to a fatal invalid object Wayland protocol error and session crash.

Similarly, when screens are removed, deleting them asynchronously can trigger the same race condition.

This PR resolves the issue by synchronously deleting the surfaces first (delete surface;) before unlocking the session, conforming to the correct destruction order.

Fixes #540
Fixes #786
Fixes #733
Fixes #727
Fixes #608
Fixes #571
Fixes #406
Fixes #366
Fixes #368
Fixes #370
Fixes #361
Fixes #832

@Luke458
Copy link
Copy Markdown
Author

Luke458 commented May 28, 2026

I should also mention that I understand deleteLater() is standard Qt/QML practice and directly deleting a QObject using the C++ delete operator is generally discouraged as the QML engine bindings or event queues might still hold references to the object during the current event loop cycle.

However, this usage is safe from a Qt perspective because these surfaces are created and owned purely in C++ by WlSessionLock and are not shared with, referenced by, or garbage-collected through QML bindings elsewhere.

@Luke458
Copy link
Copy Markdown
Author

Luke458 commented May 28, 2026

Update: Added a second commit to prevent a crash during sleep/DPMS transitions.

When display outputs are temporarily disconnected or in a transition state, updateSurfaces() is now prevented from creating lock surfaces on placeholder or invalid screens. This avoids triggering the qFatal check in the QSWaylandSessionLockSurface constructor.

@Luke458
Copy link
Copy Markdown
Author

Luke458 commented May 28, 2026

upon further testing, this is an improvement but it doesn't fully resolve the wake from suspend/dpms off crashing. It prevents all wayland clients from crashing but quickshell itself will still crash :/

@Luke458 Luke458 changed the title wayland: destroy lock surfaces synchronously before unlocking fix(wayland): resolve session lock client crashes on sleep, unlock, and output changes May 28, 2026
@Luke458 Luke458 changed the title fix(wayland): resolve session lock client crashes on sleep, unlock, and output changes Wayland: resolve session lock client crashes on sleep, unlock, and output changes May 28, 2026
@Luke458 Luke458 marked this pull request as draft May 28, 2026 06:59
@Luke458
Copy link
Copy Markdown
Author

Luke458 commented May 28, 2026

Okay final update (I hope), after the second commit there were still 3 issues remaining.

The first issue involved a screen state transition abort caused by a qFatal check. During suspend and wake transitions, displays are briefly disconnected and reconnected, causing Qt to temporarily advertise placeholder screens with null Wayland outputs. Quickshell’s lock surface constructor contained a hardcoded qFatal safety check that immediately aborted the process when these placeholder screens were encountered. This was resolved by filtering out placeholder screens and screens with null outputs in WlSessionLock::updateSurfaces() before initializing new lock surfaces, safely bypassing the abort condition.

The second issue was a Wayland protocol error: wl_display#1: error 0: invalid object XX. In the original implementation, lock surfaces were deleted asynchronously using deleteLater(). When unlocking, or when a monitor was disconnected, the server-side lock object was destroyed first. Later, when the Qt event loop processed the deferred surface deletion, the client sent a destroy request for the surface. Because the compositor had already freed the corresponding server-side surface ID, this resulted in an invalid object protocol violation. The fix involved converting the raw lock pointer in LockWindowExtension to a weak QPointer<QSWaylandSessionLock>, and updating QSWaylandSessionLockSurface::~QSWaylandSessionLockSurface() to check whether the lock was inactive or null. If the lock was no longer valid, the code now cleans up the client-side proxy locally using wl_proxy_destroy() instead of sending a server-bound destroy request.

The third issue was another protocol error: [destroyed object]: error 1: Null buffer attached. The ext-session-lock-v1 protocol strictly forbids committing a null buffer to a lock surface, triggering a null_buffer error if violated. When surfaces were deleted before unlocking, the lock role remained active on the compositor. Because surface cleanup occurred asynchronously, Qt’s normal window-hide routine attempted to clear the screen by committing a null buffer to the associated wl_surface before teardown had fully completed, causing a crash. This was fixed by changing the destruction order in WlSessionLock::unlock() to call manager->unlock() first, allowing the compositor to exit lock mode and remove lock roles from all surfaces before deleting the local surface objects. Additionally, WlSessionLockSurface::~WlSessionLockSurface() was modified to call this->window->destroy() synchronously before scheduling deleteLater(). This immediately releases the platform window resources and tears down the Wayland objects in the correct protocol order, preventing asynchronous paint or hide events from committing null buffers.

@Luke458 Luke458 marked this pull request as ready for review May 28, 2026 08:14
@Luke458 Luke458 marked this pull request as draft May 28, 2026 08:21
@Luke458 Luke458 force-pushed the fix-session-lock-crash branch from 92a0fdd to 1ea83fc Compare May 28, 2026 08:23
@Luke458
Copy link
Copy Markdown
Author

Luke458 commented May 28, 2026

FINAL FINAL commits this time to fix the last couple of issues.

Firstly, just fixed the CI failure with qt 6.6.3 as described in the commit.

More importantly, resolved a double-free crash with the error Proxy requested for unref has no references. In QtWayland, the C++ platform wrapper classes automatically call wl_proxy_destroy() on their associated Wayland proxies from their base destructors. The initial implementation added a manual wl_proxy_destroy() call inside QSWaylandSessionLockSurface::~QSWaylandSessionLockSurface(), which caused the proxy to be freed twice, once manually and once again by the base class destructors. This triggered a fatal libwayland-client assertion. The fix involved removing the manual wl_proxy_destroy() call from ~QSWaylandSessionLockSurface(). When shouldDestroyOnServer is false, cleanup is now left entirely to the base destructor. An early return guard, if (this->object() == nullptr) return;, was also added at the start of the destructor. Additionally, LockWindowExtension::destroySurface() was modified to destroy the surface early via surface->destroy() only when the lock is still active. If the lock is inactive during unlocking, it now skips early destruction and allows the window destructor to clean up the proxy exactly once.

@Luke458
Copy link
Copy Markdown
Author

Luke458 commented May 28, 2026

After extensive debugging and validation, I've identified why this crash may be highly reproducible on systems running amdgpu/Mesa with Hyprland, and how the final changes resolve it.

The issue is triggered by how quickly the open-source AMD GPU stack handles display power transitions. Unlike some proprietary driver setups(Nvidia), amdgpu with Mesa performs DPMS off, suspend, and wake transitions almost immediately.

Similarly, Hyprland is extremely strict about Wayland object state and the moment unlock_and_destroy is processed, Hyprland immediately invalidates the surface IDs. If the client sends a destroy or commit on those IDs a split-second later, Hyprland raises a fatal protocol error ( invalid object or Null buffer attached ) and kills the client.

This exposes a race condition in Quickshell’s client-side surface lifecycle. If the client calls destroy on a lock surface after Hyprland has already freed it, the Wayland connection terminates with an invalid object error. Conversely, if the client destroys the window while the lock role is still active, Qt’s default hide routine may commit a null buffer, violating the ext-session-lock-v1 protocol and producing a Null buffer attached error.

The final fix resolves this by enforcing a strict teardown order. In WlSessionLock::unlock(), each lock surface is first explicitly destroyed through surface->ext->destroySurface() before any windows are hidden or deleted. Since the lock is still active at this point, this sends ext_session_lock_surface_v1.destroy to the compositor and strips the lock role server-side.

After that, the window is destroyed synchronously with window->destroy(). Any null buffers committed by Qt during this phase are ignored by the compositor because the surface is no longer acting as a lock surface.

The destructor also now begins with an early-return guard, if (this->object() == nullptr) return;. If the proxy was already destroyed while the lock was active, the destructor exits immediately. If the lock was inactive during unlock, QtWayland’s base class is left to clean up the proxy locally exactly once, avoiding the double-free crash reported as Proxy requested for unref has no references.

Finally, unlock_and_destroy is sent to the compositor only after the surfaces and windows have been handled. This matches the required protocol order, avoids desktop exposure flicker, and prevents both the invalid object and Null buffer attached crashes.

@Luke458
Copy link
Copy Markdown
Author

Luke458 commented May 30, 2026

AvengeMedia/DankMaterialShell#1468 (comment)

Many thanks to @bzbetty for testing and proposing this final change ^^

The original reason for switching to synchronous delete was to ensure that the Wayland lock surface role was destroyed before unlock_and_destroy was sent. However, now that destroySurface() explicitly and synchronously handles destruction of the lock surface role before any object deletion occurs, deletion of the WlSessionLockSurface itself becomes purely a local cleanup concern.

WlSessionLockSurface is a QML-exposed QObject with active property bindings (visible, width, height, and screen) as well as signal connections. Deleting it synchronously with delete while the QML engine may still have pending property evaluations or signal handlers referencing the object introduces a risk of use-after-free crashes within the QML engine. Using deleteLater() avoids this by deferring destruction until the next event loop iteration, allowing QML to safely complete any outstanding work before the object is destroyed.

The current unlock sequence is:

  1. ext->destroySurface() — synchronously sends ext_session_lock_surface_v1.destroy to the compositor, removing the lock role.
  2. surface->deleteLater() — schedules cleanup of the QML object and associated window resources for the next event loop iteration.
  3. manager->unlock() — sends unlock_and_destroy to the compositor.

By the time the deferred deletion from step 2 actually executes, the lock role has already been removed in step 1 and the session has already been unlocked in step 3. As a result, any null-buffer commits generated by Qt's normal window-hide or teardown routines are no longer subject to the restrictions of the ext-session-lock-v1 protocol, because the surface is no longer acting as a lock surface. This preserves correct protocol behavior while also maintaining Qt/QML's preferred object-lifetime semantics.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment