Use spin_lock_bh on recinf->lock to fix softirq deadlock #313
Closed
aversecat wants to merge 1 commit into
timer_callback() runs in softirq context and acquires recinf->lock, but the process-context callers (scoutfs_recov_prepare, _begin, _finish, _is_pending, _next_pending, _shutdown) were taking the same lock with plain spin_lock(), leaving softirqs enabled.

Found by Lockdep:

```
================================
WARNING: inconsistent lock state
5.14.0-427.35.1.el9_4.x86_64+debug #1 Tainted: G OE ------- ---
--------------------------------
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
swapper/2/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
ffff88813cdd9c20 (&recinf->lock){+.?.}-{2:2}, at: timer_callback+0x26/0x380 [scoutfs]
{SOFTIRQ-ON-W} state was registered at:
  __lock_acquire+0x7d0/0x1900
  lock_acquire+0x1da/0x640
  _raw_spin_lock+0x34/0x80
  scoutfs_recov_finish+0x80/0x830 [scoutfs]
  server_greeting+0x244/0xe60 [scoutfs]
  scoutfs_net_proc_worker+0x28a/0xce0 [scoutfs]
  recv_one_message+0x7e3/0xd10 [scoutfs]
  scoutfs_net_recv_worker+0x441/0xe00 [scoutfs]
  process_one_work+0x8e5/0x1530
  worker_thread+0x598/0xf70
  kthread+0x2a4/0x350
  ret_from_fork+0x29/0x50
irq event stamp: 549813370
hardirqs last enabled at (549813370): [<ffffffffabe25cb4>] _raw_spin_unlock_irq+0x24/0x50
hardirqs last disabled at (549813369): [<ffffffffabe2594e>] _raw_spin_lock_irq+0x5e/0x90
softirqs last enabled at (549813356): [<ffffffffabe28c91>] __do_softirq+0x621/0x9c2
softirqs last disabled at (549813363): [<ffffffffa9a44665>] __irq_exit_rcu+0x185/0x230

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&recinf->lock);
  <Interrupt>
    lock(&recinf->lock);

 *** DEADLOCK ***
```

Convert the six process-context sites to spin_lock_bh()/spin_unlock_bh().

Signed-off-by: Auke Kok <auke.kok@versity.com>
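The hazard lockdep reports can be illustrated with a small userspace model. This is only a sketch and not scoutfs or kernel code: every `model_*` name is hypothetical, the "lock" is a plain flag, and a softirq is simulated by a direct function call on the same "CPU". It assumes the usual kernel behaviour that spin_lock_bh() masks softirqs on the local CPU until the matching spin_unlock_bh(), at which point pending softirqs may run.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: these are NOT the kernel primitives. */
struct model_lock {
	bool held;
};

static int bh_disable_depth;	/* > 0 means softirqs are masked on this "CPU" */
static bool softirq_pending;	/* a timer softirq arrived while masked */
static int deadlocks;		/* times the callback would have spun forever */

/* Stand-in for timer_callback(): takes and releases the lock in softirq context. */
static void model_timer_callback(struct model_lock *l)
{
	if (l->held) {
		/* A real spinlock would spin here forever on the same CPU. */
		deadlocks++;
		return;
	}
	l->held = true;
	l->held = false;
}

/* Simulate the timer softirq arriving on the CPU that may hold the lock. */
static void model_raise_timer_softirq(struct model_lock *l)
{
	if (bh_disable_depth > 0) {
		softirq_pending = true;	/* deferred until softirqs are re-enabled */
		return;
	}
	model_timer_callback(l);
}

static void model_spin_lock(struct model_lock *l)
{
	assert(!l->held);
	l->held = true;
}

static void model_spin_unlock(struct model_lock *l)
{
	l->held = false;
}

static void model_spin_lock_bh(struct model_lock *l)
{
	bh_disable_depth++;
	model_spin_lock(l);
}

static void model_spin_unlock_bh(struct model_lock *l)
{
	model_spin_unlock(l);
	if (--bh_disable_depth == 0 && softirq_pending) {
		softirq_pending = false;
		model_timer_callback(l);	/* runs only after the lock is dropped */
	}
}

static void model_reset(void)
{
	bh_disable_depth = 0;
	softirq_pending = false;
	deadlocks = 0;
}

/* Process context with plain locking: the softirq lands inside the critical section. */
int model_run_plain(void)
{
	struct model_lock lock = { .held = false };

	model_reset();
	model_spin_lock(&lock);
	model_raise_timer_softirq(&lock);
	model_spin_unlock(&lock);
	return deadlocks;
}

/* Process context with the _bh variants: the softirq is deferred past the unlock. */
int model_run_bh(void)
{
	struct model_lock lock = { .held = false };

	model_reset();
	model_spin_lock_bh(&lock);
	model_raise_timer_softirq(&lock);
	model_spin_unlock_bh(&lock);
	return deadlocks;
}
```

In this model, `model_run_plain()` counts one would-be deadlock and `model_run_bh()` counts none, which mirrors the `{SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W}` inconsistency in the report and why the process-context sites need the `_bh` variants.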
This needs to be part of #314 - closing.