Fix deadlock with tap/tun devices for arm guests#1232
Draft
AndrewFasano wants to merge 5 commits into
Draft
Conversation
A guest boot hangs while probing the network interface when iommu_platform=on is used. The following qemu cli hangs without this patch: # $QEMU \ -netdev tap,fd=3,id=hostnet0,vhost=on,vhostfd=4 3<>/dev/tap67 4<>/dev/host-net \ -device virtio-net-pci,netdev=hostnet0,id=net0,iommu_platform=on,disable-legacy=on \ ... Commit: c471ad0 (vhost_net: device IOTLB support) took care of setting vhostfd to non-blocking when QEMU opens /dev/host-net but if the fd is passed from qemu cli then we need to ensure that fd is set to non-blocking. Fixes: c471ad0 ("vhost_net: device IOTLB support") Cc: qemu-stable@nongnu.org Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
When QEMU sets up a tap based network device backend, it mostly ignores errors reported from various ioctl() calls it makes, assuming the TAP file descriptor is valid. This assumption can easily be violated when the user is passing in a pre-opened file descriptor. At best, the ioctls may fail with a -EBADF, but if the user passes in a bogus FD number that happens to clash with a FD number that QEMU has opened internally for another reason, a wide variety of errnos may result, as the TUNGETIFF ioctl number may map to a completely different command on a different type of file. By ignoring all these errors, QEMU sets up a zombie network backend that will never pass any data. Even worse, when QEMU shuts down, or that network backend is hot-removed, it will close this bogus file descriptor, which could belong to another QEMU device backend. There's no obvious guaranteed reliable way to detect that a FD genuinely is a TAP device, as opposed to a UNIX socket, or pipe, or something else. Checking the errno from probing vnet hdr flag though, does catch the big common cases. ie calling TUNGETIFF will return EBADF for an invalid FD, and ENOTTY when FD is a UNIX socket, or pipe which catches accidental collisions with FDs used for stdio, or monitor socket. Previously the example below where bogus fd 9 collides with the FD used for the chardev saw: $ ./x86_64-softmmu/qemu-system-x86_64 -netdev tap,id=hostnet0,fd=9 \ -chardev socket,id=charchannel0,path=/tmp/qga,server,nowait \ -monitor stdio -vnc :0 qemu-system-x86_64: -netdev tap,id=hostnet0,fd=9: TUNGETIFF ioctl() failed: Inappropriate ioctl for device TUNSETOFFLOAD ioctl() failed: Bad address QEMU 2.9.1 monitor - type 'help' for more information (qemu) Warning: netdev hostnet0 has no peer which gives a running QEMU with a zombie network backend. With this change applied we get an error message and QEMU immediately exits before carrying on and making a bigger disaster: $ ./x86_64-softmmu/qemu-system-x86_64 -netdev tap,id=hostnet0,fd=9 \ -chardev socket,id=charchannel0,path=/tmp/qga,server,nowait \ -monitor stdio -vnc :0 qemu-system-x86_64: -netdev tap,id=hostnet0,vhost=on,fd=9: Unable to query TUNGETIFF on FD 9: Inappropriate ioctl for device Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Tested-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-id: 20171027085548.3472-1-berrange@redhat.com [lv: to simplify, don't check on EINVAL with TUNGETIFF as it exists since v2.6.27] Signed-off-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
When using C11 atomics, non-seqcst reads and writes do not participate
in the total order of seqcst operations. In util/async.c and util/aio-posix.c,
in particular, the pattern that we use
write ctx->notify_me write bh->scheduled
read bh->scheduled read ctx->notify_me
if !bh->scheduled, sleep if ctx->notify_me, notify
needs to use seqcst operations for both the write and the read. In
general this is something that we do not want, because there can be
many sources that are polled in addition to bottom halves. The
alternative is to place a seqcst memory barrier between the write
and the read. This also comes with a disadvantage, in that the
memory barrier is implicit on strongly-ordered architectures and
it wastes a few dozen clock cycles.
Fortunately, ctx->notify_me is never written concurrently by two
threads, so we can assert that and relax the writes to ctx->notify_me.
The resulting solution works and performs well on both aarch64 and x86.
Note that the atomic_set/atomic_read combination is not an atomic
read-modify-write, and therefore it is even weaker than C11 ATOMIC_RELAXED;
on x86, ATOMIC_RELAXED compiles to a locked operation.
Analyzed-by: Ying Fang <fangying1@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Ying Fang <fangying1@huawei.com>
Message-Id: <20200407140746.8041-6-pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
These iothread mutex locks/unlocks were dropped somewhere between qemu 2.9.1 and our current codebase. Assuming they don't break record/replay, they probalby should still be here. Without these, PANDA would deadlock about 75% of the time with the vexpress-a9 ARM guest machine when using -netdev tap and virtio networking.
Member
|
Can confirm this works for the target where we first saw the bug. There appears to be an issue with the barebox minimal test case (with or without tap device): |
…ng and unlocking fns
Member
|
Just a quick comment in case others run into this issue. The systems that were deadlocking were Linux kernels that somehow entered a race condition in QEMU during guest execution of the |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.