fix(hostagent): ensure VFs are unmanaged by NetworkManager via persistent udev rule #52
Open
tsorya wants to merge 6 commits into
…tent udev rule

Add EnsureVFsUnmanaged() to the Backend interface so that NM-specific udev rule logic is encapsulated in NetworkManagerBackend while systemd-networkd cleanly no-ops.

Key changes:
- Move udev rule logic from hostagent/util into netconfig/nm_udev.go
- Mount /etc/udev/rules.d from host into hostagent container to persist the rule across reboots (HostPathDirectoryOrCreate)
- Skip udevadm reload/trigger when rule file is already up-to-date

NetworkManager only evaluates NM_UNMANAGED when a device first appears, so the rule must be present before VFs are created.

Co-authored-by: Cursor <cursoragent@cursor.com>
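To make the "skip reload when up-to-date" behavior concrete, here is a minimal sketch of how the interface split could look. It is not the code from this PR: the struct fields, the writeIfChanged helper, and the injected reload callback are assumptions made for illustration, and the rule content is a placeholder rather than the real rule. Only EnsureVFsUnmanaged(), the Backend interface name, and the rule file name come from the PR text.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"path/filepath"
)

// Backend carries the method named in the commit message.
type Backend interface {
	EnsureVFsUnmanaged() error
}

// networkdBackend: systemd-networkd needs no udev rule, so this is a no-op.
type networkdBackend struct{}

func (networkdBackend) EnsureVFsUnmanaged() error { return nil }

// nmBackend is a hypothetical stand-in for NetworkManagerBackend.
type nmBackend struct {
	rulePath string       // e.g. /etc/udev/rules.d/10-nm-unmanaged.rules
	rule     []byte       // rule content (placeholder in this sketch)
	reload   func() error // real code would run udevadm reload/trigger here
}

// writeIfChanged writes content to path only when the file is missing or
// differs, creating parent directories as needed. It reports whether it wrote.
func writeIfChanged(path string, content []byte) (bool, error) {
	existing, err := os.ReadFile(path)
	if err == nil && bytes.Equal(existing, content) {
		return false, nil // already up-to-date
	}
	if err != nil && !os.IsNotExist(err) {
		return false, err
	}
	if err := os.MkdirAll(filepath.Dir(path), 0o755); err != nil {
		return false, err
	}
	if err := os.WriteFile(path, content, 0o644); err != nil {
		return false, err
	}
	return true, nil
}

func (b *nmBackend) EnsureVFsUnmanaged() error {
	changed, err := writeIfChanged(b.rulePath, b.rule)
	if err != nil {
		return fmt.Errorf("write udev rule: %w", err)
	}
	if !changed {
		return nil // steady state: no udevadm calls
	}
	return b.reload()
}

func main() {
	dir, err := os.MkdirTemp("", "udev-sketch")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	reloads := 0
	var b Backend = &nmBackend{
		rulePath: filepath.Join(dir, "rules.d", "10-nm-unmanaged.rules"),
		rule:     []byte("# placeholder rule content\n"),
		reload:   func() error { reloads++; return nil },
	}
	for i := 0; i < 3; i++ { // three simulated reconcile passes
		if err := b.EnsureVFsUnmanaged(); err != nil {
			panic(err)
		}
	}
	fmt.Println("reloads:", reloads) // prints "reloads: 1"
}
```

Injecting reload as a function keeps the sketch runnable without root or udevadm; it also makes the idempotency easy to assert in a unit test.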
…and MTU flapping

Four related issues caused hostagent networking failures on hosts with SR-IOV VFs as bridge members:

1. VF MTU via netlink: VFs (e.g. ens7f0v0) that are bridge members cannot have their MTU configured through NetworkManager, because NM activation fails when VF connection profiles conflict with the PF-managed config. Detect VFs via the sysfs physfn symlink and set their MTU directly via netlink instead.
2. MAC address stripping: When a NM connection profile is read and written back (round-trip), the 802-3-ethernet.mac-address property causes Update failures for VFs whose MAC is managed by the PF. Add mac-address to unsafeRoundtripProps so it is stripped before calling NM Update.
3. Profile MTU check to prevent flapping: When the NM profile already has the desired MTU but the kernel link MTU temporarily differs (e.g. after driver reload), the hostagent would re-activate the connection on every reconcile loop, bouncing the interface. Check the NM profile MTU first and skip activation when it already matches.
4. Set interface-name on bridge member connections: Without interface-name, NM may match the wrong connection profile when multiple connections exist for the same interface type.

Co-authored-by: Cursor <cursoragent@cursor.com>
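The first three items above can be sketched as small helpers. This is an illustration under assumptions, not the PR's code: isVF, decideMTUAction, stripUnsafeProps, the mtuAction constants, and the shape of unsafeRoundtripProps are all hypothetical, and the order of the checks (VF first, then profile MTU) is a guess. The one kernel fact relied on is that an SR-IOV VF exposes a physfn symlink under its device directory in sysfs.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// isVF reports whether iface is an SR-IOV virtual function by checking for
// the "physfn" symlink. sysfsRoot is a parameter only so the sketch can be
// exercised in tests; on a real host it would be "/sys".
func isVF(sysfsRoot, iface string) bool {
	_, err := os.Lstat(filepath.Join(sysfsRoot, "class", "net", iface, "device", "physfn"))
	return err == nil
}

type mtuAction int

const (
	mtuSkip            mtuAction = iota // nothing to do
	mtuSetViaNetlink                    // VF: bypass NetworkManager
	mtuActivateProfile                  // update and activate the NM profile
)

// decideMTUAction picks how to apply desiredMTU (assumed ordering).
func decideMTUAction(vf bool, linkMTU, profileMTU, desiredMTU int) mtuAction {
	switch {
	case vf && linkMTU != desiredMTU:
		return mtuSetViaNetlink
	case vf:
		return mtuSkip
	case profileMTU == desiredMTU:
		// Profile is already correct; a transient kernel mismatch must not
		// trigger re-activation, otherwise the interface bounces.
		return mtuSkip
	default:
		return mtuActivateProfile
	}
}

// unsafeRoundtripProps: the name comes from the commit message; this
// section-to-keys shape is an assumption.
var unsafeRoundtripProps = map[string][]string{
	"802-3-ethernet": {"mac-address"},
}

// stripUnsafeProps removes properties that must not be written back to NM.
func stripUnsafeProps(settings map[string]map[string]interface{}) {
	for section, keys := range unsafeRoundtripProps {
		for _, k := range keys {
			delete(settings[section], k) // deleting from a nil map is a no-op
		}
	}
}

func main() {
	fmt.Println("lo is VF:", isVF("/sys", "lo"))
	fmt.Println("non-VF, profile matches:", decideMTUAction(false, 1500, 9000, 9000) == mtuSkip)

	settings := map[string]map[string]interface{}{
		"802-3-ethernet": {"mac-address": "placeholder", "mtu": 9000},
	}
	stripUnsafeProps(settings)
	fmt.Println("remaining ethernet keys:", len(settings["802-3-ethernet"]))
}
```

Keeping the decision in a pure function separates policy from the netlink and D-Bus calls, so the flapping case (link MTU differs, profile MTU matches) can be covered by a table test without a real interface.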
AddNetworkRequest returned early when a request already existed, ignoring the provided vfCount. Now updates NumOfVFs and persists the change so the next processing cycle applies it.
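A minimal sketch of the changed behavior, assuming an in-memory map keyed by request name and an injected persist callback. The requestStore and networkRequest types, the return values, and the "no persist when the count is unchanged" shortcut are hypothetical; only AddNetworkRequest, vfCount, and NumOfVFs are names from the commit message.

```go
package main

import "fmt"

type networkRequest struct {
	Name     string
	NumOfVFs int
}

// requestStore is a hypothetical holder for pending requests.
type requestStore struct {
	requests map[string]*networkRequest
	persist  func() error // stand-in for whatever the agent uses to save state
}

// AddNetworkRequest creates the request or, if it already exists, updates
// NumOfVFs instead of returning early. It reports whether state changed.
func (s *requestStore) AddNetworkRequest(name string, vfCount int) (bool, error) {
	if req, ok := s.requests[name]; ok {
		if req.NumOfVFs == vfCount {
			return false, nil // nothing to update
		}
		req.NumOfVFs = vfCount
		return true, s.persist() // next processing cycle sees the new count
	}
	s.requests[name] = &networkRequest{Name: name, NumOfVFs: vfCount}
	return true, s.persist()
}

func main() {
	saves := 0
	s := &requestStore{
		requests: map[string]*networkRequest{},
		persist:  func() error { saves++; return nil },
	}
	s.AddNetworkRequest("net-a", 4)
	s.AddNetworkRequest("net-a", 8) // previously ignored; now updates
	s.AddNetworkRequest("net-a", 8) // unchanged, so no extra persist
	fmt.Println(s.requests["net-a"].NumOfVFs, saves) // prints "8 2"
}
```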
wpeng102
reviewed
Jun 1, 2026
If writeUdevRuleFile() succeeds but reloadAndTriggerUdev() fails, the next reconcile would see the file already up-to-date and skip the reload/trigger, leaving the system in a broken state.

Add an in-memory udevRulesApplied flag that only becomes true after a successful reload/trigger. This ensures retry on the next reconcile while still skipping redundant udevadm calls in the steady state.

Signed-off-by: Igal Tsoiref <itsoiref@redhat.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
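The retry logic described in this commit can be sketched as follows. The udevRuleManager type, its function-valued fields, and the ensure method are assumptions for illustration; the two injected callbacks stand in for writeUdevRuleFile() and reloadAndTriggerUdev(), and only the udevRulesApplied flag name is taken from the commit message.

```go
package main

import (
	"errors"
	"fmt"
)

// errReloadFailed simulates a udevadm failure in this sketch.
var errReloadFailed = errors.New("udevadm reload failed (simulated)")

type udevRuleManager struct {
	writeRuleFile    func() (changed bool, err error)
	reloadAndTrigger func() error
	udevRulesApplied bool // true only after a successful reload/trigger
}

func (m *udevRuleManager) ensure() error {
	changed, err := m.writeRuleFile()
	if err != nil {
		return err
	}
	if changed {
		m.udevRulesApplied = false // new content has not been loaded yet
	}
	if m.udevRulesApplied {
		return nil // steady state: skip redundant udevadm calls
	}
	if err := m.reloadAndTrigger(); err != nil {
		return err // flag stays false, so the next reconcile retries
	}
	m.udevRulesApplied = true
	return nil
}

func main() {
	writes, reloads := 0, 0
	m := &udevRuleManager{
		writeRuleFile: func() (bool, error) {
			writes++
			return writes == 1, nil // file only changes on the first pass
		},
		reloadAndTrigger: func() error {
			reloads++
			if reloads == 1 {
				return errReloadFailed // first attempt fails
			}
			return nil
		},
	}
	for i := 1; i <= 3; i++ {
		fmt.Printf("reconcile %d: err=%v\n", i, m.ensure())
	}
	fmt.Println("reload calls:", reloads) // prints "reload calls: 2"
}
```

Because the flag lives in memory, a process restart resets it, which costs one extra reload/trigger on the first reconcile after startup; that is harmless.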
Summary
- Add EnsureVFsUnmanaged() to the Backend interface so NM-specific udev rule logic is encapsulated in NetworkManagerBackend while systemd-networkd cleanly no-ops
- Move udev rule logic from hostagent/util/udev.go into netconfig/nm_udev.go
- Mount /etc/udev/rules.d from the host into the hostagent container (HostPathDirectoryOrCreate) to persist the NM unmanaged rule across reboots
- Skip udevadm reload/trigger when the rule file is already up-to-date (idempotency for the 30s reconcile loop)

Background
NetworkManager only evaluates NM_UNMANAGED when a device first appears. The rule must exist in the persistent /etc/udev/rules.d/ before VFs are created so NM never manages them. Previously the file was written to the container's ephemeral filesystem, invisible to the host.

Test plan
- Unit tests (nm_udev_test.go): rule written + udevadm triggered on first run, idempotent skip, overwrite on mismatch, mkdir parents, error paths
- Verify /etc/udev/rules.d/10-nm-unmanaged.rules is present on host
- Verify VFs show as unmanaged in nmcli device status