Since 4817165fc5, when reverting a
resized instance back to the source host, the libvirt driver waits for
vif-plugged events when spawning the instance. When called from
finish_revert_resize() in the source compute manager, libvirt's
finish_revert_migration() does not pass vifs_already_plugged to
_create_domain_and_network(), making the latter use the default False
value.
When the source compute manager calls
network_api.migrate_instance_finish() in finish_revert_resize(), this
updates the port binding back to the source host. If Neutron is
configured to use OVS hybrid plug, it will send the vif-plugged event
immediately after completing this request. This happens before the
virt driver's finish_revert_migration() method is called. This causes
the wait in the libvirt driver to time out because the event is
received before Nova starts waiting for it.
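
The ordering problem above can be reproduced with a toy model. This is
only a sketch: EventBus, prepare() and deliver() are hypothetical names
invented for illustration, not Nova's external-event machinery. It
assumes an event delivered with no registered waiter is simply dropped.

```python
import threading


class EventBus:
    """Toy stand-in for an external-event dispatcher (hypothetical)."""

    def __init__(self):
        self._waiters = {}
        self._lock = threading.Lock()

    def prepare(self, name):
        """Register interest in an event and return a waitable object."""
        waiter = threading.Event()
        with self._lock:
            self._waiters[name] = waiter
        return waiter

    def deliver(self, name):
        """Deliver an event; return False if nobody was registered."""
        with self._lock:
            waiter = self._waiters.pop(name, None)
        if waiter is None:
            return False  # assumption: unexpected events are dropped
        waiter.set()
        return True


def revert_waiting_too_late(bus):
    # The binding update makes the event arrive first ...
    bus.deliver("network-vif-plugged")
    # ... and only afterwards does the driver start waiting for it.
    waiter = bus.prepare("network-vif-plugged")
    return waiter.wait(timeout=0.1)


def revert_waiting_in_time(bus):
    # Registering before the triggering action avoids the lost event.
    waiter = bus.prepare("network-vif-plugged")
    bus.deliver("network-vif-plugged")
    return waiter.wait(timeout=0.1)
```

In this model the first function times out and the second returns
promptly, which mirrors why the wait has to be set up before the port
binding is updated.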
The Neutron OVS L2 agent sends vif-plugged events when two conditions
are met. First, the port must be bound to the host managed by the
L2 agent; second, the agent must have completed configuring the
port on OVS. This involves assigning the port a local VLAN for tenant
isolation, applying security group rules if required, and applying
QoS policies or other agent extensions such as service function
chaining.
During the boot process, we first bind the port to the host, then
plug the interface into OVS. Plugging triggers the L2 agent to
configure the port, which results in the emission of the vif-plugged
event. In the revert case, as noted above, the VIF is already plugged
on the source node when hybrid plug is used, so the agent has already
configured the port. Binding the port back to the source node
therefore fulfils the one remaining condition, and the vif-plugged
event is sent.
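
The two conditions can be modelled as a small state check. This is a
hedged sketch, not the Neutron agent's code: ToyL2Agent, Port and their
fields are hypothetical, and the "configured" flag stands for "the
agent on this host has finished wiring the port".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Port:
    name: str
    bound_host: Optional[str] = None
    configured: bool = False  # hypothetical: agent finished its setup


class ToyL2Agent:
    """Emits an event once a port is both bound here and configured."""

    def __init__(self, host):
        self.host = host
        self.events = []  # (port name, what triggered the event)

    def _maybe_emit(self, port, trigger):
        if port.bound_host == self.host and port.configured:
            self.events.append((port.name, trigger))

    def on_binding_update(self, port, host):
        port.bound_host = host
        self._maybe_emit(port, "bind")

    def on_plug(self, port):
        port.configured = True
        self._maybe_emit(port, "plug")


def boot_sequence():
    agent = ToyL2Agent("source")
    port = Port("boot-port")
    agent.on_binding_update(port, "source")  # bound, not yet configured
    agent.on_plug(port)                      # now both conditions hold
    return agent.events


def hybrid_plug_revert_sequence():
    agent = ToyL2Agent("source")
    # Still plugged and configured on the source, but bound elsewhere.
    port = Port("revert-port", bound_host="dest", configured=True)
    agent.on_binding_update(port, "source")
    return agent.events
```

In the boot case the event is triggered by the plug; in the hybrid-plug
revert case it is triggered by the binding update alone.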
Events sent immediately after port binding update are hereafter known
as "bind-time" events. For ports that do not use OVS hybrid plug,
Neutron will continue to send vif-plugged events only when Nova
actually plugs the VIF. These types of events are hereafter known as
"plug-time" events. OVS hybrid plug is a per-agent setting, so for
a particular host, bind-time events are all-or-nothing for the
OVS backend: either all VIF_TYPE=ovs ports have them, or none do.
In general, a host will only have one network backend.
The only exception to this is SR-IOV. SR-IOV is commonly deployed on
the same host as other network backends such as OVS or linuxbridge.
SR-IOV ports with VNIC_TYPE=direct-physical will always have only
bind-time events. If an instance mixes OVS ports using
hybrid-plug=False with direct-physical ports, it will have both kinds
of events.
For same-host resize reverts, we do not update the binding host because
the host does not change, so we do not receive bind-time events. For
same-host reverts we therefore do not wait for bind-time events in the
compute manager.
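
The classification described above can be sketched as follows. ToyVif
and both helper functions are hypothetical illustrations, not the
methods this patch adds to the NetworkInfo model. Treating every
remaining VIF as plug-time is a simplifying assumption of this sketch.

```python
from dataclasses import dataclass

EVENT = "network-vif-plugged"


@dataclass(frozen=True)
class ToyVif:
    port_id: str
    hybrid_plug: bool = False
    vnic_type: str = "normal"


def has_bind_time_event(vif, same_host):
    """True if the event is expected right after the binding update."""
    if same_host:
        # The binding host does not change, so no bind-time event.
        return False
    return vif.hybrid_plug or vif.vnic_type == "direct-physical"


def bind_time_events(vifs, same_host):
    return [(EVENT, v.port_id) for v in vifs
            if has_bind_time_event(v, same_host)]


def plug_time_events(vifs, same_host):
    # Assumption: anything without a bind-time event is plug-time.
    return [(EVENT, v.port_id) for v in vifs
            if not has_bind_time_event(v, same_host)]
```

For example, an instance with one hybrid-plug OVS port, one
non-hybrid OVS port and one direct-physical port would yield two
bind-time events and one plug-time event when reverting across hosts,
and no bind-time events when reverting on the same host.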
This patch adds functions to the NetworkInfo model that return what
kinds of events each VIF has. These are then used in the migration
revert logic to decide when to wait for external events: in the
compute manager, when binding the port, for bind-time events,
and/or in libvirt, when plugging the VIFs, for plug-time events.
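
The resulting split of responsibilities might look roughly like the
sketch below. It only records the order of operations: wait_for_events()
and revert_resize() are hypothetical helpers, and the real compute
manager and libvirt driver code paths are more involved.

```python
from contextlib import contextmanager


@contextmanager
def wait_for_events(log, where, events):
    """Hypothetical waiter: register before the body, wait after it."""
    if events:
        log.append(f"{where}: register {sorted(events)}")
    yield
    if events:
        log.append(f"{where}: wait for {sorted(events)}")


def revert_resize(bind_time, plug_time):
    """Return the ordered steps for a revert, given pre-sorted events."""
    log = []
    # Compute manager: wait around the port binding update.
    with wait_for_events(log, "manager", bind_time):
        log.append("manager: update port binding")
    # Virt driver: wait around plugging the VIFs.
    with wait_for_events(log, "driver", plug_time):
        log.append("driver: plug VIFs")
    return log
```

Registering before the triggering action in each place is the design
point: each kind of event is awaited by whichever layer performs the
action that causes it. With no events of a given kind, the matching
wait is skipped entirely.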
(cherry picked from commit 7a7a223602)
Conflicts in nova/tests/unit/objects/test_migration.py due to
1cf3da87027d87251920c2df665b850abb31178e's addition of
test_obj_make_compatible() and test_get_by_uuid().
(cherry picked from commit 7a3a8f325e)
NOTE(artom) uuidsentinel was moved to oslo_utils.fixture in Stein, in
Rocky it was still in nova.tests.
Closes-Bug: #1832028
Closes-Bug: #1833902
Co-Authored-By: Sean Mooney <work@seanmooney.info>
Change-Id: I51673e58fc8d5f051df911630f6d7a928d123a5b