Due to the bug below, all the nova patches are failing. I accordingly
propose to make the job non-voting as it would allow our gate to ressurect.
Of course, this is a transient solution and we need a proper fix for the
job in order to have Centos 9 Stream support back in place ASAP.
When the job is back OK, please make it voting again by reverting this patch.
Change-Id: I5c3fff65fd6d9e4f3632d1ec62ae3f1f9cfbe626
Related-Bug: #1979047
Because in the above description:
To install nova-compute, run:
PS C:\> cd c:\nova
PS C:\> python setup.py install
The file is downloaded to c:\nova. After executing `python setup.py
install` in this directory, the etc directory is generated under
`c:\nova`. Correct the path.
Closes-Bug: #1964548
Change-Id: Ibf0c1b56f235fffae65afbbcee30056bae965afe
Change [1] added new fields 'src|dst_supports_numa_live_migration'
to LibvirtLiveMigrateData object, but missed if condition for
dst_supports_numa_live_migration field in obj_make_compatible
method.
This change adds the if condition as well as fix typo in unit test
because of which this wasn't catched earlier.
Closes-Bug: #1975891
Change-Id: Ice5a2c7aca77f47ea6328a10d835854d9aff408e
The libvirt virt dirver checks the AMD KVM kernel module parameter SEV
to see if that feature is enabled. However it seems that the
/sys/module/kvm_amd/parameters/sev file can either contain "1\n" or
"Y\n" to indicate that the feature is enabled. Nova only checked for
"1\n" so far making the feature disabled on compute nodes with "Y\n"
value. Now the logic is extended to accept both.
Closes-Bug: #1975686
Change-Id: I737e1d73242430b6756178eb0bf9bd6ec5c94160
Nova's use of libvirt's compareCPU() API served its purpose
over the years, but its design limitations break live migration in
subtle ways. For example, the compareCPU() API compares against the
host physical CPUID. Some of the features from this CPUID aren not
exposed by KVM, and then there are some features that KVM emulates that
are not in the host CPUID. The latter can cause bogus live migration
failures.
With QEMU >=2.9 and libvirt >= 4.4.0, libvirt will do the right thing in
terms of CPU compatibility checks on the destination host during live
migration. Nova satisfies these minimum version requirements by a good
margin. So, provide a workaround to skip the CPU comparison check on
the destination host before migrating a guest, and let libvirt handle it
correctly. This workaround will be removed once Nova replaces the older
libvirt APIs with their newer and improved counterparts[1][2].
- - -
Note that Nova's libvirt driver calls compareCPU() in another method,
_check_cpu_compatibility(); I did not remove its usage yet. As it needs
more careful combing of the code, and then:
- where possible, remove the usage of compareCPU() altogether, and
rely on libvirt doing the right thing under the hood; or
- where Nova _must_ do the CPU comparison checks, switch to the better
libvirt CPU APIs -- baselineHypervisorCPU() and
compareHypervisorCPU() -- that are described here[1]. This is work
in progress[2].
[1] https://opendev.org/openstack/nova-specs/commit/70811da221035044e27
[2] https://review.opendev.org/q/topic:bp%252Fcpu-selection-with-hypervisor-consideration
Change-Id: I444991584118a969e9ea04d352821b07ec0ba88d
Closes-Bug: #1913716
Signed-off-by: Kashyap Chamarthy <kchamart@redhat.com>
Signed-off-by: Balazs Gibizer <bgibizer@redhat.com>
Just because we encountered a PortNotFound error when unbinding a port
doesn't mean we should stop unbinding the remaining ports. If this error
is encountered, simply continue with the other ports.
While we're here, we clean up some other tests related to '_unbind_port'
since they're clearly duplicates.
Change-Id: Id04e0df12829df4d8929e03a8b76b5cbe0549059
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
Closes-Bug: #1974173
The call to _get_pci_passthrough_devices could fail because a
network device could have disappeared which would cause a traceback
in the logs.
This wraps the function in a safe way to return an empty array
if it fails, which will clean-up the logs if the device disappears
Closes-Bug: #1972028
Change-Id: I46d3bbe122d9f8452f168286391bab67ecea3128
This change simply replaces calling notifyAll() with
notify_all() on the threading condition object in
the notification fixture.
notify_all() was introduced in python 2.6
Change-Id: If8d386f20693016dd35ecfdbc703bf31aa103a67
Patch fixing bug #1861071 resolved the issue of extending LUKS v1
volumes when nova connects them via libvirt instead of through os-brick,
but nova side still fails to extend in-use volumes when they don't go
through libvirt (i.e., LUKS v2).
The logs will show a very similar error, but the user won't know that
this has happened and Cinder will show the new size:
libvirt.libvirtError: internal error: unable to execute QEMU command
'block_resize': Cannot grow device files
There are 2 parts to this problem:
- The device mapper device is not automatically extended.
- Nova tries to use the encrypted block device size as the size of the
decrypted device.
This patch leverages the "extend_volume" method in os-brick connectors
to extend the device mapper device, after the encrypted device has been
extended, and use the size of the decrypted volume for the block_resize
operation.
Related change: I351f1a7769c9f915e4cd280f05a8b8b87f40df84
Closes-Bug: #1967157
Change-Id: Ia1411f11ec4bf44af6a42d5f96c8a0903846ed66
At the moment, if libvirt times out in detaching a device, it
reports this as an ERROR even if the process will be retried
and eventually succeed.
We should just log a warning since there's nothing to do, and
if the process fails after all the retries, it will log an ERROR
anyways.
Closes-Bug: #1972023
Change-Id: Idda12db5758706a97b7841571b9ecd3dc6e6905e
These are currently non-voting since we don't care about this stuff for
Zed. It does get us ready for a 3.10-having future, however.
Change-Id: I7740dafd6523eca27fa4e725d7eaf8558e434779
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
In Zed cycle we have stopped testing the python3.6|7
means dropped the support and we also bumped the min
python required version in setup.cfg
- Iba5074ea6f981a7527e86cfc98edd1ed7dd3086f
Adding a releasentoes will be good to communicate the same.
Change-Id: I85c1136dd0cd0dce96f8285d3930a31c5a68ead5
The flag was added to prevent the source host neutron agents to trigger
a vif-plugged event too early during a live migration. But actually it
can be used in a more generic sense as the code filter on migration_to
binding profile attribute.
We saw too early vif-plugged events from neutron during evacuation in
post part of the nova-ovs-hybrid-plug job. So this patch enables the
workaround flag for this job too.
Closes-Bug: #1971563
Change-Id: Ifd20ece3a4f126da16f077247c2f1e072edb7163
As If9ab424cc7375a1f0d41b03f01c4a823216b3eb8 stated there is a way for
the pci_device table to become inconsistent. Parent PF can be in
'available' state while children VFs are still in 'unavailable' state.
In this situation the PF is schedulable but the PCI claim will fail
when try to mark the dependent VFs unavailable.
This patch changes the PCI claim logic to allow claiming the parent PF
in the inconsistent situation as we assume that it is safe to do so.
This claim also fixed the inconsistency so that when the parent PF is
freed the children VFs become available again.
Closes-Bug: #1969496
Change-Id: I575ce06bcc913add7db0849f85728371da2032fc
Today Nova updates the mac_address of a direct-physical port to reflect
the MAC address of the physical device the port is bound to. But this
can only be done before the port is bound. However during migration Nova
does not update the MAC when the port is bound to a different physical
device on the destination host.
This patch extends the libvirt virt driver to provide the MAC address of
the PF in the pci_info returned to the resource tracker. This
information will be then persisted in the extra_info field of the
PciDevice object.
Then the port update logic during migration, resize, live
migration, evacuation and unshelve is also extended to record the MAC of
physical device in the port binding profile according to the device on
the destination host.
The related neutron change Ib0638f5db69cb92daf6932890cb89e83cf84f295
uses this info from the binding profile to update the mac_address field
of the port when the binding is activated.
Closes-Bug: #1942329
Change-Id: Iad5e70b43a65c076134e1874cb8e75d1ba214fde
When getting an instance using the compute.API we call
scatter_gather_single_cell() to be able to capture details when we fail
to retrieve a result from a cell such as timeouts and exceptions.
Currently however, we aren't logging the content of an exception if
scatter_gather_single_cell() returns an exception as the result. The
scatter gather method itself logs exceptions that are not of type
NovaException as these represent definite unexpected errors such as
database errors but NovaException handling are left for the caller to
decide whether they want to log it or re-raise it and so on.
It can be difficult to debug a situation where a cell is returning a
NovaException result so this adds logging of the exception content in
the compute API when we encounter an unexpected NovaException.
The existing log message has been updated to more accurately reflect
what has happened (did not respond vs exception). The assignment of the
exception object in scatter gather has also been updated to not
unnecessarily construct a new exception object because it (a) wasn't
necessary and (b) made asserting the LOG.exception() call argument in
the unit test difficult.
Related-Bug: #1970087
Change-Id: Iae1c61c72be5b6017b934293e3dc079a24eeb0e7