Since the commit to remove the AMI snapshot format special casing
has merged, we're now running the libvirt snapshot tests as expected.
However, those tests need the qemu-img binary to be installed.
Because these tests have been silently and incorrectly skipped for so long,
their failures went unnoticed and they didn't receive the same maintenance
as other tests.
Change-Id: Ia90eedbe35f4ab2b200bdc90e0e35e5a86cc2110
Closes-bug: #2075178
Signed-off-by: Julien Le Jeune <julien.le-jeune@ovhcloud.com>
(cherry picked from commit 0809f75d7921fe01a6832211081e756a11b3ad4e)
While we do cache the host's capabilities in self._caps in the
libvirt Host object, if we happen to first call get_capabilities() with
some of our dedicated CPUs offline, libvirt erroneously reports them
as being on socket 0 regardless of their real socket. We would then
cache that topology, thus breaking pretty much all of our NUMA
accounting.
To fix this, this patch makes sure to call get_capabilities()
immediately upon host init, and to power up all our dedicated CPUs
before doing so. That way, we cache their real socket ID.
For testing, because we don't really want to implement a libvirt bug
in our Python libvirt fixture, we make do with a simple unit test
that asserts that init_host() has powered on the correct CPUs.
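
The ordering described above can be sketched as follows. This is only an
illustration: the Host class here is a hypothetical, simplified stand-in, and
the power_up and fetch_caps callables are assumptions, not the real libvirt
driver API.

```python
class Host:  # hypothetical, simplified stand-in for the libvirt Host object
    def __init__(self, dedicated_cpus, power_up, fetch_caps):
        self._dedicated_cpus = sorted(dedicated_cpus)
        self._power_up = power_up      # assumed callable(cpu_id)
        self._fetch_caps = fetch_caps  # assumed callable() -> capabilities
        self._caps = None

    def get_capabilities(self):
        # Capabilities are fetched once and then served from the cache.
        if self._caps is None:
            self._caps = self._fetch_caps()
        return self._caps

    def init_host(self):
        # Power up every dedicated CPU first, then populate the cache.
        for cpu in self._dedicated_cpus:
            self._power_up(cpu)
        self.get_capabilities()


events = []


def fake_fetch_caps():
    events.append('caps')
    return {'topology': 'real'}


host = Host({3, 1},
            power_up=lambda cpu: events.append(('up', cpu)),
            fetch_caps=fake_fetch_caps)
host.init_host()
host.get_capabilities()  # served from the cache, no second fetch
```

Recording the calls in a list mirrors the unit test approach mentioned above:
the assertion is on which CPUs were powered on and in what order, not on any
emulated libvirt behaviour.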
Closes-bug: 2077228
Change-Id: I9a2a7614313297f11a55d99fb94916d3583a9504
(cherry picked from commit 79d1f06094599249e6e30ebba2488b8b7a10834e)
We currently get the following error message if attempting to fit a
guest with hugepages on a node that doesn't have enough:
Host does not support requested memory pagesize, or not enough free
pages of the requested size. Requested: -2 kB
Correct this, removing the kB suffix and adding a note on the meaning of
the negative values, like we have for the success path.
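
As a rough sketch of the kind of message this aims for: the constant values
below are assumed sentinels for the symbolic page sizes, and the function name
and exact wording are hypothetical, not the text used by the patch.

```python
# Assumed sentinel values for symbolic page size requests.
MEMPAGES_SMALL = -1
MEMPAGES_LARGE = -2
MEMPAGES_ANY = -3


def describe_pagesize_failure(requested):
    """Build an error message with no unit suffix and a note on sentinels."""
    return (
        'Host does not support requested memory pagesize, or not enough '
        'free pages of the requested size. Requested memory pagesize: '
        '%d (small = %d, large = %d, any = %d)'
        % (requested, MEMPAGES_SMALL, MEMPAGES_LARGE, MEMPAGES_ANY))


message = describe_pagesize_failure(MEMPAGES_LARGE)
```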
Change-Id: I247dc0ec03cd9e5a7b41f5c5534bdfb1af550029
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Closes-Bug: #2075959
(cherry picked from commit 4678bcbb064da580500b1dbeddb0bdfdeac074ef)
The resource tracker Claim object works on a copy of the instance object
it gets from the compute manager. But the PCI claim logic does not use the
copy; it uses the original instance object. However, the abort claim logic,
including the abort PCI claim logic, worked on the copy only. Therefore the
claimed PCI devices are visible to the compute manager in the
instance.pci_devices list even after the claim is aborted.
There was another bug in the PCIDevice object where the instance object
wasn't passed to the free() function and therefore the
instance.pci_devices list wasn't updated when the device was freed.
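
The copy-versus-original mismatch can be reproduced in a few lines. This is a
minimal sketch with a hypothetical Instance class and helper functions; it
does not reflect the real resource tracker or PCI manager code.

```python
import copy


class Instance:  # hypothetical, minimal stand-in for the instance object
    def __init__(self):
        self.pci_devices = []


def claim_pci(instance, device):
    instance.pci_devices.append(device)


def abort_pci_claim(instance, device):
    if device in instance.pci_devices:
        instance.pci_devices.remove(device)


original = Instance()
claim_copy = copy.deepcopy(original)

# Buggy flow: the claim touches the original, the abort touches the copy.
claim_pci(original, 'pci_0000_81_00_1')
abort_pci_claim(claim_copy, 'pci_0000_81_00_1')
leaked = list(original.pci_devices)  # the device is still listed here

# Consistent flow: claim and abort operate on the same object.
abort_pci_claim(original, 'pci_0000_81_00_1')
```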
Closes-Bug: #1860555
Change-Id: Iff343d4d78996cd17a6a584fefa7071c81311673
(cherry picked from commit f8b98390dc99f6cb0101c88223eb840e0d1c7124)
Let's first ensure we have a test that proves we have bad behaviour,
then follow up with the fix and the test tweak to prove it.
On the first compute node it fails due to group policy error.
On the second compute node instance should have exactly one PCI device.
Related-Bug: #1860555
Change-Id: Ia122fff268c8f45ad3e5a3071d2cb7c990cb2c1d
(cherry picked from commit a2d77845ab247f1b09e2ae4f32f9421c3f50b98d)
The latest Zuul emits the following warning:
All regular expressions must conform to RE2 syntax, but an
expression using the deprecated Perl-style syntax has been detected.
Adjust the configuration to conform to RE2 syntax.
The RE2 syntax error is: invalid perl operator: (?!
This patch replaces 'irrelevant-files' with 'files', explicitly
listing the patterns of the files that the tests should run against.
Change-Id: If287e800fb9ff428dbe6f9c4c046627f22afe3df
(cherry picked from commit 9b77bae8a32ff41712b96bb6a67c7eacae45a4c9)
When the script was created there were only stable/* branches, but now
there are unmaintained/* branches as well. The validator fails there
because it looks for hashes only on stable/* branches, even if the given
hash is already on an unmaintained/* branch. This patch now matches both
stable/* and unmaintained/* branches.
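
The validator itself is a script that is not reproduced here, so the following
Python sketch only illustrates the matching rule. The pattern and the function
name are assumptions.

```python
import re

# Assumed pattern: a ref that ends in stable/<name> or unmaintained/<name>,
# optionally prefixed by a remote such as "origin/".
BRANCH_RE = re.compile(r'(?:^|/)(?:stable|unmaintained)/[^/\s]+$')


def is_release_branch(ref):
    """Return True for stable/* and unmaintained/* branch refs."""
    return BRANCH_RE.search(ref.strip()) is not None


refs = ['origin/stable/2023.2', 'remotes/origin/unmaintained/yoga',
        'origin/master']
matching = [ref for ref in refs if is_release_branch(ref)]
```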
Change-Id: I08fcc63ab0fbe5af1be70d5fde5af98bf006101c
(cherry picked from commit e2697de8e41a566eb86aefa364906bda9bc59863)
This is a fix for the test of whether a patch is bot generated or not, as
that did not work as intended. The problem is that the script is
checking the email address of the parent patch (HEAD~), which is
probably right when the patch is a MERGE patch. But this is
wrong when the patch is not a MERGE patch. This fix uses the very
same pattern as the commit message parsing does: the
$commit_hash variable, which is the parent's commit hash if the patch
is a MERGE patch, and an empty string otherwise (causing 'git show'
to be called on HEAD).
Change-Id: I0abc72180edf34a6dd0624a40fb8682397805eca
(cherry picked from commit b8f3975d3641fad19971cc159bdb9decb6ea95f8)
In general the card_serial_number will not be present on SR-IOV
VFs/PFs, as it is only supported on very new cards.
Also, all 3 need not always be required for vf_profile.
Related-Bug: #2008238
Change-Id: I00b126635612ace51b5e3138afcb064f001f1901
(cherry picked from commit a1a07e0d2d01e95b7d3c8db62d149a4278617c93)
These tests depend on qemu-img being installed and in the path; if it is not installed, skip them.
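
A minimal sketch of such a guard, assuming a plain unittest test case. The
helper, class and test names are hypothetical, and the actual tests may use
different fixtures.

```python
import shutil
import unittest


def qemu_img_available(which=shutil.which):
    """Return True if a qemu-img binary can be found on the PATH."""
    return which('qemu-img') is not None


class SnapshotTests(unittest.TestCase):  # hypothetical test case

    def setUp(self):
        super().setUp()
        if not qemu_img_available():
            self.skipTest('qemu-img is not installed')

    def test_snapshot(self):
        # A real test would exercise qemu-img here.
        self.assertTrue(qemu_img_available())
```

Passing the lookup function as a parameter keeps the check itself testable on
machines with or without qemu-img.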
Change-Id: I896f16c512f24bcdd898ab002af4e5e068f66b64
Closes-bug: #2073862
Signed-off-by: Julien Le Jeune <julien.le-jeune@ovhcloud.com>
(cherry picked from commit a3202f7bf9f1aecc7d0632011167d38a1698f0a0)
Note that this includes seemingly-unrelated test changes because we
were actually skipping the snapshot_running test for libvirt, which
has been a bug for years. In that test case, when we went to look
for image_meta.disk_format, that attribute was not set on the o.vo
object, which raised a NotImplementedError. That error is also checked
by the test to skip the test for drivers that do not support snapshot,
which meant that for libvirt, we haven't been running that case
beyond the point at which we create snapshot metadata and trip that
exception. Thus, once that is removed, other mocks that the test
needs in order to actually run are not in place. So, this adds
mocks for qemu_img_info() calls that actually try to read the file on
disk, as well as the privsep chown() that attempts to run after.
Change-Id: Ie731045629f0899840a4680d21793a16ade9b98e
(cherry picked from commit d5a631ba7791b37e49213707e4ea650a56d2ed9e)
When we moved the qemu-img command in fetch_to_raw() to force the
format to what we expect, we lost the ability to identify and react
to situations where qemu-img detected a file as a format that is not
supported by us (i.e. identified and safety-checked by
format_inspector). In the case of some of the other VMDK variants
that we don't support, we need to be sure to catch any case where
qemu-img thinks it's something other than raw when we think it is,
which will be the case for those formats we don't support.
Note this also moves us from explicitly using the format_inspector
that we're told by glance is appropriate, to using our own detection.
We assert that we agree with glance and as above, qemu agrees with
us. This helps us avoid cases where the uploader lies about the
image format, causing us to not run the appropriate safety check.
AMI formats are a liability here since we have a very hard time
asserting what they are and what they will be detected as later in
the pipeline, so there is still special-casing for those.
Closes-Bug: #2071734
Change-Id: I4b792c5bc959a904854c21565682ed3a687baa1a
(cherry picked from commit 8b4c522f6699514e7d1f20ac25cf426af6ea588f)
Kernels don't allow access to the governor strategy of an offline core, so
we need to validate strategies only for online cores.
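
A sketch of that filtering, with the sysfs access abstracted behind callables.
The function name and its parameters are hypothetical and do not reflect the
real nova helpers.

```python
def find_invalid_governors(cores, is_online, get_governor, allowed):
    """Return the online cores whose governor is not in `allowed`.

    Offline cores are skipped, so `get_governor` is never called for them.
    """
    invalid = []
    for core in sorted(cores):
        if not is_online(core):
            continue
        if get_governor(core) not in allowed:
            invalid.append(core)
    return invalid


# Core 1 is offline and has no readable governor at all.
online = {0: True, 1: False, 2: True}
governors = {0: 'performance', 2: 'ondemand'}
bad = find_invalid_governors(
    online, online.get, lambda core: governors[core],
    allowed={'performance', 'powersave'})
```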
Change-Id: I14c9b268d0b97221216bd1a9ab9e48b48d6dcc2c
Closes-Bug: #2073528
(cherry picked from commit 757c333c0e55df4bcaf9d442fbe8dc8009e36989)
Some versions of mkisofs do not properly handle the case where the input
and the output file of the command are the same. So this commit changes the
unit tests depending on that binary to use different files.
Related-Bug: #2059809
Change-Id: I6924eb23ff5804c22a48ec6fabcec25f061906bb
(cherry picked from commit c6d8c6972d52845774b36acb84cd08a4b2e4dcde)
While backporting Ia34203f246f0bc574e11476287dfb33fda7954fe
we observed that several of the tests showed distro-specific
behavior depending on whether qemu was installed in the test env,
what version was installed and how it was compiled.
This change ensures that if qemu is present, it
supports the required formats; otherwise it skips the test.
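
One way such a check could look is sketched below. It assumes that
`qemu-img --help` prints a line starting with "Supported formats:", which may
not hold for every qemu build; the function names are hypothetical.

```python
import shutil
import subprocess


def parse_supported_formats(help_text):
    """Extract format names from a 'Supported formats:' line, if present."""
    for line in help_text.splitlines():
        if line.startswith('Supported formats:'):
            return set(line.split(':', 1)[1].split())
    return set()


def qemu_img_supports(required, run=subprocess.run, which=shutil.which):
    """Return True if qemu-img is installed and lists all required formats."""
    if which('qemu-img') is None:
        return False
    result = run(['qemu-img', '--help'], capture_output=True, text=True)
    return set(required) <= parse_supported_formats(result.stdout)
```

A test could then call something like
`self.skipTest(...)` when `qemu_img_supports(['qcow2', 'vmdk'])` is false.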
Change-Id: I131996cdd7aaf1f52d4caac33b153753ff6db869
(cherry picked from commit cc2514d02e0b0ebaf60a46d02732f7f8facc3191)
This reverts us back to using the standard disk image for most of our
tests, which is more representative of how people actually use nova.
This leaves the UEC image on a few jobs for the sake of comparison
data for the time being, and because we should actually test that
code path if we're going to say we support it.
Change-Id: I16ed92d342464325d4bef33c1e22b328bcfbe7d6
(cherry picked from commit eed3e2b47ffea24d08ad7a85a4e9c36ef56d815e)
This change includes unit tests for the ISO
format inspector using mkisofs to generate
the iso files.
A test for stashing qcow content in the system_area
of an iso file is also included.
This change modifies format_inspector.detect_file_format
to evaluate all inspectors until they are complete and
raise an InvalidDiskInfo exception if multiple formats
match.
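
The "evaluate everything, reject on multiple matches" rule can be sketched as
follows. The MagicInspector class is hypothetical and far simpler than a real
inspector, the fallback to 'raw' is an assumption, and the InvalidDiskInfo
class is a local stand-in for the exception named above.

```python
class InvalidDiskInfo(Exception):
    """Local stand-in for the exception named above."""


class MagicInspector:  # hypothetical: matches a fixed magic at an offset
    def __init__(self, name, magic, offset=0):
        self.name = name
        self._magic = magic
        self._offset = offset

    def matches(self, data):
        end = self._offset + len(self._magic)
        return data[self._offset:end] == self._magic


def detect_format(data, inspectors):
    """Run every inspector; refuse data that more than one format claims."""
    matched = [i.name for i in inspectors if i.matches(data)]
    if len(matched) > 1:
        raise InvalidDiskInfo('Multiple formats detected: %s'
                              % ', '.join(sorted(matched)))
    return matched[0] if matched else 'raw'


INSPECTORS = [
    MagicInspector('qcow2', b'QFI\xfb'),
    # ISO 9660 volume descriptors start after the 32 KiB system area.
    MagicInspector('iso', b'CD001', offset=0x8001),
]

plain_iso = b'\x00' * 0x8001 + b'CD001'
# qcow2 magic stashed in the system area of an otherwise valid ISO.
polyglot = b'QFI\xfb' + b'\x00' * (0x8001 - 4) + b'CD001'
```

Here `detect_format(plain_iso, INSPECTORS)` identifies an ISO, while the
polyglot raises because both inspectors match.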
Related-Bug: #2059809
Change-Id: I7e12718fb3e1f77eb8d1cfcb9fa64e8ddeb9e712
(cherry picked from commit b1cc39848ebe9b9cb63141a647bda52a2842ee4b)
This change adds a reproducer for the regression in iso
file support when
workarounds.disable_deep_image_inspection = False
Change-Id: I56d8b9980b4871941ba5de91e60a7df6a40106a8
(cherry picked from commit b5a1d3b4b2d0aaa351479b1d7e41a3895c28fab0)
This commit is a direct port of the format inspector
unit tests from glance as of commit
0d8e79b713bc31a78f0f4eac14ee594ca8520999.
The only changes to the test are as follows:
"from glance.common import format_inspector" was updated to
"from nova.image import format_inspector"
"from glance.tests import utils as test_utils"
was replaced with "from nova import test"
"test_utils.BaseTestCase" was replaced with "test.NoDBTestCase"
"glance-unittest-formatinspector-" was replaced with
"nova-unittest-formatinspector-"
This makes the test functional in nova.
TestFormatInspectors requires qemu-img to be installed on the
host, which would be a new dependency for executing unit tests.
To avoid that, we skip TestFormatInspectors if qemu-img
is not installed.
TestFormatInspectorInfra and TestFormatInspectorsTargeted
do not have a qemu-img dependency, so
no changes to the test assertions were required.
Change-Id: Ia34203f246f0bc574e11476287dfb33fda7954fe
(cherry picked from commit 838daa3cad5fb3cdd10fb7aa76c647330a66939e)
When switching to the OpenStack SDK, we missed a change to account
for the SDK returning generators instead of lists. As a result,
the first loop over ports and port groups exhausted the generator,
and later iterations saw an empty sequence.
Since a baremetal system usually does not have a large number of ports,
we convert the generator into a list right away to prevent this.
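
The underlying Python behaviour is easy to reproduce. In this sketch
fake_sdk_ports() is a hypothetical stand-in for an SDK call that yields its
results lazily.

```python
def fake_sdk_ports():
    """Hypothetical stand-in for an SDK call that yields results lazily."""
    yield from ('port-a', 'port-b')


ports = fake_sdk_ports()
first_pass = [p for p in ports]
second_pass = [p for p in ports]  # the generator is already exhausted

ports_list = list(fake_sdk_ports())  # materialize once, iterate many times
first_list_pass = [p for p in ports_list]
second_list_pass = [p for p in ports_list]
```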
Closes-Bug: #2071972
Change-Id: I90766f8c225d834bb2eec606754107ea6a212f6d
(cherry picked from commit 8558f59630f81beba2789e6deef2cb5e6b367f20)
This restores the vmdk_allowed_types checking in create_image()
that was unintentionally lost by tightening the
qemu-type-matches-glance code in the fetch patch recently. Since we
are still detecting the format of base images without metadata, we
would have treated a vmdk file that claims to be raw as raw in fetch,
but then read it like a vmdk once it was used as a base image for
something else.
Change-Id: I07b332a7edb814f6a91661651d9d24bfd6651ae7
Related-Bug: #2059809
(cherry picked from commit 08be7b2a0dc1d7728d8034bc2aab0428c4fb642e)
There is an additional way we can be fooled into using a qcow2 file
with a data-file, which is uploading it as raw to glance and then
booting an instance from it. Because when we go to create the
ephemeral disk from a cached base image, we've lost the information
about the original source's format, we probe the image's file type
without a strict format specified. If a qcow2 file is listed in
glance as a raw, we won't notice it until it is too late.
This brings over another piece of code (proposed against glance's
format inspector) which provides a safe format detection routine. This
patch uses that to detect the format of, and run a safety check on, the
base image each time we go to use it to create an ephemeral disk
image.
This also detects QED files and always marks them as unsafe as we do
not support that format at all. Since we could be fooled into
downloading one and passing it to qemu-img if we don't recognize it,
we need to detect and reject it as unsafe.
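
To illustrate the idea only: the sketch below sniffs the leading bytes of an
image and applies a deliberately reduced safety rule. The real format
inspector does considerably more; the function name is hypothetical, and the
header offsets used are assumptions based on the published qcow2 and QED
header layouts.

```python
import struct

QCOW2_MAGIC = b'QFI\xfb'
QED_MAGIC = b'QED\x00'
QCOW2_INCOMPAT_DATA_FILE = 1 << 2  # assumed "external data file" feature bit


def sniff_and_check(header):
    """Return (format, safe) based on the first bytes of an image."""
    if header[:4] == QED_MAGIC:
        return 'qed', False  # unsupported format, always rejected
    if header[:4] == QCOW2_MAGIC:
        if len(header) < 80:
            return 'qcow2', False  # not enough data to decide, reject
        # Big-endian: version at byte 4, backing file offset at byte 8.
        version, backing_offset = struct.unpack('>IQ', header[4:16])
        safe = backing_offset == 0
        if version >= 3:
            (incompatible,) = struct.unpack('>Q', header[72:80])
            safe = safe and not incompatible & QCOW2_INCOMPAT_DATA_FILE
        return 'qcow2', safe
    return 'raw', True


def make_qcow2_header(backing_offset=0, incompatible=0):
    """Build a minimal 80-byte version 3 header for demonstration."""
    return (QCOW2_MAGIC + struct.pack('>IQ', 3, backing_offset)
            + b'\x00' * 56 + struct.pack('>Q', incompatible))
```

The point of checking before calling qemu-img is that a file registered as raw
in glance would be reported as qcow2 or qed here and can be refused.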
Change-Id: I4881c8cbceb30c1ff2d2b859c554e0d02043f1f5
(cherry picked from commit b1b88bf001757546fbbea959f4b73cb344407dfb)
It has been asserted that we should not be calling qemu-img info
on untrusted files. That means we need to know if they have a
backing_file, data_file or other unsafe configuration *before* we use
qemu-img to probe or convert them.
This grafts glance's format_inspector module into nova/images so we
can use it to check the file early for safety. The expectation is that
this will be moved to oslo.utils (or something) later and thus we will
just delete the file from nova and change our import when that happens.
NOTE: This includes whitespace changes from the glance version of
format_inspector.py because of autopep8 demands.
Change-Id: Iaefbe41b4c4bf0cf95d8f621653fdf65062aaa59
Closes-Bug: #2059809
(cherry picked from commit 9cdce715945619fc851ab3f43c97fab4bae4e35a)
Tempest currently defaults to disk_formats[0] for images it creates,
which is 'ami'. However, it's actually using a qcow2 disk image by
default, which means we're lying to glance when we create those.
Change-Id: I737e9aa51c268a387f1eed24cf717618d057d747
(cherry picked from commit c0ff2386ed15b531c44283909b9619e43d1ed66a)
This change adds a retry_if_busy decorator
to the read_sys and write_sys functions in the filesystem
module that will retry reads and writes up to 5 times with
a linear backoff.
This allows nova to tolerate short periods of time where
sysfs returns device busy. If the retries are exhausted
and offlining a core fails, a warning is logged and the failure is
ignored. Onlining a core is always treated as a hard error if
retries are exhausted.
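
A minimal sketch of a decorator with this behaviour is shown below. It is not
the implementation from the patch: the keyword arguments (retries, step,
sleep) are assumptions added to keep the example configurable and testable,
and "up to 5 times" is read here as one initial call plus five retries.

```python
import errno
import functools
import time

RETRY_LIMIT = 5  # number of retries mentioned above


def retry_if_busy(func=None, *, retries=RETRY_LIMIT, step=1.0,
                  sleep=time.sleep):
    """Retry `func` when it raises OSError with errno EBUSY."""
    if func is None:
        # Called with keyword arguments: return the actual decorator.
        return functools.partial(retry_if_busy, retries=retries,
                                 step=step, sleep=sleep)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        for attempt in range(1, retries + 1):
            try:
                return func(*args, **kwargs)
            except OSError as exc:
                if exc.errno != errno.EBUSY:
                    raise
                sleep(step * attempt)  # linear backoff
        # Last attempt: any error now propagates to the caller.
        return func(*args, **kwargs)
    return wrapper


delays = []
calls = {'count': 0}


@retry_if_busy(step=0.5, sleep=delays.append)
def flaky_write():
    calls['count'] += 1
    if calls['count'] < 3:
        raise OSError(errno.EBUSY, 'Device or resource busy')
    return 'ok'


result = flaky_write()
```

Whether an exhausted retry is fatal is left to the caller, which matches the
split described above between offlining (warn and continue) and onlining
(hard error).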
Closes-Bug: #2065927
Change-Id: I2a6a9f243cb403167620405e167a8dd2bbf3fa79
(cherry picked from commit 44c1b48b3121682cf959c90b3adaf2a3f92e318c)
Today, if the write sys call to offline a CPU when
deleting an instance fails due to an OSError or ValueError,
the instance delete fails and the instance goes to error.
As reported in bug #2065927, this can happen as a result of
OSError: [Errno 16] Device or resource busy if the VM is
deleted shortly after it's started.
Related-Bug: #2065927
Change-Id: I1352a3a1e28cfe14ec8f32042ed35cb25e70338e
(cherry picked from commit ee581a5c9d1c0b7c0d8830a08f55fe8bc2fbcd0f)
After this patch nova rejects the add host to aggregate API action
if the host has instances and the new aggregate for the host would
mean that these instances need to move from one AZ (even from the
default one) to another. Such an AZ change is not implemented in nova
and currently leads to stuck instances.
Similarly nova will reject the remove host from aggregate API action if the
host has instances and the aggregate removal would mean that the
instances need to change AZ.
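
The rule for both actions reduces to one comparison. In this sketch the
function name is hypothetical, ValueError stands in for whatever error the API
actually returns, and 'nova' is assumed to be the name of the default AZ.

```python
DEFAULT_AZ = 'nova'  # assumed name of the default availability zone


def check_az_change(current_az, new_az, has_instances):
    """Raise ValueError if existing instances would implicitly change AZ.

    None means the host is in no AZ-defining aggregate, i.e. the default AZ.
    """
    current = current_az or DEFAULT_AZ
    new = new_az or DEFAULT_AZ
    if has_instances and current != new:
        raise ValueError(
            'Host has instances and would move from AZ %s to AZ %s'
            % (current, new))


# An empty host may change AZ; a host staying in its AZ is also fine.
check_az_change(None, 'az1', has_instances=False)
check_az_change('az1', 'az1', has_instances=True)
```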
Depends-On: https://review.opendev.org/c/openstack/tempest/+/821732
Change-Id: I19c4c6d34aa2cc1f32d81e8c1a52762fa3a18580
Closes-Bug: #1907775
(cherry picked from commit 3c0eadae0b9ec48586087ea6c0c4e9176f0aa3bc)
When deleting an instance, always send VIR_DOMAIN_UNDEFINE_NVRAM to
delete the NVRAM file, regardless of whether the image is of type UEFI.
This prevents a bug when rebuilding an instance from a UEFI image to a
non-UEFI image.
Closes-Bug: #1997352
Change-Id: I24648f5b7895bf5d093f222b6c6e364becbb531f
Signed-off-by: Simon Hensel <simon.hensel@inovex.de>
(cherry picked from commit 406d590a364d2c3ebc91e5f28f94011b158459d2)
Update the URL to the upper-constraints file to point to the redirect
rule on releases.openstack.org so that anyone working on this branch
will switch to the correct upper-constraints list automatically when
the requirements repository branches.
Until the requirements repository has a stable/2024.1 branch, tests will
continue to use the upper-constraints list on master.
Change-Id: I29cf77fe03c2456c380ddd0e980c41e7cc415e22