This change adds a simple sequence diagram showing the flow of a volume
attachment between the various services, using the libvirt driver as an
example virt driver.
Change-Id: I631ac9de3d48aa0ad849f6615d0ad2052cb63e80
The nova-ceph-multistore setup needs non-admin users to be able to copy
images. To allow that, glance's policy was overridden so that public
images can be copied. That restriction can still cause problems if a
new copy-image tempest test tries to copy a private image as an admin
user.
- https://review.opendev.org/#/c/742546/
Let's allow everyone to copy every image so that this works for all
types of test credentials.
Change-Id: Ia65afdfb8989909441dba55faeed2d78cc7f1ee7
I668643c836d46a25df46d4c99a973af5e50a39db attempted to fix service wide
pauses by providing a more complete list of classes to tpool.Proxy.
While this excluded libvirtError it can include internal libvirt-python
classes pointed to by private globals that have been introduced with the
use of type checking within the module.
Any attempt to wrap these internal classes will result in the failure
seen in bug #1901383. As a result this change simply ignores any class
found during inspection whose name doesn't start with `vir`, the prefix
libvirt uses to denote public methods and classes.
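
The filtering described above might look roughly like the following
sketch. It is not the code from this change: the helper name
public_vir_classes and the stand-in module are hypothetical, and a fake
module is used so the example runs without libvirt-python installed.

```python
import inspect
import types


def public_vir_classes(module):
    """Return the classes in `module` whose names start with 'vir'."""
    return [
        obj
        for name, obj in inspect.getmembers(module, inspect.isclass)
        if name.startswith("vir")
    ]


# Hypothetical stand-in for the libvirt module. The private class name
# below is invented purely for illustration.
fake_libvirt = types.ModuleType("fake_libvirt")
fake_libvirt.virDomain = type("virDomain", (), {})
fake_libvirt.virConnect = type("virConnect", (), {})
fake_libvirt.libvirtError = type("libvirtError", (Exception,), {})
fake_libvirt._InternalCallbackType = type("_InternalCallbackType", (), {})

# Only the 'vir'-prefixed classes would be handed to tpool.Proxy.
proxy_classes = public_vir_classes(fake_libvirt)
```

In this sketch both libvirtError and the private class are skipped
because neither name begins with `vir`.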
Closes-Bug: #1901383
Co-Authored-By: Daniel Berrange <berrange@redhat.com>
Change-Id: I568b0c4fd6069b9118ff116532f14abb46cc42ab
Pygments 2.7.x is stricter in how it validates JSON escapes, aligning it
closer with the spec [1]. Turns out we have some invalid JSON in our
docs, meaning builds are now failing with the following error:
doc/source/user/metadata.rst:262: WARNING: Could not lex literal_block
as "json". Highlighting skipped.
Resolve this.
[1] 9514e794e0
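
As an illustration of the kind of problem involved, the sketch below
uses Python's own json module to show an invalid escape being rejected.
The sample strings are invented; the actual offending content in
metadata.rst is not reproduced here, and Pygments' lexer is not
involved in this example.

```python
import json


def is_valid_json(text):
    """Return True if `text` parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# Invented samples: "\d" is not a valid JSON escape, "\\" is.
invalid_sample = r'{"path": "C:\dir"}'
valid_sample = r'{"path": "C:\\dir"}'

results = (is_valid_json(invalid_sample), is_valid_json(valid_sample))
```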
Change-Id: Ic50e29e9c7817744ad0b4f9de309aa3e96a09505
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
When performing a live migration between hypervisors running
libvirt, where one or more CPU features are disabled, nova does
not take these into account. This results in migration failures
as none of the available hypervisor targets appear compatible.
This patch ensures that the libvirt 'disable' policy is taken
into account, at least in a basic sense, by explicitly ignoring
features flagged in this way when enumerating CPU features.
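
A minimal sketch of the idea, assuming the CPU description is available
as libvirt-style XML with <feature policy='...' name='...'/> elements.
The helper name enabled_feature_names and the sample XML are
hypothetical and are not taken from the patch.

```python
import xml.etree.ElementTree as ET

# Invented sample CPU definition for illustration only.
CPU_XML = """
<cpu>
  <model>Haswell</model>
  <feature policy='require' name='vmx'/>
  <feature policy='disable' name='hle'/>
  <feature policy='disable' name='rtm'/>
  <feature name='pcid'/>
</cpu>
"""


def enabled_feature_names(cpu_xml):
    """Collect feature names, skipping any with policy='disable'."""
    root = ET.fromstring(cpu_xml)
    return {
        feature.get("name")
        for feature in root.findall("feature")
        if feature.get("policy") != "disable"
    }


features = enabled_feature_names(CPU_XML)
```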
Closes-Bug: #1898715
Change-Id: Iaf14ca97cfac99dd280d1114123f2d4bb6292b63
When a DPDK VM attaches an interface, libvirt cannot automatically
generate "<target dev='vhuXXX'/>" in the XML, which causes the virsh
domifstat query to fail.
Change-Id: Id4e5b52af521b5f3a206e87bf024fd1e47fc4824
Closes-Bug: #1899431
Currently in the archive_deleted_rows code, we attempt to clean up
"residue" of deleted instance records by assuming any table with an
'instance_uuid' column represents data tied to an instance's lifecycle
and deleting such records.
This behavior poses a problem in the case where an instance has a PCI
device allocated and someone deletes the instance. The 'instance_uuid'
column in the pci_devices table is used to track the allocation
of a PCI device to an instance. There is a small time window
during which the instance record has been deleted but the PCI device
has not yet been freed from a database record perspective as PCI
devices are freed during the _complete_deletion method in the compute
manager as part of the resource tracker update call.
Records in the pci_devices table are not tied to the lifecycle of
instances in any case, so they should not be considered residue to
clean up when an instance is deleted. This adds a condition to avoid
archiving pci_devices on the basis of an instance association.
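
The added condition can be pictured with the sketch below. It is only
an illustration: the helper name, the dict-based schema and the table
names other than pci_devices are assumptions, not the actual
archive_deleted_rows code.

```python
def tables_with_instance_residue(table_columns, excluded=("pci_devices",)):
    """Return tables treated as instance residue.

    A table qualifies if it has an 'instance_uuid' column and is not
    explicitly excluded.
    """
    return sorted(
        table
        for table, columns in table_columns.items()
        if "instance_uuid" in columns and table not in excluded
    )


# Invented, simplified schema for illustration.
schema = {
    "instance_faults": ["id", "instance_uuid", "deleted"],
    "pci_devices": ["id", "instance_uuid", "status"],
    "services": ["id", "host"],
}

residue_tables = tables_with_instance_residue(schema)
```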
Closes-Bug: #1899541
Change-Id: Ie62d3566230aa3e2786d129adbb2e3570b06e4c6
We were attempting to pass a 'target_version' variable into an exception
message. 'target_version' is a tuple which means it's expanded out
resulting in the following error:
TypeError: not all arguments converted during string formatting
Fix this.
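
The failure mode is easy to reproduce in isolation. In the sketch below
the message text and the version value are invented, and the two fixes
shown are generic options rather than necessarily what this change
does.

```python
target_version = (1, 2, 3)  # invented value for illustration
msg = "Version %s is not supported"

# A bare tuple on the right of % is unpacked into three arguments for a
# format string that only has one placeholder.
try:
    msg % target_version
except TypeError as exc:
    error = str(exc)

# Option 1: wrap the tuple so it is treated as a single argument.
wrapped = msg % (target_version,)

# Option 2: render the tuple as a dotted version string first.
dotted = msg % ".".join(str(part) for part in target_version)
```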
Change-Id: I6063b2108ae38776d034fd7a4c3aa88dc66a084f
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Closes-Bug: #1898554
There are a bunch of unused parameters in the DynamicVendorData
constructor and a comment that they cannot be removed due to
JsonFileVendorData. But JsonFileVendorData does not depend on those
parameters and both the base class and JsonFileVendorData use *args and
**kwargs. So it is safe to remove the unused params.
The context field of DynamicVendorData is also removed as it is unused.
This makes the request_context parameter of the InstanceMeta
constructor also unused so that is removed.
Change-Id: Ie27fd6a5513e53903b9acd5d63038b3b484acbde
The metadata service supports a multicell deployment in a configuration
where the nova-api service implements the metadata API. In this case the
metadata query needs to be cell targeted. This was partly implemented
already. The instance itself is queried from the cell DB properly.
However, the BDM query used the non-targeted context, resulting in an
empty BDM being returned by the metadata service.
A functional reproduction test is not added as I did not find a way to
have a cell setup in the functional test that reproduces the problem. I
reproduced the bug and tested the fix in a devstack.
Change-Id: I48f57082edaef3ec4722bd31ce29a90b94d32523
Closes-Bug: #1881944
The libvirt driver is currently the only virt driver implementing swap
volume within Nova. While libvirt itself does support moving between
multiple volumes attached to the same instance at the same time, the
current logic within the libvirt driver makes a call to
virDomainGetXMLDesc that fails if there are active block jobs against
any disk attached to the domain.
This change simply uses an instance.uuid based lock in the compute layer
to serialise requests to swap_volume so that this can no longer happen.
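
The serialisation can be sketched with the standard library alone. This
is a simplified illustration, not the locking helper used in the
compute manager; the names lock_for, swap_volume and fake_swap are
hypothetical.

```python
import threading
import time
from collections import defaultdict

_locks = defaultdict(threading.Lock)
_locks_guard = threading.Lock()


def lock_for(instance_uuid):
    """Return the lock dedicated to one instance UUID."""
    with _locks_guard:
        return _locks[instance_uuid]


def swap_volume(instance_uuid, do_swap):
    """Run do_swap while holding the per-instance lock."""
    with lock_for(instance_uuid):
        return do_swap()


# Track how many swaps run at once for the same instance.
state = {"active": 0, "max_active": 0}


def fake_swap():
    state["active"] += 1
    state["max_active"] = max(state["max_active"], state["active"])
    time.sleep(0.01)  # pretend a block job is running
    state["active"] -= 1


threads = [
    threading.Thread(target=swap_volume, args=("uuid-1", fake_swap))
    for _ in range(4)
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

Because every request for "uuid-1" takes the same lock, at most one
fake swap is active at any time, while requests for other instances
would not be blocked.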
Closes-Bug: #1896621
Change-Id: Ic5ce2580e7638a47f1ffddb4edbb503bf490504c
This is a follow up to change
I8e4e5afc773d53dee9c1c24951bb07a45ddc2f1a which fixed an issue with
validation when the topmost patch after a Zuul rebase is a merge
patch.
We need to also use the $commit_hash variable for the check for
stable-only patches, else it will incorrectly fail because it is
checking the merge patch's commit message.
Change-Id: Ia725346b65dd5e2f16aa049c74b45d99e22b3524
The 'vram' property of the 'video' device must be an integer else
libvirt will spit the dummy out, e.g.
libvirt.libvirtError: XML error: cannot parse video vram '8192.0'
In Python 3 the division operator (/) returns a float even for integer
operands, unlike in Python 2. Use the floor division operator (//)
instead.
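
The difference is visible with plain integers. The variable names and
values below are invented for illustration and are not the expression
used in the driver.

```python
max_vram_bytes = 8 * 1024 * 1024  # invented value
bytes_per_kib = 1024

# True division always yields a float in Python 3.
vram_float = max_vram_bytes / bytes_per_kib

# Floor division keeps the result an integer.
vram_int = max_vram_bytes // bytes_per_kib

as_xml_float = "vram='%s'" % vram_float
as_xml_int = "vram='%s'" % vram_int
```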
Change-Id: Iebf678c229da4f455459d068cafeee5f241aea1f
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Closes-Bug: #1896496
Bug #1894804 outlines how DEVICE_DELETED events were often missing from
QEMU on Focal based OpenStack CI hosts as originally seen in bug
#1882521. This has eventually been tracked down to some undefined QEMU
behaviour when a new device_del QMP command is received while another is
still being processed, causing the original attempt to be aborted.
We hit this race in slower OpenStack CI envs as n-cpu rather crudely
retries attempts to detach devices using the RetryDecorator from
oslo.service. The default incremental sleep time is currently tight
enough that QEMU is still processing the first device_del request on
these slower CI hosts when n-cpu asks libvirt to retry the detach,
sending another device_del to QEMU and hitting the above behaviour.
We have also seen the following check being hit when testing with
QEMU >= v5.0.0. This check now rejects overlapping device_del requests
in QEMU rather than aborting the original:
cce8944cc9
This change aims to avoid this situation entirely by raising the default
incremental sleep time between detach requests from 2 seconds to 10,
leaving enough time for the first attempt to complete. The overall
maximum sleep time is also increased from 30 to 60 seconds.
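
To see the effect of the new defaults, the sketch below models an
incremental backoff in which each retry sleeps for one more increment
than the last, capped at the maximum. This is a simplified model
written for illustration; it is not the RetryDecorator implementation,
and the function name sleep_schedule is hypothetical.

```python
def sleep_schedule(inc_sleep_time, max_sleep_time, attempts):
    """Return the sleep, in seconds, before each retry in turn."""
    sleeps = []
    current = 0
    for _ in range(attempts):
        current = min(current + inc_sleep_time, max_sleep_time)
        sleeps.append(current)
    return sleeps


old_defaults = sleep_schedule(2, 30, 5)
new_defaults = sleep_schedule(10, 60, 7)
```

Under this model the first retry waits 10 seconds instead of 2.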
Future work will aim to entirely remove this retry logic in favour of a
libvirt event driven approach, waiting for the
VIR_DOMAIN_EVENT_ID_DEVICE_REMOVED and
VIR_DOMAIN_EVENT_ID_DEVICE_REMOVAL_FAILED events before retrying.
Finally, the cleanup of unused arguments in detach_device_with_retry is
left for a follow up change in order to keep this initial change small
enough to quickly backport.
Closes-Bug: #1882521
Related-Bug: #1894804
Change-Id: Ib9ed7069cef5b73033351f7a78a3fb566753970d
This should help provide some context when the RbdDriver later raises a
RuntimeError if rbd or rados hasn't been imported correctly.
Change-Id: Ie8bb5e5622bd37dfe8073cca12f77174e8e7d98c