The default config `both` means that both the legacy and the versioned
notifications are emitted. This was selected as the default in the past
because we thought it would help adoption of the versioned interface
while we worked to bring the new interface to feature parity with the
legacy one. Even though the versioned notification interface has been
at feature parity with the legacy interface since Stein, the projects
consuming nova notifications do not have the resources to switch to the
new interface. On the other hand, having `both` as the default in an
environment where only the legacy notifications are consumed causes
performance issues on the message bus, hence bug #1805659.
The original plan was to set the default to `versioned` once the
interface reached feature parity, but as major consumers are not ready
to switch we cannot do that.
So the only option left is to set the default to `unversioned`.
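For illustration, a minimal sketch of how the new default reads through
oslo.config (the option name and choices match the [notifications]
group in nova; the standalone registration below exists only for the
example):

    from oslo_config import cfg

    opts = [
        cfg.StrOpt('notification_format',
                   choices=['unversioned', 'versioned', 'both'],
                   default='unversioned',  # previously 'both'
                   help='Which notification format(s) to emit.'),
    ]

    CONF = cfg.CONF
    CONF.register_opts(opts, group='notifications')
    CONF([])  # parse an empty command line so defaults are usable
    print(CONF.notifications.notification_format)  # -> 'unversioned'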
Related devstack patch: https://review.opendev.org/#/c/662849/
Closes-Bug: #1805659
Change-Id: I72faa356afffb7a079a9ce86fed1b463773a0507
Blueprints hide-hypervisor-id-flavor-extra-spec [1] and
add-kvm-hidden-feature [2] allow hiding KVM's signature for guests,
which is necessary for Nvidia drivers to work in VMs with passthrough
GPUs. While this works well for Linux guests on KVM, it doesn't work
for Windows guests.
For them, KVM emulates some HyperV features. With the
current implementation, KVM's signature is hidden, but HyperV's is not,
and Nvidia drivers don't work in Windows VMs.
This change generates an extra element in the libvirt XML for Windows
guests on KVM which obfuscates HyperV's signature too, controlled by
the existing image and flavor parameters (img_hide_hypervisor_id and
hide_hypervisor_id respectively). The extra XML element is
<vendor_id state='on' value='1234567890ab'/>
in features/hyperv.
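A minimal sketch, using the stdlib, of building that element (the
vendor_id value is the one quoted above; the surrounding XML is
illustrative, not a full libvirt domain):

    from xml.etree import ElementTree as ET

    features = ET.Element('features')
    hyperv = ET.SubElement(features, 'hyperv')
    ET.SubElement(hyperv, 'vendor_id',
                  {'state': 'on', 'value': '1234567890ab'})
    print(ET.tostring(features, encoding='unicode'))
    # <features><hyperv><vendor_id state="on" value="1234567890ab" /></hyperv></features>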
[1] https://blueprints.launchpad.net/nova/+spec/hide-hypervisor-id-flavor-extra-spec
[2] https://blueprints.launchpad.net/nova/+spec/add-kvm-hidden-feature
Change-Id: Iaaeae9281301f14f4ae9b43f4a06de58b699fd68
Closes-Bug: 1779845
This is no longer used anywhere and can therefore be safely removed.
Part of blueprint remove-cells-v1
Change-Id: I16b6d428accabf9dd7692909084faaf426e13524
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
This counts instance mappings for quota usage for instances and adds
calls to placement for quota usage for cores and ram. During an
upgrade, if any un-migrated instance mappings are found (with
NULL user_id or NULL queued_for_delete fields), we will fall back to
the legacy counting method, as sketched after the list below.
Counting quota usage from placement is opt-in via the
[quota]count_usage_from_placement configuration option because:
* Though counting from placement makes multi-cell deployments
resilient to down cells, the vast majority of deployments are single
cell; they will not realize a down-cell resiliency benefit and may
prefer to keep legacy quota usage counting.
* Usage for resizes will reflect resources being held on both the
source and destination until the resize is confirmed or reverted.
Operators may not want to enable counting from placement based on
whether the behavior change is problematic for them.
* Placement does not yet support the ability to partition resource
providers from multiple Nova deployments, so environments that are
sharing a single placement deployment would see usage that
aggregates all Nova deployments together. Such environments should
not enable counting from placement.
* Usage for unscheduled instances in ERROR state will not reflect
resource consumption for cores and ram because the instance has no
placement allocations.
* Usage for instances in SHELVED_OFFLOADED state will not reflect
resource consumption for cores and ram because the instance has no
placement allocations. Note that because of this, it will be possible for a
request to unshelve a server to be rejected if the user does not have
enough quota available to support the cores and ram needed by the server to
be unshelved.
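A hedged sketch of the counting decision described above (function and
argument names are illustrative, not Nova's internals):

    def count_legacy(project_id):
        """Stub: count usage from the cell databases (legacy path)."""
        return {'instances': 0, 'cores': 0, 'ram': 0}

    def count_from_placement(project_id):
        """Stub: count cores/ram from placement allocations."""
        return {'instances': 0, 'cores': 0, 'ram': 0}

    def count_usage(project_id, count_usage_from_placement,
                    has_unmigrated_mappings):
        # Opt-in flag off -> legacy counting.
        if not count_usage_from_placement:
            return count_legacy(project_id)
        # Un-migrated mappings (NULL user_id / queued_for_delete) mean
        # the data migration hasn't finished; fall back to legacy.
        if has_unmigrated_mappings:
            return count_legacy(project_id)
        return count_from_placement(project_id)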
Part of blueprint count-quota-usage-from-placement
Change-Id: Ie22b0acb5824a41da327abdcf9848d02fc9a92f5
Add a parameter to limit the archival of deleted rows by date. That is,
only rows related to instances deleted before the provided date will be
archived.
This option works together with --max_rows; if both are specified, both
will take effect.
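For example (assuming the new flag is named --before, per the
blueprint; the date format shown is illustrative):

    nova-manage db archive_deleted_rows --before 2019-06-01 --max_rows 1000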
Closes-Bug: #1751192
Change-Id: I408c22d8eada0518ec5d685213f250e8e3dae76e
Implements: blueprint nova-archive-before
If we're swapping from a multiattach volume that has more than one
read/write attachment, another server on a secondary attachment could
be writing data that does not get copied into the volume to which
we're swapping, so we could lose data during the swap.
This change does volume read/write attachment counting for the volume
we're swapping from and if there is more than one read/write attachment
on the volume, the swap volume operation fails with a 400 BadRequest
error.
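A minimal sketch of the guard described above (names are illustrative,
not the exact compute API code):

    class BadRequest(Exception):
        """Stand-in for the API's 400 response."""

    def check_swap_source(volume):
        rw = [a for a in volume.get('attachments', [])
              if a.get('attach_mode', 'rw') == 'rw']
        if len(rw) > 1:
            raise BadRequest('volume %s has %d read/write attachments; '
                             'refusing to swap' % (volume['id'], len(rw)))

    # Passes: one rw and one ro attachment.
    check_swap_source({'id': 'vol-1', 'attachments': [
        {'attach_mode': 'rw'}, {'attach_mode': 'ro'}]})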
Depends-On: https://review.openstack.org/573025/
Closes-Bug: #1775418
Change-Id: Icd7fcb87a09c35a13e4e14235feb30a289d22778
Ceph doesn't support QCOW2 for hosting a virtual machine
disk:
http://docs.ceph.com/docs/master/rbd/rbd-openstack/
When image_type is set to rbd and force_raw_images to False, and we
don't launch an instance with boot-from-volume, the instance is
spawned with a qcow2 root disk but fails to boot because the data is
accessed as raw.
To fix this, we raise an error and refuse to start the nova-compute
service when force_raw_images and image_type are incompatible.
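A hedged sketch of that startup guard (argument and function names are
illustrative, not the exact driver code):

    def verify_raw_image_config(images_type, force_raw_images):
        # Refuse to start nova-compute with an incompatible combination:
        # Ceph cannot serve a QCOW2 root disk as raw data.
        if images_type == 'rbd' and not force_raw_images:
            raise RuntimeError(
                'image_type=rbd requires force_raw_images=True')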
When importing an image into rbd, we also check the format of cached
images. If the format is not raw, the cached image is removed first
and fetched again, so it will now be in raw format.
Change-Id: I1aa471e8df69fbb6f5d9aeb35651bd32c7123d78
Closes-Bug: 1816686
This patch adds a new parameter ``locked_reason`` to
``POST /servers/{server_id}/action`` request where the
action is lock. It enables the user to specify a reason when locking
a server.
The locked_reason will be exposed in the responses of
``GET /servers/{server_id}``, ``GET /servers/detail``,
``POST /servers/{server_id}/action`` where the action is rebuild, and
``PUT /servers/{server_id}``.
The InstanceActionNotification will emit the locked_reason
along with the other instance details. This patch hence changes the
payload object to include the "locked_reason" field.
Note that "locked" will be allowed as a valid filtering/sorting parameter
for ``GET /servers/detail`` and ``GET /servers`` from this new microversion.
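A hedged example of the new request body (the field name is from this
patch; the reason text is illustrative):

    body = {'lock': {'locked_reason': 'maintenance window'}}
    # POST /servers/{server_id}/action with this JSON locks the server
    # and records the reason, later visible in GET /servers/{server_id}.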
Implements blueprint add-locked-reason
Change-Id: I46edd595e7417c584106487123774a73c6dbe65e
This enables the scheduler, if configured, to limit placement results
to only computes that support the disk_format of the image used
for the request.
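(The enabling option is, assuming the name used in this series,
``[scheduler]query_placement_for_image_type_support``; it defaults to
off.)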
Change-Id: I41511365eb2b76c4cad804445766638a92b68378
The compute API has required cinder API >= 3.44 since Queens [1] for
working with the volume attachments API as part of the wider
volume multi-attach support.
In order to start removing the compatibility code in the compute API
this change adds an upgrade check for the minimum required cinder API
version (3.44).
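A hedged sketch of such a check (illustrative, not the exact
nova-status code):

    MIN_CINDER = (3, 44)

    def check_cinder_api(reported):
        # 'reported' is e.g. '3.50' from cinder's versions document.
        if tuple(int(p) for p in reported.split('.')) < MIN_CINDER:
            raise SystemExit(
                'cinder API %s is older than the required 3.44' % reported)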
[1] Ifc01dbf98545104c998ab96f65ff8623a6db0f28
Change-Id: Ic9d1fb364e06e08250c7c5d7d4bdb956cb60e678
The deprecated 'default_flavor' option has been removed.
The following methods in nova/compute/flavors.py
have been removed because they are only used in unit tests.
* get_default_flavor
* get_flavor_by_name
Change-Id: If1e461da382f707be2b5ba89f74f77269f0909dd
This finalizes the removal of the placement code from nova.
This change primarily removes code and makes fixes to cmd,
test and migration tooling to adapt to the removal.
Placement tests and documentation were already removed in earlier
patches.
A database migration that calls
consumer_obj.create_incomplete_consumers in nova-manage has been
removed.
A functional test which confirms the default incomplete consumer user
and project id has been changed so that its use of
conf.placement.incomplete_* (now removed) is replaced with a constant.
The placement server, running in the functional test, provides its own
config.
Placement-related configuration is updated to only register those opts
which are relevant on the nova side. This mostly means ksa-related
opts. The placement-database configuration is removed from
nova/conf/database.
tox.ini is updated to remove the group_regex required by the
placement gabbi tests. This should probably have gone when the
placement functional tests went, but was overlooked.
A release note is added which describes that this is cleanup (the main
action already happened) and points people to the nova-to-placement
upgrade instructions in case they haven't followed them yet.
Change-Id: I4181f39dea7eb10b84e6f5057938767b3e422aff
[ironic]api_endpoint was deprecated in Queens [1] and is hereby
annihilated.
[1] If625411f40be0ba642baeb02950f568f43673655
Change-Id: I527f512b371705b490ba55dfab101340d417edb6
- This patch detaches all direct-mode SR-IOV interfaces before
calculating the updated XML for the destination, immediately before
starting the migration.
- This change modifies post_live_migration_at_destination to check
whether all of an instance's interfaces are defined in the guest XML
and attaches any missing SR-IOV interfaces.
- This change adds a release note for the sriov live migration feature.
- This change extends the base virt driver interface with a new method
rollback_live_migration_at_source and invokes it from
rollback_live_migration in the compute manager.
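A hedged sketch of the new hook (the method name is from this change;
the signature and no-op default are assumptions):

    class ComputeDriver(object):
        """Minimal excerpt of the virt driver interface."""

        def rollback_live_migration_at_source(self, context, instance,
                                              migrate_data):
            # Undo source-side changes after a failed live migration,
            # e.g. re-attach direct-mode SR-IOV interfaces that were
            # detached before the migration started.
            pass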
Change-Id: Ib61913d9d6ef6148170963463bb71c13f4272c5d
Implements: blueprint libvirt-neutron-sriov-livemigration
This patch builds on previous patches and enables live migration with
SR-IOV indirect ports.
Prior to this change, migration would either:
- Fail with the instance still running on the source node.
- Fail with VMs booted on both the source and destination nodes, the
VM state set to migrating, duplicate MACs on the source and
destination nodes, and improper PCI resource claiming. This scenario
is observed with the macvtap port type when neutron does not support
the multiple port bindings API extension.
In both cases the log contains very little user-friendly information.
Conductor Changes:
- Allow live migration only with VIF related PCI devices to allow
properly claiming PCI resources on the destination node.
With this change live migration with generic flavor-based
PCI passthrough devices will not be supported due to libvirt and
qemu constraints.
- Add a check to allow live migration with VIF related PCI allocation
only when neutron supports the multiple port bindings API extension
and compute nodes are up to date.
- Update the migrating VIF with the correct profile when binding the
ports on the destination host; this will allow proper binding against
the destination host and ensure the VIF will be plugged correctly by
Nova.
Compute Changes:
- Create VIFMigrateData for all VIFs in
check_can_live_migrate_destination()
- For every VIF that contains a PCI device in its profile, claim a
PCI device on the destination node using the matching
InstancePCIRequest of the instance being migrated.
- Update the relevant VIFMigrateData profile with the newly
claimed PCI device.
- Free PCI devices on source and allocate on destination upon
a successful migration or free claimed PCI devices on destination
upon failure.
NeutronV2 Changes:
- Don't update the binding profile with PCI devices if the migration
type is live-migration, as the profile was already updated when an
inactive port binding was created during the bind_ports_to_host()
call from the conductor.
Note: This builds on the multiple port bindings API.
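A hedged sketch (illustrative names, not Nova's exact code) of the
compute-side step above: claim a destination PCI device for each VIF
that carries one in its binding profile and record it in the migrate
data:

    def claim_dest_pci(vif_profiles, free_dest_devices):
        # vif_profiles: port binding profiles from the source host.
        # free_dest_devices: PCI addresses available on the destination;
        # popping one stands in for a real PCI claim.
        updated = []
        for profile in vif_profiles:
            if 'pci_slot' not in profile:  # normal vSwitch port
                updated.append(profile)
                continue
            updated.append(dict(profile,
                                pci_slot=free_dest_devices.pop()))
        return updated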
Change-Id: I734cc01dce13f9e75a16639faf890ddb1661b7eb
Partial-Implement: blueprint libvirt-neutron-sriov-livemigration
This patch adds the translation of `RequestGroup.in_tree` to the
actual placement query and bumps the microversion to enable it.
A release note for this change is added.
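For illustration, in_tree becomes a query parameter on placement's
GET /allocation_candidates (the UUID below is made up):

    params = {
        'resources': 'VCPU:1,MEMORY_MB:512',
        'in_tree': '0ea034d4-3bf8-4aef-9eaf-5578e4a07c62',
    }
    # GET /allocation_candidates?resources=...&in_tree=<that UUID>
    # limits candidates to providers in that provider tree.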
Change-Id: I8ec95d576417c32a57aa0298789dac6afb0cca02
Blueprint: use-placement-in-tree
Related-Bug: #1777591
These are no longer necessary with the removal of cells v1. A check for
cells v1 in 'nova-manage cell_v2 simple_cell_setup' is also removed,
meaning this can no longer return the '2' exit code.
Part of blueprint remove-cells-v1
Change-Id: I8c2bfb31224300bc639d5089c4dfb62143d04b7f
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
Drop support for the os-cells REST APIs, which are part of the cells v1
feature which has been deprecated since Pike.
This API now returns a 410 response for all routes.
Unit tests are removed and the functional API sample tests now just
assert the 410 response. The latter are also expanded to cover APIs
that weren't previously tested.
The API sample docs are left intact since the API reference still builds
from those and can be considered more or less branchless, so people
looking at the API reference can apply it to older deployments of nova
before os-cells was removed.
A release note added for previous cells v1 removals is amended to note
this additional change.
Part of blueprint remove-cells-v1
Change-Id: Iddb519008515f591cf1d884872a5887afbe766f2
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
The metadata service makes use of the deprecated '[DEFAULT] dhcp_domain'
option when providing a hostname to the instance. This is used by
cloud-init to configure the hostname in the instance. This use was not
captured when the option was initially deprecated. This option is now
undeprecated and moved to the '[api]' group to ensure it won't be
removed alongside the other nova-network options.
Change-Id: I3940ebd1888d8019716e7d4eb6d4a413a37b9b78
Closes-Bug: #1698010
Fix a long-standing issue whereby setting 'dhcp_domain' to 'None' would
result in a hostname of '${hostname}None' instead of '${hostname}'.
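A hedged sketch of the fixed behaviour (illustrative, not the exact
metadata code; the separator handling is an assumption):

    def metadata_hostname(hostname, dhcp_domain):
        # Guard against an unset domain so we never produce
        # '<hostname>None'.
        if dhcp_domain:
            return '%s.%s' % (hostname, dhcp_domain)
        return hostname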
Change-Id: Ic9aa74f5344ba469b61a87de1ebd27e6f49c3318
Closes-Bug: #1824813
This resolves one of the TODOs in the heal_allocations CLI
by adding an --instance option to the command which, when
specified, will process just the single instance given.
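For example (the flag name is as stated above; the UUID is
illustrative):

    nova-manage placement heal_allocations --instance 9b3f8eb4-3aee-47d7-b073-7b6f7cd34d4e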
Change-Id: Icf57f217f03ac52b1443addc34aa5128661a8554
This resolves one of the TODOs in the heal_allocations CLI
by adding a --dry-run option which will still print the output
as we process instances but not commit any allocation changes
to placement, just print out that they would happen.
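For example:

    nova-manage placement heal_allocations --dry-run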
Change-Id: Ide31957306602c1f306ebfa48d6e95f48b1e8ead
We're going to start unpicking this stuff from the top down. Start with
the 'nova-cells' executable itself.
Part of blueprint remove-cells-v1
Change-Id: I5bd1dd9f1bbae7a977ab9e032c4f4d200c35e193
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
When configuring QEMU cache modes for Nova instances, we use
'writethrough' when 'none' is not available. But that's not correct,
because of our misunderstanding of how cache modes work. E.g. the
function disk_cachemode() in the libvirt driver assumes that
'writethrough' and 'none' cache modes have the same behaviour with
respect to host crash safety, which is not at all true.
The misunderstanding and complexity stems from not realizing that each
QEMU cache mode is a shorthand to toggle *three* booleans. Refer to the
convenient cache mode table in the code comment (in
nova/virt/libvirt/driver.py).
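For reference, a reconstruction of that mapping from QEMU's
documentation (consult the code comment for the authoritative
version):

    cache mode   | cache.writeback | cache.direct | cache.no-flush
    -------------+-----------------+--------------+---------------
    writeback    | on              | off          | off
    none         | on              | on           | off
    writethrough | off             | off          | off
    directsync   | off             | on           | off
    unsafe       | on              | off          | on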
As Kevin Wolf (thanks!), QEMU Block Layer maintainer, explains (I made
a couple of micro edits for clarity):
The thing that makes 'writethrough' so safe against host crashes is
that it never keeps data in a "write cache", but it calls fsync()
after _every_ write. This is also what makes it horribly slow. But
'cache=none' doesn't do this and therefore doesn't provide this kind
of safety. The guest OS must explicitly flush the cache in the
right places to make sure data is safe on the disk. And OSes do
that.
So if 'cache=none' is safe enough for you, then 'cache=writeback'
should be safe enough for you, too -- because both of them have the
boolean 'cache.writeback=on'. The difference is only in
'cache.direct', but 'cache.direct=on' only bypasses the host kernel
page cache and data could still sit in other caches that could be
present between QEMU and the disk (such as commonly a volatile write
cache on the disk itself).
So use 'writeback' mode instead of the debilitatingly slow
'writethrough' for cases where the O_DIRECT-based 'none' is unsupported.
Do the minimum required update to the `disk_cachemodes` config help
text. (In a future patch, rewrite the cache modes documentation to fix
confusing fragments and outdated information.)
Closes-Bug: #1818847
Change-Id: Ibe236988af24a3b43508eec4efbe52a4ed05d45f
Signed-off-by: Kashyap Chamarthy <kchamart@redhat.com>
Looks-good-to-me'd-by: Kevin Wolf <kwolf@redhat.com>