1580 Commits

Author SHA1 Message Date
Zuul
1da3c4c399 Merge "conf: Remove cells v1 options, group" 2019-06-05 14:27:18 +00:00
Zuul
9762090711 Merge "Change the default of notification_format to unversioned" 2019-06-05 13:10:04 +00:00
Balazs Gibizer
ed613aa66f Change the default of notification_format to unversioned
The default config value `both` means that both the legacy and the
versioned notifications are emitted. This was selected as the default
in the past because we thought it would help adoption of the versioned
interface while we worked to bring the new interface to feature parity
with the legacy one. Even though the versioned notification interface
has been at feature parity with the legacy interface since Stein, the
projects consuming nova notifications do not have the resources to
switch to the new interface.

On the other hand, having `both` as the default in an environment
where only the legacy notifications are consumed causes performance
issues on the message bus; hence bug #1805659.

The original plan was that we set the default to `versioned` when the
interface reaches feature parity but as major consumers are not ready
to switch we cannot do that.

So the only option left is to set the default to `unversioned`.
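
As a rough illustration, the new default corresponds to an oslo.config
option definition along these lines (a sketch; the exact definition in
nova/conf/notifications.py may differ in help text and details):

  from oslo_config import cfg

  # Sketch: the default flips from 'both' to 'unversioned'.
  notification_format = cfg.StrOpt(
      'notification_format',
      choices=['unversioned', 'versioned', 'both'],
      default='unversioned',  # previously 'both'
      help='Specifies which notification format shall be emitted by nova.')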

Related devstack patch: https://review.opendev.org/#/c/662849/

Closes-Bug: #1805659

Change-Id: I72faa356afffb7a079a9ce86fed1b463773a0507
2019-06-04 10:36:45 +02:00
Konstantinos Samaras-Tsakiris
ca543438e1 Hide hypervisor id on windows guests
Blueprints hide-hypervisor-id-flavor-extra-spec [1] and
add-kvm-hidden-feature [2] allow hiding KVM's signature for guests,
which is necessary for Nvidia drivers to work in VMs with passthrough
GPUs.  While this works well for Linux guests on KVM, it doesn't work
for Windows guests.

For them, KVM emulates some HyperV features. With the
current implementation, KVM's signature is hidden, but HyperV's is not,
and Nvidia drivers don't work in Windows VMs.

This change generates an extra element in the libvirt xml for Windows
guests on KVM which obfuscates HyperV's signature too, controlled by the
existing image and flavor parameters (img_hide_hypervisor_id and
hide_hypervisor_id respectively). The extra xml element is
  <vendor_id state='on' value='1234567890ab'/>
in features/hyperv.
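
For context, the relevant part of the generated domain XML would look
roughly like this (a sketch; surrounding elements abbreviated):

  <features>
    <kvm>
      <hidden state='on'/>
    </kvm>
    <hyperv>
      <vendor_id state='on' value='1234567890ab'/>
    </hyperv>
  </features>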

[1] https://blueprints.launchpad.net/nova/+spec/hide-hypervisor-id-flavor-extra-spec
[2] https://blueprints.launchpad.net/nova/+spec/add-kvm-hidden-feature

Change-Id: Iaaeae9281301f14f4ae9b43f4a06de58b699fd68
Closes-Bug: 1779845
2019-06-04 00:51:08 +00:00
Zuul
653515a450 Merge "Count instances from mappings and cores/ram from placement" 2019-05-30 20:39:43 +00:00
Zuul
6f6f99d13d Merge "Block swap volume on volumes with >1 rw attachment" 2019-05-30 17:47:10 +00:00
Stephen Finucane
10bbe6b739 conf: Remove cells v1 options, group
This is no longer used anywhere and can therefore be safely removed.

Part of blueprint remove-cells-v1

Change-Id: I16b6d428accabf9dd7692909084faaf426e13524
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
2019-05-29 17:14:13 +01:00
melanie witt
8354f42e20 Count instances from mappings and cores/ram from placement
This counts instance mappings when counting quota usage for instances
and adds calls to placement to count quota usage for cores and ram.

During an upgrade, if any un-migrated instance mappings are found (with
NULL user_id or NULL queued_for_delete fields), we will fall back to
the legacy counting method (see the sketch after the list below).

Counting quota usage from placement is opt-in via the
[quota]count_usage_from_placement configuration option because:

  * Though counting from placement is beneficial for multi-cell
    deployments that want resiliency to down cells, the vast majority
    of deployments are single cell; they will not realize a down-cells
    resiliency benefit and may prefer to keep legacy quota usage
    counting.

  * Usage for resizes will reflect resources being held on both the
    source and destination until the resize is confirmed or reverted.
    Operators may not want to enable counting from placement based on
    whether the behavior change is problematic for them.

  * Placement does not yet support the ability to partition resource
    providers from multiple Nova deployments, so environments that are
    sharing a single placement deployment would see usage that
    aggregates all Nova deployments together. Such environments should
    not enable counting from placement.

  * Usage for unscheduled instances in ERROR state will not reflect
    resource consumption for cores and ram because the instance has no
    placement allocations.

  * Usage for instances in SHELVED_OFFLOADED state will not reflect
    resource consumption for cores and ram because the instance has no
    placement allocations. Note that because of this, it will be possible for a
    request to unshelve a server to be rejected if the user does not have
    enough quota available to support the cores and ram needed by the server to
    be unshelved.
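
As referenced above, a minimal sketch of the counting flow
(hypothetical helper names; the real logic lives in nova's quota code
and differs in detail):

  # Sketch: count instances from mappings; count cores/ram from
  # placement, falling back to the legacy method while un-migrated
  # instance mappings still exist.
  def count_usage(context, project_id):
      if has_unmigrated_instance_mappings(context):  # hypothetical helper
          return legacy_count_usage(context, project_id)
      usage = {'instances': count_instance_mappings(context, project_id)}
      # Placement reports usage per resource class, e.g. VCPU, MEMORY_MB.
      placement_usage = placement_client.get_usages(context, project_id)
      usage['cores'] = placement_usage.get('VCPU', 0)
      usage['ram'] = placement_usage.get('MEMORY_MB', 0)
      return usage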

Part of blueprint count-quota-usage-from-placement

Change-Id: Ie22b0acb5824a41da327abdcf9848d02fc9a92f5
2019-05-23 18:01:58 +00:00
Jake Yip
e822360b66 Add --before to nova-manage db archive_deleted_rows
Add a parameter to limit the archival of deleted rows by date. That is,
only rows related to instances deleted before the provided date will be
archived.

This option works together with --max_rows; if both are specified,
both take effect.
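
Usage would look roughly like this (a sketch; the accepted date
formats may vary):

  nova-manage db archive_deleted_rows --before "2019-01-01" --max_rows 1000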

Closes-Bug: #1751192
Change-Id: I408c22d8eada0518ec5d685213f250e8e3dae76e
Implements: blueprint nova-archive-before
2019-05-23 11:07:08 +10:00
Matt Riedemann
5a1d159d14 Block swap volume on volumes with >1 rw attachment
If we're swapping from a multiattach volume that has more than one
read/write attachment, another server on a secondary attachment could
be writing data to the volume that does not get copied into the volume
to which we're swapping, so data could be lost during the swap.

This change does volume read/write attachment counting for the volume
we're swapping from and if there is more than one read/write attachment
on the volume, the swap volume operation fails with a 400 BadRequest
error.
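
A sketch of the guarding check (hypothetical helper name; the real
check sits in the compute API's swap-volume path):

  from webob import exc

  # Sketch: reject the swap if the source volume has more than one
  # read/write attachment.
  def check_swap_allowed(volume):
      rw_attachments = [a for a in volume.get('attachments', [])
                        if a.get('attach_mode') == 'rw']
      if volume.get('multiattach') and len(rw_attachments) > 1:
          raise exc.HTTPBadRequest(
              explanation='Swapping a multiattach volume with more than '
                          'one read/write attachment is not supported.')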

Depends-On: https://review.openstack.org/573025/
Closes-Bug: #1775418
Change-Id: Icd7fcb87a09c35a13e4e14235feb30a289d22778
2019-05-22 09:09:00 +00:00
Zuul
2f26a22100 Merge "Fix failure to boot instances with qcow2 format images" 2019-05-21 08:28:39 +00:00
zhu.boxiang
6c6ffc0476 Fix failure to boot instances with qcow2 format images
Ceph doesn't support QCOW2 for hosting a virtual machine
disk:

  http://docs.ceph.com/docs/master/rbd/rbd-openstack/

When image_type is set to rbd and force_raw_images to False,
and an instance is launched without boot-from-volume, the
instance is spawned using qcow2 as its root disk but fails
to boot because the data is accessed as raw.

To fix this, we raise an error and refuse to start the
nova-compute service when force_raw_images and image_type
are incompatible. When importing an image into rbd, we also
check the format of cached images: if a cached image is not
raw, we remove it and fetch it again, so that it ends up in
raw format.
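
A rough sketch of the startup guard (hedged; option names follow the
libvirt driver's images_type/force_raw_images settings, and the real
code raises a nova-specific exception from the driver's startup path):

  # Sketch: refuse to start nova-compute with an incompatible combination.
  def validate_image_settings(CONF):
      if CONF.libvirt.images_type == 'rbd' and not CONF.force_raw_images:
          raise RuntimeError(
              'images_type=rbd requires force_raw_images=True because '
              'Ceph cannot boot instances from qcow2-format disks.')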

Change-Id: I1aa471e8df69fbb6f5d9aeb35651bd32c7123d78
Closes-Bug: 1816686
2019-05-20 19:10:31 +08:00
Zuul
f8e8bba512 Merge "Add image type request filter" 2019-05-15 21:53:55 +00:00
Zuul
cf76777b61 Merge "Microversion 2.73: Support adding the reason behind a server lock" 2019-05-12 02:05:18 +00:00
Surya Seetharaman
c541ace518 Microversion 2.73: Support adding the reason behind a server lock
This patch adds a new parameter ``locked_reason`` to
``POST /servers/{server_id}/action`` request where the
action is lock. It enables the user to specify a reason when locking
a server.
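
For illustration, the lock action request body at this microversion
would look roughly like the following (expressed as a Python dict; a
sketch):

  # Sketch of a POST /servers/{server_id}/action request body (v2.73).
  lock_body = {
      'lock': {
          'locked_reason': 'locked pending a security audit',
      }
  }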

The locked_reason will be exposed in the responses of
``GET /servers/{server_id}``, ``GET /servers/detail``,
``POST /servers/{server_id}/action`` where the action is rebuild, and
``PUT /servers/{server_id}``.

The InstanceActionNotification will emit the locked_reason
along with the other instance details. This patch hence changes the
payload object to include the "locked_reason" field.

Note that "locked" will be allowed as a valid filtering/sorting parameter
for ``GET /servers/detail`` and ``GET /servers`` from this new microversion.

Implements blueprint add-locked-reason

Change-Id: I46edd595e7417c584106487123774a73c6dbe65e
2019-05-11 21:48:27 +00:00
Zuul
44e686c727 Merge "Add --instance option to heal_allocations" 2019-05-09 19:22:50 +00:00
Zuul
7f185133f0 Merge "libvirt: auto detach/attach sriov ports on migration" 2019-05-09 17:35:53 +00:00
Dan Smith
57978de4a8 Add image type request filter
This enables the scheduler, if configured, to limit placement results
to only computes that support the disk_format of the image used
for the request.
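
A condensed sketch of how such a request filter might work (hedged;
the attribute and trait names are illustrative, loosely following
nova/scheduler/request_filter.py):

  # Sketch: translate the image disk_format into a required placement
  # trait so only supporting computes are returned.
  def require_image_type_support(ctxt, request_spec):
      if not CONF.scheduler.query_placement_for_image_type_support:
          return False
      disk_format = request_spec.image.disk_format  # e.g. 'qcow2'
      trait = 'COMPUTE_IMAGE_TYPE_%s' % disk_format.upper()
      request_spec.required_traits.add(trait)  # hypothetical attribute
      return True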

Change-Id: I41511365eb2b76c4cad804445766638a92b68378
2019-05-07 10:24:20 -07:00
Zuul
1388855be2 Merge "Delete the placement code" 2019-05-04 09:16:41 +00:00
Zuul
8aed9cdc01 Merge "Add nova-status upgrade check for minimum required cinder API version" 2019-05-04 03:05:42 +00:00
Zuul
d1243d3473 Merge "Remove [ironic]api_endpoint option" 2019-05-03 21:08:38 +00:00
Matt Riedemann
270d5d351e Add nova-status upgrade check for minimum required cinder API version
The compute API has required cinder API >= 3.44 since Queens [1] for
working with the volume attachments API as part of the wider
volume multi-attach support.

In order to start removing the compatibility code in the compute API
this change adds an upgrade check for the minimum required cinder API
version (3.44).

[1] Ifc01dbf98545104c998ab96f65ff8623a6db0f28
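
A hedged sketch of the shape of such a check (hypothetical helper and
result names; the real check lives alongside the other nova-status
upgrade checks):

  MIN_CINDER_API_VERSION = '3.44'

  def check_cinder_api(self):
      # Hypothetical helper returning cinder's maximum supported
      # microversion, e.g. '3.59'.
      max_version = get_cinder_max_api_version()
      if not version_is_at_least(max_version, MIN_CINDER_API_VERSION):
          return UpgradeCheckResult(
              FAILURE, 'cinder API version %s or greater is required'
              % MIN_CINDER_API_VERSION)
      return UpgradeCheckResult(SUCCESS)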

Change-Id: Ic9d1fb364e06e08250c7c5d7d4bdb956cb60e678
2019-05-03 11:53:12 -04:00
Zuul
dd6bd75355 Merge "Query in_tree to placement" 2019-05-02 21:55:38 +00:00
Takashi NATSUME
7cf16e317b Remove deprecated 'default_flavor' config option
The deprecated 'default_flavor' option has been removed.
The following methods in nova/compute/flavors.py
have also been removed because they were only used in unit tests.

* get_default_flavor
* get_flavor_by_name

Change-Id: If1e461da382f707be2b5ba89f74f77269f0909dd
2019-04-30 13:01:40 +00:00
Chris Dent
70a2879b2c Delete the placement code
This finalizes the removal of the placement code from nova.
This change primarily removes code and makes fixes to cmd,
test and migration tooling to adapt to the removal.

Placement tests and documentation were already removed in
earlier patches.

A database migration that calls
consumer_obj.create_incomplete_consumers in nova-manage has been
removed.

A functional test which confirms the default incomplete
consumer user and project id has been changed so that its use of
conf.placement.incomplete_* (now removed) is replaced with a
constant. The placement server, running in the functional
test, provides its own config.

Placement-related configuration is updated to only register those
opts which are relevant on the nova side. This mostly means
ksa-related opts. The placement-database configuration is removed
from nova/conf/database.

tox.ini is updated to remove the group_regex required by the
placement gabbi tests. This should probably have gone when the
placement functional tests went, but was overlooked.

A release note is added which describes that this is cleanup (the
main action already happened) and points people to the nova to
placement upgrade instructions in case they haven't followed them
yet.

Change-Id: I4181f39dea7eb10b84e6f5057938767b3e422aff
2019-04-28 20:06:15 +00:00
Zuul
fe198c1bf8 Merge "Replace git.openstack.org URLs with opendev.org URLs" 2019-04-26 20:38:42 +00:00
Eric Fried
d54f35d54e Remove [ironic]api_endpoint option
[ironic]api_endpoint was deprecated in Queens [1] and is hereby
annihilated.

[1] If625411f40be0ba642baeb02950f568f43673655

Change-Id: I527f512b371705b490ba55dfab101340d417edb6
2019-04-26 12:02:02 -05:00
Sean Mooney
d966ffabc3 libvirt: auto detach/attach sriov ports on migration
- This patch detaches all direct-mode SR-IOV interfaces before
  calculating the updated xml for the destination, immediately before
  starting the migration.
- This change modifies post_live_migration_at_destination to check
  whether the instance has all of its interfaces defined in the guest
  xml and attaches any missing SR-IOV interfaces.
- This change adds a release note for the SR-IOV live migration feature.
- This change extends the base virt driver interface with a new method,
  rollback_live_migration_at_source, and invokes it from
  rollback_live_migration in the compute manager.

Change-Id: Ib61913d9d6ef6148170963463bb71c13f4272c5d
Implements: blueprint libvirt-neutron-sriov-livemigration
2019-04-26 11:11:40 +01:00
ZhongShengping
7ecaa3fcf8 Replace git.openstack.org URLs with opendev.org URLs
Thorough replacement of git.openstack.org URLs with their opendev.org
counterparts.

Change-Id: I3e0af55e0707f04428a422b973d016ad30c82a12
2019-04-24 13:59:57 +08:00
Adrian Chiris
fd8fdc9345 SR-IOV Live migration indirect port support
This patch builds on previous patches and enables live migration
with SR-IOV indirect ports.

Prior to this change, migration would have either:
  - Failed, with the instance still running on the source node.
  - Failed with two VMs booted on both the source and destination
    nodes, the VM state set to migrating, duplicate MACs on the
    source and destination nodes, and improper PCI resource claiming,
    with very little, non-user-friendly information in the log.
    This scenario is observed with the macvtap port type when
    neutron does not support the multiple port bindings API extension.

Conductor Changes:
- Allow live migration only with VIF-related PCI devices, so that PCI
  resources are properly claimed on the destination node.
  With this change, live migration with generic flavor-based
  PCI passthrough devices is still not supported, due to libvirt and
  qemu constraints.

- Add a check to allow live migration with VIF-related PCI allocations
  only when neutron supports the multiple port bindings API extension
  and the compute nodes are up to date.

- Update the migrating VIF with the correct profile when binding the
  ports on the destination host. This allows proper binding against
  the destination host and ensures the VIF will be plugged correctly
  by Nova.

Compute Changes:
- Create VIFMigrateData for all VIFs in
  check_can_live_migrate_destination()

- For every VIF that contains a PCI device in its profile, claim a
  PCI device on the destination node using the matching
  InstancePCIRequest of the instance being migrated.

- Update the relevant VIFMigrateData profile with the newly
  claimed PCI device.

- Free PCI devices on the source and allocate them on the destination
  upon a successful migration, or free the claimed PCI devices on the
  destination upon failure.

NeutronV2 Changes:
- Don't update the binding profile with PCI devices if the migration
  type is live-migration, as the profile was already updated when an
  inactive port binding was created during the bind_ports_to_host()
  call from the conductor.

  Note: This builds on multiple ports binding API.

Change-Id: I734cc01dce13f9e75a16639faf890ddb1661b7eb
Partial-Implement: blueprint libvirt-neutron-sriov-livemigration
2019-04-21 11:18:24 +03:00
Tetsuro Nakamura
575fd08e63 Query in_tree to placement
This patch adds the translation of `RequestGroup.in_tree` to the
actual placement query and bumps the placement microversion to enable
it.

A release note for this change is added.
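
Illustratively, the resulting placement call looks something like this
(a sketch; the client call is hypothetical):

  # Sketch: GET /allocation_candidates with the in_tree filter, which
  # placement supports from microversion 1.31.
  params = {
      'resources': 'VCPU:1,MEMORY_MB:512',
      'in_tree': compute_node_rp_uuid,  # root provider UUID of one host
  }
  resp = placement_client.get('/allocation_candidates', params=params,
                              version='1.31')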

Change-Id: I8ec95d576417c32a57aa0298789dac6afb0cca02
Blueprint: use-placement-in-tree
Related-Bug: #1777591
2019-04-17 08:52:59 +00:00
Stephen Finucane
7954b2714e Remove 'nova-manage cell' commands
These are no longer necessary with the removal of cells v1. A check for
cells v1 in 'nova-manage cell_v2 simple_cell_setup' is also removed,
meaning this can no longer return the '2' exit code.

Part of blueprint remove-cells-v1

Change-Id: I8c2bfb31224300bc639d5089c4dfb62143d04b7f
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
2019-04-16 18:26:17 +01:00
Stephen Finucane
fb14f24cc3 Remove '/os-cells' REST APIs
Drop support for the os-cells REST APIs, which are part of the cells v1
feature which has been deprecated since Pike.

This API now returns a 410 response for all routes.

Unit tests are removed and the functional API sample tests are just
asserting the 410 response now. The latter are also expanded to cover
APIs that weren't previously tested.

The API sample docs are left intact since the API reference still builds
from those and can be considered more or less branchless, so people
looking at the API reference can apply it to older deployments of nova
before os-cells was removed.

A release note added for previous cells v1 removals is amended to note
this additional change.

Part of blueprint remove-cells-v1

Change-Id: Iddb519008515f591cf1d884872a5887afbe766f2
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
2019-04-16 18:26:13 +01:00
Zuul
e25d59078e Merge "Add minimum value in max_concurrent_live_migrations" 2019-04-16 15:50:48 +00:00
Zuul
5868303f2c Merge "conf: Undeprecate and move the 'dhcp_domain' option" 2019-04-16 12:48:40 +00:00
Zuul
013aa1915c Merge "Remove 'nova-cells' service" 2019-04-16 08:25:34 +00:00
Zuul
5297d58821 Merge "Handle unsetting '[DEFAULT] dhcp_domain'" 2019-04-16 05:23:26 +00:00
Stephen Finucane
886b0a5d74 conf: Undeprecate and move the 'dhcp_domain' option
The metadata service makes use of the deprecated '[DEFAULT] dhcp_domain'
option when providing a hostname to the instance. This is used by
cloud-init to configure the hostname in the instance. This use was not
captured when the option was initially deprecated. This option is now
undeprecated and moved to the '[api]' group to ensure it won't be
removed alongside the other nova-network options.

Change-Id: I3940ebd1888d8019716e7d4eb6d4a413a37b9b78
Closes-Bug: #1698010
2019-04-15 15:34:12 -04:00
Stephen Finucane
97549a2c41 Handle unsetting '[DEFAULT] dhcp_domain'
Fix a long-standing issue whereby setting 'dhcp_domain' to 'None' would
result in a hostname of '${hostname}None' instead of '${hostname}'.
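
The failure mode boils down to string formatting along these lines
(illustrative sketch):

  hostname = 'vm-1'
  dhcp_domain = None
  # Naive formatting leaks the literal 'None' into the hostname:
  assert '%s%s' % (hostname, dhcp_domain) == 'vm-1None'
  # The fix: only append a domain when one is actually set.
  result = '%s.%s' % (hostname, dhcp_domain) if dhcp_domain else hostname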

Change-Id: Ic9aa74f5344ba469b61a87de1ebd27e6f49c3318
Closes-Bug: #1824813
2019-04-15 15:34:04 -04:00
Matt Riedemann
c92b297896 Add --instance option to heal_allocations
This resolves one of the TODOs in the heal_allocations CLI
by adding an --instance option to the command which, when
specified, will process just the single instance given.

Change-Id: Icf57f217f03ac52b1443addc34aa5128661a8554
2019-04-15 10:34:59 -04:00
Takashi NATSUME
37c42c97e2 Add minimum value in max_concurrent_live_migrations
Add a minimum value of 0 to the max_concurrent_live_migrations option.

Change-Id: I52ead0154e311a0ebdfd6d7413704fa350020587
2019-04-12 01:05:29 +00:00
Matt Riedemann
ded3e4d900 Add --dry-run option to heal_allocations CLI
This resolves one of the TODOs in the heal_allocations CLI
by adding a --dry-run option which still prints the output as
instances are processed but does not commit any allocation changes
to placement; it just reports what would happen.

Change-Id: Ide31957306602c1f306ebfa48d6e95f48b1e8ead
2019-04-11 18:15:09 -04:00
Stephen Finucane
a4743f982a Remove 'nova-cells' service
We're going to start unpicking this stuff from the top down. Start with
the 'nova-cells' executable itself.

Part of blueprint remove-cells-v1

Change-Id: I5bd1dd9f1bbae7a977ab9e032c4f4d200c35e193
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
2019-04-09 17:15:37 +01:00
Zuul
fc8c957eea Merge "Added mount fstype based validation of Quobyte mounts" 2019-04-05 17:41:13 +00:00
Zuul
2384c41b78 Merge "libvirt: Use 'writeback' QEMU cache mode when 'none' is not viable" 2019-04-02 16:34:20 +00:00
Zuul
be9ac9989d Merge "Exec systemd-run without --user flag in Quobyte driver" 2019-03-28 07:28:03 +00:00
Zuul
0456b43985 Merge "Set min=0 for block_device_allocate_retries option" 2019-03-25 23:24:28 +00:00
Zuul
c9977f2aed Merge "Fix links to neutron QoS minimum bandwidth doc" 2019-03-21 16:41:35 +00:00
Zuul
aa1bfb645f Merge "Add a prelude release note for the 19.0.0 Stein GA" 2019-03-21 15:59:33 +00:00
Kashyap Chamarthy
b9dc86d8d6 libvirt: Use 'writeback' QEMU cache mode when 'none' is not viable
When configuring QEMU cache modes for Nova instances, we use
'writethrough' when 'none' is not available.  But that's not correct,
because of our misunderstanding of how cache modes work.  E.g. the
function disk_cachemode() in the libvirt driver assumes that
'writethrough' and 'none' cache modes have the same behaviour with
respect to host crash safety, which is not at all true.

The misunderstanding and complexity stem from not realizing that each
QEMU cache mode is a shorthand to toggle *three* booleans.  Refer to the
convenient cache mode table in the code comment (in
nova/virt/libvirt/driver.py).
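
For reference, the three booleans behind each mode map out roughly as
follows (a sketch based on the QEMU documentation; the authoritative
table is the code comment mentioned above):

  # cache mode      -> (cache.writeback, cache.direct, cache.no-flush)
  QEMU_CACHE_MODES = {
      'writeback':    (True,  False, False),
      'none':         (True,  True,  False),
      'writethrough': (False, False, False),
      'directsync':   (False, True,  False),
      'unsafe':       (True,  False, True),
  }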

As Kevin Wolf (thanks!), QEMU Block Layer maintainer, explains (I made
a couple of micro edits for clarity):

    The thing that makes 'writethrough' so safe against host crashes is
    that it never keeps data in a "write cache", but it calls fsync()
    after _every_ write.  This is also what makes it horribly slow.  But
    'cache=none' doesn't do this and therefore doesn't provide this kind
    of safety.  The guest OS must explicitly flush the cache in the
    right places to make sure data is safe on the disk.  And OSes do
    that.

    So if 'cache=none' is safe enough for you, then 'cache=writeback'
    should be safe enough for you, too -- because both of them have the
    boolean 'cache.writeback=on'.  The difference is only in
    'cache.direct', but 'cache.direct=on' only bypasses the host kernel
    page cache and data could still sit in other caches that could be
    present between QEMU and the disk (such as commonly a volatile write
    cache on the disk itself).

So use 'writeback' mode instead of the debilitatingly slow
'writethrough' for cases where the O_DIRECT-based 'none' is unsupported.

Do the minimum required update to the `disk_cachemodes` config help
text.  (In a future patch, rewrite the cache modes documentation to fix
confusing fragments and outdated information.)

Closes-Bug: #1818847
Change-Id: Ibe236988af24a3b43508eec4efbe52a4ed05d45f
Signed-off-by: Kashyap Chamarthy <kchamart@redhat.com>
Looks-good-to-me'd-by: Kevin Wolf <kwolf@redhat.com>
2019-03-21 14:17:22 +01:00