Add configuration of maximum disk devices to attach

This adds a new config option to control the maximum number of disk
devices allowed to attach to a single instance, which can be set per
compute host.

The configured maximum is enforced when device names are generated
during server create, rebuild, evacuate, unshelve, live migrate, and
attach volume. When the maximum is exceeded during server create,
rebuild, evacuate, unshelve, or live migrate, the server will go into
ERROR state and the server fault will contain the reason. When the
maximum is exceeded during an attach volume request, the request fails
fast in the API with a 403 error.

The configured maximum on the destination is not enforced before cold
migrate because the maximum is enforced in-place only (the destination
is not checked over RPC). The configured maximum is also not enforced
on shelved offloaded servers because they have no compute host, and the
option is implemented at the nova-compute level.
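
For example, to cap each instance on a given compute host at 20 disk
devices (20 being an illustrative value, not a recommendation), an
operator would set in nova.conf:

    [compute]
    max_disk_devices_to_attach = 20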

Part of blueprint conf-max-attach-volumes

Change-Id: Ia9cc1c250483c31f44cdbba4f6342ac8d7fbe92b
melanie witt 2019-01-15 06:37:23 +00:00 committed by Matt Riedemann
parent 6489f2d2b4
commit bb0906f4f3
8 changed files with 302 additions and 14 deletions


@@ -48,6 +48,12 @@ When we talk about block device mapping, we usually refer to one of two things
virt driver code). We will refer to this format as 'Driver BDMs' from now
on.
.. note::
The maximum limit on the number of disk devices allowed to attach to
a single server is configurable with the option
:oslo.config:option:`compute.max_disk_devices_to_attach`.
Data format and its history
----------------------------


@@ -39,6 +39,12 @@ To complete these tasks, use these parameters on the
:cinder-doc:`Cinder documentation
<cli/cli-manage-volumes.html#attach-a-volume-to-an-instance>`.
.. note::
The maximum limit on the number of disk devices allowed to attach to
a single server is configurable with the option
:oslo.config:option:`compute.max_disk_devices_to_attach`.
.. _Boot_instance_from_image_and_attach_non-bootable_volume:
Boot instance from image and attach non-bootable volume


@@ -1566,6 +1566,17 @@ class ComputeManager(manager.Manager):
requested_networks, macs, security_groups, is_vpn)
def _default_root_device_name(self, instance, image_meta, root_bdm):
"""Gets a default root device name from the driver.
:param nova.objects.Instance instance:
The instance for which to get the root device name.
:param nova.objects.ImageMeta image_meta:
The metadata of the image of the instance.
:param nova.objects.BlockDeviceMapping root_bdm:
The description of the root device.
:returns: str -- The default root device name.
:raises: InternalError, TooManyDiskDevices
"""
try:
return self.driver.default_root_device_name(instance,
image_meta,
@@ -1576,6 +1587,15 @@ class ComputeManager(manager.Manager):
def _default_device_names_for_instance(self, instance,
root_device_name,
*block_device_lists):
"""Default the missing device names in the BDM from the driver.
:param nova.objects.Instance instance:
The instance for which to get default device names.
:param str root_device_name: The root device name.
:param list block_device_lists: List of block device mappings.
:returns: None
:raises: InternalError, TooManyDiskDevices
"""
try:
self.driver.default_device_names_for_instance(instance,
root_device_name,
@@ -1585,6 +1605,18 @@ class ComputeManager(manager.Manager):
instance, root_device_name, *block_device_lists)
def _get_device_name_for_instance(self, instance, bdms, block_device_obj):
"""Get the next device name from the driver, based on the BDM.
:param nova.objects.Instance instance:
The instance whose volume is requesting a device name.
:param nova.objects.BlockDeviceMappingList bdms:
The block device mappings for the instance.
:param nova.objects.BlockDeviceMapping block_device_obj:
A block device mapping containing info about the requested block
device.
:returns: The next device name.
:raises: InternalError, TooManyDiskDevices
"""
# NOTE(ndipanov): Copy obj to avoid changing the original
block_device_obj = block_device_obj.obj_clone()
try:
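
A rough sketch of how a caller sees the new exception (illustrative only,
not part of this change; the handler shown is assumed):

    from nova import exception

    def pick_device_name(manager, instance, bdms, bdm):
        # manager is a ComputeManager; delegate to the wrapper above.
        try:
            return manager._get_device_name_for_instance(instance, bdms, bdm)
        except exception.TooManyDiskDevices:
            # Re-raised so the build path can record an instance fault
            # (ERROR state), or the API can return a 403 for attach volume.
            raise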


@@ -19,7 +19,6 @@ import functools
import inspect
import itertools
import math
-import string
import traceback
import netifaces
@@ -117,6 +116,9 @@ def get_device_name_for_instance(instance, bdms, device):
This method is a wrapper for get_next_device_name that gets the list
of used devices and the root device from a block device mapping.
:raises TooManyDiskDevices: if the maximum allowed devices to attach to a
single instance is exceeded.
"""
mappings = block_device.instance_block_mapping(instance, bdms)
return get_next_device_name(instance, mappings.values(),
@@ -125,7 +127,12 @@ def get_device_name_for_instance(instance, bdms, device):
def default_device_names_for_instance(instance, root_device_name,
                                      *block_device_lists):
-    """Generate missing device names for an instance."""
+    """Generate missing device names for an instance.
+
+    :raises TooManyDiskDevices: if the maximum allowed devices to attach to a
+        single instance is exceeded.
+    """
dev_list = [bdm.device_name
for bdm in itertools.chain(*block_device_lists)
@@ -143,6 +150,15 @@ def default_device_names_for_instance(instance, root_device_name,
dev_list.append(dev)
def check_max_disk_devices_to_attach(num_devices):
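"""Raise TooManyDiskDevices if num_devices exceeds the configured maximum.
A configured maximum of -1 means unlimited, so the check is skipped.
"""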
maximum = CONF.compute.max_disk_devices_to_attach
if maximum < 0:
return
if num_devices > maximum:
raise exception.TooManyDiskDevices(maximum=maximum)
def get_next_device_name(instance, device_name_list,
root_device_name=None, device=None):
"""Validates (or generates) a device name for instance.
@@ -153,6 +169,9 @@ def get_next_device_name(instance, device_name_list,
name is valid but applicable to a different backend (for example
/dev/vdc is specified but the backend uses /dev/xvdc), the device
name will be converted to the appropriate format.
:raises TooManyDiskDevices: if the maximum allowed devices to attach to a
single instance is exceeded.
"""
req_prefix = None
@@ -196,6 +215,8 @@ def get_next_device_name(instance, device_name_list,
if flavor.swap:
used_letters.add('c')
check_max_disk_devices_to_attach(len(used_letters) + 1)
if not req_letter:
req_letter = _get_unused_letter(used_letters)
@@ -261,13 +282,13 @@ def convert_mb_to_ceil_gb(mb_value):
def _get_unused_letter(used_letters):
-    doubles = [first + second for second in string.ascii_lowercase
-               for first in string.ascii_lowercase]
-    all_letters = set(list(string.ascii_lowercase) + doubles)
-    letters = list(all_letters - used_letters)
-    # NOTE(vish): prepend ` so all shorter sequences sort first
-    letters.sort(key=lambda x: x.rjust(2, '`'))
-    return letters[0]
+    # Return the first unused device letter
+    index = 0
+    while True:
+        letter = block_device.generate_device_letter(index)
+        if letter not in used_letters:
+            return letter
+        index += 1
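
For reference, a minimal sketch of the nova.block_device helpers assumed
above (illustrative; the real implementations live in nova/block_device.py):

    import string

    def generate_device_letter(index):
        # 0 -> 'a', 25 -> 'z', 26 -> 'aa', 27 -> 'ab', ...
        letters = ''
        while index >= 0:
            letters = string.ascii_lowercase[index % 26] + letters
            index = index // 26 - 1
        return letters

    def generate_device_name(prefix, index):
        # e.g. generate_device_name('vd', 26) -> 'vdaa'
        return prefix + generate_device_letter(index)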
def get_value_from_system_metadata(instance, key, type, default):


@@ -802,6 +802,50 @@ Number of concurrent disk-IO-intensive operations (glance image downloads,
image format conversions, etc.) that we will do in parallel. If this is set
too high then response time suffers.
The default value of 0 means no limit.
"""),
cfg.IntOpt('max_disk_devices_to_attach',
default=-1,
min=-1,
help="""
Maximum number of disk devices allowed to attach to a single server. Note
that the number of disks supported by a server depends on the bus used. For
example, the ``ide`` disk bus is limited to 4 attached devices. The configured
maximum is enforced during server create, rebuild, evacuate, unshelve, live
migrate, and attach volume.
Usually, disk bus is determined automatically from the device type or disk
device, and the virtualization type. However, disk bus
can also be specified via a block device mapping or an image property.
See the ``disk_bus`` field in :doc:`/user/block-device-mapping` for more
information about specifying disk bus in a block device mapping, and
see https://docs.openstack.org/glance/latest/admin/useful-image-properties.html
for more information about the ``hw_disk_bus`` image property.
Operators changing ``[compute]/max_disk_devices_to_attach`` on a compute
service that is hosting servers should be aware that it could cause rebuilds
to fail if the maximum is decreased below the number of devices already
attached to servers. For example, if server A has 26 devices attached and an
operator changes ``[compute]/max_disk_devices_to_attach`` to 20, a request to
rebuild server A will fail and go into ERROR state because 26 devices are
already attached and exceed the new configured maximum of 20.
Operators setting ``[compute]/max_disk_devices_to_attach`` should also be aware
that during a cold migration, the configured maximum is only enforced in-place
and the destination is not checked before the move. This means if an operator
has set a maximum of 26 on compute host A and a maximum of 20 on compute host
B, a cold migration of a server with 26 attached devices from compute host A to
compute host B will succeed. Then, once the server is on compute host B, a
subsequent request to rebuild the server will fail and go into ERROR state
because 26 devices are already attached and exceed the configured maximum of 20
on compute host B.
The configured maximum is not enforced on shelved offloaded servers, as they
have no compute host.
Possible values:
* -1 means unlimited
* Any integer >= 0 represents the maximum allowed
"""),
]


@@ -0,0 +1,116 @@
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.
import time
import six
from nova.tests import fixtures as nova_fixtures
from nova.tests.functional.api import client
from nova.tests.functional import integrated_helpers
from nova.tests.functional import test_servers
class ConfigurableMaxDiskDevicesTest(integrated_helpers.InstanceHelperMixin,
test_servers.ServersTestBase):
def setUp(self):
super(ConfigurableMaxDiskDevicesTest, self).setUp()
self.cinder = self.useFixture(
nova_fixtures.CinderFixtureNewAttachFlow(self))
def _wait_for_volume_attach(self, server_id, volume_id):
for i in range(0, 100):
server = self.api.get_server(server_id)
attached_vols = [vol['id'] for vol in
server['os-extended-volumes:volumes_attached']]
if volume_id in attached_vols:
return
time.sleep(.1)
self.fail('Timed out waiting for volume %s to be attached to '
'server %s. Currently attached volumes: %s' %
(volume_id, server_id, attached_vols))
def test_boot_from_volume(self):
# Set the maximum to 1 and boot from 1 volume. This should pass.
self.flags(max_disk_devices_to_attach=1, group='compute')
server = self._build_server(flavor_id='1')
server['imageRef'] = ''
volume_uuid = nova_fixtures.CinderFixtureNewAttachFlow.IMAGE_BACKED_VOL
bdm = {'boot_index': 0,
'uuid': volume_uuid,
'source_type': 'volume',
'destination_type': 'volume'}
server['block_device_mapping_v2'] = [bdm]
created_server = self.api.post_server({"server": server})
self._wait_for_state_change(self.api, created_server, 'ACTIVE')
def test_boot_from_volume_plus_attach_max_exceeded(self):
# Set the maximum to 1, boot from 1 volume, and attach one volume.
# This should fail because it's trying to attach 2 disk devices.
self.flags(max_disk_devices_to_attach=1, group='compute')
server = self._build_server(flavor_id='1')
server['imageRef'] = ''
vol_img_id = nova_fixtures.CinderFixtureNewAttachFlow.IMAGE_BACKED_VOL
boot_vol = {'boot_index': 0,
'uuid': vol_img_id,
'source_type': 'volume',
'destination_type': 'volume'}
vol_id = '9a695496-44aa-4404-b2cc-ccab2501f87e'
other_vol = {'uuid': vol_id,
'source_type': 'volume',
'destination_type': 'volume'}
server['block_device_mapping_v2'] = [boot_vol, other_vol]
created_server = self.api.post_server({"server": server})
server_id = created_server['id']
# Server should go into ERROR state
self._wait_for_state_change(self.api, created_server, 'ERROR')
# Verify the instance fault
server = self.api.get_server(server_id)
# If anything fails during _prep_block_device, a 500 internal server
# error is raised.
self.assertEqual(500, server['fault']['code'])
expected = ('Build of instance %s aborted: The maximum allowed number '
'of disk devices (1) to attach to a single instance has '
'been exceeded.' % server_id)
self.assertIn(expected, server['fault']['message'])
# Verify no volumes are attached (this is a generator)
attached_vols = list(self.cinder.volume_ids_for_instance(server_id))
self.assertNotIn(vol_img_id, attached_vols)
self.assertNotIn(vol_id, attached_vols)
def test_attach(self):
# Set the maximum to 2. This will allow one disk device for the local
# disk of the server and also one volume to attach. A second volume
# attach request will fail.
self.flags(max_disk_devices_to_attach=2, group='compute')
server = self._build_server(flavor_id='1')
created_server = self.api.post_server({"server": server})
server_id = created_server['id']
self._wait_for_state_change(self.api, created_server, 'ACTIVE')
# Attach one volume, should pass.
vol_id = '9a695496-44aa-4404-b2cc-ccab2501f87e'
self.api.post_server_volume(
server_id, dict(volumeAttachment=dict(volumeId=vol_id)))
self._wait_for_volume_attach(server_id, vol_id)
# Attach a second volume, should fail fast in the API.
other_vol_id = 'f2063123-0f88-463c-ac9d-74127faebcbe'
ex = self.assertRaises(
client.OpenStackApiException, self.api.post_server_volume,
server_id, dict(volumeAttachment=dict(volumeId=other_vol_id)))
self.assertEqual(403, ex.response.status_code)
expected = ('The maximum allowed number of disk devices (2) to attach '
'to a single instance has been exceeded.')
self.assertIn(expected, six.text_type(ex))
# Verify only one volume is attached (this is a generator)
attached_vols = list(self.cinder.volume_ids_for_instance(server_id))
self.assertIn(vol_id, attached_vols)
self.assertNotIn(other_vol_id, attached_vols)


@@ -153,13 +153,16 @@ def get_dev_count_for_disk_bus(disk_bus):
    Determine how many disks can be supported in
    a single VM for a particular disk bus.
-    Returns the number of disks supported.
+    Returns the number of disks supported or None
+    if there is no limit.
    """
    if disk_bus == "ide":
        return 4
    else:
-        return 26
+        if CONF.compute.max_disk_devices_to_attach < 0:
+            return None
+        return CONF.compute.max_disk_devices_to_attach
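# Illustrative outcomes of the rewritten function (bus names assumed):
#   get_dev_count_for_disk_bus('ide')    -> 4, regardless of the option
#   get_dev_count_for_disk_bus('virtio') -> None with the default -1
#                                           (unlimited), else the maximum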
def find_disk_dev_for_disk_bus(mapping, bus,
@@ -183,13 +186,16 @@
        assigned_devices = []
    max_dev = get_dev_count_for_disk_bus(bus)
-    devs = range(max_dev)
-    for idx in devs:
-        disk_dev = dev_prefix + chr(ord('a') + idx)
+    idx = 0
+    while True:
+        if max_dev is not None and idx >= max_dev:
+            raise exception.TooManyDiskDevices(maximum=max_dev)
+        disk_dev = block_device.generate_device_name(dev_prefix, idx)
        if not has_disk_dev(mapping, disk_dev):
            if disk_dev not in assigned_devices:
                return disk_dev
+        idx += 1
-    raise exception.TooManyDiskDevices(maximum=max_dev)


@@ -0,0 +1,57 @@
---
features:
- |
A new configuration option, ``[compute]/max_disk_devices_to_attach``, which
defaults to ``-1`` (unlimited), has been added and can be used to configure
the maximum number of disk devices allowed to attach to a single server,
per compute host. Note that the number of disks supported by a server
depends on the bus used. For example, the ``ide`` disk bus is limited to 4
attached devices.
Usually, disk bus is determined automatically from the device type or disk
device, and the virtualization type. However, disk bus
can also be specified via a block device mapping or an image property.
See the ``disk_bus`` field in
https://docs.openstack.org/nova/latest/user/block-device-mapping.html
for more information about specifying disk bus in a block device mapping,
and see
https://docs.openstack.org/glance/latest/admin/useful-image-properties.html
for more information about the ``hw_disk_bus`` image property.
The configured maximum is enforced during server create, rebuild, evacuate,
unshelve, live migrate, and attach volume. When the maximum is exceeded
during server create, rebuild, evacuate, unshelve, or live migrate, the
server will go into ``ERROR`` state and the server fault message will
indicate the failure reason. When the maximum is exceeded during a server
attach volume API operation, the request will fail with a
``403 HTTPForbidden`` error.
issues:
- |
Operators changing ``[compute]/max_disk_devices_to_attach`` on a
compute service that is hosting servers should be aware that it could
cause rebuilds to fail if the maximum is decreased below the number of
devices already attached to servers. For example, if server A has 26
devices attached and an operator changes
``[compute]/max_disk_devices_to_attach`` to 20, a request to rebuild
server A will fail and go into ERROR state because 26 devices are
already attached and exceed the new configured maximum of 20.
Operators setting ``[compute]/max_disk_devices_to_attach`` should also be
aware that during a cold migration, the configured maximum is only enforced
in-place and the destination is not checked before the move. This means if
an operator has set a maximum of 26 on compute host A and a maximum of 20
on compute host B, a cold migration of a server with 26 attached devices
from compute host A to compute host B will succeed. Then, once the server
is on compute host B, a subsequent request to rebuild the server will fail
and go into ERROR state because 26 devices are already attached and exceed
the configured maximum of 20 on compute host B.
The configured maximum is not enforced on shelved offloaded servers, as
they have no compute host.
upgrade:
- |
The new configuration option, ``[compute]/max_disk_devices_to_attach``
defaults to ``-1`` (unlimited). Users of the libvirt driver should be aware
that, upon upgrade to Stein, the default limit for non-``ide`` disk buses
changes from 26 to unlimited. The ``ide`` disk bus continues to be limited
to 4 attached devices per server.