Add configuration of maximum disk devices to attach
This adds a new config option to control the maximum number of disk
devices allowed to attach to a single instance, which can be set per
compute host. The configured maximum is enforced when device names are
generated during server create, rebuild, evacuate, unshelve, live
migrate, and attach volume. When the maximum is exceeded during server
create, rebuild, evacuate, unshelve, or live migrate, the server will go
into ERROR state and the server fault will contain the reason. When the
maximum is exceeded during an attach volume request, the request fails
fast in the API with a 403 error.

The configured maximum on the destination is not enforced before cold
migrate because the maximum is enforced in-place only (the destination
is not checked over RPC). The configured maximum is also not enforced on
shelved offloaded servers because they have no compute host, and the
option is implemented at the nova-compute level.

Part of blueprint conf-max-attach-volumes

Change-Id: Ia9cc1c250483c31f44cdbba4f6342ac8d7fbe92b
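To make the attach-volume failure mode concrete, here is a minimal sketch (not the literal nova API handler; the wrapper name `reserve_device_name` is invented for illustration) of how a `TooManyDiskDevices` error from the device-name helpers changed in this diff can be translated into the 403 fast-fail described above:

```python
import webob.exc

from nova import exception
from nova.compute import utils as compute_utils


def reserve_device_name(instance, bdms, device):
    # Illustrative only: fail fast with HTTP 403 when the configured
    # [compute]/max_disk_devices_to_attach maximum is exceeded, instead
    # of letting the attach request reach the compute host.
    try:
        return compute_utils.get_device_name_for_instance(
            instance, bdms, device)
    except exception.TooManyDiskDevices as exc:
        raise webob.exc.HTTPForbidden(explanation=exc.format_message())
```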
@@ -48,6 +48,12 @@ When we talk about block device mapping, we usually refer to one of two things
 virt driver code). We will refer to this format as 'Driver BDMs' from now
 on.
 
+.. note::
+
+   The maximum limit on the number of disk devices allowed to attach to
+   a single server is configurable with the option
+   :oslo.config:option:`compute.max_disk_devices_to_attach`.
+
 
 Data format and its history
 ----------------------------
@@ -39,6 +39,12 @@ To complete these tasks, use these parameters on the
 :cinder-doc:`Cinder documentation
 <cli/cli-manage-volumes.html#attach-a-volume-to-an-instance>`.
 
+.. note::
+
+   The maximum limit on the number of disk devices allowed to attach to
+   a single server is configurable with the option
+   :oslo.config:option:`compute.max_disk_devices_to_attach`.
+
 .. _Boot_instance_from_image_and_attach_non-bootable_volume:
 
 Boot instance from image and attach non-bootable volume
@@ -1566,6 +1566,17 @@ class ComputeManager(manager.Manager):
                 requested_networks, macs, security_groups, is_vpn)
 
     def _default_root_device_name(self, instance, image_meta, root_bdm):
+        """Gets a default root device name from the driver.
+
+        :param nova.objects.Instance instance:
+            The instance for which to get the root device name.
+        :param nova.objects.ImageMeta image_meta:
+            The metadata of the image of the instance.
+        :param nova.objects.BlockDeviceMapping root_bdm:
+            The description of the root device.
+        :returns: str -- The default root device name.
+        :raises: InternalError, TooManyDiskDevices
+        """
         try:
             return self.driver.default_root_device_name(instance,
                                                         image_meta,
@@ -1576,6 +1587,15 @@ class ComputeManager(manager.Manager):
     def _default_device_names_for_instance(self, instance,
                                            root_device_name,
                                            *block_device_lists):
+        """Default the missing device names in the BDM from the driver.
+
+        :param nova.objects.Instance instance:
+            The instance for which to get default device names.
+        :param str root_device_name: The root device name.
+        :param list block_device_lists: List of block device mappings.
+        :returns: None
+        :raises: InternalError, TooManyDiskDevices
+        """
         try:
             self.driver.default_device_names_for_instance(instance,
                                                           root_device_name,
@@ -1585,6 +1605,18 @@ class ComputeManager(manager.Manager):
                 instance, root_device_name, *block_device_lists)
 
     def _get_device_name_for_instance(self, instance, bdms, block_device_obj):
+        """Get the next device name from the driver, based on the BDM.
+
+        :param nova.objects.Instance instance:
+            The instance whose volume is requesting a device name.
+        :param nova.objects.BlockDeviceMappingList bdms:
+            The block device mappings for the instance.
+        :param nova.objects.BlockDeviceMapping block_device_obj:
+            A block device mapping containing info about the requested block
+            device.
+        :returns: The next device name.
+        :raises: InternalError, TooManyDiskDevices
+        """
         # NOTE(ndipanov): Copy obj to avoid changing the original
         block_device_obj = block_device_obj.obj_clone()
         try:
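The docstrings above advertise `:raises: InternalError, TooManyDiskDevices`, and the hunks cut off before the `except` blocks. A hedged sketch of how a `TooManyDiskDevices` escaping these helpers during build plausibly becomes the server fault (the `BuildAbortException` wrapping is an assumption inferred from the "Build of instance ... aborted: The maximum allowed number of disk devices ..." fault message asserted in the functional test further down; `default_device_names` is an invented stand-in name):

```python
from nova import exception


def default_device_names(manager, instance, root_device_name, block_devices):
    # Hedged sketch: aborting the build on TooManyDiskDevices is what puts
    # the server into ERROR state with the fault text seen in the test below.
    try:
        manager._default_device_names_for_instance(
            instance, root_device_name, block_devices)
    except exception.TooManyDiskDevices as exc:
        raise exception.BuildAbortException(
            instance_uuid=instance.uuid, reason=exc.format_message())
```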
@@ -19,7 +19,6 @@ import functools
 import inspect
 import itertools
 import math
-import string
 import traceback
 
 import netifaces
@@ -117,6 +116,9 @@ def get_device_name_for_instance(instance, bdms, device):
 
     This method is a wrapper for get_next_device_name that gets the list
     of used devices and the root device from a block device mapping.
+
+    :raises TooManyDiskDevices: if the maximum allowed devices to attach to a
+        single instance is exceeded.
     """
     mappings = block_device.instance_block_mapping(instance, bdms)
     return get_next_device_name(instance, mappings.values(),
@@ -125,7 +127,12 @@ def get_device_name_for_instance(instance, bdms, device):
 
 def default_device_names_for_instance(instance, root_device_name,
                                       *block_device_lists):
-    """Generate missing device names for an instance."""
+    """Generate missing device names for an instance.
+
+    :raises TooManyDiskDevices: if the maximum allowed devices to attach to a
+        single instance is exceeded.
+    """
 
     dev_list = [bdm.device_name
                 for bdm in itertools.chain(*block_device_lists)
@@ -143,6 +150,15 @@ def default_device_names_for_instance(instance, root_device_name,
         dev_list.append(dev)
 
 
+def check_max_disk_devices_to_attach(num_devices):
+    maximum = CONF.compute.max_disk_devices_to_attach
+    if maximum < 0:
+        return
+
+    if num_devices > maximum:
+        raise exception.TooManyDiskDevices(maximum=maximum)
+
+
 def get_next_device_name(instance, device_name_list,
                          root_device_name=None, device=None):
     """Validates (or generates) a device name for instance.
@@ -153,6 +169,9 @@ def get_next_device_name(instance, device_name_list,
     name is valid but applicable to a different backend (for example
     /dev/vdc is specified but the backend uses /dev/xvdc), the device
     name will be converted to the appropriate format.
+
+    :raises TooManyDiskDevices: if the maximum allowed devices to attach to a
+        single instance is exceeded.
     """
 
     req_prefix = None
@@ -196,6 +215,8 @@ def get_next_device_name(instance, device_name_list,
     if flavor.swap:
         used_letters.add('c')
 
+    check_max_disk_devices_to_attach(len(used_letters) + 1)
+
     if not req_letter:
         req_letter = _get_unused_letter(used_letters)
 
@@ -261,13 +282,13 @@ def convert_mb_to_ceil_gb(mb_value):
 
 
 def _get_unused_letter(used_letters):
-    doubles = [first + second for second in string.ascii_lowercase
-               for first in string.ascii_lowercase]
-    all_letters = set(list(string.ascii_lowercase) + doubles)
-    letters = list(all_letters - used_letters)
-    # NOTE(vish): prepend ` so all shorter sequences sort first
-    letters.sort(key=lambda x: x.rjust(2, '`'))
-    return letters[0]
+    # Return the first unused device letter
+    index = 0
+    while True:
+        letter = block_device.generate_device_letter(index)
+        if letter not in used_letters:
+            return letter
+        index += 1
 
 
 def get_value_from_system_metadata(instance, key, type, default):
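The rewrite of `_get_unused_letter` above leans on `nova.block_device.generate_device_letter`. Here is a self-contained sketch of the base-26, spreadsheet-style sequence that helper is assumed to produce ('a'..'z', then 'aa', 'ab', ...); the function below is a stand-in for illustration, not the nova implementation:

```python
import string


def generate_device_letter(index):
    # Base-26 device letters: 0 -> 'a', 25 -> 'z', 26 -> 'aa', 27 -> 'ab'...
    letters = ''
    while index >= 0:
        letters = string.ascii_lowercase[index % 26] + letters
        index = index // 26 - 1
    return letters


assert [generate_device_letter(i) for i in (0, 1, 25, 26, 27)] == \
    ['a', 'b', 'z', 'aa', 'ab']
```

Unlike the old `string`-based implementation, which was capped at 702 candidate names (26 single plus 26² double letters), the index walk is unbounded, which is why the new `check_max_disk_devices_to_attach` guard is called before a letter is picked.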
@@ -802,6 +802,50 @@ Number of concurrent disk-IO-intensive operations (glance image downloads,
 image format conversions, etc.) that we will do in parallel. If this is set
 too high then response time suffers.
 The default value of 0 means no limit.
 """),
+    cfg.IntOpt('max_disk_devices_to_attach',
+               default=-1,
+               min=-1,
+               help="""
+Maximum number of disk devices allowed to attach to a single server. Note
+that the number of disks supported by a server depends on the bus used. For
+example, the ``ide`` disk bus is limited to 4 attached devices. The configured
+maximum is enforced during server create, rebuild, evacuate, unshelve, live
+migrate, and attach volume.
+
+Usually, disk bus is determined automatically from the device type or disk
+device, and the virtualization type. However, disk bus
+can also be specified via a block device mapping or an image property.
+See the ``disk_bus`` field in :doc:`/user/block-device-mapping` for more
+information about specifying disk bus in a block device mapping, and
+see https://docs.openstack.org/glance/latest/admin/useful-image-properties.html
+for more information about the ``hw_disk_bus`` image property.
+
+Operators changing the ``[compute]/max_disk_devices_to_attach`` on a compute
+service that is hosting servers should be aware that it could cause rebuilds to
+fail if the maximum is decreased below the number of devices already
+attached to servers. For example, if server A has 26 devices attached and an
+operator changes ``[compute]/max_disk_devices_to_attach`` to 20, a request to
+rebuild server A will fail and go into ERROR state because 26 devices are
+already attached and exceed the new configured maximum of 20.
+
+Operators setting ``[compute]/max_disk_devices_to_attach`` should also be aware
+that during a cold migration, the configured maximum is only enforced in-place
+and the destination is not checked before the move. This means if an operator
+has set a maximum of 26 on compute host A and a maximum of 20 on compute host
+B, a cold migration of a server with 26 attached devices from compute host A to
+compute host B will succeed. Then, once the server is on compute host B, a
+subsequent request to rebuild the server will fail and go into ERROR state
+because 26 devices are already attached and exceed the configured maximum of 20
+on compute host B.
+
+The configured maximum is not enforced on shelved offloaded servers, as they
+have no compute host.
+
+Possible values:
+
+* -1 means unlimited
+* Any integer >= 0 represents the maximum allowed
+"""),
 ]
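A standalone sketch of the option as registered above, showing the `-1` (unlimited) default and an operator override; a private `oslo.config` `ConfigOpts` instance is used here instead of nova's global `CONF`:

```python
from oslo_config import cfg

conf = cfg.ConfigOpts()
conf.register_opts(
    [cfg.IntOpt('max_disk_devices_to_attach', default=-1, min=-1)],
    group='compute')
conf(args=[])  # no CLI or config-file input: defaults apply

assert conf.compute.max_disk_devices_to_attach == -1  # -1 means unlimited
conf.set_override('max_disk_devices_to_attach', 20, group='compute')
assert conf.compute.max_disk_devices_to_attach == 20
```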
nova/tests/functional/test_conf_max_attach_disk_devices.py (new file, 116 lines)
@@ -0,0 +1,116 @@
+# Licensed under the Apache License, Version 2.0 (the "License"); you may
+# not use this file except in compliance with the License. You may obtain
+# a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+# License for the specific language governing permissions and limitations
+# under the License.
+
+import time
+
+import six
+
+from nova.tests import fixtures as nova_fixtures
+from nova.tests.functional.api import client
+from nova.tests.functional import integrated_helpers
+from nova.tests.functional import test_servers
+
+
+class ConfigurableMaxDiskDevicesTest(integrated_helpers.InstanceHelperMixin,
+                                     test_servers.ServersTestBase):
+    def setUp(self):
+        super(ConfigurableMaxDiskDevicesTest, self).setUp()
+        self.cinder = self.useFixture(
+            nova_fixtures.CinderFixtureNewAttachFlow(self))
+
+    def _wait_for_volume_attach(self, server_id, volume_id):
+        for i in range(0, 100):
+            server = self.api.get_server(server_id)
+            attached_vols = [vol['id'] for vol in
+                             server['os-extended-volumes:volumes_attached']]
+            if volume_id in attached_vols:
+                return
+            time.sleep(.1)
+        self.fail('Timed out waiting for volume %s to be attached to '
+                  'server %s. Currently attached volumes: %s' %
+                  (volume_id, server_id, attached_vols))
+
+    def test_boot_from_volume(self):
+        # Set the maximum to 1 and boot from 1 volume. This should pass.
+        self.flags(max_disk_devices_to_attach=1, group='compute')
+        server = self._build_server(flavor_id='1')
+        server['imageRef'] = ''
+        volume_uuid = nova_fixtures.CinderFixtureNewAttachFlow.IMAGE_BACKED_VOL
+        bdm = {'boot_index': 0,
+               'uuid': volume_uuid,
+               'source_type': 'volume',
+               'destination_type': 'volume'}
+        server['block_device_mapping_v2'] = [bdm]
+        created_server = self.api.post_server({"server": server})
+        self._wait_for_state_change(self.api, created_server, 'ACTIVE')
+
+    def test_boot_from_volume_plus_attach_max_exceeded(self):
+        # Set the maximum to 1, boot from 1 volume, and attach one volume.
+        # This should fail because it's trying to attach 2 disk devices.
+        self.flags(max_disk_devices_to_attach=1, group='compute')
+        server = self._build_server(flavor_id='1')
+        server['imageRef'] = ''
+        vol_img_id = nova_fixtures.CinderFixtureNewAttachFlow.IMAGE_BACKED_VOL
+        boot_vol = {'boot_index': 0,
+                    'uuid': vol_img_id,
+                    'source_type': 'volume',
+                    'destination_type': 'volume'}
+        vol_id = '9a695496-44aa-4404-b2cc-ccab2501f87e'
+        other_vol = {'uuid': vol_id,
+                     'source_type': 'volume',
+                     'destination_type': 'volume'}
+        server['block_device_mapping_v2'] = [boot_vol, other_vol]
+        created_server = self.api.post_server({"server": server})
+        server_id = created_server['id']
+        # Server should go into ERROR state
+        self._wait_for_state_change(self.api, created_server, 'ERROR')
+        # Verify the instance fault
+        server = self.api.get_server(server_id)
+        # If anything fails during _prep_block_device, a 500 internal server
+        # error is raised.
+        self.assertEqual(500, server['fault']['code'])
+        expected = ('Build of instance %s aborted: The maximum allowed number '
+                    'of disk devices (1) to attach to a single instance has '
+                    'been exceeded.' % server_id)
+        self.assertIn(expected, server['fault']['message'])
+        # Verify no volumes are attached (this is a generator)
+        attached_vols = list(self.cinder.volume_ids_for_instance(server_id))
+        self.assertNotIn(vol_img_id, attached_vols)
+        self.assertNotIn(vol_id, attached_vols)
+
+    def test_attach(self):
+        # Set the maximum to 2. This will allow one disk device for the local
+        # disk of the server and also one volume to attach. A second volume
+        # attach request will fail.
+        self.flags(max_disk_devices_to_attach=2, group='compute')
+        server = self._build_server(flavor_id='1')
+        created_server = self.api.post_server({"server": server})
+        server_id = created_server['id']
+        self._wait_for_state_change(self.api, created_server, 'ACTIVE')
+        # Attach one volume, should pass.
+        vol_id = '9a695496-44aa-4404-b2cc-ccab2501f87e'
+        self.api.post_server_volume(
+            server_id, dict(volumeAttachment=dict(volumeId=vol_id)))
+        self._wait_for_volume_attach(server_id, vol_id)
+        # Attach a second volume, should fail fast in the API.
+        other_vol_id = 'f2063123-0f88-463c-ac9d-74127faebcbe'
+        ex = self.assertRaises(
+            client.OpenStackApiException, self.api.post_server_volume,
+            server_id, dict(volumeAttachment=dict(volumeId=other_vol_id)))
+        self.assertEqual(403, ex.response.status_code)
+        expected = ('The maximum allowed number of disk devices (2) to attach '
+                    'to a single instance has been exceeded.')
+        self.assertIn(expected, six.text_type(ex))
+        # Verify only one volume is attached (this is a generator)
+        attached_vols = list(self.cinder.volume_ids_for_instance(server_id))
+        self.assertIn(vol_id, attached_vols)
+        self.assertNotIn(other_vol_id, attached_vols)
@@ -153,13 +153,16 @@ def get_dev_count_for_disk_bus(disk_bus):
     Determine how many disks can be supported in
     a single VM for a particular disk bus.
 
-    Returns the number of disks supported.
+    Returns the number of disks supported or None
+    if there is no limit.
     """
 
     if disk_bus == "ide":
         return 4
     else:
-        return 26
+        if CONF.compute.max_disk_devices_to_attach < 0:
+            return None
+        return CONF.compute.max_disk_devices_to_attach
 
 
 def find_disk_dev_for_disk_bus(mapping, bus,
@@ -183,13 +186,16 @@ def find_disk_dev_for_disk_bus(mapping, bus,
     assigned_devices = []
 
     max_dev = get_dev_count_for_disk_bus(bus)
-    devs = range(max_dev)
 
-    for idx in devs:
-        disk_dev = dev_prefix + chr(ord('a') + idx)
+    idx = 0
+    while True:
+        if max_dev is not None and idx >= max_dev:
+            raise exception.TooManyDiskDevices(maximum=max_dev)
+        disk_dev = block_device.generate_device_name(dev_prefix, idx)
         if not has_disk_dev(mapping, disk_dev):
            if disk_dev not in assigned_devices:
                return disk_dev
+        idx += 1
 
-    raise exception.TooManyDiskDevices(maximum=max_dev)
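To spell out the new per-bus limit semantics, here is a hedged, dependency-free restatement of `get_dev_count_for_disk_bus` above, with the config lookup lifted into a parameter (the function name and parameter are illustrative):

```python
def dev_count_for_disk_bus(disk_bus, max_disk_devices_to_attach=-1):
    # 'ide' remains hard-capped at 4; every other bus now follows
    # [compute]/max_disk_devices_to_attach, where -1 maps to None (no limit).
    if disk_bus == 'ide':
        return 4
    if max_disk_devices_to_attach < 0:
        return None
    return max_disk_devices_to_attach


assert dev_count_for_disk_bus('ide') == 4
assert dev_count_for_disk_bus('virtio') is None        # default: unlimited
assert dev_count_for_disk_bus('virtio', 20) == 20
```

This is why the device-name search above switched from `for idx in range(max_dev)` to an unbounded `while True` loop: with no limit configured, `max_dev` is `None` and the loop simply keeps generating names until an unused one is found.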
@@ -0,0 +1,57 @@
+---
+features:
+  - |
+    A new configuration option, ``[compute]/max_disk_devices_to_attach``, which
+    defaults to ``-1`` (unlimited), has been added and can be used to configure
+    the maximum number of disk devices allowed to attach to a single server,
+    per compute host. Note that the number of disks supported by a server
+    depends on the bus used. For example, the ``ide`` disk bus is limited to 4
+    attached devices.
+
+    Usually, disk bus is determined automatically from the device type or disk
+    device, and the virtualization type. However, disk bus
+    can also be specified via a block device mapping or an image property.
+    See the ``disk_bus`` field in
+    https://docs.openstack.org/nova/latest/user/block-device-mapping.html
+    for more information about specifying disk bus in a block device mapping,
+    and see
+    https://docs.openstack.org/glance/latest/admin/useful-image-properties.html
+    for more information about the ``hw_disk_bus`` image property.
+
+    The configured maximum is enforced during server create, rebuild, evacuate,
+    unshelve, live migrate, and attach volume. When the maximum is exceeded
+    during server create, rebuild, evacuate, unshelve, or live migrate, the
+    server will go into ``ERROR`` state and the server fault message will
+    indicate the failure reason. When the maximum is exceeded during a server
+    attach volume API operation, the request will fail with a
+    ``403 HTTPForbidden`` error.
+issues:
+  - |
+    Operators changing the ``[compute]/max_disk_devices_to_attach`` on a
+    compute service that is hosting servers should be aware that it could
+    cause rebuilds to fail if the maximum is decreased below the number
+    of devices already attached to servers. For example, if server A has 26
+    devices attached and an operator changes
+    ``[compute]/max_disk_devices_to_attach`` to 20, a request to rebuild server
+    A will fail and go into ERROR state because 26 devices are already
+    attached and exceed the new configured maximum of 20.
+
+    Operators setting ``[compute]/max_disk_devices_to_attach`` should also be
+    aware that during a cold migration, the configured maximum is only enforced
+    in-place and the destination is not checked before the move. This means if
+    an operator has set a maximum of 26 on compute host A and a maximum of 20
+    on compute host B, a cold migration of a server with 26 attached devices
+    from compute host A to compute host B will succeed. Then, once the server
+    is on compute host B, a subsequent request to rebuild the server will fail
+    and go into ERROR state because 26 devices are already attached and exceed
+    the configured maximum of 20 on compute host B.
+
+    The configured maximum is not enforced on shelved offloaded servers, as
+    they have no compute host.
+upgrade:
+  - |
+    The new configuration option, ``[compute]/max_disk_devices_to_attach``
+    defaults to ``-1`` (unlimited). Users of the libvirt driver should be
+    advised that the default limit for non-ide disk buses has changed from 26
+    to unlimited, upon upgrade to Stein. The ``ide`` disk bus continues to be
+    limited to 4 attached devices per server.