restrict swap volume to cinder

This change tightens the validation around the attachment
update API to ensure that it can only be called if the source
volume has a non empty migration status.

That means it will only accept a request to swap the volume if
it is the result of a cinder volume migration.

This change is being made to prevent the instance domain
XML from getting out of sync with the nova BDM records
and cinder connection info. In the future support for direct
swap volume actions can be re-added if and only if the
nova libvirt driver is updated to correctly modify the domain.
The libvirt driver is the only driver that supported this API
outside of a cinder orchestrated swap volume.

By allowing the domain XML and BDMs to get out of sync
if an admin later live-migrates the VM the host path will not be
modified for the destination host. Normally this results in a live
migration failure which often prompts the admin to cold migrate instead.
however if the source device path exists on the destination the migration
will proceed. This can lead to 2 VMs using the same host block device.
At best this will cause a crash or data corruption.
At worst it will allow one guest to access the data of another.

Prior to this change there was an explicit warning in nova API ref
stating that humans should never call this API because it can lead
to this situation. Now it considered a hard error due to the
security implications.

Closes-Bug: #2112187
Depends-on: https://review.opendev.org/c/openstack/tempest/+/957753
Change-Id: I439338bd2f27ccd65a436d18c8cbc9c3127ee612
Signed-off-by: Sean Mooney <work@seanmooney.info>
(cherry picked from commit 93c0f9bc74)
This commit is contained in:
Sean Mooney
2025-08-15 14:33:34 +01:00
committed by sean mooney
parent 63433cca8e
commit 76fe0294ff
9 changed files with 214 additions and 89 deletions

View File

@@ -185,16 +185,16 @@ Update a volume attachment.
.. note:: This action only valid when the server is in ACTIVE, PAUSED and RESIZED state,
or a conflict(409) error will be returned.
.. warning:: When updating volumeId, this API is typically meant to
only be used as part of a larger orchestrated volume
migration operation initiated in the block storage
service via the ``os-retype`` or ``os-migrate_volume``
volume actions. Direct usage of this API to update
volumeId is not recommended and may result in needing to
hard reboot the server to update details within the guest
such as block storage serial IDs. Furthermore, updating
volumeId via this API is only implemented by `certain
compute drivers`_.
.. Important::
When updating volumeId, this API **MUST** only be used
as part of a larger orchestrated volume
migration operation initiated in the block storage
service via the ``os-retype`` or ``os-migrate_volume``
volume actions. Direct usage of this API is not supported
and will be blocked by nova with a 409 conflict.
Furthermore, updating ``volumeId`` via this API is only
implemented by `certain compute drivers`_.
.. _certain compute drivers: https://docs.openstack.org/nova/latest/user/support-matrix.html#operation_swap_volume

View File

@@ -434,6 +434,12 @@ class VolumeAttachmentController(wsgi.Controller):
except exception.VolumeNotFound as e:
raise exc.HTTPNotFound(explanation=e.format_message())
if ('migration_status' not in old_volume or
old_volume['migration_status'] in (None, '')):
message = (f"volume {old_volume_id} is not migrating this api "
"should only be called by Cinder")
raise exc.HTTPConflict(explanation=message)
new_volume_id = body['volumeAttachment']['volumeId']
try:
new_volume = self.volume_api.get(context, new_volume_id)

View File

@@ -262,11 +262,15 @@ class CinderFixture(fixtures.Fixture):
'attachment_id': attachment['id'],
'mountpoint': '/dev/vdb',
}
migration_status = (
None if volume_id not in (
self.SWAP_OLD_VOL, self.SWAP_ERR_OLD_VOL)
else "migrating")
volume.update({
'status': 'in-use',
'attach_status': 'attached',
'attachments': attachments,
'migration_status': migration_status
})
# Otherwise mark the volume as available and detached
else:

View File

@@ -1560,8 +1560,14 @@ class TestInstanceNotificationSample(
def test_volume_swap_server_with_error(self):
server = self._do_setup_server_and_error_flag()
self._volume_swap_server(server, self.cinder.SWAP_ERR_OLD_VOL,
self.cinder.SWAP_ERR_NEW_VOL)
# This is calling swap volume but we are emulating cinder
# volume migrate in the fixture to allow this.
# i.e. this is simulating the workflow where you move a volume
# between cinder backend using a temp volume that cinder internally
# cleans up at the end of the migration.
self._volume_swap_server(
server, self.cinder.SWAP_ERR_OLD_VOL,
self.cinder.SWAP_ERR_NEW_VOL)
self._wait_for_notification('compute.exception')
# Eight versioned notifications are generated.
@@ -1576,6 +1582,8 @@ class TestInstanceNotificationSample(
self.assertLessEqual(7, len(self.notifier.versioned_notifications),
'Unexpected number of versioned notifications. '
'Got: %s' % self.notifier.versioned_notifications)
# the block device mapping is using SWAP_ERR_OLD_VOL because this is
# the cinder volume migrate workflow.
block_devices = [{
"nova_object.data": {
"boot_index": None,

View File

@@ -16,6 +16,7 @@ from oslo_serialization import jsonutils
from nova import context
from nova import objects
from nova.tests.functional.api import client
from nova.tests.functional import integrated_helpers
from nova.tests.functional.libvirt import base
from nova.virt import block_device as driver_block_device
@@ -46,6 +47,8 @@ class TestLibvirtROMultiattachMigrate(
self.start_compute()
def test_ro_multiattach_swap_volume(self):
# NOTE(sean-k-mooney): This test is emulating calling swap volume
# directly instead of using cinder volume migrate or retype.
server_id = self._create_server(networks='none')['id']
self.api.post_server_volume(
server_id,
@@ -58,47 +61,13 @@ class TestLibvirtROMultiattachMigrate(
self._wait_for_volume_attach(
server_id, self.cinder.MULTIATTACH_RO_SWAP_OLD_VOL)
# Swap between the old and new volumes
self.api.put_server_volume(
server_id,
self.cinder.MULTIATTACH_RO_SWAP_OLD_VOL,
# NOTE(sean-k-mooney): because of bug 212187 directly using
# swap volume is not supported and should fail.
ex = self.assertRaises(
client.OpenStackApiException, self.api.put_server_volume,
server_id, self.cinder.MULTIATTACH_RO_SWAP_OLD_VOL,
self.cinder.MULTIATTACH_RO_SWAP_NEW_VOL)
# Wait until the old volume is detached and new volume is attached
self._wait_for_volume_detach(
server_id, self.cinder.MULTIATTACH_RO_SWAP_OLD_VOL)
self._wait_for_volume_attach(
server_id, self.cinder.MULTIATTACH_RO_SWAP_NEW_VOL)
bdm = objects.BlockDeviceMapping.get_by_volume_and_instance(
context.get_admin_context(),
self.cinder.MULTIATTACH_RO_SWAP_NEW_VOL,
server_id)
connection_info = jsonutils.loads(bdm.connection_info)
# Assert that only the new volume UUID is referenced within the stashed
# connection_info and returned by driver_block_device.get_volume_id
self.assertIn('volume_id', connection_info.get('data'))
self.assertEqual(
self.cinder.MULTIATTACH_RO_SWAP_NEW_VOL,
connection_info['data']['volume_id'])
self.assertIn('volume_id', connection_info)
self.assertEqual(
self.cinder.MULTIATTACH_RO_SWAP_NEW_VOL,
connection_info['volume_id'])
self.assertIn('serial', connection_info)
self.assertEqual(
self.cinder.MULTIATTACH_RO_SWAP_NEW_VOL,
connection_info.get('serial'))
self.assertEqual(
self.cinder.MULTIATTACH_RO_SWAP_NEW_VOL,
driver_block_device.get_volume_id(connection_info))
# Assert that the new volume can be detached from the instance
self.api.delete_server_volume(
server_id, self.cinder.MULTIATTACH_RO_SWAP_NEW_VOL)
self._wait_for_volume_detach(
server_id, self.cinder.MULTIATTACH_RO_SWAP_NEW_VOL)
self.assertIn("this api should only be called by Cinder", str(ex))
def test_ro_multiattach_migrate_volume(self):
server_id = self._create_server(networks='none')['id']

View File

@@ -0,0 +1,67 @@
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.
from nova.tests.functional.api import client
from nova.tests.functional import integrated_helpers
from nova.tests.functional.libvirt import base
class TestDirectSwapVolume(
base.ServersTestBase,
integrated_helpers.InstanceHelperMixin
):
"""Regression test for bug 2112187
During a Cinder orchestrated volume migration nova leaves the
stashed connection_info of the attachment pointing at the original
volume UUID used during the migration because cinder will atomically
revert the UUID of the volume back to the original value.
When swap volume is used directly the uuid should be updated
in the libvirt xml but nova does not support that today.
That results in the uuid in the xml and the uuid in the BDMs
being out of sync.
As a result it is unsafe to allow direct swap volume.
"""
microversion = 'latest'
ADMIN_API = True
def setUp(self):
super().setUp()
self.start_compute()
def test_direct_swap_volume(self):
# NOTE(sean-k-mooney): This test is emulating calling swap volume
# directly instead of using cinder volume migrate or retype.
server_id = self._create_server(networks='none')['id']
# We do not need to use a multiattach volume but any volume
# that does not have a migration state set will work.
self.api.post_server_volume(
server_id,
{
'volumeAttachment': {
'volumeId': self.cinder.MULTIATTACH_RO_SWAP_OLD_VOL
}
}
)
self._wait_for_volume_attach(
server_id, self.cinder.MULTIATTACH_RO_SWAP_OLD_VOL)
# NOTE(sean-k-mooney): because of bug 212187 directly using
# swap volume is not supported and should fail.
ex = self.assertRaises(
client.OpenStackApiException, self.api.put_server_volume,
server_id, self.cinder.MULTIATTACH_RO_SWAP_OLD_VOL,
self.cinder.MULTIATTACH_RO_SWAP_NEW_VOL)
self.assertIn("this api should only be called by Cinder", str(ex))

View File

@@ -15,14 +15,16 @@
# under the License.
import datetime
from unittest import mock
import urllib
from unittest import mock
import fixtures
import webob
from oslo_serialization import jsonutils
from oslo_utils import encodeutils
from oslo_utils.fixture import uuidsentinel as uuids
import webob
from webob import exc
from nova.api.openstack import api_version_request
@@ -65,17 +67,28 @@ def fake_get_instance(self, context, instance_id, expected_attrs=None,
return fake_instance.fake_instance_obj(
context, id=1, uuid=instance_id, project_id=context.project_id)
# TODO(sean-k-mooney): this is duplicated in the policy tests
# we should consider consolidating this.
def fake_get_volume(self, context, id):
migration_status = None
if id == FAKE_UUID_A:
status = 'in-use'
attach_status = 'attached'
elif id == FAKE_UUID_B:
status = 'available'
attach_status = 'detached'
elif id == uuids.source_swap_vol:
status = 'in-use'
attach_status = 'attached'
migration_status = 'migrating'
else:
raise exception.VolumeNotFound(volume_id=id)
return {'id': id, 'status': status, 'attach_status': attach_status}
return {
'id': id, 'status': status, 'attach_status': attach_status,
'migration_status': migration_status
}
def fake_create_snapshot(self, context, volume, name, description):
@@ -99,7 +112,7 @@ def fake_compute_volume_snapshot_delete(self, context, volume_id, snapshot_id,
@classmethod
def fake_bdm_get_by_volume_and_instance(cls, ctxt, volume_id, instance_uuid):
if volume_id != FAKE_UUID_A:
if volume_id not in (FAKE_UUID_A, uuids.source_swap_vol):
raise exception.VolumeBDMNotFound(volume_id=volume_id)
db_bdm = fake_block_device.FakeDbBlockDeviceDict({
'id': 1,
@@ -110,7 +123,7 @@ def fake_bdm_get_by_volume_and_instance(cls, ctxt, volume_id, instance_uuid):
'source_type': 'volume',
'destination_type': 'volume',
'snapshot_id': None,
'volume_id': FAKE_UUID_A,
'volume_id': volume_id,
'volume_size': 1,
'attachment_id': uuids.attachment_id
})
@@ -572,6 +585,7 @@ class VolumeAttachTestsV21(test.NoDBTestCase):
test.MatchType(objects.Instance),
{'attach_status': 'attached',
'status': 'in-use',
'migration_status': None,
'id': FAKE_UUID_A})
@mock.patch.object(compute_api.API, 'detach_volume')
@@ -585,7 +599,8 @@ class VolumeAttachTestsV21(test.NoDBTestCase):
test.MatchType(objects.Instance),
{'attach_status': 'attached',
'status': 'in-use',
'id': FAKE_UUID_A})
'id': FAKE_UUID_A,
'migration_status': None})
def test_attach_volume(self):
self.stub_out('nova.compute.api.API.attach_volume',
@@ -739,7 +754,7 @@ class VolumeAttachTestsV21(test.NoDBTestCase):
self.assertRaises(exc.HTTPBadRequest, self.attachments.create,
req, FAKE_UUID, body=body)
def _test_swap(self, attachments, uuid=FAKE_UUID_A, body=None):
def _test_swap(self, attachments, uuid=uuids.source_swap_vol, body=None):
body = body or {'volumeAttachment': {'volumeId': FAKE_UUID_B}}
return attachments.update(self.req, uuids.instance, uuid, body=body)
@@ -754,10 +769,13 @@ class VolumeAttachTestsV21(test.NoDBTestCase):
self.req.environ['nova.context'], test.MatchType(objects.Instance),
{'attach_status': 'attached',
'status': 'in-use',
'id': FAKE_UUID_A},
'id': uuids.source_swap_vol,
'migration_status': 'migrating'
},
{'attach_status': 'detached',
'status': 'available',
'id': FAKE_UUID_B})
'id': FAKE_UUID_B,
'migration_status': None})
@mock.patch.object(compute_api.API, 'swap_volume')
def test_swap_volume(self, mock_swap_volume):
@@ -774,10 +792,12 @@ class VolumeAttachTestsV21(test.NoDBTestCase):
self.req.environ['nova.context'], test.MatchType(objects.Instance),
{'attach_status': 'attached',
'status': 'in-use',
'id': FAKE_UUID_A},
'id': uuids.source_swap_vol,
'migration_status': 'migrating'},
{'attach_status': 'detached',
'status': 'available',
'id': FAKE_UUID_B})
'id': FAKE_UUID_B,
'migration_status': None})
def test_swap_volume_with_nonexistent_uri(self):
self.assertRaises(exc.HTTPNotFound, self._test_swap,
@@ -786,13 +806,14 @@ class VolumeAttachTestsV21(test.NoDBTestCase):
@mock.patch.object(cinder.API, 'get')
def test_swap_volume_with_nonexistent_dest_in_body(self, mock_get):
mock_get.side_effect = [
None, exception.VolumeNotFound(volume_id=FAKE_UUID_C)]
fake_get_volume(None, None, uuids.source_swap_vol),
exception.VolumeNotFound(volume_id=FAKE_UUID_C)]
body = {'volumeAttachment': {'volumeId': FAKE_UUID_C}}
with mock.patch.object(self.attachments, '_update_volume_regular'):
self.assertRaises(exc.HTTPBadRequest, self._test_swap,
self.attachments, body=body)
mock_get.assert_has_calls([
mock.call(self.req.environ['nova.context'], FAKE_UUID_A),
mock.call(self.req.environ['nova.context'], uuids.source_swap_vol),
mock.call(self.req.environ['nova.context'], FAKE_UUID_C)])
def test_swap_volume_without_volumeId(self):
@@ -823,8 +844,9 @@ class VolumeAttachTestsV21(test.NoDBTestCase):
self.attachments)
if mock_bdm.called:
# New path includes regular PUT procedure
mock_bdm.assert_called_once_with(self.req.environ['nova.context'],
FAKE_UUID_A, uuids.instance)
mock_bdm.assert_called_once_with(
self.req.environ['nova.context'],
uuids.source_swap_vol, uuids.instance)
mock_swap_volume.assert_not_called()
else:
# Old path is pure swap-volume
@@ -834,10 +856,12 @@ class VolumeAttachTestsV21(test.NoDBTestCase):
test.MatchType(objects.Instance),
{'attach_status': 'attached',
'status': 'in-use',
'id': FAKE_UUID_A},
'migration_status': 'migrating',
'id': uuids.source_swap_vol},
{'attach_status': 'detached',
'status': 'available',
'id': FAKE_UUID_B})
'id': FAKE_UUID_B,
'migration_status': None})
def _test_list_with_invalid_filter(self, url):
req = self._build_request(url)
@@ -1191,7 +1215,7 @@ class UpdateVolumeAttachTests(VolumeAttachTestsV279):
self.context,
id=1,
instance_uuid=FAKE_UUID,
volume_id=FAKE_UUID_A,
volume_id=uuids.source_swap_vol,
source_type='volume',
destination_type='volume',
delete_on_termination=False,
@@ -1306,7 +1330,7 @@ class UpdateVolumeAttachTests(VolumeAttachTestsV279):
self.context,
id=1,
instance_uuid=FAKE_UUID,
volume_id=FAKE_UUID_A,
volume_id=uuids.source_swap_vol,
source_type='volume',
destination_type='volume',
delete_on_termination=False,
@@ -1322,7 +1346,7 @@ class UpdateVolumeAttachTests(VolumeAttachTestsV279):
'delete_on_termination': True,
}}
self.attachments.update(self.req, FAKE_UUID,
FAKE_UUID_A, body=body)
uuids.source_swap_vol, body=body)
mock_bdm_save.assert_called_once()
self.assertTrue(vol_bdm['delete_on_termination'])
# Swap volume is tested elsewhere, just make sure that we did
@@ -1339,7 +1363,7 @@ class UpdateVolumeAttachTests(VolumeAttachTestsV279):
self.context,
id=1,
instance_uuid=FAKE_UUID,
volume_id=FAKE_UUID_A,
volume_id=uuids.source_swap_vol,
source_type='volume',
destination_type='volume',
delete_on_termination=False,
@@ -1354,7 +1378,7 @@ class UpdateVolumeAttachTests(VolumeAttachTestsV279):
}}
req = self._get_req(body, microversion='2.84')
self.attachments.update(req, FAKE_UUID,
FAKE_UUID_A, body=body)
uuids.source_swap_vol, body=body)
mock_swap.assert_called_once()
mock_bdm_save.assert_not_called()
@@ -1640,6 +1664,7 @@ class SwapVolumeMultiattachTestCase(test.NoDBTestCase):
'id': volume_id,
'size': 1,
'multiattach': True,
'migration_status': 'migrating',
'attachments': {
uuids.server1: {
'attachment_id': uuids.attachment_id1,
@@ -1689,12 +1714,12 @@ class SwapVolumeMultiattachTestCase(test.NoDBTestCase):
ex = self.assertRaises(
webob.exc.HTTPBadRequest, controller.update, req,
uuids.server1, uuids.old_vol_id, body=body)
self.assertIn('Swapping multi-attach volumes with more than one ',
str(ex))
mock_attachment_get.assert_has_calls([
mock.call(ctxt, uuids.attachment_id1),
mock.call(ctxt, uuids.attachment_id2)], any_order=True)
mock_roll_detaching.assert_called_once_with(ctxt, uuids.old_vol_id)
self.assertIn(
'Swapping multi-attach volumes with more than one ', str(ex))
mock_attachment_get.assert_has_calls([
mock.call(ctxt, uuids.attachment_id1),
mock.call(ctxt, uuids.attachment_id2)], any_order=True)
mock_roll_detaching.assert_called_once_with(ctxt, uuids.old_vol_id)
class CommonBadRequestTestCase(object):

View File

@@ -38,7 +38,7 @@ FAKE_UUID_B = 'bbbbbbbb-bbbb-bbbb-bbbb-bbbbbbbbbbbb'
def fake_bdm_get_by_volume_and_instance(cls, ctxt, volume_id, instance_uuid):
if volume_id != FAKE_UUID_A:
if volume_id not in (FAKE_UUID_A, uuids.source_swap_vol):
raise exception.VolumeBDMNotFound(volume_id=volume_id)
db_bdm = fake_block_device.FakeDbBlockDeviceDict(
{'id': 1,
@@ -55,15 +55,23 @@ def fake_bdm_get_by_volume_and_instance(cls, ctxt, volume_id, instance_uuid):
def fake_get_volume(self, context, id):
migration_status = None
if id == FAKE_UUID_A:
status = 'in-use'
attach_status = 'attached'
elif id == FAKE_UUID_B:
status = 'available'
attach_status = 'detached'
elif id == uuids.source_swap_vol:
status = 'in-use'
attach_status = 'attached'
migration_status = 'migrating'
else:
raise exception.VolumeNotFound(volume_id=id)
return {'id': id, 'status': status, 'attach_status': attach_status}
return {
'id': id, 'status': status, 'attach_status': attach_status,
'migration_status': migration_status
}
class VolumeAttachPolicyTest(base.BasePolicyTest):
@@ -163,9 +171,10 @@ class VolumeAttachPolicyTest(base.BasePolicyTest):
def test_swap_volume_attach_policy(self, mock_swap_volume):
rule_name = self.policy_root % "swap"
body = {'volumeAttachment': {'volumeId': FAKE_UUID_B}}
self.common_policy_auth(self.project_admin_authorized_contexts,
rule_name, self.controller.update,
self.req, FAKE_UUID, FAKE_UUID_A, body=body)
self.common_policy_auth(
self.project_admin_authorized_contexts,
rule_name, self.controller.update,
self.req, FAKE_UUID, uuids.source_swap_vol, body=body)
@mock.patch.object(block_device_obj.BlockDeviceMapping, 'save')
@mock.patch('nova.compute.api.API.swap_volume')
@@ -198,9 +207,10 @@ class VolumeAttachPolicyTest(base.BasePolicyTest):
req = fakes.HTTPRequest.blank('', version='2.85')
body = {'volumeAttachment': {'volumeId': FAKE_UUID_B,
'delete_on_termination': True}}
self.common_policy_auth(self.project_admin_authorized_contexts,
rule_name, self.controller.update,
req, FAKE_UUID, FAKE_UUID_A, body=body)
self.common_policy_auth(
self.project_admin_authorized_contexts,
rule_name, self.controller.update,
req, FAKE_UUID, uuids.source_swap_vol, body=body)
mock_swap_volume.assert_called()
mock_bdm_save.assert_called()

View File

@@ -0,0 +1,36 @@
---
security:
- |
Nova has documented that the ``update volume attachment`` API
PUT /servers/{server_id}/os-volume_attachments/{volume_id}
should not be called directly for a very long time.
"When updating volumeId, this API is typically meant to only
be used as part of a larger orchestrated volume migration
operation initiated in the block storage service via
the os-retype or os-migrate_volume volume actions.
Direct usage of this API to update volumeId is not recommended
and may result in needing to hard reboot the server
to update details within the guest such as block storage serial IDs.
Furthermore, updating volumeId via this API is only implemented
by certain compute drivers."
As an admin only api, direct usage has always been limited to admins
or service like ``watcher``.
This longstanding recommendation is now enforced as a security
hardening measure and restricted to only cinder.
The prior warning alluded to the fact that directly using this
api can result in a guest with a de-synced definition of the volume
serial. Before this change it was possible for an admin to unknowingly
put a VM in an inconsistent state such that a future live migration may
fail or succeed and break tenant isolation. This could not happen
when the api was called by cinder so Nova has restricted that api
exclusively to that use-case.
see: https://bugs.launchpad.net/nova/+bug/2112187 for details.
fixes:
- |
``Nova`` now strictly enforces that only ``cinder`` can call the
``update volume attachment`` aka ``swap volume`` api. This is part
of addressing a security hardening gap identified as part of bug:
https://bugs.launchpad.net/nova/+bug/2112187