Report tri-state shared_targets for NVMe volumes

NVMe-oF drivers that share the subsystem have the same race condition
issue that iSCSI volumes that share targets do.

The race condition is caused by AER messages that trigger automatic
rescans on the connector host side in both cases.

For iSCSI we added a feature on the Open-iSCSI project that allowed
disabling these scans, and added support for it in os-brick.

Since manual scans is a new feature that may be missing in a host's
iSCSI client, cinder has a flag in volumes to indicate when they use
shared targets.  Using that flag os-brick consumers can use the
"guard_connection" context manager to ensure race conditions don't
happen.

The race condition is prevented by os-brick using manual scans if they
are available in the iSCSI client, or a file lock if not.

The problem we face now is that we also want to use the lock for NVMe-oF
volumes that share a subsystem for multiple namespaces (there is no way
to disable automatic scans), but cinder doesn't (and shouldn't) expose
the actual storage protocol on the volume resource, so we need to
leverage the "shared_targets" parameter.

So with a single boolean value we need to encode 3 possible options:

- Don't use locks because targets/subystems are not shared
- Use locks if iSCSI client doesn't support automatic connections
- Always use locks (for example for NVMe-oF)

The only option we have is using the "None" value as well. That way we
can encode 3 different cases.

But we have an additional restriction, "True" is already taken for the
iSCSI case, because there will exist volumes in the database that
already have that value stored.

And making guard_connection always lock when shared_targets is set to
True will introduce the bottleneck from bug (#1800515).

That leaves us with the "None" value to force the use of locks.

So we end up with the following tristate for "shared_targets":

- True to use lock if iSCSI initiator doesn't support manual scans
- False means that os-brick should never lock.
- None means that os-brick should always lock.

The alternative to this encoding would be to have an online data
migration for volumes to change "True" to "None", and accept that there
could be race conditions during the rolling upgrade (because os-brick on
computes will interpret "None" as "False").

Since "in theory" Cinder was only returning True or False for the
"shared_target", we add a new microversion with number 3.69 that returns
null when the value is internally set to None.

The patch also updates the database with a migration, though it looks
like it's not necessary since the DB already allows null values, but it
seems more correct to make sure that's always the case.

This patch doesn't close but #1961102 because the os-brick patch is
needed for that.

Related-Bug: #1961102
Change-Id: I8cda6d9830f39e27ac700b1d8796fe0489fd7c0a
This commit is contained in:
Gorka Eguileor 2022-03-07 13:25:05 +01:00
parent 6ff76609c7
commit ef741228d8
22 changed files with 403 additions and 16 deletions

View File

@ -2683,6 +2683,18 @@ shared_targets:
required: true
type: boolean
min_version: 3.48
max_version: 3.68
shared_targets_tristate:
description: |
An indicator whether the host connecting the volume should lock for the
whole attach/detach process or not. ``true`` means only is iSCSI initiator
running on host doesn't support manual scans, ``false`` means never use
locks, and ``null`` means to always use locks. Look at os-brick's
``guard_connection`` context manager. Default=True.
in: body
required: true
type: boolean
min_version: 3.69
size:
description: |
The size of the volume, in gibibytes (GiB).

View File

@ -22,7 +22,7 @@
"min_version": "3.0",
"status": "CURRENT",
"updated": "2022-03-30T00:00:00Z",
"version": "3.68"
"version": "3.69"
}
]
}

View File

@ -22,7 +22,7 @@
"min_version": "3.0",
"status": "CURRENT",
"updated": "2022-03-30T00:00:00Z",
"version": "3.68"
"version": "3.69"
}
]
}

View File

@ -0,0 +1,41 @@
{
"volume": {
"attachments": [],
"availability_zone": "nova",
"bootable": "false",
"consistencygroup_id": null,
"created_at": "2018-11-28T06:21:12.715987",
"description": null,
"encrypted": false,
"id": "2b955850-f177-45f7-9f49-ecb2c256d161",
"links": [
{
"href": "http://127.0.0.1:33951/v3/89afd400-b646-4bbc-b12b-c0a4d63e5bd3/volumes/2b955850-f177-45f7-9f49-ecb2c256d161",
"rel": "self"
},
{
"href": "http://127.0.0.1:33951/89afd400-b646-4bbc-b12b-c0a4d63e5bd3/volumes/2b955850-f177-45f7-9f49-ecb2c256d161",
"rel": "bookmark"
}
],
"metadata": {},
"migration_status": null,
"multiattach": false,
"name": null,
"replication_status": null,
"size": 10,
"snapshot_id": null,
"source_volid": null,
"status": "creating",
"updated_at": null,
"user_id": "c853ca26-e8ea-4797-8a52-ee124a013d0e",
"volume_type": "__DEFAULT__",
"group_id": null,
"provider_id": null,
"service_uuid": null,
"shared_targets": null,
"cluster_name": null,
"volume_type_id": "5fed9d7c-401d-46e2-8e80-f30c70cb7e1d",
"consumes_quota": true
}
}

View File

@ -0,0 +1,45 @@
{
"volume": {
"attachments": [],
"availability_zone": "nova",
"bootable": "false",
"consistencygroup_id": null,
"created_at": "2018-11-29T06:50:07.770785",
"description": null,
"encrypted": false,
"id": "f7223234-1afc-4d19-bfa3-d19deb6235ef",
"links": [
{
"href": "http://127.0.0.1:45839/v3/89afd400-b646-4bbc-b12b-c0a4d63e5bd3/volumes/f7223234-1afc-4d19-bfa3-d19deb6235ef",
"rel": "self"
},
{
"href": "http://127.0.0.1:45839/89afd400-b646-4bbc-b12b-c0a4d63e5bd3/volumes/f7223234-1afc-4d19-bfa3-d19deb6235ef",
"rel": "bookmark"
}
],
"metadata": {},
"migration_status": null,
"multiattach": false,
"name": null,
"os-vol-host-attr:host": null,
"os-vol-mig-status-attr:migstat": null,
"os-vol-mig-status-attr:name_id": null,
"os-vol-tenant-attr:tenant_id": "89afd400-b646-4bbc-b12b-c0a4d63e5bd3",
"replication_status": null,
"size": 10,
"snapshot_id": null,
"source_volid": null,
"status": "creating",
"updated_at": null,
"user_id": "c853ca26-e8ea-4797-8a52-ee124a013d0e",
"volume_type": "__DEFAULT__",
"provider_id": null,
"group_id": null,
"service_uuid": null,
"shared_targets": null,
"cluster_name": null,
"volume_type_id": "5fed9d7c-401d-46e2-8e80-f30c70cb7e1d",
"consumes_quota": true
}
}

View File

@ -0,0 +1,43 @@
{
"volume": {
"attachments": [],
"availability_zone": "nova",
"bootable": "false",
"consistencygroup_id": null,
"created_at": "2018-11-29T06:59:23.679903",
"description": "This is yet, another volume.",
"encrypted": false,
"id": "8b2459d1-0059-4e14-a89f-dfa73a452af6",
"links": [
{
"href": "http://127.0.0.1:41467/v3/89afd400-b646-4bbc-b12b-c0a4d63e5bd3/volumes/8b2459d1-0059-4e14-a89f-dfa73a452af6",
"rel": "self"
},
{
"href": "http://127.0.0.1:41467/89afd400-b646-4bbc-b12b-c0a4d63e5bd3/volumes/8b2459d1-0059-4e14-a89f-dfa73a452af6",
"rel": "bookmark"
}
],
"metadata": {
"name": "metadata0"
},
"migration_status": null,
"multiattach": false,
"name": "vol-003",
"replication_status": null,
"size": 10,
"snapshot_id": null,
"source_volid": null,
"status": "creating",
"updated_at": null,
"user_id": "c853ca26-e8ea-4797-8a52-ee124a013d0e",
"volume_type": "__DEFAULT__",
"group_id": null,
"provider_id": null,
"service_uuid": null,
"shared_targets": null,
"cluster_name": null,
"volume_type_id": "5fed9d7c-401d-46e2-8e80-f30c70cb7e1d",
"consumes_quota": true
}
}

View File

@ -0,0 +1,47 @@
{
"volumes": [
{
"attachments": [],
"availability_zone": "nova",
"bootable": "false",
"consistencygroup_id": null,
"created_at": "2018-11-28T06:25:15.288987",
"description": null,
"encrypted": false,
"id": "cb49b381-9012-40cb-b8ee-80c19a4801b5",
"links": [
{
"href": "http://127.0.0.1:43543/v3/89afd400-b646-4bbc-b12b-c0a4d63e5bd3/volumes/cb49b381-9012-40cb-b8ee-80c19a4801b5",
"rel": "self"
},
{
"href": "http://127.0.0.1:43543/89afd400-b646-4bbc-b12b-c0a4d63e5bd3/volumes/cb49b381-9012-40cb-b8ee-80c19a4801b5",
"rel": "bookmark"
}
],
"metadata": {},
"migration_status": null,
"multiattach": false,
"name": null,
"os-vol-host-attr:host": null,
"os-vol-mig-status-attr:migstat": null,
"os-vol-mig-status-attr:name_id": null,
"os-vol-tenant-attr:tenant_id": "89afd400-b646-4bbc-b12b-c0a4d63e5bd3",
"replication_status": null,
"size": 10,
"snapshot_id": null,
"source_volid": null,
"status": "creating",
"updated_at": null,
"user_id": "c853ca26-e8ea-4797-8a52-ee124a013d0e",
"volume_type": "__DEFAULT__",
"volume_type_id": "5fed9d7c-401d-46e2-8e80-f30c70cb7e1d",
"provider_id": null,
"group_id": null,
"service_uuid": null,
"shared_targets": null,
"cluster_name": null,
"consumes_quota": true
}
]
}

View File

@ -140,6 +140,7 @@ Response Parameters
- provider_id: provider_id
- service_uuid: service_uuid
- shared_targets: shared_targets
- shared_targets: shared_targets_tristate
- cluster_name: cluster_name
- consumes_quota: consumes_quota
- count: count
@ -267,6 +268,7 @@ Response Parameters
- provider_id: provider_id
- service_uuid: service_uuid
- shared_targets: shared_targets
- shared_targets: shared_targets_tristate
- cluster_name: cluster_name
- consumes_quota: consumes_quota
@ -403,6 +405,7 @@ Response Parameters
- volume_type_id: volume_type_id_363
- service_uuid: service_uuid
- shared_targets: shared_targets
- shared_targets: shared_targets_tristate
- cluster_name: cluster_name
- provider_id: provider_id
- group_id: group_id_optional
@ -485,6 +488,7 @@ Response Parameters
- provider_id: provider_id
- service_uuid: service_uuid
- shared_targets: shared_targets
- shared_targets: shared_targets_tristate
- cluster_name: cluster_name
- consumes_quota: consumes_quota

View File

@ -175,6 +175,8 @@ PROJECT_ID_OPTIONAL_IN_URL = '3.67'
SUPPORT_REIMAGE_VOLUME = '3.68'
SHARED_TARGETS_TRISTATE = '3.69'
def get_mv_header(version):
"""Gets a formatted HTTP microversion header.

View File

@ -154,14 +154,15 @@ REST_API_VERSION_HISTORY = """
* 3.66 - Allow snapshotting in-use volumes without force flag.
* 3.67 - API URLs no longer need to include a project_id parameter.
* 3.68 - Support re-image volume
* 3.69 - Allow null value for shared_targets
"""
# The minimum and maximum versions of the API supported
# The default api version request is defined to be the
# minimum version of the API supported.
_MIN_API_VERSION = "3.0"
_MAX_API_VERSION = "3.68"
UPDATED = "2021-11-02T00:00:00Z"
_MAX_API_VERSION = "3.69"
UPDATED = "2022-04-20T00:00:00Z"
# NOTE(cyeoh): min and max versions declared as functions so we can

View File

@ -518,3 +518,13 @@ not be specified in the API path.
----------------------
Support ability to re-image a volume with a specific image. Specify the
``os-reimage`` action in the request body.
3.69
----
Volume field ``shared_targets`` is a tristate boolean value now, with the
following meanings:
- ``true``: Do os-brick locking when host iSCSI initiator doesn't support
manual scans.
- ``false``: Never do locking.
- ``null``: Forced locking regardless of the iSCSI initiator.

View File

@ -57,8 +57,15 @@ class ViewBuilder(views_v2.ViewBuilder):
if req_version.matches(
mv.VOLUME_SHARED_TARGETS_AND_SERVICE_FIELDS, None):
volume_ref['volume']['shared_targets'] = volume.get(
'shared_targets', None)
# For microversion 3.69 or higher it is acceptable to be null
# but for earlier versions we convert None to True
shared = volume.get('shared_targets', False)
if (not req_version.matches(mv.SHARED_TARGETS_TRISTATE, None)
and shared is None):
shared = True
volume_ref['volume']['shared_targets'] = shared
volume_ref['volume']['service_uuid'] = volume.get(
'service_uuid', None)

View File

@ -0,0 +1,50 @@
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.
"""Make shared_targets nullable
Revision ID: c92a3e68beed
Revises: 921e1a36b076
Create Date: 2022-03-23 21:30:18.585830
"""
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision = 'c92a3e68beed'
down_revision = '921e1a36b076'
branch_labels = None
depends_on = None
def upgrade():
connection = op.get_bind()
# Preserve existing type, be it boolean or tinyint treated as boolean
table = sa.Table('volumes', sa.MetaData(), autoload_with=connection)
existing_type = table.c.shared_targets.type
# SQLite doesn't support altering tables, so we use a workaround
if connection.engine.name == 'sqlite':
with op.batch_alter_table('volumes') as batch_op:
batch_op.alter_column('shared_targets',
existing_type=existing_type,
type_=sa.Boolean(),
nullable=True)
else:
op.alter_column('volumes', 'shared_targets',
existing_type=existing_type,
type_=sa.Boolean(),
nullable=True)

View File

@ -402,8 +402,11 @@ class Volume(BASE, CinderBase):
foreign_keys=service_uuid,
primaryjoin='Volume.service_uuid == Service.uuid',
)
# True => Do locking when iSCSI initiator doesn't support manual scan
# False => Never do locking
# None => Forced locking regardless of the iSCSI initiator
# make an FK of service?
shared_targets = sa.Column(sa.Boolean, default=True)
shared_targets = sa.Column(sa.Boolean, nullable=True, default=True)
class VolumeMetadata(BASE, CinderBase):

View File

@ -36,6 +36,8 @@ class AttachmentManagerTestCase(test.TestCase):
"""Setup test class."""
super(AttachmentManagerTestCase, self).setUp()
self.manager = importutils.import_object(CONF.volume_manager)
self.mock_object(self.manager, '_driver_shares_targets',
return_value=False)
self.configuration = mock.Mock(conf.Configuration)
self.context = context.get_admin_context()
self.context.user_id = fake.USER_ID

View File

@ -135,6 +135,7 @@ class MigrationsWalk(
# Migrations can take a long time, particularly on underpowered CI nodes.
# Give them some breathing room.
TIMEOUT_SCALING_FACTOR = 4
BOOL_TYPE = sqlalchemy.types.BOOLEAN
def setUp(self):
super().setUp()
@ -198,6 +199,20 @@ class MigrationsWalk(
).get_current_head()
self.assertEqual(head, migration.db_version())
def _pre_upgrade_c92a3e68beed(self, connection):
"""Test shared_targets is nullable."""
table = db_utils.get_table(connection, 'volumes')
self._previous_type = type(table.c.shared_targets.type)
def _check_c92a3e68beed(self, connection):
"""Test shared_targets is nullable."""
table = db_utils.get_table(connection, 'volumes')
self.assertIn('shared_targets', table.c)
# Type hasn't changed
self.assertIsInstance(table.c.shared_targets.type, self._previous_type)
# But it's nullable
self.assertTrue(table.c.shared_targets.nullable)
class TestMigrationsWalkSQLite(
MigrationsWalk,

View File

@ -46,6 +46,8 @@ class GroupManagerTestCase(test.TestCase):
def setUp(self):
super(GroupManagerTestCase, self).setUp()
self.volume = importutils.import_object(CONF.volume_manager)
self.mock_object(self.volume, '_driver_shares_targets',
return_value=False)
self.configuration = mock.Mock(conf.Configuration)
self.context = context.get_admin_context()
self.context.user_id = fake.USER_ID

View File

@ -49,6 +49,8 @@ class BaseVolumeTestCase(test.TestCase):
self.flags(volumes_dir=vol_tmpdir)
self.addCleanup(self._cleanup)
self.volume = importutils.import_object(CONF.volume_manager)
self.mock_object(self.volume, '_driver_shares_targets',
return_value=False)
self.volume.message_api = mock.Mock()
self.configuration = mock.Mock(conf.Configuration)
self.context = context.get_admin_context()

View File

@ -161,6 +161,8 @@ class BaseDriverTestCase(test.TestCase):
self.override_config('volumes_dir', vol_tmpdir,
conf.SHARED_CONF_GROUP)
self.volume = importutils.import_object(CONF.volume_manager)
self.mock_object(self.volume, '_driver_shares_targets',
return_value=False)
self.context = context.get_admin_context()
self.output = ""
self.configuration = conf.Configuration(None)

View File

@ -355,9 +355,11 @@ class VolumeTestCase(base.BaseVolumeTestCase):
@mock.patch('cinder.quota.QUOTAS.rollback', new=mock.Mock())
@mock.patch('cinder.quota.QUOTAS.commit')
@mock.patch('cinder.quota.QUOTAS.reserve', return_value=['RESERVATION'])
def test_create_delete_volume(self, use_quota, _mock_reserve, commit_mock,
def test_create_delete_volume(self, boolean, _mock_reserve, commit_mock,
mock_notify, mock_clean):
"""Test volume can be created and deleted."""
mock_shares = self.mock_object(self.volume, '_driver_shares_targets',
return_value=boolean)
volume = tests_utils.create_volume(
self.context,
availability_zone=CONF.storage_availability_zone,
@ -368,6 +370,8 @@ class VolumeTestCase(base.BaseVolumeTestCase):
self.volume.create_volume(self.context, volume)
mock_shares.assert_called_once_with()
self.assertIs(boolean, volume.shared_targets)
self.assert_notify_called(mock_notify,
(['INFO', 'volume.create.start'],
['INFO', 'volume.create.end']),
@ -376,7 +380,7 @@ class VolumeTestCase(base.BaseVolumeTestCase):
self.volume.stats['pools'])
# Confirm delete_volume handles use_quota field
volume.use_quota = use_quota
volume.use_quota = boolean
volume.save() # Need to save to DB because of the refresh call
commit_mock.reset_mock()
_mock_reserve.reset_mock()
@ -386,7 +390,7 @@ class VolumeTestCase(base.BaseVolumeTestCase):
volume_id)
self.assertEqual(vol['status'], 'deleted')
if use_quota:
if boolean:
expected_capacity = 0
self.assert_notify_called(mock_notify,
(['INFO', 'volume.delete.start'],

View File

@ -16,6 +16,9 @@
from unittest import mock
import ddt
from cinder.common import constants
from cinder import exception
from cinder.message import message_field
from cinder.tests.unit import fake_constants as fake
@ -24,6 +27,7 @@ from cinder.tests.unit import volume as base
from cinder.volume import manager as vol_manager
@ddt.ddt
class VolumeManagerTestCase(base.BaseVolumeTestCase):
@mock.patch('cinder.message.api.API.create')
@ -287,3 +291,54 @@ class VolumeManagerTestCase(base.BaseVolumeTestCase):
manager._parse_connection_options(ctxt, vol, conn_info)
self.assertIn('cacheable', conn_info['data'])
self.assertIs(conn_info['data']['cacheable'], False)
@ddt.data(*(constants.ISCSI_VARIANTS + constants.NVMEOF_VARIANTS))
def test__driver_shares_targets_reported_shared(self, protocol):
"""Shared targets must be reported for iSCSI and NVMe-oF."""
manager = vol_manager.VolumeManager()
fake_driver = mock.MagicMock()
fake_driver.capabilities = {'shared_targets': True,
'storage_protocol': protocol}
manager.driver = fake_driver
res = manager._driver_shares_targets()
expected = True if protocol in constants.ISCSI_VARIANTS else None
self.assertIs(expected, res)
@ddt.data(*(constants.ISCSI_VARIANTS + constants.NVMEOF_VARIANTS))
def test__driver_shares_targets_reported_nonshared(self, protocol):
"""Protocol is irrelevant for drivers that don't share targets."""
manager = vol_manager.VolumeManager()
fake_driver = mock.MagicMock()
fake_driver.capabilities = {'shared_targets': False,
'storage_protocol': protocol}
manager.driver = fake_driver
res = manager._driver_shares_targets()
self.assertFalse(res)
@ddt.data(*(constants.ISCSI_VARIANTS + constants.NVMEOF_VARIANTS))
def test__driver_shares_targets_not_reported(self, protocol):
"""When driver doesn't report, assume it's shared."""
manager = vol_manager.VolumeManager()
fake_driver = mock.MagicMock()
fake_driver.capabilities = {'storage_protocol': protocol}
manager.driver = fake_driver
res = manager._driver_shares_targets()
expected = True if protocol in constants.ISCSI_VARIANTS else None
self.assertIs(expected, res)
@ddt.data({'storage_protocol': 'NFS'},
{'shared_targets': True, 'storage_protocol': 'NFS'},
{'storage_protocol': 'ceph'},
{'shared_targets': True, 'storage_protocol': 'ceph'})
def test__driver_shares_targets_other_protocols(self, capabilities):
"""Sharing is irrelevant for other protocols."""
manager = vol_manager.VolumeManager()
fake_driver = mock.MagicMock()
fake_driver.capabilities = capabilities
manager.driver = fake_driver
res = manager._driver_shares_targets()
self.assertFalse(res)

View File

@ -844,12 +844,8 @@ class VolumeManager(manager.CleanableManager,
self._update_allocated_capacity(volume, decrement=True,
host=original_host)
# Shared targets is only relevant for iSCSI connections.
# We default to True to be on the safe side.
capabilities = self.driver.capabilities
volume.shared_targets = (
capabilities.get('storage_protocol') in constants.ISCSI_VARIANTS
and capabilities.get('shared_targets', True))
# Shared targets is only relevant for some connections.
volume.shared_targets = self._driver_shares_targets()
# TODO(geguileo): service_uuid won't be enough on Active/Active
# deployments. There can be 2 services handling volumes from the same
# backend.
@ -859,6 +855,50 @@ class VolumeManager(manager.CleanableManager,
LOG.info("Created volume successfully.", resource=volume)
return volume.id
def _driver_shares_targets(self):
"""Report if driver shares targets and needs locking on connecing side.
This is currently only relevant for iSCSI and for NVMe-oF.
Relevant for iSCSI, because older iSCSI initiators didn't support
disabling automatic scans, allowing race conditions between os-brick
and cinder.
In the NVMe-oF case it's even worse, because not only scans are always
automatic, but also in most cases devices/namespaces cannot be removed
from the subsystem, and they are automatically removed when unmapped
via an asynchronous message from the storage system.
By exposing the shared_targets characteristic in the volume nova (and
other consumers) can use os-brick's specific context manager to prevent
these race conditions.
Can return 3 possible values::
True => iSCSI protocol and shared targets
False => iSCSI protocol and not shared targets
None => Force shared targets locking in os-brick ignoring
ISCSI_SUPPORTS_MANUAL_SCAN. Used by NVMe-oF.
Drivers can return in their capabilities shared_targets set to ``None``
to force locking on the host side regarless of the protocol
"""
capabilities = self.driver.capabilities
# We default to True to be on the safe side.
shared = capabilities.get('shared_targets', True)
# If driver says no shared_targets, honor it
if shared is False:
return False
protocol = capabilities.get('storage_protocol')
if protocol in constants.NVMEOF_VARIANTS:
return None # True must be changed to None for NVMe-oF drivers
# Only iSCSI drivers would need to do locking for shared targets
return protocol in constants.ISCSI_VARIANTS
def _check_is_our_resource(self, resource) -> None:
if resource.host:
res_backend = volume_utils.extract_host(