Add Redfish RAID interface to idrac HW type

Adds MVP support for the RAID interface to idrac-redfish. It is based
on the generic Redfish implementation, but requires the OEM extension
to check when the `Immediate` apply time becomes available, which
happens shortly after IPA starts executing steps.

Does not support foreign disks or converting from non-RAID mode.

Story: 2008602
Task: 41778
Depends-On: https://review.opendev.org/c/x/sushy-oem-idrac/+/776224
Change-Id: Iefb7f882c97e33a176962e4e907163d9e4809445
Aija Jauntēva 2021-02-05 04:59:01 -05:00
parent a06e403b11
commit 952695be33
11 changed files with 434 additions and 14 deletions


@ -55,17 +55,13 @@ Enabling
The iDRAC driver supports WSMAN for the bios, inspect, management, power,
raid, and vendor interfaces. In addition, it supports Redfish for
the bios, inspect, management, and power interfaces. The iDRAC driver
the bios, inspect, management, power, and raid interfaces. The iDRAC driver
allows you to mix and match WSMAN and Redfish interfaces.
The ``idrac-wsman`` implementation must be enabled to use WSMAN for
an interface. The ``idrac-redfish`` implementation must be enabled
to use Redfish for an interface.
.. NOTE::
Redfish is supported for only the bios, inspect, management, and power
interfaces at the present time.
To enable the ``idrac`` hardware type with the minimum interfaces,
all using WSMAN, add the following to your ``/etc/ironic/ironic.conf``:
@ -88,7 +84,7 @@ following configuration:
enabled_inspect_interfaces=idrac-redfish
enabled_management_interfaces=idrac-redfish
enabled_power_interfaces=idrac-redfish
enabled_raid_interfaces=idrac-wsman
enabled_raid_interfaces=idrac-redfish
enabled_vendor_interfaces=idrac-redfish
Below is the list of supported interface implementations in priority
@ -106,7 +102,7 @@ Interface Supported Implementations
``management`` ``idrac-wsman``, ``idrac``, ``idrac-redfish``
``network`` ``flat``, ``neutron``, ``noop``
``power`` ``idrac-wsman``, ``idrac``, ``idrac-redfish``
``raid`` ``idrac-wsman``, ``idrac``, ``no-raid``
``raid`` ``idrac-wsman``, ``idrac``, ``idrac-redfish``, ``no-raid``
``rescue`` ``no-rescue``, ``agent``
``storage`` ``noop``, ``cinder``, ``external``
``vendor`` ``idrac-wsman``, ``idrac``, ``idrac-redfish``,
@ -180,7 +176,7 @@ hardware type using Redfish for all interfaces:
--inspect-interface idrac-redfish \
--management-interface idrac-redfish \
--power-interface idrac-redfish \
--raid-interface no-raid \
--raid-interface idrac-redfish \
--vendor-interface idrac-redfish
The following command enrolls a bare metal node with the ``idrac``
@ -283,9 +279,12 @@ RAID Interface
See :doc:`/admin/raid` for more information on Ironic RAID support.
The following properties are supported by the iDRAC WSMAN raid interface
implementation, ``idrac-wsman``:
The following properties are supported by the iDRAC WSMAN and Redfish RAID
interface implementations:
.. NOTE::
When using ``idrac-redfish`` for the RAID interface, iDRAC firmware greater
than 4.40.00.00 is required.
Mandatory properties
--------------------
@ -310,6 +309,11 @@ Optional properties
Backing physical disk hints
---------------------------
.. NOTE::
Backing physical disk hints are not widely tested with ``idrac-redfish`` yet
and they might not work as desired. This will be addressed in future
releases.
See :doc:`/admin/raid` for more information on backing disk hints.
These are machine-independent information. The hints are specified for each
@ -408,6 +412,20 @@ be used to fetch the information directly from the Dell bare metal:
physical_disks = client.list_physical_disks()
print(physical_disks)
Or using ``sushy`` with Redfish:
.. code-block:: python
import sushy
# NOTE: verify=False disables TLS certificate verification; use for testing only.
client = sushy.Sushy('https://192.168.1.1', username='root', password='calvin', verify=False)
for s in client.get_system_collection().get_members():
    print("System: %(id)s" % {'id': s.identity})
    for c in s.storage.get_members():
        print("\tController: %(id)s" % {'id': c.identity})
        for d in c.drives:
            print("\t\tDrive: %(id)s" % {'id': d.identity})
Vendor Interface
================


@ -20,4 +20,4 @@ ansible>=2.7
python-ibmcclient>=0.2.2,<0.3.0
# Dell EMC iDRAC sushy OEM extension
sushy-oem-idrac<2.0.0
sushy-oem-idrac>=2.0.0,<3.0.0


@ -74,7 +74,8 @@ class IDRACHardware(generic.GenericHardware):
@property
def supported_raid_interfaces(self):
"""List of supported raid interfaces."""
return [raid.DracWSManRAID, raid.DracRAID] + super(
return [raid.DracWSManRAID, raid.DracRAID,
raid.DracRedfishRAID] + super(
IDRACHardware, self).supported_raid_interfaces
@property


@ -23,6 +23,7 @@ from ironic_lib import metrics_utils
from oslo_log import log as logging
from oslo_utils import importutils
from oslo_utils import units
import tenacity
from ironic.common import exception
from ironic.common.i18n import _
@ -34,9 +35,12 @@ from ironic.drivers import base
from ironic.drivers.modules import deploy_utils
from ironic.drivers.modules.drac import common as drac_common
from ironic.drivers.modules.drac import job as drac_job
from ironic.drivers.modules.redfish import raid as redfish_raid
from ironic.drivers.modules.redfish import utils as redfish_utils
drac_exceptions = importutils.try_import('dracclient.exceptions')
drac_constants = importutils.try_import('dracclient.constants')
sushy = importutils.try_import('sushy')
LOG = logging.getLogger(__name__)
@ -1160,6 +1164,166 @@ def _get_disk_free_size_mb(disk, pending_delete):
return disk.size_mb if pending_delete else disk.free_size_mb
def _wait_till_realtime_ready(task):
"""Waits till real time operations are ready to be executed.
Useful for RAID operations where almost all controllers support
real time configuration, but controllers might not be ready for
it by the time IPA starts executing steps. It can take a minute or a
bit more to be ready for real time configuration.
:param task: TaskManager object containing the node.
:raises RedfishError: If can't find OEM extension or it fails to
execute
"""
try:
_retry_till_realtime_ready(task)
except tenacity.RetryError:
LOG.debug('Retries exceeded while waiting for real-time ready '
'for node %(node)s. Will proceed without real-time '
'ready state', {'node': task.node.uuid})
@tenacity.retry(
stop=(tenacity.stop_after_attempt(30)),
wait=tenacity.wait_fixed(10),
retry=tenacity.retry_if_result(lambda result: not result))
def _retry_till_realtime_ready(task):
"""Retries till real time operations are ready to be executed.
:param task: TaskManager object containing the node.
:raises RedfishError: If can't find OEM extension or it fails to
execute
:raises RetryError: If retries exceeded and still not ready for real-time
"""
return _is_realtime_ready(task)
def _is_realtime_ready(task):
"""Gets the real-time ready status.
Uses sushy-oem-idrac extension.
:param task: TaskManager object containing the node.
:returns: True, if real time operations are ready, otherwise False.
:raises RedfishError: If can't find OEM extension or it fails to
execute
"""
system = redfish_utils.get_system(task.node)
for manager in system.managers:
try:
manager_oem = manager.get_oem_extension('Dell')
except sushy.exceptions.OEMExtensionNotFoundError as e:
error_msg = (_("Search for Sushy OEM extension Python package "
"'sushy-oem-idrac' failed for node %(node)s. "
"Ensure it is installed. Error: %(error)s") %
{'node': task.node.uuid, 'error': e})
LOG.error(error_msg)
raise exception.RedfishError(error=error_msg)
try:
return manager_oem.lifecycle_service.is_realtime_ready()
except sushy.exceptions.SushyError as e:
LOG.debug("Failed to get real time ready status with system "
"%(system)s manager %(manager)s for node %(node)s. Will "
"try next manager, if available. Error: %(error)s",
{'system': system.uuid if system.uuid else
system.identity,
'manager': manager.uuid if manager.uuid else
manager.identity,
'node': task.node.uuid,
'error': e})
continue
break
else:
error_msg = (_("iDRAC Redfish get real time ready status failed for "
"node %(node)s, because system %(system)s has no "
"manager%(no_manager)s.") %
{'node': task.node.uuid,
'system': system.uuid if system.uuid else
system.identity,
'no_manager': '' if not system.managers else
' which could'})
LOG.error(error_msg)
raise exception.RedfishError(error=error_msg)
class DracRedfishRAID(redfish_raid.RedfishRAID):
"""iDRAC Redfish interface for RAID related actions.
Includes iDRAC specific adjustments for RAID related actions.
"""
@base.clean_step(priority=0, abortable=False, argsinfo={
'create_root_volume': {
'description': (
'This specifies whether to create the root volume. '
'Defaults to `True`.'
),
'required': False
},
'create_nonroot_volumes': {
'description': (
'This specifies whether to create the non-root volumes. '
'Defaults to `True`.'
),
'required': False
},
'delete_existing': {
'description': (
'Setting this to `True` indicates to delete existing RAID '
'configuration prior to creating the new configuration. '
'Default value is `False`.'
),
'required': False,
}
})
def create_configuration(self, task, create_root_volume=True,
create_nonroot_volumes=True,
delete_existing=False):
"""Create RAID configuration on the node.
This method creates the RAID configuration as read from
node.target_raid_config. This method
by default will create all logical disks.
:param task: TaskManager object containing the node.
:param create_root_volume: Setting this to False indicates
not to create root volume that is specified in the node's
target_raid_config. Default value is True.
:param create_nonroot_volumes: Setting this to False indicates
not to create non-root volumes (all except the root volume) in
the node's target_raid_config. Default value is True.
:param delete_existing: Setting this to True indicates to delete RAID
configuration prior to creating the new configuration. Default is
False.
:returns: states.CLEANWAIT if RAID configuration is in progress
asynchronously or None if it is complete.
:raises: RedfishError if there is an error creating the configuration
"""
_wait_till_realtime_ready(task)
return super(DracRedfishRAID, self).create_configuration(
task, create_root_volume, create_nonroot_volumes,
delete_existing)
@base.clean_step(priority=0)
@base.deploy_step(priority=0)
def delete_configuration(self, task):
"""Delete RAID configuration on the node.
:param task: TaskManager object containing the node.
:returns: states.CLEANWAIT (cleaning) or states.DEPLOYWAIT (deployment)
if deletion is in progress asynchronously or None if it is
complete.
"""
_wait_till_realtime_ready(task)
return super(DracRedfishRAID, self).delete_configuration(task)
def _validate_vendor(self, task):
pass # for now assume idrac-redfish is used with iDRAC BMC, thus pass
class DracWSManRAID(base.RAIDInterface):
def get_properties(self):
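
The wait logic added in this file can be sketched without `tenacity` as a plain polling loop. This is a minimal illustration, not the code from the commit: `wait_until_ready`, `is_ready`, and the injectable `sleep` argument are hypothetical names, while the defaults of 30 attempts and a fixed 10-second wait mirror the decorator arguments above. As with `_wait_till_realtime_ready`, the caller is expected to carry on even if readiness is never reported.

```python
import time


def wait_until_ready(is_ready, attempts=30, delay=10, sleep=time.sleep):
    """Poll is_ready() until it returns a truthy value or attempts run out.

    Hypothetical helper: returns True as soon as is_ready() is truthy and
    False once all attempts are used. Exceptions raised by is_ready()
    propagate immediately, since only falsy results trigger a retry.
    """
    for attempt in range(1, attempts + 1):
        if is_ready():
            return True
        if attempt < attempts:
            # Fixed wait between attempts; no sleep after the last one.
            sleep(delay)
    return False


# Usage: readiness is reported on the third poll; sleep is stubbed out
# so the example finishes instantly and records the requested waits.
answers = iter([False, False, True])
waits = []
ready = wait_until_ready(lambda: next(answers), sleep=waits.append)
print(ready, waits)
```

Passing `sleep` as a parameter keeps the sketch testable without patching, which is the same goal the unit tests in this change reach by replacing `retry.sleep` with a mock.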


@ -687,6 +687,32 @@ class RedfishRAID(base.RAIDInterface):
"""
return redfish_utils.COMMON_PROPERTIES.copy()
def _validate_vendor(self, task):
vendor = task.node.properties.get('vendor')
if not vendor:
return
if 'dell' in vendor.lower().split():
raise exception.InvalidParameterValue(
_("The %(iface)s raid interface is not suitable for node "
"%(node)s with vendor %(vendor)s, use idrac-redfish instead")
% {'iface': task.node.raid_interface,
'node': task.node.uuid, 'vendor': vendor})
def validate(self, task):
"""Validates the RAID Interface.
This method validates the properties defined by Ironic for RAID
configuration. Driver implementations of this interface can override
this method for doing more validations (such as BMC's credentials).
:param task: A TaskManager instance.
:raises: InvalidParameterValue, if the RAID configuration is invalid.
:raises: MissingParameterValue, if some parameters are missing.
"""
self._validate_vendor(task)
super(RedfishRAID, self).validate(task)
@base.deploy_step(priority=0,
argsinfo=base.RAID_APPLY_CONFIGURATION_ARGSINFO)
def apply_configuration(self, task, raid_config, create_root_volume=True,
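
The vendor check in `_validate_vendor` matches whole whitespace-separated words, so a vendor string such as `Dell Inc.` is rejected by the generic interface while a missing vendor is accepted. A standalone sketch of that matching rule follows; `is_dell_vendor` is a hypothetical name used only for illustration and is not part of Ironic.

```python
def is_dell_vendor(vendor):
    """Return True when 'dell' appears as a whole word in the vendor string.

    Mirrors the `'dell' in vendor.lower().split()` test; an empty or
    missing vendor is treated as not Dell.
    """
    if not vendor:
        return False
    return 'dell' in vendor.lower().split()


# Print how a few sample vendor strings are classified.
for value in ('Dell Inc.', 'DELL', 'Supported vendor', 'Dellware', None):
    print(repr(value), is_dell_vendor(value))
```

Because the comparison is made per word, a name that merely contains the letters, such as the made-up `Dellware`, does not match.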


@ -92,6 +92,10 @@ def get_test_drac_info():
"drac_protocol": "https",
"drac_username": "admin",
"drac_password": "fake",
"redfish_address": "1.2.3.4",
"redfish_system_id": "/redfish/v1/Systems/System.Embedded.1",
"redfish_username": "admin",
"redfish_password": "fake"
}


@ -20,6 +20,8 @@ from unittest import mock
from dracclient import constants
from dracclient import exceptions as drac_exceptions
from oslo_utils import importutils
import tenacity
from ironic.common import exception
from ironic.common import states
@ -28,9 +30,13 @@ from ironic.drivers import base
from ironic.drivers.modules.drac import common as drac_common
from ironic.drivers.modules.drac import job as drac_job
from ironic.drivers.modules.drac import raid as drac_raid
from ironic.drivers.modules.redfish import raid as redfish_raid
from ironic.drivers.modules.redfish import utils as redfish_utils
from ironic.tests.unit.drivers.modules.drac import utils as test_utils
from ironic.tests.unit.objects import utils as obj_utils
sushy = importutils.try_import('sushy')
INFO_DICT = test_utils.INFO_DICT
@ -2239,3 +2245,159 @@ class DracRaidInterfaceTestCase(test_utils.BaseDracTest):
mock_apply_configuration.assert_called_once_with(
task.driver.raid, task,
self.target_raid_configuration, False, True, False)
class DracRedfishRAIDTestCase(test_utils.BaseDracTest):
def setUp(self):
super(DracRedfishRAIDTestCase, self).setUp()
self.node = obj_utils.create_test_node(self.context,
driver='idrac',
driver_info=INFO_DICT)
self.raid = drac_raid.DracRedfishRAID()
@mock.patch.object(drac_raid, '_wait_till_realtime_ready', autospec=True)
@mock.patch.object(redfish_raid.RedfishRAID, 'create_configuration',
autospec=True)
def test_create_configuration(self, mock_redfish_create, mock_wait):
task = mock.Mock(node=self.node, context=self.context)
self.raid.create_configuration(task)
mock_wait.assert_called_once_with(task)
mock_redfish_create.assert_called_once_with(
self.raid, task, True, True, False)
@mock.patch.object(drac_raid, '_wait_till_realtime_ready', autospec=True)
@mock.patch.object(redfish_raid.RedfishRAID, 'delete_configuration',
autospec=True)
def test_delete_configuration(self, mock_redfish_delete, mock_wait):
task = mock.Mock(node=self.node, context=self.context)
self.raid.delete_configuration(task)
mock_wait.assert_called_once_with(task)
mock_redfish_delete.assert_called_once_with(self.raid, task)
@mock.patch.object(drac_raid, '_retry_till_realtime_ready', autospec=True)
def test__wait_till_realtime_ready(self, mock_ready):
task = mock.Mock(node=self.node, context=self.context)
drac_raid._wait_till_realtime_ready(task)
mock_ready.assert_called_once_with(task)
@mock.patch.object(drac_raid, 'LOG', autospec=True)
@mock.patch.object(drac_raid, '_retry_till_realtime_ready', autospec=True)
def test__wait_till_realtime_ready_retryerror(self, mock_ready, mock_log):
task = mock.Mock(node=self.node, context=self.context)
mock_ready.side_effect = tenacity.RetryError(3)
drac_raid._wait_till_realtime_ready(task)
mock_ready.assert_called_once_with(task)
self.assertEqual(mock_log.debug.call_count, 1)
@mock.patch.object(drac_raid, '_is_realtime_ready', autospec=True)
def test__retry_till_realtime_ready_retry_exceeded(self, mock_ready):
drac_raid._retry_till_realtime_ready.retry.sleep = mock.Mock()
drac_raid._retry_till_realtime_ready.retry.stop =\
tenacity.stop_after_attempt(3)
task = mock.Mock(node=self.node, context=self.context)
mock_ready.return_value = False
self.assertRaises(
tenacity.RetryError,
drac_raid._retry_till_realtime_ready, task)
self.assertEqual(3, mock_ready.call_count)
@mock.patch.object(drac_raid, '_is_realtime_ready', autospec=True)
def test__retry_till_realtime_ready_retry_fails(self, mock_ready):
drac_raid._retry_till_realtime_ready.retry.sleep = mock.Mock()
drac_raid._retry_till_realtime_ready.retry.stop =\
tenacity.stop_after_attempt(3)
task = mock.Mock(node=self.node, context=self.context)
mock_ready.side_effect = [False, exception.RedfishError]
self.assertRaises(
exception.RedfishError,
drac_raid._retry_till_realtime_ready, task)
self.assertEqual(2, mock_ready.call_count)
@mock.patch.object(drac_raid, '_is_realtime_ready', autospec=True)
def test__retry_till_realtime_ready(self, mock_ready):
drac_raid._retry_till_realtime_ready.retry.sleep = mock.Mock()
task = mock.Mock(node=self.node, context=self.context)
mock_ready.side_effect = [False, True]
is_ready = drac_raid._retry_till_realtime_ready(task)
self.assertTrue(is_ready)
self.assertEqual(2, mock_ready.call_count)
@mock.patch.object(redfish_utils, 'get_system', autospec=True)
def test__is_realtime_ready_no_managers(self, mock_get_system):
task = mock.Mock(node=self.node, context=self.context)
fake_system = mock.Mock(managers=[])
mock_get_system.return_value = fake_system
self.assertRaises(exception.RedfishError,
drac_raid._is_realtime_ready, task)
@mock.patch.object(drac_raid, 'LOG', autospec=True)
@mock.patch.object(redfish_utils, 'get_system', autospec=True)
def test__is_realtime_ready_oem_not_found(self, mock_get_system, mock_log):
task = mock.Mock(node=self.node, context=self.context)
fake_manager1 = mock.Mock()
fake_manager1.get_oem_extension.side_effect = (
sushy.exceptions.OEMExtensionNotFoundError)
fake_system = mock.Mock(managers=[fake_manager1])
mock_get_system.return_value = fake_system
self.assertRaises(exception.RedfishError,
drac_raid._is_realtime_ready, task)
self.assertEqual(mock_log.error.call_count, 1)
@mock.patch.object(drac_raid, 'LOG', autospec=True)
@mock.patch.object(redfish_utils, 'get_system', autospec=True)
def test__is_realtime_ready_all_managers_fail(self, mock_get_system,
mock_log):
task = mock.Mock(node=self.node, context=self.context)
fake_manager_oem1 = mock.Mock()
fake_manager_oem1.lifecycle_service.is_realtime_ready.side_effect = (
sushy.exceptions.SushyError)
fake_manager1 = mock.Mock()
fake_manager1.get_oem_extension.return_value = fake_manager_oem1
fake_manager_oem2 = mock.Mock()
fake_manager_oem2.lifecycle_service.is_realtime_ready.side_effect = (
sushy.exceptions.SushyError)
fake_manager2 = mock.Mock()
fake_manager2.get_oem_extension.return_value = fake_manager_oem2
fake_system = mock.Mock(managers=[fake_manager1, fake_manager2])
mock_get_system.return_value = fake_system
self.assertRaises(exception.RedfishError,
drac_raid._is_realtime_ready, task)
self.assertEqual(mock_log.debug.call_count, 2)
@mock.patch.object(drac_raid, 'LOG', autospec=True)
@mock.patch.object(redfish_utils, 'get_system', autospec=True)
def test__is_realtime_ready(self, mock_get_system, mock_log):
task = mock.Mock(node=self.node, context=self.context)
fake_manager_oem1 = mock.Mock()
fake_manager_oem1.lifecycle_service.is_realtime_ready.side_effect = (
sushy.exceptions.SushyError)
fake_manager1 = mock.Mock()
fake_manager1.get_oem_extension.return_value = fake_manager_oem1
fake_manager_oem2 = mock.Mock()
fake_manager_oem2.lifecycle_service.is_realtime_ready.return_value = (
True)
fake_manager2 = mock.Mock()
fake_manager2.get_oem_extension.return_value = fake_manager_oem2
fake_system = mock.Mock(managers=[fake_manager1, fake_manager2])
mock_get_system.return_value = fake_system
is_ready = drac_raid._is_realtime_ready(task)
self.assertTrue(is_ready)
self.assertEqual(mock_log.debug.call_count, 1)
def test_validate_correct_vendor(self):
task = mock.Mock(node=self.node, context=self.context)
self.node.properties['vendor'] = 'Dell Inc.'
self.raid.validate(task)
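
The `_is_realtime_ready` tests above depend on the loop moving on to the next manager when one fails. A self-contained sketch of that fallback, using only `unittest.mock`, is shown here; `first_ready_status` and `LookupFailed` are hypothetical stand-ins for the driver function and `sushy.exceptions.SushyError`, and the real function additionally resolves the Dell OEM extension for each manager.

```python
from unittest import mock


class LookupFailed(Exception):
    """Hypothetical stand-in for sushy.exceptions.SushyError."""


def first_ready_status(managers):
    """Return the status from the first manager that answers.

    Managers whose query raises LookupFailed are skipped; if none
    answers (or the list is empty), a RuntimeError is raised.
    """
    for manager in managers:
        try:
            return manager.is_realtime_ready()
        except LookupFailed:
            continue
    raise RuntimeError('no manager could report real-time ready status')


# The first fake manager always fails, the second reports readiness.
failing = mock.Mock()
failing.is_realtime_ready.side_effect = LookupFailed
working = mock.Mock()
working.is_realtime_ready.return_value = True

print(first_ready_status([failing, working]))
```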


@ -844,3 +844,19 @@ class RedfishRAIDTestCase(db_base.DbTestCase):
mock_error_handler.assert_called_once_with(
task, sushy_error, volume_collection, expected_payload
)
def test_validate(self, mock_get_system):
with task_manager.acquire(self.context, self.node.uuid,
shared=True) as task:
task.node.properties['vendor'] = "Supported vendor"
task.driver.raid.validate(task)
def test_validate_unsupported_vendor(self, mock_get_system):
with task_manager.acquire(self.context, self.node.uuid,
shared=True) as task:
task.node.properties['vendor'] = "Dell Inc."
self.assertRaisesRegex(exception.InvalidParameterValue,
"with vendor Dell.Inc.",
task.driver.raid.validate, task)


@ -43,7 +43,8 @@ class IDRACHardwareTestCase(db_base.DbTestCase):
'no-inspect'],
enabled_network_interfaces=['flat', 'neutron', 'noop'],
enabled_raid_interfaces=[
'idrac', 'idrac-wsman', 'no-raid', 'agent'],
'idrac', 'idrac-wsman', 'idrac-redfish', 'no-raid',
'agent'],
enabled_vendor_interfaces=[
'idrac', 'idrac-wsman', 'no-vendor'],
enabled_bios_interfaces=[
@ -113,6 +114,15 @@ class IDRACHardwareTestCase(db_base.DbTestCase):
with task_manager.acquire(self.context, node.id) as task:
self._validate_interfaces(task.driver, raid=impl)
def test_override_with_redfish_raid(self):
node = obj_utils.create_test_node(self.context,
uuid=uuidutils.generate_uuid(),
driver='idrac',
raid_interface='idrac-redfish')
with task_manager.acquire(self.context, node.id) as task:
self._validate_interfaces(task.driver,
raid=drac.raid.DracRedfishRAID)
def test_override_no_vendor(self):
node = obj_utils.create_test_node(self.context, driver='idrac',
vendor_interface='no-vendor')


@ -0,0 +1,18 @@
---
features:
- |
Adds basic support for managing RAID configuration via the Redfish
out-of-band (OOB) management protocol to the ``idrac`` hardware type by
adding a new interface named ``idrac-redfish``.
iDRAC firmware greater than 4.40.00.00 is required. Unlike the
``idrac-wsman`` implementation, it does not yet support foreign disks or
converting from non-RAID mode.
Backing physical disk hints are not widely tested with ``idrac-redfish``
yet and they might not work as desired. Backing physical disks
(``controller``, ``physical_disks``) with ``size_gb="MAX"`` are tested.
The ``idrac`` hardware type now supports the ``idrac-wsman``, ``idrac``,
``idrac-redfish``, and ``no-raid`` interfaces in the given priority
order.
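
The firmware requirement stated in this note can be checked by comparing dotted version strings numerically rather than as text. A minimal sketch, assuming versions always consist of dot-separated integers; `parse_version` and `meets_minimum` are hypothetical helpers, and nothing in this change indicates that Ironic performs such a check itself.

```python
MINIMUM_EXCLUSIVE = '4.40.00.00'


def parse_version(text):
    """Turn '4.40.00.00' into (4, 40, 0, 0) for numeric comparison."""
    return tuple(int(part) for part in text.split('.'))


def meets_minimum(firmware, minimum=MINIMUM_EXCLUSIVE):
    """Return True when firmware is strictly greater than minimum."""
    return parse_version(firmware) > parse_version(minimum)


# Print the verdict for a few sample firmware version strings.
for version in ('4.40.00.00', '4.40.10.00', '5.00.00.00', '4.22.00.53'):
    print(version, meets_minimum(version))
```

The comparison is strict because the note asks for firmware greater than 4.40.00.00, so that exact version does not qualify in this sketch.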


@ -141,6 +141,7 @@ ironic.hardware.interfaces.raid =
fake = ironic.drivers.modules.fake:FakeRAID
ibmc = ironic.drivers.modules.ibmc.raid:IbmcRAID
idrac = ironic.drivers.modules.drac.raid:DracRAID
idrac-redfish = ironic.drivers.modules.drac.raid:DracRedfishRAID
idrac-wsman = ironic.drivers.modules.drac.raid:DracWSManRAID
ilo5 = ironic.drivers.modules.ilo.raid:Ilo5RAID
irmc = ironic.drivers.modules.irmc.raid:IRMCRAID