IPMI: Handle vendor set boot device differences

Supermicro machines, when in UEFI mode, have a different
device number, in binary, to represent the hard disk from
other vendors such as Fujitsu which actually has somewhat
similar code in their driver.

This means we need to be somewhat cognizent of the vendor of
the BMC and possibly update the device mapping based upon that
vendor.

This may ultimately fix a number of IPMI related problems, because
there is a reliance upon the text output of ipmitool, which only
reads the bytes retured by the BMC, which may not be reality after
the next reset, espescialy if ipmitool doesn't know of the UEFI
operating difference.

Change-Id: Ie19db9e0cf1eafdfc9bb46248f4d457337821f94
Story: 2008241
Task: 41085
This commit is contained in:
Julia Kreger 2020-10-09 17:12:07 -07:00 committed by Dmitry Tantsur
parent cde792a8c3
commit a7ac9ce8cd
6 changed files with 321 additions and 16 deletions

View File

@ -198,6 +198,49 @@ See :ref:`static-boot-order`.
.. TODO(lucasagomes): Write about privilege level
.. TODO(lucasagomes): Write about force boot device
Vendor Differences
~~~~~~~~~~~~~~~~~~
While the Intelligent Platform Management Interface (IPMI) interface is based
upon a defined standard, the Ironic community is aware of at least one vendor
which utilizes a non-standard boot device selector. In essence, this could be
something as simple as different interpretation of the standard.
As of October 2020, the known difference is with Supermicro hardware where
a selector of ``0x24``, signifying a *REMOTE* boot device in the standard,
must be used when a boot operation from the local disk subsystem is requested
**in UEFI mode**. This is contrary to BIOS mode where the same BMC's expect
the selector to be a value of ``0x08``.
Because the BMC does not respond with any sort of error, nor do we want to
risk BMC connectivity issues by explicitly querying all BMCs what vendor it may
be before every operation, the vendor can automatically be recorded in the
``properties`` field ``vendor``. When this is set to a value of
``supermicro``, Ironic will navigate the UEFI behavior difference enabling
the UEFI to be requested with boot to disk.
Example::
baremetal node set <UUID or name> \
--properties vendor="supermicro"
Luckily, Ironic will attempt to perform this detection in power
synchronization process, and record this value if not already set.
While similar issues may exist when setting the boot mode and target
boot device in other vendors' BMCs, we are not aware of them at present.
Should you encounter such an issue, please feel free to report this via
`Storyboard <https://storyboard.openstack.org>`_, and be sure to include
the ``chassis bootparam get 5`` output value along with the ``mc info``
output from your BMC.
Example::
ipmitool -I lanplus -H <BMC ADDRESS> -U <Username> -P <Password> \
mc info
ipmitool -I lanplus -H <BMC ADDRESS> -U <Username> -P <Password> \
chassis bootparam get 5
.. _IPMItool: https://sourceforge.net/projects/ipmitool/
.. _IPMI: https://en.wikipedia.org/wiki/Intelligent_Platform_Management_Interface
.. _BMC: https://en.wikipedia.org/wiki/Intelligent_Platform_Management_Interface#Baseboard_management_controller

View File

@ -1107,6 +1107,24 @@ class ManagementInterface(BaseInterface):
raise exception.UnsupportedDriverExtension(
driver=task.node.driver, extension='get_indicator_state')
def detect_vendor(self, task):
"""Detects, stores, and returns the hardware vendor.
If the Node object ``properties`` field does not already contain
a ``vendor`` field, then this method is intended to query
Detects the BMC hardware vendor and stores the returned value
with-in the Node object ``properties`` field if detected.
:param task: A task from TaskManager.
:raises: InvalidParameterValue if an invalid component, indicator
or state is specified.
:raises: MissingParameterValue if a required parameter is missing
:returns: String representing the BMC reported Vendor or
Manufacturer, otherwise returns None.
"""
raise exception.UnsupportedDriverExtension(
driver=task.node.driver, extension='detect_vendor')
class InspectInterface(BaseInterface):
"""Interface for inspection-related actions."""

View File

@ -156,7 +156,7 @@ IPMITOOL_RETRYABLE_FAILURES = ['insufficient resources for session',
# NOTE(lucasagomes): A mapping for the boot devices and their hexadecimal
# value. For more information about these values see the "Set System Boot
# Options Command" section of the link below (page 418)
# Options Command" section 28.13 of the link below (page 392)
# http://www.intel.com/content/www/us/en/servers/ipmi/ipmi-second-gen-interface-spec-v2-rev1-1.html # noqa
BOOT_DEVICE_HEXA_MAP = {
boot_devices.PXE: '0x04',
@ -165,6 +165,50 @@ BOOT_DEVICE_HEXA_MAP = {
boot_devices.BIOS: '0x18',
boot_devices.SAFE: '0x0c'
}
# NOTE(TheJulia): Inline notes for ipmi raw boot device commands.
# Per spec (intel 2nd gen ipmi interface, v2, rev1-1)
# - bit 7 is CMOS clear
# - bit 6 is Keyboard lock
# - bit 5-2 is the boot device selector
# * where 0000b is to do nothing
# * where 0001b is to force pxe
# * where 0010b is boot from hard drive
# * And anything above 1100b is listed reserved
# which is 0x30 for ipmitool raw commands!
# - bit 1 is screen blank
# - bit 0 is lock out reset button
#
# Effectively this results in
# 00000100b == 0x04 for booting from PXE
# 00001000b == 0x08 for booting from the primary hard disk.
# 00100000b == 0x20 is remotely connected virtual media
# 00100100b == 0x24 which is primary remote boot media as set
# via the OEM initiator mailbox information.
# Like.. iscsi?
# However! Supermicro uses this as the UEFI
# hard disk. See: https://www.supermicro.com/support/faqs/faq.cfm?faq=25559 # noqa
# 00010000b == 0x30 Enters the reserved territory, and should not
# be expected in use.
def _vendor_aware_boot_device_map(task):
"""Returns an altered vendor boot device map based upon vendor."""
node = task.node
vendor = node.properties.get('vendor', None)
boot_dev_map = BOOT_DEVICE_HEXA_MAP.copy()
boot_mode = boot_mode_utils.get_boot_mode(node)
if vendor:
vendor = str(vendor).lower()
if boot_mode == 'uefi' and vendor == "supermicro":
# This difference is only known on UEFI mode for supermicro
# hardware.
boot_dev_map[boot_devices.DISK] = '0x24'
# NOTE(TheJulia): Similar differences may exist with Cisco UCS
# hardware when using IPMI, however at present we don't know
# what the setting would be.
# NOTE(TheJulia) I've observed "mc info" manufacter name of a cisco
# c-series machine to return "Unknown (0x168B)"
return boot_dev_map
def _check_option_support(options):
@ -877,6 +921,24 @@ class IPMIPower(base.PowerInterface):
call).
"""
# NOTE(TheJulia): Temporary until we promote detect_vendor as
# a higher level management method and able to automatically
# run detection upon either power state sync or in the enrollment
# to management step.
try:
properties = task.node.properties
if not properties.get('vendor'):
# We have no vendor stored, so we'll go ahead and
# call to store it.
vendor = task.driver.management.detect_vendor(task)
if vendor:
props = task.node.properties
props['vendor'] = vendor
task.node.properties = props
task.node.save()
except exception.UnsupportedDriverExtension:
pass
driver_info = _parse_driver_info(task.node)
return _power_status(driver_info)
@ -1040,7 +1102,7 @@ class IPMIManagement(base.ManagementInterface):
# persistent or we do not have admin rights.
persistent = False
# FIXME(lucasagomes): Older versions of the ipmitool utility
# NOTE(lucasagomes): Older versions of the ipmitool utility
# are not able to set the options "efiboot" and "persistent"
# at the same time, combining other options seems to work fine,
# except efiboot. Newer versions of ipmitool (1.8.17) does fix
@ -1050,19 +1112,37 @@ class IPMIManagement(base.ManagementInterface):
# uefi mode, this will work with newer and older versions of the
# ipmitool utility. Also see:
# https://bugs.launchpad.net/ironic/+bug/1611306
# NOTE(TheJulia): Previously we added raw device support because
# most distributions are carrying older downstream patched versions
# of ipmitool where "efiboot" and "persistent" do not work. However,
# using the embedded support assumes that every vendor out there uses
# the same device numbers all the time. Reality is not so kind.
# It should also be noted that the ipmitool chassis bootparm get
# command also just interprets back the same fields, so if the vendor
# has set a different value as default, then ipmitool is not going
# to understand it and may be listing the wrong boot flag as a result.
# See: https://storyboard.openstack.org/#!/story/2008241
boot_mode = boot_mode_utils.get_boot_mode(task.node)
if persistent and boot_mode == 'uefi':
raw_cmd = ('0x00 0x08 0x05 0xe0 %s 0x00 0x00 0x00' %
BOOT_DEVICE_HEXA_MAP[device])
if boot_mode == 'uefi':
# Long story short: UEFI was added to IPMI after the final spec
# release occured. This means BMCs may actually NEED to be
# explicitly told if the boot is persistant because the
# BMC may return a value which is explicitly parsed as
# no change, BUT the BMC may treat that as operational default.
efi_persistence = '0xe0' if persistent else '0xa0'
# 0xe0 is persistent UEFI boot, 0xa0 is one-time UEFI boot.
boot_device_map = _vendor_aware_boot_device_map(task)
raw_cmd = ('0x00 0x08 0x05 %s %s 0x00 0x00 0x00' %
(efi_persistence, boot_device_map[device]))
send_raw(task, raw_cmd)
return
options = []
if persistent:
options.append('persistent')
if boot_mode == 'uefi':
options.append('efiboot')
# NOTE(TheJulia): Once upon a time we had efiboot here. It doesn't
# work in all cases and directs us to unhappy places if the values
# are incorrect.
cmd = "chassis bootdev %s" % device
if options:
cmd = cmd + " options=%s" % ','.join(options)
@ -1100,6 +1180,8 @@ class IPMIManagement(base.ManagementInterface):
driver_internal_info = task.node.driver_internal_info
ifbd = driver_info.get('ipmi_force_boot_device', False)
driver_info = _parse_driver_info(task.node)
if (strutils.bool_from_string(ifbd)
and driver_internal_info.get('persistent_boot_device')
and driver_internal_info.get('is_next_boot_persistent', True)):
@ -1109,7 +1191,6 @@ class IPMIManagement(base.ManagementInterface):
}
cmd = "chassis bootparam get 5"
driver_info = _parse_driver_info(task.node)
response = {'boot_device': None, 'persistent': None}
try:
@ -1121,7 +1202,9 @@ class IPMIManagement(base.ManagementInterface):
'Error: %(error)s',
{'node': driver_info['uuid'], 'cmd': cmd, 'error': e})
raise exception.IPMIFailure(cmd=cmd)
# FIXME(TheJulia): This whole thing needs to be refactored to be based
# upon the response generated by the "Boot parameter data".
# See: https://storyboard.openstack.org/#!/story/2008241
re_obj = re.search('Boot Device Selector : (.+)?\n', out)
if re_obj:
boot_selector = re_obj.groups('')[0]
@ -1140,6 +1223,38 @@ class IPMIManagement(base.ManagementInterface):
response['persistent'] = 'Options apply to all future boots' in out
return response
@task_manager.require_exclusive_lock
@METRICS.timer('IPMIManagement.detect_vendor')
def detect_vendor(self, task):
"""Detects, stores, and returns the hardware vendor.
If the Node object ``properties`` field does not already contain
a ``vendor`` field, then this method is intended to query
Detects the BMC hardware vendor and stores the returned value
with-in the Node object ``properties`` field if detected.
:param task: A task from TaskManager.
:raises: InvalidParameterValue if an invalid component, indicator
or state is specified.
:raises: MissingParameterValue if a required parameter is missing
:returns: String representing the BMC reported Vendor or
Manufacturer, otherwise returns None.
"""
try:
driver_info = _parse_driver_info(task.node)
out, err = _exec_ipmitool(driver_info, "mc info")
re_obj = re.search("Manufacturer Name .*: (.+)", out)
if re_obj:
bmc_vendor = str(re_obj.groups('')[0]).lower().split(':')
# Pull unparsed data and save the vendor
return bmc_vendor[-1]
except (exception.PasswordFileFailedToCreate,
processutils.ProcessExecutionError) as e:
LOG.warning('IPMI get boot device failed to detect vendor '
'of bmc for %(node)s. Error %(err)s',
{'node': task.node.uuid,
'err': e})
@METRICS.timer('IPMIManagement.get_sensors_data')
def get_sensors_data(self, task):
"""Get sensors data.

View File

@ -30,6 +30,7 @@ from ironic.tests.unit.db import base as db_base
from ironic.tests.unit.objects import utils as object_utils
from ironic.tests.unit import stubs
@mock.patch('ironic.common.keystone.get_service_auth', autospec=True,
return_value=mock.sentinel.sauth)
@mock.patch('ironic.common.keystone.get_auth', autospec=True,

View File

@ -58,6 +58,31 @@ INFO_DICT = db_utils.get_test_ipmi_info()
BRIDGE_INFO_DICT = INFO_DICT.copy()
BRIDGE_INFO_DICT.update(db_utils.get_test_ipmi_bridging_parameters())
MC_INFO = """Device ID : 32
Device Revision : 1
Firmware Revision : 0.99
IPMI Version : 2.0
Manufacturer ID : 1234
Manufacturer Name : {vendor}
Product ID : 2001 (0x0801)
Product Name : Unknown (0x101)
Device Available : yes
Provides Device SDRs : no
Additional Device Support :
Sensor Device
SDR Repository Device
SEL Device
FRU Inventory Device
IPMB Event Receiver
IPMB Event Generator
Chassis Device
Aux Firmware Rev Info :
0x00
0x00
0x00
0x00
"""
class IPMIToolCheckInitTestCase(base.TestCase):
@ -1510,20 +1535,22 @@ class IPMIToolPrivateMethodTestCase(
mock.call(self.info, "power status", kill_on_timeout=True)]
with task_manager.acquire(self.context, self.node.uuid) as task:
task.node.properties['vendor'] = 'lolcat'
self.assertRaises(exception.PowerStateFailure,
ipmi._power_on, task, self.info, timeout=2)
self.assertEqual(expected, mock_exec.call_args_list)
@mock.patch.object(ipmi.IPMIManagement, 'detect_vendor', autospec=True)
@mock.patch.object(ipmi, '_exec_ipmitool', autospec=True)
@mock.patch('oslo_utils.eventletutils.EventletEvent.wait', autospec=True)
def test__soft_power_off(self, sleep_mock, mock_exec):
def test__soft_power_off(self, sleep_mock, mock_exec, mock_vendor):
def side_effect(driver_info, command, **kwargs):
resp_dict = {"power status": ["Chassis Power is off\n", None],
"power soft": [None, None]}
return resp_dict.get(command, ["Bad\n", None])
mock_vendor.return_value = None
mock_exec.side_effect = side_effect
expected = [mock.call(self.info, "power soft"),
@ -1534,10 +1561,13 @@ class IPMIToolPrivateMethodTestCase(
self.assertEqual(expected, mock_exec.call_args_list)
self.assertEqual(states.POWER_OFF, state)
self.assertTrue(mock_vendor.called)
@mock.patch.object(ipmi.IPMIManagement, 'detect_vendor', autospec=True)
@mock.patch.object(ipmi, '_exec_ipmitool', autospec=True)
@mock.patch('oslo_utils.eventletutils.EventletEvent.wait', autospec=True)
def test__soft_power_off_max_retries(self, sleep_mock, mock_exec):
def test__soft_power_off_max_retries(self, sleep_mock, mock_exec,
mock_vendor):
def side_effect(driver_info, command, **kwargs):
resp_dict = {"power status": ["Chassis Power is on\n", None],
@ -1551,10 +1581,13 @@ class IPMIToolPrivateMethodTestCase(
mock.call(self.info, "power status", kill_on_timeout=True)]
with task_manager.acquire(self.context, self.node.uuid) as task:
task.node.properties['vendor'] = 'meow5000'
self.assertRaises(exception.PowerStateFailure,
ipmi._soft_power_off, task, self.info, timeout=2)
self.assertEqual(expected, mock_exec.call_args_list)
# Should be removed when detect_vendor automatic invocation is moved.
self.assertFalse(mock_vendor.called)
@mock.patch.object(ipmi, '_power_status', autospec=True)
@mock.patch.object(ipmi, '_exec_ipmitool', autospec=True)
@ -1594,8 +1627,9 @@ class IPMIToolDriverTestCase(Base):
self.assertEqual(sorted(expected),
sorted(task.driver.get_properties()))
@mock.patch.object(ipmi.IPMIManagement, 'detect_vendor', autospec=True)
@mock.patch.object(ipmi, '_exec_ipmitool', autospec=True)
def test_get_power_state(self, mock_exec):
def test_get_power_state(self, mock_exec, mock_detect):
returns = iter([["Chassis Power is off\n", None],
["Chassis Power is on\n", None],
["\n", None]])
@ -1603,6 +1637,7 @@ class IPMIToolDriverTestCase(Base):
mock.call(self.info, "power status", kill_on_timeout=True),
mock.call(self.info, "power status", kill_on_timeout=True)]
mock_exec.side_effect = returns
mock_detect.return_value = None
with task_manager.acquire(self.context, self.node.uuid) as task:
pstate = self.power.get_power_state(task)
@ -1615,16 +1650,20 @@ class IPMIToolDriverTestCase(Base):
self.assertEqual(states.ERROR, pstate)
self.assertEqual(mock_exec.call_args_list, expected)
self.assertEqual(3, mock_detect.call_count)
@mock.patch.object(ipmi.IPMIManagement, 'detect_vendor', autospec=True)
@mock.patch.object(ipmi, '_exec_ipmitool', autospec=True)
def test_get_power_state_exception(self, mock_exec):
def test_get_power_state_exception(self, mock_exec, mock_vendor):
mock_exec.side_effect = processutils.ProcessExecutionError("error")
mock_vendor.return_value = None
with task_manager.acquire(self.context, self.node.uuid) as task:
self.assertRaises(exception.IPMIFailure,
self.power.get_power_state,
task)
mock_exec.assert_called_once_with(self.info, "power status",
kill_on_timeout=True)
self.assertEqual(1, mock_vendor.call_count)
@mock.patch.object(ipmi, '_power_on', autospec=True)
@mock.patch.object(ipmi, '_power_off', autospec=True)
@ -2230,7 +2269,7 @@ class IPMIToolDriverTestCase(Base):
mock_calls = [
mock.call(self.info, "raw 0x00 0x08 0x03 0x08"),
mock.call(self.info, "chassis bootdev pxe options=efiboot")
mock.call(self.info, "raw 0x00 0x08 0x05 0xa0 0x04 0x00 0x00 0x00")
]
mock_exec.assert_has_calls(mock_calls)
@ -2251,6 +2290,48 @@ class IPMIToolDriverTestCase(Base):
]
mock_exec.assert_has_calls(mock_calls)
@mock.patch.object(boot_mode_utils, 'get_boot_mode_for_deploy',
autospec=True)
@mock.patch.object(ipmi, '_exec_ipmitool', autospec=True)
def test_management_interface_set_boot_device_uefi_and_persistent_smci(
self, mock_exec, mock_boot_mode):
mock_boot_mode.return_value = 'uefi'
mock_exec.return_value = [None, None]
properties = self.node.properties
properties['vendor'] = 'SuperMicro'
self.node.properties = properties
self.node.save()
with task_manager.acquire(self.context, self.node.uuid) as task:
self.management.set_boot_device(task, boot_devices.DISK,
persistent=True)
mock_calls = [
mock.call(self.info, "raw 0x00 0x08 0x03 0x08"),
mock.call(self.info, "raw 0x00 0x08 0x05 0xe0 "
"0x24 0x00 0x00 0x00")
]
mock_exec.assert_has_calls(mock_calls)
@mock.patch.object(boot_mode_utils, 'get_boot_mode_for_deploy',
autospec=True)
@mock.patch.object(ipmi, '_exec_ipmitool', autospec=True)
def test_management_interface_set_boot_device_uefi_and_onetime_smci(
self, mock_exec, mock_boot_mode):
mock_boot_mode.return_value = 'uefi'
mock_exec.return_value = [None, None]
properties = self.node.properties
properties['vendor'] = 'SuperMicro'
self.node.properties = properties
self.node.save()
with task_manager.acquire(self.context, self.node.uuid) as task:
self.management.set_boot_device(task, boot_devices.DISK,
persistent=False)
mock_calls = [
mock.call(self.info, "raw 0x00 0x08 0x03 0x08"),
mock.call(self.info, "raw 0x00 0x08 0x05 0xa0 "
"0x24 0x00 0x00 0x00")
]
mock_exec.assert_has_calls(mock_calls)
def test_management_interface_get_supported_boot_devices(self):
with task_manager.acquire(self.context, self.node.uuid) as task:
expected = [boot_devices.PXE, boot_devices.DISK,
@ -2366,6 +2447,16 @@ class IPMIToolDriverTestCase(Base):
mock_exec.assert_called_once_with(driver_info, "power diag")
self.assertTrue(mock_log.called)
@mock.patch.object(ipmi, '_exec_ipmitool', autospec=True)
def test_detect_vendor(self, mock_exec):
mock_exec.return_value = (MC_INFO.format(vendor='LolCatMeow'),
None)
with task_manager.acquire(self.context, self.node.uuid) as task:
driver_info = ipmi._parse_driver_info(task.node)
vendor = self.management.detect_vendor(task)
mock_exec.assert_called_once_with(driver_info, 'mc info')
self.assertEqual('lolcatmeow', vendor)
def test__parse_ipmi_sensor_data_ok(self):
fake_sensors_data = """
Sensor ID : Temp (0x1)

View File

@ -0,0 +1,37 @@
---
fixes:
- |
Fixes issues when ``UEFI`` boot mode has been requested with persistent
boot to ``DISK`` where some versions of ``ipmitool`` do not properly
handle multiple options being set at the same time. While some of this
logic was addressed in upstream `ipmitool <https://github.com/ipmitool/ipmitool/issues/163>`_
development, new versions are not released and vendors maintain downstream
forks of the ipmitool utility. When considering vendor specific `selector
differences <https://storyboard.openstack.org/#!/story/2008241>`_ along
with the current stance of new versions from the upstream ``ipmitool``
community, it only made sense to handle this logic with-in Ironic.
In part this was because if already set the selector value would not be
updated. Now ironic always transmits the selector value for ``UEFI``.
- Fixes handling of Supermicro ``UEFI`` supporting BMCs with the ``ipmi``
hardware type such that an appropriate boot device selector value is sent
to the remote BMC to indicate boot from local storage. This is available
for both persistent and one-time boot applications. For more information,
please consult `story 2008241 <https://storyboard.openstack.org/#!/story/2008241>`_.
- Fixes handling of the ``ipmi`` hardware type where ``UEFI`` boot mode and
"one-time" boot to PXE has been requested. As Ironic now specifically
transmits the raw commands, this setting should be properly appied where
previously PXE boot operations may have previously occured in
``Legacy BIOS`` mode.
other:
- Adds a ``detect_vendor`` management interface method to the ``ipmi``
hardware type. This method is being promoted as a higher level interface
as the fundimental need to be able to have logic aware of the hardware
vendor is necessary with vendor agnostic drivers where slight differences
require slightly different behavior.
upgrade:
- An automated detection of a IPMI BMC hardware vendor has been added to
appropriately handle IPMI BMC variations. Ironic will now query this and
save this value if not already set in order to avoid querying for
every single operation. Operators upgrading should expect an elongated
first power state synchronization if for nodes with the ``ipmi``
hardware type.