Merge "Add Redfish support to Maintenance"
commit 3bb3080439

==================================
Add Redfish support to Maintenance
==================================

Storyboard: https://storyboard.openstack.org/#!/story/2005861

This story adds ``Redfish Platform Management`` support to StarlingX
Maintenance as a prioritized alternative to the existing, less secure
IPMI support for the following board management functions:

* Reset and Power On/Off Control
* Network Boot Override
* Sensor Monitoring

Problem description
===================

StarlingX Maintenance currently uses ``ipmitool`` to invoke board management
functions. However, IPMI is an aging standard that is not evolving with the
server market.

``Redfish`` is an emerging, well-defined Platform Management Application
Programming Interface (API) standard that leverages modern software, is more
secure, and is easier to use and understand than IPMI.

The Redfish API uses the HTTP protocol over a TCP/IP network, with either JSON
or XML data schemas, so that it can leverage common Internet and web services
standards and modern tool chains. This allows new board management services
to be added for modern host servers to meet today's system administrator
demands.

Redfish offers a single root endpoint that expands to reveal a well-structured
hierarchy of service, system, chassis and management endpoints. These are
accessed in user sessions and/or single-shot command operations to manage and
monitor the hardware in polled and event-driven models.

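As a rough illustration of the hierarchy described above, the sketch below
walks an abbreviated service root document and lists the top-level resources
it links to. The ``/redfish/v1/`` root path and the ``@odata.id`` link
property come from the Redfish specification; the sample document is
hand-written rather than captured from a real BMC, and ``top_level_links`` is
a hypothetical helper name.

```python
import json

# Abbreviated, hand-written example of a Redfish service root document.
# A real BMC returns more properties; this is not captured output.
SAMPLE_ROOT = json.loads("""
{
  "@odata.id": "/redfish/v1/",
  "Systems": {"@odata.id": "/redfish/v1/Systems"},
  "Chassis": {"@odata.id": "/redfish/v1/Chassis"},
  "Managers": {"@odata.id": "/redfish/v1/Managers"},
  "SessionService": {"@odata.id": "/redfish/v1/SessionService"}
}
""")


def top_level_links(root):
    """Return {name: path} for each property that links to another resource."""
    return {
        name: value["@odata.id"]
        for name, value in root.items()
        if isinstance(value, dict) and "@odata.id" in value
    }


if __name__ == "__main__":
    for name, path in sorted(top_level_links(SAMPLE_ROOT).items()):
        print(f"{name}: {path}")
```

Each returned path can be queried in turn to reach the system, chassis and
manager resources that the board management functions operate on.
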
Use Cases
---------

System developers, testers, operators, administrators and auto-provisioning
tools need the ability to power on, power off and reset hosts, as well as
force hosts to boot from the network during installation activities.

High availability products such as StarlingX also need the ability to monitor
the health of their host server pool so that they can notify system
administrators or system orchestrators of pending or immediate
service-affecting hardware failures for proactive action and service
migrations.

Proposed change
===============

Maintenance shall continue with the existing centralized power/reset control
and sensor monitoring model.

Integrate the BSD-licensed Redfishtool into the load and use it similarly to
how ``ipmitool`` is used today: Maintenance launches a thread that runs the
tool as a system command with hidden credentials and reports execution status
to the main process as a JSON string.

Maintain the existing ipmitool solution for hosts that do not support Redfish.

A common Redfish root query will be implemented and called upon BMC
provisioning notification to Maintenance (mtcAgent) and the Hardware
Monitor (hwmond).

If that query indicates support for ``Redfish`` then all BMC access to that
host will be done using the new Redfish tool and managed by the associated
content added by this feature. Otherwise, the current ipmitool method will be
used. This way Redfish management takes priority over IPMI.

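The selection rule above can be sketched as follows. This is only an
illustration of "Redfish first, IPMI as fallback", not the mtcAgent or hwmond
code. It assumes the root query hands back the raw response text (or nothing
on failure) and that a service root reporting ``RedfishVersion`` is sufficient
evidence of Redfish support; ``BmcProtocol`` and ``select_bmc_protocol`` are
hypothetical names.

```python
import json
from enum import Enum


class BmcProtocol(Enum):
    REDFISH = "redfish"
    IPMI = "ipmi"


def select_bmc_protocol(root_response):
    """Pick Redfish if the root query returned a usable service root, else IPMI.

    root_response is the raw text returned by the root query, or None if the
    query failed (timeout, connection refused, BMC without Redfish, ...).
    """
    if not root_response:
        return BmcProtocol.IPMI
    try:
        root = json.loads(root_response)
    except ValueError:
        # Not JSON at all, so treat the BMC as IPMI-only.
        return BmcProtocol.IPMI
    # Assumption: a service root that reports RedfishVersion is good enough.
    if isinstance(root, dict) and root.get("RedfishVersion"):
        return BmcProtocol.REDFISH
    return BmcProtocol.IPMI


if __name__ == "__main__":
    print(select_bmc_protocol('{"RedfishVersion": "1.6.0"}'))
    print(select_bmc_protocol(None))
```

Because every failure path falls through to IPMI, hosts without Redfish keep
working exactly as they do today.
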
Aside from the work to integrate Redfishtool into the load, all changes for
this feature are restricted to two maintenance daemons: ``mtcAgent`` and
``hwmond``.

The implementation model for this Redfish support follows what is currently
done for ipmitool. For each request, launch the tool thread to run the system
command that makes the Redfish request, then interpret the response and pass
pertinent data back to the main process in a formatted JSON string.

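A minimal sketch of that thread-per-request model is shown below. It is
written in Python for brevity, whereas the maintenance daemons are C++
processes, so it illustrates the flow rather than the implementation. The
``status`` and ``data`` field names, the 60 second timeout and the helper
names are assumptions, and a stand-in command is used so the sketch runs
without a BMC or redfishtool installed.

```python
import json
import subprocess
import sys
import threading


def run_tool(argv, result):
    """Run a command and store a JSON status string in result["json"]."""
    try:
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=60)
        status = {"status": proc.returncode, "data": proc.stdout.strip()}
    except (OSError, subprocess.TimeoutExpired) as exc:
        # The tool could not be launched or did not finish in time.
        status = {"status": -1, "data": str(exc)}
    result["json"] = json.dumps(status)


def request(argv):
    """Launch the tool in a worker thread and return its JSON status string."""
    result = {}
    worker = threading.Thread(target=run_tool, args=(argv, result))
    worker.start()
    # This sketch blocks here; a daemon would poll for completion instead
    # so that its main loop is never stalled.
    worker.join()
    return result["json"]


if __name__ == "__main__":
    # Stand-in command in place of a real redfishtool invocation.
    print(request([sys.executable, "-c", "print('ok')"]))
```

Keeping the result as a single JSON string means the main process only needs
one parsing path regardless of which tool produced the response.
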
There is very little change to the main mtcAgent and hwmond processes.
There are no changes to StarlingX System Inventory (sysinv).
There are no changes to BMC provisioning.

Alternatives
------------

An alternative to using the open source Redfishtool is to implement an HTTP
agent that conforms to the DMTF Redfish Scalable Platforms Management API
Specification (DSP0266) with the ability to initiate and handle success and
failure responses for System Reset, System setBootOverride as well as Chassis
Power and Thermal targets for sensor monitoring.

Such an agent would require a back-end interface that the StarlingX
Maintenance and Hardware Monitor processes could bind into for orchestration
purposes.

The work involved to implement this alternative is extensive and could require
ongoing updates as the Redfish API evolves.

Data model impact
-----------------

If a host represents its sensors differently in name or type between its
IPMI and Redfish services then the sensor model for that host may have to
be relearned.

Fortunately, the Hardware Monitor already supports a sensor model relearn
function. It exists to support BMC and SDR firmware upgrades but also serves
feature patch cases.

The sensor model relearn is:

* automatic over a ``hwmond`` process restart if the detected model differs
  from the model stored in system inventory.
* manual using the ``system host-sensorgroup-relearn`` CLI command or by
  pressing the relearn button on the Host's Sensor tab in Horizon.

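The automatic relearn condition can be pictured with the sketch below. It
assumes a sensor model is characterized by the set of (name, type) pairs of
its sensors; how ``hwmond`` actually compares models is not described here.
The sensor names and the ``sensor_model`` and ``needs_relearn`` helpers are
made up for illustration.

```python
def sensor_model(sensors):
    """Reduce a sensor list to the (name, type) pairs that define the model."""
    return frozenset((s["name"], s["type"]) for s in sensors)


def needs_relearn(stored, detected):
    """True when the detected model differs in any sensor name or type."""
    return sensor_model(stored) != sensor_model(detected)


if __name__ == "__main__":
    # Hypothetical case: the same physical sensor named differently by the
    # BMC's IPMI and Redfish services.
    stored = [{"name": "Temp_CPU0", "type": "temperature"}]
    detected = [{"name": "CPU0 Temp", "type": "temperature"}]
    print(needs_relearn(stored, detected))
```

Comparing sets rather than lists keeps the check independent of the order in
which a BMC happens to report its sensors.
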
REST API impact
---------------

None. This story does not change any existing REST APIs.

Security impact
---------------

A primary design goal in the development of Redfish was to offer improved
platform management security compared to existing solutions such as IPMI.

The Redfish API supports two authentication methods:

* Basic Authentication
* Token Authentication

This feature makes its sparse and infrequent requests using Basic
authentication. Token authentication would add complexity that such sparse
requests do not justify.

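For readers unfamiliar with HTTP Basic authentication, the sketch below shows
what it amounts to on the wire: a single ``Authorization`` header carrying the
base64 encoding of ``username:password``. Redfishtool builds this header
itself, so Maintenance does not need code like this; ``basic_auth_header`` is
a hypothetical helper and the credentials are the well-known example pair
from the HTTP authentication RFCs.

```python
import base64


def basic_auth_header(username, password):
    """Build the HTTP Basic Authorization header for the given credentials."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


if __name__ == "__main__":
    print(basic_auth_header("Aladdin", "open sesame"))
```

Base64 is an encoding, not encryption, so the confidentiality of Basic
authentication depends on the request being sent over an encrypted transport.
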
Security features built into Redfish are described in the Redfish Scalable
Platforms Management API Specification:
https://www.dmtf.org/sites/default/files/standards/documents/DSP0266_1.6.0.pdf

The United States Department of Homeland Security warns of the security
vulnerabilities of IPMI: https://www.us-cert.gov/ncas/alerts/TA13-207A

Other end user impact
---------------------

None.

Performance Impact
------------------

Any performance impact from the introduction of this feature is negligible
for the following reasons:

* the current method uses ipmitool while this feature uses redfishtool in a
  very similar way.
* both methods invoke the tool as a thread to avoid blocking the main process.
* maintenance actions are rare, on-demand only and performed while the host
  is locked.
* sensor monitoring is periodic with a cadence in minutes, not seconds.
* the only impact would come from the difference between the two open
  source tools, and prototype testing demonstrated comparable performance.
* both ipmitool and redfishtool command execution were measured with ``time``
  and found to be comparable.

Other deployer impact
---------------------

This feature introduces a new RPM: redfishtool.
If this feature were to be patched back to an earlier release then that
redfishtool RPM would also have to be patched back.

If this feature is patched back to an earlier release or patched into a
current release then:

* the mtcAgent process would have to be restarted.
* the hwmond process would have to be restarted.

Developer impact
----------------

This feature has no impact on other developers working on StarlingX.

Upgrade impact
--------------

None currently, as this is the initial implementation of Redfish support.

Newer versions of Redfishtool can be introduced if integration testing of the
newer version verifies that the currently used command line options and the
underlying behavior relied upon pass the test cases listed in the ``Testing``
section below.

If a newer version of redfishtool is required and has functionally impacting
changes then maintenance will have to query the redfishtool version and behave
as required by the detected version. ``redfishtool -V`` prints the redfishtool
version.

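Should version-dependent behavior ever be needed, the check could look
something like the sketch below. The exact text printed by ``redfishtool -V``
is not specified here, so the sketch assumes only that the output contains a
dotted version number somewhere; the sample strings and the
``parse_version`` and ``at_least`` helpers are hypothetical.

```python
import re


def parse_version(text):
    """Return the first dotted version in text as a tuple of ints, or None."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def at_least(text, minimum):
    """True if the version found in text is >= the minimum version tuple."""
    version = parse_version(text)
    return version is not None and version >= minimum


if __name__ == "__main__":
    # Made-up sample of tool output; the real format may differ.
    sample = "redfishtool: Version 1.0.8"
    print(parse_version(sample), at_least(sample, (1, 1, 0)))
```

Comparing integer tuples avoids the pitfalls of string comparison, where
``"1.10.0"`` would sort before ``"1.9.0"``.
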
Implementation
==============

Assignee(s)
-----------

Primary assignee:
  Eric MacDonald

Other contributors:
  Zhipeng Liu

Repos Impacted
--------------

* stx-integ - adding redfishtool
* stx-metal - updating maintenance with Redfish support

Work Items
----------

redfish - stx-integ/bmc/Redfishtool

* create patched RPM package and include it on controllers.
* create patch that adds the unimplemented cfgFile support for hiding
  credentials.
* push cfgFile support upstream.
* create patch that makes redfishtool support python-2, to be removed once
  StarlingX supports python-3.

Maintenance Common - stx-metal/mtce-common/src/common

* create common redfishUtil.cpp/h for similar purpose/function to the
  existing ipmiUtil.cpp/h for use with both hwmond and mtcAgent.

Maintenance - stx-metal/mtce/src/maintenance - mtcAgent process

* create mtcRedFishUtil.cpp/h for similar purpose/function to the existing
  mtcIpmiUtil.cpp/h for sending and receiving Redfishtool requests for
  maintenance power and reset control, power status and hw/fw version query.
* enhance mtcThread.cpp/h with mtcThread_redfishtool request support, similar
  to the existing mtcThread_ipmitool thread, to handle redfishtool
  requests and responses as a thread.

Hardware Monitor - stx-metal/mtce/src/hwmon - hwmond process

* create hwmonRedFish.cpp/h for similar purpose/function to the existing
  hwmonIpmi.cpp/h for parsing sensor query responses into a common format
  for the hardware monitor sensor manager engine.
* enhance hwmonThreads.cpp/h with new hwmonThread_redfishtool request support
  similar to the existing mtcThread_ipmitool pthread.

Dependencies
============

This specification depends upon the open source Redfishtool.

https://github.com/DMTF/Redfishtool

Testing
=======

This feature can be tested in a fully provisioned duplex StarlingX system
with Redfish-capable hosts that have their BMC provisioned through system
inventory. The host being tested is referred to below as the UUT (unit under
test).

* With a host's BMC provisioned, verify that the mtcAgent and hwmond processes
  on the active controller each report a log stating that the UUT host is
  being managed by Redfish rather than IPMI.
* With the UUT host locked, perform a Reset action and verify the host
  experiences a graceful shutdown followed by a reboot.
* With the UUT host locked and online, perform a Power-Off action and verify
  the host experiences a graceful shutdown followed by a power-off.
* With the UUT host locked and powered off, perform a Power-On action and
  verify the host powers on and starts to boot.
* With the UUT host locked and powered off with a bootable image on disk,
  perform a ReInstall action and verify that the host gets powered on and
  reinstalls a new image from the controller.
* With the UUT host, verify sensor monitoring by viewing the sensor groups and
  sensors list from Horizon and with CLI commands.

Documentation Impact
====================

This feature change has no customer visible impact.
This feature change requires no customer documentation update.

References
==========

Redfish was developed by the DMTF (Distributed Management Task Force), led by
a diverse board of directors and contributors from many of the major
technology companies such as Intel, Dell, HP, Hitachi, Lenovo and VMware.

The Redfish Platform Management Application Programming Interface (API)
standard and supporting specifications can be found at the following URL.

https://www.dmtf.org/standards/redfish

History
=======

.. list-table:: Revisions
   :header-rows: 1

   * - Release Name
     - Description
   * - 2019.11
     - Introduced