diff --git a/doc/source/admin/cleaning.rst b/doc/source/admin/cleaning.rst
index 85d5fde3e9..8ef672b562 100644
--- a/doc/source/admin/cleaning.rst
+++ b/doc/source/admin/cleaning.rst
@@ -25,10 +25,10 @@ automated cleaning on the node to ensure it's ready for another workload. This
 ensures the tenant will get a consistent bare metal node deployed every time.
 
 Ironic implements automated cleaning by collecting a list of cleaning steps
-to perform on a node from the Power, Deploy, Management, and RAID interfaces
-of the driver assigned to the node. These steps are then ordered by priority
-and executed on the node when the node is moved
-to ``cleaning`` state, if automated cleaning is enabled.
+to perform on a node from the Power, Deploy, Management, BIOS, and RAID
+interfaces of the driver assigned to the node. These steps are then ordered by
+priority and executed on the node when the node is moved to ``cleaning`` state,
+if automated cleaning is enabled.
 
 With automated cleaning, nodes move to ``cleaning`` state when moving from
 ``active`` -> ``available`` state (when the hardware is recycled from one
@@ -63,7 +63,7 @@ Cleaning steps
 Cleaning steps used for automated cleaning are ordered from higher to lower
 priority, where a larger integer is a higher priority. In case of a conflict
 between priorities across interfaces, the following resolution order is used:
-Power, Management, Deploy, and RAID interfaces.
+Power, Management, Deploy, BIOS, and RAID interfaces.
 
 You can skip a cleaning step by setting the priority for that cleaning step
 to zero or 'None'.
@@ -236,6 +236,7 @@ across hardware interfaces, the following resolution order is used:
 #. Power interface
 #. Management interface
 #. Deploy interface
+#. BIOS interface
 #. RAID interface
 
 For manual cleaning, the cleaning steps should be specified in the desired
diff --git a/doc/source/admin/deploy-steps.rst b/doc/source/admin/deploy-steps.rst
new file mode 100644
index 0000000000..22bd87b98d
--- /dev/null
+++ b/doc/source/admin/deploy-steps.rst
@@ -0,0 +1,60 @@
+============
+Deploy steps
+============
+
+Overview
+========
+
+Node deployment is performed by the Bare Metal service to prepare a node for
+use by a workload. The exact work flow used depends on a number of factors,
+including the hardware type and interfaces assigned to a node.
+
+Customizing deployment
+======================
+
+The Bare Metal service implements deployment by collecting a list of deploy
+steps to perform on a node from the Power, Deploy, Management, BIOS, and RAID
+interfaces of the driver assigned to the node. These steps are then ordered by
+priority and executed on the node when the node is moved to the ``deploying``
+state.
+
+Nodes move to the ``deploying`` state when attempting to move to the ``active``
+state (when the hardware is prepared for use by a workload). For a full
+understanding of all state transitions into deployment, please see
+:ref:`states`.
+
+The Bare Metal service added support for deploy steps in the Rocky release.
+
+Deploy steps
+------------
+
+Deploy steps are ordered from higher to lower priority, where a larger integer
+is a higher priority. If the same priority is used by deploy steps on different
+interfaces, the following resolution order is used: Power, Management, Deploy,
+BIOS, and RAID interfaces.
+
+FAQ
+===
+
+What deploy step is running?
+----------------------------
+To check what deploy step the node is performing or attempted to perform and
+failed, run the following command; it will return the value in the node's
+``driver_internal_info`` field::
+
+    openstack baremetal node show $node_ident -f value -c driver_internal_info
+
+The ``deploy_steps`` field will contain a list of all remaining steps with
+their priorities, and the first one listed is the step currently in progress or
+that the node failed before going into ``deploy failed`` state.
+
+Troubleshooting
+===============
+If deployment fails on a node, the node will be put into the ``deploy failed``
+state until the node is deprovisioned. A deprovisioned node is moved to the
+``available`` state after the cleaning process has been performed successfully.
+
+Strategies for determining why a deploy step failed include checking the ironic
+conductor logs, checking logs from the ironic-python-agent that have been
+stored on the ironic conductor, or performing general hardware troubleshooting
+on the node.
diff --git a/doc/source/admin/index.rst b/doc/source/admin/index.rst
index 42ccfd7ba9..b3ac27151d 100644
--- a/doc/source/admin/index.rst
+++ b/doc/source/admin/index.rst
@@ -11,6 +11,7 @@ the services.
    Drivers, Hardware Types and Hardware Interfaces
    Ironic Python Agent
    Node Hardware Inspection
+   Deploy steps
    Node Cleaning
    Node Adoption
    RAID Configuration
diff --git a/doc/source/contributor/code-contribution-guide.rst b/doc/source/contributor/code-contribution-guide.rst
index 4026a0830d..1a7c50914b 100644
--- a/doc/source/contributor/code-contribution-guide.rst
+++ b/doc/source/contributor/code-contribution-guide.rst
@@ -232,6 +232,8 @@ Here is the list of existing common and agent driver attributes:
   * ``is_whole_disk_image``: A Boolean value to indicate whether the user image contains ramdisk/kernel.
   * ``clean_steps``: An ordered list of clean steps that will be performed on the node.
+  * ``deploy_steps``: An ordered list of deploy steps that will be performed on the node. Support for
+    deploy steps was added in the ``11.1.0`` release.
   * ``instance``: A list of dictionaries containing the disk layout values.
   * ``root_uuid_or_disk_id``: A String value of the bare metal node's root partition uuid or disk id.
   * ``persistent_boot_device``: A String value of device from ``ironic.common.boot_devices``.
diff --git a/doc/source/contributor/drivers.rst b/doc/source/contributor/drivers.rst
index 519a58a7b2..7c9caa1b41 100644
--- a/doc/source/contributor/drivers.rst
+++ b/doc/source/contributor/drivers.rst
@@ -49,6 +49,15 @@ The minimum required interfaces are:
   A few common implementations are provided by the ``GenericHardware`` base class.
 
+  As of the Rocky release, a deploy interface should decorate its deploy method
+  to indicate that it is a deploy step. Conventionally, the deploy method uses
+  a priority of 100.
+
+  .. code-block:: python
+
+     @ironic.drivers.base.deploy_step(priority=100)
+     def deploy(self, task):
+
   .. note:: Most of the hardware types should not override this interface.
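
The ordering rule this change documents (higher priority first, ties broken in the order Power, Management, Deploy, BIOS, RAID) can be illustrated with a small standalone sketch. This is not Ironic's implementation and does not import Ironic; the `Step` tuple, the `order_steps` helper, and the step names are hypothetical and exist only to show the rule.

```python
from typing import List, NamedTuple

# Tie-break order as stated in the documentation added by this change.
INTERFACE_ORDER = ["power", "management", "deploy", "bios", "raid"]


class Step(NamedTuple):
    """Hypothetical stand-in for a deploy step; not an Ironic type."""
    interface: str
    name: str
    priority: int


def order_steps(steps: List[Step]) -> List[Step]:
    """Sort by descending priority, then by interface resolution order."""
    return sorted(
        steps,
        key=lambda s: (-s.priority, INTERFACE_ORDER.index(s.interface)),
    )


# Step names below are made up for illustration only.
steps = [
    Step("raid", "example_raid_step", 100),
    Step("deploy", "deploy", 100),
    Step("bios", "example_bios_step", 50),
    Step("power", "example_power_step", 100),
]
ordered = order_steps(steps)
for step in ordered:
    print(step.priority, step.interface, step.name)
```

Three steps share priority 100, so they are emitted in interface order (power, deploy, raid), followed by the lower-priority BIOS step.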