During a recent doc rework it was highlighted that we should stop linking to the ironic spec for the state machine and instead move that information into the real documentation. This patch moves the description of each state from that document into the states documentation alongside the state machine. Change-Id: If0cba97f7d7f6e3b7c4f4009ca61c7bcdec470e0
Ironic's State Machine
State Machine Diagram
The diagram below shows the provisioning states that an Ironic node goes through during its lifetime. The diagram also depicts the events that transition the node to different states.
Stable states are highlighted with a thicker border. All transitions from stable states are initiated by API requests. A few other API-initiated transitions are possible from non-stable states. The events for these API-initiated transitions are indicated with '(via API)'. Internally, the conductor initiates the other transitions (depicted in gray).
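Because only the conductor moves a node out of most non-stable states, an API client typically has to wait for the node to settle before it can request the next transition. The following is a minimal sketch of such a wait loop, not code from ironic: `wait_until_settled` and the `get_state` callable are hypothetical names, and the two state sets are copied from the state descriptions in this document.

```python
import time

# States marked "(stable state)" in the State Descriptions section.
STABLE_STATES = frozenset(
    {"enroll", "manageable", "available", "active", "error"}
)
# Failure states named in this document; a node stays in them until
# an API request moves it on, so the loop stops there as well.
FAILED_STATES = frozenset(
    {"adopt failed", "clean failed", "inspect failed", "deploy failed"}
)


def wait_until_settled(get_state, timeout=600.0, interval=5.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll get_state() until it returns a stable or failure state.

    get_state is a hypothetical callable that returns the node's
    current provision state as a string.
    """
    deadline = clock() + timeout
    while True:
        state = get_state()
        if state in STABLE_STATES or state in FAILED_STATES:
            return state
        if clock() >= deadline:
            raise TimeoutError(f"node still in {state!r} after {timeout} s")
        sleep(interval)


# Usage with a canned sequence of states instead of a real API call.
_observed = iter(["deploying", "wait call-back", "deploying", "active"])
final_state = wait_until_settled(lambda: next(_observed),
                                 sleep=lambda _seconds: None)
```

Passing `sleep` and `clock` as parameters is only a convenience so the loop can be exercised without real delays.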
State Descriptions
- enroll (stable state)
-
This is the state that all nodes start off in when created using API version 1.11 or newer. When a node is in the enroll state, the only thing ironic knows about it is that it exists, and ironic cannot take any further action by itself. Once a node has its driver/interfaces and their required information set in node.driver_info, the node can be transitioned to the verifying state by setting the node's provision state using the manage verb.
- verifying
-
ironic will validate that it can manage the node using the information given in node.driver_info and the driver/hardware type and interfaces it has been assigned. This involves going out and confirming that the credentials work to access whatever node control mechanism they talk to.
- manageable (stable state)
-
Once ironic has verified that it can manage the node using the driver/interfaces and credentials passed in at node create time, the node will be transitioned to the manageable state. From manageable, nodes can transition to:
- manageable (through cleaning) by setting the node's provision state using the clean verb.
- manageable (through inspecting) by setting the node's provision state using the inspect verb.
- available (through cleaning if automatic cleaning is enabled) by setting the node's provision state using the provide verb.
- active (through adopting) by setting the node's provision state using the adopt verb.

manageable is the state that a node should be moved into when any updates need to be made to it, such as changes to fields in driver_info and updates to networking information on ironic ports assigned to the node.

manageable is also the only stable state that can be transitioned to from these failure states:
- adopt failed
- clean failed
- inspect failed
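In ironic's REST API, "setting the node's provision state using a verb" is a PUT request to the node's `states/provision` resource with the verb as the `target`. The sketch below only builds such a request and does not send it. The endpoint URL and node name are placeholders, `provision_request` is a hypothetical helper, and authentication is left out. Some verbs may need extra body fields or a newer API version, which this sketch does not cover.

```python
import json
import urllib.request


def provision_request(endpoint, node, verb, api_version="1.11"):
    """Build (but do not send) a provision-state change request."""
    body = json.dumps({"target": verb}).encode("utf-8")
    return urllib.request.Request(
        url=f"{endpoint}/v1/nodes/{node}/states/provision",
        data=body,
        method="PUT",
        headers={
            "Content-Type": "application/json",
            # Version 1.11 is the one this document names for "enroll".
            "X-OpenStack-Ironic-API-Version": api_version,
        },
    )


# Placeholder endpoint and node name, for illustration only.
req = provision_request("http://ironic.example.com:6385", "node-0", "manage")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Sending the request (for example with `urllib.request.urlopen`) would additionally require a reachable ironic API and valid credentials.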
- inspecting
-
inspecting will utilize node introspection to update hardware-derived node properties to reflect the current state of the hardware. If introspection fails, the node will transition to inspect failed.
- inspect failed
-
This is the state a node will move into when inspection of the node fails. From here the node can be transitioned to:
- inspecting by setting the node's provision state using the inspect verb.
- manageable by setting the node's provision state using the manage verb.
- cleaning
-
Nodes in the cleaning state are being scrubbed and reprogrammed into a known configuration.

When a node is in the cleaning state it means that the conductor is executing the clean step (for out-of-band clean steps) or preparing the environment (building PXE configuration files, configuring DHCP, etc.) to boot the ramdisk for running in-band clean steps.
- clean wait
-
Just like the cleaning state, the nodes in the clean wait state are being scrubbed and reprogrammed. The difference is that in the clean wait state the conductor is waiting for the ramdisk to boot or for the clean step that is running in-band to finish.

The cleaning process of a node in the clean wait state can be interrupted by setting the node's provision state using the abort verb if the task that is running allows it.
- available (stable state)
-
After nodes have been successfully preconfigured and cleaned, they are moved into the available state and are ready to be provisioned. From available, nodes can transition to:
- active (through deploying) by setting the node's provision state using the active verb.
- manageable by setting the node's provision state using the manage verb.
- deploying
-
Nodes in deploying are being prepared to run a workload on them. This consists of running a series of tasks, such as:
- Setting appropriate BIOS configurations.
- Partitioning drives and laying down file systems.
- Creating any additional resources (node-specific network config, a config drive partition, etc.) that may be required by additional subsystems.
- wait call-back
-
Just like the deploying state, the nodes in wait call-back are being deployed. The difference is that in wait call-back the conductor is waiting for the ramdisk to boot or to execute parts of the deployment which need to run in-band on the node (for example, installing the bootloader, or writing the image to the disk).

The deployment of a node in wait call-back can be interrupted by setting the node's provision state using the deleted verb.
- deploy failed
-
This is the state a node will move into when a deployment fails, for example, a timeout waiting for the ramdisk to PXE boot. From here the node can be transitioned to:
- active (through deploying) by setting the node's provision state using either the active or rebuild verbs.
- available (through deleting and cleaning) by setting the node's provision state using the deleted verb.
- active (stable state)
-
Nodes in active have a workload running on them. ironic may collect out-of-band sensor information (including power state) on a regular basis. Nodes in active can transition to:
- available (through deleting and cleaning) by setting the node's provision state using the deleted verb.
- active (through deploying) by setting the node's provision state using the rebuild verb.
- deleting
-
Nodes in the deleting state are being torn down from running an active workload. In deleting, ironic tears down and removes any configuration and resources it added in deploying.
- error (stable state)
-
This is the state a node will move into when deleting an active deployment fails. From error, nodes can transition to:
- available (through deleting and cleaning) by setting the node's provision state using the deleted verb.
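The recovery paths described so far for inspect failed, deploy failed, and error can be collected into a small lookup table, which may help when deciding what an operator tool should offer for a stuck node. This is only an illustrative sketch: `RECOVERY_VERBS` and `recovery_verbs` are hypothetical names, and the table lists only the verbs this document states explicitly (it gives no verbs for adopt failed and clean failed, so they are omitted).

```python
# For each state: verb -> state the text says the node transitions to.
RECOVERY_VERBS = {
    "inspect failed": {
        "inspect": "inspecting",
        "manage": "manageable",
    },
    "deploy failed": {
        "active": "active",      # through deploying
        "rebuild": "active",     # through deploying
        "deleted": "available",  # through deleting and cleaning
    },
    "error": {
        "deleted": "available",  # through deleting and cleaning
    },
}


def recovery_verbs(state):
    """Return the sorted verbs documented for leaving a failed state."""
    return sorted(RECOVERY_VERBS.get(state, {}))


print(recovery_verbs("deploy failed"))
print(recovery_verbs("error"))
```

States that are not in the table simply yield an empty list, so callers can treat "no documented recovery verb" uniformly.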
- adopting
-
This state allows ironic to take over management of a baremetal node with an existing workload on it. Ordinarily when a baremetal node is enrolled and managed by ironic, it must transition through cleaning and deploying to reach the active state. However, baremetal nodes that already have a workload on them do not need to be deployed or cleaned again, so this transition allows these nodes to move directly from manageable to active.
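To tie the descriptions together, the API-initiated transitions out of the stable states can be written down as a table and searched. The sketch below is an illustration built only from the transitions listed in this document, not ironic's actual state machine code; `TRANSITIONS` and `verbs_between` are hypothetical names.

```python
from collections import deque

# (stable state, verb) -> (transient states passed through, resulting state)
TRANSITIONS = {
    ("enroll", "manage"): (("verifying",), "manageable"),
    ("manageable", "clean"): (("cleaning",), "manageable"),
    ("manageable", "inspect"): (("inspecting",), "manageable"),
    ("manageable", "provide"): (("cleaning",), "available"),
    ("manageable", "adopt"): (("adopting",), "active"),
    ("available", "active"): (("deploying",), "active"),
    ("available", "manage"): ((), "manageable"),
    ("active", "deleted"): (("deleting", "cleaning"), "available"),
    ("active", "rebuild"): (("deploying",), "active"),
    ("error", "deleted"): (("deleting", "cleaning"), "available"),
}


def verbs_between(start, goal):
    """Breadth-first search for a shortest verb sequence, or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, verbs = queue.popleft()
        if state == goal:
            return verbs
        for (src, verb), (_via, dst) in TRANSITIONS.items():
            if src == state and dst not in seen:
                seen.add(dst)
                queue.append((dst, verbs + [verb]))
    return None


print(verbs_between("enroll", "available"))  # ['manage', 'provide']
print(verbs_between("enroll", "active"))     # ['manage', 'adopt']
print(verbs_between("active", "enroll"))     # None
```

The search returns the adopt path from enroll to active because it is the shortest one in the table. A node without an existing workload would instead go through available, using manage, provide, and then active. The table also assumes every transition succeeds, so the failure states are not modeled.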