ironic/doc/source/admin/rescue.rst
Shivanand Tendulker 545b4fd363 Add admin documentation for rescue interface
This commit adds admin documentation for rescue interface
introduced in API version 1.38

Change-Id: I02776ba4028de3b93ba1bccec59601d28648fd0a
Partial-Bug: #1526449
2018-09-26 14:10:10 +00:00

3.6 KiB

Rescue Mode

Overview

The Bare Metal Service supports putting nodes in rescue mode using hardware types that support rescue interfaces. The hardware types utilizing ironic-python-agent with PXE/Virtual Media based boot interface can support rescue operation when configured appropriately.

Note

The rescue operation is currently supported only when tenant networks use DHCP to obtain IP addresses.

Rescue operation can be used to boot nodes into a rescue ramdisk so that the rescue user can access the node, in order to provide the ability to access the node in case access to OS is not possible. For example, if there is a need to perform manual password reset or data recovery in the event of some failure, rescue operation can be used.

Configuring The Bare Metal Service

Configure the Bare Metal Service appropriately so that the service has the information needed to boot the ramdisk before a user tries to initiate rescue operation. This will differ somewhat between different deploy environments, but an example of how to do this is outlined below:

  1. Create and configure ramdisk that supports rescue operation. The ramdisk types that supports rescue operation is ironic-python-agent with CoreOS/tinyIPA and DIB based ramdisk. Please see /install/deploy-ramdisk for detailed instructions to build a ramdisk.

  2. Configure a network to use for booting nodes into the rescue ramdisk in neutron, and note the UUID or name of this network. This is required if you're using the neutron DHCP provider and have Bare Metal Service managing ramdisk booting (the default). This can be the same network as your cleaning or tenant network (for flat network). For an example of how to configure new networks with Bare Metal Service, see the /install/configure-networking documentation.

  3. Add the unique name or UUID of your rescue network to ironic.conf:

    [neutron]
    rescuing_network=<RESCUE_UUID_OR_NAME>

    Note

    This can be set per node via driver_info['rescuing_network']

  4. Restart the ironic conductor service.

  5. Specify a rescue kernel and ramdisk or rescue ISO compatible with the node's driver for pxe based boot interface or virtual-media based boot interface respectively.

    Example for pxe based boot interface:

    openstack baremetal node set $NODE_UUID \
        --driver-info rescue_ramdisk=$RESCUE_INITRD_UUID \
        --driver-info rescue_kernel=$RESCUE_VMLINUZ_UUID

    See /install/configure-glance-images for details. If you are not using Image service, it is possible to provide images to Bare Metal service via hrefs.

After this, The Bare Metal Service should be ready for rescue operation. Test it out by attempting to rescue an active node and connect to the instance using ssh, as given below:

openstack baremetal node rescue $NODE_UUID \
    --rescue-password <PASSWORD> --wait

ssh rescue@$INSTANCE_IP_ADDRESS

To move a node back to active state after using rescue mode you can use unrescue. Please unmount any filesystems that were manually mounted before proceeding with unrescue. The node unrescue can be done as given below:

openstack baremetal node unrescue $NODE_UUID

rescue and unrescue operations can also be triggered via the Compute Service using the following commands:

openstack server rescue --password <password> <server>

openstack server unrescue <server>