diff --git a/doc/source/post_deployment/delete_nodes.rst b/doc/source/post_deployment/delete_nodes.rst
index 47e1184e..a356f9e2 100644
--- a/doc/source/post_deployment/delete_nodes.rst
+++ b/doc/source/post_deployment/delete_nodes.rst
@@ -18,8 +18,9 @@ IDs (which represent nodes) to be deleted.
 changes to the overcloud.
 
 .. note::
-   Before deleting a compute node please make sure that the node is quiesced,
-   see :ref:`quiesce_compute`.
+   Before deleting a compute node or a cephstorage node, please make sure that
+   the node is quiesced, see :ref:`quiesce_compute` or
+   :ref:`quiesce_cephstorage`.
 
 .. note::
    A list of nova instance IDs can be listed with command::
diff --git a/doc/source/post_deployment/post_deployment.rst b/doc/source/post_deployment/post_deployment.rst
index 22ea03c7..0b2caa95 100644
--- a/doc/source/post_deployment/post_deployment.rst
+++ b/doc/source/post_deployment/post_deployment.rst
@@ -10,6 +10,7 @@ In this chapter you will find advanced management of various |project| areas.
    scale_roles
    delete_nodes
    quiesce_compute
+   quiesce_cephstorage
    vm_snapshot
    package_update
    upgrade
diff --git a/doc/source/post_deployment/quiesce_cephstorage.rst b/doc/source/post_deployment/quiesce_cephstorage.rst
new file mode 100644
index 00000000..9e75be91
--- /dev/null
+++ b/doc/source/post_deployment/quiesce_cephstorage.rst
@@ -0,0 +1,68 @@
+.. _quiesce_cephstorage:
+
+Quiescing a CephStorage Node
+============================
+
+Quiescing a cephstorage node means informing the Ceph cluster that one or
+more OSDs will be permanently removed, so that the node can be shut down
+without affecting data availability.
+
+Take the OSDs out of the cluster
+--------------------------------
+
+Before you remove an OSD, you need to take it out of the cluster so that Ceph
+can begin rebalancing and copying its data to other OSDs. Running the following
+commands on a given cephstorage node will migrate all data off the OSDs hosted
+on it::
+
+    OSD_IDS=$(ls /var/lib/ceph/osd | awk 'BEGIN { FS = "-" } ; { print $2 }')
+    for OSD_ID in $OSD_IDS; do ceph osd crush reweight osd.$OSD_ID 0.0; done
+
+Ceph will begin rebalancing the cluster by migrating placement groups out of
+the OSDs. You can observe this process with the ceph tool::
+
+    ceph -w
+
+You should see the placement group states change from active+clean to active,
+some degraded objects, and finally active+clean when migration completes.
+
+Removing the OSDs
+-----------------
+
+After the rebalancing, the OSDs will still be running. Running the following on
+that same cephstorage node will stop all OSDs hosted on it, remove them from the
+CRUSH map and the OSD map, and delete their authentication keys::
+
+    OSD_IDS=$(ls /var/lib/ceph/osd | awk 'BEGIN { FS = "-" } ; { print $2 }')
+    for OSD_ID in $OSD_IDS; do
+        ceph osd out $OSD_ID
+        systemctl stop ceph-osd@$OSD_ID
+        ceph osd crush remove osd.$OSD_ID
+        ceph auth del osd.$OSD_ID
+        ceph osd rm $OSD_ID
+    done
+
+.. admonition:: Mitaka
+   :class: mitaka
+
+   TripleO/Mitaka uses and supports Ceph/Hammer, not Jewel. Hammer does not
+   use systemd but SysV init scripts, so for Mitaka the systemctl command
+   above which stops the OSD should be replaced by::
+
+     service ceph stop osd.$OSD_ID
+
+You are now free to reboot or shut down the node (using the Ironic API), or
+even remove it from the overcloud altogether by scaling down the overcloud
+deployment, see :ref:`delete_nodes`.
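+
+Before powering the node off, you can verify that its OSDs are gone from the
+cluster and that Ceph reports a healthy state::
+
+    ceph osd tree
+    ceph status
+
+As an example, assuming the node is registered in Ironic under the name
+``overcloud-cephstorage-0``, it can then be powered off from the undercloud
+with a command such as::
+
+    openstack baremetal node power off overcloud-cephstorage-0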