diff --git a/userdocs/fuel-user-guide/maintain-environment.rst b/userdocs/fuel-user-guide/maintain-environment.rst
index c4c25a199..05cbde2e0 100644
--- a/userdocs/fuel-user-guide/maintain-environment.rst
+++ b/userdocs/fuel-user-guide/maintain-environment.rst
@@ -22,3 +22,5 @@ This section includes the following topics:
    maintain-environment/data-driven.rst
    maintain-environment/deployment-history.rst
    maintain-environment/deployment-information.rst
+   maintain-environment/shutdown-env.rst
+   maintain-environment/start-env.rst
diff --git a/userdocs/fuel-user-guide/maintain-environment/shutdown-env.rst b/userdocs/fuel-user-guide/maintain-environment/shutdown-env.rst
new file mode 100644
index 000000000..ae0b1660a
--- /dev/null
+++ b/userdocs/fuel-user-guide/maintain-environment/shutdown-env.rst
@@ -0,0 +1,117 @@
+.. _shutdown-env:
+
+==================================
+Shut down an OpenStack environment
+==================================
+
+This section provides the recommended process for shutting down an entire
+OpenStack environment. You may need to shut down an OpenStack environment
+if you want to perform maintenance or recovery procedures. The shutdown
+process involves stopping all OpenStack virtual machines, the Fuel Master
+node, compute, controller, and other nodes in a defined order. Adhering
+to the procedure ensures that the shutdown process is performed gracefully
+and mitigates the risks of failure during a subsequent start of
+the environment.
+
+**To shut down an entire OpenStack environment:**
+
+#. Shut down the OpenStack virtual machines gracefully through either
+   Horizon or CLI.
+
+   Verify whether any virtual machines in your OpenStack environment require
+   a customized shutdown procedure or a special shutdown sequence across
+   several virtual machines. Shut down or suspend these instances gracefully.
+
+#. Shut down compute nodes.
+
+   Log in to each compute node as administrator and type:
+
+   .. code-block:: console
+
+      poweroff
+
+   .. note:: All compute nodes may be shut down at the same time.
+
+   .. warning::
+
+      If a node combines more than one role, you may need to perform
+      additional steps, such as setting the ``noout`` flag for Ceph OSD nodes,
+      before you can shut down the node.
+
+#. Shut down Ceph OSD nodes:
+
+   #. Set the ``noout`` flag to prevent Ceph from starting a rebalance, which
+      can be triggered by a delay between Ceph nodes powering off:
+
+      #. Log in to any controller node or any Ceph OSD node and type:
+
+         .. code-block:: console
+
+            ceph osd set noout
+
+      #. Verify the ``noout`` flag is set:
+
+         .. code-block:: console
+
+            ceph -s
+
+         The output of the command above should show the ``noout`` flag
+         in the health status.
+
+   #. On each Ceph OSD node, type:
+
+      .. code-block:: console
+
+         poweroff
+
+#. Shut down controller nodes by powering them off sequentially.
+
+   The estimated duration of a single controller node shutdown is 30 minutes.
+   However, it may take longer depending on the configuration. For most of
+   that time, the system shows the ``unload corosync services`` message.
+
+   Assuming your environment contains 3 controller nodes, shut them down
+   as follows:
+
+   #. Put the Pacemaker cluster in maintenance mode:
+
+      .. code-block:: console
+
+         pcs property set maintenance-mode=true
+
+   #. Log in to the Controller-03 node as Administrator and type:
+
+      .. code-block:: console
+
+         poweroff
+
+      Wait until the node completes the poweroff procedure, then wait a few
+      more minutes to let Pacemaker/Corosync redistribute services between
+      the remaining controller nodes.
+
+   #. Log in to the Controller-02 node as Administrator and type:
+
+      .. code-block:: console
+
+         poweroff
+
+      Wait until the node completes the poweroff procedure, then wait a few
+      more minutes to let Pacemaker/Corosync stop all services on the single
+      remaining controller node because quorum is lost.
+
+   #. Log in to the Controller-01 node as Administrator and type:
+
+      .. code-block:: console
+
+         poweroff
+
+#. Shut down the Fuel Master node. Log in to the Fuel Master node CLI and
+   type:
+
+   .. code-block:: console
+
+      poweroff
+
+#. Shut down any remaining nodes in your environment.
+#. If required, shut down the networking infrastructure.
+#. To start an environment, proceed to :ref:`start-env`.
\ No newline at end of file
diff --git a/userdocs/fuel-user-guide/maintain-environment/start-env.rst b/userdocs/fuel-user-guide/maintain-environment/start-env.rst
new file mode 100644
index 000000000..4f4539994
--- /dev/null
+++ b/userdocs/fuel-user-guide/maintain-environment/start-env.rst
@@ -0,0 +1,92 @@
+.. _start-env:
+
+==============================
+Start an OpenStack environment
+==============================
+
+To resume an OpenStack environment after it has been shut down, you need
+to bring the environment back into operation. This section provides
+instructions on how to start an entire OpenStack environment.
+
+**To start an OpenStack environment:**
+
+#. Verify that the hardware is up and running.
+#. Power on the Fuel Master node.
+#. Start controller nodes.
+
+   Assuming your environment contains 3 controller nodes, start them
+   as follows:
+
+   .. note::
+
+      The first controller node to start is the controller node that
+      was shut down last.
+
+   #. Start the Controller-01 node.
+
+      Wait until the node completes the boot process, then wait a few more
+      minutes for Pacemaker/Corosync to complete the startup and to shut down
+      the required services due to the lack of quorum.
+
+   #. Start the Controller-02 node.
+
+      Wait until the node completes the boot process, then wait a few more
+      minutes for Pacemaker/Corosync to complete the startup and to
+      redistribute the OpenStack services between the nodes.
+
+   #. Start the Controller-03 node.
+
+      Wait until the node completes the boot process, then wait a few more
+      minutes for Pacemaker/Corosync to complete the startup and to
+      redistribute the OpenStack services between the nodes.
+
+   #. Take the Pacemaker cluster out of maintenance mode:
+
+      .. code-block:: console
+
+         pcs property set maintenance-mode=false
+
+   #. Verify the Galera service.
+
+      .. warning::
+
+         If your configuration includes a very large MySQL database, Galera
+         may stay in the syncing state for several hours until it verifies
+         both MySQL replicas between the available controllers. Do not
+         interrupt syncing; wait until Galera finishes the process.
+
+#. Start Ceph OSD nodes.
+
+   You can start all Ceph OSD nodes at the same time. Ceph starts
+   Ceph OSD services one by one, depending on the current load on the Ceph
+   monitors.
+
+#. Verify that all Ceph OSD nodes are up and running by logging in to
+   any controller node and typing:
+
+   .. code-block:: console
+
+      ceph osd tree
+
+   If some Ceph OSD nodes are not up and running, check them manually.
+
+#. Remove the ``noout`` flag:
+
+   #. Log in to any controller or any Ceph node and type:
+
+      .. code-block:: console
+
+         ceph osd unset noout
+
+   #. Verify that the flag is no longer set:
+
+      .. code-block:: console
+
+         ceph -s
+
+      The output of the command above should NOT show the ``noout`` flag
+      in the health status.
+
+#. Start compute nodes.
+#. Verify the OpenStack services.
+#. Start virtual machines through either Horizon or CLI.
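
Review note: both procedures ask the operator to read the ``ceph -s`` output by eye to confirm the state of the ``noout`` flag. The check could be scripted along the lines of the minimal sketch below. The helper names ``noout_is_set`` and ``report_noout`` are hypothetical and not part of Fuel or Ceph, and the sample status strings are assumed wording rather than captured output.

```shell
#!/bin/sh
# Hypothetical helper (not part of Fuel or Ceph): reads Ceph status text on
# stdin and succeeds only if the noout flag is mentioned anywhere in it.
noout_is_set() {
    grep -q 'noout'
}

# Print a one-line verdict for the status text passed as the first argument.
report_noout() {
    if printf '%s\n' "$1" | noout_is_set; then
        echo "noout: set"
    else
        echo "noout: not set"
    fi
}

# The sample strings below stand in for real `ceph -s` output (assumed wording).
report_noout "health HEALTH_WARN noout flag(s) set"
report_noout "health HEALTH_OK"
```

On a live cluster the same helper could be fed directly, for example ``ceph -s | noout_is_set && echo "noout: set"``. Matching on the plain string keeps the sketch independent of the exact health-line layout, which may differ between Ceph releases.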