[UG] Adds shut down and start environment procedures

Change-Id: Ica64a0ea83881fa28e672f3b7d6bd20297aa1738
Closes-bug: #1550388
2016-08-02 14:06:25 +03:00 · 50131896b1
parent de47e3a072
commit 50131896b1
3 changed files with 211 additions and 0 deletions
--- a/userdocs/fuel-user-guide/maintain-environment.rst
+++ b/userdocs/fuel-user-guide/maintain-environment.rst
@@ -22,3 +22,5 @@ This section includes the following topics:
   maintain-environment/data-driven.rst
   maintain-environment/deployment-history.rst
   maintain-environment/deployment-information.rst
+   maintain-environment/shutdown-env.rst
+   maintain-environment/start-env.rst
--- a/userdocs/fuel-user-guide/maintain-environment/shutdown-env.rst
+++ b/userdocs/fuel-user-guide/maintain-environment/shutdown-env.rst
@@ -0,0 +1,117 @@
+.. _shutdown-env:
+
+==================================
+Shut down an OpenStack environment
+==================================
+
+This section describes the recommended procedure for shutting down an entire
+OpenStack environment. You may need to shut down an OpenStack environment
+to perform maintenance or recovery procedures. The shutdown process
+involves stopping all OpenStack virtual machines, the Fuel Master node,
+and the compute, controller, and other nodes in a specific order. Following
+the procedure ensures that the environment shuts down gracefully and
+reduces the risk of failure during a subsequent start of
+the environment.
+
+**To shut down an entire OpenStack environment:**
+
+#. Shut down the OpenStack virtual machines gracefully through either
+   Horizon or the CLI.
+
+   Check whether any virtual machines in your OpenStack environment require
+   a customized shutdown procedure or a special shutdown sequence across
+   several virtual machines. Shut down or suspend these instances gracefully.
+
+#. Shut down compute nodes.
+
+   Log in to each compute node as administrator and type:
+
+   .. code-block:: console
+
+      poweroff
+
+   .. note:: All compute nodes may be shut down at the same time.
+
+   .. warning::
+
+      If a node combines more than one role, you may need to perform
+      additional steps, such as setting the ``noout`` flag for Ceph OSD nodes,
+      before you can shut down the node.
+
+#. Shut down Ceph OSD nodes:
+
+   #. Set the ``noout`` flag to prevent Ceph from starting a rebalance,
+      which can be triggered by a delay between Ceph nodes powering off:
+
+      #. Log in to any controller node or any Ceph OSD node and type:
+
+         .. code-block:: console
+
+            ceph osd set noout
+
+      #. Verify the ``noout`` flag is set:
+
+         .. code-block:: console
+
+            ceph -s
+
+         The output of the command above should show the ``noout`` flag
+         in the health status.
+
+   #. On each Ceph OSD node, type:
+
+      .. code-block:: console
+
+         poweroff
+
+#. Shut down controller nodes by powering them off sequentially.
+
+   Shutting down a single controller node takes an estimated 30 minutes,
+   but it may take longer depending on the configuration. For most of
+   that time, the system shows the ``unload corosync services`` message.
+
+   Assuming your environment contains 3 controller nodes, shut them down
+   as follows:
+
+   #. Put the Pacemaker cluster in maintenance mode:
+
+      .. code-block:: console
+
+         pcs property set maintenance-mode=true
+
+   #. Log in to the Controller-03 node as Administrator and type:
+
+      .. code-block:: console
+
+         poweroff
+
+      Wait until the node completes the poweroff procedure, then wait a few
+      more minutes to allow Pacemaker/Corosync to redistribute
+      services between the remaining controller nodes.
+
+   #. Log in to the Controller-02 node as Administrator and type:
+
+      .. code-block:: console
+
+         poweroff
+
+      Wait until the node completes the poweroff procedure, then wait a few
+      more minutes to allow Pacemaker/Corosync to stop all services
+      on the single remaining controller node because quorum is lost.
+
+   #. Log in to the Controller-01 node as Administrator and type:
+
+      .. code-block:: console
+
+         poweroff
+
+#. Shut down the Fuel Master node. Log in to the Fuel Master node CLI and
+   type:
+
+   .. code-block:: console
+
+      poweroff
+
+#. Shut down any remaining nodes in your environment.
+#. If required, shut down the networking infrastructure.
+#. To start an environment, proceed to :ref:`start-env`.
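+
+The following is a minimal sketch, not part of the official procedure, of
+how the first step might look from the CLI. It assumes that the ``openstack``
+client is installed and that administrator credentials are already loaded,
+for example by sourcing an ``openrc`` file. The instance ID is a placeholder:
+
+.. code-block:: console
+
+   openstack server list --all-projects --status ACTIVE
+   openstack server stop <instance-id>
+
+Before powering off the Ceph OSD nodes, you can also check the ``noout``
+flag directly in the OSD map, assuming the standard ``ceph`` CLI:
+
+.. code-block:: console
+
+   ceph osd dump | grep flags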
--- a/userdocs/fuel-user-guide/maintain-environment/start-env.rst
+++ b/userdocs/fuel-user-guide/maintain-environment/start-env.rst
@@ -0,0 +1,92 @@
+.. _start-env:
+
+==============================
+Start an OpenStack environment
+==============================
+
+To resume using an OpenStack environment after it has been shut down, you
+need to bring the environment back into operation. This section provides
+instructions on how to start an entire OpenStack environment.
+
+**To start an OpenStack environment:**
+
+#. Verify that the hardware is up and running.
+#. Power on the Fuel Master node.
+#. Start controller nodes.
+
+   Assuming your environment contains 3 controller nodes, start them
+   as follows:
+
+   ..  note::
+
+       The first controller node to start is the controller node that
+       was shut down last.
+
+   #. Start the Controller-01 node.
+
+      Wait until the node completes the boot process, then wait a few more
+      minutes for Pacemaker/Corosync to finish starting up and to shut down
+      the required services because there is no quorum.
+
+   #. Start the Controller-02 node.
+
+      Wait until the node completes the boot process, then wait a few more
+      minutes for Pacemaker/Corosync to finish starting up and to redistribute
+      the OpenStack services between the nodes.
+
+   #. Start the Controller-03 node.
+
+      Wait until the node completes the boot process, then wait a few more
+      minutes for Pacemaker/Corosync to finish starting up and to redistribute
+      the OpenStack services between the nodes.
+
+   #. Take the Pacemaker cluster out of maintenance mode:
+
+      .. code-block:: console
+
+         pcs property set maintenance-mode=false
+
+   #. Verify the Galera service.
+
+      .. warning::
+
+         If your configuration includes a very large MySQL database,
+         Galera may stay in the syncing state for several hours until it
+         verifies both MySQL replicas between the available controllers.
+         Do not interrupt syncing; wait until Galera finishes the process.
+
+#. Start Ceph OSD nodes.
+
+   You can start all Ceph OSD nodes at the same time. Ceph starts
+   Ceph OSD services one by one, depending on the current load on the Ceph
+   monitors.
+
+#. Verify that all Ceph OSD nodes are up and running by logging in to
+   any controller node and typing:
+
+   .. code-block:: console
+
+      ceph osd tree
+
+   If some Ceph OSD nodes are not up and running, check them manually.
+
+#. Remove the ``noout`` flag:
+
+   #. Log in to any controller or any Ceph node and type:
+
+      .. code-block:: console
+
+         ceph osd unset noout
+
+   #. Verify that the flag is no longer set:
+
+      .. code-block:: console
+
+         ceph -s
+
+      The output of the command above should NOT show the ``noout`` flag
+      in the health status.
+
+#. Start compute nodes.
+#. Verify the OpenStack services.
+#. Start virtual machines through either Horizon or the CLI.
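+
+The following is a minimal sketch, not part of the official procedure, of
+how the verification steps might be performed on a controller node. It
+assumes that the ``pcs``, ``mysql``, and ``openstack`` clients are available,
+that the ``mysql`` client can authenticate without extra options, and that
+administrator credentials are loaded for the ``openstack`` client:
+
+.. code-block:: console
+
+   pcs status
+   mysql -e "SHOW STATUS LIKE 'wsrep_local_state_comment';"
+   mysql -e "SHOW STATUS LIKE 'wsrep_cluster_size';"
+   openstack compute service list
+
+A ``wsrep_local_state_comment`` value of ``Synced`` indicates that the node
+has finished syncing. In an environment with 3 controller nodes,
+``wsrep_cluster_size`` is expected to report 3 once all nodes have joined.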