Merge "Maintenance Mode update"
Binary file not shown (size before: 91 KiB, after: 46 KiB).
@@ -19,35 +19,35 @@ parameter in one of the following ways:

* by selecting the respective option in the boot menu in Ubuntu or CentOS;

* by forcing the reboot into maintenance mode from shell with the ``umm on``
command;
* by forcing the reboot into maintenance mode from shell with
the :command:`umm on` command;

* automatically, by reaching a number of unclean reboots specified in
REBOOT_COUNT parameters.
``REBOOT_COUNT`` parameters.

"Unclean reboot" means that system reboots unexpectedly without a
`Unclean reboot` means that system reboots unexpectedly without a
direct call from the user.

You can also disable the maintenance mode functionality
if you do not need it (e.g. you do not want to
if you do not need it (for example, you do not want to
be automatically booted into it every time).

You can operate in maintenance mode through ssh or tty2.

A return back into normal mode is issued with the *umm off* command.
A return back into normal mode is issued with the :command:`umm off`
command.

.. Note ::

If you manually start a service in the maintenance mode, it will not
be automatically restarted when you put the system back in the normal
mode with the *umm off* command.
mode with the :command:`umm off` command.


Using the :command:`umm` command
--------------------------------

Using the ``umm`` command
-------------------------

There are several parameters to use with the *umm* command:
There are several parameters to use with the :command:`umm` command:

- ``umm on [cmd]`` - enter the maintenance mode, and execute cmd when MM is reached;

@@ -68,10 +68,10 @@ There are several parameters to use with the *umm* command:
- ``umm disable`` - disable the maintenance mode functionality.


Configuring the ``UMM.conf`` file
Configuring the `UMM.conf` file
---------------------------------

You can automate the maintenance mode start by editing the */etc/umm.conf* file.
You can automate the maintenance mode start by editing the `/etc/umm.conf` file.

The configuration options are:

@@ -82,20 +82,19 @@ The configuration options are:

where:

UMM
will tell the system to go into the maintenance mode based on
the REBOOT_COUNT and COUNTER_RESET_TIME values. If the value is
anything other than ``yes`` (or if the ``UMM.conf`` file is missing), the
system will go into the native Ubuntu recovery mode.
UMM
tells the system to go into the maintenance mode based on
the ``REBOOT_COUNT`` and ``COUNTER_RESET_TIME`` values. If the value is
anything other than ``yes`` (or if the `UMM.conf` file is missing), the
system will go into the native Ubuntu recovery mode.

REBOOT_COUNT
determines the number of unclean reboots that will
trigger the system to go into the maintenance mode;

COUNTER_RESET_TIME
this is a time value in minutes after the system reboot when
"Unclean reboot" counter will be resetted.
REBOOT_COUNT
determines the number of unclean reboots that trigger the system to go
into the maintenance mode.

COUNTER_RESET_TIME
determines the period of time (in minutes) after a reboot before the
`Unclean reboot` counter is reset.
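
As a rough illustration of how the three options interact, the sketch below
parses configuration text and reports which mode a node would boot into.
The ``KEY=value`` layout, the sample values, and the helper names
``get_option``, ``effective_count`` and ``boot_target`` are assumptions made
for this example; they are not taken from the ``umm`` implementation.

```bash
# Hypothetical helpers that model the option descriptions above.

# get_option KEY TEXT: print the value of KEY from KEY=value lines (assumed format).
get_option() {
    printf '%s\n' "$2" | sed -n "s/^$1=//p" | tail -n 1
}

# effective_count COUNT UPTIME_MIN RESET_MIN: the unclean-reboot counter once
# the node has been up for UPTIME_MIN minutes (reset after RESET_MIN minutes).
effective_count() {
    if [ "$2" -ge "$3" ]; then echo 0; else echo "$1"; fi
}

# boot_target UMM_VALUE REBOOT_COUNT UNCLEAN: print normal, maintenance or recovery.
boot_target() {
    if [ "$3" -lt "$2" ]; then
        echo normal          # too few unclean reboots to trigger anything
    elif [ "$1" = yes ]; then
        echo maintenance     # UMM=yes: enter maintenance mode
    else
        echo recovery        # any other value: native Ubuntu recovery mode
    fi
}

# Sample values chosen for illustration only.
conf='UMM=yes
REBOOT_COUNT=2
COUNTER_RESET_TIME=10'

boot_target "$(get_option UMM "$conf")" "$(get_option REBOOT_COUNT "$conf")" 2
```

With the sample values, two unclean reboots reach ``REBOOT_COUNT`` and the
last line prints ``maintenance``. Treating a non-``yes`` value as "recovery
once the threshold is reached" is one reading of the description above.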
|
||||
|
||||
|
||||
Example of using MM on one node
|
||||
@@ -103,53 +102,57 @@ Example of using MM on one node
|
||||
|
||||
- Switching node into MM:
|
||||
|
||||
::
|
||||
.. code-block:: bash
|
||||
:linenos:
|
||||
|
||||
root@node-1:~#umm on
|
||||
umm-gr start/running, process 6657
|
||||
root@node-1:~#umm on
|
||||
umm-gr start/running, process 6657
|
||||
|
||||
Broadcast message from root@node-1
|
||||
(/dev/pts/0) at 14:29 ...
|
||||
Broadcast message from root@node-1
|
||||
(/dev/pts/0) at 14:29 ...
|
||||
|
||||
The system is going down for reboot NOW!
|
||||
root@node-1:~# umm status
|
||||
rebooting
|
||||
root@node-1:~# Connection to node-1 closed by remote host.
|
||||
Connection node-1:~# closed.
|
||||
root@fuel:~#:~$
|
||||
The system is going down for reboot NOW!
|
||||
root@node-1:~# umm status
|
||||
rebooting
|
||||
root@node-1:~# Connection to node-1 closed by remote host.
|
||||
Connection node-1:~# closed.
|
||||
root@fuel:~#:~$
|
||||
|
||||
root@node-1:~#ssh
|
||||
root@node-1:~#ssh
|
||||
|
||||
root@node-1:~# umm status
|
||||
umm
|
||||
root@node-1:~#ps -Af
|
||||
root@node-1:~# umm status
|
||||
umm
|
||||
root@node-1:~#ps -Af
|
||||
|
||||
|
||||
We can see only small set of working process.
|
||||
We can see only small set of working processes.
|
||||
|
||||
- Start the service:
|
||||
|
||||
::
|
||||
.. code-block:: bash
|
||||
:linenos:
|
||||
|
||||
root@node-1:~# /etc/init.d/apache2 start
|
||||
root@node-1:~# /etc/init.d/apache2 status
|
||||
Apache2 is running (pid 1907).
|
||||
root@node-1:~# /etc/init.d/apache2 start
|
||||
root@node-1:~# /etc/init.d/apache2 status
|
||||
Apache2 is running (pid 1907).
|
||||
|
||||
|
||||
- Switch back to the working mode:
|
||||
|
||||
::
|
||||
.. code-block:: bash
|
||||
:linenos:
|
||||
|
||||
root@node-1:~#umm off
|
||||
root@node-1:~#umm off
|
||||
|
||||
- Continue booting into working mode:
|
||||
|
||||
::
|
||||
.. code-block:: bash
|
||||
:linenos:
|
||||
|
||||
root@node-1:~#umm status
|
||||
runlevel N 2
|
||||
root@node-1:~#/etc/init.d/apache2 status
|
||||
Apache2 is running (pid 1907).
|
||||
root@node-1:~#umm status
|
||||
runlevel N 2
|
||||
root@node-1:~#/etc/init.d/apache2 status
|
||||
Apache2 is running (pid 1907).
|
||||
|
||||
|
||||
We can see that service was not restarted during switching from MM to
|
||||
@@ -157,115 +160,122 @@ Example of using MM on one node
|
||||
|
||||
- Check the state of the OpenStack services:
|
||||
|
||||
::
|
||||
.. code-block:: bash
|
||||
:linenos:
|
||||
|
||||
root@node-1:~#crm status
|
||||
root@node-1:~#crm status
|
||||
|
||||
- If you want to reach working mode by reboot, you should use the following
|
||||
command:
|
||||
|
||||
::
|
||||
.. code-block:: bash
|
||||
:linenos:
|
||||
|
||||
root@node-1:~# umm off reboot umm-gr start/running, process 2825
|
||||
root@node-1:~# umm off reboot umm-gr start/running, process 2825
|
||||
|
||||
Broadcast message from root@node-1
|
||||
(/dev/pts/0) at 11:23 ...
|
||||
Broadcast message from root@node-1
|
||||
(/dev/pts/0) at 11:23 ...
|
||||
|
||||
The system is going down for reboot NOW!
|
||||
root@node-1:~# Connection to node-1 closed by remote host.
|
||||
Connection to node-1 closed.
|
||||
[root@fuel ~]#
|
||||
The system is going down for reboot NOW!
|
||||
root@node-1:~# Connection to node-1 closed by remote host.
|
||||
Connection to node-1 closed.
|
||||
[root@fuel ~]#
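
The ``umm status`` values seen in the session above (``rebooting``, ``umm``,
``runlevel N 2``) can be used to wait until a node reaches a given mode. The
following is a minimal sketch; the helper names ``classify_status`` and
``wait_for_state`` are hypothetical, and treating every ``runlevel`` line as
working mode is an assumption.

```bash
# Hypothetical helpers; not part of umm.

# classify_status OUTPUT: map the text printed by "umm status" to a state name.
# Only the three outputs shown in the session above are recognised.
classify_status() {
    case "$1" in
        umm)         echo maintenance ;;
        rebooting)   echo rebooting ;;
        runlevel\ *) echo working ;;   # assumption: any runlevel line means working mode
        *)           echo unknown ;;
    esac
}

# wait_for_state WANTED TRIES CMD...: run CMD up to TRIES times, one second
# apart, until its output classifies as WANTED. Returns 1 if that never happens.
wait_for_state() {
    local wanted=$1 tries=$2 out
    shift 2
    while [ "$tries" -gt 0 ]; do
        # A failed command (for example, ssh during the reboot) counts as unknown.
        out=$("$@" 2>/dev/null) || out=""
        if [ "$(classify_status "$out")" = "$wanted" ]; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

classify_status "runlevel N 2"
```

From the Fuel master node this could be called as
``wait_for_state maintenance 60 ssh node-1 umm status``, which polls the node
for up to about a minute.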
|
||||
|
||||
|
||||
Example of putting all nodes into the maintenance mode at the same time
|
||||
-----------------------------------------------------------------------
|
||||
|
||||
The following maintenance mode sequence is called "Last input First out".
|
||||
The following maintenance mode sequence is called `Last input First out`.
|
||||
This guarantees that there is going to be the most recent data on
|
||||
the Cloud Infrastructure Controller (CIC) that comes back first.
|
||||
|
||||
|
||||
- Determine what nodes have Controller (CIC) role:
|
||||
- Determine which nodes have Controller (CIC) role:
|
||||
|
||||
::
|
||||
.. code-block:: bash
|
||||
:linenos:
|
||||
|
||||
[root@fuel ~]# fuel nodes
|
||||
id | status | name | cluster| ip | mac | roles | pending_roles| online
|
||||
---|--------|------------------|--------|-----------|-------------------|------------|--------------|-------
|
||||
2 | ready | Untitled (c0:02) | 1 | 10.20.0.4 | e6:6a:42:96:a4:45 | controller | | True
|
||||
4 | ready | Untitled (c0:04) | 1 | 10.20.0.6 | 66:10:2e:0c:12:4a | compute | | True
|
||||
1 | ready | Untitled (c0:01) | 1 | 10.20.0.3 | fa:a1:39:94:7f:4c | controller | | True
|
||||
3 | ready | Untitled (c0:03) | 1 | 10.20.0.5 | 82:cb:bb:50:40:47 | controller | | True
|
||||
[root@fuel ~]# fuel nodes
|
||||
id | status | name | cluster| ip | mac | roles | pending_roles| online
|
||||
---|--------|------------------|--------|-----------|-------------------|------------|--------------|-------
|
||||
2 | ready | Untitled (c0:02) | 1 | 10.20.0.4 | e6:6a:42:96:a4:45 | controller | | True
|
||||
4 | ready | Untitled (c0:04) | 1 | 10.20.0.6 | 66:10:2e:0c:12:4a | compute | | True
|
||||
1 | ready | Untitled (c0:01) | 1 | 10.20.0.3 | fa:a1:39:94:7f:4c | controller | | True
|
||||
3 | ready | Untitled (c0:03) | 1 | 10.20.0.5 | 82:cb:bb:50:40:47 | controller | | True
|
||||
|
||||
- Copy id_rsa to the CICs for passwordless ssh authentication:
- Copy ``id_rsa`` to the CICs for passwordless ssh authentication:
|
||||
|
||||
::
|
||||
.. code-block:: bash
|
||||
:linenos:
|
||||
|
||||
[root@fuel ~]# scp .ssh/id_rsa node-1:.ssh/id_rsa
|
||||
Warning: Permanently added 'node-1' (RSA) to the list of known hosts.
|
||||
id_rsa 100% 1675 1.6KB/s 00:00
|
||||
[root@fuel ~]# scp .ssh/id_rsa node-2:.ssh/id_rsa
|
||||
Warning: Permanently added 'node-2' (RSA) to the list of known hosts.
|
||||
id_rsa 100% 1675 1.6KB/s 00:00
|
||||
[root@fuel ~]# scp .ssh/id_rsa node-3:.ssh/id_rsa
|
||||
Warning: Permanently added 'node-3' (RSA) to the list of known hosts.
|
||||
id_rsa 100% 1675 1.6KB/s 00:00
|
||||
[root@fuel ~]# scp .ssh/id_rsa node-1:.ssh/id_rsa
|
||||
Warning: Permanently added 'node-1' (RSA) to the list of known hosts.
|
||||
id_rsa 100% 1675 1.6KB/s 00:00
|
||||
[root@fuel ~]# scp .ssh/id_rsa node-2:.ssh/id_rsa
|
||||
Warning: Permanently added 'node-2' (RSA) to the list of known hosts.
|
||||
id_rsa 100% 1675 1.6KB/s 00:00
|
||||
[root@fuel ~]# scp .ssh/id_rsa node-3:.ssh/id_rsa
|
||||
Warning: Permanently added 'node-3' (RSA) to the list of known hosts.
|
||||
id_rsa 100% 1675 1.6KB/s 00:00
|
||||
|
||||
- Enforce switching into MM mode on all nodes:
|
||||
|
||||
::
|
||||
.. code-block:: bash
|
||||
:linenos:
|
||||
|
||||
[root@fuel ~]# ssh node-1 umm on ssh node-2 umm on ssh node-3 umm on
|
||||
Warning: Permanently added 'node-1' (RSA) to the list of known hosts.
|
||||
umm-gr start/running, process 24318
|
||||
Connection to node-1 closed by remote host.
|
||||
Connection to node-1 closed.
|
||||
[root@fuel ~]#
|
||||
[root@fuel ~]# ssh -tt node-1 ssh -tt node-2 ssh -tt node-3 sleep 1
|
||||
Warning: Permanently added 'node-1' (RSA) to the list of known hosts.
|
||||
ECDSA key fingerprint is 84:17:0d:ea:27:1f:4e:08:f7:54:b2:8c:fe:8a:13:1a.
|
||||
Are you sure you want to continue connecting (yes/no)? yes
|
||||
Warning: Permanently added 'node-2,10.20.0.4' (ECDSA)
|
||||
to the list of known hosts. established.
|
||||
ECDSA key fingerprint is
|
||||
c3:c6:ca:7d:11:d3:53:01:15:64:20:f7:c7:44:fb:d1.
|
||||
Are you sure you want to continue connecting (yes/no)? yes
|
||||
Warning: Permanently added 'node-3,192.168.0.6' (ECDSA)
|
||||
to the list of known hosts.
|
||||
Connection to node-3 closed.
|
||||
Connection to node-2 closed.
|
||||
Connection to node-1 closed. [root@fuel ~]#
|
||||
[root@fuel ~]# ssh node-1 umm on ssh node-2 umm on ssh node-3 umm on
|
||||
Warning: Permanently added 'node-1' (RSA) to the list of known hosts.
|
||||
umm-gr start/running, process 24318
|
||||
Connection to node-1 closed by remote host.
|
||||
Connection to node-1 closed.
|
||||
[root@fuel ~]#
|
||||
[root@fuel ~]# ssh -tt node-1 ssh -tt node-2 ssh -tt node-3 sleep 1
|
||||
Warning: Permanently added 'node-1' (RSA) to the list of known hosts.
|
||||
ECDSA key fingerprint is 84:17:0d:ea:27:1f:4e:08:f7:54:b2:8c:fe:8a:13:1a.
|
||||
Are you sure you want to continue connecting (yes/no)? yes
|
||||
Warning: Permanently added 'node-2,10.20.0.4' (ECDSA)
|
||||
to the list of known hosts. established.
|
||||
ECDSA key fingerprint is
|
||||
c3:c6:ca:7d:11:d3:53:01:15:64:20:f7:c7:44:fb:d1.
|
||||
Are you sure you want to continue connecting (yes/no)? yes
|
||||
Warning: Permanently added 'node-3,192.168.0.6' (ECDSA)
|
||||
to the list of known hosts.
|
||||
Connection to node-3 closed.
|
||||
Connection to node-2 closed.
|
||||
Connection to node-1 closed. [root@fuel ~]#
|
||||
|
||||
- Wait until the last node reboots:
|
||||
|
||||
::
|
||||
.. code-block:: bash
|
||||
:linenos:
|
||||
|
||||
[root@fuel ~]# ssh node-3
|
||||
Warning: Permanently added 'node-3' (RSA) to the list of known hosts.
|
||||
Welcome to Ubuntu 12.04.4 LTS (GNU/Linux 3.13.0-32-generic x86_64)
|
||||
* Documentation: https://help.ubuntu.com/
|
||||
Last login: Tue Dec 23 05:55:47 2014 from 10.20.0.2
|
||||
root@node-3:~#
|
||||
Broadcast message from root@node-3
|
||||
(unknown) at 6:00 ...
|
||||
The system is going down for reboot NOW!
|
||||
Connection to node-3 closed by remote host.
|
||||
Connection to node-3 closed.
|
||||
[root@fuel ~]#
|
||||
[root@fuel ~]# ssh node-3
|
||||
Warning: Permanently added 'node-3' (RSA) to the list of known hosts.
|
||||
Welcome to Ubuntu 12.04.4 LTS (GNU/Linux 3.13.0-32-generic x86_64)
|
||||
* Documentation: https://help.ubuntu.com/
|
||||
Last login: Tue Dec 23 05:55:47 2014 from 10.20.0.2
|
||||
root@node-3:~#
|
||||
Broadcast message from root@node-3
|
||||
(unknown) at 6:00 ...
|
||||
The system is going down for reboot NOW!
|
||||
Connection to node-3 closed by remote host.
|
||||
Connection to node-3 closed.
|
||||
[root@fuel ~]#
|
||||
|
||||
- Perform all the steps, planned for MM.
|
||||
- Perform all the steps planned for MM.
|
||||
|
||||
|
||||
- Enforce a return to normal mode in reverse order:
|
||||
|
||||
::
|
||||
.. code-block:: bash
|
||||
:linenos:
|
||||
|
||||
[root@fuel ~]# ssh node-3 umm off
|
||||
Warning: Permanently added 'node-3' (RSA) to the list of known hosts.
|
||||
[root@fuel ~]# ssh node-2 umm off
|
||||
Warning: Permanently added 'node-2' (RSA) to the list of known hosts.
|
||||
[root@fuel ~]# ssh node-1 umm off
|
||||
Warning: Permanently added 'node-1' (RSA) to the list of known hosts.
|
||||
[root@fuel ~]# ssh node-3 umm off
|
||||
Warning: Permanently added 'node-3' (RSA) to the list of known hosts.
|
||||
[root@fuel ~]# ssh node-2 umm off
|
||||
Warning: Permanently added 'node-2' (RSA) to the list of known hosts.
|
||||
[root@fuel ~]# ssh node-1 umm off
|
||||
Warning: Permanently added 'node-1' (RSA) to the list of known hosts.
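
The chained command used above works because ``umm on [cmd]`` runs ``cmd``
once MM is reached, so each node triggers the next one only after it has
entered MM itself. The sketch below builds that chain, and the reverse-order
``umm off`` commands, for an arbitrary list of nodes. The function names
``chain_on`` and ``off_commands`` are hypothetical, and the script only
prints the commands; it does not run them.

```bash
# Hypothetical helpers that print commands for review; nothing is executed.

# chain_on NODE...: print one nested command that puts the nodes into MM in
# the given order, each node starting the next after it reaches MM.
chain_on() {
    local cmd="" node
    for node in "$@"; do
        cmd="${cmd:+$cmd }ssh $node umm on"
    done
    printf '%s\n' "$cmd"
}

# off_commands NODE...: print one "ssh NODE umm off" per line, last node
# first, so the node that entered MM last is the first to come back.
off_commands() {
    local out="" node nl='
'
    for node in "$@"; do
        out="ssh $node umm off${out:+$nl$out}"
    done
    printf '%s\n' "$out"
}

chain_on node-1 node-2 node-3
off_commands node-1 node-2 node-3
```

For the three controllers in this example, the first call reproduces the
chained ``umm on`` command shown earlier, and the second lists the
``umm off`` commands starting with ``node-3``.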
@@ -6,21 +6,20 @@ Maintenance Mode

Maintenance mode (MM) is a mode when the operating system on the node
has only a critical set of working services that the system needs for
basic network and disk operations. The purpose of the maintenance mode
is to do a system repair or run other service operations on the system.
The implementation of maintenance mode in 15B is based on the Ubuntu
recovery mode. The system goes into a reboot and goes through the
regular boot process until the system initialization stage (rc-sysinit).
This is where the system enters the maintenance mode with the network
and filesystem services started. In this moment we have already started
network and filesystem. In MM stage are started sshd, tty2 and main MM
service wait command for boot flow continue.
basic network and disk operations. The purpose of MM is to perform a system
repair or run other maintenance operations on the system.

For switching to MM, the system shuts down and then goes through the regular
boot process until the system initialization stage (rc-sysinit).
At the moment the system enters MM, the network and filesystem services
have already started. During the MM stage, the ``sshd`` and ``tty2``
services start, and the main MM service waits for the command to continue the boot flow.

See the :ref:`mm-ops` section of the Operations guide for the details.

Here is a Cloud Infrastructure Controller boot flow scheme:

.. image:: /_images/mm_bootflow.png

For more information, see:

- :ref:`mm-ops`.
