Merge "Maintenance Mode update"
Binary file not shown. Size before: 91 KiB; size after: 46 KiB.
@@ -19,35 +19,35 @@ parameter in one of the following ways:
|
|||||||
|
|
||||||
* by selecting the respective option in the boot menu in Ubuntu or CentOS;
|
* by selecting the respective option in the boot menu in Ubuntu or CentOS;
|
||||||
|
|
||||||
* by forcing the reboot into maintenance mode from shell with the ``umm on``
|
* by forcing the reboot into maintenance mode from shell with
|
||||||
command;
|
the :command:`umm on` command;
|
||||||
|
|
||||||
* automatically, by reaching an number of unclean-reboots specified in
|
* automatically, by reaching a number of unclean reboots specified in the
|
||||||
REBOOT_COUNT parameters.
|
``REBOOT_COUNT`` parameter.
|
||||||
|
|
||||||
"Unclean reboot" means that system reboots unexpectedly without a
|
`Unclean reboot` means that the system reboots unexpectedly without a
|
||||||
direct call from the user.
|
direct call from the user.
|
||||||
|
|
||||||
You can also disable the maintenance mode functionality
|
You can also disable the maintenance mode functionality
|
||||||
if you do not need it (e.g. you do not want to
|
if you do not need it (for example, you do not want to
|
||||||
be automatically booted into it every time).
|
be automatically booted into it every time).
|
||||||
|
|
||||||
You can operate in maintenance mode through ssh or tty2.
|
You can operate in maintenance mode through ssh or tty2.
|
||||||
|
|
||||||
A return back into normal mode is issued with the *umm off* command.
|
To return to normal mode, run the :command:`umm off`
|
||||||
|
command.
|
||||||
|
|
||||||
.. Note ::
|
.. Note ::
|
||||||
|
|
||||||
If you manually start a service in the maintenance mode, it will not
|
If you manually start a service in the maintenance mode, it will not
|
||||||
be automatically restarted when you put the system back in the normal
|
be automatically restarted when you put the system back in the normal
|
||||||
mode with the *umm off* command.
|
mode with the :command:`umm off` command.
|
||||||
|
|
||||||
|
|
||||||
|
Using the :command:`umm` command
|
||||||
|
--------------------------------
|
||||||
|
|
||||||
Using the ``umm`` command
|
There are several parameters to use with the :command:`umm` command:
|
||||||
-------------------------
|
|
||||||
|
|
||||||
There are several parameters to use with the *umm* command:
|
|
||||||
|
|
||||||
- ``umm on [cmd]`` - enter the maintenance mode, and execute cmd when MM is reached;
|
- ``umm on [cmd]`` - enter the maintenance mode, and execute cmd when MM is reached;
|
||||||
|
|
||||||
@@ -68,10 +68,10 @@ There are several parameters to use with the *umm* command:
|
|||||||
- ``umm disable`` - disable the maintenance mode functionality.
|
- ``umm disable`` - disable the maintenance mode functionality.
|
||||||
|
|
||||||
|
|
||||||
Configuring the ``UMM.conf`` file
|
Configuring the `UMM.conf` file
|
||||||
---------------------------------
|
---------------------------------
|
||||||
|
|
||||||
You can automate the maintenance mode start by editing the */etc/umm.conf* file.
|
You can automate the maintenance mode start by editing the `/etc/umm.conf` file.
|
||||||
|
|
||||||
The configuration options are:
|
The configuration options are:
|
||||||
|
|
||||||
@@ -82,20 +82,19 @@ The configuration options are:
|
|||||||
|
|
||||||
where:
|
where:
|
||||||
|
|
||||||
UMM
|
UMM
|
||||||
will tell the system to go into the maintenance mode based on
|
tells the system to go into the maintenance mode based on
|
||||||
the REBOOT_COUNT and COUNTER_RESET_TIME values. If the value is
|
the ``REBOOT_COUNT`` and ``COUNTER_RESET_TIME`` values. If the value is
|
||||||
anything other than ``yes`` (or if the ``UMM.conf`` file is missing), the
|
anything other than ``yes`` (or if the `UMM.conf` file is missing), the
|
||||||
system will go into the native Ubuntu recovery mode.
|
system will go into the native Ubuntu recovery mode.
|
||||||
|
|
||||||
REBOOT_COUNT
|
REBOOT_COUNT
|
||||||
determines the number of unclean reboots that will
|
determines the number of unclean reboots that trigger the system to go
|
||||||
trigger the system to go into the maintenance mode;
|
into the maintenance mode.
|
||||||
|
|
||||||
COUNTER_RESET_TIME
|
|
||||||
this is a time value in minutes after the system reboot when
|
|
||||||
"Unclean reboot" counter will be resetted.
|
|
||||||
|
|
||||||
|
COUNTER_RESET_TIME
|
||||||
|
determines the period of time (in minutes) before the `Unclean reboot`
|
||||||
|
counter reset.
|
||||||
|
|
||||||
|
|
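The interplay of the three options can be illustrated with a minimal Python
sketch. It assumes that the file holds plain ``KEY=value`` lines, which this
section does not confirm, and the helper names ``parse_umm_conf``,
``should_enter_mm``, and ``counter_after_uptime`` are hypothetical. They are
not part of the ``umm`` tool.

```python
def parse_umm_conf(text):
    """Parse KEY=value lines, skipping blank lines and # comments.

    The KEY=value layout is an assumption made for this sketch.
    """
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        options[key.strip()] = value.strip()
    return options


def should_enter_mm(options, unclean_reboots):
    """Return True if the node should boot into the maintenance mode.

    MM is considered only when UMM is exactly "yes" and the number of
    unclean reboots has reached REBOOT_COUNT.
    """
    if options.get("UMM") != "yes":
        return False
    threshold = options.get("REBOOT_COUNT")
    if threshold is None:
        return False
    return unclean_reboots >= int(threshold)


def counter_after_uptime(options, unclean_reboots, uptime_minutes):
    """Reset the counter once the node has stayed up COUNTER_RESET_TIME minutes."""
    reset_after = int(options["COUNTER_RESET_TIME"])
    return 0 if uptime_minutes >= reset_after else unclean_reboots


if __name__ == "__main__":
    sample = "UMM=yes\nREBOOT_COUNT=2\nCOUNTER_RESET_TIME=10\n"
    opts = parse_umm_conf(sample)
    print(should_enter_mm(opts, unclean_reboots=2))
    print(counter_after_uptime(opts, unclean_reboots=1, uptime_minutes=10))
```

The values ``2`` and ``10`` in the sample are illustrative only. The sketch
covers the counting logic alone and does not model the fallback to the native
Ubuntu recovery mode.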
||||||
Example of using MM on one node
|
Example of using MM on one node
|
||||||
@@ -103,53 +102,57 @@ Example of using MM on one node
|
|||||||
|
|
||||||
- Switching node into MM:
|
- Switching node into MM:
|
||||||
|
|
||||||
::
|
.. code-block:: bash
|
||||||
|
:linenos:
|
||||||
|
|
||||||
root@node-1:~#umm on
|
root@node-1:~#umm on
|
||||||
umm-gr start/running, process 6657
|
umm-gr start/running, process 6657
|
||||||
|
|
||||||
Broadcast message from root@node-1
|
Broadcast message from root@node-1
|
||||||
(/dev/pts/0) at 14:29 ...
|
(/dev/pts/0) at 14:29 ...
|
||||||
|
|
||||||
The system is going down for reboot NOW!
|
The system is going down for reboot NOW!
|
||||||
root@node-1:~# umm status
|
root@node-1:~# umm status
|
||||||
rebooting
|
rebooting
|
||||||
root@node-1:~# Connection to node-1 closed by remote host.
|
root@node-1:~# Connection to node-1 closed by remote host.
|
||||||
Connection node-1:~# closed.
|
Connection node-1:~# closed.
|
||||||
root@fuel:~#:~$
|
root@fuel:~#:~$
|
||||||
|
|
||||||
root@node-1:~#ssh
|
root@node-1:~#ssh
|
||||||
|
|
||||||
root@node-1:~# umm status
|
root@node-1:~# umm status
|
||||||
umm
|
umm
|
||||||
root@node-1:~#ps -Af
|
root@node-1:~#ps -Af
|
||||||
|
|
||||||
|
|
||||||
We can see only small set of working process.
|
We can see only a small set of working processes.
|
||||||
|
|
||||||
- Start the service:
|
- Start the service:
|
||||||
|
|
||||||
::
|
.. code-block:: bash
|
||||||
|
:linenos:
|
||||||
|
|
||||||
root@node-1:~# /etc/init.d/apache2 start
|
root@node-1:~# /etc/init.d/apache2 start
|
||||||
root@node-1:~# /etc/init.d/apache2 status
|
root@node-1:~# /etc/init.d/apache2 status
|
||||||
Apache2 is running (pid 1907).
|
Apache2 is running (pid 1907).
|
||||||
|
|
||||||
|
|
||||||
- Switch back to the working mode:
|
- Switch back to the working mode:
|
||||||
|
|
||||||
::
|
.. code-block:: bash
|
||||||
|
:linenos:
|
||||||
|
|
||||||
root@node-1:~#umm off
|
root@node-1:~#umm off
|
||||||
|
|
||||||
- Continue booting into working mode:
|
- Continue booting into working mode:
|
||||||
|
|
||||||
::
|
.. code-block:: bash
|
||||||
|
:linenos:
|
||||||
|
|
||||||
root@node-1:~#umm status
|
root@node-1:~#umm status
|
||||||
runlevel N 2
|
runlevel N 2
|
||||||
root@node-1:~#/etc/init.d/apache2 status
|
root@node-1:~#/etc/init.d/apache2 status
|
||||||
Apache2 is running (pid 1907).
|
Apache2 is running (pid 1907).
|
||||||
|
|
||||||
|
|
||||||
We can see that service was not restarted during switching from MM to
|
We can see that the service was not restarted during switching from MM to
|
||||||
@@ -157,115 +160,122 @@ Example of using MM on one node
|
|||||||
|
|
||||||
- Check the state of the OpenStack services:
|
- Check the state of the OpenStack services:
|
||||||
|
|
||||||
::
|
.. code-block:: bash
|
||||||
|
:linenos:
|
||||||
|
|
||||||
root@node-1:~#crm status
|
root@node-1:~#crm status
|
||||||
|
|
||||||
- If you want to reach working mode by reboot, you should use the following
|
- If you want to reach working mode by reboot, you should use the following
|
||||||
command:
|
command:
|
||||||
|
|
||||||
::
|
.. code-block:: bash
|
||||||
|
:linenos:
|
||||||
|
|
||||||
root@node-1:~# umm off reboot umm-gr start/running, process 2825
|
root@node-1:~# umm off reboot
umm-gr start/running, process 2825
|
||||||
|
|
||||||
Broadcast message from root@node-1
|
Broadcast message from root@node-1
|
||||||
(/dev/pts/0) at 11:23 ...
|
(/dev/pts/0) at 11:23 ...
|
||||||
|
|
||||||
The system is going down for reboot NOW!
|
The system is going down for reboot NOW!
|
||||||
root@node-1:~# Connection to node-1 closed by remote host.
|
root@node-1:~# Connection to node-1 closed by remote host.
|
||||||
Connection to node-1 closed.
|
Connection to node-1 closed.
|
||||||
[root@fuel ~]#
|
[root@fuel ~]#
|
||||||
|
|
||||||
|
|
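The transcripts above show three different answers from ``umm status``:
``rebooting``, ``umm``, and ``runlevel N 2``. The following Python sketch maps
them to node states. It relies only on these three sample outputs, so the
mapping is an assumption and the ``classify_umm_status`` helper is hypothetical.

```python
def classify_umm_status(output):
    """Map the text printed by ``umm status`` to a coarse node state.

    Only the three outputs shown in the example transcripts are recognized.
    Anything else is reported as "unknown".
    """
    text = output.strip()
    if text == "umm":
        return "maintenance"
    if text == "rebooting":
        return "rebooting"
    if text.startswith("runlevel"):
        return "normal"
    return "unknown"


if __name__ == "__main__":
    for sample in ("rebooting", "umm", "runlevel N 2"):
        print(sample, "->", classify_umm_status(sample))
```

A script that waits for a node to reach MM could poll ``umm status`` over ssh
and stop once this helper returns ``maintenance``.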
||||||
Example of putting all nodes into the maintenance mode at the same time
|
Example of putting all nodes into the maintenance mode at the same time
|
||||||
-----------------------------------------------------------------------
|
-----------------------------------------------------------------------
|
||||||
|
|
||||||
The following maintenance mode sequence is called "Last input First out".
|
The following maintenance mode sequence is called `Last input First out`.
|
||||||
This guarantees that there is going to be the most recent data on
|
This guarantees that there is going to be the most recent data on
|
||||||
the Cloud Infrastructure Controller (CIC) that comes back first.
|
the Cloud Infrastructure Controller (CIC) that comes back first.
|
||||||
|
|
||||||
|
|
||||||
- Determine what nodes have Controller (CIC) role:
|
- Determine which nodes have Controller (CIC) role:
|
||||||
|
|
||||||
::
|
.. code-block:: bash
|
||||||
|
:linenos:
|
||||||
|
|
||||||
[root@fuel ~]# fuel nodes
|
[root@fuel ~]# fuel nodes
|
||||||
id | status | name | cluster| ip | mac | roles | pending_roles| online
|
id | status | name | cluster| ip | mac | roles | pending_roles| online
|
||||||
---|--------|------------------|--------|-----------|-------------------|------------|--------------|-------
|
---|--------|------------------|--------|-----------|-------------------|------------|--------------|-------
|
||||||
2 | ready | Untitled (c0:02) | 1 | 10.20.0.4 | e6:6a:42:96:a4:45 | controller | | True
|
2 | ready | Untitled (c0:02) | 1 | 10.20.0.4 | e6:6a:42:96:a4:45 | controller | | True
|
||||||
4 | ready | Untitled (c0:04) | 1 | 10.20.0.6 | 66:10:2e:0c:12:4a | compute | | True
|
4 | ready | Untitled (c0:04) | 1 | 10.20.0.6 | 66:10:2e:0c:12:4a | compute | | True
|
||||||
1 | ready | Untitled (c0:01) | 1 | 10.20.0.3 | fa:a1:39:94:7f:4c | controller | | True
|
1 | ready | Untitled (c0:01) | 1 | 10.20.0.3 | fa:a1:39:94:7f:4c | controller | | True
|
||||||
3 | ready | Untitled (c0:03) | 1 | 10.20.0.5 | 82:cb:bb:50:40:47 | controller | | True
|
3 | ready | Untitled (c0:03) | 1 | 10.20.0.5 | 82:cb:bb:50:40:47 | controller | | True
|
||||||
|
|
||||||
- Copy id_rsa to the CICs for passwordless ssh authentification:
|
- Copy ``id_rsa`` to the CICs for passwordless ssh authentication:
|
||||||
|
|
||||||
::
|
.. code-block:: bash
|
||||||
|
:linenos:
|
||||||
|
|
||||||
[root@fuel ~]# scp .ssh/id_rsa node-1:.ssh/id_rsa
|
[root@fuel ~]# scp .ssh/id_rsa node-1:.ssh/id_rsa
|
||||||
Warning: Permanently added 'node-1' (RSA) to the list of known hosts.
|
Warning: Permanently added 'node-1' (RSA) to the list of known hosts.
|
||||||
id_rsa 100% 1675 1.6KB/s 00:00
|
id_rsa 100% 1675 1.6KB/s 00:00
|
||||||
[root@fuel ~]# scp .ssh/id_rsa node-2:.ssh/id_rsa
|
[root@fuel ~]# scp .ssh/id_rsa node-2:.ssh/id_rsa
|
||||||
Warning: Permanently added 'node-2' (RSA) to the list of known hosts.
|
Warning: Permanently added 'node-2' (RSA) to the list of known hosts.
|
||||||
id_rsa 100% 1675 1.6KB/s 00:00
|
id_rsa 100% 1675 1.6KB/s 00:00
|
||||||
[root@fuel ~]# scp .ssh/id_rsa node-3:.ssh/id_rsa
|
[root@fuel ~]# scp .ssh/id_rsa node-3:.ssh/id_rsa
|
||||||
Warning: Permanently added 'node-3' (RSA) to the list of known hosts.
|
Warning: Permanently added 'node-3' (RSA) to the list of known hosts.
|
||||||
id_rsa 100% 1675 1.6KB/s 00:00
|
id_rsa 100% 1675 1.6KB/s 00:00
|
||||||
|
|
||||||
- Enforce switching into MM mode on all nodes:
|
- Enforce switching into MM mode on all nodes:
|
||||||
|
|
||||||
::
|
.. code-block:: bash
|
||||||
|
:linenos:
|
||||||
|
|
||||||
[root@fuel ~]# ssh node-1 umm on ssh node-2 umm on ssh node-3 umm on
|
[root@fuel ~]# ssh node-1 umm on ssh node-2 umm on ssh node-3 umm on
|
||||||
Warning: Permanently added 'node-1' (RSA) to the list of known hosts.
|
Warning: Permanently added 'node-1' (RSA) to the list of known hosts.
|
||||||
umm-gr start/running, process 24318
|
umm-gr start/running, process 24318
|
||||||
Connection to node-1 closed by remote host.
|
Connection to node-1 closed by remote host.
|
||||||
Connection to node-1 closed.
|
Connection to node-1 closed.
|
||||||
[root@fuel ~]#
|
[root@fuel ~]#
|
||||||
[root@fuel ~]# ssh -tt node-1 ssh -tt node-2 ssh -tt node-3 sleep 1
|
[root@fuel ~]# ssh -tt node-1 ssh -tt node-2 ssh -tt node-3 sleep 1
|
||||||
Warning: Permanently added 'node-1' (RSA) to the list of known hosts.
|
Warning: Permanently added 'node-1' (RSA) to the list of known hosts.
|
||||||
ECDSA key fingerprint is 84:17:0d:ea:27:1f:4e:08:f7:54:b2:8c:fe:8a:13:1a.
|
ECDSA key fingerprint is 84:17:0d:ea:27:1f:4e:08:f7:54:b2:8c:fe:8a:13:1a.
|
||||||
Are you sure you want to continue connecting (yes/no)? yes
|
Are you sure you want to continue connecting (yes/no)? yes
|
||||||
Warning: Permanently added 'node-2,10.20.0.4' (ECDSA)
|
Warning: Permanently added 'node-2,10.20.0.4' (ECDSA)
|
||||||
to the list of known hosts. established.
|
to the list of known hosts. established.
|
||||||
ECDSA key fingerprint is
|
ECDSA key fingerprint is
|
||||||
c3:c6:ca:7d:11:d3:53:01:15:64:20:f7:c7:44:fb:d1.
|
c3:c6:ca:7d:11:d3:53:01:15:64:20:f7:c7:44:fb:d1.
|
||||||
Are you sure you want to continue connecting (yes/no)? yes
|
Are you sure you want to continue connecting (yes/no)? yes
|
||||||
Warning: Permanently added 'node-3,192.168.0.6' (ECDSA)
|
Warning: Permanently added 'node-3,192.168.0.6' (ECDSA)
|
||||||
to the list of known hosts.
|
to the list of known hosts.
|
||||||
Connection to node-3 closed.
|
Connection to node-3 closed.
|
||||||
Connection to node-2 closed.
|
Connection to node-2 closed.
|
||||||
Connection to node-1 closed. [root@fuel ~]#
|
Connection to node-1 closed.
[root@fuel ~]#
|
||||||
|
|
||||||
- Wait until the last node reboots:
|
- Wait until the last node reboots:
|
||||||
|
|
||||||
::
|
.. code-block:: bash
|
||||||
|
:linenos:
|
||||||
|
|
||||||
[root@fuel ~]# ssh node-3
|
[root@fuel ~]# ssh node-3
|
||||||
Warning: Permanently added 'node-3' (RSA) to the list of known hosts.
|
Warning: Permanently added 'node-3' (RSA) to the list of known hosts.
|
||||||
Welcome to Ubuntu 12.04.4 LTS (GNU/Linux 3.13.0-32-generic x86_64)
|
Welcome to Ubuntu 12.04.4 LTS (GNU/Linux 3.13.0-32-generic x86_64)
|
||||||
* Documentation: https://help.ubuntu.com/
|
* Documentation: https://help.ubuntu.com/
|
||||||
Last login: Tue Dec 23 05:55:47 2014 from 10.20.0.2
|
Last login: Tue Dec 23 05:55:47 2014 from 10.20.0.2
|
||||||
root@node-3:~#
|
root@node-3:~#
|
||||||
Broadcast message from root@node-3
|
Broadcast message from root@node-3
|
||||||
(unknown) at 6:00 ...
|
(unknown) at 6:00 ...
|
||||||
The system is going down for reboot NOW!
|
The system is going down for reboot NOW!
|
||||||
Connection to node-3 closed by remote host.
|
Connection to node-3 closed by remote host.
|
||||||
Connection to node-3 closed.
|
Connection to node-3 closed.
|
||||||
[root@fuel ~]#
|
[root@fuel ~]#
|
||||||
|
|
||||||
- Perform all the steps, planned for MM.
|
- Perform all the steps planned for MM.
|
||||||
|
|
||||||
|
|
||||||
- Enforce a return back into normal mode in reverse state:
|
- Enforce a return back into normal mode in reverse order:
|
||||||
|
|
||||||
::
|
.. code-block:: bash
|
||||||
|
:linenos:
|
||||||
|
|
||||||
[root@fuel ~]# ssh node-3 umm off
|
[root@fuel ~]# ssh node-3 umm off
|
||||||
Warning: Permanently added 'node-3' (RSA) to the list of known hosts.
|
Warning: Permanently added 'node-3' (RSA) to the list of known hosts.
|
||||||
[root@fuel ~]# ssh node-2 umm off
|
[root@fuel ~]# ssh node-2 umm off
|
||||||
Warning: Permanently added 'node-2' (RSA) to the list of known hosts.
|
Warning: Permanently added 'node-2' (RSA) to the list of known hosts.
|
||||||
[root@fuel ~]# ssh node-1 umm off
|
[root@fuel ~]# ssh node-1 umm off
|
||||||
Warning: Permanently added 'node-1' (RSA) to the list of known hosts.
|
Warning: Permanently added 'node-1' (RSA) to the list of known hosts.
|
||||||
|
|
||||||
|
|
||||||
|
|
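The ordering used in this example can be sketched in Python. The sketch picks
the controllers from the ``fuel nodes`` table, builds the chained ``umm on``
command, and lists the ``umm off`` commands in reverse order. It assumes that
host names follow the ``node-<id>`` pattern seen in the transcripts and that
the chained command works because ``umm on [cmd]`` executes ``cmd`` when MM is
reached. All function names are hypothetical helpers.

```python
FUEL_NODES = """
id | status | name | cluster| ip | mac | roles | pending_roles| online
---|--------|------------------|--------|-----------|-------------------|------------|--------------|-------
2 | ready | Untitled (c0:02) | 1 | 10.20.0.4 | e6:6a:42:96:a4:45 | controller | | True
4 | ready | Untitled (c0:04) | 1 | 10.20.0.6 | 66:10:2e:0c:12:4a | compute | | True
1 | ready | Untitled (c0:01) | 1 | 10.20.0.3 | fa:a1:39:94:7f:4c | controller | | True
3 | ready | Untitled (c0:03) | 1 | 10.20.0.5 | 82:cb:bb:50:40:47 | controller | | True
"""


def controller_ids(table_text):
    """Return the sorted ids of all nodes whose roles include "controller"."""
    rows = [line for line in table_text.splitlines() if line.strip()]
    header = [cell.strip() for cell in rows[0].split("|")]
    id_col = header.index("id")
    roles_col = header.index("roles")
    ids = []
    for row in rows[2:]:  # skip the header row and the ---|--- rule
        cells = [cell.strip() for cell in row.split("|")]
        roles = [role.strip() for role in cells[roles_col].split(",")]
        if "controller" in roles:
            ids.append(int(cells[id_col]))
    return sorted(ids)


def enter_mm_command(hosts):
    """Chain the calls so that each node triggers the next one from inside MM."""
    return " ".join(f"ssh {host} umm on" for host in hosts)


def leave_mm_commands(hosts):
    """The node that entered MM last is returned to normal mode first."""
    return [f"ssh {host} umm off" for host in reversed(hosts)]


if __name__ == "__main__":
    hosts = [f"node-{node_id}" for node_id in controller_ids(FUEL_NODES)]
    print(enter_mm_command(hosts))
    for command in leave_mm_commands(hosts):
        print(command)
```

The helpers only build command strings and do not run them, so the result can
be reviewed before anything is executed on the Fuel Master node.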
||||||
|
|||||||
@@ -6,21 +6,20 @@ Maintenance Mode
|
|||||||
|
|
||||||
Maintenance mode (MM) is a mode when the operating system on the node
|
Maintenance mode (MM) is a mode in which the operating system on the node
|
||||||
has only a critical set of working services that the system needs for
|
has only a critical set of working services that the system needs for
|
||||||
basic network and disk operations. The purpose of the maintenance mode
|
basic network and disk operations. The purpose of MM is to perform a system
|
||||||
is to do a system repair or run other service operations on the system.
|
repair or run other maintenance operations on the system.
|
||||||
The implementation of maintenance mode in 15B is based on the Ubuntu
|
|
||||||
recovery mode. The system goes into a reboot and goes through the
|
For switching to MM, the system shuts down and then goes through the regular
|
||||||
regular boot process until the system initialization stage (rc-sysinit).
|
boot process until the system initialization stage (rc-sysinit).
|
||||||
This is where the system enters the maintenance mode with the network
|
By the time the system enters MM, the network and filesystem services
|
||||||
and filesystem services started. In this moment we have already started
|
have already started. During the MM stage, the ``sshd`` and ``tty2``
|
||||||
network and filesystem. In MM stage are started sshd, tty2 and main MM
|
services start, and the main MM service waits for the command to continue the boot flow.
|
||||||
service wait command for boot flow continue.
|
|
||||||
|
See the :ref:`mm-ops` section of the Operations guide for the details.
|
||||||
|
|
||||||
Here is a Cloud Infrastructure Controller boot flow scheme:
|
Here is a Cloud Infrastructure Controller boot flow scheme:
|
||||||
|
|
||||||
.. image:: /_images/mm_bootflow.png
|
.. image:: /_images/mm_bootflow.png
|
||||||
|
|
||||||
For more information, see:
|
|
||||||
|
|
||||||
- :ref:`mm-ops`.
|
|
||||||
|
|
||||||
|
|||||||