From b977a4b14bcad2d3dd613fc1b0381982bece90ca Mon Sep 17 00:00:00 2001 From: "Denis V. Meltsaykin" Date: Tue, 13 Oct 2015 19:31:07 +0300 Subject: [PATCH] [RN6.1] 6.1 MU 3 - RabbitMQ OCF-script Added section about RabbitMQ OCF-script bugfix Change-Id: Iaad6bfb092438d6072e5a8839f4730101e76b8e7 --- pages/maintenance-updates/v61.rst | 4 + .../v61/1487517-rabbitmq-ocf.rst | 81 +++++++++++++++++++ .../v6-1/0061-maintenance-updates.rst | 2 + 3 files changed, 87 insertions(+) create mode 100644 pages/maintenance-updates/v61/1487517-rabbitmq-ocf.rst diff --git a/pages/maintenance-updates/v61.rst b/pages/maintenance-updates/v61.rst index 80c44bf97..b8987cdc4 100644 --- a/pages/maintenance-updates/v61.rst +++ b/pages/maintenance-updates/v61.rst @@ -86,3 +86,7 @@ The following updates for Mirantis OpenStack 6.1 are shared for an early access. ----- .. include:: /pages/maintenance-updates/v61/1452389-atop-logrotate.rst + +----- + +.. include:: /pages/maintenance-updates/v61/1487517-rabbitmq-ocf.rst diff --git a/pages/maintenance-updates/v61/1487517-rabbitmq-ocf.rst b/pages/maintenance-updates/v61/1487517-rabbitmq-ocf.rst new file mode 100644 index 000000000..dbfe50e73 --- /dev/null +++ b/pages/maintenance-updates/v61/1487517-rabbitmq-ocf.rst @@ -0,0 +1,81 @@ +.. _mos61mu-1487517: + +Status of alarms and queues is silently ignored in RabbitMQ monitoring OCF +========================================================================== + +The existing RabbitMQ monitoring OCF script is unreliable and may stop +the running RabbitMQ instance if "heavy" operations cause timeouts. +Also, the script is unable to determine the RabbitMQ deadlocking if there is +no free memory left. + +The resulting solution consists of two changes: + +* Avoid deadlocking of RabbitMQ if there are no free memory resources left. See `LP1463433 `_. + +* Make the OCF script ignore a small number of timeouts in rabbitmqctl + for 'heavy' operations: ``list_channels``, ``get_alarms`` and ``list_queues``. + See `LP1487517 `_ and `LP1479815 `_. + +As a part of this change, adjust the default timeouts of monitor actions in the deployed environment. + +Affected packages +----------------- +* **CentOS/@6.1:** fuel-library6.1=6.1.0-6760.2 +* **CentOS/@6.1:** fuel-ha-utils6.1=6.1.0-6760.2 +* **Ubuntu/@6.1:** fuel-library6.1=6.1.0-6760.2 +* **Ubuntu/@6.1:** fuel-ha-utils6.1=6.1.0-6760.2 + +Fixed packages +-------------- +* **CentOS/@6.1:** fuel-library6.1=6.1.0-6761.2 +* **CentOS/@6.1:** fuel-ha-utils6.1=6.1.0-6761.2 +* **Ubuntu/@6.1:** fuel-library6.1=6.1.0-6761.2 +* **Ubuntu/@6.1:** fuel-ha-utils6.1=6.1.0-6761.2 + +Patching scenario - Fuel Master node +------------------------------------ + +Run the following commands on the Fuel Master node:: + + yum clean expire-cache + yum -y update fuel-library + +Patching scenario - Ubuntu +-------------------------- + +Run the following commands on OpenStack Controller nodes:: + + apt-get update + apt-get install --only-upgrade fuel-ha-utils + +Patching scenario - CentOS +-------------------------- + +Run the following commands on OpenStack Controller nodes:: + + yum clean expire-cache + yum update fuel-ha-utils + +Patching scenario - further actions +----------------------------------- + +#. Run the following command on one of the OpenStack Controller nodes to edit the database of pacemaker:: + + crm configure edit p_rabbitmq-server + +#. Adjust the timeouts to 180 for the following monitor actions:: + + op monitor interval=30 timeout=60 + op monitor interval=27 role=Master timeout=60 + op monitor interval=103 role=Slave timeout=60 + + save the changes and quit the editor. + +#. Optionally, you may add a parameter to control how many retries there will be + before the RabbitMQ server will be considered as non-functional and will be + rebooted:: + + crm_resource --resource p_rabbitmq-server --set-parameter \ + max_rabbitmqctl_timeouts --parameter-value N + + where ``N`` is the number of retries, which is ``1`` by default and also is a recommended value. diff --git a/pages/release-notes/v6-1/0061-maintenance-updates.rst b/pages/release-notes/v6-1/0061-maintenance-updates.rst index c490713b2..a3aca8f5f 100644 --- a/pages/release-notes/v6-1/0061-maintenance-updates.rst +++ b/pages/release-notes/v6-1/0061-maintenance-updates.rst @@ -63,3 +63,5 @@ Proposed updates * :ref:`#1504916 ` Cumulative OpenStack update (14 fixes) * :ref:`#1452389 ` logrotate is rotating already rotated atop's logs + +* :ref:`#1487517 ` Status of alarms and queues is silently ignored in RabbitMQ monitoring OCF