Stop all but one RabbitMQ node prior to upgrade

RabbitMQ nodes must all be stopped prior to a major/minor version
upgrade [1]. The role does this by distinguishing between the upgrader
node and the rest in separate stop and start tasks.

Upgrades can fail when more than one member of rabbitmq_all is not a
member of the cluster. This is due to a bug that was fixed for
greenfield deployments by 5dc67955f0. The same fix was not applied to
upgrades because major/minor upgrades require all RabbitMQ nodes to be
stopped, which is incompatible with serialising the role in isolation.

This change adds a play that stops all but one of the nodes before the
rabbitmq_server role runs, and then serialises the role so that one
node is upgraded at a time. This keeps downtime to a minimum while
still allowing the role to be applied node by node.
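
As a rough illustration of the host pattern used by the stop play,
assume a hypothetical inventory with three members of rabbitmq_all (the
host names are invented for this sketch). Assuming the group order
follows the inventory order, rabbitmq_all[1:] would match every host
except the first, which the play name implies is the upgrader node:

```yaml
# Hypothetical inventory, for illustration only; host names are assumptions.
all:
  children:
    rabbitmq_all:
      hosts:
        rabbit1:   # rabbitmq_all[0]: not matched by rabbitmq_all[1:], left running
        rabbit2:   # matched by rabbitmq_all[1:], stopped if rabbitmq_upgrade is true
        rabbit3:   # matched by rabbitmq_all[1:], stopped if rabbitmq_upgrade is true
```

The stop task is still guarded by rabbitmq_upgrade, so on a normal
install (where the variable is unset or false) no node is stopped.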

[1] http://www.rabbitmq.com/clustering.html#upgrading

Change-Id: Icca5cb1a96f83063223b6ddbeb02eeb562b0931b
git-harry
2016-10-26 20:54:39 +01:00
parent d89bd9c009
commit 351dac725d

@@ -18,11 +18,23 @@
   user: root
   gather_facts: true
 
-# NOTE(mancdaz): rabbitmq cannot be upgraded in serial, so when
-# rabbitmq_upgrade=True, serial is set to 0, else it is 1 for installs
+# The cluster must be stopped when doing major/minor upgrades
+# http://www.rabbitmq.com/clustering.html#upgrading
+- name: Stop RabbitMQ nodes that are not the upgrader
+  hosts: rabbitmq_all[1:]
+  serial: 1
+  max_fail_percentage: 0
+  user: root
+  tasks:
+    - name: "Stop RabbitMQ"
+      service:
+        name: "rabbitmq-server"
+        state: "stopped"
+      when: rabbitmq_upgrade | default(false) | bool
+
 - name: Install RabbitMQ server
   hosts: "{{ rabbitmq_host_group }}"
-  serial: "{{ rabbitmq_upgrade|default(false) | bool | ternary(0, 1)}}"
+  serial: 1
   user: root
   gather_facts: True
   roles: