Adjust legacy OpenStack HA policy to make reply queues HA

Changes in oslo.messaging for 2023.1 exposed a known race
condition in RabbitMQ when dealing with non-HA classic queues.
When a RabbitMQ cluster member is taken down, clients failing
over to other members may erroneously be told that a queue
exists while it is in the process of being deleted. Those
clients can then wait indefinitely for messages from a queue
that no longer exists, until their services are restarted.

Making the reply queues HA resolves this issue, at the expense
of a 3x increase in reply queues across the cluster. I assume
reply queues were previously excluded from the HA policy for
performance reasons, since their number is tied to the number
of compute nodes in an OpenStack deployment.

Context: https://bugs.launchpad.net/oslo.messaging/+bug/2031512

Change-Id: Ia0a26fdfdfa09088c921f1530d4ac020b2bec290
Andrew Bonney 2024-04-17 08:23:01 +01:00
parent cfb1c28151
commit f5ecdf4852

@@ -290,7 +290,7 @@ rabbitmq_policies: []
 rabbitmq_apply_openstack_policies: False
 rabbitmq_openstack_policies:
   - name: "HA"
-    pattern: '^(?!(amq\.)|(.*_fanout_)|(reply_)).*'
+    pattern: '^(?!(amq\.)|(.*_fanout_)).*'
     tags: "ha-mode=all"
 rabbitmq_port_bindings:
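
To illustrate the effect of the pattern change, the sketch below applies the old and new patterns to a few queue names. The names are hypothetical examples chosen for illustration, not taken from a real deployment, and Python's `re` module stands in for RabbitMQ's own regex engine on the assumption that both treat this negative lookahead the same way.

```python
import re

# Patterns copied from the diff above.
OLD_PATTERN = r'^(?!(amq\.)|(.*_fanout_)|(reply_)).*'
NEW_PATTERN = r'^(?!(amq\.)|(.*_fanout_)).*'

# Hypothetical queue names, for illustration only.
QUEUES = [
    "amq.gen-abc123",             # broker-generated name
    "scheduler_fanout_0123abcd",  # fanout queue
    "reply_0123abcd",             # RPC reply queue
    "conductor",                  # ordinary service queue
]


def covered_by_policy(pattern, names):
    """Return the names that the given policy pattern matches."""
    regex = re.compile(pattern)
    return [name for name in names if regex.match(name)]


old_covered = covered_by_policy(OLD_PATTERN, QUEUES)
new_covered = covered_by_policy(NEW_PATTERN, QUEUES)

print("old policy covers:", old_covered)
print("new policy covers:", new_covered)
```

Under these assumptions, only the `reply_` queue changes status: it is skipped by the old pattern and covered by the new one, while the `amq.` and `_fanout_` names stay excluded in both cases.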