Add info about rabbit_transient_queues_ttl

This parameter should be tuned correctly when running at large scale.

The default value is 30 minutes.
My recommendation is to set this to 60 seconds.

Signed-off-by: Arnaud Morin <arnaud.morin@ovhcloud.com>
Change-Id: I3fa643ef94a60474273902e95d4e03b9f31164d4
Arnaud Morin 2022-07-05 15:59:41 +02:00
parent 65f65e8df4
commit dc7818d769
1 changed files with 20 additions and 1 deletions

@ -40,7 +40,10 @@ You can also consider deploying rabbit in two ways:
* one rabbit (cluster or not) for each OpenStack service
* one big rabbit (cluster or not) for all OpenStack services
There is no recommendation on that part, except that if you split your rabbit across multiple services, you will, for sure, reduce the risk.
The recommendation is to split your rabbit into multiple clusters, for several reasons:
* Reduce the impact when a rabbit cluster is down
* Allow interventions on a smaller part of the infrastructure
Which version of rabbit should I run?
@ -247,3 +250,19 @@ in every OpenStack config file
Note that the durability of a queue (or an exchange) cannot be set AFTER the queue has been created.
So if you forgot to set this at the beginning, you will have to delete your queue before OpenStack can recreate the queues with correct durability.
rabbit_transient_queues_ttl
^^^^^^^^^^^^^^^^^^^^^^^^^^^
The default TTL for transient queues is 30 minutes, which is too long for neutron (both server and agent).
For example, when an agent is restarted, its old transient queues stay in the rabbit cluster for 30 minutes. Most of those queues are fanout queues, so they keep accumulating messages.
On a large cluster, you can end up with millions of messages very quickly (well before the 30 minutes have elapsed).
So the recommendation is to drastically lower this value, so that transient queues are removed much sooner (for example, after 60 seconds):
.. code-block:: console

   rabbit_transient_queues_ttl = 60
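
To get a feel for why the TTL matters, here is a rough back-of-the-envelope sketch in Python. It assumes that an orphaned fanout queue keeps receiving messages at a constant rate until it expires. The function name, the number of queues and the message rate are all hypothetical values chosen for illustration, not measurements from a real deployment.

```python
def orphaned_backlog(orphaned_queues: int,
                     msgs_per_sec_per_queue: float,
                     ttl_seconds: int) -> int:
    """Rough upper bound on messages piling up in consumer-less
    transient queues before the TTL removes them.

    Assumption: every orphaned queue receives messages at a constant
    rate for the whole TTL period.
    """
    return int(orphaned_queues * msgs_per_sec_per_queue * ttl_seconds)


# Hypothetical figures: 500 orphaned fanout queues (e.g. after restarting
# many agents), each receiving 2 messages per second.
DEFAULT_TTL = 1800      # 30 minutes, the default
RECOMMENDED_TTL = 60    # value recommended above

for ttl in (DEFAULT_TTL, RECOMMENDED_TTL):
    print(f"ttl={ttl}s -> up to {orphaned_backlog(500, 2.0, ttl)} messages")
```

Under these assumed figures, the default TTL allows up to 1.8 million stale messages, while a 60 second TTL caps the backlog at 60,000.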