From e7889891987c6108cb66ef06cd0f48868fc3809f Mon Sep 17 00:00:00 2001
From: Michele Baldessari
Date: Mon, 2 Dec 2019 09:03:06 +0100
Subject: [PATCH] Increase rabbitmq tcp backlog

From https://bugzilla.redhat.com/show_bug.cgi?id=1778428

We need to tune the default rabbitmq tcp listen backlog. Currently it
defaults to 128, but here's what happens:

Say we have 1500 total rabbitmq client connections spread across a 3
node cluster, evenly distributed so each node has 500 clients. Then, we
stop rabbitmq on one of the nodes. Now those 500 client connections all
immediately fail over to the other two nodes. Assume a roughly even
split, and each gets 250 connections simultaneously. Since the tcp
listen backlog is only 128, a large number of the failover connections
cannot connect and get ECONNREFUSED because the kernel just drops them.
Eventually things retry and the backlog clears, but it just makes things
noisy in the logs and makes failover take a little bit longer.

Upstream docs discuss here:
https://www.rabbitmq.com/networking.html#tuning-for-large-number-of-connections-connection-backlog

Suggested-By: John Eckersberg
Closes-Bug: #1854704

Change-Id: If6da4aff016db9a72e1cb9dfc9731f06e062f64d
(cherry picked from commit 9f4832fcc4d939da3d4e7f83e26c4f934bff7dc0)
---
 deployment/rabbitmq/rabbitmq-container-puppet.yaml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/deployment/rabbitmq/rabbitmq-container-puppet.yaml b/deployment/rabbitmq/rabbitmq-container-puppet.yaml
index 9c2421da22..b5ace89b4a 100644
--- a/deployment/rabbitmq/rabbitmq-container-puppet.yaml
+++ b/deployment/rabbitmq/rabbitmq-container-puppet.yaml
@@ -126,6 +126,7 @@ outputs:
 rabbitmq::wipe_db_on_cookie_change: true
 rabbitmq::port: 5672
 rabbitmq::loopback_users: []
+rabbitmq::tcp_backlog: 4096
 rabbitmq::package_provider: yum
 rabbitmq::package_source: undef
 rabbitmq::repos_ensure: false