Slow down Kombu reconnect attempts

For a rationale for this patch, see the discussion surrounding Bug

When reconnecting to a RabbitMQ cluster with mirrored queues in
use, the attempt to release the connection can hang "indefinitely"
somewhere deep down in Kombu.  Blocking the thread for a bit
prior to release seems to kludge around the problem where it is
otherwise reproducible.

DocImpact

Change-Id: Ic2ede3046709b831adf8204e4c909c589c1786c4
Partial-Bug: #856764
Bogdan Dobrelya 2014-06-10 14:26:42 +03:00
parent 1cc7390033
commit 1472013830
1 changed files with 15 additions and 0 deletions

@@ -52,6 +52,10 @@ kombu_opts = [
                default='',
                help=('SSL certification authority file '
                      '(valid only if SSL enabled)')),
+    cfg.FloatOpt('kombu_reconnect_delay',
+                 default=1.0,
+                 help='How long to wait before reconnecting in response to an '
+                      'AMQP consumer cancel notification.'),
     cfg.StrOpt('rabbit_host',
                default='localhost',
                help='The RabbitMQ broker address where a single node is used'),
@@ -498,6 +502,17 @@ class Connection(object):
             LOG.info(_LI("Reconnecting to AMQP server on "
                      "%(hostname)s:%(port)d") % params)
             try:
+                # XXX(nic): when reconnecting to a RabbitMQ cluster
+                # with mirrored queues in use, the attempt to release the
+                # connection can hang "indefinitely" somewhere deep down
+                # in Kombu. Blocking the thread for a bit prior to
+                # release seems to kludge around the problem where it is
+                # otherwise reproducible.
+                if self.conf.kombu_reconnect_delay > 0:
+                    LOG.info(_("Delaying reconnect for %1.1f seconds...") %
+                             self.conf.kombu_reconnect_delay)
+                    time.sleep(self.conf.kombu_reconnect_delay)
                 self.connection.release()
             except self.connection_errors:
                 pass
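
For readers who want to try the delay-before-release pattern outside the driver, the following is a minimal standalone sketch and not the code from this commit. `FakeConnection` and `release_with_delay` are hypothetical names introduced only for illustration, and the `sleep` parameter is an assumption added so the pause can be replaced in tests. The sketch leaves out the driver's handling of `self.connection_errors`.

```python
import logging
import time

LOG = logging.getLogger(__name__)


class FakeConnection:
    """Hypothetical stand-in for a broker connection (not the Kombu API)."""

    def __init__(self):
        self.released = False

    def release(self):
        # The real driver calls release() on its Kombu connection here.
        self.released = True


def release_with_delay(connection, reconnect_delay, sleep=time.sleep):
    """Wait reconnect_delay seconds if it is positive, then release.

    Returns the number of seconds waited, so callers can check
    whether a delay was applied.
    """
    waited = 0.0
    if reconnect_delay > 0:
        LOG.info("Delaying reconnect for %1.1f seconds...", reconnect_delay)
        sleep(reconnect_delay)
        waited = float(reconnect_delay)
    connection.release()
    return waited


if __name__ == "__main__":
    conn = FakeConnection()
    # A delay of 0 (or any non-positive value) skips the pause entirely.
    release_with_delay(conn, 0.0)
```

As in the patch, a delay of zero or less disables the workaround, so the option can be turned off where the hang does not reproduce.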