Use gearman client keepalive

If the gearman server vanishes (e.g. due to a VM crash), some clients,
such as the merger, may not notice that it is gone. They wait forever
for data on a dead connection. In our case the VM hosting the
zuul-scheduler crashed, and after the scheduler was restarted all
mergers were still waiting for data on the stale connection, which
blocked a successful scheduler restart. With TCP keepalive enabled,
the kernel can detect this situation and kill broken idle connections.

Depends-On: I8589cd45450245a25539c051355b38d16ee9f4b9
Change-Id: I30049d59d873d64f3b69c5587c775827e3545854
Tobias Henkel
2018-09-04 13:52:33 +02:00
parent cea6505d1b
commit fb4c6402a4
8 changed files with 25 additions and 9 deletions

@@ -151,7 +151,9 @@ class GithubGearmanWorker(object):
ssl_ca = get_default(self.config, 'gearman', 'ssl_ca')
self.gearman = gear.TextWorker('Zuul Github Connector')
self.log.debug("Connect to gearman")
-self.gearman.addServer(server, port, ssl_key, ssl_cert, ssl_ca)
+self.gearman.addServer(server, port, ssl_key, ssl_cert, ssl_ca,
+                       keepalive=True, tcp_keepidle=60,
+                       tcp_keepintvl=30, tcp_keepcnt=5)
self.log.debug("Waiting for server")
self.gearman.waitForServer()
self.log.debug("Registering")
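
For reference, the keepalive arguments in the hunk above presumably map to the standard per-socket TCP keepalive options. The sketch below is not the gear library's implementation; the helper names `enable_keepalive` and `worst_case_detection_seconds` are hypothetical and only illustrate what such settings typically do on a plain Python socket, and roughly how long a dead peer goes unnoticed with the values chosen here.

```python
import socket


def enable_keepalive(sock, idle=60, interval=30, count=5):
    """Hypothetical helper: turn on TCP keepalive for an existing socket.

    idle     -- seconds of inactivity before the first probe is sent
    interval -- seconds between unanswered probes
    count    -- unanswered probes before the kernel drops the connection
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The fine-grained options are platform specific (present on Linux),
    # so each one is applied only if this Python build exposes it.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)


def worst_case_detection_seconds(idle, interval, count):
    """Approximate time until an idle, dead connection is torn down."""
    return idle + interval * count


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    print(worst_case_detection_seconds(60, 30, 5))
    s.close()
```

With tcp_keepidle=60, tcp_keepintvl=30 and tcp_keepcnt=5, a silently vanished server should be detected after roughly 60 + 30 * 5 = 210 seconds of inactivity, at which point a blocked read fails instead of waiting forever.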