Adam Harwell 78f1c7b128 Minimize the effect of overloaded Health Manager processes
If a Health Manager is overloaded, it can begin to fall very far behind
in processing health updates. This causes huge delays in the whole
system and can cause two distinct issues:

1) If the HMs are all suddenly busy, delays can be long enough that no
messages get through within the failover timeout, and amps start to
fail, increasing load on the HMs and causing a cascade failure (I have
witnessed this happen once and take down over 50 LBs before manual
intervention could be taken).

2) Even one overloaded HM can cause updates to queue for extremely long
periods, which makes the system unreliable. Amps can go down and still
have health updates register for some time as the HM processes the queue
(in some cases I have seen dead amps updated for 5-10 minutes).

If we short-circuit handling before we update the health table, we can
solve these problems in two ways:

1) The heavy processing generally happens after this point, so
short-circuiting early will let some other threads finish faster and
have some chance of success.

2) Amphora health won't continue to be updated long after the messages
were received, so it won't be possible for zombie amphorae to eat as
many brains.
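
As a rough illustration of the short-circuit idea (a minimal sketch, not
the actual change): the handler compares the time a message was received
with the current time and drops it before the health-table update if it
has waited too long. All names here (HEARTBEAT_TIMEOUT, is_stale,
handle_health_message, update_health) and the 10-second value are
hypothetical and chosen only for this example.

```python
import time

# Hypothetical names and values throughout; this is not the actual Octavia code.
HEARTBEAT_TIMEOUT = 10.0  # seconds; assumed value for illustration


def is_stale(received_at, now, timeout=HEARTBEAT_TIMEOUT):
    """Return True if the message has waited longer than the timeout."""
    return (now - received_at) > timeout


def handle_health_message(message, received_at, update_health, now=None):
    """Process one health message, dropping it early if it is too old.

    update_health is a stand-in callable for the expensive health-table
    update. Returns True if the update was applied, False if skipped.
    """
    if now is None:
        now = time.monotonic()
    if is_stale(received_at, now):
        # Short-circuit before touching the health table, so a dead amp
        # is not refreshed by a message that sat in the queue too long.
        return False
    update_health(message)
    return True
```

With this shape, a backlog drains quickly because stale messages cost
only a timestamp comparison, and only fresh messages reach the expensive
update path.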

Change-Id: Iceeacfdcaebe1f9bb99bc08e318c9da73a66898d
(cherry picked from commit 61e0c14f48130d1d0519fa5527d2712ba6ce504f)
2018-04-21 00:12:53 +00:00