nova/nova/tests/unit/servicegroup
Roman Podoliaka e0647dd4b2 servicegroup: stop zombie service due to exception
If an exception is raised out of the _report_state call, we find that
the service no longer reports any updates to the database, so the
service is considered dead, thus creating a kind of zombie service.

I55417a5b91282c69432bb2ab64441c5cea474d31 seems to have introduced a
regression that leads to nova-* services being marked as 'down' if an
error happens in a remote nova-conductor while processing a state
report: only Timeout errors are currently handled, but other errors
are possible, e.g. a DBError (wrapped with RemoteError on the RPC
client side) if a DB temporarily goes away. Such an unhandled exception
effectively breaks the state reporting thread, and the service will be
up again only after a restart.

Although the intention of I55417a5b91282c69432bb2ab64441c5cea474d31 was
to avoid catching all possible exceptions, it looks like we must
do that to avoid creating a zombie.
The other part of that change was to ensure that during upgrade, we do
not spam the log server about MessagingTimeouts while the
nova-conductors are being restarted. This change preserves that
behavior.
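
The handling described above can be sketched roughly as follows. This is
a minimal, self-contained illustration and not the actual nova code:
MessagingTimeout here is a hypothetical stand-in for the RPC timeout
error, and StateReporter, save, disconnected and report_count are names
invented for the example.

```python
import logging

LOG = logging.getLogger("servicegroup_sketch")


class MessagingTimeout(Exception):
    """Hypothetical stand-in for the RPC timeout error."""


class StateReporter:
    """Illustrative reporter; `save` is any callable that persists state."""

    def __init__(self, save):
        self._save = save
        self.disconnected = False
        self.report_count = 0

    def report_state(self):
        """Report once; never let an exception escape to the caller."""
        try:
            self._save()
            self.report_count += 1
            if self.disconnected:
                # A previous report failed; note that we are back.
                self.disconnected = False
                LOG.info("Recovered from being unable to report status.")
        except MessagingTimeout:
            # Expected while conductors restart: warn only on the first
            # failure so the log is not flooded.
            if not self.disconnected:
                self.disconnected = True
                LOG.warning("Lost connection while reporting status.")
        except Exception:
            # Anything else (e.g. a remote DB error) is logged in full,
            # but must not kill the periodic reporting loop.
            LOG.exception("Unexpected error while reporting status")
            self.disconnected = True
```

Because report_state() swallows every exception, a periodic caller keeps
running through both timeouts and unexpected errors, and the next
successful report clears the disconnected flag.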

Closes-Bug: #1517926

Change-Id: I44f118f82fbb811b790222face4c74d79795fe21
(cherry picked from commit 49b0d1741c)
2015-12-01 11:13:57 +02:00
__init__.py              move all tests to nova/tests/unit                         2014-11-12 15:31:08 -05:00
test_api.py              Service group drivers forced_down flag utilization        2015-07-31 15:43:36 +02:00
test_db_servicegroup.py  servicegroup: stop zombie service due to exception        2015-12-01 11:13:57 +02:00
test_mc_servicegroup.py  servicegroup: remove get_all method never used as public  2015-05-07 03:50:05 -04:00
test_zk_driver.py        Fix up join() and leave() methods of servicegroup         2015-03-11 23:22:11 +00:00