shaker/shaker
Oded Le'Sage 539e376978 Update hung process/thread approach
The previous commit attempted to prevent a hung process/thread by using
a "tcp ping" to determine if the server_endpoint was reachable before
deploying the heat stack and starting the heartbeat thread.
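
For context, a "tcp ping" of this kind can be sketched roughly as below.
This is only an illustration using the standard library, not the code from
the previous commit; the function name tcp_ping and its timeout default
are assumptions.

```python
import socket


def tcp_ping(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) can be opened.

    Hypothetical helper: the name and the default timeout are
    illustrative and not taken from the shaker code base.
    """
    try:
        # create_connection performs the TCP handshake; the context
        # manager closes the socket again as soon as it succeeds.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError ([Errno 111]), timeouts and
        # unreachable hosts, all of which are OSError subclasses.
        return False
```

A single refused connection makes such a check return False even when a
retry a moment later would have succeeded, which is the weakness
described below.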

While this approach works well in theory, in practice a live K8s
environment produces a lot of random errors such as:
[Errno 111] Connection refused: ConnectionRefusedError

There could be numerous reasons for these random connection errors, but
ZMQ has retry logic which should overcome these problems.

This commit adds a timeout to the sockets used in ZMQ (a padded
value of agent_loss_timeout). While this does not prevent the
creation of a heat stack and heartbeat thread that might never respond,
it does solve the initial problem of stuck processes/threads and
allows a clean exit.
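
A minimal sketch of the idea, assuming pyzmq and its RCVTIMEO, SNDTIMEO
and LINGER socket options. The helper names and the 10 second padding are
hypothetical; the actual options and padding used by this commit may
differ.

```python
def padded_timeout_ms(agent_loss_timeout, padding=10):
    """Convert agent_loss_timeout (seconds) plus padding to milliseconds.

    The padding value is an assumption made for this sketch.
    """
    return int((agent_loss_timeout + padding) * 1000)


def apply_timeouts(sock, timeout_ms):
    """Set send/receive timeouts on an existing pyzmq socket."""
    # Imported here so the pure helper above works without pyzmq installed.
    import zmq

    # With RCVTIMEO/SNDTIMEO set, a blocked recv()/send() raises
    # zmq.Again after timeout_ms instead of waiting forever.
    sock.setsockopt(zmq.RCVTIMEO, timeout_ms)
    sock.setsockopt(zmq.SNDTIMEO, timeout_ms)
    # LINGER=0 drops unsent messages on close so shutdown cannot hang.
    sock.setsockopt(zmq.LINGER, 0)
```

A caller would then catch zmq.Again around recv() and treat it as a lost
agent, which lets the process or thread exit cleanly rather than hang.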

Change-Id: I8193c72120b459c2a18d780d9f8799e8df592e20
2020-02-11 12:13:58 -06:00
..
agent Allow more than 1 agent per host 2017-03-01 11:17:31 +04:00
engine Update hung process/thread approach 2020-02-11 12:13:58 -06:00
openstack Make environment to create_stack have a default value 2019-10-17 07:20:36 +00:00
resources Upgrade pip and setuptools in CentOS images 2020-01-07 12:49:37 -05:00
scenarios Enhance Iperf executor to accept "unmapped" args 2018-11-07 16:10:57 -06:00
tests Add support for best-effort accommodations 2018-12-10 12:15:03 -08:00
__init__.py Initial commit 2015-01-28 18:56:01 +03:00
lib.py Remove unused logging import 2017-02-20 11:15:30 +07:00
version.py Initial commit 2015-01-28 18:56:01 +03:00