shaker/shaker/engine
Oded Le'Sage 539e376978 Update hung process/thread approach
The previous commit attempted to prevent a hung process/thread by using
a "tcp ping" to determine if the server_endpoint was reachable before
deploying the heat stack and starting the heartbeat thread.
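
A "tcp ping" of this kind can be sketched with the standard library alone. This is only an illustration, not the code from the previous commit: the function name `endpoint_reachable` and the default timeout are hypothetical.

```python
import socket


def endpoint_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) can be opened.

    Hypothetical helper for illustration; not taken from shaker itself.
    """
    try:
        # create_connection raises OSError (including ConnectionRefusedError
        # and timeouts) when the endpoint cannot be reached.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A single refused connection makes this check return False, which is why one-shot probing is sensitive to the transient errors described below.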

While this approach works well in theory, in practice we are seeing
many intermittent errors in a live K8s environment:
[Errno 111] Connection refused: ConnectionRefusedError

There could be numerous reasons for these intermittent connection
errors, but ZMQ has retry logic which should overcome them.

This commit updates the sockets used in ZMQ to add a timeout (a padded
value of agent_loss_timeout). While this does not prevent the
creation of a heat stack and heartbeat thread that might never respond,
it does solve the initial problem of stuck processes/threads and
allows a clean exit.
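
A minimal sketch of the socket timeout idea, assuming pyzmq. The padding factor, the REQ socket type, the helper names, and treating agent_loss_timeout as seconds are all assumptions made for illustration; the actual change lives in messaging.py and may differ.

```python
def padded_timeout_ms(agent_loss_timeout, padding=2.0):
    """Convert a timeout in seconds to a padded value in milliseconds.

    The padding factor is a hypothetical choice, not shaker's actual value.
    """
    return int(agent_loss_timeout * padding * 1000)


def make_socket_with_timeout(endpoint, agent_loss_timeout):
    """Create a REQ socket whose send/recv calls give up after the timeout."""
    # Imported here so the helper above stays usable without pyzmq installed.
    import zmq

    context = zmq.Context.instance()
    sock = context.socket(zmq.REQ)  # socket type is an assumption
    timeout_ms = padded_timeout_ms(agent_loss_timeout)
    # With these options, a blocked recv()/send() raises zmq.Again
    # instead of hanging forever.
    sock.setsockopt(zmq.RCVTIMEO, timeout_ms)
    sock.setsockopt(zmq.SNDTIMEO, timeout_ms)
    # Do not block process exit on unsent messages when the socket closes.
    sock.setsockopt(zmq.LINGER, 0)
    sock.connect(endpoint)
    return sock
```

With a timeout in place, the caller can catch `zmq.Again`, close the socket, and exit cleanly rather than waiting indefinitely on an endpoint that never answers.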

Change-Id: I8193c72120b459c2a18d780d9f8799e8df592e20
2020-02-11 12:13:58 -06:00
aggregators Remove unused logging import 2017-02-20 11:15:30 +07:00
executors Enhance Iperf executor to accept "unmapped" args 2018-11-07 16:10:57 -06:00
__init__.py Added aggregator for traffic stats post-processing 2015-03-11 18:52:38 +03:00
all_in_one.py Fix logging in shaker-all-in-one 2017-02-27 16:51:27 +04:00
config.py Fix failures in pep8 job caused by flake8>=2.6.0 2019-10-17 00:34:01 +04:00
deploy.py Update cleanup cfg parameter to a more appropriate name 2019-08-20 10:30:26 -05:00
image_builder.py Update cleanup cfg parameter to a more appropriate name 2019-08-20 10:30:26 -05:00
messaging.py Update hung process/thread approach 2020-02-11 12:13:58 -06:00
quorum.py Run multiple scenarios at once 2016-08-18 19:18:38 +03:00
report.py Revert "Report presence of any records with errors" 2018-10-28 19:11:38 +04:00
server.py Merge "Show proper error when there are not enough compute nodes" 2017-08-10 14:01:58 +00:00
sla.py Make SLA evaluation tolerant to errors 2015-12-23 13:28:31 +03:00
spot.py Remove unused logging import 2017-02-20 11:15:30 +07:00
utils.py Do not list test scenarios in CLI help 2018-11-05 16:05:32 +01:00
writer.py Run multiple scenarios at once 2016-08-18 19:18:38 +03:00