The previous commit attempted to prevent a hung process/thread by using
a "TCP ping" to determine whether the server_endpoint was reachable
before deploying the heat stack and starting the heartbeat thread.
While this approach works well in theory, in practice we are seeing many
intermittent errors in a live K8s environment:
[Errno 111] Connection refused: ConnectionRefusedError
There could be numerous reasons for these intermittent connection errors,
but ZMQ has retry logic which should overcome them.
This commit updates the sockets used in ZMQ to add a timeout (a padded
value of agent_loss_timeout). While this does not prevent the
creation of a heat stack and heartbeat thread that might never respond,
it does solve the initial problem of stuck processes/threads and
allows a clean exit.
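
For illustration only, a minimal sketch of the idea using pyzmq. The
REQ socket type, the 5 second padding, and the helper names
padded_timeout_ms and request_with_timeout are assumptions made for
this example and are not taken from the actual change:

```python
import zmq

# Hypothetical padding (seconds) added on top of agent_loss_timeout.
PADDING_SECONDS = 5


def padded_timeout_ms(agent_loss_timeout, padding=PADDING_SECONDS):
    """Return the padded timeout in milliseconds, as ZMQ expects."""
    return int((agent_loss_timeout + padding) * 1000)


def request_with_timeout(endpoint, payload, agent_loss_timeout,
                         padding=PADDING_SECONDS):
    """Send payload and wait for a reply; return None on timeout."""
    timeout_ms = padded_timeout_ms(agent_loss_timeout, padding)
    sock = zmq.Context.instance().socket(zmq.REQ)
    # Bound both directions so neither send() nor recv() blocks forever.
    sock.setsockopt(zmq.RCVTIMEO, timeout_ms)
    sock.setsockopt(zmq.SNDTIMEO, timeout_ms)
    # Drop unsent messages on close so shutdown is not held up.
    sock.setsockopt(zmq.LINGER, 0)
    try:
        # connect() does not fail if the peer is down; ZMQ keeps
        # retrying the connection in the background.
        sock.connect(endpoint)
        sock.send(payload)
        return sock.recv()
    except zmq.Again:
        # Raised when SNDTIMEO or RCVTIMEO expires.
        return None
    finally:
        sock.close()
```

With the timeouts set, a caller gets None back instead of blocking
indefinitely when the endpoint never answers, and can exit cleanly.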
Change-Id: I8193c72120b459c2a18d780d9f8799e8df592e20