According to the docs ansible is expected to return 3 if there were
unreachable nodes. Zuul reacts on this and retries the job. However
ansible really returns 4 in this case  which actually is the code
for a parse error (which doesn't make sense to retry).
To work around that add a simple callback plugin that gets notified if
there are unreachable nodes and write the information about which
nodes failed to a special file beneath the job-output.txt. The
executor can detect this and react properly then. Further we have the
information about which nodes failed in this file which gets uploaded
to the log server too if it exists.