Browse Source

zuul_console: fix python 3 support

send() requires a bytes-like object in Python 3, ensure the error
message is encoded correctly.


Some debugging notes might come in handy for the future here.  This
problem appeared in a fairly specific part of the test cases when
setting "ansible_python_interpreter" to /usr/bin/python3.  The remote
streaming test has a task that is designed to fail [1]:

 - hosts: all
     - name: Remote shell task with python exception
       command: echo foo
         chdir: /remote-shelltask/somewhere/that/does/not/exist
       failed_when: false

We see that Ansible ships over a payload and tries to run it, but it
raises an exception very early.

 <> SSH: EXEC ssh -C ...  '/bin/sh -c '"'"'/usr/bin/python3 && sleep 0'"'"''
 <> Failed to connect to the host via ssh:
 Traceback (most recent call last):
   File "<stdin>", line 114, in <module>
   File "<stdin>", line 106, in _ansiballz_main
   File "/tmp/ansible_command_payload_tieedyzs/", line 263, in main
 FileNotFoundError: [Errno 2] No such file or directory: '/remote-shelltask/somewhere/that/does/not/exist'

When this task started, the Ansible task callbacks in the zuul_stream
callback plugin have setup a thread that listens for the console
output being sent by the remote zuul_console daemon started earlier in
the playbook [2].  This listening thread is sitting in a recv()
waiting for some streaming data to log [3].

There will be no remote log file for zuul_console to stream back,
because this task failed before it even got started.  What should
happen is the "[Zuul] Log not found" message should be sent back and
logic in [4] will match this and stop this thread.

When this does *not* happen, such as when this send() raises an
exception because of wrong data type, the task ends anyway and Ansible
moves on to make the end-of-task callbacks in zuul_stream (actually
there's a bunch of looping happening, but let's ignore those details).
This ends up in _stop_streamers() [5] which attempts to join(30) the
streaming thread.  Under normal circumstances, this thread should be
finished and the join() successful.  However, because the target
thread is stuck in a recv(), the 30-second timeout begins.  The clue
to this is in the logs you eventually get:

 [Zuul] Log Stream did not terminate

So eventually, Zuul would have made progress here and given up on
waiting for the thread to finish properly.  However, 30 seconds is a
long time to the unit-test and pushes the job over it's timeout.

Thus your end result is that when using Python 3 Zuul aborts the job,
and the test rather mysteriously fails!

[1] 3f8b36aa0b/tests/fixtures/config/remote-zuul-stream/git/org_project/playbooks/command.yaml (L93)
[2] 3f8b36aa0b/tests/fixtures/config/remote-zuul-stream/git/org_project/playbooks/command.yaml (L93)
[3] 3f8b36aa0b/zuul/ansible/base/callback/ (L14)
[4] 3f8b36aa0b/zuul/ansible/base/callback/ (L174)
[5] 3f8b36aa0b/zuul/ansible/base/callback/ (L271)

This is tested in the follow-on I2b3bc6d4f873b7d653cfaccd1598464583c561e7

Change-Id: I7cdcfc760975871f7fa9949da1015d7cec92ee67
Ian Wienand 2 years ago
  1. 17


@ -185,7 +185,7 @@ class Server(object):
console = self.chunkConsole(conn, log_uuid)
if console:
conn.send('[Zuul] Log not found\n')
conn.send(b'[Zuul] Log not found\n')
while True:
if self.followConsole(console, conn):
@ -307,6 +307,19 @@ def main():
from ansible.module_utils.basic import * # noqa
from ansible.module_utils.basic import AnsibleModule
# To debug this, you can run it with arguments specified on the
# command-line in a json file. e.g.
# $ cat args.json
# "state": "present"
# }
# }
# Then from a virtualenv with Ansible installed you can run
# $ ./ansible-env/bin/python ./ args.json
if __name__ == '__main__':
# test()