zuul/zuul/ansible
Ian Wienand 9462c9e466 zuul_console: fix python 3 support
send() requires a bytes-like object in Python 3, ensure the error
message is encoded correctly.
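
As a rough illustration of the failure mode (not the actual zuul_console code), the sketch below sends the same text over a local socket pair, first as str and then as bytes. The message text and variable names are assumptions made for the example.

```python
import socket

# Hypothetical message; the real daemon's exact wording may differ.
message = "[Zuul] Log not found\n"

# A connected pair of local sockets stands in for daemon and listener.
server, client = socket.socketpair()
try:
    try:
        server.send(message)  # str is rejected by send() on Python 3
        str_send_error = None
    except TypeError as exc:
        str_send_error = exc

    # Encoding first gives send() the bytes-like object it requires.
    server.send(message.encode("utf-8"))
    received = client.recv(1024)
finally:
    server.close()
    client.close()
```

On Python 2 the first send() would have been accepted, because str was already a byte string there.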

---

Some debugging notes might come in handy for the future here.  This
problem appeared in a fairly specific part of the test cases when
setting "ansible_python_interpreter" to /usr/bin/python3.  The remote
streaming test has a task that is designed to fail [1]:

 - hosts: all
   tasks:
     - name: Remote shell task with python exception
       command: echo foo
       args:
         chdir: /remote-shelltask/somewhere/that/does/not/exist
       failed_when: false

We see that Ansible ships over a payload and tries to run it, but it
raises an exception very early.

 <192.168.122.1> SSH: EXEC ssh -C ...  '/bin/sh -c '"'"'/usr/bin/python3 && sleep 0'"'"''
 <192.168.122.1> Failed to connect to the host via ssh:
 Traceback (most recent call last):
   File "<stdin>", line 114, in <module>
   File "<stdin>", line 106, in _ansiballz_main
   ...
   File "/tmp/ansible_command_payload_tieedyzs/__main__.py", line 263, in main
 FileNotFoundError: [Errno 2] No such file or directory: '/remote-shelltask/somewhere/that/does/not/exist'

When this task started, the Ansible task callbacks in the zuul_stream
callback plugin had set up a thread that listens for the console
output being sent by the remote zuul_console daemon started earlier in
the playbook [2].  This listening thread is sitting in a recv()
waiting for some streaming data to log [3].

There will be no remote log file for zuul_console to stream back,
because this task failed before it even got started.  What should
happen is that the "[Zuul] Log not found" message is sent back, and the
logic in [4] matches it and stops this thread.
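
A minimal sketch of that handshake, assuming a line-oriented stream: the reader blocks in recv() and returns once it sees the marker.  The function read_stream() and the socket pair are hypothetical stand-ins, not the zuul_stream implementation; only the marker text comes from the description above.

```python
import socket
import threading

# Marker text taken from the description above.
LOG_NOT_FOUND = b"[Zuul] Log not found"


def read_stream(sock, lines):
    """Hypothetical reader: collect lines until the marker or EOF."""
    buf = b""
    while True:
        chunk = sock.recv(4096)  # blocks until the peer sends data
        if not chunk:
            return  # peer closed the connection
        buf += chunk
        while b"\n" in buf:
            line, buf = buf.split(b"\n", 1)
            lines.append(line)
            if line.startswith(LOG_NOT_FOUND):
                return  # nothing to stream, so stop listening


daemon_side, listener_side = socket.socketpair()
lines = []
reader = threading.Thread(target=read_stream, args=(listener_side, lines))
reader.start()

# The daemon reports the missing log as bytes, which lets the reader exit.
daemon_side.send(LOG_NOT_FOUND + b"\n")
reader.join(5)
reader_finished = not reader.is_alive()

daemon_side.close()
listener_side.close()
```

If the send() never happens, nothing wakes the reader and it stays in recv().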

When this does *not* happen, such as when this send() raises an
exception because of wrong data type, the task ends anyway and Ansible
moves on to make the end-of-task callbacks in zuul_stream (actually
there's a bunch of looping happening, but let's ignore those details).
This ends up in _stop_streamers() [5] which attempts to join(30) the
streaming thread.  Under normal circumstances, this thread has already
finished and the join() returns immediately.  However, because the
target thread is stuck in a recv(), the full 30-second timeout has to
elapse.  The clue
to this is in the logs you eventually get:

 [Zuul] Log Stream did not terminate
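
The timeout behaviour can be reproduced in isolation.  In this sketch a thread blocks in recv() on a socket whose peer never sends anything, and join() is given a short timeout standing in for the 30 seconds above.  All names are hypothetical; only the log text mirrors the message quoted above.

```python
import socket
import threading
import time

JOIN_TIMEOUT = 0.2  # stand-in for the 30 seconds described above


def stuck_reader(sock):
    """Block in recv() until data or EOF arrives."""
    try:
        sock.recv(4096)
    except OSError:
        pass


silent_peer, listener = socket.socketpair()
thread = threading.Thread(target=stuck_reader, args=(listener,), daemon=True)
thread.start()

start = time.monotonic()
thread.join(JOIN_TIMEOUT)  # returns after the timeout; the thread is not stopped
waited = time.monotonic() - start
still_running = thread.is_alive()
if still_running:
    print("[Zuul] Log Stream did not terminate")

# Closing the peer delivers EOF, which finally unblocks recv().
silent_peer.close()
thread.join(5)
listener.close()
```

join() with a timeout never raises on expiry, so the caller has to check is_alive() afterwards to tell whether the thread actually finished.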

So eventually, Zuul would have made progress here and given up on
waiting for the thread to finish properly.  However, 30 seconds is a
long time for a unit test, and it pushes the job over its timeout.

Thus the end result is that, when using Python 3, Zuul aborts the job
and the test rather mysteriously fails!

[1] 3f8b36aa0b/tests/fixtures/config/remote-zuul-stream/git/org_project/playbooks/command.yaml (L93)
[2] 3f8b36aa0b/tests/fixtures/config/remote-zuul-stream/git/org_project/playbooks/command.yaml (L93)
[3] 3f8b36aa0b/zuul/ansible/base/callback/zuul_stream.py (L14)
[4] 3f8b36aa0b/zuul/ansible/base/callback/zuul_stream.py (L174)
[5] 3f8b36aa0b/zuul/ansible/base/callback/zuul_stream.py (L271)

This is tested in the follow-on change I2b3bc6d4f873b7d653cfaccd1598464583c561e7.

Change-Id: I7cdcfc760975871f7fa9949da1015d7cec92ee67
2019-09-18 13:58:45 +10:00
2.5 Fix: prevent usage of hashi_vault 2019-09-10 14:55:19 +02:00
2.6 Fix: prevent usage of hashi_vault 2019-09-10 14:55:19 +02:00
2.7 Fix: prevent usage of hashi_vault 2019-09-10 14:55:19 +02:00
2.8 Fix: prevent usage of hashi_vault 2019-09-10 14:55:19 +02:00
base zuul_console: fix python 3 support 2019-09-18 13:58:45 +10:00
__init__.py Ansible launcher: add zuul_runner module 2016-05-12 11:37:19 -07:00
logconfig.py Add cherrypy to built-in logging config 2018-06-04 12:58:54 -07:00
paths.py Remove restriction on add_host 2018-09-06 03:33:19 +07:00