send() requires a bytes-like object in Python 3, ensure the error
message is encoded correctly.
---
Some debugging notes might come in handy for the future here. This
problem appeared in a fairly specific part of the test cases when
setting "ansible_python_interpreter" to /usr/bin/python3. The remote
streaming test has a task that is designed to fail [1]:
- hosts: all
tasks:
- name: Remote shell task with python exception
command: echo foo
args:
chdir: /remote-shelltask/somewhere/that/does/not/exist
failed_when: false
We see that Ansible ships over a payload and tries to run it, but it
raises an exception very early.
<192.168.122.1> SSH: EXEC ssh -C ... '/bin/sh -c '"'"'/usr/bin/python3 && sleep 0'"'"''
<192.168.122.1> Failed to connect to the host via ssh:
Traceback (most recent call last):
File "<stdin>", line 114, in <module>
File "<stdin>", line 106, in _ansiballz_main
...
File "/tmp/ansible_command_payload_tieedyzs/__main__.py", line 263, in main
FileNotFoundError: [Errno 2] No such file or directory: '/remote-shelltask/somewhere/that/does/not/exist'
When this task started, the Ansible task callbacks in the zuul_stream
callback plugin have setup a thread that listens for the console
output being sent by the remote zuul_console daemon started earlier in
the playbook [2]. This listening thread is sitting in a recv()
waiting for some streaming data to log [3].
There will be no remote log file for zuul_console to stream back,
because this task failed before it even got started. What should
happen is the "[Zuul] Log not found" message should be sent back and
logic in [4] will match this and stop this thread.
When this does *not* happen, such as when this send() raises an
exception because of wrong data type, the task ends anyway and Ansible
moves on to make the end-of-task callbacks in zuul_stream (actually
there's a bunch of looping happening, but let's ignore those details).
This ends up in _stop_streamers() [5] which attempts to join(30) the
streaming thread. Under normal circumstances, this thread should be
finished and the join() successful. However, because the target
thread is stuck in a recv(), the 30-second timeout begins. The clue
to this is in the logs you eventually get:
[Zuul] Log Stream did not terminate
So eventually, Zuul would have made progress here and given up on
waiting for the thread to finish properly. However, 30 seconds is a
long time to the unit-test and pushes the job over it's timeout.
Thus your end result is that when using Python 3 Zuul aborts the job,
and the test rather mysteriously fails!
[1] 3f8b36aa0b/tests/fixtures/config/remote-zuul-stream/git/org_project/playbooks/command.yaml (L93)
[2] 3f8b36aa0b/tests/fixtures/config/remote-zuul-stream/git/org_project/playbooks/command.yaml (L93)
[3] 3f8b36aa0b/zuul/ansible/base/callback/zuul_stream.py (L14)
[4] 3f8b36aa0b/zuul/ansible/base/callback/zuul_stream.py (L174)
[5] 3f8b36aa0b/zuul/ansible/base/callback/zuul_stream.py (L271)
This is tested in the follow-on I2b3bc6d4f873b7d653cfaccd1598464583c561e7
Change-Id: I7cdcfc760975871f7fa9949da1015d7cec92ee67