Fix console log streaming with duplicated roles

If a role is applied to a host more than once (via either play
roles or include_roles, but not via an include_role loop), it will
have the same task UUID from ansible which means Zuul's command
plugin will write the streaming output to the same filename, and
the log streaming will request the same file.  That means the file
might look this after the second invocation:

2022-05-19 17:06:23.673625 | one
2022-05-19 17:06:23.673781 | [Zuul] Task exit code: 0
2022-05-19 17:06:29.226463 | two
2022-05-19 17:06:29.226605 | [Zuul] Task exit code: 0

But since we stop reading the log after "Task exit code", the user
would see "one" twice, and never see "two".

Here are some potential fixes for this that don't work:

* Accessing the task vars from zuul_stream to store any additional
  information: the callback plugins are not given the task vars.
* Setting the log id on the task args in zuul_stream instead of
  command: the same Task object is used for each host and therefore
  the command module might see the task object after it has been
  further modified (in other words, nothing host-specific can be
  set on the task object).
* Setting an even more unique uuid than Task._uuid on the Task
  object in zuul_stream and using that in the command module instead
  of Task._uuid: in some rare cases, the actual task Python object
  may be different between the callback and command plugin, yet still
  have the same _uuid; therefore the new attribute would be missing.

Instead, a global variable is used in order to transfer data between
zuul_stream and command.  This variable holds a counter for each
task+host combination.  Most of the time it will be 1, but if we run
the same task on the same host again, it will increment.  Since Ansible
will not run more than one task on a host simultaneously, so there is
no race between the counter being incremented in zuul_stream and used
in command.

Because Ansible is re-invoked for each playbook, the memory usage is
not a concern.

There may be a fork between zuul_stream and command, but that's fine
as long as we treat it as read-only in the command plugin.  It will
have the data for our current task+host from the most recent zuul_stream
callback invocation.

This change also includes a somewhat unrelated change to the test
infrastructure.  Because we were not setting the log stream port on
the executor in tests, we were actually relying on the "real" OpenDev
Zuul starting zuul_console on the test nodes rather than the
zuul_console we set up for each specific Ansible version from the tests.
This corrects that and uses the correct zuul_console port, so that if we
make any changes to zuul_console in the future, we will test the
changed version, not the one from the Zuul which actually runs the
tox-remote job.

Change-Id: Ia656db5f3dade52c8dbd0505b24049fe0fff67a5
This commit is contained in:
James E. Blair 2022-05-17 16:31:35 -07:00
parent 50a8d8ab96
commit 096e47dc43
7 changed files with 70 additions and 6 deletions

View File

@ -89,3 +89,22 @@
- failed_in_loop1
- failed_in_loop2
ignore_errors: True
# Try transitive includes two different ways
- hosts: compute1
tasks:
- include_role:
name: include-echo-role
vars:
item: transitive-one
- include_role:
name: include-echo-role
vars:
item: transitive-two
- hosts: compute1
roles:
- role: include-echo-role
item: transitive-three
- role: include-echo-role
item: transitive-four

View File

@ -0,0 +1,2 @@
- name: Echo message
command: "echo {{item}}"

View File

@ -0,0 +1,3 @@
- name: Include echo role
include_role:
name: echo-role

View File

@ -28,6 +28,7 @@ class FunctionalZuulStreamMixIn:
def _setUp(self):
self.log_console_port = 19000 + int(
self.ansible_core_version.split('.')[1])
self.executor_server.log_console_port = self.log_console_port
self.fake_nodepool.remote_ansible = True
ansible_remote = os.environ.get('ZUUL_REMOTE_IPV4')
@ -150,6 +151,10 @@ class FunctionalZuulStreamMixIn:
self.assertLogLine(r'compute1 \| failed_in_loop2', text)
self.assertLogLine(r'compute1 \| ok: Item: failed_in_loop2 '
r'Result: 1', text)
self.assertLogLine(r'compute1 \| transitive-one', text)
self.assertLogLine(r'compute1 \| transitive-two', text)
self.assertLogLine(r'compute1 \| transitive-three', text)
self.assertLogLine(r'compute1 \| transitive-four', text)
self.assertLogLine(
r'controller \| ok: Runtime: \d:\d\d:\d\d\.\d\d\d\d\d\d', text)
self.assertLogLine('PLAY RECAP', text)

View File

@ -22,10 +22,18 @@ class ActionModule(command.ActionModule):
def run(self, tmp=None, task_vars=None):
# we need the zuul_log_id on shell and command tasks
host = paths._sanitize_filename(task_vars.get('inventory_hostname'))
if self._task.action in (
'command', 'shell',
'ansible.builtin.command', 'ansible.builtin.shell'):
self._task.args['zuul_log_id'] = "%s-%s" % (self._task._uuid, host)
# Get a unique key for ZUUL_LOG_ID_MAP. ZUUL_LOG_ID_MAP
# is read-only since we are forked. Use it to add a
# counter to the log id so that if we run the same task
# more than once, we get a unique log file. See comments
# in paths.py for details.
log_host = paths._sanitize_filename(
task_vars.get('inventory_hostname'))
key = "%s-%s" % (self._task._uuid, log_host)
count = paths.ZUUL_LOG_ID_MAP.get(key, 0)
self._task.args['zuul_log_id'] = "%s-%s-%s" % (
self._task._uuid, count, log_host)
return super(ActionModule, self).run(tmp, task_vars)

View File

@ -261,7 +261,8 @@ class CallbackModule(default.CallbackModule):
if task.async_val:
# Don't try to stream from async tasks
return
if task.action in ('command', 'shell'):
if task.action in ('command', 'shell',
'ansible.builtin.command', 'ansible.builtin.shell'):
play_vars = self._play._variable_manager._hostvars
hosts = self._get_task_hosts(task)
@ -290,8 +291,17 @@ class CallbackModule(default.CallbackModule):
continue
ip = '127.0.0.1'
log_id = "%s-%s" % (
task._uuid, paths._sanitize_filename(inventory_hostname))
# Get a unique key for ZUUL_LOG_ID_MAP. Use it to add
# a counter to the log id so that if we run the same
# task more than once, we get a unique log file. See
# comments in paths.py for details.
log_host = paths._sanitize_filename(inventory_hostname)
key = "%s-%s" % (self._task._uuid, log_host)
count = paths.ZUUL_LOG_ID_MAP.get(key, 0) + 1
paths.ZUUL_LOG_ID_MAP[key] = count
log_id = "%s-%s-%s" % (
self._task._uuid, count, log_host)
streamer = threading.Thread(
target=self._read_log, args=(
host, ip, port, log_id, task_name, hosts))

View File

@ -52,3 +52,20 @@ def _import_ansible_action_plugin(name):
def _sanitize_filename(name):
return ''.join(c for c in name if c.isalnum())
# Ansible assigns a unique id to every task (Task._uuid). However, if
# a role is included more than once, the task object is re-used. In
# order to provide unique log ids for the Zuul command log streaming
# system, this global dictionary is used to map keys that are derived
# from tasks (task._uuid concatenated with the host name) to a counter
# which is incremented each time the task+host combination is
# encountered. Ansible will not run more than one task on a host
# simultaneously, so this should be sufficiently unique to avoid
# collisions.
#
# We use a global dictionary defined here so that zuul_stream can
# write to it and zuul.ansible.command modules can read it. Note that
# the command module operates after a fork and therefore it should be
# treated as read-only there.
ZUUL_LOG_ID_MAP = {}