connection: ssh: Clear environment when connecting to LXC containers

We should clear the environment before connecting to an LXC container to
avoid inheriting host variables that may break services inside the
container. For example, rabbitmqctl fails as follows:

TASK [Get status of rabbitmq] *************************************************************************************************************************************************************************************
fatal: [container1]: FAILED! => {"changed": true, "cmd": ["rabbitmqctl", "status"], "delta": "0:00:00.705116", "end": "2017-06-06 20:03:13.771796", "failed": true, "rc": 69, "start": "2017-06-06 20:03:13.066680", "stderr": "Error: unable to connect to node 'rabbit@vagrant-openSUSE-Leap': nodedown\n\nDIAGNOSTICS\n===========\n\nattempted to contact: ['rabbit@vagrant-openSUSE-Leap']\n\nrabbit@vagrant-openSUSE-Leap:\n  * unable to connect to epmd (port 4369) on vagrant-openSUSE-Leap: address (cannot connect to host/port)\n\n\ncurrent node details:\n- node name: 'rabbitmq-cli-82@localhost'\n- home dir: /var/lib/rabbitmq\n- cookie hash: NqWRA5RzO5daz4Jb5LJsXg==", "stderr_lines": ["Error: unable to connect to node 'rabbit@vagrant-openSUSE-Leap': nodedown", "", "DIAGNOSTICS", "===========", "", "attempted to contact: ['rabbit@vagrant-openSUSE-Leap']", "", "rabbit@vagrant-openSUSE-Leap:", "  * unable to connect to epmd (port 4369) on vagrant-openSUSE-Leap: address (cannot connect to host/port)", "", "", "current node details:", "- node name: 'rabbitmq-cli-82@localhost'", "- home dir: /var/lib/rabbitmq", "- cookie hash: NqWRA5RzO5daz4Jb5LJsXg=="], "stdout": "Status of node 'rabbit@vagrant-openSUSE-Leap' ...", "stdout_lines": ["Status of node 'rabbit@vagrant-openSUSE-Leap' ..."]}
	to retry, use: --limit @/vagrant/tests/test-rabbitmq-server-functional.retry

The reason for this failure is that the HOSTNAME variable is being
inherited from the host (vagrant-openSUSE-Leap) and the rabbitmqctl
command uses this variable to guess the host it should try to
connect to.
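
The inheritance described above can be reproduced outside of LXC. The
following is a minimal sketch, not part of this change: it starts a child
Python process twice, once with the caller's environment and once with a
nearly empty one. The helper name child_hostname and the small allow-list
of launch variables are assumptions made for illustration; lxc-attach
--clear-env itself is not invoked.

```python
import os
import subprocess
import sys


def child_hostname(env):
    """Return the HOSTNAME value seen by a child Python process."""
    code = "import os; print(os.environ.get('HOSTNAME', '<unset>'))"
    result = subprocess.run([sys.executable, "-c", code], env=env,
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()


# Simulate the host environment from the failure above.
host_env = dict(os.environ, HOSTNAME="vagrant-openSUSE-Leap")

# Stand-in for a cleared environment (assumption): keep only the variables
# that may be needed to launch the interpreter, and drop everything else.
launch_vars = ("PATH", "LD_LIBRARY_PATH", "SYSTEMROOT")
cleared_env = {k: host_env[k] for k in launch_vars if k in host_env}

inherited = child_hostname(host_env)   # child sees the host's HOSTNAME
cleared = child_hostname(cleared_env)  # child sees no HOSTNAME at all
print(inherited, cleared)
```

In the first call the child sees the host's HOSTNAME, which is the
situation rabbitmqctl was in; in the second call the variable is absent.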

This is similar to what the upstream lxc connection module is doing.

This is an attempt to fix problems introduced in
https://review.openstack.org/#/c/471472/ and subsequently
reverted in https://review.openstack.org/#/c/471713/

The reason for these failures was that 'lxc-attach' executed commands
which assumed that basic variables like HOME are set properly. However,
--clear-env didn't preserve these variables so various operations started
to fail. In order to fix that, it's best if we start a real login shell
using 'su' in order to mimic an expected user environment when executing
commands within the container.
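
A rough sketch of how the wrapped command is assembled is shown below. It
uses the standard library's shlex.quote in place of the shlex_quote import
used by the plugin, and the function name wrap_for_container and the sample
inner command are hypothetical. It only builds and inspects the string; it
does not run lxc-attach or su.

```python
import shlex


def wrap_for_container(container_name, cmd, container_user="root"):
    """Wrap cmd so it runs in a login shell inside the named container."""
    lxc_command = "lxc-attach --clear-env --name %s" % container_name
    # cmd is quoted as a whole because it becomes the single argument
    # of su's -c option.
    return "%s -- su - %s -c %s" % (lxc_command, container_user,
                                    shlex.quote(cmd))


# An already-quoted command, similar in shape to what the ssh plugin hands
# over (illustrative value only).
inner = "/bin/sh -c 'rabbitmqctl status'"
wrapped = wrap_for_container("container1", inner)

# Splitting the result the way a POSIX shell would shows that the inner
# command survives as one argument to su.
argv = shlex.split(wrapped)
print(wrapped)
print(argv[-1] == inner)
```

The extra layer of quoting is what produces the long runs of quote
characters mentioned in the code comments: each level of shell that parses
the string strips exactly one layer.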

Change-Id: I684a11f4380f91b1cb0585f38817859dfaa68f80
(cherry picked from commit 1c7cb99fcc)
Markos Chandras 2017-06-07 15:31:16 +01:00 committed by Jesse Pretorius (odyssey4me)
parent 2fbc4fd08f
commit af204e4418

@@ -17,6 +17,9 @@
 import imp
 import os
+from ansible.module_utils._text import to_bytes
+from ansible.compat.six.moves import shlex_quote
 # NOTICE(cloudnull): The connection plugin imported using the full path to the
 #                    file because the ssh connection plugin is not importable.
 import ansible.plugins.connection as conn
@@ -70,8 +73,32 @@ class Connection(SSH.Connection):
         """run a command on the remote host."""
         if self._container_check():
-            lxc_command = 'lxc-attach --name %s' % self.container_name
-            cmd = '%s -- %s' % (lxc_command, cmd)
+            # Remote user is normally set, but if it isn't, then default to 'root'
+            container_user = 'root'
+            if self._play_context.remote_user:
+                container_user = to_bytes(self._play_context.remote_user,
+                                          errors='surrogate_or_strict')
+            # NOTE(hwoarang) It is important to connect to the container
+            # without inheriting the host environment as that would interfere
+            # with running commands and services inside the container. However,
+            # it is also important to create a sensible environment within the
+            # container because certain commands and services expect some
+            # environment variables to be set properly. The best way to do
+            # that would be to execute the commands in a login shell
+            lxc_command = 'lxc-attach --clear-env --name %s' % self.container_name
+            # NOTE(hwoarang): the shlex_quote method is necessary here because
+            # we need to properly quote the cmd as it's being passed as argument
+            # to the -c su option. The Ansible ssh class has already
+            # quoted the command of the _executable_ (ie /bin/bash -c "$cmd").
+            # However, we also need to quote the executable itself because the
+            # entire command is being passed to the su process. This produces
+            # a somewhat ugly output with too many quotes in a row but we can't
+            # do much since we are effectively passing a command to a command
+            # to a command etc... It's somewhat ugly but maybe it can be
+            # improved somehow...
+            cmd = '%s -- su - %s -c %s' % (lxc_command, container_user,
+                                           shlex_quote(cmd))
         if self._chroot_check():
             chroot_command = 'chroot %s' % self.chroot_path