Add spec safe-side-containers

This is an alternative pattern which can be used to launch side containers
in a safe manner within our current architecture. A specific focus here is
Neutron, which requires its side container processes to run in network
namespaces.

Change-Id: I352b0fbad444f8f340e53da0b758287f55e1c752

parent a81a4ad8fa
commit 827c692c8a

specs/stein/safe-side-containers.rst (normal file, 162 lines added)

..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

==============================================================
TripleO - Pattern to safely spawn a container from a container
==============================================================

This spec describes a pattern that can be used as an alternative to
what TripleO does today to allow certain containers (Neutron, etc.) to
spawn side processes that require special privileges such as network
namespaces. Specifically, it avoids exposing the docker socket or
relying on the Podman nsenter hacks that recently entered the codebase
in Stein.

Problem Description
===================

In Queens, TripleO implemented a containerized architecture with the goal of
containerizing all OpenStack services. This architecture was a success, but
a few applications regressed when compared with their baremetal-deployed
equivalents. One of these applications was Neutron, which requires the ability
to spawn long-lived "side" processes that are launched directly from the
Neutron agents themselves. In the original Queens architecture Neutron
launched these side processes inside the agent container itself, which
caused a service disruption whenever the Neutron agents were restarted.
This was not the case on baremetal, where these processes would continue
running across an agent restart or upgrade.

The workaround in Rocky was to add "wrapper" scripts for the Neutron agents
and to expose the docker socket to each agent container. These wrapper scripts
were bind mounted into the containers so that they overwrote the normal
location of the side process binary. With this crude mechanism, binaries like
'dnsmasq' and 'haproxy' were replaced by shell scripts, and these custom shell
scripts relied on the docker socket exposed from the host to launch a side
container with the same arguments that were supplied to the script.

This mechanism functionally solved the issues with our containerization but
introduced a security problem: the Neutron agent containers (privileged
containers with access to a docker socket) were now able to launch any
container.

In Stein things changed with our desire to support Podman. Unlike Docker,
Podman does not include a daemon on the host. All Podman commands are executed
via a CLI which runs the command on the host directly. We landed
patches which required Podman commands to use nsenter to enter the host's
namespace and run the commands there directly. Again, this mechanism requires
extra privileges to be granted to the Neutron agent containers in order for
them to be able to launch these commands. Furthermore, the mechanism is
somewhat cryptic to support and debug in the field.

Proposed Change
===============

Overview
--------

Use systemd on the host to launch the side process containers directly, with
support for the network namespaces that Neutron agents require. The benefit of
this approach is that we no longer have to give the Neutron containers
privileges to launch containers, which they shouldn't require.

The pattern could work like this:

#. A systemd.path file monitors a known location on the host for changes.
   Example (neutron-dhcp-dnsmasq.path):

   .. code-block:: ini

      [Path]
      PathModified=/var/lib/neutron/neutron-dnsmasq-processes-timestamp
      PathChanged=/var/lib/neutron/neutron-dnsmasq-processes-timestamp

      [Install]
      WantedBy=multi-user.target

#. When systemd.path notices a change it fires the service for this
   path file. Example (neutron-dhcp-dnsmasq.service):

   .. code-block:: ini

      [Unit]
      Description=neutron dhcp dnsmasq sync service

      [Service]
      Type=oneshot
      ExecStart=/usr/local/bin/neutron-dhcp-dnsmasq-process-sync
      User=root

#. We use the same "wrapper scripts" used today to write two files. The
   first file is a dump of the CLI arguments used to launch the process
   on the host. This file can optionally include extra data, such as the
   network namespaces which are required for some Neutron side processes.
   The second file is a timestamp which is monitored by systemd.path
   on the host for changes and is used as a signal that it needs to
   process the first file with arguments.

#. When a change is detected the systemd.service above executes a script on
   the host to cleanly launch containerized side processes. When the script
   finishes launching processes it truncates the file to start with a clean
   slate.

#. Both the wrapper scripts and the host scripts use flock to eliminate race
   conditions which could cause issues in relaunching or missed containers.

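As a rough illustration of the last three steps, the wrapper side and the
host side could be sketched as two shell functions. This is a minimal sketch
under stated assumptions, not the implementation proposed by this spec: the
arguments file name, the lock file name, the tab-separated record format and
the ``STATE_DIR``, ``LAUNCHER`` and ``SIDE_IMAGE`` variables are all
hypothetical, and arguments are assumed to contain no tabs or newlines. The
``--network ns:<path>`` option is the form Podman documents for joining an
existing network namespace; whether it suits every Neutron side process is
not verified here.

.. code-block:: shell

   #!/usr/bin/env bash
   # Sketch only: file names, record format and variables are assumptions.
   set -euo pipefail

   # Wrapper side, inside the agent container. Appends one record of the
   # form "netns<TAB>arg1<TAB>arg2..." and then updates the timestamp file
   # watched by the systemd.path unit. Both writes happen under the lock.
   record_side_process() {
       local dir="${STATE_DIR:-/var/lib/neutron}"
       local netns="$1"; shift
       (
           flock -x 9
           IFS=$'\t'   # makes "$*" join the arguments with tabs
           printf '%s\n' "$netns"$'\t'"$*" >> "$dir/neutron-dnsmasq-processes"
           date +%s > "$dir/neutron-dnsmasq-processes-timestamp"
       ) 9> "$dir/neutron-dnsmasq-processes.lock"
   }

   # Host side, run by the oneshot service. Launches one container per
   # record and then truncates the arguments file, all under the same lock.
   sync_side_processes() {
       local dir="${STATE_DIR:-/var/lib/neutron}"
       local args="$dir/neutron-dnsmasq-processes"
       local launcher="${LAUNCHER:-podman}"   # overridable for dry runs
       local image="${SIDE_IMAGE:?SIDE_IMAGE must name the side container image}"
       (
           flock -x 9
           [ -e "$args" ] || exit 0
           while IFS=$'\t' read -r -a fields; do
               [ "${#fields[@]}" -gt 0 ] || continue
               # fd 9 is closed for the child so that a detached container
               # cannot keep holding the lock after this script exits.
               "$launcher" run --detach \
                   --network "ns:/run/netns/${fields[0]}" \
                   "$image" "${fields[@]:1}" < /dev/null 9>&-
           done < "$args"
           : > "$args"
       ) 9> "$dir/neutron-dnsmasq-processes.lock"
   }

With ``LAUNCHER=echo`` the host side only prints the commands it would run,
which is a convenient way to dry-run the record format. On the host, the path
unit itself would be activated with
``systemctl enable --now neutron-dhcp-dnsmasq.path``.
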
Alternatives
------------

With Podman, an API like varlink would be an option; however, it would likely
still require exposure to a socket on the host, which would involve
extra privileges like what we have today. It would, however, avoid the
nsenter hacks.

An architecture like Kubernetes would give us an API which could be used
to launch containers directly via the container orchestration engine (COE).

Additionally, an external process manager in Neutron that is "containers
aware" could be written to improve either of the above options. The current
Python code in Neutron was written primarily for launching processes on
baremetal, with the assumption that some of the processes it launches are
meant to live across a container restart. Implementing a class that can launch
side processes via a clean interface rather than overwriting binaries would be
desirable. Classes that launch containers via Kubernetes and/or via systemd on
the host directly could be supported.

Security Impact
---------------

This mechanism should allow us to remove some of the container privileges for
Neutron agents which in the past were used to execute containers. It is
a more restrictive, if crude, interface that allows the containers to launch
only a specific type of process rather than any container they choose.

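The restriction could be made explicit in the host-side script, for example
by checking the requested command against a fixed list before launching
anything. The function name and the list below are hypothetical; 'dnsmasq'
and 'haproxy' are simply the side processes named earlier in this spec.

.. code-block:: shell

   # Hypothetical allowlist check for a host-side sync script.
   # Returns 0 only for the side processes the host is meant to launch.
   is_allowed_side_process() {
       case "${1:-}" in
           dnsmasq|haproxy) return 0 ;;
           *) return 1 ;;
       esac
   }

A host script could call this on the first argument of each record and skip
any record that fails the check, so that a compromised agent container cannot
request an arbitrary command.
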
Upgrade Impact
--------------

The side process containers should be the same regardless of how they are
launched, so the upgrade impact should be minimal.

Implementation
==============

Assignee(s)
-----------

Primary assignee:
  dan-prince

Other contributors:
  emilienm

Work Items
----------

#. Ansible playbook to create the systemd files and wrappers

#. TripleO Heat template updates to use the new playbooks

#. Remove/deprecate the old docker.socket and nsenter code from puppet-tripleo