Run Divingbell containers as unprivileged

Divingbell runs all its containers as privileged. Some Divingbell
containers can perform their jobs with the default set of Linux
capabilities that Docker gives to unprivileged containers while others
need additional capabilities. The default list of capabilities includes
the following:
  - SETPCAP
  - MKNOD
  - AUDIT_WRITE
  - CHOWN
  - NET_RAW
  - DAC_OVERRIDE
  - FOWNER
  - FSETID
  - KILL
  - SETGID
  - SETUID
  - NET_BIND_SERVICE
  - SYS_CHROOT
  - SETFCAP

The capabilities listed in the daemonset templates function as a
whitelist on top of the Docker defaults: each container has access to
the Linux capabilities listed in its SecurityContext in addition to the
aforementioned capabilities that Docker includes by default.
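
As an illustration of the additive behaviour described above, the
following Python sketch computes the set a container would end up with
and the bitmask that would show up as CapEff in /proc/<pid>/status. It
assumes the container process runs as root and that nothing is dropped.
The function names are made up for this example; the bit numbers are the
ones defined in linux/capability.h.

```python
# Bit numbers from linux/capability.h, limited to the capabilities
# named in this commit message.
CAP_BITS = {
    "CHOWN": 0, "DAC_OVERRIDE": 1, "FOWNER": 3, "FSETID": 4, "KILL": 5,
    "SETGID": 6, "SETUID": 7, "SETPCAP": 8, "NET_BIND_SERVICE": 10,
    "NET_ADMIN": 12, "NET_RAW": 13, "SYS_RAWIO": 17, "SYS_CHROOT": 18,
    "SYS_PTRACE": 19, "SYS_ADMIN": 21, "MKNOD": 27, "AUDIT_WRITE": 29,
    "SETFCAP": 31, "MAC_ADMIN": 33,
}

# The Docker default list given above.
DOCKER_DEFAULTS = frozenset({
    "SETPCAP", "MKNOD", "AUDIT_WRITE", "CHOWN", "NET_RAW", "DAC_OVERRIDE",
    "FOWNER", "FSETID", "KILL", "SETGID", "SETUID", "NET_BIND_SERVICE",
    "SYS_CHROOT", "SETFCAP",
})


def effective_caps(added=()):
    """Defaults plus whatever the template lists under capabilities.add."""
    return DOCKER_DEFAULTS | frozenset(added)


def cap_mask(caps):
    """Encode a set of capability names as a CapEff-style bitmask."""
    mask = 0
    for name in caps:
        mask |= 1 << CAP_BITS[name]
    return mask


if __name__ == "__main__":
    for label, added in [("defaults", []), ("with MAC_ADMIN", ["MAC_ADMIN"])]:
        print(f"{label}: {cap_mask(effective_caps(added)):016x}")
```

For the defaults alone the mask comes out as 00000000a80425fb, which can
be compared against the CapEff value seen inside a running container.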

Summary of testing for each daemonset:

The bcc capable tool [0] was used to discover which Linux capabilities
the Divingbell containers invoke. The tool was run against all the
processes running in each container. The Divingbell logs for each
container were also carefully analyzed for failed permission checks.
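
The log review can be approximated with a small filter. The Python
sketch below flags lines containing the two error strings that appear in
the logs quoted in this message. The function name and the choice of
patterns are assumptions, and other tools may word their denials
differently.

```python
import re

# Error strings seen in the log snippets quoted in this commit message.
DENIED = re.compile(r"Operation not permitted|Permission denied")


def failed_checks(log_text):
    """Return the log lines that report a failed permission check."""
    return [line.strip() for line in log_text.splitlines()
            if DENIED.search(line)]


# Sample built from lines quoted elsewhere in this message.
SAMPLE = """\
DEBUG + /sbin/ethtool -K cali86cb821b7db tx-nocache-copy off
INFO Cannot set device feature settings: Operation not permitted
nsenter: cannot open /proc/1/ns/ipc: Permission denied
DEBUG + INSTALLED_THIS_TIME=' htop'
"""

if __name__ == "__main__":
    for line in failed_checks(SAMPLE):
        print(line)
```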

daemonset-exec:
A recent change to use nsenter to enter all host namespaces when running
exec prevents divingbell-exec from running unprivileged, as there are no
Linux capabilities that allow write access to '/proc'.
When trying to run as unprivileged, the following prevents the pod from
coming up:
"nsenter: cannot open /proc/1/ns/ipc: Permission denied"

daemonset-sysctl:
Ran the divingbell-sys containers as unprivileged and the kernel config
on the host was updated as defined in the manifest. Kernel configs were
checked before and after running the divingbell-sys container as
unprivileged. Beyond the default Linux capabilities given by
Docker, the 'SYS_PTRACE', 'SYS_ADMIN', and 'SYS_RAWIO' Linux
capabilities are needed. The following log snippet shows the
circumstances under which these capabilities are needed:

"INFO * Applying /etc/sysctl.d/10-kernel-hardening.conf ...
INFO sysctl: setting key "kernel.kptr_restrict": Operation not permitted

INFO * Applying /etc/sysctl.d/10-ptrace.conf ...
INFO sysctl: setting key "kernel.yama.ptrace_scope": Operation not
permitted

INFO * Applying /etc/sysctl.d/10-zeropage.conf ...
INFO sysctl: setting key "vm.mmap_min_addr": Operation not permitted"
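
The Python sketch below pulls the denied keys out of a log like the one
above and looks up a capability for each. The pairing of key to
capability is an assumption based on a reading of the kernel's sysctl
handlers; the logs themselves only show that the three writes fail
without the three added capabilities. All names in the code are
hypothetical.

```python
import re

# Assumed pairing of each sysctl key with the capability checked on
# write. This is not stated in the logs and should be verified.
ASSUMED_CAP_FOR_KEY = {
    "kernel.kptr_restrict": "SYS_ADMIN",
    "kernel.yama.ptrace_scope": "SYS_PTRACE",
    "vm.mmap_min_addr": "SYS_RAWIO",
}

# \s+ tolerates the line wrap between "not" and "permitted".
FAILED_KEY = re.compile(r'setting key "([^"]+)": Operation not\s+permitted')


def denied_keys(log_text):
    """Return the sysctl keys whose writes were refused, in log order."""
    return FAILED_KEY.findall(log_text)


def caps_to_add(log_text):
    """Map the denied keys to the capabilities assumed to unblock them."""
    return sorted({ASSUMED_CAP_FOR_KEY[key] for key in denied_keys(log_text)
                   if key in ASSUMED_CAP_FOR_KEY})


SAMPLE = """\
INFO * Applying /etc/sysctl.d/10-kernel-hardening.conf ...
INFO sysctl: setting key "kernel.kptr_restrict": Operation not permitted
INFO * Applying /etc/sysctl.d/10-ptrace.conf ...
INFO sysctl: setting key "kernel.yama.ptrace_scope": Operation not
permitted
INFO * Applying /etc/sysctl.d/10-zeropage.conf ...
INFO sysctl: setting key "vm.mmap_min_addr": Operation not permitted
"""

if __name__ == "__main__":
    print(caps_to_add(SAMPLE))
```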

daemonset-perm:
Ran the divingbell-perm containers as unprivileged and the file
ownership and permissions on the host were updated as defined in the
manifest. As a test, the daemon was configured to run every minute
and the targeted files' ownership and permissions were manually
changed. It was then verified that Divingbell restored the ownership
and permissions of the files to what they should be. This applies to
the divingbell-perm-default and the divingbell-perm-calico containers.
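
The manual check described above can be replayed in miniature. The
Python sketch below runs against a throwaway temporary file rather than
a host path, and the mode 0o640 is a hypothetical stand-in for whatever
the manifest requests. The final chmod only stands in for the next
divingbell-perm run; it does not call Divingbell.

```python
import os
import stat
import tempfile


def matches(path, mode, uid=None, gid=None):
    """True when path has the expected permission bits and, if given, owner."""
    st = os.stat(path)
    if stat.S_IMODE(st.st_mode) != mode:
        return False
    if uid is not None and st.st_uid != uid:
        return False
    if gid is not None and st.st_gid != gid:
        return False
    return True


def demo():
    """Set the wanted mode, drift it by hand, restore it, and report each step."""
    with tempfile.NamedTemporaryFile() as tmp:
        os.chmod(tmp.name, 0o640)   # state requested by the (hypothetical) manifest
        before = matches(tmp.name, 0o640)
        os.chmod(tmp.name, 0o666)   # manual change, as in the test above
        drifted = not matches(tmp.name, 0o640)
        os.chmod(tmp.name, 0o640)   # stands in for the next divingbell-perm run
        restored = matches(tmp.name, 0o640)
        return before, drifted, restored


if __name__ == "__main__":
    print(demo())
```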

daemonset-limits:
Ran the divingbell-limits containers as unprivileged and checked the
ulimits on the host before and after running Divingbell; the ulimits
were updated to the values defined in the manifest. The capable tool also
showed that no additional Linux capabilities are needed.

daemonset-apparmor:
Ran the divingbell-apparmor containers as unprivileged and logs show no
evidence of failed permission checks. Additionally, the apparmor config
was updated in the manifest and the apparmor profile loaded
successfully. Beyond the default Linux capabilities given by Docker, the
'MAC_ADMIN' Linux capability is needed to load an apparmor profile.

daemonset-apt:
Ran the divingbell-apt containers as unprivileged and was able to
install packages without issues. As a test, the
manifest was updated to install 'htop' and after running Divingbell,
it was confirmed that 'htop' installed successfully. Here is
a snippet from the logs:
DEBUG + INSTALLED_THIS_TIME=' htop'
DEBUG + REQUESTED_PACKAGES=' htop'

daemonset-ethtool:
Ran the divingbell-ethtool containers as unprivileged and was able to
manage NIC tunables. As a check, the NIC tunables for ens3 were checked
before and after running Divingbell with 'ethtool -k ens3'. Divingbell
configured the NIC as defined in the manifest. Beyond the default Linux
capabilities given by Docker, the 'NET_ADMIN' Linux capability is needed.
The following is a log snippet showing what happens when the 'NET_ADMIN'
capability is not added:
"DEBUG + /sbin/ethtool -K cali86cb821b7db tx-nocache-copy off
INFO Cannot set device feature settings: Operation not permitted"

daemonset-uamlite:
Ran the divingbell-uamlite containers as unprivileged and was able to
successfully add user accounts as defined in the manifest. No additional
Linux capabilities are needed.

daemonset-mounts:
Ran the divingbell-mounts containers as unprivileged and was able to
successfully add host level mounts as defined in the manifest. No
additional Linux capabilities are needed.

[0] https://github.com/iovisor/bcc/blob/master/tools/capable.py

Change-Id: I26a1b5e06ad27c854d95e6675de05b884ce3bdc1
BARTRA, RICK 2019-02-26 14:17:09 -05:00 committed by Rick Bartra
parent 85534b7796
commit 2c80c45fe8
8 changed files with 11 additions and 13 deletions

@@ -49,7 +49,9 @@ spec:
     subPath: {{ $daemonset }}
     readOnly: true
   securityContext:
-    privileged: true
+    capabilities:
+      add:
+        - 'MAC_ADMIN'
 volumes:
 - name: rootfs-{{ $daemonset }}
   hostPath:

@@ -48,8 +48,6 @@ spec:
     mountPath: /tmp/{{ $daemonset }}.sh
     subPath: {{ $daemonset }}
     readOnly: true
-  securityContext:
-    privileged: true
 volumes:
 - name: rootfs-{{ $daemonset }}
   hostPath:

@@ -51,7 +51,9 @@ spec:
     subPath: {{ $daemonset }}
     readOnly: true
   securityContext:
-    privileged: true
+    capabilities:
+      add:
+        - 'NET_ADMIN'
 volumes:
 - name: rootfs-{{ $daemonset }}
   hostPath:

@@ -50,8 +50,6 @@ spec:
     mountPath: /tmp/{{ $daemonset }}.sh
     subPath: {{ $daemonset }}
     readOnly: true
-  securityContext:
-    privileged: true
 volumes:
 - name: rootfs-{{ $daemonset }}
   hostPath:

@@ -50,8 +50,6 @@ spec:
     mountPath: /tmp/{{ $daemonset }}.sh
     subPath: {{ $daemonset }}
     readOnly: true
-  securityContext:
-    privileged: true
 volumes:
 - name: rootfs-{{ $daemonset }}
   hostPath:

@@ -50,8 +50,6 @@ spec:
     mountPath: /tmp/{{ $daemonset }}.sh
     subPath: {{ $daemonset }}
     readOnly: true
-  securityContext:
-    privileged: true
 volumes:
 - name: rootfs-{{ $daemonset }}
   hostPath:

@@ -51,7 +51,11 @@ spec:
     subPath: {{ $daemonset }}
     readOnly: true
   securityContext:
-    privileged: true
+    capabilities:
+      add:
+        - 'SYS_PTRACE'
+        - 'SYS_ADMIN'
+        - 'SYS_RAWIO'
 volumes:
 - name: rootfs-{{ $daemonset }}
   hostPath:

@@ -50,8 +50,6 @@ spec:
     mountPath: /tmp/{{ $daemonset }}.sh
     subPath: {{ $daemonset }}
     readOnly: true
-  securityContext:
-    privileged: true
 volumes:
 - name: rootfs-{{ $daemonset }}
   hostPath: