From 2c80c45fe88c54f6a7bb4cba4b98a79465714e0f Mon Sep 17 00:00:00 2001
From: "BARTRA, RICK"
Date: Tue, 26 Feb 2019 14:17:09 -0500
Subject: [PATCH] Run Divingbell containers as unprivileged

Divingbell currently runs all its containers as privileged. Some
Divingbell containers can do their jobs with the default set of Linux
capabilities that Docker gives to unprivileged containers, while others
need additional capabilities. The default list of capabilities includes
the following:
- SETPCAP
- MKNOD
- AUDIT_WRITE
- CHOWN
- NET_RAW
- DAC_OVERRIDE
- FOWNER
- FSETID
- KILL
- SETGID
- SETUID
- NET_BIND_SERVICE
- SYS_CHROOT
- SETFCAP

The capabilities listed in the daemonset templates function as a
whitelist: each container has access to the Linux capabilities listed
in its securityContext plus the default capabilities that Docker
grants, as listed above.

Summary of testing for each daemonset:

The bcc 'capable' tool [0] was used to discover which Linux
capabilities the Divingbell containers invoke. The tool was run against
all the processes running in each container. The Divingbell logs for
each container were also analyzed for failed permission checks.

daemonset-exec:
A recent change to use nsenter to enter all host namespaces when
running exec prevents divingbell-exec from running unprivileged, as no
Linux capability allows write access to '/proc'. When trying to run as
unprivileged, the following error prevents the pod from coming up:
"nsenter: cannot open /proc/1/ns/ipc: Permission denied"
daemonset-exec is therefore left unchanged by this patch.

daemonset-sysctl:
Ran the divingbell-sys containers as unprivileged and the kernel config
on the host was updated as defined in the manifest. Kernel configs were
checked before and after running the divingbell-sys container as
unprivileged. Beyond the default Linux capabilities given by Docker,
the 'SYS_PTRACE', 'SYS_ADMIN', and 'SYS_RAWIO' Linux capabilities are
needed. The following log snippet shows the settings that fail without
them:
"INFO * Applying /etc/sysctl.d/10-kernel-hardening.conf ...
INFO sysctl: setting key "kernel.kptr_restrict": Operation not permitted
INFO * Applying /etc/sysctl.d/10-ptrace.conf ...
INFO sysctl: setting key "kernel.yama.ptrace_scope": Operation not permitted
INFO * Applying /etc/sysctl.d/10-zeropage.conf ...
INFO sysctl: setting key "vm.mmap_min_addr": Operation not permitted"

daemonset-perm:
Ran the divingbell-perm containers as unprivileged and the file
ownership and permissions on the host were updated as defined in the
manifest. As a test, the daemon was configured to run every minute and
the ownership and permissions of the targeted files were manually
changed. It was then verified that Divingbell restored the ownership
and permissions of the files to the values defined in the manifest.
This applies to the divingbell-perm-default and the
divingbell-perm-calico containers.

daemonset-limits:
Ran the divingbell-limits containers as unprivileged and checked the
ulimits on the host before and after running Divingbell. The ulimits
were updated to the values defined in the manifest. The capable tool
also showed that no additional Linux capabilities are needed.

daemonset-apparmor:
Ran the divingbell-apparmor containers as unprivileged and the logs
show no evidence of failed permission checks. Additionally, the
apparmor config was updated in the manifest and the apparmor profile
loaded successfully. Beyond the default Linux capabilities given by
Docker, the 'MAC_ADMIN' Linux capability is needed to load an apparmor
profile.

daemonset-apt:
Ran the divingbell-apt containers as unprivileged and was able to
install packages without issues. As a test, the manifest was updated to
install 'htop' and, after running Divingbell, it was confirmed that
'htop' installed successfully. Here is a snippet from the logs:
DEBUG + INSTALLED_THIS_TIME=' htop'
DEBUG + REQUESTED_PACKAGES=' htop'

daemonset-ethtool:
Ran the divingbell-ethtool containers as unprivileged and was able to
manage NIC tunables. As a check, the NIC tunables for ens3 were
inspected with 'ethtool -k ens3' before and after running Divingbell.
Divingbell configured the NIC as defined in the manifest. Beyond the
default Linux capabilities given by Docker, the 'NET_ADMIN' Linux
capability is needed. The following log snippet shows what happens when
the 'NET_ADMIN' capability is not added:
"DEBUG + /sbin/ethtool -K cali86cb821b7db tx-nocache-copy off
INFO Cannot set device feature settings: Operation not permitted"

daemonset-uamlite:
Ran the divingbell-uamlite containers as unprivileged and was able to
add user accounts as defined in the manifest. No additional Linux
capabilities are needed.

daemonset-mounts:
Ran the divingbell-mounts containers as unprivileged and was able to
add host-level mounts as defined in the manifest. No additional Linux
capabilities are needed.
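As a cross-check on the whitelist described above, the expected
effective capability mask of each container can be computed from the
Docker defaults plus the capabilities added in the templates. The
following is a minimal sketch and not part of Divingbell: the helper
names are hypothetical, the bit numbers are those defined in
linux/capability.h, and it assumes the container process runs as root,
so that the CapEff field in /proc/<pid>/status reflects the full set.

```python
# Capability bit numbers from linux/capability.h (only the ones used here).
CAP_BITS = {
    "CHOWN": 0, "DAC_OVERRIDE": 1, "FOWNER": 3, "FSETID": 4, "KILL": 5,
    "SETGID": 6, "SETUID": 7, "SETPCAP": 8, "NET_BIND_SERVICE": 10,
    "NET_ADMIN": 12, "NET_RAW": 13, "SYS_RAWIO": 17, "SYS_CHROOT": 18,
    "SYS_PTRACE": 19, "SYS_ADMIN": 21, "MKNOD": 27, "AUDIT_WRITE": 29,
    "SETFCAP": 31, "MAC_ADMIN": 33,
}

# Default capabilities listed in the commit message.
DOCKER_DEFAULTS = [
    "SETPCAP", "MKNOD", "AUDIT_WRITE", "CHOWN", "NET_RAW", "DAC_OVERRIDE",
    "FOWNER", "FSETID", "KILL", "SETGID", "SETUID", "NET_BIND_SERVICE",
    "SYS_CHROOT", "SETFCAP",
]

# Capabilities added per daemonset by the templates in this patch.
# Daemonsets not listed here run with the defaults only.
ADDED = {
    "apparmor": ["MAC_ADMIN"],
    "ethtool": ["NET_ADMIN"],
    "sysctl": ["SYS_PTRACE", "SYS_ADMIN", "SYS_RAWIO"],
}


def cap_mask(names):
    """Return the bitmask with one bit set per capability name."""
    mask = 0
    for name in names:
        mask |= 1 << CAP_BITS[name]
    return mask


def expected_cap_eff(daemonset):
    """Expected mask for a daemonset: Docker defaults plus template additions."""
    return cap_mask(DOCKER_DEFAULTS + ADDED.get(daemonset, []))


def decode(mask):
    """Return the sorted capability names (from CAP_BITS) set in mask."""
    return sorted(n for n, bit in CAP_BITS.items() if mask & (1 << bit))


if __name__ == "__main__":
    for ds in ("apt", "apparmor", "ethtool", "sysctl"):
        print(f"{ds}: {expected_cap_eff(ds):016x}")
```

The printed values can be compared by hand with the CapEff line of a
process inside the corresponding container; a mismatch would suggest
that the runtime grants a different default set than the one listed
above.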
[0] https://github.com/iovisor/bcc/blob/master/tools/capable.py

Change-Id: I26a1b5e06ad27c854d95e6675de05b884ce3bdc1
---
 divingbell/templates/daemonset-apparmor.yaml | 4 +++-
 divingbell/templates/daemonset-apt.yaml      | 2 --
 divingbell/templates/daemonset-ethtool.yaml  | 4 +++-
 divingbell/templates/daemonset-limits.yaml   | 2 --
 divingbell/templates/daemonset-mounts.yaml   | 2 --
 divingbell/templates/daemonset-perm.yaml     | 2 --
 divingbell/templates/daemonset-sysctl.yaml   | 6 +++++-
 divingbell/templates/daemonset-uamlite.yaml  | 2 --
 8 files changed, 11 insertions(+), 13 deletions(-)

diff --git a/divingbell/templates/daemonset-apparmor.yaml b/divingbell/templates/daemonset-apparmor.yaml
index 6d673b0..35e82a6 100644
--- a/divingbell/templates/daemonset-apparmor.yaml
+++ b/divingbell/templates/daemonset-apparmor.yaml
@@ -49,7 +49,9 @@ spec:
           subPath: {{ $daemonset }}
           readOnly: true
         securityContext:
-          privileged: true
+          capabilities:
+            add:
+              - 'MAC_ADMIN'
       volumes:
       - name: rootfs-{{ $daemonset }}
         hostPath:
diff --git a/divingbell/templates/daemonset-apt.yaml b/divingbell/templates/daemonset-apt.yaml
index eeb929b..955688a 100644
--- a/divingbell/templates/daemonset-apt.yaml
+++ b/divingbell/templates/daemonset-apt.yaml
@@ -48,8 +48,6 @@ spec:
           mountPath: /tmp/{{ $daemonset }}.sh
           subPath: {{ $daemonset }}
           readOnly: true
-        securityContext:
-          privileged: true
       volumes:
       - name: rootfs-{{ $daemonset }}
         hostPath:
diff --git a/divingbell/templates/daemonset-ethtool.yaml b/divingbell/templates/daemonset-ethtool.yaml
index f58b5d1..2eadffb 100644
--- a/divingbell/templates/daemonset-ethtool.yaml
+++ b/divingbell/templates/daemonset-ethtool.yaml
@@ -51,7 +51,9 @@ spec:
           subPath: {{ $daemonset }}
           readOnly: true
         securityContext:
-          privileged: true
+          capabilities:
+            add:
+              - 'NET_ADMIN'
       volumes:
       - name: rootfs-{{ $daemonset }}
         hostPath:
diff --git a/divingbell/templates/daemonset-limits.yaml b/divingbell/templates/daemonset-limits.yaml
index fa7c767..5000203 100644
--- a/divingbell/templates/daemonset-limits.yaml
+++ b/divingbell/templates/daemonset-limits.yaml
@@ -50,8 +50,6 @@ spec:
           mountPath: /tmp/{{ $daemonset }}.sh
           subPath: {{ $daemonset }}
           readOnly: true
-        securityContext:
-          privileged: true
       volumes:
       - name: rootfs-{{ $daemonset }}
         hostPath:
diff --git a/divingbell/templates/daemonset-mounts.yaml b/divingbell/templates/daemonset-mounts.yaml
index cf7addc..0a05db0 100644
--- a/divingbell/templates/daemonset-mounts.yaml
+++ b/divingbell/templates/daemonset-mounts.yaml
@@ -50,8 +50,6 @@ spec:
           mountPath: /tmp/{{ $daemonset }}.sh
           subPath: {{ $daemonset }}
           readOnly: true
-        securityContext:
-          privileged: true
       volumes:
       - name: rootfs-{{ $daemonset }}
         hostPath:
diff --git a/divingbell/templates/daemonset-perm.yaml b/divingbell/templates/daemonset-perm.yaml
index 6c31c71..727d394 100644
--- a/divingbell/templates/daemonset-perm.yaml
+++ b/divingbell/templates/daemonset-perm.yaml
@@ -50,8 +50,6 @@ spec:
           mountPath: /tmp/{{ $daemonset }}.sh
           subPath: {{ $daemonset }}
           readOnly: true
-        securityContext:
-          privileged: true
       volumes:
       - name: rootfs-{{ $daemonset }}
         hostPath:
diff --git a/divingbell/templates/daemonset-sysctl.yaml b/divingbell/templates/daemonset-sysctl.yaml
index 7731302..7e5bc57 100644
--- a/divingbell/templates/daemonset-sysctl.yaml
+++ b/divingbell/templates/daemonset-sysctl.yaml
@@ -51,7 +51,11 @@ spec:
           subPath: {{ $daemonset }}
           readOnly: true
         securityContext:
-          privileged: true
+          capabilities:
+            add:
+              - 'SYS_PTRACE'
+              - 'SYS_ADMIN'
+              - 'SYS_RAWIO'
       volumes:
       - name: rootfs-{{ $daemonset }}
         hostPath:
diff --git a/divingbell/templates/daemonset-uamlite.yaml b/divingbell/templates/daemonset-uamlite.yaml
index b298973..847ac50 100644
--- a/divingbell/templates/daemonset-uamlite.yaml
+++ b/divingbell/templates/daemonset-uamlite.yaml
@@ -50,8 +50,6 @@ spec:
           mountPath: /tmp/{{ $daemonset }}.sh
           subPath: {{ $daemonset }}
           readOnly: true
-        securityContext:
-          privileged: true
       volumes:
       - name: rootfs-{{ $daemonset }}
         hostPath: