Container pinning on worker nodes and All-in-one servers

This story will pin the infrastructure and openstack pods to the
platform cores for worker nodes and All-in-one servers.

This configures the systemd system.conf parameter
CPUAffinity=<platform_cpus> by generating
/etc/systemd/system.conf.d/platform-cpuaffinity.conf.
All services then launch their tasks with the platform cpu affinity.
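
As an illustration of the drop-in described above, the following is a
minimal sketch only. The actual file is generated by puppet; the helper
name render_cpuaffinity_conf and the example CPU list are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print the contents of a systemd drop-in that sets the
# default CPU affinity for the service manager and the services it launches.
# A caller would redirect the output to
# /etc/systemd/system.conf.d/platform-cpuaffinity.conf.
render_cpuaffinity_conf() {
    local platform_cpus="${1-}"
    if [ -z "${platform_cpus}" ]; then
        echo "usage: render_cpuaffinity_conf <cpu-list>" >&2
        return 1
    fi
    # CPUAffinity is a [Manager] option of systemd-system.conf.
    printf '[Manager]\nCPUAffinity=%s\n' "${platform_cpus}"
}

# Example with made-up platform cores 0-1 and sibling threads 16-17.
render_cpuaffinity_conf "0-1,16-17"
```

The function writes to stdout rather than to /etc so that it can be
inspected without root privileges.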

This creates the cgroup called 'k8s-infra' for the following subset
of controllers ('cpuacct', 'cpuset', 'cpu', 'memory', 'systemd').
This configures custom cpuset.cpus (i.e., cpuset) and cpuset.mems
(i.e., nodeset) based on the sysinv platform configurable cores. These
values are generated by puppet from sysinv host cpu information and
are stored in the hieradata variables:
- platform::kubernetes::params::k8s_cpuset
- platform::kubernetes::params::k8s_nodeset
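
The cgroup setup described above can be sketched roughly as follows,
assuming a cgroup v1 layout mounted under /sys/fs/cgroup. This is not
the puppet code from this change; the function name and the CGROUP_ROOT
override are hypothetical and exist only so the sketch can be tried
against a scratch directory.

```shell
#!/bin/bash
# Sketch only: create the 'k8s-infra' cgroup under each listed controller
# and set its cpuset. CGROUP_ROOT is a hypothetical override of the
# cgroup v1 mount point.
create_k8s_infra_cgroup() {
    local cpuset="${1-}"      # e.g. the value of k8s_cpuset
    local nodeset="${2-}"     # e.g. the value of k8s_nodeset
    local root="${CGROUP_ROOT:-/sys/fs/cgroup}"
    local controller

    for controller in cpuacct cpuset cpu memory systemd; do
        mkdir -p "${root}/${controller}/k8s-infra" || return 1
    done

    # A new v1 cpuset starts with empty cpus and mems, so set both.
    echo "${nodeset}" > "${root}/cpuset/k8s-infra/cpuset.mems" || return 1
    echo "${cpuset}"  > "${root}/cpuset/k8s-infra/cpuset.cpus" || return 1
}
```

For a dry run without root: CGROUP_ROOT=$(mktemp -d) create_k8s_infra_cgroup "0-1" "0"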

This creates the cgroup called 'machine.slice' for the controller
'cpuset' and sets cpuset.cpus and cpuset.mems to the parent values.
This prevents VMs from inheriting those settings from libvirt.
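
Copying the parent values into a child cpuset could look like the
sketch below. It assumes the cgroup v1 cpuset hierarchy; the function
name and the CPUSET_ROOT override are hypothetical and are not taken
from the puppet code.

```shell
#!/bin/bash
# Sketch only: create a child cpuset cgroup (e.g. machine.slice) and copy
# cpuset.cpus and cpuset.mems from its parent. CPUSET_ROOT is a
# hypothetical override of the cpuset controller mount point.
inherit_parent_cpuset() {
    local name="${1-}"
    local cpuset_root="${CPUSET_ROOT:-/sys/fs/cgroup/cpuset}"
    local child="${cpuset_root}/${name}"
    local f

    [ -n "${name}" ] || return 1
    mkdir -p "${child}" || return 1
    for f in cpuset.cpus cpuset.mems; do
        cat "${cpuset_root}/${f}" > "${child}/${f}" || return 1
    done
}
```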

Note: systemd automatically mounts cgroups and all available
resource controllers, so the new puppet code does not need to do
that.

Kubelet is now launched with --cgroup-root /k8s-infra by configuring
kubeadm.yaml with the option cgroupRoot: "/k8s-infra".

For openstack based worker nodes including AIO
(i.e., host-label openstack-compute-node=enabled):
- the k8s cpuset and nodeset include the assigned platform cores

For non-openstack based worker nodes including AIO:
- the k8s cpuset and nodeset include all cpus except the assigned
  platform cores. This will be refined in a later update, since
  we need to isolate the cpusets of k8s infrastructure from other pods.

The cpuset topology can be viewed with the following:
 sudo systemd-cgls cpuset

The task cpu affinity can be verified with the following:
 ps-sched.sh
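
As an independent cross-check for a single task, the kernel reports the
allowed CPUs in /proc/<pid>/status. The sketch below is not part of
ps-sched.sh; the function name is hypothetical.

```shell
#!/bin/bash
# Print the CPUs a task may run on, taken from the Cpus_allowed_list
# field of /proc/<pid>/status.
task_cpus_allowed() {
    local pid="${1-}"
    [ -r "/proc/${pid}/status" ] || return 1
    awk '/^Cpus_allowed_list:/ {print $2}' "/proc/${pid}/status"
}

# Example: affinity of the current shell.
task_cpus_allowed "$$"
```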

The dynamic affining of platform tasks during start-up is disabled.
That code requires cleanup and is likely no longer needed, since we
are now using systemd CPUAffinity and cgroups.

This includes a few small fixes to enable testing of this feature:
- facter platform_res_mem was updated to not require 'memtop', since
  that depends on the existence of NUMA nodes. This was failing in
  QEMU environments where the host has no NUMA nodes, which occurs
  when no CPU topology is specified.
- cpumap_functions.sh updated parameter defaults so that calling
  bash scripts may enable 'set -u' undefined variable checking.
- the generation of platform_cpu_list did not have all threads.
- the cpulist-to-ranges inline code was incorrect; in certain
  scenarios the rstrip(',') would take out the wrong commas.
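
To illustrate the last two items, here is a bash sketch of a
list-to-ranges conversion that never emits a trailing comma, so nothing
has to be stripped afterwards. It is not the inline code fixed by this
change; both function names are hypothetical. It also uses the ${1-}
default form so that it is safe under 'set -u'.

```shell
#!/bin/bash
set -u

# Print a single CPU id or a "first-last" range.
format_range() {
    if [ "$1" -eq "$2" ]; then echo "$1"; else echo "$1-$2"; fi
}

# Collapse a comma-separated list of CPU ids into ranges, e.g.
# "0,1,2,3,8,9,10,11" becomes "0-3,8-11". Input is assumed to hold
# non-negative integers; it is sorted and de-duplicated first.
cpulist_to_ranges() {
    local list="${1-}"    # default to empty so 'set -u' callers do not abort
    local ranges="" start="" prev="" cpu

    for cpu in $(echo "${list}" | tr ',' '\n' | sort -n -u); do
        if [ -z "${start}" ]; then
            start=${cpu}
        elif [ "${cpu}" -ne $((prev + 1)) ]; then
            # Gap found: close the current range and start a new one.
            ranges+="$(format_range "${start}" "${prev}"),"
            start=${cpu}
        fi
        prev=${cpu}
    done
    # Emit the final open range, if any, without a trailing comma.
    [ -n "${start}" ] && ranges+="$(format_range "${start}" "${prev}")"
    echo "${ranges}"
}
```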

Story: 2004762
Task: 28879

Change-Id: I6fd21bac59fc2d408132905b88710da48aa8d928
Signed-off-by: Jim Gauld <james.gauld@windriver.com>
Jim Gauld 2019-03-28 14:26:24 -04:00
parent 145a406eac
commit 7609a434b2
3 changed files with 19 additions and 13 deletions


@@ -34,12 +34,15 @@ function affine_tasks {
 # Affine non-kernel-thread tasks (excluded [kthreadd] and its children) to all available
 # cores. They will be reaffined to platform cores later on as part of nova-compute
 # launch.
-log_debug "Affining all tasks to all available CPUs..."
-affine_tasks_to_all_cores
-RET=$?
-if [ $RET -ne 0 ]; then
-log_error "Some tasks failed to be affined to all cores."
-fi
+##log_debug "Affining all tasks to all available CPUs..."
+# TODO: Should revisit this since this leaves a few lingering floating
+# tasks and does not really work with cgroup cpusets.
+# Comment out for now. Cleanup required.
+##affine_tasks_to_all_cores
+##RET=$?
+##if [ $RET -ne 0 ]; then
+## log_error "Some tasks failed to be affined to all cores."
+##fi
 # Get number of logical cpus
 N_CPUS=$(cat /proc/cpuinfo 2>/dev/null | \


@@ -26,8 +26,11 @@ start ()
 log "Initial Configuration incomplete. Skipping affining tasks."
 exit 0
 fi
-affine_tasks_to_platform_cores
-[[ $? -eq 0 ]] && log "Tasks re-affining done." || log "Tasks re-affining failed."
+# TODO: Should revisit this since this leaves a few lingering floating
+# tasks and does not really work with cgroup cpusets.
+# Comment out for now. Cleanup required.
+##affine_tasks_to_platform_cores
+##[[ $? -eq 0 ]] && log "Tasks re-affining done." || log "Tasks re-affining failed."
 }
 stop ()
@@ -68,4 +71,4 @@ case "$1" in
 ;;
 esac
-exit 0
+exit 0


@@ -32,8 +32,8 @@ function expand_sequence {
 # Append a string to comma separated list string
 ################################################################################
 function append_list {
-local PUSH=$1
-local LIST=$2
+local PUSH=${1-}
+local LIST=${2-}
 if [ -z "${LIST}" ]; then
 LIST=${PUSH}
 else
@@ -179,8 +179,8 @@ function invert_cpulist {
 #
 ################################################################################
 function in_list {
-local item="$1"
-local list="$2"
+local item="${1-}"
+local list="${2-}"
 # expand list format 0-3,8-11 to a full sequence {0..3} {8..11}
 local exp_list