StarlingX Integration and packaging
Steven Webster 5d1a26b89d Implement CNI cache file cleanup for stale files
It has been observed on systems running for months to years
that the CNI cache files (representing attributes of
network attachment definitions of pods) can accumulate in
large numbers in the /var/lib/cni/results/ and
/var/lib/cni/multus/ directories.

The cache files in /var/lib/cni/results/ have a naming signature of:

<type>-<pod id>-<interface name>

The cache files in /var/lib/cni/multus/ have a naming signature
of:

<pod id>

Normally these files are cleaned up automatically (I believe
this is the responsibility of containerd).  It has been seen
that this happens reliably when one manually deletes a pod.

The issue has been reproduced in the case of a host being manually
rebooted.  In this case, the pods are re-created when the host comes
back up, but with a different pod id than was used before.

When that happens, _most_ of the time the cache files from the previous
instantiation of the pod are deleted, but occasionally a few are
missed by the internal garbage collection mechanism.

Once a cache file from the previous instantiation of a pod escapes
garbage collection, it seems to be left as a stale file for all
subsequent reboots.  Over time, this can cause these stale files
to accumulate and take up disk space unnecessarily.

The cleanup script added by this change is called once by the
k8s-pod-recovery service on system startup, and then periodically
via a cron job installed by Puppet.

The cleanup mechanism analyzes the cache files by name and
compares them with the id(s) of the currently running pods. Any
stale files detected are deleted.
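
As an illustration only, the name-based comparison could look roughly
like the Python sketch below. It is not the script added by this
change. The function name find_stale_files, the older_than_secs
parameter and the assumption that a pod id is a 64-character lowercase
hex string are all hypothetical. The sketch only reports candidate
files and leaves both the deletion and the lookup of running pod ids
to the caller.

```python
import os
import re
import time

# Assumption: a pod id is 64 lowercase hex characters, delimited by '-'
# or by the start/end of the file name. This covers both
# <type>-<pod id>-<interface name> and a bare <pod id>.
POD_ID_RE = re.compile(r"(?:^|-)([0-9a-f]{64})(?:-|$)")


def find_stale_files(cache_dir, running_pod_ids, older_than_secs, now=None):
    """Return paths of cache files whose embedded pod id is not running.

    Files whose names carry no pod id are skipped, as are files
    modified less than older_than_secs seconds ago.
    """
    now = time.time() if now is None else now
    stale = []
    for name in sorted(os.listdir(cache_dir)):
        path = os.path.join(cache_dir, name)
        match = POD_ID_RE.search(name)
        if not os.path.isfile(path) or match is None:
            continue  # does not match the naming signature: not processed
        if match.group(1) in running_pod_ids:
            continue  # belongs to a currently running pod: keep
        if now - os.path.getmtime(path) < older_than_secs:
            continue  # too young: may belong to a pod that is still starting
        stale.append(path)
    return stale
```

A caller would pass the set of ids of the currently running pods and
then decide whether to remove each returned path, for example with
os.remove().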

Test Plan:

PASS: Verify existing pods do not have their cache files removed
PASS: Verify files younger than the specified 'olderthan' time
      are not removed
PASS: Verify stale cache files for pods that do not exist anymore
      are removed
PASS: Verify the script does not run if kubelet is not up yet

Failure Path:

PASS: Verify files not matching the naming signature (pod id
      embedded in file name) are not processed

Regression:

PASS: Verify system install
PASS: Verify feature logging

Partial-Bug: 1947386

Signed-off-by: Steven Webster <steven.webster@windriver.com>
Change-Id: I0ce06646001e52d1cc6d204b924f41d049264b4c
2021-11-01 10:39:39 -04:00
base Uprev linuxptp to version 3.1.1-1 2021-09-03 17:24:27 -04:00
bmc/Redfishtool Add auto-versioning to starlingx/integ packages 2020-06-24 09:48:28 +08:00
ceph/ceph Fix python3 incompatibility 2021-07-26 14:35:12 -04:00
config Merge "Add debian package for puppet-dnsmasq module" 2021-09-28 18:14:26 +00:00
database Add auto-versioning to starlingx/integ packages 2020-06-24 09:48:28 +08:00
devstack Relocated some packages to repo 'utilities' 2019-09-05 20:31:36 -04:00
doc Switch to newer openstackdocstheme and reno versions 2020-06-04 14:28:48 +02:00
docker/python-docker/centos Applying patch with -p1 for docker-py 2021-07-12 13:26:37 -04:00
filesystem DRBD upversion from 8.4 to 9.15 2021-09-16 14:14:27 -03:00
gpu/gpu-operator integ: add nvidia gpu-operator helm charts 2021-03-31 17:33:41 +00:00
grub tftp: roll over block counter to prevent timeouts with data packets 2021-09-01 20:57:18 -04:00
kubernetes Implement CNI cache file cleanup for stale files 2021-11-01 10:39:39 -04:00
ldap Add auto-versioning to starlingx/integ packages 2020-06-24 09:48:28 +08:00
logging/logrotate/centos Add auto-versioning to starlingx/integ packages 2020-06-24 09:48:28 +08:00
networking Add alternative command to disable lldp agent for i40e devices 2021-04-19 15:06:40 -04:00
python Patch watch.py in python-kubernetes package 2021-08-25 17:05:03 -04:00
releasenotes Switch to newer openstackdocstheme and reno versions 2020-06-04 14:28:48 +02:00
requests-toolbelt Add auto-versioning to starlingx/integ packages 2020-06-24 09:48:28 +08:00
security Copy shim.efi to /pxeboot for UEFI pxeboot support 2021-05-07 11:48:35 -04:00
storage-drivers Add auto-versioning to starlingx/integ packages 2020-06-24 09:48:28 +08:00
tools Fix py3 issues 2021-08-16 14:15:42 +00:00
virt remove /data which is not being used 2021-08-16 09:50:38 -04:00
.gitignore Add Docker Registry Token Server 2019-01-08 11:42:04 -05:00
.gitreview OpenDev Migration Patch 2019-04-19 19:52:31 +00:00
.yamllint Add .yamllint file 2021-09-09 19:05:36 +03:00
.zuul.yaml Adding job to upload commits to GitHub 2020-02-06 10:50:16 -05:00
bindep.txt Fix pylint zuul jobs failing due to libvirt-python and pkgconfig 2019-07-04 14:14:39 -05:00
centos_build_layer.cfg Build layering, add layer build config file and srpm and tarball lst 2019-10-21 09:24:22 +08:00
centos_extra_downloads.lst Move mellanox userspace from integ repo 2020-05-06 19:58:38 -04:00
centos_guest_image_rt.inc Subdirectory kernel relocated to new repo starlingx/kernel 2020-04-11 13:08:18 -04:00
centos_guest_image.inc Subdirectory kernel relocated to new repo starlingx/kernel 2020-04-11 13:08:18 -04:00
centos_iso_image.inc Implement CNI cache file cleanup for stale files 2021-11-01 10:39:39 -04:00
centos_pkg_dirs Implement CNI cache file cleanup for stale files 2021-11-01 10:39:39 -04:00
centos_pkg_dirs_installer Config file changes for packages being relocated to repo 'compile' 2019-09-05 20:28:59 -04:00
centos_pkg_dirs_rt Move mellanox userspace from integ repo 2020-05-06 19:58:38 -04:00
centos_srpms_3rdparties.lst Uprev linuxptp to version 3.1.1-1 2021-09-03 17:24:27 -04:00
centos_srpms_centos.lst Patch watch.py in python-kubernetes package 2021-08-25 17:05:03 -04:00
centos_stable_docker_images.inc Create Docker image for running Intel N3000 FPGA tools 2020-02-20 19:09:13 -05:00
centos_stable_wheels.inc Add libvirt module to stable wheels for image build 2019-04-04 22:54:04 -04:00
centos_tarball-dl.lst Add staged kubernetes version 1.21.3 2021-09-22 16:31:39 -04:00
CONTRIBUTORS.wrs StarlingX open source release updates 2018-05-31 07:36:35 -07:00
debian_build_layer.cfg Add debian_build_layer.cfg file 2021-10-05 14:08:19 -04:00
debian_pkg_dirs Add debian package for facter 3.14.12 2021-09-15 12:18:58 +03:00
distroless_stable_docker_images.inc add intel-gpu-plugin docker image to stable docker image build 2019-07-16 09:48:24 +08:00
LICENSE StarlingX open source release updates 2018-05-31 07:36:35 -07:00
pylint.rc Add pylint py3 portability checks for the integ repo 2021-09-13 09:57:53 +00:00
README.rst Followup opendev cleanup and test jobs 2019-04-21 09:23:19 -05:00
test-requirements.txt Add default test framework 2018-06-11 13:45:22 -05:00
tox.ini Fix unit tests 2021-09-28 14:20:49 -04:00

integ

StarlingX Integration