integ/kubernetes/k8s-pod-recovery
Steven Webster 7756299303 Enable pod restart based on a label
This commit adds a mechanism to the pod recovery service to restart
pods based on the restart-on-reboot label.
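
As a rough illustration of the label-based selection described above, here
is a minimal Python sketch. It is not the recovery service's actual code.
The label value "true", the helper names pods_to_restart and
delete_commands, and the use of "kubectl delete pod" to trigger a restart
are all assumptions made for this example; the commit itself names only
the restart-on-reboot label key.

```python
import json

# Assumption: pods opt in with restart-on-reboot=true. The commit message
# names only the label key, so the value here is illustrative.
RESTART_LABEL = "restart-on-reboot"
RESTART_VALUE = "true"


def pods_to_restart(pod_list_json):
    """Return (namespace, name) pairs for pods carrying the restart label.

    pod_list_json is the text produced by
    `kubectl get pods --all-namespaces -o json`.
    """
    items = json.loads(pod_list_json).get("items", [])
    selected = []
    for pod in items:
        meta = pod.get("metadata", {})
        labels = meta.get("labels") or {}  # labels may be absent or null
        if labels.get(RESTART_LABEL) == RESTART_VALUE:
            selected.append((meta.get("namespace", "default"), meta["name"]))
    return selected


def delete_commands(pods):
    """Build one `kubectl delete pod` argument list per selected pod.

    Deleting a pod restarts it only when a controller (Deployment,
    DaemonSet, ...) recreates it; that is assumed here.
    """
    return [
        ["kubectl", "delete", "pod", "--namespace", ns, name, "--wait=false"]
        for ns, name in pods
    ]
```

The argument lists can be passed to subprocess.run on a controller node.
Keeping the selection separate from the kubectl invocation makes the
filter easy to check against a captured pod list.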

This is a mitigation for an issue seen on an AIO system using SR-IOV
interfaces on an N3000 FPGA device.  Since the kubernetes services
start coming up after the controller manifest has completed, a race
can happen with the configuration of devices and the SR-IOV device
plugin in the worker manifest.  The symptom is that the SR-IOV device
in the running pod disappears when the FPGA device is reset.

Notes:

- The pod recovery service only runs on controller nodes.
- The race between the kubernetes bring-up and the worker configuration
  should be fixed in the future by reorganizing the manifests so that
  there is either a separate AIO manifest or a separate kubernetes
  manifest.  This would require extensive feature work.  In the
  meantime, this mitigation allows pods that experience this issue to
  recover.

Change-Id: If84b66b3a632752bd08293105bb780ea8c7cf400
Closes-Bug: #1896631
Signed-off-by: Steven Webster <steven.webster@windriver.com>
2020-09-22 12:29:03 -04:00
centos Enable pod restart based on a label 2020-09-22 12:29:03 -04:00