11 Commits

Cole Walker
8e84309624 Add check to avoid restarting running device plugin pod
This script was set to always restart the local SRIOV device plugin
pod, which could result in SRIOV pods not starting properly.

Originally, this sequence of commands would not work properly if the
device plugin was already running:

kubectl delete pods -n kube-system --selector=app=sriovdp \
  --field-selector=spec.nodeName=${HOST} --wait=false

kubectl wait pods -n kube-system --selector=app=sriovdp \
  --field-selector=spec.nodeName=${HOST} --for=condition=Ready \
  --timeout=360s

Result when device plugin is running:
pod "kube-sriov-device-plugin-amd64-rbjpw" deleted
pod/kube-sriov-device-plugin-amd64-rbjpw condition met

The wait command succeeds against the deleted pod and the script
continues. It then deletes labeled pods without having confirmed that
the device plugin is running, which can result in SRIOV pods not
starting properly.

Restarting the device plugin pod only when it is not already running
prevents the wait condition from passing immediately.
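
A minimal sketch of such a guard (the kubectl selectors are taken from the
commands above; the surrounding logic is an assumption, not the script's
actual code):

# Hypothetical: skip the delete when a Running device plugin pod already exists
if [ -z "$(kubectl get pods -n kube-system --selector=app=sriovdp \
      --field-selector=spec.nodeName=${HOST},status.phase=Running \
      --no-headers 2>/dev/null)" ]; then
  kubectl delete pods -n kube-system --selector=app=sriovdp \
    --field-selector=spec.nodeName=${HOST} --wait=false
fi
kubectl wait pods -n kube-system --selector=app=sriovdp \
  --field-selector=spec.nodeName=${HOST} --for=condition=Ready \
  --timeout=360s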

Closes-Bug: 1928965

Signed-off-by: Cole Walker <cole.walker@windriver.com>
Change-Id: I1cc576b26a4bba4eba4a088d33f918bb07ef3b0d
2021-06-09 17:24:48 -04:00
Cole Walker
6c61e3b665 Wait for SRIOV device plugin before recovering labeled pods
This change modifies the k8s-pod-recovery service to wait for the
kube-sriov-device-plugin-amd64 pod on the local node to become
available before proceeding with the recovery of
restart-on-reboot=true labeled pods.

This is required because of a race condition where pods marked for
recovery would be restarted before the device plugin was ready and
the pods would then be stuck in "ContainerCreating".

The fix in this commit uses the kubectl wait ...
command to wait for the daemonset to be available. A timeout of 360s
has been set for this command in order to allow enough time on busy
systems for the device-plugin pod to come up. The wait command
completes as soon as the pod is ready.
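
Presumably the invocation is the one quoted verbatim in the follow-up
fix above:

kubectl wait pods -n kube-system --selector=app=sriovdp \
  --field-selector=spec.nodeName=${HOST} --for=condition=Ready \
  --timeout=360s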

Closes-Bug: 1928965

Signed-off-by: Cole Walker <cole.walker@windriver.com>
Change-Id: Ie1937cf0612827b28762049e2dc440e55726d4f3
2021-06-02 13:00:51 -04:00
Zuul
c9aaf25330 Merge "Revert "Remove recover operations to "restart-on-reboot" pods"" 2021-05-19 18:58:19 +00:00
Cole Walker
b428a5de00 Revert "Remove recover operations to "restart-on-reboot" pods"
This reverts commit 8abcbf6fb1951b25e9964933558b75b9aff88135.

Reason for revert:

After performing a backup and restore on an AIO-SX system, SRIOV pods do
not return to a running state and are instead stuck in
"ContainerCreating". The workaround for this is to restart the SRIOV
pods when the system unlocks.

Reverting this commit allows users to label SRIOV pods and have them
restarted by k8s-pod-recovery, so that they are running again after
backup and restore is completed.

This change has been tested by performing backup and restore on an
AIO-SX system. SRIOV pods now come up correctly when labeled with
restart-on-reboot=true.
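
For illustration, labeling a pod for recovery might look like this (the
pod name and namespace are placeholders):

kubectl label pods -n <namespace> <sriov-pod-name> restart-on-reboot=true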

Closes-Bug: 1928965

Signed-off-by: Cole Walker <cole.walker@windriver.com>
Change-Id: I9c520c0a47aabca7b96e50adf0f71742f4199c2f
2021-05-19 14:31:34 -04:00
Angie Wang
03665ae745 Add armada namespace in k8s pod recovery
Update the k8s pod recovery service to include the armada namespace
so that armada pods stuck in an Unknown state after a host
lock/unlock or reboot can be recovered by the service.
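
A rough sketch of the kind of recovery involved (the service's actual
commands are assumptions):

# Hypothetical: force-delete armada pods reported as Unknown so that
# their controllers recreate them
kubectl get pods -n armada --no-headers 2>/dev/null | \
  awk '$3 == "Unknown" {print $1}' | \
  xargs -r -I{} kubectl delete pod -n armada {} --force --grace-period=0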

Change-Id: Iacd92637a9b4fcaf4c0076e922e1bd739f69a584
Closes-Bug: 1928018
Signed-off-by: Angie Wang <angie.wang@windriver.com>
2021-05-12 12:07:35 -04:00
Bin Qian
8abcbf6fb1 Remove recover operations to "restart-on-reboot" pods
Labeling pods as "restart-on-reboot" was a workaround for kubernetes
being restarted by the worker manifest. Since the AIO now runs a
single manifest that starts kubernetes only once, the operation
is no longer needed.

Depends-On: https://review.opendev.org/c/starlingx/stx-puppet/+/785736
Change-Id: I0d6c549199559b2bc19d8edff52f64ea0b08b50d
Closes-Bug: 1918139
Signed-off-by: Bin Qian <bin.qian@windriver.com>
2021-04-14 21:26:21 -04:00
Zuul
0e8206e8b7 Merge "Add custom apps in the k8s-pod-recovery service" 2021-03-24 13:58:29 +00:00
Mihnea Saracin
852ec5ed53 Add custom apps in the k8s-pod-recovery service
At startup, there might be pods that are left in unknown states.
The k8s-pod-recovery service takes care of
recovering these unknown pods in specific namespaces.
To fix this for custom apps that are not part of starlingx,
we modify the service to look in the /etc/k8s-post-recovery.d
directory for conf files. Any app that needs to be recovered by this
service must create a conf file there, e.g. app-1 will create
/etc/k8s-post-recovery.d/APP_1.conf containing the following:
namespace=app-1-namespace
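
A minimal sketch of how the service might consume these files (variable
names and parsing are illustrative, not the script's actual code):

# Hypothetical: append namespaces from drop-in conf files to the recovery list
for conf in /etc/k8s-post-recovery.d/*.conf; do
    [ -f "${conf}" ] || continue
    ns=$(awk -F= '$1 == "namespace" {print $2}' "${conf}")
    [ -n "${ns}" ] && NAMESPACES="${NAMESPACES} ${ns}"
done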

Closes-Bug: 1917781
Signed-off-by: Mihnea Saracin <Mihnea.Saracin@windriver.com>
Change-Id: I8febdb685d506cff3c34946163612cafdab3e3a8
2021-03-19 14:33:03 +00:00
Douglas Henrique Koerich
6169cc5d81 Handle labeled pods after stabilized
Pods that are in a k8s deployment, daemonset, etc. can be labeled
restart-on-reboot="true", which will automatically cause them to be
restarted after the worker manifest has completed in an AIO system.
It may happen, however, that the k8s-pod-recovery service is started
before the pods are scheduled and created on the node the script is
running on, causing them not to be restarted. The proposed solution is
to wait for the labeled pods to stabilize before restarting them.
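
A sketch of such a stabilization wait (the polling interval and exit
condition are assumptions):

# Hypothetical: poll until the set of labeled pods on this node stops changing
previous=""
while true; do
    current=$(kubectl get pods --all-namespaces \
        --selector=restart-on-reboot=true \
        --field-selector=spec.nodeName=${HOST} -o name | sort)
    [ -n "${current}" ] && [ "${current}" = "${previous}" ] && break
    previous="${current}"
    sleep 30
done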

Closes-Bug: 1900920
Signed-off-by: Douglas Henrique Koerich <douglashenrique.koerich@windriver.com>
Change-Id: I5c73bd838ab2be070bd40bea9e315dcf3852e47f
2021-03-15 15:27:55 -04:00
Steven Webster
7756299303 Enable pod restart based on a label
This commit adds a mechanism to the pod recovery service to restart
pods based on the restart-on-reboot label.
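
A minimal sketch of the restart step (the script's exact invocation is
an assumption):

# Hypothetical: delete labeled pods on this node; their controllers
# recreate them
kubectl delete pods --all-namespaces --selector=restart-on-reboot=true \
  --field-selector=spec.nodeName=${HOST} --wait=false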

This is a mitigation for an issue seen on an AIO system using SR-IOV
interfaces on an N3000 FPGA device.  Since the kubernetes services
start coming up after the controller manifest has completed, a race
can happen with the configuration of devices and the SR-IOV device
plugin in the worker manifest.  The symptom of this would be the
SR-IOV device in the running pod disappearing as the FPGA device is
reset.

Notes:

- The pod recovery service only runs on controller nodes.
- The raciness between the kubernetes bring-up and the worker
  configuration should be fixed in the future by re-organizing the
  manifests to have either a separate AIO manifest or a separate
  kubernetes manifest. This would require extensive feature work. In
  the meantime, this mitigation will allow pods which experience this
  issue to recover.

Change-Id: If84b66b3a632752bd08293105bb780ea8c7cf400
Closes-Bug: #1896631
Signed-off-by: Steven Webster <steven.webster@windriver.com>
2020-09-22 12:29:03 -04:00
Robert Church
17c1b8894d Introduce k8s pod recovery service
Add a recovery service, started by systemd on a host boot, that waits
for pod transitions to stabilize and then takes corrective action for
the following set of conditions:
- Delete to restart pods stuck in an Unknown or Init:Unknown state for
  the 'openstack' and 'monitor' namespaces.
- Delete to restart Failed pods stuck in a NodeAffinity state that occur
  in any namespace.
- Delete to restart the libvirt pod in the 'openstack' namespace when
  any of its conditions (Initialized, Ready, ContainersReady,
  PodScheduled) are not True.

This will only recover pods specific to the host where the service is
installed.
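
A minimal sketch of the NodeAffinity case (output parsing and exact
flags are assumptions):

# Hypothetical: delete Failed pods on this host whose status reason is
# NodeAffinity
kubectl get pods --all-namespaces \
  --field-selector=status.phase=Failed,spec.nodeName=${HOST} \
  -o custom-columns=NS:.metadata.namespace,NAME:.metadata.name,REASON:.status.reason \
  --no-headers | awk '$3 == "NodeAffinity" {print $1, $2}' | \
while read -r ns name; do
  kubectl delete pod -n "${ns}" "${name}" --wait=false
done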

This service is installed on all controller types. There is currently no
evidence that we need this on dedicated worker nodes.

Each of these conditions should be evaluated after the next k8s
component rebase to determine if any of these recovery actions can be
removed.

Change-Id: I0e304d1a2b0425624881f3b2d9c77f6568844196
Closes-Bug: #1893977
Signed-off-by: Robert Church <robert.church@windriver.com>
2020-09-03 23:38:41 -04:00