Deleting ic-nginx-ingress-controller at restore
Once k8s comes up after the etcd restore, there is a window of about 20s during which pod states have not yet been updated and are still reported as they were when the backup was taken. As a result, the ic-nginx-ingress-ingress-nginx-controller-XXX pod is reported as "Ready" when it is not. In several instances during my tests, the pod was restarted 3-10 seconds after the task "Launch Armada with Helm v3" had failed because it could not call the webhook.

The proposed solution is to delete the pod preemptively and wait for it to be recreated and become "Ready".

TEST PLAN
PASS restore on virtual AIO-SX (CentOS)

Closes-Bug: #1978899
Signed-off-by: Thiago Brito <thiago.brito@windriver.com>
Change-Id: I20bec1fbbf809bfcf5d515ef55c6d47ab968dbf3
parent 822540ac77
commit c2e5db4305
@@ -162,6 +162,13 @@
   register: nginx_webhook_service
   ignore_errors: true
 
+- name: If on system restore mode, kill ingress validating webhook pod so it can be recreated
+  shell: >-
+    kubectl delete pod -n kube-system
+    -l $(kubectl get service -n kube-system {{ nginx_webhook_service.stdout }}
+    -o jsonpath="{.spec.selector}" | tr -d "{}\"" | tr ":" "=")
+  when: mode == 'restore' and armada_check.rc == 0 and nginx_webhook_service.rc == 0
+
 - name: Check ingress validating webhook service and pod status
   shell: >-
     kubectl wait pod -n kube-system
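
The label selector passed to `kubectl delete pod -l` in the added task is built by flattening the service's `.spec.selector` map with `tr`. The following is a minimal sketch of that text transformation only, runnable without a cluster. It assumes kubectl prints the map as JSON; the helper name `selector_to_labels` and the sample label keys and values are hypothetical, chosen for illustration, and are not taken from the playbook.

```shell
# Hypothetical helper: turn the JSON map printed by
#   kubectl get service ... -o jsonpath="{.spec.selector}"
# into a comma-separated key=value list usable with `kubectl ... -l`.
selector_to_labels() {
    # Drop braces and double quotes, then turn each key:value into key=value.
    tr -d '{}"' | tr ':' '='
}

# Illustrative input only; the real selector depends on the deployed chart.
sample='{"app.kubernetes.io/component":"controller","app.kubernetes.io/name":"ingress-nginx"}'

labels=$(printf '%s' "$sample" | selector_to_labels)
echo "$labels"
# prints: app.kubernetes.io/component=controller,app.kubernetes.io/name=ingress-nginx
```

The commas that separate entries in the JSON map pass through unchanged, which happens to match the separator `-l` expects between label requirements.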