Recover expired certificates on AIO-DX subclouds
Role for recovering subcloud certificates after expiry.
This role recovers the k8s Root CAs, the k8s leaf certificates and the
dc admin endpoint certificate chain.
This commit adds support for AIO-DX subcloud types to the existing
common/recover-subcloud-certificates role.
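
As a rough sketch, the role can be triggered from the system
controller with an ansible-playbook run against the subcloud. The
wrapper playbook name and inventory layout below are illustrative
assumptions, not part of this change; only the role name is real:

  # Illustrative invocation; playbook and inventory names are
  # placeholders.
  ansible-playbook recover-subcloud-certificates.yml \
      -i subcloud-inventory.yml --limit subcloud1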

Note:

- As it stands, the role only works for AIO-SX and AIO-DX subcloud
  types. Additional work will be done for other types of environments,
  and role common/recover-subcloud-certificates will evolve to support
  other multi-node system types as well.
  A follow-up review will be posted with enhancements for compute nodes
  after more testing and refactoring.

Test case:

PASS: In a subcloud where k8s certificates are not expired, run
      'sudo show-certs.sh' and take note of the kubelet certificate
      dates (kubelet-client-current.pem, kubelet-server, kubelet CA),
      the k8s certificate dates (admin.conf, apiserver,
      apiserver-kubelet-client, controller-manager.conf,
      front-proxy-client, scheduler.conf, K8s Root CA, FrontProxy CA)
      and the dc admin endpoint certificates (DC-adminep-root-ca,
      sc-adminep-intermediate-ca, subcloud#-adminep-certificate).
      Now trigger the execution of role
      common/recover-subcloud-certificates from the system controller
      and verify that the dates have not changed (see the sketch
      below).
PASS: After the step above, run 'kubectl get po -A' to verify the
      health of the cluster.

PASS: Verify rehoming runs successfully for SX and DX subclouds
after recovery:

1) On subcloud:
- Change the 'hardware clock' of the vbox VM to more than 11 years in
  the future
- Verify, after turning the VM back on, that the system date is now 11
  years ahead
- Verify that kubernetes is not responding: no kubectl commands are
  accepted, 'sudo show-certs.sh' shows the etcd and kubelet certs as
  expired, and 'kubeadm certs check-expiration --config
  /etc/kubernetes/kubeadm.yaml' shows that all k8s certificates have
  expired (an openssl spot check is sketched below)
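
A single certificate's expiry can also be spot-checked with plain
openssl. The path below is the standard kubeadm location and is given
only as an example:

  # Prints the notAfter date of the apiserver certificate.
  sudo openssl x509 -enddate -noout \
      -in /etc/kubernetes/pki/apiserver.crt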

2) On the new systemcontroller:
- Configure the network and ensure connectivity between the new system
  controller and the subcloud
- Trigger the execution of role common/recover-subcloud-certificates
  and wait for it to finish (~7 mins when certs are expired, about
  20 secs otherwise)

3) On subcloud:
- Run 'kubeadm certs check-expiration --config
  /etc/kubernetes/kubeadm.yaml' and verify that all certificates
  (admin.conf, apiserver, apiserver-kubelet-client,
  controller-manager.conf, front-proxy-client, scheduler.conf,
  FrontProxy CA, K8s Root CA) now show valid dates.
- Run 'sudo show-certs.sh' and verify that
  the etcd certificates (etcd-client.crt, etcd-server.crt,
  apiserver-etcd-client.crt), the kubelet certificates
  (kubelet-client-current.pem, kubelet-server, kubelet CA)
  and the dc admin endpoint certificates (DC-adminep-root-ca,
  sc-adminep-intermediate-ca, subcloud#-adminep-certificate)
  now show valid dates (a bulk openssl check is sketched below).
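
As a complementary bulk check, the end date of every certificate under
the standard kubeadm pki directory can be listed in one pass. The
directory is the usual kubeadm location; actual file locations may
differ per installation:

  # List the notAfter date of each k8s certificate file.
  for c in /etc/kubernetes/pki/*.crt; do
      echo "$c: $(sudo openssl x509 -enddate -noout -in "$c")"
  done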

4) On the new systemcontroller:
- Run 'dcmanager subcloud add --migrate' for the subcloud and verify
  that the rehoming procedure is able to complete (a fuller command
  line is sketched below).
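
In practice the migrate command takes additional arguments; the values
below are placeholders, and the exact flags should be confirmed
against the dcmanager CLI help:

  # Address, file name and password are placeholder assumptions.
  dcmanager subcloud add --migrate \
      --bootstrap-address <subcloud-oam-ip> \
      --bootstrap-values subcloud1-bootstrap-values.yml \
      --sysadmin-password <password>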

PASS: Verify that reconnecting to the same system controller is
possible:

1) On the systemcontroller:
- Verify that the target subcloud is online
- Change the system controller's date to the target date in the future
- Manually recover the system controller's certificates by running
  /usr/bin/kube-cert-rotation.sh and
  /usr/bin/kube-expired-kubelet-cert-recovery.sh and by manually
  restarting pods (sketched below)
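
A rough sequence for that manual recovery follows. Deleting the
kube-system pods so they are recreated is one possible way to restart
them, not a prescribed procedure:

  # Rotate the expired k8s certificates, then recover the kubelet
  # certificates (both scripts ship on the controller).
  sudo /usr/bin/kube-cert-rotation.sh
  sudo /usr/bin/kube-expired-kubelet-cert-recovery.sh
  # Restart pods so they pick up the new certs; deleting them is one
  # option, as they are recreated automatically.
  kubectl delete pods -n kube-system --all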

2) On the subcloud:
- Change the 'hardware clock' of the vbox VM to more than 11 years in
  the future
- Verify, after turning the VM back on, that the system date is now 11
  years ahead
- Verify that kubernetes is not responding: no kubectl commands are
  accepted, 'sudo show-certs.sh' shows the etcd and kubelet certs as
  expired, and 'kubeadm certs check-expiration --config
  /etc/kubernetes/kubeadm.yaml' shows that all k8s certificates have
  expired

3) On systemcontroller:
- Verify that the subcloud now appears as 'offline'
- Trigger the execution of role common/recover-subcloud-certificates
  and wait for it to finish (~7 mins when certs are expired, about
  20 secs otherwise)

4) On subcloud:
- Run 'kubeadm certs check-expiration --config
  /etc/kubernetes/kubeadm.yaml' and verify that all certificates
  (admin.conf, apiserver, apiserver-kubelet-client,
  controller-manager.conf, front-proxy-client, scheduler.conf,
  FrontProxy CA, K8s Root CA) now show valid dates.
- Run 'sudo show-certs.sh' and verify that
  the etcd certificates (etcd-client.crt, etcd-server.crt,
  apiserver-etcd-client.crt), the kubelet certificates
  (kubelet-client-current.pem, kubelet-server, kubelet CA)
  and the dc admin endpoint certificates (DC-adminep-root-ca,
  sc-adminep-intermediate-ca, subcloud#-adminep-certificate)
  now show valid dates.

5) On systemcontroller:
- Run 'dcmanager subcloud show subcloud#' and verify that the subcloud
  is now back online

Story: 2010815
Task: 48713
Depends-On: https://review.opendev.org/c/starlingx/config/+/893163

Signed-off-by: Rei Oliveira <Reinildes.JoseMateusOliveira@windriver.com>
Change-Id: Ief6504644e55d6ce83a85742c83b4173fbbf8808

stx-ansible-playbooks

StarlingX Bootstrap and Deployment Ansible [1] Playbooks

Execution environment

  • Unix-like OS (recent Linux-based distributions, macOS, Cygwin)
  • Python 3.8 or later

Additional Required Packages

In addition to the packages listed in requirements.txt and test-requirements.txt, the following packages are required to run the playbooks remotely (an example installation command follows the list):

  • python3-pexpect
  • python3-ptyprocess
  • sshpass
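
On a Debian-based host, for example, these can be installed with apt.
The package names match the list above, but may differ on other
distributions:

  # Debian/Ubuntu example; adjust for your distribution.
  sudo apt-get install python3-pexpect python3-ptyprocess sshpass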

Supported StarlingX Releases

The playbooks are compatible with StarlingX R8.0 and later.

Executing StarlingX Playbooks

Bootstrap Playbook

For instructions on how to set up and execute the bootstrap playbook from another host, refer to the StarlingX Documentation [2], Installation Guides, section "Configure controller-0" of the respective system deployment type. A generic invocation is sketched below.
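
As a rough shape of such a remote run, ansible-playbook is pointed at
the bootstrap playbook with an inventory describing the target
controller. Paths, inventory layout and host names below are
placeholders; the documentation is the authoritative reference:

  # All paths and host names here are illustrative.
  ansible-playbook playbooks/bootstrap.yml \
      -i inventory.yml --limit controller-0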

Developer Notes

This repository is not intended to be developed standalone, but rather as part of the StarlingX Source System, which is defined by the StarlingX manifest [3].

References


[1] https://docs.ansible.com/ansible/latest/installation_guide
[2] https://docs.starlingx.io
[3] https://opendev.org/starlingx/manifest.git
