Upgrade and reboot test nodes before openafs installation

CentOS ARM in particular complains bitterly if we try to upgrade openafs
on it and the kernel it is running is stale compared to packages in the
mirrors. This happens if our images have lagged behind upstream package
updates. To mitigate this we upgrade and reboot the system prior to
testing openafs installation.

To make arm64 reboots onto the new kernel reliable we manually run
grubby --set-default to explicitly set the default kernel to boot to the
newly installed kernel. Then we remove the old kernel with grubby and
finally generate a new grub config. This is necessary likely due to:

  https://bugzilla.redhat.com/show_bug.cgi?id=2032680

Note we also do the reboot (but not grub(by) dance) on Debuntu for
symmetry.

Change-Id: If87a0b1d7dc063ac9122d85f65b6fe1c129d2052
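
One way to confirm that the reboot described above actually landed on the new kernel is to compare the running release against the default boot entry. The following is a minimal sketch, not part of this change: kernel_matches is a hypothetical helper, and it assumes the default entry is reported as a /boot/vmlinuz-<release> path, as grubby --default-kernel prints it.

```shell
# Hypothetical helper (not part of this commit): succeed when the running
# kernel release matches the default boot entry. Both values are passed in
# as arguments so the comparison can be checked without touching the
# bootloader.
kernel_matches() {
  running=$1        # e.g. the output of: uname -r
  default_path=$2   # e.g. the output of: grubby --default-kernel
  # Strip everything up to and including "/vmlinuz-" and compare releases.
  [ "${default_path##*/vmlinuz-}" = "$running" ]
}

# On a CentOS node with grubby installed this could be called as:
#   kernel_matches "$(uname -r)" "$(grubby --default-kernel)" \
#     || echo "running kernel is not the default boot entry"
```

A check like this could run after the Reboot task to fail early when the node came back up on the old kernel.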
Clark Boylan 2024-11-20 09:06:02 -08:00
parent 660a906c09
commit b5148387ba
2 changed files with 60 additions and 0 deletions

roles-test/pre.yaml Normal file

@@ -0,0 +1,59 @@
# Do these updates particularly for CentOS arm which may have stale kernels
# preventing openafs installation
- name: Update and reboot nodes before installing openafs
  hosts: all
  tasks:
    - name: Update CentOS nodes
      when: ansible_distribution == "CentOS"
      block:
        - name: DNF Update
          dnf:
            name: "*"
            state: latest
          become: yes
        - name: Hacky script to force default kernel to new version
          shell: |
            set -x
            # Get the newest kernel version in /boot
            NEWEST=$(ls /boot | grep vmlinuz | sort -V -r | head -1)
            OLDEST=$(ls /boot | grep vmlinuz | sort -V | head -1)
            grubby --set-default=/boot/$NEWEST
            if [[ "$OLDEST" != "$NEWEST" ]] ; then
              grubby --remove-kernel=/boot/$OLDEST
            fi
          args:
            executable: /usr/bin/bash
          become: yes
        - name: Tell grub about the new kernel setup
          command: grub2-mkconfig --update-bls-cmdline -o /boot/grub2/grub.cfg
          become: yes
    - name: Update Debuntu nodes
      when: ansible_distribution == "Ubuntu" or ansible_distribution == "Debian"
      block:
        - name: Apt upgrade
          apt:
            name: "*"
            state: latest
          become: yes
    - name: Record running kernel version
      command: uname -a
    - name: Reboot
      reboot:
        reboot_timeout: 900
      become: yes
    - name: Restart zuul console log daemon
      include_role:
        name: start-zuul-console
    - name: Record running kernel version
      command: uname -a
    - name: Pause for a bit to ensure system is up post reboot
      pause:
        seconds: 60
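
To preview which files the kernel-selection step would act on without calling grubby, the same ls/grep/sort pipeline can be wrapped in a dry-run function. This is only a sketch under assumptions: pick_kernels is a hypothetical name, the directory is passed as an argument instead of hard-coding /boot, and sort must support -V (GNU coreutils does).

```shell
# Hypothetical dry-run helper (not part of this commit): print the newest and
# oldest vmlinuz file names in a directory, using the same version sort as
# the playbook's script, without changing any boot configuration.
pick_kernels() {
  dir=$1
  # Newest: version-sort in reverse and take the first entry.
  newest=$(ls "$dir" | grep vmlinuz | sort -V -r | head -1)
  # Oldest: version-sort ascending and take the first entry.
  oldest=$(ls "$dir" | grep vmlinuz | sort -V | head -1)
  echo "$newest $oldest"
}

# Example call against the real boot directory:
#   pick_kernels /boot
```

When both names printed are identical, only one kernel is installed and the playbook's script skips the grubby --remove-kernel step.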


@@ -9,6 +9,7 @@
    description: Test roles provided by system-config with Zuul.
    abstract: true
    parent: base
    pre-run: roles-test/pre.yaml
    run: roles-test/base.yaml
    post-run: roles-test/post.yaml
    files: