Setup swap in nox jobs

Python 3.12 appears to be more memory hungry than Python 3.11. This leads
to test case failures when Ansible attempts to fork processes in the
tests, with errors like:

  ERROR! A worker was found in a dead state

We port openstack's configure-swap role into zuul so that we can
configure at least 1GB of swap on test nodes. The idea is that, while
slower, this will prevent test cases from failing when they are on the
edge of running out of memory. A reasonable follow-up would be to
investigate where the memory is being consumed, but for now we're trying
to address the immediate problem of flaky jobs.

Depends-On: https://review.opendev.org/c/zuul/zuul-jobs/+/943861
Change-Id: I00f672ae20c8e8ddc3ac88d179b23271d7aa307b
Clark Boylan
2025-03-08 10:14:13 -08:00
parent d92aa20278
commit 2c9ea69ec0
6 changed files with 279 additions and 0 deletions

@@ -1,5 +1,6 @@
- hosts: all
  roles:
    - configure-swap
    - ensure-dstat-graph
    - run-dstat
    - role: ensure-zookeeper

@@ -0,0 +1,17 @@
Configure a swap partition

This role has been copied from openstack/openstack-zuul-jobs. The role
doesn't make much sense in zuul/zuul-jobs because it makes assumptions
about the runtime environments (like rax's ephemeral disk drives). Since
openstack-zuul-jobs is in another Zuul tenant we port it over to zuul where
we want to make use of it.

Creates a swap partition on the ephemeral block device (the rest of which
will be mounted on /opt).

**Role Variables**

.. zuul:rolevar:: configure_swap_size
   :default: 1024

   The size of the swap partition, in MiB.
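
As an illustration of the variable above, a playbook could request a different swap size when applying the role. This is only a sketch: the value 2048 is an arbitrary example and is not something this change configures.

```yaml
- hosts: all
  roles:
    - role: configure-swap
      vars:
        # Hypothetical override: ask for 2048 MiB instead of the 1024 MiB default.
        configure_swap_size: 2048
```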

@@ -0,0 +1,2 @@
# Default swap partition/file size, in MiB
configure_swap_size: 1024

@@ -0,0 +1,131 @@
---
# Configure attached ephemeral devices for storage and swap

- name: Assert that ephemeral_device is defined
  assert:
    that:
      - "ephemeral_device is defined"

- name: Set partition names
  set_fact:
    swap_partition: "{{ ephemeral_device }}1"
    opt_partition: "{{ ephemeral_device }}2"

- name: Ensure ephemeral device is unmounted
  become: yes
  ansible.posix.mount:
    name: "{{ ephemeral_device }}"
    state: "{{ item }}"
  with_items:
    - unmounted
    - absent
    # ^ https://github.com/ansible/ansible/issues/48313

- name: Get existing partitions
  become: yes
  community.general.parted:
    device: "{{ ephemeral_device }}"
    unit: MiB
  register: ephemeral_partitions

- name: Remove any existing partitions
  become: yes
  community.general.parted:
    device: "{{ ephemeral_device }}"
    number: "{{ item.num }}"
    state: absent
  with_items:
    - "{{ ephemeral_partitions.partitions }}"

- name: Create new disk label
  become: yes
  community.general.parted:
    label: msdos
    device: "{{ ephemeral_device }}"

- name: Create swap partition
  become: yes
  community.general.parted:
    device: "{{ ephemeral_device }}"
    number: 1
    state: present
    part_start: '0%'
    part_end: "{{ configure_swap_size }}MiB"

- name: Create opt partition
  become: yes
  community.general.parted:
    device: "{{ ephemeral_device }}"
    number: 2
    state: present
    part_start: "{{ configure_swap_size }}MiB"
    part_end: "100%"

- name: Make swap on partition
  become: yes
  command: "mkswap {{ swap_partition }}"

- name: Write swap to fstab
  become: yes
  ansible.posix.mount:
    path: none
    src: "{{ swap_partition }}"
    fstype: swap
    opts: sw
    passno: 0
    dump: 0
    state: present

# XXX: does "parted" plugin ensure the partition is available
# before moving on? No udev settles here ...

- name: Add all swap
  become: yes
  command: swapon -a

- name: Create /opt filesystem
  become: yes
  community.general.filesystem:
    fstype: ext4
    # The default ratio is 16384 bytes per inode or so. Reduce that to 8192
    # bytes per inode so that we get roughly twice the number of inodes as
    # by default. This should still be well above the block size of 4096.
    # We do this because we have found in at least a couple locations that
    # more inodes is useful and is painful to fix after the fact.
    opts: -i 8192
    dev: "{{ opt_partition }}"

# Rackspace at least does not have enough room for two devstack
# installs on the primary partition. We copy in the existing /opt to
# the new partition on the ephemeral device, and then overmount /opt
# to there for the test runs.
#
# NOTE(ianw): the existing "mount" touches fstab. There is currently (Sep2017)
# work in [1] to split mount & fstab into separate parts, but for now we bundle
# it into an atomic shell command
# [1] https://github.com/ansible/ansible/pull/27174
- name: Copy old /opt
  become: yes
  register: moving_opt
  shell: |
    mount {{ opt_partition }} /mnt
    find /opt/ -mindepth 1 -maxdepth 1 -print -exec mv {} /mnt/ \;
    umount /mnt
    df -h
  tags:
    - skip_ansible_lint

- name: Output data from old /opt
  debug:
    var: moving_opt

# This overmounts any existing /opt
- name: Add opt to fstab and mount
  become: yes
  ansible.posix.mount:
    path: /opt
    src: "{{ opt_partition }}"
    fstype: ext4
    opts: noatime
    state: mounted
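
The XXX comment in this file asks whether the new partitions are guaranteed to be visible before they are used. One possible guard, sketched here as an assumption and not part of the role as committed, would be to wait for udev to finish processing events after the partitions are created and before `mkswap` runs. Whether this is actually required for the `parted` module has not been verified here.

```yaml
# Hypothetical extra task, not in the role as committed.
# It would sit between "Create opt partition" and "Make swap on partition".
- name: Wait for udev to settle after partitioning
  become: yes
  command: udevadm settle
  changed_when: false
```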

@@ -0,0 +1,67 @@
---
# On RAX hosts, we have a small root partition and a large,
# unallocated ephemeral device attached at /dev/xvde
- name: Set ephemeral device if /dev/xvde exists
  when: ansible_devices["xvde"] is defined
  set_fact:
    ephemeral_device: "/dev/xvde"

# On other providers, we have a device called "ephemeral0".
#
# NOTE(ianw): Once [1] is in our ansible (2.4 era?), we can figure
# this out more directly by walking the device labels in the facts
#
# [1] https://github.com/ansible/ansible/commit/d46dd99f47c0ee5081d15bc5b741e9096d8bfd3e
- name: Set ephemeral device by label
  when: ephemeral_device is undefined
  block:
    - name: Get ephemeral0 device node
      command: /sbin/blkid -L ephemeral0
      register: ephemeral0
      # rc !=0 is expected
      failed_when: False
      changed_when: False

    - name: Set ephemeral device if LABEL exists
      when: "ephemeral0.rc == 0"
      set_fact:
        ephemeral_device: "{{ ephemeral0.stdout }}"

# If we have ephemeral storage and we don't appear to have setup swap,
# we will create a swap and move /opt to a large data partition there.
- name: Setup swap on ephemeral storage
  include_tasks: ephemeral.yaml
  when:
    - ephemeral_device is defined
    - ansible_memory_mb['swap']['total'] | int + 10 <= configure_swap_size

# If no ephemeral device and no swap, then we will setup some swap
# space on the root device to ensure all hosts a consistent memory
# environment.
- name: Setup swap file on root device
  include_tasks: root.yaml
  when:
    - ephemeral_device is undefined
    - ansible_memory_mb['swap']['total'] | int + 10 <= configure_swap_size

# ensure a standard level of swappiness. Some platforms
# (rax+centos7) come with swappiness of 0 (presumably because the
# vm doesn't come with swap setup ... but we just did that above),
# which depending on the kernel version can lead to the OOM killer
# kicking in on some processes despite swap being available;
# particularly things like mysql which have very high ratio of
# anonymous-memory to file-backed mappings.
#
# This sets swappiness low; we really don't want to be relying on
# cloud I/O based swap during our runs if we can help it
- name: Set swappiness
  become: yes
  ansible.posix.sysctl:
    name: vm.swappiness
    value: 30
    state: present

- name: Debug the ephemeral_device variable
  debug:
    var: ephemeral_device
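
Both `include_tasks` conditions in this file compare the existing swap total plus 10 against `configure_swap_size`. With the default of 1024, a node reporting 0 MiB of swap gives 0 + 10 <= 1024, so swap is set up. A node reporting 1020 MiB gives 1030 <= 1024, which is false, so it is left alone. The 10 MiB therefore acts as slack for nodes whose swap is just under the target. A debug task along these lines, shown only as an illustrative sketch and not part of the role, would print the inputs to that decision:

```yaml
# Illustrative only: prints the values used by the `when` conditions.
- name: Show swap decision inputs
  debug:
    msg: >-
      existing swap: {{ ansible_memory_mb['swap']['total'] }} MiB,
      target: {{ configure_swap_size }} MiB,
      setup needed: {{ ansible_memory_mb['swap']['total'] | int + 10 <= configure_swap_size }}
```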

@@ -0,0 +1,61 @@
---
# If no ephemeral devices are available, use root filesystem

- name: Calculate required swap
  set_fact:
    swap_required: "{{ configure_swap_size - ansible_memory_mb['swap']['total'] | int }}"

- name: Get root filesystem
  block:
    - name: Get root filesystem
      shell: df --output='fstype' /root | tail -1
      register: root_fs

    - name: Save root filesystem
      set_fact:
        root_filesystem: "{{ root_fs.stdout }}"

    - name: Debug the root_filesystem variable
      debug:
        var: root_filesystem

# Note, we don't use a sparse device to avoid wedging when disk space
# and memory are both unavailable.
- name: Create swap backing file
  become: yes
  command: dd if=/dev/zero of=/root/swapfile bs=1M count={{ swap_required }}
  args:
    creates: /root/swapfile

- name: Ensure swapfile perms
  become: yes
  file:
    path: /root/swapfile
    owner: root
    group: root
    mode: '0600'

- name: Make swapfile
  become: yes
  command: mkswap /root/swapfile

- name: Write swap to fstab
  become: yes
  ansible.posix.mount:
    path: none
    src: /root/swapfile
    fstype: swap
    opts: sw
    passno: 0
    dump: 0
    state: present

- name: Add all swap
  become: yes
  command: swapon -a

- name: Debug the swap_required variable
  debug:
    var: swap_required
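
With the default `configure_swap_size` of 1024 and no existing swap, `swap_required` evaluates to 1024, so `dd` writes 1024 blocks of 1 MiB each. To confirm that the swap file is actually active after `swapon -a`, follow-up tasks along these lines could be used. They are a hypothetical sketch and not part of the role as committed; `active_swaps` is a name introduced here for illustration.

```yaml
# Hypothetical follow-up tasks, not part of the role as committed.
- name: List active swap areas
  command: cat /proc/swaps
  register: active_swaps
  changed_when: false

- name: Show active swap areas
  debug:
    var: active_swaps.stdout_lines
```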