Clark Boylan a50827ece9 Add an early base job check for CPU counts
Ubuntu Noble (and possibly Debian Trixie and other newer kernels) do not
properly handle x2apic with some of the older Xen hypervisors in rax
classic. When that happens, instances boot with only a single usable
CPU. This leads to problems later in jobs, as many jobs in OpenDev are
designed to run tasks in parallel, which doesn't work as well with a
single CPU.

To work around this, we check the CPU count early in the job runtime and
fail if there are fewer than 2 CPUs present. Since this happens early,
Zuul's retry mechanisms will restart the job on a new node.
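
The check described above can be sketched roughly as follows. This is a
minimal illustration, not the code from this change (which is not shown
here and presumably lives in the base job's playbooks); the names
MIN_CPUS, has_enough_cpus, and check_node are hypothetical.

```python
import os
import sys

# Threshold taken from the commit message: fail with fewer than 2 CPUs.
MIN_CPUS = 2  # hypothetical constant name


def has_enough_cpus(cpu_count, minimum=MIN_CPUS):
    """Return True when the node reports at least `minimum` CPUs."""
    # os.cpu_count() returns None when the count cannot be determined;
    # this sketch treats that case as a failed check.
    return cpu_count is not None and cpu_count >= minimum


def check_node():
    """Return a process exit code: 0 if the node is usable, 1 otherwise."""
    count = os.cpu_count()
    if not has_enough_cpus(count):
        print(
            f"Only {count} CPU(s) detected; failing early so the job "
            "can be retried on a new node.",
            file=sys.stderr,
        )
        return 1
    return 0
```

A caller would run something like sys.exit(check_node()) as one of the
first steps of the job, so that a non-zero exit happens before any real
work has started.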

More info about the x2apic problem can be found here:

  https://docs.oracle.com/en/operating-systems/uek/8/relnotes8.0/38006792.html

The linked page suggests another potential workaround, which is to use
the older apic version (which does work), but doing so has potential
performance problems. Since this issue seems infrequent, we simply
recycle the node instead and let the job retry.

Note that we only update base-test for now to ensure that this doesn't
create widespread problems before applying it to the global base job.

Change-Id: Iff0249ae09da3c591746ce6300c033f6f06f58e6
2026-02-02 10:20:23 -08:00