Ubuntu Focal was covered as a testing runtime on a best-effort
basis in the 2023.1 cycle. In 2023.2, we no longer need to
test Focal, so this change removes that testing to focus
on making Jammy testing more stable.
Two issues have occurred:
1) Zuul has decided some syntax is deprecated and generates an error.
The exclusionary nature of the syntax is just not supported by RE2,
which is the new requirement, so we now explicitly match "^master$"
as opposed to "not stable branches".
2) The snmp job is marked as non-voting. The root issue appears to be
iPXE or the VMs, but is unknown as of yet.
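The first issue can be illustrated with a small sketch. RE2 does not
implement lookaround, so an exclusionary branch pattern has to become
an explicit match. The exclusionary pattern below is a hypothetical
example, not the one from our job definitions, and Python's re module
(which does accept lookahead) only stands in for the matcher.

```python
import re

# Hypothetical exclusionary pattern ("anything that is not a stable branch").
# It relies on a negative lookahead, which RE2 does not implement.
EXCLUSIONARY = r"^(?!stable/).*$"

# Explicit replacement: match only the master branch.
EXPLICIT = r"^master$"

# Lookaround group openers; RE2 accepts none of these.
LOOKAROUND_TOKENS = ("(?!", "(?=", "(?<!", "(?<=")


def uses_lookaround(pattern):
    """Naive substring check for a lookaround construct in a pattern."""
    return any(token in pattern for token in LOOKAROUND_TOKENS)


def branch_matches(pattern, branch):
    """Return True if the branch name matches the pattern."""
    return re.search(pattern, branch) is not None
```

Both patterns accept master and reject stable branches here, but only
the explicit one avoids the construct RE2 lacks. The trade-off is that
any other non-stable branch no longer matches.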
Investigation of our standalone test job issues, where jobs would
fail, hosts would not get DHCP updates, and iPXE would ultimately
fail prior to getting a valid or expected response, revealed
that dnsmasq was crashing often while the port updates were
going through. This ultimately prevented the multi-scenario
test jobs from running, as the standalone jobs represent a
number of different scenarios which are executed across a
pool of test machines.
In this case, the path forward appears to be to downgrade
dnsmasq to stabilize our CI and allow us to otherwise upgrade.
This patch adds the focal updates as a package source,
and installs the dnsmasq package.
All database migration testing in OpenStack is done through
an opportunistic worker model, where if the database is available
and correctly configured for testing, i.e. openstack-citest user
and access appropriately granted, then the tests will create and
However, this has been problematic with MySQL recently, as we
have seen a long-standing migration issue boil to the surface often.
As a result, we're isolating that test into its own job so we
can limit the blast radius. This also helps us determine whether
all of the tests are affected, or only the MySQL test run class,
which is an additional data point.
By default, we continue to run Postgres migration tests in the
main jobs, as they haven't been impacted by this issue.
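As a rough sketch of the opportunistic model described above, a test
can probe its backend and skip itself when the database is not set up.
Everything here (backend_available, unreachable_backend, the class
name) is a hypothetical illustration, not the actual test
infrastructure.

```python
import unittest


def backend_available(connect):
    """Return True if the supplied connect callable succeeds."""
    try:
        connect()
    except Exception:
        return False
    return True


def unreachable_backend():
    """Hypothetical stand-in for a database not configured for testing."""
    raise ConnectionError("openstack-citest access is not configured")


class OpportunisticMigrationTest(unittest.TestCase):
    # Hypothetical hook; a real test would attempt a real connection.
    connect = staticmethod(unreachable_backend)

    def setUp(self):
        # Skip rather than fail when the backend cannot be reached.
        if not backend_available(self.connect):
            self.skipTest("database backend is not available")

    def test_migrations(self):
        # Migration checks would run here when the backend is reachable.
        self.assertTrue(True)


def run_suite():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(
        OpportunisticMigrationTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

With the unreachable stand-in, the run reports one skipped test and no
failures, which is why a dedicated job is useful for seeing whether the
MySQL class actually ran.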
It appears the push to Cirros 0.6.1 has reoccurred, and we now
have things failing as a result.
Specifically ironic-grenade is trying to run with Cirros 0.5.2,
yet the file is not found later on.
Anyhow, an explicit pin should resolve this.
In the recent change to cinder, to address CVE-2023-2088,
cinder changed the policy rules and behavior for unbinding,
or "detaching" a volume. This was because of a vulnerability
in compute nodes where a volume which was in use by a VM
could be detached outside of Nova, and Nova wouldn't become
aware the volume was detached, and the volume could be accessible
to the next VM.
This vulnerability doesn't apply to bare metal operations as
volumes are attached to whole baremetal nodes with Ironic.
We now generate and use a service token when interacting with
Cinder, which allows Cinder to recognize "this request is
coming from a fellow OpenStack service" and bypass
checking with Nova whether the "instance" is managed by Nova
or not. This allows the volumes to be attached and detached
as needed as part of the power operation flow and overall
set of lifecycle operations.
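A minimal sketch of what sending a service token alongside the user
token looks like at the HTTP header level. It assumes the usual
Keystone header names X-Auth-Token and X-Service-Token. The function
name and the token values are hypothetical, and this is not the actual
Ironic client code.

```python
def cinder_request_headers(user_token, service_token=None):
    """Build request headers carrying a user token and, optionally,
    a service token identifying the calling service."""
    headers = {"X-Auth-Token": user_token}
    if service_token:
        # The service token marks the request as coming from a service.
        headers["X-Service-Token"] = service_token
    return headers
```

For example, cinder_request_headers("user-abc", "svc-xyz") yields both
headers, while omitting the second argument yields only X-Auth-Token.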
Until we're able to get the BFV job sorted, we need to unblock
the gate, and as such we are moving the BFV job to non-voting to
allow other contributors to make progress.
Launching test VMs can take a while, and grenade can fail
if the VM's networking is not quite online in under sixty
seconds. As such, it is reasonable to use a larger window
so the failure rate of ironic-grenade will hopefully decline.
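The effect of a larger window can be sketched with a generic polling
loop. This is not grenade's implementation; wait_until and its
parameters are hypothetical names used only to show how the timeout
bounds the wait.

```python
import time


def wait_until(check, timeout, interval=1.0,
               clock=time.monotonic, sleep=time.sleep):
    """Poll check() until it returns True or timeout seconds elapse.

    clock and sleep are injectable so the loop can be exercised
    without real waiting.
    """
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A VM whose networking comes up after ninety seconds fails a
sixty-second window but passes a longer one, which is the behavior the
larger window relies on.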
The anaconda job is failing as we're getting a redirect issued back
upon attempting to validate URLs. The servers are now directing us
to use HTTPS instead.
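One way to avoid the redirect is to hand the validator an HTTPS URL in
the first place. The helper below is a hypothetical sketch using only
the standard library, not the change made to the job, and example.com
is a placeholder host.

```python
from urllib.parse import urlsplit, urlunsplit


def prefer_https(url):
    """Return the URL with an http scheme rewritten to https.

    URLs with any other scheme are returned unchanged. An explicit
    port in the URL, if any, is left as written.
    """
    parts = urlsplit(url)
    if parts.scheme == "http":
        parts = parts._replace(scheme="https")
    return urlunsplit(parts)
```

For instance, prefer_https("http://example.com/path?x=1") returns
"https://example.com/path?x=1".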
This commit partially reverts change set
where the amount of memory for test VMs was
increased to 4GB. This was because excess
junk was getting stuck in the staged ramdisk
images used by CI.
It appears we are getting an opcode error when attempting to boot
CentOS 9-stream utilizing the EFI artifacts from Ubuntu.
Technically this should work, however further artifacts in the boot
chain may be signed with other key credentials that Ubuntu's
grub does not know about, because the chain of trust is
MSFT -> Vendor shim (slow change rate) -> Vendor GRUB -> Kernel
Where vendor differences should never work, is if Secure Boot
Exception on launch:
X64 Exception Type - 06(#UD - Invalid Opcode) CPU Apic ID - 00000000 !!!!
A similar Debian bug is open for a very similar issue:
However, no additional comments or information have been posted in
follow-up to that reported issue. So in the meantime, we're going to
try to do what those smarter than I recommend: use the vendor's
binaries for their distribution.
There is one further, potentially far more depressing possibility,
that CentOS 9's kernel doesn't support the type of hardware
we're getting. This is suggested by the precise opcode error, UD.
But again, easiest possibility first.
- Remove skipsdist, as it was never supported and causes breakage
when used with usedevelop.
- Add script to allowlist for pep8 test
- Disable setuptools autodiscovery
- Increase base VM memory according to new requirements for CS9
Introduces additional job configuration to enable automated
integration testing via tempest of the anaconda deployment
Also configures a private subnet with DNS, which anaconda requires
when executing in order to facilitate processing of URLs.
Instance network boot (not to be confused with ramdisk, iSCSI or
anaconda deploy methods) is insecure, underused and difficult to
maintain. This change removes a lot of related code from Ironic.
The so-called "netboot fallback" is still supported for legacy boot when
boot device management is not available or is unreliable.
* Fixes the IPv6 job by utilizing HOST_IPV6 instead of
SERVICE_IPV6, as Devstack now automatically wraps
SERVICE_IPV6 with brackets as if it is for a URL.
* Locks ipv6 job to bios mode. Ubuntu Focal OVMF/EDK2 does not
support IPv6 PXE boot by default.
* Split from Devstack in terms of IP usage, since full explicit
V6 usage is not a thing anymore. 4+6 is the default in devstack
and regardless of what we set on the job we see both now used.
So we delineate apart our usage for our own sanity.
* Reduce VM Interface count for IPv6 in an attempt to eliminate
in-kernel routing confusion by two interfaces on the same physical
* Set IPv6 mode to dhcpv6-stateless due to fun issues in dhcp clients.
When we move to UEFI, this will need to be changed to stateful as
stateless is not supported in general by OVMF/EDK2.
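The bracket handling mentioned in the first item can be sketched as
follows. A URL needs an IPv6 literal wrapped in brackets, while
settings that expect a bare address do not. Both helper names are
hypothetical, and this is not Devstack's code.

```python
import ipaddress


def wrap_for_url(host):
    """Return host in URL form: IPv6 literals get brackets, while
    IPv4 addresses and hostnames are returned without them."""
    bare = host.strip("[]")
    try:
        is_v6 = ipaddress.ip_address(bare).version == 6
    except ValueError:
        # Not an IP literal (for example a hostname); leave it alone.
        return host
    return f"[{bare}]" if is_v6 else bare


def bare_address(host):
    """Return the address without surrounding brackets."""
    return host.strip("[]")
```

A value that has already been wrapped, such as "[fc00::1]", is safe
for URLs but must be unwrapped before it is used where a plain address
is expected.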
Once the job has run in normal non-voting for a while, and we
ensure that it seems to be stable, we can make it voting again.
Grenade, for some confusing reason, creates a separate network,
and uses that for upgrade testing as opposed to the original network
the VMs were bound to. If Julia's memory is correct, this was for
multinode upgrade testing.
Anyway, when in UEFI mode, it appears that the TFTP packets
don't get tracked nor cross the boundary. We likely need to
explicitly address this, but first, let's get the job working as
it was, and then we can update it.
Also, update requirements because markupsafe removed the soft_unicode
method, which had been deprecated for a while. Jinja2 started using
the new soft_str method as of version 3.0.0.