This commit adds logic to the DevStack code
* to determine whether the irmc hardware type is enabled
* (if enabled) to install the python-scciclient Python package and snmp
so that an Ironic environment with iRMC-supported Fujitsu servers
can be built through DevStack.
Story: 2008722
Task: 42066
Change-Id: Ie50d8e4b43cdbfd8cd46333a75de20015e67829e
When it is set to True, we try to write text data to a binary file,
which is not possible in Python 3. The issue has been "helpfully"
hidden by the fact that we use bytes in unit tests, as well as
by lack of CI coverage.
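A minimal sketch of the failure mode, not the actual ironic code: in Python 3 a file opened in binary mode accepts only bytes-like objects, so text has to be encoded before it is written. The helper name `write_payload` and the choice of UTF-8 are assumptions made for illustration.

```python
import os
import tempfile


def write_payload(path, data):
    """Write data to a file opened in binary mode.

    Hypothetical helper: str input is encoded as UTF-8 (an assumption),
    bytes input is written unchanged.
    """
    if isinstance(data, str):
        data = data.encode("utf-8")
    with open(path, "wb") as f:
        f.write(data)


def demo():
    """Show that raw text fails on a binary file while encoded text works."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "payload.bin")
        try:
            # Writing str to a binary file raises TypeError in Python 3.
            with open(path, "wb") as f:
                f.write("text data")
            raised = False
        except TypeError:
            raised = True
        write_payload(path, "text data")
        with open(path, "rb") as f:
            return raised, f.read()
```

Unit tests that only ever pass bytes to such a code path would not exercise the str branch, which is how a bug like this can stay hidden.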
Change-Id: Ibbf90dcbcb36a5f7cf084a44a221c0c5c003b95a
In devstack/lib/ironic, IRONIC_DEPLOY_DRIVER is defined at line 341.
However, the variables that use IRONIC_DEPLOY_DRIVER in their default values
(e.g. IRONIC_DEPLOY_RAMDISK, IRONIC_DEPLOY_KERNEL, IRONIC_DEPLOY_ISO
and IRONIC_EFIBOOT) are defined at lines 276-282.
This causes a problem at lines 295-296:
if [[ "$IRONIC_BUILD_DEPLOY_RAMDISK" == "False" && \
! (-e "$IRONIC_DEPLOY_RAMDISK" && -e "$IRONIC_DEPLOY_KERNEL")
So, this commit moves the definition of IRONIC_DEPLOY_DRIVER before
its first use.
Change-Id: I74acb32714ce8830d4697fc796146b894aa7d8c9
One of the biggest frustrations larger operators have is when they
trigger a massive number of concurrent deployments. As one would
expect, the memory utilization of the conductor goes up. Except,
even with the default number of worker threads, if we're requested
to convert 80 images at the same time, or to perform the write-out
to the remote node at the same time, we will consume a large amount
of system RAM. Or more specifically, qemu-img will consume a large
amount of memory.
If the amount of free memory gets too low, the system can trigger the
OOM killer, which will kill processes using RAM. Ideally, we do not
want this to happen to our conductor process, much less the work
that is being performed, so we need to add some guard rails to help
keep us from entering into situations where we may compromise the
conductor by taking on too much work.
Adds a guard in the conductor to prevent multiple parallel
deployment operations from running the conductor out of memory.
With the defaults, the conductor will attempt to throttle back
automatically and hold worker threads, which slows down the
amount of work proceeding through the conductor while we are
in a memory condition where we should be careful about the work.
By default, the conductor waits 15 seconds between re-checks of
available RAM, for a total of six retries.
The default minimum is 1024 (MB), as this is the amount of memory
qemu-img allocates when trying to write images. This quite literally
means no additional qemu-img process can spawn until the
low-memory situation has resolved itself.
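The guard described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the conductor's implementation: the function name, its parameters, and the injectable `get_available_mb` and `sleep` callables are all hypothetical, and treating "six retries" as six checks in total is a guess. Only the default values (1024 MB, 15 seconds, six retries) come from the text above.

```python
import time


def wait_for_memory(get_available_mb, minimum_mb=1024, wait_seconds=15,
                    retries=6, sleep=time.sleep):
    """Return True once enough RAM is free, False if the retries run out.

    get_available_mb is a hypothetical callable returning free RAM in MB.
    """
    for _ in range(retries):
        if get_available_mb() >= minimum_mb:
            return True
        # Hold this worker so less work proceeds while memory is tight.
        sleep(wait_seconds)
    return False
```

A caller would run this check before starting a memory-hungry step such as image conversion, and fail or defer the operation when it returns False.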
Change-Id: I69db0169c564c5b22abd0cb1b890f409c13b0ac2
This reverts commit 05f2c8b79f0d6b7e9200bbc531ff621d2029da2e.
It is being reverted because the CentOS Stream images
contain extra, unnecessary libraries and packages, which
swell the ramdisk size substantially and cause failures
in CI: the compressed image size grew by about 100MB,
and uncompressed the Stream images
are 1.1GB.
Change-Id: Icc3a18ed12d309fd9a00f02d5e703dfeda50e86b
A new option allows embedding a CA certificate in the virtual media
ISO to allow fully secure TLS between ironic and IPA.
Depends-On: https://review.opendev.org/763207
Change-Id: Idaacf44fd829c441d708b11704a97f9cd2b7a74c
Also update the devstack plugin to use the same procedure.
Based on https://review.opendev.org/760423.
Change-Id: I8e20ad0fbc7e62e418b24ef56425328ec3a201b0
We're seeing cases where cleaning barely manages to finish after
a 2nd PXE retry, failing a job.
Also make the PXE retry timeout consistent between the CI and
local devstack installations.
Change-Id: I6dc7a91d1a482008cf4ec855a60a95ec0a1abe28
* libvirt is usually shipped with the correct nvram options
* Fedora already has the suitable iPXE ROM
Change-Id: Ia53e25cee646241c8dccc6296c62f90c6793abc7
At the moment, it's not possible to use diskimage-builder to
create images based on centos-minimal on ubuntu focal.
To be able to use ubuntu focal as nodeset, we start using
the centos dib element.
Change-Id: Ibe363729038da92c630c4d8c4deedb2507515999
As per the Victoria cycle testing runtime and community goal,
we need to migrate upstream CI/CD to Ubuntu Focal (20.04).
A few jobs are kept running on the bionic nodeset until
https://storyboard.openstack.org/#!/story/2008185 is fixed;
otherwise the base devstack jobs switching to Focal will block
the gate.
Change-Id: I1106c5c2b400e7db899959550eb1dc92577b319d
Story: #2007865
Task: #40188
This change marks the iscsi deploy interface as deprecated and
stops enabling it by default.
An online data migration is provided for iscsi->direct, provided that:
1) the direct deploy is enabled,
2) image_download_source!=swift.
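The two conditions for the migration can be written as a small predicate. This is only a sketch of the rule as stated above; the function name and its arguments are hypothetical and do not reflect how the actual online data migration is implemented.

```python
def should_migrate_to_direct(deploy_interface, enabled_deploy_interfaces,
                             image_download_source):
    """Decide whether a node would move from iscsi to direct deploy.

    Hypothetical helper mirroring the two stated conditions.
    """
    return (
        deploy_interface == "iscsi"
        # 1) the direct deploy interface is enabled
        and "direct" in enabled_deploy_interfaces
        # 2) images are not downloaded from swift
        and image_download_source != "swift"
    )
```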
The CI coverage for iscsi deploy is left only on standalone jobs.
Story: #2008114
Task: #40830
Change-Id: I4a66401b24c49c705861e0745867b7fc706a7509
In the devstack plugin, we identify RAX hosts and change the settings
slightly to help ensure CI job passage. This is because the machines are
fully emulated as opposed to paravirtualized, which causes a performance
impact on the test VMs that we create. Since we can hit issues
while scheduling on RAX in general, it only makes sense to also
swap the job out for centos based IPA jobs that somehow happen to land
on a RAX node. Since the CI jobs have largely been "de-typed"
name-wise, this should be fine.
Change-Id: I63fc5de05c3975433c12ddd9ba21ebb676de2e6c
As part of the plan to deprecate the iSCSI deploy interface, this change
sets this option to a value that will work out of the box for more
deployments. The standalone CI jobs are switched to http as well; the
rest of the jobs are left with swift. The explicit indirect jobs are removed.
Change-Id: Idc56a70478dfe65e9b936006a5355d6b96e536e1
Story: #2008114
Task: #40831
Removes the deprecated support for token-less agents. This
better secures the ironic-python-agent<->ironic interactions:
it helps ensure that heartbeat operations come from the same
node that originally checked in with Ironic, and that
commands sent to an agent originate from the same
ironic deployment that the agent checked in with to begin
with.
Story: 2007025
Task: 40814
Change-Id: Id7a3f402285c654bc4665dcd45bd0730128bf9b0
Initialize virtual volumes with no allocated initial space to prevent
consuming all space available during volume creation.
The behavior of virsh has changed: in the past, virtual
volumes were initialized with no initial allocated space by default,
while with recent versions we need to specify explicitly that no
initial space should be allocated for the virtual volumes.
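As an illustration only, one way to request zero initial allocation is the `--allocation` option of `virsh vol-create-as`. The sketch below just builds such a command line. The helper name, pool name, volume name and sizes are hypothetical, and the actual change may achieve the same result differently.

```python
def build_vol_create_command(pool, name, capacity, fmt="qcow2"):
    """Build a virsh command that creates a volume with no initial allocation.

    Hypothetical helper: returns the argument list without running it.
    """
    return [
        "virsh", "vol-create-as", pool, name, capacity,
        # Explicitly allocate no space up front.
        "--allocation", "0",
        "--format", fmt,
    ]
```

For example, `build_vol_create_command("default", "node-0.qcow2", "10G")` yields an argument list that could be passed to `subprocess.run`.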
Change-Id: I588d0888c19304119af607433de6e35199f7c555
It could cause issues during execution, and in general IDE- or
editor-specific configuration should not be kept here but in a
dedicated config file.
Change-Id: I4fbf7c6d00f2ba19a860d4c39cd882f541ff0600
In Ubuntu 20.04.1 LTS, the path of iptables is not /sbin/iptables but
/usr/sbin/iptables. So, the current code fails with the error
"/sbin/iptables" failed : No such file or directory
This commit fixes the use of iptables when the neutron L3 service is
enabled on Ubuntu 20.04.1 LTS.
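The general idea of not hard-coding the location of a binary can be sketched in Python with `shutil.which`, which searches a path list much as a shell does. This illustrates the principle only; the devstack fix itself is shell code, and the helper names below are hypothetical.

```python
import os
import shutil
import tempfile


def resolve_command(name, search_path=None):
    """Return the absolute path of an executable, or None if not found.

    search_path is an optional os.pathsep-separated directory list;
    when omitted, the PATH environment variable is used.
    """
    return shutil.which(name, path=search_path)


def demo():
    """Resolve a fake 'iptables' placed in a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        fake = os.path.join(tmp, "iptables")
        with open(fake, "w") as f:
            f.write("#!/bin/sh\n")
        os.chmod(fake, 0o755)  # mark the file as executable
        found = resolve_command("iptables", search_path=tmp)
        missing = resolve_command("no-such-command-for-this-demo",
                                  search_path=tmp)
        return found == fake, missing
```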
Change-Id: I76eb89a2ae26431065cd19f0af235e71eb9f4169
Sets the settings to enable the ramdisk ISO booting tests,
including a bootable ISO image that should boot the machine.
NB: The first depends-on is only for temporary testing of another
change, which changes the substrate ramdisk interface. Since this change
pulls in tempest testing for the ISO ramdisk and uses it, we might as well
use it to test whether that change works, as the other two patches
below are known to be in a good state.
Change-Id: I5d4213b0ba4f7884fb542e7d6680f95fc94e112e
It's very confusing that we use username/password everywhere, except
for [json_rpc]. Just use the standard options.
Also, the version of keystoneauth is bumped to one that supports
http_basic.
Change-Id: Icc834c3f8febd45c2548314ee00b85a7f9cebd2c
The CPU overhead of nested virtualization on RAX hosts is simply
too much for Ironic's CI to justify using full-size IPA images.
The failure rate is too high. As a result, let's use TinyIPA
images when we are not building a ramdisk to reduce that failure rate.
Change-Id: Ifa81397519833201b737cff89f61178c8835e3ca
When extending the timeouts for jobs to execute within,
we've observed a case where RAX hosts are cutting off at
the time limit of 900 seconds (as being asserted by another
change set). This is both good and bad. We know the timeout
feature works, but the agent was not quite online yet.
As such, we should also auto-extend base retry timeouts
so there is hope for the job to complete.
Change-Id: I8efa3a52188de558a7964d1daafd2225e102e251
RAX hosts use qemu software-emulated VMs without leveraging the
magic within the processors to help ensure speedy execution.
As such, they can be substantially slower in some operations, such
as decompressing ramdisks. This adds an unpredictable element to our
CI and causes job failures when they should have succeeded, which
causes more rechecks, which consumes more resources... and the cycle
continues.
So instead, we'll extend the timeout a little, to hopefully give the
job time to complete without causing failures.
Change-Id: I0cd08e527763f0626fd1e43cc3b87163a4b0d018
This patch updates devstack to automatically
set the new tempest configuration option `boot_mode`,
using the value from the IRONIC_BOOT_MODE variable.
Increase the number of VMs in ironic-tempest-ipa-partition-pxe_ipmitool
and ironic-tempest-ipa-wholedisk-bios-agent_ipmitool-tinyipa
to 2, since they run cleaning and we now run two tempest tests.
Depends-On: https://review.opendev.org/735960
Change-Id: Ic6faf73430e56e2b1ff19a72b1b03f8ef34eff5f
I did something stupid when I started driving forth the split of ipxe
from the pxe interface: I didn't think about the need to actually
separate bootloaders. In part, because the use case was a mixed
Power8/Power9 and x86 cluster. Mainly because the Power hardware
does not honor or care about the bootfile name provided over DHCP.
The firmware knows how to read the PXELINUX boot file format
and the machines are able to boot from there.
Where this all goes sideways is when:
* Enabled boot interfaces are set to ipxe,pxe
* No default boot interface is set
* Node is created without a default for x86 hardware.
* Node uses ipxe boot_interface, and creates files under /httpboot
* bootfile transmitted via DHCP is pxelinux.0.
Fun right?
The simple workaround for the power user is to just define the iPXE
loader, or maybe use UEFI. But that is neither here nor there, this
is still a bug and a possible use case is GRUB2 via PXE and iPXE.
Not that it would really work via ipxe, but hopefully people get the
idea.
The solution seems fairly clear: duplicate the configuration and
fall back if not defined.
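The "duplicate configuration and fall back" idea can be sketched as a lookup that prefers an iPXE-specific setting and uses the PXE one only when the former is unset. The function and the option names used in the example are hypothetical; this is not the option handling of the actual patch.

```python
def effective_option(conf, ipxe_key, pxe_key):
    """Return the iPXE-specific value, falling back to the PXE one if unset.

    conf is a plain dict standing in for the real configuration object.
    """
    value = conf.get(ipxe_key)
    if value:
        return value
    return conf.get(pxe_key)
```

With only a PXE boot file configured, the iPXE interface would still hand out that file, while an operator who sets the iPXE-specific option gets a separate bootloader per interface.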
Story: #2007003
Task: #40282
Change-Id: I4419254c23095929e52a0fda11789f2f5167dc6b
The ovmf package in bionic doesn't really work in our CI.
As a workaround we use the old package from xenial, but we can't keep
using it in Ubuntu Focal as well.
This patch aims to convert the uefi jobs to use Ubuntu Focal as
base operating system and use the native ovmf package.
Story: 2007785
Task: 40025
Change-Id: I653e5da2672b14eae88c6cab923b8617432f1dc1