739 Commits

Author SHA1 Message Date
Vanou Ishii
7552c489e3 Add iRMC Driver Support to DevStack Code
This commit adds logic
  * to determine whether irmc hardware type is enabled
  * (if enabled) to install python package python-scciclient & snmp
into DevStack code to support construction of Ironic environment
with iRMC supported Fujitsu server through DevStack.

Story: 2008722
Task: 42066
Change-Id: Ie50d8e4b43cdbfd8cd46333a75de20015e67829e
2021-03-17 18:48:09 +09:00
Dmitry Tantsur
7abac806a7 devstack: a safeguard for disabled tempurls
Change-Id: Id5fcd4cc1f73b80e8a9e9d2c50e2e4e1667c01cb
2021-02-25 12:09:30 +01:00
Zuul
6e0682377c Merge "Fix broken configdrive_use_object_store" 2021-02-23 18:08:57 +00:00
Dmitry Tantsur
73bdebd127 Fix broken configdrive_use_object_store
When it is set to True, we try to write text data to a binary file,
which is not possible in Python 3. The issue has been "helpfully"
hidden by the fact that we use bytes in unit tests, as well as
by lack of CI coverage.

Change-Id: Ibbf90dcbcb36a5f7cf084a44a221c0c5c003b95a
2021-02-18 10:25:07 +01:00
Zuul
6b9d7fa407 Merge "devstack: support installing ironic-lib from source in DIB IPA" 2021-02-18 04:04:40 +00:00
Zuul
52ff615c98 Merge "Guard conductor from consuming all of the ram" 2021-02-12 18:11:57 +00:00
Dmitry Tantsur
189b5e40cd devstack: support installing ironic-lib from source in DIB IPA
Depends-On: https://review.opendev.org/c/openstack/ironic-python-agent-builder/+/775153
Change-Id: I8734776bf59b5a34327624184c1c2360ccda330a
2021-02-11 14:46:49 +01:00
Vanou Ishii
13e77e2179 Fix Mis-Ordering of Bash Variable Definition in DevStack
In devstack/lib/ironic, IRONIC_DEPLOY_DRIVER is defined at line 341.
However, variables which use IRONIC_DEPLOY_DRIVER in their default
values (e.g. IRONIC_DEPLOY_RAMDISK, IRONIC_DEPLOY_KERNEL,
IRONIC_DEPLOY_ISO and IRONIC_EFIBOOT) are defined at lines 276-282.

This causes a problem at lines 295-296:

 if [[ "$IRONIC_BUILD_DEPLOY_RAMDISK" == "False" && \
         ! (-e "$IRONIC_DEPLOY_RAMDISK" && -e "$IRONIC_DEPLOY_KERNEL")

So, this commit moves the definition of IRONIC_DEPLOY_DRIVER before
its first use.
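
The ordering problem described above can be reproduced in isolation. The following is a minimal sketch, not the contents of devstack/lib/ironic: the file path and the driver value "ipmi" are assumptions chosen for illustration.

```shell
#!/bin/sh
# Hypothetical illustration; the path layout and driver value are assumptions.

# Buggy order: the derived default is expanded while the driver is still
# undefined, so the driver part of the file name comes out empty.
unset IRONIC_DEPLOY_DRIVER
BUGGY_RAMDISK="/opt/stack/files/ir-deploy-${IRONIC_DEPLOY_DRIVER:-}.initramfs"
IRONIC_DEPLOY_DRIVER="ipmi"

# Fixed order: define the driver first, then the values derived from it.
IRONIC_DEPLOY_DRIVER="ipmi"
FIXED_RAMDISK="/opt/stack/files/ir-deploy-${IRONIC_DEPLOY_DRIVER:-}.initramfs"

echo "buggy: $BUGGY_RAMDISK"
echo "fixed: $FIXED_RAMDISK"
```

With the buggy order, an existence check such as `-e "$BUGGY_RAMDISK"` looks for a file name that lacks the driver, which is how the condition at lines 295-296 can misfire.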

Change-Id: I74acb32714ce8830d4697fc796146b894aa7d8c9
2021-02-01 10:17:39 +09:00
Julia Kreger
d9913370de Guard conductor from consuming all of the ram
One of the biggest frustrations larger operators have is when they
trigger a massive number of concurrent deployments. As one would
expect, the memory utilization of the conductor goes up. Except,
even with the default number of worker threads, if we're requested
to convert 80 images at the same time, or to perform the write-out
to the remote node at the same time, we will consume a large amount
of system RAM. Or more specifically, qemu-img will consume a large
amount of memory.

If the amount of memory goes too low, the system can trigger
OOMKiller which will slay processes using ram. Ideally, we do not
want this to happen to our conductor process, much less the work
that is being performed, so we need to add some guard rails to help
keep us from entering into situations where we may compromise the
conductor by taking on too much work.

Adds a guard in the conductor to prevent multiple parallel
deployment operations from running the conductor out of memory.

With the defaults, the conductor will attempt to throttle back
automatically and hold worker threads which will slow down the
amount of work also proceeding through the conductor, as we are
in a memory condition where we should be careful about the work.

With the defaults, the conductor waits 15 seconds between re-checks
of available RAM, for a total of six retries.
The minimum default is 1024 (MB), as this is the amount of memory
qemu-img allocates when trying to write images. This quite literally
means no additional qemu-img process can spawn until the default
memory situation has resolved itself.
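
The retry behaviour described above can be sketched roughly as follows. This is a shell illustration only; the actual guard lives in the Python conductor. The variable and function names are hypothetical, the numbers are the defaults quoted in this message, and returning a failure once the retries are exhausted is an assumption about what a caller would want.

```shell
#!/bin/sh
# Sketch only: names are hypothetical, values are the defaults quoted above.
MIN_MB=1024        # minimum free memory, in MB
RETRIES=6          # number of re-checks after the first check
WAIT_SECONDS=15    # pause between re-checks

# available_mb: print available memory in MB (reads /proc/meminfo, Linux only).
available_mb() {
    awk '/^MemAvailable:/ { print int($2 / 1024) }' /proc/meminfo
}

# wait_for_memory: return 0 as soon as enough memory is available,
# or 1 once all retries have been used (assumed behaviour).
wait_for_memory() {
    attempt=0
    while :; do
        if [ "$(available_mb)" -ge "$MIN_MB" ]; then
            return 0
        fi
        if [ "$attempt" -ge "$RETRIES" ]; then
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$WAIT_SECONDS"
    done
}
```

A caller would gate memory-hungry work on it, for example `wait_for_memory && qemu-img convert ...`, so that no new conversion starts while memory is short.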

Change-Id: I69db0169c564c5b22abd0cb1b890f409c13b0ac2
2021-01-29 14:33:57 -08:00
Zuul
6c9e28dd50 Merge "Inject TLS certificate when using virtual media" 2020-12-19 22:14:12 +00:00
Dmitry Tantsur
8b83e9ec62 Revert "devstack: build DIB images with CentOS Stream by default"
This reverts commit 05f2c8b79f0d6b7e9200bbc531ff621d2029da2e.

It is being reverted because the CentOS Stream images contain
extra, unnecessary libraries and packages, which swell the
ramdisk size substantially and cause failures in CI: the
compressed image size grew by about 100MB, and uncompressed
the Stream images are 1.1GB.

Change-Id: Icc3a18ed12d309fd9a00f02d5e703dfeda50e86b
2020-12-15 14:20:13 +00:00
Dmitry Tantsur
628109f960 Inject TLS certificate when using virtual media
A new option allows embedding a CA certificate in the virtual media
ISO to allow fully secure TLS between ironic and IPA.

Depends-On: https://review.opendev.org/763207
Change-Id: Idaacf44fd829c441d708b11704a97f9cd2b7a74c
2020-12-15 13:41:50 +01:00
Dmitry Tantsur
05f2c8b79f devstack: build DIB images with CentOS Stream by default
Change-Id: I50edd6b2740a26d00be19abc58c3ff770417fb68
2020-12-11 12:02:45 +01:00
Dmitry Tantsur
31f3f9fca1 Document how to build an ESP image for redfish-virtual-media
Also update the devstack plugin to use the same procedure.

Based on https://review.opendev.org/760423.

Change-Id: I8e20ad0fbc7e62e418b24ef56425328ec3a201b0
2020-11-10 19:19:07 +01:00
Zuul
7080f2ce20 Merge "devstack: log all requests to sushy-emulator" 2020-11-02 12:36:31 +00:00
Zuul
09e246294b Merge "CI: increase cleaning timeout and tie it to PXE boot timeout" 2020-10-30 19:19:20 +00:00
Dmitry Tantsur
db7c0c069d devstack: log all requests to sushy-emulator
Change-Id: Ie2e97ebd51182ccec488199a34c27a3d8a2a02b9
2020-10-28 15:56:12 +01:00
Dmitry Tantsur
87e634dee1 CI: increase cleaning timeout and tie it to PXE boot timeout
We're seeing cases where cleaning barely manages to finish after
a 2nd PXE retry, failing a job.

Also make the PXE retry timeout consistent between the CI and
local devstack installations.

Change-Id: I6dc7a91d1a482008cf4ec855a60a95ec0a1abe28
2020-10-27 12:21:43 +01:00
Dmitry Tantsur
fece20c8a9 devstack: remove no longer required UEFI hacks
* libvirt is usually shipped with the correct nvram options
* Fedora already has the suitable iPXE ROM

Change-Id: Ia53e25cee646241c8dccc6296c62f90c6793abc7
2020-10-26 14:35:17 +01:00
Riccardo Pittau
86bee227c9 Use centos as base element for dib images
At the moment, it's not possible to use diskimage-builder to
create images based on centos-minimal on ubuntu focal.
To be able to use ubuntu focal as nodeset, we start using
the centos dib element.

Change-Id: Ibe363729038da92c630c4d8c4deedb2507515999
2020-10-14 09:54:54 +02:00
Riccardo Pittau
4775288b93 migrate testing to ubuntu focal
As per the Victoria cycle testing runtime and community goal,
we need to migrate upstream CI/CD to Ubuntu Focal (20.04).

Keep a few jobs running on the bionic nodeset until
https://storyboard.openstack.org/#!/story/2008185 is fixed;
otherwise, base devstack jobs switching to Focal will block
the gate.

Change-Id: I1106c5c2b400e7db899959550eb1dc92577b319d
Story: #2007865
Task: #40188
2020-10-05 17:05:49 +02:00
Zuul
33a34ce364 Merge "Feat: add ibmc hardware info support for devstack" 2020-10-01 13:03:59 +00:00
Dmitry Tantsur
365005ba48 devstack: do not default to swift if SWIFT_ENABLE_TEMPURLS is False
Change-Id: I93a0b548d42d4819917e61f87a0823ff6139b697
2020-09-29 15:29:21 +02:00
Zuul
28c0f38322 Merge "Deprecate the iscsi deploy interface" 2020-09-24 06:09:10 +00:00
Dmitry Tantsur
d8dccc8d06 Deprecate the iscsi deploy interface
This change marks the iscsi deploy interface as deprecated and
stops enabling it by default.

An online data migration is provided for iscsi->direct, provided that:
1) the direct deploy is enabled,
2) image_download_source!=swift.

The CI coverage for iscsi deploy is left only on standalone jobs.

Story: #2008114
Task: #40830
Change-Id: I4a66401b24c49c705861e0745867b7fc706a7509
2020-09-22 15:39:36 +02:00
Julia Kreger
4e8f664ea2 CI: Remove the build check for pre-build ramdisks only
In the devstack plugin, we identify RAX hosts and change the settings
slightly to help ensure CI job passage. This is because the machines are
fully emulated as opposed to paravirtualized which causes a performance
impact with our test VMs that we create. Since we can hit issues
while scheduling on RAX in general, it only makes sense to also
swap the job out for centos based IPA jobs that somehow land on
a RAX node. Since the CI jobs have largely been "de-typed"
name-wise, this should be fine.

Change-Id: I63fc5de05c3975433c12ddd9ba21ebb676de2e6c
2020-09-21 09:14:25 -07:00
Iury Gregory Melo Ferreira
1f0174bb41 Native zuulv3 grenade multinode multitenant
Based on the native 'grenade-multinode' job

Change-Id: I4d0a23c371bc42c5bf18e79ea7920bd77b066154
2020-09-16 23:33:42 +02:00
Dmitry Tantsur
b5d5e5774c Change [agent]image_download_source=http
As part of the plan to deprecate the iSCSI deploy interface, changing
this option to a value that will work out-of-box for more deployments.

The standalone CI jobs are switched to http as well, the rest of jobs
are left with swift. The explicit indirect jobs are removed.

Change-Id: Idc56a70478dfe65e9b936006a5355d6b96e536e1
Story: #2008114
Task: #40831
2020-09-08 16:28:31 +02:00
Zuul
b6cf0432a7 Merge "Remove token-less agent support" 2020-09-07 15:07:17 +00:00
Zuul
3709cce11f Merge "ISO ramdisk virtual media test enablement" 2020-09-06 12:07:02 +00:00
Julia Kreger
5b272b0c46 Remove token-less agent support
Removes the deprecated support for token-less agents which
better secures the ironic-python-agent<->ironic interactions
to help ensure heartbeat operations are coming from the same
node which originally checked in with Ironic and that
commands coming to an agent are originating from the same
ironic deployment which the agent checked in with to begin
with.

Story: 2007025
Task: 40814
Change-Id: Id7a3f402285c654bc4665dcd45bd0730128bf9b0
2020-09-04 17:09:39 +00:00
Zuul
aee4846f3a Merge "[trivial] remove emacs config from devstack script" 2020-09-03 12:58:24 +00:00
Zuul
14e2ac1cb4 Merge "Remove absolute path with iptables when L3 enabled" 2020-09-03 11:12:02 +00:00
Riccardo Pittau
ce39115fa7 Explicitly do not allocate initial space for virtual volumes
Initialize virtual volumes with no allocated initial space to prevent
consuming all space available during volume creation.
The behavior of virsh has changed: in the past, virtual volumes were
initialized with no initial allocated space by default, while with
recent versions we need to explicitly request that no initial space
be allocated for the virtual volumes.
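
As an illustration of the explicit request: `virsh vol-create-as` accepts an `--allocation` option. The sketch below only prints such a command. The pool name, volume name, and size are hypothetical, and whether the plugin uses exactly this invocation is an assumption.

```shell
#!/bin/sh
# Sketch only: pool, volume name and capacity are hypothetical examples.

# vol_create_cmd: print (do not run) a virsh command that creates a qcow2
# volume with zero bytes allocated up front.
vol_create_cmd() {
    pool=$1
    name=$2
    capacity=$3
    echo "virsh vol-create-as $pool $name $capacity --format qcow2 --allocation 0"
}

vol_create_cmd default node-0.qcow2 10G
```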

Change-Id: I588d0888c19304119af607433de6e35199f7c555
2020-09-02 11:10:58 +00:00
Qianbiao.NG
f28d30eafe Feat: add ibmc hardware info support for devstack
Add ibmc hardware type support when generating testing nodes for
ironic devstack

Change-Id: I6f9f129256151f0a721f2b1962477ddef0ab453e
2020-09-01 20:30:13 +08:00
Riccardo Pittau
ed6f9a14ec [trivial] remove emacs config from devstack script
It could cause issues during execution, and in general IDE- or
editor-specific config should not be kept here but in a dedicated
config file.

Change-Id: I4fbf7c6d00f2ba19a860d4c39cd882f541ff0600
2020-08-31 10:27:17 +02:00
Yushiro FURUKAWA
02fc64a35b Remove absolute path with iptables when L3 enabled
In Ubuntu 20.04.1 LTS, the path of iptables is not /sbin/iptables but
/usr/sbin/iptables. So, the current code fails with
"/sbin/iptables" failed : No such file or directory
This commit calls iptables without an absolute path when the neutron L3
service is enabled, which fixes the error on Ubuntu 20.04.1 LTS.
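
The idea of relying on PATH rather than a fixed location can be sketched as below. The helper name `resolve_tool` is hypothetical and is not part of the devstack plugin.

```shell
#!/bin/sh
# Sketch only: resolve_tool is a hypothetical helper.

# resolve_tool: print the path of a command found on PATH, or fail.
resolve_tool() {
    command -v "$1"
}

# Instead of hard-coding /sbin/iptables, look the binary up on PATH so the
# same script works whether it lives in /sbin or /usr/sbin.
if IPTABLES=$(resolve_tool iptables); then
    echo "using $IPTABLES"
else
    echo "iptables not found on PATH" >&2
fi
```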

Change-Id: I76eb89a2ae26431065cd19f0af235e71eb9f4169
2020-08-31 08:54:38 +09:00
Julia Kreger
9e7f1cb570 ISO ramdisk virtual media test enablement
Sets the settings to enable the ramdisk iso booting tests
including a bootable ISO image that should boot the machine.

NB: The first depends-on is only for temporary testing of another
which changes the substrate ramdisk interface. Since this change pulls
in tempest testing for iso ramdisk and uses it, might as well
use it to test if the change works or not as the other two patches
below are known to be in a good state.

Change-Id: I5d4213b0ba4f7884fb542e7d6680f95fc94e112e
2020-08-17 10:14:38 -07:00
Dmitry Tantsur
ed10d7ed30 Enable deploy-time software RAID in standalone jobs
Change-Id: I56ef54cf897988566bf07fd13012590a6b4445fa
Depends-On: https://review.opendev.org/741227
2020-07-30 18:13:46 +02:00
Zuul
3670be1283 Merge "Deprecate http_basic_username and http_basic_password in [json_rpc]" 2020-07-28 19:14:00 +00:00
Dmitry Tantsur
74e9e1d82a Deprecate http_basic_username and http_basic_password in [json_rpc]
It's very confusing that we use username/password everywhere, except
for [json_rpc]. Just use the standard options.

Also, the version of keystoneauth is bumped to one that supports
http_basic.

Change-Id: Icc834c3f8febd45c2548314ee00b85a7f9cebd2c
2020-07-24 11:51:41 +02:00
Julia Kreger
6dfc409133 Force RAX hosts to run tinyipa
The CPU overhead of nested virtualization on rax hosts simply
is too much for Ironic's CI to justify using full size IPA images.

The failure rate is simply too high. As a result, let's use TinyIPA
images when we are not building a ramdisk to reduce that failure rate.

Change-Id: Ifa81397519833201b737cff89f61178c8835e3ca
2020-07-23 16:33:34 +00:00
Julia Kreger
67e51af6d5 Extend PXE boot retry timeout for RAX hosts
When extending the timeouts for jobs to execute within,
we've observed a case where RAX hosts are cutting off at
the time limit of 900 seconds (as being asserted by another
change set). This is both good and bad. We know the timeout
feature works, but the agent was not quite online yet.

As such, we should also auto-extend base retry timeouts
so there is hope for the job to complete.

Change-Id: I8efa3a52188de558a7964d1daafd2225e102e251
2020-07-22 10:41:07 -07:00
Julia Kreger
3750ba62df Auto extend the timeout for RAX hosts
RAX hosts use qemu software emulated VMs without leveraging the
magic within the processors to help ensure speedy execution.

As such, they can be substantially slower in some operations, such
as decompressing ramdisks. This adds an unpredictable element into our
CI and causes job failures when they should have succeeded, which
causes more rechecks, which consumes more resources... and the cycle
continues.

So instead, we'll extend the timeout a little, to hopefully give the
job time to complete without causing failures.

Change-Id: I0cd08e527763f0626fd1e43cc3b87163a4b0d018
2020-07-17 16:16:59 -07:00
Zuul
3c1ab6136f Merge "add tempest boot_mode config" 2020-07-09 10:07:07 +00:00
Iury Gregory Melo Ferreira
ddbc4a6a09 add tempest boot_mode config
This patch updates the devstack plugin to automatically
set the new tempest configuration `boot_mode`,
using the value from the IRONIC_BOOT_MODE variable.

Increase the number of VMs in ironic-tempest-ipa-partition-pxe_ipmitool
and ironic-tempest-ipa-wholedisk-bios-agent_ipmitool-tinyipa
to 2 since it runs cleaning and now we run two tempest tests.

Depends-On: https://review.opendev.org/735960
Change-Id: Ic6faf73430e56e2b1ff19a72b1b03f8ef34eff5f
2020-07-08 21:55:02 +02:00
Zuul
19866e3ddb Merge "Provide a path to set explicit ipxe bootloaders" 2020-07-08 12:08:48 +00:00
Julia Kreger
5f7d84f483 Provide a path to set explicit ipxe bootloaders
I did something stupid when I started driving forth the split of ipxe
from the pxe interface: I didn't think about the need to actually
separate bootloaders. In part, because the use case was a mixed
Power8/Power9 and x86 cluster. Mainly because the Power hardware
does not honor or care about the bootfile name provided over DHCP.
The firmware knows how to read the PXELINUX boot file format
and the machines are able to boot from there.

Where this all goes sideways is when:
* Enabled boot interfaces are set to ipxe,pxe
* No default boot interface is set
* Node is created without a default for x86 hardware.
* Node uses ipxe boot_interface, and creates files under /httpboot
* bootfile transmitted via DHCP is pxelinux.0.

Fun, right?

The simple workaround for the power user is to just define the iPXE
loader, or maybe use UEFI. But that is neither here nor there, this
is still a bug and a possible use case is GRUB2 via PXE and iPXE.
Not that it would really work via ipxe, but hopefully people get the
idea.

The solution kind of seems clear, duplicate configuration and
fallback if not defined.
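
The "duplicate configuration and fallback" approach can be sketched in shell as follows. The variable names and the file names `pxelinux.0` and `undionly.kpxe` are illustrative assumptions, not the option names added by this change.

```shell
#!/bin/sh
# Sketch only: variable names are hypothetical stand-ins for config options.
PXE_BOOTFILE="pxelinux.0"   # existing, shared bootloader setting
IPXE_BOOTFILE=""            # new, iPXE-specific setting; empty means "not set"

# effective_ipxe_bootfile: use the iPXE-specific value when it is defined,
# otherwise fall back to the shared PXE value.
effective_ipxe_bootfile() {
    echo "${IPXE_BOOTFILE:-$PXE_BOOTFILE}"
}

effective_ipxe_bootfile
```

Deployments that never set the new option keep their current behaviour, while mixed ipxe/pxe deployments can give each interface its own bootloader.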

Story: #2007003
Task: #40282
Change-Id: I4419254c23095929e52a0fda11789f2f5167dc6b
2020-07-07 12:38:33 -07:00
Riccardo Pittau
daca490226 Follow up of fix uefi jobs with ovmf native ubuntu package
Following up on comments from https://review.opendev.org/716889

Change-Id: I805a65478f469b1b4e25c1bf2397f034f61d6ec7
2020-07-07 12:04:56 +02:00
Riccardo Pittau
aac89c2149 Fix uefi jobs with native ubuntu ovmf package
The ovmf package in bionic doesn't really work in our CI.
As a workaround we use the old package from xenial, but we can't keep
using it also in Ubuntu Focal.
This patch aims to convert the uefi jobs to use Ubuntu Focal as
base operating system and use the native ovmf package.

Story: 2007785
Task: 40025

Change-Id: I653e5da2672b14eae88c6cab923b8617432f1dc1
2020-07-02 17:10:36 +00:00