283 Commits

Author SHA1 Message Date
Dmitry Tantsur
f824930bbd
Import disk_{utils,partitioner} from ironic-lib
With the iscsi deploy long gone, these modules are only used in IPA and
in fact represent a large part of its critical logic. Having them
separately sometimes makes fixing issues tricky if an interface of
a function needs changing.

This change imports the code mostly as it is, just removing run_as_root and
a deprecated function, as well as moving configuration options to config.py.

Also migrates one relevant function from ironic_lib.utils.

Change-Id: If8fae8210d85c61abb85c388b300e40a75d0531c
2024-03-15 18:45:04 +01:00
Dmitry Tantsur
9f849472ca
Drop usage of run_as_root
IPA can only be run as root and does not use rootwrap. We need to
eventually remove support for rootwrap from ironic-lib.

Change-Id: Iffd5cae5e3dc8637bc6dd10b3bcc9fe33932b8cf
2024-01-23 14:23:23 +01:00
Zuul
d298e06b49 Merge "[codespell] Fix spelling issues in IPA" 2024-01-08 17:22:02 +00:00
Jay Faulkner
dcaed43ef9 Update to latest pep8/code style versions
Update various linting programs to their latest version, and fix any
issues created by the update.

Change-Id: I014c846560663a76a1663b568ef48659d0ab6d4d
2023-12-28 14:19:27 -08:00
Jay Faulkner
36e5993a04 [codespell] Fix spelling issues in IPA
This fixes several spelling issues identified by codespell. In some
cases, I may have manually modified a line to make the output more clear
or to correct grammatical issues which were obvious in the codespell
output.

Later changes in this chain will provide the codespell config used to
generate this, as well as adding this commit's SHA, once landed, to a
.git-blame-ignore-revs file to ensure it will not pollute git history
for modern clients.

Related-Bug: 2047654
Change-Id: I240cf8484865c9b748ceb51f3c7b9fd973cb5ada
2023-12-28 10:54:46 -08:00
Jay Faulkner
3d42298619 Remove standby.cache_image support
Image caching was never fully supported in Ironic or IPA; this is vestigial
code left over from a partial implementation.

Even if we implemented it today, we'd likely use a completely different
methodology.

Change-Id: Id4ab7b3c4f106b209585dbd090cdcb229b1daa73
2023-10-24 15:02:44 -07:00
Zuul
b42f0be422 Merge "implement basic-auth support for user-image download process" 2023-10-13 17:08:28 +00:00
Julia Kreger
cb61a8d6c0 Retry on checksum failures
HTTP is a fun protocol.

Size is basically optional, and clients implicitly trust that the server
and socket have transferred all the bytes. Which *really* means you
should always checksum.

But... previously we didn't checksum as part of retrying.

So if anything happened with python-requests, or lower level
library code or the system itself causing bytes to be lost off the
buffer, creating an incomplete transfer situation, then we wouldn't
know until the checksum.

So now, we checksum and re-trigger the download if there is a
failure of the checksum.

This involved a minor shift in the download logic, and resulted in
a needful minor fix to an image checksum test as it would loop for
90 seconds as well.

Closes-Bug: 2038934
Change-Id: I543a60555a2621b49dd7b6564bd0654a46db2e9a
2023-10-10 09:15:31 -07:00
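
A minimal sketch of the retry-on-checksum-failure idea described in the commit above, not the agent's actual code. The names `download_with_retries`, `fetch`, and `max_attempts` are hypothetical, and `fetch` stands in for whatever performs the HTTP transfer; SHA-256 is assumed only for illustration.

```python
import hashlib


def download_with_retries(fetch, expected_sha256, max_attempts=3):
    """Call fetch() until the payload matches the expected checksum.

    fetch is a hypothetical zero-argument callable returning the
    downloaded bytes; it stands in for the real HTTP transfer.
    Returns (data, attempt_number) on success.
    """
    last_digest = None
    for attempt in range(1, max_attempts + 1):
        data = fetch()
        last_digest = hashlib.sha256(data).hexdigest()
        if last_digest == expected_sha256:
            return data, attempt
        # A mismatch may simply mean bytes were lost in transit,
        # so the download is triggered again instead of failing.
    raise ValueError(
        "checksum mismatch after %d attempts: got %s, expected %s"
        % (max_attempts, last_digest, expected_sha256))
```

Bounding the number of attempts keeps a persistently corrupt image from looping forever, which is the kind of behavior the test fix in the commit had to account for.
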
Adam Rozman
70961789a6 implement basic-auth support for user-image download process
This feature was proposed in https://bugs.launchpad.net/ironic-python-agent/+bug/2021947

Change-Id: I9dbfc1402240beb75b6736214753fd86dccae676
2023-10-10 16:25:51 +03:00
Julia Kreger
eb95273ffb Add get_service_steps logic to the agent
Initial code patches for service steps have merged in
ironic, and it is now time to add support into the
agent which allows service steps to be raised to
the service.

Updates the default hardware manager version to 1.2,
which has *rarely* been incremented due to oversight.

Change-Id: Iabd2c6c551389ec3c24e94b71245b1250345f7a7
2023-08-31 06:22:22 -07:00
Julia Kreger
c65ad42ff1 Log the number of bytes downloaded
When troubleshooting download issues, which may present
as checksum validation failures, it is difficult to understand
if the *entire* file was downloaded due to the way HTTP works.

In that, a download may start with a successful result code,
and the content is streamed out until the socket is closed.

But with HTTP there is no way to know if that socket closed
prematurely and the original server size is *also* an optional
field, so just log the size we got so we don't drive the
humans [more-]insane.

Also now logs the (optional) content-length field if
supplied by the server.

Change-Id: Id71b167f4e330d54b9afddf95f1a2ef9e40398bf
2023-07-19 16:20:40 +00:00
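
As a rough illustration of the logging described in the commit above, the sketch below counts the bytes actually received and compares them with the optional Content-Length value. The function name `consume_and_count` and its return shape are assumptions made for this example, not the agent's API.

```python
import logging

LOG = logging.getLogger(__name__)


def consume_and_count(chunks, content_length=None):
    """Consume an iterable of byte chunks and report how much arrived.

    content_length is the optional Content-Length header value (a
    string or None). Returns (bytes_received, complete), where complete
    is None when the server supplied no length to compare against.
    """
    received = 0
    for chunk in chunks:
        received += len(chunk)
    LOG.info("Downloaded %d bytes", received)
    if content_length is None:
        return received, None
    expected = int(content_length)
    LOG.info("Server reported Content-Length of %d bytes", expected)
    return received, received == expected
```
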
Zuul
bb156aad6c Merge "Fix Bandit errors" 2023-06-26 09:25:09 +00:00
Julia Kreger
78c1343a54 Fix Bandit errors
Bandit 1.7.5 released with a timeout check for all requests and
urllib calls.

Fixed those.

Doing so exposed a Bandit B310 issue, which the code already
handled; it is now explicitly marked as such.

Also makes the Bandit checks voting in CI.

Change-Id: If0e87790191f5f3648366d571e1d85dd7393a548
2023-06-06 08:34:55 -07:00
Zuul
141c5ff1c3 Merge "Add support for CentOS SUM files" 2023-05-09 09:03:25 +00:00
Harald Jensås
e7a048ecbe
Add support for CentOS SUM files
CentOS Stream SUM files use the format:
  # FILENAME: <size> bytes
  ALGORITHM (FILENAME) = CHECKSUM

Compared to the more common format:
  CHECKSUM  *FILE_A
  CHECKSUM  FILE_B

Use regular expressions to check for the filename both
in the middle with parentheses and at the end.
Similarly look for valid checksums at the beginning or
end of the line. Also look for known checksum patterns in
case the file only contains the checksum itself.

Change-Id: I9e49c1a6c66e51a7b884485f0bcaf7f1802bda33
2023-05-03 21:31:23 +02:00
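
The two file layouts named in the commit above can be matched with a pair of regular expressions. This is a sketch under stated assumptions, not the patterns the agent ships: the names `BSD_STYLE`, `GNU_STYLE`, `BARE`, and `find_checksum` are hypothetical, and the bare-digest length range (32 to 128 hex characters) is chosen only to cover MD5 through SHA-512.

```python
import re

# "ALGORITHM (FILENAME) = CHECKSUM", the CentOS Stream style.
BSD_STYLE = re.compile(
    r'^(?P<algo>\w+)\s+\((?P<name>.+)\)\s+=\s+(?P<sum>[0-9a-fA-F]+)$')
# "CHECKSUM  FILE" or "CHECKSUM  *FILE", the more common style.
GNU_STYLE = re.compile(r'^(?P<sum>[0-9a-fA-F]+)\s+\*?(?P<name>.+)$')
# A line holding nothing but a hex digest.
BARE = re.compile(r'^[0-9a-fA-F]{32,128}$')


def find_checksum(text, filename):
    """Return the checksum recorded for filename, or None if absent."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip blanks and "# FILENAME: <size> bytes" lines
        if BARE.match(line):
            return line  # the file contains only the checksum itself
        for pattern in (BSD_STYLE, GNU_STYLE):
            match = pattern.match(line)
            if match and match.group('name') == filename:
                return match.group('sum')
    return None
```
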
Julia Kreger
c05fdf790c Fix checksum validation logic
The checksum validation logic, which was updated early in the
process of deprecating MD5, didn't account for a URL *or* a
longer checksum (i.e. sha256/sha512), both of which were settled on
while the overall approach was still being decided.

Fixes the logic, and adds additional tests.

Change-Id: Ic4053776e131fc02ace295a1e69e9f9faab47f42
2023-05-02 17:24:57 -07:00
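
One way to picture the cases the commit above refers to (a URL, an MD5-length value, or a longer digest) is to classify the supplied checksum string first. The sketch below assumes the algorithm can be inferred from the hex digest length; `classify_checksum` and `HEX_LENGTH_TO_ALGO` are hypothetical names, and the real validation logic may differ.

```python
import string

# Hex digest lengths of the algorithms mentioned in the commit message.
HEX_LENGTH_TO_ALGO = {32: 'md5', 64: 'sha256', 128: 'sha512'}


def classify_checksum(value):
    """Return 'url', 'md5', 'sha256' or 'sha512' for a checksum field."""
    value = value.strip()
    if value.startswith(('http://', 'https://')):
        # The checksum must be fetched from a checksum file first.
        return 'url'
    algo = HEX_LENGTH_TO_ALGO.get(len(value))
    if algo is None or not all(c in string.hexdigits for c in value):
        raise ValueError('unrecognized checksum: %r' % value)
    return algo
```
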
Julia Kreger
32df26a22a Disable MD5 image checksums
MD5 image checksums have long been superseded by the use of the
``os_hash_algo`` and ``os_hash_value`` fields as part of the
properties of an image.

In the process of doing this, we determined that checksum via
URL usage was non-trivial and determined that an appropriate
path was to allow the checksum type to be determined as needed.

Change-Id: I26ba8f8c37d663096f558e83028ff463d31bd4e6
2023-04-24 16:54:42 -07:00
Vanou Ishii
0bf579c955 Fix failure of bind mount in _install_grub2
When IPA runs _install_grub2, it tries to bind mount /dev, /proc and /run
to <temporary directory where the root partition is mounted>/{dev,proc,run}.
However, that bind mount fails because those mount points do not exist
under the temporary directory.
To fix this failure, this patch adds a mkdir command before the bind mount.

Story: 2010292
Task: 46273
Change-Id: I434ce1bf1863ee0f11c4d09918d6d2d8dc065c02
2022-09-22 19:34:12 +09:00
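
The fix in the commit above amounts to creating each mount point before bind mounting onto it. A minimal sketch, assuming a hypothetical helper `prepare_bind_mounts` that creates the directories and returns the mount commands rather than running them (running them requires root):

```python
import os


def prepare_bind_mounts(root, sources=('/dev', '/proc', '/run')):
    """Create mount points under root and return the bind mount commands.

    root is the directory where the root partition is mounted. The
    commands are returned as argument lists instead of being executed.
    """
    commands = []
    for source in sources:
        target = os.path.join(root, source.lstrip('/'))
        # Without this step the bind mount fails when the image does
        # not already contain the directory.
        os.makedirs(target, exist_ok=True)
        commands.append(['mount', '-o', 'bind', source, target])
    return commands
```
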
Dmitry Tantsur
6a1334a068 Drop support for instance netboot
Change-Id: I2b4c543537dac8904028fdcdb590c1c214238e10
2022-07-07 16:38:22 +02:00
waleedm
eb07839bd4 Fix passing kwargs in clean steps
Pass kwargs to dispatch_to_managers method in execute_clean_step

Change-Id: Ida4ed4646659b2ee3f8f92b0a4d73c0266dd5a99
Story: 2010123
Task: 45705
2022-07-01 23:03:55 +00:00
Julia Kreger
99ca1086db Create fstab entry with appropriate label
Depending on how the stars align with partition images
being written to a remote system, we *may* end up with
*either* a Partition UUID value, or a Partition's UUID value.

Which are distinctly different.

This is because the value, when collected as a result of writing
an image to disk *falls* back and passes the value to enable
partition discovery and matching.

Later on, when we realized we ought to create an fstab entry,
we blindly re-used the value thinking it was, indeed, always
a Partition's UUID and not the Partition UUID. Obviously,
the label type is quite explicit, either UUID or PARTUUID
respectively, when initial ramdisk utilities such as dracut
are searching and mounting filesystems.

Adds capability to identify the correct label to utilize
based upon the current state of the block devices on disk.

Granted, we are likely only exposed to this because of IO
race conditions under high concurrency load operations.
Normally this would only be seen on test VMs, but
systems being backed by a Storage Area Network *can*
exhibit the same IO race conditions as virtual machines.

Change-Id: I953c936cbf8fad889108cbf4e50b1a15f511b38c
Resolves: rhbz#2058717
Story: #2009881
Task: 44623
2022-03-10 07:04:01 -08:00
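
To illustrate the UUID versus PARTUUID distinction from the commit above, the sketch below looks a value up in the current block device state and picks the matching fstab label. The helper names, the `block_devices` shape (a list of dicts with `uuid` and `partuuid` keys, for example parsed from lsblk output), and the `defaults 0 0` fstab fields are all assumptions for this example.

```python
def identify_label_type(value, block_devices):
    """Return 'UUID' or 'PARTUUID' depending on which field holds value.

    block_devices is a hypothetical list of dicts describing the
    devices currently on disk. Returns None when nothing matches.
    """
    for device in block_devices:
        if device.get('uuid') == value:
            return 'UUID'  # filesystem UUID
        if device.get('partuuid') == value:
            return 'PARTUUID'  # partition table entry UUID
    return None


def build_fstab_entry(value, block_devices, mount_point, fs_type):
    """Build an fstab line that uses the label type matching value."""
    label_type = identify_label_type(value, block_devices)
    if label_type is None:
        raise LookupError('no block device matches %s' % value)
    return '%s=%s\t%s\t%s\tdefaults\t0\t0' % (
        label_type, value, mount_point, fs_type)
```
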
Arne Wiebalck
a83f38479e Move prepare_boot_partitions_for_softraid to raid_utils
prepare_boot_partitions_for_softraid() is used in BIOS and UEFI
modes to prepare the partitions for the bootloader. Move it from
the image extensions to raid_utils to reflect this and avoid the
import of an extension to efi_utils.

Follow-up to 62c5674a600baeeef0af3b12baeab486870eb103.

Change-Id: I9f5974fbbfea5e8cdfbb7e49bea375e5cbfdd145
2022-02-14 11:21:36 +01:00
Dmitry Tantsur
6ebf041704 Use canonical device name for RAID device for ESP
It seems like tinyIPA silently replaces /dev/md/esp with /dev/md127.
Find the next free /dev/md device and use it instead.

Also rescan the resulting device before copying files.

Change-Id: Ie04f530be434c4b1561e75f387b9da679e4607e0
Depends-On: https://review.opendev.org/c/openstack/ironic/+/827129/
2022-02-01 10:06:31 +01:00
Arne Wiebalck
62c5674a60 SoftwareRAID: Use efibootmgr (and drop grub2-install)
Move the software RAID code path from grub2-install to
efibootmgr:

- remove the UEFI efibootmgr exception for software RAID
- create and populate the ESPs on the holder disks
- update the NVRAM with all ESPs (the component devices
  of the ESP mirror, use unique labels to avoid unintentional
  deduplication of entries in the NVRAM)

Story: #2009794

Change-Id: I7ed34e595215194a589c2f1cd0b39ff0336da8f1
2022-01-26 14:43:40 +01:00
Derek Higgins
12f5f30e63 Instruct qemu-img to write image zeros to disk.
Doing this will cause it not to zero out the entire
block device which can be very costly on a slow HDD.

Story: 2009227
Task: 43315

Change-Id: I62ba2afc037d9844387e6b0984fe5008779d95d2
2021-12-08 15:56:05 +00:00
Dmitry Tantsur
abe38a6a5f Fix compatibility with disk_utils.find_efi_partition
This function returns the complete block device record, not just number.
Fixes regression in 89bc73aa0105850c6ae44428642e31802bba3b20.

Also fix the incorrect job in the gate queue, which prevented us from
catching this issue on merging.

Change-Id: I4cbc359ceabfc193ce18fed14a1952359460e7d9
2021-11-19 14:51:27 +01:00
Dmitry Tantsur
89bc73aa01 Use two more functions from disk_utils
Change-Id: If01c9cd7f95b4495509369786360741b731161db
2021-11-18 13:49:51 +01:00
Dmitry Tantsur
36d4a18fbc Move manage_uefi from the image extension to a public location
This call is very useful for custom deploy implementations, such as one
we maintain for OpenShift. Splitting it out also makes image.py slightly
more manageable.

The get_partition call is moved to partition_utils.

Change-Id: I60a6a2823d3eb27a4ae78e913e3655dae7b54ffe
2021-11-16 17:58:16 +01:00
Zuul
f5efbc3e7e Merge "Simplify error messages when running clean/deploy step" 2021-11-13 07:35:50 +00:00
Riccardo Pittau
a799dcc422 Move rescan device function to general utils
We use basically the same function in two modules in the same way, let's
put that in a common place.

Change-Id: I4016e43f2cb102d4327bafcc8a2f90112a6f944a
2021-11-10 15:34:37 +01:00
Dmitry Tantsur
c5fb191393 Simplify error messages when running clean/deploy step
The caller knows what step it invokes, there is no point in repeating
it in the error message. There is also no need to wrap the exception
if it's a RESTError or an ironic-lib exception already since they
are normally detailed enough.

Only leave a detailed message when an unexpected exception happens.

Change-Id: I1d8ca1e7ed1462159e4ae5f0bcf58686f6a2681c
2021-11-09 13:58:44 +01:00
Arne Wiebalck
dc8c1f16f9 Re-read the partition table with partx -a
Re-read the partition table with 'partx -a', rather than 'partx -u'.

This should fix a timing issue where the bootloader installation
fails to mount the EFI partition from a whole disk image since it
is not yet aware of the new partitions (observed with both the
iscsi and the direct deploy interface).

Change-Id: If5da3075e813ae01df3decf8f0647aba111b0515
2021-11-06 13:43:48 +01:00
Julia Kreger
c5268bbdbb Fix UEFI record regex
I accidentally put colons on the test data and remembered taking the
colon character out of the regex I was working on, but apparently
left it in, and accounted for the active entry indicator flag
which appears to have inconsistent support across vendors.

The regex has been fixed, and a test added from a Lenovo SR650
which has some additional string entry data in the UEFI output
which may separate entries.

Change-Id: I1f67b0fb1f645fa82e98bd7c7bba3ffc7755cc74
2021-11-04 09:45:25 -07:00
Julia Kreger
67eddfa7e3 Delete EFI boot entry duplicate labels first
Some firmware seems to take an objection with EFI nvram
entries being deleted after one is added, resulting in the
entire entry table being reset to the last known good state.

This is problematic, as ultimately deployments can time out
if we previously booted with Networking, and the machine, while
commanded to do otherwise, reboots back to networking regardless.

We will now delete entries first, before proceeding.

Additionally, for general use, this pattern may serve the
community better by avoiding cases where we would have
previously just relied upon efibootmgr[0] to warn us of duplicate
entries.

[0]: 103aa22ece/src/efibootmgr.c (L228)

Change-Id: Ib61a7100a059e79a8b0901fd8f46b9bc41d657dc
Story: 2009649
Task: 43808
2021-11-01 06:59:26 -07:00
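
The delete-before-add ordering from the commit above can be sketched as follows. This is an illustration only: `plan_duplicate_removal` and `ENTRY` are hypothetical names, and the parsing assumes that verbose efibootmgr output separates the label from the device path with a tab. The `-b XXXX -B` form is efibootmgr's option pair for deleting a boot entry.

```python
import re

# Matches lines such as "Boot0001* label<TAB>device path"; the "*"
# active-entry flag is optional.
ENTRY = re.compile(r'^Boot(?P<num>[0-9A-Fa-f]{4})\*?\s+(?P<label>[^\t]+)')


def plan_duplicate_removal(efibootmgr_output, label):
    """Return delete commands for every existing entry using label.

    The caller is expected to run these before creating the new entry,
    so no deletion happens after an entry has been added.
    """
    commands = []
    for line in efibootmgr_output.splitlines():
        match = ENTRY.match(line)
        if match and match.group('label').strip() == label:
            commands.append(['efibootmgr', '-b', match.group('num'), '-B'])
    return commands
```
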
Arne Wiebalck
333ed70c94 Assert EFI part UUID is not None before editing fstab
The EFI partition UUID may be None and this will break
the fstab editing. While this is not necessarily fatal when
instantiating a node, it creates an exception at the end of
bootloader installation, so only attempt to add a line to
fstab when the UUID is not None.

Change-Id: I68799980e67c05afe4ca68ca9733605dd166d54d
2021-10-08 08:35:29 +02:00
Dmitry Tantsur
cb836a29bf Trivial: minor fixes in error messages
Change-Id: I06b32c2eb576520cddff88074e4619070731017d
2021-09-07 14:41:38 +02:00
Zuul
c616b4dba3 Merge "Output verbose info from efibootmgr" 2021-08-11 11:08:34 +00:00
Derek Higgins
caf695f70a Output verbose info from efibootmgr
When debugging boot manager problems it can be advantageous to
see the full entries rather than just their labels.

Change-Id: I6a1bb78acaf5a4284727bdf533d4be6db2099f50
2021-08-03 12:01:17 +00:00
Riccardo Pittau
efbbc86f53 Increase version of hacking and pycodestyle
Fix H904 "Delay string interpolations at logging calls" errors

Change-Id: I331808d0132094faf739998a6984440787d3ebf8
2021-07-30 14:34:33 +02:00
Julia Kreger
e5d552474b Catch ismount not being handled
While investigating another grub issue, I was confused by the path
taken in the logs reported, and noticed that on a ramdisk, we might
not actually have a valid response to os.path.ismount, I'm guessing
depending on what in memory filesystem is in use while also coupled
with attempting to check a filesystem.

Adds a test to validate that exceptions raised on these commands
where this issue can be encountered, are properly bypassed, and also
adds additional logging to make it easier to figure out what is
going on in the entire bootloader setup sequence.

Change-Id: Ibd3060bef2e56468ada6b1a5c1cc1632a42803c3
2021-06-29 14:14:52 -07:00
Arne Wiebalck
27568204ae Only mount the ESP if not yet mounted
Check if the ESP is already mounted before attempting to mount it
for the bootloader installation.

Change-Id: Ifd738b2c5663f1a211d7e13b5ba386be631d8db1
2021-06-21 12:10:54 +02:00
Julia Kreger
2fab70c36b Utilize CSV file for EFI loader selection
Adds support to identify and utilize a CSV file to signal which
bootloader to utilize, and set it when the OS is running as opposed
to when EFI is running. This works around the EFI loader potentially
crashing some vendors' hardware types when the entry stored in the
image does not match the EFI loader record which was utilized to
boot.

Grub2+shim specifically needs the CSV file name
and entry label to match what the system was booted with in order
to prevent the machine from potentially crashing.

See https://storyboard.openstack.org/#!/story/2008962
and https://bugzilla.redhat.com/show_bug.cgi?id=1966129#c37
for more information.

Change-Id: Ibf1ef4fe0764c0a6f1a39cb7eebc23ecc0ee177d
Story: 2008962
Task: 42598
Co-Authored-By: Bob Fournier <bfournie@redhat.com>
2021-06-10 11:23:14 -07:00
Zuul
434de569e6 Merge "Ignore efi grub2-install failure" 2021-06-07 09:47:12 +00:00
Zuul
6be440eb3b Merge "Refactor: use convert_image from ironic_lib" 2021-06-04 16:35:00 +00:00
Steve Baker
a057be7dad Ignore efi grub2-install failure
Recent releases of redhat grub2 will always fail when installing to
EFI paths, to encourage a transition to the signed shim bootloader.

Partition image deploys avoid calling grub2-install with the
preserve-efi-assets functions. Deploying whole disk images doesn't
require grub2-install. This leaves whole disk images installed onto
softraid devices, which still attempts to call grub2-install.

This change will still attempt to run grub2-install in this
one remaining case, but will ignore any failure.

A future enhancement can avoid calling grub2-install entirely so that
non-redhat secure-boot capable images can keep their signed
bootloaders.

Story: 2008923
Task: 42521
Change-Id: If432ef795d64d76442d739eb4f7d155ff847041e
2021-06-04 10:03:55 +12:00
Zuul
7fdbcde3de Merge "Stop accepting duplicated configdrive" 2021-06-02 12:36:57 +00:00
Dmitry Tantsur
f657526807 Stop accepting duplicated configdrive
We're currently requiring it twice: in image_info and in a separate
configdrive argument. I think we should eventually settle on separate
arguments for separate entities, so this change makes the value in
image_info optional with a goal to stop accepting it.

We could probably just remove the handling in image_info, but a
deprecation is safer.

The (unused in ironic) cache_image call is updated with an optional
configdrive argument.

Story: #2008904
Task: #42480
Change-Id: I1e2efa28efa3ea7e389774cb7633d916757bc6ed
2021-06-02 11:19:39 +02:00
Dmitry Tantsur
33d889c3c4 Refactor: use convert_image from ironic_lib
Change-Id: If890baf3545cff6cef7c645c42e7f9d9038c9aa7
2021-06-01 14:07:34 +02:00
Zuul
5c063c8224 Merge "Make _get_efi_bootloaders return relative paths" 2021-05-27 13:09:48 +00:00
Julia Kreger
9e4c7052a2 Limit qemu-img execution arenas
qemu-img attempts to launch multiple threads by default *and*
attempts to have multiple memory allocation arenas to operate
from. While multithreading can be good for performance, this
pattern and the memory footprint for process launch and
dependencies can turn the memory footprint for a cirros image
conversion (16MB) into 1.2GB of memory being asked for by the
qemu-img tool.

In order to limit this impact, as the default number of arenas
is governed by the number of CPUs times the number 8, it seems
reasonable to lower this to a smaller number, which
also helps keep our memory footprint within bounds.

Change-Id: I71a28ec59ec31c691205eb34d9fcab63a2ccb682
Story: 2008928
Task: 42528
2021-05-26 13:04:46 -07:00
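
A hedged sketch of how the arena limit described in the commit above might be applied. It assumes the knob is glibc's `MALLOC_ARENA_MAX` environment variable, which the commit message does not name; the helper `qemu_img_env` and the default of 3 arenas are hypothetical.

```python
import os


def qemu_img_env(max_arenas=3, base_env=None):
    """Return an environment dict that caps glibc malloc arenas.

    base_env defaults to a copy of the current process environment.
    MALLOC_ARENA_MAX is assumed here to be the relevant setting.
    """
    env = dict(os.environ if base_env is None else base_env)
    env['MALLOC_ARENA_MAX'] = str(max_arenas)
    return env
```

The resulting dict could then be passed as the `env` argument when launching the conversion, for example `subprocess.run(['qemu-img', 'convert', '-O', 'raw', source, target], env=qemu_img_env())`, where `source` and `target` are placeholder paths.
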