88 Commits

Author SHA1 Message Date
Jay Faulkner
36e5993a04 [codespell] Fix spelling issues in IPA
This fixes several spelling issues identified by codepsell. In some
cases, I may have manually modified a line to make the output more clear
or to correct grammatical issues which were obvious in the codespell
output.

Later changes in this chain will provide the codespell config used to
generate this, as well as adding this commit's SHA, once landed, to a
.git-blame-ignore-revs file to ensure it will not pollute git historys
for modern clients.

Related-Bug: 2047654
Change-Id: I240cf8484865c9b748ceb51f3c7b9fd973cb5ada
2023-12-28 10:54:46 -08:00
Vanou Ishii
0bf579c955 Fix failure of bind mount in _install_grub2
When IPA runs _install_grub2, IPA tries to bind mount /dev, /proc and /run
to <temporal directory path root partition mounted>/{dev,proc,run}.
However that bind mount fails because there aren't such mount point path
under temporal directory.
To fix this failure, this patch add mkdir command before bind mount.

Story: 2010292
Task: 46273
Change-Id: I434ce1bf1863ee0f11c4d09918d6d2d8dc065c02
2022-09-22 19:34:12 +09:00
Julia Kreger
99ca1086db Create fstab entry with appropriate label
Depending on the how the stars align with partition images
being written to a remote system, we *may* end up with
*either* a Partition UUID value, or a Partition's UUID value.

Which are distinctly different.

This is becasue the value, when collected as a result of writing
an image to disk *falls* back and passes the value to enable
partition discovery and matching.

Later on, when we realized we ought to create an fstab entry,
we blindly re-used the value thinking it was, indeed, always
a Partition's UUID and not the Partition UUID. Obviously,
the label type is quite explicit, either UUID or PARTUUID
respectively, when initial ramdisk utilities such as dracut
are searching and mounting filesystems.

Adds capability to identify the correct label to utilize
based upon the current state of the block devices on disk.

Granted, we are likely only exposed to this because of IO
race conditions under high concurrecy load operations.
Normally this would only be seen on test VMs, but
systems being backed by a Storage Area Network *can*
exibit the same IO race conditions as virtual machines.

Change-Id: I953c936cbf8fad889108cbf4e50b1a15f511b38c
Resolves: rhbz#2058717
Story: #2009881
Task: 44623
2022-03-10 07:04:01 -08:00
Arne Wiebalck
a83f38479e Move prepare_boot_partitions_for_softraid to raid_utils
prepare_boot_partitions_for_softraid() is used in BIOS and UEFI
modes to prepare the partitions for the bootloader. Move it from
the image extensions to raid_utils to reflect this and avoid the
import of an extension to efi_utils.

Follow-up to 62c5674a600baeeef0af3b12baeab486870eb103.

Change-Id: I9f5974fbbfea5e8cdfbb7e49bea375e5cbfdd145
2022-02-14 11:21:36 +01:00
Dmitry Tantsur
6ebf041704 Use canonical device name for RAID device for ESP
It seems like tinyIPA silently replaces /dev/md/esp with /dev/md127.
Find the next free /dev/md device and use it instead.

Also rescan the resulting device before copying files.

Change-Id: Ie04f530be434c4b1561e75f387b9da679e4607e0
Depends-On: https://review.opendev.org/c/openstack/ironic/+/827129/
2022-02-01 10:06:31 +01:00
Arne Wiebalck
62c5674a60 SoftwareRAID: Use efibootmgr (and drop grub2-install)
Move the software RAID code path from grub2-install to
efibootmgr:

- remove the UEFI efibootmgr exception for software RAID
- create and populate the ESPs on the holder disks
- update the NVRAM with all ESPs (the component devices
  of the ESP mirror, use unique labels to avoid unintentional
  deduplication of entries in the NVRAM)

Story: #2009794

Change-Id: I7ed34e595215194a589c2f1cd0b39ff0336da8f1
2022-01-26 14:43:40 +01:00
Dmitry Tantsur
abe38a6a5f Fix compatibility with disk_utils.find_efi_partition
This function returns the complete block device record, not just number.
Fixes regression in 89bc73aa0105850c6ae44428642e31802bba3b20.

Also fix the incorrect job in the gate queue, which prevented us from
catching this issue on merging.

Change-Id: I4cbc359ceabfc193ce18fed14a1952359460e7d9
2021-11-19 14:51:27 +01:00
Dmitry Tantsur
89bc73aa01 Use two more functions from disk_utils
Change-Id: If01c9cd7f95b4495509369786360741b731161db
2021-11-18 13:49:51 +01:00
Dmitry Tantsur
36d4a18fbc Move manage_uefi from the image extension to a public location
This call is very useful for custom deploy implementations, such as one
we maintain for OpenShift. Splitting it out also makes image.py slightly
more manageable.

The get_partition call is moved to partition_utils.

Change-Id: I60a6a2823d3eb27a4ae78e913e3655dae7b54ffe
2021-11-16 17:58:16 +01:00
Riccardo Pittau
a799dcc422 Move rescan device function to general utils
We use basically the same function in two modules in the same way, let's
put that in a common place.

Change-Id: I4016e43f2cb102d4327bafcc8a2f90112a6f944a
2021-11-10 15:34:37 +01:00
Arne Wiebalck
dc8c1f16f9 Re-read the partition table with partx -a
Re-read the partition table with 'partx -a', rather than 'partx -u'.

This should fix an timing issue where the bootloader installation
fails to mount the EFI partition from a whole disk image since it
is not yet aware of the new partitions (observed with both, the
iscsi and the direct deploy interface).

Change-Id: If5da3075e813ae01df3decf8f0647aba111b0515
2021-11-06 13:43:48 +01:00
Julia Kreger
c5268bbdbb Fix UEFI record regex
I accidently put colons on the test data and remembered taking the
colon character out of the regex I was working on, but apparently
left it in, and accounted for the active entry indicator flag
which appears to have inconsistent support across vendors.

The regex has been fixed, and a test added from a Lenovo SR650
which has some additional string entry data in the UEFI output
which may separate entries.

Change-Id: I1f67b0fb1f645fa82e98bd7c7bba3ffc7755cc74
2021-11-04 09:45:25 -07:00
Julia Kreger
67eddfa7e3 Delete EFI boot entry duplicate labels first
Some firmware seems to take an objection with EFI nvram
entries being deleted after one is added, resulting in the
entire entry table being reset to the last known good state.

This is problematic, as ultimately deployments can time out
if we previously booted with Networking, and the machine, while
commanded to do other wise, reboots back to networking regardless.

We will now delete entries first, before proceeding.

Additionally, for general use, this pattern may serve the
community better by avoiding cases where we would have
previously just relied upon efibootmgr[0] to warn us of duplicate
entries.

[0]: 103aa22ece/src/efibootmgr.c (L228)

Change-Id: Ib61a7100a059e79a8b0901fd8f46b9bc41d657dc
Story: 2009649
Task: 43808
2021-11-01 06:59:26 -07:00
Arne Wiebalck
333ed70c94 Assert EFI part UUID is not None before editing fstab
The EFI partition UUID may be None and this will break
the fstab editing. While this is not necessarily fatal when
instantiating a node, it creates an exception at the end of
bootloader installation, so only attempt to add a line to
fstab when the UUID is not None.

Change-Id: I68799980e67c05afe4ca68ca9733605dd166d54d
2021-10-08 08:35:29 +02:00
Zuul
c616b4dba3 Merge "Output verbose info from efibootmgr" 2021-08-11 11:08:34 +00:00
Derek Higgins
caf695f70a Output verbose info from efibootmgr
When debugging boot manager problems it can be advantageous to
see all the full entries rather then just their labels.

Change-Id: I6a1bb78acaf5a4284727bdf533d4be6db2099f50
2021-08-03 12:01:17 +00:00
Riccardo Pittau
efbbc86f53 Increase version of hacking and pycodestyle
Fix H904 "Delay string interpolations at logging calls" errors

Change-Id: I331808d0132094faf739998a6984440787d3ebf8
2021-07-30 14:34:33 +02:00
Julia Kreger
e5d552474b Catch ismount not being handled
While investigating another grub issue, I was confused by the path
taken in the logs reported, and noticed that on a ramdisk, we might
not actually have a valid response to os.path.ismount, I'm guessing
depending on what in memory filesystem is in use while also coupled
with attempting to check a filesystem.

Adds a test to validate that exceptions raised on these commands
where this issue can be encountered, are properly bypassed, and also
adds additional logging to make it easier to figure out what is
going on in the entire bootloader setup sequence.

Change-Id: Ibd3060bef2e56468ada6b1a5c1cc1632a42803c3
2021-06-29 14:14:52 -07:00
Arne Wiebalck
27568204ae Only mount the ESP if not yet mounted
Check if the ESP is already mounted before attempting to mount it
for the bootloader installation.

Change-Id: Ifd738b2c5663f1a211d7e13b5ba386be631d8db1
2021-06-21 12:10:54 +02:00
Julia Kreger
2fab70c36b Utilize CSV file for EFI loader selection
Adds support to identify and utilize a CSV file to signal which
bootloader to utilize, and set it when the OS is running as opposed
to when EFI is running. This works around EFI loader potentially
crashing some vendors hardware types when entry stored in the
image does not match the EFI loader record which was utilzied to
boot.

Grub2+shim specifically specifically needs the CSV file name
and entry label to match what the system was booted with in order
to prevent the machine from potentially crashing.

See https://storyboard.openstack.org/#!/story/2008962
and https://bugzilla.redhat.com/show_bug.cgi?id=1966129#c37
for more information.

Change-Id: Ibf1ef4fe0764c0a6f1a39cb7eebc23ecc0ee177d
Story: 2008962
Task: 42598
Co-Authored-By: Bob Fournier <bfournie@redhat.com>
2021-06-10 11:23:14 -07:00
Steve Baker
a057be7dad Ignore efi grub2-install failure
Recent releases of redhat grub2 will always fail when installing to
EFI paths, to encourage a transition to the signed shim bootloader.

Partition image deploys avoid calling grub2-install with the
preserve-efi-assets functions. Deploying whole disk images doesn't
require grub2-install. This leaves whole disk images installed onto
softraid devices, which still attempts to call grub2-install.

This change will still attempt to run grub2-install in this
one remaining case, but will ignore any failure.

A future enhancement can avoid calling grub2-install entirely so that
non-redhat secure-boot capable images can keep their signed
bootloaders.

Story: 2008923
Task: 42521
Change-Id: If432ef795d64d76442d739eb4f7d155ff847041e
2021-06-04 10:03:55 +12:00
Steve Baker
10d18c4113 Make _get_efi_bootloaders return relative paths
To make this function useful for purposes other than efibootmgr
entries, this change moves the path manipulation to _run_efibootmgr.

This change also adds boot*.efi entries to BOOTLOADERS_EFI so that it
includes every entry in the UEFI Spec 2.9[1] Table 3-2 UEFI Image
Types.

[1] https://uefi.org/sites/default/files/resources/UEFI_Spec_2_9_2021_03_18.pdf
Story: 2008923
Task: 42521

Change-Id: Ibe02786609aa0de65115897d8f4a9b4f36c8aed2
2021-05-26 11:21:15 +12:00
Zuul
d6e4fbd827 Merge "Remove the iscsi extension" 2021-05-12 11:08:19 +00:00
Zuul
29f3230791 Merge "Software RAID: RAID the ESPs" 2021-05-11 09:31:36 +00:00
Dmitry Tantsur
be3882162e Remove the iscsi extension
Change-Id: I2f0e581575112d6c7ba0d211661cab3e0b6caca6
2021-05-10 12:43:44 +02:00
Julia Kreger
fe825fa97e Fix NVMe Partition image on UEFI
The _manage_uefi code has a check where it attempts to just
identify the precise partition number of the device, in order
for configuration to be parsed and passed. However, the same code
did not handle the existence of a `p1` partition instead of just a
partition #1. This is because the device naming format is different
with NVMe and Software RAID.

Likely, this wasn't an issue with software raid due to how complex the
code interaction is, but the docs also indicate to use only whole disk
images in that case.

This patch was pulled down my one RH's professional services folks
who has confirmed it does indeed fix the issue at hand. This is noted
as a public comment on the Red Hat bugzilla.
https://bugzilla.redhat.com/show_bug.cgi?id=1954096

Story: 2008881
Task: 42426
Related: rhbz#1954096
Change-Id: Ie3bd49add9a57fabbcdcbae4b73309066b620d02
2021-05-04 16:44:37 +00:00
Arne Wiebalck
c2d04dc156 Software RAID: RAID the ESPs
For software RAID in UEFI mode, we create ESPs on all holder disks
and copy the bootloader there. Since there is no mechanism to keep
the ESPs in sync, e.g. on kernel upgrades or when kernel parameters
are updated, the ESPs will get out of sync eventually. This may lead
to a situation where a node boots with outdated parameters or does
not have any of the installed kernels in the boot menu anymore.
This change proposes to RAID the ESPs. While the UEFI firmware will
find an ESP partition (one leg of the mirror), the node will see
an md device and all subsequent updates will go to all member disks.

Also, remove the source ESP after copying in order to avoid mount
confusion (same UUID!).

Story: #2008745
Task: #42103
Change-Id: I9078ef37f1e94382c645ae98ce724ac9ed87c287
2021-04-16 14:40:28 +02:00
kartikeyaj0
319efe2c2d Fixes local boot for partition images
IPA is not properly checking if the root partition is already
mounted. Device is being passed to os.path.ismount() instead
of the mount point.

Story: 2008631
Task: 41839
Change-Id: I37a6e7e6bbe0bbbb0317c6e55bb822dafe7cce20
2021-02-17 10:56:31 +05:30
Dmitry Tantsur
403d2f06c6 Fix error message with UEFI-incompatible images
It's somewhat confusing at the moment, since we're trying to find
a UEFI partition by UUID "None". Don't search for partition if
we don't know its UUID, and provide a better error message.

Change-Id: Ief874084132797a445ddae8009264712a05facfd
2021-02-10 18:08:58 +01:00
Xinliang Liu
68a43b9da8 Fix UEFI boot entry creation for aarch64
Diskimage-builder installs grub with option '--removable'[1], thus for
aarch64 no 'grubaa64.efi' file in efi directory only got 'BOOTAA64.EFI':
linaro@bm-ubuntu:~$ tree /boot/efi
/boot/efi
└── EFI
    └── BOOT
        └── BOOTAA64.EFI

2 directories, 1 file

[1]: 8f12d9530e/diskimage_builder/elements/bootloader/finalise.d/50-bootloader (L158)

Task: #41698
Story: #2008560
Change-Id: I9fc55c068ea980beae273411db9d3568eec25eb8
2021-01-27 03:32:23 +00:00
Julia Kreger
a12a5744b6 Add fstab pointer to EFI partition
Adds support for the EFI partition to be appended to fstab so the
filesystem can be automounted and EFI loader updated should the
deployed operating system need to do so.

This should enable bootloaders to be upgraded by linux based
operating systems after the instance has been deployed when
a partition image was utilized for the initial deployment.

Change-Id: Iec28a8841cc01ec8b01a3f5cca070c934c7a2531
Story: 2008070
Task: 40754
2020-12-17 14:17:31 +00:00
Julia Kreger
f9870d5812 Prevent broken partition image UEFI deploys
Partition images can sometimes contain a /boot folder structure
event he assets for EFI booting on that filesystem. Which is a
good thing. The conundrum is that Ironic does not handle this
properly and potentially replaces the bootloader in this sequence
such that grub2-install is used instead of signed bootloader assets.

As such, we should be preserving the assets and using them from
a partition image much like we do when we have a wholedisk
image and can identify the assets.

Now we will preserve the EFI boot assets, copy them to the new EFI
boot partition, and call the EFI setup methods to manage the EFI
nvram.

Note, this change also splits the logic path out that performs the
end call of the EFI boot manager into a reusable method but does
not retool all of the testing as it is intertwined in the
install_grub2 testing.

Also adds some additional debug logging, as much of the bootloader
installation code has multiple fallback/cleanup points which makes
it difficult to debug from logs.

Story: 2008070
Task: 40753
Change-Id: If17d4b4c06df5504987e61a1fde6662e9acd6989
2020-12-14 14:37:14 +00:00
Julia Kreger
7a83773fbc Option to enable bootloader config failure bypass
Some hardware is very well intentioned. However this intention
can result in the UEFI NVRAM table being full which prevents us
from adding new records to the table. We can't be sure what to
delete, so in this case some operators just need the ability to
tell ironic "it is okay if this fails, it will still work."

The added ``ignore_bootloader_failure`` option adds
this capability which can be set per-node either in the agent
configuation via the ramdisk image, or in the pxe_append_params
configuration parameter for the node itself with a
``ipa-ignore-bootloader-failure`` option in order to prevent
the failure from being raised.

Change-Id: If3c83fb2ea2025fce092d495a64f32077c70d2d6
Story: 2008386
Task: 41309
2020-12-10 06:42:48 -08:00
Fedor Tarasenko
694ea7425d Support using LABEL as identifier for rootfs
Add possibility to use disk LABEL to identify rootfs uuid for
Software RAID deployment

Change-Id: I77f36e70ddc539af0190db1c1abe0fb2c66f34b4
Story: 2008303
Task: 41188
2020-11-03 13:03:34 +03:00
Julia Kreger
6542a9cb04 Don't run os-prober from grub2-mkconfig
By default, grub2-mkconfig scans everything to look for other
environments and then load those into the grub configuration.

It makes sense, but on newer versions of grub2 in distribution
images, os-prober is taking an exceptionally long time in some
cases where more than one storage device exists with other
filesystems.

As a result, of the os-prober execution by grub2-mkconfig, the
bootloader installation can completely time out and fail the
deployment. This is presently experienced with metalsmith on
centos8.

There are numerous sporatic reports of issues like this issue
where grub2-mkconfig hangs for some period of time, and this is
observable on Centos8.2 in our CI. While one report[0] mentions
this issue, Another bug [1] has the dialog that actually helps us
frame the context as to what we likely should do.

Also, fixes the unit testing so we actually test if we're running
with grub2. :\

[0]: https://bugzilla.redhat.com/show_bug.cgi?id=1744693
[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1709682

Depends-On: https://review.opendev.org/#/c/748315
Change-Id: I14bf299afef3a1ddb2006fe5f182d7f0d249e734
2020-10-22 22:28:07 +00:00
Zuul
35d2292aa4 Merge "Log a warning of target_boot_mode does not match current boot mode" 2020-10-07 17:01:51 +00:00
Dmitry Tantsur
1a67dddde7 Log a warning of target_boot_mode does not match current boot mode
This is not a normal situation and is likely to cause problems.

Change-Id: Id0668fd160ac0539d85997e985f8c43d9da75c90
2020-10-07 12:30:23 +02:00
Dmitry Tantsur
fc4e0eed6a Don't try to call GRUB when root UUID is not provided
We don't have a really working way to detect root UUID for whole
disk images at the moment, which results in an ignored traceback
every time install_bootloader is called with whole disk images in
UEFI mode. Avoid it by skipping GRUB2 if root UUID is unknown.

Change-Id: I84245538f59c664b72d1cafbca8d61be0978f489
2020-10-07 12:06:42 +02:00
Zuul
dc395c5837 Merge "More refactoring of the image module" 2020-07-27 07:15:42 +00:00
Riccardo Pittau
80e11811f5 More refactoring of the image module
Introducing new function _umount_all_partitions to reduce the size
of _install_grub2

Change-Id: I304468d57b10d677f2a9d58aec42a1bf414c6cba
2020-07-24 14:34:46 +02:00
Julia Kreger
2a56ee03b6 Prevent un-needed iscsi cleanup
When we added software raid support, we started calling bootloader
installation. As time went on, we ehnanced that code path for non
RAID cases in order to ensure that UEFI nvram was setup
for the instance to boot properly.

Somewhere in this process, we missed a possible failure case where
the iscsi client tgtadm may return failures. Obviously, the correct
path is to not call iscsi teardown if we don't need to.

Since it was always semi-opportunistic teardown, we can't blindly
catch any error, and if we started iSCSI and failed to tear the
connection down, we might want to still fail, so this change
moves the logic over to use a flag on the agent object which
one extension to set the flag and the other to read it and take
action based upon that.

Change-Id: Id3b1ae5e59282f4109f6246d5614d44c93aefa7c
Story: 2007937
Task: 40395
2020-07-20 14:24:06 -07:00
Riccardo Pittau
9d9a6bce5c Refactor part of image module
Shuffle some functions around and reduce size of _is_bootloader_loaded
moving logic out to a new function.

Change-Id: I9c10bf05186dcebb37f175d61bf4ac9ff86b6510
2020-07-07 10:44:50 +02:00
Dmitry Tantsur
ba3caa6c64 Increase the ESP partition size to 550 MiB when using software RAID
This has been a popular guidance, and diskimage-builder has recently
started following it.

Change-Id: I794c846fb191c15b0a30546bf64d624dfbde0fd4
2020-07-02 17:30:33 +02:00
Zuul
de7d5affe7 Merge "Mount all vfat partitions before calling grub2" 2020-07-02 10:37:04 +00:00
Arne Wiebalck
c5022790b3 Mount all vfat partitions before calling grub2
In order to ensure grub2 finds all files it needs, mount all
vfat partitions specified in the deployed image.

Story: #2007618
Task: #39629
Change-Id: Ie5b6e0abc3f266409562f9ecb26538126b667056
2020-06-30 18:31:58 +02:00
Dmitry Tantsur
7e5fe1121e Make the install_bootloader command asynchronous
It does not return anything, so it makes no point for it to be
synchronous. Ironic always calls it with wait=True, so there is
no problem with backward compatibility either.

Change-Id: I44fec2e0cb54486328ce71263613d8592e384870
2020-06-08 15:10:05 +02:00
Arne Wiebalck
66c32784af Editing follow-up for UEFI Software RAID support
This is a follow-up to https://review.opendev.org/#/c/696156/

Change-Id: I0fd2c09045ff07a57374934c35d4a3a8467f5e99
Story: #2006379
Task: #37635
2020-04-06 18:03:25 +02:00
Raphael Glon
9343348106 Software RAID: Add UEFI support
The proposed changes concern two steps:

First, when creating the RAID configuration, have a GPT partition
table type (this is not necessary, but more natural with UEFI).
Also, leave some space, either for the EFI partitions or the BIOS
boot partitions, outside the Software RAID.

Secondly, when installing the bootloader, make sure the correct
boot partitions are created or relocated.

Change-Id: Icf0a76b0de89e7a8494363ec91b2f1afda4faa3b
Story: #2006379
Task: #37635
2020-04-02 18:02:19 +02:00
Zuul
68a71513f0 Merge "Bump hacking to 3.0.0" 2020-03-31 12:36:11 +00:00
Riccardo Pittau
a332a19a57 Bump hacking to 3.0.0
Change-Id: I1032ea6a2e9d79aeaecb1458c319cbeb15ac1fff
2020-03-30 12:55:46 +02:00