330 Commits

Author SHA1 Message Date
Zuul
99dee5067e Merge "Software RAID: Get component devices by md UUID" 2020-09-30 18:30:56 +00:00
Arne Wiebalck
044c64dbc0 Software RAID: Get component devices by md UUID
Scanning the output of mdadm commands for RAID members will
miss component devices which are currently not part of the
RAID. For proper cleaning it is better to scan block devices
for a signature of the md device for which we would like to
get the components.

Story: #2008186
Task: #40947

Change-Id: Ib46612697851e36a16d272ccaeb0115106253863
2020-09-29 17:08:40 +02:00
Arne Wiebalck
c7aec775ff Software RAID: Don't delete partitions too early
Partions on the holder disk should only be deleted after
all RAID devices have been deleted. Otherwise, super blocks
on partitions which reside on the same disks cannot be cleaned.

Story: #2008199
Task: #40979
Change-Id: I19293f5b992cd1fa68957d6f306dcec8f3b7a820
2020-09-28 10:35:12 +02:00
Zuul
11a87365fb Merge "Generate a TLS certificate and send it to ironic" 2020-09-23 12:14:38 +00:00
Dmitry Tantsur
021e0a6a46 Generate a TLS certificate and send it to ironic
Adds a new flag (on by default) that enables generating a TLS
certificate and sending it to ironic via heartbeat. Whether
ironic supports auto-generated certificates is determined by
checking its API version.

Change-Id: I01f83dd04cfec2adc9e2a6b9c531391773ed36e5
Depends-On: https://review.opendev.org/747136
Depends-On: https://review.opendev.org/749975
Story: #2007214
Task: #40604
2020-09-11 17:46:52 +02:00
Julia Kreger
3426963552 Fix backup node lookup
The node lookup code added in change
I27201319f31cdc01605a3c5ae9ef4b4218e4a3f6
was slightly broken in that we call a method
with a keyword arguemnt which doesn't exist.

uuid versus node_uuid.

It happens, it is a quick fix!

Spotted on a metalsmith job:

[-] Agent is requesting to perform an explicit node cache update.
    This is to pickup any chanages in the cache before deployment.
[-] Failed to update node cache. Error lookup_node() got an
    unexpected keyword argument 'uuid'

Change-Id: I59ecec65707a2f03918b233f1925395ebe59b8c4
2020-09-09 15:19:38 -07:00
Julia Kreger
d3c3d4dabe Update the cache if we don't have a root device hint
Or at least try to.

Some deployments just don't use root device hints, and this is okay.

However, other deployments need root device hints, and with fast
track mode in ramdisks, we created a situation where the node cache
could be updated by a human or software between the time the agent
was started, and the deployment was requested.

As a result, the agent has been updated to check if we have a hint
and if we don't, update the cache from the node lookup endpoint.

This is not needed when the inband deploy steps are executed, as
the process of updating the steps does force the node cache to be
updated.

Change-Id: I27201319f31cdc01605a3c5ae9ef4b4218e4a3f6
Story: 2008039
Task: 40701
2020-08-25 19:34:48 +00:00
Zuul
cda5467839 Merge "Examples: add deploy_steps examples" 2020-08-07 15:55:30 +00:00
Dmitry Tantsur
ce53863361 Examples: add deploy_steps examples
Change-Id: Ifacd8fb05a80f34029965156334fbb707468f1f6
2020-08-04 16:51:54 +02:00
Zuul
ad9c54f55c Merge "Return the final RAID configuration from apply_configuration" 2020-07-29 14:00:08 +00:00
Dmitry Tantsur
f03d72019a Return the final RAID configuration from apply_configuration
AgentRAID expects it and fails with TypeError if it's not provided.

Change-Id: Id84ac129bba97540338e25f0027aa0a0f51bde52
Story: #2006963
2020-07-29 10:10:18 +02:00
Dmitry Tantsur
eb87651496 Allow erase_devices_metadata to be used as a deploy step
Change-Id: I75f156dd76b0e3aaa1592ba24fe42fb2a7057cc8
Story: #2006963
2020-07-27 17:57:37 +02:00
Doug Szumski
5e95b1321d Fix bootloader install issue with MDRAID
When no root_device hint is set, an MDRAID partition can be incorrectly
selected as the root device which causes installation of the bootloader
to the physical disks behind the MDRAID volume to fail. See the notes
in the referenced Story for more detail.

This change adds a little more specificity to the listing of block
devices.

Change-Id: I66db457e71a0586723ee753bef961aec5bf58827
Story: 2007905
Task: 40303
2020-07-22 11:16:13 -07:00
Dmitry Tantsur
1f3b70c4e9 Ignore devices with size 0 when collecting inventory
delete_configuration still fetches all devices as it needs to clean
ones with broken RAID.

Story: #2007907
Task: #40307
Change-Id: I4b0be2b0755108490f9cd3c4f3b71a5e036761a1
2020-07-09 18:28:20 +02:00
Julia Kreger
c76b8b2c21 Limit Inspection->Lookup->Heartbeat lag
Caches hardware information collected during inspection
so that the initial lookup can occur without any delay.

Also adds logging to track how long inventory collection takes.

Co-Authored-By: Dmitry Tantsur <dtantsur@protonmail.com>
Change-Id: I3e0d237d37219e783d81913fa6cc490492b3f96a
2020-07-03 10:32:26 +02:00
Zuul
46bf7e0ef4 Merge "Add a deploy step for writing an image" 2020-06-20 00:00:10 +00:00
Dmitry Tantsur
9d4cf5532f Add a deploy step for writing an image
The new step just invokes the appropriate method of the standby extension.

Change-Id: Ic74f83ab2b7e58f8e4b46e0abfab79e221afeb3e
Story: 2006963
2020-06-02 15:23:54 +02:00
Riccardo Pittau
557d5603a2 Split and move logic for partition tables
Move and split the logic to create the partition tables when
applying raid configuration.

Change-Id: Ic76dd2067ace02dd02351caca0c7f9b05571e510
2020-05-25 08:11:28 +00:00
Riccardo Pittau
8d210638a8 Fix TypeError with newer version of lshw
The issue with json output in lshw was fixed in version B.02.19
This patch makes the memory calculation compatible with that
version and later versions that are included in recent distributions
(e.g. Ubuntu 20.04, Fedora 31)

Change-Id: Id5a30028b139c51cae6232cac73a50b917fea233
Story: 2007588
Task: 39527
2020-04-27 15:07:54 +02:00
Riccardo Pittau
2738e57f2a Add function to calculate memory
Move logic to calculate memory to its own function.

Change-Id: I5ab98b6450ff45dff35ddae093a83140f37047a8
2020-04-27 10:46:05 +02:00
Zuul
5dcff4d2b3 Merge "Add raid.apply_configuration deploy step" 2020-04-21 09:32:40 +00:00
Zuul
6a95e216f6 Merge "Simplify deduplicate_steps" 2020-04-21 00:37:15 +00:00
Dmitry Tantsur
c0502649ba Add raid.apply_configuration deploy step
For compatibility with out-of-band RAID deploy steps, we need to have
one apply_configuration step, not a create/delete pair.

Change-Id: I55bbed96673c9fa247cafdac9a3ade3a6ff3f38d
Story: #2006963
2020-04-20 12:50:14 +02:00
Riccardo Pittau
3966871f47 Move logic to calculate raid sectors to raid_utils
Some more raid related logic moved to raid_utils.

Change-Id: I08c73ad14e5b01ebac2490b83997c5452506d4a2
2020-04-09 15:03:37 +02:00
Zuul
83b5a8b202 Merge "Move logic for raid start sector to raid_utils" 2020-04-09 12:31:29 +00:00
Zuul
b9e320e76f Merge "Add an ability to run in-band deploy steps" 2020-04-09 09:31:49 +00:00
Riccardo Pittau
f32a4a2b29 Move logic for raid start sector to raid_utils
A starting tentative in reducing size of raid related functions.

Change-Id: I81f912d0dc0ad138d8cc776cdb4ee3b5251ec3ba
2020-04-08 17:32:38 +02:00
Arne Wiebalck
66c32784af Editing follow-up for UEFI Software RAID support
This is a follow-up to https://review.opendev.org/#/c/696156/

Change-Id: I0fd2c09045ff07a57374934c35d4a3a8467f5e99
Story: #2006379
Task: #37635
2020-04-06 18:03:25 +02:00
Dmitry Tantsur
079f61d09c Simplify deduplicate_steps
The same result can be achieved using a multi-component sorting key.

Change-Id: Ieacf9fcecb2a6de7b4ccd8889f789099af39aa37
2020-04-06 10:30:31 +02:00
Mark Goddard
1b4ce47921 Add an ability to run in-band deploy steps
Mostly adaptation of cleaning methods.

Co-Authored-By: Dmitry Tantsur <dtantsur@redhat.com>
Change-Id: Ife0502391bbece46d619a20a825dfdb191d5c2b4
Story: 2006963
Task: 37791
2020-04-06 10:24:08 +02:00
Raphael Glon
9343348106 Software RAID: Add UEFI support
The proposed changes concern two steps:

First, when creating the RAID configuration, have a GPT partition
table type (this is not necessary, but more natural with UEFI).
Also, leave some space, either for the EFI partitions or the BIOS
boot partitions, outside the Software RAID.

Secondly, when installing the bootloader, make sure the correct
boot partitions are created or relocated.

Change-Id: Icf0a76b0de89e7a8494363ec91b2f1afda4faa3b
Story: #2006379
Task: #37635
2020-04-02 18:02:19 +02:00
Zuul
d71a8375fa Merge "Only check for partitions on devices that are part of software RAID" 2020-04-02 14:56:45 +00:00
Zuul
ab8c7c05bc Merge "Allow specifying target devices for software RAID" 2020-04-01 17:36:50 +00:00
Dmitry Tantsur
34b58f6024 Only check for partitions on devices that are part of software RAID
Now that an operator can pick the devices that participate in RAID,
it no longer makes sense to verify all devices.

Change-Id: Id5d8d539183f0db4ba3c4132ce6bc9919f9cd1ea
Story: #2006369
2020-04-01 16:02:05 +02:00
Riccardo Pittau
a332a19a57 Bump hacking to 3.0.0
Change-Id: I1032ea6a2e9d79aeaecb1458c319cbeb15ac1fff
2020-03-30 12:55:46 +02:00
Julia Kreger
bf0bb7a87a Improve debug logging around Raid/Bootloader
Change-Id: I7d34b918a859972a2d5650494824d3333016dd11
2020-03-28 08:55:32 -07:00
Dmitry Tantsur
ddbba07021 Allow specifying target devices for software RAID
This change adds support for the physical_disks RAID parameter in
a form of device hints (same as for root device selection).

Change-Id: I9751ab0f86ada41e3b668670dc112d58093b8099
Story: #2006369
Task: #39080
2020-03-17 13:03:24 +01:00
Julia Kreger
c1da514645 Ignore pyudev errors about device number
Pyudev is used to return extra data about a device
using the udev interface.

Sometimes that lookup doesn't quite work, like on
md devices after restarting them. As such, we will
now tollerate the failure and continue the process
as before.

Change-Id: Ibbc1759fe2cd3d7d09019b4e80d3c61d54c844dd
Story: 2007281
Task: 38726
2020-02-11 20:44:18 +00:00
Julia Kreger
cd7b2693f8 Skip read-only devices with metadata erase
HPE "Virtual Install Devices" appear as read-only block
devices, and may... or may not be visible depending on the
bios configuration state.

These devices can no longer be disabled from the bios settings
so the simplest course of action seems to be that we should
handle the existence of a read-only device.

In the event of secure erase, this is treated as a hard failure
case and a driver_internal_info flag has been added to enable
a future bypass method for knowledgable operators.

Change-Id: Ief8b360d11e654d8fae3a04a2a9f8d474a06e167
Story: 2007229
Task: 38502
2020-01-29 13:41:25 -06:00
Dmitry Tantsur
31b73b4984 Expose collector and hardware manager names via introspection data
This change adds a new introspection data field 'configuration'
with two lists: managers and collectors.

Change-Id: Ice0d7e6ecff3f319bc3a4f41617059fd6914e31c
2020-01-22 11:15:38 +01:00
Zuul
98838597c6 Merge "Allow reading root_device from instance_info" 2020-01-14 08:32:35 +00:00
Zuul
12b62d6c3a Merge "Collect lsblk and /proc/mdstat with ramdisk logs" 2020-01-10 09:22:29 +00:00
Dmitry Tantsur
96a094646b Allow reading root_device from instance_info
For the future deployment API we need to be able to set root_device
per deployment in addition to per node. This change adds it.

Change-Id: I1dd046c2e5fca211a84290bac8daa7550b21614f
Story: #2006910
Task: #37955
2020-01-03 17:29:05 +01:00
Zuul
6032643a04 Merge "Stop using six library" 2019-12-03 13:09:30 +00:00
Dmitry Tantsur
4354bc04f9 Replace netaddr dependency with stdlib ipaddress
netaddr is quite a big library, and all we need is covered by
the built-in ipaddress module (available in python 3).

Also add a safeguard for invalid 'ip route' output.

Change-Id: I9d76a8d1c1b6b1585e301a9c63b37fab3b98746f
2019-12-02 12:13:04 +01:00
Riccardo Pittau
ca7a46b113 Stop using six library
Since we've dropped support for Python 2.7, it's time to look at
the bright future that Python 3.x will bring and stop forcing
compatibility with older versions.
This patch removes the six library from requirements, not
looking back.

Change-Id: I4795417aa649be75ba7162a8cf30eacbb88c7b5e
2019-11-29 10:18:14 +01:00
Julia Kreger
e4659c94cf RAID 5/6
Looks like adding RAID 5/6 support may be easier than most
could immagine. The code, as written appears to be safe and
logical creating a RAID 5 or RAID 6 volume.

Not that we can really test this in CI, but it seems only
validation code needs to be changed to loosen the constraint.

Change-Id: Ib891b3c97f0bfb02af3b59581a451c4b25e03b85
2019-11-27 20:30:28 +00:00
Andrei Nistor
1975478097 Set rd.md.uuid kernel parameter when deploying on software raid
When deploying an image to a software raid array, it is currently
required that the deployed image assembles the md arrays automatically
so that the rootfs can be mounted. In order to remove this
requirement/limitation on the deployed image we can add rd.md.uuid to
the kernel command line with the raid array's uuid.

Story: 2006648
Task: 36884
Change-Id: I42cb198753ecd84b7eaef6b5aa7c2064535bfe0e
2019-10-17 11:14:04 +00:00
Dmitry Tantsur
11976c9d2b Collect lsblk and /proc/mdstat with ramdisk logs
This should improve debugability of partitioning problems.

Change-Id: I3c7ae3f2831c9900a3f0d24daec6dd6b8bea6a60
2019-10-14 15:28:08 +02:00
Zuul
3968ec9d5a Merge "Delete_configuration, consider removed raid members as well" 2019-09-24 11:51:06 +00:00