280 Commits

Author SHA1 Message Date
Riccardo Pittau
7b03fbbb36 Call execute from ironic-lib in hardware.py
Replace the execute wrapper from utils with execute from ironic-lib in
hardware.py

Adjust unit tests as needed.

Change-Id: I63a3b0407b2ca2246bd0e6624bfa0f748c0d73f7
2021-11-18 07:52:48 +01:00
Riccardo Pittau
a799dcc422 Move rescan device function to general utils
We use basically the same function in two modules in the same way, let's
put that in a common place.

Change-Id: I4016e43f2cb102d4327bafcc8a2f90112a6f944a
2021-11-10 15:34:37 +01:00
Riccardo Pittau
23e67b5fea Re-read the partition table with partx -a, part 2
Use add instead of update to re-read the partition table with partx.

See [1] for more details.

Co-authored-by: Arne Wiebalck <arne.wiebalck@cern.ch>

[1] https://opendev.org/openstack/ironic-python-agent/commit/dc8c1f16f9a00e2bff21612d1a9cf0ea0f3addf0

Change-Id: I2336e22dadc790cfbde87904612fcaa3b8c501db
2021-11-09 13:03:14 +01:00
Arne Wiebalck
9d707e9f4b Software RAID: Call udev_settle before creation
This patch fixes a race during software RAID creation:
we create the partition with parted, the kernel then
notifies udev, but we need to wait for udevd to create
the device files before calling mdadm to create the
md device.

Credits to jcosmao for finding this.

Change-Id: I642f28acc351cf50263e37dfbc8468bf59de2cc5
2021-10-05 11:42:49 +02:00
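The ordering described in 9d707e9f4b (settle udev first, then call mdadm) can be sketched as follows. This is a minimal illustration, not the agent's code: the function name `create_md_device` and the `run` injection parameter are hypothetical, and only the `udevadm settle` and `mdadm --create` invocations are taken from the standard tools.

```python
import subprocess


def create_md_device(md_name, level, components, run=None):
    """Create an md device, waiting for udev before calling mdadm.

    `run` is a hypothetical injection point so the ordering can be checked
    without touching real hardware; it defaults to subprocess.run.
    """
    if run is None:
        def run(cmd):
            return subprocess.run(cmd, check=True)

    # Block until udevd has processed pending events, so the partition
    # device files created after parted exist before mdadm looks for them.
    run(["udevadm", "settle"])
    run(["mdadm", "--create", md_name, "--run",
         "--level", str(level),
         "--raid-devices", str(len(components)), *components])
```

Passing a recorder such as `calls.append` as `run` makes it easy to confirm that the settle call always precedes the mdadm call.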
Dmitry Tantsur
07ff3b8bbc Trivial: better debugging in list_all_block_devices
One debug message only specified "Skipping" without any details.
Another did not log the whole line from lsblk. Fix both.

Change-Id: I9f8f4edad88ba2df5abc6a45a74ebdb3c7afcf97
2021-08-27 12:19:28 +02:00
Zuul
438a1f4445 Merge "Move loading of IPMI module loading to a single point" 2021-08-23 16:14:14 +00:00
Zuul
71f54b7f98 Merge "Increase version of hacking and pycodestyle" 2021-08-11 10:02:24 +00:00
Jonas Schäfer
6441db61ce Move loading of IPMI module loading to a single point
This means we do not have to rely on modprobe idempotency as
much and it's less code duplication, which is always nice.

Signed-off-by: Jonas Schäfer <jonas.schaefer@cloudandheat.com>

Change-Id: I996aba47bc54309e15e7d56e4a96b23b8deb5c9c
2021-08-06 13:14:45 +02:00
Jonas Schäfer
61af712fe5 Expose BMC MAC address in inventory data
This exposes the MAC address of the first LAN channel with an assigned
IP address in the inventory data. This is useful for inventory
processes where the asset number is not discoverable from the software
side: the BMC MAC is going to be unique (at least within an
organization).

Change-Id: I8a4bee0c25743befd7f2033e4e0cba26895c8926
2021-08-06 13:14:45 +02:00
Riccardo Pittau
efbbc86f53 Increase version of hacking and pycodestyle
Fix H904 "Delay string interpolations at logging calls" errors

Change-Id: I331808d0132094faf739998a6984440787d3ebf8
2021-07-30 14:34:33 +02:00
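The H904 check mentioned in efbbc86f53 concerns how arguments reach a logging call. A small sketch of the pattern, using the standard `logging` module; the function `report_skipped` and its message are made up for illustration:

```python
import logging

LOG = logging.getLogger(__name__)


def report_skipped(device, reason):
    # Flagged by H904: the string is built even if DEBUG is disabled.
    #   LOG.debug("Skipping %s: %s" % (device, reason))
    # Preferred: pass the arguments and let logging interpolate lazily.
    LOG.debug("Skipping %s: %s", device, reason)
```

With the second form the format string and its arguments stay separate on the log record, so the interpolation cost is only paid when a handler actually emits the message.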
Arne Wiebalck
cacdd9bab3 Burn-in: Add network step
Add a clean step for network burn-in via fio. Get basic
run parameters from the node's driver_info.

Story: #2007523
Task: #42385

Change-Id: I2861696740b2de9ec38f7e9fc2c5e448c009d0bf
2021-07-13 11:36:31 +02:00
Arne Wiebalck
20c5894bc2 Burn-in: Add disk step
Add a clean step for disk burn-in via fio. Get basic
run parameters from the node's driver_info.

Story: #2007523
Task: #42384

Change-Id: I5f5e336bd629846b3d779fd0fc7a2060b385b035
2021-05-21 16:33:11 +02:00
Zuul
823e0ed743 Merge "Burn-in: Add memory step" 2021-05-11 09:31:54 +00:00
Zuul
5c01ec4f6f Merge "Burn-in: Add CPU step" 2021-05-10 15:00:14 +00:00
Arne Wiebalck
5c222560f0 Burn-in: Add memory step
Add a clean step for memory burn-in via stress-ng. Get basic
run parameters from the node's driver_info.

Story: #2007523
Task: #42383

Change-Id: I33a83968c9f87cf795ec7ec922bce98b52c5181c
2021-05-01 10:36:58 +02:00
Arne Wiebalck
6702fcaa43 Burn-in: Add CPU step
Add a clean step for CPU burn-in via stress-ng. Get basic
run parameters from the node's driver_info.

Story: #2007523
Task: #42382

Change-Id: I14fd4164991fb94263757244f716b6bfe8edf875
2021-05-01 10:36:20 +02:00
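As a rough sketch of what "get basic run parameters from the node's driver_info" could look like for the CPU step in 6702fcaa43: the helper below turns a driver_info dict into a stress-ng command line. The key names `agent_burnin_cpu_timeout` and `agent_burnin_cpu_cpu`, the defaults, and the function name are assumptions for illustration and are not confirmed by the commit message; the stress-ng options themselves are standard.

```python
def build_cpu_burnin_command(driver_info):
    """Build a stress-ng command line from a node's driver_info dict.

    The key names and defaults here are assumptions for illustration.
    """
    timeout = driver_info.get("agent_burnin_cpu_timeout", 86400)
    # In stress-ng, a CPU count of 0 means "one worker per online CPU".
    cpus = driver_info.get("agent_burnin_cpu_cpu", 0)
    return ["stress-ng", "--cpu", str(cpus),
            "--timeout", str(timeout), "--metrics-brief"]
```

Keeping command construction separate from execution makes the parameter handling testable without running a burn-in.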
Zuul
10c29cdc41 Merge "Fix getting memory size in some lshw output" 2021-04-30 12:24:44 +00:00
Zane Bitter
ed791d9778 Fix getting memory size in some lshw output
Due to a regression in lshw introduced by
https://github.com/lyonel/lshw/pull/60, there are some versions in the
wild that do not return sizes for memory banks <32GiB. In those cases,
work around the problem by looking at the top-level size (if available)
to find the total size. Previously we assumed that we only needed the
top-level size when there was no list of memory banks.

The issue is fixed upstream by https://github.com/lyonel/lshw/pull/65,
but the erroneous patch is still present in the lshw-B.02.19.2-5.el8
package in CentOS 8.4 and 8.5.

Change-Id: I6eb5981d28b9ae368239af0c1d0ec32ff79d95b3
Story: #2008865
Task: 42395
2021-04-29 14:41:11 -04:00
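The fallback described in ed791d9778 can be illustrated with a small function over one lshw memory node. This is a sketch of the idea only: it assumes the node is a dict with an optional top-level `size` and optional `children` banks, with sizes in bytes, and the name `total_memory_mib` is hypothetical.

```python
def total_memory_mib(memory_node):
    """Return total memory in MiB for one lshw memory node (a dict)."""
    # Sum the sizes of the banks that report one.
    total_bytes = sum(bank.get("size", 0)
                      for bank in memory_node.get("children", []))
    if not total_bytes:
        # Banks are missing or report no size (the lshw regression):
        # fall back to the top-level size, if there is one.
        total_bytes = memory_node.get("size", 0)
    return total_bytes // (1024 * 1024)
```

The point of the change is the second branch: the top-level size is consulted even when a list of banks exists, as long as none of the banks yields a usable size.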
Zane Bitter
c56cd4abc0 Fix missing data in log messages
Change-Id: I5d08deed86d79a7ea0b7a1625122af595037dab5
2021-04-29 09:55:56 -04:00
Dmitry Tantsur
1ab405b509 Do not fail network interface collection on unsupported interface
Currently if one interface cannot be handled (e.g. it has empty MAC),
the whole collection fails. Ignore unsupported interfaces instead.

Change-Id: Ibdaad62b39c239d4f3fb3111c2fae9e31e877b28
2021-04-07 17:16:27 +02:00
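The behaviour change in 1ab405b509 (skip an interface that cannot be handled instead of failing the whole collection) can be sketched like this. All names here (`UnsupportedInterface`, `describe_interface`, `collect_interfaces`) are hypothetical stand-ins, not the agent's API:

```python
import logging

LOG = logging.getLogger(__name__)


class UnsupportedInterface(Exception):
    """Hypothetical error for an interface that cannot be described."""


def describe_interface(name, mac):
    # An empty MAC is the example given in the commit message.
    if not mac:
        raise UnsupportedInterface(name)
    return {"name": name, "mac_address": mac}


def collect_interfaces(raw_interfaces):
    """Describe every (name, mac) pair, skipping unsupported ones."""
    collected = []
    for name, mac in raw_interfaces:
        try:
            collected.append(describe_interface(name, mac))
        except UnsupportedInterface:
            LOG.warning("Skipping unsupported interface %s", name)
    return collected
```

Catching the error inside the loop, rather than around it, is what keeps one bad interface from discarding the rest.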
Bernd Mueller
2a64413bb6 typo chanages -> changes
Change-Id: Ifb75a5f6f01bd98011464eb05f98d8db001dcd54
2021-03-24 13:53:32 +01:00
Bob Fournier
4afe4f6069 Check the base device if the read-only file cannot be read
For some drives, the partition e.g. `/dev/sda1` will not have the
'ro' file which can result in a metadata erasure failure but the base
device (`/dev/sda`) will have this file.  Add an additional check
for the base device.

Change-Id: Ia01bdbf82cee6ce15fabdc42f9c23036df55b4c5
Story: 2008696
Task: 42004
2021-03-09 07:05:27 -05:00
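The additional check from 4afe4f6069 amounts to trying the partition's `ro` file first and the base device's `ro` file second. A minimal sketch, assuming a sysfs-like layout of `<root>/<name>/ro` and a naive way of deriving the base device name; the function name and the `sysfs_block` parameter are hypothetical:

```python
import os
import re


def is_read_only(device, sysfs_block="/sys/block"):
    """Return True if `device` (e.g. /dev/sda1) is flagged read-only."""
    name = os.path.basename(device)
    # Naive base-device guess: strip a trailing partition number
    # (sda1 -> sda). Names such as nvme0n1p1 would need extra handling.
    base = re.sub(r"\d+$", "", name)
    for candidate in (name, base):
        try:
            with open(os.path.join(sysfs_block, candidate, "ro")) as ro_file:
                return ro_file.read().strip() == "1"
        except OSError:
            # File missing or unreadable: try the next candidate.
            continue
    # Neither file could be read; this sketch treats that as writable.
    return False
```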
Riccardo Pittau
bff252c726 Remove default parameter from execute
The check_exit_code parameter of execute from the processutils extension
already defaults to [0].
See:
https://opendev.org/openstack/oslo.concurrency/src/branch/master/oslo_concurrency/processutils.py#L214

Change-Id: Iedff5325e0737556d5eb3da601c984ddfc633873
2021-03-02 16:19:32 +01:00
Jacob Anders
d2127e7ef4 Remove nvme-cli warning and delay on nvme-format
This change adds '-f' flag to nvme-cli calls during NVMe Secure Erase.
This removes nvme-cli output warning that the device is about to be
irreversibly deleted as well as the related 10 second delay which is
pointlessly increasing NVMe cleaning time.

Story: 2008290
Change-Id: I7b7b8b7d4f643b07d5c9dcf7ec35cf7ebedf44d1
2021-03-02 15:37:35 +10:00
Zuul
4a22c887f8 Merge "Use try_execute from ironic-lib" 2021-03-01 13:54:15 +00:00
Mohammed Naser
ab267aabdd Allow clean_configuration to run against full-device arrays
At the moment, it is not possible for Ironic to clean up a
RAID array that is built from an entire device.  This patch
allows it to do so by overriding the behaviour of attempting
to find the device name if the device name does not end with
a number and is a real block device.

Story: #2008663
Task: #41948
Change-Id: I66b0990acaec45b1635795563987b99f9fa04ac7
2021-02-27 17:24:16 -05:00
Riccardo Pittau
0459c61c8d Use try_execute from ironic-lib
Also adapt unit tests

Change-Id: I37d050877daabc9dc0a5821cf20a689652b26f34
2021-02-25 14:46:17 +01:00
Zuul
6ea3aff8d6 Merge "New deploy step for injecting arbitrary files" 2021-02-22 18:48:22 +00:00
Zuul
2979ee5314 Merge "Add support for using NVMe specific cleaning" 2021-02-19 12:13:55 +00:00
Jacob Anders
8bcf1be920 Add support for using NVMe specific cleaning
This change adds support for utilising NVMe specific cleaning tools
on supported devices. This will remove the necessity of using shred to
securely delete the contents of a NVMe drive and enable using nvme-cli
tools instead, improving cleaning performance and reducing wear on the device.

Story: 2008290
Task: 41168
Change-Id: I2f63db9b739e53699bd5f164b79640927bf757d7
2021-02-18 22:51:34 +10:00
Riccardo Pittau
7d7940d904 Move some raid specific functions to raid_utils
To reduce size of the hardware module and separate the raid specific
code in raid_utils, we move some functions and adapt the tests.

Change-Id: I73f6cf118575b627e66727d88d5567377c1999a0
2021-02-17 10:11:13 +01:00
Dmitry Tantsur
59cb08fd28 New deploy step for injecting arbitrary files
This change adds a deploy step inject_files that adds a flexible
way to inject files into the instance.

Change-Id: I0e70a2cbc13744195c9493a48662e465ec010dbe
Story: #2008611
Task: #41794
2021-02-16 16:56:52 +01:00
Riccardo Pittau
fc1f2c73c6 Use variable for lsblk columns device info
Adjusted unit tests accordingly.

Also removed redundant parentheses.

Change-Id: I8e2cac5172f009d5204f83bd83e1f27cfd721f09
2021-02-03 15:31:32 +01:00
Julia Kreger
cb6c0059b5 Fix default disk label with partition images
Partition images through the agent have the unfortunate
side effect of being executed without full node context
by default. Luckily we've had a similar problem and
cache the node.

This patch changes the lookup from a default of msdos
partitions to use the cached node object.

Change-Id: I002816c9372fdf1cc32f3c67f420073551479fd9
2020-12-14 06:36:18 -08:00
Zuul
1a9491e651 Merge "Bring up VLAN interfaces and include in introspection report" 2020-12-02 13:59:28 +00:00
Zuul
22985da710 Merge "Make mdadm a soft requirement" 2020-11-23 19:37:59 +00:00
Dmitry Tantsur
ab8dee0386 Make mdadm a soft requirement
No point in requiring it for deployments that don't use software RAID.

Change-Id: I8b40f02cc81d3154f98fa3f2cbb4d3c7319291b8
2020-11-20 17:07:00 +01:00
Bob Fournier
6e3f28d720 Bring up VLAN interfaces and include in introspection report
Add the ability to bring up VLAN interfaces and include them in the
introspection report.  A new configuration field is added -
``ipa-enable-vlan-interfaces``, which defines either the VLAN interface
to enable, the interface to use, or 'all' - which indicates all
interfaces.  If the particular VLAN is not provided, IPA will
use the lldp info for the interface to determine which VLANs should
be enabled.

Change-Id: Icb4f66a02b298b4d165ebb58134cd31029e535cc
Story: 2008298
Task: 41183
2020-11-20 10:17:00 -05:00
Zuul
4762aca077 Merge "Add clean step 'erase_pstore'" 2020-11-18 17:38:00 +00:00
Arne Wiebalck
92e26b01e9 Add clean step 'erase_pstore'
Add an automatic clean step to clean the Linux kernel's pstore.
The step is disabled by default.

Story: #2008317
Task: #41214

Change-Id: Ie1a42dfff4c7e1c7abeaf39feca956bb9e2ea497
2020-11-17 18:00:16 +01:00
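Cleaning the kernel's pstore, as in 92e26b01e9, comes down to removing the entries under the pstore mount point. The sketch below assumes the usual mount location `/sys/fs/pstore`; the function name and its return value are illustrative and not taken from the agent:

```python
import os
import shutil


def erase_pstore(pstore_dir="/sys/fs/pstore"):
    """Delete every entry under the pstore directory; return the count."""
    removed = 0
    for entry in os.scandir(pstore_dir):
        if entry.is_dir(follow_symlinks=False):
            shutil.rmtree(entry.path)
        else:
            os.unlink(entry.path)
        removed += 1
    return removed
```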
Vladyslav Drok
3761a44800 Fix vendor info retrieval for some versions of lshw
There is one more place that relies on lshw json output being a dict,
so let's fix the function that gets the dict rather than places it is
being used in.

Change-Id: Ia1c2c2e6a32c76ac0249e6a46e4cced18d6093a9
Task: 39527
Story: 2007588
2020-11-16 15:25:12 +01:00
Zuul
c33b3fff66 Merge "Add UUID to BlockDevice object" 2020-11-11 21:42:51 +00:00
Vladyslav Drok
c7858d3cc8 Add UUID to BlockDevice object
This would allow, for example, custom Ansible playbooks to use UUIDs of the
introspected node's disks. In future it might also enable agent
to use UUID (or by_path value) to refer to a device instead of
name, as it happens currently.

Change-Id: Id00437d2295c39fb12f3c25a92b30b56a58eef13
2020-11-11 17:25:59 +00:00
Vladyslav Drok
448ded43fe Fix physical memory calculation with new lshw
It seems that fix Id5a30028b139c51cae6232cac73a50b917fea233 was
dealing with a different issue. According to the description
in the story, and the linked commit there, the problem is the
fact that output is changed from dictionary to a list (with just
one value supposedly?). This commit changes the isinstance call
to check if an output of lshw is a list, and if so, we just use
the first element of the list.

Story: 2007588
Task: 39527
Change-Id: I87d87fd035701303e7d530a47b682db84e72ccb9
2020-11-06 19:09:28 +01:00
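The isinstance check described in 448ded43fe can be sketched as a small normalising loader. The function name `load_lshw_json` is hypothetical; the logic follows the commit message (if the output is a list, use its first element):

```python
import json


def load_lshw_json(raw_output):
    """Parse lshw JSON, accepting either a dict or a one-element list."""
    data = json.loads(raw_output)
    if isinstance(data, list):
        # Newer lshw wraps the top-level object in a list; use the first item.
        data = data[0]
    return data
```

Normalising at the point where the output is parsed means callers can keep treating the result as a dict, which is also the approach 3761a44800 describes for the vendor info path.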
Zuul
f52863a4d8 Merge "Updated Implementation of string interpolation delay on LOG messages" 2020-11-04 10:45:32 +00:00
ebagakis
35d412e9d5 Updated Implementation of string interpolation delay on LOG messages
This is a follow up to https://review.opendev.org/#/c/756300/

Change-Id: Ifba8a57b58d61ede169c60f6d51f224d134c7708
2020-11-03 15:27:27 +01:00
Arne Wiebalck
c7f6baf7f4 [trivial] Remove redundant list conversion
Follow-up to https://review.opendev.org/#/c/756300/

Change-Id: Ibc6c044e24dde82928f19a9b9a7eaf68be53fb0e
2020-10-13 08:29:53 +02:00
Zuul
80b0a9a132 Merge "Software RAID: Re-add missing devices" 2020-10-12 12:24:24 +00:00
Dmitry Tantsur
420ebc0d73 Do not silently swallow errors in the write_image deploy step
Calling join() does not raise, we need to explicitly check the result.

Change-Id: I81d3d727af220c2b50358edab8139f07874611f0
Story: #2008240
Task: #41083
2020-10-09 11:24:12 +02:00
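The pitfall fixed in 420ebc0d73 is general to Python threads: `join()` returns normally even when the thread's work failed, so the result has to be checked explicitly. A minimal sketch using `threading.Thread`; the class `ImageWriter` and the helper `run_and_check` are hypothetical and only illustrate the pattern:

```python
import threading


class ImageWriter(threading.Thread):
    """Hypothetical worker that records its failure instead of losing it."""

    def __init__(self, work):
        super().__init__()
        self._work = work
        self.error = None

    def run(self):
        try:
            self._work()
        except Exception as exc:
            # An exception here never reaches the thread that calls join().
            self.error = exc


def run_and_check(work):
    worker = ImageWriter(work)
    worker.start()
    worker.join()  # returns normally even if work() failed
    if worker.error is not None:
        raise RuntimeError("write failed: %s" % worker.error)
```

Without the explicit check after `join()`, a failed write would look identical to a successful one from the caller's side.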
Arne Wiebalck
253b4887d5 Software RAID: Re-add missing devices
Upon md device creation, component devices are sometimes removed
immediately again due to a "disk failure". The disks seem healthy,
though. This patch re-adds component devices in such cases to
prevent the md device from remaining in a degraded state (which
would cause issues later, e.g. during ESP creation).

Story: #2008164
Task: #40914

Change-Id: I2ac7cb4a546de84686d5c3435e850c14b3f6c1d7
2020-10-06 14:00:57 +02:00