Add the ability to bring up VLAN interfaces and include them in the
introspection report. A new configuration field,
``ipa-enable-vlan-interfaces``, defines either the VLAN interface
to enable, the interface to use, or 'all', which indicates all
interfaces. If a particular VLAN is not provided, IPA will
use the LLDP info for the interface to determine which VLANs should
be enabled.
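As a rough illustration of the three kinds of value described above,
here is a minimal sketch. The comma-separated syntax, the
``interface.vlan_id`` notation and the function name are assumptions
made for the example, not necessarily what IPA implements.

```python
def parse_vlan_option(value, known_interfaces):
    """Turn an option string into (interface, vlan_id) requests.

    A vlan_id of None means "ask LLDP which VLANs to enable".
    The syntax handled here is an assumption for illustration only.
    """
    requests = []
    for item in value.split(','):
        item = item.strip()
        if not item:
            continue
        if item == 'all':
            # Every known interface, VLANs to be discovered via LLDP.
            requests.extend((iface, None) for iface in known_interfaces)
        elif '.' in item:
            # Explicit VLAN on a specific interface, e.g. "eth0.100".
            iface, vlan = item.split('.', 1)
            requests.append((iface, int(vlan)))
        else:
            # Interface only, VLANs to be discovered via LLDP.
            requests.append((item, None))
    return requests
```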
Change-Id: Icb4f66a02b298b4d165ebb58134cd31029e535cc
Story: 2008298
Task: 41183
Add an automatic clean step to clean the Linux kernel's pstore.
The step is disabled by default.
Story: #2008317
Task: #41214
Change-Id: Ie1a42dfff4c7e1c7abeaf39feca956bb9e2ea497
There is one more place that relies on the lshw JSON output being a
dict, so let's fix the function that gets the dict rather than each
place where it is used.
Change-Id: Ia1c2c2e6a32c76ac0249e6a46e4cced18d6093a9
Task: 39527
Story: 2007588
It would allow, for example, custom Ansible playbooks to use the UUIDs
of the introspected node's disks. In the future it might also enable
the agent to refer to a device by UUID (or by_path value) instead of
by name, as happens currently.
Change-Id: Id00437d2295c39fb12f3c25a92b30b56a58eef13
This is very convenient for debugging and is something ironic and
ironic-inspector already do.
Register SSL options earlier so that they're accounted for.
Change-Id: I56aca8eec1dfeb065ac657452a7076a9e3d17cc3
It seems that fix Id5a30028b139c51cae6232cac73a50b917fea233 was
dealing with a different issue. According to the description
in the story and the commit linked there, the problem is that the
output changed from a dictionary to a list (supposedly with just
one value). This commit changes the isinstance call to check
whether the output of lshw is a list, and if so, we just use
the first element of the list.
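A minimal sketch of that normalization follows. The function name is
hypothetical, and returning an empty dict for an empty list is an
assumption of the sketch rather than something this commit states.

```python
import json


def normalize_lshw_output(raw_json):
    """Return lshw JSON output as a dict, whichever shape lshw produced."""
    data = json.loads(raw_json)
    if isinstance(data, list):
        # Newer output wraps the dict in a list: take the first element.
        # The empty-list fallback is an assumption of this sketch.
        return data[0] if data else {}
    return data
```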
Story: 2007588
Task: 39527
Change-Id: I87d87fd035701303e7d530a47b682db84e72ccb9
Add the possibility to use the disk LABEL to identify the rootfs UUID
for Software RAID deployment.
Change-Id: I77f36e70ddc539af0190db1c1abe0fb2c66f34b4
Story: 2008303
Task: 41188
Follow-up on Ib96a1057792f45f2e4554671e32c436140463ee8 to
improve some of the wording and address review feedback from
Dmitry Tantsur.
Change-Id: Id77b0d72f3d78e5befd05fbdb6b21bc780f4ddfe
Typically, the Ironic API client in IPA will autodetect the API version
based on the output of a GET of the root of the API. If for some reason
this API endpoint is restricted, or the operator wishes to limit the
Ironic API version IPA uses, they can now set CONF.ironic_api_version to
avoid autodetection and force a version.
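The decision described above can be sketched as follows. The function
and its ``autodetect`` callable are hypothetical stand-ins for the
client code and for the GET on the API root.

```python
def pick_api_version(configured_version, autodetect):
    """Return the forced version if one is configured, else autodetect.

    `autodetect` is a hypothetical zero-argument callable standing in
    for the GET on the API root. It is never called when a version is
    forced, so a restricted root endpoint is not touched.
    """
    if configured_version:
        return configured_version
    return autodetect()
```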
Change-Id: Ib96a1057792f45f2e4554671e32c436140463ee8
By default, grub2-mkconfig scans everything to look for other
environments and then loads those into the grub configuration.
That makes sense, but on newer versions of grub2 in distribution
images, os-prober takes an exceptionally long time in some
cases where more than one storage device exists with other
filesystems.
As a result of the os-prober execution by grub2-mkconfig, the
bootloader installation can completely time out and fail the
deployment. This is presently experienced with metalsmith on
CentOS 8.
There are numerous sporadic reports of issues like this one,
where grub2-mkconfig hangs for some period of time, and this is
observable on CentOS 8.2 in our CI. While one report [0] mentions
this issue, another bug [1] has the dialog that actually helps us
frame the context as to what we likely should do.
Also fixes the unit tests so we actually test whether we're running
with grub2. :\
[0]: https://bugzilla.redhat.com/show_bug.cgi?id=1744693
[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1709682
Depends-On: https://review.opendev.org/#/c/748315
Change-Id: I14bf299afef3a1ddb2006fe5f182d7f0d249e734
Calling join() does not raise, so we need to explicitly check the
result.
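To illustrate the general pitfall with plain Python threads (this is
not the agent's own command-result class, and all names below are
hypothetical): join() only waits for completion, so a failure has to
be stored by the worker and checked by the caller.

```python
import threading


class Worker(threading.Thread):
    """Run a callable in a thread and remember its result or error."""

    def __init__(self, func):
        super().__init__()
        self._func = func
        self.result = None
        self.error = None

    def run(self):
        try:
            self.result = self._func()
        except Exception as exc:
            # Exceptions do not propagate to the joining thread.
            self.error = exc


def run_and_check(func):
    worker = Worker(func)
    worker.start()
    worker.join()  # Returns None even if func raised.
    if worker.error is not None:
        raise RuntimeError('worker failed: %s' % worker.error)
    return worker.result
```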
Change-Id: I81d3d727af220c2b50358edab8139f07874611f0
Story: #2008240
Task: #41083
Currently the test takes 5*5=25 seconds. Re-arrange the code so
that it's possible to change the retry delay in tests.
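One way to make such a delay adjustable is to keep it as an attribute
instead of a literal inside the loop. The sketch below assumes that
approach; the class and attribute names are hypothetical.

```python
import time


class Retrier:
    """Retry an operation with a delay that tests can override."""

    retry_delay = 5    # seconds between attempts
    max_attempts = 5

    def __init__(self, operation):
        self._operation = operation

    def run(self):
        last_error = None
        for attempt in range(self.max_attempts):
            try:
                return self._operation()
            except Exception as exc:
                last_error = exc
                # Do not sleep after the final attempt.
                if attempt < self.max_attempts - 1:
                    time.sleep(self.retry_delay)
        raise last_error
```

A test can then set ``retry_delay = 0`` on the instance and exercise
all retries without waiting.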
Change-Id: Ia559dad4bc656f8ad6b2cb8cb0137a97e2614db7
We don't currently have a reliable way to detect the root UUID for
whole disk images, which results in an ignored traceback
every time install_bootloader is called with whole disk images in
UEFI mode. Avoid it by skipping GRUB2 if the root UUID is unknown.
Change-Id: I84245538f59c664b72d1cafbca8d61be0978f489
Upon md device creation, component devices are sometimes removed
again immediately due to a "disk failure", although the disks seem
healthy. This patch re-adds component devices in such cases to
prevent the md device from remaining in a degraded state (which
would cause issues later, e.g. during ESP creation).
Story: #2008164
Task: #40914
Change-Id: I2ac7cb4a546de84686d5c3435e850c14b3f6c1d7
Scanning the output of mdadm commands for RAID members will
miss component devices which are currently not part of the
RAID. For proper cleaning it is better to scan block devices
for a signature of the md device for which we would like to
get the components.
Story: #2008186
Task: #40947
Change-Id: Ib46612697851e36a16d272ccaeb0115106253863
Partitions on the holder disk should only be deleted after
all RAID devices have been deleted. Otherwise, superblocks
on partitions which reside on the same disks cannot be cleaned.
Story: #2008199
Task: #40979
Change-Id: I19293f5b992cd1fa68957d6f306dcec8f3b7a820
Currently, IntelCnaHardwareManager inherits from GenericHardwareManager,
which makes it a new "GenericHardwareManager" with "MAINLINE" priority.
This causes all other hardware managers with a priority lower than
"MAINLINE" to never be used. To fix this, make IntelCnaHardwareManager
inherit from the basic HardwareManager.
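The effect can be shown with a simplified model of priority-based
dispatch. The classes, priority numbers and dispatch function below
are stand-ins invented for the illustration, not IPA's real ones.

```python
class HardwareManager:
    priority = 0

    def erase_devices(self):
        raise NotImplementedError


class GenericManager(HardwareManager):
    priority = 1

    def erase_devices(self):
        return 'generic erase'


class SiteManager(HardwareManager):
    priority = 2  # below the vendor manager, above generic

    def erase_devices(self):
        return 'site erase'


class VendorManagerBad(GenericManager):
    priority = 3  # inherits every generic method at high priority


class VendorManagerGood(HardwareManager):
    priority = 3  # implements only what it overrides (nothing here)


def dispatch(method, managers):
    """Call method on the highest-priority manager implementing it."""
    ordered = sorted(managers, key=lambda m: m.priority, reverse=True)
    for manager in ordered:
        try:
            return getattr(manager, method)()
        except NotImplementedError:
            continue
    raise RuntimeError('no manager implements %s' % method)
```

With ``VendorManagerBad`` loaded, ``SiteManager.erase_devices`` is
shadowed by the inherited generic method. With ``VendorManagerGood``
the call falls through to ``SiteManager`` as intended.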
Change-Id: I28b665d8841b0b2e83b132e1f25df95e03e7ba10
Story: 2008142
Task: 40882
Heartbeating in IPA has used select.poll() for years to work around
a bug where changing the time in the ramdisk could cause heartbeats
to stop and never resume.
Now that IPA syncs time at start and exit, this workaround is no
longer needed. So instead, we'll revert to using threading.Event()
in order to make the code simpler and easier to understand.
Since we need this to be an eventlet-event, and not a standard-thread
event, also monkey_patch threading.
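A minimal sketch of an Event-based heartbeat loop follows. It uses
plain standard-library threads and leaves out the eventlet
monkey-patching mentioned above. The class name and the ``beat``
callable are hypothetical.

```python
import threading


class Heartbeater(threading.Thread):
    """Call `beat` every `interval` seconds until stopped."""

    def __init__(self, beat, interval):
        super().__init__(daemon=True)
        self._beat = beat
        self._interval = interval
        self._stop_event = threading.Event()

    def run(self):
        # wait() returns False on timeout and True once the event is
        # set, so the loop ends promptly when stop() is called.
        while not self._stop_event.wait(self._interval):
            self._beat()

    def stop(self):
        self._stop_event.set()
        self.join()
```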
Additionally, a few backoff interval values were set but never
applied. To preserve the 5+ year old behavior of not doing error
backoffs, that code was removed instead of being made to work.
Change-Id: Ibcde99de64bb7e95d5df63a42a4ca4999f0c4c9b
A transitory connection failure, such as one caused by
a port being held down for traffic forwarding, can result
in failed introspections.
Now the agent retries.
Change-Id: I72c5e3aca000d3854a17f8a461b1a2935e5c0d9b
Adds a new flag (on by default) that enables generating a TLS
certificate and sending it to ironic via heartbeat. Whether
ironic supports auto-generated certificates is determined by
checking its API version.
Change-Id: I01f83dd04cfec2adc9e2a6b9c531391773ed36e5
Depends-On: https://review.opendev.org/747136
Depends-On: https://review.opendev.org/749975
Story: #2007214
Task: #40604
Makes sure heartbeats can send versions higher than the one required
for tokens while also making sure we never send a version we don't
know. Also makes the code easier to understand.
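The rule can be sketched as picking the highest version both sides
understand. The version tuple and names below are illustrative
assumptions, not the agent's actual constants.

```python
# Illustrative value only; the real constant lives in the agent code.
MAX_KNOWN_VERSION = (1, 68)    # highest version this sketch "knows"


def heartbeat_version(server_max, max_known=MAX_KNOWN_VERSION):
    """Pick the highest version that both the server and we understand.

    The result is not pinned to any minimum such as the version that
    token support requires, but never exceeds what we know.
    """
    return min(server_max, max_known)
```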
Change-Id: Ice1e7d45ea90c9fd8220c4b94e691b6015e23074
The node lookup code added in change
I27201319f31cdc01605a3c5ae9ef4b4218e4a3f6
was slightly broken in that we call a method
with a keyword argument which doesn't exist
(uuid instead of node_uuid).
It happens, it is a quick fix!
Spotted on a metalsmith job:
[-] Agent is requesting to perform an explicit node cache update.
This is to pickup any chanages in the cache before deployment.
[-] Failed to update node cache. Error lookup_node() got an
unexpected keyword argument 'uuid'
Change-Id: I59ecec65707a2f03918b233f1925395ebe59b8c4
Apparently, functional-py36 just runs unit tests.
Fix the test that has regressed in the meantime and make it voting
so that we don't regress again.
Change-Id: Id5efe89a12a00c27e6299380a51cdb840285d691