This commit changes the GA metadata status on fresh install
to "deployed" given recent technical decision changes.
Test Plan:
PASS: build and install iso, verify the correct output with
"software list"
Story: 2010676
Task: 49166
Change-Id: Idbab8655f9f2e4e080f389fa7823f5e6744c4c74
Signed-off-by: Heitor Matsui <heitorvieira.matsui@windriver.com>
In management network reconfiguration for AIO-SX, the runtime manifest
executed during host unlock could take more than five minutes to complete.
This commit extends the timeout period from five minutes to eight
minutes.
Test Plan:
PASS: AIO-SX subcloud mgmt network reconfiguration
Story: 2010722
Task: 49133
Change-Id: I6bc0bacad86e82cc1385132f9cf10b56002f385e
Signed-off-by: Teresa Ho <teresa.ho@windriver.com>
This commit switches the GA metadata file copy from
/opt/patching/metadata to /opt/software/metadata.
Test Plan
PASS: build iso, install and verify that "software list"
lists the GA release
Story: 2010676
Task: 49112
Change-Id: I75b8cd6ae41a9cf9b5af0225ebcaaf0d9e0ddb4e
Signed-off-by: Heitor Matsui <heitorvieira.matsui@windriver.com>
fsmond tries to create a test file in "/.fs-test" but
it is not possible because "/" is blocked by ostree.
The fix is to replace this path in fsmond monitoring
with /sysroot/.fs_test.
Below is a comparison of the logs:
- Before change:
( 196) fsmon_service : Warn : File (/.fs-test) test failed
- After change:
( 201) fsmon_service : Info : tests passed
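The kind of check behind these log lines can be illustrated with a minimal sketch. This is not fsmond's code; the function name and the write/read-back/remove sequence are assumptions made only to show why a test file under a read-only "/" fails while one under a writable directory passes.

```python
import os


def fs_write_test(test_path: str, payload: bytes = b"fs-test") -> bool:
    """Hypothetical helper: return True if a small file can be written,
    read back and removed at test_path, False on any OS-level error."""
    try:
        with open(test_path, "wb") as handle:
            handle.write(payload)
            handle.flush()
            os.fsync(handle.fileno())  # force the write to the filesystem
        with open(test_path, "rb") as handle:
            ok = handle.read() == payload
        os.remove(test_path)
        return ok
    except OSError:
        # Read-only or missing parent directory ends up here.
        return False
```

On an ostree-managed system one would expect a call with "/.fs-test" to return False and a call with a path under /sysroot to return True, matching the before/after logs.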
Test Plan:
- PASS: Build mtce package
- PASS: Replace fsmond binary on AIO-SX
- PASS: Check fsmond.log output
Closes-Bug: 2043712
Change-Id: Ib4bad73448735bce1dff598151fce86f867f4db7
Signed-off-by: Erickson Silva de Oliveira <Erickson.SilvadeOliveira@windriver.com>
It was detected that during PXE boot the IPv6 autoconf is turned on
due to an error in the network config file for the PXE interface.
Instead of applying the config to the interface, the file configures
the loopback.
With autoconf left on, the interface can receive unwanted
address configuration that can cause errors during the ansible
playbook execution that follows.
Closes-Bug: 2043509
Change-Id: I48584dc6b92fca02205c4774c4624410b6a29ba8
Signed-off-by: Andre Kantek <andrefernandozanella.kantek@windriver.com>
With the introduction of FQDN for MGMT network feature, the DNS lookup
of 'controller' resolves to 'controller.internal'.
The kickstart script uses the DNS lookup of controller to determine
whether the system is using IPv6 or IPv4, which results in a string
instead of IP address or 0 return code. This causes a problem in
installing nodes in IPv4 when the management interface is configured
over vlan.
The fix is to use the FQDN controller.internal.
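The failure mode can be sketched as follows. The kickstart itself is a shell script; this Python helper and its name are hypothetical and only show that a family check expecting an IP literal gets no usable answer when the lookup returns a hostname string.

```python
import ipaddress
from typing import Optional


def address_family(lookup_result: str) -> Optional[int]:
    """Hypothetical helper: return 4 or 6 for an IP literal,
    or None when the lookup result is not an IP address."""
    try:
        return ipaddress.ip_address(lookup_result.strip()).version
    except ValueError:
        # e.g. a hostname such as 'controller.internal'
        return None
```

With this sketch, an IPv4 or IPv6 literal yields 4 or 6, while the string 'controller.internal' yields None, which is the case the kickstart did not handle.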
Test plan:
PASS: Install IPv4 AIO-DX with mgmt vlan
PASS: Install IPv6 AIO-DX with mgmt vlan
Story: 2010722
Task: 48682
Closes-Bug: 2042953
Signed-off-by: Teresa Ho <teresa.ho@windriver.com>
Change-Id: I5377587c8bc8c62a62f03123cabef7366df3dd94
This update solves two issues involving bmc reset.
Issue #1: A race condition can occur if the mtcAgent finds an
unlocked-disabled or heartbeat failing node early in
its startup sequence, say over a swact or an SM service
restart and needs to issue a one-time-reset. If at that
point it has not yet established access to the BMC then
the one-time-reset request is skipped.
Issue #2: When the issue #1 race condition does not occur before BMC
access is established the mtcAgent will issue its one-time
reset to a node. If this occurs as a result of a crashdump
then this one-time reset can interrupt the collection of
the vmcore crashdump file.
This update solves both of these issues by introducing a bmc reset
delay following the detection and in the handling of a failed node
that 'may' need to be reset to recover from being network isolated.
The delay prevents the crashdump from being interrupted and removes
the race condition by giving maintenance more time to establish bmc
access required to send the reset command.
To handle significantly long bmc reset delay values this update
cancels the posted 'in waiting' reset if the target recovers online
before the delay expires.
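The delayed reset with cancellation can be sketched as a small decision function. This is a simplified illustration, not the mtcAgent implementation: the function name, arguments and return strings are assumptions, and it ignores finer cases such as a node that is online without having rebooted.

```python
def pending_reset_action(elapsed_secs: int, delay_secs: int,
                         node_online: bool) -> str:
    """Hypothetical helper: decide what to do with a posted BMC reset.

    'cancel' - node recovered online before the delay expired
    'wait'   - delay still running, node not online yet
    'reset'  - delay expired, send the BMC reset
    """
    if elapsed_secs < delay_secs:
        return "cancel" if node_online else "wait"
    return "reset"
```

A caller would poll this periodically; a delay of 0 degenerates to an immediate reset.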
It is recommended to use a bmc reset delay that is longer than a
typical node reboot time. This is so that in the typical case, where
there is no crashdump happening, we don't reset the node late in its
nearly complete recovery. The number of seconds remaining until the
pending reset is logged periodically.
It can take upwards of 2-3 minutes for a crashdump to complete.
To avoid the double reboot, in the typical case, the bmc reset delay
is set to 5 minutes which is longer than a typical boot time.
This means that if the node recovers online before the delay expires
then great, the reset wasn't needed and is cancelled.
However, if the node is truly isolated or the shutdown sequence
hangs then although the recovery is delayed a bit to accommodate
the crashdump case, the node is still recovered after the bmc reset
delay period. This could lead to a double reboot if the node
recovery-to-online time is longer than the bmc reset delay.
This update implements this change by adding a new 'reset send wait'
phase to the existing reset progression command handler.
Some consistency driven logging improvements were also implemented.
Test Plan:
PASS: Verify failed node crashdump is not interrupted by bmc reset.
PASS: Verify bmc is accessible after the bmc reset delay.
PASS: Verify handling of a node recovery case where the node does not
come back before bmc_reset_delay timeout.
PASS: Verify posted reset is cancelled if the node goes online before
the bmc reset delay and uptime shows less than 5 mins.
PASS: Verify reset is not cancelled if node comes back online without
reboot before bmc reset delay and still seeing mtcAlive on one
or more links. Handles the cluster-host only heartbeat loss case.
The node is still rebooted with the bmc reset delay as backup.
PASS: Verify reset progression command handling, with and
without reboot ACKs, with and without bmc
PASS: Verify reset delay defaults to 5 minutes
PASS: Verify reset delay change over a manual change and sighup
PASS: Verify bmc reset delay of 0, 10, 60, 120, 300 (default), 500
PASS: Verify host-reset when host is already rebooting
PASS: Verify host-reboot when host is already rebooting
PASS: Verify timing of retries and bmc reset timeout
PASS: Verify posted reset throttled log countdown
Failure Mode Cases:
PASS: Verify recovery handling of failed powered off node
PASS: Verify recovery handling of failed node that never comes online
PASS: Verify recovery handling when bmc is never accessible
PASS: Verify recovery handling cluster-host network heartbeat loss
PASS: Verify recovery handling management network heartbeat loss
PASS: Verify recovery handling both heartbeat loss
PASS: Verify mtcAgent restart handling finding unlocked disabled host
Regression:
PASS: Verify build and DX system install
PASS: Verify lock/unlock (soak 10 loops)
PASS: Verify host-reboot
PASS: Verify host-reset
PASS: Verify host-reinstall
PASS: Verify reboot graceful recovery (force and no force)
PASS: Verify transient heartbeat failure handling
PASS: Verify persistent heartbeat loss handling of mgmt and/or cluster networks
PASS: Verify SM peer reset handling when standby controller is rebooted
PASS: Verify logging and issue debug ability
Closes-Bug: 2042567
Closes-Bug: 2042571
Change-Id: I195661702b0d843d0bac19f3d1ae70195fdec308
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
This commit applies to the prestaged ISO install. The kickstart.cfg is
updated to copy the prestaged ostree_repo into release-specific
/opt/platform-backup/<release> location.
A minor change is also included in miniboot.cfg to sync the patching
metadata for prepatched ISOs. This fills a potential hole in the
patching metadata sync behaviour identified during testing.
Normally the patching metadata is synchronized from the system
controller down to the subcloud. For the prestaged ISO case, this change
is necessary to ensure the patching metadata is seeded from the
prepatched ISO created via gen-prestaged-iso.sh.
Test Plan
PASS:
- Build prestaged ISO, including container images and a patch
- Install subcloud using prestaged ISO
- Verify contents of /opt/platform-backup/<release> are properly
populated.
- Verify subcloud is installed using prestaged data from
/opt/platform-backup/<release>
- Verify that included container images are installed
- Build prestaged ISO using a pre-patched ISO. Install subcloud, ensure
that patching metadata is properly synchronized on installation.
Out of scope failure:
- A new bug to be raised for the following:
- Verify that the included patch is installed on the subcloud
- It appears that this has never worked in Debian. The --patch
option makes sense for a Debian installation, since the patches
are contained in ostree commits. To fully support this
functionality we need to implement a new mechanism to do a
sw-patch upload and apply at some point during the installation.
- Support for the gen-prestaged-iso.sh --patch option will be
added in a future commit
Closes-Bug: 2039282
Signed-off-by: Kyle MacLeod <kyle.macleod@windriver.com>
Change-Id: I973f4704eae09634a0c3fe2f7fbc31ac1835fcf8
Ostree doesn't manage the /var filesystem. Anything
installed there during initial filesystem setup becomes
unpatchable [1]. As a result, the kickstart install dir
/var/www/pages/feed/rel-${platform_release}/kickstart
is not updated according to patch changes.
/var/www/pages/feed/rel-${platform_release}/kickstart
is currently only used for PXE boot installs.
Subcloud remote installations are using the miniboot.cfg
kickstart from the load-imported ISO
(we may want to change this in some future commit).
This commit adds kickstart update support to
pxeboot-feed.service (pxeboot_feed.sh) so that
/var/www/pages/feed/rel-${platform_release}/kickstarts
is refreshed based on the kickstart dir from
/ostree (i.e., the patched changes).
[1] https://review.opendev.org/c/starlingx/ha/+/890918
Test Plan:
1. PASS: Verify Debian build and DC system install
(virtual lab - disk and pxe installs)
2. PASS: Verify pxe install (DC remote install) with
patched kickstart
3. PASS: Create a patch with changes to kickstart feed:
- modify an existing kickstart
- create a new kickstart file
- delete an existing file
- create a new kickstart sub-directory
- modify centos subdir
verify patch apply, ensure that changes are
correctly applied to:
/var/www/pages/feed/rel-${platform_release}/kickstarts
4. PASS: Revert the patch from test #3 and ensure changes
are correctly undone in the feed dir
Closes-Bug: 2034753
Change-Id: I74804bff23a74512db6a95fa514c84a1a6ea54a8
Signed-off-by: Salman Rana <salman.rana@windriver.com>
This commit updates the crashDumpMgr service in order to:
- Clean up the current service naming and packaging to follow the
standard Linux naming convention:
- Repackage /etc/init.d/crashDumpMgr to
/usr/sbin/crash-dump-manager
- Rename crashDumpMgr.service to crash-dump-manager.service
- Add EnvironmentFile to crash-dump-manager service file to source
configuration from /etc/default/crash-dump-manager.
- Update ExecStart of crash-dump-manager service to use parameters
from EnvironmentFile
- Update crash-dump-manager service dependencies to run after
config.service.
- Update logrotate configuration to support the retention policies for
the maximum number of files. The “rotate 1” option was removed to permit
crash-dump-manager to manage pruning old files.
- Modify the crash-dump-manager script to enable updates to the
max_files parameter to a lower value. If there are currently more
files than the new max_files value, the oldest files will be
deleted the next time a crash dump file needs to be stored, thus
adhering to the new max_files values.
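The pruning behaviour described in the last item can be sketched as below. This is an illustration under assumptions, not the crash-dump-manager script: the function name and the (name, mtime) input shape are hypothetical, and "oldest" is taken to mean smallest modification time.

```python
from typing import List, Tuple


def files_to_prune(files: List[Tuple[str, float]], max_files: int) -> List[str]:
    """Hypothetical helper: given (name, mtime) pairs, return the names
    of the oldest files that must go so at most max_files remain."""
    if max_files < 0:
        raise ValueError("max_files must not be negative")
    by_age = sorted(files, key=lambda item: item[1])  # oldest first
    excess = max(0, len(by_age) - max_files)
    return [name for name, _ in by_age[:excess]]
```

Lowering max_files from 4 to 2 with four stored files would therefore select the two oldest files for deletion at the next store.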
Test Plan:
PASS: Build ISO and perform a fresh install. Verify the new
crash-dump-manager service is enabled and working as expected.
PASS: Add and apply new crashdump service parameters and force a kernel
panic. Verify that after the reboot, the max_files, max_used,
min_available and max_size values are updated accordingly to the service
parameters values.
PASS: Verify that the crashdump files are rotated as expected.
Story: 2010893
Task: 48910
Change-Id: I4a81fcc6ba456a0d73067b77588ee4a125e44e62
Signed-off-by: Enzo Candotti <enzo.candotti@windriver.com>
1. Extended the timeout to 14 minutes to accommodate the longer shutdown time.
2. Fixed the power state error log so that it logs the requested state
instead of the current power_state.
Test Plan:
PASS: Verify logged version is 2.2
PASS: Verify success path with no FIT delay ; HP and ZT servers
PASS: Verify timing of the loop with timeout of 14 minutes
PASS: Verify shutdown timeout handling when shutdown exceeds 14
minutes.
PASS: Verify install completes successfully when Power Off takes
close to but less than 14 minutes
PASS: Verify power state failure log reports proper state
Closes-Bug: 2038484
Signed-off-by: Li Zhu <li.zhu@windriver.com>
Change-Id: Ic99a06dca9962fcae43b20e00d8ebcb127a80560
Backup and Restore do not complete because the manifest is
not applied when drbd-cephmon tries to become primary.
This occurs because the LVs are not wiped before
being removed, so leftover data prevents drbd-cephmon from
becoming primary and causes the manifest apply to fail.
To ensure that drbd-cephmon turns primary on first unlock,
LVs will be wiped before recreating them during kickstart
procedure.
Test Plan:
PASS: Backup and restore on AIO-DX
PASS: Install AIO-SX over the previous installation without
wiping the disks and check the install.log to verify
that the disks are wiped during kickstart.
PASS: Install AIO-DX, reinstall Controller-1, and check the
install.log to verify that the disks were wiped during kickstart.
Closes-Bug: #2031542
Change-Id: Ib00d77fbc9dfd62e9c94f418e29f2805f8a0c036
Signed-off-by: Gustavo Ornaghi Antunes <gustavo.ornaghiantunes@windriver.com>
As it was done in the previous change for local installation
https://review.opendev.org/c/starlingx/metal/+/863322
This change removes the ISO embedded machine-id file to allow the
value regeneration after the first boot post install for subclouds
that use the redfish protocol when added in a system controller.
Test Plan
[PASS] install 2 subclouds from the system controller containing the
patch and check that the values in /etc/machine-id and
/var/lib/dbus/machine-id are unique for each subcloud
Closes-Bug: 2037434
Change-Id: If7a631b5769cb499956a7e5ee33e3361a6230452
Signed-off-by: Andre Kantek <andrefernandozanella.kantek@windriver.com>
Ostree doesn't manage the /var filesystem. Anything
installed there during initial filesystem setup becomes
unpatchable [1]. As a result, the kickstart install dir
/var/www/pages/feed/rel-${platform_release}/kickstart
is not updated according to patch changes.
This commit changes the platform-kickstarts install paths
to a place that ostree handles,
/usr/share/platform-kickstarts/rel-${platform_release}
in this case and symlinks it to
/var/www/pages/feed/rel-${platform_release}/kickstarts.
[1] https://review.opendev.org/c/starlingx/ha/+/890918
Test Plan:
1. PASS: ISO install and verify symlink created:
/var/www/pages/feed/rel-${platform_release}/kickstarts ->
/usr/share/platform-kickstarts/rel-${platform_release}
2. PASS: Verify that the centos/ dir, kickstart.cfg & miniboot.cfg
are installed to /usr/share/platform-kickstarts/
rel-${platform_release}
3. PASS: Verify PATCH apply, ensure that changes are applied to
/var/www/pages/feed/rel-${platform_release}/kickstarts
4. PASS: Manually remove/re-install the platform-kickstarts package
and verify kickstarts dir and symlink
Closes-Bug: 2034753
Closes-Bug: 2035109
Change-Id: I307d28c086bb3d9f0e4d6792db44e55c99358a50
Signed-off-by: Salman Rana <salman.rana@windriver.com>
This commit updates crashDumpMgr in order to add three new parameters
and enhance the existing one.
1. Maximum Files: Added 'max-files' parameter to specify the maximum
number of saved crash dump files. The default value is 4.
2. Maximum Size: Updated the 'max-size' parameter to support
the 'unlimited' value. The default value is 5GiB.
3. Maximum Used: Included 'max-used' parameter to limit the maximum
storage used by saved crash dump files. It supports 'unlimited'
and has a default value of unlimited.
4. Minimum Available: Implemented 'min-available' parameter, enabling
the definition of a minimum available storage threshold on the
crash dump file system. The value is restricted to a minimum of
1GB and defaults to 10%.
These enhancements refine the crash dump management process and
offer more control over storage usage and crash dump file retention.
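How the three size-related limits might combine can be sketched as follows. This is not the crashDumpMgr script: the function name, argument names and the order of the checks are assumptions, values are in bytes, None stands for 'unlimited', and the max-files count is left out.

```python
from typing import Optional


def rejection_reason(dump_size: int, used: int, available: int,
                     max_size: Optional[int], max_used: Optional[int],
                     min_available: int) -> Optional[str]:
    """Hypothetical helper: return the name of the limit that blocks
    storing a crash dump, or None if the dump may be stored."""
    if max_size is not None and dump_size > max_size:
        return "max-size"
    if max_used is not None and used + dump_size > max_used:
        return "max-used"
    if available - dump_size < min_available:
        return "min-available"
    return None
```

For example, a 6 GiB dump against the default 5 GiB max-size would be rejected with "max-size" before the other limits are consulted.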
Story: 2010893
Task: 48676
Test Plan:
1) max-files parameter:
PASS: don't set max-files param. Ensure the default value is used.
Create 5 directories inside /var/crash. Each of them contains
dmesg.<date> and dump.<date>. run the crashDumpMgr script.
Verify:
PASS: the vmcore_first.tar.1.gz is created when the first
directory is read.
PASS: 4 more vmcore_<date>.tar files are created.
PASS: There will be 1 vmcore_first.tar.1.gz and 4
vmcore_<date>.tar inside /var/log/crash.
PASS: There will be one summary file for each directory:
<date>_dmesg.<date> inside /var/crash
2) max-size parameter
PASS: don't set max-size param. Ensure the default value is used
(5GiB).
PASS: Set a fixed max-size param. Create a dump.<date> file greater
than the max-size param. Run the crashDumpMgr script. Verify
that the crash dump file is not generated and a log
message is displayed.
3) max-used parameter:
PASS: don't set max-used param. Ensure the default value is used
(unlimited).
PASS: Set a fixed max-used param. Create a dump.<date> file that
pushes the used space above the
max-used param. Run the crashDumpMgr script. Verify that
the crash dump file is not generated, a log message is
displayed and the directory is deleted.
4) min-available parameter:
PASS: don't set min-available param. Ensure the default value is
used (10% of /var/log/crash).
PASS: Set a fixed 'min-available' param. Generate a 'dump.<date>'
file to simulate a situation where the remaining space is
less than the 'min-available' parameter. Run the crashDumpMgr
script and ensure that it does not create the crashdump file,
displays a log message, and deletes the entry.
5) PASS: Since the crashDumpMgr.service file is not being modified,
verify that the script takes the default values.
Note: All tests have also been conducted by generating a kernel panic
and ensuring the crashDumpMgr script follows the correct workflow.
Change-Id: I8948593469dae01f190fd1ea21da3d0852bd7814
Signed-off-by: Enzo Candotti <enzo.candotti@windriver.com>
A new version of sphinx was released May 29 2022
which requires a language setting in config otherwise
a warning (treated as error) causes the sphinx operation
to fail.
Updated the sphinx config file to correct the issue.
The sphinx behavioural change is mentioned here:
https://github.com/sphinx-doc/sphinx/issues/10062
https://github.com/sphinx-doc/sphinx/issues/10474
Partial-Bug: 1976377
Partial-Bug: 2033431
Change-Id: I882faa7ce199d8817598980b9dc5090b4e1af57d
Signed-off-by: John Kung <john.kung@windriver.com>
This commit adds extracting the patches files (metadata) from the
load being imported.
Test Plan:
Passed: load from previous version imported as inactive
Passed: load from new version imported
Story: 2010611
Task: 48546
Change-Id: I12a2c9f62523f6b08294f2538ad77b5c8338a751
Signed-off-by: Guilherme Schons <guilherme.dossantosschons@windriver.com>
Add support for importing a load with import.sh from the current load.
This makes it possible to always import a load with the higher-version
import.sh (in the case of load-import --inactive).
TCs:
passed: from system controller running 23.09 load, import 22.12 load
passed: regression import N+1 load
Story: 2010611
Task: 48371
Change-Id: I4aec6eaa89019d4852979c27a708e409f32e27b0
Signed-off-by: Bin Qian <bin.qian@windriver.com>
This commit updates the README files of a few of the most used StarlingX
repos.
Test Plan: N/A
Story: 2010814
Task: 48378
Change-Id: I3323f10f9cd983a5ff12f846e4c14af8cebbbd2f
Signed-off-by: Roger Ferraz <rogerio.ferraz@encora.com>
This update adds support to the Debian kickstarts to search the
install kernel command line for the multi-drivers-switch= option.
If that option is found, then the full option with the specified
version, ex: multi-drivers-switch=2.54 , will be added to the
disk boot kernel command line options.
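The propagation rule can be sketched as below. The kickstarts are shell scripts; this Python version is only an illustration, the function names are hypothetical, and treating an option with an empty value as absent is an assumption.

```python
from typing import Optional

OPTION = "multi-drivers-switch="


def find_driver_switch(cmdline: str) -> Optional[str]:
    """Return the full 'multi-drivers-switch=<version>' token from a
    kernel command line, or None if it is absent or has no value."""
    for token in cmdline.split():
        if token.startswith(OPTION) and len(token) > len(OPTION):
            return token
    return None


def disk_boot_options(install_cmdline: str, base_options: str) -> str:
    """Append the option to the disk boot options only when it was
    present on the install command line."""
    option = find_driver_switch(install_cmdline)
    return f"{base_options} {option}" if option else base_options
```

An install command line containing multi-drivers-switch=2.54 would carry that exact token over; without it, the disk boot options stay unchanged.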
Test Plan:
PASS: Verify Build and Install SX system
PASS: Unit test of code block function over an install
PASS: Verify if the multi-drivers-switch parameter exists on
the node install command line then the same option is
propagated to the disk boot command line.
PASS: Verify the opposite of the above is true.
Closes-Bug: 2026893
Change-Id: I648b16dbc5aa2a0a7b8368c1b89a5d46418ab1e5
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
Accept and apply the Intel driver version parameter to the pxeboot conf
file for nodes to be installed (including reinstall and upgrade).
TCs:
Observed the pxeboot cfg file for a new host is configured
with param multi-drivers-switch=<ver from service parameter>
Story: 2010651
Task: 48276
Signed-off-by: Bin Qian <bin.qian@windriver.com>
Change-Id: I6aebff98a5bb831de82f6da07ac53978b17f8caf
This commit introduces support for installing CentOS-based previous
release (21.12) in Debian.
There are two main components in this commit:
1. Handle the label change for the backup partition:
Platform Backup in 21.12 vs 'platform_backup' in Debian
This is accomplished by ignoring the label/partlabel entirely when
searching for an existing backup partition. Instead, the partition
GUID is used to locate the partition. The GUID does not change
between distributions.
2. Use pre-bundled CentOS kickstarts for subcloud installs in Debian
Since modifications are required to the CentOS kickstart files for the
above, we copy the relevant pre-bundled centos kickstarts (for miniboot
and prestaged ISO only) into a centos-specific directory under the
Debian /var/www/pages/feed/rel-${platform_release}/kickstart directory
structure, in order to be available for the gen-bootloader-iso-centos.sh
utility. These files are included in the platform-kickstarts .deb
package.
NOTES on how the pre-bundled files are created:
- We cannot use the files under bsp-files/kickstarts/*.cfg, since they
are not valid for 21.12 release (e.g. they refer to /var/www)
- Instead, files were taken from a valid 21.12 release and manually
merged with the pre-bundled files generated from this repo
GOING FORWARD:
Only the bundled files at kickstart/files/centos/*.cfg will be
maintained. At a later time, we may choose to remove the partial
kickstarts under bsp-files/kickstarts/*.cfg, since they are not used
anywhere.
Test Plan
PASS:
- Build full ISO, verify that the
/var/www/pages/feed/rel-23.09/kickstart/centos directory is populated
with the pre-bundled kickstart files
- Verify previous-release CentOS subcloud install/deployment under
Debian (requires patched 22.12 load)
- Verify current-release subcloud install under Debian
Story: 2010611
Task: 48268
Signed-off-by: Kyle MacLeod <kyle.macleod@windriver.com>
Change-Id: I1b7f76212e222dea7c6e586e4e9492f8a86a955e
This commit supports the developer use-case of a system controller
ostree repo configured with gpg-verify=false. In such cases, the
subcloud ostree repo instances must also be configured with
gpg-verify=false, or the ostree pull will fail.
When the boot parameter 'instgpg=0' is detected, we configure the
ostree repo with gpg-verify=false. The instgpg=0 parameter is also
detected by LAT /install, which handles the LAT side of the ostree
repo configuration.
Test Plan:
PASS:
- Install subcloud with non-GPG signed ostree commits present on system
controller. Ensure the ostree pull is successful on subcloud, with a
successful install.
- Ensure normal subcloud installation is successful
Story: 2010611
Task: 48309
Signed-off-by: Kyle MacLeod <kyle.macleod@windriver.com>
Change-Id: I40a0823ed1fc868aa5d4fb7686f1648440664037
Mtce polls/queries the remote host for mtcAlive messages
for 42 x 100 ms intervals over unlock or host failed cases.
Absence of mtcAlive during this (~5 sec) period indicates
the node is offline.
However, in the rare case where shutdown is slow, 5 seconds
is not long enough. Rare cases have been seen where 7 or 8
second wait time is required to properly declare offline.
To avoid the rare transient 200.004 host alarm over an
unlock operation, this update increases the mtce host
offline window from 5 to 10 seconds (approx) by modifying
the mtce configuration file offline threshold from 42 to 90.
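The arithmetic behind the two thresholds can be checked with a short sketch. The function name is hypothetical; the 100 ms poll interval and the thresholds 42 and 90 come from the text above.

```python
def offline_window_secs(threshold: int, poll_interval_ms: int = 100) -> float:
    """Length of the unchallenged offline window in seconds:
    number of polls times the poll interval."""
    return threshold * poll_interval_ms / 1000.0


old_window = offline_window_secs(42)  # 4.2 seconds
new_window = offline_window_secs(90)  # 9.0 seconds
```

The raw products are 4.2 and 9.0 seconds, which the message rounds to roughly 5 and 10 seconds; either way the new window covers the 7 to 8 second slow-shutdown cases.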
Test Plan:
PASS: Verify unchallenged failed to offline period to be ~10 secs
PASS: Verify algorithm restarts if there is mtcAlive received
anytime during the polls/queries (challenge) window.
PASS: Verify challenge handling leads to a longer but
successful offline declaration.
PASS: Verify above handling for both unlock and spontaneous
failure handling cases.
Closes-Bug: 2024249
Change-Id: Ice41ed611b4ba71d9cf8edbfe98da4b65dcd05cf
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
This commit fixes the missing support for bootstrap_address_prefix
in the miniboot ip -6 address add command. We check for the provided
prefix value parsed from the boot arguments and make sure that it is
applied if present. Note that the bootstrap_address_prefix is a
mandatory install value, so it will be provided. However, we leave the
capability for it to be missing, in order to de-risk this commit.
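The optional-prefix handling can be sketched as follows. miniboot.cfg is a shell kickstart; this Python helper, its name and the example values are hypothetical and only show how the address argument changes when a prefix is present.

```python
from typing import List, Optional


def ipv6_addr_add_cmd(address: str, device: str,
                      prefix: Optional[str] = None) -> List[str]:
    """Build the argv for 'ip -6 address add', appending '/<prefix>'
    to the address only when a prefix was parsed from the boot args."""
    target = f"{address}/{prefix}" if prefix else address
    return ["ip", "-6", "address", "add", target, "dev", device]
```

With a prefix of 64 the address argument becomes "<address>/64"; with no prefix the bare address is passed, which preserves the previous behaviour.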
Additionally, a workaround is included for full support of
multi-drivers-switch given in the boot arguments. When this argument is
given we parse out the kernel module version and use it to replace the
current kernel modules for ice/i40e/iavf with the modules of the given
version.
Test Plan
PASS:
- Replace miniboot.cfg at /var/miniboot/kickstart-override/miniboot.cfg
on target lab system requiring multi-drivers-switch=cvl-2.54:
- Using subcloud install-value
extra_boot_params: multi-drivers-switch=cvl-2.54,
verify that the subcloud switches to the legacy kernel modules
and the subcloud is able to properly configure its IP address and
perform the ostree pull operation from the system controller.
- Install subcloud with no extra_boot_params, verify that the
bootstrap_address_prefix is properly applied. Verify no regression.
Closes-Bug: 2023407
Signed-off-by: Kyle MacLeod <kyle.macleod@windriver.com>
Change-Id: I4f3d8e2f240f2aa061de30014cf39dfb9b42a035
Currently sssd is not configured and running on storage nodes so
ldap users can't login to storage nodes. This update creates sssd
pmon config file so that sssd is running on storage nodes.
Test Plan:
PASS: System with storage nodes deployment
PASS: In storage nodes, verify that the following config file exist:
/etc/pmon.d/sssd.conf
PASS: In storage nodes, verify that sssd is running by
systemctl status sssd
PASS: In storage nodes, verify ldap users are accessible by
getent passwd
Closes-Bug: 2023399
Depends-On: https://review.opendev.org/c/starlingx/stx-puppet/+/885878
Change-Id: I2e85873c3ddd18bab68365a58b5a8617eb1b2766
Signed-off-by: Andy Ning <andy.ning@windriver.com>
The extra_boot_params install value is presented as a single boot
parameter in the initial miniboot ISO boot. This kickstart change
translates the install value into proper disk boot kernel options, so
that the provided extra_boot_params are applied as boot options for the
main /boot parameters in grub and syslinux.
Although the extra_boot_params value must be a single string, multiple
extra boot parameters can be specified by separating individual args
by a comma. Example: extra_boot_params=arg1=1,arg2=2. This change splits
the args by comma and ensures that the kernel boot options are separate
for the main boot.
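The comma split can be sketched in a few lines. The real change lives in the shell kickstart; this Python function and its name are hypothetical, and skipping empty items (for example from a trailing comma) is an assumption.

```python
def split_extra_boot_params(value: str) -> str:
    """Turn the single comma-separated extra_boot_params value into
    space-separated kernel options, skipping empty items."""
    args = [arg.strip() for arg in value.split(",") if arg.strip()]
    return " ".join(args)
```

The input here is the value after "extra_boot_params=", so "arg1=1,arg2=2" becomes the two kernel options "arg1=1 arg2=2".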
Test Plan
PASS:
- Verify that extra_boot_params is parsed into separate kernel options
- Verify that disk kernel options are applied when subcloud is installed
(i.e., the final install boots with the configured extra options)
- Verify comma-separated input values are translated into proper
kernel options:
- extra_boot_params=arg1=1,arg2=2 -> kernel options: arg1=1 arg2=2
- extra_boot_params=arg1=1 -> kernel options: arg1=1
- extra_boot_params=arg1 -> kernel options: arg1
Partial-Bug: 2023407
Depends-On: https://review.opendev.org/c/starlingx/distcloud/+/885758
Change-Id: I8ed10f7ffe8af51ae7b77eaa398b824347a0a998
Signed-off-by: Kyle MacLeod <kyle.macleod@windriver.com>
This commit fixes the detection of www/pages/feed/rel-xx.x/install_uuid
via device '/dev/cgts-vg/var-lv'. There was a bug which was always
mounting the same device, rather than the proper device_list.
The code is also slightly refactored for simplification and clarity.
Test Plan
PASS:
- Generate ISO using gen-prestaged-iso.sh without --force-install option
- Verify installation failure (drop to boot prompt) if previous
subcloud installation exists
- Verify successful subcloud installation if no previous
subcloud installation exists
- Generate ISO using gen-prestaged-iso.sh with --force-install option
- Verify successful installation regardless if previous subcloud
installation exists or not
Closes-Bug: 2020526
Change-Id: Ib83d72fa07335ffa29d365da7813b226c4ef310b
Signed-off-by: Kyle MacLeod <kyle.macleod@windriver.com>
This commit handles the relocation of ostree_repo prestaging data from
/opt/platform-backup to /opt/platform-backup/<release>. The miniboot.cfg
kickstart now looks for prestaged data in the release-specific location.
We also handle the backup partition name change across CentOS/Debian. In
the case of a downgrade the CentOS miniboot kickstart code is updated to
use the partition GUID rather than LABEL or PARTLABEL. The GUID is
constant across all releases and is therefore a more reliable indicator
of the backup partition.
Tech debt: Fix the arbitrary wait sleep calls used when configuring
VLAN addressing. Now uses the more efficient wait_for_interface
approach for the VLAN links.
Test Plan
PASS:
- Boot with prestaged data under /opt/platform-backup/<release>/
Ensure boot/install successfully uses prestaged data.
- Boot into older release under prestaged /opt/platform-backup/21.12
- Test moving from 22.12 -> 21.12 and 21.12 -> 22.12
- Ensure backup partition is found using GUID approach.
- Ensure boot/install successfully uses prestaged data.
- Boot into both current and older release with no prestaged data
- Test moving from 22.12 -> 21.12 and 21.12 -> 22.12
- Ensure boot/install is successful.
- Boot subcloud with bootstrap_vlan, ensure that the wait_for_interface
calls properly wait until the link is up.
Story: 2010611
Task: 47943
Depends-On: https://review.opendev.org/c/starlingx/distcloud/+/880789
Change-Id: I381b60285e9bfc375f01f45b79174b71da7f0565
Signed-off-by: Kyle MacLeod <kyle.macleod@windriver.com>