4161 Commits

Author SHA1 Message Date
Kyle MacLeod
da21e1f9f7 Fix cert-mon PriorityQueue regression in python3
In python3 the PriorityQueue raises an exception
due to

TypeError: '<' not supported between
        instances of 'SubcloudAuditData' and 'SubcloudAuditData'

The fix is to include a __lt__ method in SubcloudAuditData.
A timestamp field is added (primarily used in the tuple added to the
queue, but easy enough to include here) in order to aid in the sorting.
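
A minimal sketch of the kind of fix described above. It assumes a simplified
SubcloudAuditData with hypothetical "name" and "timestamp" fields; the real
class in cert-mon may look different.

```python
import queue
import time


class SubcloudAuditData(object):
    """Simplified stand-in; the field names are assumptions for illustration."""

    def __init__(self, name, timestamp=None):
        self.name = name
        self.timestamp = time.time() if timestamp is None else timestamp

    def __lt__(self, other):
        # In Python 3, tuples with equal leading elements are ordered by
        # comparing the payload objects, so the payload needs an ordering.
        return self.timestamp < other.timestamp


audit_queue = queue.PriorityQueue()
# Equal priorities force the queue to compare the payload objects.
audit_queue.put((100, SubcloudAuditData("subcloud2", timestamp=2.0)))
audit_queue.put((100, SubcloudAuditData("subcloud1", timestamp=1.0)))

first = audit_queue.get()[1]
second = audit_queue.get()[1]
```

Without __lt__, the second put() is where Python 3 would raise the TypeError
quoted above, because both tuples start with the same priority.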

Test Plan:

PASS: trigger cert-mon audit for subclouds. Verify that the exception
is not raised, and that subclouds are properly enqueued for audit.

Closes-Bug: 1992680
Change-Id: Ibaa9a421eb809edc434793bc7e8ae92691be021f
Signed-off-by: Kyle MacLeod <kyle.macleod@windriver.com>
2022-10-12 13:03:56 -04:00
Zuul
94045cefc1 Merge "Fix failure to add OIDC service parameters" 2022-10-11 21:14:59 +00:00
Jorge Saffe
9e956b764f Fix failure to add OIDC service parameters
This change fixes the OIDC service-parameter-add
operation.

Test Plan:
* CENTOS and DEBIAN distro:
  - Fresh Install with AIO-SX.
  - Add OIDC service-parameter.
  - Apply changes on kubernetes service.
  - Verify cluster health and configuration.

Closes-Bug: 1992208

Signed-off-by: Jorge Saffe <jorge.saffe@windriver.com>
Change-Id: I3ecc17606531068a9d2c4371b081c1661d47670f
2022-10-07 17:25:33 -04:00
Zuul
62dea5bcea Merge "Add sysinv upgrades support for Kubernetes 1.24.4" 2022-10-07 19:32:07 +00:00
Zuul
f3a9f01794 Merge "create vim_db locally and move to nfs device" 2022-10-07 18:11:58 +00:00
Zuul
202e440267 Merge "Update delete_load.sh permission on Debian" 2022-10-07 18:11:52 +00:00
Zuul
08e91725b8 Merge "Alarm Hostname controller function has in-service failure reported" 2022-10-07 17:37:47 +00:00
Junfeng (Shawn) Li
0a2d8d95c7 Update delete_load.sh permission on Debian
Details: This is to update this script with execution permission.
It will be run to clean up the load after the upgrade.

Test Plan:

PASS: built the iso and verified its permission during upgrade
PASS: ran the file to verify the load is cleaned

Task: 46435
Story: 2009303

Signed-off-by: Junfeng (Shawn) Li <junfeng.li@windriver.com>
Change-Id: I3276077b24c9314f8f1ed0f5eff02848446d9869
2022-10-06 15:04:30 -04:00
Davi Frossard
da037fcd12 Add missing device image cache directory
N3000 image update is failing due to invalid cache folder

Test plan (Debian):
[PASS] Build, install and verify N3000 image update

Story: 2010087
Task: 45628

Signed-off-by: Davi Frossard <dbarrosf@windriver.com>
Change-Id: I6adb378a7de4599cc0b1612692282c7b50e36a85
2022-10-05 20:14:24 +00:00
Zuul
eb3b253ee5 Merge "Revert "Disable openldap CA cert installation for upgrade"" 2022-10-05 15:48:50 +00:00
Girish Subramanya
96fa364817 Alarm Hostname controller function has in-service failure reported
When compute services remain healthy:
 - listing alarms shall not refer to the below Obsoleted alarm
 - 200.012 alarm hostname controller function has an in-service failure

This update deletes the definition of the obsolete alarm 200.012 from
the events.yaml file and updates any references to this alarm
definition.
A bug also needs to be raised to track the doc change.

Test Plan:

Verify on a Standard configuration no alarms are listed for hostname
controller in-service failure
Code (removal) changes exercised with fix prior to ansible bootstrap
and host-unlock and verify no unexpected alarms
Regression:

There is no need to test the alarm referred to here, as it is obsolete

Closes-Bug: 1991531

Signed-off-by: Girish Subramanya <girish.subramanya@windriver.com>

Change-Id: I255af68155c5392ea42244b931516f742fa838c3
2022-10-05 10:28:26 -04:00
Zuul
90d16558a6 Merge "Removed unused code in upgrade." 2022-10-05 14:03:00 +00:00
Andy Ning
44db9dea36 Revert "Disable openldap CA cert installation for upgrade"
This reverts commit 6a704b12b86af12176475563b50eef867b3a2a0d.
This breaks system deployment.

Signed-off-by: Andy Ning <andy.ning@windriver.com>
Change-Id: I3be9a666d097a87fd268dcb091f5505b70d39242
2022-10-05 09:22:00 -04:00
Zuul
6d75517946 Merge "Merge sysinv_fpga_agent with sysinv_agent" 2022-10-04 22:09:37 +00:00
Zuul
171c67cafc Merge "Debian: fix oam VLAN interface MTU" 2022-10-04 18:22:03 +00:00
Caio Bruchert
765338d352 Debian: fix oam VLAN interface MTU
When Debian's ifup tool runs for an IPv6 VLAN interface it is not setting
the MTU found in the configuration file. Instead it sets it to the
underlying interface's MTU. If that's a jumbo MTU value, it can cause
packet drops during file transfer and installation on controller-1
to fail.

This fix uses post-up configuration to set the correct MTU value to
mimic CentOS's ifup tool behavior.
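
As a rough illustration of the post-up approach, the sketch below renders a
post-up line for an ifupdown interface stanza. The helper name and the
interface name are hypothetical, and the exact line generated by the actual
fix may differ.

```python
def render_post_up_mtu(ifname, mtu):
    """Return an ifupdown 'post-up' line that sets the MTU on ifname.

    Hypothetical helper for illustration only.
    """
    mtu = int(mtu)
    if mtu <= 0:
        raise ValueError("MTU must be positive: %d" % mtu)
    # 'ip link set dev <ifname> mtu <value>' is standard iproute2 syntax.
    return "post-up ip link set dev %s mtu %d" % (ifname, mtu)


line = render_post_up_mtu("vlan100", 1500)
```

Because post-up commands run after the interface is brought up, the configured
MTU replaces the value inherited from the underlying interface.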

Test Plan:
    PASS: check that the VLAN's MTU is correct
    PASS: installation on standard lab

Closes-Bug: 1991582

Signed-off-by: Caio Bruchert <caio.bruchert@windriver.com>
Change-Id: Id898a0eb132abe6838ddc81ff0adb4401c33d731
2022-10-04 13:07:12 -04:00
Zuul
a427afa1c0 Merge "Disable openldap CA cert installation for upgrade" 2022-10-04 14:31:53 +00:00
Andy Ning
6a704b12b8 Disable openldap CA cert installation for upgrade
66-create-open-ldap-certificate.py calls "system
certificate-install -m ssl_ca" to install the openldap CA cert.
Since sysinv is blocked waiting for the script to return, it
won't process the system certificate install call, causing the
call to eventually time out and the script to fail.

This change disables openldap CA cert installation in the upgrade
script as a temporary fix. A proper solution will follow.

Test Plan:
PASS: DX system upgrade at least to the point of upgrade activation.

Story: 2009834
Task: 46455
Depends-On: https://review.opendev.org/c/starlingx/ansible-playbooks/+/859669
Signed-off-by: Andy Ning <andy.ning@windriver.com>
Change-Id: I3eee375936b13f0f666bfd9bcf964e35a088834b
2022-10-04 09:42:40 -04:00
Davi Frossard
6d4e2681a0 Merge sysinv_fpga_agent with sysinv_agent
Merging sysinv-fpga-agent service with sysinv-agent
in order to reduce overall OS overhead.

Replaced the calls to "wait_for_n3000_reset()" and "wait_for_host_uuid()"
in the previous fpga-agent-manager with checks that ensure fpga devices
are reset and host_uuid is available in agent-manager. Also, the content
of the "fpga_pci_update()" and "report_fpga_inventory()" methods is
inserted directly into the body of the "agent_audit()" method.

Test Plan:

On AIO-DX env (CentOS):
<sysinv-fpga-agent tests>
PASS: Check FPGA pod and its resources.
PASS: Check FPGA pod and its resources after lock/unlock.
PASS: Check FPGA pod and its resources after the system reboot.
PASS: Verify image upload with non-functional image with
retimer-included
PASS: Verify retimer_a_version and retimer_b_version after applying
BMC image with re-timer and bmc
PASS: Verify firmware update for BMC and retimer image with
retimer-include=False
PASS: Verify apply BMC image without re-timer first and then BMC
image with re-timer, only latest image is kept in
device-image-state-list
PASS: Test accelerator configuration is persistent after lock/unlock.
PASS: Test to verify that the accelerator configuration is persistent
after a graceful reboot.

<sysinv-agent tests>
PASS: Verify alarms raised by PTP feature
PASS: Verify the configuration and run of single ptp-instance
PASS: Verify the configuration and run of single phc2sys
PASS: Verify PTP CLI commands

On AIO-SX env (Debian):
PASS: Check FPGA pod and its resources.
PASS: Check FPGA pod and its resources after lock/unlock.
PASS: Check FPGA pod and its resources after system reboot.
PASS: Check if FPGA device can be detected, configured.
PASS: Test accelerator configuration is persistent after lock/unlock.
PASS: Test to verify that the accelerator configuration is persistent
after graceful reboot.

Story: 2010087
Task: 45628

Signed-off-by: Davi Frossard <dbarrosf@windriver.com>
Change-Id: I83edd261898498344001ca90bb53a5f65e66728c
2022-10-03 14:12:28 -04:00
Zuul
af35377f56 Merge "Add type checks to AppImageParser" 2022-10-03 14:27:44 +00:00
Zuul
ab0a2d38aa Merge "Add sssd service parameters for ldap domains" 2022-10-03 14:12:15 +00:00
Carmen Rata
2c74c40a04 Add sssd service parameters for ldap domains
This commit adds sysinv service parameters configuration for sssd
support of remote ldap domains. Remote ldap domains get configured
with default configuration. A subset of the domain parameters
that are specific to the ldap server will be added using
the service parameters mechanism.
A maximum of 3 AD remote ldap domains are allowed: ldap-domain1,
ldap-domain2, ldap-domain3.
Validation methods are implemented for the service parameters.
Parameter Validation will be enabled in the next code drop.
In this commit service parameters are applied to only controllers.
Worker and Storage node personalities will be added in a subsequent
commit.

Tests performed:
PASS: Successful install in AIO-SX system configuration.
PASS: The default remote ldap domain configuration gets populated in
sssd.conf.
PASS: sssd service is successfully started.
PASS: Remote ldap domain service parameters are added and applied at
runtime.
PASS: Verify connection to the new ldap server using ldapsearch.
PASS: Verify ldap users have been discovered and cached in /etc/passwd
PASS: Verify remote ssh connection for an AD ldap user.

Story: 2009834
Task: 46364

Signed-off-by: Carmen Rata <carmen.rata@windriver.com>
Change-Id: I28df5059acd0a5e4a9f4368eb3cc8b0544d36333
2022-10-03 02:31:10 +00:00
Leonardo Fagundes Luz Serrano
2f1d2d8147 Debian: Remove conf files from etc-pmon.d
Removed conf files from /etc/pmon.d/
as they are being moved to another location.

This is part of an effort to allow pmon conf files
to be selected at runtime by kickstarts.

The change is debian-only, since centos support
will be dropped soon.
Centos' pmon conf files remain in /etc/pmon.d/

Test Plan:
PASS - deb doesn't install anything to /etc/pmon.d/
PASS - AIOSX unlocked-enabled-available
PASS - Standard 2+2 unlocked-enabled-available

Story: 2010211
Task: 46301

Depends-On: https://review.opendev.org/c/starlingx/metal/+/855095

Signed-off-by: Leonardo Fagundes Luz Serrano <Leonardo.FagundesLuzSerrano@windriver.com>
Change-Id: I1055170e1d5c4ff3a21350c6c5a54b31b6fc57bb
2022-09-30 13:46:19 -03:00
Thales Elero Cervi
78040d2017 Add type checks to AppImageParser
Recent changes [1] to the AppImageParser _find_images_in_dict and
generate_download_images_list methods caused this code to fail with both
AttributeError and TypeError when the stx-openstack application is being
uploaded.

This change adds extra protection against these types of errors and
re-establishes the flow for generating the stx-openstack image list based
on its overrides.
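
A minimal sketch of this kind of type guard, not the actual AppImageParser
code. The function name and the "repository"/"tag" key layout are assumptions
made for illustration.

```python
def find_images(node):
    """Recursively collect 'repository:tag' strings from nested overrides.

    Hypothetical helper; the key names are assumptions.
    """
    images = []
    if isinstance(node, dict):
        repo, tag = node.get("repository"), node.get("tag")
        if isinstance(repo, str) and isinstance(tag, (str, int, float)):
            images.append("%s:%s" % (repo, tag))
        for value in node.values():
            images.extend(find_images(value))
    elif isinstance(node, list):
        for item in node:
            images.extend(find_images(item))
    # Strings, numbers and None are skipped instead of raising
    # AttributeError or TypeError.
    return images


overrides = {
    "images": {"tags": None,
               "app": {"repository": "docker.io/example/app", "tag": "1.0"}},
    "replicas": 3,
    "extra": ["x", {"image": {"repository": "r", "tag": 2}}],
}
found = find_images(overrides)
```

Checking the type of each node before calling dict or list methods on it is
what keeps unexpected values in the overrides from aborting the scan.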

It also adds a new image resource to the TestKubeAppImageParser unit tests,
using an OpenStack resource captured while debugging the original
error. It should prevent this issue from happening again in future changes
to the AppImageParser logic.
The original change to generate_download_images_list, for example, would
fail the test:
    * TestKubeAppImageParser.test_generate_download_images_list

[1] https://review.opendev.org/c/starlingx/config/+/858762

Test Plan:
PASS - Locally execute unit tests: TestKubeAppImageParser
PASS - Build the sysinv package with this change
PASS - Upload stx-openstack app
PASS - Apply stx-openstack app

Closes-Bug: 1991115

Signed-off-by: Thales Elero Cervi <thaleselero.cervi@windriver.com>
Change-Id: I8a1384bfefd12f8a893249853cbeae3a9d3661e0
2022-09-29 16:57:35 -03:00
Bin Qian
2a84c3659f create vim_db locally and move to nfs device
This change avoids an intermittent file lock error when
creating the vim database directly on the nfs device.

A safer (and more efficient) way is to create the database
in a local temp directory and copy it to the nfs mount path.

Also add audit code to determine whether the database copied to
nfs still has the file lock issue, and report it in the log.

Note that the database does not need to be opened over nfs mount,
so the file lock failure issue would not impact the system.
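
The create-locally-then-copy approach can be sketched as follows. This
assumes the database is a SQLite file; the function name, paths and schema
are hypothetical and are not taken from the actual change.

```python
import os
import shutil
import sqlite3
import tempfile


def create_db_then_copy(dest_path, schema_sql):
    """Create a SQLite database in a local temp dir, then copy it to dest_path."""
    tmp_dir = tempfile.mkdtemp()
    try:
        local_path = os.path.join(tmp_dir, os.path.basename(dest_path))
        conn = sqlite3.connect(local_path)
        try:
            conn.executescript(schema_sql)
            conn.commit()
        finally:
            conn.close()
        # Only a plain file copy touches the destination, so SQLite never
        # takes file locks on the (possibly NFS-backed) target path.
        shutil.copyfile(local_path, dest_path)
    finally:
        shutil.rmtree(tmp_dir, ignore_errors=True)
    return dest_path
```

In this sketch the destination would be a path on the nfs mount, and all
database writes happen on local disk before the copy.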

TCs:
   passed DX upgrade 22.06 to 22.12 Debian completed.

Closes-Bug: 1990544
Change-Id: Ib3f1dee3df4f0c240c919b3f5c3414a6b807b1de
Signed-off-by: Bin Qian <bin.qian@windriver.com>
2022-09-29 17:17:32 +00:00
Zuul
e09f3b901f Merge "Remove k8s versions older than 1.21" 2022-09-29 17:12:04 +00:00
Zuul
7f29eea70a Merge "debian: Remove package preset install for config" 2022-09-29 16:38:28 +00:00
Luis Eduardo Angelini Marquitti
20b80803c9 Removed unused code in upgrade.
Remove upgrade code specific to StX5 -> StX6 upgrades.

Test Plan:

PASS: AIO-SX Fresh Install
PASS: AIO-DX Fresh Install
PASS: Standard Fresh Install
PASS: AIO-SX Upgrade
PASS: AIO-DX Upgrade
PASS: Standard Upgrade

Story: 2009754
Task: 45456

Co-Authored by: Lucas Soares Pellizzaro <lucas.soarespellizzaro@windriver.com>

Signed-off-by: Luis Eduardo Angelini Marquitti <luis.eduardoangelinimarquitti@windriver.com>
Change-Id: Ifa9afcdcde7251738f6598d2c33936202d0cd3b2
2022-09-28 22:08:34 -04:00
rsivanan
19f721037c Remove k8s versions older than 1.21
k8s versions older than 1.21 are no longer required. This change removes
the older k8s versions 1.18.1, 1.19.13 and 1.20.9.

Test-plan: Debian
PASS: system kube-version-list doesn't show the old versions - 1.18.1, 1.19.13 and 1.20.9

Story: 2010301
Task: 46416

Signed-off-by: rsivanan <rameshkumar.sivanandam@windriver.com>
Change-Id: Ia1dc4b105e091e83f3bcf8a5038f40ff4c29a7c1
2022-09-28 11:45:06 -04:00
Zuul
c84c140fec Merge "Change periodic_tasks timer to be dynamic instead of fixed" 2022-09-28 14:01:45 +00:00
Junfeng (Shawn) Li
d2be5f8490 Add platform-upgrade cmd to /usr/bin/
Details: Add platform-upgrade cmd to /usr/bin/ during Debian
installation.
This is a fix for https://review.opendev.org/c/starlingx/config/+/853676

Task: 45858
Story: 2009303
Signed-off-by: Junfeng (Shawn) Li <junfeng.li@windriver.com>
Change-Id: Iaf0722b063ac2b06c30b59f7ba266ea1573a463d
2022-09-27 14:40:40 -04:00
Zuul
e228d990f2 Merge "Add cli command to wrap platform upgrade playbook" 2022-09-27 16:24:04 +00:00
Charles Short
ddf7f070dc debian: Remove package preset install for config
Remove the installation of per-package preset installs
since they are centrally managed now by the ISO install
for the following packages:

- config-gate-worker
- config-gate
- controllerconfig
- sysinv-agent
- sysinv-fpga-agent

Story: 2009968
Task: 46406

Test Plan

PASS Build package
PASS Build ISO
PASS Check for non-existent preset file in /etc/systemd/system-preset

Depends-On: https://review.opendev.org/c/starlingx/integ/+/853653

Signed-off-by: Charles Short <charles.short@windriver.com>
Change-Id: I4204f75d3a7cfc25ab8b5f303d12023eafc212f0
2022-09-27 08:20:41 +00:00
Zuul
e97c9c922b Merge "Ensure osd conf intact after migrating to 22.12" 2022-09-27 04:06:15 +00:00
Zuul
bf8888e028 Merge "Remove centos_helm.inc" 2022-09-26 19:41:54 +00:00
Davlet Panech
38f123b674 Remove centos_helm.inc
This file references helm chart packages from outside of this repo:
* stx-openstack-helm
* stx-monitor-helm

These packages used to be in this repo (under kubernetes/) but have
since been moved to independent repos:
* starlingx/openstack-armada-app
* starlingx/monitor-armada-app

TESTS
=========================
Build packages, then run build-helm-charts.sh and make sure
"stx-openstack-helm" & "stx-monitor-helm" tarballs are generated.

Story: 2010226
Task: 46421

Depends-On: https://review.opendev.org/c/starlingx/openstack-armada-app/+/859326
Depends-On: https://review.opendev.org/c/starlingx/monitor-armada-app/+/859329
Signed-off-by: Davlet Panech <davlet.panech@windriver.com>
Change-Id: I674969f147e48658c7e7f2b36db109e73adc480c
2022-09-26 14:11:08 -04:00
Bin Qian
53600f8b90 Ensure osd conf intact after migrating to 22.12
This change is to ensure the disk uuid is preserved during
data migration. The disk uuid is used in storage configuration
that links to osd.

TCs:
    1. complete upgrade from 22.06 Centos to 22.12 Debian on AIO-DX
       with ceph configuration. No ceph osd failure.
    2. complete upgrade from 22.06 Centos to 22.12 Debian on AIO-SX
       with ceph configuration. No ceph osd failure.
Story: 2009303
Task: 46300

Signed-off-by: Bin Qian <bin.qian@windriver.com>
Change-Id: Ief1e5cd9588aca8148106f2d95b7e60989f6bb8b
2022-09-26 16:59:57 +00:00
Zuul
9bb3d419ac Merge "Add support for new kube app images format" 2022-09-26 15:27:52 +00:00
Zuul
d7de42cbc5 Merge "Update image tag for n3000-opae" 2022-09-23 16:28:28 +00:00
Mohammad Issa
70235b4221 Update image tag for n3000-opae
Changed image tag from stx.6.0-v1.0.1 to stx.8.0-v1.0.2

Story: 2009831
Task: 46404
Depends-On: https://review.opendev.org/c/starlingx/root/+/857468

Signed-off-by: Mohammad Issa <mohammad.issa@windriver.com>
Change-Id: I2431dce863cd24a7fccdb2868a73ba754b407d72
2022-09-23 14:16:13 +00:00
Jim Gauld
c8dba67b85 Add sysinv upgrades support for Kubernetes 1.24.4
This adds sysinv upgrades support for Kubernetes 1.23.1 to 1.24.4.

Test-plan: Debian
PASS: Install k8s 1.23.1, system kube-version-list shows
      v1.24.4 available

Story: 2010301
Task: 46321

Depends-On: https://review.opendev.org/c/starlingx/integ/+/857975

Signed-off-by: Jim Gauld <james.gauld@windriver.com>
Change-Id: Ic5de632bd9bbb1fc0d0faf24cebf929ce30c547e
2022-09-22 12:45:27 -04:00
Zuul
a8cc17e12b Merge "Better error message for sysinv forbidden error" 2022-09-21 15:53:48 +00:00
Steven Webster
9390d6f293 Add support for new kube app images format
In support of the STS silicom application, this
commit adds support for a new image format, which
may be found in the application charts (e.g. values.yaml).

For the STS application, the format is as follows:

Images:
  Tsyncd: quay.io/silicom/tsyncd:2.1.2.9
  TsyncExtts: quay.io/silicom/tsync_extts:1.0.0
  Phc2Sys: quay.io/silicom/phc2sys:3.1.1
  GrpcTsyncd: quay.io/silicom/grpc-tsyncd:2.1.2.9
  Gpsd: quay.io/silicom/gpsd:3.23.1
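
A small sketch of how image references could be pulled out of a mapping in
this format. The function name is hypothetical and this is not the parser
code added by the commit.

```python
def images_from_flat_mapping(values):
    """Return image references from a top-level 'Images' name-to-image mapping."""
    section = values.get("Images") if isinstance(values, dict) else None
    if not isinstance(section, dict):
        return []
    # Keep only string values; anything else is not an image reference.
    return [ref for ref in section.values() if isinstance(ref, str)]


# Values copied from the format shown above.
sts_values = {
    "Images": {
        "Tsyncd": "quay.io/silicom/tsyncd:2.1.2.9",
        "TsyncExtts": "quay.io/silicom/tsync_extts:1.0.0",
        "Phc2Sys": "quay.io/silicom/phc2sys:3.1.1",
        "GrpcTsyncd": "quay.io/silicom/grpc-tsyncd:2.1.2.9",
        "Gpsd": "quay.io/silicom/gpsd:3.23.1",
    }
}
sts_images = images_from_flat_mapping(sts_values)
```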

Testing:
  - Apply the app-sts-silicom application.  Ensure images
    can be extracted and downloaded from the helm charts.
  - Ensure the application is applied with no errors

Story: 2010213
Task: 45955

Signed-off-by: Steven Webster <steven.webster@windriver.com>
Change-Id: Iebe94fb77780e516697c2d98efb296aff415b22f
2022-09-21 15:04:59 +00:00
Zuul
65ab3c50ff Merge "Add sleep between sysinv-agent attempts to mount /opt/platform" 2022-09-20 18:07:59 +00:00
Jerry Sun
c9bd767fbd Better error message for sysinv forbidden error
This commit catches the forbidden error raised for a user trying to
run sysinv commands without enough privileges. The forbidden exception
used to not be caught, resulting in sysinv CLI returning a "None" to
the user. With this commit, a more useful error message is shown to
the user.

Test Cases:

PASS: Create a reader user and run "system modify --description".
      Ensure a meaningful error is returned since readers are not
      allowed to run system modify. Ensure no changes are made
      by checking "system show"

Change-Id: I11c407f06196962ba445c6d8a9f7591cc8a5cf05
Story: 2010149
Task: 46360
Signed-off-by: Jerry Sun <jerry.sun@windriver.com>
2022-09-20 12:54:15 -04:00
Andre Fernando Zanella Kantek
f7a729995c Add sleep between sysinv-agent attempts to mount /opt/platform
During the boot process, most notably in IPv6 installations,
sysinv-agent fails to mount /opt/platform/sysinv/${SW_VERSION}
to /mnt/sysinv in all 3 attempts. As a result, the agent starts
with default values and gets stuck trying to connect
to rabbitmq at localhost:5672 instead of controller-0:5672,
preventing correct node installation from controller-0.

This change introduces a 1s sleep between attempts, with tests
indicating that the 2nd attempt is usually successful.
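
The retry pattern can be sketched in Python as below. The helper and its
parameters are hypothetical; the actual change may implement the loop
differently, for example in the service's start script.

```python
import time


def retry_with_sleep(action, attempts=3, delay=1.0, sleep=time.sleep):
    """Call action() up to `attempts` times, sleeping `delay` seconds between tries.

    Returns the 1-based number of the attempt that succeeded, or 0 if all failed.
    """
    for attempt in range(1, attempts + 1):
        if action():
            return attempt
        if attempt < attempts:
            # Give the remote filesystem time to become available.
            sleep(delay)
    return 0
```

Here action would be a callable that tries the mount and returns True on
success. The sleep parameter is injectable only so that the delay can be
stubbed out in tests.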

It also adds an explicit dependency on remote-fs.target. This
did not take effect on the first reboot after install (the loop was
responsible for copying the config file), but on subsequent lock/unlock
the mount was not executed because /opt/platform was already available
(this was not happening without the dependency).

Test Plan:
[PASS] install an AIO-DX with 3 compute nodes in IPv6 network

Story: 2010211
Task: 46348

Signed-off-by: Andre Fernando Zanella Kantek <AndreFernandoZanella.Kantek@windriver.com>
Change-Id: Idab6b1887f38283ea8e0e05923a0ae4265c2e877
2022-09-20 10:36:48 -04:00
Zuul
3b20b4eead Merge "Fix certificate ssl_ca cert install by dc-orch sync" 2022-09-16 21:55:57 +00:00
Zuul
6792dae88a Merge "Create openldap certificate on upgrade" 2022-09-16 21:17:19 +00:00
Rei Oliveira
397e708a42 Fix certificate ssl_ca cert install by dc-orch sync
This commit fixes an issue where trying to install the same certificate
again results in a 'Cannot install certificate with same subject' error.
That is incorrect; the error should be raised only for a different
certificate with the same subject.
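
The intended behaviour can be sketched as follows. This is an illustration
only: the function name, the tuple layout and the use of a SHA-256 digest to
decide whether two certificates are identical are all assumptions.

```python
import hashlib


def check_certificate_install(new_subject, new_der, installed):
    """Decide how to handle a certificate install request.

    installed is a list of (subject, der_bytes) tuples (hypothetical layout).
    """
    new_digest = hashlib.sha256(new_der).hexdigest()
    for subject, der in installed:
        if subject != new_subject:
            continue
        if hashlib.sha256(der).hexdigest() == new_digest:
            # Same certificate again: not an error.
            return "already-installed"
        # Different certificate that reuses an existing subject.
        raise ValueError("Cannot install certificate with same subject")
    return "install"
```

The point of the check is that matching on the subject alone cannot tell a
re-install of the same certificate apart from a genuinely conflicting one.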

Test Plan:

PASS: Manage a subcloud and verify that it's able to synchronize certs
      without the 'Cannot install certificate with same subject' error
PASS: Try to install the same certificate multiple times and verify
      that no 'Cannot install certificate with same subject' error
      is returned
PASS: Try to install two different certificates with same subjects and
      verify that a 'Cannot install certificate with same subject' error
      is returned

Closes-Bug: 1990007

Signed-off-by: Rei Oliveira <Reinildes.JoseMateusOliveira@windriver.com>
Change-Id: I17861145f20b8e1ef61896c3271a96a28fe9ded2
2022-09-16 16:01:01 -03:00
Zuul
1e41b8e445 Merge "PTP: fix VLAN interface not working" 2022-09-16 14:12:09 +00:00