402 Commits (master)

Author SHA1 Message Date
Zuul e2e6af9d71 Merge "Add missing fields for Event traps" 2023-09-08 18:09:55 +00:00
Agustin Carranza 496702c932 Add missing fields for Event traps
wrsEventMessage traps are being managed as wrsAlarmMessages.
Events do not contain wrsEventProposedRepairAction and
wrsEventSuppressionAllowed fields, so they need to generate a default
value in FM.
This commit adds those fields in order to create the event traps
with the same format as the alarm traps.


Partial-bug: 2032844

Signed-off-by: Agustin Carranza <>
Change-Id: I58577406cc75c597f6f430015ddd51d0029d4539
2023-08-31 12:01:35 -03:00
Agustin Carranza b9cc232de3 Fix sphinx configuration for tox docs
After the Sphinx version update, the 'language = None' configuration
raises a warning (treated as an error). The default value is
'language = en', which has the same behavior.

Test plan
PASS: Run tox and check it ends successfully.

Closes-bug: 2033412
Related-Bug: #1976377

Change-Id: Ie003c0a02fcfc6f237ae5b3efb259de6748077ad
Signed-off-by: Agustin Carranza <>
2023-08-29 17:33:47 -03:00
Zuul b6727e579b Merge "Change context for 400.001 and 400.002 alarms" 2023-07-24 14:08:40 +00:00
Agustin Carranza 5b94faf575 Change context for 400.001 and 400.002 alarms
400.001 and 400.002 alarms are tagged for openstack but should be

This change tags them to starlingx so the documentation scripts are
able to classify them correctly.

Test plan
PASS: Check the parsing scripts end successfully.

Closes-bug: 2028379

Signed-off-by: Agustin Carranza <>
Change-Id: I8f5966d5b0a7b82198e4bc2e735fa4536a4cdd0a
2023-07-21 11:09:50 -03:00
Roger Ferraz 5d501eedc7 starlingx/fault README improvement
This story shall update the README file of a few most used StarlingX

Test Plan: N/A

Story: 2010814
Task: 48379

Change-Id: I98483245931c5d764c662f5283c59da0b2d69efe
Signed-off-by: Roger Ferraz <>
2023-07-19 10:48:29 -03:00
Kyale, Eliud 901437c0fe Define kernel mismatch alarms: 100.120, 100.121
2 new alarms:

- 100.120 - Controllers running mismatched kernels
  (minor, non-management affecting)

- 100.121 - Host not running the provisioned kernel
  (major, management affecting)

Part of the kernel switchover feature

Task: 48281
Story: 2010731

Test plan:
PASS - AIO-DX - install iso and bootstrap

PASS - AIO-DX - raise and clears both alarms: 100.120, 100.121

Change-Id: Ifb2df5658071d1a2fab42737267c621fc42d7136
Signed-off-by: Kyale, Eliud <>
2023-06-30 08:02:41 -04:00
Zuul dbffaed98f Merge "Remove 250.002 alarm from events.yaml file" 2023-06-20 20:17:23 +00:00
Agustin Carranza de22b91ed5 Link alarms to current documentation URL
Since the alarm documentation has been automated and the events.yaml
file is taken as source of truth for it, it is required to link
the alarms proposed repair action with a direct link to the
documentation for the users.

This change modifies the mentions of documentation to a proper link,
using Sphinx placeholders that are interpreted by the documentation

Test plan
* Build fm-doc package. Check that all parsing checks were run and
package was built successfully.

Closes-bug: 2022104

Change-Id: Iccb34e42ed80634d73cf7549e9230976579deef7
Signed-off-by: Agustin Carranza <>
2023-06-15 16:04:12 -03:00
Agustin Carranza 063021cd70 Remove 250.002 alarm from events.yaml file
The 250.002 alarm was deprecated a long time ago.
This change deletes it from the alarm list.

Test plan:
* Build fm-doc and fm-api packages.
* Check that all parsing checks were run and package was built


Closes-bug: 2024010

Signed-off-by: Agustin Carranza <>
Change-Id: I0180bd5addc1feae6e3e45edd24c1f50d6622e2c
2023-06-15 15:52:11 -03:00
Zuul e2b15b733d Merge "Remove 800.103 and modify documentation reference" 2023-06-08 14:25:57 +00:00
Agustin Carranza ad5e224dd2 Remove 800.103 and modify documentation reference
Some alarms refer to the "System Administration Manual", but this
document does not exist. It was changed to a generic documentation

The 800.103 alarm has been deprecated so it is deleted from the
events.yaml file.

Test plan
* Build fm-doc package. Check that all parsing checks were run and
package was built successfully.

Closes-bug: 2022104

Signed-off-by: Agustin Carranza <>
Change-Id: I4723d05e77983796a0f64c7242f5c2bcf4699763
2023-06-01 16:38:54 -03:00
Al Bailey a425fa6626 Support newer version of yaml
yaml.load will report a warning in pyyaml 5 and an error
in pyyaml 6 if it is called without a Loader argument.

The no-member pylint error was being suppressed due to
legacy http code, so now that is un-suppressed globally
and the yaml.load is replaced with yaml.safe_load

Test Plan:
  PASS: tox
  PASS: yaml.load('events.yaml') returns the same content
       as yaml.safe_load('events.yaml')

Story: 2010642
Task: 48157
Signed-off-by: Al Bailey <>
Change-Id: Ibac118cd9555f3334251b10a6b3e0a5986285854
2023-05-31 16:36:27 +00:00
Zuul 6172b8ee41 Merge "Add System Config Update orch alarms and events" 2023-05-23 21:55:36 +00:00
Agustin Carranza 6438565a4d Add parsing check when Context field is empty
This change adds a parsing check to ERROR if Context field is Empty.
Until now there had not been a requirement of non empty fields, so in
case this is needed in the future for other key/values, a collection is

Test plan
PASS: * Add/modify an alarm/log in events.yaml file with Context field
        set to <Empty>.
      * Run the checkEventYaml script and check it fails.
PASS: * Check that all the events in events.yaml file have the Context
        field set to a non empty value.
      * Run the checkEventYaml script and check it ends successfully.

Closes-bug: 2020381

Signed-off-by: Agustin Carranza <>
Change-Id: Ia267886dd49099525751165975fb5d291c0c6f82
2023-05-22 15:07:56 -03:00
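
The parsing check described in this commit can be sketched roughly as follows. This is a minimal illustration, not the actual checkEventYaml code: the record layout (a dict keyed by event ID), the `REQUIRED_NON_EMPTY` collection, and the `find_empty_fields` helper are all assumptions made for the example.

```python
# Hypothetical sketch: record layout and helper names are assumptions,
# not the real checkEventYaml implementation.

# Keys whose values must be non-empty. Kept as a collection so that
# other keys can be added later without changing the check itself.
REQUIRED_NON_EMPTY = ("Context",)


def find_empty_fields(events):
    """Return (event_id, key) pairs for required keys that are missing or empty."""
    problems = []
    for event_id, record in events.items():
        for key in REQUIRED_NON_EMPTY:
            value = record.get(key)
            if value is None or str(value).strip() == "":
                problems.append((event_id, key))
    return problems


events = {
    "100.120": {"Type": "Alarm", "Context": "starlingx"},
    "900.007": {"Type": "Alarm", "Context": ""},
}
for event_id, key in find_empty_fields(events):
    print(f"ERROR: {event_id} has an empty {key} field")
```

A linter job could fail the build whenever the returned list is non-empty.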
Yuxing Jiang 04d6b31d95 Add System Config Update orch alarms and events
System config update alarms are 900.6xx series

The new alarms are originated by a new type of vim strategy
orchestrating configuration update.

The new alarms are similar in numbering and wording to the
kube upgrade auto apply 900.4xx series alarms and logs.

System config update in-progress alarm is 900.010.
System config update aborted alarm is 900.011.

Story: 2010719
Task: 47947

Change-Id: Ieb6e68adf359ac7b0489d15bb33cb4b4a9f3ef3f
Signed-off-by: Yuxing Jiang <>
2023-05-19 16:57:16 -04:00
Zuul 82bcc2d0a5 Merge "Documentation is missing 900.007 alarm" 2023-05-11 20:31:56 +00:00
Agustin Carranza 888463cf06 Documentation is missing 900.007 alarm
Product Documentation is missing the alarm 900.007 'Kubernetes upgrade
in progress.'
That alarm has the Context field set to none. In order to be included
in stx documentation, it has to be set to Context: starlingx.

Test plan:
PASS: Run documentation generating scripts and check the alarm is now

Closes-bug: 2019146

Signed-off-by: Agustin Carranza <>
Change-Id: I4d4867e5299e3fb1eb37c9bcd3e53447d4f08ba5
2023-05-10 15:46:33 -03:00
Enzo Candotti 3ef3df74db Update 260.002 alarm to be non-mgmt-affecting
This commit is intended to update the 260.002 alarm. As the 'severity'
is set to 'minor', it is desired to classify it as
non-management-affecting by adjusting its Management_Affecting_Severity
value to 'none'.

Test Plan:
PASS: Build and install Debian package.

Story: 2010719
Task: 47938

Change-Id: Ie228191ebdda5f2651dab1309b929ae06bc1f7f6
Signed-off-by: Enzo Candotti <>
2023-05-09 09:34:42 -03:00
Zuul d6d68db460 Merge "Add alarm id for resources out of sync" 2023-05-05 21:07:46 +00:00
Enzo Candotti a891bdcfd0 Add alarm id for resources out of sync
This commit adds a new alarm id and definition for resources
that have INSYNC=False.

The alarm will be raised when a resource is not
synchronized during a process of update. It will be cleared when
the resource is synchronized again.

Test Plan:
 - Verify successful tox test and package build
 - Verify the alarm can be raised using FmClientCli

Story: 2010719
Task: 47910

Change-Id: I24a976ed4beaa8248df25fd97eeee27f5754b969
Signed-off-by: Enzo Candotti <>
2023-05-05 19:47:38 +00:00
Davlet Panech 25aa783f24 Fix github mirroring for this repo
Updating the rsa ssh host key based on:

Note: In the future, StarlingX should have a zuul job and
secret setup for all repos so we do not need to do this
for every repo.

Needed to rename the secret, because zuul fails if like-named
secrets have different values in different branches of the same

Partial-Bug: #2015246
Change-Id: Id0caa3ad6efbaed9fff904c6fab8ba35472ee6f5
Signed-off-by: Davlet Panech <>
2023-04-28 12:38:51 -04:00
Agustin Carranza dad8caed91 Fix Context value for some alarms
Some documentation generating scripts were introduced in order to avoid
manual intervention every time an alarm/log is changed/added/removed.
Those scripts required a way to know where each alarm/log belongs.
For that requirement, the field Context was introduced in previous
commits. During that development, the classification in the docs at
that time was taken as the source of truth, but it was

This commit modifies the values that were detected as wrong/outdated.
The scripts also require the value 'none' in the Context field for when
an alarm/log should not be included in the documentation but still be
defined in the events.yaml file. So the Context value is updated for
that case too.

Context incorrectly tagged as openstack and changed to starlingx:
* 900.006

Context incorrectly tagged as starlingx and changed to openstack:
* 100.105
* 100.112
* 100.113
* 300.001
* 300.002

Closes-bug: 2012981

Test plan
PASS: Since the Context field does not have impact in functionality,
      build and install fm-doc package successfully.
      Check the file in the filesystem contains this change.
PASS: Trigger random alarms and check FM functionality.

Signed-off-by: Agustin Carranza <>
Change-Id: I16f858bbb712349f08b2ceca33152e365b0ed733
2023-04-10 15:25:11 -03:00
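
As a rough illustration of how a documentation script might use the Context field, the sketch below groups event IDs by context and skips records tagged 'none'. The record layout and the `group_for_docs` helper are assumptions made for the example, and the ID 999.999 is hypothetical; the actual scripts may work differently.

```python
from collections import defaultdict

# Context values taken from the commit message; the record layout is assumed.
DOCUMENTED_CONTEXTS = ("starlingx", "openstack")


def group_for_docs(events):
    """Group event IDs by Context, skipping anything not meant for the docs."""
    groups = defaultdict(list)
    for event_id, record in sorted(events.items()):
        context = record.get("Context")
        if context in DOCUMENTED_CONTEXTS:
            groups[context].append(event_id)
    return dict(groups)


events = {
    "900.006": {"Context": "starlingx"},
    "100.105": {"Context": "openstack"},
    "300.001": {"Context": "openstack"},
    "999.999": {"Context": "none"},  # hypothetical ID, excluded from the docs
}
print(group_for_docs(events))
```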
Zuul a4601e8026 Merge "Add alarm for Restore in progress" 2023-03-27 14:49:30 +00:00
Joshua Kraitberg 3b430eb604 Add alarm for Restore in progress
Currently, there is no alarm for Restore in progress.
Because of this, the system is shown as healthy,
before restore has been completed.

This new alarm will prevent the system from being healthy
until restore has properly been completed.

PASS: On any available system, the following commands can
be triggered at any time:
* Run "system restore-start" to trigger alarm
* Run "system restore-complete" to clear alarm

Story: 2010117
Task: 47689
Signed-off-by: Joshua Kraitberg <>
Change-Id: I292b5c8083c08b68ac757fe5a650989178eb819f
2023-03-22 10:43:28 -04:00
Zuul 1a61473b14 Merge "Add ceph commands in the 800 series alarm document" 2023-03-17 13:32:25 +00:00
Agustin Carranza 0e1321913b Add ceph commands in the 800 series alarm document
When an 800-series alarm occurs, users refer to the documentation to
know what kind of error is shown. But sometimes that is not enough
The output of some commands can be useful information and could
save time when solving issues related to the storage alarms.

Closes-bug: 2004601

Test plan
PASS: * Build fm packages and deploy an ISO containing new fm
      * Trigger alarms that were modified by this commit,
        (e.g. shutdown a controller).
      * Run fm alarm-list --uuid and copy the uuid of an 800-series
      * Run fm alarm-show <uuid> and check that the field
        has changed.

Signed-off-by: Agustin Carranza <>
Change-Id: I94e2719b55b4fc14b692439526b5b47204460ac7
2023-03-13 14:13:44 -03:00
Al Bailey e5a8ba7ff4 Tox and Zuul cleanup for python3.9
Added in the following tox targets for fm-rest-api:
 - bandit
 - flake8 / pep8
 - pylint (suppressing most of the codes)

All the tox targets run on python3
The test-requirements.txt have been updated
The StarlingX Debian upper constraints are utilized.
The spec-lint (rpm) job is removed from Zuul.

Zuul runs pylint for sub directories
Bandit exclusions are updated.

Included a change to a .py file to trigger
the bandit zuul job.

Test Plan (for fm-rest-api)
  PASS: tox -e bandit
  PASS: tox -e coverage
  PASS: tox -e flake8
  PASS: tox -e pylint

Story: 2010531
Task: 47575
Signed-off-by: Al Bailey <>
Change-Id: I7ecaf1c90495b283c26e02e3b481bfe4c77c3939
2023-03-02 19:32:25 +00:00
Al Bailey bd8857357b Run checkEventYaml as part of zuul linter job
The checkEventYaml script verifies if all contents
are properly populated for the events.yaml file.

This change ensures that check is done by zuul, rather
than during the build.

yaml.load after version 5.1 requires a Loader argument.
The yaml.load in fm-doc are now updated to use safe_load

Test Plan:
  PASS: tox -e linters
  PASS: remove 'context' field from an alarm and observe
  that tox -e linters reports a failure.
  PASS: build-pkgs -p fm-doc

Story: 2010531
Task: 47549
Signed-off-by: Al Bailey <>
Change-Id: I369ffe4c74fcaf5fe4a916822fed18a78ead8ff8
2023-02-27 16:16:01 +00:00
Zuul c7e47234e9 Merge "Update debian package versions to use git commits" 2023-02-13 16:51:45 +00:00
Zuul 69ca1b650f Merge "Host compute service failure alarm removal" 2023-02-10 19:49:20 +00:00
Vanathi.Selvaraju 447ed111ae Host compute service failure alarm removal
Removal of stale alarm 270.001 (Host compute service failure), which
is raised by the vim. This might be an old reference to nova.
It’s likely not in use since stx.

Test Plan:
PASS: Verify with a load without the changes (removal of alarm)
and the event log in platform.log shows an entry for 270.001 alarm.
PASS: Verify with a load with changes of alarm removal and
the event log in platform.log does not show an entry for 270.001 alarm.

Closes-Bug: 2004744

Change-Id: I47a9f5cede2cfade4a16c63a2dc1bcfd563e88cf
Signed-off-by: Vanathi.Selvaraju <>
2023-02-10 09:32:01 -05:00
Al Bailey 60ab3f6b45 Update debian package versions to use git commits
The Debian packaging has been changed to reflect all the
git commits under the directory, and not just the commits
to the metadata folder.

This ensures that any new code submissions under those
directories will increment the versions.

All packages have a higher version than before the change.

Test Plan:
  PASS: build-pkgs -c -p fm-api
  PASS: build-pkgs -c -p fm-common
  PASS: build-pkgs -c -p fm-doc
  PASS: build-pkgs -c -p fm-mgr
  PASS: build-pkgs -c -p fm-rest-api
  PASS: build-pkgs -c -p python-fmclient

Story: 2010550
Task: 47226

Signed-off-by: Al Bailey <>
Change-Id: I65e881ba96512d2eaba25c44332d5ae82efea502
2023-02-09 18:06:57 +00:00
Al Bailey 7c9989cc34 Remove python2 jobs from zuul for this repo
The python2.7 jobs will no longer be executed as part
of the zuul check and gate.

This also removed the unused devstack job for stx/fault

Story: 2010531
Task: 47304
Signed-off-by: Al Bailey <>
Change-Id: I308a067e6ca23e45b7f5539853d7bb28f31bb7f5
2023-02-07 15:29:51 +00:00
Enzo Candotti ca8be6b866 Fix fm command bash dynamic completion
For dynamic bash completion, instead of using the legacy
/etc/bash_completion.d, the current bash-completion can use a
dynamic mechanism in which the customized completion is called
upon completion activation.
The new location that is already pointed by the .bashrc file,
also engaged by the /etc/bash_completion, is

However, the bash file was placed under a subfolder with the
name of the command which is not necessary since the file already
contains the command name.
Also, the proper file name shall contain .bash extension.

Closes-Bug: 2001553

Test Plan:
PASS: Build python-fmclient package.
PASS: Build Debian image and install it successfully.
Verify fm.bash is installed under /usr/share/bash-completion/completions
PASS: Verify bash completion is working as expected:

Signed-off-by: Enzo Candotti <>
Change-Id: I3b796d26633459b98d7555e48e0bf5ea01c630d3
2023-01-04 19:56:25 +00:00
Al Bailey 13b24042a1 Update tox.ini to work with tox 4
This change will allow this repo to pass zuul now
that this has merged:

Tox 4 deprecated whitelist_externals.
Replace whitelist_externals with allowlist_externals

Removed the 'build' target from zuul which just invokes
the devstack script which is un-supported.

Partial-Bug: #2000399

Signed-off-by: Al Bailey <>
Change-Id: I59bd7c82c297e12969e31b5de9ac02d2a47834a6
2022-12-27 01:38:20 +00:00
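
The key rename described in this commit is mechanical, so it can be sketched as a small text rewrite. This is only an illustration under assumptions: the `migrate_tox_ini` helper and the sample tox.ini content are made up for the example and are not taken from the repository.

```python
import re


def migrate_tox_ini(text):
    """Rename whitelist_externals keys to allowlist_externals, keeping values intact."""
    # Match the key only at the start of a line (optionally indented),
    # followed by "=", so occurrences inside values or comments are left alone.
    pattern = re.compile(r"^([ \t]*)whitelist_externals([ \t]*=)", re.MULTILINE)
    return pattern.sub(r"\1allowlist_externals\2", text)


# Hypothetical tox.ini fragment used only for demonstration.
sample = """[testenv:linters]
whitelist_externals = bash
commands = bash -c "echo lint"
"""
print(migrate_tox_ini(sample))
```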
Zuul ea4d279e4b Merge "Debian: fault: update" 2022-11-22 15:22:45 +00:00
Al Bailey cf658fba98 Fix openstack-tox jobs for fault repo
The Zuul upper-constraints env variable declaration needed
to be added to tox.ini; otherwise an older constraints file
was being used, which does not work with newer
versions of python.

Partial-Bug: #1997255

Signed-off-by: Al Bailey <>
Change-Id: Ie912dc7ae3f9f1639311f0c1f5cf62070f44909d
2022-11-22 01:15:40 +00:00
Yue Tao a5b3d12469 Debian: fault: update
Move the packages of "fault" from stx-std.lst to

Test Plan:

Pass: build-pkgs -c -a
Pass: build-image
Pass: boot

Story: 2008862
Task: 46841

Signed-off-by: Yue Tao <>
Change-Id: Ic58349d7b1adce50fb28c6c843967da8c908dd02
2022-11-18 08:15:57 +08:00
Agustin Carranza d161fe5922 Extend events.yaml schema with usage context field
The events.yaml file contains every alarm and log used by platform and
openstack. There is no way to know which one relates to one or
the other.
To make that distinction, an additional field is required in
each record to differentiate between platform and openstack.

Story: 2010143
Task: 46723

Test plan
PASS: Build the fm-api and fm-doc packages.
      Install fm-api first and then fm-doc.
      No errors are found during build and installation process.

Signed-off-by: Agustin Carranza <>
Change-Id: I8598afc77d27d107c4f9a108dd46b2ebc79b30a1
2022-11-09 16:14:21 -03:00
Zuul 8596fc8fc5 Merge "Add stx-fm-rest-api loci image" 2022-10-31 19:08:24 +00:00
Enzo Candotti cd0f5c38c2 Add stx-fm-rest-api loci image
This change reorganizes the source directories of the stx-fm-rest-api
container to be reused by both CentOS and Debian Dockerfiles in order
to build the images having the corresponding OS-specific base.

As part of this, the fm-api, fm-rest-api, fm-common and
python-fmclient packages have been ported in order to generate deb
files that contain .whl.

Test plan:
PASS: Build debian iso and perform fresh install. Verify fm commands are
working as expected.
PASS: Build python3 wheels tarball on Debian. Verify fm, fm_api, fm_core
and fmclient .whl files are added.
PASS: Build Debian-based container and push it to a public registry.
Apply openstack application and update the fm-rest-api url to pull
this new image. Verify that:
    - pods are up and running with the new image/tag specified.
    - the container is running on Debian.
    - from inside the container, fm queries are working as expected.

Story: 2009831
Task: 46634


Signed-off-by: Enzo Candotti <>
Change-Id: I2b35139f8775141e39f97a5a6037c5de2b4d5d76
2022-10-27 15:29:06 +00:00
Joao Victor Portal 277c64bed7 Fix FM error messages for forbidden requests
The CLI error messages for users with reader role were not clear to the
operator and this change fixes this.

Test Plan:

PASS: In an AIO-SX with this change present, create a new openstack user
with reader role and through this user execute the commands:
fm alarm-list
fm alarm-delete <uuid>
fm event-suppress --alarm_id <alarm_id>
and check that the "alarm-list" command is executed without errors and
that the error message of the other commands changes from:
"HTTP Client Error (HTTP 403) (Request-ID: req-<req_id>)"
"Error: Forbidden."

Story: 2010149
Task: 46620

Signed-off-by: Joao Victor Portal <>
Change-Id: I45007a7f5319ef0a0238a07d671a859b5081660a
2022-10-20 20:47:13 -03:00
Zuul 8a07de3ea1 Merge "Alarm Hostname controller function has in-service failure reported" 2022-10-07 17:37:49 +00:00
Girish Subramanya efa09aa3db Alarm Hostname controller function has in-service failure reported
When compute services remain healthy:
 - listing alarms shall not refer to the below Obsoleted alarm
 - 200.012 alarm hostname controller function has an in-service failure

This update deletes definition of the obsoleted alarm and any references
200.012 is removed in events.yaml file
Also updated any reference to this alarm definition.
Need to also raise a Bug to track the Doc change.

Test Plan:

Verify on a Standard configuration no alarms are listed for hostname
controller in-service failure
Code (removal) changes exercised with fix prior to ansible bootstrap
and host-unlock and verify no unexpected alarms

There is no need to test the alarm referred here as they are obsolete

Closes-Bug: 1991531

Signed-off-by: Girish Subramanya <>

Change-Id: I255af68155c5392ea42244b931516f742fa838c3
2022-10-05 10:30:49 -04:00
Zuul e28e068018 Merge "Restrict fmClientCli binary permissions" 2022-10-03 20:18:44 +00:00
Zuul f6330794d9 Merge "Debian: Remove conf files from etc-pmon.d" 2022-09-30 19:10:43 +00:00
Zuul 5815cce15c Merge "debian: Remove preset file for fm-rest-api" 2022-09-29 16:34:40 +00:00
Joao Victor Portal 74d56e72a0 Restrict fmClientCli binary permissions
The fmClientCli binary can create and delete alarms freely on the
system, so the access to this binary should be restricted to Linux admin

Test Plan:

PASS: Deploy an AIO-SX using a Debian image containing this change and
check that the permissions for file "/usr/local/bin/fmClientCli" are
"-rwxr-x---" and the owner:group is root:root.
PASS: Repeat the test above using a CentOS image.

Closes-Bug: 1991118
Signed-off-by: Joao Victor Portal <>
Change-Id: I0375ddc68ae1b5967447a326780272f77695793a
2022-09-28 11:19:55 -03:00
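
The permission string in the test plan above corresponds to mode 0o750 on a regular file. A minimal sketch of that check, assuming a POSIX system, is shown below; the `mode_string` helper is hypothetical, and the example runs against a temporary file rather than the real binary. It does not verify the root:root ownership.

```python
import os
import stat
import tempfile

EXPECTED_MODE = "-rwxr-x---"  # permission string from the test plan


def mode_string(path):
    """Return the ls-style permission string for a path."""
    return stat.filemode(os.stat(path).st_mode)


# Demonstrate on a temporary file instead of /usr/local/bin/fmClientCli.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    demo_path = tmp.name
os.chmod(demo_path, 0o750)  # owner rwx, group r-x, others none
matches = mode_string(demo_path) == EXPECTED_MODE
os.remove(demo_path)
print(matches)
```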
Charles Short 9cd969eb25 debian: Remove preset file for fm-rest-api
Remove the fm-rest-api preset file since it will
be managed centrally going forward and not on per-package basis.

Test Plan
Build Package
Build ISO
Install ISO
Check for non-existent

Story: 2009968
Task: 46406


Signed-off-by: Charles Short <>
Change-Id: Ic3ec52bfb985c9e06d654476e6913ab897b67eb2
2022-09-27 08:19:36 +00:00