12424 Commits

Author SHA1 Message Date
Julia Kreger
d2039a29de Handle nova policy change
It appears nova's policies have changed, or to be more precise,
they have turned on new policy enforcement[0] and our plugin
was wrong.

+++ /opt/stack/ironic/devstack/lib/ironic:
ironic_configure_tempest:3205 :
oscwrap --os-cloud devstack-system-admin flavor show baremetal -f value -c id
ForbiddenException: 403: Client Error for url:
https://173.231.254.232/compute/v2.1/flavors/
e4312534-b349-4f70-9a1b-5806debff275/os-extra_specs,
Policy doesn't allow os_compute_api:os-flavor-extra-specs:index to be performed.

[0]: dfd7aeaf6c

Change-Id: I8070852fbe9346e346c50088537797f353753d02
2023-05-23 21:33:57 -07:00
Julia Kreger
124ad571fd Explicitly pin CIRROS_VERSION
It appears the push to Cirros 0.6.1 has reoccurred, and we now
have things failing as a result.

Specifically ironic-grenade is trying to run with Cirros 0.5.2,
yet the file is not found later on.

Anyhow, an explicit pin should resolve this.

Change-Id: I97a1403820c8dbe633cf1d529adc79e8af463e80
2023-05-23 15:10:00 -07:00
Zuul
2afaf4d0a0 Merge "CI: DB: Don't return inside of node get wrappers" 2023-05-21 02:51:46 +00:00
Zuul
b5eafe9069 Merge "Update docs: Ironic uses launchpad now" 2023-05-20 22:40:12 +00:00
Zuul
fe8134ea28 Merge "Fix self_owned_node policy check" 2023-05-19 22:23:36 +00:00
Zuul
69b5dd0a73 Merge "CI: Change tinycore URL" 2023-05-19 22:23:34 +00:00
Zuul
bfb0a4cdc4 Merge "CI: Disable mysql counters for grenade" 2023-05-19 19:04:52 +00:00
Julia Kreger
4beeef777f CI: DB: Don't return inside of node get wrappers
Previously, we queried object handlers with model_query and
would then return the handler. However, that was problematic
because we returned ORM objects.

Now we have selects, but the data has to be extracted to a set before
we close out the reader session with the database. Returning the
overall object means that the call seems to sometimes not completely
execute on newer python versions in unit tests, which creates unit
test failures.

We now hold off the database query and operation to a variable,
exit the reader, and return the variable.
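
A minimal sketch of the pattern described above: materialize the result
inside the reader block, leave the block, then return. The reader() helper,
the nodes table, and get_node_names() are hypothetical stand-ins built on
the standard library sqlite3 module, not Ironic's actual DB API.

```python
import contextlib
import sqlite3


@contextlib.contextmanager
def reader(conn):
    """Hypothetical stand-in for a database reader session."""
    cursor = conn.cursor()
    try:
        yield cursor
    finally:
        # Once the session ends, the cursor can no longer be used.
        cursor.close()


def get_node_names(conn):
    with reader(conn) as cur:
        # Fully consume the query while the session is still open.
        query = cur.execute("SELECT name FROM nodes ORDER BY name")
        res = [row[0] for row in query]
    # Return only after exiting the reader block.
    return res


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (name TEXT)")
conn.executemany("INSERT INTO nodes VALUES (?)", [("node-b",), ("node-a",)])
names = get_node_names(conn)
```

Holding the result in a plain list means nothing lazy escapes the session.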

Change-Id: I6ae9e4442bcb473ab5ccff6e167bf61f3daa0f7e
2023-05-19 11:29:33 -07:00
Zuul
f206eb1f65 Merge "[iRMC] Fix parse_driver_info bug enforcing SNMP v3 under FIPS mode" 2023-05-19 17:42:09 +00:00
Zuul
ce5183bfde Merge "CI: Try to isolate test failures in neutron vif logic" 2023-05-19 16:18:17 +00:00
Julia Kreger
fce8c3a651 CI: Change tinycore URL
We've been able to observe one of the scenario test jobs failing
due to tinycorelinux being inaccessible. Possibly on an IPv6 only
test VM. Turns out tinycorelinux's main page is only accessible via
IPv4.

As such, I've changed the mirror to one which is accessible via
IPv6 and which I've verified works for me.

Change-Id: I2b4ccd16189038ce2f054d7403775b012796aea3
2023-05-19 07:20:56 -07:00
Zuul
91e6a86fbe Merge "Fix anaconda stage2_id loading from image properties" 2023-05-19 09:25:40 +00:00
Zuul
40f3d4b7ba Merge "Fix Cinder Integration fallout from CVE-2023-2088" 2023-05-19 02:21:39 +00:00
Julia Kreger
8b98dfafd8 CI: Disable mysql counters for grenade
Disabling the performance counters as we suspect it is causing
database interaction to freeze on the grenade CI job.

Change-Id: Id951815ab9bfd1ca16aa66fa4c87c0e1b3e788f6
2023-05-18 17:27:38 -07:00
Julia Kreger
1cb371327f CI: Try to isolate test failures in neutron vif logic
One of our tests is highly prone to races because it relies on changing
an object, but not yet saving it, to verify the difference.
However, that object, and even similar changes, can land in the database
and trigger the test to fail.

So instead of using the shared resource created on test startup, create an
isolated, dedicated resource for the test to use, which should
eliminate the entire issue. I hope.

Change-Id: Ia0df1c211d5510777c6055f1341335f9bc5c5006
2023-05-18 17:15:15 -07:00
Zuul
bca0fe407e Merge "Migrate to pysnmp lextudio ecosystem" 2023-05-18 17:13:11 +00:00
Julia Kreger
9c0b4c90a1 Fix Cinder Integration fallout from CVE-2023-2088
In the recent change to cinder, to address CVE-2023-2088,
cinder changed the policy rules and behavior for unbinding,
or "detaching" a volume. This was because of a vulnerability
in compute nodes where a volume which was in use by a VM
could be detached outside of Nova, and nova wouldn't become
aware the volume was detached, and the volume could be accessible
to the next VM.

This vulnerability doesn't apply to bare metal operations as
volumes are attached to whole baremetal nodes with Ironic.

We now generate and use a service token when interacting with
Cinder, which allows cinder to recognize "this request is
coming from a fellow OpenStack service" and bypass
checking with Nova whether or not the "instance" is managed
by Nova. This allows the volumes to be attached and detached
as needed as part of the power operation flow and overall
set of lifecycle operations.
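
As a rough illustration of the service token idea, the sketch below builds
request headers that carry a service token next to the user's token.
X-Auth-Token and X-Service-Token are the header names used by keystone
middleware; build_headers() and the token values are hypothetical, and this
is not how Ironic itself constructs its Cinder client.

```python
def build_headers(user_token, service_token=None):
    """Return HTTP headers for a request made on behalf of a user."""
    headers = {"X-Auth-Token": user_token}
    if service_token:
        # Marks the request as coming from a fellow service.
        headers["X-Service-Token"] = service_token
    return headers


headers = build_headers("user-token-example", "service-token-example")
```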

Related-Bug: 2004555
Closes-Bug: 2019892

Change-Id: Ib258bc9650496da989fc93b759b112d279c8b217
2023-05-18 07:43:31 -07:00
Jay Faulkner
65b8895e8a Update docs: Ironic uses launchpad now
Ironic switched to launchpad. Ensure our docs point contributors to the
correct location.

Change-Id: Ifa75c75741dd4a584bc2cb972eb4726c4c48d064
2023-05-17 15:42:41 -07:00
Julia Kreger
912dcbbdc9 CI: Mark BFV job non-voting for now
Until we're able to get the BFV job sorted, we need to unblock
the gate, so this moves the BFV job to non-voting to allow
other contributors to make progress.

Change-Id: I045d58afe195f08823af3b1a2fa6eabb6efb63ca
2023-05-16 11:18:43 -07:00
Riccardo Pittau
c8c83ef544 Migrate to pysnmp lextudio ecosystem
The pysnmp library has not been maintained for 4 years now and is
incompatible with recent libraries like pyasn1.
Its fork, pysnmp-lextudio, is regularly maintained, so we should move
to that.

Change-Id: I3b7b5c0cf2f7d669b265669d27e0eaca0dd1fc2a
2023-05-11 08:52:52 +02:00
Zuul
832275015a Merge "Support longer checksums for redfish firmware upgrade" 2023-05-09 23:45:15 +00:00
Julia Kreger
9da6dfd73d Fix self_owned_node policy check
When enabling scope enforcement, the self_owned_node check could
generate a failure because the check internally can be touched
by both a project scoped and system scoped endpoint.

This change updates the tag in the policy so it doesn't prematurely
return an error to the API consumer.

Change-Id: I49e2f7f29eb98e5bb4e18614cea0aca726703f55
2023-05-09 09:51:43 -07:00
Zuul
1d0818cba2 Merge "Remove use of nomodeset by default" 2023-05-09 06:29:42 +00:00
Zuul
0e6935bb1d Merge "Imported Translations from Zanata" 2023-05-09 05:06:28 +00:00
Zuul
b36ffd0e96 Merge "Remove autocommit, again." 2023-05-09 04:46:38 +00:00
OpenStack Proposal Bot
3139460cd2 Imported Translations from Zanata
For more information about this automatic import see:
https://docs.openstack.org/i18n/latest/reviewing-translation-import.html

Change-Id: Ice56ac44161d27ede41fdf53024e62e49c572049
2023-05-09 04:29:51 +00:00
Zuul
00c26cd443 Merge "Fix api-ref v1-indicators" 2023-05-09 03:04:42 +00:00
Zuul
f926758bd4 Merge "Use monotonic time for hashring reset" 2023-05-09 00:22:22 +00:00
Harald Jensås
1b8c0be0b7 Fix api-ref v1-indicators
When doing a GET or PUT on an indicator the indicator
specified must use `indicator@component` syntax. For
example:

 indicators/led@system or indicators/led-0@chassis

Ref: ironic.api.controllers.v1.node.IndicatorAtComponent
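
A small sketch of splitting the `indicator@component` syntax shown above.
parse_indicator() is a hypothetical helper for illustration only; it is not
the IndicatorAtComponent implementation referenced in this message.

```python
def parse_indicator(value):
    """Split 'name@component' into a (name, component) tuple."""
    name, sep, component = value.partition("@")
    if not sep or not name or not component:
        raise ValueError(f"expected indicator@component, got {value!r}")
    return name, component


led = parse_indicator("led@system")
led0 = parse_indicator("led-0@chassis")
```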

Change-Id: I6908544b52be88cbddf537c954dd5e097a0d83e6
2023-05-09 00:25:55 +02:00
Julia Kreger
cec72275a1 CI: Fix another network test
Turns out more than one test was relying upon the object
change determination test. This modifies the test to use the
same pattern of behavior so we avoid racing.

Change-Id: I29ee6cab7320d13fcc2eeda27dae08aeb2d98b00
2023-05-08 14:08:47 -07:00
Julia Kreger
4518577770 CI: Modify dhcp client ID fail
Under certain CI race conditions, the test may periodically
be handled as if there was no change, which breaks the
test: it does not save a modified port, but uses the in-flight
list of changes to determine the correct path.

The challenge is that the list of changes may not recognize that
there has been a change in the underlying object/db layer. So instead
of re-testing the library code, we just force the behavior by
replacing the method on the object in the test, as the
underlying method is already tested as part of the
oslo versioned objects code base.

Change-Id: Ic8f9b2384ab2f8f76299afce9806fbe93e350f0e
2023-05-08 11:03:27 -07:00
Zuul
47b778977c Merge "Handle MissingAttributeError when using OOB inspections to fetch MACs" 2023-05-08 15:08:21 +00:00
Kaifeng Wang
b48dfd44c7 Use monotonic time for hashring reset
hashring uses time.time() to calculate intervals; replace it with a
monotonic clock so it will not be affected by system time jumps.
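
A minimal sketch of interval tracking with time.monotonic(), which never
goes backwards when the wall clock is adjusted. ResetTimer and its
parameters are hypothetical names for illustration, not the hashring code.

```python
import time


class ResetTimer:
    """Hypothetical helper tracking when a cached value should be rebuilt."""

    def __init__(self, interval, clock=time.monotonic):
        self._interval = interval
        self._clock = clock
        self._deadline = clock() + interval

    def expired(self):
        # Compares monotonic readings, so wall-clock jumps have no effect.
        return self._clock() >= self._deadline

    def reset(self):
        self._deadline = self._clock() + self._interval


timer = ResetTimer(interval=15)
```

The clock is injectable only so the behavior can be exercised without sleeping.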

Change-Id: I17569359f4d2c0f2f24ca8b50773c4d210ed8deb
2023-05-07 15:09:50 +08:00
Riccardo Pittau
cae05c70e6 Make rbac enforced test non-voting for the time being
We need to fix the tests first, so this needs to pass
https://review.opendev.org/c/openstack/ironic-tempest-plugin/+/882312

Change-Id: I5a9536f24103032059f25f9f69fda354e9e38187
2023-05-05 17:35:24 +02:00
Lana Kaleif
7a5f80cec7 Fix anaconda stage2_id loading from image properties
Anaconda stage2 location is not verified nor loaded when kernel
and ramdisk instance info on a node is already populated, due
to a condition removed by commit 4d653ac225.
This change implements a check based on stage2 label presence
that again allows stage2 to be processed separately from kernel
and ramdisk.

It also changes the incorrect parameter check_on_exit passed to
oslo_concurrency.processutils.execute() to the correct one,
check_exit_code.

Change-Id: If7bfc99f200fd0bfc9fbe2a4c8d039ba6ae8d0c5
2023-05-04 14:43:31 +02:00
Julia Kreger
03cd9788e6 Support longer checksums for redfish firmware upgrade
Previously only SHA1 hashes were supported; now we support
SHA256 and SHA512 by length.
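
One way to select a hash algorithm by checksum length, sketched with the
standard library. The hex digest lengths (40, 64, 128) follow from the
digest sizes of SHA1, SHA256 and SHA512; algorithm_for() and verify() are
hypothetical names, not the redfish firmware code.

```python
import hashlib

# Hex digest length mapped to the hashlib algorithm name.
_ALGORITHM_BY_HEX_LENGTH = {40: "sha1", 64: "sha256", 128: "sha512"}


def algorithm_for(checksum):
    """Pick the algorithm implied by the length of a hex checksum."""
    try:
        return _ALGORITHM_BY_HEX_LENGTH[len(checksum)]
    except KeyError:
        raise ValueError(
            f"unsupported checksum length: {len(checksum)}") from None


def verify(data, checksum):
    """Return True if data hashes to the given hex checksum."""
    digest = hashlib.new(algorithm_for(checksum), data).hexdigest()
    return digest == checksum.lower()
```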

Change-Id: Iddb196faca4008837595a3d0923f55d0e9d2aea5
2023-05-03 07:34:37 -07:00
Julia Kreger
7f281392c2 Change wholedisk image checksum to sha256
Change-Id: I0c90ac87ca88329e7fb315385345e8020a59fdd5
2023-05-02 13:03:13 -07:00
Jacob Anders
f10958a542 Handle MissingAttributeError when using OOB inspections to fetch MACs
Currently, if an attempt is made to fetch MAC address information using
OOB inspection on a Redfish-managed node and the EthernetInterfaces
attribute is missing on the node, inspection fails due to a
MissingAttributeError exception being raised by sushy. This change
catches and handles this exception.
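
The general shape of such handling might look like the sketch below. The
exception class is a local stand-in for the sushy exception named above,
FakeSystem is a hypothetical resource, and returning an empty list is an
assumption about what "handling" means here, not the actual fix.

```python
class MissingAttributeError(Exception):
    """Local stand-in for the sushy exception named in this message."""


class FakeSystem:
    """Hypothetical system whose EthernetInterfaces may be absent."""

    def __init__(self, macs=None):
        self._macs = macs

    @property
    def ethernet_interfaces(self):
        if self._macs is None:
            raise MissingAttributeError("EthernetInterfaces")
        return self._macs


def get_mac_addresses(system):
    try:
        return list(system.ethernet_interfaces)
    except MissingAttributeError:
        # Assumption: report no MACs rather than failing inspection.
        return []
```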

Change-Id: I6f16da05e19c7efc966128fdf79f13546f51b5a6
2023-05-02 22:09:26 +10:00
Julia Kreger
c03a5b44ef Remove autocommit, again.
Patch Ic8b1d964f7be5784e01c89bfb6c0277ea82eec2d was developed
without the autocommit change in place, which should allow
Ironic to operate properly with sqlite, in a single process
standalone mode.

As such, we shouldn't need to keep autocommit turned on, but
we may need to put it back if we identify yet another issue,
which is entirely possible with major database refactors.

Change-Id: Icde231e9db3b7a9f59205505cd51a4064e41d746
2023-05-01 22:36:09 +00:00
Julia Kreger
75b881bd31 Fix DB/Lock session handling issues
Prior to this fix, we have been unable to run the Metal3 CI job
with SQLAlchemy's internal autocommit setting enabled. However
that setting is deprecated and needs to be removed.

Investigating our DB queries and request patterns, we were able
to identify some queries which generally resulted in the
underlying task and lock being held longer because the output
was not actually returned, which is something we've generally
had to fix in some places previously. Doing some of these
changes did drastically reduce the number of errors encountered
with the Metal3 CI job, however it did not eliminate them
entirely.

Upon further investigation, we determined that the underlying
issue arises when there is an external semi-random reader, such as
Metal3 polling endpoints. We could reach a situation where we were
blocked from updating the database because, to open a write lock,
we need the active readers not to be interacting with the database.
With a random reader of sorts, the only realistic option we have
is to enable the Write Ahead Log[0]. We didn't have
to do this with SQLAlchemy previously because autocommit behavior
hid the complexities from us, but in order to move to SQLAlchemy
2.0, we do need to remove autocommit.
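
For reference, enabling the Write Ahead Log on a SQLite database can be
sketched with the standard library as follows. enable_wal() and the file
name are hypothetical; how Ironic applies the setting through SQLAlchemy is
not shown here.

```python
import os
import sqlite3
import tempfile


def enable_wal(path):
    """Open a SQLite database file and switch it to WAL journaling."""
    conn = sqlite3.connect(path)
    # The pragma returns the journal mode that is now in effect.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    return conn, mode


tmpdir = tempfile.mkdtemp()
conn, mode = enable_wal(os.path.join(tmpdir, "example.sqlite"))
conn.close()
```

WAL applies to file-backed databases; an in-memory database keeps its own journal mode.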

Additionally, this adds two unit tests for the get_node_with_token
RPC method, which apparently we missed or lost somewhere along the
way. It also adds notes to two database interactions to suggest
we look at them in the future, as they may not be the most
efficient path forward.

[0]: https://www.sqlite.org/wal.html

Change-Id: Iebcc15fe202910b942b58fc004d077740ec61912
2023-05-01 15:35:33 -07:00
Zuul
5e6fa6ef30 Merge "Upgrade to latest hacking - v6" 2023-04-30 09:41:59 +00:00
Zuul
fbd1350229 Merge "Configure docs: we no longer use storyboard" 2023-04-28 15:45:32 +00:00
Zuul
b8c8236222 Merge "[iRMC] Fix typo of Python string format in log message" 2023-04-27 23:39:13 +00:00
Zuul
c42c2efe95 Merge "Remove all references to the "cpus" property" 2023-04-27 23:10:00 +00:00
Zuul
c757dbaa45 Merge "Set ironic-grenade to wait 120 seconds" 2023-04-27 18:52:31 +00:00
Zuul
41bc762aa6 Merge "Add ablity to power off nodes in clean failed" 2023-04-27 16:07:47 +00:00
Julia Kreger
f2605e9281 Remove use of nomodeset by default
The troubleshooting kernel command line option nomodeset
unfortunately changes the way framebuffer interactions work
with graphics devices, which in some cases can result in kernel
memory being used for graphics updates. When this happens on
some specific hardware common in rack mount servers with baseboard
management controllers, it can cause the memory bus to become
locked for a brief time while the graphics update is occurring.

This locked memory bus means disk IO can become blocked,
and network cards can overflow their buffers resulting in
packet loss on top of the latency incurred by the graphics
update executing.

As such, we've removed the nomodeset option from default usage and
added a note describing its removal to the documentation along
with a release note.

Change-Id: I9084d88c3ec6f13bd64b8707892758fa87dd7f86
2023-04-26 07:34:29 -07:00
Vanou Ishii
3f09bdcf95 [iRMC] Fix parse_driver_info bug enforcing SNMP v3 under FIPS mode
This patch fixes a condition where iRMC driver interfaces would have
the FIPS enforcement logic check applied if the SNMP version was not
set to SNMP v3, even if the interfaces did not use SNMP.

With this patch, if FIPS is enabled, the iRMC driver enforces SNMP
version 3 only when an xxx_interface of the iRMC
driver actually uses SNMP.

Story: 2010713
Task: 47879
Change-Id: I774c459a5e11b7cd01f7a65754d5a2c7cc573476
2023-04-26 06:36:45 -04:00
Jay Faulkner
c7b8236ab5 Configure docs: we no longer use storyboard
Change-Id: I8a5221b7d8a44d73510efb9ad6a5f16d75a270f5
2023-04-25 09:11:21 -07:00
Chris Krelle
510a612eed Add ablity to power off nodes in clean failed
We have seen duplicate IP issues when leaving clean failed nodes
powered on. This patch allows operators to power down nodes that
enter the clean failed state.

Change-Id: Iecb402227485fe0ba787a262121c9d6a048b0e13
2023-04-24 16:20:54 -07:00