It appears nova's policies have changed, or to be more precise,
they have turned on new policy enforcement[0] and our plugin
was wrong.
+++ /opt/stack/ironic/devstack/lib/ironic:
ironic_configure_tempest:3205 :
oscwrap --os-cloud devstack-system-admin flavor show baremetal -f value -c id
ForbiddenException: 403: Client Error for url:
https://173.231.254.232/compute/v2.1/flavors/
e4312534-b349-4f70-9a1b-5806debff275/os-extra_specs,
Policy doesn't allow os_compute_api:os-flavor-extra-specs:index to be performed.
[0]: dfd7aeaf6c
Change-Id: I8070852fbe9346e346c50088537797f353753d02
It appears the push to Cirros 0.6.1 has reoccurred, and we now
have things failing as a result.
Specifically ironic-grenade is trying to run with Cirros 0.5.2,
yet the file is not found later on.
Anyhow, an explicit pin should resolve this.
Change-Id: I97a1403820c8dbe633cf1d529adc79e8af463e80
Previously, we queried object handlers with model_query and
returned the handler, however that was problematic because we
returned ORM objects.
Now we have selects, but the data has to be extracted to a set before
we close out the reader session with the database. Returning the
overall object means that the call sometimes seems not to completely
execute on newer Python versions in unit tests, which creates unit
test failures.
We now assign the result of the database query and operation to a
variable, exit the reader, and return the variable.
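A minimal sketch of that pattern, using the standard library's
sqlite3 rather than Ironic's actual SQLAlchemy session helpers. The
reader() context manager, the table, and the column name are all
hypothetical:

```python
import contextlib
import sqlite3


@contextlib.contextmanager
def reader(conn):
    """Hypothetical stand-in for a database reader session."""
    cursor = conn.cursor()
    try:
        yield cursor
    finally:
        # Once the session is closed, pending results can no longer be read.
        cursor.close()


def get_handlers(conn):
    res = None
    with reader(conn) as cur:
        # Run the query and copy the values into a plain set while the
        # reader session is still open.
        res = {row[0] for row in cur.execute("SELECT handler FROM handlers")}
    # Return the already-materialized variable after leaving the reader.
    return res


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE handlers (handler TEXT)")
conn.executemany("INSERT INTO handlers VALUES (?)", [("a",), ("b",), ("a",)])
handlers = get_handlers(conn)
```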
Change-Id: I6ae9e4442bcb473ab5ccff6e167bf61f3daa0f7e
We've been able to observe one of the scenario test jobs failing
due to tinycorelinux being inaccessible, possibly on an IPv6-only
test VM. Turns out tinycorelinux's main page is only accessible via
IPv4.
As such, I've changed the mirror to one which is accessible via
IPv6 and which I've verified works for me.
Change-Id: I2b4ccd16189038ce2f054d7403775b012796aea3
Disable the performance counters, as we suspect they are causing
database interaction to freeze on the grenade CI job.
Change-Id: Id951815ab9bfd1ca16aa66fa4c87c0e1b3e788f6
One of our tests is highly prone to races because it relies on changing
an object, but that object not being saved yet to verify the difference.
However, that object and even similar changes can occur in the database
and trigger the test to fail.
So instead of using the shared resource created on test startup, create an
isolated, dedicated resource for it to use inside of the test, which should
eliminate the entire issue. I hope.
Change-Id: Ia0df1c211d5510777c6055f1341335f9bc5c5006
In the recent change to cinder, to address CVE-2023-2088,
cinder changed the policy rules and behavior for unbinding,
or "detaching" a volume. This was because of a vulnerability
in compute nodes where a volume which was in use by a VM
could be detached outside of Nova, and nova wouldn't become
aware the volume was detached, and the volume could be accessible
to the next VM.
This vulnerability doesn't apply to bare metal operations as
volumes are attached to whole baremetal nodes with Ironic.
We now generate and use a service token when interacting with
Cinder which allows cinder to recognize "this request is
coming from a fellow OpenStack service", and bypass
checking with Nova whether the "instance" is managed by Nova
or not. This allows the volumes to be attached and detached
as needed as part of the power operation flow and overall
set of lifecycle operations.
Related-Bug: 2004555
Closes-Bug: 2019892
Change-Id: Ib258bc9650496da989fc93b759b112d279c8b217
Until we're able to get the BFV job sorted, we need to unblock
the gate, so move the BFV job to non-voting to allow
other contributors to make progress.
Change-Id: I045d58afe195f08823af3b1a2fa6eabb6efb63ca
The pysnmp library has not been maintained for 4 years now and it's
incompatible with recent libraries like pyasn1.
Its fork pysnmp-lextudio is regularly maintained, so we should move
to that.
Change-Id: I3b7b5c0cf2f7d669b265669d27e0eaca0dd1fc2a
When enabling scope enforcement, the self_owned_node check could
generate a failure because the check internally can be touched
by both a project scoped and system scoped endpoint.
This change updates the tag in the policy so it doesn't prematurely
return an error to the API consumer.
Change-Id: I49e2f7f29eb98e5bb4e18614cea0aca726703f55
When doing a GET or PUT on an indicator the indicator
specified must use `indicator@component` syntax. For
example:
indicators/led@system or indicators/led-0@chassis
Ref: ironic.api.controllers.v1.node.IndicatorAtComponent
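As a rough illustration of the syntax only, a hypothetical helper
(not the IndicatorAtComponent implementation referenced above) could
split such a value like this; splitting on the last "@" is an
assumption:

```python
def split_indicator(value):
    """Split 'indicator@component' into (indicator, component).

    Hypothetical helper for illustration; splits on the last '@'.
    """
    name, sep, component = value.rpartition("@")
    if not sep or not name or not component:
        raise ValueError(
            "expected indicator@component syntax, got %r" % value)
    return name, component


parsed = [split_indicator(v) for v in ("led@system", "led-0@chassis")]
```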
Change-Id: I6908544b52be88cbddf537c954dd5e097a0d83e6
Turns out more than one test was relying upon the object
change determination test. Modifies this test to use the
same pattern of behavior so we avoid racing.
Change-Id: I29ee6cab7320d13fcc2eeda27dae08aeb2d98b00
Periodically, under certain CI race conditions, the test may be
handled as if there was no change, which breaks the test: it does
not save a modified port, it uses the in-flight list of changes to
determine the correct path.
The challenge is that the list of changes may not recognize that there
has been a change with the underlying object/db layer. So instead
of re-testing the library code, we just force the behavior by
replacing the method on the object in the test, as the
underlying method being tested is tested as part of the
oslo versioned objects code base.
Change-Id: Ic8f9b2384ab2f8f76299afce9806fbe93e350f0e
The hashring uses time.time() to calculate intervals; replace it with
time.monotonic() so it will not be affected by system time jumps.
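A minimal sketch of interval tracking with a monotonic clock. The
ResetTimer class and its injectable clock are hypothetical and are
not the hashring code itself:

```python
import time


class ResetTimer:
    """Hypothetical helper that tracks when a refresh is due."""

    def __init__(self, interval, clock=time.monotonic):
        self.interval = interval
        # time.monotonic never goes backwards, so a wall clock
        # adjustment cannot shorten or stretch the interval.
        self._clock = clock
        self._deadline = clock() + interval

    def expired(self):
        return self._clock() >= self._deadline

    def reset(self):
        self._deadline = self._clock() + self.interval


# A fake clock stands in for time.monotonic so the behavior is easy to check.
now = [100.0]
timer = ResetTimer(15, clock=lambda: now[0])
before = timer.expired()
now[0] += 15
after = timer.expired()
```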
Change-Id: I17569359f4d2c0f2f24ca8b50773c4d210ed8deb
Anaconda stage2 location is not verified nor loaded when kernel
and ramdisk instance info on a node is already populated, due
to a condition removed by commit 4d653ac225.
This change implements a check based on stage2 label presence
that again allows stage2 to be processed separately from kernel
and ramdisk.
Also changes the incorrect parameter check_on_exit passed to
oslo_concurrency.processutils.execute() to the correct one,
check_exit_code.
Change-Id: If7bfc99f200fd0bfc9fbe2a4c8d039ba6ae8d0c5
Currently, if an attempt is made to fetch MAC address information using
OOB inspection on a Redfish-managed node and the EthernetInterfaces
attribute is missing on the node, inspection fails due to a
MissingAttributeError exception being raised by sushy. This change
catches and handles this exception.
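A sketch of the general shape of such handling. The exception class
below is a local stand-in for sushy's MissingAttributeError, and
FakeSystem, get_mac_addresses(), and the choice to return None are
all assumptions made for illustration:

```python
class MissingAttributeError(Exception):
    """Local stand-in for sushy.exceptions.MissingAttributeError."""


class FakeSystem:
    """Hypothetical system whose EthernetInterfaces attribute is absent."""

    @property
    def ethernet_interfaces(self):
        raise MissingAttributeError("EthernetInterfaces")


def get_mac_addresses(system):
    """Return a list of interfaces' data, or None if unavailable."""
    try:
        interfaces = system.ethernet_interfaces
    except MissingAttributeError:
        # No EthernetInterfaces on this node: report nothing instead of
        # letting the exception fail the whole inspection.
        return None
    return list(interfaces)


macs = get_mac_addresses(FakeSystem())
```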
Change-Id: I6f16da05e19c7efc966128fdf79f13546f51b5a6
Patch Ic8b1d964f7be5784e01c89bfb6c0277ea82eec2d was developed
without the autocommit change in place, which should allow
Ironic to operate properly with sqlite, in a single process
standalone mode.
As such, we shouldn't need to keep autocommit turned on, but
we may need to put it back if we identify yet another issue,
which is entirely possible with major database refactors.
Change-Id: Icde231e9db3b7a9f59205505cd51a4064e41d746
Prior to this fix, we have been unable to run the Metal3 CI job
with SQLAlchemy's internal autocommit setting enabled. However
that setting is deprecated and needs to be removed.
Investigating our DB queries and request patterns, we were able
to identify some queries which generally resulted in the
underlying task and lock being held longer because the output
was not actually returned, which is something we've generally
had to fix in some places previously. Doing some of these
changes did drastically reduce the number of errors encountered
with the Metal3 CI job, however it did not eliminate them
entirely.
Upon further investigation, we were able to determine that the
underlying issue occurs when an external semi-random reader, such as
Metal3 polling endpoints, is present. In that situation we can be
blocked from updating the database, because opening a write lock
requires that active readers not be interacting with the database.
With a random reader of sorts, the only realistic option we have is
to enable the Write Ahead Log[0]. We didn't have to do this with
SQLAlchemy previously because autocommit behavior hid the
complexities from us, but in order to move to SQLAlchemy 2.0, we do
need to remove autocommit.
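For reference, a minimal sketch of switching a SQLite database to
write-ahead logging with the standard library's sqlite3 module. How
Ironic wires this into its SQLAlchemy engine is not shown here, and
the temporary file path is only for illustration:

```python
import os
import sqlite3
import tempfile

# WAL needs a file-backed database; an in-memory one reports "memory".
db_dir = tempfile.mkdtemp()
db_path = os.path.join(db_dir, "example.sqlite")

conn = sqlite3.connect(db_path)
# The pragma returns the journal mode actually in effect.
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
conn.close()
```

The journal mode is persistent for a database file, so later
connections to the same file keep using WAL.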
Additionally, adds two unit tests for get_node_with_token rpc
method, which apparently we missed or lost somewhere along the
way. Also, adds notes to two Database interactions to suggest
we look at them in the future as they may not be the most
efficient path forward.
[0]: https://www.sqlite.org/wal.html
Change-Id: Iebcc15fe202910b942b58fc004d077740ec61912
The troubleshooting kernel command line option nomodeset
unfortunately changes the way framebuffer interactions work
with graphics devices, which in some cases can result in kernel
memory being used for graphics updates. When this happens on
some specific hardware common in rack mount servers with baseboard
management controllers, this can cause the memory bus to become
locked for a brief time while the graphics update is occurring.
This locked memory bus means disk IO can become blocked,
and network cards can overflow their buffers resulting in
packet loss on top of the latency incurred by the graphics
update executing.
As such, we've removed the nomodeset option from default usage and
added a note describing its removal to the documentation along
with a release note.
Change-Id: I9084d88c3ec6f13bd64b8707892758fa87dd7f86
This patch fixes a condition where iRMC driver interfaces would have
the FIPS enforcement logic check applied if the SNMP version was not
set to SNMP v3, even if the interfaces did not use SNMP.
With this patch, if FIPS is enabled, the iRMC driver enforces the
SNMP version to be version 3 only when an xxx_interface of the iRMC
driver actually uses SNMP.
Story: 2010713
Task: 47879
Change-Id: I774c459a5e11b7cd01f7a65754d5a2c7cc573476
We have seen duplicate IP issues when leaving clean failed nodes
powered on. This patch allows operators to power down nodes that
enter the clean failed state.
Change-Id: Iecb402227485fe0ba787a262121c9d6a048b0e13