The kickstart unit tests were written in such a way that, if they
are run on a system with the kickstart validator present, their
behavior differs from (and fails compared to) a run without it.
Specifically, when the validator is present, an error is generated:
TypeError: write() argument must be str, not MagicMock
This is because we pass in a mock value for unit testing.
This change removes the alternative path taken when the validator
is present and locks the tests into the validator-absent (False)
case, which simplifies the validation path for the kickstart
interface.
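A rough sketch of the shape of the change (the patch target and test
names here are hypothetical, not the actual Ironic module paths): the
tests stop branching on whether ksvalidator exists on the host and
always force the validator-absent path.

    from unittest import mock

    from ironic.tests import base


    class KickstartValidationTestCase(base.TestCase):

        # Hypothetical patch target standing in for whatever flag the
        # interface consults before invoking ksvalidator.
        @mock.patch('ironic.common.kickstart_utils.VALIDATOR_AVAILABLE',
                    False)
        def test_validate_kickstart_template(self):
            # With the validator forced off, write() never receives a
            # MagicMock, so results no longer depend on the test host.
            ...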
Change-Id: Idfb6b4f3b49901aa1a222c6fedc4367ef3bfd2a2
(cherry picked from commit bbc82fa1482459e028c14782a3eeb7db8b03181e)
I totally missed Julia's comment in the review. This commit adds
unit tests for the RedfishFirmwareInterface and also adds more
logging when a specific component is missing.
Change-Id: Ice2c946726103d9957518c2d30ddad3310ee145d
(cherry picked from commit 32c9c7445978650093e2d3b2413b9e97a0ab721a)
Automated cleaning is not guaranteed to be enabled, and in any case it's
too late to cache the components at that point: firmware upgrades may
happen before the transition to "available".
Change-Id: I6b74970fffcc150c167830bef195f284a8c6f197
(cherry picked from commit 607b8734e4b91cc0ee26b49da8dbb63714dfa180)
First, the caching code tries to create components even if the current
version is not known, and fails with a database constraint error
(because the initial version cannot be NULL). This can be reproduced
with sushy-tools before 37f118237a
Second, unexpected exceptions are not handled in the caching code, so
any of them will cause the node to get stuck in cleaning forever.
On top of that, the caching code is missing a metrics decorator.
This change does not update any unit tests because none currently exist.
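A rough sketch, using a hypothetical save_components() helper, of the
hardened caching flow described above: skip components with no known
current version, time the call with the usual ironic_lib metrics
logger, and never let an unexpected exception escape into cleaning.

    from ironic_lib import metrics_utils
    from oslo_log import log

    LOG = log.getLogger(__name__)
    METRICS = metrics_utils.get_metrics_logger(__name__)


    @METRICS.timer('cache_firmware_components')
    def cache_firmware_components(task, components):
        known = []
        for component in components:
            if not component.get('current_version'):
                # The initial version column is NOT NULL, so creating a
                # record without it fails with a constraint error.
                LOG.debug('Skipping firmware component %(name)s of node '
                          '%(node)s: no current version reported',
                          {'name': component.get('name'),
                           'node': task.node.uuid})
                continue
            known.append(component)
        try:
            save_components(task.node, known)  # hypothetical persistence call
        except Exception:
            # Best-effort caching: log instead of leaving the node stuck
            # in cleaning.
            LOG.exception('Could not cache firmware components for node '
                          '%s', task.node.uuid)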
Change-Id: Iaa242ca6aa6138fcdaaf63b763708e2f1e559cb0
(cherry picked from commit 23745d97fe4e154c79389a951b4fd47d49fed494)
The host currently hard-coded is not functioning. This replaces
the hard-coded mirror with the detected local CI mirror. If mirror
information is not available, the upstream CentOS mirror is used.
Change-Id: I96a8cb45154c9dbb50efecc22d34c4ff75c6722a
(cherry picked from commit 7032a0d9ac2c875c5349708eb78b779473a41a6e)
In the backports to fix the policy of the original change, Dmitry
noted that it was actually wrong, because we should have instead
raised NotAuthorized. Dmitry was absolutely correct: in hindsight
I made the change trying to keep exactly the same behavior, but
the reality is this is a case where we should be explicit and tell
the user they have done something forbidden.
This revert of the revert fixes that change.
Original Change: https://review.opendev.org/c/openstack/ironic/+/905038
Dmitry's Review Feedback: https://review.opendev.org/c/openstack/ironic/+/905088
Change-Id: I5727df00b8c4ae9495ed14b5cea1c0734b5f688d
(cherry picked from commit 4398c11a5f14980505ede44032d17fa8f5969cdc)
Before this change, if a user requested a node to be cleaned
or "managed" with cleaning enabled while the user was in the
system scope, Ironic would attempt to use the user's token to
make the request to Neutron.
This, unfortunately, does not work, as the neutron client explicitly
requires a project ID to make the request to Neutron. As a result,
Ironic now falls back to its internal credential configuration to make
the request, which matches the behavior when a node has been
unprovisioned and cleaning has started automatically.
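A simplified sketch of the fallback (helper names are illustrative,
not the actual Ironic functions): a system-scoped context has no
project for the neutron client to bind to, so Ironic's own service
credentials are used instead, just as for automated cleaning.

    def get_neutron_client(context):
        if context and context.auth_token and context.project_id:
            # Project-scoped user token: act on the user's behalf.
            return _client_from_token(context)        # illustrative helper
        # System scope (no project ID): fall back to the [neutron]
        # service credentials configured in ironic.conf.
        return _client_from_service_credentials()     # illustrative helper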
Closes-Bug: 2048416
Change-Id: Id91ec6afcf89642fb3069918e768016b8b657a31
(cherry picked from commit c3074524da97517bad4e1aaa5efc1f2cd09152cb)
We have some drivers, such as SNMP, which do not support metrics.
Environments with these nodes should not get "N" messages for "N" nodes
that can't generate sensor data.
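A sketch of the intended behavior (not the literal conductor code):
when a driver cannot produce sensor data, note it at debug level and
move on instead of warning for every such node on every collection
cycle.

    from oslo_log import log

    from ironic.common import exception

    LOG = log.getLogger(__name__)


    def _collect_sensor_data(task):
        try:
            return task.driver.management.get_sensors_data(task)
        except (NotImplementedError,
                exception.UnsupportedDriverExtension):
            LOG.debug('Node %s does not support sensor data collection, '
                      'skipping', task.node.uuid)
            return None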
Closes-bug: 2047709
Change-Id: Ibc1f3feb055521214512c8b350d67933491c2550
(cherry picked from commit b0b7ee425430745ead6eb43d218d7a515da63e23)
Fixes an issue with debug logging referencing node vs node_uuid.
Change-Id: Ic7de9826fbec32038947be89b14f6dfdc2248de4
(cherry picked from commit 578c02813d983e21501a1b9136d57c30bc2b0daa)
At least on some Dell machines, the Redfish SecureBoot resource is
unavailable during configuration, GET requests return HTTP 503.
Sushy does retry these, but not for long enough (the error message
suggests at least 30 seconds, which would be too much to just integrate
in Sushy). This change treats internal errors the same way as a
mismatching "enabled" value, i.e. it just waits.
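A condensed sketch of the wait condition (names simplified, not the
exact driver code): a server-side error from the SecureBoot resource
is treated like an "enabled" value that has not changed yet, so the
caller keeps polling instead of failing the step.

    from sushy import exceptions as sushy_exceptions


    def _secure_boot_matches(task, expected):
        try:
            return (task.driver.management.get_secure_boot_state(task)
                    == expected)
        except sushy_exceptions.ServerSideError:
            # Some BMCs (e.g. certain Dell models) return HTTP 503 while
            # the resource is being reconfigured; keep waiting.
            return False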
Change-Id: I676f48de6b6195a69ea76b4e8b45a034220db2fa
(cherry picked from commit a6e3a7f50cd8594520cf80fa4ef0a07646221809)
Prior to this change, Ironic would not clean up leftover unix sockets
on exit if it was configured to listen on a unix socket. Now, it cleans
them up on a clean service exit.
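A minimal illustration of the cleanup (not the actual WSGI service
code): remember the bound socket path and unlink it when the service
stops.

    import os


    class UnixSocketService:
        def __init__(self, socket_path):
            self.socket_path = socket_path

        def stop(self):
            try:
                os.unlink(self.socket_path)
            except FileNotFoundError:
                # Nothing left behind, e.g. already removed.
                pass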
Note: This is not a clean cherry-pick from master, it includes both
- Ia7d8ed8b40db7e3d6752e768113ccf52318ee374 (the fix) and
- I6ecb489ea1a9e6490c5ddca5c7467b0c4324dfd1 (the release note)
Change-Id: Ia7d8ed8b40db7e3d6752e768113ccf52318ee374
(cherry picked from commit 01507db18c436ec864f1bea509ef12a72f9ffb04)
When the per-node external_http_url feature was introduced by
c197a2d8b24e2fa4c5e7901e448da1b0c93fcd26, it only applied to a config
floppy. This fix ensures that it is also used for the boot ISO, both
when it is generated locally (by _prepare_iso_image()) and when it is
just cached locally (by prepare_remote_image()).
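A simplified sketch of the URL selection now shared by the floppy and
the ISO code paths; it assumes the per-node value lives in driver_info
and that the standard [deploy] options are registered, which may not
match the exact Ironic helpers.

    from oslo_config import cfg

    CONF = cfg.CONF


    def _get_image_base_url(node):
        # Per-node override first, then the global external URL, then
        # the plain conductor HTTP URL.
        return (node.driver_info.get('external_http_url')
                or CONF.deploy.external_http_url
                or CONF.deploy.http_url)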
Change-Id: Ic241da6845b4d97fd29888e28cc1d9ee34e182c1
Closes-Bug: #2044314
(cherry picked from commit 0d59e25cf8ae3e531fcca46b20907014a9a92f09)
It's possible to use virtual media based provisioning on
servers that only support DVD MediaTypes and do not support CD
MediaTypes. The problem in this scenario is that Ironic keeps the
media attached, since it only ejects devices matching the CD type.
Now we also check whether any DVD device has media inserted when
looking for CD devices.
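A sketch of the device selection (attribute names follow the sushy
VirtualMedia model, the helper itself is illustrative): a DVD device
that currently has media inserted is now treated as eligible, so the
image gets detached on hardware exposing only DVD MediaTypes.

    def _find_attached_cd_like_device(virtual_media_devices):
        for device in virtual_media_devices:
            if 'CD' in device.media_types:
                return device
            if 'DVD' in device.media_types and device.inserted:
                # DVD-only systems: reuse the device holding the media
                # so it can be ejected.
                return device
        return None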
Closes-Bug: 2039042
Change-Id: I7a5e871133300fea8a77ad5bfd9a0b045c24c201
(cherry picked from commit 766d2804a1743e238a37762c946dec0984721167)
They are huge, may expose sensitive data and are normally stored in
local files instead. Match the inspector behavior and drop logs.
Change-Id: I569ef8c7f9d78a7a65c48b6b46c12493c5c571c3
(cherry picked from commit 56cbe2569db6f4eb49c1436f576ccdad87b53b5a)
Update the URL to the upper-constraints file to point to the redirect
rule on releases.openstack.org so that anyone working on this branch
will switch to the correct upper-constraints list automatically when
the requirements repository branches.
Until the requirements repository has a stable/2023.2 branch, tests will
continue to use the upper-constraints list on master.
Change-Id: Ifd70929fdc8d2c5a1f89f00eb507db1540b1fb17
If redfish_address is in brackets, unwrap it and check
that it is a valid IPv6 address. If that is the case, use
the unwrapped address to avoid "Name or service not known".
Also add a unit test for normal_ipv6_as_url.
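A small standalone sketch of the check described above (stdlib only,
not the exact Ironic helper): strip the brackets and keep the
unwrapped form only if it really is IPv6.

    import ipaddress


    def unwrap_ipv6(address):
        if address.startswith('[') and address.endswith(']'):
            candidate = address[1:-1]
            try:
                if ipaddress.ip_address(candidate).version == 6:
                    return candidate
            except ValueError:
                pass
        return address

    # unwrap_ipv6('[2001:db8::1]') -> '2001:db8::1'
    # unwrap_ipv6('example.com')   -> 'example.com'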
Closes-Bug: #2036455
Change-Id: I8df20e85e40d8321bd5f88c09fae33b6015bcf51
When parsing redfish driver info, wrap an IPv6 address in brackets
before appending the default scheme/authority.
Updated common.utils.wrap_ipv6() to ignore ValueError, i.e. to
simply return the string unchanged if it is not an IPv6 address.
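A standalone sketch of the described wrap_ipv6() behavior (stdlib
only; the real helper lives in ironic.common.utils): brackets are
added only when the value parses as IPv6, and anything else is
returned unchanged instead of raising ValueError.

    import ipaddress


    def wrap_ipv6(ip):
        try:
            if ipaddress.ip_address(ip).version == 6:
                return '[%s]' % ip
        except ValueError:
            pass
        return ip

    # wrap_ipv6('2001:db8::1') -> '[2001:db8::1]'
    # wrap_ipv6('example.com') -> 'example.com'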
Related: RHBZ#2239356
Closes-Bug: #2036454
Change-Id: Icefd96d6873474b4cfb7fbf3d8337cd42fd63ca6
Generally, print should not be used in unit tests, as it may pollute
the output stream. Right now, our internal build system is facing
BlockingIOError: [Errno 11] write could not complete without blocking
on prints. The ACL tests especially seem to be a big offender because
they are very numerous and each may print several times. Many prints
in other tests are cryptic and probably just left over from debugging.
I only keep the prints in the API unit tests, where the output is
arguably useful, but reduce them to one print per call since the input
is already known.
Change-Id: Ic5aaf9624f86b39609e2db6157c98cf8e35712fc
We have discovered hardware that only applies boot mode / secure boot
changes during a reboot. Furthermore, the same hardware cannot update
both at the same time. To err on the safe side, reboot and wait for
the value to change if it's not changed immediately.
Co-Authored-By: Jacob Anders <janders@redhat.com>
Change-Id: I318940a76be531f453f0f5cf31a59cba16febf57
Ubuntu Focal was in the testing runtime as best-effort
testing in the 2023.1 cycle. In 2023.2, we no longer need to
test Focal as such. Removing its testing to focus more on
making Jammy testing stable.
[0] https://review.opendev.org/c/openstack/tempest/+/884952
Change-Id: Ia3a9bfb6287fd283c3eeb49b43d2c0d12420596d
Two issues have occurred:
1) Zuul has decided some syntax is deprecated and generates an error.
The exclusionary nature of the syntax is simply not supported by RE2,
which is the new requirement, so we now explicitly match "^master$"
as opposed to "not stable branches".
2) The snmp job is marked as non-voting; the root issue appears to be
ipxe or the VMs, unknown as of yet.
Change-Id: I68aa95eb1ed80a0fde1c29d708ebd606393481aa
The dbapi method _reserve_node_place_lock is a bit of a special
method. It has both a decorator to retry sqlite "database is locked"
issues and an outer synchronized process fair lock
(from oslo.concurrency.lockutils), which ensures only *one* thread
is working on locks at a time.
The thing is, we can build up contention when a stack of heartbeats
comes in, because they are forced to execute in serialized fashion.
While investigating some metal3 logs, we could see that some lock
interactions are basically instant, but when things begin to get
backed up, we start seeing 10+ second gaps where we are trying to
get hold of the database and cannot lock the node.
Looking at the code for the method, I realized we were *always*
re-querying the node, but never returning it after updating the node;
apparently only so we could log *if* there was an issue.
Instead, we now just consult the result set and re-query, only when we
must, to determine *who* holds the lock, and we only do so *if* we are
operating without SQLite, because if we are on SQLite we can safely
assume the lock came from another thread.
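A condensed sketch of the new flow (helper and model names are
illustrative, not the exact dbapi code): trust the UPDATE's rowcount,
and only pay for the extra SELECT to report *who* holds the lock when
we are not on SQLite, where a conflict can only come from another
thread.

    from ironic.common import exception
    from ironic.db.sqlalchemy import models


    def _reserve_node(session, node_id, tag):
        count = session.query(models.Node).filter_by(
            id=node_id, reservation=None).update({'reservation': tag})
        if count == 1:
            # We took the lock; no need to re-query the node just to log.
            return
        if session.bind.dialect.name == 'sqlite':
            # Single-host testing backend: the holder is another thread.
            raise exception.NodeLocked(node=node_id, host='another thread')
        # Re-query only to report who actually holds the lock.
        holder = session.query(models.Node.reservation).filter_by(
            id=node_id).scalar()
        raise exception.NodeLocked(node=node_id, host=holder)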
Change-Id: Ie606439670be21cf267eb541ce864711d2097207