12650 Commits

Author SHA1 Message Date
Zuul
1f3ae01626 Merge "RedfishFirmwareInterface - Unit Tests & More logs" into stable/2023.2 2024-01-27 04:35:03 +00:00
Zuul
68d3168ec0 Merge "Cache firmware components on the transition to "manageable"" into stable/2023.2 2024-01-27 00:25:42 +00:00
Zuul
aa1f7e11c3 Merge "Kickstart: Don't error unit tests ksvalidate is present" into stable/2023.2 2024-01-26 22:05:37 +00:00
Julia Kreger
4895c687b1 Kickstart: Don't error unit tests ksvalidate is present
The kickstart unit tests were written in such a way that they behave
differently (and fail) when run on a system with the kickstart validator
present than when run on a system without it. Specifically, when it is
present, an error is generated:

TypeError: write() argument must be str, not MagicMock

This is because we pass in a mock value for unit testing.

Removes the alternative unit test path taken when the validator is
present, and locks the tests into the validator-absent (false) case,
which simplifies the validation path for the kickstart interface.

Change-Id: Idfb6b4f3b49901aa1a222c6fedc4367ef3bfd2a2
(cherry picked from commit bbc82fa1482459e028c14782a3eeb7db8b03181e)
2024-01-26 19:08:54 +00:00
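The TypeError quoted in the Kickstart commit above can be reproduced outside of Ironic. The following is a minimal sketch using only the standard library; the function name `write_mock_to_text_file` is hypothetical and is not part of Ironic's kickstart code.

```python
import tempfile
from unittest import mock


def write_mock_to_text_file():
    """Return the error text raised when a MagicMock is written as text.

    Hypothetical helper: it only illustrates why passing a mock where a
    str is expected fails once real file I/O is involved.
    """
    with tempfile.TemporaryFile(mode="w") as handle:
        try:
            handle.write(mock.MagicMock())
        except TypeError as exc:
            return str(exc)
    return None
```

A test that only passes mocks around never hits this path, which is presumably why the failure appeared only when the validator branch performed a real write.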
Iury Gregory Melo Ferreira
8e75762233 RedfishFirmwareInterface - Unit Tests & More logs
I totally missed Julia's comment in the review; this commit
adds unit tests for the RedfishFirmwareInterface and also more
logs when a specific component is missing.

Change-Id: Ice2c946726103d9957518c2d30ddad3310ee145d
(cherry picked from commit 32c9c7445978650093e2d3b2413b9e97a0ab721a)
2024-01-26 12:53:58 +00:00
Dmitry Tantsur
5c1f85835e Cache firmware components on the transition to "manageable"
Automated cleaning is not guaranteed to be enabled, and in any case it's
too late to cache the components at that point: firmware upgrades may
happen before the transition to "available".

Change-Id: I6b74970fffcc150c167830bef195f284a8c6f197
(cherry picked from commit 607b8734e4b91cc0ee26b49da8dbb63714dfa180)
2024-01-26 09:48:30 -03:00
Dmitry Tantsur
e9dd47e2d9 Fix two severe errors in the firmware caching code
First, it tries to create components even if the current version is not
known and fails with a database constraint error (because the initial
version cannot be NULL). Can be reproduced with sushy-tools before
37f118237a.

Second, unexpected exceptions are not handled in the caching code, so
any of them will cause the node to get stuck in cleaning forever.

On top of that, the caching code is missing a metrics decorator.

This change does not update any unit tests because none currently exist.

Change-Id: Iaa242ca6aa6138fcdaaf63b763708e2f1e559cb0
(cherry picked from commit 23745d97fe4e154c79389a951b4fd47d49fed494)
2024-01-26 09:39:59 -03:00
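The two fixes described in the firmware caching commit above (skip components whose current version is unknown, and never let an unexpected exception escape) can be sketched as follows. This is an illustration under assumptions, not Ironic's code: `cache_components`, `fetch_components` and `save_component` are hypothetical names.

```python
import logging

LOG = logging.getLogger(__name__)


def cache_components(fetch_components, save_component):
    """Cache firmware components without ever raising.

    fetch_components: hypothetical callable returning (name, version) pairs.
    save_component: hypothetical callable persisting one component.
    Returns the names that were actually saved.
    """
    saved = []
    try:
        for name, version in fetch_components():
            if version is None:
                # An unknown version would violate a NOT NULL constraint.
                LOG.debug("Skipping %s: current version unknown", name)
                continue
            save_component(name, version)
            saved.append(name)
    except Exception:
        # Log and carry on so the caller is not left stuck mid-operation.
        LOG.exception("Unexpected failure while caching components")
    return saved
```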
Takashi Kajinami
aec3c072cd Stop using a specific mirror in infra
The currently hard-coded host is not functioning. This replaces
the hard-coded mirror with the detected local CI mirror. If mirror
info is not available, the upstream CentOS mirror is used.

Change-Id: I96a8cb45154c9dbb50efecc22d34c4ff75c6722a
(cherry picked from commit 7032a0d9ac2c875c5349708eb78b779473a41a6e)
2024-01-23 03:40:37 +00:00
Julia Kreger
ab5a5f6a19 Revert "Revert "RBAC: Fix allocation check"" to use Unauthorized
In the backports to fix the policy of the original change, Dmitry
noted that it was actually wrong, because we should have instead
raised NotAuthorized. Dmitry was absolutely correct, because in
hindsight I made the change trying to keep exactly the same behavior,
but the reality is this is a case where we should be explicit,
and tell the user they have done something forbidden.

This revert of the revert fixes that change.
Original Change: https://review.opendev.org/c/openstack/ironic/+/905038
Dmitry's Review Feedback: https://review.opendev.org/c/openstack/ironic/+/905088

Change-Id: I5727df00b8c4ae9495ed14b5cea1c0734b5f688d
(cherry picked from commit 4398c11a5f14980505ede44032d17fa8f5969cdc)
2024-01-17 15:04:36 +00:00
Julia Kreger
7bfc9e3f80 Fix system scoped manageable node network failure
Before this change, if a user requested a node to be cleaned
or "managed" with cleaning enabled when the user is in the
system scope, Ironic would attempt to use the user's token to
make the request to Neutron.

This, unfortunately, does not work, as the neutron client explicitly
requires a project ID to make the request to Neutron. As a result,
Ironic now falls back to its internal credential configuration to make
the forward request, which matches the behavior if a node has been
unprovisioned and the cleaning has been started automatically.

Closes-Bug: 2048416
Change-Id: Id91ec6afcf89642fb3069918e768016b8b657a31
(cherry picked from commit c3074524da97517bad4e1aaa5efc1f2cd09152cb)
2024-01-10 05:20:32 +00:00
Jay Faulkner
11c7d6c96e Do not log lack of metrics support at WARNING lvl
We have some drivers, such as SNMP, which do not support metrics.
Environments with these nodes should not get "N" messages for "N" nodes
that can't generate sensor data.

Closes-bug: 2047709
Change-Id: Ibc1f3feb055521214512c8b350d67933491c2550
(cherry picked from commit b0b7ee425430745ead6eb43d218d7a515da63e23)
2024-01-05 02:16:31 +00:00
Zuul
5dc913ea7b Merge "Fix log message var reference" into stable/2023.2 2023-12-07 10:58:46 +00:00
James Denton
47522a71c5 Fix log message var reference
Fixes an issue with debug logging referencing node vs node_uuid.

Change-Id: Ic7de9826fbec32038947be89b14f6dfdc2248de4
(cherry picked from commit 578c02813d983e21501a1b9136d57c30bc2b0daa)
2023-12-06 13:21:31 +00:00
Dmitry Tantsur
7775cf3da1 Handle internal server errors while configuring secure boot
At least on some Dell machines, the Redfish SecureBoot resource is
unavailable during configuration: GET requests return HTTP 503.
Sushy does retry these, but not for long enough (the error message
suggests at least 30 seconds, which would be too much to just integrate
in Sushy). This change treats internal errors the same way as a
mismatching "enabled" value, i.e. just waits.

Change-Id: I676f48de6b6195a69ea76b4e8b45a034220db2fa
(cherry picked from commit a6e3a7f50cd8594520cf80fa4ef0a07646221809)
2023-12-06 12:23:43 +00:00
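The approach described in the secure boot commit above, treating a temporary server error like a value that has not changed yet, can be sketched as a polling loop. The names `ServiceUnavailable` and `wait_for_secure_boot` are hypothetical stand-ins, and the retry parameters are arbitrary; none of this is taken from Ironic or Sushy.

```python
import time


class ServiceUnavailable(Exception):
    """Hypothetical stand-in for an HTTP 503 reported by the BMC."""


def wait_for_secure_boot(get_enabled, target, attempts=5, delay=0.0):
    """Poll until get_enabled() returns target, tolerating 503 errors.

    get_enabled: hypothetical callable reading the current state.
    Returns True once the state matches, False when attempts run out.
    """
    for _ in range(attempts):
        try:
            if get_enabled() == target:
                return True
        except ServiceUnavailable:
            # Same handling as a mismatching value: keep waiting.
            pass
        time.sleep(delay)
    return False
```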
Zuul
4afac28c40 Merge "Properly cleanup unix sockets in wsgi_service" into stable/2023.2 2023-11-29 12:28:06 +00:00
Dmitry Tantsur
af85c20769 Properly cleanup unix sockets in wsgi_service
Prior to this change, Ironic would not clean up unix sockets left over on
exit if it was configured to listen on a unix socket. Now, it will clean
them up on a clean service exit.

Note: This is not a clean cherry-pick from master, it includes both
- Ia7d8ed8b40db7e3d6752e768113ccf52318ee374 (the fix) and
- I6ecb489ea1a9e6490c5ddca5c7467b0c4324dfd1 (the release note)

Change-Id: Ia7d8ed8b40db7e3d6752e768113ccf52318ee374
(cherry picked from commit 01507db18c436ec864f1bea509ef12a72f9ffb04)
2023-11-28 14:21:49 -08:00
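The cleanup behavior described in the wsgi_service commit above amounts to removing the socket file when the service stops. A minimal standard-library sketch, assuming a POSIX system; `serve_and_cleanup` and `demo` are hypothetical names and do not reflect how Ironic's wsgi_service is structured.

```python
import os
import socket
import tempfile


def serve_and_cleanup(path):
    """Bind a unix socket at path and remove the file on the way out.

    Returns whether the socket file existed while the socket was bound.
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.bind(path)
        sock.listen(1)
        return os.path.exists(path)
    finally:
        sock.close()
        # Closing the socket does not remove the file; unlink it explicitly.
        if os.path.exists(path):
            os.unlink(path)


def demo():
    """Return (existed_while_bound, exists_after_cleanup)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "ironic.sock")
        existed = serve_and_cleanup(path)
        return existed, os.path.exists(path)
```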
Dmitry Tantsur
d385c4a4da Add missing compatibility between idrac and redfish firmware
Change-Id: I3026a5c69930825ea2b88587e62b36e8824fa91e
(cherry picked from commit 6e10ad9ad7548ca698055e882e3c45e5d962598c)
2023-11-28 21:33:54 +00:00
Zane Bitter
c6089dc5e0 Use per-node external_http_url for boot ISO
When the per-node external_http_url feature was introduced by
c197a2d8b24e2fa4c5e7901e448da1b0c93fcd26, it only applied to a config
floppy. This fix ensures that it is also used for the boot ISO, whether
it is generated locally (by _prepare_iso_image()) or just cached
locally (by prepare_remote_image()).

Change-Id: Ic241da6845b4d97fd29888e28cc1d9ee34e182c1
Closes-Bug: #2044314
(cherry picked from commit 0d59e25cf8ae3e531fcca46b20907014a9a92f09)
2023-11-23 17:01:13 +01:00
Iury Gregory Melo Ferreira
235e6ccb98 Make sure we eject media from DVD when CD is requested
It's possible to use virtual media based provisioning on
servers that only support DVD MediaTypes and do not support CD
MediaTypes. The problem in this scenario is that Ironic will keep
the media attached since it will only eject the ones matching the
CD device. Now we check if there is any DVD device with media inserted
when looking for CD devices.

Closes-Bug: 2039042
Change-Id: I7a5e871133300fea8a77ad5bfd9a0b045c24c201
(cherry picked from commit 766d2804a1743e238a37762c946dec0984721167)
2023-10-30 23:44:02 +00:00
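The selection rule from the DVD/CD commit above can be sketched as a small filter: when CD is requested, inserted DVD devices are considered as well. The dictionary layout and the function name `devices_to_eject` are assumptions made for illustration; the real code works on Redfish virtual media resources.

```python
def devices_to_eject(virtual_media, requested="CD"):
    """Return names of inserted devices to eject for the requested type.

    virtual_media: hypothetical list of dicts with the keys "name",
    "inserted" and "media_types".
    """
    acceptable = {requested}
    if requested == "CD":
        # Some servers expose only DVD devices, so include them too.
        acceptable.add("DVD")
    return [
        dev["name"]
        for dev in virtual_media
        if dev["inserted"] and acceptable & set(dev["media_types"])
    ]
```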
Zuul
2a523a8aa9 Merge "Do not store ramdisk logs as part of the inventory" into stable/2023.2 2023-10-08 23:32:03 +00:00
Dmitry Tantsur
39578d9198 Fix the HTTP code for reaching max_concurrent_deploy: 503 instead of 500
Change-Id: I3d8c7724c1d44baa67a6364dde2f52abdb906526
(cherry picked from commit cba10669f52b4d8aca872e9b7d69f12f66d5bb82)
2023-10-03 12:17:10 +00:00
Dmitry Tantsur
7a6e893166 Do not store ramdisk logs as part of the inventory
They are huge, may expose sensitive data and are normally stored in
local files instead. Match the inspector behavior and drop logs.

Change-Id: I569ef8c7f9d78a7a65c48b6b46c12493c5c571c3
(cherry picked from commit 56cbe2569db6f4eb49c1436f576ccdad87b53b5a)
2023-10-03 12:16:38 +00:00
da4e1c9827 Update TOX_CONSTRAINTS_FILE for stable/2023.2
Update the URL to the upper-constraints file to point to the redirect
rule on releases.openstack.org so that anyone working on this branch
will switch to the correct upper-constraints list automatically when
the requirements repository branches.

Until the requirements repository has a stable/2023.2 branch, tests will
continue to use the upper-constraints list on master.

Change-Id: Ifd70929fdc8d2c5a1f89f00eb507db1540b1fb17
2023-09-22 13:48:25 +00:00
6b40719e12 Update .gitreview for stable/2023.2
Change-Id: I8accb31c9fb07a2a10bbcdab4387ad9aa09eb456
2023-09-22 13:48:23 +00:00
Zuul
f78f872271 Merge "Trivial: attach versions to release series" 23.0.0 2023-09-21 13:29:50 +00:00
Zuul
6d9779bf6b Merge "Redfish: wait for secure boot state change if it's not immediate" 2023-09-21 13:29:47 +00:00
Iury Gregory Melo Ferreira
4eb0dbf7b5 RedfishFirmware Interface
Change-Id: I75b2433fade0c36522024c16608d61cd663b38d5
2023-09-20 13:09:38 -03:00
Zuul
bc1c89d993 Merge "inspect_utils, handle bracketed IPv6 redfish addr" 2023-09-19 17:41:21 +00:00
Harald Jensås
21e3e71ea3 inspect_utils, handle bracketed IPv6 redfish addr
If redfish_address is in brackets, unwrap it and check
that it is a valid IPv6 address. If that is the case use
the unwrapped address to avoid "Name or service not known".

Also add a unit test for normal_ipv6_as_url.

Closes-Bug: #2036455
Change-Id: I8df20e85e40d8321bd5f88c09fae33b6015bcf51
2023-09-19 14:54:12 +02:00
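The unwrapping step described in the inspect_utils commit above can be sketched with the standard `ipaddress` module. The function name `unwrap_ipv6` is hypothetical, and this is not the code from the commit.

```python
import ipaddress


def unwrap_ipv6(address):
    """Strip surrounding brackets only if the content is a valid IPv6 address."""
    if address.startswith("[") and address.endswith("]"):
        candidate = address[1:-1]
        try:
            ipaddress.IPv6Address(candidate)
        except ValueError:
            # Not an IPv6 literal: leave the input untouched.
            return address
        return candidate
    return address
```

Name resolution expects the bare address, which is why a bracketed value leads to "Name or service not known".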
Dmitry Tantsur
2bb653a52e Trivial: attach versions to release series
Also fix an incorrect version in the release notes.

Change-Id: If57f34357c03e64188c493f3a1bdc072954c2541
2023-09-19 11:47:24 +02:00
Harald Jensås
72037b596a redfish_address - wrap_ipv6 address
When parsing redfish driver info wrap IPv6 address in brackets
before appending default scheme/authority.

Updated common.utils.wrap_ipv6() to ignore ValueError, i.e.
simply return the string if ip is not an IPv6 address string.

Related: RHBZ#2239356
Closes-Bug: #2036454
Change-Id: Icefd96d6873474b4cfb7fbf3d8337cd42fd63ca6
2023-09-18 21:07:06 +02:00
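The behavior described for wrap_ipv6() in the commit above (wrap IPv6 addresses in brackets, return anything else unchanged) can be sketched as follows. This is an independent illustration named `wrap_ipv6_sketch`; it is not the implementation in Ironic's common.utils.

```python
import ipaddress


def wrap_ipv6_sketch(ip):
    """Wrap an IPv6 address in brackets; return other strings unchanged."""
    try:
        if ipaddress.ip_address(ip).version == 6:
            return "[%s]" % ip
    except ValueError:
        # Hostnames and other non-address strings are passed through.
        pass
    return ip
```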
Dmitry Tantsur
88fd22de79 Remove most prints for unit tests
Generally, print should not be used in unit tests because it may pollute
the output stream. Right now, our internal build system is facing

    BlockingIOError: [Errno 11] write could not complete without blocking

on prints. Especially ACL tests seem to be a big offender because they
are very numerous and each may print several times. Many prints in other
tests are cryptic and probably just leftover from debugging.

I only leave the API unit tests where the output is arguably useful.
But I reduce it to one print per call since the input is already known.

Change-Id: Ic5aaf9624f86b39609e2db6157c98cf8e35712fc
2023-09-15 14:58:57 +02:00
Zuul
22918bde84 Merge "[releasenotes] Prelude for 2023.2/bobcat" 2023-09-15 08:58:03 +00:00
Jay Faulkner
d115a52b20 [releasenotes] Prelude for 2023.2/bobcat
Prelude entry for 2023.2 release.

Change-Id: Ib78dca723d3aa9a3458ce452124657ad0be55a63
2023-09-14 09:54:35 -07:00
Zuul
3d4cd28f89 Merge "devstack - configurable ipv6 address mode" 2023-09-14 10:43:23 +00:00
Zuul
f0fde6c22d Merge "CI: Remove ubuntu focal job" 2023-09-13 05:18:29 +00:00
Harald Jensås
a8ede77e3e devstack - configurable ipv6 address mode
Add variable to define ipv6-address-mode and ipv6-ra-mode
in the devstack plugin.

Change-Id: I0a145bafc2ea37065b0e0fa7445837ded7bd8e46
2023-09-12 18:56:06 +00:00
Dmitry Tantsur
6487b95813 Redfish: wait for secure boot state change if it's not immediate
We have discovered hardware that only applies boot mode / secure boot
changes during a reboot. Furthermore, the same hardware cannot update
both at the same time. To err on the safe side, reboot and wait for
the value to change if it's not changed immediately.

Co-Authored-By: Jacob Anders <janders@redhat.com>
Change-Id: I318940a76be531f453f0f5cf31a59cba16febf57
2023-09-12 18:30:36 +02:00
Zuul
eae2b1260a Merge "Fix minor grammar issues in the help for new inspector options" 2023-09-12 14:45:24 +00:00
likui
065b4bfc12 CI: Remove ubuntu focal job
Ubuntu Focal was in the testing runtime as best-effort
testing in the 2023.1 cycle. In 2023.2, we do not need to
test Focal as such. Remove its testing to focus more
on making Jammy testing more stable.

[0] https://review.opendev.org/c/openstack/tempest/+/884952

Change-Id: Ia3a9bfb6287fd283c3eeb49b43d2c0d12420596d
2023-09-11 10:52:15 +08:00
Zuul
ac28e54071 Merge "DB: Only re-query for a lock holder if we cannot lock" 2023-09-08 19:58:52 +00:00
Zuul
bc80399b3f Merge "Fix two places that can cause issues under SQLite" 2023-09-08 19:58:47 +00:00
Zuul
c00a262d26 Merge "Update proliantutils driver requirements for bobcat" 2023-09-08 09:13:13 +00:00
Zuul
40728f39f7 Merge "PXE: Remove DHCP option 210 from being set" 2023-09-07 18:33:02 +00:00
Dmitry Tantsur
7b9007375e Fix two places that can cause issues under SQLite
In both places, we may potentially iterate over a result set after
closing the read transaction.

Change-Id: I0afce854287a4375c525c19c49ed0ec01bac76b1
2023-09-07 17:03:39 +02:00
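The general remedy for the problem named in the SQLite commit above is to materialize the result set while the transaction is still open. A minimal sketch with the standard `sqlite3` module; the table layout and the names `make_db` and `list_node_names` are assumptions, and Ironic itself goes through SQLAlchemy rather than `sqlite3` directly.

```python
import sqlite3


def make_db():
    """Create a throwaway in-memory database with two example rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE nodes (name TEXT)")
    conn.executemany("INSERT INTO nodes VALUES (?)", [("b",), ("a",)])
    conn.commit()
    return conn


def list_node_names(conn):
    """Fetch all rows inside the transaction, then work on the plain list."""
    with conn:  # the transaction ends when this block exits
        rows = conn.execute("SELECT name FROM nodes ORDER BY name").fetchall()
    # Iteration happens over an in-memory list, not a live cursor.
    return [row[0] for row in rows]
```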
Julia Kreger
985c7fdf21 [CI] Unblock CI by fixing job regex and non-voting snmp
Two issues have occurred:

1) Zuul has decided some syntax is deprecated and generates an error.
   The exclusionary nature of the syntax is just not supported by RE2,
   which is the new requirement, so explicitly match "^master$"
   as opposed to "not stable branches".

2) Mark the snmp job as non-voting; the root issue appears to be ipxe
or the VMs, unknown as of yet.

Change-Id: I68aa95eb1ed80a0fde1c29d708ebd606393481aa
2023-09-07 03:58:34 +00:00
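The regex change in the CI commit above swaps an exclusionary pattern for an explicit match. The difference can be illustrated with Python's `re` module, which still accepts negative lookahead (RE2 does not). The negative pattern shown is an assumed example of the "not stable branches" style; only "^master$" comes from the commit message.

```python
import re

# Assumed example of the exclusionary style: negative lookahead,
# a construct that RE2 does not support.
NOT_STABLE = re.compile(r"^(?!stable/).*$")

# Explicit positive match, as described in the commit message.
ONLY_MASTER = re.compile(r"^master$")


def matches(pattern, branch):
    """Return True if the branch name matches the compiled pattern."""
    return pattern.match(branch) is not None
```

The explicit form is narrower: a branch that is neither master nor stable would match the exclusionary pattern but not "^master$".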
Nisha Agarwal
ec2a5cc7c6 Update proliantutils driver requirements for bobcat
Change-Id: I3230a8fd446126d294cbf837a65b07e497d4031c
2023-09-06 14:07:50 +00:00
Julia Kreger
bb02c49def DB: Only re-query for a lock holder if we cannot lock
Dbapi method _reserve_node_place_lock is a bit of a special
method. It has both a decorator to retry sqlite "database is locked"
issues, and an outer synchronized process fair lock
(from oslo.concurrency.lockutils), which ensures only *one* thread
is working on locks at a time.

Thing is, we can build contention when a stack of heartbeats
come in, because they are forced to execute in serialized fashion.

And while investigating some metal3 logs, we could see some lock
interactions are basically instant, and when things begin to
get backed up, we start seeing 10+ second gaps where we are
trying to get ahold of the database, and can't lock the node.

And looking at the code for the method, I realized we were *always*
re-querying the node, but never returning it after updating the node.
Apparently, this was only so we could log *if* there was an issue.

Instead, just consult the result set, and re-query only if we must
determine *who* holds the lock. We now only do so *if* we are
operating without SQLite, because with SQLite we can safely
assume the lock came from another thread.

Change-Id: Ie606439670be21cf267eb541ce864711d2097207
2023-09-05 10:47:54 -07:00
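The core idea of the locking commit above, consulting the result of the update and re-querying only when the lock could not be taken, can be sketched with the standard `sqlite3` module. The schema and the names `make_db` and `reserve_node` are hypothetical; the sketch leaves out the retry decorator, the process-fair lock and the SQLite-specific shortcut mentioned in the commit.

```python
import sqlite3


def make_db():
    """Create an in-memory database with one unlocked example node."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE nodes (id INTEGER PRIMARY KEY, reservation TEXT)")
    conn.execute("INSERT INTO nodes (id) VALUES (1)")
    conn.commit()
    return conn


def reserve_node(conn, node_id, host):
    """Try to take the lock; look up the holder only on failure.

    Returns (locked, holder).
    """
    with conn:
        cur = conn.execute(
            "UPDATE nodes SET reservation = ? "
            "WHERE id = ? AND reservation IS NULL",
            (host, node_id),
        )
        if cur.rowcount == 1:
            # Fast path: the update itself tells us we hold the lock.
            return True, host
        # Slow path: find out who holds the lock, if the node exists.
        row = conn.execute(
            "SELECT reservation FROM nodes WHERE id = ?", (node_id,)
        ).fetchone()
    return False, (row[0] if row else None)
```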
Zuul
0eb3f40f10 Merge "Add service steps and initial docs" 2023-09-01 23:15:27 +00:00
Zuul
907465eceb Merge "Log an exception from heartbeat" 2023-09-01 21:57:16 +00:00