2492 Commits (master)

Author SHA1 Message Date
Dmitry Tantsur cba10669f5 Fix the HTTP code for reaching max_concurrent_deploy: 503 instead of 500
Change-Id: I3d8c7724c1d44baa67a6364dde2f52abdb906526
2023-10-02 16:13:15 +02:00
OpenStack Proposal Bot db549850e0 Imported Translations from Zanata
For more information about this automatic import see:
https://docs.openstack.org/i18n/latest/reviewing-translation-import.html

Change-Id: Ic59ac600efb738d26a2f38186dcc9272e349e5c7
2023-09-28 04:05:55 +00:00
OpenStack Release Bot 7b91f67df0 Update master for stable/2023.2
Add file to the reno documentation build to show release notes for
stable/2023.2.

Use pbr instruction to increment the minor version number
automatically so that master versions are higher than the versions on
stable/2023.2.

Sem-Ver: feature
Change-Id: Ia3d5f975e26f33b9f43610dd46246b5da03bc10e
2023-09-22 13:48:26 +00:00
Zuul f78f872271 Merge "Trivial: attach versions to release series" 2023-09-21 13:29:50 +00:00
Zuul 6d9779bf6b Merge "Redfish: wait for secure boot state change if it's not immediate" 2023-09-21 13:29:47 +00:00
Iury Gregory Melo Ferreira 4eb0dbf7b5 RedfishFirmware Interface
Change-Id: I75b2433fade0c36522024c16608d61cd663b38d5
2023-09-20 13:09:38 -03:00
Harald Jensås 21e3e71ea3 inspect_utils, handle bracketed IPv6 redfish addr
If redfish_address is in brackets, unwrap it and check
that it is a valid IPv6 address. If that is the case use
the unwrapped address to avoid "Name or service not known".

Also add a unit test for normal_ipv6_as_url.

Closes-Bug: #2036455
Change-Id: I8df20e85e40d8321bd5f88c09fae33b6015bcf51
2023-09-19 14:54:12 +02:00
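The unwrapping described in this commit can be sketched with the standard library alone. This is an illustration, not the code from the patch: the function name `unwrap_bracketed_ipv6` is hypothetical, and returning the input unchanged when the bracketed text is not a valid IPv6 address is an assumption.

```python
import ipaddress


def unwrap_bracketed_ipv6(address):
    """Return the bare address if `address` is a bracketed IPv6 literal.

    Hypothetical helper: any other value is returned unchanged.
    """
    if address.startswith("[") and address.endswith("]"):
        candidate = address[1:-1]
        try:
            # Raises a ValueError subclass when the text is not valid IPv6.
            ipaddress.IPv6Address(candidate)
        except ValueError:
            return address
        return candidate
    return address


print(unwrap_bracketed_ipv6("[2001:db8::1]"))
print(unwrap_bracketed_ipv6("bmc.example.com"))
```

Passing the unwrapped form to a resolver avoids the "Name or service not known" failure the commit mentions, since brackets are URL syntax rather than part of the address.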
Dmitry Tantsur 2bb653a52e Trivial: attach versions to release series
Also fix an incorrect version in the release notes.

Change-Id: If57f34357c03e64188c493f3a1bdc072954c2541
2023-09-19 11:47:24 +02:00
Jay Faulkner d115a52b20 [releasenotes] Prelude for 2023.2/bobcat
Prelude entry for 2023.2 release.

Change-Id: Ib78dca723d3aa9a3458ce452124657ad0be55a63
2023-09-14 09:54:35 -07:00
Dmitry Tantsur 6487b95813 Redfish: wait for secure boot state change if it's not immediate
We have discovered hardware that only applies boot mode / secure boot
changes during a reboot. Furthermore, the same hardware cannot update
both at the same time. To err on the safe side, reboot and wait for
the value to change if it's not changed immediately.

Co-Authored-By: Jacob Anders <janders@redhat.com>
Change-Id: I318940a76be531f453f0f5cf31a59cba16febf57
2023-09-12 18:30:36 +02:00
Zuul 40728f39f7 Merge "PXE: Remove DHCP option 210 from being set" 2023-09-07 18:33:02 +00:00
Zuul ab76ff12e1 Merge "Utilize the JSON-RPC port" 2023-08-31 04:45:05 +00:00
Julia Kreger 980611186e PXE: Remove DHCP option 210 from being set
Ages ago we supported pxelinux. Now, not really since it is long EOL.

And while troubleshooting bug # 2033430, we discovered we had option
210 in the DHCP payload from the server, which ended up being the
folder base path for a tftp client to self reference the structure,
but only with OVN.

Further troubleshooting with the neutron-dhcp-agent and dnsmasq
revealed we never actually sent that option to clients.

In other words, it was always redundant. Since excess
information could be part of the problem with grub, we're removing
it.

Change-Id: Iaa2f174b6082fadcab6635ca874fc5fae2fb4842
2023-08-30 13:27:54 -07:00
Julia Kreger c84fe147a3 Utilize the JSON-RPC port
Adds storage of the json-rpc port number to the conductor hostname
to enable rpc clients to understand which rpc services they need to
connect to.

Depends-On: https://review.opendev.org/c/openstack/ironic-lib/+/879211
Change-Id: I6021152c83ab5025a9a9e6d8d24c64278c4c1053
2023-08-30 08:56:17 -07:00
Zuul 1bbc67c1b6 Merge "Add inspection (processing) hooks" 2023-08-29 16:45:16 +00:00
Zuul 9f7218243b Merge "Permit Ironic to notify IPA it can support MD5" 2023-08-29 12:32:54 +00:00
Zuul 120ccf50cc Merge "Add service steps call to agent logic" 2023-08-29 04:11:35 +00:00
Zuul 8be7efdeab Merge "Introduce default kernel/ramdisks by arch" 2023-08-29 04:11:32 +00:00
Zuul f7dfc13c94 Merge "Adds service steps" 2023-08-29 02:56:22 +00:00
Julia Kreger e1a0864635 Add service steps call to agent logic
While the prior service steps patch had a huge portion of the
needed code already due to copy-pasta, this change finishes
wiring in the ability for the agent to be launched for service
steps and heartbeat to occur, combined with support to retrieve
service steps from the running agent, ultimately to enable
operators to take a deployed node, and ask Ironic to make changes,
or my more favorite use case, go benchmark it for a while.

Also edits the service steps release note to remove the outstanding
issue, and makes some minor corrections in the code which was copied
but didn't quite have testing wired up yet.

Change-Id: Ibfe42037b520a76539234cf1a5e19afd335ce8a8
2023-08-28 20:57:43 +00:00
Bifrost 3c5e05a8a4 Introduce default kernel/ramdisks by arch
Introduce config to allow setting default ramdisks per-architecture.
The hierarchy of the parameters is:
Node config > config by architecture > general config

Change-Id: I95dfece3e8f7bcd3121ac808985cb61997877a51
2023-08-28 17:25:37 +01:00
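The lookup order stated in this commit (node config, then config by architecture, then general config) can be sketched as a simple fallback chain. The function and parameter names below are hypothetical and do not reflect Ironic's actual option names.

```python
def pick_image(node_value, arch_defaults, general_default, cpu_arch):
    """Resolve a kernel or ramdisk reference using the stated hierarchy.

    Hypothetical sketch: node config > config by architecture > general config.
    """
    if node_value:
        return node_value
    arch_value = arch_defaults.get(cpu_arch)
    if arch_value:
        return arch_value
    return general_default


# Assumed example values, for illustration only.
arch_defaults = {"aarch64": "http://example.com/ipa-arm64.kernel"}
print(pick_image(None, arch_defaults, "http://example.com/ipa.kernel", "aarch64"))
```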
Mahnoor Asghar e6360bc84b Add inspection (processing) hooks
Adds inspection hooks in the agent inspect interface for processing
data received from the ramdisk at the /v1/continue_inspection
endpoint. The four default configuration hooks 'ramdisk-error',
'validate-interfaces', 'ports' and 'architecture' are added.
(The remaining inspection hooks will be added in further patches.)

Change-Id: I2cf1be465ba7a93fd66881b14972e960acd4dd4e
Story: #2010275
2023-08-25 09:38:39 -02:00
Julia Kreger 2fd3d8f01e Fail on node lookup if it is locked
In the agent token mechanism, restrictions exist on when an agent
token can be generated, and unfortunately this has to be done on
the conductor side involving a lock and a task because we need to
save the state of the node.

As such, we were in a situation where we were waiting on DB node
locking, which would prevent the agent from getting a node, and
potentially causing the lookup operation to fail, eventually.

We now quickly return NodeLocked which shouldn't cause the agent
any issues, although we need to improve error handling there as
well.

Change-Id: Ice335eed82b936753be99eedb16ceccf8a9a86a8
2023-08-23 13:18:43 -07:00
Julia Kreger 84f1a1c321 Permit Ironic to notify IPA it can support MD5
Adds a new configuration option which can be set by an
operator to tell Ironic's agent that it is able to process
an MD5 checksum.

Depends-On: https://review.opendev.org/c/openstack/ironic-python-agent/+/882367
Change-Id: I79228e773db9e60fcc2d16ec028ba233c4ba756f
2023-08-22 16:06:35 -07:00
Zuul cf018a6121 Merge "Retry connecting vmedia through a DVD device if available" 2023-08-18 02:10:17 +00:00
Jacob Anders e332898164 Retry connecting vmedia through a DVD device if available
Currently, it is not possible to use virtual media based provisioning on
servers that only support DVD MediaTypes and do not support CD
MediaTypes. This change adds support for DVD-only virtual media. In
addition to this, it adds handling of BadRequestException on attempt to
insert virtual media, so that instead of giving up on the first failure
encountered, the code continues to iterate through the list of other
virtual media devices available on the system and attempts to insert
media into the next suitable device until it either succeeds or runs out of
devices to try.

Change-Id: I0bb0ad5df613df86cd8a7686f9c32e902826cd20
Closes-Bug: #2031595
2023-08-17 22:39:16 +10:00
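The "keep trying the next device" behaviour described in this commit can be sketched as a loop. Everything here is a stand-in: `BadRequestError`, `FakeDevice` and `insert_into_first_working_device` are hypothetical names, not the Redfish client API used by the patch.

```python
class BadRequestError(Exception):
    """Hypothetical stand-in for the error a BMC client raises on HTTP 400."""


class FakeDevice:
    """Hypothetical virtual media device used only to exercise the loop."""

    def __init__(self, name, media_types, accepts_insert=True):
        self.name = name
        self.media_types = media_types
        self.accepts_insert = accepts_insert
        self.image = None

    def insert_media(self, image_url):
        if not self.accepts_insert:
            raise BadRequestError(f"{self.name} rejected the image")
        self.image = image_url


def insert_into_first_working_device(devices, image_url,
                                     wanted_types=("CD", "DVD")):
    """Try each suitable device in turn; return the one that took the image.

    Returns None when no suitable device accepted the media.
    """
    for device in devices:
        if not set(device.media_types) & set(wanted_types):
            continue  # skip devices with no suitable media type
        try:
            device.insert_media(image_url)
        except BadRequestError:
            continue  # do not give up; move on to the next device
        return device
    return None
```

With a CD device that rejects the request followed by a DVD device, the loop returns the DVD device instead of failing on the first error.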
Julia Kreger 2366a4b86e Adds service steps
A huge list of initial work for service steps

* Adds service_step verb
* Adds service_step db/object/API field on the node object for the
  status.
* Increments the API version to 1.87 for both changes.
* Increments the RPC API version to 1.57.
* Adds initial testing to help ensure that supplied steps
  are passed through and executed.

Does not:

* Have tests for starting the agent ramdisk, although this is
  relatively boilerplate.
* Have a collection of pre-decorated steps available for immediate
  consumption.

Change-Id: I5b9dd928f24dff7877a4ab8dc7b743058cace994
2023-08-16 06:34:08 -07:00
Zuul 9b181b83a8 Merge "Support sha256/sha512 with the ilo firmware upgrade logic" 2023-08-14 16:36:21 +00:00
Zuul e2011518f1 Merge "Prevent MissingAttribute error when supportedApplyTime missing" 2023-08-08 07:59:03 +00:00
Jacob Anders f93712d7a6 Prevent MissingAttribute error when supportedApplyTime missing
On some hardware, supportedApplyTime attribute may not be listed
under Redfish BIOS settings URL. This patch adds handling of this case
to prevent failure on attempt of updating BIOS settings.

Change-Id: I40359973fd832146cb2b179bfa447a308078e83d
2023-08-08 13:18:31 +10:00
Julia Kreger 23f4a7d993 Support sha256/sha512 with the ilo firmware upgrade logic
Adds support for SHA256 and SHA512 checksums to be passed
to firmware upgrade steps for the ilo hardware type.

Change-Id: I5455c4bfa4741a35b0ddada37298c897887e6cea
2023-08-07 15:20:14 +00:00
Zuul b769a8199a Merge "Add wait step" 2023-07-28 05:16:26 +00:00
Zuul 96b1718b42 Merge "Enable vendor interfaces to be called as steps" 2023-07-27 17:19:09 +00:00
Julia Kreger 8fc8372e74 Add wait step
Adds a wait step to allow for finer grained workflows
and forcing interruptions which may be needed in some
cases with specialized hardware.

Change-Id: Idc338b761ebe35a4635022a324ca5acbf29fc462
2023-07-24 22:42:20 +00:00
Julia Kreger 091edb0631 Retry SQLite DB write failures due to locks
Adds a database retry decorator to capture and retry exceptions
rooted in SQLite locking. These locking errors are rooted in
the fact that essentially, we can only have one distinct writer
at a time. This writer becomes transaction oriented as well.

Unfortunately with our green threads and API surface, we run into
cases where we have background operations (mainly, periodic tasks...)
and API surface transactions which need to operate against the DB
as well. Because we can't say one task or another (realistically
speaking) can have exclusive control and access, then we run into
database locking errors.

So when we encounter a lock error, we retry.

Adds two additional configuration parameters to the database
configuration section, to allow this capability to be further
tuned, as file IO performance is *surely* a contributing factor
to our locking issues as we mostly see them with a loaded CI
system where other issues begin to crop up.

The new parameters are as follows:
* sqlite_retries, a boolean value allowing the retry logic
  to be disabled. This can largely be ignored, but is available
  as it was logical to include.
* sqlite_max_wait_for_retry, an integer value, default 30 seconds
  as to how long to wait for retrying SQLite database operations
  which are failing due to a "database is locked" error.

The retry logic uses the tenacity library, and performs an
exponential backoff. Setting the amount of time to a very large
number is not advisable; as such, the default of 30 seconds was
deemed reasonable.

Change-Id: Ifeb92e9f23a94f2d96bb495fe63a71df9865fef3
2023-07-18 13:14:45 +00:00
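The retry behaviour described in this commit can be sketched as a decorator. The commit states that the real logic uses the tenacity library; the sketch below deliberately uses only the standard library so that it runs without extra dependencies. The decorator name and its parameters are hypothetical, and `max_wait` only loosely mirrors the role of sqlite_max_wait_for_retry.

```python
import functools
import sqlite3
import time


def retry_on_locked(max_wait=30.0, initial_delay=0.1, sleep=time.sleep):
    """Retry a call that fails with SQLite's "database is locked" error.

    Hypothetical sketch: the delay doubles after each attempt (exponential
    backoff) and the error is re-raised once the next sleep would push the
    total time slept past `max_wait` seconds.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            delay = initial_delay
            waited = 0.0
            while True:
                try:
                    return func(*args, **kwargs)
                except sqlite3.OperationalError as exc:
                    if "database is locked" not in str(exc):
                        raise  # unrelated database error: do not retry
                    if waited + delay > max_wait:
                        raise  # retry budget exhausted
                    sleep(delay)
                    waited += delay
                    delay *= 2
        return wrapper
    return decorator
```

Accepting the `sleep` callable as a parameter is a design choice for this sketch only: it lets a test record the delays without actually waiting.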
Julia Kreger 0099d1812d Don't actually heartbeat with sqlite
Disables internal heartbeat mechanism when ironic has been
configured to utilize a SQLite database backend.

This is done to lessen the possibility of a
"database is locked" error, which can occur when two
distinct threads attempt to write to the database
at the same time with open writers.

The process keepalive heartbeat process was identified as
a major source of these write operations as it was writing
every ten seconds by default, which would also collide with
periodic tasks.

Change-Id: I7b6d7a78ba2910f22673ad8e72e255f321d3fdff
2023-07-14 09:22:01 -07:00
Julia Kreger 76c075269d Enable vendor interfaces to be called as steps
Adds the logic and testing to handle vendor interfaces to be able
to be called as steps, as well as adds the ipmitool send_raw
vendor passthru method to be able to be called as a step.

Change-Id: I741a4173f1d150298008d3190e4c3998402a8b86
2023-07-13 07:40:53 -07:00
Julia Kreger fb978dab1c DB: Fix result set locking with periodics
An issue previously existed where periodics would cause an open
transaction to exist with the database which would cause issues
when attempting to write to the database.

This issue has been fixed by assembling the data to return to
the calling method, such that an open transaction does not
remain, by copying the data retrieved from the database,
thus disjointing it from the transaction.

Closes-Bug: #2027405
Change-Id: I6401193b04fd3be78c37433bfdd0ccbd92aac8da
2023-07-12 12:07:09 -07:00
Julia Kreger c4e3100d5c Add hold steps
* Updates API version to 1.85 to permit an ``unhold`` verb
* Adds the ``deploy hold`` and ``clean hold`` provision states
  to the internal state machine.
* Adds documentation on steps to help provide greater clarity
  to Ironic's users on how to utilize steps. It should be noted
  this documentation also includes the power state reserved step
  names from the DPU functionality patch.
* Fixes the state machine diagram. Changes type to PNG as SVG
  rendering is broken due to python libraries utilized for SVG
  generation which do not work on more recent Python versions.

Change-Id: I34f58f4e77e7757b89247fd64f5fcde26f679453
2023-06-30 14:34:26 -07:00
Julia Kreger 402c32094b Handle SAWarning around allocations FK Constraints
We have started to notice an SAWarning from sqlalchemy indicating:

  SAWarning: Cannot correctly sort tables; there are unresolvable
      cycles between tables "allocations, nodes", which is usually
      caused by mutually dependent foreign key constraints.
      Foreign key constraints involving these tables will not be
      considered; this warning may raise an error in a future release.

Hunting this down, it appears to be the two data consistency Foreign
Key constraints in the "allocations" table where an allocation would
try to have a conductor_affinity value mapped to conductors.id
and also have a direct association to a node, which *also* had the
same constraint.

And then similarly, mapping in reverse, asserting a fk constraint,
when nodes also had its own constraint back on allocations.

Sort of a circular loop.

Anyhow, removes it, and adds a db migration to remove the two
constraints.

Change-Id: I5596008e4971a29c635c45b24cb85db2d0d13ed3
2023-06-26 14:27:59 -07:00
Zuul ce1abd4007 Merge "Handle duplicate node inventory entries per node" 2023-06-14 16:33:52 +00:00
Mahnoor Asghar fa2d6685f3 Handle duplicate node inventory entries per node
When a node is inspected more than one time and the database is
configured as a storage backend, a new entry is made in the database
for each inspection result (node inventory). This patch handles this
behaviour as follows:
* By deleting previous inventory entries for the same node before adding
  a new entry in the database.
* By retrieving the most recent node inventory from the database when the
  database is queried.

Change-Id: Ic3df86f395601742d2fea2bcde62f7547067d8e4
2023-06-07 08:08:37 -04:00
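The two behaviours described in this commit (replace older inventory rows on write, return the newest row on read) can be sketched with the standard sqlite3 module. The table name, columns and function names below are hypothetical and do not reflect Ironic's actual schema or database API.

```python
import sqlite3


def create_schema(conn):
    """Create a hypothetical inventory table for this sketch."""
    conn.execute(
        "CREATE TABLE node_inventory ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "node_id INTEGER NOT NULL, "
        "data TEXT NOT NULL)"
    )


def save_inventory(conn, node_id, data):
    """Delete earlier entries for the node, then store the new one."""
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute("DELETE FROM node_inventory WHERE node_id = ?",
                     (node_id,))
        conn.execute(
            "INSERT INTO node_inventory (node_id, data) VALUES (?, ?)",
            (node_id, data),
        )


def latest_inventory(conn, node_id):
    """Return the most recent inventory for the node, or None."""
    row = conn.execute(
        "SELECT data FROM node_inventory WHERE node_id = ? "
        "ORDER BY id DESC LIMIT 1",
        (node_id,),
    ).fetchone()
    return row[0] if row else None
```

Ordering by the row id on read means the newest entry is still returned even if duplicates written before such a change remain in the table.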
Zuul 97f7177495 Merge "execute on child node support" 2023-06-07 04:04:45 +00:00
Zuul 8ef69aaa6a Merge "Prepare [inspector]require_managed_boot to change to True in the future" 2023-06-05 14:36:59 +00:00
Zuul 964a82db18 Merge "Add to Redfish hardware inventory collection" 2023-06-01 10:25:14 +00:00
Zuul 2bd69444d9 Merge "[iRMC] Fix IPMI incompatibility handling error" 2023-05-30 13:20:39 +00:00
Mahnoor Asghar b3d7ba88d2 Add to Redfish hardware inventory collection
Add to the information collected by Redfish hardware inspection from
sushy, and store it in the documented hardware inventory format

Change-Id: I651599b84e6b8901647960b719626489b000b65f
2023-05-30 05:58:00 -04:00
Zuul 32532eeda5 Merge "DPU modeling - parent_node DB/Model/API" 2023-05-24 23:18:33 +00:00
Julia Kreger 013ac0cb41 execute on child node support
Allows steps to be executed on child nodes, and adds
the reserved power_on, power_off, and reboot step names.

Change-Id: I4673214d2ed066aa8b95a35513b144668ade3e2b
2023-05-24 15:42:46 -07:00
Julia Kreger 93688e9531 Explicitly use a session for DB version check
The db field value version check, which is a preflight to
major upgrades (to detect if a prior upgrade was not completed)
was using model_query, which could orphan an open transaction in
the same process until the python interpreter went and took out
the proverbial trash.

We now use an explicit session which structurally ensures we close
any open transactions which allows a metadata lock to be obtained
to perform a schema update.

Change-Id: Id51419bc50af5a756bb7b0ca451df1936dd6f904
2023-05-23 20:26:05 +00:00