Currently, we can't abort the process of node inspection from ironic api. When a node is not properly setup under inspection network, admins can only wait it to fail after a specified timeout value, or by using inspector api, or ``openstack baremetal introspection abort`` command to abort the introspection process. Although the inspection state will be synchronized to ironic by periodic task, it will create a little delay of time. Node state will be inconsistent between ironic api and inspector api before next state synchronization. The time defaults to 60 seconds, but may vary depends to user configuration. This spec will support ironic handling abort from the inspect wait state. The inspect wait state is required before this spec can proceed, which will be addressed by [1]. [1] https://bugs.launchpad.net/ironic/+bug/1725211 Change-Id: I54775e12ef558fd99c8d6f517e91348e6f0261c3 Related-bug: #1703089
6.2 KiB
Support baremetal inspection abort
https://bugs.launchpad.net/ironic/+bug/1703089
This spec aims to support aborting node inspection from ironic API. A
dependency of inspect wait
state in1 is
required for this spec to continue.
Problem description
Currently, we can't abort the process of node inspection from ironic
API. When a node is not properly setup under inspection network, admins
can only wait it to fail after specified timeout, or abort the
inspection process from ironic inspector API/CLI (if the in-band inspect
interface inspector
is in use).
Although the inspection state will be synchronized to ironic by periodic task, it's not consistent for an operation started from ironic, then stopped by inspector, furthermore, it creates a little delay of time. Node state is inconsistent between ironic and inspector until next state synchronization. The default time interval for ironic-inspector state synchronization is 60 seconds, it may vary depending on user configuration.
Proposed change
Add state transition of inspect wait
to
inspect failed
to state machine, add support to ironic to
allow the verb abort
can be requested when node in
inspect wait
state.
Add a method named abort
into
InspectInterface
, so that inspect interface can provide
implementation to support inspection abort. The default behavior is to
raise an UnsupportedDriverExtension
exception. Implement
the abort operation for inspector
inspect interface.
When an abort operation is requested from ironic API, and the node in
the state of inspect wait
, ironic calls abort
method from inspect interface of driver API, and moves node state to
inspect failed
if the method executed successfully.
Note that, the abort request to ironic-inspector is asynchronous,
ironic will move node to inspect failed
once the request is
accepted (202), disregard if the operation at ironic-inspector is
performed successfully. This reduces the design complexity for this
feature by handling failure at the side of ironic-inspector.
From the point of view of ironic-inspector, every inspect request will refresh local cache for the node, it assures that node state is in sync when starting node inspection. However, inconsistent node state do exist if abort request is accepted but not performed successfully at ironic-inspector. This inconsistency will be eliminated by ironic-inspector node cache clean up when timeout is reached.
Involved changes are:
- Add a method named
abort()
to base inspect interface (InspectInterface). - Implement
abort()
forinspector
inspect interface. - Implement the logic for ironic handling the verb
abort
when provisioning state isinspect wait
.
Alternatives
- Wait for
inspect fail
after specified timeout value. - Request through ironic-inspector api or
openstack baremetal introspection abort
command. Be aware that it's only viable when using ironic inspector as inspect interface. Other inspect interfaces like out-of-band inspection may have different approach to achieve the same goal, that is beyond the scope of this spec.
Data model impact
None
State Machine Impact
Add a state transition of inspect wait
to
inspect failed
with event abort
to ironic
state machine.
REST API impact
Modify provision state API to support the transition described in
this spec. API microversion will be bumped. For clients with earlier
microversion, the verb abort
is not allowed when a node is
in inspect wait
state.
- PUT /v1/nodes/{node_ident}/states/provision
The same JSON Schema is used to
abort
a node ininspecting
state:{ "target": "abort" }
For client with earlier microversion, 406 (Not Acceptable) is returned
For client with supported microversion
- 202 (Accepted) is returned if request accepted
- 400 (Bad Request) is returned if current inspect interface does not support abort
Client (CLI) impact
"ironic" CLI
None
"openstack baremetal" CLI
None
RPC API impact
None
Driver API impact
A new method abort
will be added to
InspectInterface
in base.py, the default behavior is to
raise the exception UnsupportedDriverExtension
:
def abort(self, task):
raise exception.UnsupportedDriverExtension(
driver=task.node.inspect_interface,
extension='abort')
Nova driver impact
None
Ramdisk impact
None
Security impact
None
Other end user impact
None
Scalability impact
For multiple nodes under inspection in a notable scale, it will reduce a little time costs in case of inspection retry.
Performance Impact
None
Other deployer impact
Deployers can abort hardware introspection through ironic API/CLI, besides the inspector API/CLI, for nodes using inspector as the (in-band) inspection interface.
Developer impact
None
Implementation
Assignee(s)
- Primary assignee:
-
kaifeng
Work Items
- Add transition of
inspect wait
toinspect failed
viaabort
. - Add a new method
abort()
to the base inspect interface. - Add the abort implementation to ironic
inspector.Inspector
. - Implement the abort logic in ironic conductor.
Dependencies
None
Testing
Tempest test will be added to test the REST API change.
Upgrades and Backwards Compatibility
API will be bumped for backward compatibility. Client requests with microversion before this feature will be treated identically.
Documentation Impact
Related documents and state machine diagram will be updated accordingly.