Enumerate Inspector errors
Propose enumerating Ironic Inspector error messages so that automation built around the Ironic Inspector REST API can be more robust and straightforward.

Change-Id: Ib5af833224c33274e23b417da17c71825b26775b
parent 77819e7057
commit 8ff5eaa38b

..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

==================================
Ironic Inspector Error Enumeration
==================================

https://bugs.launchpad.net/ironic-inspector/+bug/1710945

This blueprint will introduce a new field `error-code` to the
**Ironic Inspector** API. The new field is intended to make automation
around **Ironic Inspector** easier and more reliable.


Problem description
===================

Currently, if the node inspection process fails for any reason,
it may be hard for **Ironic Inspector** REST API consumers to determine
the exact cause of the failure. That is because the only error indication
the **Ironic Inspector** REST API currently offers (other than the
HTTP status code) is a free-form error message text.

.. code-block:: json

    {
        "error": {
            "message": "Diskette drive 0 seek failure"
        }
    }


Proposed change
===============

The proposal is to enumerate **Ironic Inspector** REST API errors by
introducing a new numeric field `error-code` to the
**Ironic Inspector** REST API.

There is probably no need to assign a distinct `error-code` to every
possible `error` message. Instead, a handful of important error classes
may be identified, and each `error` message may then be mapped onto a
value from the `error-code` set.

The collection of generally useful `error-code` values would become
part of a common library consumed by **Ironic Python Agent**,
**Ironic Inspector** and its CLI tool.


Alternatives
------------

Advise **Ironic Inspector** REST API consumers to rely upon the
`error` messages they observe. This would constitute a somewhat toxic
design: it effectively prevents **Ironic Inspector** developers from
changing error messages (accidental change, rewording, localization) and
puts needless effort on the consumers, while the end product would remain
fragile.


Data model impact
-----------------

The node object in the **Ironic Inspector** database schema would include
a new integer field, `error_code`.

The **Ironic Inspector** REST API would include a new integer
field, `error-code`.

The `error-code` values would encode the exact error (least significant
byte), the more general error class (second byte) and the severity of the
error (most significant byte):

.. code-block:: python

    # Severity occupies the most significant (third) byte.
    ERROR_SEVERITY_LOW = 0x000000
    ERROR_SEVERITY_HIGH = 0x010000
    ERROR_SEVERITY_FATAL = 0x020000

    # Error class occupies the second byte.
    ERROR_CLASS_NONE = 0x0000
    ERROR_CLASS_IO = 0x0100
    ERROR_CLASS_MEMORY = 0x0200
    ...

    # Exact error occupies the least significant byte.
    ERROR_CODE_NONE = 0x00
    ERROR_CODE_BADSECTOR = 0x01
    ERROR_CODE_OOM = 0x02
    ...

    error_code = ERROR_SEVERITY_LOW | ERROR_CLASS_MEMORY | ERROR_CODE_OOM

The existence of the error class would relax the dependency on the exact
error codes among different versions of **Ironic Inspector** and the
surrounding tooling. Even if the client is not aware of the exact
`error-code` it received from **Ironic Inspector**, the client can
still attempt to interpret the error class and act according to
the encoded severity.

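The layout described above can be decoded with simple bit masks. The
following is a minimal sketch, assuming one byte per component with the
severity stored in the third byte; the mask names and the helpers
`make_error_code` and `split_error_code` are hypothetical and not part of
any existing library:

.. code-block:: python

    # Hypothetical masks, assuming one byte per component.
    ERROR_SEVERITY_MASK = 0xFF0000
    ERROR_CLASS_MASK = 0x00FF00
    ERROR_CODE_MASK = 0x0000FF

    # Assumed values: severity is placed in the third byte.
    ERROR_SEVERITY_LOW = 0x000000
    ERROR_SEVERITY_HIGH = 0x010000
    ERROR_SEVERITY_FATAL = 0x020000

    ERROR_CLASS_IO = 0x0100
    ERROR_CLASS_MEMORY = 0x0200

    ERROR_CODE_BADSECTOR = 0x01
    ERROR_CODE_OOM = 0x02


    def make_error_code(severity, error_class, code):
        """Pack the three components into a single integer."""
        return severity | error_class | code


    def split_error_code(error_code):
        """Return the (severity, error class, exact code) components."""
        return (error_code & ERROR_SEVERITY_MASK,
                error_code & ERROR_CLASS_MASK,
                error_code & ERROR_CODE_MASK)


    error_code = make_error_code(
        ERROR_SEVERITY_HIGH, ERROR_CLASS_MEMORY, ERROR_CODE_OOM)
    severity, error_class, code = split_error_code(error_code)

A client that does not recognize the exact `code` can still branch on
`error_class` and `severity`.
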
The existing database would have to be migrated to the modified schema.
The initial value of the new `error_code` field would be set to `<no-error>`
(e.g. 0x000000).


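As a rough illustration of such a migration, the sketch below adds the
column with a zero default to a SQLite table using only the Python standard
library. The table name `nodes` and its other columns are assumptions made
for the example; the actual migration would be written with whatever schema
migration tooling the project uses.

.. code-block:: python

    import sqlite3

    NO_ERROR = 0x000000

    conn = sqlite3.connect(":memory:")
    # Stand-in for the pre-migration schema (hypothetical, simplified).
    conn.execute("CREATE TABLE nodes (uuid TEXT PRIMARY KEY, error TEXT)")
    conn.execute("INSERT INTO nodes (uuid, error) VALUES (?, ?)",
                 ("13211c7a-0402-4a1d-b970-5a44870125f5", None))

    # The migration step: existing rows pick up the <no-error> default.
    conn.execute(
        "ALTER TABLE nodes ADD COLUMN error_code INTEGER NOT NULL DEFAULT %d"
        % NO_ERROR)

    rows = conn.execute("SELECT uuid, error_code FROM nodes").fetchall()
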
HTTP API impact
---------------

When **Ironic Python Agent** sends the introspection
results to **Ironic Inspector** via the ramdisk
callback, the `error-code` attribute may be present:

.. code-block:: json

    POST /v1/continue
    {
        "inventory":
        {
            ...
        },
        "root_disk": "/dev/sda1",
        "boot_interface": "01:11:22:33:44:55:66",
        "error": "Diskette drive 0 seek failure",
        "error-code": 1234
    }

When **Ironic Inspector** clients (e.g. the CLI) retrieve the introspection
status, the `error-code` attribute will be present alongside the
existing `error` attribute:

.. code-block:: json

    GET /v1/introspection/13211c7a-0402-4a1d-b970-5a44870125f5
    {
        "finished": true,
        "state": "error",
        "error": "Diskette drive 0 seek failure",
        "error-code": 1234,
        ...
    }

The **Ironic Python Agent** REST API microversion would have to be bumped.


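For automation consuming the status response, the handling could look
roughly like the sketch below. It parses a sample response body, tolerates
servers that do not yet return `error-code`, and branches on the error class
rather than on the message text. The sample payload, the class constants and
the `classify` helper are illustrative assumptions:

.. code-block:: python

    import json

    ERROR_CLASS_MASK = 0x00FF00  # assumed: the class lives in the second byte
    ERROR_CLASS_IO = 0x0100
    ERROR_CLASS_MEMORY = 0x0200

    CLASS_NAMES = {ERROR_CLASS_IO: "io", ERROR_CLASS_MEMORY: "memory"}


    def classify(status):
        """Map an introspection status document to a coarse outcome."""
        if not status.get("error"):
            return "ok"
        error_code = status.get("error-code")
        if error_code is None:
            # Older server: only the free-form message is available.
            return "unknown"
        return CLASS_NAMES.get(error_code & ERROR_CLASS_MASK, "unknown")


    body = ('{"finished": true, "state": "error", '
            '"error": "Diskette drive 0 seek failure", "error-code": 257}')
    outcome = classify(json.loads(body))
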
Client (CLI) impact
-------------------

.. code-block:: bash

    $ openstack baremetal introspection status 13211c7a-0402-4a1d-b970-5a44870125f5
    +-------------+--------------------------------------+
    | Field       | Value                                |
    +-------------+--------------------------------------+
    | error       | Diskette drive 0 seek failure        |
    | error-code  | 1234 (I/O Error)                     |
    | finished    | True                                 |
    | finished_at | 2017-09-01T14:04:58                  |
    | started_at  | 2017-09-01T14:02:12                  |
    | state       | error                                |
    | uuid        | 13211c7a-0402-4a1d-b970-5a44870125f5 |
    +-------------+--------------------------------------+


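The human-readable label shown next to the numeric value could be derived
from the error class. A minimal sketch, assuming the class occupies the
second byte and using a hypothetical `format_error_code` helper and label
table:

.. code-block:: python

    ERROR_CLASS_MASK = 0x00FF00  # assumed position of the error class

    # Hypothetical labels keyed by error class.
    CLASS_LABELS = {
        0x0000: "No Error",
        0x0100: "I/O Error",
        0x0200: "Memory Error",
    }


    def format_error_code(error_code):
        """Render an error code as '<number> (<class label>)'."""
        label = CLASS_LABELS.get(
            error_code & ERROR_CLASS_MASK, "Unknown Error")
        return "%d (%s)" % (error_code, label)


    value = format_error_code(0x0101)
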
Ironic Python Agent impact
--------------------------

A new dependency on the common library enumerating error codes
would be introduced.


Performance and scalability impact
----------------------------------

None.


Security impact
---------------

None.


Deployer impact
---------------

The deployer would be able to build automation using the
same library as the ironic-* projects when processing or reporting
an error.


Developer impact
----------------

Developers should adhere to the standardized error codes. Introducing
a new error code will require an update of the shared error codes library.


Upgrades and Backwards Compatibility
------------------------------------

The new `error-code` attribute/field enhances the current error
handling with further detail, expanding on the current error reporting.
This should be a backwards-compatible change (e.g. older CLI/automation
won't be broken).


Implementation
==============

Assignee(s)
-----------

Primary assignee:
  <etingof>


Work Items
----------

* Create a common library for error codes
* Adopt the new common library in:

  * IPA
  * inspector
  * inspector client

* Modify **Ironic Python Agent** to report `error-code`
* Modify **Ironic Inspector** to consume, store and report `error-code`


Dependencies
============

A new dependency on the common error codes library would be
introduced. Possibly a new OpenStack project would be created
to accommodate the new library.


Testing
=======

The new functionality and the new library would require unit testing
and integration testing, the same way as e.g. **Ironic Inspector** does.


References
==========

None.