Merge "Enable rebuild for instances in cell0"
This commit is contained in:
commit
8d5540935e
278
specs/train/approved/enable-rebuild-for-instances-in-cell0.rst
Normal file
278
specs/train/approved/enable-rebuild-for-instances-in-cell0.rst
Normal file
@ -0,0 +1,278 @@
|
||||
..
|
||||
This work is licensed under a Creative Commons Attribution 3.0 Unported
|
||||
License.
|
||||
|
||||
http://creativecommons.org/licenses/by/3.0/legalcode
|
||||
|
||||
=====================================
|
||||
Enable Rebuild for Instances in cell0
|
||||
=====================================
|
||||
https://blueprints.launchpad.net/nova/+spec/enable-rebuild-for-instances-in-cell0
|
||||
|
||||
This spec summarizes the changes needed to enable the rebuilding of instances
|
||||
that failed to be scheduled because there were not enough resources.
|
||||
|
||||
Problem description
|
||||
===================
|
||||
|
||||
Presently, it is allowed to rebuild servers in ERROR state, as long as they
|
||||
have successfully started up before. But if a user tries to rebuild an instance
|
||||
that was never launched before because scheduler failed to find a valid host
|
||||
due to the lack of available resources, the request fails with an exception of
|
||||
type ``InstanceInvalidState`` [#]_. We are not addressing the case where the
|
||||
server was never launched due to exceeding the maximum number of build retries.
|
||||
|
||||
Use Cases
|
||||
---------
|
||||
|
||||
#. As an operator I want to be able to perform corrective actions after a
|
||||
server fails to be scheduled because there were not enough resources (i.e.
|
||||
the instance ends up in PENDING state, if configured). Such actions could
|
||||
be adding more capacity or freeing up used resources. Following the
|
||||
execution of these actions I want to be able to rebuild the server that
|
||||
failed.
|
||||
|
||||
.. note:: Adding the PENDING state as well as setting instances to it, are out
|
||||
of the scope of this spec, as they are being addressed by another
|
||||
change [#]_.
|
||||
|
||||
Proposed change
|
||||
===============
|
||||
|
||||
The flow of the rebuild procedure for instances mapped in cell0 because of
|
||||
scheduling failures caused by lack of resources would then be like this:
|
||||
|
||||
#. The nova-api, after identifying an instance as being in cell0, should create
|
||||
a new BuildRequest and update the instance mapping (cell_id=None).
|
||||
|
||||
#. At this point the api should also delete the instance records from cell0 DB.
|
||||
If this is a soft delete [#]_, then after the successful completion of the
|
||||
operation, we would end up with one record of the instance in the new cell's
|
||||
DB and a record of the same instance in cell0 (deleted=True). A better
|
||||
approach, here, would be to hard delete [#]_ the instance's information from
|
||||
cell0.
|
||||
|
||||
#. Then the nova-api should make an RPC API call to the conductor's new method
|
||||
``rebuild_instance_in_cell0``. This new method's purpose is almost (if not
|
||||
exactly) the same as the existing ``schedule_and_build_instances``. So we
|
||||
could either call to it internally or extract parts of it's functionality
|
||||
and reuse them. The reason behind this is mainly to avoid calling schedule
|
||||
and build code in the super-conductor directly from rebuild code in the API.
|
||||
|
||||
#. Finally, an RPC API call is needed from the conductor to the compute
|
||||
service of the selected cell. The ``rebuild_instance`` method tries to
|
||||
destroy an existing instance and then re-create it. In this case and since
|
||||
the instance was in cell0, there is nothing to destroy and re-create. So,
|
||||
an RPC API call to the existing method ``build_and_run_instance`` seems
|
||||
appropriate.
|
||||
|
||||
Information provided by the user in the initial request such as keypair,
|
||||
trusted_image_certificates, BDMs, tags and config_drive can be retrieved from
|
||||
the instance buried in cell0.
|
||||
Currently, there is no way to recover the requested networks while trying to
|
||||
rebuild the instance. For this:
|
||||
|
||||
#. A reasonable change would be to extend the RequestSpec object to adding a
|
||||
requested_networks field, where the requested networks will be stored.
|
||||
|
||||
#. When scheduler fails to find a valid host for an instance and the VM goes to
|
||||
cell0, the list of requested networks will be stored in the RequestSpec.
|
||||
|
||||
#. As soon as the rebuild procedure starts and the requested networks are
|
||||
retrieved, the new field will be set to None.
|
||||
|
||||
The same applies for personality files, that can be provided during the initial
|
||||
create request and since microversion 2.57 it is deprecated from the rebuild
|
||||
API [#]_. Since the field is not persisted we have no way of retrieving them
|
||||
during rebuild from cell0. For this we have a couple of alternatives:
|
||||
|
||||
#. Handle personality files as requested networks and persist them in the
|
||||
RequestSpec.
|
||||
|
||||
#. Document this as a limitation of the feature and that if people would like
|
||||
to use the new rebuild functionality they should not use personality files.
|
||||
|
||||
#. Another option would be to track in the ``system_metadata`` of the instance,
|
||||
if the instance was created with personality files. Then during rebuild from
|
||||
cell0, we could check and not accept the request for instances created with
|
||||
personality files.
|
||||
|
||||
There is an ongoing discussion on how to handle personality files in the
|
||||
mailing list [#]_.
|
||||
|
||||
Quota Checks
|
||||
------------
|
||||
|
||||
During the normal build flow, there are quota checks in the API level [#]_ as
|
||||
well as in the conductor level [#]_. Consider the scenario where a user has
|
||||
enough RAM quota for a new instance. As soon as the instance is created, it
|
||||
ends up in cell0 because the scheduling failed.
|
||||
|
||||
There are two distinct cases when checking quota for instances, cores, ram:
|
||||
|
||||
#. Checking quota from Nova DB
|
||||
|
||||
In this case, the instance's resources, although in cell0, will be
|
||||
aggregated since the instance records will be in the DB. There is though
|
||||
a slight window for a race condition when the instance gets hard deleted.
|
||||
|
||||
#. Checking quota from Placement [#]_
|
||||
|
||||
When the instance is in cell0, there are no allocations to Placement for
|
||||
this consumer. Meaning that the instance's resources will not be aggregated
|
||||
during subsequent checks and there is no check in the API level when
|
||||
rebuilding.
|
||||
|
||||
Rechecking quota at the conductor level will make sure that user's quota is
|
||||
enough before proceeding with the build procedure.
|
||||
|
||||
Between initial build and rebuild (from cell0) port usage might have changed.
|
||||
In this case and since port quota is not checked when rebuilding from cell0, we
|
||||
might fail late in the compute service trying to create the port. Although the
|
||||
user will not get a quick failure from the API, this is acceptable because at
|
||||
this point usage is already over limit and the server would not have booted
|
||||
successfully.
|
||||
|
||||
Alternatives
|
||||
------------
|
||||
|
||||
The user could delete the instance that failed and create a new one with the
|
||||
same characteristics but not the same ID. The proposed functionality is the
|
||||
dependency for supporting preemptible instances, where an external service
|
||||
automatically rebuilds the failed server after taking corrective actions. In
|
||||
the aforementioned feature maintaining the ID of the instance is of vital
|
||||
importance. This is the main reason for which this cannot be considered as an
|
||||
acceptable alternative solution.
|
||||
|
||||
Data model impact
|
||||
-----------------
|
||||
|
||||
Add a ``requested_networks`` field in the RequestSpec object that will contain
|
||||
a NetworkRequestList object. Since the RequestSpec is stored as a blob
|
||||
(mediumtext) in the database, no schema modification is needed.
|
||||
|
||||
REST API impact
|
||||
---------------
|
||||
|
||||
A new API microversion is needed. Rebuilding an instance that is mapped to
|
||||
cell0 will continue to fail for older microversions.
|
||||
|
||||
Security impact
|
||||
---------------
|
||||
|
||||
None.
|
||||
|
||||
Notifications impact
|
||||
--------------------
|
||||
|
||||
None.
|
||||
|
||||
Other end user impact
|
||||
---------------------
|
||||
|
||||
Users will be allowed to rebuild instances that failed due to the lack of
|
||||
resources.
|
||||
|
||||
Performance Impact
|
||||
------------------
|
||||
|
||||
None.
|
||||
|
||||
Other deployer impact
|
||||
---------------------
|
||||
|
||||
None.
|
||||
|
||||
Developer impact
|
||||
----------------
|
||||
|
||||
None.
|
||||
|
||||
Upgrade impact
|
||||
--------------
|
||||
|
||||
None.
|
||||
|
||||
Implementation
|
||||
==============
|
||||
|
||||
Assignee(s)
|
||||
-----------
|
||||
|
||||
Primary assignee:
|
||||
<ttsiouts>
|
||||
|
||||
Other contributors:
|
||||
<johnthetubaguy>
|
||||
<strigazi>
|
||||
<belmoreira>
|
||||
|
||||
Work Items
|
||||
----------
|
||||
|
||||
See `Proposed change`_.
|
||||
|
||||
Dependencies
|
||||
============
|
||||
|
||||
None.
|
||||
|
||||
Testing
|
||||
=======
|
||||
|
||||
In order to verify the validity of the functionality:
|
||||
|
||||
#. New unit tests have to be implmented and existing ones should be adapted.
|
||||
|
||||
#. New functional tests have to be implemented to verify the rebuilding of
|
||||
instances in cell0 and the handling of instance tags, keypairs,
|
||||
trusted_image_certificates etc.
|
||||
|
||||
#. The new tests should take into consideration BFV instances and the handling
|
||||
of BDMs.
|
||||
|
||||
Documentation Impact
|
||||
====================
|
||||
|
||||
We should update the documentation to state that the rebuild is allowed for
|
||||
instances that have never booted before.
|
||||
|
||||
References
|
||||
==========
|
||||
|
||||
.. [#] https://github.com/openstack/nova/blob/d42a007425d9adb691134137e1e0b7dda356df62/nova/compute/api.py#L147
|
||||
|
||||
.. [#] https://review.openstack.org/#/c/648687/
|
||||
|
||||
.. [#] In this scope soft delete means a non-zero value is set to the
|
||||
``deleted`` column.
|
||||
|
||||
.. [#] Hard delete means that the record is removed from the table.
|
||||
|
||||
.. [#] https://docs.openstack.org/nova/latest/reference/api-microversion-history.html#id52
|
||||
|
||||
.. [#] http://lists.openstack.org/pipermail/openstack-discuss/2019-April/004901.html
|
||||
|
||||
.. [#] https://github.com/openstack/nova/blob/fc3890667e4971e3f0f35ac921c2a6c25f72adec/nova/compute/api.py#L937
|
||||
|
||||
.. [#] https://github.com/openstack/nova/blob/fc3890667e4971e3f0f35ac921c2a6c25f72adec/nova/conductor/manager.py#L1422
|
||||
|
||||
.. [#] https://review.opendev.org/#/c/638073/
|
||||
|
||||
Discussed at the Dublin PTG:
|
||||
* https://etherpad.openstack.org/p/nova-ptg-rocky (#L459)
|
||||
|
||||
History
|
||||
=======
|
||||
|
||||
.. list-table:: Revisions
|
||||
:header-rows: 1
|
||||
|
||||
* - Release Name
|
||||
- Description
|
||||
* - Rocky
|
||||
- Introduced
|
||||
* - Stein
|
||||
- Re-proposed
|
||||
* - Train
|
||||
- Re-proposed
|
Loading…
Reference in New Issue
Block a user