Merge "Enable rebuild for instances in cell0"

2019-06-28 01:26:56 +00:00 · 2019-06-28 01:26:56 +00:00 · 8d5540935e
commit 8d5540935e
parent 77e9009cb5 fecd4635c3
1 changed file with 278 additions and 0 deletions
--- a/specs/train/approved/enable-rebuild-for-instances-in-cell0.rst
+++ b/specs/train/approved/enable-rebuild-for-instances-in-cell0.rst
@@ -0,0 +1,278 @@
+..
+ This work is licensed under a Creative Commons Attribution 3.0 Unported
+ License.
+
+ http://creativecommons.org/licenses/by/3.0/legalcode
+
+=====================================
+Enable Rebuild for Instances in cell0
+=====================================
+https://blueprints.launchpad.net/nova/+spec/enable-rebuild-for-instances-in-cell0
+
+This spec summarizes the changes needed to enable the rebuilding of instances
+that failed to be scheduled because there were not enough resources.
+
+Problem description
+===================
+
+Presently, servers in ERROR state can be rebuilt as long as they have
+successfully started up before. But if a user tries to rebuild an instance
+that was never launched because the scheduler failed to find a valid host due
+to a lack of available resources, the request fails with an
+``InstanceInvalidState`` exception [#]_. This spec does not address the case
+where the server was never launched because the maximum number of build
+retries was exceeded.
+
+Use Cases
+---------
+
+#. As an operator, I want to be able to take corrective actions after a
+   server fails to be scheduled because there were not enough resources (i.e.
+   the instance ends up in PENDING state, if configured). Such actions could
+   include adding more capacity or freeing up used resources. After taking
+   these actions, I want to be able to rebuild the server that failed.
+
+.. note:: Adding the PENDING state and setting instances to it are out of
+          scope for this spec, as they are being addressed by another
+          change [#]_.
+
+Proposed change
+===============
+
+The rebuild flow for instances mapped to cell0 because of scheduling failures
+caused by a lack of resources would be as follows:
+
+#. The nova-api, after identifying an instance as being in cell0, should create
+   a new BuildRequest and update the instance mapping (cell_id=None).
+
+#. At this point, the API should also delete the instance records from the
+   cell0 DB. If this is a soft delete [#]_, then after the operation completes
+   successfully we would end up with one record of the instance in the new
+   cell's DB and a record of the same instance in cell0 (deleted=True). A
+   better approach would be to hard delete [#]_ the instance's information
+   from cell0.
+
+#. Then the nova-api should make an RPC API call to the conductor's new method
+   ``rebuild_instance_in_cell0``. This new method's purpose is almost (if not
+   exactly) the same as that of the existing ``schedule_and_build_instances``,
+   so we could either call it internally or extract parts of its
+   functionality and reuse them. The main reason for the new method is to
+   avoid calling schedule and build code in the super-conductor directly from
+   rebuild code in the API.
+
+#. Finally, an RPC API call is needed from the conductor to the compute
+   service of the selected cell. The ``rebuild_instance`` method tries to
+   destroy an existing instance and then re-create it. In this case, since
+   the instance was in cell0, there is nothing to destroy and re-create. So,
+   an RPC API call to the existing method ``build_and_run_instance`` seems
+   appropriate.
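The four steps above can be sketched with a small in-memory model. This is only an illustration under stated assumptions: ``FakeDeployment``, ``InstanceMapping`` and ``rebuild_from_cell0`` are hypothetical stand-ins, not Nova classes, and the RPC calls are recorded as strings rather than sent.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class InstanceMapping:
    """Hypothetical stand-in for an instance mapping record."""
    instance_uuid: str
    cell_id: Optional[str]  # "cell0" while buried, None while not yet scheduled


@dataclass
class FakeDeployment:
    """Hypothetical in-memory model of the API DB, cell0 DB and RPC traffic."""
    mappings: Dict[str, InstanceMapping] = field(default_factory=dict)
    build_requests: Dict[str, dict] = field(default_factory=dict)
    cell0_instances: Dict[str, dict] = field(default_factory=dict)
    rpc_calls: List[str] = field(default_factory=list)


def rebuild_from_cell0(dep: FakeDeployment, uuid: str) -> dict:
    """Walk through the proposed flow for an instance buried in cell0."""
    mapping = dep.mappings[uuid]
    if mapping.cell_id != "cell0":
        raise ValueError("instance %s is not mapped to cell0" % uuid)
    instance = dep.cell0_instances[uuid]

    # Step 1: create a new BuildRequest and reset the mapping (cell_id=None).
    dep.build_requests[uuid] = dict(instance)
    mapping.cell_id = None

    # Step 2: hard delete the instance record from the cell0 DB.
    del dep.cell0_instances[uuid]

    # Step 3: the API casts to the conductor's proposed new method.
    dep.rpc_calls.append("conductor.rebuild_instance_in_cell0")

    # Step 4: the conductor casts to the compute service of the selected cell.
    dep.rpc_calls.append("compute.build_and_run_instance")
    return dep.build_requests[uuid]
```

In this sketch the instance UUID is the key throughout, which mirrors the requirement that the rebuilt server keeps its ID.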
+
+Information provided by the user in the initial request, such as keypair,
+trusted_image_certificates, BDMs, tags and config_drive, can be retrieved from
+the instance buried in cell0.
+Currently, there is no way to recover the requested networks while trying to
+rebuild the instance. To address this:
+
+#. A reasonable change would be to extend the RequestSpec object by adding a
+   ``requested_networks`` field, where the requested networks will be stored.
+
+#. When the scheduler fails to find a valid host for an instance and the VM
+   goes to cell0, the list of requested networks will be stored in the
+   RequestSpec.
+
+#. As soon as the rebuild procedure starts and the requested networks are
+   retrieved, the new field will be set to None.
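The lifecycle of the proposed field can be illustrated with a minimal sketch. ``RequestSpecSketch``, ``NetworkRequestSketch`` and both helper functions are hypothetical names used only for illustration; they are not the Nova objects themselves.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NetworkRequestSketch:
    """Hypothetical stand-in for a single requested network."""
    network_id: Optional[str] = None
    port_id: Optional[str] = None


@dataclass
class RequestSpecSketch:
    """Hypothetical stand-in for a RequestSpec with the proposed field."""
    instance_uuid: str
    requested_networks: Optional[List[NetworkRequestSketch]] = None


def store_networks_on_failure(spec, requested_networks):
    """Scheduling failed: persist the networks when the VM goes to cell0."""
    spec.requested_networks = list(requested_networks)


def take_networks_for_rebuild(spec):
    """Rebuild starts: return the stored networks and reset the field."""
    networks = spec.requested_networks or []
    spec.requested_networks = None
    return networks
```

Resetting the field to ``None`` once the networks have been read keeps the stored list from being reused by a later, unrelated operation on the same RequestSpec.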
+
+The same applies to personality files, which can be provided during the
+initial create request and have been deprecated from the rebuild API since
+microversion 2.57 [#]_. Since the field is not persisted, we have no way of
+retrieving them during rebuild from cell0. For this we have a few
+alternatives:
+
+#. Handle personality files in the same way as requested networks and persist
+   them in the RequestSpec.
+
+#. Document this as a limitation of the feature and that if people would like
+   to use the new rebuild functionality they should not use personality files.
+
+#. Another option would be to track in the ``system_metadata`` of the instance
+   whether the instance was created with personality files. Then, during
+   rebuild from cell0, we could check this and reject the request for
+   instances created with personality files.
+
+There is an ongoing discussion on the mailing list about how to handle
+personality files [#]_.
+
+Quota Checks
+------------
+
+During the normal build flow, there are quota checks at the API level [#]_ as
+well as at the conductor level [#]_. Consider the scenario where a user has
+enough RAM quota for a new instance. As soon as the instance is created, it
+ends up in cell0 because the scheduling failed.
+
+There are two distinct cases when checking quota for instances, cores, ram:
+
+#. Checking quota from Nova DB
+
+   In this case, the instance's resources, although in cell0, will be
+   aggregated since the instance records will be in the DB. There is, though,
+   a slight window for a race condition when the instance gets hard deleted.
+
+#. Checking quota from Placement [#]_
+
+   When the instance is in cell0, there are no allocations in Placement for
+   this consumer. This means that the instance's resources will not be
+   aggregated during subsequent checks, and there is no check at the API
+   level when rebuilding.
+
+Rechecking quota at the conductor level will make sure that the user's quota
+is sufficient before proceeding with the build procedure.
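The recheck can be sketched as a comparison of projected usage against the limits. This is a simplified illustration under stated assumptions, not Nova's quota code: ``OverQuotaSketch`` and ``recheck_quota`` are hypothetical names, and ``usage`` is assumed to exclude the cell0 instance, as in the Placement case described above.

```python
from typing import Dict


class OverQuotaSketch(Exception):
    """Hypothetical exception raised when a limit would be exceeded."""


def recheck_quota(limits: Dict[str, int],
                  usage: Dict[str, int],
                  requested: Dict[str, int]) -> None:
    """Raise if adding the instance's resources would exceed any limit.

    ``usage`` is assumed NOT to include the instance being rebuilt, since
    an instance in cell0 has no allocations to count.
    """
    for resource, amount in requested.items():
        projected = usage.get(resource, 0) + amount
        if projected > limits[resource]:
            raise OverQuotaSketch(
                "%s: projected %d exceeds limit %d"
                % (resource, projected, limits[resource]))
```

For example, with a RAM limit of 4096 MB, 3072 MB already in use by other servers and a 2048 MB instance buried in cell0, the recheck fails before the build starts, even though the original create request passed its quota check.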
+
+Between the initial build and the rebuild (from cell0), port usage might have
+changed. Since port quota is not checked when rebuilding from cell0, we
+might fail late in the compute service when trying to create the port.
+Although the user will not get a quick failure from the API, this is
+acceptable because at this point usage is already over the limit and the
+server would not have booted successfully.
+
+Alternatives
+------------
+
+The user could delete the instance that failed and create a new one with the
+same characteristics but not the same ID. The proposed functionality is a
+dependency for supporting preemptible instances, where an external service
+automatically rebuilds the failed server after taking corrective actions. For
+that feature, maintaining the ID of the instance is of vital importance. This
+is the main reason why deleting and re-creating the instance cannot be
+considered an acceptable alternative.
+
+Data model impact
+-----------------
+
+Add a ``requested_networks`` field to the RequestSpec object that will contain
+a NetworkRequestList object. Since the RequestSpec is stored as a blob
+(mediumtext) in the database, no schema modification is needed.
+
+REST API impact
+---------------
+
+A new API microversion is needed. Rebuilding an instance that is mapped to
+cell0 will continue to fail for older microversions.
+
+Security impact
+---------------
+
+None.
+
+Notifications impact
+--------------------
+
+None.
+
+Other end user impact
+---------------------
+
+Users will be allowed to rebuild instances that failed due to the lack of
+resources.
+
+Performance Impact
+------------------
+
+None.
+
+Other deployer impact
+---------------------
+
+None.
+
+Developer impact
+----------------
+
+None.
+
+Upgrade impact
+--------------
+
+None.
+
+Implementation
+==============
+
+Assignee(s)
+-----------
+
+Primary assignee:
+  <ttsiouts>
+
+Other contributors:
+  <johnthetubaguy>
+  <strigazi>
+  <belmoreira>
+
+Work Items
+----------
+
+See `Proposed change`_.
+
+Dependencies
+============
+
+None.
+
+Testing
+=======
+
+In order to verify the validity of the functionality:
+
+#. New unit tests have to be implemented and existing ones should be adapted.
+
+#. New functional tests have to be implemented to verify the rebuilding of
+   instances in cell0 and the handling of instance tags, keypairs,
+   trusted_image_certificates etc.
+
+#. The new tests should take into consideration BFV instances and the handling
+   of BDMs.
+
+Documentation Impact
+====================
+
+We should update the documentation to state that the rebuild is allowed for
+instances that have never booted before.
+
+References
+==========
+
+.. [#] https://github.com/openstack/nova/blob/d42a007425d9adb691134137e1e0b7dda356df62/nova/compute/api.py#L147
+
+.. [#] https://review.openstack.org/#/c/648687/
+
+.. [#] In this scope soft delete means a non-zero value is set to the
+       ``deleted`` column.
+
+.. [#] Hard delete means that the record is removed from the table.
+
+.. [#] https://docs.openstack.org/nova/latest/reference/api-microversion-history.html#id52
+
+.. [#] http://lists.openstack.org/pipermail/openstack-discuss/2019-April/004901.html
+
+.. [#] https://github.com/openstack/nova/blob/fc3890667e4971e3f0f35ac921c2a6c25f72adec/nova/compute/api.py#L937
+
+.. [#] https://github.com/openstack/nova/blob/fc3890667e4971e3f0f35ac921c2a6c25f72adec/nova/conductor/manager.py#L1422
+
+.. [#] https://review.opendev.org/#/c/638073/
+
+Discussed at the Dublin PTG:
+
+* https://etherpad.openstack.org/p/nova-ptg-rocky (#L459)
+
+History
+=======
+
+.. list-table:: Revisions
+   :header-rows: 1
+
+   * - Release Name
+     - Description
+   * - Rocky
+     - Introduced
+   * - Stein
+     - Re-proposed
+   * - Train
+     - Re-proposed