Merge "Amend count-quota-usage-from-placement to reflect implementation"

2019-06-04 13:47:16 +00:00
parent 090d47bf80 bf4d8af262
commit 37e4368601
1 changed files with 54 additions and 44 deletions
--- a/specs/train/approved/count-quota-usage-from-placement.rst
+++ b/specs/train/approved/count-quota-usage-from-placement.rst
@@ -78,39 +78,42 @@ The new method will contain:
  mappings for a project and a user to represent the instance count.
 We will rename the ``_instances_cores_ram_count`` method to
-``_cores_ram_count`` that counts cores and ram from the cell databases and
+``_instances_cores_ram_count_legacy`` that counts cores and ram from the cell
-is only used if ``[workarounds]disable_quota_usage_from_placement`` is True.
+databases and is only used if ``[quota]count_usage_from_placement`` is False or
 if the data migration has not yet completed.
-Because there is not yet an ability to partition allocations (or perhaps,
+Because there is not yet an ability to partition resource providers in
 resource providers from which allocations could derive a partition) in
 placement, in order to support deployments where multiple Nova deployments
-share the same placement service, like possibly in an Edge scenario, we can add
+share the same placement service, like possibly in an Edge scenario, we will
-a ``[workarounds]disable_quota_usage_from_placement`` which defaults to False.
+add a ``[quota]count_usage_from_placement`` config option which defaults to
-If True, we use the legacy quota counting method for instances, cores, and
+False. If False, we use the legacy quota counting method for instances, cores,
-ram. If False, we use a quota counting method that calls placement. This is a
+and ram. If True, we use a quota counting method that calls placement. This is
-minimal way to keep "legacy" quota counting available for the scenario of
+a way to keep "legacy" quota counting available for the scenario of multiple
-multiple Nova deployments sharing one placement service. The config option will
+Nova deployments sharing one placement service. The config option will simply
-simply control which counting method will be called by the pluggable quota
+control which counting method will be called by the pluggable quota system.
-system. For example (pseudo-code):
+For example (pseudo-code):
 ::
-    if CONF.workarounds.disable_quota_usage_from_placement:
+    if CONF.quota.count_usage_from_placement:
-        CountableResource('cores', _cores_ram_count, 'cores')
+        return _instances_cores_ram_count_api_db_placement(...)
        CountableResource('ram', _cores_ram_count, 'ram')
    else:
-        CountableResource('cores', _cores_ram_count_placement, 'cores')
+        return _instances_cores_ram_count_legacy(...)
        CountableResource('ram', _cores_ram_count_placement, 'ram')
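The selection logic described above can be sketched as a small standalone function. This is only an illustration: the two method names are taken from the amended pseudo-code, while ``select_count_method`` and its boolean parameters are hypothetical and are not part of Nova.

```python
def select_count_method(count_usage_from_placement: bool,
                        migration_complete: bool) -> str:
    """Return the name of the counting method that would be used.

    Hypothetical helper: the placement-based method is chosen only when
    the config option is True and the user_id data migration is done.
    Otherwise the legacy cell database counting is used.
    """
    if count_usage_from_placement and migration_complete:
        return "_instances_cores_ram_count_api_db_placement"
    return "_instances_cores_ram_count_legacy"


if __name__ == "__main__":
    for option in (False, True):
        for migrated in (False, True):
            print(option, migrated, select_count_method(option, migrated))
```

With the default of False for the option, the legacy method is selected regardless of the migration state.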
 We will add a new method for counting cores and ram from placement that is used
-when ``[workarounds]disable_quota_usage_from_placement`` is False. This
+when ``[quota]count_usage_from_placement`` is True. This method could be called
-method could be called ``_cores_ram_count_placement``.
+``_cores_ram_count_placement``.
 The new method will contain:
-* One call to placement to get resource usage for CPU and RAM. We can get CPU
+* Up to two calls to placement to get resource usage for CPU and RAM. One call
-  and RAM usage for a project and user by querying the ``/usages`` resource::
+  will count usage across a project. Then, if user-scoped quota limits are
  found for a resource, a second call will count usage across a project and a
  user.
  We can get CPU and RAM usage for a project and user by querying the
  ``/usages`` resource::
    GET /usages?project_id=<project id>
    GET /usages?project_id=<project id>&user_id=<user id>
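As a rough sketch of the "up to two calls" behaviour, the helper below builds the request paths that would be issued. The function name ``usages_paths`` and the ``user_scoped_limits`` flag are assumptions made for illustration; only the ``/usages`` resource and its ``project_id`` and ``user_id`` query parameters come from the text above.

```python
from typing import List, Optional
from urllib.parse import urlencode


def usages_paths(project_id: str,
                 user_id: Optional[str] = None,
                 user_scoped_limits: bool = False) -> List[str]:
    """Build the /usages request paths for one quota check.

    The project-scoped query is always included. The project and user
    scoped query is added only when user-scoped limits were found.
    """
    paths = ["/usages?" + urlencode({"project_id": project_id})]
    if user_scoped_limits and user_id is not None:
        query = urlencode({"project_id": project_id, "user_id": user_id})
        paths.append("/usages?" + query)
    return paths


if __name__ == "__main__":
    print(usages_paths("p1"))
    print(usages_paths("p1", "u1", user_scoped_limits=True))
```

Skipping the second query when no user-scoped limits exist keeps the common case to a single placement request.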
 Alternatives
@@ -126,8 +129,9 @@ allowing server create requests to potentionally exceed quota limits.
 Another alternative which has been discussed is to use placement aggregates to
 surround each entire Nova deployment and use that as a means to partition
 placement usages. We would need to add an ``aggregate=`` query parameter to the
-placement /usages API in this case. This approach would also require some work
+placement ``/usages`` API in this case. This approach would also require some
-by either Nova or the operator to keep the placement aggregate updated.
+work by either Nova or the operator to keep the placement aggregate
 synchronized.
 .. _policy-driven behavior: https://review.openstack.org/614783
@@ -187,7 +191,7 @@ Upgrade impact
 The addition of the ``user_id`` column to the ``nova_api.instance_mappings``
 table will require a data migration of all existing instance mappings to
 populate the ``user_id`` field. The migration routine would look for mappings
-where ``user_id`` is None and query cells by corresponding ``project_id`` in
+where ``user_id`` is None and query cells by corresponding ``cell_id`` in
 the mapping. The query could filter on instance UUIDs, finding the ``user_id``
 values to populate in the mappings. This would implement the batched
 ``nova-manage db online_data_migrations`` way of doing the migration.
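The batched migration described above might look roughly like the following in-memory sketch. Everything here is hypothetical: mappings are plain dictionaries, ``find_user_id`` stands in for the cell database query, and the ``(found, done)`` return value is an assumed convention for reporting batch progress.

```python
from typing import Callable, Dict, List, Optional, Tuple

Mapping = Dict[str, Optional[str]]


def populate_user_id(
    mappings: List[Mapping],
    find_user_id: Callable[[str, str], Optional[str]],
    max_count: int,
) -> Tuple[int, int]:
    """Populate user_id on at most max_count unmigrated mappings.

    Returns (found, done): how many unmigrated mappings were examined in
    this batch and how many of them were successfully updated.
    """
    batch = [m for m in mappings if m["user_id"] is None][:max_count]
    done = 0
    for mapping in batch:
        # Stand-in for querying the cell identified by cell_id, filtered
        # on the instance UUID, to find the owning user.
        user_id = find_user_id(mapping["cell_id"], mapping["instance_uuid"])
        if user_id is not None:
            mapping["user_id"] = user_id
            done += 1
    return len(batch), done


if __name__ == "__main__":
    rows = [
        {"instance_uuid": f"uuid-{i}", "cell_id": "cell1", "user_id": None}
        for i in range(3)
    ]
    while True:
        found, done = populate_user_id(rows, lambda c, u: "user-" + u, 2)
        print(found, done)
        if found == 0:
            break
```

Running the batch repeatedly until nothing is found mirrors how an operator would rerun the migration until it reports no remaining work.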
@@ -199,25 +203,28 @@ situation where an upgrade has not run
 In order to handle a live in-progress upgrade, we will need to be able to fall
 back on the legacy counting method for instances, cores, and ram if
-``nova_api.instance_mappings`` don't yet have ``user_id`` populated (if the
+``nova_api.instance_mappings`` do not yet have ``user_id`` populated (if the
 operator has not yet run the data migration). We will need a way to detect that
 the migration has not yet been run in order to fall back on the legacy counting
-method. We could have a check such as ``if count(InstanceMapping.id) where
+method. We could have a check such as ``if exists(InstanceMapping.id) where
-project_id=<project id> and user_id=None > 0``, then fall back on the legacy
+project_id=<project id> and user_id=None``, then fall back on the legacy
 counting method to query cell databases. We should cache the results of
 each migration completeness check per ``project_id`` so we avoid needlessly
 checking a ``project_id`` that has already been migrated every time quota is
 checked.
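One possible shape for the cached completeness check is sketched below. The class name and the ``has_unmigrated`` callable are hypothetical. The callable stands in for the ``exists(...)`` query on instance mappings with ``user_id=None``. Only a "migrated" result is cached in this sketch, on the assumption that an unmigrated project must be checked again later.

```python
from typing import Callable, Set


class MigrationCheckCache:
    """Remember which projects have finished the user_id migration."""

    def __init__(self, has_unmigrated: Callable[[str], bool]) -> None:
        # has_unmigrated(project_id) returns True while any instance
        # mapping for the project still has user_id unset.
        self._has_unmigrated = has_unmigrated
        self._migrated: Set[str] = set()

    def is_migrated(self, project_id: str) -> bool:
        if project_id in self._migrated:
            # Cached result: skip the database check entirely.
            return True
        if self._has_unmigrated(project_id):
            # Not cached, so the project is checked again next time.
            return False
        self._migrated.add(project_id)
        return True


if __name__ == "__main__":
    pending = {"p1"}
    cache = MigrationCheckCache(lambda pid: pid in pending)
    print(cache.is_migrated("p1"))
    pending.clear()
    print(cache.is_migrated("p1"))
```

A caller would fall back on the legacy counting method whenever ``is_migrated`` returns False.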
 We will populate the ``user_id`` field even for instance mappings that are
-``queued_for_delete=True`` because we will be filtering on
+``queued_for_delete=True`` because such instance mappings include instances
-``queued_for_delete=False`` during the instance count based on instance
+that are ``SOFT_DELETED`` and these can be restored at any time in the future.
-mappings.
+If we do not migrate ``SOFT_DELETED`` instances with ``queued_for_delete=True``
 and they are restored in the future, their instance mappings would be
 unmigrated and would prevent us from eventually dropping the related data
 migration code.
 The data migrations and fallback to the legacy counting method will be
-temporary for Stein, to be dropped in T with a blocker migration. That is, you
+temporary for Train, to be dropped in U or V with a blocker migration. That is,
-cannot pass ``nova-manage api_db sync`` if there are any instance mappings with
+you cannot pass ``nova-manage api_db sync`` if there are any instance mappings
-``user_id=None`` to force the batched migration using ``nova-manage``.
+with ``user_id=None`` to force the batched migration using ``nova-manage``.
 Implementation
 ==============
@@ -239,20 +246,23 @@ Work Items
 * Update the ``_server_group_count_members_by_user`` quota counting method to
  use only the ``nova_api.instance_mappings`` table instead of querying cell
  databases.
-* Add a config option ``[workarounds]disable_quota_usage_from_placement`` that
+* Add a config option ``[quota]count_usage_from_placement`` that
  defaults to False. This option can be deprecated when partitioning of
-  resource providers or allocations is available in placement.
+  resource providers is available in placement and other quirks around
  placement resource allocations in Nova are resolved in the future (example:
  "doubling" of allocations during resizes).
 * Add a new method to count instances with a count of
  ``nova_api.instance_mappings`` filtering by ``project_id=<project_id>`` and
  ``user_id=<user_id>`` and ``queued_for_delete=False``.
 * Add a new count method that queries the placement API for CPU and RAM usage.
  In the new count method, add a check for whether the online data migration
  has been run yet and if not, fall back on the legacy count method.
-* Rename the ``_instances_cores_ram_count`` method to ``_cores_ram_count`` and
+* Rename the ``_instances_cores_ram_count`` method to
-  let it count only cores and ram in the legacy way, for use if
+  ``_instances_cores_ram_count_legacy`` and let it count only cores and ram in
-  ``[workarounds]disable_quota_usage_from_placement`` is set to True.
+  the legacy way, for use if ``[quota]count_usage_from_placement`` is False or
-* Adjust the nova-next or nova-live-migration CI job to run with
+  the data migration is not yet completed.
-  ``[workarounds]disable_quota_usage_from_placement=True``.
+* Adjust the nova-next CI job to run with
  ``[quota]count_usage_from_placement=True``.
 Dependencies
 ============
@@ -263,16 +273,16 @@ Testing
 =======
 Unit tests and functional tests will be included to test the new functionality.
-We will also adjust one CI job (nova-next or nova-live-migration) to run with
+We will also adjust one CI job (nova-next) to run with
-``[workarounds]disable_quota_usage_from_placement=True`` to make sure we have
+``[quota]count_usage_from_placement=True`` to make sure we have integration
-integration test coverage of that path.
+test coverage of that path.
 Documentation Impact
 ====================
 The documentation_ of Cells v2 caveats will be updated to revise the paragraph
 about the inability to correctly calculate quota usage when one or more cells
-are unreachable. We will document that beginning in Stein, there are new
+are unreachable. We will document that beginning in Train, there are new
 deployment options.
 .. _documentation: https://docs.openstack.org/nova/latest/user/cellsv2-layout.html#quota-related-quirks