Merge "Amend count-quota-usage-from-placement to reflect implementation"
This commit is contained in:
@@ -78,39 +78,42 @@ The new method will contain:
|
||||
mappings for a project and a user to represent the instance count.
|
||||
|
||||
We will rename the ``_instances_cores_ram_count`` method to
|
||||
``_cores_ram_count`` that counts cores and ram from the cell databases and
|
||||
is only used if ``[workarounds]disable_quota_usage_from_placement`` is True.
|
||||
``_instances_cores_ram_count_legacy`` that counts cores and ram from the cell
|
||||
databases and is only used if ``[quota]count_usage_from_placement`` is False or
|
||||
if the data migration has not yet completed.
|
||||
|
||||
Because there is not yet an ability to partition allocations (or perhaps,
|
||||
resource providers from which allocations could derive a partition) in
|
||||
Because there is not yet an ability to partition resource providers in
|
||||
placement, in order to support deployments where multiple Nova deployments
|
||||
share the same placement service, like possibly in an Edge scenario, we can add
|
||||
a ``[workarounds]disable_quota_usage_from_placement`` which defaults to False.
|
||||
If True, we use the legacy quota counting method for instances, cores, and
|
||||
ram. If False, we use a quota counting method that calls placement. This is a
|
||||
minimal way to keep "legacy" quota counting available for the scenario of
|
||||
multiple Nova deployments sharing one placement service. The config option will
|
||||
simply control which counting method will be called by the pluggable quota
|
||||
system. For example (pseudo-code):
|
||||
share the same placement service, like possibly in an Edge scenario, we will
|
||||
add a ``[quota]count_usage_from_placement`` config option which defaults to
|
||||
False. If False, we use the legacy quota counting method for instances, cores,
|
||||
and ram. If True, we use a quota counting method that calls placement. This is
|
||||
a way to keep "legacy" quota counting available for the scenario of multiple
|
||||
Nova deployments sharing one placement service. The config option will simply
|
||||
control which counting method will be called by the pluggable quota system.
|
||||
For example (pseudo-code):
|
||||
|
||||
::
|
||||
|
||||
if CONF.workarounds.disable_quota_usage_from_placement:
|
||||
CountableResource('cores', _cores_ram_count, 'cores')
|
||||
CountableResource('ram', _cores_ram_count, 'ram')
|
||||
if CONF.quota.count_usage_from_placement:
|
||||
return _instances_cores_ram_count_api_db_placement(...)
|
||||
else:
|
||||
CountableResource('cores', _cores_ram_count_placement, 'cores')
|
||||
CountableResource('ram', _cores_ram_count_placement, 'ram')
|
||||
return _instances_cores_ram_count_legacy(...)
|
||||
|
||||
We will add a new method for counting cores and ram from placement that is used
|
||||
when ``[workarounds]disable_quota_usage_from_placement`` is False. This
|
||||
method could be called ``_cores_ram_count_placement``.
|
||||
when ``[quota]count_usage_from_placement`` is True. This method could be called
|
||||
``_cores_ram_count_placement``.
|
||||
|
||||
The new method will contain:
|
||||
|
||||
* One call to placement to get resource usage for CPU and RAM. We can get CPU
|
||||
and RAM usage for a project and user by querying the ``/usages`` resource::
|
||||
* Up to two calls to placement to get resource usage for CPU and RAM. One call
|
||||
will count usage across a project. Then, if user-scoped quota limits are
|
||||
found for a resource, a second call will count usage across a project and a
|
||||
user.
|
||||
We can get CPU and RAM usage for a project and user by querying the
|
||||
``/usages`` resource::
|
||||
|
||||
GET /usages?project_id=<project id>
|
||||
GET /usages?project_id=<project id>&user_id=<user id>
|
||||
|
||||
Alternatives
|
||||
@@ -126,8 +129,9 @@ allowing server create requests to potentionally exceed quota limits.
|
||||
Another alternative which has been discussed is, to use placement aggregates to
|
||||
surround each entire Nova deployment and use that as a means to partition
|
||||
placement usages. We would need to add a ``aggregate=`` query parameter to the
|
||||
placement /usages API in this case. This approach would also require some work
|
||||
by either Nova or the operator to keep the placement aggregate updated.
|
||||
placement ``/usages`` API in this case. This approach would also require some
|
||||
work by either Nova or the operator to keep the placement aggregate
|
||||
synchronized.
|
||||
|
||||
.. _policy-driven behavior: https://review.openstack.org/614783
|
||||
|
||||
@@ -187,7 +191,7 @@ Upgrade impact
|
||||
The addition of the ``user_id`` column to the ``nova_api.instance_mappings``
|
||||
table will require a data migration of all existing instance mappings to
|
||||
populate the ``user_id`` field. The migration routine would look for mappings
|
||||
where ``user_id`` is None and query cells by corresponding ``project_id`` in
|
||||
where ``user_id`` is None and query cells by corresponding ``cell_id`` in
|
||||
the mapping. The query could filter on instance UUIDs, finding the ``user_id``
|
||||
values to populate in the mappings. This would implement the batched
|
||||
``nova-manage db online_data_migrations`` way of doing the migration.
|
||||
@@ -199,25 +203,28 @@ situation where an upgrade has not run
|
||||
|
||||
In order to handle a live in-progress upgrade, we will need to be able to fall
|
||||
back on the legacy counting method for instances, cores, and ram if
|
||||
``nova_api.instance_mappings`` don't yet have ``user_id`` populated (if the
|
||||
``nova_api.instance_mappings`` do not yet have ``user_id`` populated (if the
|
||||
operator has not yet run the data migration). We will need a way to detect that
|
||||
the migration has not yet been run in order to fall back on the legacy counting
|
||||
method. We could have a check such as ``if count(InstanceMapping.id) where
|
||||
project_id=<project id> and user_id=None > 0``, then fall back on the legacy
|
||||
method. We could have a check such as ``if exists(InstanceMapping.id) where
|
||||
project_id=<project id> and user_id=None``, then fall back on the legacy
|
||||
counting method to query cell databases. We should cache the results of the
|
||||
each migration completeness check per ``project_id`` so we avoid needlessly
|
||||
checking a ``project_id`` that has already been migrated every time quota is
|
||||
checked.
|
||||
|
||||
We will populate the ``user_id`` field even for instance mappings that are
|
||||
``queued_for_delete=True`` because we will be filtering on
|
||||
``queued_for_delete=False`` during the instance count based on instance
|
||||
mappings.
|
||||
``queued_for_delete=True`` because such instance mappings include instances
|
||||
that are ``SOFT_DELETED`` and these can be restored at any time in the future.
|
||||
If we do not migrate ``SOFT_DELETED`` instances with ``queued_for_delete=True``
|
||||
and they are restored in the future, their instance mappings would be
|
||||
unmigrated and would prevent us being able to eventually drop the related data
|
||||
migration code.
|
||||
|
||||
The data migrations and fallback to the legacy counting method will be
|
||||
temporary for Stein, to be dropped in T with a blocker migration. That is, you
|
||||
cannot pass ``nova-manage api_db sync`` if there are any instance mappings with
|
||||
``user_id=None`` to force the batched migration using ``nova-manage``.
|
||||
temporary for Train, to be dropped in U or V with a blocker migration. That is,
|
||||
you cannot pass ``nova-manage api_db sync`` if there are any instance mappings
|
||||
with ``user_id=None`` to force the batched migration using ``nova-manage``.
|
||||
|
||||
Implementation
|
||||
==============
|
||||
@@ -239,20 +246,23 @@ Work Items
|
||||
* Update the ``_server_group_count_members_by_user`` quota counting method to
|
||||
use only the ``nova_api.instance_mappings`` table instead of querying cell
|
||||
databases.
|
||||
* Add a config option ``[workarounds]disable_quota_usage_from_placement`` that
|
||||
* Add a config option ``[quota]count_usage_from_placement`` that
|
||||
defaults to False. This will be able to be deprecated when partitioning of
|
||||
resource providers or allocations is available in placement.
|
||||
resource providers is available in placement and other quirks around
|
||||
placement resource allocations in Nova are resolved in the future (example:
|
||||
"doubling" of allocations during resizes).
|
||||
* Add a new method to count instances with a count of
|
||||
``nova_api.instance_mappings`` filtering by ``project_id=<project_id>`` and
|
||||
``user_id=<user_id>`` and ``queued_for_delete=False``.
|
||||
* Add a new count method that queries the placement API for CPU and RAM usage.
|
||||
In the new count method, add a check for whether the online data migration
|
||||
has been run yet and if not, fall back on the legacy count method.
|
||||
* Rename the ``_instances_cores_ram_count`` method to ``_cores_ram_count`` and
|
||||
let it count only cores and ram in the legacy way, for use if
|
||||
``[workarounds]disable_quota_usage_from_placement`` is set to True.
|
||||
* Adjust the nova-next or nova-live-migration CI job to run with
|
||||
``[workarounds]disable_quota_usage_from_placement=True``.
|
||||
* Rename the ``_instances_cores_ram_count`` method to
|
||||
``_instances_cores_ram_count_legacy`` and let it count only cores and ram in
|
||||
the legacy way, for use if ``[quota]count_usage_from_placement`` is False or
|
||||
the data migration is not yet completed.
|
||||
* Adjust the nova-next CI job to run with
|
||||
``[quota]count_usage_from_placement=True``.
|
||||
|
||||
Dependencies
|
||||
============
|
||||
@@ -263,16 +273,16 @@ Testing
|
||||
=======
|
||||
|
||||
Unit tests and functional tests will be included to test the new functionality.
|
||||
We will also adjust one CI job (nova-next or nova-live-migration) to run with
|
||||
``[workarounds]disable_quota_usage_from_placement=True`` to make sure we have
|
||||
integration test coverage of that path.
|
||||
We will also adjust one CI job (nova-next) to run with
|
||||
``[quota]count_usage_from_placement=True`` to make sure we have integration
|
||||
test coverage of that path.
|
||||
|
||||
Documentation Impact
|
||||
====================
|
||||
|
||||
The documentation_ of Cells v2 caveats will be updated to update the paragraph
|
||||
about the inability to correctly calculate quota usage when one or more cells
|
||||
are unreachable. We will document that beginning in Stein, there are new
|
||||
are unreachable. We will document that beginning in Train, there are new
|
||||
deployment options.
|
||||
|
||||
.. _documentation: https://docs.openstack.org/nova/latest/user/cellsv2-layout.html#quota-related-quirks
|
||||
|
||||
Reference in New Issue
Block a user