Add quota utils to cinder-manage
We have many issues with quota: - Wrong values caused by race conditions - Duplicated quota usage entries caused by race conditions - Code that doesn't clean quota usage Our current code has mechanisms to automatically refresh these values, but there could be admins that don't want to enable it all the time, and others that will be affected by bug #1697906 and cannot enable it. There are a couple of scripts available online, but they all have their issues: Not taking into account config option no_snapshot_gb_quota, not fixing backups and groups, being racy, etc. Moreover, they are external tools to the OpenStack community. We will eventually fix quotas, but in the meantime we should provide a tool to help administrators manage the situation. This patch adds a new 'quota' category to cinder-manage and two actions 'check' and 'sync' to allow administrators to check and fix on demand the status of quotas (for a single or all projects). Related-Bug: #1869749 Related-Bug: #1847791 Related-Bug: #1733179 Related-Bug: #1877164 Related-Bug: #1484343 Change-Id: Ic9323f8cfd75c0fbc425ddc9c9b35959fbe7d482
This commit is contained in:
parent
596cb963e6
commit
b50a571374
@ -50,7 +50,8 @@
|
||||
|
||||
"""CLI interface for cinder management."""
|
||||
|
||||
import collections.abc as collections
|
||||
import collections
|
||||
import collections.abc as collections_abc
|
||||
import logging as python_logging
|
||||
import sys
|
||||
import time
|
||||
@ -75,6 +76,7 @@ from cinder import exception
|
||||
from cinder.i18n import _
|
||||
from cinder import objects
|
||||
from cinder.objects import base as ovo_base
|
||||
from cinder import quota
|
||||
from cinder import rpc
|
||||
from cinder.scheduler import rpcapi as scheduler_rpcapi
|
||||
from cinder import version
|
||||
@ -347,6 +349,220 @@ class DbCommands(object):
|
||||
sys.exit(1)
|
||||
|
||||
|
||||
class QuotaCommands(object):
|
||||
"""Class for managing quota issues."""
|
||||
|
||||
def __init__(self):
|
||||
pass
|
||||
|
||||
@args('--project-id', default=None,
|
||||
help=('The ID of the project where we want to sync the quotas '
|
||||
'(defaults to all projects).'))
|
||||
def check(self, project_id):
|
||||
"""Check if quotas and reservations are correct
|
||||
|
||||
This action checks quotas and reservations, for a specific project or
|
||||
for all projects, to see if they are out of sync.
|
||||
|
||||
The check will also look for duplicated entries.
|
||||
|
||||
One way to use this check in combination with the sync action is to
|
||||
run the check for all projects, take note of those that are out of
|
||||
sync, and then sync them one by one at intervals to reduce stress on
|
||||
the DB.
|
||||
"""
|
||||
result = self._check_sync(project_id, do_fix=False)
|
||||
if result:
|
||||
sys.exit(1)
|
||||
|
||||
@args('--project-id', default=None,
|
||||
help=('The ID of the project where we want to sync the quotas '
|
||||
'(defaults to all projects).'))
|
||||
def sync(self, project_id):
|
||||
"""Fix quotas and reservations
|
||||
|
||||
This action refreshes existing quota usage and reservation count for a
|
||||
specific project or for all projects.
|
||||
|
||||
The refresh will also remove duplicated entries.
|
||||
|
||||
This operation is best executed when Cinder is not running, but it can
|
||||
be run with cinder services running as well.
|
||||
|
||||
A different transaction is used for each project's quota sync, so an
|
||||
action failure will only rollback the current project's changes.
|
||||
"""
|
||||
self._check_sync(project_id, do_fix=True)
|
||||
|
||||
def _get_quota_projects(self, ctxt, project_id):
|
||||
"""Get project ids that have quota_usage entries."""
|
||||
if project_id:
|
||||
model = models.QuotaUsage
|
||||
session = db_api.get_session()
|
||||
# If the project does not exist
|
||||
if not session.query(db_api.sql.exists().where(
|
||||
db_api.and_(model.project_id == project_id,
|
||||
~model.deleted))).scalar():
|
||||
print('Project id %s has no quota usage. Nothing to do.' %
|
||||
project_id)
|
||||
return []
|
||||
return [project_id]
|
||||
|
||||
projects = db_api.model_query(context,
|
||||
models.QuotaUsage,
|
||||
read_deleted="no").\
|
||||
with_entities('project_id').\
|
||||
distinct().\
|
||||
all()
|
||||
project_ids = [row.project_id for row in projects]
|
||||
return project_ids
|
||||
|
||||
def _get_quota_syncs(self, ctxt, session, resources, project_id):
|
||||
"""Get data necessary to check out of sync quota usage.
|
||||
|
||||
Returns a list of tuples where each tuple contains a QuotaUsage
|
||||
instance, the corresponding resource from quota.QUOTAS.resources, the
|
||||
volume type id, and the volume type name.
|
||||
|
||||
Volume type id and name will be None for non volume type entries, as
|
||||
expected by the DB sync methods.
|
||||
"""
|
||||
usages = db_api.model_query(ctxt,
|
||||
db_api.models.QuotaUsage,
|
||||
read_deleted="no",
|
||||
session=session).\
|
||||
filter_by(project_id=project_id).\
|
||||
with_for_update().\
|
||||
all()
|
||||
|
||||
res = []
|
||||
for usage in usages:
|
||||
resource = resources[usage.resource]
|
||||
volume_type_id = getattr(resource, 'volume_type_id', None)
|
||||
volume_type_name = getattr(resource, 'volume_type_name', None)
|
||||
res.append((usage, resource, volume_type_id, volume_type_name))
|
||||
return res
|
||||
|
||||
def _get_reservations(self, ctxt, session, project_id, usage_id):
|
||||
"""Get reservations for a given project and usage id."""
|
||||
reservations = db_api.model_query(ctxt, models.Reservation,
|
||||
read_deleted="no",
|
||||
session=session).\
|
||||
filter_by(project_id=project_id, usage_id=usage_id).\
|
||||
with_for_update().\
|
||||
all()
|
||||
return reservations
|
||||
|
||||
def _check_duplicates(self, ctxt, session, sync_data, do_fix):
|
||||
"""Look for duplicated quota used entries (bug#1484343)
|
||||
|
||||
If we have duplicates and we are fixing them, then we reassign the
|
||||
reservations of the usage we are removing.
|
||||
"""
|
||||
resources = collections.defaultdict(list)
|
||||
for sync in sync_data:
|
||||
usage = sync[0]
|
||||
resources[usage.resource].append(sync)
|
||||
|
||||
duplicates_found = False
|
||||
result = []
|
||||
for syncs in resources.values():
|
||||
keep_sync = syncs[0]
|
||||
if len(syncs) > 1:
|
||||
duplicates_found = True
|
||||
keep_usage = keep_sync[0]
|
||||
print('\t%s: %s duplicated usage entries - ' %
|
||||
(keep_usage.resource, len(syncs) - 1),
|
||||
end='')
|
||||
|
||||
if do_fix:
|
||||
# Each of the duplicates can have reservations
|
||||
reassigned = 0
|
||||
for sync in syncs[1:]:
|
||||
usage = sync[0]
|
||||
reservations = self._get_reservations(ctxt, session,
|
||||
usage.project_id,
|
||||
usage.id)
|
||||
reassigned += len(reservations)
|
||||
for reservation in reservations:
|
||||
reservation.usage_id = keep_usage.id
|
||||
keep_usage.in_use += usage.in_use
|
||||
keep_usage.reserved += usage.reserved
|
||||
usage.delete(session=session)
|
||||
print('duplicates removed & %s reservations reassigned' %
|
||||
reassigned)
|
||||
else:
|
||||
print('ignored')
|
||||
result.append(keep_sync)
|
||||
return result, duplicates_found
|
||||
|
||||
def _check_sync(self, project_id, do_fix):
|
||||
"""Check the quotas and reservations optionally fixing them."""
|
||||
|
||||
ctxt = context.get_admin_context()
|
||||
# Get the quota usage types and their sync methods
|
||||
resources = quota.QUOTAS.resources
|
||||
|
||||
# Get all project ids that have quota usage. Method doesn't lock
|
||||
# projects, since newly added projects should not be out of sync and
|
||||
# projects removed will just turn nothing on the quota usage.
|
||||
projects = self._get_quota_projects(ctxt, project_id)
|
||||
|
||||
session = db_api.get_session()
|
||||
|
||||
action_msg = ' - fixed' if do_fix else ''
|
||||
|
||||
discrepancy = False
|
||||
|
||||
# NOTE: It's important to always get the quota first and then the
|
||||
# reservations to prevent deadlocks with quota commit and rollback from
|
||||
# running Cinder services.
|
||||
for project in projects:
|
||||
with session.begin():
|
||||
print('Processing quota usage for project %s' % project)
|
||||
# We only want to sync existing quota usage rows
|
||||
syncs = self._get_quota_syncs(ctxt, session, resources,
|
||||
project)
|
||||
|
||||
# Check for duplicated entries (bug#1484343)
|
||||
syncs, duplicates_found = self._check_duplicates(ctxt, session,
|
||||
syncs, do_fix)
|
||||
if duplicates_found:
|
||||
discrepancy = True
|
||||
|
||||
# Check quota and reservations
|
||||
for usage, resource, vol_type_id, vol_type_name in syncs:
|
||||
# Get the correct value for this quota usage resource
|
||||
sync = db_api.QUOTA_SYNC_FUNCTIONS[resource.sync]
|
||||
updates = sync(ctxt, project,
|
||||
volume_type_id=vol_type_id,
|
||||
volume_type_name=vol_type_name,
|
||||
session=session)
|
||||
|
||||
in_use = list(updates.values())[0]
|
||||
if in_use != usage.in_use:
|
||||
print('\t%s: invalid usage saved=%s actual=%s%s' %
|
||||
(usage.resource, usage.in_use, in_use,
|
||||
action_msg))
|
||||
discrepancy = True
|
||||
if do_fix:
|
||||
usage.in_use = in_use
|
||||
|
||||
reservations = self._get_reservations(ctxt, session,
|
||||
project, usage.id)
|
||||
num_reservations = sum(r.delta for r in reservations
|
||||
if r.delta > 0)
|
||||
if num_reservations != usage.reserved:
|
||||
print('\t%s: invalid reserved saved=%s actual=%s%s' %
|
||||
(usage.resource, usage.reserved,
|
||||
num_reservations, action_msg))
|
||||
discrepancy = True
|
||||
if do_fix:
|
||||
usage.reserved = num_reservations
|
||||
print('Action successfully completed')
|
||||
return discrepancy
|
||||
|
||||
|
||||
class VersionCommands(object):
|
||||
"""Class for exposing the codebase version."""
|
||||
|
||||
@ -667,6 +883,7 @@ CATEGORIES = {
|
||||
'cg': ConsistencyGroupCommands,
|
||||
'db': DbCommands,
|
||||
'host': HostCommands,
|
||||
'quota': QuotaCommands,
|
||||
'service': ServiceCommands,
|
||||
'version': VersionCommands,
|
||||
'volume': VolumeCommands,
|
||||
@ -682,7 +899,7 @@ def methods_of(obj):
|
||||
result = []
|
||||
for i in dir(obj):
|
||||
if isinstance(getattr(obj, i),
|
||||
collections.Callable) and not i.startswith('_'):
|
||||
collections_abc.Callable) and not i.startswith('_'):
|
||||
result.append((i, getattr(obj, i)))
|
||||
return result
|
||||
|
||||
|
@ -37,8 +37,7 @@ For example, to obtain a list of the cinder services currently running:
|
||||
Run without arguments to see a list of available command categories:
|
||||
``cinder-manage``
|
||||
|
||||
Categories are shell, logs, migrate, db, volume, host, service, backup,
|
||||
version, and config. Detailed descriptions are below.
|
||||
The categories are listed below, along with detailed descriptions.
|
||||
|
||||
You can also run with a category argument such as 'db' to see a list of all
|
||||
commands in that category:
|
||||
@ -47,6 +46,76 @@ commands in that category:
|
||||
These sections describe the available categories and arguments for
|
||||
cinder-manage.
|
||||
|
||||
Cinder Quota
|
||||
~~~~~~~~~~~~
|
||||
|
||||
Cinder quotas sometimes run out of sync, and while there are some mechanisms
|
||||
in place in Cinder that, with the proper configuration, try to do a resync
|
||||
of the quotas, they are not perfect and are susceptible to race conditions,
|
||||
so they may result in less than perfect accuracy in refreshed quotas.
|
||||
|
||||
The cinder-manage quota commands are meant to help manage these issues while
|
||||
allowing a finer control of when and what quotas are fixed.
|
||||
|
||||
**Checking if quotas and reservations are correct.**
|
||||
|
||||
``cinder-manage quota check [-h] [--project-id PROJECT_ID] [--use-locks]``
|
||||
|
||||
Accepted arguments are:
|
||||
|
||||
.. code-block:: console
|
||||
|
||||
--project-id PROJECT_ID
|
||||
The ID of the project where we want to sync the quotas
|
||||
(defaults to all projects).
|
||||
--use-locks For precise results tables in the DB need to be
|
||||
locked.
|
||||
|
||||
This command checks quotas and reservations, for a specific project (passing
|
||||
``--project-id``) or for all projects, to see if they are out of sync.
|
||||
|
||||
The check will also look for duplicated entries.
|
||||
|
||||
By default it runs in the least accurate mode (where races have a higher
|
||||
chance of happening) to minimize the impact on running cinder services. This
|
||||
means that false errors are more likely to be reported due to race conditions
|
||||
when Cinder services are running.
|
||||
|
||||
Accurate mode is also supported, but it will lock many tables (affecting all
|
||||
tenants) and is not recommended with services that are being used.
|
||||
|
||||
One way to use this action in combination with the sync action is to run the
|
||||
check for all projects, take note of those that are out of sync, and the sync
|
||||
them one by one at intervals to allow cinder to operate semi-normally.
|
||||
|
||||
**Fixing quotas and reservations**
|
||||
|
||||
``cinder-manage quota sync [-h] [--project-id PROJECT_ID] [--no-locks]``
|
||||
|
||||
Accepted arguments are:
|
||||
|
||||
.. code-block:: console
|
||||
|
||||
--project-id PROJECT_ID
|
||||
The ID of the project where we want to sync the quotas
|
||||
(defaults to all projects).
|
||||
--no-locks For less precise results, but also less intrusive.
|
||||
|
||||
This command refreshes existing quota usage and reservation count for a
|
||||
specific project or for all projects.
|
||||
|
||||
The refresh will also remove duplicated entries.
|
||||
|
||||
This operation is best executed when Cinder is not running, as it requires
|
||||
locking many tables (affecting all tenants) to make sure that then sync is
|
||||
accurate.
|
||||
|
||||
If accuracy is not our top priority, or we know that a specific project is not
|
||||
in use, we can disable the locking.
|
||||
|
||||
A different transaction is used for each project's quota sync, so an action
|
||||
failure will only rollback the current project's changes.
|
||||
|
||||
Cinder Db
|
||||
~~~~~~~~~
|
||||
|
||||
|
@ -0,0 +1,6 @@
|
||||
---
|
||||
features:
|
||||
- |
|
||||
The cinder-manage command now includes a new ``quota`` category with two
|
||||
possible actions ``check`` and ``sync`` to help administrators manage out
|
||||
of sync quotas on long running deployments.
|
Loading…
Reference in New Issue
Block a user