RBD: Change rbd_exclusive_cinder_pool's default

In Cinder we always try to have sane defaults, but the current RBD default
for rbd_exclusive_cinder_pool may lead to issues on deployments with a large
number of volumes:

- Cinder taking a long time to start.
- Cinder becoming non-responsive.
- Cinder stats gathering taking longer than the gathering period.

This is caused by the driver making an independent request to get detailed
information on each image to accurately calculate the space used by the
Cinder volumes.

With this patch we change the default to make sure that these issues don't
happen in the most common deployment case (the exclusive Cinder pool).

Related-Bug: #1704106
Change-Id: I839441a71238cdad540ba8d9d4d18b1f0fa3ee9d
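
The cost described in the commit message can be illustrated with a small, self-contained sketch. This is not the driver's code: `Image`, `FakePool`, and `provisioned_gb` are hypothetical names invented here, and the pool is an in-memory stand-in that only counts how many requests a real cluster would have received.

```python
from dataclasses import dataclass


@dataclass
class Image:
    """Hypothetical stand-in for an RBD image stored in the pool."""
    name: str
    provisioned_gb: int


class FakePool:
    """Hypothetical in-memory pool that counts the requests made to it."""

    def __init__(self, images):
        self._images = {image.name: image for image in images}
        self.requests = 0

    def list_images(self):
        self.requests += 1  # a single request lists the whole pool
        return list(self._images)

    def image_size_gb(self, name):
        self.requests += 1  # one independent request per image
        return self._images[name].provisioned_gb


def provisioned_gb(pool, allocated_capacity_gb, exclusive):
    """Return the provisioned capacity a backend could report."""
    if exclusive:
        # Exclusive pool: reuse the value Cinder already tracks,
        # so the cluster is not queried at all.
        return allocated_capacity_gb
    # Shared pool: ask the cluster about every image in the pool.
    return sum(pool.image_size_gb(name) for name in pool.list_images())


if __name__ == "__main__":
    pool = FakePool([Image(f"volume-{i}", 10) for i in range(1000)])
    for exclusive in (True, False):
        pool.requests = 0
        total = provisioned_gb(pool, 10000, exclusive)
        print(f"exclusive={exclusive}: {total} GB, {pool.requests} requests")
```

In this sketch the non-exclusive path issues one listing request plus one request per image, so the work grows linearly with the number of volumes, while the exclusive path issues none.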
parent 7175a23731
commit 4ba6664dee
@@ -102,13 +102,16 @@ RBD_OPTS = [
                      'dynamic value (used + current free) and to False to '
                      'report a static value (quota max bytes if defined and '
                      'global size of cluster if not).'),
-    cfg.BoolOpt('rbd_exclusive_cinder_pool', default=False,
-                help="Set to True if the pool is used exclusively by Cinder. "
+    cfg.BoolOpt('rbd_exclusive_cinder_pool', default=True,
+                help="Set to False if the pool is shared with other usages. "
                      "On exclusive use driver won't query images' provisioned "
                      "size as they will match the value calculated by the "
                      "Cinder core code for allocated_capacity_gb. This "
                      "reduces the load on the Ceph cluster as well as on the "
-                     "volume service."),
+                     "volume service. On non exclusive use driver will query "
+                     "the Ceph cluster for per image used disk, this is an "
+                     "intensive operation having an independent request for "
+                     "each image."),
     cfg.BoolOpt('enable_deferred_deletion', default=False,
                 help='Enable deferred deletion. Upon deletion, volumes are '
                      'tagged for deletion but will only be removed '
@@ -81,6 +81,37 @@ Ceph exposes RADOS; you can access it through the following interfaces:
     Linux kernel and QEMU block devices that stripe
     data across multiple objects.
 
+RBD pool
+~~~~~~~~
+
+The RBD pool used by the Cinder backend is configured with option ``rbd_pool``,
+and by default the driver expects exclusive management access to that pool, as
+in being the only system creating and deleting resources in it, since that's
+the recommended deployment choice.
+
+Pool sharing is strongly discouraged, and if we were to share the pool with
+other services, within OpenStack (Nova, Glance, another Cinder backend) or
+outside of OpenStack (oVirt), then the stats returned by the driver to the
+scheduler would not be entirely accurate.
+
+The inaccuracy would be that the actual size in use by the Cinder volumes would
+be lower than the reported one, since it would also include the space used
+by the other services.
+
+We can set the ``rbd_exclusive_cinder_pool`` configuration option to ``false``
+to fix this inaccuracy, but this has a performance impact.
+
+.. warning::
+
+   Setting ``rbd_exclusive_cinder_pool`` to ``false`` will increase the burden
+   on the Cinder driver and the Ceph cluster, since a request will be made for
+   each existing image, to retrieve its size, during the stats gathering
+   process.
+
+   For deployments with a large number of volumes it is recommended to leave
+   the default value of ``true``, and accept the inaccuracy, as it should not
+   be particularly problematic.
+
 Driver options
 ~~~~~~~~~~~~~~
 
@@ -0,0 +1,17 @@
+---
+upgrade:
+  - |
+    Ceph/RBD volume backends will now assume exclusive cinder pools, as if they
+    had ``rbd_exclusive_cinder_pool = true`` in their configuration.
+
+    This helps deployments with a large number of volumes and prevents issues on
+    deployments with a growing number of volumes at the small cost of
+    slightly less accurate stats being reported to the scheduler.
+fixes:
+  - |
+    Ceph/RBD: Fix Cinder taking a long time to start for Ceph/RBD backends.
+    (`Related-Bug #1704106 <https://bugs.launchpad.net/cinder/+bug/1704106>`_)
+  - |
+    Ceph/RBD: Fix Cinder becoming non-responsive and stats gathering taking
+    longer than its period. (`Related-Bug #1704106
+    <https://bugs.launchpad.net/cinder/+bug/1704106>`_)