cinder/doc/source/admin/ts-db-cpu-spikes.rst
Gorka Eguileor bbe42df26c Improve resource listing efficiency
Cinder's resource tables (volumes, snapshots, backups, groups,
group_snapshots) don't have required indexes to do efficient resource
listings on the database engine.

This forces the database to go through all existing database records for
any listing (even when there are no additional user requested filtering)
and check one by one the conditions, resulting in high CPU load on the
database servers.

As an example a listing for a project with a single volume:

$ cinder list
+--------------------------------------+-----------+------+------+-------------+----------+-------------+
| ID                                   | Status    | Name | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-----------+------+------+-------------+----------+-------------+
| 8a6b11d5-3343-4c0d-8a64-8e7070d1988e | available | test | 1    | lvmdriver-1 | false    |             |
+--------------------------------------+-----------+------+------+-------------+----------+-------------+

May result in the database going through thousand of records (all
deleted records and all records for other projects), as demonstrated by
the following SQL queries where 10435 rows existed in the database and
had to be checked just to return a single one.

This is the SQL equivalent of the earlier cinder list command:

$ mysql cinder -e 'select id, display_name from volumes where not deleted and project_id="a41464e54125407aab09e0236cce2c3c"'
+--------------------------------------+--------------+
| id                                   | display_name |
+--------------------------------------+--------------+
| 8a6b11d5-3343-4c0d-8a64-8e7070d1988e | test         |
+--------------------------------------+--------------+

Which if we look at the numbers of rows that it hits with `explain` we
can see it hits every single row:

$ mysql cinder -e 'explain select id, display_name from volumes where not deleted and project_id="a41464e54125407aab09e0236cce2c3c"'
+------+-------------+---------+------+---------------+------+---------+------+-------+-------------+
| id   | select_type | table   | type | possible_keys | key  | key_len | ref  | rows  | Extra       |
+------+-------------+---------+------+---------------+------+---------+------+-------+-------------+
|    1 | SIMPLE      | volumes | ALL  | NULL          | NULL | NULL    | NULL | 10435 | Using where |
+------+-------------+---------+------+---------------+------+---------+------+-------+-------------+

This patch introduces a deleted and project_id index for the volumes,
snapshots, groups, group_snapshots, and backups tables, which will allow
the database to do efficient retrieval of records for listings.

The reason why we order first by deleted and then by project_id is
because when an admin does a listing with `--all-tenants` that query
will be able to use the deleted table of the new compound index.

We can see the new index this patch adds and how it allows the DB engine
to efficiently retrieve non deleted volumes from the specific project.

$ mysql cinder -e 'show index from volumes'
+---------+------------+--------------------------------+--------------+---------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table   | Non_unique | Key_name                       | Seq_in_index | Column_name         | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+---------+------------+--------------------------------+--------------+---------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| volumes |          0 | PRIMARY                        |            1 | id                  | A         |           1 |     NULL | NULL   |      | BTREE      |         |               |
| volumes |          1 | volumes_service_uuid_idx       |            1 | service_uuid        | A         |           1 |     NULL | NULL   | YES  | BTREE      |         |               |
| volumes |          1 | volumes_service_uuid_idx       |            2 | deleted             | A         |           1 |     NULL | NULL   | YES  | BTREE      |         |               |
| volumes |          1 | ix_volumes_consistencygroup_id |            1 | consistencygroup_id | A         |           1 |     NULL | NULL   | YES  | BTREE      |         |               |
| volumes |          1 | ix_volumes_group_id            |            1 | group_id            | A         |           1 |     NULL | NULL   | YES  | BTREE      |         |               |
| volumes |          1 | volumes_deleted_project_id_idx |            1 | deleted             | A         |           1 |     NULL | NULL   | YES  | BTREE      |         |               |
| volumes |          1 | volumes_deleted_project_id_idx |            2 | project_id          | A         |           1 |     NULL | NULL   | YES  | BTREE      |         |               |
| volumes |          1 | volumes_deleted_host_idx       |            1 | deleted             | A         |           1 |     NULL | NULL   | YES  | BTREE      |         |               |
| volumes |          1 | volumes_deleted_host_idx       |            2 | host                | A         |           1 |     NULL | NULL   | YES  | BTREE      |         |               |
+---------+------------+--------------------------------+--------------+---------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

$ mysql cinder -e 'explain select id, display_name from volumes where not deleted and project_id="a41464e54125407aab09e0236cce2c3c"'
+------+-------------+---------+------+--------------------------------+--------------------------------+---------+-------------+------+-----------------------+
| id   | select_type | table   | type | possible_keys                  | key                            | key_len | ref         | rows | Extra                 |
+------+-------------+---------+------+--------------------------------+--------------------------------+---------+-------------+------+-----------------------+
|    1 | SIMPLE      | volumes | ref  | volumes_deleted_project_id_idx | volumes_deleted_project_id_idx | 770     | const,const |    1 | Using index condition |
+------+-------------+---------+------+--------------------------------+--------------------------------+---------+-------------+------+-----------------------+

We also add another missing index for the volumes that is used by the
create volume from image.

The patch also updates 3 tests that were expecting the result from a
query to be in a specific order when there is no actual ORDER BY in the
query.

Closes-Bug: #1952443
Change-Id: I8456a9f82bdf18ada76874dc0c4f59542e1c03ab
2023-03-06 14:04:57 +00:00

1.8 KiB

Database CPU spikes during operations

Query load upon the database can become a bottleneck that cascades across a deployment and ultimately degrades not only the Cinder service but also the whole OpenStack deployment.

Often, depending on load, query patterns, periodic tasks, and so on and so forth, additional indexes may be needed to help provide hints to the database so it can most efficently attempt to reduce the number of rows which need to be examined in order to return a result set.

Adding indexes

In older releases, before 2023.1 (Antelope), there were some tables that performed poorly in the presence of a large number of deleted resources (volumes, snapshots, backups, etc) which resulted in high CPU loads on the DB servers not only when listing those resources, but also when doing some operations on them. This was resolved by adding appropriate indexes to them.

This example below is specific to MariaDB/MySQL, but the syntax should be easy to modify for operators using PostgreSQL, and it represents the changes that older releases could add to resolve these DB server CPU spikes in such a way that they would not conflict with the ones that Cinder introduced in 2023.1 (Antelope).

use cinder;
create index groups_deleted_project_id_idx on groups (deleted, project_id);
create index group_snapshots_deleted_project_id_idx on groups (deleted, project_id);
create index volumes_deleted_project_id_idx on volumes (deleted, project_id);
create index volumes_deleted_host_idx on volumes (deleted, host);
create index snapshots_deleted_project_id_idx on snapshots (deleted, project_id);
create index backups_deleted_project_id_idx on backups (deleted, project_id);