distcloud/distributedcloud/dcmanager/state
Raphael Lima 8797888d14 Optimize subcloud state manager's queries
This commit optimizes subcloud state manager's queries used to perform a
bulk update for a subcloud's availability and endpoint status, which
were being used in [1].
Previously, a separate database query was made for every update of the
availability status and of each endpoint during the audit process,
which resulted in duplicated state and database calls.
In [1] the RPC state calls were significantly reduced by creating a
single request for each subcloud. Consequently, all queries that
were made in the database in separate steps of the process started to be
executed at once, resulting in approximately 23 [2] queries per subcloud
for a complete audit. With this commit, the maximum number of database
transactions is reduced from 23 to 3.
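
As a rough illustration of the idea (not the code from this commit), the
sketch below contrasts one UPDATE and commit per endpoint with a single
transaction that covers all endpoints. The table, column names, endpoint
names and helper functions are hypothetical, and sqlite3 stands in for
dcmanager's real database layer.

```python
import sqlite3

# Hypothetical endpoint names and schema, for illustration only.
ENDPOINTS = [f"endpoint-{i}" for i in range(1, 8)]


def make_db():
    """Create an in-memory table with one status row per endpoint."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE subcloud_status ("
        "subcloud_id INTEGER, endpoint_type TEXT, sync_status TEXT)"
    )
    conn.executemany(
        "INSERT INTO subcloud_status VALUES (1, ?, 'unknown')",
        [(endpoint,) for endpoint in ENDPOINTS],
    )
    conn.commit()
    return conn


def update_one_by_one(conn, subcloud_id, statuses):
    """Issue one UPDATE and one commit per endpoint."""
    transactions = 0
    for endpoint, status in statuses.items():
        conn.execute(
            "UPDATE subcloud_status SET sync_status = ? "
            "WHERE subcloud_id = ? AND endpoint_type = ?",
            (status, subcloud_id, endpoint),
        )
        conn.commit()
        transactions += 1
    return transactions


def update_in_bulk(conn, subcloud_id, statuses):
    """Group endpoints by target status and commit everything once."""
    by_status = {}
    for endpoint, status in statuses.items():
        by_status.setdefault(status, []).append(endpoint)
    with conn:  # commits once on success, rolls back on error
        for status, endpoints in by_status.items():
            placeholders = ", ".join("?" for _ in endpoints)
            conn.execute(
                "UPDATE subcloud_status SET sync_status = ? "
                "WHERE subcloud_id = ? "
                f"AND endpoint_type IN ({placeholders})",
                (status, subcloud_id, *endpoints),
            )
    return 1  # a single transaction regardless of endpoint count
```

With seven endpoints moving to the same status, the first helper performs
seven transactions and the second performs one.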

Test plan:
1. PASS: Unmanage a subcloud and verify that all of its endpoints' sync
   statuses become unknown.
2. PASS: Manage a subcloud and verify that all of its endpoints' sync
   statuses become in-sync.
3. PASS: Apply a patch in the system controller and verify that all of
   the subclouds' patching sync status becomes out-of-sync.
4. PASS: Apply the patch in the subclouds and verify that their patching
   sync status becomes in-sync.
5. PASS: Verify that the hourly unconditional update for the subcloud's
   availability status updates the database.

[1] https://review.opendev.org/c/starlingx/distcloud/+/922058
[2] Analysis of the number of requests considering dcmanager's audit
Subcloud becoming online:
- subcloud_get_by_region_name: 1
- fm's get_fault: 1
- fm's set_fault or clear_fault: 1
- subcloud_update: 1
- for each endpoint audited by dcmanager (7):
    - subcloud_get_by_region_name: 1 (removed)
    - subcloud_get_with_status: 1 (removed)
    - subcloud_endpoint_status_db_model_to_dict: 9 (removed)
    - subcloud_status_update: 1 (changed to one query for all endpoints)
    - fm's get_fault: 1
    - fm's set_fault or clear_fault: 1
Total:
    - 39 queries that are now 19, considering fm's database.
    - 23 queries that are now 3 in dcmanager's database

Note that the totals do not include the db_model_to_dict request
because it does not query the database.
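
The totals above can be re-derived with a short calculation. The snippet
only restates the counts listed in this analysis for the online case; the
variable and function names are made up for the example.

```python
ENDPOINTS_AUDITED = 7

# Calls made once per subcloud: (database, count)
once = [
    ("dcmanager", 1),  # subcloud_get_by_region_name
    ("fm", 1),         # get_fault
    ("fm", 1),         # set_fault or clear_fault
    ("dcmanager", 1),  # subcloud_update
]

# Calls made for each audited endpoint before this commit
per_endpoint_before = [
    ("dcmanager", 1),  # subcloud_get_by_region_name (removed)
    ("dcmanager", 1),  # subcloud_get_with_status (removed)
    ("dcmanager", 1),  # subcloud_status_update (now one query in total)
    ("fm", 1),         # get_fault
    ("fm", 1),         # set_fault or clear_fault
]

# Calls still made for each audited endpoint after this commit
per_endpoint_after = [("fm", 1), ("fm", 1)]

# The per-endpoint status update collapses into one query for all endpoints
once_after = once + [("dcmanager", 1)]


def totals(once_calls, per_endpoint_calls):
    """Sum the query counts per database for a complete audit."""
    result = {"dcmanager": 0, "fm": 0}
    for db, count in once_calls:
        result[db] += count
    for db, count in per_endpoint_calls:
        result[db] += count * ENDPOINTS_AUDITED
    return result


before = totals(once, per_endpoint_before)
after = totals(once_after, per_endpoint_after)
print(before, sum(before.values()))
print(after, sum(after.values()))
```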

Subcloud becoming offline:
- subcloud_get_by_region_name: 2 reduced to 1
- fm's get_fault: 10 (one for each endpoint and the availability)
- fm's set_fault or clear_fault: 10
- subcloud_get_with_status: 1 (removed)
- subcloud_endpoint_status_db_model_to_dict: 9 (removed)
- subcloud_status_update_endpoints: 1
- subcloud_update: 1

Story: 2011106
Task: 50433

Change-Id: I34b8604bf445cc0ebdc02c5959a919221e62de5a
Signed-off-by: Raphael Lima <Raphael.Lima@windriver.com>
2024-07-11 17:29:27 -03:00

Service

The DC Manager State Service is responsible for:

Subcloud state updates coming from dcmanager-manager service

service.py:

Runs the DC Manager State Service in multi-worker mode and establishes the RPC server

subcloud_state_manager.py:

Provide subcloud state updates