This has come up a few times in support questions from operators whose nova cell database is out of sync with the placement database, resulting in a mismatch between compute nodes and resource provider uuids. They just want to wipe the placement database and rebuild it from the current data in nova. This provides a document with the high-level steps to do that. Change-Id: Ie4fed22615f60e132a887fe541771c447fae1082
Rebuild placement DB
Problem
You have somehow changed a nova cell database and the compute_nodes
table entries are now reporting different uuids to the placement
service, but placement already has resource_providers table entries
with the same names as those computes. As a result, the resource
providers in placement and the compute nodes in the nova database are
not synchronized. This can happen, for example, after restoring the
nova cell database from a backup where the compute hosts have not
changed but are using different uuids.
Nova reports compute node inventory to placement using the
hypervisor_hostname and uuid from the compute_nodes table. Placement
stores these in its resource_providers table, which has unique
constraints on the name (the hostname in this case) and on the uuid.
Trying to create a new resource provider with a new uuid but the same
name as an existing provider results in a 409 error from placement,
such as in bug 1817833.
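To make the mismatch concrete, the following is a minimal sketch of how the conflicting hosts could be identified. It assumes you have already collected (hypervisor_hostname, uuid) pairs from the compute_nodes table and (name, uuid) pairs from the resource_providers table by whatever means suits your deployment. The function name and the sample data are hypothetical and are not part of nova or placement.

```python
def find_conflicting_providers(compute_nodes, resource_providers):
    """Return (hostname, nova_uuid, placement_uuid) for every compute node
    whose name already exists in placement under a different uuid.

    Both arguments are iterables of (name, uuid) pairs. These are the
    cases where creating the provider would hit the name conflict.
    """
    placement_uuid_by_name = {name: uuid for name, uuid in resource_providers}
    conflicts = []
    for hostname, node_uuid in compute_nodes:
        provider_uuid = placement_uuid_by_name.get(hostname)
        # Same name, different uuid: nova would try to create a second
        # provider with a name that placement already holds.
        if provider_uuid is not None and provider_uuid != node_uuid:
            conflicts.append((hostname, node_uuid, provider_uuid))
    return conflicts


if __name__ == "__main__":
    # Hypothetical sample rows, shortened uuids for readability.
    nodes = [("compute-1", "uuid-a"), ("compute-2", "uuid-b-new")]
    providers = [("compute-1", "uuid-a"), ("compute-2", "uuid-b-old")]
    for hostname, nova_uuid, placement_uuid in find_conflicting_providers(
        nodes, providers
    ):
        print(f"{hostname}: nova={nova_uuid} placement={placement_uuid}")
```

If only a handful of hosts show up as conflicting, a full rebuild may be more than is needed; the procedure below is aimed at the case where most or all of them do.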
Solution
Warning
This is likely a last resort when all computes and resource
providers are not synchronized and it is simpler to just rebuild the
placement database from the current state of nova. This may, however,
not work when using placement for more advanced features such as
ports with minimum bandwidth guarantees (see /admin/config-qos-min-bw)
or accelerators.
Obviously testing first in a pre-production environment is ideal.
These are the steps at a high level:
- Make a backup of the existing placement database in case these steps fail and you need to start over.
- Recreate the placement database and run the schema migrations to initialize the placement database.
- Either restart the nova-compute services or wait for the update_resources_interval configuration option to elapse so that they report resource providers and their inventory to placement.
- Run the nova-manage placement heal_allocations command to report allocations to placement for the existing instances in nova.
- Run the nova-manage placement sync_aggregates command to synchronize nova host aggregates to placement resource provider aggregates.
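The steps above can be sketched as an ordered plan. This is an illustration under stated assumptions, not a supported tool: it assumes a MySQL or MariaDB backend with client credentials supplied through an option file, a database named placement, and a backup path chosen purely for the example. The helper names build_rebuild_plan and run_plan are hypothetical. The compute service restart is left as a manual step because the service name differs between distributions. The plan defaults to a dry run that only prints the commands.

```python
import shlex
import subprocess


def build_rebuild_plan(db_name="placement", backup_path="placement-backup.sql"):
    """Return the rebuild steps as (description, argv) pairs, in order.

    argv is None for steps that must be carried out by hand.
    Assumes MySQL/MariaDB; db_name and backup_path are example values.
    """
    recreate_sql = f"DROP DATABASE {db_name}; CREATE DATABASE {db_name};"
    return [
        ("back up the existing placement database",
         ["mysqldump", f"--result-file={backup_path}", db_name]),
        ("recreate the empty placement database",
         ["mysql", f"--execute={recreate_sql}"]),
        ("run the placement schema migrations",
         ["placement-manage", "db", "sync"]),
        ("restart nova-compute on every compute host, or wait for "
         "update_resources_interval to elapse", None),
        ("report allocations for existing instances",
         ["nova-manage", "placement", "heal_allocations"]),
        ("mirror nova host aggregates into placement",
         ["nova-manage", "placement", "sync_aggregates"]),
    ]


def run_plan(plan, dry_run=True):
    """Print each step; execute the commands only when dry_run is False."""
    for description, argv in plan:
        if argv is None:
            print(f"MANUAL: {description}")
        elif dry_run:
            print(f"WOULD RUN ({description}): {shlex.join(argv)}")
        else:
            print(f"RUNNING ({description}): {shlex.join(argv)}")
            # check=True stops the plan at the first failing command.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run_plan(build_rebuild_plan())  # dry run: nothing is modified
```

Running with dry_run=False would execute everything back to back, which is rarely what you want: the heal_allocations step is only useful after the compute services have recreated their resource providers, so in practice the commands are better run one at a time with a check in between.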
Once complete, test your deployment as usual, e.g. running Tempest integration and/or Rally tests, creating, migrating and deleting a server, etc.