Exclude deleted service records when calling hypervisor statistics
Hypervisor statistics can be incorrect if deleted service records are not
excluded from the DB query. A user may stop the nova-compute service on some
compute nodes and then delete the service from nova. Deleting the
nova-compute service soft-deletes the corresponding DB records in both the
services table and the compute_nodes table if the compute_nodes record is
old-style, i.e. linked to the services record. Modern compute_nodes records
are not linked to the services table, so deleting the services record does
not delete the compute_nodes record, and the ResourceTracker will not
recreate the compute_nodes record if the host and hypervisor_hostname still
match the existing record; restarting the process after deleting the service
will, however, create a new services table record with the same
host/binary/topic.

If the nova-compute service on that host is restarted, it automatically adds
a record to the compute_nodes table (assuming the old one was deleted because
it was an old-style record) and a corresponding record to the services table.
If the host name of the compute node did not change, the newly created
records in the services and compute_nodes tables are identical to the
previously soft-deleted records except for the deleted column.

When hypervisor statistics are requested, the DB layer joins records across
the whole deployment by comparing the host field selected from the services
table with the host field selected from the compute_nodes table. If multiple
services records share the same host value, the calculated results are
multiplied, which is exactly what can happen after the sequence above.

Co-Authored-By: Matt Riedemann <mriedem.os@gmail.com>
Change-Id: I9dfa15f69f8ef9c6cb36b2734a8601bd73e9d6b3
Closes-Bug: #1692397
(cherry picked from commit 3d3e9cdd77)
(cherry picked from commit 74e2a400b2)
commit 6dc2a0ec1c
parent 9d299ae50e
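To make the failure mode concrete, here is a minimal sketch, not Nova code,
that reproduces the double-counting against an in-memory SQLite database. It
assumes SQLAlchemy 1.4+; the services/compute_nodes tables, columns, values
and the vcpu_total() helper are simplified stand-ins invented for this
illustration: one compute node reporting 8 vCPUs, plus a soft-deleted and a
live services row sharing the same host.

    # Toy reproduction of bug 1692397; schema and helper are illustrative only.
    import sqlalchemy as sa

    engine = sa.create_engine('sqlite://')
    metadata = sa.MetaData()

    services = sa.Table(
        'services', metadata,
        sa.Column('id', sa.Integer, primary_key=True),
        sa.Column('host', sa.String(255)),
        sa.Column('binary', sa.String(255)),
        sa.Column('deleted', sa.Integer))

    compute_nodes = sa.Table(
        'compute_nodes', metadata,
        sa.Column('id', sa.Integer, primary_key=True),
        sa.Column('host', sa.String(255)),
        sa.Column('vcpus', sa.Integer))

    metadata.create_all(engine)

    with engine.begin() as conn:
        # The soft-deleted row left behind by deleting the service, plus the
        # new row created when nova-compute restarted on the same host.
        conn.execute(services.insert(), [
            {'id': 1, 'host': 'host1', 'binary': 'nova-compute', 'deleted': 1},
            {'id': 2, 'host': 'host1', 'binary': 'nova-compute', 'deleted': 0},
        ])
        conn.execute(compute_nodes.insert(),
                     {'id': 1, 'host': 'host1', 'vcpus': 8})

    def vcpu_total(conn, exclude_deleted):
        # Join compute_nodes to services on host, the same shape of join the
        # statistics query uses; the deleted filter is the subject of this fix.
        j = sa.join(compute_nodes, services,
                    compute_nodes.c.host == services.c.host)
        query = (sa.select(sa.func.sum(compute_nodes.c.vcpus))
                 .select_from(j)
                 .where(services.c.binary == 'nova-compute'))
        if exclude_deleted:
            query = query.where(services.c.deleted == 0)
        return conn.execute(query).scalar()

    with engine.connect() as conn:
        print(vcpu_total(conn, exclude_deleted=False))  # 16 -- doubled
        print(vcpu_total(conn, exclude_deleted=True))   # 8  -- correct

Without the deleted filter, the single compute node matches both services
rows and its vCPUs are counted twice; filtering on deleted == 0 restores the
correct total, which is what the hunk below does for the real statistics
query.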
@@ -746,7 +746,8 @@ def compute_node_statistics(context):
                 inner_sel.c.service_id == services_tbl.c.id
             ),
             services_tbl.c.disabled == false(),
-            services_tbl.c.binary == 'nova-compute'
+            services_tbl.c.binary == 'nova-compute',
+            services_tbl.c.deleted == 0
         )
     )

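For context, the condition changed above lives inside the join between the
aliased compute_nodes select and the services table. The sketch below
paraphrases how the whole join condition reads after the change; inner_sel
and services_tbl are assumed names for the aliased select and the services
table, the host-match arm of the or_() is recalled from the surrounding Nova
code rather than shown in this hunk, and none of it is a verbatim copy of the
source.

    # Paraphrased sketch of the join condition after this change; not a
    # verbatim copy of Nova's DB API layer.
    import sqlalchemy as sa
    from sqlalchemy import sql
    from sqlalchemy.sql import false

    def join_compute_nodes_to_services(inner_sel, services_tbl):
        return sa.join(
            inner_sel, services_tbl,
            sql.and_(
                sql.or_(
                    # Match either by host (modern records) or by the legacy
                    # compute_nodes.service_id link (old-style records).
                    inner_sel.c.host == services_tbl.c.host,
                    inner_sel.c.service_id == services_tbl.c.id,
                ),
                services_tbl.c.disabled == false(),
                services_tbl.c.binary == 'nova-compute',
                # New in this change: ignore soft-deleted services rows so a
                # deleted-and-recreated service does not duplicate join rows.
                services_tbl.c.deleted == 0,
            ))

The or_() keeps the legacy service_id link working for old-style compute node
records while modern records match purely on host, which is why a stale
services row with the same host could previously join in and multiply the
totals.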
@@ -8068,6 +8068,55 @@ class ComputeNodeTestCase(test.TestCase, ModelsObjectComparatorMixin):
         for key, value in six.iteritems(data):
             self.assertEqual(value, stats.pop(key))
 
+    def test_compute_node_statistics_delete_and_recreate_service(self):
+        # Test added for bug 1692397: this verifies that a deleted service
+        # record will not be selected when calculating compute node
+        # statistics.
+
+        # Let's first assert what we expect the setup to look like.
+        self.assertEqual(1, len(db.service_get_all_by_binary(
+            self.ctxt, 'nova-compute')))
+        self.assertEqual(1, len(db.compute_node_get_all_by_host(
+            self.ctxt, 'host1')))
+        # Get the statistics for the original node/service before we delete
+        # the service.
+        original_stats = db.compute_node_statistics(self.ctxt)
+
+        # At this point we have one compute_nodes record and one services
+        # record pointing at the same host. Now we need to simulate the user
+        # deleting the service record in the API, which will only delete very
+        # old compute_nodes records where the service and compute node are
+        # linked via the compute_nodes.service_id column, which is the case
+        # in this test class; at some point we should decouple those to be more
+        # modern.
+        db.service_destroy(self.ctxt, self.service['id'])
+
+        # Now we're going to simulate that the nova-compute service was
+        # restarted, which will create a new services record with a unique
+        # uuid but it will have the same host, binary and topic values as the
+        # deleted service. The unique constraints don't fail in this case since
+        # they include the deleted column and this service and the old service
+        # have a different deleted value.
+        service2_dict = self.service_dict.copy()
+        service2_dict['uuid'] = uuidsentinel.service2_uuid
+        db.service_create(self.ctxt, service2_dict)
+
+        # Again, because of the way the setUp is done currently, the compute
+        # node was linked to the original now-deleted service, so when we
+        # deleted that service it also deleted the compute node record, so we
+        # have to simulate the ResourceTracker in the nova-compute worker
+        # re-creating the compute nodes record.
+        new_compute_node = self.compute_node_dict.copy()
+        del new_compute_node['service_id']  # make it a new-style compute node
+        new_compute_node['uuid'] = uuidsentinel.new_compute_uuid
+        db.compute_node_create(self.ctxt, new_compute_node)
+
+        # Now get the stats for all compute nodes (we just have one) and it
+        # should just be for a single service, not double, as we should ignore
+        # the (soft) deleted service.
+        stats = db.compute_node_statistics(self.ctxt)
+        self.assertDictEqual(original_stats, stats)
+
     def test_compute_node_not_found(self):
         self.assertRaises(exception.ComputeHostNotFound, db.compute_node_get,
                           self.ctxt, 100500)