7f35e4e857
In the fix for #1853840 I made a mistake, and since then we have created
the physical NIC resource providers as children of the hypervisor
resource provider instead of the agent resource provider. Here:
https://review.opendev.org/c/openstack/neutron/+/696600/3/neutron/agent/common/placement_report.py#159

This *did not* break the minimum bandwidth aware scheduling. But there
are still multiple problems:

1) If you created your physical NIC RPs before the fix for #1853840 but
   upgraded to a version that includes it, then resource syncs will
   throw an error in neutron-server at each physical NIC RP update.
   That pollutes the logs and wastes some resources, since the
   prohibited update will be retried forever.

2) If you created your physical NIC RPs after the fix for #1853840, then
   your physical NIC RPs have the wrong parent. Again, this does not
   break minimum bandwidth aware scheduling, but it may pose problems
   for later features wanting to build on the originally planned RP
   tree structure.

3) Cleanup of decommissioned RPs is a bit different than expected. This
   cleanup was always left to the admin, so it only affects a manual
   process.

The proper RP structure was and should be the following: the hypervisor
RP(s) must be the root(s). As a child of each hypervisor RP, there
should be an agent RP. The physical NIC RPs should be the children of
the agent RPs.

Unfortunately, at the moment the Placement API generically prohibits
updating the parent resource provider id in a PUT request:
https://docs.openstack.org/api-ref/placement/?expanded=update-resource-provider-detail#update-resource-provider

Therefore, without a later Placement change we cannot fix the RPs
already created with the wrong parent. However, we can fix the RPs that
will be created later. We do that here.

We also fix a bug in the unit tests that allowed the wrong parent to
pass unnoticed. In addition, we add an extra log message that directs
users who see the pollution in the logs to the proper bug report.

There may be a follow-up patch later: not all RP re-parenting operations
are problematic, so we are thinking of relaxing this blanket prohibition
in Placement. When Placement allows updates to the parent id, we can
also fix RPs already created with the wrong parent.

Change-Id: I7caa8827d22103600ca685a58294640fc831dbd9
Closes-Bug: #1921150
Co-Authored-By: "Balazs Gibizer" <balazs.gibizer@est.tech>
Related-Bug: #1853840
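The intended tree (hypervisor RP as root, agent RP as its child, physical NIC RPs as children of the agent RP) can be illustrated with a small check that flags device RPs hanging off the wrong parent. This is only a sketch and not part of the patch: the function name, the sample UUIDs and the RP names are hypothetical, and it assumes you already know which UUIDs belong to agent RPs and which to physical NIC RPs. Only the field names `uuid`, `name` and `parent_provider_uuid` follow the Placement API's resource provider representation.

```python
from typing import Dict, Iterable, List, Optional

Provider = Dict[str, Optional[str]]


def misparented_device_rps(
    providers: Iterable[Provider],
    agent_uuids: set,
    device_uuids: set,
) -> List[str]:
    """Return the names of device RPs whose parent is not an agent RP.

    Hypothetical helper: the caller supplies the sets of agent and
    physical NIC (device) RP UUIDs; nothing is queried from Placement.
    """
    by_uuid = {p["uuid"]: p for p in providers}
    wrong = []
    for uuid in device_uuids:
        parent = by_uuid[uuid]["parent_provider_uuid"]
        if parent not in agent_uuids:
            wrong.append(by_uuid[uuid]["name"])
    return sorted(wrong)


# Made-up sample data: eth0 has the expected parent (the agent RP),
# eth1 was created directly under the hypervisor RP.
providers = [
    {"uuid": "hv-1", "name": "compute0", "parent_provider_uuid": None},
    {"uuid": "ag-1", "name": "compute0:agent", "parent_provider_uuid": "hv-1"},
    {"uuid": "nic-1", "name": "compute0:agent:eth0", "parent_provider_uuid": "ag-1"},
    {"uuid": "nic-2", "name": "compute0:agent:eth1", "parent_provider_uuid": "hv-1"},
]

print(misparented_device_rps(providers, {"ag-1"}, {"nic-1", "nic-2"}))
# prints ['compute0:agent:eth1']
```

A check like this only reports the wrongly parented RPs; as noted above, correcting them is not possible through a Placement PUT request at the moment.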
18 lines
958 B
YAML
---
issues:
  - |
    When using the minimum-bandwidth QoS feature, due to bug
    https://launchpad.net/bugs/1921150 physical NIC resource providers
    were for some time created with the wrong parent (i.e. the
    hypervisor RP). This is now partially fixed: new resource
    providers are now created with the expected parent (i.e. the agent
    RP). However, Placement does not allow re-parenting an already
    existing resource provider, therefore the following Placement
    DB update may be needed after the fix for bug 1921150 is applied:
    neutron/tools/bug-1921150-re-parent-device-rps.sql
    Until all resource providers have the proper parent, neutron-server
    will retry the re-parenting update, which will be rejected every time,
    therefore expect polluted logs and some wasted load on Placement.
    However, please note that bandwidth-aware scheduling is supposed
    to work even with the wrongly parented resource providers.