Implements blueprint neutron-resource-cleanup Change-Id: I7937426a44be5f6b3d0310ca11fee4ed9debf8d9
3.8 KiB
Neutron Resource Cleanup
Include the URL of your launchpad blueprint:
https://blueprints.launchpad.net/fuel/+spec/neutron-resource-cleanup
On controllers Neutron network resources such as namespaces, interfaces, OVS ports and processes are managed in two unrelated parts of code: Neutron agents and OCF scripts including q-agent-cleanup script. The goal is to move all responsibility to a single point - Neutron.
Problem description
Right now resource cleanup is performed by external script provided by Fuel (q-agent-cleanup). The script lives in fuel-library repo and it may be changed to re-use code from Neutron. Ideally all cleanup code should belong to Neutron agents as they currently able to manage namespaces, network interfaces, OVS ports and processes.
The proposal in inspired by issues:
- Clearing up the large number of network namespaces is very time consuming. Pacemaker has very strict timeout management and if the operation doesn't fit the whole resource may turn into unmanaged state (https://bugs.launchpad.net/fuel/+bug/1436414).
- Restart of a particular agent is performed via banning and clearing the associated pacemaker resource. As result all network namespaces are dropped and then created again causing traffic interruptions (https://bugs.launchpad.net/fuel/+bug/1464817).
- Code defects in q-agent-cleanup (https://bugs.launchpad.net/fuel/+bug/1434196)
Proposed change
The proposed change is mostly related to code refactoring and optimization. It doesn't suggest changes in the architecture. OCF scripts will remain the entry-point for resource clean-up, however the code will move to Neutron.
- Step-by-step changes:
-
- Move q-agent-cleanup and the corresponding test into Neutron tree. Add shell entry point into Neutron package.
- Refactor the code by re-using existing parts from Neutron. The CLI should stay the same.
- Apply optimizations with replacing shell calls by iproute2 (native ovsdb if applicable)
Alternatives
Initially we discussed the proposal to move the whole clean-up from OCF to Neutron agents. However there are cases where resources need to be cleared by external stuff.
Data model impact
None
REST API impact
None
Upgrade impact
None
Security impact
None
Notifications impact
None
Other end user impact
None
Performance Impact
Restart of agents should become faster when using native APIs instead of shell commands.
Plugin impact
None
Other deployer impact
None
Developer impact
Patches into q-agent-cleanup will be tested with Neutron infra, thus making changes less risky.
Infrastructure impact
None
Implementation
Assignee(s)
- Primary assignee:
-
shakhat
- Other contributors:
-
skolekonov kkuznetsova
Work Items
Implementation
- Move q-agent-cleanup into Neutron tree
- Update Neutron package to install q-agent-cleanup shell utility
- Remove the old script from fuel-library tree
- Optimize resource cleanup to satisfy scale tests
Dependencies
None
Testing
Revisit disaster tests, add new if needed. The following cases should be covered: restart of agent, death of agent, banning agent on a particular node. The same scenarios should be ran manually at scale on large numbers (thousands) of networks and routers.
Documentation Impact
None
References
None