941559042f

We have had a gate bug for a long time where occasionally the scheduler service gets into a state where many requests fail in it with CellTimeout errors. Example:

  Timed out waiting for response from cell <cell uuid>

Through the use of much DNM patch debug logging in oslo.db, it was revealed that service child processes (workers) were sometimes starting off with already locked internal oslo.db locks. This is a known issue in python [1] where if a parent process forks a child process while a lock is held, the child will inherit the held lock which can never be acquired.

The python issue is not considered a bug and the recommended way to handle it is by making use of the os.register_at_fork() in the oslo.db to reinitialize its lock. The method is new in python 3.7, so as long as we still support python 3.6, we must handle the situation outside of oslo.db.

We can do this by clearing the cell cache that holds oslo.db database transaction context manager objects during service start(). This way, we get fresh oslo.db locks that are in an unlocked state when a child process begins.

We can also take this opportunity to resolve part of a TODO to clear the same cell cache during service reset() (SIGHUP) since it is another case where we intended to clear it. The rest of the TODO related to periodic clearing of the cache is removed after discussion on the review, as such clearing would be unsynchronized among multiple services and for periods of time each service might have a different view of cached cells than another.

Closes-Bug: #1844929

[1] https://bugs.python.org/issue6721

Change-Id: Id233f673a57461cc312e304873a41442d732c051

api-guide/source
api-ref/source
devstack
doc
etc/nova
gate
nova
playbooks
releasenotes
roles
tools
.coveragerc
.gitignore
.gitreview
.mailmap
.pre-commit-config.yaml
.stestr.conf
.zuul.yaml
babel.cfg
bindep.txt
CONTRIBUTING.rst
HACKING.rst
LICENSE
lower-constraints.txt
MAINTAINERS
README.rst
requirements.txt
setup.cfg
setup.py
test-requirements.txt
tox.ini

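
The fork hazard described in the commit message at the top of this page, and the cache-clearing workaround it adopts, can be illustrated with a minimal sketch. Every name here (FakeTransactionContext, CELL_CACHE, get_context, service_start, child_lock_is_free) is a hypothetical stand-in chosen for illustration and is not taken from nova or oslo.db. The sketch assumes a POSIX system where os.fork() is available.

```python
import os
import threading


class FakeTransactionContext:
    """Hypothetical stand-in for a cached object that owns an internal lock."""

    def __init__(self):
        self.lock = threading.Lock()


# Hypothetical module-level cache keyed by cell UUID (illustrative only).
CELL_CACHE = {}


def get_context(cell_uuid):
    """Return the cached context for a cell, creating it on first use."""
    if cell_uuid not in CELL_CACHE:
        CELL_CACHE[cell_uuid] = FakeTransactionContext()
    return CELL_CACHE[cell_uuid]


def service_start():
    """Drop inherited cache entries so new objects get fresh, unlocked locks."""
    CELL_CACHE.clear()


def child_lock_is_free(clear_cache):
    """Fork while the parent holds a cached lock.

    Returns True if the child process managed to acquire the lock of its
    own cached context, False if the attempt timed out.
    """
    read_fd, write_fd = os.pipe()
    with get_context("cell-1").lock:  # parent holds the lock across fork()
        pid = os.fork()
        if pid == 0:
            # Child process: it inherited a copy of the held lock.
            os.close(read_fd)
            if clear_cache:
                service_start()
            acquired = get_context("cell-1").lock.acquire(timeout=0.5)
            os.write(write_fd, b"1" if acquired else b"0")
            os._exit(0)
    # Parent process: collect the child's answer and reap it.
    os.close(write_fd)
    answer = os.read(read_fd, 1)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return answer == b"1"
```

Calling child_lock_is_free(clear_cache=False) is expected to return False, because the child's copy of the lock stays held forever, while child_lock_is_free(clear_cache=True) is expected to return True, because the child rebuilds its cache entry with a new lock. On Python 3.7 and later, a library could instead register a similar reset hook with os.register_at_fork(after_in_child=...), which is the approach the commit message describes as the recommended one.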
OpenStack Nova
OpenStack Nova provides a cloud computing fabric controller, supporting a wide variety of compute technologies, including: libvirt (KVM, Xen, LXC and more), Hyper-V, VMware, XenServer, OpenStack Ironic and PowerVM.
Use the following resources to learn more.
API
To learn how to use Nova's API, consult the documentation available online at:
For more information on OpenStack APIs, SDKs and CLIs in general, refer to:
Operators
To learn how to deploy and configure OpenStack Nova, consult the documentation available online at:
In the unfortunate event that bugs are discovered, they should be reported to the appropriate bug tracker. If you obtained the software from a third-party operating system vendor, it is often wise to use their own bug tracker for reporting problems. In all other cases, use the master OpenStack bug tracker, available at:
Developers
For information on how to contribute to Nova, see CONTRIBUTING.rst.
Any new code must follow the development guidelines detailed in the HACKING.rst file, and pass all unit tests.
Further developer-focused documentation is available at:
Other Information
During each Summit and Project Team Gathering, we agree on what the whole community wants to focus on for the upcoming release. The plans for nova can be found at: