nova/releasenotes/notes/bug-1834048-8b19ae1c5048b801.yaml
Lee Yarwood 03f7dc29b7 libvirt: Add a rbd_connect_timeout configurable
Previously, the initial call to connect to an RBD cluster via the RADOS
API could hang indefinitely if network or other environmental issues
were encountered.

When encountered during a call to update_available_resource this can
result in the local n-cpu service reporting as UP while never being able
to break out of a subsequent RPC timeout loop as documented in bug

This change adds a simple timeout configurable to be used when initially
connecting to the cluster [1][2][3]. The default timeout of 5 seconds is
small enough to ensure that, if the issue is encountered, the n-cpu
service can be marked as DOWN before an RPC timeout is seen.

[1] http://docs.ceph.com/docs/luminous/rados/api/python/#rados.Rados.connect
[2] http://docs.ceph.com/docs/mimic/rados/api/python/#rados.Rados.connect
[3] http://docs.ceph.com/docs/nautilus/rados/api/python/#rados.Rados.connect
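
The bounded connect described above can be sketched roughly as follows. This
is an illustration only, not the code from this change: FakeRados is a
hypothetical stand-in for rados.Rados so the example runs without a Ceph
cluster, connect_to_cluster is a made-up helper name, and raising the built-in
TimeoutError on failure is an assumption of the stub. The one detail taken
from the references above is that rados.Rados.connect accepts a timeout
argument.

```python
class FakeRados:
    """Hypothetical stand-in for rados.Rados (not the real binding)."""

    def __init__(self, reachable=True):
        self.reachable = reachable
        self.connected = False
        self.timeout_seen = None

    def connect(self, timeout=0):
        # Mirrors the connect(timeout=...) call shape from the linked docs.
        self.timeout_seen = timeout
        if not self.reachable:
            # Assumption: the stub signals failure with TimeoutError.
            raise TimeoutError("cluster unreachable after %ss" % timeout)
        self.connected = True

    def shutdown(self):
        self.connected = False


def connect_to_cluster(client, connect_timeout=5):
    """Connect with a bounded wait; shut the client down and re-raise on failure."""
    try:
        client.connect(timeout=connect_timeout)
    except Exception:
        client.shutdown()
        raise
    return client


if __name__ == "__main__":
    client = connect_to_cluster(FakeRados())
    print(client.connected, client.timeout_seen)
```

With a real cluster, the value passed as connect_timeout would presumably come
from the new configuration option rather than a hard-coded default.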

Closes-bug: #1834048
Change-Id: I67f341bf895d6cc5d503da274c089d443295199e
2019-07-02 19:36:16 +01:00


---
other:
  - |
    A new ``[libvirt]/rbd_connect_timeout`` configuration option has been
    introduced to limit the time spent waiting when connecting to an RBD cluster
    via the RADOS API. This timeout currently defaults to 5 seconds.

    This aims to address issues reported in `bug 1834048`_ where failures to
    initially connect to an RBD cluster left the nova-compute service inoperable
    due to constant RPC timeouts being hit.

    .. _bug 1834048: https://bugs.launchpad.net/nova/+bug/1834048