b9dc86d8d6
When configuring QEMU cache modes for Nova instances, we use
'writethrough' when 'none' is not available.  But that's not correct,
because of our misunderstanding of how cache modes work.  E.g. the
function disk_cachemode() in the libvirt driver assumes that
'writethrough' and 'none' cache modes have the same behaviour with
respect to host crash safety, which is not at all true.

The misunderstanding and complexity stems from not realizing that each
QEMU cache mode is a shorthand to toggle *three* booleans.  Refer to the
convenient cache mode table in the code comment (in
nova/virt/libvirt/driver.py).

As Kevin Wolf (thanks!), QEMU Block Layer maintainer, explains (I made
a couple of micro edits for clarity):

    The thing that makes 'writethrough' so safe against host crashes is
    that it never keeps data in a "write cache", but it calls fsync()
    after _every_ write.  This is also what makes it horribly slow.  But
    'cache=none' doesn't do this and therefore doesn't provide this kind
    of safety.  The guest OS must explicitly flush the cache in the
    right places to make sure data is safe on the disk.  And OSes do
    that.

    So if 'cache=none' is safe enough for you, then 'cache=writeback'
    should be safe enough for you, too -- because both of them have the
    boolean 'cache.writeback=on'.  The difference is only in
    'cache.direct', but 'cache.direct=on' only bypasses the host kernel
    page cache and data could still sit in other caches that could be
    present between QEMU and the disk (such as commonly a volatile write
    cache on the disk itself).

So use 'writeback' mode instead of the debilitatingly slow
'writethrough' for cases where the O_DIRECT-based 'none' is unsupported.

Do the minimum required update to the `disk_cachemodes` config help
text.  (In a future patch, rewrite the cache modes documentation to fix
confusing fragments and outdated information.)
Closes-Bug: #1818847
Change-Id: Ibe236988af24a3b43508eec4efbe52a4ed05d45f
Signed-off-by: Kashyap Chamarthy <kchamart@redhat.com>
Looks-good-to-me'd-by: Kevin Wolf <kwolf@redhat.com>
20 lines
970 B
YAML
---
fixes:
  - |
    Update the way QEMU cache mode is configured for Nova guests: If the
    file system hosting the directory with Nova instances is capable of
    Linux's O_DIRECT, use ``none``; otherwise fall back to ``writeback``
    cache mode.  This improves performance without compromising data
    integrity.  `Bug 1818847`_.

    Context: What makes ``writethrough`` so safe against host crashes is
    that it never keeps data in a "write cache", but it calls fsync()
    after *every* write.  This is also what makes it horribly slow.  But
    cache mode ``none`` doesn't do this and therefore doesn't provide
    this kind of safety.  The guest OS must explicitly flush the cache
    in the right places to make sure data is safe on the disk; and all
    modern OSes flush data as needed.  So if cache mode ``none`` is safe
    enough for you, then ``writeback`` should be safe enough too.

    .. _Bug 1818847: https://bugs.launchpad.net/nova/+bug/1818847
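
The reasoning in the commit message and release note can be illustrated with a small Python sketch. It is not Nova's actual code: the names ``CacheFlags``, ``CACHE_MODES``, ``pick_cache_mode`` and ``differing_flags`` are hypothetical, and the flag table is transcribed from the QEMU documentation of the ``cache=`` shorthand rather than from ``nova/virt/libvirt/driver.py``. It only shows that ``none`` and ``writeback`` differ in ``cache.direct`` alone, which is why ``writeback`` is a reasonable fallback when O_DIRECT is unavailable.

```python
from typing import List, NamedTuple


class CacheFlags(NamedTuple):
    """The three QEMU booleans that a cache mode name is shorthand for."""
    writeback: bool  # cache.writeback
    direct: bool     # cache.direct
    no_flush: bool   # cache.no-flush


# Assumption: table transcribed from the QEMU documentation, not from Nova.
CACHE_MODES = {
    "writeback":    CacheFlags(writeback=True,  direct=False, no_flush=False),
    "none":         CacheFlags(writeback=True,  direct=True,  no_flush=False),
    "writethrough": CacheFlags(writeback=False, direct=False, no_flush=False),
    "directsync":   CacheFlags(writeback=False, direct=True,  no_flush=False),
    "unsafe":       CacheFlags(writeback=True,  direct=False, no_flush=True),
}


def pick_cache_mode(supports_direct_io: bool) -> str:
    """Hypothetical helper mirroring the release note: prefer 'none' when
    the file system supports O_DIRECT, otherwise fall back to 'writeback'."""
    return "none" if supports_direct_io else "writeback"


def differing_flags(mode_a: str, mode_b: str) -> List[str]:
    """Return the names of the booleans on which two cache modes disagree."""
    a, b = CACHE_MODES[mode_a], CACHE_MODES[mode_b]
    return [f for f in CacheFlags._fields if getattr(a, f) != getattr(b, f)]


if __name__ == "__main__":
    print(pick_cache_mode(supports_direct_io=False))
    print(differing_flags("none", "writeback"))
    print(differing_flags("none", "writethrough"))
```

Comparing ``none`` with ``writethrough`` in the same way shows that they disagree on ``cache.writeback`` as well, which is the difference the old fallback overlooked.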