When configuring QEMU cache modes for Nova instances, we fall back to
'writethrough' when 'none' is not available. That fallback is wrong,
and it comes from a misunderstanding of how cache modes work. E.g. the
function disk_cachemode() in the libvirt driver assumes that
'writethrough' and 'none' cache modes have the same behaviour with
respect to host crash safety, which is not at all true.
The misunderstanding and complexity stem from not realizing that each
QEMU cache mode is a shorthand to toggle *three* booleans. Refer to the
convenient cache mode table in the code comment (in
nova/virt/libvirt/driver.py).
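As a rough illustration of the "three booleans" view, here is a minimal
Python sketch of the mapping as documented by QEMU (cache.writeback,
cache.direct, cache.no-flush). The names CacheFlags, CACHE_MODES and
same_flush_behaviour are made up for this example; they are not part of
Nova or QEMU.

```python
from typing import NamedTuple


class CacheFlags(NamedTuple):
    """Hypothetical container for the three booleans behind a cache mode."""
    writeback: bool  # cache.writeback: a write may complete before it is flushed
    direct: bool     # cache.direct: bypass the host page cache (O_DIRECT)
    no_flush: bool   # cache.no-flush: ignore flush requests from the guest


# Mapping of each QEMU cache mode to its three booleans.
CACHE_MODES = {
    "writeback":    CacheFlags(writeback=True,  direct=False, no_flush=False),
    "none":         CacheFlags(writeback=True,  direct=True,  no_flush=False),
    "writethrough": CacheFlags(writeback=False, direct=False, no_flush=False),
    "directsync":   CacheFlags(writeback=False, direct=True,  no_flush=False),
    "unsafe":       CacheFlags(writeback=True,  direct=False, no_flush=True),
}


def same_flush_behaviour(mode_a: str, mode_b: str) -> bool:
    """Return True if two modes differ at most in cache.direct."""
    a, b = CACHE_MODES[mode_a], CACHE_MODES[mode_b]
    return (a.writeback, a.no_flush) == (b.writeback, b.no_flush)
```

With this table, same_flush_behaviour("none", "writeback") is True, while
same_flush_behaviour("none", "writethrough") is False, which is the
mismatch this change addresses.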
As Kevin Wolf (thanks!), QEMU Block Layer maintainer, explains (I made
a couple of micro edits for clarity):

    The thing that makes 'writethrough' so safe against host crashes is
    that it never keeps data in a "write cache", but it calls fsync()
    after _every_ write. This is also what makes it horribly slow. But
    'cache=none' doesn't do this and therefore doesn't provide this kind
    of safety. The guest OS must explicitly flush the cache in the
    right places to make sure data is safe on the disk. And OSes do
    that.

    So if 'cache=none' is safe enough for you, then 'cache=writeback'
    should be safe enough for you, too -- because both of them have the
    boolean 'cache.writeback=on'. The difference is only in
    'cache.direct', but 'cache.direct=on' only bypasses the host kernel
    page cache and data could still sit in other caches that could be
    present between QEMU and the disk (such as commonly a volatile write
    cache on the disk itself).

So use 'writeback' mode instead of the debilitatingly slow
'writethrough' for cases where the O_DIRECT-based 'none' is unsupported.
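The resulting fallback rule can be sketched as below. This is only an
illustration of the decision, not the actual driver code; the function
name pick_default_cache_mode and its supports_direct_io parameter are
assumptions made for this example.

```python
def pick_default_cache_mode(supports_direct_io: bool) -> str:
    """Choose a default QEMU cache mode for a disk (illustrative only).

    supports_direct_io is assumed to say whether the backing storage
    accepts O_DIRECT, which cache mode 'none' depends on.
    """
    if supports_direct_io:
        # cache.writeback=on, cache.direct=on
        return "none"
    # Same cache.writeback=on semantics as 'none', minus O_DIRECT;
    # previously this fallback was the much slower 'writethrough'.
    return "writeback"
```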
Do the minimum required update to the `disk_cachemodes` config help
text. (In a future patch, rewrite the cache modes documentation to fix
confusing fragments and outdated information.)
Closes-Bug: #1818847
Change-Id: Ibe236988af24a3b43508eec4efbe52a4ed05d45f
Signed-off-by: Kashyap Chamarthy <kchamart@redhat.com>
Looks-good-to-me'd-by: Kevin Wolf <kwolf@redhat.com>