From ee8118423b88f61ed58b0420356ceaed2fa53a40 Mon Sep 17 00:00:00 2001
From: Julia Kreger
Date: Wed, 26 May 2021 11:55:44 -0700
Subject: [PATCH] Limit qemu-img execution arenas

qemu-img launches multiple threads by default *and* tries to allocate
from multiple memory arenas. While multithreading can be good for
performance, this pattern, combined with the memory footprint of
process launch and dependencies, can turn a cirros image conversion
(16MB) into a request for 1.2GB of memory by the qemu-img tool.

The default maximum number of arenas is eight times the number of
CPUs. To limit this impact, it seems reasonable to lower this to a
smaller fixed number, which also helps keep qemu-img within our
memory limit.

NOTE: This change differs substantially from the original change
because an intermediate change converted write_image.sh to Python.
As we are unlikely to backport the intermediate change, it is logical
to just modify the original script. Otherwise, the release note is
ultimately what is backported, for release note tooling continuity.

Change-Id: I71a28ec59ec31c691205eb34d9fcab63a2ccb682
Story: 2008928
Task: 42528
(cherry picked from commit 9e4c7052a2fd9aac03858db696bf1ea9487f15e6)
(cherry picked from commit 9c20cca36284a2a17aa535bef92c828096c7d926)
---
 ironic_python_agent/shell/write_image.sh                 | 9 +++++++++
 .../limit-qemu-img-malloc-arena-025ed84115481eae.yaml    | 7 +++++++
 2 files changed, 16 insertions(+)
 create mode 100644 releasenotes/notes/limit-qemu-img-malloc-arena-025ed84115481eae.yaml

diff --git a/ironic_python_agent/shell/write_image.sh b/ironic_python_agent/shell/write_image.sh
index bcd6bfd37..16fcb9038 100755
--- a/ironic_python_agent/shell/write_image.sh
+++ b/ironic_python_agent/shell/write_image.sh
@@ -49,6 +49,15 @@ log "Imaging $IMAGEFILE to $DEVICE"
 
 # limit the memory usage for qemu-img to 2 GiB
 ulimit -v 2097152
+# NOTE(TheJulia): qemu-img uses multiple threads by default and in
+# cross-thread memory allocation lock conflicts, glibc will ultimately
+# attempt to provide it with an additional arena to allocate from, however
+# the running default, when not overridden is 8 * ncpu * the footprint, which
+# very quickly exceeds the ulimit. This is most observable on CI systems where
+# cross-vcpu thread locking can result in a conflict that wouldn't normally be
+# as likely on physical hardware.
+# See discussion on https://bugzilla.redhat.com/show_bug.cgi?id=1892773
+export MALLOC_ARENA_MAX=3
 qemu-img convert -t directsync -O host_device $IMAGEFILE $DEVICE
 sync
 
diff --git a/releasenotes/notes/limit-qemu-img-malloc-arena-025ed84115481eae.yaml b/releasenotes/notes/limit-qemu-img-malloc-arena-025ed84115481eae.yaml
new file mode 100644
index 000000000..38ddf24ea
--- /dev/null
+++ b/releasenotes/notes/limit-qemu-img-malloc-arena-025ed84115481eae.yaml
@@ -0,0 +1,7 @@
+---
+fixes:
+  - |
+    Fixes failures with disk image conversions which result in memory
+    allocation or input/output errors due to memory limitations by limiting
+    the number of available memory allocation pools to a non-dynamic
+    reasonable number which should not exceed the available system memory.
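
The arithmetic behind the arena limit can be sketched roughly as follows. This is not part of the patch. It assumes a 64-bit glibc, where the default arena cap is 8 * ncpu and each non-main arena can reserve up to 64 MiB of address space; the helper name arena_reserve_kib and the constants are illustrative only. The result is a rough upper bound on address space reserved, not a measurement of what qemu-img actually uses.

```shell
#!/bin/sh
# Illustrative sketch: compare the address space glibc arenas could reserve
# against the 2 GiB `ulimit -v` set in write_image.sh.

# Assumption: 64-bit glibc, up to 64 MiB (65536 KiB) reserved per non-main arena.
ARENA_RESERVE_KIB=65536
# Same value as `ulimit -v 2097152` in the patched script (units are KiB).
ULIMIT_KIB=2097152

# arena_reserve_kib ARENAS: rough upper bound, in KiB, for ARENAS arenas.
# Hypothetical helper for this example only.
arena_reserve_kib() {
    echo $(( $1 * ARENA_RESERVE_KIB ))
}

# Number of online CPUs, and the default arena cap derived from it.
NCPU=$(getconf _NPROCESSORS_ONLN)
DEFAULT_ARENAS=$(( 8 * NCPU ))

echo "default cap: $DEFAULT_ARENAS arenas, up to $(arena_reserve_kib "$DEFAULT_ARENAS") KiB"
echo "MALLOC_ARENA_MAX=3: up to $(arena_reserve_kib 3) KiB (limit: $ULIMIT_KIB KiB)"
```

Under these assumptions, 32 arenas (the default cap on a 4-CPU machine) would already match the 2 GiB limit by themselves, while three arenas stay under 200 MiB.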