nova/releasenotes/notes/disable-live-migration-with-numa-bc710a1bcde25957.yaml
Stephen Finucane ae2e5650d1 Fail live migration if instance has a NUMA topology
Live migration is currently totally broken if a NUMA topology is
present. This affects everything that's been regrettably stuffed in with
NUMA topology including CPU pinning, hugepage support and emulator
thread support. Side effects can range from simple unexpected
performance hits (due to instances running on the same cores) to
complete failures (due to instance cores or huge pages being mapped to
CPUs/NUMA nodes that don't exist on the destination host).

Until we resolve these issues, we should alert users to the fact that
they exist. A workaround option is provided for operators that _really_
need the broken behavior, but it defaults to False to highlight the
brokenness of this feature to unsuspecting operators.

Change-Id: I217fba9138132b107e9d62895d699d238392e761
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
Related-bug: #1289064
2018-12-14 14:08:35 -05:00

---
upgrade:
  - |
    Live migration of instances with NUMA topologies is now disabled by default
    when using the libvirt driver. This includes live migration of instances
    with CPU pinning or hugepages. CPU pinning and hugepage information for
    such instances is not currently recalculated, as noted in `bug #1289064`_.
    This means that if instances were already present on the destination host,
    the migrated instance could be placed on the same dedicated cores as these
    instances or use hugepages allocated for another instance. Alternatively, if
    the host platforms were not homogeneous, the instance could be assigned to
    non-existent cores or be inadvertently split across host NUMA nodes.

    The `long term solution`_ to these issues is to recalculate the XML on the
    destination node. When this work is completed, the restriction on live
    migration with NUMA topologies will be lifted.

    For operators that are aware of the issues and are able to manually work
    around them, the ``[workarounds] enable_numa_live_migration`` option can
    be used to allow the broken behavior.

    For more information, refer to `bug #1289064`_.

    .. _bug #1289064: https://bugs.launchpad.net/nova/+bug/1289064
    .. _long term solution: https://blueprints.launchpad.net/nova/+spec/numa-aware-live-migration
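
    As a minimal sketch of the workaround described above, the option could be
    set in ``nova.conf`` as follows. This assumes the file is the one read by
    the service that performs the live migration check; which services that
    is, and whether they need a restart, depends on the deployment and is not
    covered here.

    .. code-block:: ini

        [workarounds]
        # Assumption: opting back in to the known-broken behavior. Only do
        # this if the pinning and hugepage conflicts described above are
        # handled manually.
        enable_numa_live_migration = True

    Leaving the option unset keeps the default of ``False``, so live migration
    of instances with a NUMA topology stays disabled.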