libvirt: Allow to specify granular CPU feature flags
The recent "Meltdown" CVE fixes have resulted in a critical performance penalty[*] that will impact every Nova guest with certain CPU models. I.e. assume you have applied all the "Meltdown" CVE fixes, and performed a cold reboot (explicit stop & start) of all Nova guests, for the updates to take effect. Now, if any guests that are booted with certain named virtual CPU models (e.g. "IvyBridge", "Westmere", etc), then those guests, will incur noticeable performance degradation[*], while being protected from the CVE itself. To alleviate this guest performance impact, it is now important to specify an obscure Intel CPU feature flag, 'PCID' (Process-Context ID) -- for the virtual CPU models that don't already include it (more on this below). To that end, this change will allow Nova to explicitly specify CPU feature flags via a new configuration attribute, `cpu_model_extra_flags`, e.g. in `nova.conf`: ... [libvirt] cpu_mode = custom cpu_model = IvyBridge cpu_model_extra_flags = pcid ... NB: In the first iteration, the choices for `cpu_model_extra_flags` is restricted to only 'pcid' (the option is case-insensitive) -- to address the earlier mentioned guest performance degradation. A future patch will remove this restriction, allowing to add / remove multiple CPU feature flags, thus making way for other useful features. Some have asked: "Why not simply hardcode the 'PCID' CPU feature flag into Nova?" That's not graceful, and more importantly, impractical: (1) Not every Intel CPU model has 'PCID': - The only Intel CPU models that include the 'PCID' capability are: "Haswell", "Broadwell", and "Skylake" variants. - The libvirt / QEMU Intel CPU models: "Nehalem", "Westmere", "SandyBridge", and "IvyBridge" will *not* expose the 'PCID' capability, even if the host CPUs by the same name include it. I.e. 'PCID' needs to be explicitly when using the said virtual CPU models. (2) Magically adding new CPU feature flags under the user's feet impacts live migration. [*] https://groups.google.com/forum/m/#!topic/mechanical-sympathy/L9mHTbeQLNU Closes-Bug: #1750829 Change-Id: I6bb956808aa3df58747c865c92e5b276e61aff44 BluePrint: libvirt-cpu-model-extra-flags
This commit is contained in:
parent
315a4d63c2
commit
6b601b7cf6
@ -23,6 +23,8 @@ import itertools
|
||||
|
||||
from oslo_config import cfg
|
||||
|
||||
from oslo_config import types
|
||||
|
||||
from nova.conf import paths
|
||||
|
||||
|
||||
@ -524,6 +526,58 @@ configure (via ``cpu_model``) a specific named CPU model. Otherwise, it
|
||||
would result in an error and the instance launch will fail.
|
||||
|
||||
* ``virt_type``: Only the virtualization types ``kvm`` and ``qemu`` use this.
|
||||
"""),
|
||||
cfg.ListOpt(
|
||||
'cpu_model_extra_flags',
|
||||
item_type=types.String(
|
||||
choices=['pcid'],
|
||||
ignore_case=True,
|
||||
),
|
||||
default=[],
|
||||
help="""
|
||||
This allows specifying granular CPU feature flags when specifying CPU
|
||||
models. For example, to explicitly specify the ``pcid``
|
||||
(Process-Context ID, an Intel processor feature) flag to the "IvyBridge"
|
||||
virtual CPU model::
|
||||
|
||||
[libvirt]
|
||||
cpu_mode = custom
|
||||
cpu_model = IvyBridge
|
||||
cpu_model_extra_flags = pcid
|
||||
|
||||
Currently, the choice is restricted to only one option: ``pcid`` (the
|
||||
option is case-insensitive, so ``PCID`` is also valid). This flag is
|
||||
now required to address the guest performance degradation as a result of
|
||||
applying the "Meltdown" CVE fixes on certain Intel CPU models.
|
||||
|
||||
Note that when using this config attribute to set the 'PCID' CPU flag,
|
||||
not all virtual (i.e. libvirt / QEMU) CPU models need it:
|
||||
|
||||
* The only virtual CPU models that include the 'PCID' capability are
|
||||
Intel "Haswell", "Broadwell", and "Skylake" variants.
|
||||
|
||||
* The libvirt / QEMU CPU models "Nehalem", "Westmere", "SandyBridge",
|
||||
and "IvyBridge" will _not_ expose the 'PCID' capability by default,
|
||||
even if the host CPUs by the same name include it. I.e. 'PCID' needs
|
||||
to be explicitly specified when using the said virtual CPU models.
|
||||
|
||||
For now, the ``cpu_model_extra_flags`` config attribute is valid only in
|
||||
combination with ``cpu_mode`` + ``cpu_model`` options.
|
||||
|
||||
Besides ``custom``, the libvirt driver has two other CPU modes: The
|
||||
default, ``host-model``, tells it to do the right thing with respect to
|
||||
handling 'PCID' CPU flag for the guest -- *assuming* you are running
|
||||
updated processor microcode, host and guest kernel, libvirt, and QEMU.
|
||||
The other mode, ``host-passthrough``, checks if 'PCID' is available in
|
||||
the hardware, and if so directly passes it through to the Nova guests.
|
||||
Thus, in context of 'PCID', with either of these CPU modes
|
||||
(``host-model`` or ``host-passthrough``), there is no need to use the
|
||||
``cpu_model_extra_flags``.
|
||||
|
||||
Related options:
|
||||
|
||||
* cpu_mode
|
||||
* cpu_model
|
||||
"""),
|
||||
cfg.StrOpt('snapshots_directory',
|
||||
default='$instances_path/snapshots',
|
||||
|
@ -270,6 +270,15 @@ class LibvirtConfigGuestCPUFeatureTest(LibvirtConfigBaseTest):
|
||||
<feature name="mtrr" policy="force"/>
|
||||
""")
|
||||
|
||||
def test_config_simple_pcid(self):
|
||||
obj = config.LibvirtConfigGuestCPUFeature("pcid")
|
||||
obj.policy = "require"
|
||||
|
||||
xml = obj.to_xml()
|
||||
self.assertXmlEqual(xml, """
|
||||
<feature name="pcid" policy="require"/>
|
||||
""")
|
||||
|
||||
|
||||
class LibvirtConfigGuestCPUNUMATest(LibvirtConfigBaseTest):
|
||||
|
||||
|
@ -6108,6 +6108,83 @@ class LibvirtConnTestCase(test.NoDBTestCase,
|
||||
self.assertEqual(conf.cpu.cores, 1)
|
||||
self.assertEqual(conf.cpu.threads, 1)
|
||||
|
||||
@mock.patch.object(libvirt_driver.LOG, 'warning')
|
||||
def test_get_guest_cpu_config_custom_with_extra_flags(self,
|
||||
mock_warn):
|
||||
drvr = libvirt_driver.LibvirtDriver(fake.FakeVirtAPI(), True)
|
||||
instance_ref = objects.Instance(**self.test_instance)
|
||||
image_meta = objects.ImageMeta.from_dict(self.test_image_meta)
|
||||
|
||||
self.flags(cpu_mode="custom",
|
||||
cpu_model="IvyBridge",
|
||||
cpu_model_extra_flags="pcid",
|
||||
group='libvirt')
|
||||
disk_info = blockinfo.get_disk_info(CONF.libvirt.virt_type,
|
||||
instance_ref,
|
||||
image_meta)
|
||||
conf = drvr._get_guest_config(instance_ref,
|
||||
_fake_network_info(self, 1),
|
||||
image_meta, disk_info)
|
||||
self.assertIsInstance(conf.cpu,
|
||||
vconfig.LibvirtConfigGuestCPU)
|
||||
self.assertEqual(conf.cpu.mode, "custom")
|
||||
self.assertEqual(conf.cpu.model, "IvyBridge")
|
||||
self.assertIn(conf.cpu.features.pop().name, "pcid")
|
||||
self.assertEqual(conf.cpu.sockets, instance_ref.flavor.vcpus)
|
||||
self.assertEqual(conf.cpu.cores, 1)
|
||||
self.assertEqual(conf.cpu.threads, 1)
|
||||
self.assertFalse(mock_warn.called)
|
||||
|
||||
@mock.patch.object(libvirt_driver.LOG, 'warning')
|
||||
def test_get_guest_cpu_config_host_model_with_extra_flags(self,
|
||||
mock_warn):
|
||||
drvr = libvirt_driver.LibvirtDriver(fake.FakeVirtAPI(), True)
|
||||
instance_ref = objects.Instance(**self.test_instance)
|
||||
image_meta = objects.ImageMeta.from_dict(self.test_image_meta)
|
||||
|
||||
self.flags(cpu_mode="host-model",
|
||||
cpu_model_extra_flags="pcid",
|
||||
group='libvirt')
|
||||
disk_info = blockinfo.get_disk_info(CONF.libvirt.virt_type,
|
||||
instance_ref,
|
||||
image_meta)
|
||||
conf = drvr._get_guest_config(instance_ref,
|
||||
_fake_network_info(self, 1),
|
||||
image_meta, disk_info)
|
||||
self.assertIsInstance(conf.cpu,
|
||||
vconfig.LibvirtConfigGuestCPU)
|
||||
self.assertEqual(conf.cpu.mode, "host-model")
|
||||
self.assertEqual(len(conf.cpu.features), 0)
|
||||
self.assertEqual(conf.cpu.sockets, instance_ref.flavor.vcpus)
|
||||
self.assertEqual(conf.cpu.cores, 1)
|
||||
self.assertEqual(conf.cpu.threads, 1)
|
||||
self.assertTrue(mock_warn.called)
|
||||
|
||||
@mock.patch.object(libvirt_driver.LOG, 'warning')
|
||||
def test_get_guest_cpu_config_host_passthrough_with_extra_flags(self,
|
||||
mock_warn):
|
||||
drvr = libvirt_driver.LibvirtDriver(fake.FakeVirtAPI(), True)
|
||||
instance_ref = objects.Instance(**self.test_instance)
|
||||
image_meta = objects.ImageMeta.from_dict(self.test_image_meta)
|
||||
|
||||
self.flags(cpu_mode="host-passthrough",
|
||||
cpu_model_extra_flags="pcid",
|
||||
group='libvirt')
|
||||
disk_info = blockinfo.get_disk_info(CONF.libvirt.virt_type,
|
||||
instance_ref,
|
||||
image_meta)
|
||||
conf = drvr._get_guest_config(instance_ref,
|
||||
_fake_network_info(self, 1),
|
||||
image_meta, disk_info)
|
||||
self.assertIsInstance(conf.cpu,
|
||||
vconfig.LibvirtConfigGuestCPU)
|
||||
self.assertEqual(conf.cpu.mode, "host-passthrough")
|
||||
self.assertEqual(len(conf.cpu.features), 0)
|
||||
self.assertEqual(conf.cpu.sockets, instance_ref.flavor.vcpus)
|
||||
self.assertEqual(conf.cpu.cores, 1)
|
||||
self.assertEqual(conf.cpu.threads, 1)
|
||||
self.assertTrue(mock_warn.called)
|
||||
|
||||
def test_get_guest_cpu_topology(self):
|
||||
instance_ref = objects.Instance(**self.test_instance)
|
||||
instance_ref.flavor.vcpus = 8
|
||||
|
@ -3817,6 +3817,7 @@ class LibvirtDriver(driver.ComputeDriver):
|
||||
def _get_guest_cpu_model_config(self):
|
||||
mode = CONF.libvirt.cpu_mode
|
||||
model = CONF.libvirt.cpu_model
|
||||
extra_flags = CONF.libvirt.cpu_model_extra_flags
|
||||
|
||||
if (CONF.libvirt.virt_type == "kvm" or
|
||||
CONF.libvirt.virt_type == "qemu"):
|
||||
@ -3859,14 +3860,49 @@ class LibvirtDriver(driver.ComputeDriver):
|
||||
msg = _("A CPU model name should not be set when a "
|
||||
"host CPU model is requested")
|
||||
raise exception.Invalid(msg)
|
||||
# FIXME (kchamart): We're intentionally restricting the choices
|
||||
# (in the conf/libvirt.py) for 'extra_flags` to just 'PCID', to
|
||||
# address the immediate guest performance degradation caused by
|
||||
# "Meltdown" CVE fixes on certain Intel CPU models. In a future
|
||||
# patch, we will:
|
||||
# (a) Remove the restriction of choices for 'extra_flags',
|
||||
# allowing to add / remove additional CPU flags, as it will
|
||||
# make way for other useful features.
|
||||
# (b) Remove the below check for "host-model", as it is a
|
||||
# valid configuration to supply additional CPU flags to it.
|
||||
# (c) Revisit and fix the warnings / exception handling for
|
||||
# different combinations of CPU modes and 'extra_flags'.
|
||||
elif ((mode == "host-model" or mode == "host-passthrough") and
|
||||
extra_flags):
|
||||
extra_flags = []
|
||||
LOG.warning("Setting extra CPU flags is only valid in "
|
||||
"combination with a custom CPU model. Refer "
|
||||
"to the 'nova.conf' documentation for "
|
||||
"'[libvirt]/cpu_model_extra_flags'")
|
||||
|
||||
LOG.debug("CPU mode '%(mode)s' model '%(model)s' was chosen",
|
||||
{'mode': mode, 'model': (model or "")})
|
||||
LOG.debug("CPU mode '%(mode)s' model '%(model)s' was chosen, "
|
||||
"with extra flags: '%(extra_flags)s'",
|
||||
{'mode': mode,
|
||||
'model': (model or ""),
|
||||
'extra_flags': (extra_flags or "")})
|
||||
|
||||
cpu = vconfig.LibvirtConfigGuestCPU()
|
||||
cpu.mode = mode
|
||||
cpu.model = model
|
||||
|
||||
# NOTE (kchamart): Currently there's no existing way to ask if a
|
||||
# given CPU model + CPU flags combination is supported by KVM &
|
||||
# a specific QEMU binary. However, libvirt runs the 'CPUID'
|
||||
# command upfront -- before even a Nova instance (a QEMU
|
||||
# process) is launched -- to construct CPU models and check
|
||||
# their validity; so we are good there. In the long-term,
|
||||
# upstream libvirt intends to add an additional new API that can
|
||||
# do fine-grained validation of a certain CPU model + CPU flags
|
||||
# against a specific QEMU binary (the libvirt RFE bug for that:
|
||||
# https://bugzilla.redhat.com/show_bug.cgi?id=1559832).
|
||||
for flag in extra_flags:
|
||||
cpu.add_feature(vconfig.LibvirtConfigGuestCPUFeature(flag))
|
||||
|
||||
return cpu
|
||||
|
||||
def _get_guest_cpu_config(self, flavor, image_meta,
|
||||
|
@ -0,0 +1,21 @@
|
||||
---
|
||||
features:
|
||||
- |
|
||||
The libvirt driver now allows specifying individual CPU feature
|
||||
flags for guests, via a new configuration attribute
|
||||
``[libvirt]/cpu_model_extra_flags`` -- only with ``custom`` as the
|
||||
``[libvirt]/cpu_model``. Refer to its documentation in
|
||||
``nova.conf`` for usage details.
|
||||
|
||||
One of the motivations for this is to alleviate the performance
|
||||
degradation (caused as a result of applying the "Meltdown" CVE
|
||||
fixes) for guests running with certain Intel-based virtual CPU
|
||||
models. This guest performance impact is reduced by exposing the
|
||||
CPU feature flag 'PCID' ("Process-Context ID") to the *guest* CPU,
|
||||
assuming that it is available in the physical hardware itself.
|
||||
|
||||
Note that besides ``custom``, Nova's libvirt driver has two other
|
||||
CPU modes: ``host-model`` (which is the default), and
|
||||
``host-passthrough``. Refer to the
|
||||
``[libvirt]/cpu_model_extra_flags`` documentation for what to do
|
||||
when you are using either of those CPU modes in context of 'PCID'.
|
Loading…
Reference in New Issue
Block a user