Merge "Detect maximum number of SEV guests automatically"

Zuul 2023-11-23 18:13:25 +00:00 committed by Gerrit Code Review
commit 1738b52c30
9 changed files with 140 additions and 63 deletions

View File

@ -91,32 +91,31 @@ steps:
needs to track how many slots are available and used in order to needs to track how many slots are available and used in order to
avoid attempting to exceed that limit in the hardware. avoid attempting to exceed that limit in the hardware.
At the time of writing (September 2019), work is in progress to Since version 8.0.0, libvirt exposes the maximum number of SEV guests
allow QEMU and libvirt to expose the number of slots available on which can run concurrently in its host, so the limit is automatically
SEV hardware; however until this is finished and released, it will detected using this feature.
not be possible for Nova to programmatically detect the correct
value.
So this configuration option serves as a stop-gap, allowing the However in case an older version of libvirt is used, it is not possible for
cloud operator the option of providing this value manually. It may Nova to programmatically detect the correct value and Nova imposes no limit.
later be demoted to a fallback value for cases where the limit So this configuration option serves as a stop-gap, allowing the cloud
cannot be detected programmatically, or even removed altogether when operator the option of providing this value manually.
Nova's minimum QEMU version guarantees that it can always be
detected. This option also allows the cloud operator to set the limit lower than
the actual hard limit.
.. note:: .. note::
When deciding whether to use the default of ``None`` or manually If libvirt older than 8.0.0 is used, operators should carefully weigh
impose a limit, operators should carefully weigh the benefits the benefits vs. the risk when deciding whether to use the default of
vs. the risk. The benefits of using the default are a) immediate ``None`` or manually impose a limit.
convenience since nothing needs to be done now, and b) convenience The benefits of using the default are a) immediate convenience since
later when upgrading compute hosts to future versions of Nova, nothing needs to be done now, and b) convenience later when upgrading
since again nothing will need to be done for the correct limit to compute hosts to future versions of libvirt, since again nothing will
be automatically imposed. However the risk is that until need to be done for the correct limit to be automatically imposed.
auto-detection is implemented, users may be able to attempt to However the risk is that until auto-detection is implemented, users may
launch guests with encrypted memory on hosts which have already be able to attempt to launch guests with encrypted memory on hosts which
reached the maximum number of guests simultaneously running with have already reached the maximum number of guests simultaneously running
encrypted memory. This risk may be mitigated by other limitations with encrypted memory. This risk may be mitigated by other limitations
which operators can impose, for example if the smallest RAM which operators can impose, for example if the smallest RAM
footprint of any flavor imposes a maximum number of simultaneously footprint of any flavor imposes a maximum number of simultaneously
running guests which is less than or equal to the SEV limit. running guests which is less than or equal to the SEV limit.
@ -221,16 +220,6 @@ features:
include using ``hw_disk_bus=scsi`` with include using ``hw_disk_bus=scsi`` with
``hw_scsi_model=virtio-scsi`` , or ``hw_disk_bus=sata``. ``hw_scsi_model=virtio-scsi`` , or ``hw_disk_bus=sata``.
- QEMU and libvirt cannot yet expose the number of slots available for
encrypted guests in the memory controller on SEV hardware. Until
this is implemented, it is not possible for Nova to programmatically
detect the correct value. As a short-term workaround, operators can
optionally manually specify the upper limit of SEV guests for each
compute host, via the new
:oslo.config:option:`libvirt.num_memory_encrypted_guests`
configuration option :ref:`described above
<num_memory_encrypted_guests>`.
Permanent limitations Permanent limitations
~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~

View File

@ -854,14 +854,15 @@ The option may be reused for other equivalent technologies in the
future. If the machine does not support memory encryption, the option future. If the machine does not support memory encryption, the option
will be ignored and inventory will be set to 0. will be ignored and inventory will be set to 0.
If the machine does support memory encryption, *for now* a value of If the machine does support memory encryption and this option is not set,
``None`` means an effectively unlimited inventory, i.e. no limit will the driver detects the maximum number of SEV guests from the libvirt API which
be imposed by Nova on the number of SEV guests which can be launched, is available since v8.0.0. Setting this option overrides the detected limit,
even though the underlying hardware will enforce its own limit. unless the given value is larger than the detected limit.
However it is expected that in the future, auto-detection of the
inventory from the hardware will become possible, at which point On the other hand, if an older version of libvirt is used, ``None`` means
``None`` will cause auto-detection to automatically impose the correct an effectively unlimited inventory, i.e. no limit will be imposed by Nova
limit. on the number of SEV guests which can be launched, even though the underlying
hardware will enforce its own limit.
.. note:: .. note::

View File

@ -1990,6 +1990,16 @@ class Connection(object):
_domain_capability_features_with_SEV_unsupported = \ _domain_capability_features_with_SEV_unsupported = \
_domain_capability_features_with_SEV.replace('yes', 'no') _domain_capability_features_with_SEV.replace('yes', 'no')
_domain_capability_features_with_SEV_max_guests = ''' <features>
<gic supported='no'/>
<sev supported='yes'>
<cbitpos>47</cbitpos>
<reducedPhysBits>1</reducedPhysBits>
<maxGuests>100</maxGuests>
<maxESGuests>15</maxESGuests>
</sev>
</features>'''
def getCapabilities(self): def getCapabilities(self):
"""Return spoofed capabilities.""" """Return spoofed capabilities."""
numa_topology = self.host_info.numa_topology numa_topology = self.host_info.numa_topology

View File

@ -29368,7 +29368,7 @@ class TestLibvirtSEVUnsupported(TestLibvirtSEV):
@mock.patch.object(vc, '_domain_capability_features', @mock.patch.object(vc, '_domain_capability_features',
new=vc._domain_capability_features_with_SEV) new=vc._domain_capability_features_with_SEV)
class TestLibvirtSEVSupported(TestLibvirtSEV): class TestLibvirtSEVSupportedNoMaxGuests(TestLibvirtSEV):
"""Libvirt driver tests for when AMD SEV support is present.""" """Libvirt driver tests for when AMD SEV support is present."""
@test.patch_exists(SEV_KERNEL_PARAM_FILE, True) @test.patch_exists(SEV_KERNEL_PARAM_FILE, True)
@test.patch_open(SEV_KERNEL_PARAM_FILE, "1\n") @test.patch_open(SEV_KERNEL_PARAM_FILE, "1\n")
@ -29389,6 +29389,36 @@ class TestLibvirtSEVSupported(TestLibvirtSEV):
self.assertEqual(0, self.driver._get_memory_encrypted_slots()) self.assertEqual(0, self.driver._get_memory_encrypted_slots())
@mock.patch.object(vc, '_domain_capability_features',
new=vc._domain_capability_features_with_SEV_max_guests)
class TestLibvirtSEVSupportedMaxGuests(TestLibvirtSEV):
"""Libvirt driver tests for when AMD SEV support is present."""
@test.patch_exists(SEV_KERNEL_PARAM_FILE, True)
@test.patch_open(SEV_KERNEL_PARAM_FILE, "1\n")
@mock.patch.object(libvirt_driver.LOG, 'warning')
def test_get_mem_encrypted_slots_no_override(self, mock_log):
self.assertEqual(100, self.driver._get_memory_encrypted_slots())
mock_log.assert_not_called()
@test.patch_exists(SEV_KERNEL_PARAM_FILE, True)
@test.patch_open(SEV_KERNEL_PARAM_FILE, "1\n")
@mock.patch.object(libvirt_driver.LOG, 'warning')
def test_get_mem_encrypted_slots_override_more(self, mock_log):
self.flags(num_memory_encrypted_guests=120, group='libvirt')
self.assertEqual(100, self.driver._get_memory_encrypted_slots())
mock_log.assert_called_with(
'Host is configured with libvirt.num_memory_encrypted_guests '
'set to %d, but supports only %d.', 120, 100)
@test.patch_exists(SEV_KERNEL_PARAM_FILE, True)
@test.patch_open(SEV_KERNEL_PARAM_FILE, "1\n")
@mock.patch.object(libvirt_driver.LOG, 'warning')
def test_get_mem_encrypted_slots_override_less(self, mock_log):
self.flags(num_memory_encrypted_guests=80, group='libvirt')
self.assertEqual(80, self.driver._get_memory_encrypted_slots())
mock_log.assert_not_called()
class LibvirtPMEMNamespaceTests(test.NoDBTestCase): class LibvirtPMEMNamespaceTests(test.NoDBTestCase):
def setUp(self): def setUp(self):

View File

@ -890,6 +890,8 @@ class HostTestCase(test.NoDBTestCase):
if supported: if supported:
self.assertEqual(47, sev.cbitpos) self.assertEqual(47, sev.cbitpos)
self.assertEqual(1, sev.reduced_phys_bits) self.assertEqual(1, sev.reduced_phys_bits)
self.assertIsNone(sev.max_guests)
self.assertIsNone(sev.max_es_guests)
@mock.patch.object( @mock.patch.object(
fakelibvirt.virConnect, '_domain_capability_features', new= fakelibvirt.virConnect, '_domain_capability_features', new=
@ -904,6 +906,22 @@ class HostTestCase(test.NoDBTestCase):
def test_get_domain_capabilities_sev_supported(self): def test_get_domain_capabilities_sev_supported(self):
self._test_get_domain_capabilities_sev(True) self._test_get_domain_capabilities_sev(True)
@mock.patch.object(
fakelibvirt.virConnect, '_domain_capability_features', new=
fakelibvirt.virConnect._domain_capability_features_with_SEV_max_guests)
def test_get_domain_capabilities_sev_max_guests(self):
caps = self._test_get_domain_capabilities()
self.assertEqual(vconfig.LibvirtConfigDomainCaps, type(caps))
features = caps.features
self.assertEqual(1, len(features))
sev = features[0]
self.assertEqual(vconfig.LibvirtConfigDomainCapsFeatureSev, type(sev))
self.assertTrue(sev.supported)
self.assertEqual(47, sev.cbitpos)
self.assertEqual(1, sev.reduced_phys_bits)
self.assertEqual(100, sev.max_guests)
self.assertEqual(15, sev.max_es_guests)
@mock.patch.object(fakelibvirt.virConnect, "getHostname") @mock.patch.object(fakelibvirt.virConnect, "getHostname")
def test_get_hostname_caching(self, mock_hostname): def test_get_hostname_caching(self, mock_hostname):
mock_hostname.return_value = "foo" mock_hostname.return_value = "foo"

View File

@ -310,6 +310,8 @@ class LibvirtConfigDomainCapsFeatureSev(LibvirtConfigObject):
self.supported = False self.supported = False
self.cbitpos = None self.cbitpos = None
self.reduced_phys_bits = None self.reduced_phys_bits = None
self.max_guests = None
self.max_es_guests = None
def parse_dom(self, xmldoc): def parse_dom(self, xmldoc):
super(LibvirtConfigDomainCapsFeatureSev, self).parse_dom(xmldoc) super(LibvirtConfigDomainCapsFeatureSev, self).parse_dom(xmldoc)
@ -322,6 +324,10 @@ class LibvirtConfigDomainCapsFeatureSev(LibvirtConfigObject):
self.reduced_phys_bits = int(c.text) self.reduced_phys_bits = int(c.text)
elif c.tag == 'cbitpos': elif c.tag == 'cbitpos':
self.cbitpos = int(c.text) self.cbitpos = int(c.text)
elif c.tag == 'maxGuests':
self.max_guests = int(c.text)
elif c.tag == 'maxESGuests':
self.max_es_guests = int(c.text)
class LibvirtConfigDomainCapsOS(LibvirtConfigObject): class LibvirtConfigDomainCapsOS(LibvirtConfigObject):
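
The parsing added in this hunk can be tried outside Nova with a small standalone sketch. It is not the Nova implementation: `parse_sev_limits` is a hypothetical helper, and the standard library's `xml.etree.ElementTree` is used here as a stand-in for the XML handling in Nova's config classes. The sample XML is the fixture added to fakelibvirt in this commit.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional

# Fixture copied from the fakelibvirt change in this commit.
SEV_XML = """<features>
  <gic supported='no'/>
  <sev supported='yes'>
    <cbitpos>47</cbitpos>
    <reducedPhysBits>1</reducedPhysBits>
    <maxGuests>100</maxGuests>
    <maxESGuests>15</maxESGuests>
  </sev>
</features>"""


def parse_sev_limits(features_xml: str) -> Dict[str, Optional[int]]:
    """Hypothetical helper: extract the SEV guest limits, if present.

    Both values stay None when libvirt does not report them, which is
    the case the commit describes for libvirt older than 8.0.0.
    """
    limits: Dict[str, Optional[int]] = {
        "max_guests": None,
        "max_es_guests": None,
    }
    sev = ET.fromstring(features_xml).find("sev")
    if sev is None or sev.get("supported") != "yes":
        return limits
    for child in sev:
        if child.tag == "maxGuests":
            limits["max_guests"] = int(child.text)
        elif child.tag == "maxESGuests":
            limits["max_es_guests"] = int(child.text)
    return limits
```

Leaving the fields as `None` rather than 0 keeps "not reported" distinct from "no slots available", which is what lets the driver fall back to an unlimited inventory on older libvirt.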

View File

@ -9009,33 +9009,30 @@ class LibvirtDriver(driver.ComputeDriver):
resources[rc].add(resource_obj) resources[rc].add(resource_obj)
def _get_memory_encrypted_slots(self): def _get_memory_encrypted_slots(self):
slots = CONF.libvirt.num_memory_encrypted_guests conf_slots = CONF.libvirt.num_memory_encrypted_guests
if not self._host.supports_amd_sev: if not self._host.supports_amd_sev:
if slots and slots > 0: if conf_slots and conf_slots > 0:
LOG.warning("Host is configured with " LOG.warning("Host is configured with "
"libvirt.num_memory_encrypted_guests set to " "libvirt.num_memory_encrypted_guests set to "
"%d, but is not SEV-capable.", slots) "%d, but is not SEV-capable.", conf_slots)
return 0 return 0
# NOTE(aspiers): Auto-detection of the number of available slots = db_const.MAX_INT
# slots for AMD SEV is not yet possible, so honor the
# configured value, or impose no limit if this is not # NOTE(tkajinam): Current nova supports SEV only so we ignore SEV-ES
# specified. This does incur a risk that if operators don't if self._host.max_sev_guests is not None:
# read the instructions and configure the maximum correctly, slots = self._host.max_sev_guests
# the maximum could be exceeded resulting in SEV guests
# failing at launch-time. However at least SEV guests will if conf_slots is not None:
# launch until the maximum, and when auto-detection code is if conf_slots > slots:
# added later, an upgrade will magically fix the issue. LOG.warning("Host is configured with "
# "libvirt.num_memory_encrypted_guests set to %d, "
# Note also that the configured value can be 0 on an "but supports only %d.", conf_slots, slots)
# SEV-capable host, since there might conceivably be good slots = min(slots, conf_slots)
# reasons for the operator to want to disable SEV even when
# it's available (e.g. due to performance impact, or LOG.debug("Available memory encrypted slots: %d", slots)
# implementation bugs which may surface later). return slots
if slots is not None:
return slots
else:
return db_const.MAX_INT
@property @property
def static_traits(self) -> ty.Dict[str, bool]: def static_traits(self) -> ty.Dict[str, bool]:
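
The decision logic of the new `_get_memory_encrypted_slots` can be summarised as a pure function. This is a sketch under stated assumptions, not the driver code: `resolve_sev_slots` and its parameters are hypothetical names, logging is omitted, and `MAX_INT` is assumed to be a stand-in for the `db_const.MAX_INT` value the driver uses for "effectively unlimited".

```python
from typing import Optional

# Assumption: stand-in for db_const.MAX_INT ("effectively unlimited").
MAX_INT = 0x7FFFFFFF


def resolve_sev_slots(
    supports_sev: bool,
    detected_max: Optional[int],
    configured: Optional[int],
) -> int:
    """Hypothetical helper mirroring the slot calculation in this hunk.

    supports_sev: whether the host is SEV-capable.
    detected_max: limit reported by libvirt, or None if not reported.
    configured:   libvirt.num_memory_encrypted_guests, or None if unset.
    """
    if not supports_sev:
        # The option is ignored on hosts without SEV support.
        return 0
    # Start from the detected limit, or "unlimited" if none was reported.
    slots = MAX_INT if detected_max is None else detected_max
    if configured is not None:
        # The operator may lower the limit but cannot raise it above
        # what the hardware reports.
        slots = min(slots, configured)
    return slots
```

With a detected limit of 100, this gives 100 when the option is unset, 100 when it is set to 120, and 80 when it is set to 80, matching the three cases in `TestLibvirtSEVSupportedMaxGuests`.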

View File

@ -162,6 +162,8 @@ class Host(object):
# kernel, QEMU, and/or libvirt. These are determined on demand and # kernel, QEMU, and/or libvirt. These are determined on demand and
# memoized by various properties below # memoized by various properties below
self._supports_amd_sev: ty.Optional[bool] = None self._supports_amd_sev: ty.Optional[bool] = None
self._max_sev_guests: ty.Optional[int] = None
self._max_sev_es_guests: ty.Optional[int] = None
self._supports_uefi: ty.Optional[bool] = None self._supports_uefi: ty.Optional[bool] = None
self._supports_secure_boot: ty.Optional[bool] = None self._supports_secure_boot: ty.Optional[bool] = None
@ -1836,11 +1838,29 @@ class Host(object):
if feature_is_sev and feature.supported: if feature_is_sev and feature.supported:
LOG.info("AMD SEV support detected") LOG.info("AMD SEV support detected")
self._supports_amd_sev = True self._supports_amd_sev = True
self._max_sev_guests = feature.max_guests
self._max_sev_es_guests = feature.max_es_guests
return self._supports_amd_sev return self._supports_amd_sev
LOG.debug("No AMD SEV support detected for any (arch, machine_type)") LOG.debug("No AMD SEV support detected for any (arch, machine_type)")
return self._supports_amd_sev return self._supports_amd_sev
@property
def max_sev_guests(self) -> ty.Optional[int]:
"""Determine maximum number of guests with AMD SEV.
"""
if not self.supports_amd_sev:
return None
return self._max_sev_guests
@property
def max_sev_es_guests(self) -> ty.Optional[int]:
"""Determine maximum number of guests with AMD SEV-ES.
"""
if not self.supports_amd_sev:
return None
return self._max_sev_es_guests
@property @property
def supports_remote_managed_ports(self) -> bool: def supports_remote_managed_ports(self) -> bool:
"""Determine if the host supports remote managed ports. """Determine if the host supports remote managed ports.

View File

@ -0,0 +1,6 @@
---
features:
- |
The libvirt driver can now detect the maximum number of guests with
encrypted memory which can run concurrently on its compute host, using
the new fields in the libvirt API available since version 8.0.0.