[train-only] post stack creation tsx validation
The RHEL-8.3 kernel disabled the Intel TSX (Transactional Synchronization
Extensions) feature by default as a preemptive security measure, but this
breaks live migration from RHEL-7.9 (or even RHEL-8.1 or RHEL-8.2) to
RHEL-8.3.

Operators are expected to explicitly define the TSX flag in their
KernelArgs for the compute role to prevent live-migration issues during
the upgrade process.

This is explained in detail in this article [a].

If operators don't want to add the TSX flag to the KernelArgs, they can
always set "ForceNoTsx" to true.

Adding this mandatory validation right after the stacks are updated is
probably the earliest place where we can validate and fail if necessary.
We'd rather fail early than late, as this provides the best experience
for our users.

In addition to this, there's a tripleo-validation [b] in the works.

This is meant to be train-only for now, but we will have to refactor if
(when?) we support FFU from Queens to Wallaby+.

[a] https://access.redhat.com/solutions/6036141
[b] https://review.opendev.org/c/openstack/tripleo-validations/+/790806

Co-Authored-By: Martin Schuppert <mschuppert@redhat.com>
Related: https://bugzilla.redhat.com/1923165
Closes-Bug: #1916758
Change-Id: I35246fbf74394f6e315973283464085d2aef08b2
parent 0a0296f1fa
commit 050c9aa99f
@@ -0,0 +1,17 @@
---
fixes:
  - |
    RHEL-8.3 kernel disabled the Intel TSX (Transactional
    Synchronization Extensions) feature by default as a preemptive
    security measure, but it breaks live migration from RHEL-7.9
    (or even RHEL-8.1 or RHEL-8.2) to RHEL-8.3.

    Operators are expected to explicitly define the TSX flag in
    their KernelArgs for the compute role to prevent live-migration
    issues during the upgrade or update process.

    We now introduce this validation in tripleoclient to ensure
    early failure.

    More information here:
    https://access.redhat.com/solutions/6036141
@@ -147,3 +147,7 @@ class CellExportError(Base):

class BannedParameters(Base):
    """Some of the environment parameters provided should be removed"""


class PostStackValidationError(Base):
    """Stack validation failed"""
@@ -712,6 +712,42 @@ class DeployOvercloud(command.Command):
            roles=roles
        )

    def _post_stack_validation(self, stack):
        """Post stack update mandatory validation

        Runs a validation to make sure that KernelArgs either
        contains a TSX parameter or the ForceNoTsx parameter is defined.
        This is a mandatory validation and it has to happen
        as soon as possible.
        """

        libvirt_service = "OS::TripleO::Services::NovaLibvirt"
        services = filter(lambda x: (x.endswith('Services') and
                                     libvirt_service in stack.parameters[x]),
                          stack.parameters)
        impacted_roles = []
        for i in services:
            role_name = re.sub('Services$', '', i)
            role_param = stack.parameters.get(role_name + 'Parameters')
            if role_param:
                role_params = json.loads(role_param)
                kernel_args = role_params.get('KernelArgs')
                no_tsx = role_params.get('ForceNoTsx')
                if (not no_tsx and
                        (not kernel_args or "tsx=" not in kernel_args)):
                    impacted_roles.append(role_name)
        if len(impacted_roles):
            self.log.error("Roles in the following list are expected to have "
                           "a TSX flag configured in their KernelArgs "
                           "parameter. For more information on why we must "
                           "explicitly define the TSX flag, please visit: "
                           "https://access.redhat.com/solutions/6036141")
            self.log.error("You can also skip this validation by setting "
                           "ForceNoTsx parameter for the desired role(s)")
            self.log.error("Impacted roles: {roles}".format(
                           roles=",".join(impacted_roles)))
            raise exceptions.PostStackValidationError()

    def get_parser(self, prog_name):
        # add_help doesn't work properly, set it to False:
        parser = argparse.ArgumentParser(
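The role-selection rule in `_post_stack_validation` can be exercised outside the client. The following is a minimal sketch, not the tripleoclient implementation: `find_impacted_roles` and the `sample` parameters are hypothetical names and data invented for illustration, and a plain dict of strings stands in for `stack.parameters`.

```python
import json
import re

LIBVIRT_SERVICE = "OS::TripleO::Services::NovaLibvirt"


def find_impacted_roles(parameters):
    """Return roles that run NovaLibvirt but set neither tsx= nor ForceNoTsx.

    `parameters` is a hypothetical stand-in for stack.parameters: a dict
    mapping parameter names to string values.
    """
    impacted = []
    for name, value in parameters.items():
        # Only "<Role>Services" entries that list the libvirt service matter.
        if not name.endswith("Services") or LIBVIRT_SERVICE not in value:
            continue
        role_name = re.sub("Services$", "", name)
        raw = parameters.get(role_name + "Parameters")
        if not raw:
            # As in the hunk, a role without role parameters is skipped.
            continue
        role_params = json.loads(raw)
        kernel_args = role_params.get("KernelArgs")
        if role_params.get("ForceNoTsx"):
            continue
        if not kernel_args or "tsx=" not in kernel_args:
            impacted.append(role_name)
    return impacted


# Hypothetical sample data, shaped like the values the hunk reads.
libvirt_services = json.dumps(["OS::TripleO::Services::NovaCompute",
                               LIBVIRT_SERVICE])
sample = {
    "ComputeServices": libvirt_services,
    "ComputeParameters": json.dumps({"KernelArgs": "intel_iommu=on"}),
    "ComputeTsxServices": libvirt_services,
    "ComputeTsxParameters": json.dumps({"KernelArgs": "tsx=off"}),
    "ComputeNoTsxServices": libvirt_services,
    "ComputeNoTsxParameters": json.dumps({"ForceNoTsx": True}),
    "ControllerServices": json.dumps(["OS::TripleO::Services::Keystone"]),
}

print(find_impacted_roles(sample))  # → ['Compute']
```

Only `Compute` is reported: `ComputeTsx` carries a `tsx=` flag, `ComputeNoTsx` opts out through `ForceNoTsx`, and `Controller` does not run the libvirt service. Note that the check is a substring match on `tsx=`, so any value of the flag satisfies it.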
@@ -86,6 +86,14 @@ class UpdatePrepare(DeployOvercloud):

        super(UpdatePrepare, self).take_action(parsed_args)
        package_update.update(clients, container=stack_name)

        # "Mandatory" validation to make sure kernelargs contains
        # a TSX flag
        if not parsed_args.disable_validations:
            stack = oooutils.get_stack(clients.orchestration,
                                       parsed_args.stack)
            self._post_stack_validation(stack)

        package_update.get_config(clients, container=stack_name)
        self.log.info("Update init on stack {0} complete.".format(
                      parsed_args.stack))
@@ -102,6 +102,13 @@ class UpgradePrepare(DeployOvercloud):
        # DeployOvercloud.
        package_update.get_config(clients, container=stack_name)

        # "Mandatory" validation to make sure kernelargs contains
        # a TSX flag
        if not parsed_args.disable_validations:
            stack = oooutils.get_stack(clients.orchestration,
                                       parsed_args.stack)
            self._post_stack_validation(stack)

        # enable ssh admin for Ansible-via-Mistral as that's done only
        # when config_download is true
        deployment.get_hosts_and_enable_ssh_admin(
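Both prepare commands gate the check on `parsed_args.disable_validations`, so the validation is "mandatory" only while validations are enabled. A rough sketch of that control flow follows; `prepare`, `always_fail`, `outcome`, and the stand-in exception class are hypothetical and exist only for illustration, since the real commands fetch the stack through `oooutils.get_stack`.

```python
from argparse import Namespace


class PostStackValidationError(Exception):
    """Stand-in for the exception added by this commit (assumed shape)."""


def prepare(parsed_args, fetch_stack, validate):
    """Run the post-stack validation unless validations are disabled."""
    if parsed_args.disable_validations:
        return "skipped"
    validate(fetch_stack(parsed_args.stack))
    return "validated"


def always_fail(stack):
    # Simulates a stack whose compute roles lack the TSX flag.
    raise PostStackValidationError()


def outcome(disable):
    args = Namespace(stack="overcloud", disable_validations=disable)
    try:
        return prepare(args, lambda name: {"name": name}, always_fail)
    except PostStackValidationError:
        return "failed"


print(outcome(True))   # → skipped
print(outcome(False))  # → failed
```

With validations disabled the failing check never runs; with them enabled the exception propagates out of the prepare step, which is the early failure the commit message asks for.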