CVE-2024-44982: Harden all image handling and conversion code

It was recently learned by the OpenStack community that running qemu-img
on untrusted images without a format pre-specified can present a
security risk. Furthermore, some of these specific image formats have
inherently unsafe features. This is rooted in how qemu-img operates
where all image drivers are loaded and attempt to evaluate the input data.
This can result in several different vectors which this patch works to
close.

This change imports the qemu-img handling code from Ironic-Lib into
Ironic, and image format inspection code, which has been developed by
the wider community to validate general safety of images before converting
them for use in a deployment.

This patch contains functional changes related to the hardening of these
calls including how images are handled, and updates documentation to
provide context and guidance to operators.

Closes-Bug: 2071740
Change-Id: I7fac5c64f89aec39e9755f0930ee47ff8f7aed47
Signed-off-by: Julia Kreger <juliaashleykreger@gmail.com>
This commit is contained in:
Julia Kreger 2024-08-08 12:42:20 -07:00
parent e01522cd4c
commit c996aafa6d
23 changed files with 3501 additions and 159 deletions

View File

@ -404,3 +404,69 @@ be tuned using the :oslo.config:option:`DEFAULT.minimum_required_memory` setting
Operators with a higher level of concurrency may wish to increase the
default value.
Disk Images
===========
Ironic relies upon the ``qemu-img`` tool to convert images from a supplied
disk image format, to a ``raw`` format in order to write the contents of a
disk image to the remote device.
By default, only ``qcow2`` format is supported for this operation, however there
have been reports other formats work when so enabled using the
``[conductor]permitted_image_formats`` configuration option.
Ironic takes several steps by default.
#. Ironic checks and compares supplied metadata with a remote authoritative
source, such as the Glance Image Service, if available.
#. Ironic attempts to "fingerprint" the file type based upon available
metadata and file structure. A file format which is not known to the image
format inspection code may be evaluated as "raw", which means the image
would not be passed through ``qemu-img``. When in doubt, use a ``raw``
image which you can verify is in the desirable and expected state.
#. The image then has a set of safety and sanity checks executed which look
for unknown or unsafe feature usage in the base format which could permit
an attacker to potentially leverage functionality in ``qemu-img`` which
should not be utilized. This check, by default, occurs only through images
which transverse *through* the conductor.
#. Ironic then checks if the fingerprint values and metadata values match.
If they do not match, the requested image is rejected and the operation
fails.
#. The image is then provided to the ``ironic-python-agent``.
Images which are considered "pass-through", as in they are supplied by an
API user as a URL, or are translated to a temporary URL via available
service configuration, are supplied as a URL to the
``ironic-python-agent``.
Ironic can be configured to intercept this interaction and have the conductor
download and inspect these items before the ``ironic-python-agent`` will do so,
however this can increase the temporary disk utilization of the Conductor
along with network traffic to facilitate the transfer. This check is disabled
by default, but can be enabled using the
``[conductor]conductor_always_validates_images`` configuration option.
An option exists which forces all files to be served from the conductor, and
thus force image inspection before involvement of the ``ironic-python-agent``
is the use of the ``[agent]image_download_source`` configuration parameter
when set to ``local`` which proxies all disk images through the conductor.
This setting is also available in the node ``driver_info`` and
``instance_info`` fields.
Mitigating Factors to disk images
---------------------------------
In a fully integrated OpenStack context, Ironic requires images to be set to
"public" in the Image Service.
A direct API user with sufficient elevated access rights *can* submit a URL
for the baremetal node ``instance_info`` dictionary field with an
``image_source`` key value set to a URL. To do so explicitly requires
elevated (trusted) access rights of a System scoped Member,
or Project scoped Owner-Member, or a Project scoped Lessee-Admin via
the ``baremetal:node:update_instance_info`` policy permission rule.
Before the Wallaby release of OpenStack, this was restricted to
``admin`` and ``baremetal_admin`` roles and remains similarly restrictive
in the newer "Secure RBAC" model.

View File

@ -1259,3 +1259,27 @@ To resolve this case, you can either supply new configuration drive contents
with your request, or disable configuration from being stored in Swift for
new baremetal node deployments by changing setting
:oslo.config:option:`deploy.configdrive_use_object_store` to ``false``.
Ironic says my Image is Invalid
===============================
As a result of security fixes which were added to Ironic, resulting from the
security posture of the ``qemu-img`` utility, Ironic enforces certain aspects
related to image files.
* Enforces that the file format of a disk image matches what Ironic is
told by an API user. Any mismatch will result in the image being declared
as invalid. A mismatch with the file contents and what is stored in the
Image service will necessitate uploading a new image as that property
cannot be changed in the image service *after* creation of an image.
* Enforces that the input file format to be written is ``qcow2`` or ``raw``.
This can be extended by modifying ``[conductor]permitted_image_formats`` in
``ironic.conf``.
* Performs safety and sanity check assessment against the file, which can be
disabled by modifying ``[conductor]disable_deep_image_inspection`` and
setting it to ``True``. Doing so is not considered safe and should only
be done by operators accepting the inherent risk that the image they
are attempting to use may have a bad or malicious structure.
Image safety checks are generally performed as the deployment process begins
and stages artifacts, however a late stage check is performed when
needed by the ironic-python-agent.

View File

@ -3,6 +3,24 @@
Add images to the Image service
===============================
Supported Image Formats
~~~~~~~~~~~~~~~~~~~~~~~
Ironic officially supports and tests use of ``qcow2`` formatted images as well
as ``raw`` format images. Other types of disk images, like ``vdi``, and single
file ``vmdk`` files have been reported by users as working in their specific
cases, but are not tested upstream. We advise operators to convert the image
and properly upload the image to Glance.
Ironic enforces the list of supported and permitted image formats utilizing
the ``[conductor]permitted_image_formats`` option in ironic.conf. This setting
defaults to "raw" and "qcow2".
A detected format mismatch between Glance and what the actual contents of
the disk image file are detected as will result in a failed deployment.
To correct such a situation, the image must be re-uploaded with the
declared ``--disk-format`` or actual image file format corrected.
Instance (end-user) images
~~~~~~~~~~~~~~~~~~~~~~~~~~
@ -11,6 +29,10 @@ Build or download the user images as described in :doc:`/user/creating-images`.
Load all the created images into the Image service, and note the image UUIDs in
the Image service for each one as it is generated.
.. note::
Images from Glance used by Ironic must be flagged as ``public``, which
requires administrative privileges with the Glance image service to set.
- For *whole disk images* just upload the image:
.. code-block:: console

View File

@ -28,6 +28,39 @@ Many distributions publish their own cloud images. These are usually whole disk
images that are built for legacy boot mode (not UEFI), with Ubuntu being an
exception (they publish images that work in both modes).
Supported Disk Image Formats
----------------------------
The following formats are tested by Ironic and are expected to work as
long as no unknown or unsafe special features are being used
* raw - A file containing bytes as they would exist on a disk or other
block storage device. This is the simplest format.
* qcow2 - An updated file format based upon the `QEMU <https://www.qemu.org>`_
Copy-on-Write format.
A special mention exists for ``iso`` formatted "CD" images. While Ironic uses
the ISO9660 filesystems in some of it's processes for aspects such as virtual
media, it does *not* support writing them to the remote block storage device.
Image formats we believe may work due to third party reports, but do not test:
* vmdk - A file format derived from the image format originally created
by VMware for their hypervisor product line. Specifically we believe
a single file VMDK formatted image should work. As there are
are several subformats, some of which will not work and may result
in unexpected behavior such as failed deployments.
* vdi - A file format used by
`Oracle VM Virtualbox <https://www.virtualbox.org>`_ hypervisor.
As Ironic does not support these formats, their usage is normally blocked
due security considerations by default. Please consult with your Ironic Operator.
It is important to highlight that Ironic enforces and matches the file type
based upon signature, and not file extension. If there is a mismatch,
the input and or remote service records such as in the Image service
must be corrected.
disk-image-builder
------------------

View File

@ -63,6 +63,8 @@ def config(token):
# and behavior into place.
'agent_token_required': True,
'agent_md5_checksum_enable': CONF.agent.allow_md5_checksum,
'disable_deep_image_inspection': CONF.conductor.disable_deep_image_inspection, # noqa
'permitted_image_formats': CONF.conductor.permitted_image_formats,
}

View File

@ -905,3 +905,7 @@ class UnsupportedHardwareFeature(Invalid):
class BootModeNotAllowed(Invalid):
_msg_fmt = _("'%(mode)s' boot mode is not allowed for %(op)s operation.")
class InvalidImage(ImageUnacceptable):
_msg_fmt = _("The requested image is not valid for use.")

File diff suppressed because it is too large Load Diff

View File

@ -23,7 +23,6 @@ import os
import shutil
import time
from ironic_lib import qemu_img
from oslo_concurrency import processutils
from oslo_log import log as logging
from oslo_utils import fileutils
@ -32,7 +31,9 @@ import pycdlib
from ironic.common import exception
from ironic.common.glance_service import service_utils as glance_utils
from ironic.common.i18n import _
from ironic.common import image_format_inspector
from ironic.common import image_service as service
from ironic.common import qemu_img
from ironic.common import utils
from ironic.conf import CONF
@ -388,28 +389,18 @@ def fetch_into(context, image_href, image_file):
def fetch(context, image_href, path, force_raw=False):
with fileutils.remove_path_on_error(path):
fetch_into(context, image_href, path)
if force_raw:
image_to_raw(image_href, path, "%s.part" % path)
def get_source_format(image_href, path):
data = qemu_img.image_info(path)
fmt = data.file_format
if fmt is None:
try:
img_format = image_format_inspector.detect_file_format(path)
except image_format_inspector.ImageFormatError:
raise exception.ImageUnacceptable(
reason=_("'qemu-img info' parsing failed."),
reason=_("parsing of the image failed."),
image_id=image_href)
backing_file = data.backing_file
if backing_file is not None:
raise exception.ImageUnacceptable(
image_id=image_href,
reason=_("fmt=%(fmt)s backed by: %(backing_file)s") %
{'fmt': fmt, 'backing_file': backing_file})
return fmt
return str(img_format)
def force_raw_will_convert(image_href, path_tmp):
@ -422,24 +413,46 @@ def force_raw_will_convert(image_href, path_tmp):
def image_to_raw(image_href, path, path_tmp):
with fileutils.remove_path_on_error(path_tmp):
fmt = get_source_format(image_href, path_tmp)
if not CONF.conductor.disable_deep_image_inspection:
fmt = safety_check_image(path_tmp)
if fmt != "raw":
if fmt not in CONF.conductor.permitted_image_formats:
LOG.error("Security: The requested image %(image_href)s "
"is of format image %(format)s and is not in "
"the [conductor]permitted_image_formats list.",
{'image_href': image_href,
'format': fmt})
raise exception.InvalidImage()
else:
fmt = get_source_format(image_href, path)
LOG.warning("Security: Image safety checking has been disabled. "
"This is unsafe operation. Attempting to continue "
"the detected format %(img_fmt)s for %(path)s.",
{'img_fmt': fmt,
'path': path})
if fmt != "raw" and fmt != "iso":
# When the target format is NOT raw, we need to convert it.
# however, we don't need nor want to do that when we have
# an ISO image. If we have an ISO because it was requested,
# we have correctly fingerprinted it. Prior to proper
# image detection, we thought we had a raw image, and we
# would end up asking for a raw image to be made a raw image.
staged = "%s.converted" % path
utils.is_memory_insufficient(raise_if_fail=True)
LOG.debug("%(image)s was %(format)s, converting to raw",
{'image': image_href, 'format': fmt})
with fileutils.remove_path_on_error(staged):
qemu_img.convert_image(path_tmp, staged, 'raw')
qemu_img.convert_image(path_tmp, staged, 'raw',
source_format=fmt)
os.unlink(path_tmp)
data = qemu_img.image_info(staged)
if data.file_format != "raw":
new_fmt = get_source_format(image_href, staged)
if new_fmt != "raw":
raise exception.ImageConvertFailed(
image_id=image_href,
reason=_("Converted to raw, but format is "
"now %s") % data.file_format)
"now %s") % new_fmt)
os.rename(staged, path)
else:
@ -470,7 +483,7 @@ def converted_size(path, estimate=False):
the original image scaled by the configuration value
`raw_image_growth_factor`.
"""
data = qemu_img.image_info(path)
data = image_format_inspector.detect_file_format(path)
if not estimate:
return data.virtual_size
growth_factor = CONF.raw_image_growth_factor
@ -787,3 +800,92 @@ def _get_deploy_iso_files(deploy_iso, mountdir):
# present in deploy iso. This path varies for different OS vendors.
# e_img_rel_path: is required by mkisofs to generate boot iso.
return uefi_path_info, e_img_rel_path, grub_rel_path
def __node_or_image_cache(node):
"""A helper for logging to determine if image cache or node uuid."""
if not node:
return 'image cache'
else:
return node.uuid
def safety_check_image(image_path, node=None):
"""Performs a safety check on the supplied image.
This method triggers the image format inspector's to both identify the
type of the supplied file and safety check logic to identify if there
are any known unsafe features being leveraged, and return the detected
file format in the form of a string for the caller.
:param image_path: A fully qualified path to an image which needs to
be evaluated for safety.
:param node: A Node object, optional. When supplied logging indicates the
node which triggered this issue, but the node is not
available in all invocation cases.
:returns: a string representing the the image type which is used.
:raises: InvalidImage when the supplied image is detected as unsafe,
or the image format inspector has failed to parse the supplied
image's contents.
"""
id_string = __node_or_image_cache(node)
try:
img_class = image_format_inspector.detect_file_format(image_path)
if not img_class.safety_check():
LOG.error("Security: The requested image for "
"deployment of node %(node)s fails safety sanity "
"checking.",
{'node': id_string})
raise exception.InvalidImage()
image_format_name = str(img_class)
except image_format_inspector.ImageFormatError:
LOG.error("Security: The requested user image for the "
"deployment node %(node)s failed to be able "
"to be parsed by the image format checker.",
{'node': id_string})
raise exception.InvalidImage()
return image_format_name
def check_if_image_format_is_permitted(img_format,
expected_format=None,
node=None):
"""Checks image format consistency.
:params img_format: The determined image format by name.
:params expected_format: Optional, the expected format based upon
supplied configuration values.
:params node: A node object or None implying image cache.
:raises: InvalidImage if the requested image format is not permitted
by configuration, or the expected_format does not match the
determined format.
"""
id_string = __node_or_image_cache(node)
if img_format not in CONF.conductor.permitted_image_formats:
LOG.error("Security: The requested deploy image for node %(node)s "
"is of format image %(format)s and is not in the "
"[conductor]permitted_image_formats list.",
{'node': id_string,
'format': img_format})
raise exception.InvalidImage()
if expected_format is not None and img_format != expected_format:
if expected_format in ['ari', 'aki']:
# In this case, we have an ari or aki, meaning we're pulling
# down a kernel/ramdisk, and this is rooted in a misunderstanding.
# They should be raw. The detector should be detecting this *as*
# raw anyway, so the data just mismatches from a common
# misunderstanding, and that is okay in this case as they are not
# passed to qemu-img.
# TODO(TheJulia): Add a log entry to warn here at some point in
# the future as we begin to shift the perception around this.
# See: https://bugs.launchpad.net/ironic/+bug/2074090
return
LOG.error("Security: The requested deploy image for node %(node)s "
"has a format (%(format)s) which does not match the "
"expected image format (%(expected)s) based upon "
"supplied or retrieved information.",
{'node': id_string,
'format': img_format,
'expected': expected_format})
raise exception.InvalidImage()

89
ironic/common/qemu_img.py Normal file
View File

@ -0,0 +1,89 @@
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.
import logging
from oslo_concurrency import processutils
from oslo_utils import units
import tenacity
from ironic.common import utils
from ironic.conf import CONF
LOG = logging.getLogger(__name__)
# Limit the memory address space to 1 GiB when running qemu-img
QEMU_IMG_LIMITS = None
def _qemu_img_limits():
# NOTE(TheJulia): If you make *any* chance to this code, you may need
# to make an identitical or similar change to ironic-python-agent.
global QEMU_IMG_LIMITS
if QEMU_IMG_LIMITS is None:
QEMU_IMG_LIMITS = processutils.ProcessLimits(
address_space=CONF.disk_utils.image_convert_memory_limit
* units.Mi)
return QEMU_IMG_LIMITS
def _retry_on_res_temp_unavailable(exc):
if (isinstance(exc, processutils.ProcessExecutionError)
and ('Resource temporarily unavailable' in exc.stderr
or 'Cannot allocate memory' in exc.stderr)):
return True
return False
@tenacity.retry(
retry=tenacity.retry_if_exception(_retry_on_res_temp_unavailable),
stop=tenacity.stop_after_attempt(CONF.disk_utils.image_convert_attempts),
reraise=True)
def convert_image(source, dest, out_format, run_as_root=False, cache=None,
out_of_order=False, sparse_size=None,
source_format='qcow2'):
# NOTE(TheJulia): If you make *any* chance to this code, you may need
# to make an identitical or similar change to ironic-python-agent.
"""Convert image to other format."""
cmd = ['qemu-img', 'convert', '-f', source_format, '-O', out_format]
if cache is not None:
cmd += ['-t', cache]
if sparse_size is not None:
cmd += ['-S', sparse_size]
if out_of_order:
cmd.append('-W')
cmd += [source, dest]
# NOTE(TheJulia): Statically set the MALLOC_ARENA_MAX to prevent leaking
# and the creation of new malloc arenas which will consume the system
# memory. If limited to 1, qemu-img consumes ~250 MB of RAM, but when
# another thread tries to access a locked section of memory in use with
# another thread, then by default a new malloc arena is created,
# which essentially balloons the memory requirement of the machine.
# Default for qemu-img is 8 * nCPU * ~250MB (based on defaults +
# thread/code/process/library overhead. In other words, 64 GB. Limiting
# this to 3 keeps the memory utilization in happy cases below the overall
# threshold which is in place in case a malicious image is attempted to
# be passed through qemu-img.
env_vars = {'MALLOC_ARENA_MAX': '3'}
try:
utils.execute(*cmd, run_as_root=run_as_root,
prlimit=_qemu_img_limits(),
use_standard_locale=True,
env_variables=env_vars)
except processutils.ProcessExecutionError as e:
if ('Resource temporarily unavailable' in e.stderr
or 'Cannot allocate memory' in e.stderr):
LOG.debug('Failed to convert image, retrying. Error: %s', e)
# Sync disk caches before the next attempt
utils.execute('sync')
raise

View File

@ -27,6 +27,7 @@ from ironic.conf import database
from ironic.conf import default
from ironic.conf import deploy
from ironic.conf import dhcp
from ironic.conf import disk_utils
from ironic.conf import dnsmasq
from ironic.conf import drac
from ironic.conf import fake
@ -64,6 +65,7 @@ default.register_opts(CONF)
deploy.register_opts(CONF)
drac.register_opts(CONF)
dhcp.register_opts(CONF)
disk_utils.register_opts(CONF)
dnsmasq.register_opts(CONF)
fake.register_opts(CONF)
glance.register_opts(CONF)

View File

@ -428,6 +428,49 @@ opts = [
mutable=True,
help=_("Specifies a list of boot modes that are not allowed "
"during deployment. Eg: ['bios']")),
cfg.BoolOpt('disable_deep_image_inspection',
default=False,
# Normally such an option would be mutable, but this is,
# a security guard and operators should not expect to change
# this option under normal circumstances.
mutable=False,
help=_('Security Option to permit an operator to disable '
'file content inspections. Under normal conditions, '
'the conductor will inspect requested image contents '
'which are transferred through the conductor. '
'Disabling this option is not advisable and opens '
'the risk of unsafe images being processed which may '
'allow an attacker to leverage unsafe features in '
'various disk image formats to perform a variety of '
'unsafe and potentially compromising actions. '
'This option is *not* mutable, and '
'requires a service restart to change.')),
cfg.BoolOpt('conductor_always_validates_images',
default=False,
# Normally mutable, however from a security context we do want
# all logging to be generated from this option to be changed,
# and as such is set to False to force a conductor restart.
mutable=False,
help=_('Security Option to enable the conductor to *always* '
'inspect the image content of any requested deploy, '
'even if the deployment would have normally bypassed '
'the conductor\'s cache. When this is set to False, '
'the Ironic-Python-Agent is responsible '
'for any necessary image checks. Setting this to '
'True will result in a higher utilization of '
'resources (disk space, network traffic) '
'as the conductor will evaluate *all* images. '
'This option is *not* mutable, and requires a '
'service restart to change. This option requires '
'[conductor]disable_deep_image_inspection to be set '
'to False.')),
cfg.ListOpt('permitted_image_formats',
default=['raw', 'qcow2', 'iso'],
mutable=True,
help=_('The supported list of image formats which are '
'permitted for deployment with Ironic. If an image '
'format outside of this list is detected, the image '
'validation logic will fail the deployment process.')),
]

33
ironic/conf/disk_utils.py Normal file
View File

@ -0,0 +1,33 @@
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.
from oslo_config import cfg
# NOTE(TheJulia): If you make *any* chance to this code, you may need
# to make an identitical or similar change to ironic-python-agent.
# These options were originally taken from ironic-lib upon the decision
# to move the qemu-img image conversion calls into the projects in
# order to simplify fixes related to them.
opts = [
cfg.IntOpt('image_convert_memory_limit',
default=2048,
help='Memory limit for "qemu-img convert" in MiB. Implemented '
'via the address space resource limit.'),
cfg.IntOpt('image_convert_attempts',
default=3,
help='Number of attempts to convert an image.'),
]
def register_opts(conf):
conf.register_opts(opts, group='disk_utils')

View File

@ -27,6 +27,7 @@ _opts = [
('database', ironic.conf.database.opts),
('deploy', ironic.conf.deploy.opts),
('dhcp', ironic.conf.dhcp.opts),
('disk_utils', ironic.conf.disk_utils.opts),
('drac', ironic.conf.drac.opts),
('glance', ironic.conf.glance.list_opts()),
('healthcheck', ironic.conf.healthcheck.opts),

View File

@ -1,6 +1,6 @@
- name: convert and write
become: yes
command: qemu-img convert -t directsync -O host_device /tmp/{{ inventory_hostname }}.img {{ ironic_image_target }}
command: qemu-img convert -f {{ ironic.image.disk_format }} -t directsync -O host_device /tmp/{{ inventory_hostname }}.img {{ ironic_image_target }}
async: 1200
poll: 10
when: ironic.image.disk_format != 'raw'

View File

@ -26,6 +26,7 @@ from oslo_utils import fileutils
from oslo_utils import strutils
from ironic.common import async_steps
from ironic.common import context
from ironic.common import exception
from ironic.common import faults
from ironic.common.glance_service import service_utils
@ -204,7 +205,8 @@ def check_for_missing_params(info_dict, error_msg, param_prefix=''):
'missing_info': missing_info})
def fetch_images(ctx, cache, images_info, force_raw=True):
def fetch_images(ctx, cache, images_info, force_raw=True,
expected_format=None):
"""Check for available disk space and fetch images using ImageCache.
:param ctx: context
@ -212,7 +214,11 @@ def fetch_images(ctx, cache, images_info, force_raw=True):
:param images_info: list of tuples (image href, destination path)
:param force_raw: boolean value, whether to convert the image to raw
format
:param expected_format: The expected format of the image.
:raises: InstanceDeployFailure if unable to find enough disk space
:raises: InvalidImage if the supplied image metadata or contents are
deemed to be invalid, unsafe, or not matching the expectations
asserted by configuration supplied or set.
"""
try:
@ -224,8 +230,14 @@ def fetch_images(ctx, cache, images_info, force_raw=True):
# if disk space is used between the check and actual download.
# This is probably unavoidable, as we can't control other
# (probably unrelated) processes
image_list = []
for href, path in images_info:
cache.fetch_image(href, path, ctx=ctx, force_raw=force_raw)
# NOTE(TheJulia): Href in this case can be an image UUID or a URL.
image_format = cache.fetch_image(href, path, ctx=ctx,
force_raw=force_raw,
expected_format=expected_format)
image_list.append((href, path, image_format))
return image_list
def set_failed_state(task, msg, collect_logs=True):
@ -1060,7 +1072,7 @@ class InstanceImageCache(image_cache.ImageCache):
@METRICS.timer('cache_instance_image')
def cache_instance_image(ctx, node, force_raw=None):
def cache_instance_image(ctx, node, force_raw=None, expected_format=None):
"""Fetch the instance's image from Glance
This method pulls the disk image and writes them to the appropriate
@ -1069,8 +1081,12 @@ def cache_instance_image(ctx, node, force_raw=None):
:param ctx: context
:param node: an ironic node object
:param force_raw: whether convert image to raw format
:param expected_format: The expected format of the disk image contents.
:returns: a tuple containing the uuid of the image and the path in
the filesystem where image is cached.
:raises: InvalidImage if the requested image is invalid and cannot be
used for deployed based upon contents of the image or the metadata
surrounding the image not matching the configured image.
"""
# NOTE(dtantsur): applying the default here to make the option mutable
if force_raw is None:
@ -1084,10 +1100,9 @@ def cache_instance_image(ctx, node, force_raw=None):
LOG.debug("Fetching image %(image)s for node %(uuid)s",
{'image': uuid, 'uuid': node.uuid})
fetch_images(ctx, InstanceImageCache(), [(uuid, image_path)],
force_raw)
return (uuid, image_path)
image_list = fetch_images(ctx, InstanceImageCache(), [(uuid, image_path)],
force_raw, expected_format=expected_format)
return (uuid, image_path, image_list[0][2])
@METRICS.timer('destroy_images')
@ -1128,13 +1143,39 @@ def destroy_http_instance_images(node):
destroy_images(node.uuid)
def _validate_image_url(node, url, secret=False):
def _validate_image_url(node, url, secret=False, inspect_image=None,
expected_format=None):
"""Validates image URL through the HEAD request.
:param url: URL to be validated
:param secret: if URL is secret (e.g. swift temp url),
it will not be shown in logs.
:param inspect_image: If the requested URL should have extensive
content checking applied. Defaults to the value provided by
the [conductor]conductor_always_validates_images configuration
parameter setting, but is also able to be turned off by supplying
False where needed to perform a redirect or URL head request only.
:param expected_format: The expected image format, if known, for
the image inspection logic.
:returns: Returns a dictionary with basic information about the
requested image if image introspection is
"""
if inspect_image is not None:
# The caller has a bit more context and we can rely upon it,
# for example if it knows we cannot or should not inspect
# the image contents.
inspect = inspect_image
elif not CONF.conductor.disable_deep_image_inspection:
inspect = CONF.conductor.conductor_always_validates_images
else:
# If we're here, file inspection has been explicitly disabled.
inspect = False
# NOTE(TheJulia): This method gets used in two different ways.
# The first is as a "i did a thing, let me make sure my url works."
# The second is to validate a remote URL is valid. In the remote case
# we will grab the file and proceed from there.
image_info = {}
try:
# NOTE(TheJulia): This method only validates that an exception
# is NOT raised. In other words, that the endpoint does not
@ -1146,20 +1187,50 @@ def _validate_image_url(node, url, secret=False):
LOG.error("The specified URL is not a valid HTTP(S) URL or is "
"not reachable for node %(node)s: %(msg)s",
{'node': node.uuid, 'msg': e})
if inspect:
LOG.info("Inspecting image contents for %(node)s with url %(url)s. "
"Expecting user supplied format: %(expected)s",
{'node': node.uuid,
'expected': expected_format,
'url': url})
# Utilizes the file cache since it knows how to pull files down
# and handles pathing and caching and all that fun, however with
# force_raw set as false.
# The goal here being to get the file we would normally just point
# IPA at, be it via swift transfer *or* direct URL request, and
# perform the safety check on it before allowing it to proceed.
ctx = context.get_admin_context()
# NOTE(TheJulia): Because we're using the image cache here, we
# let it run the image validation checking as it's normal course
# of action, and save what it tells us the image format is.
# if there *was* a mismatch, it will raise the error.
_, image_path, img_format = cache_instance_image(
ctx,
node,
force_raw=False,
expected_format=expected_format)
# NOTE(TheJulia): We explicitly delete this file because it has no use
# in the cache after this point.
il_utils.unlink_without_raise(image_path)
image_info['disk_format'] = img_format
return image_info
def _cache_and_convert_image(task, instance_info, image_info=None):
"""Cache an image locally and convert it to RAW if needed."""
# Ironic cache and serve images from httpboot server
force_raw = direct_deploy_should_convert_raw_image(task.node)
_, image_path = cache_instance_image(task.context, task.node,
force_raw=force_raw)
if force_raw or image_info is None:
if image_info is None:
initial_format = instance_info.get('image_disk_format')
else:
initial_format = image_info.get('disk_format')
if image_info is None:
initial_format = instance_info.get('image_disk_format')
else:
initial_format = image_info.get('disk_format')
_, image_path, img_format = cache_instance_image(
task.context, task.node,
force_raw=force_raw,
expected_format=initial_format)
if force_raw or image_info is None:
if force_raw:
instance_info['image_disk_format'] = 'raw'
else:
@ -1240,7 +1311,11 @@ def _cache_and_convert_image(task, instance_info, image_info=None):
task.node.uuid])
if file_extension:
http_image_url = http_image_url + file_extension
_validate_image_url(task.node, http_image_url, secret=False)
# We don't inspect the image in our url check because we just need to do
# an quick path validity check here, we should be checking contents way
# earlier on in this method.
_validate_image_url(task.node, http_image_url, secret=False,
inspect_image=False)
instance_info['image_url'] = http_image_url
@ -1265,29 +1340,57 @@ def build_instance_info_for_deploy(task):
instance_info = node.instance_info
iwdi = node.driver_internal_info.get('is_whole_disk_image')
image_source = instance_info['image_source']
# Flag if we know the source is a path, used for Anaconda
# deploy interface where you can just tell anaconda to
# consume artifacts from a path. In this case, we are not
# doing any image conversions, we're just passing through
# a URL in the form of configuration.
isap = node.driver_internal_info.get('is_source_a_path')
# If our url ends with a /, i.e. we have been supplied with a path,
# we can only deploy this in limited cases for drivers and tools
# which are aware of such. i.e. anaconda.
image_download_source = get_image_download_source(node)
boot_option = get_boot_option(task.node)
# There is no valid reason this should already be set, and
# and gets replaced at various points in this sequence.
instance_info['image_url'] = None
if service_utils.is_glance_image(image_source):
glance = image_service.GlanceImageService(context=task.context)
image_info = glance.show(image_source)
LOG.debug('Got image info: %(info)s for node %(node)s.',
{'info': image_info, 'node': node.uuid})
if image_download_source == 'swift':
# In this case, we are getting a file *from* swift for a glance
# image which is backed by swift. IPA downloads the file directly
# from swift, but cannot get any metadata related to it otherwise.
swift_temp_url = glance.swift_temp_url(image_info)
_validate_image_url(node, swift_temp_url, secret=True)
image_format = image_info.get('disk_format')
# In the process of validating the URL is valid, we will perform
# the requisite safety checking of the asset as we can end up
# converting it in the agent, or needing the disk format value
# to be correct for the Ansible deployment interface.
validate_results = _validate_image_url(
node, swift_temp_url, secret=True,
expected_format=image_format)
# Values are explicitly set into the instance info field
# so IPA have the values available.
instance_info['image_url'] = swift_temp_url
instance_info['image_checksum'] = image_info['checksum']
instance_info['image_disk_format'] = image_info['disk_format']
instance_info['image_disk_format'] = \
validate_results.get('disk_format', image_format)
instance_info['image_os_hash_algo'] = image_info['os_hash_algo']
instance_info['image_os_hash_value'] = image_info['os_hash_value']
else:
# In this case, we're directly downloading the glance image and
# hosting it locally for retrieval by the IPA.
_cache_and_convert_image(task, instance_info, image_info)
# We're just populating extra information for a glance backed image in
# case a deployment interface driver needs them at some point.
instance_info['image_container_format'] = (
image_info['container_format'])
instance_info['image_tags'] = image_info.get('tags', [])
@ -1298,20 +1401,80 @@ def build_instance_info_for_deploy(task):
instance_info['ramdisk'] = image_info['properties']['ramdisk_id']
elif (image_source.startswith('file://')
or image_download_source == 'local'):
# In this case, we're explicitly downloading (or copying a file)
# hosted locally so IPA can download it directly from Ironic.
# NOTE(TheJulia): Intentionally only supporting file:/// as image
# based deploy source since we don't want to, nor should we be in
# in the business of copying large numbers of files as it is a
# huge performance impact.
_cache_and_convert_image(task, instance_info)
else:
# This is the "all other cases" logic for aspects like the user
# has supplied us a direct URL to reference. In cases like the
# anaconda deployment interface where we might just have a path
# and not a file, or where a user may be supplying a full URL to
# a remotely hosted image, we at a minimum need to check if the url
# is valid, and address any redirects upfront.
try:
_validate_image_url(node, image_source)
# NOTE(TheJulia): In the case we're here, we not doing an
# integrated image based deploy, but we may also be doing
# a path based anaconda base deploy, in which case we have
# no backing image, but we need to check for a URL
# redirection. So, if the source is a path (i.e. isap),
# we don't need to inspect the image as there is no image
# in the case for the deployment to drive.
validated_results = {}
if isap:
# This is if the source is a path url, such as one used by
# anaconda templates to to rely upon bootstrapping defaults.
_validate_image_url(node, image_source, inspect_image=False)
else:
# When not isap, we can just let _validate_image_url make a
# the required decision on if contents need to be sampled,
# or not. We try to pass the image_disk_format which may be
# declared by the user, and if not we set expected_format to
# None.
validate_results = _validate_image_url(
node,
image_source,
expected_format=instance_info.get('image_disk_format',
None))
# image_url is internal, and used by IPA and some boot templates.
# in most cases, it needs to come from image_source explicitly.
if 'disk_format' in validated_results:
# Ensure IPA has the value available, so write what we detect,
# if anything. This is also an item which might be needful
# with ansible deploy interface, when used in standalone mode.
instance_info['image_disk_format'] = \
validate_results.get('disk_format')
instance_info['image_url'] = image_source
except exception.ImageRefIsARedirect as e:
# At this point, we've got a redirect response from the webserver,
# and we're going to try to handle it as a single redirect action,
# as requests, by default, only lets a single redirect to occur.
# This is likely a URL pathing fix, like a trailing / on a path,
# or move to HTTPS from a user supplied HTTP url.
if e.redirect_url:
# Since we've got a redirect, we need to carry the rest of the
# request logic as well, which includes recording a disk
# format, if applicable.
instance_info['image_url'] = e.redirect_url
# We need to save the image_source back out so it caches
instance_info['image_source'] = e.redirect_url
task.node.instance_info = instance_info
if not isap:
# The redirect doesn't relate to a path being used, so
# the target is a filename, likely cause is webserver
# telling the client to use HTTPS.
validated_results = _validate_image_url(
node, e.redirect_url,
expected_format=instance_info.get('image_disk_format',
None))
if 'disk_format' in validated_results:
instance_info['image_disk_format'] = \
validated_results.get('disk_format')
else:
raise
@ -1326,7 +1489,6 @@ def build_instance_info_for_deploy(task):
# Call central parsing so we retain things like config drives.
i_info = parse_instance_info(node, image_deploy=False)
instance_info.update(i_info)
return instance_info

View File

@ -65,7 +65,8 @@ class ImageCache(object):
if master_dir is not None:
fileutils.ensure_tree(master_dir)
def fetch_image(self, href, dest_path, ctx=None, force_raw=True):
def fetch_image(self, href, dest_path, ctx=None, force_raw=True,
expected_format=None):
"""Fetch image by given href to the destination path.
Does nothing if destination path exists and is up to date with cache
@ -80,6 +81,7 @@ class ImageCache(object):
:param ctx: context
:param force_raw: boolean value, whether to convert the image to raw
format
:param expected_format: The expected image format.
"""
img_download_lock_name = 'download-image'
if self.master_dir is None:
@ -140,13 +142,14 @@ class ImageCache(object):
{'href': href})
self._download_image(
href, master_path, dest_path, img_info,
ctx=ctx, force_raw=force_raw)
ctx=ctx, force_raw=force_raw,
expected_format=expected_format)
# NOTE(dtantsur): we increased cache size - time to clean up
self.clean_up()
def _download_image(self, href, master_path, dest_path, img_info,
ctx=None, force_raw=True):
ctx=None, force_raw=True, expected_format=None):
"""Download image by href and store at a given path.
This method should be called with uuid-specific lock taken.
@ -158,6 +161,7 @@ class ImageCache(object):
:param ctx: context
:param force_raw: boolean value, whether to convert the image to raw
format
:param expected_format: The expected original format for the image.
:raise ImageDownloadFailed: when the image cache and the image HTTP or
TFTP location are on different file system,
causing hard link to fail.
@ -169,7 +173,7 @@ class ImageCache(object):
try:
with _concurrency_semaphore:
_fetch(ctx, href, tmp_path, force_raw)
_fetch(ctx, href, tmp_path, force_raw, expected_format)
if img_info.get('no_cache'):
LOG.debug("Caching is disabled for image %s", href)
@ -333,34 +337,59 @@ def _free_disk_space_for(path):
return stat.f_frsize * stat.f_bavail
def _fetch(context, image_href, path, force_raw=False):
def _fetch(context, image_href, path, force_raw=False, expected_format=None):
"""Fetch image and convert to raw format if needed."""
path_tmp = "%s.part" % path
if os.path.exists(path_tmp):
LOG.warning("%s exist, assuming it's stale", path_tmp)
os.remove(path_tmp)
images.fetch(context, image_href, path_tmp, force_raw=False)
# By default, the image format is unknown
image_format = None
disable_dii = CONF.conductor.disable_deep_image_inspection
if not disable_dii:
if not expected_format:
# Call of last resort to check the image format. Caching other
# artifacts like kernel/ramdisks are not going to have an expected
# format known even if they are not passed to qemu-img.
remote_image_format = images.image_show(
context,
image_href).get('disk_format')
else:
remote_image_format = expected_format
image_format = images.safety_check_image(path_tmp)
images.check_if_image_format_is_permitted(
image_format, remote_image_format)
# Notes(yjiang5): If glance can provide the virtual size information,
# then we can firstly clean cache and then invoke images.fetch().
if force_raw:
if images.force_raw_will_convert(image_href, path_tmp):
required_space = images.converted_size(path_tmp, estimate=False)
directory = os.path.dirname(path_tmp)
if (force_raw
and ((disable_dii
and images.force_raw_will_convert(image_href, path_tmp))
or (not disable_dii and image_format != 'raw'))):
# NOTE(TheJulia): What is happening here is the rest of the logic
# is hinged on force_raw, but also we don't need to take the entire
# path *if* the image on disk is *already* raw. Depending on settings,
# the path differs slightly because if we have deep image inspection,
# we can just rely upon the inspection image format, otherwise we
# need to ask the image format.
required_space = images.converted_size(path_tmp, estimate=False)
directory = os.path.dirname(path_tmp)
try:
_clean_up_caches(directory, required_space)
except exception.InsufficientDiskSpace:
# try again with an estimated raw size instead of the full size
required_space = images.converted_size(path_tmp, estimate=True)
try:
_clean_up_caches(directory, required_space)
except exception.InsufficientDiskSpace:
# try again with an estimated raw size instead of the full size
required_space = images.converted_size(path_tmp, estimate=True)
try:
_clean_up_caches(directory, required_space)
except exception.InsufficientDiskSpace:
LOG.warning('Not enough space for estimated image size. '
'Consider lowering '
'[DEFAULT]raw_image_growth_factor=%s',
CONF.raw_image_growth_factor)
raise
LOG.error('Not enough space for estimated image size. '
'Consider lowering '
'[DEFAULT]raw_image_growth_factor=%s',
CONF.raw_image_growth_factor)
raise
images.image_to_raw(image_href, path, path_tmp)
else:
os.rename(path_tmp, path)

View File

@ -82,6 +82,8 @@ class TestLookup(test_api_base.BaseApiTest):
'agent_token': mock.ANY,
'agent_token_required': True,
'agent_md5_checksum_enable': CONF.agent.allow_md5_checksum,
'disable_deep_image_inspection': CONF.conductor.disable_deep_image_inspection, # noqa
'permitted_image_formats': CONF.conductor.permitted_image_formats,
}
self.assertEqual(expected_config, data['config'])
self.assertIsNotNone(data['config']['agent_token'])

View File

@ -0,0 +1,668 @@
# Copyright 2020 Red Hat, Inc
# All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.
import io
import os
import re
import struct
import subprocess
import tempfile
from unittest import mock
from oslo_utils import units
from ironic.common import image_format_inspector as format_inspector
from ironic.tests import base as test_base
TEST_IMAGE_PREFIX = 'ironic-unittest-formatinspector-'
def get_size_from_qemu_img(filename):
output = subprocess.check_output('qemu-img info "%s"' % filename,
shell=True)
for line in output.split(b'\n'):
m = re.search(b'^virtual size: .* .([0-9]+) bytes', line.strip())
if m:
return int(m.group(1))
raise Exception('Could not find virtual size with qemu-img')
class TestFormatInspectors(test_base.TestCase):
block_execute = False
def setUp(self):
super(TestFormatInspectors, self).setUp()
self._created_files = []
def tearDown(self):
super(TestFormatInspectors, self).tearDown()
for fn in self._created_files:
try:
os.remove(fn)
except Exception:
pass
def _create_iso(self, image_size, subformat='9660'):
"""Create an ISO file of the given size.
:param image_size: The size of the image to create in bytes
:param subformat: The subformat to use, if any
"""
# these tests depend on mkisofs
# being installed and in the path,
# if it is not installed, skip
try:
subprocess.check_output('mkisofs --version', shell=True)
except Exception:
self.skipTest('mkisofs not installed')
size = image_size // units.Mi
base_cmd = "mkisofs"
if subformat == 'udf':
# depending on the distribution mkisofs may not support udf
# and may be provided by genisoimage instead. As a result we
# need to check if the command supports udf via help
# instead of checking the installed version.
# mkisofs --help outputs to stderr so we need to
# redirect it to stdout to use grep.
try:
subprocess.check_output(
'mkisofs --help 2>&1 | grep udf', shell=True)
except Exception:
self.skipTest('mkisofs does not support udf format')
base_cmd += " -udf"
prefix = TEST_IMAGE_PREFIX
prefix += '-%s-' % subformat
fn = tempfile.mktemp(prefix=prefix, suffix='.iso')
self._created_files.append(fn)
subprocess.check_output(
'dd if=/dev/zero of=%s bs=1M count=%i' % (fn, size),
shell=True)
# We need to use different file as input and output as the behavior
# of mkisofs is version dependent if both the input and the output
# are the same and can cause test failures
out_fn = "%s.iso" % fn
subprocess.check_output(
'%s -V "TEST" -o %s %s' % (base_cmd, out_fn, fn),
shell=True)
self._created_files.append(out_fn)
return out_fn
def _create_img(
self, fmt, size, subformat=None, options=None,
backing_file=None):
"""Create an image file of the given format and size.
:param fmt: The format to create
:param size: The size of the image to create in bytes
:param subformat: The subformat to use, if any
:param options: A dictionary of options to pass to the format
:param backing_file: The backing file to use, if any
"""
if fmt == 'iso':
return self._create_iso(size, subformat)
if fmt == 'vhd':
# QEMU calls the vhd format vpc
fmt = 'vpc'
# these tests depend on qemu-img being installed and in the path,
# if it is not installed, skip. we also need to ensure that the
# format is supported by qemu-img, this can vary depending on the
# distribution so we need to check if the format is supported via
# the help output.
try:
subprocess.check_output(
'qemu-img --help | grep %s' % fmt, shell=True)
except Exception:
self.skipTest(
'qemu-img not installed or does not support %s format' % fmt)
if options is None:
options = {}
opt = ''
prefix = TEST_IMAGE_PREFIX
if subformat:
options['subformat'] = subformat
prefix += subformat + '-'
if options:
opt += '-o ' + ','.join('%s=%s' % (k, v)
for k, v in options.items())
if backing_file is not None:
opt += ' -b %s -F raw' % backing_file
fn = tempfile.mktemp(prefix=prefix,
suffix='.%s' % fmt)
self._created_files.append(fn)
subprocess.check_output(
'qemu-img create -f %s %s %s %i' % (fmt, opt, fn, size),
shell=True)
return fn
def _create_allocated_vmdk(self, size_mb, subformat=None):
# We need a "big" VMDK file to exercise some parts of the code of the
# format_inspector. A way to create one is to first create an empty
# file, and then to convert it with the -S 0 option.
if subformat is None:
# Matches qemu-img default, see `qemu-img convert -O vmdk -o help`
subformat = 'monolithicSparse'
prefix = TEST_IMAGE_PREFIX
prefix += '-%s-' % subformat
fn = tempfile.mktemp(prefix=prefix, suffix='.vmdk')
self._created_files.append(fn)
raw = tempfile.mktemp(prefix=prefix, suffix='.raw')
self._created_files.append(raw)
# Create a file with pseudo-random data, otherwise it will get
# compressed in the streamOptimized format
subprocess.check_output(
'dd if=/dev/urandom of=%s bs=1M count=%i' % (raw, size_mb),
shell=True)
# Convert it to VMDK
subprocess.check_output(
'qemu-img convert -f raw -O vmdk -o subformat=%s -S 0 %s %s' % (
subformat, raw, fn),
shell=True)
return fn
def _test_format_at_block_size(self, format_name, img, block_size):
fmt = format_inspector.get_inspector(format_name)()
self.assertIsNotNone(fmt,
'Did not get format inspector for %s' % (
format_name))
wrapper = format_inspector.InfoWrapper(open(img, 'rb'), fmt)
while True:
chunk = wrapper.read(block_size)
if not chunk:
break
wrapper.close()
return fmt
def _test_format_at_image_size(self, format_name, image_size,
subformat=None):
"""Test the format inspector for the given format at the given image size.
:param format_name: The format to test
:param image_size: The size of the image to create in bytes
:param subformat: The subformat to use, if any
""" # noqa
img = self._create_img(format_name, image_size, subformat=subformat)
# Some formats have internal alignment restrictions making this not
# always exactly like image_size, so get the real value for comparison
virtual_size = get_size_from_qemu_img(img)
# Read the format in various sizes, some of which will read whole
# sections in a single read, others will be completely unaligned, etc.
block_sizes = [64 * units.Ki, 1 * units.Mi]
# ISO images have a 32KB system area at the beginning of the image
# as a result reading that in 17 or 512 byte blocks takes too long,
# causing the test to fail. The 64KiB block size is enough to read
# the system area and header in a single read. the 1MiB block size
# adds very little time to the test so we include it.
if format_name != 'iso':
block_sizes.extend([17, 512])
for block_size in block_sizes:
fmt = self._test_format_at_block_size(format_name, img, block_size)
self.assertTrue(fmt.format_match,
'Failed to match %s at size %i block %i' % (
format_name, image_size, block_size))
self.assertEqual(virtual_size, fmt.virtual_size,
('Failed to calculate size for %s at size %i '
'block %i') % (format_name, image_size,
block_size))
memory = sum(fmt.context_info.values())
self.assertLess(memory, 512 * units.Ki,
'Format used more than 512KiB of memory: %s' % (
fmt.context_info))
def _test_format(self, format_name, subformat=None):
# Try a few different image sizes, including some odd and very small
# sizes
for image_size in (512, 513, 2057, 7):
self._test_format_at_image_size(format_name, image_size * units.Mi,
subformat=subformat)
def test_qcow2(self):
self._test_format('qcow2')
def test_iso_9660(self):
self._test_format('iso', subformat='9660')