From f7c7ea935ac5acfc22decbb51da316837b77e69b Mon Sep 17 00:00:00 2001
From: Julia Kreger
Date: Thu, 8 Aug 2024 12:42:20 -0700
Subject: [PATCH] CVE-2024-44982: Harden all image handling and conversion code

It was recently learned by the OpenStack community that running qemu-img
on untrusted images without a format pre-specified can present a security
risk. Furthermore, some of these specific image formats have inherently
unsafe features. This is rooted in how qemu-img operates: all image
drivers are loaded and attempt to evaluate the input data. This can
result in several different vectors, which this patch works to close.

This change imports the qemu-img handling code from Ironic-Lib into
Ironic, along with image format inspection code which has been developed
by the wider community to validate the general safety of images before
converting them for use in a deployment.

This patch contains functional changes related to the hardening of these
calls, including how images are handled, and updates documentation to
provide context and guidance to operators.

Closes-Bug: 2071740
Change-Id: I7fac5c64f89aec39e9755f0930ee47ff8f7aed47
Signed-off-by: Julia Kreger
---
 doc/source/admin/security.rst                 |   67 ++
 doc/source/admin/troubleshooting.rst          |   24 +
 .../install/configure-glance-images.rst       |   22 +
 doc/source/user/creating-images.rst           |   33 +
 ironic/api/controllers/v1/ramdisk.py          |    2 +
 ironic/common/exception.py                    |    4 +
 ironic/common/image_format_inspector.py       | 1038 +++++++++++++++++
 ironic/common/images.py                       |  150 ++-
 ironic/common/qemu_img.py                     |   89 ++
 ironic/conf/__init__.py                       |    2 +
 ironic/conf/conductor.py                      |   43 +
 ironic/conf/disk_utils.py                     |   33 +
 ironic/conf/opts.py                           |    1 +
 .../playbooks/roles/deploy/tasks/write.yaml   |    2 +-
 ironic/drivers/modules/deploy_utils.py        |  202 +++-
 ironic/drivers/modules/image_cache.py         |   71 +-
 .../unit/api/controllers/v1/test_ramdisk.py   |    2 +
 .../unit/common/test_format_inspector.py      |  668 +++++++++++
 ironic/tests/unit/common/test_images.py       |  192 ++-
 ironic/tests/unit/common/test_qemu_img.py     |  146 +++
 .../unit/drivers/modules/test_deploy_utils.py |  517 +++++++-
 .../unit/drivers/modules/test_image_cache.py  |  245 +++-
 .../address-qemu-issues-1bbead8bb70b76fb.yaml |  108 ++
 23 files changed, 3502 insertions(+), 159 deletions(-)
 create mode 100644 ironic/common/image_format_inspector.py
 create mode 100644 ironic/common/qemu_img.py
 create mode 100644 ironic/conf/disk_utils.py
 create mode 100644 ironic/tests/unit/common/test_format_inspector.py
 create mode 100644 ironic/tests/unit/common/test_qemu_img.py
 create mode 100644 releasenotes/notes/address-qemu-issues-1bbead8bb70b76fb.yaml

diff --git a/doc/source/admin/security.rst b/doc/source/admin/security.rst
index c3b986f6e0..2ee591c523 100644
--- a/doc/source/admin/security.rst
+++ b/doc/source/admin/security.rst
@@ -275,3 +275,70 @@ An easy way to do this is to:
 
     # Access IPA ramdisk functions
     "baremetal:driver:ipa_lookup": "rule:is_admin"
+
+Disk Images
+===========
+
+Ironic relies upon the ``qemu-img`` tool to convert images from a supplied
+disk image format to a ``raw`` format in order to write the contents of a
+disk image to the remote device.
+
+By default, only the ``qcow2`` format is supported for this conversion;
+however, there have been reports that other formats work when enabled via
+the ``[conductor]permitted_image_formats`` configuration option.
+
+
+Ironic takes several steps by default:
+
+#. Ironic checks and compares supplied metadata with a remote authoritative
+   source, such as the Glance Image Service, if available.
+#. Ironic attempts to "fingerprint" the file type based upon available
+   metadata and file structure. A file format which is not known to the image
+   format inspection code may be evaluated as "raw", which means the image
+   would not be passed through ``qemu-img``. When in doubt, use a ``raw``
+   image which you can verify is in the desired and expected state.
+#. A set of safety and sanity checks is then executed against the image,
+   looking for unknown or unsafe feature usage in the base format which
+   could permit an attacker to leverage functionality in ``qemu-img`` which
+   should not be utilized. By default, this check occurs only for images
+   which pass *through* the conductor.
+#. Ironic then checks if the fingerprint values and metadata values match.
+   If they do not match, the requested image is rejected and the operation
+   fails.
+#. The image is then provided to the ``ironic-python-agent``.
+
+Images which are considered "pass-through", meaning they are supplied by an
+API user as a URL or are translated to a temporary URL via available
+service configuration, are passed as a URL to the
+``ironic-python-agent``.
+
+Ironic can be configured to intercept this interaction and have the conductor
+download and inspect these items before the ``ironic-python-agent`` does so;
+however, this can increase the temporary disk utilization of the conductor
+along with the network traffic needed to facilitate the transfer. This
+check is disabled by default, but can be enabled using the
+``[conductor]conductor_always_validates_images`` configuration option.
+
+Another option, which forces all files to be served from the conductor and
+thus forces image inspection before the ``ironic-python-agent`` becomes
+involved, is to set the ``[agent]image_download_source`` configuration
+parameter to ``local``, which proxies all disk images through the conductor.
+This setting is also available in the node ``driver_info`` and
+``instance_info`` fields.
+
+Mitigating Factors for Disk Images
+----------------------------------
+
+In a fully integrated OpenStack context, Ironic requires images to be set to
+"public" in the Image Service.
+
+A direct API user with sufficiently elevated access rights *can* set the
+``image_source`` key in the bare metal node ``instance_info`` dictionary
+field to a URL. To do so explicitly requires
+elevated (trusted) access rights of a System scoped Member,
+a Project scoped Owner-Member, or a Project scoped Lessee-Admin via
+the ``baremetal:node:update_instance_info`` policy permission rule.
+Before the Wallaby release of OpenStack, this was restricted to
+``admin`` and ``baremetal_admin`` roles and remains similarly restrictive
+in the newer "Secure RBAC" model.
+
diff --git a/doc/source/admin/troubleshooting.rst b/doc/source/admin/troubleshooting.rst
index 5645a0167d..8aa837eee8 100644
--- a/doc/source/admin/troubleshooting.rst
+++ b/doc/source/admin/troubleshooting.rst
@@ -1239,3 +1239,27 @@ If your bare metal management processes require that full machine management
 is made using a project scoped account, please configure an appropriate
 node ``owner`` for the nodes which need to be managed. Ironic recognizes
 this is going to vary based upon processes and preferences.
+
+Ironic says my Image is Invalid
+===============================
+
+As a result of security fixes added to Ironic to address the security
+posture of the ``qemu-img`` utility, Ironic enforces certain requirements
+on image files. Specifically, Ironic:
+
+* Enforces that the file format of a disk image matches what Ironic is
+  told by an API user. Any mismatch will result in the image being declared
+  invalid. A mismatch between the file contents and what is stored in the
+  Image service will necessitate uploading a new image, as that property
+  cannot be changed in the Image service *after* creation of an image.
+* Enforces that the input file format to be written is ``qcow2`` or ``raw``.
+  This can be extended by modifying ``[conductor]permitted_image_formats`` in
+  ``ironic.conf``.
+* Performs a safety and sanity check assessment against the file, which can
+  be disabled by setting ``[conductor]disable_deep_image_inspection`` to
+  ``True``. Doing so is not considered safe and should only be done by
+  operators accepting the inherent risk that the image they are attempting
+  to use may have a bad or malicious structure. Image safety checks are
+  generally performed as the deployment process begins and stages artifacts;
+  however, a late-stage check is performed when needed by the
+  ironic-python-agent.
diff --git a/doc/source/install/configure-glance-images.rst b/doc/source/install/configure-glance-images.rst
index 5cdecfad2c..26be019492 100644
--- a/doc/source/install/configure-glance-images.rst
+++ b/doc/source/install/configure-glance-images.rst
@@ -3,6 +3,24 @@
 Add images to the Image service
 ===============================
 
+Supported Image Formats
+~~~~~~~~~~~~~~~~~~~~~~~
+
+Ironic officially supports and tests the use of ``qcow2`` formatted images
+as well as ``raw`` format images. Other types of disk images, such as ``vdi``
+and single-file ``vmdk`` files, have been reported by users as working in
+their specific cases, but are not tested upstream. We advise operators to
+convert such images to a supported format before uploading them to Glance.
+
+Ironic enforces the list of supported and permitted image formats using the
+``[conductor]permitted_image_formats`` option in ironic.conf. This setting
+defaults to "raw", "qcow2", and "iso".
+
+A mismatch between the format declared in Glance and the format detected
+from the actual contents of the disk image file will result in a failed
+deployment. To correct such a situation, re-upload the image with either
+the declared ``--disk-format`` or the actual image file format corrected.
+
 Instance (end-user) images
 ~~~~~~~~~~~~~~~~~~~~~~~~~~
 
@@ -11,6 +29,10 @@ Build or download the user images as described in :doc:`/user/creating-images`.
 Load all the created images into the Image service, and note the image
 UUIDs in the Image service for each one as it is generated.
 
+.. note::
+   Images from Glance used by Ironic must be flagged as ``public``, which
+   requires administrative privileges with the Glance image service to set.
+
 - For *whole disk images* just upload the image:
 
   .. code-block:: console
diff --git a/doc/source/user/creating-images.rst b/doc/source/user/creating-images.rst
index efcd632a5f..acbd5b39bb 100644
--- a/doc/source/user/creating-images.rst
+++ b/doc/source/user/creating-images.rst
@@ -27,6 +27,39 @@ Many distributions publish their own cloud images. These are usually
 whole disk images that are built for legacy boot mode (not UEFI), with
 Ubuntu being an exception (they publish images that work in both modes).
 
+Supported Disk Image Formats
+----------------------------
+
+The following formats are tested by Ironic and are expected to work as
+long as no unknown or unsafe special features are being used:
+
+* raw - A file containing bytes as they would exist on a disk or other
+  block storage device. This is the simplest format.
+* qcow2 - An updated file format based upon the `QEMU `_
+  Copy-on-Write format.
+
+A special mention is warranted for ``iso`` formatted "CD" images. While
+Ironic uses the ISO9660 filesystem in some of its processes, such as virtual
+media, it does *not* support writing them to the remote block storage device.
+
+Image formats we believe may work due to third party reports, but do not test:
+
+* vmdk - A file format derived from the image format originally created
+  by VMware for their hypervisor product line. Specifically, we believe a
+  single-file VMDK formatted image should work; however, there are several
+  subformats, some of which will not work and may result in unexpected
+  behavior such as failed deployments.
+* vdi - A file format used by the
+  `Oracle VM Virtualbox `_ hypervisor.
+
+As Ironic does not support these formats, their usage is normally blocked by
+default due to security considerations. Please consult with your Ironic operator.
+
+It is important to highlight that Ironic enforces and matches the file type
+based upon its signature, not its file extension. If there is a mismatch,
+the input and/or remote service records, such as those in the Image service,
+must be corrected.
+
 disk-image-builder
 ------------------
diff --git a/ironic/api/controllers/v1/ramdisk.py b/ironic/api/controllers/v1/ramdisk.py
index 47b320b24c..ddf5fba82e 100644
--- a/ironic/api/controllers/v1/ramdisk.py
+++ b/ironic/api/controllers/v1/ramdisk.py
@@ -62,6 +62,8 @@ def config(token):
         # and behavior into place.
         'agent_token_required': True,
         'agent_md5_checksum_enable': CONF.agent.allow_md5_checksum,
+        'disable_deep_image_inspection': CONF.conductor.disable_deep_image_inspection,  # noqa
+        'permitted_image_formats': CONF.conductor.permitted_image_formats,
     }
 
 
diff --git a/ironic/common/exception.py b/ironic/common/exception.py
index 59d60f3ba0..e53ce472c5 100644
--- a/ironic/common/exception.py
+++ b/ironic/common/exception.py
@@ -893,3 +893,7 @@ class UnsupportedHardwareFeature(Invalid):
     _msg_fmt = _("Node %(node)s hardware does not support feature "
                  "%(feature)s, which is required based upon the "
                  "requested configuration.")
+
+
+class InvalidImage(ImageUnacceptable):
+    _msg_fmt = _("The requested image is not valid for use.")
diff --git a/ironic/common/image_format_inspector.py b/ironic/common/image_format_inspector.py
new file mode 100644
index 0000000000..4fd381c941
--- /dev/null
+++ b/ironic/common/image_format_inspector.py
@@ -0,0 +1,1038 @@
+# Copyright 2020 Red Hat, Inc
+# All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License"); you may
+# not use this file except in compliance with the License. You may obtain
+# a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+# License for the specific language governing permissions and limitations
+# under the License.
+ +""" +This is a python implementation of virtual disk format inspection routines +gathered from various public specification documents, as well as qemu disk +driver code. It attempts to store and parse the minimum amount of data +required, and in a streaming-friendly manner to collect metadata about +complex-format images. +""" + +import struct + +from oslo_log import log as logging +from oslo_utils import units + +LOG = logging.getLogger(__name__) + + +def chunked_reader(fileobj, chunk_size=512): + while True: + chunk = fileobj.read(chunk_size) + if not chunk: + break + yield chunk + + +class CaptureRegion(object): + """Represents a region of a file we want to capture. + + A region of a file we want to capture requires a byte offset into + the file and a length. This is expected to be used by a data + processing loop, calling capture() with the most recently-read + chunk. This class handles the task of grabbing the desired region + of data across potentially multiple fractional and unaligned reads. + + :param offset: Byte offset into the file starting the region + :param length: The length of the region + """ + + def __init__(self, offset, length): + self.offset = offset + self.length = length + self.data = b'' + + @property + def complete(self): + """Returns True when we have captured the desired data.""" + return self.length == len(self.data) + + def capture(self, chunk, current_position): + """Process a chunk of data. + + This should be called for each chunk in the read loop, at least + until complete returns True. + + :param chunk: A chunk of bytes in the file + :param current_position: The position of the file processed by the + read loop so far. Note that this will be + the position in the file *after* the chunk + being presented. + """ + read_start = current_position - len(chunk) + if (read_start <= self.offset <= current_position + or self.offset <= read_start <= (self.offset + self.length)): + if read_start < self.offset: + lead_gap = self.offset - read_start + else: + lead_gap = 0 + self.data += chunk[lead_gap:] + self.data = self.data[:self.length] + + +class ImageFormatError(Exception): + """An unrecoverable image format error that aborts the process.""" + pass + + +class TraceDisabled(object): + """A logger-like thing that swallows tracing when we do not want it.""" + + def debug(self, *a, **k): + pass + + info = debug + warning = debug + error = debug + + +class FileInspector(object): + """A stream-based disk image inspector. + + This base class works on raw images and is subclassed for more + complex types. It is to be presented with the file to be examined + one chunk at a time, during read processing and will only store + as much data as necessary to determine required attributes of + the file. + """ + + def __init__(self, tracing=False): + self._total_count = 0 + + # NOTE(danms): The logging in here is extremely verbose for a reason, + # but should never really be enabled at that level at runtime. To + # retain all that work and assist in future debug, we have a separate + # debug flag that can be passed from a manual tool to turn it on. 
+ if tracing: + self._log = logging.getLogger(str(self)) + else: + self._log = TraceDisabled() + self._capture_regions = {} + + def _capture(self, chunk, only=None): + for name, region in self._capture_regions.items(): + if only and name not in only: + continue + if not region.complete: + region.capture(chunk, self._total_count) + + def eat_chunk(self, chunk): + """Call this to present chunks of the file to the inspector.""" + pre_regions = set(self._capture_regions.keys()) + + # Increment our position-in-file counter + self._total_count += len(chunk) + + # Run through the regions we know of to see if they want this + # data + self._capture(chunk) + + # Let the format do some post-read processing of the stream + self.post_process() + + # Check to see if the post-read processing added new regions + # which may require the current chunk. + new_regions = set(self._capture_regions.keys()) - pre_regions + if new_regions: + self._capture(chunk, only=new_regions) + + def post_process(self): + """Post-read hook to process what has been read so far. + + This will be called after each chunk is read and potentially captured + by the defined regions. If any regions are defined by this call, + those regions will be presented with the current chunk in case it + is within one of the new regions. + """ + pass + + def region(self, name): + """Get a CaptureRegion by name.""" + return self._capture_regions[name] + + def new_region(self, name, region): + """Add a new CaptureRegion by name.""" + if self.has_region(name): + # This is a bug, we tried to add the same region twice + raise ImageFormatError('Inspector re-added region %s' % name) + self._capture_regions[name] = region + + def has_region(self, name): + """Returns True if named region has been defined.""" + return name in self._capture_regions + + @property + def format_match(self): + """Returns True if the file appears to be the expected format.""" + return True + + @property + def virtual_size(self): + """Returns the virtual size of the disk image, or zero if unknown.""" + return self._total_count + + @property + def actual_size(self): + """Returns the total size of the file, usually smaller than virtual_size. + + NOTE: this will only be accurate if the entire file is read and processed. + """ # noqa + return self._total_count + + @property + def complete(self): + """Returns True if we have all the information needed.""" + return all(r.complete for r in self._capture_regions.values()) + + def __str__(self): + """The string name of this file format.""" + return 'raw' + + @property + def context_info(self): + """Return info on amount of data held in memory for auditing. + + This is a dict of region:sizeinbytes items that the inspector + uses to examine the file. + """ + return {name: len(region.data) for name, region in + self._capture_regions.items()} + + @classmethod + def from_file(cls, filename): + """Read as much of a file as necessary to complete inspection. + + NOTE: Because we only read as much of the file as necessary, the + actual_size property will not reflect the size of the file, but the + amount of data we read before we satisfied the inspector. + + Raises ImageFormatError if we cannot parse the file. 
+ """ + inspector = cls() + with open(filename, 'rb') as f: + for chunk in chunked_reader(f): + inspector.eat_chunk(chunk) + if inspector.complete: + # No need to eat any more data + break + if not inspector.complete or not inspector.format_match: + raise ImageFormatError('File is not in requested format') + return inspector + + def safety_check(self): + """Perform some checks to determine if this file is safe. + + Returns True if safe, False otherwise. It may raise ImageFormatError + if safety cannot be guaranteed because of parsing or other errors. + """ + return True + + +# The qcow2 format consists of a big-endian 72-byte header, of which +# only a small portion has information we care about: +# +# Dec Hex Name +# 0 0x00 Magic 4-bytes 'QFI\xfb' +# 4 0x04 Version (uint32_t, should always be 2 for modern files) +# . . . +# 8 0x08 Backing file offset (uint64_t) +# 24 0x18 Size in bytes (unint64_t) +# . . . +# 72 0x48 Incompatible features bitfield (6 bytes) +# +# https://gitlab.com/qemu-project/qemu/-/blob/master/docs/interop/qcow2.txt +class QcowInspector(FileInspector): + """QEMU QCOW2 Format + + This should only require about 32 bytes of the beginning of the file + to determine the virtual size, and 104 bytes to perform the safety check. + """ + + BF_OFFSET = 0x08 + BF_OFFSET_LEN = 8 + I_FEATURES = 0x48 + I_FEATURES_LEN = 8 + I_FEATURES_DATAFILE_BIT = 3 + I_FEATURES_MAX_BIT = 4 + + def __init__(self, *a, **k): + super(QcowInspector, self).__init__(*a, **k) + self.new_region('header', CaptureRegion(0, 512)) + + def _qcow_header_data(self): + magic, version, bf_offset, bf_sz, cluster_bits, size = ( + struct.unpack('>4sIQIIQ', self.region('header').data[:32])) + return magic, size + + @property + def has_header(self): + return self.region('header').complete + + @property + def virtual_size(self): + if not self.region('header').complete: + return 0 + if not self.format_match: + return 0 + magic, size = self._qcow_header_data() + return size + + @property + def format_match(self): + if not self.region('header').complete: + return False + magic, size = self._qcow_header_data() + return magic == b'QFI\xFB' + + @property + def has_backing_file(self): + if not self.region('header').complete: + return None + if not self.format_match: + return False + bf_offset_bytes = self.region('header').data[ + self.BF_OFFSET:self.BF_OFFSET + self.BF_OFFSET_LEN] + # nonzero means "has a backing file" + bf_offset, = struct.unpack('>Q', bf_offset_bytes) + return bf_offset != 0 + + @property + def has_unknown_features(self): + if not self.region('header').complete: + return None + if not self.format_match: + return False + i_features = self.region('header').data[ + self.I_FEATURES:self.I_FEATURES + self.I_FEATURES_LEN] + + # This is the maximum byte number we should expect any bits to be set + max_byte = self.I_FEATURES_MAX_BIT // 8 + + # The flag bytes are in big-endian ordering, so if we process + # them in index-order, they're reversed + for i, byte_num in enumerate(reversed(range(self.I_FEATURES_LEN))): + if byte_num == max_byte: + # If we're in the max-allowed byte, allow any bits less than + # the maximum-known feature flag bit to be set + allow_mask = ((1 << self.I_FEATURES_MAX_BIT) - 1) + elif byte_num > max_byte: + # If we're above the byte with the maximum known feature flag + # bit, then we expect all zeroes + allow_mask = 0x0 + else: + # Any earlier-than-the-maximum byte can have any of the flag + # bits set + allow_mask = 0xFF + + if i_features[i] & ~allow_mask: + LOG.warning('Found unknown feature 
bit in byte %i: %s/%s', + byte_num, bin(i_features[byte_num] & ~allow_mask), + bin(allow_mask)) + return True + + return False + + @property + def has_data_file(self): + if not self.region('header').complete: + return None + if not self.format_match: + return False + i_features = self.region('header').data[ + self.I_FEATURES:self.I_FEATURES + self.I_FEATURES_LEN] + + # First byte of bitfield, which is i_features[7] + byte = self.I_FEATURES_LEN - 1 - self.I_FEATURES_DATAFILE_BIT // 8 + # Third bit of bitfield, which is 0x04 + bit = 1 << (self.I_FEATURES_DATAFILE_BIT - 1 % 8) + return bool(i_features[byte] & bit) + + def __str__(self): + return 'qcow2' + + def safety_check(self): + return (not self.has_backing_file + and not self.has_data_file + and not self.has_unknown_features) + + +class QEDInspector(FileInspector): + def __init__(self, tracing=False): + super().__init__(tracing) + self.new_region('header', CaptureRegion(0, 512)) + + @property + def format_match(self): + if not self.region('header').complete: + return False + return self.region('header').data.startswith(b'QED\x00') + + def safety_check(self): + # QED format is not supported by anyone, but we want to detect it + # and mark it as just always unsafe. + return False + + +# The VHD (or VPC as QEMU calls it) format consists of a big-endian +# 512-byte "footer" at the beginning of the file with various +# information, most of which does not matter to us: +# +# Dec Hex Name +# 0 0x00 Magic string (8-bytes, always 'conectix') +# 40 0x28 Disk size (uint64_t) +# +# https://github.com/qemu/qemu/blob/master/block/vpc.c +class VHDInspector(FileInspector): + """Connectix/MS VPC VHD Format + + This should only require about 512 bytes of the beginning of the file + to determine the virtual size. + """ + + def __init__(self, *a, **k): + super(VHDInspector, self).__init__(*a, **k) + self.new_region('header', CaptureRegion(0, 512)) + + @property + def format_match(self): + return self.region('header').data.startswith(b'conectix') + + @property + def virtual_size(self): + if not self.region('header').complete: + return 0 + + if not self.format_match: + return 0 + + return struct.unpack('>Q', self.region('header').data[40:48])[0] + + def __str__(self): + return 'vhd' + + +# The VHDX format consists of a complex dynamic little-endian +# structure with multiple regions of metadata and data, linked by +# offsets with in the file (and within regions), identified by MSFT +# GUID strings. The header is a 320KiB structure, only a few pieces of +# which we actually need to capture and interpret: +# +# Dec Hex Name +# 0 0x00000 Identity (Technically 9-bytes, padded to 64KiB, the first +# 8 bytes of which are 'vhdxfile') +# 196608 0x30000 The Region table (64KiB of a 32-byte header, followed +# by up to 2047 36-byte region table entry structures) +# +# The region table header includes two items we need to read and parse, +# which are: +# +# 196608 0x30000 4-byte signature ('regi') +# 196616 0x30008 Entry count (uint32-t) +# +# The region table entries follow the region table header immediately +# and are identified by a 16-byte GUID, and provide an offset of the +# start of that region. We care about the "metadata region", identified +# by the METAREGION class variable. 
The region table entry is (offsets +# from the beginning of the entry, since it could be in multiple places): +# +# 0 0x00000 16-byte MSFT GUID +# 16 0x00010 Offset of the actual metadata region (uint64_t) +# +# When we find the METAREGION table entry, we need to grab that offset +# and start examining the region structure at that point. That +# consists of a metadata table of structures, which point to places in +# the data in an unstructured space that follows. The header is +# (offsets relative to the region start): +# +# 0 0x00000 8-byte signature ('metadata') +# . . . +# 16 0x00010 2-byte entry count (up to 2047 entries max) +# +# This header is followed by the specified number of metadata entry +# structures, identified by GUID: +# +# 0 0x00000 16-byte MSFT GUID +# 16 0x00010 4-byte offset (uint32_t, relative to the beginning of +# the metadata region) +# +# We need to find the "Virtual Disk Size" metadata item, identified by +# the GUID in the VIRTUAL_DISK_SIZE class variable, grab the offset, +# add it to the offset of the metadata region, and examine that 8-byte +# chunk of data that follows. +# +# The "Virtual Disk Size" is a naked uint64_t which contains the size +# of the virtual disk, and is our ultimate target here. +# +# https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-vhdx/83e061f8-f6e2-4de1-91bd-5d518a43d477 +class VHDXInspector(FileInspector): + """MS VHDX Format + + This requires some complex parsing of the stream. The first 256KiB + of the image is stored to get the header and region information, + and then we capture the first metadata region to read those + records, find the location of the virtual size data and parse + it. This needs to store the metadata table entries up until the + VDS record, which may consist of up to 2047 32-byte entries at + max. Finally, it must store a chunk of data at the offset of the + actual VDS uint64. + + """ + METAREGION = '8B7CA206-4790-4B9A-B8FE-575F050F886E' + VIRTUAL_DISK_SIZE = '2FA54224-CD1B-4876-B211-5DBED83BF4B8' + VHDX_METADATA_TABLE_MAX_SIZE = 32 * 2048 # From qemu + + def __init__(self, *a, **k): + super(VHDXInspector, self).__init__(*a, **k) + self.new_region('ident', CaptureRegion(0, 32)) + self.new_region('header', CaptureRegion(192 * 1024, 64 * 1024)) + + def post_process(self): + # After reading a chunk, we may have the following conditions: + # + # 1. We may have just completed the header region, and if so, + # we need to immediately read and calculate the location of + # the metadata region, as it may be starting in the same + # read we just did. + # 2. We may have just completed the metadata region, and if so, + # we need to immediately calculate the location of the + # "virtual disk size" record, as it may be starting in the + # same read we just did. 
+ if self.region('header').complete and not self.has_region('metadata'): + region = self._find_meta_region() + if region: + self.new_region('metadata', region) + elif self.has_region('metadata') and not self.has_region('vds'): + region = self._find_meta_entry(self.VIRTUAL_DISK_SIZE) + if region: + self.new_region('vds', region) + + @property + def format_match(self): + return self.region('ident').data.startswith(b'vhdxfile') + + @staticmethod + def _guid(buf): + """Format a MSFT GUID from the 16-byte input buffer.""" + guid_format = '= 2048: + raise ImageFormatError('Region count is %i (limit 2047)' % count) + + # Process the regions until we find the metadata one; grab the + # offset and return + self._log.debug('Region entry first is %x', region_entry_first) + self._log.debug('Region entries %i', count) + meta_offset = 0 + for i in range(0, count): + entry_start = region_entry_first + (i * 32) + entry_end = entry_start + 32 + entry = self.region('header').data[entry_start:entry_end] + self._log.debug('Entry offset is %x', entry_start) + + # GUID is the first 16 bytes + guid = self._guid(entry[:16]) + if guid == self.METAREGION: + # This entry is the metadata region entry + meta_offset, meta_len, meta_req = struct.unpack( + '= 2048: + raise ImageFormatError( + 'Metadata item count is %i (limit 2047)' % count) + + for i in range(0, count): + entry_offset = 32 + (i * 32) + guid = self._guid(meta_buffer[entry_offset:entry_offset + 16]) + if guid == desired_guid: + # Found the item we are looking for by id. + # Stop our region from capturing + item_offset, item_length, _reserved = struct.unpack( + ' 1: + all_formats = [str(inspector) for inspector in detections] + raise ImageFormatError( + 'Multiple formats detected: %s' % ', '.join(all_formats)) + + return inspectors['raw'] if not detections else detections[0] diff --git a/ironic/common/images.py b/ironic/common/images.py index 9947927bc1..0f070684b5 100644 --- a/ironic/common/images.py +++ b/ironic/common/images.py @@ -23,7 +23,6 @@ import os import shutil import time -from ironic_lib import qemu_img from oslo_concurrency import processutils from oslo_log import log as logging from oslo_utils import fileutils @@ -32,7 +31,9 @@ import pycdlib from ironic.common import exception from ironic.common.glance_service import service_utils as glance_utils from ironic.common.i18n import _ +from ironic.common import image_format_inspector from ironic.common import image_service as service +from ironic.common import qemu_img from ironic.common import utils from ironic.conf import CONF @@ -388,28 +389,18 @@ def fetch_into(context, image_href, image_file): def fetch(context, image_href, path, force_raw=False): with fileutils.remove_path_on_error(path): fetch_into(context, image_href, path) - if force_raw: image_to_raw(image_href, path, "%s.part" % path) def get_source_format(image_href, path): - data = qemu_img.image_info(path) - - fmt = data.file_format - if fmt is None: + try: + img_format = image_format_inspector.detect_file_format(path) + except image_format_inspector.ImageFormatError: raise exception.ImageUnacceptable( - reason=_("'qemu-img info' parsing failed."), + reason=_("parsing of the image failed."), image_id=image_href) - - backing_file = data.backing_file - if backing_file is not None: - raise exception.ImageUnacceptable( - image_id=image_href, - reason=_("fmt=%(fmt)s backed by: %(backing_file)s") % - {'fmt': fmt, 'backing_file': backing_file}) - - return fmt + return str(img_format) def force_raw_will_convert(image_href, path_tmp): @@ 
-422,24 +413,46 @@ def image_to_raw(image_href, path, path_tmp):
     with fileutils.remove_path_on_error(path_tmp):
-        fmt = get_source_format(image_href, path_tmp)
+        if not CONF.conductor.disable_deep_image_inspection:
+            fmt = safety_check_image(path_tmp)
 
-        if fmt != "raw":
+            if fmt not in CONF.conductor.permitted_image_formats:
+                LOG.error("Security: The requested image %(image_href)s "
+                          "is of format image %(format)s and is not in "
+                          "the [conductor]permitted_image_formats list.",
+                          {'image_href': image_href,
+                           'format': fmt})
+                raise exception.InvalidImage()
+        else:
+            fmt = get_source_format(image_href, path)
+            LOG.warning("Security: Image safety checking has been disabled. "
+                        "This is an unsafe operation. Attempting to continue "
+                        "with the detected format %(img_fmt)s for %(path)s.",
+                        {'img_fmt': fmt,
+                         'path': path})
+
+        if fmt != "raw" and fmt != "iso":
+            # When the target format is NOT raw, we need to convert it.
+            # However, we neither need nor want to do that when we have
+            # an ISO image. If we have an ISO because it was requested,
+            # we have correctly fingerprinted it. Prior to proper
+            # image detection, we thought we had a raw image, and we
+            # would end up asking for a raw image to be made a raw image.
             staged = "%s.converted" % path
             utils.is_memory_insufficient(raise_if_fail=True)
             LOG.debug("%(image)s was %(format)s, converting to raw",
                       {'image': image_href, 'format': fmt})
             with fileutils.remove_path_on_error(staged):
-                qemu_img.convert_image(path_tmp, staged, 'raw')
+                qemu_img.convert_image(path_tmp, staged, 'raw',
+                                       source_format=fmt)
                 os.unlink(path_tmp)
-
-                data = qemu_img.image_info(staged)
-                if data.file_format != "raw":
+                new_fmt = get_source_format(image_href, staged)
+                if new_fmt != "raw":
                     raise exception.ImageConvertFailed(
                         image_id=image_href,
                         reason=_("Converted to raw, but format is "
-                                 "now %s") % data.file_format)
+                                 "now %s") % new_fmt)
 
                 os.rename(staged, path)
         else:
@@ -470,7 +483,7 @@ def converted_size(path, estimate=False):
         the original image scaled by the configuration value
         `raw_image_growth_factor`.
     """
-    data = qemu_img.image_info(path)
+    data = image_format_inspector.detect_file_format(path)
     if not estimate:
         return data.virtual_size
     growth_factor = CONF.raw_image_growth_factor
@@ -787,3 +800,92 @@ def _get_deploy_iso_files(deploy_iso, mountdir):
     # present in deploy iso. This path varies for different OS vendors.
     # e_img_rel_path: is required by mkisofs to generate boot iso.
     return uefi_path_info, e_img_rel_path, grub_rel_path
+
+
+def __node_or_image_cache(node):
+    """A helper for logging to determine if image cache or node uuid."""
+    if not node:
+        return 'image cache'
+    else:
+        return node.uuid
+
+
+def safety_check_image(image_path, node=None):
+    """Performs a safety check on the supplied image.
+
+    This method triggers the image format inspector to both identify the
+    type of the supplied file and run safety check logic to identify if
+    there are any known unsafe features being leveraged, and returns the
+    detected file format in the form of a string to the caller.
+
+    :param image_path: A fully qualified path to an image which needs to
+                       be evaluated for safety.
+    :param node: A Node object, optional. When supplied, logging indicates
+                 the node which triggered this issue, but the node is not
+                 available in all invocation cases.
+    :returns: a string representing the detected image type.
+ :raises: InvalidImage when the supplied image is detected as unsafe, + or the image format inspector has failed to parse the supplied + image's contents. + """ + id_string = __node_or_image_cache(node) + try: + img_class = image_format_inspector.detect_file_format(image_path) + if not img_class.safety_check(): + LOG.error("Security: The requested image for " + "deployment of node %(node)s fails safety sanity " + "checking.", + {'node': id_string}) + raise exception.InvalidImage() + image_format_name = str(img_class) + except image_format_inspector.ImageFormatError: + LOG.error("Security: The requested user image for the " + "deployment node %(node)s failed to be able " + "to be parsed by the image format checker.", + {'node': id_string}) + raise exception.InvalidImage() + return image_format_name + + +def check_if_image_format_is_permitted(img_format, + expected_format=None, + node=None): + """Checks image format consistency. + + :params img_format: The determined image format by name. + :params expected_format: Optional, the expected format based upon + supplied configuration values. + :params node: A node object or None implying image cache. + :raises: InvalidImage if the requested image format is not permitted + by configuration, or the expected_format does not match the + determined format. + """ + + id_string = __node_or_image_cache(node) + if img_format not in CONF.conductor.permitted_image_formats: + LOG.error("Security: The requested deploy image for node %(node)s " + "is of format image %(format)s and is not in the " + "[conductor]permitted_image_formats list.", + {'node': id_string, + 'format': img_format}) + raise exception.InvalidImage() + if expected_format is not None and img_format != expected_format: + if expected_format in ['ari', 'aki']: + # In this case, we have an ari or aki, meaning we're pulling + # down a kernel/ramdisk, and this is rooted in a misunderstanding. + # They should be raw. The detector should be detecting this *as* + # raw anyway, so the data just mismatches from a common + # misunderstanding, and that is okay in this case as they are not + # passed to qemu-img. + # TODO(TheJulia): Add a log entry to warn here at some point in + # the future as we begin to shift the perception around this. + # See: https://bugs.launchpad.net/ironic/+bug/2074090 + return + LOG.error("Security: The requested deploy image for node %(node)s " + "has a format (%(format)s) which does not match the " + "expected image format (%(expected)s) based upon " + "supplied or retrieved information.", + {'node': id_string, + 'format': img_format, + 'expected': expected_format}) + raise exception.InvalidImage() diff --git a/ironic/common/qemu_img.py b/ironic/common/qemu_img.py new file mode 100644 index 0000000000..345afec1df --- /dev/null +++ b/ironic/common/qemu_img.py @@ -0,0 +1,89 @@ +# Licensed under the Apache License, Version 2.0 (the "License"); you may +# not use this file except in compliance with the License. You may obtain +# a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT +# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the +# License for the specific language governing permissions and limitations +# under the License. 
+ +import logging + +from oslo_concurrency import processutils +from oslo_utils import units +import tenacity + +from ironic.common import utils +from ironic.conf import CONF + + +LOG = logging.getLogger(__name__) + +# Limit the memory address space to 1 GiB when running qemu-img +QEMU_IMG_LIMITS = None + + +def _qemu_img_limits(): + # NOTE(TheJulia): If you make *any* chance to this code, you may need + # to make an identitical or similar change to ironic-python-agent. + global QEMU_IMG_LIMITS + if QEMU_IMG_LIMITS is None: + QEMU_IMG_LIMITS = processutils.ProcessLimits( + address_space=CONF.disk_utils.image_convert_memory_limit + * units.Mi) + return QEMU_IMG_LIMITS + + +def _retry_on_res_temp_unavailable(exc): + if (isinstance(exc, processutils.ProcessExecutionError) + and ('Resource temporarily unavailable' in exc.stderr + or 'Cannot allocate memory' in exc.stderr)): + return True + return False + + +@tenacity.retry( + retry=tenacity.retry_if_exception(_retry_on_res_temp_unavailable), + stop=tenacity.stop_after_attempt(CONF.disk_utils.image_convert_attempts), + reraise=True) +def convert_image(source, dest, out_format, run_as_root=False, cache=None, + out_of_order=False, sparse_size=None, + source_format='qcow2'): + # NOTE(TheJulia): If you make *any* chance to this code, you may need + # to make an identitical or similar change to ironic-python-agent. + """Convert image to other format.""" + cmd = ['qemu-img', 'convert', '-f', source_format, '-O', out_format] + if cache is not None: + cmd += ['-t', cache] + if sparse_size is not None: + cmd += ['-S', sparse_size] + if out_of_order: + cmd.append('-W') + cmd += [source, dest] + # NOTE(TheJulia): Statically set the MALLOC_ARENA_MAX to prevent leaking + # and the creation of new malloc arenas which will consume the system + # memory. If limited to 1, qemu-img consumes ~250 MB of RAM, but when + # another thread tries to access a locked section of memory in use with + # another thread, then by default a new malloc arena is created, + # which essentially balloons the memory requirement of the machine. + # Default for qemu-img is 8 * nCPU * ~250MB (based on defaults + + # thread/code/process/library overhead. In other words, 64 GB. Limiting + # this to 3 keeps the memory utilization in happy cases below the overall + # threshold which is in place in case a malicious image is attempted to + # be passed through qemu-img. + env_vars = {'MALLOC_ARENA_MAX': '3'} + try: + utils.execute(*cmd, run_as_root=run_as_root, + prlimit=_qemu_img_limits(), + use_standard_locale=True, + env_variables=env_vars) + except processutils.ProcessExecutionError as e: + if ('Resource temporarily unavailable' in e.stderr + or 'Cannot allocate memory' in e.stderr): + LOG.debug('Failed to convert image, retrying. 
Error: %s', e) + # Sync disk caches before the next attempt + utils.execute('sync') + raise diff --git a/ironic/conf/__init__.py b/ironic/conf/__init__.py index 648395362a..3030eb901d 100644 --- a/ironic/conf/__init__.py +++ b/ironic/conf/__init__.py @@ -27,6 +27,7 @@ from ironic.conf import database from ironic.conf import default from ironic.conf import deploy from ironic.conf import dhcp +from ironic.conf import disk_utils from ironic.conf import dnsmasq from ironic.conf import drac from ironic.conf import fake @@ -66,6 +67,7 @@ default.register_opts(CONF) deploy.register_opts(CONF) drac.register_opts(CONF) dhcp.register_opts(CONF) +disk_utils.register_opts(CONF) dnsmasq.register_opts(CONF) fake.register_opts(CONF) glance.register_opts(CONF) diff --git a/ironic/conf/conductor.py b/ironic/conf/conductor.py index 01f385ba6f..fba120be98 100644 --- a/ironic/conf/conductor.py +++ b/ironic/conf/conductor.py @@ -417,6 +417,49 @@ opts = [ 'seconds, or 30 minutes. If you need to wait longer ' 'than the maximum value, we recommend exploring ' 'hold steps.')), + cfg.BoolOpt('disable_deep_image_inspection', + default=False, + # Normally such an option would be mutable, but this is, + # a security guard and operators should not expect to change + # this option under normal circumstances. + mutable=False, + help=_('Security Option to permit an operator to disable ' + 'file content inspections. Under normal conditions, ' + 'the conductor will inspect requested image contents ' + 'which are transferred through the conductor. ' + 'Disabling this option is not advisable and opens ' + 'the risk of unsafe images being processed which may ' + 'allow an attacker to leverage unsafe features in ' + 'various disk image formats to perform a variety of ' + 'unsafe and potentially compromising actions. ' + 'This option is *not* mutable, and ' + 'requires a service restart to change.')), + cfg.BoolOpt('conductor_always_validates_images', + default=False, + # Normally mutable, however from a security context we do want + # all logging to be generated from this option to be changed, + # and as such is set to False to force a conductor restart. + mutable=False, + help=_('Security Option to enable the conductor to *always* ' + 'inspect the image content of any requested deploy, ' + 'even if the deployment would have normally bypassed ' + 'the conductor\'s cache. When this is set to False, ' + 'the Ironic-Python-Agent is responsible ' + 'for any necessary image checks. Setting this to ' + 'True will result in a higher utilization of ' + 'resources (disk space, network traffic) ' + 'as the conductor will evaluate *all* images. ' + 'This option is *not* mutable, and requires a ' + 'service restart to change. This option requires ' + '[conductor]disable_deep_image_inspection to be set ' + 'to False.')), + cfg.ListOpt('permitted_image_formats', + default=['raw', 'qcow2', 'iso'], + mutable=True, + help=_('The supported list of image formats which are ' + 'permitted for deployment with Ironic. If an image ' + 'format outside of this list is detected, the image ' + 'validation logic will fail the deployment process.')), ] diff --git a/ironic/conf/disk_utils.py b/ironic/conf/disk_utils.py new file mode 100644 index 0000000000..d34796e071 --- /dev/null +++ b/ironic/conf/disk_utils.py @@ -0,0 +1,33 @@ +# Licensed under the Apache License, Version 2.0 (the "License"); you may +# not use this file except in compliance with the License. 
You may obtain +# a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT +# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the +# License for the specific language governing permissions and limitations +# under the License. + +from oslo_config import cfg + + +# NOTE(TheJulia): If you make *any* chance to this code, you may need +# to make an identitical or similar change to ironic-python-agent. +# These options were originally taken from ironic-lib upon the decision +# to move the qemu-img image conversion calls into the projects in +# order to simplify fixes related to them. +opts = [ + cfg.IntOpt('image_convert_memory_limit', + default=2048, + help='Memory limit for "qemu-img convert" in MiB. Implemented ' + 'via the address space resource limit.'), + cfg.IntOpt('image_convert_attempts', + default=3, + help='Number of attempts to convert an image.'), +] + + +def register_opts(conf): + conf.register_opts(opts, group='disk_utils') diff --git a/ironic/conf/opts.py b/ironic/conf/opts.py index 0d8aeafd20..413caf5850 100644 --- a/ironic/conf/opts.py +++ b/ironic/conf/opts.py @@ -27,6 +27,7 @@ _opts = [ ('database', ironic.conf.database.opts), ('deploy', ironic.conf.deploy.opts), ('dhcp', ironic.conf.dhcp.opts), + ('disk_utils', ironic.conf.disk_utils.opts), ('drac', ironic.conf.drac.opts), ('glance', ironic.conf.glance.list_opts()), ('healthcheck', ironic.conf.healthcheck.opts), diff --git a/ironic/drivers/modules/ansible/playbooks/roles/deploy/tasks/write.yaml b/ironic/drivers/modules/ansible/playbooks/roles/deploy/tasks/write.yaml index ed0cc85b67..4a960ac8dd 100644 --- a/ironic/drivers/modules/ansible/playbooks/roles/deploy/tasks/write.yaml +++ b/ironic/drivers/modules/ansible/playbooks/roles/deploy/tasks/write.yaml @@ -1,6 +1,6 @@ - name: convert and write become: yes - command: qemu-img convert -t directsync -O host_device /tmp/{{ inventory_hostname }}.img {{ ironic_image_target }} + command: qemu-img convert -f {{ ironic.image.disk_format }} -t directsync -O host_device /tmp/{{ inventory_hostname }}.img {{ ironic_image_target }} async: 1200 poll: 10 when: ironic.image.disk_format != 'raw' diff --git a/ironic/drivers/modules/deploy_utils.py b/ironic/drivers/modules/deploy_utils.py index cc5684ab9b..5bcf199b47 100644 --- a/ironic/drivers/modules/deploy_utils.py +++ b/ironic/drivers/modules/deploy_utils.py @@ -26,6 +26,7 @@ from oslo_utils import fileutils from oslo_utils import strutils from ironic.common import async_steps +from ironic.common import context from ironic.common import exception from ironic.common import faults from ironic.common.glance_service import service_utils @@ -204,7 +205,8 @@ def check_for_missing_params(info_dict, error_msg, param_prefix=''): 'missing_info': missing_info}) -def fetch_images(ctx, cache, images_info, force_raw=True): +def fetch_images(ctx, cache, images_info, force_raw=True, + expected_format=None): """Check for available disk space and fetch images using ImageCache. :param ctx: context @@ -212,7 +214,11 @@ def fetch_images(ctx, cache, images_info, force_raw=True): :param images_info: list of tuples (image href, destination path) :param force_raw: boolean value, whether to convert the image to raw format + :param expected_format: The expected format of the image. 
:raises: InstanceDeployFailure if unable to find enough disk space + :raises: InvalidImage if the supplied image metadata or contents are + deemed to be invalid, unsafe, or not matching the expectations + asserted by configuration supplied or set. """ try: @@ -224,8 +230,14 @@ def fetch_images(ctx, cache, images_info, force_raw=True): # if disk space is used between the check and actual download. # This is probably unavoidable, as we can't control other # (probably unrelated) processes + image_list = [] for href, path in images_info: - cache.fetch_image(href, path, ctx=ctx, force_raw=force_raw) + # NOTE(TheJulia): Href in this case can be an image UUID or a URL. + image_format = cache.fetch_image(href, path, ctx=ctx, + force_raw=force_raw, + expected_format=expected_format) + image_list.append((href, path, image_format)) + return image_list def set_failed_state(task, msg, collect_logs=True): @@ -1060,7 +1072,7 @@ class InstanceImageCache(image_cache.ImageCache): @METRICS.timer('cache_instance_image') -def cache_instance_image(ctx, node, force_raw=None): +def cache_instance_image(ctx, node, force_raw=None, expected_format=None): """Fetch the instance's image from Glance This method pulls the disk image and writes them to the appropriate @@ -1069,8 +1081,12 @@ def cache_instance_image(ctx, node, force_raw=None): :param ctx: context :param node: an ironic node object :param force_raw: whether convert image to raw format + :param expected_format: The expected format of the disk image contents. :returns: a tuple containing the uuid of the image and the path in the filesystem where image is cached. + :raises: InvalidImage if the requested image is invalid and cannot be + used for deployed based upon contents of the image or the metadata + surrounding the image not matching the configured image. """ # NOTE(dtantsur): applying the default here to make the option mutable if force_raw is None: @@ -1084,10 +1100,9 @@ def cache_instance_image(ctx, node, force_raw=None): LOG.debug("Fetching image %(image)s for node %(uuid)s", {'image': uuid, 'uuid': node.uuid}) - fetch_images(ctx, InstanceImageCache(), [(uuid, image_path)], - force_raw) - - return (uuid, image_path) + image_list = fetch_images(ctx, InstanceImageCache(), [(uuid, image_path)], + force_raw, expected_format=expected_format) + return (uuid, image_path, image_list[0][2]) @METRICS.timer('destroy_images') @@ -1128,13 +1143,39 @@ def destroy_http_instance_images(node): destroy_images(node.uuid) -def _validate_image_url(node, url, secret=False): +def _validate_image_url(node, url, secret=False, inspect_image=None, + expected_format=None): """Validates image URL through the HEAD request. :param url: URL to be validated :param secret: if URL is secret (e.g. swift temp url), it will not be shown in logs. + :param inspect_image: If the requested URL should have extensive + content checking applied. Defaults to the value provided by + the [conductor]conductor_always_validates_images configuration + parameter setting, but is also able to be turned off by supplying + False where needed to perform a redirect or URL head request only. + :param expected_format: The expected image format, if known, for + the image inspection logic. + :returns: Returns a dictionary with basic information about the + requested image if image introspection is """ + if inspect_image is not None: + # The caller has a bit more context and we can rely upon it, + # for example if it knows we cannot or should not inspect + # the image contents. 
+ inspect = inspect_image + elif not CONF.conductor.disable_deep_image_inspection: + inspect = CONF.conductor.conductor_always_validates_images + else: + # If we're here, file inspection has been explicitly disabled. + inspect = False + + # NOTE(TheJulia): This method gets used in two different ways. + # The first is as a "i did a thing, let me make sure my url works." + # The second is to validate a remote URL is valid. In the remote case + # we will grab the file and proceed from there. + image_info = {} try: # NOTE(TheJulia): This method only validates that an exception # is NOT raised. In other words, that the endpoint does not @@ -1146,20 +1187,50 @@ def _validate_image_url(node, url, secret=False): LOG.error("The specified URL is not a valid HTTP(S) URL or is " "not reachable for node %(node)s: %(msg)s", {'node': node.uuid, 'msg': e}) + if inspect: + LOG.info("Inspecting image contents for %(node)s with url %(url)s. " + "Expecting user supplied format: %(expected)s", + {'node': node.uuid, + 'expected': expected_format, + 'url': url}) + # Utilizes the file cache since it knows how to pull files down + # and handles pathing and caching and all that fun, however with + # force_raw set as false. + + # The goal here being to get the file we would normally just point + # IPA at, be it via swift transfer *or* direct URL request, and + # perform the safety check on it before allowing it to proceed. + ctx = context.get_admin_context() + # NOTE(TheJulia): Because we're using the image cache here, we + # let it run the image validation checking as it's normal course + # of action, and save what it tells us the image format is. + # if there *was* a mismatch, it will raise the error. + _, image_path, img_format = cache_instance_image( + ctx, + node, + force_raw=False, + expected_format=expected_format) + # NOTE(TheJulia): We explicitly delete this file because it has no use + # in the cache after this point. + il_utils.unlink_without_raise(image_path) + image_info['disk_format'] = img_format + return image_info def _cache_and_convert_image(task, instance_info, image_info=None): """Cache an image locally and convert it to RAW if needed.""" # Ironic cache and serve images from httpboot server force_raw = direct_deploy_should_convert_raw_image(task.node) - _, image_path = cache_instance_image(task.context, task.node, - force_raw=force_raw) - if force_raw or image_info is None: - if image_info is None: - initial_format = instance_info.get('image_disk_format') - else: - initial_format = image_info.get('disk_format') + if image_info is None: + initial_format = instance_info.get('image_disk_format') + else: + initial_format = image_info.get('disk_format') + _, image_path, img_format = cache_instance_image( + task.context, task.node, + force_raw=force_raw, + expected_format=initial_format) + if force_raw or image_info is None: if force_raw: instance_info['image_disk_format'] = 'raw' else: @@ -1240,7 +1311,11 @@ def _cache_and_convert_image(task, instance_info, image_info=None): task.node.uuid]) if file_extension: http_image_url = http_image_url + file_extension - _validate_image_url(task.node, http_image_url, secret=False) + # We don't inspect the image in our url check because we just need to do + # an quick path validity check here, we should be checking contents way + # earlier on in this method. 
+ _validate_image_url(task.node, http_image_url, secret=False, + inspect_image=False) instance_info['image_url'] = http_image_url @@ -1265,29 +1340,57 @@ def build_instance_info_for_deploy(task): instance_info = node.instance_info iwdi = node.driver_internal_info.get('is_whole_disk_image') image_source = instance_info['image_source'] + + # Flag if we know the source is a path, used for Anaconda + # deploy interface where you can just tell anaconda to + # consume artifacts from a path. In this case, we are not + # doing any image conversions, we're just passing through + # a URL in the form of configuration. isap = node.driver_internal_info.get('is_source_a_path') + # If our url ends with a /, i.e. we have been supplied with a path, # we can only deploy this in limited cases for drivers and tools # which are aware of such. i.e. anaconda. image_download_source = get_image_download_source(node) boot_option = get_boot_option(task.node) + # There is no valid reason this should already be set, and + # and gets replaced at various points in this sequence. + instance_info['image_url'] = None + if service_utils.is_glance_image(image_source): glance = image_service.GlanceImageService(context=task.context) image_info = glance.show(image_source) LOG.debug('Got image info: %(info)s for node %(node)s.', {'info': image_info, 'node': node.uuid}) if image_download_source == 'swift': + # In this case, we are getting a file *from* swift for a glance + # image which is backed by swift. IPA downloads the file directly + # from swift, but cannot get any metadata related to it otherwise. swift_temp_url = glance.swift_temp_url(image_info) - _validate_image_url(node, swift_temp_url, secret=True) + image_format = image_info.get('disk_format') + # In the process of validating the URL is valid, we will perform + # the requisite safety checking of the asset as we can end up + # converting it in the agent, or needing the disk format value + # to be correct for the Ansible deployment interface. + validate_results = _validate_image_url( + node, swift_temp_url, secret=True, + expected_format=image_format) + # Values are explicitly set into the instance info field + # so IPA have the values available. instance_info['image_url'] = swift_temp_url instance_info['image_checksum'] = image_info['checksum'] - instance_info['image_disk_format'] = image_info['disk_format'] + instance_info['image_disk_format'] = \ + validate_results.get('disk_format', image_format) instance_info['image_os_hash_algo'] = image_info['os_hash_algo'] instance_info['image_os_hash_value'] = image_info['os_hash_value'] else: + # In this case, we're directly downloading the glance image and + # hosting it locally for retrieval by the IPA. _cache_and_convert_image(task, instance_info, image_info) + # We're just populating extra information for a glance backed image in + # case a deployment interface driver needs them at some point. instance_info['image_container_format'] = ( image_info['container_format']) instance_info['image_tags'] = image_info.get('tags', []) @@ -1298,20 +1401,80 @@ def build_instance_info_for_deploy(task): instance_info['ramdisk'] = image_info['properties']['ramdisk_id'] elif (image_source.startswith('file://') or image_download_source == 'local'): + # In this case, we're explicitly downloading (or copying a file) + # hosted locally so IPA can download it directly from Ironic. 
+ # NOTE(TheJulia): Intentionally only supporting file:/// as image # based deploy source since we don't want to, nor should we be in # in the business of copying large numbers of files as it is a # huge performance impact. + _cache_and_convert_image(task, instance_info) else: + # This is the "all other cases" logic for aspects like the user + # has supplied us a direct URL to reference. In cases like the + # anaconda deployment interface where we might just have a path + # and not a file, or where a user may be supplying a full URL to + # a remotely hosted image, we at a minimum need to check if the url + # is valid, and address any redirects upfront. try: - _validate_image_url(node, image_source) + # NOTE(TheJulia): In the case we're here, we not doing an + # integrated image based deploy, but we may also be doing + # a path based anaconda base deploy, in which case we have + # no backing image, but we need to check for a URL + # redirection. So, if the source is a path (i.e. isap), + # we don't need to inspect the image as there is no image + # in the case for the deployment to drive. + validated_results = {} + if isap: + # This is if the source is a path url, such as one used by + # anaconda templates to to rely upon bootstrapping defaults. + _validate_image_url(node, image_source, inspect_image=False) + else: + # When not isap, we can just let _validate_image_url make a + # the required decision on if contents need to be sampled, + # or not. We try to pass the image_disk_format which may be + # declared by the user, and if not we set expected_format to + # None. + validate_results = _validate_image_url( + node, + image_source, + expected_format=instance_info.get('image_disk_format', + None)) # image_url is internal, and used by IPA and some boot templates. # in most cases, it needs to come from image_source explicitly. + if 'disk_format' in validated_results: + # Ensure IPA has the value available, so write what we detect, + # if anything. This is also an item which might be needful + # with ansible deploy interface, when used in standalone mode. + instance_info['image_disk_format'] = \ + validate_results.get('disk_format') instance_info['image_url'] = image_source except exception.ImageRefIsARedirect as e: + # At this point, we've got a redirect response from the webserver, + # and we're going to try to handle it as a single redirect action, + # as requests, by default, only lets a single redirect to occur. + # This is likely a URL pathing fix, like a trailing / on a path, + # or move to HTTPS from a user supplied HTTP url. if e.redirect_url: + # Since we've got a redirect, we need to carry the rest of the + # request logic as well, which includes recording a disk + # format, if applicable. instance_info['image_url'] = e.redirect_url + # We need to save the image_source back out so it caches + instance_info['image_source'] = e.redirect_url + task.node.instance_info = instance_info + if not isap: + # The redirect doesn't relate to a path being used, so + # the target is a filename, likely cause is webserver + # telling the client to use HTTPS. + validated_results = _validate_image_url( + node, e.redirect_url, + expected_format=instance_info.get('image_disk_format', + None)) + if 'disk_format' in validated_results: + instance_info['image_disk_format'] = \ + validated_results.get('disk_format') else: raise @@ -1326,7 +1489,6 @@ def build_instance_info_for_deploy(task): # Call central parsing so we retain things like config drives. 
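
The exception handler above follows exactly one redirect and then re-validates
the new location. Assuming a plain ``requests`` call is an acceptable stand-in
for Ironic's internal helpers (Ironic itself signals this condition via
``exception.ImageRefIsARedirect``), detecting that single hop can be sketched
as follows; the URL is illustrative::

    import requests

    def resolve_single_redirect(url, timeout=30):
        resp = requests.head(url, allow_redirects=False, timeout=timeout)
        if resp.is_redirect or resp.is_permanent_redirect:
            # Typically a trailing-slash fix or an HTTP -> HTTPS upgrade.
            return resp.headers.get('Location')
        return None

    new_url = resolve_single_redirect('http://images.example.com/disk.qcow2')
    if new_url:
        print('Redirected to %s; validate that URL once and persist it.'
              % new_url)
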
i_info = parse_instance_info(node, image_deploy=False) instance_info.update(i_info) - return instance_info diff --git a/ironic/drivers/modules/image_cache.py b/ironic/drivers/modules/image_cache.py index 08383898c0..02588a0a5b 100644 --- a/ironic/drivers/modules/image_cache.py +++ b/ironic/drivers/modules/image_cache.py @@ -65,7 +65,8 @@ class ImageCache(object): if master_dir is not None: fileutils.ensure_tree(master_dir) - def fetch_image(self, href, dest_path, ctx=None, force_raw=True): + def fetch_image(self, href, dest_path, ctx=None, force_raw=True, + expected_format=None): """Fetch image by given href to the destination path. Does nothing if destination path exists and is up to date with cache @@ -80,6 +81,7 @@ class ImageCache(object): :param ctx: context :param force_raw: boolean value, whether to convert the image to raw format + :param expected_format: The expected image format. """ img_download_lock_name = 'download-image' if self.master_dir is None: @@ -140,13 +142,14 @@ class ImageCache(object): {'href': href}) self._download_image( href, master_path, dest_path, img_info, - ctx=ctx, force_raw=force_raw) + ctx=ctx, force_raw=force_raw, + expected_format=expected_format) # NOTE(dtantsur): we increased cache size - time to clean up self.clean_up() def _download_image(self, href, master_path, dest_path, img_info, - ctx=None, force_raw=True): + ctx=None, force_raw=True, expected_format=None): """Download image by href and store at a given path. This method should be called with uuid-specific lock taken. @@ -158,6 +161,7 @@ class ImageCache(object): :param ctx: context :param force_raw: boolean value, whether to convert the image to raw format + :param expected_format: The expected original format for the image. :raise ImageDownloadFailed: when the image cache and the image HTTP or TFTP location are on different file system, causing hard link to fail. @@ -169,7 +173,7 @@ class ImageCache(object): try: with _concurrency_semaphore: - _fetch(ctx, href, tmp_path, force_raw) + _fetch(ctx, href, tmp_path, force_raw, expected_format) if img_info.get('no_cache'): LOG.debug("Caching is disabled for image %s", href) @@ -333,34 +337,59 @@ def _free_disk_space_for(path): return stat.f_frsize * stat.f_bavail -def _fetch(context, image_href, path, force_raw=False): +def _fetch(context, image_href, path, force_raw=False, expected_format=None): """Fetch image and convert to raw format if needed.""" path_tmp = "%s.part" % path if os.path.exists(path_tmp): LOG.warning("%s exist, assuming it's stale", path_tmp) os.remove(path_tmp) images.fetch(context, image_href, path_tmp, force_raw=False) + # By default, the image format is unknown + image_format = None + disable_dii = CONF.conductor.disable_deep_image_inspection + if not disable_dii: + if not expected_format: + # Call of last resort to check the image format. Caching other + # artifacts like kernel/ramdisks are not going to have an expected + # format known even if they are not passed to qemu-img. + remote_image_format = images.image_show( + context, + image_href).get('disk_format') + else: + remote_image_format = expected_format + image_format = images.safety_check_image(path_tmp) + images.check_if_image_format_is_permitted( + image_format, remote_image_format) + # Notes(yjiang5): If glance can provide the virtual size information, # then we can firstly clean cache and then invoke images.fetch(). 
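
Within ``_fetch`` above, the fingerprinted format is checked against both the
declared format and the ``[conductor]permitted_image_formats`` list. A compact
sketch of that gate, assuming simple mismatch semantics; Ironic's actual
checks live in ``images.safety_check_image`` and
``images.check_if_image_format_is_permitted``::

    def check_format_is_permitted(detected, declared,
                                  permitted=('raw', 'qcow2', 'iso')):
        if detected not in permitted:
            raise ValueError('image format %s is not permitted' % detected)
        if declared and declared != detected:
            raise ValueError('image claims to be %s but looks like %s'
                             % (declared, detected))
        return detected

    check_format_is_permitted('qcow2', 'qcow2')  # accepted
    check_format_is_permitted('raw', None)       # accepted, nothing declared
    # check_format_is_permitted('vmdk', 'qcow2') # would raise: not permitted
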
- if force_raw: - if images.force_raw_will_convert(image_href, path_tmp): - required_space = images.converted_size(path_tmp, estimate=False) - directory = os.path.dirname(path_tmp) + if (force_raw + and ((disable_dii + and images.force_raw_will_convert(image_href, path_tmp)) + or (not disable_dii and image_format != 'raw'))): + # NOTE(TheJulia): What is happening here is the rest of the logic + # is hinged on force_raw, but also we don't need to take the entire + # path *if* the image on disk is *already* raw. Depending on settings, + # the path differs slightly because if we have deep image inspection, + # we can just rely upon the inspection image format, otherwise we + # need to ask the image format. + + required_space = images.converted_size(path_tmp, estimate=False) + directory = os.path.dirname(path_tmp) + try: + _clean_up_caches(directory, required_space) + except exception.InsufficientDiskSpace: + + # try again with an estimated raw size instead of the full size + required_space = images.converted_size(path_tmp, estimate=True) try: _clean_up_caches(directory, required_space) except exception.InsufficientDiskSpace: - - # try again with an estimated raw size instead of the full size - required_space = images.converted_size(path_tmp, estimate=True) - try: - _clean_up_caches(directory, required_space) - except exception.InsufficientDiskSpace: - LOG.warning('Not enough space for estimated image size. ' - 'Consider lowering ' - '[DEFAULT]raw_image_growth_factor=%s', - CONF.raw_image_growth_factor) - raise - + LOG.error('Not enough space for estimated image size. ' + 'Consider lowering ' + '[DEFAULT]raw_image_growth_factor=%s', + CONF.raw_image_growth_factor) + raise images.image_to_raw(image_href, path, path_tmp) else: os.rename(path_tmp, path) diff --git a/ironic/tests/unit/api/controllers/v1/test_ramdisk.py b/ironic/tests/unit/api/controllers/v1/test_ramdisk.py index 03bfe088ae..a491b3cb78 100644 --- a/ironic/tests/unit/api/controllers/v1/test_ramdisk.py +++ b/ironic/tests/unit/api/controllers/v1/test_ramdisk.py @@ -82,6 +82,8 @@ class TestLookup(test_api_base.BaseApiTest): 'agent_token': mock.ANY, 'agent_token_required': True, 'agent_md5_checksum_enable': CONF.agent.allow_md5_checksum, + 'disable_deep_image_inspection': CONF.conductor.disable_deep_image_inspection, # noqa + 'permitted_image_formats': CONF.conductor.permitted_image_formats, } self.assertEqual(expected_config, data['config']) self.assertIsNotNone(data['config']['agent_token']) diff --git a/ironic/tests/unit/common/test_format_inspector.py b/ironic/tests/unit/common/test_format_inspector.py new file mode 100644 index 0000000000..ebbe3488ac --- /dev/null +++ b/ironic/tests/unit/common/test_format_inspector.py @@ -0,0 +1,668 @@ +# Copyright 2020 Red Hat, Inc +# All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); you may +# not use this file except in compliance with the License. You may obtain +# a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT +# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the +# License for the specific language governing permissions and limitations +# under the License. 
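
The ramdisk lookup test above shows the two new keys that are conveyed to
``ironic-python-agent`` alongside the existing configuration. Roughly, the
relevant slice of the lookup payload looks like the following; the values are
illustrative and unrelated keys are omitted::

    lookup_config = {
        # Pre-existing keys such as the agent token settings are elided.
        'agent_token_required': True,
        'agent_md5_checksum_enable': False,
        # New with this change: the conductor's inspection posture and the
        # image formats it permits, so the agent can apply the same policy
        # when it downloads and converts images itself.
        'disable_deep_image_inspection': False,
        'permitted_image_formats': ['raw', 'qcow2', 'iso'],
    }
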
+ +import io +import os +import re +import struct +import subprocess +import tempfile +from unittest import mock + +from oslo_utils import units + +from ironic.common import image_format_inspector as format_inspector +from ironic.tests import base as test_base + + +TEST_IMAGE_PREFIX = 'ironic-unittest-formatinspector-' + + +def get_size_from_qemu_img(filename): + output = subprocess.check_output('qemu-img info "%s"' % filename, + shell=True) + for line in output.split(b'\n'): + m = re.search(b'^virtual size: .* .([0-9]+) bytes', line.strip()) + if m: + return int(m.group(1)) + + raise Exception('Could not find virtual size with qemu-img') + + +class TestFormatInspectors(test_base.TestCase): + + block_execute = False + + def setUp(self): + super(TestFormatInspectors, self).setUp() + self._created_files = [] + + def tearDown(self): + super(TestFormatInspectors, self).tearDown() + for fn in self._created_files: + try: + os.remove(fn) + except Exception: + pass + + def _create_iso(self, image_size, subformat='9660'): + """Create an ISO file of the given size. + + :param image_size: The size of the image to create in bytes + :param subformat: The subformat to use, if any + """ + + # these tests depend on mkisofs + # being installed and in the path, + # if it is not installed, skip + try: + subprocess.check_output('mkisofs --version', shell=True) + except Exception: + self.skipTest('mkisofs not installed') + + size = image_size // units.Mi + base_cmd = "mkisofs" + if subformat == 'udf': + # depending on the distribution mkisofs may not support udf + # and may be provided by genisoimage instead. As a result we + # need to check if the command supports udf via help + # instead of checking the installed version. + # mkisofs --help outputs to stderr so we need to + # redirect it to stdout to use grep. + try: + subprocess.check_output( + 'mkisofs --help 2>&1 | grep udf', shell=True) + except Exception: + self.skipTest('mkisofs does not support udf format') + base_cmd += " -udf" + prefix = TEST_IMAGE_PREFIX + prefix += '-%s-' % subformat + fn = tempfile.mktemp(prefix=prefix, suffix='.iso') + self._created_files.append(fn) + subprocess.check_output( + 'dd if=/dev/zero of=%s bs=1M count=%i' % (fn, size), + shell=True) + # We need to use different file as input and output as the behavior + # of mkisofs is version dependent if both the input and the output + # are the same and can cause test failures + out_fn = "%s.iso" % fn + subprocess.check_output( + '%s -V "TEST" -o %s %s' % (base_cmd, out_fn, fn), + shell=True) + self._created_files.append(out_fn) + return out_fn + + def _create_img( + self, fmt, size, subformat=None, options=None, + backing_file=None): + """Create an image file of the given format and size. + + :param fmt: The format to create + :param size: The size of the image to create in bytes + :param subformat: The subformat to use, if any + :param options: A dictionary of options to pass to the format + :param backing_file: The backing file to use, if any + """ + + if fmt == 'iso': + return self._create_iso(size, subformat) + + if fmt == 'vhd': + # QEMU calls the vhd format vpc + fmt = 'vpc' + + # these tests depend on qemu-img being installed and in the path, + # if it is not installed, skip. we also need to ensure that the + # format is supported by qemu-img, this can vary depending on the + # distribution so we need to check if the format is supported via + # the help output. 
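
As an aside, ``get_size_from_qemu_img`` above scrapes the human-readable
``qemu-img info`` text with a regular expression. The same value can also be
read from the machine-readable JSON output, which is simpler to parse; a small
sketch, assuming ``qemu-img`` is available on ``PATH``::

    import json
    import subprocess

    def get_virtual_size(filename):
        out = subprocess.check_output(
            ['qemu-img', 'info', '--output=json', filename])
        return json.loads(out)['virtual-size']
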
+ try: + subprocess.check_output( + 'qemu-img --help | grep %s' % fmt, shell=True) + except Exception: + self.skipTest( + 'qemu-img not installed or does not support %s format' % fmt) + + if options is None: + options = {} + opt = '' + prefix = TEST_IMAGE_PREFIX + + if subformat: + options['subformat'] = subformat + prefix += subformat + '-' + + if options: + opt += '-o ' + ','.join('%s=%s' % (k, v) + for k, v in options.items()) + + if backing_file is not None: + opt += ' -b %s -F raw' % backing_file + + fn = tempfile.mktemp(prefix=prefix, + suffix='.%s' % fmt) + self._created_files.append(fn) + subprocess.check_output( + 'qemu-img create -f %s %s %s %i' % (fmt, opt, fn, size), + shell=True) + return fn + + def _create_allocated_vmdk(self, size_mb, subformat=None): + # We need a "big" VMDK file to exercise some parts of the code of the + # format_inspector. A way to create one is to first create an empty + # file, and then to convert it with the -S 0 option. + + if subformat is None: + # Matches qemu-img default, see `qemu-img convert -O vmdk -o help` + subformat = 'monolithicSparse' + + prefix = TEST_IMAGE_PREFIX + prefix += '-%s-' % subformat + fn = tempfile.mktemp(prefix=prefix, suffix='.vmdk') + self._created_files.append(fn) + raw = tempfile.mktemp(prefix=prefix, suffix='.raw') + self._created_files.append(raw) + + # Create a file with pseudo-random data, otherwise it will get + # compressed in the streamOptimized format + subprocess.check_output( + 'dd if=/dev/urandom of=%s bs=1M count=%i' % (raw, size_mb), + shell=True) + + # Convert it to VMDK + subprocess.check_output( + 'qemu-img convert -f raw -O vmdk -o subformat=%s -S 0 %s %s' % ( + subformat, raw, fn), + shell=True) + return fn + + def _test_format_at_block_size(self, format_name, img, block_size): + fmt = format_inspector.get_inspector(format_name)() + self.assertIsNotNone(fmt, + 'Did not get format inspector for %s' % ( + format_name)) + wrapper = format_inspector.InfoWrapper(open(img, 'rb'), fmt) + + while True: + chunk = wrapper.read(block_size) + if not chunk: + break + + wrapper.close() + return fmt + + def _test_format_at_image_size(self, format_name, image_size, + subformat=None): + """Test the format inspector for the given format at the given image size. + + :param format_name: The format to test + :param image_size: The size of the image to create in bytes + :param subformat: The subformat to use, if any + """ # noqa + img = self._create_img(format_name, image_size, subformat=subformat) + + # Some formats have internal alignment restrictions making this not + # always exactly like image_size, so get the real value for comparison + virtual_size = get_size_from_qemu_img(img) + + # Read the format in various sizes, some of which will read whole + # sections in a single read, others will be completely unaligned, etc. + block_sizes = [64 * units.Ki, 1 * units.Mi] + # ISO images have a 32KB system area at the beginning of the image + # as a result reading that in 17 or 512 byte blocks takes too long, + # causing the test to fail. The 64KiB block size is enough to read + # the system area and header in a single read. the 1MiB block size + # adds very little time to the test so we include it. 
+ if format_name != 'iso': + block_sizes.extend([17, 512]) + for block_size in block_sizes: + fmt = self._test_format_at_block_size(format_name, img, block_size) + self.assertTrue(fmt.format_match, + 'Failed to match %s at size %i block %i' % ( + format_name, image_size, block_size)) + self.assertEqual(virtual_size, fmt.virtual_size, + ('Failed to calculate size for %s at size %i ' + 'block %i') % (format_name, image_size, + block_size)) + memory = sum(fmt.context_info.values()) + self.assertLess(memory, 512 * units.Ki, + 'Format used more than 512KiB of memory: %s' % ( + fmt.context_info)) + + def _test_format(self, format_name, subformat=None): + # Try a few different image sizes, including some odd and very small + # sizes + for image_size in (512, 513, 2057, 7): + self._test_format_at_image_size(format_name, image_size * units.Mi, + subformat=subformat) + + def test_qcow2(self): + self._test_format('qcow2') + + def test_iso_9660(self): + self._test_format('iso', subformat='9660') + + def test_iso_udf(self): + self._test_format('iso', subformat='udf') + + def _generate_bad_iso(self): + # we want to emulate a malicious user who uploads a an + # ISO file has a qcow2 header in the system area + # of the ISO file + # we will create a qcow2 image and an ISO file + # and then copy the qcow2 header to the ISO file + # e.g. + # mkisofs -o orig.iso /etc/resolv.conf + # qemu-img create orig.qcow2 -f qcow2 64M + # dd if=orig.qcow2 of=outcome bs=32K count=1 + # dd if=orig.iso of=outcome bs=32K skip=1 seek=1 + + qcow = self._create_img('qcow2', 10 * units.Mi) + iso = self._create_iso(64 * units.Mi, subformat='9660') + # first ensure the files are valid + iso_fmt = self._test_format_at_block_size('iso', iso, 4 * units.Ki) + self.assertTrue(iso_fmt.format_match) + qcow_fmt = self._test_format_at_block_size('qcow2', qcow, 4 * units.Ki) + self.assertTrue(qcow_fmt.format_match) + # now copy the qcow2 header to an ISO file + prefix = TEST_IMAGE_PREFIX + prefix += '-bad-' + fn = tempfile.mktemp(prefix=prefix, suffix='.iso') + self._created_files.append(fn) + subprocess.check_output( + 'dd if=%s of=%s bs=32K count=1' % (qcow, fn), + shell=True) + subprocess.check_output( + 'dd if=%s of=%s bs=32K skip=1 seek=1' % (iso, fn), + shell=True) + return qcow, iso, fn + + def test_bad_iso_qcow2(self): + + _, _, fn = self._generate_bad_iso() + + iso_check = self._test_format_at_block_size('iso', fn, 4 * units.Ki) + qcow_check = self._test_format_at_block_size('qcow2', fn, 4 * units.Ki) + # this system area of the ISO file is not considered part of the format + # the qcow2 header is in the system area of the ISO file + # so the ISO file is still valid + self.assertTrue(iso_check.format_match) + # the qcow2 header is in the system area of the ISO file + # but that will be parsed by the qcow2 format inspector + # and it will match + self.assertTrue(qcow_check.format_match) + # if we call format_inspector.detect_file_format it should detect + # and raise an exception because both match internally. + e = self.assertRaises( + format_inspector.ImageFormatError, + format_inspector.detect_file_format, fn) + self.assertIn('Multiple formats detected', str(e)) + + def test_vhd(self): + self._test_format('vhd') + + # NOTE(TheJulia): This is not a supported format, and we know this + # test can timeout due to some of the inner workings. Overall the + # code voered by this is being moved to oslo in the future, so this + # test being in ironic is also not the needful. 
+ # def test_vhdx(self): + # self._test_format('vhdx') + + def test_vmdk(self): + self._test_format('vmdk') + + def test_vmdk_stream_optimized(self): + self._test_format('vmdk', 'streamOptimized') + + def test_from_file_reads_minimum(self): + img = self._create_img('qcow2', 10 * units.Mi) + file_size = os.stat(img).st_size + fmt = format_inspector.QcowInspector.from_file(img) + # We know everything we need from the first 512 bytes of a QCOW image, + # so make sure that we did not read the whole thing when we inspect + # a local file. + self.assertLess(fmt.actual_size, file_size) + + def test_qed_always_unsafe(self): + img = self._create_img('qed', 10 * units.Mi) + fmt = format_inspector.get_inspector('qed').from_file(img) + self.assertTrue(fmt.format_match) + self.assertFalse(fmt.safety_check()) + + def _test_vmdk_bad_descriptor_offset(self, subformat=None): + format_name = 'vmdk' + image_size = 10 * units.Mi + descriptorOffsetAddr = 0x1c + BAD_ADDRESS = 0x400 + img = self._create_img(format_name, image_size, subformat=subformat) + + # Corrupt the header + fd = open(img, 'r+b') + fd.seek(descriptorOffsetAddr) + fd.write(struct.pack('`_. + - | + Ironic *always* inspects the supplied user image content for safety prior + to deployment of a node should the image pass through the conductor, + even if the image is supplied in ``raw`` format. This is utilized to + identify the format of the image and the overall safety + of the image, such that source images with unknown or unsafe feature + usage are explicitly rejected. This can be disabled by setting + ``[conductor]disable_deep_image_inspection`` to ``True``. + This is the result of CVE-2024-44082 tracked as + `bug 2071740 `_. + - | + Ironic can also inspect images which would normally be provided as a URL + for direct download by the ``ironic-python-agent`` ramdisk. This is not + enabled by default as it will increase the overall network traffic and + disk space utilization of the conductor. This level of inspection can be + enabled by setting ``[conductor]conductor_always_validates_images`` to + ``True``. Once the ``ironic-python-agent`` ramdisk has been updated, + it will perform similar image security checks independently, should an + image conversion be required. + This is the result of CVE-2024-44082 tracked as + `bug 2071740 `_. + - | + Ironic now explicitly enforces a list of permitted image types for + deployment via the ``[conductor]permitted_image_formats`` setting, + which defaults to "raw", "qcow2", and "iso". + While the project has classically always declared permissible + images as "qcow2" and "raw", it was previously possible to supply other + image formats known to ``qemu-img``, and the utility would attempt to + convert the images. The "iso" support is required for "boot from ISO" + ramdisk support. + - | + Ironic now explicitly passes the source input format to executions of + ``qemu-img`` to limit the permitted qemu disk image drivers which may + evaluate an image to prevent any mismatched format attacks against + ``qemu-img``. + - | + The ``ansible`` deploy interface example playbooks now supply an input + format to execution of ``qemu-img``. If you are using customized + playbooks, please add "-f {{ ironic.image.disk_format }}" to your + invocations of ``qemu-img``. If you do not do so, ``qemu-img`` will + automatically try and guess which can lead to known security issues + with the incorrect source format driver. 
+  - |
+    Operators who have implemented any custom deployment drivers or
+    additional functionality, such as machine snapshots, should review their
+    downstream code to ensure they are properly invoking ``qemu-img``. If
+    there are any questions or concerns, please reach out to the Ironic
+    project developers.
+  - |
+    Operators are reminded that they should utilize cleaning in their
+    environments. Disabling any security features, such as cleaning or image
+    inspection, is done at **your own risk**. Should you have any issues
+    with security-related features, please don't hesitate to open a bug with
+    the project.
+  - |
+    The ``[conductor]disable_deep_image_inspection`` setting is
+    conveyed to the ``ironic-python-agent`` ramdisks automatically, and
+    will prevent those ramdisks from performing deep inspection
+    of images before they are written.
+  - The ``[conductor]permitted_image_formats`` setting is conveyed to the
+    ``ironic-python-agent`` ramdisks automatically. Should a need arise
+    to explicitly permit an additional format, that should take place in
+    the Ironic service configuration.
+fixes:
+  - |
+    Fixes multiple issues in the handling of images related to the
+    execution of the ``qemu-img`` utility, which is used for image format
+    conversion, where a malicious user could craft a disk image to
+    potentially extract information from an ``ironic-conductor`` process's
+    operating environment.
+
+    Ironic now explicitly enforces a list of approved image
+    formats as a ``[conductor]permitted_image_formats`` list, which mirrors
+    the image formats the Ironic project has historically tested and
+    expressed as known working. Format detection is not based upon file
+    extension, but upon content fingerprinting of the disk image files.
+    This is tracked as CVE-2024-44082 via
+    `bug 2071740 `_.
+upgrade:
+  - |
+    When upgrading Ironic to address the ``qemu-img`` image conversion
+    security issues, the ``ironic-python-agent`` ramdisks will also need
+    to be upgraded.
+  - |
+    When upgrading Ironic to address the ``qemu-img`` image conversion
+    security issues, the ``[conductor]conductor_always_validates_images``
+    setting may be set to ``True`` as a short-term remedy while
+    ``ironic-python-agent`` ramdisks are being updated. Alternatively, it
+    may be advisable to also set the ``[agent]image_download_source``
+    setting to ``local`` to minimize redundant network data transfers.
+  - |
+    As a result of security fixes to address ``qemu-img`` image conversion
+    security issues, a new configuration parameter has been added to
+    Ironic, ``[conductor]permitted_image_formats``, with a default value of
+    "raw,qcow2,iso". Raw and qcow2 format disk images are the image formats
+    the Ironic community has consistently stated are supported and expected
+    for use with Ironic. These formats also match the formats which the
+    Ironic community tests. Operators who leverage other disk image formats
+    may need to modify this setting accordingly.
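
The guidance above about always supplying an explicit source format to
``qemu-img`` generalizes beyond the ansible playbooks. A minimal sketch of a
hardened conversion call, assuming ``qemu-img`` is installed and the declared
format has already been validated against the permitted list::

    import subprocess

    def convert_to_raw(src, dest, declared_format,
                       permitted=('raw', 'qcow2')):
        if declared_format not in permitted:
            raise ValueError('format %s is not permitted' % declared_format)
        # Pinning -f prevents qemu-img from probing the file and picking a
        # format driver on its own, which is the behaviour the fixes above
        # are designed to avoid.
        subprocess.check_call(
            ['qemu-img', 'convert', '-f', declared_format, '-O', 'raw',
             src, dest])
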