glance/doc/source/admin/interoperable-image-import.rst
Dan Smith d8a6309893 Add administrator docs for distributed-import
This adds some text to the documentation about configuring the import
mechanism, including details about shared vs. local staging
directories. It also clarifies that *all* import methods require the
staging directory to be configured, as well as cleans up some
single-store-specific wording in this area.

Related to blueprint distributed-image-import

Change-Id: I726abe5d1104510e8da0e94f90f2b36d43b82cbe
2021-03-03 06:37:29 -08:00

29 KiB

Interoperable Image Import

Version 2.6 of the Image Service API introduces new API calls that implement an interoperable image import process. These API calls, and the workflow for using them, are described in the Interoperable Image Import section of the Image Service API reference. That documentation explains the end user's view of interoperable image import. In this section, we discuss what configuration options are available to operators.

The interoperable image import process uses Glance tasks, but does not require that the Tasks API be exposed to end users. Further, it requires the taskflow task executor. The following configuration options must be set:

  • in the [task] option group:
    • task_executor must either be set to taskflow or be used in its default value
  • in the [taskflow_executor] options group:
    • The default values are fine. It's a good idea to read through the descriptions in the sample glance-api.conf file to see what options are available.

      Note

      You can find an example glance-api.conf file in the etc/ subdirectory of the Glance source code tree. Make sure that you are looking in the correct branch for the OpenStack release you are working with.

  • in the default options group:
    • node_staging_uri as a file:///path/to/dir URI (in the single-store case) or [os_glance_staging_store]/filesystem_store_datadir as a path (in the multi-store case) must specify a location writable by the glance user. See Staging Directory Configuration for more details and recommendations.
    • enabled_import_methods must specify the import methods you are exposing at your installation. The default value for this setting is ['glance-direct','web-download']. See the next section for a description of these import methods.

Additionally, your policies must be such that an ordinary end user can manipulate tasks. In releases prior to Pike, we recommended that the task-related policies be admin-only so that end users could not access the Tasks API. In Pike, a new policy was introduced that controls access to the Tasks API. Thus it is now possible to keep the individual task policies unrestricted while not exposing the Tasks API to end users. Thus, the following is the recommended configuration for the task-related policies:

"get_task": "",
"get_tasks": "",
"add_task": "",
"modify_task": "",
"tasks_api_access": "role:admin",

Image Import Methods

Glance provides three import methods that you can make available to your users: glance-direct, web-download and copy-image. By default, all three methods are enabled.

  • The glance-direct import method allows your users to upload image data directly to Glance.

  • The web-download method allows an end user to import an image from a remote URL. The image data is retrieved from the URL and stored in the Glance backend. (In other words, this is a copy-from operation.)

    Note

    The web-download import method replaces the copy-from functionality that was available in the Image API v1 but previously absent from v2. Additionally, the Image API v1 was removed in Glance 17.0.0 (Rocky).

  • The copy-image method allows and end user to copy existing image to other Glance backends available in deployment. This import method is only used if multiple glance backends are enabled in your deployment.

You control which methods are available to API users by the enabled_import_methods configuration option in the default section of the glance-api.conf file.

Staging Directory Configuration

All of the import methods require a staging directory to be configured. This is essentially a temporary scratch location where the image can be staged (by the user via glance-direct), downloaded (by web-download), or pulled from an existing store (as in copy-image) before being copied to a given store location. In the single-store case, this location is specified by a local filesystem URI in the node_staging_uri configuration option, like this:

[DEFAULTS]
node_staging_uri = file:///var/lib/glance/staging

In the multistore case, as described in reserved_stores, the staging store should be configured with the path:

[os_glance_staging_store]
filesystem_store_datadir = /var/lib/glance/staging

The staging directory for each worker must be configured for all import methods, and can be either local (recommended) or shared. In the case of a shared location, all Glance API workers will be dependent on the shared storage availability, will compete for IO resources, and may introduce additional network traffic. If local storage is chosen, you must configure each worker with the URL by which the other workers can reach it directly. This allows one worker behind a load balancer to stage an image in one request, and another worker to handle the subsequent import request. As an example:

[DEFAULTS]
worker_self_reference_url = https://glance01.example.com:8000

This assumes you have several glance-api workers named glance01, glance02, etc behind your load balancer.

Note that public_endpoint will be used as the default if worker_self_reference_url is not set. As this will generally be set to the same value across all workers, the result is that all workers will assume the same identity and thus revert to shared-staging behavior. If public_endpoint is set differently for one or a group of workers, they will be considered isolated and thus not sharing staging storage.

Configuring the glance-direct method

For the glance-direct method, make sure that glance-direct is included in the list specified by your enabled_import_methods setting, and that all the options described above are set properly.

Note that in order to use glance-direct, the worker_self_reference_url configuration option must be set as above, or all Glance API workers must have their staging directory mounted to a common location (such as an NFS server).

Configuring the web-download method

To enable the web-download import method, make sure that it is included in the list of methods in the enabled_import_methods option, and that all the options described above are set properly.

Additionally, you have the following configuration available.

Depending on the nature of your cloud and the sophistication of your users, you may wish to restrict what URIs they may use for the web-download import method.

Note

You should be aware of OSSN-0078, "copy_from in Image Service API v1 allows network port scan". The v1 copy_from feature does not have the configurability described here.

You can do this by configuring options in the [import_filtering_opts] section of the glance-image-import.conf file.

Note

The glance-image-import.conf is an optional file. (See below for a discussion of the default settings if you don't include this file.)

You can find an example file named glance-image-import.conf.sample in the etc/ subdirectory of the Glance source code tree. Make sure that you are looking in the correct branch for the OpenStack release you are working with.

You can whitelist ("allow only these") or blacklist ("do not allow these") at three levels:

  • scheme (allowed_schemes, disallowed_schemes)
  • host (allowed_hosts, disallowed_hosts)
  • port (allowed_ports, disallowed_ports)

There are six configuration options, but the way it works is that if you specify both at any level, the whitelist is honored and the blacklist is ignored. (So why have both? Well, you may want to whitelist a scheme, but blacklist a host, and whitelist a particular port.)

Validation of a URI happens as follows:

  1. The scheme is checked.
    1. missing scheme: reject
    2. If there's a whitelist, and the scheme is not in it: reject. Otherwise, skip c and continue on to 2.
    3. If there's a blacklist, and the scheme is in it: reject.
  2. The hostname is checked.
    1. missing hostname: reject
    2. If there's a whitelist, and the host is not in it: reject. Otherwise, skip c and continue on to 3.
    3. If there's a blacklist, and the host is in it: reject.
  3. If there's a port in the URI, the port is checked.
    1. If there's a whitelist, and the port is not in it: reject. Otherwise, skip b and continue on to 4.
    2. If there's a blacklist, and the port is in it: reject.
  4. The URI is accepted as valid.

Note that if you allow a scheme, either by whitelisting it or by not blacklisting it, any URI that uses the default port for that scheme by not including a port in the URI is allowed. If it does include a port in the URI, the URI will be validated according to the above rules.

Default settings

The glance-image-import.conf is an optional file. Here are the default settings for these options:

  • allowed_schemes - ['http', 'https']
  • disallowed_schemes - empty list
  • allowed_hosts - empty list
  • disallowed_hosts - empty list
  • allowed_ports - [80, 443]
  • disallowed_ports - empty list

Thus if you use the defaults, end users will only be able to access URIs using the http or https scheme. The only ports users will be able to specify are 80 and 443. (Users do not have to specify a port, but if they do, it must be either 80 or 443.)

Note

The glance-image-import.conf is an optional file. You can find an example file named glance-image-import.conf.sample in the etc/ subdirectory of the Glance source code tree. Make sure that you are looking in the correct branch for the OpenStack release you are working with.

Configuring the copy-image method

For the copy-image method, make sure that copy-image is included in the list specified by your enabled_import_methods setting as well as you have multiple glance backends configured in your environment. To allow copy-image operation to be performed by users on images they do not own, you can set the copy_image policy to something other than the default, for example:

"copy_image": "'public':%(visibility)s"

Copying existing-image in multiple stores

Starting with Ussuri release, it is possible to copy existing image data into multiple stores using interoperable image import workflow.

Basically user will be able to copy only those images which are owned by him. Unless the copying of unowned images are allowed by cloud operator by enforcing policy check, user will get Forbidden (Operation not permitted response) for such copy operations. Even if copying of unowned images is allowed by enforcing policy, ownership of the image remains unchanged.

Operator or end user can either copy the existing image by specifying all_stores as True in request body or by passing list of desired stores in request body. If all_stores is specified and image data is already present in some of the available stores then those stores will be silently excluded from the list of all configured stores, whereas if all_stores is False, stores are specified in explicitly in request body and if image data is present in any of the specified store then the request will be rejected. In case of all_stores is specified in request body and cloud operator has also configured a read-only http store then it will be excluded explicitly.

Image will be copied to staging area from one of the available locations and then import processing will be continued using import workflow as explained in below Importing in multiple stores section.

Importing in multiple stores

Starting with Ussuri, it is possible to import data into multiple stores using interoperable image import workflow.

The status of the image is set to active according to the value of all_stores_must_succeed parameter.

  • If set to False: the image will be available as soon as an import to one store has succeeded.
  • If set to True (default): the status is set to active only when all stores have been successfully treated.

Check progress

As each store is treated sequentially, it can take quite some time for the workflow to complete depending on the size of the image and the number of stores to import data to. It is possible to follow task progress by looking at 2 reserved image properties:

  • os_glance_importing_to_stores: This property contains a list of stores that has not yet been processed. At the beginning of the import flow, it is filled with the stores provided in the request. Each time a store is fully handled, it is removed from the list.
  • os_glance_failed_import: Each time an import in a store fails, it is added to this list. This property is emptied at the beginning of the import flow.

These 2 properties are also available in the notifications sent during the workflow:

Note

Example

An operator calls the import image api with the following parameters:

curl -i -X POST -H "X-Auth-Token: $token"
     -H "Content-Type: application/json"
     -d '{"method": {"name":"glance-direct"},
          "stores": ["ceph1", "ceph2"],
          "all_stores_must_succeed": false}'
    $image_url/v2/images/{image_id}/import

The upload fails for 'ceph2' but succeed on 'ceph1'. Since the parameter all_stores_must_succeed has been set to 'false', the task ends successfully and the image is now active.

Notifications sent by glance looks like (payload is truncated for clarity):

{
    "priority": "INFO",
    "event_type": "image.prepare",
    "timestamp": "2019-08-27 16:10:30.066867",
    "payload": {"status": "importing",
                "name": "example",
                "backend": "ceph1",
                "os_glance_importing_to_stores": ["ceph1", "ceph2"],
                "os_glance_failed_import": [],
                ...},
    "message_id": "1c8993ad-e47c-4af7-9f75-fa49596eeb10",
    ...
}

{
    "priority": "INFO",
    "event_type": "image.upload",
    "timestamp": "2019-08-27 16:10:32.058812",
    "payload": {"status": "active",
                "name": "example",
                "backend": "ceph1",
                "os_glance_importing_to_stores": ["ceph2"],
                "os_glance_failed_import": [],
                ...},
    "message_id": "8b8993ad-e47c-4af7-9f75-fa49596eeb11",
    ...
}

{
    "priority": "INFO",
    "event_type": "image.prepare",
    "timestamp": "2019-08-27 16:10:33.066867",
    "payload": {"status": "active",
                "name": "example",
                "backend": "ceph2",
                "os_glance_importing_to_stores": ["ceph2"],
                "os_glance_failed_import": [],
                ...},
    "message_id": "1c8993ad-e47c-4af7-9f75-fa49596eeb18",
    ...
}

{
    "priority": "ERROR",
    "event_type": "image.upload",
    "timestamp": "2019-08-27 16:10:34.058812",
    "payload": "Error Message",
    "message_id": "8b8993ad-e47c-4af7-9f75-fa49596eeb11",
    ...
}

Customizing the image import process

When a user issues the image-import call, Glance retrieves the staged image data, processes it, and saves the result in the backing store. You can customize the nature of this processing by using plugins. Some plugins are provided by the Glance project team, you can use third-party plugins, or you can write your own.

Technical information

The import step of interoperable image import is performed by a Taskflow "flow" object. This object, provided by Glance, will call any plugins you have specified in the glance-image-import.conf file. The plugins are loaded by Stevedore and must be listed in the entry point registry in the namespace glance.image_import.plugins. (If you are using only plugins provided by the Glance project team, these are already registered for you.)

A plugin must be written in Python as a Taskflow "Task" object. The file containing this object must be present in the glance/async_/flows/plugins directory. The plugin file must contain a get_flow function that returns a Taskflow Task object wrapped in a linear flow. See the no_op plugin, located at glance/async_/flows/plugins/no_op.py for an example of how to do this.

Specifying the plugins to be used

First, the plugin code must exist in the directory glance/async_/flows/plugins. The name of a plugin is the filename (without extension) of the file containing the plugin code. For example, a file named fred_mertz.py would contain the plugin fred_mertz.

Second, the plugin must be listed in the entry point list for the glance.image_import.plugins namespace. (If you are using only plugins provided with Glance, this will have already been done for you, but it never hurts to check.) The entry point list is in setup.cfg. Find the section with the heading [entry_points] and look for the line beginning with glance.image_import.plugins =. It will be followed by a series of lines of the form:

<plugin-name> = <module-package-name>:get_flow

For example:

no_op = glance.async_.flows.plugins.no_op:get_flow

Make sure any plugin you want to use is included here.

Third, the plugin must be listed in the glance-image-import.conf file as one of the plugin names in the list providing the value for the image_import_plugins option. Plugins are executed in the order they are specified in this list.

The Image Property Injection Plugin

release introduced Queens (Glance 16.0.0)
configuration file glance-image-import.conf
configuration file section [inject_metadata_properties]

This plugin implements the Glance spec Inject metadata properties automatically to non-admin images. One use case for this plugin is a situation where an operator wants to put specific metadata on images imported by end users so that virtual machines booted from these images will be located on specific compute nodes. Since it's unlikely that an end user (the image owner) will know the appropriate properties or values, an operator may use this plugin to inject the properties automatically upon image import.

Note

This plugin may only be used as part of the interoperable image import workflow (POST v2/images/{image_id}/import). It has no effect on the image data upload call (PUT v2/images/{image_id}/file).

You can guarantee that your end users must use interoperable image import by restricting the upload_image policy appropriately in the Glance policy.yaml file. By default, this policy is unrestricted (that is, any authorized user may make the image upload call).

For example, to allow only admin or service users to make the image upload call, the policy could be restricted as follows:

"upload_image": "role:admin or (service_user_id:<uuid of nova user>) or
   (service_roles:<service user role>)"

where "service_role" is the role which is created for the service user and assigned to trusted services.

To use the Image Property Injection Plugin, the following configuration is required.

  1. You will need to configure 'glance-image-import.conf' file as shown below:

    [image_import_opts]
    image_import_plugins = [inject_image_metadata]
    
    [inject_metadata_properties]
    ignore_user_roles = admin,...
    inject = "property1":"value1","property2":"value2",...

    The first section, image_import_opts, is used to enable the plugin by specifying the plugin name as one of the elements of the list that is the value of the image_import_plugins parameter. The plugin name is simply the module name under glance/async_/flows/plugins/

    The second section, inject_metadata_properties, is where you set the parameters for the injection plugin. (Note that the values you specify here only have an effect if the plugin has been enabled in the image_import_plugins list as described above.)

    • ignore_user_roles is a comma-separated list of Keystone roles that the plugin will ignore. In other words, if the user making the image import call has any of these roles, the plugin will not inject any properties into the image.
    • inject is a comma-separated list of properties and values that will be injected into the image record for the imported image. Each property and value should be quoted and separated by a colon (':') as shown in the example above.
  2. If your use case is such that you don't want to allow end-users to create, modify, or delete metadata properties that you are injecting during the interoperable image import process, you will need to protect these properties using the Glance property protection feature (available since the Havana release).

    For example, suppose there is a property named 'property1' that you want injected during import, but you only want an administrator or service user to be able to create this property, and you want only an administrator to be able to modify or delete it. You could accomplish this by adding the following to the property protection configuration file:

    [property1]
    create = admin,service_role
    read = admin,service_role,member,_member_
    update = admin
    delete = admin

    See the property-protections section of this Guide for more information.

The Image Conversion

release introduced Rocky (Glance 17.0.0)
configuration file glance-image-import.conf
configuration file section [image_conversion]

This plugin implements automated image conversion for Interoperable Image Import. One use case for this plugin would be environments where Ceph is used as image back-end and operators want to optimize the back-end capabilities by ensuring that all images will be in raw format while not putting the burden of converting the images to their end users.

Note

This plugin may only be used as part of the interoperable image import workflow (POST v2/images/{image_id}/import). It has no effect on the image data upload call (PUT v2/images/{image_id}/file).

You can guarantee that your end users must use interoperable image import by restricting the upload_image policy appropriately in the Glance policy.yaml file. By default, this policy is unrestricted (that is, any authorized user may make the image upload call).

For example, to allow only admin or service users to make the image upload call, the policy could be restricted as follows:

"upload_image": "role:admin or (service_user_id:<uuid of nova user>) or
   (service_roles:<service user role>)"

where "service_role" is the role which is created for the service user and assigned to trusted services.

To use the Image Conversion Plugin, the following configuration is required.

You will need to configure 'glance-image-import.conf' file as shown below:

[image_import_opts]
image_import_plugins = ['image_conversion']

[image_conversion]
output_format = raw

Note

The default output format is raw in which case there is no need to have 'image_conversion' section and its 'output_format' defined in the config file.

The input format needs to be one of the qemu-img supported ones for this feature to work. In case of qemu-img call failing on the source image the import process will fail if 'image_conversion' plugin is enabled.

Note

image_import_plugins config option is a list and multiple plugins can be enabled for the import flow. The plugins are not run in parallel. One can enable multiple plugins by configuring them in the glance-image-import.conf for example as following:

[image_import_opts]
image_import_plugins = ['inject_image_metadata', 'image_conversion']

[inject_metadata_properties]
ignore_user_roles = admin,...
inject = "property1":"value1","property2":"value2",...

[image_conversion]
output_format = raw

The Image Decompression

release introduced Ussuri (Glance 20.0.0)
configuration file glance-image-import.conf

This plugin implements automated image decompression for Interoperable Image Import. One use case for this plugin would be environments where user or operator wants to use 'web-download' method and the image provider supplies only compressed images.

Note

This plugin may only be used as part of the interoperable image import workflow (POST v2/images/{image_id}/import). It has no effect on the image data upload call (PUT v2/images/{image_id}/file).

You can guarantee that your end users must use interoperable image import by restricting the upload_image policy appropriately in the Glance policy.yaml file. By default, this policy is unrestricted (that is, any authorized user may make the image upload call).

For example, to allow only admin or service users to make the image upload call, the policy could be restricted as follows:

"upload_image": "role:admin or (service_user_id:<uuid of nova user>) or
(service_roles:<service user role>)"

where "service_role" is the role which is created for the service user and assigned to trusted services.

Note

The plugin will not decompressed images which container_format is set to 'compressed' to maintain the original intent of the image creator.

To use the Image Decompression Plugin, the following configuration is required.

You will need to add "image_decompression" to 'glance-image-import.conf' file as shown below:

[image_import_opts]
image_import_plugins = ['image_decompression']

Note

The supported archive types for Image Decompression are zip, lha/lzh and gzip. Currently the plugin does not support multi-layered archives (like tar.gz). Lha/lzh is only supported in case python3 lhafile dependency library is installed, absence of this dependency will fail the import job where lha file is provided. (In this case we know it won't be bootable as the image is compressed and we do not have means to decompress it.)

Note

image_import_plugins config option is a list and multiple plugins can be enabled for the import flow. The plugins are not run in parallel. One can enable multiple plugins by configuring them in the glance-image-import.conf for example as following:

[image_import_opts]
image_import_plugins = ['image_decompression', 'image_conversion']

[image_conversion]
output_format = raw

If Image Conversion is used together, decompression must happen first, this is ensured by ordering the plugins.