Merge "Amendment of the agent http provisioning spec"

Zuul 2018-08-30 19:35:38 +00:00 committed by Gerrit Code Review
commit 1c16f83fdc
1 changed files with 107 additions and 34 deletions


@ -11,8 +11,8 @@ The direct deploy interface provisioning with HTTP server
https://storyboard.openstack.org/#!/story/1598852 https://storyboard.openstack.org/#!/story/1598852
This spec proposes a mechanism to provision baremetal nodes by hosting custom This spec proposes a mechanism to provision baremetal nodes by hosting custom
HTTP service as an image source provider to ``direct`` deploy interface, when HTTP service as an image source provider to the ``direct`` deploy interface,
the Image service is utilized. when the Image service is utilized.
Problem description Problem description
=================== ===================
@ -30,7 +30,7 @@ The problem is the Object Storage service is not always adopted in a
deployment due to various reasons, and itself imposes restrictions on deployment due to various reasons, and itself imposes restrictions on
deployment. E.g.: deployment. E.g.:
* It profits low for a small cloud but takes more hardware resource. * It has little benefit for a small cloud but takes more hardware resource.
* It requires baremetal nodes to have access to control plane network, which * It requires baremetal nodes to have access to control plane network, which
is a restriction to network topology. is a restriction to network topology.
* It requires the Image service be configured with a backend of swift, which * It requires the Image service be configured with a backend of swift, which
@ -49,46 +49,104 @@ work.
Currently there are two scenarios, if the ``instance_info['image_source']`` Currently there are two scenarios, if the ``instance_info['image_source']``
indicates it's a glance image, the ``direct`` deploy interface generates indicates it's a glance image, the ``direct`` deploy interface generates
tempurl via glance client, and stores it to ``instance_info['image_url']``, tempurl via glance client, and stores it to ``instance_info['image_url']``,
otherwise it will be used directly as ``image_url``. Typically the two otherwise it will be directly taken as ``image_url``. The two cases typically
cases represent using the Bare Metal service in the cloud or as a standalone represent using the Bare Metal service in the cloud or as a standalone
service, respectively. service, respectively.
Introduces a new string option ``[agent]image_download_source`` to control The proposal introduces a new string option ``[agent]image_download_source``
which kind of image URL will be generated when the ``image_source`` is a to control which kind of image URL will be generated when the ``image_source``
glance image. Allowed values are ``swift`` and ``http``, defaults to ``swift``. is a glance image. Allowed values are ``swift`` and ``http``, defaults to
``swift``.
The process of the ``direct`` deploy interface on different configurations The process of the ``direct`` deploy interface on different configurations
is defined as: is defined as:
* ``swift``: keeps current logic, generates tempurl and update it to * ``swift``: Keeps current logic, generates tempurl and update it to
``instance_info['image_url']``. ``instance_info['image_url']``.
* ``http``: downloads instance image via ``ImageCache`` before node * ``http``: Downloads instance image via ``InstanceImageCache`` before node
deployment, makes the cached image accessible by local HTTP service, deployment, creates symbolic link to downloaded instance image in the
generates proper URL and updates it to ``instance_info['image_url']``. directory accessible by local HTTP service, generates proper URL and updates
it to ``instance_info['image_url']``.
The existing ``[deploy]http_root`` and ``[deploy]http_url`` are reused for The existing ``[deploy]http_root`` is reused for storing symbolic links to
storing instance image symlinks and generating instance image URLs. A new downloaded instance images. A new string option ``[deploy]http_image_subdir``
string option ``[deploy]http_image_path`` is introduced to keep it isolated is introduced to keep it isolated with iPXE related scripts. The default value
with iPXE related scripts. The default value is ``agent_images``. is ``agent_images``. The existing ``[deploy]http_url`` is reused to generate
instance image URLs.
The ``direct`` deploy interface will use the same instance cache for image The ``direct`` deploy interface will use the same instance cache for image
caching, this will be performed at ``AgentDeploy.deploy``. After an instance caching, the caching will be performed at ``AgentDeploy.deploy``. After an
image is cached, the ``direct`` deploy interface creates a soft symlink at instance image is cached, the ``direct`` deploy interface creates a symbolic
``<http_root>/<http_image_path>`` to reference the instance image. It will be link at ``<http_root>/<http_image_subdir>`` to reference the instance image.
``/httpboot/agent_images/<node-uuid>`` if all goes to default. It will be ``/httpboot/agent_images/<node-uuid>`` if all goes to default.
The ``direct`` deploy interface generates URL for the instance image and The ``direct`` deploy interface generates URL for the instance image and
updates it to ``instance_info`` at ``AgentDeploy.prepare``. The corresponding updates it to ``instance_info`` at ``AgentDeploy.prepare``. The corresponding
image URL will be ``<http_url>/<http_image_path>/<node-uuid>``. If image URL will be ``<http_url>/<http_image_subdir>/<node-uuid>``. The symbolic
``[DEFAULT]force_raw_images`` is set to true, checksum will be recalculated link will be removed at ``AgentDeploy.deploy`` when a node deploy is done, or
and updated as well. It is highly encouraged to set it false for better ``AgentDeploy.clean_up`` when a node is torn down from the state
performance.
The symbolic link will be removed at ``AgentDeploy.deploy`` when a node deploy
is done, or ``AgentDeploy.clean_up`` when a node is teared down from the state
``deploy failed``. ``deploy failed``.
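The symlink and URL handling described above can be sketched roughly as follows. This is a minimal illustration, not the ironic implementation: the helper names ``publish_image`` and ``unpublish_image`` are hypothetical, and the arguments simply mirror ``[deploy]http_root``, ``[deploy]http_url`` and ``[deploy]http_image_subdir``.

```python
import os


def publish_image(cached_image, http_root, http_url, node_uuid,
                  http_image_subdir="agent_images"):
    """Hypothetical helper: expose a cached image through the HTTP root.

    Creates <http_root>/<http_image_subdir>/<node_uuid> as a symbolic link
    to the cached image and returns the link path and the matching URL.
    """
    image_dir = os.path.join(http_root, http_image_subdir)
    os.makedirs(image_dir, exist_ok=True)
    link_path = os.path.join(image_dir, node_uuid)
    # Replace a stale link left behind by an earlier deployment.
    if os.path.lexists(link_path):
        os.remove(link_path)
    os.symlink(cached_image, link_path)
    url = "/".join([http_url.rstrip("/"), http_image_subdir, node_uuid])
    return link_path, url


def unpublish_image(http_root, node_uuid, http_image_subdir="agent_images"):
    """Hypothetical helper: remove the link after deploy or clean up."""
    link_path = os.path.join(http_root, http_image_subdir, node_uuid)
    if os.path.lexists(link_path):
        os.remove(link_path)
```

Keying the link on the node UUID means each node has at most one published image at a time, which is why removing the link by node is enough at clean up.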
Rule to convert image
---------------------
Currently the ``iscsi`` deploy interface converts the image to ``raw`` if
``[DEFAULT]force_raw_images`` is set to True.
IPA, in contrast, treats the instance image in two different ways:
* If the instance image format is ``raw``, ``stream_raw_images`` is True and
the image type is whole disk image, the image will be streamed into the target
disk of the bare metal node.
* Otherwise the image will be cached into memory before being written to disk.
To avoid a raw image being cached into the memory of the bare metal node, the
``direct`` deploy interface will convert the image to raw only if all of the
following criteria are met:
* ``[DEFAULT]force_raw_images`` is set to True,
* ``[agent]stream_raw_images`` is set to True,
* The instance image type is a whole disk image.
The ``direct`` deploy interface will recalculate the MD5 checksum and update
the necessary fields in ``instance_info`` if image conversion happened.
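The three criteria above amount to a single conjunction. A minimal sketch, assuming the option values and the image type have already been read into plain booleans (the function name ``should_convert_to_raw`` is hypothetical and not taken from the ironic code base):

```python
def should_convert_to_raw(force_raw_images, stream_raw_images,
                          is_whole_disk_image):
    """Hypothetical predicate for the ``direct`` deploy interface.

    Conversion to raw is only worthwhile when IPA can stream the result,
    which requires all three conditions from the spec to hold.
    """
    return bool(force_raw_images and stream_raw_images
                and is_whole_disk_image)
```

For example, a partition image is never converted under this rule, even when ``[DEFAULT]force_raw_images`` is True, because IPA would have to cache the raw image in memory.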
Cache sharing
-------------
The ``iscsi`` and ``direct`` deploy interfaces share the same cache,
but apply different rules to decide whether the image should be converted to
raw. This leads to a cache compatibility issue when both interfaces are in use.
As an example, suppose we deploy node A (using ``iscsi``) with a partition
image, then deploy node B (using ``direct``) with the same image. The image in
the cache has already been converted to raw. According to its own rule, the
``direct`` deploy interface assumes the image will not be converted, so it
passes ``force_raw`` as false to the image cache. Because of the cache hit, no
image action is performed, which leads to the ``direct`` deploy interface
serving a raw image without MD5 recalculation.
Conversely, if we reverse the order above, the ``iscsi`` deploy interface may
get a qcow image with ``[DEFAULT]force_raw_images`` set to true. This is
probably not an issue, because ``populate_image`` checks the image format
before writing, but the behavior is still inconsistent.
To address the issue described above, this spec proposes to update
``ImageCache.fetch_image`` to take the input argument ``force_raw`` into
account for the master image file name:
* The master file name is not changed if ``force_raw`` is set to ``False``.
* The master file name will have ``.converted`` as file extension if
``force_raw`` is set to ``True``, e.g.::
/var/lib/ironic/master_images/6e2c5132-db24-4e0d-b612-478c3539da1e.converted
Note that the ``.converted`` extension merely acts as an indicator that the
downloaded image has gone through the conversion logic. For an image that is
already raw in glance, the master image file name still has ``.converted`` as
long as the ``force_raw`` argument passed in is True.
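The naming rule can be illustrated with a short sketch. The helper ``master_image_path`` is hypothetical; it only shows how the ``force_raw`` argument would select between the two file names, not how ``ImageCache.fetch_image`` is actually written.

```python
import os


def master_image_path(master_dir, image_uuid, force_raw):
    """Hypothetical helper: build the master image path for a cache entry.

    The ``.converted`` extension records that the image was requested with
    force_raw=True, so converted and unconverted copies never collide.
    """
    name = image_uuid + ".converted" if force_raw else image_uuid
    return os.path.join(master_dir, name)
```

With this scheme the ``iscsi`` and ``direct`` deploy interfaces can request the same glance image with different ``force_raw`` values and each gets its own cache entry, at the cost of possibly storing the image twice.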
Alternatives Alternatives
------------ ------------
@ -178,19 +236,21 @@ Scalability impact
Instance images will be cached on the ironic conductor node once the Instance images will be cached on the ironic conductor node once the
``[agent]image_download_source`` is set to ``http``, it will cost more ``[agent]image_download_source`` is set to ``http``, it will cost more
disk space if the conductor node is using ``direct`` deploy interface before. disk space if the conductor node is using ``direct`` deploy interface before.
The expected space usage basically should be the same with ``iscsi`` The expected space usage basically should be no more than ``iscsi`` deploy
deploy interface. interface.
IPA downloads instance image directly from the conductor node, which will IPA downloads instance image directly from the conductor node, which will
reduce traffic on the control plane network, by the cost of increasing traffic reduce traffic on the control plane network, by the cost of increasing traffic
on each conductor node. Substantially the consumption should be equivalent on each conductor node. The consumption should be no more than ``iscsi`` deploy
with the ``iscsi`` deploy interface if ``[DEFAULT]force_raw_images`` is set to interface.
true.
Performance Impact Performance Impact
------------------ ------------------
None Depending on the hardware and image type, recalculating MD5 checksum for a raw
image could consume a considerable amount of CPU/IO resources. If performance
on the ironic conductor node is a concern, set
``[DEFAULT]force_raw_images`` to ``False`` (the option is ``True`` by default).
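To give a sense of where the cost comes from, recalculating the checksum means reading the whole converted image again. A minimal sketch using only the Python standard library (the function name ``md5_of_file`` and the 1 MiB chunk size are illustrative assumptions, not ironic's implementation):

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Hypothetical helper: compute the MD5 hex digest of a file.

    The file is read in fixed-size chunks so that a multi-gigabyte raw
    image never has to fit in memory; every byte is still read once.
    """
    digest = hashlib.md5()
    with open(path, "rb") as image:
        for chunk in iter(lambda: image.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Since a raw image is typically much larger than its qcow2 source, the read and hash work scales with the virtual disk size rather than the download size.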
Other deployer impact Other deployer impact
--------------------- ---------------------
@ -198,6 +258,10 @@ Other deployer impact
When using this feature, an HTTP server should be set up and configured on When using this feature, an HTTP server should be set up and configured on
each ironic conductor node. each ironic conductor node.
Each HTTP server should be configured to follow symlinks so that instance
images are accessible to external requests. Refer to ``FollowSymLinks`` if the
Apache HTTP server is used, or ``disable_symlinks`` if the Nginx HTTP server is
used.
Developer impact Developer impact
---------------- ----------------
@ -212,6 +276,9 @@ Assignee(s)
Primary assignee: Primary assignee:
kaifeng kaifeng
Other contributors:
sambetts
Work Items Work Items
---------- ----------
@ -235,11 +302,17 @@ Upgrades and Backwards Compatibility
==================================== ====================================
Two new options ``[agent]image_download_source`` and Two new options ``[agent]image_download_source`` and
``[deploy]http_image_path`` are introduced in this feature. ``[deploy]http_image_subdir`` are introduced in this feature.
``[agent]image_download_source`` defaults to ``swift``, which should have no ``[agent]image_download_source`` defaults to ``swift``, which should have no
impact on upgrades. impact on upgrades.
The change to the cache file naming could invalidate some cached instance
images after upgrades. They will be re-cached when used, and images that are
no longer referenced will be cleaned up eventually. This will have no impact
if caching is disabled before upgrade.
Documentation Impact Documentation Impact
==================== ====================