
doc: restructure the image building documentation

Main goal: consolidate the information about image
building under the same documentation page, and move
plugin-specific details inside plugin pages.
No plugin-specific information should live outside
those pages.

More details:
- move the detailed documentation about sahara-image-pack
  from the contributor guide to the new dedicated page
  in the user manual;
- remove the vanilla and cdh pages which describe building
  images with sahara-image-create, and move the common
  information to the new sahara-image-create page
  in the user manual;
- add the matrix of supported plugin versions and
  supported building technology for each plugin inside
  the respective <plugin>-plugin.rst;
- add the redirects for the removed pages (only for master
  and rocky, where this change should be backported);
- remove a few details not really needed (e.g. how to convert
  to VMDK images, location of cloud-init packages, etc.,
  which do not really belong here).

Change-Id: I8398a7ad625276d8f11d743688ba71902a7e1adc
Luigi Toscano 3 years ago


@@ -1,94 +0,0 @@
.. _diskimage-builder-label:
Building Images for Vanilla Plugin
In this document you will find instructions on how to build Ubuntu, Fedora, and
CentOS images with Apache Hadoop version 2.x.x.
As of now the vanilla plugin works with images with pre-installed versions of
Apache Hadoop. To simplify the task of building such images we use
`Disk Image Builder <>`_.
`Disk Image Builder` builds disk images using elements. An element is a
particular set of code that alters how the image is built, or runs within the
chroot to prepare the image.
Elements for building vanilla images are stored in the
`Sahara image elements repository <>`_
.. note::

    Sahara requires images with the cloud-init package installed:

    * `For Ubuntu 16.04 <>`_
    * `For CentOS 7 <>`_
    * `For Fedora <>`_

To create vanilla images, follow these steps:

1. Clone repository ""
2. Use tox to build images.
You can run the command below in sahara-image-elements
directory to build images. By default this script will attempt to create
cloud images for all versions of supported plugins and all operating systems
(subset of Ubuntu, Fedora, and CentOS depending on plugin).
.. sourcecode:: console

    tox -e venv -- sahara-image-create -u

To build a Vanilla 2.7.1 image with CentOS 7, run:

.. sourcecode:: console

    tox -e venv -- sahara-image-create -p vanilla -v 2.7.1 -i centos7

Tox will create a virtualenv and install the required Python packages in it,
clone the repositories "" and
"" and export the necessary parameters:
* ``DIB_HADOOP_VERSION`` - version of Hadoop to install
* ``JAVA_DOWNLOAD_URL`` - download link for JDK (tarball or bin)
* ``OOZIE_DOWNLOAD_URL`` - download link for OOZIE (we have built
Oozie libs here: ````)
* ``SPARK_DOWNLOAD_URL`` - download link for Spark
* ``HIVE_VERSION`` - version of Hive to install
(currently supports only 0.11.0)
* ``ubuntu_image_name``
* ``fedora_image_name``
* ``DIB_IMAGE_SIZE`` - size of the instance's hard disk. You need to
specify it only for Fedora, because Fedora doesn't use all of the
available volume
* ``DIB_COMMIT_ID`` - latest commit id of diskimage-builder project
* ``SAHARA_ELEMENTS_COMMIT_ID`` - latest commit id of
sahara-image-elements project
NOTE: If you don't want to use the default values, set your own values
for these parameters.
Then it will create required cloud images using image elements that install
all the necessary packages and configure them. You will find created images
in the parent directory.
.. note::

    Disk Image Builder will generate QCOW2 images, used with the default
    OpenStack Qemu/KVM hypervisors. If your OpenStack uses a different
    hypervisor, the generated image should be converted to an appropriate
    format.

VMware Nova backend requires VMDK image format. You may use the qemu-img
utility to convert a QCOW2 image to VMDK.

.. sourcecode:: console

    qemu-img convert -O vmdk <original_image>.qcow2 <converted_image>.vmdk
For finer control of see the `official documentation


@@ -7,10 +7,43 @@ a cluster with Apache Hadoop.
Since the Newton release Spark is integrated into the Vanilla plugin so you
can launch Spark jobs on a Vanilla cluster.
For cluster provisioning prepared images should be used. They already have
Apache Hadoop 2.7.1 installed.
You may build images by yourself using :doc:`vanilla-imagebuilder`.
For cluster provisioning, prepared images should be used.
.. list-table:: Support matrix for the `vanilla` plugin
   :widths: 15 15 20 15 35
   :header-rows: 1

   * - Version
       (image tag)
     - Distribution
     - Build method
     - Version
       (build parameter)
     - Notes

   * - 2.8.2
     - Ubuntu 16.04, CentOS 7
     - sahara-image-create
     - 2.8.2
     - Hive 2.3.2, Oozie 4.3.0

   * - 2.7.5
     - Ubuntu 16.04, CentOS 7
     - sahara-image-create
     - 2.7.5
     - Hive 2.3.2, Oozie 4.3.0

   * - 2.7.1
     - Ubuntu 16.04, CentOS 7
     - sahara-image-create
     - 2.7.1
     - Hive 0.11.0, Oozie 4.2.0

For more information about building images, refer to
Vanilla plugin requires an image to be tagged in Sahara Image Registry with
two tags: 'vanilla' and '<hadoop version>' (e.g. '2.7.1').
@@ -18,6 +51,41 @@ two tags: 'vanilla' and '<hadoop version>' (e.g. '2.7.1').
The image requires a username. For more information, refer to the
:doc:`registering-image` section.
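As a rough sketch of the tagging requirement above, the snippet below prints
the commands that would register an image and attach the two required tags.
It assumes the ``openstack dataprocessing`` commands provided by
python-saharaclient are available; check the exact spelling against the
client documentation. The image name, the user name and the helper
``image_tags`` are hypothetical placeholders.

```shell
#!/bin/sh
# Hypothetical helper: prints the two tags the vanilla plugin expects,
# the plugin name followed by the Hadoop version.
image_tags() {
    hadoop_version="$1"
    echo "vanilla ${hadoop_version}"
}

# Placeholder values (assumptions); replace them with a real image and user.
IMAGE="my-vanilla-image"
IMAGE_USER="ubuntu"

# The commands are only printed, not executed, so they can be reviewed first.
echo "openstack dataprocessing image register ${IMAGE} --username ${IMAGE_USER}"
echo "openstack dataprocessing image tags add ${IMAGE} --tags $(image_tags 2.7.1)"
```

The image name is placed before ``--tags`` so that it cannot be mistaken for
one more tag value.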
Build settings
When ``sahara-image-create`` is used, you can override a few settings
by exporting the corresponding environment variables
before starting the build command:
* ``DIB_HADOOP_VERSION`` - version of Hadoop to install
* ``HIVE_VERSION`` - version of Hive to install
* ``OOZIE_DOWNLOAD_URL`` - download link for Oozie (we have built
Oozie libs here:
* ``SPARK_DOWNLOAD_URL`` - download link for Spark
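The override mechanism described in this list can be sketched as follows. The
snippet exports two of the variables and then prints the build command
instead of running it. The chosen values (Hadoop 2.7.5, Hive 2.3.2, a
``centos7`` image) are illustrative only and should be checked against the
support matrix; the helper ``print_build_command`` is hypothetical and not
part of sahara-image-elements.

```shell
#!/bin/sh
# Illustrative overrides; the values are assumptions, not recommendations.
export DIB_HADOOP_VERSION="2.7.5"
export HIVE_VERSION="2.3.2"

# Hypothetical helper: prints the build command rather than executing it,
# so the snippet can be inspected safely before a real build.
print_build_command() {
    plugin="$1"
    version="$2"
    image="$3"
    echo "tox -e venv -- sahara-image-create -p ${plugin} -v ${version} -i ${image}"
}

print_build_command vanilla "${DIB_HADOOP_VERSION}" centos7
```

To start a real build, run the printed command from the
sahara-image-elements directory in the same shell session, so that the
exported variables are visible to the build.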
Vanilla Plugin Requirements
The image building tools described in :ref:`building-guest-images-label`
add the required software to the image, and their usage is strongly suggested.
Nevertheless, the following software should be pre-loaded
on the guest image so that it can be used to create Vanilla clusters:
* ssh-client installed
* Java (version >= 7)
* Apache Hadoop installed
* 'hadoop' user created
See :doc:`hadoop-swift` for information on using Swift with your sahara cluster
(for EDP support Swift integration is currently required).
To support EDP, the following components must also be installed on the guest:
* Oozie version 4 or higher
* mysql/mariadb
* hive
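For a manually prepared image, the base requirements listed above could be
checked with a small script like the one below, run inside the guest. This is
a sketch under assumptions: it presumes that the ``ssh``, ``java`` and
``hadoop`` commands are on ``PATH`` (a Hadoop installation may live elsewhere),
it does not verify the Java version, and it leaves out the EDP components.
The helper names are hypothetical.

```shell
#!/bin/sh
# Returns 0 when the named command is found on PATH.
has_command() {
    command -v "$1" >/dev/null 2>&1
}

# Returns 0 when the named user account exists.
has_user() {
    id "$1" >/dev/null 2>&1
}

# Prints "ok: <label>" or "missing: <label>" depending on the check result.
# $1 is the label, the remaining arguments are the check to run.
report() {
    label="$1"
    shift
    if "$@"; then
        echo "ok: ${label}"
    else
        echo "missing: ${label}"
    fi
}

report "ssh client" has_command ssh
report "Java" has_command java
report "Apache Hadoop (hadoop command)" has_command hadoop
report "'hadoop' user" has_user hadoop
```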
Cluster Validation