diff --git a/doc/source/index.rst b/doc/source/index.rst
index af401a63..a0053073 100644
--- a/doc/source/index.rst
+++ b/doc/source/index.rst
@@ -48,6 +48,7 @@ User guide
    userdoc/vanilla_plugin
    userdoc/hdp_plugin
    userdoc/spark_plugin
+   userdoc/cdh_plugin
 
 **Elastic Data Processing**
 
diff --git a/doc/source/userdoc/cdh_imagebuilder.rst b/doc/source/userdoc/cdh_imagebuilder.rst
new file mode 100644
index 00000000..ad300adb
--- /dev/null
+++ b/doc/source/userdoc/cdh_imagebuilder.rst
@@ -0,0 +1,74 @@
+.. _diskimage-builder-label:
+
+Building Images for Cloudera Plugin
+===================================
+
+In this document you will find instructions on how to build Ubuntu and CentOS
+images with Cloudera Express version 5.2.0.
+
+To simplify the task of building such images we use
+`Disk Image Builder `_.
+
+`Disk Image Builder` builds disk images using elements. An element is a
+particular set of code that alters how the image is built, or runs within the
+chroot to prepare the image.
+
+Elements for building Cloudera images are stored in the
+`Sahara extra repository `_.
+
+.. note::
+
+    Sahara requires images with the cloud-init package installed:
+
+    * `For CentOS `_
+    * `For Ubuntu `_
+
+To create Cloudera images, follow these steps:
+
+1. Clone the repository "https://github.com/openstack/sahara-image-elements" locally.
+
+2. Run the diskimage-create.sh script.
+
+   You can run the script diskimage-create.sh in any directory (for example, in
+   your home directory). By default this script will attempt to create cloud
+   images for all versions of supported plugins and all operating systems
+   (subset of Ubuntu, Fedora, and CentOS depending on plugin). To create only
+   Cloudera images, use the "-p cloudera" parameter on the command line. To
+   create images for a specific operating system only, use the
+   "-i ubuntu|fedora|centos" parameter to select the operating system.
+   This script must be run with root privileges. The following example
+   creates Cloudera images for both Ubuntu and CentOS.
+
+   .. sourcecode:: console
+
+       sudo bash diskimage-create.sh -p cloudera
+
+   NOTE: If you don't want to use the default values, explicitly set the
+   values of the parameters you need.
+
+   The script will create required cloud images using image elements that install
+   all the necessary packages and configure them. You will find the created
+   images in the current directory.
+
+.. note::
+
+    Disk Image Builder will generate QCOW2 images, used with the default
+    OpenStack Qemu/KVM hypervisors. If your OpenStack uses a different
+    hypervisor, the generated image should be converted to an appropriate
+    format.
+
+    The VMware Nova backend requires the VMDK image format. You may use the
+    qemu-img utility to convert a QCOW2 image to VMDK.
+
+    .. sourcecode:: console
+
+        qemu-img convert -O vmdk .qcow2 .vmdk
+
+
+For finer control of diskimage-create.sh see the `official documentation
+`_
+or run:
+
+.. sourcecode:: console
+
+    $ diskimage-create.sh -h
diff --git a/doc/source/userdoc/cdh_plugin.rst b/doc/source/userdoc/cdh_plugin.rst
new file mode 100644
index 00000000..2194aebe
--- /dev/null
+++ b/doc/source/userdoc/cdh_plugin.rst
@@ -0,0 +1,60 @@
+Cloudera Plugin
+===============
+
+The cloudera plugin is a Sahara plugin which allows the user to deploy and
+operate a cluster with Cloudera Manager.
+
+The cloudera plugin is not enabled in Sahara by default. To enable it you should
+manually modify the Sahara configuration file (default /etc/sahara/sahara.conf)
+to add "cdh" to the "plugins" value.
+
+.. sourcecode:: cfg
+
+    plugins=cdh,vanilla,hdp,fake
+
+To use the cloudera plugin, you should have cm_api (version >=8.0.0) installed
+on the server where Sahara is running. To install cm_api, simply use pip:
+
+.. sourcecode:: console
+
+    sudo pip install cm_api
+
+You need to build images using :doc:`cdh_imagebuilder` to produce images used
+to provision the cluster. These images already have Cloudera Express 5.2.0 installed.
+
+The cloudera plugin requires an image to be tagged in the Sahara Image Registry
+with two tags: 'cdh' and '' (e.g. '5').
+
+The default username specified for these images is different for each
+distribution:
+
++--------------+------------+
+| OS           | username   |
++==============+============+
+| Ubuntu 12.04 | ubuntu     |
++--------------+------------+
+| CentOS 6.5   | cloud-user |
++--------------+------------+
+
+
+Cluster Validation
+------------------
+
+When the user creates or scales a Hadoop cluster using the cloudera plugin, the
+cluster topology requested by the user is verified for consistency.
+
+The cloudera plugin imposes the following constraints on the cluster
+topology:
+
+  + Cluster must contain exactly one manager.
+  + Cluster must contain exactly one namenode.
+  + Cluster must contain exactly one secondarynamenode.
+  + Cluster can contain at most one resourcemanager and this process is also
+    required by nodemanager.
+  + Cluster can contain at most one jobhistory and this process is also
+    required for resourcemanager.
+  + Cluster can contain at most one oozie and this process is also required
+    for EDP.
+  + Cluster can't contain oozie without datanode.
+  + Cluster can't contain oozie without nodemanager.
+  + Cluster can't contain oozie without jobhistory.
diff --git a/doc/source/userdoc/guest-requirements.rst b/doc/source/userdoc/guest-requirements.rst
index 76203592..e434ecb4 100644
--- a/doc/source/userdoc/guest-requirements.rst
+++ b/doc/source/userdoc/guest-requirements.rst
@@ -38,3 +38,11 @@ HDP Plugin
 This plugin does not have any additional requirements. Currently, only the CentOS Linux distribution is supported but other distributions will be supported in the future.
 To speed up provisioning, the HDP packages can be pre-installed on the image used. The packages' versions depend on the HDP version being used.
 
+Cloudera Plugin Requirements
+----------------------------
+
+If the Cloudera plugin is used for cluster deployment, the guest is required to have:
+
+* Cloudera Express installed
+
+See :doc:`cdh_imagebuilder` for instructions on building images for this plugin.
diff --git a/doc/source/userdoc/plugins.rst b/doc/source/userdoc/plugins.rst
index fb8c9bd7..d9397099 100644
--- a/doc/source/userdoc/plugins.rst
+++ b/doc/source/userdoc/plugins.rst
@@ -9,3 +9,4 @@ distribution in various topologies and with management/monitoring tools.
 * :doc:`hdp_plugin` - deploys Hortonworks Data Platform
 * :doc:`spark_plugin` - deploys Apache Spark with Cloudera HDFS
 * :doc:`mapr_plugin` - deploys MapR plugin with MapR File System
+* :doc:`cdh_plugin` - deploys Cloudera Hadoop