Add CDH plugin documents

We added documents on how to deploy and enable the CDH plugin in Sahara,
and how to build images for the CDH plugin. We also modified some other
pages to add related content and links.

Closes-Bug: #1395583

Change-Id: If1451eaf4a570165a87313dab347422480546f0c
Ken Chen 2014-11-24 15:05:38 +08:00
parent 8e7f7e9d65
commit ffa9f135b7
5 changed files with 144 additions and 0 deletions


@ -48,6 +48,7 @@ User guide
userdoc/vanilla_plugin
userdoc/hdp_plugin
userdoc/spark_plugin
userdoc/cdh_plugin
**Elastic Data Processing**


@ -0,0 +1,74 @@
.. _diskimage-builder-label:
Building Images for Cloudera Plugin
===================================
In this document you will find instructions on how to build Ubuntu and CentOS
images with Cloudera Express version 5.2.0. To simplify the task of building
such images we use
`Disk Image Builder <https://github.com/openstack/diskimage-builder>`_.
`Disk Image Builder` builds disk images using elements. An element is a
particular set of code that alters how the image is built, or runs within the
chroot to prepare the image.
Elements for building Cloudera images are stored in
`Sahara extra repository <https://github.com/openstack/sahara-image-elements>`_.
.. note::
Sahara requires images with cloud-init package installed:
* `For CentOS <http://mirror.centos.org/centos/6/extras/x86_64/Packages/cloud-init-0.7.5-10.el6.centos.2.x86_64.rpm>`_
* `For Ubuntu <http://packages.ubuntu.com/precise/cloud-init>`_
To create Cloudera images, follow these steps:
1. Clone repository "https://github.com/openstack/sahara-image-elements" locally.
2. Run the diskimage-create.sh script.
You can run the script diskimage-create.sh in any directory (for example, in
your home directory). By default this script will attempt to create cloud
images for all versions of supported plugins and all operating systems
(subset of Ubuntu, Fedora, and CentOS depending on plugin). To only create
Cloudera images, use the "-p cloudera" parameter on the command
line. To create the image for a specific operating system only,
use the "-i ubuntu|fedora|centos" parameter to select the operating
system. This script must be run with root privileges. Below is an example that
creates Cloudera images for both Ubuntu and CentOS.
.. sourcecode:: console
sudo bash diskimage-create.sh -p cloudera
NOTE: If you don't want to use the default values, explicitly set the
parameters you need.
The script will create required cloud images using image elements that install
all the necessary packages and configure them. You will find the created
images in the current directory.
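
As a minimal sketch, the command line described above can be assembled
programmatically. The helper name ``build_command`` and the ``SUPPORTED_OS``
tuple are hypothetical and only illustrate the "-p" and "-i" parameters
mentioned in this document. The sketch only builds the argument list and does
not run the script.

.. sourcecode:: python

    # Hypothetical helper: builds the argument list for diskimage-create.sh.
    # Only the "-p" and "-i" parameters described in this document are used.
    SUPPORTED_OS = ("ubuntu", "fedora", "centos")


    def build_command(os_name=None):
        """Return the argv list that builds Cloudera images.

        With os_name=None the script builds images for every operating
        system supported by the plugin.
        """
        cmd = ["sudo", "bash", "diskimage-create.sh", "-p", "cloudera"]
        if os_name is not None:
            if os_name not in SUPPORTED_OS:
                raise ValueError("unsupported operating system: %s" % os_name)
            cmd += ["-i", os_name]
        return cmd


    if __name__ == "__main__":
        print(" ".join(build_command("centos")))

The resulting list could be passed to ``subprocess.run`` from the directory
that contains the script.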
.. note::
Disk Image Builder will generate QCOW2 images, used with the default
OpenStack Qemu/KVM hypervisors. If your OpenStack uses a different
hypervisor, the generated image should be converted to an appropriate
format.
The VMware Nova backend requires the VMDK image format. You may use the
qemu-img utility to convert a QCOW2 image to VMDK.
.. sourcecode:: console
qemu-img convert -O vmdk <original_image>.qcow2 <converted_image>.vmdk
For finer control of diskimage-create.sh see the `official documentation
<https://github.com/openstack/sahara-image-elements/blob/master/diskimage-create/README.rst>`_
or run:
.. sourcecode:: console
$ diskimage-create.sh -h


@ -0,0 +1,60 @@
Cloudera Plugin
===============
The cloudera plugin is a Sahara plugin which allows the user to deploy and
operate a cluster with Cloudera Manager.
The cloudera plugin is not enabled in Sahara by default. To enable it, edit
the Sahara configuration file (by default /etc/sahara/sahara.conf)
and add "cdh" to the "plugins" value.
.. sourcecode:: cfg
plugins=cdh,vanilla,hdp,fake
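
The following is a minimal sketch of how one might check that "cdh" is listed
in the "plugins" value. It assumes the option lives in a ``[DEFAULT]`` section,
which this document does not state. The helper name ``enabled_plugins`` and the
sample text are hypothetical.

.. sourcecode:: python

    import configparser

    # Assumption: the "plugins" option is in the [DEFAULT] section.
    SAMPLE_CONF = """\
    [DEFAULT]
    plugins=cdh,vanilla,hdp,fake
    """.replace("    ", "")


    def enabled_plugins(conf_text):
        """Return the plugin names listed in the 'plugins' option."""
        parser = configparser.ConfigParser()
        parser.read_string(conf_text)
        raw = parser.get("DEFAULT", "plugins", fallback="")
        # Split the comma-separated value and drop empty entries.
        return [name.strip() for name in raw.split(",") if name.strip()]


    if __name__ == "__main__":
        print("cdh" in enabled_plugins(SAMPLE_CONF))

To check a real installation, read the text of the configuration file and pass
it to the helper instead of the sample string.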
To use the cloudera plugin, you should have cm_api (version >=8.0.0) installed
on the server where Sahara is running. To install cm_api, simply use pip:
.. sourcecode:: console
sudo pip install cm_api
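
As a rough sketch, the minimum version requirement can be checked by comparing
version tuples. The helper ``meets_minimum`` is hypothetical and assumes a
plain dotted numeric version string such as "8.0.0"; it does not handle
suffixes like "rc1".

.. sourcecode:: python

    def meets_minimum(version, minimum=(8, 0, 0)):
        """Return True if a dotted numeric version is at least `minimum`."""
        parts = tuple(int(piece) for piece in version.split("."))
        # Pad short versions such as "8" or "8.0" with zeros before comparing.
        parts += (0,) * (len(minimum) - len(parts))
        return parts >= minimum


    if __name__ == "__main__":
        print(meets_minimum("8.0.0"))

The version string of the installed package would have to be obtained
separately, for example from the output of ``pip show cm_api``.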
Build the images used to provision clusters by following
:doc:`cdh_imagebuilder`. These images have Cloudera Express 5.2.0 pre-installed.
The cloudera plugin requires an image to be tagged in the Sahara Image Registry
with two tags: 'cdh' and '<cloudera version>' (e.g. '5').
The default username specified for these images is different for each
distribution:
+--------------+------------+
| OS | username |
+==============+============+
| Ubuntu 12.04 | ubuntu |
+--------------+------------+
| CentOS 6.5 | cloud-user |
+--------------+------------+
Cluster Validation
------------------
When the user creates or scales a Hadoop cluster using the cloudera plugin, the
cluster topology requested by the user is verified for consistency.
The cloudera plugin imposes the following restrictions on the cluster
topology:
+ Cluster must contain exactly one manager.
+ Cluster must contain exactly one namenode.
+ Cluster must contain exactly one secondarynamenode.
+ Cluster can contain at most one resourcemanager and this process is also
required by nodemanager.
+ Cluster can contain at most one jobhistory and this process is also
required for resourcemanager.
+ Cluster can contain at most one oozie and this process is also required
for EDP.
+ Cluster can't contain oozie without datanode.
+ Cluster can't contain oozie without nodemanager.
+ Cluster can't contain oozie without jobhistory.
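
The rules above can be sketched as a small check over process counts. This is
an illustrative sketch only, not the plugin's actual validation code: the
function ``validate_topology`` is hypothetical, and the process names are taken
from the list above rather than from the plugin's internal identifiers.

.. sourcecode:: python

    def validate_topology(counts):
        """Return a list of rule violations for a {process: count} mapping."""

        def n(process):
            return counts.get(process, 0)

        errors = []

        # Processes that must appear exactly once.
        for process in ("manager", "namenode", "secondarynamenode"):
            if n(process) != 1:
                errors.append("exactly one %s required, found %d"
                              % (process, n(process)))

        # Processes that may appear at most once.
        for process in ("resourcemanager", "jobhistory", "oozie"):
            if n(process) > 1:
                errors.append("at most one %s allowed, found %d"
                              % (process, n(process)))

        # (process, dependency): the process needs the dependency to exist.
        requires = [
            ("nodemanager", "resourcemanager"),
            ("resourcemanager", "jobhistory"),
            ("oozie", "datanode"),
            ("oozie", "nodemanager"),
            ("oozie", "jobhistory"),
        ]
        for process, dependency in requires:
            if n(process) > 0 and n(dependency) == 0:
                errors.append("%s requires %s" % (process, dependency))

        return errors


    if __name__ == "__main__":
        topology = {"manager": 1, "namenode": 1, "secondarynamenode": 1,
                    "datanode": 3, "oozie": 1}
        for problem in validate_topology(topology):
            print(problem)

An empty list means that the requested topology satisfies every rule in the
list.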


@ -38,3 +38,11 @@ HDP Plugin
This plugin does not have any additional requirements. Currently, only the CentOS Linux distribution is supported but other distributions will be supported in the future.
To speed up provisioning, the HDP packages can be pre-installed on the image used. The packages' versions depend on the HDP version being used.
Cloudera Plugin Requirements
----------------------------
If the Cloudera Plugin is used for cluster deployment, the guest is required to have:
* Cloudera Express installed
See :doc:`cdh_imagebuilder` for instructions on building images for this plugin.


@ -9,3 +9,4 @@ distribution in various topologies and with management/monitoring tools.
* :doc:`hdp_plugin` - deploys Hortonworks Data Platform
* :doc:`spark_plugin` - deploys Apache Spark with Cloudera HDFS
* :doc:`mapr_plugin` - deploys MapR plugin with MapR File System
* :doc:`cdh_plugin` - deploys Cloudera Hadoop