Merge "Modify HDP plugin doc for Ambari plugin"

Jenkins 2016-06-06 11:54:22 +00:00 committed by Gerrit Code Review
commit aa5b77ae73
5 changed files with 69 additions and 129 deletions

Binary file not shown.

@ -49,7 +49,7 @@ User guide
userdoc/plugins
userdoc/vanilla_plugin
userdoc/hdp_plugin
userdoc/ambari_plugin
userdoc/spark_plugin
userdoc/cdh_plugin
userdoc/mapr_plugin

@ -0,0 +1,67 @@
Ambari Plugin
=============
The Ambari sahara plugin provides a way to provision
clusters with Hortonworks Data Platform (HDP) on OpenStack using templates in a
single click and in an easily repeatable fashion. The sahara controller serves
as the glue between Hadoop and OpenStack. The Ambari plugin mediates between
the sahara controller and Apache Ambari in order to deploy and configure Hadoop
on OpenStack. Core to the Ambari plugin is Apache Ambari,
which is used as the orchestrator for deploying HDP on OpenStack. The Ambari
plugin uses Ambari Blueprints for cluster provisioning.
Apache Ambari Blueprints
------------------------
Apache Ambari Blueprints is a portable document definition, which provides a
complete definition for an Apache Hadoop cluster, including cluster topology,
components, services and their configurations. Ambari Blueprints can be
consumed by the Ambari plugin to instantiate a Hadoop cluster on OpenStack. The
benefit of this approach is that Hadoop clusters can be
configured and deployed using an Ambari native format that works both inside
and outside of OpenStack, allowing clusters to be re-instantiated in a
variety of environments.
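To make the shape of such a document concrete, here is a minimal sketch of a blueprint built as a Python dictionary and serialized to JSON. The top-level keys (``Blueprints``, ``host_groups``, ``configurations``) follow the general Ambari Blueprint layout; the host group names, cardinalities and component placement are hypothetical and are not what the sahara plugin generates.

```python
import json

# Illustrative blueprint only: group names and component placement are
# assumptions made for this example, not output of the Ambari plugin.
blueprint = {
    "Blueprints": {"stack_name": "HDP", "stack_version": "2.3"},
    "host_groups": [
        {
            "name": "master",
            "cardinality": "1",
            "components": [
                {"name": "NAMENODE"},
                {"name": "RESOURCEMANAGER"},
                {"name": "ZOOKEEPER_SERVER"},
            ],
        },
        {
            "name": "worker",
            "cardinality": "3",
            "components": [{"name": "DATANODE"}, {"name": "NODEMANAGER"}],
        },
    ],
    "configurations": [],
}


def components_by_group(bp):
    """Map each host group name to the list of component names it runs."""
    return {
        group["name"]: [c["name"] for c in group["components"]]
        for group in bp["host_groups"]
    }


if __name__ == "__main__":
    print(json.dumps(blueprint, indent=2))
```

Because the document is plain JSON, the same blueprint can be stored, versioned and replayed against any Ambari server, which is what makes it portable.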
Images
------
The sahara Ambari plugin uses minimal (operating system only) images.
For more information about Ambari images, refer to
https://github.com/openstack/sahara-image-elements.
You can download well tested and up-to-date prepared images from
http://sahara-files.mirantis.com/images/upstream/
The Ambari plugin requires an image to be tagged in the sahara Image Registry
with two tags: 'ambari' and '<plugin version>' (e.g. '2.3').
You will also need to specify a username for the image in the Image Registry.
The username should be 'cloud-user' for CentOS 6.x images,
'centos' for CentOS 7 images and 'ubuntu' for Ubuntu images.
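The tagging rules above can be summarized in a small helper. This is only a sketch: the function name and the distro keys (``centos6``, ``centos7``, ``ubuntu``) are hypothetical labels for this example, and the function does not call the sahara API; it only computes the values you would enter in the Image Registry.

```python
# Usernames per image type, as listed in the text above. The dictionary
# keys are made-up labels for this example, not sahara identifiers.
IMAGE_USERNAMES = {
    "centos6": "cloud-user",
    "centos7": "centos",
    "ubuntu": "ubuntu",
}


def image_registry_entry(plugin_version, distro):
    """Return the tags and username to register for an Ambari image."""
    if distro not in IMAGE_USERNAMES:
        raise ValueError("unknown image type: %s" % distro)
    return {
        "tags": ["ambari", plugin_version],
        "username": IMAGE_USERNAMES[distro],
    }


if __name__ == "__main__":
    print(image_registry_entry("2.3", "centos7"))
```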
High Availability for HDFS and YARN
-----------------------------------
High Availability (using the Quorum Journal Manager) can be
deployed automatically with the Ambari plugin. You can deploy a highly available
cluster through the UI by selecting the ``NameNode HA`` and/or ``ResourceManager HA``
options in the general configs of the cluster template.
NameNode High Availability is deployed using 2 NameNodes, one active and
one standby. The NameNodes use a set of JournalNodes and ZooKeeper Servers to
ensure the necessary synchronization. For ResourceManager HA, 2
ResourceManagers must be enabled as well.
A typical highly available Ambari cluster uses 2 separate NameNodes, 2 separate
ResourceManagers, at least 3 JournalNodes and at least 3 ZooKeeper Servers.
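The topology described above can be expressed as a simple count check. The sketch below is an illustration of those numbers only; the function name, the process labels used as dictionary keys and the error messages are assumptions, and the plugin's real validation may differ.

```python
def check_ha_topology(counts, namenode_ha=False, resourcemanager_ha=False):
    """Return a list of problems; an empty list means the counts look fine.

    ``counts`` maps a process label to the number of instances in the
    cluster. The labels are hypothetical names chosen for this example.
    """
    problems = []
    if namenode_ha:
        if counts.get("NameNode", 0) != 2:
            problems.append("NameNode HA needs exactly 2 NameNodes")
        if counts.get("JournalNode", 0) < 3:
            problems.append("NameNode HA needs at least 3 JournalNodes")
        if counts.get("ZooKeeper", 0) < 3:
            problems.append("NameNode HA needs at least 3 ZooKeeper Servers")
    if resourcemanager_ha and counts.get("ResourceManager", 0) != 2:
        problems.append("ResourceManager HA needs exactly 2 ResourceManagers")
    return problems


if __name__ == "__main__":
    typical = {"NameNode": 2, "ResourceManager": 2,
               "JournalNode": 3, "ZooKeeper": 3}
    print(check_ha_topology(typical, namenode_ha=True, resourcemanager_ha=True))
```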
HDP Version Support
-------------------
The Ambari plugin currently supports deployment of HDP 2.3.
Cluster Validation
------------------
Prior to Hadoop cluster creation, the Ambari plugin will perform the following
validation checks to ensure a successful Hadoop deployment:
* Ensure the existence of an Ambari Server process in the cluster
* Ensure the existence of NameNode, ZooKeeper, ResourceManager,
  HistoryServer and App Timeline Server processes in the cluster
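The validation checks listed above amount to confirming that each required process appears in at least one non-empty node group. A minimal sketch, assuming node groups are plain dictionaries with ``processes`` and ``count`` keys; the process labels and this data shape are hypothetical and do not mirror the plugin's internal objects:

```python
# Processes the text above says must exist in the cluster. The exact
# labels are assumptions made for this example.
REQUIRED_PROCESSES = (
    "Ambari",
    "NameNode",
    "ZooKeeper",
    "ResourceManager",
    "HistoryServer",
    "App Timeline Server",
)


def missing_processes(node_groups):
    """Return the required processes not present in any non-empty group."""
    present = set()
    for group in node_groups:
        if group.get("count", 0) > 0:
            present.update(group.get("processes", []))
    return [name for name in REQUIRED_PROCESSES if name not in present]


if __name__ == "__main__":
    cluster = [
        {"count": 1, "processes": ["Ambari", "NameNode", "ZooKeeper",
                                   "ResourceManager", "HistoryServer"]},
        {"count": 3, "processes": ["DataNode", "NodeManager"]},
    ]
    print(missing_processes(cluster))
```

Returning the list of missing names, rather than a boolean, lets the caller report every gap at once instead of failing on the first one.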

@ -1,127 +0,0 @@
Hortonworks Data Platform Plugin
================================
The Hortonworks Data Platform (HDP) sahara plugin provides a way to provision
HDP clusters on OpenStack using templates in a single click and in an easily
repeatable fashion. As seen from the architecture diagram below, the sahara
controller serves as the glue between Hadoop and OpenStack. The HDP plugin
mediates between the sahara controller and Apache Ambari in order to deploy
and configure Hadoop on OpenStack. Core to the HDP Plugin is Apache Ambari
which is used as the orchestrator for deploying HDP on OpenStack.
.. image:: ../images/hdp-plugin-architecture.png
    :width: 800 px
    :scale: 80 %
    :align: center
The HDP plugin can make use of Ambari Blueprints for cluster provisioning.
Apache Ambari Blueprints
------------------------
Apache Ambari Blueprints is a portable document definition, which provides a
complete definition for an Apache Hadoop cluster, including cluster topology,
components, services and their configurations. Ambari Blueprints can be
consumed by the HDP plugin to instantiate a Hadoop cluster on OpenStack. The
benefit of this approach is that Hadoop clusters can be
configured and deployed using an Ambari native format that works both inside
and outside of OpenStack, allowing clusters to be re-instantiated in a
variety of environments.
For more information about Apache Ambari Blueprints, refer to:
https://issues.apache.org/jira/browse/AMBARI-1783. Note that Apache Ambari
Blueprints are not yet finalized.
Operation
---------
The HDP Plugin performs the following four primary functions during cluster
creation:
1. Software deployment - the plugin orchestrates the deployment of the
required software to the target VMs
2. Services Installation - the Hadoop services configured for the node groups
within the cluster are installed on the associated VMs
3. Services Configuration - the plugin merges the default configuration values
and user provided configurations for each installed service to the cluster
4. Services Start - the plugin invokes the appropriate APIs to indicate to the
Ambari Server that the cluster services should be started
Images
------
The sahara HDP plugin can make use of either minimal (operating system only)
images or pre-populated HDP images. The base requirement for both is that the
image is cloud-init enabled and contains a supported operating system (see
http://docs.hortonworks.com/HDPDocuments/HDP1/HDP-1.2.4/bk_hdp1-system-admin-guide/content/sysadminguides_ha_chap2_3.html).
The advantage of a pre-populated image is that provisioning time is reduced,
because packages do not need to be downloaded and installed, a step that makes
up the majority of the time spent in the provisioning cycle. In addition,
provisioning large clusters from minimal images puts a burden on the network,
as packages for all nodes need to be downloaded from the package repository.
For more information about HDP images, refer to
https://github.com/openstack/sahara-image-elements.
You can download well tested and up-to-date prepared images from
http://sahara-files.mirantis.com/images/upstream/liberty/
The HDP plugin requires an image to be tagged in the sahara Image Registry
with two tags: 'hdp' and '<hdp version>' (e.g. '2.0.6').
You will also need to specify a username for the image in the Image Registry.
The username specified should be 'cloud-user'.
HDFS NameNode High Availability
-------------------------------
HDFS NameNode High Availability (Using the Quorum Journal Manager) can be
deployed automatically with HDP 2.0.6. Currently the only way to deploy it is
through the command line client (python-saharaclient) or sahara REST API by
simply adding the following cluster_configs parameter in the cluster's JSON:

.. sourcecode:: cfg

    "cluster_configs": {
        "HDFSHA": {
            "hdfs.nnha": true
        }
    }

The NameNode High Availability is deployed using 2 NameNodes, one active and
one standby. The NameNodes use a set of JOURNALNODES and ZOOKEEPER_SERVERS to
ensure the necessary synchronization.
A typical Highly available HDP 2.0.6 cluster uses 2 separate NameNodes, at
least 3 JOURNALNODES and at least 3 ZOOKEEPER_SERVERS.
When HDFS NameNode High Availability is enabled, the plugin will perform the
following additional validations:
* Ensure the existence of 2 NAMENODES processes in the cluster
* Ensure the existence of at least 3 JOURNALNODES processes in the cluster
* Ensure the existence of at least 3 ZOOKEEPER_SERVERS processes in the cluster
Limitations
-----------
The HDP plugin currently has the following limitations:
* It is not possible to decrement the number of node-groups or hosts per node
group in a sahara generated cluster.
HDP Version Support
-------------------
The HDP plugin currently supports HDP 2.0.6.
Cluster Validation
------------------
Prior to Hadoop cluster creation, the HDP plugin will perform the following
validation checks to ensure a successful Hadoop deployment:
* Ensure the existence of a NAMENODE process in the cluster
* Ensure the existence of a JOBTRACKER should any TASKTRACKER be deployed to
the cluster
* Ensure the deployment of one Ambari Server instance to the cluster
* Ensure that each defined node group has an associated Ambari Agent configured
The HDP Plugin and sahara support
---------------------------------
For more information, please contact Hortonworks.

@ -7,7 +7,7 @@ Hadoop) or distribution, and allows configuration of topology and
management/monitoring tools.
* :doc:`vanilla_plugin` - deploys Vanilla Apache Hadoop
* :doc:`hdp_plugin` - deploys Hortonworks Data Platform
* :doc:`ambari_plugin` - deploys Hortonworks Data Platform
* :doc:`spark_plugin` - deploys Apache Spark with Cloudera HDFS
* :doc:`mapr_plugin` - deploys MapR plugin with MapR File System
* :doc:`cdh_plugin` - deploys Cloudera Hadoop