doc: refer to the split plugin documentation
Remove the detailed information about the plugins from the core
documentation and redirect to the plugins documentation instead.
A redirect has been set to not break the existing links.

Change-Id: Ief8593b02242748a5ffd4f55515975faebd19523
parent dc17f1903f
commit bb5f75db7f
@ -4,3 +4,4 @@ redirectmatch 301 ^/sahara/([^/]+)/contributor/launchpad.html$ /sahara/$1/contri
redirectmatch 301 ^/sahara/(?!ocata|pike|queens)([^/]+)/user/vanilla-imagebuilder.html$ /sahara/$1/user/vanilla-plugin.html
redirectmatch 301 ^/sahara/(?!ocata|pike|queens)([^/]+)/user/cdh-imagebuilder.html$ /sahara/$1/user/cdh-plugin.html
redirectmatch 301 ^/sahara/(?!ocata|pike|queens)([^/]+)/user/guest-requirements.html$ /sahara/$1/user/building-guest-images.html
redirectmatch 301 ^/sahara/([^/]+)/user/([^-]+)-plugin.html$ /sahara-plugin-$2/$1/
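As a quick sanity check of the new rule (a sketch, assuming the rules are deployed on docs.openstack.org and that ``curl`` is available), an old per-plugin URL should now answer with a 301 pointing at the split documentation:

.. sourcecode:: console

   $ curl -sI https://docs.openstack.org/sahara/latest/user/spark-plugin.html | grep -i '^location'
   location: https://docs.openstack.org/sahara-plugin-spark/latest/
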
@ -60,6 +60,12 @@ openstack_projects = [
    'neutron',
    'nova',
    'oslo.middleware',
    'sahara-plugin-ambari',
    'sahara-plugin-cdh',
    'sahara-plugin-mapr',
    'sahara-plugin-spark',
    'sahara-plugin-storm',
    'sahara-plugin-vanilla',
    'tooz'
]
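Listing the new plugin projects in ``openstack_projects`` is presumably what lets openstackdocstheme resolve the ``:sahara-plugin-*-doc:`` cross-reference roles used by the rewritten plugins page further down in this change; a minimal sketch of such a reference:

.. sourcecode:: rst

   * :sahara-plugin-vanilla-doc:`Vanilla Plugin <>` -
     deploys Vanilla Apache Hadoop
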
@ -1,161 +0,0 @@
Ambari Plugin
=============
The Ambari sahara plugin provides a way to provision
clusters with Hortonworks Data Platform on OpenStack using templates in a
single click and in an easily repeatable fashion. The sahara controller serves
as the glue between Hadoop and OpenStack. The Ambari plugin mediates between
the sahara controller and Apache Ambari in order to deploy and configure Hadoop
on OpenStack. Core to the HDP Plugin is Apache Ambari,
which is used as the orchestrator for deploying HDP on OpenStack. The Ambari
plugin uses Ambari Blueprints for cluster provisioning.

Apache Ambari Blueprints
------------------------
Apache Ambari Blueprints is a portable document definition, which provides a
complete definition for an Apache Hadoop cluster, including cluster topology,
components, services and their configurations. Ambari Blueprints can be
consumed by the Ambari plugin to instantiate a Hadoop cluster on OpenStack. The
benefit of this approach is that it allows Hadoop clusters to be configured
and deployed using an Ambari-native format that can be used both with and
outside of OpenStack, allowing clusters to be re-instantiated in a variety of
environments.

Images
------

For cluster provisioning, prepared images should be used.

.. list-table:: Support matrix for the `ambari` plugin
   :widths: 15 15 20 15 35
   :header-rows: 1

   * - Version
       (image tag)
     - Distribution
     - Build method
     - Version
       (build parameter)
     - Notes

   * - 2.6
     - Ubuntu 16.04, CentOS 7
     - sahara-image-pack
     - 2.6
     - uses Ambari 2.6

   * - 2.5
     - Ubuntu 16.04, CentOS 7
     - sahara-image-pack
     - 2.5
     - uses Ambari 2.6

   * - 2.4
     - Ubuntu 14.04, CentOS 7
     - sahara-image-pack
     - 2.4
     - uses Ambari 2.6

   * - 2.4
     - Ubuntu 14.04, CentOS 7
     - sahara-image-create
     - 2.4
     - uses Ambari 2.2.1.0

   * - 2.3
     - Ubuntu 14.04, CentOS 7
     - sahara-image-pack
     - 2.3
     - uses Ambari 2.4

   * - 2.3
     - Ubuntu 14.04, CentOS 7
     - sahara-image-create
     - 2.3
     - uses Ambari 2.2.0.0

For more information about building images, refer to
:doc:`building-guest-images`.

The HDP plugin requires an image to be tagged in the sahara Image Registry
with two tags: 'ambari' and '<plugin version>' (e.g. '2.5').

The image requires a username. For more information, refer to the
:doc:`registering-image` section.
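
A sketch of registering and tagging such an image from the CLI, assuming the ``openstack dataprocessing`` commands provided by python-saharaclient and a hypothetical image named ``ambari-2.5-ubuntu``:

.. sourcecode:: console

   $ openstack dataprocessing image register ambari-2.5-ubuntu --username ubuntu
   $ openstack dataprocessing image tags add ambari-2.5-ubuntu --tags ambari 2.5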

To speed up provisioning, the HDP packages can be pre-installed on the image
used. The packages' versions depend on the HDP version required.

High Availability for HDFS and YARN
-----------------------------------
High Availability (using the Quorum Journal Manager) can be
deployed automatically with the Ambari plugin. You can deploy a highly
available cluster through the UI by selecting the ``NameNode HA`` and/or
``ResourceManager HA`` options in the general configs of the cluster template.

NameNode High Availability is deployed using 2 NameNodes, one active and
one standby. The NameNodes use a set of JournalNodes and ZooKeeper Servers to
ensure the necessary synchronization. For ResourceManager HA, 2
ResourceManagers should be enabled in addition.

A typical highly available Ambari cluster uses 2 separate NameNodes, 2 separate
ResourceManagers, at least 3 JournalNodes and at least 3 ZooKeeper Servers.

HDP Version Support
-------------------
The HDP plugin currently supports deployment of HDP 2.3, 2.4 and 2.5.

Cluster Validation
------------------
Prior to Hadoop cluster creation, the HDP plugin will perform the following
validation checks to ensure a successful Hadoop deployment:

* Ensure the existence of an Ambari Server process in the cluster;
* Ensure the existence of NameNode, ZooKeeper, ResourceManager, HistoryServer
  and App Timeline Server processes in the cluster.

Enabling Kerberos security for cluster
--------------------------------------

If you want to protect your clusters using MIT Kerberos security, you have to
complete the few steps below.

* If you would like to create a cluster protected by Kerberos security, you
  just need to enable Kerberos via the checkbox in the ``General Parameters``
  section of the cluster configuration. If you prefer to use the OpenStack CLI
  for cluster creation, you have to put the data below in the
  ``cluster_configs`` section:

  .. sourcecode:: console

     "cluster_configs": {
         "Enable Kerberos Security": true,
     }

  Sahara in this case will correctly prepare the KDC server and will create
  principals along with keytabs to enable authentication for the Hadoop
  services.
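
  A minimal sketch of doing this from the CLI, assuming the
  ``openstack dataprocessing`` commands from python-saharaclient and a
  trimmed, hypothetical ``cluster_create.json`` (all other required cluster
  fields omitted):

  .. sourcecode:: console

     $ cat cluster_create.json
     {
         "name": "ambari-kerberos-cluster",
         "cluster_configs": {
             "Enable Kerberos Security": true
         }
     }
     $ openstack dataprocessing cluster create --json cluster_create.json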

* Ensure that you have the latest hadoop-openstack jar file distributed
  on your cluster nodes. You can download one at
  ``https://tarballs.openstack.org/sahara-extra/dist/``

* Sahara will create principals along with keytabs for system users
  like ``oozie``, ``hdfs`` and ``spark`` so that you will not have to
  perform additional auth operations to execute your jobs on top of the
  cluster.

Adjusting Ambari Agent Package Installation timeout Parameter
-------------------------------------------------------------

For a cluster with a large number of nodes or slow connectivity to the HDP
repository server, Sahara HDP cluster creation may fail due to the Ambari
agent reaching the timeout threshold while installing the packages on the
nodes.

Such failures will occur during the "cluster start" stage, which can be
monitored from the Cluster Events tab of the Sahara Dashboard. The timeout
error will be visible from the Ambari Dashboard as well.

* To avoid the package installation timeout by the Ambari agent, you need to
  change the default value of the ``Ambari Agent Package Install timeout``
  parameter, which can be found in the ``General Parameters`` section of the
  cluster template configuration.
@ -1,190 +0,0 @@
Cloudera Plugin
===============

The Cloudera plugin is a Sahara plugin which allows the user to
deploy and operate a cluster with Cloudera Manager.

The Cloudera plugin is enabled in Sahara by default. You can manually
modify the Sahara configuration file (default /etc/sahara/sahara.conf) to
explicitly enable or disable it in the "plugins" line.

Images
------

For cluster provisioning, prepared images should be used.

.. list-table:: Support matrix for the `cdh` plugin
   :widths: 15 15 20 15 35
   :header-rows: 1

   * - Version
       (image tag)
     - Distribution
     - Build method
     - Version
       (build parameter)
     - Notes

   * - 5.13.0
     - Ubuntu 16.04, CentOS 7
     - sahara-image-pack
     - 5.13.0
     -

   * - 5.11.0
     - Ubuntu 16.04, CentOS 7
     - sahara-image-pack, sahara-image-create
     - 5.11.0
     -

   * - 5.9.0
     - Ubuntu 14.04, CentOS 7
     - sahara-image-pack, sahara-image-create
     - 5.9.0
     -

   * - 5.7.0
     - Ubuntu 14.04, CentOS 7
     - sahara-image-pack, sahara-image-create
     - 5.7.0
     -

For more information about building images, refer to
:doc:`building-guest-images`.

The Cloudera plugin requires an image to be tagged in the Sahara Image
Registry with two tags: 'cdh' and '<cloudera version>' (e.g. '5.13.0',
'5.11.0', '5.9.0', etc.).

The default username specified for these images is different for each
distribution. For more information, refer to the
:doc:`registering-image` section.

Build settings
~~~~~~~~~~~~~~

It is possible to specify minor versions of CDH when ``sahara-image-create``
is used.
If you want to use a minor version, export ``DIB_CDH_MINOR_VERSION``
before starting the build command, e.g.:

.. sourcecode:: console

   export DIB_CDH_MINOR_VERSION=5.7.1

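A sketch of a full build invocation combining this variable with the image creation script, assuming ``sahara-image-create`` refers to the ``diskimage-create.sh`` entry point from sahara-image-elements and its ``-p``/``-v``/``-i`` flags:

.. sourcecode:: console

   $ export DIB_CDH_MINOR_VERSION=5.7.1
   $ ./diskimage-create.sh -p cloudera -v 5.7 -i ubuntu
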
Services Supported
------------------

Currently the following services are supported in all versions of the
Cloudera plugin: HDFS, Oozie, YARN, Spark, ZooKeeper, Hive, Hue and HBase.
Version 5.3.0 of the Cloudera plugin also supports the following services:
Impala, Flume, Solr, Sqoop, and Key-value Store Indexer. In version 5.4.0,
KMS service support was added on top of version 5.3.0. Kafka 2.0.2 was added
for CDH 5.5 and higher.

.. note::

   The Sentry service is enabled in the Cloudera plugin. However, as we do not
   enable Kerberos authentication in the cluster for CDH versions < 5.5 (which
   is required for Sentry functionality), using the Sentry service will not
   really take any effect, and other services depending on Sentry will not do
   any authentication either.

High Availability Support
-------------------------

Currently HDFS NameNode High Availability is supported beginning with
Cloudera version 5.4.0. You can refer to :doc:`features` for detailed
information.

YARN ResourceManager High Availability is supported beginning with Cloudera
version 5.4.0. This feature adds redundancy in the form of an Active/Standby
ResourceManager pair to avoid the failure of a single RM. Upon failover, the
standby RM becomes active so that the applications can resume from their last
check-pointed state.

Cluster Validation
------------------

When the user performs an operation on the cluster using the Cloudera plugin,
the cluster topology requested by the user is verified for consistency.

The following limitations apply to the cluster topology for all
Cloudera plugin versions:

+ Cluster must contain exactly one manager.
+ Cluster must contain exactly one namenode.
+ Cluster must contain exactly one secondarynamenode.
+ Cluster must contain at least ``dfs_replication`` datanodes.
+ Cluster can contain at most one resourcemanager and this process is also
  required by nodemanager.
+ Cluster can contain at most one jobhistory and this process is also
  required for resourcemanager.
+ Cluster can contain at most one oozie and this process is also required
  for EDP.
+ Cluster can't contain oozie without datanode.
+ Cluster can't contain oozie without nodemanager.
+ Cluster can't contain oozie without jobhistory.
+ Cluster can't contain hive on the cluster without the following services:
  metastore, hive server, webhcat and resourcemanager.
+ Cluster can contain at most one hue server.
+ Cluster can't contain hue server without hive service and oozie.
+ Cluster can contain at most one spark history server.
+ Cluster can't contain spark history server without resourcemanager.
+ Cluster can't contain hbase master service without at least one zookeeper
  and at least one hbase regionserver.
+ Cluster can't contain hbase regionserver without at least one hbase master.

For versions 5.3.0, 5.4.0, 5.5.0, 5.7.x and 5.9.x of the Cloudera plugin
there are a few extra limitations in the cluster topology:

+ Cluster can't contain flume without at least one datanode.
+ Cluster can contain at most one sentry server service.
+ Cluster can't contain sentry server service without at least one zookeeper
  and at least one datanode.
+ Cluster can't contain solr server without at least one zookeeper and at
  least one datanode.
+ Cluster can contain at most one sqoop server.
+ Cluster can't contain sqoop server without at least one datanode,
  nodemanager and jobhistory.
+ Cluster can't contain hbase indexer without at least one datanode,
  zookeeper, solr server and hbase master.
+ Cluster can contain at most one impala catalog server.
+ Cluster can contain at most one impala statestore.
+ Cluster can't contain impala catalogserver without impala statestore,
  at least one impalad service, at least one datanode, and metastore.
+ If using Impala, the daemons must be installed on every datanode.

For versions 5.5.0, 5.7.x and 5.9.x of the Cloudera plugin, additional
services are available in the cluster topology:

+ Cluster can have the kafka service and several kafka brokers.

Enabling Kerberos security for cluster
--------------------------------------

If you want to protect your clusters using MIT Kerberos security, you have to
complete the few steps below.

* If you would like to create a cluster protected by Kerberos security, you
  just need to enable Kerberos via the checkbox in the ``General Parameters``
  section of the cluster configuration. If you prefer to use the OpenStack CLI
  for cluster creation, you have to put the data below in the
  ``cluster_configs`` section:

  .. sourcecode:: console

     "cluster_configs": {
         "Enable Kerberos Security": true,
     }

  Sahara in this case will correctly prepare the KDC server and will create
  principals along with keytabs to enable authentication for the Hadoop
  services.

* Ensure that you have the latest hadoop-openstack jar file distributed
  on your cluster nodes. You can download one at
  ``https://tarballs.openstack.org/sahara-extra/dist/``

* Sahara will create principals along with keytabs for system users
  like ``hdfs`` and ``spark`` so that you will not have to
  perform additional auth operations to execute your jobs on top of the
  cluster.
@ -24,12 +24,6 @@ Plugins
   :maxdepth: 2

   plugins
   vanilla-plugin
   ambari-plugin
   spark-plugin
   storm-plugin
   cdh-plugin
   mapr-plugin


Elastic Data Processing
@ -1,128 +0,0 @@
MapR Distribution Plugin
========================
The MapR Sahara plugin allows you to provision MapR clusters on
OpenStack quickly, conveniently and easily.

Operation
---------
The MapR Plugin performs the following four primary functions during cluster
creation:

1. MapR components deployment - the plugin manages the deployment of the
   required software to the target VMs
2. Services Installation - MapR services are installed according to the
   provided roles list
3. Services Configuration - the plugin combines default settings with
   user-provided settings
4. Services Start - the plugin starts the appropriate services according to
   the specified roles

Images
------
The Sahara MapR plugin can make use of either minimal (operating system only)
images or pre-populated MapR images. The base requirement for both is that the
image is cloud-init enabled and contains a supported operating system (see
http://maprdocs.mapr.com/home/InteropMatrix/r_os_matrix.html).

The advantage of a pre-populated image is that provisioning time is reduced,
as packages do not need to be downloaded, which makes up the majority of the
time spent in the provisioning cycle. In addition, provisioning large clusters
will put a burden on the network as packages for all nodes need to be
downloaded from the package repository.

.. list-table:: Support matrix for the `mapr` plugin
   :widths: 15 15 20 15 35
   :header-rows: 1

   * - Version
       (image tag)
     - Distribution
     - Build method
     - Version
       (build parameter)
     - Notes

   * - 5.2.0.mrv2
     - Ubuntu 14.04, CentOS 7
     - sahara-image-pack
     - 5.2.0.mrv2
     -

   * - 5.2.0.mrv2
     - Ubuntu 14.04, CentOS 7
     - sahara-image-create
     - 5.2.0
     -

For more information about building images, refer to
:doc:`building-guest-images`.

The MapR plugin needs an image to be tagged in the Sahara Image Registry with
two tags: 'mapr' and '<MapR version>' (e.g. '5.2.0.mrv2').

The default username specified for these images is different for each
distribution. For more information, refer to the
:doc:`registering-image` section.

Hadoop Version Support
----------------------
The MapR plugin currently supports Hadoop 2.7.0 (5.2.0.mrv2).

Cluster Validation
------------------
When the user creates or scales a Hadoop cluster using the MapR plugin, the
cluster topology requested by the user is verified for consistency.

Every MapR cluster must contain:

* at least 1 *CLDB* process
* exactly 1 *Webserver* process
* an odd number of *ZooKeeper* processes, but not less than 1
* a *FileServer* process on every node
* at least 1 ephemeral drive (in that case the ephemeral drive must be
  specified in the flavor, not at node group template creation) or 1 Cinder
  volume per instance

Every Hadoop cluster must contain exactly 1 *Oozie* process.

Every MapReduce v1 cluster must contain:

* at least 1 *JobTracker* process
* at least 1 *TaskTracker* process

Every MapReduce v2 cluster must contain:

* exactly 1 *ResourceManager* process
* exactly 1 *HistoryServer* process
* at least 1 *NodeManager* process

Every Spark cluster must contain:

* exactly 1 *Spark Master* process
* exactly 1 *Spark HistoryServer* process
* at least 1 *Spark Slave* (worker) process

The HBase service is considered valid if:

* the cluster has at least 1 *HBase-Master* process
* the cluster has at least 1 *HBase-RegionServer* process

The Hive service is considered valid if:

* the cluster has exactly 1 *HiveMetastore* process
* the cluster has exactly 1 *HiveServer2* process

The Hue service is considered valid if:

* the cluster has exactly 1 *Hue* process
* the *Hue* process resides on the same node as the *HttpFS* process

The HttpFS service is considered valid if the cluster has exactly 1 *HttpFS*
process.

The Sqoop service is considered valid if the cluster has exactly 1
*Sqoop2-Server* process.

The MapR Plugin
---------------
For more information, please contact MapR.
@ -6,12 +6,20 @@ enables sahara to deploy a specific data processing framework (for example,
Hadoop) or distribution, and allows configuration of topology and
management/monitoring tools.

* :doc:`vanilla-plugin` - deploys Vanilla Apache Hadoop
* :doc:`ambari-plugin` - deploys Hortonworks Data Platform
* :doc:`spark-plugin` - deploys Apache Spark with Cloudera HDFS
* :doc:`storm-plugin` - deploys Apache Storm
* :doc:`mapr-plugin` - deploys MapR plugin with MapR File System
* :doc:`cdh-plugin` - deploys Cloudera Hadoop
The plugins currently developed as part of the official Sahara project are:

* :sahara-plugin-ambari-doc:`Ambari Plugin <>` -
  deploys Hortonworks Data Platform
* :sahara-plugin-cdh-doc:`CDH Plugin <>` -
  deploys Cloudera Hadoop
* :sahara-plugin-mapr-doc:`MapR Plugin <>` -
  deploys MapR plugin with MapR File System
* :sahara-plugin-spark-doc:`Spark Plugin <>` -
  deploys Apache Spark with Cloudera HDFS
* :sahara-plugin-storm-doc:`Storm Plugin <>` -
  deploys Apache Storm
* :sahara-plugin-vanilla-doc:`Vanilla Plugin <>` -
  deploys Vanilla Apache Hadoop

Managing plugins
----------------

@ -1,91 +0,0 @@
Spark Plugin
============

The Spark plugin for sahara provides a way to provision Apache Spark clusters
on OpenStack in a single click and in an easily repeatable fashion.

Currently Spark is installed in standalone mode, with no YARN or Mesos
support.

Images
------

For cluster provisioning, prepared images should be used.

.. list-table:: Support matrix for the `spark` plugin
   :widths: 15 15 20 15 35
   :header-rows: 1

   * - Version
       (image tag)
     - Distribution
     - Build method
     - Version
       (build parameter)
     - Notes

   * - 2.3
     - Ubuntu 16.04
     - sahara-image-create
     - 2.3.0
     - based on CDH 5.11

   * - 2.2
     - Ubuntu 16.04
     - sahara-image-create
     - 2.2.0
     - based on CDH 5.11

For more information about building images, refer to
:doc:`building-guest-images`.

The Spark plugin requires an image to be tagged in the sahara image registry
with two tags: 'spark' and '<Spark version>' (e.g. '1.6.0').

The image requires a username. For more information, refer to the
:doc:`registering-image` section.

Note that the Spark cluster is deployed using the scripts available in the
Spark distribution, which allow the user to start all services (master and
slaves), stop all services and so on. As such (and as opposed to CDH HDFS
daemons), Spark is not deployed as a standard Ubuntu service and if the
virtual machines are rebooted, Spark will not be restarted.

Build settings
~~~~~~~~~~~~~~

When ``sahara-image-create`` is used, you can override a few settings
by exporting the corresponding environment variables
before starting the build command:

* ``SPARK_DOWNLOAD_URL`` - download link for Spark

Spark configuration
-------------------

Spark needs a few parameters to work and has sensible defaults. If needed,
they can be changed when creating the sahara cluster template. No node group
options are available.

Once the cluster is ready, connect with ssh to the master using the `ubuntu`
user and the appropriate ssh key. Spark is installed in `/opt/spark` and
should be completely configured and ready to start executing jobs. At the
bottom of the cluster information page from the OpenStack dashboard, a link to
the Spark web interface is provided.
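
A minimal sketch of connecting to the master and submitting the stock SparkPi
example against the standalone master, assuming a hypothetical master address
of ``192.0.2.10`` and the usual layout of a Spark 2.x distribution under
``/opt/spark``:

.. sourcecode:: console

   $ ssh -i mykey.pem ubuntu@192.0.2.10
   ubuntu@master:~$ /opt/spark/bin/spark-submit \
         --master spark://$(hostname):7077 \
         --class org.apache.spark.examples.SparkPi \
         /opt/spark/examples/jars/spark-examples_*.jar 100

If the virtual machines have been rebooted, the standalone daemons presumably
need to be restarted first with the scripts shipped under ``/opt/spark/sbin/``,
as noted above.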

Cluster Validation
------------------

When a user creates a Hadoop cluster using the Spark plugin, the cluster
topology requested by the user is verified for consistency.

Currently there are the following limitations in cluster topology for the
Spark plugin:

+ Cluster must contain exactly one HDFS namenode
+ Cluster must contain exactly one Spark master
+ Cluster must contain at least one Spark slave
+ Cluster must contain at least one HDFS datanode

The tested configuration co-locates the NameNode with the master and a
DataNode with each slave to maximize data locality.
@ -1,82 +0,0 @@
Storm Plugin
============

The Storm plugin for sahara provides a way to provision Apache Storm clusters
on OpenStack in a single click and in an easily repeatable fashion.

Currently Storm is installed in standalone mode, with no YARN support.

Images
------

For cluster provisioning, prepared images should be used.

.. list-table:: Support matrix for the `storm` plugin
   :widths: 15 15 20 15 35
   :header-rows: 1

   * - Version
       (image tag)
     - Distribution
     - Build method
     - Version
       (build parameter)
     - Notes

   * - 1.2
     - Ubuntu 16.04
     - sahara-image-create
     - 1.2.1, 1.2.0
     - both versions are supported by the same image tag

   * - 1.1.0
     - Ubuntu 16.04
     - sahara-image-create
     - 1.1.1, 1.1.0
     - both versions are supported by the same image tag

For more information about building images, refer to
:doc:`building-guest-images`.

The Storm plugin requires an image to be tagged in the sahara image registry
with two tags: 'storm' and '<Storm version>' (e.g. '1.1.0').

The image requires a username. For more information, refer to the
:doc:`registering-image` section.

Note that the Storm cluster is deployed using the scripts available in the
Storm distribution, which allow the user to start all services (nimbus,
supervisors and zookeepers), stop all services and so on. As such Storm is not
deployed as a standard Ubuntu service and if the virtual machines are rebooted,
Storm will not be restarted.

Storm configuration
-------------------

Storm needs a few parameters to work and has sensible defaults. If needed,
they can be changed when creating the sahara cluster template. No node group
options are available.

Once the cluster is ready, connect with ssh to the master using the `ubuntu`
user and the appropriate ssh key. Storm is installed in `/usr/local/storm` and
should be completely configured and ready to start executing jobs. At the
bottom of the cluster information page from the OpenStack dashboard, a link to
the Storm web interface is provided.
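
A minimal sketch of checking the deployment from the master node, assuming a
hypothetical master address of ``192.0.2.20`` and the standard ``storm``
client shipped under ``/usr/local/storm``:

.. sourcecode:: console

   $ ssh -i mykey.pem ubuntu@192.0.2.20
   ubuntu@master:~$ /usr/local/storm/bin/storm list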

Cluster Validation
------------------

When a user creates a Storm cluster using the Storm plugin, the cluster
topology requested by the user is verified for consistency.

Currently there are the following limitations in cluster topology for the
Storm plugin:

+ Cluster must contain exactly one Storm nimbus
+ Cluster must contain at least one Storm supervisor
+ Cluster must contain at least one ZooKeeper node

The tested configuration has nimbus, supervisor, and ZooKeeper processes each
running on their own nodes.
Another possible configuration is one node with nimbus alone, and additional
nodes each with supervisor and ZooKeeper processes together.
@ -1,111 +0,0 @@
Vanilla Plugin
==============

The vanilla plugin is a reference implementation which allows users to operate
a cluster with Apache Hadoop.

Since the Newton release Spark is integrated into the Vanilla plugin so you
can launch Spark jobs on a Vanilla cluster.

Images
------

For cluster provisioning, prepared images should be used.

.. list-table:: Support matrix for the `vanilla` plugin
   :widths: 15 15 20 15 35
   :header-rows: 1

   * - Version
       (image tag)
     - Distribution
     - Build method
     - Version
       (build parameter)
     - Notes

   * - 2.8.2
     - Ubuntu 16.04, CentOS 7
     - sahara-image-create
     - 2.8.2
     - Hive 2.3.2, Oozie 4.3.0

   * - 2.7.5
     - Ubuntu 16.04, CentOS 7
     - sahara-image-create
     - 2.7.5
     - Hive 2.3.2, Oozie 4.3.0

   * - 2.7.1
     - Ubuntu 16.04, CentOS 7
     - sahara-image-create
     - 2.7.1
     - Hive 0.11.0, Oozie 4.2.0

For more information about building images, refer to
:doc:`building-guest-images`.

The Vanilla plugin requires an image to be tagged in the Sahara Image Registry
with two tags: 'vanilla' and '<hadoop version>' (e.g. '2.7.1').

The image requires a username. For more information, refer to the
:doc:`registering-image` section.

Build settings
~~~~~~~~~~~~~~

When ``sahara-image-create`` is used, you can override a few settings
by exporting the corresponding environment variables
before starting the build command (a build sketch follows this list):

* ``DIB_HADOOP_VERSION`` - version of Hadoop to install
* ``HIVE_VERSION`` - version of Hive to install
* ``OOZIE_DOWNLOAD_URL`` - download link for Oozie (we have built
  Oozie libs here: https://tarballs.openstack.org/sahara-extra/dist/oozie/)
* ``SPARK_DOWNLOAD_URL`` - download link for Spark
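
A sketch of a vanilla image build using these variables, assuming
``sahara-image-create`` refers to the ``diskimage-create.sh`` entry point from
sahara-image-elements and its ``-p``/``-v``/``-i`` flags:

.. sourcecode:: console

   $ export DIB_HADOOP_VERSION=2.7.5
   $ export HIVE_VERSION=2.3.2
   $ ./diskimage-create.sh -p vanilla -v 2.7.5 -i ubuntu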
|
Vanilla Plugin Requirements
---------------------------

The image building tools described in :ref:`building-guest-images-label`
add the required software to the image and their usage is strongly suggested.
Nevertheless, here is the software that should be pre-loaded
on the guest image so that it can be used to create Vanilla clusters:

* ssh-client installed
* Java (version >= 7)
* Apache Hadoop installed
* 'hadoop' user created

See :doc:`hadoop-swift` for information on using Swift with your sahara cluster
(for EDP support Swift integration is currently required).

To support EDP, the following components must also be installed on the guest:

* Oozie version 4 or higher
* mysql/mariadb
* hive

Cluster Validation
------------------

When a user creates or scales a Hadoop cluster using the Vanilla plugin,
the cluster topology requested by the user is verified for consistency.

Currently there are the following limitations in cluster topology for the
Vanilla plugin:

For Vanilla Hadoop version 2.x.x:

+ Cluster must contain exactly one namenode
+ Cluster can contain at most one resourcemanager
+ Cluster can contain at most one secondary namenode
+ Cluster can contain at most one historyserver
+ Cluster can contain at most one oozie and this process is also required
  for EDP
+ Cluster can't contain oozie without resourcemanager and without
  historyserver
+ Cluster can't have nodemanager nodes if it doesn't have resourcemanager
+ Cluster can have at most one hiveserver node.
+ Cluster can have at most one spark history server and this process is also
  required for Spark EDP (Spark is available since the Newton release).
@ -5,3 +5,5 @@
/sahara/latest/user/cdh-imagebuilder.html 301 /sahara/latest/user/cdh-plugin.html
/sahara/latest/user/guest-requirements.html 301 /sahara/latest/user/building-guest-images.html
/sahara/rocky/user/guest-requirements.html 301 /sahara/rocky/user/building-guest-images.html
/sahara/latest/user/vanilla-plugin.html 301 /sahara-plugin-vanilla/latest/
/sahara/stein/user/storm-plugin.html 301 /sahara-plugin-storm/stein/