Spark Plugin
============

The Spark plugin for sahara provides a way to provision Apache Spark clusters
on OpenStack in a single click and in an easily repeatable fashion.

Currently Spark is installed in standalone mode, with no YARN or Mesos
support.

Images
------

For cluster provisioning, prepared images should be used. The Spark plugin
has been developed and tested with the images generated by
sahara-image-elements:

* https://github.com/openstack/sahara-image-elements

The Ubuntu images generated by sahara-image-elements have Cloudera CDH 5.4.0
HDFS and Apache Spark installed. A prepared image for Spark 1.3.1 and CDH
5.4.0 HDFS can be found at the following location:

* http://sahara-files.mirantis.com/images/upstream/liberty/

The Spark plugin requires an image to be tagged in the sahara image registry
with two tags: 'spark' and '<Spark version>' (e.g. '1.3.1').

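As a sketch, registering and tagging an image can be done from the command
line with the sahara client. The image ID below is a placeholder, and the
exact command names and flags may differ depending on the client version
installed, so treat this as an illustration rather than a definitive recipe:

```shell
# Register the Glance image with its default cloud user, then add the
# two tags the Spark plugin looks for. <image-id> is a placeholder for
# the actual image UUID.
sahara image-register --id <image-id> --username ubuntu
sahara image-add-tag --id <image-id> --tag spark
sahara image-add-tag --id <image-id> --tag 1.3.1
```

The same tagging can also be performed from the Image Registry panel of the
OpenStack dashboard.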
You should also specify the username of the default cloud-user used in the
image. For the images available at the URLs listed above and for all the ones
generated with the DIB it is ``ubuntu``.

Note that the Spark cluster is deployed using the scripts available in the
Spark distribution, which allow the user to start all services (master and
slaves), stop all services and so on. As such (and as opposed to CDH HDFS
daemons), Spark is not deployed as a standard Ubuntu service and if the
virtual machines are rebooted, Spark will not be restarted.

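Concretely, restarting Spark after a reboot means re-running the standalone
deployment scripts shipped in the Spark distribution. Assuming the
``/opt/spark`` install location used by the images above:

```shell
# Start the Spark master and all registered slaves; these scripts are
# part of the standard Spark standalone distribution. Since Spark is not
# an Ubuntu service, this must be re-run manually after a reboot.
/opt/spark/sbin/start-all.sh

# The matching script stops all Spark services on the cluster.
/opt/spark/sbin/stop-all.sh
```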
Spark configuration
-------------------

Spark needs a few parameters to work and has sensible defaults. If needed,
they can be changed when creating the sahara cluster template. No node group
options are available.

Once the cluster is ready, connect with ssh to the master using the
``ubuntu`` user and the appropriate ssh key. Spark is installed in
``/opt/spark`` and should be completely configured and ready to start
executing jobs. At the bottom of the cluster information page from the
OpenStack dashboard, a link to the Spark web interface is provided.

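A quick smoke test of the installation can be sketched as follows, submitting
the SparkPi example bundled with the distribution. The key file name, master
address and examples jar path are placeholders and may vary with the Spark
version on the image:

```shell
# Connect to the master node as the image's cloud user, with the keypair
# chosen at cluster creation time. <master-ip> is a placeholder.
ssh -i my_key.pem ubuntu@<master-ip>

# On the master, submit the bundled SparkPi example to the standalone
# master; the examples jar location may differ between Spark releases.
cd /opt/spark
./bin/spark-submit --class org.apache.spark.examples.SparkPi \
    --master spark://$(hostname):7077 \
    lib/spark-examples-*.jar 10
```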
Cluster Validation
------------------

When a user creates a Hadoop cluster using the Spark plugin, the cluster
topology requested by the user is verified for consistency.

Currently there are the following limitations in cluster topology for the
Spark plugin:

+ Cluster must contain exactly one HDFS namenode
+ Cluster must contain exactly one Spark master
+ Cluster must contain at least one Spark slave
+ Cluster must contain at least one HDFS datanode

The tested configuration co-locates the NameNode with the master and a
DataNode with each slave to maximize data locality.

Limitations
-----------

Swift support is not available in Spark. Once it is developed there, it will
be possible to add it to this plugin.