Spark Plugin
============

The Spark plugin for sahara provides a way to provision Apache Spark clusters
on OpenStack in a single click and in an easily repeatable fashion.

Currently Spark is installed in standalone mode, with no YARN or Mesos
support.

Images
------

For cluster provisioning, prepared images should be used. The Spark plugin
has been developed and tested with the images generated by
sahara-image-elements:

* https://github.com/openstack/sahara-image-elements

The Ubuntu images generated by sahara-image-elements have Cloudera CDH 5.4.0
HDFS and Apache Spark installed. A prepared image for Spark 1.3.1 and CDH
5.4.0 HDFS can be found at the following location:

* http://sahara-files.mirantis.com/images/upstream/liberty/

The Spark plugin requires an image to be tagged in the sahara image registry
with two tags: 'spark' and '<Spark version>' (e.g. '1.3.1').
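
As a sketch (the image name `spark-image` is an assumption, and the exact
commands depend on the python-saharaclient version installed), the image can
be registered and tagged from the command line:

.. code-block:: console

   $ openstack dataprocessing image register spark-image --username ubuntu
   $ openstack dataprocessing image tags add spark-image --tags spark 1.3.1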

You should also specify the username of the default cloud user used in the
image. For the images available at the URLs listed above and for all the ones
generated with the DIB it is `ubuntu`.

Note that the Spark cluster is deployed using the scripts available in the
Spark distribution, which allow the user to start all services (master and
slaves), stop all services and so on. As such (and as opposed to CDH HDFS
daemons), Spark is not deployed as a standard Ubuntu service and if the
virtual machines are rebooted, Spark will not be restarted.

Spark configuration
-------------------

Spark needs a few parameters to work and has sensible defaults. If needed
they can be changed when creating the sahara cluster template. No node group
options are available.
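
As an illustrative sketch only (the parameter shown and the node group
template ids are assumptions; the full list of available parameters is
presented in the dashboard when creating the template), such configurations
go in the `cluster_configs` section of the cluster template:

.. code-block:: json

   {
       "plugin_name": "spark",
       "hadoop_version": "1.3.1",
       "node_groups": [
           {"name": "master", "count": 1, "node_group_template_id": "..."},
           {"name": "worker", "count": 3, "node_group_template_id": "..."}
       ],
       "cluster_configs": {
           "HDFS": {"dfs.replication": 2}
       }
   }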

Once the cluster is ready, connect with ssh to the master using the `ubuntu`
user and the appropriate ssh key. Spark is installed in `/opt/spark` and
should be completely configured and ready to start executing jobs. At the
bottom of the cluster information page from the OpenStack dashboard, a link to
the Spark web interface is provided.
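
For example, assuming the standalone master is listening on the default port
7077 and that the examples jar shipped with the Spark distribution is present
(its exact name depends on the Spark version), a test job can be submitted
from the master node:

.. code-block:: console

   $ /opt/spark/bin/spark-submit \
       --class org.apache.spark.examples.SparkPi \
       --master spark://$(hostname):7077 \
       /opt/spark/lib/spark-examples-*.jar 10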

Cluster Validation
------------------

When a user creates a Hadoop cluster using the Spark plugin, the cluster
topology requested by the user is verified for consistency.

Currently there are the following limitations in cluster topology for the
Spark plugin:

+ Cluster must contain exactly one HDFS namenode
+ Cluster must contain exactly one Spark master
+ Cluster must contain at least one Spark slave
+ Cluster must contain at least one HDFS datanode

The tested configuration co-locates the NameNode with the master and a
DataNode with each slave to maximize data locality.
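
As a sketch of node group templates matching both the constraints and the
tested layout (the template names and flavor id are assumptions, and each
template is actually created in its own request; the list form here is only
for side-by-side illustration):

.. code-block:: json

   [
       {
           "name": "spark-master",
           "plugin_name": "spark",
           "hadoop_version": "1.3.1",
           "flavor_id": "2",
           "node_processes": ["namenode", "master"]
       },
       {
           "name": "spark-worker",
           "plugin_name": "spark",
           "hadoop_version": "1.3.1",
           "flavor_id": "2",
           "node_processes": ["datanode", "slave"]
       }
   ]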