Merge "Documentation fixes and updates for devref"

This commit is contained in:
Jenkins 2016-09-16 10:53:37 +00:00 committed by Gerrit Code Review
commit 6c1e4158ab
16 changed files with 305 additions and 249 deletions

View File

@ -63,10 +63,10 @@ $ mv 507eb70202af_my_new_revision.py 007_my_new_revision.py
Add Alembic Operations to the Script
++++++++++++++++++++++++++++++++++++
The migration script contains method ``upgrade()``. Sahara has not supported
downgrades since the Kilo release. Fill in this method with the appropriate
Alembic operations to perform upgrades. In the above example, an upgrade will
move from revision '006' to revision '007'.
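As a sketch, an ``upgrade()`` body for such a revision might look like the
following (the table and column names are purely illustrative assumptions,
not part of sahara's real schema):

```python
# Illustrative Alembic migration body; runs inside an Alembic migration
# context, so it is not standalone-executable.
from alembic import op
import sqlalchemy as sa

# revision identifiers, used by Alembic
revision = '007'
down_revision = '006'


def upgrade():
    # add a nullable column to a hypothetical table
    op.add_column('clusters_example',
                  sa.Column('new_attribute', sa.String(length=80),
                            nullable=True))
```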
Command Summary for sahara-db-manage
++++++++++++++++++++++++++++++++++++
@ -87,15 +87,15 @@ To run the offline migration between specific migration versions::
$ sahara-db-manage --config-file /path/to/sahara.conf upgrade <start version>:<end version> --sql
To upgrade the database incrementally::
$ sahara-db-manage --config-file /path/to/sahara.conf upgrade --delta <# of revs>
To create a new revision::
$ sahara-db-manage --config-file /path/to/sahara.conf revision -m "description of revision" --autogenerate
To create a blank file::
$ sahara-db-manage --config-file /path/to/sahara.conf revision -m "description of revision"

View File

@ -32,7 +32,7 @@ On Ubuntu:
$ sudo apt-get install git-core python-dev python-virtualenv gcc libpq-dev libmysqlclient-dev python-pip rabbitmq-server
$ sudo pip install tox
On Red Hat and related distributions (CentOS/Fedora/RHEL/Scientific Linux):
.. sourcecode:: console

View File

@ -78,7 +78,7 @@ Documentation Guidelines
------------------------
All Sahara docs are written using Sphinx / RST and located in the main repo
in the ``doc`` directory. You can add or edit pages here to update the
http://docs.openstack.org/developer/sahara site.
The documentation in docstrings should follow the `PEP 257`_ conventions
@ -92,10 +92,7 @@ More specifically:
3. For docstrings that take multiple lines, there should be a newline
after the opening quotes, and before the closing quotes.
4. `Sphinx`_ is used to build documentation, so use the reStructuredText
markup to designate parameters, return values, etc.
Run the following command to build docs locally.
@ -106,14 +103,14 @@ Run the following command to build docs locally.
After that you can access the generated docs in the ``doc/build/`` directory;
for example, the main page is ``doc/build/html/index.html``.
To make the doc generation process faster you can use:
.. sourcecode:: console
$ SPHINX_DEBUG=1 tox -e docs
To avoid reinstalling sahara into the virtual env each time you want to
rebuild the docs, you can use the following command (it can be executed only
after running ``tox -e docs`` the first time):
.. sourcecode:: console
@ -123,8 +120,8 @@ running ``tox -e docs`` first time):
.. note::
For more details on documentation guidelines see HACKING.rst in the root of
the Sahara repo.
.. _PEP 8: http://www.python.org/dev/peps/pep-0008/
@ -136,47 +133,48 @@ running ``tox -e docs`` first time):
Event log Guidelines
--------------------
Currently Sahara keeps useful information about provisioning for each cluster.
Cluster provisioning can be represented as a linear series of provisioning
steps, which are executed one after another. Each step may consist of several
events. The number of events depends on the step and the number of instances
in the cluster. Also each event can contain information about its cluster,
instance, and node group. In case of errors, events contain useful information
for identifying the error. Additionally, each exception in sahara contains a
unique identifier that allows the user to find extra information about that
error in the sahara logs. You can see an example of provisioning progress
information here:
http://developer.openstack.org/api-ref/data-processing/#event-log
This means that if you add some important phase for cluster provisioning to
the sahara code, it's recommended to add a new provisioning step for this
phase. This will allow users to use the event log for handling errors during
this phase.
Sahara already has special utils for operating provisioning steps and events
in the module ``sahara/utils/cluster_progress_ops.py``.
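As a rough illustration of the pattern, a decorator of this kind can wrap a
provisioning function so that an event is recorded for it (the decorator
below is a self-contained stand-in; the actual names, signature, and behavior
live in ``sahara/utils/cluster_progress_ops.py`` and should be checked there):

```python
# Schematic stand-in for the real event_wrapper; names and signature are
# illustrative assumptions only.
def event_wrapper(mark_successful_on_exit=True, **step_spec):
    def decorator(func):
        def wrapped(*args, **kwargs):
            # the real implementation records an event for the current
            # provisioning step and marks it successful or failed on exit
            try:
                result = func(*args, **kwargs)
            except Exception:
                # failure details would be attached to the event here
                raise
            return result
        return wrapped
    return decorator


@event_wrapper(mark_successful_on_exit=True, step='Configure instances')
def configure_instance(instance):
    # stand-in for real per-instance configuration work
    return "configured %s" % instance
```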
.. note::
It's strongly recommended not to use ``conductor`` event log ops directly
to assign events and operate provisioning steps.
.. note::
You should not start a new provisioning step until the previous step has
successfully completed.
.. note::
It's strongly recommended to use ``event_wrapper`` for event handling.
OpenStack client usage guidelines
---------------------------------
The sahara project uses several OpenStack clients internally. These clients
are all wrapped by utility functions which make using them more convenient.
When developing sahara, if you need to use an OpenStack client you should
check the ``sahara.utils.openstack`` package for the appropriate one.
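As an illustration of the session-and-auth-plugin pattern these wrappers rely
on, a keystone session might be built with keystoneauth1 as in the following
sketch (the endpoint, credentials, and domain values are placeholders):

```python
# Hedged sketch of keystoneauth1 Session + Password auth plugin usage;
# requires a reachable keystone endpoint to actually authenticate.
from keystoneauth1.identity import v3
from keystoneauth1 import session

auth = v3.Password(auth_url='http://keystone.example.com:5000/v3',
                   username='demo',
                   password='secret',
                   project_name='demo',
                   user_domain_id='default',
                   project_domain_id='default')
sess = session.Session(auth=auth)
# a client can then be constructed with this session, e.g.
# novaclient.client.Client('2', session=sess)
```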
When developing new OpenStack client interactions in sahara, it is important
to understand the ``sahara.service.sessions`` package and the usage of the
keystone ``Session`` and auth plugin objects (for example, ``Token`` and
``Password``). Sahara is migrating all clients to use this authentication
methodology, where available. For more information on using sessions with
keystone, please see

View File

@ -1,14 +1,14 @@
Setup DevStack
==============
DevStack can be installed on Fedora, Ubuntu, and CentOS. For supported
versions see the `DevStack documentation <http://devstack.org>`_.
We recommend that you install DevStack in a VM, rather than on your main
system. That way you may avoid contamination of your system. You may find
hypervisor and VM requirements in the next section. If you still want to
install DevStack on your baremetal system, just skip the next section and read
further.
Start VM and set up OS
@ -54,7 +54,7 @@ Ubuntu 14.04 system.
$ sudo apt-get install git-core
$ git clone https://git.openstack.org/openstack-dev/devstack.git
2. Create the file ``local.conf`` in the devstack directory with the following
content:
.. sourcecode:: bash
@ -101,14 +101,15 @@ Ubuntu 14.04 system.
In cases where you need to specify a git refspec (branch, tag, or commit hash)
for the sahara in-tree devstack plugin (or sahara repo), it should be
appended to the git repo URL as follows:
.. sourcecode:: bash
enable_plugin sahara git://git.openstack.org/openstack/sahara <some_git_refspec>
3. Sahara can send notifications to Ceilometer, if Ceilometer is enabled.
If you want to enable Ceilometer, add the following lines to the
``local.conf`` file:
.. sourcecode:: bash
@ -120,20 +121,21 @@ appended after the git repo URL as follows:
$ ./stack.sh
5. Once the previous step is finished DevStack will print a Horizon URL.
Navigate to this URL and log in with the username "admin" and the password
from ``local.conf``.
6. Congratulations! You have OpenStack running in your VM and you're ready to
launch VMs inside that VM. :)
Managing sahara in DevStack
---------------------------
If you install DevStack with sahara included you can rejoin screen with the
``rejoin-stack.sh`` command and switch to the ``sahara`` tab. Here you can
manage the sahara service like any other OpenStack service. Sahara source code
is located at ``$DEST/sahara``, which is usually ``/opt/stack/sahara``.
.. _fusion-fixed-ip:
@ -172,4 +174,4 @@ Setting fixed IP address for VMware Fusion VM
$ sudo /Applications/VMware\ Fusion.app/Contents/Library/vmnet-cli --stop
$ sudo /Applications/VMware\ Fusion.app/Contents/Library/vmnet-cli --start
7. Now start your VM; it should have the new fixed IP address.

View File

@ -1,14 +1,13 @@
Elastic Data Processing (EDP) SPI
=================================
The EDP job engine objects provide methods for creating, monitoring, and
terminating jobs on Sahara clusters. Provisioning plugins that support EDP
must return an EDP job engine object from the :ref:`get_edp_engine` method
described in :doc:`plugin.spi`.
Sahara provides subclasses of the base job engine interface that support EDP
on clusters running Oozie, Spark, and/or Storm. These are described below.
.. _edp_spi_job_types:
@ -25,8 +24,10 @@ values for job types:
* MapReduce.Streaming
* Spark
* Shell
* Storm
.. note::
    Constants for job types are defined in *sahara.utils.edp*.
Job Status Values
------------------------
@ -61,7 +62,7 @@ cancel_job(job_execution)
Stops the running job whose id is stored in the job_execution object.
*Returns*: None if the operation was unsuccessful or an updated job status
value.
get_job_status(job_execution)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@ -69,7 +70,7 @@ get_job_status(job_execution)
Returns the current status of the job whose id is stored in the job_execution
object.
*Returns*: a job status value.
run_job(job_execution)
@ -77,7 +78,7 @@ run_job(job_execution)
Starts the job described by the job_execution object.
*Returns*: a tuple of the form (job_id, job_status_value, job_extra_info).
* *job_id* is required and must be a string that allows the EDP engine to
uniquely identify the job.
@ -100,8 +101,8 @@ raise an exception.
get_possible_job_config(job_type)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Returns hints used by the Sahara UI to prompt users for values when
configuring and launching a job. Note that no hints are required.
See :doc:`/userdoc/edp` for more information on how configuration values,
parameters, and arguments are used by different job types.
@ -123,7 +124,7 @@ get_supported_job_types()
This method returns the job types that the engine supports. Not all engines
will support all job types.
*Returns*: a list of job types supported by the engine.
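To make the interface shape concrete, here is a schematic toy engine (the
base class is stubbed so the sketch is self-contained; a real engine derives
from sahara's actual base job engine class, and the job id, status values,
and job type below are illustrative only):

```python
# Self-contained sketch of the EDP SPI shape described above.
class JobEngine(object):
    """Stand-in for sahara's real abstract base job engine class."""
    pass


class ToyJobEngine(JobEngine):
    def get_supported_job_types(self):
        # not all engines support all job types
        return ["Shell"]

    def run_job(self, job_execution):
        # must return a (job_id, job_status_value, job_extra_info) tuple;
        # job_id must uniquely identify the job to this engine
        return ("toy-job-001", "RUNNING", None)

    def get_job_status(self, job_execution):
        # returns the current status of the identified job
        return "SUCCEEDED"

    def cancel_job(self, job_execution):
        # None if the cancel failed, otherwise an updated status value
        return "KILLED"
```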
Oozie Job Engine Interface
--------------------------
@ -132,8 +133,8 @@ The sahara.service.edp.oozie.engine.OozieJobEngine class is derived from
JobEngine. It provides implementations for all of the methods in the base
interface but adds a few more abstract methods.
Note that the *validate_job_execution(cluster, job, data)* method does basic
checks on the job configuration but probably should be overloaded to include
additional checks on the cluster configuration. For example, the job engines
for plugins that support Oozie add checks to make sure that the Oozie service
is up and running.
@ -145,7 +146,7 @@ get_hdfs_user()
Oozie uses HDFS to distribute job files. This method gives the name of the
account that is used on the data nodes to access HDFS (such as 'hadoop' or
'hdfs'). The Oozie job engine expects that HDFS contains a directory for this
user under */user/*.
*Returns*: a string giving the username for the account used to access HDFS on
the cluster.
@ -170,8 +171,8 @@ get_oozie_server_uri(cluster)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Returns the full URI for the Oozie server, for example
*http://my_oozie_host:11000/oozie*. This URI is used by an Oozie client to
send commands and queries to the Oozie server.
*Returns*: a string giving the Oozie server URI.
@ -179,9 +180,10 @@ commands and queries to the Oozie server.
get_oozie_server(self, cluster)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Returns the node instance for the host in the cluster running the Oozie
server.

*Returns*: a node instance.
get_name_node_uri(self, cluster)
@ -198,7 +200,7 @@ get_resource_manager_uri(self, cluster)
Returns the full URI for the Hadoop JobTracker for Hadoop version 1 or the
Hadoop ResourceManager for Hadoop version 2.
*Returns*: a string giving the JobTracker or ResourceManager URI.
Spark Job Engine
----------------
@ -206,11 +208,12 @@ Spark Job Engine
The sahara.service.edp.spark.engine.SparkJobEngine class provides a full EDP
implementation for Spark standalone clusters.
.. note::
    The *validate_job_execution(cluster, job, data)* method does basic
    checks on the job configuration but probably should be overloaded to
    include additional checks on the cluster configuration. For example, the
    job engine returned by the Spark plugin checks that the Spark version is
    >= 1.0.0 to ensure that *spark-submit* is available.
get_driver_classpath(self)
~~~~~~~~~~~~~~~~~~~~~~~~~~
@ -218,4 +221,4 @@ get_driver_classpath(self)
Returns the driver class path.
*Returns*: a string of the following format ' --driver-class-path
*class_path_value*'.

View File

@ -1,16 +1,14 @@
Code Reviews with Gerrit
========================
Sahara uses the `Gerrit`_ tool to review proposed code changes. The review
site is http://review.openstack.org.
Gerrit is a complete replacement for Github pull requests. `All Github pull
requests to the Sahara repository will be ignored`.
See `Development Workflow`_ for information about how to get
started using Gerrit.
.. _Gerrit: http://code.google.com/p/gerrit
.. _Development Workflow: http://docs.openstack.org/infra/manual/developers.html#development-workflow

View File

@ -24,16 +24,18 @@ To build Oozie the following command can be used:
$ {oozie_dir}/bin/mkdistro.sh -DskipTests
By default it builds against Hadoop 1.1.1. To build it with Hadoop version
2.x:

* The hadoop-2 version should be changed in pom.xml.
  This can be done manually or with the following command (you should
  replace 2.x.x with your hadoop version):
.. sourcecode:: console
$ find . -name pom.xml | xargs sed -ri 's/2.3.0/2.x.x/'
* The build command should be launched with the ``-P hadoop-2`` flag
JDK Versions
------------
@ -44,12 +46,12 @@ There are 2 build properties that can be used to change the JDK version
requirements:
* ``javaVersion`` specifies the version of the JDK used to compile (default
1.6).
* ``targetJavaVersion`` specifies the version of the generated bytecode
(default 1.6).
For example, to specify JDK version 1.7, the build command should contain the
``-D javaVersion=1.7 -D targetJavaVersion=1.7`` flags.
@ -57,16 +59,16 @@ For example, to specify 1.7 JDK version, build command should contain
Build
-----
To build Oozie with Hadoop 2.6.0 and JDK version 1.7, the following command
can be used:
.. sourcecode:: console
$ {oozie_dir}/bin/mkdistro.sh assembly:single -P hadoop-2 -D javaVersion=1.7 -D targetJavaVersion=1.7 -D skipTests
Also, the pig version can be passed as a maven property with the flag
``-D pig.version=x.x.x``.
You can find similar instructions to build oozie.tar.gz here:
http://oozie.apache.org/docs/4.0.0/DG_QuickStart.html#Building_Oozie

View File

@ -4,7 +4,7 @@ How to Participate
Getting started
---------------
* Create an account on `Github <https://github.com/openstack/sahara>`_
(if you don't have one)
* Make sure that your local git is properly configured by executing
@ -29,25 +29,29 @@ Getting started
* Go to ``watched projects``
* Add ``openstack/sahara``, ``openstack/sahara-extra``,
``openstack/python-saharaclient``, and ``openstack/sahara-image-elements``
How to stay in touch with the community
---------------------------------------
* If you have something to discuss use the
  `OpenStack development mailing list <http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev>`_.
  Prefix the mail subject with ``[Sahara]``
* Join ``#openstack-sahara`` IRC channel on `freenode <http://freenode.net/>`_
* Attend Sahara team meetings

  * Weekly on Thursdays at 1400 UTC and 1800 UTC (on alternate weeks)
  * IRC channel: ``#openstack-meeting-alt`` (1800 UTC) or
    ``#openstack-meeting-3`` (1400 UTC)

* See agenda at https://wiki.openstack.org/wiki/Meetings/SaharaAgenda
How to post your first patch for review
---------------------------------------
* Check out the Sahara code from `Github <https://github.com/openstack/sahara>`_
@ -61,9 +65,9 @@ How to send your first patch on review?
* Make sure that your code passes ``PEP8`` checks and unit-tests.
See :doc:`development.guidelines`
* Post your patch for review
* Monitor the status of your patch review on https://review.openstack.org/#/

View File

@ -7,7 +7,7 @@ feature will enable your plugin to:
* Validate that images passed to it for use in cluster provisioning meet its
specifications.
* Provision images from "clean" (OS-only) images.
* Pack pre-populated images for registration in Glance and use by Sahara.
All of these features can use the same image declaration, meaning that logic
@ -66,7 +66,7 @@ base image.
This CLI will automatically populate the set of available plugins and
versions from the plugin set loaded in Sahara, and will show any plugin for
which the image packing feature is available. The next sections of this guide
will first describe how to modify an image packing specification for one
of the plugins, and second, how to enable the image packing feature for new
or existing plugins.
@ -340,7 +340,8 @@ The Argument Set Validator
~~~~~~~~~~~~~~~~~~~~~~~~~~
You may find that you wish to store state in one place in the specification
for use in another. In this case, you can use this validator to set an
argument for future use.
::

View File

@ -2,7 +2,7 @@ Continuous Integration with Jenkins
===================================
Each change made to Sahara core code is tested with unit and integration tests
and style checks using flake8.
Unit tests and style checks are performed on public `OpenStack Jenkins
<https://jenkins.openstack.org/>`_ managed by `Zuul
@ -10,10 +10,10 @@ Unit tests and style checks are performed on public `OpenStack Jenkins
Unit tests are checked using python 2.7.
The result of those checks and Unit tests are represented as a vote of +1 or
-1 in the *Verify* column in code reviews from the *Jenkins* user.
Integration tests check CRUD operations for the Image Registry, Templates, and
Clusters. Also a test job is launched on a created Cluster to verify that
Hadoop works.
@ -27,15 +27,16 @@ integration testing may take a while.
Jenkins is controlled for the most part by Zuul which determines what jobs are
run when.
Zuul status is available at this address: `Zuul Status
<https://sahara.mirantis.com/zuul>`_.
For more information see: `Sahara Hadoop Cluster CI
<https://wiki.openstack.org/wiki/Sahara/SaharaCI>`_.
The integration tests result is represented as a vote of +1 or -1 in the
*Verify* column in a code review from the *Sahara Hadoop Cluster CI* user.
You can put *sahara-ci-recheck* in a comment, if you want to recheck sahara-ci
jobs. Also, you can put *recheck* in a comment, if you want to recheck both
Jenkins and sahara-ci jobs. Finally, you can put *reverify* in a comment, if
you only want to recheck Jenkins jobs.

View File

@ -1,8 +1,8 @@
Project hosting
===============
`Launchpad`_ hosts the Sahara project. The Sahara project homepage on
Launchpad is http://launchpad.net/sahara.
Launchpad credentials
---------------------
@ -18,9 +18,10 @@ OpenStack-related sites. These sites include:
Mailing list
------------
The mailing list email is ``openstack-dev@lists.openstack.org``; use the
subject prefix ``[sahara]`` to address the team. To participate in the
mailing list, subscribe to the list at
http://lists.openstack.org/cgi-bin/mailman/listinfo
Bug tracking
------------
@ -36,7 +37,9 @@ proposed changes and track associated commits. Sahara also uses specs for
in-depth descriptions and discussions of blueprints. Specs follow a defined
format and are submitted as change requests to the openstack/sahara-specs
repository. Every blueprint should have an associated spec that is agreed
on and merged to the sahara-specs repository before it is approved, unless the
whole team agrees that the implementation path for the feature described in
the blueprint is completely understood.
Technical support
-----------------

View File

@ -28,7 +28,7 @@ log levels:
Formatting Guidelines
---------------------
Sahara uses string formatting defined in `PEP 3101`_ for logs.
.. code:: python
@ -41,8 +41,8 @@ Now sahara uses string formatting defined in `PEP 3101`_ for logs.
Translation Guidelines
----------------------
All log levels except Debug require translation. None of the separate
CLI tools packaged with sahara contain log translations.
* Debug: no translation
* Info: _LI
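A self-contained sketch of the PEP 3101 formatting style for log messages
(``_LI`` is stubbed here as the identity function; in sahara it comes from
the i18n module, and the message text is purely illustrative):

```python
# PEP 3101 str.format-style message building; named placeholders rather
# than %-style interpolation.
def _LI(msg):
    # stand-in for sahara's i18n Info-level translation marker
    return msg


def build_info_message(cluster_name, node_count):
    return _LI("Cluster {name} started with {count} nodes").format(
        name=cluster_name, count=node_count)
```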

View File

@ -7,28 +7,29 @@ Plugin interface
get_versions()
~~~~~~~~~~~~~~
Returns all available versions of the plugin. Depending on the plugin, this
version may map directly to the HDFS version, or it may not; check your
plugin's documentation. It is the responsibility of the plugin to make sure
that all required images for each hadoop version are available, as well as
configs and whatever else the plugin needs to create the Hadoop cluster.
*Returns*: a list of strings representing plugin versions.
*Example return value*: [“1.2.1”, “2.3.0”, “2.4.1”]
get_configs( hadoop_version )
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Lists all configs supported by the plugin with descriptions, defaults, and
targets for which this config is applicable.
*Returns*: list of configs
*Example return value*: ((“JobTracker heap size”, "JobTracker heap size, in
MB", "int", “512”, `“mapreduce”`, "node", True, 1))
get_node_processes( hadoop_version )
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Returns all supported services and node processes for a given Hadoop version.
Each node process belongs to a single service and that relationship is
@ -39,21 +40,21 @@ reflected in the returned dict object. See example for details.
*Example return value*: {"mapreduce": ["tasktracker", "jobtracker"], "hdfs":
["datanode", "namenode"]}
get_required_image_tags( hadoop_version )
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Lists tags that should be added to OpenStack Image via Image Registry. Tags
are used to filter Images by plugin and hadoop version.
*Returns*: list of tags
*Example return value*: ["tag1", "some_other_tag", ...]
validate( cluster )
~~~~~~~~~~~~~~~~~~~
Validates a given cluster object. Raises a *SaharaException* with a meaningful
message in the case of validation failure.
*Returns*: None
@ -61,8 +62,8 @@ message.
message='Hadoop cluster should contain only 1 NameNode instance. Actual NN
count is 2' }>
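As a self-contained sketch of this contract (``SaharaException`` and the
dict-based cluster structure below are stand-ins for illustration, not
sahara's real classes):

```python
# Schematic validate() implementation mirroring the NameNode example above.
class SaharaException(Exception):
    """Stand-in for sahara's real base exception class."""
    pass


class NotSingleNameNodeException(SaharaException):
    def __init__(self, nn_count):
        super(NotSingleNameNodeException, self).__init__(
            "Hadoop cluster should contain only 1 NameNode instance. "
            "Actual NN count is %s" % nn_count)


def validate(cluster):
    # count NameNode instances across the cluster's node groups
    nn_count = sum(ng["count"] for ng in cluster["node_groups"]
                   if "namenode" in ng["processes"])
    if nn_count != 1:
        raise NotSingleNameNodeException(nn_count)
    # a successful validation returns None
```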
validate_scaling( cluster, existing, additional )
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To be improved.
@ -70,46 +71,44 @@ Validates a given cluster before scaling operation.
*Returns*: list of validation_errors
update_infra( cluster )
~~~~~~~~~~~~~~~~~~~~~~~
This method is no longer used now that Sahara utilizes Heat for OpenStack
resource provisioning, and is not currently utilized by any plugin.
*Returns*: None
configure_cluster( cluster )
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Configures cluster on the VMs provisioned by sahara. In this function the
plugin should perform all actions like adjusting OS, installing required
packages (including Hadoop, if needed), configuring Hadoop, etc.
*Returns*: None
start_cluster( cluster )
~~~~~~~~~~~~~~~~~~~~~~~~
Start an already configured cluster. This method is guaranteed to be called
only on a cluster which was already prepared with a configure_cluster(...)
call.
*Returns*: None
scale_cluster( cluster, instances )
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Scale an existing cluster with additional instances. The instances argument
is a list of ready-to-configure instances. The plugin should do all
configuration operations in this method and start all services on those
instances.
*Returns*: None
.. _get_edp_engine:
get_edp_engine( cluster, job_type )
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Returns an EDP job engine object that supports the specified job_type on the
given cluster, or None if there is no support. The EDP job engine object
@ -119,50 +118,75 @@ job_type is a String matching one of the job types listed in
*Returns*: an EDP job engine object or None
decommission_nodes( cluster, instances )
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Scale cluster down by removing a list of instances. The plugin should stop
services on the provided list of instances. The plugin also may need to update
some configurations on other instances when nodes are removed; if so, this
method must perform that reconfiguration.
*Returns*: None
on_terminate_cluster( cluster )
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When a user terminates a cluster, sahara simply shuts down all the cluster VMs.
This method is guaranteed to be invoked before that, allowing the plugin to do
some clean-up.
*Returns*: None
get_open_ports( node_group )
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When user requests sahara to automatically create a security group for the
node group (``auto_security_group`` property set to True), sahara will call
this plugin method to get a list of ports that need to be opened.
*Returns*: list of ports to be opened in the auto security group for the
given node group
def get_edp_job_types( versions )
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Optional method, which provides the ability to see all supported job types for
specified plugin versions.
*Returns*: dict with supported job types for the specified versions of the
plugin
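As a sketch, the returned mapping pairs each plugin version with its supported
job types; the version string and job type names below are illustrative only,
not taken from any particular plugin:

```python
# Hypothetical result of calling get_edp_job_types(versions=["2.7.1"]);
# the dict maps each requested plugin version to its supported job types.
supported_job_types = {
    "2.7.1": ["Java", "MapReduce", "Pig", "Hive"],
}

# A caller can then check whether a version supports a given job type.
pig_supported = "Pig" in supported_job_types["2.7.1"]
```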
def recommend_configs( self, cluster, scaling=False )
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Optional method, which provides recommendations for cluster configuration
before a create or scale operation.
*Returns*: None
def get_image_arguments( self, hadoop_version ):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Optional method, which gets the argument set taken by the plugin's image
generator, or NotImplemented if the plugin does not provide image generation
support. See :doc:`image-gen`.
*Returns*: A sequence with items of type sahara.plugins.images.ImageArgument.
def pack_image( self, hadoop_version, remote, reconcile=True, ... ):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Optional method which packs an image for registration in Glance and use by
Sahara. This method is called from the image generation CLI rather than from
the Sahara API or engine service. See :doc:`image-gen`.
*Returns*: None (modifies the image pointed to by the remote in-place).
def validate_images( self, cluster, reconcile=True, image_arguments=None ):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Validates the image to be used to create a cluster, to ensure that it meets
the specifications of the plugin. See :doc:`image-gen`.
*Returns*: None; may raise a sahara.plugins.exceptions.ImageValidationError
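To tie the methods above together, here is a minimal, hedged sketch of a
plugin class. A stand-in base class is defined so the example is
self-contained; a real plugin extends
`sahara.plugins.provisioning:ProvisioningPluginBase`, and the class name and
port numbers below are purely illustrative:

```python
class ProvisioningPluginBase(object):
    """Stand-in for sahara.plugins.provisioning:ProvisioningPluginBase."""


class ExamplePlugin(ProvisioningPluginBase):
    def configure_cluster(self, cluster):
        # Adjust the OS, install packages and configure Hadoop on the
        # instances that sahara has provisioned.
        pass

    def start_cluster(self, cluster):
        # Start services; guaranteed to run only after configure_cluster().
        pass

    def get_open_ports(self, node_group):
        # Ports to open when auto_security_group is enabled
        # (illustrative HDFS/YARN web UI ports).
        return [50070, 8088]

    def get_edp_engine(self, cluster, job_type):
        # Return an EDP engine supporting job_type, or None if unsupported.
        return None
```

Optional hooks such as ``get_edp_job_types`` or ``get_image_arguments`` can be
added in the same way.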
Object Model
============
@ -171,9 +195,9 @@ Here is a description of all the objects involved in the API.
Notes:
- clusters and node_groups have extra fields allowing the plugin to
persist any supplementary info about the cluster.
- node_process is just a process that runs on some node in the cluster.
Example list of node processes:
@ -212,7 +236,7 @@ An object, describing one configuration entry
| scope | enum | Could be either 'node' or 'cluster'. |
+-------------------+--------+------------------------------------------------+
| is_optional | bool | If is_optional is False and no default_value |
| | | is specified, user must provide a value. |
+-------------------+--------+------------------------------------------------+
| priority | int | 1 or 2. A Hint for UI. Configs with priority |
| | | *1* are always displayed. |
@ -245,7 +269,7 @@ An instance created for cluster.
+===============+=========+===================================================+
| instance_id | string | Unique instance identifier. |
+---------------+---------+---------------------------------------------------+
| instance_name | string | OpenStack instance name. |
+---------------+---------+---------------------------------------------------+
| internal_ip | string | IP to communicate with other instances. |
+---------------+---------+---------------------------------------------------+
@ -255,7 +279,7 @@ An instance created for cluster.
| volumes | list | List of volumes attached to instance. Empty if |
| | | ephemeral drive is used. |
+---------------+---------+---------------------------------------------------+
| nova_info | object | Nova instance object. |
+---------------+---------+---------------------------------------------------+
| username | string | Username, that sahara uses for establishing |
| | | remote connections to instance. |
@ -265,7 +289,7 @@ An instance created for cluster.
| fqdn | string | Fully qualified domain name for this instance. |
+---------------+---------+---------------------------------------------------+
| remote | helpers | Object with helpers for performing remote |
| | | operations. |
+---------------+---------+---------------------------------------------------+
@ -1,22 +1,23 @@
Pluggable Provisioning Mechanism
================================
Sahara can be integrated with 3rd party management tools like Apache Ambari
and Cloudera Management Console. The integration is achieved using the plugin
mechanism.
In short, responsibilities are divided between the Sahara core and a plugin as
follows. Sahara interacts with the user and uses Heat to provision OpenStack
resources (VMs, baremetal servers, security groups, etc.). The plugin installs
and configures a Hadoop cluster on the provisioned instances. Optionally,
a plugin can deploy management and monitoring tools for the cluster. Sahara
provides plugins with utility methods to work with provisioned instances.
A plugin must extend the `sahara.plugins.provisioning:ProvisioningPluginBase`
class and implement all the required methods. Read :doc:`plugin.spi` for
details.
The `instance` objects provided by Sahara have a `remote` property which
can be used to interact with instances. The `remote` is a context manager so
you can use it in `with instance.remote:` statements. The list of available
commands can be found in `sahara.utils.remote.InstanceInteropHelper`.
See the source code of the Vanilla plugin for usage examples.
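As a self-contained sketch of this pattern, stub classes below stand in for
the real sahara objects; `execute_command` is one of the helpers in
`InstanceInteropHelper`, but the command shown is only an example:

```python
class _StubRemote(object):
    """Minimal stand-in for sahara.utils.remote.InstanceInteropHelper."""

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # A real helper would close the SSH connection here.
        return False

    def execute_command(self, cmd):
        # A real helper runs the command on the instance over SSH and
        # returns its exit code and output; the stub just echoes it.
        return 0, "ran: %s" % cmd


class Instance(object):
    """Stand-in for the instance objects sahara passes to plugins."""

    @property
    def remote(self):
        return _StubRemote()


instance = Instance()
with instance.remote as r:
    code, output = r.execute_command("service hadoop-hdfs-namenode start")
```

The real helper offers more operations (file transfer and the like); see
`sahara.utils.remote.InstanceInteropHelper` for the full list.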
@ -54,7 +54,7 @@ choice.
$ ssh user@hostname
$ wget http://sahara-files.mirantis.com/images/upstream/<openstack_release>/<sahara_image>.qcow2
Upload the image downloaded above into the OpenStack Image service:
.. sourcecode:: console
@ -87,7 +87,7 @@ OR
* Build the image using: `diskimage-builder script <https://github.com/openstack/sahara-image-elements/blob/master/diskimage-create/README.rst>`_
Remember the image name or save the image ID. This will be used during the
image registration with sahara. You can get the image ID using the
``openstack`` command line tool as follows:
@ -106,9 +106,19 @@ image registration with sahara. You can get the image ID using the
Now you will begin to interact with sahara by registering the virtual
machine image in the sahara image registry.
Register the image with the username ``ubuntu``.
.. note::
The username will vary depending on the source image used, as follows:
Ubuntu: ``ubuntu``
CentOS 7: ``centos``
CentOS 6: ``cloud-user``
Fedora: ``fedora``
Note that the Sahara team recommends using CentOS 7 instead of CentOS 6 as
a base OS wherever possible; it is better supported throughout the OpenStack
image maintenance infrastructure and its more modern filesystem is much
more appropriate for large-scale data processing. For more information,
please see :doc:`../userdoc/vanilla_plugin`.
.. sourcecode:: console
@ -118,8 +128,9 @@ will vary depending on the source image used, for more please see*
Tag the image to inform sahara about the plugin and the version with which
it shall be used.
.. note::
For the steps below and the rest of this guide, substitute
``<plugin_version>`` with the appropriate version of your plugin.
.. sourcecode:: console
@ -174,8 +185,9 @@ with the ``plugin show`` command. For example:
| YARN | nodemanager, resourcemanager |
+---------------------+-----------------------------------------------------------------------------------------------------------------------+
.. note::
These commands assume that floating IP addresses are being used. For more
details on floating IPs, please see :ref:`floating_ip_management`.
Create a master node group template with the command:
@ -237,7 +249,7 @@ Create a worker node group template with the command:
Alternatively you can create node group templates from JSON files:
If your environment does not use floating IPs, omit defining floating IP in
the template below.
Sample templates can be found here:
@ -302,8 +314,8 @@ added properly:
| vanilla-default-worker | 6546bf44-0590-4539-bfcb-99f8e2c11efc | vanilla | <plugin_version> |
+------------------------+--------------------------------------+-------------+--------------------+
Remember the name or save the ID for the master and worker node group
templates, as they will be used during cluster template creation.
For example:
@ -420,7 +432,7 @@ Create a cluster with the command:
| Version | <plugin_version> |
+----------------------------+----------------------------------------------------+
Alternatively you can create a cluster template from a JSON file:
Create a file named ``my_cluster_create.json`` with the following content:
@ -445,11 +457,10 @@ Dashboard, or through the ``openstack`` command line client as follows:
$ openstack keypair create my_stack --public-key $PATH_TO_PUBLIC_KEY
If sahara is configured to use neutron for networking, you will also need to
include the ``--neutron-network`` argument in the ``cluster create`` command
or the ``neutron_management_network`` parameter in ``my_cluster_create.json``.
If your environment does not use neutron, you should omit these arguments. You
can determine the neutron network id with the following command:
.. sourcecode:: console
@ -475,9 +486,9 @@ line tool as follows:
The cluster creation operation may take several minutes to complete. During
this time the "status" returned from the previous command may show states
other than ``Active``. A cluster can also be created with the ``wait`` flag.
In that case the cluster creation command will not return until the cluster
has moved to the ``Active`` state.
8. Run a MapReduce job to check Hadoop installation
---------------------------------------------------
@ -485,7 +496,8 @@ will be moved to the ``Active`` state.
Check that your Hadoop installation is working properly by running an
example job on the cluster manually.
* Log in to the NameNode (usually the master node) via ssh with the ssh-key
used above:
.. sourcecode:: console

View File

@ -6,24 +6,31 @@ We have a bunch of different tests for Sahara.
Unit Tests
++++++++++
In most Sahara sub-repositories we have a directory that contains Python unit
tests, located at `_package_/tests/unit` or `_package_/tests`.
Scenario integration tests
++++++++++++++++++++++++++
New scenario integration tests were implemented for Sahara. They are available
in the sahara-tests repository
(https://git.openstack.org/cgit/openstack/sahara-tests).
Tempest tests
+++++++++++++
Sahara has a Tempest plugin in the sahara-tests repository covering all major
API features.
Additional tests
++++++++++++++++
Additional tests reside in the sahara-tests repository (as above):
* REST API tests check that the Sahara REST API works.
The only parts that are not tested are cluster creation and EDP. For more
info about the API tests see
http://docs.openstack.org/developer/tempest/field_guide/api.html
* CLI tests check read-only operations using the Sahara CLI. For more info see
http://docs.openstack.org/developer/tempest/field_guide/cli.html