Merge "Removing extraneous Swift information from Features"

commit 36ccea816b

Features Overview
=================

Cluster Scaling
---------------

The mechanism of cluster scaling is designed to enable the user to change
the number of running instances without creating a new cluster.
The user may change the number of instances in existing Node Groups or add
new Node Groups.

If the cluster fails to scale properly, all changes will be rolled back.
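
For illustration, a scaling request can be sketched as a JSON body with
``resize_node_groups`` and ``add_node_groups`` entries; these field names are
modeled on the Sahara REST API and should be treated as assumptions for your
deployment:

```python
# Sketch of a cluster-scaling request body. The field names
# ("resize_node_groups", "add_node_groups") are modeled on the Sahara
# REST API and are assumptions here, not a verified contract.

def build_scale_request(resize=None, add=None):
    """Build a JSON-serializable body for a cluster scale call.

    resize: list of (node_group_name, new_count) pairs for existing groups
    add:    list of dicts describing new Node Groups to attach
    """
    body = {}
    if resize:
        body["resize_node_groups"] = [
            {"name": name, "count": count} for name, count in resize
        ]
    if add:
        body["add_node_groups"] = list(add)
    return body

# Grow the existing "worker" group to 4 instances and add a new group.
request = build_scale_request(
    resize=[("worker", 4)],
    add=[{"name": "worker-large", "count": 2}],
)
```

If the scale operation fails, the rollback described above returns the
cluster to its pre-request state.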

Swift Integration
-----------------

In order to leverage Swift within Hadoop, including using Swift data sources
from within EDP, Hadoop requires the application of a patch.
For additional information about using Swift with Sahara, including patching
Hadoop and configuring Sahara, please refer to the :doc:`hadoop-swift`
documentation.
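
Swift paths used from Hadoop follow the template
``swift://<container>.<provider>/<object>``, and Sahara registers a provider
named ``sahara``. A small helper makes the composition explicit (the function
itself is illustrative, not part of any Sahara API):

```python
# Illustrative helper for composing Swift URLs of the form
# swift://<container>.<provider>/<object>. Sahara configures a provider
# named "sahara"; the helper function itself is hypothetical.

def swift_url(container, obj, provider="sahara"):
    return "swift://{0}.{1}/{2}".format(container, provider, obj)

print(swift_url("integration", "temp"))  # swift://integration.sahara/temp
```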

Cinder support
--------------

Cinder is a block storage service that can be used as an alternative to an
ephemeral drive. Using Cinder volumes increases the reliability of data,
which is important for the HDFS service.

The user can set how many volumes will be attached to each node in a Node
Group and the size of each volume.

All volumes are attached during Cluster creation/scaling operations.
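
As a sketch, the per-Node-Group volume settings can be expressed as two
fields; the names ``volumes_per_node`` and ``volumes_size`` are modeled on
Sahara's node group templates and are assumptions here:

```python
# Sketch of the volume-related fields of a Node Group. The field names
# (volumes_per_node, volumes_size) are assumptions modeled on Sahara's
# node group templates.

def volume_spec(volumes_per_node, volume_size_gb):
    if volumes_per_node < 0 or volume_size_gb <= 0:
        raise ValueError("need volumes_per_node >= 0 and a positive size")
    return {
        "volumes_per_node": volumes_per_node,  # Cinder volumes per instance
        "volumes_size": volume_size_gb,        # size of each volume, in GB
    }

# Attach two 100 GB Cinder volumes to every node in the "worker" group.
worker = {"name": "worker", "count": 3}
worker.update(volume_spec(2, 100))
```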

Neutron and Nova Network support
--------------------------------

An OpenStack Cluster may use Nova Network or Neutron as a networking
service. Sahara supports both, but when deployed, a special configuration
for networking should be set explicitly. By default Sahara will behave as if
Nova Network is used.
If the OpenStack Cluster uses Neutron, the ``use_neutron`` option should be
set to ``True`` in the Sahara configuration file. In addition, if the
OpenStack Cluster supports network namespaces, set the ``use_namespaces``
option to ``True``.

.. sourcecode:: cfg

    use_neutron=True
    use_namespaces=True

Floating IP Management
----------------------

Sahara needs to access instances through ssh during a Cluster setup. To
establish a connection Sahara may use both the fixed and floating IP of an
Instance. By default the ``use_floating_ips`` parameter is set to ``True``,
so Sahara will use the Floating IP of an Instance to connect. In this case,
the user has two options for how to make all instances get a floating IP:

* Nova Network may be configured to assign floating IPs automatically by
  setting ``auto_assign_floating_ip`` to ``True`` in ``nova.conf``
* The user may specify a floating IP pool for each Node Group directly.

Note: When using floating IPs for management (``use_floating_ips=True``)
**every** instance in the Cluster should have a floating IP, otherwise
Sahara will not be able to work with it.

If the ``use_floating_ips`` parameter is set to ``False``, Sahara will use
the Instances' fixed IPs for management. In this case the node where Sahara
is running should have access to the Instances' fixed IP network. When
OpenStack uses Neutron for networking, the user will be able to choose a
fixed IP network for all instances in a Cluster.
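
The address-selection rule above can be sketched as follows (the helper and
the instance shape are hypothetical, not Sahara code):

```python
# Hypothetical sketch of the management-address rule described above:
# with use_floating_ips=True every instance must expose a floating IP,
# otherwise the fixed IP is used instead.

def management_ip(instance, use_floating_ips=True):
    if use_floating_ips:
        ip = instance.get("floating_ip")
        if ip is None:
            # Every instance needs a floating IP in this mode.
            raise ValueError("instance has no floating IP")
        return ip
    return instance["fixed_ip"]

inst = {"fixed_ip": "10.0.0.5", "floating_ip": "172.24.4.10"}
print(management_ip(inst))                          # 172.24.4.10
print(management_ip(inst, use_floating_ips=False))  # 10.0.0.5
```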

Anti-affinity
-------------

One of the problems with Hadoop running on OpenStack is that there is no
ability to control where a machine is actually running. We cannot be sure
that two new virtual machines are started on different physical machines. As
a result, any replication within the cluster is not reliable because all
replicas may end up on the same physical machine.
The anti-affinity feature provides the ability to explicitly tell Sahara to
run specified processes on different compute nodes. This is especially
useful for the Hadoop datanode process, to make HDFS replicas reliable.
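
As an illustration, the processes to spread across compute nodes are listed
when the cluster is created; the ``anti_affinity`` field below is modeled on
the Sahara cluster-create API, and its exact shape should be treated as an
assumption:

```python
# Sketch of a cluster-create body requesting anti-affinity for the
# datanode process. The "anti_affinity" field is modeled on the Sahara
# API and is an assumption here.

cluster_request = {
    "name": "hdfs-cluster",
    "anti_affinity": ["datanode"],  # run each datanode on a distinct host
    "node_groups": [
        {"name": "worker", "count": 3,
         "node_processes": ["datanode", "tasktracker"]},
    ],
}

# Every process named here should land on a different compute node.
spread_processes = set(cluster_request["anti_affinity"])
```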

Starting with the Juno release Sahara creates server groups with the
``anti-affinity`` policy to enable the anti-affinity feature. Sahara creates one

Heat Integration
----------------

Sahara may use the
`OpenStack Orchestration engine <https://wiki.openstack.org/wiki/Heat>`_
(aka Heat) to provision nodes for a Hadoop cluster.
To make Sahara work with Heat the following steps are required:

* Your OpenStack installation must have the 'orchestration' service up and
  running