Sahara Advanced Configuration Guide
===================================

This guide addresses specific aspects of Sahara configuration that pertain to
advanced usage. It is divided into sections about various features that can be
utilized, and their related configurations.

.. _custom_network_topologies:

Custom network topologies
-------------------------

Sahara accesses instances at several stages of cluster spawning through
SSH and HTTP. Floating IPs and network namespaces
(see :ref:`neutron-nova-network`) will be automatically used for
access when present. When floating IPs are not assigned to instances and
namespaces are not being used, sahara will need an alternative method to
reach them.

The ``proxy_command`` parameter of the configuration file can be used to
give sahara a command to access instances. This command is run on the
sahara host and must open a netcat socket to the instance destination
port. The ``{host}`` and ``{port}`` keywords should be used to describe the
destination; they will be substituted at runtime. Other keywords that
can be used are: ``{tenant_id}``, ``{network_id}`` and ``{router_id}``.

For example, the following parameter in the sahara configuration file
would be used if instances are accessed through a relay machine:

.. sourcecode:: cfg

    [DEFAULT]
    proxy_command='ssh relay-machine-{tenant_id} nc {host} {port}'

Whereas the following shows an example of accessing instances through
a custom network namespace:

.. sourcecode:: cfg

    [DEFAULT]
    proxy_command='ip netns exec ns_for_{network_id} nc {host} {port}'

.. _data_locality_configuration:

Data-locality configuration
---------------------------

Hadoop provides a data-locality feature that lets task trackers and
data nodes take into account the rack, Compute node, or virtual machine
on which they are spawned. Sahara exposes this functionality to the user
through a few configuration parameters and user-defined topology files.

To enable data-locality, set the ``enable_data_locality`` parameter to
``True`` in the sahara configuration file:

.. sourcecode:: cfg

    [DEFAULT]
    enable_data_locality=True

With data locality enabled, you must now specify the topology files
for the Compute and Object Storage services. These files are
specified in the sahara configuration file as follows:

.. sourcecode:: cfg

    [DEFAULT]
    compute_topology_file=/etc/sahara/compute.topology
    swift_topology_file=/etc/sahara/swift.topology

The ``compute_topology_file`` should contain mappings between Compute
nodes and racks in the following format:

.. sourcecode:: cfg

    compute1 /rack1
    compute2 /rack2
    compute3 /rack2

Note that the Compute node names must be exactly the same as configured in
OpenStack (the ``host`` column in the admin list of instances).

The ``swift_topology_file`` should contain mappings between Object Storage
nodes and racks in the following format:

.. sourcecode:: cfg

    node1 /rack1
    node2 /rack2
    node3 /rack2

Note that the Object Storage node names must be exactly the same as
configured in the object ring. Also, you should ensure that instances
with the task tracker process have direct access to the Object Storage
nodes.

Hadoop versions after 1.2.0 support a four-layer topology (for more detail
please see the `HADOOP-8468 JIRA issue`_). To enable this feature, set the
``enable_hypervisor_awareness`` parameter to ``True`` in the configuration
file. In this case sahara will add the Compute node ID as a second level of
topology for virtual machines.

.. _HADOOP-8468 JIRA issue: https://issues.apache.org/jira/browse/HADOOP-8468
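
Putting these options together, a sahara configuration file with data
locality, hypervisor awareness, and both topology files enabled would
contain settings such as the following (all option names and file paths are
the ones used in the examples above):

.. sourcecode:: cfg

    [DEFAULT]
    enable_data_locality=True
    enable_hypervisor_awareness=True
    compute_topology_file=/etc/sahara/compute.topology
    swift_topology_file=/etc/sahara/swift.topology
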

.. _distributed-mode-configuration:

Distributed mode configuration
------------------------------

Sahara can be configured to run in a distributed mode that creates a
separation between the API and engine processes. This allows the API
process to remain relatively free to handle requests while offloading
intensive tasks to the engine processes.

The ``sahara-api`` application works as a front-end and serves user
requests. It offloads 'heavy' tasks to the ``sahara-engine`` process
via RPC mechanisms. While the ``sahara-engine`` process may be loaded
with tasks, ``sahara-api`` stays free and can quickly respond to
user queries.

If sahara runs on several hosts, the API requests can be
balanced between several ``sahara-api`` hosts using a load balancer.
It is not required to balance the load between the ``sahara-engine``
hosts, as this is done automatically via the message broker.

If a single host becomes unavailable, the other hosts will continue
serving user requests. This improves scalability and provides a degree of
fault tolerance. Note that distributed mode does not provide true
high availability. While the failure of a single host does not
affect the work of the others, all of the operations running on
the failed host will stop. For example, if a cluster scaling operation is
interrupted, the cluster will be stuck in a half-scaled state. The
cluster might continue working, but it will be impossible to scale it
further or run jobs on it via EDP.

To run sahara in distributed mode, pick several hosts on which
you want to run sahara services and follow these steps:

* On each host install and configure sahara using the
  `installation guide <../installation.guide.html>`_
  except:

  * Do not run ``sahara-db-manage`` or launch sahara with ``sahara-all``.
  * Ensure that each configuration file provides a database connection
    string to a single database shared by all hosts (see the sketch after
    this list).

* Run ``sahara-db-manage`` as described in the installation guide,
  but only on a single (arbitrarily picked) host.

* The ``sahara-api`` and ``sahara-engine`` processes use oslo.messaging to
  communicate with each other. You will need to configure it properly on
  each host (see below).

* Run ``sahara-api`` and ``sahara-engine`` on the desired hosts. You may
  run both processes on the same or separate hosts as long as they are
  configured to use the same message broker and database.
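
As a minimal sketch of the shared-database requirement above, each host's
sahara configuration file would point at the same database. The connection
string below is a placeholder, shown with a MySQL-style URL purely as an
example:

.. sourcecode:: cfg

    [database]
    connection=mysql://sahara:SAHARA_DBPASS@db-host/sahara
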

To configure oslo.messaging, you will first need to choose a message
broker driver. Three drivers are currently provided: RabbitMQ, Qpid,
and ZeroMQ. For the RabbitMQ or Qpid drivers please see the
:ref:`notification-configuration` documentation for an explanation of
common configuration options.
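
For example, assuming the RabbitMQ driver is chosen, each host's sahara
configuration file might point at a shared broker as in the following
sketch. The host name and credentials are placeholders, and option names
may vary between oslo.messaging releases:

.. sourcecode:: cfg

    [DEFAULT]
    rpc_backend=rabbit
    rabbit_host=192.168.0.10
    rabbit_userid=sahara
    rabbit_password=RABBIT_PASS
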

For an expanded view of all the options provided by each message broker
driver in oslo.messaging please refer to the options available in the
respective source trees:

* For RabbitMQ see

  * rabbit_opts variable in `impl_rabbit.py <https://git.openstack.org/cgit/openstack/oslo.messaging/tree/oslo/messaging/_drivers/impl_rabbit.py?id=1.4.0#n38>`_
  * amqp_opts variable in `amqp.py <https://git.openstack.org/cgit/openstack/oslo.messaging/tree/oslo/messaging/_drivers/amqp.py?id=1.4.0#n37>`_

* For Qpid see

  * qpid_opts variable in `impl_qpid.py <https://git.openstack.org/cgit/openstack/oslo.messaging/tree/oslo/messaging/_drivers/impl_qpid.py?id=1.4.0#n40>`_
  * amqp_opts variable in `amqp.py <https://git.openstack.org/cgit/openstack/oslo.messaging/tree/oslo/messaging/_drivers/amqp.py?id=1.4.0#n37>`_

* For ZeroMQ see

  * zmq_opts variable in `impl_zmq.py <https://git.openstack.org/cgit/openstack/oslo.messaging/tree/oslo/messaging/_drivers/impl_zmq.py?id=1.4.0#n49>`_
  * matchmaker_opts variable in `matchmaker.py <https://git.openstack.org/cgit/openstack/oslo.messaging/tree/oslo/messaging/_drivers/matchmaker.py?id=1.4.0#n27>`_
  * matchmaker_redis_opts variable in `matchmaker_redis.py <https://git.openstack.org/cgit/openstack/oslo.messaging/tree/oslo/messaging/_drivers/matchmaker_redis.py?id=1.4.0#n26>`_
  * matchmaker_opts variable in `matchmaker_ring.py <https://git.openstack.org/cgit/openstack/oslo.messaging/tree/oslo/messaging/_drivers/matchmaker_ring.py?id=1.4.0#n27>`_

These options will also be present in the generated sample configuration
file. For instructions on creating the configuration file please see the
:doc:`configuration.guide`.

External key manager usage (EXPERIMENTAL)
-----------------------------------------

Sahara generates and stores several passwords during the course of operation.
To harden sahara's usage of passwords it can be instructed to use an
external key manager for storage and retrieval of these secrets. To enable
this feature there must first be an OpenStack Key Manager service deployed
within the stack. Currently, the barbican project is the only key manager
supported by sahara.

With a Key Manager service deployed on the stack, sahara must be configured
to enable the external storage of secrets. This is accomplished by editing
the sahara configuration file as follows:

.. sourcecode:: cfg

    [DEFAULT]
    use_external_key_manager=True

.. TODO (mimccune)
   this language should be removed once a new keystone authentication
   section has been created in the configuration file.

Additionally, at this time there are two more values which must be provided
to ensure proper access for sahara to the Key Manager service. These are
the Identity domain for the administrative user and the domain for the
administrative project. By default these values will appear as:

.. sourcecode:: cfg

    [DEFAULT]
    admin_user_domain_name=default
    admin_project_domain_name=default

With all of these values configured and the Key Manager service deployed,
sahara will begin storing its secrets in the external manager.

Indirect instance access through proxy nodes
--------------------------------------------

.. warning::
    The indirect instance access feature is in alpha state. We do not
    recommend using it in a production environment.

Sahara needs to access instances through SSH during cluster setup. This
access can be obtained in a number of different ways (see
:ref:`neutron-nova-network`, :ref:`floating_ip_management`,
:ref:`custom_network_topologies`). Sometimes it is impossible to provide
access to all nodes (because of a limited number of floating IPs or security
policies). In these cases access can be gained using other nodes of the
cluster as proxy gateways. To enable this, set ``is_proxy_gateway=True``
for the node group you want to use as a proxy. Sahara will communicate with
all other cluster instances through the instances of this node group.

Note that if ``use_floating_ips=true`` and the cluster contains a node group
with ``is_proxy_gateway=True``, the requirement to have ``floating_ip_pool``
specified applies only to the proxy node group. Other instances will be
accessed through proxy instances using the standard private network.

Note that the Cloudera Hadoop plugin doesn't support access to the Cloudera
manager through a proxy node. This means that for CDH clusters only nodes
with the Cloudera manager can be designated as proxy gateway nodes.

Multi region deployment
-----------------------

Sahara supports multi-region deployment. To enable this option, each
instance of sahara should have the ``os_region_name=<region>``
parameter set in the configuration file. The following example demonstrates
configuring sahara to use the ``RegionOne`` region:

.. sourcecode:: cfg

    [DEFAULT]
    os_region_name=RegionOne

.. _non-root-users:

Non-root users
--------------

In cases where a proxy command is being used to access cluster instances
(for example, when using namespaces or when specifying a custom proxy
command), rootwrap functionality is provided to allow users other than
``root`` access to the needed operating system facilities. To use rootwrap,
the following configuration parameter is required to be set:

.. sourcecode:: cfg

    [DEFAULT]
    use_rootwrap=True

Assuming you elect to leverage the default rootwrap command
(``sahara-rootwrap``), you will need to perform the following additional setup
steps:

* Copy the provided sudoers configuration file from the local project file
  ``etc/sudoers.d/sahara-rootwrap`` to the system-specific location, usually
  ``/etc/sudoers.d``. This file is set up to allow a user named ``sahara``
  access to the rootwrap script. It contains the following:

  .. sourcecode:: cfg

      sahara ALL = (root) NOPASSWD: /usr/bin/sahara-rootwrap /etc/sahara/rootwrap.conf *

* Copy the provided rootwrap configuration file from the local project file
  ``etc/sahara/rootwrap.conf`` to the system-specific location, usually
  ``/etc/sahara``. This file contains the default configuration for rootwrap.

* Copy the provided rootwrap filters file from the local project file
  ``etc/sahara/rootwrap.d/sahara.filters`` to the location specified in the
  rootwrap configuration file, usually ``/etc/sahara/rootwrap.d``. This file
  contains the filters that will allow the ``sahara`` user to access the
  ``ip netns exec``, ``nc``, and ``kill`` commands through the rootwrap
  (depending on ``proxy_command`` you may need to set additional filters).
  It should look similar to the following:

  .. sourcecode:: cfg

      [Filters]
      ip: IpNetnsExecFilter, ip, root
      nc: CommandFilter, nc, root
      kill: CommandFilter, kill, root

If you wish to use a rootwrap command other than ``sahara-rootwrap`` you can
set the following parameter in your sahara configuration file:

.. sourcecode:: cfg

    [DEFAULT]
    rootwrap_command='sudo sahara-rootwrap /etc/sahara/rootwrap.conf'

For more information on rootwrap please refer to the
`official Rootwrap documentation <https://wiki.openstack.org/wiki/Rootwrap>`_.

Object Storage access using proxy users
---------------------------------------

To improve security for clusters accessing files in Object Storage,
sahara can be configured to use proxy users and delegated trusts for
access. This behavior has been implemented to reduce the need for
storing and distributing user credentials.

The use of proxy users involves creating an Identity domain that will be
designated as the home for these users. Proxy users will be
created on demand by sahara and will only exist during a job execution
which requires Object Storage access. The domain created for the
proxy users must be backed by a driver that allows sahara's admin user to
create new user accounts. This new domain should contain no roles, to limit
the potential access of a proxy user.

Once the domain has been created, sahara must be configured to use it by
adding the domain name and any potential delegated roles that must be used
for Object Storage access to the sahara configuration file. With the
domain enabled in sahara, users will no longer be required to enter
credentials for their data sources and job binaries referenced in
Object Storage.

Detailed instructions
^^^^^^^^^^^^^^^^^^^^^

First, a domain must be created in the Identity service to hold the proxy
users created by sahara. This domain must have an identity backend driver
that allows sahara to create new users. The default SQL engine is
sufficient, but if your keystone identity is backed by LDAP or similar
then domain-specific configurations should be used to ensure sahara's
access. Please see the `Keystone documentation`_ for more information.

.. _Keystone documentation: http://docs.openstack.org/developer/keystone/configuration.html#domain-specific-drivers
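
As an illustrative sketch only (the file location follows keystone's default
``domain_config_dir`` convention and the exact driver value depends on your
keystone release), a domain-specific identity configuration for a proxy
domain named ``sahara_proxy`` might look like:

.. sourcecode:: cfg

    # assumed path: /etc/keystone/domains/keystone.sahara_proxy.conf
    [identity]
    # older keystone releases may require the full class path
    # keystone.identity.backends.sql.Identity instead of the short name
    driver = sql
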

With the domain created, sahara's configuration file should be updated to
include the new domain name and any potential roles that will be needed. For
this example let's assume that the name of the proxy domain is
``sahara_proxy`` and the roles needed by proxy users will be ``Member`` and
``SwiftUser``.

.. sourcecode:: cfg

    [DEFAULT]
    use_domain_for_proxy_users=True
    proxy_user_domain_name=sahara_proxy
    proxy_user_role_names=Member,SwiftUser

..
    A note on the use of roles. In the context of the proxy user, any roles
    specified here are roles intended to be delegated to the proxy user from
    the user with access to Object Storage. More specifically, any roles that
    are required for Object Storage access by the project owning the object
    store must be delegated to the proxy user for authentication to be
    successful.

Finally, the stack administrator must ensure that images registered with
sahara have the latest version of the Hadoop swift filesystem plugin
installed. The sources for this plugin can be found in the
`sahara extra repository`_. For more information on images or swift
integration see the sahara documentation sections
:ref:`diskimage-builder-label` and :ref:`swift-integration-label`.

.. _Sahara extra repository: http://github.com/openstack/sahara-extra

.. _volume_instance_locality_configuration:

Volume instance locality configuration
--------------------------------------

The Block Storage service provides the ability to define volume instance
locality to ensure that instance volumes are created on the same host
as the instance's hypervisor. The ``InstanceLocalityFilter`` provides the
mechanism for the selection of a storage provider located on the same
physical host as an instance.

To enable this functionality for instances of a specific node group, the
``volume_local_to_instance`` field in the node group template should be
set to ``True`` and some extra configuration is needed:

* The cinder-volume service should be launched on every physical host and at
  least one physical host should run both the cinder-scheduler and
  cinder-volume services.
* ``InstanceLocalityFilter`` should be added to the list of default filters
  (``scheduler_default_filters`` in cinder) for the Block Storage
  configuration (see the example after this list).
* The Extended Server Attributes extension needs to be active in the Compute
  service (this is true by default in nova), so that the
  ``OS-EXT-SRV-ATTR:host`` property is returned when requesting instance
  info.
* The user making the call needs to have sufficient rights for the property
  to be returned by the Compute service. This can be done in either of the
  following ways:

  * change nova's ``policy.json`` to allow the user access to the
    ``extended_server_attributes`` option.
  * designate an account with privileged rights in the cinder
    configuration:

    .. sourcecode:: cfg

        os_privileged_user_name =
        os_privileged_user_password =
        os_privileged_user_tenant =
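
For example, the filter list in the Block Storage configuration file
(``cinder.conf``) would have ``InstanceLocalityFilter`` appended; the other
filter names shown here are cinder's usual defaults and may differ in your
deployment:

.. sourcecode:: cfg

    [DEFAULT]
    scheduler_default_filters = AvailabilityZoneFilter,CapacityFilter,CapabilitiesFilter,InstanceLocalityFilter
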

Note that if the host has no space for volume creation, the volume will be
created with an ``Error`` state and cannot be used.

NTP service configuration
-------------------------

By default sahara will enable the NTP service on all cluster instances if the
NTP package is included in the image (the sahara disk image builder will
include NTP in all images it generates). The default NTP server will be
``pool.ntp.org``; this can be overridden using the ``default_ntp_server``
setting in the ``DEFAULT`` section of the sahara configuration file.
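
For example, to point all clusters at a different NTP server by default, the
sahara configuration file would contain the following (the server name below
is a placeholder):

.. sourcecode:: cfg

    [DEFAULT]
    default_ntp_server=ntp.example.com
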

If you would like to specify a different NTP server for a particular cluster
template, use the ``URL of NTP server`` setting in the ``General Parameters``
section when you create the template. If you would like to disable NTP for a
particular cluster template, deselect the ``Enable NTP service`` checkbox in
the ``General Parameters`` section when you create the template.