openstack-manuals/doc/common-rst/get_started_sahara.rst
daz ab45cacbb6 Getting started chapter reorganisation
1. Split chapter file content into section files per current Cloud Admin Guide
2. Added cross-references
Change-Id: I51025b6eb4bb9b8912871837f9ce83d91dca973d
Implements: blueprint reorganise-user-guides
2015-07-21 10:50:03 +10:00


.. :orphan:

=================================
OpenStack Data processing service
=================================

The Data processing service for OpenStack (sahara) aims to provide users
with a simple means to provision data processing (Hadoop, Spark)
clusters by specifying several parameters, such as the Hadoop version,
cluster topology, and node hardware details. After a user fills in
all the parameters, the Data processing service deploys the cluster in a
few minutes. Sahara also provides a means to scale already provisioned
clusters by adding or removing worker nodes on demand.

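
The provisioning and scaling workflow described above can be sketched in
Python as follows. This is an illustration only and is not sahara's API:
the ``ClusterSpec`` class, its field names, the example values, and the
rule that at least one worker node must remain are all assumptions made
for this sketch.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ClusterSpec:
    """Hypothetical container for the parameters a user fills in."""

    name: str
    plugin: str    # data processing framework, for example "hadoop" or "spark"
    version: str   # framework version (placeholder value in the usage below)
    flavor: str    # node hardware details (placeholder flavor name below)
    master_count: int = 1
    worker_count: int = 3

    def scale_workers(self, delta: int) -> "ClusterSpec":
        """Return a new spec with workers added (delta > 0) or removed (delta < 0)."""
        new_count = self.worker_count + delta
        # Assumption of this sketch: a cluster keeps at least one worker node.
        if new_count < 1:
            raise ValueError("a cluster needs at least one worker node")
        return replace(self, worker_count=new_count)


# All values here are placeholders chosen for illustration.
spec = ClusterSpec(name="dev-cluster", plugin="hadoop",
                   version="2.6.0", flavor="m1.medium")
bigger = spec.scale_workers(2)
print(bigger.worker_count)  # 5
```

The spec is immutable, so scaling returns a new object and leaves the
original description of the cluster unchanged.
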

The solution addresses the following use cases:

- Fast provisioning of Hadoop clusters on OpenStack for development and
  QA.
- Utilization of unused compute power from a general-purpose OpenStack
  IaaS cloud.
- Analytics-as-a-Service for ad-hoc or bursty analytic workloads.


Key features are:

- Designed as an OpenStack component.
- Managed through a REST API, with a UI available as part of the
  OpenStack dashboard.
- Support for different Hadoop distributions:

  - Pluggable system of Hadoop installation engines.
  - Integration with vendor-specific management tools, such as Apache
    Ambari or Cloudera Management Console.

- Predefined templates of Hadoop configurations with the ability to
  modify parameters.
- User-friendly UI for ad-hoc analytics queries based on Hive or Pig.