openstack-manuals/doc/common-rst/get_started_sahara.rst
daz c6036a4c8e Convert Cloud Admin Guide files to RST
2015-07-13 10:27:29 +10:00

Data processing service

The Data processing service for OpenStack (sahara) aims to give users a simple way to provision data processing clusters, such as Hadoop or Spark, by specifying a few parameters: the Hadoop version, the cluster topology, node hardware details, and similar settings. After a user fills in all the parameters, the Data processing service deploys the cluster in a few minutes. The service can also scale already provisioned clusters by adding or removing worker nodes on demand.
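To make the idea of "a cluster described by a few parameters" concrete, the following is a minimal Python sketch of such a description and of scaling a worker group. It is an illustration only and does not reproduce sahara's actual data model or API: the `NodeGroup` and `ClusterSpec` classes, the `scale` helper, and all values (names, flavors, version string) are hypothetical placeholders.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class NodeGroup:
    """A group of identical nodes (hypothetical model, not sahara's schema)."""
    name: str
    flavor: str        # node hardware details, for example an instance flavor
    count: int         # number of nodes in this group
    processes: tuple   # services expected to run on each node


@dataclass(frozen=True)
class ClusterSpec:
    """The parameters a user would fill in before provisioning."""
    name: str
    plugin: str        # data processing framework, for example "hadoop"
    version: str       # framework version (placeholder value below)
    node_groups: tuple # cluster topology


def scale(spec, group_name, delta):
    """Return a new spec with `delta` nodes added to or removed from a group."""
    groups = []
    found = False
    for group in spec.node_groups:
        if group.name == group_name:
            found = True
            new_count = group.count + delta
            if new_count < 0:
                raise ValueError("node count cannot be negative")
            group = replace(group, count=new_count)
        groups.append(group)
    if not found:
        raise KeyError(group_name)
    return replace(spec, node_groups=tuple(groups))


# Example: one master node and three workers, then two more workers on demand.
spec = ClusterSpec(
    name="dev-cluster",
    plugin="hadoop",
    version="2.6.0",
    node_groups=(
        NodeGroup("master", "m1.large", 1, ("namenode",)),
        NodeGroup("worker", "m1.medium", 3, ("datanode",)),
    ),
)
bigger = scale(spec, "worker", 2)
```

The spec is immutable, so `scale` returns a new description instead of changing the original. This keeps the requested state separate from whatever is currently deployed, which is one plausible way to reason about scaling requests.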

The solution addresses the following use cases:

  • Fast provisioning of Hadoop clusters on OpenStack for development and QA.
  • Utilization of unused compute power from a general-purpose OpenStack IaaS cloud.
  • Analytics-as-a-Service for ad-hoc or bursty analytic workloads.

Key features are:

  • Designed as an OpenStack component.
  • Managed through a REST API, with a UI available as part of the OpenStack dashboard.
  • Support for different Hadoop distributions:
    • Pluggable system of Hadoop installation engines.
    • Integration with vendor-specific management tools, such as Apache Ambari or Cloudera Management Console.
  • Predefined templates of Hadoop configurations with the ability to modify parameters.
  • User-friendly UI for ad-hoc analytics queries based on Hive or Pig.
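
Two of the features listed above, pluggable installation engines and predefined templates with modifiable parameters, can be sketched in a few lines of Python. This is a generic illustration of the pattern under stated assumptions, not sahara's plugin interface: the `TEMPLATES` table, the `register_engine` decorator, and the `render_template` and `provision` functions are all hypothetical names, and the template values are examples only.

```python
from copy import deepcopy

# Hypothetical predefined templates; names and values are illustrative only.
TEMPLATES = {
    "small-hadoop": {
        "plugin": "hadoop",
        "workers": 3,
        "config": {"dfs.replication": 3, "mapreduce.map.memory.mb": 1024},
    },
}

# Registry of installation engines, keyed by plugin name (assumed design).
ENGINES = {}


def register_engine(name):
    """Decorator that adds an installation engine to the registry."""
    def decorator(func):
        ENGINES[name] = func
        return func
    return decorator


@register_engine("hadoop")
def install_hadoop(settings):
    # A real engine would deploy software; this one only describes the action.
    return "would install Hadoop with {} workers".format(settings["workers"])


def render_template(name, overrides=None):
    """Copy a predefined template and apply user-supplied overrides."""
    settings = deepcopy(TEMPLATES[name])
    for key, value in (overrides or {}).items():
        if key == "config":
            # Merge configuration keys instead of replacing the whole mapping.
            settings["config"].update(value)
        else:
            settings[key] = value
    return settings


def provision(name, overrides=None):
    """Pick the engine named by the template and hand it the final settings."""
    settings = render_template(name, overrides)
    engine = ENGINES[settings["plugin"]]
    return engine(settings)


custom = render_template(
    "small-hadoop", {"workers": 5, "config": {"dfs.replication": 2}}
)
message = provision("small-hadoop", {"workers": 5})
```

Copying the template before applying overrides leaves the predefined version untouched, so several users can start from the same template and change only the parameters they care about.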