.. :orphan:

Data processing service
-----------------------

The Data processing service for OpenStack (sahara) aims to provide users
with a simple means to provision data processing (Hadoop, Spark)
clusters by specifying several parameters, such as the Hadoop version,
cluster topology, and node hardware details. After a user fills in
all the parameters, the Data processing service deploys the cluster in a
few minutes. Sahara also provides a means to scale already provisioned
clusters by adding or removing worker nodes on demand.

The solution addresses the following use cases:

- Fast provisioning of Hadoop clusters on OpenStack for development and
  QA.

- Utilization of unused compute power from a general-purpose OpenStack
  IaaS cloud.

- Analytics-as-a-Service for ad-hoc or bursty analytic workloads.

Key features are:

- Designed as an OpenStack component.

- Managed through a REST API, with a UI available as part of the
  OpenStack dashboard.

- Support for different Hadoop distributions:

  - Pluggable system of Hadoop installation engines.

  - Integration with vendor-specific management tools, such as Apache
    Ambari or Cloudera Management Console.

- Predefined templates of Hadoop configurations with the ability to
  modify parameters.

- User-friendly UI for ad-hoc analytics queries based on Hive or Pig.
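
The provisioning and scaling workflow described above can be sketched as
two JSON request bodies. This is a minimal illustration, not the
service's reference client: the helper functions are hypothetical, the
field names are assumptions modeled on the Data processing REST API and
should be checked against the API reference for your release, and every
value is a placeholder. The sketch only builds the bodies and sends
nothing over the network.

```python
import json


def build_cluster_request(name, plugin_name, hadoop_version,
                          cluster_template_id, image_id, keypair_name=None):
    """Assemble a body for a cluster-create request.

    Hypothetical helper: the keys below are assumed field names, not
    confirmed by this document.
    """
    body = {
        "name": name,
        "plugin_name": plugin_name,            # which installation engine to use
        "hadoop_version": hadoop_version,      # version offered by that plugin
        "cluster_template_id": cluster_template_id,  # predefined topology
        "default_image_id": image_id,          # image for the cluster nodes
    }
    if keypair_name is not None:
        body["user_keypair_id"] = keypair_name
    return body


def build_scale_request(node_group_name, new_count):
    """Assemble a body that resizes one node group, for example workers.

    Hypothetical helper: "resize_node_groups" is an assumed field name.
    """
    if new_count < 0:
        raise ValueError("node count must not be negative")
    return {
        "resize_node_groups": [
            {"name": node_group_name, "count": new_count},
        ]
    }


if __name__ == "__main__":
    # All identifiers below are placeholders for illustration only.
    create_body = build_cluster_request(
        name="demo-cluster",
        plugin_name="vanilla",
        hadoop_version="2.7.1",
        cluster_template_id="TEMPLATE_ID",
        image_id="IMAGE_ID",
        keypair_name="demo-key",
    )
    print(json.dumps(create_body, indent=2))
    print(json.dumps(build_scale_request("worker", 5), indent=2))
```

Keeping the create and scale bodies in separate helpers mirrors the two
operations the text describes: a cluster is first provisioned from a
template, and worker nodes are added or removed later on demand.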