diff --git a/doc/common/ch_getstart.xml b/doc/common/ch_getstart.xml index 5b675604b6..62a206e845 100644 --- a/doc/common/ch_getstart.xml +++ b/doc/common/ch_getstart.xml @@ -204,6 +204,7 @@ +
Feedback diff --git a/doc/common/section_getstart_sahara.xml b/doc/common/section_getstart_sahara.xml new file mode 100644 index 0000000000..e66ff31b44 --- /dev/null +++ b/doc/common/section_getstart_sahara.xml @@ -0,0 +1,48 @@ + +
+ Data processing service + The Data processing service for OpenStack (sahara) aims to provide + users with a simple means to provision data processing (Hadoop, Spark) + clusters by specifying several parameters such as the Hadoop version, cluster + topology, and node hardware details, among others. After the user fills in + all the parameters, the Data processing service deploys the cluster in a + few minutes. Sahara also provides a means to scale already provisioned + clusters by adding or removing worker nodes on demand. + + + The solution addresses the following use cases: + + Fast provisioning of Hadoop clusters on OpenStack for + development and QA. + Utilization of unused compute power from a general-purpose + OpenStack IaaS cloud. + Analytics-as-a-Service for ad-hoc or bursty analytic + workloads. + + + + + Key features are: + + Designed as an OpenStack component. + Managed through a REST API, with a UI available as part + of the OpenStack dashboard. + Support for different Hadoop distributions: + + Pluggable system of Hadoop installation + engines. + Integration with vendor-specific management tools, + such as Apache Ambari or Cloudera Management Console. + + + Predefined templates of Hadoop configurations with + the ability to modify parameters. + User-friendly UI for ad-hoc analytics queries based on + Hive or Pig. + + +
diff --git a/doc/install-guide/bk-openstack-install-guide.xml b/doc/install-guide/bk-openstack-install-guide.xml index 6174fb694b..5224e914e1 100644 --- a/doc/install-guide/bk-openstack-install-guide.xml +++ b/doc/install-guide/bk-openstack-install-guide.xml @@ -219,6 +219,7 @@ + diff --git a/doc/install-guide/ch_sahara.xml b/doc/install-guide/ch_sahara.xml new file mode 100644 index 0000000000..dc45a13422 --- /dev/null +++ b/doc/install-guide/ch_sahara.xml @@ -0,0 +1,19 @@ + + + Add the Data processing service + The Data processing service (sahara) enables users to provide a + scalable data processing stack and associated management interfaces. + This includes provision and operation of data processing clusters as + well as scheduling and operation of data processing jobs. + + + This chapter is a work in progress. It may contain + incorrect information, and will be updated frequently. + + + + diff --git a/doc/install-guide/section_sahara-install.xml b/doc/install-guide/section_sahara-install.xml new file mode 100644 index 0000000000..707b014e86 --- /dev/null +++ b/doc/install-guide/section_sahara-install.xml @@ -0,0 +1,96 @@ + +
+ Install the Data processing service + This procedure installs the Data processing service (sahara) on the + controller node. + To install the Data processing service on the controller: + + + Install the required packages: + # yum install openstack-sahara python-saharaclient + # zypper install openstack-sahara python-saharaclient + + + You need to install the required packages. For now, sahara + doesn't have packages for Ubuntu and Debian. + Documentation will be updated once packages are available. The rest + of this document assumes that you have the sahara service packages + installed on the system. + + + Edit the /etc/sahara/sahara.conf configuration file + + First, edit the connection parameter in + the [database] section. The URL provided here + should point to an empty database. For instance, the connection + string for a MySQL database is: + connection = mysql://sahara:SAHARA_DBPASS@controller/sahara + + Switch to the [keystone_authtoken] + section. The auth_uri parameter should point to + the public Identity API endpoint. identity_uri + should point to the admin Identity API endpoint. For example: + auth_uri = http://controller:5000/v2.0 +identity_uri = http://controller:35357 + + Next, specify admin_user, + admin_password, and + admin_tenant_name. These parameters must specify + a keystone user that has the admin role in the + given tenant. These credentials allow sahara to authenticate and + authorize its users. + + Switch to the [DEFAULT] section. + Proceed to the networking parameters. If you are using Neutron + for networking, set use_neutron=true. + Otherwise, if you are using nova-network, set + this parameter to false. + + That should be enough for the first run. If you want to + increase the logging level for troubleshooting, there are two parameters + in the config: verbose and + debug. If verbose is set to + true, sahara + writes logs of INFO level and above. If + debug is set to + true, sahara writes all the logs, including + the DEBUG ones. 
+ + + + If you use the Data processing service with a MySQL database, + you must configure the maximum allowed packet size in order to store + big job binaries in the sahara internal database. Edit the my.cnf + file and change the max_allowed_packet parameter: + [mysqld] +max_allowed_packet = 256M + and restart the MySQL server. + + Create the database schema: + # sahara-db-manage --config-file /etc/sahara/sahara.conf upgrade head + + You must register the Data processing service with the Identity + service so that other OpenStack services can locate it. Register the + service and specify the endpoint: + $ keystone service-create --name sahara --type data_processing \ + --description "Data processing service" +$ keystone endpoint-create \ + --service-id $(keystone service-list | awk '/ sahara / {print $2}') \ + --publicurl http://controller:8386/v1.1/%\(tenant_id\)s \ + --internalurl http://controller:8386/v1.1/%\(tenant_id\)s \ + --adminurl http://controller:8386/v1.1/%\(tenant_id\)s + + Start the sahara service: + # systemctl start openstack-sahara-all + # service openstack-sahara-all start + + (Optional) Enable the Data processing service to start on boot: + # systemctl enable openstack-sahara-all + # chkconfig openstack-sahara-all on + + +
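For review purposes, the settings described in the configuration step can be gathered into one fragment. The following is a minimal sketch, not the packaged file: it writes to a scratch file rather than /etc/sahara/sahara.conf. SAHARA_DBPASS and SAHARA_PASS are placeholders, the sahara user name and the service tenant are assumptions chosen for illustration, and placing verbose and debug under [DEFAULT] is likewise assumed.

```shell
#!/bin/sh
# Sketch only: collect the options from the install procedure in a scratch
# file so they can be reviewed together before editing the real config.
# SAHARA_DBPASS / SAHARA_PASS are placeholders; the "sahara" user and the
# "service" tenant are assumptions, not values stated in the procedure.
conf="$(mktemp)"

cat > "$conf" <<'EOF'
[DEFAULT]
# true for Neutron, false for nova-network
use_neutron = true
# Optional: raise log verbosity while troubleshooting
verbose = true
debug = false

[database]
connection = mysql://sahara:SAHARA_DBPASS@controller/sahara

[keystone_authtoken]
auth_uri = http://controller:5000/v2.0
identity_uri = http://controller:35357
admin_user = sahara
admin_password = SAHARA_PASS
admin_tenant_name = service
EOF

# Print the option names that were set, one per line.
grep -E '^[a-z_]+ *=' "$conf" | cut -d= -f1 | tr -d ' '
```

The scratch file path is held in the conf variable; remove it with rm -f "$conf" once the values have been copied into the real configuration file.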
diff --git a/doc/install-guide/section_sahara-verify.xml b/doc/install-guide/section_sahara-verify.xml new file mode 100644 index 0000000000..03440c4392 --- /dev/null +++ b/doc/install-guide/section_sahara-verify.xml @@ -0,0 +1,26 @@ + +
+ Verify the Data processing service installation + To verify that the Data processing service (sahara) is installed and + configured correctly, request the cluster list using the sahara + client. + + + Source the demo tenant credentials: + $ source demo-openrc.sh + + + Retrieve the sahara cluster list: + $ sahara cluster-list + You should see output similar to this: + +------+----+--------+------------+ +| name | id | status | node_count | ++------+----+--------+------------+ ++------+----+--------+------------+ + + +
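The verification above relies on reading the table by eye. As a rough sketch, the same check can be scripted by counting the data rows in the table. The count_clusters helper below is hypothetical (it is not part of the sahara client) and assumes the table layout shown in the expected output, with exactly one header row.

```shell
#!/bin/sh
# count_clusters: hypothetical helper, not part of the sahara client.
# Reads a cluster-list style table on stdin and prints the number of data
# rows, that is, lines starting with "|" minus the single header row.
count_clusters() {
    awk '/^[|]/ { rows++ } END { print (rows > 0 ? rows - 1 : 0) }'
}

# Sample input copied from the expected output of a fresh installation.
sample='+------+----+--------+------------+
| name | id | status | node_count |
+------+----+--------+------------+
+------+----+--------+------------+'

printf '%s\n' "$sample" | count_clusters
```

On a system with the client installed and credentials sourced, the helper would be fed from the real command instead of the sample text, for example sahara cluster-list | count_clusters.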