Files

venkatamahesh 9eba19af5e Glossary terms mark ups are added for few terms

In this patch glossary term mark-ups are added for few
terms which are helpful for readers

Change-Id: I239f361c6908b366e3b3d0760ab68f627f7b1cea

2015-11-25 10:08:38 +05:30

3.6 KiB

Raw Permalink Blame History

Deployment

The Data processing service is deployed, like many other OpenStack services, as an application running on a host connected to the stack. As of the Kilo release, it has the ability to be deployed in a distributed manner with several redundant controllers. Like other services, it also requires a database to store information about its resources. See ../databases. It is important to note that the Data processing service will need to manage several Identity service trusts, communicate directly with the Orchestration and Networking services, and potentially create users in a proxy domain. For these reasons the controller will need access to the control plane and as such we recommend installing it alongside other service controllers.

Data processing interacts directly with several openstack services:

Compute
Identity
Networking
Object Storage
Orchestration
Block Storage (optional)

We recommend documenting all the data flows and bridging points between these services and the data processing controller. See ../documentation.

The Object Storage service is used by the Data processing service to store job binaries and data sources. Users wishing to have access to the full Data processing service functionality will need an object store in the projects they are using.

The Networking service plays an important role in the provisioning of clusters. Prior to provisioning, the user is expected to provide one or more networks for the cluster instances. The action of associating networks is similar to the process of assigning networks when launching instances through the dashboard. These networks are used by the controller for administrative access to the instances and frameworks of its clusters.

Also of note is the Identity service. Users of the Data processing service will need appropriate roles in their projects to allow the provisioning of instances for their clusters. Installations that use the proxy domain configuration require special consideration. See data-processing-proxy-domains. Specifically, the Data processing service will need the ability to create users within the proxy domain.

Controller network access to clusters

One of the primary tasks of the data processing controller is to communicate with the instances it spawns. These instances are provisioned and then configured depending on the framework being used. The communication between the controller and the instances uses secure shell (SSH) and HTTP protocols.

When provisioning clusters each instance will be given an IP address in the networks provided by the user. The first network is often referred to as the data processing management network and instances can use the fixed IP address assigned by the Networking service for this network. The controller can also be configured to use floating IP addresses for the instances in addition to their fixed address. When communicating with the instances the controller will prefer the floating address if enabled.

For situations where the fixed and floating IP addresses do not provide the functionality required the controller can provide access through two alternate methods: custom network topologies and indirect access. The custom network topologies feature allows the controller to access the instances through a supplied shell command in the configuration file. Indirect access is used to specify instances that can be used as proxy gateways by the user during cluster provisioning. These options are discussed with examples of usage in configuration-and-hardening.

3.6 KiB Raw Permalink Blame History

Deployment

Controller network access to clusters

3.6 KiB

Raw Permalink Blame History