
Overview of highly available controllers
OpenStack is a set of services exposed to end users as HTTP(S) APIs. Additionally, for its own internal use, OpenStack requires an SQL database server and an AMQP broker. The physical servers where all of these components run are called controllers. This modular architecture allows you to duplicate all of the components and run them on different controllers. By making all of the components redundant, it is possible to make OpenStack highly available.
In general, we can divide the OpenStack components into three categories:
- OpenStack APIs: these are stateless HTTP(S) services written in Python. They are easy to duplicate and, for the most part, easy to load balance (a minimal load-balancer sketch follows this list).
- SQL relational database server: provides stateful services consumed by the other components. Supported databases are MySQL, MariaDB, and PostgreSQL. Making an SQL database redundant is complex.
- Advanced Message Queuing Protocol (AMQP): provides the internal stateful messaging service used by the OpenStack components.
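To make the load-balancing point concrete, the following is a minimal, illustrative HAProxy sketch that spreads Identity (keystone) API traffic across three controllers. The virtual IP 10.0.0.100, the controller addresses, and the frontend/backend names are placeholder assumptions for this example, not values prescribed by this guide:

    # Listen on the virtual IP and balance across the controllers
    frontend keystone_api
        bind 10.0.0.100:5000
        default_backend keystone_back

    backend keystone_back
        balance roundrobin
        # HTTP health check against the Identity API version endpoint
        option httpchk GET /v3
        server controller1 10.0.0.11:5000 check inter 2000 rise 2 fall 5
        server controller2 10.0.0.12:5000 check inter 2000 rise 2 fall 5
        server controller3 10.0.0.13:5000 check inter 2000 rise 2 fall 5

Because the API services are stateless, any controller can answer any request; the health checks simply remove a failed backend from rotation.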
Network components
[TODO Need discussion of network hardware, bonding interfaces, intelligent Layer 2 switches, routers and Layer 3 switches.]
The configuration described here uses static routing, without Virtual Router Redundancy Protocol (VRRP) or similar techniques.
[TODO Need description of VIP failover inside Linux namespaces and expected SLA.]
See networking-ha for more information about configuring Networking for high availability.
Common deployment architectures
We recommend two primary architectures for making OpenStack highly available.
The architectures differ in the sets of services managed by the cluster.
Both use a cluster manager, such as Pacemaker or Veritas, to orchestrate the actions of the various services across a set of machines. Because we are focused on FOSS, we refer to these as Pacemaker architectures.
Traditionally, Pacemaker has been positioned as an all-encompassing solution. However, as OpenStack services have matured, they are increasingly able to run in an active/active configuration and gracefully tolerate the disappearance of the APIs on which they depend.
With this in mind, some vendors are restricting Pacemaker's use to services that must operate in an active/passive mode (such as cinder-volume), those with multiple states (for example, Galera), and those with complex bootstrapping procedures (such as RabbitMQ).
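As a rough illustration of this division of labour, the following sketch shows how a virtual IP and an active/passive cinder-volume resource could be defined with the pcs command-line tool. The IP address, netmask, and the systemd unit name openstack-cinder-volume are assumptions for this example; the exact names and options depend on your distribution and Pacemaker version:

    # Virtual IP that follows the active controller
    pcs resource create vip ocf:heartbeat:IPaddr2 \
        ip=10.0.0.100 cidr_netmask=24 op monitor interval=30s

    # Run cinder-volume on only one node at a time (active/passive)
    pcs resource create cinder-volume systemd:openstack-cinder-volume \
        op monitor interval=30s

    # Keep cinder-volume on the node that currently holds the VIP
    pcs constraint colocation add cinder-volume with vip

If the active controller fails, Pacemaker moves the VIP and restarts cinder-volume on a surviving node.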
The majority of services, needing no real orchestration, are handled by systemd on each node. This approach avoids the need to coordinate service upgrades or location changes with the cluster, and has the added advantage of scaling more easily beyond Corosync's 16-node limit. However, it generally requires an enterprise monitoring solution, such as Nagios or Sensu, for those who want centralized failure reporting.
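For the services left to systemd, each controller simply enables and starts them locally; the load balancer and the monitoring system then decide whether a given instance is healthy. A minimal sketch, assuming Red Hat style unit names (service names differ between distributions):

    # Run the stateless API services on every controller
    systemctl enable --now openstack-nova-api.service
    systemctl enable --now openstack-glance-api.service

    # Verify that the local instance is running
    systemctl is-active openstack-nova-api.service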
intro-ha-arch-pacemaker.rst