commit f6451c96b1 (parent 79a5ba2125)

Edits to the full ha-guide document

Cleaning up the ha-guide for minor errors and restructure of content.
Some bugs have been filed to draw attention to the TODOs inline.

Change-Id: Id6cdff494db905826ae87be3e38d587e9829d6da
@@ -1,12 +1,9 @@

==============================================
Configuring high availability on compute nodes
==============================================

The `Newton Installation Tutorials and Guides
<http://docs.openstack.org/project-install-guide/newton/>`_
provide instructions for installing multiple compute nodes.
To make the compute nodes highly available, you must configure the
environment to include multiple instances of the API and other services.
@@ -1,4 +1,3 @@

==================================================
Configuring the compute node for high availability
==================================================
@@ -8,228 +8,222 @@ under very high loads while needing persistence or Layer 7 processing.

It realistically supports tens of thousands of connections with recent
hardware.

Each instance of HAProxy configures its front end to accept connections only
to the virtual IP (VIP) address. The HAProxy back end (termination
point) is a list of all the IP addresses of instances for load balancing.

This makes the instances of HAProxy act independently and fail over
transparently together with the network endpoints (VIP addresses),
and they therefore share the same SLA.

.. note::

   Ensure your HAProxy installation is not a single point of failure;
   it is advisable to have multiple HAProxy instances running, where the
   number of these instances is a small odd number like 3 or 5.

   You can also ensure the availability by other means, using Keepalived
   or Pacemaker.

Alternatively, you can use a commercial load balancer, which is hardware
or software. We recommend a hardware load balancer as it generally has
good performance.

For detailed instructions about installing HAProxy on your nodes,
see the HAProxy `official documentation <http://www.haproxy.org/#docs>`_.

The common practice is to locate an HAProxy instance on each OpenStack
controller in the environment.

Configuring HAProxy
~~~~~~~~~~~~~~~~~~~

#. Restart the HAProxy service.

#. Locate your HAProxy instance on each OpenStack controller in your
   environment. The following is an example ``/etc/haproxy/haproxy.cfg``
   configuration file. Configure your instance using the following
   configuration file; you will need a copy of it on each
   controller node.

   .. note::

      To implement any changes made to this file, you must restart the
      HAProxy service.

   .. code-block:: none

      global
        chroot /var/lib/haproxy
        daemon
        group haproxy
        maxconn 4000
        pidfile /var/run/haproxy.pid
        user haproxy

      defaults
        log global
        maxconn 4000
        option redispatch
        retries 3
        timeout http-request 10s
        timeout queue 1m
        timeout connect 10s
        timeout client 1m
        timeout server 1m
        timeout check 10s

      listen dashboard_cluster
        bind <Virtual IP>:443
        balance source
        option tcpka
        option httpchk
        option tcplog
        server controller1 10.0.0.12:443 check inter 2000 rise 2 fall 5
        server controller2 10.0.0.13:443 check inter 2000 rise 2 fall 5
        server controller3 10.0.0.14:443 check inter 2000 rise 2 fall 5

      listen galera_cluster
        bind <Virtual IP>:3306
        balance source
        option mysql-check
        server controller1 10.0.0.12:3306 check port 9200 inter 2000 rise 2 fall 5
        server controller2 10.0.0.13:3306 backup check port 9200 inter 2000 rise 2 fall 5
        server controller3 10.0.0.14:3306 backup check port 9200 inter 2000 rise 2 fall 5

      listen glance_api_cluster
        bind <Virtual IP>:9292
        balance source
        option tcpka
        option httpchk
        option tcplog
        server controller1 10.0.0.12:9292 check inter 2000 rise 2 fall 5
        server controller2 10.0.0.13:9292 check inter 2000 rise 2 fall 5
        server controller3 10.0.0.14:9292 check inter 2000 rise 2 fall 5

      listen glance_registry_cluster
        bind <Virtual IP>:9191
        balance source
        option tcpka
        option tcplog
        server controller1 10.0.0.12:9191 check inter 2000 rise 2 fall 5
        server controller2 10.0.0.13:9191 check inter 2000 rise 2 fall 5
        server controller3 10.0.0.14:9191 check inter 2000 rise 2 fall 5

      listen keystone_admin_cluster
        bind <Virtual IP>:35357
        balance source
        option tcpka
        option httpchk
        option tcplog
        server controller1 10.0.0.12:35357 check inter 2000 rise 2 fall 5
        server controller2 10.0.0.13:35357 check inter 2000 rise 2 fall 5
        server controller3 10.0.0.14:35357 check inter 2000 rise 2 fall 5

      listen keystone_public_internal_cluster
        bind <Virtual IP>:5000
        balance source
        option tcpka
        option httpchk
        option tcplog
        server controller1 10.0.0.12:5000 check inter 2000 rise 2 fall 5
        server controller2 10.0.0.13:5000 check inter 2000 rise 2 fall 5
        server controller3 10.0.0.14:5000 check inter 2000 rise 2 fall 5

      listen nova_ec2_api_cluster
        bind <Virtual IP>:8773
        balance source
        option tcpka
        option tcplog
        server controller1 10.0.0.12:8773 check inter 2000 rise 2 fall 5
        server controller2 10.0.0.13:8773 check inter 2000 rise 2 fall 5
        server controller3 10.0.0.14:8773 check inter 2000 rise 2 fall 5

      listen nova_compute_api_cluster
        bind <Virtual IP>:8774
        balance source
        option tcpka
        option httpchk
        option tcplog
        server controller1 10.0.0.12:8774 check inter 2000 rise 2 fall 5
        server controller2 10.0.0.13:8774 check inter 2000 rise 2 fall 5
        server controller3 10.0.0.14:8774 check inter 2000 rise 2 fall 5

      listen nova_metadata_api_cluster
        bind <Virtual IP>:8775
        balance source
        option tcpka
        option tcplog
        server controller1 10.0.0.12:8775 check inter 2000 rise 2 fall 5
        server controller2 10.0.0.13:8775 check inter 2000 rise 2 fall 5
        server controller3 10.0.0.14:8775 check inter 2000 rise 2 fall 5

      listen cinder_api_cluster
        bind <Virtual IP>:8776
        balance source
        option tcpka
        option httpchk
        option tcplog
        server controller1 10.0.0.12:8776 check inter 2000 rise 2 fall 5
        server controller2 10.0.0.13:8776 check inter 2000 rise 2 fall 5
        server controller3 10.0.0.14:8776 check inter 2000 rise 2 fall 5

      listen ceilometer_api_cluster
        bind <Virtual IP>:8777
        balance source
        option tcpka
        option tcplog
        server controller1 10.0.0.12:8777 check inter 2000 rise 2 fall 5
        server controller2 10.0.0.13:8777 check inter 2000 rise 2 fall 5
        server controller3 10.0.0.14:8777 check inter 2000 rise 2 fall 5

      listen nova_vncproxy_cluster
        bind <Virtual IP>:6080
        balance source
        option tcpka
        option tcplog
        server controller1 10.0.0.12:6080 check inter 2000 rise 2 fall 5
        server controller2 10.0.0.13:6080 check inter 2000 rise 2 fall 5
        server controller3 10.0.0.14:6080 check inter 2000 rise 2 fall 5

      listen neutron_api_cluster
        bind <Virtual IP>:9696
        balance source
        option tcpka
        option httpchk
        option tcplog
        server controller1 10.0.0.12:9696 check inter 2000 rise 2 fall 5
        server controller2 10.0.0.13:9696 check inter 2000 rise 2 fall 5
        server controller3 10.0.0.14:9696 check inter 2000 rise 2 fall 5

      listen swift_proxy_cluster
        bind <Virtual IP>:8080
        balance source
        option tcplog
        option tcpka
        server controller1 10.0.0.12:8080 check inter 2000 rise 2 fall 5
        server controller2 10.0.0.13:8080 check inter 2000 rise 2 fall 5
        server controller3 10.0.0.14:8080 check inter 2000 rise 2 fall 5
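
   The ``check port 9200`` directives in the ``galera_cluster`` stanza assume
   a Galera health-check service (for example, a ``clustercheck`` script run
   from ``xinetd``) listening on port 9200 of each database node. As a hedged
   sanity check only, you can query that port directly; an HTTP ``200``
   response means the node reports itself as synced (the exact response body
   depends on the health-check script you deploy):

   .. code-block:: console

      # Query the assumed health-check service on the first controller
      $ curl -i http://10.0.0.12:9200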

   .. note::

      The Galera cluster configuration directive ``backup`` indicates
      that two of the three controllers are standby nodes.
      This ensures that only one node services write requests
      because OpenStack support for multi-node writes is not yet
      production-ready.

   .. note::

      The Telemetry API service configuration does not have the
      ``option httpchk`` directive as it cannot process this check properly.

   [TODO: we need more commentary about the contents and format of this file]

   .. TODO: explain why the Telemetry API is so special
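
   Whenever you edit :file:`/etc/haproxy/haproxy.cfg`, restart the HAProxy
   service for the change to take effect. A minimal sketch, assuming a
   systemd-based distribution; the syntax-check flag is part of the stock
   ``haproxy`` binary:

   .. code-block:: console

      # Validate the edited configuration file before restarting
      $ haproxy -c -f /etc/haproxy/haproxy.cfg
      # Restart the service on this controller
      # systemctl restart haproxy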

#. Add HAProxy to the cluster and ensure the VIPs can only run on machines
   where HAProxy is active:

   ``pcs``

   .. code-block:: console

      $ pcs resource create lb-haproxy systemd:haproxy --clone
      $ pcs constraint order start vip then lb-haproxy-clone kind=Optional
      $ pcs constraint colocation add lb-haproxy-clone with vip

   ``crmsh``

   .. code-block:: console

      $ crm cib new conf-haproxy
      $ crm configure primitive haproxy lsb:haproxy op monitor interval="1s"
      $ crm configure clone haproxy-clone haproxy
      $ crm configure colocation vip-with-haproxy inf: vip haproxy-clone
      $ crm configure order haproxy-after-vip mandatory: vip haproxy-clone
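
   Once the constraints are in place, it is worth confirming that the HAProxy
   clone and the VIP land on the same node. A hedged verification sketch using
   the standard status tools (use whichever matches how you manage the
   cluster):

   .. code-block:: console

      # pcs-managed clusters
      $ pcs status
      # crmsh-managed clusters: one-shot status output
      $ crm_mon -1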
@@ -2,13 +2,8 @@

Highly available Identity API
=============================

Making the OpenStack Identity service highly available
in active/passive mode involves:

- :ref:`identity-pacemaker`
- :ref:`identity-config-identity`
@@ -16,17 +11,28 @@ in active / passive mode involves:

.. _identity-pacemaker:

Prerequisites
~~~~~~~~~~~~~

Before beginning, ensure you have read the
`OpenStack Identity service getting started documentation
<http://docs.openstack.org/admin-guide/common/get-started-identity.html>`_.

Add OpenStack Identity resource to Pacemaker
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The following section(s) detail how to add the OpenStack Identity
resource to Pacemaker on SUSE and Red Hat.

SUSE
----

SUSE Enterprise Linux and SUSE-based distributions, such as openSUSE,
use a set of OCF agents for controlling OpenStack services.

#. Run the following commands to download the OpenStack Identity resource
   to Pacemaker:

   .. code-block:: console
@@ -36,40 +42,49 @@ use a set of OCF agents for controlling OpenStack services.

      # wget https://git.openstack.org/cgit/openstack/openstack-resource-agents/plain/ocf/keystone
      # chmod a+rx *

#. Add the Pacemaker configuration for the OpenStack Identity resource
   by running the following command to connect to the Pacemaker cluster:

   .. code-block:: console

      # crm configure

#. Add the following cluster resources:

   .. code-block:: console

      clone p_keystone ocf:openstack:keystone \
      params config="/etc/keystone/keystone.conf" os_password="secretsecret" os_username="admin" os_tenant_name="admin" os_auth_url="http://10.0.0.11:5000/v2.0/" \
      op monitor interval="30s" timeout="30s"

   This configuration creates ``p_keystone``,
   a resource for managing the OpenStack Identity service.

#. Commit your configuration changes from the :command:`crm configure` menu
   with the following command:

   .. code-block:: console

      # commit

   :command:`crm configure` supports batch input. You can copy and paste the
   above lines into your live Pacemaker configuration, and then make changes
   as required.

   For example, you may enter ``edit p_ip_keystone`` from the
   :command:`crm configure` menu and edit the resource to match your preferred
   virtual IP address.

   Pacemaker now starts the OpenStack Identity service and its dependent
   resources on all of your nodes.

Red Hat
-------

For Red Hat Enterprise Linux and Red Hat-based Linux distributions,
the following process uses systemd unit files.

.. code-block:: console
@@ -116,29 +131,24 @@ Configure OpenStack Identity service

Configure OpenStack services to use the highly available OpenStack Identity
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Your OpenStack services now point their OpenStack Identity configuration
to the highly available virtual cluster IP address rather than to the
physical IP address of an OpenStack Identity server, as you would do in a
non-HA environment.

#. For OpenStack Compute (if your OpenStack Identity service IP address
   is 10.0.0.11), use the following configuration in the :file:`api-paste.ini`
   file:

   .. code-block:: ini

      auth_host = 10.0.0.11

#. Create the OpenStack Identity Endpoint with this IP address.

   .. note::

      If you are using both private and public IP addresses,
      create two virtual IP addresses and define the endpoint. For
      example:

      .. code-block:: console

@@ -150,12 +160,9 @@ in a non-HA environment.

         $service-type internal http://10.0.0.11:5000/v2.0
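
   The endpoint fragment above comes from the guide itself. As a hedged
   illustration only, equivalent endpoints can be registered with the
   ``openstack`` client (Identity v3); ``RegionOne`` and ``PUBLIC_VIP`` are
   placeholders, not values mandated by the guide:

   .. code-block:: console

      $ openstack endpoint create --region RegionOne identity public http://PUBLIC_VIP:5000/v2.0
      $ openstack endpoint create --region RegionOne identity internal http://10.0.0.11:5000/v2.0
      $ openstack endpoint create --region RegionOne identity admin http://10.0.0.11:35357/v2.0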
#. If you are using the horizon Dashboard, edit the :file:`local_settings.py`
   file to include the following:

   .. code-block:: ini

      OPENSTACK_HOST = 10.0.0.11
@@ -1,6 +1,6 @@

=========
Memcached
=========

Memcached is a general-purpose distributed memory caching system. It
is used to speed up dynamic database-driven websites by caching data

@@ -10,12 +10,12 @@ source must be read.

Memcached is a memory cache daemon that can be used by most OpenStack
services to store ephemeral data, such as tokens.

Access to Memcached is not handled by HAProxy because replicated
access is currently in an experimental state. Instead, OpenStack
services must be supplied with the full list of hosts running
Memcached.

The Memcached client implements hashing to balance objects among the
instances. Failure of an instance impacts only a percentage of the
objects and the client automatically removes it from the list of
instances. The SLA is several minutes.
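
Because HAProxy does not front Memcached, each OpenStack service is pointed at
the full list of Memcached hosts in its own configuration file. A hedged
sketch of what that can look like for a service using oslo.cache; section and
option names vary slightly between services and releases, so check the
service's configuration reference:

.. code-block:: ini

   [cache]
   # Hash objects across every controller running memcached
   backend = oslo_cache.memcache_pool
   enabled = true
   memcache_servers = controller1:11211,controller2:11211,controller3:11211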

@@ -2,23 +2,24 @@

Pacemaker cluster stack
=======================

`Pacemaker <http://clusterlabs.org/>`_ cluster stack is a state-of-the-art
high availability and load balancing stack for the Linux platform.
Pacemaker is used to make OpenStack infrastructure highly available.

.. note::

   It is storage and application-agnostic, and in no way specific to OpenStack.

Pacemaker relies on the
`Corosync <http://corosync.github.io/corosync/>`_ messaging layer
for reliable cluster communications. Corosync implements the Totem single-ring
ordering and membership protocol. It also provides UDP and InfiniBand based
messaging, quorum, and cluster membership to Pacemaker.

Pacemaker does not inherently understand the applications it manages.
Instead, it relies on resource agents (RAs), which are scripts that encapsulate
the knowledge of how to start, stop, and check the health of each application
managed by the cluster.

These agents must conform to one of the `OCF <https://github.com/ClusterLabs/
OCF-spec/blob/master/ra/resource-agent-api.md>`_,
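
To make the RA contract concrete, here is a minimal, hypothetical sketch of
the shell interface a resource agent exposes: the cluster invokes the script
with an action argument and interprets the exit code. This illustrates the
convention only; it is not one of the real OpenStack or ClusterLabs agents.

.. code-block:: bash

   #!/bin/sh
   # Minimal sketch of the OCF resource agent calling convention.
   # "my-service" is a placeholder; real agents also implement
   # meta-data and validate-all, and read OCF_RESKEY_* parameters.
   SERVICE="my-service"

   case "$1" in
       start)
           systemctl start "$SERVICE"        # exit 0 = OCF_SUCCESS
           ;;
       stop)
           systemctl stop "$SERVICE"
           ;;
       monitor)
           # 0 = running (OCF_SUCCESS), 7 = cleanly stopped (OCF_NOT_RUNNING)
           systemctl is-active --quiet "$SERVICE" && exit 0 || exit 7
           ;;
       *)
           exit 3                            # OCF_ERR_UNIMPLEMENTED
           ;;
   esac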

@@ -44,57 +45,61 @@ The steps to implement the Pacemaker cluster stack are:

Install packages
~~~~~~~~~~~~~~~~

On any host that is meant to be part of a Pacemaker cluster, establish cluster
communications through the Corosync messaging layer.
This involves installing the following packages (and their dependencies, which
your package manager usually installs automatically); example installation
commands are shown after the list:

- `pacemaker`

- `pcs` (CentOS or RHEL) or crmsh

- `corosync`

- `fence-agents` (CentOS or RHEL) or cluster-glue

- `resource-agents`

- `libqb0`
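
A hedged example of installing these packages; the exact package names and the
package manager depend on your distribution and release (for example,
``fence-agents`` may be packaged as ``fence-agents-all`` on some RHEL
releases):

.. code-block:: console

   # Ubuntu or Debian
   # apt-get install pacemaker crmsh corosync cluster-glue resource-agents libqb0
   # CentOS or RHEL
   # yum install pacemaker pcs corosync fence-agents resource-agents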

.. _pacemaker-corosync-setup:

Set up the cluster with pcs
~~~~~~~~~~~~~~~~~~~~~~~~~~~

#. Make sure `pcs` is running and configured to start at boot time:

   .. code-block:: console

      $ systemctl enable pcsd
      $ systemctl start pcsd

#. Set a password for the hacluster user on each host:

   .. code-block:: console

      $ echo my-secret-password-no-dont-use-this-one \
        | passwd --stdin hacluster

   .. note::

      Since the cluster is a single administrative domain, it is
      acceptable to use the same password on all nodes.

#. Use that password to authenticate to the nodes that will
   make up the cluster:

   .. code-block:: console

      $ pcs cluster auth controller1 controller2 controller3 \
        -u hacluster -p my-secret-password-no-dont-use-this-one --force

   .. note::

      The :option:`-p` option is used to give the password on command
      line and makes it easier to script.

#. Create and name the cluster, and then start it:

   .. code-block:: console

@@ -115,12 +120,12 @@ After installing the Corosync package, you must create

the :file:`/etc/corosync/corosync.conf` configuration file.

.. note::

   For Ubuntu, you should also enable the Corosync service in the
   ``/etc/default/corosync`` configuration file.

Corosync can be configured to work with either multicast or unicast IP
addresses or to use the votequorum library.

- :ref:`corosync-multicast`
- :ref:`corosync-unicast`

@@ -132,11 +137,10 @@ Set up Corosync with multicast
------------------------------

Most distributions ship an example configuration file
(:file:`corosync.conf.example`) as part of the documentation bundled with
the Corosync package. An example Corosync configuration file is shown below:

**Example Corosync configuration file for multicast (``corosync.conf``)**

.. code-block:: ini

@@ -215,26 +219,26 @@ Note the following:

  When this timeout expires, the token is declared lost,
  and after ``token_retransmits_before_loss_const`` lost tokens,
  the non-responding processor (cluster node) is declared dead.
  ``token × token_retransmits_before_loss_const``
  is the maximum time a node is allowed to not respond to cluster messages
  before being considered dead.
  The default for token is 1000 milliseconds (1 second),
  with 4 allowed retransmits.
  These defaults are intended to minimize failover times,
  but can cause frequent false alarms and unintended failovers
  in case of short network interruptions. The values used here are safer,
  albeit with slightly extended failover times.

- With ``secauth`` enabled,
  Corosync nodes mutually authenticate using a 128-byte shared secret
  stored in the :file:`/etc/corosync/authkey` file.
  This can be generated with the :command:`corosync-keygen` utility
  (an example of generating and distributing the key follows this list).
  Cluster communications are encrypted when using ``secauth``.

- In Corosync configurations using redundant networking
  (with more than one interface), you must select a Redundant
  Ring Protocol (RRP) mode other than none. We recommend ``active`` as
  the RRP mode.

  Note the following about the recommended interface configuration:

@@ -245,61 +249,57 @@ Note the following:

  The example uses two network addresses of /24 IPv4 subnets.

- Multicast groups (``mcastaddr``) must not be reused
  across cluster boundaries. No two distinct clusters
  should ever use the same multicast group.
  Be sure to select multicast addresses compliant with
  `RFC 2365, "Administratively Scoped IP Multicast"
  <http://www.ietf.org/rfc/rfc2365.txt>`_.

- For firewall configurations, Corosync communicates over UDP only,
  and uses ``mcastport`` (for receives) and ``mcastport - 1`` (for sends).

- The service declaration for the Pacemaker service
  may be placed in the :file:`corosync.conf` file directly
  or in its own separate file, :file:`/etc/corosync/service.d/pacemaker`.

  .. note::

     If you are using Corosync version 2 on Ubuntu 14.04,
     remove or comment out lines under the service stanza;
     this enables Pacemaker to start up. Another potential
     problem is the boot and shutdown order of Corosync and
     Pacemaker. To force Pacemaker to start after Corosync and
     stop before Corosync, fix the start and kill symlinks manually:

     .. code-block:: console

        # update-rc.d pacemaker start 20 2 3 4 5 . stop 00 0 1 6 .

     The Pacemaker service also requires an additional
     configuration file ``/etc/corosync/uidgid.d/pacemaker``
     to be created with the following content:

     .. code-block:: ini

        uidgid {
          uid: hacluster
          gid: haclient
        }

- Once created, synchronize the :file:`corosync.conf` file
  (and the :file:`authkey` file if the secauth option is enabled)
  across all cluster nodes.
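
A hedged sketch of generating and distributing the ``secauth`` shared secret
mentioned above; host names follow the example controllers used elsewhere in
this guide:

.. code-block:: console

   # Generate /etc/corosync/authkey on one node ...
   # corosync-keygen
   # ... then copy it, readable by root only, to every other cluster node
   # scp -p /etc/corosync/authkey controller2:/etc/corosync/authkey
   # scp -p /etc/corosync/authkey controller3:/etc/corosync/authkey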

.. _corosync-unicast:

Set up Corosync with unicast
----------------------------

For environments that do not support multicast, Corosync should be configured
for unicast. An example fragment of the :file:`corosync.conf` file
for unicast is shown below:

**Corosync configuration file fragment for unicast (``corosync.conf``)**

.. code-block:: ini

@@ -341,45 +341,38 @@ for unicastis shown below:

Note the following:

- If the ``broadcast`` parameter is set to ``yes``, the broadcast address is
  used for communication. If this option is set, the ``mcastaddr`` parameter
  should not be set.

- The ``transport`` directive controls the transport mechanism.
  To avoid the use of multicast entirely, specify the ``udpu`` unicast
  transport parameter. This requires specifying the list of members in the
  ``nodelist`` directive. This potentially makes up the membership before
  deployment. The default is ``udp``. The transport type can also be set to
  ``udpu`` or ``iba``.

- Within the ``nodelist`` directive, it is possible to specify specific
  information about the nodes in the cluster. The directive can contain only
  the node sub-directive, which specifies every node that should be a member
  of the membership, and where non-default options are needed. Every node must
  have at least the ``ring0_addr`` field filled (a sample ``nodelist``
  fragment follows this list).

  .. note::

     For UDPU, every node that should be a member of the membership must be
     specified.

  Possible options are:

  - ``ring{X}_addr`` specifies the IP address of one of the nodes.
    ``{X}`` is the ring number.

  - ``nodeid`` is optional when using IPv4 and required when using IPv6.
    This is a 32-bit value specifying the node identifier delivered to the
    cluster membership service. If this is not specified with IPv4,
    the node ID is determined from the 32-bit IP address of the system to
    which the system is bound with ring identifier of 0. The node identifier
    value of zero is reserved and should not be used.
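
A hedged ``nodelist`` sketch for the three example controllers used in this
guide (addresses and node IDs are illustrative, not prescribed values):

.. code-block:: ini

   nodelist {
     node {
       ring0_addr: 10.0.0.12
       nodeid: 1
     }
     node {
       ring0_addr: 10.0.0.13
       nodeid: 2
     }
     node {
       ring0_addr: 10.0.0.14
       nodeid: 3
     }
   }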

.. _corosync-votequorum:

@@ -387,15 +380,14 @@ Note the following:

Set up Corosync with votequorum library
---------------------------------------

The votequorum library is part of the Corosync project. It provides an
interface to the vote-based quorum service and it must be explicitly enabled
in the Corosync configuration file. The main role of votequorum library is to
avoid split-brain situations, but it also provides a mechanism to:

- Query the quorum status

- List the nodes known to the quorum service

- Receive notifications of quorum state changes

@@ -403,15 +395,13 @@ but it also provides a mechanism to:

- Change the number of expected votes for a cluster to be quorate

- Connect an additional quorum device to allow small clusters to remain
  quorate during node outages

The votequorum library has been created to replace and eliminate ``qdisk``, the
disk-based quorum daemon for CMAN, from advanced cluster configurations.

A sample votequorum service configuration in the :file:`corosync.conf` file is:

.. code-block:: ini

@@ -425,42 +415,33 @@ in the :file:`corosync.conf` file is:

Note the following:

- Specifying ``corosync_votequorum`` enables the votequorum library.
  This is the only required option.

- The cluster is fully operational with ``expected_votes`` set to 7 nodes
  (each node has 1 vote), quorum: 4. If a list of nodes is specified as
  ``nodelist``, the ``expected_votes`` value is ignored.

- When you start up a cluster (all nodes down) and set ``wait_for_all`` to 1,
  the cluster quorum is held until all nodes are online and have joined the
  cluster for the first time. This parameter is new in Corosync 2.0.

- Setting ``last_man_standing`` to 1 enables the Last Man Standing (LMS)
  feature. By default, it is disabled (set to 0).
  If a cluster is on the quorum edge (``expected_votes:`` set to 7;
  ``online nodes:`` set to 4) for longer than the time specified
  for the ``last_man_standing_window`` parameter, the cluster can recalculate
  quorum and continue operating even if the next node will be lost.
  This logic is repeated until the number of online nodes in the cluster
  reaches 2. In order to allow the cluster to step down from 2 members to
  only 1, the ``auto_tie_breaker`` parameter needs to be set.
  We do not recommend this for production environments.

- ``last_man_standing_window`` specifies the time, in milliseconds,
  required to recalculate quorum after one or more hosts
  have been lost from the cluster. To perform a new quorum recalculation,
  the cluster must have quorum for at least the interval
  specified for ``last_man_standing_window``. The default is 10000ms.
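
On Corosync 2 clusters, the current quorum state can be inspected at any time;
a hedged verification sketch (output varies by cluster size and membership):

.. code-block:: console

   # Show quorum status, expected votes, and the current membership
   # corosync-quorumtool -s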

.. _pacemaker-corosync-start:

@@ -468,30 +449,29 @@ Note the following:

Start Corosync
--------------

Corosync is started as a regular system service. Depending on your
distribution, it may ship with an LSB init script, an upstart job, or
a systemd unit file. Either way, the service is usually named ``corosync``:

- Start ``corosync`` with the LSB init script:

  .. code-block:: console

     # /etc/init.d/corosync start

  Alternatively:

  .. code-block:: console

     # service corosync start

- Start ``corosync`` with upstart:

  .. code-block:: console

     # start corosync

- Start ``corosync`` with the systemd unit file:

  .. code-block:: console

@@ -514,8 +494,8 @@ to get a summary of the health of the communication rings:

      id = 10.0.42.100
      status = ring 1 active with no faults

Use the :command:`corosync-objctl` utility to dump the Corosync cluster
member list:

.. code-block:: console

@@ -527,11 +507,8 @@ to dump the Corosync cluster member list:

   runtime.totem.pg.mrp.srp.983895584.join_count=1
   runtime.totem.pg.mrp.srp.983895584.status=joined

You should see a ``status=joined`` entry for each of your constituent
cluster nodes.

[TODO: Should the main example now use corosync-cmapctl and have the note
give the command for Corosync version 1?]

.. note::

@@ -543,38 +520,38 @@ give the command for Corosync version 1?]

Start Pacemaker
---------------

After the ``corosync`` service has been started and you have verified that the
cluster is communicating properly, you can start :command:`pacemakerd`, the
Pacemaker master control process. Choose one from the following four ways to
start it:

#. Start ``pacemaker`` with the LSB init script:

   .. code-block:: console

      # /etc/init.d/pacemaker start

   Alternatively:

   .. code-block:: console

      # service pacemaker start

#. Start ``pacemaker`` with upstart:

   .. code-block:: console

      # start pacemaker

#. Start ``pacemaker`` with the systemd unit file:

   .. code-block:: console

      # systemctl start pacemaker

After the ``pacemaker`` service has started, Pacemaker creates a default empty
cluster configuration with no resources. Use the :command:`crm_mon` utility to
observe the status of ``pacemaker``:

.. code-block:: console

@@ -596,30 +573,29 @@ Use the :command:`crm_mon` utility to observe the status of ``pacemaker``:

Set basic cluster properties
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

After you set up your Pacemaker cluster, set a few basic cluster properties:

- ``crmsh``

  .. code-block:: console

     $ crm configure property pe-warn-series-max="1000" \
         pe-input-series-max="1000" \
         pe-error-series-max="1000" \
         cluster-recheck-interval="5min"

- ``pcs``

  .. code-block:: console

     $ pcs property set pe-warn-series-max=1000 \
         pe-input-series-max=1000 \
         pe-error-series-max=1000 \
         cluster-recheck-interval=5min

Note the following:

- Setting the ``pe-warn-series-max``, ``pe-input-series-max``,
  and ``pe-error-series-max`` parameters to 1000
  instructs Pacemaker to keep a longer history of the inputs processed
  and errors and warnings generated by its Policy Engine.

@@ -631,4 +607,4 @@ Note the following:

  It is usually prudent to reduce this to a shorter interval,
  such as 5 or 3 minutes.

After you make these changes, commit the updated configuration.
@ -2,76 +2,80 @@
|
||||
Highly available Telemetry API
|
||||
==============================
|
||||
|
||||
`Telemetry service
|
||||
<http://docs.openstack.org/admin-guide/common/get-started-telemetry.html>`__
|
||||
provides data collection service and alarming service.
|
||||
The `Telemetry service
|
||||
<http://docs.openstack.org/admin-guide/common/get-started-telemetry.html>`_
|
||||
provides a data collection service and an alarming service.
|
||||
|
||||
Telemetry central agent
|
||||
~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
The Telemetry central agent can be configured to partition its polling
|
||||
workload between multiple agents, enabling high availability.
|
||||
workload between multiple agents. This enables high availability (HA).
|
||||
|
||||
Both the central and the compute agent can run in an HA deployment,
|
||||
which means that multiple instances of these services can run in
|
||||
Both the central and the compute agent can run in an HA deployment.
|
||||
This means that multiple instances of these services can run in
|
||||
parallel with workload partitioning among these running instances.
|
||||
|
||||
The `Tooz <https://pypi.python.org/pypi/tooz>`__ library provides
|
||||
The `Tooz <https://pypi.python.org/pypi/tooz>`_ library provides
|
||||
the coordination within the groups of service instances.
|
||||
It provides an API above several back ends that can be used for building
|
||||
distributed applications.
|
||||
|
||||
Tooz supports
|
||||
`various drivers <http://docs.openstack.org/developer/tooz/drivers.html>`__
|
||||
`various drivers <http://docs.openstack.org/developer/tooz/drivers.html>`_
|
||||
including the following back end solutions:
|
||||
|
||||
* `Zookeeper <http://zookeeper.apache.org/>`__.
|
||||
* `Zookeeper <http://zookeeper.apache.org/>`_:
|
||||
Recommended solution by the Tooz project.
|
||||
|
||||
* `Redis <http://redis.io/>`__.
|
||||
* `Redis <http://redis.io/>`_:
|
||||
Recommended solution by the Tooz project.
|
||||
|
||||
* `Memcached <http://memcached.org/>`__.
|
||||
* `Memcached <http://memcached.org/>`_:
|
||||
Recommended for testing.
|
||||
|
||||
You must configure a supported Tooz driver for the HA deployment of
|
||||
the Telemetry services.
|
||||
|
||||
For information about the required configuration options that have
|
||||
to be set in the :file:`ceilometer.conf` configuration file for both
|
||||
the central and compute agents, see the `coordination section
|
||||
<http://docs.openstack.org/newton/config-reference/telemetry.html>`__
|
||||
For information about the required configuration options
|
||||
to set in the :file:`ceilometer.conf`, see the `coordination section
|
||||
<http://docs.openstack.org/newton/config-reference/telemetry.html>`_
|
||||
in the OpenStack Configuration Reference.
|
||||
|
||||
.. note:: Without the ``backend_url`` option being set only one
|
||||
instance of both the central and compute agent service is able to run
|
||||
and function correctly.
|
||||
.. note::
|
||||
|
||||
Only one instance for the central and compute agent service(s) is able
|
||||
to run and function correctly if the ``backend_url`` option is not set.
|
||||
|
||||
The availability check of the instances is provided by heartbeat messages.
|
||||
When the connection with an instance is lost, the workload will be
|
||||
reassigned within the remained instances in the next polling cycle.
|
||||
reassigned within the remaining instances in the next polling cycle.
|
||||
|
||||
.. note:: Memcached uses a timeout value, which should always be set to
|
||||
.. note::
|
||||
|
||||
Memcached uses a timeout value, which should always be set to
|
||||
a value that is higher than the heartbeat value set for Telemetry.
|
||||
|
||||
For backward compatibility and supporting existing deployments, the central
|
||||
agent configuration also supports using different configuration files for
|
||||
groups of service instances of this type that are running in parallel.
|
||||
agent configuration supports using different configuration files. This is for
|
||||
groups of service instances that are running in parallel.
|
||||
For enabling this configuration, set a value for the
|
||||
``partitioning_group_prefix`` option in the
|
||||
`polling section <http://docs.openstack.org/newton/config-reference/telemetry/telemetry-config-options.html>`__
|
||||
`polling section <http://docs.openstack.org/newton/config-reference/telemetry/telemetry-config-options.html>`_
|
||||
in the OpenStack Configuration Reference.
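For example, one sub-group of central agents might run with a configuration
file containing a sketch like this (the prefix value is illustrative):

.. code-block:: ini

   [polling]
   # Each parallel group of central agents gets its own prefix.
   partitioning_group_prefix = example-group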
|
||||
|
||||
.. warning:: For each sub-group of the central agent pool with the same
|
||||
``partitioning_group_prefix`` a disjoint subset of meters must be polled --
|
||||
otherwise samples may be missing or duplicated. The list of meters to poll
|
||||
.. warning::
|
||||
|
||||
For each sub-group of the central agent pool with the same
|
||||
``partitioning_group_prefix``, a disjoint subset of meters must be polled
|
||||
to avoid samples being missing or duplicated. The list of meters to poll
|
||||
can be set in the :file:`/etc/ceilometer/pipeline.yaml` configuration file.
|
||||
For more information about pipelines see the `Data collection and
|
||||
processing
|
||||
<http://docs.openstack.org/admin-guide/telemetry-data-collection.html#data-collection-and-processing>`__
|
||||
<http://docs.openstack.org/admin-guide/telemetry-data-collection.html#data-collection-and-processing>`_
|
||||
section.
|
||||
|
||||
To enable the compute agent to run multiple instances simultaneously with
|
||||
workload partitioning, the workload_partitioning option has to be set to
|
||||
``True`` under the `compute section <http://docs.openstack.org/newton/config-reference/telemetry.html>`__
|
||||
workload partitioning, the ``workload_partitioning`` option must be set to
|
||||
``True`` under the `compute section <http://docs.openstack.org/newton/config-reference/telemetry.html>`_
|
||||
in the :file:`ceilometer.conf` configuration file.
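A minimal sketch of that setting:

.. code-block:: ini

   [compute]
   workload_partitioning = True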
|
||||
|
@ -1,13 +1,12 @@
|
||||
|
||||
=================
|
||||
Configure the VIP
|
||||
=================
|
||||
|
||||
You must select and assign a virtual IP address (VIP)
|
||||
that can freely float between cluster nodes.
|
||||
You must select and assign a virtual IP address (VIP) that can freely float
|
||||
between cluster nodes.
|
||||
|
||||
This configuration creates ``vip``,
|
||||
a virtual IP address for use by the API node (``10.0.0.11``):
|
||||
This configuration creates ``vip``, a virtual IP address for use by the
|
||||
API node (``10.0.0.11``).
|
||||
|
||||
For ``crmsh``:
|
||||
|
||||
|
@ -2,8 +2,8 @@
|
||||
Configuring the controller for high availability
|
||||
================================================
|
||||
|
||||
The cloud controller runs on the management network
|
||||
and must talk to all other services.
|
||||
The cloud controller runs on the management network and must talk to
|
||||
all other services.
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
|
@ -2,29 +2,26 @@
|
||||
Hardware considerations for high availability
|
||||
=============================================
|
||||
|
||||
.. TODO: Provide a minimal architecture example for HA, expanded on that
|
||||
given in the *Environment* section of
|
||||
http://docs.openstack.org/project-install-guide/newton (depending
|
||||
on the distribution) for easy comparison.
|
||||
When you use high availability, consider the hardware requirements needed
|
||||
for your application.
|
||||
|
||||
Hardware setup
|
||||
~~~~~~~~~~~~~~
|
||||
|
||||
The standard hardware requirements:
|
||||
The following are the standard hardware requirements:
|
||||
|
||||
- Provider networks. See the *Overview -> Networking Option 1: Provider
|
||||
- Provider networks: See the *Overview -> Networking Option 1: Provider
|
||||
networks* section of the
|
||||
`Install Tutorials and Guides <http://docs.openstack.org/project-install-guide/newton>`_
|
||||
depending on your distribution.
|
||||
- Self-service networks. See the *Overview -> Networking Option 2:
|
||||
- Self-service networks: See the *Overview -> Networking Option 2:
|
||||
Self-service networks* section of the
|
||||
`Install Tutorials and Guides <http://docs.openstack.org/project-install-guide/newton>`_
|
||||
depending on your distribution.
|
||||
|
||||
However, OpenStack does not require a significant amount of resources
|
||||
and the following minimum requirements should support
|
||||
a proof-of-concept high availability environment
|
||||
with core services and several instances:
|
||||
OpenStack does not require a significant amount of resources and the following
|
||||
minimum requirements should support a proof-of-concept high availability
|
||||
environment with core services and several instances:
|
||||
|
||||
+-------------------+------------------+----------+-----------+------+
|
||||
| Node type | Processor Cores | Memory | Storage | NIC |
|
||||
@ -39,26 +36,23 @@ nodes is 2 milliseconds. Although the cluster software can be tuned to
|
||||
operate at higher latencies, some vendors insist on this value before
|
||||
agreeing to support the installation.
|
||||
|
||||
The `ping` command can be used to find the latency between two
|
||||
servers.
|
||||
You can use the `ping` command to find the latency between two servers.
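For example (the host name is a placeholder):

.. code-block:: console

   # ping -c 4 controller2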
|
||||
|
||||
Virtualized hardware
|
||||
~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
For demonstrations and studying,
|
||||
you can set up a test environment on virtual machines (VMs).
|
||||
This has the following benefits:
|
||||
For demonstrations and studying, you can set up a test environment on virtual
|
||||
machines (VMs). This has the following benefits:
|
||||
|
||||
- One physical server can support multiple nodes,
|
||||
each of which supports almost any number of network interfaces.
|
||||
|
||||
- Ability to take periodic "snap shots" throughout the installation process
|
||||
and "roll back" to a working configuration in the event of a problem.
|
||||
- You can take periodic snapshots throughout the installation process
|
||||
and roll back to a working configuration in the event of a problem.
|
||||
|
||||
However, running an OpenStack environment on VMs
|
||||
degrades the performance of your instances,
|
||||
particularly if your hypervisor and/or processor lacks support
|
||||
for hardware acceleration of nested VMs.
|
||||
However, running an OpenStack environment on VMs degrades the performance of
|
||||
your instances, particularly if your hypervisor or processor lacks
|
||||
support for hardware acceleration of nested VMs.
|
||||
|
||||
.. note::
|
||||
|
||||
|
@ -1,40 +1,32 @@
|
||||
=================
|
||||
Install memcached
|
||||
=================
|
||||
====================
|
||||
Installing Memcached
|
||||
====================
|
||||
|
||||
[TODO: Verify that Oslo supports hash synchronization;
|
||||
if so, this should not take more than load balancing.]
|
||||
|
||||
[TODO: This hands off to two different docs for install information.
|
||||
We should choose one or explain the specific purpose of each.]
|
||||
|
||||
Most OpenStack services can use memcached
|
||||
to store ephemeral data such as tokens.
|
||||
Although memcached does not support
|
||||
typical forms of redundancy such as clustering,
|
||||
OpenStack services can use almost any number of instances
|
||||
Most OpenStack services can use Memcached to store ephemeral data such as
|
||||
tokens. Although Memcached does not support typical forms of redundancy such
|
||||
as clustering, OpenStack services can use almost any number of instances
|
||||
by configuring multiple hostnames or IP addresses.
|
||||
The memcached client implements hashing
|
||||
to balance objects among the instances.
|
||||
Failure of an instance only impacts a percentage of the objects
|
||||
|
||||
The Memcached client implements hashing to balance objects among the instances.
|
||||
Failure of an instance only impacts a percentage of the objects,
|
||||
and the client automatically removes it from the list of instances.
|
||||
|
||||
To install and configure memcached, read the
|
||||
`official documentation <https://github.com/memcached/memcached/wiki#getting-started>`_.
|
||||
To install and configure Memcached, read the
|
||||
`official documentation <https://github.com/memcached/memcached/wiki#getting-started>`_.
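As an example, on Ubuntu or Debian based systems the installation might look
like the following; package names vary by distribution and release:

.. code-block:: console

   # apt-get install memcached python-memcache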
|
||||
|
||||
Memory caching is managed by `oslo.cache
|
||||
<http://specs.openstack.org/openstack/oslo-specs/specs/kilo/oslo-cache-using-dogpile.html>`_
|
||||
so the way to use multiple memcached servers is the same for all projects.
|
||||
|
||||
Example configuration with three hosts:
|
||||
<http://specs.openstack.org/openstack/oslo-specs/specs/kilo/oslo-cache-using-dogpile.html>`_.
|
||||
This ensures consistency across all projects when using multiple Memcached
|
||||
servers. The following is an example configuration with three hosts:
|
||||
|
||||
.. code-block:: ini
|
||||
|
||||
memcached_servers = controller1:11211,controller2:11211,controller3:11211
|
||||
memcached_servers = controller1:11211,controller2:11211,controller3:11211
|
||||
|
||||
By default, ``controller1`` handles the caching service.
|
||||
If the host goes down, ``controller2`` or ``controller3`` does the job.
|
||||
For more information about memcached installation, see the
|
||||
By default, ``controller1`` handles the caching service. If the host goes down,
|
||||
``controller2`` or ``controller3`` takes over the caching service.
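For a service that consumes the cache through ``oslo.cache``, a hypothetical
``[cache]`` section referencing the same three hosts could look like this:

.. code-block:: ini

   [cache]
   enabled = true
   backend = oslo_cache.memcache_pool
   memcache_servers = controller1:11211,controller2:11211,controller3:11211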
|
||||
|
||||
For more information about Memcached installation, see the
|
||||
*Environment -> Memcached* section in the
|
||||
`Installation Tutorials and Guides <http://docs.openstack.org/project-install-guide/newton>`_
|
||||
depending on your distribution.
|
||||
|
@ -1,6 +1,6 @@
|
||||
=====================================
|
||||
Install operating system on each node
|
||||
=====================================
|
||||
===============================
|
||||
Installing the operating system
|
||||
===============================
|
||||
|
||||
The first step in setting up your highly available OpenStack cluster
|
||||
is to install the operating system on each node.
|
||||
|
@ -3,7 +3,7 @@ HA community
|
||||
============
|
||||
|
||||
Weekly IRC meetings
|
||||
-------------------
|
||||
~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
The OpenStack HA community holds `weekly IRC meetings
|
||||
<https://wiki.openstack.org/wiki/Meetings/HATeamMeeting>`_ to discuss
|
||||
@ -12,7 +12,7 @@ encouraged to attend. The `logs of all previous meetings
|
||||
<http://eavesdrop.openstack.org/meetings/ha/>`_ are available to read.
|
||||
|
||||
Contacting the community
|
||||
------------------------
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
You can contact the HA community directly in `the #openstack-ha
|
||||
channel on Freenode IRC <https://wiki.openstack.org/wiki/IRC>`_, or by
|
||||
|
@ -1,20 +1,23 @@
|
||||
=================================
|
||||
=================================
|
||||
OpenStack High Availability Guide
|
||||
=================================
|
||||
|
||||
Abstract
|
||||
~~~~~~~~
|
||||
|
||||
This guide describes how to install and configure
|
||||
OpenStack for high availability.
|
||||
It supplements the Installation Tutorials and Guides
|
||||
This guide describes how to install and configure OpenStack for high
|
||||
availability. It supplements the Installation Tutorials and Guides
|
||||
and assumes that you are familiar with the material in those guides.
|
||||
|
||||
This guide documents OpenStack Newton, Mitaka, and Liberty releases.
|
||||
|
||||
.. warning:: This guide is a work-in-progress and changing rapidly
|
||||
while we continue to test and enhance the guidance. Please note
|
||||
where there are open "to do" items and help where you are able.
|
||||
.. warning::
|
||||
|
||||
This guide is a work-in-progress and changing rapidly
|
||||
while we continue to test and enhance the guidance. There are
open `TODO` items throughout this guide, also tracked on the OpenStack
manuals `bug list <https://bugs.launchpad.net/openstack-manuals/>`_.
|
||||
Please help where you are able.
|
||||
|
||||
Contents
|
||||
~~~~~~~~
|
||||
|
@ -4,28 +4,28 @@ Configure high availability of instances
|
||||
|
||||
As of September 2016, the OpenStack High Availability community is
|
||||
designing and developing an official and unified way to provide high
|
||||
availability for instances. That is, we are developing automatic
|
||||
availability for instances. We are developing automatic
|
||||
recovery from failures of hardware or hypervisor-related software on
|
||||
the compute node, or other failures which could prevent instances from
|
||||
functioning correctly - issues with a cinder volume I/O path, for example.
|
||||
the compute node, or other failures that could prevent instances from
|
||||
functioning correctly, such as issues with a cinder volume I/O path.
|
||||
|
||||
More details are available in the `user story
|
||||
<http://specs.openstack.org/openstack/openstack-user-stories/user-stories/proposed/ha_vm.html>`_
|
||||
co-authored by OpenStack's HA community and `Product Working Group
|
||||
<https://wiki.openstack.org/wiki/ProductTeam>`_ (PWG), who have
|
||||
identified this feature as missing functionality in OpenStack which
|
||||
<https://wiki.openstack.org/wiki/ProductTeam>`_ (PWG), where this feature is
|
||||
identified as missing functionality in OpenStack, which
|
||||
should be addressed with high priority.
|
||||
|
||||
Existing solutions
|
||||
------------------
|
||||
~~~~~~~~~~~~~~~~~~
|
||||
|
||||
The architectural challenges of instance HA and several currently
|
||||
existing solutions were presented in `a talk at the Austin summit
|
||||
<https://www.openstack.org/videos/video/high-availability-for-pets-and-hypervisors-state-of-the-nation>`_,
|
||||
for which `slides are also available
|
||||
<http://aspiers.github.io/openstack-summit-2016-austin-compute-ha/>`_.
|
||||
for which `slides are also available <http://aspiers.github.io/openstack-summit-2016-austin-compute-ha/>`_.
|
||||
|
||||
The code for three of these solutions can be found online:
|
||||
The code for three of these solutions can be found online at the following
|
||||
links:
|
||||
|
||||
* `a mistral-based auto-recovery workflow
|
||||
<https://github.com/gryf/mistral-evacuate>`_, by Intel
|
||||
@ -35,7 +35,7 @@ The code for three of these solutions can be found online:
|
||||
as used by Red Hat and SUSE
|
||||
|
||||
Current upstream work
|
||||
---------------------
|
||||
~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Work is in progress on a unified approach, which combines the best
|
||||
aspects of existing upstream solutions. More details are available on
|
||||
|
@ -2,24 +2,24 @@
|
||||
The Pacemaker architecture
|
||||
==========================
|
||||
|
||||
What is a cluster manager
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
What is a cluster manager?
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
At its core, a cluster is a distributed finite state machine capable
|
||||
of co-ordinating the startup and recovery of inter-related services
|
||||
across a set of machines.
|
||||
|
||||
Even a distributed and/or replicated application that is able to
|
||||
survive failures on one or more machines can benefit from a
|
||||
cluster manager:
|
||||
Even a distributed or replicated application that is able to survive failures
|
||||
on one or more machines can benefit from a cluster manager because a cluster
|
||||
manager has the following capabilities:
|
||||
|
||||
#. Awareness of other applications in the stack
|
||||
|
||||
While SYS-V init replacements like systemd can provide
|
||||
deterministic recovery of a complex stack of services, the
|
||||
recovery is limited to one machine and lacks the context of what
|
||||
is happening on other machines - context that is crucial to
|
||||
determine the difference between a local failure, clean startup
|
||||
is happening on other machines. This context is crucial to
|
||||
determine the difference between a local failure and a clean startup
|
||||
and recovery after a total site failure.
|
||||
|
||||
#. Awareness of instances on other machines
|
||||
@ -27,17 +27,17 @@ cluster manager:
|
||||
Services like RabbitMQ and Galera have complicated boot-up
|
||||
sequences that require co-ordination, and often serialization, of
|
||||
startup operations across all machines in the cluster. This is
|
||||
especially true after site-wide failure or shutdown where we must
|
||||
especially true after a site-wide failure or shutdown where you must
|
||||
first determine the last machine to be active.
|
||||
|
||||
#. A shared implementation and calculation of `quorum
|
||||
<http://en.wikipedia.org/wiki/Quorum_(Distributed_Systems)>`_.
|
||||
<http://en.wikipedia.org/wiki/Quorum_(Distributed_Systems)>`_
|
||||
|
||||
It is very important that all members of the system share the same
|
||||
view of who their peers are and whether or not they are in the
|
||||
majority. Failure to do this leads very quickly to an internal
|
||||
`split-brain <http://en.wikipedia.org/wiki/Split-brain_(computing)>`_
|
||||
state - where different parts of the system are pulling in
|
||||
state. This is where different parts of the system are pulling in
|
||||
different and incompatible directions.
|
||||
|
||||
#. Data integrity through fencing (a non-responsive process does not
|
||||
@ -46,7 +46,7 @@ cluster manager:
|
||||
A single application does not have sufficient context to know the
|
||||
difference between failure of a machine and failure of the
|
||||
application on a machine. The usual practice is to assume the
|
||||
machine is dead and carry on, however this is highly risky - a
|
||||
machine is dead and continue working, however this is highly risky. A
|
||||
rogue process or machine could still be responding to requests and
|
||||
generally causing havoc. The safer approach is to make use of
|
||||
remotely accessible power switches and/or network switches and SAN
|
||||
@ -59,46 +59,46 @@ cluster manager:
|
||||
required volume of requests. A cluster can automatically recover
|
||||
failed instances to prevent additional load induced failures.
|
||||
|
||||
For this reason, the use of a cluster manager like `Pacemaker
|
||||
<http://clusterlabs.org>`_ is highly recommended.
|
||||
For these reasons, we highly recommend the use of a cluster manager like
|
||||
`Pacemaker <http://clusterlabs.org>`_.
|
||||
|
||||
Deployment flavors
|
||||
~~~~~~~~~~~~~~~~~~
|
||||
|
||||
It is possible to deploy three different flavors of the Pacemaker
|
||||
architecture. The two extremes are **Collapsed** (where every
|
||||
component runs on every node) and **Segregated** (where every
|
||||
architecture. The two extremes are ``Collapsed`` (where every
|
||||
component runs on every node) and ``Segregated`` (where every
|
||||
component runs in its own 3+ node cluster).
|
||||
|
||||
Regardless of which flavor you choose, it is recommended that the
|
||||
clusters contain at least three nodes so that we can take advantage of
|
||||
Regardless of which flavor you choose, we recommend that
|
||||
clusters contain at least three nodes so that you can take advantage of
|
||||
`quorum <quorum_>`_.
|
||||
|
||||
Quorum becomes important when a failure causes the cluster to split in
|
||||
two or more partitions. In this situation, you want the majority to
|
||||
ensure the minority are truly dead (through fencing) and continue to
|
||||
host resources. For a two-node cluster, no side has the majority and
|
||||
two or more partitions. In this situation, you want the majority members of
|
||||
the system to ensure the minority are truly dead (through fencing) and continue
|
||||
to host resources. For a two-node cluster, no side has the majority and
|
||||
you can end up in a situation where both sides fence each other, or
|
||||
both sides are running the same services - leading to data corruption.
|
||||
both sides are running the same services. This can lead to data corruption.
|
||||
|
||||
Clusters with an even number of hosts suffer from similar issues - a
|
||||
Clusters with an even number of hosts suffer from similar issues. A
|
||||
single network failure could easily cause a N:N split where neither
|
||||
side retains a majority. For this reason, we recommend an odd number
|
||||
of cluster members when scaling up.
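As an illustration only, a three-node cluster using Corosync vote quorum might
carry a stanza such as the following in :file:`corosync.conf` (exact options
depend on your Corosync version):

.. code-block:: none

   quorum {
     provider: corosync_votequorum
     expected_votes: 3
   }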
|
||||
|
||||
You can have up to 16 cluster members (this is currently limited by
|
||||
the ability of corosync to scale higher). In extreme cases, 32 and
|
||||
even up to 64 nodes could be possible, however, this is not well tested.
|
||||
even up to 64 nodes could be possible. However, this is not well tested.
|
||||
|
||||
Collapsed
|
||||
---------
|
||||
|
||||
In this configuration, there is a single cluster of 3 or more
|
||||
In a collapsed configuration, there is a single cluster of 3 or more
|
||||
nodes on which every component is running.
|
||||
|
||||
This scenario has the advantage of requiring far fewer, if more
|
||||
powerful, machines. Additionally, being part of a single cluster
|
||||
allows us to accurately model the ordering dependencies between
|
||||
allows you to accurately model the ordering dependencies between
|
||||
components.
|
||||
|
||||
This scenario can be visualized as below.
|
||||
@ -136,12 +136,11 @@ It is also possible to follow a segregated approach for one or more
|
||||
components that are expected to be a bottleneck and use a collapsed
|
||||
approach for the remainder.
|
||||
|
||||
|
||||
Proxy server
|
||||
~~~~~~~~~~~~
|
||||
|
||||
Almost all services in this stack benefit from being proxied.
|
||||
Using a proxy server provides:
|
||||
Using a proxy server provides the following capabilities:
|
||||
|
||||
#. Load distribution
|
||||
|
||||
@ -152,8 +151,8 @@ Using a proxy server provides:
|
||||
|
||||
#. API isolation
|
||||
|
||||
By sending all API access through the proxy, we can clearly
|
||||
identify service interdependencies. We can also move them to
|
||||
By sending all API access through the proxy, you can clearly
|
||||
identify service interdependencies. You can also move them to
|
||||
locations other than ``localhost`` to increase capacity if the
|
||||
need arises.
|
||||
|
||||
@ -169,7 +168,7 @@ Using a proxy server provides:
|
||||
|
||||
The proxy can be configured as a secondary mechanism for detecting
|
||||
service failures. It can even be configured to look for nodes in
|
||||
a degraded state (such as being 'too far' behind in the
|
||||
a degraded state (such as being too far behind in the
|
||||
replication) and take them out of circulation.
|
||||
|
||||
The following components are currently unable to benefit from the use
|
||||
@ -179,20 +178,13 @@ of a proxy server:
|
||||
* Memcached
|
||||
* MongoDB
|
||||
|
||||
However, the reasons vary and are discussed under each component's
|
||||
heading.
|
||||
|
||||
We recommend HAProxy as the load balancer, however, there are many
|
||||
alternatives in the marketplace.
|
||||
|
||||
We use a check interval of 1 second, however, the timeouts vary by service.
|
||||
We recommend HAProxy as the load balancer, however, there are many alternative
|
||||
load balancing solutions in the marketplace.
|
||||
|
||||
Generally, we use round-robin to distribute load amongst instances of
|
||||
active/active services, however, Galera uses the ``stick-table`` options
|
||||
to ensure that incoming connections to the virtual IP (VIP) should be
|
||||
directed to only one of the available back ends.
|
||||
|
||||
In Galera's case, although it can run active/active, this helps avoid
|
||||
lock contention and prevent deadlocks. It is used in combination with
|
||||
the ``httpchk`` option that ensures only nodes that are in sync with its
|
||||
active/active services. Alternatively, Galera uses the ``stick-table`` options
to ensure that incoming connections to the virtual IP (VIP) are directed to
only one of the available back ends. This helps avoid lock contention and
prevent deadlocks, although Galera can run active/active. Used in combination
with the ``httpchk`` option, this ensures that only nodes that are in sync
with their peers are allowed to handle requests.
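A minimal sketch of such a back end in an HAProxy configuration follows. The
addresses, server names, and the ``clustercheck`` port are placeholders, not a
definitive configuration:

.. code-block:: none

   listen galera_cluster
     bind 10.0.0.11:3306
     option httpchk
     # Stick all connections to the VIP to a single back end at a time.
     stick-table type ip size 2
     stick on dst
     server controller1 10.0.0.12:3306 check port 9200
     server controller2 10.0.0.13:3306 check port 9200 backup
     server controller3 10.0.0.14:3306 check port 9200 backup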
|
||||
|
@ -2,20 +2,18 @@
|
||||
High availability concepts
|
||||
==========================
|
||||
|
||||
High availability systems seek to minimize two things:
|
||||
High availability systems seek to minimize the following issues:
|
||||
|
||||
**System downtime**
|
||||
Occurs when a user-facing service is unavailable
|
||||
beyond a specified maximum amount of time.
|
||||
#. System downtime: Occurs when a user-facing service is unavailable
|
||||
beyond a specified maximum amount of time.
|
||||
|
||||
**Data loss**
|
||||
Accidental deletion or destruction of data.
|
||||
#. Data loss: Accidental deletion or destruction of data.
|
||||
|
||||
Most high availability systems guarantee protection against system downtime
|
||||
and data loss only in the event of a single failure.
|
||||
However, they are also expected to protect against cascading failures,
|
||||
where a single failure deteriorates into a series of consequential failures.
|
||||
Many service providers guarantee :term:`Service Level Agreement (SLA)`
|
||||
Many service providers guarantee a :term:`Service Level Agreement (SLA)`
|
||||
including uptime percentage of computing service, which is calculated based
|
||||
on the available time and system downtime excluding planned outage time.
|
||||
|
||||
@ -65,19 +63,16 @@ guarantee 99.99% availability for individual guest instances.
|
||||
This document discusses some common methods of implementing highly
|
||||
available systems, with an emphasis on the core OpenStack services and
|
||||
other open source services that are closely aligned with OpenStack.
|
||||
These methods are by no means the only ways to do it;
|
||||
you may supplement these services with commercial hardware and software
|
||||
that provides additional features and functionality.
|
||||
You also need to address high availability concerns
|
||||
for any applications software that you run on your OpenStack environment.
|
||||
The important thing is to make sure that your services are redundant
|
||||
and available; how you achieve that is up to you.
|
||||
|
||||
Stateless vs. stateful services
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
You will need to address high availability concerns for any application
|
||||
software that you run on your OpenStack environment. The important thing is
|
||||
to make sure that your services are redundant and available.
|
||||
How you achieve that is up to you.
|
||||
|
||||
Preventing single points of failure can depend on whether or not a
|
||||
service is stateless.
|
||||
Stateless versus stateful services
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
The following are the definitions of stateless and stateful services:
|
||||
|
||||
Stateless service
|
||||
A service that provides a response after your request
|
||||
@ -86,13 +81,13 @@ Stateless service
|
||||
you need to provide redundant instances and load balance them.
|
||||
OpenStack services that are stateless include ``nova-api``,
|
||||
``nova-conductor``, ``glance-api``, ``keystone-api``,
|
||||
``neutron-api`` and ``nova-scheduler``.
|
||||
``neutron-api``, and ``nova-scheduler``.
|
||||
|
||||
Stateful service
|
||||
A service where subsequent requests to the service
|
||||
depend on the results of the first request.
|
||||
Stateful services are more difficult to manage because a single
|
||||
action typically involves more than one request, so simply providing
|
||||
action typically involves more than one request. Providing
|
||||
additional instances and load balancing does not solve the problem.
|
||||
For example, if the horizon user interface reset itself every time
|
||||
you went to a new page, it would not be very useful.
|
||||
@ -101,10 +96,11 @@ Stateful service
|
||||
Making stateful services highly available can depend on whether you choose
|
||||
an active/passive or active/active configuration.
|
||||
|
||||
Active/Passive vs. Active/Active
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
Active/passive versus active/active
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Stateful services may be configured as active/passive or active/active:
|
||||
Stateful services can be configured as active/passive or active/active,
|
||||
which are defined as follows:
|
||||
|
||||
:term:`active/passive configuration`
|
||||
Maintains a redundant instance
|
||||
@ -148,7 +144,7 @@ in order for the cluster to remain functional.
|
||||
When one node fails and failover transfers control to other nodes,
|
||||
the system must ensure that data and processes remain sane.
|
||||
To determine this, the contents of the remaining nodes are compared
|
||||
and, if there are discrepancies, a "majority rules" algorithm is implemented.
|
||||
and, if there are discrepancies, a majority rules algorithm is implemented.
|
||||
|
||||
For this reason, each cluster in a high availability environment should
|
||||
have an odd number of nodes and the quorum is defined as more than a half
|
||||
@ -157,7 +153,7 @@ If multiple nodes fail so that the cluster size falls below the quorum
|
||||
value, the cluster itself fails.
|
||||
|
||||
For example, in a seven-node cluster, the quorum should be set to
|
||||
floor(7/2) + 1 == 4. If quorum is four and four nodes fail simultaneously,
|
||||
``floor(7/2) + 1 == 4``. If quorum is four and four nodes fail simultaneously,
|
||||
the cluster itself would fail, whereas it would continue to function, if
|
||||
no more than three nodes fail. If split to partitions of three and four nodes
|
||||
respectively, the quorum of four nodes would continue to operate the majority
|
||||
@ -169,25 +165,23 @@ example.
|
||||
|
||||
.. note::
|
||||
|
||||
Note that setting the quorum to a value less than floor(n/2) + 1 is not
|
||||
recommended and would likely cause a split-brain in a face of network
|
||||
partitions.
|
||||
We do not recommend setting the quorum to a value less than ``floor(n/2) + 1``
as it would likely cause a split-brain in the face of network partitions.
|
||||
|
||||
Then, for the given example when four nodes fail simultaneously,
|
||||
the cluster would continue to function as well. But if split to partitions of
|
||||
three and four nodes respectively, the quorum of three would have made both
|
||||
sides to attempt to fence the other and host resources. And without fencing
|
||||
enabled, it would go straight to running two copies of each resource.
|
||||
When four nodes fail simultaneously, the cluster would continue to function as
well. But if split into partitions of three and four nodes respectively, the
quorum of three would cause both sides to attempt to fence the other and
host resources. Without fencing enabled, it would go straight to running
two copies of each resource.
|
||||
|
||||
This is why setting the quorum to a value less than floor(n/2) + 1 is
|
||||
dangerous. However it may be required for some specific cases, like a
|
||||
This is why setting the quorum to a value less than ``floor(n/2) + 1`` is
dangerous. However, it may be required for some specific cases, such as a
temporary measure at a point when it is known with 100% certainty that the
other nodes are down.
|
||||
|
||||
When configuring an OpenStack environment for study or demonstration purposes,
|
||||
it is possible to turn off the quorum checking;
|
||||
this is discussed later in this guide.
|
||||
Production systems should always run with quorum enabled.
|
||||
it is possible to turn off the quorum checking. Production systems should
|
||||
always run with quorum enabled.
|
||||
|
||||
|
||||
Single-controller high availability mode
|
||||
@ -203,11 +197,12 @@ but is not appropriate for a production environment.
|
||||
It is possible to add controllers to such an environment
|
||||
to convert it into a truly highly available environment.
|
||||
|
||||
|
||||
High availability is not for every user. It presents some challenges.
|
||||
High availability may be too complex for databases or
|
||||
systems with large amounts of data. Replication can slow large systems
|
||||
down. Different setups have different prerequisites. Read the guidelines
|
||||
for each setup.
|
||||
|
||||
High availability is turned off as the default in OpenStack setups.
|
||||
.. important::
|
||||
|
||||
High availability is turned off as the default in OpenStack setups.
|
||||
|
@ -3,17 +3,17 @@ Overview of highly available controllers
|
||||
========================================
|
||||
|
||||
OpenStack is a set of multiple services exposed to the end users
|
||||
as HTTP(s) APIs. Additionally, for own internal usage OpenStack
|
||||
requires SQL database server and AMQP broker. The physical servers,
|
||||
where all the components are running are often called controllers.
|
||||
This modular OpenStack architecture allows to duplicate all the
|
||||
as HTTP(s) APIs. Additionally, for its own internal usage, OpenStack
requires an SQL database server and an AMQP broker. The physical servers,
|
||||
where all the components are running, are called controllers.
|
||||
This modular OpenStack architecture allows you to duplicate all the
|
||||
components and run them on different controllers.
|
||||
By making all the components redundant it is possible to make
|
||||
OpenStack highly available.
|
||||
|
||||
In general we can divide all the OpenStack components into three categories:
|
||||
|
||||
- OpenStack APIs, these are HTTP(s) stateless services written in python,
|
||||
- OpenStack APIs: These are HTTP(s) stateless services written in python,
|
||||
easy to duplicate and mostly easy to load balance.
|
||||
|
||||
- SQL relational database server provides stateful type consumed by other
|
||||
@ -42,17 +42,16 @@ Networking for high availability.
|
||||
Common deployment architectures
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
There are primarily two recommended architectures for making OpenStack
|
||||
highly available.
|
||||
|
||||
Both use a cluster manager such as Pacemaker or Veritas to
|
||||
orchestrate the actions of the various services across a set of
|
||||
machines. Since we are focused on FOSS, we will refer to these as
|
||||
Pacemaker architectures.
|
||||
We recommend two primary architectures for making OpenStack highly available.
|
||||
|
||||
The architectures differ in the sets of services managed by the
|
||||
cluster.
|
||||
|
||||
Both use a cluster manager, such as Pacemaker or Veritas, to
|
||||
orchestrate the actions of the various services across a set of
|
||||
machines. Because we are focused on FOSS, we refer to these as
|
||||
Pacemaker architectures.
|
||||
|
||||
Traditionally, Pacemaker has been positioned as an all-encompassing
|
||||
solution. However, as OpenStack services have matured, they are
|
||||
increasingly able to run in an active/active configuration and
|
||||
@ -61,7 +60,7 @@ depend.
|
||||
|
||||
With this in mind, some vendors are restricting Pacemaker's use to
|
||||
services that must operate in an active/passive mode (such as
|
||||
cinder-volume), those with multiple states (for example, Galera) and
|
||||
``cinder-volume``), those with multiple states (for example, Galera), and
|
||||
those with complex bootstrapping procedures (such as RabbitMQ).
|
||||
|
||||
The majority of services, needing no real orchestration, are handled
|
||||
|
@ -1,4 +1,3 @@
|
||||
|
||||
======================================
|
||||
High availability for other components
|
||||
======================================
|
||||
|
@ -1,9 +1,7 @@
|
||||
|
||||
===========================================
|
||||
Introduction to OpenStack high availability
|
||||
===========================================
|
||||
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
|
||||
|
@ -2,12 +2,10 @@
|
||||
Run Networking DHCP agent
|
||||
=========================
|
||||
|
||||
The OpenStack Networking service has a scheduler
|
||||
that lets you run multiple agents across nodes;
|
||||
the DHCP agent can be natively highly available.
|
||||
To configure the number of DHCP agents per network,
|
||||
modify the ``dhcp_agents_per_network`` parameter
|
||||
in the :file:`/etc/neutron/neutron.conf` file.
|
||||
By default this is set to 1.
|
||||
To achieve high availability,
|
||||
assign more than one DHCP agent per network.
|
||||
The OpenStack Networking (neutron) service has a scheduler that lets you run
|
||||
multiple agents across nodes. The DHCP agent can be natively highly available.
|
||||
|
||||
To configure the number of DHCP agents per network, modify the
|
||||
``dhcp_agents_per_network`` parameter in the :file:`/etc/neutron/neutron.conf`
|
||||
file. By default this is set to 1. To achieve high availability, assign more
|
||||
than one DHCP agent per network.
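For example, to run two DHCP agents for every network (a sketch of the single
relevant option):

.. code-block:: ini

   [DEFAULT]
   dhcp_agents_per_network = 2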
|
||||
|
@ -2,12 +2,12 @@
|
||||
Run Networking L3 agent
|
||||
=======================
|
||||
|
||||
The neutron L3 agent is scalable, due to the scheduler that supports
|
||||
Virtual Router Redundancy Protocol (VRRP)
|
||||
to distribute virtual routers across multiple nodes.
|
||||
To enable high availability for configured routers,
|
||||
edit the :file:`/etc/neutron/neutron.conf` file
|
||||
to set the following values:
|
||||
The Networking (neutron) service L3 agent is scalable, due to the scheduler
|
||||
that supports Virtual Router Redundancy Protocol (VRRP) to distribute virtual
|
||||
routers across multiple nodes.
|
||||
|
||||
To enable high availability for configured routers, edit the
|
||||
:file:`/etc/neutron/neutron.conf` file to set the following values:
|
||||
|
||||
.. list-table:: /etc/neutron/neutron.conf parameters for high availability
|
||||
:widths: 15 10 30
|
||||
|
@ -2,12 +2,10 @@
|
||||
Run Networking LBaaS agent
|
||||
==========================
|
||||
|
||||
Currently, no native feature is provided
|
||||
to make the LBaaS agent highly available
|
||||
using the default plug-in HAProxy.
|
||||
A common way to make HAProxy highly available
|
||||
is to use the VRRP (Virtual Router Redundancy Protocol).
|
||||
Unfortunately, this is not yet implemented
|
||||
in the LBaaS HAProxy plug-in.
|
||||
Currently, no native feature is provided to make the LBaaS agent highly
|
||||
available using the default plug-in HAProxy. A common way to make HAProxy
|
||||
highly available is to use the VRRP (Virtual Router Redundancy Protocol).
|
||||
|
||||
Unfortunately, this is not yet implemented in the LBaaS HAProxy plug-in.
|
||||
|
||||
[TODO: update this section.]
|
||||
|
@ -2,11 +2,9 @@
|
||||
Run Networking metadata agent
|
||||
=============================
|
||||
|
||||
No native feature is available
|
||||
to make this service highly available.
|
||||
At this time, the Active/Passive solution exists
|
||||
to run the neutron metadata agent
|
||||
in failover mode with Pacemaker.
|
||||
Currently, no native feature is available to make this service highly
|
||||
available. At this time, the active/passive solution exists to run the
|
||||
neutron metadata agent in failover mode with Pacemaker.
|
||||
|
||||
[TODO: Update this information.
|
||||
Can this service now be made HA in active/active mode
|
||||
|
@ -2,10 +2,10 @@
|
||||
Networking services for high availability
|
||||
=========================================
|
||||
|
||||
Configure networking on each node. See basic information
|
||||
Configure networking on each node. See the basic information
|
||||
about configuring networking in the *Networking service*
|
||||
section of the
|
||||
`Install Tutorials and Guides <http://docs.openstack.org/project-install-guide/newton>`_
|
||||
`Install Tutorials and Guides <http://docs.openstack.org/project-install-guide/newton>`_,
|
||||
depending on your distribution.
|
||||
|
||||
Notes from planning outline:
|
||||
|
@ -12,26 +12,27 @@ Certain services running on the underlying operating system of your
|
||||
OpenStack database may block Galera Cluster from normal operation
|
||||
or prevent ``mysqld`` from achieving network connectivity with the cluster.
|
||||
|
||||
|
||||
Firewall
|
||||
---------
|
||||
|
||||
Galera Cluster requires that you open four ports to network traffic:
|
||||
Galera Cluster requires that you open the following ports to network traffic:
|
||||
|
||||
- On ``3306``, Galera Cluster uses TCP for database client connections
|
||||
and State Snapshot Transfer methods that require the client
(that is, ``mysqldump``).
|
||||
- On ``4567`` Galera Cluster uses TCP for replication traffic. Multicast
|
||||
- On ``4567``, Galera Cluster uses TCP for replication traffic. Multicast
|
||||
replication uses both TCP and UDP on this port.
|
||||
- On ``4568`` Galera Cluster uses TCP for Incremental State Transfers.
|
||||
- On ``4444`` Galera Cluster uses TCP for all other State Snapshot Transfer
|
||||
- On ``4568``, Galera Cluster uses TCP for Incremental State Transfers.
|
||||
- On ``4444``, Galera Cluster uses TCP for all other State Snapshot Transfer
|
||||
methods.
|
||||
|
||||
.. seealso:: For more information on firewalls, see `Firewalls and default ports
|
||||
<http://docs.openstack.org/newton/config-reference/firewalls-default-ports.html>`_ in the Configuration Reference.
|
||||
.. seealso::
|
||||
|
||||
This can be achieved through the use of either the ``iptables``
|
||||
command such as:
|
||||
For more information on firewalls, see `Firewalls and default ports
|
||||
<http://docs.openstack.org/newton/config-reference/firewalls-default-ports.html>`_
|
||||
in the Configuration Reference.
|
||||
|
||||
This can be achieved using the :command:`iptables` command:
|
||||
|
||||
.. code-block:: console
|
||||
|
||||
@ -39,15 +40,14 @@ command such as:
|
||||
--protocol tcp --match tcp --dport ${PORT} \
|
||||
--source ${NODE-IP-ADDRESS} --jump ACCEPT
|
||||
|
||||
Make sure to save the changes once you are done, this will vary
|
||||
Make sure to save the changes once you are done. This will vary
|
||||
depending on your distribution:
|
||||
|
||||
- `Ubuntu <http://askubuntu.com/questions/66890/how-can-i-make-a-specific-set-of-iptables-rules-permanent#66905>`_
|
||||
- `Fedora <https://fedoraproject.org/wiki/How_to_edit_iptables_rules>`_
|
||||
- For `Ubuntu <http://askubuntu.com/questions/66890/how-can-i-make-a-specific-set-of-iptables-rules-permanent#66905>`_
|
||||
- For `Fedora <https://fedoraproject.org/wiki/How_to_edit_iptables_rules>`_
|
||||
|
||||
Alternatively you may be able to make modifications using the
|
||||
``firewall-cmd`` utility for FirewallD that is available on many Linux
|
||||
distributions:
|
||||
Alternatively, make modifications using the ``firewall-cmd`` utility for
|
||||
FirewallD that is available on many Linux distributions:
|
||||
|
||||
.. code-block:: console
|
||||
|
||||
@ -60,11 +60,11 @@ SELinux
|
||||
Security-Enhanced Linux is a kernel module for improving security on Linux
|
||||
operating systems. It is commonly enabled and configured by default on
|
||||
Red Hat-based distributions. In the context of Galera Cluster, systems with
|
||||
SELinux may block the database service, keep it from starting or prevent it
|
||||
SELinux may block the database service, keep it from starting, or prevent it
|
||||
from establishing network connections with the cluster.
|
||||
|
||||
To configure SELinux to permit Galera Cluster to operate, you may need
|
||||
to use the ``semanage`` utility to open the ports it uses, for
|
||||
to use the ``semanage`` utility to open the ports it uses. For
|
||||
example:
|
||||
|
||||
.. code-block:: console
|
||||
@ -79,14 +79,16 @@ relaxed about database access and actions:
|
||||
|
||||
# semanage permissive -a mysqld_t
|
||||
|
||||
.. note:: Bear in mind, leaving SELinux in permissive mode is not a good
|
||||
security practice. Over the longer term, you need to develop a
|
||||
security policy for Galera Cluster and then switch SELinux back
|
||||
into enforcing mode.
|
||||
.. note::
|
||||
|
||||
For more information on configuring SELinux to work with
|
||||
Galera Cluster, see the `Documentation
|
||||
<http://galeracluster.com/documentation-webpages/selinux.html>`_
|
||||
Bear in mind, leaving SELinux in permissive mode is not a good
|
||||
security practice. Over the longer term, you need to develop a
|
||||
security policy for Galera Cluster and then switch SELinux back
|
||||
into enforcing mode.
|
||||
|
||||
For more information on configuring SELinux to work with
|
||||
Galera Cluster, see the `SELinux Documentation
|
||||
<http://galeracluster.com/documentation-webpages/selinux.html>`_
|
||||
|
||||
AppArmor
|
||||
---------
|
||||
@ -111,7 +113,7 @@ following steps on each cluster node:
|
||||
|
||||
# service apparmor restart
|
||||
|
||||
For servers that use ``systemd``, instead run this command:
|
||||
For servers that use ``systemd``, run the following command:
|
||||
|
||||
.. code-block:: console
|
||||
|
||||
@ -119,7 +121,6 @@ following steps on each cluster node:
|
||||
|
||||
AppArmor now permits Galera Cluster to operate.
|
||||
|
||||
|
||||
Database configuration
|
||||
~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
@ -152,21 +153,20 @@ additions.
|
||||
wsrep_sst_method=rsync
|
||||
|
||||
|
||||
|
||||
Configuring mysqld
|
||||
-------------------
|
||||
|
||||
While all of the configuration parameters available to the standard MySQL,
|
||||
MariaDB or Percona XtraDB database server are available in Galera Cluster,
|
||||
MariaDB, or Percona XtraDB database servers are available in Galera Cluster,
|
||||
there are some that you must define at the outset to avoid conflict or
|
||||
unexpected behavior.
|
||||
|
||||
- Ensure that the database server is not bound only to to the localhost,
|
||||
``127.0.0.1``. Also, do not bind it to ``0.0.0.0``. It makes ``mySQL``
|
||||
bind to all IP addresses on the machine including the virtual IP address,
|
||||
which will cause ``HAProxy`` not to start. Instead, bind it to the
|
||||
management IP address of the controller node to enable access by other
|
||||
nodes through the management network:
|
||||
- Ensure that the database server is not bound only to the localhost:
|
||||
``127.0.0.1``. Also, do not bind it to ``0.0.0.0``. Binding to ``0.0.0.0``
makes MySQL bind to all IP addresses on the machine, including the virtual
IP address, causing ``HAProxy`` not to start. Instead,
|
||||
bind to the management IP address of the controller node to enable access by
|
||||
other nodes through the management network:
|
||||
|
||||
.. code-block:: ini
|
||||
|
||||
@ -194,7 +194,7 @@ parameters that you must define to avoid conflicts.
|
||||
default_storage_engine=InnoDB
|
||||
|
||||
- Ensure that the InnoDB locking mode for generating auto-increment values
|
||||
is set to ``2``, which is the interleaved locking mode.
|
||||
is set to ``2``, which is the interleaved locking mode:
|
||||
|
||||
.. code-block:: ini
|
||||
|
||||
@ -211,8 +211,8 @@ parameters that you must define to avoid conflicts.
|
||||
|
||||
innodb_flush_log_at_trx_commit=0
|
||||
|
||||
Bear in mind, while setting this parameter to ``1`` or ``2`` can improve
|
||||
performance, it introduces certain dangers. Operating system failures can
|
||||
Setting this parameter to ``0`` or ``2`` can improve
|
||||
performance, but it introduces certain dangers. Operating system failures can
|
||||
erase the last second of transactions. While you can recover this data
|
||||
from another node, if the cluster goes down at the same time
|
||||
(in the event of a data center power outage), you lose this data permanently.
|
||||
@ -230,19 +230,19 @@ Configuring wsrep replication
|
||||
------------------------------
|
||||
|
||||
Galera Cluster configuration parameters all have the ``wsrep_`` prefix.
|
||||
There are five that you must define for each cluster node in your
|
||||
You must define the following parameters for each cluster node in your
|
||||
OpenStack database.
|
||||
|
||||
- **wsrep Provider** The Galera Replication Plugin serves as the wsrep
|
||||
Provider for Galera Cluster. It is installed on your system as the
|
||||
``libgalera_smm.so`` file. You must define the path to this file in
|
||||
your ``my.cnf``.
|
||||
- **wsrep Provider**: The Galera Replication Plugin serves as the ``wsrep``
|
||||
provider for Galera Cluster. It is installed on your system as the
|
||||
``libgalera_smm.so`` file. Define the path to this file in
|
||||
your ``my.cnf``:
|
||||
|
||||
.. code-block:: ini
|
||||
|
||||
wsrep_provider="/usr/lib/libgalera_smm.so"
|
||||
|
||||
- **Cluster Name** Define an arbitrary name for your cluster.
|
||||
- **Cluster Name**: Define an arbitrary name for your cluster.
|
||||
|
||||
.. code-block:: ini
|
||||
|
||||
@ -251,7 +251,7 @@ OpenStack database.
|
||||
You must use the same name on every cluster node. The connection fails
|
||||
when this value does not match.
|
||||
|
||||
- **Cluster Address** List the IP addresses for each cluster node.
|
||||
- **Cluster Address**: List the IP addresses for each cluster node.
|
||||
|
||||
.. code-block:: ini
|
||||
|
||||
@ -260,21 +260,18 @@ OpenStack database.
|
||||
Replace the IP addresses given here with comma-separated list of each
|
||||
OpenStack database in your cluster.
|
||||
|
||||
- **Node Name** Define the logical name of the cluster node.
|
||||
- **Node Name**: Define the logical name of the cluster node.
|
||||
|
||||
.. code-block:: ini
|
||||
|
||||
wsrep_node_name="Galera1"
|
||||
|
||||
- **Node Address** Define the IP address of the cluster node.
|
||||
- **Node Address**: Define the IP address of the cluster node.
|
||||
|
||||
.. code-block:: ini
|
||||
|
||||
wsrep_node_address="192.168.1.1"
|
||||
|
||||
|
||||
|
||||
|
||||
Additional parameters
|
||||
^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
@ -299,6 +296,6 @@ For a complete list of the available parameters, run the
|
||||
| wsrep_sync_wait | 0 |
|
||||
+------------------------------+-------+
|
||||
|
||||
For the documentation of these parameters, wsrep Provider option and status
|
||||
variables available in Galera Cluster, see `Reference
|
||||
For documentation about these parameters, ``wsrep`` provider option, and status
|
||||
variables available in Galera Cluster, see the Galera cluster `Reference
|
||||
<http://galeracluster.com/documentation-webpages/reference.html>`_.
|
||||
|
@ -2,35 +2,31 @@
|
||||
Management
|
||||
==========
|
||||
|
||||
When you finish the installation and configuration process on each
|
||||
cluster node in your OpenStack database, you can initialize Galera Cluster.
|
||||
When you finish installing and configuring the OpenStack database,
|
||||
you can initialize the Galera Cluster.
|
||||
|
||||
Before you attempt this, verify that you have the following ready:
|
||||
Prerequisites
|
||||
~~~~~~~~~~~~~
|
||||
|
||||
- Database hosts with Galera Cluster installed. You need a
|
||||
minimum of three hosts;
|
||||
- No firewalls between the hosts;
|
||||
- SELinux and AppArmor set to permit access to ``mysqld``;
|
||||
- Database hosts with Galera Cluster installed
|
||||
- A minimum of three hosts
|
||||
- No firewalls between the hosts
|
||||
- SELinux and AppArmor set to permit access to ``mysqld``
|
||||
- The correct path to ``libgalera_smm.so`` given to the
|
||||
``wsrep_provider`` parameter.
|
||||
``wsrep_provider`` parameter
|
||||
|
||||
Initializing the cluster
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
In Galera Cluster, the Primary Component is the cluster of database
|
||||
In the Galera Cluster, the Primary Component is the cluster of database
|
||||
servers that replicate into each other. In the event that a
|
||||
cluster node loses connectivity with the Primary Component, it
|
||||
defaults into a non-operational state, to avoid creating or serving
|
||||
inconsistent data.
|
||||
|
||||
By default, cluster nodes do not start as part of a Primary
|
||||
Component. Instead they assume that one exists somewhere and
|
||||
attempts to establish a connection with it. To create a Primary
|
||||
Component, you must start one cluster node using the
|
||||
``--wsrep-new-cluster`` option. You can do this using any cluster
|
||||
node, it is not important which you choose. In the Primary
|
||||
Component, replication and state transfers bring all databases to
|
||||
the same state.
|
||||
By default, cluster nodes do not start as part of a Primary Component.
|
||||
In the Primary Component, replication and state transfers bring all databases
|
||||
to the same state.
|
||||
|
||||
To start the cluster, complete the following steps:
|
||||
|
||||
@ -41,7 +37,7 @@ To start the cluster, complete the following steps:
|
||||
|
||||
# service mysql start --wsrep-new-cluster
|
||||
|
||||
For servers that use ``systemd``, instead run this command:
|
||||
For servers that use ``systemd``, run the following command:
|
||||
|
||||
.. code-block:: console
|
||||
|
||||
@ -68,15 +64,15 @@ To start the cluster, complete the following steps:
|
||||
|
||||
# service mysql start
|
||||
|
||||
For servers that use ``systemd``, instead run this command:
|
||||
For servers that use ``systemd``, run the following command:
|
||||
|
||||
.. code-block:: console
|
||||
|
||||
# systemctl start mariadb
|
||||
|
||||
#. When you have all cluster nodes started, log into the database
|
||||
client on one of them and check the ``wsrep_cluster_size``
|
||||
status variable again.
|
||||
client of any cluster node and check the ``wsrep_cluster_size``
|
||||
status variable again:
|
||||
|
||||
.. code-block:: mysql
|
||||
|
||||
@ -89,32 +85,33 @@ To start the cluster, complete the following steps:
|
||||
+--------------------+-------+
|
||||
|
||||
When each cluster node starts, it checks the IP addresses given to
|
||||
the ``wsrep_cluster_address`` parameter and attempts to establish
|
||||
the ``wsrep_cluster_address`` parameter. It then attempts to establish
|
||||
network connectivity with a database server running there. Once it
|
||||
establishes a connection, it attempts to join the Primary
|
||||
Component, requesting a state transfer as needed to bring itself
|
||||
into sync with the cluster.
|
||||
|
||||
In the event that you need to restart any cluster node, you can do
|
||||
so. When the database server comes back it, it establishes
|
||||
connectivity with the Primary Component and updates itself to any
|
||||
changes it may have missed while down.
|
||||
.. note::
|
||||
|
||||
In the event that you need to restart any cluster node, you can do
|
||||
so. When the database server comes back up, it establishes
|
||||
connectivity with the Primary Component and updates itself to any
|
||||
changes it may have missed while down.
|
||||
|
||||
Restarting the cluster
|
||||
-----------------------
|
||||
|
||||
Individual cluster nodes can stop and be restarted without issue.
|
||||
When a database loses its connection or restarts, Galera Cluster
|
||||
When a database loses its connection or restarts, the Galera Cluster
|
||||
brings it back into sync once it reestablishes connection with the
|
||||
Primary Component. In the event that you need to restart the
|
||||
entire cluster, identify the most advanced cluster node and
|
||||
initialize the Primary Component on that node.
|
||||
|
||||
To find the most advanced cluster node, you need to check the
|
||||
sequence numbers, or seqnos, on the last committed transaction for
|
||||
sequence numbers, or the ``seqnos``, on the last committed transaction for
|
||||
each. You can find this by viewing the ``grastate.dat`` file in the
|
||||
database directory,
|
||||
database directory:
|
||||
|
||||
.. code-block:: console
|
||||
|
||||
@ -139,26 +136,24 @@ Alternatively, if the database server is running, use the
|
||||
+----------------------+--------+
|
||||
|
||||
This value increments with each transaction, so the most advanced
|
||||
node has the highest sequence number, and therefore is the most up to date.
|
||||
|
||||
node has the highest sequence number and therefore is the most up to date.
|
||||
|
||||
Configuration tips
|
||||
~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
|
||||
Deployment strategies
|
||||
----------------------
|
||||
|
||||
Galera can be configured using one of the following
|
||||
strategies:
|
||||
|
||||
- Each instance has its own IP address;
|
||||
- Each instance has its own IP address:
|
||||
|
||||
OpenStack services are configured with the list of these IP
|
||||
addresses so they can select one of the addresses from those
|
||||
available.
|
||||
|
||||
- Galera runs behind HAProxy.
|
||||
- Galera runs behind HAProxy:
|
||||
|
||||
HAProxy load balances incoming requests and exposes just one IP
|
||||
address for all the clients.
|
||||
@ -166,32 +161,25 @@ strategies:
|
||||
Galera synchronous replication guarantees a zero slave lag. The
|
||||
failover procedure completes once HAProxy detects that the active
|
||||
back end has gone down and switches to the backup one, which is
|
||||
then marked as 'UP'. If no back ends are up (in other words, the
|
||||
Galera cluster is not ready to accept connections), the failover
|
||||
procedure finishes only when the Galera cluster has been
|
||||
then marked as ``UP``. If no back ends are ``UP``, the failover
|
||||
procedure finishes only when the Galera Cluster has been
|
||||
successfully reassembled. The SLA is normally no more than 5
|
||||
minutes.
|
||||
|
||||
- Use MySQL/Galera in active/passive mode to avoid deadlocks on
|
||||
``SELECT ... FOR UPDATE`` type queries (used, for example, by nova
|
||||
and neutron). This issue is discussed more in the following:
|
||||
and neutron). This issue is discussed in the following:
|
||||
|
||||
- `IMPORTANT: MySQL Galera does *not* support SELECT ... FOR UPDATE
|
||||
<http://lists.openstack.org/pipermail/openstack-dev/2014-May/035264.html>`_
|
||||
- `Understanding reservations, concurrency, and locking in Nova
|
||||
<http://www.joinfu.com/2015/01/understanding-reservations-concurrency-locking-in-nova/>`_
|
||||
|
||||
Of these options, the second one is highly recommended. Although Galera
|
||||
supports active/active configurations, we recommend active/passive
|
||||
(enforced by the load balancer) in order to avoid lock contention.
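
As an illustration of the recommended approach, a minimal HAProxy stanza
might look like the following sketch. The IP addresses are placeholders and
the ``check port 9200`` health check assumes the ``clustercheck`` listener
described later in this section:

.. code-block:: none

   listen galera_cluster
       bind 10.0.0.11:3306
       balance source
       option httpchk
       server controller1 10.0.0.12:3306 check port 9200 inter 2000 rise 2 fall 5
       server controller2 10.0.0.13:3306 backup check port 9200 inter 2000 rise 2 fall 5
       server controller3 10.0.0.14:3306 backup check port 9200 inter 2000 rise 2 fall 5

Marking all but one back end as ``backup`` is what makes the configuration
effectively active/passive.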
|
||||
|
||||
|
||||
|
||||
Configuring HAProxy
|
||||
--------------------
|
||||
|
||||
If you use HAProxy for load-balancing client access to Galera
|
||||
Cluster as described in the :doc:`controller-ha-haproxy`, you can
|
||||
If you use HAProxy to load-balance client access to the
Galera Cluster, as described in the :doc:`controller-ha-haproxy`, you can
|
||||
use the ``clustercheck`` utility to improve health checks.
|
||||
|
||||
#. Create a configuration file for ``clustercheck`` at
|
||||
@ -205,7 +193,7 @@ use the ``clustercheck`` utility to improve health checks.
|
||||
MYSQL_PORT="3306"
|
||||
|
||||
#. Log in to the database client and grant the ``clustercheck`` user
|
||||
``PROCESS`` privileges.
|
||||
``PROCESS`` privileges:
|
||||
|
||||
.. code-block:: mysql
|
||||
|
||||
@ -248,12 +236,10 @@ use the ``clustercheck`` utility to improve health checks.
|
||||
# service xinetd enable
|
||||
# service xinetd start
|
||||
|
||||
For servers that use ``systemd``, instead run these commands:
|
||||
For servers that use ``systemd``, run the following commands:
|
||||
|
||||
.. code-block:: console
|
||||
|
||||
# systemctl daemon-reload
|
||||
# systemctl enable xinetd
|
||||
# systemctl start xinetd
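
To confirm that the health-check listener responds, you can query it
directly. Port 9200 is the conventional ``clustercheck`` port and may differ
in your environment:

.. code-block:: console

   # curl http://10.0.0.12:9200

A synced node answers with an HTTP ``200 OK`` response; a node that is not
ready to serve traffic answers with ``503 Service Unavailable``.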
|
||||
|
||||
|
||||
|
@ -13,19 +13,18 @@ You can achieve high availability for the OpenStack database in many
|
||||
different ways, depending on the type of database that you want to use.
|
||||
There are three implementations of Galera Cluster available to you:
|
||||
|
||||
- `Galera Cluster for MySQL <http://galeracluster.com/>`_ The MySQL
|
||||
reference implementation from Codership, Oy;
|
||||
- `MariaDB Galera Cluster <https://mariadb.org/>`_ The MariaDB
|
||||
- `Galera Cluster for MySQL <http://galeracluster.com/>`_: The MySQL
|
||||
reference implementation from Codership, Oy.
|
||||
- `MariaDB Galera Cluster <https://mariadb.org/>`_: The MariaDB
|
||||
implementation of Galera Cluster, which is commonly supported in
|
||||
environments based on Red Hat distributions;
|
||||
- `Percona XtraDB Cluster <http://www.percona.com/>`_ The XtraDB
|
||||
environments based on Red Hat distributions.
|
||||
- `Percona XtraDB Cluster <http://www.percona.com/>`_: The XtraDB
|
||||
implementation of Galera Cluster from Percona.
|
||||
|
||||
In addition to Galera Cluster, you can also achieve high availability
|
||||
through other database options, such as PostgreSQL, which has its own
|
||||
replication system.
|
||||
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
|
||||
|
@ -9,8 +9,7 @@ execution of jobs entered into the system.
|
||||
The most popular AMQP implementation used in OpenStack installations
|
||||
is RabbitMQ.
|
||||
|
||||
RabbitMQ nodes fail over both on the application and the
|
||||
infrastructure layers.
|
||||
RabbitMQ nodes fail over on the application and the infrastructure layers.
|
||||
|
||||
The application layer is controlled by the ``oslo.messaging``
|
||||
configuration options for multiple AMQP hosts. If the AMQP node fails,
|
||||
@ -21,7 +20,7 @@ constitutes its SLA.
|
||||
On the infrastructure layer, the SLA is the time it takes for the RabbitMQ
cluster to reassemble. Several cases are possible. The Mnesia keeper
|
||||
node is the master of the corresponding Pacemaker resource for
|
||||
RabbitMQ; when it fails, the result is a full AMQP cluster downtime
|
||||
RabbitMQ. When it fails, the result is a full AMQP cluster downtime
|
||||
interval. Normally, its SLA is no more than several minutes. Failure
|
||||
of another node that is a slave of the corresponding Pacemaker
|
||||
resource for RabbitMQ results in no AMQP cluster downtime at all.
|
||||
@ -32,43 +31,18 @@ Making the RabbitMQ service highly available involves the following steps:
|
||||
|
||||
- :ref:`Configure RabbitMQ for HA queues<rabbitmq-configure>`
|
||||
|
||||
- :ref:`Configure OpenStack services to use Rabbit HA queues
|
||||
- :ref:`Configure OpenStack services to use RabbitMQ HA queues
|
||||
<rabbitmq-services>`
|
||||
|
||||
.. note::
|
||||
|
||||
Access to RabbitMQ is not normally handled by HAproxy. Instead,
|
||||
Access to RabbitMQ is not normally handled by HAProxy. Instead,
|
||||
consumers must be supplied with the full list of hosts running
|
||||
RabbitMQ with ``rabbit_hosts`` and turn on the ``rabbit_ha_queues``
|
||||
option.
|
||||
|
||||
Jon Eck found the `core issue
|
||||
<http://people.redhat.com/jeckersb/private/vip-failover-tcp-persist.html>`_
|
||||
and went into some detail regarding the `history and solution
|
||||
<http://john.eckersberg.com/improving-ha-failures-with-tcp-timeouts.html>`_
|
||||
on his blog.
|
||||
|
||||
In summary though:
|
||||
|
||||
The source address for the connection from HAProxy back to the
|
||||
client is the VIP address. However the VIP address is no longer
|
||||
present on the host. This means that the network (IP) layer
|
||||
deems the packet unroutable, and informs the transport (TCP)
|
||||
layer. TCP, however, is a reliable transport. It knows how to
|
||||
handle transient errors and will retry. And so it does.
|
||||
|
||||
In this case that is a problem though, because:
|
||||
|
||||
TCP generally holds on to hope for a long time. A ballpark
|
||||
estimate is somewhere on the order of tens of minutes (30
|
||||
minutes is commonly referenced). During this time it will keep
|
||||
probing and trying to deliver the data.
|
||||
|
||||
It is important to note that HAProxy has no idea that any of this is
|
||||
happening. As far as its process is concerned, it called
|
||||
``write()`` with the data and the kernel returned success. The
|
||||
resolution is already understood and just needs to make its way
|
||||
through a review.
|
||||
option. For more information, read the `core issue
|
||||
<http://people.redhat.com/jeckersb/private/vip-failover-tcp-persist.html>`_.
|
||||
For more detail, read the `history and solution
|
||||
<http://john.eckersberg.com/improving-ha-failures-with-tcp-timeouts.html>`_.
|
||||
|
||||
.. _rabbitmq-install:
|
||||
|
||||
@ -93,17 +67,16 @@ you are using:
|
||||
* - SLES 12
|
||||
- :command:`# zypper addrepo -f obs://Cloud:OpenStack:Kilo/SLE_12 Kilo`
|
||||
|
||||
[Verify fingerprint of imported GPG key; see below]
|
||||
[Verify the fingerprint of the imported GPG key. See below.]
|
||||
|
||||
:command:`# zypper install rabbitmq-server`
|
||||
|
||||
|
||||
.. note::
|
||||
|
||||
For SLES 12, the packages are signed by GPG key 893A90DAD85F9316.
|
||||
You should verify the fingerprint of the imported GPG key before using it.
|
||||
|
||||
::
|
||||
.. code-block:: ini
|
||||
|
||||
Key ID: 893A90DAD85F9316
|
||||
Key Name: Cloud:OpenStack OBS Project <Cloud:OpenStack@build.opensuse.org>
|
||||
@ -111,8 +84,8 @@ you are using:
|
||||
Key Created: Tue Oct 8 13:34:21 2013
|
||||
Key Expires: Thu Dec 17 13:34:21 2015
|
||||
|
||||
For more information,
|
||||
see the official installation manual for the distribution:
|
||||
For more information, see the official installation manual for the
|
||||
distribution:
|
||||
|
||||
- `Debian and Ubuntu <http://www.rabbitmq.com/install-debian.html>`_
|
||||
- `RPM based <http://www.rabbitmq.com/install-rpm.html>`_
|
||||
@ -123,53 +96,45 @@ see the official installation manual for the distribution:
|
||||
Configure RabbitMQ for HA queues
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
[TODO: This section should begin with a brief mention
|
||||
about what HA queues are and why they are valuable, etc]
|
||||
.. [TODO: This section should begin with a brief mention
|
||||
.. about what HA queues are and why they are valuable, etc]
|
||||
|
||||
We are building a cluster of RabbitMQ nodes to construct a RabbitMQ broker,
|
||||
which is a logical grouping of several Erlang nodes.
|
||||
.. [TODO: replace "currently" with specific release names]
|
||||
|
||||
.. [TODO: Does this list need to be updated? Perhaps we need a table
|
||||
.. that shows each component and the earliest release that allows it
|
||||
.. to work with HA queues.]
|
||||
|
||||
The following components/services can work with HA queues:
|
||||
|
||||
[TODO: replace "currently" with specific release names]
|
||||
|
||||
[TODO: Does this list need to be updated? Perhaps we need a table
|
||||
that shows each component and the earliest release that allows it
|
||||
to work with HA queues.]
|
||||
|
||||
- OpenStack Compute
|
||||
- OpenStack Block Storage
|
||||
- OpenStack Networking
|
||||
- Telemetry
|
||||
|
||||
We have to consider that, while exchanges and bindings
|
||||
survive the loss of individual nodes,
|
||||
queues and their messages do not
|
||||
because a queue and its contents are located on one node.
|
||||
If we lose this node, we also lose the queue.
|
||||
Consider that, while exchanges and bindings survive the loss of individual
|
||||
nodes, queues and their messages do not because a queue and its contents
|
||||
are located on one node. If we lose this node, we also lose the queue.
|
||||
|
||||
Mirrored queues in RabbitMQ improve
|
||||
the availability of service since it is resilient to failures.
|
||||
Mirrored queues in RabbitMQ improve the availability of the service since
they are resilient to failures.
|
||||
|
||||
Production servers should run (at least) three RabbitMQ servers;
|
||||
for testing and demonstration purposes,
|
||||
it is possible to run only two servers.
|
||||
In this section, we configure two nodes,
|
||||
called ``rabbit1`` and ``rabbit2``.
|
||||
To build a broker, we need to ensure
|
||||
that all nodes have the same Erlang cookie file.
|
||||
Production servers should run (at least) three RabbitMQ servers. For testing
and demonstration purposes, it is possible to run only two servers.
|
||||
In this section, we configure two nodes, called ``rabbit1`` and ``rabbit2``.
|
||||
To build a broker, ensure that all nodes have the same Erlang cookie file.
|
||||
|
||||
[TODO: Should the example instead use a minimum of three nodes?]
|
||||
.. [TODO: Should the example instead use a minimum of three nodes?]
|
||||
|
||||
#. To do so, stop RabbitMQ everywhere and copy the cookie
|
||||
from the first node to each of the other node(s):
|
||||
#. Stop RabbitMQ and copy the cookie from the first node to each of the
|
||||
other node(s):
|
||||
|
||||
.. code-block:: console
|
||||
|
||||
# scp /var/lib/rabbitmq/.erlang.cookie root@NODE:/var/lib/rabbitmq/.erlang.cookie
|
||||
|
||||
#. On each target node, verify the correct owner,
|
||||
group, and permissions of the file :file:`erlang.cookie`.
|
||||
group, and permissions of the file :file:`erlang.cookie`:
|
||||
|
||||
.. code-block:: console
|
||||
|
||||
@ -177,9 +142,7 @@ that all nodes have the same Erlang cookie file.
|
||||
# chmod 400 /var/lib/rabbitmq/.erlang.cookie
|
||||
|
||||
#. Start the message queue service on all nodes and configure it to start
|
||||
when the system boots.
|
||||
|
||||
On Ubuntu, it is configured by default.
|
||||
when the system boots. On Ubuntu, it is configured by default.
|
||||
|
||||
On CentOS, RHEL, openSUSE, and SLES:
|
||||
|
||||
@ -216,7 +179,7 @@ that all nodes have the same Erlang cookie file.
|
||||
The default node type is a disc node. In this guide, nodes
|
||||
join the cluster as RAM nodes.
|
||||
|
||||
#. To verify the cluster status:
|
||||
#. Verify the cluster status:
|
||||
|
||||
.. code-block:: console
|
||||
|
||||
@ -225,8 +188,8 @@ that all nodes have the same Erlang cookie file.
|
||||
[{nodes,[{disc,[rabbit@rabbit1]},{ram,[rabbit@NODE]}]}, \
|
||||
{running_nodes,[rabbit@NODE,rabbit@rabbit1]}]
|
||||
|
||||
If the cluster is working,
|
||||
you can create usernames and passwords for the queues.
|
||||
If the cluster is working, you can create usernames and passwords
|
||||
for the queues.
|
||||
|
||||
#. To ensure that all queues except those with auto-generated names
|
||||
are mirrored across all running nodes,
|
||||
@ -255,53 +218,50 @@ More information is available in the RabbitMQ documentation:
|
||||
Configure OpenStack services to use RabbitMQ HA queues
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
We have to configure the OpenStack components
|
||||
to use at least two RabbitMQ nodes.
|
||||
Configure the OpenStack components to use at least two RabbitMQ nodes.
|
||||
|
||||
Do this configuration on all services using RabbitMQ:
|
||||
Use these steps to configure all services using RabbitMQ:
|
||||
|
||||
#. RabbitMQ HA cluster host:port pairs:
|
||||
#. RabbitMQ HA cluster ``host:port`` pairs:
|
||||
|
||||
::
|
||||
.. code-block:: ini
|
||||
|
||||
rabbit_hosts=rabbit1:5672,rabbit2:5672,rabbit3:5672
|
||||
|
||||
#. How frequently to retry connecting with RabbitMQ:
|
||||
[TODO: document the unit of measure here? Seconds?]
|
||||
#. How frequently to retry connecting with RabbitMQ, in seconds:
|
||||
|
||||
::
|
||||
.. code-block:: ini
|
||||
|
||||
rabbit_retry_interval=1
|
||||
|
||||
#. How long to back off between retries when connecting to RabbitMQ, in seconds:
|
||||
[TODO: document the unit of measure here? Seconds?]
|
||||
|
||||
::
|
||||
.. code-block:: ini
|
||||
|
||||
rabbit_retry_backoff=2
|
||||
|
||||
#. The maximum number of retries for connecting to RabbitMQ (infinite by default):
|
||||
|
||||
::
|
||||
.. code-block:: ini
|
||||
|
||||
rabbit_max_retries=0
|
||||
|
||||
#. Use durable queues in RabbitMQ:
|
||||
|
||||
::
|
||||
.. code-block:: ini
|
||||
|
||||
rabbit_durable_queues=true
|
||||
|
||||
#. Use HA queues in RabbitMQ (x-ha-policy: all):
|
||||
#. Use HA queues in RabbitMQ (``x-ha-policy: all``):
|
||||
|
||||
::
|
||||
.. code-block:: ini
|
||||
|
||||
rabbit_ha_queues=true
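
Taken together, these options normally end up in the RabbitMQ section of each
service's configuration file. A consolidated sketch, assuming three nodes
named ``rabbit1``, ``rabbit2``, and ``rabbit3`` (the section name may be
``[oslo_messaging_rabbit]`` or ``[DEFAULT]``, depending on the release):

.. code-block:: ini

   [oslo_messaging_rabbit]
   rabbit_hosts = rabbit1:5672,rabbit2:5672,rabbit3:5672
   rabbit_retry_interval = 1
   rabbit_retry_backoff = 2
   rabbit_max_retries = 0
   rabbit_durable_queues = true
   rabbit_ha_queues = true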
|
||||
|
||||
.. note::
|
||||
|
||||
If you change the configuration from an old set-up
|
||||
that did not use HA queues, you should restart the service:
|
||||
that did not use HA queues, restart the service:
|
||||
|
||||
.. code-block:: console
|
||||
|
||||
|
@ -5,29 +5,20 @@
|
||||
Storage back end
|
||||
================
|
||||
|
||||
Most of this guide concerns the control plane of high availability:
|
||||
ensuring that services continue to run even if a component fails.
|
||||
Ensuring that data is not lost
|
||||
is the data plane component of high availability;
|
||||
this is discussed here.
|
||||
|
||||
An OpenStack environment includes multiple data pools for the VMs:
|
||||
|
||||
- Ephemeral storage is allocated for an instance
|
||||
and is deleted when the instance is deleted.
|
||||
The Compute service manages ephemeral storage.
|
||||
By default, Compute stores ephemeral drives as files
|
||||
on local disks on the Compute node
|
||||
but Ceph RBD can instead be used
|
||||
as the storage back end for ephemeral storage.
|
||||
- Ephemeral storage is allocated for an instance and is deleted when the
|
||||
instance is deleted. The Compute service manages ephemeral storage and,
by default, Compute stores ephemeral drives as files on local disks on the
Compute node. As an alternative, you can use Ceph RBD as the storage back
|
||||
end for ephemeral storage.
|
||||
|
||||
- Persistent storage exists outside all instances.
|
||||
Two types of persistent storage are provided:
|
||||
- Persistent storage exists outside all instances. Two types of persistent
|
||||
storage are provided:
|
||||
|
||||
- Block Storage service (cinder)
|
||||
can use LVM or Ceph RBD as the storage back end.
|
||||
- Image service (glance)
|
||||
can use the Object Storage service (swift)
|
||||
- The Block Storage service (cinder), which can use LVM or Ceph RBD as the
  storage back end.
- The Image service (glance), which can use the Object Storage service (swift)
  or Ceph RBD as the storage back end.
|
||||
|
||||
For more information about configuring storage back ends for
|
||||
@ -35,45 +26,37 @@ the different storage options, see `Manage volumes
|
||||
<http://docs.openstack.org/admin-guide/blockstorage-manage-volumes.html>`_
|
||||
in the OpenStack Administrator Guide.
|
||||
|
||||
This section discusses ways to protect against
|
||||
data loss in your OpenStack environment.
|
||||
This section discusses ways to protect against data loss in your OpenStack
|
||||
environment.
|
||||
|
||||
RAID drives
|
||||
-----------
|
||||
|
||||
Configuring RAID on the hard drives that implement storage
|
||||
protects your data against a hard drive failure.
|
||||
If, however, the node itself fails, data may be lost.
|
||||
Configuring RAID on the hard drives that implement storage protects your data
|
||||
against a hard drive failure. If the node itself fails, data may be lost.
|
||||
In particular, all volumes stored on an LVM node can be lost.
|
||||
|
||||
Ceph
|
||||
----
|
||||
|
||||
`Ceph RBD <http://ceph.com/>`_
|
||||
is an innately high availability storage back end.
|
||||
It creates a storage cluster with multiple nodes
|
||||
that communicate with each other
|
||||
to replicate and redistribute data dynamically.
|
||||
A Ceph RBD storage cluster provides
|
||||
a single shared set of storage nodes
|
||||
that can handle all classes of persistent and ephemeral data
|
||||
-- glance, cinder, and nova --
|
||||
that are required for OpenStack instances.
|
||||
`Ceph RBD <http://ceph.com/>`_ is an inherently highly available storage back
|
||||
end. It creates a storage cluster with multiple nodes that communicate with
|
||||
each other to replicate and redistribute data dynamically.
|
||||
A Ceph RBD storage cluster provides a single shared set of storage nodes that
|
||||
can handle all classes of persistent and ephemeral data (glance, cinder, and
|
||||
nova) that are required for OpenStack instances.
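
As an illustration for the Block Storage service, a hedged sketch of a
``cinder.conf`` back-end section that uses Ceph RBD (the pool, user, and
section names are placeholders for this example):

.. code-block:: ini

   [DEFAULT]
   enabled_backends = ceph-rbd

   [ceph-rbd]
   volume_driver = cinder.volume.drivers.rbd.RBDDriver
   rbd_pool = volumes
   rbd_user = cinder
   rbd_ceph_conf = /etc/ceph/ceph.conf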
|
||||
|
||||
Ceph RBD provides object replication capabilities
|
||||
by storing Block Storage volumes as Ceph RBD objects;
|
||||
Ceph RBD ensures that each replica of an object
|
||||
is stored on a different node.
|
||||
This means that your volumes are protected against
|
||||
hard drive and node failures
|
||||
or even the failure of the data center itself.
|
||||
Ceph RBD provides object replication capabilities by storing Block Storage
|
||||
volumes as Ceph RBD objects. Ceph RBD ensures that each replica of an object
|
||||
is stored on a different node. This means that your volumes are protected
|
||||
against hard drive and node failures, or even the failure of the data center
|
||||
itself.
|
||||
|
||||
When Ceph RBD is used for ephemeral volumes
|
||||
as well as block and image storage, it supports
|
||||
`live migration
|
||||
When Ceph RBD is used for ephemeral volumes as well as block and image storage,
|
||||
it supports `live migration
|
||||
<http://docs.openstack.org/admin-guide/compute-live-migration-usage.html>`_
|
||||
of VMs with ephemeral drives;
|
||||
LVM only supports live migration of volume-backed VMs.
|
||||
of VMs with ephemeral drives. LVM only supports live migration of
|
||||
volume-backed VMs.
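
For reference, ephemeral storage on Ceph RBD is selected in the Compute
configuration. A minimal sketch, assuming a pool named ``vms``:

.. code-block:: ini

   [libvirt]
   images_type = rbd
   images_rbd_pool = vms
   images_rbd_ceph_conf = /etc/ceph/ceph.conf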
|
||||
|
||||
Remote backup facilities
|
||||
------------------------
|
||||
|
@ -2,7 +2,7 @@
|
||||
Highly available Block Storage API
|
||||
==================================
|
||||
|
||||
Cinder provides 'block storage as a service' suitable for performance
|
||||
Cinder provides Block-Storage-as-a-Service suitable for performance
|
||||
sensitive scenarios such as databases, expandable file systems, or
|
||||
providing a server with access to raw block level storage.
|
||||
|
||||
@ -10,7 +10,7 @@ Persistent block storage can survive instance termination and can also
|
||||
be moved across instances like any external storage device. Cinder
|
||||
also offers a volume snapshot capability for backing up volumes.
|
||||
|
||||
Making this Block Storage API service highly available in
|
||||
Making the Block Storage API service highly available in
|
||||
active/passive mode involves:
|
||||
|
||||
- :ref:`ha-blockstorage-pacemaker`
|
||||
@ -18,60 +18,22 @@ active/passive mode involves:
|
||||
- :ref:`ha-blockstorage-services`
|
||||
|
||||
In theory, you can run the Block Storage service as active/active.
|
||||
However, because of sufficient concerns, it is recommended running
|
||||
However, because of unresolved race conditions, we recommend running
|
||||
the volume component as active/passive only.
|
||||
|
||||
Jon Bernard writes:
|
||||
|
||||
::
|
||||
|
||||
Requests are first seen by Cinder in the API service, and we have a
|
||||
fundamental problem there - a standard test-and-set race condition
|
||||
exists for many operations where the volume status is first checked
|
||||
for an expected status and then (in a different operation) updated to
|
||||
a pending status. The pending status indicates to other incoming
|
||||
requests that the volume is undergoing a current operation, however it
|
||||
is possible for two simultaneous requests to race here, which
|
||||
undefined results.
|
||||
|
||||
Later, the manager/driver will receive the message and carry out the
|
||||
operation. At this stage there is a question of the synchronization
|
||||
techniques employed by the drivers and what guarantees they make.
|
||||
|
||||
If cinder-volume processes exist as different process, then the
|
||||
'synchronized' decorator from the lockutils package will not be
|
||||
sufficient. In this case the programmer can pass an argument to
|
||||
synchronized() 'external=True'. If external is enabled, then the
|
||||
locking will take place on a file located on the filesystem. By
|
||||
default, this file is placed in Cinder's 'state directory' in
|
||||
/var/lib/cinder so won't be visible to cinder-volume instances running
|
||||
on different machines.
|
||||
|
||||
However, the location for file locking is configurable. So an
|
||||
operator could configure the state directory to reside on shared
|
||||
storage. If the shared storage in use implements unix file locking
|
||||
semantics, then this could provide the requisite synchronization
|
||||
needed for an active/active HA configuration.
|
||||
|
||||
The remaining issue is that not all drivers use the synchronization
|
||||
methods, and even fewer of those use the external file locks. A
|
||||
sub-concern would be whether they use them correctly.
|
||||
|
||||
You can read more about these concerns on the
|
||||
`Red Hat Bugzilla <https://bugzilla.redhat.com/show_bug.cgi?id=1193229>`_
|
||||
and there is a
|
||||
`psuedo roadmap <https://etherpad.openstack.org/p/cinder-kilo-stabilisation-work>`_
|
||||
for addressing them upstream.
|
||||
|
||||
|
||||
.. _ha-blockstorage-pacemaker:
|
||||
|
||||
Add Block Storage API resource to Pacemaker
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
On RHEL-based systems, you should create resources for cinder's
|
||||
systemd agents and create constraints to enforce startup/shutdown
|
||||
ordering:
|
||||
On RHEL-based systems, create resources for cinder's systemd agents and create
|
||||
constraints to enforce startup/shutdown ordering:
|
||||
|
||||
.. code-block:: console
|
||||
|
||||
@ -115,29 +77,25 @@ and add the following cluster resources:
|
||||
keystone_get_token_url="http://10.0.0.11:5000/v2.0/tokens" \
|
||||
op monitor interval="30s" timeout="30s"
|
||||
|
||||
This configuration creates ``p_cinder-api``,
|
||||
a resource for managing the Block Storage API service.
|
||||
This configuration creates ``p_cinder-api``, a resource for managing the
|
||||
Block Storage API service.
|
||||
|
||||
The command :command:`crm configure` supports batch input,
|
||||
so you may copy and paste the lines above
|
||||
into your live pacemaker configuration and then make changes as required.
|
||||
For example, you may enter ``edit p_ip_cinder-api``
|
||||
from the :command:`crm configure` menu
|
||||
and edit the resource to match your preferred virtual IP address.
|
||||
The command :command:`crm configure` supports batch input. Copy and paste the
|
||||
lines above into your live Pacemaker configuration and then make changes as
|
||||
required. For example, you may enter ``edit p_ip_cinder-api`` from the
|
||||
:command:`crm configure` menu and edit the resource to match your preferred
|
||||
virtual IP address.
|
||||
|
||||
Once completed, commit your configuration changes
|
||||
by entering :command:`commit` from the :command:`crm configure` menu.
|
||||
Pacemaker then starts the Block Storage API service
|
||||
and its dependent resources on one of your nodes.
|
||||
Once completed, commit your configuration changes by entering :command:`commit`
|
||||
from the :command:`crm configure` menu. Pacemaker then starts the Block Storage
|
||||
API service and its dependent resources on one of your nodes.
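
To confirm that the resource is running, you can query the cluster status
once, for example:

.. code-block:: console

   # crm_mon -1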
|
||||
|
||||
.. _ha-blockstorage-configure:
|
||||
|
||||
Configure Block Storage API service
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Edit the ``/etc/cinder/cinder.conf`` file:
|
||||
|
||||
On a RHEL-based system, it should look something like:
|
||||
Edit the ``/etc/cinder/cinder.conf`` file. For example, on a RHEL-based system:
|
||||
|
||||
.. code-block:: ini
|
||||
:linenos:
|
||||
@ -211,19 +169,17 @@ database.
|
||||
|
||||
.. _ha-blockstorage-services:
|
||||
|
||||
Configure OpenStack services to use highly available Block Storage API
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
Configure OpenStack services to use the highly available Block Storage API
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Your OpenStack services must now point their
|
||||
Block Storage API configuration to the highly available,
|
||||
virtual cluster IP address
|
||||
rather than a Block Storage API server’s physical IP address
|
||||
as you would for a non-HA environment.
|
||||
Your OpenStack services must now point their Block Storage API configuration
|
||||
to the highly available, virtual cluster IP address rather than a Block Storage
|
||||
API server’s physical IP address as you would for a non-HA environment.
|
||||
|
||||
You must create the Block Storage API endpoint with this IP.
|
||||
Create the Block Storage API endpoint with this IP.
|
||||
|
||||
If you are using both private and public IP addresses,
|
||||
you should create two virtual IPs and define your endpoint like this:
|
||||
If you are using both private and public IP addresses, create two virtual IPs
|
||||
and define your endpoint. For example:
|
||||
|
||||
.. code-block:: console
|
||||
|
||||
|
@ -14,41 +14,56 @@ in active/passive mode involves:
|
||||
Add Shared File Systems API resource to Pacemaker
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
You must first download the resource agent to your system:
|
||||
#. Download the resource agent to your system:
|
||||
|
||||
.. code-block:: console
|
||||
.. code-block:: console
|
||||
|
||||
# cd /usr/lib/ocf/resource.d/openstack
|
||||
# wget https://git.openstack.org/cgit/openstack/openstack-resource-agents/plain/ocf/manila-api
|
||||
# chmod a+rx *
|
||||
# cd /usr/lib/ocf/resource.d/openstack
|
||||
# wget https://git.openstack.org/cgit/openstack/openstack-resource-agents/plain/ocf/manila-api
|
||||
# chmod a+rx *
|
||||
|
||||
You can now add the Pacemaker configuration for the Shared File Systems
|
||||
API resource. Connect to the Pacemaker cluster with the
|
||||
:command:`crm configure` command and add the following cluster resources:
|
||||
#. Add the Pacemaker configuration for the Shared File Systems
|
||||
API resource. Connect to the Pacemaker cluster with the following
|
||||
command:
|
||||
|
||||
.. code-block:: ini
|
||||
.. code-block:: console
|
||||
|
||||
primitive p_manila-api ocf:openstack:manila-api \
|
||||
params config="/etc/manila/manila.conf" \
|
||||
os_password="secretsecret" \
|
||||
os_username="admin" \
|
||||
os_tenant_name="admin" \
|
||||
keystone_get_token_url="http://10.0.0.11:5000/v2.0/tokens" \
|
||||
op monitor interval="30s" timeout="30s"
|
||||
# crm configure
|
||||
|
||||
This configuration creates ``p_manila-api``, a resource for managing the
|
||||
Shared File Systems API service.
|
||||
.. note::
|
||||
|
||||
The :command:`crm configure` supports batch input, so you may copy and paste
|
||||
the lines above into your live Pacemaker configuration and then make changes
|
||||
as required. For example, you may enter ``edit p_ip_manila-api`` from the
|
||||
:command:`crm configure` menu and edit the resource to match your preferred
|
||||
virtual IP address.
|
||||
The :command:`crm configure` command supports batch input. Copy and paste
|
||||
the lines in the next step into your live Pacemaker configuration and then
|
||||
make changes as required.
|
||||
|
||||
Once completed, commit your configuration changes by entering :command:`commit`
|
||||
from the :command:`crm configure` menu. Pacemaker then starts the
|
||||
Shared File Systems API service and its dependent resources on one of your
|
||||
nodes.
|
||||
For example, you may enter ``edit p_ip_manila-api`` from the
|
||||
:command:`crm configure` menu and edit the resource to match your preferred
|
||||
virtual IP address.
|
||||
|
||||
#. Add the following cluster resources:
|
||||
|
||||
.. code-block:: ini
|
||||
|
||||
primitive p_manila-api ocf:openstack:manila-api \
|
||||
params config="/etc/manila/manila.conf" \
|
||||
os_password="secretsecret" \
|
||||
os_username="admin" \
|
||||
os_tenant_name="admin" \
|
||||
keystone_get_token_url="http://10.0.0.11:5000/v2.0/tokens" \
|
||||
op monitor interval="30s" timeout="30s"
|
||||
|
||||
This configuration creates ``p_manila-api``, a resource for managing the
|
||||
Shared File Systems API service.
|
||||
|
||||
#. Commit your configuration changes by entering the following command
|
||||
from the :command:`crm configure` menu:
|
||||
|
||||
.. code-block:: console
|
||||
|
||||
# commit
|
||||
|
||||
Pacemaker now starts the Shared File Systems API service and its
|
||||
dependent resources on one of your nodes.
|
||||
|
||||
.. _ha-sharedfilesystems-configure:
|
||||
|
||||
|
@ -2,19 +2,21 @@
|
||||
Highly available Image API
|
||||
==========================
|
||||
|
||||
The OpenStack Image service offers a service for discovering,
|
||||
registering, and retrieving virtual machine images.
|
||||
To make the OpenStack Image API service highly available
|
||||
in active / passive mode, you must:
|
||||
The OpenStack Image service offers a service for discovering, registering, and
|
||||
retrieving virtual machine images. To make the OpenStack Image API service
|
||||
highly available in active/passive mode, you must:
|
||||
|
||||
- :ref:`glance-api-pacemaker`
|
||||
- :ref:`glance-api-configure`
|
||||
- :ref:`glance-services`
|
||||
|
||||
This section assumes that you are familiar with the
|
||||
Prerequisites
|
||||
~~~~~~~~~~~~~
|
||||
|
||||
Before beginning, ensure that you are familiar with the
|
||||
documentation for installing the OpenStack Image API service.
|
||||
See the *Image service* section in the
|
||||
`Installation Tutorials and Guides <http://docs.openstack.org/project-install-guide/newton>`_
|
||||
`Installation Tutorials and Guides <http://docs.openstack.org/project-install-guide/newton>`_,
|
||||
depending on your distribution.
|
||||
|
||||
.. _glance-api-pacemaker:
|
||||
@ -22,44 +24,54 @@ depending on your distribution.
|
||||
Add OpenStack Image API resource to Pacemaker
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
You must first download the resource agent to your system:
|
||||
#. Download the resource agent to your system:
|
||||
|
||||
.. code-block:: console
|
||||
.. code-block:: console
|
||||
|
||||
# cd /usr/lib/ocf/resource.d/openstack
|
||||
# wget https://git.openstack.org/cgit/openstack/openstack-resource-agents/plain/ocf/glance-api
|
||||
# chmod a+rx *
|
||||
# cd /usr/lib/ocf/resource.d/openstack
|
||||
# wget https://git.openstack.org/cgit/openstack/openstack-resource-agents/plain/ocf/glance-api
|
||||
# chmod a+rx *
|
||||
|
||||
You can now add the Pacemaker configuration
|
||||
for the OpenStack Image API resource.
|
||||
Use the :command:`crm configure` command
|
||||
to connect to the Pacemaker cluster
|
||||
and add the following cluster resources:
|
||||
#. Add the Pacemaker configuration for the OpenStack Image API resource.
|
||||
Use the following command to connect to the Pacemaker cluster:
|
||||
|
||||
::
|
||||
.. code-block:: console
|
||||
|
||||
primitive p_glance-api ocf:openstack:glance-api \
|
||||
params config="/etc/glance/glance-api.conf" \
|
||||
os_password="secretsecret" \
|
||||
os_username="admin" os_tenant_name="admin" \
|
||||
os_auth_url="http://10.0.0.11:5000/v2.0/" \
|
||||
op monitor interval="30s" timeout="30s"
|
||||
# crm configure
|
||||
|
||||
This configuration creates ``p_glance-api``,
|
||||
a resource for managing the OpenStack Image API service.
|
||||
.. note::
|
||||
|
||||
The :command:`crm configure` command supports batch input,
|
||||
so you may copy and paste the above into your live Pacemaker configuration
|
||||
and then make changes as required.
|
||||
For example, you may enter edit ``p_ip_glance-api``
|
||||
from the :command:`crm configure` menu
|
||||
and edit the resource to match your preferred virtual IP address.
|
||||
The :command:`crm configure` command supports batch input. Copy and paste
|
||||
the lines in the next step into your live Pacemaker configuration and
|
||||
then make changes as required.
|
||||
|
||||
After completing these steps,
|
||||
commit your configuration changes by entering :command:`commit`
|
||||
from the :command:`crm configure` menu.
|
||||
Pacemaker then starts the OpenStack Image API service
|
||||
and its dependent resources on one of your nodes.
|
||||
For example, you may enter ``edit p_ip_glance-api`` from the
|
||||
:command:`crm configure` menu and edit the resource to match your
|
||||
preferred virtual IP address.
|
||||
|
||||
#. Add the following cluster resources:
|
||||
|
||||
.. code-block:: console
|
||||
|
||||
primitive p_glance-api ocf:openstack:glance-api \
|
||||
params config="/etc/glance/glance-api.conf" \
|
||||
os_password="secretsecret" \
|
||||
os_username="admin" os_tenant_name="admin" \
|
||||
os_auth_url="http://10.0.0.11:5000/v2.0/" \
|
||||
op monitor interval="30s" timeout="30s"
|
||||
|
||||
This configuration creates ``p_glance-api``, a resource for managing the
|
||||
OpenStack Image API service.
|
||||
|
||||
#. Commit your configuration changes by entering the following command from
|
||||
the :command:`crm configure` menu:
|
||||
|
||||
.. code-block:: console
|
||||
|
||||
commit
|
||||
|
||||
Pacemaker then starts the OpenStack Image API service and its dependent
|
||||
resources on one of your nodes.
|
||||
|
||||
.. _glance-api-configure:
|
||||
|
||||
@ -67,7 +79,7 @@ Configure OpenStack Image service API
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Edit the :file:`/etc/glance/glance-api.conf` file
|
||||
to configure the OpenStack image service:
|
||||
to configure the OpenStack Image service:
|
||||
|
||||
.. code-block:: ini
|
||||
|
||||
@ -93,20 +105,17 @@ to configure the OpenStack image service:
|
||||
|
||||
.. _glance-services:
|
||||
|
||||
Configure OpenStack services to use highly available OpenStack Image API
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
Configure OpenStack services to use the highly available OpenStack Image API
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Your OpenStack services must now point
|
||||
their OpenStack Image API configuration to the highly available,
|
||||
virtual cluster IP address
|
||||
instead of pointing to the physical IP address
|
||||
of an OpenStack Image API server
|
||||
as you would in a non-HA cluster.
|
||||
Your OpenStack services must now point their OpenStack Image API configuration
|
||||
to the highly available, virtual cluster IP address instead of pointing to the
|
||||
physical IP address of an OpenStack Image API server as you would in a non-HA
|
||||
cluster.
|
||||
|
||||
For OpenStack Compute, for example,
|
||||
if your OpenStack Image API service IP address is 10.0.0.11
|
||||
(as in the configuration explained here),
|
||||
you would use the following configuration in your :file:`nova.conf` file:
|
||||
For example, if your OpenStack Image API service IP address is 10.0.0.11
|
||||
(as in the configuration explained here), you would use the following
|
||||
configuration in your :file:`nova.conf` file:
|
||||
|
||||
.. code-block:: ini
|
||||
|
||||
@ -117,9 +126,8 @@ you would use the following configuration in your :file:`nova.conf` file:
|
||||
|
||||
|
||||
You must also create the OpenStack Image API endpoint with this IP address.
|
||||
If you are using both private and public IP addresses,
|
||||
you should create two virtual IP addresses
|
||||
and define your endpoint like this:
|
||||
If you are using both private and public IP addresses, create two virtual IP
|
||||
addresses and define your endpoint. For example:
|
||||
|
||||
.. code-block:: console
|
||||
|
||||
|