<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE section [
<!ENTITY % openstack SYSTEM "../openstack.ent">
%openstack;
]>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="s-rabbitmq">
<title>Highly available RabbitMQ</title>
<para>RabbitMQ is the default AMQP server used by many OpenStack
services. Making the RabbitMQ service highly available involves:</para>
<itemizedlist>
<listitem>
<para>configuring a DRBD device for use by RabbitMQ,</para>
</listitem>
<listitem>
<para>configuring RabbitMQ to use a data directory residing on
that DRBD device,</para>
</listitem>
<listitem>
<para>selecting and assigning a virtual IP address (VIP) that can freely
float between cluster nodes,</para>
</listitem>
<listitem>
<para>configuring RabbitMQ to listen on that IP address,</para>
</listitem>
<listitem>
<para>managing all resources, including the RabbitMQ daemon itself, with
the Pacemaker cluster manager.</para>
</listitem>
</itemizedlist>
<note>
<para><link xlink:href="http://www.rabbitmq.com/ha.html">Active-active mirrored queues</link>
are another method for configuring RabbitMQ versions 3.3.0 and later
for high availability. You can also manage a RabbitMQ cluster with
active-active mirrored queues using the Pacemaker cluster manager.</para>
</note>
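<para>If you choose the mirrored-queues approach instead of the
DRBD-based setup described below, queue mirroring is enabled through a
RabbitMQ policy rather than through the steps that follow. As a minimal
sketch (the policy name <literal>ha-all</literal> and the catch-all
queue pattern are illustrative values, not requirements), you would run
something like:</para>
<screen><prompt>#</prompt> <userinput>rabbitmqctl set_policy ha-all ".*" '{"ha-mode":"all"}'</userinput></screen>
<para>The remainder of this section describes the DRBD and Pacemaker
based approach.</para>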
<section xml:id="_configure_drbd_2">
<title>Configure DRBD</title>
<para>The Pacemaker-based RabbitMQ server requires a DRBD resource from
which it mounts the <filename>/var/lib/rabbitmq</filename> directory.
In this example, the DRBD resource is simply named
<literal>rabbitmq</literal>:</para>
<formalpara>
<title><literal>rabbitmq</literal> DRBD resource configuration
(<filename>/etc/drbd.d/rabbitmq.res</filename>)</title>
<para>
<programlisting>resource rabbitmq {
  device    minor 1;
  disk      "/dev/data/rabbitmq";
  meta-disk internal;
  on node1 {
    address ipv4 10.0.42.100:7701;
  }
  on node2 {
    address ipv4 10.0.42.254:7701;
  }
}</programlisting>
</para>
</formalpara>
<para>This resource uses an underlying local disk (in DRBD terminology, a
backing device) named <filename>/dev/data/rabbitmq</filename> on both
cluster nodes, <literal>node1</literal> and <literal>node2</literal>.
Normally, this would be an LVM logical volume specifically set aside for
this purpose. The DRBD meta-disk is internal, meaning DRBD-specific
metadata is stored at the end of the disk device itself. The device
is configured to communicate between IPv4 addresses
<literal>10.0.42.100</literal> and <literal>10.0.42.254</literal>,
using TCP port 7701. Once enabled, it will map to a local DRBD block
device with the device minor number 1, that is,
<filename>/dev/drbd1</filename>.</para>
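<para>If such a backing device does not yet exist, you can create it as
an LVM logical volume. The path <filename>/dev/data/rabbitmq</filename>
used above implies a volume group named <literal>data</literal>; the
size below is only an example, so adjust it to your needs and run the
command on both nodes:</para>
<screen><prompt>#</prompt> <userinput>lvcreate --name rabbitmq --size 4G data</userinput></screen>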
<para>Enabling a DRBD resource is explained in detail in
<link xlink:href="http://www.drbd.org/users-guide-8.3/s-first-time-up.html">the DRBD
User's Guide</link>. In brief, the proper sequence of commands is this:</para>
<screen><prompt>#</prompt> <userinput>drbdadm create-md rabbitmq</userinput><co xml:id="CO4-1"/>
<prompt>#</prompt> <userinput>drbdadm up rabbitmq</userinput><co xml:id="CO4-2"/>
<prompt>#</prompt> <userinput>drbdadm -- --force primary rabbitmq</userinput><co xml:id="CO4-3"/></screen>
<calloutlist>
<callout arearefs="CO4-1">
<para>Initializes DRBD metadata and writes the initial set of
metadata to <filename>/dev/data/rabbitmq</filename>. Must be
completed on both nodes.</para>
</callout>
<callout arearefs="CO4-2">
<para>Creates the <filename>/dev/drbd1</filename> device node,
attaches the DRBD device to its backing store, and connects
the DRBD node to its peer. Must be completed on both nodes.</para>
</callout>
<callout arearefs="CO4-3">
<para>Kicks off the initial device synchronization, and puts the
device into the <literal>primary</literal> (readable and writable)
role. See <link xlink:href="http://www.drbd.org/users-guide-8.3/ch-admin.html#s-roles">
Resource roles</link> (from the DRBD User's Guide) for a more
detailed description of the primary and secondary roles in DRBD.
Must be completed on one node only, namely the one where you
are about to continue with creating your filesystem.</para>
</callout>
</calloutlist>
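<para>Before moving on, you can optionally confirm that the resource is
connected and synchronizing; with DRBD 8.3, as referenced in this guide,
the resource status is exposed through <filename>/proc/drbd</filename>:</para>
<screen><prompt>#</prompt> <userinput>cat /proc/drbd</userinput></screen>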
</section>
<section xml:id="_create_a_file_system">
<title>Create a file system</title>
<para>Once the DRBD resource is running and in the primary role (and
potentially still in the process of running the initial device
synchronization), you may proceed with creating the filesystem for
RabbitMQ data. XFS is generally the recommended filesystem:</para>
<screen><prompt>#</prompt> <userinput>mkfs -t xfs /dev/drbd1</userinput></screen>
<para>You may also use the alternate device path for the DRBD device,
which may be easier to remember as it includes the self-explanatory
resource name:</para>
<screen><prompt>#</prompt> <userinput>mkfs -t xfs /dev/drbd/by-res/rabbitmq</userinput></screen>
<para>Once completed, you may safely return the device to the secondary
role. Any ongoing device synchronization will continue in the
background:</para>
<screen><prompt>#</prompt> <userinput>drbdadm secondary rabbitmq</userinput></screen>
</section>
<section xml:id="_prepare_rabbitmq_for_pacemaker_high_availability">
<title>Prepare RabbitMQ for Pacemaker high availability</title>
<para>In order for Pacemaker monitoring to function properly, you must
ensure that RabbitMQ's <filename>.erlang.cookie</filename> files
are identical on all nodes, regardless of whether DRBD is mounted
there or not. The simplest way of doing so is to take an existing
<filename>.erlang.cookie</filename> from one of your nodes, copy
it to the RabbitMQ data directory on the other node, and also
copy it to the DRBD-backed filesystem.</para>
<screen><prompt>#</prompt> <userinput>scp -p /var/lib/rabbitmq/.erlang.cookie node2:/var/lib/rabbitmq/</userinput>
<prompt>#</prompt> <userinput>mount /dev/drbd/by-res/rabbitmq /mnt</userinput>
<prompt>#</prompt> <userinput>cp -a /var/lib/rabbitmq/.erlang.cookie /mnt</userinput>
<prompt>#</prompt> <userinput>umount /mnt</userinput></screen>
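<para>Optionally, confirm that the cookies now match by comparing their
checksums on both nodes, for example:</para>
<screen><prompt>#</prompt> <userinput>md5sum /var/lib/rabbitmq/.erlang.cookie</userinput>
<prompt>#</prompt> <userinput>ssh node2 md5sum /var/lib/rabbitmq/.erlang.cookie</userinput></screen>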
</section>
<section xml:id="_add_rabbitmq_resources_to_pacemaker">
<title>Add RabbitMQ resources to Pacemaker</title>
<para>You may now proceed with adding the Pacemaker configuration for
RabbitMQ resources. Connect to the Pacemaker cluster with
<command>crm configure</command>, and add the following cluster
resources:</para>
<programlisting>primitive p_ip_rabbitmq ocf:heartbeat:IPaddr2 \
  params ip="192.168.42.100" cidr_netmask="24" \
  op monitor interval="10s"
primitive p_drbd_rabbitmq ocf:linbit:drbd \
  params drbd_resource="rabbitmq" \
  op start timeout="90s" \
  op stop timeout="180s" \
  op promote timeout="180s" \
  op demote timeout="180s" \
  op monitor interval="30s" role="Slave" \
  op monitor interval="29s" role="Master"
primitive p_fs_rabbitmq ocf:heartbeat:Filesystem \
  params device="/dev/drbd/by-res/rabbitmq" \
    directory="/var/lib/rabbitmq" \
    fstype="xfs" options="relatime" \
  op start timeout="60s" \
  op stop timeout="180s" \
  op monitor interval="60s" timeout="60s"
primitive p_rabbitmq ocf:rabbitmq:rabbitmq-server \
  params nodename="rabbit@localhost" \
    mnesia_base="/var/lib/rabbitmq" \
  op monitor interval="20s" timeout="10s"
group g_rabbitmq p_ip_rabbitmq p_fs_rabbitmq p_rabbitmq
ms ms_drbd_rabbitmq p_drbd_rabbitmq \
  meta notify="true" master-max="1" clone-max="2"
colocation c_rabbitmq_on_drbd inf: g_rabbitmq ms_drbd_rabbitmq:Master
order o_drbd_before_rabbitmq inf: ms_drbd_rabbitmq:promote g_rabbitmq:start</programlisting>
<para>This configuration creates</para>
<itemizedlist>
<listitem>
<para><literal>p_ip_rabbitmq</literal>, a virtual IP address for
use by RabbitMQ (<literal>192.168.42.100</literal>),</para>
</listitem>
<listitem>
<para><literal>p_fs_rabbitmq</literal>, a Pacemaker-managed
filesystem mounted to <filename>/var/lib/rabbitmq</filename>
on whatever node currently runs the RabbitMQ service,</para>
</listitem>
<listitem>
<para><literal>ms_drbd_rabbitmq</literal>, the master/slave set
managing the <literal>rabbitmq</literal> DRBD resource,</para>
</listitem>
<listitem>
<para>a service group and order and colocation constraints to
ensure resources are started on the correct nodes, and in the
correct sequence.</para>
</listitem>
</itemizedlist>
<para><command>crm configure</command> supports batch input, so you
may copy and paste the above into your live Pacemaker configuration,
and then make changes as required. For example, you may enter
<literal>edit p_ip_rabbitmq</literal> from the
<command>crm configure</command> menu and edit the resource to
match your preferred virtual IP address.</para>
<para>Once completed, commit your configuration changes by entering
<literal>commit</literal> from the <command>crm configure</command>
menu. Pacemaker will then start the RabbitMQ service, and its
dependent resources, on one of your nodes.</para>
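<para>You can then check that the resources have come up on one of the
nodes with a one-shot cluster status query:</para>
<screen><prompt>#</prompt> <userinput>crm_mon -1</userinput></screen>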
</section>
<section xml:id="_configure_openstack_services_for_highly_available_rabbitmq">
<title>Configure OpenStack services for highly available RabbitMQ</title>
<para>Your OpenStack services must now point their RabbitMQ
configuration to the highly available, virtual cluster IP
address, rather than a RabbitMQ server's physical IP address
as you normally would.</para>
<para>For OpenStack Image, for example, if your RabbitMQ service
IP address is <literal>192.168.42.100</literal> as in the
configuration explained here, you would use the following line
in your OpenStack Image API configuration file
(<filename>glance-api.conf</filename>):</para>
<programlisting language="ini">rabbit_host = 192.168.42.100</programlisting>
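<para>Other OpenStack services that use RabbitMQ are pointed at the
virtual IP address in the same way. As an illustration only (the exact
option name and configuration section depend on your release, so verify
this against your configuration reference), OpenStack Compute would carry
the same value in <filename>nova.conf</filename>:</para>
<programlisting language="ini">rabbit_host = 192.168.42.100</programlisting>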
<para>No other changes are necessary to your OpenStack configuration.
If the node currently hosting your RabbitMQ server experiences a problem
necessitating service failover, your OpenStack services may
experience a brief RabbitMQ interruption, as they would in the
event of a network hiccup, and then continue to run normally.</para>
</section>
</section>