openstack-manuals/doc/admin-guide-cloud/compute/section_compute-configure-service-groups.xml
ghezal sherdil 362ff748e1 made minor change to section_compute-configure-service-groups.xml
per convention doc when referring to services use Compute instead of nova

Change-Id: I33c6be91ff7bc8188d0f072fe1feed51601057f4
2015-06-19 08:51:09 +02:00

<!DOCTYPE section [
<!ENTITY % openstack SYSTEM "../../common/entities/openstack.ent">
%openstack;
]><section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="configuring-compute-service-groups">
<title>Configure Compute service groups</title>
<para>The Compute service must know the status of each compute node to
manage and use the nodes effectively. Node status matters in events such
as a user launching a new VM, the scheduler sending a request to a live
node, or a query to the ServiceGroup API to determine whether a node is
live.</para>
<para>When a compute worker running the
<systemitem class="service">nova-compute</systemitem> daemon starts, it
calls the <systemitem>join</systemitem> API to join the compute group. Any
service (such as the scheduler) can query the group's membership and the
status of its nodes. Internally, the <systemitem>ServiceGroup</systemitem>
client driver automatically updates the compute worker status.</para>
<!-- I don't understand what "available" means in this context. Can someone please help? LKB
<para>The database, ZooKeeper, and Memcache drivers are available.</para>-->
<section xml:id="database-servicegroup-driver">
<title>Database ServiceGroup driver</title>
<para>By default, Compute uses the database driver to track whether a node
is live. In a compute worker, this driver periodically sends a
<command>db update</command> command to the database, saying <quote>I'm
OK</quote> with a timestamp. Compute uses a pre-defined timeout
(<literal>service_down_time</literal>) to determine whether a node is
dead.</para>
<para>The driver has limitations, which can be problematic depending on
your environment. If many compute worker nodes need to be checked, the
database can come under heavy load. Status updates that are delayed by the
load can exceed the timeout, so a live node could incorrectly be considered
dead. By default, the timeout is 60 seconds. Reducing the timeout value can
help in this situation, but you must also make the database update more
frequently, which again increases the database workload.</para>
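<para>As a minimal sketch, the following
<filename>/etc/nova/nova.conf</filename> settings show how the timeout
relates to the update frequency. The sketch assumes that the
<literal>report_interval</literal> option sets how often each worker
writes its status. The values are illustrative, not recommendations;
keep the report interval well below the timeout so that one delayed
update does not mark a live node as dead.</para>
<programlisting language="ini">[DEFAULT]
# Driver for the ServiceGroup service (the database driver is the default)
servicegroup_driver="db"
# Seconds between status updates written by each worker (assumed option)
report_interval=10
# Seconds since the last update after which a node is considered dead
service_down_time=60</programlisting>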
<para>The database contains data that is both transient (such as whether
the node is alive) and persistent (such as entries for VM owners). With
the ServiceGroup abstraction, Compute can treat each type separately.</para>
</section>
<section xml:id="zookeeper-servicegroup-driver">
<title>ZooKeeper ServiceGroup driver</title>
<para>The ZooKeeper ServiceGroup driver works by using ZooKeeper ephemeral
nodes. ZooKeeper, unlike databases, is a distributed system, with its
load divided among several servers. On a compute worker node, the driver
establishes a ZooKeeper session, then creates an ephemeral znode
in the group directory. Ephemeral znodes have the same lifespan as the
session. If the worker node or the
<systemitem class="service">nova-compute</systemitem> daemon crashes, or
a network partition occurs between the worker and the ZooKeeper
server quorum, the ephemeral znodes are removed automatically. The
driver can obtain the group membership by running the <command>ls</command>
command in the group directory.</para>
<para>The ZooKeeper driver requires the ZooKeeper servers and client
libraries. Setting up ZooKeeper servers is outside the scope of this
guide (for more information, see
<link xlink:href="http://zookeeper.apache.org/">Apache ZooKeeper</link>).
These client-side Python libraries must be installed on every compute node:</para>
<variablelist>
<varlistentry>
<term><literal>python-zookeeper</literal></term>
<listitem><para>The official ZooKeeper Python binding</para></listitem>
</varlistentry>
<varlistentry>
<term><literal>evzookeeper</literal></term>
<listitem><para>This library makes the binding work with the eventlet
threading model.</para></listitem>
</varlistentry>
</variablelist>
<para>This example assumes the ZooKeeper server addresses and ports are
<literal>192.168.2.1:2181</literal>, <literal>192.168.2.2:2181</literal>,
and <literal>192.168.2.3:2181</literal>.</para>
<para>These values in the <filename>/etc/nova/nova.conf</filename> file are
required on every node for the ZooKeeper driver:</para>
<programlisting language="ini">[DEFAULT]
# Driver for the ServiceGroup service
servicegroup_driver="zk"

[zookeeper]
# Comma-separated ZooKeeper server addresses in host:port format
address="192.168.2.1:2181,192.168.2.2:2181,192.168.2.3:2181"</programlisting>
<para>To customize the Compute service groups, use these configuration
option settings:</para>
<xi:include href="../../common/tables/nova-zookeeper.xml"/>
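<para>For illustration, the following sketch sets additional options in the
<literal>[zookeeper]</literal> section. The option names
<literal>recv_timeout</literal>, <literal>sg_prefix</literal>, and
<literal>sg_retry</literal>, and the values shown, are assumptions; check
them against the configuration table for your release before use.</para>
<programlisting language="ini">[zookeeper]
# Receive timeout for the ZooKeeper session (assumed option and value)
recv_timeout=4000
# Path prefix under which the ephemeral znodes are stored (assumed option and value)
sg_prefix="/servicegroups"
# Seconds to wait before retrying to join the session (assumed option and value)
sg_retry=5</programlisting>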
</section>
<section xml:id="memcache-servicegroup-driver">
<title>Memcache ServiceGroup driver</title>
<para>The <systemitem>memcache</systemitem> ServiceGroup driver uses
<systemitem>memcached</systemitem>, a distributed memory object
caching system that is used to increase site performance. For more
details, see <link xlink:href="http://memcached.org/">memcached.org</link>.</para>
<para>To use the <systemitem>memcache</systemitem> driver, you must
install <systemitem>memcached</systemitem>. You might already have
it installed, because <systemitem>memcached</systemitem> is also used by
OpenStack Object Storage and the OpenStack dashboard. If you need to
install <systemitem>memcached</systemitem>, see the instructions in the <link
xlink:href="http://docs.openstack.org/"><citetitle>OpenStack
Installation Guide</citetitle></link>.</para>
<para>These values in the <filename>/etc/nova/nova.conf</filename> file
are required on every node for the <systemitem>memcache</systemitem>
driver:</para>
<programlisting language="ini"># Driver for the ServiceGroup service
servicegroup_driver="mc"
# Memcached servers. Use either a list of memcached servers to use for caching (list value),
# or "&lt;None>" for in-process caching (default).
memcached_servers=&lt;None>
# Timeout; maximum time since last check-in for up service (integer value).
# Helps to define whether a node is dead
service_down_time=60</programlisting>
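<para>As a sketch of the list form of
<literal>memcached_servers</literal>, the following example points the
driver at two <systemitem>memcached</systemitem> servers. The addresses
are hypothetical, and the port assumes the
<systemitem>memcached</systemitem> default of 11211; substitute the
values for your deployment.</para>
<programlisting language="ini"># Driver for the ServiceGroup service
servicegroup_driver="mc"
# Hypothetical memcached servers, comma-separated, in host:port format
memcached_servers=192.168.2.1:11211,192.168.2.2:11211</programlisting>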
</section>
</section>