<!DOCTYPE section [
<!-- Some useful entities borrowed from HTML -->
<!ENTITY ndash "–">
<!ENTITY mdash "—">
<!ENTITY hellip "…">
]><section xml:id="configuring-compute-service-groups"
    xmlns="http://docbook.org/ns/docbook"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    xmlns:xi="http://www.w3.org/2001/XInclude"
    xmlns:ns5="http://www.w3.org/1999/xhtml"
    xmlns:ns4="http://www.w3.org/2000/svg"
    xmlns:ns3="http://www.w3.org/1998/Math/MathML"
    xmlns:ns="http://docbook.org/ns/docbook"
    version="5.0">
    <title>Configuring Compute Service Groups</title>
    <para>To effectively manage and use compute nodes, the Compute
        service must know their statuses. For example, when a user
        launches a new VM, the Compute scheduler sends the request to
        a live node that has enough capacity. Beginning with the
        Grizzly release, the Compute service queries the ServiceGroup
        API to get the node liveness information.</para>
    <para>When a compute worker (running the <systemitem
        class="service">nova-compute</systemitem> daemon) starts, it
        calls the join API to join the compute group. Any service
        that is interested in the information, such as the scheduler,
        can then query the group membership or the status of a
        particular node. Internally, the ServiceGroup client driver
        automatically updates the compute worker status.</para>
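    <para>You can check the liveness information that the Compute
        service maintains with the <command>nova service-list</command>
        client command. The output below is abbreviated and
        illustrative; the host names and timestamps depend on your
        deployment:</para>
    <screen><prompt>$</prompt> <userinput>nova service-list</userinput>
<computeroutput>+----------------+------------+----------+---------+-------+----------------------------+
| Binary         | Host       | Zone     | Status  | State | Updated_at                 |
+----------------+------------+----------+---------+-------+----------------------------+
| nova-scheduler | controller | internal | enabled | up    | 2013-10-02T14:23:05.000000 |
| nova-compute   | compute1   | nova     | enabled | up    | 2013-10-02T14:23:01.000000 |
| nova-compute   | compute2   | nova     | enabled | down  | 2013-10-02T14:20:52.000000 |
+----------------+------------+----------+---------+-------+----------------------------+</computeroutput></screen>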
    <para>Two drivers are implemented: database and ZooKeeper. Other
        drivers, such as memcache, are in review or under
        development.</para>
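    <para>You select a driver with the
        <literal>servicegroup_driver</literal> option in
        <filename>/etc/nova/nova.conf</filename>. As a minimal
        illustration (<literal>db</literal> is the default, so this
        line is usually omitted):</para>
    <programlisting language="ini"># Track node liveness through the database (the default driver)
servicegroup_driver="db"</programlisting>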
    <section xml:id="database-servicegroup-driver">
        <title>Database ServiceGroup driver</title>
        <para>By default, Compute uses the database driver to track
            node liveness. In a compute worker, this driver
            periodically sends a database update with a timestamp, in
            effect saying <quote>I'm OK</quote>. A pre-defined timeout
            (<literal>service_down_time</literal>) determines whether
            a node is considered dead.</para>
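        <para>A minimal sketch of the related options in
            <filename>/etc/nova/nova.conf</filename>; the values shown
            are the usual defaults and appear here only for
            illustration:</para>
        <programlisting language="ini"># Seconds between two status updates from each worker
report_interval=10
# Seconds without an update before a node is considered dead
service_down_time=60</programlisting>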
        <para>The driver has limitations that might or might not be
            an issue for you, depending on your setup. The more
            compute worker nodes you have, the more pressure you put
            on the database. By default, the timeout is 60 seconds,
            so it can take some time to detect node failures. You
            could reduce the timeout value, but you must also make
            the database update more frequent, which again increases
            the database workload.</para>
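        <para>For example, to detect failures faster you might lower
            both values, keeping <literal>service_down_time</literal>
            comfortably larger than
            <literal>report_interval</literal>. The numbers below are
            illustrative, and they do increase database load:</para>
        <programlisting language="ini"># Faster failure detection at the cost of more database traffic
report_interval=5
service_down_time=20</programlisting>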
        <para>Fundamentally, the data that describes whether a node
            is alive is transient: after a few seconds, it is
            obsolete. Other data in the database is persistent, such
            as the entries that describe who owns which VMs. However,
            because both kinds of data are stored in the same
            database, they are treated the same way. The ServiceGroup
            abstraction aims to treat them separately.</para>
    </section>
    <section xml:id="zookeeper-servicegroup-driver">
        <title>ZooKeeper ServiceGroup driver</title>
        <para>The ZooKeeper ServiceGroup driver works by using
            ZooKeeper ephemeral nodes. ZooKeeper, in contrast to
            databases, is a distributed system whose load is divided
            among several servers. On a compute worker node, the
            driver creates an ephemeral znode in the group directory
            after establishing a ZooKeeper session. Ephemeral znodes
            have the same lifespan as the session: if the worker node
            or the <systemitem class="service">nova-compute</systemitem>
            daemon crashes, or a network partition separates the
            worker from the ZooKeeper server quorum, the ephemeral
            znode is removed automatically. The driver gets the group
            membership by running the <command>ls</command> command
            in the group directory.</para>
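        <para>You can inspect the membership yourself with the
            ZooKeeper command-line client. The session below is
            illustrative; it assumes the default znode prefix,
            <literal>/servicegroups</literal>, and two hypothetical
            compute hosts:</para>
        <screen><prompt>$</prompt> <userinput>zkCli.sh -server 192.168.2.1:2181</userinput>
<prompt>[zk: 192.168.2.1:2181(CONNECTED) 0]</prompt> <userinput>ls /servicegroups/compute</userinput>
<computeroutput>[compute1, compute2]</computeroutput></screen>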
        <para>To use the ZooKeeper driver, you must install ZooKeeper
            servers and client libraries. Setting up ZooKeeper
            servers is outside the scope of this guide. For the rest
            of this section, assume these servers are installed and
            that their addresses and ports are
            <literal>192.168.2.1:2181</literal>,
            <literal>192.168.2.2:2181</literal>, and
            <literal>192.168.2.3:2181</literal>.</para>
        <para>To use ZooKeeper, you must install client-side Python
            libraries on every nova node:
            <literal>python-zookeeper</literal>, the official
            ZooKeeper Python binding, and
            <literal>evzookeeper</literal>, a library that makes the
            binding work with the eventlet threading model.</para>
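        <para>For example, on an Ubuntu-based node the installation
            might look as follows; package and library names can vary
            by distribution and release:</para>
        <screen><prompt>#</prompt> <userinput>apt-get install python-zookeeper</userinput>
<prompt>#</prompt> <userinput>pip install evzookeeper</userinput></screen>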
        <para>The relevant configuration snippet in the
            <filename>/etc/nova/nova.conf</filename> file on every
            node is:</para>
        <programlisting language="ini">servicegroup_driver="zk"

[zookeeper]
address="192.168.2.1:2181,192.168.2.2:2181,192.168.2.3:2181"</programlisting>
        <xi:include href="../../common/tables/nova-zookeeper.xml"/>
    </section>
</section>