Moved security hardening HowTos from Config Ref to Cloud Admin
Moved procedures and theory over to the admin guide. Also moved node recovery into its own file. Edited trusted-flavor procedure with minor edits on rest. Change-Id: I060d79271130d49b9c6b37638943e2f85ffae5cd Partial-Bug: #290687
Parent: 469af0158b
Commit: 0ccb2136b4
doc/admin-guide-cloud/compute/section_compute-recover-nodes.xml (new file, 405 lines)
@@ -0,0 +1,405 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<section xml:id="section_nova-compute-node-down"
|
||||
xmlns="http://docbook.org/ns/docbook"
|
||||
xmlns:xi="http://www.w3.org/2001/XInclude"
|
||||
xmlns:xlink="http://www.w3.org/1999/xlink"
|
||||
version="5.0">
|
||||
<title>Recover from a failed compute node</title>
|
||||
<para>If you deployed Compute with a shared file system, you can quickly recover from a failed
|
||||
compute node. Of the two methods covered in these sections, evacuating is the preferred
|
||||
method even in the absence of shared storage. Evacuating provides many benefits over manual
|
||||
recovery, such as re-attachment of volumes and floating IPs.</para>
|
||||
<xi:include href="../../common/section_cli_nova_evacuate.xml"/>
|
||||
<section xml:id="nova-compute-node-down-manual-recovery">
|
||||
<title>Manual recovery</title>
|
||||
<para>To recover a KVM/libvirt compute node, see the previous section. Use the
|
||||
following procedure for all other hypervisors.</para>
|
||||
<procedure>
|
||||
<title>Review host information</title>
|
||||
<step>
|
||||
<para>Identify the VMs on the affected hosts, using tools such as a
|
||||
combination of <literal>nova list</literal> and <literal>nova show</literal> or
|
||||
<literal>euca-describe-instances</literal>. For example, the following
|
||||
output displays information about instance <systemitem>i-000015b9</systemitem>
|
||||
that is running on node <systemitem>np-rcc54</systemitem>:</para>
|
||||
<screen><prompt>$</prompt> <userinput>euca-describe-instances</userinput>
|
||||
<computeroutput>i-000015b9 at3-ui02 running nectarkey (376, np-rcc54) 0 m1.xxlarge 2012-06-19T00:48:11.000Z 115.146.93.60</computeroutput></screen>
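            <para>If you prefer the <command>nova</command> client, a roughly equivalent
                check might look like the following sketch (the <literal>--host</literal>
                filter requires administrative credentials):</para>
            <screen><prompt>$</prompt> <userinput>nova list --host np-rcc54 --all-tenants</userinput></screen>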
|
||||
</step>
|
||||
<step>
|
||||
            <para>Review the status of the host by querying the Compute database. Some of the
                important fields are shown in the example output below. The following example converts an
                EC2 API instance ID to an OpenStack ID; if you used the
                <literal>nova</literal> commands, you can substitute the ID directly. You
                can find the credentials for your database in
                <filename>/etc/nova.conf</filename>.</para>
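            <para>For example, if the connection string in <filename>/etc/nova.conf</filename>
                points at a MySQL server, you might open a database session as follows (the host
                name and password shown are placeholders):</para>
            <screen><prompt>$</prompt> <userinput>grep sql_connection /etc/nova.conf</userinput>
<computeroutput>sql_connection=mysql://nova:secretword@cloud-controller/nova</computeroutput>
<prompt>$</prompt> <userinput>mysql -h cloud-controller -u nova -p nova</userinput></screen>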
|
||||
<screen><prompt>mysql></prompt> <userinput>SELECT * FROM instances WHERE id = CONV('15b9', 16, 10) \G;</userinput>
|
||||
<computeroutput>*************************** 1. row ***************************
|
||||
created_at: 2012-06-19 00:48:11
|
||||
updated_at: 2012-07-03 00:35:11
|
||||
deleted_at: NULL
|
||||
...
|
||||
id: 5561
|
||||
...
|
||||
power_state: 5
|
||||
vm_state: shutoff
|
||||
...
|
||||
hostname: at3-ui02
|
||||
host: np-rcc54
|
||||
...
|
||||
uuid: 3f57699a-e773-4650-a443-b4b37eed5a06
|
||||
...
|
||||
task_state: NULL
|
||||
... </computeroutput></screen></step>
|
||||
</procedure>
|
||||
<procedure>
|
||||
<title>Recover the VM</title>
|
||||
<step>
|
||||
<para>After you have determined the status of the VM on the failed host,
|
||||
decide to which compute host the affected VM should be moved. For example, run
|
||||
the following database command to move the VM to
|
||||
<systemitem>np-rcc46</systemitem>:</para>
|
||||
<screen><prompt>mysql></prompt> <userinput>UPDATE instances SET host = 'np-rcc46' WHERE uuid = '3f57699a-e773-4650-a443-b4b37eed5a06';</userinput></screen>
|
||||
</step>
|
||||
<step>
|
||||
<para>If using a hypervisor that relies on libvirt (such as KVM), it is a
|
||||
good idea to update the <literal>libvirt.xml</literal> file (found in
|
||||
<literal>/var/lib/nova/instances/[instance ID]</literal>). The important
|
||||
changes to make are:</para>
|
||||
<para>
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>Change the <literal>DHCPSERVER</literal> value to the host IP
|
||||
address of the compute host that is now the VM's new
|
||||
home.</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
                        <para>Update the VNC IP to <literal>0.0.0.0</literal>, if it is not
                            already set to that value.</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
</para>
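            <para>For example, assuming the instance above and purely illustrative IP addresses,
                you might apply both changes with <command>sed</command>. The exact attribute
                layout inside <literal>libvirt.xml</literal> depends on your libvirt and Compute
                versions, so verify the patterns before running them:</para>
            <programlisting language="bash"># Hypothetical example: 10.0.0.46 stands in for the new compute host's IP address.
cd /var/lib/nova/instances/3f57699a-e773-4650-a443-b4b37eed5a06
# Point the DHCPSERVER filter parameter at the new compute host.
sed -i "s/name='DHCPSERVER' value='[0-9.]*'/name='DHCPSERVER' value='10.0.0.46'/" libvirt.xml
# Make the VNC server listen on all interfaces.
sed -i "s/listen='[0-9.]*'/listen='0.0.0.0'/" libvirt.xml</programlisting>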
|
||||
</step>
|
||||
<step>
|
||||
<para>Reboot the VM:</para>
|
||||
<screen><prompt>$</prompt> <userinput>nova reboot --hard 3f57699a-e773-4650-a443-b4b37eed5a06</userinput></screen>
|
||||
</step>
|
||||
</procedure>
|
||||
<para>In theory, the above database update and <literal>nova
|
||||
reboot</literal> command are all that is required to recover a VM from a
|
||||
failed host. However, if further problems occur, consider looking at
|
||||
recreating the network filter configuration using <literal>virsh</literal>,
|
||||
restarting the Compute services or updating the <literal>vm_state</literal>
|
||||
and <literal>power_state</literal> in the Compute database.</para>
|
||||
</section>
|
||||
<section xml:id="section_nova-uid-mismatch">
|
||||
<title>Recover from a UID/GID mismatch</title>
|
||||
        <para>When you run OpenStack Compute with a shared file system or an automated
            configuration tool, you could encounter a situation where some files on your compute
            node use the wrong UID or GID. This causes a number of errors, such as being
            unable to live migrate or start virtual machines.</para>
|
||||
        <para>The following procedure, which applies to <systemitem class="service"
            >nova-compute</systemitem> hosts that run the KVM hypervisor, can help you
            restore the correct ownership:</para>
|
||||
<procedure>
|
||||
<title>To recover from a UID/GID mismatch</title>
|
||||
<step>
|
||||
                <para>Ensure that the UID and GID numbers you choose are not already used by
                    another user or group.</para>
|
||||
</step>
|
||||
<step>
|
||||
<para>Set the nova uid in <filename>/etc/passwd</filename> to the same number in
|
||||
all hosts (for example, 112).</para>
|
||||
</step>
|
||||
<step>
|
||||
<para>Set the libvirt-qemu uid in
|
||||
<filename>/etc/passwd</filename> to the
|
||||
same number in all hosts (for example,
|
||||
119).</para>
|
||||
</step>
|
||||
<step>
|
||||
                <para>Set the nova group in the
                    <filename>/etc/group</filename> file to
                    the same number in all hosts (for example,
                    120).</para>
|
||||
</step>
|
||||
<step>
|
||||
                <para>Set the libvirtd group in the
                    <filename>/etc/group</filename> file to
                    the same number in all hosts (for example,
                    119).</para>
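                <para>For example, one possible way to apply the UID and GID changes described
                    in the previous steps (using the example numbers) is:</para>
                <screen><prompt>#</prompt> <userinput>usermod -u 112 nova</userinput>
<prompt>#</prompt> <userinput>usermod -u 119 libvirt-qemu</userinput>
<prompt>#</prompt> <userinput>groupmod -g 120 nova</userinput>
<prompt>#</prompt> <userinput>groupmod -g 119 libvirtd</userinput></screen>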
|
||||
</step>
|
||||
<step>
|
||||
<para>Stop the services on the compute
|
||||
node.</para>
|
||||
</step>
|
||||
<step>
|
||||
<para>Change all the files owned by user <systemitem>nova</systemitem> or by
|
||||
group <systemitem>nova</systemitem>. For example:</para>
|
||||
<screen><prompt>#</prompt> <userinput>find / -uid 108 -exec chown nova {} \; </userinput># note the 108 here is the old nova uid before the change
|
||||
<prompt>#</prompt> <userinput>find / -gid 120 -exec chgrp nova {} \;</userinput></screen>
|
||||
</step>
|
||||
<step>
|
||||
                <para>Repeat the previous step for files owned by the libvirt-qemu user or the
                    libvirtd group, if those IDs changed.</para>
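                <para>For example, assuming the old libvirt-qemu UID was 107 and the old libvirtd
                    GID was 118 (placeholder values only):</para>
                <screen><prompt>#</prompt> <userinput>find / -uid 107 -exec chown libvirt-qemu {} \;</userinput>
<prompt>#</prompt> <userinput>find / -gid 118 -exec chgrp libvirtd {} \;</userinput></screen>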
|
||||
</step>
|
||||
<step>
|
||||
<para>Restart the services.</para>
|
||||
</step>
|
||||
<step>
|
||||
                <para>Run the <command>find</command>
                    command again to verify that all files
                    use the correct identifiers.</para>
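                <para>For example, assuming 108 was the old nova UID, the following command
                    should return no files:</para>
                <screen><prompt>#</prompt> <userinput>find / -uid 108</userinput></screen>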
|
||||
</step>
|
||||
</procedure>
|
||||
</section>
|
||||
<section xml:id="section_nova-disaster-recovery-process">
|
||||
<title>Recover cloud after disaster</title>
|
||||
<para>Use the following procedures to manage your cloud after a disaster, and to easily
|
||||
back up its persistent storage volumes. Backups <emphasis role="bold">are</emphasis>
|
||||
mandatory, even outside of disaster scenarios.</para>
|
||||
        <para>For a definition of a disaster recovery plan (DRP), see <link
            xlink:href="http://en.wikipedia.org/wiki/Disaster_Recovery_Plan"
            >http://en.wikipedia.org/wiki/Disaster_Recovery_Plan</link>.</para>
|
||||
<simplesect>
|
||||
<title>Disaster recovery example</title>
|
||||
<para>A disaster could happen to several components of your architecture (for
|
||||
example, a disk crash, a network loss, or a power cut). In this example, the
|
||||
following components are configured:</para>
|
||||
<orderedlist>
|
||||
<listitem>
|
||||
<para>A cloud controller (<systemitem>nova-api</systemitem>,
|
||||
<systemitem>nova-objectstore</systemitem>,
|
||||
<systemitem>nova-network</systemitem>)</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>A compute node (<systemitem
|
||||
class="service"
|
||||
>nova-compute</systemitem>)</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>A Storage Area Network (SAN) used by OpenStack Block Storage
|
||||
(<systemitem class="service">cinder-volumes</systemitem>)</para>
|
||||
</listitem>
|
||||
</orderedlist>
|
||||
            <para>The worst disaster for a cloud is a power loss, which affects all three
                components. Before the power loss, the environment is in the following state:</para>
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
                    <para>From the SAN to the cloud controller, we have an active iSCSI session
                        (used for the "cinder-volumes" LVM volume group).</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>From the cloud controller to the compute node, we also have active
|
||||
iSCSI sessions (managed by <systemitem class="service"
|
||||
>cinder-volume</systemitem>).</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
                    <para>For every volume, an iSCSI session is established (so 14 volumes equal
                        14 sessions).</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
                    <para>From the cloud controller to the compute node, we also have
                        iptables/ebtables rules which allow access from the cloud controller to the running
                        instance.</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
                    <para>Finally, the database stores the current state of the instances (in this
                        case, "running") and their volume attachments (mount point, volume ID,
                        volume status, and so on).</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
<para>After the power loss occurs and all hardware components restart:</para>
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>From the SAN to the cloud, the iSCSI session no longer exists.</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>From the cloud controller to the compute node, the iSCSI sessions no
|
||||
longer exist.</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>From the cloud controller to the compute node, the iptables and
|
||||
ebtables are recreated, since at boot, <systemitem>nova-network</systemitem>
|
||||
reapplies configurations.</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>From the cloud controller, instances are in a shutdown state (because
|
||||
they are no longer running).</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>In the database, data was not updated at all, since Compute could not
|
||||
have anticipated the crash.</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
            <para>Before going further, and to prevent the administrator from making fatal
                mistakes, note that <emphasis role="bold">the instances won't be lost</emphasis>:
                because no "<command>destroy</command>" or "<command>terminate</command>" command
                was invoked, the files for the instances remain on the compute node.</para>
|
||||
<para>Perform these tasks in the following order.
|
||||
<warning><para>Do not add any extra steps at this stage.</para></warning></para>
|
||||
<para>
|
||||
<orderedlist>
|
||||
<listitem>
|
||||
<para>Get the current relation from a
|
||||
volume to its instance, so that you
|
||||
can recreate the attachment.</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>Update the database to clean the
|
||||
stalled state. (After that, you cannot
|
||||
perform the first step).</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>Restart the instances. In other
|
||||
words, go from a shutdown to running
|
||||
state.</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>After the restart, reattach the volumes to their respective
|
||||
instances (optional).</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>SSH into the instances to reboot them.</para>
|
||||
</listitem>
|
||||
</orderedlist>
|
||||
</para>
|
||||
</simplesect>
|
||||
<simplesect>
|
||||
<title>Recover after a disaster</title>
|
||||
<procedure>
|
||||
<title>To perform disaster recovery</title>
|
||||
<step>
|
||||
<title>Get the instance-to-volume
|
||||
relationship</title>
|
||||
<para>You must determine the current relationship from a volume to its
|
||||
instance, because you will re-create the attachment.</para>
|
||||
<para>You can find this relationship by running <command>nova
|
||||
volume-list</command>. Note that the <command>nova</command> client
|
||||
includes the ability to get volume information from OpenStack Block
|
||||
Storage.</para>
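                    <para>Alternatively, before you clean the database in the next step, you can
                        read the relationship directly from the Block Storage database. The column
                        names below match the update queries in the next step; the
                        <literal>attach_status</literal> value is illustrative:</para>
                    <screen><prompt>mysql></prompt> <userinput>SELECT id, instance_id, mountpoint FROM cinder.volumes WHERE attach_status = 'attached';</userinput></screen>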
|
||||
</step>
|
||||
<step>
|
||||
<title>Update the database</title>
|
||||
                    <para>Update the database to clean the stalled state. The following queries
                        restore the state of every volume; run them to clean up the database:</para>
|
||||
<screen><prompt>mysql></prompt> <userinput>use cinder;</userinput>
|
||||
<prompt>mysql></prompt> <userinput>update volumes set mountpoint=NULL;</userinput>
|
||||
<prompt>mysql></prompt> <userinput>update volumes set status="available" where status <>"error_deleting";</userinput>
|
||||
<prompt>mysql></prompt> <userinput>update volumes set attach_status="detached";</userinput>
|
||||
<prompt>mysql></prompt> <userinput>update volumes set instance_id=0;</userinput></screen>
|
||||
<para>You can then run <command>nova volume-list</command> commands to list
|
||||
all volumes.</para>
|
||||
</step>
|
||||
<step>
|
||||
<title>Restart instances</title>
|
||||
<para>Restart the instances using the <command>nova reboot
|
||||
<replaceable>$instance</replaceable></command> command.</para>
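                    <para>For example, a minimal loop over a hypothetical file that lists one
                        instance UUID per line would be:</para>
                    <programlisting language="bash">#!/bin/bash
# instances_tmp_file is an assumed helper file with one instance UUID per line.
instances_tmp_file=/tmp/instance_uuids.txt

while read instance; do
    nova reboot $instance
done < $instances_tmp_file</programlisting>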
|
||||
                    <para>At this stage, depending on your image, some instances completely
                        reboot and become reachable, while others stop at the "plymouth"
                        stage.</para>
|
||||
</step>
|
||||
<step>
|
||||
<title>DO NOT reboot a second time</title>
|
||||
<para>Do not reboot instances that are stopped at this point. Instance state
|
||||
depends on whether you added an <filename>/etc/fstab</filename> entry for
|
||||
that volume. Images built with the <package>cloud-init</package> package
|
||||
remain in a pending state, while others skip the missing volume and start.
|
||||
The idea of that stage is only to ask Compute to reboot every instance, so
|
||||
the stored state is preserved. For more information about
|
||||
<package>cloud-init</package>, see <link
|
||||
xlink:href="https://help.ubuntu.com/community/CloudInit"
|
||||
>help.ubuntu.com/community/CloudInit</link>.</para>
|
||||
</step>
|
||||
<step>
|
||||
<title>Reattach volumes</title>
|
||||
                    <para>After the restart, and once Compute has restored the correct status, you can
                        reattach the volumes to their respective instances by using the <command>nova
                        volume-attach</command> command. The following snippet uses a file of
                        listed volumes to reattach them:</para>
|
||||
<programlisting language="bash">#!/bin/bash
|
||||
|
||||
while read line; do
|
||||
volume=`echo $line | $CUT -f 1 -d " "`
|
||||
instance=`echo $line | $CUT -f 2 -d " "`
|
||||
mount_point=`echo $line | $CUT -f 3 -d " "`
|
||||
echo "ATTACHING VOLUME FOR INSTANCE - $instance"
|
||||
nova volume-attach $instance $volume $mount_point
|
||||
sleep 2
|
||||
done < $volumes_tmp_file</programlisting>
|
||||
                    <para>At this stage, instances that were waiting in the boot sequence
                        (<application>plymouth</application>) automatically continue booting and
                        start normally, while the instances that had already booted can now see
                        the volume.</para>
|
||||
</step>
|
||||
<step>
|
||||
<title>SSH into instances</title>
|
||||
                    <para>If some services depend on the volume, or if a volume has an entry
                        in <systemitem>fstab</systemitem>, you should now simply restart the
                        instance. This restart needs to be made from the instance itself, not
                        through <command>nova</command>.</para>
|
||||
<para>SSH into the instance and perform a reboot:</para>
|
||||
<screen><prompt>#</prompt> <userinput>shutdown -r now</userinput></screen>
|
||||
</step>
|
||||
</procedure>
|
||||
<para>By completing this procedure, you can
|
||||
successfully recover your cloud.</para>
|
||||
<note>
|
||||
<para>Follow these guidelines:</para>
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
                        <para>Use the <parameter>errors=remount-ro</parameter> mount option in the
                            <filename>fstab</filename> file, which prevents data
                            corruption.</para>
|
||||
                        <para>The system locks any write to the disk if it detects an I/O error.
                            Add this configuration option to the <filename>fstab</filename> file on
                            the <systemitem class="service">cinder-volume</systemitem> server (the
                            one which performs the iSCSI connection to the SAN), and also to the
                            instances' <filename>fstab</filename> files.</para>
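                        <para>For example, an instance's <filename>fstab</filename> entry might
                            look like this (device, mount point, and file system are
                            illustrative):</para>
                        <programlisting>/dev/vdb /mnt/data ext4 defaults,errors=remount-ro 0 2</programlisting>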
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>Do not add the entry for the SAN's disks to the <systemitem
|
||||
class="service">cinder-volume</systemitem>'s
|
||||
<filename>fstab</filename> file.</para>
|
||||
                        <para>Some systems hang on that step, which means you could lose access to
                            your cloud controller. To re-establish the session manually, run the
                            following commands before performing the mount:
                            <screen><prompt>#</prompt> <userinput>iscsiadm -m discovery -t st -p $SAN_IP</userinput>
<prompt>#</prompt> <userinput>iscsiadm -m node --target-name $IQN -p $SAN_IP -l</userinput></screen></para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>For your instances, if you have the whole <filename>/home/</filename>
|
||||
directory on the disk, leave a user's directory with the user's bash
|
||||
files and the <filename>authorized_keys</filename> file (instead of
|
||||
emptying the <filename>/home</filename> directory and mapping the disk
|
||||
on it).</para>
|
||||
<para>This enables you to connect to the instance, even without the volume
|
||||
attached, if you allow only connections through public keys.</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
</note>
|
||||
</simplesect>
|
||||
<simplesect>
|
||||
<title>Script the DRP</title>
|
||||
            <para>From <link
                xlink:href="https://github.com/Razique/BashStuff/blob/master/SYSTEMS/OpenStack/SCR_5006_V00_NUAC-OPENSTACK-DRP-OpenStack.sh"
                >here</link>, you can download a bash script that performs the following steps:</para>
|
||||
<orderedlist>
|
||||
<listitem><para>An array is created for instances and their attached volumes.</para></listitem>
|
||||
<listitem><para>The MySQL database is updated.</para></listitem>
|
||||
<listitem><para>Using <systemitem>euca2ools</systemitem>, all instances are restarted.</para></listitem>
|
||||
<listitem><para>The volume attachment is made.</para></listitem>
|
||||
<listitem><para>An SSH connection is performed into every instance using Compute credentials.</para></listitem>
|
||||
</orderedlist>
|
||||
<para>The "test mode" allows you to perform
|
||||
that whole sequence for only one
|
||||
instance.</para>
|
||||
            <para>To reproduce the power loss, connect to the compute node which runs
                that same instance and close the iSCSI session. Do not detach the volume using the
                <command>nova volume-detach</command> command; instead, manually close the iSCSI
                session. The following example command closes iSCSI session number 15:</para>
|
||||
<screen><prompt>#</prompt> <userinput>iscsiadm -m session -u -r 15</userinput></screen>
|
||||
<para>Do not forget the <literal>-r</literal>
|
||||
flag. Otherwise, you close ALL
|
||||
sessions.</para>
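            <para>To identify the session number before closing it, you can, for example, list
                the active sessions first:</para>
            <screen><prompt>#</prompt> <userinput>iscsiadm -m session</userinput></screen>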
|
||||
</simplesect>
|
||||
</section>
|
||||
</section>
|
@@ -500,437 +500,6 @@ local0.error @@172.20.1.43:1024</programlisting>
|
||||
</section>
|
||||
<xi:include href="../../common/section_compute-configure-console.xml"/>
|
||||
<xi:include href="section_compute-configure-service-groups.xml"/>
|
||||
<section xml:id="section_nova-compute-node-down">
|
||||
<title>Recover from a failed compute node</title>
|
||||
<para>If you have deployed Compute with a shared file
|
||||
system, you can quickly recover from a failed compute
|
||||
node. Of the two methods covered in these sections,
|
||||
the evacuate API is the preferred method even in the
|
||||
absence of shared storage. The evacuate API provides
|
||||
many benefits over manual recovery, such as
|
||||
re-attachment of volumes and floating IPs.</para>
|
||||
<xi:include href="../../common/section_cli_nova_evacuate.xml"/>
|
||||
<section xml:id="nova-compute-node-down-manual-recovery">
|
||||
<title>Manual recovery</title>
|
||||
<para>For KVM/libvirt compute node recovery, see the previous section. Use the
|
||||
following procedure for all other hypervisors.</para>
|
||||
<procedure>
|
||||
<title>To work with host information</title>
|
||||
<step>
|
||||
<para>Identify the VMs on the affected hosts, using tools such as a
|
||||
combination of <literal>nova list</literal> and <literal>nova show</literal>
|
||||
or <literal>euca-describe-instances</literal>. Here's an example using the
|
||||
EC2 API - instance i-000015b9 that is running on node np-rcc54:</para>
|
||||
<programlisting language="bash">i-000015b9 at3-ui02 running nectarkey (376, np-rcc54) 0 m1.xxlarge 2012-06-19T00:48:11.000Z 115.146.93.60</programlisting>
|
||||
</step>
|
||||
<step>
|
||||
<para>You can review the status of the host by using the Compute database.
|
||||
Some of the important information is highlighted below. This example
|
||||
converts an EC2 API instance ID into an OpenStack ID; if you used the
|
||||
<literal>nova</literal> commands, you can substitute the ID directly.
|
||||
You can find the credentials for your database in
|
||||
<filename>/etc/nova.conf</filename>.</para>
|
||||
<programlisting language="bash">SELECT * FROM instances WHERE id = CONV('15b9', 16, 10) \G;
|
||||
*************************** 1. row ***************************
|
||||
created_at: 2012-06-19 00:48:11
|
||||
updated_at: 2012-07-03 00:35:11
|
||||
deleted_at: NULL
|
||||
...
|
||||
id: 5561
|
||||
...
|
||||
power_state: 5
|
||||
vm_state: shutoff
|
||||
...
|
||||
hostname: at3-ui02
|
||||
host: np-rcc54
|
||||
...
|
||||
uuid: 3f57699a-e773-4650-a443-b4b37eed5a06
|
||||
...
|
||||
task_state: NULL
|
||||
...</programlisting>
|
||||
</step>
|
||||
</procedure>
|
||||
<procedure>
|
||||
<title>To recover the VM</title>
|
||||
<step>
|
||||
<para>When you know the status of the VM on the failed host, determine to
|
||||
which compute host the affected VM should be moved. For example, run the
|
||||
following database command to move the VM to np-rcc46:</para>
|
||||
<programlisting language="bash">UPDATE instances SET host = 'np-rcc46' WHERE uuid = '3f57699a-e773-4650-a443-b4b37eed5a06'; </programlisting>
|
||||
</step>
|
||||
<step>
|
||||
<para>If using a hypervisor that relies on libvirt (such as KVM), it is a
|
||||
good idea to update the <literal>libvirt.xml</literal> file (found in
|
||||
<literal>/var/lib/nova/instances/[instance ID]</literal>). The important
|
||||
changes to make are:</para>
|
||||
<para>
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>Change the <literal>DHCPSERVER</literal> value to the host IP
|
||||
address of the compute host that is now the VM's new
|
||||
home.</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>Update the VNC IP if it isn't already to:
|
||||
<literal>0.0.0.0</literal>.</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
</para>
|
||||
</step>
|
||||
<step>
|
||||
<para>Reboot the VM:</para>
|
||||
<screen><prompt>$</prompt> <userinput>nova reboot --hard 3f57699a-e773-4650-a443-b4b37eed5a06</userinput></screen>
|
||||
</step>
|
||||
</procedure>
|
||||
<para>In theory, the above database update and <literal>nova
|
||||
reboot</literal> command are all that is required to recover a VM from a
|
||||
failed host. However, if further problems occur, consider looking at
|
||||
recreating the network filter configuration using <literal>virsh</literal>,
|
||||
restarting the Compute services or updating the <literal>vm_state</literal>
|
||||
and <literal>power_state</literal> in the Compute database.</para>
|
||||
</section>
|
||||
</section>
|
||||
<section xml:id="section_nova-uid-mismatch">
|
||||
<title>Recover from a UID/GID mismatch</title>
|
||||
<para>When running OpenStack compute, using a shared file
|
||||
system or an automated configuration tool, you could
|
||||
encounter a situation where some files on your compute
|
||||
node are using the wrong UID or GID. This causes a
|
||||
raft of errors, such as being unable to live migrate,
|
||||
or start virtual machines.</para>
|
||||
<para>The following procedure runs on <systemitem class="service"
|
||||
>nova-compute</systemitem> hosts, based on the KVM hypervisor, and could help to
|
||||
restore the situation:</para>
|
||||
<procedure>
|
||||
<title>To recover from a UID/GID mismatch</title>
|
||||
<step>
|
||||
<para>Ensure you don't use numbers that are already used for some other
|
||||
user/group.</para>
|
||||
</step>
|
||||
<step>
|
||||
<para>Set the nova uid in <filename>/etc/passwd</filename> to the same number in
|
||||
all hosts (for example, 112).</para>
|
||||
</step>
|
||||
<step>
|
||||
<para>Set the libvirt-qemu uid in
|
||||
<filename>/etc/passwd</filename> to the
|
||||
same number in all hosts (for example,
|
||||
119).</para>
|
||||
</step>
|
||||
<step>
|
||||
<para>Set the nova group in
|
||||
<filename>/etc/group</filename> file to
|
||||
the same number in all hosts (for example,
|
||||
120).</para>
|
||||
</step>
|
||||
<step>
|
||||
<para>Set the libvirtd group in
|
||||
<filename>/etc/group</filename> file to
|
||||
the same number in all hosts (for example,
|
||||
119).</para>
|
||||
</step>
|
||||
<step>
|
||||
<para>Stop the services on the compute
|
||||
node.</para>
|
||||
</step>
|
||||
<step>
|
||||
<para>Change all the files owned by user nova or
|
||||
by group nova. For example:</para>
|
||||
<programlisting language="bash">find / -uid 108 -exec chown nova {} \; # note the 108 here is the old nova uid before the change
|
||||
find / -gid 120 -exec chgrp nova {} \;</programlisting>
|
||||
</step>
|
||||
<step>
|
||||
<para>Repeat the steps for the libvirt-qemu owned files if those needed to
|
||||
change.</para>
|
||||
</step>
|
||||
<step>
|
||||
<para>Restart the services.</para>
|
||||
</step>
|
||||
<step>
|
||||
<para>Now you can run the <command>find</command>
|
||||
command to verify that all files using the
|
||||
correct identifiers.</para>
|
||||
</step>
|
||||
</procedure>
|
||||
</section>
|
||||
<section xml:id="section_nova-disaster-recovery-process">
|
||||
<title>Compute disaster recovery process</title>
|
||||
<para>Use the following procedures to manage your cloud after a disaster, and to easily
|
||||
back up its persistent storage volumes. Backups <emphasis role="bold">are</emphasis>
|
||||
mandatory, even outside of disaster scenarios.</para>
|
||||
<para>For a DRP definition, see <link
|
||||
xlink:href="http://en.wikipedia.org/wiki/Disaster_Recovery_Plan"
|
||||
>http://en.wikipedia.org/wiki/Disaster_Recovery_Plan</link>.</para>
|
||||
<simplesect>
|
||||
<title>A- The disaster recovery process
|
||||
presentation</title>
|
||||
<para>A disaster could happen to several components of
|
||||
your architecture: a disk crash, a network loss, a
|
||||
power cut, and so on. In this example, assume the
|
||||
following set up:</para>
|
||||
<orderedlist>
|
||||
<listitem>
|
||||
<para>A cloud controller (<systemitem>nova-api</systemitem>,
|
||||
<systemitem>nova-objectstore</systemitem>,
|
||||
<systemitem>nova-network</systemitem>)</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>A compute node (<systemitem
|
||||
class="service"
|
||||
>nova-compute</systemitem>)</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>A Storage Area Network used by
|
||||
<systemitem class="service"
|
||||
>cinder-volumes</systemitem> (aka
|
||||
SAN)</para>
|
||||
</listitem>
|
||||
</orderedlist>
|
||||
<para>The disaster example is the worst one: a power
|
||||
loss. That power loss applies to the three
|
||||
components. <emphasis role="italic">Let's see what
|
||||
runs and how it runs before the
|
||||
crash</emphasis>:</para>
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>From the SAN to the cloud controller, we
|
||||
have an active iscsi session (used for the
|
||||
"cinder-volumes" LVM's VG).</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>From the cloud controller to the compute node, we also have active
|
||||
iscsi sessions (managed by <systemitem class="service"
|
||||
>cinder-volume</systemitem>).</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>For every volume, an iscsi session is made (so 14 ebs volumes equals
|
||||
14 sessions).</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>From the cloud controller to the compute node, we also have iptables/
|
||||
ebtables rules which allow access from the cloud controller to the running
|
||||
instance.</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>And at least, from the cloud controller to the compute node; saved
|
||||
into database, the current state of the instances (in that case "running" ),
|
||||
and their volumes attachment (mount point, volume ID, volume status, and so
|
||||
on.)</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
<para>Now, after the power loss occurs and all
|
||||
hardware components restart, the situation is as
|
||||
follows:</para>
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>From the SAN to the cloud, the ISCSI
|
||||
session no longer exists.</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>From the cloud controller to the compute
|
||||
node, the ISCSI sessions no longer exist.
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>From the cloud controller to the compute node, the iptables and
|
||||
ebtables are recreated, since, at boot,
|
||||
<systemitem>nova-network</systemitem> reapplies the
|
||||
configurations.</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>From the cloud controller, instances are in a shutdown state (because
|
||||
they are no longer running)</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>In the database, data was not updated at all, since Compute could not
|
||||
have anticipated the crash.</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
<para>Before going further, and to prevent the administrator from making fatal
|
||||
mistakes,<emphasis role="bold"> the instances won't be lost</emphasis>, because
|
||||
no "<command role="italic">destroy</command>" or "<command role="italic"
|
||||
>terminate</command>" command was invoked, so the files for the instances remain
|
||||
on the compute node.</para>
|
||||
<para>Perform these tasks in this exact order. <emphasis role="underline">Any extra
|
||||
step would be dangerous at this stage</emphasis> :</para>
|
||||
<para>
|
||||
<orderedlist>
|
||||
<listitem>
|
||||
<para>Get the current relation from a
|
||||
volume to its instance, so that you
|
||||
can recreate the attachment.</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>Update the database to clean the
|
||||
stalled state. (After that, you cannot
|
||||
perform the first step).</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>Restart the instances. In other
|
||||
words, go from a shutdown to running
|
||||
state.</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>After the restart, reattach the volumes to their respective
|
||||
instances (optional).</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>SSH into the instances to reboot them.</para>
|
||||
</listitem>
|
||||
</orderedlist>
|
||||
</para>
|
||||
</simplesect>
|
||||
<simplesect>
|
||||
<title>B - Disaster recovery</title>
|
||||
<procedure>
|
||||
<title>To perform disaster recovery</title>
|
||||
<step>
|
||||
<title>Get the instance-to-volume
|
||||
relationship</title>
|
||||
<para>You must get the current relationship from a volume to its instance,
|
||||
because you will re-create the attachment.</para>
|
||||
<para>You can find this relationship by running <command>nova
|
||||
volume-list</command>. Note that the <command>nova</command> client
|
||||
includes the ability to get volume information from Block Storage.</para>
|
||||
</step>
|
||||
<step>
|
||||
<title>Update the database</title>
|
||||
<para>Update the database to clean the stalled state. You must restore for
|
||||
every volume, using these queries to clean up the database:</para>
|
||||
<screen><prompt>mysql></prompt> <userinput>use cinder;</userinput>
|
||||
<prompt>mysql></prompt> <userinput>update volumes set mountpoint=NULL;</userinput>
|
||||
<prompt>mysql></prompt> <userinput>update volumes set status="available" where status <>"error_deleting";</userinput>
|
||||
<prompt>mysql></prompt> <userinput>update volumes set attach_status="detached";</userinput>
|
||||
<prompt>mysql></prompt> <userinput>update volumes set instance_id=0;</userinput></screen>
|
||||
<para>Then, when you run <command>nova volume-list</command> commands, all
|
||||
volumes appear in the listing.</para>
|
||||
</step>
|
||||
<step>
|
||||
<title>Restart instances</title>
|
||||
<para>Restart the instances using the <command>nova reboot
|
||||
<replaceable>$instance</replaceable></command> command.</para>
|
||||
<para>At this stage, depending on your image, some instances completely
|
||||
reboot and become reachable, while others stop on the "plymouth"
|
||||
stage.</para>
|
||||
</step>
|
||||
<step>
|
||||
<title>DO NOT reboot a second time</title>
|
||||
<para>Do not reboot instances that are stopped at this point. Instance state
|
||||
depends on whether you added an <filename>/etc/fstab</filename> entry for
|
||||
that volume. Images built with the <package>cloud-init</package> package
|
||||
remain in a pending state, while others skip the missing volume and start.
|
||||
The idea of that stage is only to ask nova to reboot every instance, so the
|
||||
stored state is preserved. For more information about
|
||||
<package>cloud-init</package>, see <link
|
||||
xlink:href="https://help.ubuntu.com/community/CloudInit"
|
||||
>help.ubuntu.com/community/CloudInit</link>.</para>
|
||||
</step>
|
||||
<step>
|
||||
<title>Reattach volumes</title>
|
||||
<para>After the restart, you can reattach the volumes to their respective
|
||||
instances. Now that <command>nova</command> has restored the right status,
|
||||
it is time to perform the attachments through a <command>nova
|
||||
volume-attach</command></para>
|
||||
<para>This simple snippet uses the created
|
||||
file:</para>
|
||||
<programlisting language="bash">#!/bin/bash
|
||||
|
||||
while read line; do
|
||||
volume=`echo $line | $CUT -f 1 -d " "`
|
||||
instance=`echo $line | $CUT -f 2 -d " "`
|
||||
mount_point=`echo $line | $CUT -f 3 -d " "`
|
||||
echo "ATTACHING VOLUME FOR INSTANCE - $instance"
|
||||
nova volume-attach $instance $volume $mount_point
|
||||
sleep 2
|
||||
done < $volumes_tmp_file</programlisting>
|
||||
<para>At that stage, instances that were
|
||||
pending on the boot sequence (<emphasis
|
||||
role="italic">plymouth</emphasis>)
|
||||
automatically continue their boot, and
|
||||
restart normally, while the ones that
|
||||
booted see the volume.</para>
|
||||
</step>
|
||||
<step>
|
||||
<title>SSH into instances</title>
|
||||
<para>If some services depend on the volume, or if a volume has an entry
|
||||
into <systemitem>fstab</systemitem>, it could be good to simply restart the
|
||||
instance. This restart needs to be made from the instance itself, not
|
||||
through <command>nova</command>. So, we SSH into the instance and perform a
|
||||
reboot:</para>
|
||||
<screen><prompt>#</prompt> <userinput>shutdown -r now</userinput></screen>
|
||||
</step>
|
||||
</procedure>
|
||||
<para>By completing this procedure, you can
|
||||
successfully recover your cloud.</para>
|
||||
<note>
|
||||
<para>Follow these guidelines:</para>
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>Use the <parameter> errors=remount</parameter> parameter in the
|
||||
<filename>fstab</filename> file, which prevents data
|
||||
corruption.</para>
|
||||
<para>The system locks any write to the disk if it detects an I/O error.
|
||||
This configuration option should be added into the <systemitem
|
||||
class="service">cinder-volume</systemitem> server (the one which
|
||||
performs the ISCSI connection to the SAN), but also into the instances'
|
||||
<filename>fstab</filename> file.</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>Do not add the entry for the SAN's disks to the <systemitem
|
||||
class="service">cinder-volume</systemitem>'s
|
||||
<filename>fstab</filename> file.</para>
|
||||
<para>Some systems hang on that step, which means you could lose access to
|
||||
your cloud-controller. To re-run the session manually, you would run the
|
||||
following command before performing the mount:
|
||||
<screen><prompt>#</prompt> <userinput>iscsiadm -m discovery -t st -p $SAN_IP $ iscsiadm -m node --target-name $IQN -p $SAN_IP -l</userinput></screen></para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>For your instances, if you have the whole <filename>/home/</filename>
|
||||
directory on the disk, instead of emptying the
|
||||
<filename>/home</filename> directory and map the disk on it, leave a
|
||||
user's directory with the user's bash files and the
|
||||
<filename>authorized_keys</filename> file.</para>
|
||||
<para>This enables you to connect to the instance, even without the volume
|
||||
attached, if you allow only connections through public keys.</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
</note>
|
||||
</simplesect>
|
||||
<simplesect>
|
||||
<title>C - Scripted DRP</title>
|
||||
<procedure>
|
||||
<title>To use scripted DRP</title>
|
||||
<para>You can download from <link
|
||||
xlink:href="https://github.com/Razique/BashStuff/blob/master/SYSTEMS/OpenStack/SCR_5006_V00_NUAC-OPENSTACK-DRP-OpenStack.sh"
|
||||
>here</link> a bash script which performs
|
||||
these steps:</para>
|
||||
<step>
|
||||
<para>The "test mode" allows you to perform
|
||||
that whole sequence for only one
|
||||
instance.</para>
|
||||
</step>
|
||||
<step>
|
||||
<para>To reproduce the power loss, connect to
|
||||
the compute node which runs that same
|
||||
instance and close the iscsi session.
|
||||
<emphasis role="underline">Do not
|
||||
detach the volume through
|
||||
<command>nova
|
||||
volume-detach</command></emphasis>,
|
||||
but instead manually close the iscsi
|
||||
session.</para>
|
||||
</step>
|
||||
<step>
|
||||
<para>In this example, the iscsi session is
|
||||
number 15 for that instance:</para>
|
||||
<screen><prompt>#</prompt> <userinput>iscsiadm -m session -u -r 15</userinput></screen>
|
||||
</step>
|
||||
<step>
|
||||
<para>Do not forget the <literal>-r</literal>
|
||||
flag. Otherwise, you close ALL
|
||||
sessions.</para>
|
||||
</step>
|
||||
</procedure>
|
||||
</simplesect>
|
||||
</section>
|
||||
<xi:include href="section_compute-security.xml"/>
|
||||
<xi:include href="section_compute-recover-nodes.xml"/>
|
||||
</section>
|
||||
|
@@ -4,34 +4,26 @@
|
||||
xmlns:xlink="http://www.w3.org/1999/xlink" version="5.0"
|
||||
xml:id="nova_cli_evacuate">
|
||||
<title>Evacuate instances</title>
|
||||
<para>If a cloud compute node fails due to a hardware malfunction
|
||||
or another reason, you can evacuate instances to make them
|
||||
available again.</para>
|
||||
<para>You can choose evacuation parameters for your use
|
||||
case.</para>
|
||||
<para>To preserve user data on server disk, you must configure
|
||||
shared storage on the target host. Also, you must validate
|
||||
that the current VM host is down. Otherwise the evacuation
|
||||
<para>If a cloud compute node fails due to a hardware malfunction or another reason, you can
|
||||
evacuate instances to make them available again. You can choose evacuation parameters for
|
||||
your use case.</para>
|
||||
<para>To preserve user data on server disk, you must configure shared storage on the target
|
||||
host. Also, you must validate that the current VM host is down; otherwise, the evacuation
|
||||
fails with an error.</para>
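    <para>For example, you might confirm that the compute service on the failed host is
        reported as down before you evacuate (the host name is a placeholder):</para>
    <screen><prompt>$</prompt> <userinput>nova service-list --host <replaceable>failed_host</replaceable> --binary nova-compute</userinput></screen>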
|
||||
<procedure xml:id="evacuate_shared">
|
||||
<step>
|
||||
<para>To find a different host for the evacuated instance,
|
||||
run this command to list hosts:</para>
|
||||
<para>To list hosts and find a different host for the evacuated instance, run:</para>
|
||||
<screen><prompt>$</prompt> <userinput>nova host-list</userinput></screen>
|
||||
</step>
|
||||
<step>
|
||||
<para>You can pass the instance password to the command by
|
||||
using the <literal>--password <pwd></literal>
|
||||
option. If you do not specify a password, one is
|
||||
generated and printed after the command finishes
|
||||
successfully. The following command evacuates a server
|
||||
without shared storage:</para>
|
||||
<para>Evacuate the instance. You can pass the instance password to the command by using
|
||||
the <literal>--password <pwd></literal> option. If you do not specify a
|
||||
password, one is generated and printed after the command finishes successfully. The
|
||||
following command evacuates a server without shared storage from a host that is down
|
||||
to the specified <replaceable>host_b</replaceable>:</para>
|
||||
<screen><prompt>$</prompt> <userinput>nova evacuate <replaceable>evacuated_server_name</replaceable> <replaceable>host_b</replaceable></userinput> </screen>
|
||||
<para>The command evacuates an instance from a down host
|
||||
to a specified host. The instance is booted from a new
|
||||
disk, but preserves its configuration including its
|
||||
ID, name, uid, IP address, and so on. The command
|
||||
returns a password:</para>
|
||||
<para>The instance is booted from a new disk, but preserves its configuration including
|
||||
its ID, name, uid, IP address, and so on. The command returns a password:</para>
|
||||
<screen><computeroutput><?db-font-size 70%?>+-----------+--------------+
|
||||
| Property | Value |
|
||||
+-----------+--------------+
|
||||
@@ -39,14 +31,12 @@
|
||||
+-----------+--------------+</computeroutput></screen>
|
||||
</step>
|
||||
<step>
|
||||
<para>To preserve the user disk data on the evacuated
|
||||
server, deploy OpenStack Compute with shared file
|
||||
system. To configure your system, see <link
|
||||
<para>To preserve the user disk data on the evacuated server, deploy OpenStack Compute
|
||||
with a shared file system. To configure your system, see <link
|
||||
xlink:href="http://docs.openstack.org/havana/config-reference/content/configuring-openstack-compute-basics.html#section_configuring-compute-migrations"
|
||||
>Configure migrations</link> in
|
||||
<citetitle>OpenStack Configuration
|
||||
Reference</citetitle>. In this example, the
|
||||
password remains unchanged.</para>
|
||||
>Configure migrations</link> in <citetitle>OpenStack Configuration
|
||||
Reference</citetitle>. In the following example, the password remains
|
||||
unchanged:</para>
|
||||
<screen><prompt>$</prompt> <userinput>nova evacuate <replaceable>evacuated_server_name</replaceable> <replaceable>host_b</replaceable> --on-shared-storage</userinput> </screen>
|
||||
</step>
|
||||
</procedure>
|
||||
|
@@ -4,16 +4,14 @@
|
||||
xmlns:xlink="http://www.w3.org/1999/xlink" version="5.0"
|
||||
xml:id="trusted-compute-pools">
|
||||
<title>Trusted compute pools</title>
|
||||
<para>Trusted compute pools enable administrators to designate a
|
||||
group of compute hosts as trusted. These hosts use hardware-based
|
||||
security features, such as the Intel Trusted Execution
|
||||
Technology (TXT), to provide an additional level of security.
|
||||
Combined with an external stand-alone web-based remote
|
||||
attestation server, cloud providers can ensure that the
|
||||
compute node runs only software with verified measurements and
|
||||
can ensure a secure cloud stack.</para>
|
||||
<para>Through the trusted compute pools, cloud subscribers can
|
||||
request services to run on verified compute nodes.</para>
|
||||
<para>Trusted compute pools enable administrators to designate a group of compute hosts as
|
||||
trusted. These hosts use hardware-based security features, such as the Intel Trusted
|
||||
Execution Technology (TXT), to provide an additional level of security. Combined with an
|
||||
external stand-alone, web-based remote attestation server, cloud providers can ensure that
|
||||
the compute node runs only software with verified measurements and can ensure a secure cloud
|
||||
stack.</para>
|
||||
<para>Using the trusted compute pools, cloud subscribers can request services to run on verified
|
||||
compute nodes.</para>
|
||||
<para>The remote attestation server performs node verification as
|
||||
follows:</para>
|
||||
<orderedlist>
|
||||
@@ -26,13 +24,12 @@
|
||||
measured.</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>Measured data is sent to the attestation server when
|
||||
challenged by attestation server.</para>
|
||||
<para>Measured data is sent to the attestation server when challenged by the attestation
|
||||
server.</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>The attestation server verifies those measurements
|
||||
against a good and known database to determine nodes'
|
||||
trustworthiness.</para>
|
||||
            <para>The attestation server verifies those measurements against a known good
                database to determine node trustworthiness.</para>
|
||||
</listitem>
|
||||
</orderedlist>
|
||||
<para>A description of how to set up an attestation service is
|
||||
@@ -57,27 +54,40 @@
|
||||
<title>Configure Compute to use trusted compute pools</title>
|
||||
<procedure>
|
||||
<step>
|
||||
<para>Configure the Compute service with the
|
||||
connection information for the attestation
|
||||
service.</para>
|
||||
<para>Specify these connection options in the
|
||||
<literal>trusted_computing</literal> section
|
||||
in the <filename>nova.conf</filename>
|
||||
configuration file:</para>
|
||||
                <para>Enable scheduling support for trusted compute pools by adding the following
                    lines to the <literal>DEFAULT</literal> section of the
                    <filename>/etc/nova/nova.conf</filename> file:</para>
|
||||
<programlisting language="ini">[DEFAULT]
|
||||
compute_scheduler_driver=nova.scheduler.filter_scheduler.FilterScheduler
|
||||
scheduler_available_filters=nova.scheduler.filters.all_filters
|
||||
scheduler_default_filters=AvailabilityZoneFilter,RamFilter,ComputeFilter,TrustedFilter</programlisting>
|
||||
</step>
|
||||
<step>
|
||||
<para>Specify the connection information for your attestation service by adding the
|
||||
following lines to the <literal>trusted_computing</literal> section in the
|
||||
<filename>/etc/nova/nova.conf</filename> file:</para>
|
||||
<programlisting language="ini">[trusted_computing]
|
||||
server=10.1.71.206
|
||||
port=8443
|
||||
server_ca_file=/etc/nova/ssl.10.1.71.206.crt
|
||||
# If using OAT v1.5, use this api_url:
|
||||
api_url=/AttestationService/resources
|
||||
# If using OAT pre-v1.5, use this api_url:
|
||||
#api_url=/OpenAttestationWebServices/V1.0
|
||||
auth_blob=i-am-openstack</programlisting>
|
||||
<para>Where:</para>
|
||||
<variablelist>
|
||||
<varlistentry>
|
||||
<term>server</term>
|
||||
<listitem>
|
||||
<para>Host name or IP address of the host
|
||||
that runs the attestation
|
||||
service</para>
|
||||
<para>Host name or IP address of the host that runs the attestation
|
||||
service.</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
<varlistentry>
|
||||
<term>port</term>
|
||||
<listitem>
|
||||
<para>HTTPS port for the attestation
|
||||
service</para>
|
||||
<para>HTTPS port for the attestation service.</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
<varlistentry>
|
||||
@@ -90,8 +100,7 @@
|
||||
<varlistentry>
|
||||
<term>api_url</term>
|
||||
<listitem>
|
||||
<para>The attestation service URL
|
||||
path.</para>
|
||||
<para>The attestation service's URL path.</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
<varlistentry>
|
||||
@@ -104,31 +113,6 @@
|
||||
</varlistentry>
|
||||
</variablelist>
|
||||
</step>
|
||||
<step>
|
||||
<para>To enable scheduling support for trusted compute
|
||||
pools, add the following lines to the
|
||||
<literal>DEFAULT</literal> and
|
||||
<literal>trusted_computing</literal> sections
|
||||
in the <filename>/etc/nova/nova.conf</filename>
|
||||
file. Edit the details in the
|
||||
<literal>trusted_computing</literal> section
|
||||
based on the details of your attestation
|
||||
service:</para>
|
||||
<programlisting language="ini">[DEFAULT]
|
||||
compute_scheduler_driver=nova.scheduler.filter_scheduler.FilterScheduler
|
||||
scheduler_available_filters=nova.scheduler.filters.all_filters
|
||||
scheduler_default_filters=AvailabilityZoneFilter,RamFilter,ComputeFilter,TrustedFilter
|
||||
|
||||
[trusted_computing]
|
||||
server=10.1.71.206
|
||||
port=8443
|
||||
server_ca_file=/etc/nova/ssl.10.1.71.206.crt
|
||||
# If using OAT v1.5, use this api_url:
|
||||
api_url=/AttestationService/resources
|
||||
# If using OAT pre-v1.5, use this api_url:
|
||||
#api_url=/OpenAttestationWebServices/V1.0
|
||||
auth_blob=i-am-openstack</programlisting>
|
||||
</step>
|
||||
<step>
|
||||
<para>Restart the <systemitem class="service"
|
||||
>nova-compute</systemitem> and <systemitem
|
||||
@@ -138,35 +122,41 @@ auth_blob=i-am-openstack</programlisting>
|
||||
</procedure>
|
||||
<section xml:id="config_ref">
|
||||
<title>Configuration reference</title>
|
||||
<para>To customize the trusted compute pools, use the configuration
|
||||
option settings documented in <xref
|
||||
linkend="config_table_nova_trustedcomputing"/>.</para>
|
||||
<para>To customize the trusted compute pools, use the following configuration
|
||||
option settings:
|
||||
</para>
|
||||
<xi:include href="tables/nova-trustedcomputing.xml"/>
|
||||
</section>
|
||||
</section>
|
||||
<section xml:id="trusted_flavors">
|
||||
<title>Specify trusted flavors</title>
|
||||
<para>You must configure one or more flavors as
|
||||
trusted. Users can request
|
||||
trusted nodes by specifying a trusted flavor when they
|
||||
boot an instance.</para>
|
||||
<para>Use the <command>nova flavor-key set</command> command
|
||||
to set a flavor as trusted. For example, to set the
|
||||
<literal>m1.tiny</literal> flavor as trusted:</para>
|
||||
<screen><prompt>$</prompt> <userinput>nova flavor-key m1.tiny set trust:trusted_host trusted</userinput></screen>
|
||||
<para>To request that their instances run on a trusted host,
|
||||
users can specify a trusted flavor on the <command>nova
|
||||
boot</command> command:</para>
|
||||
<mediaobject>
|
||||
<imageobject role="fo">
|
||||
<imagedata
|
||||
fileref="figures/OpenStackTrustedComputePool2.png"
|
||||
format="PNG" contentwidth="6in"/>
|
||||
</imageobject>
|
||||
<imageobject role="html">
|
||||
<imagedata
|
||||
fileref="figures/OpenStackTrustedComputePool2.png"
|
||||
format="PNG" contentwidth="6in"/>
|
||||
</imageobject>
|
||||
</mediaobject>
|
||||
        <para>To request that instances run on trusted hosts:</para>
|
||||
<procedure>
|
||||
<step>
|
||||
<para>Configure one or more flavors as trusted by using the <command>nova
|
||||
flavor-key set</command> command. For example, to set the
|
||||
<literal>m1.tiny</literal> flavor as trusted:</para>
|
||||
<screen><prompt>$</prompt> <userinput>nova flavor-key m1.tiny set trust:trusted_host trusted</userinput></screen>
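                <para>To confirm that the extra spec was stored, you could, for example, display
                    the flavor details:</para>
                <screen><prompt>$</prompt> <userinput>nova flavor-show m1.tiny</userinput></screen>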
|
||||
</step>
|
||||
            <step><para>Request that your instance be run on a trusted host by specifying a trusted flavor when
                booting the instance. For example:</para>
|
||||
<screen><prompt>$</prompt> <userinput>nova boot --flavor m1.tiny --key_name myKeypairName --image myImageID newInstanceName</userinput></screen>
|
||||
<figure xml:id="concept_trusted_pool">
|
||||
<title>Trusted compute pool</title>
|
||||
<mediaobject>
|
||||
<imageobject role="fo">
|
||||
<imagedata
|
||||
fileref="figures/OpenStackTrustedComputePool2.png"
|
||||
format="PNG" contentwidth="6in"/>
|
||||
</imageobject>
|
||||
<imageobject role="html">
|
||||
<imagedata
|
||||
fileref="figures/OpenStackTrustedComputePool2.png"
|
||||
format="PNG" contentwidth="6in"/>
|
||||
</imageobject>
|
||||
</mediaobject>
|
||||
</figure>
|
||||
</step>
|
||||
</procedure>
|
||||
</section>
|
||||
</section>
|
||||
|
@@ -92,7 +92,6 @@
|
||||
<xi:include href="compute/section_compute-scheduler.xml"/>
|
||||
<xi:include href="compute/section_compute-cells.xml"/>
|
||||
<xi:include href="compute/section_compute-conductor.xml"/>
|
||||
<xi:include href="compute/section_compute-security.xml"/>
|
||||
<xi:include href="compute/section_compute-config-samples.xml"/>
|
||||
<xi:include href="compute/section_nova-log-files.xml"/>
|
||||
<xi:include href="compute/section_compute-options-reference.xml"/>