Compute

The Compute service, code-named Nova, provides a tool that lets you orchestrate a cloud.
By using Compute, you can run instances, manage networks, and manage access to the cloud
through users and projects. Compute provides software that enables you to control an
Infrastructure as a Service (IaaS) cloud computing platform.

Introduction to Compute

Compute does not include any virtualization software;
rather it defines drivers that interact with underlying
virtualization mechanisms that run on your host operating
system, and exposes functionality over a web-based
API.

Hypervisors

Compute requires a hypervisor. Compute controls hypervisors through an API server.
To select a hypervisor, you must prioritize and make decisions based on budget,
resource constraints, supported features, and required technical specifications. The
majority of development is done with the KVM and Xen-based hypervisors. For a
detailed list of features and support across the hypervisors, see http://wiki.openstack.org/HypervisorSupportMatrix.

With Compute, you can orchestrate clouds using multiple hypervisors in different
availability zones. The types of virtualization standards that can be used with
Compute include:

- Baremetal
- Hyper-V
- Kernel-based Virtual Machine (KVM)
- Linux Containers (LXC)
- Quick EMUlator (QEMU)
- User Mode Linux (UML)
- VMware vSphere
- Xen

Please see the Hypervisors section in OpenStack Configuration
Reference for more information.

Tenants, users, and roles

The Compute system is designed to be used by many different cloud computing
consumers or customers, essentially tenants on a shared system, using
role-based access assignments. Roles control the actions that a user is allowed to
perform.

While the original EC2 API supports users, Compute uses the concept of tenants.
Tenants are isolated resource containers that form the principal organizational
structure within the Compute service. They consist of a separate VLAN, volumes,
instances, images, keys, and users. A user can specify the tenant by appending
:project_id to their access key. If no tenant is specified in
the API request, Compute attempts to use a tenant with the same ID as the
user.

For tenants, quota controls are available to limit the:

- Number of volumes that may be created.
- Number of processor cores and the amount of RAM that may be allocated.
- Floating IP addresses (assigned to any instance when it launches so
  the instance has the same publicly accessible IP addresses).
- Fixed IP addresses (assigned to the same instance each time it boots,
  publicly or privately accessible, typically private for management
  purposes).

Roles control the actions a user is allowed to perform. In the default configuration, most
actions do not require a particular role, but the system administrator can configure
them by editing the appropriate policy.json file that maintains
the rules. For example, a rule can be defined so that a user cannot allocate a
public IP without the admin role. A tenant limits users' access to particular
images. Each user is assigned a username and password. Key pairs granting
access to an instance are enabled for each user, but quotas are set for each tenant
to control resource consumption across available hardware resources.

Earlier versions of OpenStack used the term "project" instead of "tenant."
Because of this legacy terminology, some command-line tools use
--project_id when a tenant ID is expected.
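
As a rough illustration of how per-user roles and per-tenant quotas combine, the following Python sketch checks a request against both. It is not Nova's implementation: the TenantQuota and TenantUsage classes, their field names, the limit values, and both helper functions are hypothetical and exist only for this example.

```python
from dataclasses import dataclass


@dataclass
class TenantQuota:
    """Hypothetical per-tenant limits (illustrative names, not Nova's schema)."""
    cores: int
    ram_mb: int
    floating_ips: int


@dataclass
class TenantUsage:
    """Resources the tenant currently consumes."""
    cores: int = 0
    ram_mb: int = 0
    floating_ips: int = 0


def within_quota(quota, usage, cores=0, ram_mb=0, floating_ips=0):
    """Return True if the requested resources fit in the tenant's remaining quota."""
    return (
        usage.cores + cores <= quota.cores
        and usage.ram_mb + ram_mb <= quota.ram_mb
        and usage.floating_ips + floating_ips <= quota.floating_ips
    )


def role_allows(user_roles, required_role=None):
    """Return True if no role is required or the user holds the required role."""
    return required_role is None or required_role in user_roles


# Arbitrary example values for one tenant.
quota = TenantQuota(cores=20, ram_mb=51200, floating_ips=10)
usage = TenantUsage(cores=18, ram_mb=40960, floating_ips=10)

# Quotas are evaluated per tenant, regardless of which user makes the request.
print(within_quota(quota, usage, cores=2, ram_mb=4096))
print(within_quota(quota, usage, cores=4, ram_mb=8192))

# Roles are evaluated per user; here an assumed rule requires "admin"
# before a public IP may be allocated.
print(role_allows({"member"}, required_role="admin"))
print(role_allows({"member", "admin"}, required_role="admin"))
```

The two checks are independent on purpose: a user with the right role can still be refused because the tenant has exhausted its quota, and a tenant with spare quota does not help a user who lacks the required role.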

Images and instances

Images are disk images that serve as templates for virtual machine file systems. The
image service, Glance, is responsible for the storage and management of images
within OpenStack.

See the Image Services section of the OpenStack Configuration
Reference for image configuration options, and the Manage Images section of the OpenStack Admin User
Guide for specifics about creating and troubleshooting
images.

Instances are the individual virtual machines running on physical compute nodes.
Compute manages instances. Any number of instances may be started from the same
image. Each instance is run from a copy of the base image so runtime changes made by
an instance do not change the image it is based on. Snapshots of running instances
may be taken, creating a new image based on the current disk state of a particular
instance.

When starting an instance, a user must select a set of virtual resources known as
a flavor. Flavors define how many virtual CPUs an instance has and the amount of RAM
and size of its ephemeral disks. OpenStack provides a number of predefined flavors
that cloud administrators may edit or add to. Users must select from the set of
available flavors defined on their cloud. For more information about flavors, see
the Flavors section in OpenStack Operations Guide.

A user may add and remove additional resources from running instances, such as
persistent volume storage and public IP addresses. The following example shows the
lifecycle of a typical virtual system within an OpenStack cloud. It features the
cinder-volume service, which provides
persistent block storage, as opposed to the ephemeral storage provided by the
instance flavor.

Initial state

The following diagram shows the system state prior to launching an instance.
The image store fronted by the image service, Glance, has a number of predefined
images. Inside the cloud, a compute node contains available vCPU, memory, and
local disk resources. Additionally, the cinder-volume service provides a number of predefined
volumes.

Launching an instance

To launch an instance, the user selects an image, a flavor, and other optional
attributes. In this case, the selected flavor provides a root volume (as all
flavors do) labeled vda in the diagram and additional ephemeral storage labeled
vdb in the diagram. The user has also opted to map a volume from the cinder-volume store to the third virtual disk,
vdc, on this instance.

The OpenStack system copies the base image from the image store to the local
disk. The local disk is the first disk (vda) that the instance accesses. Using
small images results in faster startup of your instances because less data is
copied across the network. The system also creates a new empty disk image to
present as the second disk (vdb). Note that the second disk is empty and
ephemeral: it is destroyed when you delete the instance.
The compute node attaches to the requested cinder-volume using iSCSI and maps this to the third disk
(vdc) as requested. The vCPU and memory resources are provisioned and the
instance is booted from the first drive. The instance runs and changes data on
the disks indicated in red in the diagram.

The details of this scenario can vary, particularly the type of back-end
storage and the network protocols that are used. One variant worth mentioning
here is that the ephemeral storage used for volumes vda and vdb in this example
may be backed by network storage rather than local disk.

End state

Once the instance has served its purpose and is deleted, all state is
reclaimed, except the persistent volume. The ephemeral storage is purged. Memory
and vCPU resources are released. The image remains unchanged throughout.

System architecture

Compute consists of several main components. A "cloud controller" contains many of
these components, and it represents the global state and interacts with all other
components. An API Server acts as the web services front end for the cloud
controller. The compute controller provides compute server resources and typically
contains the compute service.

The Object Store component optionally provides storage services. An auth manager provides
authentication and authorization services when used with the Compute system, or you
can use the Identity Service (Keystone) as a separate authentication service. A
volume controller provides fast and permanent block-level storage for the compute
servers. A network controller provides virtual networks to enable compute servers to
interact with each other and with the public network. A scheduler selects the most
suitable compute controller to host an instance.

Compute is built on a shared-nothing, messaging-based architecture. You can run
all of the major components on multiple servers including a compute controller,
volume controller, network controller, and object store (or image service). A cloud
controller communicates with the internal object store through Hypertext Transfer
Protocol (HTTP), but it communicates with a scheduler, network controller, and
volume controller through Advanced Message Queuing Protocol (AMQP). To avoid blocking
each component while waiting for a response, Compute uses asynchronous calls, with a
callback that gets triggered when a response is received.

To achieve the shared-nothing property with multiple copies of the same component,
Compute keeps all of the cloud system state in a database.

Block Storage and Compute

OpenStack provides two classes of block storage: ephemeral storage and persistent
volumes. Ephemeral storage exists only for the life of an instance. It persists
across reboots of the guest operating system, but when the instance is deleted so is
the associated storage. All instances have some ephemeral storage. Volumes are
persistent virtualized block devices independent of any particular instance.

Ephemeral storage

Ephemeral storage is associated with a single
unique instance. Its size is defined by the flavor
of the instance.

Terminating the instance associated with ephemeral storage causes the loss of
data from that ephemeral storage. Rebooting the VM or restarting the host
server, however, does not destroy ephemeral data. In a typical use case, an
instance's root file system is stored on ephemeral storage.

In addition to the ephemeral root volume, all flavors except the smallest,
m1.tiny, provide an additional ephemeral block device whose size ranges from 20
GB for m1.small to 160 GB for m1.xlarge. You can configure these sizes. This is
presented as a raw block device with no partition table or file system.
Cloud-aware operating system images may discover, format, and mount this device.
For example, the cloud-init package included in Ubuntu's stock cloud images
formats this space as an ext3 file system and mounts it on
/mnt. It is important to note that this is a feature of the
guest operating system. OpenStack only provisions the raw storage.

Volume storage

Volumes are created by users, and their size is limited only by quota and
availability limits. Upon initial creation, volumes are raw block devices
without a partition table or a file system. To partition or format volumes, you
must attach them to an instance. After you attach them to an instance, you may
use volumes much like you would an external disk drive. You may attach volumes
to one instance at a time. However, you may detach and reattach volumes to
either the same or a different instance.

You may configure a volume so that it is bootable and provides a persistent
virtual instance similar to traditional non-cloud-based virtualization systems.
Typically, the resulting instance may still have ephemeral storage depending on
the flavor selected, but the root file system (and possibly others) may be on
the persistent volume and its state may be maintained even if the instance is
shut down. The details of this configuration are discussed in the
OpenStack Configuration Reference.

Volumes do not provide concurrent access from multiple instances. For that,
you need either a traditional network file system like NFS or CIFS or a cluster
file system such as GlusterFS. These systems may be built within an OpenStack
cluster or provisioned outside of it, but OpenStack software does not provide
such features.

Image management

The Image service, code-named Glance, discovers,
registers, and retrieves virtual machine images. The service includes a RESTful API
that allows users to query VM image metadata and retrieve the actual image with HTTP
requests. You can also use the glance command-line tool, or the Python
API to accomplish the same tasks.

VM images made available through the Image service can be stored in a variety of
locations. The Image service supports the following back-end stores:

- Object Storage service (code-named Swift): The highly-available object storage
  project in OpenStack.
- File system: The default back-end that OpenStack Image Service uses to store
  virtual machine images is the file system back-end. This simple back-end writes
  image files to the local file system.
- S3: This back-end allows OpenStack Image Service to store virtual machine
  images in Amazon’s S3 service.
- HTTP: OpenStack Image Service can read virtual machine images that are
  available through HTTP somewhere on the Internet. This store is read
  only.
- Rados Block Device (RBD): This back-end stores images inside of a Ceph storage
  cluster using Ceph's RBD interface.
- GridFS: This back-end stores images inside of MongoDB.

You must have a working installation of the Image Service, with a working endpoint and
users created in the Identity Service. Also, you must source the environment variables
required by the Compute and Image clients.

Instance management

Instances are the running virtual machines within an
OpenStack cloud.

Interfaces to instance management

OpenStack provides command-line, web-based, and API-based
instance management. Additionally, a number of
third-party management tools are available for use
with OpenStack using either the native API or the
provided EC2 compatibility API.

nova CLI

The nova command
provided by the OpenStack python-novaclient
package is the basic command line utility for
users interacting with OpenStack. It is
available as a native package for most modern
Linux distributions, or the latest version can be
installed directly using the
pip Python package
installer:

sudo pip install python-novaclient

Full details for nova and other CLI tools are
provided in the OpenStack End User Guide. What follows is
the minimal introduction required to follow the CLI example in this chapter. In
the case of a conflict, the OpenStack End User Guide should be
considered authoritative (and a bug filed against this section).

To function, the
nova CLI needs the following information:

- Authentication URL: This can be passed as the --os_auth_url
  flag or using the OS_AUTH_URL environment variable.
- Tenant (sometimes referred to as project) name: This can be passed as the
  --os_tenant_name flag or using the OS_TENANT_NAME environment variable.
- User name: This can be passed as the --os_username
  flag or using the OS_USERNAME environment variable.
- Password: This can be passed as the --os_password
  flag or using the OS_PASSWORD environment variable.

For example, if you have your Identity Service
running on the default port (5000) on host
keystone.example.com and want to use the
nova CLI as the
user "demouser" with the password "demopassword"
in the "demoproject" tenant, you can export the
following values in your shell environment or pass
the equivalent command-line arguments (presuming these
identities already exist):

export OS_AUTH_URL="http://keystone.example.com:5000/v2.0/"
export OS_USERNAME=demouser
export OS_PASSWORD=demopassword
export OS_TENANT_NAME=demoproject

If you are using the Horizon
web dashboard, users can easily download
credential files like this with the correct values
for your particular implementation.

Horizon web dashboard

Horizon is the highly customizable and
extensible OpenStack web dashboard. The Horizon Project home page has detailed
information on deploying horizon.

Compute API

OpenStack provides a RESTful API for all functionality. Complete API
documentation is available at http://docs.openstack.org/api. The OpenStack
Compute API documentation refers to instances as "servers."

The nova CLI can be made to show the API
calls it is making by passing it the
--debug flag:

# nova --debug list
connect: (10.0.0.15, 5000)
send: 'POST /v2.0/tokens HTTP/1.1\r\nHost: 10.0.0.15:5000\r\nContent-Length: 116\r\ncontent-type: application/json\r\naccept-encoding: gzip, deflate\r\naccept: application/json\r\nuser-agent: python-novaclient\r\n\r\n{"auth": {"tenantName": "demoproject", "passwordCredentials": {"username": "demouser", "password": "demopassword"}}}'
reply: 'HTTP/1.1 200 OK\r\n'
header: Content-Type: application/json
header: Vary: X-Auth-Token
header: Date: Thu, 13 Sep 2012 20:27:36 GMT
header: Transfer-Encoding: chunked
connect: (128.52.128.15, 8774)
send: u'GET /v2/fa9dccdeadbeef23ae230969587a14bf/servers/detail HTTP/1.1\r\nHost: 10.0.0.15:8774\r\nx-auth-project-id: demoproject\r\nx-auth-token: deadbeef9998823afecc3d552525c34c\r\naccept-encoding: gzip, deflate\r\naccept: application/json\r\nuser-agent: python-novaclient\r\n\r\n'
reply: 'HTTP/1.1 200 OK\r\n'
header: X-Compute-Request-Id: req-bf313e7d-771a-4c0b-ad08-c5da8161b30f
header: Content-Type: application/json
header: Content-Length: 15
header: Date: Thu, 13 Sep 2012 20:27:36 GMT
+----+------+--------+----------+
| ID | Name | Status | Networks |
+----+------+--------+----------+
+----+------+--------+----------+

EC2 Compatibility API

In addition to the native compute API, OpenStack
provides an EC2-compatible API. This allows legacy
workflows built for EC2 to work with
OpenStack.

Third-party tools

Numerous third-party tools and
language-specific SDKs interact with
OpenStack clouds, both through native and
compatibility APIs. Though not OpenStack
projects, the following are some of
the more popular ones:

- euca2ools is a popular open source CLI for interacting with the EC2 API.
  This is convenient for multi-cloud environments where EC2 is the common API,
  or for transitioning from EC2 API-based clouds to OpenStack.
- hybridfox is a Firefox browser add-on that provides a graphical interface
  to many popular public and private cloud technologies.
- boto is a Python library for interacting with Amazon Web Services. It can
  be used to access OpenStack through the EC2 compatibility API.
- fog is the Ruby cloud services library and provides methods for interacting
  with a large number of cloud and virtualization platforms.
- php-opencloud is a PHP SDK that should work with most OpenStack-based cloud
  deployments and the Rackspace public cloud.

Building blocks

There are two fundamental requirements for a
computing system: software and hardware.
Virtualization and cloud frameworks tend to blur these
lines, and some of your "hardware" may actually be
"software," but conceptually you still need an
operating system and something to run it on.

Images

In OpenStack, the base operating system is
usually copied from an image stored in the Glance
image service. This is the most common case and
results in an ephemeral instance that starts from
a known template state and loses all accumulated
state on shutdown. It is also possible in special
cases to put an operating system on a persistent
"volume" in the Nova-Volume or Cinder volume
system. This gives a more traditional persistent
system that accumulates state, which is
preserved across restarts. To get a list of
available images on your system, run:

$ nova image-list
+--------------------------------------+-------------------------------+--------+--------------------------------------+
| ID                                   | Name                          | Status | Server                               |
+--------------------------------------+-------------------------------+--------+--------------------------------------+
| aee1d242-730f-431f-88c1-87630c0f07ba | Ubuntu 12.04 cloudimg amd64   | ACTIVE |                                      |
| 0b27baa1-0ca6-49a7-b3f4-48388e440245 | Ubuntu 12.10 cloudimg amd64   | ACTIVE |                                      |
| df8d56fc-9cea-4dfd-a8d3-28764de3cb08 | jenkins                       | ACTIVE |                                      |
+--------------------------------------+-------------------------------+--------+--------------------------------------+

The displayed image attributes are:

- ID: the automatically generated UUID of the image.
- Name: a free-form, human-readable name given to the image.
- Status: the status of the image. ACTIVE images are available for use.
- Server: for images that are created as snapshots of running instances,
  this is the UUID of the instance the snapshot derives from. For uploaded
  images, it is blank.

Flavors

Virtual hardware templates are called "flavors"
in OpenStack. The default install provides a range
of five flavors. These are configurable by admin
users (this too is configurable and may be
delegated by redefining the access controls for
"compute_extension:flavormanage" in
/etc/nova/policy.json on
the compute-api server). To get a list of
available flavors on your system, run:

$ nova flavor-list
+----+-----------+-----------+------+-----------+------+-------+-------------+
| ID | Name      | Memory_MB | Disk | Ephemeral | Swap | VCPUs | RXTX_Factor |
+----+-----------+-----------+------+-----------+------+-------+-------------+
| 1  | m1.tiny   | 512       | 1    | N/A       | 0    | 1     |             |
| 2  | m1.small  | 2048      | 20   | N/A       | 0    | 1     |             |
| 3  | m1.medium | 4096      | 40   | N/A       | 0    | 2     |             |
| 4  | m1.large  | 8192      | 80   | N/A       | 0    | 4     |             |
| 5  | m1.xlarge | 16384     | 160  | N/A       | 0    | 8     |             |
+----+-----------+-----------+------+-----------+------+-------+-------------+
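
Because users must pick from the flavors defined on their cloud, a script may want to choose the smallest flavor that satisfies a workload's needs. The sketch below does this in Python using the values from the listing above. The Flavor type and the smallest_fitting_flavor helper are hypothetical names for this example and are not part of python-novaclient; the tie-breaking order (memory, then vCPUs, then disk) is likewise an assumption.

```python
from typing import NamedTuple


class Flavor(NamedTuple):
    """One row of the flavor listing (hypothetical helper type)."""
    flavor_id: int
    name: str
    memory_mb: int
    disk_gb: int
    vcpus: int


# Values copied from the flavor listing shown above.
FLAVORS = [
    Flavor(1, "m1.tiny", 512, 1, 1),
    Flavor(2, "m1.small", 2048, 20, 1),
    Flavor(3, "m1.medium", 4096, 40, 2),
    Flavor(4, "m1.large", 8192, 80, 4),
    Flavor(5, "m1.xlarge", 16384, 160, 8),
]


def smallest_fitting_flavor(min_memory_mb, min_disk_gb, min_vcpus, flavors=FLAVORS):
    """Return the smallest flavor meeting all minimums, or None if none fits."""
    candidates = [
        f for f in flavors
        if f.memory_mb >= min_memory_mb
        and f.disk_gb >= min_disk_gb
        and f.vcpus >= min_vcpus
    ]
    if not candidates:
        return None
    # "Smallest" is decided by memory first, then vCPUs, then disk (an assumption).
    return min(candidates, key=lambda f: (f.memory_mb, f.vcpus, f.disk_gb))


choice = smallest_fitting_flavor(min_memory_mb=3000, min_disk_gb=10, min_vcpus=1)
print(choice.name if choice else "no flavor fits")
```

The flavor ID of the result is what would be passed to the --flavor option when booting an instance.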

Instances

For information about launching instances through
the nova command-line client, see the OpenStack End User
Guide.

Control where instances run

The
OpenStack Configuration
Reference provides detailed
information on controlling where your instances run,
including ensuring a set of instances run on different
compute nodes for service resiliency or on the same
node for high-performance inter-instance
communications.

Additionally, admin users can specify an exact
compute node to run on by specifying
--availability-zone
<availability-zone>:<compute-host>
on the command line. For example, to force an instance
to launch on the nova-1 compute
node in the default nova
availability zone:

# nova boot --image aee1d242-730f-431f-88c1-87630c0f07ba --flavor 1 --availability-zone nova:nova-1 testhost

Instance-specific data

For each instance, you can specify certain data
including authorized_keys key injection, user-data,
metadata service, and file injection.

For information, see the OpenStack End User
Guide.

Instance networking

For information, see the OpenStack End User
Guide.

Networking with nova-network

Understanding the networking configuration options helps
you design the best configuration for your Compute
instances.

Networking options

This section offers a brief overview of each concept
in networking for Compute. With the Grizzly release,
you can choose to either install and configure
nova-network for networking between
VMs or use the Networking service (neutron) for
networking. To configure Compute networking options
with Neutron, see the networking chapter of the
Cloud Administrator Guide.

For each VM instance, Compute assigns
a private IP address. (Currently, Compute with
nova-network only supports Linux
bridge networking that allows the virtual interfaces
to connect to the outside network through the physical
interface.)

The network controller with nova-network provides
virtual networks to enable compute servers to interact
with each other and with the public network.

Currently, Compute with nova-network supports these kinds of
networks, implemented in different “Network Manager”
types:

- Flat Network Manager
- Flat DHCP Network Manager
- VLAN Network Manager

These networks can co-exist in a cloud system.
However, because you can't yet select the type of
network for a given project, you cannot configure more
than one type of network in a given Compute
installation.

All networking options require network
connectivity to be already set up between
OpenStack physical nodes. OpenStack does not
configure any physical network interfaces.
OpenStack automatically creates all network
bridges (for example, br100) and VM virtual
interfaces.

All machines must have a public and an internal network interface
(controlled by the options
public_interface for the
public interface, and
flat_interface and
vlan_interface for the
internal interface with flat / VLAN
managers).

The internal network interface is used for
communication with VMs. It shouldn't have an IP
address attached to it before OpenStack
installation (it serves merely as a fabric where
the actual endpoints are VMs and dnsmasq). Also,
the internal network interface must be put in
promiscuous
mode, because it must receive
packets whose target MAC address is of the guest
VM, not of the host.

All the network managers configure the network using
network
drivers, for example the Linux L3 driver
(l3.py and
linux_net.py), which makes use
of iptables,
route, and other network
management facilities, and libvirt's network filtering facilities. The driver
isn't tied to any particular network manager; all
network managers use the same driver. The driver
usually initializes (creates bridges and so on) only
when the first VM lands on this host node.

All network managers operate in either single-host or multi-host mode. This
choice greatly influences the network configuration.
In single-host mode, there is a single instance of
nova-network, which is used as a
default gateway for VMs and hosts a single DHCP server
(dnsmasq), whereas in multi-host mode every compute
node has its own nova-network. In
either case, all traffic between VMs and the outside world
flows through nova-network. There
are pros and cons to both modes; read more in the
OpenStack Configuration
Reference.

Compute makes a distinction between fixed IPs and floating IPs for VM
instances. Fixed IPs are IP addresses that are
assigned to an instance on creation and stay the same
until the instance is explicitly terminated. By
contrast, floating IPs are addresses that can be
dynamically associated with an instance. A floating IP
address can be disassociated and associated with
another instance at any time. A user can reserve a
floating IP for their project.

In Flat Mode, a
network administrator specifies a subnet. The IP
addresses for VM instances are taken from the
subnet, and then injected into the image on launch.
Each instance receives a fixed IP address from the
pool of available addresses. A system administrator
may create the Linux networking bridge (typically
named br100, although this is
configurable) on the systems running the nova-network service.
All instances of the system are attached to the same
bridge, configured manually by the network
administrator.

The configuration injection currently only
works on Linux-style systems that keep
networking configuration in
/etc/network/interfaces.

In Flat DHCP Mode,
OpenStack starts a DHCP server (dnsmasq) to pass out
IP addresses to VM instances from the specified subnet
in addition to manually configuring the networking
bridge. IP addresses for VM instances are grabbed from
a subnet specified by the network
administrator.

Like Flat Mode, all instances are attached to a
single bridge on the compute node. In addition a DHCP
server is running to configure instances (depending on
single-/multi-host mode, alongside each nova-network). In
this mode, Compute does a bit more configuration in
that it attempts to bridge into an ethernet device
(flat_interface, eth0 by
default). It also runs and configures dnsmasq as a
DHCP server listening on this bridge, usually on IP
address 10.0.0.1 (see DHCP server: dnsmasq). For every instance,
nova allocates a fixed IP address and configures
dnsmasq with the MAC/IP pair for the VM. In other words,
dnsmasq doesn't take part in the IP address allocation
process; it only hands out IPs according to the
mapping done by nova. Instances receive their fixed
IPs by doing a dhcpdiscover. These IPs are not assigned to any of
the host's network interfaces, only to the VM's
guest-side interface.

In any setup with flat networking, the hosts running
nova-network are responsible for forwarding
traffic from the private network. Compute can determine
the NAT entries for each network, though sometimes NAT is not
used, such as when the network is configured with all
public IPs or a hardware router is used (one of the HA
options). Such hosts need to have
br100 configured and physically
connected to any other nodes that are hosting VMs. You
must set the flat_network_bridge option
or create networks with the bridge parameter in order to
avoid raising an error. Compute nodes have
iptables/ebtables entries created for each project and instance
to protect against IP/MAC address spoofing and ARP
poisoning.

In single-host Flat DHCP mode, you can ping VMs
through their fixed IP from the nova-network node, but
you cannot ping them
from the compute nodes. This is
expected behavior.

VLAN Network Mode is the
default mode for OpenStack Compute. In
this mode, Compute creates a VLAN and bridge for each
project. For multiple machine installation, the VLAN
Network Mode requires a switch that supports VLAN
tagging (IEEE 802.1Q). The project gets a range of
private IPs that are only accessible from inside the
VLAN. In order for a user to access the instances in
their project, a special VPN instance (code named
cloudpipe) needs to be created. Compute generates a
certificate and key for the user to access the VPN and
starts the VPN automatically. It provides a private
network segment for each project's instances that can
be accessed through a dedicated VPN connection from
the Internet. In this mode, each project gets its own
VLAN, Linux networking bridge, and subnet.

The subnets are specified by the network
administrator, and are assigned dynamically to a
project when required. A DHCP Server is started for
each VLAN to pass out IP addresses to VM instances
from the subnet assigned to the project. All instances
belonging to one project are bridged into the same
VLAN for that project. OpenStack Compute creates the
Linux networking bridges and VLANs when
required.

DHCP server: dnsmasq

The Compute service uses dnsmasq as the DHCP server when running
with either the Flat DHCP Network Manager or the VLAN
Network Manager. The nova-network service is responsible
for starting up dnsmasq processes.

The behavior of dnsmasq can be customized by
creating a dnsmasq configuration file. Specify the
config file using the
dnsmasq_config_file
configuration option. For example:

dnsmasq_config_file=/etc/dnsmasq-nova.conf

See the OpenStack Configuration
Reference for an example of
how to change the behavior of dnsmasq using a dnsmasq
configuration file. The dnsmasq documentation has a
more comprehensive dnsmasq configuration file example.

Dnsmasq also acts as a caching DNS server for
instances. You can explicitly specify the DNS server
that dnsmasq should use by setting the
dns_server configuration option
in /etc/nova/nova.conf. The
following example would configure dnsmasq to use
Google's public DNS
server:

dns_server=8.8.8.8

Dnsmasq logging output goes to the syslog (typically
/var/log/syslog or
/var/log/messages, depending
on Linux distribution). The dnsmasq logging output can
be useful for troubleshooting if VM instances boot
successfully but are not reachable over the
network.

A network administrator can run nova-manage
fixed reserve
--address=x.x.x.x
to specify the starting point IP address (x.x.x.x) to
reserve with the DHCP server. This reservation only
affects which IP address the VMs start at, not the
fixed IP addresses that the nova-network service
places on the bridges.

Metadata service

Introduction

The Compute service uses a special metadata
service to enable virtual machine instances to
retrieve instance-specific data. Instances access
the metadata service at
http://169.254.169.254. The
metadata service supports two sets of APIs: an
OpenStack metadata API and an EC2-compatible API.
Each of the APIs is versioned by date.

To retrieve a list of supported versions for the
OpenStack metadata API, make a GET request to
http://169.254.169.254/openstack

For example:

$ curl http://169.254.169.254/openstack
2012-08-10
latest
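
A consumer that understands several API versions could select one from a listing like the one above. The Python sketch below works on the response text only and makes no HTTP request, so it can run outside an instance. The choose_api_version function is a hypothetical helper, and "2013-04-04" stands in for a newer version that the consumer supports but this particular service does not list.

```python
def choose_api_version(listing, preferred):
    """Return the first entry of `preferred` that the service lists, or None.

    `listing` is the raw text returned by the version endpoint, one version
    per line. `preferred` is the consumer's supported versions, newest first.
    """
    available = {line.strip() for line in listing.splitlines() if line.strip()}
    for version in preferred:
        if version in available:
            return version
    return None


# Response body as shown in the example above.
listing = "2012-08-10\nlatest\n"

# The consumer prefers an assumed newer version and falls back to 2012-08-10.
print(choose_api_version(listing, ["2013-04-04", "2012-08-10"]))
```

Matching against an explicit preference list, rather than sorting the listing, avoids having to interpret entries such as latest that are not dates.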

To retrieve a list of supported versions for the
EC2-compatible metadata API, make a GET request to
http://169.254.169.254

For example:

$ curl http://169.254.169.254
1.0
2007-01-19
2007-03-01
2007-08-29
2007-10-10
2007-12-15
2008-02-01
2008-09-01
2009-04-04
latest

If you write a consumer for one of these APIs,
always attempt to access the most recent API
version supported by your consumer first, then
fall back to an earlier version if the most recent
one is not available.OpenStack metadata APIMetadata from the OpenStack API is distributed
in JSON format. To retrieve the metadata, make a
GET request to:http://169.254.169.254/openstack/2012-08-10/meta_data.jsonFor example:$curl http://169.254.169.254/openstack/2012-08-10/meta_data.json{"uuid": "d8e02d56-2648-49a3-bf97-6be8f1204f38", "availability_zone": "nova", "hostname": "test.novalocal", "launch_index": 0, "meta": {"priority": "low", "role": "webserver"}, "public_keys": {"mykey": "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAAgQDYVEprvtYJXVOBN0XNKVVRNCRX6BlnNbI+USLGais1sUWPwtSg7z9K9vhbYAPUZcq8c/s5S9dg5vTHbsiyPCIDOKyeHba4MUJq8Oh5b2i71/3BISpyxTBH/uZDHdslW2a+SrPDCeuMMoss9NFhBdKtDkdG9zyi0ibmCP6yMdEX8Q== Generated by Nova\n"}, "name": "test"}Here is the same content after having run
through a JSON pretty-printer:{
"availability_zone": "nova",
"hostname": "test.novalocal",
"launch_index": 0,
"meta": {
"priority": "low",
"role": "webserver"
},
"name": "test",
"public_keys": {
"mykey": "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAAgQDYVEprvtYJXVOBN0XNKVVRNCRX6BlnNbI+USLGais1sUWPwtSg7z9K9vhbYAPUZcq8c/s5S9dg5vTHbsiyPCIDOKyeHba4MUJq8Oh5b2i71/3BISpyxTBH/uZDHdslW2a+SrPDCeuMMoss9NFhBdKtDkdG9zyi0ibmCP6yMdEX8Q== Generated by Nova\n"
},
"uuid": "d8e02d56-2648-49a3-bf97-6be8f1204f38"
}

Instances also retrieve user data (passed as the
user_data parameter in the
API call or by the --user_data
flag in the nova boot command)
through the metadata service, by making a GET
request to:

http://169.254.169.254/openstack/2012-08-10/user_data

For example:

$ curl http://169.254.169.254/openstack/2012-08-10/user_data
#!/bin/bash
echo 'Extra user data here'

EC2 metadata API

The metadata service has an API that is
compatible with version 2009-04-04 of the Amazon EC2 metadata service; virtual
machine images that are designed for EC2 work
properly with OpenStack.

The EC2 API exposes a separate URL for each
metadata element. You can retrieve a listing of these
elements by making a GET query to:

http://169.254.169.254/2009-04-04/meta-data/

For example:

$ curl http://169.254.169.254/2009-04-04/meta-data/
ami-id
ami-launch-index
ami-manifest-path
block-device-mapping/
hostname
instance-action
instance-id
instance-type
kernel-id
local-hostname
local-ipv4
placement/
public-hostname
public-ipv4
public-keys/
ramdisk-id
reservation-id
security-groups

$ curl http://169.254.169.254/2009-04-04/meta-data/block-device-mapping/
ami

$ curl http://169.254.169.254/2009-04-04/meta-data/placement/
availability-zone

$ curl http://169.254.169.254/2009-04-04/meta-data/public-keys/
0=mykey

Instances can retrieve the public SSH key
(identified by keypair name when a user requests a
new instance) by making a GET request to:

http://169.254.169.254/2009-04-04/meta-data/public-keys/0/openssh-key

For example:

$ curl http://169.254.169.254/2009-04-04/meta-data/public-keys/0/openssh-key
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAAgQDYVEprvtYJXVOBN0XNKVVRNCRX6BlnNbI+USLGais1sUWPwtSg7z9K9vhbYAPUZcq8c/s5S9dg5vTHbsiyPCIDOKyeHba4MUJq8Oh5b2i71/3BISpyxTBH/uZDHdslW2a+SrPDCeuMMoss9NFhBdKtDkdG9zyi0ibmCP6yMdEX8Q== Generated by Nova

Instances can retrieve user data by making a GET
request to:

http://169.254.169.254/2009-04-04/user-data

For example:

$ curl http://169.254.169.254/2009-04-04/user-data
#!/bin/bash
echo 'Extra user data here'

Run the metadata service

The metadata service is implemented by either
the nova-api service or the
nova-api-metadata service. (The
nova-api-metadata service is
generally only used when running in multi-host
mode; see the OpenStack Configuration
Reference for details.) If you are
running the nova-api service, you must have
metadata as one of the
elements of the list of the
enabled_apis configuration
option in
/etc/nova/nova.conf. The
default enabled_apis
configuration setting includes the metadata
service, so you should not need to modify
it.

To allow instances to reach the metadata
service, the nova-network service configures
iptables to NAT port 80 of the
169.254.169.254 address to
the IP address specified in
metadata_host (default
$my_ip, which is the IP
address of the nova-network service) and port
specified in metadata_port
(default 8775) in
/etc/nova/nova.conf. The metadata_host
configuration option must be an IP
address, not a host name.

The default Compute service settings
assume that the nova-network service and
the nova-api service are
running on the same host. If this is not
the case, you must make the following
change in the
/etc/nova/nova.conf
file on the host running the nova-network
service:

Set the metadata_host
configuration option to the IP address of
the host where the nova-api
service is running.

Enable ping and SSH on VMs

Be sure you enable access to your VMs by using the
euca-authorize or nova
secgroup-add-rule command. The following
commands allow you to ping and
ssh to your VMs:

These commands need to be run as root only if
the credentials used to interact with nova-api have
been put under /root/.bashrc.
If the EC2 credentials have been put into another
user's .bashrc file, then it
is necessary to run these commands as that
user.

Using the nova command-line tool:

$ nova secgroup-add-rule default icmp -1 -1 0.0.0.0/0
$ nova secgroup-add-rule default tcp 22 22 0.0.0.0/0

Using euca2ools:

$ euca-authorize -P icmp -t -1:-1 -s 0.0.0.0/0 default
$ euca-authorize -P tcp -p 22 -s 0.0.0.0/0 default

If you still cannot ping or SSH to your instances after
issuing the nova secgroup-add-rule
commands, look at the number of
dnsmasq processes that are
running. If you have a running instance, check to see
that two dnsmasq processes are
running. If not, perform the following as root:

# killall dnsmasq
# service nova-network restart

Remove a network from a project

You cannot remove a network that has already been
associated with a project by simply deleting it.

To determine the project ID, you must have admin
rights. You can disassociate the project from the
network with a scrub command and the project ID as the
final parameter:

$ nova-manage project scrub --project=<id>

Multiple interfaces for your instances (multinic)

The multinic feature allows you to plug more than
one interface into your instances, which enables
several use cases:

SSL configurations (VIPs)
Services failover/HA
Bandwidth allocation
Administrative/public access to your instances

Each VIF is representative of a
separate network with its own IP block. Every network
mode introduces its own set of changes regarding
multinic usage.

Use the multinic feature

To use the multinic feature, first
create two networks, and attach them to your
project:

$ nova network-create first-net --fixed-range-v4=20.20.0.0/24 --project-id=$your-project
$ nova network-create second-net --fixed-range-v4=20.20.10.0/24 --project-id=$your-project
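
Because each interface is meant to sit on a separate network with its own IP block, it can be worth checking the two ranges before creating the networks. The following is a small sketch using Python's standard ipaddress module; the helper name owning_network and the sample addresses are assumptions for illustration, and nothing in Compute requires running it.

```python
import ipaddress

# The two fixed ranges passed to nova network-create above.
networks = {
    "first-net": ipaddress.ip_network("20.20.0.0/24"),
    "second-net": ipaddress.ip_network("20.20.10.0/24"),
}

# The two blocks must not overlap if each interface is to get
# an address from its own network.
overlap = networks["first-net"].overlaps(networks["second-net"])
print("ranges overlap:", overlap)


def owning_network(address, nets):
    """Return the name of the network that contains `address`, or None."""
    ip = ipaddress.ip_address(address)
    for name, net in nets.items():
        if ip in net:
            return name
    return None


# Sample addresses chosen from the two ranges for illustration.
for sample in ("20.20.0.3", "20.20.10.14"):
    print(sample, "belongs to", owning_network(sample, networks))
```

The same helper can be used to confirm which network an address reported for an instance was allocated from.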
Now every time you spawn a new instance, it gets
two IP addresses from the respective DHCP servers:

$ nova list
+-----+------------+--------+----------------------------------------+
| ID | Name | Status | Networks |
+-----+------------+--------+----------------------------------------+
| 124 | Server 124 | ACTIVE | network2=20.20.0.3; private=20.20.10.14|
+-----+------------+--------+----------------------------------------+

Make sure to power up the second
interface on the instance, otherwise the
instance won't be reachable through its second
IP. Here is an example of how to set up the
interfaces within the instance (this is
the configuration that needs to be applied
inside the image):

/etc/network/interfaces

# The loopback network interface
auto lo
iface lo inet loopback
auto eth0
iface eth0 inet dhcp
auto eth1
iface eth1 inet dhcp

If the Virtual Network Service Neutron is
installed, it is possible to specify the
networks to attach to the respective
interfaces by using the
--nic flag when
invoking the nova command:

$ nova boot --image ed8b2a37-5535-4a5f-a615-443513036d71 --flavor 1 --nic net-id=<id of first network> --nic net-id=<id of second network> test-vm1

Troubleshoot Networking

Can't reach floating IPs

If you aren't able to reach your instances
through the floating IP address, make sure the
default security group allows ICMP (ping) and SSH
(port 22), so that you can reach the
instances:

$ nova secgroup-list-rules default
+-------------+-----------+---------+-----------+--------------+
| IP Protocol | From Port | To Port | IP Range | Source Group |
+-------------+-----------+---------+-----------+--------------+
| icmp | -1 | -1 | 0.0.0.0/0 | |
| tcp | 22 | 22 | 0.0.0.0/0 | |
+-------------+-----------+---------+-----------+--------------+

Ensure the NAT rules have been added to iptables
on the node that nova-network is running on, as
root:

# iptables -L -nv
-A nova-network-OUTPUT -d 68.99.26.170/32 -j DNAT --to-destination 10.0.0.3

# iptables -L -nv -t nat
-A nova-network-PREROUTING -d 68.99.26.170/32 -j DNAT --to-destination 10.0.0.3
-A nova-network-floating-snat -s 10.0.0.3/32 -j SNAT --to-source 68.99.26.170

Check that the public address, in this example
"68.99.26.170", has been added to your public
interface. You should see the address in the
listing when you enter "ip addr" at the command
prompt.

$ ip addr
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
link/ether xx:xx:xx:17:4b:c2 brd ff:ff:ff:ff:ff:ff
inet 13.22.194.80/24 brd 13.22.194.255 scope global eth0
inet 68.99.26.170/32 scope global eth0
inet6 fe80::82b:2bf:fe1:4b2/64 scope link
valid_lft forever preferred_lft forever

Note that you cannot SSH to an instance with a
public IP from within the same server, because the
routing configuration won't allow it.

You can use tcpdump to
identify if packets are being routed to the
inbound interface on the compute host. If the
packets are reaching the compute hosts but the
connection is failing, the issue may be that the
packet is being dropped by reverse path filtering.
Try disabling reverse path filtering on the
inbound interface. For example, if the inbound
interface is eth2, as
root:

# sysctl -w net.ipv4.conf.eth2.rp_filter=0

If this solves your issue, add the following
line to /etc/sysctl.conf so
that the reverse path filter is disabled the next
time the compute host
reboots:

net.ipv4.conf.eth2.rp_filter=0

Disabling firewall

To help debug networking issues with reaching
VMs, you can disable the firewall by setting the
following option in /etc/nova/nova.conf:

firewall_driver=nova.virt.firewall.NoopFirewallDriver

We strongly recommend you remove the above line
to re-enable the firewall once your networking
issues have been resolved.

Packet loss from instances to nova-network server (VLANManager mode)

If you can SSH to your instances but you find
that network interactions with your instance are
slow, or if you find that certain
operations are slower than they should be (for
example, sudo), then there may
be packet loss occurring on the connection to the
instance.

Packet loss can be caused by Linux networking
configuration settings related to bridges. Certain
settings can cause packets to be dropped between
the VLAN interface (for example,
vlan100) and the associated
bridge interface (for example,
br100) on the host running
the nova-network service.

One way to check if this is the issue in your
setup is to open up three terminals and run the
following commands:

In the first terminal, on the host running
nova-network, use tcpdump to
monitor DNS-related traffic (UDP, port 53) on the
VLAN interface. As root:

# tcpdump -K -p -i vlan100 -v -vv udp port 53

In the second terminal, also on the host running
nova-network, use tcpdump to
monitor DNS-related traffic on the bridge
interface. As root:

# tcpdump -K -p -i br100 -v -vv udp port 53

In the third terminal, SSH inside of the
instance and generate DNS requests by using the
nslookup command:

$ nslookup www.google.com

The symptoms may be intermittent, so try running
nslookup multiple times. If
the network configuration is correct, the command
should return immediately each time. If it is not
functioning properly, the command hangs for
several seconds.

If the nslookup command
sometimes hangs, and there are packets that appear
in the first terminal but not the second, then the
problem may be due to filtering done on the
bridges. Try to disable filtering, as root:

# sysctl -w net.bridge.bridge-nf-call-arptables=0
# sysctl -w net.bridge.bridge-nf-call-iptables=0
# sysctl -w net.bridge.bridge-nf-call-ip6tables=0

If this solves your issue, add the following
lines to /etc/sysctl.conf so
that these changes take effect the next time the
host reboots:

net.bridge.bridge-nf-call-arptables=0
net.bridge.bridge-nf-call-iptables=0
net.bridge.bridge-nf-call-ip6tables=0

KVM: Network connectivity works initially, then fails

Some administrators have observed an issue with
the KVM hypervisor where instances running Ubuntu
12.04 sometimes lose network connectivity after
functioning properly for a period of time. Some
users have reported success with loading the
vhost_net kernel module as a workaround for this
issue (see bug #997978). This kernel module may
also improve network performance on KVM. To
load the kernel module, as
root:

# modprobe vhost_net

Loading the module has no effect on running
instances.

Volumes

The Block Storage service provides persistent block
storage resources that OpenStack Compute instances can
consume.

See the OpenStack Configuration
Reference for information about
configuring volume drivers and creating and attaching
volumes to server instances.

System administration

By understanding how the different installed nodes interact with each other you can
administer the Compute installation. Compute offers many ways to install using multiple
servers but the general idea is that you can have multiple compute nodes that control
the virtual servers and a cloud controller node that contains the remaining Compute
services.

The Compute cloud works through the interaction of a
series of daemon processes named nova-* that reside
persistently on the host machine or machines. These
binaries can all run on the same machine or be spread out
on multiple boxes in a large deployment. The
responsibilities of Services, Managers, and Drivers can
be a bit confusing at first. Here is an outline of the
division of responsibilities to make understanding the
system a little bit easier.

Currently, Services are nova-api, nova-objectstore (which can be replaced
with Glance, the OpenStack Image Service), nova-compute, and
nova-network.
Managers and Drivers are specified by configuration
options and loaded using utils.load_object(). Managers are
responsible for a certain aspect of the system. A manager is a
logical grouping of code relating to a portion of the
system. In general, other components should use the
manager to make changes to the components that it is
responsible for.

nova-api. Receives XML requests
and sends them to the rest of the system. It is a
WSGI app that routes and authenticates requests. It
supports the EC2 and OpenStack APIs. There is a
nova-api.conf file
created when you install Compute.

nova-objectstore. The
nova-objectstore service is an
ultra simple file-based storage system for images
that replicates most of the S3 API. It can be
replaced with OpenStack Image Service and a simple
image manager or use OpenStack Object Storage as
the virtual machine image storage facility. It
must reside on the same node as nova-compute.

nova-compute. Responsible for
managing virtual machines. It loads a Service
object which exposes the public methods on
ComputeManager through Remote Procedure Call
(RPC).

nova-network. Responsible for
managing floating and fixed IPs, DHCP, bridging
and VLANs. It loads a Service object which exposes
the public methods on one of the subclasses of
NetworkManager. Different networking strategies
are available to the service by changing the
network_manager configuration option to
FlatManager, FlatDHCPManager, or VlanManager
(default is VlanManager if no other is specified).

Compute service architecture

These basic categories describe the service
architecture and what's going on within the cloud
controller.

API Server

At the heart of the cloud framework is an API
Server. This API Server makes command and control
of the hypervisor, storage, and networking
programmatically available to users in realization
of the definition of cloud computing.

The API endpoints are basic HTTP web services
which handle authentication, authorization, and
basic command and control functions using various
API interfaces under the Amazon, Rackspace, and
related models. This enables API compatibility
with multiple existing tool sets created for
interaction with offerings from other vendors.
This broad compatibility prevents vendor
lock-in.

Message queue

A messaging queue brokers the interaction
between compute nodes (processing), the networking
controllers (software which controls network
infrastructure), API endpoints, the scheduler
(determines which physical hardware to allocate to
a virtual resource), and similar components.
Communication to and from the cloud controller is
by HTTP requests through multiple API
endpoints.

A typical message passing event begins with the
API server receiving a request from a user. The
API server authenticates the user and ensures that
the user is permitted to issue the subject
command. Availability of objects implicated in the
request is evaluated and, if available, the
request is routed to the queuing engine for the
relevant workers. Workers continually listen to
the queue based on their role, and occasionally
their type and host name. When such listening produces
a work request, the worker takes assignment of the
task and begins its execution. Upon completion, a
response is dispatched to the queue which is
received by the API server and relayed to the
originating user. Database entries are queried,
added, or removed as necessary throughout the
process.

Compute worker

Compute workers manage computing instances on
host machines. The API dispatches commands to
compute workers to complete the following
tasks:

Run instances
Terminate instances
Reboot instances
Attach volumes
Detach volumes
Get console output

Network Controller

The Network Controller manages the networking
resources on host machines. The API server
dispatches commands through the message queue,
which are subsequently processed by Network
Controllers. Specific operations include:

Allocate fixed IP addresses
Configure VLANs for projects
Configure networks for compute nodes

Manage Compute users

Access to the Euca2ools (EC2) API is controlled by
an access and secret key. The user’s access key needs
to be included in the request, and the request must be
signed with the secret key. Upon receipt of API
requests, Compute verifies the signature and runs
commands on behalf of the user.

To begin using Compute, you must create a user with the Identity Service.

Manage the cloud

A system administrator can use the following tools
to manage their cloud: the nova client, the
nova-manage command, and the Euca2ools
commands.

The nova-manage command can only be run by cloud
administrators. Both novaclient and euca2ools can be
used by all users, though specific commands may be
restricted by Role Based Access Control in the
Identity Management service.

To use the nova command-line tool

Installing python-novaclient gives you a
nova shell command that
enables Compute API interactions from the
command line. You install the client, provide
your user name and password (set as
environment variables for convenience), and
can then send commands
to your cloud from the command line.

To install python-novaclient, download the
tarball from http://pypi.python.org/pypi/python-novaclient/2.6.3#downloads
and then install it in your favorite python
environment.

$ curl -O http://pypi.python.org/packages/source/p/python-novaclient/python-novaclient-2.6.3.tar.gz
$ tar -zxvf python-novaclient-2.6.3.tar.gz
$ cd python-novaclient-2.6.3
$ sudo python setup.py install

Now that you have installed
python-novaclient, confirm the installation by
entering:

$ nova help
usage: nova [--debug] [--os-username OS_USERNAME] [--os-password OS_PASSWORD]
[--os-tenant-name OS_TENANT_NAME] [--os-auth-url OS_AUTH_URL]
[--os-region-name OS_REGION_NAME] [--service-type SERVICE_TYPE]
[--service-name SERVICE_NAME] [--endpoint-type ENDPOINT_TYPE]
[--version VERSION]
<subcommand> ...

This command returns a list of nova commands
and parameters. Set the required parameters as
environment variables to make running commands
easier. You can add
--os-username, for
example, on the nova command, or set it as an
environment variable:

$ export OS_USERNAME=joecool
$ export OS_PASSWORD=coolword
$ export OS_TENANT_NAME=coolu

Using the Identity Service, you are supplied
with an authentication endpoint, which nova
recognizes as the
OS_AUTH_URL.

$ export OS_AUTH_URL=http://hostname:5000/v2.0
$ export NOVA_VERSION=1.1

To use the nova-manage command

The nova-manage command may be used to perform
many essential functions for administration and
ongoing maintenance of nova, such as network
creation or user manipulation.

The man page for nova-manage has a good
explanation for each of its functions, and is
recommended reading for those starting out.
Access it by running:

$ man nova-manage

For administrators, the standard pattern for
executing a nova-manage command is:

$ nova-manage category command [args]

For example, to obtain a list of all
projects:

$ nova-manage project list

Run without arguments to see a list of
available command categories:

$ nova-manage

You can also run with a category argument
such as service to see a list of all commands in
that category:

$ nova-manage service

To use the euca2ools commands

For a command-line interface to EC2 API calls,
use the euca2ools command line tool. See http://open.eucalyptus.com/wiki/Euca2oolsGuide_v1.3

Manage logs

Logging module

Adding the following line to the
/etc/nova/nova.conf file
enables you to specify a configuration file to
change the logging behavior, in particular for
changing the logging level (such as,
DEBUG,
INFO,
WARNING,
ERROR):

log-config=/etc/nova/logging.conf

The log config file is an ini-style config file
that must contain a section called
logger_nova, which controls
the behavior of the logging facility in the
nova-* services. For
example:

[logger_nova]
level = INFO
handlers = stderr
qualname = nova

This example sets the logging level to
INFO (which is less verbose
than the default DEBUG
setting). See the Python documentation on logging configuration
file format for more details on this
file, including the meaning of the
handlers and
qualname variables. See
etc/nova/logging_sample.conf in the
openstack/nova repository on GitHub for an example
logging.conf file with
various handlers defined.

Syslog

You can configure OpenStack Compute services to send logging information to
syslog. This is useful if you want to use rsyslog, which forwards the logs to a
remote machine. You need to separately configure the Compute service (Nova), the
Identity service (Keystone), the Image service (Glance), and, if you are using
it, the Block Storage service (Cinder) to send log messages to syslog. To do so,
add the following lines to each of these files:

/etc/nova/nova.conf
/etc/keystone/keystone.conf
/etc/glance/glance-api.conf
/etc/glance/glance-registry.conf
/etc/cinder/cinder.conf

verbose = False
debug = False
use_syslog = True
syslog_log_facility = LOG_LOCAL0

In addition to enabling syslog, these settings
also turn off verbose output and debugging
output from the log.

While the example above uses the same
local facility for each service
(LOG_LOCAL0, which
corresponds to syslog facility
LOCAL0), we
recommend that you configure a separate
local facility for each service, as this
provides better isolation and more
flexibility. For example, you may want to
capture logging info at different severity
levels for different services. Syslog
allows you to define up to eight local
facilities, LOCAL0, LOCAL1, ...,
LOCAL7. See the syslog
documentation for more details.

Rsyslog

Rsyslog is a useful tool for setting up a
centralized log server across multiple machines.
We briefly describe the configuration to set up an
rsyslog server; a full treatment of rsyslog is
beyond the scope of this document. We assume
rsyslog has already been installed on your hosts,
which is the default on most Linux
distributions.

This example shows a minimal configuration for
/etc/rsyslog.conf on the
log server host, which receives the log
files:

# provides TCP syslog reception
$ModLoad imtcp
$InputTCPServerRun 1024

Add to /etc/rsyslog.conf a
filter rule that looks for a host name. The
example below uses
compute-01 as an
example of a compute host
name:

:hostname, isequal, "compute-01" /mnt/rsyslog/logs/compute-01.log

On the compute hosts, create a file named
/etc/rsyslog.d/60-nova.conf,
with the following
content.

# prevent debug from dnsmasq with the daemon.none parameter
*.*;auth,authpriv.none,daemon.none,local0.none -/var/log/syslog
# Specify a log level of ERROR
local0.error @@172.20.1.43:1024

Once you have created this file, restart your
rsyslog daemon. Error-level log messages on the
compute hosts should now be sent to your log
server.

Migration

Before starting migrations, review the Configure migrations section in OpenStack Configuration
Reference.

Migration provides a scheme to migrate running
instances from one OpenStack Compute server to another
OpenStack Compute server.

To migrate instances

Look at the running instances to get the ID
of the instance you wish to migrate.

# nova list

Look at the information associated with that
instance (vm1 in this example).

# nova show d1df1b5a-70c4-4fed-98b7-423362f2c47c

In this example, vm1 is running on
HostB.

Select the server to migrate instances
to.

# nova-manage service list

In this example, HostC can be selected
because nova-compute is running on
it.

Ensure that HostC has enough resources for
migration.

# nova-manage service describe_resource HostC

cpu: the number of CPUs
mem(mb): total amount of memory (MB)
hdd: total amount of space for NOVA-INST-DIR/instances (GB)

The first line shows the total amount of resources the physical server has.
The second line shows the resources currently in use.
The third line shows the maximum resources used.
The fourth line and below show the resources used by each project.

Use the nova
live-migration command to
migrate the instances.

# nova live-migration d1df1b5a-70c4-4fed-98b7-423362f2c47c HostC

Make sure instances are migrated
successfully with nova
list. If instances are still running
on HostB, check the log files (source and destination
nova-compute and nova-scheduler) to determine
why.

While the nova command is called
live-migration,
under the default Compute
configuration options the instances
are suspended before migration.

See Configure migrations in OpenStack Configuration Reference for more details.

Recover from a failed compute node

If you have deployed Compute with a shared file system, you can quickly recover
from a failed compute node. Of the two methods covered in the following sections,
the evacuate API is the preferred method even in the absence of shared storage. The
evacuate API provides many benefits over manual recovery, such as re-attachment of
volumes and floating IPs.

Manual recovery

For KVM/libvirt compute node recovery, see the previous section. Use the
following procedure for other hypervisors.

To work with host information

Identify the VMs on the affected hosts,
using tools such as a combination of
nova list and
nova show or
euca-describe-instances.
Here's an example using the EC2 API:
instance i-000015b9 is running on
node np-rcc54:

i-000015b9 at3-ui02 running nectarkey (376, np-rcc54) 0 m1.xxlarge 2012-06-19T00:48:11.000Z 115.146.93.60

You can review the status of the host by
using the nova database. Some of the
important information is shown
below. This example converts an EC2 API
instance ID into an OpenStack ID (the hexadecimal
suffix 15b9 corresponds to the decimal ID 5561); if you
used the nova commands,
you can substitute the ID directly. You
can find the credentials for your database
in
/etc/nova/nova.conf.

SELECT * FROM instances WHERE id = CONV('15b9', 16, 10) \G;
*************************** 1. row ***************************
created_at: 2012-06-19 00:48:11
updated_at: 2012-07-03 00:35:11
deleted_at: NULL
...
id: 5561
...
power_state: 5
vm_state: shutoff
...
hostname: at3-ui02
host: np-rcc54
...
uuid: 3f57699a-e773-4650-a443-b4b37eed5a06
...
task_state: NULL
...

To recover the VM

Armed with the information of VMs on the
failed host, determine to which compute
host the affected VMs should move. Run the
following database command to move the VM
to np-rcc46:

UPDATE instances SET host = 'np-rcc46' WHERE uuid = '3f57699a-e773-4650-a443-b4b37eed5a06';

Next, if using a hypervisor that relies on libvirt (such as KVM), it is
a good idea to update the libvirt.xml file (found in
/var/lib/nova/instances/[instance ID]). The
important changes to make are to change the
DHCPSERVER value to the host IP address of the
Compute host that is the VM's new home, and update the VNC IP if it isn't
already 0.0.0.0.

Next, reboot the VM:

$ nova reboot --hard 3f57699a-e773-4650-a443-b4b37eed5a06

In theory, the above database update and nova
reboot command are all that is required to recover the VMs
from a failed host. However, if further problems occur, consider looking
at recreating the network filter configuration using
virsh, restarting the Compute services or
updating the vm_state and
power_state in the Compute database.

Recover from a UID/GID mismatch

When running OpenStack Compute, using a shared file
system or an automated configuration tool, you could
encounter a situation where some files on your compute
node are using the wrong UID or GID. This causes a
raft of errors, such as being unable to live migrate,
or start virtual machines.

The following is a basic procedure run on
nova-compute hosts, based on the KVM
hypervisor, that could help to restore the
situation:

To recover from a UID/GID mismatch

Make sure you don't use numbers that are
already used for some other user/group.

Set the nova uid in
/etc/passwd to the
same number in all hosts (for example,
112).

Set the libvirt-qemu uid in
/etc/passwd to the
same number in all hosts (for example,
119).

Set the nova group in
/etc/group to
the same number in all hosts (for example,
120).

Set the libvirtd group in
/etc/group to
the same number in all hosts (for example,
119).

Stop the services on the compute
node.

Change all the files owned by user nova or
by group nova. For example:

find / -uid 108 -exec chown nova {} \; # note the 108 here is the old nova uid before the change
find / -gid 120 -exec chgrp nova {} \;

Repeat the steps for the libvirt-qemu owned
files if those needed to change.

Restart the services.

Now you can run the find
command to verify that all files use the
correct identifiers.

Compute disaster recovery process

This section describes how to manage your cloud
after a disaster, and how to easily back up the
persistent storage volumes. Backups are mandatory,
even outside of disaster scenarios.

For reference, you can find a DRP definition here:
http://en.wikipedia.org/wiki/Disaster_Recovery_Plan.

A - The disaster recovery process presentation

A disaster could happen to several components of
your architecture: a disk crash, a network loss, a
power cut, and so on. In this example, assume the
following setup:

A cloud controller (nova-api,
nova-objectstore, nova-network)
A compute node (nova-compute)
A Storage Area Network (SAN) used by
cinder-volumes

The disaster example is the worst one: a power
loss. That power loss applies to the three
components. Let's see what
runs and how it runs before the
crash:

From the SAN to the cloud controller, we
have an active iSCSI session (used for the
"cinder-volumes" LVM's VG).

From the cloud controller to the compute
node we also have active iSCSI sessions
(managed by cinder-volume).

For every volume an iSCSI session is
made (so 14 EBS volumes equals 14
sessions).

From the cloud controller to the compute
node, we also have iptables/ebtables
rules which allow access from the
cloud controller to the running
instance.

Finally, saved in the database, we have
the current state of the instances (in
this case "running") and their volume
attachments (mount point, volume ID, volume
status, and so on).

Now, after the power loss occurs and all
hardware components restart, the situation is as
follows:
From the SAN to the cloud controller, the iSCSI session no longer exists.
From the cloud controller to the compute node, the iSCSI sessions no longer exist.
From the cloud controller to the compute node, the iptables and ebtables rules are recreated, because nova-network reapplies the configurations at boot.
From the cloud controller, instances turn into a shutdown state (because they are no longer running).
In the database, data was not updated at all, because Compute could not have anticipated the crash.
Before going further, and to prevent the administrator from making fatal mistakes: the instances are not lost. Because no "destroy" or "terminate" command was invoked, the files for the instances remain on the compute node.
The plan is to perform the following tasks, in
that exact order. Any extra step would be dangerous at this stage:
1. Get the current relation from a volume to its instance, so that you can recreate the attachment.
2. Update the database to clean the stalled state. (After that, you cannot perform the first step.)
3. Restart the instances. In other words, go from a shutdown to a running state.
4. After the restart, reattach the volumes to their respective instances.
5. Optionally, SSH into the instances to reboot them.
B - The disaster recovery procedure
Instance-to-volume relation
First, we need to get the current relation from a volume to its instance, because we use it to recreate the attachment.
This relation can be determined by running nova volume-list (note that the nova client can get volume information from cinder).
Update the database
Second, we need to update the database in order to clean the
stalled state. Now that we have saved the attachments we need to
restore for every volume, the database can be cleaned with the
following queries:
mysql> use cinder;
mysql> update volumes set mountpoint=NULL;
mysql> update volumes set status="available" where status <>"error_deleting";
mysql> update volumes set attach_status="detached";
mysql> update volumes set instance_id=0;
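The queries above can also be collected into one script, which is easier to review before it touches the database. This is a minimal sketch, not a tested procedure: the function name cleanup_sql is hypothetical, the wrapping transaction assumes that the volumes table uses a transactional storage engine such as InnoDB, and the mysql invocation shown in the comment assumes that client credentials for the cinder database are already configured.

```shell
#!/bin/bash
# Sketch: emit the cleanup statements from this section as one transaction.
# cleanup_sql is a hypothetical helper name, not part of any OpenStack tool.
cleanup_sql() {
    cat <<'EOF'
START TRANSACTION;
UPDATE volumes SET mountpoint=NULL;
UPDATE volumes SET status="available" WHERE status <> "error_deleting";
UPDATE volumes SET attach_status="detached";
UPDATE volumes SET instance_id=0;
COMMIT;
EOF
}

# Review the statements first:
#   cleanup_sql
# Then apply them (assumes the mysql client can reach the cinder database):
#   cleanup_sql | mysql cinder
```

Printing the statements before piping them to the client gives one last chance to confirm that the attachments were saved, since the first step can no longer be performed afterwards.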
Now, when running nova volume-list, all volumes should
be available.
Restart instances
You can restart the instances with a simple nova reboot $instance
At that stage, depending on your image, some instances completely
reboot and become reachable, while others stop at the "plymouth"
stage.
DO NOT reboot a second time the ones that are stopped at that stage (see the fourth step, below). Whether an instance stops there depends on whether you added an /etc/fstab entry for that volume. Images built with the cloud-init package remain in a pending state, while others skip the missing volume and start. (More information is available on help.ubuntu.com.) The idea of that stage is only to ask nova to reboot every instance, so that the stored state is preserved.
Reattach volumes
After the restart, we can reattach the volumes to their respective
instances. Now that nova has restored the right status, it is time
to perform the attachments through a nova
volume-attach
Here is a simple snippet that uses the file we created:
#!/bin/bash
# File saved earlier, passed as the first argument; one attachment per line:
# <volume> <instance> <mount_point>
volumes_tmp_file="${1:?usage: $0 <attachments-file>}"
while read -r volume instance mount_point; do
    echo "ATTACHING VOLUME FOR INSTANCE - $instance"
    # Redirect stdin so that nova cannot consume the loop's input
    nova volume-attach "$instance" "$volume" "$mount_point" < /dev/null
    sleep 2
done < "$volumes_tmp_file"
At that stage, instances that were pending on the boot sequence
(plymouth) automatically continue their boot and start normally, while the ones that
booted see the volume.
SSH into instances
If some services depend on the volume, or if a volume has an entry
in fstab, it could be good to simply restart the instance. This
restart needs to be made from the instance itself, not through nova.
So, we SSH into the instance and perform a reboot:
# shutdown -r now
By completing this procedure, you will have successfully
recovered your cloud.
Here are some suggestions:
Use the errors=remount-ro parameter in the fstab file, which prevents data corruption.
The system locks any write to the disk if it detects an I/O error. This configuration option should be added on the cinder-volume server (the one that performs the iSCSI connection to the SAN), and also in the instances' fstab file.
Do not add the entry for the SAN's disks to the cinder-volume server's fstab file.
Some systems hang on that step, which means you could lose access to your cloud controller. To re-run the session manually, run the following commands before performing the mount:
# iscsiadm -m discovery -t st -p $SAN_IP
# iscsiadm -m node --targetname $IQN -p $SAN_IP -l
For your instances, if you have the
whole /home/ directory on the disk, do not empty the /home directory and map the disk onto it. Instead, leave a user's directory with the user's bash files and the authorized_keys file.
This enables you to connect to the instance, even without the volume attached, if you allow only connections through public keys.
C - Scripted DRP
A bash script that performs these five steps is available for download.
The "test mode" allows you to perform that whole sequence for only one instance.
To reproduce the power loss, connect to the compute node that runs that same instance and close the iSCSI session. Do not detach the volume through nova volume-detach; instead, manually close the iSCSI session.
In the following example, the iSCSI session is number 15 for that instance:
$ iscsiadm -m session -u -r 15
Do not forget the -r flag; otherwise, you close ALL sessions.
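To reduce the risk of closing the wrong session, the session number can be looked up from the target name instead of being typed by hand. The following is only a sketch under stated assumptions: session_id_for_target is a hypothetical helper, and it assumes that iscsiadm -m session prints one line per session in the form "tcp: [15] 10.0.0.2:3260,1 iqn...". The address and IQN in the comment are invented for illustration, so check the output on your own system before relying on it.

```shell
#!/bin/bash
# Sketch: print the session number(s) whose target name matches $1.
# Reads `iscsiadm -m session` style lines on stdin; assumed line format:
#   tcp: [15] 10.0.0.2:3260,1 iqn.2010-10.org.openstack:volume-00000001
# session_id_for_target is a hypothetical helper name.
session_id_for_target() {
    # Field 4 is the target name, field 2 the bracketed session number.
    awk -v target="$1" '$4 == target { print $2 }' | tr -d '[]'
}

# Example usage on the compute node ($IQN is the volume's target name):
#   sid=$(iscsiadm -m session | session_id_for_target "$IQN")
#   [ -n "$sid" ] && iscsiadm -m session -u -r "$sid"
```

The guard on an empty result matters: the logout command runs only when a session number was actually found, so it is never invoked without a value for -r.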