<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE section [
<!ENTITY % openstack SYSTEM "../../common/entities/openstack.ent">
%openstack;
]>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="technical-considerations-general-purpose">
<?dbhtml stop-chunking?>
<title>Technical considerations</title>
<para>General purpose clouds are expected to
include these base services:</para>
<itemizedlist>
<listitem>
<para>Compute</para>
</listitem>
<listitem>
<para>Network</para>
</listitem>
<listitem>
<para>Storage</para>
</listitem>
</itemizedlist>
<para>Each of these services has different resource requirements.
As a result, you must make design decisions relating directly
to the service, as well as provide a balanced infrastructure for
all services.</para>
<para>Take into consideration the unique aspects of each service, as
its individual characteristics and scale can impact the hardware
selection process. Generate hardware designs for each of the
services.</para>
<para>Hardware decisions are also made in relation to network architecture
and facilities planning. These factors play heavily into
the overall architecture of an OpenStack cloud.</para>

<section xml:id="designing-compute-resources-tech-considerations">
|
||
<title>Compute resource design</title>
|
||
<para>When designing compute resource pools, a number of factors
|
||
can impact your design decisions. Factors such as number of processors,
|
||
amount of memory, and the quantity of storage required for each hypervisor
|
||
must be taken into account.</para>
|
||
<para>You will also need to decide whether to provide compute resources
|
||
in a single pool or in multiple pools. In most cases, multiple pools
|
||
of resources can be allocated and addressed on demand. A compute design
|
||
that allocates multiple pools of resources makes best use of application
|
||
resources, and is commonly referred to as
|
||
<firstterm>bin packing</firstterm>.</para>
|
||
<para>In a bin packing design, each independent resource pool provides service
|
||
for specific flavors. This helps to ensure that, as instances are scheduled
|
||
onto compute hypervisors, each independent node's resources will be allocated
|
||
in a way that makes the most efficient use of the available hardware. Bin
|
||
packing also requires a common hardware design, with all hardware nodes within
|
||
a compute resource pool sharing a common processor, memory, and storage layout.
|
||
This makes it easier to deploy, support, and maintain nodes throughout their
|
||
life cycle.</para>
|
||
<para>An <firstterm>overcommit ratio</firstterm> is the ratio of available
virtual resources to available physical resources. This ratio is
configurable for CPU and memory. The default CPU overcommit ratio is 16:1, and
the default memory overcommit ratio is 1.5:1. Determining the tuning of the
overcommit ratios during the design phase is important as it has a direct
impact on the hardware layout of your compute nodes.</para>
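<para>For example, these ratios are tuned through the Compute scheduler
configuration. The following is a minimal sketch of the relevant settings
in <filename>nova.conf</filename>; the values shown are the defaults and
would be adjusted to match your hardware layout:</para>
<programlisting language="ini"># nova.conf (scheduler settings, shown with their default values)
# Virtual CPUs allocated per physical core
cpu_allocation_ratio = 16.0
# Virtual RAM allocated per unit of physical RAM
ram_allocation_ratio = 1.5</programlisting>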
<para>When selecting a processor, compare features and performance
characteristics. Some processors include features specific to virtualized
compute hosts, such as hardware-assisted virtualization, and technology
related to memory paging (also known as EPT shadowing). These types of features
can have a significant impact on the performance of your virtual machine.</para>
<para>You will also need to consider the compute requirements of non-hypervisor
nodes (sometimes referred to as resource nodes). This includes controller, object
storage, and block storage nodes, and networking services.</para>
<para>The number of processor cores and threads impacts the number of worker
threads which can be run on a resource node. Design decisions must relate
directly to the service being run on it, as well as provide a balanced
infrastructure for all services.</para>
<para>Workload can be unpredictable in a general purpose cloud, so consider
including the ability to add additional compute resource pools on demand.
In some cases, however, the demand for certain instance types or flavors may not
justify individual hardware design. In either case, start by allocating
hardware designs that are capable of servicing the most common instance
requests. If you want to add additional hardware to the overall architecture,
this can be done later.</para>
</section>

<section xml:id="designing-network-resources-tech-considerations">
<title>Designing network resources</title>
<para>OpenStack clouds generally have multiple network segments, with
each segment providing access to particular resources. The network services
themselves also require network communication paths which should
be separated from the other networks. When designing network services
for a general purpose cloud, plan for either a physical or logical
separation of network segments used by operators and tenants. You can also
create an additional network segment for access to internal services such as
the message bus and database used by various services. Segregating these
services onto separate networks helps to protect sensitive data and protects
against unauthorized access to services.</para>
<para>Choose a networking service based on the requirements of your instances.
The architecture and design of your cloud will impact whether you choose
OpenStack Networking (neutron) or legacy networking (nova-network).</para>
<variablelist>
<varlistentry>
<term>Legacy networking (nova-network)</term>
<listitem>
<para>The legacy networking (nova-network) service is primarily a
layer-2 networking service that functions in two modes, which
use VLANs in different ways. In flat network mode, all
network hardware nodes and devices throughout the cloud are connected
to a single layer-2 network segment that provides access to
application data.</para>
<para>When the network devices in the cloud support segmentation
using VLANs, legacy networking can operate in the second mode. In
this design model, each tenant within the cloud is assigned a
network subnet which is mapped to a VLAN on the physical
network. It is especially important to remember the maximum of
4096 VLANs which can be used within a spanning tree
domain. This places a hard limit on the amount of
growth possible within the data center. When designing a
general purpose cloud intended to support multiple tenants, we
recommend the use of legacy networking with VLANs, and
not in flat network mode.</para>
</listitem>
</varlistentry>
</variablelist>
<para>Another consideration regarding networking is that
legacy networking is entirely managed by the cloud operator;
tenants do not have control over network resources. If tenants
require the ability to manage and create network resources
such as network segments and subnets, it will be necessary to
install the OpenStack Networking service to provide network
access to instances.</para>
<variablelist>
<varlistentry>
<term>OpenStack Networking (neutron)</term>
<listitem>
<para>OpenStack Networking (neutron) is a first-class networking
service that gives tenants full control over the creation of
virtual network resources. This is often accomplished in
the form of tunneling protocols which establish
encapsulated communication paths over existing network
infrastructure in order to segment tenant traffic. These
methods vary depending on the specific implementation, but
some of the more common methods include GRE tunneling,
VXLAN encapsulation, and VLAN tagging (a configuration sketch
follows this list).</para>
</listitem>
</varlistentry>
</variablelist>
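<para>As an illustration, the following is a minimal sketch of how these
segmentation methods might be enabled with the ML2 plug-in in
<filename>ml2_conf.ini</filename>. The physical network name and the ID
ranges are illustrative and depend on your physical network design:</para>
<programlisting language="ini"># ml2_conf.ini (fragment)
[ml2]
# Enable the segmentation types discussed above
type_drivers = flat,vlan,gre,vxlan
tenant_network_types = vxlan

[ml2_type_vlan]
# Physical network name and VLAN ID range are illustrative
network_vlan_ranges = physnet1:1000:2999

[ml2_type_gre]
tunnel_id_ranges = 1:1000

[ml2_type_vxlan]
vni_ranges = 1001:2000</programlisting>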
<para>We recommend you design at least three network segments:</para>
<itemizedlist>
<listitem>
<para>The first segment is a public network, used for access to REST APIs
by tenants and operators. The controller nodes and swift
proxies are the only devices connecting to this network segment. In some
cases, this network might also be serviced by hardware load balancers
and other network devices.</para>
</listitem>
<listitem>
<para>The second segment is used by administrators to manage hardware resources.
Configuration management tools also use this for deploying software and
services onto new hardware. In some cases, this network segment might also be
used for internal services, including the message bus and database services.
This network needs to communicate with every hardware node.
Due to the highly sensitive nature of this network segment, you also need to
secure this network from unauthorized access.</para>
</listitem>
<listitem>
<para>The third network segment is used by applications and consumers to access
the physical network, and for users to access applications. This network is
segregated from the one used to access the cloud APIs and is not
capable of communicating directly with the hardware resources in the cloud.
Compute resource nodes and network gateway services which allow application
data to access the physical network from outside of the cloud need to
communicate on this network segment.</para>
</listitem>
</itemizedlist>
</section>

<section xml:id="designing-openstack-object-storage-tech-considerations">
<title>Designing OpenStack Object Storage</title>
<para>When designing hardware resources for OpenStack Object
Storage, the primary goal is to maximize the amount of storage
in each resource node while also ensuring that the cost per
terabyte is kept to a minimum. This often involves utilizing
servers which can hold a large number of spinning disks.
Whether choosing to use 2U server form factors with directly
attached storage or an external chassis that holds a larger
number of drives, the main goal is to maximize the storage
available in each node.</para>
<note>
<para>We do not recommend investing in enterprise class drives
for an OpenStack Object Storage cluster. The consistency and
partition tolerance characteristics of OpenStack Object
Storage ensure that data stays up to date and survives
hardware faults without the use of any specialized data
replication devices.</para>
</note>
<para>One of the benefits of OpenStack Object Storage is the ability
to mix and match drives by making use of weighting within the
swift ring. When designing your swift storage cluster, we
recommend making use of the most cost effective storage
solution available at the time.</para>
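<para>Weights are assigned when devices are added to the ring. The
following sketch shows how two drive sizes might coexist in the same
ring; the addresses and device names are hypothetical:</para>
<programlisting language="bash"># Give a 3 TB drive proportionally more weight than a 2 TB drive
swift-ring-builder object.builder add z1-10.0.1.10:6000/sdb 300
swift-ring-builder object.builder add z2-10.0.1.11:6000/sdb 200
swift-ring-builder object.builder rebalance</programlisting>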
<para>To achieve durability and availability of data stored as objects,
design object storage resource pools so that they can provide the
suggested availability. Consider rack-level and zone-level
designs to accommodate the number of replicas configured to be stored in the
Object Storage service (the default number of replicas is three)
when designing beyond the hardware node level. Each replica of
data should exist in its own availability zone with its own
power, cooling, and network resources available to service
that specific zone.</para>
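<para>The replica count and zone layout are expressed when the ring is
built. A minimal sketch, assuming the default of three replicas and one
device per zone (the addresses are hypothetical):</para>
<programlisting language="bash"># Create a ring with 2^10 partitions, 3 replicas,
# and a minimum of 1 hour between moves of a partition
swift-ring-builder object.builder create 10 3 1
# Place one device in each of three availability zones
swift-ring-builder object.builder add z1-10.0.1.10:6000/sdb 100
swift-ring-builder object.builder add z2-10.0.2.10:6000/sdb 100
swift-ring-builder object.builder add z3-10.0.3.10:6000/sdb 100
swift-ring-builder object.builder rebalance</programlisting>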
<para>Object storage nodes should be designed so that the number
of requests does not hinder the performance of the cluster.
The object storage service uses a chatty protocol, so
making use of multiple processors with higher core counts
helps ensure that I/O requests do not inundate the server.</para>
</section>

<section xml:id="designing-openstack-block-storage">
<title>Designing OpenStack Block Storage</title>
<para>When designing OpenStack Block Storage resource nodes, it is
helpful to understand the workloads and requirements that will
drive the use of block storage in the cloud. We recommend designing
block storage pools so that tenants can choose appropriate storage
solutions for their applications. By creating multiple storage pools of different
types, in conjunction with configuring an advanced storage
scheduler for the block storage service, it is possible to
provide tenants with a large catalog of storage services with
a variety of performance levels and redundancy options.</para>
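<para>One way to expose such a catalog is through the Block Storage
multi-backend feature combined with volume types. The following is a
sketch only; the backend names, driver choice, and type names are
hypothetical:</para>
<programlisting language="ini"># cinder.conf (fragment)
[DEFAULT]
enabled_backends = pool-fast,pool-capacity

[pool-fast]
volume_driver = cinder.volume.drivers.lvm.LVMISCSIDriver
volume_backend_name = POOL_FAST

[pool-capacity]
volume_driver = cinder.volume.drivers.lvm.LVMISCSIDriver
volume_backend_name = POOL_CAPACITY</programlisting>
<para>Tenants then select a pool by choosing the matching volume
type:</para>
<programlisting language="bash">cinder type-create fast
cinder type-key fast set volume_backend_name=POOL_FAST
cinder create --volume-type fast --display-name myvolume 10</programlisting>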
<para>Block storage also takes advantage of a number of enterprise storage
solutions. These are addressed via a plug-in driver developed by the
hardware vendor. A large number of
enterprise storage plug-in drivers ship out-of-the-box with
OpenStack Block Storage (and many more are available via third
party channels). While general purpose clouds are more likely to use
directly attached storage in the majority of block storage nodes,
it may also be necessary to provide additional levels of service to
tenants which can only be provided by enterprise class storage
solutions.</para>
<para>Redundancy and availability requirements impact the decision to use
a RAID controller card in block storage nodes. The input/output operations
per second (IOPS) demand of your application will influence whether or not
you should use a RAID controller, and which level of RAID is required.
Making use of higher performing RAID volumes is suggested when
considering performance. However, where redundancy of
block storage volumes is more important, we recommend
making use of a redundant RAID configuration such as RAID 5 or
RAID 6. Some specialized features, such as automated
replication of block storage volumes, may require the use of
third-party plug-ins and enterprise block storage solutions in
order to meet the high demand on storage. Furthermore,
where extreme performance is a requirement, it may also be
necessary to make use of high speed SSD disk drives or high
performing flash storage solutions.</para>
</section>

<section xml:id="software-selection-tech-considerations">
<title>Software selection</title>
<para>The software selection process plays a large role in the
architecture of a general purpose cloud. The following have
a large impact on the design of the cloud:</para>
<itemizedlist>
<listitem>
<para>Choice of operating system</para>
</listitem>
<listitem>
<para>Selection of OpenStack software components</para>
</listitem>
<listitem>
<para>Choice of hypervisor</para>
</listitem>
<listitem>
<para>Selection of supplemental software</para>
</listitem>
</itemizedlist>
<para>Operating system (OS) selection plays a large role in the
design and architecture of a cloud. There are a number of OSes
which have native support for OpenStack including:</para>
<itemizedlist>
<listitem>
<para>Ubuntu</para>
</listitem>
<listitem>
<para>Red Hat Enterprise Linux (RHEL)</para>
</listitem>
<listitem>
<para>CentOS</para>
</listitem>
<listitem>
<para>SUSE Linux Enterprise Server (SLES)</para>
</listitem>
</itemizedlist>
<note>
<para>Native support is not a constraint on the choice of OS; users are
free to choose just about any Linux distribution (or even
Microsoft Windows) and install OpenStack directly from source
(or compile their own packages). However, many organizations will
prefer to install OpenStack from distribution-supplied packages or
repositories (although using the distribution vendor's OpenStack
packages might be a requirement for support).</para>
</note>
<para>OS selection also directly influences hypervisor selection.
A cloud architect who selects Ubuntu, RHEL, or SLES has some
flexibility in hypervisor; KVM, Xen, and LXC are supported
virtualization methods available under OpenStack Compute
(nova) on these Linux distributions. However, a cloud architect
who selects Hyper-V is limited to Windows Server. Similarly, a
cloud architect who selects XenServer is limited to the CentOS-based
dom0 operating system provided with XenServer.</para>
<para>The primary factors that play into OS-hypervisor selection
include:</para>
<variablelist>
<varlistentry>
<term>User requirements</term>
<listitem>
<para>The selection of OS-hypervisor
combination first and foremost needs to support the
user requirements.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Support</term>
<listitem>
<para>The selected OS-hypervisor combination
needs to be supported by OpenStack.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Interoperability</term>
<listitem>
<para>The OS-hypervisor needs to be
interoperable with other features and services in the
OpenStack design in order to meet the user
requirements.</para>
</listitem>
</varlistentry>
</variablelist>
</section>
<section xml:id="hypervisor-tech-considerations">
|
||
<title>Hypervisor</title>
|
||
<para>OpenStack supports a wide variety of hypervisors, one or
|
||
more of which can be used in a single cloud. These hypervisors
|
||
include:</para>
|
||
<itemizedlist>
|
||
<listitem>
|
||
<para>KVM (and QEMU)</para>
|
||
</listitem>
|
||
<listitem>
|
||
<para>XCP/XenServer</para>
|
||
</listitem>
|
||
<listitem>
|
||
<para>vSphere (vCenter and ESXi)</para>
|
||
</listitem>
|
||
<listitem>
|
||
<para>Hyper-V</para>
|
||
</listitem>
|
||
<listitem>
|
||
<para>LXC</para>
|
||
</listitem>
|
||
<listitem>
|
||
<para>Docker</para>
|
||
</listitem>
|
||
<listitem>
|
||
<para>Bare-metal</para>
|
||
</listitem>
|
||
</itemizedlist>
|
||
<para>A complete list of supported hypervisors and their
|
||
capabilities can be found at
|
||
<link xlink:href="https://wiki.openstack.org/wiki/HypervisorSupportMatrix">OpenStack Hypervisor Support Matrix</link>.
|
||
</para>
|
||
<para>We recommend general purpose clouds use hypervisors that
support the most general purpose use cases, such as KVM and
Xen. More specific hypervisors should be chosen to account
for specific functionality or a supported feature requirement.
In some cases, there may also be a mandated
requirement to run software on a certified hypervisor
including solutions from VMware, Microsoft, and Citrix.</para>
<para>The features offered through the OpenStack cloud platform
determine the best choice of a hypervisor. Each hypervisor
has its own hardware requirements which may affect the decisions
around designing a general purpose cloud.</para>
<para>In a mixed hypervisor environment, specific aggregates of
compute resources, each with defined capabilities, enable
workloads to utilize software and hardware specific to their
particular requirements. This functionality can be exposed
explicitly to the end user, or accessed through defined
metadata within a particular flavor of an instance.</para>
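<para>A sketch of this approach using host aggregates and flavor extra
specs, assuming the AggregateInstanceExtraSpecsFilter scheduler filter is
enabled; the aggregate, host, and flavor names are hypothetical:</para>
<programlisting language="bash"># Group the KVM hypervisors into an aggregate with matching metadata
nova aggregate-create kvm-hosts
nova aggregate-set-metadata kvm-hosts hypervisor_type=kvm
nova aggregate-add-host kvm-hosts compute-kvm-01
# Tie a flavor to that aggregate via matching extra specs
nova flavor-key m1.kvm set aggregate_instance_extra_specs:hypervisor_type=kvm</programlisting>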
</section>

<section xml:id="openstack-components-tech-considerations">
<title>OpenStack components</title>
<para>A general purpose OpenStack cloud design should incorporate
the core OpenStack services to provide a wide range of
services to end-users. The OpenStack core services recommended
in a general purpose cloud are:</para>
<itemizedlist>
<listitem>
<para>OpenStack <glossterm>Compute</glossterm>
(<glossterm>nova</glossterm>)</para>
</listitem>
<listitem>
<para>OpenStack <glossterm>Networking</glossterm>
(<glossterm>neutron</glossterm>)</para>
</listitem>
<listitem>
<para>OpenStack <glossterm>Image service</glossterm>
(<glossterm>glance</glossterm>)</para>
</listitem>
<listitem>
<para>OpenStack <glossterm>Identity</glossterm>
(<glossterm>keystone</glossterm>)</para>
</listitem>
<listitem>
<para>OpenStack <glossterm>dashboard</glossterm>
(<glossterm>horizon</glossterm>)</para>
</listitem>
<listitem>
<para><glossterm>Telemetry</glossterm> module
(<glossterm>ceilometer</glossterm>)</para>
</listitem>
</itemizedlist>
<para>A general purpose cloud may also include OpenStack
<glossterm>Object Storage</glossterm> (<glossterm>swift</glossterm>)
and OpenStack <glossterm>Block Storage</glossterm>
(<glossterm>cinder</glossterm>). These may be
selected to provide storage to applications and
instances.</para>
</section>

<section xml:id="supplemental-software-tech-considerations">
<title>Supplemental software</title>
<para>A general purpose OpenStack deployment consists of more than
just OpenStack-specific components. A typical deployment
involves services that provide supporting functionality,
including databases and message queues, and may also involve
software to provide high availability of the OpenStack
environment. Design decisions around the underlying message
queue might affect the required number of controller services,
as well as the technology to provide highly resilient database
functionality, such as MariaDB with Galera. In such a
scenario, replication of services relies on quorum.</para>
<para>Where many general purpose deployments use hardware load
balancers to provide highly available API access and SSL
termination, software solutions, for example HAProxy, can also
be considered. It is vital to ensure that such software
implementations are also made highly available. High
availability can be achieved by using software such as
Keepalived or Pacemaker with Corosync. Pacemaker and Corosync
can provide active-active or active-passive highly available
configurations depending on the specific service in the
OpenStack environment. Using this software can affect the
design as it assumes at least a 2-node controller
infrastructure where one of those nodes may be running certain
services in standby mode.</para>
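<para>A minimal HAProxy sketch for fronting one API endpoint follows; the
addresses are hypothetical, and in practice Keepalived or Pacemaker would
manage the virtual IP that HAProxy binds to:</para>
<programlisting language="ini"># haproxy.cfg (fragment)
listen nova-api
    bind 192.168.10.100:8774
    balance roundrobin
    option tcpka
    server controller1 192.168.10.11:8774 check inter 2000 rise 2 fall 5
    server controller2 192.168.10.12:8774 check inter 2000 rise 2 fall 5</programlisting>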
<para>Memcached is a distributed memory object caching system, and
Redis is a key-value store. Both are deployed on
general purpose clouds to assist in alleviating load to the
Identity service. The memcached service caches tokens, and due
to its distributed nature it can help alleviate some
bottlenecks to the underlying authentication system. Using
memcached or Redis does not affect the overall design of your
architecture as they tend to be deployed onto the
infrastructure nodes providing the OpenStack services.</para>
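<para>A sketch of enabling memcached-backed caching in
<filename>keystone.conf</filename>; the server addresses are hypothetical,
and the exact option names vary between OpenStack releases, so consult the
Configuration Reference for your release:</para>
<programlisting language="ini"># keystone.conf (fragment)
[cache]
enabled = true
backend = dogpile.cache.memcached
memcache_servers = 192.168.10.11:11211,192.168.10.12:11211</programlisting>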
</section>

<section xml:id="controller-infrastructure-tech-considerations">
<title>Controller infrastructure</title>
<para>The controller infrastructure nodes provide management
services to the end-user as well as providing services
internally for the operation of the cloud. The controllers
run message queuing services that carry system
messages between each service. Performance issues related to
the message bus delay the delivery of those messages, which in
turn delays operational functions such as spinning up and deleting
instances, provisioning new storage volumes, and managing
network resources. Such delays could adversely affect an
application's ability to react to certain conditions,
especially when using auto-scaling features. It is important
to properly design the hardware used to run the controller
infrastructure, as outlined above in the hardware selection
section.</para>
<para>Performance of the controller services is not limited
to processing power; restrictions may emerge in serving
concurrent users. Load test the APIs and Horizon services
to ensure that you are able to serve your
customers. Particular attention should be paid to the
OpenStack Identity service (keystone), which provides
authentication and authorization for all services, both
internally to OpenStack itself and to end-users. This service
can lead to a degradation of overall performance if it is
not sized appropriately.</para>
</section>

<section xml:id="network-performance-tech-considerations">
|
||
<title>Network performance</title>
|
||
<para>In a general purpose OpenStack cloud, the requirements of
|
||
the network help determine performance capabilities.
|
||
It is possible to design OpenStack
|
||
environments that run a mix of networking capabilities. By
|
||
utilizing the different interface speeds, the users of the
|
||
OpenStack environment can choose networks that are fit for
|
||
their purpose.</para>
|
||
<para>Network performance can be boosted considerably by
|
||
implementing hardware load balancers to provide front-end
|
||
service to the cloud APIs. The hardware load balancers also
|
||
perform SSL termination if that is a requirement of your
|
||
environment. When implementing SSL offloading, it is important
|
||
to understand the SSL offloading capabilities of the devices
|
||
selected.</para>
|
||
</section>
|
||
|
||
<section xml:id="compute-host-tech-considerations">
|
||
<title>Compute host</title>
|
||
<para>The choice of hardware specifications used in compute nodes
|
||
including CPU, memory and disk type directly affects the
|
||
performance of the instances. Other factors which can directly
|
||
affect performance include tunable parameters within the
|
||
OpenStack services, for example the overcommit ratio applied
|
||
to resources. The defaults in OpenStack Compute set a 16:1
|
||
over-commit of the CPU and 1.5 over-commit of the memory.
|
||
Running at such high ratios leads to an increase in
|
||
"noisy-neighbor" activity. Care must be taken when sizing your
|
||
Compute environment to avoid this scenario. For running
|
||
general purpose OpenStack environments it is possible to keep
|
||
to the defaults, but make sure to monitor your environment as
|
||
usage increases.</para>
|
||
</section>
|
||
|
||
<section xml:id="storage-performance-tech-considerations">
|
||
<title>Storage performance</title>
|
||
<para>When considering performance of OpenStack Block Storage,
|
||
hardware and architecture choice is important. Block Storage
|
||
can use enterprise back-end systems such as NetApp or EMC,
|
||
scale out storage such as GlusterFS and Ceph, or simply use
|
||
the capabilities of directly attached storage in the nodes
|
||
themselves. Block Storage may be deployed so that traffic
|
||
traverses the host network, which could affect, and be
|
||
adversely affected by, the front-side API traffic performance.
|
||
As such, consider using a dedicated data storage network with
|
||
dedicated interfaces on the Controller and Compute
|
||
hosts.</para>
|
||
<para>When considering performance of OpenStack Object Storage, a
|
||
number of design choices will affect performance. A user’s
|
||
access to the Object Storage is through the proxy services,
|
||
which sit behind hardware load balancers. By the
|
||
very nature of a highly resilient storage system, replication
|
||
of the data would affect performance of the overall system. In
|
||
this case, 10 GbE (or better) networking is recommended
|
||
throughout the storage network architecture.</para>
|
||
</section>
|
||
|
||
<section xml:id="availability-tech-considerations">
|
||
<title>Availability</title>
|
||
<para>In OpenStack, the infrastructure is integral to providing
|
||
services and should always be available, especially when
|
||
operating with SLAs. Ensuring network availability is
|
||
accomplished by designing the network architecture so that no
|
||
single point of failure exists. A consideration of the number
|
||
of switches, routes and redundancies of power should be
|
||
factored into core infrastructure, as well as the associated
|
||
bonding of networks to provide diverse routes to your highly
|
||
available switch infrastructure.</para>
|
||
<para>The OpenStack services themselves should be deployed across
multiple servers that do not represent a single point of
failure. Ensuring API availability can be achieved by placing
these services behind highly available load balancers that
have multiple OpenStack servers as members.</para>
<para>OpenStack lends itself to deployment in a highly available
manner, where it is expected that at least two servers be
utilized. These can run all of the services involved, from the
message queuing service, for example RabbitMQ or Qpid, to an
appropriately deployed database service such as MySQL or
MariaDB. As services in the cloud are scaled out, back-end
services will need to scale too. Monitoring and reporting on
server utilization and response times, as well as load testing
your systems, will help determine scale out decisions.</para>
<para>Care must be taken when deciding network functionality.
Currently, OpenStack supports both the legacy networking (nova-network)
system and the newer, extensible OpenStack Networking (neutron). Both
have their pros and cons when it comes to providing highly
available access. Legacy networking, which provides networking
access maintained in the OpenStack Compute code, provides a
feature that removes a single point of failure when it comes
to routing, and this feature is currently missing in OpenStack
Networking. Legacy networking's multi-host
functionality restricts the failure domain to the host running
the instance.</para>
<para>When using OpenStack Networking, the
OpenStack controller servers or separate Networking
hosts handle routing. For a deployment that requires features
available in only Networking, it is possible to
remove this restriction by using third party software that
helps maintain highly available L3 routes. Doing so allows for
common APIs to control network hardware, or to provide complex
multi-tier web applications in a secure manner. It is also
possible to completely remove routing from
Networking, and instead rely on hardware routing capabilities.
In this case, the switching infrastructure must support L3
routing.</para>
<para>OpenStack Networking and legacy networking
both have their advantages and
disadvantages. They are both valid and supported options that
fit different network deployment models described in the
<citetitle><link
xlink:href="http://docs.openstack.org/openstack-ops/content/network_design.html#network_deployment_options"
>OpenStack Operations Guide</link></citetitle>.</para>
<para>Ensure your deployment has adequate back-up capabilities.</para>
<para>Application design must also be factored into the
capabilities of the underlying cloud infrastructure. If the
compute hosts do not provide a seamless live migration
capability, then it must be expected that when a compute host
fails, that instance and any data local to that instance will
be deleted. However, when providing an expectation to users
that instances have a high level of uptime guarantees, the
infrastructure must be deployed in a way that eliminates any
single point of failure when a compute host disappears. This
may include utilizing shared file systems on enterprise
storage or OpenStack Block Storage to provide a level of
guarantee to match service features.</para>
<para>For more information on high availability in OpenStack, see the <link
xlink:href="http://docs.openstack.org/ha-guide/"><citetitle>OpenStack
High Availability Guide</citetitle></link>.
</para>
</section>

<section xml:id="security-tech-considerations">
|
||
<title>Security</title>
|
||
<para>A security domain comprises users, applications, servers or
|
||
networks that share common trust requirements and expectations
|
||
within a system. Typically they have the same authentication
|
||
and authorization requirements and users.</para>
|
||
<para>These security domains are:</para>
|
||
<itemizedlist>
|
||
<listitem>
|
||
<para>Public</para>
|
||
</listitem>
|
||
<listitem>
|
||
<para>Guest</para>
|
||
</listitem>
|
||
<listitem>
|
||
<para>Management</para>
|
||
</listitem>
|
||
<listitem>
|
||
<para>Data</para>
|
||
</listitem>
|
||
</itemizedlist>
|
||
<para>These security domains can be mapped to an OpenStack
deployment individually, or combined. In each case, the cloud operator
should be aware of the appropriate security concerns. Security
domains should be mapped out against your specific OpenStack
deployment topology. The domains and their trust requirements
depend upon whether the cloud instance is public, private, or
hybrid.</para>
<itemizedlist>
<listitem>
<para>The public security domain is an entirely untrusted area of
the cloud infrastructure. It can refer to the internet as a
whole or simply to networks over which you have no authority.
This domain should always be considered untrusted.</para>
</listitem>
<listitem>
<para>The guest security domain handles compute data generated by
instances on the cloud but not services that support the
operation of the cloud, such as API calls. Public cloud
providers and private cloud providers who do not have
stringent controls on instance use or who allow unrestricted
internet access to instances should consider this domain to be
untrusted. Private cloud providers may want to consider this
network as internal and therefore trusted only if they have
controls in place to assert that they trust instances and all
their tenants.</para>
</listitem>
<listitem>
<para>The management security domain is where services interact.
Sometimes referred to as the <emphasis>control plane</emphasis>, the networks
in this domain transport confidential data such as configuration
parameters, user names, and passwords. In most deployments this
domain is considered trusted.</para>
</listitem>
<listitem>
<para>The data security domain is concerned primarily with
information pertaining to the storage services within
OpenStack. Much of the data that crosses this network has high
integrity and confidentiality requirements and, depending on
the type of deployment, may also have strong availability
requirements. The trust level of this network is heavily
dependent on other deployment decisions.</para>
</listitem>
</itemizedlist>
<para>When deploying OpenStack in an enterprise as a private cloud,
it is usually behind the firewall and within the trusted
network alongside existing systems. Users of the cloud are
employees who are bound by the security
requirements set forth by the company. This tends to push most
of the security domains towards a more trusted model. However,
when deploying OpenStack in a public facing role, no
assumptions can be made and the attack vectors significantly
increase.</para>
<para>Care must be taken when managing the users of the
system for both public and private clouds. The Identity
service allows for LDAP to be part of the authentication
process. Including such systems in an OpenStack deployment may
ease user management when integrating into existing
systems.</para>
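<para>A sketch of pointing the Identity service at an existing LDAP
directory in <filename>keystone.conf</filename>; the URL and DNs are
hypothetical and depend on your directory layout, and the driver value is
shown in its short form, which some releases spell as a full class
path:</para>
<programlisting language="ini"># keystone.conf (fragment)
[identity]
driver = ldap

[ldap]
url = ldap://ldap.example.com
user = cn=admin,dc=example,dc=com
password = secret
suffix = dc=example,dc=com
user_tree_dn = ou=Users,dc=example,dc=com
user_objectclass = inetOrgPerson</programlisting>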
<para>It is important to understand that user authentication
requests include sensitive information including user names,
passwords, and authentication tokens. For this reason, placing
the API services behind hardware that performs SSL termination
is strongly recommended.</para>
<para>
For more information on OpenStack security, see the <link
xlink:href="http://docs.openstack.org/security-guide/"><citetitle>OpenStack
Security Guide</citetitle></link>.
</para>
</section>
</section>