<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE section [
<!ENTITY % openstack SYSTEM "../../common/entities/openstack.ent">
%openstack;
]>
<section xmlns="http://docbook.org/ns/docbook"
  xmlns:xi="http://www.w3.org/2001/XInclude"
  xmlns:xlink="http://www.w3.org/1999/xlink"
  version="5.0"
  xml:id="technical-considerations-massive-scale">
  <?dbhtml stop-chunking?>
  <title>Technical considerations</title>
  <para>Repurposing an existing OpenStack environment to be
    massively scalable is a formidable task. When building
    a massively scalable environment from the ground up, ensure
    that you build the initial deployment with the same principles
    and choices that apply as the environment grows. For example,
    a good approach is to deploy the first site as a multi-site
    environment. This enables you to use the same deployment
    and segregation methods as the environment grows to separate
    locations across dedicated links or wide area networks. In
    a hyperscale cloud, scale trumps redundancy. Modify applications
    with this in mind, relying on the scale and homogeneity of the
    environment to provide reliability rather than on redundant
    infrastructure provided by non-commodity hardware
    solutions.</para>
<section xml:id="infrastructure-segregation-massive-scale">
|
|
<title>Infrastructure segregation</title>
|
|
    <para>OpenStack services support massive horizontal scale.
      Be aware that this is not the case for the entire supporting
      infrastructure. This is particularly a problem for the database
      management systems and message queues that OpenStack services
      use for data storage and remote procedure call
      communications.</para>
    <para>Traditional clustering techniques typically
      provide high availability and some additional scale for these
      environments. In the quest for massive scale, however, you must
      take additional steps to relieve the performance pressure on
      these components in order to prevent them from negatively
      impacting the overall performance of the environment. Ensure
      that all of these components stay in balance so that, if the
      massively scalable environment does fail, it is because every
      component is near maximum capacity rather than because a single
      component has become the bottleneck.</para>
    <para>Regions segregate the cloud into completely independent
      installations linked only by a shared Identity service and,
      optionally, a shared Dashboard. Each region has separate
      API endpoints for every service, along with separate database
      and queue installations. This exposes some awareness of the
      environment's fault domains to users and gives them the
      ability to build in some degree of application resiliency,
      while also requiring them to specify which region each of
      their actions applies to.</para>
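    <para>As an illustration, assuming the Identity API v3
      command-line workflow and hypothetical region and host names,
      each region registers its own set of service endpoints in the
      shared Identity service:</para>
    <programlisting language="bash">$ openstack endpoint create --region site1 compute public \
  http://compute.site1.example.com:8774/v2.1
$ openstack endpoint create --region site2 compute public \
  http://compute.site2.example.com:8774/v2.1</programlisting>
    <para>Users then select a region, and therefore a fault domain,
      for example by setting <literal>OS_REGION_NAME</literal> in
      their client environment.</para>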
    <para>Environments operating at massive scale typically need to
      subdivide their regions or sites further without requiring
      users to specify the failure domain. This provides the ability
      to further divide the installation into failure domains while
      also providing a logical unit for maintenance and the addition
      of new hardware. At hyperscale, instead of adding single
      compute nodes, administrators can add entire racks or even
      groups of racks at a time, with each new addition of nodes
      exposed via one of the segregation concepts described
      here.</para>
<para><glossterm baseform="cell">Cells</glossterm> provide the ability
|
|
to subdivide the compute portion
|
|
of an OpenStack installation, including regions, while still
|
|
exposing a single endpoint. Each region has an API cell
|
|
along with a number of compute cells where the
|
|
workloads actually run. Each cell has its own database and
|
|
message queue setup (ideally clustered), providing the ability
|
|
to subdivide the load on these subsystems, improving overall
|
|
performance.</para>
|
|
    <para>Each compute cell provides a complete compute installation,
      with its own database and queue installations, scheduler,
      conductor, and multiple compute hosts. The cells scheduler
      handles placement of user requests from the single API
      endpoint to a specific cell from those available. The normal
      filter scheduler then handles placement within the
      cell.</para>
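    <para>As a rough sketch of a cells v1 layout (the cell names
      shown are hypothetical), the API cell and each compute cell
      run the <systemitem class="service">nova-cells</systemitem>
      service and declare their roles in
      <filename>nova.conf</filename>:</para>
    <programlisting language="ini"># API cell nova.conf
[cells]
enable = True
name = api
cell_type = api

# Compute cell nova.conf
[cells]
enable = True
name = cell01
cell_type = compute</programlisting>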
    <para>Unfortunately, Compute is the only OpenStack service that
      provides good support for cells. In addition, cells do not
      adequately support some standard OpenStack functionality such
      as security groups and host aggregates. Due to their relative
      newness and specialized use, cells receive relatively little
      testing in the OpenStack gate. Despite these issues, cells
      play an important role in well-known OpenStack installations
      operating at massive scale, such as those at CERN and
      Rackspace.</para>
  </section>
<section xml:id="host-aggregates">
|
|
<title>Host aggregates</title>
|
|
    <para>Host aggregates enable partitioning of OpenStack Compute
      deployments into logical groups for load balancing and
      instance distribution. You can also use host aggregates to
      further partition an availability zone. For example, a cloud
      might use host aggregates to partition an availability zone
      into groups of hosts that either share common resources, such
      as storage and network, or have a special property, such as
      trusted computing hardware. Users cannot target host
      aggregates explicitly. Instead, they select instance flavors
      whose metadata maps to host aggregate metadata; these flavors
      target the matching host aggregates implicitly.</para>
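    <para>The following sketch shows one way to set this up; the
      aggregate, host, and flavor names are hypothetical, and it
      assumes the
      <literal>AggregateInstanceExtraSpecsFilter</literal> scheduler
      filter is enabled:</para>
    <programlisting language="bash"># Group SSD-equipped hosts into an aggregate and tag it
$ nova aggregate-create ssd-hosts
$ nova aggregate-add-host ssd-hosts compute01
$ nova aggregate-set-metadata ssd-hosts ssd=true

# Set an extra spec on an existing flavor to match the aggregate metadata
$ nova flavor-key m1.ssd set aggregate_instance_extra_specs:ssd=true</programlisting>
    <para>Instances booted with the <literal>m1.ssd</literal> flavor
      then land only on hosts in the <literal>ssd-hosts</literal>
      aggregate.</para>
  </section>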
<section xml:id="availability-zones">
|
|
<title>Availability zones</title>
|
|
    <para>Availability zones provide another mechanism for
      subdividing an installation or region. They are, in effect,
      host aggregates exposed for (optional) explicit targeting by
      users.</para>
    <para>Unlike cells, availability zones do not have their own
      database server or queue broker but represent an arbitrary
      grouping of compute nodes. Typically, nodes are grouped into
      availability zones based on a shared failure domain, such as
      a shared power source or physical network connection. Users
      can target exposed availability zones; however, this is not a
      requirement. An alternative approach is to configure a default
      scheduling zone so that instances schedule to an availability
      zone other than the default <literal>nova</literal>
      zone.</para>
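    <para>As a minimal sketch, assuming a hypothetical zone named
      <literal>az-1</literal>, the default target zone can be set in
      <filename>nova.conf</filename>:</para>
    <programlisting language="ini">[DEFAULT]
# Zone assigned to hosts that do not belong to any other zone
default_availability_zone = nova
# Zone used when an instance request does not specify one
default_schedule_zone = az-1</programlisting>
  </section>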
<section xml:id="segregation-example">
|
|
<title>Segregation example</title>
|
|
    <para>In this example, the cloud is divided into two regions,
      one for each site, with two availability zones in each based
      on the power layout of the data centers. A number of host
      aggregates enable targeting of virtual machine instances,
      through flavors, at hosts that share special capabilities
      such as SSDs, 10 GbE networks, or GPU cards.</para>
    <mediaobject>
      <imageobject>
        <imagedata contentwidth="4in"
          fileref="../figures/Massively_Scalable_Cells_+_regions_+_azs.png"
          />
      </imageobject>
    </mediaobject>
  </section>
</section>