Remove passive voice from Arch Guide Chap 2
Closes-Bug: #1400552
Change-Id: I5b1572d7b5cf3321dcfa2836d7f777fe450a87e3
parent 2a93d184c5
commit 60002fc5be
@ -8,12 +8,11 @@
<para>A compute-focused cloud is a specialized subset of the general purpose
OpenStack cloud architecture. Unlike the general purpose OpenStack
architecture, which hosts a wide variety of workloads and
applications and does not heavily tax any particular computing aspect,
a compute-focused cloud specifically supports
compute intensive workloads. Compute intensive
workloads may be CPU intensive, RAM intensive, or both; they are
not typically storage intensive or network intensive. Compute-focused
workloads may include the following use cases:</para>
<itemizedlist>
@ -36,11 +35,11 @@
</itemizedlist>
<para>Based on the use case requirements, such clouds might need to provide
additional services such as a virtual machine disk library, file or object
storage, firewalls, load balancers, IP addresses, or network connectivity
in the form of overlays or virtual local area networks (VLANs). A
compute-focused OpenStack cloud does not typically use raw block storage
services as it does not generally host applications that require
persistent block storage.</para>

<xi:include href="compute_focus/section_user_requirements_compute_focus.xml"/>
<xi:include href="compute_focus/section_tech_considerations_compute_focus.xml"/>
@ -20,15 +20,15 @@
</itemizedlist>
<para>
An OpenStack cloud with extreme demands on processor and memory
resources is compute-focused, and requires hardware that
can handle these demands. This can mean choosing hardware which might
not perform as well on storage or network capabilities. In a compute-
focused architecture, storage and networking load a
data set into the computational cluster, but are not otherwise in heavy
demand.
</para>
<para>
Consider the following factors when selecting compute (server) hardware:
</para>
<variablelist>
<varlistentry>
@ -42,14 +42,14 @@
<term>Resource capacity</term>
<listitem>
<para>The number of CPU cores, how much RAM, or how
much storage a given server delivers.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Expandability</term>
<listitem>
<para>The number of additional resources you can add to a
server before it reaches its limit.</para>
</listitem>
</varlistentry>
<varlistentry>
@ -60,7 +60,7 @@
</listitem>
</varlistentry>
</variablelist>
<para>Weigh these considerations against each other to determine the
best design for the desired purpose. For example, increasing server density
means sacrificing resource capacity or expandability. Increasing resource
capacity and expandability can increase cost but decreases server density.
@ -68,38 +68,38 @@
resource capacity, and expandability.</para>
<para>A compute-focused cloud should have an emphasis on server hardware
that can offer more CPU sockets, more CPU cores, and more RAM. Network
connectivity and storage capacity are less critical. The hardware must
provide enough network connectivity and storage
capacity to meet minimum user requirements, but they are not the primary
consideration.</para>
<para>Some server hardware form factors suit a compute-focused architecture
better than others. CPU and RAM capacity have the highest priority. Some
considerations for selecting hardware:</para>
<itemizedlist>
<listitem>
<para>Most blade servers can support dual-socket multi-core CPUs. To
avoid this CPU limit, select "full width" or "full height" blades.
Be aware, however, that this also decreases server density. For example,
high density blade servers such as HP BladeSystem or Dell PowerEdge
M1000e support up to 16 servers in only ten rack units. Using
half-height blades is twice as dense as using full-height blades,
which results in only eight servers per ten rack units.</para>
</listitem>
<listitem>
<para>1U rack-mounted servers that occupy only a single rack
unit may offer greater server density than a blade server
solution. It is possible to place forty 1U servers in a rack, providing
space for the top of rack (ToR) switches, compared to 32 full width
blade servers. However, as of the Icehouse release, 1U servers from
the major vendors have only dual-socket, multi-core CPU
configurations. To obtain greater than dual-socket support in a 1U
rack-mount form factor, purchase systems from original
design manufacturers (ODMs) or second-tier manufacturers.</para>
</listitem>
<listitem>
<para>2U rack-mounted servers provide quad-socket, multi-core CPU
support, but with a corresponding decrease in server density (half
the density that 1U rack-mounted servers offer).</para>
</listitem>
<listitem>
<para>Larger rack-mounted servers, such as 4U servers, often provide
@ -108,8 +108,8 @@
have much lower server density and are often more expensive.</para>
</listitem>
<listitem>
<para>"Sled servers" are rack-mounted servers that support multiple
independent servers in a single 2U or 3U enclosure. These deliver higher
density as compared to typical 1U or 2U rack-mounted servers. For
example, many sled servers offer four independent dual-socket
nodes in 2U for a total of eight CPU sockets in 2U. However, the
@ -125,7 +125,7 @@
<listitem>
<para>In a compute-focused architecture, instance density is
lower, which means CPU and RAM over-subscription ratios are
also lower. You require more hosts to support the anticipated
scale due to instance density being lower, especially if the
design uses dual-socket hardware designs.</para>
</listitem>
@ -134,8 +134,8 @@
<term>Host density</term>
<listitem>
<para>Another option to address the higher host count
of dual socket designs is to use a quad
socket platform. Taking this approach decreases host density,
which increases rack count. This configuration may
affect the network requirements, the number of power connections, and
possibly impact the cooling requirements.</para>
@ -145,64 +145,62 @@
<term>Power and cooling density</term>
<listitem>
<para>The power and cooling density
requirements for 2U, 3U or even 4U server designs might be lower
than for blade, sled, or 1U server designs because of lower host
density. For data centers with older infrastructure, this may
be a desirable feature.</para>
</listitem>
</varlistentry>
</variablelist>
<para>When designing a compute-focused OpenStack architecture, you must
consider whether you intend to scale up or scale out.
Selecting a smaller number of larger hosts, or a
larger number of smaller hosts, depends on a combination of factors:
cost, power, cooling, physical rack and floor space, support-warranty,
and manageability.</para>
<section xml:id="storage-hardware-selection">
<title>Storage hardware selection</title>
<para>For a compute-focused OpenStack architecture, the
selection of storage hardware is not critical as it is not a primary
consideration. Nonetheless, there are several factors
to consider:</para>
<variablelist>
<varlistentry>
<term>Cost</term>
<listitem>
<para>The overall cost of the solution plays a major role
in what storage architecture and storage hardware you select.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Performance</term>
<listitem>
<para>The performance of the storage solution is important; you can
measure it by observing the latency of storage I-O
requests. In a compute-focused OpenStack cloud, storage latency
can be a major consideration. In some compute-intensive
workloads, minimizing the delays that the CPU experiences while
fetching data from storage can significantly impact
the overall performance of the application.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Scalability</term>
<listitem>
<para>Scalability refers to the performance of a storage solution
as it expands to its maximum size. A solution that performs
well in small configurations but has degrading
performance as it expands is not scalable. On
the other hand, a solution that continues to perform well at
maximum expansion is scalable.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Expandability</term>
<listitem>
<para>Expandability refers to the overall ability of
a storage solution to grow. A solution that expands to 50 PB is
more expandable than a solution that only scales to 10 PB.
Note that this metric is related to, but different
from, scalability, which is a measure of the solution's
performance as it expands.</para>
@ -211,23 +209,20 @@
</variablelist>
<para>For a compute-focused OpenStack cloud, latency of storage is a
major consideration. Using solid-state disks (SSDs) to minimize
latency for instance storage reduces CPU delays related to storage
and improves performance. Consider using RAID
controller cards in compute hosts to improve the performance of the
underlying disk subsystem.</para>
<para>Evaluate solutions against the key factors above when considering
your storage architecture. This determines if a scale-out solution
such as Ceph or GlusterFS is suitable, or if a single, highly expandable,
scalable, centralized storage array is better. If a centralized
storage array suits the requirements, the array vendor determines the
hardware. You can build a storage array using commodity hardware with
Open Source software, but you require people with expertise to build
such a system. Conversely, a scale-out storage solution that uses
direct-attached storage (DAS) in the servers may be an appropriate
choice. If so, then the server hardware must
support the storage solution.</para>
<para>The following lists some of the potential impacts that may affect a
particular storage architecture, and the corresponding storage hardware,
@ -236,92 +231,89 @@
<varlistentry>
<term>Connectivity</term>
<listitem>
<para>Ensure connectivity matches the storage solution requirements.
If you select a centralized storage array, determine how the
hypervisors should connect to the storage array. Connectivity
can affect latency and thus performance, so ensure that the network
characteristics minimize latency to boost the overall
performance of the design.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Latency</term>
<listitem>
<para>Determine if the use case has consistent or
highly variable latency.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Throughput</term>
<listitem>
<para>To improve overall performance, ensure that you optimize the
storage solution. While a compute-focused cloud does not usually
have major data I-O to and from storage, this is an important
factor to consider.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Server Hardware</term>
<listitem>
<para>If the solution uses DAS, this impacts the server hardware choice,
host density, instance density, power density, OS-hypervisor, and
management tools.</para>
</listitem>
</varlistentry>
</variablelist>
<para>When instances must be highly available or capable of migration
between hosts, use a shared storage file-system
to store instance ephemeral data to ensure that
compute services can run uninterrupted in the event of a node
failure.</para>
</section>
<section xml:id="selecting-networking-hardware-arch">
<title>Selecting networking hardware</title>
<para>Some of the key considerations for networking hardware selection
include:</para>
<variablelist>
<varlistentry>
<term>Port count</term>
<listitem>
<para>The design requires networking hardware that
has the requisite port count.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Port density</term>
<listitem>
<para>The required port count affects the physical space that a
network design requires.
A switch that can provide 48 10 GbE ports in 1U has a much higher
port density than a switch that provides 24 10 GbE ports in 2U. A
higher port density is better, as it leaves more rack space for
compute or storage components. You must also consider fault
domains and power density. Higher density switches are more
expensive; weigh this cost against the benefit, and do not
design the network beyond requirements.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Port speed</term>
<listitem>
<para>The networking hardware must support the proposed
network speed, for example: 1 GbE, 10 GbE, 40 GbE, or 100
GbE.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Redundancy</term>
<listitem>
<para>User requirements for high availability and cost considerations
influence the level of network hardware redundancy you require.
You can achieve network redundancy by adding
redundant power supplies or paired switches. If this is a
requirement, the hardware must support this configuration.
User requirements determine if you require a completely redundant network
infrastructure.</para>
</listitem>
</varlistentry>
<varlistentry>
@ -335,29 +327,24 @@
</listitem>
</varlistentry>
</variablelist>
<para>It is important to first understand additional factors as well as
the use case because these additional factors heavily influence the
cloud network architecture. Once these key considerations have been
decided, the proper network can be designed to best serve the workloads
being placed in the cloud.</para>
<para>We recommend designing the network architecture using
a scalable network model that makes it easy to add capacity and
bandwidth. A good example of such a model is the leaf-spine model. In
this type of network design, it is possible to easily add additional
bandwidth as well as scale out to additional racks of gear. It is
important to select network hardware that supports the required
port count, port speed, and port density while also allowing for future
growth as workload demands increase. It is also important to evaluate
where in the network architecture it is valuable to provide redundancy.
Increased network availability and redundancy comes at a cost, therefore
we recommend weighing the cost versus the benefit gained from
utilizing and deploying redundant network switches and using bonded
interfaces at the host level.</para>
</section>
<section xml:id="software-selection-arch">
<title>Software selection</title>
<para>Consider your selection of software for a compute-focused
OpenStack cloud:</para>
<itemizedlist>
<listitem>
<para>Operating system (OS) and hypervisor</para>
@ -377,24 +364,19 @@
<para>The selection of operating system (OS) and hypervisor has a
significant impact on the end point design. Selecting a particular
operating system and hypervisor could affect server hardware selection.
The node, networking, and storage hardware must support the selected
combination. For example, if the design uses Link Aggregation
Control Protocol (LACP), the hypervisor must support it.</para>
<para>OS and hypervisor selection impacts the following areas:</para>
<variablelist>
<varlistentry>
<term>Cost</term>
<listitem>
<para>Selecting a commercially supported hypervisor such as
Microsoft Hyper-V results in a different cost model from
choosing a community-supported, open source hypervisor like KVM
or Xen. Even within the ranks of open source solutions, choosing
one solution over another can impact cost due
to support contracts. On the other hand, business or application
requirements might dictate a specific or commercially supported
hypervisor.</para>
@ -403,11 +385,9 @@
<varlistentry>
<term>Supportability</term>
<listitem>
<para>Staff require appropriate training and knowledge to support the
selected OS and hypervisor combination. Consideration of training
costs may impact the design.</para>
</listitem>
</varlistentry>
<varlistentry>
@ -415,10 +395,8 @@
<listitem>
<para>The management tools used for Ubuntu and
KVM differ from the management tools for VMware vSphere.
Although OpenStack supports both OS and hypervisor combinations,
the choice of tool impacts the rest of the design.</para>
</listitem>
</varlistentry>
<varlistentry>
@ -426,7 +404,7 @@
<listitem>
<para>Ensure that selected OS and hypervisor
combinations meet the appropriate scale and performance
requirements. The chosen architecture must meet the targeted
instance-host ratios with the selected OS-hypervisor
combination.</para>
</listitem>
@ -435,52 +413,47 @@
<term>Security</term>
<listitem>
<para>Ensure that the design can accommodate the regular
installation of application security patches while
maintaining the required workloads. The frequency of security
patches for the proposed OS-hypervisor combination has an
impact on performance and the patch installation process can
affect maintenance windows.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Supported features</term>
<listitem>
<para>Determine what features of OpenStack you require.
The choice of features often determines the selection of the
OS-hypervisor combination. Certain features are only available with
specific OSs or hypervisors. For example, if certain features are
not available, modify the design to meet user requirements.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Interoperability</term>
<listitem>
<para>Consider the ability of the selected OS-hypervisor combination
to interoperate or co-exist with other OS-hypervisors, or with
other software solutions in the overall design. Operational and
troubleshooting tools for one OS-hypervisor combination may differ
from the tools for another OS-hypervisor combination. The design
must address whether the two sets of tools need to interoperate.</para>
</listitem>
</varlistentry>
</variablelist>
</section>
<section xml:id="openstack-components-arch">
<title>OpenStack components</title>
<para>The selection of OpenStack components has a significant impact.
There are certain components that are omnipresent, for example the compute
and image services, but others, such as the orchestration module, may not
be present. Omitting heat does not typically have a significant impact
on the overall design. However, if the architecture uses a replacement for
OpenStack Object Storage for its storage component, this could have
significant impacts on the rest of the design.</para>
<para>For a compute-focused OpenStack design architecture, the
following components may be present:</para>
<itemizedlist>
<listitem>
<para>Identity (keystone)</para>
@ -504,96 +477,89 @@
<para>Orchestration (heat)</para>
</listitem>
</itemizedlist>
<para>A compute-focused design is less likely to include OpenStack Block
Storage due to persistent block storage not
being a significant requirement for the expected workloads. However,
there may be some situations where the need for performance requires
a block storage component to improve data I-O.</para>
<para>The exclusion of certain OpenStack components might also limit the
functionality of other components. If a design opts to
include the Orchestration module but excludes the Telemetry module, then
the design cannot take advantage of Orchestration's auto
scaling functionality as this relies on information from Telemetry.</para>
</section>
<section xml:id="supplemental-software">
<title>Supplemental software</title>
<para>While OpenStack is a fairly complete collection of software
projects for building a platform for cloud services, there are
invariably additional pieces of software that you might add
to an OpenStack design.</para>
<section xml:id="networking-software-arch">
<title>Networking software</title>
<para>OpenStack Networking provides a wide variety of networking services
for instances. There are many additional networking software packages
that might be useful to manage the OpenStack components themselves.
Some examples include software to provide load balancing,
network redundancy protocols, and routing daemons. The
<citetitle>OpenStack High Availability Guide</citetitle> (<link
xlink:href="http://docs.openstack.org/high-availability-guide/content">http://docs.openstack.org/high-availability-guide/content</link>)
describes some of these software packages in more detail.
</para>
<para>For a compute-focused OpenStack cloud, the OpenStack infrastructure
components must be highly available. If the design does not
include hardware load balancing, you must add networking software packages
like HAProxy.</para>
</section>
<section xml:id="management-software-arch">
<title>Management software</title>
<para>The selected supplemental software solution impacts and affects
the overall OpenStack cloud design. This includes software for
providing clustering, logging, monitoring and alerting.</para>
<para>The availability design requirements determine the inclusion of
clustering software, such as Corosync or Pacemaker.
The impact of including these software packages depends primarily on the
availability of the cloud infrastructure and the
complexity of supporting the configuration after deployment.
The OpenStack High Availability
Guide provides more
details on the installation and configuration of Corosync and Pacemaker.
</para>
<para>Operational considerations determine the requirements for logging,
monitoring, and alerting. Each of these sub-categories includes
various options. For example, in the logging sub-category
consider Logstash, Splunk, Log Insight, or some other log
aggregation-consolidation tool. Store logs in a centralized
location to ease analysis of the data. Log
data analytics engines can also provide automation and issue
notification by alerting and
attempting to remediate some of the more commonly known issues.</para>
<para>If you require any of these software packages, then the design
must account for the additional resource consumption, such as CPU, RAM,
storage, and network bandwidth for a log aggregation solution.
Some other potential design impacts include:</para>
<itemizedlist>
<listitem>
<para>OS-hypervisor combination: ensure that the selected logging,
monitoring, or alerting tools support the proposed OS-hypervisor
combination.</para>
</listitem>
<listitem>
<para>Network hardware: the logging, monitoring, and alerting software
must support the network hardware selection.</para>
</listitem>
</itemizedlist>
</section>
<section xml:id="database-software-arch">
<title>Database software</title>
<para>A large majority of OpenStack components require access to
back-end database services to store state and configuration
information. Select an appropriate back-end database that
satisfies the availability and fault tolerance requirements of the
OpenStack services. OpenStack services support connecting
to any database that the SQLAlchemy Python drivers support,
however most common database deployments make use of MySQL or some
variation of it. We recommend that you make the database that provides
back-end services within a general-purpose cloud highly
available. Some of the more common software solutions include Galera,
MariaDB, and MySQL with multi-master replication.</para>
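<para>A minimal sketch, not from the guide, of how a service-style SQLAlchemy
connection to such a MySQL back end can be exercised. The hostname,
credentials, and database name are hypothetical placeholders; substitute the
values from your own deployment (the mysql+pymysql dialect requires the
PyMySQL driver):</para>
<programlisting language="python"># Hypothetical connectivity check against a MySQL back-end database.
from sqlalchemy import create_engine, text

# Connection URL in the same SQLAlchemy format that OpenStack services
# accept in their configuration; all values below are placeholders.
engine = create_engine("mysql+pymysql://nova:secret@db.example.com/nova")

with engine.connect() as conn:
    # A trivial query to confirm that the back end is reachable.
    print(conn.execute(text("SELECT 1")).scalar())
</programlisting>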
</section>
</section>
</section>
@ -52,7 +52,7 @@
cloud service providers or telecom providers. Smaller implementations
are more inclined to rely on smaller support teams that need
to combine the engineering, design, and operation roles.</para>
<para>The maintenance of OpenStack installations requires a variety
of technical skills. To ease the operational burden, consider
incorporating features into the architecture and
design. Some examples include:</para>
@ -61,7 +61,7 @@
<para>Automating the operations functions</para>
</listitem>
<listitem>
<para>Utilizing a third party management company</para>
</listitem>
</itemizedlist>
</section>
@ -86,7 +86,7 @@
<section xml:id="expected-unexpected-server-downtime">
<title>Expected and unexpected server downtime</title>
<para>Unexpected server downtime is inevitable, and SLAs can
address how long it takes to recover from failure.
Recovery of a failed host means restoring instances from a snapshot, or
respawning that instance on another available host.</para>
<para>It is acceptable to design a compute-focused cloud
@ -103,10 +103,10 @@
<para>Adding extra capacity to an OpenStack cloud is a
horizontally scaling process.</para>
<note>
<para>Be mindful, however, of additional work to place the nodes into
appropriate Availability Zones and Host Aggregates.</para>
</note>
<para>We recommend the same or very similar CPUs
when adding extra nodes to the environment because they reduce
the chance of breaking live-migration features if they are
present. Scaling out hypervisor hosts also has a direct effect
@ -116,9 +116,8 @@
<para>Changing the internal components of a Compute host to account for
increases in demand is a process known as vertical scaling.
Swapping a CPU for one with more cores, or
increasing the memory in a server, can help add extra
capacity for running applications.</para>
<para>Another option is to assess the average workloads and
increase the number of instances that can run within the
compute environment by adjusting the overcommit ratio. While
@ -7,7 +7,7 @@
<?dbhtml stop-chunking?>
<title>Prescriptive examples</title>
<para>The Conseil Européen pour la Recherche Nucléaire (CERN),
also known as the European Organization for Nuclear Research,
provides particle accelerators and other infrastructure for
high-energy physics research.</para>
<para>As of 2011 CERN operated these two compute centers in Europe
@ -43,35 +43,35 @@
</tr>
</tbody>
</informaltable>
<para>To support a growing number of compute-heavy users of
experiments related to the Large Hadron Collider (LHC), CERN
ultimately elected to deploy an OpenStack cloud using
Scientific Linux and RDO. This effort aimed to simplify the
management of the center's compute resources with a view to
doubling compute capacity through the addition of a
data center in 2013 while maintaining the same
levels of compute staff.</para>
<para>The CERN solution uses <glossterm baseform="cell">cells</glossterm>
for segregation of compute
resources and for transparently scaling between different data
centers. This decision meant trading off support for security
groups and live migration. In addition, they must manually replicate
some details, like flavors, across cells. In
spite of these drawbacks cells provide the
required scale while exposing a single public API endpoint to
users.</para>
<para>CERN created a compute cell for each of the two original data
centers and created a third when it added a new data center
in 2013. Each cell contains three availability zones to
further segregate compute resources and at least three
RabbitMQ message brokers configured for clustering with
mirrored queues for high availability.</para>
<para>The API cell, which resides behind a HAProxy load balancer,
is in the data center in Switzerland and directs API
calls to compute cells using a customized variation of the
cell scheduler. The customizations allow certain workloads to
route to a specific data center or all data centers,
with cell RAM availability determining cell selection in the
latter case.</para>
<mediaobject>
<imageobject>
@ -94,45 +94,42 @@
single default.
</para></listitem>
</itemizedlist>
<para>A central database team manages the MySQL database server in each cell
in an active/passive configuration with a NetApp storage back end.
Backups run every 6 hours.</para>
<section xml:id="network-architecture">
<title>Network architecture</title>
<para>To integrate with existing networking infrastructure, CERN
made customizations to legacy networking (nova-network). This was in the
form of a driver to integrate with CERN's existing database
for tracking MAC and IP address assignments.</para>
<para>The driver facilitates selection of a MAC address and IP for
new instances based on the compute node where the scheduler places
the instance.</para>
<para>The driver considers the compute node where the scheduler
placed an instance and selects a MAC address and IP
from the pre-registered list associated with that node in the
database. The database updates to reflect the address assignment to
that instance.</para></section>
<section xml:id="storage-architecture">
<title>Storage architecture</title>
<para>CERN deploys the OpenStack Image service in the API cell and
configures it to expose version 1 (V1) of the API. This also requires
the image registry. The storage back end in
use is a 3 PB Ceph cluster.</para>
<para>CERN maintains a small set of Scientific Linux 5 and 6 images onto
which orchestration tools can place applications. Puppet manages
instance configuration and customization.</para></section>
<section xml:id="monitoring">
<title>Monitoring</title>
<para>CERN does not require direct billing, but uses the Telemetry module
to perform metering for the purposes of adjusting
project quotas. CERN uses a sharded, replicated, MongoDB back-end.
To spread API load, CERN deploys instances of the nova-api service
within the child cells for Telemetry to query
against. This also requires the configuration of supporting services
such as keystone, glance-api, and glance-registry in the child cells.
</para>
<mediaobject>
<imageobject>
<imagedata contentwidth="4in"
@ -11,43 +11,46 @@
<?dbhtml stop-chunking?>
<title>Technical considerations</title>
<para>In a compute-focused OpenStack cloud, the type of instance
workloads you provision heavily influences technical
decision making. For example, specific use cases that demand
multiple, short-running jobs present different requirements
than those that specify long-running jobs, even though both
situations are compute focused.</para>
<para>Public and private clouds require deterministic capacity
planning to support elastic growth in order to meet user SLA
expectations. Deterministic capacity planning is the path to
predicting the effort and expense of making a given process
consistently performant. This process is important because,
when a service becomes a critical part of a user's
infrastructure, the user's experience links directly to the SLAs of
the cloud itself. In cloud computing, it is not average speed but
speed consistency that determines a service's performance.
There are two aspects of capacity planning to consider:</para>
<itemizedlist>
<listitem>
<para>planning the initial deployment footprint</para>
</listitem>
<listitem>
<para>planning expansion of it to stay ahead of the demands of cloud
users</para>
</listitem>
</itemizedlist>
<para>Plan the initial footprint for an OpenStack deployment
based on existing infrastructure workloads
and estimates based on expected uptake.</para>
<para>The starting point is the core count of the cloud. By
applying relevant ratios, the user can gather information
about the following; a brief worked sketch follows the list:</para>
<itemizedlist>
<listitem>
<para>The number of expected concurrent instances:
(overcommit fraction × cores) / virtual cores per instance</para>
</listitem>
<listitem>
<para>Required storage: flavor disk size × number of instances</para>
</listitem>
</itemizedlist>
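<para>A minimal sketch, not part of the original guide, that applies these two
ratios in Python. The overcommit fraction, core count, and flavor values are
hypothetical placeholders; substitute the values from your own design:</para>
<programlisting language="python"># Capacity-planning ratios from the list above (hypothetical inputs).
overcommit_fraction = 16.0   # assumed CPU overcommit ratio
physical_cores = 200         # assumed total physical cores in the cloud
vcpus_per_instance = 2       # vCPUs in the chosen flavor
flavor_disk_gb = 50          # disk size of the chosen flavor, in GB

# Number of expected concurrent instances:
#   (overcommit fraction x cores) / virtual cores per instance
concurrent_instances = (overcommit_fraction * physical_cores) / vcpus_per_instance

# Required storage: flavor disk size x number of instances
required_storage_gb = flavor_disk_gb * concurrent_instances

print(concurrent_instances)   # 1600.0 instances
print(required_storage_gb)    # 80000.0 GB
</programlisting>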
<para>Use these ratios to determine the amount of
additional infrastructure needed to support the cloud. For
example, consider a situation in which you require 1600
instances, each with 2 vCPU and 50 GB of storage. Assuming the
@ -69,18 +72,18 @@
services, database servers, and queue servers are likely to
encounter.</para>
<para>Consider, for example, the differences between a cloud that
supports a managed web-hosting platform and one running
integration tests for a development project that creates one
instance per code commit. In the former, the heavy work of
creating an instance happens only every few months, whereas
the latter puts constant heavy load on the cloud controller.
The average instance lifetime is significant, as a larger
number generally means less load on the cloud
controller.</para>
<para>Aside from the creation and termination of instances, consider the
impact of users accessing the service,
particularly on nova-api and its associated database. Listing
instances gathers a great deal of information and, given the
frequency with which users run this operation, a cloud with a
large number of users can increase the load significantly.
This can even occur unintentionally. For example, the
@ -88,8 +91,8 @@
instances every 30 seconds, so leaving it open in a browser
window can cause unexpected load.</para>
<para>Consideration of these factors can help determine how many
cloud controller cores you require. A server with 8 CPU cores
and 8 GB of RAM would be sufficient for a rack of
compute nodes, given the above caveats.</para>
<para>Key hardware specifications are also crucial to the
performance of user instances. Be sure to consider budget and
@ -98,59 +101,58 @@
|
||||
bandwidth (Gbps/core), and overall CPU performance
|
||||
(CPU/core).</para>
|
||||
<para>The cloud resource calculator is a useful tool in examining
|
||||
the impacts of different hardware and instance load outs. It
|
||||
is available at: <link xlink:href="https://github.com/noslzzp/cloud-resource-calculator/blob/master/cloud-resource-calculator.ods">https://github.com/noslzzp/cloud-resource-calculator/blob/master/cloud-resource-calculator.ods</link>
|
||||
the impacts of different hardware and instance load outs. See:
|
||||
<link xlink:href="https://github.com/noslzzp/cloud-resource-calculator/blob/master/cloud-resource-calculator.ods">https://github.com/noslzzp/cloud-resource-calculator/blob/master/cloud-resource-calculator.ods</link>
|
||||
</para>
|
||||
<section xml:id="expansion-planning-compute-focus">
|
||||
<title>Expansion planning</title>
|
||||
<para>A key challenge faced when planning the expansion of cloud
|
||||
<para>A key challenge for planning the expansion of cloud
|
||||
compute services is the elastic nature of cloud infrastructure
|
||||
demands. Previously, new users or customers would be forced to
|
||||
demands. Previously, new users or customers had to
|
||||
plan for and request the infrastructure they required ahead of
|
||||
time, allowing time for reactive procurement processes. Cloud
|
||||
computing users have come to expect the agility provided by
|
||||
having instant access to new resources as they are required.
|
||||
Consequently, this means planning should be delivered for
|
||||
typical usage, but also more importantly, for sudden bursts in
|
||||
computing users have come to expect the agility of having
|
||||
instant access to new resources as required.
|
||||
Consequently, plan for typical usage and for sudden bursts in
|
||||
usage.</para>
|
||||
<para>Planning for expansion can be a delicate balancing act.
|
||||
<para>Planning for expansion is a balancing act.
|
||||
Planning too conservatively can lead to unexpected
|
||||
oversubscription of the cloud and dissatisfied users. Planning
|
||||
for cloud expansion too aggressively can lead to unexpected
|
||||
underutilization of the cloud and funds spent on operating
|
||||
infrastructure that is not being used efficiently.</para>
|
||||
<para>The key is to carefully monitor the spikes and valleys in
|
||||
underutilization of the cloud and funds spent unnecessarily on operating
|
||||
infrastructure.</para>
|
||||
<para>The key is to carefully monitor the trends in
|
||||
cloud usage over time. The intent is to measure the
|
||||
consistency with which services can be delivered, not the
|
||||
consistency with which you deliver services, not the
|
||||
average speed or capacity of the cloud. Using this information
|
||||
to model performance results in capacity enables users to more
|
||||
to model capacity performance enables users to more
|
||||
accurately determine the current and future capacity of the
|
||||
cloud.</para></section>
|
||||
<section xml:id="cpu-and-ram-compute-focus"><title>CPU and RAM</title>
|
||||
<para>(Adapted from:
|
||||
<link xlink:href="http://docs.openstack.org/openstack-ops/content/compute_nodes.html#cpu_choice">http://docs.openstack.org/openstack-ops/content/compute_nodes.html#cpu_choice</link>)</para>
|
||||
<para>Adapted from:
|
||||
<link xlink:href="http://docs.openstack.org/openstack-ops/content/compute_nodes.html#cpu_choice">http://docs.openstack.org/openstack-ops/content/compute_nodes.html#cpu_choice</link></para>
|
||||
<para>In current generations, CPUs have up to 12 cores. If an
|
||||
Intel CPU supports Hyper-Threading, those 12 cores are doubled
|
||||
to 24 cores. If a server is purchased that supports multiple
|
||||
CPUs, the number of cores is further multiplied.
|
||||
Intel CPU supports Hyper-Threading, those 12 cores double
|
||||
to 24 cores. A server that supports multiple CPUs multiplies
|
||||
the number of available cores.
|
||||
Hyper-Threading is Intel's proprietary simultaneous
|
||||
multi-threading implementation, used to improve
|
||||
parallelization on their CPUs. Consider enabling
|
||||
Hyper-Threading to improve the performance of multithreaded
|
||||
applications.</para>
|
||||
<para>Whether the user should enable Hyper-Threading on a CPU
|
||||
depends upon the use case. For example, disabling
|
||||
depends on the use case. For example, disabling
|
||||
Hyper-Threading can be beneficial in intense computing
|
||||
environments. Performance testing conducted by running local
|
||||
workloads with both Hyper-Threading on and off can help
|
||||
determine what is more appropriate in any particular
|
||||
environments. Running performance tests using local
|
||||
workloads with and without Hyper-Threading can help
|
||||
determine which option is more appropriate in any particular
|
||||
case.</para>
|
||||
<para>If the Libvirt/KVM hypervisor driver are the intended use
|
||||
cases, then the CPUs used in the compute nodes must support
|
||||
<para>If they must run the Libvirt or KVM hypervisor drivers,
|
||||
then the compute node CPUs must support
|
||||
virtualization by way of the VT-x extensions for Intel chips
|
||||
and AMD-v extensions for AMD chips to provide full
|
||||
performance.</para>
|
||||
<para>OpenStack enables the user to overcommit CPU and RAM on
|
||||
<para>OpenStack enables users to overcommit CPU and RAM on
|
||||
compute nodes. This allows an increase in the number of
|
||||
instances running on the cloud at the cost of reducing the
|
||||
performance of the instances. OpenStack Compute uses the
|
||||
@ -179,8 +181,8 @@
|
||||
the RAM associated with the instances reaches 72 GB (such as
|
||||
nine instances, in the case where each instance has 8 GB of
|
||||
RAM).</para>
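<para>The following sketch (an illustration added here, not part of the
original text) shows how the allocation ratios translate into schedulable
capacity on a single compute node. The 16:1 CPU and 1.5:1 RAM ratios are
the Compute defaults; substitute the values used in your deployment.</para>
<programlisting language="python"># Overcommit sketch for one compute node with assumed hardware values.
physical_cores = 12          # cores in the node
physical_ram_gb = 48         # RAM in the node
cpu_allocation_ratio = 16.0  # default CPU overcommit
ram_allocation_ratio = 1.5   # default RAM overcommit

schedulable_vcpus = physical_cores * cpu_allocation_ratio    # 192 vCPUs
schedulable_ram_gb = physical_ram_gb * ram_allocation_ratio  # 72 GB

# With an 8 GB flavor, the scheduler stops placing instances on this node
# once nine are running, because 9 x 8 GB reaches the 72 GB limit.
instances_by_ram = int(schedulable_ram_gb // 8)
print(schedulable_vcpus, schedulable_ram_gb, instances_by_ram)</programlisting>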
<para>The appropriate CPU and RAM allocation ratio must be
selected based on particular use cases.</para></section>
<para>You must select the appropriate CPU and RAM allocation ratio
based on particular use cases.</para></section>
<section xml:id="additional-hardware-compute-focus">
<title>Additional hardware</title>
<para>Certain use cases may benefit from exposure to additional
@ -201,15 +203,15 @@
<listitem>
<para>Database management systems that benefit from the
availability of SSDs for ephemeral storage to maximize
read/write time when it is required.</para>
read/write time.</para>
</listitem>
</itemizedlist>
<para>Host aggregates are used to group hosts that share similar
<para>Host aggregates group hosts that share similar
characteristics, which can include hardware similarities. The
addition of specialized hardware to a cloud deployment is
likely to add to the cost of each node, so careful
consideration must be given to whether all compute nodes, or
just a subset which is targetable using flavors, need the
likely to add to the cost of each node, so consider carefully
whether all compute nodes, or
just a subset targeted by flavors, need the
additional customization to support the desired
workloads.</para></section>
<section xml:id="utilization"><title>Utilization</title>
@ -219,13 +221,13 @@
instances while making the best use of the available physical
resources.</para>
<para>In order to facilitate packing of virtual machines onto
physical hosts, the default selection of flavors are
constructed so that the second largest flavor is half the size
physical hosts, the default selection of flavors provides a
second largest flavor that is half the size
of the largest flavor in every dimension. It has half the
vCPUs, half the vRAM, and half the ephemeral disk space. The
next largest flavor is half that size again. As a result,
packing a server for general purpose computing might look
conceptually something like this figure:
next largest flavor is half that size again. The following figure
provides a visual representation of this concept for a general
purpose computing design:
<mediaobject>
<imageobject>
<imagedata contentwidth="4in"
@ -233,8 +235,7 @@
/>
</imageobject>
</mediaobject></para>
<para>On the other hand, a CPU optimized packed server might look
like the following figure:
<para>The following figure displays a CPU-optimized, packed server:
<mediaobject>
<imageobject>
<imagedata contentwidth="4in"
@ -242,35 +243,32 @@
/>
</imageobject>
</mediaobject></para>
<para>These default flavors are well suited to typical load outs
for commodity server hardware. To maximize utilization,
<para>These default flavors are well suited to typical configurations
of commodity server hardware. To maximize utilization,
however, it may be necessary to customize the flavors or
create new ones, to better align instance sizes to the
create new ones in order to better align instance sizes to the
available hardware.</para>
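<para>As a brief illustration of the half-sizing pattern, the following
sketch (not from the original guide) lists the stock m1 flavor dimensions
and shows one way to generate a custom ladder that halves each dimension
from whatever largest flavor suits your hardware; the helper function is
hypothetical.</para>
<programlisting language="python"># Default flavor ladder: each step halves vCPUs, RAM, and ephemeral disk.
default_flavors = {
    # name:      (vCPUs, RAM in MB, ephemeral disk in GB)
    "m1.xlarge": (8, 16384, 160),
    "m1.large":  (4, 8192, 80),
    "m1.medium": (2, 4096, 40),
    "m1.small":  (1, 2048, 20),
}

def halved_ladder(vcpus, ram_mb, disk_gb, steps=4):
    """Generate a custom flavor ladder by halving every dimension."""
    ladder = []
    for _ in range(steps):
        ladder.append((vcpus, ram_mb, disk_gb))
        vcpus, ram_mb, disk_gb = vcpus // 2, ram_mb // 2, disk_gb // 2
    return ladder

# Example: a CPU-heavy largest flavor for a compute-focused cloud.
print(default_flavors["m1.xlarge"])
print(halved_ladder(16, 65536, 320))</programlisting>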
<para>Workload characteristics may also influence hardware choices
and flavor configuration, particularly where they present
different ratios of CPU versus RAM versus HDD
requirements.</para>
<para>For more information on Flavors refer to:
<para>For more information on Flavors see:
<link xlink:href="http://docs.openstack.org/openstack-ops/content/flavors.html">http://docs.openstack.org/openstack-ops/content/flavors.html</link></para>
</section>
<section xml:id="performance-compute-focus"><title>Performance</title>
<para>The infrastructure of a cloud should not be shared, so that
it is possible for the workloads to consume as many resources
as are made available, and accommodations should be made to
provide large scale workloads.</para>
<para>So that workloads can consume as many resources
as are available, do not share cloud infrastructure. Ensure you accommodate
large scale workloads.</para>
<para>The duration of batch processing differs depending on
individual workloads that are launched. Time limits range from
seconds, minutes to hours, and as a result it is considered
difficult to predict when resources will be used, for how
long, and even which resources will be used.</para>
individual workloads. Time limits range from
seconds to hours, and as a result it is difficult to predict resource
use.</para>
</section>
<section xml:id="security-compute-focus"><title>Security</title>
<para>The security considerations needed for this scenario are
similar to those of the other scenarios discussed in this
book.</para>
<para>A security domain comprises users, applications, servers
or networks that share common trust requirements and
<para>The security considerations for this scenario are
similar to those of the other scenarios in this guide.</para>
<para>A security domain comprises users, applications, servers,
and networks that share common trust requirements and
expectations within a system. Typically they have the same
authentication and authorization requirements and
users.</para>
@ -289,77 +287,77 @@
<para>Data</para>
</listitem>
</orderedlist>
<para>These security domains can be mapped individually to the
installation, or they can also be combined. For example, some
<para>You can map these security domains individually to the
installation, or combine them. For example, some
deployment topologies combine both guest and data domains onto
one physical network, whereas in other cases these networks
are physically separated. In each case, the cloud operator
should be aware of the appropriate security concerns. Security
domains should be mapped out against specific OpenStack
are physically separate. In each case, the cloud operator
should be aware of the appropriate security concerns. Map out
security domains against the specific OpenStack
deployment topology. The domains and their trust requirements
depend upon whether the cloud instance is public, private, or
depend on whether the cloud instance is public, private, or
hybrid.</para>
<para>The public security domain is an entirely untrusted area of
<para>The public security domain is an untrusted area of
the cloud infrastructure. It can refer to the Internet as a
whole or simply to networks over which the user has no
authority. This domain should always be considered
untrusted.</para>
authority. Always consider this domain untrusted.</para>
<para>Typically used for compute instance-to-instance traffic, the
guest security domain handles compute data generated by
instances on the cloud; not services that support the
instances on the cloud. It does not handle services that support the
operation of the cloud, for example API calls. Public cloud
providers and private cloud providers who do not have
stringent controls on instance use or who allow unrestricted
Internet access to instances should consider this domain to be
untrusted. Private cloud providers may want to consider this
network as internal and therefore trusted only if they have
Internet access to instances should consider this an untrusted domain.
Private cloud providers may want to consider this
an internal network and therefore trusted only if they have
controls in place to assert that they trust instances and all
their tenants.</para>
<para>The management security domain is where services interact.
Sometimes referred to as the "control plane", the networks in
this domain transport confidential data such as configuration
parameters, user names, and passwords. In most deployments this
domain is considered trusted.</para>
<para>The data security domain is concerned primarily with
is a trusted domain.</para>
<para>The data security domain deals with
information pertaining to the storage services within
OpenStack. Much of the data that crosses this network has high
integrity and confidentiality requirements and depending on
the type of deployment there may also be strong availability
requirements. The trust level of this network is heavily
dependent on deployment decisions and as such we do not assign
this any default level of trust.</para>
<para>When deploying OpenStack in an enterprise as a private cloud
it is assumed to be behind a firewall and within the trusted
this a default level of trust.</para>
<para>When deploying OpenStack in an enterprise as a private cloud, you can
generally assume it is behind a firewall and within the trusted
network alongside existing systems. Users of the cloud are
typically employees or trusted individuals that are bound by
the security requirements set forth by the company. This tends
to push most of the security domains towards a more trusted
model. However, when deploying OpenStack in a public-facing
role, no assumptions can be made and the attack vectors
significantly increase. For example, the API endpoints and the
software behind it will be vulnerable to potentially hostile
entities wanting to gain unauthorized access or prevent access
to services. This can result in loss of reputation and must be
protected against through auditing and appropriate
role, you cannot make these assumptions and the number of attack vectors
significantly increases. For example, the API endpoints and the
software behind them become vulnerable to hostile
entities attempting to gain unauthorized access or prevent access
to services. This can result in loss of reputation and you must
protect against it through auditing and appropriate
filtering.</para>
<para>Consideration must be taken when managing the users of the
system, whether it is the operation of public or private
clouds. The identity service allows for LDAP to be part of the
<para>Take care when managing the users of the
system, whether in public or private
clouds. The identity service enables LDAP to be part of the
authentication process, which may ease user management if the
OpenStack deployment is integrated with existing identity
systems.</para>
<para>It is strongly recommended that the API services are placed
behind hardware that performs SSL termination. API services
<para>We recommend placing API services behind hardware that
performs SSL termination. API services
transmit user names, passwords, and generated tokens between
client machines and API endpoints and therefore must be
secured.</para>
<para>More information on OpenStack Security can be found
at <link xlink:href="http://docs.openstack.org/security-guide/">http://docs.openstack.org/security-guide/</link></para>
secure.</para>
<para>For more information on OpenStack Security, see
<link xlink:href="http://docs.openstack.org/security-guide/">http://docs.openstack.org/security-guide/</link>
</para>
</section>
<section xml:id="openstack-components-compute-focus">
<title>OpenStack components</title>
<para>Due to the nature of the workloads that will be used in this
scenario, a number of components will be highly beneficial in
<para>Due to the nature of the workloads in this
scenario, a number of components are highly beneficial for
a Compute-focused cloud. This includes the typical OpenStack
components:</para>
<itemizedlist>
@ -381,10 +379,10 @@
</listitem>
</itemizedlist>
<para>It is safe to assume that, given the nature of the
applications involved in this scenario, these will be heavily
automated deployments. Making use of Orchestration will be highly
beneficial in this case. Deploying a batch of instances and
running an automated set of tests can be scripted, however it
applications involved in this scenario, these are heavily
automated deployments. Making use of Orchestration is highly
beneficial in this case. You can script the deployment of a
batch of instances and the running of tests, but it
makes sense to use the Orchestration module
to handle all these actions.</para>
<itemizedlist>
@ -392,11 +390,10 @@
<para>Telemetry module (ceilometer)</para>
</listitem>
</itemizedlist>
<para>Telemetry and the alarms it generates are required
to support autoscaling of instances using
Orchestration. Users that are not using the
<para>Telemetry and the alarms it generates support autoscaling
of instances using Orchestration. Users that are not using the
Orchestration module do not need to deploy the Telemetry module and
may choose to use other external solutions to fulfill their
may choose to use external solutions to fulfill their
metering and monitoring requirements.</para>
<para>See also:
<link xlink:href="http://docs.openstack.org/openstack-ops/content/logging_monitoring.html">http://docs.openstack.org/openstack-ops/content/logging_monitoring.html</link></para>
@ -406,12 +403,12 @@
</listitem>
</itemizedlist>
<para>Due to the burst-able nature of the workloads and the
applications and instances that will be used for batch
processing, this cloud will utilize mainly memory or CPU, so
applications and instances that perform batch
processing, this cloud mainly uses memory or CPU, so
the need for add-on storage to each instance is not a likely
requirement. This does not mean that OpenStack Block Storage
(cinder) will not be used in the infrastructure, but
typically it will not be used as a central component.</para>
requirement. This does not mean that you do not use
OpenStack Block Storage (cinder) in the infrastructure, but
typically it is not a central component.</para>
<itemizedlist>
<listitem>
<para>Networking</para>
@ -419,7 +416,7 @@
</itemizedlist>
<para>When choosing a networking platform, ensure that it either
works with all desired hypervisor and container technologies
and their OpenStack drivers, or includes an implementation of
an ML2 mechanism driver. Networking platforms that provide ML2
mechanisms drivers can be mixed.</para></section>
and their OpenStack drivers, or that it includes an implementation of
an ML2 mechanism driver. You can mix networking platforms
that provide ML2 mechanism drivers.</para></section>
</section>
@ -10,10 +10,9 @@
xml:id="user-requirements-compute-focus">
<?dbhtml stop-chunking?>
<title>User requirements</title>
<para>Compute intensive workloads are defined by their high
utilization of CPU, RAM, or both. User requirements will
determine if a cloud must be built to accommodate anticipated
performance demands.
<para>High utilization of CPU, RAM, or both defines compute
intensive workloads. User requirements determine the performance
demands for the cloud.
</para>
<variablelist>
<varlistentry>
@ -23,25 +22,23 @@
compute-focused cloud, however some organizations
might be concerned with cost avoidance. Repurposing
existing resources to tackle compute-intensive tasks
instead of needing to acquire additional resources may
instead of acquiring additional resources may
offer cost reduction opportunities.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Time to market</term>
<listitem>
<para>Compute-focused clouds can be used
to deliver products more quickly, for example,
speeding up a company's software development life cycle
(SDLC) for building products and applications.</para>
<para>Compute-focused clouds can deliver products more quickly,
for example by speeding up a company's software development
life cycle (SDLC) for building products and applications.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Revenue opportunity</term>
<listitem>
<para>Companies that are interested
in building services or products that rely on the
power of the compute resources will benefit from a
<para>Companies that want to build services or products that
rely on the power of compute resources benefit from a
compute-focused cloud. Examples include the analysis
of large data sets (via Hadoop or Cassandra) or
completing computational intensive tasks such as
@ -71,9 +68,9 @@
jurisdictions.</para>
</listitem>
<listitem>
<para>Data compliance—certain types of information needs
to reside in certain locations due to regular issues—and
more important cannot reside in other locations
<para>Data compliance: certain types of information need
to reside in certain locations due to regulatory issues and,
more importantly, cannot reside in other locations
for the same reason.</para>
</listitem>
</itemizedlist>
@ -88,15 +85,14 @@
information.</para></section>
<section xml:id="technical-considerations-compute-focus-user">
<title>Technical considerations</title>
<para>The following are some technical requirements that need to
be incorporated into the architecture design.
<para>The following are some technical requirements you must consider
in the architecture design:
</para>
<variablelist>
<varlistentry>
<term>Performance</term>
<listitem>
<para>If a primary technical concern is for
the environment to deliver high performance
<para>If a primary technical concern is to deliver high performance
capability, then a compute-focused design is an
obvious choice because it is specifically designed to
host compute-intensive workloads.</para>
@ -106,24 +102,23 @@
<term>Workload persistence</term>
<listitem>
<para>Workloads can be either
short-lived or long running. Short-lived workloads
might include continuous integration and continuous
deployment (CI-CD) jobs, where large numbers of
compute instances are created simultaneously to
perform a set of compute-intensive tasks. The results
or artifacts are then copied from the instance into
long-term storage before the instance is destroyed.
short-lived or long-running. Short-lived workloads
can include continuous integration and continuous
deployment (CI-CD) jobs, which create large numbers of
compute instances simultaneously to
perform a set of compute-intensive tasks. The environment then
copies the results or artifacts from each instance into
long-term storage before destroying the instance.
Long-running workloads, like a Hadoop or
high-performance computing (HPC) cluster, typically
ingest large data sets, perform the computational work
on those data sets, then push the results into long
term storage. Unlike short-lived workloads, when the
computational work is completed, they will remain idle
until the next job is pushed to them. Long-running
workloads are often larger and more complex, so the
effort of building them is mitigated by keeping them
active between jobs. Another example of long running
workloads is legacy applications that typically are
on those data sets, then push the results into long-term
storage. When the computational work finishes, the instances
remain idle until they receive another job. Environments
for long-running workloads are often larger and more complex,
but you can offset the cost of building them by keeping them
active between jobs. Another example of long-running
workloads is legacy applications that are
persistent over time.</para>
</listitem>
</varlistentry>
@ -132,14 +127,14 @@
<listitem>
<para>Workloads targeted for a compute-focused
OpenStack cloud generally do not require any
persistent block storage (although some usages of
Hadoop with HDFS may dictate the use of persistent
block storage). A shared filesystem or object store
will maintain the initial data set(s) and serve as the
persistent block storage, although some uses of
Hadoop with HDFS may require persistent
block storage. A shared filesystem or object store
maintains the initial data sets and serves as the
destination for saving the computational results. By
avoiding the input-output (IO) overhead, workload
performance is significantly enhanced. Depending on
the size of the data set(s), it might be necessary to
avoiding the input-output (IO) overhead, you can significantly
enhance workload performance. Depending on
the size of the data sets, it may be necessary to
scale the object store or shared file system to match
the storage demand.</para>
</listitem>
@ -150,7 +145,7 @@
<para>Like any other cloud architecture, a
compute-focused OpenStack cloud requires an on-demand
and self-service user interface. End users must be
able to provision computing power, storage, networks
able to provision computing power, storage, networks,
and software simply and flexibly. This includes
scaling the infrastructure up to a substantial level
without disrupting host operations.</para>
@ -159,12 +154,12 @@
<varlistentry>
<term>Security</term>
<listitem>
<para>Security is going to be highly dependent
on the business requirements. For example, a
<para>Security is highly dependent
on business requirements. For example, a
computationally intense drug discovery application
will obviously have much higher security requirements
than a cloud that is designed for processing market
data for a retailer. As a general start, the security
has much higher security requirements
than a cloud for processing market
data for a retailer. As a general rule, the security
recommendations and guidelines provided in the
OpenStack Security Guide are applicable.</para>
</listitem>
@ -173,9 +168,8 @@
</section>
<section xml:id="operational-considerations-compute-focus-user">
<title>Operational considerations</title>
<para>The compute intensive cloud from the operational perspective
is similar to the requirements for the general-purpose cloud.
More details on operational requirements can be found in the
general-purpose design section.</para>
<para>From an operational perspective, a compute intensive cloud
is similar to a general-purpose cloud. See the general-purpose
design section for more details on operational requirements.</para>
</section>
</section>