Remove passive voice from Arch Guide Chap 2

Closes-Bug: #1400552

Change-Id: I5b1572d7b5cf3321dcfa2836d7f777fe450a87e3
This commit is contained in:
Brian Moss 2015-04-23 09:48:06 +10:00
parent 2a93d184c5
commit 60002fc5be
6 changed files with 408 additions and 456 deletions


@@ -8,12 +8,11 @@
<para>A compute-focused cloud is a specialized subset of the general purpose
OpenStack cloud architecture. Unlike the general purpose OpenStack
architecture, which hosts a wide variety of workloads and
applications and does not heavily tax any particular computing aspect,
a compute-focused cloud specifically supports
compute intensive workloads. Compute intensive
workloads may be CPU intensive, RAM intensive, or both; they are
not typically storage intensive or network intensive. Compute-focused
workloads may include the following use cases:</para>
<itemizedlist>
@@ -36,11 +35,11 @@
</itemizedlist>
<para>Based on the use case requirements, such clouds might need to provide
additional services such as a virtual machine disk library, file or object
storage, firewalls, load balancers, IP addresses, or network connectivity
in the form of overlays or virtual local area networks (VLANs). A
compute-focused OpenStack cloud does not typically use raw block storage
services as it does not generally host applications that require
persistent block storage.</para>
<xi:include href="compute_focus/section_user_requirements_compute_focus.xml"/>
<xi:include href="compute_focus/section_tech_considerations_compute_focus.xml"/>


@@ -20,15 +20,15 @@
</itemizedlist>
<para>
An OpenStack cloud with extreme demands on processor and memory
resources is compute-focused, and requires hardware that
can handle these demands. This can mean choosing hardware which might
not perform as well on storage or network capabilities. In a
compute-focused architecture, storage and networking are needed to load a
data set into the computational cluster, but are not otherwise in heavy
demand.
</para>
<para>
Consider the following factors when selecting compute (server) hardware:
</para>
<variablelist>
<varlistentry>
@@ -42,14 +42,14 @@
<term>Resource capacity</term>
<listitem>
<para>The number of CPU cores, how much RAM, or how
much storage a given server delivers.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Expandability</term>
<listitem>
<para>The number of additional resources you can add to a
server before it reaches its limit.</para>
</listitem>
</varlistentry>
<varlistentry>
@@ -60,7 +60,7 @@
</listitem>
</varlistentry>
</variablelist>
<para>Weigh these considerations against each other to determine the
best design for the desired purpose. For example, increasing server density
means sacrificing resource capacity or expandability. Increasing resource
capacity and expandability can increase cost but decreases server density.
@@ -68,38 +68,38 @@
resource capacity, and expandability.</para>
<para>A compute-focused cloud should have an emphasis on server hardware
that can offer more CPU sockets, more CPU cores, and more RAM. Network
connectivity and storage capacity are less critical. The hardware must
provide enough network connectivity and storage
capacity to meet minimum user requirements, but they are not the primary
consideration.</para>
<para>Some server hardware form factors suit a compute-focused architecture
better than others. CPU and RAM capacity have the highest priority. Some
considerations for selecting hardware:</para>
<itemizedlist>
<listitem>
<para>Most blade servers can support dual-socket multi-core CPUs. To
avoid this CPU limit, select "full width" or "full height" blades.
Be aware, however, that this also decreases server density. For example,
high density blade servers such as HP BladeSystem or Dell PowerEdge
M1000e support up to 16 servers in only ten rack units. Using
half-height blades is twice as dense as using full-height blades,
which results in only eight servers per ten rack units.</para>
</listitem>
<listitem>
<para>1U rack-mounted servers that occupy only a single rack
unit may offer greater server density than a blade server
solution. It is possible to place forty 1U servers in a rack, providing
space for the top of rack (ToR) switches, compared to 32 full width
blade servers. However, as of the Icehouse release, 1U servers from
the major vendors have only dual-socket, multi-core CPU
configurations. To obtain greater than dual-socket support in a 1U
rack-mount form factor, purchase systems from original
design manufacturers (ODMs) or second-tier manufacturers.</para>
</listitem>
<listitem>
<para>2U rack-mounted servers provide quad-socket, multi-core CPU
support, but with a corresponding decrease in server density (half
the density that 1U rack-mounted servers offer).</para>
</listitem>
<listitem>
<para>Larger rack-mounted servers, such as 4U servers, often provide
@@ -108,8 +108,8 @@
have much lower server density and are often more expensive.</para>
</listitem>
<listitem>
<para>"Sled servers" are rack-mounted servers that support multiple
independent servers in a single 2U or 3U enclosure. These deliver higher
density compared to typical 1U or 2U rack-mounted servers. For
example, many sled servers offer four independent dual-socket
nodes in 2U for a total of eight CPU sockets in 2U. However, the
@@ -125,7 +125,7 @@
<listitem>
<para>In a compute-focused architecture, instance density is
lower, which means CPU and RAM over-subscription ratios are
also lower. You require more hosts to support the anticipated
scale, especially if the design uses dual-socket hardware.</para>
</listitem>
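The effect of lower over-subscription ratios on host count can be sketched with a rough capacity calculation. This is an illustrative model only, not from the guide; the instance counts, core counts, and allocation ratios below are assumed figures.

```python
import math

def required_hosts(instances, vcpus_per_instance, cores_per_host,
                   cpu_allocation_ratio):
    """Estimate compute hosts needed, based only on CPU capacity.

    cpu_allocation_ratio is the over-subscription ratio: effective
    vCPU capacity per host = physical cores * ratio.
    """
    effective_vcpus_per_host = cores_per_host * cpu_allocation_ratio
    total_vcpus = instances * vcpus_per_instance
    return math.ceil(total_vcpus / effective_vcpus_per_host)

# A compute-focused cloud uses a lower ratio, so it needs far more
# hosts for the same instance count than a general purpose cloud.
general = required_hosts(1000, 4, 32, cpu_allocation_ratio=16.0)
focused = required_hosts(1000, 4, 32, cpu_allocation_ratio=2.0)
```

With these assumed numbers, the low-ratio (compute-focused) design needs 63 hosts where the high-ratio design needs only 8, which is why dual-socket designs can drive the host count up sharply.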
@@ -134,8 +134,8 @@
<term>Host density</term>
<listitem>
<para>Another option to address the higher host count
of dual socket designs is to use a quad
socket platform. Taking this approach decreases host density,
which increases rack count. This configuration may
affect the network requirements, the number of power connections, and
possibly impact the cooling requirements.</para>
@@ -145,64 +145,62 @@
<term>Power and cooling density</term>
<listitem>
<para>The power and cooling density
requirements for 2U, 3U or even 4U server designs might be lower
than for blade, sled, or 1U server designs because of lower host
density. For data centers with older infrastructure, this may
be a desirable feature.</para>
</listitem>
</varlistentry>
</variablelist>
<para>When designing a compute-focused OpenStack architecture, you must
consider whether you intend to scale up or scale out.
Selecting a smaller number of larger hosts, or a
larger number of smaller hosts, depends on a combination of factors:
cost, power, cooling, physical rack and floor space, support-warranty,
and manageability.</para>
<section xml:id="storage-hardware-selection">
<title>Storage hardware selection</title>
<para>For a compute-focused OpenStack architecture, the
selection of storage hardware is not critical as it is not a primary
consideration. Nonetheless, there are several factors
to consider:</para>
<variablelist>
<varlistentry>
<term>Cost</term>
<listitem>
<para>The overall cost of the solution plays a major role
in what storage architecture and storage hardware you select.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Performance</term>
<listitem>
<para>The performance of the storage solution is important; you can
measure it by observing the latency of storage I/O
requests. In a compute-focused OpenStack cloud, storage latency
can be a major consideration. In some compute-intensive
workloads, minimizing the delays that the CPU experiences while
fetching data from storage can significantly impact
the overall performance of the application.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Scalability</term>
<listitem>
<para>Scalability refers to the performance of a storage solution
as it expands to its maximum size. A solution that performs
well in small configurations but has degrading
performance as it expands is not scalable. On
the other hand, a solution that continues to perform well at
maximum expansion is scalable.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Expandability</term>
<listitem>
<para>Expandability refers to the overall ability of
a storage solution to grow. A solution that expands to 50 PB is
more expandable than a solution that only scales to 10 PB.
Note that this metric is related to, but different
from, scalability, which is a measure of the solution's
performance as it expands.</para>
@@ -211,23 +209,20 @@
</variablelist>
<para>For a compute-focused OpenStack cloud, latency of storage is a
major consideration. Using solid-state disks (SSDs) to minimize
latency for instance storage reduces CPU delays related to storage
and improves performance. Consider using RAID
controller cards in compute hosts to improve the performance of the
underlying disk subsystem.</para>
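One way to observe the storage latency discussed above is to time small synchronous writes. The sketch below is a rough, assumed probe, not a method from the guide; real storage benchmarking would use a dedicated tool such as fio.

```python
import os
import tempfile
import time

def measure_write_latency(path, block_size=4096, iterations=100):
    """Roughly estimate average synchronous write latency in seconds
    by timing small fsync'd writes to the given path."""
    data = os.urandom(block_size)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(iterations):
            os.write(fd, data)
            os.fsync(fd)  # force each write through to the device
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return elapsed / iterations

with tempfile.TemporaryDirectory() as d:
    latency = measure_write_latency(os.path.join(d, "probe.bin"))
```

Comparing this figure between an SSD-backed and a spinning-disk-backed host makes the latency difference the text describes directly visible.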
<para>Evaluate solutions against the key factors above when considering
your storage architecture. This determines if a scale-out solution
such as Ceph or GlusterFS is suitable, or if a single, highly expandable,
scalable, centralized storage array is better. If a centralized
storage array suits the requirements, the array vendor determines the
hardware. You can build a storage array using commodity hardware with
Open Source software, but you require people with expertise to build
such a system. Conversely, a scale-out storage solution that uses
direct-attached storage (DAS) in the servers may be an appropriate
choice. If so, then the server hardware must
support the storage solution.</para>
<para>The following lists some of the potential impacts that may affect a
particular storage architecture, and the corresponding storage hardware,
@@ -236,92 +231,89 @@
<varlistentry>
<term>Connectivity</term>
<listitem>
<para>Ensure connectivity matches the storage solution requirements.
If you select a centralized storage array, determine how the
hypervisors should connect to the storage array. Connectivity
can affect latency and thus performance, so ensure that the network
characteristics minimize latency to boost the overall
performance of the design.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Latency</term>
<listitem>
<para>Determine if the use case has consistent or
highly variable latency.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Throughput</term>
<listitem>
<para>To improve overall performance, ensure that you optimize the
throughput of the storage solution. While a compute-focused cloud
does not usually have major data I/O to and from storage, this is
an important factor to consider.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Server Hardware</term>
<listitem>
<para>If the solution uses DAS, this impacts the server hardware
choice, which in turn affects
host density, instance density, power density, OS-hypervisor, and
management tools.</para>
</listitem>
</varlistentry>
</variablelist>
<para>When instances must be highly available or capable of migration
between hosts, use a shared storage file-system
to store instance ephemeral data to ensure that
compute services can run uninterrupted in the event of a node
failure.</para>
</section>
<section xml:id="selecting-networking-hardware-arch">
<title>Selecting networking hardware</title>
<para>Some of the key considerations for networking hardware selection
include:</para>
<variablelist>
<varlistentry>
<term>Port count</term>
<listitem>
<para>The design requires networking hardware that
has the requisite port count.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Port density</term>
<listitem>
<para>The required port count affects the physical space that a
network design requires.
A switch that can provide 48 10 GbE ports in 1U has a much higher
port density than a switch that provides 24 10 GbE ports in 2U. A
higher port density is better, as it leaves more rack space for
compute or storage components. You must also consider fault
domains and power density. Higher density switches are more
expensive, so weigh their cost against the benefit; it is important
not to design the network beyond requirements.</para>
</listitem>
</varlistentry>
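The port density comparison above reduces to a simple ports-per-rack-unit calculation; the figures below simply restate the two example switches from the text.

```python
def ports_per_rack_unit(ports, rack_units):
    """Port density: switch ports delivered per rack unit (RU)."""
    return ports / rack_units

dense = ports_per_rack_unit(48, 1)   # 48 x 10 GbE ports in 1U
sparse = ports_per_rack_unit(24, 2)  # 24 x 10 GbE ports in 2U
# The 1U switch delivers four times the port density of the 2U switch,
# freeing rack space for compute or storage components.
```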
<varlistentry>
<term>Port speed</term>
<listitem>
<para>The networking hardware must support the proposed
network speed, for example: 1 GbE, 10 GbE, 40 GbE, or 100
GbE.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Redundancy</term>
<listitem>
<para>User requirements for high availability and cost considerations
influence the level of network hardware redundancy you require.
You can achieve network redundancy by adding
redundant power supplies or paired switches. If this is a
requirement, the hardware must support this configuration.
User requirements determine if you require a completely redundant network
infrastructure.</para>
</listitem>
</varlistentry>
<varlistentry>
@@ -335,29 +327,24 @@
</listitem>
</varlistentry>
</variablelist>
<para>We recommend designing the network architecture using
a scalable network model that makes it easy to add capacity and
bandwidth. A good example of such a model is the leaf-spine model. In
this type of network design, it is possible to easily add additional
bandwidth as well as scale out to additional racks of gear. It is
important to select network hardware that supports the required
port count, port speed, and port density while also allowing for future
growth as workload demands increase. It is also important to evaluate
where in the network architecture it is valuable to provide redundancy.
Increased network availability and redundancy comes at a cost, therefore
we recommend weighing the cost versus the benefit gained from
utilizing and deploying redundant network switches and using bonded
interfaces at the host level.</para>
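One reason a leaf-spine model adds bandwidth easily is that a leaf switch's oversubscription is just the ratio of server-facing to spine-facing bandwidth, and adding uplinks lowers it. The port counts and speeds below are illustrative assumptions, not recommendations.

```python
def leaf_oversubscription(downlink_ports, downlink_gbps,
                          uplink_ports, uplink_gbps):
    """Ratio of server-facing to spine-facing bandwidth on a leaf
    switch; 1.0 means the fabric is fully non-blocking."""
    return (downlink_ports * downlink_gbps) / (uplink_ports * uplink_gbps)

# A leaf with 48 x 10 GbE server ports and 4 x 40 GbE uplinks:
ratio = leaf_oversubscription(48, 10, 4, 40)      # 480 / 160 = 3.0
# Adding two more 40 GbE uplinks lowers the ratio without touching the
# servers, which is how the fabric grows bandwidth incrementally.
improved = leaf_oversubscription(48, 10, 6, 40)   # 480 / 240 = 2.0
```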
</section> </section>
<section xml:id="software-selection-arch">
<title>Software selection</title>
<para>Consider your selection of software for a compute-focused
OpenStack architecture:</para>
<itemizedlist>
<listitem>
<para>Operating system (OS) and hypervisor</para>
@@ -377,24 +364,19 @@
<para>The selection of operating system (OS) and hypervisor has a
significant impact on the end point design. Selecting a particular
operating system and hypervisor could affect server hardware selection.
The node, networking, and storage hardware must support the selected
combination. For example, if the design uses Link Aggregation
Control Protocol (LACP), the hypervisor must support it.</para>
<para>OS and hypervisor selection impacts the following areas:</para>
<variablelist>
<varlistentry>
<term>Cost</term>
<listitem>
<para>Selecting a commercially supported hypervisor such as
Microsoft Hyper-V results in a different cost model from
choosing a community-supported, open source hypervisor like KVM
or Xen. Even within the ranks of open source solutions, choosing
one solution over another can impact cost due
to support contracts. On the other hand, business or application
requirements might dictate a specific or commercially supported
hypervisor.</para>
@@ -403,11 +385,9 @@
<varlistentry>
<term>Supportability</term>
<listitem>
<para>Staff require appropriate training and knowledge to support the
selected OS and hypervisor combination. Consideration of training
costs may impact the design.</para>
</listitem>
</varlistentry>
<varlistentry>
@@ -415,10 +395,8 @@
<listitem>
<para>The management tools used for Ubuntu and
KVM differ from the management tools for VMware vSphere.
Although OpenStack supports both OS and hypervisor combinations,
the choice of tool impacts the rest of the design.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Scale and performance</term>
<listitem>
<para>Ensure that the selected OS and hypervisor
combinations meet the appropriate scale and performance
requirements. The chosen architecture must meet the targeted
instance-host ratios with the selected OS-hypervisor
combination.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Security</term>
<listitem>
<para>Ensure that the design can accommodate the regular
installation of application security patches while
maintaining the required workloads. The frequency of security
patches for the proposed OS-hypervisor combination has an
impact on performance, and the patch installation process can
affect maintenance windows.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Supported features</term>
<listitem>
<para>Determine which OpenStack features you require.
The choice of features often determines the selection of the
OS-hypervisor combination. Certain features are only available with
specific OSs or hypervisors. If required features are
not available, modify the design to meet user requirements.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Interoperability</term>
<listitem>
<para>Consider the ability of the selected OS-hypervisor combination
to interoperate or co-exist with other OS-hypervisors, or with
other software solutions in the overall design. Operational and
troubleshooting tools for one OS-hypervisor combination may differ
from the tools for another, and the design
must address whether the two sets of tools need to interoperate.</para>
</listitem>
</varlistentry>
</variablelist>
</section>
<section xml:id="openstack-components-arch">
<title>OpenStack components</title>
<para>The selection of OpenStack components has a significant impact
on the overall design. Certain components, such as the compute and
image services, are always present. Others, such as the Orchestration
module, may not be. Omitting the Orchestration module does not
typically have a significant impact on the overall design. However, if
the architecture uses a replacement for OpenStack Object Storage for
its storage component, this could have significant impacts on the rest
of the design.</para>
<para>For a compute-focused OpenStack design architecture, the
following components may be present:</para>
<itemizedlist>
<listitem>
<para>Identity (keystone)</para>
</listitem>
<listitem>
<para>Orchestration (heat)</para>
</listitem>
</itemizedlist>
<para>A compute-focused design is less likely to include OpenStack Block
Storage because persistent block storage is not
a significant requirement for the expected workloads. However,
in some situations the need for performance dictates
a block storage component to improve data I/O.</para>
<para>The exclusion of certain OpenStack components might also limit the
functionality of other components. If a design opts to
include the Orchestration module but excludes the Telemetry module, then
the design cannot take advantage of Orchestration's auto-scaling
functionality, which relies on information from Telemetry.</para>
</section>
<section xml:id="supplemental-software">
<title>Supplemental software</title>
<para>While OpenStack is a fairly complete collection of software
projects for building a platform for cloud services, there are
invariably additional pieces of software that you might add
to an OpenStack design.</para>
<section xml:id="networking-software-arch">
<title>Networking software</title>
<para>OpenStack Networking provides a wide variety of networking services
for instances. There are many additional networking software packages
that might be useful to manage the OpenStack components themselves.
Some examples include software to provide load balancing,
network redundancy protocols, and routing daemons. The
<citetitle>OpenStack High Availability Guide</citetitle> (<link
xlink:href="http://docs.openstack.org/high-availability-guide/content">http://docs.openstack.org/high-availability-guide/content</link>)
describes some of these software packages in more detail.
</para>
<para>For a compute-focused OpenStack cloud, the OpenStack infrastructure
components must be highly available. If the design does not
include hardware load balancing, you must add networking software packages
such as HAProxy.</para>
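If HAProxy fills this role, a minimal sketch of balancing a single API endpoint, here the Compute API on its default port 8774, might look like the following. All addresses, server names, and timer values are illustrative assumptions, not recommended values:

```cfg
# Sketch only: spread nova-api traffic across three controller nodes.
# The virtual IP, node addresses, and check intervals are placeholders.
listen nova-api
    bind 192.168.1.10:8774
    balance roundrobin
    option tcpka
    server controller1 192.168.1.11:8774 check inter 2000 rise 2 fall 5
    server controller2 192.168.1.12:8774 check inter 2000 rise 2 fall 5
    server controller3 192.168.1.13:8774 check inter 2000 rise 2 fall 5
```

In practice, each OpenStack API endpoint behind the balancer gets a similar `listen` stanza, and clients are pointed at the virtual IP rather than at individual controllers.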
</section>
<section xml:id="management-software-arch">
<title>Management software</title>
<para>The selected supplemental software solution affects
the overall OpenStack cloud design. This includes software for
providing clustering, logging, monitoring, and alerting.</para>
<para>Availability design requirements primarily determine the
inclusion of clustering software such as Corosync or Pacemaker.
The availability of the cloud infrastructure and the
complexity of supporting the configuration after deployment also
affect the decision to include these software packages. The
OpenStack High Availability Guide provides more
details on the installation and configuration of Corosync and Pacemaker.
</para>
<para>Operational considerations determine the requirements for logging,
monitoring, and alerting. Each of these sub-categories includes
various options. For example, in the logging sub-category
consider Logstash, Splunk, Log Insight, or another log
aggregation-consolidation tool. Store logs in a centralized
location to ease analysis of the data. Log
data analytics engines can also provide automation and issue
notification by alerting and
attempting to remediate some of the more commonly known issues.</para>
<para>If you require any of these software packages, the design
must account for the additional resource consumption, such as the CPU,
RAM, storage, and network bandwidth of a log aggregation solution.
Other potential design impacts include:</para>
<itemizedlist>
<listitem>
<para>OS-hypervisor combination: ensure that the selected logging,
monitoring, or alerting tools support the proposed OS-hypervisor
combination.</para>
</listitem>
<listitem>
<para>Network hardware: the logging, monitoring, and alerting software
must support the network hardware selection.</para>
</listitem>
</itemizedlist>
</section>
<section xml:id="database-software-arch">
<title>Database software</title>
<para>A large majority of OpenStack components require access to
back-end database services to store state and configuration
information. Select an appropriate back-end database that
satisfies the availability and fault tolerance requirements of the
OpenStack services. OpenStack services support connecting
to any database that the SQLAlchemy Python drivers support;
however, most common deployments use MySQL or a
variation of it. We recommend that you make the database that provides
back-end services within a general-purpose cloud highly
available. Some of the more common software solutions include Galera,
MariaDB, and MySQL with multi-master replication.</para>
</section>
</section>
</section>

cloud service providers or telecom providers. Smaller implementations
are more inclined to rely on smaller support teams that need
to combine the engineering, design, and operation roles.</para>
<para>The maintenance of OpenStack installations requires a variety
of technical skills. To ease the operational burden, consider
incorporating features into the architecture and
design. Some examples include:</para>
<itemizedlist>
<listitem>
<para>Automating the operations functions</para>
</listitem>
<listitem>
<para>Utilizing a third-party management company</para>
</listitem>
</itemizedlist>
</section>
<section xml:id="expected-unexpected-server-downtime">
<title>Expected and unexpected server downtime</title>
<para>Unexpected server downtime is inevitable, and SLAs can
address how long it takes to recover from failure.
Recovery of a failed host means restoring instances from a snapshot, or
respawning that instance on another available host.</para>
<para>It is acceptable to design a compute-focused cloud ...</para>
<para>Adding extra capacity to an OpenStack cloud is a
horizontally scaling process.</para>
<note>
<para>Be mindful, however, of additional work to place the nodes into
appropriate Availability Zones and Host Aggregates.</para>
</note>
<para>We recommend the same or very similar CPUs
when adding extra nodes to the environment because they reduce
the chance of breaking live-migration features if they are
present. Scaling out hypervisor hosts also has a direct
effect ...</para>
<para>Changing the internal components of a Compute host to account for
increases in demand is a process known as vertical scaling.
Swapping a CPU for one with more cores, or
increasing the memory in a server, can help add extra
capacity for running applications.</para>
<para>Another option is to assess the average workloads and
increase the number of instances that can run within the
compute environment by adjusting the overcommit ratio.</para>
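For reference, the overcommit ratio is a Compute scheduler setting; a minimal sketch of the relevant nova.conf options follows. The values shown are the long-standing upstream defaults, and the right values for a compute-focused cloud depend entirely on the workloads:

```ini
[DEFAULT]
# Virtual CPUs allocated per physical core (the historical nova default).
cpu_allocation_ratio = 16.0
# Virtual RAM allocated per unit of physical RAM.
ram_allocation_ratio = 1.5
```

Lowering these ratios trades instance density for more predictable per-instance performance, which is usually the priority in a compute-focused design.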

<?dbhtml stop-chunking?>
<title>Prescriptive examples</title>
<para>The Conseil Européen pour la Recherche Nucléaire (CERN),
also known as the European Organization for Nuclear Research,
provides particle accelerators and other infrastructure for
high-energy physics research.</para>
<para>As of 2011 CERN operated two compute centers in Europe.</para>
<para>To support a growing number of compute-heavy users of
experiments related to the Large Hadron Collider (LHC), CERN
ultimately elected to deploy an OpenStack cloud using
Scientific Linux and RDO. This effort aimed to simplify the
management of the center's compute resources with a view to
doubling compute capacity through the addition of a
data center in 2013 while maintaining the same
levels of compute staff.</para>
<para>The CERN solution uses <glossterm baseform="cell">cells</glossterm>
for segregation of compute
resources and for transparently scaling between different data
centers. This decision meant trading off support for security
groups and live migration. In addition, CERN must manually replicate
some details, such as flavors, across cells. In
spite of these drawbacks, cells provide the
required scale while exposing a single public API endpoint to
users.</para>
<para>CERN created a compute cell for each of the two original data
centers and created a third when it added a new data center
in 2013. Each cell contains three availability zones to
further segregate compute resources and at least three
RabbitMQ message brokers configured for clustering with
mirrored queues for high availability.</para>
<para>The API cell, which resides behind a HAProxy load balancer,
is in the data center in Switzerland and directs API
calls to compute cells using a customized variation of the
cell scheduler. The customizations allow certain workloads to
route to a specific data center or all data centers,
with cell RAM availability determining cell selection in the
latter case.</para>
<para>A central database team manages the MySQL database server in each cell
in an active/passive configuration with a NetApp storage back end.
Backups run every 6 hours.</para>
<section xml:id="network-architecture">
<title>Network architecture</title>
<para>To integrate with existing networking infrastructure, CERN
made customizations to legacy networking (nova-network) in the
form of a driver that integrates with CERN's existing database
for tracking MAC and IP address assignments.</para>
<para>The driver facilitates selection of a MAC address and IP for
new instances based on the compute node where the scheduler places
the instance.</para>
<para>The driver considers the compute node where the scheduler
placed an instance and selects a MAC address and IP
from the pre-registered list associated with that node in the
database. The driver then updates the database to reflect the
addresses assigned to that instance.</para></section>
<section xml:id="storage-architecture">
<title>Storage architecture</title>
<para>CERN deploys the OpenStack Image service in the API cell and
configures it to expose version 1 (V1) of the API. This also requires
the image registry. The storage back end in
use is a 3 PB Ceph cluster.</para>
<para>CERN maintains a small set of Scientific Linux 5 and 6 images onto
which orchestration tools can place applications. Puppet manages
instance configuration and customization.</para></section>
<section xml:id="monitoring">
<title>Monitoring</title>
<para>CERN does not require direct billing, but uses the Telemetry module
to perform metering for the purposes of adjusting
project quotas. CERN uses a sharded, replicated MongoDB back end.
To spread API load, CERN deploys instances of the nova-api service
within the child cells for Telemetry to query
against. This also requires the configuration of supporting services
such as keystone, glance-api, and glance-registry in the child cells.
</para>
<?dbhtml stop-chunking?>
<title>Technical considerations</title>
<para>In a compute-focused OpenStack cloud, the type of instance
workloads you provision heavily influences technical
decision making. For example, specific use cases that demand
multiple short-running jobs present different requirements
than those that specify long-running jobs, even though both
situations are compute focused.</para>
<para>Public and private clouds require deterministic capacity
planning to support elastic growth in order to meet user SLA
expectations. Deterministic capacity planning is the path to
predicting the effort and expense of making a given process
consistently performant. This process is important because,
when a service becomes a critical part of a user's
infrastructure, the user's experience links directly to the SLAs of
the cloud itself. In cloud computing, it is not average speed but
speed consistency that determines a service's performance.
There are two aspects of capacity planning to consider:</para>
<itemizedlist>
<listitem>
<para>Planning the initial deployment footprint</para>
</listitem>
<listitem>
<para>Planning expansion to stay ahead of the demands of cloud
users</para>
</listitem>
</itemizedlist>
<para>Plan the initial footprint for an OpenStack deployment
based on existing infrastructure workloads
and estimates of expected uptake.</para>
<para>The starting point is the core count of the cloud. By
applying relevant ratios, the user can gather information
about:</para>
<itemizedlist>
<listitem>
<para>The number of expected concurrent instances:
(overcommit fraction × cores) / virtual cores per instance</para>
</listitem>
<listitem>
<para>Required storage: flavor disk size × number of instances</para>
</listitem>
</itemizedlist>
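The two ratios above can be sketched in a short script. The 2:1 vCPU overcommit used in the worked call below is an illustrative assumption, not a recommendation:

```python
def concurrent_instances(overcommit_fraction, cores, vcpus_per_instance):
    """(overcommit fraction x cores) / virtual cores per instance."""
    return (overcommit_fraction * cores) // vcpus_per_instance

def required_storage_gb(flavor_disk_gb, instance_count):
    """Flavor disk size x number of instances."""
    return flavor_disk_gb * instance_count

# Worked example from the text: 1600 instances of 2 vCPU and 50 GB each.
print(required_storage_gb(50, 1600))     # 80000 GB, i.e. 80 TB
# With an assumed 2:1 vCPU overcommit, 1600 physical cores could host:
print(concurrent_instances(2, 1600, 2))  # 1600 instances
```

Substituting a deployment's real core counts and overcommit ratio into the same functions gives a first estimate of the infrastructure needed.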
<para>Use these ratios to determine the amount of
additional infrastructure needed to support the cloud. For
example, consider a situation in which you require 1600
instances, each with 2 vCPU and 50 GB of storage.</para>
services, database servers, and queue servers are likely to
encounter.</para>
<para>Consider, for example, the differences between a cloud that
supports a managed web-hosting platform and one running
integration tests for a development project that creates one
instance per code commit. In the former, the heavy work of
creating an instance happens only every few months, whereas
the latter puts constant heavy load on the cloud controller.
The average instance lifetime is significant, as a larger
number generally means less load on the cloud
controller.</para>
<para>Aside from the creation and termination of instances, the <para>Aside from the creation and termination of instances, consider the
impact of users must be considered when accessing the service, impact of users accessing the service,
particularly on nova-api and its associated database. Listing particularly on nova-api and its associated database. Listing
instances garners a great deal of information and, given the instances gathers a great deal of information and, given the
frequency with which users run this operation, a cloud with a frequency with which users run this operation, a cloud with a
large number of users can increase the load significantly. large number of users can increase the load significantly.
This can even occur unintentionally. For example, the This can even occur unintentionally. For example, the
@ -88,8 +91,8 @@
instances every 30 seconds, so leaving it open in a browser instances every 30 seconds, so leaving it open in a browser
window can cause unexpected load.</para> window can cause unexpected load.</para>
<para>Consideration of these factors can help determine how many <para>Consideration of these factors can help determine how many
cloud controller cores are required. A server with 8 CPU cores cloud controller cores you require. A server with 8 CPU cores
and 8 GB of RAM server would be sufficient for up to a rack of and 8 GB of RAM server would be sufficient for a rack of
compute nodes, given the above caveats.</para> compute nodes, given the above caveats.</para>
<para>Key hardware specifications are also crucial to the <para>Key hardware specifications are also crucial to the
performance of user instances. Be sure to consider budget and performance of user instances. Be sure to consider budget and
@ -98,59 +101,58 @@
bandwidth (Gbps/core), and overall CPU performance bandwidth (Gbps/core), and overall CPU performance
(CPU/core).</para> (CPU/core).</para>
<para>The cloud resource calculator is a useful tool in examining <para>The cloud resource calculator is a useful tool in examining
the impacts of different hardware and instance load outs. It the impacts of different hardware and instance load outs. See:
is available at: <link xlink:href="https://github.com/noslzzp/cloud-resource-calculator/blob/master/cloud-resource-calculator.ods">https://github.com/noslzzp/cloud-resource-calculator/blob/master/cloud-resource-calculator.ods</link> <link xlink:href="https://github.com/noslzzp/cloud-resource-calculator/blob/master/cloud-resource-calculator.ods">https://github.com/noslzzp/cloud-resource-calculator/blob/master/cloud-resource-calculator.ods</link>
</para> </para>
<section xml:id="expansion-planning-compute-focus">
<title>Expansion planning</title>
<para>A key challenge for planning the expansion of cloud
compute services is the elastic nature of cloud infrastructure
demands. Previously, new users or customers had to
plan for and request the infrastructure they required ahead of
time, allowing time for reactive procurement processes. Cloud
computing users have come to expect the agility of having
instant access to new resources as required.
Consequently, plan for typical usage and for sudden bursts in
usage.</para>
<para>Planning for expansion is a balancing act.
Planning too conservatively can lead to unexpected
oversubscription of the cloud and dissatisfied users. Planning
for cloud expansion too aggressively can lead to unexpected
underutilization of the cloud and funds spent unnecessarily on operating
infrastructure.</para>
<para>The key is to carefully monitor the trends in
cloud usage over time. The intent is to measure the
consistency with which you deliver services, not the
average speed or capacity of the cloud. Using this information
to model capacity enables users to more
accurately determine the current and future capacity of the
cloud.</para></section>
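One minimal way to separate sustained growth from short bursts, as described above, is to smooth raw usage samples with a moving average. The sample data and window size below are assumptions for illustration; real deployments would draw this data from their monitoring system.

```python
# Illustrative sketch: smooth raw usage samples so that steady-state
# capacity planning (the trend) can be separated from burst planning
# (the raw peaks). Data and window size are assumed values.

def moving_average(samples, window):
    """Simple moving average over a sliding window."""
    return [sum(samples[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(samples))]

# Hourly vCPU usage for an assumed cloud: steady load plus one burst.
usage = [400, 410, 405, 900, 420, 430, 440, 450]
trend = moving_average(usage, 4)
# The raw peak (900) informs burst headroom; the smoothed trend
# informs steady-state expansion.
print(max(usage), [round(t) for t in trend])
```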
<section xml:id="cpu-and-ram-compute-focus"><title>CPU and RAM</title>
<para>Adapted from:
<link xlink:href="http://docs.openstack.org/openstack-ops/content/compute_nodes.html#cpu_choice">http://docs.openstack.org/openstack-ops/content/compute_nodes.html#cpu_choice</link></para>
<para>In current generations, CPUs have up to 12 cores. If an
Intel CPU supports Hyper-Threading, those 12 cores double
to 24 cores. A server that supports multiple CPUs multiplies
the number of available cores.
Hyper-Threading is Intel's proprietary simultaneous
multi-threading implementation, used to improve
parallelization on their CPUs. Consider enabling
Hyper-Threading to improve the performance of multithreaded
applications.</para>
<para>Whether to enable Hyper-Threading on a CPU
depends on the use case. For example, disabling
Hyper-Threading can be beneficial in intense computing
environments. Running performance tests using local
workloads with and without Hyper-Threading can help
determine which option is more appropriate in any particular
case.</para>
<para>If you run the Libvirt/KVM hypervisor driver,
the compute node CPUs must support
virtualization by way of the VT-x extensions for Intel chips
and AMD-v extensions for AMD chips to provide full
performance.</para>
<para>OpenStack enables users to overcommit CPU and RAM on
compute nodes. This allows an increase in the number of
instances running on the cloud at the cost of reducing the
performance of the instances. OpenStack Compute uses the
@@ -179,8 +181,8 @@
the RAM associated with the instances reaches 72 GB (such as
nine instances, in the case where each instance has 8 GB of
RAM).</para>
<para>You must select the appropriate CPU and RAM allocation ratio
based on particular use cases.</para></section>
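The interaction of the two allocation ratios can be sketched as follows. The node size and flavor are assumptions chosen to reproduce the 72 GB / nine-instance example above; the 16:1 CPU and 1.5:1 RAM figures are the Compute defaults discussed in the text.

```python
# Sketch of how CPU and RAM allocation ratios bound the number of
# instances on one compute node. Node and flavor sizes are assumed.

def max_instances(node_cores, node_ram_gb, flavor_vcpus, flavor_ram_gb,
                  cpu_ratio=16.0, ram_ratio=1.5):
    """Instances allowed before either overcommitted resource runs out."""
    by_cpu = int(node_cores * cpu_ratio / flavor_vcpus)
    by_ram = int(node_ram_gb * ram_ratio / flavor_ram_gb)
    return min(by_cpu, by_ram)  # the scarcer resource is the limit

# A 12-core, 48 GB node with an 8 GB / 2 vCPU flavor:
# RAM allows 48 * 1.5 / 8 = 9 instances, CPU allows 12 * 16 / 2 = 96,
# so RAM is the limiting factor at nine instances.
print(max_instances(12, 48, 2, 8))
```

Whichever ratio produces the smaller count is the one that matters for that flavor, which is why compute-focused clouds often tune the CPU ratio down rather than up.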
<section xml:id="additional-hardware-compute-focus">
<title>Additional hardware</title>
<para>Certain use cases may benefit from exposure to additional
@@ -201,15 +203,15 @@
<listitem>
<para>Database management systems that benefit from the
availability of SSDs for ephemeral storage to maximize
read/write time.</para>
</listitem>
</itemizedlist>
<para>Host aggregates group hosts that share similar
characteristics, which can include hardware similarities. The
addition of specialized hardware to a cloud deployment is
likely to add to the cost of each node, so carefully consider
whether all compute nodes, or
just a subset targeted by flavors, need the
additional customization to support the desired
workloads.</para></section>
<section xml:id="utilization"><title>Utilization</title>
@@ -219,13 +221,13 @@
instances while making the best use of the available physical
resources.</para>
<para>In order to facilitate packing of virtual machines onto
physical hosts, the default selection of flavors provides a
second largest flavor that is half the size
of the largest flavor in every dimension. It has half the
vCPUs, half the vRAM, and half the ephemeral disk space. The
next largest flavor is half that size again. The following figure
provides a visual representation of this concept for a general
purpose computing design:
<mediaobject>
<imageobject>
<imagedata contentwidth="4in"
@@ -233,8 +235,7 @@
/>
</imageobject>
</mediaobject></para>
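The halving scheme described above can be expressed as a short generator. The largest flavor chosen here is an assumption for illustration, not an OpenStack default.

```python
# Sketch of the default flavor scheme: each flavor is half the
# previous one in every dimension, so two of any flavor pack exactly
# into the slot of the next larger one. Starting sizes are assumed.

def flavor_series(vcpus, ram_mb, disk_gb, count):
    """Return a list of flavors, halving every dimension at each step."""
    flavors = []
    for _ in range(count):
        flavors.append({"vcpus": vcpus, "ram_mb": ram_mb, "disk_gb": disk_gb})
        vcpus, ram_mb, disk_gb = vcpus // 2, ram_mb // 2, disk_gb // 2
    return flavors

for flavor in flavor_series(vcpus=16, ram_mb=16384, disk_gb=160, count=4):
    print(flavor)
```

The power-of-two relationship is what keeps bin-packing instances onto physical hosts simple: any mix of smaller flavors tiles the space left by a larger one.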
<para>The following figure displays a CPU-optimized, packed server:
<mediaobject>
<imageobject>
<imagedata contentwidth="4in"
@@ -242,35 +243,32 @@
/>
</imageobject>
</mediaobject></para>
<para>These default flavors are well suited to typical configurations
of commodity server hardware. To maximize utilization,
however, it may be necessary to customize the flavors or
create new ones in order to better align instance sizes to the
available hardware.</para>
<para>Workload characteristics may also influence hardware choices
and flavor configuration, particularly where they present
different ratios of CPU versus RAM versus HDD
requirements.</para>
<para>For more information on flavors, see:
<link xlink:href="http://docs.openstack.org/openstack-ops/content/flavors.html">http://docs.openstack.org/openstack-ops/content/flavors.html</link></para>
</section>
<section xml:id="performance-compute-focus"><title>Performance</title>
<para>So that workloads can consume as many resources
as are available, do not share cloud infrastructure. Ensure you accommodate
large-scale workloads.</para>
<para>The duration of batch processing differs depending on
individual workloads. Time limits range from
seconds to hours, and as a result it is difficult to predict resource
use.</para>
</section>
<section xml:id="security-compute-focus"><title>Security</title>
<para>The security considerations for this scenario are
similar to those of the other scenarios in this guide.</para>
<para>A security domain comprises users, applications, servers,
and networks that share common trust requirements and
expectations within a system. Typically they have the same
authentication and authorization requirements and
users.</para>
@@ -289,77 +287,77 @@
<para>Data</para>
</listitem>
</orderedlist>
<para>You can map these security domains individually to the
installation, or combine them. For example, some
deployment topologies combine both guest and data domains onto
one physical network, whereas in other cases these networks
are physically separate. In each case, the cloud operator
should be aware of the appropriate security concerns. Map out
security domains against the specific OpenStack
deployment topology. The domains and their trust requirements
depend on whether the cloud instance is public, private, or
hybrid.</para>
<para>The public security domain is an untrusted area of
the cloud infrastructure. It can refer to the Internet as a
whole or simply to networks over which the user has no
authority. Always consider this domain untrusted.</para>
<para>Typically used for compute instance-to-instance traffic, the
guest security domain handles compute data generated by
instances on the cloud. It does not handle services that support the
operation of the cloud, for example API calls. Public cloud
providers and private cloud providers who do not have
stringent controls on instance use or who allow unrestricted
Internet access to instances should consider this an untrusted domain.
Private cloud providers may want to consider this
an internal network and therefore trusted only if they have
controls in place to assert that they trust instances and all
their tenants.</para>
<para>The management security domain is where services interact.
Sometimes referred to as the "control plane", the networks in
this domain transport confidential data such as configuration
parameters, user names, and passwords. In most deployments this
is a trusted domain.</para>
<para>The data security domain deals with
information pertaining to the storage services within
OpenStack. Much of the data that crosses this network has high
integrity and confidentiality requirements and, depending on
the type of deployment, there may also be strong availability
requirements. The trust level of this network is heavily
dependent on deployment decisions and as such we do not assign
this a default level of trust.</para>
<para>When deploying OpenStack in an enterprise as a private cloud, you can
generally assume it is behind a firewall and within the trusted
network alongside existing systems. Users of the cloud are
typically employees or trusted individuals that are bound by
the security requirements set forth by the company. This tends
to push most of the security domains towards a more trusted
model. However, when deploying OpenStack in a public-facing
role, you cannot make these assumptions and the number of attack vectors
significantly increases. For example, the API endpoints and the
software behind them become vulnerable to hostile
entities attempting to gain unauthorized access or prevent access
to services. This can result in loss of reputation and you must
protect against it through auditing and appropriate
filtering.</para>
<para>Take care when managing the users of the
system, whether in public or private
clouds. The Identity service supports LDAP as part of the
authentication process, which may ease user management if
integrated into existing systems.</para>
<para>We recommend placing API services behind hardware that
performs SSL termination. API services
transmit user names, passwords, and generated tokens between
client machines and API endpoints and therefore must be
secure.</para>
<para>For more information on OpenStack security, see
<link xlink:href="http://docs.openstack.org/security-guide/">http://docs.openstack.org/security-guide/</link>
</para>
</section>
<section xml:id="openstack-components-compute-focus">
<title>OpenStack components</title>
<para>Due to the nature of the workloads in this
scenario, a number of components are highly beneficial for
a compute-focused cloud. This includes the typical OpenStack
components:</para>
<itemizedlist>
@@ -381,10 +379,10 @@
</listitem>
</itemizedlist>
<para>It is safe to assume that, given the nature of the
applications involved in this scenario, these are heavily
automated deployments. Making use of Orchestration is highly
beneficial in this case. You can script the deployment of a
batch of instances and the running of tests, but it
makes sense to use the Orchestration module
to handle all these actions.</para>
<itemizedlist>
@@ -392,11 +390,10 @@
<para>Telemetry module (ceilometer)</para>
</listitem>
</itemizedlist>
<para>Telemetry and the alarms it generates support autoscaling
of instances using Orchestration. Users that are not using the
Orchestration module do not need to deploy the Telemetry module and
may choose to use external solutions to fulfill their
metering and monitoring requirements.</para>
<para>See also:
<link xlink:href="http://docs.openstack.org/openstack-ops/content/logging_monitoring.html">http://docs.openstack.org/openstack-ops/content/logging_monitoring.html</link></para>
@@ -406,12 +403,12 @@
</listitem>
</itemizedlist>
<para>Due to the burst-able nature of the workloads and the
applications and instances that perform batch
processing, this cloud mainly uses memory or CPU, so
the need for add-on storage to each instance is not a likely
requirement. This does not mean that you do not use
OpenStack Block Storage (cinder) in the infrastructure, but
typically it is not a central component.</para>
<itemizedlist>
<listitem>
<para>Networking</para>
@@ -419,7 +416,7 @@
</itemizedlist>
<para>When choosing a networking platform, ensure that it either
works with all desired hypervisor and container technologies
and their OpenStack drivers, or that it includes an implementation of
an ML2 mechanism driver. You can mix networking platforms
that provide ML2 mechanism drivers.</para></section>
</section>


@@ -10,10 +10,9 @@
xml:id="user-requirements-compute-focus">
<?dbhtml stop-chunking?>
<title>User requirements</title>
<para>High utilization of CPU, RAM, or both defines
compute-intensive workloads. User requirements determine the performance
demands for the cloud.
</para>
<variablelist>
<varlistentry>
@@ -23,25 +22,23 @@
compute-focused cloud, however some organizations
might be concerned with cost avoidance. Repurposing
existing resources to tackle compute-intensive tasks
instead of acquiring additional resources may
offer cost reduction opportunities.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Time to market</term>
<listitem>
<para>Compute-focused clouds can deliver products more quickly,
for example by speeding up a company's software development
life cycle (SDLC) for building products and applications.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Revenue opportunity</term>
<listitem>
<para>Companies that want to build services or products that
rely on the power of compute resources benefit from a
compute-focused cloud. Examples include the analysis
of large data sets (via Hadoop or Cassandra) or
completing computationally intensive tasks such as
@@ -71,9 +68,9 @@
jurisdictions.</para>
</listitem>
<listitem>
<para>Data compliance: certain types of information need
to reside in certain locations due to regulatory issues and,
more importantly, cannot reside in other locations
for the same reason.</para>
</listitem>
</itemizedlist>
@@ -88,15 +85,14 @@
information.</para></section>
<section xml:id="technical-considerations-compute-focus-user">
<title>Technical considerations</title>
<para>The following are some technical requirements you must consider
in the architecture design:
</para>
<variablelist>
<varlistentry>
<term>Performance</term>
<listitem>
<para>If a primary technical concern is to deliver high performance
capability, then a compute-focused design is an
obvious choice because it is specifically designed to
host compute-intensive workloads.</para>
@ -106,24 +102,23 @@
<term>Workload persistence</term> <term>Workload persistence</term>
<listitem> <listitem>
<para>Workloads can be either <para>Workloads can be either
short-lived or long running. Short-lived workloads short-lived or long-running. Short-lived workloads
might include continuous integration and continuous can include continuous integration and continuous
deployment (CI-CD) jobs, where large numbers of deployment (CI-CD) jobs, which create large numbers of
compute instances are created simultaneously to compute instances simultaneously to
perform a set of compute-intensive tasks. The results perform a set of compute-intensive tasks. The environment then
or artifacts are then copied from the instance into copies the results or artifacts from each instance into
long-term storage before the instance is destroyed. long-term storage before destroying the instance.
Long-running workloads, like a Hadoop or Long-running workloads, like a Hadoop or
high-performance computing (HPC) cluster, typically high-performance computing (HPC) cluster, typically
ingest large data sets, perform the computational work ingest large data sets, perform the computational work
on those data sets, then push the results into long on those data sets, then push the results into long-term
term storage. Unlike short-lived workloads, when the storage. When the computational work finishes, the instances
computational work is completed, they will remain idle remain idle until they receive another job. Environments
until the next job is pushed to them. Long-running for long-running workloads are often larger and more complex,
workloads are often larger and more complex, so the but you can offset the cost of building them by keeping them
effort of building them is mitigated by keeping them active between jobs. Another example of long-running
active between jobs. Another example of long running workloads is legacy applications that are
workloads is legacy applications that typically are
persistent over time.</para> persistent over time.</para>
</listitem> </listitem>
</varlistentry> </varlistentry>
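The short-lived workload lifecycle described above (boot many instances, run a compute-intensive task on each, copy the artifact to long-term storage, destroy the instance) can be sketched schematically. This is plain Python standing in for the real workflow; the names and the dictionary "object store" are illustrative, not OpenStack SDK calls:

```python
# Schematic sketch of a short-lived (CI-CD style) compute workload.
# A plain dict stands in for long-term object storage; no real
# OpenStack APIs are used here.

long_term_storage = {}

def run_job(instance_id, task):
    """Run a compute-intensive task on one instance and return its artifact."""
    return f"artifact-from-{instance_id}:{task()}"

def short_lived_batch(num_instances, task):
    # Boot a batch of instances simultaneously (simulated as plain IDs).
    instances = [f"instance-{i}" for i in range(num_instances)]
    # Copy each result out to long-term storage before tearing down.
    for inst in instances:
        long_term_storage[inst] = run_job(inst, task)
    # Destroy the instances; only the stored artifacts persist.
    instances.clear()
    return long_term_storage

# Three ephemeral workers each computing a sum of squares.
results = short_lived_batch(3, lambda: sum(x * x for x in range(100)))
```

The point of the sketch is the ordering: artifacts leave the instance before the instance is destroyed, so no instance-local state needs to survive.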
@@ -132,14 +127,14 @@
<listitem> <listitem>
<para>Workloads targeted for a compute-focused <para>Workloads targeted for a compute-focused
OpenStack cloud generally do not require any OpenStack cloud generally do not require any
persistent block storage (although some usages of persistent block storage, although some uses of
Hadoop with HDFS may dictate the use of persistent Hadoop with HDFS may require persistent
block storage). A shared filesystem or object store block storage. A shared filesystem or object store
will maintain the initial data set(s) and serve as the maintains the initial data sets and serves as the
destination for saving the computational results. By destination for saving the computational results. By
avoiding the input-output (IO) overhead, workload avoiding the input-output (IO) overhead, you can significantly
performance is significantly enhanced. Depending on enhance workload performance. Depending on
the size of the data set(s), it might be necessary to the size of the data sets, it may be necessary to
scale the object store or shared file system to match scale the object store or shared file system to match
the storage demand.</para> the storage demand.</para>
</listitem> </listitem>
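The advice above about scaling the object store to match storage demand can be made concrete with a rough capacity estimate. This is a back-of-the-envelope sketch, not an official sizing formula: the replica count of 3 matches a common Object Storage (swift) default, and the headroom factor is an assumption to cover computational results and growth:

```python
def required_object_store_gb(dataset_gb, replicas=3, headroom=1.25):
    """Rough raw-capacity estimate for an object store holding the
    initial data sets: data footprint times the number of replicas the
    store keeps, times a headroom factor for results and growth.
    replicas=3 mirrors a common swift default; headroom is an assumption.
    """
    return dataset_gb * replicas * headroom

# A 400 GB input data set with 3 replicas and 25% headroom
# needs roughly 1500 GB of raw capacity.
capacity = required_object_store_gb(400)
```

Estimates like this are only a starting point; actual demand depends on how many intermediate artifacts the workloads write back.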
@@ -150,7 +145,7 @@
<para>Like any other cloud architecture, a <para>Like any other cloud architecture, a
compute-focused OpenStack cloud requires an on-demand compute-focused OpenStack cloud requires an on-demand
and self-service user interface. End users must be and self-service user interface. End users must be
able to provision computing power, storage, networks able to provision computing power, storage, networks,
and software simply and flexibly. This includes and software simply and flexibly. This includes
scaling the infrastructure up to a substantial level scaling the infrastructure up to a substantial level
without disrupting host operations.</para> without disrupting host operations.</para>
@@ -159,12 +154,12 @@
<varlistentry> <varlistentry>
<term>Security</term> <term>Security</term>
<listitem> <listitem>
<para>Security is going to be highly dependent <para>Security is highly dependent
on the business requirements. For example, a on business requirements. For example, a
computationally intense drug discovery application computationally intense drug discovery application
will obviously have much higher security requirements has much higher security requirements
than a cloud that is designed for processing market than a cloud for processing market
data for a retailer. As a general start, the security data for a retailer. As a general rule, the security
recommendations and guidelines provided in the recommendations and guidelines provided in the
OpenStack Security Guide are applicable.</para> OpenStack Security Guide are applicable.</para>
</listitem> </listitem>
@@ -173,9 +168,8 @@
</section> </section>
<section xml:id="operational-considerations-compute-focus-user"> <section xml:id="operational-considerations-compute-focus-user">
<title>Operational considerations</title> <title>Operational considerations</title>
<para>The compute intensive cloud from the operational perspective <para>From an operational perspective, a compute intensive cloud
is similar to the requirements for the general-purpose cloud. is similar to a general-purpose cloud. See the general-purpose
More details on operational requirements can be found in the design section for more details on operational requirements.</para>
general-purpose design section.</para>
</section> </section>
</section> </section>