Architecture
- The hardware selection covers three areas:
-
-
- Compute
-
-
- Network
-
-
- Storage
-
-
-
- An OpenStack cloud with extreme demands on processor and memory
- resources is compute-focused, and requires hardware that
- can handle these demands. This can mean choosing hardware which might
- not perform as well on storage or network capabilities. In a compute-
- focused architecture, storage and networking load a
- data set into the computational cluster, but are not otherwise in heavy
- demand.
-
-
- Consider the following factors when selecting compute (server) hardware:
-
-
-
- Server density
-
- A measure of how many servers can fit into a
- given amount of physical space, such as a rack unit (U).
-
-
-
- Resource capacity
-
- The number of CPU cores, how much RAM, or how
- much storage a given server delivers.
-
-
-
- Expandability
-
- The number of additional resources you can add to a
- server before it reaches its limit.
-
-
-
- Cost
-
- The relative purchase price of the hardware weighted
- against the level of design effort needed to build the system.
-
-
-
+ The hardware selection covers three areas:
+
+
+ Compute
+
+
+ Network
+
+
+ Storage
+
+
+ Compute-focused OpenStack clouds have high demands on processor and
+ memory resources, and require hardware that can handle these demands.
+ Consider the following factors when selecting compute (server) hardware:
+
+
+ Server density
+
+
+ Resource capacity
+
+
+ Expandability
+
+
+ Cost
+
+ Weigh these considerations against each other to determine the
best design for the desired purpose. For example, increasing server density
- means sacrificing resource capacity or expandability. Increasing resource
- capacity and expandability can increase cost but decreases server density.
- Decreasing cost can mean decreasing supportability, server density,
- resource capacity, and expandability.
+ means sacrificing resource capacity or expandability.
A compute-focused cloud should emphasize server hardware
that offers more CPU sockets, more CPU cores, and more RAM. Network
- connectivity and storage capacity are less critical. The hardware must
- provide enough network connectivity and storage
- capacity to meet minimum user requirements, but they are not the primary
- consideration.
- Some server hardware form factors suit a compute-focused architecture
- better than others. CPU and RAM capacity have the highest priority. Some
- considerations for selecting hardware:
-
-
- Most blade servers can support dual-socket multi-core CPUs. To
- avoid this CPU limit, select "full width" or "full height" blades.
- Be aware, however, that this also decreases server density. For example,
- high density blade servers such as HP BladeSystem or Dell PowerEdge
- M1000e support up to 16 servers in only ten rack units. Using
- half-height blades is twice as dense as using full-height blades,
- which results in only eight servers per ten rack units.
-
-
- 1U rack-mounted servers that occupy only a single rack
- unit may offer greater server density than a blade server
- solution. It is possible to place forty 1U servers in a rack, providing
- space for the top of rack (ToR) switches, compared to 32 full width
- blade servers. However, as of the Icehouse release, 1U servers from
- the major vendors have only dual-socket, multi-core CPU
- configurations. To obtain greater than dual-socket support in a 1U
- rack-mount form factor, purchase systems from original
- design (ODMs) or second-tier manufacturers.
-
-
- 2U rack-mounted servers provide quad-socket, multi-core CPU
- support, but with a corresponding decrease in server density (half
- the density that 1U rack-mounted servers offer).
-
-
- Larger rack-mounted servers, such as 4U servers, often provide
- even greater CPU capacity, commonly supporting four or even eight CPU
- sockets. These servers have greater expandability, but such servers
- have much lower server density and are often more expensive.
-
-
- "Sled servers" are rack-mounted servers that support multiple
- independent servers in a single 2U or 3U enclosure. These deliver higher
- density as compared to typical 1U or 2U rack-mounted servers. For
- example, many sled servers offer four independent dual-socket
- nodes in 2U for a total of eight CPU sockets in 2U. However, the
- dual-socket limitation on individual nodes may not be sufficient to
- offset their additional cost and configuration complexity.
-
-
- Consider these facts when choosing server hardware for a compute-
- focused OpenStack design architecture:
-
-
- Instance density
-
- In a compute-focused architecture, instance density is
- lower, which means CPU and RAM over-subscription ratios are
- also lower. You require more hosts to support the anticipated
- scale due to instance density being lower, especially if the
- design uses dual-socket hardware designs.
-
-
-
- Host density
-
- Another option to address the higher host count
- of dual socket designs is to use a quad
- socket platform. Taking this approach decreases host density,
- which increases rack count. This configuration may
- affect the network requirements, the number of power connections, and
- possibly impact the cooling requirements.
-
-
-
- Power and cooling density
-
- The power and cooling density
- requirements for 2U, 3U or even 4U server designs might be lower
- than for blade, sled, or 1U server designs because of lower host
- density. For data centers with older infrastructure, this may
- be a desirable feature.
-
-
-
+ connectivity and storage capacity are less critical.
When designing a compute-focused OpenStack architecture, you must
consider whether you intend to scale up or scale out.
Selecting a smaller number of larger hosts, or a
larger number of smaller hosts, depends on a combination of factors:
cost, power, cooling, physical rack and floor space, support and warranty,
and manageability.
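The scale-up versus scale-out trade-off can be roughed out with simple arithmetic. The sketch below is illustrative only: the core counts, the 2000-core target, and the names `HostOption`, `hosts_needed`, and `rack_units_needed` are hypothetical and not taken from any vendor data.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class HostOption:
    name: str
    cores: int        # physical cores per host (hypothetical figure)
    rack_units: int   # height of one host in rack units (U)


def hosts_needed(target_cores: int, option: HostOption) -> int:
    """Number of hosts required to reach the target core count."""
    return math.ceil(target_cores / option.cores)


def rack_units_needed(target_cores: int, option: HostOption) -> int:
    """Rack space consumed by those hosts, ignoring switches and PDUs."""
    return hosts_needed(target_cores, option) * option.rack_units


# Assumed example platforms: many small hosts versus fewer large hosts.
scale_out = HostOption("dual-socket 1U", cores=32, rack_units=1)
scale_up = HostOption("quad-socket 2U", cores=64, rack_units=2)

TARGET_CORES = 2000  # hypothetical capacity target

for option in (scale_out, scale_up):
    print(
        option.name,
        hosts_needed(TARGET_CORES, option), "hosts,",
        rack_units_needed(TARGET_CORES, option), "U",
    )
```

Under these assumed figures the rack space is nearly identical, so the decision falls to the other factors listed above, such as cost, power, cooling, and manageability.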
-
- Storage hardware selection
- For a compute-focused OpenStack architecture, the
- selection of storage hardware is not critical as it is not a primary
- consideration. Nonetheless, there are several factors
- to consider:
-
-
- Cost
-
- The overall cost of the solution plays a major role
- in what storage architecture and storage hardware you select.
-
-
-
- Performance
-
- The performance of the storage solution is important; you can
- measure it by observing the latency of storage I-O
- requests. In a compute-focused OpenStack cloud, storage latency
- can be a major consideration. In some compute-intensive
- workloads, minimizing the delays that the CPU experiences while
- fetching data from storage can impact significantly on
- the overall performance of the application.
-
-
-
- Scalability
-
- Scalability refers to the performance of a storage solution
- as it expands to its maximum size. A solution that performs
- well in small configurations but has degrading
- performance as it expands is not scalable. On
- the other hand, a solution that continues to perform well at
- maximum expansion is scalable.
-
-
-
- Expandability
-
- Expandability refers to the overall ability of
- a storage solution to grow. A solution that expands to 50 PB is
- more expandable than a solution that only scales to 10PB.
- Note that this meter is related to, but different
- from, scalability, which is a measure of the solution's
- performance as it expands.
-
-
-
- For a compute-focused OpenStack cloud, latency of storage is a
- major consideration. Using solid-state disks (SSDs) to minimize
- latency for instance storage reduces CPU delays related to storage
- and improves performance. Consider using RAID
- controller cards in compute hosts to improve the performance of the
- underlying disk subsystem.
- Evaluate solutions against the key factors above when considering
- your storage architecture. This determines if a scale-out solution
- such as Ceph or GlusterFS is suitable, or if a single, highly expandable,
- scalable, centralized storage array is better. If a centralized
- storage array suits the requirements, the array vendor determines the
- hardware. You can build a storage array using commodity hardware with
- Open Source software, but you require people with expertise to build
- such a system. Conversely, a scale-out storage solution that uses
- direct-attached storage (DAS) in the servers may be an appropriate
- choice. If so, then the server hardware must
- support the storage solution.
- The following lists some of the potential impacts that may affect a
- particular storage architecture, and the corresponding storage hardware,
- of a compute-focused OpenStack cloud:
-
-
- Connectivity
-
- Ensure connectivity matches the storage solution requirements.
- If you select a centralized storage array, determine how the
- hypervisors should connect to the storage array. Connectivity
- can affect latency and thus performance, so ensure that the network
- characteristics minimize latency to boost the overall
- performance of the design.
-
-
-
- Latency
-
- Determine if the use case has consistent or
- highly variable latency.
-
-
-
- Throughput
-
- To improve overall performance, ensure that you optimize the
- storage solution. While a compute-focused cloud does not usually
- have major data I-O to and from storage, this is an important
- factor to consider.
-
-
-
- Server Hardware
-
- If the solution uses DAS, this impacts the server hardware choice,
- host density, instance density, power density, OS-hypervisor, and
- management tools.
-
-
-
- When instances must be highly available or capable of migration
- between hosts, use a shared storage file-system
- to store instance ephemeral data to ensure that
- compute services can run uninterrupted in the event of a node
- failure.
-
-
- Selecting networking hardware
- Some of the key considerations for networking hardware selection
- include:
-
-
- Port count
-
- The design requires networking hardware that
- has the requisite port count.
-
-
-
- Port density
-
- The required port count affects the physical space that a
- network design requires.
- A switch that can provide 48 10 GbE ports in 1U has a much higher
- port density than a switch that provides 24 10 GbE ports in 2U. A
- higher port density is better, as it leaves more rack space for
- compute or storage components. You must also consider fault
- domains and power density. Although more expensive, you can also
- consider higher density switches as it is important not to
- design the network beyond requirements.
-
-
-
- Port speed
-
- The networking hardware must support the proposed
- network speed, for example: 1 GbE, 10 GbE, 40 GbE, or 100
- GbE.
-
-
-
- Redundancy
-
- User requirements for high availability and cost considerations
- influence the level of network hardware redundancy you require.
- You can achieve network redundancy by adding
- redundant power supplies or paired switches. If this is a
- requirement, the hardware must support this configuration.
- User requirements determine if you require a completely redundant network
- infrastructure.
-
-
-
- Power requirements
-
- Ensure that the physical data center
- provides the necessary power for the selected network hardware. This
- is not an issue for top of rack (ToR) switches, but may be an issue
- for spine switches in a leaf and spine fabric, or end of row (EoR)
- switches.
-
-
-
- We recommend designing the network architecture using
- a scalable network model that makes it easy to add capacity and
- bandwidth. A good example of such a model is the leaf-spline model. In
- this type of network design, it is possible to easily add additional
- bandwidth as well as scale out to additional racks of gear. It is
- important to select network hardware that supports the required
- port count, port speed, and port density while also allowing for future
- growth as workload demands increase. It is also important to evaluate
- where in the network architecture it is valuable to provide redundancy.
- Increased network availability and redundancy comes at a cost, therefore
- we recommend weighing the cost versus the benefit gained from
- utilizing and deploying redundant network switches and using bonded
- interfaces at the host level.
-
-
- Software selection
- Consider your selection of software for a compute-focused
- OpenStack:
+ Considerations for selecting hardware:
- Operating system (OS) and hypervisor
+ Most blade servers support at most dual-socket, multi-core CPUs. To
+ avoid this CPU limit, select full-width
+ or full-height blades.
+ Be aware, however, that this also decreases server density. For example,
+ high-density blade servers such as HP BladeSystem or Dell PowerEdge
+ M1000e support up to 16 servers in only ten rack units. Half-height
+ blades are twice as dense as full-height blades; using full-height
+ blades results in only eight servers per ten rack units.
- OpenStack components
+ 1U rack-mounted servers that occupy only a single rack
+ unit may offer greater server density than a blade server
+ solution. It is possible to place forty 1U servers in a rack while still
+ leaving space for the top-of-rack (ToR) switches, compared to 32 full-width
+ blade servers.
- Supplemental software
+ 2U rack-mounted servers provide quad-socket, multi-core CPU
+ support, but with a corresponding decrease in server density (half
+ the density that 1U rack-mounted servers offer).
+
+
+ Larger rack-mounted servers, such as 4U servers, often provide
+ even greater CPU capacity, commonly supporting four or even eight CPU
+ sockets. These servers have greater expandability, but such servers
+ have much lower server density and are often more expensive.
+
+
+ Sled servers are rack-mounted servers that support multiple
+ independent servers in a single 2U or 3U enclosure. These deliver higher
+ density than typical 1U or 2U rack-mounted servers. For
+ example, many sled servers offer four independent dual-socket
+ nodes in 2U for a total of eight CPU sockets in 2U.
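The form factors above can be compared by CPU sockets per rack unit. This is a minimal sketch using the figures quoted in the list. It assumes the 16-blade chassis holds dual-socket blades and takes the upper figure of eight sockets for 4U servers. The names `FORM_FACTORS` and `sockets_per_rack_unit` are illustrative.

```python
from fractions import Fraction

# (servers per enclosure, sockets per server, enclosure height in U)
# Figures come from the list above; the blade socket count is an assumption.
FORM_FACTORS = {
    "half-height blades": (16, 2, 10),
    "1U rack-mount": (1, 2, 1),
    "2U rack-mount": (1, 4, 2),
    "4U rack-mount": (1, 8, 4),
    "sled (4 nodes in 2U)": (4, 2, 2),
}


def sockets_per_rack_unit(servers: int, sockets_per_server: int, height_u: int) -> Fraction:
    """CPU sockets delivered per rack unit for one enclosure type."""
    return Fraction(servers * sockets_per_server, height_u)


for name, spec in FORM_FACTORS.items():
    density = sockets_per_rack_unit(*spec)
    print(f"{name:22s} {float(density):.1f} sockets/U")
```

On these numbers, sled and blade enclosures lead on socket density, while the 1U, 2U, and 4U rack-mount options come out equal and differ mainly in sockets per host.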
- Design decisions made in each of these areas impact the rest
- of the OpenStack architecture design.
-
-
- Operating system and hypervisor
- The selection of operating system (OS) and hypervisor has a
- significant impact on the end point design. Selecting a particular
- operating system and hypervisor could affect server hardware selection.
- The node, networking, and storage hardware must support the selected
- combination. For example, if the design uses Link Aggregation
- Control Protocol (LACP), the hypervisor must support it.
- OS and hypervisor selection impact the following areas:
-
-
- Cost
-
- Selecting a commercially supported hypervisor such as
- Microsoft Hyper-V results in a different cost model from
- choosing a community-supported, open source hypervisor like Kinstance
- or Xen. Even within the ranks of open source solutions, choosing
- one solution over another can impact cost due
- to support contracts. On the other hand, business or application
- requirements might dictate a specific or commercially supported
- hypervisor.
-
-
-
- Supportability
-
- Staff require appropriate training and knowledge to support the
- selected OS and hypervisor combination. Consideration of training
- costs may impact the design.
-
-
-
- Management tools
-
- The management tools used for Ubuntu and
- Kinstance differ from the management tools for VMware vSphere.
- Although OpenStack supports both OS and hypervisor combinations,
- the choice of tool impacts the rest of the design.
-
-
-
- Scale and performance
-
- Ensure that selected OS and hypervisor
- combinations meet the appropriate scale and performance
- requirements. The chosen architecture must meet the targeted
- instance-host ratios with the selected OS-hypervisor
- combination.
-
-
-
- Security
-
- Ensure that the design can accommodate the regular
- installation of application security patches while
- maintaining the required workloads. The frequency of security
- patches for the proposed OS-hypervisor combination has an
- impact on performance and the patch installation process can
- affect maintenance windows.
-
-
-
- Supported features
-
- Determine what features of OpenStack you require.
- The choice of features often determines the selection of the
- OS-hypervisor combination. Certain features are only available with
- specific OSs or hypervisors. For example, if certain features are
- not available, modify the design to meet user requirements.
-
-
-
- Interoperability
-
- Consider the ability of the selected OS-hypervisor combination
- to interoperate or co-exist with other OS-hypervisors, or with
- other software solutions in the overall design. Operational and
- troubleshooting tools for one OS-hypervisor combination may differ
- from the tools for another OS-hypervisor combination. The design
- must address if the two sets of tools need to interoperate.
-
-
-
-
-
- OpenStack components
- The selection of OpenStack components has a significant impact.
- There are certain components that are omnipresent, for example the compute
- and image services, but others, such as the orchestration module may not
- be present. Omitting heat does not typically have a significant impact
- on the overall design. However, if the architecture uses a replacement for
- OpenStack Object Storage for its storage component, this could have
- significant impacts on the rest of the design.
- For a compute-focused OpenStack design architecture, the
- following components may be present:
+ Consider these factors when choosing server hardware for a
+ compute-focused OpenStack design architecture:
- Identity (keystone)
+ Instance density
- Dashboard (horizon)
+ Host density
- Compute (nova)
-
-
- Object Storage (swift, ceph or a commercial solution)
-
-
- Image (glance)
-
-
- Networking (neutron)
-
-
- Orchestration (heat)
-
-
- A compute-focused design is less likely to include OpenStack Block
- Storage due to persistent block storage not
- being a significant requirement for the expected workloads. However,
- there may be some situations where the need for performance employs
- a block storage component to improve data I-O.
- The exclusion of certain OpenStack components might also limit the
- functionality of other components. If a design opts to
- include the Orchestration module but excludes the Telemetry module, then
- the design cannot take advantage of Orchestration's auto
- scaling functionality as this relies on information from Telemetry.
-
-
- Supplemental software
- While OpenStack is a fairly complete collection of software
- projects for building a platform for cloud services, there are
- invariably additional pieces of software that you might add
- to an OpenStack design.
-
- Networking software
- OpenStack Networking provides a wide variety of networking services
- for instances. There are many additional networking software packages
- that might be useful to manage the OpenStack components themselves.
- Some examples include software to provide load balancing,
- network redundancy protocols, and routing daemons. The
- OpenStack High Availability Guide (http://docs.openstack.org/high-availability-guide/content)
- describes some of these software packages in more detail.
-
- For a compute-focused OpenStack cloud, the OpenStack infrastructure
- components must be highly available. If the design does not
- include hardware load balancing, you must add networking software packages
- like HAProxy.
-
-
- Management software
- The selected supplemental software solution impacts and affects
- the overall OpenStack cloud design. This includes software for
- providing clustering, logging, monitoring and alerting.
- The availability of design requirements is the main determination
- for the inclusion of clustering Software, such as Corosync or Pacemaker.
- Therefore, the availability of the cloud infrastructure and the
- complexity of supporting the configuration after deployment impacts
- the inclusion of these software packages. The OpenStack High Availability
- Guide provides more
- details on the installation and configuration of Corosync and Pacemaker.
-
- Operational considerations determine the requirements for logging,
- monitoring, and alerting. Each of these sub-categories includes
- various options. For example, in the logging sub-category
- consider Logstash, Splunk, Log Insight, or some other log
- aggregation-consolidation tool. Store logs in a centralized
- location to ease analysis of the data. Log
- data analytics engines can also provide automation and issue
- notification by alerting and
- attempting to remediate some of the more commonly known issues.
- If you require any of these software packages, then the design
- must account for the additional resource consumption.
- Some other potential design impacts include:
-
-
- OS-hypervisor combination: ensure that the selected logging,
- monitoring, or alerting tools support the proposed OS-hypervisor
- combination.
-
-
- Network hardware: the logging, monitoring, and alerting software
- must support the network hardware selection.
+ Power and cooling density
+
+
+ Selecting networking hardware
+ Some of the key considerations for networking hardware selection
+ include:
+
+
+ Port count
+
+
+ Port density
+
+
+ Port speed
+
+
+ Redundancy
+
+
+ Power requirements
+
+
+ We recommend designing the network architecture using
+ a scalable network model that makes it easy to add capacity and
+ bandwidth. A good example of such a model is the leaf-spine model. In
+ this type of network design, you can easily add
+ bandwidth as well as scale out to additional racks of gear. It is
+ important to select network hardware that supports the required
+ port count, port speed, and port density while also allowing for future
+ growth as workload demands increase. It is also important to evaluate
+ where in the network architecture it is valuable to provide redundancy.
+
+
+ Operating system and hypervisor
+ The selection of operating system (OS) and hypervisor has a
+ significant impact on the end point design.
+ OS and hypervisor selection impacts the following areas:
+
+
+ Cost
+
+
+ Supportability
+
+
+ Management tools
+
+
+ Scale and performance
+
+
+ Security
+
+
+ Supported features
+
+
+ Interoperability
+
+
+
+
+
+ OpenStack components
+ The selection of OpenStack components is important.
+ There are certain components that are required, for example the compute
+ and image services, but others, such as the Orchestration module, may not
+ be present.
+ For a compute-focused OpenStack design architecture, the
+ following components may be present:
+
+
+ Identity (keystone)
+
+
+ Dashboard (horizon)
+
+
+ Compute (nova)
+
+
+ Object Storage (swift)
+
+
+ Image (glance)
+
+
+ Networking (neutron)
+
+
+ Orchestration (heat)
+
+
+
+ A compute-focused design is less likely to include OpenStack Block
+ Storage. However, some performance-sensitive workloads may
+ require a block storage component to improve data I/O.
+
+ The exclusion of certain OpenStack components might also limit the
+ functionality of other components. If a design includes
+ the Orchestration module but excludes the Telemetry module, then
+ the design cannot take advantage of Orchestration's auto-scaling
+ functionality, as this relies on information from Telemetry.
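A component selection can be checked for this kind of dependency before the design is finalized. The following is a sketch, not an OpenStack API: the `FEATURE_DEPENDENCIES` map and the `limited_features` helper are hypothetical, and the map encodes only the Orchestration (heat) and Telemetry (ceilometer) relationship described above.

```python
# Hypothetical map: feature -> (component that provides it, components it relies on).
FEATURE_DEPENDENCIES = {
    "auto-scaling": ("heat", {"ceilometer"}),
}


def limited_features(selected):
    """Return features whose provider is selected but whose dependencies are missing."""
    selected = set(selected)
    limited = {}
    for feature, (provider, dependencies) in FEATURE_DEPENDENCIES.items():
        missing = dependencies - selected
        if provider in selected and missing:
            limited[feature] = sorted(missing)
    return limited


# Component list from the compute-focused design above (no Telemetry).
design = ["keystone", "horizon", "nova", "swift", "glance", "neutron", "heat"]
print(limited_features(design))
```

Adding `"ceilometer"` to `design` clears the warning. A design without `"heat"` reports nothing, because the feature is not expected in the first place.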
+
+
+
+ Networking software
+ OpenStack Networking provides a wide variety of networking services
+ for instances. There are many additional networking software packages
+ that might be useful to manage the OpenStack components themselves.
+ The OpenStack High Availability Guide
+ (http://docs.openstack.org/high-availability-guide/content)
+ describes some of these software packages in more detail.
+
+ For a compute-focused OpenStack cloud, the OpenStack infrastructure
+ components must be highly available. If the design does not
+ include hardware load balancing, you must add networking software packages,
+ for example, HAProxy.
+
+
+
+ Management software
+ The selected supplemental software solution affects
+ the overall OpenStack cloud design. This includes software for
+ providing clustering, logging, monitoring, and alerting.
+ The availability requirements of the design are the main determiner
+ for the inclusion of clustering software, such as Corosync or Pacemaker.
+ Operational considerations determine the requirements for logging,
+ monitoring, and alerting. Each of these sub-categories includes
+ various options.
+ Some other potential design impacts include:
+
+
+ OS-hypervisor combination
+
+ Ensure that the selected logging,
+ monitoring, or alerting tools support the proposed OS-hypervisor
+ combination.
+
+
+
+ Network hardware
+
+ The logging, monitoring, and alerting software
+ must support the network hardware selection.
+
+
+
+
+
- Database software
- A large majority of OpenStack components require access to
- back-end database services to store state and configuration
- information. Select an appropriate back-end database that
- satisfies the availability and fault tolerance requirements of the
- OpenStack services. OpenStack services support connecting
- to any database that the SQLAlchemy Python drivers support,
- however most common database deployments make use of MySQL or some
- variation of it. We recommend that you make the database that provides
- back-end services within a general-purpose cloud highly
- available. Some of the more common software solutions include Galera,
- MariaDB, and MySQL with multi-master replication.
+ Database software
+ A large majority of OpenStack components require access to
+ back-end database services to store state and configuration
+ information. Select an appropriate back-end database that
+ satisfies the availability and fault tolerance requirements of the
+ OpenStack services. OpenStack services support connecting
+ to any database that the SQLAlchemy Python drivers support;
+ however, most database deployments use MySQL or a
+ variant of it. We recommend that you make the database that provides
+ back-end services within a general-purpose cloud highly
+ available. Some of the more common software solutions include Galera,
+ MariaDB, and MySQL with multi-master replication.
-
+