From 3b9a1efd8c2dff78406281011d3811947f71bce9 Mon Sep 17 00:00:00 2001 From: asettle Date: Fri, 14 Aug 2015 11:42:46 +1000 Subject: [PATCH] Edits to the architecture_compute section 1. Removal of repeated/unnecessary content 2. General grammar/textual edits Change-Id: I0d182b18cc2d0f2d20deaa587a9151c89cf283c8 Implements: blueprint arch-guide --- .../section_architecture_compute_focus.xml | 757 ++++++------------ 1 file changed, 230 insertions(+), 527 deletions(-) diff --git a/doc/arch-design/compute_focus/section_architecture_compute_focus.xml b/doc/arch-design/compute_focus/section_architecture_compute_focus.xml index 14188f056e..e8de95df1f 100644 --- a/doc/arch-design/compute_focus/section_architecture_compute_focus.xml +++ b/doc/arch-design/compute_focus/section_architecture_compute_focus.xml @@ -6,560 +6,263 @@ xml:id="arch-design-architecture-hardware"> Architecture - The hardware selection covers three areas: - - - Compute - - - Network - - - Storage - - - - An OpenStack cloud with extreme demands on processor and memory - resources is compute-focused, and requires hardware that - can handle these demands. This can mean choosing hardware which might - not perform as well on storage or network capabilities. In a compute- - focused architecture, storage and networking load a - data set into the computational cluster, but are not otherwise in heavy - demand. - - - Consider the following factors when selecting compute (server) hardware: - - - - Server density - - A measure of how many servers can fit into a - given amount of physical space, such as a rack unit (U). - - - - Resource capacity - - The number of CPU cores, how much RAM, or how - much storage a given server delivers. - - - - Expandability - - The number of additional resources you can add to a - server before it reaches its limit. - - - - Cost - - The relative purchase price of the hardware weighted - against the level of design effort needed to build the system. 
- - - + The hardware selection covers three areas: + + + Compute + + + Network + + + Storage + + + Compute-focused OpenStack clouds have high demands on processor and + memory resources, and require hardware that can handle these demands. + Consider the following factors when selecting compute (server) hardware: + + + Server density + + + Resource capacity + + + Expandability + + + Cost + + Weigh these considerations against each other to determine the best design for the desired purpose. For example, increasing server density - means sacrificing resource capacity or expandability. Increasing resource - capacity and expandability can increase cost but decreases server density. - Decreasing cost can mean decreasing supportability, server density, - resource capacity, and expandability. + means sacrificing resource capacity or expandability. A compute-focused cloud should have an emphasis on server hardware that can offer more CPU sockets, more CPU cores, and more RAM. Network - connectivity and storage capacity are less critical. The hardware must - provide enough network connectivity and storage - capacity to meet minimum user requirements, but they are not the primary - consideration. - Some server hardware form factors suit a compute-focused architecture - better than others. CPU and RAM capacity have the highest priority. Some - considerations for selecting hardware: - - - Most blade servers can support dual-socket multi-core CPUs. To - avoid this CPU limit, select "full width" or "full height" blades. - Be aware, however, that this also decreases server density. For example, - high density blade servers such as HP BladeSystem or Dell PowerEdge - M1000e support up to 16 servers in only ten rack units. Using - half-height blades is twice as dense as using full-height blades, - which results in only eight servers per ten rack units. - - - 1U rack-mounted servers that occupy only a single rack - unit may offer greater server density than a blade server - solution. 
It is possible to place forty 1U servers in a rack, providing - space for the top of rack (ToR) switches, compared to 32 full width - blade servers. However, as of the Icehouse release, 1U servers from - the major vendors have only dual-socket, multi-core CPU - configurations. To obtain greater than dual-socket support in a 1U - rack-mount form factor, purchase systems from original - design (ODMs) or second-tier manufacturers. - - - 2U rack-mounted servers provide quad-socket, multi-core CPU - support, but with a corresponding decrease in server density (half - the density that 1U rack-mounted servers offer). - - - Larger rack-mounted servers, such as 4U servers, often provide - even greater CPU capacity, commonly supporting four or even eight CPU - sockets. These servers have greater expandability, but such servers - have much lower server density and are often more expensive. - - - "Sled servers" are rack-mounted servers that support multiple - independent servers in a single 2U or 3U enclosure. These deliver higher - density as compared to typical 1U or 2U rack-mounted servers. For - example, many sled servers offer four independent dual-socket - nodes in 2U for a total of eight CPU sockets in 2U. However, the - dual-socket limitation on individual nodes may not be sufficient to - offset their additional cost and configuration complexity. - - - Consider these facts when choosing server hardware for a compute- - focused OpenStack design architecture: - - - Instance density - - In a compute-focused architecture, instance density is - lower, which means CPU and RAM over-subscription ratios are - also lower. You require more hosts to support the anticipated - scale due to instance density being lower, especially if the - design uses dual-socket hardware designs. - - - - Host density - - Another option to address the higher host count - of dual socket designs is to use a quad - socket platform. 
Taking this approach decreases host density, - which increases rack count. This configuration may - affect the network requirements, the number of power connections, and - possibly impact the cooling requirements. - - - - Power and cooling density - - The power and cooling density - requirements for 2U, 3U or even 4U server designs might be lower - than for blade, sled, or 1U server designs because of lower host - density. For data centers with older infrastructure, this may - be a desirable feature. - - - + connectivity and storage capacity are less critical. When designing a compute-focused OpenStack architecture, you must consider whether you intend to scale up or scale out. Selecting a smaller number of larger hosts, or a larger number of smaller hosts, depends on a combination of factors: cost, power, cooling, physical rack and floor space, support-warranty, and manageability. -
- Storage hardware selection - For a compute-focused OpenStack architecture, the - selection of storage hardware is not critical as it is not a primary - consideration. Nonetheless, there are several factors - to consider: - - - Cost - - The overall cost of the solution plays a major role - in what storage architecture and storage hardware you select. - - - - Performance - - The performance of the storage solution is important; you can - measure it by observing the latency of storage I-O - requests. In a compute-focused OpenStack cloud, storage latency - can be a major consideration. In some compute-intensive - workloads, minimizing the delays that the CPU experiences while - fetching data from storage can impact significantly on - the overall performance of the application. - - - - Scalability - - Scalability refers to the performance of a storage solution - as it expands to its maximum size. A solution that performs - well in small configurations but has degrading - performance as it expands is not scalable. On - the other hand, a solution that continues to perform well at - maximum expansion is scalable. - - - - Expandability - - Expandability refers to the overall ability of - a storage solution to grow. A solution that expands to 50 PB is - more expandable than a solution that only scales to 10PB. - Note that this meter is related to, but different - from, scalability, which is a measure of the solution's - performance as it expands. - - - - For a compute-focused OpenStack cloud, latency of storage is a - major consideration. Using solid-state disks (SSDs) to minimize - latency for instance storage reduces CPU delays related to storage - and improves performance. Consider using RAID - controller cards in compute hosts to improve the performance of the - underlying disk subsystem. - Evaluate solutions against the key factors above when considering - your storage architecture. 
This determines if a scale-out solution - such as Ceph or GlusterFS is suitable, or if a single, highly expandable, - scalable, centralized storage array is better. If a centralized - storage array suits the requirements, the array vendor determines the - hardware. You can build a storage array using commodity hardware with - Open Source software, but you require people with expertise to build - such a system. Conversely, a scale-out storage solution that uses - direct-attached storage (DAS) in the servers may be an appropriate - choice. If so, then the server hardware must - support the storage solution. - The following lists some of the potential impacts that may affect a - particular storage architecture, and the corresponding storage hardware, - of a compute-focused OpenStack cloud: - - - Connectivity - - Ensure connectivity matches the storage solution requirements. - If you select a centralized storage array, determine how the - hypervisors should connect to the storage array. Connectivity - can affect latency and thus performance, so ensure that the network - characteristics minimize latency to boost the overall - performance of the design. - - - - Latency - - Determine if the use case has consistent or - highly variable latency. - - - - Throughput - - To improve overall performance, ensure that you optimize the - storage solution. While a compute-focused cloud does not usually - have major data I-O to and from storage, this is an important - factor to consider. - - - - Server Hardware - - If the solution uses DAS, this impacts the server hardware choice, - host density, instance density, power density, OS-hypervisor, and - management tools. - - - - When instances must be highly available or capable of migration - between hosts, use a shared storage file-system - to store instance ephemeral data to ensure that - compute services can run uninterrupted in the event of a node - failure. -
-
- Selecting networking hardware - Some of the key considerations for networking hardware selection - include: - - - Port count - - The design requires networking hardware that - has the requisite port count. - - - - Port density - - The required port count affects the physical space that a - network design requires. - A switch that can provide 48 10 GbE ports in 1U has a much higher - port density than a switch that provides 24 10 GbE ports in 2U. A - higher port density is better, as it leaves more rack space for - compute or storage components. You must also consider fault - domains and power density. Although more expensive, you can also - consider higher density switches as it is important not to - design the network beyond requirements. - - - - Port speed - - The networking hardware must support the proposed - network speed, for example: 1 GbE, 10 GbE, 40 GbE, or 100 - GbE. - - - - Redundancy - - User requirements for high availability and cost considerations - influence the level of network hardware redundancy you require. - You can achieve network redundancy by adding - redundant power supplies or paired switches. If this is a - requirement, the hardware must support this configuration. - User requirements determine if you require a completely redundant network - infrastructure. - - - - Power requirements - - Ensure that the physical data center - provides the necessary power for the selected network hardware. This - is not an issue for top of rack (ToR) switches, but may be an issue - for spine switches in a leaf and spine fabric, or end of row (EoR) - switches. - - - - We recommend designing the network architecture using - a scalable network model that makes it easy to add capacity and - bandwidth. A good example of such a model is the leaf-spline model. In - this type of network design, it is possible to easily add additional - bandwidth as well as scale out to additional racks of gear. 
It is - important to select network hardware that supports the required - port count, port speed, and port density while also allowing for future - growth as workload demands increase. It is also important to evaluate - where in the network architecture it is valuable to provide redundancy. - Increased network availability and redundancy comes at a cost, therefore - we recommend weighing the cost versus the benefit gained from - utilizing and deploying redundant network switches and using bonded - interfaces at the host level. -
-
- Software selection - Consider your selection of software for a compute-focused - OpenStack: + Considerations for selecting hardware: - Operating system (OS) and hypervisor + Most blade servers can support dual-socket multi-core CPUs. To + avoid this CPU limit, select full width + or full height blades. + Be aware, however, that this also decreases server density. For example, + high density blade servers such as HP BladeSystem or Dell PowerEdge + M1000e support up to 16 servers in only ten rack units. Using + half-height blades is twice as dense as using full-height blades, + which results in only eight servers per ten rack units. - OpenStack components + 1U rack-mounted servers that occupy only a single rack + unit may offer greater server density than a blade server + solution. It is possible to place forty 1U servers in a rack, providing + space for the top of rack (ToR) switches, compared to 32 full width + blade servers. - Supplemental software + 2U rack-mounted servers provide quad-socket, multi-core CPU + support, but with a corresponding decrease in server density (half + the density that 1U rack-mounted servers offer). + + + Larger rack-mounted servers, such as 4U servers, often provide + even greater CPU capacity, commonly supporting four or even eight CPU + sockets. These servers have greater expandability, but such servers + have much lower server density and are often more expensive. + + + Sled servers are rack-mounted servers that + support multiple + independent servers in a single 2U or 3U enclosure. These deliver higher + density as compared to typical 1U or 2U rack-mounted servers. For + example, many sled servers offer four independent dual-socket + nodes in 2U for a total of eight CPU sockets in 2U. - Design decisions made in each of these areas impact the rest - of the OpenStack architecture design. -
-
- Operating system and hypervisor - The selection of operating system (OS) and hypervisor has a - significant impact on the end point design. Selecting a particular - operating system and hypervisor could affect server hardware selection. - The node, networking, and storage hardware must support the selected - combination. For example, if the design uses Link Aggregation - Control Protocol (LACP), the hypervisor must support it. - OS and hypervisor selection impact the following areas: - - - Cost - - Selecting a commercially supported hypervisor such as - Microsoft Hyper-V results in a different cost model from - choosing a community-supported, open source hypervisor like Kinstance - or Xen. Even within the ranks of open source solutions, choosing - one solution over another can impact cost due - to support contracts. On the other hand, business or application - requirements might dictate a specific or commercially supported - hypervisor. - - - - Supportability - - Staff require appropriate training and knowledge to support the - selected OS and hypervisor combination. Consideration of training - costs may impact the design. - - - - Management tools - - The management tools used for Ubuntu and - Kinstance differ from the management tools for VMware vSphere. - Although OpenStack supports both OS and hypervisor combinations, - the choice of tool impacts the rest of the design. - - - - Scale and performance - - Ensure that selected OS and hypervisor - combinations meet the appropriate scale and performance - requirements. The chosen architecture must meet the targeted - instance-host ratios with the selected OS-hypervisor - combination. - - - - Security - - Ensure that the design can accommodate the regular - installation of application security patches while - maintaining the required workloads. 
The frequency of security - patches for the proposed OS-hypervisor combination has an - impact on performance and the patch installation process can - affect maintenance windows. - - - - Supported features - - Determine what features of OpenStack you require. - The choice of features often determines the selection of the - OS-hypervisor combination. Certain features are only available with - specific OSs or hypervisors. For example, if certain features are - not available, modify the design to meet user requirements. - - - - Interoperability - - Consider the ability of the selected OS-hypervisor combination - to interoperate or co-exist with other OS-hypervisors, or with - other software solutions in the overall design. Operational and - troubleshooting tools for one OS-hypervisor combination may differ - from the tools for another OS-hypervisor combination. The design - must address if the two sets of tools need to interoperate. - - - -
-
- OpenStack components - The selection of OpenStack components has a significant impact. - There are certain components that are omnipresent, for example the compute - and image services, but others, such as the orchestration module may not - be present. Omitting heat does not typically have a significant impact - on the overall design. However, if the architecture uses a replacement for - OpenStack Object Storage for its storage component, this could have - significant impacts on the rest of the design. - For a compute-focused OpenStack design architecture, the - following components may be present: + Consider these when choosing server hardware for a compute- + focused OpenStack design architecture: - Identity (keystone) + Instance density - Dashboard (horizon) + Host density - Compute (nova) - - - Object Storage (swift, ceph or a commercial solution) - - - Image (glance) - - - Networking (neutron) - - - Orchestration (heat) - - - A compute-focused design is less likely to include OpenStack Block - Storage due to persistent block storage not - being a significant requirement for the expected workloads. However, - there may be some situations where the need for performance employs - a block storage component to improve data I-O. - The exclusion of certain OpenStack components might also limit the - functionality of other components. If a design opts to - include the Orchestration module but excludes the Telemetry module, then - the design cannot take advantage of Orchestration's auto - scaling functionality as this relies on information from Telemetry. -
-
- Supplemental software - While OpenStack is a fairly complete collection of software - projects for building a platform for cloud services, there are - invariably additional pieces of software that you might add - to an OpenStack design. -
- Networking software - OpenStack Networking provides a wide variety of networking services - for instances. There are many additional networking software packages - that might be useful to manage the OpenStack components themselves. - Some examples include software to provide load balancing, - network redundancy protocols, and routing daemons. The - OpenStack High Availability Guide (http://docs.openstack.org/high-availability-guide/content) - describes some of these software packages in more detail. - - For a compute-focused OpenStack cloud, the OpenStack infrastructure - components must be highly available. If the design does not - include hardware load balancing, you must add networking software packages - like HAProxy. -
-
- Management software - The selected supplemental software solution impacts and affects - the overall OpenStack cloud design. This includes software for - providing clustering, logging, monitoring and alerting. - The availability of design requirements is the main determination - for the inclusion of clustering Software, such as Corosync or Pacemaker. - Therefore, the availability of the cloud infrastructure and the - complexity of supporting the configuration after deployment impacts - the inclusion of these software packages. The OpenStack High Availability - Guide provides more - details on the installation and configuration of Corosync and Pacemaker. - - Operational considerations determine the requirements for logging, - monitoring, and alerting. Each of these sub-categories includes - various options. For example, in the logging sub-category - consider Logstash, Splunk, Log Insight, or some other log - aggregation-consolidation tool. Store logs in a centralized - location to ease analysis of the data. Log - data analytics engines can also provide automation and issue - notification by alerting and - attempting to remediate some of the more commonly known issues. - If you require any of these software packages, then the design - must account for the additional resource consumption. - Some other potential design impacts include: - - - OS-hypervisor combination: ensure that the selected logging, - monitoring, or alerting tools support the proposed OS-hypervisor - combination. - - - Network hardware: the logging, monitoring, and alerting software - must support the network hardware selection. + Power and cooling density + +
+ Selecting networking hardware + Some of the key considerations for networking hardware selection + include: + + + Port count + + + Port density + + + Port speed + + + Redundancy + + + Power requirements + + + We recommend designing the network architecture using + a scalable network model that makes it easy to add capacity and + bandwidth. A good example of such a model is the leaf-spine model. In + this type of network design, it is possible to easily add additional + bandwidth as well as scale out to additional racks of gear. It is + important to select network hardware that supports the required + port count, port speed, and port density while also allowing for future + growth as workload demands increase. It is also important to evaluate + where in the network architecture it is valuable to provide redundancy.
+ +
+ Operating system and hypervisor + The selection of operating system (OS) and hypervisor has a + significant impact on the end point design. + OS and hypervisor selection impact the following areas: + + + Cost + + + Supportability + + + Management tools + + + Scale and performance + + + Security + + + Supported features + + + Interoperability + + +
+ +
+ OpenStack components + The selection of OpenStack components is important. + There are certain components that are required, for example the compute + and image services, but others, such as the Orchestration module, may not + be present. + For a compute-focused OpenStack design architecture, the + following components may be present: + + + Identity (keystone) + + + Dashboard (horizon) + + + Compute (nova) + + + Object Storage (swift) + + + Image (glance) + + + Networking (neutron) + + + Orchestration (heat) + + + + A compute-focused design is less likely to include OpenStack Block + Storage. However, there may be some situations where the need for + performance requires a block storage component to improve data I-O. + + The exclusion of certain OpenStack components might also limit the + functionality of other components. If a design includes + the Orchestration module but excludes the Telemetry module, then + the design cannot take advantage of Orchestration's auto + scaling functionality as this relies on information from Telemetry. +
+ +
+ Networking software + OpenStack Networking provides a wide variety of networking services + for instances. There are many additional networking software packages + that might be useful to manage the OpenStack components themselves. + The OpenStack High Availability Guide + (http://docs.openstack.org/high-availability-guide/content) + describes some of these software packages in more detail. + + For a compute-focused OpenStack cloud, the OpenStack infrastructure + components must be highly available. If the design does not + include hardware load balancing, you must add networking software packages, + for example, HAProxy. +
+ +
+ Management software + The selected supplemental software solution affects + the overall OpenStack cloud design. This includes software for + providing clustering, logging, monitoring and alerting. + The availability requirements of the design are the main determinant + for the inclusion of clustering software, such as Corosync or Pacemaker. + Operational considerations determine the requirements for logging, + monitoring, and alerting. Each of these sub-categories includes + various options. + Some other potential design impacts include: + + + OS-hypervisor combination + + Ensure that the selected logging, + monitoring, or alerting tools support the proposed OS-hypervisor + combination. + + + + Network hardware + + The logging, monitoring, and alerting software + must support the network hardware selection. + + + + 
+
- Database software - A large majority of OpenStack components require access to - back-end database services to store state and configuration - information. Select an appropriate back-end database that - satisfies the availability and fault tolerance requirements of the - OpenStack services. OpenStack services support connecting - to any database that the SQLAlchemy Python drivers support, - however most common database deployments make use of MySQL or some - variation of it. We recommend that you make the database that provides - back-end services within a general-purpose cloud highly - available. Some of the more common software solutions include Galera, - MariaDB, and MySQL with multi-master replication. + Database software + A large majority of OpenStack components require access to + back-end database services to store state and configuration + information. Select an appropriate back-end database that + satisfies the availability and fault tolerance requirements of the + OpenStack services. OpenStack services support connecting + to any database that the SQLAlchemy Python drivers support, + however most common database deployments make use of MySQL or some + variation of it. We recommend that you make the database that provides + back-end services within a general-purpose cloud highly + available. Some of the more common software solutions include Galera, + MariaDB, and MySQL with multi-master replication.
-
+