From a4b0bcff0318e4965613b9d6e605144bccb48862 Mon Sep 17 00:00:00 2001 From: kallimachos Date: Tue, 10 Mar 2015 16:45:46 +1000 Subject: [PATCH] Remove passive voice Arch Guide Ch4 Storage Focus Change-Id: I95f80c5f4ff7181790e6ee789ca08c29999901e0 Closes-bug: #1402462 --- doc/arch-design/ch_storage_focus.xml | 32 +- .../section_architecture_storage_focus.xml | 280 ++++++++---------- ...erational_considerations_storage_focus.xml | 100 +++---- ...on_prescriptive_examples_storage_focus.xml | 37 ++- ...tion_tech_considerations_storage_focus.xml | 21 +- ...ection_user_requirements_storage_focus.xml | 31 +- 6 files changed, 235 insertions(+), 266 deletions(-) diff --git a/doc/arch-design/ch_storage_focus.xml b/doc/arch-design/ch_storage_focus.xml index eb1808ccfa..26f11531b1 100644 --- a/doc/arch-design/ch_storage_focus.xml +++ b/doc/arch-design/ch_storage_focus.xml @@ -6,34 +6,34 @@ xml:id="storage_focus"> Storage focused - Cloud storage is a model of data storage where digital data - is stored in logical pools and physical storage that spans + Cloud storage is a model of data storage that stores digital + data in logical pools and physical storage that spans across multiple servers and locations. Cloud storage commonly refers to a hosted object storage service, however the term - has extended to include other types of data storage that are + also includes other types of data storage that are available as a service, for example block storage. - Cloud storage is based on virtualized infrastructure and + Cloud storage runs on virtualized infrastructure and resembles broader cloud computing in terms of accessible interfaces, elasticity, scalability, multi-tenancy, and - metered resources. Cloud storage services can be utilized from - an off-premises service or deployed on-premises. - Cloud storage is made up of many distributed, yet still - synonymous resources, and is often referred to as integrated + metered resources. 
You can use cloud storage services from + an off-premises service or deploy on-premises. + Cloud storage consists of many distributed, synonymous + resources, which are often referred to as integrated storage clouds. Cloud storage is highly fault tolerant through redundancy and the distribution of data. It is highly durable through the creation of versioned copies, and can be consistent with regard to data replicas. - At a certain scale, management of data operations can become - a resource intensive process to an organization. Hierarchical - storage management (HSM) systems and data grids can help + At large scale, management of data operations is + a resource intensive process for an organization. Hierarchical + storage management (HSM) systems and data grids help annotate and report a baseline data valuation to make - intelligent decisions and automate data decisions. HSM allows - for automating tiering and movement, as well as orchestration + intelligent decisions and automate data decisions. HSM enables + automated tiering and movement, as well as orchestration of data operations. A data grid is an architecture, or set of services evolving technology, that brings together sets of - services allowing users to manage large data sets. - Examples of applications that can be deployed with cloud - storage characteristics are: + services enabling users to manage large data sets. + Example applications deployed with cloud + storage characteristics: Active archive, backups and hierarchical storage diff --git a/doc/arch-design/storage_focus/section_architecture_storage_focus.xml b/doc/arch-design/storage_focus/section_architecture_storage_focus.xml index c0a178e019..4defe8f99d 100644 --- a/doc/arch-design/storage_focus/section_architecture_storage_focus.xml +++ b/doc/arch-design/storage_focus/section_architecture_storage_focus.xml @@ -27,60 +27,53 @@ heavily utilized to transfer storage, but they are not otherwise network intensive. 
For a storage-focused OpenStack design architecture, the - selection of storage hardware will determine the overall - performance and scalability of the design architecture. A - number of different factors must be considered in the design - process: + selection of storage hardware determines the overall + performance and scalability of the design architecture. Several factors + impact the design process: Cost - The cost of components can change which storage + The cost of components affects which storage architecture and hardware you choose. Performance - Performance is measured by observing the latency of - storage I/O requests. Performance requirements can change - which solution is implemented. + The latency of storage I/O requests indicates performance. + Performance requirements affect which solution you choose. Scalability - Scalability refers to how well the - storage solution performs as it is expanded up to its - maximum size. Storage solutions that perform well in - small configurations but have degraded performance - would not be considered scalable. - However, a solution that continues to perform well - at maximum expansion would be considered scalable. The - ability of the storage solution to continue to perform - well as it expands is important. + Scalability refers to how the storage solution performs + as it expands to its maximum size. Storage solutions + that perform well in small configurations but have + degraded performance in large configurations are not scalable. + A solution that performs well at maximum expansion is + scalable. Large deployments require a storage solution + that performs well as it expands. Expandability - Expandability is the overall - ability of the solution to grow. A storage solution - that expands to 50 PB is considered more expandable - than a solution that only scales to 10 PB. + Expandability is the overall ability of the solution + to grow. 
A storage solution that expands to 50 PB is + more expandable than a solution that only scales to 10 PB. - This metric is related to but different from - scalability which is a measure of the solution's - performance as it expands. + This metric is related to scalability. - Latency is one of the key considerations in a + Latency is a key consideration in a storage-focused OpenStack cloud. Using solid-state disks - (SSDs) to minimize latency for instance storage, and reduce - CPU delays caused by waiting for the storage, will increase + (SSDs) to minimize latency for instance storage, and to reduce + CPU delays caused by waiting for the storage, increases performance. We recommend evaluating the gains from using RAID controller cards in compute hosts to improve the performance of the underlying disk @@ -89,36 +82,33 @@ solution should be used or if a single, highly expandable and scalable centralized storage array would be a better choice. If a centralized storage array is the right fit for the requirements - then the hardware will be determined by the array vendor. It is possible + then the array vendor determines the hardware selection. It is possible to build a storage array using commodity hardware with Open Source - software, but there needs to be access to people with expertise - to build such a system. + software, but doing so requires people with expertise to build such a system. On the other hand, a scale-out storage solution that uses direct-attached storage (DAS) in the servers may be an - appropriate choice. If this is true, then the server hardware - needs to be configured to support the storage solution. - Some potential impacts that might affect a particular - storage architecture (and corresponding storage hardware) of a - Storage-focused OpenStack cloud: + appropriate choice. This requires configuration of the server + hardware to support the storage solution.
+ Considerations affecting storage architecture (and corresponding + storage hardware) of a Storage-focused OpenStack cloud: Connectivity - Based on the storage solution - selected, ensure the connectivity matches the storage - solution requirements. If a centralized storage array - is selected, it is important to determine how the - hypervisors will connect to the storage array. + Based on the selected storage solution, ensure the + connectivity matches the storage solution requirements. + If selecting a centralized storage array, determine how the + hypervisors connect to the storage array. Connectivity can affect latency and thus performance. - We recommended you check that the network - characteristics will minimize latency to boost the + We recommend confirming that the network + characteristics minimize latency to boost the overall performance of the design. Latency - Determine if the use case will have + Determine if the use case has consistent or highly variable latency. @@ -143,7 +133,7 @@
Compute (server) hardware selection - Compute (server) hardware must be evaluated against four + Evaluate Compute (server) hardware against four opposing dimensions: @@ -158,16 +148,14 @@ Resource capacity The number of CPU cores, how much - RAM, or how much storage a given server will - deliver. + RAM, or how much storage a given server delivers. Expandability - The number of additional resources - that can be added to a server before it has reached - its limit. + The number of additional resources you can add to a server + before it reaches capacity. @@ -179,7 +167,7 @@ - The dimensions need to be weighed against each other to + You must weigh the dimensions against each other to determine the best design for the desired purpose. For example, increasing server density can mean sacrificing resource capacity or expandability. Increasing resource @@ -192,7 +180,7 @@ a result, the required server hardware must supply adequate CPU sockets, additional CPU cores, and more RAM; network connectivity and storage capacity are not as critical. The - hardware will need to provide enough network connectivity and + hardware needs to provide enough network connectivity and storage capacity to meet the user requirements, however they are not the primary consideration. Some server hardware form factors are better @@ -242,7 +230,7 @@ Larger rack-mounted servers, such as 4U servers, often provide even greater CPU capacity. Commonly supporting four or even eight CPU sockets. These - servers have greater expandability capacity but such + servers have greater expandability but such servers have much lower server density and usually greater hardware cost. @@ -258,7 +246,7 @@ additional cost and configuration complexity. - Other factors will strongly influence server hardware + Other factors strongly influence server hardware selection for a storage-focused OpenStack design architecture.
The following is a list of these factors: @@ -266,8 +254,8 @@ Instance density In this architecture, instance - density and CPU-RAM oversubscription are lower. More - hosts will be required to support the anticipated + density and CPU-RAM oversubscription are lower. You + require more hosts to support the anticipated scale, especially if the design uses dual-socket hardware designs. @@ -277,7 +265,7 @@ Another option to address the higher host count is to use a quad socket platform. Taking - this approach will decrease host density which also + this approach decreases host density which also increases rack count. This configuration affects the number of power connections and also impacts network and cooling requirements. @@ -297,31 +285,30 @@ Storage-focused OpenStack design architecture server hardware selection should focus on a "scale up" versus "scale - out" solution. The determination of which will be the best + out" solution. The determination of which is the best solution, a smaller number of larger hosts or a larger number of - smaller hosts, will depend on a combination of factors + smaller hosts, depends on a combination of factors including cost, power, cooling, physical rack and floor space, support-warranty, and manageability.
Networking hardware selection - Some of the key considerations that should be included in - the selection of networking hardware include: + Key considerations for the selection of networking hardware include: Port count - The user will require networking + The user requires networking hardware that has the requisite port count. Port density - The network design will be affected by - the physical space that is required to provide the - requisite port count. A switch that can provide 48 10 GbE + The physical space required to provide the + requisite port count affects the network design. + A switch that provides 48 10 GbE ports in 1U has a much higher port density than a switch that provides 24 10 GbE ports in 2U. On a general scale, a higher port density leaves more rack @@ -343,15 +330,14 @@ Redundancy - The level of network hardware redundancy - required is influenced by the user requirements for - high availability and cost considerations. Network - redundancy can be achieved by adding redundant power - supplies or paired switches. + User requirements for high availability and cost + considerations influence the required level of network + hardware redundancy. Achieve network redundancy by adding + redundant power supplies or paired switches. - If this is a requirement - the hardware will need to support this configuration. - User requirements will determine if a completely + If this is a requirement, + the hardware must support this configuration. + User requirements determine if a completely redundant network infrastructure is required. @@ -382,7 +368,7 @@
Software selection - Selecting software to be included in a storage-focused + Selecting software for a storage-focused OpenStack architecture design includes three areas: @@ -403,25 +389,22 @@ Operating system and hypervisor Selecting the OS and hypervisor has a significant impact on the overall design and also affects server hardware - selection. Ensure that the storage hardware is supported by - the selected operating system and hypervisor combination and - that the networking hardware selection and topology will work - with the chosen operating system and hypervisor combination. - For example, if the design uses Link Aggregation Control - Protocol (LACP), the OS and hypervisor are both required to - support it. - Some areas that could be impacted by the selection of OS and - hypervisor include: + selection. Ensure that the selected operating system and + hypervisor combination supports the storage hardware and works + with the networking hardware selection and topology. + For example, Link Aggregation Control Protocol (LACP) requires + support from both the OS and hypervisor. + OS and hypervisor selection affects the following areas: Cost Selection of a commercially supported - hypervisor, such as Microsoft Hyper-V, will result in - a different cost model rather than selected a + hypervisor, such as Microsoft Hyper-V, results in + a different cost model than a community-supported open source hypervisor like Kinstance or Xen. Similarly, choosing Ubuntu over Red - Hat (or vice versa) will have an impact on cost due to + Hat (or vice versa) impacts cost due to support contracts. However, business or application requirements might dictate a specific or commercially supported hypervisor. @@ -431,8 +414,8 @@ Supportability Staff must have training with the chosen hypervisor. - The cost of training should be considered when choosing - the solution. The support of a commercial product + Consider the cost of training when choosing + a solution.
The support of a commercial product such as Red Hat, SUSE, or Windows, is the responsibility of the OS vendor. If an open source platform is chosen, the support comes from in-house @@ -442,11 +425,10 @@ Management tools - The management tools used for - Ubuntu and Kinstance differ from the management tools - for VMware vSphere. Although both OS and hypervisor - combinations are supported by OpenStack, there will - be varying impacts to the rest of the + Ubuntu and Kinstance use different management tools + than VMware vSphere. Although both OS and hypervisor + combinations are supported by OpenStack, there are + varying impacts to the rest of the design as a result of the selection of one combination versus the other. @@ -454,36 +436,36 @@ Scale and performance - Make sure that selected OS + Ensure that the selected OS and hypervisor combination meet the appropriate scale and performance requirements needed for this storage - focused OpenStack cloud. The chosen architecture will - need to meet the targeted instance-host ratios with + focused OpenStack cloud. The chosen architecture must + meet the targeted instance-host ratios with the selected OS-hypervisor combination. Security - Make sure that the design can accommodate + Ensure that the design can accommodate the regular periodic installation of application security patches while maintaining the required workloads. The frequency of security patches for the - proposed OS-hypervisor combination will have an impact - on performance and the patch installation process + proposed OS-hypervisor combination impacts + performance and the patch installation process could affect maintenance windows. Supported features - Determine what features of - OpenStack are required. This will often determine the + Determine the required features of + OpenStack. This often determines the selection of the OS-hypervisor combination. Certain features are only available with specific OSes or hypervisors. 
For example, if certain features are not - available, the design might need to be modified to - meet the user requirements. + available, you might need to modify the design to + meet user requirements. @@ -494,7 +476,7 @@ OS-hyervisor combinations. Operational and troubleshooting tools for one OS-hypervisor combination may differ from the tools used for another OS-hypervisor - combination. As a result, the design will need to + combination. As a result, the design must address if the two sets of tools need to interoperate. @@ -506,8 +488,8 @@ OpenStack components Which OpenStack components you choose can have a significant impact on the overall design. While there are certain - components that will always be present, (Compute and Image Service, for - example) there are other services that may not need to be + components that are always present, Compute and Image Service, for + example, there are other services that may not need to be present. As an example, a certain design may not require the Orchestration module. Omitting Orchestration would not typically have a significant impact on the overall design, however, if the @@ -517,8 +499,8 @@ A storage-focused design might require the ability to use Orchestration to launch instances with Block Storage volumes to perform storage-intensive processing. - For a storage-focused OpenStack design architecture, the - following components would typically be used: + A storage-focused OpenStack design architecture typically uses the + following components: OpenStack Identity (keystone) @@ -546,20 +528,19 @@ Excluding certain OpenStack components may limit or constrain the functionality of other components. If a design opts to include Orchestration but exclude Telemetry, then the design - will not be able to take advantage of Orchestration's auto scaling + cannot take advantage of Orchestration's auto scaling functionality (which relies on information from Telemetry). 
Due to the fact that you can use Orchestration to spin up a large number of instances to perform the compute-intensive - processing, including Orchestration in a compute-focused architecture - design is strongly recommended. + processing, we strongly recommend including Orchestration in a + compute-focused architecture design.
Supplemental software While OpenStack is a fairly complete collection of software - projects for building a platform for cloud services, there are - additional pieces of software that might need to be added to - any given OpenStack design. + projects for building a platform for cloud services, you may need + to add other pieces of software.
@@ -568,13 +549,12 @@ services for instances. There are many additional networking software packages that may be useful to manage the OpenStack components themselves. Some examples include HAProxy, - keepalived, and various routing daemons (like Quagga). Some of - these software packages, HAProxy in particular, are described - in more detail in the OpenStack High Availability - Guide (refer to the + keepalived, and various routing daemons (like Quagga). The + OpenStack High Availability Guide describes + some of these software packages, HAProxy in particular. See the Network controller cluster stack chapter of the OpenStack High - Availability Guide). + Availability Guide.
@@ -604,30 +584,29 @@ The factors for determining which - software packages in this category should be selected is + software packages in this category to select are outside the scope of this design guide. - Clustering Software, such as Corosync or Pacemaker, is - determined primarily by the availability design requirements. - The impact of including (or not including) these - software packages is determined by the availability of the - cloud infrastructure and the complexity of supporting the - configuration after it is deployed. The OpenStack High - Availability Guide provides more details on the installation - and configuration of Corosync and Pacemaker, should these - packages need to be included in the design. - Requirements for logging, monitoring, and alerting are - determined by operational considerations. Each of these - sub-categories includes a number of various options. For - example, in the logging sub-category one might consider + The availability design requirements determine the selection of + Clustering Software, such as Corosync or Pacemaker. + The availability of the cloud infrastructure and the complexity + of supporting the configuration after deployment determine + the impact of including these software packages. The + OpenStack High Availability Guide provides + more details on the installation and configuration of Corosync + and Pacemaker. + Operational considerations determine the requirements for + logging, monitoring, and alerting. Each of these + sub-categories includes options. For + example, in the logging sub-category you could select Logstash, Splunk, Log Insight, or another log - aggregation-consolidation tool. Logs should be stored in a - centralized location to make it easier to perform analytics + aggregation-consolidation tool. Store logs in a + centralized location to facilitate performing analytics against the data.
Log data analytics engines can also provide automation and issue notification, by providing a mechanism to both alert and automatically attempt to remediate some of the more commonly known issues. - If any of these software packages are needed, then the + If you require any of these software packages, the design must account for the additional resource consumption (CPU, RAM, storage, and network bandwidth for a log aggregation solution, for example). Some other potential @@ -648,40 +627,37 @@
Database software - Virtually all of the OpenStack components require access to + Most OpenStack components require access to back-end database services to store state and configuration information. Choose an appropriate back-end database which - will satisfy the availability and fault tolerance requirements + satisfies the availability and fault tolerance requirements of the OpenStack services. - MySQL is generally considered to be the de facto database - for OpenStack, however, other compatible databases are also - known to work. + MySQL is the default database for OpenStack, but other + compatible databases are available. Telemetry uses MongoDB. - The solution selected to provide high availability for the - database will change based on the selected database. If MySQL - is selected, then a number of options are available. For - active-active clustering a replication technology such as - Galera can be used. For active-passive some form of shared - storage must be used. Each of these potential solutions has an + The chosen high availability database solution changes + according to the selected database. MySQL, for example, provides + several options. Use a replication technology such as Galera + for active-active clustering. For active-passive use some form of + shared storage. Each of these potential solutions has an impact on the design: - Solutions that employ Galera/MariaDB will require at + Solutions that employ Galera/MariaDB require at least three MySQL nodes. - MongoDB will have its own design considerations, - with regards to making the database highly - available. + MongoDB has its own design considerations for high + availability. OpenStack design, generally, does not include shared - storage but for a high availability design some - components might require it depending on the specific + storage. However, for some high availability designs, + certain components might require it depending on the specific implementation. 
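The three-node minimum for Galera/MariaDB stated in the hunk above can be illustrated with a little arithmetic. The following is a minimal sketch, assuming the cluster stays writable only while a strict majority of equally weighted nodes remains reachable (a simplification of real quorum behavior). The function names are hypothetical and are not part of any Galera or OpenStack API.

```python
def quorum_size(nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority (assumed rule)."""
    if nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """Nodes that can be lost while a strict majority still remains."""
    return nodes - quorum_size(nodes)


if __name__ == "__main__":
    # Compare a two-node cluster with three- and five-node clusters.
    for n in (2, 3, 5):
        print(f"{n} nodes: quorum={quorum_size(n)}, "
              f"tolerated failures={tolerated_failures(n)}")
```

Under this assumption a two-node cluster cannot lose any node without losing its majority, while three nodes is the smallest size that survives a single failure, which is consistent with the requirement quoted in the patch.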
diff --git a/doc/arch-design/storage_focus/section_operational_considerations_storage_focus.xml b/doc/arch-design/storage_focus/section_operational_considerations_storage_focus.xml index cc7d93d1f7..6e1034ff32 100644 --- a/doc/arch-design/storage_focus/section_operational_considerations_storage_focus.xml +++ b/doc/arch-design/storage_focus/section_operational_considerations_storage_focus.xml @@ -83,7 +83,7 @@ Alerting and notification of responsible teams or - automated systems which will remediate problems with + automated systems which remediate problems with storage as they arise. @@ -94,7 +94,7 @@
Management efficiency - Operations personnel will often be required to replace failed + Operations personnel are often required to replace failed drives or nodes and provide ongoing maintenance of the storage hardware. Provisioning and configuration of new or upgraded storage is another important consideration when it comes to management of @@ -109,8 +109,8 @@ Application awareness Well-designed applications should be aware of underlying storage subsystems, in order to use cloud storage solutions effectively. - If natively available replication is not available, the application - must be able to be modified by operations personnel so that they + If natively available replication is not available, operations personnel + must be able to modify the application so that they can provide their own replication service. In the event that replication is unavailable, operations personnel can design applications to react such that they can provide their own replication services. @@ -125,22 +125,21 @@ Designing for fault tolerance and availability of storage systems in an OpenStack cloud is vastly different when comparing the Block Storage and Object Storage services. The - Object Storage service is designed to have consistency and + Object Storage service design features consistency and partition tolerance as a function of the application. Therefore, it does not have any reliance on hardware RAID controllers to provide redundancy for physical disks.
Block Storage fault tolerance and availability - Block Storage resource nodes are commonly configured - with advanced RAID controllers and high performance disks that - are designed to provide fault tolerance at the hardware - level. - Deploy high performing storage solutions + Block Storage resource nodes are commonly configured + with advanced RAID controllers and high performance disks to + provide fault tolerance at the hardware level. + Deploy high performing storage solutions such as SSD disk drives or flash storage systems in cases where applications require extreme performance out of Block Storage devices. - In environments that place extreme demands on Block Storage, - it is advisable to take advantage of multiple storage pools. + In environments that place extreme demands on Block Storage, + we recommend using multiple storage pools. In this case, each pool of devices should have a similar hardware design and disk configuration across all hardware nodes in that pool. This allows for a design that provides @@ -152,13 +151,13 @@ storage across resource nodes. Ensuring that applications can schedule volumes in multiple regions, each with their own network, power, and cooling infrastructure, can give tenants - the ability to build fault tolerant applications that will be + the ability to build fault tolerant applications that are distributed across multiple availability zones. In addition to the Block Storage resource nodes, it is important to design for high availability and redundancy of the APIs and related services that are responsible for provisioning and providing access to storage. We - recommend desiging a layer of hardware or software load + recommend designing a layer of hardware or software load balancers in order to achieve high availability of the appropriate REST API services to provide uninterrupted service. In some cases, it may also be necessary to deploy an @@ -172,8 +171,8 @@ so that tenants can manage Block Storage volumes. 
In a cloud with extreme demands on Block Storage, the network architecture should take into account the amount of East-West - bandwidth that will be required for instances to make use of - the available storage resources. Network devices selected + bandwidth required for instances to make use of + the available storage resources. The selected network devices should support jumbo frames for transferring large blocks of data. In some cases, it may be necessary to create an additional back-end storage network dedicated to providing @@ -184,38 +183,37 @@ Object Storage fault tolerance and availability While consistency and partition tolerance are both inherent features of the Object Storage service, it is important to - design the overall storage architecture to ensure that those - goals are met by the system being implemented. The + design the overall storage architecture to ensure that the + implemented system meets those goals. The OpenStack Object Storage service places a specific number of data replicas as objects on resource nodes. These replicas are distributed throughout the cluster based on a consistent hash ring which exists on all nodes in the cluster. - The Object Storage system should be designed with sufficient + Design the Object Storage system with a sufficient number of zones to provide quorum for the number of replicas - defined. As an example, with three replicas configured in the + defined. For example, with three replicas configured in the Swift cluster, the recommended number of zones to configure within the Object Storage cluster in order to achieve quorum is 5. While it is possible to deploy a solution with fewer zones, the implied risk of doing so is that some data may not be available and API requests to certain objects stored in the - cluster might fail. For this reason, ensure the number of - zones in the Object Storage cluster is properly accounted for. + cluster might fail. 
For this reason, ensure you properly account + for the number of zones in the Object Storage cluster. Each Object Storage zone should be self-contained within its own availability zone. Each availability zone should have independent access to network, power and cooling infrastructure to ensure uninterrupted access to data. In - addition, each availability zone should be serviced by a pool - of Object Storage proxy servers which will provide access to - data stored on the object nodes. Object proxies in each region - should leverage local read and write affinity so that access - to objects is facilitated by local storage resources wherever - possible. We recommend that upstream load balancing be - deployed to ensure that proxy services can be distributed - across the multiple zones and, in some cases, it may be - necessary to make use of third party solutions to aid with - geographical distribution of services. + addition, a pool of Object Storage proxy servers providing access + to data stored on the object nodes should service + each availability zone. Object proxies in each region + should leverage local read and write affinity so that local storage + resources facilitate access to objects wherever + possible. We recommend deploying upstream load balancing to ensure + that proxy services are distributed across the multiple zones and, + in some cases, it may be necessary to make use of third party + solutions to aid with geographical distribution of services. A zone within an Object Storage cluster is a logical - division. A zone can be represented as any of the following: + division. Any of the following may represent a zone: @@ -243,7 +241,7 @@ - Deciding the proper zone design is crucial for allowing the Object + Selecting the proper zone design is crucial for allowing the Object Storage cluster to scale while providing an available and redundant storage system. 
It may be necessary to configure storage policies that have different requirements @@ -263,9 +261,9 @@ consideration during the design phase.
Scaling Block Storage - Block Storage pools can be upgraded to add storage capacity - rather easily without interruption to the overall Block - Storage service. Nodes can be added to the pool by simply + You can upgrade Block Storage pools to add storage capacity + without interruption to the overall Block + Storage service. Add nodes to the pool by installing and configuring the appropriate hardware and software and then allowing that node to report in to the proper storage pool via the message bus. This is because Block @@ -276,10 +274,10 @@ In some cases, the demand on Block Storage from instances may exhaust the available network bandwidth. As a result, design network infrastructure that services Block Storage - resources in such a way that capacity and bandwidth can be - added relatively easily. This often involves the use of + resources in such a way that you can add capacity and + bandwidth easily. This often involves the use of dynamic routing protocols or advanced networking solutions to - allow capacity to be added to downstream devices easily. Both + add capacity to downstream devices easily. Both the front-end and back-end storage network designs should encompass the ability to quickly and easily add capacity and bandwidth. @@ -297,23 +295,23 @@ disks. For example, a system that starts with a single disk and a partition power of 3 can have 8 (2^3) partitions. Adding a - second disk means that each will have 4 partitions. + second disk means that each has 4 partitions. The one-disk-per-partition limit means that this system can never have more than 8 disks, limiting its scalability. However, a system that starts with a single disk and a partition power of 10 can have up to 1024 (2^10) disks. - As back-end storage capacity is added to the system, the - partition maps cause data to be redistributed amongst storage - nodes. In some cases, this replication can consist of - extremely large data sets. 
In these cases, we recommended - making use of back-end replication links which will not + As you add back-end storage capacity to the system, the + partition maps redistribute data amongst the storage + nodes. In some cases, this replication consists of + extremely large data sets. In these cases, we recommend + using back-end replication links that do not contend with tenants' access to data. As more tenants begin to access data within the cluster and - their data sets grow it will become necessary to add front-end + their data sets grow it is necessary to add front-end bandwidth to service data access requests. Adding front-end bandwidth to an Object Storage cluster requires careful - planning and design of the Object Storage proxies that will be - used by tenants to gain access to the data, along with the + planning and design of the Object Storage proxies that tenants + use to gain access to the data, along with the high availability solutions that enable easy scaling of the proxy layer. We recommend designing a front-end load balancing layer that tenants and consumers use to gain access @@ -321,9 +319,9 @@ may be distributed across zones, regions or even across geographic boundaries, which may also require that the design encompass geo-location solutions. - In some cases, adding bandwidth and capacity to the network + In some cases, you must add bandwidth and capacity to the network resources servicing requests between proxy servers and storage - nodes will be required. For this reason, the network + nodes. For this reason, the network architecture used for access to storage nodes and proxy servers should make use of a design which is scalable.
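Reviewer note: the partition arithmetic in the Scaling Object Storage hunk above can be checked with a short sketch. It assumes, as the text states, that a ring built with partition power p holds 2^p partitions and that every disk needs at least one partition. The function names are hypothetical and are not part of the Swift API.

```python
def partition_count(partition_power: int) -> int:
    """Total number of partitions in a ring: 2 ** partition_power."""
    if partition_power < 0:
        raise ValueError("partition power must be non-negative")
    return 2 ** partition_power


def partitions_per_disk(partition_power: int, disks: int) -> float:
    """Average partitions held by each disk, assuming an even spread.

    Raises ValueError when there are more disks than partitions, which is
    the one-disk-per-partition limit described in the text.
    """
    total = partition_count(partition_power)
    if disks < 1 or disks > total:
        raise ValueError(f"disk count must be between 1 and {total}")
    return total / disks


# Figures from the text: power 3 gives 8 partitions, two disks hold 4 each,
# and power 10 allows up to 1024 disks.
print(partition_count(3))
print(partitions_per_disk(3, 2))
print(partition_count(10))
```

Calling `partitions_per_disk(3, 9)` raises `ValueError`, which mirrors the statement that a ring with partition power 3 can never grow past 8 disks.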
diff --git a/doc/arch-design/storage_focus/section_prescriptive_examples_storage_focus.xml b/doc/arch-design/storage_focus/section_prescriptive_examples_storage_focus.xml index 8f83f84445..5a8d16fccb 100644 --- a/doc/arch-design/storage_focus/section_prescriptive_examples_storage_focus.xml +++ b/doc/arch-design/storage_focus/section_prescriptive_examples_storage_focus.xml @@ -7,8 +7,8 @@ Prescriptive examples Storage-focused architecture highly depends on the - specific use case. Three specific example use cases are - discussed in this section: + specific use case. This section discusses three + specific example use cases:
@@ -38,9 +38,9 @@ - The presented REST interface does not require a high performance - caching tier, and is presented as a traditional Object store running - on traditional spindles. + The example REST interface, presented as a traditional Object store running + on traditional spindles, does not require a high performance + caching tier. This example uses the following components: Network: @@ -52,7 +52,7 @@ Storage hardware: - 10 storage servers each with 12x4 TB disks equalling + 10 storage servers each with 12x4 TB disks equaling 480 TB total space with approximately 160 Tb of usable space after replicas. @@ -87,7 +87,7 @@
One potential solution to this problem is the implementation of storage systems designed for performance. Parallel file systems have previously - filled this need in the HPC space and as a result could be considered + filled this need in the HPC space and are suitable for large scale performance-orientated systems. OpenStack has integration with Hadoop to manage the Hadoop cluster within the cloud. This diagram shows an OpenStack store with a high @@ -112,37 +112,36 @@ High performance database with Database service Databases are a common workload that benefit from high performance storage back ends. Although enterprise storage is not a requirement, - many environments have existing storage that can be used as back ends for - OpenStack cloud. A storage pool can be created to provide block devices + many environments have existing storage that OpenStack cloud can use as + back ends. You can create a storage pool to provide block devices with OpenStack Block Storage for instances as well as object interfaces. - In this example, the database I-O requirements were high and demanded + In this example, the database I-O requirements are high and demand storage presented from a fast SSD pool. - A storage system is used to present a LUN that is backed by + A storage system presents a LUN backed by a set of SSDs using a traditional storage array with OpenStack Block Storage integration or a storage platform such as Ceph or Gluster. This system can provide additional performance. For example, in the database example below, a portion of the SSD pool can act as a block device to the Database server. In the high performance analytics - example, the REST interface would be accelerated by the inline - SSD cache layer. + example, the inline SSD cache layer accelerates the REST interface. - Ceph was selected to present a Swift-compatible REST + In this example, Ceph presents a Swift-compatible REST interface, as well as a block level storage from a distributed storage cluster. 
It is highly flexible and has features that - allow to reduce cost of operations such as self healing and + enable reduced cost of operations such as self healing and auto balancing. Using erasure coded pools are a suitable way of maximizing the amount of usable space. There are special considerations around erasure coded pools. For example, higher computational requirements and limitations on - the operations allowed on an object; partial writes are not - supported in an erasure coded pool. + the operations allowed on an object; erasure coded pools do not + support partial writes. Using Ceph as an applicable example, a potential architecture @@ -183,8 +182,8 @@ Using an SSD cache layer, you can present block devices - directly to Hypervisors or instances. The SSD cache systems - can also be used as an inline cache for the REST interface. + directly to Hypervisors or instances. The REST interface can + also use the SSD cache systems as an inline cache.
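Reviewer note: the capacity figures quoted for the first example (10 storage servers, 12 x 4 TB disks each, 480 TB raw, roughly 160 TB usable) can be reproduced with the sketch below. The replica count of 3 is an assumption inferred from the ratio of the two quoted figures; the patch itself only says "after replicas". The function names are hypothetical.

```python
def raw_capacity_tb(servers: int, disks_per_server: int, disk_size_tb: float) -> float:
    """Raw capacity of the cluster before any replication overhead."""
    return servers * disks_per_server * disk_size_tb


def usable_capacity_tb(raw_tb: float, replicas: int) -> float:
    """Usable capacity when every object is stored `replicas` times."""
    if replicas < 1:
        raise ValueError("replica count must be at least 1")
    return raw_tb / replicas


raw = raw_capacity_tb(servers=10, disks_per_server=12, disk_size_tb=4)
usable = usable_capacity_tb(raw, replicas=3)  # assumption: 3 replicas
print(raw, usable)
```

This ignores file system overhead and reserved free space, so real usable capacity would be somewhat lower than the result.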
diff --git a/doc/arch-design/storage_focus/section_tech_considerations_storage_focus.xml b/doc/arch-design/storage_focus/section_tech_considerations_storage_focus.xml index 3d5ddb538f..6a2a3ed1d9 100644 --- a/doc/arch-design/storage_focus/section_tech_considerations_storage_focus.xml +++ b/doc/arch-design/storage_focus/section_tech_considerations_storage_focus.xml @@ -23,7 +23,7 @@ Running scripted smaller benchmarks during the life cycle of the architecture helps record the system health at different points in time. The data from - these scripted benchmarks will assist in future + these scripted benchmarks assist in future scoping and gaining a deeper understanding of an organization's needs.
 @@ -32,14 +32,14 @@ Scale Scaling storage solutions in a storage focused - OpenStack architecture design is driven by both initial + OpenStack architecture design is driven by initial requirements, including IOPS, capacity, and bandwidth, and future needs. Planning capacity based on projected needs over the course of a budget cycle is important for a design. The architecture should balance cost - and capacity, while also allowing flexibility - for new technologies and methods to be implemented as + and capacity, while also allowing flexibility to + implement new technologies and methods as they become available. @@ -49,10 +49,9 @@ Designing security around data has multiple points of focus that vary depending on SLAs, legal requirements, industry regulations, and certifications - needed for systems or people. HIPPA, ISO9000, and SOX - compliance should be considered based on the type of - data. Levels of access control can be important for - certain organizations. + needed for systems or people. Consider compliance with HIPAA, + ISO9000, and SOX based on the type of data. For certain + organizations, levels of access control are important.
 @@ -71,8 +70,8 @@ Storage management - A range of storage - management-related considerations must be addressed in + You must address a range of storage + management-related considerations in the design of a storage focused OpenStack cloud. These considerations include, but are not limited to, backup strategy (and restore strategy, since a backup that @@ -94,7 +93,7 @@ When building a storage focused OpenStack architecture, - strive to build a flexible design that is based on an + strive to build a flexible design based on an industry standard core. One way of accomplishing this might be through the use of different back ends serving different use cases. diff --git a/doc/arch-design/storage_focus/section_user_requirements_storage_focus.xml b/doc/arch-design/storage_focus/section_user_requirements_storage_focus.xml index e0ca8cea43..5279ccd985 100644 --- a/doc/arch-design/storage_focus/section_user_requirements_storage_focus.xml +++ b/doc/arch-design/storage_focus/section_user_requirements_storage_focus.xml @@ -6,8 +6,7 @@ xml:id="user-requirements-storage-focus"> User requirements - Storage-focused clouds are defined by their requirements for - data. These include: + Requirements for data define storage-focused clouds. These include: @@ -25,9 +24,8 @@ - A balance between cost and user - requirements dictate what methods and technologies will be - used in a cloud architecture. + A balance between cost and user requirements dictates + what methods and technologies to use in a cloud architecture. Cost @@ -94,8 +92,8 @@ Data compliance Policies governing types of - information that are required to reside in certain - locations due to regular issues and cannot reside in + information that must reside in certain + locations due to regulatory issues and cannot reside in other locations for the same reason. @@ -104,17 +102,17 @@
Technical requirements - The following are technical requirements that could be - incorporated into the architecture design: + You can incorporate the following technical requirements + into the architecture design: Storage proximity In order to provide high performance or large amounts of storage space, the - design may have to accommodate storage that is each of - the hypervisors or served from a central storage - device. + design may have to accommodate storage that is + attached to each hypervisor or served from a + central storage device. @@ -129,16 +127,15 @@ Availability Specific requirements regarding - availability will influence the technology used to - store and protect data. These requirements will - influence the cost and solution that will be - implemented. + availability influence the technology used to + store and protect data. These requirements + influence cost and the implemented solution. Security - Data will need to be protected both in + You must protect data both in transit and at rest.