diff --git a/doc/arch-design-draft/source/capacity-planning-scaling.rst b/doc/arch-design-draft/source/capacity-planning-scaling.rst
index 73fd8adf5b..fd9da341cd 100644
--- a/doc/arch-design-draft/source/capacity-planning-scaling.rst
+++ b/doc/arch-design-draft/source/capacity-planning-scaling.rst
@@ -1,9 +1,244 @@
-============
-Introduction
-============
+=============================
+Capacity planning and scaling
+=============================
-.. toctree::
-   :maxdepth: 2
+
+An important consideration in running a cloud over time is projecting growth
+and utilization trends in order to plan capital expenditures for the short and
+long term. Gather utilization meters for compute, network, and storage, along
+with historical records of these meters. While securing major anchor tenants
+can lead to rapid jumps in the utilization of resources, the average rate of
+adoption of cloud services through normal usage also needs to be carefully
+monitored.
+
+General storage considerations
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+A wide variety of operator-specific requirements dictates the nature of the
+storage back end. Examples of such requirements are as follows:
+
+* Public or private cloud, and associated SLA requirements
+* The need for encryption at rest for data on storage nodes
+* Whether live migration will be offered
+
+We recommend that data be encrypted both in transit and at rest.
+If you plan to use live migration, a shared storage configuration is highly
+recommended.
+
+Block Storage
+~~~~~~~~~~~~~
+
+Configure Block Storage resource nodes with advanced RAID controllers
+and high-performance disks to provide fault tolerance at the hardware
+level.
+
+Deploy high-performing storage solutions such as SSD drives or
+flash storage systems for applications requiring additional performance out
+of Block Storage devices.
+
+In environments that place substantial demands on Block Storage, we
+recommend using multiple storage pools. In this case, each pool of
+devices should have a similar hardware design and disk configuration
+across all hardware nodes in that pool. This allows for a design that
+provides applications with access to a wide variety of Block Storage
+pools, each with its own redundancy, availability, and performance
+characteristics. When deploying multiple pools of storage, it is also
+important to consider the impact on the Block Storage scheduler, which
+is responsible for provisioning storage across resource nodes (a
+simplified sketch of this pool-aware placement appears later in this
+section). Ideally, ensure that applications can schedule volumes in
+multiple regions, each with its own network, power, and cooling
+infrastructure. This gives tenants the option of building
+fault-tolerant applications that are distributed across multiple
+availability zones.
+
+In addition to the Block Storage resource nodes, it is important to
+design for high availability and redundancy of the APIs and related
+services that are responsible for provisioning and providing access to
+storage. We recommend designing a layer of hardware or software load
+balancers in order to achieve high availability of the appropriate REST
+API services and provide uninterrupted service. In some cases, it may
+also be necessary to deploy an additional layer of load balancing to
+provide access to back-end database services responsible for servicing
+and storing the state of Block Storage volumes. It is imperative that a
+highly available database cluster is used to store the Block Storage
+metadata.
+
+In a cloud with significant demands on Block Storage, the network
+architecture should take into account the amount of East-West bandwidth
+required for instances to make use of the available storage resources.
+The selected network devices should support jumbo frames for
+transferring large blocks of data. A dedicated network should provide
+connectivity between instances and Block Storage.
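+
+The pool-aware placement described earlier can be illustrated with a short
+sketch. The following is a minimal, capacity-based weigher written for this
+guide; it is not the actual Block Storage (cinder) scheduler code, and the
+pool names and capacity figures are invented:
+
+.. code-block:: python
+
+   # Illustrative only: a toy capacity-based weigher, not the actual
+   # Block Storage (cinder) scheduler. Pool names and figures are invented.
+   pools = [
+       {"name": "ssd-pool", "total_gb": 20000, "provisioned_gb": 15000},
+       {"name": "hdd-pool", "total_gb": 100000, "provisioned_gb": 40000},
+   ]
+
+   def free_gb(pool):
+       """Unprovisioned capacity left in the pool, in gigabytes."""
+       return pool["total_gb"] - pool["provisioned_gb"]
+
+   def pick_pool(pools, requested_gb):
+       """Pick the candidate pool with the largest share of free space."""
+       fits = [p for p in pools if free_gb(p) >= requested_gb]
+       if not fits:
+           raise RuntimeError("no pool can satisfy the request")
+       return max(fits, key=lambda p: free_gb(p) / p["total_gb"])
+
+   print(pick_pool(pools, requested_gb=500)["name"])  # hdd-pool, 60% free
+
+A production design would also weigh the redundancy, availability, and
+performance characteristics of each pool, not only its free capacity.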
+
+Scaling Block Storage
+---------------------
+
+You can upgrade Block Storage pools to add storage capacity without
+interrupting the overall Block Storage service. Add nodes to the pool by
+installing and configuring the appropriate hardware and software, and
+then allowing that node to report in to the proper storage pool through
+the message bus. Block Storage nodes generally report in to the
+scheduler service, advertising their availability. As a result, after
+the node is online and available, tenants can make use of those storage
+resources instantly.
+
+In some cases, the demand on Block Storage may exhaust the available
+network bandwidth. As a result, design the network infrastructure that
+services Block Storage resources in such a way that you can add capacity
+and bandwidth easily. This often involves the use of dynamic routing
+protocols or advanced networking solutions to add capacity to downstream
+devices. Both the front-end and back-end storage network designs should
+encompass the ability to quickly and easily add capacity and bandwidth.
+
+.. note::
+
+   Sufficient monitoring and data collection should be in place
+   from the start, so that timely decisions regarding capacity,
+   input/output metrics (IOPS), or storage-associated bandwidth can
+   be made.
+
+Object Storage
+~~~~~~~~~~~~~~
+
+While consistency and partition tolerance are both inherent features of
+the Object Storage service, it is important to design the overall
+storage architecture to ensure that the implemented system meets those
+goals. The OpenStack Object Storage service places a specific number of
+data replicas as objects on resource nodes. Replicas are distributed
+throughout the cluster, based on a consistent hash ring that is also
+stored on each node in the cluster.
+
+Design the Object Storage system with a sufficient number of zones to
+provide quorum for the number of replicas defined. For example, with
+three replicas configured in the swift cluster, the recommended number
+of zones to configure within the Object Storage cluster in order to
+achieve quorum is five. While it is possible to deploy a solution with
+fewer zones, the implied risk of doing so is that some data may not be
+available and API requests to certain objects stored in the cluster
+might fail. For this reason, ensure you properly account for the number
+of zones in the Object Storage cluster.
+
+Each Object Storage zone should be self-contained within its own
+availability zone. Each availability zone should have independent access
+to network, power, and cooling infrastructure to ensure uninterrupted
+access to data. In addition, a pool of Object Storage proxy servers
+providing access to data stored on the object nodes should service each
+availability zone. Object proxies in each region should leverage local
+read and write affinity so that local storage resources facilitate
+access to objects wherever possible. We recommend deploying upstream
+load balancing to ensure that proxy services are distributed across the
+multiple zones and, in some cases, it may be necessary to make use of
+third-party solutions to aid with the geographical distribution of
+services.
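+
+The replica and zone guidance above can be expressed as simple arithmetic.
+The following sketch assumes the usual majority-quorum rule and generalizes
+the five-zones-for-three-replicas recommendation as ``2 * replicas - 1``;
+it is an illustration for this guide, not code or policy from the Object
+Storage (swift) project:
+
+.. code-block:: python
+
+   # Illustrative arithmetic only; not swift code. The zone formula simply
+   # generalizes the "three replicas, five zones" guidance given above.
+   def quorum(replicas):
+       """Smallest majority of replicas, for example 2 out of 3."""
+       return replicas // 2 + 1
+
+   def recommended_zones(replicas):
+       """Generalization of the guidance above: 5 zones for 3 replicas."""
+       return 2 * replicas - 1
+
+   for replicas in (3, 5):
+       print(f"{replicas} replicas: quorum {quorum(replicas)}, "
+             f"recommended zones {recommended_zones(replicas)}")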
+
+A zone within an Object Storage cluster is a logical division. Any of
+the following may represent a zone:
+
+* A disk within a single node
+* A single node
+* A collection of nodes
+* Multiple racks
+* Multiple data centers
+
+Selecting the proper zone design is crucial for allowing the Object
+Storage cluster to scale while providing an available and redundant
+storage system. It may be necessary to configure storage policies that
+have different requirements with regard to replicas, retention, and
+other factors that could heavily affect the design of storage in a
+specific zone.
+
+Scaling Object Storage
+----------------------
+
+Adding back-end storage capacity to an Object Storage cluster requires
+careful planning and forethought. In the design phase, it is important
+to determine the maximum partition power required by the Object Storage
+service, which determines the maximum number of partitions that can
+exist. Object Storage distributes data among all available storage, but
+a partition cannot span more than one disk, so the maximum number of
+partitions can only be as high as the number of disks.
+
+For example, a system that starts with a single disk and a partition
+power of 3 can have 8 (2^3) partitions. Adding a second disk means that
+each disk has 4 partitions. The one-disk-per-partition limit means that
+this system can never have more than 8 disks, limiting its scalability.
+However, a system that starts with a single disk and a partition power
+of 10 can have up to 1024 (2^10) disks.
+
+As you add back-end storage capacity to the system, the partition maps
+redistribute data among the storage nodes. In some cases, this involves
+replication of extremely large data sets. In these cases, we recommend
+using back-end replication links that do not contend with tenants'
+access to data.
+
+As more tenants begin to access data within the cluster and their data
+sets grow, it is necessary to add front-end bandwidth to service data
+access requests. Adding front-end bandwidth to an Object Storage cluster
+requires careful planning and design of the Object Storage proxies that
+tenants use to gain access to the data, along with the high availability
+solutions that enable easy scaling of the proxy layer. We recommend
+designing a front-end load balancing layer that tenants and consumers
+use to gain access to data stored within the cluster. This load
+balancing layer may be distributed across zones, regions, or even across
+geographic boundaries, which may also require that the design encompass
+geo-location solutions.
+
+In some cases, you must add bandwidth and capacity to the network
+resources servicing requests between proxy servers and storage nodes.
+For this reason, the network architecture used for access to storage
+nodes and proxy servers should make use of a scalable design.
+
+Network
+~~~~~~~
+
+.. TODO(unassigned): consolidate and update existing network sub-chapters.
+
+Compute
+~~~~~~~
+
+A relationship exists between the size of the compute environment and
+the OpenStack infrastructure controller nodes that support it.
+
+Increasing the size of the supporting compute environment increases the
+network traffic and messages, adding load to the controller or
+networking nodes. Effective monitoring of the environment will help with
+capacity decisions on scaling.
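+
+As a rough illustration of turning monitoring data into a scaling decision,
+the following sketch projects when a utilization threshold will be reached.
+The sample figures are invented for this guide, and the simple linear
+projection is an assumption; real environments warrant more careful trend
+analysis:
+
+.. code-block:: python
+
+   # Illustrative only: invented monthly utilization samples (fraction of
+   # total capacity) and a naive linear projection.
+   samples = [0.52, 0.55, 0.59, 0.62, 0.66]
+
+   # Average month-over-month growth across the samples.
+   growth = (samples[-1] - samples[0]) / (len(samples) - 1)
+
+   # Months until a chosen utilization threshold (80%) is reached,
+   # assuming utilization keeps growing at the same average rate.
+   threshold = 0.80
+   months_left = (threshold - samples[-1]) / growth
+   print(f"~{months_left:.1f} months until {threshold:.0%} utilization")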
+
+Compute nodes automatically attach to an OpenStack cloud, so adding
+extra compute capacity is a horizontal scaling process. Additional
+processes are required to place nodes into appropriate availability
+zones and host aggregates. When adding compute nodes to an environment,
+ensure identical or functionally compatible CPUs are used; otherwise,
+live migration features will break. Because scaling out compute hosts
+directly affects network and data center resources, it is necessary to
+add rack capacity or network switches.
+
+Compute host components can also be upgraded to account for increases in
+demand. This is also known as vertical scaling. Upgrading CPUs with more
+cores, or increasing the overall server memory, can add extra needed
+capacity depending on whether the running applications are more
+CPU-intensive or memory-intensive.
+
+Another option is to assess the average workloads and increase the
+number of instances that can run within the compute environment by
+adjusting the overcommit ratio.
+
+.. note::
+
+   Changing the CPU overcommit ratio can have a detrimental effect
+   and increase the potential for noisy neighbor issues.
+
+Insufficient disk capacity could also have a negative effect on overall
+performance, including CPU and memory usage. Depending on the back-end
+architecture of the OpenStack Block Storage layer, adding capacity may
+involve adding disk shelves to enterprise storage systems or installing
+additional Block Storage nodes. It may also be necessary to upgrade
+directly attached storage installed in compute hosts, or to add capacity
+to the shared storage that provides additional ephemeral storage to
+instances.
+
+For a deeper discussion on many of these topics, refer to the `OpenStack
+Operations Guide `_.
+
+Control plane API services and Horizon
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. TODO(unassigned): consolidate existing control plane sub-chapters.