[arch-design] Set up book structure
1. Move mitaka changes temporarily into the arch-guide-draft-mitaka
   subdirectory
2. Set up book structure per Architecture Design Guide specification

Change-Id: I27e66561208647f8ead32fc3e5b333051cd92a42
Implements: blueprint arch-guide-restructure

@ -0,0 +1,402 @@
=============================
Capacity planning and scaling
=============================

An important consideration in running a cloud over time is projecting growth
and utilization trends in order to plan capital expenditures for the short and
long term. Gather utilization meters for compute, network, and storage, along
with historical records of these meters. While securing major anchor tenants
can lead to rapid jumps in the utilization of resources, the average rate of
adoption of cloud services through normal usage also needs to be carefully
monitored.

General storage considerations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A wide variety of operator-specific requirements dictates the nature of the
storage back end. Examples of such requirements are as follows:

* Public, private, or hybrid cloud, and associated SLA requirements
* The need for encryption at rest for data on storage nodes
* Whether live migration will be offered

We recommend that data be encrypted both in transit and at rest.
If you plan to use live migration, a shared storage configuration is highly
recommended.

Capacity planning for a multi-site cloud
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

An OpenStack cloud can be designed in a variety of ways to handle individual
application needs. A multi-site deployment has additional challenges compared
to single-site installations.

When determining capacity options, take into account technical, economic, and
operational issues that might arise from specific decisions.

Inter-site link capacity describes the connectivity capability between
different OpenStack sites. This includes parameters such as
bandwidth, latency, whether or not a link is dedicated, and any business
policies applied to the connection. The capability and number of the
links between sites determine what kind of options are available for
deployment. For example, if two sites have a pair of high-bandwidth
links available between them, it may be wise to configure a separate
storage replication network between the two sites to support a single
swift endpoint and a shared Object Storage capability between them. An
example of this technique, as well as a configuration walk-through, is
available at
http://docs.openstack.org/developer/swift/replication_network.html#dedicated-replication-network.
Another option in this scenario is to build a dedicated set of tenant
private networks across the secondary link, using overlay networks with
a third party mapping the site overlays to each other.

The capacity requirements of the links between sites are driven by
application behavior. If the link latency is too high, certain
applications that use a large number of small packets, for example
:term:`RPC <Remote Procedure Call (RPC)>` API calls, may encounter
issues communicating with each other or operating properly. OpenStack
may also encounter similar types of issues. To mitigate this, the
Identity service provides service call timeout tuning to prevent issues
when authenticating against a central Identity service.
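
For illustration, one place this tuning is commonly applied is the
``keystone_authtoken`` middleware section in each service's configuration
file. The values below are assumptions for the sketch only; derive real
values from the observed inter-site latency:

.. code-block:: ini

   [keystone_authtoken]
   # Illustrative values only; tune for the measured inter-site latency.
   http_connect_timeout = 10
   http_request_max_retries = 3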

Another network capacity consideration for a multi-site deployment is
the amount and performance of overlay networks available for tenant
networks. If using shared tenant networks across zones, it is imperative
that an external overlay manager or controller be used to map these
overlays together. It is also necessary to ensure that the number of
possible IDs in each zone is identical.

.. note::

   As of the Kilo release, OpenStack Networking was not capable of
   managing tunnel IDs across installations. If one site runs out of
   IDs, but another does not, that tenant's network is unable to reach
   the other site.

The ability for a region to grow depends on scaling out the number of
available compute nodes. However, it may be necessary to grow cells in an
individual region, depending on the size of your cluster and the ratio of
virtual machines per hypervisor.

A third form of capacity comes in the multi-region-capable components of
OpenStack. Centralized Object Storage is capable of serving objects
through a single namespace across multiple regions. Since this works by
accessing the object store through the swift proxy, it is possible to
overload the proxies. There are two options available to mitigate this
issue:

* Deploy a large number of swift proxies. The drawback is that the
  proxies are not load-balanced and a large file request could
  continually hit the same proxy.

* Add a caching HTTP proxy and load balancer in front of the swift
  proxies. Since swift objects are returned to the requester via HTTP,
  this load balancer alleviates the load required on the swift
  proxies. A minimal load balancer sketch follows this list.
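
The following is a minimal sketch of such a load balancer, assuming HAProxy
and hypothetical host names and addresses; it is not a complete production
configuration:

.. code-block:: none

   # Minimal HAProxy sketch; host names and addresses are hypothetical.
   frontend swift_api
       bind 203.0.113.10:8080
       default_backend swift_proxies

   backend swift_proxies
       balance roundrobin
       # The swift healthcheck middleware answers on /healthcheck.
       option httpchk GET /healthcheck
       server swift-proxy-01 192.0.2.11:8080 check
       server swift-proxy-02 192.0.2.12:8080 check
       server swift-proxy-03 192.0.2.13:8080 check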

Capacity planning for a compute-focused cloud
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Adding extra capacity to a compute-focused cloud is a horizontal scaling
process.

We recommend using similar CPUs when adding extra nodes to the environment.
This reduces the chance of breaking live-migration features if they are
present. Scaling out hypervisor hosts also has a direct effect on network
and other data center resources. We recommend you factor in this increase
when reaching rack capacity or when requiring extra network switches.

Changing the internal components of a Compute host to account for increases in
demand is a process known as vertical scaling. Swapping a CPU for one with more
cores, or increasing the memory in a server, can help add extra capacity for
running applications.

Another option is to assess the average workloads and increase the number of
instances that can run within the compute environment by adjusting the
overcommit ratio.

.. note::

   Changing the CPU overcommit ratio can have a detrimental effect and
   cause a potential increase in noisy neighbor issues.

The added risk of increasing the overcommit ratio is that more instances fail
when a compute host fails. We do not recommend increasing the CPU overcommit
ratio in a compute-focused OpenStack design, as it can increase the potential
for noisy neighbor issues.

Capacity planning for a hybrid cloud
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

One of the primary reasons many organizations use a hybrid cloud is to
increase capacity without making large capital investments.

Capacity and the placement of workloads are key design considerations for
hybrid clouds. The long-term capacity plan for these designs must incorporate
growth over time to prevent permanent consumption of more expensive external
clouds. To avoid this scenario, account for future applications' capacity
requirements and plan growth appropriately.

It is difficult to predict the amount of load a particular application might
incur if the number of users fluctuates, or the application experiences an
unexpected increase in use. It is possible to define application requirements
in terms of vCPU, RAM, bandwidth, or other resources and plan appropriately.
However, other clouds might not use the same meter or even the same
oversubscription rates.

Oversubscription is a method to emulate more capacity than may physically be
present. For example, a physical hypervisor node with 32 GB RAM may host 24
instances, each provisioned with 2 GB RAM. As long as all 24 instances do not
concurrently use 2 full gigabytes, this arrangement works well. However, some
hosts take oversubscription to extremes and, as a result, performance can be
inconsistent. If at all possible, determine what the oversubscription rates
of each host are and plan capacity accordingly.

Block Storage
~~~~~~~~~~~~~

Configure Block Storage resource nodes with advanced RAID controllers
and high-performance disks to provide fault tolerance at the hardware
level.

Deploy high performing storage solutions such as SSD drives or
flash storage systems for applications requiring additional performance out
of Block Storage devices.

In environments that place substantial demands on Block Storage, we
recommend using multiple storage pools. In this case, each pool of
devices should have a similar hardware design and disk configuration
across all hardware nodes in that pool. This allows for a design that
provides applications with access to a wide variety of Block Storage
pools, each with their own redundancy, availability, and performance
characteristics. When deploying multiple pools of storage, it is also
important to consider the impact on the Block Storage scheduler, which is
responsible for provisioning storage across resource nodes. Ideally,
ensure that applications can schedule volumes in multiple regions, each with
their own network, power, and cooling infrastructure. This will give tenants
the option of building fault-tolerant applications that are distributed
across multiple availability zones.
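
As an illustration, multiple pools are typically exposed to the Block Storage
scheduler as separate back ends in ``cinder.conf``. The back-end names, volume
groups, and driver choice below are assumptions for the sketch only:

.. code-block:: ini

   # Hypothetical two-pool layout: one SSD pool and one capacity pool.
   [DEFAULT]
   enabled_backends = pool-ssd,pool-capacity

   [pool-ssd]
   volume_driver = cinder.volume.drivers.lvm.LVMVolumeDriver
   volume_group = cinder-volumes-ssd
   volume_backend_name = pool-ssd

   [pool-capacity]
   volume_driver = cinder.volume.drivers.lvm.LVMVolumeDriver
   volume_group = cinder-volumes-sata
   volume_backend_name = pool-capacity

A volume type mapped to each ``volume_backend_name`` then lets tenants select
a pool with the desired characteristics when creating a volume.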

In addition to the Block Storage resource nodes, it is important to
design for high availability and redundancy of the APIs, and related
services that are responsible for provisioning and providing access to
storage. We recommend designing a layer of hardware or software load
balancers in order to achieve high availability of the appropriate REST
API services to provide uninterrupted service. In some cases, it may
also be necessary to deploy an additional layer of load balancing to
provide access to back-end database services responsible for servicing
and storing the state of Block Storage volumes. It is imperative that a
highly available database cluster is used to store the Block
Storage metadata.

In a cloud with significant demands on Block Storage, the network
architecture should take into account the amount of East-West bandwidth
required for instances to make use of the available storage resources.
The selected network devices should support jumbo frames for
transferring large blocks of data, and utilize a dedicated network for
providing connectivity between instances and Block Storage.

Scaling Block Storage
---------------------

You can upgrade Block Storage pools to add storage capacity without
interrupting the overall Block Storage service. Add nodes to the pool by
installing and configuring the appropriate hardware and software and
then allowing that node to report in to the proper storage pool through the
message bus. Block Storage nodes generally report into the scheduler
service advertising their availability. As a result, after the node is
online and available, tenants can make use of those storage resources
instantly.
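
For example, you can confirm that a newly added node has registered with the
scheduler before relying on it; the command below is a sketch and the new node
should appear in the output with state ``up``:

.. code-block:: console

   $ openstack volume service list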

In some cases, the demand on Block Storage may exhaust the available
network bandwidth. As a result, design the network infrastructure that
services Block Storage resources in such a way that you can add capacity
and bandwidth easily. This often involves the use of dynamic routing
protocols or advanced networking solutions to add capacity to downstream
devices easily. Both the front-end and back-end storage network designs
should encompass the ability to quickly and easily add capacity and
bandwidth.

.. note::

   Sufficient monitoring and data collection should be in place
   from the start, so that timely decisions regarding capacity,
   input/output metrics (IOPS), or storage-associated bandwidth can
   be made.

Object Storage
~~~~~~~~~~~~~~

While consistency and partition tolerance are both inherent features of
the Object Storage service, it is important to design the overall
storage architecture to ensure that the implemented system meets those
goals. The OpenStack Object Storage service places a specific number of
data replicas as objects on resource nodes. Replicas are distributed
throughout the cluster, based on a consistent hash ring also stored on
each node in the cluster.

Design the Object Storage system with a sufficient number of zones to
provide quorum for the number of replicas defined. For example, with
three replicas configured in the swift cluster, the recommended number
of zones to configure within the Object Storage cluster in order to
achieve quorum is five. While it is possible to deploy a solution with
fewer zones, the implied risk of doing so is that some data may not be
available and API requests to certain objects stored in the cluster
might fail. For this reason, ensure you properly account for the number
of zones in the Object Storage cluster.

Each Object Storage zone should be self-contained within its own
availability zone. Each availability zone should have independent access
to network, power, and cooling infrastructure to ensure uninterrupted
access to data. In addition, a pool of Object Storage proxy servers
providing access to data stored on the object nodes should service each
availability zone. Object proxies in each region should leverage local
read and write affinity so that local storage resources facilitate
access to objects wherever possible. We recommend deploying upstream
load balancing to ensure that proxy services are distributed across the
multiple zones and, in some cases, it may be necessary to make use of
third-party solutions to aid with geographical distribution of services.
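
As a sketch, read and write affinity can be enabled in each region's
``proxy-server.conf``; the region numbering below is a hypothetical example
for a proxy located in region 1:

.. code-block:: ini

   [app:proxy-server]
   # Prefer the local region (r1) for reads, and write to local nodes
   # first; the region numbering here is hypothetical.
   sorting_method = affinity
   read_affinity = r1=100
   write_affinity = r1
   write_affinity_node_count = 2 * replicas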

A zone within an Object Storage cluster is a logical division. Any of
the following may represent a zone:

* A disk within a single node
* One zone per node
* Zone per collection of nodes
* Multiple racks
* Multiple data centers

Selecting the proper zone design is crucial for allowing the Object
Storage cluster to scale while providing an available and redundant
storage system. It may be necessary to configure storage policies that
have different requirements with regards to replicas, retention, and
other factors that could heavily affect the design of storage in a
specific zone.

Scaling Object Storage
----------------------

Adding back-end storage capacity to an Object Storage cluster requires
careful planning and forethought. In the design phase, it is important
to determine the maximum partition power required by the Object Storage
service, which determines the maximum number of partitions which can
exist. Object Storage distributes data among all available storage, but
a partition cannot span more than one disk, so the maximum number of
partitions can only be as high as the number of disks.

For example, a system that starts with a single disk and a partition
power of 3 can have 8 (2^3) partitions. Adding a second disk means that
each disk has 4 partitions. The one-disk-per-partition limit means that this
system can never have more than 8 disks, limiting its scalability.
However, a system that starts with a single disk and a partition power
of 10 can have up to 1024 (2^10) disks.
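
The partition power is fixed when a ring is first built, so choose it for the
largest disk count the cluster is ever expected to reach. A sketch of creating
an object ring with a partition power of 10, three replicas, and a one-hour
minimum between partition moves:

.. code-block:: console

   # swift-ring-builder <builder> create <part_power> <replicas> <min_part_hours>
   $ swift-ring-builder object.builder create 10 3 1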

As you add back-end storage capacity to the system, the partition maps
redistribute data amongst the storage nodes. In some cases, this
involves replication of extremely large data sets. In these cases, we
recommend using back-end replication links that do not contend with
tenants' access to data.

As more tenants begin to access data within the cluster and their data
sets grow, it is necessary to add front-end bandwidth to service data
access requests. Adding front-end bandwidth to an Object Storage cluster
requires careful planning and design of the Object Storage proxies that
tenants use to gain access to the data, along with the high availability
solutions that enable easy scaling of the proxy layer. We recommend
designing a front-end load balancing layer that tenants and consumers
use to gain access to data stored within the cluster. This load
balancing layer may be distributed across zones, regions, or even across
geographic boundaries, which may also require that the design encompass
geo-location solutions.

In some cases, you must add bandwidth and capacity to the network
resources servicing requests between proxy servers and storage nodes.
For this reason, the network architecture used for access to storage
nodes and proxy servers should make use of a design which is scalable.

Compute resource design
~~~~~~~~~~~~~~~~~~~~~~~

When designing compute resource pools, consider the number of processors,
amount of memory, and the quantity of storage required for each hypervisor.

Consider whether compute resources will be provided in a single pool or in
multiple pools. In most cases, multiple pools of resources can be allocated
and addressed on demand, commonly referred to as bin packing.

In a bin packing design, each independent resource pool provides service
for specific flavors. Since instances are scheduled onto compute hypervisors,
each independent node's resources will be allocated to efficiently use the
available hardware. Bin packing also requires a common hardware design,
with all hardware nodes within a compute resource pool sharing a common
processor, memory, and storage layout. This makes it easier to deploy,
support, and maintain nodes throughout their lifecycle.
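
One common way to dedicate a pool of hosts to specific flavors, assuming host
aggregates and the ``AggregateInstanceExtraSpecsFilter`` scheduler filter are
in use, is sketched below; the aggregate, host, and flavor names are
hypothetical:

.. code-block:: console

   # Group the hosts that make up the pool (names are hypothetical).
   $ openstack aggregate create highmem-pool
   $ openstack aggregate set --property pool=highmem highmem-pool
   $ openstack aggregate add host highmem-pool compute-21

   # Tie a flavor to that pool so its instances land only there.
   $ openstack flavor create --ram 65536 --vcpus 16 --disk 40 m1.highmem
   $ openstack flavor set --property aggregate_instance_extra_specs:pool=highmem m1.highmem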

Increasing the size of the supporting compute environment increases the
network traffic and messages, adding load to the controller or
networking nodes. Effective monitoring of the environment will help with
capacity decisions on scaling.

Compute nodes automatically attach to OpenStack clouds, resulting in a
horizontally scaling process when adding extra compute capacity to an
OpenStack cloud. Additional processes are required to place nodes into
appropriate availability zones and host aggregates. When adding
additional compute nodes to environments, ensure identical or functionally
compatible CPUs are used, otherwise live migration features will break.
It is necessary to add rack capacity or network switches as scaling out
compute hosts directly affects network and data center resources.

Compute host components can also be upgraded to account for increases in
demand, known as vertical scaling. Upgrading CPUs with more
cores, or increasing the overall server memory, can add extra needed
capacity depending on whether the running applications are more CPU
intensive or memory intensive.

When selecting a processor, compare features and performance
characteristics. Some processors include features specific to
virtualized compute hosts, such as hardware-assisted virtualization, and
technology related to memory paging (also known as EPT shadowing). These
types of features can have a significant impact on the performance of
your virtual machine.

The number of processor cores and threads impacts the number of worker
threads which can be run on a resource node. Design decisions must
relate directly to the service being run on it, as well as provide a
balanced infrastructure for all services.

Another option is to assess the average workloads and increase the
number of instances that can run within the compute environment by
adjusting the overcommit ratio.

An overcommit ratio is the ratio of available virtual resources to
available physical resources. This ratio is configurable for CPU and
memory. The default CPU overcommit ratio is 16:1, and the default memory
overcommit ratio is 1.5:1. Determining the tuning of the overcommit
ratios during the design phase is important as it has a direct impact on
the hardware layout of your compute nodes.

.. note::

   Changing the CPU overcommit ratio can have a detrimental effect
   and cause a potential increase in noisy neighbor issues.
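
The defaults above correspond to the allocation ratio options in
``nova.conf``; the sketch below simply restates them, and any other values
should be derived from your own workload measurements:

.. code-block:: ini

   [DEFAULT]
   # Default ratios; raise or lower them based on measured workloads.
   cpu_allocation_ratio = 16.0
   ram_allocation_ratio = 1.5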

Insufficient disk capacity could also have a negative effect on overall
performance, including CPU and memory usage. Depending on the back-end
architecture of the OpenStack Block Storage layer, capacity includes
adding disk shelves to enterprise storage systems or installing
additional Block Storage nodes. Upgrading directly attached storage
installed in compute hosts, and adding capacity to the shared storage
for additional ephemeral storage to instances, may be necessary.

Consider the compute requirements of non-hypervisor nodes (also referred to as
resource nodes). This includes controller, Object Storage, and Block Storage
nodes, and networking services.

The ability to add compute resource pools for unpredictable workloads should
be considered. In some cases, the demand for certain instance types or flavors
may not justify individual hardware design. Allocate hardware designs that are
capable of servicing the most common instance requests. Adding hardware to the
overall architecture can be done later.

For more information on these topics, refer to the `OpenStack
Operations Guide <http://docs.openstack.org/ops>`_.

.. TODO Add information on control plane API services and horizon.
@ -0,0 +1,190 @@
.. _high-availability:

=================
High availability
=================

.. toctree::
   :maxdepth: 2

Data Plane and Control Plane
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When designing an OpenStack cloud, it is important to consider the needs
dictated by the :term:`Service Level Agreement (SLA)` in terms of the core
services required to maintain availability of running Compute service
instances, networks, storage, and additional services running on top of those
resources. These services are often referred to as the Data Plane services,
and are generally expected to be available all the time.

The remaining services, responsible for CRUD operations, metering, monitoring,
and so on, are often referred to as the Control Plane. The SLA is likely to
dictate a lower uptime requirement for these services.

The services comprising an OpenStack cloud have a number of requirements that
the architect needs to understand in order to be able to meet SLA terms. For
example, in order to provide the Compute service a minimum of storage, message
queueing, and database services are necessary, as well as the networking
between them.

Ongoing maintenance operations are made much simpler if there is logical and
physical separation of Data Plane and Control Plane systems. It then becomes
possible to, for example, reboot a controller without affecting customers.
If one service failure affects the operation of an entire server (a 'noisy
neighbor'), the separation between Control and Data Planes enables rapid
maintenance with a limited effect on customer operations.

Eliminating Single Points of Failure
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Within each site
----------------

OpenStack lends itself to deployment in a highly available manner where it is
expected that at least 2 servers be utilized. These can run all the services
involved, from the message queuing service, for example ``RabbitMQ`` or
``QPID``, to an appropriately deployed database service such as ``MySQL`` or
``MariaDB``. As services in the cloud are scaled out, back-end services will
need to scale too. Monitoring and reporting on server utilization and response
times, as well as load testing your systems, will help determine scale out
decisions.

The OpenStack services themselves should be deployed across multiple servers
that do not represent a single point of failure. Ensuring availability can
be achieved by placing these services behind highly available load balancers
that have multiple OpenStack servers as members.

There are a small number of OpenStack services which are intended to only run
in one place at a time (for example, the ``ceilometer-agent-central`` service).
In order to prevent these services from becoming a single point of failure,
they can be controlled by clustering software such as ``Pacemaker``.
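
As a sketch, assuming a Pacemaker cluster managed with ``pcs`` and a systemd
unit named ``openstack-ceilometer-central``, such a service can be registered
as a single, non-cloned resource so that exactly one copy runs somewhere in
the cluster at any time:

.. code-block:: console

   # Hypothetical resource name; leave the resource un-cloned so that
   # Pacemaker keeps only one instance of the service running.
   $ pcs resource create ceilometer-central systemd:openstack-ceilometer-central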

In OpenStack, the infrastructure is integral to providing services and should
always be available, especially when operating with SLAs. Ensuring network
availability is accomplished by designing the network architecture so that no
single point of failure exists. Factor the number of switches, routes, and
redundant power supplies into the core infrastructure, as well as the
associated bonding of networks to provide diverse routes to your
highly available switch infrastructure.

Care must be taken when deciding network functionality. Currently, OpenStack
supports both the legacy networking (nova-network) system and the newer,
extensible OpenStack Networking (neutron). OpenStack Networking and legacy
networking both have their advantages and disadvantages. They are both valid
and supported options that fit different network deployment models described in
the `OpenStack Operations Guide
<http://docs.openstack.org/openstack-ops/content/network_design.html#network_deployment_options>`_.

When using the Networking service, the OpenStack controller servers or separate
Networking hosts handle routing unless the dynamic virtual routers pattern for
routing is selected. Running routing directly on the controller servers mixes
the Data and Control Planes and can cause complex issues with performance and
troubleshooting. It is possible to use third-party software and external
appliances that help maintain highly available layer three routes. Doing so
allows for common application endpoints to control network hardware, or to
provide complex multi-tier web applications in a secure manner. It is also
possible to completely remove routing from Networking, and instead rely on
hardware routing capabilities. In this case, the switching infrastructure must
support layer three routing.

Application design must also be factored into the capabilities of the
underlying cloud infrastructure. If the compute hosts do not provide a seamless
live migration capability, then it must be expected that if a compute host
fails, the instances and any data local to those instances will be lost.
However, when providing users with an expectation that instances have a
high level of uptime guaranteed, the infrastructure must be deployed in a way
that eliminates any single point of failure if a compute host disappears.
This may include utilizing shared file systems on enterprise storage or
OpenStack Block Storage to provide a level of guarantee to match service
features.

If using a storage design that includes shared access to centralized storage,
ensure that this is also designed without single points of failure and that the
SLA for the solution matches or exceeds the expected SLA for the Data Plane.

Between sites in a multi-region design
--------------------------------------

Some services are commonly shared between multiple regions, including the
Identity service and the Dashboard. In this case, it is necessary to ensure
that the databases backing the services are replicated, and that access to
multiple workers across each site can be maintained in the event of losing a
single region.

Multiple network links should be deployed between sites to provide redundancy
for all components. This includes storage replication, which should be isolated
to a dedicated network or VLAN with the ability to assign QoS to control the
replication traffic or provide priority for this traffic. Note that if the data
store is highly changeable, the network requirements could have a significant
effect on the operational cost of maintaining the sites.

If the design incorporates more than one site, the ability to maintain object
availability in both sites has significant implications for the object storage
design and implementation. It also has a significant impact on the WAN network
design between the sites.

If applications running in a cloud are not cloud-aware, there should be clear
measures and expectations to define what the infrastructure can and cannot
support. An example would be shared storage between sites. It is possible,
however such a solution is not native to OpenStack and requires a third-party
hardware vendor to fulfill such a requirement. Another example can be seen in
applications that are able to consume resources in object storage directly.

Connecting more than two sites increases the challenges and adds more
complexity to the design considerations. Multi-site implementations require
planning to address the additional topology used for internal and external
connectivity. Some options include full mesh, hub and spoke, spine and leaf,
and 3D torus topologies.

For more information on high availability in OpenStack, see the `OpenStack High
Availability Guide <http://docs.openstack.org/ha-guide/>`_.

Site loss and recovery
~~~~~~~~~~~~~~~~~~~~~~

Outages can cause partial or full loss of site functionality. Strategies
should be implemented to understand and plan for recovery scenarios.

* The deployed applications need to continue to function and, more
  importantly, you must consider the impact on the performance and
  reliability of the application if a site is unavailable.

* It is important to understand what happens to the replication of
  objects and data between the sites when a site goes down. If this
  causes queues to start building up, consider how long these queues
  can safely exist until an error occurs.

* After an outage, ensure that operations of a site are resumed when it
  comes back online. We recommend that you architect the recovery to
  avoid race conditions.

Inter-site replication data
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Traditionally, replication has been the best method of protecting object store
implementations. A variety of replication methods exist in storage
architectures, for example synchronous and asynchronous mirroring. Most object
stores and back-end storage systems implement methods for replication at the
storage subsystem layer. Object stores also tailor replication techniques to
fit a cloud's requirements.

Organizations must find the right balance between data integrity and data
availability. Replication strategy may also influence disaster recovery
methods.

Replication across different racks, data centers, and geographical regions
increases focus on determining and ensuring data locality. The ability to
guarantee data is accessed from the nearest or fastest storage can be necessary
for applications to perform well.

.. note::

   When running embedded object store methods, ensure that you do not
   instigate extra data replication as this may cause performance issues.
@ -0,0 +1,56 @@
.. meta::
   :description: This guide targets OpenStack Architects
                 for architectural design
   :keywords: Architecture, OpenStack

===================================
OpenStack Architecture Design Guide
===================================

Abstract
~~~~~~~~

To reap the benefits of OpenStack, you should plan, design,
and architect your cloud properly, taking users' needs into
account and understanding the use cases.

.. TODO rewrite the abstract

Contents
~~~~~~~~

.. toctree::
   :maxdepth: 2

   common/conventions.rst
   introduction.rst
   identifying-stakeholders.rst
   technical-requirements.rst
   customer-requirements.rst
   operator-requirements.rst
   capacity-planning-scaling.rst
   high-availability.rst
   security-requirements.rst
   legal-requirements.rst
   arch-examples.rst

Appendix
~~~~~~~~

.. toctree::
   :maxdepth: 1

   common/app_support.rst

Glossary
~~~~~~~~

.. toctree::
   :maxdepth: 1

   common/glossary.rst

Search in this guide
~~~~~~~~~~~~~~~~~~~~

* :ref:`search`
@ -1,402 +1,3 @@
|
||||
=============================
|
||||
Capacity planning and scaling
|
||||
=============================
|
||||
|
||||
An important consideration in running a cloud over time is projecting growth
|
||||
and utilization trends in order to plan capital expenditures for the short and
|
||||
long term. Gather utilization meters for compute, network, and storage, along
|
||||
with historical records of these meters. While securing major anchor tenants
|
||||
can lead to rapid jumps in the utilization of resources, the average rate of
|
||||
adoption of cloud services through normal usage also needs to be carefully
|
||||
monitored.
|
||||
|
||||
General storage considerations
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
A wide variety of operator-specific requirements dictates the nature of the
|
||||
storage back end. Examples of such requirements are as follows:
|
||||
|
||||
* Public, private or a hybrid cloud, and associated SLA requirements
|
||||
* The need for encryption-at-rest, for data on storage nodes
|
||||
* Whether live migration will be offered
|
||||
|
||||
We recommend that data be encrypted both in transit and at-rest.
|
||||
If you plan to use live migration, a shared storage configuration is highly
|
||||
recommended.
|
||||
|
||||
Capacity planning for a multi-site cloud
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
An OpenStack cloud can be designed in a variety of ways to handle individual
|
||||
application needs. A multi-site deployment has additional challenges compared
|
||||
to single site installations.
|
||||
|
||||
When determining capacity options, take into account technical, economic and
|
||||
operational issues that might arise from specific decisions.
|
||||
|
||||
Inter-site link capacity describes the connectivity capability between
|
||||
different OpenStack sites. This includes parameters such as
|
||||
bandwidth, latency, whether or not a link is dedicated, and any business
|
||||
policies applied to the connection. The capability and number of the
|
||||
links between sites determine what kind of options are available for
|
||||
deployment. For example, if two sites have a pair of high-bandwidth
|
||||
links available between them, it may be wise to configure a separate
|
||||
storage replication network between the two sites to support a single
|
||||
swift endpoint and a shared Object Storage capability between them. An
|
||||
example of this technique, as well as a configuration walk-through, is
|
||||
available at
|
||||
http://docs.openstack.org/developer/swift/replication_network.html#dedicated-replication-network.
|
||||
Another option in this scenario is to build a dedicated set of tenant
|
||||
private networks across the secondary link, using overlay networks with
|
||||
a third party mapping the site overlays to each other.
|
||||
|
||||
The capacity requirements of the links between sites is driven by
|
||||
application behavior. If the link latency is too high, certain
|
||||
applications that use a large number of small packets, for example
|
||||
:term:`RPC <Remote Procedure Call (RPC)>` API calls, may encounter
|
||||
issues communicating with each other or operating
|
||||
properly. OpenStack may also encounter similar types of issues.
|
||||
To mitigate this, the Identity service provides service call timeout
|
||||
tuning to prevent issues authenticating against a central Identity services.
|
||||
|
||||
Another network capacity consideration for a multi-site deployment is
|
||||
the amount and performance of overlay networks available for tenant
|
||||
networks. If using shared tenant networks across zones, it is imperative
|
||||
that an external overlay manager or controller be used to map these
|
||||
overlays together. It is necessary to ensure the amount of possible IDs
|
||||
between the zones are identical.
|
||||
|
||||
.. note::
|
||||
|
||||
As of the Kilo release, OpenStack Networking was not capable of
|
||||
managing tunnel IDs across installations. So if one site runs out of
|
||||
IDs, but another does not, that tenant's network is unable to reach
|
||||
the other site.
|
||||
|
||||
The ability for a region to grow depends on scaling out the number of
|
||||
available compute nodes. However, it may be necessary to grow cells in an
|
||||
individual region, depending on the size of your cluster and the ratio of
|
||||
virtual machines per hypervisor.
|
||||
|
||||
A third form of capacity comes in the multi-region-capable components of
|
||||
OpenStack. Centralized Object Storage is capable of serving objects
|
||||
through a single namespace across multiple regions. Since this works by
|
||||
accessing the object store through swift proxy, it is possible to
|
||||
overload the proxies. There are two options available to mitigate this
|
||||
issue:
|
||||
|
||||
* Deploy a large number of swift proxies. The drawback is that the
|
||||
proxies are not load-balanced and a large file request could
|
||||
continually hit the same proxy.
|
||||
|
||||
* Add a caching HTTP proxy and load balancer in front of the swift
|
||||
proxies. Since swift objects are returned to the requester via HTTP,
|
||||
this load balancer alleviates the load required on the swift
|
||||
proxies.
|
||||
|
||||
Capacity planning for a compute-focused cloud
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Adding extra capacity to an compute-focused cloud is a horizontally scaling
|
||||
process.
|
||||
|
||||
We recommend using similar CPUs when adding extra nodes to the environment.
|
||||
This reduces the chance of breaking live-migration features if they are
|
||||
present. Scaling out hypervisor hosts also has a direct effect on network
|
||||
and other data center resources. We recommend you factor in this increase
|
||||
when reaching rack capacity or when requiring extra network switches.
|
||||
|
||||
Changing the internal components of a Compute host to account for increases in
|
||||
demand is a process known as vertical scaling. Swapping a CPU for one with more
|
||||
cores, or increasing the memory in a server, can help add extra capacity for
|
||||
running applications.
|
||||
|
||||
Another option is to assess the average workloads and increase the number of
|
||||
instances that can run within the compute environment by adjusting the
|
||||
overcommit ratio.
|
||||
|
||||
.. note::
|
||||
It is important to remember that changing the CPU overcommit ratio can
|
||||
have a detrimental effect and cause a potential increase in a noisy
|
||||
neighbor.
|
||||
|
||||
The added risk of increasing the overcommit ratio is that more instances fail
|
||||
when a compute host fails. We do not recommend that you increase the CPU
|
||||
overcommit ratio in compute-focused OpenStack design architecture. It can
|
||||
increase the potential for noisy neighbor issues.
|
||||
|
||||
Capacity planning for a hybrid cloud
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
One of the primary reasons many organizations use a hybrid cloud is to
|
||||
increase capacity without making large capital investments.
|
||||
|
||||
Capacity and the placement of workloads are key design considerations for
|
||||
hybrid clouds. The long-term capacity plan for these designs must incorporate
|
||||
growth over time to prevent permanent consumption of more expensive external
|
||||
clouds. To avoid this scenario, account for future applications’ capacity
|
||||
requirements and plan growth appropriately.
|
||||
|
||||
It is difficult to predict the amount of load a particular application might
|
||||
incur if the number of users fluctuate, or the application experiences an
|
||||
unexpected increase in use. It is possible to define application requirements
|
||||
in terms of vCPU, RAM, bandwidth, or other resources and plan appropriately.
|
||||
However, other clouds might not use the same meter or even the same
|
||||
oversubscription rates.
|
||||
|
||||
Oversubscription is a method to emulate more capacity than may physically be
|
||||
present. For example, a physical hypervisor node with 32 GB RAM may host 24
|
||||
instances, each provisioned with 2 GB RAM. As long as all 24 instances do not
|
||||
concurrently use 2 full gigabytes, this arrangement works well. However, some
|
||||
hosts take oversubscription to extremes and, as a result, performance can be
|
||||
inconsistent. If at all possible, determine what the oversubscription rates
|
||||
of each host are and plan capacity accordingly.
|
||||
|
||||
Block Storage
|
||||
~~~~~~~~~~~~~
|
||||
|
||||
Configure Block Storage resource nodes with advanced RAID controllers
|
||||
and high-performance disks to provide fault tolerance at the hardware
|
||||
level.
|
||||
|
||||
Deploy high performing storage solutions such as SSD drives or
|
||||
flash storage systems for applications requiring additional performance out
|
||||
of Block Storage devices.
|
||||
|
||||
In environments that place substantial demands on Block Storage, we
|
||||
recommend using multiple storage pools. In this case, each pool of
|
||||
devices should have a similar hardware design and disk configuration
|
||||
across all hardware nodes in that pool. This allows for a design that
|
||||
provides applications with access to a wide variety of Block Storage
|
||||
pools, each with their own redundancy, availability, and performance
|
||||
characteristics. When deploying multiple pools of storage, it is also
|
||||
important to consider the impact on the Block Storage scheduler which is
|
||||
responsible for provisioning storage across resource nodes. Ideally,
|
||||
ensure that applications can schedule volumes in multiple regions, each with
|
||||
their own network, power, and cooling infrastructure. This will give tenants
|
||||
the option of building fault-tolerant applications that are distributed
|
||||
across multiple availability zones.
|
||||
|
||||
In addition to the Block Storage resource nodes, it is important to
|
||||
design for high availability and redundancy of the APIs, and related
|
||||
services that are responsible for provisioning and providing access to
|
||||
storage. We recommend designing a layer of hardware or software load
|
||||
balancers in order to achieve high availability of the appropriate REST
|
||||
API services to provide uninterrupted service. In some cases, it may
|
||||
also be necessary to deploy an additional layer of load balancing to
|
||||
provide access to back-end database services responsible for servicing
|
||||
and storing the state of Block Storage volumes. It is imperative that a
|
||||
highly available database cluster is used to store the Block
|
||||
Storage metadata.
|
||||
|
||||
In a cloud with significant demands on Block Storage, the network
|
||||
architecture should take into account the amount of East-West bandwidth
|
||||
required for instances to make use of the available storage resources.
|
||||
The selected network devices should support jumbo frames for
|
||||
transferring large blocks of data, and utilize a dedicated network for
|
||||
providing connectivity between instances and Block Storage.
|
||||
|
||||
Scaling Block Storage
|
||||
---------------------
|
||||
|
||||
You can upgrade Block Storage pools to add storage capacity without
|
||||
interrupting the overall Block Storage service. Add nodes to the pool by
|
||||
installing and configuring the appropriate hardware and software and
|
||||
then allowing that node to report in to the proper storage pool through the
|
||||
message bus. Block Storage nodes generally report into the scheduler
|
||||
service advertising their availability. As a result, after the node is
|
||||
online and available, tenants can make use of those storage resources
|
||||
instantly.
|
||||
|
||||
In some cases, the demand on Block Storage may exhaust the available
|
||||
network bandwidth. As a result, design network infrastructure that
|
||||
services Block Storage resources in such a way that you can add capacity
|
||||
and bandwidth easily. This often involves the use of dynamic routing
|
||||
protocols or advanced networking solutions to add capacity to downstream
|
||||
devices easily. Both the front-end and back-end storage network designs
|
||||
should encompass the ability to quickly and easily add capacity and
|
||||
bandwidth.
|
||||
|
||||
.. note::
|
||||
|
||||
Sufficient monitoring and data collection should be in-place
|
||||
from the start, such that timely decisions regarding capacity,
|
||||
input/output metrics (IOPS) or storage-associated bandwidth can
|
||||
be made.
|
||||
|
||||
Object Storage
|
||||
~~~~~~~~~~~~~~
|
||||
|
||||
While consistency and partition tolerance are both inherent features of
|
||||
the Object Storage service, it is important to design the overall
|
||||
storage architecture to ensure that the implemented system meets those
|
||||
goals. The OpenStack Object Storage service places a specific number of
|
||||
data replicas as objects on resource nodes. Replicas are distributed
|
||||
throughout the cluster, based on a consistent hash ring also stored on
|
||||
each node in the cluster.
|
||||
|
||||
Design the Object Storage system with a sufficient number of zones to
|
||||
provide quorum for the number of replicas defined. For example, with
|
||||
three replicas configured in the swift cluster, the recommended number
|
||||
of zones to configure within the Object Storage cluster in order to
|
||||
achieve quorum is five. While it is possible to deploy a solution with
|
||||
fewer zones, the implied risk of doing so is that some data may not be
|
||||
available and API requests to certain objects stored in the cluster
|
||||
might fail. For this reason, ensure you properly account for the number
|
||||
of zones in the Object Storage cluster.
|
||||
|
||||
Each Object Storage zone should be self-contained within its own
|
||||
availability zone. Each availability zone should have independent access
|
||||
to network, power, and cooling infrastructure to ensure uninterrupted
|
||||
access to data. In addition, a pool of Object Storage proxy servers
|
||||
providing access to data stored on the object nodes should service each
|
||||
availability zone. Object proxies in each region should leverage local
|
||||
read and write affinity so that local storage resources facilitate
|
||||
access to objects wherever possible. We recommend deploying upstream
|
||||
load balancing to ensure that proxy services are distributed across the
|
||||
multiple zones and, in some cases, it may be necessary to make use of
|
||||
third-party solutions to aid with geographical distribution of services.
|
||||
|
||||
A zone within an Object Storage cluster is a logical division. Any of
|
||||
the following may represent a zone:
|
||||
|
||||
* A disk within a single node
|
||||
* One zone per node
|
||||
* Zone per collection of nodes
|
||||
* Multiple racks
|
||||
* Multiple data centers
|
||||
|
||||
Selecting the proper zone design is crucial for allowing the Object
|
||||
Storage cluster to scale while providing an available and redundant
|
||||
storage system. It may be necessary to configure storage policies that
|
||||
have different requirements with regards to replicas, retention, and
|
||||
other factors that could heavily affect the design of storage in a
|
||||
specific zone.
|
||||
|
||||
Scaling Object Storage
|
||||
----------------------
|
||||
|
||||
Adding back-end storage capacity to an Object Storage cluster requires
|
||||
careful planning and forethought. In the design phase, it is important
|
||||
to determine the maximum partition power required by the Object Storage
|
||||
service, which determines the maximum number of partitions which can
|
||||
exist. Object Storage distributes data among all available storage, but
|
||||
a partition cannot span more than one disk, so the maximum number of
|
||||
partitions can only be as high as the number of disks.
|
||||
|
||||
For example, a system that starts with a single disk and a partition
|
||||
power of 3 can have 8 (2^3) partitions. Adding a second disk means that
|
||||
each has 4 partitions. The one-disk-per-partition limit means that this
|
||||
system can never have more than 8 disks, limiting its scalability.
|
||||
However, a system that starts with a single disk and a partition power
|
||||
of 10 can have up to 1024 (2^10) disks.
|
||||
|
||||
As you add back-end storage capacity to the system, the partition maps
|
||||
redistribute data amongst the storage nodes. In some cases, this
|
||||
involves replication of extremely large data sets. In these cases, we
|
||||
recommend using back-end replication links that do not contend with
|
||||
tenants' access to data.
|
||||
|
||||
As more tenants begin to access data within the cluster and their data
|
||||
sets grow, it is necessary to add front-end bandwidth to service data
|
||||
access requests. Adding front-end bandwidth to an Object Storage cluster
|
||||
requires careful planning and design of the Object Storage proxies that
|
||||
tenants use to gain access to the data, along with the high availability
|
||||
solutions that enable easy scaling of the proxy layer. We recommend
|
||||
designing a front-end load balancing layer that tenants and consumers
|
||||
use to gain access to data stored within the cluster. This load
|
||||
balancing layer may be distributed across zones, regions or even across
|
||||
geographic boundaries, which may also require that the design encompass
|
||||
geo-location solutions.
|
||||
|
||||
In some cases, you must add bandwidth and capacity to the network
|
||||
resources servicing requests between proxy servers and storage nodes.
|
||||
For this reason, the network architecture used for access to storage
|
||||
nodes and proxy servers should make use of a design which is scalable.
|
||||
|
||||
Compute resource design
|
||||
~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
When designing compute resource pools, consider the number of processors,
|
||||
amount of memory, and the quantity of storage required for each hypervisor.
|
||||
|
||||
Consider whether compute resources will be provided in a single pool or in
|
||||
multiple pools. In most cases, multiple pools of resources can be allocated
|
||||
and addressed on demand, commonly referred to as bin packing.
|
||||
|
||||
In a bin packing design, each independent resource pool provides service
|
||||
for specific flavors. Since instances are scheduled onto compute hypervisors,
|
||||
each independent node's resources will be allocated to efficiently use the
|
||||
available hardware. Bin packing also requires a common hardware design,
|
||||
with all hardware nodes within a compute resource pool sharing a common
|
||||
processor, memory, and storage layout. This makes it easier to deploy,
|
||||
support, and maintain nodes throughout their lifecycle.
|
||||
|
||||
Increasing the size of the supporting compute environment increases the
|
||||
network traffic and messages, adding load to the controller or
|
||||
networking nodes. Effective monitoring of the environment will help with
|
||||
capacity decisions on scaling.
|
||||
|
||||
Compute nodes automatically attach to OpenStack clouds, resulting in a
|
||||
horizontally scaling process when adding extra compute capacity to an
|
||||
OpenStack cloud. Additional processes are required to place nodes into
|
||||
appropriate availability zones and host aggregates. When adding
|
||||
additional compute nodes to environments, ensure identical or functional
|
||||
compatible CPUs are used, otherwise live migration features will break.
|
||||
It is necessary to add rack capacity or network switches as scaling out
|
||||
compute hosts directly affects network and data center resources.

Compute host components can also be upgraded to account for increases in
demand, known as vertical scaling. Upgrading CPUs with more
cores, or increasing the overall server memory, can add extra needed
capacity depending on whether the running applications are more CPU
intensive or memory intensive.

When selecting a processor, compare features and performance
characteristics. Some processors include features specific to
virtualized compute hosts, such as hardware-assisted virtualization, and
technology related to memory paging (also known as EPT shadowing). These
types of features can have a significant impact on the performance of
your virtual machine.
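
On Linux compute hosts you can usually confirm that these processor features
are present before deploying a hypervisor. As a rough check, a non-zero count
from either command below indicates support:

.. code-block:: console

   # Hardware-assisted virtualization (Intel VT-x or AMD-V)
   $ grep -E -c '(vmx|svm)' /proc/cpuinfo
   # Extended page table support on Intel CPUs
   $ grep -c ept /proc/cpuinfo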

The number of processor cores and threads impacts the number of worker
threads which can be run on a resource node. Design decisions must
relate directly to the service being run on the node, as well as provide a
balanced infrastructure for all services.
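
Many OpenStack services expose their worker process counts as configuration
options. As an illustration, the following ``nova.conf`` values size the
Compute API, metadata, and conductor workers; the counts shown are examples
only, and most services default to the number of available cores:

.. code-block:: ini

   [DEFAULT]
   osapi_compute_workers = 8
   metadata_workers = 8

   [conductor]
   workers = 8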

Another option is to assess the average workloads and increase the
number of instances that can run within the compute environment by
adjusting the overcommit ratio.

An overcommit ratio is the ratio of available virtual resources to
available physical resources. This ratio is configurable for CPU and
memory. The default CPU overcommit ratio is 16:1, and the default memory
overcommit ratio is 1.5:1. Determining the tuning of the overcommit
ratios during the design phase is important as it has a direct impact on
the hardware layout of your compute nodes.
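
For example, at the default ratios a compute node with 24 physical cores and
128 GB of RAM presents roughly 384 vCPUs (24 x 16) and 192 GB of virtual
memory (128 x 1.5) to the scheduler. The ratios can be tuned per compute node
in ``nova.conf``; the values below simply restate the defaults:

.. code-block:: ini

   [DEFAULT]
   cpu_allocation_ratio = 16.0
   ram_allocation_ratio = 1.5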

.. note::

   Changing the CPU overcommit ratio can have a detrimental effect
   and cause a potential increase in noisy neighbor issues.

Insufficient disk capacity could also have a negative effect on overall
performance, including CPU and memory usage. Depending on the back-end
architecture of the OpenStack Block Storage layer, adding capacity may mean
adding disk shelves to enterprise storage systems or installing
additional block storage nodes. Upgrading directly attached storage
installed in compute hosts, and adding capacity to the shared storage
for additional ephemeral storage to instances, may be necessary.

Consider the compute requirements of non-hypervisor nodes (also referred to as
resource nodes). This includes controller, object storage, block storage, and
networking service nodes.

The ability to add compute resource pools for unpredictable workloads should
be considered. In some cases, the demand for certain instance types or flavors
may not justify individual hardware design. Allocate hardware designs that are
capable of servicing the most common instance requests. Adding hardware to the
overall architecture can be done later.

For more information on these topics, refer to the `OpenStack
Operations Guide <http://docs.openstack.org/ops>`_.

.. TODO Add information on control plane API services and horizon.

@ -90,7 +90,7 @@ html_context = {"gitsha": gitsha, "bug_tag": bug_tag,

# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
exclude_patterns = ['common/cli*', 'common/nova*', 'common/get_started_*',
                    'common/dashboard_customizing.rst']
                    'common/dashboard_customizing.rst', 'arch-guide-draft-mitaka']

# The reST default role (used for this markup: `text`) to use for all
# documents.

doc/arch-design-draft/source/design.rst (new file)
@ -0,0 +1,24 @@

======
Design
======

Compute service
~~~~~~~~~~~~~~~

Storage
~~~~~~~

Networking service
~~~~~~~~~~~~~~~~~~

Identity service
~~~~~~~~~~~~~~~~

Image service
~~~~~~~~~~~~~

Control Plane
~~~~~~~~~~~~~

Dashboard and APIs
~~~~~~~~~~~~~~~~~~

@ -1,190 +1,3 @@

.. _high-availability:

=================
High availability
High Availability
=================

.. toctree::
   :maxdepth: 2

Data Plane and Control Plane
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When designing an OpenStack cloud, it is important to consider the needs
dictated by the :term:`Service Level Agreement (SLA)` in terms of the core
services required to maintain availability of running Compute service
instances, networks, storage, and additional services running on top of those
resources. These services are often referred to as the Data Plane services,
and are generally expected to be available all the time.

The remaining services, responsible for CRUD operations, metering, monitoring,
and so on, are often referred to as the Control Plane. The SLA is likely to
dictate a lower uptime requirement for these services.

The services comprising an OpenStack cloud have a number of requirements which
the architect needs to understand in order to be able to meet SLA terms. For
example, in order to provide the Compute service a minimum of storage, message
queueing, and database services are necessary, as well as the networking
between them.

Ongoing maintenance operations are made much simpler if there is logical and
physical separation of Data Plane and Control Plane systems. It then becomes
possible to, for example, reboot a controller without affecting customers.
If one service failure affects the operation of an entire server ('noisy
neighbor'), the separation between Control and Data Planes enables rapid
maintenance with a limited effect on customer operations.

Eliminating Single Points of Failure
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Within each site
----------------

OpenStack lends itself to deployment in a highly available manner where it is
expected that at least two servers be utilized. These can run all the services
involved from the message queuing service, for example ``RabbitMQ`` or
``QPID``, and an appropriately deployed database service such as ``MySQL`` or
``MariaDB``. As services in the cloud are scaled out, back-end services will
need to scale too. Monitoring and reporting on server utilization and response
times, as well as load testing your systems, will help determine scale out
decisions.

The OpenStack services themselves should be deployed across multiple servers
that do not represent a single point of failure. Ensuring availability can
be achieved by placing these services behind highly available load balancers
that have multiple OpenStack servers as members.
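
A minimal sketch of such a load balancing layer, using HAProxy in front of the
Compute API as an example, is shown below. The addresses, port, and member
names are placeholders, and an equivalent stanza would exist for each
load-balanced API endpoint:

.. code-block:: none

   listen nova-api
       bind 192.168.1.100:8774
       balance roundrobin
       option tcpka
       option httpchk
       server controller1 192.168.1.11:8774 check inter 2000 rise 2 fall 5
       server controller2 192.168.1.12:8774 check inter 2000 rise 2 fall 5
       server controller3 192.168.1.13:8774 check inter 2000 rise 2 fall 5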

There are a small number of OpenStack services which are intended to only run
in one place at a time (e.g. the ``ceilometer-agent-central`` service). In
order to prevent these services from becoming a single point of failure, they
can be controlled by clustering software such as ``Pacemaker``.
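
As a hedged example, assuming the agent is packaged with a systemd unit named
``openstack-ceilometer-central``, a single-instance Pacemaker resource could
be created with the ``pcs`` shell:

.. code-block:: console

   # pcs resource create ceilometer-central systemd:openstack-ceilometer-central

Pacemaker then ensures that exactly one copy of the service runs somewhere in
the cluster, and restarts it on another node if its current host fails.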

In OpenStack, the infrastructure is integral to providing services and should
always be available, especially when operating with SLAs. Ensuring network
availability is accomplished by designing the network architecture so that no
single point of failure exists. A consideration of the number of switches,
routes and redundancies of power should be factored into core infrastructure,
as well as the associated bonding of networks to provide diverse routes to your
highly available switch infrastructure.

Care must be taken when deciding network functionality. Currently, OpenStack
supports both the legacy networking (nova-network) system and the newer,
extensible OpenStack Networking (neutron). OpenStack Networking and legacy
networking both have their advantages and disadvantages. They are both valid
and supported options that fit different network deployment models described in
the `OpenStack Operations Guide
<http://docs.openstack.org/openstack-ops/content/network_design.html#network_deployment_options>`_.

When using the Networking service, the OpenStack controller servers or separate
Networking hosts handle routing unless the dynamic virtual routers pattern for
routing is selected. Running routing directly on the controller servers mixes
the Data and Control Planes and can cause complex issues with performance and
troubleshooting. It is possible to use third-party software and external
appliances that help maintain highly available layer three routes. Doing so
allows for common application endpoints to control network hardware, or to
provide complex multi-tier web applications in a secure manner. It is also
possible to completely remove routing from Networking, and instead rely on
hardware routing capabilities. In this case, the switching infrastructure must
support layer three routing.

Application design must also be factored into the capabilities of the
underlying cloud infrastructure. If the compute hosts do not provide a seamless
live migration capability, then it must be expected that if a compute host
fails, the instances running on that host and any data local to them will be
lost. However, when providing an expectation to users that instances have a
high level of uptime guaranteed, the infrastructure must be deployed in a way
that eliminates any single point of failure if a compute host disappears.
This may include utilizing shared file systems on enterprise storage or
OpenStack Block Storage to provide a level of guarantee to match service
features.

If using a storage design that includes shared access to centralized storage,
ensure that this is also designed without single points of failure and the SLA
for the solution matches or exceeds the expected SLA for the Data Plane.

Between sites in a multi-region design
--------------------------------------

Some services are commonly shared between multiple regions, including the
Identity service and the Dashboard. In this case, it is necessary to ensure
that the databases backing the services are replicated, and that access to
multiple workers across each site can be maintained in the event of losing a
single region.

Multiple network links should be deployed between sites to provide redundancy
for all components. This includes storage replication, which should be isolated
to a dedicated network or VLAN with the ability to assign QoS to control the
replication traffic or provide priority for this traffic. Note that if the data
store is highly changeable, the network requirements could have a significant
effect on the operational cost of maintaining the sites.

If the design incorporates more than one site, the ability to maintain object
availability in both sites has significant implications on the object storage
design and implementation. It also has a significant impact on the WAN network
design between the sites.

If applications running in a cloud are not cloud-aware, there should be clear
measures and expectations to define what the infrastructure can and cannot
support. An example would be shared storage between sites. This is possible,
however, such a solution is not native to OpenStack and requires a third-party
hardware vendor to fulfill such a requirement. Another example can be seen in
applications that are able to consume resources in object storage directly.

Connecting more than two sites increases the challenges and adds more
complexity to the design considerations. Multi-site implementations require
planning to address the additional topology used for internal and external
connectivity. Some options include full mesh topology, hub and spoke, spine
and leaf, and 3D Torus.

For more information on high availability in OpenStack, see the `OpenStack High
Availability Guide <http://docs.openstack.org/ha-guide/>`_.

Site loss and recovery
~~~~~~~~~~~~~~~~~~~~~~

Outages can cause partial or full loss of site functionality. Strategies
should be implemented to understand and plan for recovery scenarios.

* The deployed applications need to continue to function and, more
  importantly, you must consider the impact on the performance and
  reliability of the application if a site is unavailable.

* It is important to understand what happens to the replication of
  objects and data between the sites when a site goes down. If this
  causes queues to start building up, consider how long these queues
  can safely exist until an error occurs.

* After an outage, ensure that operations of a site are resumed when it
  comes back online. We recommend that you architect the recovery to
  avoid race conditions.

Inter-site replication data
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Traditionally, replication has been the best method of protecting object store
implementations. A variety of replication methods exist in storage
architectures, for example synchronous and asynchronous mirroring. Most object
stores and back-end storage systems implement methods for replication at the
storage subsystem layer. Object stores also tailor replication techniques to
fit a cloud's requirements.

Organizations must find the right balance between data integrity and data
availability. Replication strategy may also influence disaster recovery
methods.

Replication across different racks, data centers, and geographical regions
increases focus on determining and ensuring data locality. The ability to
guarantee data is accessed from the nearest or fastest storage can be necessary
for applications to perform well.
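
In an Object Storage deployment that spans regions, for instance, read and
write affinity can be configured on the proxy servers so that requests favor
the local region first. The following is a sketch of the relevant
``proxy-server.conf`` settings, assuming region ``r1`` is the local region:

.. code-block:: ini

   [app:proxy-server]
   sorting_method = affinity
   # Prefer devices in region 1 for reads, falling back to other regions
   read_affinity = r1=100
   write_affinity = r1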

.. note::

   When running embedded object store methods, ensure that you do not
   instigate extra data replication as this may cause performance issues.

@ -10,11 +10,11 @@ OpenStack Architecture Design Guide

Abstract
~~~~~~~~

To reap the benefits of OpenStack, you should plan, design,
and architect your cloud properly, taking users' needs into
account and understanding the use cases.

.. TODO rewrite the abstract
This guide provides information on planning and designing an OpenStack
cloud. It describes common use cases, high availability, and considerations
when changing capacity and scaling your cloud environment. A breakdown of the
major OpenStack components is also described in relation to cloud architecture
design.

Contents
~~~~~~~~

@ -23,16 +23,11 @@ Contents

   :maxdepth: 2

   common/conventions.rst
   introduction.rst
   identifying-stakeholders.rst
   technical-requirements.rst
   customer-requirements.rst
   operator-requirements.rst
   capacity-planning-scaling.rst
   overview.rst
   use-cases.rst
   high-availability.rst
   security-requirements.rst
   legal-requirements.rst
   arch-examples.rst
   capacity-planning-scaling.rst
   design.rst

Appendix
~~~~~~~~

doc/arch-design-draft/source/overview.rst (new file)
@ -0,0 +1,3 @@

========
Overview
========

doc/arch-design-draft/source/use-cases.rst (new file)
@ -0,0 +1,15 @@

=========
Use Cases
=========

Development Cloud
~~~~~~~~~~~~~~~~~

General Compute Cloud
~~~~~~~~~~~~~~~~~~~~~

Web Scale Cloud
~~~~~~~~~~~~~~~

Public Cloud
~~~~~~~~~~~~