[arch-design] Migrate cloud architecture examples

1. Migrate and tidy up cloud architecture examples from the current guide
2. Migrate figures
3. Add placeholder sections for new content

Change-Id: I290f555f6e0cd4200deccb4d705127d99e61c343
Partial-Bug: #1548176
Implements: blueprint archguide-mitaka-reorg
This commit is contained in:
daz 2016-03-09 11:25:53 +11:00
parent 54bc787a56
commit bfe2987752
43 changed files with 1254 additions and 10 deletions

View File

@ -0,0 +1,126 @@
=============================
Compute-focused cloud example
=============================
The Conseil Européen pour la Recherche Nucléaire (CERN), also known as
the European Organization for Nuclear Research, provides particle
accelerators and other infrastructure for high-energy physics research.
As of 2011, CERN operated the following two compute centers in Europe, with
plans to add a third.
+-----------------------+------------------------+
| Data center           | Approximate capacity   |
+=======================+========================+
| Geneva, Switzerland   | - 3.5 MW               |
|                       |                        |
|                       | - 91,000 cores         |
|                       |                        |
|                       | - 120 PB HDD           |
|                       |                        |
|                       | - 100 PB tape          |
|                       |                        |
|                       | - 310 TB memory        |
+-----------------------+------------------------+
| Budapest, Hungary     | - 2.5 MW               |
|                       |                        |
|                       | - 20,000 cores         |
|                       |                        |
|                       | - 6 PB HDD             |
+-----------------------+------------------------+
To support the growing compute demands of experiments related to the Large
Hadron Collider (LHC), CERN elected to deploy an OpenStack cloud using
Scientific Linux and RDO. The effort aimed to simplify the management of the
center's compute resources, with a view to doubling compute capacity through
the addition of a data center in 2013 while maintaining the same level of
compute staff.
The CERN solution uses :term:`cells <cell>` for segregation of compute
resources and for transparently scaling between different data centers.
This decision meant trading off support for security groups and live
migration. In addition, they must manually replicate some details, like
flavors, across cells. In spite of these drawbacks cells provide the
required scale while exposing a single public API endpoint to users.
CERN created a compute cell for each of the two original data centers
and created a third when it added a new data center in 2013. Each cell
contains three availability zones to further segregate compute resources
and at least three RabbitMQ message brokers configured for clustering
with mirrored queues for high availability.
The API cell, which resides behind an HAProxy load balancer, is in the
data center in Switzerland and directs API calls to compute cells using
a customized variation of the cell scheduler. The customizations allow
certain workloads to route to a specific data center or all data
centers, with cell RAM availability determining cell selection in the
latter case.
.. figure:: figures/Generic_CERN_Example.png
There is also some customization of the filter scheduler that handles
placement within the cells:
ImagePropertiesFilter
Provides special handling depending on the guest operating system in
use (Linux-based or Windows-based).
ProjectsToAggregateFilter
Provides special handling depending on which project the instance is
associated with.
default_schedule_zones
Allows the selection of multiple default availability zones, rather
than a single default.
A central database team manages the MySQL database server in each cell
in an active/passive configuration with a NetApp storage back end.
Backups run every 6 hours.
Network architecture
~~~~~~~~~~~~~~~~~~~~
To integrate with existing networking infrastructure, CERN made
customizations to legacy networking (nova-network). This was in the form
of a driver to integrate with CERN's existing database for tracking MAC
and IP address assignments.
The driver considers the compute node where the scheduler placed an
instance and selects a MAC address and IP address from the pre-registered
list associated with that node in the database. The database then updates
to reflect the assignment of those addresses to the instance.
Storage architecture
~~~~~~~~~~~~~~~~~~~~
CERN deploys the OpenStack Image service in the API cell and configures
it to expose version 1 (V1) of the API. This also requires the image
registry. The storage back end in use is a 3 PB Ceph cluster.
CERN maintains a small set of Scientific Linux 5 and 6 images onto which
orchestration tools can place applications. Puppet manages instance
configuration and customization.
Monitoring
~~~~~~~~~~
CERN does not require direct billing, but uses the Telemetry service to
perform metering for the purposes of adjusting project quotas. CERN uses
a sharded, replicated MongoDB back end. To spread API load, CERN
deploys instances of the nova-api service within the child cells for
Telemetry to query against. This also requires the configuration of
supporting services such as keystone, glance-api, and glance-registry in
the child cells.
.. figure:: figures/Generic_CERN_Architecture.png
Additional monitoring tools in use include
`Flume <http://flume.apache.org/>`_,
`Elasticsearch <http://www.elasticsearch.org/>`_,
`Kibana <http://www.elasticsearch.org/overview/kibana/>`_, and the
CERN-developed `Lemon <http://lemon.web.cern.ch/lemon/index.shtml>`_
project.

View File

@ -0,0 +1,85 @@
=====================
General cloud example
=====================
An online classified advertising company wants to run web applications
consisting of Tomcat, Nginx, and MariaDB in a private cloud. To meet policy
requirements, the cloud infrastructure will run in the company's own data
center. The company has predictable load requirements, but requires scaling
to cope with nightly increases in demand. Its current environment does not
have the flexibility to align with the company's goal of running an open
source API environment. The current environment consists
of the following:
* Between 120 and 140 installations of Nginx and Tomcat, each with 2
vCPUs and 4 GB of RAM
* A three-node MariaDB and Galera cluster, each with 4 vCPUs and 8 GB
RAM
The company runs hardware load balancers and multiple web applications
serving their websites, and orchestrates environments using combinations
of scripts and Puppet. The website generates large amounts of log data
daily that requires archiving.
The solution would consist of the following OpenStack components:
* A firewall, switches, and load balancers on the public-facing network
connections.
* OpenStack controller services running Image, Identity, and Networking,
combined with supporting services such as MariaDB and RabbitMQ,
configured for high availability on at least three controller nodes.
* OpenStack Compute nodes running the KVM hypervisor.
* OpenStack Block Storage for use by compute instances, requiring
persistent storage (such as databases for dynamic sites).
* OpenStack Object Storage for serving static objects (such as images).
.. figure:: figures/General_Architecture3.png
Running up to 140 web instances and the small number of MariaDB
instances requires 292 vCPUs and 584 GB of RAM. On a
typical 1U server using dual-socket hex-core Intel CPUs with
Hyperthreading, and assuming 2:1 CPU overcommit ratio, this would
require 8 OpenStack Compute nodes.
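The arithmetic behind these figures can be checked with a short sizing
sketch. The per-node vCPU capacity below is an assumption derived from the
hardware described above (dual-socket hex-core with Hyperthreading at a 2:1
overcommit ratio); the count of eight nodes used in this example leaves some
headroom on top of the raw result.

.. code-block:: python

   import math

   # Instance counts and sizes from the environment described above.
   web_instances, web_vcpus, web_ram_gb = 140, 2, 4
   db_instances, db_vcpus, db_ram_gb = 3, 4, 8

   total_vcpus = web_instances * web_vcpus + db_instances * db_vcpus      # 292
   total_ram_gb = web_instances * web_ram_gb + db_instances * db_ram_gb   # 584

   # Assumption: dual-socket hex-core CPUs with Hyperthreading (24 threads)
   # and a 2:1 vCPU overcommit ratio, giving 48 vCPUs per compute node.
   vcpus_per_node = 2 * 6 * 2 * 2

   nodes = math.ceil(total_vcpus / vcpus_per_node)   # 7 before any headroom
   print(total_vcpus, total_ram_gb, nodes)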
The web application instances run from local storage on each of the
OpenStack Compute nodes. The web application instances are stateless,
meaning that any of the instances can fail and the application will
continue to function.
MariaDB server instances store their data on shared enterprise storage,
such as NetApp or SolidFire devices. If a MariaDB instance fails, its
storage is expected to be re-attached to another instance and rejoined to
the Galera cluster.
Logs from the web application servers are shipped to OpenStack Object
Storage for processing and archiving.
Additional capabilities can be realized by moving static web content to
be served from OpenStack Object Storage containers, and backing the
OpenStack Image service with OpenStack Object Storage.
.. note::
Expanding OpenStack Object Storage capacity requires consideration of
network bandwidth. We advise running OpenStack Object Storage with
network connections offering 10 GbE or better connectivity.
Leveraging the Orchestration and Telemetry services to provide
auto-scaling, orchestrated web application environments is also an option.
Defining the web applications in a
:term:`Heat Orchestration Template (HOT)`
negates the reliance on the current scripted Puppet
solution.
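As a rough illustration of that approach, the following sketch launches such
a stack with python-heatclient. The endpoint, token, template file name, and
parameters are placeholders rather than details of this environment.

.. code-block:: python

   from heatclient import client

   HEAT_URL = "http://controller:8004/v1/PROJECT_ID"   # placeholder endpoint
   TOKEN = "AUTH_TOKEN"                                # placeholder token

   heat = client.Client("1", endpoint=HEAT_URL, token=TOKEN)

   # web-tier.yaml is a hypothetical HOT template describing the Tomcat and
   # Nginx tier; the parameter name below is illustrative only.
   with open("web-tier.yaml") as f:
       template = f.read()

   heat.stacks.create(
       stack_name="classifieds-web",
       template=template,
       parameters={"web_node_count": 140},
   )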
OpenStack Networking can be used to control hardware load balancers
through the use of plug-ins and the Networking API. This allows users to
control hardware load balancer pools and instances as members in these
pools, but their use in production environments must be carefully
weighed against current stability.

View File

@ -0,0 +1,154 @@
=====================
Hybrid cloud examples
=====================
Hybrid cloud environments are designed for these use cases:
* Bursting workloads from private to public OpenStack clouds
* Bursting workloads from private to public non-OpenStack clouds
* High availability across clouds (for technical diversity)
This chapter provides examples of environments that address
each of these use cases.
Bursting to a public OpenStack cloud
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Company A's data center is running low on capacity.
It is not possible to expand the data center in the foreseeable future.
In order to accommodate the continuously growing need for
development resources in the organization,
Company A decides to use resources in the public cloud.
Company A has an established data center with a substantial amount
of hardware. Migrating the workloads to a public cloud is not feasible.
The company has an internal cloud management platform that directs
requests to the appropriate cloud, depending on the local capacity.
This is a custom in-house application written for this specific purpose.
This solution is depicted in the figure below:
.. figure:: figures/Multi-Cloud_Priv-Pub3.png
:width: 100%
This example shows two clouds with a Cloud Management
Platform (CMP) connecting them. This guide does not
discuss a specific CMP, but describes how the Orchestration and
Telemetry services handle, manage, and control workloads.
The private OpenStack cloud has at least one controller and at least
one compute node. It includes metering using the Telemetry service.
The Telemetry service captures the load increase and the CMP
processes the information. If there is available capacity,
the CMP uses the OpenStack API to call the Orchestration service.
This creates instances on the private cloud in response to user requests.
When capacity is not available on the private cloud, the CMP issues
a request to the Orchestration service API of the public cloud.
This creates the instance on the public cloud.
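The decision flow described above can be summarized in a short, hypothetical
sketch. The ``CloudTarget`` class and its capacity check are placeholders
rather than an existing CMP or OpenStack API; a real CMP would derive the
capacity signal from Telemetry data.

.. code-block:: python

   class CloudTarget:
       """Wraps one cloud's Orchestration endpoint plus a capacity signal."""

       def __init__(self, name, heat_client, capacity_fn):
           self.name = name
           self.heat = heat_client
           self.has_spare_capacity = capacity_fn   # e.g. fed by Telemetry

       def launch(self, stack_name, template):
           return self.heat.stacks.create(stack_name=stack_name,
                                          template=template)


   def burst_launch(stack_name, template, private, public):
       """Prefer the private cloud; burst to the public cloud when full."""
       target = private if private.has_spare_capacity() else public
       return target.launch(stack_name, template)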
In this example, Company A does not direct all deployments to an
external public cloud, due to concerns regarding resource control,
security, and increased operational expense.
Bursting to a public non-OpenStack cloud
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The second example examines bursting workloads from the private cloud
into a non-OpenStack public cloud using Amazon Web Services (AWS)
to take advantage of additional capacity and to scale applications.
The following diagram demonstrates an OpenStack-to-AWS hybrid cloud:
.. figure:: figures/Multi-Cloud_Priv-AWS4.png
:width: 100%
Company B states that its developers are already using AWS
and do not want to change to a different provider.
If the CMP is capable of connecting to an external cloud
provider with an appropriate API, the workflow process remains
the same as the previous scenario.
The actions the CMP takes, such as monitoring loads and
creating new instances, stay the same.
However, the CMP performs actions in the public cloud
using applicable API calls.
If the public cloud is AWS, the CMP would use the
EC2 API to create a new instance and assign an Elastic IP.
It can then add that IP to HAProxy in the private cloud.
The CMP can also reference AWS-specific
tools such as CloudWatch and CloudFormation.
Several open source tool kits for building CMPs are
available and can handle this kind of translation.
Examples include ManageIQ, jClouds, and JumpGate.
High availability and disaster recovery
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Company C requires their local data center to be able to
recover from failure. Some of the workloads currently in
use are running on their private OpenStack cloud.
Protecting the data involves Block Storage, Object Storage,
and a database. The architecture supports the failure of
large components of the system while ensuring that the
system continues to deliver services.
While the services remain available to users, the failed
components are restored in the background based on standard
best practice data replication policies.
To achieve these objectives, Company C replicates data to
a second cloud in a geographically distant location.
The following diagram describes this system:
.. figure:: figures/Multi-Cloud_failover2.png
:width: 100%
This example includes two private OpenStack clouds connected with a CMP.
The source cloud, OpenStack Cloud 1, includes a controller and
at least one instance running MySQL. It also includes at least
one Block Storage volume and one Object Storage volume.
This means that data is available to the users at all times.
The details of the method for protecting each of these sources
of data differs.
Object Storage relies on the replication capabilities of
the Object Storage provider.
Company C enables OpenStack Object Storage so that it creates
geographically separated replicas that take advantage of this feature.
The company configures storage so that at least one replica
exists in each cloud. In order to make this work, the company
configures a single array spanning both clouds with OpenStack Identity.
Using Federated Identity, the array talks to both clouds, communicating
with OpenStack Object Storage through the Swift proxy.
For Block Storage, the replication is a little more difficult,
and involves tools outside of OpenStack itself.
The OpenStack Block Storage volume is not set as the drive itself
but as a logical object that points to a physical back end.
Company C configures Block Storage disaster recovery to use synchronous
backup for the highest level of data protection, although asynchronous
backup could have been chosen as a less latency-sensitive alternative.
For asynchronous backup, the Block Storage API makes it possible
to export the data and also the metadata of a particular volume,
so that it can be moved and replicated elsewhere.
More information can be found here:
https://blueprints.launchpad.net/cinder/+spec/cinder-backup-volume-metadata-support.
The synchronous backups create an identical volume in both
clouds and choose the appropriate flavor so that each cloud
has an identical back end. This is done by creating volumes
through the CMP. After this is configured, a solution
involving DRBD synchronizes the physical drives.
The database component is backed up using synchronous backups.
MySQL does not support geographically diverse replication,
so disaster recovery is provided by replicating the file itself.
As it is not possible to use Object Storage as the back end of
a database like MySQL, Swift replication is not an option.
Company C decides not to store the data on another geo-tiered
storage system, such as Ceph, used as the Block Storage back end,
which would have provided another layer of protection.
Another option would have been to store the database on an OpenStack
Block Storage volume and back it up like any other Block Storage volume.

View File

@ -0,0 +1,192 @@
=========================
Multi-site cloud examples
=========================
There are multiple ways to build a multi-site OpenStack installation,
based on the needs of the intended workloads. Below are example
architectures based on different requirements. These examples are meant
as a reference, and not a hard and fast rule for deployments. Use the
previous sections of this chapter to assist in selecting specific
components and implementations based on specific needs.
A large content provider needs to deliver content to customers that are
geographically dispersed. The workload is very sensitive to latency and
needs a rapid response to end users. After reviewing the user, technical,
and operational considerations, the provider determines that it is
beneficial to build a number of regions local to the customers' edge.
Rather than build a few large, centralized data centers, the intent of the
architecture is to provide a pair of small data centers in locations that
are closer to the customer. In this use case, spreading the application out
allows for different horizontal scaling than a traditional compute workload
scale. The intent is to scale by creating more copies of the application in
closer proximity to the users that need it most, in order to ensure
faster response time to user requests. This provider deploys two
data centers at each of the four chosen regions. The implications of this
design are based around the method of placing copies of resources in
each of the remote regions. Swift objects, Glance images, and Block Storage
volumes need to be manually replicated into each region. This may be
beneficial for some systems, for example a content service where only
some of the content needs to exist in some regions. A centralized Identity
service (Keystone) is recommended to ensure that authentication and access
to the API endpoints remain easily manageable.
It is recommended that you install an automated DNS system such as
Designate. Application administrators need a way to manage the mapping
of which application copy exists in each region and how to reach it,
unless an external Dynamic DNS system is available. Designate assists by
making the process automatic and by populating the records in each
region's zone.
Telemetry for each region is also deployed, as each region may grow
differently or be used at a different rate. Ceilometer collects each
region's meters from each of the controllers and reports them back to a
central location. This is useful both to the end user and to the
administrator of the OpenStack environment. The end user will find this
method useful, as it makes it possible to determine whether certain
locations are experiencing higher load than others and to take appropriate
action. Administrators also benefit by being able to forecast growth
per region, rather than expanding the capacity of all regions
simultaneously, thereby maximizing the cost-effectiveness of the
multi-site design.
One of the key decisions of running this infrastructure is whether or
not to provide a redundancy model. Two types of redundancy and high
availability models can be implemented in this configuration. The first
type is the availability of central OpenStack components. Keystone can
be made highly available in three central data centers that host the
centralized OpenStack components. This prevents the loss of any one
region from causing a service outage. It also has the added benefit of
being able to run a central storage repository as a primary cache for
distributing content to each of the regions.
The second redundancy type is the edge data center itself. A second data
center in each of the edge regional locations houses a second region near
the first region. This ensures that the application does not suffer
degraded performance in terms of latency and availability.
:ref:`ms-customer-edge` depicts the solution designed to have both a
centralized set of core data centers for OpenStack services and paired edge
data centers:
.. _ms-customer-edge:
.. figure:: figures/Multi-Site_Customer_Edge.png
**Multi-site architecture example**
Geo-redundant load balancing
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A large-scale web application has been designed with cloud principles in
mind. The application is designed to provide service to an application
store on a 24/7 basis. The company has a typical two-tier architecture,
with a web front end servicing customer requests and a NoSQL database back
end storing the information.
Recently, there have been several outages at a number of major public
cloud providers because applications ran out of a single geographical
location. The design therefore should mitigate the chance of a single
site causing an outage for the business.
The solution would consist of the following OpenStack components:
* A firewall, switches, and load balancers on the public-facing network
connections.
* OpenStack controller services running Networking, dashboard, Block
Storage, and Compute locally in each of the three regions. The
Identity, Orchestration, Telemetry, Image, and Object Storage services
can be installed centrally, with nodes in each region providing a
redundant OpenStack controller plane throughout the globe.
* OpenStack Compute nodes running the KVM hypervisor.
* OpenStack Object Storage for serving static objects such as images
can be used to ensure that all images are standardized across all the
regions, and replicated on a regular basis.
* A distributed DNS service available to all regions that allows for
dynamic update of DNS records of deployed instances.
* A geo-redundant load balancing service can be used to service the
requests from the customers based on their origin.
An autoscaling heat template can be used to deploy the application in
the three regions. This template includes:
* Web servers running Apache.
* Appropriate ``user_data`` to populate the central DNS servers upon
instance launch.
* Appropriate Telemetry alarms that maintain state of the application
and allow for handling of region or instance failure.
Another autoscaling Heat template can be used to deploy a distributed
MongoDB shard over the three locations, with the option of storing
required data on a globally available Swift container. Depending on the
usage of and load on the database server, additional shards can be
provisioned according to the thresholds defined in Telemetry.
Two data centers would have been sufficient to meet the basic
requirements, but three regions are selected here to avoid abnormal load
on a single region in the event of a failure.
Orchestration is used because of its built-in functionality for
autoscaling in the event of increased load and for auto-healing in the
event of failure. Additional configuration management tools, such as
Puppet or Chef, could also have been used in this scenario, but were not
chosen since Orchestration had the appropriate built-in hooks into the
OpenStack cloud, whereas the other tools were external and not native to
OpenStack. In addition, external tools were not needed since this
deployment scenario was straightforward.
OpenStack Object Storage is used here to serve as a back end for the
Image service since it is the most suitable solution for a globally
distributed storage solution with its own replication mechanism.
Home-grown solutions could also have been used, including the handling of
replication, but were not chosen because Object Storage is already an
integral part of the infrastructure and a proven solution.
An external load balancing service is used rather than OpenStack LBaaS,
because the OpenStack solution is not redundant and has no awareness of
geo-location.
.. _ms-geo-redundant:
.. figure:: figures/Multi-site_Geo_Redundant_LB.png
**Multi-site geo-redundant architecture**
Location-local service
~~~~~~~~~~~~~~~~~~~~~~
A common use of a multi-site OpenStack deployment is creating a Content
Delivery Network. An application that uses a location-local architecture
requires low network latency and proximity to the user to provide an
optimal user experience and to reduce the cost of bandwidth and transit.
The content resides on sites closer to the customer, instead of a
centralized content store that requires utilizing higher cost
cross-country links.
This architecture includes a geo-location component that places user
requests to the closest possible node. In this scenario, 100% redundancy
of content across every site is a goal rather than a requirement, with
the intent to maximize the amount of content available within a minimum
number of network hops for end users. Despite these differences, the
storage replication configuration has significant overlap with that of a
geo-redundant load balancing use case.
In :ref:`ms-shared-keystone`, the location-aware application utilizing this
multi-site OpenStack installation launches web server or content-serving
instances on the compute cluster in each site. Requests from clients
are first sent to a global services load balancer that determines the location
of the client, then routes the request to the closest OpenStack site where the
application completes the request.
.. _ms-shared-keystone:
.. figure:: figures/Multi-Site_shared_keystone1.png
**Multi-site shared keystone architecture**

View File

@ -0,0 +1,166 @@
==============================
Network-focused cloud examples
==============================
An organization designs a large-scale web application with cloud
principles in mind. The application scales horizontally in a bursting
fashion and generates a high instance count. The application requires an
SSL connection to secure data and must not lose connection state to
individual servers.
The figure below depicts an example design for this workload. In this
example, a hardware load balancer provides SSL offload functionality and
connects to tenant networks in order to reduce address consumption. This
load balancer links to the routing architecture as it services the VIP
for the application. The router and load balancer use the GRE tunnel ID
of the application's tenant network and an IP address within the tenant
subnet but outside of the address pool. This is to ensure that the load
balancer can communicate with the application's HTTP servers without
requiring the consumption of a public IP address.
Because sessions persist until closed, the routing and switching
architecture provides high availability. Switches mesh to each
hypervisor and each other, and also provide an MLAG implementation to
ensure that layer-2 connectivity does not fail. Routers use VRRP and
fully mesh with switches to ensure layer-3 connectivity. Because GRE
provides an overlay network, Networking is present and uses the Open
vSwitch agent in GRE tunnel mode. This ensures all devices can reach all
other devices and that you can create tenant networks for private
addressing of links to the load balancer.
.. figure:: figures/Network_Web_Services1.png
A web service architecture has many options and optional components. Due
to this, it can fit into a large number of other OpenStack designs. A
few key components, however, need to be in place to handle the nature of
most web-scale workloads. You require the following components:
* OpenStack Controller services (Image, Identity, Networking and
supporting services such as MariaDB and RabbitMQ)
* OpenStack Compute running KVM hypervisor
* OpenStack Object Storage
* Orchestration service
* Telemetry service
Beyond the normal Identity, Compute, Image service, and Object Storage
components, we recommend the Orchestration service component to handle
the proper scaling of workloads to adjust to demand. Due to the
requirement for auto-scaling, the design includes the Telemetry service.
Web services tend to be bursty in load, have very defined peak and
valley usage patterns and, as a result, benefit from automatic scaling
of instances based upon traffic. At a network level, a split network
configuration works well with databases residing on private tenant
networks since these do not emit a large quantity of broadcast traffic
and may need to interconnect to some databases for content.
Load balancing
~~~~~~~~~~~~~~
Load balancing spreads requests across multiple instances. This workload
scales well horizontally across large numbers of instances. This enables
instances to run without publicly routed IP addresses and instead to
rely on the load balancer to provide a globally reachable service. Many
of these services do not require direct server return. This aids in
address planning and utilization at scale since only the virtual IP
(VIP) must be public.
Overlay networks
~~~~~~~~~~~~~~~~
The overlay functionality design includes OpenStack Networking in Open
vSwitch GRE tunnel mode. In this case, the layer-3 external routers pair
with VRRP, and switches pair with an implementation of MLAG to ensure
that you do not lose connectivity with the upstream routing
infrastructure.
Performance tuning
~~~~~~~~~~~~~~~~~~
Network level tuning for this workload is minimal. Quality-of-Service
(QoS) applies to these workloads for a middle ground Class Selector
depending on existing policies. It is higher than a best effort queue
but lower than an Expedited Forwarding or Assured Forwarding queue.
Since this type of application generates larger packets with
longer-lived connections, you can optimize bandwidth utilization for
long-duration TCP. Normal bandwidth planning applies here with regard
to benchmarking a session's usage multiplied by the expected number of
concurrent sessions, with overhead.
Network functions
~~~~~~~~~~~~~~~~~
Network functions is a broad category that encompasses workloads
supporting the rest of a system's network. These workloads tend to consist
of large numbers of small packets that are very short lived, such as DNS
queries or SNMP traps. These messages need to arrive quickly and do not
cope well with packet loss, as there can be a very large volume of them. There
are a few extra considerations to take into account for this type of
workload and this can change a configuration all the way to the
hypervisor level. For an application that generates 10 TCP sessions per
user with an average bandwidth of 512 kilobytes per second per flow and
expected user count of ten thousand concurrent users, the expected
bandwidth plan is approximately 4.88 gigabits per second.
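The arithmetic behind that plan is easy to reproduce. Note that the stated
4.88 gigabit figure works out if the 512-kilobit rate is treated as the
aggregate per-user rate; that interpretation is an assumption, so substitute
your own measurements.

.. code-block:: python

   users = 10000                # expected concurrent users
   rate_per_user_kbit = 512     # assumed aggregate kilobits/s per user

   total_bits_per_s = users * rate_per_user_kbit * 1024
   print("{:.2f} Gb/s".format(total_bits_per_s / 2**30))   # ~4.88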
The supporting network for this type of configuration needs to have a
low latency and evenly distributed availability. This workload benefits
from having services local to the consumers of the service. Use a
multi-site approach as well as deploying many copies of the application
to handle load as close as possible to consumers. Since these
applications function independently, they do not warrant running
overlays to interconnect tenant networks. Overlays also have the
drawback of performing poorly with rapid flow setup and may incur too
much overhead with large quantities of small packets and therefore we do
not recommend them.
QoS is desirable for some workloads to ensure delivery. DNS has a major
impact on the load times of other services and needs to be reliable and
provide rapid responses. Configure rules in upstream devices to apply a
higher Class Selector to DNS to ensure faster delivery or a better spot
in queuing algorithms.
Cloud storage
~~~~~~~~~~~~~
Another common use case for OpenStack environments is providing a
cloud-based file storage and sharing service. You might consider this a
storage-focused use case, but its network-side requirements make it a
network-focused use case.
For example, consider a cloud backup application. This workload has two
specific behaviors that impact the network. Because this workload is an
externally-facing service and an internally-replicating application, it
has both :term:`north-south<north-south traffic>` and
:term:`east-west<east-west traffic>` traffic considerations:
north-south traffic
When a user uploads and stores content, that content moves into the
OpenStack installation. When users download this content, the
content moves out from the OpenStack installation. Because this
service operates primarily as a backup, most of the traffic moves
southbound into the environment. In this situation, it benefits you
to configure a network to be asymmetrically downstream because the
traffic that enters the OpenStack installation is greater than the
traffic that leaves the installation.
east-west traffic
Likely to be fully symmetric. Because replication originates from
any node and might target multiple other nodes algorithmically, it
is less likely for this traffic to have a larger volume in any
specific direction. However this traffic might interfere with
north-south traffic.
.. figure:: figures/Network_Cloud_Storage2.png
This application prioritizes the north-south traffic over east-west
traffic: the north-south traffic involves customer-facing data.
The network design in this case is less dependent on availability and
more dependent on being able to handle high bandwidth. As a direct
result, it is beneficial to forgo redundant links in favor of bonding
those connections. This increases available bandwidth. It is also
beneficial to configure all devices in the path, including OpenStack, to
generate and pass jumbo frames.

View File

@ -0,0 +1,42 @@
=================
Specialized cases
=================
.. toctree::
:maxdepth: 2
specialized-multi-hypervisor.rst
specialized-networking.rst
specialized-software-defined-networking.rst
specialized-desktop-as-a-service.rst
specialized-openstack-on-openstack.rst
specialized-hardware.rst
specialized-single-site.rst
specialized-add-region.rst
specialized-scaling-multiple-cells.rst
Although the seven major scenarios outlined in other sections
(compute focused, network focused, storage focused, general
purpose, multi-site, hybrid cloud, and massively scalable) cover most
OpenStack architecture designs, there are a few use cases that do not fit
into these categories. This section discusses these specialized cases and
provides some additional details and design considerations for each use
case:
* :doc:`Specialized networking <specialized-networking>`:
describes running networking-oriented software that may involve reading
packets directly from the wire or participating in routing protocols.
* :doc:`Software-defined networking (SDN)
<specialized-software-defined-networking>`:
describes both running an SDN controller from within OpenStack
as well as participating in a software-defined network.
* :doc:`Desktop-as-a-Service <specialized-desktop-as-a-service>`:
describes running a virtualized desktop environment in a cloud
(:term:`Desktop-as-a-Service`).
This applies to private and public clouds.
* :doc:`OpenStack on OpenStack <specialized-openstack-on-openstack>`:
describes building a multi-tiered cloud by running OpenStack
on top of an OpenStack installation.
* :doc:`Specialized hardware <specialized-hardware>`:
describes the use of specialized hardware devices from within
the OpenStack environment.

View File

@ -0,0 +1,143 @@
==============================
Storage-focused cloud examples
==============================
Storage-focused architecture depends on specific use cases. This section
discusses three example use cases:
* An object store with a RESTful interface
* Compute analytics with parallel file systems
* High performance database
The example below shows a REST interface without a high performance
requirement.
Swift is a highly scalable object store that is part of the OpenStack
project. This diagram explains the example architecture:
.. figure:: figures/Storage_Object.png
The example REST interface, presented as a traditional Object store
running on traditional spindles, does not require a high performance
caching tier.
This example uses the following components:
Network:
* 10 GbE horizontally scalable spine-leaf back-end storage and front-end
network.
Storage hardware:
* 10 storage servers, each with 12 x 4 TB disks, for 480 TB of total raw
space and approximately 160 TB of usable space after three replicas (see
the sizing sketch below).
Proxy:
* 3x proxies
* 2x10 GbE bonded front end
* 2x10 GbE back-end bonds
* Approximately 60 Gb of total bandwidth to the back-end storage
cluster
.. note::
It may be necessary to implement a 3rd-party caching layer for some
applications to achieve suitable performance.
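The usable-space figure above follows directly from the raw capacity and the
replica count, as the following sketch shows. It assumes the default Swift
replica count of three.

.. code-block:: python

   servers = 10
   disks_per_server = 12
   disk_tb = 4
   replicas = 3

   raw_tb = servers * disks_per_server * disk_tb   # 480 TB of raw space
   usable_tb = raw_tb / replicas                   # ~160 TB after replication
   print(raw_tb, round(usable_tb))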
Compute analytics with Data processing service
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Analytics of large data sets are dependent on the performance of the
storage system. Clouds using storage systems such as Hadoop Distributed
File System (HDFS) have inefficiencies which can cause performance
issues.
One potential solution to this problem is the implementation of storage
systems designed for performance. Parallel file systems have previously
filled this need in the HPC space and are suitable for large-scale,
performance-oriented systems.
OpenStack has integration with Hadoop to manage the Hadoop cluster
within the cloud. The following diagram shows an OpenStack store with a
high performance requirement:
.. figure:: figures/Storage_Hadoop3.png
The hardware requirements and configuration are similar to those of the
high performance database example below. In this case, the architecture
uses Ceph's Swift-compatible REST interface, together with features that
allow a caching pool to accelerate the presented pool.
High performance database with Database service
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Databases are a common workload that benefits from high performance
storage back ends. Although enterprise storage is not a requirement,
many environments have existing storage that an OpenStack cloud can use
as a back end. You can create a storage pool to provide block devices with
OpenStack Block Storage for instances, as well as object interfaces. In
this example, the database I/O requirements are high and demand storage
presented from a fast SSD pool.
A storage system presents a LUN backed by a set of SSDs using a
traditional storage array with OpenStack Block Storage integration or a
storage platform such as Ceph or Gluster.
This system can provide additional performance. For example, in the
database example below, a portion of the SSD pool can act as a block
device to the Database server. In the high performance analytics
example, the inline SSD cache layer accelerates the REST interface.
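One way to expose the SSD pool to database instances is a dedicated Block
Storage volume type, sketched below with python-cinderclient. The back-end
name, credentials, and volume size are assumptions and must match the
``cinder.conf`` of your deployment.

.. code-block:: python

   from cinderclient import client

   USERNAME = "admin"                       # placeholder credentials
   PASSWORD = "secret"
   PROJECT_ID = "demo"
   AUTH_URL = "http://controller:5000/v2.0"

   cinder = client.Client("2", USERNAME, PASSWORD, PROJECT_ID, AUTH_URL)

   # Tie a new "ssd" volume type to the fast back end defined in cinder.conf.
   ssd_type = cinder.volume_types.create("ssd")
   ssd_type.set_keys({"volume_backend_name": "ssd-pool"})

   # Database volumes are then requested with that type.
   cinder.volumes.create(size=500, name="db-data", volume_type="ssd")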
.. figure:: figures/Storage_Database_+_Object5.png
In this example, Ceph presents a Swift-compatible REST interface, as
well as a block level storage from a distributed storage cluster. It is
highly flexible and has features that enable reduced cost of operations,
such as self-healing and auto-balancing. Using erasure coded pools is a
suitable way of maximizing the amount of usable space.
.. note::
There are special considerations around erasure coded pools. For
example, they have higher computational requirements and there are
limitations on the operations allowed on an object; for instance, erasure
coded pools do not support partial writes.
Using Ceph as an applicable example, a potential architecture would have
the following requirements:
Network:
* 10 GbE horizontally scalable spine leaf back-end storage and
front-end network
Storage hardware:
* 5 storage servers for the caching layer, each with 24 x 1 TB SSDs
* 10 storage servers, each with 12 x 4 TB disks, for 480 TB of total raw
space and approximately 160 TB of usable space after three replicas
REST proxy:
* 3x proxies
* 2x10 GbE bonded front end
* 2x10 GbE back-end bonds
* Approximately 60 Gb of total bandwidth to the back-end storage
cluster
Using an SSD cache layer, you can present block devices directly to
hypervisors or instances. The REST interface can also use the SSD cache
systems as an inline cache.

View File

@ -0,0 +1,14 @@
===========================
Cloud architecture examples
===========================
.. toctree::
:maxdepth: 2
arch-examples-general.rst
arch-examples-compute.rst
arch-examples-storage.rst
arch-examples-network.rst
arch-examples-multi-site.rst
arch-examples-hybrid.rst
arch-examples-specialized.rst

View File

@ -1,9 +0,0 @@
=====================
Example architectures
=====================
.. toctree::
:maxdepth: 2

24 binary image files (the figures referenced above, 22 KiB to 79 KiB each) added; contents not shown.

View File

@ -32,7 +32,7 @@ Contents
high-availability.rst
security-requirements.rst
legal-requirements.rst
example-architectures.rst
arch-examples.rst
common/app_support.rst
common/glossary.rst

View File

@ -0,0 +1,5 @@
=====================
Adding another region
=====================
.. TODO

View File

@ -0,0 +1,47 @@
====================
Desktop-as-a-Service
====================
Virtual Desktop Infrastructure (VDI) is a service that hosts
user desktop environments on remote servers. This application
is very sensitive to network latency and requires a high
performance compute environment. Traditionally these types of
services have not used cloud environments because few clouds
support such a demanding workload for user-facing applications.
As cloud environments become more robust, vendors are starting
to offer virtual desktops in the cloud. OpenStack may soon
provide the infrastructure for these types of deployments.
Challenges
~~~~~~~~~~
Designing an infrastructure that is suitable to host virtual
desktops is a very different task from designing one for most virtual
workloads. For example, the design must consider:
* Boot storms, when a high volume of logins occur in a short period of time
* The performance of the applications running on virtual desktops
* Operating systems and their compatibility with the OpenStack hypervisor
Broker
~~~~~~
The connection broker determines which remote desktop host
each user can access. Medium- and large-scale environments require a
broker, since this service represents a central component of the
architecture. The broker is a complete management product, and enables
automated deployment and provisioning of remote desktop hosts.
Possible solutions
~~~~~~~~~~~~~~~~~~
There are a number of commercial products currently available that
provide a broker solution. However, no native OpenStack projects
provide broker services.
Not providing a broker is also an option, but managing this manually
would not suffice for a large scale, enterprise solution.
Diagram
~~~~~~~
.. figure:: figures/Specialized_VDI1.png

View File

@ -0,0 +1,43 @@
====================
Specialized hardware
====================
Certain workloads require specialized hardware devices that
have significant virtualization or sharing challenges.
Applications such as load balancers, highly parallel brute
force computing, and direct to wire networking may need
capabilities that basic OpenStack components do not provide.
Challenges
~~~~~~~~~~
Some applications need access to hardware devices to either
improve performance or provide capabilities beyond those of
virtual CPU, RAM, network, or storage. These can be a shared
resource, such as a cryptography processor, or a dedicated
resource, such as a Graphics Processing Unit (GPU). OpenStack can
provide some of these, while others may need extra work.
Solutions
~~~~~~~~~
To provide cryptography offloading to a set of instances,
you can use Image service configuration options.
For example, assign the cryptography chip to a device node in the guest.
The OpenStack Command Line Reference contains further information on
configuring this solution in the section `Image service property keys
<http://docs.openstack.org/cli-reference/glance.html#image-service-property-keys>`_.
A challenge, however, is that this option allows all guests using the
configured images to access the hypervisor cryptography device.
If you require direct access to a specific device, PCI pass-through
enables you to dedicate the device to a single instance per hypervisor.
You must define a flavor that specifically requests the PCI device in
order to properly schedule instances.
More information regarding PCI pass-through, including instructions for
implementing and using it, is available at
`https://wiki.openstack.org/wiki/Pci_passthrough <https://wiki.openstack.org/
wiki/Pci_passthrough#How_to_check_PCI_status_with_PCI_api_patches>`_.
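A minimal sketch of such a flavor with python-novaclient follows. It assumes
a PCI alias (here ``gpu1``) has already been defined in ``nova.conf`` on the
controller and compute nodes; the credentials, flavor name, and sizes are
illustrative placeholders.

.. code-block:: python

   from novaclient import client

   USERNAME = "admin"                       # placeholder credentials
   PASSWORD = "secret"
   PROJECT = "demo"
   AUTH_URL = "http://controller:5000/v2.0"

   nova = client.Client("2", USERNAME, PASSWORD, PROJECT, AUTH_URL)

   # Request one device from the "gpu1" PCI alias for every instance that
   # uses this flavor.
   flavor = nova.flavors.create(name="m1.gpu", ram=8192, vcpus=4, disk=80)
   flavor.set_keys({"pci_passthrough:alias": "gpu1:1"})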
.. figure:: figures/Specialized_Hardware2.png
:width: 100%

View File

@ -0,0 +1,78 @@
========================
Multi-hypervisor example
========================
A financial company requires that its applications be migrated
from a traditional, virtualized environment to an API-driven,
orchestrated environment. The new environment needs
multiple hypervisors since many of the company's applications
have strict hypervisor requirements.
Currently, the company's vSphere environment runs 20 VMware
ESXi hypervisors. These hypervisors support 300 instances of
various sizes. Approximately 50 of these instances must run
on ESXi. The remaining 250 or so have more flexible requirements.
The financial company decides to manage the
overall system with a common OpenStack platform.
.. figure:: figures/Compute_NSX.png
:width: 100%
Architecture planning teams decided to run a host aggregate
containing KVM hypervisors for the general purpose instances.
A separate host aggregate targets instances requiring ESXi.
Images in the OpenStack Image service have particular
hypervisor metadata attached. When a user requests a
certain image, the instance spawns on the relevant aggregate.
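A sketch of one common way to wire this up follows. It assumes the
``AggregateImagePropertiesIsolation`` (or ``ImagePropertiesFilter``)
scheduler filter is enabled, and all identifiers, endpoints, and tokens are
illustrative placeholders rather than details from this deployment.

.. code-block:: python

   from glanceclient import client as glance_client
   from novaclient import client as nova_client

   GLANCE_URL = "http://controller:9292"    # placeholder endpoint
   TOKEN = "AUTH_TOKEN"                     # placeholder token
   IMAGE_ID = "IMAGE_UUID"                  # placeholder image ID

   glance = glance_client.Client("2", endpoint=GLANCE_URL, token=TOKEN)
   nova = nova_client.Client("2", "admin", "secret", "demo",
                             "http://controller:5000/v2.0")

   # Mark the image as requiring the VMware hypervisor.
   glance.images.update(IMAGE_ID, hypervisor_type="vmware")

   # Mirror the property on the ESXi host aggregate so the scheduler places
   # instances booted from this image onto vSphere-managed hosts.
   esxi_aggregate = nova.aggregates.create("esxi-hosts", "nova")
   nova.aggregates.set_metadata(esxi_aggregate, {"hypervisor_type": "vmware"})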
Images for ESXi use the VMDK format. You can convert
QEMU disk images to VMDK or VMFS flat disks, which can be
thin, thick, zeroed-thick, or eager-zeroed-thick.
After exporting a VMFS thin disk from VMFS to the
OpenStack Image service (a non-VMFS location), it becomes a
preallocated flat disk. This impacts the transfer time from the
OpenStack Image service to the data store since transfers require
moving the full preallocated flat disk rather than the thin disk.
The VMware host aggregate compute nodes communicate with
vCenter rather than spawning directly on a hypervisor.
The vCenter then requests scheduling for the instance to run on
an ESXi hypervisor.
This functionality requires that VMware Distributed Resource
Scheduler (DRS) is enabled on a cluster and set to **Fully Automated**.
vSphere requires shared storage because DRS uses vMotion,
a service that relies on shared storage.
This solution to the company's migration uses shared storage
to provide Block Storage capabilities to the KVM instances while
also providing vSphere storage. The new environment provides this
storage functionality using a dedicated data network. The
compute hosts should have dedicated NICs to support the
dedicated data network. vSphere supports OpenStack Block Storage. This
support gives storage from a VMFS datastore to an instance. For the
financial company, Block Storage in their new architecture supports
both hypervisors.
OpenStack Networking provides network connectivity in this new
architecture, with the VMware NSX plug-in driver configured. Legacy
networking (nova-network) would support both hypervisors in this new
architecture example, but has limitations. Specifically, vSphere
with legacy networking does not support security groups. The new
architecture uses VMware NSX as a part of the design. When users launch an
instance within either of the host aggregates, VMware NSX ensures the
instance attaches to the appropriate network overlay-based logical networks.
The architecture planning teams also consider OpenStack Compute
integration. When running vSphere in an OpenStack environment,
nova-compute communicates with vCenter, which appears as a single large
hypervisor representing the entire ESXi cluster. Multiple nova-compute
instances can represent multiple ESXi clusters and can connect to
multiple vCenter servers. If the process running nova-compute crashes,
the connection to its vCenter server, and to any ESXi clusters behind it,
is severed, and it is not possible to provision further instances on that
vCenter, even if high availability is enabled. You must monitor the
nova-compute service connected to vSphere carefully for any disruptions
as a result of this failure point.

View File

@ -0,0 +1,32 @@
==============================
Specialized networking example
==============================
Some applications that interact with a network require
specialized connectivity. Applications such as a looking glass
require the ability to connect to a BGP peer, and route participant
applications may need to join a network at the layer-2 level.
Challenges
~~~~~~~~~~
Connecting specialized network applications to their required
resources alters the design of an OpenStack installation.
Installations that rely on overlay networks are unable to
support a routing participant, and may also block layer-2 listeners.
Possible solutions
~~~~~~~~~~~~~~~~~~
Deploying an OpenStack installation using OpenStack Networking with a
provider network allows direct layer-2 connectivity to an
upstream networking device.
This design provides the layer-2 connectivity required to communicate
via the Intermediate System to Intermediate System (IS-IS) protocol or
to pass packets controlled by an OpenFlow controller.
Using the Modular Layer 2 (ML2) plug-in with an agent such as
:term:`Open vSwitch` allows a private connection through a VLAN
directly to a specific port in a layer-3 device.
This allows a BGP point-to-point link to join the autonomous system.
Avoid using layer-3 plug-ins as they divide the broadcast
domain and prevent router adjacencies from forming.

View File

@ -0,0 +1,70 @@
======================
OpenStack on OpenStack
======================
In some cases, users may run OpenStack nested on top
of another OpenStack cloud. This scenario describes how to
manage and provision complete OpenStack environments on instances
supported by hypervisors and servers, which an underlying OpenStack
environment controls.
Public cloud providers can use this technique to manage the
upgrade and maintenance process on complete OpenStack environments.
Developers and those testing OpenStack can also use this
technique to provision their own OpenStack environments on
available OpenStack Compute resources, whether public or private.
Challenges
~~~~~~~~~~
The network aspect of deploying a nested cloud is the most
complicated aspect of this architecture.
You must expose VLANs to the physical ports on which the underlying
cloud runs because the bare metal cloud owns all the hardware.
You must also expose them to the nested levels.
Alternatively, you can use the network overlay technologies on the
OpenStack environment running on the host OpenStack environment to
provide the required software defined networking for the deployment.
Hypervisor
~~~~~~~~~~
In this example architecture, consider which
approach you should take to provide a nested
hypervisor in OpenStack. This decision influences which
operating systems you use for the nested OpenStack deployments.
Possible solutions: deployment
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Deployment of a full stack can be challenging, but you can mitigate
this difficulty by creating a Heat template, or using a configuration
management system, to deploy the entire stack. After creating
the Heat template, you can automate the deployment of additional stacks.
The OpenStack-on-OpenStack project (:term:`TripleO`)
addresses this issue. Currently, however, the project does
not completely cover nested stacks. For more information, see
https://wiki.openstack.org/wiki/TripleO.
Possible solutions: hypervisor
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In the case of running TripleO, the underlying OpenStack
cloud deploys the Compute nodes as bare-metal. You then deploy
OpenStack on these Compute bare-metal servers with the
appropriate hypervisor, such as KVM.
In the case of running smaller OpenStack clouds for testing
purposes, where performance is not a critical factor, you can use
QEMU instead. It is also possible to run a KVM hypervisor in an instance
(see http://davejingtian.org/2014/03/30/nested-kvm-just-for-fun/),
though this is not a supported configuration, and could be a
complex solution for such a use case.
Diagram
~~~~~~~
.. figure:: figures/Specialized_OOO.png
:width: 100%

View File

@ -0,0 +1,5 @@
======================
Scaling multiple cells
======================
.. TODO

View File

@ -0,0 +1,5 @@
==================================================
Single site architecture with OpenStack Networking
==================================================
.. TODO

View File

@ -0,0 +1,46 @@
===========================
Software-defined networking
===========================
Software-defined networking (SDN) is the separation of the data
plane and control plane. SDN is a popular method of
managing and controlling packet flows within networks.
SDN uses overlays or directly controlled layer-2 devices to
determine flow paths, and as such presents challenges to a
cloud environment. Some designers may wish to run their
controllers within an OpenStack installation. Others may wish
to have their installations participate in an SDN-controlled network.
Challenges
~~~~~~~~~~
SDN is a relatively new concept that is not yet standardized,
so SDN systems come in a variety of different implementations.
Because of this, a truly prescriptive architecture is not feasible.
Instead, examine the differences between an existing and a planned
OpenStack design and determine where potential conflicts and gaps exist.
Possible solutions
~~~~~~~~~~~~~~~~~~
If an SDN implementation requires layer-2 access because it
directly manipulates switches, we do not recommend running an
overlay network or a layer-3 agent.
If the controller resides within an OpenStack installation,
it may be necessary to build an ML2 plug-in and schedule the
controller instances to connect to tenant VLANs so that they can
talk directly to the switch hardware.
Alternatively, depending on the external device support,
use a tunnel that terminates at the switch hardware itself.
Diagram
~~~~~~~
OpenStack hosted SDN controller:
.. figure:: figures/Specialized_SDN_hosted.png
OpenStack participating in an SDN controller network:
.. figure:: figures/Specialized_SDN_external.png