[arch-guide-draft] Update use cases chapter

1. Reorganise chapter structure
2. Migrate arch examples

Change-Id: Id0f4485eaef02a62cde06b84697252183188431e
Implements: blueprint arch-guide-restructure
daz 2016-08-19 14:54:13 +10:00 committed by Joseph Robinson
parent be0e68542e
commit f4bb062f89
27 changed files with 531 additions and 422 deletions


@ -1,126 +0,0 @@
=============================
Compute-focused cloud example
=============================
The Conseil Européen pour la Recherche Nucléaire (CERN), also known as
the European Organization for Nuclear Research, provides particle
accelerators and other infrastructure for high-energy physics research.
As of 2011, CERN operated the following two compute centers in Europe,
with plans to add a third:

+-----------------------+------------------------+
| Data center           | Approximate capacity   |
+=======================+========================+
| Geneva, Switzerland   | - 3.5 megawatts        |
|                       | - 91000 cores          |
|                       | - 120 PB HDD           |
|                       | - 100 PB Tape          |
|                       | - 310 TB Memory        |
+-----------------------+------------------------+
| Budapest, Hungary     | - 2.5 megawatts        |
|                       | - 20000 cores          |
|                       | - 6 PB HDD             |
+-----------------------+------------------------+
To support a growing number of compute-heavy users of experiments
related to the Large Hadron Collider (LHC), CERN ultimately elected to
deploy an OpenStack cloud using Scientific Linux and RDO. This effort
aimed to simplify the management of the center's compute resources with
a view to doubling compute capacity through the addition of a data
center in 2013 while maintaining the same levels of compute staff.
The CERN solution uses :term:`cells <cell>` for segregation of compute
resources and for transparently scaling between different data centers.
This decision meant trading off support for security groups and live
migration. In addition, they must manually replicate some details, like
flavors, across cells. In spite of these drawbacks cells provide the
required scale while exposing a single public API endpoint to users.
CERN created a compute cell for each of the two original data centers
and created a third when it added a new data center in 2013. Each cell
contains three availability zones to further segregate compute resources
and at least three RabbitMQ message brokers configured for clustering
with mirrored queues for high availability.
The API cell, which resides behind a HAProxy load balancer, is in the
data center in Switzerland and directs API calls to compute cells using
a customized variation of the cell scheduler. The customizations allow
certain workloads to route to a specific data center or all data
centers, with cell RAM availability determining cell selection in the
latter case.
.. figure:: figures/Generic_CERN_Example.png
There is also some customization of the filter scheduler that handles
placement within the cells:
ImagePropertiesFilter
Provides special handling depending on the guest operating system in
use (Linux-based or Windows-based).
ProjectsToAggregateFilter
Provides special handling depending on which project the instance is
associated with.
default_schedule_zones
Allows the selection of multiple default availability zones, rather
than a single default.
A central database team manages the MySQL database server in each cell
in an active/passive configuration with a NetApp storage back end.
Backups run every 6 hours.
Network architecture
~~~~~~~~~~~~~~~~~~~~
To integrate with existing networking infrastructure, CERN made
customizations to legacy networking (nova-network). This was in the form
of a driver to integrate with CERN's existing database for tracking MAC
and IP address assignments.
The driver facilitates selection of a MAC address and IP for new
instances based on the compute node where the scheduler places the
instance.
The driver considers the compute node where the scheduler placed an
instance and selects a MAC address and IP from the pre-registered list
associated with that node in the database. The database updates to
reflect the address assignment to that instance.
Storage architecture
~~~~~~~~~~~~~~~~~~~~
CERN deploys the OpenStack Image service in the API cell and configures
it to expose version 1 (V1) of the API. This also requires the image
registry. The storage back end in use is a 3 PB Ceph cluster.
CERN maintains a small set of Scientific Linux 5 and 6 images onto which
orchestration tools can place applications. Puppet manages instance
configuration and customization.
Monitoring
~~~~~~~~~~
CERN does not require direct billing, but uses the Telemetry service to
perform metering for the purposes of adjusting project quotas. CERN uses
a sharded, replicated MongoDB back end. To spread API load, CERN
deploys instances of the nova-api service within the child cells for
Telemetry to query against. This also requires the configuration of
supporting services such as keystone, glance-api, and glance-registry in
the child cells.
.. figure:: figures/Generic_CERN_Architecture.png
Additional monitoring tools in use include
`Flume <http://flume.apache.org/>`_,
`Elasticsearch <http://www.elasticsearch.org/>`_,
`Kibana <http://www.elasticsearch.org/overview/kibana/>`_, and the
CERN-developed `Lemon <http://lemon.web.cern.ch/lemon/index.shtml>`_
project.


@ -1,85 +0,0 @@
=====================
General cloud example
=====================
An online classified advertising company wants to run web applications
consisting of Tomcat, Nginx and MariaDB in a private cloud. To be able
to meet policy requirements, the cloud infrastructure will run in their
own data center. The company has predictable load requirements, but
requires scaling to cope with nightly increases in demand. Their current
environment does not have the flexibility to align with their goal of
running an open source API environment. The current environment consists
of the following:
* Between 120 and 140 installations of Nginx and Tomcat, each with 2
vCPUs and 4 GB of RAM
* A three-node MariaDB and Galera cluster, each with 4 vCPUs and 8 GB
RAM
The company runs hardware load balancers and multiple web applications
serving their websites, and orchestrates environments using combinations
of scripts and Puppet. The website generates large amounts of log data
daily that requires archiving.
The solution would consist of the following OpenStack components:
* A firewall, switches and load balancers on the public-facing network
connections.
* OpenStack Controller service running Image, Identity, Networking,
combined with support services such as MariaDB and RabbitMQ,
configured for high availability on at least three controller nodes.
* OpenStack Compute nodes running the KVM hypervisor.
* OpenStack Block Storage for use by compute instances, requiring
persistent storage (such as databases for dynamic sites).
* OpenStack Object Storage for serving static objects (such as images).
.. figure:: figures/General_Architecture3.png
Running up to 140 web instances and the small number of MariaDB
instances requires 292 vCPUs available, as well as 584 GB RAM. On a
typical 1U server using dual-socket hex-core Intel CPUs with
Hyperthreading, and assuming 2:1 CPU overcommit ratio, this would
require 8 OpenStack Compute nodes.
The web application instances run from local storage on each of the
OpenStack Compute nodes. The web application instances are stateless,
meaning that any of the instances can fail and the application will
continue to function.
MariaDB server instances store their data on shared enterprise storage,
such as NetApp or SolidFire devices. If a MariaDB instance fails,
storage would be expected to be re-attached to another instance and
rejoined to the Galera cluster.
Logs from the web application servers are shipped to OpenStack Object
Storage for processing and archiving.
Additional capabilities can be realized by moving static web content to
be served from OpenStack Object Storage containers, and backing the
OpenStack Image service with OpenStack Object Storage.
.. note::
Increased use of OpenStack Object Storage means network bandwidth
needs to be taken into consideration. Running OpenStack Object
Storage with network connections offering 10 GbE or better
connectivity is advised.
Leveraging the Orchestration and Telemetry services is also an option
for providing auto-scaling, orchestrated web application
environments. Defining the web applications in a
:term:`Heat Orchestration Template (HOT)`
negates the reliance on the current scripted Puppet
solution.
OpenStack Networking can be used to control hardware load balancers
through the use of plug-ins and the Networking API. This allows users to
control hardware load balancer pools and instances as members in these
pools, but their use in production environments must be carefully
weighed against current stability.


@ -1,154 +0,0 @@
=====================
Hybrid cloud examples
=====================
Hybrid cloud environments are designed for these use cases:
* Bursting workloads from private to public OpenStack clouds
* Bursting workloads from private to public non-OpenStack clouds
* High availability across clouds (for technical diversity)
This chapter provides examples of environments that address
each of these use cases.
Bursting to a public OpenStack cloud
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Company A's data center is running low on capacity.
It is not possible to expand the data center in the foreseeable future.
In order to accommodate the continuously growing need for
development resources in the organization,
Company A decides to use resources in the public cloud.
Company A has an established data center with a substantial amount
of hardware. Migrating the workloads to a public cloud is not feasible.
The company has an internal cloud management platform that directs
requests to the appropriate cloud, depending on the local capacity.
This is a custom in-house application written for this specific purpose.
This solution is depicted in the figure below:
.. figure:: figures/Multi-Cloud_Priv-Pub3.png
:width: 100%
This example shows two clouds with a Cloud Management
Platform (CMP) connecting them. This guide does not
discuss a specific CMP, but describes how the Orchestration and
Telemetry services handle, manage, and control workloads.
The private OpenStack cloud has at least one controller and at least
one compute node. It includes metering using the Telemetry service.
The Telemetry service captures the load increase and the CMP
processes the information. If there is available capacity,
the CMP uses the OpenStack API to call the Orchestration service.
This creates instances on the private cloud in response to user requests.
When capacity is not available on the private cloud, the CMP issues
a request to the Orchestration service API of the public cloud.
This creates the instance on the public cloud.
In this example, Company A does not direct the deployments to an
external public cloud due to concerns regarding resource control,
security, and increased operational expense.
Bursting to a public non-OpenStack cloud
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The second example examines bursting workloads from the private cloud
into a non-OpenStack public cloud using Amazon Web Services (AWS)
to take advantage of additional capacity and to scale applications.
The following diagram demonstrates an OpenStack-to-AWS hybrid cloud:
.. figure:: figures/Multi-Cloud_Priv-AWS4.png
:width: 100%
Company B states that its developers are already using AWS
and do not want to change to a different provider.
If the CMP is capable of connecting to an external cloud
provider with an appropriate API, the workflow process remains
the same as the previous scenario.
The actions the CMP takes, such as monitoring loads and
creating new instances, stay the same.
However, the CMP performs actions in the public cloud
using applicable API calls.
If the public cloud is AWS, the CMP would use the
EC2 API to create a new instance and assign an Elastic IP.
It can then add that IP to HAProxy in the private cloud.
The CMP can also reference AWS-specific
tools such as CloudWatch and CloudFormation.
Several open source tool kits for building CMPs are
available and can handle this kind of translation.
Examples include ManageIQ, jClouds, and JumpGate.
High availability and disaster recovery
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Company C requires their local data center to be able to
recover from failure. Some of the workloads currently in
use are running on their private OpenStack cloud.
Protecting the data involves Block Storage, Object Storage,
and a database. The architecture supports the failure of
large components of the system while ensuring that the
system continues to deliver services.
While the services remain available to users, the failed
components are restored in the background based on standard
best practice data replication policies.
To achieve these objectives, Company C replicates data to
a second cloud in a geographically distant location.
The following diagram describes this system:
.. figure:: figures/Multi-Cloud_failover2.png
:width: 100%
This example includes two private OpenStack clouds connected with a CMP.
The source cloud, OpenStack Cloud 1, includes a controller and
at least one instance running MySQL. It also includes at least
one Block Storage volume and one Object Storage volume.
This means that data is available to the users at all times.
The details of the method for protecting each of these sources
of data differ.
Object Storage relies on the replication capabilities of
the Object Storage provider.
Company C enables OpenStack Object Storage so that it creates
geographically separated replicas that take advantage of this feature.
The company configures storage so that at least one replica
exists in each cloud. In order to make this work, the company
configures a single array spanning both clouds with OpenStack Identity.
Using Federated Identity, the array talks to both clouds, communicating
with OpenStack Object Storage through the Swift proxy.
For Block Storage, the replication is a little more difficult,
and involves tools outside of OpenStack itself.
The OpenStack Block Storage volume is not set as the drive itself
but as a logical object that points to a physical back end.
Disaster recovery is configured for Block Storage for
synchronous backup for the highest level of data protection,
but asynchronous backup could have been set as an alternative
that is not as latency sensitive.
For asynchronous backup, the Block Storage API makes it possible
to export the data and also the metadata of a particular volume,
so that it can be moved and replicated elsewhere.
More information can be found here:
https://blueprints.launchpad.net/cinder/+spec/cinder-backup-volume-metadata-support.
The synchronous backups create an identical volume in both
clouds and choose the appropriate flavor so that each cloud
has an identical back end. This is done by creating volumes
through the CMP. After this is configured, a solution
involving DRBD synchronizes the physical drives.
The database component is backed up using synchronous backups.
MySQL does not support geographically diverse replication,
so disaster recovery is provided by replicating the file itself.
As it is not possible to use Object Storage as the back end of
a database like MySQL, Swift replication is not an option.
Company C decides not to store the data on another geo-tiered
storage system, such as Ceph, as Block Storage.
This would have given another layer of protection.
Another option would have been to store the database on an OpenStack
Block Storage volume and back it up like any other Block Storage volume.


@ -1,14 +0,0 @@
===========================
Cloud architecture examples
===========================
.. toctree::
:maxdepth: 2
arch-examples-general.rst
arch-examples-compute.rst
arch-examples-storage.rst
arch-examples-network.rst
arch-examples-multi-site.rst
arch-examples-hybrid.rst
arch-examples-specialized.rst



@ -4,3 +4,13 @@
Use cases
=========
.. toctree::
:maxdepth: 2
use-cases/use-case-development
use-cases/use-case-general-compute
use-cases/use-case-web-scale
use-cases/use-case-public
use-cases/use-case-storage
use-cases/use-case-multisite
use-cases/use-case-nfv


@ -0,0 +1,17 @@
.. _development-cloud:
=================
Development cloud
=================
Stakeholder
~~~~~~~~~~~
User stories
~~~~~~~~~~~~
Design model
~~~~~~~~~~~~
Component block diagram
~~~~~~~~~~~~~~~~~~~~~~~


@ -0,0 +1,385 @@
.. _general-compute-cloud:
=====================
General compute cloud
=====================
Design model
~~~~~~~~~~~~
Hybrid cloud environments are designed for these use cases:
* Bursting workloads from private to public OpenStack clouds
* Bursting workloads from private to public non-OpenStack clouds
* High availability across clouds (for technical diversity)
This section provides examples of environments that address
each of these use cases.
Component block diagram
~~~~~~~~~~~~~~~~~~~~~~~
Stakeholder
~~~~~~~~~~~
User stories
~~~~~~~~~~~~
General cloud example
---------------------
An online classified advertising company wants to run web applications
consisting of Tomcat, Nginx and MariaDB in a private cloud. To meet the
policy requirements, the cloud infrastructure will run in their
own data center. The company has predictable load requirements, but
requires scaling to cope with nightly increases in demand. Their current
environment does not have the flexibility to align with their goal of
running an open source API environment. The current environment consists
of the following:
* Between 120 and 140 installations of Nginx and Tomcat, each with 2
vCPUs and 4 GB of RAM
* A three-node MariaDB and Galera cluster, each with 4 vCPUs and 8 GB
RAM
The company runs hardware load balancers and multiple web applications
serving their websites, and orchestrates environments using combinations
of scripts and Puppet. The website generates large amounts of log data
daily that requires archiving.
The solution would consist of the following OpenStack components:
* A firewall, switches and load balancers on the public-facing network
connections.
* OpenStack Controller service running Image, Identity, Networking,
combined with support services such as MariaDB and RabbitMQ,
configured for high availability on at least three controller nodes.
* OpenStack Compute nodes running the KVM hypervisor.
* OpenStack Block Storage for use by compute instances, requiring
persistent storage (such as databases for dynamic sites).
* OpenStack Object Storage for serving static objects (such as images).
.. figure:: ../figures/General_Architecture3.png
Running up to 140 web instances and the small number of MariaDB
instances requires 292 vCPUs available, as well as 584 GB RAM. On a
typical 1U server using dual-socket hex-core Intel CPUs with
Hyperthreading, and assuming 2:1 CPU overcommit ratio, this would
require 8 OpenStack Compute nodes.
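The totals above follow directly from the per-instance figures; the short
calculation below reproduces the vCPU and RAM requirements (the figure of
eight compute nodes additionally allows some headroom over the raw CPU
arithmetic):

.. code-block:: python

   # Sketch of the sizing arithmetic used above.
   web_instances, db_instances = 140, 3

   total_vcpus = web_instances * 2 + db_instances * 4   # 292 vCPUs
   total_ram_gb = web_instances * 4 + db_instances * 8  # 584 GB

   # A dual-socket hex-core server with hyperthreading exposes 24 threads;
   # a 2:1 CPU overcommit ratio makes 48 vCPUs schedulable per node.
   vcpus_per_node = 2 * 6 * 2 * 2

   print(total_vcpus, total_ram_gb, vcpus_per_node)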
The web application instances run from local storage on each of the
OpenStack Compute nodes. The web application instances are stateless,
meaning that any of the instances can fail and the application will
continue to function.
MariaDB server instances store their data on shared enterprise storage,
such as NetApp or SolidFire devices. If a MariaDB instance fails,
storage would be expected to be re-attached to another instance and
rejoined to the Galera cluster.
Logs from the web application servers are shipped to OpenStack Object
Storage for processing and archiving.
Additional capabilities can be realized by moving static web content to
be served from OpenStack Object Storage containers, and backing the
OpenStack Image service with OpenStack Object Storage.
.. note::
Increased use of OpenStack Object Storage means network bandwidth
needs to be taken into consideration. Running OpenStack Object
Storage with network connections offering 10 GbE or better
connectivity is advised.
Leveraging the Orchestration and Telemetry services is also an option
for providing auto-scaling, orchestrated web application
environments. Defining the web applications in a
:term:`Heat Orchestration Template (HOT)`
negates the reliance on the current scripted Puppet
solution.
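As a rough illustration of that approach, the following sketch uses the
openstacksdk cloud layer to launch the web tier from a HOT template. The
cloud name, template file, and parameter are placeholders rather than
details taken from this example environment.

.. code-block:: python

   import openstack

   # Named cloud from clouds.yaml; 'private-cloud' is a placeholder.
   conn = openstack.connect(cloud='private-cloud')

   # Launch the web tier from a HOT template. The template file and its
   # 'instance_count' parameter are hypothetical.
   stack = conn.create_stack(
       'web-tier',
       template_file='webapp.yaml',
       rollback=True,
       wait=True,
       instance_count='120',
   )
   print(stack.id, stack.status)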
OpenStack Networking can be used to control hardware load balancers
through the use of plug-ins and the Networking API. This allows users to
control hardware load balancer pools and instances as members in these
pools, but their use in production environments must be carefully
weighed against current stability.
Compute-focused cloud example
-----------------------------
The Conseil Européen pour la Recherche Nucléaire (CERN), also known as
the European Organization for Nuclear Research, provides particle
accelerators and other infrastructure for high-energy physics research.
As of 2011, CERN operated the following two compute centers in Europe,
with plans to add a third:

+-----------------------+------------------------+
| Data center           | Approximate capacity   |
+=======================+========================+
| Geneva, Switzerland   | - 3.5 megawatts        |
|                       | - 91000 cores          |
|                       | - 120 PB HDD           |
|                       | - 100 PB Tape          |
|                       | - 310 TB Memory        |
+-----------------------+------------------------+
| Budapest, Hungary     | - 2.5 megawatts        |
|                       | - 20000 cores          |
|                       | - 6 PB HDD             |
+-----------------------+------------------------+
To support a growing number of compute-heavy users of experiments
related to the Large Hadron Collider (LHC), CERN ultimately elected to
deploy an OpenStack cloud using Scientific Linux and RDO. This effort
aimed to simplify the management of the center's compute resources with
a view to doubling compute capacity through the addition of a data
center in 2013 while maintaining the same levels of compute staff.
The CERN solution uses :term:`cells <cell>` for segregation of compute
resources and for transparently scaling between different data centers.
This decision meant trading off support for security groups and live
migration. In addition, they must manually replicate some details, like
flavors, across cells. In spite of these drawbacks cells provide the
required scale while exposing a single public API endpoint to users.
CERN created a compute cell for each of the two original data centers
and created a third when it added a new data center in 2013. Each cell
contains three availability zones to further segregate compute resources
and at least three RabbitMQ message brokers configured for clustering
with mirrored queues for high availability.
The API cell, which resides behind a HAProxy load balancer, is in the
data center in Switzerland and directs API calls to compute cells using
a customized variation of the cell scheduler. The customizations allow
certain workloads to route to a specific data center or all data
centers, with cell RAM availability determining cell selection in the
latter case.
.. figure:: ../figures/Generic_CERN_Example.png
There is also some customization of the filter scheduler that handles
placement within the cells:
ImagePropertiesFilter
Provides special handling depending on the guest operating system in
use (Linux-based or Windows-based).
ProjectsToAggregateFilter
Provides special handling depending on which project the instance is
associated with.
default_schedule_zones
Allows the selection of multiple default availability zones, rather
than a single default.
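These filters are small Python classes plugged into the nova filter
scheduler. The CERN implementations are not reproduced here; the
stand-alone sketch below only illustrates the kind of project-to-aggregate
check such a filter performs, with the hosts and the request reduced to
plain data.

.. code-block:: python

   # Stand-alone sketch of a project-to-aggregate style check. A real nova
   # filter subclasses nova.scheduler.filters.BaseHostFilter and receives
   # richer host and request objects.

   def host_passes(aggregate_projects, request_project):
       """Accept the host if its aggregate allows the requesting project."""
       if not aggregate_projects:      # unrestricted aggregate
           return True
       return request_project in aggregate_projects

   hosts = {
       'compute-geneva-01': {'lhc-atlas', 'lhc-cms'},  # restricted
       'compute-geneva-02': set(),                     # unrestricted
   }

   candidates = [name for name, projects in hosts.items()
                 if host_passes(projects, 'lhc-atlas')]
   print(candidates)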
A central database team manages the MySQL database server in each cell
in an active/passive configuration with a NetApp storage back end.
Backups run every 6 hours.
Network architecture
^^^^^^^^^^^^^^^^^^^^
To integrate with existing networking infrastructure, CERN made
customizations to legacy networking (nova-network). This was in the form
of a driver to integrate with CERN's existing database for tracking MAC
and IP address assignments.
The driver facilitates selection of a MAC address and IP for new
instances based on the compute node where the scheduler places the
instance.
The driver considers the compute node where the scheduler placed an
instance and selects a MAC address and IP from the pre-registered list
associated with that node in the database. The database updates to
reflect the address assignment to that instance.
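The driver itself is not reproduced here, but the selection step is easy
to sketch: look up a free, pre-registered MAC and IP pair for the chosen
compute node and record the assignment. The table layout and values below
are hypothetical.

.. code-block:: python

   import sqlite3

   # Hypothetical pre-registered address table keyed by compute node.
   db = sqlite3.connect(':memory:')
   db.execute('CREATE TABLE addresses '
              '(node TEXT, mac TEXT, ip TEXT, instance TEXT)')
   db.execute("INSERT INTO addresses VALUES "
              "('compute-01', 'fa:16:3e:00:00:01', '10.10.0.11', NULL)")

   def allocate(node, instance_uuid):
       """Pick a free pair registered for this node and mark it assigned."""
       row = db.execute('SELECT mac, ip FROM addresses '
                        'WHERE node = ? AND instance IS NULL LIMIT 1',
                        (node,)).fetchone()
       if row is None:
           raise LookupError('no pre-registered address free on ' + node)
       mac, ip = row
       db.execute('UPDATE addresses SET instance = ? WHERE mac = ?',
                  (instance_uuid, mac))
       return mac, ip

   print(allocate('compute-01', 'instance-0001'))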
Storage architecture
^^^^^^^^^^^^^^^^^^^^
CERN deploys the OpenStack Image service in the API cell and configures
it to expose version 1 (V1) of the API. This also requires the image
registry. The storage back end in use is a 3 PB Ceph cluster.
CERN maintains a small set of Scientific Linux 5 and 6 images onto which
orchestration tools can place applications. Puppet manages instance
configuration and customization.
Monitoring
^^^^^^^^^^
CERN does not require direct billing, but uses the Telemetry service to
perform metering for the purposes of adjusting project quotas. CERN uses
a sharded, replicated MongoDB back end. To spread API load, CERN
deploys instances of the nova-api service within the child cells for
Telemetry to query against. This also requires the configuration of
supporting services such as keystone, glance-api, and glance-registry in
the child cells.
.. figure:: ../figures/Generic_CERN_Architecture.png
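A hedged sketch of the quota-adjustment side of this workflow is shown
below, using python-novaclient. The usage figure that Telemetry would
supply is replaced with a placeholder value, and the endpoint,
credentials, and project ID are hypothetical.

.. code-block:: python

   from keystoneauth1 import identity, session
   from novaclient import client as nova_client

   auth = identity.Password(auth_url='https://keystone.example.com:5000/v3',
                            username='ops', password='secret',
                            project_name='admin',
                            user_domain_id='default',
                            project_domain_id='default')
   nova = nova_client.Client('2.1', session=session.Session(auth=auth))

   # Placeholder for the metering result Telemetry would normally supply.
   project_id = 'hypothetical-project-id'
   cores_used_last_period = 180

   # Grow the project's core quota with a little headroom over usage.
   nova.quotas.update(project_id, cores=int(cores_used_last_period * 1.2))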
Additional monitoring tools in use include
`Flume <http://flume.apache.org/>`_,
`Elasticsearch <http://www.elasticsearch.org/>`_,
`Kibana <http://www.elasticsearch.org/overview/kibana/>`_, and the
CERN-developed `Lemon <http://lemon.web.cern.ch/lemon/index.shtml>`_
project.
Hybrid cloud example: bursting to a public OpenStack cloud
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Company A's data center is running low on capacity.
It is not possible to expand the data center in the foreseeable future.
In order to accommodate the continuously growing need for
development resources in the organization,
Company A decides to use resources in the public cloud.
Company A has an established data center with a substantial amount
of hardware. Migrating the workloads to a public cloud is not feasible.
The company has an internal cloud management platform that directs
requests to the appropriate cloud, depending on the local capacity.
This is a custom in-house application written for this specific purpose.
This solution is depicted in the figure below:
.. figure:: ../figures/Multi-Cloud_Priv-Pub3.png
:width: 100%
This example shows two clouds with a Cloud Management
Platform (CMP) connecting them. This guide does not
discuss a specific CMP, but describes how the Orchestration and
Telemetry services handle, manage, and control workloads.
The private OpenStack cloud has at least one controller and at least
one compute node. It includes metering using the Telemetry service.
The Telemetry service captures the load increase and the CMP
processes the information. If there is available capacity,
the CMP uses the OpenStack API to call the Orchestration service.
This creates instances on the private cloud in response to user requests.
When capacity is not available on the private cloud, the CMP issues
a request to the Orchestration service API of the public cloud.
This creates the instance on the public cloud.
In this example, Company A does not direct the deployments to an
external public cloud due to concerns regarding resource control,
security, and increased operational expense.
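A minimal sketch of that decision logic, assuming the openstacksdk cloud
layer is used for both clouds, follows. The cloud names, the free-capacity
figure, and the template are placeholders rather than details of
Company A's CMP.

.. code-block:: python

   import openstack

   private = openstack.connect(cloud='company-a-private')  # placeholder
   public = openstack.connect(cloud='public-openstack')    # names

   # Free capacity the CMP would normally derive from Telemetry data.
   PRIVATE_VCPUS_FREE = 48

   def choose_cloud(vcpus_needed):
       """Prefer the private cloud while it has room, otherwise burst."""
       return private if vcpus_needed <= PRIVATE_VCPUS_FREE else public

   conn = choose_cloud(vcpus_needed=2)

   # Ask the chosen cloud's Orchestration service to build the environment;
   # 'dev-env.yaml' is a hypothetical HOT template.
   stack = conn.create_stack('dev-env-01', template_file='dev-env.yaml',
                             wait=True)
   print(stack.id, stack.status)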
Hybrid cloud example: bursting to a public non-OpenStack cloud
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The second example examines bursting workloads from the private cloud
into a non-OpenStack public cloud using Amazon Web Services (AWS)
to take advantage of additional capacity and to scale applications.
The following diagram demonstrates an OpenStack-to-AWS hybrid cloud:
.. figure:: ../figures/Multi-Cloud_Priv-AWS4.png
:width: 100%
Company B states that its developers are already using AWS
and do not want to change to a different provider.
If the CMP is capable of connecting to an external cloud
provider with an appropriate API, the workflow process remains
the same as the previous scenario.
The actions the CMP takes, such as monitoring loads and
creating new instances, stay the same.
However, the CMP performs actions in the public cloud
using applicable API calls.
If the public cloud is AWS, the CMP would use the
EC2 API to create a new instance and assign an Elastic IP.
It can then add that IP to HAProxy in the private cloud.
The CMP can also reference AWS-specific
tools such as CloudWatch and CloudFormation.
Several open source tool kits for building CMPs are
available and can handle this kind of translation.
Examples include ManageIQ, jClouds, and JumpGate.
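For the AWS side, a CMP fragment might look like the boto3 sketch below;
the region, AMI ID, instance type, and the HAProxy hand-off are
placeholders rather than details of Company B's environment.

.. code-block:: python

   import boto3

   ec2 = boto3.client('ec2', region_name='us-east-1')

   # Launch one burst instance; the AMI ID and type are hypothetical.
   reservation = ec2.run_instances(ImageId='ami-0123456789abcdef0',
                                   InstanceType='m4.large',
                                   MinCount=1, MaxCount=1)
   instance_id = reservation['Instances'][0]['InstanceId']
   ec2.get_waiter('instance_running').wait(InstanceIds=[instance_id])

   # Allocate an Elastic IP and attach it to the new instance.
   allocation = ec2.allocate_address(Domain='vpc')
   ec2.associate_address(InstanceId=instance_id,
                         AllocationId=allocation['AllocationId'])

   # The CMP would then add allocation['PublicIp'] to the HAProxy back-end
   # pool in the private cloud (a configuration management step, not shown).
   print(instance_id, allocation['PublicIp'])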
Hybrid cloud example: high availability and disaster recovery
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Company C requires their local data center to be able to
recover from failure. Some of the workloads currently in
use are running on their private OpenStack cloud.
Protecting the data involves Block Storage, Object Storage,
and a database. The architecture supports the failure of
large components of the system while ensuring that the
system continues to deliver services.
While the services remain available to users, the failed
components are restored in the background based on standard
best practice data replication policies.
To achieve these objectives, Company C replicates data to
a second cloud in a geographically distant location.
The following diagram describes this system:
.. figure:: ../figures/Multi-Cloud_failover2.png
:width: 100%
This example includes two private OpenStack clouds connected with a CMP.
The source cloud, OpenStack Cloud 1, includes a controller and
at least one instance running MySQL. It also includes at least
one Block Storage volume and one Object Storage volume.
This means that data is available to the users at all times.
The details of the method for protecting each of these sources
of data differ.
Object Storage relies on the replication capabilities of
the Object Storage provider.
Company C enables OpenStack Object Storage so that it creates
geographically separated replicas that take advantage of this feature.
The company configures storage so that at least one replica
exists in each cloud. In order to make this work, the company
configures a single array spanning both clouds with OpenStack Identity.
Using Federated Identity, the array talks to both clouds, communicating
with OpenStack Object Storage through the Swift proxy.
For Block Storage, the replication is a little more difficult,
and involves tools outside of OpenStack itself.
The OpenStack Block Storage volume is not set as the drive itself
but as a logical object that points to a physical back end.
Disaster recovery is configured for Block Storage for
synchronous backup for the highest level of data protection,
but asynchronous backup could have been set as an alternative
that is not as latency sensitive.
For asynchronous backup, the Block Storage API makes it possible
to export the data and also the metadata of a particular volume,
so that it can be moved and replicated elsewhere.
More information can be found here:
https://blueprints.launchpad.net/cinder/+spec/cinder-backup-volume-metadata-support.
The synchronous backups create an identical volume in both
clouds and choose the appropriate flavor so that each cloud
has an identical back end. This is done by creating volumes
through the CMP. After this is configured, a solution
involving DRBD synchronizes the physical drives.
The database component is backed up using synchronous backups.
MySQL does not support geographically diverse replication,
so disaster recovery is provided by replicating the file itself.
As it is not possible to use Object Storage as the back end of
a database like MySQL, Swift replication is not an option.
Company C decides not to store the data on another geo-tiered
storage system, such as Ceph, as Block Storage.
This would have given another layer of protection.
Another option would have been to store the database on an OpenStack
Block Storage volume and back it up like any other Block Storage volume.
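The step of creating an identical volume in both clouds can be scripted
through the same SDK the CMP already uses, as in the sketch below;
configuring DRBD over the underlying drives remains an external step. The
cloud names, size, and volume type are placeholders.

.. code-block:: python

   import openstack

   clouds = [openstack.connect(cloud='openstack-cloud-1'),  # placeholder
             openstack.connect(cloud='openstack-cloud-2')]  # names

   # Create an identically sized and typed volume in each cloud so the
   # DRBD-based replication has matching back ends to work with.
   volumes = [conn.create_volume(size=100,
                                 name='db-data',
                                 volume_type='replicated-ssd',  # hypothetical
                                 wait=True)
              for conn in clouds]

   for volume in volumes:
       print(volume.id, volume.status)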


@ -1,13 +1,26 @@
=========================
Multi-site cloud examples
=========================
.. _multisite-cloud:
================
Multi-site cloud
================
Design Model
~~~~~~~~~~~~
Component block diagram
~~~~~~~~~~~~~~~~~~~~~~~
Stakeholder
~~~~~~~~~~~
User stories
~~~~~~~~~~~~
There are multiple ways to build a multi-site OpenStack installation,
based on the needs of the intended workloads. Below are example
architectures based on different requirements. These examples are meant
as a reference, and not a hard and fast rule for deployments. Use the
previous sections of this chapter to assist in selecting specific
components and implementations based on specific needs.
architectures based on different requirements, which are not hard and
fast rules for deployment. Refer to previous sections to assist in
selecting specific components and implementations based on your needs.
A large content provider needs to deliver content to customers that are
geographically dispersed. The workload is very sensitive to latency and
@ -64,18 +77,18 @@ center in each of the edge regional locations house a second region near
the first region. This ensures that the application does not suffer
degraded performance in terms of latency and availability.
:ref:`ms-customer-edge` depicts the solution designed to have both a
The following figure depicts the solution designed to have both a
centralized set of core data centers for OpenStack services and paired edge
data centers:
data centers.
.. _ms-customer-edge:
.. figure:: figures/Multi-Site_Customer_Edge.png
**Multi-site architecture example**
Geo-redundant load balancing
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. figure:: ../figures/Multi-Site_Customer_Edge.png
Geo-redundant load balancing example
------------------------------------
A large-scale web application has been designed with cloud principles in
mind. The application is designed to provide service to an application store,
@ -83,7 +96,7 @@ on a 24/7 basis. The company has a typical two-tier architecture with a
web front-end servicing the customer requests, and a NoSQL database back
end storing the information.
As of late there has been several outages in number of major public
Recently there have been several outages in a number of major public
cloud providers due to applications running from a single geographical
location. The design therefore should mitigate the chance of a single
site causing an outage for their business.
@ -155,12 +168,13 @@ not have any awareness of geo location.
.. _ms-geo-redundant:
.. figure:: figures/Multi-site_Geo_Redundant_LB.png
**Multi-site geo-redundant architecture**
Location-local service
~~~~~~~~~~~~~~~~~~~~~~
.. figure:: ../figures/Multi-site_Geo_Redundant_LB.png
Location-local service example
------------------------------
A common use for multi-site OpenStack deployment is creating a Content
Delivery Network. An application that uses a location-local architecture
@ -187,6 +201,6 @@ application completes the request.
.. _ms-shared-keystone:
.. figure:: figures/Multi-Site_shared_keystone1.png
.. figure:: ../figures/Multi-Site_shared_keystone1.png
**Multi-site shared keystone architecture**


@ -1,12 +1,28 @@
.. _nfv-cloud:
==============================
Network-focused cloud examples
Network virtual function cloud
==============================
An organization designs a large-scale web application with cloud
principles in mind. The application scales horizontally in a bursting
fashion and generates a high instance count. The application requires an
SSL connection to secure data and must not lose connection state to
individual servers.
Stakeholder
~~~~~~~~~~~
Design model
~~~~~~~~~~~~
Component block diagram
~~~~~~~~~~~~~~~~~~~~~~~
User stories
~~~~~~~~~~~~
Network-focused cloud examples
------------------------------
An organization designs a large-scale, cloud-based web application. The
application scales horizontally in a bursting fashion and generates a
high instance count. The application requires an SSL connection to secure
data and must not lose connection state to individual servers.
The figure below depicts an example design for this workload. In this
example, a hardware load balancer provides SSL offload functionality and
@ -28,7 +44,7 @@ vSwitch agent in GRE tunnel mode. This ensures all devices can reach all
other devices and that you can create tenant networks for private
addressing links to the load balancer.
.. figure:: figures/Network_Web_Services1.png
.. figure:: ../figures/Network_Web_Services1.png
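Creating the tenant network that carries the private addressing between
the instances and the load balancer is a single Networking API call per
resource; a sketch with placeholder names and CIDR follows.

.. code-block:: python

   import openstack

   conn = openstack.connect(cloud='web-cloud')  # placeholder cloud name

   # Tenant network and subnet used for private addressing between the
   # web instances and the load balancer; names and CIDR are hypothetical.
   network = conn.network.create_network(name='lb-backend-net')
   subnet = conn.network.create_subnet(network_id=network.id,
                                       name='lb-backend-subnet',
                                       ip_version=4,
                                       cidr='192.168.100.0/24')
   print(network.id, subnet.cidr)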
A web service architecture has many options and optional components. Due
to this, it can fit into a large number of other OpenStack designs. A
@ -153,7 +169,7 @@ east-west traffic
specific direction. However, this traffic might interfere with
north-south traffic.
.. figure:: figures/Network_Cloud_Storage2.png
.. figure:: ../figures/Network_Cloud_Storage2.png
This application prioritizes the north-south traffic over east-west
traffic: the north-south traffic involves customer-facing data.


@ -0,0 +1,17 @@
.. _public-cloud:
============
Public cloud
============
Stakeholder
~~~~~~~~~~~
User stories
~~~~~~~~~~~~
Design model
~~~~~~~~~~~~
Component block diagram
~~~~~~~~~~~~~~~~~~~~~~~


@ -1,6 +1,11 @@
==============================
Storage-focused cloud examples
==============================
.. _storage-cloud:
=============
Storage cloud
=============
Design model
~~~~~~~~~~~~
Storage-focused architecture depends on specific use cases. This section
discusses three example use cases:
@ -11,15 +16,22 @@ discusses three example use cases:
* High performance database
Component block diagram
~~~~~~~~~~~~~~~~~~~~~~~
Stakeholder
~~~~~~~~~~~
User stories
~~~~~~~~~~~~
The example below shows a REST interface without a high performance
requirement.
requirement. The following diagram depicts the example architecture:
Swift is a highly scalable object store that is part of the OpenStack
project. This diagram explains the example architecture:
.. figure:: ../figures/Storage_Object.png
.. figure:: figures/Storage_Object.png
The example REST interface, presented as a traditional Object store
The example REST interface, presented as a traditional Object Store
running on traditional spindles, does not require a high performance
caching tier.
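As a rough illustration of how clients exercise that REST interface, the
sketch below stores and retrieves an object with python-swiftclient; the
endpoint, credentials, and names are placeholders.

.. code-block:: python

   import swiftclient

   # Placeholder Identity v3 credentials for the Object Storage endpoint.
   conn = swiftclient.Connection(
       authurl='https://keystone.example.com:5000/v3',
       user='demo', key='secret',
       os_options={'project_name': 'demo',
                   'project_domain_name': 'Default',
                   'user_domain_name': 'Default'},
       auth_version='3')

   conn.put_container('logs')
   conn.put_object('logs', 'web-2016-08-19.log',
                   contents=b'GET / 200\n', content_type='text/plain')

   headers, body = conn.get_object('logs', 'web-2016-08-19.log')
   print(headers['content-length'], body)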
@ -48,11 +60,11 @@ Proxy:
.. note::
It may be necessary to implement a 3rd-party caching layer for some
It may be necessary to implement a third party caching layer for some
applications to achieve suitable performance.
Compute analytics with Data processing service
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Compute analytics with data processing service
----------------------------------------------
Analytics of large data sets are dependent on the performance of the
storage system. Clouds using storage systems such as Hadoop Distributed
@ -68,7 +80,7 @@ OpenStack has integration with Hadoop to manage the Hadoop cluster
within the cloud. The following diagram shows an OpenStack store with a
high performance requirement:
.. figure:: figures/Storage_Hadoop3.png
.. figure:: ../figures/Storage_Hadoop3.png
The hardware requirements and configuration are similar to those of the
High Performance Database example below. In this case, the architecture
@ -96,9 +108,9 @@ database example below, a portion of the SSD pool can act as a block
device to the Database server. In the high performance analytics
example, the inline SSD cache layer accelerates the REST interface.
.. figure:: figures/Storage_Database_+_Object5.png
.. figure:: ../figures/Storage_Database_+_Object5.png
In this example, Ceph presents a Swift-compatible REST interface, as
In this example, Ceph presents a swift-compatible REST interface, as
well as a block level storage from a distributed storage cluster. It is
highly flexible and has features that enable reduced cost of operations
such as self-healing and auto-balancing. Using erasure coded pools is a


@ -0,0 +1,17 @@
.. _web-scale-cloud:
===============
Web scale cloud
===============
Stakeholder
~~~~~~~~~~~
User stories
~~~~~~~~~~~~
Design model
~~~~~~~~~~~~
Component block diagram
~~~~~~~~~~~~~~~~~~~~~~~