Merge "[arch-design-draft] edit Use Cases chapter"
commit 6c69358327
@@ -10,8 +10,5 @@ Use cases
   use-cases/use-case-development
   use-cases/use-case-general-compute
   use-cases/use-case-web-scale
   use-cases/use-case-public
   use-cases/use-case-storage
   use-cases/use-case-multisite
   use-cases/use-case-nfv
   use-cases/use-cases-specialized
@@ -1,47 +0,0 @@
====================
Desktop-as-a-Service
====================

Virtual Desktop Infrastructure (VDI) is a service that hosts
user desktop environments on remote servers. This application
is very sensitive to network latency and requires a high
performance compute environment. Traditionally, these types of
services do not use cloud environments because few clouds
support such a demanding workload for user-facing applications.
As cloud environments become more robust, vendors are starting
to provide virtual desktops in the cloud.
OpenStack may soon provide the infrastructure for these types of deployments.

Challenges
~~~~~~~~~~

Designing an infrastructure that is suitable to host virtual
desktops is a very different task from that of most other virtual workloads.
For example, the design must consider:

* Boot storms, when a high volume of logins occurs in a short period of time
* The performance of the applications running on virtual desktops
* Operating systems and their compatibility with the OpenStack hypervisor

Broker
~~~~~~

The connection broker determines which remote desktop host
each user can access. Medium and large scale environments require a broker
because it is a central component of the architecture.
The broker is a complete management product that enables automated
deployment and provisioning of remote desktop hosts.

Possible solutions
~~~~~~~~~~~~~~~~~~

A number of commercial products are currently available that
provide a broker solution. However, no native OpenStack project
provides broker services.
Not providing a broker is also an option, but managing desktops
manually would not suffice for a large scale, enterprise solution.

Diagram
~~~~~~~

.. figure:: ../figures/Specialized_VDI1.png
@@ -1,42 +0,0 @@
====================
Specialized hardware
====================

Certain workloads require specialized hardware devices that
have significant virtualization or sharing challenges.
Applications such as load balancers, highly parallel brute
force computing, and direct-to-wire networking may need
capabilities that basic OpenStack components do not provide.

Challenges
~~~~~~~~~~

Some applications need access to hardware devices to either
improve performance or provide capabilities beyond those of
virtual CPU, RAM, network, or storage. These can be a shared
resource, such as a cryptography processor, or a dedicated
resource, such as a Graphics Processing Unit (GPU). OpenStack can
provide some of these, while others may need extra work.

Solutions
~~~~~~~~~

To provide cryptography offloading to a set of instances,
you can use Image service configuration options.
For example, assign the cryptography chip to a device node in the guest.
For further information on this configuration, see `Image service
property keys <http://docs.openstack.org/cli-reference/glance-property-keys.html>`_.
However, this option allows all guests using the
configured images to access the hypervisor cryptography device.

If you require direct access to a specific device, PCI pass-through
enables you to dedicate the device to a single instance per hypervisor.
You must define a flavor that specifically requests the PCI device in
order to schedule instances properly.
More information regarding PCI pass-through, including instructions for
implementing and using it, is available at
`https://wiki.openstack.org/wiki/Pci_passthrough <https://wiki.openstack.org/wiki/Pci_passthrough#How_to_check_PCI_status_with_PCI_api_patches>`_.
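
A minimal sketch of the flavor-based approach, using the openstacksdk
Python library, is shown below. The cloud name and the PCI alias ``gpu``
are assumptions: the alias must already be defined by the operator in the
``[pci]`` section of ``nova.conf`` on the controller and compute nodes,
and the extra spec follows the documented ``pci_passthrough:alias``
convention.

.. code-block:: python

   import openstack

   # "mycloud" is an assumed clouds.yaml entry with admin credentials.
   conn = openstack.connect(cloud="mycloud")

   # Create a flavor whose extra spec requests one device from the
   # hypothetical PCI alias "gpu" configured separately in nova.conf.
   flavor = conn.compute.create_flavor(
       name="m1.gpu", ram=8192, vcpus=4, disk=80)
   conn.compute.create_flavor_extra_specs(
       flavor, {"pci_passthrough:alias": "gpu:1"})

   # Instances booted with this flavor are scheduled only to hosts that
   # can supply the requested device.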

.. figure:: ../figures/Specialized_Hardware2.png
   :width: 100%
@@ -1,80 +0,0 @@
========================
Multi-hypervisor example
========================

A financial company requires that its applications be migrated
from a traditional, virtualized environment to an API-driven,
orchestrated environment. The new environment needs
multiple hypervisors since many of the company's applications
have strict hypervisor requirements.

Currently, the company's vSphere environment runs 20 VMware
ESXi hypervisors. These hypervisors support 300 instances of
various sizes. Approximately 50 of these instances must run
on ESXi. The remaining 250 instances have more flexible requirements.

The financial company decides to manage the
overall system with a common OpenStack platform.

.. figure:: ../figures/Compute_NSX.png
   :width: 100%

Architecture planning teams decided to run a host aggregate
containing KVM hypervisors for the general purpose instances.
A separate host aggregate targets instances requiring ESXi.

Images in the OpenStack Image service have particular
hypervisor metadata attached. When a user requests a
certain image, the instance spawns on the relevant aggregate.
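
A hedged sketch of setting such metadata with the openstacksdk Python
library follows; the image names and cloud entry are placeholders, and
the property names are taken from the Image service's documented useful
image properties.

.. code-block:: python

   import openstack

   conn = openstack.connect(cloud="mycloud")   # assumed clouds.yaml entry

   # Tag an ESXi-only image so the scheduler's ImagePropertiesFilter
   # places instances booted from it on the VMware host aggregate.
   esxi_image = conn.image.find_image("app-esxi")       # placeholder name
   conn.image.update_image(
       esxi_image,
       hypervisor_type="vmware",
       vmware_disktype="thin",
       vmware_adaptertype="lsiLogic",
   )

   # General purpose images are tagged for the KVM aggregate instead.
   kvm_image = conn.image.find_image("app-generic")     # placeholder name
   conn.image.update_image(kvm_image, hypervisor_type="qemu")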

Images for ESXi use the VMDK format. QEMU disk images can be
converted to VMDK (VMFS flat disks), which can be thin, thick,
zeroed-thick, or eager-zeroed-thick.
After exporting a VMFS thin disk from VMFS to the
OpenStack Image service (a non-VMFS location), it becomes a
preallocated flat disk. This impacts the transfer time from the
OpenStack Image service to the data store since transfers require
moving the full preallocated flat disk rather than the thin disk.

The VMware host aggregate compute nodes communicate with
vCenter rather than spawning directly on a hypervisor.
The vCenter then requests scheduling for the instance to run on
an ESXi hypervisor.

This functionality requires that VMware Distributed Resource
Scheduler (DRS) is enabled on a cluster and set to **Fully Automated**.
vSphere requires shared storage because DRS uses vMotion,
a service that relies on shared storage.

This solution to the company's migration uses shared storage
to provide Block Storage capabilities to the KVM instances while
also providing vSphere storage. The new environment provides this
storage functionality using a dedicated data network. The
compute hosts should have dedicated NICs to support the
dedicated data network. vSphere supports OpenStack Block Storage. This
support presents storage from a VMFS datastore to an instance. For the
financial company, Block Storage in their new architecture supports
both hypervisors.

OpenStack Networking provides network connectivity in this new
architecture, with the VMware NSX plug-in driver configured. Legacy
networking (nova-network) supports both hypervisors in this new
architecture example, but has limitations. Specifically, vSphere
with legacy networking does not support security groups. The new
architecture uses VMware NSX as a part of the design. When users launch an
instance within either of the host aggregates, VMware NSX ensures the
instance attaches to the appropriate overlay-based logical networks.

.. TODO update example??

The architecture planning teams also consider OpenStack Compute integration.
When running vSphere in an OpenStack environment, the nova-compute service
that communicates with vCenter appears as a single large hypervisor.
This hypervisor represents the entire ESXi cluster. Multiple nova-compute
instances can represent multiple ESXi clusters. They can connect to
multiple vCenter servers. If the process running nova-compute
crashes, the connection to the vCenter server is cut and the
corresponding ESXi clusters become unavailable to OpenStack; you will
not be able to provision further instances on that vCenter, even if you
enable high availability. You must monitor the nova-compute service
connected to vSphere carefully for any disruptions as a result of this
failure point.
@@ -1,33 +0,0 @@
==============================
Specialized networking example
==============================

Some applications that interact with a network require
specialized connectivity. For example, applications used in Looking Glass
servers require the ability to connect to a Border Gateway Protocol (BGP) peer,
or routing participant applications may need to join a layer-2 network.

Challenges
~~~~~~~~~~

Connecting specialized network applications to their required
resources impacts the OpenStack architecture design. Installations that
rely on overlay networks cannot support a routing participant, and may
also block listeners on a layer-2 network.

Possible solutions
~~~~~~~~~~~~~~~~~~

Deploying an OpenStack installation using OpenStack Networking with a
provider network allows direct layer-2 connectivity to an
upstream networking device. This design provides the layer-2 connectivity
required to communicate through the Intermediate System to Intermediate
System (IS-IS) protocol, or to pass packets using an OpenFlow controller.

Using the Modular Layer 2 (ML2) plug-in with an agent such as
:term:`Open vSwitch` allows a private connection through a VLAN
directly to a specific port in a layer-3 device. This allows a BGP
point-to-point link to join the autonomous system.
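
As a rough sketch of the provider network approach with the openstacksdk
Python library, the following creates a VLAN provider network mapped to
an upstream switch; the cloud entry, physical network label, VLAN ID, and
subnet range are assumptions, and the call requires administrative
credentials.

.. code-block:: python

   import openstack

   conn = openstack.connect(cloud="mycloud")   # assumed admin cloud entry

   # Map a Networking service network directly onto VLAN 201 of the
   # physical network "physnet1" so instances gain layer-2 adjacency
   # with the upstream router for BGP or IS-IS peering.
   network = conn.network.create_network(
       name="bgp-peering",
       provider_network_type="vlan",
       provider_physical_network="physnet1",
       provider_segmentation_id=201,
   )
   conn.network.create_subnet(
       network_id=network.id,
       ip_version=4,
       cidr="203.0.113.0/24",        # documentation range, RFC 5737
       gateway_ip="203.0.113.1",
   )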

Avoid using layer-3 plug-ins as they divide the broadcast
domain and prevent router adjacencies from forming.
@@ -1,65 +0,0 @@
======================
OpenStack on OpenStack
======================

In some cases, users may run OpenStack nested on top
of another OpenStack cloud. This scenario describes how to
manage and provision complete OpenStack environments on instances
supported by hypervisors and servers, which an underlying OpenStack
environment controls.

Public cloud providers can use this technique to manage the
upgrade and maintenance process on OpenStack environments.
Developers and operators testing OpenStack can also use this
technique to provision their own OpenStack environments on
available OpenStack Compute resources.

Challenges
~~~~~~~~~~

The networking aspect of deploying a nested cloud is the most
complicated part of this architecture.
You must expose VLANs to the physical ports on which the underlying
cloud runs because the bare metal cloud owns all the hardware.
You must also expose them to the nested levels.
Alternatively, you can use the network overlay technologies on the
OpenStack environment running on the host OpenStack environment to
provide the software-defined networking for the deployment.

Hypervisor
~~~~~~~~~~

In this example architecture, consider which approach you should use
to provide a nested hypervisor in OpenStack. This decision
influences the operating systems you use for nested OpenStack deployments.

Possible solutions: deployment
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Deployment of a full stack can be challenging, but you can mitigate
this difficulty by creating a Heat template that deploys the
entire stack, or by using a configuration management system. After creating
the Heat template, you can automate the deployment of additional stacks.
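
The sketch below shows how such a Heat template might be launched on the
underlying cloud through the openstacksdk cloud layer; the cloud entry,
template file name, and parameters are assumptions rather than artifacts
of any particular project.

.. code-block:: python

   import openstack

   conn = openstack.connect(cloud="undercloud")    # assumed clouds.yaml entry

   # Launch one nested OpenStack environment from a pre-written HOT
   # template; further environments are simply further stacks.
   stack = conn.create_stack(
       name="nested-openstack-1",
       template_file="nested_openstack.yaml",      # assumed template
       wait=True,
       rollback=True,
       # Remaining keyword arguments are passed as template parameters.
       key_name="ops-key",
       external_network="public",
   )
   print(stack.id)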

The OpenStack-on-OpenStack project (:term:`TripleO`)
addresses this issue. Currently, however, the project does
not completely cover nested stacks. For more information, see
https://wiki.openstack.org/wiki/TripleO.

Possible solutions: hypervisor
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When running TripleO, the underlying OpenStack
cloud deploys Compute nodes as bare-metal servers. You then deploy
OpenStack on these bare-metal servers with the
appropriate hypervisor, such as KVM.

When running smaller OpenStack clouds for testing
purposes, where performance is not a critical factor, you can use
QEMU instead. It is also possible to run a KVM hypervisor in an instance
(see http://davejingtian.org/2014/03/30/nested-kvm-just-for-fun/),
though this is not a supported configuration and could be a
complex solution for such a use case.

.. figure:: ../figures/Specialized_OOO.png
   :width: 100%
@@ -1,5 +0,0 @@
==================================================
Single site architecture with OpenStack Networking
==================================================

.. TODO
@@ -1,46 +0,0 @@
===========================
Software-defined networking
===========================

Software-defined networking (SDN) is the separation of the data
plane from the control plane. SDN is a popular method of
managing and controlling packet flows within networks.
SDN uses overlays or directly controlled layer-2 devices to
determine flow paths, and as such presents challenges to a
cloud environment. Some designers may wish to run their
controllers within an OpenStack installation. Others may wish
to have their installations participate in an SDN-controlled network.

Challenges
~~~~~~~~~~

SDN is a relatively new concept that is not yet standardized,
so SDN systems come in a variety of different implementations.
Because of this, a truly prescriptive architecture is not feasible.
Instead, examine the differences between an existing and a planned
OpenStack design and determine where potential conflicts and gaps exist.

Possible solutions
~~~~~~~~~~~~~~~~~~

If an SDN implementation requires layer-2 access because it
directly manipulates switches, we do not recommend running an
overlay network or a layer-3 agent.
If the controller resides within an OpenStack installation,
build an ML2 plug-in, and schedule the controller instances
to connect to tenant VLANs so they can talk directly to the switch
hardware.
Alternatively, depending on the external device support,
use a tunnel that terminates at the switch hardware itself.
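
A loose sketch of placing an SDN controller instance on a VLAN-backed
tenant network with the openstacksdk Python library is shown below; the
network, image, flavor, and cloud names are placeholders, and the network
is assumed to exist already.

.. code-block:: python

   import openstack

   conn = openstack.connect(cloud="mycloud")    # assumed clouds.yaml entry

   # Look up a pre-created VLAN-backed network that reaches the switch
   # hardware; "controller-vlan" is a placeholder name.
   net = conn.network.find_network("controller-vlan")
   image = conn.compute.find_image("sdn-controller")    # placeholder image
   flavor = conn.compute.find_flavor("m1.large")

   server = conn.compute.create_server(
       name="sdn-controller-01",
       image_id=image.id,
       flavor_id=flavor.id,
       networks=[{"uuid": net.id}],
   )
   conn.compute.wait_for_server(server)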

Diagram
~~~~~~~

OpenStack hosted SDN controller:

.. figure:: ../figures/Specialized_SDN_hosted.png

OpenStack participating in an SDN controller network:

.. figure:: ../figures/Specialized_SDN_external.png
@@ -4,13 +4,10 @@
Development cloud
=================

Stakeholder
~~~~~~~~~~~

User stories
Design model
~~~~~~~~~~~~

Design model
Requirements
~~~~~~~~~~~~

Component block diagram
@@ -7,30 +7,6 @@ General compute cloud
Design model
~~~~~~~~~~~~

Hybrid cloud environments are designed for these use cases:

* Bursting workloads from private to public OpenStack clouds
* Bursting workloads from private to public non-OpenStack clouds
* High availability across clouds (for technical diversity)

This chapter provides examples of environments that address
each of these use cases.

Component block diagram
~~~~~~~~~~~~~~~~~~~~~~~

Stakeholder
~~~~~~~~~~~

User stories
~~~~~~~~~~~~

General cloud example
---------------------

An online classified advertising company wants to run web applications
consisting of Tomcat, Nginx, and MariaDB in a private cloud. To meet the
policy requirements, the cloud infrastructure will run in their
@@ -113,273 +89,9 @@ control hardware load balance pools and instances as members in these
pools, but their use in production environments must be carefully
weighed against current stability.

Compute-focused cloud example
-----------------------------

The Conseil Européen pour la Recherche Nucléaire (CERN), also known as
the European Organization for Nuclear Research, provides particle
accelerators and other infrastructure for high-energy physics research.

As of 2011, CERN operated these two compute centers in Europe with plans
to add a third one.

+-----------------------+------------------------+
| Data center           | Approximate capacity   |
+=======================+========================+
| Geneva, Switzerland   | - 3.5 Mega Watts       |
|                       |                        |
|                       | - 91000 cores          |
|                       |                        |
|                       | - 120 PB HDD           |
|                       |                        |
|                       | - 100 PB Tape          |
|                       |                        |
|                       | - 310 TB Memory        |
+-----------------------+------------------------+
| Budapest, Hungary     | - 2.5 Mega Watts       |
|                       |                        |
|                       | - 20000 cores          |
|                       |                        |
|                       | - 6 PB HDD             |
+-----------------------+------------------------+

To support the growing number of compute-heavy users of experiments
related to the Large Hadron Collider (LHC), CERN ultimately elected to
deploy an OpenStack cloud using Scientific Linux and RDO. This effort
aimed to simplify the management of the center's compute resources with
a view to doubling compute capacity through the addition of a data
center in 2013 while maintaining the same levels of compute staff.

The CERN solution uses :term:`cells <cell>` for segregation of compute
resources and for transparently scaling between different data centers.
This decision meant trading off support for security groups and live
migration. In addition, they must manually replicate some details, like
flavors, across cells. In spite of these drawbacks, cells provide the
required scale while exposing a single public API endpoint to users.

CERN created a compute cell for each of the two original data centers
and created a third when it added a new data center in 2013. Each cell
contains three availability zones to further segregate compute resources
and at least three RabbitMQ message brokers configured for clustering
with mirrored queues for high availability.

The API cell, which resides behind an HAProxy load balancer, is in the
data center in Switzerland and directs API calls to compute cells using
a customized variation of the cell scheduler. The customizations allow
certain workloads to route to a specific data center or all data
centers, with cell RAM availability determining cell selection in the
latter case.

.. figure:: ../figures/Generic_CERN_Example.png

There is also some customization of the filter scheduler that handles
placement within the cells:

ImagePropertiesFilter
  Provides special handling depending on the guest operating system in
  use (Linux-based or Windows-based).

ProjectsToAggregateFilter
  Provides special handling depending on which project the instance is
  associated with.

default_schedule_zones
  Allows the selection of multiple default availability zones, rather
  than a single default.

A central database team manages the MySQL database server in each cell
in an active/passive configuration with a NetApp storage back end.
Backups run every 6 hours.

Network architecture
^^^^^^^^^^^^^^^^^^^^

To integrate with existing networking infrastructure, CERN made
customizations to legacy networking (nova-network). This was in the form
of a driver to integrate with CERN's existing database for tracking MAC
and IP address assignments.

The driver facilitates selection of a MAC address and IP for new
instances based on the compute node where the scheduler places the
instance.

The driver considers the compute node where the scheduler placed an
instance and selects a MAC address and IP from the pre-registered list
associated with that node in the database. The database updates to
reflect the address assignment to that instance.

Storage architecture
^^^^^^^^^^^^^^^^^^^^

CERN deploys the OpenStack Image service in the API cell and configures
it to expose version 1 (V1) of the API. This also requires the image
registry. The storage back end in use is a 3 PB Ceph cluster.

CERN maintains a small set of Scientific Linux 5 and 6 images onto which
orchestration tools can place applications. Puppet manages instance
configuration and customization.

Monitoring
^^^^^^^^^^

CERN does not require direct billing but uses the Telemetry service to
perform metering for the purposes of adjusting project quotas. CERN uses
a sharded, replicated MongoDB back end. To spread API load, CERN
deploys instances of the nova-api service within the child cells for
Telemetry to query against. This also requires the configuration of
supporting services such as keystone, glance-api, and glance-registry in
the child cells.

.. figure:: ../figures/Generic_CERN_Architecture.png

Additional monitoring tools in use include
`Flume <http://flume.apache.org/>`_,
`Elasticsearch <http://www.elasticsearch.org/>`_,
`Kibana <http://www.elasticsearch.org/overview/kibana/>`_, and the
CERN-developed `Lemon <http://lemon.web.cern.ch/lemon/index.shtml>`_
project.

Requirements
~~~~~~~~~~~~

Hybrid cloud example: bursting to a public OpenStack cloud
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Company A's data center is running low on capacity.
It is not possible to expand the data center in the foreseeable future.
In order to accommodate the continuously growing need for
development resources in the organization,
Company A decides to use resources in the public cloud.

Company A has an established data center with a substantial amount
of hardware. Migrating the workloads to a public cloud is not feasible.

The company has an internal cloud management platform that directs
requests to the appropriate cloud, depending on the local capacity.
This is a custom in-house application written for this specific purpose.

This solution is depicted in the figure below:

.. figure:: ../figures/Multi-Cloud_Priv-Pub3.png
   :width: 100%

This example shows two clouds with a Cloud Management
Platform (CMP) connecting them. This guide does not
discuss a specific CMP but describes how the Orchestration and
Telemetry services handle, manage, and control workloads.

The private OpenStack cloud has at least one controller and at least
one compute node. It includes metering using the Telemetry service.
The Telemetry service captures the load increase and the CMP
processes the information. If there is available capacity,
the CMP uses the OpenStack API to call the Orchestration service.
This creates instances on the private cloud in response to user requests.
When capacity is not available on the private cloud, the CMP issues
a request to the Orchestration service API of the public cloud.
This creates the instance on the public cloud.
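
The burst decision itself lives in the CMP, which this guide does not
prescribe. Purely as an illustrative sketch (assuming the openstacksdk
library, two ``clouds.yaml`` entries named ``private`` and ``public``,
and a placeholder capacity check that would in practice be driven by
Telemetry data), the logic could look like this:

.. code-block:: python

   import openstack

   def has_capacity(conn):
       """Placeholder for the CMP capacity check described in the text."""
       raise NotImplementedError

   private = openstack.connect(cloud="private")
   public = openstack.connect(cloud="public")

   # Call the Orchestration service of whichever cloud can take the load.
   target = private if has_capacity(private) else public
   target.create_stack(
       name="webapp-burst",
       template_file="webapp.yaml",   # assumed Heat template
       wait=True,
   )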

In this example, Company A does not direct the deployments to an
external non-OpenStack public cloud due to concerns regarding resource
control, security, and increased operational expense.

Hybrid cloud example: bursting to a public non-OpenStack cloud
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The second example examines bursting workloads from the private cloud
into a non-OpenStack public cloud using Amazon Web Services (AWS)
to take advantage of additional capacity and to scale applications.

The following diagram demonstrates an OpenStack-to-AWS hybrid cloud:

.. figure:: ../figures/Multi-Cloud_Priv-AWS4.png
   :width: 100%

Company B states that its developers are already using AWS
and do not want to change to a different provider.

If the CMP is capable of connecting to an external cloud
provider with an appropriate API, the workflow process remains
the same as the previous scenario.
The actions the CMP takes, such as monitoring loads and
creating new instances, stay the same.
However, the CMP performs actions in the public cloud
using applicable API calls.

If the public cloud is AWS, the CMP would use the
EC2 API to create a new instance and assign an Elastic IP.
It can then add that IP to HAProxy in the private cloud.
The CMP can also reference AWS-specific
tools such as CloudWatch and CloudFormation.
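
As a rough sketch of that EC2 interaction (using the boto3 library with
placeholder AMI, subnet, and region identifiers), the CMP-side calls
might look like the following:

.. code-block:: python

   import boto3

   ec2 = boto3.client("ec2", region_name="us-east-1")   # assumed region

   # Launch one instance from a placeholder AMI in a placeholder subnet.
   reservation = ec2.run_instances(
       ImageId="ami-0123456789abcdef0",
       InstanceType="t3.medium",
       SubnetId="subnet-0123456789abcdef0",
       MinCount=1,
       MaxCount=1,
   )
   instance_id = reservation["Instances"][0]["InstanceId"]
   ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])

   # Allocate an Elastic IP and associate it with the new instance; the
   # CMP would then register this address with HAProxy in the private
   # cloud.
   allocation = ec2.allocate_address(Domain="vpc")
   ec2.associate_address(
       InstanceId=instance_id,
       AllocationId=allocation["AllocationId"],
   )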

Several open source tool kits for building CMPs are
available and can handle this kind of translation.
Examples include ManageIQ, jClouds, and JumpGate.

Hybrid cloud example: high availability and disaster recovery
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Company C requires its local data center to be able to
recover from failure. Some of the workloads currently in
use are running on their private OpenStack cloud.
Protecting the data involves Block Storage, Object Storage,
and a database. The architecture supports the failure of
large components of the system while ensuring that the
system continues to deliver services.
While the services remain available to users, the failed
components are restored in the background based on standard
best practice data replication policies.
To achieve these objectives, Company C replicates data to
a second cloud in a geographically distant location.
The following diagram describes this system:

.. figure:: ../figures/Multi-Cloud_failover2.png
   :width: 100%

This example includes two private OpenStack clouds connected with a CMP.
The source cloud, OpenStack Cloud 1, includes a controller and
at least one instance running MySQL. It also includes at least
one Block Storage volume and one Object Storage volume.
This means that data is available to the users at all times.
The details of the method for protecting each of these sources
of data differ.

Object Storage relies on the replication capabilities of
the Object Storage provider.
Company C enables OpenStack Object Storage so that it creates
geographically separated replicas that take advantage of this feature.
The company configures storage so that at least one replica
exists in each cloud. In order to make this work, the company
configures a single array spanning both clouds with OpenStack Identity.
Using Federated Identity, the array talks to both clouds, communicating
with OpenStack Object Storage through the Swift proxy.

For Block Storage, the replication is a little more difficult
and involves tools outside of OpenStack itself.
The OpenStack Block Storage volume is not set as the drive itself
but as a logical object that points to a physical back end.
Disaster recovery for Block Storage is configured with
synchronous backup for the highest level of data protection,
but asynchronous backup could have been set as an alternative
that is not as latency sensitive.
For asynchronous backup, the Block Storage API makes it possible
to export the data and also the metadata of a particular volume,
so that it can be moved and replicated elsewhere.
More information can be found here:
https://blueprints.launchpad.net/cinder/+spec/cinder-backup-volume-metadata-support.
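
A loose sketch of that asynchronous path with the openstacksdk Python
library is shown below; the cloud entry and volume name are placeholders,
and moving the resulting backup and its exported metadata between clouds
is left to the external tooling mentioned above.

.. code-block:: python

   import openstack

   conn = openstack.connect(cloud="cloud1")    # assumed source cloud entry

   # Back up a volume so that its data and metadata can later be moved
   # and replicated to the second cloud by external tooling.
   volume = conn.block_storage.find_volume("app-data")   # placeholder name
   backup = conn.block_storage.create_backup(
       volume_id=volume.id,
       name="app-data-dr",
       description="Asynchronous DR copy for Company C",
   )
   conn.block_storage.wait_for_status(backup, status="available")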

The synchronous backups create an identical volume in both
clouds and choose the appropriate flavor so that each cloud
has an identical back end. This is done by creating volumes
through the CMP. After this is configured, a solution
involving DRBD synchronizes the physical drives.

The database component is backed up using synchronous backups.
MySQL does not support geographically diverse replication,
so disaster recovery is provided by replicating the file itself.
As it is not possible to use Object Storage as the back end of
a database like MySQL, Swift replication is not an option.
Company C decides not to store the data on another geo-tiered
storage system, such as Ceph, as Block Storage.
This would have given another layer of protection.
Another option would have been to store the database on an OpenStack
Block Storage volume and to back it up like any other Block Storage
volume.

Component block diagram
~~~~~~~~~~~~~~~~~~~~~~~
@@ -1,197 +0,0 @@
.. _multisite-cloud:

================
Multi-site cloud
================

Design model
~~~~~~~~~~~~

Component block diagram
~~~~~~~~~~~~~~~~~~~~~~~

Stakeholder
~~~~~~~~~~~

User stories
~~~~~~~~~~~~

There are multiple ways to build a multi-site OpenStack installation,
based on the needs of the intended workloads. Below are example
architectures based on different requirements, which are not hard and
fast rules for deployment. Refer to previous sections to assist in
selecting specific components and implementations based on your needs.

A large content provider needs to deliver content to customers that are
geographically dispersed. The workload is very sensitive to latency and
needs a rapid response to end-users. After reviewing the user, technical,
and operational considerations, the provider determines that it is
beneficial to build a number of regions local to the customer's edge.
Rather than build a few large, centralized data centers, the intent is
to provide a pair of small data centers in locations closer to the
customer. In this use case, spreading out the application allows for a
different kind of horizontal scaling than a traditional compute workload
requires. The intent is to scale by creating more copies of the
application in closer proximity to the users that need it most, in order
to ensure faster response time to user requests. This provider deploys
two data centers at each of the four chosen regions. The implications of
this design are based on the method of placing copies of resources in
each of the remote regions. Swift objects, glance images, and Block
Storage need to be manually replicated into each region. This may be
beneficial for some systems, for example, a content service where only
some of the content needs to exist in some regions. A centralized
Identity service is recommended to manage authentication and access to
the API endpoints.

It is recommended that you install an automated DNS system such as
Designate. Application administrators need a way to manage the mapping
of which application copy exists in each region and how to reach it,
unless an external Dynamic DNS system is available. Designate assists by
making the process automatic and by populating the records in each
region's zone.
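
Sketching this with the openstacksdk DNS support (the zone name, record
data, and cloud entry are all placeholders), each region's deployment
tooling might register its local application copy roughly as follows:

.. code-block:: python

   import openstack

   conn = openstack.connect(cloud="region-one")   # assumed region entry

   # Ensure the application zone exists, then publish the address of the
   # application copy running in this region.
   zone = conn.dns.find_zone("app.example.com.")
   if zone is None:
       zone = conn.dns.create_zone(
           name="app.example.com.",
           email="dns-admin@example.com",
       )

   conn.dns.create_recordset(
       zone,
       name="region-one.app.example.com.",
       type="A",
       records=["203.0.113.10"],   # address of the local copy (example)
       ttl=300,
   )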

Telemetry for each region is also deployed, as each region may grow
differently or be used at a different rate. Ceilometer collects each
region's meters from each of the controllers and reports them back to a
central location. This is useful both to the end user and the
administrator of the OpenStack environment. The end user will find this
method useful, as it makes it possible to determine if certain locations
are experiencing higher load than others, and to take appropriate action.
Administrators also benefit by possibly being able to forecast growth
per region, rather than expanding the capacity of all regions
simultaneously, thereby maximizing the cost-effectiveness of the
multi-site design.

One of the key decisions of running this infrastructure is whether or
not to provide a redundancy model. Two types of redundancy and high
availability models in this configuration can be implemented. The first
type is the availability of central OpenStack components. Keystone can
be made highly available in three central data centers that host the
centralized OpenStack components. This prevents the loss of any one of
the regions from causing an outage in service. It also has the added
benefit of being able to run a central storage repository as a primary
cache for distributing content to each of the regions.

The second redundancy type is the edge data center itself. A second data
center in each of the edge regional locations hosts a second region near
the first region. This ensures that the application does not suffer
degraded performance in terms of latency and availability.

The following figure depicts the solution designed to have both a
centralized set of core data centers for OpenStack services and paired edge
data centers.

**Multi-site architecture example**

.. figure:: ../figures/Multi-Site_Customer_Edge.png

Geo-redundant load balancing example
------------------------------------

A large-scale web application has been designed with cloud principles in
mind. The application is designed to provide service to the application
store on a 24/7 basis. The company has a two-tier architecture with
a web front-end servicing the customer requests, and a NoSQL database back
end storing the information.

Recently there have been several outages in a number of major public
cloud providers due to applications running out of a single geographical
location. The design, therefore, should mitigate the chance of a single
site causing an outage for their business.

The solution would consist of the following OpenStack components:

* A firewall, switches, and load balancers on the public facing network
  connections.

* OpenStack controller services running the Networking service, dashboard,
  Block Storage service, and Compute service locally in each of the three
  regions. Identity service, Orchestration service, Telemetry service, Image
  service, and Object Storage service can be installed centrally, with
  nodes in each of the regions providing a redundant OpenStack
  controller plane throughout the globe.

* OpenStack Compute nodes running the KVM hypervisor.

* OpenStack Object Storage for serving static objects such as images
  can be used to ensure that all images are standardized across all the
  regions, and replicated on a regular basis.

* A distributed DNS service available to all regions that allows for
  dynamic update of DNS records of deployed instances.

* A geo-redundant load balancing service can be used to service the
  requests from the customers based on their origin.

An autoscaling Heat template can be used to deploy the application in
the three regions; a rough deployment sketch follows the list below.
This template includes:

* Web servers running Apache.

* Appropriate ``user_data`` to populate the central DNS servers upon
  instance launch.

* Appropriate Telemetry alarms that maintain the application state
  and allow for handling of region or instance failure.
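
As the deployment sketch referenced above (assuming the openstacksdk
library, a single ``clouds.yaml`` entry, three placeholder region names,
and a pre-written autoscaling template), rolling the same template out to
every region could look like this:

.. code-block:: python

   import openstack

   REGIONS = ["RegionOne", "RegionTwo", "RegionThree"]   # assumed names

   # Deploy the same autoscaling template in every region so each site
   # can serve and scale the application independently.
   for region in REGIONS:
       conn = openstack.connect(cloud="mycloud", region_name=region)
       conn.create_stack(
           name="webapp-{}".format(region.lower()),
           template_file="webapp_autoscaling.yaml",   # assumed template
           wait=True,
       )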

Another autoscaling Heat template can be used to deploy a distributed
MongoDB shard over the three locations, with the option of storing
required data on a globally available swift container. According to the
usage and load on the database server, additional shards can be
provisioned according to the thresholds defined in Telemetry.

Two data centers would have been sufficient had the requirements been
met, but three regions are selected here to avoid abnormal load on a
single region in the event of a failure.

Orchestration is used because of the built-in functionality of
autoscaling and auto healing in the event of increased load. External
configuration management tools, such as Puppet or Chef, could also have
been used in this scenario, but were not chosen since Orchestration had
the appropriate built-in hooks into the OpenStack cloud. In addition,
external tools were not needed since this deployment scenario was
straightforward.

OpenStack Object Storage is used here to serve as a back end for the
Image service since it is the most suitable solution for a globally
distributed storage solution with its own replication mechanism.
Home-grown solutions could also have been used, including the handling
of replication, but were not chosen, because Object Storage is already
an integral part of the infrastructure and a proven solution.

An external load balancing service was used and not the LBaaS in
OpenStack because the solution in OpenStack is not redundant and does
not have any awareness of geo-location.

**Multi-site geo-redundant architecture**

.. figure:: ../figures/Multi-site_Geo_Redundant_LB.png

Local location service example
------------------------------

A common use for multi-site OpenStack deployment is creating a Content
Delivery Network. An application that uses a local location architecture
requires low network latency and proximity to the user to provide an
optimal user experience and reduce the cost of bandwidth and transit.
The content resides on sites closer to the customer, instead of a
centralized content store that requires utilizing higher cost
cross-country links.

This architecture includes a geo-location component that directs user
requests to the closest possible node. In this scenario, 100% redundancy
of content across every site is a goal rather than a requirement, with
the intent to maximize the amount of content available within a minimum
number of network hops for end users. Despite these differences, the
storage replication configuration has significant overlap with that of a
geo-redundant load balancing use case.

In the architecture below, the location-aware application utilizing this
multi-site OpenStack installation launches web server or content serving
instances on the compute cluster in each site. Requests from clients
are first sent to a global services load balancer that determines the location
of the client, then routes the request to the closest OpenStack site where the
application completes the request.

**Multi-site shared keystone architecture**

.. figure:: ../figures/Multi-Site_shared_keystone1.png
@@ -4,17 +4,16 @@
Network virtual function cloud
==============================

Stakeholder
~~~~~~~~~~~

Design model
~~~~~~~~~~~~

Requirements
~~~~~~~~~~~~

Component block diagram
~~~~~~~~~~~~~~~~~~~~~~~

User stories
~~~~~~~~~~~~

Network-focused cloud examples
------------------------------
@@ -1,17 +0,0 @@
.. _public-cloud:

============
Public cloud
============

Stakeholder
~~~~~~~~~~~

User stories
~~~~~~~~~~~~

Design model
~~~~~~~~~~~~

Component block diagram
~~~~~~~~~~~~~~~~~~~~~~~
@@ -16,15 +16,9 @@ discusses three example use cases:

* High performance database

Component block diagram
~~~~~~~~~~~~~~~~~~~~~~~

Stakeholder
~~~~~~~~~~~

User stories
~~~~~~~~~~~~

An object store with a RESTful interface
----------------------------------------

The example below shows a REST interface without a high performance
requirement. The following diagram depicts the example architecture:
@@ -63,6 +57,8 @@ Proxy:

It may be necessary to implement a third-party caching layer for some
applications to achieve suitable performance.
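
To make the REST interface in this example concrete, a small hedged
sketch using the openstacksdk Object Store support follows; the container
name, object name, and cloud entry are placeholders.

.. code-block:: python

   import openstack

   conn = openstack.connect(cloud="mycloud")    # assumed clouds.yaml entry

   # Store and retrieve an object through the Object Storage REST API;
   # the container and object names are placeholders.
   conn.object_store.create_container(name="static-assets")
   with open("logo.png", "rb") as f:
       conn.object_store.upload_object(
           container="static-assets",
           name="logo.png",
           data=f.read(),
       )
   data = conn.object_store.download_object(
       "logo.png", container="static-assets")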


Compute analytics with data processing service
----------------------------------------------

@@ -153,3 +149,10 @@ REST proxy:
Using an SSD cache layer, you can present block devices directly to
hypervisors or instances. The REST interface can also use the SSD cache
systems as an inline cache.
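
A brief sketch of presenting such a device to an instance with the
openstacksdk cloud layer is shown below; the volume type ``ssd``, the
server name, and the cloud entry are assumptions.

.. code-block:: python

   import openstack

   conn = openstack.connect(cloud="mycloud")    # assumed clouds.yaml entry

   # "ssd" is an assumed Block Storage volume type backed by the SSD
   # cache layer; the server name is a placeholder.
   volume = conn.create_volume(size=100, name="scratch-ssd",
                               volume_type="ssd", wait=True)
   server = conn.get_server("analytics-worker-01")
   conn.attach_volume(server, volume, wait=True)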


Requirements
~~~~~~~~~~~~

Component block diagram
~~~~~~~~~~~~~~~~~~~~~~~
@@ -4,13 +4,10 @@
Web scale cloud
===============

Stakeholder
~~~~~~~~~~~

User stories
Design model
~~~~~~~~~~~~

Design model
Requirements
~~~~~~~~~~~~

Component block diagram
@@ -1,35 +0,0 @@
=====================
Specialized use cases
=====================

.. toctree::
   :maxdepth: 2

   specialized-multi-hypervisor.rst
   specialized-networking.rst
   specialized-software-defined-networking.rst
   specialized-desktop-as-a-service.rst
   specialized-openstack-on-openstack.rst
   specialized-hardware.rst
   specialized-single-site.rst

This section describes the architecture and design considerations for the
following specialized use cases:

* :doc:`Specialized networking <specialized-networking>`:
  Running networking-oriented software that may involve reading
  packets directly from the wire or participating in routing protocols.
* :doc:`Software-defined networking (SDN)
  <specialized-software-defined-networking>`:
  Running an SDN controller from within OpenStack
  as well as participating in a software-defined network.
* :doc:`Desktop-as-a-Service <specialized-desktop-as-a-service>`:
  Running a virtualized desktop environment in a private or public cloud.
* :doc:`OpenStack on OpenStack <specialized-openstack-on-openstack>`:
  Building a multi-tiered cloud by running OpenStack
  on top of an OpenStack installation.
* :doc:`Specialized hardware <specialized-hardware>`:
  Using specialized hardware devices from within the OpenStack environment.
* :doc:`specialized-single-site`: Single site architecture with OpenStack
  Networking.