From 7b786609d8fdba4be330b6bb0876dd95f8181d58 Mon Sep 17 00:00:00 2001 From: daz Date: Thu, 17 Nov 2016 15:52:11 +1100 Subject: [PATCH] [arch-design-draft] edit Use Cases chapter Remove user stories and refine use cases per the spec Change-Id: I0df30a414684cdd2a3f21d5910bb3a452795df25 Implements: blueprint arch-guide-restructure-ocata --- doc/arch-design-draft/source/use-cases.rst | 3 - .../specialized-desktop-as-a-service.rst | 47 --- .../source/use-cases/specialized-hardware.rst | 42 --- .../specialized-multi-hypervisor.rst | 80 ----- .../use-cases/specialized-networking.rst | 33 -- .../specialized-openstack-on-openstack.rst | 65 ---- .../use-cases/specialized-single-site.rst | 5 - ...pecialized-software-defined-networking.rst | 46 --- .../source/use-cases/use-case-development.rst | 7 +- .../use-cases/use-case-general-compute.rst | 296 +----------------- .../source/use-cases/use-case-multisite.rst | 197 ------------ .../source/use-cases/use-case-nfv.rst | 7 +- .../source/use-cases/use-case-public.rst | 17 - .../source/use-cases/use-case-storage.rst | 19 +- .../source/use-cases/use-case-web-scale.rst | 7 +- .../use-cases/use-cases-specialized.rst | 35 --- 16 files changed, 22 insertions(+), 884 deletions(-) delete mode 100644 doc/arch-design-draft/source/use-cases/specialized-desktop-as-a-service.rst delete mode 100644 doc/arch-design-draft/source/use-cases/specialized-hardware.rst delete mode 100644 doc/arch-design-draft/source/use-cases/specialized-multi-hypervisor.rst delete mode 100644 doc/arch-design-draft/source/use-cases/specialized-networking.rst delete mode 100644 doc/arch-design-draft/source/use-cases/specialized-openstack-on-openstack.rst delete mode 100644 doc/arch-design-draft/source/use-cases/specialized-single-site.rst delete mode 100644 doc/arch-design-draft/source/use-cases/specialized-software-defined-networking.rst delete mode 100644 doc/arch-design-draft/source/use-cases/use-case-multisite.rst delete mode 100644 doc/arch-design-draft/source/use-cases/use-case-public.rst delete mode 100644 doc/arch-design-draft/source/use-cases/use-cases-specialized.rst diff --git a/doc/arch-design-draft/source/use-cases.rst b/doc/arch-design-draft/source/use-cases.rst index 81d619bf5c..7e0d5959b3 100644 --- a/doc/arch-design-draft/source/use-cases.rst +++ b/doc/arch-design-draft/source/use-cases.rst @@ -10,8 +10,5 @@ Use cases use-cases/use-case-development use-cases/use-case-general-compute use-cases/use-case-web-scale - use-cases/use-case-public use-cases/use-case-storage - use-cases/use-case-multisite use-cases/use-case-nfv - use-cases/use-cases-specialized diff --git a/doc/arch-design-draft/source/use-cases/specialized-desktop-as-a-service.rst b/doc/arch-design-draft/source/use-cases/specialized-desktop-as-a-service.rst deleted file mode 100644 index 6f63a43253..0000000000 --- a/doc/arch-design-draft/source/use-cases/specialized-desktop-as-a-service.rst +++ /dev/null @@ -1,47 +0,0 @@ -==================== -Desktop-as-a-Service -==================== - -Virtual Desktop Infrastructure (VDI) is a service that hosts -user desktop environments on remote servers. This application -is very sensitive to network latency and requires a high -performance compute environment. Traditionally these types of -services do not use cloud environments because few clouds -support such a demanding workload for user-facing applications. -As cloud environments become more robust, vendors are starting -to provide services that provide virtual desktops in the cloud. 
-OpenStack may soon provide the infrastructure for these types of deployments. - -Challenges -~~~~~~~~~~ - -Designing an infrastructure that is suitable to host virtual -desktops is a very different task to that of most virtual workloads. -For example, the design must consider: - -* Boot storms, when a high volume of logins occur in a short period of time -* The performance of the applications running on virtual desktops -* Operating systems and their compatibility with the OpenStack hypervisor - -Broker -~~~~~~ - -The connection broker determines which remote desktop host -users can access. Medium and large scale environments require a broker -since its service represents a central component of the architecture. -The broker is a complete management product, and enables automated -deployment and provisioning of remote desktop hosts. - -Possible solutions -~~~~~~~~~~~~~~~~~~ - -There are a number of commercial products currently available that -provide a broker solution. However, no native OpenStack projects -provide broker services. -Not providing a broker is also an option, but managing this manually -would not suffice for a large scale, enterprise solution. - -Diagram -~~~~~~~ - -.. figure:: ../figures/Specialized_VDI1.png diff --git a/doc/arch-design-draft/source/use-cases/specialized-hardware.rst b/doc/arch-design-draft/source/use-cases/specialized-hardware.rst deleted file mode 100644 index e39d852050..0000000000 --- a/doc/arch-design-draft/source/use-cases/specialized-hardware.rst +++ /dev/null @@ -1,42 +0,0 @@ -==================== -Specialized hardware -==================== - -Certain workloads require specialized hardware devices that -have significant virtualization or sharing challenges. -Applications such as load balancers, highly parallel brute -force computing, and direct to wire networking may need -capabilities that basic OpenStack components do not provide. - -Challenges -~~~~~~~~~~ - -Some applications need access to hardware devices to either -improve performance or provide capabilities that are not -virtual CPU, RAM, network, or storage. These can be a shared -resource, such as a cryptography processor, or a dedicated -resource, such as a Graphics Processing Unit (GPU). OpenStack can -provide some of these, while others may need extra work. - -Solutions -~~~~~~~~~ - -To provide cryptography offloading to a set of instances, -you can use Image service configuration options. -For example, assign the cryptography chip to a device node in the guest. -For further information on this configuration, see `Image service -property keys `_. However, this option allows all guests using the -configured images to access the hypervisor cryptography device. - -If you require direct access to a specific device, PCI pass-through -enables you to dedicate the device to a single instance per hypervisor. -You must define a flavor that has the PCI device specifically in order -to properly schedule instances. -More information regarding PCI pass-through, including instructions for -implementing and using it, is available at -`https://wiki.openstack.org/wiki/Pci_passthrough `_. - -.. 
figure:: ../figures/Specialized_Hardware2.png - :width: 100% diff --git a/doc/arch-design-draft/source/use-cases/specialized-multi-hypervisor.rst b/doc/arch-design-draft/source/use-cases/specialized-multi-hypervisor.rst deleted file mode 100644 index 7dd892acae..0000000000 --- a/doc/arch-design-draft/source/use-cases/specialized-multi-hypervisor.rst +++ /dev/null @@ -1,80 +0,0 @@ -======================== -Multi-hypervisor example -======================== - -A financial company requires its applications migrated -from a traditional, virtualized environment to an API-driven, -orchestrated environment. The new environment needs -multiple hypervisors since many of the company's applications -have strict hypervisor requirements. - -Currently, the company's vSphere environment runs 20 VMware -ESXi hypervisors. These hypervisors support 300 instances of -various sizes. Approximately 50 of these instances must run -on ESXi. The remaining 250 instances have more flexible requirements. - -The financial company decides to manage the -overall system with a common OpenStack platform. - -.. figure:: ../figures/Compute_NSX.png - :width: 100% - -Architecture planning teams decided to run a host aggregate -containing KVM hypervisors for the general purpose instances. -A separate host aggregate targets instances requiring ESXi. - -Images in the OpenStack Image service have particular -hypervisor metadata attached. When a user requests a -certain image, the instance spawns on the relevant aggregate. - -Images for ESXi use the VMDK format. QEMU disk images can be -converted to VMDK, VMFS Flat Disks. These disk images -can also be thin, thick, zeroed-thick, and eager-zeroed-thick. -After exporting a VMFS thin disk from VMFS to the -OpenStack Image service (a non-VMFS location), it becomes a -preallocated flat disk. This impacts the transfer time from the -OpenStack Image service to the data store since transfers require -moving the full preallocated flat disk rather than the thin disk. - -The VMware host aggregate compute nodes communicate with -vCenter rather than spawning directly on a hypervisor. -The vCenter then requests scheduling for the instance to run on -an ESXi hypervisor. - -This functionality requires that VMware Distributed Resource -Scheduler (DRS) is enabled on a cluster and set to **Fully Automated**. -The vSphere requires shared storage because the DRS uses vMotion -which is a service that relies on shared storage. - -This solution to the company's migration uses shared storage -to provide Block Storage capabilities to the KVM instances while -also providing vSphere storage. The new environment provides this -storage functionality using a dedicated data network. The -compute hosts should have dedicated NICs to support the -dedicated data network. vSphere supports OpenStack Block Storage. This -support gives storage from a VMFS datastore to an instance. For the -financial company, Block Storage in their new architecture supports -both hypervisors. - -OpenStack Networking provides network connectivity in this new -architecture, with the VMware NSX plug-in driver configured. Legacy -networking (nova-network) supports both hypervisors in this new -architecture example, but has limitations. Specifically, vSphere -with legacy networking does not support security groups. The new -architecture uses VMware NSX as a part of the design. 
When users launch an -instance within either of the host aggregates, VMware NSX ensures the -instance attaches to the appropriate network overlay-based logical networks. - -.. TODO update example?? - -The architecture planning teams also consider OpenStack Compute integration. -When running vSphere in an OpenStack environment, nova-compute -communications with vCenter appear as a single large hypervisor. -This hypervisor represents the entire ESXi cluster. Multiple nova-compute -instances can represent multiple ESXi clusters. They can connect to -multiple vCenter servers. If the process running nova-compute -crashes, it cuts the connection to the vCenter server. -Any ESXi clusters will stop running, and you will not be able to -provision further instances on the vCenter, even if you enable high -availability. You must monitor the nova-compute service connected -to vSphere carefully for any disruptions as a result of this failure point. diff --git a/doc/arch-design-draft/source/use-cases/specialized-networking.rst b/doc/arch-design-draft/source/use-cases/specialized-networking.rst deleted file mode 100644 index 932a6e92fd..0000000000 --- a/doc/arch-design-draft/source/use-cases/specialized-networking.rst +++ /dev/null @@ -1,33 +0,0 @@ -============================== -Specialized networking example -============================== - -Some applications that interact with a network require -specialized connectivity. For example, applications used in Looking Glass -servers require the ability to connect to a Border Gateway Protocol (BGP) peer, -or route participant applications may need to join a layer-2 network. - -Challenges -~~~~~~~~~~ - -Connecting specialized network applications to their required -resources impacts the OpenStack architecture design. Installations that -rely on overlay networks cannot support a routing participant, and may -also block listeners on a layer-2 network. - -Possible solutions -~~~~~~~~~~~~~~~~~~ - -Deploying an OpenStack installation using OpenStack Networking with a -provider network allows direct layer-2 connectivity to an -upstream networking device. This design provides the layer-2 connectivity -required to communicate through Intermediate System-to-Intermediate System -(ISIS) protocol, or pass packets using an OpenFlow controller. - -Using the multiple layer-2 plug-in with an agent such as -:term:`Open vSwitch` allows a private connection through a VLAN -directly to a specific port in a layer-3 device. This allows a BGP -point-to-point link to join the autonomous system. - -Avoid using layer-3 plug-ins as they divide the broadcast -domain and prevent router adjacencies from forming. diff --git a/doc/arch-design-draft/source/use-cases/specialized-openstack-on-openstack.rst b/doc/arch-design-draft/source/use-cases/specialized-openstack-on-openstack.rst deleted file mode 100644 index 6b6c3a9c19..0000000000 --- a/doc/arch-design-draft/source/use-cases/specialized-openstack-on-openstack.rst +++ /dev/null @@ -1,65 +0,0 @@ -====================== -OpenStack on OpenStack -====================== - -In some cases, users may run OpenStack nested on top -of another OpenStack cloud. This scenario describes how to -manage and provision complete OpenStack environments on instances -supported by hypervisors and servers, which an underlying OpenStack -environment controls. - -Public cloud providers can use this technique to manage the -upgrade and maintenance process on OpenStack environments. 
-Developers and operators testing OpenStack can also use this -technique to provision their own OpenStack environments on -available OpenStack Compute resources. - -Challenges -~~~~~~~~~~ - -The network aspect of deploying a nested cloud is the most -complicated aspect of this architecture. -You must expose VLANs to the physical ports on which the underlying -cloud runs because the bare metal cloud owns all the hardware. -You must also expose them to the nested levels as well. -Alternatively, you can use the network overlay technologies on the -OpenStack environment running on the host OpenStack environment to -provide the software-defined networking for the deployment. - -Hypervisor -~~~~~~~~~~ - -In this example architecture, consider which -approach to provide a nested hypervisor in OpenStack. This decision -influences the operating systems you use for nested OpenStack deployments. - -Possible solutions: deployment -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Deployment of a full stack can be challenging but you can mitigate -this difficulty by creating a Heat template to deploy the -entire stack, or a configuration management system. After creating -the Heat template, you can automate the deployment of additional stacks. - -The OpenStack-on-OpenStack project (:term:`TripleO`) -addresses this issue. Currently, however, the project does -not completely cover nested stacks. For more information, see -https://wiki.openstack.org/wiki/TripleO. - -Possible solutions: hypervisor -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -In the case of running TripleO, the underlying OpenStack -cloud deploys bare-metal Compute nodes. You then deploy -OpenStack on these Compute bare-metal servers with the -appropriate hypervisor, such as KVM. - -In the case of running smaller OpenStack clouds for testing -purposes, where performance is not a critical factor, you can use -QEMU instead. It is also possible to run a KVM hypervisor in an instance -(see http://davejingtian.org/2014/03/30/nested-kvm-just-for-fun/), -though this is not a supported configuration and could be a -complex solution for such a use case. - -.. figure:: ../figures/Specialized_OOO.png - :width: 100% diff --git a/doc/arch-design-draft/source/use-cases/specialized-single-site.rst b/doc/arch-design-draft/source/use-cases/specialized-single-site.rst deleted file mode 100644 index cc8a1a5239..0000000000 --- a/doc/arch-design-draft/source/use-cases/specialized-single-site.rst +++ /dev/null @@ -1,5 +0,0 @@ -================================================== -Single site architecture with OpenStack Networking -================================================== - -.. TODO diff --git a/doc/arch-design-draft/source/use-cases/specialized-software-defined-networking.rst b/doc/arch-design-draft/source/use-cases/specialized-software-defined-networking.rst deleted file mode 100644 index 6c62a62bc6..0000000000 --- a/doc/arch-design-draft/source/use-cases/specialized-software-defined-networking.rst +++ /dev/null @@ -1,46 +0,0 @@ -=========================== -Software-defined networking -=========================== - -Software-defined networking (SDN) is the separation of the data -plane and control plane. SDN is a popular method of -managing and controlling packet flows within networks. -SDN uses overlays or directly controlled layer-2 devices to -determine flow paths, and as such presents challenges to a -cloud environment. Some designers may wish to run their -controllers within an OpenStack installation. 
Others may wish -to have their installations participate in an SDN-controlled network. - -Challenges -~~~~~~~~~~ - -SDN is a relatively new concept that is not yet standardized, -so SDN systems come in a variety of different implementations. -Because of this, a truly prescriptive architecture is not feasible. -Instead, examine the differences between an existing and a planned -OpenStack design and determine where potential conflicts and gaps exist. - -Possible solutions -~~~~~~~~~~~~~~~~~~ - -If an SDN implementation requires layer-2 access because it -directly manipulates switches, we do not recommend running an -overlay network or a layer-3 agent. -If the controller resides within an OpenStack installation, -build an ML2 plugin, and schedule the controller instances -to connect to tenant VLANs so they can talk directly to the switch -hardware. -Alternatively, depending on the external device support, -use a tunnel that terminates at the switch hardware itself. - -Diagram -------- - -OpenStack hosted SDN controller: - -.. figure:: ../figures/Specialized_SDN_hosted.png - -OpenStack participating in an SDN controller network: - -.. figure:: ../figures/Specialized_SDN_external.png - diff --git a/doc/arch-design-draft/source/use-cases/use-case-development.rst b/doc/arch-design-draft/source/use-cases/use-case-development.rst index 89563b98ee..f1ee023330 100644 --- a/doc/arch-design-draft/source/use-cases/use-case-development.rst +++ b/doc/arch-design-draft/source/use-cases/use-case-development.rst @@ -4,13 +4,10 @@ Development cloud ================= -Stakeholder -~~~~~~~~~~~ - -User stories +Design model ~~~~~~~~~~~~ -Design model +Requirements ~~~~~~~~~~~~ Component block diagram diff --git a/doc/arch-design-draft/source/use-cases/use-case-general-compute.rst b/doc/arch-design-draft/source/use-cases/use-case-general-compute.rst index ae5316134e..cb54229a8b 100644 --- a/doc/arch-design-draft/source/use-cases/use-case-general-compute.rst +++ b/doc/arch-design-draft/source/use-cases/use-case-general-compute.rst @@ -7,30 +7,6 @@ General compute cloud Design model ~~~~~~~~~~~~ -Hybrid cloud environments are designed for these use cases: - -* Bursting workloads from private to public OpenStack clouds -* Bursting workloads from private to public non-OpenStack clouds -* High availability across clouds (for technical diversity) - -This chapter provides examples of environments that address -each of these use cases. - - -Component block diagram -~~~~~~~~~~~~~~~~~~~~~~~ - - -Stakeholder -~~~~~~~~~~~ - - -User stories -~~~~~~~~~~~~ - -General cloud example ---------------------- - An online classified advertising company wants to run web applications consisting of Tomcat, Nginx, and MariaDB in a private cloud. To meet the policy requirements, the cloud infrastructure will run in their @@ -113,273 +89,9 @@ control hardware load balance pools and instances as members in these pools, but their use in production environments must be carefully weighed against current stability. - -Compute-focused cloud example ------------------------------ - -The Conseil Européen pour la Recherche Nucléaire (CERN), also known as -the European Organization for Nuclear Research, provides particle -accelerators and other infrastructure for high-energy physics research. - -As of 2011, CERN operated these two compute centers in Europe with plans -to add a third one. 
-
-+-----------------------+------------------------+
-| Data center           | Approximate capacity   |
-+=======================+========================+
-| Geneva, Switzerland   | - 3.5 megawatts        |
-|                       |                        |
-|                       | - 91000 cores          |
-|                       |                        |
-|                       | - 120 PB HDD           |
-|                       |                        |
-|                       | - 100 PB Tape          |
-|                       |                        |
-|                       | - 310 TB Memory        |
-+-----------------------+------------------------+
-| Budapest, Hungary     | - 2.5 megawatts        |
-|                       |                        |
-|                       | - 20000 cores          |
-|                       |                        |
-|                       | - 6 PB HDD             |
-+-----------------------+------------------------+
-
-To support the growing number of compute-heavy users of experiments
-related to the Large Hadron Collider (LHC), CERN ultimately elected to
-deploy an OpenStack cloud using Scientific Linux and RDO. This effort
-aimed to simplify the management of the center's compute resources with
-a view to doubling compute capacity through the addition of a data
-center in 2013 while maintaining the same levels of compute staff.
-
-The CERN solution uses :term:`cells <cell>` for segregation of compute
-resources and for transparently scaling between different data centers.
-This decision meant trading off support for security groups and live
-migration. In addition, some details, such as flavors, must be manually
-replicated across cells. In spite of these drawbacks, cells provide the
-required scale while exposing a single public API endpoint to users.
-
-CERN created a compute cell for each of the two original data centers
-and created a third when it added a new data center in 2013. Each cell
-contains three availability zones to further segregate compute resources
-and at least three RabbitMQ message brokers configured for clustering
-with mirrored queues for high availability.
-
-The API cell, which resides behind an HAProxy load balancer, is in the
-data center in Switzerland and directs API calls to compute cells using
-a customized variation of the cell scheduler. The customizations allow
-certain workloads to route to a specific data center or all data
-centers, with cell RAM availability determining cell selection in the
-latter case.
-
-.. figure:: ../figures/Generic_CERN_Example.png
-
-There is also some customization of the filter scheduler that handles
-placement within the cells:
-
-ImagePropertiesFilter
-    Provides special handling depending on the guest operating system in
-    use (Linux-based or Windows-based).
-
-ProjectsToAggregateFilter
-    Provides special handling depending on which project the instance is
-    associated with.
-
-default_schedule_zones
-    Allows the selection of multiple default availability zones, rather
-    than a single default.
-
-A central database team manages the MySQL database server in each cell
-in an active/passive configuration with a NetApp storage back end.
-Backups run every 6 hours.
-
-Network architecture
-^^^^^^^^^^^^^^^^^^^^
-
-To integrate with existing networking infrastructure, CERN made
-customizations to legacy networking (nova-network). These customizations
-took the form of a driver that integrates with CERN's existing database
-for tracking MAC and IP address assignments.
-
-The driver selects a MAC address and IP for a new instance from the
-pre-registered list associated with the compute node where the scheduler
-places the instance, and then updates the database to reflect that the
-address is assigned to that instance.
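The lookup that such a driver performs is simple to sketch. The following
is a minimal illustration only; the ``pre_registered_addresses`` table, its
schema, and the function name are invented for the example and are not
CERN's actual driver code:

.. code-block:: python

   import sqlite3

   def allocate_address(db_path, compute_node, instance_uuid):
       """Pick a free, pre-registered (MAC, IP) pair for the compute node
       chosen by the scheduler, and record the assignment."""
       db = sqlite3.connect(db_path)
       try:
           row = db.execute(
               "SELECT mac, ip FROM pre_registered_addresses "
               "WHERE compute_node = ? AND instance_uuid IS NULL LIMIT 1",
               (compute_node,)).fetchone()
           if row is None:
               raise LookupError("no free address for %s" % compute_node)
           mac, ip = row
           # Update the central database so it remains the single source
           # of truth for address assignments.
           db.execute(
               "UPDATE pre_registered_addresses SET instance_uuid = ? "
               "WHERE mac = ?", (instance_uuid, mac))
           db.commit()
           return mac, ip
       finally:
           db.close()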
- -Storage architecture -^^^^^^^^^^^^^^^^^^^^ - -CERN deploys the OpenStack Image service in the API cell and configures -it to expose version 1 (V1) of the API. This also requires the image -registry. The storage back end in use is a 3 PB Ceph cluster. - -CERN maintains a small set of Scientific Linux 5 and 6 images onto which -orchestration tools can place applications. Puppet manages instance -configuration and customization. - -Monitoring -^^^^^^^^^^ - -CERN does not require direct billing but uses the Telemetry service to -perform metering for the purposes of adjusting project quotas. CERN uses -a sharded, replicated MongoDB back end. To spread API load, CERN -deploys instances of the nova-api service within the child cells for -Telemetry to query against. This also requires the configuration of -supporting services such as keystone, glance-api, and glance-registry in -the child cells. - -.. figure:: ../figures/Generic_CERN_Architecture.png - -Additional monitoring tools in use include -`Flume `_, `Elastic -Search `_, -`Kibana `_, and the CERN -developed `Lemon `_ -project. +Requirements +~~~~~~~~~~~~ - -Hybrid cloud example: bursting to a public OpenStack cloud -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Company A's data center is running low on capacity. -It is not possible to expand the data center in the foreseeable future. -In order to accommodate the continuously growing need for -development resources in the organization, -Company A decides to use resources in the public cloud. - -Company A has an established data center with a substantial amount -of hardware. Migrating the workloads to a public cloud is not feasible. - -The company has an internal cloud management platform that directs -requests to the appropriate cloud, depending on the local capacity. -This is a custom in-house application written for this specific purpose. - -This solution is depicted in the figure below: - -.. figure:: ../figures/Multi-Cloud_Priv-Pub3.png - :width: 100% - -This example shows two clouds with a Cloud Management -Platform (CMP) connecting them. This guide does not -discuss a specific CMP but describes how the Orchestration and -Telemetry services handle, manage, and control workloads. - -The private OpenStack cloud has at least one controller and at least -one compute node. It includes metering using the Telemetry service. -The Telemetry service captures the load increase and the CMP -processes the information. If there is available capacity, -the CMP uses the OpenStack API to call the Orchestration service. -This creates instances on the private cloud in response to user requests. -When capacity is not available on the private cloud, the CMP issues -a request to the Orchestration service API of the public cloud. -This creates the instance on the public cloud. - -In this example, Company A does not direct the deployments to an -external public cloud due to concerns regarding resource control, -security, and increased operational expense. - -Hybrid cloud example: bursting to a public non-OpenStack cloud -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -The second example examines bursting workloads from the private cloud -into a non-OpenStack public cloud using Amazon Web Services (AWS) -to take advantage of additional capacity and to scale applications. - -The following diagram demonstrates an OpenStack-to-AWS hybrid cloud: - -.. 
figure:: ../figures/Multi-Cloud_Priv-AWS4.png - :width: 100% - -Company B states that its developers are already using AWS -and do not want to change to a different provider. - -If the CMP is capable of connecting to an external cloud -provider with an appropriate API, the workflow process remains -the same as the previous scenario. -The actions the CMP takes, such as monitoring loads and -creating new instances, stay the same. -However, the CMP performs actions in the public cloud -using applicable API calls. - -If the public cloud is AWS, the CMP would use the -EC2 API to create a new instance and assign an Elastic IP. -It can then add that IP to HAProxy in the private cloud. -The CMP can also reference AWS-specific -tools such as CloudWatch and CloudFormation. - -Several open source tool kits for building CMPs are -available and can handle this kind of translation. -Examples include ManageIQ, jClouds, and JumpGate. - -Hybrid cloud example: high availability and disaster recovery -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Company C requires their local data center to be able to -recover from failure. Some of the workloads currently in -use are running on their private OpenStack cloud. -Protecting the data involves Block Storage, Object Storage, -and a database. The architecture supports the failure of -large components of the system while ensuring that the -system continues to deliver services. -While the services remain available to users, the failed -components are restored in the background based on standard -best practice data replication policies. -To achieve these objectives, Company C replicates data to -a second cloud in a geographically distant location. -The following diagram describes this system: - -.. figure:: ../figures/Multi-Cloud_failover2.png - :width: 100% - -This example includes two private OpenStack clouds connected with a CMP. -The source cloud, OpenStack Cloud 1, includes a controller and -at least one instance running MySQL. It also includes at least -one Block Storage volume and one Object Storage volume. -This means that data is available to the users at all times. -The details of the method for protecting each of these sources -of data differs. - -Object Storage relies on the replication capabilities of -the Object Storage provider. -Company C enables OpenStack Object Storage so that it creates -geographically separated replicas that take advantage of this feature. -The company configures storage so that at least one replica -exists in each cloud. In order to make this work, the company -configures a single array spanning both clouds with OpenStack Identity. -Using Federated Identity, the array talks to both clouds, communicating -with OpenStack Object Storage through the Swift proxy. - -For Block Storage, the replication is a little more difficult -and involves tools outside of OpenStack itself. -The OpenStack Block Storage volume is not set as the drive itself -but as a logical object that points to a physical back end. -Disaster recovery is configured for Block Storage for -synchronous backup for the highest level of data protection, -but asynchronous backup could have been set as an alternative -that is not as latency sensitive. -For asynchronous backup, the Block Storage API makes it possible -to export the data and also the metadata of a particular volume, -so that it can be moved and replicated elsewhere. -More information can be found here: -https://blueprints.launchpad.net/cinder/+spec/cinder-backup-volume-metadata-support. 
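As a rough sketch of this asynchronous option, the following uses
python-cinderclient to back up a volume in one cloud and export the backup
record for import into a second cloud. The ``clouds.yaml`` entries, volume
UUID, and backup name are placeholders, and this illustrates the general
API flow rather than Company C's actual tooling:

.. code-block:: python

   import openstack
   from cinderclient import client as cinder_client

   src = openstack.connect(cloud='cloud1')
   dst = openstack.connect(cloud='cloud2')
   cinder_src = cinder_client.Client('3', session=src.session)
   cinder_dst = cinder_client.Client('3', session=dst.session)

   # Back up the volume; force=True allows backing up an in-use volume.
   # In practice, wait for the backup to reach 'available' before the
   # export step.
   backup = cinder_src.backups.create('VOLUME_UUID', name='dr-backup',
                                      force=True)

   # Export the backup record: a small description of where the backup
   # data lives that can be carried to the second cloud.
   record = cinder_src.backups.export_record(backup.id)

   # Import the record so the destination cloud's Block Storage service
   # knows how to restore the volume from the replicated backup data.
   cinder_dst.backups.import_record(record['backup_service'],
                                    record['backup_url'])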
-
-The synchronous backups create an identical volume in both clouds and
-choose the appropriate flavor so that each cloud has an identical back
-end. This is done by creating volumes through the CMP. After this is
-configured, a solution involving DRBD synchronizes the physical drives.
-
-The database component is backed up using synchronous backups. MySQL
-does not support geographically diverse replication, so disaster
-recovery is provided by replicating the file itself. As it is not
-possible to use Object Storage as the back end of a database like
-MySQL, Swift replication is not an option. Company C decides not to
-store the data on another geo-tiered storage system, such as Ceph, as
-Block Storage, although this would have given another layer of
-protection. Another option would have been to store the database on an
-OpenStack Block Storage volume and back it up like any other Block
-Storage volume.
diff --git a/doc/arch-design-draft/source/use-cases/use-case-multisite.rst b/doc/arch-design-draft/source/use-cases/use-case-multisite.rst
deleted file mode 100644
index 99d5fc9b5b..0000000000
--- a/doc/arch-design-draft/source/use-cases/use-case-multisite.rst
+++ /dev/null
@@ -1,197 +0,0 @@
-.. _multisite-cloud:
-
-================
-Multi-site cloud
-================
-
-Design model
-~~~~~~~~~~~~
-
-Component block diagram
-~~~~~~~~~~~~~~~~~~~~~~~
-
-Stakeholder
-~~~~~~~~~~~
-
-User stories
-~~~~~~~~~~~~
-
-There are multiple ways to build a multi-site OpenStack installation,
-based on the needs of the intended workloads. Below are example
-architectures based on different requirements, which are not hard and
-fast rules for deployment. Refer to previous sections to assist in
-selecting specific components and implementations based on your needs.
-
-A large content provider needs to deliver content to customers that
-are geographically dispersed. The workload is very sensitive to
-latency and needs a rapid response to end users. After reviewing the
-user, technical, and operational considerations, it is determined
-beneficial to build a number of regions local to the customer's edge.
-Rather than build a few large, centralized data centers, the intent is
-to provide a pair of small data centers in locations closer to the
-customer. In this use case, spreading out applications allows for a
-different kind of horizontal scaling than a traditional compute
-workload requires. The intent is to scale by creating more copies of
-the application in closer proximity to the users that need it most, in
-order to ensure faster response time to user requests. This provider
-deploys two data centers at each of the four chosen regions. The
-implications of this design are based on the method of placing copies
-of resources in each of the remote regions. Swift objects, glance
-images, and Block Storage need to be manually replicated into each
-region. This may be beneficial for some systems, for example, a
-content service where only some of the content needs to exist in some
-regions. A centralized Identity service is recommended to manage
-authentication and access to the API endpoints.
-
-It is recommended that you install an automated DNS system such as
-Designate. Application administrators need a way to manage the mapping
-of which application copy exists in each region and how to reach it,
-unless an external Dynamic DNS system is available. Designate assists
-by making the process automatic and by populating the records in each
-region's zone.
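A brief openstacksdk sketch of the kind of record management Designate
automates appears below; the cloud name, zone, and address are
placeholders chosen for illustration:

.. code-block:: python

   import openstack

   conn = openstack.connect(cloud='region-east')

   # Find the zone that holds the application's records, creating it on
   # first use.
   zone = conn.dns.find_zone('example.com.')
   if zone is None:
       zone = conn.dns.create_zone(name='example.com.',
                                   email='hostmaster@example.com')

   # Publish the region-local copy of the application as an A record.
   conn.dns.create_recordset(zone,
                             name='app.example.com.',
                             type='A',
                             ttl=300,
                             records=['203.0.113.10'])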
-
-Telemetry for each region is also deployed, as each region may grow
-differently or be used at a different rate. Ceilometer collects each
-region's meters from each of the controllers and reports them back to
-a central location. This is useful both to the end user and the
-administrator of the OpenStack environment. The end user will find
-this method useful because it makes it possible to determine whether
-certain locations are experiencing higher load than others, and to
-take appropriate action. Administrators also benefit because they can
-forecast growth per region, rather than expanding the capacity of all
-regions simultaneously, thereby maximizing the cost-effectiveness of
-the multi-site design.
-
-One of the key decisions in running this infrastructure is whether or
-not to provide a redundancy model. Two types of redundancy and high
-availability model can be implemented in this configuration. The first
-type is the availability of central OpenStack components. Keystone can
-be made highly available in three central data centers that host the
-centralized OpenStack components. This prevents the loss of any one
-region from causing an outage in service. It also has the added
-benefit of being able to run a central storage repository as a primary
-cache for distributing content to each of the regions.
-
-The second redundancy type is the edge data center itself. A second
-data center in each of the edge regional locations hosts a second
-region near the first region. This ensures that the application does
-not suffer degraded performance in terms of latency and availability.
-
-The following figure depicts the solution designed to have both a
-centralized set of core data centers for OpenStack services and paired
-edge data centers.
-
-**Multi-site architecture example**
-
-.. figure:: ../figures/Multi-Site_Customer_Edge.png
-
-Geo-redundant load balancing example
-------------------------------------
-
-A large-scale web application has been designed with cloud principles
-in mind. The application is designed to provide service to the
-application store on a 24/7 basis. The company has a two-tier
-architecture with a web front end servicing the customer requests, and
-a NoSQL database back end storing the information.
-
-Recently there have been several outages in a number of major public
-cloud providers due to applications running out of a single
-geographical location. The design, therefore, should mitigate the
-chance of a single site causing an outage for the business.
-
-The solution would consist of the following OpenStack components:
-
-* A firewall, switches, and load balancers on the public-facing
-  network connections.
-
-* OpenStack controller services running the Networking service,
-  dashboard, Block Storage service, and Compute service locally in
-  each of the three regions. The Identity service, Orchestration
-  service, Telemetry service, Image service, and Object Storage
-  service can be installed centrally, with nodes in each of the
-  regions providing a redundant OpenStack controller plane throughout
-  the globe.
-
-* OpenStack Compute nodes running the KVM hypervisor.
-
-* OpenStack Object Storage for serving static objects such as images,
-  used to ensure that all images are standardized across all the
-  regions and replicated on a regular basis.
-
-* A distributed DNS service available to all regions that allows for
-  dynamic update of DNS records of deployed instances.
-
-* A geo-redundant load balancing service that can be used to service
-  customer requests based on their origin.
-
-An autoscaling Heat template can be used to deploy the application in
-the three regions. This template includes:
-
-* Web servers running Apache.
-
-* Appropriate ``user_data`` to populate the central DNS servers upon
-  instance launch.
-
-* Appropriate Telemetry alarms that maintain the application state and
-  allow for handling of region or instance failure.
-
-Another autoscaling Heat template can be used to deploy a distributed
-MongoDB shard over the three locations, with the option of storing
-required data in a globally available swift container. According to
-the usage of and load on the database server, additional shards can be
-provisioned according to the thresholds defined in Telemetry.
-
-Two data centers would have been sufficient to meet the requirements,
-but three regions are selected here so that, in the event of a
-failure, no single region has to absorb an abnormal load.
-
-Orchestration is used because of its built-in autoscaling and
-auto-healing functionality in the event of increased load. External
-configuration management tools, such as Puppet or Chef, could also
-have been used in this scenario, but were not chosen because
-Orchestration had the appropriate built-in hooks into the OpenStack
-cloud, and external tools were not needed for such a straightforward
-deployment scenario.
-
-OpenStack Object Storage is used here as a back end for the Image
-service, since it is the most suitable option for globally distributed
-storage with its own replication mechanism. Home-grown solutions could
-also have been used, including the handling of replication, but were
-not chosen because Object Storage is already an integral part of the
-infrastructure and a proven solution.
-
-An external load balancing service is used rather than OpenStack
-LBaaS, because the OpenStack solution is not redundant and has no
-awareness of geographic location.
-
-**Multi-site geo-redundant architecture**
-
-.. figure:: ../figures/Multi-site_Geo_Redundant_LB.png
-
-Local location service example
-------------------------------
-
-A common use for a multi-site OpenStack deployment is creating a
-Content Delivery Network. An application that uses a local location
-architecture requires low network latency and proximity to the user to
-provide an optimal user experience and reduce the cost of bandwidth
-and transit. The content resides on sites closer to the customer,
-instead of a centralized content store that requires utilizing
-higher-cost cross-country links.
-
-This architecture includes a geo-location component that routes user
-requests to the closest possible node. In this scenario, 100%
-redundancy of content across every site is a goal rather than a
-requirement, with the intent to maximize the amount of content
-available within a minimum number of network hops for end users.
-Despite these differences, the storage replication configuration has
-significant overlap with that of a geo-redundant load balancing use
-case.
-
-In the architecture below, a location-aware application running on
-this multi-site OpenStack installation launches web server or
-content-serving instances on the compute cluster in each site, as
-sketched in the following example.
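This is a minimal openstacksdk sketch under assumed names: the region
list, image, flavor, and network names are placeholders that would come
from the actual deployment:

.. code-block:: python

   import openstack

   REGIONS = ['site-east', 'site-west', 'site-central']

   for region in REGIONS:
       conn = openstack.connect(cloud='multisite', region_name=region)
       image = conn.compute.find_image('content-server-image')
       flavor = conn.compute.find_flavor('m1.medium')
       network = conn.network.find_network('content-net')
       server = conn.compute.create_server(
           name='content-%s' % region,
           image_id=image.id,
           flavor_id=flavor.id,
           networks=[{'uuid': network.id}])
       # Wait for ACTIVE before registering the instance with the
       # global services load balancer.
       conn.compute.wait_for_server(server)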
Requests from clients -are first sent to a global services load balancer that determines the location -of the client, then routes the request to the closest OpenStack site where the -application completes the request. - -**Multi-site shared keystone architecture** - -.. figure:: ../figures/Multi-Site_shared_keystone1.png diff --git a/doc/arch-design-draft/source/use-cases/use-case-nfv.rst b/doc/arch-design-draft/source/use-cases/use-case-nfv.rst index 30593307e8..2f4042e59f 100644 --- a/doc/arch-design-draft/source/use-cases/use-case-nfv.rst +++ b/doc/arch-design-draft/source/use-cases/use-case-nfv.rst @@ -4,17 +4,16 @@ Network virtual function cloud ============================== -Stakeholder -~~~~~~~~~~~ Design model ~~~~~~~~~~~~ +Requirements +~~~~~~~~~~~~ + Component block diagram ~~~~~~~~~~~~~~~~~~~~~~~ -User stories -~~~~~~~~~~~~ Network-focused cloud examples ------------------------------ diff --git a/doc/arch-design-draft/source/use-cases/use-case-public.rst b/doc/arch-design-draft/source/use-cases/use-case-public.rst deleted file mode 100644 index 10c48447d4..0000000000 --- a/doc/arch-design-draft/source/use-cases/use-case-public.rst +++ /dev/null @@ -1,17 +0,0 @@ -.. _public-cloud: - -============ -Public cloud -============ - -Stakeholder -~~~~~~~~~~~ - -User stories -~~~~~~~~~~~~ - -Design model -~~~~~~~~~~~~ - -Component block diagram -~~~~~~~~~~~~~~~~~~~~~~~ diff --git a/doc/arch-design-draft/source/use-cases/use-case-storage.rst b/doc/arch-design-draft/source/use-cases/use-case-storage.rst index 512a2f503a..b18c1a0655 100644 --- a/doc/arch-design-draft/source/use-cases/use-case-storage.rst +++ b/doc/arch-design-draft/source/use-cases/use-case-storage.rst @@ -16,15 +16,9 @@ discusses three example use cases: * High performance database -Component block diagram -~~~~~~~~~~~~~~~~~~~~~~~ - -Stakeholder -~~~~~~~~~~~ - -User stories -~~~~~~~~~~~~ +An object store with a RESTful interface +---------------------------------------- The example below shows a REST interface without a high performance requirement. The following diagram depicts the example architecture: @@ -63,6 +57,8 @@ Proxy: It may be necessary to implement a third party caching layer for some applications to achieve suitable performance. + + Compute analytics with data processing service ---------------------------------------------- @@ -153,3 +149,10 @@ REST proxy: Using an SSD cache layer, you can present block devices directly to hypervisors or instances. The REST interface can also use the SSD cache systems as an inline cache. + + +Requirements +~~~~~~~~~~~~ + +Component block diagram +~~~~~~~~~~~~~~~~~~~~~~~ diff --git a/doc/arch-design-draft/source/use-cases/use-case-web-scale.rst b/doc/arch-design-draft/source/use-cases/use-case-web-scale.rst index 4a0c4f0f10..170d187bf0 100644 --- a/doc/arch-design-draft/source/use-cases/use-case-web-scale.rst +++ b/doc/arch-design-draft/source/use-cases/use-case-web-scale.rst @@ -4,13 +4,10 @@ Web scale cloud =============== -Stakeholder -~~~~~~~~~~~ - -User stories +Design model ~~~~~~~~~~~~ -Design model +Requirements ~~~~~~~~~~~~ Component block diagram diff --git a/doc/arch-design-draft/source/use-cases/use-cases-specialized.rst b/doc/arch-design-draft/source/use-cases/use-cases-specialized.rst deleted file mode 100644 index b321a85bca..0000000000 --- a/doc/arch-design-draft/source/use-cases/use-cases-specialized.rst +++ /dev/null @@ -1,35 +0,0 @@ -===================== -Specialized use cases -===================== - -.. 
toctree::
-   :maxdepth: 2
-
-   specialized-multi-hypervisor.rst
-   specialized-networking.rst
-   specialized-software-defined-networking.rst
-   specialized-desktop-as-a-service.rst
-   specialized-openstack-on-openstack.rst
-   specialized-hardware.rst
-   specialized-single-site.rst
-
-
-This section describes the architecture and design considerations for the
-following specialized use cases:
-
-* :doc:`Specialized networking <specialized-networking>`:
-  Running networking-oriented software that may involve reading
-  packets directly from the wire or participating in routing protocols.
-* :doc:`Software-defined networking (SDN)
-  <specialized-software-defined-networking>`:
-  Running an SDN controller from within OpenStack
-  as well as participating in a software-defined network.
-* :doc:`Desktop-as-a-Service <specialized-desktop-as-a-service>`:
-  Running a virtualized desktop environment in a private or public cloud.
-* :doc:`OpenStack on OpenStack <specialized-openstack-on-openstack>`:
-  Building a multi-tiered cloud by running OpenStack
-  on top of an OpenStack installation.
-* :doc:`Specialized hardware <specialized-hardware>`:
-  Using specialized hardware devices from within the OpenStack environment.
-* :doc:`specialized-single-site`: Single site architecture with OpenStack
-  Networking.