Adds starting point for Architecture and Design Guide

The areas that still need work are:
- needs double-checking for tables
- see http://docs.openstack.org/arc/OpenStackArchitectureDesignGuide.epub
  for intended structure

Co-Authored-By: Nick Chase <nchase@mirantis.com>
Co-Authored-By: Beth Cohen <beth.cohen@verizon.com>
Co-Authored-By: Sean Collins <sean_collins2@cable.comcast.com>
Co-Authored-By: Steve Gordon <sgordon@redhat.com>
Co-Authored-By: Sebastian Gutierrez <segutier@redhat.com>
Co-Authored-By: Kevin Jackson <Kevin.Jackson@rackspace.co.uk>
Co-Authored-By: Scott Lowe <slowe@vmware.com>
Co-Authored-By: Maish Saidel-Keesing <msaidelk@cisco.com>
Co-Authored-By: Alexandra Settle <alexandra.settle@rackspace.com>
Co-Authored-By: Vinny Valdez <vvaldez@redhat.com>
Co-Authored-By: Anthony Veiga <Anthony_Veiga@cable.comcast.com>
Co-Authored-By: Sean Winn <sean.winn@cloudscaling.com>

Change-Id: Ia0ca278cd5d2d0ee67b9b7528870c1a2a80fdadf
Anne Gentle 2014-07-17 15:50:40 -05:00 committed by Andreas Jaeger
parent 61e4a39c4f
commit 483b337b9e
111 changed files with 10567 additions and 0 deletions

View File

@ -0,0 +1,75 @@
<?xml version="1.0" encoding="UTF-8"?>
<book xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="openstack-compute-admin-manual-grizzly">
<title>OpenStack Architecture Design Guide</title>
<?rax title.font.size="28px" subtitle.font.size="28px"?>
<titleabbrev>Architecture Guide</titleabbrev>
<info>
<author>
<personname>
<firstname/>
<surname/>
</personname>
<affiliation>
<orgname>OpenStack Foundation</orgname>
</affiliation>
</author>
<copyright>
<year>2014</year>
<holder>OpenStack Foundation</holder>
</copyright>
<releaseinfo>current</releaseinfo>
<productname>OpenStack</productname>
<pubdate/>
<legalnotice role="apache2">
<annotation>
<remark>Copyright details are filled in by the
template.</remark>
</annotation>
</legalnotice>
<legalnotice role="cc-by-sa">
<annotation>
<remark>Remaining licensing details are filled in by
the template.</remark>
</annotation>
</legalnotice>
<abstract>
<para>To reap the benefits of OpenStack, you should
plan, design, and architect your cloud properly,
taking users' needs into account and understanding the
use cases.</para>
</abstract>
<revhistory>
<!-- ... continue adding more revisions here as you change this document using the markup shown below... -->
<revision>
<date>2014-07-21</date>
<revdescription>
<itemizedlist>
<listitem>
<para>Initial release.</para>
</listitem>
</itemizedlist>
</revdescription>
</revision>
</revhistory>
</info>
<!-- Chapters are referred from the book file through these
include statements. You can add additional chapters using
these types of statements. -->
<xi:include href="../common/ch_preface.xml"/>
<xi:include href="ch_introduction.xml"/>
<xi:include href="ch_generalpurpose.xml"/>
<xi:include href="ch_compute_focus.xml"/>
<xi:include href="ch_storage_focus.xml"/>
<xi:include href="ch_network_focus.xml"/>
<xi:include href="ch_multi_site.xml"/>
<xi:include href="ch_hybrid.xml"/>
<xi:include href="ch_massively_scalable.xml"/>
<xi:include href="ch_specialized.xml"/>
<xi:include href="ch_references.xml"/><!--
<xi:include href="ch_glossary.xml"/>-->
<xi:include href="../common/app_support.xml"/>
</book>

View File

@ -0,0 +1,16 @@
<?xml version="1.0" encoding="UTF-8"?>
<chapter xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="compute_focus">
<title>Compute Focused</title>
<xi:include href="compute_focus/section_introduction_compute_focus.xml"/>
<xi:include href="compute_focus/section_user_requirements_compute_focus.xml"/>
<xi:include href="compute_focus/section_tech_considerations_compute_focus.xml"/>
<xi:include href="compute_focus/section_operational_considerations_compute_focus.xml"/>
<xi:include href="compute_focus/section_architecture_compute_focus.xml"/>
<xi:include href="compute_focus/section_prescriptive_examples_compute_focus.xml"/>
</chapter>

View File

@ -0,0 +1,16 @@
<?xml version="1.0" encoding="UTF-8"?>
<chapter xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="generalpurpose">
<title>General Purpose</title>
<xi:include href="generalpurpose/section_introduction_generalpurpose.xml"/>
<xi:include href="generalpurpose/section_user_requirements_general_purpose.xml"/>
<xi:include href="generalpurpose/section_tech_considerations_general_purpose.xml"/>
<xi:include href="generalpurpose/section_operational_considerations_general_purpose.xml"/>
<xi:include href="generalpurpose/section_architecture_general_purpose.xml"/>
<xi:include href="generalpurpose/section_prescriptive_example_general_purpose.xml"/>
</chapter>

View File

@ -0,0 +1,580 @@
<?xml version="1.0" encoding="UTF-8"?>
<chapter xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="arch-design-glossary">
<title>Glossary</title>
<itemizedlist>
<listitem>
<para>6to4 - A mechanism that allows IPv6 packets to be
transmitted over an IPv4 network, providing a strategy
for migrating to IPv6.</para>
</listitem>
<listitem>
<para>AAA - authentication, authorization and
auditing.</para>
</listitem>
<listitem>
<para>Anycast - A network routing methodology that routes
traffic from a single sender to the nearest node, in a
pool of nodes.</para>
</listitem>
<listitem>
<para>ARP - Address Resolution Protocol - the protocol by
which layer 3 IP addresses are resolved into layer 2,
link local addresses.</para>
</listitem>
<listitem>
<para>BGP - Border Gateway Protocol is a dynamic routing
protocol that connects autonomous systems together.
Considered the backbone of the Internet, this protocol
connects disparate networks together to form a larger
network.</para>
</listitem>
<listitem>
<para>Boot Storm - When hundreds of users log in and
consume resources at the same time, causing
significant performance degradation. This problem is
particularly common in Virtual Desktop Infrastructure
(VDI) environments.</para>
</listitem>
<listitem>
<para>Broadcast Domain - The layer 2 segment shared by a
group of network connected nodes.</para>
</listitem>
<listitem>
<para>Bursting - The practice of utilizing a secondary
environment to elastically build instances on-demand
when the primary environment is resource
constrained.</para>
</listitem>
<listitem>
            <para>Capital Expenditure (CapEx) - A capital expense,
                or CapEx, is an initial cost for building a
                product, business, or system.</para>
</listitem>
<listitem>
<para>Cascading Failure - A scenario where a single
failure in a system creates a cascading effect, where
other systems fail as load is transferred from the
failing system.</para>
</listitem>
<listitem>
<para>CDN - Content delivery network - a specialized
network that is used to distribute content to clients,
typically located close to the client for increased
performance.</para>
</listitem>
<listitem>
<para>Cells - An OpenStack Compute (Nova) feature, where a
compute deployment can be split into smaller clusters
or cells with their own queue and database for
performance and scalability, while still providing a
single API endpoint.</para>
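                <para>As an illustrative sketch only, a child cell
                    can be enabled through the
                    <literal>[cells]</literal> section of
                    <filename>nova.conf</filename>; the cell name
                    below is a hypothetical example:</para>
                <programlisting language="ini"># Minimal cells sketch for a child (compute) cell;
# "cell1" is an example name, not a required value.
[cells]
enable = True
name = cell1
cell_type = compute</programlisting>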
</listitem>
<listitem>
<para>CI/CD - Continuous Integration / Continuous
Deployment, a methodology where software is
continually built and unit tests run for each change
that is merged, or proposed for merge. Continuous
Deployment is a software development methodology where
changes are deployed into production as they are
merged into source control, rather than being
collected into a release and deployed at regular
                intervals.</para>
</listitem>
<listitem>
<para>Cloud Broker - A cloud broker is a third-party
individual or business that acts as an intermediary
between the purchaser of a cloud computing service and
the sellers of that service. In general, a broker is
someone who acts as an intermediary between two or
more parties during negotiations.</para>
</listitem>
<listitem>
<para>Cloud Consumer - User that consumes cloud instances,
storage, or other resources in a cloud environment.
This user interacts with OpenStack or other cloud
management tools.</para>
</listitem>
<listitem>
<para>Cloud Management Platform (CMP) - Products that
provide a common interface to manage multiple cloud
environments or platforms.</para>
</listitem>
<listitem>
<para>Connection Broker - In desktop virtualization, a
connection broker is a software program that allows
the end-user to connect to an available
desktop.</para>
</listitem>
<listitem>
<para>Direct Attached Storage (DAS) - Data storage that is
directly connected to a machine.</para>
</listitem>
<listitem>
<para>DefCore - DefCore sets base requirements by defining
capabilities, code and must-pass tests for all
OpenStack products. This definition uses community
resources and involvement to drive interoperability by
creating the minimum standards for products labeled
"OpenStack." See
https://wiki.openstack.org/wiki/Governance/CoreDefinition
for more information.</para>
</listitem>
<listitem>
<para>Desktop as a Service (DaaS) - A platform that
provides a suite of desktop environments that users
                may log in to receive a desktop experience from any
                location. This may provide general use, development,
                or even homogeneous testing environments.</para>
</listitem>
<listitem>
<para>Direct Server Return - A technique in load balancing
where an initial request is routed through a load
balancer, and the reply is sent from the responding
node directly to the requester.</para>
</listitem>
<listitem>
<para>Denial of Service (DoS) - In computing, a
denial-of-service or distributed denial-of-service
attack is an attempt to make a machine or network
resource unavailable to its intended users.</para>
</listitem>
<listitem>
            <para>Distributed Replicated Block Device (DRBD) - A
                distributed replicated storage system for the Linux
                platform.</para>
</listitem>
<listitem>
<para>Differentiated Service Code Point (DSCP) - Defined
in RFC 2474, this field in IPv4 and IPv6 headers is
used to define classes of network traffic, for quality
of service purposes.</para>
</listitem>
<listitem>
<para>External Border Gateway Protocol (eBGP) - External
Border Gateway Protocol describes a specific
implementation of BGP designed for inter-autonomous
                system communication.</para>
</listitem>
<listitem>
<para>Elastic IP - An Amazon Web Services concept, which
is an IP address that can be dynamically allocated and
re-assigned to running instances on the fly. The
OpenStack equivalent is a Floating IP.</para>
</listitem>
<listitem>
<para>Encapsulation - The practice of placing one packet
type within another for the purposes of abstracting or
securing data. Examples include GRE, MPLS, or
IPSEC.</para>
</listitem>
<listitem>
            <para>External Cloud - A cloud environment that exists
                outside of the control of an organization. Used in
                the context of hybrid cloud to indicate a public
                cloud or an off-site hosted cloud.</para>
</listitem>
<listitem>
            <para>Federated Cloud - A federated cloud describes
                multiple sets of cloud resources, for example compute
or storage, that are managed by a centralized
endpoint.</para>
</listitem>
<listitem>
<para>Flow - A series of packets that are stateful in
nature and represent a session. Usually represented by
a TCP stream, but can also indicate other packet types
that when combined comprise a connection between two
points.</para>
</listitem>
<listitem>
<para>Golden Image - An operating system image that
contains a set of pre-installed software packages and
configurations. This may be used to build standardized
instances that have the same base set of configuration
                to improve mean time to a functional
                application.</para>
</listitem>
<listitem>
<para>Graphics Processing Unit (GPU) - A single chip
processor with integrated transform, lighting,
triangle setup/clipping, and rendering engines that is
capable of processing a minimum of 10 million polygons
per second. Traditional uses are any compute problem
that can be represented as a vector or matrix
operation.</para>
</listitem>
<listitem>
<para>Hadoop Distributed File System (HDFS) - A
distributed file-system that stores data on commodity
machines, providing very high aggregate bandwidth
across the cluster.</para>
</listitem>
<listitem>
<para>High Availability (HA) - High availability system
design approach and associated service implementation
that ensures a prearranged level of operational
performance will be met during a contractual
measurement period.</para>
</listitem>
<listitem>
<para>High Performance Computing (HPC) - Also known as
distributed computing - used for computation intensive
                processes run on a large number of
                instances.</para>
</listitem>
<listitem>
<para>Hierarchical Storage Management (HSM) - Hierarchical
storage management is a data storage technique, which
automatically moves data between high-cost and
                low-cost storage media.</para>
</listitem>
<listitem>
<para>Hot Standby Router Protocol (HSRP) - Hot Standby
Router Protocol is a Cisco proprietary redundancy
protocol for establishing a fault-tolerant default
gateway, and has been described in detail in RFC
2281.</para>
</listitem>
<listitem>
<para>Hybrid Cloud - Hybrid cloud is a composition of two
or more clouds (private, community or public) that
remain distinct entities but are bound together,
offering the benefits of multiple deployment models.
Hybrid cloud can also mean the ability to connect
colocation, managed and/or dedicated services with
cloud resources.</para>
</listitem>
<listitem>
<para>Interior Border Gateway Protocol (iBGP) - Interior
                Border Gateway Protocol is an interior gateway
protocol designed to exchange routing and reachability
information within autonomous systems.</para>
</listitem>
<listitem>
<para>Interior Gateway Protocol (IGP) - An Interior
Gateway Protocol is a type of protocol used for
exchanging routing information between gateways
(commonly routers) within an Autonomous System (for
example, a system of corporate local area networks).
This routing information can then be used to route
network-level protocols like IP.</para>
</listitem>
<listitem>
<para>Input/Output Operations Per Second (IOPS) - A common
performance measurement used to benchmark computer
storage devices like hard disk drives, solid state
drives, and storage area networks.</para>
</listitem>
<listitem>
<para>jClouds - An open source multi-cloud toolkit for the
Java platform that gives you the freedom to create
applications that are portable across clouds while
giving you full control to use cloud-specific
features.</para>
</listitem>
<listitem>
            <para>Jitter - The deviation from true periodicity of a
presumed periodic signal in electronics and
telecommunications, often in relation to a reference
clock source.</para>
</listitem>
<listitem>
<para>Jumbo Frame - Ethernet frames with more than 1500
bytes of payload.</para>
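            <para>For example, on a Linux host an interface can be
                configured for jumbo frames with a command along
                these lines (the interface name is an
                example):</para>
            <programlisting language="bash"># Enable jumbo frames (9000-byte payload) on eth0
ip link set dev eth0 mtu 9000</programlisting>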
</listitem>
<listitem>
<para>Kernel-based Virtual Machine (KVM) - A full
virtualization solution for Linux on x86 hardware
containing virtualization extensions (Intel VT or
                AMD-V). It consists of a loadable kernel module that
                provides the core virtualization infrastructure, and
                a processor-specific module.</para>
</listitem>
<listitem>
<para>LAG - Link aggregation group is a term to describe
various methods of combining (aggregating) multiple
network connections in parallel into a group to
increase throughput beyond what a single connection
could sustain, and to provide redundancy in case one
                of the links fails.</para>
</listitem>
<listitem>
<para>Layer 2 - The data link layer provides a reliable
link between two directly connected nodes, by
detecting and possibly correcting errors that may
occur in the physical layer.</para>
</listitem>
<listitem>
<para>Layer 3 - The network layer provides the functional
and procedural means of transferring variable length
data sequences (called datagrams) from one node to
another connected to the same network.</para>
</listitem>
<listitem>
<para>Legacy System - An old method, technology, computer
system, or application program that is considered
outdated.</para>
</listitem>
<listitem>
<para>Looking Glass - A tool that provides information on
backbone routing and network efficiency.</para>
</listitem>
<listitem>
<para>Microsoft Azure - A cloud computing platform and
infrastructure, created by Microsoft, for building,
deploying and managing applications and services
through a global network of Microsoft-managed
datacenters.</para>
</listitem>
<listitem>
<para>MongoDB - A cross-platform document-oriented
database. Classified as a NoSQL database, MongoDB
eschews the traditional table-based relational
database structure in favor of JSON-like documents
with dynamic schemas.</para>
</listitem>
<listitem>
            <para>Mean Time Between Failures (MTBF) - Mean time
                between failures is the predicted elapsed time
                between inherent
failures of a system during operation. MTBF can be
calculated as the arithmetic mean (average) time
between failures of a system.</para>
</listitem>
<listitem>
<para>Maximum Transmission Unit (MTU) - The maximum
transmission unit of a communications protocol of a
layer is the size (in bytes) of the largest protocol
data unit that the layer can pass onwards.</para>
</listitem>
<listitem>
<para>NAT64 - NAT64 is a mechanism to allow IPv6 hosts to
communicate with IPv4 servers. The NAT64 server is the
endpoint for at least one IPv4 address and an IPv6
                network segment of 32 bits.</para>
</listitem>
<listitem>
<para>Network Functions Virtualization (NFV) - Network
Functions Virtualization is a network architecture
concept that proposes using IT virtualization related
                technologies to virtualize entire classes of network
node functions into building blocks that may be
connected, or chained, together to create
communication services.</para>
</listitem>
<listitem>
<para>NoSQL - A NoSQL or Not Only SQL database provides a
mechanism for storage and retrieval of data that is
modeled in means other than the tabular relations used
in relational databases.</para>
</listitem>
<listitem>
<para>Open vSwitch - Open vSwitch is a production quality,
multilayer virtual switch licensed under the open
source Apache 2.0 license. It is designed to enable
massive network automation through programmatic
extension, while still supporting standard management
interfaces and protocols (e.g. NetFlow, sFlow, SPAN,
RSPAN, CLI, LACP, 802.1ag).</para>
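            <para>As a brief illustration, a bridge can be created
                and a physical interface attached with the
                <command>ovs-vsctl</command> utility; the bridge and
                interface names below are example values:</para>
            <programlisting language="bash"># Create an integration bridge and attach a physical NIC
ovs-vsctl add-br br-int
ovs-vsctl add-port br-int eth1</programlisting>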
</listitem>
<listitem>
<para>Operational Expenditure (OPEX) - An operating
expense, operating expenditure, operational expense,
operational expenditure or OPEX is an ongoing cost for
running a product, business, or system.</para>
</listitem>
<listitem>
            <para>Original Design Manufacturer (ODM) - A company
                that designs and manufactures a product that is
                specified and eventually branded by another firm for
                sale.</para>
</listitem>
<listitem>
<para>Overlay Network - An overlay network is a computer
network which is built on the top of another network.
Nodes in the overlay can be thought of as being
connected by virtual or logical links, each of which
corresponds to a path, perhaps through many physical
links, in the underlying network.</para>
</listitem>
<listitem>
<para>Packet Storm - A cause of degraded service or
failure that occurs when a network system is
overwhelmed by continuous multicast or broadcast
traffic.</para>
</listitem>
<listitem>
<para>Platform as a Service (PaaS) - Platform as a Service
is a category of cloud computing services that
provides a computing platform and a solution stack as
a service.</para>
</listitem>
<listitem>
<para>Power Usage Effectiveness (PUE) - Power usage
effectiveness is a measure of how efficiently a
computer data center uses energy; specifically, how
much energy is used by the computing equipment (in
contrast to cooling and other overhead).</para>
</listitem>
<listitem>
<para>Quality of Service (QoS) - Quality of Service is the
overall performance of a telephony or computer
network, particularly the performance seen by the
users of the network.</para>
</listitem>
<listitem>
<para>Remote Desktop Host - A server that hosts Remote
Applications as session-based desktops. Users can
access a Remote Desktop Host server by using the
Remote Desktop Connection client.</para>
</listitem>
<listitem>
            <para>Renumbering - Network renumbering is the exercise
                of changing the IP host addresses, and perhaps the
                network mask, of each device within the network that
                has an address associated with it.</para>
</listitem>
<listitem>
<para>Rollback - In database technologies, a rollback is
an operation which returns the database to some
previous state. Rollbacks are important for database
integrity, because they mean that the database can be
restored to a clean copy even after erroneous
operations are performed.</para>
</listitem>
<listitem>
<para>Remote Procedure Call (RPC) - A powerful technique
for constructing distributed, client-server based
applications. The communicating processes may be on
the same system, or they may be on different systems
with a network connecting them.</para>
</listitem>
<listitem>
<para>Recovery Point Objective (RPO) - A recovery point
objective is defined by business continuity planning.
It is the maximum tolerable period in which data might
be lost from an IT service due to a major incident.
The RPO gives systems designers a limit to work
to.</para>
</listitem>
<listitem>
<para>Recovery Time Objective (RTO) - The recovery time
objective is the duration of time and a service level
within which a business process must be restored after
a disaster (or disruption) in order to avoid
unacceptable consequences associated with a break in
business continuity.</para>
</listitem>
<listitem>
<para>Software Development Kit (SDK) - A software
development kit is typically a set of software
development tools that allows for the creation of
applications for a certain software package, software
framework, hardware platform, computer system, video
game console, operating system, or similar development
platform.</para>
</listitem>
<listitem>
<para>Service Level Agreement (SLA) - A service-level
                agreement is a part of a service contract where a
                service is
formally defined. In practice, the term SLA is
sometimes used to refer to the contracted delivery
time (of the service or performance).</para>
</listitem>
<listitem>
            <para>Software Development Lifecycle (SDLC) - A software
                development process, also known as a software
                development life cycle, is a structure imposed on
                the development of a software product.</para>
</listitem>
<listitem>
            <para>Top of Rack Switch (ToR Switch) - A top-of-rack
                (ToR) switch is a small port-count switch that sits
                at or near the top of a telco rack in data
                centers.</para>
</listitem>
<listitem>
<para>Traffic Shaping - Traffic shaping (also known as
"packet shaping") is a computer network traffic
management technique which delays some or all
datagrams to bring them into compliance with a desired
traffic profile. Traffic shaping is a form of rate
limiting.</para>
</listitem>
<listitem>
<para>Tunneling - Computer networks use a tunneling
protocol when one network protocol (the delivery
protocol) encapsulates a different payload protocol.
By using tunneling one can (for example) carry a
payload over an incompatible delivery-network, or
provide a secure path through an untrusted
network.</para>
</listitem>
<listitem>
<para>Virtual Desktop Infrastructure (VDI) - Virtual
Desktop Infrastructure is a desktop-centric service
that hosts user desktop environments on remote
servers, which are accessed over a network using a
remote display protocol. A connection brokering
service is used to connect users to their assigned
desktop sessions.</para>
</listitem>
<listitem>
<para>Virtual Local Area Networks (VLAN) - In computer
networking, a single layer-2 network may be
partitioned to create multiple distinct broadcast
domains, which are mutually isolated so that packets
can only pass between them via one or more routers;
such a domain is referred to as a virtual local area
network, virtual LAN or VLAN.</para>
</listitem>
<listitem>
<para>Voice over Internet Protocol (VoIP) -
Voice-over-Internet Protocol (VoIP) is a methodology
and group of technologies for the delivery of voice
communications and multimedia sessions over Internet
Protocol (IP) networks, such as the Internet.</para>
</listitem>
<listitem>
<para>Virtual Router Redundancy Protocol (VRRP) - The
Virtual Router Redundancy Protocol (VRRP) is a
computer networking protocol that provides for
automatic assignment of available Internet Protocol
(IP) routers to participating hosts. This increases
the availability and reliability of routing paths via
automatic default gateway selections on an IP
sub-network.</para>
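            <para>A minimal sketch of a VRRP configuration using
                keepalived follows; the interface, router ID,
                priority, and address are hypothetical example
                values:</para>
            <programlisting language="ini"># /etc/keepalived/keepalived.conf (example values)
vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100
    virtual_ipaddress {
        192.168.1.1/24
    }
}</programlisting>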
</listitem>
<listitem>
            <para>VXLAN Tunnel Endpoint (VTEP) - Used for frame
                encapsulation. VTEP functionality can be implemented
                in software, such as a virtual switch, or in the
                form of a physical switch.</para>
</listitem>
<listitem>
<para>Virtual Extensible Local Area Network (VXLAN) -
Virtual Extensible LAN is a network virtualization
technology that attempts to ameliorate the scalability
problems associated with large cloud computing
deployments. It uses a VLAN-like encapsulation
technique to encapsulate MAC-based OSI layer 2
Ethernet frames within layer 3 UDP packets.</para>
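            <para>As an illustration, a VXLAN interface can be
                created directly on Linux; the VXLAN network
                identifier (VNI) and multicast group below are
                example values:</para>
            <programlisting language="bash"># Create VXLAN interface vxlan0 with VNI 42 over eth0
ip link add vxlan0 type vxlan id 42 group 239.1.1.1 dev eth0 dstport 4789</programlisting>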
</listitem>
<listitem>
<para>Wide Area Network (WAN) - A wide area network is a
network that covers a broad area using leased or
private telecommunication lines.</para>
</listitem>
<listitem>
<para>Xen - Xen is a hypervisor using a microkernel
design, providing services that allow multiple
computer operating systems to execute on the same
computer hardware concurrently.</para>
</listitem>
</itemizedlist>
</chapter>

View File

@ -0,0 +1,17 @@
<?xml version="1.0" encoding="UTF-8"?>
<chapter xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="hybrid">
<title>Hybrid</title>
<xi:include href="hybrid/section_introduction_hybrid.xml"/>
<xi:include href="hybrid/section_user_requirements_hybrid.xml"/>
<xi:include href="hybrid/section_tech_considerations_hybrid.xml"/>
<xi:include href="hybrid/section_operational_considerations_hybrid.xml"/>
<xi:include href="hybrid/section_architecture_hybrid.xml"/>
<xi:include href="hybrid/section_prescriptive_examples_hybrid.xml"/>
</chapter>

View File

@ -0,0 +1,15 @@
<?xml version="1.0" encoding="UTF-8"?>
<chapter xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="introduction">
<title>Introduction</title>
<xi:include href="introduction/section_introduction_to_openstack_architecture_design_guide.xml"/>
<xi:include href="introduction/section_intended_audience.xml"/>
<xi:include href="introduction/section_how_this_book_is_organized.xml"/>
<xi:include href="introduction/section_how_this_book_was_written.xml"/>
<xi:include href="introduction/section_methodology.xml"/>
</chapter>

View File

@ -0,0 +1,14 @@
<?xml version="1.0" encoding="UTF-8"?>
<chapter xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="massively_scalable">
<title>Massively Scalable</title>
<xi:include href="massively_scalable/section_introduction_massively_scalable.xml"/>
<xi:include href="massively_scalable/section_user_requirements_massively_scalable.xml"/>
<xi:include href="massively_scalable/section_tech_considerations_massively_scalable.xml"/>
<xi:include href="massively_scalable/section_operational_considerations_massively_scalable.xml"/>
</chapter>

View File

@ -0,0 +1,16 @@
<?xml version="1.0" encoding="UTF-8"?>
<chapter xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="multi_site">
<title>Multi-Site</title>
<xi:include href="multi_site/section_introduction_multi_site.xml"/>
<xi:include href="multi_site/section_user_requirements_multi_site.xml"/>
<xi:include href="multi_site/section_tech_considerations_multi_site.xml"/>
<xi:include href="multi_site/section_operational_considerations_multi_site.xml"/>
<xi:include href="multi_site/section_architecture_multi_site.xml"/>
<xi:include href="multi_site/section_prescriptive_examples_multi_site.xml"/>
</chapter>

View File

@ -0,0 +1,16 @@
<?xml version="1.0" encoding="UTF-8"?>
<chapter xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="network_focus">
<title>Network Focused</title>
<xi:include href="network_focus/section_introduction_network_focus.xml"/>
<xi:include href="network_focus/section_user_requirements_network_focus.xml"/>
<xi:include href="network_focus/section_tech_considerations_network_focus.xml"/>
<xi:include href="network_focus/section_operational_considerations_network_focus.xml"/>
<xi:include href="network_focus/section_architecture_network_focus.xml"/>
<xi:include href="network_focus/section_prescriptive_examples_network_focus.xml"/>
</chapter>

View File

@ -0,0 +1,77 @@
<?xml version="1.0" encoding="UTF-8"?>
<chapter xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="arch-design-references">
<?dbhtml stop-chunking?>
<title>References</title>
    <para>Data Protection framework of the European Union:
        http://ec.europa.eu/justice/data-protection/ - Guidance on
        Data Protection laws governed by the EU.</para>
    <para>Depletion of IPv4 Addresses:
        http://www.internetsociety.org/deploy360/blog/2014/05/goodbye-ipv4-iana-starts-allocating-final-address-blocks/
        - Article describing how the depletion of IPv4 addresses
        makes the migration to IPv6 inevitable.</para>
    <para>Ethernet Switch Reliability:
        http://www.garrettcom.com/techsupport/papers/ethernet_switch_reliability.pdf
        - Research white paper on Ethernet switch
        reliability.</para>
    <para>Financial Industry Regulatory Authority:
        http://www.finra.org/Industry/Regulation/FINRARules/ -
        Requirements of the Financial Industry Regulatory Authority
        in the USA.</para>
    <para>Image Service property keys:
        http://docs.openstack.org/cli-reference/content/chapter_cli-glance-property.html
        - Glance API property keys allow the administrator to attach
        custom characteristics to images.</para>
    <para>LibGuestFS Documentation: http://libguestfs.org - Official
        LibGuestFS documentation.</para>
    <para>Logging and Monitoring:
        http://docs.openstack.org/openstack-ops/content/logging_monitoring.html
        - Official OpenStack Operations documentation.</para>
    <para>ManageIQ Cloud Management Platform: http://manageiq.org/ -
        An open source cloud management platform for managing
        multiple clouds.</para>
    <para>N-Tron Network Availability:
        http://www.n-tron.com/pdf/network_availability.pdf -
        Research white paper on network availability.</para>
    <para>Nested KVM:
        http://davejingtian.org/2014/03/30/nested-kvm-just-for-fun -
        Blog post on how to nest KVM under KVM.</para>
    <para>Open Compute Project: http://www.opencompute.org/ - The
        Open Compute Project Foundation's mission is to design and
        enable the delivery of the most efficient server, storage,
        and data center hardware designs for scalable
        computing.</para>
    <para>OpenStack Flavors:
        http://docs.openstack.org/openstack-ops/content/flavors.html
        - Official OpenStack documentation.</para>
    <para>OpenStack High Availability Guide:
        http://docs.openstack.org/high-availability-guide/content/ -
        Information on how to provide redundancy for the OpenStack
        components.</para>
    <para>OpenStack Hypervisor Support Matrix:
        https://wiki.openstack.org/wiki/HypervisorSupportMatrix -
        Matrix of supported hypervisors and capabilities when used
        with OpenStack.</para>
    <para>OpenStack Object Store (Swift) Replication Reference:
        http://docs.openstack.org/developer/swift/replication_network.html
        - Developer documentation of Swift replication.</para>
    <para>OpenStack Operations Guide:
        http://docs.openstack.org/openstack-ops/ - The OpenStack
        Operations Guide provides information on setting up and
        installing OpenStack.</para>
    <para>OpenStack Security Guide:
        http://docs.openstack.org/security-guide/ - The OpenStack
        Security Guide provides information on securing OpenStack
        deployments.</para>
    <para>OpenStack Training Marketplace:
        http://www.openstack.org/marketplace/training - The
        OpenStack marketplace for training and vendors providing
        training on OpenStack.</para>
    <para>PCI passthrough:
        https://wiki.openstack.org/wiki/Pci_passthrough#How_to_check_PCI_status_with_PCI_api_paches
        - The PCI API patches extend the servers/os-hypervisor API
        to show PCI information for instances and compute nodes, and
        also provide a resource endpoint to show PCI
        information.</para>
    <para>TripleO: https://wiki.openstack.org/wiki/TripleO - TripleO
        is a program aimed at installing, upgrading, and operating
        OpenStack clouds using OpenStack's own cloud facilities as
        the foundation.</para>
</chapter>

View File

@ -0,0 +1,17 @@
<?xml version="1.0" encoding="UTF-8"?>
<chapter xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="specialized">
<title>Specialized Cases</title>
<xi:include href="specialized/section_introduction_specialized.xml"/>
<xi:include href="specialized/section_multi_hypervisor_specialized.xml"/>
<xi:include href="specialized/section_networking_specialized.xml"/>
<xi:include href="specialized/section_software_defined_networking_specialized.xml"/>
<xi:include href="specialized/section_desktop_as_a_service_specialized.xml"/>
<xi:include href="specialized/section_openstack_on_openstack_specialized.xml"/>
<xi:include href="specialized/section_hardware_specialized.xml"/>
</chapter>

View File

@ -0,0 +1,16 @@
<?xml version="1.0" encoding="UTF-8"?>
<chapter xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="storage_focus">
<title>Storage Focused</title>
<xi:include href="storage_focus/section_introduction_storage_focus.xml"/>
<xi:include href="storage_focus/section_user_requirements_storage_focus.xml"/>
<xi:include href="storage_focus/section_tech_considerations_storage_focus.xml"/>
<xi:include href="storage_focus/section_operational_considerations_storage_focus.xml"/>
<xi:include href="storage_focus/section_architecture_storage_focus.xml"/>
<xi:include href="storage_focus/section_prescriptive_examples_storage_focus.xml"/>
</chapter>

View File

@ -0,0 +1,879 @@
<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="arch-design-architecture-hardware">
<?dbhtml stop-chunking?>
<title>Architecture</title>
<para>The hardware selection covers three areas:</para>
<itemizedlist>
<listitem>
<para>Compute</para>
</listitem>
<listitem>
<para>Network</para>
</listitem>
<listitem>
<para>Storage</para>
</listitem>
</itemizedlist>
    <para>In a compute-focused OpenStack cloud, the hardware
        selection must reflect the compute-intensive nature of the
        workloads. Compute-focused is defined as having extreme
        demands on processor and memory resources. These workloads
        are not storage intensive, nor are they consistently network
        intensive: the network and storage may be heavily utilized
        while loading a data set into the computational cluster, but
        they are not otherwise intensive.
</para>
<para>Compute (server) hardware must be evaluated against four
opposing
dimensions:
</para>
<itemizedlist>
<listitem>
<para>Server density: A measure of how many servers can
fit into a
given measure of physical space, such as a
rack unit [U].
</para>
</listitem>
<listitem>
<para>Resource capacity: The number of CPU cores, how much
RAM, or how
much storage a given server will
deliver.
</para>
</listitem>
<listitem>
<para>Expandability: The number of additional resources
that can be
added to a server before it has reached
its limit.
</para>
</listitem>
<listitem>
<para>Cost: The relative purchase price of the hardware
weighted
against the level of design effort needed to
build the system.
</para>
</listitem>
</itemizedlist>
    <para>The dimensions need to be weighed against each other to
determine the best design for the desired purpose. For
example,
increasing server density means sacrificing resource
capacity or
expandability. Increasing resource capacity and
expandability can
increase cost but decreases server density.
Decreasing cost can mean
decreasing supportability, server
density, resource capacity, and
expandability.
</para>
<para>Selection of hardware for a compute-focused cloud should
have an
emphasis on server hardware that can offer more CPU
sockets, more CPU
cores, and more RAM; network connectivity
and storage capacity are less
critical. The hardware will need
to be configured to provide enough
network connectivity and
storage capacity to meet minimum user
requirements, but they
are not the primary consideration.
</para>
<para>Some server hardware form factors are better suited than
others,
as CPU and RAM capacity have the highest
priority.
</para>
<itemizedlist>
<listitem>
            <para>Most blade servers can support dual-socket
                multi-core CPUs. To avoid this CPU limit, select
                "full width" or "full height" blades, which
                consequently reduces server density. For example,
                high-density blade servers such as HP BladeSystem
                and Dell PowerEdge M1000e support up to 16 servers
                in only 10 rack units using half-height blades, but
                selecting full-height blades halves the density to
                only 8 servers per 10 rack units.
</para>
</listitem>
<listitem>
            <para>1U rack-mounted servers (servers that occupy only a
                single rack unit) may be able to offer greater
                server density than a blade server solution. It is
                possible to place 40 servers in a rack, leaving
                space for the top of rack (ToR) switches, versus 32
                "full width" or "full height" blade servers in a
                rack. However, 1U servers are often limited to
                dual-socket, multi-core CPU configurations. Note
                that, as of the Icehouse release, neither HP, IBM,
                nor Dell offered 1U rack servers with more than 2
                CPU sockets. To obtain greater than dual-socket
                support in a 1U rack-mount form factor, customers
                need to buy their systems from Original Design
                Manufacturers (ODMs) or second-tier manufacturers.
                This may cause issues for organizations that have
                preferred vendor policies or concerns with support
                and hardware warranties of non-tier-1 vendors.
</para>
</listitem>
<listitem>
<para>2U rack-mounted servers provide quad-socket,
multi-core CPU
support, but with a corresponding
decrease in server density (half
the density offered
by 1U rack-mounted servers).
</para>
</listitem>
<listitem>
<para>Larger rack-mounted servers, such as 4U servers,
often provide
even greater CPU capacity, commonly
supporting four or even eight CPU
sockets. These
servers have greater expandability, but such servers
have much lower server density and usually greater
hardware cost.
</para>
</listitem>
<listitem>
<para>"Sled servers" (rack-mounted servers that support
multiple
independent servers in a single 2U or 3U
enclosure) deliver increased
density as compared to
typical 1U or 2U rack-mounted servers. For
example,
many sled servers offer four independent dual-socket
nodes in
2U for a total of 8 CPU sockets in 2U.
However, the dual-socket
limitation on individual
nodes may not be sufficient to offset their
additional
cost and configuration complexity.
</para>
</listitem>
</itemizedlist>
<para>The following facts will strongly influence server hardware
selection for a compute-focused OpenStack design
architecture:
</para>
<itemizedlist>
<listitem>
            <para>Instance density: In this architecture, instance
                density is considered lower; therefore, CPU and RAM
                over-subscription ratios are also lower. More hosts
                will be required to support the anticipated scale,
                especially if the design uses dual-socket hardware.
</para>
</listitem>
<listitem>
<para>Host density: Another option to address the higher host count
                that might be needed with dual-socket designs is to
                use a quad-socket platform. Taking this approach
                will decrease host
density,
which increases rack count. This configuration may
affect the network
requirements, the number of power
connections, and possibly impact
the cooling
requirements.
</para>
</listitem>
<listitem>
<para>Power and cooling density: The power and cooling
density
requirements might be lower than with blade,
sled, or 1U server
designs because of lower host
density (by using 2U, 3U or even 4U
server designs).
For data centers with older infrastructure, this may
be a desirable feature.
</para>
</listitem>
</itemizedlist>
    <para>Compute-focused OpenStack design architecture server
        hardware selection results in a "scale up" versus "scale
        out" decision. Whether the better solution is a smaller
        number of larger hosts or a larger number of smaller hosts
        depends on a combination of factors: cost, power, cooling,
        physical rack and floor space, support-warranty, and
        manageability.
</para>
<section xml:id="storage-hardware-selection">
<title>Storage Hardware Selection</title>
    <para>For a compute-focused OpenStack design architecture, the
        selection of storage hardware is not critical, as it is not
        a primary criterion; however, it is still important. There
        are a number of different factors that a cloud architect
        must consider:
</para>
<itemizedlist>
<listitem>
<para>Cost: The overall cost of the solution will play a major role
in what storage architecture (and resulting storage hardware) is
selected.
</para>
</listitem>
<listitem>
            <para>Performance: The performance of the solution also
                plays a big role and can be measured by observing
                the latency of storage I/O requests. In a
                compute-focused OpenStack cloud, storage latency can
                be a major consideration. In some compute-intensive
                workloads, minimizing the delays that the CPU
                experiences while fetching data from storage can
                have a significant impact on the overall performance
                of the application.
</para>
</listitem>
<listitem>
            <para>Scalability: This section uses the term
                "scalability" to refer to how well the storage
                solution performs as it is expanded up to its
                maximum size. A storage solution that performs well
                in small configurations but whose performance
                degrades as it expands would not be considered
                scalable. On the other hand, a solution that
                continues to perform well at maximum expansion would
                be considered scalable.
</para>
</listitem>
<listitem>
<para>Expandability: Expandability refers to the overall ability of
the solution to grow. A storage solution that expands to 50 PB
is
considered more expandable than a solution that only scales
                to 10 PB.
Note that this metric is related to, but different
from,
scalability, which is a measure of the solution's
performance as it
expands.
</para>
</listitem>
</itemizedlist>
<para>For a compute-focused OpenStack cloud, latency of storage is a
major
consideration. Using solid-state disks (SSDs) to minimize
latency for
instance storage and reduce CPU delays caused by waiting
for the storage
will increase performance. Consider using RAID
controller cards in
compute hosts to improve the performance of the
underlying disk
subsystem.
</para>
<para>The selection of storage architecture, and the corresponding
storage
hardware (if there is the option), is determined by evaluating
possible
solutions against the key factors listed above. This will
determine
if a
scale-out solution (such as Ceph, GlusterFS, or similar)
should be used,
or if a single, highly expandable and scalable
centralized storage
array
would be a better choice. If a centralized
storage array is the right
fit for the requirements, the hardware will
be determined by the array
vendor. It is also possible to build a
storage array using commodity
hardware with Open Source software, but
there needs to be access to
people with expertise to build such a
system. Conversely, a scale-out
storage solution that uses
direct-attached storage (DAS) in the
servers
may be an appropriate
choice. If so, then the server hardware needs to
be configured to
support the storage solution.
</para>
<para>The following lists some of the potential impacts that may
affect a
particular storage architecture, and the corresponding
storage hardware,
of a compute-focused OpenStack cloud:
</para>
<itemizedlist>
<listitem>
<para>Connectivity: Based on the storage solution selected, ensure
the connectivity matches the storage solution requirements. If a
centralized storage array is selected, it is important to
determine
how the hypervisors will connect to the storage array.
Connectivity
could affect latency and thus performance, so check
that the network
characteristics will minimize latency to boost
the overall
performance of the design.
</para>
</listitem>
<listitem>
<para>Latency: Determine if the use case will have consistent or
highly variable latency.
</para>
</listitem>
<listitem>
            <para>Throughput: To improve overall performance, make
                sure that the storage solution throughput is
                optimized. While it is not likely that a
                compute-focused cloud will have major data I/O to
                and from storage, this is an important factor to
                consider.
</para>
</listitem>
<listitem>
            <para>Server hardware: If the solution uses DAS, this
                impacts the server hardware choice, which in turn
                ripples into host density, instance density, power
                density, OS-hypervisor selection, and management
                tools.
</para>
</listitem>
</itemizedlist>
    <para>Where instances need to be made highly available, or need
        to be capable of migration between hosts, a shared storage
        file system should be used to store instance ephemeral data,
        ensuring that compute services can run uninterrupted in the
        event of a node failure.
</para>
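    <para>One minimal approach, sketched here with a hypothetical
        NFS server, is to mount the shared file system at the
        default Compute instances directory on every compute
        host:</para>
    <programlisting language="ini"># /etc/fstab entry on each compute host (example server)
nfs-server:/srv/nova /var/lib/nova/instances nfs4 defaults 0 0</programlisting>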
</section>
<section xml:id="selecting-networking-hardware-arch">
<title>Selecting Networking Hardware</title>
<para>Some of the key considerations that should be included in
the
selection of networking hardware include:
</para>
<itemizedlist>
<listitem>
<para>Port count: The design will require networking
hardware that
has the requisite port count.
</para>
</listitem>
<listitem>
            <para>Port density: The network design will be affected
                by the physical space that is required to provide
                the requisite port count. A switch that can provide
                48 10 GbE ports in 1U has a much higher port density
                than a switch that provides 24 10 GbE ports in 2U. A
                higher port density is preferred, as it leaves more
                rack space for compute or storage components that
                might be required by the design. This also leads
                into concerns about fault domains and power density
                that must be considered. Higher-density switches are
                more expensive, so it is important not to overdesign
                the network if it is not required.
</para>
</listitem>
<listitem>
<para>Port speed: The networking hardware must support the
proposed
network speed, for example: 1 GbE, 10 GbE, or
40 GbE (or even 100
GbE).
</para>
</listitem>
<listitem>
<para>Redundancy: The level of network hardware redundancy
required
is influenced by the user requirements for
high availability and
cost considerations. Network
redundancy can be achieved by adding
redundant power
supplies or paired switches. If this is a
requirement,
the hardware will need to support this configuration.
User requirements will determine if a completely
redundant network
infrastructure is required.
</para>
</listitem>
<listitem>
<para>Power requirements: Ensure that the physical data
center
provides the necessary power for the selected
network hardware. This
is not an issue for top of rack
(ToR) switches, but may be an issue
for spine switches
in a leaf and spine fabric, or end of row (EoR)
switches.
</para>
</listitem>
</itemizedlist>
<para>It is important to first understand additional factors as
well as
the use case because these additional factors heavily
influence the
cloud network architecture. Once these key
considerations have been
decided, the proper network can be
designed to best serve the
workloads being placed in the
cloud.
</para>
<para>It is recommended that the network architecture is designed
using a scalable network model that makes it easy to add
capacity and
bandwidth. A good example of such a model is the
        leaf-spine model. In
this type of network design, it is
possible to easily add additional
bandwidth as well as scale
out to additional racks of gear. It is
important to select
network hardware that will support the required
port count,
port speed and port density while also allowing for future
growth as workload demands increase. It is also important to
evaluate
where in the network architecture it is valuable to
provide
redundancy. Increased network availability and
redundancy comes at a
cost, therefore it is recommended to
weigh the cost versus the benefit
gained from utilizing and
deploying redundant network switches and
using bonded
interfaces at the host level.
</para>
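        <para>As a sketch of bonded interfaces at the host level, a
            Linux host using LACP might declare a bond as follows
            (interface names and addressing are example
            values):</para>
        <programlisting language="ini"># /etc/network/interfaces (example): LACP bond of two NICs
auto bond0
iface bond0 inet static
    address 10.0.0.10
    netmask 255.255.255.0
    bond-slaves eth0 eth1
    bond-mode 802.3ad
    bond-miimon 100</programlisting>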
</section>
<section xml:id="software-selection-arch">
<title>Software Selection</title>
        <para>Selecting software to be included in a compute-focused
            OpenStack architecture design involves three main
            areas:
</para>
<itemizedlist>
<listitem>
<para>Operating system (OS) and hypervisor</para>
</listitem>
<listitem>
<para>OpenStack components</para>
</listitem>
<listitem>
<para>Supplemental software</para>
</listitem>
</itemizedlist>
<para>Design decisions made in each of these areas impact the rest
of
the OpenStack architecture design.
</para>
</section>
<section xml:id="os-and-hypervisor-arch">
<title>OS and Hypervisor</title>
<para>The selection of OS and hypervisor has a significant impact
on
the end point design. Selecting a particular operating
system and
hypervisor could affect server hardware selection.
For example, a
selected combination needs to be supported on
the selected hardware.
Ensuring the storage hardware selection
and topology supports the
selected operating system and
hypervisor combination should also be
considered.
Additionally, make sure that the networking hardware
selection
and topology will work with the chosen operating system and
hypervisor combination. For example, if the design uses Link
Aggregation Control Protocol (LACP), the hypervisor needs to
support
it.
</para>
<para>Some areas that could be impacted by the selection of OS and
hypervisor include:
</para>
<itemizedlist>
<listitem>
                <para>Cost: Selecting a commercially supported
                    hypervisor, such as Microsoft Hyper-V, will
                    result in a different cost model than choosing a
                    community-supported open source hypervisor like
                    KVM or Xen. Even within the ranks of open source
                    solutions, choosing Ubuntu over Red Hat (or vice
                    versa) will have an impact on cost due to
                    support contracts. On the other hand, business
                    or application requirements might dictate a
                    specific or commercially supported hypervisor.
</para>
</listitem>
<listitem>
<para>Supportability: Depending on the selected
hypervisor, the staff
should have the appropriate
training and knowledge to support the
selected OS and
hypervisor combination. If they do not, training
will
                    need to be provided, which could have a cost impact on
the
design.
</para>
</listitem>
<listitem>
                <para>Management tools: The management tools used
                    for Ubuntu and KVM differ from the management
                    tools for VMware vSphere. Although both OS and
                    hypervisor combinations are supported by
                    OpenStack, there will be very different impacts
                    to the rest of the design as a result of the
                    selection of one combination versus the other.
</para>
</listitem>
<listitem>
<para>Scale and performance: Ensure that selected OS and
hypervisor
combinations meet the appropriate scale and
performance
requirements. The chosen architecture will
need to meet the targeted
instance-host ratios with
the selected OS-hypervisor combination.
</para>
</listitem>
<listitem>
<para>Security: Ensure that the design can accommodate the
regular
periodic installation of application security
patches while
maintaining the required workloads. The
frequency of security
                    patches for the proposed OS-hypervisor
                    combination will have an
impact on
performance and the patch installation process could
affect maintenance windows.
</para>
</listitem>
<listitem>
                <para>Supported features: Determine which features
                    of OpenStack are required. This will often
                    determine the selection of the OS-hypervisor
                    combination, as certain features are only
                    available with specific OSs or hypervisors. If
                    required features are not available, the design
                    might need to be modified to meet the user
                    requirements.
</para>
</listitem>
<listitem>
                <para>Interoperability: Consideration should be
                    given to the ability of the selected
                    OS-hypervisor combination to interoperate or
                    co-exist with other OS-hypervisor combinations,
                    or with other software solutions in the overall
                    design (if required). Operational and
                    troubleshooting tools for one OS-hypervisor
                    combination may differ from the tools used for
                    another, and, as a result, the design will need
                    to address whether the two sets of tools need to
                    interoperate.
</para>
</listitem>
</itemizedlist>
</section>
<section xml:id="openstack-components-arch">
<title>OpenStack Components</title>
<para>The selection of which OpenStack components will actually be
included in the design and deployed has significant impact.
There are
            certain components that will always be present (Nova and
            Glance, for example), yet there are other services
that might not need to be
present. For example, a certain
design may not require OpenStack Heat.
Omitting Heat would not
typically have a significant impact on the
overall design.
However, if the architecture uses a replacement for
OpenStack
Swift for its storage component, this could potentially have
significant impacts on the rest of the design.
</para>
<para>For a compute-focused OpenStack design architecture, the
following components would be used:
</para>
<itemizedlist>
<listitem>
<para>Identity (Keystone)</para>
</listitem>
<listitem>
<para>Dashboard (Horizon)</para>
</listitem>
<listitem>
<para>Compute (Nova)</para>
</listitem>
<listitem>
<para>Object Storage (Swift, Ceph or a commercial
solution)
</para>
</listitem>
<listitem>
<para>Image (Glance)</para>
</listitem>
<listitem>
<para>Networking (Neutron)</para>
</listitem>
<listitem>
<para>Orchestration (Heat)</para>
</listitem>
</itemizedlist>
        <para>OpenStack Block Storage would potentially not be
            incorporated into a compute-focused design because
            persistent block storage is not a significant
            requirement for the types of workloads deployed onto
            instances running in a compute-focused cloud. However,
            there may be some situations where the need for
            performance dictates that a block storage component be
            used to improve data I/O.
</para>
<para>The exclusion of certain OpenStack components might also
limit or
constrain the functionality of other components. If a
design opts to
include Heat but exclude Ceilometer, then the
design will not be able
to take advantage of Heat's
auto scaling functionality (which relies
on information from
            Ceilometer). Because Heat can be used to spin up a large
            number of instances to perform compute-intensive
            processing, including Heat in a compute-focused
            architecture design is strongly recommended.
</para>
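        <para>As a sketch of this concept, a HOT template can
            declare an auto scaling group of compute workers; the
            flavor and image names below are hypothetical:</para>
        <programlisting language="yaml"># Minimal HOT sketch: scale a pool of compute workers
heat_template_version: 2013-05-23
resources:
  worker_group:
    type: OS::Heat::AutoScalingGroup
    properties:
      min_size: 2
      max_size: 10
      resource:
        type: OS::Nova::Server
        properties:
          flavor: m1.large        # example flavor
          image: compute-worker   # example image name</programlisting>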
</section>
<section xml:id="supplemental-software">
<title>Supplemental Software</title>
<para>While OpenStack is a fairly complete collection of software
projects for building a platform for cloud services, there are
invariably additional pieces of software that might need to be
added
to any given OpenStack design.
</para>
<section xml:id="networking-software-arch">
<title>Networking Software</title>
<para>OpenStack Networking provides a wide variety of networking
services for instances. There are many additional networking
software packages that might be useful to manage the OpenStack
components themselves. Some examples include software to
provide load
balancing, network redundancy protocols, and
routing daemons. Some of
            these software packages are described in more detail in
            the OpenStack High Availability Guide (refer to Chapter
            8).
</para>
<para>For a compute-focused OpenStack cloud, the OpenStack
infrastructure components will need to be highly available. If
the
design does not include hardware load balancing,
networking software
packages like HAProxy will need to be
included.
</para>
</section>
<section xml:id="management-software-arch">
<title>Management Software</title>
<para>The selected supplemental software solution impacts and
affects
the overall OpenStack cloud design. This includes
software for
providing clustering, logging, monitoring and
alerting.
</para>
            <para>Inclusion of clustering software, such as Corosync or
Pacemaker, is determined primarily by the availability design
requirements. Therefore, the impact of including (or not
including)
these software packages is primarily determined by
the availability
of the cloud infrastructure and the
complexity of supporting the
configuration after it is
deployed. The OpenStack High Availability
Guide provides more
details on the installation and configuration of
Corosync and
Pacemaker, should these packages need to be included in
the
design.
</para>
<para>Requirements for logging, monitoring, and alerting are
determined by operational considerations. Each of these
sub-categories includes a number of various options. For
example, in
the logging sub-category one might consider
Logstash, Splunk, Log
Insight, or some other log
aggregation-consolidation tool. Logs
should be stored in a
centralized location to make it easier to
perform analytics
against the data. Log data analytics engines can
also provide
automation and issue notification by providing a
mechanism to
both alert and automatically attempt to remediate some
of the
more commonly known issues.
</para>
<para>If any of these software packages are needed, then the
design
must account for the additional resource consumption
(CPU, RAM,
storage, and network bandwidth for a log
aggregation solution, for
example). Some other potential
design impacts include:
</para>
<itemizedlist>
<listitem>
                    <para>OS-hypervisor combination: Ensure that the
selected logging,
monitoring, or alerting tools
support the proposed OS-hypervisor
combination.
</para>
</listitem>
<listitem>
<para>Network hardware: The network hardware selection
needs to be
supported by the logging, monitoring, and
alerting software.
</para>
</listitem>
</itemizedlist>
</section>
<section xml:id="database-software-arch">
<title>Database Software</title>
<para>A large majority of the OpenStack components require access
to
back-end database services to store state and configuration
information. Selection of an appropriate back-end database
that will
satisfy the availability and fault tolerance
requirements of the
OpenStack services is required. OpenStack
            services support connecting to any database that is
            supported by the SQLAlchemy Python drivers; however, most
            common database deployments make use of MySQL or some
            variation of it. It is recommended that the database that
            provides back-end service within a general purpose cloud
            be made highly available using an available technology
            which can accomplish that goal. Some of the more common
            software solutions used include Galera, MariaDB, and
            MySQL with multi-master replication.
</para>
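            <para>As a simple illustration of the service side of this
            choice, the following is a minimal, hedged Python sketch of
            the kind of SQLAlchemy connection an OpenStack service
            opens; the URL, credentials, and host name are placeholder
            assumptions for a virtual IP fronting a Galera
            cluster.</para>
            <programlisting language="python"># Hedged sketch: an OpenStack service reaches its database through
# SQLAlchemy. The URL is a placeholder; in an HA design it points at
# a virtual IP or load balancer in front of a Galera cluster.
from sqlalchemy import create_engine, text

engine = create_engine("mysql://nova:secret@db-vip.example.com/nova",
                       pool_recycle=3600)  # recycle long-idle connections

with engine.connect() as connection:
    result = connection.execute(text("SELECT 1"))  # trivial liveness check
    print(list(result))
</programlisting>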
</section>
</section>
</section>


@ -0,0 +1,49 @@
<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="arch-guide-intro-compute-focus">
<title>Introduction</title>
<para>A compute-focused cloud is a specialized subset of the
general purpose OpenStack cloud architecture. Unlike the
general purpose OpenStack architecture, which is built to host
a wide variety of workloads and applications and does not
heavily tax any particular computing aspect, a compute-focused
        cloud is built and designed specifically to support compute
        intensive workloads, and its design must be tailored
        accordingly.
Compute intensive workloads may be CPU intensive, RAM
intensive, or both. However, they are not typically storage
intensive or network intensive. Compute-focused workloads may
include the following use cases:</para>
<itemizedlist>
<listitem>
<para>High performance computing (HPC)</para>
</listitem>
<listitem>
<para>Big data analytics using Hadoop or other distributed
data stores</para>
</listitem>
<listitem>
<para>Continuous integration/continuous deployment
(CI/CD)</para>
</listitem>
<listitem>
<para>Platform-as-a-Service (PaaS)</para>
</listitem>
<listitem>
<para>Signal processing for Network Function
Virtualization (NFV)</para>
</listitem>
</itemizedlist>
<para>Based on the use case requirements, such clouds might need
to provide additional services such as a virtual machine disk
library, file or object storage, firewalls, load balancers, IP
addresses, and network connectivity in the form of overlays or
virtual Local Area Networks (VLANs). A compute-focused
OpenStack cloud will not typically use raw block storage
services since the applications hosted on a compute-focused
OpenStack cloud generally do not need persistent block
storage.</para>
</section>


@ -0,0 +1,117 @@
<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="operational-considerations-compute-focus">
<?dbhtml stop-chunking?>
<title>Operational Considerations</title>
<para>Operationally, there are a number of considerations that
affect the design of compute-focused OpenStack clouds. Some
examples might include enforcing strict API availability
requirements, understanding and dealing with failure
scenarios, or managing host maintenance schedules.</para>
    <para>Service-level agreements (SLAs) are contractual obligations
        that give assurances around availability of a provided
        service. As such, factoring in promises of availability
implies a certain level of redundancy and resiliency when
designing an OpenStack cloud.</para>
<itemizedlist>
<listitem>
<para>Guarantees for API availability imply multiple
                infrastructure services combined with highly
                available load balancers.</para>
</listitem>
<listitem>
<para>Network uptime guarantees will affect the switch
design and might require redundant switching and
power.</para>
</listitem>
<listitem>
<para>Network security policy requirements need to be
factored in to deployments.</para>
</listitem>
</itemizedlist>
<para>Knowing when and where to implement redundancy and high
availability (HA) is directly affected by terms contained in
any associated SLA, if one is present.</para>
<section xml:id="support-and-maintainability-compute-focus">
<title>Support and Maintainability</title>
    <para>OpenStack cloud management requires operations staff to
        understand the design architecture on some level. The level
        of skill and the degree of separation between the operations
        and engineering staff depend on the size and purpose of the
        installation. A large cloud service provider or a telecom
        provider is more inclined to be managed by a specially
        trained, dedicated operations organization. A
smaller implementation is more inclined to rely on a smaller
support staff that might need to take on the combined
engineering, design and operations functions.</para>
    <para>Maintaining OpenStack installations requires a variety of
technical skills. Some of these skills may include the ability
to debug Python log output to a basic level as well as an
understanding of networking concepts.</para>
<para>Consider incorporating features into the architecture and
design that reduce the operational burden. Some examples
include automating some of the operations functions, or
alternatively exploring the possibility of using a third party
management company with special expertise in managing
OpenStack deployments.</para></section>
<section xml:id="montioring-compute-focus"><title>Monitoring</title>
<para>Like any other infrastructure deployment, OpenStack clouds
need an appropriate monitoring platform to ensure errors are
caught and managed appropriately. Consider leveraging any
existing monitoring system to see if it will be able to
effectively monitor an OpenStack environment. While there are
        many aspects that need to be monitored, specific metrics that
        are critically important to capture include image disk
        utilization and response time to the Compute API.</para>
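    <para>As a minimal illustration of such a metric, the following
        hedged Python sketch times a single Compute API request; the
        endpoint URL, tenant ID, and token are placeholder
        assumptions that must match a given deployment.</para>
    <programlisting language="python"># Hedged sketch: time one Compute API call with the requests library.
# NOVA_URL and TOKEN are placeholders, not values from a real cloud.
import time
import requests

NOVA_URL = "http://controller:8774/v2/TENANT_ID/servers"  # assumption
TOKEN = "replace-with-a-valid-keystone-token"             # assumption

start = time.time()
response = requests.get(NOVA_URL, headers={"X-Auth-Token": TOKEN})
elapsed = time.time() - start
print("GET servers: status %d in %.3f seconds"
      % (response.status_code, elapsed))
</programlisting>
    </section>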
<section xml:id="expected-unexpected-server-downtime"><title>Expected and unexpected server downtime</title>
<para>At some point, servers will fail. The SLAs in place affect
how the design has to address recovery time. Recovery of a
failed host may mean restoring instances from a snapshot, or
respawning that instance on another available host, which then
has consequences on the overall application design running on
the OpenStack cloud.</para>
<para>It might be acceptable to design a compute-focused cloud
without the ability to migrate instances from one host to
another, because the expectation is that the application
developer must handle failure within the application itself.
Conversely, a compute-focused cloud might be provisioned to
provide extra resilience as a requirement of that business. In
this scenario, it is expected that extra supporting services
are also deployed, such as shared storage attached to hosts to
aid in recovery and resiliency of services in order to meet
strict SLAs.</para></section>
<section xml:id="capacity-planning-operational"><title>Capacity Planning</title>
<para>Adding extra capacity to an OpenStack cloud is an easy
        horizontal scaling process, as consistently configured nodes
automatically attach to an OpenStack cloud. Be mindful,
however, of any additional work to place the nodes into
appropriate Availability Zones and Host Aggregates if
necessary. The same (or very similar) CPUs are recommended
        when adding extra nodes to the environment because it reduces
        the chance of breaking any live-migration features, if they are
present. Scaling out hypervisor hosts also has a direct effect
on network and other data center resources, so factor in this
increase when reaching rack capacity or when extra network
switches are required.</para>
<para>Compute hosts can also have internal components changed to
account for increases in demand, a process also known as
vertical scaling. Swapping a CPU for one with more cores, or
increasing the memory in a server, can help add extra needed
capacity depending on whether the running applications are
more CPU intensive or memory based (as would be expected in a
compute-focused OpenStack cloud).</para>
<para>Another option is to assess the average workloads and
increase the number of instances that can run within the
compute environment by adjusting the overcommit ratio. While
        only appropriate in some environments, it is important to
        remember that changing the CPU overcommit ratio can have a
        detrimental effect and cause a potential increase in noisy
        neighbor issues. The added risk of increasing the overcommit
        ratio is that more instances will fail when a compute host
        fails. In a compute-focused OpenStack design, increasing the
        CPU overcommit ratio increases the potential for noisy
        neighbor issues and is not recommended.</para></section>
</section>


@ -0,0 +1,128 @@
<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="prescriptive-example-compute-focus">
<?dbhtml stop-chunking?>
<title>Prescriptive Examples</title>
    <para>The Conseil Européen pour la Recherche Nucléaire (CERN),
        also known as the European Organization for Nuclear Research,
        provides particle accelerators and other infrastructure for
        high-energy physics research.</para>
    <para>As of 2011, CERN operated two compute centers in Europe,
        with plans to add a third.</para>
    <para>To support a growing number of compute-heavy users of
        experiments related to the Large Hadron Collider (LHC), CERN
        ultimately elected to deploy an OpenStack cloud using
Scientific Linux and RDO. This effort aimed to simplify the
management of the center's compute resources with a view to
doubling compute capacity through the addition of an
additional data center in 2013 while maintaining the same
levels of compute staff.</para>
<para>The CERN solution uses Cells for segregation of compute
resources and to transparently scale between different data
centers. This decision meant trading off support for security
        groups and live migration. In addition, some details, like
        flavors, needed to be manually replicated across cells. In
        spite of these drawbacks, cells were determined to provide the
required scale while exposing a single public API endpoint to
users.</para>
<para>A compute cell was created for each of the two original data
centers and a third was created when a new data center was
added in 2013. Each cell contains three availability zones to
further segregate compute resources and at least three
RabbitMQ message brokers configured to be clustered with
mirrored queues for high availability.</para>
    <para>The API cell, which resides behind an HAProxy load balancer,
is located in the data center in Switzerland and directs API
calls to compute cells using a customized variation of the
cell scheduler. The customizations allow certain workloads to
be directed to a specific data center or "all" data centers
with cell selection determined by cell RAM availability in the
latter case.</para>
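    <para>The following hedged Python sketch illustrates the
        selection policy just described; it is not CERN's code, and
        the cell names and RAM figures are illustrative
        assumptions.</para>
    <programlisting language="python"># Hedged sketch: route a request to a pinned data center if one was
# requested, otherwise to the cell with the most available RAM.
def pick_cell(cells, requested_datacenter=None):
    if requested_datacenter is not None:
        candidates = [c for c in cells
                      if c["datacenter"] == requested_datacenter]
    else:
        candidates = cells
    # Weight purely by free RAM, per the policy described above.
    return max(candidates, key=lambda c: c["free_ram_mb"])

cells = [
    {"name": "cell-a", "datacenter": "CH", "free_ram_mb": 8192},
    {"name": "cell-b", "datacenter": "HU", "free_ram_mb": 16384},
]
print(pick_cell(cells)["name"])        # cell-b (most free RAM)
print(pick_cell(cells, "CH")["name"])  # cell-a (pinned)
</programlisting>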
<mediaobject>
<imageobject>
<imagedata fileref="../images/Generic_CERN_Example.png"/>
</imageobject>
</mediaobject>
<para>There is also some customization of the filter scheduler
that handles placement within the cells:</para>
<itemizedlist>
<listitem>
<para>ImagePropertiesFilter - To provide special handling
depending on the guest operating system in use
(Linux-based or Windows-based).</para>
</listitem>
<listitem>
<para>ProjectsToAggregateFilter - To provide special
handling depending on the project the instance is
associated with.</para>
</listitem>
<listitem>
<para>default_schedule_zones - Allows the selection of
multiple default availability zones, rather than a
single default.</para>
</listitem>
</itemizedlist>
<para>The MySQL database server in each cell is managed by a
central database team and configured in an active/passive
configuration with a NetApp storage back end. Backups are
        performed every six hours.</para>
<section xml:id="network-architecture"><title>Network Architecture</title>
<para>To integrate with existing CERN networking infrastructure
customizations were made to Nova Networking. This was in the
form of a driver to integrate with CERN's existing database
for tracking MAC and IP address assignments.</para>
    <para>The driver considers the compute node that the scheduler
        placed an instance on, and then selects a MAC address and IP
        from the pre-registered list associated with that node in the
        database. The database is then updated to reflect the instance
        to which the addresses were assigned.</para>
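    <para>The following is a minimal, hedged Python sketch of the
        allocation pattern just described; it is not CERN's actual
        driver, and the node names, addresses, and instance IDs are
        illustrative assumptions.</para>
    <programlisting language="python"># Hedged sketch (not the actual CERN driver): allocate a MAC/IP pair
# from a pre-registered pool keyed by compute node, then record the
# owning instance.
pool = {
    "compute-01": [("02:16:3e:00:00:01", "10.0.0.11"),
                   ("02:16:3e:00:00:02", "10.0.0.12")],
}
assignments = {}  # (mac, ip) -> instance identifier

def allocate(node, instance_id):
    """Pop the next free address pair registered for this node."""
    mac, ip = pool[node].pop(0)
    assignments[(mac, ip)] = instance_id  # record the owner
    return mac, ip

print(allocate("compute-01", "instance-0001"))
</programlisting>
    </section>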
<section xml:id="storage-architecture"><title>Storage Architecture</title>
<para>The OpenStack image service is deployed in the API cell and
        configured to expose version 1 (V1) of the API. As a result,
the image registry is also required. The storage back end in
use is a 3 PB Ceph cluster.</para>
<para>A small set of "golden" Scientific Linux 5 and 6 images are
maintained which applications can in turn be placed on using
orchestration tools. Puppet is used for instance configuration
management and customization but Heat deployment is
expected.</para></section>
<section xml:id="monitoring"><title>Monitoring</title>
<para>Although direct billing is not required, OpenStack Telemetry
is used to perform metering for the purposes of adjusting
project quotas. A sharded, replicated, MongoDB back end is
used. To spread API load, instances of the nova-api service
were deployed within the child cells for Telemetry to query
        against. This also meant that some supporting services,
        including keystone, glance-api, and glance-registry, also
        needed to be configured in the child cells.</para>
<mediaobject>
<imageobject>
<imagedata
fileref="../images/Generic_CERN_Architecture.png"/>
</imageobject>
</mediaobject>
<para>Additional monitoring tools in use include Flume
(http://flume.apache.org/), Elastic Search, Kibana
(http://www.elasticsearch.org/overview/kibana/), and the CERN
developed Lemon (http://lemon.web.cern.ch/lemon/index.shtml)
project.</para></section>
<section xml:id="references-cern-resources"><title>References</title>
<para>The authors of the Architecture Design Guide would like to
thank CERN for publicly documenting their OpenStack deployment
in these resources, which formed the basis for this
chapter:</para>
<itemizedlist>
<listitem>
<para>http://openstack-in-production.blogspot.fr/</para>
</listitem>
<listitem>
<para>http://www.openstack.org/assets/presentation-media/Deep-Dive-into-the-CERN-Cloud-Infrastructure.pdf</para>
</listitem>
</itemizedlist></section>
</section>


@ -0,0 +1,421 @@
<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="technical-considerations-compute-focus">
<?dbhtml stop-chunking?>
<title>Technical Considerations</title>
<para>In a compute-focused OpenStack cloud, the type of instance
workloads being provisioned heavily influences technical
decision making. For example, specific use cases that demand
multiple short running jobs present different requirements
than those that specify long-running jobs, even though both
situations are considered "compute focused."</para>
<para>Public and private clouds require deterministic capacity
planning to support elastic growth in order to meet user SLA
expectations. Deterministic capacity planning is the path to
predicting the effort and expense of making a given process
consistently performant. This process is important because,
when a service becomes a critical part of a user's
infrastructure, the user's fate becomes wedded to the SLAs of
        the cloud itself. In cloud computing, a service's performance
will not be measured by its average speed but rather by the
consistency of its speed.</para>
<para>There are two aspects of capacity planning to consider:
planning the initial deployment footprint, and planning
expansion of it to stay ahead of the demands of cloud
users.</para>
<para>Planning the initial footprint for an OpenStack deployment
is typically done based on existing infrastructure workloads
and estimates based on expected uptake.</para>
<para>The starting point is the core count of the cloud. By
applying relevant ratios, the user can gather information
about:</para>
<itemizedlist>
<listitem>
<para>The number of instances expected to be available
concurrently: (overcommit fraction × cores) / virtual
cores per instance</para>
</listitem>
<listitem>
<para>How much storage is required: flavor disk size ×
number of instances</para>
</listitem>
</itemizedlist>
<para>These ratios can be used to determine the amount of
additional infrastructure needed to support the cloud. For
example, consider a situation in which you require 1600
instances, each with 2 vCPU and 50 GB of storage. Assuming the
default overcommit rate of 16:1, working out the math provides
an equation of:</para>
<itemizedlist>
<listitem>
<para>1600 = (16 x (number of physical cores)) / 2</para>
</listitem>
<listitem>
<para>storage required = 50 GB x 1600</para>
</listitem>
</itemizedlist>
<para>On the surface, the equations reveal the need for 200
physical cores and 80 TB of storage for
/var/lib/nova/instances/. However, it is also important to
look at patterns of usage to estimate the load that the API
services, database servers, and queue servers are likely to
encounter.</para>
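    <para>The sizing arithmetic above is simple enough to capture in
        a few lines; the following Python sketch reproduces the
        worked example, with the figures taken directly from
        it.</para>
    <programlisting language="python"># Sketch of the sizing arithmetic above: 1600 instances of 2 vCPUs
# and 50 GB each, with the default 16:1 CPU overcommit ratio.
instances = 1600
vcpus_per_instance = 2
disk_gb_per_instance = 50
cpu_overcommit = 16.0

physical_cores = instances * vcpus_per_instance / cpu_overcommit
storage_tb = instances * disk_gb_per_instance / 1000.0

print("physical cores: %d" % physical_cores)  # 200
print("storage: %d TB" % storage_tb)          # 80
</programlisting>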
<para>Consider, for example, the differences between a cloud that
        supports a managed web-hosting platform and one running
integration tests for a development project that creates one
instance per code commit. In the former, the heavy work of
creating an instance happens only every few months, whereas
the latter puts constant heavy load on the cloud controller.
The average instance lifetime must be considered, as a larger
number generally means less load on the cloud
controller.</para>
    <para>Aside from the creation and termination of instances, the
        impact of users accessing the service must be considered,
        particularly on nova-api and its associated database. Listing
instances garners a great deal of information and, given the
frequency with which users run this operation, a cloud with a
large number of users can increase the load significantly.
This can even occur unintentionally. For example, the
OpenStack Dashboard instances tab refreshes the list of
instances every 30 seconds, so leaving it open in a browser
window can cause unexpected load.</para>
<para>Consideration of these factors can help determine how many
        cloud controller cores are required. A server with 8 CPU cores
        and 8 GB of RAM would be sufficient for up to a rack of
        compute nodes, given the above caveats.</para>
<para>Key hardware specifications are also crucial to the
performance of user instances. Be sure to consider budget and
performance needs, including storage performance
(spindles/core), memory availability (RAM/core), network
bandwidth (Gbps/core), and overall CPU performance
(CPU/core).</para>
<para>The cloud resource calculator is a useful tool in examining
the impacts of different hardware and instance load outs. It
is available at:</para>
<itemizedlist>
<listitem>
<para>https://github.com/noslzzp/cloud-resource-calculator/blob/master/cloud-resource-calculator.ods</para>
</listitem>
</itemizedlist>
<section xml:id="expansion-planning-compute-focus">
<title>Expansion Planning</title>
<para>A key challenge faced when planning the expansion of cloud
compute services is the elastic nature of cloud infrastructure
demands. Previously, new users or customers would be forced to
plan for and request the infrastructure they required ahead of
time, allowing time for reactive procurement processes. Cloud
computing users have come to expect the agility provided by
having instant access to new resources as they are required.
        Consequently, planning should provide for
        typical usage and, more importantly, for sudden bursts in
        usage.</para>
<para>Planning for expansion can be a delicate balancing act.
Planning too conservatively can lead to unexpected
oversubscription of the cloud and dissatisfied users. Planning
for cloud expansion too aggressively can lead to unexpected
underutilization of the cloud and funds spent on operating
infrastructure that is not being used efficiently.</para>
<para>The key is to carefully monitor the spikes and valleys in
cloud usage over time. The intent is to measure the
consistency with which services can be delivered, not the
        average speed or capacity of the cloud. Using this information
        to model capacity and performance enables users to more
        accurately determine the current and future capacity of the
        cloud.</para></section>
<section xml:id="cpu-and-ram-compute-focus"><title>CPU and RAM</title>
<para>(Adapted from:
http://docs.openstack.org/openstack-ops/content/compute_nodes.html#cpu_choice)</para>
<para>In current generations, CPUs have up to 12 cores. If an
Intel CPU supports Hyper-Threading, those 12 cores are doubled
        to 24 logical cores. If a server is purchased that supports multiple
CPUs, the number of cores is further multiplied.
Hyper-Threading is Intel's proprietary simultaneous
multi-threading implementation, used to improve
parallelization on their CPUs. Consider enabling
Hyper-Threading to improve the performance of multithreaded
applications.</para>
<para>Whether the user should enable Hyper-Threading on a CPU
depends upon the use case. For example, disabling
Hyper-Threading can be beneficial in intense computing
environments. Performance testing conducted by running local
workloads with both Hyper-Threading on and off can help
determine what is more appropriate in any particular
case.</para>
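    <para>As a minimal, hedged sketch of such a test, the following
        Python snippet runs the same CPU-bound task at increasing
        worker counts; comparing the timings with Hyper-Threading
        enabled and disabled (via BIOS or kernel settings) gives a
        rough signal. The task and iteration counts are illustrative
        assumptions.</para>
    <programlisting language="python"># Hedged micro-benchmark sketch: wall-clock time for N parallel
# CPU-bound workers. Run once with SMT on and once with SMT off.
import time
from multiprocessing import Pool

def burn(n):
    total = 0
    for i in range(n):
        total += i * i
    return total

if __name__ == "__main__":
    for workers in (1, 2, 4, 8):
        pool = Pool(processes=workers)
        start = time.time()
        pool.map(burn, [2000000] * workers)  # one task per worker
        pool.close()
        pool.join()
        print("%d workers: %.2f seconds" % (workers, time.time() - start))
</programlisting>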
    <para>If the Libvirt/KVM hypervisor driver is the intended
        choice, then the CPUs used in the compute nodes must support
        virtualization by way of the VT-x extensions for Intel chips
        and the AMD-V extensions for AMD chips to provide full
        performance.</para>
<para>OpenStack enables the user to overcommit CPU and RAM on
compute nodes. This allows an increase in the number of
instances running on the cloud at the cost of reducing the
performance of the instances. OpenStack Compute uses the
following ratios by default:</para>
<itemizedlist>
<listitem>
<para>CPU allocation ratio: 16:1</para>
</listitem>
<listitem>
<para>RAM allocation ratio: 1.5:1</para>
</listitem>
</itemizedlist>
<para>The default CPU allocation ratio of 16:1 means that the
scheduler allocates up to 16 virtual cores per physical core.
For example, if a physical node has 12 cores, the scheduler
sees 192 available virtual cores. With typical flavor
definitions of 4 virtual cores per instance, this ratio would
provide 48 instances on a physical node.</para>
<para>Similarly, the default RAM allocation ratio of 1.5:1 means
that the scheduler allocates instances to a physical node as
long as the total amount of RAM associated with the instances
is less than 1.5 times the amount of RAM available on the
physical node.</para>
<para>For example, if a physical node has 48 GB of RAM, the
scheduler allocates instances to that node until the sum of
the RAM associated with the instances reaches 72 GB (such as
nine instances, in the case where each instance has 8 GB of
RAM).</para>
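    <para>The scheduler arithmetic in the two examples above can be
        summarized in a short Python sketch; the node sizes are the
        ones used in the examples.</para>
    <programlisting language="python"># Sketch of the default allocation-ratio arithmetic: 16:1 for CPU
# and 1.5:1 for RAM.
def schedulable_capacity(physical_cores, ram_gb,
                         cpu_ratio=16.0, ram_ratio=1.5):
    """Return the virtual cores and RAM the scheduler will hand out."""
    return physical_cores * cpu_ratio, ram_gb * ram_ratio

vcores, vram = schedulable_capacity(12, 48)
print("virtual cores: %d" % vcores)                    # 192
print("allocatable RAM: %d GB" % vram)                 # 72
print("4-vCPU instances per node: %d" % (vcores / 4))  # 48
</programlisting>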
<para>The appropriate CPU and RAM allocation ratio must be
selected based on particular use cases.</para></section>
<section xml:id="additional-hardware-compute-focus"><title>Additional Hardware</title>
<para>Certain use cases may benefit from exposure to additional
devices on the compute node. Examples might include:</para>
<itemizedlist>
<listitem>
<para>High performance computing jobs that benefit from
the availability of graphics processing units (GPUs)
for general-purpose computing.</para>
</listitem>
</itemizedlist>
<itemizedlist>
<listitem>
<para>Cryptographic routines that benefit from the
availability of hardware random number generators to
avoid entropy starvation.</para>
</listitem>
<listitem>
<para>Database management systems that benefit from the
availability of SSDs for ephemeral storage to maximize
read/write time when it is required.</para>
</listitem>
</itemizedlist>
<para>Host aggregates are used to group hosts that share similar
characteristics, which can include hardware similarities. The
addition of specialized hardware to a cloud deployment is
likely to add to the cost of each node, so careful
consideration must be given to whether all compute nodes, or
just a subset which is targetable using flavors, need the
additional customization to support the desired
workloads.</para></section>
<section xml:id="utilization"><title>Utilization</title>
<para>Infrastructure-as-a-Service offerings, including OpenStack,
use flavors to provide standardized views of virtual machine
resource requirements that simplify the problem of scheduling
instances while making the best use of the available physical
resources.</para>
<para>In order to facilitate packing of virtual machines onto
        physical hosts, the default selection of flavors is
constructed so that the second largest flavor is half the size
of the largest flavor in every dimension. It has half the
vCPUs, half the vRAM, and half the ephemeral disk space. The
next largest flavor is half that size again. As a result,
packing a server for general purpose computing might look
conceptually something like this figure:</para>
<mediaobject>
<imageobject>
<imagedata
fileref="../images/Compute_Tech_Bin_Packing_General1.png"
/>
</imageobject>
</mediaobject>
    <para>On the other hand, a CPU-optimized packed server might look
like the following figure:</para>
<mediaobject>
<imageobject>
<imagedata
fileref="../images/Compute_Tech_Bin_Packing_CPU_optimized1.png"
/>
</imageobject>
</mediaobject>
<para>These default flavors are well suited to typical load outs
for commodity server hardware. To maximize utilization,
however, it may be necessary to customize the flavors or
create new ones, to better align instance sizes to the
available hardware.</para>
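    <para>The halving pattern described above is easy to see in a
        short Python sketch; the starting flavor dimensions below are
        illustrative assumptions, not the stock defaults.</para>
    <programlisting language="python"># Sketch of the halving pattern: each flavor is half the previous one
# in every dimension, so instances bin-pack cleanly onto a host.
def flavor_ladder(vcpus, ram_mb, disk_gb, steps):
    ladder = []
    for _ in range(steps):
        ladder.append({"vcpus": vcpus, "ram_mb": ram_mb,
                       "disk_gb": disk_gb})
        vcpus, ram_mb, disk_gb = vcpus // 2, ram_mb // 2, disk_gb // 2
    return ladder

for flavor in flavor_ladder(16, 65536, 320, 4):
    print(flavor)
</programlisting>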
<para>Workload characteristics may also influence hardware choices
and flavor configuration, particularly where they present
different ratios of CPU versus RAM versus HDD
requirements.</para>
    <para>For more information on flavors, refer to:
http://docs.openstack.org/openstack-ops/content/flavors.html</para>
</section>
<section xml:id="performance-compute-focus"><title>Performance</title>
    <para>The infrastructure of a compute-focused cloud should not be
        shared, so that workloads can consume as many of the
        available resources as they need, and accommodations should
        be made to support large-scale workloads.</para>
<para>The duration of batch processing differs depending on
        individual workloads that are launched. Run times range from
        seconds to minutes to hours; as a result, it is
        difficult to predict when resources will be used, for how
        long, and even which resources will be used.</para>
</section>
<section xml:id="security-compute-focus"><title>Security</title>
<para>The security considerations needed for this scenario are
similar to those of the other scenarios discussed in this
book.</para>
    <para>A security domain comprises users, applications, servers,
or networks that share common trust requirements and
expectations within a system. Typically they have the same
authentication and authorization requirements and
users.</para>
<para>These security domains are:</para>
<orderedlist>
<listitem>
<para>Public</para>
</listitem>
<listitem>
<para>Guest</para>
</listitem>
<listitem>
<para>Management</para>
</listitem>
<listitem>
<para>Data</para>
</listitem>
</orderedlist>
<para>These security domains can be mapped individually to the
installation, or they can also be combined. For example, some
deployment topologies combine both guest and data domains onto
one physical network, whereas in other cases these networks
are physically separated. In each case, the cloud operator
should be aware of the appropriate security concerns. Security
        domains should be mapped out against the specific OpenStack
deployment topology. The domains and their trust requirements
depend upon whether the cloud instance is public, private, or
hybrid.</para>
<para>The public security domain is an entirely untrusted area of
the cloud infrastructure. It can refer to the Internet as a
whole or simply to networks over which the user has no
authority. This domain should always be considered
untrusted.</para>
<para>Typically used for compute instance-to-instance traffic, the
guest security domain handles compute data generated by
instances on the cloud; not services that support the
operation of the cloud, for example API calls. Public cloud
providers and private cloud providers who do not have
stringent controls on instance use or who allow unrestricted
internet access to instances should consider this domain to be
untrusted. Private cloud providers may want to consider this
network as internal and therefore trusted only if they have
controls in place to assert that they trust instances and all
their tenants.</para>
<para>The management security domain is where services interact.
Sometimes referred to as the "control plane", the networks in
this domain transport confidential data such as configuration
parameters, user names, and passwords. In most deployments this
domain is considered trusted.</para>
<para>The data security domain is concerned primarily with
information pertaining to the storage services within
OpenStack. Much of the data that crosses this network has high
integrity and confidentiality requirements and depending on
the type of deployment there may also be strong availability
requirements. The trust level of this network is heavily
dependent on deployment decisions and as such we do not assign
this any default level of trust.</para>
    <para>When deploying OpenStack in an enterprise as a private cloud,
it is assumed to be behind a firewall and within the trusted
network alongside existing systems. Users of the cloud are
typically employees or trusted individuals that are bound by
the security requirements set forth by the company. This tends
to push most of the security domains towards a more trusted
model. However, when deploying OpenStack in a public-facing
role, no assumptions can be made and the attack vectors
significantly increase. For example, the API endpoints and the
        software behind them will be vulnerable to potentially hostile
entities wanting to gain unauthorized access or prevent access
to services. This can result in loss of reputation and must be
protected against through auditing and appropriate
filtering.</para>
    <para>Consideration must be taken when managing the users of the
        system, whether operating public or private clouds. The
        identity service allows LDAP to be part of the
        authentication process, which may ease user management if the
        OpenStack deployment can be integrated into existing
        systems.</para>
<para>It is strongly recommended that the API services are placed
behind hardware that performs SSL termination. API services
transmit user names, passwords, and generated tokens between
client machines and API endpoints and therefore must be
secured.</para>
    <para>More information on OpenStack security can be found
at http://docs.openstack.org/security-guide/</para>
</section>
<section xml:id="openstack-components-compute-focus"><title>OpenStack Components</title>
<para>Due to the nature of the workloads that will be used in this
scenario, a number of components will be highly beneficial in
        a compute-focused cloud. This includes the typical OpenStack
components:</para>
<itemizedlist>
<listitem>
<para>OpenStack Compute (Nova)</para>
</listitem>
<listitem>
<para>OpenStack Image Service (Glance)</para>
</listitem>
<listitem>
<para>OpenStack Identity Service (Keystone)</para>
</listitem>
</itemizedlist>
<para>Also consider several specialized components:</para>
<itemizedlist>
<listitem>
<para>OpenStack Orchestration Engine (Heat)</para>
</listitem>
</itemizedlist>
<para>It is safe to assume that, given the nature of the
applications involved in this scenario, these will be heavily
automated deployments. Making use of Heat will be highly
        beneficial in this case. Deploying a batch of instances and
        running an automated set of tests can be scripted; however,
        it makes sense to use the OpenStack Orchestration Engine (Heat)
        to handle all these actions.</para>
<itemizedlist>
<listitem>
<para>OpenStack Telemetry (Ceilometer)</para>
</listitem>
</itemizedlist>
<para>OpenStack Telemetry and the alarms it generates are required
to support autoscaling of instances using OpenStack
Orchestration. Users that are not using OpenStack
Orchestration do not need to deploy OpenStack Telemetry and
may choose to use other external solutions to fulfill their
metering and monitoring requirements.</para>
<para>See also:
http://docs.openstack.org/openstack-ops/content/logging_monitoring.html</para>
<itemizedlist>
<listitem>
<para>OpenStack Block Storage (Cinder)</para>
</listitem>
</itemizedlist>
    <para>Due to the bursty nature of the workloads and the
        applications and instances that will be used for batch
        processing, this cloud will mainly consume memory and CPU, so
        add-on storage for each instance is not a likely
        requirement. This does not mean the OpenStack Block Storage
        service (Cinder) will not be used in the infrastructure, but
        typically it will not be used as a central component.</para>
<itemizedlist>
<listitem>
<para>Networking</para>
</listitem>
</itemizedlist>
<para>When choosing a networking platform, ensure that it either
works with all desired hypervisor and container technologies
and their OpenStack drivers, or includes an implementation of
an ML2 mechanism driver. Networking platforms that provide ML2
mechanisms drivers can be mixed.</para></section>
</section>


@ -0,0 +1,144 @@
<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="user-requirements-compute-focus">
<?dbhtml stop-chunking?>
<title>User Requirements</title>
<para>Compute intensive workloads are defined by their high
utilization of CPU, RAM, or both. User requirements will
determine if a cloud must be built to accommodate anticipated
performance demands.</para>
<itemizedlist>
<listitem>
            <para>Cost: Cost is not generally a primary concern for a
                compute-focused cloud; however, some organizations
                might be concerned with cost avoidance. Repurposing
existing resources to tackle compute-intensive tasks
instead of needing to acquire additional resources may
offer cost reduction opportunities.</para>
</listitem>
<listitem>
<para>Time to Market: Compute-focused clouds can be used
to deliver products more quickly, for example,
speeding up a company's software development life cycle
(SDLC) for building products and applications.</para>
</listitem>
<listitem>
<para>Revenue Opportunity: Companies that are interested
in building services or products that rely on the
power of the compute resources will benefit from a
compute-focused cloud. Examples include the analysis
of large data sets (via Hadoop or Cassandra) or
completing computational intensive tasks such as
rendering, scientific computation, or
simulations.</para>
</listitem>
</itemizedlist>
<section xml:id="legal-requirements-compute-focus"><title>Legal Requirements</title>
<para>Many jurisdictions have legislative and regulatory
requirements governing the storage and management of data in
cloud environments. Common areas of regulation include:</para>
<itemizedlist>
<listitem>
<para>Data retention policies ensuring storage of
persistent data and records management to meet data
archival requirements.</para>
</listitem>
<listitem>
<para>Data ownership policies governing the possession and
responsibility for data.</para>
</listitem>
<listitem>
<para>Data sovereignty policies governing the storage of
data in foreign countries or otherwise separate
jurisdictions.</para>
</listitem>
<listitem>
            <para>Data compliance policies governing where certain
                types of information must reside due to regulatory
                requirements and, more importantly, where it cannot
                reside for the same reason.</para>
</listitem>
</itemizedlist>
<para>Examples of such legal frameworks include the data
protection framework of the European Union
(http://ec.europa.eu/justice/data-protection/ ) and the
requirements of the Financial Industry Regulatory Authority
(http://www.finra.org/Industry/Regulation/FINRARules/ ) in the
United States. Consult a local regulatory body for more
information.</para></section>
<section xml:id="technical-considerations-compute-focus-user"><title>Technical Considerations</title>
<para>The following are some technical requirements that need to
be incorporated into the architecture design.</para>
<itemizedlist>
<listitem>
<para>Performance: If a primary technical concern is for
the environment to deliver high performance
capability, then a compute-focused design is an
obvious choice because it is specifically designed to
host compute-intensive workloads.</para>
</listitem>
<listitem>
<para>Workload persistence: Workloads can be either
short-lived or long running. Short-lived workloads
might include continuous integration and continuous
deployment (CI-CD) jobs, where large numbers of
compute instances are created simultaneously to
perform a set of compute-intensive tasks. The results
or artifacts are then copied from the instance into
long-term storage before the instance is destroyed.
Long-running workloads, like a Hadoop or
high-performance computing (HPC) cluster, typically
ingest large data sets, perform the computational work
on those data sets, then push the results into long
term storage. Unlike short-lived workloads, when the
computational work is completed, they will remain idle
until the next job is pushed to them. Long-running
workloads are often larger and more complex, so the
effort of building them is mitigated by keeping them
active between jobs. Another example of long running
workloads is legacy applications that typically are
persistent over time.</para>
</listitem>
<listitem>
<para>Storage: Workloads targeted for a compute-focused
OpenStack cloud generally do not require any
persistent block storage (although some usages of
Hadoop with HDFS may dictate the use of persistent
block storage). A shared filesystem or object store
will maintain the initial data set(s) and serve as the
destination for saving the computational results. By
avoiding the input-output (IO) overhead, workload
performance is significantly enhanced. Depending on
the size of the data set(s), it might be necessary to
scale the object store or shared file system to match
the storage demand.</para>
</listitem>
<listitem>
<para>User Interface: Like any other cloud architecture, a
compute-focused OpenStack cloud requires an on-demand
and self-service user interface. End users must be
able to provision computing power, storage, networks
and software simply and flexibly. This includes
scaling the infrastructure up to a substantial level
without disrupting host operations.</para>
</listitem>
<listitem>
            <para>Security: Security will be highly dependent
on the business requirements. For example, a
computationally intense drug discovery application
will obviously have much higher security requirements
than a cloud that is designed for processing market
data for a retailer. As a general start, the security
recommendations and guidelines provided in the
OpenStack Security Guide are applicable.</para>
</listitem>
</itemizedlist></section>
<section xml:id="operational-considerations-compute-focus-user"><title>Operational Considerations</title>
    <para>From an operational perspective, a compute-intensive cloud
        is similar to a general-purpose cloud in its requirements.
        More details on operational requirements can be found in the
        general-purpose design section.</para></section>
</section>


@ -0,0 +1,744 @@
<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="arch-guide-architecture-overview">
<?dbhtml stop-chunking?>
<title>Architecture</title>
<para>Hardware selection involves three key areas:</para>
<itemizedlist>
<listitem>
<para>Compute</para>
</listitem>
<listitem>
<para>Network</para>
</listitem>
<listitem>
<para>Storage</para>
</listitem>
</itemizedlist>
<para>For each of these areas, the selection of hardware for a
        general purpose OpenStack cloud must reflect the fact that
the cloud has no pre-defined usage model. This means that
there will be a wide variety of applications running on this
cloud that will have varying resource usage requirements. Some
applications will be RAM-intensive, some applications will be
CPU-intensive, while others will be storage-intensive.
Therefore, choosing hardware for a general purpose OpenStack
cloud must provide balanced access to all major
resources.</para>
<para>Certain hardware form factors may be better suited for use
in a general purpose OpenStack cloud because of the need for
an equal or nearly equal balance of resources. Server hardware
for a general purpose OpenStack architecture design must
provide an equal or nearly equal balance of compute capacity
(RAM and CPU), network capacity (number and speed of links),
        and storage capacity (gigabytes or terabytes as well as I/O
        operations per second (IOPS)).</para>
<para>Server hardware is evaluated around four conflicting
dimensions:</para>
<itemizedlist>
<listitem>
<para>Server density: A measure of how many servers can
fit into a given measure of physical space, such as a
rack unit [U].</para>
</listitem>
<listitem>
<para>Resource capacity: The number of CPU cores, how much
RAM, or how much storage a given server will
deliver.</para>
</listitem>
<listitem>
<para>Expandability: The number of additional resources
that can be added to a server before it has reached
its limit.</para>
</listitem>
<listitem>
<para>Cost: The relative purchase price of the hardware
weighted against the level of design effort needed to
build the system.</para>
</listitem>
</itemizedlist>
<para>Increasing server density means sacrificing resource
        capacity or expandability; however, increasing resource
capacity and expandability increases cost and decreases server
density. As a result, determining the best server hardware for
a general purpose OpenStack architecture means understanding
how choice of form factor will impact the rest of the
design.</para>
<itemizedlist>
<listitem>
<para>Blade servers typically support dual-socket
multi-core CPUs, which is the configuration generally
considered to be the "sweet spot" for a general
purpose cloud deployment. Blades also offer
outstanding density. As an example, both HP
BladeSystem and Dell PowerEdge M1000e support up to 16
servers in only 10 rack units. However, the blade
servers themselves often have limited storage and
networking capacity. Additionally, the expandability
of many blade servers can be limited.</para>
</listitem>
<listitem>
<para>1U rack-mounted servers occupy only a single rack
unit. Their benefits include high density, support for
dual-socket multi-core CPUs, and support for
reasonable RAM amounts. This form factor offers
limited storage capacity, limited network capacity,
and limited expandability.</para>
</listitem>
<listitem>
<para>2U rack-mounted servers offer the expanded storage
and networking capacity that 1U servers tend to lack,
but with a corresponding decrease in server density
(half the density offered by 1U rack-mounted
servers).</para>
</listitem>
<listitem>
<para>Larger rack-mounted servers, such as 4U servers,
will tend to offer even greater CPU capacity, often
supporting four or even eight CPU sockets. These
servers often have much greater expandability so will
provide the best option for upgradability. This means,
however, that the servers have a much lower server
density and a much greater hardware cost.</para>
</listitem>
<listitem>
<para>"Sled servers" are rack-mounted servers that support
multiple independent servers in a single 2U or 3U
enclosure. This form factor offers increased density
over typical 1U-2U rack-mounted servers but tends to
suffer from limitations in the amount of storage or
network capacity each individual server
supports.</para>
</listitem>
</itemizedlist>
<para>Given the wide selection of hardware and general user
requirements, the best form factor for the server hardware
supporting a general purpose OpenStack cloud is driven by
outside business and cost factors. No single reference
architecture will apply to all implementations; the decision
must flow out of the user requirements, technical
considerations, and operational considerations. Here are some
of the key factors that influence the selection of server
hardware:</para>
<itemizedlist>
<listitem>
<para>Instance density: Sizing is an important
consideration for a general purpose OpenStack cloud.
The expected or anticipated number of instances that
each hypervisor can host is a common metric used in
sizing the deployment. The selected server hardware
needs to support the expected or anticipated instance
density.</para>
</listitem>
<listitem>
<para>Host density: Physical data centers have limited
physical space, power, and cooling. The number of
hosts (or hypervisors) that can be fitted into a given
metric (rack, rack unit, or floor tile) is another
important method of sizing. Floor weight is an often
overlooked consideration. The data center floor must
be able to support the weight of the proposed number
of hosts within a rack or set of racks. These factors
need to be applied as part of the host density
calculation and server hardware selection.</para>
</listitem>
<listitem>
<para>Power density: Data centers have a specified amount
                of power fed to a given rack or set of racks. Older
                data centers may have a power density as low
                as 20 amps per rack, while more recent data centers
                can be architected to support power densities as high
                as 120 amps per rack. The selected server hardware must
take power density into account.</para>
</listitem>
<listitem>
<para>Network connectivity: The selected server hardware
must have the appropriate number of network
connections, as well as the right type of network
connections, in order to support the proposed
architecture. Ensure that, at a minimum, there are at
least two diverse network connections coming into each
rack. For architectures requiring even more
redundancy, it might be necessary to confirm that the
network connections are from diverse telecom
providers. Many data centers have that capacity
available.</para>
</listitem>
</itemizedlist>
<para>The selection of certain form factors or architectures will
affect the selection of server hardware. For example, if the
        design calls for a scale-out storage architecture (for
example, leveraging Ceph, Gluster, or a similar commercial
solution), then the server hardware selection will need to be
carefully considered to match the requirements set by the
commercial solution. Ensure that the selected server hardware
is configured to support enough storage capacity (or storage
expandability) to match the requirements of selected scale-out
storage solution. For example, if a centralized storage
solution is required, such as a centralized storage array from
        a storage vendor that has InfiniBand or FDDI connections, the
server hardware will need to have appropriate network adapters
installed to be compatible with the storage array vendor's
specifications.</para>
<para>Similarly, the network architecture will have an impact on
the server hardware selection and vice versa. For example,
make sure that the server is configured with enough additional
network ports and expansion cards to support all of the
networks required. There is variability in network expansion
cards, so it is important to be aware of potential impacts or
interoperability issues with other components in the
architecture. This is especially true if the architecture uses
InfiniBand or another less commonly used networking
protocol.</para>
<section xml:id="selecting-storage-hardware">
<title>Selecting Storage Hardware</title>
<para>The selection of storage hardware is largely determined by
the proposed storage architecture. Factors that need to be
incorporated into the storage architecture include:</para>
<itemizedlist>
<listitem>
<para>Cost: Storage can be a significant portion of the
overall system cost that should be factored into the
design decision. For an organization that is concerned
                with vendor support, a commercial storage solution is
                advisable, although it comes with a higher price
                tag. If minimizing initial capital expenditure is the
                priority, designing a system based on commodity
                hardware would apply. The trade-off is potentially
higher support costs and a greater risk of
incompatibility and interoperability issues.</para>
</listitem>
<listitem>
<para>Performance: Storage performance, measured by
observing the latency of storage I-O requests, is not
a critical factor for a general purpose OpenStack
cloud as overall systems performance is not a design
priority.</para>
</listitem>
<listitem>
<para>Scalability: The term "scalability" refers to how
well the storage solution performs as it expands up to
its maximum designed size. A solution that continues
to perform well at maximum expansion is considered
                scalable. A storage solution that performs well in
                small configurations but whose performance degrades as
                it expands is not considered scalable.
Scalability, along with expandability, is a major
consideration in a general purpose OpenStack cloud. It
might be difficult to predict the final intended size
of the implementation because there are no established
usage patterns for a general purpose cloud. Therefore,
it may become necessary to expand the initial
deployment in order to accommodate growth and user
demand. The ability of the storage solution to
continue to perform well as it expands is
important.</para>
</listitem>
<listitem>
<para>Expandability: This refers to the overall ability of
the solution to grow. A storage solution that expands
to 50 PB is considered more expandable than a solution
                that only scales to 10 PB. This metric is related to,
                but different from, scalability, which is a measure of
                the solution's performance as it expands.
                Expandability is a major architecture factor for
                storage solutions in a general purpose OpenStack
                cloud. For example, the storage architecture for a
cloud that is intended for a development platform may
not have the same expandability and scalability
requirements as a cloud that is intended for a
commercial product.</para>
</listitem>
</itemizedlist>
<para>Storage hardware architecture is largely determined by the
selected storage architecture. The selection of storage
architecture, as well as the corresponding storage hardware,
is determined by evaluating possible solutions against the
critical factors, the user requirements, technical
considerations, and operational considerations. A combination
of all the factors and considerations will determine which
approach will be best.</para>
<para>Using a scale-out storage solution with direct-attached
storage (DAS) in the servers is well suited for a general
purpose OpenStack cloud. In this scenario, it is possible to
populate storage in either the compute hosts similar to a grid
computing solution or into hosts dedicated to providing block
        storage exclusively. When deploying storage in the compute
        hosts, appropriate hardware that can support both the storage
        and compute services on the same node will be required.
This approach is referred to as a grid computing architecture
because there is a grid of modules that have both compute and
storage in a single box.</para>
<para>Understanding the requirements of cloud services will help
determine if Ceph, Gluster, or a similar scale-out solution
        should be used. It can then be further determined if a single,
        highly expandable and highly vertically scalable, centralized
        storage array should be included in the design. Once the
        approach has been determined, the storage hardware needs to be
        chosen based on these criteria. If a centralized storage array
fits the requirements best, then the array vendor will
determine the hardware. For cost reasons it may be decided to
build an open source storage array using solutions such as
OpenFiler, Nexenta Open Source, or BackBlaze Open
Source.</para>
<para>This list expands upon the potential impacts for including a
particular storage architecture (and corresponding storage
hardware) into the design for a general purpose OpenStack
cloud:</para>
<itemizedlist>
<listitem>
<para>Connectivity: Ensure that, if storage protocols
other than Ethernet are part of the storage solution,
the appropriate hardware has been selected. Some
examples include InfiniBand, FDDI and Fibre Channel.
If a centralized storage array is selected, ensure
that the hypervisor will be able to connect to that
storage array for image storage.</para>
</listitem>
<listitem>
<para>Usage: How the particular storage architecture will
be used is critical for determining the architecture.
Some of the configurations that will influence the
architecture include whether it will be used by the
hypervisors for ephemeral instance storage or if
OpenStack Swift will use it for object storage. All of
these usage models are affected by the selection of
particular storage architecture and the corresponding
storage hardware to support that architecture.</para>
</listitem>
<listitem>
<para>Instance and image locations: Where instances and
                images will be stored will influence the architecture.
                For example, instances can be stored in a number of
                locations. OpenStack Cinder is a good location for
                instances because it is persistent block storage;
                however, Swift can be used if storage latency is less
                of a concern. The same argument applies to the
appropriate image storage location.</para>
</listitem>
<listitem>
<para>Server Hardware: If the solution is a scale-out
storage architecture that includes DAS, naturally that
will affect the server hardware selection. This could
ripple into the decisions that affect host density,
instance density, power density, OS-hypervisor,
management tools and others.</para>
</listitem>
</itemizedlist>
<para>A general purpose OpenStack cloud has multiple options. As a
result, there is no single decision that will apply to all
implementations. The key factors that will have an influence
on selection of storage hardware for a general purpose
OpenStack cloud are as follows:</para>
<itemizedlist>
<listitem>
<para>Capacity: Hardware resources selected for the
resource nodes should be capable of supporting enough
storage for the cloud services that will use them. It
is important to clearly define the initial
requirements and ensure that the design can support
adding capacity as resources are used in the cloud, as
workloads are relatively unknown. Hardware nodes
selected for object storage should be capable of
supporting a large number of inexpensive disks and
should not have any reliance on RAID controller cards.
Hardware nodes selected for block storage should be
capable of supporting higher speed storage solutions
and RAID controller cards to provide performance and
redundancy to storage at the hardware level. Selecting
hardware RAID controllers that can automatically
repair damaged arrays will further assist with
replacing and repairing degraded or destroyed storage
devices within the cloud.</para>
</listitem>
<listitem>
<para>Performance: Disks selected for the object storage
service do not need to be fast performing disks. It is
recommended that object storage nodes take advantage
of the best cost per terabyte available for storage at
the time of acquisition and avoid enterprise class
drives. In contrast, disks chosen for the block
storage service should take advantage of performance
boosting features and may entail the use of SSDs or
flash storage to provide for high performing block
storage pools. Storage performance of ephemeral disks
used for instances should also be taken into
consideration. If compute pools are expected to have a
high utilization of ephemeral storage or require very
high performance, it would be advantageous to deploy
similar hardware solutions to block storage in order
to increase the storage performance.</para>
</listitem>
<listitem>
<para>Fault Tolerance: Object storage resource nodes have
no requirements for hardware fault tolerance or RAID
controllers. It is not necessary to plan for fault
tolerance within the object storage hardware because
the object storage service provides replication
between zones as a feature of the service. Block
storage nodes, compute nodes and cloud controllers
should all have fault tolerance built in at the
hardware level by making use of hardware RAID
controllers and varying levels of RAID configuration.
The level of RAID chosen should be consistent with the
performance and availability requirements of the
cloud.</para>
</listitem>
</itemizedlist>
</section>
<section xml:id="selecting-networking-hardware">
<title>Selecting Networking Hardware</title>
<para>As is the case with storage architecture, selecting a
network architecture often determines which network hardware
will be used. The networking software in use is determined by
the selected networking hardware. Some design impacts are
obvious, for example, selecting networking hardware that only
supports Gigabit Ethernet (GbE) will naturally have an impact
on many different areas of the overall design. Similarly,
deciding to use 10 Gigabit Ethernet (10 GbE) has a number of
impacts on various areas of the overall design.</para>
<para>As an example, selecting Cisco networking hardware implies
that the architecture will be using Cisco networking software
(IOS, NX-OS, etc.). Conversely, selecting Arista networking
hardware means the network devices will use Arista networking
software (EOS). In addition, there are more subtle design
impacts that need to be considered. The selection of certain
networking hardware (and therefore the networking software)
could affect the management tools that can be used. There are
exceptions to this; the rise of "open" networking software
that supports a range of networking hardware means that there
are instances where the relationship between networking
hardware and networking software is not as tightly defined.
An example of this type of software is Cumulus Linux, which is
capable of running on a number of switch vendors' hardware
solutions.</para>
<para>Some of the key considerations that should be included in
the selection of networking hardware include:</para>
<itemizedlist>
<listitem>
<para>Port count: The design will require networking
hardware that has the requisite port count.</para>
</listitem>
<listitem>
<para>Port density: The network design will be affected by
the physical space that is required to provide the
requisite port count. A switch that can provide 48 10
GbE ports in 1U has a much higher port density than a
switch that provides 24 10 GbE ports in 2U. A higher
port density is preferred, as it leaves more rack
space for compute or storage components that may be
required by the design, although it can also raise
concerns about fault domains and power density.
Higher density switches are also more expensive, so
weigh this cost carefully; it is important not to
overdesign the network if it is not required. A
simple sizing sketch follows this list.</para>
</listitem>
<listitem>
<para>Port speed: The networking hardware must support the
proposed network speed, for example: 1 GbE, 10 GbE, or
40 GbE (or even 100 GbE).</para>
</listitem>
<listitem>
<para>Redundancy: The level of network hardware redundancy
required is influenced by the user requirements for
high availability and cost considerations. Network
redundancy can be achieved by adding redundant power
supplies or paired switches. If this is a requirement,
the hardware will need to support this configuration.
User requirements will determine if a completely
redundant network infrastructure is required.</para>
</listitem>
<listitem>
<para>Power requirements: Make sure that the physical data
center provides the necessary power for the selected
network hardware. This is not an issue for top of rack
(ToR) switches, but may be an issue for spine switches
in a leaf and spine fabric, or end of row (EoR)
switches.</para>
</listitem>
</itemizedlist>
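<para>As a rough illustration of the port count and port density
trade-off noted in the list above, the following Python sketch
compares two hypothetical switch options against the ports required
by a rack of hosts. All figures are illustrative assumptions, not
recommendations.</para>
<programlisting language="python"># Hypothetical sizing sketch comparing switch options against the
# ports required by one rack of hosts; all figures are assumptions.
import math

hosts_per_rack = 20    # assumed hosts in one rack
ports_per_host = 2     # assumed redundant pair of 10 GbE links per host
ports_required = hosts_per_rack * ports_per_host

for name, ports, rack_units in [("48-port 1U", 48, 1),
                                ("24-port 2U", 24, 2)]:
    switches = math.ceil(ports_required / ports)
    density = ports / rack_units
    print(f"{name}: {density:.0f} ports/U, {switches} switch(es), "
          f"{switches * rack_units}U of rack space")</programlisting>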
<para>There is no single best practice architecture for the
networking hardware supporting a general purpose OpenStack
cloud that will apply to all implementations. Some of the key
factors that will have a strong influence on selection of
networking hardware include:</para>
<itemizedlist>
<listitem>
<para>Connectivity: All nodes within an OpenStack cloud
require some form of network connectivity. In some
cases, nodes require access to more than one network
segment. The design must encompass sufficient network
capacity and bandwidth to ensure that all
communications within the cloud, both north-south and
east-west traffic, have sufficient resources
available.</para>
</listitem>
<listitem>
<para>Scalability: The chosen network design should
encompass a physical and logical network design that
can be easily expanded upon. Network hardware should
offer the appropriate types of interfaces and speeds
that are required by the hardware nodes.</para>
</listitem>
<listitem>
<para>Availability: To ensure that access to nodes within
the cloud is not interrupted, it is recommended that
the network architecture identify any single points of
failure and provide some level of redundancy or fault
tolerance. With regard to the network infrastructure
itself, this often involves use of networking
protocols such as LACP, VRRP or others to achieve a
highly available network connection. In addition, it
is important to consider the networking implications
on API availability. In order to ensure that the APIs,
and potentially other services in the cloud are highly
available, it is recommended to design load balancing
solutions within the network architecture to
accommodate these requirements.</para>
</listitem>
</itemizedlist>
</section>
<section xml:id="software-selection">
<title>Software Selection</title>
<para>Software selection for a general purpose OpenStack
architecture design needs to include these three areas:</para>
<itemizedlist>
<listitem>
<para>Operating system (OS) and hypervisor</para>
</listitem>
<listitem>
<para>OpenStack components</para>
</listitem>
<listitem>
<para>Supplemental software</para>
</listitem>
</itemizedlist>
</section>
<section xml:id="os-and-hypervisor"><title>OS and Hypervisor</title>
<para>The selection of OS and hypervisor has a tremendous impact
on the overall design. Selecting a particular operating system
and hypervisor can also directly affect server hardware
selection. It is recommended to make sure the storage hardware
selection and topology support the selected operating system
and hypervisor combination. Finally, it is important to ensure
that the networking hardware selection and topology will work
with the chosen operating system and hypervisor combination.
For example, if the design uses Link Aggregation Control
Protocol (LACP), the OS and hypervisor both need to support
it.</para>
<para>Some areas that could be impacted by the selection of OS and
hypervisor include:</para>
<itemizedlist>
<listitem>
<para>Cost: Selecting a commercially supported hypervisor,
such as Microsoft Hyper-V, will result in a different
cost model than community-supported open source
hypervisors such as KVM or Xen. When
comparing open source OS solutions, choosing Ubuntu
over Red Hat (or vice versa) will have an impact on
cost due to support contracts. On the other hand,
business or application requirements may dictate a
specific or commercially supported hypervisor.</para>
</listitem>
<listitem>
<para>Supportability: Depending on the selected
hypervisor, the staff should have the appropriate
training and knowledge to support the selected OS and
hypervisor combination. If they do not, training will
need to be provided which could have a cost impact on
the design.</para>
</listitem>
<listitem>
<para>Management tools: The management tools used for
Ubuntu and KVM differ from the management tools
for VMware vSphere. Although both OS and hypervisor
combinations are supported by OpenStack, there will be
very different impacts to the rest of the design as a
result of the selection of one combination versus the
other.</para>
</listitem>
<listitem>
<para>Scale and performance: Ensure that selected OS and
hypervisor combinations meet the appropriate scale and
performance requirements. The chosen architecture will
need to meet the targeted instance-host ratios with
the selected OS-hypervisor combinations.</para>
</listitem>
<listitem>
<para>Security: Ensure that the design can accommodate the
regular periodic installation of application security
patches while maintaining the required workloads. The
frequency of security patches for the proposed
OS-hypervisor combination will have an impact on
performance and the patch installation process could
affect maintenance windows.</para>
</listitem>
<listitem>
<para>Supported features: Determine which features of
OpenStack are required. This will often determine the
selection of the OS-hypervisor combination. Certain
features are only available with specific OSs or
hypervisors. If certain required features are not
available, the design might need to be modified to
meet the user requirements.</para>
</listitem>
<listitem>
<para>Interoperability: Consideration should be given to
the ability of the selected OS-hypervisor combination
to interoperate or co-exist with other OS-hypervisors
as well as other software solutions in the overall
design (if required). Operational troubleshooting
tools for one OS-hypervisor combination may differ
from the tools used for another OS-hypervisor
combination and, as a result, the design will need to
address whether the two sets of tools need to interoperate.
</para>
</listitem>
</itemizedlist>
</section>
<section xml:id="openstack-components">
<title>OpenStack Components</title>
<para>The selection of which OpenStack components are included has
a significant impact on the overall design. While there are
certain components that will always be present (Nova and
Glance, for example), there are other services that may not be
required. As an example, a certain design might not need
OpenStack Heat. Omitting Heat would not have a significant
impact on the overall design of a cloud; however, if the
architecture uses a replacement for OpenStack Swift for its
storage component, it could potentially have significant
impacts on the rest of the design.</para>
<para>The exclusion of certain OpenStack components might also
limit or constrain the functionality of other components. If
the architecture includes Heat but excludes Ceilometer, then
the design will not be able to take advantage of Heat's auto
scaling functionality (which relies on information from
Ceilometer). It is important to research the component
interdependencies in conjunction with the technical
requirements before deciding what components need to be
included and what components can be dropped from the final
architecture.</para>
</section>
<section xml:id="supplemental-components"><title>Supplemental Components</title>
<para>While OpenStack is a fairly complete collection of software
projects for building a platform for cloud services, there are
invariably additional pieces of software that need to be
considered in any given OpenStack design.</para>
</section>
<section xml:id="networking-software"><title>Networking Software</title>
<para>OpenStack Neutron provides a wide variety of networking
services for instances. There are many additional networking
software packages that might be useful to manage the OpenStack
components themselves. Some examples include software to
provide load balancing, network redundancy protocols, and
routing daemons. Some of these software packages are described
in more detail in Chapter 8 of the OpenStack High
Availability Guide.</para>
<para>For a general purpose OpenStack cloud, the OpenStack
infrastructure components will need to be highly available. If
the design does not include hardware load balancing,
networking software packages like HAProxy will need to be
included.</para>
</section>
<section xml:id="management-software"><title>Management Software</title>
<para>The selected supplemental software solution affects
the overall OpenStack cloud design. This includes
software for providing clustering, logging, monitoring and
alerting.</para>
<para>Inclusion of clustering software, such as Corosync or
Pacemaker, is determined primarily by the availability
requirements. Therefore, the impact of including (or not
including) these software packages is primarily determined by
the availability of the cloud infrastructure and the
complexity of supporting the configuration after it is
deployed. The OpenStack High Availability Guide provides more
details on the installation and configuration of Corosync and
Pacemaker, should these packages need to be included in the
design.</para>
<para>Requirements for logging, monitoring, and alerting are
determined by operational considerations. Each of these
sub-categories includes a number of options. For
example, in the logging sub-category one might consider
Logstash, Splunk, VMware Log Insight, or some other log
aggregation-consolidation tool. Logs should be stored in a
centralized location to make it easier to perform analytics
against the data. Log data analytics engines can also provide
automation and issue notification by providing a mechanism to
both alert and automatically attempt to remediate some of the
more commonly known issues.</para>
<para>If any of these software packages are required, then the
design must account for the additional resource consumption
(CPU, RAM, storage, and network bandwidth for a log
aggregation solution, for example). Some other potential
design impacts include:</para>
<itemizedlist>
<listitem>
<para>OS-hypervisor combination: Ensure that the
selected logging, monitoring, or alerting tools
support the proposed OS-hypervisor combination.</para>
</listitem>
<listitem>
<para>Network hardware: The network hardware selection
needs to be supported by the logging, monitoring, and
alerting software.</para>
</listitem>
</itemizedlist>
</section>
<section xml:id="database-software"><title>Database Software</title>
<para>A large majority of the OpenStack components require access
to back-end database services to store state and configuration
information. Selection of an appropriate back-end database
that will satisfy the availability and fault tolerance
requirements of the OpenStack services is required. OpenStack
services support connecting to any database that is supported
by the SQLAlchemy Python drivers; however, most common
database deployments make use of MySQL or variations of it. It
is recommended that the database which provides back-end
services within a general purpose cloud be made highly
available using a technology which can accomplish that goal.
Some of the more common software solutions used include
Galera, MariaDB, and MySQL with multi-master
replication.</para>
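<para>As a minimal sketch of this pattern, the following Python
fragment connects an OpenStack-style service to a MySQL-compatible
back end through SQLAlchemy. The virtual IP is assumed to front a
highly available Galera cluster; the hostname, credentials, and
database name are placeholders.</para>
<programlisting language="python"># Minimal sketch: connecting through SQLAlchemy to a MySQL-compatible
# back end. The VIP hostname, credentials, and database name below
# are placeholders, not a prescribed configuration.
from sqlalchemy import create_engine, text

engine = create_engine(
    "mysql+pymysql://nova:secret@db-vip.example.com/nova",
    pool_recycle=3600,   # recycle connections before the server drops them
    pool_pre_ping=True,  # detect connections broken by a failover
)

with engine.connect() as conn:
    print(conn.execute(text("SELECT VERSION()")).scalar())</programlisting>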
</section>
<section xml:id="addressing-performance-sensitive-workloads"><title>Addressing Performance-Sensitive Workloads</title>
<para>Although one of the key defining factors for a general
purpose OpenStack cloud is that performance is not a
determining factor, there may still be some
performance-sensitive workloads deployed on the general
purpose OpenStack cloud. For design guidance on
performance-sensitive workloads, it is recommended to refer to
the focused scenarios later in this guide. The
resource-focused guides can be used as a supplement to this
guide to help with decisions regarding performance-sensitive
workloads.</para>
</section>
<section xml:id="compute-focused-workloads"><title>Compute-Focused Workloads</title>
<para>In an OpenStack cloud that is compute-focused, there are
some design choices that can help accommodate those workloads.
Compute-focused workloads are generally those that would place
a higher demand on CPU and memory resources with lower
priority given to storage and network performance, other than
what is required to support the intended compute workloads.
For guidance on designing for this type of cloud, please refer
to the section on Compute Focused clouds.</para>
</section>
<section xml:id="network-focused-workloads"><title>Network-Focused Workloads</title>
<para>In a network-focused OpenStack cloud some design choices can
improve the performance of these types of workloads.
Network-focused workloads have extreme demands on network
bandwidth and services that require specialized consideration
and planning. For guidance on designing for this type of
cloud, please refer to the section on Network-Focused clouds.</para>
</section>
<section xml:id="storage-focused-workloads"><title>Storage-Focused Workloads</title>
<para>Storage focused OpenStack clouds need to be designed to
accommodate workloads that have extreme demands on either
object or block storage services that require specialized
consideration and planning. For guidance on designing for this
type of cloud, please refer to the section on Storage-Focused
clouds.</para></section>
</section>


@ -0,0 +1,64 @@
<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="arch-guide-intro-general-purpose">
<title>Introduction</title>
<para>An OpenStack general purpose cloud is often considered a
starting point for building a cloud deployment. General
purpose clouds, by their nature, balance the components and do
not emphasize (or heavily emphasize) any particular aspect of
the overall computing environment. The expectation is that the
compute, network, and storage components will be given equal
weight in the design. General purpose clouds can be found in
private, public, and hybrid environments. They lend themselves
to many different use cases but, since they are homogeneous
deployments, they are not suited to specialized environments
or edge case situations. Common uses to consider for a general
purpose cloud could be, but are not limited to, providing a
simple database, a web application runtime environment, a
shared application development platform, or a lab test bed. In
other words, any use case that would benefit from a scale-out
rather than a scale-up approach is a good candidate for a
general purpose cloud architecture.</para>
<para>A general purpose cloud, by definition, is something that is
designed to have a range of potential uses or functions; not
specialized for a specific use. General purpose architecture
is largely considered a scenario that would address 80% of the
potential use cases. The infrastructure, in itself, is a
specific use case. It is also a good place to start the design
process. As the most basic cloud service model, general
purpose clouds are designed to be platforms suited for general
purpose applications.</para>
<para>General purpose clouds are limited to the most basic
components, but they can include additional resources such
as:</para>
<itemizedlist>
<listitem>
<para>Virtual-machine disk image library</para>
</listitem>
<listitem>
<para>Raw block storage</para>
</listitem>
<listitem>
<para>File or object storage</para>
</listitem>
<listitem>
<para>Firewalls</para>
</listitem>
<listitem>
<para>Load balancers</para>
</listitem>
<listitem>
<para>IP addresses</para>
</listitem>
<listitem>
<para>Network overlays or virtual local area networks
(VLANs)</para>
</listitem>
<listitem>
<para>Software bundles</para>
</listitem>
</itemizedlist>
</section>


@ -0,0 +1,143 @@
<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="operational-considerations-general-purpose">
<?dbhtml stop-chunking?>
<title>Operational Considerations</title>
<para>Many operational factors will affect general purpose cloud
design choices. In larger installations, it is not uncommon
for operations staff to be tasked with maintaining cloud
environments. This differs from the operations staff that is
responsible for building or designing the infrastructure. It
is important to include the operations function in the
planning and design phases of the build out.</para>
<para>Service Level Agreements (SLAs) are contractual obligations
that provide assurances for service availability. SLAs define
levels of availability that drive the technical design, often
with penalties for not meeting the contractual obligations.
The strictness of the SLA dictates the level of redundancy and
resiliency in the OpenStack cloud design. Knowing when and
where to implement redundancy and HA is directly affected by
expectations set by the terms of the SLA. Some of the SLA
terms that will affect the design include:</para>
<itemizedlist>
<listitem>
<para>Guarantees for API availability imply multiple
infrastructure services combined with highly available
load balancers.</para>
</listitem>
<listitem>
<para>Network uptime guarantees will affect the switch
design and might require redundant switching and
power.</para>
</listitem>
<listitem>
<para>Network security policy requirements need to be
factored into deployments.</para>
</listitem>
</itemizedlist>
<section xml:id="support-and-maintainability-general-purpose"><title>Support and Maintainability</title>
<para>OpenStack cloud management requires operations staff to
understand the design architecture content on some level.
The level of skills and the level of separation
of the operations and engineering staff are dependent on the
size and purpose of the installation. A large cloud service
provider or a telecom provider is more likely to be managed by
a specially trained, dedicated operations organization. A
smaller implementation is more likely to rely on a smaller
support staff that might need to take on the combined
engineering, design and operations functions.</para>
<para>Furthermore, maintaining OpenStack installations requires a
variety of technical skills. Some of these skills may include
the ability to debug Python log output to a basic level and an
understanding of networking concepts.</para>
<para>Consider incorporating features into the architecture and
design that reduce the operations burden. This is accomplished
by automating some of the operations functions. In some cases
it may be beneficial to use a third party management company
with special expertise in managing OpenStack
deployments.</para></section>
<section xml:id="monitoring-general-purpose"><title>Monitoring</title>
<para>Like any other infrastructure deployment, OpenStack clouds
need an appropriate monitoring platform to ensure any errors
are caught and managed appropriately. Consider leveraging any
existing monitoring system to see if it will be able to
effectively monitor an OpenStack environment. While there are
many aspects that need to be monitored, specific metrics that
are critically important to capture include image disk
utilization and response time to the Compute API.</para>
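<para>The following is an illustrative sketch of capturing one such
metric, Compute API response time, with the standard Python requests
library. The endpoint URL is a placeholder, and a real deployment
would authenticate through the Identity service first.</para>
<programlisting language="python"># Illustrative check of Compute API response time. The endpoint URL
# is a placeholder; a production check would authenticate first and
# feed the result into the existing monitoring platform.
import time
import requests

NOVA_ENDPOINT = "http://controller.example.com:8774/"  # placeholder

start = time.monotonic()
response = requests.get(NOVA_ENDPOINT, timeout=5)
elapsed = time.monotonic() - start

print(f"HTTP {response.status_code} in {elapsed * 1000:.1f} ms")
# Alert when this measurement exceeds the agreed SLA threshold.</programlisting></section>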
<section xml:id="downtime-general-purpose"><title>Downtime</title>
<para>No matter how robust the architecture is, at some point
components will fail. Designing for high availability (HA) can
have significant cost ramifications, therefore the resiliency
of the overall system and the individual components is going
to be dictated by the requirements of the SLA. Downtime
planning includes creating processes and architectures that
support planned (maintenance) and unplanned (system faults)
downtime.</para>
<para>An example of an operational consideration is the recovery
of a failed compute host. This might mean requiring the
restoration of instances from a snapshot or respawning an
instance on another available compute host. This could have
consequences on the overall application design. A general
purpose cloud should not need to provide an ability to migrate
instances from one host to another. However, if an
application is not designed to tolerate failure,
additional considerations need to be made around supporting
instance migration. In this scenario, extra supporting
services, including shared storage attached to compute hosts,
might need to be deployed.</para></section>
<section xml:id="capacity-planning"><title>Capacity Planning</title>
<para>Capacity planning for future growth is a critically
important and often overlooked consideration. Capacity
constraints in a general purpose cloud environment include
compute and storage limits. There is a relationship between
the size of the compute environment and the supporting
OpenStack infrastructure controller nodes required to support
it. As the size of the supporting compute environment
increases, the network traffic and messages will increase
which will add load to the controller or networking nodes.
While no hard and fast rule exists, effective monitoring of
the environment will help with capacity decisions on when to
scale the back-end infrastructure as part of the scaling of
the compute resources.</para>
<para>Adding extra compute capacity to an OpenStack cloud is a
horizontally scaling process as consistently configured
compute nodes automatically attach to an OpenStack cloud. Be
mindful of any additional work that is needed to place the
nodes into appropriate availability zones and host aggregates.
Make sure to use identical or functionally compatible CPUs
when adding additional compute nodes to the environment;
otherwise, live migration features will break. Scaling out
compute hosts directly affects network and other
data center resources, so it may be necessary to add rack
capacity or network switches.</para>
<para>Another option is to assess the average workloads and
increase the number of instances that can run within the
compute environment by adjusting the overcommit ratio. While
only appropriate in some environments, it is important to
remember that changing the CPU overcommit ratio can have a
detrimental effect and cause a potential increase in
noisy-neighbor problems. The added risk of increasing the
overcommit ratio is that more instances will fail when a
compute host fails.</para>
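<para>The following back-of-the-envelope sketch, using purely
illustrative numbers, shows this trade-off: a higher CPU overcommit
ratio packs more instances onto each host, but also increases the
number of instances lost when a single host fails.</para>
<programlisting language="python"># Back-of-the-envelope view of the CPU overcommit trade-off;
# all inputs are illustrative assumptions.
threads_per_host = 24      # assumed CPU threads on one compute host
vcpus_per_instance = 2

for ratio in (4, 8, 16):
    instances = (threads_per_host * ratio) // vcpus_per_instance
    print(f"{ratio}:1 overcommit -> {instances} instances per host; "
          f"all of them are lost if that host fails")</programlisting>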
<para>Compute host components can also be upgraded to account for
increases in demand; this is known as vertical scaling.
Upgrading CPUs with more cores, or increasing the overall
server memory, can add extra needed capacity depending on
whether the running applications are more CPU intensive or
memory intensive.</para>
<para>Insufficient disk capacity could also have a negative effect
on overall performance including CPU and memory usage.
Depending on the back-end architecture of the OpenStack Block
Storage layer, adding capacity might involve adding disk shelves
to enterprise storage systems or installing additional block
storage nodes. It may also be necessary to upgrade directly
attached storage installed in compute hosts or add capacity to
the shared storage to provide additional ephemeral storage to
instances.</para>
<para>For a deeper discussion on many of these topics, refer to
the OpenStack Operations Guide at
http://docs.openstack.org/ops.</para></section>
</section>


@ -0,0 +1,100 @@
<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="prescriptive-example-online-classifieds">
<?dbhtml stop-chunking?>
<title>Prescriptive Example</title>
<para>An online classified advertising company wants to run web applications
consisting of Tomcat, Nginx and MariaDB in a private cloud. In order to
meet policy requirements, the cloud infrastructure will run in their own
data center. They have predictable load requirements but require an
element of scaling to cope with nightly increases in demand. Their
current environment is not flexible enough to align with their goal of
running an open source, API-driven environment. Their current environment
consists of the following:</para>
<itemizedlist>
<listitem>
<para>Between 120 and 140 installations of Nginx and
Tomcat, each with 2 vCPUs and 4 GB of RAM</para>
</listitem>
<listitem>
<para>A three-node MariaDB and Galera cluster, each with 4
vCPUs and 8 GB RAM</para>
</listitem>
</itemizedlist>
<para>The company runs hardware load balancers and multiple web
applications serving the sites. The company orchestrates their
environment using a combination of scripts and Puppet. The
websites generate a large amount of log data each day that
needs to be archived.</para>
<para>The solution would consist of the following OpenStack
components:</para>
<itemizedlist>
<listitem>
<para>A firewall, switches and load balancers on the
public facing network connections.</para>
</listitem>
<listitem>
<para>OpenStack Controller services running Image,
Identity, Networking and supporting services such as
MariaDB and RabbitMQ. The controllers will run in a
highly available configuration on at least three
controller nodes.</para>
</listitem>
<listitem>
<para>OpenStack Compute nodes running the KVM
hypervisor.</para>
</listitem>
<listitem>
<para>OpenStack Block Storage for use by compute instances
that require persistent storage such as databases for
dynamic sites.</para>
</listitem>
<listitem>
<para>OpenStack Object Storage for serving static objects
such as images.</para>
</listitem>
</itemizedlist>
<para><inlinemediaobject><imageobject><imagedata
fileref="../images/General_Architecture3.png"
/></imageobject></inlinemediaobject>Running up to 140
web instances and the small number of MariaDB instances
requires 292 vCPUs available, as well as 584 GB RAM. On a
typical 1U server using dual-socket hex-core Intel CPUs with
Hyperthreading, and assuming 2:1 CPU overcommit ratio, this
would require 8 OpenStack Compute nodes.</para>
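<para>The sizing arithmetic can be reconstructed roughly as follows;
the addition of one spare node for failure headroom is an assumption
made here to reconcile the rounded-up node count.</para>
<programlisting language="python"># Rough reconstruction of the sizing arithmetic above. The one
# spare node added for failure headroom is an assumption.
import math

web_count, web_vcpus, web_ram_gb = 140, 2, 4
db_count, db_vcpus, db_ram_gb = 3, 4, 8

total_vcpus = web_count * web_vcpus + db_count * db_vcpus     # 292
total_ram_gb = web_count * web_ram_gb + db_count * db_ram_gb  # 584

threads_per_node = 2 * 6 * 2           # dual-socket, hex-core, hyperthreaded
vcpus_per_node = threads_per_node * 2  # 2:1 CPU overcommit ratio

nodes = math.ceil(total_vcpus / vcpus_per_node) + 1  # +1 spare node
print(f"{total_vcpus} vCPUs and {total_ram_gb} GB RAM -> {nodes} nodes")</programlisting>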
<para>The web application instances run from local storage on each
of the OpenStack Compute nodes. The web application instances
are stateless, meaning that any of the instances can fail and
the application will continue to function.</para>
<para>MariaDB server instances store their data on shared
enterprise storage, such as NetApp or SolidFire devices. If a
MariaDB instance fails, storage would be expected to be
re-attached to another instance and rejoined to the Galera
cluster.</para>
<para>Logs from the web application servers are shipped to
OpenStack Object Storage for later processing and
archiving.</para>
<para>In this scenario, additional capabilities can be realized by
moving static web content to be served from OpenStack Object
Storage containers, and backing the OpenStack Image Service
with OpenStack Object Storage. Note that an increase in
OpenStack Object Storage means that network bandwidth needs to
be taken into consideration. It is best to run OpenStack
Object Storage with network connections offering 10 GbE or
better connectivity.</para>
<para>There is also a potential to leverage the Orchestration and
Telemetry OpenStack modules to provide an auto-scaling,
orchestrated web application environment. Defining the web
applications in Heat Orchestration Templates (HOT) would
negate the reliance on the scripted Puppet solution currently
employed.</para>
<para>OpenStack Networking can be used to control hardware load
balancers through the use of plug-ins and the Networking API.
This would allow a user to control hardware load balancer pools
and instances as members in these pools, but their use in
production environments must be carefully weighed against
current stability.</para>
</section>


@ -0,0 +1,715 @@
<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="technical-considerations-general-purpose">
<?dbhtml stop-chunking?>
<title>Technical Considerations</title>
<para>When designing a general purpose cloud, there is an implied
requirement to design for all of the base services generally
associated with providing Infrastructure-as-a-Service:
compute, network and storage. Each of these services has
different resource requirements. As a result, it is important
to make design decisions relating directly to the service
currently under design, while providing a balanced
infrastructure that provides for all services.</para>
<para>When designing an OpenStack cloud as a general purpose
cloud, the hardware selection process can be lengthy and
involved due to the sheer number of services which need to be
designed and the unique characteristics and requirements of
each service within the cloud. Hardware designs need to be
generated for each type of resource pool; specifically,
compute, network, and storage. In addition to the hardware
designs, which affect the resource nodes themselves, there are
also a number of additional hardware decisions to be made
related to network architecture and facilities planning. These
factors play heavily into the overall architecture of an
OpenStack cloud.</para>
<section xml:id="designing-compute-resources-tech-considerations">
<title>Designing Compute Resources</title>
<para>It is recommended to design compute resources as pools of
resources which will be addressed on-demand. When designing
compute resource pools, a number of factors impact your design
decisions. For example, decisions related to processors,
memory, and storage within each hypervisor are just one
element of designing compute resources. In addition, it is
necessary to decide whether compute resources will be provided
in a single pool or in multiple pools.</para>
<para>To design for the best use of available resources by
applications running in the cloud, it is recommended to design
more than one compute resource pool. Each independent resource
pool should be designed to provide service for specific
flavors of instances or groupings of flavors. For the purpose
of this book, "instance" refers to a virtual machine and the
operating system running on the virtual machine. Designing
multiple resource pools helps to ensure that, as instances are
scheduled onto compute hypervisors, each independent node's
resources will be allocated in a way that makes the most
efficient use of available hardware. This is commonly referred
to as bin packing.</para>
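<para>The following minimal first-fit-decreasing sketch illustrates
the bin packing idea. The actual OpenStack scheduler is far more
sophisticated; this only shows why identically sized hosts make
efficient packing easier to reason about.</para>
<programlisting language="python"># Minimal first-fit-decreasing sketch of bin packing: instances
# (by vCPU demand) are packed onto identically sized hosts. This is
# an illustration only, not the Nova scheduler's algorithm.
def pack(instance_vcpus, host_capacity):
    hosts = []  # each entry is the remaining free vCPUs on one host
    for demand in sorted(instance_vcpus, reverse=True):
        for i, free in enumerate(hosts):
            if free >= demand:
                hosts[i] -= demand
                break
        else:
            hosts.append(host_capacity - demand)  # open a new host
    return len(hosts)

print(pack([8, 4, 4, 2, 2, 2, 1, 1], 16))  # -> 2 hosts</programlisting>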
<para>Using a consistent hardware design among the nodes that are
placed within a resource pool also helps support bin packing.
Hardware nodes selected for being a part of a compute resource
pool should share a common processor, memory, and storage
layout. By choosing a common hardware design, it becomes
easier to deploy, support and maintain those nodes throughout
their life cycle in the cloud.</para>
<para>OpenStack provides the ability to configure the overcommit
ratio (the ratio of virtual resources available for allocation
to physical resources present) for both CPU and memory. The
default CPU overcommit ratio is 16:1 and the default memory
overcommit ratio is 1.5:1. Determine the tuning of the
overcommit ratios for both of these options during the design
phase, as this has a direct impact on the hardware layout of
your compute nodes.</para>
<para>As an example, consider that an m1.small instance uses 1
vCPU, 20 GB of ephemeral storage and 2,048 MB of RAM. When
designing a hardware node as a compute resource pool to
service instances, take into consideration the number of
processor cores available on the node as well as the required
disk and memory to service instances running at capacity. For
a server with 2 CPUs of 10 cores each, with hyperthreading
turned on, the default CPU overcommit ratio of 16:1 would
allow for 640 (2 x 10 x 2 x 16) total m1.small instances. By
the same reasoning, using the default memory overcommit ratio
of 1.5:1 you can determine that the server will need at least
853 GB (640 x 2,048 MB / 1.5) of RAM. When sizing nodes for
memory, it is also important to consider the additional memory
required to service operating system and service needs.</para>
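<para>The calculation above can be written out as a short worked
example:</para>
<programlisting language="python"># Worked form of the example above, using the default ratios.
sockets, cores, threads = 2, 10, 2   # node hardware from the text
cpu_ratio, ram_ratio = 16, 1.5       # default overcommit ratios

instances = sockets * cores * threads * cpu_ratio  # 640 m1.small
ram_mb = instances * 2048 / ram_ratio              # m1.small = 2,048 MB

print(f"{instances} m1.small instances per node")
print(f"at least {ram_mb / 1024:.0f} GB of RAM required")  # ~853 GB</programlisting>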
<para>Processor selection is an extremely important consideration
in hardware design, especially when comparing the features and
performance characteristics of different processors. Some
newly released processors include features specific to
virtualized compute hosts including hardware assisted
virtualization and technology related to memory paging (also
known as EPT shadowing). These features have a tremendous
positive impact on the performance of virtual machines running
in the cloud.</para>
<para>In addition to the impact on actual compute services, it is
also important to consider the compute requirements of
resource nodes within the cloud. "Resource nodes" refers to
non-hypervisor nodes providing controller, object storage,
block storage, or networking services in the cloud. The number
of processor cores and threads has a direct correlation to the
number of worker threads which can be run on a resource node.
It is important to ensure sufficient compute capacity and
memory is planned on resource nodes.</para>
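<para>A common sizing heuristic, presented here as an assumption
rather than an OpenStack default, is to run roughly one worker
process per core with a floor for small machines:</para>
<programlisting language="python"># Heuristic sketch (an assumption, not an OpenStack default):
# roughly one worker process per core, with a minimum floor.
import os

def worker_count(minimum=4):
    return max(minimum, os.cpu_count() or minimum)

print(f"suggested worker processes: {worker_count()}")</programlisting>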
<para>Workload profiles are unpredictable in a general purpose
cloud, so it may be difficult to design with every specific use
case in mind. Additional compute resource pools can be added
to the cloud at a later time, so this unpredictability should
not be a problem. In some cases, the demand on certain
instance types or flavors may not justify an individual
hardware design. In either of these cases, start by providing
hardware designs which will be capable of servicing the most
common instance requests first, looking to add additional
hardware designs to the overall architecture in the form of
new hardware node designs and resource pools as they become
justified at a later time.</para></section>
<section xml:id="designing-network-resources-tech-considerations">
<title>Designing Network Resources</title>
<para>An OpenStack cloud traditionally has multiple network
segments, each of which provides access to resources within
the cloud to both operators and tenants. In addition, the
network services themselves also require network communication
paths which should also be separated from the other networks.
When designing network services for a general purpose cloud,
it is recommended to plan for either a physical or logical
separation of network segments which will be used by operators
and tenants. It is further suggested to create an additional
network segment for access to internal services such as the
message bus and database used by the various cloud services.
Segregating these services onto separate networks helps to
protect sensitive data and also protects against unauthorized
access to services.</para>
<para>Based on the requirements of instances being serviced in the
cloud, the next design choice which will affect your design is
the choice of network service which will be used to service
instances in the cloud. The choice between nova-network, as a
part of OpenStack Compute Service, and Neutron, the OpenStack
Networking Service, has tremendous implications and will have
a huge impact on the architecture and design of the cloud
network infrastructure.</para>
<para>The nova-network service is primarily a layer 2 networking
service which has two main modes in which it will function.
The difference between the two modes in nova-network pertain
to whether or not nova-network uses VLANs. When using
nova-network in a flat network mode, all network hardware
nodes and devices throughout the cloud are connected to a
single layer 2 network segment which provides access to
application data.</para>
<para>When the network devices in the cloud support segmentation
using VLANs, nova-network can operate in the second mode. In
this design model, each tenant within the cloud is assigned a
network subnet which is mapped to a VLAN on the physical
network. It is especially important to remember the maximum
number of 4096 VLANs which can be used within a spanning tree
domain. These limitations place hard limits on the amount of
growth possible within the data center. When designing a
general purpose cloud intended to support multiple tenants, it
is especially recommended to use nova-network with VLANs, and
not in flat network mode.</para>
<para>Another network consideration is that
nova-network is entirely managed by the cloud operator;
tenants do not have control over network resources. If tenants
require the ability to manage and create network resources
such as network segments and subnets, it will be necessary to
install the OpenStack Networking Service to provide network
access to instances.</para>
<para>The OpenStack Networking Service is a first class networking
service that gives full control over creation of virtual
network resources to tenants. This is often accomplished in
the form of tunneling protocols which will establish
encapsulated communication paths over existing network
infrastructure in order to segment tenant traffic. These
methods vary depending on the specific implementation, but
some of the more common methods include tunneling over GRE,
encapsulating with VXLAN, and VLAN tags.</para>
<para>Initially, it is suggested to design at least three network
segments, the first of which will be used for access to the
cloud's REST APIs by tenants and operators. This is generally
referred to as a public network. In most cases, the controller
nodes and swift proxies within the cloud will be the only
devices necessary to connect to this network segment. In some
cases, this network might also be serviced by hardware load
balancers and other network devices.</para>
<para>The next segment is used by cloud administrators to manage
hardware resources and is also used by configuration
management tools when deploying software and services onto new
hardware. In some cases, this network segment might also be
used for internal services, including the message bus and
database services, to communicate with each other. Due to the
sensitive nature of this network segment, it may be
desirable to secure this network from unauthorized access.
This network will likely need to communicate with every
hardware node within the cloud.</para>
<para>The last network segment is used by applications and
consumers to provide access to the physical network and also
for users accessing applications running within the cloud.
This network is generally segregated from the one used to
access the cloud APIs and is not capable of communicating
directly with the hardware resources in the cloud. Compute
resource nodes will need to communicate on this network
segment, as will any network gateway services which allow
application data to access the physical network outside of the
cloud.</para></section>
<section xml:id="designing-storage-resources-tech-considerations"><title>Designing Storage Resources</title>
<para>OpenStack has two independent storage services to consider,
each with its own specific design requirements and goals. In
addition to services which provide storage as their primary
function, there are additional design considerations with
regard to compute and controller nodes which will affect the
overall cloud architecture.</para></section>
<section xml:id="designing-openstack-object-storage-tech-considerations">
<title>Designing OpenStack Object Storage</title>
<para>When designing hardware resources for OpenStack Object
Storage, the primary goal is to maximize the amount of storage
in each resource node while also ensuring that the cost per
terabyte is kept to a minimum. This often involves utilizing
servers which can hold a large number of spinning disks.
Whether choosing to use 2U server form factors with directly
attached storage or an external chassis that holds a larger
number of drives, the main goal is to maximize the storage
available in each node.</para>
<para>It is not recommended to invest in enterprise class drives
for an OpenStack Object Storage cluster. The consistency and
partition tolerance characteristics of OpenStack Object
Storage will ensure that data stays up to date and survives
hardware faults without the use of any specialized data
replication devices.</para>
<para>A great benefit of OpenStack Object Storage is the ability
to mix and match drives by utilizing weighting within the
swift ring. When designing your swift storage cluster, it is
recommended to make use of the most cost effective storage
solution available at the time. Many server chassis on the
market can hold 60 or more drives in 4U of rack space,
therefore it is recommended to maximize the amount of storage
per rack unit at the best cost per terabyte. Furthermore, the
use of RAID controllers is not recommended in an object
storage node.</para>
<para>In order to achieve this durability and availability of data
stored as objects, it is important to design object storage
resource pools in a way that provides the suggested
availability that the service can provide. Beyond designing at
the hardware node level, it is important to consider
rack-level and zone-level designs to accommodate the number of
replicas configured to be stored in the Object Storage service
(the default number of replicas is three). Each replica of
data should exist in its own availability zone with its own
power, cooling, and network resources available to service
that specific zone.</para>
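<para>A simple usable-capacity calculation follows from the replica
count; the zone, node, and disk figures below are illustrative
assumptions.</para>
<programlisting language="python"># Usable-capacity arithmetic for an object storage cluster with
# three replicas; all counts and sizes are illustrative assumptions.
zones = 3
nodes_per_zone = 4
disks_per_node = 60   # dense 4U chassis, as described above
disk_tb = 4
replicas = 3          # the default replica count

raw_tb = zones * nodes_per_zone * disks_per_node * disk_tb
usable_tb = raw_tb / replicas
print(f"{raw_tb} TB raw -> {usable_tb:.0f} TB usable at {replicas} replicas")</programlisting>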
<para>Object storage nodes should be designed so that the number
of requests does not hinder the performance of the cluster.
The object storage service uses a chatty protocol; therefore,
making use of multiple processors that have higher core counts
will ensure the IO requests do not inundate the server.</para></section>
<section xml:id="designing-openstack-block-storage"><title>Designing OpenStack Block Storage</title>
<para>When designing OpenStack Block Storage resource nodes, it is
helpful to understand the workloads and requirements that will
drive the use of block storage in the cloud. In a general
purpose cloud these use patterns are often unknown. It is
recommended to design block storage pools so that tenants can
choose the appropriate storage solution for their
applications. By creating multiple storage pools of different
types, in conjunction with configuring an advanced storage
scheduler for the block storage service, it is possible to
provide tenants with a large catalog of storage services with
a variety of performance levels and redundancy options.</para>
<para>In addition to directly attached storage populated in
servers, block storage can also take advantage of a number of
enterprise storage solutions. These are addressed via a plug-in
driver developed by the hardware vendor. A large number of
enterprise storage plug-in drivers ship out-of-the-box with
OpenStack Block Storage (and many more available via third
party channels). While a general purpose cloud would likely
use directly attached storage in the majority of block storage
nodes, it may also be necessary to provide additional levels
of service to tenants which can only be provided by enterprise
class storage solutions.</para>
<para>The determination to use a RAID controller card in block
storage nodes is impacted primarily by the redundancy and
availability requirements of the application. Applications
which have a higher demand on input-output per second (IOPS)
will influence both the choice to use a RAID controller and
the level of RAID configured on the volume. Where performance
is a consideration, it is suggested to make use of higher
performing RAID volumes. In contrast, where redundancy of
block storage volumes is more important it is recommended to
make use of a redundant RAID configuration such as RAID 5 or
RAID 6. Some specialized features, such as automated
replication of block storage volumes, may require the use of
third-party plug-ins and enterprise block storage solutions in
order to meet the high demand on storage. Furthermore,
where extreme performance is a requirement, it may also be
necessary to make use of high speed SSD disk drives or high
performing flash storage solutions.</para></section>
<section xml:id="software-selection-tech-considerations">
<title>Software Selection</title>
<para>The software selection process can play a large role in the
architecture of a general purpose cloud. Choice of operating
system, selection of OpenStack software components, choice of
hypervisor and selection of supplemental software will have a
large impact on the design of the cloud.</para>
<para>Operating system (OS) selection plays a large role in the
design and architecture of a cloud. There are a number of OSes
which have native support for OpenStack including Ubuntu, Red
Hat Enterprise Linux (RHEL), CentOS, and SUSE Linux Enterprise
Server (SLES). "Native support" in this context means that the
distribution provides distribution-native packages by which to
install OpenStack in their repositories. Note that "native
support" is not a constraint on the choice of OS; users are
free to choose just about any Linux distribution (or even
Microsoft Windows) and install OpenStack directly from source
(or compile their own packages). However, the reality is that
many organizations will prefer to install OpenStack from
distribution-supplied packages or repositories (although using
the distribution vendor's OpenStack packages might be a
requirement for support).</para>
<para>OS selection also directly influences hypervisor selection.
A cloud architect who selects Ubuntu or RHEL has some
flexibility in hypervisor; KVM, Xen, and LXC are supported
virtualization methods available under OpenStack Compute
(Nova) on these Linux distributions. A cloud architect who
selects Hyper-V, on the other hand, is limited to Windows
Server. Similarly, a cloud architect who selects XenServer is
limited to the CentOS-based dom0 operating system provided
with XenServer.</para>
<para>The primary factors that play into OS/hypervisor selection
include:</para>
<itemizedlist>
<listitem>
<para>User requirements: The selection of OS/hypervisor
combination first and foremost needs to support the
user requirements.</para>
</listitem>
<listitem>
<para>Support: The selected OS/hypervisor combination
needs to be supported by OpenStack.</para>
</listitem>
<listitem>
<para>Interoperability: The OS/hypervisor needs to be
interoperable with other features and services in the
OpenStack design in order to meet the user
requirements.</para>
</listitem>
</itemizedlist></section>
<section xml:id="hypervisor-tech-considerations"><title>Hypervisor</title>
<para>OpenStack supports a wide variety of hypervisors, one or
more of which can be used in a single cloud. These hypervisors
include:</para>
<itemizedlist>
<listitem>
<para>KVM (and QEMU)</para>
</listitem>
<listitem>
<para>XCP/XenServer</para>
</listitem>
<listitem>
<para>vSphere (vCenter and ESXi)</para>
</listitem>
<listitem>
<para>Hyper-V</para>
</listitem>
<listitem>
<para>LXC</para>
</listitem>
<listitem>
<para>Docker</para>
</listitem>
<listitem>
<para>Bare-metal</para>
</listitem>
</itemizedlist>
<para>A complete list of supported hypervisors and their
capabilities can be found at
https://wiki.openstack.org/wiki/HypervisorSupportMatrix.</para>
<para>General purpose clouds should make use of hypervisors that
support the most general purpose use cases, such as KVM and
Xen. More specific hypervisors should then be chosen to
account for specific functionality or a supported feature
requirement. In some cases, there may also be a mandated
requirement to run software on a certified hypervisor
including solutions from VMware, Microsoft, and Citrix.</para>
<para>The features offered through the OpenStack cloud platform
determine the best choice of a hypervisor. As an example, for
a general purpose cloud that predominantly supports a
Microsoft-based migration, or is managed by staff that has a
particular skill for managing certain hypervisors and
operating systems, Hyper-V might be the best available choice.
While the decision to use Hyper-V does not limit the ability
to run alternative operating systems, be mindful of those that
are deemed supported. Each hypervisor also has its
own hardware requirements which may affect the decisions
around designing a general purpose cloud. For example,
utilizing the live migration feature of VMware, vMotion,
requires an installation of vCenter/vSphere and the use of the
ESXi hypervisor, which increases the infrastructure
requirements.</para>
<para>In a mixed hypervisor environment, specific aggregates of
compute resources, each with defined capabilities, enable
workloads to utilize software and hardware specific to their
particular requirements. This functionality can be exposed
explicitly to the end user, or accessed through defined
metadata within a particular flavor of an instance.</para></section>
<section xml:id="openstack-components-tech-considerations"><title>OpenStack Components</title>
<para>A general purpose OpenStack cloud design should incorporate
the core OpenStack services to provide a wide range of
services to end-users. The OpenStack core services recommended
in a general purpose cloud are:</para>
<itemizedlist>
<listitem>
<para>OpenStack Compute (Nova)</para>
</listitem>
<listitem>
<para>OpenStack Networking (Neutron)</para>
</listitem>
<listitem>
<para>OpenStack Image Service (Glance)</para>
</listitem>
<listitem>
<para>OpenStack Identity Service (Keystone)</para>
</listitem>
<listitem>
<para>OpenStack Dashboard (Horizon)</para>
</listitem>
<listitem>
<para>OpenStack Telemetry (Ceilometer)</para>
</listitem>
</itemizedlist>
<para>A general purpose cloud may also include OpenStack Object
Storage (Swift). OpenStack Block Storage (Cinder) may be
selected to provide persistent storage to applications and
instances although, depending on the use case, this could be
optional.</para></section>
<section xml:id="supplemental-software-tech-considerations"><title>Supplemental Software</title>
<para>A general purpose OpenStack deployment consists of more than
just OpenStack-specific components. A typical deployment
involves services that provide supporting functionality,
including databases and message queues, and may also involve
software to provide high availability of the OpenStack
environment. Design decisions around the underlying message
queue might affect the required number of controller services,
as well as the technology to provide highly resilient database
functionality, such as MariaDB with Galera. In such a
scenario, replication of services relies on quorum; the
underlying database cluster, for example, should consist of
at least three nodes so that a failed Galera node can rejoin
and recover. When increasing the number of nodes to support a
feature of the software, consideration of rack space and
switch port density becomes important.</para>
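<para>As an illustration of the quorum requirement, a minimal
Galera configuration fragment for a three-node cluster might
look like the following sketch (the file path and addresses
are illustrative):</para>
<programlisting># /etc/mysql/conf.d/galera.cnf
[mysqld]
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_cluster_name="openstack"
wsrep_cluster_address="gcomm://10.0.0.11,10.0.0.12,10.0.0.13"
wsrep_sst_method=rsync</programlisting>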
<para>Where many general purpose deployments use hardware load
balancers to provide highly available API access and SSL
termination, software solutions, for example HAProxy, can also
be considered. It is vital to ensure that such software
implementations are also made highly available. This high
availability can be achieved by using software such as
Keepalived or Pacemaker with Corosync. Pacemaker and Corosync
can provide Active-Active or Active-Passive highly available
configuration depending on the specific service in the
OpenStack environment. Using this software can affect the
design as it assumes at least a 2-node controller
infrastructure where one of those nodes may be running certain
services in standby mode.</para>
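<para>A minimal HAProxy fragment for one API endpoint might
look like the following sketch, where the virtual IP
192.168.1.100 is assumed to be managed by Keepalived or
Pacemaker and all addresses are illustrative:</para>
<programlisting># /etc/haproxy/haproxy.cfg
listen nova-api
    bind 192.168.1.100:8774
    balance roundrobin
    option tcpka
    server controller1 10.0.0.11:8774 check
    server controller2 10.0.0.12:8774 check</programlisting>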
<para>Memcached is a distributed memory object caching system, and
Redis is a key-value store. Both are usually deployed on
general purpose clouds to assist in alleviating load to the
Identity service. The memcached service caches tokens, and due
to its distributed nature it can help alleviate some
bottlenecks to the underlying authentication system. Using
memcached or Redis does not affect the overall design of your
architecture as they tend to be deployed onto the
infrastructure nodes providing the OpenStack services.</para>
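<para>As a sketch, caching Identity tokens in memcached
involves a keystone.conf fragment along the following lines;
exact option and driver names vary between OpenStack
releases, so treat this as illustrative:</para>
<programlisting># /etc/keystone/keystone.conf
[memcache]
servers = 10.0.0.11:11211,10.0.0.12:11211

[token]
# Icehouse-era memcache token backend; verify the driver
# name against your release
driver = keystone.token.backends.memcache.Token</programlisting>
</section>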
<section xml:id="performance-tech-considerations"><title>Performance</title>
<para>Performance of an OpenStack deployment is dependent on a
number of factors related to the infrastructure and controller
services. The user requirements can be split into general
network performance, performance of compute resources, and
performance of storage systems.</para></section>
<section xml:id="controller-infrastructure-tech-considerations">
<title>Controller Infrastructure</title>
<para>The Controller infrastructure nodes provide management
services to the end-user as well as providing services
internally for the operating of the cloud. The Controllers
typically run message queuing services that carry system
messages between each service. Performance issues related to
the message bus lead to delays in delivering messages to
their destinations. The result of this condition would be
delays in operational functions such as spinning up and deleting
instances, provisioning new storage volumes, and managing
network resources. Such delays could adversely affect an
application's ability to react to certain conditions,
especially when using auto-scaling features. It is important
to properly design the hardware used to run the controller
infrastructure as outlined above in the Hardware Selection
section.</para>
<para>Performance of the controller services is not just limited
to processing power, but restrictions may emerge in serving
concurrent users. Load test the APIs and Horizon services
to verify that you are able to serve your customers.
Particular attention should be paid to the OpenStack
Identity Service (Keystone), which provides authentication
and authorization for all services, both internally to
OpenStack itself and to end-users. This service can lead to
a degradation of overall performance if it is not sized
appropriately.</para></section>
<section xml:id="network-performance-tech-considerations"><title>Network Performance</title>
<para>In a general purpose OpenStack cloud, the requirements of
the network help determine its performance capabilities. For
example, small deployments may employ 1 Gigabit Ethernet (GbE)
networking, whereas larger installations serving multiple
departments or many users would be better architected with 10
GbE networking. The performance of the running instances will
be limited by these speeds. It is possible to design OpenStack
environments that run a mix of networking capabilities. By
utilizing the different interface speeds, the users of the
OpenStack environment can choose networks that are fit for
their purpose. For example, web application instances may run
on a public network presented through OpenStack Networking
that has 1 GbE capability, whereas the back-end database uses
an OpenStack Networking network that has 10 GbE capability to
replicate its data or, in some cases, the design may
incorporate link aggregation for greater throughput.</para>
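<para>One way to offer that choice is to expose each fabric
as a separate provider network, as in this illustrative
sketch (the physical network labels and VLAN IDs are
assumptions of the example):</para>
<programlisting>$ neutron net-create web-1g --provider:network_type vlan \
    --provider:physical_network physnet-1g --provider:segmentation_id 100
$ neutron net-create db-10g --provider:network_type vlan \
    --provider:physical_network physnet-10g --provider:segmentation_id 200</programlisting>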
<para>Network performance can be boosted considerably by
implementing hardware load balancers to provide front-end
service to the cloud APIs. The hardware load balancers also
perform SSL termination if that is a requirement of your
environment. When implementing SSL offloading, it is important
to understand the SSL offloading capabilities of the devices
selected.</para></section>
<section xml:id="compute-host-tech-considerations"><title>Compute Host</title>
<para>The choice of hardware specifications used in compute nodes
including CPU, memory and disk type directly affects the
performance of the instances. Other factors which can directly
affect performance include tunable parameters within the
OpenStack services, for example the overcommit ratio applied
to resources. The defaults in OpenStack Compute set a 16:1
over-commit of the CPU and a 1.5:1 over-commit of the memory.
Running at such high ratios leads to an increase in
"noisy-neighbor" activity. Care must be taken when sizing your
Compute environment to avoid this scenario. For running
general purpose OpenStack environments it is possible to keep
to the defaults, but make sure to monitor your environment as
usage increases.</para>
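<para>The ratios in question are set in nova.conf; the values
shown below are the defaults described above:</para>
<programlisting># /etc/nova/nova.conf
cpu_allocation_ratio = 16.0
ram_allocation_ratio = 1.5</programlisting>
</section>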
<section xml:id="storage-performance-tech-considerations"><title>Storage Performance</title>
<para>When considering performance of OpenStack Block Storage,
hardware and architecture choice is important. Block Storage
can use enterprise back-end systems such as NetApp or EMC, use
scale out storage such as GlusterFS and Ceph, or simply use
the capabilities of directly attached storage in the nodes
themselves. Block Storage may be deployed so that traffic
traverses the host network, which could affect, and be
adversely affected by, the front-side API traffic performance.
As such, consider using a dedicated data storage network with
dedicated interfaces on the Controller and Compute
hosts.</para>
<para>When considering performance of OpenStack Object Storage, a
number of design choices will affect performance. A user's
access to the Object Storage is through the proxy services,
which typically sit behind hardware load balancers. By the
very nature of a highly resilient storage system, replication
of the data would affect performance of the overall system. In
this case, 10 GbE (or better) networking is recommended
throughout the storage network architecture.</para></section>
<section xml:id="availability-tech-considerations"><title>Availability</title>
<para>In OpenStack, the infrastructure is integral to providing
services and should always be available, especially when
operating with SLAs. Ensuring network availability is
accomplished by designing the network architecture so that no
single point of failure exists. A consideration of the number
of switches, routers, and redundancy of power should be
factored into core infrastructure, as well as the associated
bonding of networks to provide diverse routes to your highly
available switch infrastructure.</para>
<para>The OpenStack services themselves should be deployed across
multiple servers that do not represent a single point of
failure. Ensuring API availability can be achieved by placing
these services behind highly available load balancers that
have multiple OpenStack servers as members.</para>
<para>OpenStack lends itself to deployment in a highly available
manner where it is expected that at least 2 servers be
utilized. These can run all of the services involved, from the
message queuing service, for example RabbitMQ or Qpid, to an
appropriately deployed database service such as MySQL or
MariaDB. As services in the cloud are scaled out, back-end
services will need to scale too. Monitoring and reporting on
server utilization and response times, as well as load testing
your systems, will help determine scale out decisions.</para>
<para>Care must be taken when deciding network functionality.
Currently, OpenStack supports both the legacy Nova-network
system and the newer, extensible OpenStack Networking. Both
have their pros and cons when it comes to providing highly
available access. Nova-network, which provides networking
access maintained in the OpenStack Compute code, provides a
feature that removes a single point of failure when it comes
to routing, and this feature is currently missing in OpenStack
Networking. Nova network's multi-host functionality
restricts the failure domain for routing to the individual
compute host running the instances.</para>
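<para>As a minimal sketch, enabling the multi-host mode of
nova-network involves setting the following on each compute
node, which then runs its own nova-network and
nova-api-metadata services:</para>
<programlisting># /etc/nova/nova.conf on every compute node
multi_host = True</programlisting>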
<para>On the other hand, when using OpenStack Networking, the
OpenStack controller servers or separate OpenStack Networking
hosts handle routing. For a deployment that requires features
available in only OpenStack Networking, it is possible to
remove this restriction by using third party software that
helps maintain highly available L3 routes. Doing so allows for
common APIs to control network hardware, or to provide complex
multi-tier web applications in a secure manner. It is also
possible to completely remove routing from OpenStack
Networking, and instead rely on hardware routing capabilities.
In this case, the switching infrastructure must support L3
routing.</para>
<para>OpenStack Networking (Neutron) and Nova Network both have
their advantages and disadvantages. They are both valid and
supported options that fit different use cases.</para>
<para>Ensure your deployment has adequate back-up capabilities. As
an example, in a deployment that has two infrastructure
controller nodes, the design should include controller
availability: in the event of the loss of a single
controller, cloud services continue to run from the
remaining controller. Where the design has higher availability
requirements, it is important to meet those requirements by
designing the proper redundancy and availability of controller
nodes.</para>
<para>Application design must also be factored into the
capabilities of the underlying cloud infrastructure. If the
compute hosts do not provide a seamless live migration
capability, then it must be expected that when a compute host
fails, the instances running on it and any data local to them
will be lost. Conversely, when providing an expectation to users
that instances have a high-level of uptime guarantees, the
infrastructure must be deployed in a way that eliminates any
single point of failure when a compute host disappears. This
may include utilizing shared file systems on enterprise
storage or OpenStack Block storage to provide a level of
guarantee to match service features.</para>
<para>For more information on HA in OpenStack, see the OpenStack
High Availability Guide found at
http://docs.openstack.org/high-availability-guide.</para></section>
<section xml:id="security-tech-considerations"><title>Security</title>
<para>A security domain comprises users, applications, servers or
networks that share common trust requirements and expectations
within a system. Typically they have the same authentication
and authorization requirements and users.</para>
<para>These security domains are:</para>
<itemizedlist>
<listitem>
<para>Public</para>
</listitem>
<listitem>
<para>Guest</para>
</listitem>
<listitem>
<para>Management</para>
</listitem>
<listitem>
<para>Data</para>
</listitem>
</itemizedlist>
<para>These security domains can be mapped to an OpenStack
deployment individually, or combined. For example, some
deployment topologies combine both guest and data domains onto
one physical network, whereas in other cases these networks
are physically separated. In each case, the cloud operator
should be aware of the appropriate security concerns. Security
domains should be mapped out against your specific OpenStack
deployment topology. The domains and their trust requirements
depend upon whether the cloud instance is public, private, or
hybrid.</para>
<para>The public security domain is an entirely untrusted area of
the cloud infrastructure. It can refer to the Internet as a
whole or simply to networks over which you have no authority.
This domain should always be considered untrusted.</para>
<para>Typically used for compute instance-to-instance traffic, the
guest security domain handles compute data generated by
instances on the cloud but not services that support the
operation of the cloud, such as API calls. Public cloud
providers and private cloud providers who do not have
stringent controls on instance use or who allow unrestricted
internet access to instances should consider this domain to be
untrusted. Private cloud providers may want to consider this
network as internal and therefore trusted only if they have
controls in place to assert that they trust instances and all
their tenants.</para>
<para>The management security domain is where services interact.
Sometimes referred to as the "control plane", the networks in
this domain transport confidential data such as configuration
parameters, user names, and passwords. In most deployments this
domain is considered trusted.</para>
<para>The data security domain is concerned primarily with
information pertaining to the storage services within
OpenStack. Much of the data that crosses this network has high
integrity and confidentiality requirements and, depending on
the type of deployment, may also have strong availability
requirements. The trust level of this network is heavily
dependent on other deployment decisions.</para>
<para>When deploying OpenStack in an enterprise as a private cloud
it is usually behind the firewall and within the trusted
network alongside existing systems. Users of the cloud are,
traditionally, employees that are bound by the security
requirements set forth by the company. This tends to push most
of the security domains towards a more trusted model. However,
when deploying OpenStack in a public facing role, no
assumptions can be made and the attack vectors significantly
increase. For example, the API endpoints, along with the
software behind them, become vulnerable to bad actors wanting
to gain unauthorized access or prevent access to services,
which could lead to loss of data, functionality, and
reputation. These services must be protected against such
threats through auditing and appropriate filtering.</para>
<para>Consideration must be given to managing the users of the
system for both public and private clouds. The identity
service allows for LDAP to be part of the authentication
process. Including such systems in an OpenStack deployment may
ease user management when integrating with existing
systems.</para>
<para>It is important to understand that user authentication
requests include sensitive information including user names,
passwords and authentication tokens. For this reason, placing
the API services behind hardware that performs SSL termination
is strongly recommended.</para>
<para>For more information on OpenStack Security, see the OpenStack
Security Guide, at
http://docs.openstack.org/security-guide/.</para>
</section>
</section>


@ -0,0 +1,175 @@
<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="user-requirements-general-purpose">
<?dbhtml stop-chunking?>
<title>User Requirements</title>
<para>The general purpose cloud is built following the
Infrastructure-as-a-Service (IaaS) model, as a platform best
suited for use cases with simple requirements. The general
purpose cloud user requirements themselves are typically not
complex. However, it is still important to capture them even
if the project has minimum business and technical requirements
such as a Proof of Concept (PoC) or a small lab
platform.</para>
<para>These user considerations are written from the perspective
of the organization that is building the cloud, not from the
perspective of the end-users who will consume cloud services
provided by this design.</para>
<itemizedlist>
<listitem>
<para>Cost: Financial factors are a primary concern for
any organization. Since general purpose clouds are
considered the baseline from which all other cloud
architecture environments derive, cost will commonly
be an important criterion. This type of cloud, however,
does not always provide the most cost-effective
environment for a specialized application or
situation. Unless razor-thin margins and costs have
been mandated as a critical factor, cost should not be
the sole consideration when choosing or designing a
general purpose architecture.</para>
</listitem>
<listitem>
<para>Time to market: Another common business factor in
building a general purpose cloud is the ability to
deliver a service or product more quickly and
flexibly. In the modern hyper-fast business world,
being able to deliver a product in six months instead
of two years is often a major driving force behind the
decision to build a general purpose cloud. General
purpose clouds allow users to self-provision and gain
access to compute, network, and storage resources
on-demand thus decreasing time to market. It may
potentially make more sense to build a general purpose
PoC as opposed to waiting to finalize the ultimate use
case for the system. The tradeoff with taking this
approach is the risk that the general purpose cloud is
not optimized for the actual final workloads. The
final decision on which approach to take will be
dependent on the specifics of the business objectives
and time frame for the project.</para>
</listitem>
<listitem>
<para>Revenue opportunity: The revenue opportunity for a
given cloud will vary greatly based on the intended
use case of that particular cloud. Some general
purpose clouds are built for commercial customer
facing products, but there are plenty of other reasons
that might make the general purpose cloud the right
choice. A small cloud service provider (CSP) might
want to build a general purpose cloud rather than a
massively scalable cloud because they do not have the
deep financial resources needed, or because they do
not or will not know in advance the purposes for which
their customers are going to use the cloud. For some
users, the advantages the cloud itself offers mean an
enhancement of revenue opportunity. For others, the
fact that a general purpose cloud provides only
baseline functionality will be a disincentive for use,
leading to a potential stagnation of revenue
opportunities.</para>
</listitem>
</itemizedlist>
<section xml:id="legal-requirements-general-purpose"><title>Legal Requirements</title>
<para>Many jurisdictions have legislative and regulatory
requirements governing the storage and management of data in
cloud environments. Common areas of regulation include:</para>
<itemizedlist>
<listitem>
<para>Data retention policies ensuring storage of
persistent data and records management to meet data
archival requirements.</para>
</listitem>
<listitem>
<para>Data ownership policies governing the possession and
responsibility for data.</para>
</listitem>
<listitem>
<para>Data sovereignty policies governing the storage of
data in foreign countries or otherwise separate
jurisdictions.</para>
</listitem>
<listitem>
<para>Data compliance policies governing the requirement
that certain types of information reside in certain
locations due to regulatory issues, and, more
importantly, cannot reside in other locations for the
same reason.</para>
</listitem>
</itemizedlist>
<para>Examples of such legal frameworks include the data
protection framework of the European Union
(http://ec.europa.eu/justice/data-protection/) and the
requirements of the Financial Industry Regulatory Authority
(http://www.finra.org/Industry/Regulation/FINRARules/) in the
United States. Consult a local regulatory body for more
information.</para></section>
<section xml:id="technical-requirements"><title>Technical Requirements</title>
<para>Technical cloud architecture requirements should be weighed
against the business requirements.</para>
<itemizedlist>
<listitem>
<para>Performance: As a baseline product, general purpose
clouds do not provide optimized performance for any
particular function. While a general purpose cloud
should provide enough performance to satisfy average
user considerations, performance is not a primary
driver for the general purpose cloud customer.</para>
</listitem>
<listitem>
<para>No predefined usage model: The lack of a predefined
usage model enables the user to run a wide variety of
applications without having to know the application
requirements in advance. This provides a degree of
independence and flexibility that no other cloud
scenarios are able to provide.</para>
</listitem>
<listitem>
<para>On-demand and self-service application: By
definition, a cloud provides end users with the
ability to self-provision computing power, storage,
networks, and software in a simple and flexible way.
The user must be able to scale their resources up to a
substantial level without disrupting the underlying
host operations. One of the benefits of using a
general purpose cloud architecture is the ability to
start with limited resources and increase them over
time as the user demand grows.</para>
</listitem>
<listitem>
<para>Public cloud: For a company interested in building a
commercial public cloud offering based on OpenStack,
the general purpose architecture model might be the
best choice because the designers are not going to
know the purposes or workloads for which the end users
will use the cloud.</para>
</listitem>
<listitem>
<para>Internal consumption (private) cloud: Organizations
need to determine if it makes the most sense to create
their own clouds internally. The main advantage of a
private cloud is that it allows the organization to
maintain complete control over all the architecture
and the cloud components. One caution is to think
about the possibility that users will want to combine
using the internal cloud with access to an external
cloud. If that case is likely, it might be worth
exploring the possibility of taking a multi-cloud
approach with regard to at least some of the
architectural elements. Designs that incorporate the
use of multiple clouds, such as a private cloud and a
public cloud offering, are described in the
"Multi-Cloud" scenario.</para>
</listitem>
<listitem>
<para>Security: Security should be implemented according
to asset, threat, and vulnerability risk assessment
matrices. For cloud domains that require increased
computer security, network security, or information
security, a general purpose cloud is not considered an
appropriate choice.</para>
</listitem>
</itemizedlist></section>
</section>


@ -0,0 +1,186 @@
<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="arch-guide-architecture-hybrid">
<?dbhtml stop-chunking?>
<title>Architecture</title>
<para>Once business and application requirements have been
defined, the first step for designing a hybrid cloud solution
is to map out the dependencies between the expected workloads
and the diverse cloud infrastructures that need to support
them. By mapping the applications and the targeted cloud
environments, you can architect a solution that enables the
broadest compatibility between cloud platforms and minimizes
the need to create workarounds and processes to fill
identified gaps. Be sure to evaluate the monitoring and
orchestration APIs available on each cloud platform and the
relative levels of support for them in the chosen Cloud
Management Platform (CMP).</para>
<mediaobject>
<imageobject>
<imagedata
fileref="../images/Multi-Cloud_Priv-AWS4.png"
/>
</imageobject>
</mediaobject>
<section xml:id="image-portability"><title>Image portability</title>
<para>The majority of cloud workloads currently run on instances
using hypervisor technologies such as KVM, Xen, or ESXi. The
challenge is that each of these hypervisors uses an image
format that is only partly compatible, or entirely
incompatible, with the others. In a private or hybrid cloud solution, this can be
mitigated by standardizing on the same hypervisor and instance
image format but this is not always feasible. This is
particularly evident if one of the clouds in the architecture
is a public cloud that is outside of the control of the
designers.</para>
<para>There are conversion tools such as virt-v2v
(http://libguestfs.org/virt-v2v/) and virt-edit
(http://libguestfs.org/virt-edit.1.html) that can be used in
those scenarios but they are often not suitable beyond very
basic cloud instance specifications. An alternative is to
build a thin operating system image as the base for new
instances. This facilitates rapid creation of cloud instances
using cloud orchestration or configuration management tools,
driven by the CMP, for more specific templating. Another more
expensive option is to use a commercial image migration tool.
The issue of image portability is not limited to a one-time
migration. If the intention is to use multiple clouds for
disaster recovery, application diversity, or high availability,
the images and instances are likely to be moved between the
different cloud platforms regularly.</para>
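<para>A minimal sketch of such a conversion, using qemu-img (a
common companion to the tools named above; the image names
are illustrative), followed by an upload to the Image
Service:</para>
<programlisting>$ qemu-img convert -f vmdk -O qcow2 web-server.vmdk web-server.qcow2
$ glance image-create --name web-server --disk-format qcow2 \
    --container-format bare --file web-server.qcow2</programlisting>
</section>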
<section xml:id="upper-layer-services"><title>Upper-Layer Services</title>
<para>Many clouds offer complementary services over and above the
basic compute, network, and storage components. These
additional services are often used to simplify the deployment
and management of applications on a cloud platform.</para>
<para>Consideration must be given to moving workloads that
have upper-layer service dependencies on the source
cloud platform to a destination cloud platform that may not
have a comparable service, or that implements it in a
different way or with a different technology. For
example, moving an application that uses a NoSQL database
service such as MongoDB that is delivered as a service on the
source cloud, to a destination cloud that does not offer that
service or may only use a relational database such as MySQL,
could cause difficulties in maintaining the application
between the platforms.</para>
<para>There are a number of options that might be appropriate for
the hybrid cloud use case:</para>
<itemizedlist>
<listitem>
<para>Create a baseline of upper-layer services that are
implemented across all of the cloud platforms. For
platforms that do not support a given service, create
a service on top of that platform and apply it to the
workloads as they are launched on that cloud. For
example, OpenStack, via Trove, supports MySQL as a
service but not NoSQL databases in production. Moving
a NoSQL workload from AWS, or running it alongside
AWS, would require recreating the NoSQL database on
top of OpenStack and automating its deployment using
a tool such as OpenStack Orchestration (Heat), as
shown in the sketch after this list.</para>
</listitem>
<listitem>
<para>Deploy a Platform as a Service (PaaS) technology
such as Cloud Foundry or OpenShift that abstracts the
upper-layer services from the underlying cloud
platform. The unit of application deployment and
migration is the PaaS and leverages the services of
the PaaS and only consumes the base infrastructure
services of the cloud platform. The downside to this
approach is that the PaaS itself then potentially
becomes a source of lock-in.</para>
</listitem>
<listitem>
<para>Use only the base infrastructure services that are
common across all cloud platforms. Use automation
tools to create the required upper-layer services
which are portable across all cloud platforms. For
example, instead of using any database services that
are inherent in the cloud platforms, launch cloud
instances and deploy the databases on to those
instances using scripts or various configuration and
application deployment tools.</para>
</listitem>
</itemizedlist>
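<para>As a sketch of the first option above, a minimal
OpenStack Orchestration template can launch an instance and
install the missing service itself; the image and flavor
names and the install script are illustrative
assumptions:</para>
<programlisting>heat_template_version: 2013-05-23
description: Illustrative self-managed NoSQL database server
resources:
  nosql_server:
    type: OS::Nova::Server
    properties:
      image: ubuntu-14.04
      flavor: m1.medium
      user_data: |
        #!/bin/bash
        # Install the database that the platform does not
        # provide as a service
        apt-get update
        apt-get install -y mongodb-server</programlisting>
</section>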
<section xml:id="network-services"><title>Network Services</title>
<para>Network services functionality is a significant barrier for
multiple cloud architectures. It could be an important factor
to assess when choosing a CMP and cloud provider.
Considerations include functionality, security, scalability, and
high availability (HA). Verification and ongoing testing of
the critical features of the cloud endpoint used by the
architecture are important tasks.</para>
<itemizedlist>
<listitem>
<para>Once the network functionality framework has been
decided, a minimum functionality test should be
designed to confirm that the functionality is in fact
compatible. This will ensure testing and functionality
persists during and after upgrades. Note that over
time, the diverse cloud platforms are likely to
de-synchronize if care is not taken to maintain
compatibility. This is a particular issue with
APIs.</para>
</listitem>
<listitem>
<para>Scalability across multiple cloud providers may
dictate which underlying network framework is chosen
for the different cloud providers. It is important to
have the network API functions presented and to verify
that the desired functionality persists across all
chosen cloud endpoints.</para>
</listitem>
<listitem>
<para>High availability (HA) implementations vary in
functionality and design. Examples of some common
methods are Active-Hot-Standby, Active-Passive and
Active-Active. High availability and a test framework
need to be developed to insure that the functionality
and limitations are well understood.</para>
</listitem>
<listitem>
<para>Security considerations, such as how data is secured
between the client and the endpoint, and how traffic
that traverses the multiple clouds is protected against
threats ranging from eavesdropping to DoS activities,
must be addressed. Business and regulatory requirements
dictate the security approach that needs to be taken.</para>
</listitem>
</itemizedlist></section>
<section xml:id="data"><title>Data</title>
<para>Replication has been the traditional method for protecting
object store implementations. A variety of different
implementations have existed in storage architectures.
Examples of this are both synchronous and asynchronous
mirroring. Most object stores and back-end storage systems have
a method for replication that can be implemented at the
storage subsystem layer. Object stores also have implemented
replication techniques that can be tailored to fit a cloud's
needs. An organization must find the right balance between
data integrity and data availability. Replication strategy may
also influence the disaster recovery methods
implemented.</para>
<para>Replication across different racks, data centers and
geographical regions has led to the increased focus of
determining and ensuring data locality. The ability to
guarantee data is accessed from the nearest or fastest storage
can be necessary for applications to perform well. An example
of this is Hadoop running in a cloud. The user either runs with
a native HDFS, when applicable, or on a separate parallel file
system such as those provided by Hitachi and IBM. Special
consideration should be taken when running embedded object
store methods to not cause extra data replication, which can
create unnecessary performance issues. Another example of
ensuring data locality is by using Ceph. Ceph has a data
container abstraction called a pool. Pools can be created with
replicas or erasure code. Replica based pools can also have a
rule set defined to have data written to a “local” set of
hardware which would be the primary access and modification
point.</para>
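<para>As a minimal sketch of that approach (the pool name,
placement group counts, and rule number are illustrative,
and the CRUSH rule favoring local hardware must already be
defined):</para>
<programlisting># Create a replicated pool with three copies of each object
$ ceph osd pool create volumes 128 128 replicated
$ ceph osd pool set volumes size 3
# Point the pool at a CRUSH rule that keeps data local
$ ceph osd pool set volumes crush_ruleset 1</programlisting>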
</section>
</section>


@ -0,0 +1,68 @@
<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="arch-guide-intro-hybrid">
<title>Introduction</title>
<para>Hybrid cloud, by definition, means that the design spans
more than one cloud. An example of this kind of architecture
may include a situation in which the design involves more than
one OpenStack cloud (for example, an OpenStack-based private
cloud and an OpenStack-based public cloud), or it may be a
situation incorporating an OpenStack cloud and a non-OpenStack
cloud (for example, an OpenStack-based private cloud that
interacts with Amazon Web Services). Bursting into an external
cloud is the practice of creating new instances to alleviate
extra load where there is no available capacity in the private
cloud.</para>
<para>Some situations that could involve hybrid cloud architecture
include:</para>
<itemizedlist>
<listitem>
<para>Bursting from a private cloud to a public
cloud</para>
</listitem>
<listitem>
<para>Disaster recovery</para>
</listitem>
<listitem>
<para>Development and testing</para>
</listitem>
<listitem>
<para>Federated cloud, enabling users to choose resources
from multiple providers</para>
</listitem>
<listitem>
<para>Hybrid clouds built to support legacy systems as
they transition to cloud</para>
</listitem>
</itemizedlist>
<para>As a hybrid cloud design deals with systems that are outside
of the control of the cloud architect or organization, a
hybrid cloud architecture requires considering aspects of the
architecture that might not have otherwise been necessary. For
example, the design may need to deal with hardware, software,
and APIs under the control of a separate organization.</para>
<para>Similarly, the degree to which the architecture is
OpenStack-based will have an effect on the cloud operator or
cloud consumer's ability to accomplish tasks with native
OpenStack tools. By definition, this is a situation in which
no single cloud can provide all of the necessary
functionality. In order to manage the entire system, users,
operators and consumers will need an overarching tool known as
a cloud management platform (CMP). Any organization that is
working with multiple clouds already has a CMP, even if that
CMP is the operator who logs into an external web portal and
launches a public cloud instance.</para>
<para>There are commercially available options, such as
Rightscale, and open source options, such as ManageIQ
(http://manageiq.org/), but there is no single CMP that can
address all needs in all scenarios. Whereas most of the
sections of this book talk about the aspects of OpenStack an
architect needs to consider when designing an OpenStack
architecture, this section also discusses the things the
architect must address when choosing or building a CMP to run
a hybrid cloud design, even if the CMP will be a manually
built solution.</para>
</section>


@ -0,0 +1,99 @@
<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="arch-guide-hybrid-operational-considerations">
<?dbhtml stop-chunking?>
<title>Operational Considerations</title>
<para>Hybrid cloud deployments present complex operational
challenges. There are several factors to consider that affect
the way each cloud is deployed and how users and operators
will interact with each cloud. Not every cloud provider
implements infrastructure components the same way, which may
lead to incompatible interactions with workloads or a specific
Cloud Management Platform (CMP). Different cloud providers may
also offer different levels of integration with competing
cloud offerings.</para>
<para>When selecting a CMP, one of the most important aspects to
consider is monitoring. Gaining valuable insight into each
cloud is critical to gaining a holistic view of all involved
clouds. When choosing an existing CMP, it is vital to
determine whether it supports monitoring of all the clouds
involved, or whether compatible APIs are available that can
be queried for the necessary information. Once all the information
about each cloud can be gathered and stored in a searchable
database, proper actions can be taken on that data offline so
workloads will not be impacted.</para>
<section xml:id="agility"><title>Agility</title>
<para>Implementing a hybrid cloud solution can provide application
availability across disparate cloud environments and
technologies. This availability enables the deployment to
survive a complete disaster in any single cloud environment.
Each cloud should provide the means to quickly spin up new
instances in the case of capacity issues or complete
unavailability of a single cloud installation.</para></section>
<section xml:id="application-readiness-hybrid"><title>Application Readiness</title>
<para>It is important to understand the type of application
workloads that will be deployed across the hybrid cloud
environment. Enterprise workloads that depend on the
underlying infrastructure for availability are not designed to
run on OpenStack. Although these types of applications can run
on an OpenStack cloud, if the application is not able to
tolerate infrastructure failures, it is likely to require
significant operator intervention to recover. Cloud workloads,
however, are designed with fault tolerance in mind and the SLA
of the application is not tied to the underlying
infrastructure. Ideally, cloud applications will be designed
to recover when entire racks and even data centers full of
infrastructure experience an outage.</para></section>
<section xml:id="upgrades"><title>Upgrades</title>
<para>OpenStack is a complex and constantly evolving collection of
software. Upgrades may be performed to one or more of the
cloud environments involved. If a public cloud is involved in
the deployment, predicting upgrades may not be possible. Be
sure to examine the advertised SLA for any public cloud
provider being used. Note that at massive scale, even when
dealing with a cloud that offers an SLA with a high percentage
of uptime, workloads must be able to recover at short
notice.</para>
<para>Similarly, when upgrading private cloud deployments, care
must be taken to minimize disruption by making incremental
changes and providing a facility to either roll back or
continue to roll forward when using a continuous delivery
model.</para>
<para>Another consideration is upgrades to the CMP which may need
to be completed in coordination with any of the hybrid cloud
upgrades. This may be necessary whenever API changes are made
in one of the cloud solutions in use to support the new
functionality.</para></section>
<section xml:id="network-operation-center-noc"><title>Network Operation Center (NOC)</title>
<para>When planning the Network Operation Center for a hybrid
cloud environment, it is important to recognize where control
over each piece of infrastructure resides. If a significant
portion of the cloud is on externally managed systems, be
prepared for situations in which it may not be possible to
make changes at all or at the most convenient time.
Additionally, situations of conflict may arise in which
multiple providers have differing points of view on the way
infrastructure must be managed and exposed. This can lead to
delays in root cause analysis, with each provider insisting
the blame lies with the other.</para>
<para>It is important to ensure that the structure put in place
enables connection of the networking of both clouds to form an
integrated system, keeping in mind the state of handoffs.
These handoffs must both be as reliable as possible and
include as little latency as possible to ensure the best
performance of the overall system.</para></section>
<section xml:id="maintainability"><title>Maintainability</title>
<para>Operating hybrid clouds is a situation in which there is a
greater reliance on third party systems and processes. As a
result of a lack of control of various pieces of a hybrid
cloud environment, it is not necessarily possible to guarantee
proper maintenance of the overall system. Instead, the user
must be prepared to abandon workloads and spin them up again
in an improved state. Having a hybrid cloud deployment does,
however, provide agility for these situations by allowing the
migration of workloads to alternative clouds in response to
cloud-specific issues.</para></section>
</section>


@ -0,0 +1,175 @@
<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="prescriptive-examples-multi-cloud">
<?dbhtml stop-chunking?>
<title>Prescriptive Examples</title>
<para>Multi-cloud environments are typically created to facilitate
these use cases:</para>
<itemizedlist>
<listitem>
<para>Bursting workloads from private to public OpenStack
clouds</para>
</listitem>
<listitem>
<para>Bursting workloads from private to public
non-OpenStack clouds</para>
</listitem>
<listitem>
<para>High Availability across clouds (for technical
diversity)</para>
</listitem>
</itemizedlist>
<para>Examples of environments that address each of these use
cases will be discussed in this chapter.</para>
<para>Company A's data center is running dangerously low on
capacity. The option of expanding the data center will not be
possible in the foreseeable future. In order to accommodate
the continuously growing need for development resources in the
organization, the decision was made to make use of resources
in the public cloud.</para>
<para>The company has an internal cloud management platform that
will direct requests to the appropriate cloud, depending on
the currently local capacity.</para>
<para>This is a custom in-house application that has been written
for this specific purpose.</para>
<para>An example of such a solution is described in the figure
below.</para>
<mediaobject>
<imageobject>
<imagedata
fileref="../images/Multi-Cloud_Priv-Pub3.png"
/>
</imageobject>
</mediaobject>
<para>This example shows two clouds, with a Cloud Management
Platform (CMP) connecting them. This guide does not attempt to
cover a specific CMP, but describes how workloads are
typically orchestrated using the Orchestration and Telemetry
services as shown in the diagram above. It is also possible to
connect directly to the other OpenStack APIs with a
CMP.</para>
<para>The private cloud is an OpenStack cloud with one or more
controllers and one or more compute nodes. It includes
metering provided by OpenStack Telemetry. As load increases,
Telemetry captures this and the information is in turn
processed by the CMP. As long as capacity is available, the
CMP uses the OpenStack API to call the Orchestration service
to create instances on the private cloud in response to user
requests. When capacity is not available on the private cloud,
the CMP issues a request to the Orchestration service API of
the public cloud to create the instance on the public
cloud.</para>
<para>In this example, the whole deployment was not directed to an
external public cloud because of the company's concerns over
a lack of resource control, security, and increased
operational expense.</para>
<para>In addition, Company A has already established a data center
with a substantial amount of hardware, and migrating all the
workloads out to a public cloud was not feasible.</para>
<section xml:id="bursting-to-public-nonopenstack-cloud"><title>Bursting to a Public non-OpenStack Cloud</title>
<para>Another common scenario is bursting workloads from the
private cloud into a non-OpenStack public cloud such as Amazon
Web Services (AWS) to take advantage of additional capacity
and scale applications as needed.</para>
<para>For an OpenStack-to-AWS hybrid cloud, the architecture looks
similar to the figure below:</para>
<mediaobject>
<imageobject>
<imagedata
fileref="../images/Multi-Cloud_Priv-AWS4.png"
/>
</imageobject>
</mediaobject>
<para>In this scenario Company A has an additional requirement in
that the developers were already using AWS for some of their
work and did not want to change the cloud provider, primarily
due to the excessive overhead of creating the necessary
network firewall rules and corporate financial procedures
that required entering into an agreement with a new
provider.</para>
<para>As long as the CMP is capable of connecting an external
cloud provider with the appropriate API, the workflow process
will remain the same as the previous scenario. The actions the
CMP takes such as monitoring load, creating new instances, and
so forth are the same, but they would be performed in the
public cloud using the appropriate API calls. For example, if
the public cloud is Amazon Web Services, the CMP would use the
EC2 API to create a new instance and assign an Elastic IP.
That IP can then be added to HAProxy in the private cloud,
just as it was before. The CMP can also reference AWS-specific
tools such as CloudWatch and CloudFormation.</para>
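<para>A minimal sketch of those calls using the AWS command
line tools (all identifiers and addresses are
illustrative):</para>
<programlisting>$ aws ec2 run-instances --image-id ami-1a2b3c4d \
    --instance-type m3.medium --count 1
$ aws ec2 allocate-address
$ aws ec2 associate-address --instance-id i-0123abcd \
    --public-ip 203.0.113.10</programlisting>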
<para>Several open source tool kits for building CMPs are now
available that can handle this kind of translation, including
ManageIQ, jClouds, and JumpGate.</para></section>
<section xml:id="high-availability-disaster-recovery"><title>High Availability/Disaster Recovery</title>
<para>Company A has a requirement to be able to recover from
catastrophic failure in their local data center. Some of the
workloads currently in use are running on their private
OpenStack cloud. Protecting the data involves block storage,
object storage, and a database. The architecture is designed
to support the failure of large components of the system
while ensuring that the system continues to deliver services.
While the services remain available to users, the failed
components are restored in the background based on standard
best practice DR policies. To achieve the objectives, data is
replicated to a second cloud, in a geographically distant
location. The logical diagram of the system is described in
the figure below:</para>
<mediaobject>
<imageobject>
<imagedata
fileref="../images/Multi-Cloud_failover2.png"
/>
</imageobject>
</mediaobject>
<para>This example includes two private OpenStack clouds connected
with a Cloud Management Platform (CMP). The source cloud,
OpenStack Cloud 1, includes a controller and at least one
instance running MySQL. It also includes at least one block
storage volume and one object storage volume so that the data
is available to the users at all times. The details of the
method for protecting each of these sources of data
differ.</para>
<para>The object storage relies on the replication capabilities of
the object storage provider. OpenStack Object Storage is
enabled so that it creates geographically separated replicas
that take advantage of this feature. It is configured so that
at least one replica exists in each cloud. In order to make
this work, a single array spanning both clouds is configured
with OpenStack Identity using federated identity, which talks
to both clouds and communicates with OpenStack Object Storage
through the Swift proxy.</para>
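<para>A minimal sketch of building such a two-region object
ring with one replica per cloud (device names, addresses,
and weights are illustrative):</para>
<programlisting># 2^10 partitions, 2 replicas, 24-hour minimum between moves
$ swift-ring-builder object.builder create 10 2 24
$ swift-ring-builder object.builder add r1z1-10.0.0.51:6000/sdb 100
$ swift-ring-builder object.builder add r2z1-172.16.0.51:6000/sdb 100
$ swift-ring-builder object.builder rebalance</programlisting>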
<para>For block storage, the replication is a little more
difficult, and involves tools outside of OpenStack itself. The
OpenStack Block Storage volume is not set as the drive itself
but as a logical object that points to a physical back end. The
disaster recovery is configured for Block Storage for
synchronous backup for the highest level of data protection,
but asynchronous backup could have been set as an alternative
that is not as latency sensitive. For asynchronous backup, the
Cinder API makes it possible to export the data and also the
metadata of a particular volume, so that it can be moved and
replicated elsewhere. More information can be found here:
https://blueprints.launchpad.net/cinder/+spec/cinder-backup-volume-metadata-support.</para>
<para>The synchronous backups create an identical volume in both
clouds and choose the appropriate flavor so that each cloud
has an identical back end. This was done by creating volumes
through the CMP, because the CMP knows to create identical
volumes in both clouds. Once this is configured, a solution
involving DRBD is used to synchronize the actual physical
drives.</para>
<para>The database component is backed up using synchronous
backups. MySQL does not support geographically diverse
replication, so disaster recovery is provided by replicating
the file itself. As it is not possible to use object storage
as the back end of a database like MySQL, Swift replication
was not an option. It was decided not to store the data on
another geo-tiered storage system, such as Ceph, as block
storage, although this would have given another layer of
protection. Another option would have been to store the
database on an OpenStack Block Storage volume and back it up
just as any other block storage.</para></section>
</section>


@ -0,0 +1,325 @@
<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="technical-considerations-hybrid">
<?dbhtml stop-chunking?>
<title>Technical Considerations</title>
<para>A hybrid cloud environment requires inspection and
understanding of technical issues that are not only outside of
an organization's data center, but potentially outside of an
organization's control. In many cases, it is necessary to
ensure that the architecture and CMP chosen can adapt not
only to different environments, but also to the possibility of
change. In this situation, applications are crossing diverse
platforms and are likely to be located in diverse locations.
All of these factors will influence and add complexity to the
design of a hybrid cloud architecture.</para>
<para>The only situation where cloud platform incompatibilities
are not going to be an issue is when working with clouds that
are based on the same version and the same distribution of
OpenStack. Otherwise incompatibilities are virtually
inevitable.</para>
<para>Incompatibility should be less of an issue for clouds that
exclusively use the same version of OpenStack, even if they
use different distributions. The newer the distribution in
question, the less likely it is that there will be
incompatibilities between versions. This is due to the fact
that the OpenStack community has established an initiative to
define core functions that need to remain backward compatible
between supported versions. The DefCore initiative defines
basic functions that every distribution must support in order
to bear the name "OpenStack".</para>
<para>Some vendors, however, add proprietary customizations to
their distributions. If an application or architecture makes
use of these features, it will be difficult to migrate to or
use other types of environments. Anyone incorporating
versions of OpenStack older than Havana should think
carefully before attempting to span functionality between
versions. Internal differences in older
versions may be so great that the best approach might be to
consider the versions to be essentially diverse platforms, as
different as OpenStack and Amazon Web Services or Microsoft
Azure.</para>
<para>The situation is more predictable if the use of different
cloud platforms is incorporated from inception. If the other clouds
are not based on OpenStack, then all pretense of compatibility
vanishes, and CMP tools must account for the myriad of
differences in the way operations are handled and services are
implemented. Some situations in which these incompatibilities
can arise include differences between the way in which a
cloud:</para>
<itemizedlist>
<listitem>
<para>Deploys instances</para>
</listitem>
<listitem>
<para>Manages networks</para>
</listitem>
<listitem>
<para>Treats applications</para>
</listitem>
<listitem>
<para>Implements services</para>
</listitem>
</itemizedlist>
<section xml:id="capacity-planning-hybrid"><title>Capacity planning</title>
<para>One of the primary reasons many organizations turn to a
hybrid cloud system is to increase capacity without having to
make large capital investments. However, capacity planning is
still necessary when designing an OpenStack installation even
if it is augmented with external clouds.</para>
<para>Specifically, overall capacity and placement of workloads
need to be accounted for when designing for a mostly
internally-operated cloud with the occasional capacity burst.
The long-term capacity plan for such a design needs to
incorporate growth over time to prevent the need to
permanently burst into, and occupy, a potentially more
expensive external cloud. In order to avoid this scenario,
account for the future applications and capacity requirements
and plan growth appropriately.</para>
<para>One of the drawbacks of capacity planning is
unpredictability. It is difficult to predict the amount of
load a particular application might incur if the number of
users fluctuates or the application experiences an unexpected
increase in popularity. It is possible to define application
requirements in terms of vCPU, RAM, bandwidth or other
resources and plan appropriately, but other clouds may not use
the same metric or even the same oversubscription
rates.</para>
<para>Oversubscription is a method to emulate more capacity than
may physically be present. For example, a physical
hypervisor node with 32 gigabytes of RAM may host 24
instances, each provisioned with 2 gigabytes of RAM; that is
48 gigabytes provisioned against 32 gigabytes of physical
memory, a 1.5:1 memory oversubscription. As long as all 24
of them are not concurrently utilizing 2 full
gigabytes, this arrangement is a non-issue. However, some
hosts take oversubscription to extremes and, as a result,
performance can frequently be inconsistent. If at all
possible, determine what the oversubscription rates of each
host are and plan capacity accordingly.</para></section>
<section xml:id="security-hybrid"><title>Security</title>
<para>The nature of a hybrid cloud environment removes complete
control over the infrastructure. Security becomes a stronger
requirement because data or applications may exist in a cloud
that is outside of an organization's control. Security domains
become an important distinction when planning for a hybrid
cloud environment and its capabilities. A security domain
comprises users, applications, servers or networks that share
common trust requirements and expectations within a
system.</para>
<para>The security domains are:</para>
<orderedlist>
<listitem>
<para>Public</para>
</listitem>
<listitem>
<para>Guest</para>
</listitem>
<listitem>
<para>Management</para>
</listitem>
<listitem>
<para>Data</para>
</listitem>
</orderedlist>
<para>These security domains can be mapped individually to the
organization's installation or combined. For example, some
deployment topologies combine both guest and data domains onto
one physical network, whereas other topologies may physically
separate these networks. In each case, the cloud operator
should be aware of the appropriate security concerns. Security
domains should be mapped out against the specific OpenStack
deployment topology. The domains and their trust requirements
depend upon whether the cloud instance is public, private, or
hybrid.</para>
<para>The public security domain is an entirely untrusted area of
the cloud infrastructure. It can refer to the Internet as a
whole or simply to networks over which an organization has no
authority. This domain should always be considered untrusted.
When considering hybrid cloud deployments, any traffic
traversing beyond and between the multiple clouds should
always be considered to reside in this security domain and is
therefore untrusted.</para>
<para>Typically used for instance-to-instance traffic within a
single data center, the guest security domain handles compute
data generated by instances on the cloud but not services that
support the operation of the cloud such as API calls. Public
cloud providers that are used in a hybrid cloud configuration
which an organization does not control and private cloud
providers who do not have stringent controls on instance use
or who allow unrestricted internet access to instances should
consider this domain to be untrusted. Private cloud providers
may consider this network as internal and therefore trusted
only if there are controls in place to assert that instances
and tenants are trusted.</para>
        <para>The management security domain is where services interact.
            Sometimes referred to as the "control plane", the networks in
            this domain transport confidential data such as configuration
            parameters, user names, and passwords. In deployments behind an
            organization's firewall, this domain is considered trusted.
            When a public cloud is part of the architecture, the controls
            that the public cloud provider has in place must be assessed
            before this domain can be treated as trusted.</para>
<para>The data security domain is concerned primarily with
information pertaining to the storage services within
OpenStack. Much of the data that crosses this network has high
integrity and confidentiality requirements and depending on
the type of deployment there may also be strong availability
requirements. The trust level of this network is heavily
dependent on deployment decisions and as such this is not
assigned a default level of trust.</para>
        <para>Consideration must be given to managing the users of the
            system, whether operating or utilizing public or private
            clouds. The Identity service allows LDAP to be part of the
            authentication process. Including such systems in an OpenStack
            deployment may ease user management when integrating with
            existing systems. When utilizing third-party clouds, explore
            the authentication options applicable to the installation to
            help keep user authentication consistent.</para>
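        <para>As a minimal sketch only, the following keystone.conf
            fragment shows how the Identity service can be pointed at an
            LDAP back end. Option names assume an Icehouse-era release, and
            all values are placeholders for site-specific settings:</para>
        <programlisting># /etc/keystone/keystone.conf (illustrative sketch only)
[identity]
driver = keystone.identity.backends.ldap.Identity

[ldap]
url = ldap://ldap.example.com
suffix = dc=example,dc=com
user_tree_dn = ou=Users,dc=example,dc=com
user_objectclass = inetOrgPerson</programlisting>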
<para>Due to the process of passing user names, passwords, and
generated tokens between client machines and API endpoints,
placing API services behind hardware that performs SSL
termination is strongly recommended.</para>
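        <para>The exact approach varies by deployment; the following
            HAProxy fragment is one minimal sketch of SSL termination in
            front of an Identity API endpoint. It assumes HAProxy 1.5 or
            later, and the host names, addresses, and certificate path are
            placeholders:</para>
        <programlisting># /etc/haproxy/haproxy.cfg (illustrative sketch only)
frontend keystone_api
    bind 0.0.0.0:5000 ssl crt /etc/haproxy/certs/api.example.com.pem
    mode http
    default_backend keystone_api_nodes

backend keystone_api_nodes
    mode http
    server keystone01 192.0.2.10:5000 check</programlisting>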
        <para>Within the cloud itself, another component that needs
            security scrutiny is the hypervisor. In a public cloud,
            organizations typically do not have control over the choice of
            hypervisor. (Amazon uses its own particular version of Xen,
            for example.) In some cases, hypervisors may be vulnerable to
            a type of attack called "hypervisor breakout" if they are not
            properly secured. Hypervisor breakout describes the event of a
            compromised or malicious instance breaking out of the resource
            controls of the hypervisor and gaining access to the bare
            metal operating system and hardware resources.</para>
<para>If the security of instances is not considered important,
there may not be an issue. In most cases, however, enterprises
need to avoid this kind of vulnerability, and the only way to
do that is to avoid a situation in which the instances are
running on a public cloud. That does not mean that there is a
need to own all of the infrastructure on which an OpenStack
installation operates; it suggests avoiding situations in
which hardware may be shared with others.</para>
        <para>There are other options worth considering, such as services
            that provide bare metal instances instead of a shared cloud.
            It is also possible to replicate a second private cloud by
            integrating with a Private Cloud as a Service deployment, in
            which an organization does not buy hardware but also does not
            share it with other tenants, or to use a provider that hosts a
            bare-metal "public" cloud instance for which the hardware is
            dedicated to a single customer.</para>
<para>Finally, it is important to realize that each cloud
implements services differently. What keeps data secure in one
cloud may not do the same in another. Be sure to know the
security requirements of every cloud that handles the
organization's data or workloads.</para>
<para>More information on OpenStack Security can be found at
http://docs.openstack.org/security-guide/</para></section>
<section xml:id="utilization-hybrid"><title>Utilization</title>
        <para>When it comes to utilization, it is important that the CMP
            understands what workloads are running, where they are
            running, and their preferred utilization levels. For example,
            in most cases it is desirable to run as many workloads
            internally as possible, utilizing other resources only when
            necessary. On the other hand, situations exist in which the
            opposite is true, such as when the internal cloud is dedicated
            to development and stressing it is undesirable. In most cases,
            a cost model of various scenarios helps with this decision;
            however, this analysis is heavily influenced by internal
            priorities. The important thing is the ability to make those
            decisions efficiently and on a programmatic basis.</para>
        <para>The OpenStack Telemetry (Ceilometer) project is designed to
            provide information on the usage of various OpenStack
            components. There are two limitations to consider: first, if a
            large amount of data will be collected (for example, when
            monitoring a large or very active cloud), it is desirable to
            use a NoSQL back end for Ceilometer, such as MongoDB. Second,
            when connecting to a non-OpenStack cloud, a way is needed to
            monitor that usage and provide the monitoring data back to the
            CMP.</para>
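        <para>As a minimal sketch only, switching Ceilometer to a MongoDB
            back end is a configuration change along the following lines;
            the host name and credentials are placeholders, and option
            names assume an Icehouse-era release:</para>
        <programlisting># /etc/ceilometer/ceilometer.conf (illustrative sketch only)
[database]
connection = mongodb://ceilometer:CEILOMETER_DBPASS@mongo.example.com:27017/ceilometer</programlisting></section>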
<section xml:id="performace-hybrid"><title>Performance</title>
<para>Performance is of primary importance in the design of a
cloud. When it comes to a hybrid cloud deployment, many of the
same issues for multi-site deployments apply, such as network
latency between sites. It is also important to think about the
speed at which a workload can be spun up in another cloud, and
what can be done to reduce the time necessary to accomplish
that task. That may mean moving data closer to applications,
or conversely, applications closer to the data they process.
It may mean grouping functionality so that connections that
require low latency take place over a single cloud rather than
spanning clouds. That may also mean ensuring that the CMP has
the intelligence to know which cloud can most efficiently run
which types of workloads.</para>
        <para>As with utilization, native OpenStack tools are available to
            assist. Ceilometer can measure performance and, if necessary,
            OpenStack Orchestration via the Heat project can be used to
            react to changes in demand by spinning up more resources. It
            is important to note, however, that Orchestration requires
            special client configuration to work with the solution
            offerings from Amazon Web Services. When dealing with other
            types of clouds, it is necessary to rely on the features of
            the CMP.</para>
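        <para>As an illustrative sketch of the kind of template involved
            (the image and flavor names are placeholders, and scaling
            policies and alarms are omitted for brevity), a minimal Heat
            autoscaling group looks like the following:</para>
        <programlisting># Minimal HOT sketch only; not a complete, tuned template.
heat_template_version: 2013-05-23
resources:
  web_group:
    type: OS::Heat::AutoScalingGroup
    properties:
      min_size: 1
      max_size: 4
      resource:
        type: OS::Nova::Server
        properties:
          image: my-web-image
          flavor: m1.small</programlisting></section>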
<section xml:id="components"><title>Components</title>
        <para>The number and types of native OpenStack components that are
            available for use depend on whether the deployment is
            exclusively an OpenStack cloud. If it is, all of the
            OpenStack components are available for use, and in many
            ways the issues that need to be considered are similar to
            those that need to be considered for a multi-site
            deployment.</para>
        <para>That said, in any situation in which more than one cloud is
            being used, at least four OpenStack tools should be
            considered:</para>
<itemizedlist>
<listitem>
                <para>OpenStack Compute (Nova): Regardless of deployment
                    location, hypervisor choice has a direct effect on how
                    difficult it is to integrate with one or more
                    additional clouds. For example, integrating a Hyper-V
                    based OpenStack cloud with Azure will have fewer
                    compatibility issues than if KVM is used.</para>
</listitem>
<listitem>
<para>Networking: Whether OpenStack Networking (Neutron)
or Nova-network is used, the network is one place
where integration capabilities need to be understood
in order to connect between clouds.</para>
</listitem>
<listitem>
<para>OpenStack Telemetry (Ceilometer): Use of Ceilometer
depends, in large part, on what the other parts of the
cloud are using.</para>
</listitem>
<listitem>
<para>Orchestration module (Heat): Similarly, Heat can
be a valuable tool in orchestrating tasks a CMP
decides are necessary in an OpenStack-based
cloud.</para>
</listitem>
</itemizedlist></section>
<section xml:id="special-considerations-hybrid"><title>Special considerations</title>
<para>Hybrid cloud deployments also involve two more issues that
are not common in other situations:</para>
        <para>Image portability: Note that, as of the Icehouse release,
            there is no single common image format that is usable by all
            clouds. This means that images will need to be converted or
            recreated when porting between clouds. To make things simpler,
            launch the smallest and simplest images feasible, installing
            only what is necessary, preferably using a deployment manager
            such as Chef or Puppet. This generally means avoiding golden
            images; however, if the same images are being repeatedly
            deployed, it may make more sense to use golden images rather
            than provisioning applications on lighter images each
            time.</para>
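        <para>For simple format conversions, the qemu-img utility can
            convert between common disk image formats; whether the
            resulting image boots unmodified still depends on the target
            cloud. An illustrative example, with placeholder file
            names:</para>
        <programlisting># Convert a QCOW2 image to VMDK (illustrative example only)
$ qemu-img convert -f qcow2 -O vmdk web-server.qcow2 web-server.vmdk</programlisting>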
<para>API differences: The most profound issue that cannot be
avoided when using a hybrid cloud deployment with more than
just OpenStack (or with different versions of OpenStack) is
that the APIs needed to perform certain functions are
different. The CMP needs to know how to handle all necessary
versions. To get around this issue, some implementers build
portals to achieve a hybrid cloud environment, but a heavily
developer-focused organization will get more use out of a
hybrid cloud broker SDK such as jClouds.</para></section>
</section>

View File

@ -0,0 +1,314 @@
<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="user-requirements-hybrid">
<?dbhtml stop-chunking?>
<title>User Requirements</title>
<para>Hybrid cloud architectures introduce additional
complexities, particularly those that use heterogeneous cloud
platforms. As a result, it is important to make sure that
design choices match requirements in such a way that the
benefits outweigh the inherent additional complexity and
risks.</para>
<para>Business considerations to make when designing a hybrid
cloud deployment include:</para>
<itemizedlist>
<listitem>
<para>Cost: A hybrid cloud architecture involves multiple
vendors and technical architectures. These
architectures may be more expensive to deploy and
maintain. Operational costs can be higher because of
the need for more sophisticated orchestration and
brokerage tools than in other architectures. In
contrast, overall operational costs might be lower by
virtue of using a cloud brokerage tool to deploy the
workloads to the most cost effective platform.</para>
</listitem>
<listitem>
                <para>Revenue opportunity: Revenue opportunities vary
                    greatly based on the intent and use case of the cloud.
                    If it is being built as a commercial customer-facing
                    product, consider the drivers for building it over
                    multiple platforms and whether the use of multiple
                    platforms makes the design more attractive to target
                    customers, thus enhancing the revenue
                    opportunity.</para>
</listitem>
<listitem>
                <para>Time to market: One of the most common reasons to
                    use cloud platforms is to speed the time to market of
                    a new product or application. A business may require
                    the use of multiple cloud platforms because there is
                    an existing investment in several applications and it
                    is faster to tie them together than to migrate
                    components and refactor onto a single
                    platform.</para>
</listitem>
<listitem>
<para>Business or technical diversity: Organizations
already leveraging cloud-based services may wish to
embrace business diversity and utilize a hybrid cloud
design to spread their workloads across multiple cloud
providers so that no application is hosted in a single
cloud provider.</para>
</listitem>
<listitem>
<para>Application momentum: A business with existing
applications that are already in production on
multiple cloud environments may find that it is more
cost effective to integrate the applications on
multiple cloud platforms rather than migrate them to a
single platform.</para>
</listitem>
</itemizedlist>
<section xml:id="legal-requirements-hybrid"><title>Legal Requirements</title>
<para>Many jurisdictions have legislative and regulatory
requirements governing the storage and management of data in
cloud environments. Common areas of regulation include:</para>
<itemizedlist>
<listitem>
<para>Data retention policies ensuring storage of
persistent data and records management to meet data
archival requirements.</para>
</listitem>
<listitem>
<para>Data ownership policies governing the possession and
responsibility for data.</para>
</listitem>
<listitem>
<para>Data sovereignty policies governing the storage of
data in foreign countries or otherwise separate
jurisdictions.</para>
</listitem>
<listitem>
                <para>Data compliance policies governing the types of
                    information that must reside in certain locations due
                    to regulatory issues and, more importantly, cannot
                    reside in other locations for the same reason.</para>
</listitem>
</itemizedlist>
<para>Examples of such legal frameworks include the data
protection framework of the European Union
(http://ec.europa.eu/justice/data-protection/) and the
requirements of the Financial Industry Regulatory Authority
(http://www.finra.org/Industry/Regulation/FINRARules/) in the
United States. Consult a local regulatory body for more
information.</para></section>
<section xml:id="workload-considerations"><title>Workload Considerations</title>
<para>Defining what the word "workload" means in the context of a
hybrid cloud environment is important. Workload can be defined
as the intended way the systems will be utilized, which is
often referred to as a “use case.” A workload can be a single
application or a suite of applications that work in concert.
It can also be a duplicate set of applications that need to
run on multiple cloud environments. In a hybrid cloud
deployment, the same workload will often need to function
equally well on radically different public and private cloud
environments. The architecture needs to address these
potential conflicts, complexity, and platform
incompatibilities.</para>
<para>Some possible use cases for a hybrid cloud architecture
include:</para>
<itemizedlist>
<listitem>
                <para>Dynamic resource expansion or "bursting": Another
                    common reason to use a multiple cloud architecture is
                    a "bursty" application that needs additional resources
                    at times. An example of this case could be a retailer
                    that needs additional resources during the holiday
                    selling season, but does not want to build expensive
                    cloud resources to meet the peak demand. They might
                    have an OpenStack private cloud but want to burst to
                    AWS or some other public cloud for these peak load
                    periods. These bursts could be for long or short
                    cycles, ranging from hourly to yearly.</para>
</listitem>
<listitem>
                <para>Disaster recovery and business continuity: Cheaper
                    storage and instance management make a good case for
                    using the cloud as a secondary site. The public cloud
                    is already heavily used for these purposes in
                    combination with an OpenStack public or private
                    cloud.</para>
</listitem>
<listitem>
<para>Federated hypervisor-instance management: Adding
self-service, charge back and transparent delivery of
the right resources from a federated pool can be cost
effective. In a hybrid cloud environment, this is a
particularly important consideration. Look for a cloud
that provides cross-platform hypervisor support and
robust instance management tools.</para>
</listitem>
<listitem>
<para>Application portfolio integration: An enterprise
cloud delivers better application portfolio management
and more efficient deployment by leveraging
self-service features and rules for deployments based
on types of use. A common driver for building hybrid
cloud architecture is to stitch together multiple
existing cloud environments that are already in
production or development.<!-- In the interest of time to
market, the requirements may be to maintain the
multiple clouds and just integrate the pieces
together, not rationalize to one cloud environment, but
instead to --></para>
</listitem>
<listitem>
<para>Migration scenarios: A common reason to create a
hybrid cloud architecture is to allow the migration of
applications between different clouds. This may be
because the application will be migrated permanently
to a new platform, or it might be because the
application needs to be supported on multiple
platforms going forward.</para>
</listitem>
<listitem>
<para>High availability: Another important reason for
wanting a multiple cloud architecture is to address
the needs for high availability. By using a
combination of multiple locations and platforms, a
design can achieve a level of availability that is not
possible with a single platform. This approach does
add a significant amount of complexity.</para>
</listitem>
</itemizedlist>
<para>In addition to thinking about how the workload will work on
a single cloud, the design must accommodate the added
complexity of needing the workload to run on multiple cloud
platforms. The complexity of transferring workloads across
clouds needs to be explored at the application, instance,
cloud platform, hypervisor, and network levels.</para></section>
<section xml:id="tools-considerations-hybrid"><title>Tools Considerations</title>
        <para>When working with designs spanning multiple clouds, the
            design must incorporate tools to facilitate working across
            those multiple clouds. Some of the user requirements drive the
            need for tools that perform the following functions:</para>
<itemizedlist>
<listitem>
                <para>Broker between clouds: Since the multiple cloud
                    architecture assumes that there will be at least two
                    different and possibly incompatible platforms that are
                    likely to have different costs, brokering software is
                    designed to evaluate relative costs between different
                    cloud platforms. These solutions are sometimes
                    referred to as Cloud Management Platforms (CMPs).
                    Examples include RightScale, Gravitant, Scalr,
                    CloudForms, and ManageIQ. These tools allow the
                    designer to determine the right location for the
                    workload based on predetermined criteria.</para>
</listitem>
<listitem>
                <para>Facilitate orchestration across the clouds: CMPs are
                    the tools used to tie everything together. Cloud
                    orchestration tools improve the management of IT
                    application portfolios as they migrate onto public,
                    private, and hybrid cloud platforms, and are an
                    important consideration when managing a diverse
                    portfolio of installed systems across multiple cloud
                    platforms. The typical enterprise IT application
                    portfolio is still composed of a few thousand
                    applications scattered over legacy hardware,
                    virtualized infrastructure, and now dozens of
                    disjointed shadow public Infrastructure-as-a-Service
                    (IaaS) and Software-as-a-Service (SaaS) providers and
                    offerings.</para>
</listitem>
</itemizedlist></section>
<section xml:id="network-considerations-hybrid"><title>Network Considerations</title>
        <para>The network services functionality is an important factor to
            assess when choosing a CMP and cloud provider. Considerations
            include functionality, security, scalability, and high
            availability (HA). Verification and ongoing testing of the
            critical features of the cloud endpoint used by the
            architecture are important tasks.</para>
<itemizedlist>
<listitem>
                <para>Once the network functionality framework has been
                    decided, a minimum functionality test should be
                    designed to ensure that functionality persists, and
                    can be verified, during and after upgrades.</para>
</listitem>
<listitem>
                <para>Scalability across multiple cloud providers may
                    dictate the choice of underlying network framework in
                    the different cloud providers. It is important to
                    have the network API functions presented and to
                    verify that the functionality persists across all
                    cloud endpoints chosen.</para>
</listitem>
<listitem>
                <para>High availability implementations vary in
                    functionality and design. Examples of some common
                    methods are Active-Hot-Standby, Active-Passive, and
                    Active-Active. A high availability test framework
                    needs to be developed to ensure that the functionality
                    and limitations are well understood.</para>
</listitem>
<listitem>
                <para>Security considerations include protecting data in
                    transit between the client and endpoint, as well as
                    any traffic that traverses the multiple clouds,
                    against threats ranging from eavesdropping to denial
                    of service (DoS) activities.</para>
</listitem>
</itemizedlist></section>
<section xml:id="risk-mitigation-management-hybrid"><title>Risk Mitigation and Management
Considerations</title>
        <para>Hybrid cloud architectures introduce additional risk because
            they are more complex and potentially involve conflicting or
            incompatible components or tools. However, they also reduce
            risk by spreading workloads over multiple providers. This
            means that, if one provider were to go out of business, the
            organization could remain operational.</para>
<para>Risks that will be heightened by using a hybrid cloud
architecture include:</para>
<itemizedlist>
<listitem>
                <para>Provider availability or implementation details:
                    This can range from the company going out of business
                    to the company changing how it delivers its services.
                    Cloud architectures are inherently designed to be
                    flexible and changeable; paradoxically, the cloud is
                    perceived to be both rock solid and ever flexible at
                    the same time.</para>
</listitem>
<listitem>
<para>Differing SLAs: Users of hybrid cloud environments
potentially encounter some losses through differences
in service level agreements. A hybrid cloud design
needs to accommodate the different SLAs provided by
the various clouds involved in the design, and must
address the actual enforceability of the providers'
SLAs.</para>
</listitem>
<listitem>
                <para>Security levels: Securing multiple cloud
                    environments is more complex than securing a single
                    cloud environment. Concerns need to be addressed at,
                    but are not limited to, the application, network, and
                    cloud platform levels. One issue is that different
                    cloud platforms approach security differently, and a
                    hybrid cloud design must address and compensate for
                    these differences. For example, AWS uses a relatively
                    simple model that relies on user privilege combined
                    with firewalls.</para>
</listitem>
<listitem>
<para>Provider API changes: APIs are crucial in a hybrid
cloud environment. As a consumer of a provider's cloud
services, an organization will rarely have any control
over provider changes to APIs. Cloud services that
might have previously had compatible APIs may no
longer work. This is particularly a problem with AWS
and OpenStack AWS-compatible APIs. OpenStack was
originally planned to maintain compatibility with
changes in AWS APIs. However, over time, the APIs have
become more divergent in functionality. One way to
address this issue is to focus on using only the most
common and basic APIs to minimize potential
conflicts.</para>
</listitem>
</itemizedlist></section>
</section>

View File

@ -0,0 +1,97 @@
<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="arch-guide-how-this-book-is-organized">
<title>How this Book is Organized</title>
    <para>This book is organized into chapters that help
        define the use cases associated with making architectural
        choices related to an OpenStack cloud installation. Each
        chapter is intended to stand alone to encourage individual
        chapter readability; however, each chapter also contains
        useful information that may be applicable in
        situations covered by other chapters. Cloud architects may use
        this book as a comprehensive guide by reading all of the use
        cases, but it is also possible to review only the chapters
        which pertain to a specific use case. When choosing to read
        specific use cases, note that it may be necessary to read more
        than one section of the guide to formulate a complete design
        for the cloud. The use cases covered in this guide
        include:</para>
<itemizedlist>
<listitem>
<para>General purpose: A cloud built with common
components that should address 80% of common use
cases.</para>
</listitem>
<listitem>
<para>Compute focused: A cloud designed to address compute
intensive workloads such as high performance computing
(HPC).</para>
</listitem>
<listitem>
<para>Storage focused: A cloud focused on storage
intensive workloads such as data analytics with
parallel file systems.</para>
</listitem>
<listitem>
<para>Network focused: A cloud depending on high
performance and reliable networking, such as a content
delivery network (CDN).</para>
</listitem>
<listitem>
<para>Multi-site: A cloud built with multiple sites
available for application deployments for
geographical, reliability or data locality
reasons.</para>
</listitem>
<listitem>
<para>Hybrid cloud: An architecture where multiple
disparate clouds are connected either for failover,
hybrid cloud bursting, or availability.</para>
</listitem>
<listitem>
            <para>Massively scalable: An architecture that is intended
                for cloud service providers or other extremely large
                installations.</para>
</listitem>
</itemizedlist>
<para>A section titled Specialized Use Cases provides information
on architectures that have not previously been covered in the
defined use cases.</para>
<para>Each chapter in the guide is then further broken down into
the following sections:</para>
<itemizedlist>
<listitem>
<para>Introduction: Provides an overview of the
architectural use case.</para>
</listitem>
<listitem>
<para>User requirements: Defines the set of user
considerations that typically come into play for that
use case.</para>
</listitem>
<listitem>
            <para>Technical considerations: Covers the technical
                issues that must be accounted for when dealing with this
                use case.</para>
</listitem>
<listitem>
<para>Operational considerations: Covers the ongoing
operational tasks associated with this use case and
architecture.</para>
</listitem>
<listitem>
<para>Architecture: Covers the overall architecture
associated with the use case.</para>
</listitem>
<listitem>
<para>Prescriptive examples: Presents one or more
scenarios where this architecture could be
deployed.</para>
</listitem>
</itemizedlist>
<para>A Glossary covers the terms and phrases used in the
book.</para>
</section>

View File

@ -0,0 +1,88 @@
<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="arch-guide-why-and-who-we-wrote-this-book">
<title>Why and How We Wrote this Book</title>
<para>The velocity at which OpenStack environments are moving from
proof-of-concepts to production deployments is leading to
increasing questions and issues related to architecture design
considerations. By and large these considerations are not
addressed in the existing documentation, which typically
focuses on the specifics of deployment and configuration
options or operational considerations, rather than the bigger
picture.</para>
<para>We wrote this book to guide readers in designing an
OpenStack architecture that meets the needs of their
organization. This guide concentrates on identifying important
design considerations for common cloud use cases and provides
examples based on these design guidelines. This guide does not
aim to provide explicit instructions for installing and
configuring the cloud, but rather focuses on design principles
as they relate to user requirements as well as technical and
operational considerations. For specific guidance with
installation and configuration there are a number of resources
already available in the OpenStack documentation that help in
that area.</para>
<para>This book was written in a book sprint format, which is a
facilitated, rapid development production method for books.
For more information, see the Book Sprints website
(www.booksprints.net).</para>
<para>This book was written in five days during July 2014 while
exhausting the M&amp;M, Mountain Dew and healthy options
supply, complete with juggling entertainment during lunches at
VMware's headquarters in Palo Alto. The event was also
documented on Twitter using the #OpenStackDesign hashtag. The
Book Sprint was facilitated by Faith Bosworth and Adam
Hyde.</para>
    <para>We would like to thank VMware for their generous
        hospitality, as well as our employers, Cisco, Cloudscaling,
        Comcast, EMC, Mirantis, Rackspace, Red Hat, Verizon, and
        VMware, for enabling us to contribute our time. We would
        especially like to thank Anne Gentle and Kenneth Hui for all
        of their shepherding and organization in making this
        happen.</para>
<para>The author team includes:</para>
<itemizedlist>
<listitem>
<para>Kenneth Hui (EMC) @hui_kenneth</para>
</listitem>
<listitem>
<para>Alexandra Settle (Rackspace) @dewsday</para>
</listitem>
<listitem>
<para>Anthony Veiga (Comcast) @daaelar</para>
</listitem>
<listitem>
<para>Beth Cohen (Verizon) @bfcohen</para>
</listitem>
<listitem>
<para>Kevin Jackson (Rackspace) @itarchitectkev</para>
</listitem>
<listitem>
<para>Maish Saidel-Keesing (Cisco) @maishsk</para>
</listitem>
<listitem>
<para>Nick Chase (Mirantis) @NickChase</para>
</listitem>
<listitem>
<para>Scott Lowe (VMware) @scott_lowe</para>
</listitem>
<listitem>
<para>Sean Collins (Comcast) @sc68cal</para>
</listitem>
<listitem>
<para>Sean Winn (Cloudscaling) @seanmwinn</para>
</listitem>
<listitem>
<para>Sebastian Gutierrez (Red Hat) @gutseb</para>
</listitem>
<listitem>
<para>Stephen Gordon (Red Hat) @xsgordon</para>
</listitem>
<listitem>
<para>Vinny Valdez (Red Hat) @VinnyValdez</para>
</listitem>
</itemizedlist>
</section>

View File

@ -0,0 +1,17 @@
<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="arch-guide-intended-audience">
<title>Intended Audience</title>
<para>This book has been written for architects and designers of
OpenStack clouds. This book is not intended for people who are
deploying OpenStack. For a guide on deploying and operating
OpenStack, please refer to the Operations Guide
http://docs.openstack.org/openstack-ops.</para>
<para>The reader should have prior knowledge of cloud architecture
and principles, experience in enterprise system design, Linux
and virtualization experience, and a basic understanding of
networking principles and protocols.</para>
</section>

View File

@ -0,0 +1,33 @@
<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="arch-guide-intro-to-openstack-arch-design-guide">
<title>Introduction to the OpenStack Architecture Design
Guide</title>
<para>OpenStack is a leader in the cloud technology gold rush, as
organizations of all stripes discover the increased
flexibility and speed to market that self-service cloud and
Infrastructure as a Service (IaaS) provides. To truly reap
those benefits, however, the cloud must be designed and
architected properly.</para>
<para>A well-architected cloud provides a stable IT environment
that offers easy access to needed resources, usage-based
expenses, extra capacity on demand, disaster recovery, and a
secure environment, but a well-architected cloud does not
magically build itself. It requires careful consideration of a
multitude of factors, both technical and non-technical.</para>
<para>There is no single architecture that is "right" for an
OpenStack cloud deployment. OpenStack can be used for any
number of different purposes, and each of them has its own
particular requirements and architectural
peculiarities.</para>
<para>This book is designed to look at some of the most common
uses for OpenStack clouds (and even some that are less common,
but provide a good example) and explain what issues need to be
considered and why, along with a wealth of knowledge and
advice to help an organization to design and build a
well-architected OpenStack cloud that will fit its unique
requirements.</para>
</section>

View File

@ -0,0 +1,232 @@
<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="methodology">
<title>Methodology</title>
    <para>The magic of the cloud is that it can do almost anything. It is
        both robust and flexible, the best of both worlds. But to get the
        most out of a cloud investment, it is important to define how the
        cloud will be used by creating and testing use cases. This chapter
        describes the thought process behind designing a cloud
        architecture that best suits the intended use.</para>
<mediaobject>
<imageobject>
<imagedata fileref="../images/Methodology.png"/>
</imageobject>
</mediaobject>
<para>The diagram shows at a very abstract level the process for capturing
requirements and building use cases. Once a set of use cases has been
defined, it can then be used to design the cloud architecture.</para>
    <para>Use case planning can seem counter-intuitive. After all, it takes
        about five minutes to sign up for a server with Amazon. Amazon does not
        know in advance what any given user is planning on doing with it, right?
        Wrong. Amazon's product management department spends plenty of time
        figuring out exactly what would be attractive to their typical customer
        and honing the service to deliver it. For the enterprise, the planning
        process is no different, but instead of planning for an external paying
        customer, for example, the use could be for internal application
        developers or a web portal. The following is a list of the high-level
        objectives that need to be incorporated into the thinking about creating
        a use case.</para>
<para>Overall business objectives</para>
<itemizedlist>
<listitem>
            <para>Develop a clear definition of business goals and
                requirements</para>
</listitem>
<listitem>
<para>Increase project support and engagement with business,
customers and end users.</para>
</listitem>
</itemizedlist>
<para>Technology</para>
<itemizedlist>
<listitem>
<para>Coordinate the OpenStack architecture across the project and
leverage OpenStack community efforts more effectively.</para>
</listitem>
<listitem>
<para>Architect for automation as much as possible to speed
development and deployment.</para>
</listitem>
<listitem>
<para>Use the appropriate tools for the development effort.</para>
</listitem>
<listitem>
<para>Create better and more test metrics and test harnesses to
support continuous and integrated development, test processes
and automation.</para>
</listitem>
</itemizedlist>
<para>Organization</para>
<itemizedlist>
<listitem>
<para>Better messaging of management support of team efforts</para>
</listitem>
<listitem>
            <para>Develop a better cultural understanding of Open Source, cloud
                architectures, Agile methodologies, continuous development, test
                and integration, and overall development concepts in general</para>
</listitem>
</itemizedlist>
    <para>As an example of how this works, consider a business goal of using the
        cloud for the company's e-commerce website. This goal means planning for
        applications that will support thousands of sessions per second,
        variable workloads, and lots of complex and changing data. By
        identifying the key metrics, such as number of concurrent transactions
        per second, size of database, and so on, it is possible to then build a
        method for testing the assumptions.</para>
    <para>Develop functional user scenarios: Develop functional user scenarios
        that can be used to develop test cases for measuring overall project
        trajectory. If the organization is not ready to commit
        to an application or applications that can be used to develop user
        requirements, it needs to create requirements to build valid test
        harnesses and develop usable metrics. Once the metrics are established,
        as requirements change, it is easier to respond to the changes quickly
        without having to worry overly much about setting the exact requirements
        in advance. Think of this as creating ways to configure the system,
        rather than redesigning it every time there is a requirements change.</para>
<para>Limit cloud feature set: Create requirements that address the pain
points, but do not recreate the entire OpenStack tool suite. The
requirement to build OpenStack, only better, is self-defeating. It is
important to limit scope creep by concentrating on developing a platform
that will address tool limitations for the requirements, but not
recreating the entire suite of tools. Work with technical product owners
to establish critical features that are needed for a successful cloud
deployment.</para>
<section xml:id="application-cloud-readiness-methods">
<title>Application Cloud Readiness</title>
<para>Although the cloud is designed to make things easier, it is
important to realize that "using cloud" is more than just firing up
an instance and dropping an application on it. The "lift and shift"
approach works in certain situations, but there is a fundamental
difference between clouds and traditional bare-metal-based
environments, or even traditional virtualized environments.</para>
<para>In traditional environments, with traditional enterprise
applications, the applications and the servers that run on them are
"pets". They're lovingly crafted and cared for, the servers have
names like Gandalf or Tardis, and if they get sick, someone nurses
them back to health. All of this is designed so that the application
does not experience an outage.</para>
<para>In cloud environments, on the other hand, servers are more like
cattle. There are thousands of them, they get names like NY-1138-Q,
and if they get sick, they get put down and a sysadmin installs
another one. Traditional applications that are unprepared for this
kind of environment, naturally will suffer outages, lost data, or
worse.</para>
        <para>There are other reasons to design applications with cloud in mind.
            Some are defensive: because applications cannot be
            certain of exactly where or on what hardware they will be launched,
            they need to be flexible, or at least adaptable. Others are
            proactive. For example, one of the advantages of using the cloud is
            scalability, so applications need to be designed in such a way that
            they can take advantage of those and other opportunities.</para>
</section>
<section xml:id="determining-whether-an-application-is-cloud-ready">
<title>Determining whether an application is cloud-ready</title>
<para>There are several factors to take into consideration when looking
at whether an application is a good fit for the cloud.</para>
<para>Structure: A large, monolithic, single-tiered legacy application
typically isn't a good fit for the cloud. Efficiencies are gained
when load can be spread over several instances, so that a failure in
one part of the system can be mitigated without affecting other
parts of the system, or so that scaling can take place where the app
needs it.</para>
<para>Dependencies: Applications that depend on specific hardware --
such as a particular chip set or an external device such as a
fingerprint reader -- might not be a good fit for the cloud, unless
those dependencies are specifically addressed. Similarly, if an
application depends on an operating system or set of libraries that
cannot be used in the cloud, or cannot be virtualized, that is a
problem.</para>
        <para>Connectivity: Applications that are not self-contained and
            depend on resources that are not reachable by the cloud in
            question will not run. In some situations, it is possible to work
            around these issues with a custom network setup, but how well this
            works depends on the chosen cloud environment.</para>
        <para>Durability and resilience: Despite the existence of SLAs, the one
            reality of the cloud is that Things Break. Servers go down, network
            connections are disrupted, or other tenants on a server ramp up the
            load, making the server unusable. Any number of things can happen,
            and an application that isn't built to withstand this kind of
            disruption isn't going to work properly.</para>
</section>
<section xml:id="designing-for-the-cloud">
<title>Designing for the cloud</title>
<para>Here are some guidelines to keep in mind when designing an
application for the cloud:</para>
<itemizedlist>
<listitem>
<para>Be a pessimist: Assume everything fails and design
backwards. Love your chaos monkey.</para>
</listitem>
<listitem>
<para>Put your eggs in multiple baskets: Leverage multiple
providers, geographic regions and availability zones to
accommodate for local availability issues. Design for
portability.</para>
</listitem>
<listitem>
<para>Think efficiency: Inefficient designs will not scale.
Efficient designs become cheaper as they scale. Kill off
unneeded components or capacity.</para>
</listitem>
<listitem>
<para>Be paranoid: Design for defense in depth and zero
tolerance by building in security at every level and between
every component. Trust no one.</para>
</listitem>
<listitem>
<para>But not too paranoid: Not every application needs the
platinum solution. Architect for different SLAs, service
tiers and security levels.</para>
</listitem>
<listitem>
            <para>Manage the data: Data is usually the most inflexible and
                complex area of a cloud and cloud integration architecture.
                Don't shortchange the effort in analyzing and addressing
                data needs.</para>
</listitem>
<listitem>
<para>Hands off: Leverage automation to increase consistency and
quality and reduce response times.</para>
</listitem>
<listitem>
<para>Divide and conquer: Pursue partitioning and
parallel layering wherever possible. Make components as small
and portable as possible. Use load balancing between layers.
</para>
</listitem>
<listitem>
<para>Think elasticity: Increasing resources should result in a
proportional increase in performance and scalability.
Decreasing resources should have the opposite effect.
</para>
</listitem>
<listitem>
<para>Be dynamic: Enable dynamic configuration changes such as
auto scaling, failure recovery and resource discovery to
adapt to changing environments, faults and workload volumes.
</para>
</listitem>
<listitem>
<para>Stay close: Reduce latency by moving highly interactive
components and data near each other.</para>
</listitem>
<listitem>
<para>Keep it loose: Loose coupling, service interfaces,
separation of concerns, abstraction and well defined APIs
deliver flexibility.</para>
</listitem>
<listitem>
<para>Be cost aware: Autoscaling, data transmission, virtual
software licenses, reserved instances, and so on can rapidly
increase monthly usage charges. Monitor usage closely.
</para>
</listitem>
</itemizedlist>
</section>
</section>

View File

@ -0,0 +1,75 @@
<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="arch-guide-intro-massive-scale">
<title>Introduction</title>
    <para>A massively scalable architecture is defined as a cloud
        implementation that is either a very large deployment, such as
        one that would be built by a commercial service provider, or
        one that has the capability to support user requests for large
        amounts of cloud resources. An example would be an
        infrastructure in which requests to service 500 or more
        instances at a time are not uncommon. In a massively scalable
        infrastructure, such a request is fulfilled without completely
        consuming all of the available cloud infrastructure resources.
        While the high capital cost of implementing such a cloud
        architecture means that it is currently only being pursued by a
        few organizations, many organizations are planning for massive
        scalability in the future.</para>
<para>A massively scalable OpenStack cloud design presents a
unique set of challenges and considerations. For the most part
it is similar to a general purpose cloud architecture, as it
is built to address a non-specific range of potential use
cases or functions. Typically, it is rare that massively
scalable clouds are designed or specialized for particular
workloads. Like the general purpose cloud, the massively
scalable cloud is most often built as a platform for a variety
of workloads. Massively scalable OpenStack clouds are
generally built as commercial public cloud offerings since
single private organizations rarely have the resources or need
for this scale.</para>
<para>Services provided by a massively scalable OpenStack cloud
will include:</para>
<itemizedlist>
<listitem>
<para>Virtual-machine disk image library</para>
</listitem>
<listitem>
<para>Raw block storage</para>
</listitem>
<listitem>
<para>File or object storage</para>
</listitem>
<listitem>
<para>Firewall functionality</para>
</listitem>
<listitem>
<para>Load balancing functionality</para>
</listitem>
<listitem>
<para>Private (non-routable) and public (floating) IP
addresses</para>
</listitem>
<listitem>
<para>Virtualized network topologies</para>
</listitem>
<listitem>
<para>Software bundles</para>
</listitem>
<listitem>
<para>Virtual compute resources</para>
</listitem>
</itemizedlist>
    <para>Like a general purpose cloud, the instances deployed in a
        massively scalable OpenStack cloud will not necessarily use
        any specific aspect of the cloud offering (compute, network,
        or storage). As the cloud grows, the sheer number of
        workloads can cause stress on all of the cloud
        components. Additional stresses are introduced to supporting
        infrastructure, including databases and message brokers. The
        architecture design for such a cloud must account for these
        performance pressures without negatively impacting user
        experience.</para>
</section>

View File

@ -0,0 +1,99 @@
<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="operational-considerations-massive-scale">
<?dbhtml stop-chunking?>
<title>Operational Considerations</title>
    <para>In order to run at massive scale, it is important to plan on
        the automation of as many of the operational processes as
        possible. Automation includes the configuration of
        provisioning, monitoring, and alerting systems. Part of the
        automation process includes the capability to determine when
        human intervention is required and who should act. The
        objective is to increase the ratio of running systems to
        operational staff as much as possible to reduce maintenance
        costs. In a massively scaled environment, it is impossible for
        staff to give each system individual care.</para>
<para>Configuration management tools such as Puppet or Chef allow
operations staff to categorize systems into groups based on
their role and thus create configurations and system states
that are enforced through the provisioning system. Systems
that fall out of the defined state due to errors or failures
are quickly removed from the pool of active nodes and
replaced.</para>
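    <para>As a minimal sketch only, role-based classification in Puppet
        might look like the following; the openstack::compute and
        openstack::controller classes are hypothetical stand-ins for
        site-specific modules:</para>
    <programlisting># Illustrative Puppet sketch only: classify systems by role.
# The included classes are hypothetical site-specific modules.
node /^compute\d+\.example\.com$/ {
  include openstack::compute
}
node /^controller\d+\.example\.com$/ {
  include openstack::controller
}</programlisting>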
    <para>At large scale, the resource cost of diagnosing individual
        systems that have failed is far greater than the cost of
        replacement. It is more economical to immediately replace the
        system with a new system that can be provisioned and
        configured automatically and quickly brought back into the
        pool of active nodes. By automating tasks that are labor
        intensive, repetitive, and critical to operations, cloud
        operations teams can work more efficiently because fewer staff
        are needed for these babysitting tasks. Administrators are
        then free to tackle tasks that cannot be easily automated and
        have longer-term impacts on the business, such as capacity
        planning.</para>
<section xml:id="the-bleeding-edge"><title>The Bleeding Edge</title>
    <para>Running OpenStack at massive scale requires striking a
        balance between stability and features. For example, it might
        be tempting to run an older stable release branch of OpenStack
        to make deployments easier. However, when running at massive
        scale, known issues that may be of some concern or only have
        minimal impact in smaller deployments could become pain
        points. If the issue is well known, in many cases it may be
        resolved in more recent releases. The OpenStack community can
        help resolve any reported issues by applying the collective
        expertise of the OpenStack developers.</para>
    <para>When issues crop up, the number of organizations running at
        a similar scale is a relatively tiny proportion of the
        OpenStack community; therefore, it is important to share these
        issues with the community and be a vocal advocate for
        resolving them. Some issues only manifest when operating at
        large scale, and the number of organizations able to duplicate
        and validate an issue is small, so it is important to
        document the issues and dedicate resources to their
        resolution.</para>
    <para>In some cases, the resolution to the problem is ultimately
        to deploy a more recent version of OpenStack. Alternatively,
        when the issue needs to be resolved in a production
        environment where rebuilding the entire environment is not an
        option, it is sometimes possible to deploy only more recent
        versions of the separate underlying components required to
        resolve issues or gain significant performance improvements.
        At first glance, this could be perceived as exposing the
        deployment to increased risk and instability; however, in many
        cases the alternative is running with an issue that simply has
        not been discovered yet.</para>
    <para>It is advisable to cultivate a development and operations
        organization that is responsible for creating desired
        features, diagnosing and resolving issues, and building the
        infrastructure for large scale continuous integration tests
        and continuous deployment. This helps catch bugs early and
        makes deployments quicker and less painful. In addition to
        development resources, the recruitment of experts in the
        fields of message queues, databases, distributed systems,
        networking, cloud, and storage is also advisable.</para>
<section xml:id="growth-and-capacity-planning"><title>Growth and Capacity Planning</title>
    <para>An important consideration in running at massive scale is
        projecting growth and utilization trends to plan capital
        expenditures for the near and long term. Utilization metrics
        for compute, network, and storage, as well as a historical
        record of these metrics, are required. While securing major
        anchor tenants can lead to rapid jumps in the utilization
        rates of all resources, the steady adoption of the cloud
        inside an organization or by public consumers in a public
        offering will also create a steady trend of increased
        utilization.</para></section>
<section xml:id="skills-and-training"><title>Skills and Training</title>
    <para>Projecting growth for storage, networking, and compute is
        only one aspect of a growth plan for running OpenStack at
        massive scale. Growing and nurturing development and
        operational staff is an additional consideration. Sending team
        members to OpenStack conferences and meetup events, and
        encouraging active participation in the mailing lists and
        committees, are important ways to maintain skills and
        forge relationships in the community. A list of OpenStack
        training providers in the marketplace can be found here:
        http://www.openstack.org/marketplace/training/.</para>
</section>
</section>

View File

@ -0,0 +1,127 @@
<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="technical-considerations-massive-scale">
<?dbhtml stop-chunking?>
<title>Technical Considerations</title>
<para>Converting an existing OpenStack environment that was
designed for a different purpose to be massively scalable is a
formidable task. When building a massively scalable
environment from the ground up, make sure the initial
deployment is built with the same principles and choices that
apply as the environment grows. For example, a good approach
is to deploy the first site as a multi-site environment. This
allows the same deployment and segregation methods to be used
as the environment grows to separate locations across
dedicated links or wide area networks. In a hyperscale cloud,
scale trumps redundancy. Applications must be modified with
this in mind, relying on the scale and homogeneity of the
environment to provide reliability rather than redundant
infrastructure provided by non-commodity hardware
solutions.</para>
<section xml:id="infrastructure-segregation-massive-scale"><title>Infrastructure Segregation</title>
<para>Fortunately, OpenStack services are designed to support
massive horizontal scale. Be aware that this is not the case
for the entire supporting infrastructure. This is particularly
a problem for the database management systems and message
queues used by the various OpenStack services for data storage
and remote procedure call communications.</para>
    <para>Traditional clustering techniques are typically used to
        provide high availability and some additional scale for these
        environments. In the quest for massive scale, however,
        additional steps need to be taken to relieve the performance
        pressure on these components to prevent them from negatively
        impacting the overall performance of the environment. It is
        important to make sure that all the components are in balance
        so that if and when the massively scalable environment reaches
        its limits, all the components are at, or close to, maximum
        capacity at the same time.</para>
    <para>Regions are used to segregate completely independent
        installations linked only by a shared Identity service and,
        optionally, Dashboard installation. Services are installed
        with separate API endpoints for each region, complete with
        separate database and queue installations. This exposes some
        awareness of the environment's fault domains to users and
        gives them the ability to ensure some degree of application
        resiliency, while also imposing the requirement to specify
        which region their actions must be applied to.</para>
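    <para>As an illustrative sketch only, per-region endpoints are
        registered separately in the Identity service; the service ID
        and URLs below are placeholders, and the command syntax
        assumes the Icehouse-era keystone client:</para>
    <programlisting># Illustrative sketch only: one endpoint per region for a service.
$ keystone endpoint-create --region RegionOne \
    --service-id $NOVA_SERVICE_ID \
    --publicurl "http://compute.r1.example.com:8774/v2/%(tenant_id)s"
$ keystone endpoint-create --region RegionTwo \
    --service-id $NOVA_SERVICE_ID \
    --publicurl "http://compute.r2.example.com:8774/v2/%(tenant_id)s"</programlisting>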
<para>Environments operating at massive scale typically need their
regions or sites subdivided further without exposing the
requirement to specify the failure domain to the user. This
provides the ability to further divide the installation into
failure domains while also providing a logical unit for
maintenance and the addition of new hardware. At hyperscale,
instead of adding single compute nodes, administrators may add
entire racks or even groups of racks at a time with each new
addition of nodes exposed via one of the segregation concepts
mentioned herein.</para>
<para>Cells provide the ability to subdivide the compute portion
of an OpenStack installation, including regions, while still
exposing a single endpoint. In each region an API cell is
created along with a number of compute cells where the
workloads actually run. Each cell gets its own database and
message queue setup (ideally clustered), providing the ability
to subdivide the load on these subsystems, improving overall
performance.</para>
<para>Within each compute cell a complete compute installation is
provided, with its own database and queue installations,
scheduler, conductor, and multiple compute hosts. The cells
scheduler handles placement of user requests from the single
API endpoint to a specific cell from those available. The
normal filter scheduler then handles placement within the
cell.</para>
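<para>As a hedged sketch of how a cell is wired together (the
cell name, RabbitMQ credentials and host, and the use of the
<literal>openstack-config</literal> utility are illustrative
assumptions, not prescribed values), the API cell is marked
in <filename>nova.conf</filename> and each child compute
cell is then registered with it:</para>
<screen><prompt>#</prompt> <userinput>openstack-config --set /etc/nova/nova.conf DEFAULT compute_api_class nova.compute.cells_api.ComputeCellsAPI</userinput>
<prompt>#</prompt> <userinput>openstack-config --set /etc/nova/nova.conf cells enable True</userinput>
<prompt>#</prompt> <userinput>openstack-config --set /etc/nova/nova.conf cells cell_type api</userinput>
<prompt>#</prompt> <userinput>nova-manage cell create --name=cell1 --cell_type=child \
  --username=guest --password=guest --hostname=cell1-rabbit \
  --port=5672 --virtual_host=/ --woffset=1.0 --wscale=1.0</userinput></screen>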
<para>The downside of using cells is that they are not well
supported by any of the OpenStack services other than compute.
Also, they do not adequately support some relatively standard
OpenStack functionality such as security groups and host
aggregates. Due to their relative newness and specialized use,
they receive relatively little testing in the OpenStack gate.
Despite these issues, however, cells are used in some very
well known OpenStack installations operating at massive scale
including those at CERN and Rackspace.</para></section>
<section xml:id="host-aggregates"><title>Host Aggregates</title>
<para>Host Aggregates enable partitioning of OpenStack Compute
deployments into logical groups for load balancing and
instance distribution. Host aggregates may also be used to
further partition an availability zone. Consider a cloud which
might use host aggregates to partition an availability zone
into groups of hosts that either share common resources, such
as storage and network, or have a special property, such as
trusted computing hardware. Host aggregates are not explicitly
user-targetable; instead they are implicitly targeted via the
selection of instance flavors with extra specifications that
map to host aggregate metadata.</para>
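<para>As a brief sketch (the aggregate ID returned by the
first command is assumed to be 1, and the host, flavor, and
metadata key names are illustrative; the
<literal>AggregateInstanceExtraSpecsFilter</literal> must be
enabled in the scheduler for the flavor mapping to take
effect):</para>
<screen><prompt>$</prompt> <userinput>nova aggregate-create ssd-hosts</userinput>
<prompt>$</prompt> <userinput>nova aggregate-add-host 1 compute-01</userinput>
<prompt>$</prompt> <userinput>nova aggregate-set-metadata 1 ssd=true</userinput>
<prompt>$</prompt> <userinput>nova flavor-key m1.ssd set aggregate_instance_extra_specs:ssd=true</userinput></screen></section>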
<section xml:id="availability-zones"><title>Availability Zones</title>
<para>Availability zones provide another mechanism for subdividing
an installation or region. They are, in effect, Host
aggregates that are exposed for (optional) explicit targeting
by users.</para>
<para>Unlike cells, they do not have their own database server or
queue broker but simply represent an arbitrary grouping of
compute nodes. Typically, grouping of nodes into availability
zones is based on a shared failure domain based on a physical
characteristic such as a shared power source, physical network
connection, and so on. Availability zones are exposed to the
user because they can be targeted; however, users are not
required to target them. An alternative is for the operator
to configure a default availability zone, so that instances
are scheduled to a zone other than the built-in nova default
zone.</para>
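<para>To sketch how this looks in practice (the zone,
aggregate, image, and flavor names are illustrative), an
aggregate created with an availability zone argument defines
the zone, a user may optionally target it at boot time, and
the operator can change the zone that untargeted instances
are scheduled to:</para>
<screen><prompt>$</prompt> <userinput>nova aggregate-create rack-a az-rack-a</userinput>
<prompt>$</prompt> <userinput>nova boot --flavor m1.small --image cirros --availability-zone az-rack-a test-vm</userinput>
<prompt>#</prompt> <userinput>openstack-config --set /etc/nova/nova.conf DEFAULT default_schedule_zone az-rack-a</userinput></screen></section>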
<section xml:id="segregation-example"><title>Segregation Example</title>
<para>In this example the cloud is divided into two regions, one
for each site, with two availability zones in each based on
the power layout of the data centers. A number of host
aggregates have also been defined to allow targeting of
virtual machine instances using flavors, that require special
capabilities shared by the target hosts such as SSDs, 10 G
networks, or GPU cards.</para>
<mediaobject>
<imageobject>
<imagedata
fileref="../images/Massively_Scalable_Cells_+_regions_+_azs.png"
/>
</imageobject>
</mediaobject></section>
</section>


@ -0,0 +1,173 @@
<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="user-requirements-massive-scale-overview">
<?dbhtml stop-chunking?>
<title>User Requirements</title>
<para>More so than other scenarios, defining user requirements for
a massively scalable OpenStack design architecture dictates
approaching the design from two different, yet sometimes
opposing, perspectives: the cloud user, and the cloud
operator. The expectations and perceptions of the consumption
and management of resources of a massively scalable OpenStack
cloud from the user point of view is distinctly different from
that of the cloud operator.</para>
<para>Many jurisdictions have legislative and regulatory
requirements governing the storage and management of data in
cloud environments. Common areas of regulation include:</para>
<itemizedlist>
<listitem>
<para>Data retention policies ensuring storage of
persistent data and records management to meet data
archival requirements.</para>
</listitem>
<listitem>
<para>Data ownership policies governing the possession and
responsibility for data.</para>
</listitem>
<listitem>
<para>Data sovereignty policies governing the storage of
data in foreign countries or otherwise separate
jurisdictions.</para>
</listitem>
<listitem>
<para>Data compliance policies governing certain types of
information that must reside in certain locations due
to regulatory issues and, more importantly, cannot reside
in other locations for the same reason.</para>
</listitem>
</itemizedlist>
<para>Examples of such legal frameworks include the data
protection framework of the European Union
(http://ec.europa.eu/justice/data-protection/) and the
requirements of the Financial Industry Regulatory Authority
(http://www.finra.org/Industry/Regulation/FINRARules/) in the
United States. Consult a local regulatory body for more
information.</para>
<section xml:id="user-requirements-massive-scale"><title>User Requirements</title>
<para>Massively scalable OpenStack clouds have the following user
requirements:</para>
<itemizedlist>
<listitem>
<para>The cloud user expects repeatable, dependable, and
deterministic processes for launching and deploying
cloud resources. This could be delivered through a
web-based interface or publicly available API
endpoints. All appropriate options for requesting
cloud resources need to be available through some type
of user interface, a command-line interface (CLI), or
API endpoints.</para>
</listitem>
<listitem>
<para>Cloud users expect a fully self-service and
on-demand consumption model. When an OpenStack cloud
reaches the "massively scalable" size, it means it is
expected to be consumed "as a service" in each and
every way.</para>
</listitem>
<listitem>
<para>For a user of a massively scalable OpenStack public
cloud, there will be no expectations for control over
security, performance, or availability. Only SLAs
related to uptime of API services are expected, and
only very basic SLAs are expected of the services
offered. The user
understands it is his or her responsibility to address
these issues on their own. The exception to this
expectation is the rare case of a massively scalable
cloud infrastructure built for a private or government
organization that has specific requirements.</para>
</listitem>
</itemizedlist>
<para>As might be expected, the cloud user requirements or
expectations that determine the design are all focused on the
consumption model. The user expects to be able to easily
consume cloud resources in an automated and deterministic way,
without any need for knowledge of the capacity, scalability,
or other attributes of the cloud's underlying
infrastructure.</para></section>
<section xml:id="operator-requirements-massive-scale"><title>Operator Requirements</title>
<para>Whereas the cloud user should be completely unaware of the
underlying infrastructure of the cloud and its attributes, the
operator must be able to build and support the infrastructure,
as well as understand how it needs to operate at scale. This presents a
very demanding set of requirements for building such a cloud
from the operator's perspective:</para>
<itemizedlist>
<listitem>
<para>First and foremost, everything must be capable of
automation: from the deployment of new hardware,
whether compute, storage, or networking, through to
the installation and configuration of the supporting
software. Manual processes will not suffice in a
massively scalable OpenStack design
architecture.</para>
</listitem>
<listitem>
<para>The cloud operator requires that capital expenditure
(CapEx) is minimized at all layers of the stack.
Operators of massively scalable OpenStack clouds
require the use of dependable commodity hardware and
freely available open source software components to
reduce deployment costs and operational expenses.
Initiatives like the Open Compute Project
(http://www.opencompute.org) provide additional
pointers. To cut costs, many operators sacrifice
redundancy, for example redundant power supplies,
network connections, and rack switches.</para>
</listitem>
<listitem>
<para>Companies operating a massively scalable OpenStack
cloud also require that operational expenditures
(OpEx) be minimized as much as possible.
Cloud-optimized hardware is a good approach to
managing operational overhead. Some of
the factors that need to be considered include power,
cooling, and the physical design of the chassis. It is
possible to customize the hardware and systems so they
are optimized for this type of workload because of the
scale of these implementations.</para>
</listitem>
<listitem>
<para>Massively scalable OpenStack clouds require
extensive metering and monitoring functionality to
maximize the operational efficiency by keeping the
operator informed about the status and state of the
infrastructure. This includes full scale metering of
the hardware and software status. A corresponding
framework of logging and alerting is also required to
store and allow operations to act upon the metrics
provided by the metering and monitoring solution(s).
The cloud operator also needs a solution that uses the
data provided by the metering and monitoring solution
to provide capacity planning and capacity trending
analysis.</para>
</listitem>
<listitem>
<para>A massively scalable OpenStack cloud will be a
multi-site cloud. Therefore, the user-operator
requirements for a multi-site OpenStack architecture
design are also applicable here. This includes various
legal requirements for data storage, data placement,
and data retention; other jurisdictional legal or
compliance requirements; image consistency and
availability; storage replication and
availability (both block and file/object storage); and
authentication, authorization, and auditing (AAA),
just to name a few. Refer to the "Multi-Site" section
for more details on requirements and considerations
for multi-site OpenStack clouds.</para>
</listitem>
<listitem>
<para>Considerations around physical facilities such as
space, floor weight, rack height and type,
environmental considerations, power usage and power
usage efficiency (PUE), and physical security must
also be addressed by the design architecture of a
massively scalable OpenStack cloud.</para>
</listitem>
</itemizedlist></section>
</section>


@ -0,0 +1,118 @@
<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink" version="5.0"
xml:id="arch-design-architecture-multiple-site">
<?dbhtml stop-chunking?>
<title>Architecture</title>
<para>This graphic is a high-level diagram of a multiple-site OpenStack
architecture. Each site is an OpenStack cloud, but it may be necessary to
run the sites on different versions; for example, if the second site is
intended to be a replacement for the first site, the two would run
different versions during the transition. Another common design is a
private OpenStack cloud with a replicated site used for high availability
or disaster recovery. The most important design decision is how to configure the
storage. It can be configured as a single shared pool or separate pools,
depending on the user and technical requirements.</para>
<mediaobject>
<imageobject>
<imagedata
fileref="../images/Multi-Site_shared_keystone_horizon_swift1.png"/>
</imageobject>
</mediaobject>
<section xml:id="openstack-services-architecture">
<title>OpenStack Services Architecture</title>
<para>The OpenStack Identity service, which is used by all other
OpenStack components for authorization and the catalog of service
endpoints, supports the concept of regions. A region is a logical
construct that can be used to group OpenStack services that are in
close proximity to one another. The concept of regions is flexible;
it may contain OpenStack service endpoints located within a
distinct geographic region, or span several such regions. It may be smaller in scope,
where a region is a single rack within a data center or even a
single blade chassis, with multiple regions existing in adjacent
racks in the same data center.</para>
<para>The majority of OpenStack components are designed to run within
the context of a single region. The OpenStack Compute service is
designed to manage compute resources within a region, with support
for subdivisions of compute resources by using Availability Zones
and Cells. The OpenStack Networking service can be used to manage
network resources in the same broadcast domain or collection of
switches that are linked. The OpenStack Block Storage service
controls storage resources within a region with all storage
resources residing on the same storage network. Like the OpenStack
Compute service, the OpenStack Block Storage Service also supports
the Availability Zone construct, which can be used to subdivide
storage resources.</para>
<para>The OpenStack Dashboard, OpenStack Identity Service, and OpenStack
Object Storage services are components that can each be deployed
centrally in order to serve multiple regions.</para>
</section>
<section xml:id="arch-multi-storage">
<title>Storage</title>
<para>With multiple OpenStack regions, having a single OpenStack Object
Storage Service endpoint that delivers shared object storage for all
regions is desirable. The Object Storage service internally
replicates files to multiple nodes. The advantage of this is that a
file placed into the Object Storage service is visible to all
regions and can be used by applications or workloads in any or all
of the regions. This simplifies high availability failover and
disaster recovery rollback.</para>
<para>In order to scale the Object Storage service to meet the workload
of multiple regions, multiple proxy workers are run and
load-balanced, storage nodes are installed in each region, and the
entire Object Storage Service can be fronted by an HTTP caching
layer. This is done so client requests for objects can be served out
of caches rather than directly from the storage modules themselves,
reducing the actual load on the storage network. In addition to an
HTTP caching layer, use a caching layer like Memcache to cache
objects between the proxy and storage nodes.</para>
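<para>As an illustrative sketch (the memcached addresses are
placeholders), the proxy's cache middleware can be pointed at
a pool of memcached servers in
<filename>proxy-server.conf</filename>:</para>
<screen><prompt>#</prompt> <userinput>openstack-config --set /etc/swift/proxy-server.conf filter:cache memcache_servers 10.0.0.1:11211,10.0.0.2:11211</userinput></screen>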
<para>If the cloud is designed without a single Object Storage Service
endpoint for multiple regions, and instead a separate Object Storage
Service endpoint is made available in each region, applications are
required to handle synchronization (if desired) and other management
operations to ensure consistency across the nodes. For some
applications, having multiple Object Storage Service endpoints, each
located in the same region as the application, may be desirable due
to reduced latency, lower cross-region bandwidth usage, and ease of
deployment.</para>
<para>For the Block Storage service, the most important decisions are
the selection of the storage technology and whether or not a
dedicated network is used to carry storage traffic from the storage
service to the compute nodes.</para>
</section>
<section xml:id="arch-networking-multiple">
<title>Networking</title>
<para>When connecting multiple regions together there are several design
considerations. The overlay network technology choice determines how
packets are transmitted between regions and how the logical network
and addresses are presented to the application. If there are security or
regulatory requirements, encryption should be implemented to secure
the traffic between regions. For networking inside a region, the
overlay network technology for tenant networks is equally important.
The overlay technology and the network traffic an application
generates or receives can be either complementary or at cross
purposes. For example, using an overlay technology for an application
that transmits a large amount of small packets could add excessive
latency or overhead to each packet if not configured
properly.</para>
</section>
<section xml:id="arch-dependencies-multiple">
<title>Dependencies</title>
<para>The architecture for a multi-site installation of OpenStack is
dependent on a number of factors. One major dependency to consider
is storage. When designing the storage system, the storage mechanism
needs to be determined. Once the storage type is determined, how it
will be accessed is critical. For example, it is recommended that
storage utilize a dedicated network. Another concern is how
the storage is configured to protect the data, for example, the
recovery point objective (RPO) and the recovery time objective
(RTO). How quickly recovery from a fault must be completed
determines how often the replication of data is required. Ensure that
enough storage is allocated to support the data protection
strategy.</para>
<para>Networking decisions include the encapsulation mechanism that will
be used for the tenant networks, how large the broadcast domains
should be, and the contracted SLAs for the interconnects.</para>
</section>
</section>


@ -0,0 +1,33 @@
<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="arch-guide-intro-multi">
<title>Introduction</title>
<para>A multi-site OpenStack environment is one in which services
located in more than one data center are used to provide the
overall solution. Usage requirements of different multi-site
clouds may vary widely; however, they share some common needs.
OpenStack is capable of running in a multi-region
configuration allowing some parts of OpenStack to effectively
manage a grouping of sites as a single cloud. With some
careful planning in the design phase, OpenStack can act as an
excellent multi-site cloud solution for a multitude of
needs.</para>
<para>Some use cases that might indicate a need for a multi-site
deployment of OpenStack include:</para>
<itemizedlist>
<listitem>
<para>An organization with a diverse geographic
footprint.</para>
</listitem>
<listitem>
<para>Geo-location sensitive data.</para>
</listitem>
<listitem>
<para>Data locality, in which specific data or
functionality should be close to users.</para>
</listitem>
</itemizedlist>
</section>


@ -0,0 +1,178 @@
<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="operational-considerations-multi-site">
<?dbhtml stop-chunking?>
<title>Operational Considerations</title>
<para>Deployment of a multi-site OpenStack cloud using regions
requires that the service catalog contains per-region entries
for each service deployed other than the Identity service
itself. There is limited support amongst currently available
off-the-shelf OpenStack deployment tools for defining multiple
regions in this fashion.</para>
<para>Deployers must be aware of this and provide the appropriate
customization of the service catalog for their site either
manually or via customization of the deployment tools in
use.</para>
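<para>As an illustration of what that customization involves
(the region name, service ID, and URLs are placeholders), a
per-region endpoint for the Compute service might be
registered as follows:</para>
<screen><prompt>$</prompt> <userinput>keystone endpoint-create --region site-1 \
  --service-id $NOVA_SERVICE_ID \
  --publicurl "http://site1.example.com:8774/v2/%(tenant_id)s" \
  --internalurl "http://site1.example.com:8774/v2/%(tenant_id)s" \
  --adminurl "http://site1.example.com:8774/v2/%(tenant_id)s"</userinput></screen>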
<para>Note that, as of the Icehouse release, documentation for
implementing this feature is in progress. See this bug for
more information:
https://bugs.launchpad.net/openstack-manuals/+bug/1340509</para>
<section xml:id="licensing"><title>Licensing</title>
<para>Multi-site OpenStack deployments present additional
licensing considerations over and above regular OpenStack
clouds, particularly where site licenses are in use to provide
cost efficient access to software licenses. The licensing for
host operating systems, guest operating systems, OpenStack
distributions (if applicable), software-defined infrastructure
including network controllers and storage systems, and even
individual applications needs to be evaluated in light of the
multi-site nature of the cloud.</para>
<para>Topics to consider include:</para>
<itemizedlist>
<listitem>
<para>The specific definition of what constitutes a site
in the relevant licenses, as the term does not
necessarily denote a geographic or otherwise
physically isolated location in the traditional
sense.</para>
</listitem>
<listitem>
<para>Differentiations between "hot" (active) and "cold"
(inactive) sites where significant savings may be made
in situations where one site is a cold standby for
disaster recovery purposes only.</para>
</listitem>
<listitem>
<para>Certain locations might require local vendors to
provide support and services for each site, which
presents challenges; the specifics will vary depending
on the licensing agreement in place.</para>
</listitem>
</itemizedlist></section>
<section xml:id="logging-and-monitoring-multi-site"><title>Logging and Monitoring</title>
<para>Logging and monitoring does not significantly differ for a
multi-site OpenStack cloud. The same well known tools
described in the Operations Guide
(http://docs.openstack.org/openstack-ops/content/logging_monitoring.html)
remain applicable. Logging and monitoring can be provided both
on a per-site basis and in a common centralized
location.</para>
<para>When attempting to deploy logging and monitoring facilities
to a centralized location, care must be taken with regards to
the load placed on the inter-site networking links.</para></section>
<section xml:id="upgrades-multi-site"><title>Upgrades</title>
<para>In multi-site OpenStack clouds deployed using regions each
site is, effectively, an independent OpenStack installation
which is linked to the others by using centralized services
such as Identity which are shared between sites. At a high
level the recommended order of operations to upgrade an
individual OpenStack environment is
(http://docs.openstack.org/openstack-ops/content/ops_upgrades-general-steps.html):</para>
<orderedlist>
<listitem>
<para>Upgrade the OpenStack Identity Service
(Keystone).</para>
</listitem>
<listitem>
<para>Upgrade the OpenStack Image Service (Glance).</para>
</listitem>
<listitem>
<para>Upgrade OpenStack Compute (Nova), including
networking components.</para>
</listitem>
<listitem>
<para>Upgrade OpenStack Block Storage (Cinder).</para>
</listitem>
<listitem>
<para>Upgrade the OpenStack dashboard (Horizon).</para>
</listitem>
</orderedlist>
<para>The process for upgrading a multi-site environment is not
significantly different:</para>
<orderedlist>
<listitem>
<para>Upgrade the shared OpenStack Identity Service
(Keystone) deployment.</para>
</listitem>
<listitem>
<para>Upgrade the OpenStack Image Service (Glance) at each
site.</para>
</listitem>
<listitem>
<para>Upgrade OpenStack Compute (Nova), including
networking components, at each site.</para>
</listitem>
<listitem>
<para>Upgrade OpenStack Block Storage (Cinder) at each
site.</para>
</listitem>
<listitem>
<para>Upgrade the OpenStack dashboard (Horizon) at each
site, or in the single central location if it is
shared.</para>
</listitem>
</orderedlist>
<para>Note that, as of the OpenStack Icehouse release, compute
upgrades within each site can also be performed in a rolling
fashion. Compute controller services (API, Scheduler, and
Conductor) can be upgraded prior to upgrading of individual
compute nodes. This maximizes the ability of operations staff
to keep a site operational for users of compute services while
performing an upgrade.</para></section>
<section xml:id="quota-management-multi-site"><title>Quota Management</title>
<para>To prevent system capacities from being exhausted without
notification, OpenStack provides operators with the ability to
define quotas. Quotas are used to set operational limits and
are currently enforced at the tenant (or project) level rather
than at the user level.</para>
<para>Quotas are defined on a per-region basis. Operators may wish
to define identical quotas for tenants in each region of the
cloud to provide a consistent experience, or even create a
process for synchronizing allocated quotas across regions. It
is important to note that only the operational limits imposed
by the quotas will be aligned; consumption of quotas by users
will not be reflected between regions.</para>
<para>For example, given a cloud with two regions, if the operator
grants a user a quota of 25 instances in each region then that
user may launch a total of 50 instances spread across both
regions. They may not, however, launch more than 25 instances
in any single region.</para>
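<para>A minimal sketch of aligning that limit across both
regions with the nova client (the region names and tenant ID
are placeholders):</para>
<screen><prompt>$</prompt> <userinput>nova --os-region-name region-1 quota-update --instances 25 $TENANT_ID</userinput>
<prompt>$</prompt> <userinput>nova --os-region-name region-2 quota-update --instances 25 $TENANT_ID</userinput></screen>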
<para>For more information on managing quotas refer to Chapter 9.
Managing Projects and Users
(http://docs.openstack.org/openstack-ops/content/projects_users.html)
of the OpenStack Operators Guide.</para></section>
<section xml:id="policy-management-multi-site"><title>Policy Management</title>
<para>OpenStack provides a default set of Role Based Access
Control (RBAC) policies, defined in a <filename>policy.json</filename> file, for
each service. Operators edit these files to customize the
policies for their OpenStack installation. If the application
of consistent RBAC policies across sites is considered a
requirement, then it is necessary to ensure proper
synchronization of the <filename>policy.json</filename> files to all
installations.</para>
<para>This must be done using normal system administration tools
such as rsync as no functionality for synchronizing policies
across regions is currently provided within OpenStack.</para>
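<para>For example, a minimal sketch of pushing a locally
edited Compute policy file to the controller at a second site
(the host name is illustrative):</para>
<screen><prompt>#</prompt> <userinput>rsync -avz /etc/nova/policy.json site2-controller:/etc/nova/policy.json</userinput></screen></section>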
<section xml:id="documentation-multi-site"><title>Documentation</title>
<para>Users must be able to leverage cloud infrastructure and
provision new resources in the environment. It is important
that user documentation is accessible by users of the cloud
infrastructure to ensure they are given sufficient information
to help them leverage the cloud. As an example, by default
OpenStack will schedule instances on a compute node
automatically. However, when multiple regions are available,
it is left to the end user to decide in which region to
schedule the new instance. Horizon will present the user with
the first region in your configuration. The API and CLI tools
will not execute commands unless a valid region is specified.
It is therefore important to provide documentation to your
users describing the region layout as well as calling out that
quotas are region-specific. If a user reaches his or her quota
in one region, OpenStack will not automatically build new
instances in another. Documenting specific examples will help
users understand how to operate the cloud, thereby reducing
calls and tickets filed with the help desk.</para></section>
</section>


@ -0,0 +1,218 @@
<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="prescriptive-example-multisite">
<?dbhtml stop-chunking?>
<title>Prescriptive Examples</title>
<para>Based on the needs of the intended workloads, there are
multiple ways to build a multi-site OpenStack installation.
Below are example architectures based on different
requirements. These examples are meant as a reference, and not
a hard and fast rule for deployments. Use the previous
sections of this chapter to assist in selecting specific
components and implementations based on specific needs.</para>
<para>A large content provider needs to deliver content to
customers that are geographically dispersed. The workload is
very sensitive to latency and needs a rapid response to
end-users. After reviewing the user, technical, and operational
considerations, it is determined beneficial to build a number
of regions local to the customers' edge. In this case, rather
than build a few large, centralized data centers, the intent
of the architecture is to provide a pair of small data centers
in locations that are closer to the customer. In this use
case, spreading applications out allows for a different kind
of horizontal scaling than a traditional compute workload
requires. The intent is to scale by creating more copies of
the application in closer proximity to the users that need it
most, in order to ensure faster response times to user
requests. This provider will deploy two data centers at each
of the four chosen regions. The implications of this design are
based around the method of placing copies of resources in each
of the remote regions. Swift objects, Glance images, and block
storage will need to be manually replicated into each region.
This may be beneficial for some systems, such as the case of
content service, where only some of the content needs to exist
in some but not all regions. A centralized Keystone is
recommended to ensure authentication and that access to the
API endpoints is easily manageable.</para>
<para>Installation of an automated DNS system such as Designate is
highly recommended. Unless an external Dynamic DNS system is
available, application administrators will need a way to
manage the mapping of which application copy exists in each
region and how to reach it. Designate will assist by making
the process automatic and by populating the records in each
region's zone.</para>
<para>Telemetry for each region is also deployed, as each region
may grow differently or be used at a different rate.
Ceilometer will run to collect each region's metrics from each
of the controllers and report them back to a central location.
This is useful both to the end user and the administrator of
the OpenStack environment. The end user will find this method
useful, in that it is possible to determine if certain
locations are experiencing higher load than others, and take
appropriate action. Administrators will also benefit by
possibly being able to forecast growth per region, rather than
expanding the capacity of all regions simultaneously,
therefore maximizing the cost-effectiveness of the multi-site
design.</para>
<para>One of the key decisions of running this sort of
infrastructure is whether or not to provide a redundancy
model. Two types of redundancy and high availability models in
this configuration will be implemented. The first type
revolves around the availability of the central OpenStack
components. Keystone will be made highly available in three
central data centers that will host the centralized OpenStack
components. This prevents a loss of any one of the regions
causing an outage in service. It also has the added benefit of
being able to run a central storage repository as a primary
cache for distributing content to each of the regions.</para>
<para>The second redundancy topic is that of the edge data center
itself. A second data center in each of the edge regional
locations will house a second region near the first. This
ensures that the application will not suffer degraded
performance in terms of latency and availability.</para>
<para>This figure depicts the solution designed to have both a
centralized set of core data centers for OpenStack services
and paired edge data centers:</para>
<mediaobject>
<imageobject>
<imagedata
fileref="../images/Multi-Site_Customer_Edge.png"/>
</imageobject>
</mediaobject>
<section xml:id="geo-redundant-load-balancing"><title>Geo-redundant load balancing</title>
<para>A large-scale web application has been designed with cloud
principles in mind. The application is designed to provide
service to an application store, on a 24/7 basis. The company
has a typical two-tier architecture, with a web front end
servicing the customer requests and a NoSQL database back end
storing the information.</para>
<para>Recently there have been several outages in a number of
major public cloud providers, usually due to the fact that
these applications were running out of a single geographical
location. The design therefore should mitigate the chance of
a single site causing an outage for the business.</para>
<para>The solution would consist of the following OpenStack
components:</para>
<itemizedlist>
<listitem>
<para>A firewall, switches and load balancers on the
public facing network connections.</para>
</listitem>
<listitem>
<para>OpenStack Controller services running Networking,
Horizon, Cinder, and Nova compute locally in each of
the three regions. The other services, Keystone, Heat,
Ceilometer, Glance, and Swift, will be installed
centrally, with nodes in each of the regions
providing a redundant OpenStack Controller plane
throughout the globe.</para>
</listitem>
<listitem>
<para>OpenStack Compute nodes running the KVM
hypervisor.</para>
</listitem>
<listitem>
<para>OpenStack Object Storage for serving static objects
such as images will be used to ensure that all images
are standardized across all the regions, and
replicated on a regular basis.</para>
</listitem>
<listitem>
<para>A distributed DNS service, available to all
regions, that allows for dynamic update of DNS records
of deployed instances.</para>
</listitem>
<listitem>
<para>A geo-redundant load balancing service will be used
to service the requests from the customers based on
their origin.</para>
</listitem>
</itemizedlist>
<para>An autoscaling Heat template will be used to deploy the
application in the three regions. This template will
include:</para>
<itemizedlist>
<listitem>
<para>Web Servers, running Apache.</para>
</listitem>
<listitem>
<para>Appropriate user_data to populate the central DNS
servers upon instance launch.</para>
</listitem>
<listitem>
<para>Appropriate Ceilometer alarms that maintain state of
the application and allow for handling of region or
instance failure.</para>
</listitem>
</itemizedlist>
<para>Another autoscaling Heat template will be used to deploy a
distributed MongoDB shard over the three locations, with the
option of storing required data on a globally available Swift
container. According to the usage of and load on the database
server, additional shards will be provisioned according to
the thresholds defined in Ceilometer.</para>
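<para>A minimal sketch of driving the same template into each
region with the heat client (the template file name, stack
names, and region names are assumptions):</para>
<screen><prompt>$</prompt> <userinput>heat --os-region-name region-1 stack-create app-region-1 -f app-tier.yaml</userinput>
<prompt>$</prompt> <userinput>heat --os-region-name region-2 stack-create app-region-2 -f app-tier.yaml</userinput>
<prompt>$</prompt> <userinput>heat --os-region-name region-3 stack-create app-region-3 -f app-tier.yaml</userinput></screen>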
<para>The reason that three regions were selected here was
the concern about abnormal load landing on a single region in
the event of a failure. Two data centers would have been
sufficient had the availability requirements been the only
consideration.</para>
<para>Heat is used because of its built-in autoscaling and
auto-healing functionality in the event of increased load.
Additional configuration management tools, such as Puppet or
Chef, could also have been used in this scenario, but were
not chosen because Heat had the appropriate built-in hooks
into the OpenStack cloud, whereas the other tools were
external and not native to OpenStack. In addition, since this
deployment scenario was relatively straightforward, the
external tools were not needed.</para>
<para>Swift is used here to serve as a back end for Glance and
object storage since it was the most suitable solution for
globally distributed storage, with its own replication
mechanism. Home-grown solutions could also have been used,
including the handling of replication, but were not chosen
because Swift is already an integral and proven part of the
infrastructure.</para>
<para>An external load balancing service was used rather than
the OpenStack LBaaS because the solution in OpenStack is not
redundant and does not have any awareness of geographic
location.</para>
<mediaobject>
<imageobject>
<imagedata
fileref="../images/Multi-site_Geo_Redundant_LB.png"/>
</imageobject>
</mediaobject></section>
<section xml:id="location-local-services"><title>Location-local service</title>
<para>A common use for a multi-site deployment of OpenStack is
the creation of a Content Delivery Network. An application that
uses a location-local architecture will require low network
latency and proximity to the user, in order to provide an
optimal user experience, in addition to reducing the cost of
bandwidth and transit, since the content resides on sites
closer to the customer, instead of a centralized content store
that would require utilizing higher cost cross country
links.</para>
<para>This architecture usually includes a geo-location component
that places user requests at the closest possible node. In
this scenario, 100% redundancy of content across every site is
a goal rather than a requirement, with the intent being to
maximize the amount of content available that is within a
minimum number of network hops for any given end user. Despite
these differences, the storage replication configuration has
significant overlap with that of a geo-redundant load
balancing use case.</para>
<para>In this example, a location-aware application utilizing
this multi-site OpenStack installation would launch web
server or content-serving instances on the compute cluster in
each site. Requests from clients will first be sent to a
global services load balancer that determines the location of
the client, then routes the request to the closest OpenStack
site where the application completes the request.</para>
<mediaobject>
<imageobject>
<imagedata
fileref="../images/Multi-Site_shared_keystone1.png"/>
</imageobject>
</mediaobject></section>
</section>


@ -0,0 +1,196 @@
<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="technical-considerations-multi-site">
<?dbhtml stop-chunking?>
<title>Technical Considerations</title>
<para>There are many technical considerations to take into account
with regard to designing a multi-site OpenStack
implementation. An OpenStack cloud can be designed in a
variety of ways to handle individual application needs. A
multi-site deployment will have additional challenges compared
to single site installations and will therefore be a more
complex solution.</para>
<para>When determining capacity options be sure to take into
account not just the technical issues, but also the economic
or operational issues that might arise from specific
decisions.</para>
<para>Inter-site link capacity describes the capabilities of the
connectivity between the different OpenStack sites. This
includes parameters such as bandwidth, latency, whether or not
a link is dedicated, and any business policies applied to the
connection. The capability and number of the links between
sites will determine what kind of options may be available for
deployment. For example, if two sites have a pair of
high-bandwidth links available between them, it may be wise to
configure a separate storage replication network between the
two sites to support a single Swift endpoint and a shared
object storage capability between them. (An example of this
technique, as well as a configuration walk-through, is
available at
http://docs.openstack.org/developer/swift/replication_network.html#dedicated-replication-network).
Another option in this scenario is to build a dedicated set of
tenant private networks across the secondary link using
overlay networks with a third party mapping the site overlays
to each other.</para>
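<para>As a sketch of the dedicated replication network
technique referenced above (the region, zone, IP addresses,
device, and weight are illustrative), each device can be
added to the ring with a separate replication IP and port
using the <literal>R</literal> syntax:</para>
<screen><prompt>$</prompt> <userinput>swift-ring-builder object.builder add r1z1-192.0.2.10:6000R198.51.100.10:6000/sda1 100</userinput></screen>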
<para>The capacity requirements of the links between sites will be
driven by application behavior. If the latency of the links is
too high, certain applications that use a large number of
small packets, for example RPC calls, may encounter issues
communicating with each other or operating properly.
Additionally, OpenStack may encounter similar types of issues.
To mitigate this, tuning of the Keystone call timeouts may be
necessary to prevent issues authenticating against a central
Identity Service.</para>
<para>Another capacity consideration when it comes to networking
for a multi-site deployment is the available amount and
performance of overlay networks for tenant networks. If using
shared tenant networks across zones, it is imperative that an
external overlay manager or controller be used to map these
overlays together. It is necessary to ensure that the range of
possible IDs is identical between the zones. Note that, as of
the Icehouse release, Neutron was not capable of managing
tunnel IDs across installations. This means that if one site
runs out of IDs, but the other does not, that tenant's network
will be unable to reach the other site.</para>
<para>Capacity can take other forms as well. The ability for a
region to grow depends on scaling out the number of available
compute nodes. This topic is covered in greater detail in the
section for compute-focused deployments. However, it should be
noted that cells may be necessary to grow an individual region
beyond a certain point. This point depends on the size of your
cluster and the ratio of virtual machines per
hypervisor.</para>
<para>A third form of capacity comes in the multi-region-capable
components of OpenStack. Centralized Object Storage is capable
of serving objects through a single namespace across multiple
regions. Since this works by accessing the object store via
swift proxy, it is possible to overload the proxies. There are
two options available to mitigate this issue. The first is to
deploy a large number of swift proxies. The drawback to this
is that the proxies are not load-balanced and a large file
request could continually hit the same proxy. The other way to
mitigate this is to front-end the proxies with a caching HTTP
proxy and load balancer. Since swift objects are returned to
the requester via HTTP, this load balancer would alleviate the
load required on the swift proxies.</para>
<section xml:id="utilization-multi-site"><title>Utilization</title>
<para>While constructing a multi-site OpenStack environment is the
goal of this guide, the real test is whether an application
can utilize it.</para>
<para>Identity is normally the first interface for the majority of
OpenStack users. Interacting with Keystone is required for
almost all major operations within OpenStack. Therefore, it is
important to ensure that you provide users with a single URL
for Keystone authentication. Equally important is proper
documentation and configuration of regions within Keystone.
Each of the sites defined in your installation is considered
to be a region in Keystone nomenclature. This is important for
the users of the system, when reading Keystone documentation,
as it is required to define the Region name when providing
actions to an API endpoint or in Horizon.</para>
<para>Load balancing is another common issue with multi-site
installations. While it is still possible to run HAProxy
instances with Load-Balancer-as-a-Service (LBaaS), these will
be local to a specific region. Some applications may be able to cope
with this via internal mechanisms. Others, however, may
require the implementation of an external system including
global services load balancers or anycast-advertised
DNS.</para>
<para>Depending on the storage model chosen during site design,
storage replication and availability will also be a concern
for end-users. If an application is capable of understanding
regions, then it is possible to keep the object storage system
separated by region. In this case, users who want to have an
object available to more than one region will need to do the
cross-site replication themselves. With a centralized swift
proxy, however, the user may need to benchmark the replication
timing of the Swift back end. Benchmarking allows the
operational staff to provide users with an understanding of
the amount of time required for a stored or modified object to
become available to the entire environment.</para>
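<para>Where users handle that replication themselves,
container synchronization is one option; a sketch with the
swift client (the realm, cluster, account, container, and key
are placeholders that the operator defines in
<filename>container-sync-realms.conf</filename>):</para>
<screen><prompt>$</prompt> <userinput>swift post -t '//realm/site2/AUTH_tenant/backups' -k 'sync-secret' backups</userinput></screen></section>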
<section xml:id="performance"><title>Performance</title>
<para>Determining the performance of a multi-site installation
involves considerations that do not come into play in a
single-site deployment. Being a distributed deployment,
multi-site deployments incur a few extra penalties to
performance in certain situations.</para>
<para>Since multi-site systems can be geographically separated,
they may have worse than normal latency or jitter when
communicating across regions. This can especially impact
systems like the OpenStack Identity service when making
authentication attempts from regions that do not contain the
centralized Keystone implementation. It can also affect
certain applications which rely on remote procedure call (RPC)
for normal operation. An example of this can be seen in High
Performance Computing workloads.</para>
<para>Storage availability can also be impacted by the
architecture of a multi-site deployment. A centralized Object
Storage Service requires more time for an object to be
available to instances locally in regions where the object was
not created. Some applications may need to be tuned to account
for this effect. Block storage does not currently have a
method for replicating data across multiple regions, so
applications that depend on available block storage will need
to manually cope with this limitation by creating duplicate
block storage entries in each region.</para></section>
<section xml:id="security-multi-site"><title>Security</title>
<para>Securing a multi-site OpenStack installation also brings
extra challenges. Tenants may expect a tenant-created network
to be secure. In a multi-site installation the use of a
non-private connection between sites may be required. This may
mean that traffic would be visible to third parties and, in
cases where an application requires security, this issue will
require mitigation. Installing a VPN or encrypted connection
between sites is recommended in such instances.</para>
<para>Another security consideration with regard to multi-site
deployments is Identity. Authentication in a multi-site
deployment should be centralized. Centralization provides a
single authentication point for users across the deployment,
as well as a single point of administration for traditional
create, read, update and delete operations. Centralized
authentication is also useful for auditing purposes because
all authentication tokens originate from the same
source.</para>
<para>Just as tenants in a single-site deployment need isolation
from each other, so do tenants in multi-site installations.
The extra challenges in multi-site designs revolve around
ensuring that tenant networks function across regions.
Unfortunately, OpenStack Networking does not presently support
a mechanism to provide this functionality; therefore, an
external system may be necessary to manage these mappings.
Tenant networks may contain sensitive information requiring
that this mapping be accurate and consistent to ensure that a
tenant in one site does not connect to a different tenant in
another site.</para></section>
<section xml:id="openstack-components-multi-site"><title>OpenStack Components</title>
<para>Most OpenStack installations require a bare minimum set of
pieces to function. These include Keystone for authentication,
Nova for compute, Glance for image storage, Neutron for
networking, and potentially an object store in the form of
Swift. Bringing multi-site into play also demands extra
components in order to coordinate between regions. Centralized
Keystone is necessary to provide the single authentication
point. Centralized Horizon is also recommended to provide a
single login point and a mapped experience to the API and CLI
options available. If necessary, a centralized Swift may be
used and will require the installation of the Swift proxy
service.</para>
<para>It may also be helpful to install a few extra options in
order to facilitate certain use cases. For instance,
installing Designate may assist in automatically generating
DNS domains for each region with an automatically-populated
zone full of resource records for each instance. This
facilitates using DNS as a mechanism for determining which
region would be selected for certain applications.</para>
<para>Another useful tool for managing a multi-site installation
is Heat. Heat allows the use of templates to define a set of
instances to be launched together or for scaling existing
sets. It can also be used to set up matching or differentiated
groupings based on regions. For instance, if an application
requires an equally balanced number of nodes across sites, the
same heat template can be used to cover each site with small
alterations to only the region name.</para></section>
</section>


@ -0,0 +1,213 @@
<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="user-requirements-multi-site">
<?dbhtml stop-chunking?>
<title>User Requirements</title>
<para>A multi-site architecture is complex and has its own
risks and considerations; therefore, when contemplating the
design of such an architecture, it is important to make sure
it meets the user and business requirements.</para>
<para>Many jurisdictions have legislative and regulatory
requirements governing the storage and management of data in
cloud environments. Common areas of regulation include:</para>
<itemizedlist>
<listitem>
<para>Data retention policies ensuring storage of
persistent data and records management to meet data
archival requirements.</para>
</listitem>
<listitem>
<para>Data ownership policies governing the possession and
responsibility for data.</para>
</listitem>
<listitem>
<para>Data sovereignty policies governing the storage of
data in foreign countries or otherwise separate
jurisdictions.</para>
</listitem>
<listitem>
<para>Data compliance policies governing types of
information that need to reside in certain locations
due to regulatory issues and, more importantly, cannot
reside in other locations for the same reason.</para>
</listitem>
</itemizedlist>
<para>Examples of such legal frameworks include the data
protection framework of the European Union
(http://ec.europa.eu/justice/data-protection) and the
requirements of the Financial Industry Regulatory Authority
(http://www.finra.org/Industry/Regulation/FINRARules) in the
United States. Consult a local regulatory body for more
information.</para>
<section xml:id="workload-characteristics"><title>Workload Characteristics</title>
<para>The expected workload is a critical requirement that needs
to be captured to guide decision-making. An understanding of
the workloads in the context of the desired multi-site
environment and use case is important. Another way of thinking
about a workload is to think of it as the way the systems are
used. A workload could be a single application or a suite of
applications that work together. It could also be a duplicate
set of applications that need to run in multiple cloud
environments. Often in a multi-site deployment the same
workload will need to work identically in more than one
physical location.</para>
<para>This multi-site scenario likely includes one or more of the
other scenarios in this book with the additional requirement
of having the workloads in two or more locations. The
following are some possible scenarios:</para>
<para>For many use cases the proximity of the user to their
workloads has a direct influence on the performance of the
application and therefore should be taken into consideration
in the design. Certain applications require zero to minimal
latency that can only be achieved by deploying the cloud in
multiple locations. These locations could be in different data
centers, cities, countries or geographical regions, depending
on the user requirement and location of the users.</para></section>
<section xml:id="consistency-images-templates-across-sites">
<title>Consistency of images and templates across different
sites</title>
<para>It is essential that the deployment of instances is
consistent across the different sites. This needs to be built
into the infrastructure. If OpenStack Object Store is used as
a back end for Glance, it is possible to create repositories of
consistent images across multiple sites. Having a central
endpoint with multiple storage nodes will allow for a
consistent centralized storage for each and every site.</para>
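<para>A hedged sketch of one way to wire this up (the
configuration keys exist in
<filename>glance-api.conf</filename>; the endpoint URL and
credentials are placeholders) is to point each site's Image
Service at the shared Object Storage back end:</para>
<screen><prompt>#</prompt> <userinput>openstack-config --set /etc/glance/glance-api.conf DEFAULT default_store swift</userinput>
<prompt>#</prompt> <userinput>openstack-config --set /etc/glance/glance-api.conf DEFAULT swift_store_auth_address http://identity.example.com:5000/v2.0/</userinput></screen>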
<para>Not using a centralized object store increases the
operational overhead of maintaining a consistent image
library. This could include development of a replication
mechanism to handle the transport of images and the changes to
the images across multiple sites.</para></section>
<section xml:id="high-availability-multi-site"><title>High Availability</title>
<para>If high availability is a requirement for providing
continuous infrastructure operations, the required level of
high availability should be defined.</para>
<para>The OpenStack management components need to have a basic and
minimal level of redundancy. The simplest example is that the
loss of any single site should have no significant impact on the
availability of the OpenStack services of the entire
infrastructure.</para>
<para>The OpenStack High Availability Guide
(http://docs.openstack.org/high-availability-guide/content/)
contains more information on how to provide redundancy for the
OpenStack components.</para>
<para>Multiple network links should be deployed between sites to
provide redundancy for all components. This includes storage
replication, which should be isolated to a dedicated network
or VLAN with the ability to assign QoS to control the
replication traffic or provide priority for this traffic. Note
that if the data store is highly changeable, the network
requirements could have a significant effect on the
operational cost of maintaining the sites.</para>
<para>The ability to maintain object availability in both sites
has significant implications on the object storage design and
implementation. It will also have a significant impact on the
WAN network design between the sites.</para>
<para>Connecting more than two sites increases the challenges and
adds more complexity to the design considerations. Multi-site
implementations require extra planning to address the
additional topology complexity used for internal and external
connectivity. Some options include full mesh, hub-and-spoke,
spine-leaf, and 3D torus topologies.</para>
<para>Not all the applications running in a cloud are cloud-aware.
If that is the case, there should be clear measures and
expectations to define what the infrastructure can support
and, more importantly, what it cannot. An example would be
shared storage between sites. It is possible; however, such a
solution is not native to OpenStack and requires a third-party
hardware vendor to fulfill such a requirement. Another example
can be seen in applications that are able to consume resources
in object storage directly. These applications need to be
cloud aware to make good use of an OpenStack Object
Store.</para></section>
<section xml:id="application-readiness"><title>Application readiness</title>
<para>Some applications are tolerant of the lack of synchronized
object storage, while others may need those objects to be
replicated and available across regions. Understanding of how
the cloud implementation impacts new and existing applications
is important for risk mitigation and the overall success of a
cloud project. Applications may have to be written to expect
an infrastructure with little to no redundancy. Existing
applications not developed with the cloud in mind may need to
be rewritten.</para></section>
<section xml:id="cost-multi-site"><title>Cost</title>
<para>The requirement of having more than one site has a cost
attached to it. The greater the number of sites, the greater
the cost and complexity. Costs can be broken down into the
following categories:</para>
<itemizedlist>
<listitem>
<para>Compute resources</para>
</listitem>
<listitem>
<para>Networking resources</para>
</listitem>
<listitem>
<para>Replication</para>
</listitem>
<listitem>
<para>Storage</para>
</listitem>
<listitem>
<para>Management</para>
</listitem>
<listitem>
<para>Operational costs</para>
</listitem>
</itemizedlist></section>
<section xml:id="site-loss-and-recovery"><title>Site Loss and Recovery</title>
<para>Outages can cause loss of partial or full functionality of a
site. Strategies should be implemented to understand and plan
for recovery scenarios.</para>
<itemizedlist>
<listitem>
<para>The deployed applications need to continue to
function and, more importantly, consideration should
be taken of the impact on the performance and
reliability of the application when a site is
unavailable.</para>
</listitem>
<listitem>
<para>It is important to understand what will happen to the
replication of objects and data between the sites when
a site goes down. If this causes queues to start
building up, consider how long these queues can
safely exist before something fails.</para>
</listitem>
<listitem>
<para>Determine the method for resuming proper
operations of a site when it comes back online
after a disaster. It is recommended to architect the
recovery to avoid race conditions.</para>
</listitem>
</itemizedlist></section>
<section xml:id="compliance-and-geo-location-multi-site"><title>Compliance and Geo-location</title>
<para>An organization may have legal obligations and
regulatory compliance measures that require certain
workloads or data not to be located in certain regions.</para></section>
<section xml:id="auditing-multi-site"><title>Auditing</title>
<para>A well thought-out auditing strategy is important in order
to be able to quickly track down issues. Keeping track of
changes made to security groups and tenant changes can be
useful in rolling back the changes if they affect production.
For example, if all security group rules for a tenant
disappeared, the ability to quickly track down the issue would
be important for operational and legal reasons.</para></section>
<section xml:id="separation-of-duties"><title>Separation of duties</title>
<para>A common requirement is to define different roles for the
different cloud administration functions. An example would be
a requirement to segregate the duties and permissions by
site.</para></section>
<section xml:id="authentication-between-sites">
<title>Authentication between sites</title>
<para>Ideally, use a single authentication domain
rather than a separate implementation for each and every
site. This will, of course, require an authentication
mechanism that is highly available and distributed to ensure
continuous operation. Authentication server locality might
also be required and should be planned
for.</para></section>
</section>

@ -0,0 +1,215 @@
<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="architecture-network-focus">
<title>Architecture</title>
<para>Network-focused OpenStack architectures have many
similarities to other OpenStack architecture use cases. There
are, however, a number of very specific considerations to keep
in mind when designing for a network-centric or network-heavy
application environment.</para>
<para>Networks exist to serve as a medium for transporting data
between systems. It is inevitable that an OpenStack design
has interdependencies with non-network portions of OpenStack
as well as on external systems. Depending on the specific
workload, there may be major interactions with storage systems
both within and external to the OpenStack environment. For
example, if the workload is a content delivery network, then
the interactions with storage will be two-fold. There will be
traffic flowing to and from the storage array for ingesting
and serving content in a north-south direction. In addition,
there is replication traffic flowing in an east-west
direction.</para>
<para>Compute-heavy workloads may also induce interactions with
the network. Some high performance compute applications
require network-based memory mapping and data sharing and, as
a result, will induce a higher network load when they transfer
results and data sets. Others may be highly transactional and
issue transaction locks, perform their functions and rescind
transaction locks at very high rates. This also has an impact
on the network performance.</para>
<para>Some network dependencies are going to be external to
OpenStack. While Neutron is capable of providing network
ports, IP addresses, some level of routing, and overlay
networks, there are some other functions that it cannot
provide. For many of these, external systems or equipment may
be required to fill in the functional gaps. Hardware load
balancers are an example of equipment that may be necessary to
distribute workloads or offload certain functions. Note that,
as of the Icehouse release, dynamic routing is in its
infancy within OpenStack and may need to be implemented
either by an external device or a specialized service instance
within OpenStack. Tunneling is a feature provided by Neutron,
however it is constrained to a Neutron-managed region. If the
need arises to extend a tunnel beyond the OpenStack region to
either another region or an external system, it is necessary
to implement the tunnel itself outside OpenStack or by using a
tunnel management system to map the tunnel or overlay to an
external tunnel. OpenStack does not currently provide quotas
for network resources. Where network quotas are required, it
is necessary to implement quality of service management
outside of OpenStack. In many of these instances, similar
solutions for traffic shaping or other network functions will
be needed.</para>
<para>Depending on the selected design, Neutron itself may not
even support the required layer 3 network functionality. If it
is necessary or advantageous to use the provider networking
mode of Neutron without running the layer 3 agent, then an
external router will be required to provide layer 3
connectivity to outside systems.</para>
<para>Interaction with orchestration services is inevitable in
larger-scale deployments. Heat is capable of allocating
network resources defined in templates to map to tenant
networks and for port creation, as well as allocating floating
IPs. If there is a requirement to define and manage network
resources using orchestration, it is recommended that the
design include OpenStack Orchestration to meet the demands of
users.</para>
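<para>The following is a minimal sketch of such a template, shown
here as the JSON-style Python dictionary that Heat accepts;
the resource names and the external network UUID are
illustrative placeholders, not values mandated by
OpenStack:</para>
<programlisting language="python"># A minimal HOT template sketch: a tenant network, a port on it, and
# a floating IP bound to that port. All names and the external
# network UUID are hypothetical placeholders.
template = {
    "heat_template_version": "2013-05-23",
    "resources": {
        "app_net": {"type": "OS::Neutron::Net"},
        "app_subnet": {
            "type": "OS::Neutron::Subnet",
            "properties": {
                "network_id": {"get_resource": "app_net"},
                "cidr": "10.0.10.0/24",
            },
        },
        "app_port": {
            "type": "OS::Neutron::Port",
            "properties": {"network_id": {"get_resource": "app_net"}},
        },
        "app_floating_ip": {
            "type": "OS::Neutron::FloatingIP",
            "properties": {
                "floating_network_id": "EXTERNAL-NET-UUID",
                "port_id": {"get_resource": "app_port"},
            },
        },
    },
}</programlisting>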
<section xml:id="desing-impacts"><title>Design Impacts</title>
<para>A wide variety of factors can affect a network focused
OpenStack architecture. While there are some considerations
shared with a general use case, specific workloads related to
network requirements will influence network design
decisions.</para>
<para>One decision includes whether or not to use Network Address
Translation (NAT) and where to implement it. If there is a
requirement for floating IPs to be available instead of using
public fixed addresses then NAT is required. This can be seen
in network management applications that rely on an IP
endpoint. An example of this is a DHCP relay that needs to
know the IP of the actual DHCP server. In these cases it is
easier to automate the infrastructure to apply the target IP
to a new instance rather than reconfigure legacy or external
systems for each new instance.</para>
<para>NAT for floating IPs managed by Neutron will reside within
the hypervisor but there are also versions of NAT that may be
running elsewhere. If there is a shortage of IPv4 addresses
there are two common methods to mitigate this externally to
OpenStack. The first is to run a load balancer either within
OpenStack as an instance, or to use an external load balancing
solution. In the internal scenario, load balancing software,
such as HAProxy, can be managed with Neutron's Load Balancer
as a Service (LBaaS), which specifically manages the
virtual IPs (VIPs) while a dual-homed connection from the
HAProxy instance connects the public network with the tenant
private network that hosts all of the content servers. In the
external scenario, a load balancer would need to serve the VIP
and also be joined to the tenant overlay network through
external means or routed to it via private addresses.</para>
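<para>As a rough illustration of the internal scenario, the
following Python sketch drives Neutron LBaaS with
python-neutronclient; the credentials, addresses, and UUIDs
are assumed placeholders:</para>
<programlisting language="python"># Sketch: create a round-robin HTTP pool, register a content server,
# and expose a VIP on the tenant subnet via Neutron LBaaS.
from neutronclient.v2_0 import client

neutron = client.Client(username="admin", password="secret",
                        tenant_name="demo",
                        auth_url="http://keystone.example.com:5000/v2.0")

pool = neutron.create_pool({"pool": {
    "name": "web-pool",
    "protocol": "HTTP",
    "lb_method": "ROUND_ROBIN",
    "subnet_id": "TENANT-SUBNET-UUID"}})["pool"]

# Each content server on the private tenant network becomes a member.
neutron.create_member({"member": {
    "pool_id": pool["id"],
    "address": "10.0.10.11",
    "protocol_port": 80}})

# The VIP is the single address the load balancer answers on.
neutron.create_vip({"vip": {
    "name": "web-vip",
    "protocol": "HTTP",
    "protocol_port": 80,
    "pool_id": pool["id"],
    "subnet_id": "TENANT-SUBNET-UUID"}})</programlisting>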
<para>Another kind of NAT that may be useful is protocol NAT. In
some cases it may be desirable to use only IPv6 addresses on
instances and operate either an instance or an external
service to provide a NAT-based transition technology such as
NAT64 and DNS64. This provides the ability to have a globally
routable IPv6 address while only consuming IPv4 addresses as
necessary or in a shared manner.</para>
<para>Application workloads will affect the design of the
underlying network architecture. If a workload requires
network-level redundancy, the routing and switching
architecture will have to accommodate this. There are
differing methods for providing this that are dependent on the
network hardware selected, the performance of the hardware,
and which networking model is deployed. Some examples of this
are the use of Link aggregation (LAG) or Hot Standby Router
Protocol (HSRP). There are also the considerations of whether
to deploy Neutron or Nova-network and which plug-in to select
for Neutron. If using an external system, Neutron will need to
be configured to run layer 2 with a provider network
configuration. For example, it may be necessary to implement
HSRP to terminate layer 3 connectivity.</para>
<para>Depending on the workload, overlay networks may or may not
be a recommended configuration. Where application network
connections are small, short lived or bursty, running a
dynamic overlay can generate as much bandwidth as the packets
it carries. It also can induce enough latency to cause issues
with certain applications. There is an impact to the device
generating the overlay which, in most installations, will be
the hypervisor. This will cause performance degradation on
packet per second and connection per second rates.</para>
<para>Overlays also come with a secondary option that may or may
not be appropriate to a specific workload. While all of them
will operate in full mesh by default, there might be good
reasons to disable this function because it may cause
excessive overhead for some workloads. Conversely, other
workloads will operate without issue. For example, most web
services applications will not have major issues with a full
mesh overlay network, while some network monitoring tools or
storage replication workloads will have performance issues
with throughput or excessive broadcast traffic.</para>
<para>A design decision that many overlook is a choice of layer 3
protocols. While OpenStack was initially built with only IPv4
support, Neutron now supports IPv6 and dual-stacked networks.
Note that, as of the Icehouse release, this only includes
stateless address autoconfiguration, but work is in
progress to support stateless and stateful DHCPv6 as well as
IPv6 floating IPs without NAT. Some workloads become possible
through the use of IPv6 and IPv6 to IPv4 reverse transition
mechanisms such as NAT64 and DNS64 or 6to4, because these
options are available. This will alter the requirements for
any address plan as single-stacked and transitional IPv6
deployments can alleviate the need for IPv4 addresses.</para>
<para>As of the Icehouse release, OpenStack has limited support
for dynamic routing, however there are a number of options
available by incorporating third party solutions to implement
routing within the cloud including network equipment, hardware
nodes, and instances. Some workloads will perform well with
nothing more than static routes and default gateways
configured at the layer 3 termination point. In most cases
this will suffice, however some cases require the addition of
at least one type of dynamic routing protocol if not multiple
protocols. Having a form of interior gateway protocol (IGP)
available to the instances inside an OpenStack installation
opens up the possibility of use cases for anycast route
injection for services that need to use it as a geographic
location or failover mechanism. Other applications may wish to
directly participate in a routing protocol, either as a
passive observer as in the case of a looking glass, or as an
active participant in the form of a route reflector. Since an
instance might have a large amount of compute and memory
resources, it is trivial to hold an entire unpartitioned
routing table and use it to provide services such as network
path visibility to other applications or as a monitoring
tool.</para>
<para>A lesser known, but harder to diagnose, issue is that of
path Maximum Transmission Unit (MTU) failures. It is less an
optional design consideration and more a design warning: the
MTU must be at least large enough to handle normal traffic,
plus any overhead from an overlay network, and the desired
layer 3 protocol. Adding externally built tunnels further
reduces the usable packet size, making it imperative to pay
attention to the fully calculated MTU, as some systems may be
configured to ignore or drop path MTU discovery
packets.</para>
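<para>A back-of-the-envelope budget makes the warning concrete;
the overhead figures below assume GRE over IPv4 and are
typical rather than universal:</para>
<programlisting language="python"># Path MTU budget sketch: subtract overlay and tunnel overhead from
# the physical MTU to find the largest safe guest packet size.
physical_mtu = 1500
gre_overhead = 24        # 20-byte outer IPv4 header + 4-byte GRE header
external_tunnel = 24     # an externally built GRE tunnel on the same path

guest_mtu = physical_mtu - gre_overhead - external_tunnel
print("Guest MTU must not exceed %d bytes" % guest_mtu)  # 1452
# Systems that ignore or drop path MTU discovery packets will
# silently lose anything larger.</programlisting>
</section>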
<section xml:id="tunables">
<title>Tunable networking components</title>
<para>Configurable networking components to consider when designing
for network-intensive workloads include MTU and QoS. Some workloads
require a larger MTU than normal due to a requirement to transfer
large blocks of data. When providing network service for applications
such as video streaming or storage replication, it is recommended to
ensure that both OpenStack hardware nodes and the supporting network
equipment are configured for jumbo frames where possible. This allows
for better utilization of the available bandwidth, as estimated in the
sketch below. Configuration of jumbo frames should be done across the
complete path the packets will traverse. If one network component is
not capable of handling jumbo frames, then the entire path will revert
to the default MTU.</para>
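<para>The utilization gain is easy to estimate. The sketch below
compares per-packet protocol efficiency for standard and jumbo
frames, assuming 40 bytes of IPv4 plus TCP headers per packet and
ignoring Ethernet framing:</para>
<programlisting language="python"># Protocol efficiency: fraction of each packet that is payload.
def efficiency(mtu, header_bytes=40):
    payload = mtu - header_bytes
    return float(payload) / mtu

print("1500-byte MTU: %.1f%% payload" % (100 * efficiency(1500)))  # 97.3%
print("9000-byte MTU: %.1f%% payload" % (100 * efficiency(9000)))  # 99.6%
# Jumbo frames also cut the packet-per-second load for the same
# throughput by a factor of six.</programlisting>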
<para>Quality of Service (QoS) also has a great impact on
network-intensive workloads by expediting delivery of packets that
have a higher priority and are sensitive to poor network
performance. In applications such as Voice over IP (VoIP),
Differentiated Services Code Point (DSCP) markings are a near
requirement for
proper operation. QoS can also be used in the opposite direction for
mixed workloads to prevent low priority but high bandwidth
applications, for example backup services, video conferencing or
file sharing, from blocking bandwidth that is needed for the proper
operation of other workloads. It is possible to tag file storage
traffic as a lower class, such as best effort or scavenger, to allow
the higher priority traffic through. In cases where regions within a
cloud might be geographically distributed it may also be necessary
to plan accordingly to implement WAN optimization to combat latency
or packet loss.</para>
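<para>For reference, the sketch below pairs the traffic classes
discussed above with their standard DSCP values; the class names
on the left are illustrative examples, not an OpenStack API:</para>
<programlisting language="python"># Standard DSCP values for the traffic classes mentioned above.
DSCP_MARKINGS = {
    "voip-bearer": 46,        # EF, Expedited Forwarding
    "video-conference": 34,   # AF41, Assured Forwarding
    "web-traffic": 0,         # CS0, best effort
    "backup-replication": 8,  # CS1, scavenger class
}</programlisting>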
</section>
</section>

@ -0,0 +1,138 @@
<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="arch-guide-intro-network-focus">
<title>Introduction</title>
<para>All OpenStack deployments depend, to some extent, on
network communication in order to function properly due to
their service-based nature. In some cases, however, use cases
dictate that the network is elevated beyond simple
infrastructure. This section is a discussion of architectures
that are more reliant or focused on network services. These
architectures are heavily dependent on the network
infrastructure and need to be architected so that the network
services perform and are reliable in order to satisfy user and
application requirements.</para>
<para>Some possible use cases include:</para>
<itemizedlist>
<listitem>
<para>Content Delivery Network: This could include
streaming video, photographs or any other cloud based
repository of data that is distributed to a large
number of end users. Mass market streaming video will
be very heavily affected by the network configurations
that would affect latency, bandwidth, and the
distribution of instances. Not all video streaming is
consumer focused. For example, multicast videos (used
for media, press conferences, corporate presentations,
web conferencing services, etc.) can also utilize a
content delivery network. Content delivery will be
affected by the location of the video repository and
its relationship to end users. Performance is also
affected by network throughput of the backend systems,
as well as the WAN architecture and the cache
methodology.</para>
</listitem>
<listitem>
<para>Network Management Functions: A cloud that provides
network service functions would be built to support
the delivery of back-end network services such as DNS,
NTP or SNMP and would be used by a company for
internal network management.</para>
</listitem>
<listitem>
<para>Network Service Offerings: A cloud can be used to
run customer facing network tools to support services.
For example, VPNs, MPLS private networks, GRE tunnels
and others.</para>
</listitem>
<listitem>
<para>Web portals / Web Services: Web servers are a common
application for cloud services and it is recommended
to have an understanding of the network requirements.
The network will need to be able to scale out to meet
user demand and deliver webpages with a minimum of
latency. Internal east-west and north-south network
bandwidth must be considered depending on the details
of the portal architecture.</para>
</listitem>
<listitem>
<para>High Speed and High Volume Transactional Systems:
These types of applications are very sensitive to
network configurations. Examples include many
financial systems, credit card transaction
applications, trading and other extremely high volume
systems. These systems are sensitive to network jitter
and latency. They also have a high volume of both
east-west and north-south network traffic that needs
to be balanced to maximize efficiency of the data
delivery. Many of these systems have large high
performance database back ends that need to be
accessed.</para>
</listitem>
<listitem>
<para>High Availability: These types of use cases are
highly dependent on the proper sizing of the network
to maintain replication of data between sites for high
availability. If one site becomes unavailable, the
extra sites will be able to serve the displaced load
until the original site returns to service. It is
important to size network capacity to handle the loads
that are desired.</para>
</listitem>
<listitem>
<para>Big Data: Clouds that will be used for the
management and collection of big data (data ingest)
will have a significant demand on network resources.
Big data often uses partial replicas of the data to
maintain data integrity over large distributed clouds.
Other big data applications that require a large
amount of network resources are Hadoop, Cassandra,
NuoDB, Riak, and other NoSQL and distributed
databases.</para>
</listitem>
<listitem>
<para>Virtual Desktop Infrastructure (VDI): This use case
is very sensitive to network congestion, latency,
jitter and other network characteristics. Like video
streaming, the user experience is very important;
however, unlike video streaming, caching is not an
option to offset the network issues. VDI requires both
upstream and downstream traffic and cannot rely on
caching for the delivery of the application to the end
user.</para>
</listitem>
<listitem>
<para>Voice over IP (VoIP): This is extremely sensitive to
network congestion, latency, jitter and other network
characteristics. VoIP has a symmetrical traffic
pattern and it requires network quality of service
(QoS) for best performance. It may also require an
active queue management implementation to ensure
delivery. Users are very sensitive to latency and
jitter fluctuations and can detect them at very low
levels.</para>
</listitem>
<listitem>
<para>Video Conference / Web Conference: This also is
extremely sensitive to network congestion, latency,
jitter and other network flaws. Video Conferencing has
a symmetrical traffic pattern, but unless the network
is on an MPLS private network, it cannot use network
quality of service (QoS) to improve performance.
Similar to VoIP, users will be sensitive to network
performance issues even at low levels.</para>
</listitem>
<listitem>
<para>High Performance Computing (HPC): This is a complex
use case that requires careful consideration of the
traffic flows and usage patterns to address the needs
of cloud clusters. It has high east-west traffic
patterns for distributed computing, but there can be
substantial north-south traffic depending on the
specific application.</para>
</listitem>
</itemizedlist>
</section>

@ -0,0 +1,72 @@
<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="operational-considerations-networking-focus">
<?dbhtml stop-chunking?>
<title>Operational Considerations</title>
<para>Network-focused OpenStack clouds have a number of
operational considerations that will influence the selected
design. Topics including, but not limited to, dynamic routing
versus static routes, service level agreements, and ownership of
user management all need to be considered.</para>
<para>One of the first required decisions is the selection of a
telecom company or transit provider. This is especially true
if the network requirements include external or site-to-site
network connectivity.</para>
<para>Additional design decisions need to be made about monitoring
and alarming. These can be an internal responsibility or the
responsibility of the external provider. In the case of using
an external provider, SLAs will likely apply. In addition,
other operational considerations such as bandwidth, latency,
and jitter can be part of a service level agreement.</para>
<para>The ability to upgrade the infrastructure is another subject
for consideration. As demand for network resources increase,
operators will be required to add additional IP address blocks
and add additional bandwidth capacity. Managing hardware and
software life cycle events, for example upgrades,
decommissioning, and outages while avoiding service
interruptions for tenants, will also need to be
considered.</para>
<para>Maintainability will also need to be factored into the
overall network design. This includes the ability to manage
and maintain IP addresses as well as the use of overlay
identifiers including VLAN tag IDs, GRE tunnel IDs, and MPLS
tags. As an example, if all of the IP addresses have to be
changed on a network, a process known as renumbering, then the
design needs to support the ability to do so.</para>
<para>Network-focused applications themselves need to be addressed
when considering certain operational realities, for example,
the impending exhaustion of IPv4 addresses, the migration to
IPv6, and the use of private networks to segregate
different types of traffic that an application receives or
generates. In the case of IPv4 to IPv6 migrations,
applications should follow best practices for storing IP
addresses. It is further recommended to avoid relying on IPv4
features that were not carried over to the IPv6 protocol or
have differences in implementation.</para>
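<para>One such best practice is to store and parse addresses in a
version-agnostic way. A minimal sketch using the Python standard
library ipaddress module (available in Python 3.3 and later)
follows:</para>
<programlisting language="python"># Version-agnostic IP address handling: the same code path accepts
# IPv4 and IPv6 and stores a canonical form.
import ipaddress

def normalize(address):
    ip = ipaddress.ip_address(address)
    return ip.version, ip.compressed

print(normalize("192.0.2.10"))        # (4, '192.0.2.10')
print(normalize("2001:db8:0::000a"))  # (6, '2001:db8::a')</programlisting>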
<para>When using private networks to segregate traffic,
applications should create private tenant networks for
database and data storage network traffic, and utilize public
networks for client-facing traffic. By segregating this
traffic, quality of service and security decisions can be made
to ensure that each network has the correct level of service
that it requires.</para>
<para>Finally, decisions must be made about the routing of network
traffic. For some applications, a more complex policy
framework for routing must be developed. The economic cost of
transmitting traffic over expensive links versus cheaper
links, in addition to bandwidth, latency, and jitter
requirements, can be used to create a routing policy that will
satisfy business requirements.</para>
<para>How to respond to network events must also be taken into
consideration. As an example, how load is transferred from one
link to another during a failure scenario could be a factor in
the design. If network capacity is not planned correctly,
failover traffic could overwhelm other ports or network links
and create a cascading failure scenario. In this case, traffic
that fails over to one link overwhelms that link and then
moves to the subsequent links until all network traffic
stops.</para>
</section>

@ -0,0 +1,189 @@
<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="prescriptive-example-large-scale-web-app">
<?dbhtml stop-chunking?>
<title>Prescriptive Examples</title>
<para>A large-scale web application has been designed with cloud
principles in mind. The application is designed to scale
horizontally in a bursting fashion and will generate a high
instance count. The application requires an SSL connection to
secure data and must not lose connection state to individual
servers.</para>
<para>An example design for this workload is depicted in the
figure below. In this example, a hardware load balancer is
configured to provide SSL offload functionality and to connect
to tenant networks in order to reduce address consumption.
This load balancer is linked to the routing architecture as it
will service the VIP for the application. The router and load
balancer are configured with GRE tunnel ID of the
application's tenant network and provided an IP address within
the tenant subnet but outside of the address pool. This is to
ensure that the load balancer can communicate with the
application's HTTP servers without requiring the consumption
of a public IP address.</para>
<para>Since sessions must persist until closed, the routing and
switching architecture is designed for high availability.
Switches are meshed to each hypervisor and to each other, and
also provide an MLAG implementation to ensure layer 2
connectivity does not fail. Routers are configured with VRRP
and fully meshed with switches to ensure layer 3 connectivity.
Since GRE is used as an overlay network, Neutron is installed
and configured to use the Open vSwitch agent in GRE tunnel
mode. This ensures all devices can reach all other devices and
that tenant networks can be created for private addressing
links to the load balancer.</para>
<mediaobject>
<imageobject>
<imagedata
fileref="../images/Network_Web_Services1.png"
/>
</imageobject>
</mediaobject>
<para>A web service architecture has many options and optional
components. Due to this, it can fit into a large number of
other OpenStack designs; however, a few key components need
to be in place to handle the nature of most web-scale
workloads. The user needs the following components:</para>
<itemizedlist>
<listitem>
<para>OpenStack Controller services (Image, Identity,
Networking and supporting services such as MariaDB and
RabbitMQ)</para>
</listitem>
<listitem>
<para>OpenStack Compute running KVM hypervisor</para>
</listitem>
<listitem>
<para>OpenStack Object Storage</para>
</listitem>
<listitem>
<para>OpenStack Orchestration</para>
</listitem>
<listitem>
<para>OpenStack Telemetry</para>
</listitem>
</itemizedlist>
<para>Beyond the normal Keystone, Nova, Glance, and Swift
components, Heat is a recommended component to properly handle
scaling the workloads to adjust to demand. Ceilometer will
also need to be included in the design due to the requirement
for auto-scaling; a sketch of this wiring follows below. Web
services tend to be bursty in load, have
very defined peak and valley usage patterns and, as a result,
benefit from automatic scaling of instances based upon
traffic. At a network level, a split network configuration
will work well, with databases residing on private tenant
networks, since these do not emit a large quantity of broadcast
traffic, while the web tier may still need to interconnect
with the databases for content.</para>
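<para>The following is a sketch of what that auto-scaling wiring
can look like in a HOT template, again expressed as a JSON-style
Python dictionary; all names, thresholds, and the flavor and
image are assumed placeholders:</para>
<programlisting language="python"># Auto-scaling sketch: a scaling group of web servers that grows by
# one instance when average CPU utilization exceeds 80 percent.
scaling_template = {
    "heat_template_version": "2013-05-23",
    "resources": {
        "web_group": {
            "type": "OS::Heat::AutoScalingGroup",
            "properties": {
                "min_size": 2,
                "max_size": 20,
                "resource": {
                    "type": "OS::Nova::Server",
                    "properties": {"flavor": "m1.small",
                                   "image": "web-server-image"},
                },
            },
        },
        "scale_out_policy": {
            "type": "OS::Heat::ScalingPolicy",
            "properties": {
                "auto_scaling_group_id": {"get_resource": "web_group"},
                "adjustment_type": "change_in_capacity",
                "scaling_adjustment": 1,
                "cooldown": 60,
            },
        },
        "cpu_alarm_high": {
            "type": "OS::Ceilometer::Alarm",
            "properties": {
                "meter_name": "cpu_util",
                "statistic": "avg",
                "period": 60,
                "evaluation_periods": 1,
                "threshold": 80,
                "comparison_operator": "gt",
                "alarm_actions": [
                    {"get_attr": ["scale_out_policy", "alarm_url"]}],
            },
        },
    },
}</programlisting>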
<section xml:id="load-balancing"><title>Load Balancing</title>
<para>Load balancing was included in this design to spread
requests across multiple instances. This workload scales well
horizontally across large numbers of instances. This allows
instances to run without publicly routed IP addresses and
simply rely on the load balancer for the service to be
globally reachable. Many of these services do not require
direct server return. This aids in address planning and
utilization at scale since only the virtual IP (VIP) must be
public.</para></section>
<section xml:id="overlay-networks"><title>Overlay Networks</title>
<para>OpenStack Networking using the Open vSwitch GRE tunnel mode
was included in the design to provide overlay functionality.
In this case, the layer 3 external routers will be in a pair
with VRRP and switches should be paired with an implementation
of MLAG running to ensure that there is no loss of
connectivity with the upstream routing infrastructure.</para></section>
<section xml:id="performance-tuning"><title>Performance Tuning</title>
<para>Network level tuning for this workload is minimal.
Quality-of-Service (QoS) will be applied to these workloads
for a middle ground Class Selector depending on existing
policies. It will be higher than a best effort queue but lower
than an Expedited Forwarding or Assured Forwarding queue.
Since this type of application generates larger packets with
longer-lived connections, bandwidth utilization can be
optimized for long-duration TCP sessions. Normal bandwidth planning
applies here: benchmark a session's usage and multiply
by the expected number of concurrent sessions, adding
overhead.</para></section>
<section xml:id="network-functions"><title>Network Functions</title>
<para>Network functions is a broad category but encompasses
workloads that support the rest of a system's network. These
workloads tend to consist of large amounts of small packets
that are very short lived, such as DNS queries or SNMP traps.
These messages need to arrive quickly and do not deal well with
packet loss, as there can be a very large volume of them. There
are a few extra considerations to take into account for this
type of workload and this can change a configuration all the
way to the hypervisor level. For an application that generates
10 TCP sessions per user with an average bandwidth of 512
kilobytes per second per user and an expected count of ten
thousand concurrent users, the expected bandwidth plan is
approximately 4.88 gigabytes per second, as computed below.</para>
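<para>The arithmetic behind that figure, assuming binary units and
that the 512 kilobytes per second is each user's total across its
ten sessions, is spelled out below:</para>
<programlisting language="python"># Aggregate bandwidth plan: 512 KB/s per user across ten thousand
# concurrent users (the 10 TCP sessions split, not multiply, the
# assumed per-user bandwidth).
per_user_bytes = 512 * 1024
concurrent_users = 10000

total_bytes = per_user_bytes * concurrent_users
print("Aggregate: %.2f GB/s" % (total_bytes / float(2 ** 30)))  # ~4.88
print("Aggregate: %.1f Gbit/s" % (total_bytes * 8 / 1e9))       # ~41.9</programlisting>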
<para>The supporting network for this type of configuration needs
to have a low latency and evenly distributed availability.
This workload benefits from having services local to the
consumers of the service. A multi-site approach is used as
well as deploying many copies of the application to handle
load as close as possible to consumers. Since these
applications function independently, they do not warrant
running overlays to interconnect tenant networks. Overlays
also have the drawback of performing poorly with rapid flow
setup and may incur too much overhead with large quantities of
small packets and are therefore not recommended.</para>
<para>QoS is desired for some workloads to ensure delivery. DNS
has a major impact on the load times of other services and
needs to be reliable and provide rapid responses. It is
recommended to configure rules in upstream devices to apply a higher Class
Selector to DNS to ensure faster delivery or a better spot in
queuing algorithms.</para></section>
<section xml:id="cloud-storage"><title>Cloud Storage</title>
<para>Another common use case for OpenStack environments is to
provide a cloud based file storage and sharing service. While
this may initially be considered to be a storage focused use
case there are also major requirements on the network side
that place it in the realm of requiring a network focused
architecture. An example for this application is cloud
backup.</para>
<para>There are two specific behaviors of this workload that have
major and different impacts on the network. Since this is both
an externally facing service and internally replicating
application there are both North-South and East-West traffic
considerations.</para>
<para>North-South traffic is primarily user facing. This means
that when a user uploads content for storage it will be coming
into the OpenStack installation. Users who download this
content will be drawing traffic from the OpenStack
installation. Since the service is intended primarily as a
backup, the majority of the traffic will be southbound into the
environment. In this case it is beneficial to configure a
network to be asymmetric downstream as the traffic entering
the OpenStack installation will be greater than traffic
leaving.</para>
<para>East-West traffic is likely to be fully symmetric. Since
replication will originate from any node and may target
multiple other nodes algorithmically, it is less likely for
this traffic to have a larger volume in any specific
direction. However this traffic may interfere with north-south
traffic.</para>
<mediaobject>
<imageobject>
<imagedata
fileref="../images/Network_Cloud_Storage2.png"
/>
</imageobject>
</mediaobject>
<para>This application will prioritize the North-South traffic
over East-West traffic, as it is the customer-facing data. QoS
is implemented on East-West traffic as a lower priority
Class Selector, while North-South traffic requires a higher
level in the priority queue.</para>
<para>The network design in this case is less dependent on
availability and more dependent on being able to handle high
bandwidth. As a direct result, it is beneficial to forego
redundant links in favor of bonding those connections. This
increases available bandwidth. It is also beneficial to
configure all devices in the path, including OpenStack, to
generate and pass jumbo frames.</para></section>
</section>

@ -0,0 +1,402 @@
<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="technical-considerations-network-focus">
<?dbhtml stop-chunking?>
<title>Technical Considerations</title>
<para>Designing an OpenStack network architecture involves a
combination of layer 2 and layer 3 considerations. Layer 2
decisions involve those made at the data-link layer, such as
the decision to use Ethernet versus Token Ring. Layer 3
decisions involve those made about the protocol layer and the point
which IP comes into the picture. As an example, a completely
internal OpenStack network can exist at layer 2 and ignore
layer 3; however, in order for any traffic to go outside of
that cloud, to another network, or to the Internet, a layer 3
router or switch must be involved.</para>
<para>The past few years have seen two competing trends in
networking. One trend is towards building data center network
architectures based on layer 2 networking; the other is to
treat the cloud environment essentially as a miniature version
of the Internet. The latter represents a radically different
approach to network architecture from what is commonly
deployed today, because the Internet is based entirely on
layer 3 routing rather than layer 2 switching.</para>
<para>In the data center context, there are advantages to
designing the network on layer 2 protocols rather than layer
3. In spite of the difficulties of using a bridge to perform
the network role of a router, many vendors, customers, and
service providers are attracted to the idea of using Ethernet
in as many parts of their networks as possible. The benefits
of selecting a layer 2 design are:</para>
<itemizedlist>
<listitem>
<para>Ethernet frames contain all the essentials for
networking. These include, but are not limited to,
globally unique source addresses, globally unique
destination addresses, and error control.</para>
</listitem>
<listitem>
<para>Ethernet frames can carry any kind of packet.
Networking at layer 2 is independent of the layer 3
protocol.</para>
</listitem>
<listitem>
<para>More layers added to the Ethernet frame only slow
the networking process down. This is known as 'nodal
processing delay'.</para>
</listitem>
<listitem>
<para>Adjunct networking features, for example class of
service (CoS) or multicasting, can be added to
Ethernet as readily as IP networks.</para>
</listitem>
<listitem>
<para>VLANs are an easy mechanism for isolating
networks.</para>
</listitem>
</itemizedlist>
<para>Most information starts and ends inside Ethernet frames.
Today this applies to data, voice (for example, VoIP) and
video (for example, web cameras). The concept is that, if more
of the end-to-end transfer of information from a source to a
destination can be done in the form of Ethernet frames, more
of the benefits of Ethernet can be realized on the network.
Though it is not a substitute for IP networking, networking at
layer 2 can be a powerful adjunct to IP networking.</para>
<para>The basic reasoning behind using layer 2 Ethernet over layer
3 IP networks is the speed, the reduced overhead of the IP
hierarchy, and the lack of requirement to keep track of IP
address configuration as systems are moved around. Whereas the
simplicity of layer 2 protocols might work well in a data
center with hundreds of physical machines, cloud data centers
have the additional burden of needing to keep track of all
virtual machine addresses and networks. In these data centers,
it is not uncommon for one physical node to support 30-40
instances.</para>
<note>
<para>Networking at the frame level says nothing
about the presence or absence of IP addresses at the packet
level. Almost all ports, links, and devices on a network of
LAN switches still have IP addresses, as do all the source and
destination hosts. There are many reasons for the continued
need for IP addressing. The largest one is the need to manage
the network. A device or link without an IP address is usually
invisible to most management applications. Utilities including
remote access for diagnostics, file transfer of configurations
and software, and similar applications require IP addresses as
well as MAC addresses in order to run.</para>
</note>
<section xml:id="layer-2-arch-limitations"><title>Layer 2 Architecture Limitations</title>
<para>Outside of the traditional data center the limitations of
layer 2 network architectures become more obvious.</para>
<itemizedlist>
<listitem>
<para>The number of VLANs is limited to 4096.</para>
</listitem>
<listitem>
<para>The number of MACs stored in switch tables is
limited.</para>
</listitem>
<listitem>
<para>The need to maintain a set of layer 4 devices to
handle traffic control must be accommodated.</para>
</listitem>
<listitem>
<para>MLAG, often used for switch redundancy, is a
proprietary solution that does not scale beyond two
devices and forces vendor lock-in.</para>
</listitem>
<listitem>
<para>It can be difficult to troubleshoot a network
without IP addresses and ICMP.</para>
</listitem>
<listitem>
<para>Configuring ARP is considered complicated on large
layer 2 networks.</para>
</listitem>
<listitem>
<para>All network devices need to be aware of all MACs,
even instance MACs, so there is constant churn in MAC
tables and network state changes as instances are
started or stopped.</para>
</listitem>
<listitem>
<para>Migrating MACs (instance migration) to different
physical locations is a potential problem if ARP
table timeouts are not set properly.</para>
</listitem>
</itemizedlist>
<para>It is important to know that layer 2 has a very limited set
of network management tools. It is very difficult to control
traffic, as it does not have mechanisms to manage the network
or shape the traffic, and network troubleshooting is very
difficult. One reason for this difficulty is that network devices
have no IP addresses. As a result, there is no reasonable way
to check network delay in a layer 2 network.</para>
<para>On large layer 2 networks, configuring ARP learning can also
be complicated. The setting for the MAC address timer on
switches is critical and, if set incorrectly, can cause
significant performance problems. As an example, the Cisco
default MAC address timer is extremely long. Migrating MACs to
different physical locations to support instance migration can
be a significant problem. In this case, the network
information maintained in the switches could be out of sync
with the new location of the instance.</para>
<para>In a layer 2 network, all devices are aware of all MACs,
even those that belong to instances. The network state
information in the backbone changes whenever an instance is
started or stopped. As a result there is far too much churn in
the MAC tables on the backbone switches.</para></section>
<section xml:id="layer-3-arch-advantages"><title>Layer 3 Architecture Advantages</title>
<para>In the layer 3 case, there is no churn in the routing tables
due to instances starting and stopping. The only time there
would be a routing state change would be in the case of a Top
of Rack (ToR) switch failure or a link failure in the backbone
itself. Other advantages of using a layer 3 architecture
include:</para>
<itemizedlist>
<listitem>
<para>Layer 3 networks provide the same level of
resiliency and scalability as the Internet.</para>
</listitem>
<listitem>
<para>Controlling traffic with routing metrics is
straightforward.</para>
</listitem>
<listitem>
<para>Layer 3 can be configured to use BGP confederation
for scalability so core routers have state
proportional to the number of racks, not to the number of
servers or instances.</para>
</listitem>
<listitem>
<para>Routing keeps instance MAC and IP addresses
out of the network core, reducing state churn. Routing
state changes only occur in the case of a ToR switch
failure or backbone link failure.</para>
</listitem>
<listitem>
<para>There are a variety of well tested tools, for
example ICMP, to monitor and manage traffic.</para>
</listitem>
<listitem>
<para>Layer 3 architectures allow for the use of Quality
of Service (QoS) to manage network performance.</para>
</listitem>
</itemizedlist>
<section xml:id="layer-3-arch-limitations"><title>Layer 3 Architecture Limitations</title>
<para>The main limitation of layer 3 is that there is no built-in
isolation mechanism comparable to the VLANs in layer 2
networks. Furthermore, the hierarchical nature of IP addresses
means that an instance will also be on the same subnet as its
physical host. This means that it cannot be migrated outside
of the subnet easily. For these reasons, network
virtualization needs to use IP encapsulation and software at
the end hosts for both isolation, as well as for separation of
the addressing in the virtual layer from addressing in the
physical layer. Other potential disadvantages of layer 3
include the need to design an IP addressing scheme rather than
relying on the switches to automatically keep track of the MAC
addresses and to configure the interior gateway routing
protocol in the switches.</para></section></section>
<section xml:id="network-recommendations-overview">
<title>Network Recommendations Overview</title>
<para>OpenStack has complex networking requirements for several
reasons. Many components interact at different levels of the
system stack, which adds complexity. Data flows are complex.
Data in an OpenStack cloud moves both between instances across
the network (also known as East-West), as well as in and out
of the system (also known as North-South). Physical server
nodes have network requirements that are independent of those
used by instances which need to be isolated from the core
network to account for scalability. It is also recommended to
functionally separate the networks for security purposes and
tune performance through traffic shaping.</para>
<para>A number of important general technical and business factors
need to be taken into consideration when planning and
designing an OpenStack network. They include:</para>
<itemizedlist>
<listitem>
<para>A requirement for vendor independence. To avoid
hardware or software vendor lock-in, the design should
not rely on specific features of a vendor's router or
switch.</para>
</listitem>
<listitem>
<para>A requirement to massively scale the ecosystem to
support millions of end users.</para>
</listitem>
<listitem>
<para>A requirement to support indeterminate platforms and
applications.</para>
</listitem>
<listitem>
<para>A requirement to design for cost efficient
operations to take advantage of massive scale.</para>
</listitem>
<listitem>
<para>A requirement to ensure that there is no single
point of failure in the cloud ecosystem.</para>
</listitem>
<listitem>
<para>A requirement for high availability architecture to
meet customer SLA requirements.</para>
</listitem>
<listitem>
<para>A requirement to be tolerant of rack level
failure.</para>
</listitem>
<listitem>
<para>A requirement to maximize flexibility to architect
future production environments.</para>
</listitem>
</itemizedlist>
<para>Keeping all of these in mind, the following network design
recommendations can be made:</para>
<itemizedlist>
<listitem>
<para>Layer 3 designs are preferred over layer 2
architectures.</para>
</listitem>
<listitem>
<para>Design a dense multi-path network core to support
multi-directional scaling and flexibility.</para>
</listitem>
<listitem>
<para>Use hierarchical addressing because it is the only
viable option to scale the network ecosystem.</para>
</listitem>
<listitem>
<para>Use virtual networking to isolate instance service
network traffic from the management and internal
network traffic.</para>
</listitem>
<listitem>
<para>Isolate virtual networks using encapsulation
technologies.</para>
</listitem>
<listitem>
<para>Use traffic shaping for performance tuning.</para>
</listitem>
<listitem>
<para>Use eBGP to connect to the Internet up-link.</para>
</listitem>
<listitem>
<para>Use iBGP to flatten the internal traffic on the
layer 3 mesh.</para>
</listitem>
<listitem>
<para>Determine the most effective configuration for the
block storage network.</para>
</listitem>
</itemizedlist></section>
<section xml:id="additional-considerations-network-focus"><title>Additional Considerations</title>
<para>There are numerous topics to consider when designing a
network-focused OpenStack cloud.</para>
<section xml:id="openstack-networking-versus-nova-network"><title>OpenStack Networking versus Nova Network
Considerations</title>
<para>Selecting the type of networking technology to implement
depends on many factors. OpenStack Networking (Neutron) and
Nova Network both have their advantages and disadvantages.
They are both valid and supported options that fit different
use cases as described in the following table.</para></section>
<section xml:id="redundant-networking-tor-switch-ha"><title>Redundant Networking: ToR Switch High Availability
Risk Analysis</title>
<para>A technical consideration of networking is whether the
switching gear in the data center should be installed with
backup switches in case of hardware failure.</para>
<para>Research indicates that the mean time between failures (MTBF)
of switches is between 100,000 and 200,000 hours. This number is
dependent on the ambient temperature of the switch in the data
center. When properly cooled and maintained, this translates
to between 11 and 22 years before failure. Even in the worst
case of poor ventilation and high ambient temperatures in the
data center, the MTBF is still 2-3 years. This is based on
published research found at
http://www.garrettcom.com/techsupport/papers/ethernet_switch_reliability.pdf
and http://www.n-tron.com/pdf/network_availability.pdf</para>
<para>In most cases, it is much more economical to only use a
single switch with a small pool of spare switches to replace
failed units than it is to outfit an entire data center with
redundant switches. Applications should also be able to
tolerate rack level outages without affecting normal
operations since network and compute resources are easily
provisioned and plentiful.</para></section>
<section xml:id="preparing-for-future-ipv6-support"><title>Preparing for the future: IPv6 Support</title>
<para>One of the most important networking topics today is the
impending exhaustion of IPv4 addresses. In early 2014, ICANN
announced that they started allocating the final IPv4 address
blocks to the Regional Internet Registries
(http://www.internetsociety.org/deploy360/blog/2014/05/goodbye-ipv4-iana-starts-allocating-final-address-blocks/).
This means the IPv4 address space is close to being fully
allocated. As a result, it will soon become difficult to
allocate more IPv4 addresses to an application that has
experienced growth, or is expected to scale out, due to the
lack of unallocated IPv4 address blocks.</para>
<para>For network-focused applications the future is the IPv6
protocol. IPv6 increases the address space significantly,
fixes long-standing issues in the IPv4 protocol, and will
become essential for network-focused applications in the
future.</para>
<para>Neutron supports IPv6 when configured to take advantage of
the feature. To enable it, simply create an IPv6 subnet in
OpenStack Neutron and use IPv6 prefixes when creating security
groups, as sketched below.</para>
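<para>A minimal sketch with python-neutronclient follows; the
UUIDs, prefixes, and credentials are assumed placeholders:</para>
<programlisting language="python"># Sketch: add an IPv6 subnet to a tenant network, then allow inbound
# HTTP from an IPv6 prefix in a security group.
from neutronclient.v2_0 import client

neutron = client.Client(username="admin", password="secret",
                        tenant_name="demo",
                        auth_url="http://keystone.example.com:5000/v2.0")

neutron.create_subnet({"subnet": {
    "network_id": "TENANT-NET-UUID",
    "ip_version": 6,
    "cidr": "2001:db8:1::/64"}})

neutron.create_security_group_rule({"security_group_rule": {
    "security_group_id": "SECGROUP-UUID",
    "direction": "ingress",
    "ethertype": "IPv6",
    "protocol": "tcp",
    "port_range_min": 80,
    "port_range_max": 80,
    "remote_ip_prefix": "2001:db8::/32"}})</programlisting>
</section>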
<section xml:id="asymetric-links"><title>Asymmetric Links</title>
<para>When designing a network architecture, the traffic patterns
of an application will heavily influence the allocation of
total bandwidth and the number of links that are used to send
and receive traffic. Applications that provide file storage
for customers will allocate bandwidth and links to favor
incoming traffic, whereas video streaming applications will
allocate bandwidth and links to favor outgoing traffic.</para></section>
<section xml:id="performance-network-focus"><title>Performance</title>
<para>It is important to analyze the applications' tolerance for
latency and jitter when designing an environment to support
network focused applications. Certain applications, for
example VoIP, are less tolerant of latency and jitter. Where
latency and jitter are concerned, certain applications may
require tuning of QoS parameters and network device queues to
ensure that they are queued for transmit immediately or
guaranteed minimum bandwidth. Since OpenStack currently does
not support these functions, some considerations may need to
be made for the network plug-in selected.</para>
<para>The location of a service may also impact the application or
consumer experience. If an application is designed to serve
differing content to differing users it will need to be
designed to properly direct connections to those specific
locations. Use a multi-site installation for these situations,
where appropriate.</para>
<para>OpenStack networking can be implemented in two separate
ways. The legacy nova-network provides a flat DHCP network
with a single broadcast domain. This implementation does not
support tenant isolation networks or advanced plug-ins, but it
is currently the only way to implement a distributed layer 3
agent using the multi_host configuration. Neutron is the
official current implementation of OpenStack Networking. It
provides a pluggable architecture that supports a large
variety of network methods. Some of these include a layer 2
only provider network model, external device plug-ins, or even
OpenFlow controllers.</para>
<para>Networking at large scales becomes a set of boundary
questions. The determination of how large a layer 2 domain
needs to be is based on the amount of nodes within the domain
and the amount of broadcast traffic that passes between
instances. Breaking layer 2 boundaries may require the
implementation of overlay networks and tunnels. This decision
is a balancing act between the need for smaller overhead and
the need for a smaller broadcast domain.</para>
<para>When selecting network devices, be aware that making this
decision based on largest port density often comes with a
drawback. Aggregation switches and routers have not all kept
pace with Top of Rack switches and may induce bottlenecks on
north-south traffic. As a result, it may be possible for
massive amounts of downstream network utilization to impact
upstream network devices, impacting service to the cloud.
Since OpenStack does not currently provide a mechanism for
traffic shaping or rate limiting, it is necessary to implement
these features at the network hardware level.</para></section></section>
</section>

@ -0,0 +1,170 @@
<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="user-requirements-network-focus">
<?dbhtml stop-chunking?>
<title>User Requirements</title>
<para>Network-focused architectures vary from the general purpose
designs. They are heavily influenced by a specific subset of
applications that interact with the network in a more
demanding way. Some of the business requirements that will
influence the design include:</para>
<itemizedlist>
<listitem>
<para>User experience: User experience is impacted by
network latency through slow page loads, degraded
video streams, and low quality VoIP sessions. Users
are often not aware of how network design and
architecture affects their experiences. Both
enterprise customers and end-users rely on the network
for delivery of an application. Network performance
problems can result in a negative experience for the
end user, as well as productivity and economic loss.
</para>
</listitem>
<listitem>
<para>Regulatory requirements: Networks need to take into
consideration any regulatory requirements about the
physical location of data as it traverses the network.
For example, Canadian medical records cannot pass
outside of Canadian sovereign territory. Another
network consideration is maintaining network
segregation of private data flows and ensuring that
the network between cloud locations is encrypted where
required. Network architectures are affected by
regulatory requirements for encryption and protection
of data in flight as the data moves through various
networks.</para>
</listitem>
</itemizedlist>
<para>Many jurisdictions have legislative and regulatory
requirements governing the storage and management of data in
cloud environments. Common areas of regulation include:</para>
<itemizedlist>
<listitem>
<para>Data retention policies ensuring storage of
persistent data and records management to meet data
archival requirements.</para>
</listitem>
<listitem>
<para>Data ownership policies governing the possession and
responsibility for data.</para>
</listitem>
<listitem>
<para>Data sovereignty policies governing the storage of
data in foreign countries or otherwise separate
jurisdictions.</para>
</listitem>
<listitem>
<para>Data compliance policies governing where information
needs to reside in certain locations due to regulatory
issues and, more importantly, where it cannot reside
in other locations for the same reason.</para>
</listitem>
</itemizedlist>
<para>Examples of such legal frameworks include the data
protection framework of the European Union
(http://ec.europa.eu/justice/data-protection/) and the
requirements of the Financial Industry Regulatory Authority
(http://www.finra.org/Industry/Regulation/FINRARules) in the
United States. Consult a local regulatory body for more
information.</para>
<section xml:id="high-availability-issues-network-focus"><title>High Availability Issues</title>
<para>OpenStack installations with high demand on network
resources have high availability requirements that are
determined by the application and use case. Financial
transaction systems will have a much higher requirement for
high availability than a development application. Forms of
network availability, for example quality of service (QoS),
can be used to improve the network performance of sensitive
applications, for example VoIP and video streaming.</para>
<para>Often, high performance systems will have SLA requirements
for a minimum QoS with regard to guaranteed uptime, latency
and bandwidth. The level of the SLA can have a significant
impact on the network architecture and requirements for
redundancy in the systems.</para></section>
<section xml:id="risks-network-focus"><title>Risks</title>
<itemizedlist>
<listitem>
<para>Network Misconfigurations: Configuring incorrect IP
addresses, VLANs, and routes can cause outages to
areas of the network or, in the worst case scenario,
the entire cloud infrastructure. Misconfigurations can
cause disruptive problems, so configuration changes should
be automated to minimize the opportunity for operator error.</para>
</listitem>
<listitem>
<para>Capacity Planning: Cloud networks need to be managed
for capacity and growth over time. There is a risk
that the network will not grow to support the
workload. Capacity planning includes the purchase of
network circuits and hardware that can potentially
have lead times measured in months or more.</para>
</listitem>
<listitem>
<para>Network Tuning: Cloud networks need to be configured
to minimize link loss, packet loss, packet storms,
broadcast storms, and loops.</para>
</listitem>
<listitem>
<para>Single Point Of Failure (SPOF): High availability
must be taken into account even at the physical and
environmental layers. If there is a single point of
failure due to only one upstream link, or only one
power supply, an outage becomes unavoidable.</para>
</listitem>
<listitem>
<para>Complexity: An overly complex network design becomes
difficult to maintain and troubleshoot. While
automated tools that handle overlay networks or device
level configuration can mitigate this, non-traditional
interconnects between functions and specialized
hardware need to be well documented or avoided to
prevent outages.</para>
</listitem>
<listitem>
<para>Non-standard features: There are additional risks
that arise from configuring the cloud network to take
advantage of vendor specific features. One example is
multi-link aggregation (MLAG) that is being used to
provide redundancy at the aggregator switch level of
the network. MLAG is not a standard and, as a result,
each vendor has their own proprietary implementation
of the feature. MLAG architectures are not
interoperable across switch vendors, which leads to
vendor lock-in, and can cause delays or inability when
upgrading components.</para>
</listitem>
</itemizedlist></section>
<section xml:id="security-network-focus"><title>Security</title>
<para>Security is often overlooked or added after a design has
been implemented. Consider security implications and
requirements before designing the physical and logical network
topologies. Some of the factors that need to be addressed
include making sure the networks are properly segregated and
traffic flows are going to the correct destinations without
crossing through locations that are undesirable. Some examples
of factors that need to be taken into consideration are:</para>
<itemizedlist>
<listitem>
<para>Firewalls</para>
</listitem>
<listitem>
<para>Overlay interconnects for joining separated tenant
networks</para>
</listitem>
<listitem>
<para>Routing through or avoiding specific networks</para>
</listitem>
</itemizedlist>
<para>Another security vulnerability that must be taken into
account is how networks are attached to hypervisors. If a
network must be separated from other systems at all costs, it
may be necessary to schedule instances for that network onto
dedicated compute nodes. This may also be done to mitigate
against exploiting a hypervisor breakout allowing the attacker
access to networks from a compromised instance.</para>
</section>
</section>

doc/arch-design/pom.xml Normal file

@ -0,0 +1,79 @@
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<parent>
<groupId>org.openstack.docs</groupId>
<artifactId>parent-pom</artifactId>
<version>1.0.0-SNAPSHOT</version>
<relativePath>../pom.xml</relativePath>
</parent>
<modelVersion>4.0.0</modelVersion>
<artifactId>openstack-arch-design</artifactId>
<packaging>jar</packaging>
<name>OpenStack Architecture Design Guide</name>
<properties>
<!-- This is set by Jenkins according to the branch. -->
<release.path.name></release.path.name>
<comments.enabled>0</comments.enabled>
</properties>
<!-- ################################################ -->
<!-- USE "mvn clean generate-sources" to run this POM -->
<!-- ################################################ -->
<build>
<plugins>
<plugin>
<groupId>com.rackspace.cloud.api</groupId>
<artifactId>clouddocs-maven-plugin</artifactId>
<!-- version set in ../pom.xml -->
<executions>
<execution>
<id>generate-webhelp</id>
<goals>
<goal>generate-webhelp</goal>
</goals>
<phase>generate-sources</phase>
<configuration>
<!-- These parameters only apply to webhelp -->
<enableDisqus>0</enableDisqus>
<disqusShortname>openstack-arch-design</disqusShortname>
<enableGoogleAnalytics>1</enableGoogleAnalytics>
<googleAnalyticsId>UA-17511903-1</googleAnalyticsId>
<generateToc>
appendix toc,title
article/appendix nop
article toc,title
book toc,title,figure,table,example,equation
chapter toc,title
section toc
part toc,title
qandadiv toc
qandaset toc
reference toc,title
set toc,title
</generateToc>
<!-- The following elements sets the autonumbering of sections in output for chapter numbers but no numbered sections-->
<sectionAutolabel>0</sectionAutolabel>
<tocSectionDepth>1</tocSectionDepth>
<sectionLabelIncludesComponentLabel>0</sectionLabelIncludesComponentLabel>
<webhelpDirname>arch-design</webhelpDirname>
<pdfFilenameBase>arch-design</pdfFilenameBase>
</configuration>
</execution>
</executions>
<configuration>
<!-- These parameters apply to pdf and webhelp -->
<xincludeSupported>true</xincludeSupported>
<sourceDirectory>.</sourceDirectory>
<includes>
bk-openstack-arch-design.xml
</includes>
<canonicalUrlBase>http://docs.openstack.org/openstack-arch-design/content</canonicalUrlBase>
<glossaryCollection>${basedir}/../glossary/glossary-terms.xml</glossaryCollection>
<branding>openstack</branding>
<formalProcedures>0</formalProcedures>
</configuration>
</plugin>
</plugins>
</build>
</project>


@ -0,0 +1,62 @@
<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="desktop-as-a-service">
<?dbhtml stop-chunking?>
<title>Desktop as a Service</title>
<para>Virtual Desktop Infrastructure (VDI) is a service that hosts
user desktop environments on remote servers. This application
is very sensitive to network latency and requires a high
performance compute environment. Traditionally these types of
environments have not been put on cloud environments because
few clouds are built to support such a demanding workload that
is so exposed to end users. Recently, as cloud environments
become more robust, vendors are starting to provide services
that allow virtual desktops to be hosted in the cloud. In the
not too distant future, OpenStack could be used as the
underlying infrastructure to run a virtual infrastructure
environment, either in-house or in the cloud.</para>
<section xml:id="challenges"><title>Challenges</title>
<para>Designing an infrastructure that is suitable to host virtual
desktops is a very different task to that of most virtual
workloads. The infrastructure will need to be designed, for
example:</para>
<itemizedlist>
<listitem>
<para>Boot storms - What happens when hundreds or
thousands of users log in during shift changes,
affects the storage design.</para>
</listitem>
<listitem>
<para>The performance of the applications running in these
virtual desktops</para>
</listitem>
<listitem>
<para>Operating system and compatibility with the
OpenStack hypervisor</para>
</listitem>
</itemizedlist></section>
<section xml:id="broker"><title>Broker</title>
<para>The Connection Broker is a central component of the
architecture that determines which Remote Desktop Host will be
assigned or connected to the user. The broker is often a
full-blown management product allowing for the automated
deployment and provisioning of Remote Desktop Hosts.</para></section>
<section xml:id="possible-solutions"><title>Possible Solutions</title>
<para>There a number of commercial products available today that
provide such a broker solution but nothing that is native in
the OpenStack project. There of course is also the option of
not providing a broker and managing this manually - but this
would not suffice as a large scale, enterprise
solution.</para></section>
<section xml:id="diagram"><title>Diagram</title>
<mediaobject>
<imageobject>
<imagedata
fileref="../images/Specialized_VDI1.png"
/>
</imageobject>
</mediaobject></section>
</section>


@ -0,0 +1,45 @@
<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink"
version="5.0"
xml:id="specialized-hardware">
<?dbhtml stop-chunking?>
<title>Specialized Hardware</title>
<para>Certain workloads require specialized hardware devices that
are either difficult to virtualize or impossible to share.
Applications such as load balancers, highly parallel brute
force computing, and direct to wire networking may need
capabilities that basic OpenStack components do not
provide.</para>
<section xml:id="challenges-specialized-hardware"><title>Challenges</title>
<para>Some applications need access to hardware devices to either
improve performance or provide capabilities that are not
virtual CPU, RAM, network or storage. These can be a shared
resource, such as a cryptography processor, or a dedicated
resource such as a Graphics Processing Unit. OpenStack has
ways of providing some of these, while others may need extra
work.</para></section>
<section xml:id="solutions-specialized-hardware"><title>Solutions</title>
<para>In order to provide cryptography offloading to a set of
instances, it is possible to use Glance configuration options
to assign the cryptography chip to a device node in the guest.
The documentation at
http://docs.openstack.org/cli-reference/content/chapter_cli-glance-property.html
contains further information on configuring this solution, but
it allows all guests using the configured images to access the
hypervisor cryptography device.</para>
<para>If direct access to a specific device is required, it can be
dedicated to a single instance per hypervisor through the use
of PCI pass-through. The OpenStack administrator needs to
define a flavor that specifically has the PCI device in order
to properly schedule instances. More information regarding PCI
pass-through, including instructions for implementing and
using it, is available at
https://wiki.openstack.org/wiki/Pci_passthrough#How_to_check_PCI_status_with_PCI_api_patches.</para>
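        <para>The following sketch shows the general shape of such a
            configuration; the vendor and product IDs, alias name,
            and flavor are illustrative only:</para>
        <programlisting language="bash"># Sketch: nova.conf entries on the controller and compute node.
# pci_alias={"vendor_id":"8086","product_id":"0443","name":"crypto"}
# pci_passthrough_whitelist=[{"vendor_id":"8086","product_id":"0443"}]
# Request one matching device in a flavor used for scheduling:
nova flavor-key m1.crypto set "pci_passthrough:alias"="crypto:1"</programlisting>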
<mediaobject>
<imageobject>
<imagedata fileref="../images/Specialized_Hardware2.png"/>
</imageobject>
</mediaobject></section>
</section>
