[arch-design] Migrate arch content from Ops Guide
1. Migrate storage and scaling content from Ops Guide to Arch Guide
2. Edit migrated content
3. Remove images from mitaka changes

Note: Architecture chapter content in the Ops Guide will remain until
the new Arch Guide is published.

Change-Id: I676b57635be567a0c1b3ea63650e6327d3ea0696
Implements: blueprint arch-guide-restructure
@ -1,3 +1,430 @@
=============================
Capacity planning and scaling
=============================

Whereas traditional applications required larger hardware to scale
(vertical scaling), cloud-based applications typically request more,
discrete hardware (horizontal scaling).

OpenStack is designed to be horizontally scalable. Rather than switching
to larger servers, you procure more servers and simply install identically
configured services. Ideally, you scale out and load balance among groups of
functionally identical services (for example, compute nodes or ``nova-api``
nodes) that communicate on a message bus.

The Starting Point
~~~~~~~~~~~~~~~~~~

Determining the scalability of your cloud and how to improve it requires
balancing many variables. No one solution meets everyone's scalability goals.
However, it is helpful to track a number of metrics. Since you can define
virtual hardware templates, called "flavors" in OpenStack, you can start to
make scaling decisions based on the flavors you'll provide. These templates
define sizes for memory in RAM, root disk size, amount of ephemeral data disk
space available, and number of cores, for starters.

The default OpenStack flavors are shown in :ref:`table_default_flavors`.

.. _table_default_flavors:

.. list-table:: Table. OpenStack default flavors
   :widths: 20 20 20 20 20
   :header-rows: 1

   * - Name
     - Virtual cores
     - Memory
     - Disk
     - Ephemeral
   * - m1.tiny
     - 1
     - 512 MB
     - 1 GB
     - 0 GB
   * - m1.small
     - 1
     - 2 GB
     - 10 GB
     - 20 GB
   * - m1.medium
     - 2
     - 4 GB
     - 10 GB
     - 40 GB
   * - m1.large
     - 4
     - 8 GB
     - 10 GB
     - 80 GB
   * - m1.xlarge
     - 8
     - 16 GB
     - 10 GB
     - 160 GB

The starting point is the core count of your cloud. By applying
some ratios, you can gather information about:

- The number of virtual machines (VMs) you expect to run,
  ``((overcommit fraction × cores) / virtual cores per instance)``

- How much storage is required ``(flavor disk size × number of instances)``

You can use these ratios to determine how much additional infrastructure
you need to support your cloud.

Here is an example using the ratios for gathering scalability
information for the number of VMs expected as well as the storage
needed. The following numbers support (200 / 2) × 16 = 1600 VM instances
and require 80 TB of storage for ``/var/lib/nova/instances``:

- 200 physical cores.

- Most instances are size m1.medium (two virtual cores, 50 GB of
  storage).

- Default CPU overcommit ratio (``cpu_allocation_ratio`` in ``nova.conf``)
  of 16:1.

.. note::
   Regardless of the overcommit ratio, an instance cannot be placed
   on any physical node with fewer raw (pre-overcommit) resources than
   the instance flavor requires.
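
The arithmetic in this example can be sketched as a short calculation.
The following is an illustration only, using the values above (200
physical cores, the m1.medium flavor, and a 16:1 CPU overcommit ratio);
it is not a tool shipped with OpenStack:

.. code-block:: python

   # Rough capacity estimate based on the ratios described above.
   physical_cores = 200
   cpu_allocation_ratio = 16        # CPU overcommit ratio
   flavor_vcpus = 2                 # m1.medium
   flavor_disk_gb = 10 + 40         # m1.medium root disk + ephemeral disk

   max_instances = (physical_cores * cpu_allocation_ratio) // flavor_vcpus
   storage_tb = max_instances * flavor_disk_gb / 1000.0

   print(max_instances)  # 1600 VM instances
   print(storage_tb)     # 80.0 TB for /var/lib/nova/instances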

However, you need more than the core count alone to estimate the load
that the API services, database servers, and queue servers are likely to
encounter. You must also consider the usage patterns of your cloud.

As a specific example, compare a cloud that supports a managed
web-hosting platform with one running integration tests for a
development project that creates one VM per code commit. In the former,
the heavy work of creating a VM happens only every few months, whereas
the latter puts constant heavy load on the cloud controller. You must
consider your average VM lifetime, as a larger number generally means
less load on the cloud controller.

.. TODO Perhaps relocate the above paragraph under the web scale use case?

Aside from the creation and termination of VMs, you must consider the
impact of users accessing the service, particularly on ``nova-api`` and
its associated database. Listing instances garners a great deal of
information and, given the frequency with which users run this
operation, a cloud with a large number of users can increase the load
significantly. This can occur even without their knowledge. For example,
leaving the OpenStack dashboard instances tab open in the browser
refreshes the list of VMs every 30 seconds.

After you consider these factors, you can determine how many cloud
controller cores you require. A typical eight-core, 8 GB of RAM server
is sufficient for up to a rack of compute nodes, given the above
caveats.

You must also consider key hardware specifications for the performance
of user VMs, as well as budget and performance needs, including storage
performance (spindles/core), memory availability (RAM/core), network
bandwidth (Gbps/core), and overall CPU performance (CPU/core).
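
As a simple illustration of these per-core ratios, the following sketch
evaluates a hypothetical compute node specification. The numbers are
placeholders for comparison purposes only, not recommendations:

.. code-block:: python

   # Per-core ratios for a candidate compute node (example values only).
   cores = 24.0            # physical cores per node
   ram_gb = 256.0          # RAM per node
   spindles = 6.0          # local disks per node
   network_gbps = 20.0     # for example, 2 x 10 Gb NICs

   print("RAM/core: %.1f GB" % (ram_gb / cores))
   print("Spindles/core: %.2f" % (spindles / cores))
   print("Network Gbps/core: %.2f" % (network_gbps / cores))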

.. tip::

   For a discussion of metric tracking, including how to extract
   metrics from your cloud, see the `OpenStack Operations Guide
   <http://docs.openstack.org/ops-guide/ops_logging_monitoring.html>`_.

Adding Cloud Controller Nodes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You can facilitate the horizontal expansion of your cloud by adding
nodes. Adding compute nodes is straightforward since they are easily picked up
by the existing installation. However, you must consider some important
points when you design your cluster to be highly available.

A cloud controller node runs several different services. You
can install services that communicate only using the message queue
internally (``nova-scheduler`` and ``nova-console``) on a new server for
expansion. However, other integral parts require more care.

You should load balance user-facing services such as dashboard,
``nova-api``, or the Object Storage proxy. Use any standard HTTP
load-balancing method (DNS round robin, hardware load balancer, or
software such as Pound or HAProxy). One caveat with dashboard is the VNC
proxy, which uses the WebSocket protocol, something that an L7 load
balancer might struggle with. See also `Horizon session storage
<http://docs.openstack.org/developer/horizon/topics/deployment.html#session-storage>`_.

You can configure some services, such as ``nova-api`` and
``glance-api``, to use multiple processes by changing a flag in their
configuration file, allowing them to share work between multiple cores on
one machine.
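
The number of worker processes is typically sized to the host's core
count. The following is a rough sketch of that sizing rule of thumb only;
it is not OpenStack code, and the exact configuration option name depends
on the service:

.. code-block:: python

   # Illustrative rule of thumb: roughly one API worker per core, keeping
   # a couple of cores free for other controller services.
   import multiprocessing

   reserved_cores = 2
   workers = max(1, multiprocessing.cpu_count() - reserved_cores)
   print("suggested worker count: %d" % workers)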

.. tip::

   Several options are available for MySQL load balancing, and the
   supported AMQP brokers have built-in clustering support. Information
   on how to configure these and many of the other services can be
   found in the `operations chapter
   <http://docs.openstack.org/ops-guide/operations.html>`_ in the Operations
   Guide.

Segregating Your Cloud
~~~~~~~~~~~~~~~~~~~~~~

When you want to offer users different regions for legal considerations
of data storage, redundancy across earthquake fault lines, or
low-latency API calls, you segregate your cloud. Use one
of the following OpenStack methods to segregate your cloud: *cells*,
*regions*, *availability zones*, or *host aggregates*.

Each method provides different functionality and can be best divided
into two groups:

- Cells and regions, which segregate an entire cloud and result in
  running separate Compute deployments.

- :term:`Availability zones <availability zone>` and host aggregates,
  which merely divide a single Compute deployment.

:ref:`table_segregation_methods` provides a comparison view of each
segregation method currently provided by OpenStack Compute.

.. _table_segregation_methods:

.. list-table:: Table. OpenStack segregation methods
   :widths: 20 20 20 20 20
   :header-rows: 1

   * -
     - Cells
     - Regions
     - Availability zones
     - Host aggregates
   * - **Use**
     - A single :term:`API endpoint` for compute, or you require a second
       level of scheduling.
     - Discrete regions with separate API endpoints and no coordination
       between regions.
     - Logical separation within your nova deployment for physical isolation
       or redundancy.
     - To schedule a group of hosts with common features.
   * - **Example**
     - A cloud with multiple sites where you can schedule VMs "anywhere" or on
       a particular site.
     - A cloud with multiple sites, where you schedule VMs to a particular
       site and you want a shared infrastructure.
     - A single-site cloud with equipment fed by separate power supplies.
     - Scheduling to hosts with trusted hardware support.
   * - **Overhead**
     - Considered experimental. A new service, nova-cells. Each cell has a full
       nova installation except nova-api.
     - A different API endpoint for every region. Each region has a full nova
       installation.
     - Configuration changes to ``nova.conf``.
     - Configuration changes to ``nova.conf``.
   * - **Shared services**
     - Keystone, ``nova-api``
     - Keystone
     - Keystone, all nova services
     - Keystone, all nova services

Cells and Regions
-----------------

OpenStack Compute cells are designed to allow running the cloud in a
distributed fashion without having to use more complicated technologies,
or being invasive to existing nova installations. Hosts in a cloud are
partitioned into groups called *cells*. Cells are configured in a tree.
The top-level cell ("API cell") has a host that runs the ``nova-api``
service, but no ``nova-compute`` services. Each child cell runs all of
the other typical ``nova-*`` services found in a regular installation,
except for the ``nova-api`` service. Each cell has its own message queue
and database service and also runs ``nova-cells``, which manages the
communication between the API cell and child cells.

This allows a single API server to be used to control access to
multiple cloud installations. Introducing a second level of scheduling
(the cell selection), in addition to the regular ``nova-scheduler``
selection of hosts, provides greater flexibility to control where
virtual machines are run.

Unlike having a single API endpoint, regions have a separate API
endpoint per installation, allowing for a more discrete separation.
Users wanting to run instances across sites have to explicitly select a
region. However, the additional complexity of running a new service is
not required.

The OpenStack dashboard (horizon) can be configured to use multiple
regions. This can be configured through the ``AVAILABLE_REGIONS``
parameter.
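
As an illustration, horizon reads this setting from its
``local_settings.py`` file as a list of (Identity endpoint, region name)
pairs. The endpoints below are placeholders, not real deployments:

.. code-block:: python

   # Example AVAILABLE_REGIONS setting for the dashboard; each entry is
   # (Identity API endpoint URL, display name for the region).
   AVAILABLE_REGIONS = [
       ('http://region-one.example.com:5000/v2.0', 'RegionOne'),
       ('http://region-two.example.com:5000/v2.0', 'RegionTwo'),
   ]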

Availability Zones and Host Aggregates
--------------------------------------

You can use availability zones, host aggregates, or both to partition a
nova deployment.

Availability zones are implemented through and configured in a similar
way to host aggregates.

However, you can use them for different reasons.

Availability zone
^^^^^^^^^^^^^^^^^

This enables you to arrange OpenStack compute hosts into logical groups
and provides a form of physical isolation and redundancy from other
availability zones, such as by using a separate power supply or network
equipment.

You define the availability zone in which a specified compute host
resides locally on each server. An availability zone is commonly used to
identify a set of servers that have a common attribute. For instance, if
some of the racks in your data center are on a separate power source,
you can put servers in those racks in their own availability zone.
Availability zones can also help separate different classes of hardware.

When users provision resources, they can specify from which availability
zone they want their instance to be built. This allows cloud consumers
to ensure that their application resources are spread across disparate
machines to achieve high availability in the event of hardware failure.

Host aggregates
^^^^^^^^^^^^^^^

This enables you to partition OpenStack Compute deployments into logical
groups for load balancing and instance distribution. You can use host
aggregates to further partition an availability zone. For example, you
might use host aggregates to partition an availability zone into groups
of hosts that either share common resources, such as storage and
network, or have a special property, such as trusted computing
hardware.

A common use of host aggregates is to provide information for use with
the ``nova-scheduler``. For example, you might use a host aggregate to
group a set of hosts that share specific flavors or images.

The general case for this is setting key-value pairs in the aggregate
metadata and matching key-value pairs in a flavor's ``extra_specs``
metadata. The ``AggregateInstanceExtraSpecsFilter`` in the filter
scheduler will enforce that instances be scheduled only on hosts in
aggregates that define the same key with the same value.
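
The matching rule can be sketched as follows. This is a simplified
illustration of the concept only, not the filter's actual implementation:

.. code-block:: python

   # Simplified sketch: a host in an aggregate passes only if every
   # extra_specs key required by the flavor appears in the aggregate
   # metadata with the same value.
   def host_matches(aggregate_metadata, flavor_extra_specs):
       return all(aggregate_metadata.get(key) == value
                  for key, value in flavor_extra_specs.items())

   aggregate_metadata = {'ssd': 'true'}
   print(host_matches(aggregate_metadata, {'ssd': 'true'}))    # True
   print(host_matches(aggregate_metadata, {'ssd': 'false'}))   # False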

An advanced use of this general concept allows different flavor types to
run with different CPU and RAM allocation ratios so that high-intensity
computing loads and low-intensity development and testing systems can
share the same cloud without either starving the high-use systems or
wasting resources on low-utilization systems. This works by setting
``metadata`` in your host aggregates and matching ``extra_specs`` in
your flavor types.

The first step is setting the aggregate metadata keys
``cpu_allocation_ratio`` and ``ram_allocation_ratio`` to a
floating-point value. The ``AggregateCoreFilter`` and
``AggregateRamFilter`` scheduler filters will use those values rather
than the global defaults in ``nova.conf`` when scheduling to hosts in
the aggregate. Be cautious when using this feature, since each host can
be in multiple aggregates, but should have only one allocation ratio for
each resource. It is up to you to avoid putting a host in multiple
aggregates that define different values for the same resource.
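
A simplified sketch of how a per-aggregate CPU allocation ratio changes
the capacity calculation is shown below. This illustrates the idea only
and is not the filter's actual code:

.. code-block:: python

   # Simplified sketch: whether a request fits depends on which
   # cpu_allocation_ratio applies to the host.
   def fits(physical_cores, used_vcpus, requested_vcpus, cpu_allocation_ratio):
       limit = physical_cores * cpu_allocation_ratio
       return used_vcpus + requested_vcpus <= limit

   # The same 16-core host under two different aggregate ratios:
   print(fits(16, 200, 4, cpu_allocation_ratio=16.0))  # True, limit is 256
   print(fits(16, 200, 4, cpu_allocation_ratio=1.0))   # False, limit is 16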

This is the first half of the equation. To get flavor types that are
guaranteed a particular ratio, you must set the ``extra_specs`` in the
flavor type to the key-value pair you want to match in the aggregate.
For example, if you define ``extra_specs`` ``cpu_allocation_ratio`` to
"1.0", then instances of that type will run in aggregates only where the
metadata key ``cpu_allocation_ratio`` is also defined as "1.0". In
practice, it is better to define an additional key-value pair in the
aggregate metadata to match on rather than match directly on
``cpu_allocation_ratio`` or ``ram_allocation_ratio``. This allows
better abstraction. For example, by defining a key ``overcommit`` and
setting a value of "high," "medium," or "low," you could then tune the
numeric allocation ratios in the aggregates without also needing to
change all flavor types relating to them.

.. note::

   Previously, all services had an availability zone. Currently, only
   the ``nova-compute`` service has its own availability zone. Services
   such as ``nova-scheduler``, ``nova-network``, and ``nova-conductor``
   have always spanned all availability zones.

   When you run any of the following operations, the services appear in
   their own internal availability zone
   (CONF.internal_service_availability_zone):

   - :command:`nova host-list` (os-hosts)

   - :command:`euca-describe-availability-zones verbose`

   - :command:`nova service-list`

   The internal availability zone is hidden in
   euca-describe-availability_zones (nonverbose).

   CONF.node_availability_zone has been renamed to
   CONF.default_availability_zone and is used only by the
   ``nova-api`` and ``nova-scheduler`` services.

   CONF.node_availability_zone still works but is deprecated.

Scalable Hardware
~~~~~~~~~~~~~~~~~

While several resources already exist to help with deploying and
installing OpenStack, it's very important to make sure that you have
your deployment planned out ahead of time. This guide presumes that you
have set aside a rack for the OpenStack cloud but also offers
suggestions for when and what to scale.

Hardware Procurement
--------------------

“The Cloud” has been described as a volatile environment where servers
can be created and terminated at will. While this may be true, it does
not mean that your servers must be volatile. Ensuring that your cloud's
hardware is stable and configured correctly means that your cloud
environment remains up and running.

OpenStack can be deployed on any hardware supported by an
OpenStack-compatible Linux distribution.

Hardware does not have to be consistent, but it should at least have the
same type of CPU to support instance migration.

The typical hardware recommended for use with OpenStack is the standard
value-for-money offerings that most hardware vendors stock. It should be
straightforward to divide your procurement into building blocks such as
"compute," "object storage," and "cloud controller," and request as many
of these as you need. Alternatively, any existing servers that meet your
performance requirements and support virtualization technology are
likely to support OpenStack.

Capacity Planning
-----------------

OpenStack is designed to increase in size in a straightforward manner.
Taking into account the considerations previously mentioned, particularly
on the sizing of the cloud controller, it should be possible to procure
additional compute or object storage nodes as needed. New nodes do not
need to be the same specification or vendor as existing nodes.

For compute nodes, ``nova-scheduler`` will manage differences in
sizing with core count and RAM. However, you should consider that the user
experience changes with differing CPU speeds. When adding object storage
nodes, a :term:`weight` should be specified that reflects the
:term:`capability` of the node.
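
One common convention, sketched below, is to derive a device's weight
from its capacity, so that larger drives receive proportionally more
data. This is an illustration of the idea, not a required formula:

.. code-block:: python

   # Illustrative only: weight proportional to drive capacity.
   def ring_weight(drive_size_gb, gb_per_weight_unit=100.0):
       return drive_size_gb / gb_per_weight_unit

   print(ring_weight(4000))  # 40.0 for a 4 TB drive
   print(ring_weight(8000))  # 80.0 for an 8 TB drive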

Monitoring the resource usage and user growth will enable you to know
when to procure. The `Logging and Monitoring
<http://docs.openstack.org/ops-guide/ops_logging_monitoring.html>`_
chapter in the Operations Guide details some useful metrics.

Burn-in Testing
---------------

The chances of failure for the server's hardware are high at the start
and the end of its life. As a result, you can avoid dealing with hardware
failures in production by performing appropriate burn-in testing that
attempts to trigger the early-stage failures. The general principle is to
stress the hardware to its limits. Examples of burn-in tests include
running a CPU or disk benchmark for several days.
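
As a trivial illustration of the idea (production burn-in is usually done
with dedicated benchmarking tools), the following sketch keeps every CPU
core busy for a fixed period:

.. code-block:: python

   # Illustrative only: a minimal CPU burn loop across all cores.
   import multiprocessing
   import time

   def burn(seconds):
       end = time.time() + seconds
       while time.time() < end:
           pass  # busy loop

   if __name__ == '__main__':
       procs = [multiprocessing.Process(target=burn, args=(3600,))
                for _ in range(multiprocessing.cpu_count())]
       for p in procs:
           p.start()
       for p in procs:
           p.join()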
3
doc/arch-design-draft/source/design-compute.rst
Normal file
@ -0,0 +1,3 @@
=======
Compute
=======
3
doc/arch-design-draft/source/design-control-plane.rst
Normal file
@ -0,0 +1,3 @@
=============
Control Plane
=============
3
doc/arch-design-draft/source/design-dashboard-api.rst
Normal file
@ -0,0 +1,3 @@
==================
Dashboard and APIs
==================
3
doc/arch-design-draft/source/design-identity.rst
Normal file
@ -0,0 +1,3 @@
========
Identity
========
3
doc/arch-design-draft/source/design-images.rst
Normal file
@ -0,0 +1,3 @@
======
Images
======
3
doc/arch-design-draft/source/design-networking.rst
Normal file
@ -0,0 +1,3 @@
==========
Networking
==========
476
doc/arch-design-draft/source/design-storage.rst
Normal file
@ -0,0 +1,476 @@
==============
Storage design
==============

Storage is found in many parts of the OpenStack cloud environment. This
section describes persistent storage options you can configure with
your cloud. It is important to understand the distinction between
:term:`ephemeral <ephemeral volume>` storage and
:term:`persistent <persistent volume>` storage.

Ephemeral Storage
~~~~~~~~~~~~~~~~~

If you deploy only the OpenStack :term:`Compute service` (nova), by default
your users do not have access to any form of persistent storage. The disks
associated with VMs are "ephemeral," meaning that from the user's point
of view they disappear when a virtual machine is terminated.

Persistent Storage
~~~~~~~~~~~~~~~~~~

Persistent storage means that the storage resource outlives any other
resource and is always available, regardless of the state of a running
instance.

Today, OpenStack clouds explicitly support three types of persistent
storage: *object storage*, *block storage*, and *file system storage*.

Object Storage
--------------

Object storage is implemented in OpenStack by the
OpenStack Object Storage (swift) project. Users access binary objects
through a REST API. If your intended users need to
archive or manage large datasets, you want to provide them with Object
Storage. In addition, OpenStack can store your virtual machine (VM)
images inside of an object storage system, as an alternative to storing
the images on a file system.

OpenStack Object Storage provides a highly scalable, highly available
storage solution by relaxing some of the constraints of traditional file
systems. In designing and procuring for such a cluster, it is important
to understand some key concepts about its operation. Essentially, this
type of storage is built on the idea that all storage hardware fails, at
every level, at some point. Infrequently encountered failures that would
hamstring other storage systems, such as issues taking down RAID cards
or entire servers, are handled gracefully with OpenStack Object
Storage. For more information, see the `Swift developer documentation
<http://docs.openstack.org/developer/swift/overview_architecture.html>`_.

When designing your cluster, you must consider durability and
availability, which depend on the spread and placement of your data
rather than the reliability of the hardware.
Consider the default value of the number of replicas, which is
three. This means that before an object is marked as having been
written, at least two copies exist (in case a single server fails to
write, the third copy may or may not yet exist when the write operation
initially returns). Altering this number increases the robustness of your
data, but reduces the amount of storage you have available. Look
at the placement of your servers. Consider spreading them widely
throughout your data center's network and power-failure zones. Is a zone
a rack, a server, or a disk?
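
To make the replica trade-off concrete, the following sketch shows how
the replica count affects usable capacity for a given amount of raw
disk. The numbers are illustrative only:

.. code-block:: python

   # Illustrative only: more replicas give more durability but less
   # usable capacity for the same raw storage.
   raw_storage_tb = 600.0

   for replicas in (2, 3, 4):
       usable_tb = raw_storage_tb / replicas
       print("%d replicas: %.0f TB usable" % (replicas, usable_tb))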

Consider these main traffic flows for an Object Storage network:

* Among :term:`object`, :term:`container`, and
  :term:`account servers <account server>`
* Between servers and the proxies
* Between the proxies and your users

Object Storage frequently communicates among servers hosting data. Even a
small cluster generates megabytes/second of traffic, which is predominantly
“Do you have the object?” and “Yes I have the object!” If the answer
to the question is negative or the request times out,
replication of the object begins.

Consider the scenario where an entire server fails and 24 TB of data
needs to be transferred "immediately" to remain at three copies; this can
put significant load on the network.

Another consideration is that when a new file is being uploaded, the proxy
server must write out as many streams as there are replicas, multiplying
network traffic. For a three-replica cluster, 10 Gbps in means 30 Gbps out.
Combining this with the high bandwidth demands of replication is what
results in the recommendation that your private network have significantly
higher bandwidth than your public network requires. OpenStack Object
Storage communicates internally with unencrypted, unauthenticated rsync
for performance, so a private network is required.
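
The write amplification described above can be sketched as follows, using
the same figures as the example in the text:

.. code-block:: python

   # Illustrative only: a proxy writes one stream per replica, so inbound
   # client bandwidth is multiplied on the private replication network.
   replicas = 3
   inbound_gbps = 10.0

   internal_gbps = inbound_gbps * replicas
   print("outbound to storage nodes: %.0f Gbps" % internal_gbps)  # 30 Gbps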

The remaining point on bandwidth is the public-facing portion. The
``swift-proxy`` service is stateless, which means that you can easily
add more and use HTTP load-balancing methods to share bandwidth and
availability between them.

More proxies mean more bandwidth, if your storage can keep up.

Block Storage
-------------

Block storage (sometimes referred to as volume storage) provides users
with access to block-storage devices. Users interact with block storage
by attaching volumes to their running VM instances.

These volumes are persistent: they can be detached from one instance and
re-attached to another, and the data remains intact. Block storage is
implemented in OpenStack by the OpenStack Block Storage service (cinder),
which supports multiple back ends in the form of drivers. Your
choice of a storage back end must be supported by a Block Storage
driver.

Most block storage drivers allow the instance to have direct access to
the underlying storage hardware's block device. This helps increase the
overall read/write IO. However, support for utilizing files as volumes
is also well established, with full support for NFS, GlusterFS, and
others.

These drivers work a little differently than a traditional "block"
storage driver. On an NFS or GlusterFS file system, a single file is
created and then mapped as a "virtual" volume into the instance. This
mapping/translation is similar to how OpenStack utilizes QEMU's
file-based virtual machines stored in ``/var/lib/nova/instances``.

Shared File Systems Service
---------------------------

The Shared File Systems service (manila) provides a set of services for
managing shared file systems in a multi-tenant cloud environment.
Users interact with the Shared File Systems service by mounting remote
file systems on their instances and then using those file systems for
file storage and exchange. The Shared File Systems service provides
shares, which are remote, mountable file systems. You can mount a
share on, and access it from, several hosts by several users at a
time. With shares, users can also:

* Create a share, specifying its size, shared file system protocol, and
  visibility level.
* Create a share on either a share server or standalone, depending on
  the selected back-end mode, with or without using a share network.
* Specify access rules and security services for existing shares.
* Combine several shares in groups to keep data consistency inside the
  groups for safe group operations.
* Create a snapshot of a selected share or a share group for storing
  the existing shares consistently or creating new shares from that
  snapshot in a consistent way.
* Create a share from a snapshot.
* Set rate limits and quotas for specific shares and snapshots.
* View usage of share resources.
* Remove shares.

Like Block Storage, the Shared File Systems service is persistent. It
can be:

* Mounted to any number of client machines.
* Detached from one instance and attached to another without data loss.
  During this process the data are safe unless the Shared File Systems
  service itself is changed or removed.

Shares are provided by the Shared File Systems service. In OpenStack,
the Shared File Systems service is implemented by the manila project,
which supports multiple back ends in the form of drivers. The Shared
File Systems service can be configured to provision shares from one or
more back ends. Share servers are usually virtual machines that export
file shares using different protocols, such as NFS, CIFS, GlusterFS, or
HDFS.

OpenStack Storage Concepts
~~~~~~~~~~~~~~~~~~~~~~~~~~

:ref:`table_openstack_storage` explains the different storage concepts
provided by OpenStack.

.. _table_openstack_storage:

.. list-table:: Table. OpenStack storage
   :widths: 20 20 20 20 20
   :header-rows: 1

   * -
     - Ephemeral storage
     - Block storage
     - Object storage
     - Shared File System storage
   * - Used to…
     - Run operating system and scratch space
     - Add additional persistent storage to a virtual machine (VM)
     - Store data, including VM images
     - Add additional persistent storage to a virtual machine
   * - Accessed through…
     - A file system
     - A block device that can be partitioned, formatted, and mounted
       (such as /dev/vdc)
     - The REST API
     - A Shared File Systems service share (either manila managed or an
       external one registered in manila) that can be partitioned, formatted,
       and mounted (such as /dev/vdc)
   * - Accessible from…
     - Within a VM
     - Within a VM
     - Anywhere
     - Within a VM
   * - Managed by…
     - OpenStack Compute (nova)
     - OpenStack Block Storage (cinder)
     - OpenStack Object Storage (swift)
     - OpenStack Shared File System Storage (manila)
   * - Persists until…
     - VM is terminated
     - Deleted by user
     - Deleted by user
     - Deleted by user
   * - Sizing determined by…
     - Administrator configuration of size settings, known as *flavors*
     - User specification in initial request
     - Amount of available physical storage
     - * User specification in initial request
       * Requests for extension
       * Available user-level quotas
       * Limitations applied by Administrator
   * - Encryption set by…
     - Parameter in ``nova.conf``
     - Admin establishing `encrypted volume type
       <http://docs.openstack.org/admin-guide/dashboard_manage_volumes.html>`_,
       then user selecting encrypted volume
     - Not yet available
     - Shared File Systems service does not apply any additional encryption
       above what the share’s back-end storage provides
   * - Example of typical usage…
     - 10 GB first disk, 30 GB second disk
     - 1 TB disk
     - 10s of TBs of dataset storage
     - Depends completely on the size of back-end storage specified when
       a share was being created. In case of thin provisioning it can be
       partial space reservation (for more details see the
       `Capabilities and Extra-Specs
       <http://docs.openstack.org/developer/manila/devref/capabilities_and_extra_specs.html?highlight=extra%20specs#common-capabilities>`_
       specification)

.. note::

   **File-level Storage (for Live Migration)**

   With file-level storage, users access stored data using the operating
   system's file system interface. Most users, if they have used a network
   storage solution before, have encountered this form of networked
   storage. In the Unix world, the most common form of this is NFS. In the
   Windows world, the most common form is called CIFS (previously, SMB).

   OpenStack clouds do not present file-level storage to end users.
   However, it is important to consider file-level storage for storing
   instances under ``/var/lib/nova/instances`` when designing your cloud,
   since you must have a shared file system if you want to support live
   migration.

Choosing Storage Back Ends
~~~~~~~~~~~~~~~~~~~~~~~~~~

Users will indicate different needs for their cloud use cases. Some may
need fast access to many objects that do not change often, or want to
set a time-to-live (TTL) value on a file. Others may access only storage
that is mounted with the file system itself, but want it to be
replicated instantly when starting a new instance. For other systems,
ephemeral storage is the preferred choice. When you select
:term:`storage back ends <storage back end>`,
consider the following questions from the user's perspective:

* Do my users need block storage?
* Do my users need object storage?
* Do I need to support live migration?
* Should my persistent storage drives be contained in my compute nodes,
  or should I use external storage?
* What is the platter count I can achieve? Do more spindles result in
  better I/O despite network access?
* Which one results in the best cost-performance scenario I'm aiming for?
* How do I manage the storage operationally?
* How redundant and distributed is the storage? What happens if a
  storage node fails? To what extent can it mitigate my data-loss
  disaster scenarios?

To deploy your storage by using only commodity hardware, you can use a number
of open-source packages, as shown in :ref:`table_persistent_file_storage`.

.. _table_persistent_file_storage:

.. list-table:: Table. Persistent file-based storage support
   :widths: 25 25 25 25
   :header-rows: 1

   * -
     - Object
     - Block
     - File-level
   * - Swift
     - .. image:: figures/Check_mark_23x20_02.png
          :width: 30%
     -
     -
   * - LVM
     -
     - .. image:: figures/Check_mark_23x20_02.png
          :width: 30%
     -
   * - Ceph
     - .. image:: figures/Check_mark_23x20_02.png
          :width: 30%
     - .. image:: figures/Check_mark_23x20_02.png
          :width: 30%
     - Experimental
   * - Gluster
     - .. image:: figures/Check_mark_23x20_02.png
          :width: 30%
     - .. image:: figures/Check_mark_23x20_02.png
          :width: 30%
     - .. image:: figures/Check_mark_23x20_02.png
          :width: 30%
   * - NFS
     -
     - .. image:: figures/Check_mark_23x20_02.png
          :width: 30%
     - .. image:: figures/Check_mark_23x20_02.png
          :width: 30%
   * - ZFS
     -
     - .. image:: figures/Check_mark_23x20_02.png
          :width: 30%
     -
   * - Sheepdog
     - .. image:: figures/Check_mark_23x20_02.png
          :width: 30%
     - .. image:: figures/Check_mark_23x20_02.png
          :width: 30%
     -

This list of open source file-level shared storage solutions is not
exhaustive; other open source solutions exist (for example, MooseFS). Your
organization may already have deployed a file-level shared storage
solution that you can use.

.. note::

   **Storage Driver Support**

   In addition to the open source technologies, there are a number of
   proprietary solutions that are officially supported by OpenStack Block
   Storage. You can find a matrix of the functionality provided by all of the
   supported Block Storage drivers on the `OpenStack
   wiki <https://wiki.openstack.org/wiki/CinderSupportMatrix>`_.

Also, you need to decide whether you want to support object storage in
your cloud. The two common use cases for providing object storage in a
compute cloud are:

* To provide users with a persistent storage mechanism
* As a scalable, reliable data store for virtual machine images

Commodity Storage Back-end Technologies
---------------------------------------

This section provides a high-level overview of the differences among the
different commodity storage back-end technologies. Depending on your
cloud user's needs, you can implement one or many of these technologies
in different combinations:

OpenStack Object Storage (swift)
    The official OpenStack Object Store implementation. It is a mature
    technology that has been used for several years in production by
    Rackspace as the technology behind Rackspace Cloud Files. As it is
    highly scalable, it is well-suited to managing petabytes of storage.
    OpenStack Object Storage's advantages are better integration with
    OpenStack (integrates with OpenStack Identity, works with the
    OpenStack dashboard interface) and better support for multiple data
    center deployment through support of asynchronous eventual
    consistency replication.

    Therefore, if you eventually plan on distributing your storage
    cluster across multiple data centers, if you need unified accounts
    for your users for both compute and object storage, or if you want
    to control your object storage with the OpenStack dashboard, you
    should consider OpenStack Object Storage. More detail about
    OpenStack Object Storage can be found in the Object Storage section
    above.

Ceph
    A scalable storage solution that replicates data across commodity
    storage nodes.

    Ceph was designed to expose different types of storage interfaces to
    the end user: it supports object storage, block storage, and
    file-system interfaces, although the file-system interface is not
    production-ready. Ceph supports the same API as swift
    for object storage and can be used as a back end for cinder block
    storage as well as back-end storage for glance images. Ceph supports
    "thin provisioning," implemented using copy-on-write.

    This can be useful when booting from volume because a new volume can
    be provisioned very quickly. Ceph also supports keystone-based
    authentication (as of version 0.56), so it can be a seamless swap in
    for the default OpenStack swift implementation.

    Ceph's advantages are that it gives the administrator more
    fine-grained control over data distribution and replication
    strategies, enables you to consolidate your object and block
    storage, enables very fast provisioning of boot-from-volume
    instances using thin provisioning, and supports a distributed
    file-system interface, though this interface is `not yet
    recommended <http://ceph.com/docs/master/cephfs/>`_ for use in
    production deployment by the Ceph project.

    If you want to manage your object and block storage within a single
    system, or if you want to support fast boot-from-volume, you should
    consider Ceph.

Gluster
    A distributed, shared file system. As of Gluster version 3.3, you
    can use Gluster to consolidate your object storage and file storage
    into one unified file and object storage solution, which is called
    Gluster For OpenStack (GFO). GFO uses a customized version of swift
    that enables Gluster to be used as the back-end storage.

    The main reason to use GFO rather than swift is if you also
    want to support a distributed file system, either to support shared
    storage live migration or to provide it as a separate service to
    your end users. If you want to manage your object and file storage
    within a single system, you should consider GFO.

LVM
    The Logical Volume Manager is a Linux-based system that provides an
    abstraction layer on top of physical disks to expose logical volumes
    to the operating system. The LVM back end implements block storage
    as LVM logical partitions.

    On each host that will house block storage, an administrator must
    initially create a volume group dedicated to Block Storage volumes.
    Blocks are created from LVM logical volumes.

    .. note::

        LVM does *not* provide any replication. Typically,
        administrators configure RAID on nodes that use LVM as block
        storage to protect against failures of individual hard drives.
        However, RAID does not protect against a failure of the entire
        host.

ZFS
    The Solaris iSCSI driver for OpenStack Block Storage implements
    blocks as ZFS entities. ZFS is a file system that also has the
    functionality of a volume manager. This is unlike on a Linux system,
    where there is a separation of volume manager (LVM) and file system
    (such as ext3, ext4, xfs, and btrfs). ZFS has a number of
    advantages over ext4, including improved data-integrity checking.

    The ZFS back end for OpenStack Block Storage supports only
    Solaris-based systems, such as Illumos. While there is a Linux port
    of ZFS, it is not included in any of the standard Linux
    distributions, and it has not been tested with OpenStack Block
    Storage. As with LVM, ZFS does not provide replication across hosts
    on its own; you need to add a replication solution on top of ZFS if
    your cloud needs to be able to handle storage-node failures.

    We do not recommend ZFS unless you have previous experience with
    deploying it, since the ZFS back end for Block Storage requires a
    Solaris-based operating system, and we assume that your experience
    is primarily with Linux-based systems.

Sheepdog
    Sheepdog is a userspace distributed storage system. Sheepdog scales
    to several hundred nodes, and has powerful virtual disk management
    features such as snapshot, cloning, rollback, and thin provisioning.

    It is essentially an object storage system that manages disks and
    aggregates the space and performance of disks linearly in hyper
    scale on commodity hardware in a smart way. On top of its object
    store, Sheepdog provides an elastic volume service and an HTTP
    service. Sheepdog does not assume anything about kernel version and
    can work nicely with xattr-supported file systems.
@ -2,23 +2,47 @@
 Design
 ======

-Compute service
-~~~~~~~~~~~~~~~
+Designing an OpenStack cloud requires an understanding of the cloud user's
+requirements and needs to determine the best possible configuration. This
+chapter provides guidance on the decisions you need to make during the
+design process.

-Storage
-~~~~~~~
+To design, deploy, and configure OpenStack, administrators must
+understand the logical architecture. OpenStack modules are one of the
+following types:

-Networking service
-~~~~~~~~~~~~~~~~~~
+Daemon
+  Runs as a background process. On Linux platforms, a daemon is usually
+  installed as a service.

-Identity service
-~~~~~~~~~~~~~~~~
+Script
+  Installs a virtual environment and runs tests.

-Image service
-~~~~~~~~~~~~~
+Command-line interface (CLI)
+  Enables users to submit API calls to OpenStack services through commands.

-Control Plane
-~~~~~~~~~~~~~
+:ref:`logical_architecture` shows one example of the most common
+integrated services within OpenStack and how they interact with each
+other. End users can interact through the dashboard, CLIs, and APIs.
+All services authenticate through a common Identity service, and
+individual services interact with each other through public APIs, except
+where privileged administrator commands are necessary.

-Dashboard and APIs
-~~~~~~~~~~~~~~~~~~
+.. _logical_architecture:
+
+.. figure:: common/figures/osog_0001.png
+   :width: 100%
+   :alt: OpenStack Logical Architecture
+
+   OpenStack Logical Architecture
+
+.. toctree::
+   :maxdepth: 2
+
+   design-compute.rst
+   design-storage.rst
+   design-networking.rst
+   design-identity.rst
+   design-images.rst
+   design-control-plane.rst
+   design-dashboard-api.rst
BIN
doc/arch-design-draft/source/figures/Check_mark_23x20_02.png
Normal file
After Width: | Height: | Size: 3.0 KiB |
BIN
doc/common/figures/osog_0001.png
Normal file
After Width: | Height: | Size: 765 KiB |