Multi-site chapter edits

1. Edits to the multi-site chapter
2. Removed duplicated legal content which was added to a common section.

See https://review.openstack.org/#/c/212299/
Change-Id: I10e3a04650548454c73024d87cbbb6fda63454e8
Implements: blueprint arch-guide

parent 87ff7002f8
commit 68e8c66e79
@@ -6,16 +6,9 @@
 xml:id="multi_site">
 <title>Multi-site</title>

-<para>A multi-site OpenStack environment is one in which services,
-located in more than one data center, are used to provide the
-overall solution. Usage requirements of different multi-site
-clouds may vary widely, but they share some common needs.
-OpenStack is capable of running in a multi-region
+<para>OpenStack is capable of running in a multi-region
 configuration. This enables some parts of OpenStack to
-effectively manage a group of sites as a single cloud. With
-careful planning in the design phase, OpenStack can act as an
-excellent multi-site cloud solution for a multitude of
-needs.</para>
+effectively manage a group of sites as a single cloud.</para>
 <para>Some use cases that might indicate a need for a multi-site
 deployment of OpenStack include:</para>
 <itemizedlist>
@@ -6,59 +6,61 @@
 xml:id="arch-design-architecture-multiple-site">
 <?dbhtml stop-chunking?>
 <title>Architecture</title>
-<para>This graphic is a high level diagram of a multi-site OpenStack
-architecture. Each site is an OpenStack cloud but it may be necessary to
-architect the sites on different versions. For example, if the second
-site is intended to be a replacement for the first site, they would be
-different. Another common design would be a private OpenStack cloud with
-replicated site that would be used for high availability or disaster
-recovery. The most important design decision is how to configure the
-storage. It can be configured as a single shared pool or separate pools,
-depending on the user and technical requirements.</para>
+<para><xref linkend="multi-site_arch"/>
+illustrates a high level multi-site OpenStack
+architecture. Each site is an OpenStack cloud but it may be necessary
+to architect the sites on different versions. For example, if the
+second site is intended to be a replacement for the first site,
+they would be different. Another common design would be a private
+OpenStack cloud with a replicated site that would be used for high
+availability or disaster recovery. The most important design decision
+is configuring storage as a single shared pool or separate pools,
+depending on user and technical requirements.</para>
+<figure xml:id="multi-site_arch">
+<title>Multi-site OpenStack architecture</title>
 <mediaobject>
 <imageobject>
-<imagedata contentwidth="4in"
+<imagedata contentwidth="6in"
 fileref="../figures/Multi-Site_shared_keystone_horizon_swift1.png"/>
 </imageobject>
 </mediaobject>
+</figure>
 <section xml:id="openstack-services-architecture">
 <title>OpenStack services architecture</title>
 <para>The OpenStack Identity service, which is used by all other
-OpenStack components for authorization and the catalog of service
-endpoints, supports the concept of regions. A region is a logical
-construct that can be used to group OpenStack services that are in
-close proximity to one another. The concept of regions is flexible;
-it may can contain OpenStack service endpoints located within a
-distinct geographic region, or regions. It may be smaller in scope,
-where a region is a single rack within a data center or even a
-single blade chassis, with multiple regions existing in adjacent
+OpenStack components for authorization and the catalog of
+service endpoints, supports the concept of regions. A region
+is a logical construct used to group OpenStack services in
+close proximity to one another. The concept of
+regions is flexible; it can contain OpenStack service
+endpoints located within a distinct geographic region or regions.
+It may be smaller in scope, where a region is a single rack
+within a data center, with multiple regions existing in adjacent
 racks in the same data center.</para>
-<para>The majority of OpenStack components are designed to run within
-the context of a single region. The OpenStack Compute service is
-designed to manage compute resources within a region, with support
-for subdivisions of compute resources by using availability zones
-and cells. The OpenStack Networking service can be used to manage
-network resources in the same broadcast domain or collection of
-switches that are linked. The OpenStack Block Storage service
-controls storage resources within a region with all storage
-resources residing on the same storage network. Like the OpenStack
-Compute service, the OpenStack Block Storage service also supports
-the availability zone construct which can be used to subdivide
-storage resources.</para>
+<para>The majority of OpenStack components are designed to run
+within the context of a single region. The OpenStack Compute
+service is designed to manage compute resources within a region,
+with support for subdivisions of compute resources by using
+availability zones and cells. The OpenStack Networking service
+can be used to manage network resources in the same broadcast
+domain or collection of switches that are linked. The OpenStack
+Block Storage service controls storage resources within a region
+with all storage resources residing on the same storage network.
+Like the OpenStack Compute service, the OpenStack Block Storage
+service also supports the availability zone construct which can
+be used to subdivide storage resources.</para>
 <para>The OpenStack dashboard, OpenStack Identity, and OpenStack
 Object Storage services are components that can each be deployed
 centrally in order to serve multiple regions.</para>
 </section>
 <section xml:id="arch-multi-storage">
 <title>Storage</title>
-<para>With multiple OpenStack regions, having a single OpenStack Object
-Storage service endpoint that delivers shared file storage for all
-regions is desirable. The Object Storage service internally
-replicates files to multiple nodes. The advantages of this are that,
-if a file placed into the Object Storage service is visible to all
-regions, it can be used by applications or workloads in any or all
-of the regions. This simplifies high availability failover and
-disaster recovery rollback.</para>
+<para>With multiple OpenStack regions, it is recommended to configure
+a single OpenStack Object Storage service endpoint to deliver
+shared file storage for all regions. The Object Storage service
+internally replicates files to multiple nodes which can be used
+by applications or workloads in multiple regions. This simplifies
+high availability failover and disaster recovery rollback.</para>
 <para>In order to scale the Object Storage service to meet the workload
 of multiple regions, multiple proxy workers are run and
 load-balanced, storage nodes are installed in each region, and the
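The region construct described in this hunk can be pictured as region-tagged entries in the service catalog. A minimal Python sketch of region-scoped endpoint lookup follows; the service names and URLs are hypothetical placeholders, not taken from this commit.

# Minimal sketch of region-scoped endpoint lookup; the catalog
# contents below are hypothetical placeholders.
CATALOG = [
    {"service": "compute", "region": "RegionOne", "url": "https://compute.r1.example.com:8774"},
    {"service": "compute", "region": "RegionTwo", "url": "https://compute.r2.example.com:8774"},
    {"service": "volume",  "region": "RegionOne", "url": "https://volume.r1.example.com:8776"},
]

def endpoint_for(service, region):
    """Return the endpoint URL for a service in the given region."""
    for entry in CATALOG:
        if entry["service"] == service and entry["region"] == region:
            return entry["url"]
    raise LookupError("no %s endpoint registered in %s" % (service, region))

print(endpoint_for("compute", "RegionTwo"))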
@@ -68,19 +70,20 @@
 reducing the actual load on the storage network. In addition to an
 HTTP caching layer, use a caching layer like Memcache to cache
 objects between the proxy and storage nodes.</para>
-<para>If the cloud is designed without a single Object Storage Service
-endpoint for multiple regions, and instead a separate Object Storage
-Service endpoint is made available in each region, applications are
+<para>If the cloud is designed with a separate Object Storage
+Service endpoint made available in each region, applications are
 required to handle synchronization (if desired) and other management
 operations to ensure consistency across the nodes. For some
 applications, having multiple Object Storage Service endpoints
 located in the same region as the application may be desirable due
 to reduced latency, cross region bandwidth, and ease of
 deployment.</para>
-<para>For the Block Storage service, the most important decisions are
-the selection of the storage technology and whether or not a
-dedicated network is used to carry storage traffic from the storage
-service to the compute nodes.</para>
+<note>
+<para>For the Block Storage service, the most important decisions
+are the selection of the storage technology, and whether
+a dedicated network is used to carry storage traffic
+from the storage service to the compute nodes.</para>
+</note>
 </section>
 <section xml:id="arch-networking-multiple">
 <title>Networking</title>
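The caching layer recommended above sits between the proxy and the storage nodes. A rough cache-aside sketch in Python, where a plain dict stands in for Memcache and fetch_from_storage() is a placeholder for a storage-node read:

# Cache-aside sketch for the proxy/storage path; a dict stands in
# for Memcache and fetch_from_storage() is a placeholder.
cache = {}

def fetch_from_storage(key):
    # Placeholder for a read from a storage node.
    return ("object-bytes-for-%s" % key).encode()

def get_object(key):
    """Serve from cache when possible; otherwise read through."""
    if key in cache:
        return cache[key]            # cache hit: no storage-network I/O
    value = fetch_from_storage(key)  # cache miss: hit a storage node
    cache[key] = value
    return value

get_object("container/object")  # miss, populates the cache
get_object("container/object")  # hit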
@@ -100,18 +103,19 @@
 </section>
 <section xml:id="arch-dependencies-multiple">
 <title>Dependencies</title>
-<para>The architecture for a multi-site installation of OpenStack is
-dependent on a number of factors. One major dependency to consider
-is storage. When designing the storage system, the storage mechanism
-needs to be determined. Once the storage type is determined, how it
-is accessed is critical. For example, we recommend that
-storage should use a dedicated network. Another concern is how
-the storage is configured to protect the data. For example, the
-recovery point objective (RPO) and the recovery time objective
-(RTO). How quickly can the recovery from a fault be completed,
-determines how often the replication of data is required. Ensure that
-enough storage is allocated to support the data protection
-strategy.</para>
+<para>The architecture for a multi-site OpenStack installation
+is dependent on a number of factors. One major dependency to
+consider is storage. When designing the storage system, the
+storage mechanism needs to be determined. Once the storage
+type is determined, how it is accessed is critical. For example,
+we recommend that storage should use a dedicated network.
+Another concern is how the storage is configured to protect
+the data, for example, the Recovery Point Objective (RPO) and
+the Recovery Time Objective (RTO). How quickly recovery from
+a fault can be completed determines how often the replication of
+data is required. Ensure that enough storage is allocated to
+support the data protection strategy.
+</para>
 <para>Networking decisions include the encapsulation mechanism that can
 be used for the tenant networks, how large the broadcast domains
 should be, and the contracted SLAs for the interconnects.</para>
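The RPO wording above implies a replication schedule. A back-of-the-envelope Python sketch with purely illustrative figures, assuming the worst-case loss is the replication interval plus the duration of one replication pass:

# Back-of-the-envelope sketch: how often to replicate so that the
# worst-case data loss stays within the RPO. Numbers are illustrative.
rpo_minutes = 60          # tolerate at most one hour of lost data
replication_run = 10      # minutes a replication pass takes to complete

# Worst case, a fault hits just before a pass finishes, so the last
# durable copy is (interval + run time) old.
max_interval = rpo_minutes - replication_run
print("replicate at least every %d minutes" % max_interval)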
@@ -6,16 +6,14 @@
 xml:id="operational-considerations-multi-site">
 <?dbhtml stop-chunking?>
 <title>Operational considerations</title>
-<para>Deployment of a multi-site OpenStack cloud using regions
+<para>Multi-site OpenStack cloud deployment using regions
 requires that the service catalog contains per-region entries
-for each service deployed other than the Identity service
-itself. There is limited support amongst currently available
-off-the-shelf OpenStack deployment tools for defining multiple
-regions in this fashion.</para>
-<para>Deployers must be aware of this and provide the appropriate
+for each service deployed other than the Identity service. Most
+off-the-shelf OpenStack deployment tools have limited support
+for defining multiple regions in this fashion.</para>
+<para>Deployers should be aware of this and provide the appropriate
 customization of the service catalog for their site either
-manually or via customization of the deployment tools in
-use.</para>
+manually, or by customizing deployment tools in use.</para>
 <note><para>As of the Kilo release, documentation for
 implementing this feature is in progress. See this bug for
 more information:
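The per-region catalog requirement in this hunk can be checked mechanically. A small Python sketch that flags services missing an endpoint in some region; the region and service names are hypothetical:

# Sketch: check that every deployed service has a catalog entry in
# every region (region and service names are hypothetical).
regions = {"RegionOne", "RegionTwo"}
catalog = {
    "compute": {"RegionOne", "RegionTwo"},
    "image": {"RegionOne"},                  # missing in RegionTwo
    "identity": {"RegionOne", "RegionTwo"},  # shared central service
}

for service, present in sorted(catalog.items()):
    for region in sorted(regions - present):
        print("missing endpoint: %s in %s" % (service, region))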
@@ -31,51 +29,46 @@
 host operating systems, guest operating systems, OpenStack
 distributions (if applicable), software-defined infrastructure
 including network controllers and storage systems, and even
-individual applications need to be evaluated in light of the
-multi-site nature of the cloud.</para>
+individual applications need to be evaluated.</para>
 <para>Topics to consider include:</para>
 <itemizedlist>
 <listitem>
-<para>The specific definition of what constitutes a site
+<para>The definition of what constitutes a site
 in the relevant licenses, as the term does not
 necessarily denote a geographic or otherwise
-physically isolated location in the traditional
-sense.</para>
+physically isolated location.</para>
 </listitem>
 <listitem>
 <para>Differentiations between "hot" (active) and "cold"
-(inactive) sites where significant savings may be made
+(inactive) sites, where significant savings may be made
 in situations where one site is a cold standby for
 disaster recovery purposes only.</para>
 </listitem>
 <listitem>
 <para>Certain locations might require local vendors to
-provide support and services for each site provides
-challenges, but will vary on the licensing agreement
-in place.</para>
+provide support and services for each site, which may vary
+with the licensing agreement in place.</para>
 </listitem>
 </itemizedlist></section>
 <section xml:id="logging-and-monitoring-multi-site">
 <title>Logging and monitoring</title>
 <para>Logging and monitoring does not significantly differ for a
-multi-site OpenStack cloud. The same well known tools
-described in the <link
+multi-site OpenStack cloud. The tools described in the <link
 xlink:href="http://docs.openstack.org/openstack-ops/content/logging_monitoring.html">Logging
 and monitoring chapter</link> of the <citetitle>Operations
 Guide</citetitle> remain applicable. Logging and monitoring
-can be provided both on a per-site basis and in a common
+can be provided on a per-site basis, and in a common
 centralized location.</para>
 <para>When attempting to deploy logging and monitoring facilities
-to a centralized location, care must be taken with regards to
-the load placed on the inter-site networking links.</para></section>
+to a centralized location, care must be taken with the load
+placed on the inter-site networking links.</para></section>
 <section xml:id="upgrades-multi-site">
 <title>Upgrades</title>
-<para>In multi-site OpenStack clouds deployed using regions each
-site is, effectively, an independent OpenStack installation
-which is linked to the others by using centralized services
-such as Identity which are shared between sites. At a high
-level the recommended order of operations to upgrade an
-individual OpenStack environment is (see the <link
+<para>In multi-site OpenStack clouds deployed using regions, sites
+are independent OpenStack installations which are linked
+together using shared centralized services such as OpenStack
+Identity. At a high level the recommended order of operations
+to upgrade an individual OpenStack environment is (see the <link
 xlink:href="http://docs.openstack.org/openstack-ops/content/ops_upgrades-general-steps.html">Upgrades
 chapter</link> of the <citetitle>Operations Guide</citetitle>
 for details):</para>
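The caution above about the load that centralized logging places on inter-site links can be estimated ahead of time. A Python sketch with illustrative, assumed figures:

# Rough sketch of the inter-site load from centralized logging;
# all figures are illustrative assumptions.
nodes_per_site = 200
log_rate_kbps_per_node = 16   # average shipping rate per node
sites_forwarding = 3          # sites forwarding to the central location

total_mbps = nodes_per_site * log_rate_kbps_per_node * sites_forwarding / 1000.0
print("approx. %.1f Mbit/s of log traffic on the inter-site links" % total_mbps)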
@@ -123,22 +116,20 @@
 shared.</para>
 </listitem>
 </orderedlist>
-<para>Note that Compute
-upgrades within each site can also be performed in a rolling
+<para>Compute upgrades within each site can also be performed in a rolling
 fashion. Compute controller services (API, Scheduler, and
 Conductor) can be upgraded prior to upgrading of individual
-compute nodes. This maximizes the ability of operations staff
-to keep a site operational for users of compute services while
-performing an upgrade.</para></section>
+compute nodes. This allows operations staff to keep a site
+operational for users of Compute services while performing an
+upgrade.</para></section>
 <section xml:id="quota-management-multi-site">
 <title>Quota management</title>
-<para>To prevent system capacities from being exhausted without
-notification, OpenStack provides operators with the ability to
-define quotas. Quotas are used to set operational limits and
-are currently enforced at the tenant (or project) level rather
-than at the user level.</para>
-<para>Quotas are defined on a per-region basis. Operators may wish
-to define identical quotas for tenants in each region of the
+<para>Quotas are used to set operational limits to prevent system
+capacities from being exhausted without notification. They are
+currently enforced at the tenant (or project) level rather than
+at the user level.</para>
+<para>Quotas are defined on a per-region basis. Operators can
+define identical quotas for tenants in each region of the
 cloud to provide a consistent experience, or even create a
 process for synchronizing allocated quotas across regions. It
 is important to note that only the operational limits imposed
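Keeping per-region quotas identical, as suggested above, reduces to a drift check against a reference set. A Python sketch with hypothetical quota keys and values:

# Sketch: detect per-region quota drift from a reference set
# (keys and values are hypothetical).
reference = {"instances": 20, "cores": 80, "ram_mb": 163840}
per_region = {
    "RegionOne": {"instances": 20, "cores": 80, "ram_mb": 163840},
    "RegionTwo": {"instances": 10, "cores": 80, "ram_mb": 163840},
}

for region, quotas in sorted(per_region.items()):
    for key, want in reference.items():
        have = quotas.get(key)
        if have != want:
            print("%s: %s is %s, expected %s" % (region, key, have, want))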
@@ -161,24 +152,22 @@
 Control (RBAC) policies, defined in a <filename>policy.json</filename> file, for
 each service. Operators edit these files to customize the
 policies for their OpenStack installation. If the application
-of consistent RBAC policies across sites is considered a
-requirement, then it is necessary to ensure proper
-synchronization of the <filename>policy.json</filename> files to all
-installations.</para>
-<para>This must be done using normal system administration tools
-such as rsync as no functionality for synchronizing policies
-across regions is currently provided within OpenStack.</para></section>
+of consistent RBAC policies across sites is a requirement, then
+it is necessary to ensure proper synchronization of the
+<filename>policy.json</filename> files to all installations.</para>
+<para>This must be done using system administration tools
+such as rsync, as functionality for synchronizing policies
+across regions is not currently provided within OpenStack.</para></section>
 <section xml:id="documentation-multi-site">
 <title>Documentation</title>
 <para>Users must be able to leverage cloud infrastructure and
 provision new resources in the environment. It is important
-that user documentation is accessible by users of the cloud
-infrastructure to ensure they are given sufficient information
-to help them leverage the cloud. As an example, by default
-OpenStack schedules instances on a compute node
+that user documentation is accessible by users to ensure they
+are given sufficient information to help them leverage the cloud.
+As an example, by default OpenStack schedules instances on a compute node
 automatically. However, when multiple regions are available,
-it is left to the end user to decide in which region to
-schedule the new instance. The dashboard presents the user with
+the end user needs to decide in which region to schedule the
+new instance. The dashboard presents the user with
 the first region in your configuration. The API and CLI tools
 do not execute commands unless a valid region is specified.
 It is therefore important to provide documentation to your
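A minimal Python sketch of the rsync-based policy.json synchronization mentioned above; the controller hostnames and file path are placeholders, and a real deployment would repeat this for each service's policy file:

# Sketch of pushing a common policy.json to each region's controllers
# with rsync over SSH; hostnames and the path are placeholders.
import subprocess

controllers = ["ctl1.region1.example.com", "ctl1.region2.example.com"]

for host in controllers:
    subprocess.check_call([
        "rsync", "-az", "/etc/nova/policy.json",
        "%s:/etc/nova/policy.json" % host,
    ])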
@@ -22,10 +22,10 @@
 very sensitive to latency and needs a rapid response to
 end-users. After reviewing the user, technical and operational
 considerations, it is determined beneficial to build a number
-of regions local to the customer's edge. In this case rather
-than build a few large, centralized data centers, the intent
-of the architecture is to provide a pair of small data centers
-in locations that are closer to the customer. In this use
+of regions local to the customer's edge. Rather than build a
+few large, centralized data centers, the intent of the architecture
+is to provide a pair of small data centers in locations that
+are closer to the customer. In this use
 case, spreading applications out allows for different
 horizontal scaling than a traditional compute workload scale.
 The intent is to scale by creating more copies of the
@@ -60,44 +60,47 @@
 expanding the capacity of all regions simultaneously,
 therefore maximizing the cost-effectiveness of the multi-site
 design.</para>
-<para>One of the key decisions of running this sort of
-infrastructure is whether or not to provide a redundancy
+<para>One of the key decisions of running this infrastructure is
+whether or not to provide a redundancy
 model. Two types of redundancy and high availability models in
 this configuration can be implemented. The first type
-revolves around the availability of the central OpenStack
+is the availability of central OpenStack
 components. Keystone can be made highly available in three
 central data centers that host the centralized OpenStack
 components. This prevents a loss of any one of the regions
 causing an outage in service. It also has the added benefit of
 being able to run a central storage repository as a primary
 cache for distributing content to each of the regions.</para>
-<para>The second redundancy topic is that of the edge data center
-itself. A second data center in each of the edge regional
-locations house a second region near the first. This
+<para>The second redundancy type is the edge data center itself.
+A second data center in each of the edge regional
+locations houses a second region near the first region. This
 ensures that the application does not suffer degraded
 performance in terms of latency and availability.</para>
-<para>This figure depicts the solution designed to have both a
-centralized set of core data centers for OpenStack services
-and paired edge data centers:</para>
-<mediaobject>
+<para><xref linkend="multi-site_customer_edge"/> depicts
+the solution designed to have both a centralized set of core
+data centers for OpenStack services and paired edge data centers:</para>
+<figure xml:id="multi-site_customer_edge">
+<title>Multi-site architecture example</title>
+<mediaobject>
 <imageobject>
-<imagedata contentwidth="4in"
+<imagedata contentwidth="6in"
 fileref="../figures/Multi-Site_Customer_Edge.png"/>
 </imageobject>
-</mediaobject>
+</mediaobject>
+</figure>
 <section xml:id="geo-redundant-load-balancing">
 <title>Geo-redundant load balancing</title>
 <para>A large-scale web application has been designed with cloud
 principles in mind. The application is designed to provide
 service to an application store, on a 24/7 basis. The company has
-typical 2-tier architecture with a web front-end servicing the
-customer requests and a NoSQL database back end storing the
+a typical two tier architecture with a web front-end servicing the
+customer requests, and a NoSQL database back end storing the
 information.</para>
 <para>As of late there have been several outages in a number of major
-public cloud providers—usually due to the fact these
-applications were running out of a single geographical
-location. The design therefore should mitigate the chance of a
-single site causing an outage for their business.</para>
+public cloud providers due to applications running out of
+a single geographical location. The design therefore should
+mitigate the chance of a single site causing an outage for their
+business.</para>
 <para>The solution would consist of the following OpenStack
 components:</para>
 <itemizedlist>
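One hedged way to picture the geo-redundant routing decision in this section: measure TCP connect time to each region's API endpoint and prefer the fastest. A Python sketch with placeholder hostnames:

# Sketch: choose the closest region by timing a TCP connect to each
# region's API endpoint (hostnames are placeholders).
import socket
import time

endpoints = {
    "edge-east": ("api.east.example.com", 443),
    "edge-west": ("api.west.example.com", 443),
}

def connect_time(addr, timeout=2.0):
    start = time.time()
    try:
        socket.create_connection(addr, timeout=timeout).close()
        return time.time() - start
    except OSError:
        return float("inf")   # unreachable regions sort last

best = min(endpoints, key=lambda name: connect_time(endpoints[name]))
print("route new requests to", best)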
@@ -108,12 +111,11 @@
 <listitem>
 <para>OpenStack Controller services running Networking,
 dashboard, Block Storage and Compute running locally in
-each of the three regions. The other services,
-Identity, Orchestration, Telemetry, Image service and
-Object Storage can be
-installed centrally—with nodes in each of the region
-providing a redundant OpenStack Controller plane
-throughout the globe.</para>
+each of the three regions. Identity service, Orchestration
+service, Telemetry service, Image service and
+Object Storage can be installed centrally, with
+nodes in each of the regions providing a redundant
+OpenStack Controller plane throughout the globe.</para>
 </listitem>
 <listitem>
 <para>OpenStack Compute nodes running the KVM
@@ -126,9 +128,9 @@
 replicated on a regular basis.</para>
 </listitem>
 <listitem>
-<para>A Distributed DNS service available to all
-regions—that allows for dynamic update of DNS records of
-deployed instances.</para>
+<para>A distributed DNS service available to all
+regions that allows for dynamic update of DNS
+records of deployed instances.</para>
 </listitem>
 <listitem>
 <para>A geo-redundant load balancing service can be used
@@ -153,10 +155,10 @@
 </listitem>
 </itemizedlist>
 <para>Another autoscaling Heat template can be used to deploy a
-distributed MongoDB shard over the three locations—with the
+distributed MongoDB shard over the three locations, with the
 option of storing required data on a globally available swift
 container. According to the usage and load on the database
-server—additional shards can be provisioned according to
+server, additional shards can be provisioned according to
 the thresholds defined in Telemetry.</para>
 <!-- <para>The reason that three regions were selected here was because of
 the fear of having abnormal load on a single region in the
|
||||
autoscaling and auto healing in the event of increased load.
|
||||
Additional configuration management tools, such as Puppet or
|
||||
Chef could also have been used in this scenario, but were not
|
||||
chosen due to the fact that Orchestration had the appropriate built-in
|
||||
hooks into the OpenStack cloud—whereas the other tools were
|
||||
external and not native to OpenStack. In addition—since this
|
||||
deployment scenario was relatively straight forward—the
|
||||
external tools were not needed.</para>
|
||||
<para>
|
||||
OpenStack Object Storage is used here to serve as a back end for
|
||||
chosen since Orchestration had the appropriate built-in
|
||||
hooks into the OpenStack cloud, whereas the other tools were
|
||||
external and not native to OpenStack. In addition, external
|
||||
tools were not needed since this deployment scenario was straight
|
||||
forward.</para>
|
||||
<para>OpenStack Object Storage is used here to serve as a back end for
|
||||
the Image service since it is the most suitable solution for a
|
||||
globally distributed storage solution—with its own
|
||||
globally distributed storage solution with its own
|
||||
replication mechanism. Home grown solutions could also have
|
||||
been used including the handling of replication—but were not
|
||||
been used including the handling of replication, but were not
|
||||
chosen, because Object Storage is already an intricate part of the
|
||||
infrastructure—and proven solution.</para>
|
||||
infrastructure and a proven solution.</para>
|
||||
<para>An external load balancing service was used and not the
|
||||
LBaaS in OpenStack because the solution in OpenStack is not
|
||||
redundant and does not have any awareness of geo location.</para>
|
||||
<mediaobject>
|
||||
<figure xml:id="multi-site_geo_redundant">
|
||||
<title>Multi-site geo-redundant architecture</title>
|
||||
<mediaobject>
|
||||
<imageobject>
|
||||
<imagedata contentwidth="4in"
|
||||
<imagedata contentwidth="6in"
|
||||
fileref="../figures/Multi-site_Geo_Redundant_LB.png"/>
|
||||
</imageobject>
|
||||
</mediaobject></section>
|
||||
<section xml:id="location-local-services"><title>Location-local service</title>
|
||||
<para>A common use for a multi-site deployment of OpenStack, is
|
||||
for creating a Content Delivery Network. An application that
|
||||
</mediaobject>
|
||||
</figure>
|
||||
</section>
|
||||
<section xml:id="location-local-services">
|
||||
<title>Location-local service</title>
|
||||
<para>A common use for multi-site OpenStack deployment is
|
||||
creating a Content Delivery Network. An application that
|
||||
uses a location-local architecture requires low network
|
||||
latency and proximity to the user, in order to provide an
|
||||
optimal user experience, in addition to reducing the cost of
|
||||
bandwidth and transit, since the content resides on sites
|
||||
closer to the customer, instead of a centralized content store
|
||||
that requires utilizing higher cost cross-country links.</para>
|
||||
<para>This architecture usually includes a geo-location component
|
||||
that places user requests at the closest possible node. In
|
||||
latency and proximity to the user to provide an
|
||||
optimal user experience and reduce the cost of bandwidth and
|
||||
transit. The content resides on sites closer to the customer,
|
||||
instead of a centralized content store that requires utilizing
|
||||
higher cost cross-country links.</para>
|
||||
<para>This architecture includes a geo-location component
|
||||
that places user requests to the closest possible node. In
|
||||
this scenario, 100% redundancy of content across every site is
|
||||
a goal rather than a requirement, with the intent being to
|
||||
maximize the amount of content available that is within a
|
||||
minimum number of network hops for any given end user. Despite
|
||||
a goal rather than a requirement, with the intent to
|
||||
maximize the amount of content available within a
|
||||
minimum number of network hops for end users. Despite
|
||||
these differences, the storage replication configuration has
|
||||
significant overlap with that of a geo-redundant load
|
||||
balancing use case.</para>
|
||||
<para>In this example, the application utilizing this multi-site
|
||||
OpenStack install that is location aware would launch web
|
||||
server or content serving instances on the compute cluster in
|
||||
each site. Requests from clients are first sent to a
|
||||
global services load balancer that determines the location of
|
||||
the client, then routes the request to the closest OpenStack
|
||||
site where the application completes the request.</para>
|
||||
<mediaobject>
|
||||
<para>In <xref linkend="multi-site_shared_shared_keystone"/>,
|
||||
the application utilizing this multi-site OpenStack install
|
||||
that is location-aware would launch web server or content
|
||||
serving instances on the compute cluster in each site. Requests
|
||||
from clients are first sent to a global services load balancer
|
||||
that determines the location of the client, then routes the
|
||||
request to the closest OpenStack site where the application
|
||||
completes the request.</para>
|
||||
<figure xml:id="multi-site_shared_shared_keystone">
|
||||
<title>Multi-site shared keystone architecture</title>
|
||||
<mediaobject>
|
||||
<imageobject>
|
||||
<imagedata contentwidth="4in"
|
||||
<imagedata contentwidth="6in"
|
||||
fileref="../figures/Multi-Site_shared_keystone1.png"/>
|
||||
</imageobject>
|
||||
</mediaobject></section>
|
||||
</mediaobject>
|
||||
</figure>
|
||||
</section>
|
||||
</section>
|
||||
|
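The geo-location step in this use case can be approximated by great-circle distance when latency data is unavailable. A Python sketch with placeholder site coordinates:

# Sketch of a geo-location step: route a client to the nearest site
# by great-circle distance (site coordinates are placeholders).
from math import asin, cos, radians, sin, sqrt

SITES = {"site-nyc": (40.71, -74.01), "site-sfo": (37.77, -122.42)}

def haversine_km(a, b):
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(h))

client = (34.05, -118.24)   # e.g. a client in Los Angeles
print(min(SITES, key=lambda s: haversine_km(client, SITES[s])))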
@@ -27,105 +27,108 @@
 high-bandwidth links available between them, it may be wise to
 configure a separate storage replication network between the
 two sites to support a single Swift endpoint and a shared
-object storage capability between them. (An example of this
+Object Storage capability between them. An example of this
 technique, as well as a configuration walk-through, is
 available at <link
-xlink:href="http://docs.openstack.org/developer/swift/replication_network.html#dedicated-replication-network">http://docs.openstack.org/developer/swift/replication_network.html#dedicated-replication-network</link>).
+xlink:href="http://docs.openstack.org/developer/swift/replication_network.html#dedicated-replication-network">http://docs.openstack.org/developer/swift/replication_network.html#dedicated-replication-network</link>.
 Another option in this scenario is to build a dedicated set of
-tenant private networks across the secondary link using
+tenant private networks across the secondary link, using
 overlay networks with a third party mapping the site overlays
 to each other.</para>
 <para>The capacity requirements of the links between sites are
-driven by application behavior. If the latency of the links is
+driven by application behavior. If the link latency is
 too high, certain applications that use a large number of
 small packets, for example RPC calls, may encounter issues
 communicating with each other or operating properly.
 Additionally, OpenStack may encounter similar types of issues.
-To mitigate this, tuning of the Identity service call timeouts may be
-necessary to prevent issues authenticating against a central
+To mitigate this, Identity service call timeouts can be
+tuned to prevent issues authenticating against a central
 Identity service.</para>
-<para>Another capacity consideration when it comes to networking
-for a multi-site deployment is the available amount and
-performance of overlay networks for tenant networks. If using
-shared tenant networks across zones, it is imperative that an
-external overlay manager or controller be used to map these
-overlays together. It is necessary to ensure the amount of
-possible IDs between the zones are identical. Note that, as of
-the Kilo release, OpenStack Networking was not capable of managing
-tunnel IDs across installations. This means that if one site
-runs out of IDs, but other does not, that tenant's network
-is unable to reach the other site.</para>
+<para>Another network capacity consideration for a multi-site
+deployment is the amount and performance of overlay networks
+available for tenant networks. If using shared tenant networks
+across zones, it is imperative that an external overlay manager
+or controller be used to map these overlays together. It is
+necessary to ensure the number of possible IDs in each zone
+is identical.</para>
+<note>
+<para>As of the Kilo release, OpenStack Networking was not
+capable of managing tunnel IDs across installations. So if
+one site runs out of IDs, but another does not, that tenant's
+network is unable to reach the other site.</para>
+</note>
 <para>Capacity can take other forms as well. The ability for a
 region to grow depends on scaling out the number of available
 compute nodes. This topic is covered in greater detail in the
-section for compute-focused deployments. However, it should be
-noted that cells may be necessary to grow an individual region
-beyond a certain point. This point depends on the size of your
-cluster and the ratio of virtual machines per
+section for compute-focused deployments. However, it may be
+necessary to grow cells in an individual region, depending on
+the size of your cluster and the ratio of virtual machines per
 hypervisor.</para>
 <para>A third form of capacity comes in the multi-region-capable
 components of OpenStack. Centralized Object Storage is capable
 of serving objects through a single namespace across multiple
-regions. Since this works by accessing the object store via
+regions. Since this works by accessing the object store through
 swift proxy, it is possible to overload the proxies. There are
-two options available to mitigate this issue. The first is to
-deploy a large number of swift proxies. The drawback to this
-is that the proxies are not load-balanced and a large file
-request could continually hit the same proxy. The other way to
-mitigate this is to front-end the proxies with a caching HTTP
-proxy and load balancer. Since swift objects are returned to
-the requester via HTTP, this load balancer would alleviate the
-load required on the swift proxies.</para>
+two options available to mitigate this issue:</para>
+<itemizedlist>
+<listitem>
+<para>Deploy a large number of swift proxies. The drawback is
+that the proxies are not load-balanced and a large file
+request could continually hit the same proxy.</para>
+</listitem>
+<listitem>
+<para>Add a caching HTTP proxy and load balancer in front of
+the swift proxies. Since swift objects are returned to the
+requester via HTTP, this load balancer would alleviate the
+load required on the swift proxies.</para>
+</listitem>
+</itemizedlist>
 <section xml:id="utilization-multi-site"><title>Utilization</title>
 <para>While constructing a multi-site OpenStack environment is the
 goal of this guide, the real test is whether an application
 can utilize it.</para>
-<para>Identity is normally the first interface for the majority of
-OpenStack users. Interacting with the Identity service is required for
-almost all major operations within OpenStack. Therefore, it is
-important to ensure that you provide users with a single URL
-for Identity service authentication. Equally important is proper
-documentation and configuration of regions within the Identity service.
+<para>The Identity service is normally the first interface for
+OpenStack users and is required for almost all major operations
+within OpenStack. Therefore, it is important that you provide users
+with a single URL for Identity service authentication, and
+document the configuration of regions within the Identity service.
 Each of the sites defined in your installation is considered
 to be a region in Identity nomenclature. This is important for
-the users of the system, when reading Identity documentation,
-as it is required to define the region name when providing
-actions to an API endpoint or in the dashboard.</para>
+the users, as it is required to define the region name when
+providing actions to an API endpoint or in the dashboard.</para>
 <para>Load balancing is another common issue with multi-site
 installations. While it is still possible to run HAproxy
-instances with Load-Balancer-as-a-Service, these are local
-to a specific region. Some applications may be able to cope
-with this via internal mechanisms. Others, however, may
-require the implementation of an external system including
-global services load balancers or anycast-advertised
-DNS.</para>
+instances with Load-Balancer-as-a-Service, these are defined
+to a specific region. Some applications can manage this using
+internal mechanisms. Other applications may require the
+implementation of an external system, including global services
+load balancers or anycast-advertised DNS.</para>
 <para>Depending on the storage model chosen during site design,
 storage replication and availability are also a concern
-for end-users. If an application is capable of understanding
-regions, then it is possible to keep the object storage system
-separated by region. In this case, users who want to have an
-object available to more than one region need to do the
-cross-site replication themselves. With a centralized swift
-proxy, however, the user may need to benchmark the replication
-timing of the Object Storage back end. Benchmarking allows the
-operational staff to provide users with an understanding of
-the amount of time required for a stored or modified object to
-become available to the entire environment.</para></section>
+for end-users. If an application can support regions, then it
+is possible to keep the object storage system separated by region.
+In this case, users who want to have an object available to
+more than one region need to perform cross-site replication.
+However, with a centralized swift proxy, the user may need to
+benchmark the replication timing of the Object Storage back end.
+Benchmarking allows the operational staff to provide users with
+an understanding of the amount of time required for a stored or
+modified object to become available to the entire environment.</para>
+</section>
 <section xml:id="performance"><title>Performance</title>
 <para>Determining the performance of a multi-site installation
 involves considerations that do not come into play in a
 single-site deployment. Being a distributed deployment,
-multi-site deployments incur a few extra penalties to
-performance in certain situations.</para>
+performance in multi-site deployments may be affected in certain
+situations.</para>
 <para>Since multi-site systems can be geographically separated,
-they may have worse than normal latency or jitter when
-communicating across regions. This can especially impact
-systems like the OpenStack Identity service when making
-authentication attempts from regions that do not contain the
-centralized Identity implementation. It can also affect
-certain applications which rely on remote procedure call (RPC)
-for normal operation. An example of this can be seen in High
-Performance Computing workloads.</para>
+there may be greater latency or jitter when communicating across
+regions. This can especially impact systems like the OpenStack
+Identity service when making authentication attempts from regions
+that do not contain the centralized Identity implementation. It
+can also affect applications which rely on Remote Procedure Call (RPC)
+for normal operation. An example of this can be seen in high
+performance computing workloads.</para>
 <para>Storage availability can also be impacted by the
 architecture of a multi-site deployment. A centralized Object
 Storage service requires more time for an object to be
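The timeout tuning mentioned above is normally done in each service's configuration; as a hedged illustration only, a Python sketch of an explicit timeout plus simple retries around a cross-region HTTP call (the URL is a placeholder):

# Sketch: wrap a cross-region HTTP call with an explicit timeout and
# simple retries; the URL below is a placeholder.
import time
import urllib.request

def fetch_with_retry(url, timeout=5.0, retries=3, backoff=2.0):
    for attempt in range(retries):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                return resp.read()
        except OSError:
            if attempt == retries - 1:
                raise
            time.sleep(backoff * (attempt + 1))  # wait longer each retry

# fetch_with_retry("https://identity.central.example.com:5000/v3")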
@@ -137,4 +140,37 @@
 to manually cope with this limitation by creating duplicate
 block storage entries in each region.</para>
 </section>
+<section xml:id="openstack-components_multi-site">
+<title>OpenStack components</title>
+<para>Most OpenStack installations require a bare minimum set of
+pieces to function. These include the OpenStack Identity
+(keystone) for authentication, OpenStack Compute
+(nova) for compute, OpenStack Image service (glance) for image
+storage, OpenStack Networking (neutron) for networking, and
+potentially an object store in the form of OpenStack Object
+Storage (swift). Deploying a multi-site installation also demands extra
+components in order to coordinate between regions. A centralized
+Identity service is necessary to provide the single authentication
+point. A centralized dashboard is also recommended to provide a
+single login point and a mapping to the API and CLI
+options available. A centralized Object Storage service may also
+be used, but will require the installation of the swift proxy
+service.</para>
+<para>It may also be helpful to install a few extra options in
+order to facilitate certain use cases. For example,
+installing Designate may assist in automatically generating
+DNS domains for each region with an automatically-populated
+zone full of resource records for each instance. This
+facilitates using DNS as a mechanism for determining which
+region will be selected for certain applications.</para>
+<para>Another useful tool for managing a multi-site installation
+is Orchestration (heat). The Orchestration module allows the
+use of templates to define a set of instances to be launched
+together or for scaling existing sets. It can also be used to
+set up matching or differentiated groupings based on
+regions. For instance, if an application requires an equally
+balanced number of nodes across sites, the same heat template
+can be used to cover each site with small alterations to only
+the region name.</para>
+</section>
 </section>
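The closing point about reusing one heat template per region can be sketched as generating per-region stack inputs from shared values. The template name, parameters, and region names below are hypothetical:

# Sketch: reuse one Heat template across regions by varying only the
# region-specific inputs (names and values are hypothetical).
TEMPLATE = "web_tier.yaml"   # the same template file for every region

regions = ["RegionOne", "RegionTwo", "RegionThree"]

def stack_inputs(region):
    return {
        "template": TEMPLATE,
        "stack_name": "web-%s" % region.lower(),
        "parameters": {"region_name": region, "node_count": 4},
    }

for region in regions:
    print(stack_inputs(region))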
@@ -6,55 +6,16 @@
 xml:id="user-requirements-multi-site">
 <?dbhtml stop-chunking?>
 <title>User requirements</title>
-<para>A multi-site architecture is complex and has its own risks
-and considerations, therefore it is important to make sure
-when contemplating the design such an architecture that it
-meets the user and business requirements.</para>
-<para>Many jurisdictions have legislative and regulatory
-requirements governing the storage and management of data in
-cloud environments. Common areas of regulation include:</para>
-<itemizedlist>
-<listitem>
-<para>Data retention policies ensuring storage of
-persistent data and records management to meet data
-archival requirements.</para>
-</listitem>
-<listitem>
-<para>Data ownership policies governing the possession and
-responsibility for data.</para>
-</listitem>
-<listitem>
-<para>Data sovereignty policies governing the storage of
-data in foreign countries or otherwise separate
-jurisdictions.</para>
-</listitem>
-<listitem>
-<para>Data compliance policies governing types of
-information that needs to reside in certain locations
-due to regular issues and, more importantly, cannot
-reside in other locations for the same reason.</para>
-</listitem>
-</itemizedlist>
-<para>Examples of such legal frameworks include the data
-protection framework of the European Union (<link
-xlink:href="http://ec.europa.eu/justice/data-protection">http://ec.europa.eu/justice/data-protection</link>)
-and the requirements of the Financial Industry Regulatory
-Authority (<link
-xlink:href="http://www.finra.org/Industry/Regulation/FINRARules">http://www.finra.org/Industry/Regulation/FINRARules</link>)
-in the United States. Consult a local regulatory body for more
-information.</para>
 <section xml:id="workload-characteristics">
 <title>Workload characteristics</title>
-<para>The expected workload is a critical requirement that needs
-to be captured to guide decision-making. An understanding of
-the workloads in the context of the desired multi-site
-environment and use case is important. Another way of thinking
-about a workload is to think of it as the way the systems are
-used. A workload could be a single application or a suite of
-applications that work together. It could also be a duplicate
-set of applications that need to run in multiple cloud
-environments. Often in a multi-site deployment the same
-workload will need to work identically in more than one
+<para>An understanding of the expected workloads for a desired
+multi-site environment and use case is an important factor in
+the decision-making process. In this context, <literal>workload</literal>
+refers to the way the systems are used. A workload could be a
+single application or a suite of applications that work together.
+It could also be a duplicate set of applications that need to
+run in multiple cloud environments. Often in a multi-site deployment,
+the same workload will need to work identically in more than one
 physical location.</para>
 <para>This multi-site scenario likely includes one or more of the
 other scenarios in this book with the additional requirement
@@ -72,26 +33,26 @@
 <title>Consistency of images and templates across different
 sites</title>
 <para>It is essential that the deployment of instances is
-consistent across the different sites. This needs to be built
+consistent across the different sites and built
 into the infrastructure. If the OpenStack Object Storage is used as
-a back end for the Image service, it is possible to create repositories of
-consistent images across multiple sites. Having central
+a back end for the Image service, it is possible to create repositories
+of consistent images across multiple sites. Having central
 endpoints with multiple storage nodes allows consistent centralized
-storage for each and every site.</para>
-<para>Not using a centralized object store increases operational
-overhead so that a consistent image library can be maintained. This
+storage for every site.</para>
+<para>Not using a centralized object store increases the operational
+overhead of maintaining a consistent image library. This
 could include development of a replication mechanism to handle
 the transport of images and the changes to the images across
 multiple sites.</para></section>
-<section xml:id="high-availability-multi-site"><title>High availability</title>
+<section xml:id="high-availability-multi-site">
+<title>High availability</title>
 <para>If high availability is a requirement to provide continuous
 infrastructure operations, a basic requirement of high
 availability should be defined.</para>
 <para>The OpenStack management components need to have a basic and
 minimal level of redundancy. The simplest example is that the loss
-of any single site has no significant impact on the
-availability of the OpenStack services of the entire
-infrastructure.</para>
+of any single site should have minimal impact on the
+availability of the OpenStack services.</para>
 <para>The <link
 xlink:href="http://docs.openstack.org/high-availability-guide/content/"><citetitle>OpenStack
 High Availability Guide</citetitle></link>
|
||||
WAN network design between the sites.</para>
|
||||
<para>Connecting more than two sites increases the challenges and
|
||||
adds more complexity to the design considerations. Multi-site
|
||||
implementations require extra planning to address the
|
||||
additional topology complexity used for internal and external
|
||||
connectivity. Some options include full mesh topology, hub
|
||||
spoke, spine leaf, or 3d Torus.</para>
|
||||
<para>Not all the applications running in a cloud are cloud-aware.
|
||||
If that is the case, there should be clear measures and
|
||||
expectations to define what the infrastructure can support
|
||||
and, more importantly, what it cannot. An example would be
|
||||
implementations require planning to address the additional
|
||||
topology used for internal and external connectivity. Some options
|
||||
include full mesh topology, hub spoke, spine leaf, and 3D Torus.</para>
|
||||
<para>If applications running in a cloud are not cloud-aware, there
|
||||
should be clear measures and expectations to define what the
|
||||
infrastructure can and cannot support. An example would be
|
||||
shared storage between sites. It is possible, however such a
|
||||
solution is not native to OpenStack and requires a third-party
|
||||
hardware vendor to fulfill such a requirement. Another example
|
||||
@ -126,21 +85,21 @@
|
||||
in object storage directly. These applications need to be
|
||||
cloud aware to make good use of an OpenStack Object
|
||||
Store.</para></section>
|
||||
<section xml:id="application-readiness"><title>Application readiness</title>
|
||||
<section xml:id="application-readiness">
|
||||
<title>Application readiness</title>
|
||||
<para>Some applications are tolerant of the lack of synchronized
|
||||
object storage, while others may need those objects to be
|
||||
replicated and available across regions. Understanding of how
|
||||
replicated and available across regions. Understanding how
|
||||
the cloud implementation impacts new and existing applications
|
||||
is important for risk mitigation and the overall success of a
|
||||
cloud project. Applications may have to be written to expect
|
||||
an infrastructure with little to no redundancy. Existing
|
||||
applications not developed with the cloud in mind may need to
|
||||
be rewritten.</para></section>
|
||||
<section xml:id="cost-multi-site"><title>Cost</title>
|
||||
<para>The requirement of having more than one site has a cost
|
||||
attached to it. The greater the number of sites, the greater
|
||||
the cost and complexity. Costs can be broken down into the
|
||||
following categories:</para>
|
||||
is important for risk mitigation, and the overall success of a
|
||||
cloud project. Applications may have to be written or rewritten
|
||||
for an infrastructure with little to no redundancy, or with the
|
||||
cloud in mind.</para></section>
|
||||
<section xml:id="cost-multi-site">
|
||||
<title>Cost</title>
|
||||
<para>A greater number of sites increase cost and complexity for a
|
||||
multi-site deployment. Costs can be broken down into the following
|
||||
categories:</para>
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>Compute resources</para>
|
||||
@ -163,34 +122,32 @@
|
||||
</itemizedlist></section>
|
||||
<section xml:id="site-loss-and-recovery">
|
||||
<title>Site loss and recovery</title>
|
||||
<para>Outages can cause loss of partial or full functionality of a
|
||||
site. Strategies should be implemented to understand and plan
|
||||
for recovery scenarios.</para>
|
||||
<para>Outages can cause partial or full loss of site functionality.
|
||||
Strategies should be implemented to understand and plan for recovery
|
||||
scenarios.</para>
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>The deployed applications need to continue to
|
||||
function and, more importantly, consideration should
|
||||
be taken of the impact on the performance and
|
||||
reliability of the application when a site is
|
||||
unavailable.</para>
|
||||
function and, more importantly, you must consider the
|
||||
impact on the performance and reliability of the application
|
||||
when a site is unavailable.</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>It is important to understand what happens to the
|
||||
replication of objects and data between the sites when
|
||||
a site goes down. If this causes queues to start
|
||||
building up, consider how long these queues can
|
||||
safely exist until something explodes.</para>
|
||||
safely exist until an error occurs.</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>Ensure determination of the method for resuming
|
||||
proper operations of a site when it comes back online
|
||||
after a disaster. We recommend you architect the
|
||||
recovery to avoid race conditions.</para>
|
||||
<para>After an outage, ensure the method for resuming proper
|
||||
operations of a site is implemented when it comes back online.
|
||||
We recommend you architect the recovery to avoid race conditions.</para>
|
||||
</listitem>
|
||||
</itemizedlist></section>
|
||||
<section xml:id="compliance-and-geo-location-multi-site">
|
||||
<title>Compliance and geo-location</title>
|
||||
<para>An organization could have certain legal obligations and
|
||||
<para>An organization may have certain legal obligations and
|
||||
regulatory compliance measures which could require certain
|
||||
workloads or data to not be located in certain regions.</para></section>
|
||||
<section xml:id="auditing-multi-site">
|
||||
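The queue-buildup question raised in this hunk can be bounded with simple arithmetic. A Python sketch with illustrative, assumed figures:

# Sketch: how long replication queues can build during a site outage
# before exceeding available buffer space (figures are illustrative).
ingest_mb_per_min = 50        # data queued for the unreachable site
queue_capacity_gb = 100       # buffer available for queued updates

safe_minutes = queue_capacity_gb * 1024 / ingest_mb_per_min
print("queues overflow after roughly %.0f hours" % (safe_minutes / 60))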
@@ -210,11 +167,10 @@
 site.</para></section>
 <section xml:id="authentication-between-sites">
 <title>Authentication between sites</title>
-<para>Ideally it is best to have a single authentication domain
-and not need a separate implementation for each and every
-site. This, of course, requires an authentication
-mechanism that is highly available and distributed to ensure
-continuous operation. Authentication server locality is also
-something that might be needed as well and should be planned
-for.</para></section>
+<para>It is recommended to have a single authentication domain
+rather than a separate implementation for each and every
+site. This requires an authentication mechanism that is highly
+available and distributed to ensure continuous operation.
+Authentication server locality might be required and should be
+planned for.</para></section>
 </section>