openstack-manuals/doc/arch-design/multi_site/section_tech_considerations_multi_site.xml

<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
  xmlns:xi="http://www.w3.org/2001/XInclude"
  xmlns:xlink="http://www.w3.org/1999/xlink"
  version="5.0"
  xml:id="technical-considerations-multi-site">
    <?dbhtml stop-chunking?>
    <title>Technical considerations</title>
    <para>There are many technical considerations to take into account
        with regard to designing a multi-site OpenStack
        implementation. An OpenStack cloud can be designed in a
        variety of ways to handle individual application needs. A
        multi-site deployment has additional challenges compared
        to single site installations and therefore is a more
        complex solution.</para>
    <para>When determining capacity options be sure to take into
        account not just the technical issues, but also the economic
        or operational issues that might arise from specific
        decisions.</para>
    <para>Inter-site link capacity describes the capabilities of the
        connectivity between the different OpenStack sites. This
        includes parameters such as bandwidth, latency, whether or not
        a link is dedicated, and any business policies applied to the
        connection. The capability and number of the links between
        sites determine what kind of options are available for
        deployment. For example, if two sites have a pair of
        high-bandwidth links available between them, it may be wise to
        configure a separate storage replication network between the
        two sites to support a single Swift endpoint and a shared
        object storage capability between them. (An example of this
        technique, as well as a configuration walk-through, is
        available at <link
        xlink:href="http://docs.openstack.org/developer/swift/replication_network.html#dedicated-replication-network">http://docs.openstack.org/developer/swift/replication_network.html#dedicated-replication-network</link>).
        Another option in this scenario is to build a dedicated set of
        tenant private networks across the secondary link using
        overlay networks with a third party mapping the site overlays
        to each other.</para>
    <para>The capacity requirements of the links between sites is
        driven by application behavior. If the latency of the links is
        too high, certain applications that use a large number of
        small packets, for example RPC calls, may encounter issues
        communicating with each other or operating properly.
        Additionally, OpenStack may encounter similar types of issues.
        To mitigate this, tuning of the Identity service call timeouts may be
        necessary to prevent issues authenticating against a central
        Identity service.</para>
    <para>Another capacity consideration when it comes to networking
        for a multi-site deployment is the available amount and
        performance of overlay networks for tenant networks. If using
        shared tenant networks across zones, it is imperative that an
        external overlay manager or controller be used to map these
        overlays together. It is necessary to ensure the amount of
        possible IDs between the zones are identical. Note that, as of
        the Kilo release, OpenStack Networking was not capable of managing
        tunnel IDs across installations. This means that if one site
        runs out of IDs, but other does not, that tenant's network
        is unable to reach the other site.</para>
    <para>Capacity can take other forms as well. The ability for a
        region to grow depends on scaling out the number of available
        compute nodes. This topic is covered in greater detail in the
        section for compute-focused deployments. However, it should be
        noted that cells may be necessary to grow an individual region
        beyond a certain point. This point depends on the size of your
        cluster and the ratio of virtual machines per
        hypervisor.</para>
    <para>A third form of capacity comes in the multi-region-capable
        components of OpenStack. Centralized Object Storage is capable
        of serving objects through a single namespace across multiple
        regions. Since this works by accessing the object store via
        swift proxy, it is possible to overload the proxies. There are
        two options available to mitigate this issue. The first is to
        deploy a large number of swift proxies. The drawback to this
        is that the proxies are not load-balanced and a large file
        request could continually hit the same proxy. The other way to
        mitigate this is to front-end the proxies with a caching HTTP
        proxy and load balancer. Since swift objects are returned to
        the requester via HTTP, this load balancer would alleviate the
        load required on the swift proxies.</para>
    <section xml:id="utilization-multi-site"><title>Utilization</title>
    <para>While constructing a multi-site OpenStack environment is the
        goal of this guide, the real test is whether an application
        can utilize it.</para>
    <para>Identity is normally the first interface for the majority of
        OpenStack users. Interacting with the Identity service is required for
        almost all major operations within OpenStack. Therefore, it is
        important to ensure that you provide users with a single URL
        for Identity service authentication. Equally important is proper
        documentation and configuration of regions within the Identity service.
        Each of the sites defined in your installation is considered
        to be a region in Identity nomenclature. This is important for
        the users of the system, when reading Identity documentation,
        as it is required to define the region name when providing
        actions to an API endpoint or in the dashboard.</para>
    <para>Load balancing is another common issue with multi-site
        installations. While it is still possible to run HAproxy
        instances with Load-Balancer-as-a-Service, these are local
        to a specific region. Some applications may be able to cope
        with this via internal mechanisms. Others, however, may
        require the implementation of an external system including
        global services load balancers or anycast-advertised
        DNS.</para>
    <para>Depending on the storage model chosen during site design,
        storage replication and availability are also a concern
        for end-users. If an application is capable of understanding
        regions, then it is possible to keep the object storage system
        separated by region. In this case, users who want to have an
        object available to more than one region need to do the
        cross-site replication themselves. With a centralized swift
        proxy, however, the user may need to benchmark the replication
        timing of the Object Storage back end. Benchmarking allows the
        operational staff to provide users with an understanding of
        the amount of time required for a stored or modified object to
        become available to the entire environment.</para></section>
    <section xml:id="performance"><title>Performance</title>
    <para>Determining the performance of a multi-site installation
        involves considerations that do not come into play in a
        single-site deployment. Being a distributed deployment,
        multi-site deployments incur a few extra penalties to
        performance in certain situations.</para>
    <para>Since multi-site systems can be geographically separated,
        they may have worse than normal latency or jitter when
        communicating across regions. This can especially impact
        systems like the OpenStack Identity service when making
        authentication attempts from regions that do not contain the
        centralized Identity implementation. It can also affect
        certain applications which rely on remote procedure call (RPC)
        for normal operation. An example of this can be seen in High
        Performance Computing workloads.</para>
    <para>Storage availability can also be impacted by the
        architecture of a multi-site deployment. A centralized Object
        Storage service requires more time for an object to be
        available to instances locally in regions where the object was
        not created. Some applications may need to be tuned to account
        for this effect. Block Storage does not currently have a
        method for replicating data across multiple regions, so
        applications that depend on available block storage need
        to manually cope with this limitation by creating duplicate
        block storage entries in each region.</para>
      </section>
</section>