Remove passive voice from chap 5, arch guide

Change-Id: Iaaa9d2a052c9f81cefebd18bdc866d96d4fed64e
Closes-Bug: #1427935
Suyog Sainkar 2015-05-07 16:21:30 +10:00 committed by Brian Moss
parent dbf44fc980
commit 31771bef7b
6 changed files with 588 additions and 702 deletions


@@ -5,167 +5,141 @@
    version="5.0"
    xml:id="network_focus">
  <title>Network focused</title>
  <para>All OpenStack deployments depend on network communication in order
    to function properly due to its service-based nature. In some cases,
    however, the network elevates beyond simple infrastructure. This chapter
    discusses architectures that are more reliant or focused on network
    services. These architectures depend on the network infrastructure and
    require network services that perform reliably in order to satisfy user
    and application requirements.</para>
  <para>Some possible use cases include:</para>
  <variablelist>
    <varlistentry>
      <term>Content delivery network</term>
      <listitem>
        <para>This includes streaming video, viewing photographs, or
          accessing any other cloud-based data repository distributed to
          a large number of end users. Network configuration affects
          latency, bandwidth, and the distribution of instances. Therefore,
          it impacts video streaming. Not all video streaming is
          consumer-focused. For example, multicast videos (used for media,
          press conferences, corporate presentations, and web conferencing
          services) can also use a content delivery network.
          The location of the video repository and its relationship to end
          users affects content delivery. Network throughput of the back-end
          systems, as well as the WAN architecture and the cache methodology,
          also affect performance.</para>
      </listitem>
    </varlistentry>
    <varlistentry>
      <term>Network management functions</term>
      <listitem>
        <para>Use this cloud to provide network service functions built to
          support the delivery of back-end network services such as DNS,
          NTP, or SNMP. A company can use these services for internal
          network management.</para>
      </listitem>
    </varlistentry>
    <varlistentry>
      <term>Network service offerings</term>
      <listitem>
        <para>Use this cloud to run customer-facing network tools to
          support services. Examples include VPNs, MPLS private networks,
          and GRE tunnels.</para>
      </listitem>
    </varlistentry>
    <varlistentry>
      <term>Web portals or web services</term>
      <listitem>
        <para>Web servers are a common application for cloud services,
          and we recommend an understanding of their network requirements.
          The network requires scaling out to meet user demand and deliver
          web pages with minimum latency. Depending on the details of
          the portal architecture, consider the internal east-west and
          north-south network bandwidth.</para>
      </listitem>
    </varlistentry>
    <varlistentry>
      <term>High speed and high volume transactional systems</term>
      <listitem>
        <para>These types of applications are sensitive to network
          configurations. Examples include financial systems,
          credit card transaction applications, and trading and other
          extremely high volume systems. These systems are sensitive
          to network jitter and latency. They must balance a high volume
          of east-west and north-south network traffic to
          maximize efficiency of the data delivery.
          Many of these systems must access large, high performance
          database back ends.</para>
      </listitem>
    </varlistentry>
    <varlistentry>
      <term>High availability</term>
      <listitem>
        <para>These types of use cases are dependent on the proper sizing
          of the network to maintain replication of data between sites for
          high availability. If one site becomes unavailable, the extra
          sites can serve the displaced load until the original site
          returns to service. It is important to size network capacity
          to handle the desired loads.</para>
      </listitem>
    </varlistentry>
    <varlistentry>
      <term>Big data</term>
      <listitem>
        <para>Clouds used for the management and collection of big data
          (data ingest) have a significant demand on network resources.
          Big data often uses partial replicas of the data to maintain
          integrity over large distributed clouds. Other big data
          applications that require a large amount of network resources
          are Hadoop, Cassandra, NuoDB, Riak, and other NoSQL and
          distributed databases.</para>
      </listitem>
    </varlistentry>
    <varlistentry>
      <term>Virtual desktop infrastructure (VDI)</term>
      <listitem>
        <para>This use case is sensitive to network congestion, latency,
          jitter, and other network characteristics. Like video streaming,
          the user experience is important. However, unlike video
          streaming, caching is not an option to offset the network issues.
          VDI requires both upstream and downstream traffic and cannot rely
          on caching for the delivery of the application to the end
          user.</para>
      </listitem>
    </varlistentry>
    <varlistentry>
      <term>Voice over IP (VoIP)</term>
      <listitem>
        <para>This is sensitive to network congestion, latency, jitter,
          and other network characteristics. VoIP has a symmetrical traffic
          pattern and it requires network quality of service (QoS) for best
          performance. In addition, you can implement active queue management
          to deliver voice and multimedia content. Users are sensitive to
          latency and jitter fluctuations and can detect them at very low
          levels.</para>
      </listitem>
    </varlistentry>
    <varlistentry>
      <term>Video Conference or web conference</term>
      <listitem>
        <para>This is sensitive to network congestion, latency, jitter,
          and other network characteristics. Video conferencing has a
          symmetrical traffic pattern, but unless the network is on an
          MPLS private network, it cannot use network quality of service
          (QoS) to improve performance. Similar to VoIP, users are
          sensitive to network performance issues even at low levels.</para>
      </listitem>
    </varlistentry>
    <varlistentry>
      <term>High performance computing (HPC)</term>
      <listitem>
        <para>This is a complex use case that requires careful
          consideration of the traffic flows and usage patterns to address
          the needs of cloud clusters. It has high east-west traffic
          patterns for distributed computing, but there can be substantial
          north-south traffic depending on the specific application.</para>
      </listitem>
    </varlistentry>
  </variablelist>
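Several of the use cases above (VoIP, VDI, video and web conferencing) hinge on jitter, the variation between successive packet delays. As a minimal sketch of what "jitter" means quantitatively, the following estimates it as the mean absolute difference between consecutive round-trip-time samples; the sample values are invented for illustration and are not from this guide:

```python
def mean_jitter(rtts_ms):
    """Estimate jitter as the mean absolute difference between
    consecutive round-trip-time samples (in milliseconds)."""
    diffs = [abs(b - a) for a, b in zip(rtts_ms, rtts_ms[1:])]
    return sum(diffs) / len(diffs)

# Hypothetical ping samples against a cloud endpoint, in ms.
samples = [20.1, 20.9, 19.8, 25.4, 20.2]
print(mean_jitter(samples))
```

A steady stream with identical RTTs yields zero jitter even if absolute latency is high, which is why latency and jitter are called out separately in the use cases above.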


@@ -5,222 +5,190 @@
    version="5.0"
    xml:id="architecture-network-focus">
  <title>Architecture</title>
  <para>Network-focused OpenStack architectures have many similarities to
    other OpenStack architecture use cases. There are several factors
    to consider when designing for a network-centric or network-heavy
    application environment.</para>
  <para>Networks exist to serve as a medium of transporting data between
    systems. It is inevitable that an OpenStack design has inter-dependencies
    with non-network portions of OpenStack as well as on external systems.
    Depending on the specific workload, there may be major interactions with
    storage systems both within and external to the OpenStack environment.
    For example, in the case of a content delivery network, there is twofold
    interaction with storage. Traffic flows to and from the storage array for
    ingesting and serving content in a north-south direction. In addition,
    there is replication traffic flowing in an east-west direction.</para>
the interactions with storage will be two-fold. There will be there is replication traffic flowing in an east-west direction.</para>
traffic flowing to and from the storage array for ingesting <para>Compute-heavy workloads may also induce interactions with the
and serving content in a north-south direction. In addition, network. Some high performance compute applications require network-based
there is replication traffic flowing in an east-west memory mapping and data sharing and, as a result, induce a higher network
direction.</para> load when they transfer results and data sets. Others may be highly
<para>Compute-heavy workloads may also induce interactions with transactional and issue transaction locks, perform their functions, and
the network. Some high performance compute applications revoke transaction locks at high rates. This also has an impact on the
  <para>Some network dependencies are external to OpenStack. While
    OpenStack Networking is capable of providing network ports, IP addresses,
    some level of routing, and overlay networks, there are some other
    functions that it cannot provide. For many of these, you may require
    external systems or equipment to fill in the functional gaps. Hardware
    load balancers are an example of equipment that may be necessary to
    distribute workloads or offload certain functions. As of the Icehouse
    release, dynamic routing is currently in its infancy within OpenStack and
    you may require an external device or a specialized service instance
    within OpenStack to implement it. OpenStack Networking provides a
    tunneling feature, however it is constrained to a Networking-managed
    region. If the need arises to extend a tunnel beyond the OpenStack region
    to either another region or an external system, implement the tunnel
    itself outside OpenStack or use a tunnel management system to map the
    tunnel or overlay to an external tunnel. OpenStack does not currently
    provide quotas for network resources. Where network quotas are required,
    implement quality of service management outside of OpenStack. In many of
    these instances, you need similar solutions for traffic shaping or other
    network functions.
  </para>
  <para>
    Depending on the selected design, Networking itself might not
    support the required <glossterm baseform="Layer-3 network">layer-3
    network</glossterm> functionality. If you choose to use the
    provider networking mode without running the layer-3 agent, you
    must install an external router to provide layer-3 connectivity
    to outside systems.
  </para>
  <para>Interaction with orchestration services is inevitable in
    larger-scale deployments. The Orchestration module is capable of
    allocating network resources defined in templates to map to tenant
    networks and for port creation, as well as allocating floating IPs.
    If there is a requirement to define and manage network resources when
    using orchestration, we recommend that the design include the
    Orchestration module to meet the demands of users.</para>
  <section xml:id="design-impacts">
    <title>Design impacts</title>
    <para>A wide variety of factors can affect a network-focused OpenStack
      architecture. While there are some considerations shared with a general
      use case, specific workloads related to network requirements influence
      network design decisions.</para>
    <para>One decision includes whether or not to use Network Address
      Translation (NAT) and where to implement it. If there is a requirement
      for floating IPs instead of public fixed addresses then you must use
      NAT. An example of this is a DHCP relay that must know the IP of the
      DHCP server. In these cases it is easier to automate the infrastructure
      to apply the target IP to a new instance rather than to reconfigure
      legacy or external systems for each new instance.</para>
    <para>NAT for floating IPs managed by Networking resides within the
      hypervisor but there are also versions of NAT that may be running
      elsewhere. If there is a shortage of IPv4 addresses there are two common
      methods to mitigate this externally to OpenStack. The first is to run a
      load balancer either within OpenStack as an instance, or use an external
      load balancing solution. In the internal scenario, Networking's
      Load-Balancer-as-a-Service (LBaaS) can manage load balancing
      software, for example HAproxy. This is specifically to manage the
      Virtual IP (VIP) while a dual-homed connection from the HAproxy instance
      connects the public network with the tenant private network that hosts
      all of the content servers. In the external scenario, a load balancer
      needs to serve the VIP and also connect to the tenant overlay
      network through external means or through private addresses.</para>
    <para>Another kind of NAT that may be useful is protocol NAT. In some
      cases it may be desirable to use only IPv6 addresses on instances and
      operate either an instance or an external service to provide a NAT-based
      transition technology such as NAT64 and DNS64. This provides the ability
      to have a globally routable IPv6 address while only consuming IPv4
      addresses as necessary or in a shared manner.</para>
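To make the NAT64 transition concrete: NAT64 embeds an IPv4 address in the low 32 bits of an IPv6 prefix, conventionally the well-known /96 prefix 64:ff9b:: from RFC 6052. A sketch of that mapping using Python's standard ipaddress module; the IPv4 addresses are documentation examples, not values from this guide:

```python
import ipaddress

def to_nat64(ipv4, prefix="64:ff9b::"):
    """Embed an IPv4 address in the low 32 bits of a NAT64 /96 prefix."""
    base = int(ipaddress.IPv6Address(prefix))
    return ipaddress.IPv6Address(base + int(ipaddress.IPv4Address(ipv4)))

print(to_nat64("192.0.2.1"))  # 64:ff9b::c000:201
```

A DNS64 service performs the same synthesis on AAAA lookups, so IPv6-only instances reach IPv4-only endpoints without holding IPv4 addresses themselves.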
    <para>Application workloads affect the design of the underlying network
      architecture. If a workload requires network-level redundancy, the
      routing and switching architecture has to accommodate this. There
      are differing methods for providing this that are dependent on the
      selected network hardware, the performance of the hardware, and which
      networking model you deploy. Examples include link aggregation (LAG)
      and Hot Standby Router Protocol (HSRP). Also consider whether to
      deploy OpenStack Networking or legacy networking (nova-network), and
      which plug-in to select for OpenStack Networking. If using an external
      system, configure Networking to run
      <glossterm baseform="Layer-2 network">layer 2</glossterm> with a
      provider network configuration. For example, implement HSRP
      to terminate layer-3 connectivity.</para>
    <para>Depending on the workload, overlay networks may not be the best
      solution. Where application network connections are small, short
      lived, or bursty, running a dynamic overlay can generate as much
      bandwidth as the packets it carries. It also can induce enough latency
      to cause issues with certain applications. There is an impact to the
      device generating the overlay which, in most installations, is the
      hypervisor. This causes performance degradation on packet per second
      and connection per second rates.</para>
    <para>Overlays also come with a secondary option that may not be
      appropriate to a specific workload. While all of them operate in full
      mesh by default, there might be good reasons to disable this function
      because it may cause excessive overhead for some workloads. Conversely,
      other workloads operate without issue. For example, most web services
      applications do not have major issues with a full mesh overlay network,
      while some network monitoring tools or storage replication workloads
      have performance issues with throughput or excessive broadcast
      traffic.</para>
    <para>Many people overlook an important design decision: the choice of
      layer-3 protocols. While OpenStack was initially built with only IPv4
      support, Networking now supports IPv6 and dual-stacked networks.
      As of the Icehouse release, this only includes stateless address auto
      configuration but work is in progress to support stateless and
      stateful DHCPv6 as well as IPv6 floating IPs without NAT. Some
      workloads are possible through the use of IPv6 and IPv6-to-IPv4
      reverse transition mechanisms such as NAT64 and DNS64 or
      <glossterm>6to4</glossterm>.
      This alters the requirements for any address plan as single-stacked and
      transitional IPv6 deployments can alleviate the need for IPv4
      addresses.</para>
    <para>As of the Icehouse release, OpenStack has limited support for
      dynamic routing, however there are a number of options available by
      incorporating third party solutions to implement routing within the
      cloud including network equipment, hardware nodes, and instances. Some
      workloads perform well with nothing more than static routes and default
      gateways configured at the layer-3 termination point. In most cases
      this is sufficient, however some cases require the addition of at least
      one type of dynamic routing protocol if not multiple protocols. Having
      a form of interior gateway protocol (IGP) available to the instances
      inside an OpenStack installation opens up the possibility of use cases
      for anycast route injection for services that need to use it as a
      geographic location or failover mechanism. Other applications may wish
      to directly participate in a routing protocol, either as a passive
      observer, as in the case of a looking glass, or as an active
      participant in the form of a route reflector. Since an instance might
      have a large amount of compute and memory resources, it is trivial to
      hold an entire unpartitioned routing table and use it to provide
      services such as network path visibility to other applications or as a
      monitoring tool.</para>
<para> <para>Path maximum transmission unit (MTU) failures are lesser known but
Path maximum transmission unit (MTU) failures are lesser known harder to diagnose. The MTU must be large enough to handle normal
but harder to diagnose. The MTU must be large enough to handle traffic, overhead from an overlay network, and the desired layer-3
normal traffic, overhead from an overlay network, and the protocol. Adding externally built tunnels reduces the MTU packet size.
desired layer-3 protocol. When you add externally built tunnels, In this case, you must pay attention to the fully
the MTU packet size is reduced. In this case, you must pay calculated MTU size because some systems ignore or
attention to the fully calculated MTU size because some systems drop path MTU discovery packets.</para>
are configured to ignore or drop path MTU discovery packets.
</para>
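The MTU arithmetic described above can be sketched in a few lines. This is an illustrative calculation rather than anything OpenStack provides; the overhead values assume IPv4 outer headers, a basic GRE header without options, and the standard 50-byte VXLAN encapsulation:

```python
def instance_mtu(physical_mtu: int, overlay: str) -> int:
    """Largest MTU an instance can use without fragmenting overlay packets."""
    overhead = {
        "none": 0,
        "gre": 24,    # 20-byte outer IPv4 header + 4-byte base GRE header
        "vxlan": 50,  # inner Ethernet 14 + VXLAN 8 + UDP 8 + outer IPv4 20
    }[overlay]
    return physical_mtu - overhead

# A standard 1500-byte path leaves only 1476 bytes inside a GRE tunnel,
# while jumbo frames (9000 bytes) keep ample headroom for any overlay.
print(instance_mtu(1500, "gre"))    # 1476
print(instance_mtu(9000, "vxlan"))  # 8950
```

This is why the fully calculated MTU matters: if instances assume 1500 while the overlay consumes 24 or 50 bytes, oversized packets silently disappear on paths that drop path MTU discovery.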
</section> </section>
<section xml:id="tunables">
<title>Tunable networking components</title>
<para>Consider configurable networking components related to an
OpenStack architecture design, such as MTU and QoS, when designing for
network intensive workloads. Some workloads require a larger MTU than
normal due to the transfer of large blocks of data. When providing
network service for applications such as video streaming or storage
replication, we recommend that you configure both OpenStack hardware
nodes and the supporting network equipment for jumbo frames where
possible. This allows for better use of available bandwidth. Configure
jumbo frames across the complete path the packets traverse. If one
network component is not capable of handling jumbo frames then the
entire path reverts to the default MTU.</para>
<para>Quality of Service (QoS) also has a great impact on network
intensive workloads because it provides instant service to packets
that have a higher priority and are sensitive to poor network
performance. In applications such as Voice over IP (VoIP),
differentiated services code points are a near requirement for proper
operation. You can also use QoS in the opposite direction for mixed
workloads to prevent low priority but high bandwidth applications,
for example backup services, video conferencing, or file sharing,
from blocking bandwidth that is needed for the proper operation of
other workloads. It is possible to tag file storage traffic as a
lower class, such as best effort or scavenger, to allow the higher
priority traffic through. In cases where regions within a cloud might
be geographically distributed, it may also be necessary to plan
accordingly to implement WAN optimization to combat latency or
packet loss.</para>
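The traffic classes mentioned here (best effort, scavenger, Class Selectors, Expedited Forwarding) map to well-known DSCP values. The helper functions below are our own illustration; only the code point arithmetic comes from the DiffServ specifications:

```python
def cs(n: int) -> int:
    """Class Selector n -> DSCP value (CS0 is best effort, CS1 scavenger)."""
    return n * 8

def af(cls: int, drop: int) -> int:
    """Assured Forwarding AF<cls><drop> -> DSCP value."""
    return cls * 8 + drop * 2

EF = 46  # Expedited Forwarding, typically used for VoIP

def tos_byte(dscp: int) -> int:
    """DSCP occupies the top 6 bits of the IP ToS/Traffic Class byte."""
    return dscp << 2

# Scavenger-class storage replication vs. voice traffic:
print(cs(1), EF)        # 8 46
print(tos_byte(EF))     # 184
```

Marking bulk file storage traffic CS1 while VoIP carries EF is one concrete way to let the higher priority traffic through.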
</section>
</section>


@@ -6,67 +6,63 @@
xml:id="operational-considerations-networking-focus">
<?dbhtml stop-chunking?>
<title>Operational considerations</title>
<para>Network-focused OpenStack clouds have a number of operational
considerations that influence the selected design, including:</para>
<itemizedlist>
<listitem>
<para>Dynamic routing of static routes</para>
</listitem>
<listitem>
<para>Service level agreements (SLAs)</para>
</listitem>
<listitem>
<para>Ownership of user management</para>
</listitem>
</itemizedlist>
<para>An initial network consideration is the selection of a telecom
company or transit provider.</para>
<para>Make additional design decisions about monitoring and alarming.
This can be an internal responsibility or the responsibility of the
external provider. In the case of using an external provider, service
level agreements (SLAs) likely apply. In addition, other operational
considerations such as bandwidth, latency, and jitter can be part of
an SLA.</para>
<para>Consider the ability to upgrade the infrastructure. As demand
for network resources increases, operators add IP address blocks and
additional bandwidth capacity. In addition, consider managing hardware
and software life cycle events, for example upgrades, decommissioning,
and outages, while avoiding service interruptions for tenants.</para>
<para>Factor maintainability into the overall network design. This
includes the ability to manage and maintain IP addresses as well as
the use of overlay identifiers including VLAN tag IDs, GRE tunnel IDs,
and MPLS tags. As an example, if you need to change all of the IP
addresses on a network, a process known as renumbering, then the
design must support this function.</para>
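As a sketch of why the design must support renumbering, the following hypothetical helper maps every host to the same offset inside a new subnet; the function name and interface are our own, not part of any OpenStack tool:

```python
import ipaddress

def renumber(addresses, old_net, new_net):
    """Move each address to the new subnet, preserving its host offset."""
    old = ipaddress.ip_network(old_net)
    new = ipaddress.ip_network(new_net)
    result = []
    for addr in addresses:
        # Offset of this host from the start of its original subnet.
        offset = int(ipaddress.ip_address(addr)) - int(old.network_address)
        result.append(str(ipaddress.ip_address(int(new.network_address) + offset)))
    return result

# Documentation prefixes (RFC 5737) used as placeholder subnets:
print(renumber(["192.0.2.10"], "192.0.2.0/24", "198.51.100.0/24"))
# ['198.51.100.10']
```

A design that hard-codes addresses in configuration, firewall rules, or overlay identifiers cannot apply a mapping like this cleanly, which is the maintainability risk the text describes.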
<para>Address network-focused applications when considering certain
operational realities. For example, consider the impending exhaustion
of IPv4 addresses, the migration to IPv6, and the use of private
networks to segregate different types of traffic that an application
receives or generates. In the case of IPv4 to IPv6 migrations,
applications should follow best practices for storing IP addresses.
We recommend you avoid relying on IPv4 features that did not carry
over to the IPv6 protocol or have differences in implementation.</para>
<para>To segregate traffic, allow applications to create a private
tenant network for database and storage network traffic. Use a public
network for services that require direct client access from the
internet. Upon segregating the traffic, consider quality of service
(QoS) and security to ensure each network has the required level of
service.</para>
<para>Finally, consider the routing of network traffic. For some
applications, develop a complex policy framework for routing. To
create a routing policy that satisfies business requirements, consider
the economic cost of transmitting traffic over expensive links versus
cheaper links, in addition to bandwidth, latency, and jitter
requirements.</para>
<para>Additionally, consider how to respond to network events. As an
example, how load transfers from one link to another during a failure
scenario could be a factor in the design. If you do not plan network
capacity correctly, failover traffic could overwhelm other ports or
network links and create a cascading failure scenario. In this case,
traffic that fails over to one link overwhelms that link and then
moves to the subsequent links until all network traffic stops.</para>
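The cascading failure described above can be modelled with a small sketch. The link names, loads, and capacities below are hypothetical; the point is that failover succeeds only when some remaining link can absorb the moving load:

```python
def cascade(links, failed):
    """links: {name: (load_gbps, capacity_gbps)}. Fail one link and push its
    load onto the remaining links in order, overwhelming them one by one."""
    load = {name: l for name, (l, c) in links.items()}
    cap = {name: c for name, (l, c) in links.items()}
    moving = load.pop(failed)
    cap.pop(failed)
    dead = [failed]
    for name in list(load):
        if load[name] + moving <= cap[name]:
            load[name] += moving
            return dead, load          # the network absorbs the failover traffic
        moving += load.pop(name)       # this link is overwhelmed too
        cap.pop(name)
        dead.append(name)
    return dead, {}                    # cascading failure: all traffic stops

# Three 10 Gbps links each carrying 6 Gbps cannot absorb a single failure:
print(cascade({"a": (6, 10), "b": (6, 10), "c": (6, 10)}, "a"))
# (['a', 'b', 'c'], {})
```

Keeping per-link utilization below roughly half of capacity, in this toy model, is what prevents one failure from rolling through every subsequent link.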
</section>


@@ -6,34 +6,34 @@
xml:id="prescriptive-example-large-scale-web-app">
<?dbhtml stop-chunking?>
<title>Prescriptive examples</title>
<para>An organization designs a large-scale web application with
cloud principles in mind. The application scales horizontally in a
bursting fashion and generates a high instance count. The application
requires an SSL connection to secure data and must not lose connection
state to individual servers.</para>
<para>The figure below depicts an example design for this workload.
In this example, a hardware load balancer provides SSL offload
functionality and connects to tenant networks in order to reduce
address consumption. This load balancer links to the routing
architecture as it services the VIP for the application. The router
and load balancer use the GRE tunnel ID of the application's tenant
network and an IP address within the tenant subnet but outside of the
address pool. This ensures that the load balancer can communicate
with the application's HTTP servers without requiring the consumption
of a public IP address.</para>
<para>Because sessions persist until closed, the routing and
switching architecture provides high availability. Switches mesh to
each hypervisor and each other, and also provide an MLAG
implementation to ensure that layer-2 connectivity does not fail.
Routers use VRRP and fully mesh with switches to ensure layer-3
connectivity. Since GRE provides an overlay network, Networking is
present and uses the Open vSwitch agent in GRE tunnel mode. This
ensures all devices can reach all other devices and that you can
create tenant networks for private addressing links to the load
balancer.
<mediaobject>
<imageobject>
@@ -44,9 +44,9 @@
</mediaobject></para>
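The addressing trick above, an address inside the tenant subnet but outside the allocation pool, can be illustrated as follows. This is a standalone sketch, not the Networking API, and the pool boundaries are hypothetical:

```python
import ipaddress

def address_outside_pool(subnet, pool_start, pool_end):
    """Pick a usable address in the tenant subnet that falls outside the
    allocation pool, e.g. for a load balancer's tenant-facing interface."""
    net = ipaddress.ip_network(subnet)
    lo = ipaddress.ip_address(pool_start)
    hi = ipaddress.ip_address(pool_end)
    for host in net.hosts():          # skips network and broadcast addresses
        if host < lo or host > hi:
            return str(host)
    raise ValueError("allocation pool covers every usable address")

# Pool leaves .1 free for the load balancer:
print(address_outside_pool("10.0.0.0/24", "10.0.0.2", "10.0.0.254"))
# 10.0.0.1
```

Because the chosen address sits in the tenant subnet, the balancer reaches the HTTP servers directly; because it sits outside the pool, Networking never hands it to an instance.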
<para>A web service architecture has many options and optional
components. Due to this, it can fit into a large number of other
OpenStack designs. A few key components, however, need to be in place
to handle the nature of most web-scale workloads. You require the
following components:</para>
<itemizedlist>
<listitem>
<para>OpenStack Controller services (Image, Identity,
@@ -66,59 +66,59 @@
<para>Telemetry module</para>
</listitem>
</itemizedlist>
<para>Beyond the normal Identity, Compute, Image service, and Object
Storage components, we recommend the Orchestration module to handle
the proper scaling of workloads to adjust to demand. Due to the
requirement for auto-scaling, the design includes the Telemetry
module. Web services tend to be bursty in load, have very defined
peak and valley usage patterns and, as a result, benefit from
automatic scaling of instances based upon traffic. At a network
level, a split network configuration works well with databases
residing on private tenant networks since these do not emit a large
quantity of broadcast traffic and may need to interconnect to some
databases for content.</para>
<section xml:id="load-balancing">
<title>Load balancing</title>
<para>Load balancing spreads requests across multiple instances.
This workload scales well horizontally across large numbers of
instances. This enables instances to run without publicly routed IP
addresses and instead to rely on the load balancer to provide a
globally reachable service. Many of these services do not require
direct server return. This aids in address planning and utilization
at scale since only the virtual IP (VIP) must be public.</para>
</section>
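A minimal sketch of the pattern: private instances behind a single public VIP, with the balancer spreading requests in round-robin order. The class, names, and addresses are illustrative, not an OpenStack LBaaS interface:

```python
from itertools import cycle

class RoundRobinBalancer:
    """Toy balancer: one public VIP fronting privately addressed instances."""

    def __init__(self, vip, instances):
        self.vip = vip                 # the only publicly routed address
        self._next = cycle(instances)  # private instance addresses

    def route(self, request):
        """Return the private instance that serves this request."""
        return next(self._next)

lb = RoundRobinBalancer("203.0.113.10", ["10.0.0.5", "10.0.0.6"])
print([lb.route(i) for i in range(4)])
# ['10.0.0.5', '10.0.0.6', '10.0.0.5', '10.0.0.6']
```

Only the VIP consumes public address space; the pool behind it can grow or shrink with demand without any externally visible change.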
<section xml:id="overlay-networks">
<title>Overlay networks</title>
<para>The overlay functionality design includes OpenStack Networking
in Open vSwitch GRE tunnel mode. In this case, the layer-3 external
routers pair with VRRP, and switches pair with an implementation of
MLAG to ensure that you do not lose connectivity with the upstream
routing infrastructure.</para>
</section>
<section xml:id="performance-tuning">
<title>Performance tuning</title>
<para>Network level tuning for this workload is minimal. Apply
Quality-of-Service (QoS) to these workloads with a middle ground Class
Selector, depending on existing policies. It is higher than a best
effort queue but lower than an Expedited Forwarding or Assured
Forwarding queue. Since this type of application generates larger
packets with longer-lived connections, you can optimize bandwidth
utilization for long duration TCP. Normal bandwidth planning applies
here with regards to benchmarking a session's usage multiplied by the
expected number of concurrent sessions with overhead.</para>
</section>
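The bandwidth planning rule stated above reduces to simple arithmetic. The 20 percent overhead factor below is an assumed figure for illustration:

```python
def required_bandwidth_mbps(per_session_mbps, concurrent_sessions,
                            overhead_factor=1.2):
    """Benchmark one session's usage, multiply by expected concurrency,
    and add protocol/burst overhead (an assumed 20 percent here)."""
    return per_session_mbps * concurrent_sessions * overhead_factor

# e.g. 2.5 Mbps per streaming session, 1000 concurrent sessions:
print(required_bandwidth_mbps(2.5, 1000))  # 3000.0 Mbps
```

The benchmark figure and concurrency estimate come from measuring the actual workload; the sketch only shows how they combine into a capacity target.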
<section xml:id="network-functions">
<title>Network functions</title>
<para>Network functions is a broad category but encompasses workloads
that support the rest of a system's network. These workloads tend to
consist of large amounts of small packets that are very short lived,
such as DNS queries or SNMP traps.
@@ -134,63 +134,57 @@
<para>The supporting network for this type of configuration needs to
have a low latency and evenly distributed availability. This workload
benefits from having services local to the consumers of the service.
Use a multi-site approach as well as deploying many copies of the
application to handle load as close as possible to consumers. Since
these applications function independently, they do not warrant
running overlays to interconnect tenant networks. Overlays also have
the drawback of performing poorly with rapid flow setup and may incur
too much overhead with large quantities of small packets and
therefore we do not recommend them.</para>
<para>QoS is desirable for some workloads to ensure delivery. DNS has
a major impact on the load times of other services and needs to be
reliable and provide rapid responses. Configure rules in upstream
devices to apply a higher Class Selector to DNS to ensure faster
delivery or a better spot in queuing algorithms.</para>
</section>
<section xml:id="cloud-storage">
<title>Cloud storage</title>
<para>Another common use case for OpenStack environments is providing
a cloud-based file storage and sharing service. You might consider
this a storage-focused use case, but its network-side requirements
make it a network-focused use case.</para>
<para>For example, consider a cloud backup application. This workload
has two specific behaviors that impact the network. Because this
workload is an externally-facing service and an internally-replicating
application, it has both <glossterm
baseform="north-south traffic">north-south</glossterm> and
<glossterm>east-west traffic</glossterm>
considerations:</para>
<variablelist>
<varlistentry>
<term>north-south traffic</term>
<listitem>
<para>When a user uploads and stores content, that content moves
into the OpenStack installation. When users download this content,
the content moves out from the OpenStack installation. Because this
service operates primarily as a backup, most of the traffic moves
southbound into the environment. In this situation, it benefits you
to configure a network to be asymmetrically downstream because the
traffic that enters the OpenStack installation is greater than the
traffic that leaves the installation.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>east-west traffic</term>
<listitem>
<para>Likely to be fully symmetric. Because replication originates
from any node and might target multiple other nodes algorithmically,
it is less likely for this traffic to have a larger volume in any
specific direction. However, this traffic might interfere with
north-south traffic.</para>
</listitem>
</varlistentry>
</variablelist>
@@ -201,16 +195,15 @@
/>
</imageobject>
</mediaobject>
<para>This application prioritizes the north-south traffic over
east-west traffic: the north-south traffic involves customer-facing
data.</para>
<para>The network design in this case is less dependent on
availability and more dependent on being able to handle high
bandwidth. As a direct result, it is beneficial to forgo redundant
links in favor of bonding those connections. This increases available
bandwidth. It is also beneficial to configure all devices in the
path, including OpenStack, to generate and pass jumbo frames.</para>
</section>
</section>


@@ -13,27 +13,23 @@
involve those made about the protocol layer and the point when IP
comes into the picture. As an example, a completely internal
OpenStack network can exist at layer 2 and ignore layer 3. In order
for any traffic to go outside of that cloud, to another network, or
to the Internet, however, you must use a layer-3 router or
switch.</para>
<para>The past few years have seen two competing trends in
networking. One trend leans towards building data center network
architectures based on layer-2 networking. Another trend treats the
cloud environment essentially as a miniature version of the Internet.
This approach is radically different from the network architecture
approach in the staging environment: the Internet only uses layer-3
routing rather than layer-2 switching.</para>
<para>A network designed on layer-2 protocols has advantages over one
designed on layer-3 protocols. In spite of the difficulties of using
a bridge to perform the network role of a router, many vendors,
customers, and service providers choose to use Ethernet in as many
parts of their networks as possible. The benefits of selecting a
layer-2 design are:</para>
<itemizedlist>
<listitem>
<para>Ethernet frames contain all the essentials for
@@ -47,13 +43,13 @@
protocol.</para>
</listitem>
<listitem>
<para>Adding more layers to the Ethernet frame only slows the
networking process down. This is known as 'nodal processing
delay'.</para>
</listitem>
<listitem>
<para>You can add adjunct networking features, for example class of
service (CoS) or multicasting, to Ethernet as readily as IP
networks.</para>
</listitem>
<listitem>
@@ -62,45 +58,37 @@
</listitem>
</itemizedlist>
<para>Most information starts and ends inside Ethernet frames. Today
this applies to data, voice (for example, VoIP), and video (for
example, web cameras). The concept is that, if you can perform more
of the end-to-end transfer of information from a source to a
destination in the form of Ethernet frames, the network benefits more
from the advantages of Ethernet. Although it is not a substitute for
IP networking, networking at layer 2 can be a powerful adjunct to IP
networking.</para>
<para>Layer-2 Ethernet usage has these advantages over layer-3 IP
network usage:</para>
<itemizedlist> <itemizedlist>
<listitem> <listitem>
<para> <para>Speed</para>
Speed
</para>
</listitem> </listitem>
<listitem> <listitem>
<para> <para>Reduced overhead of the IP hierarchy.</para>
Reduced overhead of the IP hierarchy.
</para>
</listitem> </listitem>
<listitem> <listitem>
<para> <para>No need to keep track of address configuration as systems
No need to keep track of address configuration as systems move around. Whereas the simplicity of layer-2
are moved around. Whereas the simplicity of layer-2
protocols might work well in a data center with hundreds protocols might work well in a data center with hundreds
of physical machines, cloud data centers have the of physical machines, cloud data centers have the
additional burden of needing to keep track of all virtual additional burden of needing to keep track of all virtual
machine addresses and networks. In these data centers, it machine addresses and networks. In these data centers, it
is not uncommon for one physical node to support 30-40 is not uncommon for one physical node to support 30-40
instances. instances.</para>
</para>
</listitem> </listitem>
</itemizedlist> </itemizedlist>
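To gauge that burden, the rough arithmetic below estimates how many MAC table entries every switch in a flat layer-2 domain must learn when each physical node hosts 30-40 instances. This is a sketch with hypothetical cluster sizes, not figures from the guide:

```python
# Rough estimate of MAC table load in a flat layer-2 cloud data center.
# All cluster sizes here are hypothetical examples, not recommendations.

def mac_table_entries(racks: int, nodes_per_rack: int,
                      instances_per_node: int) -> int:
    """Each physical NIC plus each instance NIC adds one MAC entry
    that every switch in the layer-2 domain must learn."""
    physical_macs = racks * nodes_per_rack
    instance_macs = physical_macs * instances_per_node
    return physical_macs + instance_macs

# A modest 20-rack pod with 40 nodes per rack and 35 instances per node:
entries = mac_table_entries(racks=20, nodes_per_rack=40, instances_per_node=35)
print(entries)  # 28800 entries: 800 physical plus 28,000 instance addresses
```

Even this modest pod pushes tens of thousands of entries into every backbone switch, which is why the guide steers large deployments toward layer-3 designs.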
        <important>
            <para>Networking at the frame level says nothing
                about the presence or absence of IP addresses at the packet
                level. Almost all ports, links, and devices on a network of
                LAN switches still have IP addresses, as do all the source and
@ -125,8 +113,8 @@
                    limited.</para>
            </listitem>
            <listitem>
                <para>You must accommodate the need to maintain a set of
                    layer-4 devices to handle traffic control.</para>
            </listitem>
            <listitem>
                <para>MLAG, often used for switch redundancy, is a
@ -138,21 +126,20 @@
                    without IP addresses and ICMP.</para>
            </listitem>
            <listitem>
                <para>Configuring <glossterm
                    baseform="Address Resolution Protocol (ARP)">ARP</glossterm>
                    can be complicated on large layer-2 networks.</para>
            </listitem>
            <listitem>
                <para>All network devices need to be aware of all MACs,
                    even instance MACs, so there is constant churn in MAC
                    tables and network state changes as instances start and
                    stop.</para>
            </listitem>
            <listitem>
                <para>Migrating MACs (instance migration) to different
                    physical locations is a potential problem if you do not
                    set ARP table timeouts properly.</para>
            </listitem>
        </itemizedlist>
        <para>It is important to know that layer 2 has a very limited set
@ -173,14 +160,15 @@
            with the new location of the instance.</para>
        <para>In a layer-2 network, all devices are aware of all MACs,
            even those that belong to instances. The network state
            information in the backbone changes whenever an instance starts
            or stops. As a result there is far too much churn in
            the MAC tables on the backbone switches.</para>
    </section>
    <section xml:id="layer-3-arch-advantages">
        <title>Layer-3 architecture advantages</title>
        <para>In the layer 3 case, there is no churn in the routing tables
            due to instances starting and stopping. The only time there
            would be a routing state change is in the case of a Top
            of Rack (ToR) switch failure or a link failure in the backbone
            itself. Other advantages of using a layer-3 architecture
            include:</para>
@ -194,15 +182,15 @@
                    straightforward.</para>
            </listitem>
            <listitem>
                <para>You can configure layer 3 to use <glossterm
                    baseform="Border Gateway Protocol (BGP)">BGP</glossterm>
                    confederation for scalability so core routers have state
                    proportional to the number of racks, not to the number of
                    servers or instances.</para>
            </listitem>
            <listitem>
                <para>Routing takes instance MAC and IP addresses
                    out of the network core, reducing state churn. Routing
                    state changes only occur in the case of a ToR switch
                    failure or backbone link failure.</para>
            </listitem>
@ -211,7 +199,7 @@
                    example ICMP, to monitor and manage traffic.</para>
            </listitem>
            <listitem>
                <para>Layer-3 architectures enable the use of Quality
                    of Service (QoS) to manage network performance.</para>
            </listitem>
        </itemizedlist>
@ -220,17 +208,16 @@
        <para>The main limitation of layer 3 is that there is no built-in
            isolation mechanism comparable to the VLANs in layer-2
            networks. Furthermore, the hierarchical nature of IP addresses
            means that an instance is on the same subnet as its
            physical host. This means that you cannot migrate it outside
            of the subnet easily. For these reasons, network
            virtualization needs to use IP <glossterm>encapsulation</glossterm>
            and software at the end hosts for isolation and the separation of
            the addressing in the virtual layer from the addressing in the
            physical layer. Other potential disadvantages of layer 3
            include the need to design an IP addressing scheme rather than
            relying on the switches to keep track of the MAC
            addresses automatically and to configure the interior gateway
            routing protocol in the switches.</para>
        </section>
    </section>
@ -242,13 +229,13 @@
            Data in an OpenStack cloud moves both between instances across
            the network (also known as East-West), as well as in and out
            of the system (also known as North-South). Physical server
            nodes have network requirements that are independent of instance
            network requirements, which you must isolate from the core
            network to account for scalability. We recommend
            functionally separating the networks for security purposes and
            tuning performance through traffic shaping.</para>
        <para>You must consider a number of important general technical
            and business factors when planning and
            designing an OpenStack network. They include:</para>
        <itemizedlist>
            <listitem>
@ -286,11 +273,10 @@
                    future production environments.</para>
            </listitem>
        </itemizedlist>
        <para>Bearing in mind these considerations, we recommend the
            following:</para>
        <itemizedlist>
            <listitem>
                <para>Layer-3 designs are preferable to layer-2
                    architectures.</para>
            </listitem>
            <listitem>
@ -327,16 +313,16 @@
        </itemizedlist></section>
    <section xml:id="additional-considerations-network-focus">
        <title>Additional considerations</title>
        <para>There are several further considerations when designing a
            network-focused OpenStack cloud.</para>
        <section xml:id="openstack-networking-versus-nova-network">
            <title>OpenStack Networking versus legacy networking (nova-network)
                considerations</title>
            <para>Selecting the type of networking technology to implement
                depends on many factors. OpenStack Networking (neutron) and
                legacy networking (nova-network) both have their advantages and
                disadvantages. They are both valid and supported options that
                fit different use cases:</para>
            <informaltable rules="all">
                <col width="40%" />
                <col width="60%" />
@ -375,79 +361,75 @@
            <title>Redundant networking: ToR switch high availability
                risk analysis</title>
            <para>A technical consideration of networking is the idea that
                you should install switching gear in a data center
                with backup switches in case of hardware failure.</para>
            <para>Research indicates the mean time between failures (MTBF)
                on switches is between 100,000 and 200,000 hours. This number
                is dependent on the ambient temperature of the switch in the
                data center. When properly cooled and maintained, this
                translates to between 11 and 22 years before failure. Even in
                the worst case of poor ventilation and high ambient
                temperatures in the data center, the MTBF is still 2-3 years.
                See <link
                xlink:href="http://www.garrettcom.com/techsupport/papers/ethernet_switch_reliability.pdf">http://www.garrettcom.com/techsupport/papers/ethernet_switch_reliability.pdf</link>
                for further information.</para>
            <para>In most cases, it is much more economical to use a
                single switch with a small pool of spare switches to replace
                failed units than it is to outfit an entire data center with
                redundant switches. Applications should tolerate rack level
                outages without affecting normal operations, since network
                and compute resources are easily provisioned and
                plentiful.</para>
        </section>
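The hours-to-years figures above follow from simple division by the hours in a year; the sketch below reproduces that arithmetic using the MTBF bounds cited in the research:

```python
# Convert switch MTBF figures (in hours) to expected years before failure.
HOURS_PER_YEAR = 24 * 365  # 8,760 hours in a non-leap year

def mtbf_years(mtbf_hours: float) -> float:
    """Expected years of operation before a failure, given MTBF in hours."""
    return mtbf_hours / HOURS_PER_YEAR

# Published MTBF range for well-cooled data center switches:
print(round(mtbf_years(100000), 1))  # 11.4 years at the low end
print(round(mtbf_years(200000), 1))  # 22.8 years at the high end
```

This matches the "between 11 and 22 years" estimate in the text, and makes clear how sensitive the figure is to the assumed MTBF.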
        <section xml:id="preparing-for-future-ipv6-support">
            <title>Preparing for the future: IPv6 support</title>
            <para>One of the most important networking topics today is the
                impending exhaustion of IPv4 addresses. In early 2014, ICANN
                announced that they started allocating the final IPv4 address
                blocks to the Regional Internet Registries (<link
                xlink:href="http://www.internetsociety.org/deploy360/blog/2014/05/goodbye-ipv4-iana-starts-allocating-final-address-blocks/">http://www.internetsociety.org/deploy360/blog/2014/05/goodbye-ipv4-iana-starts-allocating-final-address-blocks/</link>).
                This means the IPv4 address space is close to being fully
                allocated. As a result, it will soon become difficult to
                allocate more IPv4 addresses to an application that has
                experienced growth, or that you expect to scale out, due to
                the lack of unallocated IPv4 address blocks.</para>
            <para>For network focused applications the future is the IPv6
                protocol. IPv6 increases the address space significantly,
                fixes long standing issues in the IPv4 protocol, and will
                become essential for network focused applications in the
                future.</para>
            <para>OpenStack Networking supports IPv6 when configured to take
                advantage of it. To enable IPv6, create an IPv6 subnet in
                Networking and use IPv6 prefixes when creating security
                groups.</para></section>
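To give a feel for the prefixes involved, the sketch below uses Python's standard `ipaddress` module to carve per-subnet /64 prefixes out of a larger allocation, the kind of prefix you would supply when creating an IPv6 subnet or security group rule. The `2001:db8::/32` range is reserved for documentation, so these addresses are placeholders only:

```python
# Illustrate IPv6 prefix math with the standard library. The 2001:db8::/32
# range is reserved for documentation; these values are placeholders.
import ipaddress

tenant_net = ipaddress.ip_network("2001:db8:1234::/48")

# Carve the /48 into per-subnet /64 prefixes, as an IPv6 subnet
# allocation might:
first = next(tenant_net.subnets(new_prefix=64))
print(first)                # 2001:db8:1234::/64
print(first.num_addresses)  # 18446744073709551616 addresses in a single /64
```

A single /64 holds 2^64 addresses, more than the entire IPv4 Internet, which is why address exhaustion ceases to be a design constraint once IPv6 is enabled.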
        <section xml:id="asymmetric-links">
            <title>Asymmetric links</title>
            <para>When designing a network architecture, the traffic patterns
                of an application heavily influence the allocation of
                total bandwidth and the number of links that you use to send
                and receive traffic. Applications that provide file storage
                for customers allocate bandwidth and links to favor
                incoming traffic, whereas video streaming applications
                allocate bandwidth and links to favor outgoing traffic.</para>
        </section>
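As a back-of-the-envelope illustration of this bias, the sketch below splits a pool of identical links by an application's expected traffic direction. The link counts and traffic fractions are hypothetical examples, not recommendations:

```python
# Split a pool of identical links between inbound and outbound traffic
# according to an application's expected traffic mix (hypothetical values).

def split_links(total_links: int, inbound_fraction: float) -> tuple:
    """Return (inbound_links, outbound_links), biased by traffic mix."""
    inbound = round(total_links * inbound_fraction)
    return inbound, total_links - inbound

# A customer file-storage service dominated by uploads (80% inbound):
print(split_links(10, 0.8))   # (8, 2)

# A video-streaming service dominated by downloads (20% inbound):
print(split_links(10, 0.2))   # (2, 8)
```

The same pool of links serves both services; only the allocation flips with the traffic pattern.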
        <section xml:id="performance-network-focus">
            <title>Performance</title>
            <para>It is important to analyze the applications' tolerance for
                latency and jitter when designing an environment to support
                network focused applications. Certain applications, for
                example VoIP, are less tolerant of latency and jitter. Where
                latency and jitter are concerned, certain applications may
                require tuning of QoS parameters and network device queues to
                ensure that they queue for transmit immediately or receive a
                guaranteed minimum bandwidth. Since OpenStack currently does
                not support these functions, consider carefully your selected
                network plug-in.</para>
            <para>The location of a service may also impact the application or
                consumer experience. If an application serves
                differing content to different users it must properly direct
                connections to those specific locations. Where appropriate,
                use a multi-site installation for these situations.</para>
            <para>You can implement networking in two separate
                ways. Legacy networking (nova-network) provides a flat DHCP
                network with a single broadcast domain. This implementation
                does not support tenant isolation networks or advanced
                plug-ins, but it is currently the only way to implement a
                distributed layer-3
@ -457,15 +439,15 @@
                variety of network methods. Some of these include a layer-2
                only provider network model, external device plug-ins, or even
                OpenFlow controllers.</para>
            <para>Networking at large scales becomes a set of boundary
                questions. The determination of how large a layer-2 domain
                must be is based on the number of nodes within the domain
                and the amount of broadcast traffic that passes between
                instances. Breaking layer-2 boundaries may require the
                implementation of overlay networks and tunnels. This decision
                is a balancing act between the need for a smaller overhead or
                a need for a smaller domain.</para>
            <para>When selecting network devices, be aware that making this
                decision based on the greatest port density often comes with a
                drawback. Aggregation switches and routers have not all kept
                pace with Top of Rack switches and may induce bottlenecks on

@ -6,187 +6,160 @@
        xml:id="user-requirements-network-focus">
    <?dbhtml stop-chunking?>
    <title>User requirements</title>
    <para>Network-focused architectures vary from the general-purpose
        architecture designs. Certain network-intensive applications influence
        these architectures. Some of the business requirements that influence
        the design include:</para>
    <itemizedlist>
        <listitem>
            <para>User experience: Network latency through slow page loads,
                degraded video streams, and low quality VoIP sessions impacts
                the user experience. Users are often not aware of how network
                design and architecture affects their experiences. Both
                enterprise customers and end-users rely on the network for
                delivery of an application. Network performance problems can
                result in a negative experience for the end-user, as well as
                productivity and economic loss.</para>
        </listitem>
        <listitem>
            <para>Regulatory requirements: Consider regulatory
                requirements about the physical location of data as it
                traverses the network. In addition, maintain network
                segregation of private data flows while ensuring an encrypted
                network between cloud locations where required. Regulatory
                requirements for encryption and protection of data in flight
                affect network architectures as the data moves through various
                networks.</para>
        </listitem>
    </itemizedlist>
    <para>Many jurisdictions have legislative and regulatory requirements
        governing the storage and management of data in cloud environments.
        Common areas of regulation include:</para>
    <itemizedlist>
        <listitem>
            <para>Data retention policies ensuring storage of persistent data
                and records management to meet data archival
                requirements.</para>
        </listitem>
        <listitem>
            <para>Data ownership policies governing the possession and
                responsibility for data.</para>
        </listitem>
        <listitem>
            <para>Data sovereignty policies governing the storage of data in
                foreign countries or otherwise separate jurisdictions.</para>
        </listitem>
        <listitem>
            <para>Data compliance policies governing where information can
                and cannot reside in certain locations.</para>
        </listitem>
    </itemizedlist>
    <para>Examples of such legal frameworks include the data protection
        framework of the European Union
        (<link xlink:href="http://ec.europa.eu/justice/data-protection/">http://ec.europa.eu/justice/data-protection/</link>)
        and the requirements of the Financial Industry Regulatory Authority
        (<link xlink:href="http://www.finra.org/Industry/Regulation/FINRARules">http://www.finra.org/Industry/Regulation/FINRARules</link>)
        in the United States. Consult a local regulatory body for more
        information.</para>
    <section xml:id="high-availability-issues-network-focus">
        <title>High availability issues</title>
        <para>Depending on the application and use case, network-intensive
            OpenStack installations can have high availability requirements.
            Financial transaction systems have a much higher requirement for
            high availability than a development application. Use network
            availability technologies, for example quality of service (QoS),
            to improve the network performance of sensitive applications such
            as VoIP and video streaming.</para>
        <para>High performance systems have SLA requirements for a minimum
            QoS with regard to guaranteed uptime, latency, and bandwidth. The
            level of the SLA can have a significant impact on the network
            architecture and requirements for redundancy in the
            systems.</para>
    </section>
<section xml:id="risks-network-focus"> <section xml:id="risks-network-focus">
<title>Risks</title> <title>Risks</title>
<variablelist> <variablelist>
<varlistentry> <varlistentry>
<term>Network misconfigurations</term> <term>Network misconfigurations</term>
<listitem> <listitem>
<para>Configuring incorrect IP <para>Configuring incorrect IP addresses, VLANs, and routers
addresses, VLANs, and routes can cause outages to can cause outages to areas of the network or, in the worst-case
areas of the network or, in the worst-case scenario, scenario, the entire cloud infrastructure. Automate network
the entire cloud infrastructure. Misconfigurations can configurations to minimize the opportunity for operator error
cause disruptive problems and should be automated to as it can cause disruptive problems.</para>
minimize the opportunity for operator error.</para>
</listitem> </listitem>
</varlistentry> </varlistentry>
<varlistentry> <varlistentry>
<term>Capacity planning</term> <term>Capacity planning</term>
<listitem> <listitem>
<para>Cloud networks need to be managed <para>Cloud networks require management for capacity and growth
for capacity and growth over time. There is a risk over time. Capacity planning includes the purchase of network
that the network will not grow to support the circuits and hardware that can potentially have lead times
workload. Capacity planning includes the purchase of measured in months or years.</para>
network circuits and hardware that can potentially
have lead times measured in months or more.</para>
</listitem> </listitem>
</varlistentry> </varlistentry>
<varlistentry> <varlistentry>
<term>Network tuning</term> <term>Network tuning</term>
<listitem> <listitem>
<para>Cloud networks need to be configured <para>Configure cloud networks to minimize link loss, packet loss,
to minimize link loss, packet loss, packet storms, packet storms, broadcast storms, and loops.</para>
broadcast storms, and loops.</para>
</listitem> </listitem>
</varlistentry> </varlistentry>
<varlistentry> <varlistentry>
<term>Single Point Of Failure (SPOF)</term> <term>Single Point Of Failure (SPOF)</term>
<listitem> <listitem>
<para>High availability <para>Consider high availability at the physical and environmental
must be taken into account even at the physical and layers. If there is a single point of failure due to only one
environmental layers. If there is a single point of upstream link, or only one power supply, an outage can become
failure due to only one upstream link, or only one unavoidable.</para>
power supply, an outage becomes unavoidable.</para>
                </listitem>
            </varlistentry>
            <varlistentry>
                <term>Complexity</term>
                <listitem>
                    <para>An overly complex network design can be
                        difficult to maintain and troubleshoot. While
                        device-level configuration can ease
                        maintenance concerns and automated tools can
                        handle overlay networks, avoid or document
                        non-traditional interconnects between
                        functions and specialized hardware to prevent
                        outages.</para>
                </listitem>
            </varlistentry>
            <varlistentry>
                <term>Non-standard features</term>
                <listitem>
                    <para>Additional risks arise from configuring the
                        cloud network to take advantage of
                        vendor-specific features. One example is
                        multi-link aggregation (MLAG), used to provide
                        redundancy at the aggregator switch level of
                        the network. MLAG is not a standard and, as a
                        result, each vendor has their own proprietary
                        implementation of the feature. MLAG
                        architectures are not interoperable across
                        switch vendors, which leads to vendor lock-in
                        and can delay or prevent component
                        upgrades.</para>
                </listitem>
            </varlistentry>
        </variablelist>
    </section>
    <section xml:id="security-network-focus"><title>Security</title>
        <para>Designers often overlook security or add it only after
            implementing a design. Consider security implications and
            requirements before designing the physical and logical
            network topologies. Make sure that the networks are
            properly segregated and that traffic flows reach the
            correct destinations without crossing through undesirable
            locations. Consider the following example factors:</para>
        <itemizedlist>
            <listitem>
                <para>Firewalls</para>
            </listitem>
            <listitem>
                <para>Overlay interconnects for joining separated
                    tenant networks</para>
            </listitem>
            <listitem>
                <para>Routing through or avoiding specific
                    networks</para>
            </listitem>
        </itemizedlist>
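        <para>For example, OpenStack Networking security groups can
            enforce segregation at the instance level. The group name
            and address range below are hypothetical:</para>
        <screen><prompt>$</prompt> <userinput>neutron security-group-create web-sg</userinput>
<prompt>$</prompt> <userinput>neutron security-group-rule-create --protocol tcp \
  --port-range-min 443 --port-range-max 443 \
  --remote-ip-prefix 203.0.113.0/24 web-sg</userinput></screen>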
        <para>How networks attach to hypervisors can expose security
            vulnerabilities. To mitigate against exploiting hypervisor
            breakouts, separate networks from other systems and
            schedule instances for the network onto dedicated compute
            nodes. This prevents attackers from reaching the networks
            from a compromised instance.</para>
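        <para>One way to sketch this, assuming the
            <literal>AggregateInstanceExtraSpecsFilter</literal>
            scheduler filter is enabled (the aggregate, host, and
            flavor names here are hypothetical), is to pair a host
            aggregate with a flavor extra specification:</para>
        <screen><prompt>$</prompt> <userinput>nova aggregate-create dedicated-net</userinput>
<prompt>$</prompt> <userinput>nova aggregate-add-host dedicated-net compute01</userinput>
<prompt>$</prompt> <userinput>nova aggregate-set-metadata dedicated-net dedicated=true</userinput>
<prompt>$</prompt> <userinput>nova flavor-key m1.dedicated set aggregate_instance_extra_specs:dedicated=true</userinput></screen>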
    </section>
</section>