openstack-manuals/doc/arch-design-to-archive/source/generalpurpose-technical-considerations.rst

Technical considerations

General purpose clouds are expected to include these base services:

  • Compute
  • Network
  • Storage

Each of these services has different resource requirements. As a result, you must make design decisions relating directly to each service, as well as provide a balanced infrastructure for all services.

Take into consideration the unique aspects of each service, as individual characteristics and the scale of each service can impact the hardware selection process. Hardware designs should be generated for each of the services.

Hardware decisions are also made in relation to network architecture and facilities planning. These factors play heavily into the overall architecture of an OpenStack cloud.

Compute resource design

When designing compute resource pools, a number of factors can impact your design decisions. Factors such as number of processors, amount of memory, and the quantity of storage required for each hypervisor must be taken into account.

You will also need to decide whether to provide compute resources in a single pool or in multiple pools. In most cases, multiple pools of resources can be allocated and addressed on demand. A compute design that allocates multiple pools of resources makes best use of application resources, and is commonly referred to as bin packing.

In a bin packing design, each independent resource pool provides service for specific flavors. This helps to ensure that, as instances are scheduled onto compute hypervisors, each independent node's resources will be allocated in a way that makes the most efficient use of the available hardware. Bin packing also requires a common hardware design, with all hardware nodes within a compute resource pool sharing a common processor, memory, and storage layout. This makes it easier to deploy, support, and maintain nodes throughout their lifecycle.
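
To make the idea concrete, the following is a minimal Python sketch of first-fit packing against a common hardware design. The host size and flavor definitions are hypothetical, and this illustrates the concept rather than the actual Compute scheduler:

    # Minimal first-fit bin packing sketch for instance placement.
    # Host size and flavors are hypothetical; this illustrates the
    # concept, not the nova scheduler implementation.
    HOST_VCPUS, HOST_RAM_MB = 24, 98304  # one common hardware design

    FLAVORS = {
        "m1.small": {"vcpus": 1, "ram_mb": 2048},
        "m1.large": {"vcpus": 4, "ram_mb": 8192},
    }

    def first_fit(requests, hosts):
        """Place each request on the first host with enough free capacity."""
        for name in requests:
            flavor = FLAVORS[name]
            for host in hosts:
                if (host["free_vcpus"] >= flavor["vcpus"]
                        and host["free_ram_mb"] >= flavor["ram_mb"]):
                    host["free_vcpus"] -= flavor["vcpus"]
                    host["free_ram_mb"] -= flavor["ram_mb"]
                    host["instances"].append(name)
                    break
            else:
                raise RuntimeError("pool exhausted: add hosts to the pool")

    hosts = [{"free_vcpus": HOST_VCPUS, "free_ram_mb": HOST_RAM_MB,
              "instances": []} for _ in range(2)]
    first_fit(["m1.large", "m1.small", "m1.large"], hosts)
    print([h["instances"] for h in hosts])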

An overcommit ratio is the ratio of available virtual resources to available physical resources. This ratio is configurable for CPU and memory. The default CPU overcommit ratio is 16:1, and the default memory overcommit ratio is 1.5:1. Determining the tuning of the overcommit ratios during the design phase is important as it has a direct impact on the hardware layout of your compute nodes.
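
To see how the ratios drive hardware layout, here is a back-of-the-envelope sketch; the node size is a hypothetical example:

    # Effective capacity of one compute node under the default
    # overcommit ratios. The physical node size is hypothetical.
    physical_cores = 24
    physical_ram_gb = 128

    cpu_ratio = 16.0  # default CPU overcommit ratio (16:1)
    ram_ratio = 1.5   # default memory overcommit ratio (1.5:1)

    print(physical_cores * cpu_ratio)   # 384.0 schedulable vCPUs
    print(physical_ram_gb * ram_ratio)  # 192.0 GB schedulable RAM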

When selecting a processor, compare features and performance characteristics. Some processors include features specific to virtualized compute hosts, such as hardware-assisted virtualization and related memory-paging technology such as Intel Extended Page Tables (EPT). These types of features can have a significant impact on the performance of your virtual machines.

You will also need to consider the compute requirements of non-hypervisor nodes (sometimes referred to as resource nodes). This includes controller, object storage, and block storage nodes, and networking services.

The number of processor cores and threads impacts the number of worker threads that can run on a resource node. Design decisions must relate directly to the service being run on the node, as well as provide a balanced infrastructure for all services.
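
For example, many OpenStack services size their default number of API worker processes to the processor count of the node on which they run. The following sketch shows that sizing logic; the optional cap is a hypothetical tuning choice, not an upstream default:

    # Sizing API worker processes to the processor count of a
    # resource node. The cap is a hypothetical tuning parameter.
    import multiprocessing

    def default_workers(max_workers=None):
        workers = multiprocessing.cpu_count()
        return min(workers, max_workers) if max_workers else workers

    print(default_workers())               # e.g. 24 on a 24-core node
    print(default_workers(max_workers=8))  # capped for a smaller service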

Workload can be unpredictable in a general purpose cloud, so consider including the ability to add additional compute resource pools on demand. In some cases, however, the demand for certain instance types or flavors may not justify an individual hardware design. In either case, start by allocating hardware designs that are capable of servicing the most common instance requests; additional hardware can be added to the overall architecture later.

Designing network resources

OpenStack clouds generally have multiple network segments, with each segment providing access to particular resources. The network services themselves also require network communication paths which should be separated from the other networks. When designing network services for a general purpose cloud, plan for either a physical or logical separation of network segments used by operators and projects. You can also create an additional network segment for access to internal services such as the message bus and database used by various services. Segregating these services onto separate networks helps to protect sensitive data and protects against unauthorized access to services.

Choose a networking service based on the requirements of your instances. The architecture and design of your cloud will impact whether you choose OpenStack Networking (neutron), or legacy networking (nova-network).

Legacy networking (nova-network)

The legacy networking (nova-network) service is primarily a layer-2 networking service that functions in two modes, which use VLANs in different ways. In a flat network mode, all network hardware nodes and devices throughout the cloud are connected to a single layer-2 network segment that provides access to application data.

When the network devices in the cloud support segmentation using VLANs, legacy networking can operate in the second mode. In this design model, each project within the cloud is assigned a network subnet which is mapped to a VLAN on the physical network. It is especially important to remember that the 12-bit VLAN ID field limits a spanning tree domain to a maximum of 4096 VLANs. This places a hard limit on the amount of growth possible within the data center. When designing a general purpose cloud intended to support multiple projects, we recommend deploying legacy networking in VLAN mode rather than flat network mode.

Another network consideration is that legacy networking is entirely managed by the cloud operator; projects do not have control over network resources. If projects require the ability to manage and create network resources such as network segments and subnets, it is necessary to install the OpenStack Networking service to provide network access to instances.

Networking (neutron)

OpenStack Networking (neutron) is a first class networking service that gives projects full control over the creation of virtual network resources. This is often accomplished in the form of tunneling protocols that establish encapsulated communication paths over existing network infrastructure in order to segment project traffic. These methods vary depending on the specific implementation, but some of the more common methods include tunneling over GRE, encapsulating with VXLAN, and VLAN tags.

We recommend you design at least three network segments:

  • The first segment is a public network, used for access to REST APIs by projects and operators. The controller nodes and swift proxies are the only devices connecting to this network segment. In some cases, this network might also be serviced by hardware load balancers and other network devices.
  • The second segment is used by administrators to manage hardware resources. Configuration management tools also use this for deploying software and services onto new hardware. In some cases, this network segment might also be used for internal services, including the message bus and database services. This network needs to communicate with every hardware node. Due to the highly sensitive nature of this network segment, you also need to secure this network from unauthorized access.
  • The third network segment is used by applications and consumers to access the physical network, and for users to access applications. This network is segregated from the one used to access the cloud APIs and is not capable of communicating directly with the hardware resources in the cloud. Compute resource nodes and network gateway services which allow application data to access the physical network from outside of the cloud need to communicate on this network segment.

Designing Object Storage

When designing hardware resources for OpenStack Object Storage, the primary goal is to maximize the amount of storage in each resource node while also ensuring that the cost per terabyte is kept to a minimum. This often involves utilizing servers which can hold a large number of spinning disks. Whether choosing to use 2U server form factors with directly attached storage or an external chassis that holds a larger number of drives, the main goal is to maximize the storage available in each node.

Note

We do not recommend investing in enterprise class drives for an OpenStack Object Storage cluster. The consistency and partition tolerance characteristics of OpenStack Object Storage ensure that data stays up to date and survives hardware faults without the use of any specialized data replication devices.

One of the benefits of OpenStack Object Storage is the ability to mix and match drives by making use of weighting within the swift ring. When designing your swift storage cluster, we recommend making use of the most cost effective storage solution available at the time.
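
A common convention, sketched below with a hypothetical drive inventory, is to set each device's ring weight proportional to its capacity (for example, 100 per terabyte) so that drives of mixed sizes fill at the same rate:

    # Ring weights proportional to drive capacity so that mixed drive
    # sizes fill evenly. The "100 per TB" convention and the inventory
    # are illustrative choices, not requirements.
    drives_tb = {"sda": 2.0, "sdb": 2.0, "sdc": 6.0}

    weights = {dev: size * 100 for dev, size in drives_tb.items()}
    print(weights)  # {'sda': 200.0, 'sdb': 200.0, 'sdc': 600.0}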

To achieve durability and availability of data stored as objects, design object storage resource pools to ensure they can provide the suggested availability. When designing beyond the hardware node level, consider rack-level and zone-level designs that accommodate the number of replicas configured to be stored in the Object Storage service (the default number of replicas is three). Each replica of data should exist in its own availability zone with its own power, cooling, and network resources available to service that specific zone.
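
Because each replica should land in its own zone, raw capacity must be divided by the replica count to estimate usable capacity. A minimal sketch with hypothetical zone sizes:

    # Usable Object Storage capacity with the default of three replicas.
    # Zone sizes are hypothetical; each replica lives in its own zone.
    replicas = 3
    zone_raw_tb = [120, 120, 120, 120, 120]  # raw capacity per zone

    assert len(zone_raw_tb) >= replicas, "need at least one zone per replica"
    usable_tb = sum(zone_raw_tb) / replicas
    print(usable_tb)  # 200.0 TB usable from 600 TB raw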

Object storage nodes should be designed so that the number of requests does not hinder the performance of the cluster. The Object Storage service uses a chatty protocol, therefore using multiple processors with higher core counts helps to ensure that I/O requests do not inundate the server.

Designing Block Storage

When designing OpenStack Block Storage resource nodes, it is helpful to understand the workloads and requirements that will drive the use of block storage in the cloud. We recommend designing block storage pools so that projects can choose appropriate storage solutions for their applications. By creating multiple storage pools of different types, in conjunction with configuring an advanced storage scheduler for the block storage service, it is possible to provide projects with a large catalog of storage services with a variety of performance levels and redundancy options.

Block storage also takes advantage of a number of enterprise storage solutions. These are addressed via a plug-in driver developed by the hardware vendor. A large number of enterprise storage plug-in drivers ship out-of-the-box with OpenStack Block Storage, and many more are available via third party channels. While general purpose clouds are more likely to use directly attached storage in the majority of block storage nodes, it may be necessary to provide additional levels of service to projects, which can only be provided by enterprise class storage solutions.

Redundancy and availability requirements impact the decision to use a RAID controller card in block storage nodes. The input/output operations per second (IOPS) demand of your application will influence whether or not you should use a RAID controller, and which level of RAID is required. Making use of higher performing RAID volumes is suggested when considering performance. However, where redundancy of block storage volumes is more important, we recommend making use of a redundant RAID configuration such as RAID 5 or RAID 6. Some specialized features, such as automated replication of block storage volumes, may require the use of third-party plug-ins and enterprise block storage solutions in order to meet the high demand on storage. Furthermore, where extreme performance is a requirement, it may also be necessary to make use of high speed SSDs or high performing flash storage solutions.
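
The performance side of this trade-off can be estimated with the standard RAID write-penalty rule of thumb. In the sketch below, the disk count, per-disk IOPS, and read/write mix are hypothetical inputs:

    # Estimating effective IOPS of a RAID volume using the standard
    # write-penalty rule of thumb. All inputs are hypothetical.
    WRITE_PENALTY = {"raid10": 2, "raid5": 4, "raid6": 6}

    def effective_iops(disks, iops_per_disk, read_fraction, level):
        raw = disks * iops_per_disk
        write_fraction = 1.0 - read_fraction
        return raw / (read_fraction + write_fraction * WRITE_PENALTY[level])

    # Eight 150-IOPS disks serving a 70% read workload:
    print(round(effective_iops(8, 150, 0.7, "raid10")))  # ~923
    print(round(effective_iops(8, 150, 0.7, "raid6")))   # ~480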

Software selection

The software selection process plays a large role in the architecture of a general purpose cloud. The following have a large impact on the design of the cloud:

  • Choice of operating system
  • Selection of OpenStack software components
  • Choice of hypervisor
  • Selection of supplemental software

Operating system (OS) selection plays a large role in the design and architecture of a cloud. There are a number of OSes which have native support for OpenStack including:

  • Ubuntu
  • Red Hat Enterprise Linux (RHEL)
  • CentOS
  • SUSE Linux Enterprise Server (SLES)

Note

Native support is not a constraint on the choice of OS; users are free to choose just about any Linux distribution (or even Microsoft Windows) and install OpenStack directly from source (or compile their own packages). However, many organizations will prefer to install OpenStack from distribution-supplied packages or repositories (although using the distribution vendor's OpenStack packages might be a requirement for support).

OS selection also directly influences hypervisor selection. A cloud architect who selects Ubuntu, RHEL, or SLES has some flexibility in hypervisor; KVM, Xen, and LXC are supported virtualization methods available under OpenStack Compute (nova) on these Linux distributions. However, a cloud architect who selects Windows Server is limited to Hyper-V. Similarly, a cloud architect who selects XenServer is limited to the CentOS-based dom0 operating system provided with XenServer.

The primary factors that play into OS-hypervisor selection include:

User requirements

The selection of OS-hypervisor combination first and foremost needs to support the user requirements.

Support

The selected OS-hypervisor combination needs to be supported by OpenStack.

Interoperability

The OS-hypervisor needs to be interoperable with other features and services in the OpenStack design in order to meet the user requirements.

Hypervisor

OpenStack supports a wide variety of hypervisors, one or more of which can be used in a single cloud. These hypervisors include:

  • KVM (and QEMU)
  • XCP/XenServer
  • vSphere (vCenter and ESXi)
  • Hyper-V
  • LXC
  • Docker
  • Bare-metal

A complete list of supported hypervisors and their capabilities can be found at OpenStack Hypervisor Support Matrix.

We recommend that general purpose clouds use hypervisors that support the most general purpose use cases, such as KVM and Xen. Hypervisors with a narrower focus should be chosen only to account for specific functionality or a supported feature requirement. In some cases, there may also be a mandated requirement to run software on a certified hypervisor, including solutions from VMware, Microsoft, and Citrix.

The features offered through the OpenStack cloud platform determine the best choice of a hypervisor. Each hypervisor has its own hardware requirements which may affect the decisions around designing a general purpose cloud.

In a mixed hypervisor environment, specific aggregates of compute resources, each with defined capabilities, enable workloads to utilize software and hardware specific to their particular requirements. This functionality can be exposed explicitly to the end user, or accessed through defined metadata within a particular flavor of an instance.
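
The matching is driven by metadata set on the host aggregate and corresponding extra specs set on the flavor. The following simplified Python sketch mimics that filtering logic; it is not the actual Compute scheduler filter, and the aggregate layout is hypothetical:

    # Simplified sketch of steering flavors to host aggregates with
    # matching metadata. Illustrative only; not the nova
    # AggregateInstanceExtraSpecsFilter implementation.
    AGGREGATES = [
        {"hosts": {"node1", "node2"}, "metadata": {"pci_passthrough": "true"}},
        {"hosts": {"node3", "node4"}, "metadata": {}},
    ]

    def hosts_for_flavor(extra_specs):
        return {
            host
            for agg in AGGREGATES
            for host in agg["hosts"]
            if all(agg["metadata"].get(k) == v for k, v in extra_specs.items())
        }

    print(hosts_for_flavor({"pci_passthrough": "true"}))  # {'node1', 'node2'}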

OpenStack components

A general purpose OpenStack cloud design should incorporate the core OpenStack services to provide a wide range of services to end-users. The OpenStack core services recommended in a general purpose cloud are:

  • Compute service (nova)
  • Networking service (neutron)
  • Image service (glance)
  • Identity service (keystone)
  • Dashboard (horizon)
  • Telemetry service (ceilometer)

A general purpose cloud may also include the Object Storage service (swift) and the Block Storage service (cinder). These may be selected to provide storage to applications and instances.

Supplemental software

A general purpose OpenStack deployment consists of more than just OpenStack-specific components. A typical deployment involves services that provide supporting functionality, including databases and message queues, and may also involve software to provide high availability of the OpenStack environment. Design decisions around the underlying message queue might affect the required number of controller services, as well as the technology to provide highly resilient database functionality, such as MariaDB with Galera. In such a scenario, replication of services relies on quorum.
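
Quorum requires that a majority of cluster members remain reachable, which is why Galera clusters are typically deployed with an odd number of nodes, three or more. A quick sketch of the arithmetic:

    # Quorum arithmetic for a replicated service such as MariaDB with
    # Galera: a majority of nodes must be reachable to stay writable.
    def quorum(cluster_size):
        return cluster_size // 2 + 1

    for n in (2, 3, 5):
        print(n, "nodes -> quorum of", quorum(n),
              "-> tolerates", n - quorum(n), "failure(s)")
    # A two-node cluster tolerates no failures, hence the usual
    # recommendation of at least three nodes.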

Where many general purpose deployments use hardware load balancers to provide highly available API access and SSL termination, software solutions, for example HAProxy, can also be considered. It is vital to ensure that such software implementations are also made highly available. High availability can be achieved by using software such as Keepalived or Pacemaker with Corosync. Pacemaker and Corosync can provide active-active or active-passive highly available configurations depending on the specific service in the OpenStack environment. Using this software can affect the design as it assumes at least a two-node controller infrastructure where one of those nodes may be running certain services in standby mode.

Memcached is a distributed memory object caching system, and Redis is a key-value store. Both are deployed on general purpose clouds to assist in alleviating load to the Identity service. The memcached service caches tokens, and due to its distributed nature it can help alleviate some bottlenecks to the underlying authentication system. Using memcached or Redis does not affect the overall design of your architecture as they tend to be deployed onto the infrastructure nodes providing the OpenStack services.

Controller infrastructure

The controller infrastructure nodes provide management services to the end user as well as providing services internally for the operation of the cloud. The controllers run message queuing services that carry system messages between each service. Performance issues related to the message bus would lead to delays in sending that message to where it needs to go. The result of this condition would be delays in operational functions such as spinning up and deleting instances, provisioning new storage volumes, and managing network resources. Such delays could adversely affect an application's ability to react to certain conditions, especially when using auto-scaling features. It is important to properly design the hardware used to run the controller infrastructure, as outlined above in the Hardware Selection section.

Performance of the controller services is not limited by processing power alone; restrictions may also emerge in serving concurrent users. Ensure that the APIs and Horizon services are load tested so that you are able to serve your customers. Pay particular attention to the OpenStack Identity service (keystone), which provides authentication and authorization for all services, both internally to OpenStack itself and to end users. This service can lead to a degradation of overall performance if it is not sized appropriately.

Network performance

In a general purpose OpenStack cloud, the requirements of the network help determine performance capabilities. It is possible to design OpenStack environments that run a mix of networking capabilities. By utilizing the different interface speeds, the users of the OpenStack environment can choose networks that are fit for their purpose.

Network performance can be boosted considerably by implementing hardware load balancers to provide front-end service to the cloud APIs. The hardware load balancers also perform SSL termination if that is a requirement of your environment. When implementing SSL offloading, it is important to understand the SSL offloading capabilities of the devices selected.

Compute host

The choice of hardware specifications used in compute nodes, including CPU, memory, and disk type, directly affects the performance of the instances. Other factors which can directly affect performance include tunable parameters within the OpenStack services, for example the overcommit ratio applied to resources. The defaults in OpenStack Compute set a 16:1 overcommit of CPU and a 1.5:1 overcommit of memory. Running at such high ratios can lead to an increase in "noisy-neighbor" activity. Care must be taken when sizing your compute environment to avoid this scenario. For general purpose OpenStack environments it is possible to keep to the defaults, but make sure to monitor your environment as usage increases.

Storage performance

When considering performance of Block Storage, hardware and architecture choice is important. Block Storage can use enterprise back-end systems such as NetApp or EMC, scale out storage such as GlusterFS and Ceph, or simply use the capabilities of directly attached storage in the nodes themselves. Block Storage may be deployed so that traffic traverses the host network, which could affect, and be adversely affected by, the front-side API traffic performance. As such, consider using a dedicated data storage network with dedicated interfaces on the Controller and Compute hosts.

When considering the performance of Object Storage, a number of design choices will affect performance. A user's access to Object Storage is through the proxy services, which typically sit behind hardware load balancers. By the very nature of a highly resilient storage system, replication of the data will affect performance of the overall system. In this case, 10 GbE (or better) networking is recommended throughout the storage network architecture.

High Availability

In OpenStack, the infrastructure is integral to providing services and should always be available, especially when operating with SLAs. Ensuring network availability is accomplished by designing the network architecture so that no single point of failure exists. Factor the number of switches, routes, and redundant power supplies into the core infrastructure, as well as the associated bonding of networks to provide diverse routes to your highly available switch infrastructure.

The OpenStack services themselves should be deployed across multiple servers that do not represent a single point of failure. Ensuring API availability can be achieved by placing these services behind highly available load balancers that have multiple OpenStack servers as members.

OpenStack lends itself to deployment in a highly available manner, where it is expected that at least two servers be utilized. These can run all of the services involved, from the message queuing service (for example, RabbitMQ or Qpid) to an appropriately deployed database service (such as MySQL or MariaDB). As services in the cloud are scaled out, back-end services will need to scale too. Monitoring and reporting on server utilization and response times, as well as load testing your systems, will help determine scale-out decisions.

Care must be taken when deciding network functionality. Currently, OpenStack supports both the legacy networking (nova-network) system and the newer, extensible OpenStack Networking (neutron). Both have their pros and cons when it comes to providing highly available access. Legacy networking, which provides networking access maintained in the OpenStack Compute code, provides a multi-host feature that removes a single point of failure when it comes to routing; this feature is currently missing in OpenStack Networking. The effect of legacy networking's multi-host functionality is to restrict failure domains to the host running that instance.

When using Networking, the OpenStack controller servers or separate Networking hosts handle routing. For a deployment that requires features available in only Networking, it is possible to remove this restriction by using third party software that helps maintain highly available L3 routes. Doing so allows for common APIs to control network hardware, or to provide complex multi-tier web applications in a secure manner. It is also possible to completely remove routing from Networking, and instead rely on hardware routing capabilities. In this case, the switching infrastructure must support L3 routing.

OpenStack Networking and legacy networking both have their advantages and disadvantages. They are both valid and supported options that fit different network deployment models described in the Networking deployment options table <https://docs.openstack.org/ops-guide/arch-network-design.html#network-topology> of OpenStack Operations Guide.

Ensure your deployment has adequate back-up capabilities.

Application design must also be factored into the capabilities of the underlying cloud infrastructure. If the compute hosts do not provide a seamless live migration capability, then it must be expected that when a compute host fails, that instance and any data local to that instance will be lost. However, when promising users a high level of uptime for instances, the infrastructure must be deployed in a way that eliminates any single point of failure when a compute host disappears. This may include utilizing shared file systems on enterprise storage or OpenStack Block Storage to provide a level of guarantee that matches service features.

For more information on high availability in OpenStack, see the OpenStack High Availability Guide.

Security

A security domain comprises users, applications, servers or networks that share common trust requirements and expectations within a system. Typically they have the same authentication and authorization requirements and users.

These security domains are:

  • Public
  • Guest
  • Management
  • Data

These security domains can be mapped to an OpenStack deployment individually, or combined. In each case, the cloud operator should be aware of the appropriate security concerns. Security domains should be mapped out against your specific OpenStack deployment topology. The domains and their trust requirements depend upon whether the cloud instance is public, private, or hybrid.

  • The public security domain is an entirely untrusted area of the cloud infrastructure. It can refer to the internet as a whole or simply to networks over which you have no authority. This domain should always be considered untrusted.
  • The guest security domain handles compute data generated by instances on the cloud but not services that support the operation of the cloud, such as API calls. Public cloud providers and private cloud providers who do not have stringent controls on instance use or who allow unrestricted internet access to instances should consider this domain to be untrusted. Private cloud providers may want to consider this network as internal and therefore trusted only if they have controls in place to assert that they trust instances and all their projects.
  • The management security domain is where services interact. Sometimes referred to as the control plane, the networks in this domain transport confidential data such as configuration parameters, user names, and passwords. In most deployments this domain is considered trusted.
  • The data security domain is concerned primarily with information pertaining to the storage services within OpenStack. Much of the data that crosses this network has high integrity and confidentiality requirements and, depending on the type of deployment, may also have strong availability requirements. The trust level of this network is heavily dependent on other deployment decisions.

When deploying OpenStack in an enterprise as a private cloud it is usually behind the firewall and within the trusted network alongside existing systems. Users of the cloud are employees that are bound by the security requirements set forth by the company. This tends to push most of the security domains towards a more trusted model. However, when deploying OpenStack in a public facing role, no assumptions can be made and the attack vectors significantly increase.

Consider how you manage the users of the system for both public and private clouds. The Identity service allows for LDAP to be part of the authentication process. Including such systems in an OpenStack deployment may ease user management when integrating with existing systems.

It is important to understand that user authentication requests include sensitive information including user names, passwords, and authentication tokens. For this reason, placing the API services behind hardware that performs SSL termination is strongly recommended.

For more information on OpenStack security, see the OpenStack Security Guide.