diff --git a/doc/openstack-ops/ch_arch_examples.xml b/doc/openstack-ops/ch_arch_examples.xml new file mode 100644 index 00000000..f77b3868 --- /dev/null +++ b/doc/openstack-ops/ch_arch_examples.xml @@ -0,0 +1,19 @@ + + + + + + + +]> + + + Example Architectures + + + + diff --git a/doc/openstack-ops/figures/ha_network_diagram_basic.png b/doc/openstack-ops/figures/ha_network_diagram_basic.png new file mode 100644 index 00000000..505e2670 Binary files /dev/null and b/doc/openstack-ops/figures/ha_network_diagram_basic.png differ diff --git a/doc/openstack-ops/figures/ha_network_diagram_performance.png b/doc/openstack-ops/figures/ha_network_diagram_performance.png new file mode 100644 index 00000000..c4ae7081 Binary files /dev/null and b/doc/openstack-ops/figures/ha_network_diagram_performance.png differ diff --git a/doc/openstack-ops/figures/ha_node_compute.png b/doc/openstack-ops/figures/ha_node_compute.png new file mode 100644 index 00000000..44d6438d Binary files /dev/null and b/doc/openstack-ops/figures/ha_node_compute.png differ diff --git a/doc/openstack-ops/figures/ha_node_controller.png b/doc/openstack-ops/figures/ha_node_controller.png new file mode 100644 index 00000000..f5747ef1 Binary files /dev/null and b/doc/openstack-ops/figures/ha_node_controller.png differ diff --git a/doc/openstack-ops/figures/ha_node_network.png b/doc/openstack-ops/figures/ha_node_network.png new file mode 100644 index 00000000..ff08d469 Binary files /dev/null and b/doc/openstack-ops/figures/ha_node_network.png differ diff --git a/doc/openstack-ops/figures/ha_node_storage.png b/doc/openstack-ops/figures/ha_node_storage.png new file mode 100644 index 00000000..19dcc695 Binary files /dev/null and b/doc/openstack-ops/figures/ha_node_storage.png differ diff --git a/doc/openstack-ops/part_architecture.xml b/doc/openstack-ops/part_architecture.xml index 62909b8f..e00c0c37 100644 --- a/doc/openstack-ops/part_architecture.xml +++ b/doc/openstack-ops/part_architecture.xml @@ -21,7 +21,7 @@ decisions you will need to make during the process. - + diff --git a/doc/openstack-ops/section_arch_example-neutron.xml b/doc/openstack-ops/section_arch_example-neutron.xml new file mode 100644 index 00000000..67b8bcc5 --- /dev/null +++ b/doc/openstack-ops/section_arch_example-neutron.xml @@ -0,0 +1,645 @@ + + + + + + +]> +
+ + Example Architecture - OpenStack Networking + This chapter provides an example architecture using OpenStack Networking in a highly + available environment. +
Overview A highly available environment can be put in place if you require an environment that can scale horizontally, or if you want your cloud to continue to be operational in case of node failure. This example architecture has been written based on the current default feature set of OpenStack Havana, with an emphasis on high availability.
+ Components + + + + + + OpenStack release + Havana + + + Host operating system + Red Hat Enterprise Linux 6.5 + + + OpenStack package repository + Red Hat Distributed OpenStack (RDO) + (http://repos.fedorapeople.org/repos/openstack/openstack-havana/rdo-release-havana-7.noarch.rpm) + + + Hypervisor + KVM + + + Database + MySQL + + + Message queue + Qpid + + + Networking service + OpenStack Networking + + + Tenant Network Separation + VLAN + + + + Image Service (glance) + back-end + GlusterFS + + + Identity Service (keystone) + driver + SQL + + + Block Storage Service (cinder) + back-end + GlusterFS + + + +
+
Rationale This example architecture has been selected based on the current default feature set of OpenStack Havana, with an emphasis on high availability. This architecture is currently being deployed in an internal Red Hat OpenStack cloud, and used to run hosted and shared services which by their nature must be highly available.
This architecture's components have been selected for the following reasons:
Red Hat Enterprise Linux - You must choose an operating system that can run on all of the physical nodes. This example architecture is based on Red Hat Enterprise Linux, which offers reliability, long-term support, certified testing, and hardening. Enterprise customers, now moving into OpenStack usage, typically require these advantages.
RDO - The Red Hat Distributed OpenStack packages offer an easy way to download the most current OpenStack release that is built for the Red Hat Enterprise Linux platform.
KVM - KVM is the supported hypervisor of choice for Red Hat Enterprise Linux (and is included in the distribution). It is feature complete and free from licensing charges and restrictions.
MySQL - MySQL is used as the database backend for all databases in the OpenStack environment. MySQL is the supported database of choice for Red Hat Enterprise Linux (and is included in the distribution); the database is open source, scalable, and handles memory well.
Qpid - Apache Qpid offers 100 percent compatibility with the Advanced Message Queuing Protocol Standard, and its broker is available for both C++ and Java.
OpenStack Networking - OpenStack Networking offers sophisticated networking functionality, including Layer 2 (L2) network segregation and provider networks.
VLAN - Using a virtual local area network offers broadcast control, security, and physical layer transparency. If needed, use VXLAN to extend your address space.
GlusterFS - GlusterFS offers scalable storage. As your environment grows, you can continue to add more storage nodes (instead of being restricted, for example, by an expensive storage array).
+
+
+ Detailed Description +
Node Types This section gives you a breakdown of the different nodes that make up the OpenStack environment. A node is a physical machine that is provisioned with an operating system and runs a defined software stack on top of it. The following table provides node descriptions and specifications.
Node Types
Type, Description, Example Hardware
Controller: Controller nodes are responsible for running the management software services needed for the OpenStack environment to function. These nodes: Provide the front door that people access, as well as the API services that all other components in the environment talk to. Run a number of services in a highly available fashion, utilizing Pacemaker and HAProxy to provide a virtual IP and load-balancing functions so that all controller nodes are being used. Supply highly available "infrastructure" services, such as MySQL and Qpid, which underpin all the services. Provide what is known as "persistent storage" through services run on the host as well; this persistent storage is backed onto the storage nodes for reliability. See the controller node diagram below. Example hardware: Model: Dell R620; CPU: 2 x Intel® Xeon® CPU E5-2620 0 @ 2.00GHz; Memory: 32GB; Disk: 2 x 300GB 10000 RPM SAS Disks; Network: 2 x 10G network ports
Compute: Compute nodes run the virtual machine instances in OpenStack. They: Run the bare minimum of services needed to facilitate these instances. Use local storage on the node for the virtual machines, so that no VM migration or instance recovery at node failure is possible. See the compute node diagram below. Example hardware: Model: Dell R620; CPU: 2 x Intel® Xeon® CPU E5-2650 0 @ 2.00GHz; Memory: 128GB; Disk: 2 x 600GB 10000 RPM SAS Disks; Network: 4 x 10G network ports (for future-proofing expansion)
Storage: Storage nodes store all the data required for the environment, including disk images in the Image Service library and the persistent storage volumes created by the Block Storage service. Storage nodes use GlusterFS technology to keep the data highly available and scalable. See the storage node diagram below. Example hardware: Model: Dell R720xd; CPU: 2 x Intel® Xeon® CPU E5-2620 0 @ 2.00GHz; Memory: 64GB; Disk: 2 x 500GB 7200 RPM SAS Disks + 24 x 600GB 10000 RPM SAS Disks; RAID Controller: PERC H710P Integrated RAID Controller, 1GB NV Cache; Network: 2 x 10G network ports
Network: Network nodes are responsible for doing all the virtual networking needed for people to create public or private networks, and to uplink their virtual machines into external networks. Network nodes: Form the only ingress and egress point for instances running on top of OpenStack. Run all of the environment's networking services, with the exception of the networking API service (which runs on the controller node). See the network node diagram below. Example hardware: Model: Dell R620; CPU: 1 x Intel® Xeon® CPU E5-2620 0 @ 2.00GHz; Memory: 32GB; Disk: 2 x 300GB 10000 RPM SAS Disks; Network: 5 x 10G network ports
Utility: Utility nodes are used by internal administration staff only to provide a number of basic system administration functions needed to get the environment up and running, and to maintain the hardware, OS, and software on which it runs. These nodes run services such as provisioning, configuration management, monitoring, or GlusterFS management software. They are not required to scale, although these machines are usually backed up. Example hardware: Model: Dell R620; CPU: 2 x Intel® Xeon® CPU E5-2620 0 @ 2.00GHz; Memory: 32 GB; Disk: 2 x 500GB 7200 RPM SAS Disks; Network: 2 x 10G network ports
+
+
Networking Layout The management network contains all the management devices for all hardware in the environment (for example, Dell iDRAC7 devices on the hardware nodes, and management interfaces for the network switches). This network is accessed by internal staff only when diagnosing or recovering a hardware issue.
OpenStack internal network This network is used for OpenStack management functions and traffic, including services needed for the provisioning of physical nodes (pxe, tftp, kickstart), traffic between various OpenStack node types using OpenStack APIs and messages (for example, nova-compute talking to keystone, or cinder-volume talking to nova-api), and all traffic for storage data to the storage layer underneath via the Gluster protocol. All physical nodes have at least one network interface (typically eth0) in this network. This network is only accessible from other VLANs on port 22 (for SSH access to manage machines).
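How the port 22 restriction is enforced depends on your switches and hosts; as one illustrative sketch (not taken from this deployment), a host firewall entry in the RHEL 6 /etc/sysconfig/iptables style could limit cross-VLAN access to SSH, with all subnets below being placeholders:
    # Allow established sessions and anything from the internal subnet itself
    -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
    -A INPUT -i eth0 -s 10.10.0.0/24 -j ACCEPT
    # Other VLANs may only reach this network on port 22 (SSH)
    -A INPUT -i eth0 -p tcp --dport 22 -m state --state NEW -j ACCEPT
    -A INPUT -i eth0 -j DROP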
+
Public Network This network is a combination of: IP addresses for public-facing interfaces on the controller nodes (through which end users will access the OpenStack services). A range of publicly routable IPv4 network addresses to be used by OpenStack Networking for floating IPs. You may be restricted in your access to IPv4 addresses; a large range of IPv4 addresses is not necessary. Routers for private networks created within OpenStack. This network is connected to the controller nodes so users can access the OpenStack interfaces, and connected to the network nodes to provide VMs with publicly routable traffic functionality. The network is also connected to the utility machines so that any utility services that need to be made public (such as system monitoring) can be accessed.
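As a sketch of how such a range is typically handed to OpenStack Networking on the Havana command line (the network name, CIDR, gateway, and allocation pool below are placeholders rather than this cloud's real addresses):
    neutron net-create ext-net --router:external=True
    neutron subnet-create ext-net 203.0.113.0/24 --name ext-subnet \
        --gateway 203.0.113.1 --disable-dhcp \
        --allocation-pool start=203.0.113.101,end=203.0.113.200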
+
VM traffic network This is a closed network that is not publicly routable and is simply used as a private, internal network for traffic between virtual machines in OpenStack, and between the virtual machines and the network nodes that provide L3 routes out to the public network (and floating IPs for connections back in to the VMs). Because this is a closed network, we are using a different address space from the others to clearly define the separation. Only Compute and OpenStack Networking nodes need to be connected to this network.
+
+
Node connectivity The following section details how the nodes are connected to the different networks (see the networking layout described above), and what other considerations apply (for example, bonding) when connecting nodes to the networks.
Initial deployment Initially, the connection setup should revolve around keeping the connectivity simple and straightforward, in order to minimize deployment complexity and time to deploy. The following deployment aims to have 1x10G connectivity available to all Compute nodes, while still leveraging bonding on appropriate nodes for maximum performance.
+ Basic node deployment + + + + + +
+
+
Connectivity for maximum performance If the networking performance of the basic layout is not enough, you can move to the following layout, which provides 2x10G network links to all instances in the environment, as well as providing more network bandwidth to the storage layer. A sample bonded-interface configuration follows the deployment figure below.
+ Performance node deployment + + + + + +
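The 2x10G links in this layout imply bonded interfaces on the nodes. The following is a minimal sketch of a bond on Red Hat Enterprise Linux 6, with placeholder device names and address, and a bonding mode that must match your switch configuration:
    # /etc/sysconfig/network-scripts/ifcfg-bond0
    DEVICE=bond0
    BOOTPROTO=none
    ONBOOT=yes
    IPADDR=10.10.10.21
    NETMASK=255.255.255.0
    BONDING_OPTS="mode=802.3ad miimon=100"

    # /etc/sysconfig/network-scripts/ifcfg-em1 (repeated for em2)
    DEVICE=em1
    BOOTPROTO=none
    ONBOOT=yes
    MASTER=bond0
    SLAVE=yes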
+
+
+
+ Node Diagrams + The following diagrams include logical information about the different types of nodes, + indicating what services will be running on top of them, and how they interact with each + other. The diagrams also illustrate how the availability and scalability of services are + achieved. +
+ Controller node + + + + + +
+
+ Compute Node + + + + + +
+
+ Network Node + + + + + +
+
+ Storage Node + + + + + +
+
+
+
Example component configuration The following tables include example configurations and considerations for both third-party and OpenStack components; illustrative configuration sketches for several of these components follow the two tables:
Third-party component configuration
Component, Tuning, Availability, Scalability
MySQL. Tuning: binlog-format = row. Availability: Master-master replication; however, both nodes are not used at the same time. Replication keeps all nodes as close to being up to date as possible (although the asynchronous nature of the replication means a fully consistent state is not possible). Connections to the database only happen through a Pacemaker virtual IP, ensuring that most problems that occur with master-master replication can be avoided. Scalability: Not heavily considered. Once load on the MySQL server increases enough that scalability needs to be considered, multiple masters or a master/slave setup can be used.
Qpid. Tuning: max-connections=1000, worker-threads=20, connection-backlog=10; SASL security enabled with SASL-BASIC authentication. Availability: Qpid is added as a resource to the Pacemaker software that runs on the controller nodes where Qpid is situated. This ensures only one Qpid instance is running at one time, and the node with the Pacemaker virtual IP will always be the node running Qpid. Scalability: Not heavily considered. However, Qpid can be changed to run on all controller nodes for scalability and availability purposes, and removed from Pacemaker.
HAProxy. Tuning: maxconn 3000. Availability: HAProxy is a software layer-7 load balancer used to front all clustered OpenStack API components and do SSL termination. HAProxy can be added as a resource to the Pacemaker software that runs on the controller nodes where HAProxy is situated. This ensures that only one HAProxy instance is running at one time, and the node with the Pacemaker virtual IP will always be the node running HAProxy. Scalability: Not considered. HAProxy has small enough performance overheads that a single instance should scale enough for this level of workload. If extra scalability is needed, keepalived or other Layer-4 load balancing can be introduced and placed in front of multiple copies of HAProxy.
Memcached. Tuning: MAXCONN="8192", CACHESIZE="30457". Availability: Memcached is a fast in-memory key/value cache used by OpenStack components for caching data and increasing performance. Memcached runs on all controller nodes, ensuring that should one go down, another instance of memcached is available. Scalability: Not considered. A single instance of memcached should be able to scale to the desired workloads. If scalability is desired, HAProxy can be placed in front of memcached (in raw tcp mode) to utilize multiple memcached instances for scalability. However, this might cause cache consistency issues.
Pacemaker. Tuning: Configured to use corosync and cman as the cluster communication stack/quorum manager, and as a two-node cluster. Availability: Pacemaker is the clustering software used to ensure the availability of services running on the controller and network nodes. Because Pacemaker is cluster software, the software itself handles its own availability, leveraging corosync and cman underneath. If you use the GlusterFS native client, no virtual IP is needed, since the client knows all about the nodes after the initial connection and automatically routes around failures on the client side. If you use the NFS or SMB adaptor, you will need a virtual IP on which to mount the GlusterFS volumes. Scalability: If more nodes need to be made cluster aware, Pacemaker can scale to 64 nodes.
GlusterFS. Tuning: The glusterfs performance profile "virt" is enabled on all volumes. Volumes are set up in two-node replication. Availability: GlusterFS is a clustered file system that is run on the storage nodes to provide persistent, scalable data storage to the environment. Because all connections to gluster use the gluster native mount points, the gluster instances themselves provide availability and failover functionality. Scalability: The scalability of GlusterFS storage can be achieved by adding in more storage volumes.
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
OpenStack component configuration
Component, Node type, Tuning, Availability, Scalability
Dashboard (horizon). Node type: Controller. Tuning: Configured to use memcached as a session store, neutron support is enabled, can_set_mount_point = False. Availability: The Dashboard is run on all controller nodes, ensuring at least one instance will be available in case of node failure. It also sits behind HAProxy, which detects when the software fails and routes requests around the failing instance. Scalability: The Dashboard is run on all controller nodes, so scalability can be achieved with additional controller nodes. HAProxy allows scalability for the Dashboard as more nodes are added.
Identity (keystone). Node type: Controller. Tuning: Configured to use memcached for caching and PKI for tokens. Availability: Identity is run on all controller nodes, ensuring at least one instance will be available in case of node failure. Identity also sits behind HAProxy, which detects when the software fails and routes requests around the failing instance. Scalability: Identity is run on all controller nodes, so scalability can be achieved with additional controller nodes. HAProxy allows scalability for Identity as more nodes are added.
Image Service (glance). Node type: Controller. Tuning: /var/lib/glance/images is a GlusterFS native mount to a Gluster volume off the storage layer. Availability: The Image Service is run on all controller nodes, ensuring at least one instance will be available in case of node failure. It also sits behind HAProxy, which detects when the software fails and routes requests around the failing instance. Scalability: The Image Service is run on all controller nodes, so scalability can be achieved with additional controller nodes. HAProxy allows scalability for the Image Service as more nodes are added.
Compute (nova). Node type: Controller, Compute. Tuning: Configured to use Qpid with qpid_heartbeat = 10; configured to use memcached for caching, libvirt, and neutron. nova-consoleauth is configured to use memcached for session management (so that it can have multiple copies and run behind a load balancer). Availability: The nova API, scheduler, objectstore, cert, consoleauth, conductor, and vncproxy services are run on all controller nodes, ensuring at least one instance will be available in case of node failure. Compute is also behind HAProxy, which detects when the software fails and routes requests around the failing instance. Compute's compute and conductor services, which run on the compute nodes, are only needed to run services on that node, so availability of those services is coupled tightly to the nodes that are available. As long as a compute node is up, it will have the needed services running on top of it. Scalability: The nova API, scheduler, objectstore, cert, consoleauth, conductor, and vncproxy services are run on all controller nodes, so scalability can be achieved with additional controller nodes. HAProxy allows scalability for Compute as more nodes are added. The scalability of services running on the compute nodes (compute, conductor) is achieved linearly by adding in more compute nodes.
Block Storage (cinder). Node type: Controller. Tuning: Configured to use Qpid with qpid_heartbeat = 10, and configured to use a Gluster volume from the storage layer as the backend for Block Storage, using the Gluster native client. Availability: Block Storage API, scheduler, and volume services are run on all controller nodes, ensuring at least one instance will be available in case of node failure. Block Storage also sits behind HAProxy, which detects if the software fails and routes requests around the failing instance. Scalability: Block Storage API, scheduler, and volume services are run on all controller nodes, so scalability can be achieved with additional controller nodes. HAProxy allows scalability for Block Storage as more nodes are added.
OpenStack Networking (neutron). Node type: Controller, Compute, Network. Tuning: Configured to use Qpid with qpid_heartbeat = 10, kernel namespace support enabled, tenant_network_type = vlan, allow_overlapping_ips = true, bridge_uplinks = br-ex:em2, bridge_mappings = physnet1:br-ex. Availability: The OpenStack Networking service is run on all controller nodes, ensuring at least one instance will be available in case of node failure. It also sits behind HAProxy, which detects if the software fails and routes requests around the failing instance. OpenStack Networking's ovs-agent, l3-agent, dhcp-agent, and metadata-agent services run on the network nodes, as lsb resources inside Pacemaker. This means that in the case of network node failure, services are kept running on another node. Finally, the ovs-agent service is also run on all compute nodes, and in case of compute node failure, the other nodes will continue to function using the copy of the service running on them. Scalability: The OpenStack Networking server service is run on all controller nodes, so scalability can be achieved with additional controller nodes. HAProxy allows scalability for OpenStack Networking as more nodes are added. Scalability of services running on the network nodes is not currently supported by OpenStack Networking, so they are not considered. One copy of the services should be sufficient to handle the workload. Scalability of the ovs-agent running on compute nodes is achieved by adding in more compute nodes as necessary.
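The tuning values in the two tables map onto configuration files roughly as follows. All of the snippets below are illustrative sketches with placeholder host names, addresses, and identifiers, not the exact files used in this deployment. For MySQL, the row-based binary log and master-master replication correspond to my.cnf entries along these lines (the second master uses a different server-id and offset):
    [mysqld]
    binlog-format            = row
    log-bin                  = mysql-bin
    server-id                = 1
    # Keep auto-increment keys from colliding between the two masters
    auto-increment-increment = 2
    auto-increment-offset    = 1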
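The Qpid values are broker options, typically set in /etc/qpidd.conf on the controller nodes; the SASL realm and mechanism details depend on your SASL setup and are omitted here:
    max-connections=1000
    worker-threads=20
    connection-backlog=10
    auth=yes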
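An HAProxy front end for one of the clustered API services might look like the following sketch; the virtual IP, ports, and back-end addresses are placeholders, and one such listen block exists per API service:
    global
        maxconn 3000

    defaults
        mode http
        timeout connect 5s
        timeout client  30s
        timeout server  30s

    listen keystone-api
        bind 192.168.1.10:5000
        balance roundrobin
        option httpchk
        server controller1 192.168.1.11:5000 check
        server controller2 192.168.1.12:5000 check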
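The memcached values are the RHEL service defaults in /etc/sysconfig/memcached; the listen address is a placeholder for the node's internal interface:
    PORT="11211"
    USER="memcached"
    MAXCONN="8192"
    CACHESIZE="30457"
    OPTIONS="-l 192.168.1.11"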
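A two-node Pacemaker cluster on top of cman can be driven with either the crm shell or pcs; the following is a pcs-style sketch for the virtual IP and an lsb-managed HAProxy resource (address and resource names are placeholders, and a real deployment should configure fencing rather than disabling it):
    # A two-node cluster cannot form a majority quorum
    pcs property set no-quorum-policy=ignore
    pcs property set stonith-enabled=false

    # Virtual IP that MySQL, Qpid, and HAProxy sit behind
    pcs resource create vip ocf:heartbeat:IPaddr2 ip=192.168.1.10 cidr_netmask=24

    # HAProxy as an lsb resource, kept on the same node as the virtual IP
    pcs resource create haproxy lsb:haproxy
    pcs constraint colocation add haproxy with vip INFINITY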
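The Gluster volumes behind the Image and Block Storage services could be created and given the "virt" performance profile roughly as follows; host names and brick paths are placeholders:
    gluster volume create cinder-volumes replica 2 \
        storage1:/bricks/cinder storage2:/bricks/cinder
    gluster volume set cinder-volumes group virt
    gluster volume start cinder-volumes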
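The dashboard settings correspond to entries in /etc/openstack-dashboard/local_settings along these lines; the memcached address is a placeholder:
    SESSION_ENGINE = 'django.contrib.sessions.backends.cache'
    CACHES = {
        'default': {
            'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
            'LOCATION': '192.168.1.11:11211',
        }
    }
    OPENSTACK_HYPERVISOR_FEATURES = {'can_set_mount_point': False}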
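Identity's PKI tokens and memcached caching map onto /etc/keystone/keystone.conf roughly as follows; the class paths are the Havana-era names and the server list is a placeholder:
    [token]
    provider = keystone.token.providers.pki.Provider
    driver = keystone.token.backends.memcache.Token

    [memcache]
    servers = 192.168.1.11:11211,192.168.1.12:11211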
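The Image Service's GlusterFS native mount of /var/lib/glance/images can be expressed as an /etc/fstab entry along these lines, with placeholder host and volume names:
    storage1:/glance-images  /var/lib/glance/images  glusterfs  defaults,_netdev  0 0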
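The Compute tuning corresponds to /etc/nova/nova.conf entries roughly like these (host names and the memcached list are placeholders; the neutron credentials and endpoints are omitted for brevity):
    [DEFAULT]
    rpc_backend       = nova.openstack.common.rpc.impl_qpid
    qpid_hostname     = 192.168.1.10
    qpid_heartbeat    = 10
    memcached_servers = 192.168.1.11:11211,192.168.1.12:11211
    compute_driver    = libvirt.LibvirtDriver
    network_api_class = nova.network.neutronv2.api.API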
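The Block Storage Gluster back end is configured in /etc/cinder/cinder.conf, with a shares file listing the Gluster volume(s) to mount; all names below are placeholders:
    [DEFAULT]
    rpc_backend             = cinder.openstack.common.rpc.impl_qpid
    qpid_hostname           = 192.168.1.10
    qpid_heartbeat          = 10
    volume_driver           = cinder.volume.drivers.glusterfs.GlusterfsDriver
    glusterfs_shares_config = /etc/cinder/shares.conf

    # /etc/cinder/shares.conf -- one Gluster volume per line
    storage1:/cinder-volumes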
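The OpenStack Networking values belong to the Havana-era Open vSwitch plug-in and to neutron.conf, roughly as follows; the VLAN range is a placeholder, and the br-ex uplink (em2 in this example) is added to the bridge with ovs-vsctl:
    # /etc/neutron/plugins/openvswitch/ovs_neutron_plugin.ini
    [ovs]
    tenant_network_type = vlan
    network_vlan_ranges = physnet1:200:299
    bridge_mappings     = physnet1:br-ex

    # /etc/neutron/neutron.conf
    [DEFAULT]
    allow_overlapping_ips = True
    rpc_backend           = neutron.openstack.common.rpc.impl_qpid
    qpid_heartbeat        = 10

    # Uplink the external bridge to the physical interface
    ovs-vsctl add-br br-ex
    ovs-vsctl add-port br-ex em2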
+
+
+
diff --git a/doc/openstack-ops/ch_arch_example.xml b/doc/openstack-ops/section_arch_example-nova.xml similarity index 88% rename from doc/openstack-ops/ch_arch_example.xml rename to doc/openstack-ops/section_arch_example-nova.xml index 8864c9c1..6e5d2679 100644 --- a/doc/openstack-ops/ch_arch_example.xml +++ b/doc/openstack-ops/section_arch_example-nova.xml @@ -1,5 +1,5 @@ - @@ -7,12 +7,12 @@ ]> - +
- Example Architecture + Example Architecture - Legacy Networking (nova) Because OpenStack is highly configurable, with many different back-ends and network configuration options, it is difficult to write documentation that covers all possible @@ -34,6 +34,8 @@ architecture does not dictate a particular number of nodes, but shows the thinking and considerations that went into choosing this architecture including the features offered. +
+ Components @@ -142,20 +144,22 @@
Rationale - This example architecture has been selected based on the current - default feature set of OpenStack Folsom, with - an emphasis on stability. We believe that many clouds that - currently run OpenStack in production have made similar - choices. - You must first choose the operating system that runs on all of the - physical nodes. While OpenStack is supported on several - distributions of Linux, we used Ubuntu 12.04 - LTS (Long Term Support), which is used by the - majority of the development community, has feature completeness - compared with other distributions, and has clear future support - plans. - We recommend that you do not use the default Ubuntu OpenStack - install packages and instead use the This example architecture has been selected based on the + current default feature set of OpenStack + Havana, with an emphasis on + stability. We believe that many + clouds that currently run OpenStack in production have + made similar choices. + You must first choose the operating system that runs on + all of the physical nodes. While OpenStack is supported on + several distributions of Linux, we used Ubuntu 12.04 LTS (Long Term + Support), which is used by the majority of + the development community, has feature completeness + compared with other distributions, and has clear future + support plans. + We recommend that you do not use the default Ubuntu + OpenStack install packages and instead use the Ubuntu Cloud Archive (https://wiki.ubuntu.com/ServerTeam/CloudArchive). The Cloud Archive @@ -219,15 +223,15 @@ indispensable, not just for user interaction with the cloud, but also as a tool for operators. Additionally, the dashboard's use of Django makes it a flexible framework for extension. - -
- Why Not Use the OpenStack Network Service (neutron)? - We do not discuss the OpenStack Network Service (neutron) in - this guide, because the authors of this guide only have - production deployment experience using - nova-network. Additionally, it does not yet support - multi-host networking. -
+
+ Why Not Use the OpenStack Network Service + (neutron)? + This example architecture does not use the OpenStack + Network Service (neutron), because it does not yet support + multi-host networking and our organizations (university, + government) have access to a large range of + publicly-accessible IPv4 addresses. +
Why Use Multi-host Networking? In a default OpenStack deployment, there is a single @@ -241,17 +245,16 @@ become a bottleneck if excessive network traffic comes in and goes out of the cloud. - - Multi-host - (http://docs.openstack.org/folsom/openstack-compute/admin/content/existing-ha-networking-options.html#d6e8906) - is a high-availability option for the network configuration - where the nova-network service is run on every compute node - instead of running on only a single node. + (http://docs.openstack.org/havana/install-guide/install/apt/content/nova-network.html) + is a high-availability option for the network + configuration where the nova-network service is run on + every compute node instead of running on only a single + node.
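A sketch of the nova.conf lines that turn on multi-host mode is shown below (interface names and the bridge are placeholders); the nova-network and nova-api-metadata services then run on every compute node:
    [DEFAULT]
    network_manager     = nova.network.manager.FlatDHCPManager
    multi_host          = True
    flat_interface      = eth1
    flat_network_bridge = br100
    public_interface    = eth0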
- +
Detailed Description The reference architecture consists of multiple compute @@ -330,4 +333,4 @@
- +