Import RST ops-guide

Publish it as draft for now, do not translate it. Also do not
publish the mitaka/arch-design-draft version.

Change-Id: Id25e02aa0b2219fd9141d1354124386cb59bb856

@@ -44,4 +44,6 @@ declare -A SPECIAL_BOOKS=(
     ["releasenotes"]="skip"
     # Skip arch design while its being revised
     ["arch-design-draft"]="skip"
+    # Skip ops-guide while its being revised
+    ["ops-guide"]="skip"
 )
@@ -0,0 +1,30 @@
[metadata]
name = openstackopsguide
summary = OpenStack Operations Guide
author = OpenStack
author-email = openstack-docs@lists.openstack.org
home-page = http://docs.openstack.org/
classifier =
    Environment :: OpenStack
    Intended Audience :: Information Technology
    Intended Audience :: System Administrators
    License :: OSI Approved :: Apache Software License
    Operating System :: POSIX :: Linux
    Topic :: Documentation

[global]
setup-hooks =
    pbr.hooks.setup_hook

[files]

[build_sphinx]
all_files = 1
build-dir = build
source-dir = source

[wheel]
universal = 1

[pbr]
warnerrors = True
@@ -0,0 +1,30 @@
#!/usr/bin/env python
# Copyright (c) 2013 Hewlett-Packard Development Company, L.P.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
# implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# THIS FILE IS MANAGED BY THE GLOBAL REQUIREMENTS REPO - DO NOT EDIT
import setuptools

# In python < 2.7.4, a lazy loading of package `pbr` will break
# setuptools if some other modules registered functions in `atexit`.
# solution from: http://bugs.python.org/issue15881#msg170215
try:
    import multiprocessing  # noqa
except ImportError:
    pass

setuptools.setup(
    setup_requires=['pbr'],
    pbr=True)
@@ -0,0 +1,51 @@
================
Acknowledgements
================

The OpenStack Foundation supported the creation of this book with plane
tickets to Austin, lodging (including one adventurous evening without
power after a windstorm), and delicious food. For about USD $10,000, we
could collaborate intensively for a week in the same room at the
Rackspace Austin office. The authors are all members of the OpenStack
Foundation, which you can join. Go to the `Foundation web
site <https://www.openstack.org/join>`_.

We want to acknowledge our excellent host Rackers at Rackspace in
Austin:

- Emma Richards of Rackspace Guest Relations took excellent care of our
  lunch orders and even set aside a pile of sticky notes that had
  fallen off the walls.

- Betsy Hagemeier, a Fanatical Executive Assistant, took care of a room
  reshuffle and helped us settle in for the week.

- The Real Estate team at Rackspace in Austin, also known as "The
  Victors," were super responsive.

- Adam Powell in Racker IT supplied us with bandwidth each day and
  second monitors for those of us needing more screens.

- On Wednesday night we had a fun happy hour with the Austin OpenStack
  Meetup group, and Racker Katie Schmidt took great care of our group.

We also had some excellent input from outside of the room:

- Tim Bell from CERN gave us feedback on the outline before we started
  and reviewed it mid-week.

- Sébastien Han has written excellent blogs and generously gave his
  permission for re-use.

- Oisin Feeley read it, made some edits, and provided emailed feedback
  right when we asked.

Inside the book sprint room with us each day was our book sprint
facilitator Adam Hyde. Without his tireless support and encouragement,
we would have thought a book of this scope was impossible in five days.
Adam has proven the book sprint method effective again and again. He
creates both tools and faith in collaborative authoring at
`www.booksprints.net <http://www.booksprints.net/>`_.

We couldn't have pulled it off without so much supportive help and
encouragement.
@@ -0,0 +1,542 @@
=================================
Tales From the Cryp^H^H^H^H Cloud
=================================

Herein lies a selection of tales from OpenStack cloud operators. Read,
and learn from their wisdom.

Double VLAN
~~~~~~~~~~~

I was on-site in Kelowna, British Columbia, Canada setting up a new
OpenStack cloud. The deployment was fully automated: Cobbler deployed
the OS on the bare metal, bootstrapped it, and Puppet took over from
there. I had run the deployment scenario so many times in practice and
took for granted that everything was working.

On my last day in Kelowna, I was in a conference call from my hotel. In
the background, I was fooling around on the new cloud. I launched an
instance and logged in. Everything looked fine. Out of boredom, I ran
:command:`ps aux` and all of a sudden the instance locked up.

Thinking it was just a one-off issue, I terminated the instance and
launched a new one. By then, the conference call had ended and I was off to
the data center.

At the data center, I was finishing up some tasks and remembered the
lock-up. I logged into the new instance and ran :command:`ps aux` again.
It worked. Phew. I decided to run it one more time. It locked up.

After reproducing the problem several times, I came to the unfortunate
conclusion that this cloud did indeed have a problem. Even worse, my
time was up in Kelowna and I had to return to Calgary.

Where do you even begin troubleshooting something like this? An instance
that just randomly locks up when a command is issued. Is it the image?
Nope—it happens on all images. Is it the compute node? Nope—all nodes.
Is the instance locked up? No! New SSH connections work just fine!

We reached out for help. A networking engineer suggested it was an MTU
issue. Great! MTU! Something to go on! What's MTU and why would it cause
a problem?

MTU is the maximum transmission unit. It specifies the maximum number of
bytes that the interface accepts for each packet. If two interfaces have
two different MTUs, bytes might get chopped off and weird things
happen—such as random session lockups.

.. note::

   Not all packets have a size of 1500. Running the :command:`ls` command over
   SSH might only create a single packet less than 1500 bytes.
   However, running a command with heavy output, such as :command:`ps aux`,
   requires several packets of 1500 bytes.

OK, so where is the MTU issue coming from? Why haven't we seen this in
any other deployment? What's new in this situation? Well, new data
center, new uplink, new switches, new model of switches, new servers,
first time using this model of servers… so, basically everything was
new. Wonderful. We toyed around with raising the MTU at various areas:
the switches, the NICs on the compute nodes, the virtual NICs in the
instances; we even had the data center raise the MTU for our uplink
interface. Some changes worked, some didn't. This line of
troubleshooting didn't feel right, though. We shouldn't have to be
changing the MTU in these areas.

As a last resort, our network admin (Alvaro) and I sat down with
four terminal windows, a pencil, and a piece of paper. In one window, we
ran ping. In the second window, we ran ``tcpdump`` on the cloud
controller. In the third, ``tcpdump`` on the compute node. And the fourth
had ``tcpdump`` on the instance. For background, this cloud was a
multi-node, non-multi-host setup.

One cloud controller acted as a gateway to all compute nodes.
VlanManager was used for the network config. This means that the cloud
controller and all compute nodes had a different VLAN for each OpenStack
project. We used the :option:`-s` option of ``ping`` to change the packet
size. We watched as sometimes packets would fully return, sometimes they'd
only make it out and never back in, and sometimes the packets would stop at a
random point. We changed ``tcpdump`` to start displaying the hex dump of
the packet. We pinged between every combination of outside, controller,
compute, and instance.
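For reference, the ``-s`` value passed to ``ping`` is the ICMP payload size, not the full packet: the kernel adds an 8-byte ICMP header and a 20-byte IP header on top. A quick sketch of that arithmetic (illustrative, not part of the original session):

```python
# The ping -s value is the ICMP payload; the IP header (20 bytes,
# no options) and ICMP header (8 bytes) are added on top of it.
IP_HEADER = 20
ICMP_HEADER = 8

def max_ping_payload(mtu: int) -> int:
    """Largest -s value whose resulting IP packet still fits the MTU."""
    return mtu - IP_HEADER - ICMP_HEADER

assert max_ping_payload(1500) == 1472  # standard Ethernet MTU
assert max_ping_payload(9000) == 8972  # jumbo frames
```

So a sweep of ``-s`` values around 1472 is a common way to find the largest packet that survives a path with a 1500-byte MTU.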
Finally, Alvaro noticed something. When a packet from the outside hits
the cloud controller, it should not be configured with a VLAN. We
verified this as true. When the packet went from the cloud controller to
the compute node, it should only have a VLAN if it was destined for an
instance. This was still true. When the ping reply was sent from the
instance, it should be in a VLAN. True. When it came back to the cloud
controller and on its way out to the Internet, it should no longer have
a VLAN. False. Uh oh. It looked as though the VLAN part of the packet
was not being removed.

That made no sense.

While bouncing this idea around in our heads, I was randomly typing
commands on the compute node:

.. code-block:: console

   $ ip a
   …
   10: vlan100@vlan20: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br100 state UP
   …

"Hey Alvaro, can you run a VLAN on top of a VLAN?"

"If you did, you'd add an extra 4 bytes to the packet…"

Then it all made sense…

.. code-block:: console

   $ grep vlan_interface /etc/nova/nova.conf
   vlan_interface=vlan20

In ``nova.conf``, ``vlan_interface`` specifies what interface OpenStack
should attach all VLANs to. The correct setting should have been:

.. code-block:: ini

   vlan_interface=bond0

as this is the server's bonded NIC.

vlan20 is the VLAN that the data center gave us for outgoing Internet
access. It's a correct VLAN and is also attached to bond0.

By mistake, I configured OpenStack to attach all tenant VLANs to vlan20
instead of bond0, thereby stacking one VLAN on top of another. This added
an extra 4 bytes to each packet and caused a packet of 1504 bytes to be
sent out, which would cause problems when it arrived at an interface that
only accepted 1500.
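The arithmetic behind the failure is small enough to check by hand; each 802.1Q VLAN tag adds 4 bytes, so stacking tags pushes a full-sized packet past the MTU. A minimal sketch (illustrative, not from the deployment itself):

```python
# Illustrative sketch of why stacked VLAN tags break a 1500-byte MTU:
# each 802.1Q VLAN tag adds 4 bytes to what the next hop must accept.
VLAN_TAG_BYTES = 4

def bytes_on_wire(packet_bytes: int, extra_vlan_tags: int) -> int:
    """Size the receiving interface must accept for one packet."""
    return packet_bytes + extra_vlan_tags * VLAN_TAG_BYTES

# With the tag stripped as expected, a full-sized packet fits...
assert bytes_on_wire(1500, extra_vlan_tags=0) == 1500

# ...but one leftover, unstripped tag makes it 1504 bytes, which an
# interface that only accepts 1500 will drop or truncate.
assert bytes_on_wire(1500, extra_vlan_tags=1) == 1504
```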
As soon as this setting was fixed, everything worked.

"The Issue"
~~~~~~~~~~~

At the end of August 2012, a post-secondary school in Alberta, Canada
migrated its infrastructure to an OpenStack cloud. As luck would have
it, within the first day or two of it running, one of their servers just
disappeared from the network. Blip. Gone.

After restarting the instance, everything was back up and running. We
reviewed the logs and saw that at some point, network communication
stopped and then everything went idle. We chalked this up to a random
occurrence.

A few nights later, it happened again.

We reviewed both sets of logs. The one thing that stood out the most was
DHCP. At the time, OpenStack, by default, set DHCP leases for one minute
(it's now two minutes). This means that every instance contacts the
cloud controller (DHCP server) to renew its fixed IP. For some reason,
this instance could not renew its IP. We correlated the instance's logs
with the logs on the cloud controller and put together a conversation:

#. Instance tries to renew IP.

#. Cloud controller receives the renewal request and sends a response.

#. Instance "ignores" the response and re-sends the renewal request.

#. Cloud controller receives the second request and sends a new
   response.

#. Instance begins sending a renewal request to ``255.255.255.255``
   since it hasn't heard back from the cloud controller.

#. The cloud controller receives the ``255.255.255.255`` request and
   sends a third response.

#. The instance finally gives up.

With this information in hand, we were sure that the problem had to do
with DHCP. We thought that for some reason, the instance wasn't getting
a new IP address and with no IP, it shut itself off from the network.

A quick Google search turned up this: `DHCP lease errors in VLAN
mode <https://lists.launchpad.net/openstack/msg11696.html>`_
(https://lists.launchpad.net/openstack/msg11696.html), which further
supported our DHCP theory.

An initial idea was to just increase the lease time. If the instance
only renewed once every week, the chances of this problem happening
would be tremendously smaller than every minute. This didn't solve the
problem, though. It was just covering the problem up.
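The scale of that tradeoff is easy to quantify. A rough sketch, assuming the common DHCP convention (RFC 2131) that a client first tries to renew at half the lease time:

```python
# Rough renewal-traffic estimate per instance, assuming the usual DHCP
# T1 renewal timer of half the lease time (RFC 2131 convention).
SECONDS_PER_DAY = 86400

def renewals_per_day(lease_seconds: int) -> float:
    t1 = lease_seconds / 2  # when the client starts trying to renew
    return SECONDS_PER_DAY / t1

# A one-minute lease means each instance renews every ~30 seconds:
assert renewals_per_day(60) == 2880

# A one-week lease shrinks the window for "The Issue" enormously,
# without actually fixing whatever kills the renewal:
assert renewals_per_day(7 * 24 * 3600) < 1
```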
We decided to have ``tcpdump`` run on this instance and see if we could
catch it in action again. Sure enough, we did.

The ``tcpdump`` looked very, very weird. In short, it looked as though
network communication stopped before the instance tried to renew its IP.
Since there is so much DHCP chatter from a one-minute lease, it's very
hard to confirm, but even with only milliseconds difference between
packets, if one packet arrives first, it arrived first, and if that
packet reported network issues, then it had to have happened before
DHCP.

Additionally, the instance in question was responsible for a very, very
large backup job each night. While "The Issue" (as we were now calling
it) didn't happen exactly when the backup happened, it was close enough
(a few hours) that we couldn't ignore it.

Further days went by and we caught The Issue in action more and more. We
found that dhclient was not running after The Issue happened. Now we were
back to thinking it was a DHCP issue. Running
:command:`/etc/init.d/networking restart` brought everything back up and
running.

Ever have one of those days where all of a sudden you get the Google
results you were looking for? Well, that's what happened here. I was
looking for information on dhclient and why it dies when it can't renew
its lease, and all of a sudden I found a bunch of OpenStack and dnsmasq
discussions that were identical to the problem we were seeing!

`Problem with Heavy Network IO and
Dnsmasq <http://www.gossamer-threads.com/lists/openstack/operators/18197>`_
(http://www.gossamer-threads.com/lists/openstack/operators/18197)

`instances losing IP address while running, due to No
DHCPOFFER <http://www.gossamer-threads.com/lists/openstack/dev/14696>`_
(http://www.gossamer-threads.com/lists/openstack/dev/14696)

Seriously, Google.

This bug report was the key to everything: `KVM images lose connectivity
with bridged
network <https://bugs.launchpad.net/ubuntu/+source/qemu-kvm/+bug/997978>`_
(https://bugs.launchpad.net/ubuntu/+source/qemu-kvm/+bug/997978)

It was funny to read the report. It was full of people who had some
strange network problem but didn't quite explain it in the same way.

So it was a qemu/kvm bug.

At the same time as finding the bug report, a co-worker was able to
successfully reproduce The Issue! How? He used ``iperf`` to spew a ton
of bandwidth at an instance. Within 30 minutes, the instance just
disappeared from the network.

Armed with a patched qemu and a way to reproduce, we set out to see if
we had finally solved The Issue. After 48 hours straight of hammering the
instance with bandwidth, we were confident. The rest is history. You can
search the bug report for "joe" to find my comments and actual tests.

Disappearing Images
~~~~~~~~~~~~~~~~~~~

At the end of 2012, Cybera (a nonprofit with a mandate to oversee the
development of cyberinfrastructure in Alberta, Canada) deployed an
updated OpenStack cloud for their `DAIR
project <http://www.canarie.ca/cloud/>`_
(http://www.canarie.ca/en/dair-program/about). A few days into
production, a compute node locked up. Upon rebooting the node, I checked
to see what instances were hosted on that node so I could boot them on
behalf of the customer. Luckily, only one instance.

The :command:`nova reboot` command wasn't working, so I used :command:`virsh`,
but it immediately came back with an error saying it was unable to find the
backing disk. In this case, the backing disk is the Glance image that is
copied to ``/var/lib/nova/instances/_base`` when the image is used for
the first time. Why couldn't it find it? I checked the directory and
sure enough it was gone.

I reviewed the ``nova`` database and saw the instance's entry in the
``nova.instances`` table. The image that the instance was using matched
what virsh was reporting, so no inconsistency there.

I checked Glance and noticed that this image was a snapshot that the
user created. At least that was good news—this user would have been the
only user affected.

Finally, I checked StackTach and reviewed the user's events. They had
created and deleted several snapshots—most likely experimenting.
Although the timestamps didn't match up, my conclusion was that they
launched their instance and then deleted the snapshot and it was somehow
removed from ``/var/lib/nova/instances/_base``. None of that made sense,
but it was the best I could come up with.

It turns out the reason that this compute node locked up was a hardware
issue. We removed it from the DAIR cloud and called Dell to have it
serviced. Dell arrived and began working. Somehow or another (or a fat
finger), a different compute node was bumped and rebooted. Great.

When this node fully booted, I ran through the same scenario of seeing
what instances were running so I could turn them back on. There were a
total of four. Three booted and one gave an error. It was the same error
as before: unable to find the backing disk. Seriously, what?

Again, it turns out that the image was a snapshot. The three other
instances that successfully started were standard cloud images. Was it a
problem with snapshots? That didn't make sense.

A note about DAIR's architecture: ``/var/lib/nova/instances`` is a
shared NFS mount. This means that all compute nodes have access to it,
which includes the ``_base`` directory. Another centralized area is
``/var/log/rsyslog`` on the cloud controller. This directory collects
all OpenStack logs from all compute nodes. I wondered if there were any
entries for the file that :command:`virsh` was reporting:

.. code-block:: console

   dair-ua-c03/nova.log:Dec 19 12:10:59 dair-ua-c03
   2012-12-19 12:10:59 INFO nova.virt.libvirt.imagecache
   [-] Removing base file:
   /var/lib/nova/instances/_base/7b4783508212f5d242cbf9ff56fb8d33b4ce6166_10

Ah-hah! So OpenStack was deleting it. But why?

A feature was introduced in Essex to periodically check and see if there
were any ``_base`` files not in use. If there were, OpenStack Compute
would delete them. This idea sounds innocent enough and has some good
qualities to it. But how did this feature end up turned on? It was
disabled by default in Essex, as it should be. It was `decided to be
turned on in Folsom <https://bugs.launchpad.net/nova/+bug/1029674>`_
(https://bugs.launchpad.net/nova/+bug/1029674). I cannot emphasize
enough that:

*Actions which delete things should not be enabled by default.*

Disk space is cheap these days. Data recovery is not.

Secondly, DAIR's shared ``/var/lib/nova/instances`` directory
contributed to the problem. Since all compute nodes have access to this
directory, all compute nodes periodically review the ``_base`` directory.
If there is only one instance using an image, and the node that the
instance is on is down for a few minutes, it won't be able to mark the
image as still in use. Therefore, the image seems like it's not in use
and is deleted. When the compute node comes back online, the instance
hosted on that node is unable to start.
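The race is easy to model. A simplified sketch of the cleanup decision (hypothetical logic for illustration, not the actual nova imagecache code):

```python
# Simplified model of the _base cleanup race on a shared instances
# directory. Hypothetical logic, not the actual nova imagecache code:
# each live node marks the images its instances use; any image that
# nobody marked during the periodic check is treated as unused.

def images_deleted(image_users: dict, nodes_up: set) -> set:
    """image_users maps image name -> node hosting its only instance."""
    marked_in_use = {img for img, node in image_users.items()
                     if node in nodes_up}
    return set(image_users) - marked_in_use

users = {"base-ubuntu": "c01", "user-snapshot": "c03"}

# All nodes up during the check: nothing is deleted.
assert images_deleted(users, {"c01", "c03"}) == set()

# c03 happens to be down for a few minutes during the check: the
# backing image of its instance looks unused and gets deleted.
assert images_deleted(users, {"c01"}) == {"user-snapshot"}
```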
The Valentine's Day Compute Node Massacre
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Although the title of this story is much more dramatic than the actual
event, I don't think, or hope, that I'll have the opportunity to use
"Valentine's Day Massacre" again in a title.

This past Valentine's Day, I received an alert that a compute node was
no longer available in the cloud—meaning,

.. code-block:: console

   $ nova service-list

showed this particular node in a down state.

I logged into the cloud controller and was able to both ``ping`` and SSH
into the problematic compute node, which seemed very odd. Usually if I
receive this type of alert, the compute node has totally locked up and
is inaccessible.

After a few minutes of troubleshooting, I saw the following details:

- A user recently tried launching a CentOS instance on that node

- This user was the only user on the node (new node)

- The load shot up to 8 right before I received the alert

- The bonded 10gb network device (bond0) was in a DOWN state

- The 1gb NIC was still alive and active

I looked at the status of both NICs in the bonded pair and saw that
neither was able to communicate with the switch port. Seeing as how each
NIC in the bond is connected to a separate switch, I thought that the
chance of a switch port dying on each switch at the same time was quite
improbable. I concluded that the 10gb dual-port NIC had died and needed
to be replaced. I created a ticket for the hardware support department at the
data center where the node was hosted. I felt lucky that this was a new
node and no one else was hosted on it yet.

An hour later I received the same alert, but for another compute node.
Crap. OK, now there's definitely a problem going on. Just like the
original node, I was able to log in by SSH. The bond0 NIC was DOWN but
the 1gb NIC was active.

And the best part: the same user had just tried creating a CentOS
instance. What?

I was totally confused at this point, so I texted our network admin to
see if he was available to help. He logged in to both switches and
immediately saw the problem: the switches detected spanning tree packets
coming from the two compute nodes and immediately shut the ports down to
prevent spanning tree loops:

.. code-block:: console

   Feb 15 01:40:18 SW-1 Stp: %SPANTREE-4-BLOCK_BPDUGUARD: Received BPDU packet on Port-Channel35 with BPDU guard enabled. Disabling interface. (source mac fa:16:3e:24:e7:22)
   Feb 15 01:40:18 SW-1 Ebra: %ETH-4-ERRDISABLE: bpduguard error detected on Port-Channel35.
   Feb 15 01:40:18 SW-1 Mlag: %MLAG-4-INTF_INACTIVE_LOCAL: Local interface Port-Channel35 is link down. MLAG 35 is inactive.
   Feb 15 01:40:18 SW-1 Ebra: %LINEPROTO-5-UPDOWN: Line protocol on Interface Port-Channel35 (Server35), changed state to down
   Feb 15 01:40:19 SW-1 Stp: %SPANTREE-6-INTERFACE_DEL: Interface Port-Channel35 has been removed from instance MST0
   Feb 15 01:40:19 SW-1 Ebra: %LINEPROTO-5-UPDOWN: Line protocol on Interface Ethernet35 (Server35), changed state to down

He re-enabled the switch ports and the two compute nodes immediately
came back to life.

Unfortunately, this story has an open ending... we're still looking into
why the CentOS image was sending out spanning tree packets. Further,
we're researching a proper way to mitigate this from happening.
It's a bigger issue than one might think. While it's extremely important
for switches to prevent spanning tree loops, it's very problematic to
have an entire compute node cut off from the network when this happens.
If a compute node is hosting 100 instances and one of them sends a
spanning tree packet, that instance has effectively DDOS'd the other 99
instances.

This is an ongoing and hot topic in networking circles—especially with
the rise of virtualization and virtual switches.

Down the Rabbit Hole
~~~~~~~~~~~~~~~~~~~~

Users being able to retrieve console logs from running instances is a
boon for support—many times they can figure out what's going on inside
their instance and fix what's going on without bothering you.
Unfortunately, sometimes overzealous logging of failures can cause
problems of its own.

A report came in: VMs were launching slowly, or not at all. Cue the
standard checks—nothing in Nagios, but there was a spike in network
traffic towards the current master of our RabbitMQ cluster. Investigation
started, but soon the other parts of the queue cluster were leaking
memory like a sieve. Then the alert came in—the master Rabbit server
went down and connections failed over to the slave.

At that time, our control services were hosted by another team and we
didn't have much debugging information to determine what was going on
with the master, and we could not reboot it. That team noted that it
failed without alert, but managed to reboot it. After an hour, the
cluster had returned to its normal state and we went home for the day.

Continuing the diagnosis the next morning was kick-started by another
identical failure. We quickly got the message queue running again, and
tried to work out why Rabbit was suffering from so much network traffic.
Enabling debug logging on nova-api quickly brought understanding. A
``tail -f /var/log/nova/nova-api.log`` was scrolling by faster
than we'd ever seen before. CTRL+C on that and we could plainly see the
contents of a system log spewing failures over and over again - a system
log from one of our users' instances.

After finding the instance ID, we headed over to
``/var/lib/nova/instances`` to find the ``console.log``:

.. code-block:: console

   adm@cc12:/var/lib/nova/instances/instance-00000e05# wc -l console.log
   92890453 console.log
   adm@cc12:/var/lib/nova/instances/instance-00000e05# ls -sh console.log
   5.5G console.log

Sure enough, the user had been periodically refreshing the console log
page on the dashboard and the 5.5 GB file was traversing the Rabbit cluster
to get to the dashboard.

We called them and asked them to stop for a while, and they were happy
to abandon the horribly broken VM. After that, we started monitoring the
size of console logs.
To this day, `the issue <https://bugs.launchpad.net/nova/+bug/832507>`__
(https://bugs.launchpad.net/nova/+bug/832507) doesn't have a permanent
resolution, but we look forward to the discussion at the next summit.

Havana Haunted by the Dead
~~~~~~~~~~~~~~~~~~~~~~~~~~

Felix Lee of Academia Sinica Grid Computing Centre in Taiwan contributed
this story.

I just upgraded OpenStack from Grizzly to Havana 2013.2-2 using the RDO
repository and everything was running pretty well—except the EC2 API.

I noticed that the API would suffer from a heavy load and respond slowly
to particular EC2 requests, such as ``RunInstances``.

Output from ``/var/log/nova/nova-api.log`` on :term:`Havana`:

.. code-block:: console

   2014-01-10 09:11:45.072 129745 INFO nova.ec2.wsgi.server
   [req-84d16d16-3808-426b-b7af-3b90a11b83b0
   0c6e7dba03c24c6a9bce299747499e8a 7052bd6714e7460caeb16242e68124f9]
   117.103.103.29 "GET
   /services/Cloud?AWSAccessKeyId=[something]&Action=RunInstances&ClientToken=[something]&ImageId=ami-00000001&InstanceInitiatedShutdownBehavior=terminate...
   HTTP/1.1" status: 200 len: 1109 time: 138.5970151

This request took over two minutes to process, but executed quickly on
another co-existing Grizzly deployment using the same hardware and
system configuration.

Output from ``/var/log/nova/nova-api.log`` on :term:`Grizzly`:

.. code-block:: console

   2014-01-08 11:15:15.704 INFO nova.ec2.wsgi.server
   [req-ccac9790-3357-4aa8-84bd-cdaab1aa394e
   ebbd729575cb404081a45c9ada0849b7 8175953c209044358ab5e0ec19d52c37]
   117.103.103.29 "GET
   /services/Cloud?AWSAccessKeyId=[something]&Action=RunInstances&ClientToken=[something]&ImageId=ami-00000007&InstanceInitiatedShutdownBehavior=terminate...
   HTTP/1.1" status: 200 len: 931 time: 3.9426181
|
||||||
|
|
||||||
|
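
One quick way to spot requests like this is to compare the ``time:``
field at the end of each ``nova.ec2.wsgi.server`` log line. A minimal
sketch (the regex and the 30-second threshold are illustrative
assumptions, not part of the original investigation):

```python
import re

# The "time:" field is the request wall-clock time in seconds, as in the
# excerpts above (138.59 s on Havana versus 3.94 s on Grizzly).
TIME_RE = re.compile(r'time: ([\d.]+)')

def slow_requests(log_lines, threshold=30.0):
    """Return log lines whose request time exceeds the threshold."""
    hits = []
    for line in log_lines:
        match = TIME_RE.search(line)
        if match and float(match.group(1)) > threshold:
            hits.append(line)
    return hits
```
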

While monitoring system resources, I noticed a significant increase in
memory consumption while the EC2 API processed this request. I thought
it wasn't handling memory properly, possibly not releasing it. If the
API received several of these requests, memory consumption quickly grew
until the system ran out of RAM and began using swap. Each node has 48
GB of RAM, and the ``nova-api`` process would consume all of it within
minutes. Once this happened, the entire system would become unusably
slow until I restarted the nova-api service.

So, I found myself wondering what had changed in the EC2 API on Havana
that might cause this to happen. Was it a bug, or normal behavior that I
now needed to work around?

After digging into the nova (OpenStack Compute) code, I noticed two
areas in ``api/ec2/cloud.py`` potentially impacting my system:

.. code-block:: python

   instances = self.compute_api.get_all(context,
                                        search_opts=search_opts,
                                        sort_dir='asc')

   sys_metas = self.compute_api.get_all_system_metadata(
       context, search_filts=[{'key': ['EC2_client_token']},
                              {'value': [client_token]}])

Since my database contained many records, over 1 million metadata records
and over 300,000 instance records in "deleted" or "errored" states, each
search took a long time. I decided to clean up the database by first
archiving a copy for backup and then performing some deletions using the
MySQL client. For example, I ran the following SQL command to remove
rows of instances deleted for over a year:

.. code-block:: console

   mysql> delete from nova.instances where deleted=1 and terminated_at < (NOW() - INTERVAL 1 YEAR);

Performance increased greatly after deleting the old records, and my new
deployment continues to behave well.
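
The cleanup step can also be wrapped so the cutoff date is explicit and
the statement can be reviewed before it is run. This helper is
hypothetical and only mirrors the SQL shown in the story:

```python
import datetime

# Hypothetical helper mirroring the cleanup SQL above: build the DELETE
# statement with an explicit cutoff date so it can be reviewed (and the
# table archived for backup) before it is executed.
def cleanup_sql(cutoff):
    return ("DELETE FROM nova.instances "
            "WHERE deleted=1 AND terminated_at < '%s';" % cutoff.date())

one_year_ago = datetime.datetime.utcnow() - datetime.timedelta(days=365)
print(cleanup_sql(one_year_ago))
```

As in the story, archive a copy of the table before running anything
this generates.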
@ -0,0 +1,62 @@

=========
Resources
=========

OpenStack
~~~~~~~~~

- `Installation Guide for openSUSE 13.2 and SUSE Linux Enterprise
  Server 12 <http://docs.openstack.org/liberty/install-guide-obs/>`_

- `Installation Guide for Red Hat Enterprise Linux 7, CentOS 7, and
  Fedora 22 <http://docs.openstack.org/liberty/install-guide-rdo/>`_

- `Installation Guide for Ubuntu 14.04 (LTS)
  Server <http://docs.openstack.org/liberty/install-guide-ubuntu/>`_

- `OpenStack Administrator Guide <http://docs.openstack.org/admin-guide/>`_

- `OpenStack Cloud Computing Cookbook (Packt
  Publishing) <http://www.packtpub.com/openstack-cloud-computing-cookbook-second-edition/book>`_

Cloud (General)
~~~~~~~~~~~~~~~

- `“The NIST Definition of Cloud
  Computing” <http://csrc.nist.gov/publications/nistpubs/800-145/SP800-145.pdf>`_

Python
~~~~~~

- `Dive Into Python (Apress) <http://www.diveintopython.net/>`_

Networking
~~~~~~~~~~

- `TCP/IP Illustrated, Volume 1: The Protocols, 2/E
  (Pearson) <http://www.pearsonhighered.com/educator/product/TCPIP-Illustrated-Volume-1-The-Protocols/9780321336316.page>`_

- `The TCP/IP Guide (No Starch
  Press) <http://www.nostarch.com/tcpip.htm>`_

- `“A tcpdump Tutorial and
  Primer” <http://danielmiessler.com/study/tcpdump/>`_

Systems Administration
~~~~~~~~~~~~~~~~~~~~~~

- `UNIX and Linux Systems Administration Handbook (Prentice
  Hall) <http://www.admin.com/>`_

Virtualization
~~~~~~~~~~~~~~

- `The Book of Xen (No Starch
  Press) <http://www.nostarch.com/xen.htm>`_

Configuration Management
~~~~~~~~~~~~~~~~~~~~~~~~

- `Puppet Labs Documentation <http://docs.puppetlabs.com/>`_

- `Pro Puppet (Apress) <http://www.apress.com/9781430230571>`_
@ -0,0 +1,435 @@

=====================
Working with Roadmaps
=====================

The good news: OpenStack has unprecedented transparency when it comes to
providing information about what's coming up. The bad news: each release
moves very quickly. The purpose of this appendix is to highlight some of
the useful pages to track, and take an educated guess at what is coming
up in the next release and perhaps further afield.

OpenStack follows a six-month release cycle, typically releasing in
April/May and October/November each year. At the start of each cycle,
the community gathers in a single location for a design summit. At the
summit, the features for the coming releases are discussed, prioritized,
and planned. The figure below shows an example release cycle, with dates
showing milestone releases, code freeze, and string freeze dates, along
with an example of when the summit occurs. Milestones are interim releases
within the cycle that are available as packages for download and
testing. Code freeze is putting a stop to adding new features to the
release. String freeze is putting a stop to changing any strings within
the source code.

.. image:: figures/osog_ac01.png
   :width: 100%

Information Available to You
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There are several good sources of information available that you can use
to track your OpenStack development desires.

Release notes are maintained on the OpenStack wiki, and also shown here:

.. list-table::
   :widths: 25 25 25 25
   :header-rows: 1

   * - Series
     - Status
     - Releases
     - Date
   * - Liberty
     - `Under Development
       <https://wiki.openstack.org/wiki/Liberty_Release_Schedule>`_
     - 2015.2
     - Oct, 2015
   * - Kilo
     - `Current stable release, security-supported
       <https://wiki.openstack.org/wiki/Kilo_Release_Schedule>`_
     - `2015.1 <https://wiki.openstack.org/wiki/ReleaseNotes/Kilo>`_
     - Apr 30, 2015
   * - Juno
     - `Security-supported
       <https://wiki.openstack.org/wiki/Juno_Release_Schedule>`_
     - `2014.2 <https://wiki.openstack.org/wiki/ReleaseNotes/Juno>`_
     - Oct 16, 2014
   * - Icehouse
     - `End-of-life
       <https://wiki.openstack.org/wiki/Icehouse_Release_Schedule>`_
     - `2014.1 <https://wiki.openstack.org/wiki/ReleaseNotes/Icehouse>`_
     - Apr 17, 2014
   * -
     -
     - `2014.1.1 <https://wiki.openstack.org/wiki/ReleaseNotes/2014.1.1>`_
     - Jun 9, 2014
   * -
     -
     - `2014.1.2 <https://wiki.openstack.org/wiki/ReleaseNotes/2014.1.2>`_
     - Aug 8, 2014
   * -
     -
     - `2014.1.3 <https://wiki.openstack.org/wiki/ReleaseNotes/2014.1.3>`_
     - Oct 2, 2014
   * - Havana
     - End-of-life
     - `2013.2 <https://wiki.openstack.org/wiki/ReleaseNotes/Havana>`_
     - Oct 17, 2013
   * -
     -
     - `2013.2.1 <https://wiki.openstack.org/wiki/ReleaseNotes/2013.2.1>`_
     - Dec 16, 2013
   * -
     -
     - `2013.2.2 <https://wiki.openstack.org/wiki/ReleaseNotes/2013.2.2>`_
     - Feb 13, 2014
   * -
     -
     - `2013.2.3 <https://wiki.openstack.org/wiki/ReleaseNotes/2013.2.3>`_
     - Apr 3, 2014
   * -
     -
     - `2013.2.4 <https://wiki.openstack.org/wiki/ReleaseNotes/2013.2.4>`_
     - Sep 22, 2014
   * - Grizzly
     - End-of-life
     - `2013.1 <https://wiki.openstack.org/wiki/ReleaseNotes/Grizzly>`_
     - Apr 4, 2013
   * -
     -
     - `2013.1.1 <https://wiki.openstack.org/wiki/ReleaseNotes/2013.1.1>`_
     - May 9, 2013
   * -
     -
     - `2013.1.2 <https://wiki.openstack.org/wiki/ReleaseNotes/2013.1.2>`_
     - Jun 6, 2013
   * -
     -
     - `2013.1.3 <https://wiki.openstack.org/wiki/ReleaseNotes/2013.1.3>`_
     - Aug 8, 2013
   * -
     -
     - `2013.1.4 <https://wiki.openstack.org/wiki/ReleaseNotes/2013.1.4>`_
     - Oct 17, 2013
   * -
     -
     - `2013.1.5 <https://wiki.openstack.org/wiki/ReleaseNotes/2013.1.5>`_
     - Mar 20, 2015
   * - Folsom
     - End-of-life
     - `2012.2 <https://wiki.openstack.org/wiki/ReleaseNotes/Folsom>`_
     - Sep 27, 2012
   * -
     -
     - `2012.2.1 <https://wiki.openstack.org/wiki/ReleaseNotes/2012.2.1>`_
     - Nov 29, 2012
   * -
     -
     - `2012.2.2 <https://wiki.openstack.org/wiki/ReleaseNotes/2012.2.2>`_
     - Dec 13, 2012
   * -
     -
     - `2012.2.3 <https://wiki.openstack.org/wiki/ReleaseNotes/2012.2.3>`_
     - Jan 31, 2013
   * -
     -
     - `2012.2.4 <https://wiki.openstack.org/wiki/ReleaseNotes/2012.2.4>`_
     - Apr 11, 2013
   * - Essex
     - End-of-life
     - `2012.1 <https://wiki.openstack.org/wiki/ReleaseNotes/Essex>`_
     - Apr 5, 2012
   * -
     -
     - `2012.1.1 <https://wiki.openstack.org/wiki/ReleaseNotes/2012.1.1>`_
     - Jun 22, 2012
   * -
     -
     - `2012.1.2 <https://wiki.openstack.org/wiki/ReleaseNotes/2012.1.2>`_
     - Aug 10, 2012
   * -
     -
     - `2012.1.3 <https://wiki.openstack.org/wiki/ReleaseNotes/2012.1.3>`_
     - Oct 12, 2012
   * - Diablo
     - Deprecated
     - `2011.3 <https://wiki.openstack.org/wiki/ReleaseNotes/Diablo>`_
     - Sep 22, 2011
   * -
     -
     - `2011.3.1 <https://wiki.openstack.org/wiki/ReleaseNotes/2011.3.1>`_
     - Jan 19, 2012
   * - Cactus
     - Deprecated
     - `2011.2 <https://wiki.openstack.org/wiki/ReleaseNotes/Cactus>`_
     - Apr 15, 2011
   * - Bexar
     - Deprecated
     - `2011.1 <https://wiki.openstack.org/wiki/ReleaseNotes/Bexar>`_
     - Feb 3, 2011
   * - Austin
     - Deprecated
     - `2010.1 <https://wiki.openstack.org/wiki/ReleaseNotes/Austin>`_
     - Oct 21, 2010

Here are some other resources:

- `A breakdown of current features under development, with their target
  milestone <http://status.openstack.org/release/>`_

- `A list of all features, including those not yet under
  development <https://blueprints.launchpad.net/openstack>`_

- `Rough-draft design discussions ("etherpads") from the last design
  summit <https://wiki.openstack.org/wiki/Summit/Kilo/Etherpads>`_

- `List of individual code changes under
  review <https://review.openstack.org/>`_

Influencing the Roadmap
~~~~~~~~~~~~~~~~~~~~~~~

OpenStack truly welcomes your ideas (and contributions) and highly
values feedback from real-world users of the software. By learning a
little about the process that drives feature development, you can
participate and perhaps get the additions you desire.

Feature requests typically start their life in Etherpad, a collaborative
editing tool, which is used to take coordinating notes at a design
summit session specific to the feature. This then leads to the creation
of a blueprint on the Launchpad site for the particular project, which
is used to describe the feature more formally. Blueprints are then
approved by project team members, and development can begin.

Therefore, the fastest way to get your feature request up for
consideration is to create an Etherpad with your ideas and propose a
session to the design summit. If the design summit has already passed,
you may also create a blueprint directly. Read this `blog post about how
to work with blueprints
<http://vmartinezdelacruz.com/how-to-work-with-blueprints-without-losing-your-mind/>`_
from the perspective of Victoria Martínez, a developer intern.

The roadmap for the next release as it is developed can be seen at
`Releases <http://releases.openstack.org>`_.

To determine the potential features going into future releases, or to
look at features implemented previously, take a look at the existing
blueprints such as `OpenStack Compute (nova)
Blueprints <https://blueprints.launchpad.net/nova>`_, `OpenStack
Identity (keystone)
Blueprints <https://blueprints.launchpad.net/keystone>`_, and release
notes.

Aside from the direct-to-blueprint pathway, there is another very
well-regarded mechanism to influence the development roadmap: the user
survey. Found at http://openstack.org/user-survey, it allows you to
provide details of your deployments and needs, anonymously by default.
Each cycle, the user committee analyzes the results and produces a
report, including providing specific information to the technical
committee and project team leads.

Aspects to Watch
~~~~~~~~~~~~~~~~

You want to keep an eye on the areas improving within OpenStack. The
best way to "watch" roadmaps for each project is to look at the
blueprints that are being approved for work on milestone releases. You
can also learn from PTL webinars that follow the OpenStack summits twice
a year.

Driver Quality Improvements
---------------------------

A major quality push has occurred across drivers and plug-ins in Block
Storage, Compute, and Networking. Particularly, developers of Compute
and Networking drivers that require proprietary or hardware products are
now required to provide an automated external testing system for use
during the development process.

Easier Upgrades
---------------

One of the most requested features since OpenStack began (for components
other than Object Storage, which tends to "just work"): easier upgrades.
In all recent releases, internal messaging communication is versioned,
meaning services can theoretically drop back to backward-compatible
behavior. This allows you to run later versions of some components,
while keeping older versions of others.
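
The version negotiation idea behind that backward compatibility can be
illustrated with a toy sketch (this is illustrative only, not nova's
actual RPC code):

```python
# Illustrative sketch (not nova's actual RPC code): a newer sender caps
# each message at the highest version the older receiver understands, so
# mixed-version services can keep talking during a rolling upgrade.
def negotiate_version(sender_version, receiver_version):
    """Return the highest message version both sides support."""
    return min(sender_version, receiver_version)

# A 1.3-capable service talking to a 1.1-capable one falls back
# to the 1.1 message format.
print(negotiate_version((1, 3), (1, 1)))   # prints (1, 1)
```
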

In addition, database migrations are now tested with the Turbo Hipster
tool. This tool tests database migration performance on copies of
real-world user databases.

These changes have facilitated the first proper OpenStack upgrade guide,
found in :doc:`ops_upgrades`, and will continue to improve in the next
release.

Deprecation of Nova Network
---------------------------

With the introduction of the full software-defined networking stack
provided by OpenStack Networking (neutron) in the Folsom release,
development effort on the initial networking code that remains part of
the Compute component has gradually lessened. While many still use
``nova-network`` in production, there has been a long-term plan to
remove the code in favor of the more flexible and full-featured
OpenStack Networking.

An attempt was made to deprecate ``nova-network`` during the Havana
release, which was aborted due to the lack of equivalent functionality
(such as the FlatDHCP multi-host high-availability mode mentioned in
this guide), the lack of a migration path between versions, insufficient
testing, and the simplicity of ``nova-network`` for the more
straightforward use cases it traditionally supported. Though significant
effort has been made to address these concerns, ``nova-network`` was not
deprecated in the Juno release. In addition, to a limited degree,
patches to ``nova-network`` have again begun to be accepted, such as
adding a per-network settings feature and SR-IOV support in Juno.

This leaves you with an important point of decision when designing your
cloud. OpenStack Networking is robust enough to use with a small number
of limitations (performance issues in some scenarios, only basic high
availability of layer 3 systems) and provides many more features than
``nova-network``. However, if you do not have the more complex use cases
that can benefit from fuller software-defined networking capabilities,
or are uncomfortable with the new concepts introduced, ``nova-network``
may continue to be a viable option for the next 12 months.

Similarly, if you have an existing cloud and are looking to upgrade from
``nova-network`` to OpenStack Networking, you should have the option to
delay the upgrade for this period of time. However, each release of
OpenStack brings significant new innovation, and regardless of your use
of networking methodology, it is likely best to begin planning for an
upgrade within a reasonable timeframe of each release.

As mentioned, there's currently no way to cleanly migrate from
``nova-network`` to neutron. We recommend that you keep in mind what
such a migration might involve, ready for when a proper migration path
is released.

Distributed Virtual Router
~~~~~~~~~~~~~~~~~~~~~~~~~~

One of the long-time complaints surrounding OpenStack Networking was the
lack of high availability for the layer 3 components. The Juno release
introduced Distributed Virtual Router (DVR), which aims to solve this
problem.

Early indications are that it does do this well for a base set of
scenarios, such as using the ML2 plug-in with Open vSwitch, one flat
external network and VXLAN tenant networks. However, it does appear that
there are problems with the use of VLANs, IPv6, floating IPs, high
north-south traffic scenarios, and large numbers of compute nodes. It is
expected these will improve significantly with the next release, but bug
reports on specific issues are highly desirable.

Replacement of Open vSwitch Plug-in with Modular Layer 2
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The Modular Layer 2 plug-in is a framework allowing OpenStack Networking
to simultaneously utilize the variety of layer-2 networking technologies
found in complex real-world data centers. It currently works with the
existing Open vSwitch, Linux Bridge, and Hyper-V L2 agents and is
intended to replace and deprecate the monolithic plug-ins associated
with those L2 agents.

New API Versions
~~~~~~~~~~~~~~~~

The third version of the Compute API was broadly discussed and worked on
during the Havana and Icehouse release cycles. Current discussions
indicate that the v2 API will remain for many releases, and the next
iteration of the API will be denoted v2.1 and have similar properties to
the existing v2.0, rather than being an entirely new v3 API. This is a
great time to evaluate all the APIs and provide comments while the next
generation APIs are being defined. A new working group was formed
specifically to `improve OpenStack APIs
<https://wiki.openstack.org/wiki/API_Working_Group>`_ and create design
guidelines, which you are welcome to join.

OpenStack on OpenStack (TripleO)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This project continues to improve, and you may consider using it for
greenfield deployments, though according to the latest user survey
results it has yet to see widespread uptake.

Data processing service for OpenStack (sahara)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

As a much-requested answer to big data problems, a dedicated team has
been making solid progress on a Hadoop-as-a-Service project.

Bare metal Deployment (ironic)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The bare-metal deployment has been widely lauded, and development
continues. The Juno release brought the OpenStack Bare metal driver into
the Compute project, and it aimed to deprecate the existing bare-metal
driver in Kilo. If you are a current user of the bare metal driver, a
particular blueprint to follow is `Deprecate the bare metal driver
<https://blueprints.launchpad.net/nova/+spec/deprecate-baremetal-driver>`_.

Database as a Service (trove)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The OpenStack community has had a database-as-a-service tool in
development for some time, and we saw the first integrated release of it
in Icehouse. From its release it was able to deploy database servers out
of the box in a highly available way, initially supporting only MySQL.
Juno introduced support for Mongo (including clustering), PostgreSQL,
and Couchbase, in addition to replication functionality for MySQL. In
Kilo, more advanced clustering capability was delivered, in addition to
better integration with other OpenStack components such as Networking.

Message Service (zaqar)
~~~~~~~~~~~~~~~~~~~~~~~

A service to provide queues of messages and notifications was released.

DNS service (designate)
~~~~~~~~~~~~~~~~~~~~~~~

A long-requested service to provide the ability to manipulate DNS
entries associated with OpenStack resources has gathered a following.
The designate project was also released.

Scheduler Improvements
~~~~~~~~~~~~~~~~~~~~~~

Both Compute and Block Storage rely on schedulers to determine where to
place virtual machines or volumes. In Havana, the Compute scheduler
underwent significant improvement, while in Icehouse it was the
scheduler in Block Storage that received a boost. Further down the
track, an effort that started this cycle to create a holistic scheduler
covering both should come to fruition. Some of the work done in Kilo can
be found under the `Gantt
project <https://wiki.openstack.org/wiki/Gantt/kilo>`_.

Block Storage Improvements
--------------------------

Block Storage is considered a stable project, with wide uptake and a
long track record of quality drivers. The team has discussed many areas
of work at the summits, including better error reporting, automated
discovery, and thin provisioning features.

Toward a Python SDK
-------------------

Though many successfully use the various python-\*client code as an
effective SDK for interacting with OpenStack, consistency between the
projects and documentation availability waxes and wanes. To combat this,
an `effort to improve the
experience <https://wiki.openstack.org/wiki/PythonOpenStackSDK>`_ has
started. Cross-project development efforts in OpenStack have a checkered
history, such as the `unified client
project <https://wiki.openstack.org/wiki/OpenStackClient>`_ having
several false starts. However, the early signs for the SDK project are
promising, and we expect to see results during the Juno cycle.
@ -0,0 +1,192 @@

=========
Use Cases
=========

This appendix contains a small selection of use cases from the
community, with more technical detail than usual. Further examples can
be found on the `OpenStack website <https://www.openstack.org/user-stories/>`_.

NeCTAR
~~~~~~

Who uses it: researchers from the Australian publicly funded research
sector. Use is across a wide variety of disciplines, with the purpose of
instances ranging from running simple web servers to using hundreds of
cores for high-throughput computing.

Deployment
----------

Using OpenStack Compute cells, the NeCTAR Research Cloud spans eight
sites with approximately 4,000 cores per site.

Each site runs a different configuration, as a resource cell in an
OpenStack Compute cells setup. Some sites span multiple data centers,
some use off-compute-node storage with a shared file system, and some
use on-compute-node storage with a non-shared file system. Each site
deploys the Image service with an Object Storage back end. A central
Identity, dashboard, and Compute API service are used. A login to the
dashboard triggers a SAML login with Shibboleth, which creates an
account in the Identity service with an SQL back end. An Object Storage
Global Cluster is used across several sites.

Compute nodes have 24 to 48 cores, with at least 4 GB of RAM per core
and approximately 40 GB of ephemeral storage per core.

All sites are based on Ubuntu 14.04, with KVM as the hypervisor. The
OpenStack version in use is typically the current stable version, with 5
to 10 percent back-ported code from trunk and modifications.

Resources
---------

- `OpenStack.org case
  study <https://www.openstack.org/user-stories/nectar/>`_

- `NeCTAR-RC GitHub <https://github.com/NeCTAR-RC/>`_

- `NeCTAR website <https://www.nectar.org.au/>`_

MIT CSAIL
~~~~~~~~~

Who uses it: researchers from the MIT Computer Science and Artificial
Intelligence Lab.

Deployment
----------

The CSAIL cloud is currently 64 physical nodes with a total of 768
physical cores and 3,456 GB of RAM. Persistent data storage is largely
outside the cloud on NFS, with cloud resources focused on compute
resources. There are more than 130 users in more than 40 projects,
typically running 2,000–2,500 vCPUs in 300 to 400 instances.

We initially deployed on Ubuntu 12.04 with the Essex release of
OpenStack using FlatDHCP multi-host networking.

The software stack is still Ubuntu 12.04 LTS, but now with OpenStack
Havana from the Ubuntu Cloud Archive. KVM is the hypervisor, deployed
using `FAI <http://fai-project.org/>`_ and Puppet for configuration
management. The FAI and Puppet combination is used lab-wide, not only
for OpenStack. There is a single cloud controller node, which also acts
as network controller, with the remainder of the server hardware
dedicated to compute nodes.

Host aggregates and instance-type extra specs are used to provide two
different resource allocation ratios. The default resource allocation
ratios we use are 4:1 CPU and 1.5:1 RAM. Compute-intensive workloads use
instance types that require non-oversubscribed hosts where ``cpu_ratio``
and ``ram_ratio`` are both set to 1.0. Since we have hyper-threading
enabled on our compute nodes, this provides one vCPU per CPU thread, or
two vCPUs per physical core.
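
A quick back-of-the-envelope check of those ratios (a sketch of the
arithmetic only, not CSAIL's actual tooling):

```python
# Schedulable vCPUs on a node, given its physical cores, hyper-threading
# factor, and the CPU allocation (overcommit) ratio.
def schedulable_vcpus(physical_cores, threads_per_core, cpu_ratio):
    return int(physical_cores * threads_per_core * cpu_ratio)

# A hyper-threaded 24-core node at the default 4:1 CPU overcommit.
print(schedulable_vcpus(24, 2, 4.0))   # prints 192
# The same node in a non-oversubscribed aggregate (cpu_ratio 1.0):
# one vCPU per CPU thread, or two vCPUs per physical core.
print(schedulable_vcpus(24, 2, 1.0))   # prints 48
```
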
With our upgrade to Grizzly in August 2013, we moved to OpenStack
|
||||||
|
Networking, neutron (quantum at the time). Compute nodes have
|
||||||
|
two-gigabit network interfaces and a separate management card for IPMI
|
||||||
|
management. One network interface is used for node-to-node
|
||||||
|
communications. The other is used as a trunk port for OpenStack managed
|
||||||
|
VLANs. The controller node uses two bonded 10g network interfaces for
|
||||||
|
its public IP communications. Big pipes are used here because images are
|
||||||
|
served over this port, and it is also used to connect to iSCSI storage,
|
||||||
|
back-ending the image storage and database. The controller node also has
|
||||||
|
a gigabit interface that is used in trunk mode for OpenStack managed
|
||||||
|
VLAN traffic. This port handles traffic to the dhcp-agent and
|
||||||
|
metadata-proxy.
|
||||||
|
|
||||||
|
We approximate the older ``nova-network`` multi-host HA setup by using
|
||||||
|
"provider VLAN networks" that connect instances directly to existing
|
||||||
|
publicly addressable networks and use existing physical routers as their
|
||||||
|
default gateway. This means that if our network controller goes down,
|
||||||
|
running instances still have their network available, and no single
|
||||||
|
Linux host becomes a traffic bottleneck. We are able to do this because
|
||||||
|
we have a sufficient supply of IPv4 addresses to cover all of our
|
||||||
|
instances and thus don't need NAT and don't use floating IP addresses.
|
||||||
|
We provide a single generic public network to all projects and
|
||||||
|
additional existing VLANs on a project-by-project basis as needed.
|
||||||
|
Individual projects are also allowed to create their own private GRE
|
||||||
|
based networks.
|
||||||
|
|
||||||
|
Resources
|
||||||
|
---------
|
||||||
|
|
||||||
|
- `CSAIL homepage <http://www.csail.mit.edu/>`_
|
||||||
|
|
||||||
|
DAIR
~~~~

Who uses it: DAIR is an integrated virtual environment that leverages
the CANARIE network to develop and test new information communication
technology (ICT) and other digital technologies. It combines such
digital infrastructure as advanced networking and cloud computing and
storage to create an environment for developing and testing innovative
ICT applications, protocols, and services; performing at-scale
experimentation for deployment; and facilitating a faster time to
market.

Deployment
----------

DAIR is hosted at two different data centers across Canada: one in
Alberta and the other in Quebec. It consists of a cloud controller at
each location, although one is designated the "master" controller that
is in charge of central authentication and quotas. This is done through
custom scripts and light modifications to OpenStack. DAIR is currently
running Havana.

For Object Storage, each region has a swift environment.

A NetApp appliance is used in each region for both block storage and
instance storage. There are future plans to move the instances off the
NetApp appliance and onto a distributed file system such as :term:`Ceph` or
GlusterFS.

VlanManager is used extensively for network management. All servers have
two bonded 10GbE NICs that are connected to two redundant switches. DAIR
is set up to use single-node networking where the cloud controller is
the gateway for all instances on all compute nodes. Internal OpenStack
traffic (for example, storage traffic) does not go through the cloud
controller.

Resources
---------

- `DAIR homepage <http://www.canarie.ca/cloud/>`__
CERN
~~~~

Who uses it: researchers at CERN (European Organization for Nuclear
Research) conducting high-energy physics research.

Deployment
----------

The environment is largely based on Scientific Linux 6, which is Red Hat
compatible. We use KVM as our primary hypervisor, although tests are
ongoing with Hyper-V on Windows Server 2008.

We use the Puppet Labs OpenStack modules to configure Compute, Image
service, Identity, and dashboard. Puppet is used widely for instance
configuration, and Foreman is used as a GUI for reporting and instance
provisioning.

Users and groups are managed through Active Directory and imported into
the Identity service using LDAP. CLIs are available for nova and
Euca2ools to do this.

There are three clouds currently running at CERN, totaling about 4,700
compute nodes, with approximately 120,000 cores. The CERN IT cloud aims
to expand to 300,000 cores by 2015.

Resources
---------

- `“OpenStack in Production: A tale of 3 OpenStack
  Clouds” <http://openstack-in-production.blogspot.de/2013/09/a-tale-of-3-openstack-clouds-50000.html>`_

- `“Review of CERN Data Centre
  Infrastructure” <http://cds.cern.ch/record/1457989/files/chep%202012%20CERN%20infrastructure%20final.pdf?version=1>`_

- `“CERN Cloud Infrastructure User
  Guide” <http://information-technology.web.cern.ch/book/cern-private-cloud-user-guide>`_

====================================================
Designing for Cloud Controllers and Cloud Management
====================================================

OpenStack is designed to be massively horizontally scalable, which
allows all services to be distributed widely. However, to simplify this
guide, we have decided to discuss services of a more central nature,
using the concept of a *cloud controller*. A cloud controller is just a
conceptual simplification. In the real world, you design an architecture
for your cloud controller that enables high availability so that if any
node fails, another can take over the required tasks. In reality, cloud
controller tasks are spread out across more than a single node.

The cloud controller provides the central management system for
OpenStack deployments. Typically, the cloud controller manages
authentication and sends messaging to all the systems through a message
queue.

For many deployments, the cloud controller is a single node. However, to
have high availability, you have to take a few considerations into
account, which we'll cover in this chapter.

The cloud controller manages the following services for the cloud:

Databases
    Tracks current information about users and instances, for example,
    in a database, typically one database instance managed per service

Message queue services
    All :term:`Advanced Message Queuing Protocol (AMQP)` messages for
    services are received and sent according to the queue broker

Conductor services
    Proxy requests to a database

Authentication and authorization for identity management
    Indicates which users can do what actions on certain cloud
    resources; quota management, however, is spread out among services

Image-management services
    Stores and serves images with metadata on each, for launching in the
    cloud

Scheduling services
    Indicates which resources to use first; for example, spreading out
    where instances are launched based on an algorithm

User dashboard
    Provides a web-based front end for users to consume OpenStack cloud
    services

API endpoints
    Offers each service's REST API access, where the API endpoint
    catalog is managed by the Identity service

For our example, the cloud controller has a collection of ``nova-*``
components that represent the global state of the cloud; talks to
services such as authentication; maintains information about the cloud
in a database; communicates to all compute nodes and storage
:term:`workers <worker>` through a queue; and provides API access.
Each service running on a designated cloud controller may be broken out
into separate nodes for scalability or availability.

As another example, you could use pairs of servers for a collective
cloud controller—one active, one standby—for redundant nodes providing a
given set of related services, such as:

- Front-end web for API requests, the scheduler for choosing which
  compute node to boot an instance on, Identity services, and the
  dashboard

- Database and message queue server (such as MySQL, RabbitMQ)

- Image service for the image management

Now that you see the myriad designs for controlling your cloud, read
more about the further considerations to help with your design
decisions.
Hardware Considerations
~~~~~~~~~~~~~~~~~~~~~~~

A cloud controller's hardware can be the same as a compute node, though
you may want to further specify based on the size and type of cloud that
you run.

It's also possible to use virtual machines for all or some of the
services that the cloud controller manages, such as the message queuing.
In this guide, we assume that all services are running directly on the
cloud controller.

The table below contains common considerations to review when sizing hardware
for the cloud controller design.

.. list-table:: Cloud controller hardware sizing considerations
   :widths: 50 50
   :header-rows: 1

   * - Consideration
     - Ramification
   * - How many instances will run at once?
     - Size your database server accordingly, and scale out beyond one cloud
       controller if many instances will report status at the same time and
       scheduling where a new instance starts up needs computing power.
   * - How many compute nodes will run at once?
     - Ensure that your messaging queue handles requests successfully and size
       accordingly.
   * - How many users will access the API?
     - If many users will make multiple requests, make sure that the CPU load
       for the cloud controller can handle it.
   * - How many users will access the dashboard versus the REST API directly?
     - The dashboard makes many requests, even more than the API access, so
       add even more CPU if your dashboard is the main interface for your users.
   * - How many ``nova-api`` services do you run at once for your cloud?
     - You need to size the controller with a core per service.
   * - How long does a single instance run?
     - Starting instances and deleting instances is demanding on the compute
       node but also demanding on the controller node because of all the API
       queries and scheduling needs.
   * - Does your authentication system also verify externally?
     - External systems such as LDAP or Active Directory require network
       connectivity between the cloud controller and an external authentication
       system. Also ensure that the cloud controller has the CPU power to keep
       up with requests.
Separation of Services
~~~~~~~~~~~~~~~~~~~~~~

While our example contains all central services in a single location, it
is possible and indeed often a good idea to separate services onto
different physical servers. The table below is a list of deployment
scenarios we've seen and their justifications.

.. list-table:: Deployment scenarios
   :widths: 50 50
   :header-rows: 1

   * - Scenario
     - Justification
   * - Run ``glance-*`` servers on the ``swift-proxy`` server.
     - This deployment felt that the spare I/O on the Object Storage proxy
       server was sufficient and that the Image Delivery portion of glance
       benefited from being on physical hardware and having good connectivity
       to the Object Storage back end it was using.
   * - Run a central dedicated database server.
     - This deployment used a central dedicated server to provide the databases
       for all services. This approach simplified operations by isolating
       database server updates and allowed for the simple creation of slave
       database servers for failover.
   * - Run one VM per service.
     - This deployment ran central services on a set of servers running KVM.
       A dedicated VM was created for each service (``nova-scheduler``,
       rabbitmq, database, etc). This assisted the deployment with scaling
       because administrators could tune the resources given to each virtual
       machine based on the load it received (something that was not well
       understood during installation).
   * - Use an external load balancer.
     - This deployment had an expensive hardware load balancer in its
       organization. It ran multiple ``nova-api`` and ``swift-proxy``
       servers on different physical servers and used the load balancer
       to switch between them.

One choice that always comes up is whether to virtualize. Some services,
such as ``nova-compute``, ``swift-proxy`` and ``swift-object`` servers,
should not be virtualized. However, control servers can often be happily
virtualized—the performance penalty can usually be offset by simply
running more of the service.
Database
~~~~~~~~

OpenStack Compute uses an SQL database to store and retrieve stateful
information. MySQL is the popular database choice in the OpenStack
community.

Loss of the database leads to errors. As a result, we recommend that you
cluster your database to make it failure tolerant. Configuring and
maintaining a database cluster is done outside OpenStack and is
determined by the database software you choose to use in your cloud
environment. MySQL/Galera is a popular option for MySQL-based databases.
Message Queue
~~~~~~~~~~~~~

Most OpenStack services communicate with each other using the *message
queue*. For example, Compute communicates to block storage services and
networking services through the message queue. Also, you can optionally
enable notifications for any service. RabbitMQ, Qpid, and ZeroMQ are all
popular choices for a message-queue service. In general, if the message
queue fails or becomes inaccessible, the cluster grinds to a halt and
ends up in a read-only state, with information stuck at the point where
the last message was sent. Accordingly, we recommend that you cluster
the message queue. Be aware that clustered message queues can be a pain
point for many OpenStack deployments. While RabbitMQ has native
clustering support, there have been reports of issues when running it at
a large scale. While other queuing solutions are available, such as
ZeroMQ and Qpid, ZeroMQ does not offer stateful queues. Qpid is the
messaging system of choice for Red Hat and its derivatives. Qpid does
not have native clustering capabilities and requires a supplemental
service, such as Pacemaker or Corosync. For your message queue, you need
to determine what level of data loss you are comfortable with and
whether to use an OpenStack project's ability to retry multiple MQ hosts
in the event of a failure, such as using Compute's ability to do so.
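
The "retry multiple MQ hosts" behavior mentioned above amounts to
walking a broker list until one answers. A minimal sketch, with a
hypothetical ``connect`` callable standing in for a real AMQP client:

```python
# Sketch of broker failover: try each message-queue host in turn and
# return the first live connection. The connect function and host names
# are illustrative stand-ins, not a real AMQP client API.

def connect_with_failover(hosts, connect):
    """Return the first successful connection; raise the last error if
    every host is down."""
    last_error = None
    for host in hosts:
        try:
            return connect(host)
        except ConnectionError as exc:
            last_error = exc
    raise last_error

# Example: the first broker is unreachable, the second answers.
def fake_connect(host):
    if host == "mq1.example.com":
        raise ConnectionError("mq1 unreachable")
    return f"connected to {host}"

print(connect_with_failover(["mq1.example.com", "mq2.example.com"],
                            fake_connect))
```

A real deployment would add backoff between attempts; the point is only
that clients keep a list of brokers rather than a single address.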

Conductor Services
~~~~~~~~~~~~~~~~~~

In the previous version of OpenStack, all ``nova-compute`` services
required direct access to the database hosted on the cloud controller.
This was problematic for two reasons: security and performance. With
regard to security, if a compute node is compromised, the attacker
inherently has access to the database. With regard to performance,
``nova-compute`` calls to the database are single-threaded and blocking.
This creates a performance bottleneck because database requests are
fulfilled serially rather than in parallel.

The conductor service resolves both of these issues by acting as a proxy
for the ``nova-compute`` service. Now, instead of ``nova-compute``
directly accessing the database, it contacts the ``nova-conductor``
service, and ``nova-conductor`` accesses the database on
``nova-compute``'s behalf. Since ``nova-compute`` no longer has direct
access to the database, the security issue is resolved. Additionally,
``nova-conductor`` is a nonblocking service, so requests from all
compute nodes are fulfilled in parallel.

.. note::

   If you are using ``nova-network`` and multi-host networking in your
   cloud environment, ``nova-compute`` still requires direct access to
   the database.

The ``nova-conductor`` service is horizontally scalable. To make
``nova-conductor`` highly available and fault tolerant, just launch more
instances of the ``nova-conductor`` process, either on the same server
or across multiple servers.
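
The proxy arrangement can be sketched in a few lines. The classes and
method names below are illustrative stand-ins, not nova's actual RPC
interfaces:

```python
# Minimal sketch of the conductor proxy pattern: compute nodes hold a
# handle to the conductor, never to the database, so only the conductor
# needs database credentials. Illustrative only.

class Database:
    def __init__(self):
        self._instances = {}

    def update(self, instance_id, values):
        self._instances.setdefault(instance_id, {}).update(values)
        return self._instances[instance_id]

class Conductor:
    """Runs on the controller; the only component with DB access."""
    def __init__(self, db):
        self._db = db

    def instance_update(self, instance_id, values):
        return self._db.update(instance_id, values)

class ComputeNode:
    """Talks to the conductor, not to the database."""
    def __init__(self, conductor):
        self._conductor = conductor

    def report_state(self, instance_id, state):
        return self._conductor.instance_update(instance_id,
                                               {"vm_state": state})

db = Database()
conductor = Conductor(db)
compute = ComputeNode(conductor)
print(compute.report_state("inst-1", "active"))
```

A compromised ``ComputeNode`` in this picture can only issue the calls
the conductor exposes, which is the security benefit described above.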

Application Programming Interface (API)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

All public access, whether direct, through a command-line client, or
through the web-based dashboard, uses the API service. Find the API
reference at http://api.openstack.org/.

You must choose whether you want to support the Amazon EC2 compatibility
APIs, or just the OpenStack APIs. One issue you might encounter when
running both APIs is an inconsistent experience when referring to images
and instances.

For example, the EC2 API refers to instances using IDs that contain
hexadecimal, whereas the OpenStack API uses names and digits. Similarly,
the EC2 API tends to rely on DNS aliases for contacting virtual
machines, as opposed to OpenStack, which typically lists IP
addresses.

If OpenStack is not set up in the right way, it is easy to create
scenarios in which users are unable to contact their instances because
they have only an incorrect DNS alias. Despite this, EC2 compatibility
can assist users migrating to your cloud.

As with databases and message queues, having more than one :term:`API server`
is a good thing. Traditional HTTP load-balancing techniques can be used to
achieve a highly available ``nova-api`` service.

Extensions
~~~~~~~~~~

The `API
Specifications <http://docs.openstack.org/api/api-specs.html>`_ define
the core actions, capabilities, and media types of the OpenStack API. A
client can always depend on the availability of this core API, and
implementers are always required to support it in its entirety.
Requiring strict adherence to the core API allows clients to rely upon a
minimal level of functionality when interacting with multiple
implementations of the same API.

The OpenStack Compute API is extensible. An extension adds capabilities
to an API beyond those defined in the core. The introduction of new
features, MIME types, actions, states, headers, parameters, and
resources can all be accomplished by means of extensions to the core
API. This allows the introduction of new features in the API without
requiring a version change and allows the introduction of
vendor-specific niche functionality.
Scheduling
~~~~~~~~~~

The scheduling services are responsible for determining the compute or
storage node where a virtual machine or block storage volume should be
created. The scheduling services receive creation requests for these
resources from the message queue and then begin the process of
determining the appropriate node where the resource should reside. This
process is done by applying a series of user-configurable filters
against the available collection of nodes.

There are currently two schedulers: ``nova-scheduler`` for virtual
machines and ``cinder-scheduler`` for block storage volumes. Both
schedulers are able to scale horizontally, so for high-availability
purposes, or for very large or high-schedule-frequency installations,
you should consider running multiple instances of each scheduler. The
schedulers all listen to the shared message queue, so no special load
balancing is required.
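
The filter-based process described above can be sketched briefly. The
filter functions here are simplified stand-ins; nova's real filters
follow the same shape:

```python
# Stripped-down illustration of filter scheduling: a request is applied
# against a series of filters, and any host surviving them all is a
# candidate. Filter and host names are illustrative.

def enough_ram(host, request):
    return host["free_ram_mb"] >= request["ram_mb"]

def enough_vcpus(host, request):
    return host["free_vcpus"] >= request["vcpus"]

def schedule(hosts, request, filters):
    """Return hosts that pass every filter, best candidates first."""
    candidates = [h for h in hosts if all(f(h, request) for f in filters)]
    # A real scheduler then weighs the candidates; sorting by free RAM
    # is a simple stand-in for that weighing step.
    return sorted(candidates, key=lambda h: h["free_ram_mb"], reverse=True)

hosts = [
    {"name": "compute1", "free_ram_mb": 2048, "free_vcpus": 4},
    {"name": "compute2", "free_ram_mb": 8192, "free_vcpus": 1},
    {"name": "compute3", "free_ram_mb": 4096, "free_vcpus": 8},
]
request = {"ram_mb": 1024, "vcpus": 2}
print([h["name"] for h in
       schedule(hosts, request, [enough_ram, enough_vcpus])])
```

Here ``compute2`` is filtered out for lacking free vCPUs, and the
remaining hosts are ordered by free RAM.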

Images
~~~~~~

The OpenStack Image service consists of two parts: ``glance-api`` and
``glance-registry``. The former is responsible for the delivery of
images; the compute node uses it to download images from the back end.
The latter maintains the metadata information associated with virtual
machine images and requires a database.

The ``glance-api`` part is an abstraction layer that allows a choice of
back end. Currently, it supports:

OpenStack Object Storage
    Allows you to store images as objects.

File system
    Uses any traditional file system to store the images as files.

S3
    Allows you to fetch images from Amazon S3.

HTTP
    Allows you to fetch images from a web server. You cannot write
    images by using this mode.

If you have an OpenStack Object Storage service, we recommend using this
as a scalable place to store your images. You can also use a file system
with sufficient performance or Amazon S3—unless you do not need the
ability to upload new images through OpenStack, in which case the
read-only HTTP back end is also an option.
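
The abstraction layer can be pictured as a small store interface with
one driver per back end. The classes below are illustrative only;
glance's real drivers are considerably more involved:

```python
# Toy version of a pluggable image-store abstraction in the spirit of
# glance-api's back ends. Class names and the make_store helper are
# ours, not glance's.

class FilesystemStore:
    """Read-write: keeps image blobs locally (a dict stands in for disk)."""
    def __init__(self):
        self._blobs = {}
    def get(self, image_id):
        return self._blobs[image_id]
    def put(self, image_id, data):
        self._blobs[image_id] = data

class HTTPStore:
    """Read-only: images can be fetched but not written, as noted above."""
    def __init__(self, remote):
        self._remote = remote
    def get(self, image_id):
        return self._remote[image_id]
    def put(self, image_id, data):
        raise NotImplementedError("HTTP back end is read-only")

def make_store(backend, **kwargs):
    stores = {"file": FilesystemStore, "http": HTTPStore}
    return stores[backend](**kwargs)

store = make_store("file")
store.put("img-1", b"disk image bytes")
print(store.get("img-1"))
```

Callers program against ``get``/``put`` and the configured back end is
chosen once, which is what lets deployments swap storage without
touching the API layer.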

Dashboard
~~~~~~~~~

The OpenStack dashboard (horizon) provides a web-based user interface to
the various OpenStack components. The dashboard includes an end-user
area for users to manage their virtual infrastructure and an admin area
for cloud operators to manage the OpenStack environment as a
whole.

The dashboard is implemented as a Python web application that normally
runs in :term:`Apache` ``httpd``. Therefore, you may treat it the same as any
other web application, provided it can reach the API servers (including
their admin endpoints) over the network.
Authentication and Authorization
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The concepts supporting OpenStack's authentication and authorization are
derived from well-understood and widely used systems of a similar
nature. Users have credentials they can use to authenticate, and they
can be a member of one or more groups (known as projects or tenants,
interchangeably).

For example, a cloud administrator might be able to list all instances
in the cloud, whereas a user can see only those in their current group.
Resource quotas, such as the number of cores that can be used, disk
space, and so on, are associated with a project.

OpenStack Identity provides authentication decisions and user attribute
information, which is then used by the other OpenStack services to
perform authorization. The policy is set in the ``policy.json`` file.
For information on how to configure these, see :doc:`ops_projects_users`.
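
Conceptually, ``policy.json`` maps each API action to a rule that is
evaluated against the caller's roles. A toy version of that lookup, with
illustrative rules rather than OpenStack's actual defaults:

```python
# Toy policy evaluation in the spirit of policy.json: each action maps
# to a rule string, checked against the caller's roles. The rules and
# the is_allowed helper are illustrative, not OpenStack's.

import json

policy = json.loads("""
{
    "compute:get_all": "",
    "compute:get_all_tenants": "role:admin"
}
""")

def is_allowed(action, roles):
    rule = policy[action]
    if rule == "":
        return True                     # empty rule: any authenticated user
    kind, _, value = rule.partition(":")
    return kind == "role" and value in roles

print(is_allowed("compute:get_all", ["member"]))          # any user may list
print(is_allowed("compute:get_all_tenants", ["member"]))  # admins only
print(is_allowed("compute:get_all_tenants", ["admin"]))
```

Real policy rules support conjunctions and attribute checks, but the
action-to-rule lookup is the core idea.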

OpenStack Identity supports different plug-ins for authentication
decisions and identity storage. Examples of these plug-ins include:

- In-memory key-value store (a simplified internal storage structure)

- SQL database (such as MySQL or PostgreSQL)

- Memcached (a distributed memory object caching system)

- LDAP (such as OpenLDAP or Microsoft's Active Directory)

Many deployments use the SQL database; however, LDAP is also a popular
choice for those with existing authentication infrastructure that needs
to be integrated.
Network Considerations
~~~~~~~~~~~~~~~~~~~~~~

Because the cloud controller handles so many different services, it must
be able to handle the amount of traffic that hits it. For example, if
you choose to host the OpenStack Image service on the cloud controller,
the cloud controller should be able to support the transferring of the
images at an acceptable speed.

As another example, if you choose to use single-host networking where
the cloud controller is the network gateway for all instances, then the
cloud controller must support the total amount of traffic that travels
between your cloud and the public Internet.

We recommend that you use a fast NIC, such as 10 Gb. You can also choose
to use two 10 Gb NICs and bond them together. While you might not be
able to get a full bonded 20 Gb speed, different transmission streams
use different NICs. For example, if the cloud controller transfers two
images, each image uses a different NIC and gets a full 10 Gb of
bandwidth.
=============
Compute Nodes
=============

In this chapter, we discuss some of the choices you need to consider
when building out your compute nodes. Compute nodes form the resource
core of the OpenStack Compute cloud, providing the processing, memory,
network, and storage resources to run instances.
Choosing a CPU
~~~~~~~~~~~~~~

The type of CPU in your compute node is a very important choice. First,
ensure that the CPU supports virtualization by way of *VT-x* for Intel
chips and *AMD-v* for AMD chips.

.. note::

   Consult the vendor documentation to check for virtualization
   support. For Intel, read `“Does my processor support Intel® Virtualization
   Technology?” <http://www.intel.com/support/processors/sb/cs-030729.htm>`_.
   For AMD, read `AMD Virtualization
   <http://www.amd.com/en-us/innovations/software-technologies/server-solution/virtualization>`_.
   Note that your CPU may support virtualization but it may be
   disabled. Consult your BIOS documentation for how to enable CPU
   features.

The number of cores that the CPU has also affects the decision. It's
common for current CPUs to have up to 12 cores. Additionally, if an
Intel CPU supports hyperthreading, those 12 cores are doubled to 24
cores. If you purchase a server that supports multiple CPUs, the number
of cores is further multiplied.
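
The multiplication at work here is simple but worth writing down:
sockets, cores per socket, and hyper-threading all compound. A quick
sketch (the helper is ours, purely for arithmetic):

```python
# Logical CPU count as seen by the hypervisor: sockets * cores per
# socket * threads per core. Illustrative helper, not an OpenStack API.

def logical_cpus(sockets, cores_per_socket, hyperthreading=True):
    threads_per_core = 2 if hyperthreading else 1
    return sockets * cores_per_socket * threads_per_core

print(logical_cpus(1, 12))   # a single 12-core CPU with hyper-threading
print(logical_cpus(2, 12))   # the same CPU in a dual-socket server
print(logical_cpus(2, 12, hyperthreading=False))
```

These logical CPUs are what the CPU allocation ratio is applied against
when the scheduler computes a host's capacity.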

**Multithread Considerations**

Hyper-Threading is Intel's proprietary simultaneous multithreading
implementation used to improve parallelization on their CPUs. You might
consider enabling Hyper-Threading to improve the performance of
multithreaded applications.

Whether you should enable Hyper-Threading on your CPUs depends upon your
use case. For example, disabling Hyper-Threading can be beneficial in
intense computing environments. We recommend that you do performance
testing with your local workload with both Hyper-Threading on and off to
determine what is more appropriate in your case.
Choosing a Hypervisor
~~~~~~~~~~~~~~~~~~~~~

A hypervisor provides software to manage virtual machine access to the
underlying hardware. The hypervisor creates, manages, and monitors
virtual machines. OpenStack Compute supports many hypervisors to various
degrees, including:

- `KVM <http://www.linux-kvm.org/page/Main_Page>`_

- `LXC <https://linuxcontainers.org/>`_

- `QEMU <http://wiki.qemu.org/Main_Page>`_

- `VMware
  ESX/ESXi <https://www.vmware.com/support/vsphere-hypervisor>`_

- `Xen <http://www.xenproject.org/>`_

- `Hyper-V <http://technet.microsoft.com/en-us/library/hh831531.aspx>`_

- `Docker <https://www.docker.com/>`_

Probably the most important factor in your choice of hypervisor is your
current usage or experience. Aside from that, there are practical
concerns to do with feature parity, documentation, and the level of
community experience.

For example, KVM is the most widely adopted hypervisor in the OpenStack
community. Besides KVM, more deployments run Xen, LXC, VMware, and
Hyper-V than the others listed. However, each of these is lacking some
feature support, or the documentation on how to use it with OpenStack
is out of date.

The best information available to support your choice is found on the
`Hypervisor Support Matrix
<http://docs.openstack.org/developer/nova/support-matrix.html>`_
and in the `configuration reference
<http://docs.openstack.org/liberty/config-reference/content/section_compute-hypervisors.html>`_.

.. note::

   It is also possible to run multiple hypervisors in a single
   deployment using host aggregates or cells. However, an individual
   compute node can run only a single hypervisor at a time.
Instance Storage Solutions
~~~~~~~~~~~~~~~~~~~~~~~~~~

As part of the procurement for a compute cluster, you must specify some
storage for the disk on which the instantiated instance runs. There are
three main approaches to providing this temporary-style storage, and it
is important to understand the implications of the choice.

They are:

- Off compute node storage—shared file system
- On compute node storage—shared file system
- On compute node storage—nonshared file system

In general, the questions you should ask when selecting storage are as
follows:

- What is the platter count you can achieve?
- Do more spindles result in better I/O despite network access?
- Which one results in the best cost-performance scenario you're aiming for?
- How do you manage the storage operationally?
Many operators use separate compute and storage hosts. Compute services
and storage services have different requirements, and compute hosts
typically require more CPU and RAM than storage hosts. Therefore, for a
fixed budget, it makes sense to have different configurations for your
compute nodes and your storage nodes: invest in CPU and RAM for the
compute nodes, and in block storage for the storage nodes.

However, if you are more restricted in the number of physical hosts you
have available for creating your cloud and you want to be able to
dedicate as many of your hosts as possible to running instances, it
makes sense to run compute and storage on the same machines.

We'll discuss the three main approaches to instance storage in the next
few sections.
Off Compute Node Storage—Shared File System
-------------------------------------------

In this option, the disks storing the running instances are hosted in
servers outside of the compute nodes.

If you use separate compute and storage hosts, you can treat your
compute hosts as "stateless." As long as you don't have any instances
currently running on a compute host, you can take it offline or wipe it
completely without having any effect on the rest of your cloud. This
simplifies maintenance for the compute hosts.

There are several advantages to this approach:

- If a compute node fails, instances are usually easily recoverable.
- Running a dedicated storage system can be operationally simpler.
- You can scale to any number of spindles.
- It may be possible to share the external storage for other purposes.

The main downsides to this approach are:

- Depending on design, heavy I/O usage from some instances can affect
  unrelated instances.
- Use of the network can decrease performance.
On Compute Node Storage—Shared File System
------------------------------------------

In this option, each compute node is specified with a significant amount
of disk space, but a distributed file system ties the disks from each
compute node into a single mount.

The main advantage of this option is that it scales to external storage
when you require additional storage.

However, this option has several downsides:

- Running a distributed file system means you lose data locality
  compared with nonshared storage.
- Recovery of instances is complicated by depending on multiple hosts.
- The chassis size of the compute node can limit the number of spindles
  able to be used in a compute node.
- Use of the network can decrease performance.
On Compute Node Storage—Nonshared File System
---------------------------------------------

In this option, each compute node is specified with enough disks to
store the instances it hosts.

There are two main reasons why this is a good idea:

- Heavy I/O usage on one compute node does not affect instances on
  other compute nodes.
- Direct I/O access can increase performance.

This approach has several downsides:

- If a compute node fails, the instances running on that node are lost.
- The chassis size of the compute node can limit the number of spindles
  able to be used in a compute node.
- Migrations of instances from one node to another are more complicated
  and rely on features that may not continue to be developed.
- If additional storage is required, this option does not scale.

Running a shared file system on a storage system apart from the compute
nodes is ideal for clouds where reliability and scalability are the most
important factors. Running a shared file system on the compute nodes
themselves may be best in a scenario where you have to deploy to
preexisting servers over whose specifications you have little to no
control. Running a nonshared file system on the compute nodes themselves
is a good option for clouds with high I/O requirements and low concern
for reliability.
Issues with Live Migration
--------------------------

Live migration is an integral part of operating a cloud. This feature
provides the ability to seamlessly move instances from one physical host
to another, which is a necessity for performing upgrades that require
reboots of the compute hosts. However, it works well only with shared
storage.

Live migration can also be done with nonshared storage, using a feature
known as *KVM live block migration*. While an earlier implementation of
block-based migration in KVM and QEMU was considered unreliable, there
is a newer, more reliable implementation of block-based live migration
as of QEMU 1.4 and libvirt 1.0.2 that is also compatible with OpenStack.
However, none of the authors of this guide have first-hand experience
using live block migration.
Choice of File System
---------------------

If you want to support shared-storage live migration, you need to
configure a distributed file system.

Possible options include:

- NFS (default for Linux)
- GlusterFS
- MooseFS
- Lustre

We've seen deployments with all of these, and recommend that you choose
the one you are most familiar with operating. If you are not familiar
with any of them, choose NFS, as it is the easiest to set up and there
is extensive community knowledge about it.
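If you do choose NFS, a minimal sketch of the shared layout is a single
export mounted at nova's default instance state path on every compute
node. The server name and export path below are hypothetical; adjust
them to your environment:

```
# /etc/fstab on each compute node (example values)
# nova's default instance path is /var/lib/nova/instances
nfs-server.example.com:/srv/nova  /var/lib/nova/instances  nfs4  defaults  0 0
```

With every compute node seeing the same mount, the instance disk files
remain visible to the destination host during a live migration.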
Overcommitting
~~~~~~~~~~~~~~

OpenStack allows you to overcommit CPU and RAM on compute nodes. This
allows you to increase the number of instances running on your cloud, at
the cost of reducing the performance of the instances. OpenStack Compute
uses the following ratios by default:

- CPU allocation ratio: 16:1
- RAM allocation ratio: 1.5:1

The default CPU allocation ratio of 16:1 means that the scheduler
allocates up to 16 virtual cores per physical core. For example, if a
physical node has 12 cores, the scheduler sees 192 available virtual
cores. With typical flavor definitions of 4 virtual cores per instance,
this ratio would provide 48 instances on a physical node.

The formula for the number of virtual instances on a compute node is
*(OR × PC) / VC*, where:

*OR*
    CPU overcommit ratio (virtual cores per physical core)

*PC*
    Number of physical cores

*VC*
    Number of virtual cores per instance

Similarly, the default RAM allocation ratio of 1.5:1 means that the
scheduler allocates instances to a physical node as long as the total
amount of RAM associated with the instances is less than 1.5 times the
amount of RAM available on the physical node.

For example, if a physical node has 48 GB of RAM, the scheduler
allocates instances to that node until the sum of the RAM associated
with the instances reaches 72 GB (such as nine instances, in the case
where each instance has 8 GB of RAM).
.. note::

   Regardless of the overcommit ratio, an instance cannot be placed
   on any physical node with fewer raw (pre-overcommit) resources than
   the instance flavor requires.

You must select the appropriate CPU and RAM allocation ratio for your
particular use case.
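The capacity arithmetic above can be sketched in a few lines of Python
(the function names are ours, for illustration only); a node's effective
capacity is the lower of the two results:

```python
def max_instances_by_cpu(physical_cores, cpu_ratio=16, vcpus_per_instance=4):
    """CPU capacity: (OR * PC) / VC, truncated to whole instances."""
    return (cpu_ratio * physical_cores) // vcpus_per_instance

def max_instances_by_ram(ram_gb, ram_ratio=1.5, ram_per_instance_gb=8):
    """RAM capacity with overcommit, truncated to whole instances."""
    return int(ram_ratio * ram_gb) // ram_per_instance_gb

# The examples from the text:
print(max_instances_by_cpu(12))   # 12 cores at 16:1, 4 vCPUs each -> 48
print(max_instances_by_ram(48))   # 48 GB at 1.5:1, 8 GB each -> 9
```

In practice the scheduler also honors the note above: an instance is
never placed on a node whose raw resources are smaller than its flavor.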
Logging
~~~~~~~

Logging is detailed more fully in :doc:`ops_logging_monitoring`.
However, it is an important design consideration to take into account
before commencing operations of your cloud.

OpenStack produces a great deal of useful logging information. For that
information to be useful for operations purposes, however, you should
consider having a central logging server to send logs to, and a log
parsing/analysis system (such as Logstash).
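As a minimal sketch of the central-logging idea, each node could forward
its syslog stream with a single rsyslog rule. The central host name is
hypothetical:

```
# /etc/rsyslog.d/60-central.conf — forward all facilities and severities
# to the central log server (use @@ instead of @ for TCP transport)
*.* @logs.example.com:514
```

A parser such as Logstash can then consume the aggregated stream on the
central host.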
Networking
~~~~~~~~~~

Networking in OpenStack is a complex, multifaceted challenge. See
:doc:`arch_network_design`.

Conclusion
~~~~~~~~~~

Compute nodes are the workhorses of your cloud and the place where your
users' applications will run. They are likely to be affected by your
decisions on what to deploy and how you deploy it. Their requirements
should be reflected in the choices you make.
===========================================
Example Architecture — OpenStack Networking
===========================================

This chapter provides an example architecture using OpenStack
Networking, also known as the Neutron project, in a highly available
environment.

Overview
~~~~~~~~

A highly available environment can be put into place if you require an
environment that can scale horizontally, or want your cloud to continue
to be operational in case of node failure. This example architecture has
been written based on the current default feature set of OpenStack
Havana, with an emphasis on high availability.
Components
----------

.. list-table::
   :widths: 50 50
   :header-rows: 1

   * - Component
     - Details
   * - OpenStack release
     - Havana
   * - Host operating system
     - Red Hat Enterprise Linux 6.5
   * - OpenStack package repository
     - `Red Hat Distributed OpenStack (RDO) <https://repos.fedorapeople.org/repos/openstack/>`_
   * - Hypervisor
     - KVM
   * - Database
     - MySQL
   * - Message queue
     - Qpid
   * - Networking service
     - OpenStack Networking
   * - Tenant network separation
     - VLAN
   * - Image service back end
     - GlusterFS
   * - Identity driver
     - SQL
   * - Block Storage back end
     - GlusterFS
Rationale
---------

This example architecture has been selected based on the current default
feature set of OpenStack Havana, with an emphasis on high availability.
This architecture is currently being deployed in an internal Red Hat
OpenStack cloud and used to run hosted and shared services, which by
their nature must be highly available.

This architecture's components have been selected for the following
reasons:

Red Hat Enterprise Linux
    You must choose an operating system that can run on all of the
    physical nodes. This example architecture is based on Red Hat
    Enterprise Linux, which offers reliability, long-term support,
    certified testing, and is hardened. Enterprise customers, now moving
    into OpenStack usage, typically require these advantages.

RDO
    The Red Hat Distributed OpenStack package offers an easy way to
    download the most current OpenStack release that is built for the
    Red Hat Enterprise Linux platform.

KVM
    KVM is the supported hypervisor of choice for Red Hat Enterprise
    Linux (and is included in the distribution). It is feature complete
    and free from licensing charges and restrictions.

MySQL
    MySQL is used as the database back end for all databases in the
    OpenStack environment. MySQL is the supported database of choice for
    Red Hat Enterprise Linux (and is included in the distribution); the
    database is open source, scalable, and handles memory well.

Qpid
    Apache Qpid offers 100 percent compatibility with the
    :term:`Advanced Message Queuing Protocol (AMQP)` standard, and its
    broker is available for both C++ and Java.

OpenStack Networking
    OpenStack Networking offers sophisticated networking functionality,
    including Layer 2 (L2) network segregation and provider networks.

VLAN
    Using a virtual local area network offers broadcast control,
    security, and physical layer transparency. If needed, use VXLAN to
    extend your address space.

GlusterFS
    GlusterFS offers scalable storage. As your environment grows, you
    can continue to add more storage nodes (instead of being restricted,
    for example, by an expensive storage array).
Detailed Description
~~~~~~~~~~~~~~~~~~~~

Node types
----------

This section gives you a breakdown of the different nodes that make up
the OpenStack environment. A node is a physical machine that is
provisioned with an operating system and runs a defined software stack
on top of it. The table below provides node descriptions and
specifications.

.. list-table:: Node types
   :widths: 33 33 33
   :header-rows: 1

   * - Type
     - Description
     - Example hardware
   * - Controller
     - Controller nodes are responsible for running the management
       software services needed for the OpenStack environment to
       function. These nodes:

       * Provide the front door that people access as well as the API
         services that all other components in the environment talk to.
       * Run a number of services in a highly available fashion,
         utilizing Pacemaker and HAProxy to provide a virtual IP and
         load-balancing functions so all controller nodes are being
         used.
       * Supply highly available "infrastructure" services, such as
         MySQL and Qpid, that underpin all the services.
       * Provide what is known as "persistent storage" through services
         run on the host as well. This persistent storage is backed onto
         the storage nodes for reliability.

       See :ref:`controller_node`.
     - Model: Dell R620

       CPU: 2x Intel® Xeon® CPU E5-2620 0 @ 2.00 GHz

       Memory: 32 GB

       Disk: two 300 GB 10000 RPM SAS disks

       Network: two 10G network ports
   * - Compute
     - Compute nodes run the virtual machine instances in OpenStack.
       They:

       * Run the bare minimum of services needed to facilitate these
         instances.
       * Use local storage on the node for the virtual machines, so
         that no VM migration or instance recovery at node failure is
         possible.

       See :ref:`compute_node`.
     - Model: Dell R620

       CPU: 2x Intel® Xeon® CPU E5-2650 0 @ 2.00 GHz

       Memory: 128 GB

       Disk: two 600 GB 10000 RPM SAS disks

       Network: four 10G network ports (for future-proofing expansion)
   * - Storage
     - Storage nodes store all the data required for the environment,
       including disk images in the Image service library and the
       persistent storage volumes created by the Block Storage service.
       Storage nodes use GlusterFS technology to keep the data highly
       available and scalable.

       See :ref:`storage_node`.
     - Model: Dell R720xd

       CPU: 2x Intel® Xeon® CPU E5-2620 0 @ 2.00 GHz

       Memory: 64 GB

       Disk: two 500 GB 7200 RPM SAS disks and twenty-four 600 GB
       10000 RPM SAS disks

       RAID controller: PERC H710P Integrated RAID Controller, 1 GB
       NV cache

       Network: two 10G network ports
   * - Network
     - Network nodes are responsible for doing all the virtual
       networking needed for people to create public or private
       networks and uplink their virtual machines into external
       networks. Network nodes:

       * Form the only ingress and egress point for instances running
         on top of OpenStack.
       * Run all of the environment's networking services, with the
         exception of the networking API service (which runs on the
         controller node).

       See :ref:`network_node`.
     - Model: Dell R620

       CPU: 1x Intel® Xeon® CPU E5-2620 0 @ 2.00 GHz

       Memory: 32 GB

       Disk: two 300 GB 10000 RPM SAS disks

       Network: five 10G network ports
   * - Utility
     - Utility nodes are used by internal administration staff only to
       provide a number of basic system administration functions needed
       to get the environment up and running and to maintain the
       hardware, OS, and software on which it runs.

       These nodes run services such as provisioning, configuration
       management, monitoring, or GlusterFS management software. They
       are not required to scale, although these machines are usually
       backed up.
     - Model: Dell R620

       CPU: 2x Intel® Xeon® CPU E5-2620 0 @ 2.00 GHz

       Memory: 32 GB

       Disk: two 500 GB 7200 RPM SAS disks

       Network: two 10G network ports
.. _networking_layout:

Networking layout
-----------------

The network contains all the management devices for all hardware in the
environment (for example, by including Dell iDRAC7 devices for the
hardware nodes, and management interfaces for network switches). The
network is accessed by internal staff only when diagnosing or recovering
a hardware issue.

OpenStack internal network
--------------------------

This network is used for OpenStack management functions and traffic,
including services needed for the provisioning of physical nodes
(``pxe``, ``tftp``, ``kickstart``), traffic between various OpenStack
node types using OpenStack APIs and messages (for example,
``nova-compute`` talking to ``keystone`` or ``cinder-volume`` talking to
``nova-api``), and all traffic for storage data to the storage layer
underneath by the Gluster protocol. All physical nodes have at least one
network interface (typically ``eth0``) in this network. This network is
only accessible from other VLANs on port 22 (for ``ssh`` access to
manage machines).
Public Network
--------------

This network is a combination of:

- IP addresses for public-facing interfaces on the controller nodes
  (through which end users will access the OpenStack services)

- A range of publicly routable, IPv4 network addresses to be used by
  OpenStack Networking for floating IPs. You may be restricted in your
  access to IPv4 addresses; a large range of IPv4 addresses is not
  necessary.

- Routers for private networks created within OpenStack.

This network is connected to the controller nodes so users can access
the OpenStack interfaces, and connected to the network nodes to provide
VMs with publicly routable traffic functionality. The network is also
connected to the utility machines so that any utility services that need
to be made public (such as system monitoring) can be accessed.
VM traffic network
------------------

This is a closed network that is not publicly routable and is simply
used as a private, internal network for traffic between virtual machines
in OpenStack, and between the virtual machines and the network nodes
that provide L3 routes out to the public network (and floating IPs for
connections back in to the VMs). Because this is a closed network, we
use a different address space from the others to clearly define the
separation. Only Compute and OpenStack Networking nodes need to be
connected to this network.
Node connectivity
~~~~~~~~~~~~~~~~~

The following section details how the nodes are connected to the
different networks (see :ref:`networking_layout`) and what other
considerations need to be taken into account (for example, bonding) when
connecting nodes to the networks.

Initial deployment
------------------

Initially, the connection setup should revolve around keeping the
connectivity simple and straightforward in order to minimize deployment
complexity and time to deploy. The deployment shown below aims to have
1 × 10G connectivity available to all compute nodes, while still
leveraging bonding on appropriate nodes for maximum performance.
.. figure:: figures/osog_0101.png
   :alt: Basic node deployment
   :width: 100%

   Basic node deployment
Connectivity for maximum performance
------------------------------------

If the networking performance of the basic layout is not enough, you can
move to the design below, which provides 2 × 10G network links to all
instances in the environment, as well as providing more network
bandwidth to the storage layer.

.. figure:: figures/osog_0102.png
   :alt: Performance node deployment
   :width: 100%

   Performance node deployment
Node diagrams
~~~~~~~~~~~~~

The following diagrams include logical information about the different
types of nodes, indicating what services will be running on top of them
and how they interact with each other. The diagrams also illustrate how
the availability and scalability of services are achieved.

.. _controller_node:

.. figure:: figures/osog_0103.png
   :alt: Controller node
   :width: 100%

   Controller node

.. _compute_node:

.. figure:: figures/osog_0104.png
   :alt: Compute node
   :width: 100%

   Compute node

.. _network_node:

.. figure:: figures/osog_0105.png
   :alt: Network node
   :width: 100%

   Network node

.. _storage_node:

.. figure:: figures/osog_0106.png
   :alt: Storage node
   :width: 100%

   Storage node
Example Component Configuration
-------------------------------

The following tables include example configuration and considerations
for both third-party and OpenStack components:

.. list-table:: Table: Third-party component configuration
   :widths: 25 25 25 25
   :header-rows: 1

   * - Component
     - Tuning
     - Availability
     - Scalability
   * - MySQL
     - ``binlog-format = row``
     - Master/master replication. However, both nodes are not used at
       the same time. Replication keeps all nodes as close to being up
       to date as possible (although the asynchronous nature of the
       replication means a fully consistent state is not possible).
       Connections to the database only happen through a Pacemaker
       virtual IP, ensuring that most problems that occur with
       master-master replication can be avoided.
     - Not heavily considered. Once load on the MySQL server increases
       enough that scalability needs to be considered, multiple masters
       or a master/slave setup can be used.
   * - Qpid
     - ``max-connections=1000`` ``worker-threads=20``
       ``connection-backlog=10``, SASL security enabled with SASL-BASIC
       authentication
     - Qpid is added as a resource to the Pacemaker software that runs
       on the controller nodes where Qpid is situated. This ensures
       that only one Qpid instance is running at one time, and the node
       with the Pacemaker virtual IP will always be the node running
       Qpid.
     - Not heavily considered. However, Qpid can be changed to run on
       all controller nodes for scalability and availability purposes,
       and removed from Pacemaker.
   * - HAProxy
     - ``maxconn 3000``
     - HAProxy is a software layer-7 load balancer used to front all
       clustered OpenStack API components and do SSL termination.
       HAProxy can be added as a resource to the Pacemaker software
       that runs on the controller nodes where HAProxy is situated.
       This ensures that only one HAProxy instance is running at one
       time, and the node with the Pacemaker virtual IP will always be
       the node running HAProxy.
     - Not considered. HAProxy has small enough performance overheads
       that a single instance should scale enough for this level of
       workload. If extra scalability is needed, ``keepalived`` or
       other Layer-4 load balancing can be introduced to be placed in
       front of multiple copies of HAProxy.
   * - Memcached
     - ``MAXCONN="8192" CACHESIZE="30457"``
     - Memcached is a fast in-memory key-value cache software that is
       used by OpenStack components for caching data and increasing
       performance. Memcached runs on all controller nodes, ensuring
       that should one go down, another instance of Memcached is
       available.
     - Not considered. A single instance of Memcached should be able to
       scale to the desired workloads. If scalability is desired,
       HAProxy can be placed in front of Memcached (in raw ``tcp``
       mode) to utilize multiple Memcached instances for scalability.
       However, this might cause cache consistency issues.
   * - Pacemaker
     - Configured to use ``corosync`` and ``cman`` as a cluster
       communication stack/quorum manager, and as a two-node cluster.
     - Pacemaker is the clustering software used to ensure the
       availability of services running on the controller and network
       nodes:

       * Because Pacemaker is cluster software, the software itself
         handles its own availability, leveraging ``corosync`` and
         ``cman`` underneath.
       * If you use the GlusterFS native client, no virtual IP is
         needed, since the client knows all about nodes after initial
         connection and automatically routes around failures on the
         client side.
       * If you use the NFS or SMB adaptor, you will need a virtual IP
         on which to mount the GlusterFS volumes.
     - If more nodes need to be made cluster aware, Pacemaker can scale
       to 64 nodes.
   * - GlusterFS
     - ``glusterfs`` performance profile "virt" enabled on all volumes.
       Volumes are set up in two-node replication.
     - GlusterFS is a clustered file system that is run on the storage
       nodes to provide persistent scalable data storage in the
       environment. Because all connections to gluster use the
       ``gluster`` native mount points, the ``gluster`` instances
       themselves provide availability and failover functionality.
     - The scalability of GlusterFS storage can be achieved by adding
       in more storage volumes.
.. list-table:: Table: OpenStack component configuration
   :widths: 20 20 20 20 20
   :header-rows: 1

   * - Component
     - Node type
     - Tuning
     - Availability
     - Scalability
   * - Dashboard (horizon)
     - Controller
     - Configured to use Memcached as a session store, ``neutron``
       support is enabled, ``can_set_mount_point = False``
     - The dashboard is run on all controller nodes, ensuring at least one
       instance will be available in case of node failure.
       It also sits behind HAProxy, which detects when the software fails
       and routes requests around the failing instance.
     - The dashboard is run on all controller nodes, so scalability can be
       achieved with additional controller nodes. HAProxy allows scalability
       for the dashboard as more nodes are added.
   * - Identity (keystone)
     - Controller
     - Configured to use Memcached for caching and PKI for tokens.
     - Identity is run on all controller nodes, ensuring at least one
       instance will be available in case of node failure.
       Identity also sits behind HAProxy, which detects when the software
       fails and routes requests around the failing instance.
     - Identity is run on all controller nodes, so scalability can be
       achieved with additional controller nodes.
       HAProxy allows scalability for Identity as more nodes are added.
   * - Image service (glance)
     - Controller
     - ``/var/lib/glance/images`` is a GlusterFS native mount to a Gluster
       volume off the storage layer.
     - The Image service is run on all controller nodes, ensuring at least
       one instance will be available in case of node failure.
       It also sits behind HAProxy, which detects when the software fails
       and routes requests around the failing instance.
     - The Image service is run on all controller nodes, so scalability
       can be achieved with additional controller nodes. HAProxy allows
       scalability for the Image service as more nodes are added.
   * - Compute (nova)
     - Controller, Compute
     - Configured to use Qpid with ``qpid_heartbeat = 10``, configured to
       use Memcached for caching, configured to use ``libvirt``, configured
       to use ``neutron``.

       Configured ``nova-consoleauth`` to use Memcached for session
       management (so that it can have multiple copies and run behind a
       load balancer).
     - The nova API, scheduler, objectstore, cert, consoleauth, conductor,
       and vncproxy services are run on all controller nodes, ensuring at
       least one instance will be available in case of node failure.
       Compute is also behind HAProxy, which detects when the software
       fails and routes requests around the failing instance.

       The nova-compute and nova-conductor services, which run on the
       compute nodes, are only needed to run services on that node, so
       availability of those services is coupled tightly to the nodes that
       are available. As long as a compute node is up, it will have the
       needed services running on top of it.
     - The nova API, scheduler, objectstore, cert, consoleauth, conductor,
       and vncproxy services are run on all controller nodes, so scalability
       can be achieved with additional controller nodes. HAProxy allows
       scalability for Compute as more nodes are added. The scalability
       of services running on the compute nodes (compute, conductor) is
       achieved linearly by adding in more compute nodes.
   * - Block Storage (cinder)
     - Controller
     - Configured to use Qpid with ``qpid_heartbeat = 10``, configured to
       use a Gluster volume from the storage layer as the back end for
       Block Storage, using the Gluster native client.
     - Block Storage API, scheduler, and volume services are run on all
       controller nodes, ensuring at least one instance will be available
       in case of node failure. Block Storage also sits behind HAProxy,
       which detects if the software fails and routes requests around the
       failing instance.
     - Block Storage API, scheduler, and volume services are run on all
       controller nodes, so scalability can be achieved with additional
       controller nodes. HAProxy allows scalability for Block Storage as
       more nodes are added.
   * - OpenStack Networking (neutron)
     - Controller, Compute, Network
     - Configured to use Qpid with ``qpid_heartbeat = 10``, kernel
       namespace support enabled, ``tenant_network_type = vlan``,
       ``allow_overlapping_ips = true``,
       ``bridge_uplinks = br-ex:em2``, ``bridge_mappings = physnet1:br-ex``
     - The OpenStack Networking service is run on all controller nodes,
       ensuring at least one instance will be available in case of node
       failure. It also sits behind HAProxy, which detects if the software
       fails and routes requests around the failing instance.
     - The OpenStack Networking server service is run on all controller
       nodes, so scalability can be achieved with additional controller
       nodes. HAProxy allows scalability for OpenStack Networking as more
       nodes are added. Scalability of services running on the network
       nodes is not currently supported by OpenStack Networking, so they
       are not considered. One copy of the services should be sufficient
       to handle the workload. Scalability of the ``ovs-agent`` running on
       compute nodes is achieved by adding in more compute nodes as
       necessary.
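
Several of the services above are described as sitting behind HAProxy
with health checks. As a rough illustration only (the listen name,
virtual IP, and controller addresses below are hypothetical, not part of
this reference deployment), an HAProxy front end for one such API
service might look like:

.. code-block:: none

   listen keystone_api
       bind 192.0.2.100:5000
       balance roundrobin
       option httpchk GET /
       server controller1 192.0.2.101:5000 check inter 2000 rise 2 fall 3
       server controller2 192.0.2.102:5000 check inter 2000 rise 2 fall 3

The ``check`` parameters make HAProxy probe each back end every two
seconds and route requests away from an instance after three failed
checks, which is the failure-detection behavior the table relies on.
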
@ -0,0 +1,259 @@
===============================================
Example Architecture — Legacy Networking (nova)
===============================================

This particular example architecture has been upgraded from :term:`Grizzly` to
:term:`Havana` and tested in production environments where many public IP
addresses are available for assignment to multiple instances. You can
find a second example architecture that uses OpenStack Networking
(neutron) after this section. Each example offers high availability,
meaning that if a particular node goes down, another node with the same
configuration can take over the tasks so that the services continue to
be available.

Overview
~~~~~~~~

The simplest architecture you can build upon for Compute has a single
cloud controller and multiple compute nodes. The simplest architecture
for Object Storage has five nodes: one for identifying users and
proxying requests to the API, then four for storage itself to provide
enough replication for eventual consistency. This example architecture
does not dictate a particular number of nodes, but shows the thinking
and considerations that went into choosing this architecture, including
the features offered.
Components
~~~~~~~~~~

.. list-table::
   :widths: 50 50
   :header-rows: 1

   * - Component
     - Details
   * - OpenStack release
     - Havana
   * - Host operating system
     - Ubuntu 12.04 LTS or Red Hat Enterprise Linux 6.5,
       including derivatives such as CentOS and Scientific Linux
   * - OpenStack package repository
     - `Ubuntu Cloud Archive <https://wiki.ubuntu.com/ServerTeam/CloudArchive>`_
       or `RDO <http://openstack.redhat.com/Frequently_Asked_Questions>`_
   * - Hypervisor
     - KVM
   * - Database
     - MySQL\*
   * - Message queue
     - RabbitMQ for Ubuntu; Qpid for Red Hat Enterprise Linux and derivatives
   * - Networking service
     - ``nova-network``
   * - Network manager
     - FlatDHCP
   * - Single ``nova-network`` or multi-host?
     - multi-host\*
   * - Image service (glance) back end
     - file
   * - Identity (keystone) driver
     - SQL
   * - Block Storage (cinder) back end
     - LVM/iSCSI
   * - Live Migration back end
     - Shared storage using NFS\*
   * - Object storage
     - OpenStack Object Storage (swift)
An asterisk (\*) indicates when the example architecture deviates from
the settings of a default installation. We'll offer explanations for
those deviations next.

.. note::

   The following features of OpenStack are supported by the example
   architecture documented in this guide, but are optional:

   - :term:`Dashboard`: You probably want to offer a dashboard, but your
     users may be more interested in API access only.

   - Block storage: You don't have to offer users block storage if
     their use case only needs ephemeral storage on compute nodes, for
     example.

   - Floating IP address: Floating IP addresses are public IP
     addresses that you allocate from a predefined pool to assign to
     virtual machines at launch. Floating IP addresses ensure that the
     public IP address is available whenever an instance is booted.
     Not every organization can offer thousands of public floating IP
     addresses for thousands of instances, so this feature is
     considered optional.

   - Live migration: If you need to move running virtual machine
     instances from one host to another with little or no service
     interruption, you would enable live migration, but it is
     considered optional.

   - Object storage: You may choose to store machine images on a file
     system rather than in object storage if you do not have the extra
     hardware for the required replication and redundancy that
     OpenStack Object Storage offers.
Rationale
~~~~~~~~~

This example architecture has been selected based on the current default
feature set of OpenStack Havana, with an emphasis on stability. We
believe that many clouds that currently run OpenStack in production have
made similar choices.

You must first choose the operating system that runs on all of the
physical nodes. While OpenStack is supported on several distributions of
Linux, we used *Ubuntu 12.04 LTS (Long Term Support)*, which is used by
the majority of the development community, has feature completeness
compared with other distributions, and has clear future support plans.

We recommend that you do not use the default Ubuntu OpenStack install
packages and instead use the `Ubuntu Cloud
Archive <https://wiki.ubuntu.com/ServerTeam/CloudArchive>`__. The Cloud
Archive is a package repository supported by Canonical that allows you
to upgrade to future OpenStack releases while remaining on Ubuntu 12.04.

*KVM* as a :term:`hypervisor` complements the choice of Ubuntu, being a
matched pair in terms of support, and also because of the significant degree
of attention it garners from the OpenStack development community (including
the authors, who mostly use KVM). It is also feature complete and free from
licensing charges and restrictions.

*MySQL* follows a similar trend. Despite its recent change of ownership,
this database is the most tested for use with OpenStack and is heavily
documented. We deviate from the default database, *SQLite*, because
SQLite is not an appropriate database for production usage.

The choice of *RabbitMQ* over other
:term:`AMQP <Advanced Message Queuing Protocol (AMQP)>`-compatible options
that are gaining support in OpenStack, such as ZeroMQ and Qpid, is due to its
ease of use and significant testing in production. It is also the only
option that supports features such as Compute cells. We recommend
clustering with RabbitMQ, as it is an integral component of the system
and fairly simple to implement thanks to its built-in clustering support.

As discussed in previous chapters, there are several options for
networking in OpenStack Compute. We recommend *FlatDHCP* and the use of
*Multi-Host* networking mode for high availability, running one
``nova-network`` daemon per OpenStack compute host. This provides a
robust mechanism for ensuring network interruptions are isolated to
individual compute hosts, and allows for the direct use of hardware
network gateways.

*Live Migration* is supported by way of shared storage, with *NFS* as
the distributed file system.

Acknowledging that many small-scale deployments see running Object
Storage just for the storage of virtual machine images as too costly, we
opted for the file back end in the OpenStack :term:`Image service` (glance).
If your cloud will include Object Storage, you can easily add it as a back
end.

We chose the *SQL back end for Identity* over others, such as LDAP. This
back end is simple to install and is robust. The authors acknowledge
that many installations want to bind with existing directory services
and caution careful understanding of the `array of options available
<http://docs.openstack.org/havana/config-reference/content/ch_configuring-openstack-identity.html#configuring-keystone-for-ldap-backend>`_.

Block Storage (cinder) is installed natively on external storage nodes
and uses the *LVM/iSCSI plug-in*. Most Block Storage plug-ins are tied
to particular vendor products and implementations, limiting their use to
consumers of those hardware platforms, but LVM/iSCSI is robust and
stable on commodity hardware.

While the cloud can be run without the *OpenStack Dashboard*, we
consider it to be indispensable, not just for user interaction with the
cloud, but also as a tool for operators. Additionally, the dashboard's
use of Django makes it a flexible framework for extension.
Why not use OpenStack Networking?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This example architecture does not use OpenStack Networking, because it
does not yet support multi-host networking and our organizations
(university, government) have access to a large range of
publicly-accessible IPv4 addresses.

Why use multi-host networking?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In a default OpenStack deployment, there is a single ``nova-network``
service that runs within the cloud (usually on the cloud controller)
and provides services such as
:term:`network address translation <NAT>` (NAT), :term:`DHCP`,
and :term:`DNS` to the guest instances. If the single node that runs the
``nova-network`` service goes down, you cannot access your instances,
and the instances cannot access the Internet. The single node that runs
the ``nova-network`` service can become a bottleneck if excessive
network traffic comes in and goes out of the cloud.

.. note::

   `Multi-host <http://docs.openstack.org/havana/install-guide/install/apt/content/nova-network.html>`_
   is a high-availability option for the network configuration, where
   the ``nova-network`` service is run on every compute node instead of
   running on only a single node.
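
To give a flavor of what multi-host mode involves on each compute node,
the relevant ``nova.conf`` options look roughly like the following
sketch. This is illustrative only: the interface and bridge names are
placeholders, and you should confirm option names against the
configuration reference for your release.

.. code-block:: ini

   [DEFAULT]
   # Run a nova-network service on this (and every) compute node
   multi_host = True
   network_manager = nova.network.manager.FlatDHCPManager
   # Placeholder interface/bridge names; adjust for your hardware
   public_interface = eth0
   flat_interface = eth1
   flat_network_bridge = br100

With this in place, each compute node handles NAT, DHCP, and DNS for its
own instances, so a single node failure no longer takes down networking
for the whole cloud.
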
Detailed Description
--------------------

The reference architecture consists of multiple compute nodes, a cloud
controller, an external NFS storage server for instance storage, and an
OpenStack Block Storage server for volume storage. A network time
service (:term:`Network Time Protocol <NTP>`, or NTP) synchronizes time
on all the nodes. FlatDHCPManager in multi-host mode is used for the
networking. A logical diagram for this example architecture shows which
services are running on each node:

.. image:: figures/osog_01in01.png
   :width: 100%

The cloud controller runs the dashboard, the API services, the database
(MySQL), a message queue server (RabbitMQ), the scheduler for choosing
compute resources (``nova-scheduler``), Identity services (keystone,
``nova-consoleauth``), Image services (``glance-api``,
``glance-registry``), services for console access of guests, and Block
Storage services, including the scheduler for storage resources
(``cinder-api`` and ``cinder-scheduler``).

Compute nodes are where the computing resources are held, and in our
example architecture, they run the hypervisor (KVM), libvirt (the driver
for the hypervisor, which enables live migration from node to node),
``nova-compute``, ``nova-api-metadata`` (generally only used when
running in multi-host mode, it retrieves instance-specific metadata),
``nova-vncproxy``, and ``nova-network``.

The network consists of two switches, one for the management or private
traffic, and one that covers public access, including floating IPs. To
support this, the cloud controller and the compute nodes have two
network cards. The OpenStack Block Storage and NFS storage servers only
need to access the private network and therefore only need one network
card, but multiple cards run in a bonded configuration are recommended
if possible. Floating IP access is direct to the Internet, whereas flat
IP access goes through a NAT. To envision the network traffic, use this
diagram:

.. image:: figures/osog_01in02.png
   :width: 100%
Optional Extensions
-------------------

You can extend this reference architecture as follows:

- Add additional cloud controllers (see :doc:`ops_maintenance`).

- Add an OpenStack Storage service (see the Object Storage chapter in
  the *OpenStack Installation Guide* for your distribution).

- Add additional OpenStack Block Storage hosts (see
  :doc:`ops_maintenance`).

@ -0,0 +1,11 @@
=========================================
Parting Thoughts on Architecture Examples
=========================================

With so many considerations and options available, our hope is to
provide a few clearly-marked and tested paths for your OpenStack
exploration. If you're looking for additional ideas, check out
:doc:`app_usecases`, the
`OpenStack Installation Guides <http://docs.openstack.org/#install-guides>`_, or the
`OpenStack User Stories
page <http://www.openstack.org/user-stories/>`_.

@ -0,0 +1,30 @@
=====================
Architecture Examples
=====================

To understand the possibilities that OpenStack offers, it's best to
start with basic architecture that has been tested in production
environments. We offer two examples with basic pivots on the base
operating system (Ubuntu and Red Hat Enterprise Linux) and the
networking architecture. There are other differences between these two
examples, and this guide provides reasons for each choice made.

Because OpenStack is highly configurable, with many different back ends
and network configuration options, it is difficult to write
documentation that covers all possible OpenStack deployments. Therefore,
this guide defines examples of architecture to simplify the task of
documenting, as well as to provide the scope for this guide. Both of the
offered architecture examples are currently running in production and
serving users.

.. note::

   As always, refer to the :doc:`common/glossary` if you are unclear
   about any of the terminology mentioned in architecture examples.

.. toctree::
   :maxdepth: 2

   arch_example_nova_network.rst
   arch_example_neutron.rst
   arch_example_thoughts.rst

@ -0,0 +1,290 @@
==============
Network Design
==============

OpenStack provides a rich networking environment, and this chapter
details the requirements and options to consider when designing your
cloud.

.. warning::

   If this is the first time you are deploying a cloud infrastructure
   in your organization, after reading this section, your first
   conversations should be with your networking team. Network usage in
   a running cloud is vastly different from traditional network
   deployments and has the potential to be disruptive at both a
   connectivity and a policy level.

For example, you must plan the number of IP addresses that you need for
both your guest instances and your management infrastructure.
Additionally, you must research and discuss cloud network connectivity
through proxy servers and firewalls.

In this chapter, we'll give some examples of network implementations to
consider and provide information about some of the network layouts that
OpenStack uses. Finally, we have some brief notes on the networking
services that are essential for stable operation.

Management Network
~~~~~~~~~~~~~~~~~~

A :term:`management network` (a separate network for use by your cloud
operators) typically consists of a separate switch and separate NICs
(network interface cards), and is a recommended option. This segregation
prevents system administration and the monitoring of system access from
being disrupted by traffic generated by guests.

Consider creating other private networks for communication between
internal components of OpenStack, such as the message queue and
OpenStack Compute. Using a virtual local area network (VLAN) works well
for these scenarios because it provides a method for creating multiple
virtual networks on a physical network.

Public Addressing Options
~~~~~~~~~~~~~~~~~~~~~~~~~

There are two main types of IP addresses for guest virtual machines:
fixed IPs and floating IPs. Fixed IPs are assigned to instances on boot,
whereas floating IP addresses can change their association between
instances by action of the user. Both types of IP addresses can be
either public or private, depending on your use case.

Fixed IP addresses are required, whereas it is possible to run OpenStack
without floating IPs. One of the most common use cases for floating IPs
is to provide public IP addresses to a private cloud, where there are a
limited number of IP addresses available. Another is for a public cloud
user to have a "static" IP address that can be reassigned when an
instance is upgraded or moved.

Fixed IP addresses can be private for private clouds, or public for
public clouds. When an instance terminates, its fixed IP is lost. It is
worth noting that newer users of cloud computing may find the
ephemeral nature of fixed IP addresses frustrating.
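
As a brief illustration of the floating IP workflow with the ``nova``
client of this era (the pool name, instance name, and address below are
examples only, and commands may differ in later releases):

.. code-block:: console

   $ nova floating-ip-create public
   $ nova add-floating-ip test-instance 203.0.113.10

The first command allocates an address from the ``public`` pool to your
project; the second associates it with a running instance, after which
the address can later be disassociated and reused.
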
IP Address Planning
~~~~~~~~~~~~~~~~~~~

An OpenStack installation can potentially have many subnets (ranges of
IP addresses) and different types of services in each. An IP address
plan can assist with a shared understanding of network partition
purposes and scalability. Control services can have public and private
IP addresses, and as noted above, there are a couple of options for an
instance's public addresses.

An IP address plan might be broken down into the following sections:

Subnet router
   Packets leaving the subnet go via this address, which could be a
   dedicated router or a ``nova-network`` service.

Control services public interfaces
   Public access to ``swift-proxy``, ``nova-api``, ``glance-api``, and
   horizon comes to these addresses, which could be on one side of a
   load balancer or pointing at individual machines.

Object Storage cluster internal communications
   Traffic among object/account/container servers and between these and
   the proxy server's internal interface uses this private network.

Compute and storage communications
   If ephemeral or block storage is external to the compute node, this
   network is used.

Out-of-band remote management
   If a dedicated remote access controller chip is included in servers,
   often these are on a separate network.

In-band remote management
   Often, an extra (such as 1 GB) interface on compute or storage nodes
   is used for system administrators or monitoring tools to access the
   host instead of going through the public interface.

Spare space for future growth
   Adding more public-facing control services or guest instance IPs
   should always be part of your plan.

For example, take a deployment that has both OpenStack Compute and
Object Storage, with private ranges 172.22.42.0/24 and 172.22.87.0/26
available. One way to segregate the space might be as follows:

::

   172.22.42.0/24:
   172.22.42.1 - 172.22.42.3 - subnet routers
   172.22.42.4 - 172.22.42.20 - spare for networks
   172.22.42.21 - 172.22.42.104 - Compute node remote access controllers (inc spare)
   172.22.42.105 - 172.22.42.188 - Compute node management interfaces (inc spare)
   172.22.42.189 - 172.22.42.208 - Swift proxy remote access controllers (inc spare)
   172.22.42.209 - 172.22.42.228 - Swift proxy management interfaces (inc spare)
   172.22.42.229 - 172.22.42.252 - Swift storage servers remote access controllers (inc spare)
   172.22.42.253 - 172.22.42.254 - spare
   172.22.87.0/26:
   172.22.87.1 - 172.22.87.3 - subnet routers
   172.22.87.4 - 172.22.87.24 - Swift proxy server internal interfaces (inc spare)
   172.22.87.25 - 172.22.87.63 - Swift object server internal interfaces (inc spare)

A similar approach can be taken with public IP addresses, taking note
that large, flat ranges are preferred for use with guest instance IPs.
Take into account that for some OpenStack networking options, a public
IP address in the range of a guest instance public IP address is
assigned to the ``nova-compute`` host.
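
A plan like this is easy to sanity-check programmatically. The short
sketch below (a hypothetical helper, not part of OpenStack) uses
Python's standard ``ipaddress`` module to verify that each planned range
fits inside its parent subnet and that no two ranges overlap:

.. code-block:: python

   import ipaddress

   subnet = ipaddress.ip_network("172.22.42.0/24")
   # A subset of the plan above: name -> (first address, last address)
   plan = {
       "subnet routers": ("172.22.42.1", "172.22.42.3"),
       "spare for networks": ("172.22.42.4", "172.22.42.20"),
       "compute remote access controllers": ("172.22.42.21", "172.22.42.104"),
       "compute management interfaces": ("172.22.42.105", "172.22.42.188"),
   }

   def check_plan(subnet, plan):
       """Return True if every range fits the subnet and no ranges overlap."""
       spans = []
       for name, (first, last) in plan.items():
           lo = ipaddress.ip_address(first)
           hi = ipaddress.ip_address(last)
           if lo > hi or lo not in subnet or hi not in subnet:
               return False
           spans.append((int(lo), int(hi)))
       spans.sort()
       # adjacent sorted spans overlap if one starts before the previous ends
       return all(prev[1] < cur[0] for prev, cur in zip(spans, spans[1:]))

   print(check_plan(subnet, plan))  # True

Running a check like this whenever the plan changes catches the most
common spreadsheet mistakes (typos that push a range outside the subnet,
or two ranges silently overlapping) before they reach production.
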

Network Topology
~~~~~~~~~~~~~~~~

OpenStack Compute with ``nova-network`` provides predefined network
deployment models, each with its own strengths and weaknesses. The
selection of a network manager changes your network topology, so the
choice should be made carefully. You also have a choice between the
tried-and-true legacy ``nova-network`` settings or the neutron project
for OpenStack Networking. Both offer networking for launched instances
with different implementations and requirements.

For OpenStack Networking with the neutron project, typical
configurations are documented with the idea that any setup you can
configure with real hardware you can re-create with a software-defined
equivalent. Each tenant can contain typical network elements such as
routers, and services such as :term:`DHCP`.

The following table describes the networking deployment options for both
legacy ``nova-network`` options and an equivalent neutron
configuration.

.. list-table:: Networking deployment options
   :widths: 25 25 25 25
   :header-rows: 1

   * - Network deployment model
     - Strengths
     - Weaknesses
     - Neutron equivalent
   * - Flat
     - Extremely simple topology. No DHCP overhead.
     - Requires file injection into the instance to configure network
       interfaces.
     - Configure a single bridge as the integration bridge (br-int) and
       connect it to a physical network interface with the Modular Layer 2
       (ML2) plug-in, which uses Open vSwitch by default.
   * - FlatDHCP
     - Relatively simple to deploy. Standard networking. Works with all guest
       operating systems.
     - Requires its own DHCP broadcast domain.
     - Configure DHCP agents and routing agents. Network Address Translation
       (NAT) performed outside of compute nodes, typically on one or more
       network nodes.
   * - VlanManager
     - Each tenant is isolated to its own VLANs.
     - More complex to set up. Requires its own DHCP broadcast domain.
       Requires many VLANs to be trunked onto a single port. Standard VLAN
       number limitation. Switches must support 802.1q VLAN tagging.
     - Isolated tenant networks implement some form of isolation of layer-2
       traffic between distinct networks. VLAN tagging is a key concept, where
       traffic is "tagged" with an ordinal identifier for the VLAN. Isolated
       network implementations may or may not include additional services like
       DHCP, NAT, and routing.
   * - FlatDHCP Multi-host with high availability (HA)
     - Networking failure is isolated to the VMs running on the affected
       hypervisor. DHCP traffic can be isolated within an individual host.
       Network traffic is distributed to the compute nodes.
     - More complex to set up. Compute nodes typically need IP addresses
       accessible by external networks. Options must be carefully configured
       for live migration to work with networking services.
     - Configure neutron with multiple DHCP and layer-3 agents. Network nodes
       are not able to fail over to each other, so the controller runs
       networking services, such as DHCP. Compute nodes run the ML2 plug-in
       with support for agents such as Open vSwitch or Linux Bridge.

Both ``nova-network`` and neutron services provide similar capabilities,
such as VLANs between VMs. You can also provide multiple NICs on VMs
with either service. Further discussion follows.

VLAN Configuration Within OpenStack VMs
---------------------------------------

VLAN configuration can be as simple or as complicated as desired. The
use of VLANs has the benefit of allowing each project its own subnet and
broadcast segregation from other projects. To allow OpenStack to
efficiently use VLANs, you must allocate a VLAN range (one for each
project) and turn each compute node switch port into a trunk
port.

For example, if you estimate that your cloud must support a maximum of
100 projects, pick a free VLAN range that your network infrastructure is
currently not using (such as VLAN 200–299). You must configure OpenStack
with this range and also configure your switch ports to allow VLAN
traffic from that range.
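
The planning above can be checked with a few lines of code. This is an
illustrative sketch only; the project names and the 200–299 range are
assumptions carried over from the example, not output of any OpenStack tool:

```python
# Reserve one VLAN per project from a free range. With VLANs 200-299
# available, the cloud can host at most 100 isolated projects.
vlan_range = range(200, 300)
projects = ["project-%02d" % i for i in range(100)]

# Map each project to its dedicated VLAN ID.
vlan_for_project = dict(zip(projects, vlan_range))

print(len(vlan_for_project))           # number of projects that fit
print(vlan_for_project["project-00"])  # first allocated VLAN ID
```

If you expect more projects than the range can hold, pick a larger free
range before deployment; renumbering VLANs later is disruptive.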

Multi-NIC Provisioning
----------------------

OpenStack Networking with ``neutron`` and OpenStack Compute with
``nova-network`` have the ability to assign multiple NICs to instances. For
``nova-network`` this can be done on a per-request basis, with each
additional NIC using up an entire subnet or VLAN, reducing the total
number of supported projects.

Multi-Host and Single-Host Networking
-------------------------------------

The ``nova-network`` service has the ability to operate in a multi-host
or single-host mode. Multi-host is when each compute node runs a copy of
``nova-network`` and the instances on that compute node use the compute
node as a gateway to the Internet. The compute nodes also host the
floating IPs and security groups for instances on that node. Single-host
is when a central server—for example, the cloud controller—runs the
``nova-network`` service. All compute nodes forward traffic from the
instances to the cloud controller. The cloud controller then forwards
traffic to the Internet. The cloud controller hosts the floating IPs and
security groups for all instances on all compute nodes in the
cloud.

There are benefits to both modes. Single-host has the downside of a
single point of failure. If the cloud controller is not available,
instances cannot communicate on the network. This is not true with
multi-host, but multi-host requires that each compute node has a public
IP address to communicate on the Internet. If you are not able to obtain
a significant block of public IP addresses, multi-host might not be an
option.

Services for Networking
~~~~~~~~~~~~~~~~~~~~~~~

OpenStack, like any network application, has a number of standard
considerations to apply, such as NTP and DNS.

NTP
---

Time synchronization is a critical element to ensure continued operation
of OpenStack components. Correct time is necessary to avoid errors in
instance scheduling, replication of objects in the object store, and
even matching log timestamps for debugging.

All servers running OpenStack components should be able to access an
appropriate NTP server. You may decide to set up one locally or use the
public pools available from the `Network Time Protocol
project <http://www.pool.ntp.org/en/>`_.

DNS
---

OpenStack does not currently provide DNS services, aside from the
dnsmasq daemon, which resides on ``nova-network`` hosts. You could
consider providing a dynamic DNS service to allow instances to update a
DNS entry with new IP addresses. You can also consider making a generic
forward and reverse DNS mapping for instances' IP addresses, such as
vm-203-0-113-123.example.com.
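
The generic naming scheme above is simple to generate mechanically. The
helper function and the ``example.com`` domain below are illustrative
assumptions, not part of any OpenStack API:

```python
def instance_hostname(ip, domain="example.com"):
    """Build a forward-DNS name such as vm-203-0-113-123.example.com
    by substituting dashes for the dots in the instance IP address."""
    return "vm-" + ip.replace(".", "-") + "." + domain

print(instance_hostname("203.0.113.123"))  # vm-203-0-113-123.example.com
```

Because the mapping is reversible, the same convention works for both
forward and reverse zones.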

Conclusion
~~~~~~~~~~

Armed with your IP address layout and numbers and knowledge about the
topologies and services you can use, it's now time to prepare the
network for your installation. Be sure to also check out the `OpenStack
Security Guide <http://docs.openstack.org/sec/>`_ for tips on securing
your network. We wish you a good relationship with your networking team!

===========================
Provisioning and Deployment
===========================

A critical part of a cloud's scalability is the amount of effort that it
takes to run your cloud. To minimize the operational cost of running
your cloud, set up and use an automated deployment and configuration
infrastructure with a configuration management system, such as :term:`Puppet`
or :term:`Chef`. Combined, these systems greatly reduce manual effort and the
chance for operator error.

This infrastructure includes systems to automatically install the
operating system's initial configuration and later coordinate the
configuration of all services automatically and centrally, which reduces
both manual effort and the chance for error. Examples include Ansible,
CFEngine, Chef, Puppet, and Salt. You can even use OpenStack to deploy
OpenStack, named TripleO (OpenStack On OpenStack).

Automated Deployment
~~~~~~~~~~~~~~~~~~~~

An automated deployment system installs and configures operating systems
on new servers, without intervention, after the absolute minimum amount
of manual work, including physical racking, MAC-to-IP assignment, and
power configuration. Typically, solutions rely on wrappers around PXE
boot and TFTP servers for the basic operating system install and then
hand off to an automated configuration management system.

Both Ubuntu and Red Hat Enterprise Linux include mechanisms for
configuring the operating system, including preseed and kickstart, that
you can use after a network boot. Typically, these are used to bootstrap
an automated configuration system. Alternatively, you can use an
image-based approach for deploying the operating system, such as
systemimager. You can use both approaches with a virtualized
infrastructure, such as when you run VMs to separate your control
services and physical infrastructure.

When you create a deployment plan, focus on a few vital areas because
they are very hard to modify post deployment. The next two sections talk
about configurations for:

- Disk partitioning and disk array setup for scalability

- Networking configuration just for PXE booting

Disk Partitioning and RAID
--------------------------

At the very base of any operating system are the hard drives on which
the operating system (OS) is installed.

You must complete the following configurations on the server's hard
drives:

- Partitioning, which provides greater flexibility for layout of
  operating system and swap space, as described below.

- Adding to a RAID array (RAID stands for redundant array of
  independent disks), based on the number of disks you have available,
  so that you can add capacity as your cloud grows. Some options are
  described in more detail below.

The simplest option to get started is to use one hard drive with two
partitions:

- File system to store files and directories, where all the data lives,
  including the root partition that starts and runs the system.

- Swap space to free up memory for processes, as an independent area of
  the physical disk used only for swapping and nothing else.

RAID is not used in this simplistic one-drive setup because generally
for production clouds, you want to ensure that if one disk fails,
another can take its place. Instead, for production, use more than one
disk. The number of disks determines what types of RAID arrays to build.

We recommend that you choose one of the following multiple disk options:

Option 1
   Partition all drives in the same way in a horizontal fashion, as
   shown in :ref:`partition_setup`.

   With this option, you can assign different partitions to different
   RAID arrays. You can allocate partition 1 of disk one and two to the
   ``/boot`` partition mirror. You can make partition 2 of all disks
   the root partition mirror. You can use partition 3 of all disks for
   a ``cinder-volumes`` LVM partition running on a RAID 10 array.

   .. _partition_setup:

   .. figure:: figures/osog_0201.png

      Partition setup of drives

   While you might end up with unused partitions, such as partition 1
   in disk three and four of this example, this option allows for
   maximum utilization of disk space. I/O performance might be an issue
   as a result of all disks being used for all tasks.

Option 2
   Add all raw disks to one large RAID array, either hardware or
   software based. You can partition this large array with the boot,
   root, swap, and LVM areas. This option is simple to implement and
   uses all partitions. However, disk I/O might suffer.

Option 3
   Dedicate entire disks to certain partitions. For example, you could
   allocate disk one and two entirely to the boot, root, and swap
   partitions under a RAID 1 mirror. Then, allocate disk three and four
   entirely to the LVM partition, also under a RAID 1 mirror. Disk I/O
   should be better because I/O is focused on dedicated tasks. However,
   the LVM partition is much smaller.
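
The capacity trade-off between the options can be estimated with simple
arithmetic. This sketch assumes four 1 TB disks; the disk count and sizes
are hypothetical, not taken from the examples above:

```python
disks = 4
disk_tb = 1.0

# RAID 1 (mirror): usable capacity equals a single member disk.
raid1_usable_tb = disk_tb

# RAID 10 (striped mirrors): usable capacity is half the raw total.
raid10_usable_tb = disks * disk_tb / 2

# RAID 5: one disk's worth of capacity is consumed by parity.
raid5_usable_tb = (disks - 1) * disk_tb

print(raid1_usable_tb, raid10_usable_tb, raid5_usable_tb)  # 1.0 2.0 3.0
```

Running similar numbers for your own disk counts makes it easier to
compare the options before committing to a layout.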

.. note::

   You may find that you can automate the partitioning itself. For
   example, MIT uses `Fully Automatic Installation
   (FAI) <http://fai-project.org/>`_ to do the initial PXE-based
   partition and then install using a combination of min/max and
   percentage-based partitioning.

As with most architecture choices, the right answer depends on your
environment. If you are using existing hardware, you know the disk
density of your servers and can determine some decisions based on the
options above. If you are going through a procurement process, your
users' requirements also help you determine hardware purchases. Here are
some examples from a private cloud providing web developers custom
environments at AT&T. This example is from a specific deployment, so
your existing hardware or procurement opportunity may vary from this.
AT&T uses three types of hardware in its deployment:

- Hardware for controller nodes, used for all stateless OpenStack API
  services. About 32–64 GB memory, small attached disk, one processor,
  varied number of cores, such as 6–12.

- Hardware for compute nodes. Typically 256 or 144 GB memory, two
  processors, 24 cores. 4–6 TB direct attached storage, typically in a
  RAID 5 configuration.

- Hardware for storage nodes. Typically for these, the disk space is
  optimized for the lowest cost per GB of storage while maintaining
  rack-space efficiency.

Again, the right answer depends on your environment. You have to make
your decision based on the trade-offs between space utilization,
simplicity, and I/O performance.

Network Configuration
---------------------

Network configuration is a very large topic that spans multiple areas of
this book. For now, make sure that your servers can PXE boot and
successfully communicate with the deployment server.

For example, you usually cannot configure NICs for VLANs when PXE
booting. Additionally, you usually cannot PXE boot with bonded NICs. If
you run into this scenario, consider using a simple 1 Gb switch in a
private network on which only your cloud communicates.

Automated Configuration
~~~~~~~~~~~~~~~~~~~~~~~

The purpose of automatic configuration management is to establish and
maintain the consistency of a system without human intervention. You
want to maintain consistency in your deployments so that you can have
the same cloud every time, repeatably. Proper use of automatic
configuration-management tools ensures that components of the cloud
systems are in particular states, in addition to simplifying deployment
and configuration change propagation.

These tools also make it possible to test and roll back changes, as they
are fully repeatable. Conveniently, a large body of work has been done
by the OpenStack community in this space. Puppet, a configuration
management tool, even provides official modules for OpenStack projects
in an OpenStack infrastructure system known as `Puppet
OpenStack <https://wiki.openstack.org/wiki/Puppet>`_. Chef
configuration management is provided within
https://git.openstack.org/cgit/openstack/openstack-chef-repo. Additional
configuration management systems include Juju, Ansible, and Salt. Also,
PackStack is a command-line utility for Red Hat Enterprise Linux and
derivatives that uses Puppet modules to support rapid deployment of
OpenStack on existing servers over an SSH connection.

An integral part of a configuration-management system is the items that
it controls. You should carefully consider all of the items that you
want, or do not want, to be automatically managed. For example, you may
not want to automatically format hard drives with user data.

Remote Management
~~~~~~~~~~~~~~~~~

In our experience, most operators don't sit right next to the servers
running the cloud, and many don't necessarily enjoy visiting the data
center. OpenStack should be entirely remotely configurable, but
sometimes not everything goes according to plan.

In this instance, having out-of-band access to nodes running OpenStack
components is a boon. The IPMI protocol is the de facto standard here,
and acquiring hardware that supports it is highly recommended to achieve
that lights-out data center aim.

In addition, consider remote power control. While IPMI usually controls
the server's power state, having remote access to the PDU that the
server is plugged into can really be useful for situations when
everything seems wedged.

Parting Thoughts for Provisioning and Deploying OpenStack
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You can save time by understanding the use cases for the cloud you want
to create. Use cases for OpenStack are varied. Some include object
storage only; others require preconfigured compute resources to speed
development-environment setup; and others need fast provisioning of
compute resources that are already secured per tenant with private
networks. Your users may have need for highly redundant servers to make
sure their legacy applications continue to run. Perhaps a goal would be
to architect these legacy applications so that they run on multiple
instances in a cloudy, fault-tolerant way, but not make it a goal to add
to those clusters over time. Your users may indicate that they need
scaling considerations because of heavy Windows server use.

You can save resources by looking at the best fit for the hardware you
have in place already. You might have some high-density storage hardware
available. You could format and repurpose those servers for OpenStack
Object Storage. All of these considerations and input from users help
you build your use case and your deployment plan.

.. note::

   For further research about OpenStack deployment, investigate the
   supported and documented preconfigured, prepackaged installers for
   OpenStack from companies such as
   `Canonical <http://www.ubuntu.com/cloud/ubuntu-openstack>`_,
   `Cisco <http://www.cisco.com/web/solutions/openstack/index.html>`_,
   `Cloudscaling <http://www.cloudscaling.com/>`_,
   `IBM <http://www-03.ibm.com/software/products/en/smartcloud-orchestrator/>`_,
   `Metacloud <http://www.metacloud.com/>`_,
   `Mirantis <http://www.mirantis.com/>`_,
   `Piston <http://www.pistoncloud.com/>`_,
   `Rackspace <http://www.rackspace.com/cloud/private/>`_, `Red
   Hat <http://www.redhat.com/openstack/>`_,
   `SUSE <https://www.suse.com/products/suse-cloud/>`_, and
   `SwiftStack <https://www.swiftstack.com/>`_.

Conclusion
~~~~~~~~~~

The decisions you make with respect to provisioning and deployment will
affect your day-to-day, week-to-week, and month-to-month maintenance of
the cloud. Your configuration management will be able to evolve over
time. However, more thought and design need to be done for upfront
choices about deployment, disk partitioning, and network configuration.

=======
Scaling
=======

Whereas traditional applications required larger hardware to scale
("vertical scaling"), cloud-based applications typically request more,
discrete hardware ("horizontal scaling"). If your cloud is successful,
eventually you must add resources to meet the increasing demand.

To suit the cloud paradigm, OpenStack itself is designed to be
horizontally scalable. Rather than switching to larger servers, you
procure more servers and simply install identically configured services.
Ideally, you scale out and load balance among groups of functionally
identical services (for example, compute nodes or ``nova-api`` nodes)
that communicate on a message bus.

The Starting Point
~~~~~~~~~~~~~~~~~~

Determining the scalability of your cloud and how to improve it is an
exercise with many variables to balance. No one solution meets
everyone's scalability goals. However, it is helpful to track a number
of metrics. Since you can define virtual hardware templates, called
"flavors" in OpenStack, you can start to make scaling decisions based on
the flavors you'll provide. These templates define sizes for memory in
RAM, root disk size, amount of ephemeral data disk space available, and
number of cores for starters.

The default OpenStack flavors are shown in the following table.

.. list-table:: OpenStack default flavors
   :widths: 20 20 20 20 20
   :header-rows: 1

   * - Name
     - Virtual cores
     - Memory
     - Disk
     - Ephemeral
   * - m1.tiny
     - 1
     - 512 MB
     - 1 GB
     - 0 GB
   * - m1.small
     - 1
     - 2 GB
     - 10 GB
     - 20 GB
   * - m1.medium
     - 2
     - 4 GB
     - 10 GB
     - 40 GB
   * - m1.large
     - 4
     - 8 GB
     - 10 GB
     - 80 GB
   * - m1.xlarge
     - 8
     - 16 GB
     - 10 GB
     - 160 GB

The starting point for most is the core count of your cloud. By applying
some ratios, you can gather information about:

- The number of virtual machines (VMs) you expect to run,
  ``((overcommit fraction × cores) / virtual cores per instance)``

- How much storage is required ``(flavor disk size × number of instances)``

You can use these ratios to determine how much additional infrastructure
you need to support your cloud.

Here is an example using the ratios for gathering scalability
information for the number of VMs expected as well as the storage
needed. The following numbers support (200 / 2) × 16 = 1600 VM instances
and require 80 TB of storage for ``/var/lib/nova/instances``:

- 200 physical cores.

- Most instances are size m1.medium (two virtual cores, 50 GB of
  storage).

- Default CPU overcommit ratio (``cpu_allocation_ratio`` in nova.conf)
  of 16:1.

.. note::
   Regardless of the overcommit ratio, an instance cannot be placed
   on any physical node with fewer raw (pre-overcommit) resources than
   the instance flavor requires.
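
The arithmetic behind the worked example can be written out directly. The
values below are copied from the bullet list above; everything else is a
plain calculation, not an OpenStack tool:

```python
physical_cores = 200
cpu_allocation_ratio = 16   # cpu_allocation_ratio in nova.conf
vcpus_per_instance = 2      # m1.medium
disk_per_instance_gb = 50   # m1.medium root plus ephemeral storage

# (overcommit fraction x cores) / virtual cores per instance
max_instances = (cpu_allocation_ratio * physical_cores) // vcpus_per_instance

# flavor disk size x number of instances, for /var/lib/nova/instances
storage_tb = max_instances * disk_per_instance_gb / 1000.0

print(max_instances)  # 1600
print(storage_tb)     # 80.0
```

Substituting your own core counts and flavor mix into the same two lines
gives a quick first estimate of compute and storage capacity.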

However, you need more than the core count alone to estimate the load
that the API services, database servers, and queue servers are likely to
encounter. You must also consider the usage patterns of your cloud.

As a specific example, compare a cloud that supports a managed
web-hosting platform with one running integration tests for a
development project that creates one VM per code commit. In the former,
the heavy work of creating a VM happens only every few months, whereas
the latter puts constant heavy load on the cloud controller. You must
consider your average VM lifetime, as a larger number generally means
less load on the cloud controller.

Aside from the creation and termination of VMs, you must consider the
impact of users accessing the service—particularly on ``nova-api`` and
its associated database. Listing instances garners a great deal of
information and, given the frequency with which users run this
operation, a cloud with a large number of users can increase the load
significantly. This can occur even without their knowledge—leaving the
OpenStack dashboard instances tab open in the browser refreshes the list
of VMs every 30 seconds.

After you consider these factors, you can determine how many cloud
controller cores you require. A typical eight core, 8 GB of RAM server
is sufficient for up to a rack of compute nodes — given the above
caveats.

You must also consider key hardware specifications for the performance
of user VMs, as well as budget and performance needs, including storage
performance (spindles/core), memory availability (RAM/core), network
bandwidth (Gbps/core), and overall CPU performance (CPU/core).

.. note::

   For a discussion of metric tracking, including how to extract
   metrics from your cloud, see :doc:`ops_logging_monitoring`.

Adding Cloud Controller Nodes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You can facilitate the horizontal expansion of your cloud by adding
nodes. Adding compute nodes is straightforward—they are easily picked up
by the existing installation. However, you must consider some important
points when you design your cluster to be highly available.

Recall that a cloud controller node runs several different services. You
can install services that communicate only using the message queue
internally—\ ``nova-scheduler`` and ``nova-console``—on a new server for
expansion. However, other integral parts require more care.

You should load balance user-facing services such as dashboard,
``nova-api``, or the Object Storage proxy. Use any standard HTTP
load-balancing method (DNS round robin, hardware load balancer, or
software such as Pound or HAProxy). One caveat with dashboard is the VNC
proxy, which uses the WebSocket protocol—something that an L7 load
balancer might struggle with. See also `Horizon session storage
<http://docs.openstack.org/developer/horizon/topics/deployment.html#session-storage>`_.

You can configure some services, such as ``nova-api`` and
``glance-api``, to use multiple processes by changing a flag in their
configuration file—allowing them to share work between multiple cores on
the one machine.

.. note::

   Several options are available for MySQL load balancing, and the
   supported AMQP brokers have built-in clustering support. Information
   on how to configure these and many of the other services can be
   found in :doc:`operations`.

Segregating Your Cloud
~~~~~~~~~~~~~~~~~~~~~~

When you want to offer users different regions, whether to satisfy legal
requirements for data storage, provide redundancy across earthquake
fault lines, or serve low-latency API calls, you segregate your cloud.
Use one of the following OpenStack methods to segregate your cloud:
*cells*, *regions*, *availability zones*, or *host aggregates*.

Each method provides different functionality and can be best divided
into two groups:

- Cells and regions, which segregate an entire cloud and result in
  running separate Compute deployments.

- :term:`Availability zones <availability zone>` and host aggregates, which
  merely divide a single Compute deployment.

The table below provides a comparison view of each segregation method currently
provided by OpenStack Compute.

.. list-table:: OpenStack segregation methods
   :widths: 20 20 20 20 20
   :header-rows: 1

   * -
     - Cells
     - Regions
     - Availability zones
     - Host aggregates
   * - **Use when you need**
     - A single :term:`API endpoint` for compute, or you require a second
       level of scheduling.
     - Discrete regions with separate API endpoints and no coordination
       between regions.
     - Logical separation within your nova deployment for physical isolation
       or redundancy.
     - To schedule a group of hosts with common features.
   * - **Example**
     - A cloud with multiple sites where you can schedule VMs "anywhere" or on
       a particular site.
     - A cloud with multiple sites, where you schedule VMs to a particular
       site and you want a shared infrastructure.
     - A single-site cloud with equipment fed by separate power supplies.
     - Scheduling to hosts with trusted hardware support.
   * - **Overhead**
     - Considered experimental. A new service, nova-cells. Each cell has a full
       nova installation except nova-api.
     - A different API endpoint for every region. Each region has a full nova
       installation.
     - Configuration changes to ``nova.conf``.
     - Configuration changes to ``nova.conf``.
   * - **Shared services**
     - Keystone, ``nova-api``
     - Keystone
     - Keystone, all nova services
     - Keystone, all nova services

Cells and Regions
-----------------

OpenStack Compute cells are designed to allow running the cloud in a
distributed fashion without having to use more complicated technologies,
or be invasive to existing nova installations. Hosts in a cloud are
partitioned into groups called *cells*. Cells are configured in a tree.
The top-level cell ("API cell") has a host that runs the ``nova-api``
service, but no ``nova-compute`` services. Each child cell runs all of
the other typical ``nova-*`` services found in a regular installation,
except for the ``nova-api`` service. Each cell has its own message queue
and database service and also runs ``nova-cells``, which manages the
communication between the API cell and child cells.

This allows a single API server to be used to control access to
multiple cloud installations. Introducing a second level of scheduling
(the cell selection), in addition to the regular ``nova-scheduler``
selection of hosts, provides greater flexibility to control where
virtual machines are run.

Unlike having a single API endpoint, regions have a separate API
endpoint per installation, allowing for a more discrete separation.
Users wanting to run instances across sites have to explicitly select a
region. However, the additional complexity of running a new service is
not required.

The OpenStack dashboard (horizon) can be configured to use multiple
regions. This can be configured through the ``AVAILABLE_REGIONS``
parameter.
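
For example, a two-region dashboard could carry a setting along these
lines in horizon's ``local_settings.py``. The endpoint URLs and region
names here are placeholders, not values from a real deployment:

```python
# Each entry pairs a Keystone endpoint URL with the region name shown
# in the dashboard's region selector.
AVAILABLE_REGIONS = [
    ("http://region-one.example.com:5000/v2.0", "RegionOne"),
    ("http://region-two.example.com:5000/v2.0", "RegionTwo"),
]
```

Users logging in to the dashboard then pick the region they want to
work in before authenticating against its Keystone endpoint.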
|
||||||
|
|
||||||
|

Availability Zones and Host Aggregates
--------------------------------------

You can use availability zones, host aggregates, or both to partition a
nova deployment. Availability zones are implemented through, and
configured in a similar way to, host aggregates. However, you use them
for different reasons.

Availability zone
~~~~~~~~~~~~~~~~~

An availability zone enables you to arrange OpenStack compute hosts
into logical groups and provides a form of physical isolation and
redundancy from other availability zones, such as by using a separate
power supply or network equipment.

You define the availability zone in which a specified compute host
resides locally on each server. An availability zone is commonly used to
identify a set of servers that have a common attribute. For instance, if
some of the racks in your data center are on a separate power source,
you can put servers in those racks in their own availability zone.
Availability zones can also help separate different classes of hardware.

When users provision resources, they can specify from which availability
zone they want their instance to be built. This allows cloud consumers
to ensure that their application resources are spread across disparate
machines to achieve high availability in the event of hardware failure.
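
For example, a user can request a specific availability zone at boot
time. In this sketch, the zone name ``az-rack1``, the flavor, and the
image UUID are placeholders:

```console
$ nova boot --availability-zone az-rack1 --image <image-uuid> \
  --flavor m1.small my-instance
```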

Host aggregates
~~~~~~~~~~~~~~~

Host aggregates enable you to partition OpenStack Compute deployments
into logical groups for load balancing and instance distribution. You
can use host aggregates to further partition an availability zone. For
example, you might use host aggregates to partition an availability
zone into groups of hosts that either share common resources, such as
storage and network, or have a special property, such as trusted
computing hardware.

A common use of host aggregates is to provide information for use with
the ``nova-scheduler``. For example, you might use a host aggregate to
group a set of hosts that share specific flavors or images.

The general case for this is setting key-value pairs in the aggregate
metadata and matching key-value pairs in the flavor's ``extra_specs``
metadata. The ``AggregateInstanceExtraSpecsFilter`` in the filter
scheduler will enforce that instances be scheduled only on hosts in
aggregates that define the same key to the same value.

An advanced use of this general concept allows different flavor types to
run with different CPU and RAM allocation ratios so that high-intensity
computing loads and low-intensity development and testing systems can
share the same cloud without either starving the high-use systems or
wasting resources on low-utilization systems. This works by setting
``metadata`` in your host aggregates and matching ``extra_specs`` in
your flavor types.

The first step is setting the aggregate metadata keys
``cpu_allocation_ratio`` and ``ram_allocation_ratio`` to a
floating-point value. The filter schedulers ``AggregateCoreFilter`` and
``AggregateRamFilter`` will use those values rather than the global
defaults in ``nova.conf`` when scheduling to hosts in the aggregate. It
is important to be cautious when using this feature, since each host can
be in multiple aggregates but should have only one allocation ratio for
each resource. It is up to you to avoid putting a host in multiple
aggregates that define different values for the same resource.

This is the first half of the equation. To get flavor types that are
guaranteed a particular ratio, you must set the ``extra_specs`` in the
flavor type to the key-value pair you want to match in the aggregate.
For example, if you define ``extra_specs`` ``cpu_allocation_ratio`` to
"1.0", then instances of that type will run in aggregates only where the
metadata key ``cpu_allocation_ratio`` is also defined as "1.0". In
practice, it is better to define an additional key-value pair in the
aggregate metadata to match on rather than match directly on
``cpu_allocation_ratio`` or ``core_allocation_ratio``. This allows
better abstraction. For example, by defining a key ``overcommit`` and
setting a value of "high", "medium", or "low", you could then tune the
numeric allocation ratios in the aggregates without also needing to
change all flavor types relating to them.
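
A minimal sketch of this pattern, assuming a hypothetical aggregate
named ``low-overcommit`` and a hypothetical flavor ``m1.guaranteed``:

```console
$ nova aggregate-create low-overcommit
$ nova aggregate-set-metadata low-overcommit overcommit=low
$ nova flavor-key m1.guaranteed set aggregate_instance_extra_specs:overcommit=low
```

With ``AggregateInstanceExtraSpecsFilter`` enabled, instances of
``m1.guaranteed`` would then land only on hosts in aggregates whose
metadata defines ``overcommit=low``.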

.. note::

   Previously, all services had an availability zone. Currently, only
   the ``nova-compute`` service has its own availability zone. Services
   such as ``nova-scheduler``, ``nova-network``, and ``nova-conductor``
   have always spanned all availability zones.

   When you run any of the following operations, the services appear in
   their own internal availability zone
   (``CONF.internal_service_availability_zone``):

   - :command:`nova host-list` (os-hosts)

   - :command:`euca-describe-availability-zones verbose`

   - :command:`nova service-list`

   The internal availability zone is hidden in
   :command:`euca-describe-availability-zones` (nonverbose).

   ``CONF.node_availability_zone`` has been renamed to
   ``CONF.default_availability_zone`` and is used only by the
   ``nova-api`` and ``nova-scheduler`` services.

   ``CONF.node_availability_zone`` still works but is deprecated.

Scalable Hardware
~~~~~~~~~~~~~~~~~

While several resources already exist to help with deploying and
installing OpenStack, it's very important to make sure that you have
your deployment planned out ahead of time. This guide presumes that you
have at least set aside a rack for the OpenStack cloud but also offers
suggestions for when and what to scale.

Hardware Procurement
--------------------

“The Cloud” has been described as a volatile environment where servers
can be created and terminated at will. While this may be true, it does
not mean that your servers must be volatile. Ensuring that your cloud's
hardware is stable and configured correctly means that your cloud
environment remains up and running. Basically, put effort into creating
a stable hardware environment so that you can host a cloud that users
may treat as unstable and volatile.

OpenStack can be deployed on any hardware supported by an
OpenStack-compatible Linux distribution. Hardware does not have to be
consistent, but it should at least have the same type of CPU to support
instance migration.

The typical hardware recommended for use with OpenStack is the standard
value-for-money offerings that most hardware vendors stock. It should be
straightforward to divide your procurement into building blocks such as
"compute," "object storage," and "cloud controller," and request as many
of these as you need. Alternatively, should you be unable to spend more,
if you have existing servers—provided they meet your performance
requirements and virtualization technology—they are quite likely to be
able to support OpenStack.

Capacity Planning
-----------------

OpenStack is designed to increase in size in a straightforward manner.
Taking into account the considerations that we've mentioned in this
chapter—particularly on the sizing of the cloud controller—it should be
possible to procure additional compute or object storage nodes as
needed. New nodes do not need to be the same specification, or even
vendor, as existing nodes.

For compute nodes, ``nova-scheduler`` will take care of differences in
sizing having to do with core count and RAM amounts; however, you should
consider that the user experience changes with differing CPU speeds.
When adding object storage nodes, a weight should be specified that
reflects the capability of the node.

Monitoring the resource usage and user growth will enable you to know
when to procure. :doc:`ops_logging_monitoring` details some useful metrics.

Burn-in Testing
---------------

The chances of failure for a server's hardware are high at the start
and the end of its life. As a result, dealing with hardware failures
while in production can be avoided by appropriate burn-in testing to
attempt to trigger the early-stage failures. The general principle is to
stress the hardware to its limits. Examples of burn-in tests include
running a CPU or disk benchmark for several days.

=================
Storage Decisions
=================

Storage is found in many parts of the OpenStack stack, and the differing
types can cause confusion to even experienced cloud engineers. This
section focuses on persistent storage options you can configure with
your cloud. It's important to understand the distinction between
:term:`ephemeral <ephemeral volume>` storage and
:term:`persistent <persistent volume>` storage.

Ephemeral Storage
~~~~~~~~~~~~~~~~~

If you deploy only the OpenStack :term:`Compute service` (nova), your users do
not have access to any form of persistent storage by default. The disks
associated with VMs are "ephemeral," meaning that (from the user's point
of view) they effectively disappear when a virtual machine is
terminated.

Persistent Storage
~~~~~~~~~~~~~~~~~~

Persistent storage means that the storage resource outlives any other
resource and is always available, regardless of the state of a running
instance.

Today, OpenStack clouds explicitly support three types of persistent
storage: *object storage*, *block storage*, and *file system storage*.

Object Storage
--------------

With object storage, users access binary objects through a REST API. You
may be familiar with Amazon S3, which is a well-known example of an
object storage system. Object storage is implemented in OpenStack by the
OpenStack Object Storage (swift) project. If your intended users need to
archive or manage large datasets, you want to provide them with object
storage. In addition, OpenStack can store your virtual machine (VM)
images inside of an object storage system, as an alternative to storing
the images on a file system.

OpenStack Object Storage provides a highly scalable, highly available
storage solution by relaxing some of the constraints of traditional file
systems. In designing and procuring for such a cluster, it is important
to understand some key concepts about its operation. Essentially, this
type of storage is built on the idea that all storage hardware fails, at
every level, at some point. Infrequently encountered failures that would
hamstring other storage systems, such as issues taking down RAID cards
or entire servers, are handled gracefully with OpenStack Object
Storage.

A good document describing the Object Storage architecture is found
within the `developer
documentation <http://docs.openstack.org/developer/swift/overview_architecture.html>`_
— read this first. Once you understand the architecture, you should know what a
proxy server does and how zones work. However, some important points are
often missed at first glance.

When designing your cluster, you must consider durability and
availability. Understand that the predominant source of these is the
spread and placement of your data, rather than the reliability of the
hardware. Consider the default value of the number of replicas, which is
three. This means that before an object is marked as having been
written, at least two copies exist—in case a single server fails to
write, the third copy may or may not yet exist when the write operation
initially returns. Altering this number increases the robustness of your
data, but reduces the amount of storage you have available. Next, look
at the placement of your servers. Consider spreading them widely
throughout your data center's network and power-failure zones. Is a zone
a rack, a server, or a disk?

Object Storage's network patterns might seem unfamiliar at first.
Consider these main traffic flows:

- Among :term:`object`, :term:`container`, and
  :term:`account servers <account server>`

- Between those servers and the proxies

- Between the proxies and your users

Object Storage is very "chatty" among servers hosting data—even a small
cluster does megabytes/second of traffic, which is predominantly, “Do
you have the object?”/“Yes I have the object!” Of course, if the answer
to the aforementioned question is negative or the request times out,
replication of the object begins.

Consider the scenario where an entire server fails and 24 TB of data
needs to be transferred "immediately" to remain at three copies—this can
put significant load on the network.

Another fact that's often forgotten is that when a new file is being
uploaded, the proxy server must write out as many streams as there are
replicas, multiplying the network traffic. For a three-replica cluster,
10 Gbps in means 30 Gbps out. Combining this with the bandwidth demands
of replication is what results in the recommendation that your private
network be of significantly higher bandwidth than your public network
need be. Oh, and OpenStack Object Storage communicates internally with
unencrypted, unauthenticated rsync for performance—you do want the
private network to be private.
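
The proxy fan-out above can be sketched as a back-of-the-envelope
calculation (the figures are illustrative, not a sizing rule):

```python
def proxy_egress_gbps(ingress_gbps, replicas):
    """Each uploaded byte is written out once per replica, so proxy
    egress toward the storage nodes scales linearly with replica count."""
    return ingress_gbps * replicas

# A default three-replica cluster turns 10 Gbps of client uploads into
# 30 Gbps of internal traffic, before any replication repair is counted.
print(proxy_egress_gbps(10, 3))
```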

The remaining point on bandwidth is the public-facing portion. The
``swift-proxy`` service is stateless, which means that you can easily
add more and use HTTP load-balancing methods to share bandwidth and
availability between them. More proxies mean more bandwidth, if your
storage can keep up.

Block Storage
-------------

Block storage (sometimes referred to as volume storage) provides users
with access to block-storage devices. Users interact with block storage
by attaching volumes to their running VM instances.

These volumes are persistent: they can be detached from one instance and
re-attached to another, and the data remains intact. Block storage is
implemented in OpenStack by the OpenStack Block Storage (cinder)
project, which supports multiple back ends in the form of drivers. Your
choice of a storage back end must be supported by a Block Storage
driver.
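
The typical volume lifecycle looks like the following sketch (the
volume name, sizes, and UUIDs are placeholders):

```console
$ cinder create --display-name my-volume 10
$ nova volume-attach <instance-uuid> <volume-uuid> /dev/vdc
$ nova volume-detach <instance-uuid> <volume-uuid>
```

The same volume can then be attached to a different instance with its
data intact.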

Most block storage drivers allow the instance to have direct access to
the underlying storage hardware's block device. This helps increase the
overall read/write I/O performance. However, support for utilizing files
as volumes is also well established, with full support for NFS,
GlusterFS, and others.

These drivers work a little differently than a traditional "block"
storage driver. On an NFS or GlusterFS file system, a single file is
created and then mapped as a "virtual" volume into the instance. This
mapping/translation is similar to how OpenStack utilizes QEMU's
file-based virtual machines stored in ``/var/lib/nova/instances``.

Shared File Systems Service
---------------------------

The Shared File Systems service provides a set of services for managing
shared file systems in a multi-tenant cloud environment. Users interact
with the Shared File Systems service by mounting remote file systems on
their instances and then using those systems for file storage and
exchange. The Shared File Systems service provides you with shares. A
share is a remote, mountable file system. You can mount a share to, and
access a share from, several hosts by several users at a time. With
shares, a user can also:

- Create a share, specifying its size, shared file system protocol, and
  visibility level.

- Create a share on either a share server or standalone, depending on
  the selected back-end mode, with or without using a share network.

- Specify access rules and security services for existing shares.

- Combine several shares in groups to keep data consistency inside the
  groups for the following safe group operations.

- Create a snapshot of a selected share or a share group for storing
  the existing shares consistently or creating new shares from that
  snapshot in a consistent way.

- Create a share from a snapshot.

- Set rate limits and quotas for specific shares and snapshots.

- View usage of share resources.

- Remove shares.

Like Block Storage, the Shared File Systems service is persistent. It
can be:

- Mounted to any number of client machines.

- Detached from one instance and attached to another without data loss.
  During this process the data are safe unless the Shared File Systems
  service itself is changed or removed.

Shares are provided by the Shared File Systems service. In OpenStack,
the Shared File Systems service is implemented by the Shared File
Systems (manila) project, which supports multiple back ends in the form
of drivers. The Shared File Systems service can be configured to
provision shares from one or more back ends. Share servers are, mostly,
virtual machines that export file shares via different protocols such as
NFS, CIFS, GlusterFS, or HDFS.
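
For instance, creating a 1 GB NFS share and granting a client access
might look like the following sketch (the share name and client IP are
placeholders):

```console
$ manila create NFS 1 --name my-share
$ manila access-allow my-share ip 10.0.0.10
```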

OpenStack Storage Concepts
~~~~~~~~~~~~~~~~~~~~~~~~~~

The table below explains the different storage concepts provided by OpenStack.

.. list-table:: OpenStack storage
   :widths: 20 20 20 20 20
   :header-rows: 1

   * -
     - Ephemeral storage
     - Block storage
     - Object storage
     - Shared File System storage
   * - Used to…
     - Run operating system and scratch space
     - Add additional persistent storage to a virtual machine (VM)
     - Store data, including VM images
     - Add additional persistent storage to a virtual machine
   * - Accessed through…
     - A file system
     - A block device that can be partitioned, formatted, and mounted
       (such as /dev/vdc)
     - The REST API
     - A Shared File Systems service share (either manila managed or an
       external one registered in manila) that can be partitioned,
       formatted, and mounted (such as /dev/vdc)
   * - Accessible from…
     - Within a VM
     - Within a VM
     - Anywhere
     - Within a VM
   * - Managed by…
     - OpenStack Compute (nova)
     - OpenStack Block Storage (cinder)
     - OpenStack Object Storage (swift)
     - OpenStack Shared File System Storage (manila)
   * - Persists until…
     - VM is terminated
     - Deleted by user
     - Deleted by user
     - Deleted by user
   * - Sizing determined by…
     - Administrator configuration of size settings, known as *flavors*
     - User specification in initial request
     - Amount of available physical storage
     - * User specification in initial request
       * Requests for extension
       * Available user-level quotas
       * Limitations applied by Administrator
   * - Encryption set by…
     - Parameter in ``nova.conf``
     - Admin establishing an `encrypted volume type
       <http://docs.openstack.org/admin-guide/dashboard_manage_volumes.html>`_,
       then user selecting an encrypted volume
     - Not yet available
     - Shared File Systems service does not apply any additional
       encryption above what the share's back-end storage provides
   * - Example of typical usage…
     - 10 GB first disk, 30 GB second disk
     - 1 TB disk
     - 10s of TBs of dataset storage
     - Depends completely on the size of back-end storage specified when
       a share was being created. In case of thin provisioning it can be
       partial space reservation (for more details see the
       `Capabilities and Extra-Specs
       <http://docs.openstack.org/developer/manila/devref/capabilities_and_extra_specs.html?highlight=extra%20specs#common-capabilities>`_
       specification)

With file-level storage, users access stored data using the operating
system's file system interface. Most users, if they have used a network
storage solution before, have encountered this form of networked
storage. In the Unix world, the most common form of this is NFS. In the
Windows world, the most common form is called CIFS (previously,
SMB).

OpenStack clouds do not present file-level storage to end users.
However, it is important to consider file-level storage for storing
instances under ``/var/lib/nova/instances`` when designing your cloud,
since you must have a shared file system if you want to support live
migration.

Choosing Storage Back Ends
~~~~~~~~~~~~~~~~~~~~~~~~~~

Users will indicate different needs for their cloud use cases. Some may
need fast access to many objects that do not change often, or want to
set a time-to-live (TTL) value on a file. Others may access only storage
that is mounted with the file system itself, but want it to be
replicated instantly when starting a new instance. For other systems,
ephemeral storage—storage that is released when a VM attached to it is
shut down—is the preferred way. When you select
:term:`storage back ends <storage back end>`,
ask the following questions on behalf of your users:

- Do my users need block storage?

- Do my users need object storage?

- Do I need to support live migration?

- Should my persistent storage drives be contained in my compute nodes,
  or should I use external storage?

- What is the platter count I can achieve? Do more spindles result in
  better I/O despite network access?

- Which one results in the best cost-performance scenario I'm aiming
  for?

- How do I manage the storage operationally?

- How redundant and distributed is the storage? What happens if a
  storage node fails? To what extent can it mitigate my data-loss
  disaster scenarios?

To deploy your storage by using only commodity hardware, you can use a
number of open-source packages, as shown in the following table.

.. list-table:: Persistent file-based storage support
   :widths: 25 25 25 25
   :header-rows: 1

   * -
     - Object
     - Block
     - File-level
   * - Swift
     - .. image:: figures/Check_mark_23x20_02.png
          :width: 30%
     -
     -
   * - LVM
     -
     - .. image:: figures/Check_mark_23x20_02.png
          :width: 30%
     -
   * - Ceph
     - .. image:: figures/Check_mark_23x20_02.png
          :width: 30%
     - .. image:: figures/Check_mark_23x20_02.png
          :width: 30%
     - Experimental
   * - Gluster
     - .. image:: figures/Check_mark_23x20_02.png
          :width: 30%
     - .. image:: figures/Check_mark_23x20_02.png
          :width: 30%
     - .. image:: figures/Check_mark_23x20_02.png
          :width: 30%
   * - NFS
     -
     - .. image:: figures/Check_mark_23x20_02.png
          :width: 30%
     - .. image:: figures/Check_mark_23x20_02.png
          :width: 30%
   * - ZFS
     -
     - .. image:: figures/Check_mark_23x20_02.png
          :width: 30%
     -
   * - Sheepdog
     - .. image:: figures/Check_mark_23x20_02.png
          :width: 30%
     - .. image:: figures/Check_mark_23x20_02.png
          :width: 30%
     -

.. note::

   This list of open source file-level shared storage solutions is not
   exhaustive; other open source solutions exist (MooseFS). Your
   organization may already have deployed a file-level shared storage
   solution that you can use.

**Storage Driver Support**

In addition to the open source technologies, there are a number of
proprietary solutions that are officially supported by OpenStack Block
Storage. They are offered by the following vendors:

- IBM (Storwize family/SVC, XIV)

- NetApp

- Nexenta

- SolidFire

You can find a matrix of the functionality provided by all of the
supported Block Storage drivers on the `OpenStack
wiki <https://wiki.openstack.org/wiki/CinderSupportMatrix>`_.

Also, you need to decide whether you want to support object storage in
your cloud. The two common use cases for providing object storage in a
compute cloud are:

- To provide users with a persistent storage mechanism

- As a scalable, reliable data store for virtual machine images

Commodity Storage Back-end Technologies
---------------------------------------

This section provides a high-level overview of the differences among the
different commodity storage back-end technologies. Depending on your
cloud users' needs, you can implement one or many of these technologies
in different combinations:

OpenStack Object Storage (swift)
    The official OpenStack Object Store implementation. It is a mature
    technology that has been used for several years in production by
    Rackspace as the technology behind Rackspace Cloud Files. As it is
    highly scalable, it is well suited to managing petabytes of storage.
    OpenStack Object Storage's advantages are better integration with
    OpenStack (it integrates with OpenStack Identity and works with the
    OpenStack dashboard interface) and better support for multiple
    data-center deployment through support of asynchronous eventual
    consistency replication.

    Therefore, if you eventually plan on distributing your storage
    cluster across multiple data centers, if you need unified accounts
    for your users for both compute and object storage, or if you want
    to control your object storage with the OpenStack dashboard, you
    should consider OpenStack Object Storage. More detail can be found
    about OpenStack Object Storage in the section below.

Ceph
    A scalable storage solution that replicates data across commodity
    storage nodes. Ceph was originally developed by one of the founders
    of DreamHost and is currently used in production there.

    Ceph was designed to expose different types of storage interfaces to
    the end user: it supports object storage, block storage, and
    file-system interfaces, although the file-system interface is not
    yet considered production-ready. Ceph supports the same API as swift
    for object storage, can be used as a back end for cinder block
    storage, and can serve as back-end storage for glance images. Ceph
    supports "thin provisioning," implemented using copy-on-write.

    This can be useful when booting from volume because a new volume can
    be provisioned very quickly. Ceph also supports keystone-based
    authentication (as of version 0.56), so it can be a seamless swap in
    for the default OpenStack swift implementation.

    Ceph's advantages are that it gives the administrator more
    fine-grained control over data distribution and replication
    strategies, enables you to consolidate your object and block
    storage, enables very fast provisioning of boot-from-volume
    instances using thin provisioning, and supports a distributed
    file-system interface, though this interface is `not yet
    recommended <http://ceph.com/docs/master/cephfs/>`_ for use in
    production deployment by the Ceph project.

    If you want to manage your object and block storage within a single
    system, or if you want to support fast boot-from-volume, you should
    consider Ceph.
Gluster
|
||||||
|
A distributed, shared file system. As of Gluster version 3.3, you
|
||||||
|
can use Gluster to consolidate your object storage and file storage
|
||||||
|
into one unified file and object storage solution, which is called
|
||||||
|
Gluster For OpenStack (GFO). GFO uses a customized version of swift
|
||||||
|
that enables Gluster to be used as the back-end storage.
|
||||||
|
|
||||||
|
The main reason to use GFO rather than regular swift is if you also
|
||||||
|
want to support a distributed file system, either to support shared
|
||||||
|
storage live migration or to provide it as a separate service to
|
||||||
|
your end users. If you want to manage your object and file storage
|
||||||
|
within a single system, you should consider GFO.
|
||||||
|
LVM
   The Logical Volume Manager is a Linux-based system that provides an
   abstraction layer on top of physical disks to expose logical volumes
   to the operating system. The LVM back end implements block storage
   as LVM logical partitions.

   On each host that will house block storage, an administrator must
   initially create a volume group dedicated to Block Storage volumes.
   Blocks are created from LVM logical volumes.

   .. note::

      LVM does *not* provide any replication. Typically,
      administrators configure RAID on nodes that use LVM as block
      storage to protect against failures of individual hard drives.
      However, RAID does not protect against a failure of the entire
      host.

ZFS
   The Solaris iSCSI driver for OpenStack Block Storage implements
   blocks as ZFS entities. ZFS is a file system that also has the
   functionality of a volume manager. This is unlike on a Linux system,
   where there is a separation of volume manager (LVM) and file system
   (such as ext3, ext4, xfs, and btrfs). ZFS has a number of
   advantages over ext4, including improved data-integrity checking.

   The ZFS back end for OpenStack Block Storage supports only
   Solaris-based systems, such as Illumos. While there is a Linux port
   of ZFS, it is not included in any of the standard Linux
   distributions, and it has not been tested with OpenStack Block
   Storage. As with LVM, ZFS does not provide replication across hosts
   on its own; you need to add a replication solution on top of ZFS if
   your cloud needs to be able to handle storage-node failures.

   We don't recommend ZFS unless you have previous experience with
   deploying it, since the ZFS back end for Block Storage requires a
   Solaris-based operating system, and we assume that your experience
   is primarily with Linux-based systems.

Sheepdog
   Sheepdog is a userspace distributed storage system. Sheepdog scales
   to several hundred nodes, and has powerful virtual disk management
   features such as snapshot, cloning, rollback, and thin provisioning.

   It is essentially an object storage system that manages disks and
   aggregates their space and performance linearly at hyper scale on
   commodity hardware. On top of its object store, Sheepdog provides an
   elastic volume service and an HTTP service. Sheepdog does not assume
   anything about the kernel version and can work nicely with
   xattr-supported file systems.
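As a conceptual aside, the copy-on-write technique behind Ceph's thin provisioning (mentioned above) can be sketched in a few lines of Python. This is a toy illustration only, not Ceph code: cloning a volume is instant because no block data is copied, and blocks are shared with the parent until one side writes.

```python
class CowVolume:
    """Toy copy-on-write volume: blocks are shared until written."""

    def __init__(self, blocks=None):
        self._parent = None
        self._blocks = blocks if blocks is not None else {}

    def clone(self):
        # Cloning is O(1): no block data is copied up front.
        child = CowVolume()
        child._parent = self
        return child

    def read(self, index):
        if index in self._blocks:
            return self._blocks[index]
        if self._parent is not None:
            return self._parent.read(index)
        return b"\x00"  # unwritten blocks read as zeros

    def write(self, index, data):
        # Writes land in this volume's private block map only.
        self._blocks[index] = data


base = CowVolume({0: b"base"})
snap = base.clone()       # instant: shares block 0 with its parent
snap.write(1, b"new")     # private to the clone; the parent is untouched
assert snap.read(0) == b"base"
assert base.read(1) == b"\x00"
```

This is why boot-from-volume can be fast on such a back end: provisioning a new volume from an image is a metadata operation, not a full copy.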

Conclusion
~~~~~~~~~~

We hope that you now have some considerations in mind and questions to
ask your future cloud users about their storage use cases. As you can
see, your storage decisions will also influence your network design for
performance and security needs. Continue with us to make more informed
decisions about your OpenStack cloud design.

@ -0,0 +1,52 @@
============
Architecture
============

Designing an OpenStack cloud is a great achievement. It requires a
robust understanding of the requirements and needs of the cloud's users
to determine the best possible configuration to meet them. OpenStack
provides a great deal of flexibility to achieve your needs, and this
part of the book aims to shine light on many of the decisions you need
to make during the process.

To design, deploy, and configure OpenStack, administrators must
understand the logical architecture. A diagram can help you envision all
the integrated services within OpenStack and how they interact with each
other.

OpenStack modules are one of the following types:

Daemon
   Runs as a background process. On Linux platforms, a daemon is usually
   installed as a service.

Script
   Installs a virtual environment and runs tests.

Command-line interface (CLI)
   Enables users to submit API calls to OpenStack services through commands.

As shown, end users can interact through the dashboard, CLIs, and APIs.
All services authenticate through a common Identity service, and
individual services interact with each other through public APIs, except
where privileged administrator commands are necessary.
:ref:`logical_architecture` shows the most common, but not the only
possible, logical architecture for an OpenStack cloud.

.. _logical_architecture:

.. figure:: figures/osog_0001.png
   :width: 100%

   Figure. OpenStack Logical Architecture

.. toctree::
   :maxdepth: 2

   arch_examples.rst
   arch_provision.rst
   arch_cloud_controller.rst
   arch_compute_nodes.rst
   arch_scaling.rst
   arch_storage.rst
   arch_network_design.rst

@ -0,0 +1 @@
../../common
@ -0,0 +1,290 @@
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
# implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# This file is execfile()d with the current directory set to its
# containing dir.
#
# Note that not all possible configuration values are present in this
# autogenerated file.
#
# All configuration values have a default; values that are commented out
# serve to show the default.

import os
# import sys

import openstackdocstheme

# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
# sys.path.insert(0, os.path.abspath('.'))

# -- General configuration ------------------------------------------------

# If your documentation needs a minimal Sphinx version, state it here.
# needs_sphinx = '1.0'

# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = []

# Add any paths that contain templates here, relative to this directory.
# templates_path = ['_templates']

# The suffix of source filenames.
source_suffix = '.rst'

# The encoding of source files.
# source_encoding = 'utf-8-sig'

# The master toctree document.
master_doc = 'index'

# General information about the project.
project = u'Operations Guide'
bug_tag = u'ops-guide'
copyright = u'2016, OpenStack contributors'

# The version info for the project you're documenting, acts as replacement for
# |version| and |release|, also used in various other places throughout the
# built documents.
#
# The short X.Y version.
version = '0.0.1'
# The full version, including alpha/beta/rc tags.
release = '0.0.1'

# A few variables have to be set for the log-a-bug feature.
# giturl: The location of conf.py on Git. Must be set manually.
# gitsha: The SHA of the most recent commit. Automatically extracted
#         from the git log.
# bug_tag: Tag for categorizing the bug. Must be set manually.
# These variables are passed to the logabug code via html_context.
giturl = u'http://git.openstack.org/cgit/openstack/openstack-manuals/tree/doc/ops-guide/source'
git_cmd = "/usr/bin/git log | head -n1 | cut -f2 -d' '"
gitsha = os.popen(git_cmd).read().strip('\n')
html_context = {"gitsha": gitsha, "bug_tag": bug_tag,
                "giturl": giturl}

# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
# language = None

# There are two options for replacing |today|: either, you set today to some
# non-false value, then it is used:
# today = ''
# Else, today_fmt is used as the format for a strftime call.
# today_fmt = '%B %d, %Y'

# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
exclude_patterns = ['common/cli*', 'common/nova*',
                    'common/get_started*', 'common/dashboard*']

# The reST default role (used for this markup: `text`) to use for all
# documents.
# default_role = None

# If true, '()' will be appended to :func: etc. cross-reference text.
# add_function_parentheses = True

# If true, the current module name will be prepended to all description
# unit titles (such as .. function::).
# add_module_names = True

# If true, sectionauthor and moduleauthor directives will be shown in the
# output. They are ignored by default.
# show_authors = False

# The name of the Pygments (syntax highlighting) style to use.
pygments_style = 'sphinx'

# A list of ignored prefixes for module index sorting.
# modindex_common_prefix = []

# If true, keep warnings as "system message" paragraphs in the built documents.
# keep_warnings = False


# -- Options for HTML output ----------------------------------------------

# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
html_theme = 'openstackdocs'

# Theme options are theme-specific and customize the look and feel of a theme
# further. For a list of options available for each theme, see the
# documentation.
# html_theme_options = {}

# Add any paths that contain custom themes here, relative to this directory.
html_theme_path = [openstackdocstheme.get_html_theme_path()]

# The name for this set of Sphinx documents. If None, it defaults to
# "<project> v<release> documentation".
# html_title = None

# A shorter title for the navigation bar. Default is the same as html_title.
# html_short_title = None

# The name of an image file (relative to this directory) to place at the top
# of the sidebar.
# html_logo = None

# The name of an image file (within the static path) to use as favicon of the
# docs. This file should be a Windows icon file (.ico) being 16x16 or 32x32
# pixels large.
# html_favicon = None

# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
# html_static_path = []

# Add any extra paths that contain custom files (such as robots.txt or
# .htaccess) here, relative to this directory. These files are copied
# directly to the root of the documentation.
# html_extra_path = []

# If not '', a 'Last updated on:' timestamp is inserted at every page bottom,
# using the given strftime format.
# So that we can enable "log-a-bug" links from each output HTML page, this
# variable must be set to a format that includes year, month, day, hours and
# minutes.
html_last_updated_fmt = '%Y-%m-%d %H:%M'

# If true, SmartyPants will be used to convert quotes and dashes to
# typographically correct entities.
# html_use_smartypants = True

# Custom sidebar templates, maps document names to template names.
# html_sidebars = {}

# Additional templates that should be rendered to pages, maps page names to
# template names.
# html_additional_pages = {}

# If false, no module index is generated.
# html_domain_indices = True

# If false, no index is generated.
html_use_index = False

# If true, the index is split into individual pages for each letter.
# html_split_index = False

# If true, links to the reST sources are added to the pages.
html_show_sourcelink = False

# If true, "Created using Sphinx" is shown in the HTML footer. Default is True.
# html_show_sphinx = True

# If true, "(C) Copyright ..." is shown in the HTML footer. Default is True.
# html_show_copyright = True

# If true, an OpenSearch description file will be output, and all pages will
# contain a <link> tag referring to it. The value of this option must be the
# base URL from which the finished HTML is served.
# html_use_opensearch = ''

# This is the file name suffix for HTML files (e.g. ".xhtml").
# html_file_suffix = None

# Output file base name for HTML help builder.
htmlhelp_basename = 'ops-guide'

# If true, publish source files
html_copy_source = False

# -- Options for LaTeX output ---------------------------------------------

latex_elements = {
    # The paper size ('letterpaper' or 'a4paper').
    # 'papersize': 'letterpaper',

    # The font size ('10pt', '11pt' or '12pt').
    # 'pointsize': '10pt',

    # Additional stuff for the LaTeX preamble.
    # 'preamble': '',
}

# Grouping the document tree into LaTeX files. List of tuples
# (source start file, target name, title,
# author, documentclass [howto, manual, or own class]).
latex_documents = [
    ('index', 'OpsGuide.tex', u'Operations Guide',
     u'OpenStack contributors', 'manual'),
]

# The name of an image file (relative to this directory) to place at the top of
# the title page.
# latex_logo = None

# For "manual" documents, if this is true, then toplevel headings are parts,
# not chapters.
# latex_use_parts = False

# If true, show page references after internal links.
# latex_show_pagerefs = False

# If true, show URL addresses after external links.
# latex_show_urls = False

# Documents to append as an appendix to all manuals.
# latex_appendices = []

# If false, no module index is generated.
# latex_domain_indices = True


# -- Options for manual page output ---------------------------------------

# One entry per manual page. List of tuples
# (source start file, name, description, authors, manual section).
man_pages = [
    ('index', 'opsguide', u'Operations Guide',
     [u'OpenStack contributors'], 1)
]

# If true, show URL addresses after external links.
# man_show_urls = False


# -- Options for Texinfo output -------------------------------------------

# Grouping the document tree into Texinfo files. List of tuples
# (source start file, target name, title, author,
# dir menu entry, description, category)
texinfo_documents = [
    ('index', 'OpsGuide', u'Operations Guide',
     u'OpenStack contributors', 'OpsGuide',
     'This book provides information about designing and operating '
     'OpenStack clouds.', 'Miscellaneous'),
]

# Documents to append as an appendix to all manuals.
# texinfo_appendices = []

# If false, no module index is generated.
# texinfo_domain_indices = True

# How to display URL addresses: 'footnote', 'no', or 'inline'.
# texinfo_show_urls = 'footnote'

# If true, do not generate a @detailmenu in the "Top" node's menu.
# texinfo_no_detailmenu = False

# -- Options for Internationalization output ------------------------------
locale_dirs = ['locale/']
@ -0,0 +1,60 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!-- Created with Inkscape (http://www.inkscape.org/) -->
<svg
   xmlns:dc="http://purl.org/dc/elements/1.1/"
   xmlns:cc="http://web.resource.org/cc/"
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:svg="http://www.w3.org/2000/svg"
   xmlns="http://www.w3.org/2000/svg"
   xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd"
   xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape"
   width="19.21315"
   height="18.294994"
   id="svg2"
   sodipodi:version="0.32"
   inkscape:version="0.45"
   sodipodi:modified="true"
   version="1.0">
  <defs
     id="defs4" />
  <sodipodi:namedview
     id="base"
     pagecolor="#ffffff"
     bordercolor="#666666"
     borderopacity="1.0"
     gridtolerance="10000"
     guidetolerance="10"
     objecttolerance="10"
     inkscape:pageopacity="0.0"
     inkscape:pageshadow="2"
     inkscape:zoom="7.9195959"
     inkscape:cx="17.757032"
     inkscape:cy="7.298821"
     inkscape:document-units="px"
     inkscape:current-layer="layer1"
     inkscape:window-width="984"
     inkscape:window-height="852"
     inkscape:window-x="148"
     inkscape:window-y="66" />
  <metadata
     id="metadata7">
    <rdf:RDF>
      <cc:Work
         rdf:about="">
        <dc:format>image/svg+xml</dc:format>
        <dc:type
           rdf:resource="http://purl.org/dc/dcmitype/StillImage" />
      </cc:Work>
    </rdf:RDF>
  </metadata>
  <g
     inkscape:label="Layer 1"
     inkscape:groupmode="layer"
     id="layer1"
     transform="translate(-192.905,-516.02064)">
    <path
       style="fill:#000000"
       d="M 197.67968,534.31563 C 197.40468,534.31208 196.21788,532.53719 195.04234,530.37143 L 192.905,526.43368 L 193.45901,525.87968 C 193.76371,525.57497 194.58269,525.32567 195.27896,525.32567 L 196.5449,525.32567 L 197.18129,527.33076 L 197.81768,529.33584 L 202.88215,523.79451 C 205.66761,520.74678 208.88522,517.75085 210.03239,517.13691 L 212.11815,516.02064 L 207.90871,520.80282 C 205.59351,523.43302 202.45735,527.55085 200.93947,529.95355 C 199.42159,532.35625 197.95468,534.31919 197.67968,534.31563 z "
       id="path2223" />
  </g>
</svg>
@ -0,0 +1,26 @@
==========================
OpenStack Operations Guide
==========================

Abstract
~~~~~~~~

This book provides information about designing and operating OpenStack clouds.


Contents
~~~~~~~~

.. toctree::
   :maxdepth: 2

   acknowledgements.rst
   preface_ops.rst
   architecture.rst
   operations.rst
   app_usecases.rst
   app_crypt.rst
   app_roadmaps.rst
   app_resources.rst
   common/app_support.rst
   common/glossary.rst

@ -0,0 +1,41 @@
==========
Operations
==========

Congratulations! By now, you should have a solid design for your cloud.
We now recommend that you turn to the `OpenStack Installation Guides
<http://docs.openstack.org/index.html#install-guides>`_, which contain
step-by-step instructions on how to manually install the OpenStack
packages and dependencies on your cloud.

While it is important for an operator to be familiar with the steps
involved in deploying OpenStack, we also strongly encourage you to
evaluate configuration-management tools, such as :term:`Puppet` or
:term:`Chef`, which can help automate this deployment process.

In the remainder of this guide, we assume that you have successfully
deployed an OpenStack cloud and are able to perform basic operations
such as adding images, booting instances, and attaching volumes.

As your focus turns to stable operations, we recommend that you skim
the remainder of this book to get a sense of the content. Some of this
content is useful to read in advance so that you can put best practices
into effect to simplify your life in the long run. Other content is more
useful as a reference that you might turn to when an unexpected event
occurs (such as a power failure), or to troubleshoot a particular
problem.

.. toctree::
   :maxdepth: 2

   ops_lay_of_the_land.rst
   ops_projects_users.rst
   ops_user_facing_operations.rst
   ops_maintenance.rst
   ops_network_troubleshooting.rst
   ops_logging_monitoring.rst
   ops_backup_recovery.rst
   ops_customize.rst
   ops_upstream.rst
   ops_advanced_configuration.rst
   ops_upgrades.rst

@ -0,0 +1,163 @@
======================
Advanced Configuration
======================

OpenStack is intended to work well across a variety of installation
flavors, from very small private clouds to large public clouds. To
achieve this, the developers add configuration options to their code
that allow the behavior of the various components to be tweaked
depending on your needs. Unfortunately, it is not possible to cover all
possible deployments with the default configuration values.

At the time of writing, OpenStack has more than 3,000 configuration
options. You can see them documented at the
`OpenStack configuration reference
guide <http://docs.openstack.org/liberty/config-reference/content/config_overview.html>`_.
This chapter cannot hope to document all of these, but we do try to
introduce the important concepts so that you know where to go digging
for more information.

Differences Between Various Drivers
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Many OpenStack projects implement a driver layer, and each of these
drivers will implement its own configuration options. For example, in
OpenStack Compute (nova), there are various hypervisor drivers
implemented—libvirt, xenserver, hyper-v, and vmware, for example. Not
all of these hypervisor drivers have the same features, and each has
different tuning requirements.

.. note::

   The currently implemented hypervisors are listed on the `OpenStack
   documentation
   website <http://docs.openstack.org/liberty/config-reference/content/section_compute-hypervisors.html>`_.
   You can see a matrix of the various features in OpenStack Compute
   (nova) hypervisor drivers in the `Hypervisor support matrix
   <http://docs.openstack.org/developer/nova/support-matrix.html>`_.

The point we are trying to make here is that just because an option
exists doesn't mean that option is relevant to your driver choices.
Normally, the documentation notes which drivers the configuration
applies to.

Implementing Periodic Tasks
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Another common concept across various OpenStack projects is that of
periodic tasks. Periodic tasks are much like cron jobs on traditional
Unix systems, but they are run inside an OpenStack process. For example,
when OpenStack Compute (nova) needs to work out what images it can
remove from its local cache, it runs a periodic task to do this.

Periodic tasks are important to understand because of limitations in the
threading model that OpenStack uses. OpenStack uses cooperative
threading in Python, which means that if something long and complicated
is running, it will block other tasks inside that process from running
unless it voluntarily yields execution to another cooperative thread.

A tangible example of this is the ``nova-compute`` process. In order to
manage the image cache with libvirt, ``nova-compute`` has a periodic
process that scans the contents of the image cache. Part of this scan is
calculating a checksum for each of the images and making sure that
checksum matches what ``nova-compute`` expects it to be. However, images
can be very large, and these checksums can take a long time to generate.
At one point, before it was reported as a bug and fixed,
``nova-compute`` would block on this task and stop responding to RPC
requests. This was visible to users as failures of operations such as
spawning or deleting instances.

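To make the blocking behavior concrete, here is a toy round-robin scheduler built on plain Python generators. This illustrates cooperative scheduling in general, not OpenStack's actual eventlet machinery: the task that performs a long computation without yielding delays every other task until it finishes.

```python
def polite_task(log):
    # Yields control back to the scheduler after each small step.
    for i in range(2):
        log.append(("polite", i))
        yield


def greedy_task(log):
    # A long computation with no yield inside: nothing else in the
    # process can run until it completes, just like an unyielding
    # periodic task.
    total = sum(range(100_000))
    log.append(("greedy", total))
    yield


def run(tasks):
    # Round-robin over the generators until all are exhausted.
    log = []
    gens = [t(log) for t in tasks]
    while gens:
        for g in list(gens):
            try:
                next(g)
            except StopIteration:
                gens.remove(g)
    return log


events = run([greedy_task, polite_task])
# The greedy task's entire computation finishes before the polite task
# gets its first turn.
assert [name for name, _ in events] == ["greedy", "polite", "polite"]
```

In real deployments the "greedy task" was the image-cache checksum scan described above; the fix was to yield periodically during long operations.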
The takeaway from this is that if you observe an OpenStack process that
appears to "stop" for a while and then continues to process normally,
you should check whether periodic tasks are the problem. One way to do
this is to disable the periodic tasks by setting their interval to zero.
Additionally, you can configure how often these periodic tasks run—in
some cases, it might make sense to run them at a different frequency
from the default.

The frequency is defined separately for each periodic task. Therefore,
to disable every periodic task in OpenStack Compute (nova), you would
need to set a number of configuration options to zero. The current list
of configuration options you would need to set to zero is:

- ``bandwidth_poll_interval``
- ``sync_power_state_interval``
- ``heal_instance_info_cache_interval``
- ``host_state_interval``
- ``image_cache_manager_interval``
- ``reclaim_instance_interval``
- ``volume_usage_poll_interval``
- ``shelved_poll_interval``
- ``shelved_offload_time``
- ``instance_delete_interval``

To set a configuration option to zero, include a line such as
``image_cache_manager_interval=0`` in your ``nova.conf`` file.

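For instance, a minimal sketch of the relevant ``nova.conf`` lines, assuming these options live in the ``[DEFAULT]`` section as they did in releases of this era (zero only the tasks you have actually identified as a problem):

```ini
[DEFAULT]
# Disable selected nova periodic tasks by zeroing their intervals
bandwidth_poll_interval = 0
sync_power_state_interval = 0
heal_instance_info_cache_interval = 0
image_cache_manager_interval = 0
```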
|
||||||
|
This list will change between releases, so please refer to your
|
||||||
|
configuration guide for up-to-date information.

Specific Configuration Topics
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This section covers specific examples of configuration options you might
consider tuning. It is by no means an exhaustive list.

Security Configuration for Compute, Networking, and Storage
-----------------------------------------------------------

The `OpenStack Security Guide <http://docs.openstack.org/sec/>`_
provides a deep dive into securing an OpenStack cloud, including
SSL/TLS, key management, PKI and certificate management, data transport
and privacy concerns, and compliance.

High Availability
-----------------

The `OpenStack High Availability
Guide <http://docs.openstack.org/ha-guide/index.html>`_ offers
suggestions for elimination of a single point of failure that could
cause system downtime. While it is not a completely prescriptive
document, it offers methods and techniques for avoiding downtime and
data loss.

Enabling IPv6 Support
---------------------

You can follow the progress being made on IPv6 support by watching the
`neutron IPv6 Subteam at
work <https://wiki.openstack.org/wiki/Meetings/Neutron-IPv6-Subteam>`_.

By modifying your configuration setup, you can set up IPv6 when using
``nova-network`` for networking, and a tested setup is documented for
FlatDHCP and a multi-host configuration. The key is to make
``nova-network`` think a ``radvd`` command ran successfully. The entire
configuration is detailed in a Cybera blog post, `“An IPv6 enabled
cloud” <http://www.cybera.ca/news-and-events/tech-radar/an-ipv6-enabled-cloud/>`_.

Geographical Considerations for Object Storage
----------------------------------------------

Support for global clustering of object storage servers is available for
all supported releases. You would implement these global clusters to
ensure replication across geographic areas in case of a natural disaster
and also to ensure that users can write or access their objects more
quickly based on the closest data center. You configure a default region
with one zone for each cluster, but be sure your network (WAN) can
handle the additional request and response load between zones as you add
more zones and build a ring that handles more zones. Refer to
`Geographically Distributed
Clusters <http://docs.openstack.org/developer/swift/admin_guide.html#geographically-distributed-clusters>`_
in the documentation for additional information.
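
As a sketch of the ring configuration involved (IP addresses, ports,
device names, and weights below are placeholders, not values from this
guide), devices in each geographic cluster are added to the ring builder
with a region prefix:

.. code-block:: console

   $ swift-ring-builder object.builder add r1z1-10.0.0.1:6000/sda 100
   $ swift-ring-builder object.builder add r2z1-10.1.0.1:6000/sda 100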

===================
Backup and Recovery
===================

Standard backup best practices apply when creating your OpenStack backup
policy. For example, how often to back up your data is closely related
to how quickly you need to recover from data loss.

.. note::

   If you cannot have any data loss at all, you should also focus on a
   highly available deployment. The `OpenStack High Availability
   Guide <http://docs.openstack.org/ha-guide/index.html>`_ offers
   suggestions for elimination of a single point of failure that could
   cause system downtime. While it is not a completely prescriptive
   document, it offers methods and techniques for avoiding downtime and
   data loss.

Other backup considerations include:

- How many backups to keep?

- Should backups be kept off-site?

- How often should backups be tested?

Just as important as a backup policy is a recovery policy (or at least
recovery testing).

What to Back Up
~~~~~~~~~~~~~~~

While OpenStack is composed of many components and moving parts, backing
up the critical data is quite simple.

This chapter describes only how to back up configuration files and
databases that the various OpenStack components need to run. This
chapter does not describe how to back up objects inside Object Storage
or data contained inside Block Storage. Generally these areas are left
for users to back up on their own.

Database Backups
~~~~~~~~~~~~~~~~

The example OpenStack architecture designates the cloud controller as
the MySQL server. This MySQL server hosts the databases for nova,
glance, cinder, and keystone. With all of these databases in one place,
it's very easy to create a database backup:

.. code-block:: console

   # mysqldump --opt --all-databases > openstack.sql

If you want to back up only a single database, you can instead run:

.. code-block:: console

   # mysqldump --opt nova > nova.sql

where ``nova`` is the database you want to back up.

You can easily automate this process by creating a cron job that runs
the following script once per day:

.. code-block:: bash

   #!/bin/bash
   backup_dir="/var/lib/backups/mysql"
   filename="${backup_dir}/mysql-$(hostname)-$(date +%Y%m%d).sql.gz"
   # Dump the entire MySQL database
   /usr/bin/mysqldump --opt --all-databases | gzip > "$filename"
   # Delete backups older than 7 days
   find "$backup_dir" -ctime +7 -type f -delete

This script dumps the entire MySQL database and deletes any backups
older than seven days.
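
Since a silently corrupt dump is easy to miss, a quick integrity check
can be added. This sketch uses a scratch directory and a dummy archive so
it runs anywhere; in production, point ``backup_dir`` at
``/var/lib/backups/mysql``:

.. code-block:: bash

   #!/bin/bash
   # Illustrative integrity check for the newest backup archive.
   backup_dir=$(mktemp -d)
   echo "-- dump --" | gzip > "${backup_dir}/mysql-example-20160101.sql.gz"
   # Most recently modified backup file
   latest=$(ls -t "${backup_dir}"/*.sql.gz | head -1)
   # gzip -t exits non-zero if the archive is corrupt
   gzip -t "$latest" && echo "backup OK: $latest"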

File System Backups
~~~~~~~~~~~~~~~~~~~

This section discusses which files and directories should be backed up
regularly, organized by service.

Compute
-------

The ``/etc/nova`` directory on both the cloud controller and compute
nodes should be regularly backed up.

``/var/log/nova`` does not need to be backed up if you have all logs
going to a central area. It is highly recommended to use a central
logging server or back up the log directory.

``/var/lib/nova`` is another important directory to back up. The
exception to this is the ``/var/lib/nova/instances`` subdirectory on
compute nodes. This subdirectory contains the KVM images of running
instances. You would want to back up this directory only if you need to
maintain backup copies of all instances. Under most circumstances, you
do not need to do this, but this can vary from cloud to cloud and your
service levels. Also be aware that making a backup of a live KVM
instance can cause that instance to not boot properly if it is ever
restored from a backup.

Image Catalog and Delivery
--------------------------

``/etc/glance`` and ``/var/log/glance`` follow the same rules as their
nova counterparts.

``/var/lib/glance`` should also be backed up. Take special notice of
``/var/lib/glance/images``. If you are using a file-based back end for
glance, ``/var/lib/glance/images`` is where the images are stored and
care should be taken.

There are two ways to ensure stability with this directory. The first is
to make sure this directory is run on a RAID array. If a disk fails, the
directory is available. The second way is to use a tool such as rsync to
replicate the images to another server:

.. code-block:: console

   # rsync -az --progress /var/lib/glance/images \
     backup-server:/var/lib/glance/images/

Identity
--------

``/etc/keystone`` and ``/var/log/keystone`` follow the same rules as
other components.

``/var/lib/keystone``, although it should not contain any data being
used, can also be backed up just in case.

Block Storage
-------------

``/etc/cinder`` and ``/var/log/cinder`` follow the same rules as other
components.

``/var/lib/cinder`` should also be backed up.

Object Storage
--------------

``/etc/swift`` is very important to have backed up. This directory
contains the swift configuration files as well as the ring files and
ring :term:`builder files <builder file>`, which, if lost, render the data
on your cluster inaccessible. A best practice is to copy the builder files
to all storage nodes along with the ring files, so that multiple backup
copies are spread throughout your storage cluster.
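
For example (the host name is a placeholder), the builder and ring files
can be pushed to a storage node with a tool such as rsync:

.. code-block:: console

   # rsync -az /etc/swift/*.builder /etc/swift/*.ring.gz \
     storage-node-01:/etc/swift/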

Recovering Backups
~~~~~~~~~~~~~~~~~~

Recovering backups is a fairly simple process. To begin, first ensure
that the service you are recovering is not running. For example, to do a
full recovery of ``nova`` on the cloud controller, first stop all
``nova`` services:

.. code-block:: console

   # stop nova-api
   # stop nova-cert
   # stop nova-consoleauth
   # stop nova-novncproxy
   # stop nova-objectstore
   # stop nova-scheduler

Now you can import a previously backed-up database:

.. code-block:: console

   # mysql nova < nova.sql

You can also restore backed-up nova directories:

.. code-block:: console

   # mv /etc/nova{,.orig}
   # cp -a /path/to/backup/nova /etc/

Once the files are restored, start everything back up:

.. code-block:: console

   # start mysql
   # for i in nova-api nova-cert nova-consoleauth nova-novncproxy \
       nova-objectstore nova-scheduler
   > do
   >     start $i
   > done

Other services follow the same process, with their respective
directories and databases.

Summary
~~~~~~~

Backup and subsequent recovery is one of the first tasks system
administrators learn. However, each system has different items that need
attention. By taking care of your database, image service, and
appropriate file system locations, you can be assured that you can
handle any event requiring recovery.

=============
Customization
=============

OpenStack might not do everything you need it to do out of the box. To
add a new feature, you can follow different paths.

To take the first path, you can modify the OpenStack code directly.
Learn `how to
contribute <https://wiki.openstack.org/wiki/How_To_Contribute>`_,
follow the `code review
workflow <https://wiki.openstack.org/wiki/GerritWorkflow>`_, make your
changes, and contribute them back to the upstream OpenStack project.
This path is recommended if the feature you need requires deep
integration with an existing project. The community is always open to
contributions and welcomes new functionality that follows the
feature-development guidelines. This path still requires you to use
DevStack for testing your feature additions, so this chapter walks you
through the DevStack environment.

For the second path, you can write new features and plug them in using
changes to a configuration file. If the project where your feature would
need to reside uses the Python Paste framework, you can create
middleware for it and plug it in through configuration. There may also
be specific ways of customizing a project, such as creating a new
scheduler driver for Compute or a custom tab for the dashboard.

This chapter focuses on the second path for customizing OpenStack by
providing two examples for writing new features. The first example shows
how to modify Object Storage (swift) middleware to add a new feature,
and the second example provides a new scheduler feature for OpenStack
Compute (nova). To customize OpenStack this way, you need a development
environment. The best way to get an environment up and running quickly
is to run DevStack within your cloud.

Create an OpenStack Development Environment
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To create a development environment, you can use DevStack. DevStack is
essentially a collection of shell scripts and configuration files that
builds an OpenStack development environment for you. You use it to
create such an environment for developing a new feature.

You can find all of the documentation at the
`DevStack <http://docs.openstack.org/developer/devstack/>`_ website.

**To run DevStack on an instance in your OpenStack cloud:**

#. Boot an instance from the dashboard or the nova command-line interface
   (CLI) with the following parameters:

   - Name: devstack

   - Image: Ubuntu 14.04 LTS

   - Memory Size: 4 GB RAM

   - Disk Size: minimum 5 GB

   If you are using the ``nova`` client, specify :option:`--flavor 3` for the
   :command:`nova boot` command to get adequate memory and disk sizes.

#. Log in and set up DevStack. Here's an example of the commands you can
   use to set up DevStack on a virtual machine:

   #. Log in to the instance:

      .. code-block:: console

         $ ssh username@my.instance.ip.address

   #. Update the virtual machine's operating system:

      .. code-block:: console

         # apt-get -y update

   #. Install git:

      .. code-block:: console

         # apt-get -y install git

   #. Clone the ``devstack`` repository:

      .. code-block:: console

         $ git clone https://git.openstack.org/openstack-dev/devstack

   #. Change to the ``devstack`` repository:

      .. code-block:: console

         $ cd devstack

#. (Optional) If you've logged in to your instance as the root user, you
   must create a "stack" user; otherwise you'll run into permission issues.
   If you've logged in as a user other than root, you can skip these steps:

   #. Run the DevStack script to create the stack user:

      .. code-block:: console

         # tools/create-stack-user.sh

   #. Give ownership of the ``devstack`` directory to the stack user:

      .. code-block:: console

         # chown -R stack:stack /root/devstack

   #. Set some permissions you can use to view the DevStack screen later:

      .. code-block:: console

         # chmod o+rwx /dev/pts/0

   #. Switch to the stack user:

      .. code-block:: console

         $ su stack

#. Edit the ``local.conf`` configuration file that controls what DevStack
   will deploy. Copy the example ``local.conf`` file at the end of this
   section (:ref:`local.conf`):

   .. code-block:: console

      $ vim local.conf

#. Run the stack script that will install OpenStack:

   .. code-block:: console

      $ ./stack.sh

#. When the stack script is done, you can open the screen session it
   started to view all of the running OpenStack services:

   .. code-block:: console

      $ screen -r stack

#. Press ``Ctrl+A`` followed by 0 to go to the first ``screen`` window.

.. note::

   - The ``stack.sh`` script takes a while to run. Perhaps you can
     take this opportunity to `join the OpenStack
     Foundation <https://www.openstack.org/join/>`__.

   - ``Screen`` is a useful program for viewing many related services
     at once. For more information, see the `GNU screen quick
     reference <http://aperiodic.net/screen/quick_reference>`__.

Now that you have an OpenStack development environment, you're free to
hack around without worrying about damaging your production deployment.
:ref:`local.conf` provides a working environment for
running OpenStack Identity, Compute, Block Storage, Image service, the
OpenStack dashboard, and Object Storage as the starting point.

.. _local.conf:

local.conf
----------

.. code-block:: bash

   [[local|localrc]]
   FLOATING_RANGE=192.168.1.224/27
   FIXED_RANGE=10.11.12.0/24
   FIXED_NETWORK_SIZE=256
   FLAT_INTERFACE=eth0
   ADMIN_PASSWORD=supersecret
   DATABASE_PASSWORD=iheartdatabases
   RABBIT_PASSWORD=flopsymopsy
   SERVICE_PASSWORD=iheartksl
   SERVICE_TOKEN=xyzpdqlazydog

Customizing Object Storage (Swift) Middleware
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

OpenStack Object Storage, known as swift when reading the code, is based
on the Python `Paste <http://pythonpaste.org/>`_ framework. The best
introduction to its architecture is `A Do-It-Yourself
Framework <http://pythonpaste.org/do-it-yourself-framework.html>`_.
Because of the swift project's use of this framework, you are able to
add features to a project by placing some custom code in a project's
pipeline without having to change any of the core code.

Imagine a scenario where you have public access to one of your
containers, but what you really want is to restrict access to a
set of IPs based on a whitelist. In this example, we'll create a piece
of middleware for swift that allows access to a container from only a
set of IP addresses, as determined by the container's metadata items.
Only those IP addresses that you explicitly whitelist using the
container's metadata will be able to access the container.

.. warning::

   This example is for illustrative purposes only. It should not be
   used as a container IP whitelist solution without further
   development and extensive security testing.

When you join the screen session that ``stack.sh`` starts with
``screen -r stack``, you see a screen for each service running, which
can be a few or several, depending on how many services you configured
DevStack to run.

The asterisk (*) indicates which screen window you are viewing. This
example shows we are viewing the key (for keystone) screen window:

.. code-block:: console

   0$ shell  1$ key*  2$ horizon  3$ s-proxy  4$ s-object  5$ s-container  6$ s-account

The purpose of each screen window is as follows:

``shell``
   A shell where you can get some work done

``key*``
   The keystone service

``horizon``
   The horizon dashboard web application

``s-{name}``
   The swift services

**To create the middleware and plug it in through Paste configuration:**

All of the code for OpenStack lives in ``/opt/stack``. Go to the swift
directory in the ``shell`` screen and edit your middleware module.

#. Change to the directory where Object Storage is installed:

   .. code-block:: console

      $ cd /opt/stack/swift

#. Create the ``ip_whitelist.py`` Python source code file:

   .. code-block:: console

      $ vim swift/common/middleware/ip_whitelist.py

#. Copy the code as shown below into ``ip_whitelist.py``.
   The following code is a middleware example that
   restricts access to a container based on IP address as explained at the
   beginning of the section. Middleware passes the request on to another
   application. This example uses the swift "swob" library to wrap Web
   Server Gateway Interface (WSGI) requests and responses into objects for
   swift to interact with. When you're done, save and close the file.

   .. code-block:: python

      # vim: tabstop=4 shiftwidth=4 softtabstop=4
      # Copyright (c) 2014 OpenStack Foundation
      # All Rights Reserved.
      #
      # Licensed under the Apache License, Version 2.0 (the "License"); you may
      # not use this file except in compliance with the License. You may obtain
      # a copy of the License at
      #
      #     http://www.apache.org/licenses/LICENSE-2.0
      #
      # Unless required by applicable law or agreed to in writing, software
      # distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
      # WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
      # License for the specific language governing permissions and limitations
      # under the License.

      import socket

      from swift.common.utils import get_logger
      from swift.proxy.controllers.base import get_container_info
      from swift.common.swob import Request, Response


      class IPWhitelistMiddleware(object):
          """
          IP Whitelist Middleware

          Middleware that allows access to a container from only a set of IP
          addresses as determined by the container's metadata items that start
          with the prefix 'allow'. E.g. allow-dev=192.168.0.20
          """

          def __init__(self, app, conf, logger=None):
              self.app = app

              if logger:
                  self.logger = logger
              else:
                  self.logger = get_logger(conf, log_route='ip_whitelist')

              self.deny_message = conf.get('deny_message', "IP Denied")
              self.local_ip = socket.gethostbyname(socket.gethostname())

          def __call__(self, env, start_response):
              """
              WSGI entry point.
              Wraps env in swob.Request object and passes it down.

              :param env: WSGI environment dictionary
              :param start_response: WSGI callable
              """
              req = Request(env)

              try:
                  version, account, container, obj = req.split_path(1, 4, True)
              except ValueError:
                  return self.app(env, start_response)

              container_info = get_container_info(
                  req.environ, self.app, swift_source='IPWhitelistMiddleware')

              remote_ip = env['REMOTE_ADDR']
              self.logger.debug("Remote IP: %(remote_ip)s",
                                {'remote_ip': remote_ip})

              meta = container_info['meta']
              allow = {k: v for k, v in meta.iteritems() if k.startswith('allow')}
              allow_ips = set(allow.values())
              allow_ips.add(self.local_ip)
              self.logger.debug("Allow IPs: %(allow_ips)s",
                                {'allow_ips': allow_ips})

              if remote_ip in allow_ips:
                  return self.app(env, start_response)
              else:
                  self.logger.debug(
                      "IP %(remote_ip)s denied access to Account=%(account)s "
                      "Container=%(container)s. Not in %(allow_ips)s", locals())
                  return Response(
                      status=403,
                      body=self.deny_message,
                      request=req)(env, start_response)


      def filter_factory(global_conf, **local_conf):
          """
          paste.deploy app factory for creating WSGI proxy apps.
          """
          conf = global_conf.copy()
          conf.update(local_conf)

          def ip_whitelist(app):
              return IPWhitelistMiddleware(app, conf)
          return ip_whitelist

   There is a lot of useful information in ``env`` and ``conf`` that you
   can use to decide what to do with the request. To find out more about
   what properties are available, you can insert the following log
   statement into the ``__init__`` method:

   .. code-block:: python

      self.logger.debug("conf = %(conf)s", locals())

   and the following log statement into the ``__call__`` method:

   .. code-block:: python

      self.logger.debug("env = %(env)s", locals())

#. To plug this middleware into the swift Paste pipeline, you edit one
   configuration file, ``/etc/swift/proxy-server.conf``:

   .. code-block:: console

      $ vim /etc/swift/proxy-server.conf

#. Find the ``[filter:ratelimit]`` section in
   ``/etc/swift/proxy-server.conf``, and copy in the following
   configuration section after it:

   .. code-block:: ini

      [filter:ip_whitelist]
      paste.filter_factory = swift.common.middleware.ip_whitelist:filter_factory
      # You can override the default log routing for this filter here:
      # set log_name = ratelimit
      # set log_facility = LOG_LOCAL0
      # set log_level = INFO
      # set log_headers = False
      # set log_address = /dev/log
      deny_message = You shall not pass!

#. Find the ``[pipeline:main]`` section in
   ``/etc/swift/proxy-server.conf``, and add ``ip_whitelist`` after
   ratelimit to the list like so. When you're done, save and close the
   file:

   .. code-block:: ini

      [pipeline:main]
      pipeline = catch_errors gatekeeper healthcheck proxy-logging cache bulk tempurl ratelimit ip_whitelist ...

#. Restart the ``swift proxy`` service to make swift use your middleware.
   Start by switching to the ``swift-proxy`` screen:

   #. Press **Ctrl+A** followed by 3.

   #. Press **Ctrl+C** to kill the service.

   #. Press Up Arrow to bring up the last command.

   #. Press Enter to run it.

#. Test your middleware with the ``swift`` CLI. Start by switching to the
   shell screen and finish by switching back to the ``swift-proxy`` screen
   to check the log output:

   #. Press **Ctrl+A** followed by 0.

   #. Make sure you're in the ``devstack`` directory:

      .. code-block:: console

         $ cd /root/devstack

   #. Source openrc to set up your environment variables for the CLI:

      .. code-block:: console

         $ source openrc

   #. Create a container called ``middleware-test``:

      .. code-block:: console

         $ swift post middleware-test

   #. Press **Ctrl+A** followed by 3 to check the log output.

   #. Among the log statements you'll see the lines:

      .. code-block:: console

         proxy-server Remote IP: my.instance.ip.address (txn: ...)
         proxy-server Allow IPs: set(['my.instance.ip.address']) (txn: ...)

      These two statements are produced by our middleware and show that the
      request was sent from our DevStack instance and was allowed.

#. Test the middleware from outside DevStack on a remote machine that has
   access to your DevStack instance:

   #. Install the ``keystone`` and ``swift`` clients on your local machine:

      .. code-block:: console

         # pip install python-keystoneclient python-swiftclient

   #. Attempt to list the objects in the ``middleware-test`` container:

      .. code-block:: console

         $ swift --os-auth-url=http://my.instance.ip.address:5000/v2.0/ \
           --os-region-name=RegionOne --os-username=demo:demo \
           --os-password=devstack list middleware-test
         Container GET failed: http://my.instance.ip.address:8080/v1/AUTH_.../
          middleware-test?format=json 403 Forbidden   You shall not pass!

   #. Press **Ctrl+A** followed by 3 to check the log output. Look at the
      swift log statements again, and among the log statements, you'll see the
      lines:

      .. code-block:: console

         proxy-server Authorizing from an overriding middleware (i.e: tempurl) (txn: ...)
         proxy-server ... IPWhitelistMiddleware
         proxy-server Remote IP: my.local.ip.address (txn: ...)
         proxy-server Allow IPs: set(['my.instance.ip.address']) (txn: ...)
         proxy-server IP my.local.ip.address denied access to Account=AUTH_... \
            Container=None. Not in set(['my.instance.ip.address']) (txn: ...)

      Here we can see that the request was denied because the remote IP
      address wasn't in the set of allowed IPs.

#. Back in your DevStack instance on the shell screen, add some metadata to
   your container to allow the request from the remote machine:

   #. Press **Ctrl+A** followed by 0.

   #. Add metadata to the container to allow the IP:

      .. code-block:: console

         $ swift post --meta allow-dev:my.local.ip.address middleware-test

   #. Now try the command from Step 10 again and it succeeds. There are no
      objects in the container, so there is nothing to list; however, there is
      also no error to report.

.. warning::

   Functional testing like this is not a replacement for proper unit
   and integration testing, but it serves to get you started.

You can follow a similar pattern in other projects that use the Python
Paste framework. Simply create a middleware module and plug it in
through configuration. The middleware runs in sequence as part of that
project's pipeline and can call out to other services as necessary. No
project core code is touched. Look for a ``pipeline`` value in the
project's ``conf`` or ``ini`` configuration files in ``/etc/<project>``
to identify projects that use Paste.
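
For example, a rough way to spot such files (paths assume default
packaged locations under ``/etc``; output varies by deployment):

.. code-block:: console

   $ grep -r "^pipeline" /etc/swift /etc/glance 2>/dev/null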
|
||||||
|
|
||||||
|
When your middleware is done, we encourage you to open source it and let
|
||||||
|
the community know on the OpenStack mailing list. Perhaps others need
|
||||||
|
the same functionality. They can use your code, provide feedback, and
|
||||||
|
possibly contribute. If enough support exists for it, perhaps you can
|
||||||
|
propose that it be added to the official swift
|
||||||
|
`middleware <https://git.openstack.org/cgit/openstack/swift/tree/swift/common/middleware>`_.
|
||||||
|
|
||||||
|
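The wrap-and-delegate pattern this walkthrough relies on can be sketched in plain WSGI, without Paste itself; ``paste.deploy`` would normally build the pipeline from the config file and call a ``filter_factory`` for you. All names here (``IPWhitelistMiddleware``, ``demo_app``, the hard-coded IP) are illustrative, not swift's actual code:

```python
class IPWhitelistMiddleware(object):
    """Reject requests whose REMOTE_ADDR is not in an allowed set."""

    def __init__(self, app, allowed_ips):
        self.app = app  # the next application in the pipeline
        self.allowed_ips = set(allowed_ips)

    def __call__(self, environ, start_response):
        if environ.get('REMOTE_ADDR') not in self.allowed_ips:
            start_response('403 Forbidden', [('Content-Type', 'text/plain')])
            return [b'Forbidden\n']
        # Delegate to the wrapped application; no core code is touched.
        return self.app(environ, start_response)


def demo_app(environ, start_response):
    """Stand-in for the rest of the pipeline."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'Hello\n']


app = IPWhitelistMiddleware(demo_app, ['10.0.0.5'])
```

Because each filter only holds a reference to the next application, filters compose in any order the ``pipeline`` line specifies.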
Customizing the OpenStack Compute (nova) Scheduler
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Many OpenStack projects allow for customization of specific features
using a driver architecture. You can write a driver that conforms to a
particular interface and plug it in through configuration. For example,
you can easily plug in a new scheduler for Compute. The existing
schedulers for Compute are feature-rich and well documented at
`Scheduling <http://docs.openstack.org/liberty/config-reference/content/section_compute-scheduler.html>`_.
However, depending on your users' use cases, the existing schedulers
might not meet your requirements. You might need to create a new
scheduler.

To create a scheduler, you must inherit from the class
``nova.scheduler.driver.Scheduler``. Of the five methods that you can
override, you *must* override the two methods marked with an asterisk
(\*) below:

- ``update_service_capabilities``

- ``hosts_up``

- ``group_hosts``

- \* ``schedule_run_instance``

- \* ``select_destinations``

To demonstrate customizing OpenStack, we'll create an example of a
Compute scheduler that randomly places an instance on a subset of hosts,
depending on the originating IP address of the request and the prefix of
the hostname. Such an example could be useful when you have a group of
users on a subnet and you want all of their instances to start within
some subset of your hosts.

.. warning::

   This example is for illustrative purposes only. It should not be
   used as a scheduler for Compute without further development and
   testing.

When you join the screen session that ``stack.sh`` starts with
``screen -r stack``, you are greeted with many screen windows:

.. code-block:: console

   0$ shell*  1$ key  2$ horizon  ...  9$ n-api  ...  14$ n-sch ...

``shell``
   A shell where you can get some work done

``key``
   The keystone service

``horizon``
   The horizon dashboard web application

``n-{name}``
   The nova services

``n-sch``
   The nova scheduler service

**To create the scheduler and plug it in through configuration**

#. The code for OpenStack lives in ``/opt/stack``, so go to the ``nova``
   directory and edit your scheduler module. Change to the directory where
   ``nova`` is installed:

   .. code-block:: console

      $ cd /opt/stack/nova

#. Create the ``ip_scheduler.py`` Python source code file:

   .. code-block:: console

      $ vim nova/scheduler/ip_scheduler.py

#. The code shown below is a driver that will
   schedule servers to hosts based on IP address as explained at the
   beginning of the section. Copy the code into ``ip_scheduler.py``. When
   you're done, save and close the file.

   .. code-block:: python

      # vim: tabstop=4 shiftwidth=4 softtabstop=4
      # Copyright (c) 2014 OpenStack Foundation
      # All Rights Reserved.
      #
      # Licensed under the Apache License, Version 2.0 (the "License"); you may
      # not use this file except in compliance with the License. You may obtain
      # a copy of the License at
      #
      #      http://www.apache.org/licenses/LICENSE-2.0
      #
      # Unless required by applicable law or agreed to in writing, software
      # distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
      # WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
      # License for the specific language governing permissions and limitations
      # under the License.

      """
      IP Scheduler implementation
      """

      import random

      from oslo.config import cfg

      from nova.compute import rpcapi as compute_rpcapi
      from nova import exception
      from nova.openstack.common import log as logging
      from nova.openstack.common.gettextutils import _
      from nova.scheduler import driver

      CONF = cfg.CONF
      CONF.import_opt('compute_topic', 'nova.compute.rpcapi')
      LOG = logging.getLogger(__name__)

      class IPScheduler(driver.Scheduler):
          """
          Implements Scheduler as a random node selector based on
          IP address and hostname prefix.
          """

          def __init__(self, *args, **kwargs):
              super(IPScheduler, self).__init__(*args, **kwargs)
              self.compute_rpcapi = compute_rpcapi.ComputeAPI()

          def _filter_hosts(self, request_spec, hosts, filter_properties,
                            hostname_prefix):
              """Filter a list of hosts based on hostname prefix."""

              hosts = [host for host in hosts if host.startswith(hostname_prefix)]
              return hosts

          def _schedule(self, context, topic, request_spec, filter_properties):
              """Picks a host that is up at random."""

              elevated = context.elevated()
              hosts = self.hosts_up(elevated, topic)
              if not hosts:
                  msg = _("Is the appropriate service running?")
                  raise exception.NoValidHost(reason=msg)

              remote_ip = context.remote_address

              if remote_ip.startswith('10.1'):
                  hostname_prefix = 'doc'
              elif remote_ip.startswith('10.2'):
                  hostname_prefix = 'ops'
              else:
                  hostname_prefix = 'dev'

              hosts = self._filter_hosts(request_spec, hosts, filter_properties,
                                         hostname_prefix)
              if not hosts:
                  msg = _("Could not find another compute")
                  raise exception.NoValidHost(reason=msg)

              host = random.choice(hosts)
              LOG.debug("Request from %(remote_ip)s scheduled to %(host)s" % locals())

              return host

          def select_destinations(self, context, request_spec, filter_properties):
              """Selects random destinations."""
              num_instances = request_spec['num_instances']
              # NOTE(timello): Returns a list of dicts with 'host', 'nodename' and
              # 'limits' as keys for compatibility with filter_scheduler.
              dests = []
              for i in range(num_instances):
                  host = self._schedule(context, CONF.compute_topic,
                                        request_spec, filter_properties)
                  host_state = dict(host=host, nodename=None, limits=None)
                  dests.append(host_state)

              if len(dests) < num_instances:
                  raise exception.NoValidHost(reason='')
              return dests

          def schedule_run_instance(self, context, request_spec,
                                    admin_password, injected_files,
                                    requested_networks, is_first_time,
                                    filter_properties, legacy_bdm_in_spec):
              """Create and run an instance or instances."""
              instance_uuids = request_spec.get('instance_uuids')
              for num, instance_uuid in enumerate(instance_uuids):
                  request_spec['instance_properties']['launch_index'] = num
                  try:
                      host = self._schedule(context, CONF.compute_topic,
                                            request_spec, filter_properties)
                      updated_instance = driver.instance_update_db(context,
                                                                   instance_uuid)
                      self.compute_rpcapi.run_instance(context,
                              instance=updated_instance, host=host,
                              requested_networks=requested_networks,
                              injected_files=injected_files,
                              admin_password=admin_password,
                              is_first_time=is_first_time,
                              request_spec=request_spec,
                              filter_properties=filter_properties,
                              legacy_bdm_in_spec=legacy_bdm_in_spec)
                  except Exception as ex:
                      # NOTE(vish): we don't reraise the exception here to make sure
                      #             that all instances in the request get set to
                      #             error properly
                      driver.handle_schedule_error(context, ex, instance_uuid,
                                                   request_spec)

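The host-selection core of ``_schedule`` (map the caller's IP to a hostname prefix, filter the up hosts, then pick one at random) can be exercised on its own. This standalone sketch strips the nova plumbing; ``pick_host`` and its ``seed`` parameter are our additions for testability, not part of the driver:

```python
import random

def pick_host(remote_ip, hosts, seed=None):
    """Mirror the hard-coded 10.1/10.2 branches of the IPScheduler above."""
    if remote_ip.startswith('10.1'):
        prefix = 'doc'
    elif remote_ip.startswith('10.2'):
        prefix = 'ops'
    else:
        prefix = 'dev'
    candidates = [h for h in hosts if h.startswith(prefix)]
    if not candidates:
        # The driver raises exception.NoValidHost here.
        raise LookupError('Could not find another compute')
    # Seedable for reproducible tests; the driver just calls random.choice().
    return random.Random(seed).choice(candidates)

hosts = ['doc01', 'doc02', 'ops01', 'dev01']
```

Requests from 10.1.x.x land only on ``doc``-prefixed hosts, 10.2.x.x on ``ops`` hosts, and everything else on ``dev`` hosts.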
There is a lot of useful information in ``context``, ``request_spec``,
and ``filter_properties`` that you can use to decide where to schedule
the instance. To find out more about what properties are available, you
can insert the following log statements into the
``schedule_run_instance`` method of the scheduler above:

.. code-block:: python

   LOG.debug("context = %(context)s" % {'context': context.__dict__})
   LOG.debug("request_spec = %(request_spec)s" % locals())
   LOG.debug("filter_properties = %(filter_properties)s" % locals())

#. To plug this scheduler into nova, edit one configuration file,
   ``/etc/nova/nova.conf``:

   .. code-block:: console

      $ vim /etc/nova/nova.conf

#. Find the ``scheduler_driver`` config and change it like so:

   .. code-block:: ini

      scheduler_driver=nova.scheduler.ip_scheduler.IPScheduler

#. Restart the nova scheduler service to make nova use your scheduler.
   Start by switching to the ``n-sch`` screen:

   #. Press **Ctrl+A** followed by 9.

   #. Press **Ctrl+A** followed by N until you reach the ``n-sch`` screen.

   #. Press **Ctrl+C** to kill the service.

   #. Press Up Arrow to bring up the last command.

   #. Press Enter to run it.

#. Test your scheduler with the nova CLI. Start by switching to the
   ``shell`` screen and finish by switching back to the ``n-sch`` screen to
   check the log output:

   #. Press **Ctrl+A** followed by 0.

   #. Make sure you're in the ``devstack`` directory:

      .. code-block:: console

         $ cd /root/devstack

   #. Source ``openrc`` to set up your environment variables for the CLI:

      .. code-block:: console

         $ source openrc

   #. Put the image ID for the only installed image into an environment
      variable:

      .. code-block:: console

         $ IMAGE_ID=`nova image-list | egrep cirros | egrep -v "kernel|ramdisk" | awk '{print $2}'`

   #. Boot a test server:

      .. code-block:: console

         $ nova boot --flavor 1 --image $IMAGE_ID scheduler-test

#. Switch back to the ``n-sch`` screen. Among the log statements, you'll
   see the line:

   .. code-block:: console

      2014-01-23 19:57:47.262 DEBUG nova.scheduler.ip_scheduler \
      [req-... demo demo] Request from 162.242.221.84 \
      scheduled to devstack-havana \
      _schedule /opt/stack/nova/nova/scheduler/ip_scheduler.py:76

.. warning::

   Functional testing like this is not a replacement for proper unit
   and integration testing, but it serves to get you started.

A similar pattern can be followed in other projects that use the driver
architecture. Simply create a module and class that conform to the
driver interface and plug it in through configuration. Your code runs
when that feature is used and can call out to other services as
necessary. No project core code is touched. Look for a "driver" value in
the project's ``.conf`` configuration files in ``/etc/<project>`` to
identify projects that use a driver architecture.

When your scheduler is done, we encourage you to open source it and let
the community know on the OpenStack mailing list. Perhaps others need
the same functionality. They can use your code, provide feedback, and
possibly contribute. If enough support exists for it, perhaps you can
propose that it be added to the official Compute
`schedulers <https://git.openstack.org/cgit/openstack/nova/tree/nova/scheduler>`_.

Customizing the Dashboard (Horizon)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The dashboard is based on the Python
`Django <https://www.djangoproject.com/>`_ web application framework.
The best guide to customizing it has already been written and can be
found at `Building on
Horizon <http://docs.openstack.org/developer/horizon/topics/tutorial.html>`_.

Conclusion
~~~~~~~~~~

When operating an OpenStack cloud, you may discover that your users can
be quite demanding. If OpenStack doesn't do what your users need, it may
be up to you to fulfill those requirements. This chapter provided you
with some options for customization and gave you the tools you need to
get started.

===============
Lay of the Land
===============

This chapter helps you set up your working environment and use it to
take a look around your cloud.

Using the OpenStack Dashboard for Administration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

As a cloud administrative user, you can use the OpenStack dashboard to
create and manage projects, users, images, and flavors. Users are
allowed to create and manage images within specified projects and to
share images, depending on the Image service configuration. Typically,
the policy configuration allows admin users only to set quotas and
create and manage services. The dashboard provides an :guilabel:`Admin`
tab with a :guilabel:`System Panel` and an :guilabel:`Identity` tab.
These interfaces give you access to system information and usage as
well as to settings for configuring what
end users can do. Refer to the `OpenStack Administrator
Guide <http://docs.openstack.org/admin-guide/dashboard.html>`_ for
detailed how-to information about using the dashboard as an admin user.

Command-Line Tools
~~~~~~~~~~~~~~~~~~

We recommend using a combination of the OpenStack command-line interface
(CLI) tools and the OpenStack dashboard for administration. Some users
with a background in other cloud technologies may be using the EC2
Compatibility API, which uses naming conventions somewhat different from
the native API. We highlight those differences.

We strongly suggest that you install the command-line clients from the
`Python Package Index <https://pypi.python.org/pypi>`_ (PyPI) instead
of from the distribution packages. The clients are under heavy
development, and it is very likely at any given time that the versions of
the packages distributed by your operating-system vendor are out of
date.

The pip utility is used to manage package installation from the PyPI
archive and is available in the python-pip package in most Linux
distributions. Each OpenStack project has its own client, so depending
on which services your site runs, install some or all of the
following packages:

* python-novaclient (:term:`nova` CLI)
* python-glanceclient (:term:`glance` CLI)
* python-keystoneclient (:term:`keystone` CLI)
* python-cinderclient (:term:`cinder` CLI)
* python-swiftclient (:term:`swift` CLI)
* python-neutronclient (:term:`neutron` CLI)

Installing the Tools
--------------------

To install (or upgrade) a package from the PyPI archive with pip, run
the following command as root:

.. code-block:: console

   # pip install [--upgrade] <package-name>

To remove the package:

.. code-block:: console

   # pip uninstall <package-name>

If you need even newer versions of the clients, pip can install directly
from the upstream git repository using the :option:`-e` flag. You must specify
a name for the Python egg that is installed. For example:

.. code-block:: console

   # pip install -e git+https://git.openstack.org/openstack/python-novaclient#egg=python-novaclient

If you support the EC2 API on your cloud, you should also install the
euca2ools package or some other EC2 API tool so that you can get the
same view your users have. Using EC2 API-based tools is mostly out of
the scope of this guide, though we discuss getting credentials for use
with it.

Administrative Command-Line Tools
---------------------------------

There are also several :command:`*-manage` command-line tools. These are
installed with the project's services on the cloud controller and do not
need to be installed separately:

* :command:`glance-manage`
* :command:`keystone-manage`
* :command:`cinder-manage`

Unlike the CLI tools mentioned above, the :command:`*-manage` tools must
be run from the cloud controller, as root, because they need read access
to the config files such as ``/etc/nova/nova.conf`` and to make queries
directly against the database rather than against the OpenStack
:term:`API endpoints <API endpoint>`.

.. warning::

   The existence of the ``*-manage`` tools is a legacy issue. It is a
   goal of the OpenStack project to eventually migrate all of the
   remaining functionality in the ``*-manage`` tools into the API-based
   tools. Until that day, you need to SSH into the
   :term:`cloud controller node` to perform some maintenance operations
   that require one of the ``*-manage`` tools.

Getting Credentials
-------------------

You must have the appropriate credentials if you want to use the
command-line tools to make queries against your OpenStack cloud. By far,
the easiest way to obtain :term:`authentication` credentials to use with
command-line clients is to use the OpenStack dashboard. Select
:guilabel:`Project`, click the :guilabel:`Project` tab, and click
:guilabel:`Access & Security` on the :guilabel:`Compute` category.
On the :guilabel:`Access & Security` page, click the :guilabel:`API Access`
tab to display two buttons, :guilabel:`Download OpenStack RC File` and
:guilabel:`Download EC2 Credentials`, which let you generate files that
you can source in your shell to populate the environment variables the
command-line tools require to know where your service endpoints and your
authentication information are. The user you logged in to the dashboard
dictates the filename for the openrc file, such as ``demo-openrc.sh``.
When logged in as admin, the file is named ``admin-openrc.sh``.

The generated file looks something like this:

.. code-block:: bash

   #!/bin/bash

   # With the addition of Keystone, to use an openstack cloud you should
   # authenticate against keystone, which returns a **Token** and **Service
   # Catalog**. The catalog contains the endpoint for all services the
   # user/tenant has access to--including nova, glance, keystone, swift.
   #
   # *NOTE*: Using the 2.0 *auth api* does not mean that compute api is 2.0.
   # We use the 1.1 *compute api*
   export OS_AUTH_URL=http://203.0.113.10:5000/v2.0

   # With the addition of Keystone we have standardized on the term **tenant**
   # as the entity that owns the resources.
   export OS_TENANT_ID=98333aba48e756fa8f629c83a818ad57
   export OS_TENANT_NAME="test-project"

   # In addition to the owning entity (tenant), openstack stores the entity
   # performing the action as the **user**.
   export OS_USERNAME=demo

   # With Keystone you pass the keystone password.
   echo "Please enter your OpenStack Password: "
   read -s OS_PASSWORD_INPUT
   export OS_PASSWORD=$OS_PASSWORD_INPUT

.. warning::

   This does not save your password in plain text, which is a good
   thing. But when you source or run the script, it prompts you for
   your password and then stores your response in the environment
   variable ``OS_PASSWORD``. It is important to note that this does
   require interactivity. It is possible to store a value directly in
   the script if you require a noninteractive operation, but you then
   need to be extremely cautious with the security and permissions of
   this file.

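If you script against these variables yourself rather than through a CLI, the same prompt-on-missing behavior takes only a few lines. ``load_openstack_env`` is a hypothetical helper sketched for illustration, not part of any OpenStack client:

```python
import getpass

def load_openstack_env(environ, prompt=getpass.getpass):
    """Collect credentials from sourced openrc variables.

    Falls back to an interactive prompt for the password, mirroring
    the read -s in the generated script.
    """
    creds = {
        'auth_url': environ.get('OS_AUTH_URL'),
        'tenant_name': environ.get('OS_TENANT_NAME'),
        'username': environ.get('OS_USERNAME'),
        'password': environ.get('OS_PASSWORD'),
    }
    if not creds['password']:
        creds['password'] = prompt('OpenStack password: ')
    missing = sorted(k for k, v in creds.items() if not v)
    if missing:
        raise EnvironmentError('missing credentials: %s' % ', '.join(missing))
    return creds
```

Passing ``os.environ`` uses the real shell environment; passing a dict (as the tests do) keeps the helper noninteractive.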
EC2 compatibility credentials can be downloaded by selecting
:guilabel:`Project`, then :guilabel:`Compute`, then
:guilabel:`Access & Security`, then :guilabel:`API Access` to display the
:guilabel:`Download EC2 Credentials` button. Click the button to generate
a ZIP file with server x509 certificates and a shell script fragment.
Create a new directory in a secure location because these are live credentials
containing all the authentication information required to access your
cloud identity, unlike the default ``user-openrc``. Extract the ZIP file
here. You should have ``cacert.pem``, ``cert.pem``, ``ec2rc.sh``, and
``pk.pem``. The ``ec2rc.sh`` is similar to this:

.. code-block:: bash

   #!/bin/bash

   NOVARC=$(readlink -f "${BASH_SOURCE:-${0}}" 2>/dev/null) ||\
       NOVARC=$(python -c 'import os,sys; \
       print os.path.abspath(os.path.realpath(sys.argv[1]))' "${BASH_SOURCE:-${0}}")
   NOVA_KEY_DIR=${NOVARC%/*}
   export EC2_ACCESS_KEY=df7f93ec47e84ef8a347bbb3d598449a
   export EC2_SECRET_KEY=ead2fff9f8a344e489956deacd47e818
   export EC2_URL=http://203.0.113.10:8773/services/Cloud
   export EC2_USER_ID=42 # nova does not use user id, but bundling requires it
   export EC2_PRIVATE_KEY=${NOVA_KEY_DIR}/pk.pem
   export EC2_CERT=${NOVA_KEY_DIR}/cert.pem
   export NOVA_CERT=${NOVA_KEY_DIR}/cacert.pem
   export EUCALYPTUS_CERT=${NOVA_CERT} # euca-bundle-image seems to require this

   alias ec2-bundle-image="ec2-bundle-image --cert $EC2_CERT --privatekey \
       $EC2_PRIVATE_KEY --user 42 --ec2cert $NOVA_CERT"
   alias ec2-upload-bundle="ec2-upload-bundle -a $EC2_ACCESS_KEY -s \
       $EC2_SECRET_KEY --url $S3_URL --ec2cert $NOVA_CERT"

To put the EC2 credentials into your environment, source the
``ec2rc.sh`` file.

Inspecting API Calls
--------------------

The command-line tools can be made to show the OpenStack API calls they
make by passing the :option:`--debug` flag to them. For example:

.. code-block:: console

   # nova --debug list

This example shows the HTTP requests from the client and the responses
from the endpoints, which can be helpful in creating custom tools
written to the OpenStack API.

Using cURL for further inspection
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Underlying the use of the command-line tools is the OpenStack API, which
is a RESTful API that runs over HTTP. There may be cases where you want
to interact with the API directly or need to use it because of a
suspected bug in one of the CLI tools. The best way to do this is to use
a combination of `cURL <http://curl.haxx.se/>`_ and another tool,
such as `jq <http://stedolan.github.io/jq/>`_, to parse the JSON from
the responses.

The first thing you must do is authenticate with the cloud using your
credentials to get an authentication token.

Your credentials are a combination of username, password, and tenant
(project). You can extract these values from the ``openrc.sh`` discussed
above. The token allows you to interact with your other service
endpoints without needing to reauthenticate for every request. Tokens
are typically good for 24 hours, and when the token expires, you are
alerted with a 401 (Unauthorized) response and you can request another
token.

#. Look at your OpenStack service catalog:

   .. code-block:: console

      $ curl -s -X POST http://203.0.113.10:35357/v2.0/tokens \
        -d '{"auth": {"passwordCredentials": {"username":"test-user", \
            "password":"test-password"}, \
            "tenantName":"test-project"}}' \
        -H "Content-type: application/json" | jq .

#. Read through the JSON response to get a feel for how the catalog is
   laid out.

   To make working with subsequent requests easier, store the token in
   an environment variable:

   .. code-block:: console

      $ TOKEN=`curl -s -X POST http://203.0.113.10:35357/v2.0/tokens \
        -d '{"auth": {"passwordCredentials": {"username":"test-user", \
            "password":"test-password"}, \
            "tenantName":"test-project"}}' \
        -H "Content-type: application/json" | jq -r .access.token.id`

   Now you can refer to your token on the command line as ``$TOKEN``.

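The jq filters above can also be expressed with Python's standard ``json`` module. This runs against a trimmed, made-up sample of a Keystone v2.0 token response (real responses carry many more fields, services, and endpoints):

```python
import json

sample = json.loads('''
{
  "access": {
    "token": {"id": "abc123", "expires": "2016-01-06T17:20:38Z"},
    "serviceCatalog": [
      {"name": "nova", "type": "compute",
       "endpoints": [{"publicURL": "http://203.0.113.10:8774/v2/tenant-id"}]}
    ]
  }
}
''')

# Equivalent of: jq -r .access.token.id
token = sample['access']['token']['id']

# Walk the catalog for every public compute endpoint.
compute_urls = [ep['publicURL']
                for svc in sample['access']['serviceCatalog']
                if svc['type'] == 'compute'
                for ep in svc['endpoints']]
```

The same traversal works on the live response body if you feed ``json.loads`` the output of the curl command instead of the sample string.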
#. Pick a service endpoint from your service catalog, such as compute.
   Try a request, for example, listing instances (servers):

   .. code-block:: console

      $ curl -s \
        -H "X-Auth-Token: $TOKEN" \
        http://203.0.113.10:8774/v2/98333aba48e756fa8f629c83a818ad57/servers | jq .

To discover how API requests should be structured, read the `OpenStack
API Reference <http://developer.openstack.org/api-ref.html>`_. To chew
through the responses using jq, see the `jq
Manual <http://stedolan.github.io/jq/manual/>`_.

The ``-s`` flag used in the cURL commands above prevents
the progress meter from being shown. If you are having trouble running
cURL commands, you'll want to remove it. Likewise, to help you
troubleshoot cURL commands, you can include the ``-v`` flag to show you
the verbose output. There are many more extremely useful features in
cURL; refer to the man page for all the options.

Servers and Services
|
||||||
|
--------------------
|
||||||
|
|
||||||
|
As an administrator, you have a few ways to discover what your OpenStack
|
||||||
|
cloud looks like simply by using the OpenStack tools available. This
|
||||||
|
section gives you an idea of how to get an overview of your cloud, its
|
||||||
|
shape, size, and current state.
|
||||||
|
|
||||||
|
First, you can discover what servers belong to your OpenStack cloud by
|
||||||
|
running:
|
||||||
|
|
||||||
|
.. code-block:: console
|
||||||
|
|
   # nova service-list

The output looks like the following:

.. code-block:: console

   +----+------------------+-------------------+------+---------+-------+----------------------------+-----------------+
   | Id | Binary           | Host              | Zone | Status  | State | Updated_at                 | Disabled Reason |
   +----+------------------+-------------------+------+---------+-------+----------------------------+-----------------+
   | 1  | nova-cert        | cloud.example.com | nova | enabled | up    | 2016-01-05T17:20:38.000000 | -               |
   | 2  | nova-compute     | c01.example.com   | nova | enabled | up    | 2016-01-05T17:20:38.000000 | -               |
   | 3  | nova-compute     | c02.example.com   | nova | enabled | up    | 2016-01-05T17:20:38.000000 | -               |
   | 4  | nova-compute     | c03.example.com   | nova | enabled | up    | 2016-01-05T17:20:38.000000 | -               |
   | 5  | nova-compute     | c04.example.com   | nova | enabled | up    | 2016-01-05T17:20:38.000000 | -               |
   | 6  | nova-compute     | c05.example.com   | nova | enabled | up    | 2016-01-05T17:20:38.000000 | -               |
   | 7  | nova-conductor   | cloud.example.com | nova | enabled | up    | 2016-01-05T17:20:38.000000 | -               |
   | 8  | nova-cert        | cloud.example.com | nova | enabled | up    | 2016-01-05T17:20:42.000000 | -               |
   | 9  | nova-scheduler   | cloud.example.com | nova | enabled | up    | 2016-01-05T17:20:38.000000 | -               |
   | 10 | nova-consoleauth | cloud.example.com | nova | enabled | up    | 2016-01-05T17:20:35.000000 | -               |
   +----+------------------+-------------------+------+---------+-------+----------------------------+-----------------+

The output shows five compute nodes and one cloud controller. All
services are in the ``up`` state, which indicates that they are running.
If a service is in the ``down`` state, it is no longer available, and
you should troubleshoot why it is down.

If you are using cinder, run the following command to see a similar
listing:

.. code-block:: console

   # cinder-manage host list | sort
   host               zone
   c01.example.com    nova
   c02.example.com    nova
   c03.example.com    nova
   c04.example.com    nova
   c05.example.com    nova
   cloud.example.com  nova

With these two tables, you now have a good overview of what servers and
services make up your cloud.

You can also use the Identity service (keystone) to see what services
are available in your cloud and what endpoints have been configured for
them.

The following command requires your shell environment to be configured
with the proper administrative variables:

.. code-block:: console

   $ openstack catalog list
   +----------+------------+---------------------------------------------------------------------------------+
   | Name     | Type       | Endpoints                                                                       |
   +----------+------------+---------------------------------------------------------------------------------+
   | nova     | compute    | RegionOne                                                                       |
   |          |            |   publicURL: http://192.168.122.10:8774/v2/9faa845768224258808fc17a1bb27e5e     |
   |          |            |   internalURL: http://192.168.122.10:8774/v2/9faa845768224258808fc17a1bb27e5e   |
   |          |            |   adminURL: http://192.168.122.10:8774/v2/9faa845768224258808fc17a1bb27e5e      |
   |          |            |                                                                                 |
   | cinderv2 | volumev2   | RegionOne                                                                       |
   |          |            |   publicURL: http://192.168.122.10:8776/v2/9faa845768224258808fc17a1bb27e5e     |
   |          |            |   internalURL: http://192.168.122.10:8776/v2/9faa845768224258808fc17a1bb27e5e   |
   |          |            |   adminURL: http://192.168.122.10:8776/v2/9faa845768224258808fc17a1bb27e5e      |
   |          |            |                                                                                 |

The preceding output has been truncated to show only two services. You
will see one service entry for each service that your cloud provides.
Note how the endpoint domain can differ depending on the endpoint type.
Different endpoint domains per type are not required, but they can be
used for reasons such as endpoint privacy or network traffic
segregation.

You can find the version of the Compute installation by using the
nova client command:

.. code-block:: console

   # nova version-list

Diagnose Your Compute Nodes
---------------------------

You can obtain extra per-instance information about running virtual
machines, such as CPU usage, memory, disk I/O, and network I/O, by
running the :command:`nova diagnostics` command with a server ID:

.. code-block:: console

   $ nova diagnostics <serverID>

The output of this command varies depending on the hypervisor, because
hypervisors support different attributes. The following demonstrates
the difference between the two most popular hypervisors.
Here is example output when the hypervisor is Xen:

.. code-block:: console

   +----------------+-----------------+
   | Property       | Value           |
   +----------------+-----------------+
   | cpu0           | 4.3627          |
   | memory         | 1171088064.0000 |
   | memory_target  | 1171088064.0000 |
   | vbd_xvda_read  | 0.0             |
   | vbd_xvda_write | 0.0             |
   | vif_0_rx       | 3223.6870       |
   | vif_0_tx       | 0.0             |
   | vif_1_rx       | 104.4955        |
   | vif_1_tx       | 0.0             |
   +----------------+-----------------+

While the command should work with any hypervisor that is controlled
through libvirt (KVM, QEMU, or LXC), it has been tested only with KVM.
Here is example output when the hypervisor is KVM:

.. code-block:: console

   +------------------+------------+
   | Property         | Value      |
   +------------------+------------+
   | cpu0_time        | 2870000000 |
   | memory           | 524288     |
   | vda_errors       | -1         |
   | vda_read         | 262144     |
   | vda_read_req     | 112        |
   | vda_write        | 5606400    |
   | vda_write_req    | 376        |
   | vnet0_rx         | 63343      |
   | vnet0_rx_drop    | 0          |
   | vnet0_rx_errors  | 0          |
   | vnet0_rx_packets | 431        |
   | vnet0_tx         | 4905       |
   | vnet0_tx_drop    | 0          |
   | vnet0_tx_errors  | 0          |
   | vnet0_tx_packets | 45         |
   +------------------+------------+

Network Inspection
~~~~~~~~~~~~~~~~~~

To see which fixed IP networks are configured in your cloud, you can use
the :command:`nova` command-line client to get the IP ranges:

.. code-block:: console

   $ nova network-list
   +--------------------------------------+--------+--------------+
   | ID                                   | Label  | Cidr         |
   +--------------------------------------+--------+--------------+
   | 3df67919-9600-4ea8-952e-2a7be6f70774 | test01 | 10.1.0.0/24  |
   | 8283efb2-e53d-46e1-a6bd-bb2bdef9cb9a | test02 | 10.1.1.0/24  |
   +--------------------------------------+--------+--------------+

The nova command-line client can provide some additional details:

.. code-block:: console

   # nova network-list
   id IPv4        IPv6 start address DNS1 DNS2 VlanID project uuid
   1  10.1.0.0/24 None 10.1.0.3      None None 300    2725bbd beacb3f2
   2  10.1.1.0/24 None 10.1.1.3      None None 301    none    d0b1a796

This output shows that two networks are configured, each spanning a /24
subnet (256 addresses, of which 254 are usable by hosts). The first
network has been assigned to a certain project, while the second network
is still open for assignment. You can assign this network manually;
otherwise, it is automatically assigned when a project launches its
first instance.
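As a sanity check on those address counts, Python's standard ``ipaddress`` module can compute them for any CIDR (a standalone sketch, not part of the nova client):

```python
import ipaddress

# A /24 fixed-IP network like test01 above.
net = ipaddress.ip_network("10.1.0.0/24")

print(net.num_addresses)       # → 256 (total addresses in the block)
print(len(list(net.hosts())))  # → 254 (network and broadcast excluded)
```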

To find out whether any floating IPs are available in your cloud, run:

.. code-block:: console

   # nova floating-ip-list
   2725bb...59f43f 1.2.3.4  None            nova vlan20
   None            1.2.3.5  48a415...b010ff nova vlan20

Here, two floating IPs are available. The first has been allocated to a
project, while the other is unallocated.

Users and Projects
~~~~~~~~~~~~~~~~~~

To see a list of projects that have been added to the cloud, run:

.. code-block:: console

   $ openstack project list
   +----------------------------------+--------------------+
   | ID                               | Name               |
   +----------------------------------+--------------------+
   | 422c17c0b26f4fbe9449f37a5621a5e6 | alt_demo           |
   | 5dc65773519248f3a580cfe28ba7fa3f | demo               |
   | 9faa845768224258808fc17a1bb27e5e | admin              |
   | a733070a420c4b509784d7ea8f6884f7 | invisible_to_admin |
   | aeb3e976e7794f3f89e4a7965db46c1e | service            |
   +----------------------------------+--------------------+

To see a list of users, run:

.. code-block:: console

   $ openstack user list
   +----------------------------------+----------+
   | ID                               | Name     |
   +----------------------------------+----------+
   | 5837063598694771aedd66aa4cddf0b8 | demo     |
   | 58efd9d852b74b87acc6efafaf31b30e | cinder   |
   | 6845d995a57a441f890abc8f55da8dfb | glance   |
   | ac2d15a1205f46d4837d5336cd4c5f5a | alt_demo |
   | d8f593c3ae2b47289221f17a776a218b | admin    |
   | d959ec0a99e24df0b7cb106ff940df20 | nova     |
   +----------------------------------+----------+

.. note::

   Sometimes a user and a group have a one-to-one mapping. This happens
   for standard system accounts, such as cinder, glance, nova, and
   swift, or when only one user is part of a group.

Running Instances
~~~~~~~~~~~~~~~~~

To see a list of running instances, run:

.. code-block:: console

   $ nova list --all-tenants
   +-----+------------------+--------+-------------------------------------------+
   | ID  | Name             | Status | Networks                                  |
   +-----+------------------+--------+-------------------------------------------+
   | ... | Windows          | ACTIVE | novanetwork_1=10.1.1.3, 199.116.232.39    |
   | ... | cloud controller | ACTIVE | novanetwork_0=10.1.0.6; jtopjian=10.1.2.3 |
   | ... | compute node 1   | ACTIVE | novanetwork_0=10.1.0.4; jtopjian=10.1.2.4 |
   | ... | devbox           | ACTIVE | novanetwork_0=10.1.0.3                    |
   | ... | devstack         | ACTIVE | novanetwork_0=10.1.0.5                    |
   | ... | initial          | ACTIVE | nova_network=10.1.7.4, 10.1.8.4           |
   | ... | lorin-head       | ACTIVE | nova_network=10.1.7.3, 10.1.8.3           |
   +-----+------------------+--------+-------------------------------------------+

Unfortunately, this command does not tell you various details about the
running instances, such as what compute node the instance is running on,
what flavor the instance is, and so on. You can use the following
command to view details about individual instances:

.. code-block:: console

   $ nova show <uuid>

For example:

.. code-block:: console

   # nova show 81db556b-8aa5-427d-a95c-2a9a6972f630
   +-------------------------------------+-----------------------------------+
   | Property                            | Value                             |
   +-------------------------------------+-----------------------------------+
   | OS-DCF:diskConfig                   | MANUAL                            |
   | OS-EXT-SRV-ATTR:host                | c02.example.com                   |
   | OS-EXT-SRV-ATTR:hypervisor_hostname | c02.example.com                   |
   | OS-EXT-SRV-ATTR:instance_name       | instance-00000029                 |
   | OS-EXT-STS:power_state              | 1                                 |
   | OS-EXT-STS:task_state               | None                              |
   | OS-EXT-STS:vm_state                 | active                            |
   | accessIPv4                          |                                   |
   | accessIPv6                          |                                   |
   | config_drive                        |                                   |
   | created                             | 2013-02-13T20:08:36Z              |
   | flavor                              | m1.small (6)                      |
   | hostId                              | ...                               |
   | id                                  | ...                               |
   | image                               | Ubuntu 12.04 cloudimg amd64 (...) |
   | key_name                            | jtopjian-sandbox                  |
   | metadata                            | {}                                |
   | name                                | devstack                          |
   | novanetwork_0 network               | 10.1.0.5                          |
   | progress                            | 0                                 |
   | security_groups                     | [{u'name': u'default'}]           |
   | status                              | ACTIVE                            |
   | tenant_id                           | ...                               |
   | updated                             | 2013-02-13T20:08:59Z              |
   | user_id                             | ...                               |
   +-------------------------------------+-----------------------------------+

This output shows that an instance named ``devstack`` was created from
an Ubuntu 12.04 image using the ``m1.small`` flavor and is hosted on
the compute node ``c02.example.com``.

Summary
~~~~~~~

We hope you have enjoyed this quick tour of your working environment,
including how to interact with your cloud and extract useful
information. From here, you can use the `Administrator
Guide <http://docs.openstack.org/admin-guide/>`_ as your
reference for all of the command-line functionality in your cloud.

======================
Logging and Monitoring
======================

As an OpenStack cloud is composed of so many different services, there
are a large number of log files. This chapter aims to assist you in
locating and working with them, and describes other ways to track the
status of your deployment.

Where Are the Logs?
~~~~~~~~~~~~~~~~~~~

Most services use the convention of writing their log files to
subdirectories of the ``/var/log`` directory, as listed in the table
below.

.. list-table:: OpenStack log locations
   :widths: 33 33 33
   :header-rows: 1

   * - Node type
     - Service
     - Log location
   * - Cloud controller
     - ``nova-*``
     - ``/var/log/nova``
   * - Cloud controller
     - ``glance-*``
     - ``/var/log/glance``
   * - Cloud controller
     - ``cinder-*``
     - ``/var/log/cinder``
   * - Cloud controller
     - ``keystone-*``
     - ``/var/log/keystone``
   * - Cloud controller
     - ``neutron-*``
     - ``/var/log/neutron``
   * - Cloud controller
     - horizon
     - ``/var/log/apache2/``
   * - All nodes
     - misc (swift, dnsmasq)
     - ``/var/log/syslog``
   * - Compute nodes
     - libvirt
     - ``/var/log/libvirt/libvirtd.log``
   * - Compute nodes
     - Console (boot up messages) for VM instances
     - ``/var/lib/nova/instances/instance-<instance id>/console.log``
   * - Block Storage nodes
     - cinder-volume
     - ``/var/log/cinder/cinder-volume.log``

Reading the Logs
~~~~~~~~~~~~~~~~

OpenStack services use the standard logging levels, in increasing
severity: DEBUG, INFO, AUDIT, WARNING, ERROR, CRITICAL, and TRACE. That
is, messages appear in the logs only if they are more "severe" than the
configured log level, with DEBUG allowing all log statements through.
For example, TRACE is logged only if the software has a stack trace,
while INFO is logged for every message, including those that are only
informational.
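OpenStack services build on Python's standard ``logging`` module (AUDIT and TRACE are OpenStack additions to the standard level set). The threshold behavior can be seen with a minimal standalone sketch:

```python
import logging

# Configure the root logger to drop anything less severe than WARNING.
logging.basicConfig(level=logging.WARNING, format="%(levelname)s %(message)s")
log = logging.getLogger("demo")

log.debug("not shown: DEBUG is below the configured level")
log.info("not shown: INFO is below the configured level")
log.warning("shown: WARNING meets the threshold")
log.error("shown: ERROR exceeds the threshold")
```

Lowering the level to ``logging.DEBUG`` lets all four statements through, which is why DEBUG-level logging is so verbose.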

To disable DEBUG-level logging, edit the ``/etc/nova/nova.conf`` file as
follows:

.. code-block:: ini

   debug=false

Keystone is handled a little differently. To modify the logging level,
edit the ``/etc/keystone/logging.conf`` file and look at the
``logger_root`` and ``handler_file`` sections.

Logging for horizon is configured in
``/etc/openstack_dashboard/local_settings.py``. Because horizon is
a Django web application, it follows the `Django logging framework
conventions <https://docs.djangoproject.com/en/dev/topics/logging/>`_.

The first step in finding the source of an error is typically to search
for a CRITICAL, TRACE, or ERROR message, starting at the bottom of the
log file.

Here is an example of a CRITICAL log message, with the corresponding
TRACE (Python traceback) immediately following:

.. code-block:: console

   2013-02-25 21:05:51 17409 CRITICAL cinder [-] Bad or unexpected response from the storage volume backend API: volume group
   cinder-volumes doesn't exist
   2013-02-25 21:05:51 17409 TRACE cinder Traceback (most recent call last):
   2013-02-25 21:05:51 17409 TRACE cinder File "/usr/bin/cinder-volume", line 48, in <module>
   2013-02-25 21:05:51 17409 TRACE cinder service.wait()
   2013-02-25 21:05:51 17409 TRACE cinder File "/usr/lib/python2.7/dist-packages/cinder/service.py", line 422, in wait
   2013-02-25 21:05:51 17409 TRACE cinder _launcher.wait()
   2013-02-25 21:05:51 17409 TRACE cinder File "/usr/lib/python2.7/dist-packages/cinder/service.py", line 127, in wait
   2013-02-25 21:05:51 17409 TRACE cinder service.wait()
   2013-02-25 21:05:51 17409 TRACE cinder File "/usr/lib/python2.7/dist-packages/eventlet/greenthread.py", line 166, in wait
   2013-02-25 21:05:51 17409 TRACE cinder return self._exit_event.wait()
   2013-02-25 21:05:51 17409 TRACE cinder File "/usr/lib/python2.7/dist-packages/eventlet/event.py", line 116, in wait
   2013-02-25 21:05:51 17409 TRACE cinder return hubs.get_hub().switch()
   2013-02-25 21:05:51 17409 TRACE cinder File "/usr/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 177, in switch
   2013-02-25 21:05:51 17409 TRACE cinder return self.greenlet.switch()
   2013-02-25 21:05:51 17409 TRACE cinder File "/usr/lib/python2.7/dist-packages/eventlet/greenthread.py", line 192, in main
   2013-02-25 21:05:51 17409 TRACE cinder result = function(*args, **kwargs)
   2013-02-25 21:05:51 17409 TRACE cinder File "/usr/lib/python2.7/dist-packages/cinder/service.py", line 88, in run_server
   2013-02-25 21:05:51 17409 TRACE cinder server.start()
   2013-02-25 21:05:51 17409 TRACE cinder File "/usr/lib/python2.7/dist-packages/cinder/service.py", line 159, in start
   2013-02-25 21:05:51 17409 TRACE cinder self.manager.init_host()
   2013-02-25 21:05:51 17409 TRACE cinder File "/usr/lib/python2.7/dist-packages/cinder/volume/manager.py", line 95,
   in init_host
   2013-02-25 21:05:51 17409 TRACE cinder self.driver.check_for_setup_error()
   2013-02-25 21:05:51 17409 TRACE cinder File "/usr/lib/python2.7/dist-packages/cinder/volume/driver.py", line 116,
   in check_for_setup_error
   2013-02-25 21:05:51 17409 TRACE cinder raise exception.VolumeBackendAPIException(data=exception_message)
   2013-02-25 21:05:51 17409 TRACE cinder VolumeBackendAPIException: Bad or unexpected response from the storage volume
   backend API: volume group cinder-volumes doesn't exist
   2013-02-25 21:05:51 17409 TRACE cinder

In this example, ``cinder-volumes`` failed to start and has provided a
stack trace, since its volume back end was unable to set up the storage
volume, probably because the LVM volume group that the configuration
expects does not exist.

Here is an example error log:

.. code-block:: console

   2013-02-25 20:26:33 6619 ERROR nova.openstack.common.rpc.common [-] AMQP server on localhost:5672 is unreachable:
   [Errno 111] ECONNREFUSED. Trying again in 23 seconds.

In this error, a nova service has failed to connect to the RabbitMQ
server because it received a connection refused error.

Tracing Instance Requests
~~~~~~~~~~~~~~~~~~~~~~~~~

When an instance fails to behave properly, you will often have to trace
activity associated with that instance across the log files of the
various ``nova-*`` services, on both the cloud controller and the
compute nodes.

The typical way is to trace the UUID associated with an instance across
the service logs.

Consider the following example:

.. code-block:: console

   $ nova list
   +--------------------------------------+--------+--------+---------------------------+
   | ID                                   | Name   | Status | Networks                  |
   +--------------------------------------+--------+--------+---------------------------+
   | faf7ded8-4a46-413b-b113-f19590746ffe | cirros | ACTIVE | novanetwork=192.168.100.3 |
   +--------------------------------------+--------+--------+---------------------------+

Here, the ID associated with the instance is
``faf7ded8-4a46-413b-b113-f19590746ffe``. If you search for this string
on the cloud controller in the ``/var/log/nova-*.log`` files, it appears
in ``nova-api.log`` and ``nova-scheduler.log``. If you search for it on
the compute nodes in ``/var/log/nova-*.log``, it appears in
``nova-network.log`` and ``nova-compute.log``. If no ERROR or CRITICAL
messages appear, the most recent log entry that reports this UUID may
provide a hint about what has gone wrong.
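This per-node search can be scripted. The following sketch scans a directory of log files for an instance UUID; the directory path, file layout, and function name are assumptions to adapt to your deployment:

```python
import glob
import os


def find_instance_lines(log_dir, uuid):
    """Return (filename, line) pairs that mention the given instance UUID."""
    hits = []
    for path in sorted(glob.glob(os.path.join(log_dir, "*.log"))):
        with open(path, errors="replace") as f:
            for line in f:
                if uuid in line:
                    hits.append((os.path.basename(path), line.rstrip()))
    return hits


# Example invocation (hypothetical log directory):
for name, line in find_instance_lines(
        "/var/log/nova", "faf7ded8-4a46-413b-b113-f19590746ffe"):
    print(name, line)
```

Running it on each node in turn gives the same cross-service picture as the manual grep, with the reporting file named next to every matching line.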

Adding Custom Logging Statements
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If there is not enough information in the existing logs, you may need to
add your own custom logging statements to the ``nova-*`` services.

The source files are located in
``/usr/lib/python2.7/dist-packages/nova``.

To add logging statements, the following lines should be near the top of
the file. For most files, they are already there:

.. code-block:: python

   from nova.openstack.common import log as logging
   LOG = logging.getLogger(__name__)

To add a DEBUG logging statement, you would do:

.. code-block:: python

   LOG.debug("This is a custom debugging statement")

You may notice that all the existing logging messages are preceded by an
underscore and surrounded by parentheses, for example:

.. code-block:: python

   LOG.debug(_("Logging statement appears here"))

This formatting is used to support translation of logging messages into
different languages using the
`gettext <https://docs.python.org/2/library/gettext.html>`_
internationalization library. You don't need to do this for your own
custom log messages. However, if you want to contribute code that
includes logging statements back to the OpenStack project, you must
surround your log messages with underscores and parentheses.
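The ``_()`` wrapper is simply gettext's translation function. A minimal standalone illustration using only the standard library (no OpenStack code involved):

```python
import gettext

# Bind gettext's translation function to the conventional _() name.
# With no translation catalog loaded, _() returns its argument
# unchanged, which is why untranslated deployments still log in English.
_ = gettext.NullTranslations().gettext

message = _("Logging statement appears here")
print(message)  # → Logging statement appears here
```

When a compiled message catalog for the operator's locale is installed, the same ``_()`` call returns the translated string instead.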

RabbitMQ Web Management Interface or rabbitmqctl
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Aside from connection failures, RabbitMQ log files are generally not
useful for debugging OpenStack-related issues. Instead, we recommend
using the RabbitMQ web management interface. Enable it on your cloud
controller:

.. code-block:: console

   # /usr/lib/rabbitmq/bin/rabbitmq-plugins enable rabbitmq_management

.. code-block:: console

   # service rabbitmq-server restart

The RabbitMQ web management interface is accessible on your cloud
controller at *http://localhost:55672*.

.. note::

   Ubuntu 12.04 installs RabbitMQ version 2.7.1, which uses port 55672.
   RabbitMQ versions 3.0 and above use port 15672 instead. You can
   check which version of RabbitMQ you have running on your local
   Ubuntu machine by doing:

   .. code-block:: console

      $ dpkg -s rabbitmq-server | grep "Version:"
      Version: 2.7.1-0ubuntu4

An alternative to enabling the RabbitMQ web management interface is to
use the ``rabbitmqctl`` commands. For example,
:command:`rabbitmqctl list_queues | grep cinder` displays any messages
left in the queue. If there are messages, it is a possible sign that
cinder services didn't connect properly to RabbitMQ and might have to be
restarted.

Items to monitor for RabbitMQ include the number of items in each of the
queues and the processing time statistics for the server.

Centrally Managing Logs
~~~~~~~~~~~~~~~~~~~~~~~

Because your cloud is most likely composed of many servers, you must
check the logs on each of those servers to properly piece an event
together. A better solution is to send the logs of all servers to a
central location so that they can all be accessed from the same place.

Ubuntu uses rsyslog as the default logging service. Since rsyslog is
natively able to send logs to a remote location, you don't have to
install anything extra to enable this feature; just modify the
configuration file. In doing this, consider running your logging over a
management network or using an encrypted VPN to avoid interception.

rsyslog Client Configuration
----------------------------

To begin, configure all OpenStack components to log to syslog in
addition to their standard log file location. Also configure each
component to log to a different syslog facility. This makes it easier to
split the logs into individual components on the central server:

``nova.conf``:

.. code-block:: ini

   use_syslog=True
   syslog_log_facility=LOG_LOCAL0

``glance-api.conf`` and ``glance-registry.conf``:

.. code-block:: ini

   use_syslog=True
   syslog_log_facility=LOG_LOCAL1

``cinder.conf``:

.. code-block:: ini

   use_syslog=True
   syslog_log_facility=LOG_LOCAL2

``keystone.conf``:

.. code-block:: ini

   use_syslog=True
   syslog_log_facility=LOG_LOCAL3

By default, Object Storage logs to syslog.

Next, create ``/etc/rsyslog.d/client.conf`` with the following line:

.. code-block:: ini

   *.* @192.168.1.10

This instructs rsyslog to send all logs to the IP listed. In this
example, the IP points to the cloud controller.
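If an application of your own should land in the same central stream, Python's standard ``SysLogHandler`` can emit records to a chosen syslog facility over UDP. This is a sketch: the server address and facility simply mirror the example configuration above, and the logger name is hypothetical:

```python
import logging
import logging.handlers

# Send log records over UDP to the central rsyslog server on port 514,
# tagged with facility local0 (the facility assigned to nova above).
handler = logging.handlers.SysLogHandler(
    address=("192.168.1.10", 514),
    facility=logging.handlers.SysLogHandler.LOG_LOCAL0,
)
log = logging.getLogger("myapp")
log.addHandler(handler)
log.warning("shipped to the central log server")
```

Records sent this way are routed by the server-side templates exactly like the service logs, because routing is keyed on the facility.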

rsyslog Server Configuration
----------------------------

Designate a server as the central logging server. The best practice is
to choose a server that is solely dedicated to this purpose. Create a
file called ``/etc/rsyslog.d/server.conf`` with the following contents:

.. code-block:: ini

   # Enable UDP
   $ModLoad imudp
   # Listen on 192.168.1.10 only
   $UDPServerAddress 192.168.1.10
   # Port 514
   $UDPServerRun 514

   # Create logging templates for nova
   $template NovaFile,"/var/log/rsyslog/%HOSTNAME%/nova.log"
   $template NovaAll,"/var/log/rsyslog/nova.log"

   # Log everything else to syslog.log
   $template DynFile,"/var/log/rsyslog/%HOSTNAME%/syslog.log"
   *.* ?DynFile

   # Log various openstack components to their own individual file
   local0.* ?NovaFile
   local0.* ?NovaAll
   & ~

This example configuration handles the nova service only. It first
configures rsyslog to act as a server that listens on port 514. Next, it
creates a series of logging templates. Logging templates control where
received logs are stored. Using the example above, a nova log from
c01.example.com goes to the following locations:

- ``/var/log/rsyslog/c01.example.com/nova.log``

- ``/var/log/rsyslog/nova.log``

This is useful, as logs from c02.example.com go to:

- ``/var/log/rsyslog/c02.example.com/nova.log``

- ``/var/log/rsyslog/nova.log``

You have an individual log file for each compute node as well as an
aggregated log that contains nova logs from all nodes.

Monitoring
~~~~~~~~~~

There are two types of monitoring: watching for problems and watching
usage trends. The former ensures that all services are up and running,
creating a functional cloud. The latter involves monitoring resource
usage over time in order to make informed decisions about potential
bottlenecks and upgrades.

**Nagios** is an open source monitoring service. It is capable of
executing arbitrary commands to check the status of server and network
services, remotely executing arbitrary commands directly on servers, and
allowing servers to push notifications back in the form of passive
monitoring. Nagios has been around since 1999. Although newer monitoring
services are available, Nagios is a tried-and-true systems
administration staple.
|
||||||
|
|
||||||
|
Process Monitoring
------------------

A basic type of alert monitoring is to simply check and see whether a
required process is running. For example, ensure that the
``nova-api`` service is running on the cloud controller:

.. code-block:: console

   # ps aux | grep nova-api
   nova 12786 0.0 0.0 37952 1312 ?   Ss Feb11 0:00 su -s /bin/sh -c exec nova-api
   --config-file=/etc/nova/nova.conf nova
   nova 12787 0.0 0.1 135764 57400 ? S  Feb11 0:01 /usr/bin/python
   /usr/bin/nova-api --config-file=/etc/nova/nova.conf
   nova 12792 0.0 0.0 96052 22856 ?  S  Feb11 0:01 /usr/bin/python
   /usr/bin/nova-api --config-file=/etc/nova/nova.conf
   nova 12793 0.0 0.3 290688 115516 ? S Feb11 1:23 /usr/bin/python
   /usr/bin/nova-api --config-file=/etc/nova/nova.conf
   nova 12794 0.0 0.2 248636 77068 ? S  Feb11 0:04 /usr/bin/python
   /usr/bin/nova-api --config-file=/etc/nova/nova.conf
   root 24121 0.0 0.0 11688 912 pts/5 S+ 13:07 0:00 grep nova-api

You can create automated alerts for critical processes by using Nagios
and NRPE. For example, to ensure that the ``nova-compute`` process is
running on compute nodes, create an alert on your Nagios server that
looks like this:

.. code-block:: none

   define service {
       host_name c01.example.com
       check_command check_nrpe_1arg!check_nova-compute
       use generic-service
       notification_period 24x7
       contact_groups sysadmins
       service_description nova-compute
   }

Then on the actual compute node, create the following NRPE
configuration:

.. code-block:: none

   command[check_nova-compute]=/usr/lib/nagios/plugins/check_procs -c 1: \
   -a nova-compute

Nagios checks that at least one ``nova-compute`` service is running at
all times.
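
The same at-least-one-process check can also be prototyped in plain
shell without the Nagios plugin. The following is a minimal sketch: the
``check_proc`` helper name and the ``/proc`` scan are our own (Linux-only)
assumptions, not part of NRPE, but the exit codes follow the Nagios
convention (0 = OK, 2 = CRITICAL):

.. code-block:: bash

   #!/bin/sh
   # Sketch of a process check with Nagios-style exit codes. The
   # hypothetical check_proc helper scans /proc (Linux-only) for
   # processes whose command name exactly matches $1.
   check_proc() {
       name=$1
       count=0
       for comm in /proc/[0-9]*/comm; do
           [ -r "$comm" ] || continue
           read -r c < "$comm"
           [ "$c" = "$name" ] && count=$((count + 1))
       done
       if [ "$count" -ge 1 ]; then
           echo "OK: $count $name process(es) running"
           return 0
       fi
       echo "CRITICAL: $name is not running"
       return 2
   }

   check_proc nova-compute
   echo "exit code: $?"

Because NRPE interprets exit code 2 as CRITICAL, a script like this can
stand in for ``check_procs`` in the ``command[...]`` definition above.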

Resource Alerting
-----------------

Resource alerting provides notifications when one or more resources are
critically low. While the monitoring thresholds should be tuned to your
specific OpenStack environment, monitoring resource usage is not
specific to OpenStack at all—any generic type of alert will work
fine.

Some of the resources that you want to monitor include:

- Disk usage

- Server load

- Memory usage

- Network I/O

- Available vCPUs

For example, to monitor disk capacity on a compute node with Nagios, add
the following to your Nagios configuration:

.. code-block:: none

   define service {
       host_name c01.example.com
       check_command check_nrpe!check_all_disks!20% 10%
       use generic-service
       contact_groups sysadmins
       service_description Disk
   }

On the compute node, add the following to your NRPE configuration:

.. code-block:: none

   command[check_all_disks]=/usr/lib/nagios/plugins/check_disk -w $ARG1$ -c \
   $ARG2$ -e

Nagios alerts you with a WARNING when any disk on the compute node is 80
percent full and with a CRITICAL when any disk is 90 percent full.
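
The same thresholds can be prototyped without Nagios. In this sketch,
the ``check_disk_usage`` helper and its argument layout are our own
assumptions; it parses POSIX ``df`` output and applies warning/critical
used-percentage thresholds:

.. code-block:: bash

   #!/bin/sh
   # Hypothetical shell equivalent of the check above: WARNING at 80%
   # used, CRITICAL at 90% used. Field 5 of the last line of
   # `df -P <path>` output is the "Use%" column.
   check_disk_usage() {
       path=$1 warn=$2 crit=$3
       set -- $(df -P "$path" | tail -n 1)
       used=${5%\%}
       if [ "$used" -ge "$crit" ]; then
           echo "CRITICAL: $path at ${used}% capacity"; return 2
       elif [ "$used" -ge "$warn" ]; then
           echo "WARNING: $path at ${used}% capacity"; return 1
       fi
       echo "OK: $path at ${used}% capacity"; return 0
   }

   check_disk_usage / 80 90
   echo "exit code: $?"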

StackTach
---------

StackTach is a tool that collects and reports the notifications sent by
``nova``. Notifications are essentially the same as logs but can be much
more detailed. Nearly all OpenStack components are capable of generating
notifications when significant events occur. Notifications are messages
placed on the OpenStack queue (generally RabbitMQ) for consumption by
downstream systems. An overview of notifications can be found at `System
Usage Data <https://wiki.openstack.org/wiki/SystemUsageData>`_.

To enable ``nova`` to send notifications, add the following to
``nova.conf``:

.. code-block:: ini

   notification_topics=monitor
   notification_driver=messagingv2

Once ``nova`` is sending notifications, install and configure StackTach.
StackTach workers for queue consumption and pipeline processing are
configured to read these notifications from RabbitMQ servers and store
them in a database. Users can query instances, requests, and servers
by using the browser interface or the command-line tool,
`Stacky <https://github.com/rackerlabs/stacky>`_. Since StackTach is
relatively new and constantly changing, installation instructions
quickly become outdated. Refer to the `StackTach Git
repo <https://git.openstack.org/cgit//openstack/stacktach>`_ for
instructions as well as a demo video. Additional details on the latest
developments can be found at the `official
page <http://stacktach.com/>`_.

Logstash
--------

Logstash is a high performance indexing and search engine for logs. Logs
from Jenkins test runs are sent to Logstash where they are indexed and
stored. Logstash facilitates reviewing logs from multiple sources in a
single test run, searching for errors or particular events within a test
run, and searching for log event trends across test runs.

There are four major layers in a Logstash setup:

- Log Pusher

- Log Indexer

- ElasticSearch

- Kibana

Each layer scales horizontally. As the number of logs grows, you can add
more log pushers, more Logstash indexers, and more ElasticSearch nodes.

Logpusher is a pair of Python scripts that first listen to Jenkins
build events and convert them into Gearman jobs. Gearman provides a
generic application framework to farm out work to other machines or
processes that are better suited to do the work. It allows you to do
work in parallel, to load balance processing, and to call functions
between languages. Logpusher then performs Gearman jobs to push log
files into Logstash. The Logstash indexer reads these log events,
filters them to remove unwanted lines, collapses multiple events
together, and parses useful information before shipping them to
ElasticSearch for storage and indexing. Kibana is a Logstash-oriented
web client for ElasticSearch.

OpenStack Telemetry
-------------------

An integrated OpenStack project (code-named :term:`ceilometer`) collects
metering and event data relating to OpenStack services. Data collected
by the Telemetry service could be used for billing. Depending on the
deployment configuration, collected data may also be accessible to
users. The Telemetry service provides a REST API documented at
http://developer.openstack.org/api-ref-telemetry-v2.html. You can read
more about the module in the `OpenStack Administrator
Guide <http://docs.openstack.org/admin-guide/telemetry.html>`_ or
in the `developer
documentation <http://docs.openstack.org/developer/ceilometer>`_.

OpenStack-Specific Resources
----------------------------

Resources such as memory, disk, and CPU are generic resources that all
servers (even non-OpenStack servers) have and are important to the
overall health of the server. When dealing with OpenStack specifically,
these resources are important for a second reason: ensuring that enough
are available to launch instances. There are a few ways you can see
OpenStack resource usage. The first is through the :command:`nova` command:

.. code-block:: console

   # nova usage-list

This command displays a list of how many instances a tenant has running
and some light usage statistics about the combined instances. This
command is useful for a quick overview of your cloud, but it doesn't
really get into a lot of details.

Next, the ``nova`` database contains three tables that store usage
information.

The ``nova.quotas`` and ``nova.quota_usages`` tables store quota
information. If a tenant's quota is different from the default quota
settings, its quota is stored in the ``nova.quotas`` table. For example:

.. code-block:: mysql

   mysql> select project_id, resource, hard_limit from quotas;
   +----------------------------------+-----------------------------+------------+
   | project_id                       | resource                    | hard_limit |
   +----------------------------------+-----------------------------+------------+
   | 628df59f091142399e0689a2696f5baa | metadata_items              |        128 |
   | 628df59f091142399e0689a2696f5baa | injected_file_content_bytes |      10240 |
   | 628df59f091142399e0689a2696f5baa | injected_files              |          5 |
   | 628df59f091142399e0689a2696f5baa | gigabytes                   |       1000 |
   | 628df59f091142399e0689a2696f5baa | ram                         |      51200 |
   | 628df59f091142399e0689a2696f5baa | floating_ips                |         10 |
   | 628df59f091142399e0689a2696f5baa | instances                   |         10 |
   | 628df59f091142399e0689a2696f5baa | volumes                     |         10 |
   | 628df59f091142399e0689a2696f5baa | cores                       |         20 |
   +----------------------------------+-----------------------------+------------+

The ``nova.quota_usages`` table keeps track of how many resources the
tenant currently has in use:

.. code-block:: mysql

   mysql> select project_id, resource, in_use from quota_usages where project_id like '628%';
   +----------------------------------+--------------+--------+
   | project_id                       | resource     | in_use |
   +----------------------------------+--------------+--------+
   | 628df59f091142399e0689a2696f5baa | instances    |      1 |
   | 628df59f091142399e0689a2696f5baa | ram          |    512 |
   | 628df59f091142399e0689a2696f5baa | cores        |      1 |
   | 628df59f091142399e0689a2696f5baa | floating_ips |      1 |
   | 628df59f091142399e0689a2696f5baa | volumes      |      2 |
   | 628df59f091142399e0689a2696f5baa | gigabytes    |     12 |
   | 628df59f091142399e0689a2696f5baa | images       |      1 |
   +----------------------------------+--------------+--------+

By comparing a tenant's hard limit with their current resource usage,
you can see their usage percentage. For example, if this tenant is using
1 floating IP out of 10, then they are using 10 percent of their
floating IP quota. Rather than doing the calculation manually, you can
use SQL or the scripting language of your choice and create a formatted
report:

.. code-block:: mysql

   +-----------------------------------+------------+------------+---------------+
   | some_tenant                                                                 |
   +-----------------------------------+------------+------------+---------------+
   | Resource                          | Used       | Limit      |               |
   +-----------------------------------+------------+------------+---------------+
   | cores                             | 1          | 20         | 5 %           |
   | floating_ips                      | 1          | 10         | 10 %          |
   | gigabytes                         | 12         | 1000       | 1 %           |
   | images                            | 1          | 4          | 25 %          |
   | injected_file_content_bytes       | 0          | 10240      | 0 %           |
   | injected_file_path_bytes          | 0          | 255        | 0 %           |
   | injected_files                    | 0          | 5          | 0 %           |
   | instances                         | 1          | 10         | 10 %          |
   | key_pairs                         | 0          | 100        | 0 %           |
   | metadata_items                    | 0          | 128        | 0 %           |
   | ram                               | 512        | 51200      | 1 %           |
   | reservation_expire                | 0          | 86400      | 0 %           |
   | security_group_rules              | 0          | 20         | 0 %           |
   | security_groups                   | 0          | 10         | 0 %           |
   | volumes                           | 2          | 10         | 20 %          |
   +-----------------------------------+------------+------------+---------------+

The preceding information was generated by using a custom script that
can be found on
`GitHub <https://github.com/cybera/novac/blob/dev/libexec/novac-quota-report>`_.

.. note::

   This script is specific to a certain OpenStack installation and must
   be modified to fit your environment. However, the logic should
   easily be transferable.
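
The per-tenant percentage can also be computed directly in SQL by
joining the two tables described above. This is a sketch only; note
that ``nova.quotas`` holds only overridden quotas, so tenants running
on default limits will not appear in the join:

.. code-block:: mysql

   SELECT q.project_id,
          q.resource,
          u.in_use,
          q.hard_limit,
          ROUND(100 * u.in_use / q.hard_limit) AS percent_used
     FROM quotas q
     JOIN quota_usages u
       ON u.project_id = q.project_id
      AND u.resource = q.resource
    WHERE q.hard_limit > 0;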

Intelligent Alerting
--------------------

Intelligent alerting can be thought of as a form of continuous
integration for operations. For example, you can easily check to see
whether the Image service is up and running by ensuring that
the ``glance-api`` and ``glance-registry`` processes are running or by
seeing whether ``glance-api`` is responding on port 9292.

But how can you tell whether images are being successfully uploaded to
the Image service? Maybe the disk that the Image service is storing the
images on is full or the S3 back end is down. You could naturally check
this by doing a quick image upload:

.. code-block:: bash

   #!/bin/bash
   #
   # assumes that reasonable credentials have been stored at
   # /root/openrc

   . /root/openrc
   wget http://download.cirros-cloud.net/0.3.4/cirros-0.3.4-x86_64-disk.img
   glance image-create --name='cirros image' --is-public=true \
   --container-format=bare --disk-format=qcow2 < cirros-0.3.4-x86_64-disk.img

By taking this script and rolling it into an alert for your monitoring
system (such as Nagios), you now have an automated way of ensuring that
image uploads to the Image Catalog are working.

.. note::

   You must remove the image after each test. Even better, test whether
   you can successfully delete an image from the Image service.

Intelligent alerting takes considerably more time to plan and implement
than the other alerts described in this chapter. A good outline to
implement intelligent alerting is:

- Review common actions in your cloud.

- Create ways to automatically test these actions.

- Roll these tests into an alerting system.

Some other examples for intelligent alerting include:

- Can instances launch and be destroyed?

- Can users be created?

- Can objects be stored and deleted?

- Can volumes be created and destroyed?

Trending
--------

Trending can give you great insight into how your cloud is performing
day to day. You can learn, for example, if a busy day was simply a rare
occurrence or if you should start adding new compute nodes.

Trending takes a slightly different approach than alerting. While
alerting is interested in a binary result (whether a check succeeds or
fails), trending records the current state of something at a certain
point in time. Once enough points in time have been recorded, you can
see how the value has changed over time.

All of the alert types mentioned earlier can also be used for trend
reporting. Some other trend examples include:

- The number of instances on each compute node

- The types of flavors in use

- The number of volumes in use

- The number of Object Storage requests each hour

- The number of ``nova-api`` requests each hour

- The I/O statistics of your storage services

As an example, recording ``nova-api`` usage can allow you to track the
need to scale your cloud controller. By keeping an eye on ``nova-api``
requests, you can determine whether you need to spawn more ``nova-api``
processes or go as far as introducing an entirely new server to run
``nova-api``. To get an approximate count of the requests, look for
standard INFO messages in ``/var/log/nova/nova-api.log``:

.. code-block:: console

   # grep INFO /var/log/nova/nova-api.log | wc

You can obtain further statistics by looking for the number of
successful requests:

.. code-block:: console

   # grep " 200 " /var/log/nova/nova-api.log | wc

By running this command periodically and keeping a record of the result,
you can create a trending report over time that shows whether your
``nova-api`` usage is increasing, decreasing, or keeping steady.
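
A minimal way to keep that record is a small cron-driven script. In the
following sketch, the ``record_api_counts`` helper and the CSV output
path are our own assumptions:

.. code-block:: bash

   #!/bin/sh
   # Hypothetical hourly cron job: append a timestamped sample of the
   # total INFO lines and the successful (HTTP 200) requests to a CSV
   # file for later trending.
   record_api_counts() {
       log=$1 out=$2
       [ -r "$log" ] || { echo "cannot read $log" >&2; return 1; }
       total=$(grep -c 'INFO' "$log")
       ok=$(grep -c ' 200 ' "$log")
       echo "$(date -u +%Y-%m-%dT%H:%MZ),$total,$ok" >> "$out"
   }

   # Example invocation (paths are assumptions):
   record_api_counts /var/log/nova/nova-api.log /var/lib/nova-api-trend.csv || true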

A tool such as **collectd** can be used to store this information. While
collectd is out of the scope of this book, a good starting point would
be to use collectd to store the result as a COUNTER data type. More
information can be found in `collectd's
documentation <https://collectd.org/wiki/index.php/Data_source>`_.

Summary
~~~~~~~

For stable operations, you want to detect failure promptly and determine
causes efficiently. With a distributed system, it's even more important
to track the right items to meet a service-level target. Learning where
these logs are located in the file system or API gives you an advantage.
This chapter also showed how to read, interpret, and manipulate
information from OpenStack services so that you can monitor effectively.

===========================
Managing Projects and Users
===========================

An OpenStack cloud does not have much value without users. This chapter
covers topics that relate to managing users, projects, and quotas. It
describes users and projects as defined by version 2 of the OpenStack
Identity API.

.. warning::

   While version 3 of the Identity API is available, the client tools
   do not yet implement those calls, and most OpenStack clouds are
   still implementing Identity API v2.0.

Projects or Tenants?
~~~~~~~~~~~~~~~~~~~~

In OpenStack user interfaces and documentation, a group of users is
referred to as a :term:`project` or :term:`tenant`.
These terms are interchangeable.

The initial implementation of OpenStack Compute had its own
authentication system and used the term ``project``. When authentication
moved into the OpenStack Identity (keystone) project, it used the term
``tenant`` to refer to a group of users. Because of this legacy, some of
the OpenStack tools refer to projects and some refer to tenants.

.. note::

   This guide uses the term ``project``, unless an example shows
   interaction with a tool that uses the term ``tenant``.

Managing Projects
~~~~~~~~~~~~~~~~~

Users must be associated with at least one project, though they may
belong to many. Therefore, you should add at least one project before
adding users.

Adding Projects
---------------

To create a project through the OpenStack dashboard:

#. Log in as an administrative user.

#. Select the :guilabel:`Identity` tab in the left navigation bar.

#. Under the Identity tab, click :guilabel:`Projects`.

#. Click the :guilabel:`Create Project` button.

You are prompted for a project name and an optional, but recommended,
description. Select the checkbox at the bottom of the form to enable
this project. By default, it is enabled, as shown in
:ref:`figure_create_project`.

.. _figure_create_project:

.. figure:: figures/osog_0901.png
   :alt: Dashboard's Create Project form

   Dashboard's Create Project form

It is also possible to add project members and adjust the project
quotas. We'll discuss those actions later, but in practice, it can be
quite convenient to deal with all these operations at one time.

To add a project through the command line, you must use the OpenStack
command-line client.

.. code-block:: console

   # openstack project create demo

This command creates a project named "demo". Optionally, you can add a
description string by appending :option:`--description tenant-description`,
which can be very useful. You can also
create a project in a disabled state by appending :option:`--disable` to the
command. By default, projects are created in an enabled state.
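
For example, the following (with a hypothetical description string)
creates a disabled project with a description in one step:

.. code-block:: console

   # openstack project create --description "Test project for demos" \
     --disable demo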

Quotas
~~~~~~

To prevent system capacities from being exhausted without notification,
you can set up :term:`quotas <quota>`. Quotas are operational limits. For example,
the number of gigabytes allowed per tenant can be controlled to ensure that
a single tenant cannot consume all of the disk space. Quotas are
currently enforced at the tenant (or project) level, rather than the
user level.

.. warning::

   Without sensible quotas, a single tenant could use up all the
   available resources, so default quotas are shipped with OpenStack.
   You should pay attention to which quota settings make sense for your
   hardware capabilities.

Using the command-line interface, you can manage quotas for the
OpenStack Compute service and the Block Storage service.

Typically, default values are changed because a tenant requires more
than the OpenStack default of 10 volumes per tenant, or more than the
OpenStack default of 1 TB of disk space on a compute node.

.. note::

   To view all tenants, run:

   .. code-block:: console

      $ openstack project list
      +---------------------------------+----------+
      | ID                              | Name     |
      +---------------------------------+----------+
      | a981642d22c94e159a4a6540f70f9f8 | admin    |
      | 934b662357674c7b9f5e4ec6ded4d0e | tenant01 |
      | 7bc1dbfd7d284ec4a856ea1eb82dca8 | tenant02 |
      | 9c554aaef7804ba49e1b21cbd97d218 | services |
      +---------------------------------+----------+

Set Image Quotas
----------------

You can restrict a project's image storage by total number of bytes.
Currently, this quota is applied cloud-wide, so if you were to set an
Image quota limit of 5 GB, then every project in your cloud will only be
able to store 5 GB of images and snapshots.

To enable this feature, edit the ``/etc/glance/glance-api.conf`` file,
and under the ``[DEFAULT]`` section, add:

.. code-block:: ini

   user_storage_quota = <bytes>

For example, to restrict a project's image storage to 5 GB, do this:

.. code-block:: ini

   user_storage_quota = 5368709120
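
The byte count above is just 5 GiB expanded; shell arithmetic can be
used to double-check such values before editing the configuration:

.. code-block:: bash

   # 5 GiB expressed in bytes, matching the user_storage_quota value above
   echo $((5 * 1024 * 1024 * 1024))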
|
||||||
|
|
||||||
|
.. note::
|
||||||
|
|
||||||
|
There is a configuration option in ``glance-api.conf`` that limits
|
||||||
|
the number of members allowed per image, called
|
||||||
|
``image_member_quota``, set to 128 by default. That setting is a
|
||||||
|
different quota from the storage quota.
|
||||||
|
|
||||||
|
Set Compute Service Quotas
|
||||||
|
--------------------------
|
||||||
|
|
||||||
|
As an administrative user, you can update the Compute service quotas for
|
||||||
|
an existing tenant, as well as update the quota defaults for a new
|
||||||
|
tenant.Compute Compute service See :ref:`table_compute_quota`.
|
||||||
|
|
||||||
|
.. _table_compute_quota:
|
||||||
|
|
||||||
|
.. list-table:: Compute quota descriptions
|
||||||
|
:widths: 30 40 30
|
||||||
|
:header-rows: 1
|
||||||
|
|
||||||
|
* - Quota
|
||||||
|
- Description
|
||||||
|
- Property name
|
||||||
|
* - Fixed IPs
|
||||||
|
- Number of fixed IP addresses allowed per tenant.
|
||||||
|
This number must be equal to or greater than the number
|
||||||
|
of allowed instances.
|
||||||
|
- fixed-ips
|
||||||
|
* - Floating IPs
|
||||||
|
- Number of floating IP addresses allowed per tenant.
|
||||||
|
- floating-ips
|
||||||
|
* - Injected file content bytes
|
||||||
|
- Number of content bytes allowed per injected file.
|
||||||
|
- injected-file-content-bytes
|
||||||
|
* - Injected file path bytes
|
||||||
|
- Number of bytes allowed per injected file path.
|
||||||
|
- injected-file-path-bytes
|
||||||
|
* - Injected files
|
||||||
|
- Number of injected files allowed per tenant.
|
||||||
|
- injected-files
|
||||||
|
* - Instances
|
||||||
|
- Number of instances allowed per tenant.
|
||||||
|
- instances
|
||||||
|
* - Key pairs
|
||||||
|
- Number of key pairs allowed per user.
|
||||||
|
- key-pairs
|
||||||
|
* - Metadata items
|
||||||
|
- Number of metadata items allowed per instance.
|
||||||
|
- metadata-items
|
||||||
|
* - RAM
|
||||||
|
- Megabytes of instance RAM allowed per tenant.
|
||||||
|
- ram
|
||||||
|
* - Security group rules
|
||||||
|
- Number of rules per security group.
|
||||||
|
- security-group-rules
|
||||||
|
* - Security groups
|
||||||
|
- Number of security groups per tenant.
|
||||||
|
- security-groups
|
||||||
|
* - VCPUs
|
||||||
|
- Number of instance cores allowed per tenant.
|
||||||
|
- cores
|
||||||
|
|
||||||
|
View and update compute quotas for a tenant (project)
|
||||||
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
|
As an administrative user, you can use the :command:`nova quota-*`
|
||||||
|
commands, which are provided by the
|
||||||
|
``python-novaclient`` package, to view and update tenant quotas.
|
||||||
|
|
||||||
|
**To view and update default quota values**
|
||||||
|
|
||||||
|
#. List all default quotas for all tenants, as follows:
|
||||||
|
|
||||||
|
.. code-block:: console
|
||||||
|
|
||||||
|
$ nova quota-defaults
|
||||||
|
|
||||||
|
For example:
|
||||||
|
|
||||||
|
.. code-block:: console
|
||||||
|
|
||||||
|
$ nova quota-defaults
|
||||||
|
+-----------------------------+-------+
|
||||||
|
| Property | Value |
|
||||||
|
+-----------------------------+-------+
|
||||||
|
| metadata_items | 128 |
|
||||||
|
| injected_file_content_bytes | 10240 |
|
||||||
|
| ram | 51200 |
|
||||||
|
| floating_ips | 10 |
|
||||||
|
| key_pairs | 100 |
|
||||||
|
| instances | 10 |
|
||||||
|
| security_group_rules | 20 |
|
||||||
|
| injected_files | 5 |
|
||||||
|
| cores | 20 |
|
||||||
|
| fixed_ips | -1 |
|
||||||
|
| injected_file_path_bytes | 255 |
|
||||||
|
| security_groups | 10 |
|
||||||
|
+-----------------------------+-------+
|
||||||
|
|
||||||
|
#. Update a default value for a new tenant, as follows:
|
||||||
|
|
||||||
|
.. code-block:: console
|
||||||
|
|
||||||
|
$ nova quota-class-update default key value
|
||||||
|
|
||||||
|
For example:
|
||||||
|
|
||||||
|
.. code-block:: console
|
||||||
|
|
||||||
|
$ nova quota-class-update default --instances 15
|
||||||
|
|
||||||
|
**To view quota values for a tenant (project)**
|
||||||
|
|
||||||
|
#. Place the tenant ID in a variable:
|
||||||
|
|
||||||
|
.. code-block:: console
|
||||||
|
|
||||||
|
$ tenant=$(openstack project list | awk '/tenantName/ {print $2}')
|
||||||
|
|
||||||
|
#. List the currently set quota values for a tenant, as follows:
|
||||||
|
|
||||||
|
.. code-block:: console
|
||||||
|
|
||||||
|
$ nova quota-show --tenant $tenant
|
||||||
|
|
||||||
|
For example:
|
||||||
|
|
||||||
|
.. code-block:: console
|
||||||
|
|
||||||
|
$ nova quota-show --tenant $tenant
|
||||||
|
+-----------------------------+-------+
|
||||||
|
| Property | Value |
|
||||||
|
+-----------------------------+-------+
|
||||||
|
| metadata_items | 128 |
|
||||||
|
| injected_file_content_bytes | 10240 |
|
||||||
|
| ram | 51200 |
|
||||||
|
| floating_ips | 12 |
|
||||||
|
| key_pairs | 100 |
|
||||||
|
| instances | 10 |
|
||||||
|
| security_group_rules | 20 |
|
||||||
|
| injected_files | 5 |
|
||||||
|
| cores | 20 |
|
||||||
|
| fixed_ips | -1 |
|
||||||
|
| injected_file_path_bytes | 255 |
|
||||||
|
| security_groups | 10 |
|
||||||
|
+-----------------------------+-------+
|
||||||
|
|
||||||
|
**To update quota values for a tenant (project)**
|
||||||
|
|
||||||
|
#. Obtain the tenant ID, as follows:
|
||||||
|
|
||||||
|
.. code-block:: console
|
||||||
|
|
||||||
|
$ tenant=$(openstack project list | awk '/tenantName/ {print $2}')
|
||||||
|
|
||||||
|
#. Update a particular quota value, as follows:
|
||||||
|
|
||||||
|
.. code-block:: console
|
||||||
|
|
||||||
|
# nova quota-update --quotaName quotaValue tenantID
|
||||||
|
|
||||||
|
For example:
|
||||||
|
|
||||||
|
.. code-block:: console
|
||||||
|
|
||||||
|
# nova quota-update --floating-ips 20 $tenant
|
||||||
|
# nova quota-show --tenant $tenant
|
||||||
|
+-----------------------------+-------+
|
||||||
|
| Property | Value |
|
||||||
|
+-----------------------------+-------+
|
||||||
|
| metadata_items | 128 |
|
||||||
|
| injected_file_content_bytes | 10240 |
|
||||||
|
| ram | 51200 |
|
||||||
|
| floating_ips | 20 |
|
||||||
|
| key_pairs | 100 |
|
||||||
|
| instances | 10 |
|
||||||
|
| security_group_rules | 20 |
|
||||||
|
| injected_files | 5 |
|
||||||
|
| cores | 20 |
|
||||||
|
| fixed_ips | -1 |
|
||||||
|
| injected_file_path_bytes | 255 |
|
||||||
|
| security_groups | 10 |
|
||||||
|
+-----------------------------+-------+
|
||||||
|
|
||||||
|
.. note::
|
||||||
|
|
||||||
|
To view a list of options for the ``quota-update`` command, run:
|
||||||
|
|
||||||
|
.. code-block:: console
|
||||||
|
|
||||||
|
$ nova help quota-update
|
||||||
|
|
||||||
|
Set Object Storage Quotas
-------------------------

There are currently two categories of quotas for Object Storage:

Container quotas
   Limit the total size (in bytes) or number of objects that can be
   stored in a single container.

Account quotas
   Limit the total size (in bytes) that a user has available in the
   Object Storage service.

To take advantage of either container quotas or account quotas, your
Object Storage proxy server must have ``container_quotas`` or
``account_quotas`` (or both) added to the ``[pipeline:main]`` pipeline.
Each quota type also requires its own section in the
``proxy-server.conf`` file:

.. code-block:: ini

   [pipeline:main]
   pipeline = catch_errors [...] slo dlo account_quotas proxy-server

   [filter:account_quotas]
   use = egg:swift#account_quotas

   [filter:container_quotas]
   use = egg:swift#container_quotas

To view and update Object Storage quotas, use the :command:`swift` command
provided by the ``python-swiftclient`` package. Any user included in the
project can view the quotas placed on their project. To update Object
Storage quotas on a project, you must have the role of ResellerAdmin in
the project that the quota is being applied to.

To view account quotas placed on a project:

.. code-block:: console

   $ swift stat
   Account: AUTH_b36ed2d326034beba0a9dd1fb19b70f9
   Containers: 0
   Objects: 0
   Bytes: 0
   Meta Quota-Bytes: 214748364800
   X-Timestamp: 1351050521.29419
   Content-Type: text/plain; charset=utf-8
   Accept-Ranges: bytes

To apply or update account quotas on a project:

.. code-block:: console

   $ swift post -m quota-bytes:<bytes>

For example, to place a 5 GB quota on an account:

.. code-block:: console

   $ swift post -m quota-bytes:5368709120

To verify the quota, run the :command:`swift stat` command again:

.. code-block:: console

   $ swift stat
   Account: AUTH_b36ed2d326034beba0a9dd1fb19b70f9
   Containers: 0
   Objects: 0
   Bytes: 0
   Meta Quota-Bytes: 5368709120
   X-Timestamp: 1351541410.38328
   Content-Type: text/plain; charset=utf-8
   Accept-Ranges: bytes

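Container quotas are applied in the same way, by posting quota metadata
to a specific container rather than to the account. A minimal sketch,
assuming a container named ``mycontainer`` (the name and the limits here
are only illustrative):

.. code-block:: console

   $ swift post -m quota-bytes:10737418240 mycontainer
   $ swift post -m quota-count:1000 mycontainer

The first command caps the container at 10 GiB of data; the second caps
it at 1,000 objects.
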
Set Block Storage Quotas
------------------------

As an administrative user, you can update the Block Storage service
quotas for a tenant, as well as update the quota defaults for a new
tenant. See :ref:`table_block_storage_quota`.

.. _table_block_storage_quota:

.. list-table:: Table: Block Storage quota descriptions
   :widths: 50 50
   :header-rows: 1

   * - Property name
     - Description
   * - gigabytes
     - Number of volume gigabytes allowed per tenant
   * - snapshots
     - Number of Block Storage snapshots allowed per tenant
   * - volumes
     - Number of Block Storage volumes allowed per tenant

View and update Block Storage quotas for a tenant (project)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

As an administrative user, you can use the :command:`cinder quota-*`
commands, which are provided by the ``python-cinderclient`` package, to
view and update tenant quotas.

**To view and update default Block Storage quota values**

#. List all default quotas for all tenants, as follows:

   .. code-block:: console

      $ cinder quota-defaults

   For example:

   .. code-block:: console

      $ cinder quota-defaults
      +-----------+-------+
      | Property  | Value |
      +-----------+-------+
      | gigabytes | 1000  |
      | snapshots | 10    |
      | volumes   | 10    |
      +-----------+-------+

#. To update a default value for a new tenant, update the property in the
   ``/etc/cinder/cinder.conf`` file.

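The default limits correspond to options in the ``[DEFAULT]`` section of
``cinder.conf``. A sketch of what such an override might look like; the
values shown are examples only, not recommendations:

.. code-block:: ini

   [DEFAULT]
   quota_volumes = 20
   quota_snapshots = 20
   quota_gigabytes = 2000
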
**To view Block Storage quotas for a tenant (project)**

#. View quotas for the tenant, as follows:

   .. code-block:: console

      # cinder quota-show tenantName

   For example:

   .. code-block:: console

      # cinder quota-show tenant01
      +-----------+-------+
      | Property  | Value |
      +-----------+-------+
      | gigabytes | 1000  |
      | snapshots | 10    |
      | volumes   | 10    |
      +-----------+-------+

**To update Block Storage quotas for a tenant (project)**

#. Place the tenant ID in a variable:

   .. code-block:: console

      $ tenant=$(openstack project list | awk '/tenantName/ {print $2}')

#. Update a particular quota value, as follows:

   .. code-block:: console

      # cinder quota-update --quotaName NewValue tenantID

   For example:

   .. code-block:: console

      # cinder quota-update --volumes 15 $tenant
      # cinder quota-show tenant01
      +-----------+-------+
      | Property  | Value |
      +-----------+-------+
      | gigabytes | 1000  |
      | snapshots | 10    |
      | volumes   | 15    |
      +-----------+-------+

User Management
~~~~~~~~~~~~~~~

The command-line tools for managing users are inconvenient to use
directly. They require issuing multiple commands to complete a single
task, and they use UUIDs rather than symbolic names for many items. In
practice, humans typically do not use these tools directly. Fortunately,
the OpenStack dashboard provides a reasonable interface to this. In
addition, many sites write custom tools for local needs to enforce local
policies and provide levels of self-service to users that aren't
currently available with packaged tools.

Creating New Users
~~~~~~~~~~~~~~~~~~

To create a user, you need the following information:

* Username
* Email address
* Password
* Primary project
* Role
* Enabled

Username and email address are self-explanatory, though your site may
have local conventions you should observe. The primary project is simply
the first project the user is associated with and must exist prior to
creating the user. Role is almost always going to be "member." Out of
the box, OpenStack comes with two roles defined:

member
   A typical user

admin
   An administrative super user, which has full permissions across all
   projects and should be used with great care

It is possible to define other roles, but doing so is uncommon.

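If you prefer the command line, the same information maps onto a single
``openstack user create`` invocation. A hedged sketch (the user name,
email address, and project below are placeholders):

.. code-block:: console

   $ openstack user create --email alice@example.com \
     --project tenant01 --password-prompt alice
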
Once you've gathered this information, creating the user in the
dashboard is just another web form similar to what we've seen before and
can be found by clicking the Users link in the Identity navigation bar
and then clicking the Create User button at the top right.

Modifying users is also done from this Users page. If you have a large
number of users, this page can get quite crowded. The Filter search box
at the top of the page can be used to limit the users listing. A form
very similar to the user creation dialog can be pulled up by selecting
Edit from the actions dropdown menu at the end of the line for the user
you are modifying.

Associating Users with Projects
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Many sites run with users being associated with only one project. This
is a more conservative and simpler choice both for administration and
for users. Administratively, if a user reports a problem with an
instance or quota, it is obvious which project this relates to. Users
needn't worry about what project they are acting in if they are only in
one project. However, note that, by default, any user can affect the
resources of any other user within their project. It is also possible to
associate users with multiple projects if that makes sense for your
organization.

Associating existing users with an additional project or removing them
from an older project is done from the Projects page of the dashboard by
selecting Modify Users from the Actions column, as shown in
:ref:`figure_edit_project_members`.

From this view, you can do a number of useful things, as well as a few
dangerous ones.

The first column of this form, named All Users, includes a list of all
the users in your cloud who are not already associated with this
project. The second column shows all the users who are. These lists can
be quite long, but they can be limited by typing a substring of the
username you are looking for in the filter field at the top of the
column.

From here, click the :guilabel:`+` icon to add users to the project.
Click the :guilabel:`-` to remove them.

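The same membership changes can be made from the command line by
granting or revoking a role on the project. A sketch with placeholder
names; note that on some releases the default role is named
``_member_`` rather than ``member``:

.. code-block:: console

   $ openstack role add --user alice --project tenant01 member
   $ openstack role remove --user alice --project tenant01 member
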
.. _figure_edit_project_members:

.. figure:: figures/osog_0902.png
   :alt: Edit Project Members tab

   Edit Project Members tab

The dangerous possibility comes with the ability to change member roles.
This is the dropdown list below the username in the
:guilabel:`Project Members` list. In virtually all cases,
this value should be set to Member. This example purposefully shows
an administrative user where this value is admin.

.. warning::

   The admin is global, not per project, so granting a user the admin
   role in any project gives the user administrative rights across the
   whole cloud.

Typical use is to only create administrative users in a single project,
by convention the admin project, which is created by default during
cloud setup. If your administrative users also use the cloud to launch
and manage instances, it is strongly recommended that you use separate
user accounts for administrative access and normal operations and that
they be in distinct projects.

Customizing Authorization
-------------------------

The default :term:`authorization` settings allow administrative users
only to create resources on behalf of a different project.
OpenStack handles two kinds of authorization policies:

Operation based
   Policies specify access criteria for specific operations, possibly
   with fine-grained control over specific attributes.

Resource based
   Whether access to a specific resource might be granted or not
   according to the permissions configured for the resource (currently
   available only for the network resource). The actual authorization
   policies enforced in an OpenStack service vary from deployment to
   deployment.

The policy engine reads entries from the ``policy.json`` file. The
actual location of this file might vary from distribution to
distribution: for nova, it is typically in ``/etc/nova/policy.json``.
You can update entries while the system is running, and you do not have
to restart services. Currently, the only way to update such policies is
to edit the policy file.

The OpenStack service's policy engine matches a policy directly. A rule
indicates evaluation of the elements of such policies. For instance, in
a ``compute:create: [["rule:admin_or_owner"]]`` statement, the policy is
``compute:create``, and the rule is ``admin_or_owner``.

Policies are triggered by an OpenStack policy engine whenever one of
them matches an OpenStack API operation or a specific attribute being
used in a given operation. For instance, the engine tests the
``compute:create`` policy every time a user sends a
``POST /v2/{tenant_id}/servers`` request to the OpenStack Compute API
server. Policies can also be related to specific :term:`API extensions
<API extension>`. For instance, if a user needs an extension like
``compute_extension:rescue``, the attributes defined by the provider
extensions trigger the rule test for that operation.

An authorization policy can be composed of one or more rules. If more
rules are specified, the policy evaluates successfully if any of the
rules evaluates successfully; if an API operation matches multiple
policies, then all the policies must evaluate successfully. Also,
authorization rules are recursive. Once a rule is matched, the rule(s)
can be resolved to another rule, until a terminal rule is reached. These
are the rules defined:

Role-based rules
   Evaluate successfully if the user submitting the request has the
   specified role. For instance, ``"role:admin"`` is successful if the
   user submitting the request is an administrator.

Field-based rules
   Evaluate successfully if a field of the resource specified in the
   current request matches a specific value. For instance,
   ``"field:networks:shared=True"`` is successful if the ``shared``
   attribute of the network resource is set to ``true``.

Generic rules
   Compare an attribute in the resource with an attribute extracted
   from the user's security credentials and evaluate successfully if
   the comparison is successful. For instance,
   ``"tenant_id:%(tenant_id)s"`` is successful if the tenant identifier
   in the resource is equal to the tenant identifier of the user
   submitting the request.

Here are snippets of the default nova ``policy.json`` file:

.. code-block:: json

   {
       "context_is_admin": [["role:admin"]],
       "admin_or_owner": [["is_admin:True"], ["project_id:%(project_id)s"]],
       "default": [["rule:admin_or_owner"]],
       "compute:create": [ ],
       "compute:create:attach_network": [ ],
       "compute:create:attach_volume": [ ],
       "compute:get_all": [ ],
       "admin_api": [["is_admin:True"]],
       "compute_extension:accounts": [["rule:admin_api"]],
       "compute_extension:admin_actions": [["rule:admin_api"]],
       "compute_extension:admin_actions:pause": [["rule:admin_or_owner"]],
       "compute_extension:admin_actions:unpause": [["rule:admin_or_owner"]],
       ...
       "compute_extension:admin_actions:migrate": [["rule:admin_api"]],
       "compute_extension:aggregates": [["rule:admin_api"]],
       "compute_extension:certificates": [ ],
       ...
       "compute_extension:flavorextraspecs": [ ],
       "compute_extension:flavormanage": [["rule:admin_api"]]
   }

1. The ``admin_or_owner`` rule evaluates successfully if the current
   user is an administrator or the owner of the resource specified in
   the request (tenant identifier is equal).

2. The ``default`` policy is always evaluated if an API operation does
   not match any of the policies in ``policy.json``.

3. The ``compute_extension:flavormanage`` policy restricts the ability
   to manipulate flavors to administrators using the Admin API only.

In some cases, some operations should be restricted to administrators
only. Therefore, as a further example, let us consider how this sample
policy file could be modified in a scenario where we enable users to
create their own flavors:

.. code-block:: console

   "compute_extension:flavormanage": [ ],

Users Who Disrupt Other Users
-----------------------------

Users on your cloud can disrupt other users, sometimes intentionally and
maliciously and other times by accident. Understanding the situation
allows you to make a better decision on how to handle the
disruption.

For example, a group of users have instances that are utilizing a large
amount of compute resources for very compute-intensive tasks. This is
driving the load up on compute nodes and affecting other users. In this
situation, review your user use cases. You may find that high compute
scenarios are common, and should then plan for proper segregation in
your cloud, such as host aggregation or regions.

Another example is a user consuming a very large amount of
bandwidth. Again, the key is to
understand what the user is doing. If she naturally needs a high amount
of bandwidth, you might have to limit her transmission rate so as not to
affect other users or move her to an area with more bandwidth available.
On the other hand, maybe her instance has been hacked and is part of a
botnet launching DDOS attacks. Resolution of this issue is the same as
though any other server on your network has been hacked. Contact the
user and give her time to respond. If she doesn't respond, shut down the
instance.

A final example is if a user is hammering cloud resources repeatedly.
Contact the user and learn what he is trying to do. Maybe he doesn't
understand that what he's doing is inappropriate, or maybe there is an
issue with the resource he is trying to access that is causing his
requests to queue or lag.

Summary
~~~~~~~

One key element of systems administration that is often overlooked is
that end users are the reason systems administrators exist. Don't go the
BOFH route and terminate every user who causes an alert to go off. Work
with users to understand what they're trying to accomplish and see how
your environment can better assist them in achieving their goals. Meet
your users' needs by organizing your users into projects, applying
policies, managing quotas, and working with them.

========
Upgrades
========

With the exception of Object Storage, upgrading from one version of
OpenStack to another can take a great deal of effort. This chapter
provides some guidance on the operational aspects that you should
consider for performing an upgrade for a basic architecture.

Pre-upgrade considerations
~~~~~~~~~~~~~~~~~~~~~~~~~~

Upgrade planning
----------------

- Thoroughly review the `release
  notes <http://wiki.openstack.org/wiki/ReleaseNotes/>`_ to learn
  about new, updated, and deprecated features. Find incompatibilities
  between versions.

- Consider the impact of an upgrade to users. The upgrade process
  interrupts management of your environment including the dashboard. If
  you properly prepare for the upgrade, existing instances, networking,
  and storage should continue to operate. However, instances might
  experience intermittent network interruptions.

- Consider the approach to upgrading your environment. You can perform
  an upgrade with operational instances, but this is a dangerous
  approach. You might consider using live migration to temporarily
  relocate instances to other compute nodes while performing upgrades.
  However, you must ensure database consistency throughout the process;
  otherwise your environment might become unstable. Also, don't forget
  to provide sufficient notice to your users, including giving them
  plenty of time to perform their own backups.

- Consider adopting structure and options from the service
  configuration files and merging them with existing configuration
  files. The `OpenStack Configuration
  Reference <http://docs.openstack.org/liberty/config-reference/content/>`_
  contains new, updated, and deprecated options for most services.

- Like all major system upgrades, your upgrade could fail for one or
  more reasons. You should prepare for this situation by having the
  ability to roll back your environment to the previous release,
  including databases, configuration files, and packages. We provide an
  example process for rolling back your environment in
  :ref:`rolling_back_a_failed_upgrade`.

- Develop an upgrade procedure and assess it thoroughly by using a test
  environment similar to your production environment.

Pre-upgrade testing environment
-------------------------------

The most important step is the pre-upgrade testing. If you are upgrading
immediately after release of a new version, undiscovered bugs might
hinder your progress. Some deployers prefer to wait until the first
point release is announced. However, if you have a significant
deployment, you might follow the development and testing of the release
to ensure that bugs for your use cases are fixed.

Each OpenStack cloud is different even if you have a near-identical
architecture as described in this guide. As a result, you must still
test upgrades between versions in your environment using an approximate
clone of your environment.

However, that is not to say that it needs to be the same size or use
identical hardware as the production environment. It is important to
consider the hardware and scale of the cloud that you are upgrading. The
following tips can help you minimize the cost:

Use your own cloud
   The simplest place to start testing the next version of OpenStack is
   by setting up a new environment inside your own cloud. This might
   seem odd, especially the double virtualization used in running
   compute nodes. But it is a sure way to very quickly test your
   configuration.

Use a public cloud
   Consider using a public cloud to test the scalability limits of your
   cloud controller configuration. Most public clouds bill by the hour,
   which means it can be inexpensive to perform even a test with many
   nodes.

Make another storage endpoint on the same system
   If you use an external storage plug-in or shared file system with
   your cloud, you can test whether it works by creating a second share
   or endpoint. This allows you to test the system before entrusting
   your storage to the new version.

Watch the network
   Even at smaller-scale testing, look for excess network packets to
   determine whether something is going horribly wrong in
   inter-component communication.

To set up the test environment, you can use one of several methods:

- Do a full manual install by using the `OpenStack Installation
  Guide <http://docs.openstack.org/index.html#install-guides>`_ for
  your platform. Review the final configuration files and installed
  packages.

- Create a clone of your automated configuration infrastructure with
  changed package repository URLs.

  Alter the configuration until it works.

Either approach is valid. Use the approach that matches your experience.

An upgrade pre-testing system is excellent for getting the configuration
to work. However, it is important to note that the historical use of the
system and differences in user interaction can affect the success of
upgrades.

If possible, we highly recommend that you dump your production database
tables and test the upgrade in your development environment using this
data. Several MySQL bugs have been uncovered during database migrations
because of slight table differences between a fresh installation and
tables that migrated from one version to another. This will have an
impact on large real datasets, which you do not want to encounter during
a production outage.

Artificial scale testing can go only so far. After your cloud is
upgraded, you must pay careful attention to the performance aspects of
your cloud.

Upgrade Levels
--------------

Upgrade levels are a feature added to OpenStack Compute since the
Grizzly release to provide version locking on the RPC (Message Queue)
communications between the various Compute services.

This functionality is an important piece of the puzzle when it comes to
live upgrades and is conceptually similar to the existing API versioning
that allows OpenStack services of different versions to communicate
without issue.

Without upgrade levels, an X+1 version Compute service can receive and
understand X version RPC messages, but it can only send out X+1 version
RPC messages. For example, if a nova-conductor process has been upgraded
to X+1 version, then the conductor service will be able to understand
messages from X version nova-compute processes, but those compute
services will not be able to understand messages sent by the conductor
service.

During an upgrade, operators can add configuration options to
``nova.conf`` which lock the version of RPC messages and allow live
upgrading of the services without interruption caused by version
mismatch. The configuration options allow the specification of RPC
version numbers if desired, but release name aliases are also supported.
For example:

.. code-block:: ini

   [upgrade_levels]
   compute=X+1
   conductor=X+1
   scheduler=X+1

will keep the RPC version locked across the specified services to the
RPC version used in X+1. As all instances of a particular service are
upgraded to the newer version, the corresponding line can be removed
from ``nova.conf``.

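Using the release name alias form, an operator upgrading away from Kilo
might pin the compute RPC interface as follows while compute nodes are
still on the old release (the release name here is only an example):

.. code-block:: ini

   [upgrade_levels]
   compute = kilo
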
Using this functionality, ideally one would lock the RPC version to the
OpenStack version being upgraded from on nova-compute nodes, to ensure
that, for example, X+1 version nova-compute processes will continue to
work with X version nova-conductor processes while the upgrade
completes. Once the upgrade of nova-compute processes is complete, the
operator can move onto upgrading nova-conductor and remove the version
locking for nova-compute in ``nova.conf``.

General upgrade process
~~~~~~~~~~~~~~~~~~~~~~~

This section describes the process to upgrade a basic OpenStack
deployment based on the basic two-node architecture in the `OpenStack
Installation
Guide <http://docs.openstack.org/index.html#install-guides>`_. All
nodes must run a supported distribution of Linux with a recent kernel
and the current release packages.

Service specific upgrade instructions
-------------------------------------

* `Upgrading the Networking Service <http://docs.openstack.org/developer/neutron/devref/upgrade.html>`_

Prerequisites
-------------

- Perform some cleaning of the environment prior to starting the
  upgrade process to ensure a consistent state. For example, instances
  not fully purged from the system after deletion might cause
  indeterminate behavior.

- For environments using the OpenStack Networking service (neutron),
  verify the release version of the database. For example:

  .. code-block:: console

     # su -s /bin/sh -c "neutron-db-manage --config-file /etc/neutron/neutron.conf \
       --config-file /etc/neutron/plugins/ml2/ml2_conf.ini current" neutron

Perform a backup
----------------

#. Save the configuration files on all nodes. For example:

   .. code-block:: console

      # for i in keystone glance nova neutron openstack-dashboard cinder heat ceilometer; \
        do mkdir $i-kilo; \
        done
      # for i in keystone glance nova neutron openstack-dashboard cinder heat ceilometer; \
        do cp -r /etc/$i/* $i-kilo/; \
        done

   .. note::

      You can modify this example script on each node to handle
      different services.

#. Make a full database backup of your production data. As of Kilo,
   database downgrades are not supported, and the only method available
   to get back to a prior database version will be to restore from
   backup.

   .. code-block:: console

      # mysqldump -u root -p --opt --add-drop-database --all-databases > icehouse-db-backup.sql

   .. note::

      Consider updating your SQL server configuration as described in
      the `OpenStack Installation
      Guide <http://docs.openstack.org/index.html#install-guides>`_.

Manage repositories
-------------------

On all nodes:

#. Remove the repository for the previous release packages.

#. Add the repository for the new release packages.

#. Update the repository database.

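On Ubuntu, for example, these three steps might look like the following,
assuming the Ubuntu Cloud Archive is in use; the release names are
placeholders for the versions you are actually moving between:

.. code-block:: console

   # add-apt-repository -r cloud-archive:juno
   # add-apt-repository cloud-archive:kilo
   # apt-get update
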
Upgrade packages on each node
-----------------------------

Depending on your specific configuration, upgrading all packages might
restart or break services supplemental to your OpenStack environment.
For example, if you use the TGT iSCSI framework for Block Storage
volumes and the upgrade includes new packages for it, the package
manager might restart the TGT iSCSI services and impact connectivity to
volumes.

If the package manager prompts you to update configuration files, reject
the changes. The package manager appends a suffix to newer versions of
configuration files. Consider reviewing and adopting content from these
files.

.. note::

   You may need to explicitly install the ``ipset`` package if your
   distribution does not install it as a dependency.

Update services
---------------

To update a service on each node, you generally modify one or more
configuration files, stop the service, synchronize the database schema,
and start the service. Some services require different steps. We
recommend verifying operation of each service before proceeding to the
next service.

The order you should upgrade services, and any changes from the general
|
||||||
|
upgrade process is described below:
|
||||||
|
|
||||||
|
**Controller node**
|
||||||
|
|
||||||
|
#. OpenStack Identity - Clear any expired tokens before synchronizing
|
||||||
|
the database.
|
||||||
|
|
||||||
|
#. OpenStack Image service
|
||||||
|
|
||||||
|
#. OpenStack Compute, including networking components.
|
||||||
|
|
||||||
|
#. OpenStack Networking
|
||||||
|
|
||||||
|
#. OpenStack Block Storage
|
||||||
|
|
||||||
|
#. OpenStack dashboard - In typical environments, updating the
|
||||||
|
dashboard only requires restarting the Apache HTTP service.
|
||||||
|
|
||||||
|
#. OpenStack Orchestration
|
||||||
|
|
||||||
|
#. OpenStack Telemetry - In typical environments, updating the
|
||||||
|
Telemetry service only requires restarting the service.
|
||||||
|
|
||||||
|
#. OpenStack Compute - Edit the configuration file and restart the
|
||||||
|
service.
|
||||||
|
|
||||||
|
#. OpenStack Networking - Edit the configuration file and restart the
|
||||||
|
service.
|
||||||
|
|
||||||
|
**Compute nodes**
|
||||||
|
|
||||||
|
- OpenStack Block Storage - Updating the Block Storage service only
|
||||||
|
requires restarting the service.
|
||||||
|
|
||||||
|
**Storage nodes**
|
||||||
|
|
||||||
|
- OpenStack Networking - Edit the configuration file and restart the
|
||||||
|
service.
|
||||||
|
|
||||||
|
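The generic per-service pattern can be sketched as follows, shown for
Compute and Identity on Ubuntu; the exact service and command names are
assumptions that vary by release and distribution:

```shell
# Illustrative Ubuntu sketch of the generic per-service update pattern:
# stop the service, synchronize the database schema, start the service.
service nova-api stop
su -s /bin/sh -c "nova-manage db sync" nova
service nova-api start

# For Identity, clear expired tokens before synchronizing the database:
keystone-manage token_flush
su -s /bin/sh -c "keystone-manage db_sync" keystone
```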
Final steps
-----------

On all distributions, you must perform some final tasks to complete the
upgrade process.

#. Decrease DHCP timeouts by modifying ``/etc/nova/nova.conf`` on the
   compute nodes back to the original value for your environment.

#. Update all ``.ini`` files to match passwords and pipelines as required
   for the OpenStack release in your environment.

#. After migration, users see different results from
   :command:`nova image-list` and :command:`glance image-list`. To ensure
   users see the same images in the list commands, edit the
   ``/etc/glance/policy.json`` and ``/etc/nova/policy.json`` files to
   contain ``"context_is_admin": "role:admin"``, which limits access to
   private images for projects.

#. Verify proper operation of your environment. Then, notify your users
   that their cloud is operating normally again.

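As a quick sanity check for the policy change, you might confirm that
both policy files carry the entry after editing; this is only an
illustrative verification, not part of the official procedure:

```shell
# Illustrative check: both files should now contain the entry that
# limits access to private images to the admin role.
grep '"context_is_admin"' /etc/glance/policy.json /etc/nova/policy.json
```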
.. _rolling_back_a_failed_upgrade:

Rolling back a failed upgrade
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Upgrades involve complex operations and can fail. Before attempting any
upgrade, you should make a full database backup of your production data.
As of Kilo, database downgrades are not supported, and the only method
available to get back to a prior database version will be to restore
from backup.

This section provides guidance for rolling back to a previous release of
OpenStack. All distributions follow a similar procedure.

A common scenario is to take down production management services in
preparation for an upgrade, complete part of the upgrade process, and
discover one or more problems not encountered during testing. As a
consequence, you must roll back your environment to the original "known
good" state. You must also make sure that you did not make any state
changes after attempting the upgrade process: no new instances,
networks, storage volumes, and so on. Any of these new resources will be
in a frozen state after the databases are restored from backup.

Within this scope, you must complete these steps to successfully roll
back your environment:

#. Roll back configuration files.

#. Restore databases from backup.

#. Roll back packages.

You should verify that you have the requisite backups to restore.
Rolling back upgrades is a tricky process because distributions tend to
put much more effort into testing upgrades than downgrades. Broken
downgrades take significantly more effort to troubleshoot and resolve
than broken upgrades. Only you can weigh the risks of trying to push a
failed upgrade forward versus rolling it back. Generally, consider
rolling back as the very last option.

The following steps described for Ubuntu have worked on at least one
production environment, but they might not work for all environments.

**To perform the rollback**

#. Stop all OpenStack services.

#. Copy contents of configuration backup directories that you created
   during the upgrade process back to the ``/etc/<service>`` directory.

#. Restore databases from the ``RELEASE_NAME-db-backup.sql`` backup file
   that you created with the :command:`mysqldump` command during the
   upgrade process:

   .. code-block:: console

      # mysql -u root -p < RELEASE_NAME-db-backup.sql

#. Downgrade OpenStack packages.

   .. warning::

      Downgrading packages is by far the most complicated step; it is
      highly dependent on the distribution and the overall administration
      of the system.

#. Determine which OpenStack packages are installed on your system. Use the
   :command:`dpkg --get-selections` command. Filter for OpenStack
   packages, filter again to omit packages explicitly marked in the
   ``deinstall`` state, and save the final output to a file. For example,
   the following command covers a controller node with keystone, glance,
   nova, neutron, and cinder:

   .. code-block:: console

      # dpkg --get-selections | grep -e keystone -e glance -e nova -e neutron \
      -e cinder | grep -v deinstall | tee openstack-selections
      cinder-api                                      install
      cinder-common                                   install
      cinder-scheduler                                install
      cinder-volume                                   install
      glance                                          install
      glance-api                                      install
      glance-common                                   install
      glance-registry                                 install
      neutron-common                                  install
      neutron-dhcp-agent                              install
      neutron-l3-agent                                install
      neutron-lbaas-agent                             install
      neutron-metadata-agent                          install
      neutron-plugin-openvswitch                      install
      neutron-plugin-openvswitch-agent                install
      neutron-server                                  install
      nova-api                                        install
      nova-cert                                       install
      nova-common                                     install
      nova-conductor                                  install
      nova-consoleauth                                install
      nova-novncproxy                                 install
      nova-objectstore                                install
      nova-scheduler                                  install
      python-cinder                                   install
      python-cinderclient                             install
      python-glance                                   install
      python-glanceclient                             install
      python-keystone                                 install
      python-keystoneclient                           install
      python-neutron                                  install
      python-neutronclient                            install
      python-nova                                     install
      python-novaclient                               install

   .. note::

      Depending on the type of server, the contents and order of your
      package list might vary from this example.

#. You can determine the package versions available for reversion by using
   the :command:`apt-cache policy` command. If you removed the Grizzly
   repositories, you must first reinstall them and run :command:`apt-get update`:

   .. code-block:: console

      # apt-cache policy nova-common
      nova-common:
        Installed: 1:2013.2-0ubuntu1~cloud0
        Candidate: 1:2013.2-0ubuntu1~cloud0
        Version table:
       *** 1:2013.2-0ubuntu1~cloud0 0
              500 http://ubuntu-cloud.archive.canonical.com/ubuntu/
                  precise-updates/havana/main amd64 Packages
              100 /var/lib/dpkg/status
           1:2013.1.4-0ubuntu1~cloud0 0
              500 http://ubuntu-cloud.archive.canonical.com/ubuntu/
                  precise-updates/grizzly/main amd64 Packages
           2012.1.3+stable-20130423-e52e6912-0ubuntu1.2 0
              500 http://us.archive.ubuntu.com/ubuntu/
                  precise-updates/main amd64 Packages
              500 http://security.ubuntu.com/ubuntu/
                  precise-security/main amd64 Packages
           2012.1-0ubuntu2 0
              500 http://us.archive.ubuntu.com/ubuntu/
                  precise/main amd64 Packages

   This output shows the currently installed version of the package, the
   newest candidate version, and all available versions, along with the
   repository that contains each version. Look for the appropriate
   Grizzly version (``1:2013.1.4-0ubuntu1~cloud0`` in this case). The
   process of manually picking through this list of packages is rather
   tedious and prone to errors. You should consider using the following
   script to help with this process:

   .. code-block:: console

      # for i in `cut -f 1 openstack-selections | sed 's/neutron/quantum/;'`;
      do echo -n $i ;apt-cache policy $i | grep -B 1 grizzly |
      grep -v Packages | awk '{print "="$1}';done | tr '\n' ' ' |
      tee openstack-grizzly-versions
      cinder-api=1:2013.1.4-0ubuntu1~cloud0
      cinder-common=1:2013.1.4-0ubuntu1~cloud0
      cinder-scheduler=1:2013.1.4-0ubuntu1~cloud0
      cinder-volume=1:2013.1.4-0ubuntu1~cloud0
      glance=1:2013.1.4-0ubuntu1~cloud0
      glance-api=1:2013.1.4-0ubuntu1~cloud0
      glance-common=1:2013.1.4-0ubuntu1~cloud0
      glance-registry=1:2013.1.4-0ubuntu1~cloud0
      quantum-common=1:2013.1.4-0ubuntu1~cloud0
      quantum-dhcp-agent=1:2013.1.4-0ubuntu1~cloud0
      quantum-l3-agent=1:2013.1.4-0ubuntu1~cloud0
      quantum-lbaas-agent=1:2013.1.4-0ubuntu1~cloud0
      quantum-metadata-agent=1:2013.1.4-0ubuntu1~cloud0
      quantum-plugin-openvswitch=1:2013.1.4-0ubuntu1~cloud0
      quantum-plugin-openvswitch-agent=1:2013.1.4-0ubuntu1~cloud0
      quantum-server=1:2013.1.4-0ubuntu1~cloud0
      nova-api=1:2013.1.4-0ubuntu1~cloud0
      nova-cert=1:2013.1.4-0ubuntu1~cloud0
      nova-common=1:2013.1.4-0ubuntu1~cloud0
      nova-conductor=1:2013.1.4-0ubuntu1~cloud0
      nova-consoleauth=1:2013.1.4-0ubuntu1~cloud0
      nova-novncproxy=1:2013.1.4-0ubuntu1~cloud0
      nova-objectstore=1:2013.1.4-0ubuntu1~cloud0
      nova-scheduler=1:2013.1.4-0ubuntu1~cloud0
      python-cinder=1:2013.1.4-0ubuntu1~cloud0
      python-cinderclient=1:1.0.3-0ubuntu1~cloud0
      python-glance=1:2013.1.4-0ubuntu1~cloud0
      python-glanceclient=1:0.9.0-0ubuntu1.2~cloud0
      python-quantum=1:2013.1.4-0ubuntu1~cloud0
      python-quantumclient=1:2.2.0-0ubuntu1~cloud0
      python-nova=1:2013.1.4-0ubuntu1~cloud0
      python-novaclient=1:2.13.0-0ubuntu1~cloud0

   .. note::

      If you decide to continue this step manually, don't forget to change
      ``neutron`` to ``quantum`` where applicable.

#. Use the :command:`apt-get install` command to install specific versions of each
   package by specifying ``<package-name>=<version>``. The script in the
   previous step conveniently created a list of ``package=version`` pairs
   for you:

   .. code-block:: console

      # apt-get install `cat openstack-grizzly-versions`

   This step completes the rollback procedure. You should remove the
   upgrade release repository and run :command:`apt-get update` to prevent
   accidental upgrades until you solve whatever issue caused you to roll
   back your environment.

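Under the assumptions of this section (Ubuntu, the Grizzly example, and
configuration backups taken before the upgrade), the whole rollback can
be condensed into the following sketch; the backup paths and the service
list are illustrative, not authoritative:

```shell
# Illustrative condensed rollback sketch; adapt names and paths.
# 1. Stop all OpenStack services on the node, for example:
for svc in nova-api glance-api keystone cinder-api quantum-server; do
    service "$svc" stop
done
# 2. Restore configuration backups made before the upgrade
#    (hypothetical backup location shown).
cp -a /etc/nova.backup/. /etc/nova/
# 3. Restore databases from the pre-upgrade dump.
mysql -u root -p < grizzly-db-backup.sql
# 4. Downgrade packages to the versions collected earlier.
apt-get install `cat openstack-grizzly-versions`
```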
==================
Upstream OpenStack
==================

OpenStack is founded on a thriving community that is a source of help
and welcomes your contributions. This chapter details some of the ways
you can interact with the others involved.

Getting Help
~~~~~~~~~~~~

There are several avenues available for seeking assistance. The quickest
way is to help the community help you. Search the Q&A sites, mailing
list archives, and bug lists for issues similar to yours. If you can't
find anything, follow the directions for reporting bugs or use one of
the channels for support, which are listed below.

Your first port of call should be the official OpenStack documentation,
found on http://docs.openstack.org. You can get questions answered on
http://ask.openstack.org.

`Mailing lists <https://wiki.openstack.org/wiki/Mailing_Lists>`_ are
also a great place to get help. The wiki page has more information about
the various lists. As an operator, the main lists you should be aware of
are:

`General list <http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack>`_
    *openstack@lists.openstack.org*. The scope of this list is the
    current state of OpenStack. This is a very high-traffic mailing
    list, with many, many emails per day.

`Operators list <http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators>`_
    *openstack-operators@lists.openstack.org*. This list is intended for
    discussion among existing OpenStack cloud operators, such as
    yourself. Currently, this list is relatively low traffic, on the
    order of one email a day.

`Development list <http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev>`_
    *openstack-dev@lists.openstack.org*. The scope of this list is the
    future state of OpenStack. This is a high-traffic mailing list, with
    multiple emails per day.

We recommend that you subscribe to the general list and the operator
list, although you must set up filters to manage the volume for the
general list. You'll also find links to the mailing list archives on the
mailing list wiki page, where you can search through the discussions.

`Multiple IRC channels <https://wiki.openstack.org/wiki/IRC>`_ are
available for general questions and developer discussions. The general
discussion channel is #openstack on *irc.freenode.net*.

Reporting Bugs
~~~~~~~~~~~~~~

As an operator, you are in a very good position to report unexpected
behavior with your cloud. Since OpenStack is flexible, you may be the
only individual to report a particular issue. Every issue is important
to fix, so it is essential to learn how to easily submit a bug report.

All OpenStack projects use `Launchpad <https://launchpad.net/>`_
for bug tracking. You'll need to create an account on Launchpad before
you can submit a bug report.

Once you have a Launchpad account, reporting a bug is as simple as
identifying the project or projects that are causing the issue.
Sometimes this is more difficult than expected, but those working on the
bug triage are happy to help relocate issues if they are not in the
right place initially:

- Report a bug in
  `nova <https://bugs.launchpad.net/nova/+filebug/+login>`_.

- Report a bug in
  `python-novaclient <https://bugs.launchpad.net/python-novaclient/+filebug/+login>`_.

- Report a bug in
  `swift <https://bugs.launchpad.net/swift/+filebug/+login>`_.

- Report a bug in
  `python-swiftclient <https://bugs.launchpad.net/python-swiftclient/+filebug/+login>`_.

- Report a bug in
  `glance <https://bugs.launchpad.net/glance/+filebug/+login>`_.

- Report a bug in
  `python-glanceclient <https://bugs.launchpad.net/python-glanceclient/+filebug/+login>`_.

- Report a bug in
  `keystone <https://bugs.launchpad.net/keystone/+filebug/+login>`_.

- Report a bug in
  `python-keystoneclient <https://bugs.launchpad.net/python-keystoneclient/+filebug/+login>`_.

- Report a bug in
  `neutron <https://bugs.launchpad.net/neutron/+filebug/+login>`_.

- Report a bug in
  `python-neutronclient <https://bugs.launchpad.net/python-neutronclient/+filebug/+login>`_.

- Report a bug in
  `cinder <https://bugs.launchpad.net/cinder/+filebug/+login>`_.

- Report a bug in
  `python-cinderclient <https://bugs.launchpad.net/python-cinderclient/+filebug/+login>`_.

- Report a bug in
  `manila <https://bugs.launchpad.net/manila/+filebug/+login>`_.

- Report a bug in
  `python-manilaclient <https://bugs.launchpad.net/python-manilaclient/+filebug/+login>`_.

- Report a bug in
  `python-openstackclient <https://bugs.launchpad.net/python-openstackclient/+filebug/+login>`_.

- Report a bug in
  `horizon <https://bugs.launchpad.net/horizon/+filebug/+login>`_.

- Report a bug with the
  `documentation <https://bugs.launchpad.net/openstack-manuals/+filebug/+login>`_.

- Report a bug with the `API
  documentation <https://bugs.launchpad.net/openstack-api-site/+filebug/+login>`_.

To write a good bug report, the following process is essential. First,
search for the bug to make sure there is no bug already filed for the
same issue. If you find one, be sure to click on "This bug affects X
people. Does this bug affect you?" If you can't find the issue, then
enter the details of your report. It should at least include:

- The release, or milestone, or commit ID corresponding to the software
  that you are running

- The operating system and version where you've identified the bug

- Steps to reproduce the bug, including what went wrong

- Description of the expected results instead of what you saw

- Portions of your log files so that you include only relevant excerpts

When you do this, the bug is created with:

- Status: *New*

In the bug comments, you can contribute instructions on how to fix a
given bug, and set it to *Triaged*. Or you can directly fix it: assign
the bug to yourself, set it to *In progress*, branch the code, implement
the fix, and propose your change for merging. But let's not get ahead of
ourselves; there are bug triaging tasks as well.

Confirming and Prioritizing
---------------------------

This stage is about checking that a bug is real and assessing its
impact. Some of these steps require bug supervisor rights (usually
limited to core teams). If the bug lacks information to properly
reproduce or assess the importance of the bug, the bug is set to:

- Status: *Incomplete*

Once you have reproduced the issue (or are 100 percent confident that
this is indeed a valid bug) and have permissions to do so, set:

- Status: *Confirmed*

Core developers also prioritize the bug, based on its impact:

- Importance: <Bug impact>

The bug impacts are categorized as follows:

#. *Critical* if the bug prevents a key feature from working properly
   (regression) for all users (or without a simple workaround) or
   results in data loss

#. *High* if the bug prevents a key feature from working properly for
   some users (or with a workaround)

#. *Medium* if the bug prevents a secondary feature from working
   properly

#. *Low* if the bug is mostly cosmetic

#. *Wishlist* if the bug is not really a bug but rather a welcome change
   in behavior

If the bug contains the solution, or a patch, set the bug status to
*Triaged*.

Bug Fixing
----------

At this stage, a developer works on a fix. During that time, to avoid
duplicating the work, the developer should set:

- Status: *In Progress*

- Assignee: <yourself>

When the fix is ready, the developer proposes a change and gets the
change reviewed.

After the Change Is Accepted
----------------------------

After the change is reviewed, accepted, and lands in master, it
automatically moves to:

- Status: *Fix Committed*

When the fix makes it into a milestone or release branch, it
automatically moves to:

- Milestone: Milestone the bug was fixed in

- Status: *Fix Released*

Join the OpenStack Community
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Since you've made it this far in the book, you should consider becoming
an official individual member of the community and `join the OpenStack
Foundation <https://www.openstack.org/join/>`_. The OpenStack
Foundation is an independent body providing shared resources to help
achieve the OpenStack mission by protecting, empowering, and promoting
OpenStack software and the community around it, including users,
developers, and the entire ecosystem. We all share the responsibility to
make this community the best it can possibly be, and signing up to be a
member is the first step to participating. Like the software, individual
membership within the OpenStack Foundation is free and accessible to
anyone.

How to Contribute to the Documentation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

OpenStack documentation efforts encompass operator and administrator
docs, API docs, and user docs.

The genesis of this book was an in-person event, but now that the book
is in your hands, we want you to contribute to it. OpenStack
documentation follows the coding principles of iterative work, with bug
logging, investigating, and fixing.

Just like the code, http://docs.openstack.org is updated constantly
using the Gerrit review system, with source stored in git.openstack.org
in the `openstack-manuals
repository <https://git.openstack.org/cgit/openstack/openstack-manuals/>`_
and the `api-site
repository <https://git.openstack.org/cgit/openstack/api-site/>`_.

To review the documentation before it's published, go to the OpenStack
Gerrit server at http://review.openstack.org and search for
`project:openstack/openstack-manuals <https://review.openstack.org/#/q/status:open+project:openstack/openstack-manuals,n,z>`_
or
`project:openstack/api-site <https://review.openstack.org/#/q/status:open+project:openstack/api-site,n,z>`_.

See the `How To Contribute page on the
wiki <https://wiki.openstack.org/wiki/How_To_Contribute>`_ for more
information on the steps you need to take to submit your first
documentation review or change.

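A first documentation change typically follows the standard OpenStack
Gerrit workflow; the following sketch assumes you have a Gerrit account
set up and the ``git-review`` tool installed (the branch name and commit
message are placeholders):

```shell
# Illustrative sketch of submitting a documentation change for review.
git clone https://git.openstack.org/openstack/openstack-manuals
cd openstack-manuals
git checkout -b my-doc-fix          # placeholder branch name
# ... edit the RST source files ...
git commit -a -m "Fix typo in the Operations Guide"
git review                          # upload to review.openstack.org
```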
Security Information
~~~~~~~~~~~~~~~~~~~~

As a community, we take security very seriously and follow a specific
process for reporting potential issues. We vigilantly pursue fixes and
regularly eliminate exposures. You can report security issues you
discover through this specific process. The OpenStack Vulnerability
Management Team is a very small group of experts in vulnerability
management drawn from the OpenStack community. The team's job is
facilitating the reporting of vulnerabilities, coordinating security
fixes, and handling progressive disclosure of the vulnerability
information. Specifically, the team is responsible for the following
functions:

Vulnerability management
    All vulnerabilities discovered by community members (or users) can
    be reported to the team.

Vulnerability tracking
    The team will curate a set of vulnerability-related issues in the
    issue tracker. Some of these issues are private to the team and the
    affected product leads, but once remediation is in place, all
    vulnerabilities are public.

Responsible disclosure
    As part of our commitment to work with the security community, the
    team ensures that proper credit is given to security researchers who
    responsibly report issues in OpenStack.

We provide two ways to report issues to the OpenStack Vulnerability
Management Team, depending on how sensitive the issue is:

- Open a bug in Launchpad and mark it as a "security bug." This makes
  the bug private and accessible to only the Vulnerability Management
  Team.

- If the issue is extremely sensitive, send an encrypted email to one
  of the team's members. Find their GPG keys at `OpenStack
  Security <http://www.openstack.org/projects/openstack-security/>`_.

You can find the full list of security-oriented teams you can join at
`Security Teams <https://wiki.openstack.org/wiki/SecurityTeams>`_. The
vulnerability management process is fully documented at `Vulnerability
Management <https://wiki.openstack.org/wiki/VulnerabilityManagement>`_.

Finding Additional Information
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In addition to this book, there are many other sources of information
about OpenStack. The
`OpenStack website <http://www.openstack.org/>`_
is a good starting point, with
`OpenStack Docs <http://docs.openstack.org/>`_ and `OpenStack API
Docs <http://developer.openstack.org/>`_ providing technical
documentation about OpenStack. The `OpenStack
wiki <https://wiki.openstack.org/wiki/Main_Page>`_ contains a lot of
general information that cuts across the OpenStack projects, including a
list of `recommended
tools <https://wiki.openstack.org/wiki/OperationsTools>`_. Finally,
there are a number of blogs aggregated at `Planet
OpenStack <http://planet.openstack.org/>`_.

=======
Preface
=======

OpenStack is an open source platform that lets you build an
:term:`Infrastructure-as-a-Service (IaaS)<IaaS>` cloud that runs on commodity
hardware.

Introduction to OpenStack
~~~~~~~~~~~~~~~~~~~~~~~~~

OpenStack believes in open source, open design, and open development,
all in an open community that encourages participation by anyone. The
long-term vision for OpenStack is to produce a ubiquitous open source
cloud computing platform that meets the needs of public and private
cloud providers regardless of size. OpenStack services control large
pools of compute, storage, and networking resources throughout a data
center.

The technology behind OpenStack consists of a series of interrelated
projects delivering various components for a cloud infrastructure
solution. Each service provides an open API so that all of these
resources can be managed through a dashboard that gives administrators
control while empowering users to provision resources through a web
interface, a command-line client, or software development kits that
support the API. Many OpenStack APIs are extensible, meaning you can
keep compatibility with a core set of calls while providing access to
more resources and innovating through API extensions. The OpenStack
project is a global collaboration of developers and cloud computing
technologists. The project produces an open standard cloud computing
platform for both public and private clouds. By focusing on ease of
implementation, massive scalability, a variety of rich features, and
tremendous extensibility, the project aims to deliver a practical and
reliable cloud solution for all types of organizations.

Getting Started with OpenStack
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

As an open source project, one of the unique aspects of OpenStack is
that it has many different levels at which you can begin to engage with
it; you don't have to do everything yourself.

Using OpenStack
---------------

You could ask, "Do I even need to build a cloud?" If you want to start
using a compute or storage service by just swiping your credit card, you
can go to eNovance, HP, Rackspace, or other organizations to start using
their public OpenStack clouds. Using their OpenStack cloud resources is
similar to accessing the publicly available Amazon Web Services Elastic
Compute Cloud (EC2) or Simple Storage Solution (S3).

Plug and Play OpenStack
-----------------------

However, the enticing part of OpenStack might be to build your own
private cloud, and there are several ways to accomplish this goal.
Perhaps the simplest of all is an appliance-style solution. You purchase
an appliance, unpack it, plug in the power and the network, and watch it
transform into an OpenStack cloud with minimal additional configuration.

However, hardware choice is important for many applications, so if that
applies to you, consider that there are several software distributions
available that you can run on servers, storage, and network products of
your choosing. Canonical (where OpenStack replaced Eucalyptus as the
default cloud option in 2011), Red Hat, and SUSE offer enterprise
OpenStack solutions and support. You may also want to take a look at
some of the specialized distributions, such as those from Rackspace,
Piston, SwiftStack, or Cloudscaling.

Alternatively, if you want someone to help guide you through the
decisions about the underlying hardware or your applications, perhaps
adding in a few features or integrating components along the way,
consider contacting one of the system integrators with OpenStack
experience, such as Mirantis or Metacloud.

If your preference is to build your own OpenStack expertise internally,
a good way to kick-start that might be to attend or arrange a training
session. The OpenStack Foundation has a `Training
Marketplace <http://www.openstack.org/marketplace/training>`_ where you
can look for nearby events. Also, the OpenStack community is `working to
produce <https://wiki.openstack.org/wiki/Training-guides>`_ open source
training materials.

Roll Your Own OpenStack
-----------------------

However, this guide has a different audience: those seeking flexibility
from the OpenStack framework by deploying do-it-yourself solutions.

OpenStack is designed for horizontal scalability, so you can easily add
new compute, network, and storage resources to grow your cloud over
|
||||||
|
time. In addition to the pervasiveness of massive OpenStack public
|
||||||
|
clouds, many organizations, such as PayPal, Intel, and Comcast, build
|
||||||
|
large-scale private clouds. OpenStack offers much more than a typical
|
||||||
|
software package because it lets you integrate a number of different
|
||||||
|
technologies to construct a cloud. This approach provides great
|
||||||
|
flexibility, but the number of options might be daunting at first.
|
||||||
|
|
||||||
|
Who This Book Is For
|
||||||
|
~~~~~~~~~~~~~~~~~~~~
|
||||||
|
|
||||||
|
This book is for those of you starting to run OpenStack clouds as well
|
||||||
|
as those of you who were handed an operational one and want to keep it
|
||||||
|
running well. Perhaps you're on a DevOps team, perhaps you are a system
|
||||||
|
administrator starting to dabble in the cloud, or maybe you want to get
|
||||||
|
on the OpenStack cloud team at your company. This book is for all of
|
||||||
|
you.
|
||||||
|
|
||||||
|
This guide assumes that you are familiar with a Linux distribution that
|
||||||
|
supports OpenStack, SQL databases, and virtualization. You must be
|
||||||
|
comfortable administering and configuring multiple Linux machines for
|
||||||
|
networking. You must install and maintain an SQL database and
|
||||||
|
occasionally run queries against it.
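As a minimal sketch of the kind of ad-hoc query this implies, the
following uses an in-memory SQLite database and a simplified,
hypothetical ``instances`` table as stand-ins for the MySQL or
PostgreSQL database and service schema a real deployment would have:

```python
import sqlite3

# Stand-in database: real OpenStack services use MySQL or PostgreSQL,
# and their schemas are larger; this table is illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instances (uuid TEXT, host TEXT, vm_state TEXT)")
conn.executemany(
    "INSERT INTO instances VALUES (?, ?, ?)",
    [("a1", "compute-01", "active"),
     ("b2", "compute-01", "error"),
     ("c3", "compute-02", "active")],
)

# The sort of question an operator asks daily: how many instances
# does each compute node carry?
rows = conn.execute(
    "SELECT host, COUNT(*) FROM instances GROUP BY host ORDER BY host"
).fetchall()
print(rows)  # [('compute-01', 2), ('compute-02', 1)]
```

If writing and reading a query like this feels routine, you have the
database background this guide assumes.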

One of the most complex aspects of an OpenStack cloud is the networking
configuration. You should be familiar with concepts such as DHCP, Linux
bridges, VLANs, and iptables. You must also have access to a network
hardware expert who can configure the switches and routers required in
your OpenStack cloud.
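As a small illustration of the subnet arithmetic that sits behind those
concepts, Python's standard ``ipaddress`` module can answer the
questions that come up while debugging addressing and DHCP ranges (the
CIDR and addresses below are illustrative, not from any particular
deployment):

```python
import ipaddress

# An example tenant subnet; a DHCP agent (dnsmasq) would hand out
# addresses from the usable host range of this network.
net = ipaddress.ip_network("10.1.0.0/24")
hosts = list(net.hosts())

print(net.netmask)          # 255.255.255.0
print(hosts[0], hosts[-1])  # 10.1.0.1 10.1.0.254

# Does a VM's reported address actually fall inside the subnet?
print(ipaddress.ip_address("10.1.0.42") in net)  # True
```

Being comfortable with this level of reasoning, and with the Linux
tools that act on it, is assumed throughout the networking chapters.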

.. note::

   Cloud computing is quite an advanced topic, and this book requires a
   lot of background knowledge. However, if you are fairly new to cloud
   computing, we recommend that you make use of the :doc:`common/glossary`
   at the back of the book, as well as the online documentation for OpenStack
   and additional resources mentioned in this book in :doc:`app_resources`.

Further Reading
---------------

There are other books on the `OpenStack documentation
website <http://docs.openstack.org>`_ that can help you get the job
done.

OpenStack Installation Guides
   Describes a manual installation process, as in, by hand, without
   automation, for multiple distributions based on a packaging system:

   - `Installation Guide for openSUSE 13.2 and SUSE Linux Enterprise
     Server 12 <http://docs.openstack.org/liberty/install-guide-obs/>`_

   - `Installation Guide for Red Hat Enterprise Linux 7 and CentOS
     7 <http://docs.openstack.org/liberty/install-guide-rdo/>`_

   - `Installation Guide for Ubuntu 14.04 (LTS)
     Server <http://docs.openstack.org/liberty/install-guide-ubuntu/>`_

`OpenStack Configuration Reference <http://docs.openstack.org/liberty/config-reference/content/>`_
   Contains a reference listing of all configuration options for core
   and integrated OpenStack services by release version

`OpenStack Administrator Guide <http://docs.openstack.org/admin-guide/>`_
   Contains how-to information for managing an OpenStack cloud as
   needed for your use cases, such as storage, computing, or
   software-defined-networking

`OpenStack High Availability Guide <http://docs.openstack.org/ha-guide/index.html>`_
   Describes potential strategies for making your OpenStack services
   and related controllers and data stores highly available

`OpenStack Security Guide <http://docs.openstack.org/sec/>`_
   Provides best practices and conceptual information about securing an
   OpenStack cloud

`Virtual Machine Image Guide <http://docs.openstack.org/image-guide/>`_
   Shows you how to obtain, create, and modify virtual machine images
   that are compatible with OpenStack

`OpenStack End User Guide <http://docs.openstack.org/user-guide/>`_
   Shows OpenStack end users how to create and manage resources in an
   OpenStack cloud with the OpenStack dashboard and OpenStack client
   commands

`Networking Guide <http://docs.openstack.org/networking-guide/>`_
   This guide targets OpenStack administrators seeking to deploy and
   manage OpenStack Networking (neutron).

`OpenStack API Guide <http://developer.openstack.org/api-guide/quick-start/>`_
   A brief overview of how to send REST API requests to endpoints for
   OpenStack services

How This Book Is Organized
~~~~~~~~~~~~~~~~~~~~~~~~~~

This book is organized into two parts: the architecture decisions for
designing OpenStack clouds and the repeated operations for running
OpenStack clouds.

**Part I:**

:doc:`arch_examples`
   Because of all the decisions the other chapters discuss, this
   chapter describes the decisions made for this particular book and
   much of the justification for the example architecture.

:doc:`arch_provision`
   While this book doesn't describe installation, we do recommend
   automation for deployment and configuration, discussed in this
   chapter.

:doc:`arch_cloud_controller`
   The cloud controller is an invention for the sake of consolidating
   and describing which services run on which nodes. This chapter
   discusses hardware and network considerations as well as how to
   design the cloud controller for performance and separation of
   services.

:doc:`arch_compute_nodes`
   This chapter describes the compute nodes, which are dedicated to
   running virtual machines. Some hardware choices come into play here,
   as well as logging and networking descriptions.

:doc:`arch_scaling`
   This chapter discusses the growth of your cloud resources through
   scaling and segregation considerations.

:doc:`arch_storage`
   As with other architecture decisions, storage concepts within
   OpenStack offer many options. This chapter lays out the choices for
   you.

:doc:`arch_network_design`
   Your OpenStack cloud networking needs to fit into your existing
   networks while also enabling the best design for your users and
   administrators, and this chapter gives you in-depth information
   about networking decisions.

**Part II:**

:doc:`ops_lay_of_the_land`
   This chapter is written to let you get your hands wrapped around
   your OpenStack cloud through command-line tools and understanding
   what is already set up in your cloud.

:doc:`ops_projects_users`
   This chapter walks through user-enabling processes that all admins
   must face to manage users, give them quotas to parcel out resources,
   and so on.

:doc:`ops_user_facing_operations`
   This chapter shows you how to use OpenStack cloud resources and how
   to train your users.

:doc:`ops_maintenance`
   This chapter goes into the common failures that the authors have
   seen while running clouds in production, including troubleshooting.

:doc:`ops_network_troubleshooting`
   Because network troubleshooting is especially difficult with virtual
   resources, this chapter is chock-full of helpful tips and tricks for
   tracing network traffic, finding the root cause of networking
   failures, and debugging related services, such as DHCP and DNS.

:doc:`ops_logging_monitoring`
   This chapter shows you where OpenStack places logs and how to best
   read and manage logs for monitoring purposes.

:doc:`ops_backup_recovery`
   This chapter describes what you need to back up within OpenStack as
   well as best practices for recovering backups.

:doc:`ops_customize`
   For readers who need to get a specialized feature into OpenStack,
   this chapter describes how to use DevStack to write custom
   middleware or a custom scheduler to rebalance your resources.

:doc:`ops_upstream`
   Because OpenStack is so, well, open, this chapter is dedicated to
   helping you navigate the community and find out where you can help
   and where you can get help.

:doc:`ops_advanced_configuration`
   Much of OpenStack is driver-oriented, so you can plug in different
   solutions to the base set of services. This chapter describes some
   advanced configuration topics.

:doc:`ops_upgrades`
   This chapter provides upgrade information based on the architectures
   used in this book.

**Back matter:**

:doc:`app_usecases`
   You can read a small selection of use cases from the OpenStack
   community with some technical details and further resources.

:doc:`app_crypt`
   These are shared legendary tales of image disappearances, VM
   massacres, and crazy troubleshooting techniques that result in
   hard-learned lessons and wisdom.

:doc:`app_roadmaps`
   Read about how to track the OpenStack roadmap through the open and
   transparent development processes.

:doc:`app_resources`
   So many OpenStack resources are available online because of the
   fast-moving nature of the project, but there are also resources
   listed here that the authors found helpful while learning
   themselves.

:doc:`common/glossary`
   A list of terms used in this book is included, which is a subset of
   the larger OpenStack glossary available online.

Why and How We Wrote This Book
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We wrote this book because we have deployed and maintained OpenStack
clouds for at least a year and we wanted to share this knowledge with
others. After months of being the point people for an OpenStack cloud,
we also wanted to have a document to hand to our system administrators
so that they'd know how to operate the cloud on a daily basis—both
reactively and proactively. We wanted to provide more detailed
technical information about the decisions that deployers make along the
way.

We wrote this book to help you:

- Design and create an architecture for your first nontrivial OpenStack
  cloud. After you read this guide, you'll know which questions to ask
  and how to organize your compute, networking, and storage resources
  and the associated software packages.

- Perform the day-to-day tasks required to administer a cloud.

We wrote this book in a book sprint, which is a facilitated, rapid
development production method for books. For more information, see the
`BookSprints site <http://www.booksprints.net/>`_. Your authors cobbled
this book together in five days during February 2013, fueled by caffeine
and the best takeout food that Austin, Texas, could offer.

On the first day, we filled white boards with colorful sticky notes to
start to shape this nebulous book about how to architect and operate
clouds.

We wrote furiously from our own experiences and bounced ideas between
each other. At regular intervals we reviewed the shape and organization
of the book and further molded it, leading to what you see today.

The team includes:

Tom Fifield
   After learning about scalability in computing from particle physics
   experiments, such as ATLAS at the Large Hadron Collider (LHC) at
   CERN, Tom worked on OpenStack clouds in production to support the
   Australian public research sector. Tom currently serves as an
   OpenStack community manager and works on OpenStack documentation in
   his spare time.

Diane Fleming
   Diane works on the OpenStack API documentation tirelessly. She
   helped out wherever she could on this project.

Anne Gentle
   Anne is the documentation coordinator for OpenStack and also served
   as an individual contributor to the Google Documentation Summit in
   2011, working with the Open Street Maps team. She has worked on book
   sprints in the past, with FLOSS Manuals' Adam Hyde facilitating.
   Anne lives in Austin, Texas.

Lorin Hochstein
   An academic turned software-developer-slash-operator, Lorin worked
   as the lead architect for Cloud Services at Nimbis Services, where
   he deploys OpenStack for technical computing applications. He has
   been working with OpenStack since the Cactus release. Previously, he
   worked on high-performance computing extensions for OpenStack at
   University of Southern California's Information Sciences Institute
   (USC-ISI).

Adam Hyde
   Adam facilitated this book sprint. He also founded the book sprint
   methodology and is the most experienced book-sprint facilitator
   around. See http://www.booksprints.net for more information. Adam
   founded FLOSS Manuals—a community of some 3,000 individuals
   developing Free Manuals about Free Software. He is also the founder
   and project manager for Booktype, an open source project for
   writing, editing, and publishing books online and in print.

Jonathan Proulx
   Jon has been piloting an OpenStack cloud as a senior technical
   architect at the MIT Computer Science and Artificial Intelligence
   Lab for his researchers to have as much computing power as they
   need. He started contributing to OpenStack documentation and
   reviewing the documentation so that he could accelerate his
   learning.

Everett Toews
   Everett is a developer advocate at Rackspace making OpenStack and
   the Rackspace Cloud easy to use. Sometimes developer, sometimes
   advocate, and sometimes operator, he's built web applications,
   taught workshops, given presentations around the world, and deployed
   OpenStack for production use by academia and business.

Joe Topjian
   Joe has designed and deployed several clouds at Cybera, a nonprofit
   where they are building e-infrastructure to support entrepreneurs
   and local researchers in Alberta, Canada. He also actively maintains
   and operates these clouds as a systems architect, and his
   experiences have generated a wealth of troubleshooting skills for
   cloud environments.

OpenStack community members
   Many individual efforts keep a community book alive. Our community
   members updated content for this book year-round. Also, a year after
   the first sprint, Jon Proulx hosted a second two-day mini-sprint at
   MIT with the goal of updating the book for the latest release. Since
   the book's inception, more than 30 contributors have supported this
   book. We have a tool chain for reviews, continuous builds, and
   translations. Writers and developers continuously review patches,
   enter doc bugs, edit content, and fix doc bugs. We want to recognize
   their efforts!

   The following people have contributed to this book: Akihiro Motoki,
   Alejandro Avella, Alexandra Settle, Andreas Jaeger, Andy McCallum,
   Benjamin Stassart, Chandan Kumar, Chris Ricker, David Cramer, David
   Wittman, Denny Zhang, Emilien Macchi, Gauvain Pocentek, Ignacio
   Barrio, James E. Blair, Jay Clark, Jeff White, Jeremy Stanley, K
   Jonathan Harker, KATO Tomoyuki, Lana Brindley, Laura Alves, Lee Li,
   Lukasz Jernas, Mario B. Codeniera, Matthew Kassawara, Michael Still,
   Monty Taylor, Nermina Miller, Nigel Williams, Phil Hopkins, Russell
   Bryant, Sahid Orentino Ferdjaoui, Sandy Walsh, Sascha Peilicke, Sean
   M. Collins, Sergey Lukjanov, Shilla Saebi, Stephen Gordon, Summer
   Long, Uwe Stuehler, Vaibhav Bhatkar, Veronica Musso, Ying Chun
   "Daisy" Guo, Zhengguang Ou, and ZhiQiang Fan.

How to Contribute to This Book
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The genesis of this book was an in-person event, but now that the book
is in your hands, we want you to contribute to it. OpenStack
documentation follows the coding principles of iterative work, with bug
logging, investigating, and fixing. We also store the source content on
GitHub and invite collaborators through the OpenStack Gerrit
installation, which offers reviews. For the O'Reilly edition of this
book, we are using the company's Atlas system, which also stores source
content on GitHub and enables collaboration among contributors.

Learn more about how to contribute to the OpenStack docs at the `OpenStack
Documentation Contributor
Guide <http://docs.openstack.org/contributor-guide/>`_.

If you find a bug and can't fix it or aren't sure it's really a doc bug,
log a bug at `OpenStack
Manuals <https://bugs.launchpad.net/openstack-manuals>`_. Tag the bug
under Extra options with the ``ops-guide`` tag to indicate that the bug
is in this guide. You can assign the bug to yourself if you know how to
fix it. Also, a member of the OpenStack doc-core team can triage the doc
bug.

Conventions Used in This Book
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The following typographical conventions are used in this book:

*Italic*
   Indicates new terms, URLs, email addresses, filenames, and file
   extensions.

``Constant width``
   Used for program listings, as well as within paragraphs to refer to
   program elements such as variable or function names, databases, data
   types, environment variables, statements, and keywords.

``Constant width bold``
   Shows commands or other text that should be typed literally by the
   user.

*Constant width italic*
   Shows text that should be replaced with user-supplied values or by
   values determined by context.

Command prompts
   Commands prefixed with the ``#`` prompt should be executed by the
   ``root`` user. These examples can also be executed using the
   :command:`sudo` command, if available.

   Commands prefixed with the ``$`` prompt can be executed by any user,
   including ``root``.

.. tip::

   This element signifies a tip or suggestion.

.. note::

   This element signifies a general note.

.. warning::

   This element indicates a warning or caution.

See also:

.. toctree::

   common/conventions.rst

@ -22,7 +22,8 @@ done
 # Draft guides
 # This includes guides that we publish from stable branches
 # as versioned like the networking-guide.
-for guide in networking-guide arch-design-draft config-reference; do
+for guide in networking-guide arch-design-draft config-reference \
+    ops-guide; do
     tools/build-rst.sh doc/$guide --build build \
         --target "draft/$guide" $LINKCHECK
 done

@ -31,8 +31,9 @@ function copy_to_branch {
     cp -a publish-docs/draft/* publish-docs/$BRANCH/
     # We don't need this file
     rm -f publish-docs/$BRANCH/draft-index.html
-    # We don't need Contributor Guide
-    rm -rf publish-docs/$BRANCH/contributor-guide
+    # We don't need these draft guides on the branch
+    rm -rf publish-docs/$BRANCH/arch-design-draft
+    rm -rf publish-docs/$BRANCH/ops-guide

     for f in $(find publish-docs/$BRANCH -name "atom.xml"); do
         sed -i -e "s|/draft/|/$BRANCH/|g" $f