Finding a Keystone bug while benchmarking 20 node HA cloud performance at creating 400 VMs

(Contributed by Alexander Maretskiy, Mirantis)

Below we describe how we found a bug in Keystone and, after fixing it, achieved a 2x improvement in average performance when booting Nova servers. Our initial goal was to benchmark booting a large number of servers on a cluster (running a custom build of Mirantis OpenStack v5.1) and to ensure that this operation performs reasonably and completes without errors.


Goals:

  • Get data on how the cluster behaves when a large number of servers are started
  • Get data on how well the Neutron component performs in this case


Test summary:

  • 400 servers are created with configured networking
  • Servers are created concurrently, 5 at a time


The benchmark ran on a real hardware lab with 20 nodes:


12 cores, Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz

RAM 32GB (4 x Samsung DDRIII 8GB)


This cluster was created via the Fuel Dashboard interface.

| Deployment        | Custom build of Mirantis OpenStack v5.1 |
| OpenStack release | Icehouse                                |
| Operating System  | Ubuntu 12.04.4                          |
| Mode              | High availability                       |
| Hypervisor        | KVM                                     |
| Networking        | Neutron with GRE segmentation           |
| Controller nodes  | 3                                       |
| Compute nodes     | 17                                      |



For this benchmark, we used a customized Rally with the following patch applied:



Rally was deployed for the cluster using the ExistingCloud deployment type.

Server flavor:

$ nova flavor-show ram64
| Property                   | Value                                |
| OS-FLV-DISABLED:disabled   | False                                |
| OS-FLV-EXT-DATA:ephemeral  | 0                                    |
| disk                       | 0                                    |
| extra_specs                | {}                                   |
| id                         | 2e46aba0-9e7f-4572-8b0a-b12cfe7e06a1 |
| name                       | ram64                                |
| os-flavor-access:is_public | True                                 |
| ram                        | 64                                   |
| rxtx_factor                | 1.0                                  |
| swap                       |                                      |
| vcpus                      | 1                                    |

Server image:

$ nova image-show TestVM
| Property                   | Value                                           |
| OS-EXT-IMG-SIZE:size       | 13167616                                        |
| created                    | 2014-08-21T11:18:49Z                            |
| id                         | 7a0d90cb-4372-40ef-b711-8f63b0ea9678            |
| metadata murano_image_info | {"title": "Murano Demo", "type": "cirros.demo"} |
| minDisk                    | 0                                               |
| minRam                     | 64                                              |
| name                       | TestVM                                          |
| progress                   | 100                                             |
| status                     | ACTIVE                                          |
| updated                    | 2014-08-21T11:18:50Z                            |

Task configuration file (in JSON format):

   {
       "NovaServers.boot_server": [
           {
               "args": {
                   "flavor": {
                       "name": "ram64"
                   },
                   "image": {
                       "name": "TestVM"
                   }
               },
               "runner": {
                   "type": "constant",
                   "concurrency": 5,
                   "times": 400
               },
               "context": {
                   "neutron_network": {
                       "network_ip_version": 4
                   },
                   "users": {
                       "concurrent": 30,
                       "users_per_tenant": 5,
                       "tenants": 5
                   },
                   "quotas": {
                       "neutron": {
                           "subnet": -1,
                           "port": -1,
                           "network": -1,
                           "router": -1
                       }
                   }
               }
           }
       ]
   }

The only difference between the first and second runs is that runner.times was set to 500 for the first run.
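As an illustration, the first-run variant can be derived from the configuration above by changing a single field. The sketch below is an assumption about how one might script this in Python; the helper name `with_times` is hypothetical and not part of Rally.

```python
import copy
import json

# Task configuration for the second run, as given in the text above.
second_run = {
    "NovaServers.boot_server": [
        {
            "args": {
                "flavor": {"name": "ram64"},
                "image": {"name": "TestVM"},
            },
            "runner": {"type": "constant", "concurrency": 5, "times": 400},
            "context": {
                "neutron_network": {"network_ip_version": 4},
                "users": {"concurrent": 30, "users_per_tenant": 5, "tenants": 5},
                "quotas": {
                    "neutron": {"subnet": -1, "port": -1, "network": -1, "router": -1}
                },
            },
        }
    ]
}


def with_times(task, times):
    """Return a deep copy of `task` with runner.times replaced everywhere."""
    variant = copy.deepcopy(task)
    for scenario_runs in variant.values():
        for run in scenario_runs:
            run["runner"]["times"] = times
    return variant


# The first run used the same task with runner.times set to 500.
first_run = with_times(second_run, 500)
print(json.dumps(first_run["NovaServers.boot_server"][0]["runner"], indent=4))
```

Copying the task before modifying it keeps both variants available, so the two runs can be written to separate JSON files and compared.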


First run, in which the bug was found:

Starting from the 142nd server, novaclient returned an error: Error <class 'novaclient.exceptions.Unauthorized'>: Unauthorized (HTTP 401).

That is how the bug in Keystone was found.

| action           | min (sec) | avg (sec) | max (sec) | 90 percentile | 95 percentile | success | count |
| nova.boot_server | 6.507     | 17.402    | 100.303   | 39.222        | 50.134        | 26.8%   | 500   |
| total            | 6.507     | 17.402    | 100.303   | 39.222        | 50.134        | 26.8%   | 500   |

Second run, with the bug fix:

After a patch was applied (using RPC instead of the neutron client in the metadata agent), we got 100% success and a 2x improvement in average performance:

| action           | min (sec) | avg (sec) | max (sec) | 90 percentile | 95 percentile | success | count |
| nova.boot_server | 5.031     | 8.008     | 14.093    | 9.616         | 9.716         | 100.0%  | 400   |
| total            | 5.031     | 8.008     | 14.093    | 9.616         | 9.716         | 100.0%  | 400   |
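To make the comparison between the two runs explicit, the following short Python sketch computes the ratio of each reported timing before and after the fix, using only the figures from the result tables. The dictionary keys (`p90`, `p95`) are shorthand chosen here for the 90 and 95 percentile columns.

```python
# Timings in seconds, taken from the two result tables.
first_run = {"min": 6.507, "avg": 17.402, "max": 100.303, "p90": 39.222, "p95": 50.134}
second_run = {"min": 5.031, "avg": 8.008, "max": 14.093, "p90": 9.616, "p95": 9.716}

# Ratio > 1 means the second (patched) run was faster for that metric.
speedup = {metric: first_run[metric] / second_run[metric] for metric in first_run}

for metric, ratio in speedup.items():
    print(f"{metric}: {ratio:.2f}x")

# Number of successful boots in the first run: 26.8% of 500 iterations.
successful_first_run = round(0.268 * 500)
print("successful boots in first run:", successful_first_run)
```

The average boot time ratio comes out at about 2.17, which matches the 2x improvement stated above; the tail latencies (max and percentiles) improve by a larger factor.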