Neutron control plane performance and agent restart

The patch set contains the test plan and the report. Change-Id: I5236b1a3e13b2e54666457c0f271af4246e9b60d
2016-09-30 19:17:14 +03:00
parent 9494d3b4b2
commit 89f62fa919
16 changed files with 14776 additions and 0 deletions
--- a/doc/source/test_plans/neutron_features/agent_restart/plan.rst
+++ b/doc/source/test_plans/neutron_features/agent_restart/plan.rst
@@ -0,0 +1,145 @@
+.. _neutron_agent_restart_test_plan:
+
+=============================================================
+OpenStack Neutron Control Plane Performance and Agent Restart
+=============================================================
+
+:status: **draft**
+:version: 1.0
+
+
+Test Plan
+=========
+
+Neutron server is the core of the Neutron control plane. It processes requests
+from the public API and the internal RPC API. The latter is used to communicate
+with agents. Normally, RPC notifies agents about updated configuration.
+However, after an agent restart or a communication failure, the agent requests
+all data from the server, and the amount of data may be significant.
+
+The goal of this test plan is to measure how restarting a group of agents
+affects the performance of the Neutron control plane.
+
+
+Test Environment
+----------------
+
+Preparation
+^^^^^^^^^^^
+
+This test plan is performed against an existing OpenStack cloud.
+
+
+Environment description
+^^^^^^^^^^^^^^^^^^^^^^^
+
+The environment description includes the hardware specification of servers,
+network parameters, operating system, and OpenStack deployment characteristics.
+
+Hardware
+~~~~~~~~
+
+This section contains a list of all types of hardware nodes.
+
++-----------+-------+----------------------------------------------------+
+| Parameter | Value | Comments                                           |
++-----------+-------+----------------------------------------------------+
+| model     |       | e.g. Supermicro X9SRD-F                            |
++-----------+-------+----------------------------------------------------+
+| CPU       |       | e.g. 6 x Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz |
++-----------+-------+----------------------------------------------------+
+| role      |       | e.g. compute or network                            |
++-----------+-------+----------------------------------------------------+
+
+Network
+~~~~~~~
+
+This section contains a list of interfaces and network parameters.
+For complicated cases, this section may include a topology diagram and switch
+parameters.
+
++------------------+-------+-------------------------+
+| Parameter        | Value | Comments                |
++------------------+-------+-------------------------+
+| network role     |       | e.g. provider or public |
++------------------+-------+-------------------------+
+| card model       |       | e.g. Intel              |
++------------------+-------+-------------------------+
+| driver           |       | e.g. ixgbe              |
++------------------+-------+-------------------------+
+| speed            |       | e.g. 10G or 1G          |
++------------------+-------+-------------------------+
+| MTU              |       | e.g. 9000               |
++------------------+-------+-------------------------+
+| offloading modes |       | e.g. default            |
++------------------+-------+-------------------------+
+
+Software
+~~~~~~~~
+
+This section describes installed software.
+
++-----------------+-------+---------------------------+
+| Parameter       | Value | Comments                  |
++-----------------+-------+---------------------------+
+| OS              |       | e.g. Ubuntu 14.04.3       |
++-----------------+-------+---------------------------+
+| OpenStack       |       | e.g. Liberty              |
++-----------------+-------+---------------------------+
+| Hypervisor      |       | e.g. KVM                  |
++-----------------+-------+---------------------------+
+| Neutron plugin  |       | e.g. ML2 + OVS            |
++-----------------+-------+---------------------------+
+| L2 segmentation |       | e.g. VLAN or VxLAN or GRE |
++-----------------+-------+---------------------------+
+| virtual routers |       | HA                        |
++-----------------+-------+---------------------------+
+
+Test Case: mass restart of agents
+---------------------------------
+
+Description
+^^^^^^^^^^^
+
+Measurements can be performed using the methodology described in
+:ref:`reliability_testing_version_2`. The following metrics need to be
+collected:
+
+.. list-table::
+   :header-rows: 1
+
+   *
+     - Priority
+     - Value
+     - Measurement Unit
+     - Description
+   *
+     - 1
+     - Service downtime
+     - sec
+     - How long the service was not available and operations were in an error
+       state.
+   *
+     - 1
+     - MTTR
+     - sec
+     - How long it takes to recover service performance after the failure.
+   *
+     - 1
+     - Operation Degradation
+     - sec
+     - The mean difference between operation performance during the recovery
+       period and operation performance when the service operates normally.
+   *
+     - 1
+     - Operation Degradation Ratio
+     - ratio
+     - The ratio between operation performance during the recovery period and
+       operation performance when the service operates normally.
+
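The four metrics listed above can be illustrated with a minimal Python sketch. It is not the implementation behind :ref:`reliability_testing_version_2`; the sample layout ``(start_time, duration, ok)``, the function name ``recovery_metrics`` and the 3-sigma slowness threshold are all assumptions made for illustration.

```python
from statistics import mean, stdev


def recovery_metrics(samples, fault_time, sigmas=3.0):
    """Derive the four plan metrics from timed samples (illustrative only).

    samples: list of (start_time, duration, ok) tuples sorted by start_time.
    fault_time: moment of the fault injection, same clock as start_time.
    sigmas: assumed threshold; a sample slower than mean + sigmas * stdev
            of the baseline counts as degraded.
    At least two successful samples before fault_time are required.
    """
    baseline = [d for t, d, ok in samples if ok and t < fault_time]
    after = [(t, d, ok) for t, d, ok in samples if t >= fault_time]
    base_mean = mean(baseline)
    threshold = base_mean + sigmas * stdev(baseline)

    # Service downtime: span from the first failed operation to the end
    # of the last failed operation.
    failed = [(t, d) for t, d, ok in after if not ok]
    downtime = failed[-1][0] + failed[-1][1] - failed[0][0] if failed else 0.0

    # Recovery period: from the fault until the last failed or slow
    # operation has finished.
    degraded = [(t, d) for t, d, ok in after if not ok or d > threshold]
    recovery_end = degraded[-1][0] + degraded[-1][1] if degraded else fault_time

    # Successful operations that started inside the recovery period.
    recovering = [d for t, d, ok in after if ok and t < recovery_end]
    rec_mean = mean(recovering) if recovering else base_mean

    return {
        "downtime": downtime,
        "mttr": recovery_end - fault_time,
        "degradation": rec_mean - base_mean,
        "degradation_ratio": rec_mean / base_mean,
    }


if __name__ == "__main__":
    demo = [(0, 1.0, True), (1, 1.1, True), (2, 0.9, True),
            (5, 2.0, False), (7, 3.0, True), (10, 1.0, True)]
    print(recovery_metrics(demo, fault_time=5))
```

With this definition a run without failed or slow operations yields zero downtime, zero MTTR and a degradation ratio of 1.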
+
+Reports
+=======
+
+Test plan execution reports:
+ * :ref:`neutron_agent_restart_test_report`
--- a/doc/source/test_plans/neutron_features/index.rst
+++ b/doc/source/test_plans/neutron_features/index.rst
@@ -11,3 +11,4 @@ Neutron features test plans

    l3_ha/plan
    resource_density/plan
+    agent_restart/plan
--- a/doc/source/test_results/neutron_features/agent_restart/create_and_delete_networks_with_restart_neutron_l3_agent_service/index.rst
+++ b/doc/source/test_results/neutron_features/agent_restart/create_and_delete_networks_with_restart_neutron_l3_agent_service/index.rst
@@ -0,0 +1,77 @@
+Network operations and L3 agent restart
+=======================================
+
+In this scenario we restart all L3 agents while Neutron creates and deletes
+networks.
+
+This report is generated from the results collected by executing the following
+Rally scenario:
+
+.. code-block:: yaml
+
+    ---
+      NeutronNetworks.create_and_delete_networks:
+        -
+          args:
+            network_create_args: {}
+          runner:
+            type: "constant_for_duration"
+            duration: 120
+            concurrency: 4
+          context:
+            users:
+              tenants: 1
+              users_per_tenant: 1
+            quotas:
+              neutron:
+                network: -1
+          hooks:
+            -
+              name: fault_injection
+              args:
+                action: restart neutron-l3-agent service
+              trigger:
+                name: event
+                args:
+                  unit: iteration
+                  at: [100]
+    
+
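The reports in this change use the same task layout and differ only in the scenario name, the restarted service and a few arguments. As a hedged illustration, the task above could be generated with a small Python helper. The helper name ``build_task`` and its parameters are hypothetical; the keys and values are copied from the YAML above, and the result is emitted as JSON, which Rally accepts for task files alongside YAML.

```python
import json


def build_task(scenario, agent_service, hook_at, duration=120, concurrency=4,
               extra_args=None, extra_quotas=None):
    """Build a Rally task dict mirroring the YAML layout used in these reports.

    build_task and its parameter names are hypothetical helpers for this
    sketch; only the resulting keys come from the task shown above.
    """
    args = {"network_create_args": {}}
    args.update(extra_args or {})
    quotas = {"network": -1}
    quotas.update(extra_quotas or {})
    return {
        scenario: [{
            "args": args,
            "runner": {
                "type": "constant_for_duration",
                "duration": duration,
                "concurrency": concurrency,
            },
            "context": {
                "users": {"tenants": 1, "users_per_tenant": 1},
                "quotas": {"neutron": quotas},
            },
            "hooks": [{
                "name": "fault_injection",
                "args": {"action": "restart %s service" % agent_service},
                "trigger": {
                    "name": "event",
                    "args": {"unit": "iteration", "at": [hook_at]},
                },
            }],
        }]
    }


if __name__ == "__main__":
    task = build_task("NeutronNetworks.create_and_delete_networks",
                      "neutron-l3-agent", hook_at=100)
    print(json.dumps(task, indent=2))
```

Passing ``extra_args`` and ``extra_quotas`` would cover the port and subnet variants, for example ``extra_args={"port_create_args": {}, "ports_per_network": 10}`` with ``extra_quotas={"port": -1}``.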
+Summary
+-------
+
+
+
+No errors or performance degradation were observed.
+
+
+
+Details
+-------
+
+This section contains individual data for particular scenario runs.
+
+
+
+Run #1
+^^^^^^
+
+.. image:: plot_1.svg
+
+Baseline
+~~~~~~~~
+
+Baseline samples are collected before the start of fault injection. They are
+used to estimate service performance degradation after the fault.
+
++-----------+-------------+-----------+-----------+---------------------+
+|   Samples |   Median, s |   Mean, s |   Std dev |   95% percentile, s |
++===========+=============+===========+===========+=====================+
+|        85 |        0.36 |       0.4 |     0.068 |                0.52 |
++-----------+-------------+-----------+-----------+---------------------+
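For readers who want to recompute a baseline row from raw iteration durations, here is a minimal Python sketch. The function names are hypothetical, and the choice of the sample standard deviation and a linearly interpolated percentile is an assumption; the tool that produced the table above may use different estimators.

```python
import math
from statistics import mean, median, stdev


def percentile(values, p):
    """Percentile with linear interpolation between closest ranks, p in [0, 100]."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile() needs at least one value")
    rank = (len(ordered) - 1) * p / 100.0
    low, high = math.floor(rank), math.ceil(rank)
    return ordered[low] + (ordered[high] - ordered[low]) * (rank - low)


def baseline_summary(durations):
    """Summarise durations (seconds) collected before the fault injection.

    Needs at least two durations because of the sample standard deviation.
    """
    return {
        "samples": len(durations),
        "median": median(durations),
        "mean": mean(durations),
        "std_dev": stdev(durations),
        "p95": percentile(durations, 95),
    }


if __name__ == "__main__":
    print(baseline_summary([0.31, 0.35, 0.36, 0.38, 0.52, 0.41]))
```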
+
+
+
+
+
--- a/doc/source/test_results/neutron_features/agent_restart/create_and_delete_networks_with_restart_neutron_l3_agent_service/plot_1.svg
+++ b/doc/source/test_results/neutron_features/agent_restart/create_and_delete_networks_with_restart_neutron_l3_agent_service/plot_1.svg
--- a/doc/source/test_results/neutron_features/agent_restart/create_and_delete_networks_with_restart_neutron_openvswitch_agent_service/index.rst
+++ b/doc/source/test_results/neutron_features/agent_restart/create_and_delete_networks_with_restart_neutron_openvswitch_agent_service/index.rst
@@ -0,0 +1,76 @@
+Network operations and OVS agent restart
+========================================
+
+In this scenario we restart all OVS agents while Neutron creates and deletes
+networks.
+
+This report is generated from the results collected by executing the following
+Rally scenario:
+
+.. code-block:: yaml
+
+    ---
+      NeutronNetworks.create_and_delete_networks:
+        -
+          args:
+            network_create_args: {}
+          runner:
+            type: "constant_for_duration"
+            duration: 120
+            concurrency: 4
+          context:
+            users:
+              tenants: 1
+              users_per_tenant: 1
+            quotas:
+              neutron:
+                network: -1
+          hooks:
+            -
+              name: fault_injection
+              args:
+                action: restart neutron-openvswitch-agent service
+              trigger:
+                name: event
+                args:
+                  unit: iteration
+                  at: [100]
+    
+
+Summary
+-------
+
+
+
+No errors or performance degradation were observed.
+
+
+
+Details
+-------
+
+This section contains individual data for particular scenario runs.
+
+
+
+Run #1
+^^^^^^
+
+.. image:: plot_1.svg
+
+Baseline
+~~~~~~~~
+
+Baseline samples are collected before the start of fault injection. They are
+used to estimate service performance degradation after the fault.
+
++-----------+-------------+-----------+-----------+---------------------+
+|   Samples |   Median, s |   Mean, s |   Std dev |   95% percentile, s |
++===========+=============+===========+===========+=====================+
+|        86 |        0.38 |       0.4 |     0.063 |                 0.5 |
++-----------+-------------+-----------+-----------+---------------------+
+
+
+
+
+
--- a/doc/source/test_results/neutron_features/agent_restart/create_and_delete_networks_with_restart_neutron_openvswitch_agent_service/plot_1.svg
+++ b/doc/source/test_results/neutron_features/agent_restart/create_and_delete_networks_with_restart_neutron_openvswitch_agent_service/plot_1.svg
--- a/doc/source/test_results/neutron_features/agent_restart/create_and_delete_ports_with_restart_neutron_l3_agent_service/index.rst
+++ b/doc/source/test_results/neutron_features/agent_restart/create_and_delete_ports_with_restart_neutron_l3_agent_service/index.rst
@@ -0,0 +1,79 @@
+Port operations and L3 agent restart
+====================================
+
+In this scenario we restart all L3 agents while Neutron creates and deletes
+ports.
+
+This report is generated from the results collected by executing the following
+Rally scenario:
+
+.. code-block:: yaml
+
+    ---
+      NeutronNetworks.create_and_delete_ports:
+        -
+          args:
+            network_create_args: {}
+            port_create_args: {}
+            ports_per_network: 10
+          runner:
+            type: "constant_for_duration"
+            duration: 300
+            concurrency: 6
+          context:
+            users:
+              tenants: 1
+              users_per_tenant: 1
+            quotas:
+              neutron:
+                network: -1
+                port: -1
+          hooks:
+            -
+              name: fault_injection
+              args:
+                action: restart neutron-l3-agent service
+              trigger:
+                name: event
+                args:
+                  unit: iteration
+                  at: [80]
+    
+
+Summary
+-------
+
+
+
+No errors or performance degradation were observed.
+
+
+
+Details
+-------
+
+This section contains individual data for particular scenario runs.
+
+
+
+Run #1
+^^^^^^
+
+.. image:: plot_1.svg
+
+Baseline
+~~~~~~~~
+
+Baseline samples are collected before the start of fault injection. They are
+used to estimate service performance degradation after the fault.
+
++-----------+-------------+-----------+-----------+---------------------+
+|   Samples |   Median, s |   Mean, s |   Std dev |   95% percentile, s |
++===========+=============+===========+===========+=====================+
+|        63 |         8.5 |       8.5 |       0.4 |                 9.3 |
++-----------+-------------+-----------+-----------+---------------------+
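The baseline median also allows a rough sanity check of when the fault is injected. The sketch below is an estimate only: it assumes that the ``concurrency`` workers run iterations back to back at a steady rate equal to the baseline median, which real runs only approximate. Both function names are hypothetical.

```python
def estimate_hook_time(at_iteration, median_s, concurrency):
    """Approximate seconds from task start until iteration `at_iteration` begins.

    Assumes steady state: `concurrency` iterations run in parallel and each
    takes about `median_s` seconds.
    """
    return at_iteration * median_s / concurrency


def estimate_total_iterations(duration_s, median_s, concurrency):
    """Approximate number of iterations that fit into the runner duration."""
    return int(duration_s * concurrency / median_s)


if __name__ == "__main__":
    # Values taken from the task and baseline table of this report.
    print(estimate_hook_time(at_iteration=80, median_s=8.5, concurrency=6))
    print(estimate_total_iterations(duration_s=300, median_s=8.5, concurrency=6))
```

For this report the estimate places the restart a little under two minutes into the 300-second run, which leaves time both for baseline samples and for observing recovery.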
+
+
+
+
+
--- a/doc/source/test_results/neutron_features/agent_restart/create_and_delete_ports_with_restart_neutron_l3_agent_service/plot_1.svg
+++ b/doc/source/test_results/neutron_features/agent_restart/create_and_delete_ports_with_restart_neutron_l3_agent_service/plot_1.svg
--- a/doc/source/test_results/neutron_features/agent_restart/create_and_delete_ports_with_restart_neutron_openvswitch_agent_service/index.rst
+++ b/doc/source/test_results/neutron_features/agent_restart/create_and_delete_ports_with_restart_neutron_openvswitch_agent_service/index.rst
@@ -0,0 +1,79 @@
+Port operations and OVS agent restart
+=====================================
+
+In this scenario we restart all OVS agents while Neutron creates and deletes
+ports.
+
+This report is generated from the results collected by executing the following
+Rally scenario:
+
+.. code-block:: yaml
+
+    ---
+      NeutronNetworks.create_and_delete_ports:
+        -
+          args:
+            network_create_args: {}
+            port_create_args: {}
+            ports_per_network: 10
+          runner:
+            type: "constant_for_duration"
+            duration: 300
+            concurrency: 4
+          context:
+            users:
+              tenants: 1
+              users_per_tenant: 1
+            quotas:
+              neutron:
+                network: -1
+                port: -1
+          hooks:
+            -
+              name: fault_injection
+              args:
+                action: restart neutron-openvswitch-agent service
+              trigger:
+                name: event
+                args:
+                  unit: iteration
+                  at: [80]
+    
+
+Summary
+-------
+
+
+
+No errors or performance degradation were observed.
+
+
+
+Details
+-------
+
+This section contains individual data for particular scenario runs.
+
+
+
+Run #1
+^^^^^^
+
+.. image:: plot_1.svg
+
+Baseline
+~~~~~~~~
+
+Baseline samples are collected before the start of fault injection. They are
+used to estimate service performance degradation after the fault.
+
++-----------+-------------+-----------+-----------+---------------------+
+|   Samples |   Median, s |   Mean, s |   Std dev |   95% percentile, s |
++===========+=============+===========+===========+=====================+
+|        65 |         8.7 |       8.8 |      0.31 |                 9.3 |
++-----------+-------------+-----------+-----------+---------------------+
+
+
+
+
+
--- a/doc/source/test_results/neutron_features/agent_restart/create_and_delete_ports_with_restart_neutron_openvswitch_agent_service/plot_1.svg
+++ b/doc/source/test_results/neutron_features/agent_restart/create_and_delete_ports_with_restart_neutron_openvswitch_agent_service/plot_1.svg
--- a/doc/source/test_results/neutron_features/agent_restart/create_and_delete_subnets_with_restart_neutron_l3_agent_service/index.rst
+++ b/doc/source/test_results/neutron_features/agent_restart/create_and_delete_subnets_with_restart_neutron_l3_agent_service/index.rst
@@ -0,0 +1,80 @@
+Subnet operations and L3 agent restart
+======================================
+
+In this scenario we restart all L3 agents while Neutron creates and deletes
+subnets.
+
+This report is generated from the results collected by executing the following
+Rally scenario:
+
+.. code-block:: yaml
+
+    ---
+      NeutronNetworks.create_and_delete_subnets:
+        -
+          args:
+            network_create_args: {}
+            subnet_create_args: {}
+            subnet_cidr_start: "1.1.0.0/28"
+            subnets_per_network: 2
+          runner:
+            type: "constant_for_duration"
+            duration: 120
+            concurrency: 4
+          context:
+            users:
+              tenants: 1
+              users_per_tenant: 1
+            quotas:
+              neutron:
+                network: -1
+                subnet: -1
+          hooks:
+            -
+              name: fault_injection
+              args:
+                action: restart neutron-l3-agent service
+              trigger:
+                name: event
+                args:
+                  unit: iteration
+                  at: [100]
+    
+
+Summary
+-------
+
+
+
+No errors or performance degradation were observed.
+
+
+
+Details
+-------
+
+This section contains individual data for particular scenario runs.
+
+
+
+Run #1
+^^^^^^
+
+.. image:: plot_1.svg
+
+Baseline
+~~~~~~~~
+
+Baseline samples are collected before the start of fault injection. They are
+used to estimate service performance degradation after the fault.
+
++-----------+-------------+-----------+-----------+---------------------+
+|   Samples |   Median, s |   Mean, s |   Std dev |   95% percentile, s |
++===========+=============+===========+===========+=====================+
+|        85 |           2 |         2 |      0.16 |                 2.4 |
++-----------+-------------+-----------+-----------+---------------------+
+
+
+
+
+
--- a/doc/source/test_results/neutron_features/agent_restart/create_and_delete_subnets_with_restart_neutron_l3_agent_service/plot_1.svg
+++ b/doc/source/test_results/neutron_features/agent_restart/create_and_delete_subnets_with_restart_neutron_l3_agent_service/plot_1.svg
--- a/doc/source/test_results/neutron_features/agent_restart/create_and_delete_subnets_with_restart_neutron_openvswitch_agent_service/index.rst
+++ b/doc/source/test_results/neutron_features/agent_restart/create_and_delete_subnets_with_restart_neutron_openvswitch_agent_service/index.rst
@@ -0,0 +1,80 @@
+Subnet operations and OVS agent restart
+=======================================
+
+In this scenario we restart all OVS agents while Neutron creates and deletes
+subnets.
+
+This report is generated from the results collected by executing the following
+Rally scenario:
+
+.. code-block:: yaml
+
+    ---
+      NeutronNetworks.create_and_delete_subnets:
+        -
+          args:
+            network_create_args: {}
+            subnet_create_args: {}
+            subnet_cidr_start: "1.1.0.0/28"
+            subnets_per_network: 2
+          runner:
+            type: "constant_for_duration"
+            duration: 120
+            concurrency: 4
+          context:
+            users:
+              tenants: 1
+              users_per_tenant: 1
+            quotas:
+              neutron:
+                network: -1
+                subnet: -1
+          hooks:
+            -
+              name: fault_injection
+              args:
+                action: restart neutron-openvswitch-agent service
+              trigger:
+                name: event
+                args:
+                  unit: iteration
+                  at: [100]
+    
+
+Summary
+-------
+
+
+
+No errors or performance degradation were observed.
+
+
+
+Details
+-------
+
+This section contains individual data for particular scenario runs.
+
+
+
+Run #1
+^^^^^^
+
+.. image:: plot_1.svg
+
+Baseline
+~~~~~~~~
+
+Baseline samples are collected before the start of fault injection. They are
+used to estimate service performance degradation after the fault.
+
++-----------+-------------+-----------+-----------+---------------------+
+|   Samples |   Median, s |   Mean, s |   Std dev |   95% percentile, s |
++===========+=============+===========+===========+=====================+
+|        85 |         1.3 |       1.4 |      0.14 |                 1.6 |
++-----------+-------------+-----------+-----------+---------------------+
+
+
+
+
+
--- a/doc/source/test_results/neutron_features/agent_restart/create_and_delete_subnets_with_restart_neutron_openvswitch_agent_service/plot_1.svg
+++ b/doc/source/test_results/neutron_features/agent_restart/create_and_delete_subnets_with_restart_neutron_openvswitch_agent_service/plot_1.svg
--- a/doc/source/test_results/neutron_features/agent_restart/index.rst
+++ b/doc/source/test_results/neutron_features/agent_restart/index.rst
@@ -0,0 +1,50 @@
+.. _neutron_agent_restart_test_report:
+
+=========================================================================
+OpenStack Neutron Control Plane Performance and Agent Restart Test Report
+=========================================================================
+
+This report is generated for :ref:`neutron_agent_restart_test_plan`.
+
+Environment description
+=======================
+
+Cluster description
+-------------------
+* 3 controllers
+* 3 compute nodes
+
+Software versions
+-----------------
+
+**OpenStack/System**:
+  Fuel/MOS 9.0, Ubuntu 14.04, Linux kernel 3.13, OVS 2.4.1
+**Networking**:
+  Neutron ML2 + OVS plugin, DVR, L2pop, MTU 1500
+
+Hardware configuration of each server
+-------------------------------------
+
+Description of server hardware:
+
+**Compute Vendor**:
+    HP ProLiant DL380 Gen9
+**CPU**:
+    2 x Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz (48 cores)
+**RAM**:
+    256 GB
+**NIC**:
+    2 x Intel Corporation Ethernet 10G 2P X710
+
+
+Reports
+=======
+
+Reports were collected on an OpenStack cloud with 100 instances, 100 routers,
+and 100 networks.
+
+.. toctree::
+    :glob:
+    :maxdepth: 1
+
+    */index
--- a/doc/source/test_results/neutron_features/index.rst
+++ b/doc/source/test_results/neutron_features/index.rst
@@ -12,3 +12,4 @@ Neutron features scale testing
    l3_ha/test_results_liberty
    l3_ha/test_results_mitaka
    resource_density/index
+    agent_restart/index