Merge "[Docs] Guide about running and debugging fullstack tests"
commit e95eef2866 (TESTING.rst)
@@ -234,83 +234,7 @@

that the test requires. Developers further benefit from full stack testing as
it can sufficiently simulate a real environment and provide a rapidly
reproducible way to verify code while you're still writing it.

More details can be found in the :ref:`Full Stack Testing <fullstack_testing>` guide.

API Tests
~~~~~~~~~
@@ -20,20 +20,262 @@

''''''' Heading 4
(Avoid deeper levels because they do not render well.)

.. _fullstack_testing:

Full Stack Testing
==================

How?
++++

Full stack tests set up their own Neutron processes (server and agents). They
assume working RabbitMQ and MySQL servers before the run starts. Instructions
on how to run fullstack tests on a VM are available below.

Each test defines its own topology (what servers and agents should be
running, and how many).
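Conceptually, such a per-test topology is a small declarative description. The sketch below is purely illustrative (the real tests declare topologies through fixture classes under ``neutron.tests.fullstack.resources``; the names and fields here are assumptions, not Neutron's actual API):

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical sketch of a fullstack topology description.
# The real fixtures live in neutron.tests.fullstack.resources.

@dataclass
class HostDescription:
    """One simulated node and the agents it should run."""
    l3_agent: bool = False
    dhcp_agent: bool = False

@dataclass
class EnvironmentDescription:
    """The whole topology a single test asks for."""
    network_type: str = "vxlan"
    hosts: List[HostDescription] = field(default_factory=list)

# "Two nodes, each running an L3 agent" would be described as:
env = EnvironmentDescription(
    hosts=[HostDescription(l3_agent=True) for _ in range(2)],
)
print(len(env.hosts))  # number of simulated nodes
```

Each test builds exactly the environment it needs, so a single-agent test stays cheap while a multi node test can ask for several hosts.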
Since the test runs on the machine itself, full stack testing enables
"white box" testing. This means that you can, for example, create a router
through the API and then assert that a namespace was created for it.
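That kind of white-box assertion can be sketched as follows. The ``qrouter-<router_id>`` namespace name is the L3 agent's convention; the helper itself is written for this guide only (the real tests use Neutron's ``ip_lib`` utilities rather than parsing ``ip netns list`` output directly):

```python
def router_namespace_exists(netns_listing, router_id):
    """White-box check sketch: the L3 agent creates a network
    namespace named 'qrouter-<router_id>' for each router it hosts.
    'netns_listing' is captured output of 'ip netns list'."""
    expected = "qrouter-%s" % router_id
    return any(line.split()[0] == expected
               for line in netns_listing.splitlines() if line.strip())

# Example against a captured listing (UUID is a made-up example):
listing = "qrouter-0a1b2c3d-4e5f-6a7b-8c9d-0e1f2a3b4c5d (id: 0)\n"
print(router_namespace_exists(
    listing, "0a1b2c3d-4e5f-6a7b-8c9d-0e1f2a3b4c5d"))
```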
Full stack tests run in the Neutron tree with Neutron resources alone. You
may use the Neutron API (the Neutron server is set to NOAUTH so that Keystone
is out of the picture). VMs may be simulated with a container-like class:
neutron.tests.fullstack.resources.machine.FakeFullstackMachine.
An example of its usage may be found at
neutron/tests/fullstack/test_connectivity.py.
Full stack testing can simulate multi node testing by starting an agent
multiple times. Specifically, each node would have its own copy of the
OVS/LinuxBridge/DHCP/L3 agents, all configured with the same "host" value.
Each OVS agent is connected to its own pair of br-int/br-ex, and those bridges
are then interconnected. For the LinuxBridge agent, each agent is started in
its own namespace, named "host-<some_random_value>". Such namespaces are
connected to each other by a "central" OVS bridge.

.. image:: images/fullstack_multinode_simulation.png

Segmentation at the database layer is guaranteed by creating a database
per test. The messaging layer achieves segmentation by utilizing a RabbitMQ
feature called 'vhosts'. In short, just as a MySQL server can serve multiple
databases, a RabbitMQ server can serve multiple messaging domains.
Exchanges and queues in one 'vhost' are segmented from those in another
'vhost'.
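The isolation idea can be pictured with a few lines of Python. This is a hedged sketch of the naming scheme only; the actual database and vhost setup lives in the fullstack test fixtures, and the name format below is an assumption:

```python
import uuid

def isolation_ids(test_name):
    """Sketch: every test run gets its own MySQL database and its own
    RabbitMQ vhost, so its data and messages are invisible to other
    tests running in parallel. Names here are illustrative only."""
    suffix = uuid.uuid4().hex[:8]
    return {
        "mysql_db": "%s_%s" % (test_name, suffix),
        "rabbit_vhost": "%s_%s" % (test_name, suffix),
    }

first = isolation_ids("test_qos")
second = isolation_ids("test_qos")  # even a re-run of the same test is isolated
print(first["mysql_db"] != second["mysql_db"])
```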
Please note that if the change you would like to test using fullstack tests
involves a change to python-neutronclient as well as neutron, then you should
make sure your fullstack tests are in a separate third change that depends on
the python-neutronclient change using the 'Depends-On' tag in the commit
message. You will need to wait for the next release of python-neutronclient,
and a minimum version bump for python-neutronclient in the global requirements,
before your fullstack tests will work in the gate. This is because tox uses
the version of python-neutronclient listed in the upper-constraints.txt file in
the openstack/requirements repository.
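Such a chain could look like the following commit-message footer on the fullstack-test change (the review URL and Change-Id below are placeholders, not a real change):

```text
Add fullstack tests for the new QoS client option

The tests exercise the option added in the python-neutronclient
change referenced below, so they must not merge before it.

Depends-On: https://review.opendev.org/c/openstack/python-neutronclient/+/000000

Change-Id: I0000000000000000000000000000000000000000
```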
When?
+++++

1) You'd like to test the interaction between Neutron components (server
   and agents) and have already tested each component in isolation via unit or
   functional tests. You should have many unit tests, fewer tests to test
   a component and even fewer to test their interaction. Edge cases should
   not be tested with full stack testing.
2) You'd like to increase coverage by testing features that require multi node
   testing such as l2pop, L3 HA and DVR.
3) You'd like to test agent restarts. We've found bugs in the OVS, DHCP and
   L3 agents and haven't found an effective way to test these scenarios. Full
   stack testing can help here as the full stack infrastructure can restart an
   agent during the test.
Example
+++++++

Neutron offers a Quality of Service API, initially offering bandwidth
capping at the port level. In the reference implementation, it does this by
utilizing an OVS feature.
neutron.tests.fullstack.test_qos.TestBwLimitQoSOvs.test_bw_limit_qos_policy_rule_lifecycle
is a positive example of how the fullstack testing infrastructure should be
used. It creates a network, a subnet, a QoS policy and rule, and a port
utilizing that policy. It then asserts that the expected bandwidth limitation
is present on the OVS bridge connected to that port. The test is a true
integration test, in the sense that it invokes the API and then asserts that
Neutron interacted with the hypervisor appropriately.
How to run fullstack tests locally?
+++++++++++++++++++++++++++++++++++

Fullstack tests can be run locally. That makes it much easier to understand
exactly how they work, debug issues in the existing tests or write new ones.
To run fullstack tests locally, you should clone the
`Devstack <https://opendev.org/openstack/devstack/>`_ and `Neutron
<https://opendev.org/openstack/neutron>`_ repositories. When the repositories
are available locally, the first thing which needs to be done is to prepare
the environment. There is a simple script in Neutron to do that.

.. code-block:: console

    $ export VENV=dsvm-fullstack
    $ tools/configure_for_func_testing.sh /opt/stack/devstack -i

This will prepare the needed files, install required packages, etc. When it is
done you should see a message like:

.. code-block:: console

    Phew, we're done!
That means that all went well and you should be ready to run fullstack tests
locally. Of course there are many tests there and running all of them can
take a pretty long time, so let's try to run just one:
.. code-block:: console

    $ tox -e dsvm-fullstack neutron.tests.fullstack.test_qos.TestBwLimitQoSOvs.test_bw_limit_qos_policy_rule_lifecycle
    dsvm-fullstack create: /opt/stack/neutron/.tox/dsvm-fullstack
    dsvm-fullstack installdeps: -chttps://releases.openstack.org/constraints/upper/master, -r/opt/stack/neutron/requirements.txt, -r/opt/stack/neutron/test-requirements.txt, -r/opt/stack/neutron/neutron/tests/functional/requirements.txt
    dsvm-fullstack develop-inst: /opt/stack/neutron
    {0} neutron.tests.fullstack.test_qos.TestBwLimitQoSOvs.test_bw_limit_qos_policy_rule_lifecycle(ingress) [40.395436s] ... ok
    {1} neutron.tests.fullstack.test_qos.TestBwLimitQoSOvs.test_bw_limit_qos_policy_rule_lifecycle(egress) [43.277898s] ... ok
    Stopping rootwrap daemon process with pid=12657
    Running upgrade for neutron ...
    OK
    /usr/lib/python3.8/subprocess.py:942: ResourceWarning: subprocess 13475 is still running
      _warn("subprocess %s is still running" % self.pid,
    ResourceWarning: Enable tracemalloc to get the object allocation traceback
    Stopping rootwrap daemon process with pid=12669
    Running upgrade for neutron ...
    OK
    /usr/lib/python3.8/subprocess.py:942: ResourceWarning: subprocess 13477 is still running
      _warn("subprocess %s is still running" % self.pid,
    ResourceWarning: Enable tracemalloc to get the object allocation traceback

    ======
    Totals
    ======
    Ran: 2 tests in 43.3367 sec.
    - Passed: 2
    - Skipped: 0
    - Expected Fail: 0
    - Unexpected Success: 0
    - Failed: 0
    Sum of execute time for each test: 83.6733 sec.

    ==============
    Worker Balance
    ==============
    - Worker 0 (1 tests) => 0:00:40.395436
    - Worker 1 (1 tests) => 0:00:43.277898
    ________________________________ summary ________________________________
    dsvm-fullstack: commands succeeded
    congratulations :)
That means that our test was run successfully.
Now you can start hacking, write new fullstack tests or debug failing ones as
needed.
Debugging tests locally
+++++++++++++++++++++++

If you need to debug a fullstack test locally you can use the ``remote_pdb``
module for that. First, you need to install the ``remote_pdb`` module in the
virtual environment created for fullstack testing by tox.

.. code-block:: console

    $ .tox/dsvm-fullstack/bin/pip install remote_pdb
Then you need to set a breakpoint in your code. For example, let's do that
in the
neutron.tests.fullstack.test_qos.TestBwLimitQoSOvs.test_bw_limit_qos_policy_rule_lifecycle
test:

.. code-block:: python

    def test_bw_limit_qos_policy_rule_lifecycle(self):
        import remote_pdb; remote_pdb.set_trace(port=1234)
        new_limit = BANDWIDTH_LIMIT + 100
Now you can run the test again:

.. code-block:: console

    $ tox -e dsvm-fullstack neutron.tests.fullstack.test_qos.TestBwLimitQoSOvs.test_bw_limit_qos_policy_rule_lifecycle

It will pause with a message like:

.. code-block:: console

    RemotePdb session open at 127.0.0.1:1234, waiting for connection ...

And now you can start debugging using the ``telnet`` tool:

.. code-block:: console

    $ telnet 127.0.0.1 1234
    Trying 127.0.0.1...
    Connected to 127.0.0.1.
    Escape character is '^]'.
    > /opt/stack/neutron/neutron/tests/fullstack/test_qos.py(208)test_bw_limit_qos_policy_rule_lifecycle()
    -> new_limit = BANDWIDTH_LIMIT + 100
    (Pdb)
From that point you can start debugging your code in the same way you
usually do with the ``pdb`` module.
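For example, the usual ``pdb`` commands work in that session (the variable name below just mirrors the breakpoint example above):

```text
(Pdb) l                  # list the source around the current line
(Pdb) n                  # execute the next line
(Pdb) p BANDWIDTH_LIMIT  # print a variable
(Pdb) c                  # continue the test run
```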
Checking test logs
++++++++++++++++++

Each fullstack test spawns its own isolated environment with the services it
needs: for example, ``neutron-server``, ``neutron-ovs-agent``
or ``neutron-dhcp-agent``. Often there is a need to check the logs of some of
those processes. That is of course possible when running fullstack tests
locally. By default, logs are stored in ``/opt/stack/logs/dsvm-fullstack-logs``.
The logs directory can be defined by the environment variable ``OS_LOG_PATH``.
In that directory there are directories with names matching the names of the
tests, for example:

.. code-block:: console

    $ ls -l
    total 224
    drwxr-xr-x 2 vagrant vagrant   4096 Nov 26 16:49 TestBwLimitQoSOvs.test_bw_limit_qos_policy_rule_lifecycle_egress_
    -rw-rw-r-- 1 vagrant vagrant  94928 Nov 26 16:50 TestBwLimitQoSOvs.test_bw_limit_qos_policy_rule_lifecycle_egress_.txt
    drwxr-xr-x 2 vagrant vagrant   4096 Nov 26 16:49 TestBwLimitQoSOvs.test_bw_limit_qos_policy_rule_lifecycle_ingress_
    -rw-rw-r-- 1 vagrant vagrant 121027 Nov 26 16:54 TestBwLimitQoSOvs.test_bw_limit_qos_policy_rule_lifecycle_ingress_.txt
For each test there is a directory and a txt file with the same name. The txt
file contains the log from the test runner, so you can check exactly what was
done by the test when it was run. This file accumulates logs from all runs of
the same test: if you run the test 10 times, you will have the logs from all
10 runs.
The directory with the same name contains the logs from the Neutron services
run during the test, for example:

.. code-block:: console

    $ ls -l TestBwLimitQoSOvs.test_bw_limit_qos_policy_rule_lifecycle_ingress_/
    total 1836
    -rw-rw-r-- 1 vagrant vagrant 333371 Nov 26 16:40 neutron-openvswitch-agent--2020-11-26--16-40-38-818499.log
    -rw-rw-r-- 1 vagrant vagrant 552097 Nov 26 16:53 neutron-openvswitch-agent--2020-11-26--16-49-29-716615.log
    -rw-rw-r-- 1 vagrant vagrant 461483 Nov 26 16:41 neutron-server--2020-11-26--16-40-35-875937.log
    -rw-rw-r-- 1 vagrant vagrant 526070 Nov 26 16:54 neutron-server--2020-11-26--16-49-26-758447.log
Here each file is from only one run and one service. The file name includes a
timestamp of when the service was started.
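That timestamp can be recovered programmatically from a file name like the ones in the listing above. The helper below is written for this guide, not part of Neutron:

```python
from datetime import datetime

def parse_service_log_name(filename):
    """Split a fullstack service log name such as
    'neutron-server--2020-11-26--16-40-35-875937.log' into the service
    name and the datetime when that service was started."""
    service, date_part, time_part = filename[:-len(".log")].split("--")
    hours, minutes, seconds, microseconds = time_part.split("-")
    started = datetime.strptime(
        "%s %s:%s:%s.%s" % (date_part, hours, minutes, seconds, microseconds),
        "%Y-%m-%d %H:%M:%S.%f",
    )
    return service, started

service, started = parse_service_log_name(
    "neutron-server--2020-11-26--16-40-35-875937.log")
print(service, started)  # neutron-server 2020-11-26 16:40:35.875937
```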
Debugging fullstack failures in the gate
++++++++++++++++++++++++++++++++++++++++

Sometimes there is a need to investigate why a test failed in the gate.
After every ``neutron-fullstack`` job run, logs are available on the Zuul job
page. In the directory ``controller/logs/dsvm-fullstack-logs`` you can find
exactly the same files with logs from each test case as mentioned above.

You can also check, for example, the journal log from the node where the tests
were run. All those logs are available in the file
``controller/logs/devstack.journal.xz`` in the job's logs.
In ``controller/logs/devstack.journal.README.txt`` there are also
instructions on how to download and check those journal logs locally.