Not long ago we enabled a rally scenario booting VMs in the neutron gate
so we can collect osprofiler reports about it. The rally scenario we
re-used had previously only been triggered by changes in the
rally-openstack repo, so I had no data about its failure rate. Now that it's running
frequently in the neutron gate this scenario actually seems to be quite
unstable (usually timing out while waiting for the VM to get to ACTIVE):
http://logstash.openstack.org/#/dashboard/file/logstash.json?query=message:\"rally.exceptions.TimeoutException: Rally tired waiting\" AND build_name:\"neutron-rally-task\" AND voting:1&from=864000s
Since we only want to run this scenario for the osprofiler report
we can get rid of the gate instability by allowing a 100% failure rate
in the scenario SLA.
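As a rough illustration of what allowing a 100% failure rate means, the
sketch below mimics a maximum-failure-rate check in plain Python. The
function name failure_rate_ok, the shape of the sla dict and the sample
numbers are assumptions made for illustration; this is not Rally's own
implementation.

```python
# Hypothetical stand-in for a max-failure-rate SLA check; not Rally's code.
def failure_rate_ok(failed, total, max_percent):
    """Return True if the share of failed iterations is within max_percent."""
    if total == 0:
        return True  # nothing ran, so nothing failed
    return 100.0 * failed / total <= max_percent


# SLA section as it might look for the VM boot workload (assumed shape).
sla = {"failure_rate": {"max": 100}}

# Even if every iteration times out, the SLA check still passes.
print(failure_rate_ok(failed=5, total=5,
                      max_percent=sla["failure_rate"]["max"]))
```

With a maximum of 0 instead, a single failed iteration would fail the check.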
Change-Id: Ied354e8242274c8eeb26909e29afbe6d41662bfc
Related-Change: https://review.opendev.org/662804
Since the trunk scenario is now present in the rally-openstack
tree, it is no longer needed in the neutron tree.
Change-Id: I27c9f0baed267ca8bd181d34842b9d5cc03ab846
This commit makes use of recently merged functionality in rally
to create multiple security group rules per security group and list
them. This will help us identify API performance regressions with
respect to security group rules.
Change-Id: I92ac9785d2403c00b873e68f30062f9796b9ac8b
The option to define the name of the floating network makes it
easier to use the task file as is.
Change-Id: Icaf9ca6b337a500d3f76521f94244f1932d0e09b
Signed-off-by: Juha Kosonen <juha.kosonen@nokia.com>
The Rally team moved the OpenStack plugins into a separate repository; the
in-tree code is now deprecated and will be removed soon.
This patch changes several imports to use the latest available code.
Change-Id: I901ceb685e75d905578135fdf9f1b08ba3ea7223
The Rally team finally added native Zuul V3 jobs (with a bunch of separate
roles, etc.), and to simplify maintenance it would be nice
to use them.
Change-Id: I755e776a7c24e1bcdf144d7af071a52633aeb94d
Validators like:
* rally.task.validation.required_openstack
* rally.task.validation.required_services
have been deprecated since Rally 0.10.0 [1].
Instead of being called directly, they should be applied via the
new decorator 'rally.common.validation.add'. This commit
switches to using the validators that way.
[1] https://tinyurl.com/y94sfct2
Change-Id: I2883eb94e2532a10160305b283e2d64b93443909
When at least one service named as q-* is present in ENABLED_SERVICES,
then devstack utilizes lib/neutron-legacy to configure services
regardless of how other services are deployed (e.g. with lib/neutron).
This breaks deployment using lib/neutron.
Switching to the new names doesn't change anything substantial because
the devstack plugin handles both variants equally. It does, however,
allow the new devstack neutron library to be used.
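To make the condition concrete, here is a minimal sketch that scans an
ENABLED_SERVICES value for q-* names. The helper name
legacy_neutron_services and the sample service list are hypothetical;
devstack itself makes this decision in shell, not in Python.

```python
# Hypothetical helper; devstack performs the real check in its shell code.
def legacy_neutron_services(enabled_services):
    """Return the q-* names found in a comma-separated ENABLED_SERVICES value."""
    names = [s.strip() for s in enabled_services.split(",") if s.strip()]
    return [s for s in names if s.startswith("q-")]


# Sample value (made up): one leftover q-* name is enough to select
# lib/neutron-legacy for the whole deployment.
mixed = "neutron-api,neutron-agent,q-dhcp"
print(legacy_neutron_services(mixed))
```

An empty result means no legacy name is left to trigger lib/neutron-legacy.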
Change-Id: Id0d35523651131766a70e78bf130205c1c63acd5
The new task format was introduced recently. It unifies different
sections and tries to make things a bit simpler.
A Rally task consists of subtasks; there must be at least one.
A subtask is a group of workloads. Soon, it will be possible to define
a single SLA for all workloads in a subtask and, beyond that, to execute
contexts once for a group of workloads (i.e. create temporary users and
networks once per group rather than once per workload).
A workload is a combination of different plugins to be executed for a
test. The most important are the Scenario plugin (what will be executed in
each iteration), the runner (how the load should be generated) and the
contexts (what resources should be pre-created before the workload).
One scenario with different runners/contexts can create different load.
To distinguish them, a new "description" section was added to the
workload. It allows adding a custom description for a workload, which
will be displayed in the report files. If the "description" section is
missing, the description of the scenario is used.
Also, note that the "failure_rate: 0" SLA is now the default, so
there is no need to specify it.
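The structure described above can be sketched as a Python dict. The
"version" and "title" keys, the scenario arguments, the runner values
and the context contents are illustrative assumptions, and
workload_description is a hypothetical helper that only mirrors the
fallback rule for a missing "description".

```python
# Illustrative task in the new format; all concrete values are made up.
task = {
    "version": 2,
    "title": "Sample task",
    "subtasks": [
        {
            "title": "Network workloads",
            "workloads": [
                {
                    "description": "List networks under moderate load",
                    "scenario": {
                        "NeutronNetworks.create_and_list_networks": {}
                    },
                    "runner": {"constant": {"times": 8, "concurrency": 4}},
                    "contexts": {
                        "users": {"tenants": 1, "users_per_tenant": 1}
                    },
                },
            ],
        }
    ],
}


def workload_description(workload, scenario_docs):
    """Hypothetical helper: use the workload's description, else the scenario's."""
    if "description" in workload:
        return workload["description"]
    scenario_name = next(iter(workload["scenario"]))
    return scenario_docs.get(scenario_name, "")


first = task["subtasks"][0]["workloads"][0]
print(workload_description(first, {}))
```

No "sla" section appears in the workload because, as noted, a failure
rate of 0 is the default.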
Change-Id: If99e8c722d9ccb18b8b9d7e12214e76e483a2016
Since I9d3bafa075631a3f48cbd3627a4cc1a5a859cce2 in rally, platform
should be part of context name (context@platform).
Otherwise, a warning message (or even breakage because of
I10ac687f9f420dcf0d907b51d5d9303f68d35719) may be triggered.
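The naming rule can be illustrated with a small helper. The function
qualify_context_name and the default platform value are assumptions
made for this sketch; they are not part of Rally.

```python
# Hypothetical helper showing the context@platform naming convention.
def qualify_context_name(name, platform="openstack"):
    """Append @platform unless the name already carries a platform."""
    return name if "@" in name else f"{name}@{platform}"


contexts = {"users": {"tenants": 1}, "quotas@openstack": {}}
qualified = {qualify_context_name(k): v for k, v in contexts.items()}
print(sorted(qualified))
```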
Change-Id: I6f84282d22d13d36dcba221ab9d94c3fda95f130
This test is executed 4 times, and creating 1000 Neutron ports
just to run this scenario 4 times results in this job taking
15 minutes to complete its iterations.
This cuts the count in half to 125 per execution to cut the run
time in half and ensure we don't get too close to the gate timeout.
This was done once before as part of 817a19c4b9, but because it was
bundled with an SLA change, it was reverted along with
that SLA change.
Change-Id: I61466d87b002252efc163cbb5d03eafc5d4da3fb
This reverts commit 817a19c4b9.
This apparently hits us in the gate, where I can see at least one
failure with a scenario taking 5.44 secs.
Change-Id: Ied0516686e86167a0c7b1d480eb1db2789f7cada
New validation now enforces that times is >= concurrency because times
is the total number of *runs*, not the number of concurrent *run sets*.
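A minimal sketch of the constraint, assuming a runner described only by
its times and concurrency values. The function validate_runner is
hypothetical and is not the validator Rally ships.

```python
# Hypothetical check mirroring the rule "times >= concurrency".
def validate_runner(times, concurrency):
    """Raise ValueError if fewer total runs are requested than parallel runs."""
    if times < concurrency:
        raise ValueError(
            f"times ({times}) must be >= concurrency ({concurrency})"
        )
    return True


validate_runner(times=8, concurrency=4)  # valid: 8 runs, 4 at a time

try:
    validate_runner(times=4, concurrency=20)  # invalid: only 4 runs in total
except ValueError as exc:
    print(exc)
```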
Change-Id: I454ef821e00bd5123a9640f472ad4b034dbec75e
Closes-Bug: #1680580
In a normal gate run these are returning in 2 seconds each
on average. Let's reduce the SLA of these from 15 to 5 now
to help prevent future performance regressions in this area.
Change-Id: Iae174c95d214c83d6726a3d3bd339dde7886af4f
During a normal run, the top three scenarios account for slightly
more than a half hour of runtime. Sample numbers:
Scenario                                      Load Duration  Full Duration
NeutronNetworks.create_and_update_subnets           562.189      1,182.400
NeutronTrunks.create_and_list_trunk_subports        427.475        600.721
NeutronNetworks.create_and_list_ports               310.167        540.144
This patch reduces the resources created by each of the 3 by 75%. This
should save us an additional ~20 minutes during a normal gate run, which
should change our window for timeout from approximately a half hour on a
normal node to about 50 minutes.
This additional buffer should hopefully be enough to reduce the failure
rate for the rally job when it gets scheduled to a slow node.
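For reference, the sample numbers in the table above can be totalled
with a few lines of Python. The durations are assumed to be in seconds,
and the variable names are illustrative only.

```python
# Sample numbers from the table above; units assumed to be seconds.
durations = {
    "NeutronNetworks.create_and_update_subnets": (562.189, 1182.400),
    "NeutronTrunks.create_and_list_trunk_subports": (427.475, 600.721),
    "NeutronNetworks.create_and_list_ports": (310.167, 540.144),
}

total_load = sum(load for load, _ in durations.values())
total_full = sum(full for _, full in durations.values())

print(f"load: {total_load / 60:.1f} min, full: {total_full / 60:.1f} min")
```

The full durations add up to roughly 38.7 minutes, which matches the
"slightly more than a half hour" figure, while the load durations alone
come to roughly 21.7 minutes.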
Change-Id: I923b625f7dd3ebf794b6a9e097f5ed12ce446bb5
This adds a basic rally scenario to create a trunk
with a bunch of subports so we can keep an eye on the
performance of the trunk API.
Change-Id: I12aaf6121b677e9696131601b3539a7091e2858c
This allows us to configure neutron when running the rally job in
the gate. This effort stems from patch [1]. Blame Kevin for not
wanting to squash the two together.
[1] I12aaf6121b677e9696131601b3539a7091e2858c
Change-Id: I006957784ac7900021bcfee57cbc83b5a6c533c4
The previous configuration of the task was taking up
to a half hour to run. Between this task and the others,
it was eating up all of our gate time, leaving no room
to add new jobs.
This cuts the number of runs by a factor of 5, from 40
to 8. This still gives us a reasonable number to get an
average from, especially since each run creates 100 ports.
Change-Id: I0955e44df1a9e072c58fdacc337121b8621132df
Previous runs are showing that creating ports under this high load
ends up taking >5 secs per port on average. Let's set to 4, which is
still double the api_worker count in some cases for the current gate.
Change-Id: I05a0d28f5b035684e07288825f5b704a843dc9d7
Quoting the quota devref:
"""
For a reservation to be successful, the total amount of resources requested,
plus the total amount of resources reserved, plus the total amount of resources
already stored in the database should not exceed the project's quota limit.
"""
This means that in the absolute worst case scenario with 20 concurrent
workers, 19 could have made reservations, committed resources, but not
yet cleared their reservation. Because of the outstanding reservation
and the resources created by the 19 workers, they will all be
double-counted until their reservation is cleared (or it expires).
This adjusts the rally scenarios to handle the double-count for
concurrency.
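The worst case above can be turned into a small calculation. This is a
simplified model under stated assumptions: every worker requests the
same number of resources per iteration, and all other workers still
hold uncleared reservations when the last request arrives. The function
name and the sample numbers are hypothetical.

```python
# Simplified worst-case model of the double-count; not Neutron's quota code.
def worst_case_quota(total_resources, per_iteration, concurrency):
    """Quota needed so the final request still fits under double counting.

    At the final request: stored = total - per_iteration,
    reserved by the other workers = (concurrency - 1) * per_iteration,
    requested = per_iteration.
    """
    stored = total_resources - per_iteration
    reserved = (concurrency - 1) * per_iteration
    requested = per_iteration
    return stored + reserved + requested


# 20 concurrent workers creating 100 resources in total, one per iteration.
print(worst_case_quota(total_resources=100, per_iteration=1, concurrency=20))
```

Under this model the quota must exceed the number of resources actually
created by the reservations of the other 19 workers.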
Related-Bug: #1623390
Change-Id: I4808a92e7e6067aeeb62fc3b3d7f7ac71b179c44
This increases the rally port and network count to 100
and enables quotas to exercise the quota engine to
better simulate a real system.
Additionally, it reduces the SLA requirements because of
regressions that have snuck in throughout the cycle. As
they are fixed these should be reduced back down.
Change-Id: I042d64245b1e4486334996d834ad31561613fa50
Increase the ports per network in the create and list ports
test. This also adds a max average time SLA so we can catch
regressions in performance.
Change-Id: I2e7e3fd7406db77c8e44dce2ab0b4594ff6f2db9
* Since 24 Nov 2014 we have added a lot of Neutron benchmarks.
Running more Neutron-related benchmarks in the Neutron gate helps
to avoid performance regressions and races.
* Neutron benchmarks are described here:
https://github.com/stackforge/rally/blob/master/rally/benchmark/scenarios/neutron/network.py
The code is quite simple; feel free to take a look.
* All changes in concurrency and times are aimed at balancing
duration against usefulness.
* To get the description of a benchmark, use:
rally info find NeutronNetworks.create_and_update_networks
New benchmarks:
- NeutronNetworks.create_and_update_networks
- NeutronNetworks.create_and_delete_networks
- NeutronNetworks.create_and_update_subnets
- NeutronNetworks.create_and_delete_subnets
- NeutronNetworks.create_and_update_routers
- NeutronNetworks.create_and_delete_routers
- NeutronNetworks.create_and_list_routers
- NeutronNetworks.create_and_update_ports
- NeutronNetworks.create_and_delete_ports
- NeutronNetworks.create_and_list_ports
- Quotas.neutron_update
Related-Bug: #1419723
Change-Id: Ie3c84e057fc96c0f35ad77b7297c564442ebcf10
*) Rename rally-scenarios, which is quite misleading, to rally-jobs.
rally-jobs makes much more sense because it actually contains files
related to the rally job
*) Update rally-jobs/README.rst to add more info
*) Update rally-jobs/plugins/README.rst to explain plugins
*) Add new directory rally-jobs/extra; this directory is copied
into the gates and can be used for files that are required by some of
the benchmarks
Change-Id: I6d0c0435a4bb4658ddf4adb871bc36ab8c157f3e