46 Commits

Author SHA1 Message Date
melanie witt
9ec6afe893 Enable unified limits in the nova-next job
This configures the nova-next job to enable the new experimental
unified limits functionality.

This also adds basic testing for nova "global" limits in keystone in
the post-test-hook for the nova-next job. These can't be tested in
tempest because they involve modifying global quota limits that would
affect any other server tests running in parallel.

Related to blueprint unified-limits-nova

Depends-On: https://review.opendev.org/c/openstack/devstack/+/789962
Depends-On: https://review.opendev.org/c/openstack/tempest/+/790186
Depends-On: https://review.opendev.org/c/openstack/tempest/+/804311

Change-Id: I624b2684867305a9095e8964ead786c7b0c28242
2022-03-04 03:42:14 +00:00
Balazs Gibizer
5725297e12 Revert "Temp disable nova-manage placement heal_allocation testing"
This reverts commit 45e71fb9ceaf7fd89e63379aa65a8971b053a8b1.

Reason for revert: https://review.opendev.org/c/openstack/nova/+/802060 has landed now, so the heal allocation tests should work again

Change-Id: I829e7466cec3f768ad13dcf793315fb029126a2e
2021-11-02 07:05:52 +00:00
Balazs Gibizer
45e71fb9ce Temp disable nova-manage placement heal_allocation testing
Since I99a49b107b1872ddf83d1d8497a26a8d728feb07 the nova-next job fails
because I missed a dependency between that neutron patch and
https://review.opendev.org/c/openstack/nova/+/802060 . So this patch
disables testing until the nova adaptation lands.

Change-Id: Ic28ef83f5193e6c1fbac1577ef58fe0d9e45694d
2021-11-01 11:26:55 +00:00
Lee Yarwood
6aa580c15f gate: Remove test_evacuate.sh
I02b2b851a74f24816d2f782a66d94de81ee527b0 missed that it was also safe
to remove this now unused script from the codebase.

Change-Id: I5fa6819008a7359ba0bc3d3e98c19721589b831a
2021-06-15 17:35:20 +01:00
Lee Yarwood
91e53e4c2b zuul: Replace grenade and nova-grenade-multinode with grenade-multinode
If2608406776e0d5a06b726e65b55881e70562d18 dropped the single node
grenade job from the integrated-gate-compute template as it duplicates
the existing grenade-multinode job. However it doesn't remove the
remaining single node grenade job still present in the Nova project.

This change replaces the dsvm based nova-grenade-multinode job with the
zuulv3 native grenade-multinode based job.

Various legacy playbooks and hook scripts are also removed as they are
no longer used. Note that this does result in a loss of coverage for
ceph that should be replaced as soon as a zuulv3 native ceph based
multinode job is available.

Change-Id: I02b2b851a74f24816d2f782a66d94de81ee527b0
2021-04-29 11:05:58 +01:00
Lucas Alvares Gomes
20a7c98eff [OVN] Adapt the live-migration job scripts to work with OVN
There's no q-agt service in an OVN deployment.

Change-Id: Ia25c966c70542bcd02f5540b5b94896c17e49888
Signed-off-by: Lucas Alvares Gomes <lucasagomes@gmail.com>
2021-03-15 09:41:03 +00:00
Zuul
5d910b6695 Merge "Revert "Temporarily disable parts of heal port allocation test""
2020-12-16 18:37:40 +00:00
Lee Yarwood
eaa98e3340 nova-grenade-multinode: Skip test_live_block_migration_paused
As discussed in the bug the only advice we've been given from the
libvirt and QEMU teams is to avoid this in Bionic where QEMU is using
the legacy -drive architecture. This is passing in the new zuulv3 Focal
based job so just skip the test here for the time being in grenade.

Change-Id: I1aeab16e2b8d907a114ed22c7e716f534fe1b129
Related-Bug: #1901739
2020-12-11 12:04:24 +00:00
Balazs Gibizer
528740128a Revert "Temporarily disable parts of heal port allocation test"
This reverts commit d309e3cdf52bb4f8800c59f295a9c2ffb4069369.

Depends-On: https://review.opendev.org/756892

Change-Id: I8f789cf6d5df368e9a8e12b5e10ce555d2ef9416
2020-11-18 11:36:35 +00:00
Balazs Gibizer
d309e3cdf5 Temporarily disable parts of heal port allocation test
Due to bug 1894825 the nova-next job fails on master as the allocation
key cannot be deleted from the binding:profile of a neutron port. This
patch temporarily disables this test step while the bug is being fixed
and a new neutron-lib is released with the fix.

Change-Id: I4dfebfb5c92dd8a5cdc779aac587e7477cd5fad6
Related-Bug: #1894825
Closes-Bug: #1898035
2020-10-02 09:39:01 +02:00
Lee Yarwood
5ab9b28161 test_evacuate.sh: Stop using libvirt-bin
I49dc963ada17a595232d3eb329d94632d07b874b missed that
call_hook_if_defined will actually cause the entire run to fail [1] if
we attempt to stop the non-existent libvirt-bin service so just remove
it now that we are using the Train UCA.

[1] 7a70f559c5/functions.sh (L74)

Change-Id: Ife26f1ceb6208e12328ccdccbab0681ee55d5a2a
2020-09-22 10:29:37 +01:00
Zuul
b5330a97ae Merge "test_evacuate.sh: Support libvirt-bin and libvirtd systemd services"
2020-09-18 23:03:01 +00:00
Lee Yarwood
6c62830ae8 test_evacuate.sh: Support libvirt-bin and libvirtd systemd services
The systemd service unit for libvirtd has changed name from libvirt-bin
to libvirtd. As such, the evacuation test script needs to be changed to
support both as we move between these versions.

Change-Id: I49dc963ada17a595232d3eb329d94632d07b874b
2020-09-18 13:45:39 +01:00
Lee Yarwood
9d55b754f2 test_evacuate: Wait until subnode is down before starting tests
Change-Id: I714eb2ef3a6d307b60d82e7fedc49bfeadd20289
2020-09-15 08:48:06 +00:00
Lee Yarwood
1e16b3184d nova-live-migration: Only stop n-cpu and q-agt during evacuation testing
I8af2ad741ca08c3d88efb9aa817c4d1470491a23 started to correctly fence the
subnode ahead of evacuation testing but missed that c-vol and g-api
were also running on the host. As a result the BFV evacuation test will
fail if the volume being used is created on the c-vol backend hosted on
the subnode.

This change now avoids this by limiting the services stopped ahead of
the evacuation on the subnode to n-cpu and q-agt.

Change-Id: Ia7c317e373e4037495d379d06eda19a71412d409
Closes-Bug: #1868234
2020-03-21 17:08:47 +00:00
Lee Yarwood
b097959c1c nova-live-migration: Ensure subnode is fenced during evacuation testing
As stated in the forced-down API [1]:

> Setting a service forced down without completely fencing it will
> likely result in the corruption of VMs on that host.

Previously only the libvirtd service was stopped on the subnode prior to
calling this API, allowing n-cpu, q-agt and the underlying guest domains
to continue running on the host.

This change now ensures all devstack services are stopped on the subnode
and all active domains destroyed.

It is hoped that this will resolve bug #1813789 where evacuations have
timed out due to VIF plugging issues on the new destination host.

[1] https://docs.openstack.org/api-ref/compute/?expanded=update-forced-down-detail#update-forced-down

Related-Bug: #1813789
Change-Id: I8af2ad741ca08c3d88efb9aa817c4d1470491a23
2020-03-19 11:34:13 +00:00
Lee Yarwood
e23c3c2c8d nova-live-migration: Wait for n-cpu services to come up after configuring Ceph
Previously the ceph.sh script used during the nova-live-migration job
would only grep for a `compute` process when checking if the services
had been restarted. This check was bogus and would always return 0 as it
would always match itself. For example:

2020-03-13 21:06:47.682073 | primary | 2020-03-13 21:06:47.681 | root
29529  0.0  0.0   4500   736 pts/0    S+   21:06   0:00 /bin/sh -c ps
       aux | grep compute
2020-03-13 21:06:47.683964 | primary | 2020-03-13 21:06:47.683 | root
29531  0.0  0.0  14616   944 pts/0    S+   21:06   0:00 grep compute

Failures of this job were seen on the stable/pike branch where slower CI
nodes appeared to struggle to allow Libvirt to report to n-cpu in time
before Tempest was started. This in turn caused instance build failures
and the overall failure of the job.

This change resolves this issue by switching to pgrep and ensuring
n-cpu services are reported as fully up after a cold restart before
starting the Tempest test run.

Closes-Bug: 1867380
Change-Id: Icd7ab2ca4ddbed92c7e883a63a23245920d961e7
2020-03-16 12:37:45 +00:00
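The self-matching check described in e23c3c2c8d can be reproduced with a minimal sketch like the one below. The function names `flawed_check` and `pgrep_check` and the process pattern are hypothetical and are not taken from ceph.sh; the sketch assumes procps-style `ps` and `pgrep` are installed.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers, not the contents of ceph.sh.

# Flawed check: the shell started by `sh -c` carries the pattern in its own
# command line, so `ps | grep` finds a match even when no real service with
# that name is running. (`ww` only stops ps from truncating long lines.)
flawed_check() {
    sh -c "ps auxww | grep '$1'" >/dev/null
}

# pgrep-based check: pgrep never reports itself, so it only succeeds when
# some other process matches the pattern.
pgrep_check() {
    pgrep -f "$1" >/dev/null
}

# A pattern that should match no real process on the machine.
pattern="no-such-service-$$"

if flawed_check "$pattern"; then
    echo "ps | grep: reports a match (false positive)"
fi
if ! pgrep_check "$pattern"; then
    echo "pgrep: reports no match"
fi
```

Because the first check succeeds for any pattern, a wait loop built on it would never actually wait, which is consistent with the failure mode the commit message describes.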
Matt Riedemann
85a1dd338b Convert legacy nova-live-migration and nova-multinode-grenade to py3
This makes these legacy devstack-gate-based jobs run with python3.

Change-Id: Id565a20ba3ebe2ea1a72b879bd2762ba3e655658
2019-11-14 16:06:02 +00:00
Matt Riedemann
14ca6f62e3 Remove the TODO about using OSC for BFV in test_evacuate.sh
With OSC 4.0.0 we could now use the --boot-from-volume option
to create a volume-backed server from the provided image. However,
that option leaves the created root volume around since
delete_on_termination defaults to false in the API. So while we
could use that option and convert from nova boot to openstack
server create, it would mean we'd have to find and manually delete
the created volume after the server is created, which is more work
than it's worth to implement the TODO so just remove it.

Change-Id: I0b70b19d74007041fc2da55a4edb1c636af691d6
2019-11-07 14:33:32 -05:00
melanie witt
6ea945e3b1 Remove redundant call to get/create default security group
In the instance_create DB API method, it ensures the (legacy) default
security group gets created for the specified project_id if it does
not already exist. If the security group does not exist, it is created
in a separate transaction.

Later in the instance_create method, it reads the default security group
back that it wrote earlier (via the same ensure default security group
code). But since it was written in a separate transaction, the current
transaction will not be able to see it and will get back 0 rows. So, it
creates a duplicate default security group record if project_id=NULL
(which it will be, if running nova-manage db online_data_migrations,
which uses an anonymous RequestContext with project_id=NULL). This
succeeds despite the unique constraint on project_id because in MySQL,
unique constraints are only enforced on non-NULL values [1].

To avoid creation of a duplicate default security group for
project_id=NULL, we can use the default security group object that was
returned from the first security_group_ensure_default call earlier in
instance_create method and remove the second, redundant call.

This also breaks out the security groups setup code from a nested
method as it was causing confusion during code review and is not being
used for any particular purpose. Inspection of the original commit
where it was added in 2012 [2] did not reveal any comments about the
nested method and it appeared to either be a way to organize the code
or a way to reuse the 'models' module name as a local variable name.

Closes-Bug: #1824435

[1] https://dev.mysql.com/doc/refman/8.0/en/create-index.html#create-index-unique
[2] https://review.opendev.org/#/c/8973/2/nova/db/sqlalchemy/api.py@1339

Change-Id: Idb205ab5b16bbf96965418cd544016fa9cc92de9
2019-10-14 18:54:43 +00:00
melanie witt
7c41365f19 Add regression test for bug 1824435
This adds a regression test in our post test hook. We are not able to
do a similar test in the unit or functional tests because SQLite does
not provide any isolation between transactions on the same database
connection [1] and the bug can only be reproduced with the isolation
that is present when using a real MySQL database.

Related-Bug: #1824435

[1] https://www.sqlite.org/isolation.html

Change-Id: I204361d6ff7c2323bc744878d8a9fa2d20a480b1
2019-10-14 06:03:44 +00:00
Zuul
6ee5cfb397 Merge "Test heal port allocations in nova-next"
2019-10-03 03:28:30 +00:00
Balazs Gibizer
0044702e0d Test heal port allocations in nova-next
This patch extends the existing integration test for
heal_allocations to test the recently implemented port
allocation healing functionality.

Change-Id: I993c9661c37da012cc975ee8c04daa0eb9216744
Related-Bug: #1819923
2019-10-02 11:15:36 +02:00
Balazs Gibizer
2cf9a5f9fa Add cold migrate and resize to nova-grenade-multinode
Changes in [1] could potentially break a mixed-compute-version
environment as we don't have grenade coverage for cold migrate and
resize. This adds that coverage to the nova-grenade-multinode
job.

[1] https://review.opendev.org/#/c/655721/10

Change-Id: I81372d610ddf8abb473621deb6e7cb68eb000fee
2019-08-30 15:35:46 -04:00
Balazs Gibizer
3c1d9dab85 Move live_migration test hooks under gate/
This patch resolves a TODO in the .zuul.yaml about using common
irrelevant files in our dsvm jobs. To be able to do that we need to move
the test hooks from nova/tests/live_migration under gate/.

Change-Id: I4e5352fd1a99ff2b4134a734eac6626be772caf1
2019-08-29 14:45:48 -04:00
melanie witt
f5c2430876 Remove unused args from archive_deleted_rows calls
As of commit 1c9de9c7779b1faf9d9542b3e5bd20da70067365, we no longer
pass any args to the archive_deleted_rows function, so we can remove
the argument list from the function.

Change-Id: I73b2f716908088b137102631f9360939a1d7341a
2019-08-28 05:14:18 +00:00
melanie witt
1c9de9c777 Verify archive_deleted_rows --all-cells in post test hook
We are already running archive_deleted_rows in the gate, but we are
not verifying whether all instance records, for example, were actually
successfully removed from the databases (cell0 and cell1).

This adds the --all-cells option to our archive_deleted_rows runs and
verifies that instance records were successfully removed from all cell
databases.

It is not sufficient to check only for return code 0 because
archive_deleted_rows will still return 0 when it misses archiving
records in cell databases.

Related-Bug: #1719487

Change-Id: If133b12bf02d708c099504a88b474dce0bdb0f00
2019-08-27 06:16:24 +00:00
melanie witt
f32671359e Make a failure to purge_db fail in post_test_hook.sh
Currently, the 'purge_db' call occurs before 'set -e', so if and when
the database purge fails (returns non-zero) it does not cause the script
to exit with a failure.

This moves the call after 'set -e' to make the script exit with a
failure if the database purge step fails.

Closes-Bug: #1840967

Change-Id: I6ae27c4e11acafdc0bba8813f47059d084758b4e
2019-08-21 19:21:55 +00:00
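Why the ordering in f32671359e matters can be seen in a small sketch. Here `false` is a stand-in for a failing `purge_db` call, and the two script strings are hypothetical; they are not excerpts from post_test_hook.sh.

```shell
#!/bin/sh
# Sketch only: `false` plays the role of a failing purge_db call.

# Failing step runs BEFORE 'set -e': the failure is ignored and the
# script carries on to the end.
purge_before_set_e='false; set -e; echo "script continued"'

# Failing step runs AFTER 'set -e': the script stops at the failure.
purge_after_set_e='set -e; false; echo "script continued"'

# Each script runs in its own shell so that the exit status is visible.
sh -c "$purge_before_set_e" && echo "exit status 0" || echo "exit status $?"
sh -c "$purge_after_set_e" && echo "exit status 0" || echo "exit status $?"
```

The first script prints "script continued" and exits with status 0, while the second exits with status 1 before reaching its `echo`, which is the behaviour the change relies on to fail the job.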
Matt Riedemann
cee072b962 Convert nova-next to a zuul v3 job
For the most part this should be a pretty straight-forward
port of the run.yaml. The most complicated thing is executing
the post_test_hook.sh script. For that, a new post-run playbook
and role are added.

The relative path to devstack scripts in post_test_hook.sh itself
had to drop the 'new' directory: since we are no longer executing
the script through devstack-gate, the 'new' path does not
exist.

Change-Id: Ie3dc90862c895a8bd9bff4511a16254945f45478
2019-07-23 11:32:35 -04:00
Matt Riedemann
87365c760e Add integration testing for heal_allocations
This adds a simple scenario for the heal_allocations CLI
to the post_test_hook script run at the end of the nova-next
job. The functional testing in-tree is pretty extensive but
it's always good to have real integration testing.

Change-Id: If86e4796a9db3020d4fdb751e8bc771c6f98aa47
Related-Bug: #1819923
2019-06-29 11:03:55 +00:00
Dan Smith
0685139ed8 Make nova-next archive using --before
Change-Id: I4fbd0cb73c73ab680af3f341d6069addb57393fb
2019-06-05 07:42:23 -07:00
Matt Riedemann
bed9d49163 Pass --nic when creating servers in evacuate integration test script
Devstack change Ib2e7096175c991acf35de04e840ac188752d3c17 started
creating a second network which is shared when tempest is enabled.
This causes the "openstack server create" and "nova boot" commands
in test_evacuate.sh to fail with:

  Multiple possible networks found, use a Network ID to be more specific.

This change selects the non-shared network and uses it to create
the servers during evacuate testing.

Change-Id: I2085a306e4d6565df4a641efabd009a3bc182e87
Closes-Bug: #1822605
2019-04-01 09:58:01 -04:00
Sean Mooney
30550d3d94 update gate test for removal of force evacuate
Micro-version 2.68 removed force evacuation; this change
updates gate/test_evacuate.sh to use micro-version 2.67.

Closes-Bug: #1819166

Change-Id: I44a3514b4b0ba1648aa96f92e896729c823b151c
2019-03-08 16:31:31 +00:00
Zuul
99e7df0795 Merge "Remove placement perf check"
2018-12-08 04:56:56 +00:00
Matt Riedemann
3b1463b968 Use tempest [compute]/build_timeout in evacuate tests
Waiting 30 seconds for an evacuate to complete is not enough
time on some slower CI test nodes. This change uses the
same build timeout configuration from tempest to determine
the overall evacuate timeout in our evacuate tests.

Change-Id: Ie5935ae54d2cbf1a4272e93815ee5f67d3ffe2eb
Closes-Bug: #1806925
2018-12-05 10:46:06 -05:00
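A minimal sketch of the approach in 3b1463b968, assuming a tempest.conf-style INI file. The helpers `read_ini_option` and `wait_for`, and the 5-second poll interval in the usage note, are illustrative assumptions rather than the code in test_evacuate.sh.

```shell
#!/bin/bash
# Sketch only: hypothetical helpers, not the code from test_evacuate.sh.

# Print the value of OPTION from [SECTION] of an INI file.
# usage: read_ini_option FILE SECTION OPTION
read_ini_option() {
    awk -F ' *= *' -v section="[$2]" -v option="$3" '
        /^\[/ { in_section = ($0 == section) }
        in_section && $1 == option { print $2; exit }
    ' "$1"
}

# Re-run COMMAND every INTERVAL seconds until it succeeds.
# Returns 1 once at least TIMEOUT seconds of sleeping have elapsed.
# usage: wait_for TIMEOUT INTERVAL COMMAND [ARGS...]
wait_for() {
    local timeout=$1
    local interval=$2
    local elapsed=0
    shift 2
    until "$@"; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
}
```

With these helpers, an evacuate wait might read the limit with `timeout=$(read_ini_option /path/to/tempest.conf compute build_timeout)` and then call `wait_for "$timeout" 5 server_is_active`, where `server_is_active` is a placeholder for whatever status check the script performs.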
Chris Dent
84182d0aa2 Remove placement perf check
gate/post_test_perf_check.sh did some simplistic performance testing of
placement. With the extraction of placement we want it to happen during
openstack/placement CI changes so we remove it here.

The depends-on is to the placement change that turns it on there, using
an independent (and very small) job.

Depends-On: I93875e3ce1f77fdb237e339b7b3e38abe3dad8f7
Change-Id: I30a7bc9a0148fd3ed15ddd997d8dab11e4fb1fe1
2018-11-30 15:12:48 +00:00
Matt Riedemann
2023f46015 Add volume-backed evacuate test
This adds a volume-backed instance evacuate scenario
to the test_evacuate post-test script.

Change-Id: I37120d9ce02de6dadbd279de195d2f289c891123
2018-10-25 16:15:56 -04:00
Matt Riedemann
8327011f91 Add post-test hook for testing evacuate
This adds a post-test bash script to test evacuate
in a multinode job.

This performs two tests:

1. A negative test where we inject a fault by stopping
   libvirt prior to the evacuation and wait for the
   server to go to ERROR status.

2. A positive test where we restart libvirt, wait for the
   compute service to be enabled and then evacuate
   the server and wait for it to be ACTIVE.

For now we hack this into the nova-live-migration
job, but it should probably live in a different job
long-term.

Change-Id: I9b7c9ad6b0ab167ba4583681efbbce4b18941178
2018-10-25 16:15:56 -04:00
Chris Dent
28937be947 Add trait query to placement perf check
This updates the EXPLANATION and sets the pinned version of placeload
to the just-released 0.3.0. This ought to hold us for a while. If
we need to do this again, we should probably switch to using
requirements files in some fashion, but I'm hoping we can avoid
that until later, potentially even after placement extraction
when we will have to move and change this anyway.

Change-Id: Ia3383c5dbbf8445254df774dc6ad23f2b9a3721e
2018-08-16 18:32:12 +01:00
Chris Dent
e6754e1b9e Add explanatory prefix to post_test_perf output
The pirate on crack output of placeload can be confusing
so this change adds a prefix to the placement-perf.txt log
file so that it is somewhat more self-explanatory.

This change also pins the version of placeload because the
explanation is version dependent.

Change-Id: I055adb5f6004c93109b17db8313a7fef85538217
2018-08-16 18:21:47 +01:00
Chris Dent
8b4fcdfdc6 Add placement perf info gathering hook to end of nova-next
This change adds a post test hook to the nova-next job to report
timing of a query to GET /allocation_candidates when there are 1000
resource providers with the same inventory.

A summary of the work ends up in logs/placement-perf.txt

Change-Id: Idc446347cd8773f579b23c96235348d8e10ea3f6
2018-08-14 15:42:08 +01:00
Dan Smith
fd59fbd4d1 Make nova-manage db purge take --all-cells
This makes purge iterate over all cells if requested. This also makes our
post_test_hook.sh use the --all-cells variant with just the base config
file.

Related to blueprint purge-db

Change-Id: I7eb5ed05224838cdba18e96724162cc930f4422e
2018-03-08 09:26:49 -08:00
Dan Smith
ae241cc68f Add simple db purge command
This adds a simple purge command to nova-manage. It either deletes all
shadow archived data, or data older than a date if provided.

This also adds a post-test hook to run purge after archive to validate
that it at least works on data generated by a gate run.

Related to blueprint purge-db

Change-Id: I6f87cf03d49be6bfad2c5e6f0c8accf0fab4e6ee
2018-03-07 10:35:32 -08:00
Dan Smith
64635ba18d Run post-test archive against cell1
Change-Id: I4af326fe66f0cf24ede8a8b7a8ce0e528c4f437c
2018-03-07 10:35:32 -08:00
Matt Riedemann
e8e8941d25 Check for leaked server resource allocations in post_test_hook
The post_test_hook.sh runs in the nova-next CI job. The 1.0.0
version of the osc-placement plugin adds the CLIs to show consumer
resource allocations.

This adds some sanity check code to the post_test_hook.sh script
to look for any resource providers (compute nodes) that have allocations
against them, which shouldn't be the case for successful test runs
where servers are cleaned up properly.

Change-Id: I9801ad04eedf2fede24f3eb104715dcc8e20063d
2018-02-24 02:27:38 +00:00
Sean Dague
95441ef896 move gate hooks to gate/
We prevent a lot of tests from getting run on tools/ changes given
that most of that is unrelated to running any tests. By having the
gate hooks in that directory it made for somewhat odd separation of
what is test sensitive and what is not.

This moves things to the gate/ top level directory, and puts a symlink
in place to handle project-config compatibility until that can be
updated.

Change-Id: Iec9e89f0380256c1ae8df2d19c547d67bbdebd65
2017-01-04 11:05:16 +00:00