7044 Commits

Author SHA1 Message Date
f2a90803aa Update master for stable/victoria
Add file to the reno documentation build to show release notes for
stable/victoria.

Use pbr instruction to increment the minor version number
automatically so that master versions are higher than the versions on
stable/victoria.

Change-Id: Ic5e63e7094f931e7ba21becdb8e785a761ebc822
Sem-Ver: feature
2020-09-24 09:13:48 +00:00
Abhishek Kekane
6504588aaa Victoria RC-1 release notes
Change-Id: I91bb8614f47c0283c88b96ca10e8280655d2d1fb
2020-09-23 12:46:20 +00:00
Zuul
d7c033bf21 Merge "Do not use OSC in infra playbook" 2020-09-23 10:16:08 +00:00
Victor Coutellier
922c2ed5ad Fix cleaning of web-download image import
If the import flow fails before reaching the end, it never executes
the _DeleteFromFS task and the node_staging_uri is never cleaned up.

Implement the revert() function of the _WebDownload task to remove the
temporary file.

Change-Id: I6dd6a6e2a95a5bd17a80b6256852bb9fac5fa339
Co-Authored-By: Grégoire Unbekandt <gregoire.unbekandt@gmail.com>
Co-Authored-By: Abhishek Kekane <akekane@redhat.com>
Closes-Bug: #1795950
2020-09-22 20:31:30 +02:00
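The revert behaviour described in this commit can be sketched roughly as follows. This is a minimal illustration, not the Glance implementation: the class name WebDownloadSketch, its attributes, and the use of a temporary staging directory are assumptions made for the example. Only the idea of an execute() step paired with a revert() step that removes the temporary file comes from the commit message.

```python
import os
import tempfile


class WebDownloadSketch:
    """Hypothetical stand-in for a task with execute()/revert() hooks."""

    def __init__(self, staging_dir):
        self.staging_dir = staging_dir
        self.staging_path = None

    def execute(self, data):
        # Write the "downloaded" bytes to a temporary staging file.
        fd, self.staging_path = tempfile.mkstemp(dir=self.staging_dir)
        with os.fdopen(fd, "wb") as staged:
            staged.write(data)
        return self.staging_path

    def revert(self):
        # Called when a later step fails: remove the staged file so that
        # nothing is left behind in the staging area.
        if self.staging_path and os.path.exists(self.staging_path):
            os.remove(self.staging_path)
        self.staging_path = None
```

Calling revert() after a failed flow leaves the staging directory empty, even though the final cleanup step never ran.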
Erno Kuvaja
df0495676e Do not use OSC in infra playbook
Using the supported glanceclient instead.

Change-Id: I373467d2cdefb2301a949c9236f445dbbc641a2a
2020-09-22 18:07:39 +01:00
Grégoire Unbekandt
68c202d38b Image import "web-download" check downloaded size
If the downloaded data size is different from the expected one, the
task "web-download" in the image import process will now fail.

Change-Id: Ie260486d795a6f4af1632f6f3708abc92fb47a3a
Closes-Bug: #1895663
2020-09-22 14:04:15 +00:00
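A size check of this kind might look like the sketch below. The function name copy_and_check_size, the SizeMismatchError exception, and the chunk size are hypothetical and chosen for the example; the commit message only states that the task fails when the downloaded size differs from the expected one.

```python
import io


class SizeMismatchError(Exception):
    """Hypothetical error type used only in this sketch."""


def copy_and_check_size(source, dest, expected_size, chunk_size=65536):
    """Copy source to dest, then fail if the byte count is unexpected."""
    received = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        dest.write(chunk)
        received += len(chunk)
    if received != expected_size:
        raise SizeMismatchError(
            "expected %d bytes, received %d" % (expected_size, received))
    return received
```

For example, copy_and_check_size(io.BytesIO(b"abc"), io.BytesIO(), 4) raises SizeMismatchError because only three bytes arrive.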
Zuul
ab151973bd Merge "Run the nova-ceph-multistore job against glance" 2020-09-18 15:50:12 +00:00
Stephen Finucane
54a2231f17 docs: Remove cruft from 'conf.py'
Change-Id: Ie44453b647ce78a26246b8293794ebdec68fd120
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
2020-09-17 17:21:55 +01:00
Stephen Finucane
d3d9982e66 docs: Convert table of image properties to definition list
This renders much flatter and is similar to what's used nowadays for
config options (via the 'oslo_config.sphinxext' extension)

Change-Id: If204d887ed0d65cfc5e75cc7739b0f8f59ce000f
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
2020-09-17 17:21:51 +01:00
Stephen Finucane
e1fe3024bb docs: Remove references to XenAPI driver
The XenAPI driver is dead. Let's hold the tissues and clear out
references from the documentation instead.

Change-Id: I6ec331cf7d2d1ded924893f707ed963027939754
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
2020-09-17 17:09:17 +01:00
Abhishek Kekane
213660d8b5 Victoria milestone 3 release notes
Note: Fixed formatting for import-locking-behavior
release note.

Change-Id: I914d5b901483b3f3942ade3e3856a62901592521
2020-09-15 14:09:55 +00:00
Dan Smith
1a9b345840 Run the nova-ceph-multistore job against glance
Nova has a job where both nova and glance are configured for multistore
ceph, and where the image gets automatically copied from the file to
rbd stores on first use. Run that on glance to get the coverage for
it as well.

Change-Id: I9c734fabaabe78ea8f7e77d0aa2112ebe867ecb6
2020-09-14 10:01:08 -07:00
Abhishek Kekane
e16d5c9ba3 Corrections in default value of all_stores_must_succeed
Change-Id: I7b04f8ba86e0d6f42ef436e30ece9f019221e318
Closes-Bug: #1884996
2020-09-11 06:03:30 +00:00
Zuul
b1715c96ff Merge "[Docs] Cinder multiple stores for glance" 2020-09-10 17:16:26 +00:00
Zuul
dfd31cdaf6 Merge "Refresh Glance example configs for Victoria milestone 3" 2020-09-09 16:42:10 +00:00
Abhishek Kekane
9bd06d24ac [Docs] Cinder multiple stores for glance
Change-Id: I1ec4d3f3f57f8a0576ea5ed09a289ab27882104b
2020-09-09 15:04:07 +00:00
Zuul
a3343cff96 Merge "Support cinder multiple stores" 2020-09-08 20:22:37 +00:00
Zuul
1e46b62a25 Merge "Remove babel.cfg etc" 2020-09-08 12:31:52 +00:00
whoami-rajat
98a1e792c6 Support cinder multiple stores
This patch updates the location URL of the legacy images while
upgrading from single cinder store to multiple stores.
It does so with lazy loading logic, i.e. during a GET images
call, it checks the location URL and metadata
of the image against the configured store ids and updates
images to respective stores on the basis of volume type (comparing
image-volume's type with the configured cinder_volume_type).
Legacy image URL:
cinder://<volume-id>
New image URL:
cinder://<store-id>/<volume-id>

NOTE: bumping lower-constraints/requirements of glance-store to 2.3.0 as
it includes changes[1] that are a hard requirement for cinder multiple
stores to work with glance

[1] https://review.opendev.org/#/c/746556/

Change-Id: I087a89c20813378fea8ff22ddf81d7a10c220db3
Implements: blueprint multiple-cinder-backend-support
2020-09-07 09:07:42 +00:00
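The URL rewrite described in this commit can be illustrated with a small sketch. The function upgrade_cinder_url and the store_volume_types mapping (store id to configured volume type) are assumptions for the example and do not reflect the real Glance or glance_store code; only the two URL shapes and the matching on volume type come from the commit message.

```python
LEGACY_PREFIX = "cinder://"


def upgrade_cinder_url(url, volume_type, store_volume_types):
    """Return the multi-store form of a legacy cinder location URL.

    store_volume_types is a hypothetical mapping of store id to the
    volume type configured for that store.
    """
    if not url.startswith(LEGACY_PREFIX):
        return url
    remainder = url[len(LEGACY_PREFIX):]
    if "/" in remainder:
        # Already in cinder://<store-id>/<volume-id> form.
        return url
    for store_id, configured_type in store_volume_types.items():
        if configured_type == volume_type:
            return "%s%s/%s" % (LEGACY_PREFIX, store_id, remainder)
    # No configured store matches this volume type: leave the URL as-is.
    return url
```

With stores {"fast": "ssd", "slow": "hdd"}, a legacy URL cinder://vol-1 whose volume has type "hdd" becomes cinder://slow/vol-1.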
Zuul
5126ca0242 Merge "[Trivial]Add missing print parameters in log messages" 2020-09-03 07:02:35 +00:00
likui
a8fb4e2587 Remove babel.cfg etc
Remove babel.cfg and the translation bits from setup.cfg, those are not
needed anymore.

Change-Id: I221173c724d23c1221febb8928c67bae1150bf5e
2020-09-03 15:01:23 +08:00
Abhishek Kekane
19229d7990 Refresh Glance example configs for Victoria milestone 3
Change-Id: Ibd02882c1e42de3db7f78b50dd974b99b0d9ded1
2020-09-03 05:53:06 +00:00
Zuul
c5bf4d42e3 Merge "Make our ceph job enable thin provisioning" 2020-09-02 21:43:09 +00:00
Zuul
3f9c5afe4d Merge "Make our import-workflow job also convert images to raw" 2020-09-02 21:43:06 +00:00
Zuul
552d902b91 Merge "Add a release note about import locking" 2020-09-01 13:56:11 +00:00
Zuul
20cb03f2d9 Merge "Cleanup import status information after busting a lock" 2020-09-01 13:56:09 +00:00
Zuul
ef72e8b5fe Merge "Add ImageLock to base flow checks" 2020-09-01 12:56:52 +00:00
Dan Smith
bb7774c99b Add a release note about import locking
This adds a release note detailing the new locking behavior and criteria
for stealing the lock.

Related-Bug: #1884596
Change-Id: I19c713c91794694f990f1372fda61cc2e20fac54
2020-08-31 07:45:42 -07:00
zhufl
179d111c1f [Trivial]Add missing print parameters in log messages
This adds missing print parameters in log messages.

Change-Id: I300a3f19a0dfacb23903ac3e92571855ed32cd83
2020-08-31 11:25:05 +08:00
Zuul
8da47cb5ca Merge "Functional test enhancement for lock busting" 2020-08-29 21:48:08 +00:00
Zuul
2f533251dc Merge "Handle atomic image properties separately" 2020-08-29 19:13:33 +00:00
Zuul
c1cf511a7c Merge "Fix metadefs for compute-watchdog" 2020-08-29 03:02:57 +00:00
Zuul
5e9a14aee1 Merge "Move SynchronousAPIBase to a generalized location" 2020-08-29 01:45:04 +00:00
Zuul
0260ab5922 Merge "Disable wait_for_fork() kill aggression if expect_exit=True" 2020-08-28 03:34:41 +00:00
Dan Smith
c023445d05 Make our import-workflow job also convert images to raw
This enables image conversion on the import-workflow job so that we at
least run those code paths somewhere in CI.

Change-Id: Ie4a9171f002b42a13c1786268057bdc0ab3804d0
2020-08-27 07:29:12 -07:00
Dan Smith
76667323d7 Disable wait_for_fork() kill aggression if expect_exit=True
In some cases, we spawn a process that we expect to run to completion
and exit. In those cases, disable the force-kill logic in wait_for_fork()
to avoid killing those processes too early.

Related-Bug: #1891190
Change-Id: Ia78d101ae3702d9f761ca8f04b4b10af4a6d84fb
2020-08-25 10:31:45 -07:00
Dan Smith
b6380424f8 Make our ceph job enable thin provisioning
Depends-On: https://review.opendev.org/#/c/744282
Change-Id: Ifa05b30cd8a810c6ae0678eaa1e433c16e0b7b93
2020-08-24 08:41:34 -07:00
Dan Smith
552da84400 Cleanup import status information after busting a lock
When we bust a lock, we now own the image for that time period
and may exclude the other task (if still running) from updating
the import status information. If not still running, we should
take responsibility for that cleanup since we know what task we
stole the lock from. We should, however, only do that if we
succeed in grabbing the lock to avoid racing with another thread
which might be trying to do the same thing.

Change-Id: Iff3dfbfcbfb956a06d77a144e5456bdb556c5a2c
2020-08-24 06:41:13 -07:00
Dan Smith
3636915dd4 Add ImageLock to base flow checks
This adds the ImageLock flow atom to the test_async flow assertions to
make sure it gets included. Also changed are the magic numbers for
expected flow lengths, per feedback on the
Icb3c1d27e9a514d96fca7c1d824fd2183f69d8b3 review.

Change-Id: I0795e3835a500f20102e0c393a9a15f00672c87a
2020-08-24 06:41:13 -07:00
Dan Smith
a86062c492 Functional test enhancement for lock busting
This enhances our test for the import lock-busting case to include
freeing the stuck import task and letting the new and old ones proceed
to make sure that the end state looks like what we expect.

Note that this includes another task lock check after exiting the
ImportAction context before we call image save. While writing these
tests I determined that we can end up with the original task racing
to update the image locations. If set_data() took a very long time,
causing our lock to be stolen, and another task is running, then when
our set_data() finally finishes we may overwrite their newly-added
location when we go to save our (now stale) list. Thus, the extra
task check (imperfectly) tries to avoid us doing anything else after
our task lock is stolen.

Change-Id: I74baf53fac1c3e23f6dc743058165ecb39074626
2020-08-24 06:41:13 -07:00
Dan Smith
26f0311b29 Handle atomic image properties separately
The image_update() code will clobber, revert, or update lock values
in keys that we use as atomic properties. This adds an exclusion
list of properties that we handle specially and plumbs them down
to image_update() so that they will be excluded from the
add/update/delete logic.

Change-Id: Ib910274472346ce0c336cd1ead8370d5799d0b96
2020-08-24 06:41:13 -07:00
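The exclusion list idea can be sketched as below. The function apply_property_update and the property name "import_lock" are hypothetical; the real image_update() signature is not shown in this log. The sketch only illustrates how keys on an exclusion list can be kept out of add/update/delete handling.

```python
def apply_property_update(current, requested, atomic_props=()):
    """Return current updated to match requested, skipping atomic keys."""
    protected = set(atomic_props)
    result = dict(current)
    for key, value in requested.items():
        if key not in protected:
            result[key] = value  # add or update
    for key in current:
        if key not in requested and key not in protected:
            del result[key]  # delete properties the caller dropped
    return result
```

A stale caller that still holds an old value for a protected key cannot clobber or delete the value currently stored, while ordinary properties are updated as usual.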
Dan Smith
36cbc50e7d Move SynchronousAPIBase to a generalized location
The base class for the tests added in test_images_import_locking provides
a mechanism to make API calls directly against the WSGI stack, without
starting a separate server and using the local networking. This is useful
for cases where faults need to be injected or global state needs to be
altered, which is very difficult in the existing fork-and-exec functional
test model.

This moves that test base class out to the functional module, expands the
documentation a little, and also generalizes the request methods for
wider applicability.

Change-Id: I59e3b5d5d4b69f076092b9950c0d34467a6636ad
2020-08-24 06:41:13 -07:00
Dan Smith
1b006c4f44 Add functional test for task status updating
This was really hard to test before the introduction of the
SynchronousAPIBase test class, so this is added after that patch for
convenience.

This test uses two mocks, but it's otherwise quite "functional" in its
realness. It mocks out the timer so it always fires, and it mocks out
the effective call to glance_store and uses that as a hook to grab the
task state in lockstep (and avoid making another copy of the fake data
on disk, since we need to generate a couple MiB of data to test this).

Change-Id: Ibd0c802efa2723e11ece1097c9bf1ad68b1a820c
2020-08-24 06:41:13 -07:00
Dan Smith
3f6e349d08 Implement time-limited import locking
This attempts to provide a time-based import lock that is dependent
on the task actually making progress. While the task is copying
data, the task message is updated, which in turn touches the task
updated_at time. The API will break any lock after 30 minutes of
no activity on a stalled or dead task. The import taskflow will
check to see if it has lost the lock at any point, and/or if its
task status has changed and abort if so.

The logic in more detail:

1. API locks the image by task-id before we start the task thread, and
   before we return
2. Import thread will check the task-id lock on the image every time it
   tries to modify the image, and if it has changed, will abort
3. The data pipeline will heartbeat the task every minute by updating
   the task.message (bonus: we get some status)
4. If the data pipeline heartbeat ever finds the task state to be changed
   from the expected 'processing' it will abort
5. On task revert or completion, we drop the task-id lock from the image
6. If something ever gets stuck or dies, the heartbeating will stop
7. If the API gets a request for an import where the lock is held, it
   will grab the task by id (in the lock) and check the state and age.
   If the age is sufficiently old (no heartbeating) and the state is
   either 'processing' or terminal, it will mark the task as failed,
   steal the lock, and proceed.

This also adds lots of logging any time we encounter unexpected situations.

Closes-Bug: #1884596
Change-Id: Icb3c1d27e9a514d96fca7c1d824fd2183f69d8b3
2020-08-24 06:41:13 -07:00
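The lock-stealing decision in step 7 can be sketched as a small predicate. The function name may_steal_lock and the terminal state names are assumptions for this example; the 30 minute threshold and the 'processing' state come from the commit message.

```python
from datetime import datetime, timedelta, timezone

LOCK_TIMEOUT = timedelta(minutes=30)      # threshold from the commit message
TERMINAL_STATES = {"success", "failure"}  # assumed names for this sketch


def may_steal_lock(task_status, task_updated_at, now=None):
    """Return True if a held import lock looks abandoned."""
    now = now or datetime.now(timezone.utc)
    # The heartbeat touches updated_at, so an old timestamp means the
    # task has stopped making progress.
    stale = (now - task_updated_at) >= LOCK_TIMEOUT
    stealable = task_status == "processing" or task_status in TERMINAL_STATES
    return stale and stealable
```

A task that is still heartbeating keeps its updated_at fresh and so keeps the lock; a task in an unexpected state is left alone regardless of age.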
Dan Smith
dc08127f05 Add FakeData generator test utility
This adds a FakeData file-like generator that is able to generate
arbitrary amounts of data when we need to simulate reading from a
file or request pipeline, without having to store all of that in
memory.

Change-Id: Iff1fbe2b55f4be12e69c9fd3dec7e3b3e2593e53
2020-08-24 06:41:13 -07:00
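A generator of this kind could look like the following sketch. The class FakeDataSketch is hypothetical and is not the utility added by this commit; it only shows how a file-like read() can produce a fixed total amount of data on demand. It assumes callers read in bounded chunks, since an unbounded read() would still build all remaining bytes at once.

```python
class FakeDataSketch:
    """File-like object that yields `length` bytes, one chunk at a time."""

    def __init__(self, length, fill=b"0"):
        if len(fill) != 1:
            raise ValueError("fill must be a single byte")
        self._remaining = length
        self._fill = fill

    def read(self, size=-1):
        # Clamp the request to what is left; b"" signals end of data.
        if size is None or size < 0 or size > self._remaining:
            size = self._remaining
        self._remaining -= size
        return self._fill * size
```

Reading a few MiB in 64 KiB chunks holds only one chunk in memory at a time.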
Dan Smith
8c7342cbc1 Make test_copy_image_revert_lifecycle handle 409 on import retry
This test is very racy in general, and specifically on the post-revert
retry to do the import, it may start the next import before the previous
one has finished reverting. This patch makes it retry a few times with a
delay if an HTTP 409 is received, so it is tolerant of that situation once
it becomes possible in the subsequent patch to add import locking.

Change-Id: Ic933f170d43b290fd819e8527ecb60be7f7f3f89
2020-08-24 06:41:13 -07:00
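The retry behaviour can be sketched as a small helper. The name retry_on_conflict, the attempt count, and the delay are assumptions for this example; send_request stands for any callable that performs the import request and returns the HTTP status code.

```python
import time


def retry_on_conflict(send_request, attempts=5, delay=0.1):
    """Call send_request() until it returns something other than 409."""
    status = None
    for attempt in range(attempts):
        status = send_request()
        if status != 409:
            break
        if attempt < attempts - 1:
            # The previous import may still be reverting; wait and retry.
            time.sleep(delay)
    return status
```

If every attempt returns 409, the helper returns 409 and the caller can fail the test with a clear message.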
Zuul
1580b4d821 Merge "Poll for final state on test_copy_image_revert_lifecycle()" 2020-08-18 22:18:45 +00:00
Zuul
f32eb7d8d9 Merge "Fix import failure status reporting when all_stores_must_succeed=True" 2020-08-18 22:18:43 +00:00
Zuul
8626aa38b6 Merge "Functional reproducer for bug 1891352" 2020-08-18 17:58:28 +00:00
Dan Smith
6c96319eeb Poll for final state on test_copy_image_revert_lifecycle()
This test currently simulates failure by pre-deleting the store
directory for 'file3' which is the second of a two-store import
operation. The goal is to assert that the later failure reverts
the import of the earlier 'file2' store. However, the way the
polling loop works is that we break out once 'file2' has completed,
and then assume that 'file3' has already failed and reverted.
This is a race, and one we're losing consistently in CI.

This patch waits for the failure of 'file3' to be reported, as
well as the revert of 'file2' to occur before exiting the polling
loop and checking the final state of things to ensure that the
revert has actually happened. This addresses the non-determinism
inherent in the original test.

Change-Id: I11c7edaefc96236d2757acfb70d9c338c0f51348
2020-08-18 16:57:25 +00:00
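The corrected polling loop can be sketched as follows. The helper wait_for_revert and the dictionary keys "stores" and "failed_stores" are assumptions for this example, not the actual test code; the point is that the loop exits only once both conditions hold, instead of breaking out after the first store completes.

```python
import time


def wait_for_revert(get_image, failed_store, reverted_store,
                    timeout=10.0, interval=0.1):
    """Poll until failed_store is reported failed and reverted_store is gone."""
    deadline = time.monotonic() + timeout
    while True:
        image = get_image()
        failed = failed_store in image["failed_stores"]
        reverted = reverted_store not in image["stores"]
        if failed and reverted:
            return image
        if time.monotonic() >= deadline:
            raise TimeoutError("import did not fail and revert in time")
        time.sleep(interval)
```

Here get_image is any callable returning the current image state, for example a wrapper around a GET on the image.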