cinder/cinder/volume/drivers
Gorka Eguileor 0b2c0d9238 LVM: Fix delete volume error due to lvs failure
Sometimes we get an unexpected failure on the lvs command where it exits
with code 139, which is not one of the 5 predefined ones in the code:
ECMD_PROCESSED, ENO_SUCH_CMD, EINVALID_CMD_LINE, EINIT_FAILED, or
ECMD_FAILED.

We've seen this happen mostly on CI jobs when deleting a volume, during
the check to see if the volume is present (``_volume_not_present``), and
it makes the whole operation fail unexpectedly.

When looking at the logs we can find other instances of this same exit
code happening in other calls, but those are fortunately covered by
retries added for other unexpected errors, such as bug #1335905, which
seem to make the call eventually succeed.

The stderr of the failure is polluted with debug and warning messages
such as:

  [1] /dev/sda1: stat failed: No such file or directory

      This message has been removed [2] from the LVM code, which
      indicates it's somewhat troublesome, but the removal doesn't
      explain how.

  [3] Path /dev/sda1 no longer valid for device(8,1)

  [4] WARNING: Scan ignoring device 8:0 with no paths.

But the real error is most likely:

  [5] Device open /dev/sda 8:0 failed errno 2

On failure we see that error twice, because the LVM code retries the
open to work around some kind of unknown udev race [6].

Since the LVM code indicates that a retry can help, we'll retry on error
139 when calling ``get_lv_info``.

To narrow down the retry we'll only do it on error 139, so we modify the
existing ``retry`` decorator to accept the ``retry`` parameter (same as
the tenacity library) and create our own retry condition that matches
only when the ProcessExecutionError carries a specific exit code.

This pattern seems better than blindly retrying all
ProcessExecutionError cases.
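
As an illustration of the pattern only, here is a minimal standalone
sketch. It is not the code from this change: ``with_retry``,
``retry_if_exit_code``, the stand-in ``ProcessExecutionError`` class and
``flaky_lvs`` are all hypothetical names, and the sketch deliberately
avoids tenacity and oslo.concurrency so that it runs on its own.

```python
import functools
import time


class ProcessExecutionError(Exception):
    """Stand-in for the real exception; only exit_code and stderr are modeled."""

    def __init__(self, exit_code, stderr=""):
        super().__init__(f"exit code {exit_code}: {stderr}")
        self.exit_code = exit_code
        self.stderr = stderr


def retry_if_exit_code(code):
    """Build a predicate that is true only for the given exit code."""
    def predicate(exc):
        return isinstance(exc, ProcessExecutionError) and exc.exit_code == code
    return predicate


def with_retry(retry, attempts=3, interval=0.0):
    """Re-run the call while retry(exc) is true, for at most `attempts` tries."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except Exception as exc:
                    # Give up on the last attempt or on any error that the
                    # predicate does not select.
                    if attempt == attempts or not retry(exc):
                        raise
                    time.sleep(interval)
        return wrapper
    return decorator


# Hypothetical caller: fails twice with exit code 139, then succeeds.
calls = {"n": 0}


@with_retry(retry=retry_if_exit_code(139))
def flaky_lvs():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ProcessExecutionError(139, "Device open /dev/sda 8:0 failed errno 2")
    return ["volume-1"]
```

With this shape, a failure with any other exit code is raised on the
first attempt instead of being retried, which is the narrowing described
above.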

[1]: 17f5572bc9/lib/filters/filter-persistent.c (L132)
[2]: 22c5467add
[3]: b84a9927b7/lib/device/dev-cache.c (L1396-L1402)
[4]: b84a9927b7/lib/label/label.c (L798)
[5]: b84a9927b7/lib/label/label.c (L550)
[6]: b84a9927b7/lib/label/label.c (L562-L567)

Closes-Bug: #1901783
Change-Id: I6824ba4fbcb6fd8f57f8ff86ad7132446ac6c504
2021-03-29 16:43:10 +02:00
ceph Update to hacking 4.0.0 2021-02-16 10:54:51 -05:00
datera Update to hacking 4.0.0 2021-02-16 10:54:51 -05:00
dell_emc LVM: Fix delete volume error due to lvs failure 2021-03-29 16:43:10 +02:00
fujitsu Replace md5 with oslo version 2020-11-13 16:01:14 -05:00
fusionstorage Fix missing print format in log messages 2020-04-15 07:28:23 +08:00
hedvig Merge "Hedvig: Migration to py37" 2020-01-20 13:43:58 +00:00
hitachi Merge "Hitachi: Trace REST API input/output logs" 2021-03-10 05:42:27 +00:00
hpe Move trace methods from utils to volume_utils 2021-02-12 20:16:55 +00:00
huawei Replace md5 with oslo version 2020-11-13 16:01:14 -05:00
ibm Merge "[SVF]: Volume name is not validated for host" 2021-03-17 15:03:00 +00:00
infortrend Rename volume/utils.py to volume/volume_utils.py 2019-09-09 15:00:07 -04:00
inspur LVM: Fix delete volume error due to lvs failure 2021-03-29 16:43:10 +02:00
kaminario Move trace methods from utils to volume_utils 2021-02-12 20:16:55 +00:00
kioxia Add KIOXIA KumoScale NVMeOF driver 2021-02-10 15:40:17 +00:00
lenovo Create Seagate driver from dothill driver 2019-08-16 17:49:15 -06:00
macrosan Fix CI_WIKI_NAME entries 2021-02-22 11:05:09 -06:00
nec Move trace methods from utils to volume_utils 2021-02-12 20:16:55 +00:00
netapp NetApp ONTAP: Fix FlexGroup replication 2021-03-24 22:27:23 +00:00
nexenta Replace md5 with oslo version 2020-11-13 16:01:14 -05:00
open_e JovianDSS: add certs and snapshot restore 2021-03-19 12:48:50 +02:00
prophetstor Move get_volume_stats impl to the base volume driver 2020-07-01 12:41:20 +03:00
san Create Seagate driver from dothill driver 2019-08-16 17:49:15 -06:00
sandstone Move get_volume_stats impl to the base volume driver 2020-07-01 12:41:20 +03:00
stx Move trace methods from utils to volume_utils 2021-02-12 20:16:55 +00:00
synology Merge "Synology: Improve session expired error handling" 2020-11-17 23:42:56 +00:00
toyou TOYOU: Abandon the target parameter and Report SAN driver options 2021-02-18 05:41:44 +00:00
veritas_access Replace md5 with oslo version 2020-11-13 16:01:14 -05:00
vmware vmware: Use cookiejar from oslo.vmware client directly 2021-02-10 16:32:31 -05:00
windows Merge "Fix CI_WIKI_NAME entries" 2021-03-02 21:17:04 +00:00
zadara Update code layout and missing Zadara features 2021-03-18 20:22:32 +02:00
__init__.py Files with no code must be left completely empty 2016-09-28 16:29:30 +07:00
infinidat.py Move brick calls from cinder.utils to volume_utils 2021-02-10 17:27:46 -05:00
linstordrv.py Support Glance image data colocation 2020-04-13 12:04:18 +00:00
lvm.py Move get_volume_stats impl to the base volume driver 2020-07-01 12:41:20 +03:00
nfs.py Support format info in fs type drivers 2021-03-12 12:41:12 -05:00
nimble.py Bug fix for revert to snapshot feature 2021-03-19 18:08:49 -07:00
pure.py Merge "[PURE] support IPv6 / add parameter pure_iscsi_cidr_list" 2021-03-26 16:35:01 +00:00
qnap.py Fix a misspelling error in QNAP driver 2020-05-28 11:10:47 +08:00
quobyte.py Merge "Disallow extension of attached volumes for NFS & Quobyte drivers" 2020-10-07 05:50:41 +00:00
rbd.py Merge "RBD: Change rbd_exclusive_cinder_pool's default" 2021-03-11 01:29:34 +00:00
remotefs.py NFS: Fix for groups and cloning 2021-03-16 06:13:05 -04:00
rsd.py Move trace methods from utils to volume_utils 2021-02-12 20:16:55 +00:00
solidfire.py NetApp SolidFire: Refactor DuplicateSfVolumeNames exception 2021-01-10 01:23:59 +00:00
spdk.py SPDK: Report info in top-level volume_stats 2020-09-23 11:13:33 -04:00
storpool.py Fix CI_WIKI_NAME entries 2021-02-22 11:05:09 -06:00
veritas_cnfs.py Mark Veritas CNFS Driver Unsupported 2020-01-13 20:46:45 -06:00
vzstorage.py Add missing context to function call 2020-05-04 08:25:10 -04:00