NetApp SolidFire: Fix error on cluster workload rebalancing

When SolidFire is under heavy load or being upgraded, the
SolidFire cluster may automatically move connections from primary
to secondary nodes, in order to rebalance cluster workload.

Although this operation occurs very quickly, an API call made against
a volume while it is being moved, such as create snapshot, may fail
with an xNotPrimary error. The call normally succeeds when retried.

This patch fixes this issue by adding the xNotPrimary exception to
our list of retryable exceptions in the SolidFire driver.
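
For illustration only, the retry-on-listed-error idea can be sketched as
below. This is not the driver's code: FakeApiError, call_with_retry and
make_flaky_snapshot are hypothetical names, and RETRYABLE_ERRORS holds
only two of the error names that appear in the diff.

```python
import time

# Subset of the driver's retryable error names, taken from the diff.
RETRYABLE_ERRORS = ['xNotReadyForIO', 'xNotPrimary']


class FakeApiError(Exception):
    """Hypothetical error carrying the name returned by the cluster."""

    def __init__(self, name):
        super().__init__(name)
        self.name = name


def call_with_retry(func, retries=3, delay=0.0):
    """Call func, retrying only when the error name is retryable."""
    for attempt in range(1, retries + 1):
        try:
            return func()
        except FakeApiError as exc:
            # Give up on non-retryable errors or on the last attempt.
            if exc.name not in RETRYABLE_ERRORS or attempt == retries:
                raise
            time.sleep(delay)


def make_flaky_snapshot(failures):
    """Return a fake 'create snapshot' call that fails `failures` times."""
    state = {'calls': 0}

    def create_snapshot():
        state['calls'] += 1
        if state['calls'] <= failures:
            # Simulates the volume being moved during the call.
            raise FakeApiError('xNotPrimary')
        return 'snapshot-ok'

    return create_snapshot, state
```

With xNotPrimary absent from the list, the first failure in this sketch
would propagate to the caller instead of being retried.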

Change-Id: I67dd2bfba37adcb7cda5f1cd08ff641410ec3f6b
Closes-Bug: #1891914
Fernando Ferraz 2020-09-30 12:32:02 -03:00
parent 4f68296d67
commit 8d0c08b4bb
2 changed files with 13 additions and 2 deletions

@@ -262,9 +262,11 @@ class SolidFireDriver(san.SanISCSIDriver):
                 - Enable Active/Active support flag
                 - Implement Active/Active replication support
         2.2.0 - Add storage assisted volume migration support
+        2.2.1 - Fix bug #1891914 fix error on cluster workload rebalancing
+                by adding xNotPrimary to the retryable exception list
     """
-    VERSION = '2.2.0'
+    VERSION = '2.2.1'
     SUPPORTS_ACTIVE_ACTIVE = True
@@ -302,7 +304,8 @@ class SolidFireDriver(san.SanISCSIDriver):
                         'xMaxSnapshotsPerNodeExceeded',
                         'xMaxClonesPerNodeExceeded',
                         'xSliceNotRegistered',
-                        'xNotReadyForIO']
+                        'xNotReadyForIO',
+                        'xNotPrimary']

     def __init__(self, *args, **kwargs):
         super(SolidFireDriver, self).__init__(*args, **kwargs)

@@ -0,0 +1,8 @@
+---
+fixes:
+  - |
+    NetApp SolidFire driver `Bug #1891914
+    <https://bugs.launchpad.net/cinder/+bug/1891914>`_:
+    Fix an error that might occur on cluster workload rebalancing or
+    system upgrade, when an operation is made to a volume at the same
+    time its connection is being moved to a secondary node.