Ceph: Fix restoring old backups to a different backend

There might be scenarios where a very old backup, created from a Cinder
snapshot, was not handled correctly by my previous patch.

This new patch addresses this corner case by adding a fallback
mechanism.

In Cinder, the base name of the backup volume is generated based either
on the volume id alone or a combination of the volume id and backup id.
Two different scenarios are considered here, each corresponding to a
different base name format:

(1)
If the backup was made from a volume snapshot, the older base name
format is used: volume-<vol-uuid>.backup.base.

(2)
If the backup wasn't made from a snapshot, a newer base name format is
used: volume-<vol-uuid>.backup-<backup-uuid>.

In both cases, the function attempts to restore the volume from a backup
image by looking for it under the base name. If the image is not found
(i.e., rbd.ImageNotFound), an error message is logged, and then the
function attempts to retrieve the backup volume under the alternative
base name. This is done because it's possible that the backup volume was
created using a different format than the one initially tried.

In other words, the check for another base name serves as a fallback
mechanism, which is used when the backup image is not found under the
expected name. The primary reason behind having two different base name
formats is likely due to a change in the naming convention at some point
in the history of the Cinder project. This approach ensures backward
compatibility and makes it possible to restore older backups that were
created before the change.

Change-Id: I074aeb222ca4566b630b5b76529454da4c8b493b
This commit is contained in:
Sofia Enriquez 2023-05-26 16:27:51 +00:00 committed by Pete Zaitcev
parent 3b9cc13a94
commit becf45c6cb
2 changed files with 62 additions and 7 deletions
cinder
backup/drivers
tests/unit/backup/drivers

@ -976,7 +976,6 @@ class CephBackupDriver(driver.BackupDriver):
if len(snaps) > 1:
msg = (_("Backup should only have one snapshot but instead has %s")
% len(snaps))
LOG.error(msg)
raise exception.BackupOperationError(msg)
LOG.debug("Found snapshot '%s'", snaps[0])
@ -1109,11 +1108,43 @@ class CephBackupDriver(driver.BackupDriver):
backup_name = self._get_backup_base_name(backup.volume_id,
backup=backup)
# Retrieve backup volume
src_rbd = eventlet.tpool.Proxy(self.rbd.Image(client.ioctx,
backup_name,
snapshot=src_snap,
read_only=True))
try:
# Retrieve backup volume
_src = src_snap
src_rbd = eventlet.tpool.Proxy(self.rbd.Image(client.ioctx,
backup_name,
snapshot=_src,
read_only=True))
except rbd.ImageNotFound:
# Check for another base name as a fallback mechanism, in case
# the backup image is not found under the expected name.
# The main reason behind having two different base name formats
# is due to a change in the naming convention at some point in
# the history of the Cinder project.
# This approach ensures backward compatibility and makes it
# possible to restore older backups that were created before
# the change.
tried_name = backup_name
if backup.snapshot_id:
backup_name = self._get_backup_base_name(backup.volume_id,
backup=backup)
else:
backup_name = self._get_backup_base_name(backup.volume_id)
msg = (_("Backup %(backup_id)s of volume %(volume_id)s"
" not found with name %(tried_name)s,"
" trying a legacy name %(next_name)s.") %
{'backup_id': backup.id,
'volume_id': backup.volume_id,
'tried_name': tried_name,
'next_name': backup_name})
LOG.info(msg)
src_rbd = eventlet.tpool.Proxy(self.rbd.Image(
client.ioctx,
backup_name,
snapshot=_src,
read_only=True))
try:
rbd_meta = linuxrbd.RBDImageMetadata(src_rbd,
backup.container,
@ -1354,7 +1385,6 @@ class CephBackupDriver(driver.BackupDriver):
backup.volume_id)
except exception.BackupMetadataUnsupportedVersion:
msg = _("Metadata restore failed due to incompatible version")
LOG.error(msg)
raise exception.BackupOperationError(msg)
def restore(self,

@ -998,6 +998,31 @@ class BackupCephTestCase(test.TestCase):
mock_getbasename.assert_called_once_with(self.volume_id)
mock_transfer_data.assert_called_once()
@common_mocks
def test_full_restore_with_image_not_found(self):
length = 1024
volume_is_new = True
src_snap = None
with tempfile.NamedTemporaryFile() as dest_file:
with mock.patch.object(self.service,
'_get_backup_base_name') as mock_name,\
mock.patch('eventlet.tpool.Proxy') as mock_proxy:
self.mock_rbd.Image.side_effect = self.mock_rbd.ImageNotFound
self.assertRaises(self.mock_rbd.ImageNotFound,
self.service._full_restore,
self.backup,
dest_file,
length,
volume_is_new,
src_snap)
# Check that the _get_backup_base_name was called
# twice due to the exception
self.assertEqual(mock_name.call_count, 2)
self.assertEqual(mock_proxy.call_count, 2)
@common_mocks
def test_discard_bytes(self):
# Lower the chunksize to a memory manageable number