relinker: allow clobber-hardlink-collision

The relinker has been robust to hardlink collisions on tombstones for
some time; this change lets ops optionally (non-default) enable
similar handling of other files when relinking the old=>new partdir.

If your cluster is hitting a lot of these collisions and, after spot
checking, you determine the files are in fact duplicate copies of the
same data, you'd much rather have the relinker handle them
programmatically and non-destructively than force ops to rm a bunch of
files manually just to get out of a part power increase (PPI).

Once the PPI is over and your reconstructors are running again, after
some validation you can probably clean out your quarantine dirs.

Drive-by: log unknown relink errors at error level to match expected
non-zero return code

Closes-Bug: #2127779
Change-Id: Iaae0d9fb7a1949d1aad9aa77b0daeb249fb471b5
Signed-off-by: Clay Gerrard <clay.gerrard@gmail.com>
Clay Gerrard
2025-09-18 18:42:57 -05:00
committed by Alistair Coles
parent b035ed1385
commit be62933d00
3 changed files with 454 additions and 53 deletions

@@ -793,3 +793,12 @@ use = egg:swift#xprofile
#
# stats_interval = 300.0
# recon_cache_path = /var/cache/swift
#
# Because of the way old versions of swift on old kernels worked you may end up
# with a file in the new part dir path that had the exact timestamp of a
# "newer" file in the current part dir. With this option enabled during the
# relink phase we'll quarantine the colliding file in the new target part dir
# and retry the relink. During the cleanup phase we ignore the un-matched
# inode "collision" and allow the cleanup of the old file in the old part dir
# same as tombstones.
# clobber_hardlink_collisions = false
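
The quarantine-and-retry behaviour described in the option's comment can be sketched roughly as follows. This is an illustration only, not the relinker's actual code: the function name relink_with_clobber, the quarantine_dir argument and the returned status strings are all hypothetical, and it assumes the quarantine directory is on the same filesystem as the new part dir.

```python
import os
import tempfile


def relink_with_clobber(old_path, new_path, quarantine_dir, clobber=False):
    """Hard-link old_path at new_path and return a short status string.

    Hypothetical helper for illustration; names and return values are
    assumptions, not Swift's API.
    """
    try:
        os.link(old_path, new_path)
        return "linked"
    except FileExistsError:
        pass  # something already exists at new_path

    # Same inode means the link was already made on an earlier pass.
    if os.stat(old_path).st_ino == os.stat(new_path).st_ino:
        return "already-linked"

    # Different inode at the same name: a hardlink collision.
    if not clobber:
        return "collision"

    # Non-destructive: move the colliding file aside, then retry the link.
    # A real implementation would want a unique quarantine path per file,
    # since os.replace overwrites an existing entry of the same name.
    os.makedirs(quarantine_dir, exist_ok=True)
    target = os.path.join(quarantine_dir, os.path.basename(new_path))
    os.replace(new_path, target)
    os.link(old_path, new_path)
    return "clobbered"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        old = os.path.join(tmp, "old.data")
        new = os.path.join(tmp, "new.data")
        for path, body in ((old, "current"), (new, "stale")):
            with open(path, "w") as f:
                f.write(body)
        quarantine = os.path.join(tmp, "quarantined")
        print(relink_with_clobber(old, new, quarantine))
        print(relink_with_clobber(old, new, quarantine, clobber=True))
```

With clobber left at its default the collision is only reported, mirroring the option's default of false; the colliding file is moved rather than deleted so it can be inspected before the quarantine dirs are cleaned out.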