From a7da2232629bfd4d5c04f5169e51b1f57b6c9362 Mon Sep 17 00:00:00 2001 From: Matthew Oliver Date: Tue, 19 Dec 2017 05:47:20 +0000 Subject: [PATCH] Fix intermittent problem in part_swapping test There is an intermittent failure in the test_part_swapping_problem test found in test/unit/common/ring/test_builder.py. The test does a rebalance, then changes the ring to test a specific problem, does some housekeeping and then rebalances again. The problem is the ringbuilder keeps track of where in the ring it started the last ring rebalance, saved in `_last_part_gather_start`. On a rebalance, or more specifically in `_gather_parts_for_balance` we then we will start somewhere on the other side of the ring with: quarter_turn = (self.parts // 4) random_half = random.randint(0, self.parts / 2) start = (self._last_part_gather_start + quarter_turn + random_half) % self.parts Because we don't reset `_last_part_gather_start` when we change the ring in the test, there is edge case where if we are unlucky during both rebalances whereby both calls to randint returns a relatively large number pushes the start of the second rebalance to the wrong side of the ring. Actually it's more problematic, and only 1 large random and a one in the middle will cause it, maybe pictures help: rabalance 1 (r1): quarter_turn = 4, random_half = 5 rebalance 2 (r2): quarter_turn = 4, random_half = 3 r1 r2 | | v v array('H', [0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1]), array('H', [1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 2, 2, 2, 3, 3, 3]), array('H', [2, 2, 2, 2, 3, 3, 4, 4, 4, 4, 3, 4, 4, 4, 4, 4])] Now when gathering for rebalance 2 it'll pick: array('H', [0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, X]), array('H', [X, X, 1, 1, 2, 2, 2, 3, 3, 3, 2, 2, 2, 3, 3, 3]), array('H', [2, 2, 2, 2, 3, 3, 4, 4, 4, 4, 3, 4, 4, 4, 4, 4])] Which can cause the 3 attempts to gather and rebalance to be used up. This causes the intermittent failure seen in the bug. This patch solves this by resetting `_gather_parts_for_balance` to 0 while we tidy up the ring change. Meaning we'll always start on the correct side of the ring. Change-Id: I0d3a69620d4734091dfa516efd0d6b2ed87e196b Closes-Bug: #1724356 --- test/unit/common/ring/test_builder.py | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/test/unit/common/ring/test_builder.py b/test/unit/common/ring/test_builder.py index 4d23bdd8b4..bf987c2ae6 100644 --- a/test/unit/common/ring/test_builder.py +++ b/test/unit/common/ring/test_builder.py @@ -1072,6 +1072,11 @@ class TestRingBuilder(unittest.TestCase): new_dev_parts[dev_id] += 1 for dev in rb._iter_devs(): dev['parts'] = new_dev_parts[dev['id']] + # reset the _last_part_gather_start otherwise + # there is a chance it'll unluckly wrap and try and + # move one of the device 1's from replica 2 + # causing the intermitant failure in bug 1724356 + rb._last_part_gather_start = 0 rb.pretend_min_part_hours_passed() rb.rebalance()