28 Commits

Author SHA1 Message Date
Alex Gaynor
6384b192b5 Ensure that files are always closed explicitly.
This is needed on Python implementations without
reference-counting garbage collectors (e.g. PyPy).

Change-Id: Ieb563ace9f65a4ad204b01be32bf7a9d5f226005
2013-07-22 16:10:37 -07:00
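The idea behind 6384b192b5 can be illustrated with a minimal sketch. It is not the code from the commit; `read_text` and `write_then_read` are hypothetical helpers. It shows a `with` block closing a file deterministically, without relying on the garbage collector to reclaim the file object.

```python
import os
import tempfile


def read_text(path):
    # The with-block closes the file as soon as it exits, even on error,
    # instead of waiting for the garbage collector to reclaim the object.
    with open(path, "r") as fp:
        return fp.read()


def write_then_read(payload):
    # Hypothetical helper: round-trips a string through a temporary file.
    fd, path = tempfile.mkstemp()
    os.close(fd)  # mkstemp returns an open descriptor; close it explicitly
    try:
        with open(path, "w") as fp:
            fp.write(payload)
        return read_text(path)
    finally:
        os.remove(path)
```

On CPython a bare `open(path).read()` usually closes the file promptly because of reference counting; on PyPy the file may stay open until a later collection, which is why the explicit form matters.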
Darrell Bishop
0d5ed67726 Fix ring validate with device prior to rebalance.
A builder file with some devices added but before rebalance() is called
should be invalid (in the sense that validate() raises
RingValidationError instead of TypeError).

Change-Id: I5538239db4b2fde83be2390014504e35ccd7c0d3
2013-06-08 10:55:41 -07:00
Ilya Kharin
3957dbc5d4 RingBuilder.add_dev returns device id
When a new device is added to the builder, the add_dev function returns its
unique id.

blueprint argparse-in-swift-ring-builder

Change-Id: I57080bb625e812f6cea71199df907a44b332b552
2013-05-24 17:34:24 +04:00
Ilya Kharin
43bf568f48 Move parse search logic outside from builder
A significant part of RingBuilder.search_devs, which parsed the complex
format of a device search string, moved to the swift-ring-builder script.
search_devs itself now has a simple interface for searching devices.

blueprint argparse-in-swift-ring-builder

Change-Id: If3dd77b297b474fb9a058e4693fef2dfb11fca3d
2013-05-24 17:12:34 +04:00
Ilya Kharin
cc040a9c29 Add ability to save builder data to a disk file
Instances of the RingBuilder class can store their data in a disk file with
the save method and load it with the load method.

blueprint argparse-in-swift-ring-builder

Change-Id: I69fdf0693ca9f520d235a795ecdd2da310dcd5d3
2013-05-16 19:49:00 +04:00
Sergey Kraynev
ea7858176b Implementation of replication servers
Support separate replication ip address:
- Added a new function in utils that provides the ability
  to select a separate IP address for the replication service.
- Db_replicator and the object replicators were changed.
  The replication process now uses the new function.

Replication network parameters:
- Support for the replication network fields (replication_ip,
  replication_port) was added to the device dictionary in the
  swift-ring-builder script.
- The search, show and set_info functions were changed to support
  the new fields.

Implementation of replication servers:
- Separate replication servers use the same code as normal replication
  servers, but with replication_server parameter = True.  When using a
  separate replication network, the non-replication servers set
  replication_server = False.  When there is no separate replication
  network (the default case), replication_server is not included in the config.

DocImpact
Change-Id: Ie9af5bdcdf9241c355e36053ca4adfe49dc35bd0
Implements: blueprint dedicated-replication-network
2013-04-21 18:14:42 -04:00
Eohyung Lee
98acf42f92 Fix rebalance for zero weighted devices.
If a device's weight is set to zero, the balance is currently set to a
special value (999.99) until the zero-weighted device has returned all of
its partitions, so we cannot tell whether the balance has changed.
We therefore need to check whether balance or last_balance is that special value.

Change-Id: I5b7db8b8e48db0c4771c51a764bda689869817d5
Fixes: bug #1171731
2013-04-26 16:36:08 +09:00
Greg Lange
44f00a23c1 fixed some minor things in tests that pyflakes complained about
Change-Id: Ifeab56a964630bcf941e932fcbe39e6572e62975
2013-03-26 20:42:26 +00:00
Jenkins
157d5c7d49 Merge "Basic ring builder validation." 2013-03-19 00:27:31 +00:00
Samuel Merritt
d42a78a3aa Basic ring builder validation.
This prevents people from creating bogus ring builder files.

Example: "swift-ring-builder object.builder create 33 0.9 -4".

Fixes bug 924577.

Change-Id: I7bfc04f7fa5f55f70a4eaae96c414f6b2872e283
2013-03-18 09:50:45 -07:00
Samuel Merritt
229ba53a19 Fix crash in swift-ring-builder's list_parts command.
If you run list_parts against a builder that has never been
rebalanced, you'd get a crash. Now you don't.

To reproduce:

$ swift-ring-builder foo.builder create 8 3 1
$ swift-ring-builder foo.builder add r1z1-1.2.3.4:6000/sda 100
$ swift-ring-builder foo.builder list_parts z1

Change-Id: Ic3edffab0c5c2e9551a2f89ddb881153f0b07db7
2013-03-14 18:49:30 -07:00
Samuel Merritt
ebcd60f7d9 Add a region tier to Swift's ring.
The region is one level above the zone; it is intended to represent a
chunk of machines that is distant from others with respect to
bandwidth and latency.

Old rings will default to having all their devices in region 1. Since
everything is in the same region by default, the ring builder will
simply distribute across zones as it did before, so your partition
assignment won't move because of this change. If you start adding
devices in other regions, of course, the assignment will change to
take that into account.

swift-ring-builder still accepts the same syntax as before, but will
default added devices to region 1 if no region is specified.

Examples:

$ swift-ring-builder foo.builder add r2z1-1.2.3.4:555/sda

$ swift-ring-builder foo.builder add r1z3-1.2.3.4:555/sda

$ swift-ring-builder foo.builder add z3-1.2.3.4:555/sda

Also, some updates to ring-overview doc.

Change-Id: Ifefbb839cdcf033e6c9201fadca95224c7303a29
2013-03-13 10:00:58 -07:00
Samuel Merritt
7548cb9c47 Make rings' replica counts adjustable.
Example:

$ swift-ring-builder account.builder set_replicas 4
$ swift-ring-builder account.builder rebalance

This is a prerequisite for supporting globally-distributed clusters,
as operators of such clusters will probably want at least as many
replicas as they have regions. Therefore, adding a region requires
adding a replica. Similarly, removing a region lets an operator remove
a replica and save some money on disks.

In order to not hose clusters with lots of data, swift-ring-builder
now allows for setting of fractional replicas. Thus, one can gradually
increase the replica count at a rate that does not adversely affect
cluster performance.

Example:

$ swift-ring-builder object.builder set_replicas 3.01
$ swift-ring-builder object.builder rebalance
<distribute rings and wait>

$ swift-ring-builder object.builder set_replicas 3.02
$ swift-ring-builder object.builder rebalance
<distribute rings and wait>...

Obviously, fractional replicas are nonsensical for a single
partition. A fractional replica count is for the whole ring, not for
any individual partition, and indicates the average number of replicas
of each partition. For example, a replica count of 3.2 means that 20%
of partitions have 4 replicas and 80% have 3 replicas.

Changes do not take effect until after the ring is rebalanced. Thus,
if you mean to go from 3 replicas to 3.01 but you accidentally type
2.01, no data is lost.

Additionally, 'swift-ring-builder X.builder create' can now take a
decimal argument for the number of replicas.

DocImpact

Change-Id: I12b34dacf60350a297a46be493d5d171580243ff
2013-02-22 15:03:10 -08:00
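The arithmetic behind fractional replica counts in 7548cb9c47 can be sketched as follows. This is an illustration of the stated rule only, not the ring builder's code; `replica_distribution` is a hypothetical name.

```python
def replica_distribution(part_count, replicas):
    """Return {replica_count: number_of_partitions} for a ring-wide,
    possibly fractional, replica count (hypothetical helper)."""
    whole = int(replicas)
    # Partitions that carry one extra replica; round() absorbs float error
    # such as 3.2 - 3 == 0.2000000000000002.
    extra = int(round(part_count * (replicas - whole)))
    dist = {whole: part_count - extra}
    if extra:
        dist[whole + 1] = extra
    return dist


print(replica_distribution(1000, 3.2))  # {3: 800, 4: 200}
```

With 1000 partitions and a replica count of 3.2, 200 partitions (20%) hold 4 replicas and 800 (80%) hold 3, matching the example in the commit message.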
Christopher MacGown
e189723fec Allow rebalance to take a seed.
Passing a seed into rebalance makes the rebalance deterministic
which allows us to generate identical rings across disparate
nodes without having to copy the ring files around.

Change-Id: Ie5ae46ac030e61284bc501fdef9d77eeb5243afd
2013-01-29 17:08:20 -08:00
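Why a seed makes the rebalance in e189723fec reproducible can be shown with a small sketch. It assumes the rebalance draws its randomness from a seedable generator; `shuffled_parts` is a hypothetical stand-in for those random choices, not a Swift function.

```python
import random


def shuffled_parts(part_count, seed=None):
    # A private generator seeded with the same value yields the same
    # sequence of choices on every node that runs this code.
    rng = random.Random(seed)
    parts = list(range(part_count))
    rng.shuffle(parts)
    return parts
```

Two nodes calling `shuffled_parts(64, seed=42)` with the same Python version get the same ordering, so they can derive identical results independently instead of copying files between them.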
Samuel Merritt
f2941b0846 Validate numericness of ports in builder files.
You can't really goof this up using bin/swift-ring-builder, but if you
have code that uses swift.common.ring.RingBuilder directly, you can
stuff e.g. "6002" in where you mean 6002, resulting in some fairly
baffling failures. (Yes, I have done this.)

Change-Id: I87b7b7066b9ea2ce6f82255605da99cf0d283689
2013-01-22 18:56:48 -08:00
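The kind of check described in f2941b0846 might look like the sketch below. The function name and the use of `ValueError` are assumptions for illustration; the commit's actual check and exception type are not shown in this log.

```python
def validate_port(dev):
    """Return the device's port, or raise if it is not an integer
    (hypothetical helper, not the ring builder's API)."""
    port = dev.get("port")
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(port, bool) or not isinstance(port, int):
        raise ValueError(
            "device %r has a non-integer port: %r" % (dev.get("id"), port))
    return port
```

A device dictionary with `"port": 6002` passes, while `"port": "6002"` fails immediately at validation time rather than surfacing later as a confusing failure.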
Florian Hines
c4f5761101 builder.add_devs gets next id if not provided
Have builder.add_devs get the next id to use when adding a new device
if it's not specified in the dict.

Change-Id: I5a0defab43f5cfc5d997080bfd8563bfe72368ad
2012-09-14 16:11:50 -05:00
Florian Hines
c0537ac6e0 Breakout search_devs & add get_builder() for reuse
This moves search_devs into RingBuilder to make it accessible to other utils
that need to search the builder. Along the same lines this also adds a
load() call to get a RingBuilder instance when working with the builder files.

- This adds python-mock >= 0.7 as a dependency for unittests. On Ubuntu
  10.04 you'll have to pip install it, on 12.04 you can apt-get install
  it. Fedora 17+ should be able to yum install it.
- new pep8 compliance
- Fixed a small issue (undefined var) in swift-ring-builder when remove was
  called but failed to find a match.

Change-Id: I2e02684235aa2f4e901a00858ae037091594c545
2012-09-06 20:16:46 -05:00
Samuel Merritt
bb509dd863 As-unique-as-possible partition replica placement.
This commit introduces a new algorithm for assigning partition
replicas to devices. Basically, the ring builder organizes the devices
into tiers (first zone, then IP/port, then device ID). When placing a
replica, the ring builder looks for the emptiest device (biggest
parts_wanted) in the furthest-away tier.

In the case where zone-count >= replica-count, the new algorithm will
give the same results as the one it replaces. Thus, no migration is
needed.

In the case where zone-count < replica-count, the new algorithm
behaves differently from the old algorithm. The new algorithm will
distribute things evenly at each tier so that the replication is as
high-quality as possible, given the circumstances. The old algorithm
would just crash, so again, no migration is needed.

Handoffs have also been updated to use the new algorithm. When
generating handoff nodes, first the ring looks for nodes in other
zones, then other ips/ports, then any other drive. The first handoff
nodes (the ones in other zones) will be the same as before; this
commit just extends the list of handoff nodes.

The proxy server and replicators have been altered to avoid looking at
the ring's replica count directly. Previously, with a replica count of
C, RingData.get_nodes() and RingData.get_part_nodes() would return
lists of length C, so some other code used the replica count when it
needed the number of nodes. If two of a partition's replicas are on
the same device (e.g. with 3 replicas, 2 devices), then that
assumption is no longer true. Fortunately, all the proxy server and
replicators really needed was the number of nodes returned, which they
already had. (Bonus: now the only code that mentions replica_count
directly is in the ring and the ring builder.)

Change-Id: Iba2929edfc6ece89791890d0635d4763d821a3aa
2012-05-09 15:56:06 -07:00
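The tier idea in bb509dd863 can be sketched roughly as below. This is a simplified illustration, not the algorithm from the commit: `dev_tiers` and `pick_device` are hypothetical names, the device dictionaries use assumed keys, and `pick_device` only considers the zone tier.

```python
def dev_tiers(dev):
    # Tiers from coarsest to finest: zone, then IP/port, then device ID.
    zone = (dev["zone"],)
    ip_port = zone + ("%s:%s" % (dev["ip"], dev["port"]),)
    return [zone, ip_port, ip_port + (dev["id"],)]


def pick_device(devs, used_zones):
    # Prefer devices in zones that hold no replica of this partition yet
    # (the "furthest-away" tier); fall back to every device otherwise.
    candidates = [d for d in devs if d["zone"] not in used_zones] or devs
    # Among the candidates, take the emptiest device (biggest parts_wanted).
    return max(candidates, key=lambda d: d["parts_wanted"])
```

When every zone already holds a replica, the fallback still returns a device, which mirrors the commit's point that placement degrades gracefully when there are fewer zones than replicas.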
John Dickinson
1ecf5ebba1 updated copyright date for all files
Change-Id: Ifd909d3561c2647770a7e0caa3cd91acd1b4f298
2012-03-19 13:45:34 -05:00
Samuel Merritt
e994d033a6 Refactor partition gathering.
RingBuilder._reassign_parts() is really moving one (partition,
replica) pair at a time. However, the way that _gather_reassign_parts
passes that data in was strange; it would update each replica's entry
in _replica2part2dev to 0xffff, then return a list of affected
partitions. Now it just returns the pairs to move.

This is helpful in the presence of bugs that affect partition
assignment (e.g. #943493), since there's no chance of stray 0xffff values
hanging around and corrupting the partition map.

Also, update my email address.

Change-Id: Ifb3aeb4fac750f66e2ddbad88eb5846e72bac20c
2012-03-06 10:17:03 -08:00
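The difference e994d033a6 describes, returning (partition, replica) pairs instead of marking entries with a sentinel, can be sketched like this. The function is hypothetical and far simpler than the ring builder's gathering logic; it assumes the map is a list of per-replica lists of device ids.

```python
def gather_reassign_pairs(replica2part2dev, removed_dev_ids):
    """Return the (partition, replica) pairs whose current device has been
    removed, leaving the map itself untouched (hypothetical helper)."""
    return [
        (part, replica)
        for replica, part2dev in enumerate(replica2part2dev)
        for part, dev_id in enumerate(part2dev)
        if dev_id in removed_dev_ids
    ]
```

Because the map is never overwritten with a placeholder, nothing is left behind if a later step fails, and a partition with two replicas on removed devices simply appears in two pairs.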
Samuel Merritt
7fe0c6c695 Fix rebalancing when 2+ of a partition's replicas are on deleted devices.
RingBuilder._reassign_parts assumed that only one replica for a given
partition would move. This isn't necessarily true in the case where a
bunch of devices have been removed. This would leave invalid entries
in _replica2part2dev and also cause validation to fail.

One easy way to reproduce this is to create a 3-replica, 3-zone,
6-device ring with 2 drives per zone (all of equal weight), rebalance,
and then remove one drive from each zone and rebalance again.

Bug: 943493

Change-Id: I0d399bed5d733448ad877fa2823b542777d385a4
2012-02-29 11:30:08 -08:00
Jenkins
7286502adf Merge "Add more specific error messages to swift-ring-builder" 2011-10-07 18:23:21 +00:00
Mark Gius
2f50a0798e Add more specific error messages to swift-ring-builder
Replace existing Exceptions in ring builder with more specific exceptions.
Abstracted out some behavior in ring-builder that is likely to cause an
exception. Add try/except blocks to swift-ring-builder to catch specific
exceptions and provide the user with some information about how to deal
with the error.

This change begins to address blueprint friendly-error-messages

Change-Id: I8fc9cfa4899421fe04bba23ac52523778e902321
2011-09-27 10:20:51 -07:00
Mark Gius
c0315a89df Fix for bug 845952
Devices scheduled to be removed are assigned a device of 65535.  When
looking for parts to reassign from heavy nodes, these parts need to be
skipped.

Includes review suggestions

Change-Id: I61f40c36509bf998834c123b0f80117ca6def3ff
2011-09-27 09:54:07 -07:00
gholt
d6aaba670b Shuffle the partitions to reassign on a ring rebalance. 2011-01-13 09:05:44 -08:00
Anne Gentle
8823427161 Changed copyright notices on py files and the single rst file with a copyright notice 2011-01-04 17:34:43 -06:00
Caleb Tennis
db90da2763 Remove the exception from the unit test, since we don't bomb out anymore. Also, add a warning to swift-ring-builder if you're building an empty ring, or do a write_ring and you aren't rebalanced 2010-08-21 18:21:59 +00:00
Chuck Thier
001407b969 Initial commit of Swift code 2010-07-12 17:03:45 -05:00