6736 Commits

Author SHA1 Message Date
Jenkins
ded0de7aa5 Merge "Don't rehash primaries in reconstructor handoffs_only mode" 2017-07-10 16:16:00 +00:00
Clay Gerrard
44c63c6990 Don't rehash primaries in reconstructor handoffs_only mode
The reconstructor handoffs_only needs to aggressively avoid erroneous
I/O related to rehash of primary suffixes.

While in handoffs_only mode the reconstructor won't even look at primary
partitions.

This has a *huge* impact on cycle time once the node has completed
processing handoffs; which results in a much faster and stronger signal
that it's either time to rebalance again or turn off handoffs_only.

Related-Change-Id: Idde4b6cf92fab6c45f2c0c2733277701eb436898

Change-Id: If4bbb778d511efe13713590639c8b91615556f22
2017-07-07 15:16:00 -07:00
Tim Burke
d112b7d29d First pass at admin-guide cleanup
Change-Id: I005232d95a3e1d181271eef488a828ad330e6006
2017-07-07 17:33:54 +00:00
John Dickinson
4cb76a41ce docs migration from openstack-manuals
Context for this is at
https://specs.openstack.org/openstack/docs-specs/specs/pike/os-manuals-migration.html

Change-Id: I9a4da27ce1d56b6406e2db979698038488f3cf6f
2017-07-06 16:44:25 -07:00
Jenkins
37a1935198 Merge "Switch from oslosphinx to openstackdocstheme" 2017-07-06 18:16:43 +00:00
Jenkins
c3f6e82ae1 Merge "Write-affinity aware object deletion" 2017-07-06 14:00:05 +00:00
Jenkins
e94b383655 Merge "Add support to increase object ring partition power" 2017-07-05 14:40:42 +00:00
liuyamin
006a378193 Add license in swift code file
Source code should be licensed under the Apache 2.0 license.
Add Apache License in swift/probe/__init__.py file.

Change-Id: I3b6bc2ec5fe5caac87ee23f637dbcc7a5d8fc331
2017-07-05 10:07:11 +08:00
Jenkins
f1e1dbb80a Merge "Make eventlet.tpool's thread count configurable in object server" 2017-07-04 11:49:24 +00:00
Jenkins
c22bab4b34 Merge "Version DLOs, just like every other type of object" 2017-07-03 14:06:07 +00:00
Van Hung Pham
bfb5759a53 Switch from oslosphinx to openstackdocstheme
As part of the docs migration work[0] for Pike we need to switch to use
the openstackdocstheme.

Fix one display problem with wrong section markup in the index file.

[0] https://review.openstack.org/#/c/472275/

Change-Id: Ide31218a7f37ba5d959de99cab48fc6513bf426f
2017-07-02 09:27:05 +02:00
Jenkins
638fcc152b Merge "Bind SAIO services on different loopback addresses" 2017-07-01 05:17:34 +00:00
Clay Gerrard
58d7812596 fix flakey time for test_default_sorted_output
Change-Id: Ib7f0c22336e8354d4f46e2343149495bef382f9c
2017-06-30 16:45:43 -07:00
Jenkins
fe3dcf5007 Merge "Close all versioned_writes subrequests' app_iters" 2017-06-30 18:44:44 +00:00
Jenkins
3361cd083e Merge "Order devices in the output of swift-ring-builder" 2017-06-29 22:06:37 +00:00
Jenkins
537f9a3f64 Merge "Delay port binding to reduce wait at process start" 2017-06-29 16:39:57 +00:00
Jenkins
44f6059394 Merge "Test: Use assertIsNone() in unittest" 2017-06-29 14:50:02 +00:00
Jenkins
fc06f6b7af Merge "Fix the reST field raises in docstrings" 2017-06-29 14:49:38 +00:00
Clay Gerrard
806b18c6f5 Extend device tier pretty print normalization
* use pretty device tier normalization in additional debug message
* refactor pretty device tier normalization for consistency
* unittest asserting consistency

Change-Id: Ide32e35ff9387f1cc1e4997eb133314d06b794f3
2017-06-28 15:40:34 -07:00
Samuel Merritt
62509cc8f4 Improve debug logging from ring builder.
The gather/place debug logs used to just contain device IDs; now they
include region, zone, and IP. This makes it easier to see what's going
on when debugging rebalance operations.

Change-Id: I6314e327973c57a34b88ebbb4d3b1594dbacd357
2017-06-28 15:09:07 -07:00
liuyamin
1eeb354c27 Fix the reST field raises in docstrings
Probably the most common format for documenting arguments is reST field
lists [1]. This change updates some docstrings to comply with the field
lists syntax.

[1] http://sphinx-doc.org/domains.html#info-field-lists

Change-Id: I0c35c6b4df840018534737bca2ca32dc977b0e05
2017-06-28 09:10:24 +08:00
Matthew Oliver
e11a38c63a Bind SAIO services on different loopback addresses
Currently all devices in the ring and all services in a SAIO
all bind to the same loopback address 127.0.0.1. But this
breaks servers_per_port if you want to do any testing on that.

This change binds each service to a different loopback address
and updates the rings (remakerings) accordingly.

To make sure rsyncd binds correctly, its bind address needed
to be changed to listen on all addresses (0.0.0.0).

Change-Id: I7e77434f275df1e2699de495d8b622b90157a9d7
2017-06-27 16:08:08 -04:00
Lingxian Kong
831eb6e3ce Write-affinity aware object deletion
When deleting objects in a multi-region Swift deployment with write
affinity configured, users always get a 404 when deleting an object
before it has been replicated to the appropriate nodes.

This patch adds a config item 'write_affinity_handoff_delete_count' so
that operators can define how many local handoff nodes Swift should
send requests to in order to get more candidates for the final response,
or by default just leave it to Swift to calculate the appropriate number.

Change-Id: Ic4ef82e4fc1a91c85bdbc6bf41705a76f16d1341
Closes-Bug: #1503161
2017-06-27 22:42:02 +12:00
junboli
99a6d3b30a Test: Use assertIsNone() in unittest
Use assertIsNone() instead of assertEqual(), because assertEqual()
still fails on false values when compared to None

Change-Id: Ic52c319e3e55135df834fdf857982e1721bc44bb
2017-06-25 03:01:42 +00:00
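As an illustration of the assertIsNone() change above (99a6d3b30a), here is a minimal sketch of how the assertion behaves. The test class and helper names are hypothetical and not taken from the Swift test suite; the sketch only shows that assertIsNone() accepts None and rejects other falsy values.

```python
import unittest


class NoneCheckExample(unittest.TestCase):
    """Hypothetical test case, not part of the Swift tree."""

    def test_assert_is_none(self):
        # dict.get returns None for a missing key
        value = {}.get('missing')
        self.assertIsNone(value)

    def test_falsy_values_are_not_none(self):
        # 0, '' and [] are falsy but are not None, so assertIsNone rejects them
        for falsy in (0, '', []):
            with self.assertRaises(AssertionError):
                self.assertIsNone(falsy)


def run_none_check_example():
    # Run the two tests above and return the unittest result object
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(NoneCheckExample)
    result = unittest.TestResult()
    suite.run(result)
    return result
```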
Samuel Merritt
d9c4913e3b Make eventlet.tpool's thread count configurable in object server
If you're running servers_per_port > 0 and threads_per_disk = 0 (as it
should be with servers_per_port on), each object-server process will
have 20 IO threads waiting around to service eventlet.tpool
calls. This is far too many; with servers_per_port, there's no real
benefit to having so many IO threads.

This commit makes it so that, when servers_per_port > 0, each object
server defaults to having one main thread and one IO thread.

Also, eventlet's tpool size is now configurable via the object-server
config file. If a tpool size is set, that's what we'll use regardless
of servers_per_port. This allows operators with an excess of threads
to remove some regardless of servers_per_port.

Change-Id: I8f8914b7e70f2510393eb7c5e6be9708631ac027
Closes-Bug: 1554233
2017-06-23 16:16:03 +10:00
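The defaulting rules described in the tpool commit above (d9c4913e3b) can be sketched roughly as follows. The function and parameter names are hypothetical and do not come from the Swift source; the value 20 is the thread count quoted in the commit message.

```python
def effective_tpool_size(configured_size, servers_per_port, eventlet_default=20):
    """Pick the number of IO threads, following the rules in the commit text.

    All names here are illustrative assumptions, not Swift's actual API.
    """
    if configured_size is not None:
        # An explicit setting wins regardless of servers_per_port
        return configured_size
    if servers_per_port > 0:
        # One IO thread next to the main thread
        return 1
    # Otherwise keep the 20 IO threads mentioned in the commit message
    return eventlet_default
```

For example, under these assumptions `effective_tpool_size(None, 4)` yields one IO thread, while `effective_tpool_size(8, 4)` keeps the operator's explicit value.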
Mingyu Li
a1134e4aa2 Order devices in the output of swift-ring-builder
After the change to reuse device id's [1], the order of devices in the
output of swift-ring-builder is confusing. This patch lists the devices
in order of (region, zone, ip, device).
The effect of this patch is as illustrated in [2].

This patch also partially fixes Bug 1545016.

1. https://review.openstack.org/#/c/265461/
2. https://github.com/MicrowiseOnGitHub/tempfiles/blob/master/reorder_ring_output

Change-Id: I564ed1b8d0cd4a6250649689b1bce7ad3574fe57
Partial-Bug: 1545016
Closes-Bug: 1536743
2017-06-22 16:06:48 -07:00
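The (region, zone, ip, device) ordering described in the commit above (a1134e4aa2) can be sketched with a plain sort key. The device dictionaries below are made-up sample data, and parsing the IP with the `ipaddress` module so that 10.0.0.2 sorts before 10.0.0.10 is an assumption of this sketch, not something the commit message states.

```python
import ipaddress

# Hypothetical devices, deliberately listed out of order
devices = [
    {'id': 3, 'region': 2, 'zone': 1, 'ip': '10.0.0.2', 'device': 'sdb'},
    {'id': 0, 'region': 1, 'zone': 2, 'ip': '10.0.0.10', 'device': 'sdb'},
    {'id': 2, 'region': 1, 'zone': 1, 'ip': '10.0.0.10', 'device': 'sda'},
    {'id': 1, 'region': 1, 'zone': 1, 'ip': '10.0.0.2', 'device': 'sdc'},
]


def tier_sort_key(dev):
    # Compare by region, then zone, then numeric IP, then device name
    return (dev['region'], dev['zone'],
            ipaddress.ip_address(dev['ip']), dev['device'])


ordered = sorted(devices, key=tier_sort_key)
ordered_ids = [dev['id'] for dev in ordered]
```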
Jenkins
2d18ecdf4b Merge "Replace slowdown option with *_per_second option" 2017-06-22 01:18:26 +00:00
Jenkins
169d1d8ab8 Merge "Require that known-bad EC schemes be deprecated" 2017-06-22 01:11:03 +00:00
Viktor Varga
2cb74b1b84 Replaced assertTrue(False, msg) with fail(msg)
In some unit tests self.assertTrue(False, msg) was used instead of
self.fail(msg), which might be ambiguous.

Using assertTrue(False, msg) gives the following message on fail:

    File "C:\Python361\lib\unittest\case.py", line 678, in assertTrue
      raise self.failureException(msg)
      AssertionError: False is not true : msg

'False is not true' message implies that unit test failed (as the
result is False while we asserted True).

Replacing it with self.fail(msg) is less ambiguous and more readable.

    File "C:\Python361\lib\unittest\case.py", line 666, in fail
      raise self.failureException(msg)
      AssertionError: msg

TrivialFix

Change-Id: Ib56a0ed8549fd7af2724eb59222106888781e9c8
2017-06-21 12:16:33 +02:00
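The difference in failure messages described in the commit above (2cb74b1b84) can be reproduced with a small sketch. The class and helper names are hypothetical, and the messages shown assume Python 3's default unittest settings.

```python
import unittest


class FailMessageExample(unittest.TestCase):
    """Hypothetical holder for the assertion helpers."""

    def runTest(self):
        pass


def failure_text(check):
    # Run one assertion helper and return the resulting failure message
    case = FailMessageExample()
    try:
        check(case)
    except AssertionError as err:
        return str(err)
    return None


old_style = failure_text(lambda case: case.assertTrue(False, 'msg'))
new_style = failure_text(lambda case: case.fail('msg'))
```

With these definitions `old_style` carries the extra "False is not true" prefix, while `new_style` is just the message itself.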
Alistair Coles
3cccd5a0ed Make bin/swift-get-nodes executable
Change-Id: I510e05f4cf16cc6740f673c27cea6bc3899938c1
2017-06-20 11:47:00 +01:00
Ondřej Nový
a8bc94c7e3 Replace slowdown option with *_per_second option
Container and object updaters sleep "slowdown" (default 0.01) seconds
after every processed container/object. Because the time.sleep call adds
overhead, use ratelimit_sleep from common.utils instead, the same as in
the auditor.

Change-Id: I362aa0f13c78ad03ce1f76ee0257b0646f981212
2017-06-16 19:22:00 +00:00
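The idea behind replacing a fixed per-item sleep with a per-second rate (a8bc94c7e3) can be sketched as below. This is not Swift's ratelimit_sleep; the `RateLimiter` class, its parameters, and the injectable clock are all assumptions made for illustration. The point is that the loop only sleeps when it is running ahead of the target rate.

```python
import time


class RateLimiter:
    """Illustrative limiter: at most max_per_second calls to wait() per second."""

    def __init__(self, max_per_second, clock=time.monotonic, sleep=time.sleep):
        self.interval = 1.0 / max_per_second
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = None

    def wait(self):
        """Sleep only if the caller is ahead of schedule; return the delay."""
        now = self.clock()
        if self.next_allowed is None or now >= self.next_allowed:
            # On or behind schedule: no sleep, just schedule the next slot
            self.next_allowed = now + self.interval
            return 0.0
        delay = self.next_allowed - now
        self.sleep(delay)
        self.next_allowed += self.interval
        return delay
```

A processing loop would call `limiter.wait()` once per container or object instead of an unconditional `time.sleep(slowdown)`.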
Tim Burke
2c3ac543f4 Require that known-bad EC schemes be deprecated
We said we were going to do it, we've had two releases saying we'd do
it, we've even backported our saying it to Newton -- let's actually do
it.

Upgrade Consideration
=====================

Erasure-coded storage policies using isa_l_rs_vand and nparity >= 5 must
be configured as deprecated, preventing any new containers from being
created with such a policy. This configuration is known to harm data
durability. Any data in such policies should be migrated to a new
policy. See https://bugs.launchpad.net/swift/+bug/1639691 for more
information.

UpgradeImpact
Related-Change: I50159c9d19f2385d5f60112e9aaefa1a68098313
Change-Id: I8f9de0bec01032d9d9b58848e2a76ac92e65ab09
Closes-Bug: 1639691
2017-06-16 17:58:43 +00:00
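The upgrade rule stated in the commit above (2c3ac543f4) can be expressed as a small check. The policy dictionaries and their keys (`ec_type`, `nparity`, `deprecated`) are hypothetical stand-ins for a real storage policy configuration; only the isa_l_rs_vand and nparity >= 5 condition comes from the commit message.

```python
def must_be_deprecated(policy):
    # Known-bad scheme per the commit message: isa_l_rs_vand with nparity >= 5
    return (policy.get('ec_type') == 'isa_l_rs_vand'
            and int(policy.get('nparity', 0)) >= 5)


def check_policy(policy):
    """Raise ValueError if a known-bad EC policy is not marked deprecated."""
    if must_be_deprecated(policy) and not policy.get('deprecated', False):
        raise ValueError(
            'policy %r uses isa_l_rs_vand with nparity >= 5 and must be '
            'configured as deprecated' % policy.get('name'))
    return True
```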
Christian Schwede
e1140666d6 Add support to increase object ring partition power
This patch adds methods to increase the partition power of an existing
object ring without downtime for the users using a 3-step process. Data
won't be moved to other nodes; objects using the new increased partition
power will be located on the same device and are hardlinked to avoid
data movement.

1. A new setting "next_part_power" will be added to the rings, and once
the proxy server has reloaded the rings it will send this value to the
object servers on any write operation. Object servers will now create a
hard-link in the new location to the original DiskFile object. Existing
data will be hardlinked into the new locations using a new relinker
tool.

2. The actual partition power itself will be increased. Servers will now
use the new partition power to read from and write to. Hard links in the
old object locations are no longer required and have to be removed by
the relinker tool; the relinker tool reads the next_part_power setting
to find object locations that need to be cleaned up.

3. The "next_part_power" flag will be removed.

This mostly implements the spec in [1]; however it's not using an
"epoch" as described there. The idea of the epoch was to store data
using different partition powers in their own namespace to avoid
conflicts with auditors and replicators as well as being able to abort
such an operation and just remove the new tree.  This would require some
heavy change of the on-disk data layout, and other object-server
implementations would be required to adopt this scheme too.

Instead the object-replicator is now aware that there is a partition
power increase in progress and will skip replication of data in that
storage policy; the relinker tool should be simply run and afterwards
the partition power will be increased. This shouldn't take that much
time (it's only walking the filesystem and hardlinking), so the impact
should be low. The relinker should be run on all storage nodes at the
same time in parallel to decrease the required time (though this is not
mandatory). Failures during relinking should not affect cluster
operations - relinking can even be aborted manually and restarted later.

Auditors are not quarantining objects written to a path with a different
partition power and therefore work as before (though they are reading
each object twice in the worst case before the no longer needed hard
links are removed).

Co-Authored-By: Alistair Coles <alistair.coles@hpe.com>
Co-Authored-By: Matthew Oliver <matt@oliver.net.au>
Co-Authored-By: Tim Burke <tim.burke@gmail.com>

[1] https://specs.openstack.org/openstack/swift-specs/specs/in_progress/increasing_partition_power.html

Change-Id: I7d6371a04f5c1c4adbb8733a71f3c177ee5448bb
2017-06-15 15:08:48 -07:00
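One way to see how an object's partition changes when the partition power in the commit above (e1140666d6) goes up by one is the sketch below. It assumes the partition is taken from the top bits of a hash of the object path; the helper name, the use of a bare MD5 without any cluster-specific prefix or suffix, and the sample path are all assumptions of this sketch.

```python
import hashlib
import struct


def partition_for(path, part_power):
    """Illustrative mapping: top part_power bits of the first 4 hash bytes."""
    digest = hashlib.md5(path.encode('utf-8')).digest()
    return struct.unpack_from('>I', digest)[0] >> (32 - part_power)


old_part = partition_for('/account/container/object', 10)
new_part = partition_for('/account/container/object', 11)

# With one more bit, the new partition is the old one shifted left by one,
# plus either 0 or 1.
candidates = (2 * old_part, 2 * old_part + 1)
```

Under that assumption, `new_part` is always one of `candidates`, so every old partition maps onto exactly two new ones.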
Tim Burke
0b60b0bd7e Close all versioned_writes subrequests' app_iters
While we're at it, drop some unused expect_exception code from
test_versioned_writes.py

Change-Id: Id232880c7259839159c057471b61083d5862d189
Related-Change: I430c48c4a81e8392fa271160bcbc1817ef0a88f7
2017-06-15 14:32:44 -07:00
Romain LE DISEZ
a7c8becd4e Fix a socket leak in copy middleware
When the "copy" middleware tries to copy a segmented object which is
bigger than max_file_size, it immediately returns "413 Request Entity Too
Large". But at that point, connections have already been established by
the proxy server to the object servers. These connections must be closed
before returning.

Closes-Bug: #1698207
Change-Id: I430c48c4a81e8392fa271160bcbc1817ef0a88f7
2017-06-15 22:49:17 +02:00
Jenkins
6181351a65 Merge "Make mount_check option usable in containerized environments" 2017-06-14 19:49:23 +00:00
Jenkins
aed3edc522 Merge "Follow up for affinity config per policy" 2017-06-14 02:54:11 +00:00
Jenkins
73da215bdf Merge "Ring doc cleanups" 2017-06-14 02:54:03 +00:00
Jenkins
4315093a28 Merge "More Global EC doc updates" 2017-06-13 21:13:07 +00:00
Tim Burke
b5ee8c88d0 Ring doc cleanups
Change-Id: Ie51ea5c729341da793887e1e25c1e45301a96751
2017-06-13 09:23:23 -07:00
Kota Tsuyuzaki
066f44323d Follow up for affinity config per policy
This change:
- Adds assertions for write_affinity values in load_app tests
- Adds a test case in which a policy overrides the read_affinity default
  with the timing strategy
- Uses 'label' rather than 'scope'

Related-Change: I3f718f425f525baa80045ba067950c752bcaaefc
Change-Id: Ia8262490895d60da345f3679fc53653b2c2a2b3e
2017-06-13 11:53:11 +01:00
Clay Gerrard
4c7839d256 More Global EC doc updates
Soften the language about inefficiency on read and strengthen the
language encouraging the use of read affinity and composite rings.

Change-Id: Idc81a8c71e74ae28d384759700c5268d77ae3c85
2017-06-13 10:08:20 +01:00
Jenkins
41c8f1330f Merge "Update Global EC docs with reference to composite rings" 2017-06-13 06:26:54 +00:00
Alistair Coles
9665252352 Update Global EC docs with reference to composite rings
* In light of the composite rings feature being added [1],
  downgrade the warnings about EC Duplication [2] being
  experimental.

* Add links from Global EC docs to composite rings and
  per-policy proxy config features.

* Add discussion of using EC duplication with composite
  rings.

* Update Known Issues.

[1] Related-Change: I0d8928b55020592f8e75321d1f7678688301d797
[2] Related-Change: Idd155401982a2c48110c30b480966a863f6bd305

Change-Id: Id97a4899255945a6eaeacfef12fd29a2580588df
2017-06-12 16:58:02 -07:00
Jenkins
75c1bea7a5 Merge "Cleanup db replicator probetest" 2017-06-12 21:13:15 +00:00
Jenkins
7bbe02b290 Merge "Allow to configure the nameservers in cname_lookup" 2017-06-12 19:48:45 +00:00
Jenkins
53b356b1ee Merge "Fix unit test failing when swift.conf has default policy index >10" 2017-06-12 19:22:26 +00:00
Jenkins
8aa560dc9c Merge "remove remote qualifier from release notes branch scanning" 2017-06-12 19:18:47 +00:00
Jenkins
d46b0f29f9 Merge "Limit number of revert tombstone SSYNC requests" 2017-06-08 18:10:08 +00:00
Mahati Chamarthy
188c07e12a Limit number of revert tombstone SSYNC requests
Revert tombstone only parts try to talk to all primary nodes - this
fixes it to randomize selection within part_nodes. Corresponding probe
test is modified to reflect this change.

The primary improvement of this patch is that the reconstructor at a
handoff node is able to delete local tombstones when it succeeds in
syncing to fewer than all primary nodes. (Before this patch, it required
all nodes to be responsive to the REVERT requests.)

The number of primary nodes the reconstructor communicates with can be
discussed further but, right now with this patch, it's (replicas - k + 1),
which is enough to prevent stale reads.

*BONUS*

- Fix incorrect test setup (was setting fewer replicas than ec_k + ec_m)
  for reconstructor ring in the unit test

Co-Authored-By: Kota Tsuyuzaki <tsuyuzaki.kota@lab.ntt.co.jp>
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>

Change-Id: I05ce8fe75f1c4a7971cc8995b003df818b69b3c1
Closes-Bug: #1668857
2017-06-08 07:07:42 +00:00
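The (replicas - k + 1) figure from the commit above (188c07e12a) can be illustrated with a short sketch. The function name and arguments are hypothetical and not taken from the reconstructor code; the sketch picks that many primaries at random, as the commit describes. With a random choice of replicas - k + 1 nodes, only k - 1 primaries are left out, so any k primaries include at least one chosen node.

```python
import random


def pick_revert_targets(part_nodes, ec_ndata, rng=random):
    """Pick (replicas - k + 1) primaries at random from part_nodes.

    part_nodes is a list of primary nodes and ec_ndata is k; both names
    are assumptions of this sketch.
    """
    count = len(part_nodes) - ec_ndata + 1
    return rng.sample(part_nodes, count)


# Example: a 10+4 policy has 14 primaries, so 14 - 10 + 1 = 5 targets
nodes = list(range(14))
targets = pick_revert_targets(nodes, 10, random.Random(0))
left_out = len(nodes) - len(targets)
```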