83 Commits

Author SHA1 Message Date
Alistair Coles
45884c1102 Enable per policy proxy config options
This is an alternative approach to that proposed in [1]

Adds support for optional per-policy config sections
to be added in proxy-server.conf. This is highly desirable
to allow per-policy affinity options to be set for use with
duplicated EC policies [2] and composite rings [3].

Certain options found in per-policy conf sections will
override their equivalents that may be set in the
[app:proxy-server] section. Currently the options
handled that way are:

  sorting_method
  read_affinity
  write_affinity
  write_affinity_node_count

For example:

  [proxy-server:policy:0]
  sorting_method = affinity
  read_affinity = r1=100
  write_affinity = r1
  write_affinity_node_count = 1 * replicas

The corresponding attributes of the proxy-server Application
are now available from instances of an OverrideConf object
that is obtained from Application.get_policy_options(policy).

[1] Related-Change: I9104fc789ba85ab3ab5ccd34096125b482821389
[2] Related-Change: Idd155401982a2c48110c30b480966a863f6bd305
[3] Related-Change: I0d8928b55020592f8e75321d1f7678688301d797

Co-Authored-By: Kota Tsuyuzaki <tsuyuzaki.kota@lab.ntt.co.jp>
Change-Id: I3f718f425f525baa80045ba067950c752bcaaefc
2017-05-23 20:22:30 +01:00
Pete Zaitcev
5dfc3a75fb Open-code eventlet.listen()
Recently out gate started blowing up intermittently with a strange
case of ports mixed up. Sometimes a functional tests tries to
authorize on a port that's clearly an object server port, and
the like. As it turns out, eventlet developers added an unavoidable
SO_REUSEPORT into listen(), which makes listen(("localhost",0)
to reuse ports.

There's an issue about it:
 https://github.com/eventlet/eventlet/issues/411

This patch is working around the problem while eventlet people
consider the issue.

Change-Id: I67522909f96495a6a30e1acdb79835dce2189549
2017-05-11 01:39:14 -06:00
Thiago da Silva
4d3aa4ea78 refactor some common code from crypto
This patch moves some code from the crypto files
to a more common modules that will be used by symlinks

Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>

Change-Id: I1758693c5dd428f9f2157966aac49d97c2c7ab12
Signed-off-by: Thiago da Silva <thiago@redhat.com>
2017-03-09 11:12:29 -05:00
Tim Burke
523bc0ab71 Always set swift processes to use UTC
Previously, we would set the TZ environment variable to the result of

    time.strftime("%z", time.gmtime())

This has a few problems.

 1. The "%z" format does not appear in the table of formatting
    directives for strftime [1]. While it *does* appear in a
    footnote [2] for that section, it is described as "not supported by
    all ANSI C libraries." This may explain the next point.

 2. On the handful of Linux platforms I've tested, the above produces
    "+0000" regardless of the system's timezone. This seems to run
    counter to the intent of the patches that introduced the TZ
    mangling. (See the first two related changes.)

 3. The above does not produce a valid Posix TZ format, which expects
    (at minimum) a name consisting of three or more alphabetic
    characters followed by the offset to be added to the local time to
    get Coordinated Universal Time (UTC).

Further, while we would change os.environ['TZ'], we would *not* call
time.tzset like it says in the docs [3], which seems like a Bad Thing.

Some combination of the above has the net effect of changing some of the
functions in the time module to use UTC. (Maybe all of them? At the very
least, time.localtime and time.mktime.) However, it does *not* change
the offset stored in time.timezone, which causes bad behavior when
dealing with local timestamps [4].

Now, set TZ to "UTC+0" and call tzset. Apparently we don't have a good
way of getting local timezone info, we were (unintentionally?) using UTC
before, and you should probably be running your servers in UTC anyway.

[1] https://docs.python.org/2/library/time.html#time.strftime
[2] https://docs.python.org/2/library/time.html#id2
[3] https://docs.python.org/2/library/time.html#time.tzset
[4] Like in email.utils.mktime_tz, prior to being fixed in
    https://hg.python.org/cpython/rev/a283563c8cc4

Change-Id: I007425301914144e228b9cfece5533443e851b6e
Related-Change: Ifc78236a99ed193a42389e383d062b38f57a5a31
Related-Change: I8ec80202789707f723abfe93ccc9cf1e677e4dc6
Related-Change: Iee7488d03ab404072d3d0c1a262f004bb0f2da26
2016-12-19 16:23:13 -08:00
Jenkins
6e900205a9 Merge "monkey_patch_mimetools() now does nothing on py3" 2016-08-18 10:53:48 +00:00
Tim Burke
f7a820ed3a Wait for a non-empty chunk in WSGIContext._app_call
We're functioning as a WSGI server here, so this bit from PEP-3333 seems
to apply:

> The start_response callable must not actually transmit the response
> headers. Instead, it must store them for the server or gateway to
> transmit only after the first iteration of the application return
> value that yields a non-empty bytestrin ... . In other words, response
> headers must not be sent until there is actual body data available, or
> until the application's returned iterable is exhausted.

Plus, it mirrors what swob.Request.call_application does.

Change-Id: I1e8501f8ce91ea912780db64fee1c56bef809a98
2016-08-12 05:51:57 +00:00
Victor Stinner
fbf0e49917 monkey_patch_mimetools() now does nothing on py3
The mimetools module has been removed from Python 3: modify
monkey_patch_mimetools() to do nothing on Python 3.

Skip test_monkey_patch_mimetools() on Python 3.

Change-Id: I50f01ec159efedbb4df759ddd1e13928ac28fba6
2016-07-26 19:58:38 +02:00
Samuel Merritt
ce90a1e79e Make info caching work across subrequests
Previously, if you called get_account_info, get_container_info, or
get_object_info, then the results of that call would be cached in the
WSGI environment as top-level keys. This is okay, except that if you,
in middleware, copy the WSGI environment and then make a subrequest
using the copy, information retrieved in the subrequest is cached
only in the copy and not in the original. This can mean lots of extra
trips to memcache for, say, SLO validation where the segments are in
another container; the object HEAD ends up getting container info for
the segment container, but then the next object HEAD gets it again.

This commit moves the cache for get_*_info into a dictionary at
environ['swift.infocache']; this way, you can shallow-copy the request
environment and still get the benefits from the cache.

Change-Id: I3481b38b41c33cd1e39e19baab56193c5f9bf6ac
2016-05-13 10:36:49 -07:00
Jenkins
177e531a2e Merge "Remove unneeded setting of SO_REUSEADDR." 2016-05-12 09:19:19 +00:00
Prashanth Pai
46d61a4dcd Refactor server side copy as middleware
Rewrite server side copy and 'object post as copy' feature as middleware to
simplify the PUT method in the object controller code. COPY is no longer
a verb implemented as public method in Proxy application.

The server side copy middleware is inserted to the left of dlo, slo and
versioned_writes middlewares in the proxy server pipeline. As a result,
dlo and slo copy_hooks are no longer required. SLO manifests are now
validated when copied so when copying a manifest to another account the
referenced segments must be readable in that account for the manifest
copy to succeed (previously this validation was not made, meaning the
manifest was copied but could be unusable if the segments were not
readable).

With this change, there should be no change in functionality or existing
behavior. This is asserted with (almost) no changes required to existing
functional tests.

Some notes (for operators):
* Middleware required to be auto-inserted before slo and dlo and
  versioned_writes
* Turning off server side copy is not configurable.
* object_post_as_copy is no longer a configurable option of proxy server
  but of this middleware. However, for smooth upgrade, config option set
  in proxy server app is also read.

DocImpact: Introducing server side copy as middleware

Co-Authored-By: Alistair Coles <alistair.coles@hpe.com>
Co-Authored-By: Thiago da Silva <thiago@redhat.com>

Change-Id: Ic96a92e938589a2f6add35a40741fd062f1c29eb
Signed-off-by: Prashanth Pai <ppai@redhat.com>
Signed-off-by: Thiago da Silva <thiago@redhat.com>
2016-05-11 14:55:51 -04:00
Samuel Merritt
deaef2f9d6 Remove unneeded setting of SO_REUSEADDR.
This has been in eventlet.listen() since version 0.15.

Change-Id: Ib77b28231a2180f1ea082f356c4687c39681a6f7
2016-05-05 15:40:40 -07:00
Dmitriy Ukhlov
746d928a87 Adds eventlet monkey patching of select module if thread is pathed
Oslo.messaging pika driver requires patching of select module if thread
is patched.
Pika driver uses select call and if it is not patched onsuming messages
blocks whole eventlet loop

Closes-Bug: #1570242
Change-Id: I9756737309f401ebddb7475eb84725f65bca01bf
2016-04-15 14:04:12 +00:00
Chaozhe.Chen
4a44e27e00 Replace assertEqual(None, *) with assertIsNone in tests
As swift no longer supports Python 2.6, replace assertEqual(None, *)
with assertIsNone in tests to have more clear messages in case of
failure.

Change-Id: I94af3e8156ef40465d4f7a2cb79fb99fc7bbda56
Closes-Bug: #1280522
2016-02-16 23:49:06 +08:00
Alistair Coles
30d74af653 Insert versioned_writes in correct pipeline position
If not explicitly configured the versioned_writes middleware
should be auto-inserted in the pipeline after slo and dlo, which
is where the versioned_writes filter section's comments say it
should be in proxy-server.conf-sample. At the moment it can end up
being placed ahead of slo and dlo if they have been explicitly
configured, which results in the linked bug manifesting.

Closes-Bug: #1537042
Change-Id: I6ac95a331f4ef0d4887311940acc6f8bc00fb4eb
2016-02-02 18:30:06 +00:00
Matthew Oliver
87f7e907ee Pass HTTP_REFERER down to subrequests
Currently a HTTP_REFERER (Referer) header isn't passed down to
subrequests. This means *LO subrequests to segment containers
return a 403 on a *LO GET when accessed by requests using referer
ACLs.
Currently the only way around referer access to *LO's is to make the
segments container world readable.

This change makes sure the referer header is passed into subrequests
allowing a segments container to only need to be locked down with
the same referer as the *LO container.

This is a 1 line change to code, but also adds a unit and 2 functional
functional tests (one for DLO and one for SLO).

Change-Id: I1fa5328979302d9c8133aa739787c8dae6084f54
Closes-Bug: #1526575
2015-12-17 14:28:40 +00:00
Jenkins
d9e44bda05 Merge "monkeypatch thread for keystoneclient" 2015-11-11 21:08:53 +00:00
Mehdi Abaakouk
bf8689474a monkeypatch thread for keystoneclient
keystoneclient uses threading.Lock(), but swift doesn't
monkeypatch threading, this result in lockup when two
greenthreads try to acquire a non green lock.

This change fixes that.

Change-Id: I9b44284a5eb598a6978364819f253e031f4eaeef
Closes-bug: #1508424
2015-11-03 16:36:19 +01:00
Samuel Merritt
e31ecb24b6 Get rid of contextlib.nested() for py3
contextlib.nested() is missing completely in Python 3.

Since 2.7, we can use multiple context managers in a 'with' statement,
like so:

    with thing1() as t1, thing2() as t2:
        do_stuff()

Now, if we had some code that needed to nest an arbitrary number of
context managers, there's stuff we could do with contextlib.ExitStack
and such... but we don't. We only use contextlib.nested() in tests to
set up bunches of mocks without crazy-deep indentation, and all that
stuff fits perfectly into multiple-context-manager 'with' statements.

Change-Id: Id472958b007948f05dbd4c7fb8cf3ffab58e2681
2015-10-23 11:44:54 -07:00
janonymous
f5f9d791b0 pep8 fix: assertEquals -> assertEqual
assertEquals is deprecated in py3, replacing it.

Change-Id: Ida206abbb13c320095bb9e3b25a2b66cc31bfba8
Co-Authored-By: Ondřej Nový <ondrej.novy@firma.seznam.cz>
2015-10-11 12:57:25 +02:00
Victor Stinner
c0af385173 py3: Replace urllib imports with six.moves.urllib
The urllib, urllib2 and urlparse modules of Python 2 were reorganized
into a new urllib namespace on Python 3. Replace urllib, urllib2 and
urlparse imports with six.moves.urllib to make the modified code
compatible with Python 2 and Python 3.

The initial patch was generated by the urllib operation of the sixer
tool on: bin/* swift/ test/.

Change-Id: I61a8c7fb7972eabc7da8dad3b3d34bceee5c5d93
2015-10-08 15:24:13 +02:00
Victor Stinner
7bea148d2f pep8: replace deprecated calls to assert_()
The TestCase.assert_() has been deprecated in Python 2.7. Replace it
with assertTrue() or even better methods (assertIn, assertNotIn,
assertIsInstance) which provide better error messages.

Change-Id: I21c730351470031a2dabe5238693095eabdb8964
2015-08-19 12:05:01 -07:00
Clément Contini
d6267c1f95 Keep user id and project id in subrequests env
Keep HTTP_X_USER_ID and HTTP_X_PROJECT_ID to be available as
user_id and project_id in storage.objects.outgoing.bytes in
ceilometer when downloading a multipart object.

Change-Id: I0f4734f021e5d6e84d48ed9bebeb321d7a9590ad
Closes-Bug: #1477283
2015-08-18 16:31:51 +00:00
Thiago da Silva
035a411660 versioned writes middleware
Rewrite object versioning as middleware to simplify the PUT method
in the object controller.

The functionality remains basically the
same with the only major difference being the ability to now
version slo manifest files. dlo manifests are still not
supported as part of this patch.

Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>

DocImpact
Change-Id: Ie899290b3312e201979eafefb253d1a60b65b837
Signed-off-by: Thiago da Silva <thiago@redhat.com>
Signed-off-by: Prashanth Pai <ppai@redhat.com>
2015-08-07 14:11:32 -04:00
Darrell Bishop
207dd9b49d Fix regression in WSGI server SIGHUP behavior
The SIGHUP receipt used to pop us out of an os.wait() where now, it's in
a "green" wait() and Timeout() combo, some part of which eats the signal
receipt.  This causes the while loop condition to never get checked and
SIGHUP no longer works as a server reload command.

The fix is to loop at least every 0.5 seconds, as a trade-off between
not busy-waiting and checking the "keep running" condition often enough
to feel responsive.

Change-Id: I95283b8b7cfc2998ab5813e0ad3ca1fa231696c8
Closes-Bug: #1479972
2015-07-30 14:39:55 -07:00
Jenkins
187cea5445 Merge "Replace StringIO with BytesIO for WSGI input" 2015-07-24 06:52:40 +00:00
Jenkins
260e976e50 Merge "Get StringIO and cStringIO from six.moves" 2015-07-24 06:52:36 +00:00
janonymous
cd7b2db550 unit tests: Replace "self.assert_" by "self.assertTrue"
The assert_() method is deprecated and can be safely replaced by assertTrue().
This patch makes sure that running the tests does not create undesired
warnings.

Change-Id: I0602ba39ef93263386644ee68088d5f65fcb4a71
2015-07-21 19:23:00 +05:30
Victor Stinner
8753e452b0 Replace StringIO with BytesIO for WSGI input
wsgi.input is a binary stream (bytes), not a text stream (unicode).

* Replace StringIO with BytesIO for WSGI input
* Replace StringIO('') with StringIO() and replace WsgiStringIO('') with
  WsgiStringIO(): an empty string is already the default value

Change-Id: I09c9527be2265a6847189aeeb74a17261ddc781a
2015-07-15 16:56:33 +02:00
Victor Stinner
6e70f3fa32 Get StringIO and cStringIO from six.moves
* replace "from cStringIO import StringIO"
  with "from six.moves import cStringIO as StringIO"
* replace "from StringIO import StringIO"
  with "from six import StringIO"
* replace "import cStringIO" and "cStringIO.StringIO()"
  with "from six import moves" and "moves.cStringIO()"
* replace "import StringIO" and "StringIO.StringIO()"
  with "import six" and "six.StringIO()"

This patch was generated by the stringio operation of the sixer tool:
https://pypi.python.org/pypi/sixer

Change-Id: Iacba77fec3045f96773d1090c0bd48613729a561
2015-07-15 16:56:33 +02:00
Victor Stinner
1cc3eff958 Fixes for mock 1.1
The new release of mock 1.1 is more strict. It helped to find bugs in
tests.

Closes-Bug: #1473369
Change-Id: Id179513c6010d827cbcbdda7692a920e29213bcb
2015-07-10 16:37:11 +02:00
Victor Stinner
e5c962a28c Replace xrange() with six.moves.range()
Patch generated by the xrange operation of the sixer tool:
https://pypi.python.org/pypi/sixer

Manual changes:

* Fix indentation for pep8 checks
* Fix TestGreenthreadSafeIterator.test_access_is_serialized of
  test.unit.common.test_utils:
  replace range(1, 11) with list(range(1, 11))
* Fix UnsafeXrange docstring, revert change

Change-Id: Icb7e26135c5e57b5302b8bfe066b33cafe69fe4d
2015-06-23 07:29:15 +00:00
Darrell Bishop
df134df901 Allow 1+ object-servers-per-disk deployment
Enabled by a new > 0 integer config value, "servers_per_port" in the
[DEFAULT] config section for object-server and/or replication server
configs.  The setting's integer value determines how many different
object-server workers handle requests for any single unique local port
in the ring.  In this mode, the parent swift-object-server process
continues to run as the original user (i.e. root if low-port binding
is required), binds to all ports as defined in the ring, and forks off
the specified number of workers per listen socket.  The child, per-port
servers drop privileges and behave pretty much how object-server workers
always have, except that because the ring has unique ports per disk, the
object-servers will only be handling requests for a single disk.  The
parent process detects dead servers and restarts them (with the correct
listen socket), starts missing servers when an updated ring file is
found with a device on the server with a new port, and kills extraneous
servers when their port is found to no longer be in the ring.  The ring
files are stat'ed at most every "ring_check_interval" seconds, as
configured in the object-server config (same default of 15s).

Immediately stopping all swift-object-worker processes still works by
sending the parent a SIGTERM.  Likewise, a SIGHUP to the parent process
still causes the parent process to close all listen sockets and exit,
allowing existing children to finish serving their existing requests.
The drop_privileges helper function now has an optional param to
suppress the setsid() call, which otherwise screws up the child workers'
process management.

The class method RingData.load() can be told to only load the ring
metadata (i.e. everything except replica2part2dev_id) with the optional
kwarg, header_only=True.  This is used to keep the parent and all
forked off workers from unnecessarily having full copies of all storage
policy rings in memory.

A new helper class, swift.common.storage_policy.BindPortsCache,
provides a method to return a set of all device ports in all rings for
the server on which it is instantiated (identified by its set of IP
addresses).  The BindPortsCache instance will track mtimes of ring
files, so they are not opened more frequently than necessary.

This patch includes enhancements to the probe tests and
object-replicator/object-reconstructor config plumbing to allow the
probe tests to work correctly both in the "normal" config (same IP but
unique ports for each SAIO "server") and a server-per-port setup where
each SAIO "server" must have a unique IP address and unique port per
disk within each "server".  The main probe tests only work with 4
servers and 4 disks, but you can see the difference in the rings for the
EC probe tests where there are 2 disks per server for a total of 8
disks.  Specifically, swift.common.ring.utils.is_local_device() will
ignore the ports when the "my_port" argument is None.  Then,
object-replicator and object-reconstructor both set self.bind_port to
None if server_per_port is enabled.  Bonus improvement for IPv6
addresses in is_local_device().

This PR for vagrant-swift-all-in-one will aid in testing this patch:
https://github.com/swiftstack/vagrant-swift-all-in-one/pull/16/

Also allow SAIO to answer is_local_device() better; common SAIO setups
have multiple "servers" all on the same host with different ports for
the different "servers" (which happen to match the IPs specified in the
rings for the devices on each of those "servers").

However, you can configure the SAIO to have different localhost IP
addresses (e.g. 127.0.0.1, 127.0.0.2, etc.) in the ring and in the
servers' config files' bind_ip setting.

This new whataremyips() implementation combined with a little plumbing
allows is_local_device() to accurately answer, even on an SAIO.

In the default case (an unspecified bind_ip defaults to '0.0.0.0') as
well as an explict "bind to everything" like '0.0.0.0' or '::',
whataremyips() behaves as it always has, returning all IP addresses for
the server.

Also updated probe tests to handle each "server" in the SAIO having a
unique IP address.

For some (noisy) benchmarks that show servers_per_port=X is at least as
good as the same number of "normal" workers:
https://gist.github.com/dbishop/c214f89ca708a6b1624a#file-summary-md

Benchmarks showing the benefits of I/O isolation with a small number of
slow disks:
https://gist.github.com/dbishop/fd0ab067babdecfb07ca#file-results-md

If you were wondering what the overhead of threads_per_disk looks like:
https://gist.github.com/dbishop/1d14755fedc86a161718#file-tabular_results-md

DocImpact

Change-Id: I2239a4000b41a7e7cc53465ce794af49d44796c6
2015-06-18 12:43:50 -07:00
janonymous
09e7477a39 Replace it.next() with next(it) for py3 compat
The Python 2 next() method of iterators was renamed to __next__() on
Python 3. Use the builtin next() function instead which works on Python
2 and Python 3.

Change-Id: Ic948bc574b58f1d28c5c58e3985906dee17fa51d
2015-06-15 22:10:45 +05:30
Yuan Zhou
61a9d35fd5 Update container sync to use internal client
This patch changes container sync to use Internal Client instead
of Direct Client.

In the current design, container sync uses direct_get_object to
get the newest source object(which talks to storage node directly).
This works fine for replication storage policies however in
erasure coding policies, direct_get_object would only return part
of the object(it's encoded as several pieces). Using Internal
Client can get the original object in EC case.

Note that for the container sync put/delete part, it's working in
EC since it's using Simple Client.

Signed-off-by: Yuan Zhou <yuan.zhou@intel.com>

DocImpact
Change-Id: I91952bc9337f354ce6024bf8392046a1ecf6ecc9
2015-04-14 00:52:17 -07:00
Clay Gerrard
c767ec9a37 Let eventlet.wsgi.server log tracebacks when eventlet_debug is enabled
The "logging" available in eventlet.wsgi.server/BaseHTTPServer doesn't
generally suite our needs, so it should be bypassed using a NullLogger in
production.  But in development it can be useful if tracebacks generated from
inside eventlet.wsgi (say a NameError in DiskFile.__iter__) end up in logs.
Since we already have eventlet_debug parsed inside of run_server we can skip
the NullLogger bypass and let stuff blast out to STDERR when configured for
development/debug logging.

Change-Id: I20a9e82c7fed8948bf649f1f8571b4145fca201d
2014-09-15 18:23:32 -07:00
John Dickinson
b7281cf2c5 make the bind_port config setting required
In a long-term effort to change the recommended ports for Swift,
the first step is to require the bind_port in config files. Later,
we can change the recommended setting.

Anyone currently explicitly setting the ports will not be affected.
Anyone not setting the ports will need to specify them to match their
rings.

DocImpact

Change-Id: Icca83a263acdd0afc9016424a3e9f8c15e944789
2014-09-08 07:28:43 -07:00
Jenkins
dbc2102652 Merge "Disable case-changing behavior in Eventlet" 2014-07-17 01:02:31 +00:00
MORITA Kazutaka
f2774a4d11 Disable case-changing behavior in Eventlet
RFC 2616 says that HTTP header fields are case-insensitive.  However, there are
some S3 clients who don't accept normalized header by Swift and Eventlet.  For
example, AWS Java SDK expects that an etag header is 'ETag', not 'Etag'.

This patch disables Eventlet's header capitalization so that the swift3
middleware can normalize the response headers as those clients expect.

Note that this change requires a fix for Eventlet, which will be included in
the next Eventlet release (v0.15).

Change-Id: I6d3428b0dafef776bdb3ebac7639b3126fa5e60d
2014-06-25 12:26:06 +09:00
Paul Luse
441a171cf5 Fix issues with test_wsgi.py and Storage Policies
Discovered some tests that were coupling the code under test with the
storage policies configured in /etc/swift/swift.conf.  There was some
tests that created fake rings in their tempdirs, but didn't reset or
patch the POLICIES global.  So if your local config needed more rings
that the fake's were setting up (just 2) the tests would puke when they
loaded up an app that looked for rings.  I think this probably started
happening when we added eager object ring loading back into the proxy.

 * two TestCases in test_wsgi were missing @patch_policies
 * fixed issue with patch_policies that could cause state to bleed
   between tests
 * patch_policies' legacy and default collections get a FakeRing by
   default

 * drive-by cleanup for test_loadapp_proxy() ring serialized path
   handling
 * drive-by cleanup for test_internal_client that was doing basically
   the same thing as test_wsgi

Change-Id: Ia706000ba961ed24f2c22b81041e53a0c3f302fc
2014-06-23 23:13:18 -07:00
Clay Gerrard
8bec50838c Extend interface on InternalClient
* add get_object
 * allow extra headers passthrough on HEAD/metadata reqeusts
 * expose (account|container|get_object)_ring properties

Pipeline propety access to the auto_create_account_prefix also allows us to
bypass the early exit on a container HEAD for auto_create_accounts if the
container-updater hasn't cycled yet.

Allow overriding of storage policy index.

This is something the reconciler will need so that it can GET from one
policy, PUT in another, and then DELETE from the first one again.

DocImpact
Implements: blueprint storage-policies
Change-Id: I9b287d15f2426022d669d1186c9e22dd8ca13fb9
2014-06-18 17:31:39 -07:00
Clay Gerrard
3824ff3df7 Add Storage Policy support to Object Server
Objects now have a storage policy index associated with them as well;
this is determined by their filesystem path. Like before, objects in
policy 0 are in /srv/node/$disk/objects; this provides compatibility
on upgrade. (Recall that policy 0 is given to all existing data when a
cluster is upgraded.) Objects in policy 1 are in
/srv/node/$disk/objects-1, objects in policy 2 are in
/srv/node/$disk/objects-2, and so on.

 * 'quarantined' dir already created 'objects' subdir so now there
   will also be objects-N created at the same level

This commit does not address replicators, auditors, or updaters except
where method signatures changed. They'll still work if your cluster
has only one storage policy, though.

DocImpact
Implements: blueprint storage-policies
Change-Id: I459f3ed97df516cb0c9294477c28729c30f48e09
2014-06-18 17:31:38 -07:00
Clay Gerrard
7624b198cf Update FakeRing and FakeLogger
FakeLogger gets better log level handling

Parameterize logger on some daemons which were previously
unparameterized and try and use the interface in tests.

FakeRing use more real code

The existing FakeRing mock's implementation bit me on some pretty subtle
character encoding issue by-passing the hash_path code that is normally
part of get_part_nodes.  This change tries to exercise more of the real
ring code paths when it makes sense and provide a better Fake for use in
testing.

Add write_fake_ring helper to test.unit for when you need a real ring.

DocImpact
Implements: blueprint storage-policies
Change-Id: Id2e3740b1dd569050f4e083617e7dd6a4249027e
2014-06-18 17:31:37 -07:00
Jenkins
373e06a6a0 Merge "Print pipeline names in the reported pipeline" 2014-05-27 16:04:33 +00:00
Brian Cline
b4c5a13664 Uses None instead of mutables for function param defaults
As seen on #1174809, changes use of mutable types as default
arguments and defaults them within the method. Otherwise, those
defaults can be unexpectedly persisted with the function between
invocations and erupt into mass hysteria on the streets.

There was indeed a test (TestSimpleClient.test_get_with_retries)
that was erroneously relying on this behavior. Since previous tests
had populated their own instantiations with a token, this test only
passed because the modified headers dict from previous tests was
being overridden. As expected, with the mutable defaults fix in
SimpleClient, this test begain to fail since it never specified any
token, yet it has always passed anyway. This change also now provides
the expected token.

Change-Id: If95f11d259008517dab511e88acfe9731e5a99b5
Related-Bug: #1174809
2014-05-10 11:15:56 +00:00
Pete Zaitcev
4ce9b252fd Print pipeline names in the reported pipeline
When current code modifies the pipeline, it prints the entry point
names instead of the names used to construct the pipeline. This is
inconvenient because a sysadmin cannot copy and paste from the log.

We already save the pipeline name into contexts in most cases, so
the fix simply reuses that to provide friendly names.

Fixes bug: 1311802

Change-Id: Ic76baf1360cd521f140fa1980029ccbce58f1717
2014-05-09 19:10:08 -06:00
Jenkins
d7744acde7 Merge "Don't lard up InternalClient with extra middleware" 2014-03-05 01:09:42 +00:00
Samuel Merritt
09ef06fd99 Convert all old-style classes to new-style
This cleanup has been slowly happening for a while; let's finish it.

Change-Id: I1561e3540d524834e0cc5bc725ab80936eae1f0e
2014-03-03 17:28:48 -08:00
Samuel Merritt
3f19041c99 Don't lard up InternalClient with extra middleware
One can argue that it makes sense for the client-facing proxy server
to have certain middlewares like gatekeeper in its pipeline, but that
is not desirable for InternalClient. In particular, it prevents you
from passing in sysmeta headers using InternalClient, and I found
myself wanting to do that earlier today.

Now InternalClient's proxy application gets exactly what's configured;
no more, no less. This will mean that the object expirer can read and
write sysmeta headers, but I think we can trust it to keep our
secrets.

Change-Id: I17b4a89c24e600754701ee1645b40406421fa6f3
2014-02-28 12:55:15 -08:00
Samuel Merritt
6acea29fa6 Move all DLO functionality to middleware
This is for the same reason that SLO got pulled into middleware, which
includes stuff like automatic retry of GETs on broken connection and
the multi-ring storage policy stuff.

The proxy will automatically insert the dlo middleware at an
appropriate place in the pipeline the same way it does with the
gatekeeper middleware. Clusters will still support DLOs after upgrade
even with an old config file that doesn't mention dlo at all.

Includes support for reading config values from the proxy server's
config section so that upgraded clusters continue to work as before.

Bonus fix: resolve 'after' vs. 'after_fn' in proxy's required filters
list. Having two was confusing, so I kept the more-general one.

DocImpact

blueprint multi-ring-large-objects

Change-Id: Ib3b3830c246816dd549fc74be98b4bc651e7bace
2014-02-03 18:29:48 -08:00
anc
6164fa246d Generic means for persisting system metadata.
Middleware or core features may need to store metadata
against accounts or containers. This patch adds a
generic mechanism for system metadata to be persisted
in backend databases, without polluting the user
metadata namespace, by using the reserved header
namespace x-<server_type>-sysmeta-*.

Modifications are firstly that backend servers persist
system metadata headers alongside user metadata and
other system state.

For accounts and containers, system metadata in PUT
and POST requests is treated in a similar way to user
metadata. System metadata is not yet supported for
object requests.

Secondly, changes in the proxy controllers ensure that
headers in the system metadata namespace will pass through
in requests to backend servers.

Thirdly, system metadata returned from backend servers
in GET or HEAD responses is added to the cached info
dict, which middleware can access.

Finally, a gatekeeper middleware module is provided
which filters all system metadata headers from requests
and responses by removing headers with names starting
x-account-sysmeta-, x-container-sysmeta-. The gatekeeper
also removes headers starting x-object-sysmeta- in
anticipation of future support for system metadata being
set for objects. This prevents clients from writing or
reading system metadata.

The required_filters list in swift/proxy/server.py is
modified to include the gatekeeper middleware so that
if the gatekeeper has not been configured in the
pipeline then it will be automatically inserted close
to the start of the pipeline.

blueprint cluster-federation

Change-Id: I80b8b14243cc59505f8c584920f8f527646b5f45
2014-01-06 22:29:37 +00:00