Add documentation for GateKeeper

The purpose of GateKeeper mostly relates to the development of new Swift code,
so I put together a guide for development_middleware that covers some basics
with an eye towards metadata handling in particular.

I also fixed up some missing autodocs, split out the middleware autodoc, and
added some refs here and there so I could link to them from the
development_middleware guide.

DocImpact
Change-Id: I20dd942ea8df9e33c3e794cb49669ffa1332c63e
Clay Gerrard 2014-01-17 01:15:22 -08:00 committed by Peter Portante
parent f538006bbf
commit 63f8c2284a
14 changed files with 558 additions and 178 deletions


@ -771,6 +771,8 @@ delay_reaping 0 Normally, the reaper begins deleting
2592000 = 30 days, for example. 2592000 = 30 days, for example.
================== =============== ========================================= ================== =============== =========================================
.. _proxy-server-config:
-------------------------- --------------------------
Proxy Server Configuration Proxy Server Configuration
-------------------------- --------------------------
@ -828,6 +830,15 @@ log_custom_handlers None Comma separated list of functions
handlers. handlers.
eventlet_debug false If true, turn on debug logging eventlet_debug false If true, turn on debug logging
for eventlet for eventlet
expose_info true Enables exposing configuration
settings via HTTP GET /info.
admin_key Key to use for admin calls that
are HMAC signed. Default
is empty, which will
disable admin calls to
/info.
============================ =============== ============================= ============================ =============== =============================
[proxy-server] [proxy-server]


@ -0,0 +1,251 @@
=======================
Middleware and Metadata
=======================
----------------
Using Middleware
----------------
`Python WSGI Middleware`_ (or just "middleware") can be used to "wrap"
the request and response of a Python WSGI application (i.e. a webapp,
or REST/HTTP API), like Swift's WSGI servers (proxy-server,
account-server, container-server, object-server). Swift uses middleware
to add (sometimes optional) behaviors to the Swift WSGI servers.
.. _Python WSGI Middleware: http://www.python.org/dev/peps/pep-0333/#middleware-components-that-play-both-sides
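The "wrapping" described above can be illustrated without any Swift code. The following is a minimal sketch that relies only on the WSGI calling convention; ``simple_app``, ``HeaderMiddleware``, ``call_app`` and the ``X-Example`` header are hypothetical names invented for this illustration and are not part of Swift.

```python
# Stand-alone sketch: no Swift imports, only the WSGI calling convention.


def simple_app(environ, start_response):
    """A trivial WSGI application standing in for a Swift server."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'hello']


class HeaderMiddleware(object):
    """Hypothetical middleware that touches both request and response."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        # Request side: annotate the environ before the wrapped app runs.
        environ['example.seen_by_middleware'] = True

        def wrapped_start_response(status, headers, exc_info=None):
            # Response side: add a header on the way back to the client.
            headers = list(headers) + [('X-Example', 'wrapped')]
            return start_response(status, headers, exc_info)

        return self.app(environ, wrapped_start_response)


def call_app(app):
    """Invoke a WSGI app once and capture status, headers and body."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured['status'] = status
        captured['headers'] = dict(headers)

    environ = {'REQUEST_METHOD': 'GET', 'PATH_INFO': '/'}
    body = b''.join(app(environ, start_response))
    return captured['status'], captured['headers'], body
```

Calling ``call_app(HeaderMiddleware(simple_app))`` returns the same status and body as the bare application, plus the extra header added by the wrapper.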
Middleware can be added to the Swift WSGI servers by modifying their
`paste`_ configuration file. The majority of Swift middleware is applied
to the :ref:`proxy-server`.
.. _paste: http://pythonpaste.org/
Given the following basic configuration::

    [DEFAULT]
    log_level = DEBUG
    user = <your-user-name>

    [pipeline:main]
    pipeline = proxy-server

    [app:proxy-server]
    use = egg:swift#proxy

You could add the :ref:`healthcheck` middleware by adding a section for
that filter and adding it to the pipeline::

    [DEFAULT]
    log_level = DEBUG
    user = <your-user-name>

    [pipeline:main]
    pipeline = healthcheck proxy-server

    [filter:healthcheck]
    use = egg:swift#healthcheck

    [app:proxy-server]
    use = egg:swift#proxy

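The order of names in the ``pipeline`` option matters: the first filter listed ends up outermost and sees the request first. The sketch below models only that ordering; it is not how PasteDeploy is implemented, and ``build_pipeline``, ``make_tagging_filter`` and the list-based "request" are assumptions made for illustration.

```python
def build_pipeline(filters, app):
    """Wrap app so that the first filter listed is the outermost one."""
    for filter_ in reversed(filters):
        app = filter_(app)
    return app


def make_tagging_filter(name):
    """Return a hypothetical filter that records when it sees a request."""
    def tagging_filter(app):
        def wrapped(request_log):
            request_log.append(name)
            return app(request_log)
        return wrapped
    return tagging_filter


def proxy_server(request_log):
    """Stand-in for the final app at the end of the pipeline."""
    request_log.append('proxy-server')
    return request_log


# Mirrors "pipeline = healthcheck proxy-server" from the config above.
pipeline = build_pipeline([make_tagging_filter('healthcheck')], proxy_server)
print(pipeline([]))  # ['healthcheck', 'proxy-server']
```

Adding a second filter name to the list places it between ``healthcheck`` and the app, just as adding a name to the ``pipeline`` line would.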
Some middleware is required and will be inserted into your pipeline
automatically by core Swift code (e.g. the proxy-server will insert
:ref:`catch_errors` and :ref:`gatekeeper` at the start of the pipeline if they
are not already present). You can see which features are available on a given
Swift endpoint (including middleware) using the :ref:`discoverability`
interface.
----------------------------
Creating Your Own Middleware
----------------------------
The best way to see how to write middleware is to look at examples.
Many optional features in Swift are implemented as
:ref:`common_middleware` and provided in ``swift.common.middleware``, but
Swift middleware may be packaged and distributed as a separate project.
Some examples are listed on the :ref:`associated_projects` page.
The contrived middleware example below modifies request behavior by
inspecting custom HTTP headers (e.g. X-Webhook) and uses :ref:`sysmeta`
to persist data to backend storage. It also demonstrates common patterns
such as a :func:`.get_container_info` cache/query and the :func:`.wsgify`
decorator::

    from swift.common.http import is_success
    from swift.common.swob import wsgify
    from swift.common.utils import split_path, get_logger
    from swift.common.request_helpers import get_sys_meta_prefix
    from swift.proxy.controllers.base import get_container_info
    from eventlet import Timeout
    from eventlet.green import urllib2

    # x-container-sysmeta-webhook
    SYSMETA_WEBHOOK = get_sys_meta_prefix('container') + 'webhook'


    class WebhookMiddleware(object):

        def __init__(self, app, conf):
            self.app = app
            self.logger = get_logger(conf, log_route='webhook')

        @wsgify
        def __call__(self, req):
            obj = None
            try:
                (version, account, container, obj) = \
                    split_path(req.path_info, 4, 4, True)
            except ValueError:
                # not an object request
                pass
            if 'x-webhook' in req.headers:
                # translate user's request header to sysmeta
                req.headers[SYSMETA_WEBHOOK] = \
                    req.headers['x-webhook']
            if 'x-remove-webhook' in req.headers:
                # empty value will tombstone sysmeta
                req.headers[SYSMETA_WEBHOOK] = ''
            # account and object storage will ignore x-container-sysmeta-*
            resp = req.get_response(self.app)
            if obj and is_success(resp.status_int) and req.method == 'PUT':
                container_info = get_container_info(req.environ, self.app)
                # container_info may have our new sysmeta key
                webhook = container_info['sysmeta'].get('webhook')
                if webhook:
                    # create a POST request with obj name as body
                    webhook_req = urllib2.Request(webhook, data=obj)
                    with Timeout(20):
                        try:
                            urllib2.urlopen(webhook_req).read()
                        except (Exception, Timeout):
                            self.logger.exception(
                                'failed POST to webhook %s' % webhook)
                        else:
                            self.logger.info(
                                'successfully called webhook %s' % webhook)
            if SYSMETA_WEBHOOK in resp.headers:
                # translate sysmeta from the backend resp to
                # user-visible client resp header
                resp.headers['x-webhook'] = resp.headers[SYSMETA_WEBHOOK]
            return resp


    def webhook_factory(global_conf, **local_conf):
        conf = global_conf.copy()
        conf.update(local_conf)

        def webhook_filter(app):
            # conf is captured from the enclosing factory
            return WebhookMiddleware(app, conf)
        return webhook_filter

In practice this middleware will call the URL stored on the container as
X-Webhook on all successful object uploads.
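The header translation at the heart of the example can be looked at in isolation. The sketch below uses plain dictionaries instead of swob requests and a stand-in ``sys_meta_prefix`` helper, which is assumed to behave like ``get_sys_meta_prefix`` (i.e. to yield ``x-container-sysmeta-`` for containers); the two ``translate_*`` functions are hypothetical and exist only for this illustration.

```python
def sys_meta_prefix(server_type):
    # Stand-in for swift.common.request_helpers.get_sys_meta_prefix;
    # assumed to produce e.g. 'x-container-sysmeta-'.
    return 'x-%s-sysmeta-' % server_type.lower()


SYSMETA_WEBHOOK = sys_meta_prefix('container') + 'webhook'


def translate_request_headers(headers):
    """Map the client-facing headers onto the sysmeta header."""
    # Plain dicts are case-sensitive, so normalize names first.
    translated = dict((k.lower(), v) for k, v in headers.items())
    if 'x-webhook' in translated:
        translated[SYSMETA_WEBHOOK] = translated['x-webhook']
    if 'x-remove-webhook' in translated:
        # An empty value asks the backend to drop the stored sysmeta.
        translated[SYSMETA_WEBHOOK] = ''
    return translated


def translate_response_headers(headers):
    """Expose stored sysmeta to the client as X-Webhook."""
    translated = dict((k.lower(), v) for k, v in headers.items())
    if SYSMETA_WEBHOOK in translated:
        translated['x-webhook'] = translated[SYSMETA_WEBHOOK]
    return translated
```

Keeping the client-facing name (``X-Webhook``) separate from the stored name is what lets the middleware own its storage key while clients never write sysmeta directly.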
If this example were at ``<swift-repo>/swift/common/middleware/webhook.py``,
you could add it to your proxy by creating a new filter section and
adding it to the pipeline::

    [DEFAULT]
    log_level = DEBUG
    user = <your-user-name>

    [pipeline:main]
    pipeline = healthcheck webhook proxy-server

    [filter:webhook]
    paste.filter_factory = swift.common.middleware.webhook:webhook_factory

    [filter:healthcheck]
    use = egg:swift#healthcheck

    [app:proxy-server]
    use = egg:swift#proxy

Most Python packages expose middleware as entry points. See the `PasteDeploy`_
documentation for more information about the syntax of the ``use`` option.
All middleware included with Swift is installed to support the ``egg:swift``
syntax.
.. _PasteDeploy: http://pythonpaste.org/deploy/#egg-uris
Middleware may advertise its availability and capabilities via Swift's
:ref:`discoverability` support by using
:func:`.register_swift_info`::

    from swift.common.utils import register_swift_info

    def webhook_factory(global_conf, **local_conf):
        conf = global_conf.copy()
        conf.update(local_conf)
        register_swift_info('webhook')

        def webhook_filter(app):
            return WebhookMiddleware(app, conf)
        return webhook_filter

--------------
Swift Metadata
--------------
Generally speaking, metadata is information about a resource that is
associated with the resource but is not the data contained in the
resource itself. In Swift, metadata is set and retrieved via HTTP headers
(e.g. the "Content-Type" of a Swift object, which is returned in HTTP
response headers).
All user resources in Swift (i.e. accounts, containers, and objects) can have
user metadata associated with them. Middleware may also persist custom
metadata to accounts and containers safely using System Metadata. Some
core Swift features which predate sysmeta have added exceptions for
custom non-user metadata headers (e.g. :ref:`acls`,
:ref:`large-objects`).
^^^^^^^^^^^^^
User Metadata
^^^^^^^^^^^^^
User metadata takes the form of ``X-<type>-Meta-<key>: <value>``, where
``<type>`` depends on the resource's type (i.e. Account, Container, Object)
and ``<key>`` and ``<value>`` are set by the client.
User metadata should generally be reserved for use by the client or
client applications. A good example use case for user metadata is
`python-swiftclient`_'s ``X-Object-Meta-Mtime``, which it stores on the
objects it uploads to implement its ``--changed`` option, which only
uploads files that have changed since the last upload.
.. _python-swiftclient: https://github.com/openstack/python-swiftclient
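The ``X-<type>-Meta-<key>`` form described above can be built and taken apart with a few lines of code. This is a minimal sketch; ``user_meta_header`` and ``parse_user_meta_header`` are hypothetical helpers written for this illustration and are not part of Swift or python-swiftclient.

```python
RESOURCE_TYPES = ('Account', 'Container', 'Object')


def user_meta_header(resource_type, key):
    """Build a header name of the form X-<type>-Meta-<key>."""
    resource_type = resource_type.capitalize()
    if resource_type not in RESOURCE_TYPES:
        raise ValueError('unknown resource type: %r' % resource_type)
    return 'X-%s-Meta-%s' % (resource_type, key)


def parse_user_meta_header(name):
    """Return (type, key) for a user metadata header name, else None."""
    parts = name.split('-', 3)
    if (len(parts) == 4 and parts[0].lower() == 'x'
            and parts[1].capitalize() in RESOURCE_TYPES
            and parts[2].lower() == 'meta'):
        return parts[1].capitalize(), parts[3]
    return None


print(user_meta_header('object', 'Mtime'))  # X-Object-Meta-Mtime
```

Because the key is everything after the third hyphen, keys that themselves contain hyphens (such as ``Web-Index``) survive a round trip through ``parse_user_meta_header``.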
New middleware should avoid storing metadata within the User Metadata
namespace to avoid potential conflict with existing user metadata when
introducing new metadata keys. An example of legacy middleware that
borrows the user metadata namespace is :ref:`tempurl`. An example of
middleware which uses custom non-user metadata to avoid the user
metadata namespace is :ref:`slo-doc`.
.. _sysmeta:
^^^^^^^^^^^^^^^
System Metadata
^^^^^^^^^^^^^^^
System metadata takes the form of ``X-<type>-Sysmeta-<key>: <value>``,
where ``<type>`` depends on the resource's type (i.e. Account, Container,
Object) and ``<key>`` and ``<value>`` are set by trusted code running in a
Swift WSGI Server.
All headers on client requests in the form of ``X-<type>-Sysmeta-<key>``
will be dropped from the request before being processed by any
middleware. All headers on responses from back-end systems in the form
of ``X-<type>-Sysmeta-<key>`` will be removed after all middleware has
processed the response but before the response is sent to the client.
See :ref:`gatekeeper` middleware for more information.
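The filtering described above amounts to removing every header whose name matches a sysmeta prefix. The following is a simplified model of that idea, not the gatekeeper implementation itself: the pattern list and ``is_sysmeta`` are assumptions for this sketch, and ``remove_items`` is a stand-in written to match the documented behavior of the helper of the same name (it returns a dict of the headers it removed).

```python
import re

# Assumed prefixes, one per resource type; matching is anchored at the
# start of the header name and ignores case.
SYSMETA_PATTERNS = [re.compile(pattern, re.IGNORECASE) for pattern in (
    r'x-account-sysmeta-',
    r'x-container-sysmeta-',
    r'x-object-sysmeta-',
)]


def is_sysmeta(header_name):
    """Return True if the header name starts with a sysmeta prefix."""
    return any(p.match(header_name) for p in SYSMETA_PATTERNS)


def remove_items(headers, condition):
    """Remove matching headers in place and return the removed ones."""
    removed = {}
    for key in list(headers):
        if condition(key):
            removed[key] = headers.pop(key)
    return removed


client_headers = {
    'X-Container-Sysmeta-Webhook': 'http://example.com/hook',
    'X-Container-Meta-Color': 'blue',
}
dropped = remove_items(client_headers, is_sysmeta)
```

After the call, ``client_headers`` keeps only the user metadata header, while ``dropped`` holds the sysmeta header that a client tried to set directly.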
System metadata provides a means to store potentially private custom
metadata with associated Swift resources in a safe and secure fashion
without actually having to plumb custom metadata through the core Swift
servers. The incoming filtering ensures that the namespace cannot be
modified directly by client requests, and the outgoing filter ensures
that removing middleware that uses a specific system metadata key
renders it benign. New middleware should take advantage of system
metadata.


@ -66,6 +66,7 @@ Developer Documentation
development_guidelines development_guidelines
development_saio development_saio
development_auth development_auth
development_middleware
development_ondisk_backends development_ondisk_backends
Administrator Documentation Administrator Documentation
@ -93,6 +94,7 @@ Source Documentation
db db
object object
misc misc
middleware
Indices and tables Indices and tables

doc/source/middleware.rst Normal file

@ -0,0 +1,196 @@
.. _common_middleware:
**********
Middleware
**********
.. _common_tempauth:
TempAuth
========
.. automodule:: swift.common.middleware.tempauth
:members:
:show-inheritance:
KeystoneAuth
============
.. automodule:: swift.common.middleware.keystoneauth
:members:
:show-inheritance:
.. _healthcheck:
Healthcheck
===========
.. automodule:: swift.common.middleware.healthcheck
:members:
:show-inheritance:
.. _recon:
Recon
===========
.. automodule:: swift.common.middleware.recon
:members:
:show-inheritance:
.. _memecached:
Ratelimit
=========
.. automodule:: swift.common.middleware.ratelimit
:members:
:show-inheritance:
StaticWeb
=========
.. automodule:: swift.common.middleware.staticweb
:members:
:show-inheritance:
.. _tempurl:
TempURL
=======
.. automodule:: swift.common.middleware.tempurl
:members:
:show-inheritance:
FormPost
========
.. automodule:: swift.common.middleware.formpost
:members:
:show-inheritance:
Domain Remap
============
.. automodule:: swift.common.middleware.domain_remap
:members:
:show-inheritance:
CNAME Lookup
============
.. automodule:: swift.common.middleware.cname_lookup
:members:
:show-inheritance:
Cross Domain Policies
=====================
.. automodule:: swift.common.middleware.crossdomain
:members:
:show-inheritance:
Name Check (Forbidden Character Filter)
=======================================
.. automodule:: swift.common.middleware.name_check
:members:
:show-inheritance:
Memcache
========
.. automodule:: swift.common.middleware.memcache
:members:
:show-inheritance:
Proxy Logging
=============
.. automodule:: swift.common.middleware.proxy_logging
:members:
:show-inheritance:
.. _catch_errors:
CatchErrors
=============
.. automodule:: swift.common.middleware.catch_errors
:members:
:show-inheritance:
.. _gatekeeper:
GateKeeper
=============
.. automodule:: swift.common.middleware.gatekeeper
:members:
:show-inheritance:
Bulk Operations (Delete and Archive Auto Extraction)
====================================================
.. automodule:: swift.common.middleware.bulk
:members:
:show-inheritance:
Container Quotas
================
.. automodule:: swift.common.middleware.container_quotas
:members:
:show-inheritance:
Account Quotas
==============
.. automodule:: swift.common.middleware.account_quotas
:members:
:show-inheritance:
.. _slo-doc:
Static Large Objects
====================
.. automodule:: swift.common.middleware.slo
:members:
:show-inheritance:
List Endpoints
==============
.. automodule:: swift.common.middleware.list_endpoints
:members:
:show-inheritance:
Container Sync Middleware
=========================
.. automodule:: swift.common.middleware.container_sync
:members:
:show-inheritance:
.. _discoverability:
Discoverability
===============
By default, Swift provides clients with an interface that exposes details
about the installation. Unless disabled (i.e. ``expose_info=false`` in
:ref:`proxy-server-config`), a GET request to ``/info`` will return configuration
data in JSON format. An example response::
{"swift": {"version": "1.11.0"}, "staticweb": {}, "tempurl": {}}
This would signify to the client that swift version 1.11.0 is running and that
staticweb and tempurl are available in this installation.
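A client can read such a response with any JSON parser. The sketch below parses the example body shown above; ``available_features`` is a hypothetical helper, and fetching the body over HTTP is left out so that the example stays self-contained.

```python
import json


def available_features(info_body):
    """Return the sorted top-level keys of an /info response body."""
    return sorted(json.loads(info_body))


example_body = (
    '{"swift": {"version": "1.11.0"}, "staticweb": {}, "tempurl": {}}')

print(available_features(example_body))
# ['staticweb', 'swift', 'tempurl']
print(json.loads(example_body)['swift']['version'])
# 1.11.0
```

Checking for a key such as ``tempurl`` before relying on the feature lets a client degrade gracefully on clusters where the middleware is not enabled.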
There may be administrator-only information available via ``/info``. To
retrieve it, one must use an HMAC-signed request, similar to TempURL.
The signature may be produced like so::
swift-temp-url GET 3600 /info secret 2>/dev/null | sed s/temp_url/swiftinfo/g
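The same kind of signature can be sketched in Python. This is a minimal sketch under stated assumptions: that the signature is an HMAC-SHA1 over the method, expiry time and path joined by newlines (the scheme TempURL uses), and that the query parameters are named ``swiftinfo_sig`` and ``swiftinfo_expires`` (the names the ``sed`` substitution above would produce). ``signed_info_query`` is a hypothetical helper, and ``'secret'`` stands for the configured ``admin_key``.

```python
import hmac
from hashlib import sha1


def signed_info_query(admin_key, expires, method='GET', path='/info'):
    """Return a path plus query string carrying an HMAC-SHA1 signature.

    Assumes the signed message is "<method>\\n<expires>\\n<path>".
    """
    message = '%s\n%s\n%s' % (method, expires, path)
    sig = hmac.new(admin_key.encode('utf-8'),
                   message.encode('utf-8'), sha1).hexdigest()
    return '%s?swiftinfo_sig=%s&swiftinfo_expires=%s' % (path, sig, expires)


# expires is an absolute Unix timestamp; a fixed value is used here
# so that the example is reproducible.
query = signed_info_query('secret', 1400000000)
```

In real use the expiry would be computed as the current time plus a lifetime in seconds, matching the ``3600`` passed to ``swift-temp-url`` above.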


@ -33,25 +33,8 @@ Utils
:members: :members:
:show-inheritance: :show-inheritance:
.. _common_tempauth:
TempAuth
========
.. automodule:: swift.common.middleware.tempauth
:members:
:show-inheritance:
.. _acls: .. _acls:
KeystoneAuth
============
.. automodule:: swift.common.middleware.keystoneauth
:members:
:show-inheritance:
ACLs ACLs
==== ====
@ -68,6 +51,25 @@ WSGI
:members: :members:
:show-inheritance: :show-inheritance:
.. _swob:
Swob
====
.. automodule:: swift.common.swob
:members:
:show-inheritance:
.. _request_helpers:
Request Helpers
===============
.. automodule:: swift.common.request_helpers
:members:
:undoc-members:
:show-inheritance:
.. _direct_client: .. _direct_client:
Direct Client Direct Client
@ -97,26 +99,6 @@ Buffered HTTP
:members: :members:
:show-inheritance: :show-inheritance:
.. _healthcheck:
Healthcheck
===========
.. automodule:: swift.common.middleware.healthcheck
:members:
:show-inheritance:
.. _recon:
Recon
===========
.. automodule:: swift.common.middleware.recon
:members:
:show-inheritance:
.. _memecached:
MemCacheD MemCacheD
========= =========
@ -124,99 +106,6 @@ MemCacheD
:members: :members:
:show-inheritance: :show-inheritance:
Manager
=========
.. automodule:: swift.common.manager
:members:
:show-inheritance:
Ratelimit
=========
.. automodule:: swift.common.middleware.ratelimit
:members:
:show-inheritance:
StaticWeb
=========
.. automodule:: swift.common.middleware.staticweb
:members:
:show-inheritance:
TempURL
=======
.. automodule:: swift.common.middleware.tempurl
:members:
:show-inheritance:
FormPost
========
.. automodule:: swift.common.middleware.formpost
:members:
:show-inheritance:
Domain Remap
============
.. automodule:: swift.common.middleware.domain_remap
:members:
:show-inheritance:
CNAME Lookup
============
.. automodule:: swift.common.middleware.cname_lookup
:members:
:show-inheritance:
Proxy Logging
=============
.. automodule:: swift.common.middleware.proxy_logging
:members:
:show-inheritance:
Bulk Operations (Delete and Archive Auto Extraction)
====================================================
.. automodule:: swift.common.middleware.bulk
:members:
:show-inheritance:
Container Quotas
================
.. automodule:: swift.common.middleware.container_quotas
:members:
:show-inheritance:
Account Quotas
==============
.. automodule:: swift.common.middleware.account_quotas
:members:
:show-inheritance:
.. _slo-doc:
Static Large Objects
====================
.. automodule:: swift.common.middleware.slo
:members:
:show-inheritance:
List Endpoints
==============
.. automodule:: swift.common.middleware.list_endpoints
:members:
:show-inheritance:
Container Sync Realms Container Sync Realms
===================== =====================
@ -224,28 +113,9 @@ Container Sync Realms
:members: :members:
:show-inheritance: :show-inheritance:
Container Sync Middleware Manager
========================= =========
.. automodule:: swift.common.middleware.container_sync .. automodule:: swift.common.manager
:members: :members:
:show-inheritance: :show-inheritance:
Discoverability
===============
Swift can optionally be configured to provide clients with an interface
providing details about the installation. If configured, a GET request to
/info will return configuration data in JSON format. An example
response::
{"swift": {"version": "1.8.1"}, "staticweb": {}, "tempurl": {}}
This would signify to the client that swift version 1.8.1 is running and that
staticweb and tempurl are available in this installation.
There may be administrator-only information available via /info. To
retrieve it, one must use an HMAC-signed request, similar to TempURL.
The signature may be produced like so:
swift-temp-url GET 3600 /info secret 2>/dev/null | sed s/temp_url/swiftinfo/g


@ -1,3 +1,5 @@
.. _large-objects:
==================== ====================
Large Object Support Large Object Support
==================== ====================


@ -13,3 +13,40 @@ Proxy Server
:members: :members:
:undoc-members: :undoc-members:
:show-inheritance: :show-inheritance:
.. _proxy-controllers:
Proxy Controllers
=================
Base
~~~~
.. automodule:: swift.proxy.controllers.base
:members:
:undoc-members:
:show-inheritance:
Account
~~~~~~~
.. automodule:: swift.proxy.controllers.account
:members:
:undoc-members:
:show-inheritance:
Container
~~~~~~~~~
.. automodule:: swift.proxy.controllers.container
:members:
:undoc-members:
:show-inheritance:
Object
~~~~~~
.. automodule:: swift.proxy.controllers.obj
:members:
:undoc-members:
:show-inheritance:


@ -405,12 +405,12 @@ class InternalClient(object):
:param account: The container's account. :param account: The container's account.
:param container: Container to check. :param container: Container to check.
:returns : True if container exists, false otherwise.
:raises UnexpectedResponse: Exception raised when requests fail :raises UnexpectedResponse: Exception raised when requests fail
to get a response with an acceptable status to get a response with an acceptable status
:raises Exception: Exception is raised when code fails in an :raises Exception: Exception is raised when code fails in an
unexpected way. unexpected way.
:returns : True if container exists, false otherwise.
""" """
path = self.make_path(account, container) path = self.make_path(account, container)


@ -36,11 +36,9 @@ from swift.common.utils import get_logger
from swift.common.request_helpers import remove_items, get_sys_meta_prefix from swift.common.request_helpers import remove_items, get_sys_meta_prefix
import re import re
""" #: A list of python regular expressions that will be used to
A list of python regular expressions that will be used to #: match against inbound request headers. Matching headers will
match against inbound request headers. Matching headers will #: be removed from the request.
be removed from the request.
"""
# Exclude headers starting with a sysmeta prefix. # Exclude headers starting with a sysmeta prefix.
# If adding to this list, note that these are regex patterns, # If adding to this list, note that these are regex patterns,
# so use a trailing $ to constrain to an exact header match # so use a trailing $ to constrain to an exact header match
@ -52,11 +50,9 @@ inbound_exclusions = [get_sys_meta_prefix('account'),
# for system metadata being applied to objects # for system metadata being applied to objects
""" #: A list of python regular expressions that will be used to
A list of python regular expressions that will be used to #: match against outbound response headers. Matching headers will
match against outbound response headers. Matching headers will #: be removed from the response.
be removed from the response.
"""
outbound_exclusions = inbound_exclusions outbound_exclusions = inbound_exclusions


@ -15,14 +15,17 @@
''' '''
Created on February 27, 2012 Created on February 27, 2012
A filter that disallows any paths that contain defined forbidden characters A filter that disallows any paths that contain defined forbidden characters or
or that exceed a defined length. that exceed a defined length.
Place in proxy filter before proxy, e.g. Place early in the proxy-server pipeline after the left-most occurrence of the
``proxy-logging`` middleware (if present) and before the final
``proxy-logging`` middleware (if present) or the ``proxy-server`` app itself,
e.g.::
[pipeline:main] [pipeline:main]
pipeline = catch_errors healthcheck name_check cache ratelimit tempauth sos pipeline = catch_errors healthcheck proxy-logging name_check cache \
proxy-logging proxy-server ratelimit tempauth sos proxy-logging proxy-server
[filter:name_check] [filter:name_check]
use = egg:swift#name_check use = egg:swift#name_check


@ -40,7 +40,7 @@ added. For example::
use = egg:swift#staticweb use = egg:swift#staticweb
Any publicly readable containers (for example, ``X-Container-Read: .r:*``, see Any publicly readable containers (for example, ``X-Container-Read: .r:*``, see
`acls`_ for more information on this) will be checked for :ref:`acls` for more information on this) will be checked for
X-Container-Meta-Web-Index and X-Container-Meta-Web-Error header values:: X-Container-Meta-Web-Index and X-Container-Meta-Web-Error header values::
X-Container-Meta-Web-Index <index.name> X-Container-Meta-Web-Index <index.name>


@ -186,7 +186,8 @@ def remove_items(headers, condition):
:param headers: a dict of headers :param headers: a dict of headers
:param condition: a function that will be passed the header key as a :param condition: a function that will be passed the header key as a
single argument and should return True if the header is to be removed. single argument and should return True if the header
is to be removed.
:returns: a dict, possibly empty, of headers that have been removed :returns: a dict, possibly empty, of headers that have been removed
""" """
removed = {} removed = {}


@ -250,8 +250,11 @@ def get_object_info(env, app, path=None, swift_source=None):
""" """
Get the info structure for an object, based on env and app. Get the info structure for an object, based on env and app.
This is useful to middlewares. This is useful to middlewares.
Note: This call bypasses auth. Success does not imply that the
request has authorization to the object. .. note::
This call bypasses auth. Success does not imply that the request has
authorization to the object.
""" """
(version, account, container, obj) = \ (version, account, container, obj) = \
split_path(path or env['PATH_INFO'], 4, 4, True) split_path(path or env['PATH_INFO'], 4, 4, True)
@ -266,8 +269,11 @@ def get_container_info(env, app, swift_source=None):
""" """
Get the info structure for a container, based on env and app. Get the info structure for a container, based on env and app.
This is useful to middlewares. This is useful to middlewares.
Note: This call bypasses auth. Success does not imply that the
request has authorization to the account. .. note::
This call bypasses auth. Success does not imply that the request has
authorization to the container.
""" """
(version, account, container, unused) = \ (version, account, container, unused) = \
split_path(env['PATH_INFO'], 3, 4, True) split_path(env['PATH_INFO'], 3, 4, True)
@ -282,8 +288,11 @@ def get_account_info(env, app, swift_source=None):
""" """
Get the info structure for an account, based on env and app. Get the info structure for an account, based on env and app.
This is useful to middlewares. This is useful to middlewares.
Note: This call bypasses auth. Success does not imply that the
request has authorization to the container. .. note::
This call bypasses auth. Success does not imply that the request has
authorization to the account.
""" """
(version, account, _junk, _junk) = \ (version, account, _junk, _junk) = \
split_path(env['PATH_INFO'], 2, 4, True) split_path(env['PATH_INFO'], 2, 4, True)
@ -591,9 +600,11 @@ class GetOrHeadHandler(object):
def fast_forward(self, num_bytes): def fast_forward(self, num_bytes):
""" """
Will skip num_bytes into the current ranges. Will skip num_bytes into the current ranges.
:params num_bytes: the number of bytes that have already been read on :params num_bytes: the number of bytes that have already been read on
this request. This will change the Range header this request. This will change the Range header
so that the next req will start where it left off. so that the next req will start where it left off.
:raises NotImplementedError: if this is a multirange request :raises NotImplementedError: if this is a multirange request
:raises ValueError: if invalid range header :raises ValueError: if invalid range header
:raises HTTPRequestedRangeNotSatisfiable: if begin + num_bytes :raises HTTPRequestedRangeNotSatisfiable: if begin + num_bytes