=======================
Expiring Object Support
=======================

The ``swift-object-expirer`` offers scheduled deletion of objects. The Swift
client sets the ``X-Delete-At`` or ``X-Delete-After`` header on an object
``PUT`` or ``POST``, and the cluster automatically stops serving that object
at the specified time, removing it from the system shortly thereafter.

The ``X-Delete-At`` header takes a Unix Epoch timestamp, in integer form; for
example, ``1317070737`` represents ``Mon Sep 26 20:58:57 2011 UTC``.

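Such a timestamp can be produced from a UTC datetime with the Python standard
library; this small sketch reproduces the example value above::

    from datetime import datetime, timezone

    # 2011-09-26 20:58:57 UTC, as in the example above
    dt = datetime(2011, 9, 26, 20, 58, 57, tzinfo=timezone.utc)
    print(int(dt.timestamp()))  # 1317070737
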
The ``X-Delete-After`` header takes a positive integer number of seconds. The
proxy server that receives the request will convert this header into an
``X-Delete-At`` header using the request timestamp plus the value given.

If both the ``X-Delete-At`` and ``X-Delete-After`` headers are sent with a
request, the ``X-Delete-After`` header takes precedence.

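For instance, a client can schedule an existing object for deletion an hour
out by sending a ``POST`` with ``X-Delete-After``. Below is a minimal sketch
using the Python ``requests`` library, with a hypothetical endpoint and
token::

    import requests

    # Placeholders; a real request needs a valid storage URL and auth token.
    storage_url = "http://proxy.example.com:8080/v1/AUTH_test"
    token = "AUTH_tk_example"

    resp = requests.post(
        storage_url + "/mycontainer/myobject",
        headers={
            "X-Auth-Token": token,
            # The proxy converts this into X-Delete-At = request time + 3600.
            "X-Delete-After": "3600",
        },
    )
    print(resp.status_code)  # 202 on success
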
As expiring objects are added to the system, the object servers will record
the expirations in a hidden ``.expiring_objects`` account for the
``swift-object-expirer`` to handle later.

Usually, just one instance of the ``swift-object-expirer`` daemon needs to run
for a cluster. This isn't exactly automatic-failover high availability, but if
the daemon doesn't run for a few hours it should not be any real issue:
expired-but-not-yet-deleted objects will still return ``404 Not Found`` to
anyone who tries to ``GET`` or ``HEAD`` them, and they'll simply be deleted a
bit later when the daemon is restarted.

By default, the ``swift-object-expirer`` daemon runs with a concurrency of 1.
Increase this value for more concurrency; a concurrency of 1 may not be enough
to delete expiring objects in a timely fashion on a particular Swift cluster.

It is possible to run multiple daemons to do different parts of the work if a
single process with a concurrency of more than 1 is not enough (see the sample
config file for details).

To run the ``swift-object-expirer`` as multiple processes, set ``processes``
to the number of processes (either in the config file or on the command line),
then run one process for each part. Use ``process`` to specify the part of the
work to be done by each process, again via the command line or the config.
For example, to use three processes, set ``processes`` to 3 and run three
processes with ``process`` set to 0, 1, and 2. If multiple processes are used,
one must run for each part of the work or that part of the work will not be
done.

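The work is divided deterministically (in the legacy expirer, by hashing each
task into one of ``processes`` parts), which is why every part needs a running
worker. A toy Python sketch of the idea (not Swift's actual code; the task
names here are made up)::

    from hashlib import md5

    def my_share(tasks, process, processes):
        # Keep only the tasks whose hash falls in this worker's part.
        for name in tasks:
            if int(md5(name.encode()).hexdigest(), 16) % processes == process:
                yield name

    tasks = ["1317070737-AUTH_test/c/o1", "1317070799-AUTH_test/c/o2"]
    print(list(my_share(tasks, process=0, processes=3)))
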
By default the daemon looks for two different config files. When launching,
the process searches for the ``[object-expirer]`` section in the
``/etc/swift/object-server.conf`` config. If the section or the config is
missing, it will then look for and use the ``/etc/swift/object-expirer.conf``
config. The latter config file is considered deprecated and is searched for to
aid in cluster upgrades.

Delay Reaping of Objects from Disk
----------------------------------

Swift's expiring object ``x-delete-at`` feature can be used to have the
cluster reap users' objects automatically from disk on their behalf when they
no longer want them stored in their account. In some cases it may be necessary
to "intervene" in the expected expiration process to prevent accidental or
premature data loss if an object marked for expiration should NOT be deleted
immediately when it expires, for whatever reason. For these cases,
``swift-object-expirer`` offers configuration of a ``delay_reaping`` value on
accounts and containers, which provides a delay between when an object is
marked for deletion, or expired, and when it is actually reaped from disk.
When this is set in the object expirer config, the object expirer leaves
expired objects on disk (and in container listings) for the ``delay_reaping``
time. After this delay has passed, objects are reaped as normal.

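As a concrete illustration (toy arithmetic, not Swift code): with an object's
``x-delete-at`` set to ``T`` and a matching ``delay_reaping`` of 300 seconds,
ordinary requests for the object start returning ``404 Not Found`` at ``T``,
but the expirer will not reap it from disk before ``T + 300``::

    x_delete_at = 1317070737   # hypothetical expiration time (Unix epoch)
    delay_reaping = 300.0      # seconds, from the expirer config
    earliest_reap = x_delete_at + delay_reaping
    print(earliest_reap)       # 1317071037.0
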
The ``delay_reaping`` value can be set either at an account level or a
container level. When set at an account level, the object expirer will
only reap objects within the account after the delay. A container level
``delay_reaping`` works similarly for containers and overrides an account
level ``delay_reaping`` value.

The ``delay_reaping`` values are set in the ``[object-expirer]`` section in
either the object-server or object-expirer config files. They are configured
with dynamic config option names prefixed with ``delay_reaping_<ACCT>``
at the account level and ``delay_reaping_<ACCT>/<CNTR>`` at the container
level, with the ``delay_reaping`` value in seconds.

Here is an example of ``delay_reaping`` configs in the ``[object-expirer]``
section in ``object-server.conf``::

    [object-expirer]
    delay_reaping_AUTH_test = 300.0
    delay_reaping_AUTH_test2 = 86400.0
    delay_reaping_AUTH_test/test = 0.0
    delay_reaping_AUTH_test/test2 = 600.0

.. note::

    A container level ``delay_reaping`` value does not require an account
    level ``delay_reaping`` value but overrides the account level value for
    the same account if it exists. By default, no ``delay_reaping`` value is
    configured for any accounts or containers.

Accessing Objects After Expiration
----------------------------------

By default, objects that expire become inaccessible, even to the account
owner. The object may not have been deleted yet, but any ``GET``, ``HEAD``, or
``POST`` client request for the object will return ``404 Not Found`` after the
``x-delete-at`` timestamp has passed.

The ``swift-proxy-server`` offers the ability to globally configure a flag to
allow requests to access expired objects that have not yet been deleted.
When this flag is enabled, a user can make a ``GET``, ``HEAD``, or ``POST``
request with the header ``x-open-expired`` set to true to access the expired
object.

The global configuration is an opt-in flag that can be set in the
``[proxy-server]`` section of the ``proxy-server.conf`` file. It is configured
with a single flag ``allow_open_expired`` set to true or false. By default,
this flag is set to false.

Here is an example in the ``[proxy-server]`` section of ``proxy-server.conf``::

    [proxy-server]
    allow_open_expired = false

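With the flag enabled, an expired-but-not-yet-reaped object can be fetched by
adding the header to an ordinary request. A minimal sketch using the Python
``requests`` library, with a hypothetical endpoint and token::

    import requests

    # Placeholders; assumes the operator has set allow_open_expired = true.
    resp = requests.get(
        "http://proxy.example.com:8080/v1/AUTH_test/mycontainer/myobject",
        headers={
            "X-Auth-Token": "AUTH_tk_example",
            "X-Open-Expired": "true",  # reach the expired object
        },
    )
    print(resp.status_code)  # 200 if the object has not yet been reaped
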
To discover whether this flag is set, you can send a **GET** request to the
``/info`` :ref:`discoverability <discoverability>` path. This will return
configuration data in JSON format where the value of ``allow_open_expired`` is
exposed.

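For example (a sketch with a hypothetical proxy address; the exact placement
of the key in the JSON may vary by deployment, but core options are typically
grouped under ``swift``)::

    import requests

    caps = requests.get("http://proxy.example.com:8080/info").json()
    print(caps.get("swift", {}).get("allow_open_expired"))
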
When using a temporary URL to access the object, this feature is not enabled.
This means that adding the header will not allow requests to temporary URLs
to access expired objects.

Upgrading impact: General Task Queue vs Legacy Queue
----------------------------------------------------

The expirer daemon will be moving to a new general task-queue based design
that divides the work across all object servers; as such, only expirers
defined in the object-server config will be able to use the new system.
The parameters in both files are identical except for a new option in the
object-server ``[object-expirer]`` section, ``dequeue_from_legacy``, which
when set to ``True`` tells the expirer to check the legacy (soon to be
deprecated) queue in addition to using the new task queueing system.

.. note::

    The new task-queue system has not been completed yet, so an expirer with
    ``dequeue_from_legacy`` set to ``False`` will currently do nothing.

By default ``dequeue_from_legacy`` is ``False``; it must be set to ``True``
explicitly while migrating from the old expiring queue.

Any expirer using the old config ``/etc/swift/object-expirer.conf`` will not
use the new general task queue: it will ignore ``dequeue_from_legacy`` and
only check the legacy queue, meaning it runs as a legacy expirer.

Why is this important? If you are currently running object-expirers on nodes
that are not object storage nodes, then for the time being they will still
work, but only by dequeuing from the old queue. When the new general task
queue is introduced, expirers will be required to run on the object servers
so that any new objects added can be removed. If you're in this situation,
you can safely set up the new expirer section in the ``object-server.conf``
to deal with the new queue and leave the legacy expirers running elsewhere.

However, if your old expirers are running on the object-servers, the most
common topology, then you would add the new section to all object servers to
deal with the new queue. In order to maintain the same number of expirers
checking the legacy queue, pick the same number of nodes as you previously had
and turn on ``dequeue_from_legacy`` on those nodes only. Also note that on
these nodes you'd need to keep the legacy ``process`` and ``processes``
options to maintain the concurrency level for the legacy queue.

.. note::

    Be careful not to enable ``dequeue_from_legacy`` on too many expirers, as
    all legacy tasks are stored in a single hidden account and the same hidden
    containers. On a large cluster one may inadvertently overload the
    account/container servers handling the legacy expirer queue.

Here is a quick sample of the ``[object-expirer]`` section required in the
``object-server.conf``::

    [object-expirer]
    # log_name = object-expirer
    # log_facility = LOG_LOCAL0
    # log_level = INFO
    # log_address = /dev/log
    #
    interval = 300

    # If this is true, the expirer will execute tasks from the legacy expirer
    # task queue
    dequeue_from_legacy = false

    # processes can only be used in conjunction with `dequeue_from_legacy`.
    # So this option is ignored if dequeue_from_legacy=false.
    # processes is how many parts to divide the legacy work into, one part per
    # process that will be doing the work
    # processes set to 0 means that a single legacy process will be doing all
    # the work
    # processes can also be specified on the command line and will override
    # the config value
    # processes = 0

    # process can only be used in conjunction with `dequeue_from_legacy`.
    # So this option is ignored if dequeue_from_legacy=false.
    # process is which of the parts a particular legacy process will work on
    # process can also be specified on the command line and will override the
    # config value
    # process is "zero based"; if you want to use 3 processes, you should run
    # processes with process set to 0, 1, and 2
    # process = 0

    report_interval = 300

    # request_tries is the number of times the expirer's internal client will
    # attempt any given request in the event of failure. The default is 3.
    # request_tries = 3

    # concurrency is the level of concurrency to use to do the work; this
    # value must be set to at least 1
    # concurrency = 1

    # The expirer will re-attempt expiring if the source object is not
    # available up to reclaim_age seconds before it gives up and deletes the
    # entry in the queue.
    # reclaim_age = 604800

And for completeness, here is a quick sample of the legacy
``object-expirer.conf`` file::

    [DEFAULT]
    # swift_dir = /etc/swift
    # user = swift
    # You can specify default log routing here if you want:
    # log_name = swift
    # log_facility = LOG_LOCAL0
    # log_level = INFO

    [object-expirer]
    interval = 300

    [pipeline:main]
    pipeline = catch_errors cache proxy-server

    [app:proxy-server]
    use = egg:swift#proxy
    # See proxy-server.conf-sample for options

    [filter:cache]
    use = egg:swift#memcache
    # See proxy-server.conf-sample for options

    [filter:catch_errors]
    use = egg:swift#catch_errors
    # See proxy-server.conf-sample for options

.. note::

    When running legacy expirers, the daemon needs to run on a machine with
    access to all the backend servers in the cluster, but does not need proxy
    server or public access. The daemon will use its own internal proxy code
    instance to access the backend servers.