Files
deb-python-taskflow/doc/source/atoms.rst
Joshua Harlow 8103b4d7cd Remove wording issue (track does not make sense here)
Change-Id: I68b28339d56e4421c68e2fd650e5de7e29638143
2014-05-25 07:30:17 -07:00

180 lines
6.9 KiB
ReStructuredText

------------------------
Atoms, Tasks and Retries
------------------------
Atom
====
An :py:class:`atom <taskflow.atom.Atom>` is the smallest unit in TaskFlow which
acts as the base for other classes (its naming was inspired from the
similarities between this type and `atoms`_ in the physical world). Atoms
have a name and may have a version. An atom is expected to name desired input
values (requirements) and name outputs (provided values).
.. note::
For more details about atom inputs and outputs please visit
:doc:`arguments and results <arguments_and_results>`.
.. automodule:: taskflow.atom
.. _atoms: http://en.wikipedia.org/wiki/Atom
Task
=====
A :py:class:`task <taskflow.task.BaseTask>` (derived from an atom) is the
smallest possible unit of work that can have an execute & rollback sequence
associated with it. These task objects all derive
from :py:class:`~taskflow.task.BaseTask` which defines what a task must
provide in terms of properties and methods.
Currently the following *provided* types of task subclasses are:
* :py:class:`~taskflow.task.Task`: useful for inheriting from and creating your
own subclasses.
* :py:class:`~taskflow.task.FunctorTask`: useful for wrapping existing
functions into task objects.
.. note::
:py:class:`~taskflow.task.FunctorTask` task types can not currently be used
with the :doc:`worker based engine <workers>` due to the fact that
arbitrary functions can not be guaranteed to be correctly
located (especially if they are lambda or anonymous functions) on the
worker nodes.
Retry
=====
A :py:class:`retry <taskflow.retry.Retry>` (derived from an atom) is a special
unit that handles errors, controls flow execution and can (for example) retry
other atoms with other parameters if needed. When an associated atom
fails, these retry units are *consulted* to determine what the resolution
method should be. The goal is that with this *consultation* the retry atom
will suggest a method for getting around the failure (perhaps by retrying,
reverting a single item, or reverting everything contained in the retries
associated scope).
Currently derivatives of the :py:class:`retry <taskflow.retry.Retry>` base
class must provide a ``on_failure`` method to determine how a failure should
be handled.
The current enumeration set that can be returned from this method is:
* ``RETRY`` - retries the surrounding subflow (a retry object is associated
with a flow, which is typically converted into a graph hierarchy at
compilation time) again.
* ``REVERT`` - reverts only the surrounding subflow but *consult* the
parent atom before doing this to determine if the parent retry object
provides a different reconciliation strategy (retry atoms can be nested, this
is possible since flows themselves can be nested).
* ``REVERT_ALL`` - completely reverts a whole flow.
To aid in the reconciliation process the
:py:class:`retry <taskflow.retry.Retry>` base class also mandates ``execute``
and ``revert`` methods (although subclasses are allowed to define these methods
as no-ops) that can be used by a retry atom to interact with the runtime
execution model (for example, to track the number of times it has been
called which is useful for the :py:class:`~taskflow.retry.ForEach` retry
subclass).
To avoid recreating common retry patterns the following provided retry
subclasses are provided:
* :py:class:`~taskflow.retry.AlwaysRevert`: Always reverts subflow.
* :py:class:`~taskflow.retry.AlwaysRevertAll`: Always reverts the whole flow.
* :py:class:`~taskflow.retry.Times`: Retries subflow given number of times.
* :py:class:`~taskflow.retry.ForEach`: Allows for providing different values
to subflow atoms each time a failure occurs (making it possibly to resolve
the failure by altering subflow atoms inputs).
* :py:class:`~taskflow.retry.ParameterizedForEach`: Same as
:py:class:`~taskflow.retry.ForEach` but extracts values from storage
instead of the :py:class:`~taskflow.retry.ForEach` constructor.
Usage
-----
.. testsetup::
import taskflow
from taskflow import task
from taskflow import retry
from taskflow.patterns import linear_flow
from taskflow import engines
.. doctest::
>>> class EchoTask(task.Task):
... def execute(self, *args, **kwargs):
... print(self.name)
... print(args)
... print(kwargs)
...
>>> flow = linear_flow.Flow('f1').add(
... EchoTask('t1'),
... linear_flow.Flow('f2', retry=retry.ForEach(values=['a', 'b', 'c'], name='r1', provides='value')).add(
... EchoTask('t2'),
... EchoTask('t3', requires='value')),
... EchoTask('t4'))
In this example the flow ``f2`` has a retry controller ``r1``, that is an
instance of the default retry controller :py:class:`~taskflow.retry.ForEach`,
it accepts a collection of values and iterates over this collection when
each failure occurs. On each run :py:class:`~taskflow.retry.ForEach` retry
returns the next value from the collection and stops retrying a subflow if
there are no more values left in the collection. For example if tasks ``t2`` or
``t3`` fail, then the flow ``f2`` will be reverted and retry ``r1`` will retry
it with the next value from the given collection ``['a', 'b', 'c']``. But if
the task ``t1`` or the task ``t4`` fails, ``r1`` won't retry a flow, because
tasks ``t1`` and ``t4`` are in the flow ``f1`` and don't depend on
retry ``r1`` (so they will not *consult* ``r1`` on failure).
.. doctest::
>>> class SendMessage(task.Task):
... def execute(self, message):
... print("Sending message: %s" % message)
...
>>> flow = linear_flow.Flow('send_message', retry=retry.Times(5)).add(
... SendMessage('sender'))
In this example the ``send_message`` flow will try to execute the
``SendMessage`` five times when it fails. When it fails for the sixth time (if
it does) the task will be asked to ``REVERT`` (in this example task reverting
does not cause anything to happen but in other use cases it could).
.. doctest::
>>> class ConnectToServer(task.Task):
... def execute(self, ip):
... print("Connecting to %s" % ip)
...
>>> server_ips = ['192.168.1.1', '192.168.1.2', '192.168.1.3' ]
>>> flow = linear_flow.Flow('send_message',
... retry=retry.ParameterizedForEach(rebind={'values': 'server_ips'},
... provides='ip')).add(
... ConnectToServer(requires=['ip']))
In this example the flow tries to connect a server using a list (a tuple
can also be used) of possible IP addresses. Each time the retry will return
one IP from the list. In case of a failure it will return the next one until
it reaches the last one, then the flow will be reverted.
Interfaces
==========
.. automodule:: taskflow.task
.. automodule:: taskflow.retry
Hierarchy
=========
.. inheritance-diagram::
taskflow.atom
taskflow.task
taskflow.retry
:parts: 1