Files
deb-python-taskflow/doc/source/atoms.rst
Joshua Harlow df5b1621a1 Fix case of taskflow in docs
Fix occurrences of the lowercase (or mixed case)
version of taskflow and rename those occurrences
to the cased version to be consistent with usage
elsewhere in the docs.

Change-Id: I31f552daa015724b443b099c2fcfe38d8e04605a
2014-05-24 17:25:57 -07:00

180 lines
6.9 KiB
ReStructuredText

------------------------
Atoms, Tasks and Retries
------------------------
Atom
====
An :py:class:`atom <taskflow.atom.Atom>` is the smallest unit in TaskFlow which
acts as the base for other classes (its naming was inspired from the
similarities between this type and `atoms`_ in the physical world). Atoms
have a name and may have a version. An atom is expected to name desired input
values (requirements) and name outputs (provided values).
.. note::
For more details about atom inputs and outputs please visit
:doc:`arguments and results <arguments_and_results>`.
.. automodule:: taskflow.atom
.. _atoms: http://en.wikipedia.org/wiki/Atom
Task
=====
A :py:class:`task <taskflow.task.BaseTask>` (derived from an atom) is the
smallest possible unit of work that can have an execute & rollback sequence
associated with it. These task objects all derive
from :py:class:`~taskflow.task.BaseTask` which defines what a task must
provide in terms of properties and methods.
Currently the following *provided* types of task subclasses are:
* :py:class:`~taskflow.task.Task`: useful for inheriting from and creating your
own subclasses.
* :py:class:`~taskflow.task.FunctorTask`: useful for wrapping existing
functions into task objects.
.. note::
:py:class:`~taskflow.task.FunctorTask` task types can not currently be used
with the :doc:`worker based engine <workers>` due to the fact that
arbitrary functions can not be guaranteed to be correctly
located (especially if they are lambda or anonymous functions) on the
worker nodes.
Retry
=====
A :py:class:`retry <taskflow.retry.Retry>` (derived from an atom) is a special
unit that handles errors, controls flow execution and can (for example) retry
other atoms with other parameters if needed. When an associated atom
fails, these retry units are *consulted* to determine what the resolution
method should be. The goal is that with this *consultation* the retry atom
will suggest a method for getting around the failure (perhaps by retrying,
reverting a single item, or reverting everything contained in the retries
associated scope).
Currently derivatives of the :py:class:`retry <taskflow.retry.Retry>` base
class must provide a ``on_failure`` method to determine how a failure should
be handled.
The current enumeration set that can be returned from this method is:
* ``RETRY`` - retries the surrounding subflow (a retry object is associated
with a flow, which is typically converted into a graph hierarchy at
compilation time) again.
* ``REVERT`` - reverts only the surrounding subflow but *consult* the
parent atom before doing this to determine if the parent retry object
provides a different reconciliation strategy (retry atoms can be nested, this
is possible since flows themselves can be nested).
* ``REVERT_ALL`` - completely reverts a whole flow.
To aid in the reconciliation process the
:py:class:`retry <taskflow.retry.Retry>` base class also mandates ``execute``
and ``revert`` methods (although subclasses are allowed to define these methods
as no-ops) that can be used by a retry atom to track interact with the runtime
execution model (for example, to track the number of times it has been
called which is useful for the :py:class:`~taskflow.retry.ForEach` retry
subclass).
To avoid recreating common retry patterns the following provided retry
subclasses are provided:
* :py:class:`~taskflow.retry.AlwaysRevert`: Always reverts subflow.
* :py:class:`~taskflow.retry.AlwaysRevertAll`: Always reverts the whole flow.
* :py:class:`~taskflow.retry.Times`: Retries subflow given number of times.
* :py:class:`~taskflow.retry.ForEach`: Allows for providing different values
to subflow atoms each time a failure occurs (making it possibly to resolve
the failure by altering subflow atoms inputs).
* :py:class:`~taskflow.retry.ParameterizedForEach`: Same as
:py:class:`~taskflow.retry.ForEach` but extracts values from storage
instead of the :py:class:`~taskflow.retry.ForEach` constructor.
Usage
-----
.. testsetup::
import taskflow
from taskflow import task
from taskflow import retry
from taskflow.patterns import linear_flow
from taskflow import engines
.. doctest::
>>> class EchoTask(task.Task):
... def execute(self, *args, **kwargs):
... print(self.name)
... print(args)
... print(kwargs)
...
>>> flow = linear_flow.Flow('f1').add(
... EchoTask('t1'),
... linear_flow.Flow('f2', retry=retry.ForEach(values=['a', 'b', 'c'], name='r1', provides='value')).add(
... EchoTask('t2'),
... EchoTask('t3', requires='value')),
... EchoTask('t4'))
In this example the flow ``f2`` has a retry controller ``r1``, that is an
instance of the default retry controller :py:class:`~taskflow.retry.ForEach`,
it accepts a collection of values and iterates over this collection when
each failure occurs. On each run :py:class:`~taskflow.retry.ForEach` retry
returns the next value from the collection and stops retrying a subflow if
there are no more values left in the collection. For example if tasks ``t2`` or
``t3`` fail, then the flow ``f2`` will be reverted and retry ``r1`` will retry
it with the next value from the given collection ``['a', 'b', 'c']``. But if
the task ``t1`` or the task ``t4`` fails, ``r1`` won't retry a flow, because
tasks ``t1`` and ``t4`` are in the flow ``f1`` and don't depend on
retry ``r1`` (so they will not *consult* ``r1`` on failure).
.. doctest::
>>> class SendMessage(task.Task):
... def execute(self, message):
... print("Sending message: %s" % message)
...
>>> flow = linear_flow.Flow('send_message', retry=retry.Times(5)).add(
... SendMessage('sender'))
In this example the ``send_message`` flow will try to execute the
``SendMessage`` five times when it fails. When it fails for the sixth time (if
it does) the task will be asked to ``REVERT`` (in this example task reverting
does not cause anything to happen but in other use cases it could).
.. doctest::
>>> class ConnectToServer(task.Task):
... def execute(self, ip):
... print("Connecting to %s" % ip)
...
>>> server_ips = ['192.168.1.1', '192.168.1.2', '192.168.1.3' ]
>>> flow = linear_flow.Flow('send_message',
... retry=retry.ParameterizedForEach(rebind={'values': 'server_ips'},
... provides='ip')).add(
... ConnectToServer(requires=['ip']))
In this example the flow tries to connect a server using a list (a tuple
can also be used) of possible IP addresses. Each time the retry will return
one IP from the list. In case of a failure it will return the next one until
it reaches the last one, then the flow will be reverted.
Interfaces
==========
.. automodule:: taskflow.task
.. automodule:: taskflow.retry
Hierarchy
=========
.. inheritance-diagram::
taskflow.atom
taskflow.task
taskflow.retry
:parts: 1