Merge tag '0.7.0' into debian/kilo
taskflow 0.7.0 release
@@ -1,13 +1,13 @@
If you would like to contribute to the development of OpenStack,
you must follow the steps documented at:

http://wiki.openstack.org/HowToContribute#If_you.27re_a_developer
http://docs.openstack.org/infra/manual/developers.html#development-workflow

Once those steps have been completed, changes to OpenStack
should be submitted for review via the Gerrit tool, following
the workflow documented at:

http://wiki.openstack.org/GerritWorkflow
http://docs.openstack.org/infra/manual/developers.html#development-workflow

Pull requests submitted through GitHub will be ignored.

README.rst
@@ -5,7 +5,10 @@ A library to do [jobs, tasks, flows] in a highly available, easy to understand
and declarative manner (and more!) to be used with OpenStack and other
projects.

- More information can be found by referring to the `developer documentation`_.
* Free software: Apache license
* Documentation: http://docs.openstack.org/developer/taskflow
* Source: http://git.openstack.org/cgit/openstack/taskflow
* Bugs: http://bugs.launchpad.net/taskflow/

Join us
-------
@@ -18,32 +21,27 @@ Testing and requirements
Requirements
~~~~~~~~~~~~

Because TaskFlow has many optional (pluggable) parts like persistence
backends and engines, we decided to split our requirements into two
parts: - things that are absolutely required by TaskFlow (you can't use
TaskFlow without them) are put into ``requirements-pyN.txt`` (``N`` being the
Python *major* version number used to install the package); - things that are
required by some optional part of TaskFlow (you can use TaskFlow without
them) are put into ``optional-requirements.txt``; if you want to use the
feature in question, you should add that requirements to your project or
environment; - as usual, things that required only for running tests are
put into ``test-requirements.txt``.
Because this project has many optional (pluggable) parts like persistence
backends and engines, we decided to split our requirements into three
parts: - things that are absolutely required (you can't use the project
without them) are put into ``requirements-pyN.txt`` (``N`` being the
Python *major* version number used to install the package). The requirements
that are required by some optional part of this project (you can use the
project without them) are put into our ``tox.ini`` file (so that we can still
test that the optional functionality works as expected). If you want to use the
feature in question (`eventlet`_ or the worker based engine that
uses `kombu`_ or the `sqlalchemy`_ persistence backend or jobboards which
have an implementation built using `kazoo`_ ...), you should add
that requirement(s) to your project or environment; - as usual, things that
are required only for running tests are put into ``test-requirements.txt``.

Tox.ini
~~~~~~~

Our ``tox.ini`` file describes several test environments that allow testing
TaskFlow with different python versions and sets of requirements installed.

To generate the ``tox.ini`` file, use the ``toxgen.py`` script by first
installing `toxgen`_ and then providing that script the ``tox-tmpl.ini``
file as input to generate the final ``tox.ini`` file.

*For example:*

::

    $ toxgen.py -i tox-tmpl.ini -o tox.ini

Please refer to the `tox`_ documentation to understand how to make these test
environments work for you.

Developer documentation
-----------------------
@@ -56,5 +54,9 @@ We also have sphinx documentation in ``docs/source``.

    $ python setup.py build_sphinx

.. _toxgen: https://pypi.python.org/pypi/toxgen/
.. _kazoo: http://kazoo.readthedocs.org/
.. _sqlalchemy: http://www.sqlalchemy.org/
.. _kombu: http://kombu.readthedocs.org/
.. _eventlet: http://eventlet.net/
.. _tox: http://tox.testrun.org/
.. _developer documentation: http://docs.openstack.org/developer/taskflow/

BIN  doc/diagrams/core.graffle.tgz  (new file)
BIN  doc/diagrams/jobboard.graffle.tgz  (new file)
@@ -1,32 +1,35 @@
==========================
Atom Arguments and Results
==========================
=====================
Arguments and results
=====================

.. |task.execute| replace:: :py:meth:`~taskflow.task.BaseTask.execute`
.. |task.revert| replace:: :py:meth:`~taskflow.task.BaseTask.revert`
.. |retry.execute| replace:: :py:meth:`~taskflow.retry.Retry.execute`
.. |retry.revert| replace:: :py:meth:`~taskflow.retry.Retry.revert`
.. |Retry| replace:: :py:class:`~taskflow.retry.Retry`
.. |Task| replace:: :py:class:`Task <taskflow.task.BaseTask>`

In TaskFlow, all flow and task state goes to (potentially persistent) storage.
That includes all the information that :doc:`atoms <atoms>` (e.g. tasks) in the
flow need when they are executed, and all the information task produces (via
serializable task results). A developer who implements tasks or flows can
specify what arguments a task accepts and what result it returns in several
ways. This document will help you understand what those ways are and how to use
those ways to accomplish your desired usage pattern.
In TaskFlow, all flow and task state goes to (potentially persistent) storage
(see :doc:`persistence <persistence>` for more details). That includes all the
information that :doc:`atoms <atoms>` (e.g. tasks, retry objects...) in the
workflow need when they are executed, and all the information a task/retry
produces (via serializable results). A developer who implements tasks/retries
or flows can specify what arguments a task/retry accepts and what result it
returns in several ways. This document will help you understand what those ways
are and how to use those ways to accomplish your desired usage pattern.

.. glossary::

   Task arguments
      Set of names of task arguments available as the ``requires``
      property of the task instance. When a task is about to be executed
      values with these names are retrieved from storage and passed to
      |task.execute| method of the task.
   Task/retry arguments
      Set of names of task/retry arguments available as the ``requires``
      property of the task/retry instance. When a task or retry object is
      about to be executed values with these names are retrieved from storage
      and passed to the ``execute`` method of the task/retry.

   Task results
      Set of names of task results (what task provides) available as
      ``provides`` property of task instance. After a task finishes
      successfully, its result(s) (what the task |task.execute| method
   Task/retry results
      Set of names of task/retry results (what task/retry provides) available
      as ``provides`` property of task or retry instance. After a task/retry
      finishes successfully, its result(s) (what the ``execute`` method
      returns) are available by these names from storage (see examples
      below).

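As a quick, hedged illustration of the two glossary entries above (the
``BootServerTask`` class and its argument names are invented for this sketch
and are not part of TaskFlow), the ``requires`` set comes straight from the
``execute`` signature while ``provides`` names the result that will be saved
into storage::

    from taskflow import task


    class BootServerTask(task.Task):
        """Hypothetical task, used only to show requires/provides."""

        def execute(self, image_id, flavor_id):
            # 'image_id' and 'flavor_id' become this task's requires set.
            return 'server-123'


    boot = BootServerTask(provides='server_id')
    print(sorted(boot.requires))   # ['flavor_id', 'image_id']
    print(sorted(boot.provides))   # ['server_id']
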
@@ -44,8 +47,8 @@ There are different ways to specify the task argument ``requires`` set.
Arguments inference
-------------------

Task arguments can be inferred from arguments of the |task.execute| method of
the task.
Task/retry arguments can be inferred from arguments of the |task.execute|
method of a task (or the |retry.execute| of a retry object).

.. doctest::

@@ -56,10 +59,10 @@ the task.
    >>> sorted(MyTask().requires)
    ['eggs', 'spam']

Inference from the method signature is the ''simplest'' way to specify task
Inference from the method signature is the ''simplest'' way to specify
arguments. Optional arguments (with default values), and special arguments like
``self``, ``*args`` and ``**kwargs`` are ignored on inference (as these names
have special meaning/usage in python).
``self``, ``*args`` and ``**kwargs`` are ignored during inference (as these
names have special meaning/usage in python).

.. doctest::

@@ -83,14 +86,14 @@ have special meaning/usage in python).
Rebinding
---------

**Why:** There are cases when the value you want to pass to a task is stored
with a name other then the corresponding task arguments name. That's when the
``rebind`` task constructor parameter comes in handy. Using it the flow author
**Why:** There are cases when the value you want to pass to a task/retry is
stored with a name other than the corresponding argument's name. That's when the
``rebind`` constructor parameter comes in handy. Using it the flow author
can instruct the engine to fetch a value from storage by one name, but pass it
to a tasks |task.execute| method with another name. There are two possible ways
of accomplishing this.
to a task's/retry's ``execute`` method with another name. There are two possible
ways of accomplishing this.

The first is to pass a dictionary that maps the task argument name to the name
The first is to pass a dictionary that maps the argument name to the name
of a saved value.

For example, if you have task::
@@ -100,24 +103,25 @@ For example, if you have task::
        def execute(self, vm_name, vm_image_id, **kwargs):
            pass  # TODO(imelnikov): use parameters to spawn vm

and you saved 'vm_name' with 'name' key in storage, you can spawn a vm with
such 'name' like this::
and you saved ``'vm_name'`` with ``'name'`` key in storage, you can spawn a vm
with such ``'name'`` like this::

    SpawnVMTask(rebind={'vm_name': 'name'})

The second way is to pass a tuple/list/dict of argument names. The length of
the tuple/list/dict should not be less then number of task required parameters.
the tuple/list/dict should not be less than the number of required parameters.

For example, you can achieve the same effect as the previous example with::

    SpawnVMTask(rebind_args=('name', 'vm_image_id'))

which is equivalent to a more elaborate::
This is equivalent to a more elaborate::

    SpawnVMTask(rebind=dict(vm_name='name',
                            vm_image_id='vm_image_id'))

In both cases, if your task accepts arbitrary arguments with ``**kwargs``
construct, you can specify extra arguments.
In both cases, if your task (or retry) accepts arbitrary arguments
with the ``**kwargs`` construct, you can specify extra arguments.

::

@@ -158,7 +162,8 @@ arguments) will appear in the ``kwargs`` of the |task.execute| method.

When constructing a task instance the flow author can also add more
requirements if desired. Those manual requirements (if they are not functional
arguments) will appear in the ``**kwargs`` the |task.execute| method.
arguments) will appear in the ``kwargs`` parameter of the |task.execute|
method.

.. doctest::

@@ -189,15 +194,19 @@ avoid invalid argument mappings.
Results specification
=====================

In python, function results are not named, so we can not infer what a task
returns. This is important since the complete task result (what the
|task.execute| method returns) is saved in (potentially persistent) storage,
and it is typically (but not always) desirable to make those results accessible
to other tasks. To accomplish this the task specifies names of those values via
its ``provides`` task constructor parameter or other method (see below).
In python, function results are not named, so we can not infer what a
task/retry returns. This is important since the complete result (what the
task |task.execute| or retry |retry.execute| method returns) is saved
in (potentially persistent) storage, and it is typically (but not always)
desirable to make those results accessible to others. To accomplish this
the task/retry specifies names of those values via its ``provides`` constructor
parameter or by its default provides attribute.

Examples
--------

Returning one value
-------------------
+++++++++++++++++++

If task returns just one value, ``provides`` should be string -- the
name of the value.
@@ -212,7 +221,7 @@ name of the value.
    set(['the_answer'])

Returning a tuple
-----------------
+++++++++++++++++

For a task that returns several values, one option (as usual in python) is to
return those values via a ``tuple``.
@@ -242,17 +251,17 @@ tasks) will be able to get those elements from storage by name:

Provides argument can be shorter then the actual tuple returned by a task --
then extra values are ignored (but, as expected, **all** those values are saved
and passed to the |task.revert| method).
and passed to the task |task.revert| or retry |retry.revert| method).

.. note::

   Provides arguments tuple can also be longer then the actual tuple returned
   by task -- when this happens the extra parameters are left undefined: a
   warning is printed to logs and if use of such parameter is attempted a
   ``NotFound`` exception is raised.
   :py:class:`~taskflow.exceptions.NotFound` exception is raised.

Returning a dictionary
----------------------
++++++++++++++++++++++

Another option is to return several values as a dictionary (aka a ``dict``).

@@ -290,16 +299,17 @@ will be able to get elements from storage by name:
and passed to the |task.revert| method). If the provides argument has some
items not present in the actual dict returned by the task -- then extra
parameters are left undefined: a warning is printed to logs and if use of
such parameter is attempted a ``NotFound`` exception is raised.
such parameter is attempted a :py:class:`~taskflow.exceptions.NotFound`
exception is raised.

Default provides
----------------
++++++++++++++++

As mentioned above, the default task base class provides nothing, which means
task results are not accessible to other tasks in the flow.
As mentioned above, the default base class provides nothing, which means
results are not accessible to other tasks/retries in the flow.

The task author can override this and specify default value for provides using
``default_provides`` class variable:
The author can override this and specify a default value for provides using
the ``default_provides`` class/instance variable:

::

@@ -314,8 +324,8 @@ Of course, the flow author can override this to change names if needed:

    BitsAndPiecesTask(provides=('b', 'p'))

or to change structure -- e.g. this instance will make whole tuple accessible
to other tasks by name 'bnp':
or to change structure -- e.g. this instance will make the whole tuple accessible
to other tasks by name ``'bnp'``:

::

@@ -331,28 +341,29 @@ the task from other tasks in the flow (e.g. to avoid naming conflicts):
Revert arguments
================

To revert a task engine calls its |task.revert| method. This method
should accept same arguments as |task.execute| method of the task and one
more special keyword argument, named ``result``.
To revert a task the :doc:`engine <engines>` calls the task's
|task.revert| method. This method should accept the same arguments
as the |task.execute| method of the task and one more special keyword
argument, named ``result``.

For ``result`` value, two cases are possible:

* if task is being reverted because it failed (an exception was raised from its
  |task.execute| method), ``result`` value is instance of
  :py:class:`taskflow.utils.misc.Failure` object that holds exception
  information;
* If the task is being reverted because it failed (an exception was raised
  from its |task.execute| method), the ``result`` value is an instance of a
  :py:class:`~taskflow.types.failure.Failure` object that holds the exception
  information.

* if task is being reverted because some other task failed, and this task
  finished successfully, ``result`` value is task result fetched from storage:
  basically, that's what |task.execute| method returned.
* If the task is being reverted because some other task failed, and this task
  finished successfully, the ``result`` value is the result fetched from storage:
  i.e., what the |task.execute| method returned.

All other arguments are fetched from storage in the same way it is done for
the |task.execute| method.

To determine if task failed you can check whether ``result`` is instance of
:py:class:`taskflow.utils.misc.Failure`::
To determine if a task failed you can check whether ``result`` is an instance of
:py:class:`~taskflow.types.failure.Failure`::

    from taskflow.utils import misc
    from taskflow.types import failure

    class RevertingTask(task.Task):

@@ -360,55 +371,61 @@ To determine if task failed you can check whether ``result`` is instance of
            return do_something(spam, eggs)

        def revert(self, result, spam, eggs):
            if isinstance(result, misc.Failure):
            if isinstance(result, failure.Failure):
                print("This task failed, exception: %s"
                      % result.exception_str)
            else:
                print("do_something returned %r" % result)

If this task failed (``do_something`` raised exception) it will print ``"This
task failed, exception:"`` and exception message on revert. If this task
finished successfully, it will print ``"do_something returned"`` and
representation of result.
If this task failed (i.e. ``do_something`` raised an exception) it will print
``"This task failed, exception:"`` and an exception message on revert. If this
task finished successfully, it will print ``"do_something returned"`` and a
representation of the ``do_something`` result.

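To see the revert path end to end, here is a self-contained, hedged sketch
(the flow name, the failing ``BoomTask`` and the ``do_something`` stand-in are
invented for illustration); running it causes ``BoomTask`` to fail, which makes
the engine revert ``RevertingTask`` and call its ``revert`` method::

    import taskflow.engines
    from taskflow.patterns import linear_flow
    from taskflow import task
    from taskflow.types import failure


    def do_something(spam, eggs):
        return (spam, eggs)


    class RevertingTask(task.Task):
        def execute(self, spam, eggs):
            return do_something(spam, eggs)

        def revert(self, result, spam, eggs, **kwargs):
            if isinstance(result, failure.Failure):
                print("This task failed, exception: %s" % result.exception_str)
            else:
                print("do_something returned %r" % result)


    class BoomTask(task.Task):
        def execute(self):
            raise RuntimeError("boom")


    flow = linear_flow.Flow('revert-demo').add(RevertingTask(), BoomTask())
    try:
        taskflow.engines.run(flow, store={'spam': 1, 'eggs': 2})
    except RuntimeError:
        # BoomTask failed, so RevertingTask.revert() printed its saved result.
        pass
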
Retry arguments
===============

A Retry controller works with arguments in the same way as a Task. But it has
an additional parameter 'history' that is a list of tuples. Each tuple contains
a result of the previous Retry run and a table where a key is a failed task and
a value is a :py:class:`taskflow.utils.misc.Failure`.
A |Retry| controller works with arguments in the same way as a |Task|. But it
has an additional parameter ``'history'`` that is itself a
:py:class:`~taskflow.retry.History` object that contains what failed over all
of the engine's attempts (aka the outcomes). The history object can be
viewed as a tuple that contains a result of the previous retry's run and a
table/dict where each key is a failed atom's name and each value is
a :py:class:`~taskflow.types.failure.Failure` object.

Consider the following Retry::
Consider the following implementation::

    class MyRetry(retry.Retry):

        default_provides = 'value'

        def on_failure(self, history, *args, **kwargs):
            print history
            print(list(history))
            return RETRY

        def execute(self, history, *args, **kwargs):
            print history
            print(list(history))
            return 5

        def revert(self, history, *args, **kwargs):
            print history
            print(list(history))

Imagine the following Retry had returned a value '5' and then some task 'A'
Imagine the above retry had returned a value ``'5'`` and then some task ``'A'``
failed with some exception. In this case ``on_failure`` method will receive
the following history::
the following history (printed as a list)::

    [('5', {'A': misc.Failure()})]
    [('5', {'A': failure.Failure()})]

Then the |retry.execute| method will be called again and it'll receive the same
history.
At this point (since the implementation returned ``RETRY``) the
|retry.execute| method will be called again and it will receive the same
history and it can then return a value that subsequent tasks can use to alter
their behavior.

If the |retry.execute| method raises an exception, the |retry.revert| method of
Retry will be called and :py:class:`taskflow.utils.misc.Failure` object will be
present in the history instead of Retry result::
If instead the |retry.execute| method itself raises an exception,
the |retry.revert| method of the implementation will be called and
a :py:class:`~taskflow.types.failure.Failure` object will be present in the
history object instead of the typical result.

    [('5', {'A': misc.Failure()}), (misc.Failure(), {})]
.. note::

   After the Retry has been reverted, the Retry history will be cleaned.
   After a |Retry| has been reverted, the object's history will be cleaned.

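For completeness, here is a hedged sketch (the ``FlakyTask`` class, flow name
and module-level counter are invented for illustration) of attaching a retry
controller to a flow; rather than the custom retry above it uses the stock
:py:class:`~taskflow.retry.Times` controller, which re-runs the flow's tasks
after the first failure::

    import itertools

    import taskflow.engines
    from taskflow.patterns import linear_flow
    from taskflow import retry
    from taskflow import task

    _calls = itertools.count(1)


    class FlakyTask(task.Task):
        """Invented task that fails on its first attempt only."""

        def execute(self):
            if next(_calls) < 2:
                raise IOError("not ready yet")
            return "done"


    # retry.Times re-runs the flow's tasks up to 'attempts' times on failure.
    flow = linear_flow.Flow('retry-demo', retry=retry.Times(attempts=2)).add(
        FlakyTask(provides='outcome'))
    # Expected (if the second attempt succeeds) to include {'outcome': 'done'}.
    print(taskflow.engines.run(flow))
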
@@ -1,5 +1,5 @@
------------------------
Atoms, Tasks and Retries
Atoms, tasks and retries
------------------------

Atom
@@ -94,8 +94,8 @@ subclasses are provided:
:py:class:`~taskflow.retry.ForEach` but extracts values from storage
instead of the :py:class:`~taskflow.retry.ForEach` constructor.

Usage
-----
Examples
--------

.. testsetup::

@@ -63,6 +63,10 @@ Interfaces
==========

.. automodule:: taskflow.conductors.base

Implementations
===============

.. automodule:: taskflow.conductors.single_threaded

Hierarchy

@@ -1,5 +1,6 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
|
||||
import datetime
|
||||
import os
|
||||
import sys
|
||||
|
||||
@@ -13,7 +14,6 @@ extensions = [
|
||||
'sphinx.ext.doctest',
|
||||
'sphinx.ext.extlinks',
|
||||
'sphinx.ext.inheritance_diagram',
|
||||
'sphinx.ext.intersphinx',
|
||||
'sphinx.ext.viewcode',
|
||||
'oslosphinx'
|
||||
]
|
||||
@@ -37,7 +37,7 @@ exclude_patterns = ['_build']
|
||||
|
||||
# General information about the project.
|
||||
project = u'TaskFlow'
|
||||
copyright = u'2013-2014, OpenStack Foundation'
|
||||
copyright = u'%s, OpenStack Foundation' % datetime.date.today().year
|
||||
source_tree = 'http://git.openstack.org/cgit/openstack/taskflow/tree'
|
||||
|
||||
# If true, '()' will be appended to :func: etc. cross-reference text.
|
||||
@@ -56,6 +56,7 @@ modindex_common_prefix = ['taskflow.']
|
||||
# Shortened external links.
|
||||
extlinks = {
|
||||
'example': (source_tree + '/taskflow/examples/%s.py', ''),
|
||||
'pybug': ('http://bugs.python.org/issue%s', ''),
|
||||
}
|
||||
|
||||
# -- Options for HTML output --------------------------------------------------
|
||||
@@ -82,9 +83,6 @@ latex_documents = [
|
||||
'OpenStack Foundation', 'manual'),
|
||||
]
|
||||
|
||||
# Example configuration for intersphinx: refer to the Python standard library.
|
||||
intersphinx_mapping = {'http://docs.python.org/': None}
|
||||
|
||||
# -- Options for autoddoc ----------------------------------------------------
|
||||
|
||||
# Keep source order
|
||||
|
||||
@@ -13,23 +13,23 @@ and uses it to decide which :doc:`atom <atoms>` to run and when.
TaskFlow provides different implementations of engines. Some may be easier to
use (i.e., require no additional infrastructure setup) and understand; others
might require more complicated setup but provide better scalability. The idea
and *ideal* is that deployers or developers of a service that uses TaskFlow can
and *ideal* is that deployers or developers of a service that use TaskFlow can
select an engine that suits their setup best without modifying the code of
said service.

Engines usually have different capabilities and configuration, but all of them
|
||||
**must** implement the same interface and preserve the semantics of patterns
|
||||
(e.g. parts of :py:class:`linear flow <taskflow.patterns.linear_flow.Flow>`
|
||||
are run one after another, in order, even if engine is *capable* of running
|
||||
tasks in parallel).
|
||||
(e.g. parts of a :py:class:`.linear_flow.Flow`
|
||||
are run one after another, in order, even if the selected engine is *capable*
|
||||
of running tasks in parallel).
|
||||
|
||||
Why they exist
|
||||
--------------
|
||||
|
||||
An engine being the core component which actually makes your flows progress is
likely a new concept for many programmers so let's describe how it operates in
more depth and some of the reasoning behind why it exists. This will hopefully
make it more clear on there value add to the TaskFlow library user.
An engine being *the* core component which actually makes your flows progress
is likely a new concept for many programmers so let's describe how it operates
in more depth and some of the reasoning behind why it exists. This will
hopefully make the value engines add for the TaskFlow library user more clear.

First though let us discuss something most are familiar already with; the
|
||||
difference between `declarative`_ and `imperative`_ programming models. The
|
||||
@@ -48,15 +48,15 @@ more of a *pure* function that executes, reverts and may require inputs and
|
||||
provide outputs). This is where engines get involved; they do the execution of
|
||||
the *what* defined via :doc:`atoms <atoms>`, tasks, flows and the relationships
|
||||
defined there-in and execute these in a well-defined manner (and the engine is
|
||||
responsible for *most* of the state manipulation instead).
|
||||
responsible for any state manipulation instead).
|
||||
|
||||
This mix of imperative and declarative (with a stronger emphasis on the
|
||||
declarative model) allows for the following functionality to be possible:
|
||||
declarative model) allows for the following functionality to become possible:
|
||||
|
||||
* Enhancing reliability: Decoupling of state alterations from what should be
|
||||
accomplished allows for a *natural* way of resuming by allowing the engine to
|
||||
track the current state and know at which point a flow is in and how to get
|
||||
back into that state when resumption occurs.
|
||||
track the current state and know at which point a workflow is in and how to
|
||||
get back into that state when resumption occurs.
|
||||
* Enhancing scalability: When a engine is responsible for executing your
|
||||
desired work it becomes possible to alter the *how* in the future by creating
|
||||
new types of execution backends (for example the worker model which does not
|
||||
@@ -83,13 +83,14 @@ Of course these kind of features can come with some drawbacks:
|
||||
away from (and this is likely a mindset change for programmers used to the
imperative model). We have worked to make this less of a concern by creating
and encouraging the usage of :doc:`persistence <persistence>`, to help make
it possible to have some level of provided state transfer mechanism.
it possible to have state and transfer that state via an argument input and
output mechanism.
* Depending on how much imperative code exists (and state inside that code)
|
||||
there can be *significant* rework of that code and converting or refactoring
|
||||
it to these new concepts. We have tried to help here by allowing you to have
|
||||
tasks that internally use regular python code (and internally can be written
|
||||
in an imperative style) as well as by providing examples and these developer
|
||||
docs; helping this process be as seamless as possible.
|
||||
there *may* be *significant* rework of that code and converting or
|
||||
refactoring it to these new concepts. We have tried to help here by allowing
|
||||
you to have tasks that internally use regular python code (and internally can
|
||||
be written in an imperative style) as well as by providing
|
||||
:doc:`examples <examples>` that show how to use these concepts.
|
||||
* Another one of the downsides of decoupling the *what* from the *how* is that
|
||||
it may become harder to use traditional techniques to debug failures
|
||||
(especially if remote workers are involved). We try to help here by making it
|
||||
@@ -110,16 +111,16 @@ All engines are mere classes that implement the same interface, and of course
|
||||
it is possible to import them and create instances just like with any classes
|
||||
in Python. But the easier (and recommended) way for creating an engine is using
|
||||
the engine helper functions. All of these functions are imported into the
|
||||
`taskflow.engines` module namespace, so the typical usage of these functions
|
||||
``taskflow.engines`` module namespace, so the typical usage of these functions
|
||||
might look like::
|
||||
|
||||
from taskflow import engines
|
||||
|
||||
...
|
||||
flow = make_flow()
|
||||
engine = engines.load(flow, engine_conf=my_conf,
|
||||
backend=my_persistence_conf)
|
||||
engine.run
|
||||
eng = engines.load(flow, engine='serial', backend=my_persistence_conf)
|
||||
eng.run()
|
||||
...
|
||||
|
||||
|
||||
.. automodule:: taskflow.engines.helpers
|
||||
@@ -128,59 +129,74 @@ Usage
=====

To select which engine to use and pass parameters to an engine you should use
the ``engine_conf`` parameter any helper factory function accepts. It may be:
the ``engine`` parameter that any engine helper function accepts, and for any
engine-specific options use the ``kwargs`` parameter (see the sketch below).

* a string, naming engine type;
* a dictionary, holding engine type with key ``'engine'`` and possibly
  type-specific engine configuration parameters.
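As a minimal, hedged sketch of the above (the ``Hello`` task and the flow name
are invented for illustration), selecting the engine by name looks like this;
any extra keyword arguments would be handed to the selected engine type as its
options::

    import taskflow.engines
    from taskflow.patterns import linear_flow
    from taskflow import task


    class Hello(task.Task):
        def execute(self):
            print("hello from %s" % self.name)


    flow = linear_flow.Flow('demo').add(Hello('a'), Hello('b'))

    # The engine type is chosen with the 'engine' parameter.
    engine = taskflow.engines.load(flow, engine='serial')
    engine.run()
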
Types
=====

Single-Threaded
---------------
Serial
------

**Engine type**: ``'serial'``

Runs all tasks on the single thread -- the same thread `engine.run()` is called
on. This engine is used by default.
Runs all tasks on a single thread -- the same thread ``engine.run()`` is
called from.

.. note::

   This engine is used by default.

.. tip::

   If eventlet is used then this engine will not block other threads
   from running as eventlet automatically creates a co-routine system (using
   greenthreads and monkey patching). See `eventlet <http://eventlet.net/>`_
   and `greenlet <http://greenlet.readthedocs.org/>`_ for more details.
   from running as eventlet automatically creates an implicit co-routine
   system (using greenthreads and monkey patching). See
   `eventlet <http://eventlet.net/>`_ and
   `greenlet <http://greenlet.readthedocs.org/>`_ for more details.

Parallel
--------

**Engine type**: ``'parallel'``

Parallel engine schedules tasks onto different threads to run them in parallel.

Additional supported keyword arguments:

* ``executor``: a object that implements a :pep:`3148` compatible `executor`_
  interface; it will be used for scheduling tasks. You can use instances of a
  `thread pool executor`_ or a :py:class:`green executor
  <taskflow.utils.eventlet_utils.GreenExecutor>` (which internally uses
  `eventlet <http://eventlet.net/>`_ and greenthread pools).
A parallel engine schedules tasks onto different threads/processes to allow for
running non-dependent tasks simultaneously. See the documentation of
:py:class:`~taskflow.engines.action_engine.engine.ParallelActionEngine` for
supported arguments that can be used to construct a parallel engine that runs
using your desired execution model.

.. tip::

   Sharing executor between engine instances provides better
   scalability by reducing thread creation and teardown as well as by reusing
   existing pools (which is a good practice in general).
   Sharing an executor between engine instances provides better
   scalability by reducing thread/process creation and teardown as well as by
   reusing existing pools (which is a good practice in general).

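To make the tip above concrete, here is a hedged sketch (the ``Noop`` task and
flow names are invented) that reuses a single thread pool executor across two
parallel engines::

    from concurrent import futures

    import taskflow.engines
    from taskflow.patterns import unordered_flow
    from taskflow import task


    class Noop(task.Task):
        def execute(self):
            return self.name


    def make_flow(name):
        return unordered_flow.Flow(name).add(Noop(name + '-1'),
                                             Noop(name + '-2'))


    # One shared pool; both engines schedule their tasks onto it.
    shared = futures.ThreadPoolExecutor(max_workers=4)
    for name in ('first', 'second'):
        engine = taskflow.engines.load(make_flow(name), engine='parallel',
                                       executor=shared)
        engine.run()
    shared.shutdown()
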
.. note::

   Running tasks with a `process pool executor`_ is not currently supported.
   Running tasks with a `process pool executor`_ is **experimentally**
   supported. This is mainly due to the `futures backport`_ and
   the `multiprocessing`_ module that exist in older versions of python not
   being as up to date (with important fixes such as :pybug:`4892`,
   :pybug:`6721`, :pybug:`9205`, :pybug:`11635`, :pybug:`16284`,
   :pybug:`22393` and others...) as the most recent python versions (which
   themselves have a variety of ongoing/recent bugs).

Worker-Based
------------
Workers
-------

**Engine type**: ``'worker-based'``
**Engine type**: ``'worker-based'`` or ``'workers'``

For more information, please see :doc:`workers <workers>` for more details on
how the worker based engine operates (and the design decisions behind it).
.. note:: Since this engine is significantly more complicated (and
   different) than the others we thought it appropriate to devote a
   whole documentation section to it.

For further information, please refer to the following:

.. toctree::
   :maxdepth: 2

   workers

How they run
|
||||
============
|
||||
@@ -241,6 +257,14 @@ object starts to take over and begins going through the stages listed
|
||||
below (for a more visual diagram/representation see
|
||||
the :ref:`engine state diagram <engine states>`).
|
||||
|
||||
.. note::

   The engine will respect the constraints imposed by the flow. For example,
   if an engine is executing a :py:class:`.linear_flow.Flow` then it is
   constrained by the dependency-graph which is linear in this case, and hence
   using a parallel engine may not yield any benefits if one is looking for
   concurrency.

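To illustrate the note above, the following hedged sketch (the ``Sleep`` task
and flow names are invented; timings are approximate) contrasts a linear flow,
whose tasks still run one-by-one on a parallel engine, with an unordered flow
whose independent tasks may actually overlap::

    import time

    from concurrent import futures

    import taskflow.engines
    from taskflow.patterns import linear_flow, unordered_flow
    from taskflow import task


    class Sleep(task.Task):
        def execute(self):
            time.sleep(0.5)


    def timed_run(flow):
        executor = futures.ThreadPoolExecutor(max_workers=2)
        start = time.time()
        taskflow.engines.run(flow, engine='parallel', executor=executor)
        executor.shutdown()
        return time.time() - start

    # Ordered: ~1.0s even on a parallel engine (the dependency graph is a chain).
    print(timed_run(linear_flow.Flow('ordered').add(Sleep('s1'), Sleep('s2'))))
    # Unordered: ~0.5s since the two independent tasks may run concurrently.
    print(timed_run(unordered_flow.Flow('unordered').add(Sleep('s1'),
                                                         Sleep('s2'))))
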
Resumption
|
||||
^^^^^^^^^^
|
||||
|
||||
@@ -265,13 +289,13 @@ Scheduling
|
||||
^^^^^^^^^^
|
||||
|
||||
This stage selects which atoms are eligible to run by using a
|
||||
:py:class:`~taskflow.engines.action_engine.runtime.Scheduler` implementation
|
||||
:py:class:`~taskflow.engines.action_engine.scheduler.Scheduler` implementation
|
||||
(the default implementation looks at there intention, checking if predecessor
|
||||
atoms have ran and so-on, using a
|
||||
:py:class:`~taskflow.engines.action_engine.analyzer.Analyzer` helper
|
||||
object as needed) and submits those atoms to a previously provided compatible
|
||||
`executor`_ for asynchronous execution. This
|
||||
:py:class:`~taskflow.engines.action_engine.runtime.Scheduler` will return a
|
||||
:py:class:`~taskflow.engines.action_engine.scheduler.Scheduler` will return a
|
||||
`future`_ object for each atom scheduled; all of which are collected into a
|
||||
list of not done futures. This will end the initial round of scheduling and at
|
||||
this point the engine enters the :ref:`waiting <waiting>` stage.
|
||||
@@ -284,7 +308,7 @@ Waiting
|
||||
In this stage the engine waits for any of the future objects previously
|
||||
submitted to complete. Once one of the future objects completes (or fails) that
|
||||
atoms result will be examined and finalized using a
|
||||
:py:class:`~taskflow.engines.action_engine.runtime.Completer` implementation.
|
||||
:py:class:`~taskflow.engines.action_engine.completer.Completer` implementation.
|
||||
It typically will persist results to a provided persistence backend (saved
|
||||
into the corresponding :py:class:`~taskflow.persistence.logbook.AtomDetail`
|
||||
and :py:class:`~taskflow.persistence.logbook.FlowDetail` objects) and reflect
|
||||
@@ -322,24 +346,33 @@ saved for this execution.
|
||||
Interfaces
|
||||
==========
|
||||
|
||||
.. automodule:: taskflow.engines.base
|
||||
|
||||
Implementations
|
||||
===============
|
||||
|
||||
.. automodule:: taskflow.engines.action_engine.analyzer
|
||||
.. automodule:: taskflow.engines.action_engine.compiler
|
||||
.. automodule:: taskflow.engines.action_engine.completer
|
||||
.. automodule:: taskflow.engines.action_engine.engine
|
||||
.. automodule:: taskflow.engines.action_engine.executor
|
||||
.. automodule:: taskflow.engines.action_engine.runner
|
||||
.. automodule:: taskflow.engines.action_engine.runtime
|
||||
.. automodule:: taskflow.engines.base
|
||||
.. automodule:: taskflow.engines.action_engine.scheduler
|
||||
.. automodule:: taskflow.engines.action_engine.scopes
|
||||
|
||||
Hierarchy
|
||||
=========
|
||||
|
||||
.. inheritance-diagram::
|
||||
taskflow.engines.base
|
||||
taskflow.engines.action_engine.engine
|
||||
taskflow.engines.worker_based.engine
|
||||
taskflow.engines.action_engine.engine.ActionEngine
|
||||
taskflow.engines.base.Engine
|
||||
taskflow.engines.worker_based.engine.WorkerBasedActionEngine
|
||||
:parts: 1
|
||||
|
||||
.. _multiprocessing: https://docs.python.org/2/library/multiprocessing.html
|
||||
.. _future: https://docs.python.org/dev/library/concurrent.futures.html#future-objects
|
||||
.. _executor: https://docs.python.org/dev/library/concurrent.futures.html#concurrent.futures.Executor
|
||||
.. _networkx: https://networkx.github.io/
|
||||
.. _thread pool executor: https://docs.python.org/dev/library/concurrent.futures.html#threadpoolexecutor
|
||||
.. _futures backport: https://pypi.python.org/pypi/futures
|
||||
.. _process pool executor: https://docs.python.org/dev/library/concurrent.futures.html#processpoolexecutor
|
||||
|
||||
@@ -1,3 +1,39 @@
|
||||
Hello world
|
||||
===========
|
||||
|
||||
.. note::
|
||||
|
||||
Full source located at :example:`hello_world`.
|
||||
|
||||
.. literalinclude:: ../../taskflow/examples/hello_world.py
|
||||
:language: python
|
||||
:linenos:
|
||||
:lines: 16-
|
||||
|
||||
Passing values from and to tasks
|
||||
================================
|
||||
|
||||
.. note::
|
||||
|
||||
Full source located at :example:`simple_linear_pass`.
|
||||
|
||||
.. literalinclude:: ../../taskflow/examples/simple_linear_pass.py
|
||||
:language: python
|
||||
:linenos:
|
||||
:lines: 16-
|
||||
|
||||
Using listeners
|
||||
===============
|
||||
|
||||
.. note::
|
||||
|
||||
Full source located at :example:`echo_listener`.
|
||||
|
||||
.. literalinclude:: ../../taskflow/examples/echo_listener.py
|
||||
:language: python
|
||||
:linenos:
|
||||
:lines: 16-
|
||||
|
||||
Making phone calls
|
||||
==================
|
||||
|
||||
@@ -34,6 +70,42 @@ Building a car
|
||||
:linenos:
|
||||
:lines: 16-
|
||||
|
||||
Iterating over the alphabet (using processes)
|
||||
=============================================
|
||||
|
||||
.. note::
|
||||
|
||||
Full source located at :example:`alphabet_soup`.
|
||||
|
||||
.. literalinclude:: ../../taskflow/examples/alphabet_soup.py
|
||||
:language: python
|
||||
:linenos:
|
||||
:lines: 16-
|
||||
|
||||
Watching execution timing
|
||||
=========================
|
||||
|
||||
.. note::
|
||||
|
||||
Full source located at :example:`timing_listener`.
|
||||
|
||||
.. literalinclude:: ../../taskflow/examples/timing_listener.py
|
||||
:language: python
|
||||
:linenos:
|
||||
:lines: 16-
|
||||
|
||||
Table multiplier (in parallel)
|
||||
==============================
|
||||
|
||||
.. note::
|
||||
|
||||
Full source located at :example:`parallel_table_multiply`
|
||||
|
||||
.. literalinclude:: ../../taskflow/examples/parallel_table_multiply.py
|
||||
:language: python
|
||||
:linenos:
|
||||
:lines: 16-
|
||||
|
||||
Linear equation solver (explicit dependencies)
|
||||
==============================================
|
||||
|
||||
@@ -80,6 +152,18 @@ Creating a volume (in parallel)
|
||||
:linenos:
|
||||
:lines: 16-
|
||||
|
||||
Summation mapper(s) and reducer (in parallel)
|
||||
=============================================
|
||||
|
||||
.. note::
|
||||
|
||||
Full source located at :example:`simple_map_reduce`
|
||||
|
||||
.. literalinclude:: ../../taskflow/examples/simple_map_reduce.py
|
||||
:language: python
|
||||
:linenos:
|
||||
:lines: 16-
|
||||
|
||||
Storing & emitting a bill
|
||||
=========================
|
||||
|
||||
@@ -163,3 +247,50 @@ Distributed execution (simple)
|
||||
:language: python
|
||||
:linenos:
|
||||
:lines: 16-
|
||||
|
||||
Distributed notification (simple)
|
||||
=================================
|
||||
|
||||
.. note::
|
||||
|
||||
Full source located at :example:`wbe_event_sender`
|
||||
|
||||
.. literalinclude:: ../../taskflow/examples/wbe_event_sender.py
|
||||
:language: python
|
||||
:linenos:
|
||||
:lines: 16-
|
||||
|
||||
Distributed mandelbrot (complex)
|
||||
================================
|
||||
|
||||
.. note::
|
||||
|
||||
Full source located at :example:`wbe_mandelbrot`
|
||||
|
||||
Output
|
||||
------
|
||||
|
||||
.. image:: img/mandelbrot.png
|
||||
:height: 128px
|
||||
:align: right
|
||||
:alt: Generated mandelbrot fractal
|
||||
|
||||
Code
|
||||
----
|
||||
|
||||
.. literalinclude:: ../../taskflow/examples/wbe_mandelbrot.py
|
||||
:language: python
|
||||
:linenos:
|
||||
:lines: 16-
|
||||
|
||||
Jobboard producer/consumer (simple)
|
||||
===================================
|
||||
|
||||
.. note::
|
||||
|
||||
Full source located at :example:`jobboard_produce_consume_colors`
|
||||
|
||||
.. literalinclude:: ../../taskflow/examples/jobboard_produce_consume_colors.py
|
||||
:language: python
|
||||
:linenos:
|
||||
:lines: 16-
|
||||
|
||||
BIN  doc/source/img/jobboard.png  (new file)
BIN  doc/source/img/mandelbrot.png  (new file)
doc/source/img/wbe_request_states.svg  (new file)
@@ -14,7 +14,7 @@ Contents
|
||||
========
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
:maxdepth: 3
|
||||
|
||||
atoms
|
||||
arguments_and_results
|
||||
@@ -29,11 +29,6 @@ Contents
|
||||
jobs
|
||||
conductors
|
||||
|
||||
.. toctree::
|
||||
:hidden:
|
||||
|
||||
workers
|
||||
|
||||
Examples
|
||||
--------
|
||||
|
||||
@@ -70,13 +65,9 @@ TaskFlow into your project:
|
||||
``[TaskFlow]`` to your emails subject to get an even faster response).
|
||||
* Follow (or at least attempt to follow) some of the established
|
||||
`best practices`_ (feel free to add your own suggested best practices).
|
||||
|
||||
.. warning::
|
||||
|
||||
External usage of internal helpers and other internal utility functions
|
||||
and modules should be kept to a *minimum* as these may be altered,
|
||||
refactored or moved *without* notice. If you are unsure whether to use
|
||||
a function, class, or module, please ask (see above).
|
||||
* Keep in touch with the team (see above); we are all friendly and enjoy
|
||||
knowing your use cases and learning how we can help make your lives easier
|
||||
by adding or adjusting functionality in this library.
|
||||
|
||||
.. _IRC: irc://chat.freenode.net/openstack-state-management
|
||||
.. _best practices: http://wiki.openstack.org/wiki/TaskFlow/Best_practices
|
||||
@@ -91,6 +82,8 @@ Miscellaneous
|
||||
|
||||
exceptions
|
||||
states
|
||||
types
|
||||
utils
|
||||
|
||||
Indices and tables
|
||||
==================
|
||||
|
||||
@@ -1,11 +1,11 @@
|
||||
==================
|
||||
Inputs and Outputs
|
||||
Inputs and outputs
|
||||
==================
|
||||
|
||||
In TaskFlow there are multiple ways to provide inputs for your tasks and flows
|
||||
and get information from them. This document describes one of them, that
|
||||
involves task arguments and results. There are also :doc:`notifications
|
||||
<notifications>`, which allow you to get notified when task or flow changed
|
||||
<notifications>`, which allow you to get notified when a task or flow changes
|
||||
state. You may also opt to use the :doc:`persistence <persistence>` layer
|
||||
itself directly.
|
||||
|
||||
@@ -19,15 +19,16 @@ This is the standard and recommended way to pass data from one task to another.
|
||||
Of course not every task argument needs to be provided to some other task of a
|
||||
flow, and not every task result should be consumed by every task.
|
||||
|
||||
If some value is required by one or more tasks of a flow, but is not provided
|
||||
by any task, it is considered to be flow input, and **must** be put into the
|
||||
storage before the flow is run. A set of names required by a flow can be
|
||||
retrieved via that flow's ``requires`` property. These names can be used to
|
||||
If some value is required by one or more tasks of a flow, but it is not
|
||||
provided by any task, it is considered to be flow input, and **must** be put
|
||||
into the storage before the flow is run. A set of names required by a flow can
|
||||
be retrieved via that flow's ``requires`` property. These names can be used to
|
||||
determine what names may be applicable for placing in storage ahead of time
|
||||
and which names are not applicable.
|
||||
|
||||
All values provided by tasks of the flow are considered to be flow outputs; the
|
||||
set of names of such values is available via ``provides`` property of the flow.
|
||||
set of names of such values is available via the ``provides`` property of the
|
||||
flow.
|
||||
|
||||
.. testsetup::
|
||||
|
||||
@@ -49,7 +50,7 @@ For example:
|
||||
... MyTask(requires='b', provides='d')
|
||||
... )
|
||||
>>> flow.requires
|
||||
set(['a'])
|
||||
frozenset(['a'])
|
||||
>>> sorted(flow.provides)
|
||||
['b', 'c', 'd']
|
||||
|
||||
@@ -59,8 +60,10 @@ As you can see, this flow does not require b, as it is provided by the fist
|
||||
task.
|
||||
|
||||
.. note::
|
||||
There is no difference between processing of Task and Retry inputs
|
||||
and outputs.
|
||||
|
||||
There is no difference between processing of
|
||||
:py:class:`Task <taskflow.task.BaseTask>` and
|
||||
:py:class:`~taskflow.retry.Retry` inputs and outputs.
|
||||
|
||||
------------------
|
||||
Engine and storage
|
||||
@@ -146,8 +149,10 @@ Outputs
|
||||
|
||||
As you can see from examples above, the run method returns all flow outputs in
|
||||
a ``dict``. This same data can be fetched via
|
||||
:py:meth:`~taskflow.storage.Storage.fetch_all` method of the storage. You can
|
||||
also get single results using :py:meth:`~taskflow.storage.Storage.fetch`.
|
||||
:py:meth:`~taskflow.storage.Storage.fetch_all` method of the engines storage
|
||||
object. You can also get single results using the
|
||||
engines storage objects :py:meth:`~taskflow.storage.Storage.fetch` method.
|
||||
|
||||
For example:
|
||||
|
||||
.. doctest::
|
||||
|
||||
@@ -28,14 +28,14 @@ Definitions
|
||||
===========
|
||||
|
||||
Jobs
|
||||
A :py:class:`job <taskflow.jobs.job.Job>` consists of a unique identifier,
|
||||
A :py:class:`job <taskflow.jobs.base.Job>` consists of a unique identifier,
|
||||
name, and a reference to a :py:class:`logbook
|
||||
<taskflow.persistence.logbook.LogBook>` which contains the details of the
|
||||
work that has been or should be/will be completed to finish the work that has
|
||||
been created for that job.
|
||||
|
||||
Jobboards
|
||||
A :py:class:`jobboard <taskflow.jobs.jobboard.JobBoard>` is responsible for
|
||||
A :py:class:`jobboard <taskflow.jobs.base.JobBoard>` is responsible for
|
||||
managing the posting, ownership, and delivery of jobs. It acts as the
|
||||
location where jobs can be posted, claimed and searched for; typically by
|
||||
iteration or notification. Jobboards may be backed by different *capable*
|
||||
@@ -45,6 +45,13 @@ Jobboards
|
||||
service that uses TaskFlow to select a jobboard implementation that fits
|
||||
their setup (and there intended usage) best.
|
||||
|
||||
High level architecture
|
||||
=======================
|
||||
|
||||
.. image:: img/jobboard.png
|
||||
:height: 350px
|
||||
:align: right
|
||||
|
||||
Features
|
||||
========
|
||||
|
||||
@@ -157,10 +164,13 @@ might look like:
|
||||
else:
|
||||
# I finished it, now cleanup.
|
||||
board.consume(my_job)
|
||||
persistence.destroy_logbook(my_job.book.uuid)
|
||||
persistence.get_connection().destroy_logbook(my_job.book.uuid)
|
||||
time.sleep(coffee_break_time)
|
||||
...
|
||||
|
||||
Types
|
||||
=====
|
||||
|
||||
Zookeeper
|
||||
---------
|
||||
|
||||
@@ -192,6 +202,11 @@ Additional *configuration* parameters:
|
||||
when your program uses eventlet and you want to instruct kazoo to use an
|
||||
eventlet compatible handler (such as the `eventlet handler`_).
|
||||
|
||||
.. note::

   See :py:class:`~taskflow.jobs.backends.impl_zookeeper.ZookeeperJobBoard`
   for implementation details.

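As a hedged sketch of obtaining such a board and posting a job (the board
name, ZooKeeper address, path and job details below are illustrative
assumptions, not a canonical recipe)::

    from taskflow.jobs import backends as job_backends

    # 'board' selects the jobboard type; 'hosts' is the ZooKeeper ensemble.
    conf = {
        'board': 'zookeeper',
        'hosts': 'localhost:2181',
        'path': '/taskflow/my-boards/demo',
    }
    board = job_backends.fetch('demo-board', conf)
    board.connect()
    try:
        # A job normally references a logbook describing the work to perform;
        # here only a name and a small details dict are posted for brevity.
        board.post('convert-video-42', details={'video_id': 42})
    finally:
        board.close()
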
Considerations
|
||||
==============
|
||||
|
||||
@@ -244,9 +259,21 @@ the claim by then, therefore both would be *working* on a job.
|
||||
Interfaces
|
||||
==========
|
||||
|
||||
.. automodule:: taskflow.jobs.base
|
||||
.. automodule:: taskflow.jobs.backends
|
||||
.. automodule:: taskflow.jobs.job
|
||||
.. automodule:: taskflow.jobs.jobboard
|
||||
|
||||
Implementations
|
||||
===============
|
||||
|
||||
.. automodule:: taskflow.jobs.backends.impl_zookeeper
|
||||
|
||||
Hierarchy
|
||||
=========
|
||||
|
||||
.. inheritance-diagram::
|
||||
taskflow.jobs.base
|
||||
taskflow.jobs.backends.impl_zookeeper
|
||||
:parts: 1
|
||||
|
||||
.. _paradigm shift: https://wiki.openstack.org/wiki/TaskFlow/Paradigm_shifts#Workflow_ownership_transfer
|
||||
.. _zookeeper: http://zookeeper.apache.org/
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
===========================
|
||||
Notifications and Listeners
|
||||
Notifications and listeners
|
||||
===========================
|
||||
|
||||
.. testsetup::
|
||||
@@ -7,6 +7,8 @@ Notifications and Listeners
|
||||
from taskflow import task
|
||||
from taskflow.patterns import linear_flow
|
||||
from taskflow import engines
|
||||
from taskflow.types import notifier
|
||||
ANY = notifier.Notifier.ANY
|
||||
|
||||
--------
|
||||
Overview
|
||||
@@ -17,10 +19,9 @@ transitions, which is useful for monitoring, logging, metrics, debugging
|
||||
and plenty of other tasks.
|
||||
|
||||
To receive these notifications you should register a callback with
|
||||
an instance of the the :py:class:`notifier <taskflow.utils.misc.Notifier>`
|
||||
class that is attached
|
||||
to :py:class:`engine <taskflow.engines.base.EngineBase>`
|
||||
attributes ``task_notifier`` and ``notifier``.
|
||||
an instance of the :py:class:`~taskflow.types.notifier.Notifier`
|
||||
class that is attached to :py:class:`~taskflow.engines.base.Engine`
|
||||
attributes ``atom_notifier`` and ``notifier``.
|
||||
|
||||
TaskFlow also comes with a set of predefined :ref:`listeners <listeners>`, and
|
||||
provides means to write your own listeners, which can be more convenient than
|
||||
@@ -30,17 +31,14 @@ using raw callbacks.
|
||||
Receiving notifications with callbacks
|
||||
--------------------------------------
|
||||
|
||||
To manage notifications instances of
|
||||
:py:class:`~taskflow.utils.misc.Notifier` are used.
|
||||
|
||||
.. autoclass:: taskflow.utils.misc.Notifier
|
||||
|
||||
Flow notifications
|
||||
------------------
|
||||
|
||||
To receive notification on flow state changes use
|
||||
:py:class:`~taskflow.utils.misc.Notifier` available as
|
||||
``notifier`` property of the engine. A basic example is:
|
||||
To receive notification on flow state changes use the
|
||||
:py:class:`~taskflow.types.notifier.Notifier` instance available as the
|
||||
``notifier`` property of an engine.
|
||||
|
||||
A basic example is:
|
||||
|
||||
.. doctest::
|
||||
|
||||
@@ -61,7 +59,7 @@ To receive notification on flow state changes use
|
||||
>>> flo = linear_flow.Flow("cat-dog").add(
|
||||
... CatTalk(), DogTalk(provides="dog"))
|
||||
>>> eng = engines.load(flo, store={'meow': 'meow', 'woof': 'woof'})
|
||||
>>> eng.notifier.register("*", flow_transition)
|
||||
>>> eng.notifier.register(ANY, flow_transition)
|
||||
>>> eng.run()
|
||||
Flow 'cat-dog' transition to state RUNNING
|
||||
meow
|
||||
@@ -71,9 +69,11 @@ To receive notification on flow state changes use
|
||||
Task notifications
|
||||
------------------
|
||||
|
||||
To receive notification on task state changes use
|
||||
:py:class:`~taskflow.utils.misc.Notifier` available as
|
||||
``task_notifier`` property of the engine. A basic example is:
|
||||
To receive notification on task state changes use the
|
||||
:py:class:`~taskflow.types.notifier.Notifier` instance available as the
|
||||
``atom_notifier`` property of an engine.
|
||||
|
||||
A basic example is:
|
||||
|
||||
.. doctest::
|
||||
|
||||
@@ -95,7 +95,7 @@ To receive notification on task state changes use
|
||||
>>> flo.add(CatTalk(), DogTalk(provides="dog"))
|
||||
<taskflow.patterns.linear_flow.Flow object at 0x...>
|
||||
>>> eng = engines.load(flo, store={'meow': 'meow', 'woof': 'woof'})
|
||||
>>> eng.task_notifier.register("*", task_transition)
|
||||
>>> eng.task_notifier.register(ANY, task_transition)
|
||||
>>> eng.run()
|
||||
Task 'CatTalk' transition to state RUNNING
|
||||
meow
|
||||
@@ -138,30 +138,53 @@ For example, this is how you can use
    >>> with printing.PrintingListener(eng):
    ...   eng.run()
    ...
    taskflow.engines.action_engine.engine.SingleThreadedActionEngine: ... has moved flow 'cat-dog' (...) into state 'RUNNING'
    taskflow.engines.action_engine.engine.SingleThreadedActionEngine: ... has moved task 'CatTalk' (...) into state 'RUNNING'
    <taskflow.engines.action_engine.engine.SerialActionEngine object at ...> has moved flow 'cat-dog' (...) into state 'RUNNING' from state 'PENDING'
    <taskflow.engines.action_engine.engine.SerialActionEngine object at ...> has moved task 'CatTalk' (...) into state 'RUNNING' from state 'PENDING'
    meow
    taskflow.engines.action_engine.engine.SingleThreadedActionEngine: ... has moved task 'CatTalk' (...) into state 'SUCCESS' with result 'cat' (failure=False)
    taskflow.engines.action_engine.engine.SingleThreadedActionEngine: ... has moved task 'DogTalk' (...) into state 'RUNNING'
    <taskflow.engines.action_engine.engine.SerialActionEngine object at ...> has moved task 'CatTalk' (...) into state 'SUCCESS' from state 'RUNNING' with result 'cat' (failure=False)
    <taskflow.engines.action_engine.engine.SerialActionEngine object at ...> has moved task 'DogTalk' (...) into state 'RUNNING' from state 'PENDING'
    woof
    taskflow.engines.action_engine.engine.SingleThreadedActionEngine: ... has moved task 'DogTalk' (...) into state 'SUCCESS' with result 'dog' (failure=False)
    taskflow.engines.action_engine.engine.SingleThreadedActionEngine: ... has moved flow 'cat-dog' (...) into state 'SUCCESS'
    <taskflow.engines.action_engine.engine.SerialActionEngine object at ...> has moved task 'DogTalk' (...) into state 'SUCCESS' from state 'RUNNING' with result 'dog' (failure=False)
    <taskflow.engines.action_engine.engine.SerialActionEngine object at ...> has moved flow 'cat-dog' (...) into state 'SUCCESS' from state 'RUNNING'
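
Beyond the bundled listeners below, custom behavior can be had by subclassing
the listener base class; a minimal sketch (assuming the base class exposes
``_flow_receiver`` and ``_task_receiver`` hooks -- check the base class
documentation below for the exact names in your version):

.. code:: python

    from taskflow.listeners import base

    class EchoListener(base.ListenerBase):
        def _flow_receiver(self, state, details):
            # Called for each flow state transition this listener
            # registered for.
            print("Flow => %s" % state)

        def _task_receiver(self, state, details):
            # Called for each task/atom state transition.
            print("Task %s => %s" % (details.get('task_name'), state))

    # Listeners act as context managers around an engine run (``eng``
    # being an engine loaded as in the examples above).
    with EchoListener(eng):
        eng.run()
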
Basic listener
--------------

.. autoclass:: taskflow.listeners.base.ListenerBase
.. autoclass:: taskflow.listeners.base.Listener

Printing and logging listeners
------------------------------

.. autoclass:: taskflow.listeners.base.LoggingBase
.. autoclass:: taskflow.listeners.base.DumpingListener

.. autoclass:: taskflow.listeners.logging.LoggingListener

.. autoclass:: taskflow.listeners.logging.DynamicLoggingListener

.. autoclass:: taskflow.listeners.printing.PrintingListener

Timing listener
---------------

.. autoclass:: taskflow.listeners.timing.TimingListener

.. autoclass:: taskflow.listeners.timing.PrintingTimingListener

Claim listener
--------------

.. autoclass:: taskflow.listeners.claims.CheckingClaimListener

Hierarchy
---------

.. inheritance-diagram::
    taskflow.listeners.base.DumpingListener
    taskflow.listeners.base.Listener
    taskflow.listeners.claims.CheckingClaimListener
    taskflow.listeners.logging.DynamicLoggingListener
    taskflow.listeners.logging.LoggingListener
    taskflow.listeners.printing.PrintingListener
    taskflow.listeners.timing.PrintingTimingListener
    taskflow.listeners.timing.TimingListener
    :parts: 1

@@ -38,7 +38,7 @@ How it is used

On :doc:`engine <engines>` construction typically a backend (it can be
optional) will be provided which satisfies the
:py:class:`~taskflow.persistence.backends.base.Backend` abstraction. Along with
:py:class:`~taskflow.persistence.base.Backend` abstraction. Along with
providing a backend object a
:py:class:`~taskflow.persistence.logbook.FlowDetail` object will also be
created and provided (this object will contain the details about the flow to be
@@ -55,7 +55,7 @@ interface to the underlying backend storage objects (it provides helper
functions that are commonly used by the engine, avoiding repeating code when
interacting with the provided
:py:class:`~taskflow.persistence.logbook.FlowDetail` and
:py:class:`~taskflow.persistence.backends.base.Backend` objects). As an engine
:py:class:`~taskflow.persistence.base.Backend` objects). As an engine
initializes it will extract (or create)
:py:class:`~taskflow.persistence.logbook.AtomDetail` objects for each atom in
the workflow the engine will be executing.
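
To make the above concrete, here is a hedged sketch of wiring a backend and
flow detail into an engine at load time (the ``temporary_flow_detail`` helper
from ``taskflow.utils.persistence_utils`` is assumed here for brevity; verify
the helper names against the utilities documentation):

.. code:: python

    from taskflow import engines
    from taskflow import task
    from taskflow.patterns import linear_flow
    from taskflow.persistence import backends
    from taskflow.utils import persistence_utils

    class Noop(task.Task):
        def execute(self):
            pass

    # 'memory://' keeps everything in-process; any other connection
    # string from the types below could be swapped in.
    backend = backends.fetch({'connection': 'memory://'})
    book, flow_detail = persistence_utils.temporary_flow_detail(backend)

    flow = linear_flow.Flow('example').add(Noop())
    eng = engines.load(flow, flow_detail=flow_detail,
                       book=book, backend=backend)
    eng.run()
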
@@ -72,7 +72,7 @@ predecessor :py:class:`~taskflow.persistence.logbook.AtomDetail` outputs and
states (which may have been persisted in a past run). This will result in
either using their previous information or running those predecessors and
saving their output to the :py:class:`~taskflow.persistence.logbook.FlowDetail`
and :py:class:`~taskflow.persistence.backends.base.Backend` objects. This
and :py:class:`~taskflow.persistence.base.Backend` objects. This
execution, analysis and interaction with the storage objects continues (what is
described here is a simplification of what really happens, which is quite a bit
more complex) until the engine has finished running (at which point the engine
@@ -144,6 +144,9 @@ the following:
``'connection'`` and possibly type-specific backend parameters as other
keys.
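
For example, fetching a backend from such a configuration dictionary might
look like the following (a sketch; the ``'connection'`` value selects which
backend type gets loaded):

.. code:: python

    import contextlib

    from taskflow.persistence import backends

    # The connection string selects the backend type, e.g. 'memory://',
    # 'dir:///var/lib/taskflow', 'sqlite:///state.db' or
    # 'zookeeper://zk-host:2181/'.
    backend = backends.fetch({'connection': 'memory://'})
    with contextlib.closing(backend.get_connection()) as conn:
        conn.upgrade()  # make sure any backing structures exist
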
Types
=====

Memory
------

@@ -152,6 +155,11 @@ Memory
Retains all data in local memory (not persisted to reliable storage). Useful
for scenarios where persistence is not required (and also in unit tests).

.. note::

    See :py:class:`~taskflow.persistence.backends.impl_memory.MemoryBackend`
    for implementation details.

Files
-----

@@ -163,6 +171,11 @@ from the same local machine only). Useful for cases where a *more* reliable
persistence is desired along with the simplicity of files and directories (a
concept everyone is familiar with).

.. note::

    See :py:class:`~taskflow.persistence.backends.impl_dir.DirBackend`
    for implementation details.

Sqlalchemy
----------

@@ -174,9 +187,62 @@ Useful when you need a higher level of durability than offered by the previous
solutions. When using these connection types it is possible to resume an engine
from a peer machine (this does not apply when using sqlite).
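
A short sketch of configuring this backend (the connection string follows the
usual `sqlalchemy`_ engine URL format; the host, user and database names here
are placeholders):

.. code:: python

    import contextlib

    from taskflow.persistence import backends

    backend = backends.fetch({
        'connection': 'mysql://user:password@dbhost/taskflow',
    })
    with contextlib.closing(backend.get_connection()) as conn:
        conn.upgrade()  # creates/upgrades the tables described below
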
Schema
^^^^^^

*Logbooks*

========== ======== =============
Name       Type     Primary Key
========== ======== =============
created_at DATETIME False
updated_at DATETIME False
uuid       VARCHAR  True
name       VARCHAR  False
meta       TEXT     False
========== ======== =============

*Flow details*

=========== ======== =============
Name        Type     Primary Key
=========== ======== =============
created_at  DATETIME False
updated_at  DATETIME False
uuid        VARCHAR  True
name        VARCHAR  False
meta        TEXT     False
state       VARCHAR  False
parent_uuid VARCHAR  False
=========== ======== =============

*Atom details*

=========== ======== =============
Name        Type     Primary Key
=========== ======== =============
created_at  DATETIME False
updated_at  DATETIME False
uuid        VARCHAR  True
name        VARCHAR  False
meta        TEXT     False
atom_type   VARCHAR  False
state       VARCHAR  False
intention   VARCHAR  False
results     TEXT     False
failure     TEXT     False
version     TEXT     False
parent_uuid VARCHAR  False
=========== ======== =============

.. _sqlalchemy: http://www.sqlalchemy.org/docs/
.. _ACID: https://en.wikipedia.org/wiki/ACID

.. note::

    See :py:class:`~taskflow.persistence.backends.impl_sqlalchemy.SQLAlchemyBackend`
    for implementation details.

Zookeeper
---------

@@ -190,6 +256,11 @@ logbook represented as znodes. Since zookeeper is also distributed it is also
able to resume an engine from a peer machine (having similar functionality
as the database connection types listed previously).

.. note::

    See :py:class:`~taskflow.persistence.backends.impl_zookeeper.ZkBackend`
    for implementation details.

.. _zookeeper: http://zookeeper.apache.org
.. _kazoo: http://kazoo.readthedocs.org/

@@ -197,15 +268,24 @@ Interfaces
==========

.. automodule:: taskflow.persistence.backends
.. automodule:: taskflow.persistence.backends.base
.. automodule:: taskflow.persistence.base
.. automodule:: taskflow.persistence.logbook

Implementations
===============

.. automodule:: taskflow.persistence.backends.impl_dir
.. automodule:: taskflow.persistence.backends.impl_memory
.. automodule:: taskflow.persistence.backends.impl_sqlalchemy
.. automodule:: taskflow.persistence.backends.impl_zookeeper

Hierarchy
=========

.. inheritance-diagram::
    taskflow.persistence.backends.impl_memory
    taskflow.persistence.backends.impl_zookeeper
    taskflow.persistence.base
    taskflow.persistence.backends.impl_dir
    taskflow.persistence.backends.impl_memory
    taskflow.persistence.backends.impl_sqlalchemy
    taskflow.persistence.backends.impl_zookeeper
    :parts: 2

@@ -88,7 +88,7 @@ The following scenarios explain some expected structural changes and how they
can be accommodated (and what the effect will be when resuming & running).

Same atoms
----------
++++++++++

When the factory function mentioned above returns the exact same flow and
atoms (no changes are performed).
@@ -98,7 +98,7 @@ atoms with :py:class:`~taskflow.persistence.logbook.AtomDetail` objects by name
and then the engine resumes.
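
The 'factory function' referred to throughout these scenarios is simply a
function that (re)creates the flow; a hedged sketch using
``load_from_factory`` (which records the factory so resumption can call it
again; ``backend`` is assumed to be a persistence backend created as shown in
the persistence documentation):

.. code:: python

    from taskflow import engines
    from taskflow import task
    from taskflow.patterns import linear_flow

    class Work(task.Task):
        def execute(self):
            return 'done'

    def flow_factory():
        # Re-creating the same flow (same names/atoms) lets resumption
        # re-associate prior atom details by name.
        return linear_flow.Flow('resumable').add(Work('work-1'))

    eng = engines.load_from_factory(flow_factory, backend=backend)
    eng.run()
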
Atom was added
--------------
++++++++++++++

When the factory function mentioned above alters the flow by adding a new atom
(for example for changing the runtime structure of what was previously run
@@ -109,7 +109,7 @@ corresponding :py:class:`~taskflow.persistence.logbook.AtomDetail` does not
exist and one will be created and associated.

Atom was removed
----------------
++++++++++++++++

When the factory function mentioned above alters the flow by removing an
existing atom (for example for changing the runtime structure of what was previously
@@ -121,7 +121,7 @@ it was not there, and any results it returned if it was completed before will
be ignored.

Atom code was changed
---------------------
+++++++++++++++++++++

When the factory function mentioned above alters the flow by deciding that a
newer version of a previously existing atom should be run (possibly to perform
@@ -137,8 +137,8 @@ ability to upgrade atoms before running (manual introspection & modification of
a :py:class:`~taskflow.persistence.logbook.LogBook` can be done before engine
loading and running to accomplish this in the meantime).

Atom was split in two atoms or merged from two (or more) to one atom
--------------------------------------------------------------------
Atom was split in two atoms or merged
+++++++++++++++++++++++++++++++++++++

When the factory function mentioned above alters the flow by deciding that a
previously existing atom should be split into N atoms or the factory function
@@ -154,7 +154,7 @@ introspection & modification of a
loading and running to accomplish this in the meantime).

Flow structure was changed
--------------------------
++++++++++++++++++++++++++

If manual links were added or removed from the graph, or task requirements were
changed, or the flow was refactored (atom moved into or out of subflows, linear

@@ -4,12 +4,21 @@ States

.. _engine states:

.. note::

    The code contains explicit checks during transitions using the models
    described below. These checks ensure that a transition is valid; if the
    transition is determined to be invalid the transitioning code will raise
    a :py:class:`~taskflow.exceptions.InvalidState` exception. This exception
    being triggered usually means there is some kind of bug in the code or some
    type of misuse/state violation is occurring, and should be reported as such.

Engine
======

.. image:: img/engine_states.svg
   :width: 660px
   :align: left
   :align: center
   :alt: Action engine state transitions

**RESUMING** - Prepares flow & atoms to be resumed.
@@ -22,135 +31,166 @@ Engine

**SUCCESS** - Completed successfully.

**FAILURE** - Completed unsuccessfully.

**REVERTED** - Reverting was induced and all atoms were **not** completed
successfully.

**SUSPENDED** - Suspended while running.

**UNDEFINED** - *Internal state.*

**GAME_OVER** - *Internal state.*

Flow
====

.. image:: img/flow_states.svg
   :width: 660px
   :align: left
   :align: center
   :alt: Flow state transitions

**PENDING** - A flow starts its life in this state.
**PENDING** - A flow starts its execution lifecycle in this state (it has no
state prior to being run by an engine, since flow(s) are just pattern(s)
that define the semantics and ordering of their contents and flows gain
state only when they are executed).

**RUNNING** - In this state the flow makes progress, executes and/or reverts its
atoms.
**RUNNING** - In this state the engine running a flow progresses through the
flow.

**SUCCESS** - Once all atoms have finished successfully the flow transitions to
the SUCCESS state.
**SUCCESS** - Transitioned to once all of the flow's atoms have finished
successfully.

**REVERTED** - The flow transitions to this state when it has been reverted
successfully after a failure.
**REVERTED** - Transitioned to once all of the flow's atoms have been reverted
successfully after a failure.

**FAILURE** - The flow transitions to this state when it can not be reverted
after a failure.
**FAILURE** - The engine will transition the flow to this state when it can not
be reverted after a single failure or after multiple failures (greater than
one failure *may* occur when running in parallel).

**SUSPENDING** - In the RUNNING state the flow can be suspended. When this
happens, the flow transitions to the SUSPENDING state immediately. In that state
the engine running the flow waits for running atoms to finish (since the engine
can not preempt atoms that are active).
**SUSPENDING** - In the ``RUNNING`` state the engine running the flow can be
suspended. When this happens, the engine attempts to transition the flow
to the ``SUSPENDING`` state immediately. In that state the engine running the
flow waits for running atoms to finish (since the engine can not preempt
atoms that are actively running).

**SUSPENDED** - When no atoms are running and all results received so far are
saved, the flow transitions from the SUSPENDING state to SUSPENDED. Also it may
go to the SUCCESS state if all atoms were in fact run, or to the REVERTED state
if the flow was reverting and all atoms were reverted while the engine was
waiting for running atoms to finish, or to the FAILURE state if atoms were run
or reverted and some of them failed.

**RESUMING** - When the flow is interrupted 'in a hard way' (e.g. server
crashed), it can be loaded from storage in any state. If the state is not
PENDING (aka, the flow was never run) or SUCCESS, FAILURE or REVERTED (in which
case the flow has already finished), the flow gets set to the RESUMING state
for the short time period while it is being loaded from backend storage [a
database, a filesystem...] (this transition is not shown on the diagram). When
the flow is finally loaded, it goes to the SUSPENDED state.

From the SUCCESS, FAILURE or REVERTED states the flow can be run again (and
thus it goes back into the RUNNING state). One of the possible use cases for
this transition is to allow for alteration of a flow or flow details associated
with a previously run flow after the flow has finished, and client code wants
to ensure that each atom from this new (potentially updated) flow has its
chance to run.
**SUSPENDED** - When no atoms are running and all results received so far have
been saved, the engine transitions the flow from the ``SUSPENDING`` state
to the ``SUSPENDED`` state.

.. note::

    The current code also contains strong checks during each flow state
    transition using the model described above and raises the
    :py:class:`~taskflow.exceptions.InvalidState` exception if an invalid
    transition is attempted. This exception being triggered usually means there
    is some kind of bug in the engine code or some type of misuse/state violation
    is occurring, and should be reported as such.
    The engine may transition the flow to the ``SUCCESS`` state (from the
    ``SUSPENDING`` state) if all atoms were in fact running (and completed)
    before the suspension request was able to be honored (this is due to the lack
    of preemption) or to the ``REVERTED`` state if the engine was reverting and
    all atoms were reverted while the engine was waiting for running atoms to
    finish or to the ``FAILURE`` state if atoms were running or reverted and
    some of them had failed.
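
Suspension itself is requested through the engine; a minimal sketch (assuming
an engine object ``eng`` like the ones loaded in earlier examples, with the
suspend request arriving from a different thread than the one running the
engine):

.. code:: python

    import threading

    runner = threading.Thread(target=eng.run)
    runner.start()
    # Ask the engine to stop at the next safe point; running atoms are
    # not preempted (see SUSPENDING above).
    eng.suspend()
    runner.join()
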
**RESUMING** - When the engine running a flow is interrupted *'in a
hard way'* (e.g. server crashed), it can be loaded from storage in *any*
state (this is required since it can not be known what state was last
successfully saved). If the loaded state is not ``PENDING`` (aka, the flow was
never run) or ``SUCCESS``, ``FAILURE`` or ``REVERTED`` (in which case the flow
has already finished), the flow gets set to the ``RESUMING`` state for the
short time period while it is being loaded from backend storage [a database, a
filesystem...] (this transition is not shown on the diagram). When the flow is
finally loaded, it goes to the ``SUSPENDED`` state.

From the ``SUCCESS``, ``FAILURE`` or ``REVERTED`` states the flow can be run
again; therefore it is allowable to go back into the ``RUNNING`` state
immediately. One of the possible use cases for this transition is to allow for
alteration of a flow or flow details associated with a previously run flow
after the flow has finished, and client code wants to ensure that each atom
from this new (potentially updated) flow has its chance to run.

Task
====

.. image:: img/task_states.svg
   :width: 660px
   :align: left
   :align: center
   :alt: Task state transitions

**PENDING** - When a task is added to a flow, it starts in the PENDING state,
which means it can be executed immediately or waits for all of the tasks it depends
on to complete. The task transitions to the PENDING state after it was
reverted and its flow was restarted or retried.
**PENDING** - A task starts its execution lifecycle in this state (it has no
state prior to being run by an engine, since task(s) are just objects that
represent how to accomplish a piece of work). Once it has been transitioned to
the ``PENDING`` state by the engine this means it can be executed immediately
or if needed will wait for all of the atoms it depends on to complete.

**RUNNING** - When the flow starts to execute the task, it transitions to the
RUNNING state, and stays in this state until its
:py:meth:`execute() <taskflow.task.BaseTask.execute>` method returns.
.. note::

**SUCCESS** - The task transitions to this state after it has finished
successfully.
    An engine running a task also transitions the task to the ``PENDING`` state
    after it was reverted and its containing flow was restarted or retried.

**FAILURE** - The task transitions to this state after it has finished with an
error. When the flow containing this task is being reverted, all its tasks are
walked in a particular order.
**RUNNING** - When an engine running the task starts to execute the task, the
engine will transition the task to the ``RUNNING`` state, and the task will
stay in this state until the task's :py:meth:`~taskflow.task.BaseTask.execute`
method returns.

**REVERTING** - The task transitions to this state when the flow starts to
revert it and its :py:meth:`revert() <taskflow.task.BaseTask.revert>` method
is called. Only tasks in the SUCCESS or FAILURE state can be reverted. If this
method fails (raises an exception), the task goes to the FAILURE state.
**SUCCESS** - The engine running the task transitions the task to this state
after the task has finished successfully (ie no exception/s were raised during
execution).

**FAILURE** - The engine running the task transitions the task to this state
after it has finished with an error.

**REVERTING** - The engine running a task transitions the task to this state
when the containing flow the engine is running starts to revert and
its :py:meth:`~taskflow.task.BaseTask.revert` method is called. Only tasks in
the ``SUCCESS`` or ``FAILURE`` state can be reverted. If this method fails (ie
raises an exception), the task goes to the ``FAILURE`` state (if it was already
in the ``FAILURE`` state then this is a no-op).

**REVERTED** - A task that has been reverted appears in this state.


Retry
=====

.. note::

    A retry has the same states as a task and one additional state.

.. image:: img/retry_states.svg
   :width: 660px
   :align: left
   :align: center
   :alt: Retry state transitions

Retry has the same states as a task and one additional state.
**PENDING** - A retry starts its execution lifecycle in this state (it has no
state prior to being run by an engine, since retry(s) are just objects that
represent how to retry an associated flow). Once it has been transitioned to
the ``PENDING`` state by the engine this means it can be executed immediately
or if needed will wait for all of the atoms it depends on to complete (in the
retry case the retry object will also be consulted when failures occur in the
flow that the retry is associated with by consulting its
:py:meth:`~taskflow.retry.Decider.on_failure` method).

**PENDING** - When a retry is added to a flow, it starts in the PENDING state,
which means it can be executed immediately or waits for all of the tasks it depends
on to complete. The retry transitions to the PENDING state after it was
reverted and its flow was restarted or retried.
.. note::

**RUNNING** - When the flow starts to execute the retry, it transitions to the
RUNNING state, and stays in this state until its
:py:meth:`execute() <taskflow.retry.Retry.execute>` method returns.
    An engine running a retry also transitions the retry to the ``PENDING`` state
    after it was reverted and its associated flow was restarted or retried.

**SUCCESS** - The retry transitions to this state after it has finished
successfully.
**RUNNING** - When an engine starts to execute the retry, the engine
transitions the retry to the ``RUNNING`` state, and the retry stays in this
state until its :py:meth:`~taskflow.retry.Retry.execute` method returns.

**FAILURE** - The retry transitions to this state after it has finished with an
error. When the flow containing this retry is being reverted, all its tasks are
walked in a particular order.
**SUCCESS** - The engine running the retry transitions it to this state after
it has finished successfully (ie no exception/s were raised during
execution).

**REVERTING** - The retry transitions to this state when the flow starts to
revert it and its :py:meth:`revert() <taskflow.retry.Retry.revert>` method is
called. Only retries in the SUCCESS or FAILURE state can be reverted. If this
method fails (raises an exception), the retry goes to the FAILURE state.
**FAILURE** - The engine running the retry transitions it to this state after
it has finished with an error.

**REVERTING** - The engine running the retry transitions it to this state when
the associated flow the engine is running starts to revert it and its
:py:meth:`~taskflow.retry.Retry.revert` method is called. Only retries
in the ``SUCCESS`` or ``FAILURE`` state can be reverted. If this method fails (ie
raises an exception), the retry goes to the ``FAILURE`` state (if it was
already in the ``FAILURE`` state then this is a no-op).

**REVERTED** - A retry that has been reverted appears in this state.

**RETRYING** - If the flow that is managed by the current retry has failed and
been reverted, the engine prepares it for the next run and transitions to the
RETRYING state.
**RETRYING** - If the flow that is associated with the current retry has failed
and been reverted, the engine prepares the flow for the next run and transitions
the retry to the ``RETRYING`` state.
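
As a concrete anchor for the lifecycle above, the following sketch attaches a
bundled retry controller to a flow (``taskflow.retry.Times`` is used here as
an assumed example; it retries the flow's atoms a fixed number of times when
failures occur):

.. code:: python

    from taskflow import engines
    from taskflow import retry
    from taskflow import task
    from taskflow.patterns import linear_flow

    class Flaky(task.Task):
        def execute(self):
            return 'ok'

    # The retry controller moves through the states described above as
    # atoms in the flow succeed, fail and are retried.
    flow = linear_flow.Flow('with-retry', retry=retry.Times(3)).add(Flaky())
    engines.load(flow).run()
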
68
doc/source/types.rst
Normal file
@@ -0,0 +1,68 @@
-----
Types
-----

.. note::

    Even though these types **are** made for public consumption and usage
    should be encouraged/easily possible it should be noted that these may be
    moved out to new libraries at various points in the future (for example
    the ``FSM`` code *may* move to its own oslo supported ``automaton`` library
    at some point in the future [#f1]_). If you are using these
    types **without** using the rest of this library it is **strongly**
    encouraged that you be a vocal proponent of getting these made
    into *isolated* libraries (as using these types in this manner is not
    the expected and/or desired usage).

Cache
=====

.. automodule:: taskflow.types.cache

Failure
=======

.. automodule:: taskflow.types.failure

FSM
===

.. automodule:: taskflow.types.fsm

Futures
=======

.. automodule:: taskflow.types.futures

Graph
=====

.. automodule:: taskflow.types.graph

Notifier
========

.. automodule:: taskflow.types.notifier

Periodic
========

.. automodule:: taskflow.types.periodic

Table
=====

.. automodule:: taskflow.types.table

Timing
======

.. automodule:: taskflow.types.timing

Tree
====

.. automodule:: taskflow.types.tree

.. [#f1] See: https://review.openstack.org/#/c/141961 for a proposal to
         do this.
54
doc/source/utils.rst
Normal file
@@ -0,0 +1,54 @@
---------
Utilities
---------

.. warning::

    External usage of internal utility functions and modules should be kept
    to a **minimum** as they may be altered, refactored or moved to other
    locations **without** notice (and without the typical deprecation cycle).

Async
~~~~~

.. automodule:: taskflow.utils.async_utils

Deprecation
~~~~~~~~~~~

.. automodule:: taskflow.utils.deprecation

Eventlet
~~~~~~~~

.. automodule:: taskflow.utils.eventlet_utils

Kazoo
~~~~~

.. automodule:: taskflow.utils.kazoo_utils

Kombu
~~~~~

.. automodule:: taskflow.utils.kombu_utils

Locks
~~~~~

.. automodule:: taskflow.utils.lock_utils

Miscellaneous
~~~~~~~~~~~~~

.. automodule:: taskflow.utils.misc

Persistence
~~~~~~~~~~~

.. automodule:: taskflow.utils.persistence_utils

Threading
~~~~~~~~~

.. automodule:: taskflow.utils.threading_utils
@@ -1,7 +1,3 @@
-------
Workers
-------

Overview
========

@@ -17,7 +13,6 @@ connected via `amqp`_ (or other supported `kombu`_ transports).
production ready.

.. _blueprint page: https://blueprints.launchpad.net/taskflow?searchtext=wbe
.. _kombu: http://kombu.readthedocs.org/

Terminology
-----------
@@ -36,11 +31,12 @@ Executor
  these requests can be accepted and processed by remote workers.

Worker
  Workers are started on remote hosts and has list of tasks it can perform (on
  request). Workers accept and process task requests that are published by an
  executor. Several requests can be processed simultaneously in separate
  threads. For example, an `executor`_ can be passed to the worker and
  configured to run in as many threads (green or not) as desired.
  Workers are started on remote hosts and each has a list of tasks it can
  perform (on request). Workers accept and process task requests that are
  published by an executor. Several requests can be processed simultaneously
  in separate threads (or processes...). For example, an `executor`_ can be
  passed to the worker and configured to run in as many threads (green or
  not) as desired.

Proxy
  Executors interact with workers via a proxy. The proxy maintains the
@@ -72,35 +68,12 @@ Requirements
.. _executor: https://docs.python.org/dev/library/concurrent.futures.html#executor-objects
.. _protocol: http://en.wikipedia.org/wiki/Communications_protocol

Use-cases
---------

* `Glance`_

  * Image tasks *(long-running)*

    * Convert, import/export & more...

* `Heat`_

  * Engine work distribution

* `Rally`_

  * Load generation

* *Your use-case here*

.. _Heat: https://wiki.openstack.org/wiki/Heat
.. _Rally: https://wiki.openstack.org/wiki/Rally
.. _Glance: https://wiki.openstack.org/wiki/Glance

Design
======

There are two communication sides, the *executor* and *worker*, that communicate
using a proxy component. The proxy is designed to accept/publish messages
from/into a named exchange.
There are two communication sides, the *executor* (and associated engine
derivative) and *worker*, that communicate using a proxy component. The proxy
is designed to accept/publish messages from/into a named exchange.

High level architecture
-----------------------
@@ -135,7 +108,7 @@ engine executor in the following manner:
   executes the task).
2. If the dispatch succeeded then the worker sends a confirmation response
   to the executor otherwise the worker sends a failed response along with
   a serialized :py:class:`failure <taskflow.utils.misc.Failure>` object
   a serialized :py:class:`failure <taskflow.types.failure.Failure>` object
   that contains what has failed (and why).
3. The worker executes the task and once it is finished sends the result
   back to the originating executor (every time a task progress event is
@@ -152,20 +125,29 @@ engine executor in the following manner:
.. note::

    :py:class:`~taskflow.utils.misc.Failure` objects are not json-serializable
    (they contain references to tracebacks which are not serializable), so they
    are converted to dicts before sending and converted from dicts after
    receiving on both executor & worker sides (this translation is lossy since
    the traceback won't be fully retained).
    :py:class:`~taskflow.types.failure.Failure` objects are not directly
    json-serializable (they contain references to tracebacks which are not
    serializable), so they are converted to dicts before sending and converted
    from dicts after receiving on both executor & worker sides (this
    translation is lossy since the traceback won't be fully retained).

Executor request format
~~~~~~~~~~~~~~~~~~~~~~~
Protocol
~~~~~~~~

* **task** - full task name to be performed
.. automodule:: taskflow.engines.worker_based.protocol

Examples
~~~~~~~~

Request (execute)
"""""""""""""""""

* **task_name** - full task name to be performed
* **task_cls** - full task class name to be performed
* **action** - task action to be performed (e.g. execute, revert)
* **arguments** - arguments the task action to be called with
* **result** - task execution result (result or
  :py:class:`~taskflow.utils.misc.Failure`) *[passed to revert only]*
  :py:class:`~taskflow.types.failure.Failure`) *[passed to revert only]*

Additionally, the following parameters are added to the request message:
@@ -180,20 +162,70 @@ Additionally, the following parameters are added to the request message:
    {
        "action": "execute",
        "arguments": {
            "joe_number": 444
            "x": 111
        },
        "task": "tasks.CallJoe"
        "task_cls": "taskflow.tests.utils.TaskOneArgOneReturn",
        "task_name": "taskflow.tests.utils.TaskOneArgOneReturn",
        "task_version": [
            1,
            0
        ]
    }

Worker response format
~~~~~~~~~~~~~~~~~~~~~~

Request (revert)
""""""""""""""""

When **reverting:**

.. code:: json

    {
        "action": "revert",
        "arguments": {},
        "failures": {
            "taskflow.tests.utils.TaskWithFailure": {
                "exc_type_names": [
                    "RuntimeError",
                    "StandardError",
                    "Exception"
                ],
                "exception_str": "Woot!",
                "traceback_str": " File \"/homes/harlowja/dev/os/taskflow/taskflow/engines/action_engine/executor.py\", line 56, in _execute_task\n result = task.execute(**arguments)\n File \"/homes/harlowja/dev/os/taskflow/taskflow/tests/utils.py\", line 165, in execute\n raise RuntimeError('Woot!')\n",
                "version": 1
            }
        },
        "result": [
            "failure",
            {
                "exc_type_names": [
                    "RuntimeError",
                    "StandardError",
                    "Exception"
                ],
                "exception_str": "Woot!",
                "traceback_str": " File \"/homes/harlowja/dev/os/taskflow/taskflow/engines/action_engine/executor.py\", line 56, in _execute_task\n result = task.execute(**arguments)\n File \"/homes/harlowja/dev/os/taskflow/taskflow/tests/utils.py\", line 165, in execute\n raise RuntimeError('Woot!')\n",
                "version": 1
            }
        ],
        "task_cls": "taskflow.tests.utils.TaskWithFailure",
        "task_name": "taskflow.tests.utils.TaskWithFailure",
        "task_version": [
            1,
            0
        ]
    }

Worker response(s)
""""""""""""""""""

When **running:**

.. code:: json

    {
        "status": "RUNNING"
        "data": {},
        "state": "RUNNING"
    }

When **progressing:**
@@ -201,9 +233,11 @@ When **progressing:**
.. code:: json

    {
        "event_data": <event_data>,
        "progress": <progress>,
        "state": "PROGRESS"
        "details": {
            "progress": 0.5
        },
        "event_type": "update_progress",
        "state": "EVENT"
    }

When **succeeded:**
@@ -211,8 +245,9 @@ When **succeeded:**
.. code:: json

    {
        "event": <event>,
        "result": <result>,
        "data": {
            "result": 666
        },
        "state": "SUCCESS"
    }

@@ -221,15 +256,68 @@ When **failed:**
.. code:: json

    {
        "event": <event>,
        "result": <misc.Failure>,
        "data": {
            "result": {
                "exc_type_names": [
                    "RuntimeError",
                    "StandardError",
                    "Exception"
                ],
                "exception_str": "Woot!",
                "traceback_str": " File \"/homes/harlowja/dev/os/taskflow/taskflow/engines/action_engine/executor.py\", line 56, in _execute_task\n result = task.execute(**arguments)\n File \"/homes/harlowja/dev/os/taskflow/taskflow/tests/utils.py\", line 165, in execute\n raise RuntimeError('Woot!')\n",
                "version": 1
            }
        },
        "state": "FAILURE"
    }

Request state transitions
-------------------------

.. image:: img/wbe_request_states.svg
   :width: 520px
   :align: center
   :alt: WBE request state transitions

**WAITING** - Request placed on queue (or other `kombu`_ message bus/transport)
but not *yet* consumed.

**PENDING** - Worker accepted request and is pending to run using its
executor (threads, processes, or other).

**FAILURE** - Worker failed after running request (due to a task exception) or
no worker moved/started executing (by placing the request into ``RUNNING``
state) within the specified time span (this defaults to 60 seconds unless
overridden).

**RUNNING** - A worker's executor (using threads, processes...) has started to
run the requested task (once this state is transitioned to, any request timeout
no longer applies; since at this point it is unknown how long a task
will run since it can not be determined if a task is just taking a long time
or has failed).

**SUCCESS** - Worker finished running task without exception.

.. note::

    During the ``WAITING`` and ``PENDING`` stages the engine keeps track
    of how long the request has been *alive* for and if a timeout is reached
    the request will automatically transition to ``FAILURE`` and any further
    transitions from a worker will be disallowed (for example, if a worker
    accepts the request in the future and sets the task to ``PENDING`` this
    transition will be logged and ignored). This timeout can be adjusted and/or
    removed by setting the engine ``transition_timeout`` option to a
    higher/lower value or by setting it to ``None`` (to remove the timeout
    completely). In the future this will be improved to be more dynamic
    by implementing the blueprints associated with `failover`_ and
    `info/resilence`_.

.. _failover: https://blueprints.launchpad.net/taskflow/+spec/wbe-worker-failover
.. _info/resilence: https://blueprints.launchpad.net/taskflow/+spec/wbe-worker-info
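
One way this timeout might be adjusted (assuming engine options are passed
through ``engines.load`` keyword arguments, as in the usage examples below,
with ``flow`` standing in for a previously built flow):

.. code:: python

    from taskflow import engines

    eng = engines.load(flow, engine='worker-based',
                       exchange='test-exchange',
                       topics=['topic1'],
                       transition_timeout=120)  # seconds (None disables)
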
Usage
=====


Workers
-------

@@ -273,32 +361,26 @@ For complete parameters and object usage please see

.. code:: python

    engine_conf = {
        'engine': 'worker-based',
        'url': 'amqp://guest:guest@localhost:5672//',
        'exchange': 'test-exchange',
        'topics': ['topic1', 'topic2'],
    }
    flow = lf.Flow('simple-linear').add(...)
    eng = taskflow.engines.load(flow, engine_conf=engine_conf)
    eng = taskflow.engines.load(flow, engine='worker-based',
                                url='amqp://guest:guest@localhost:5672//',
                                exchange='test-exchange',
                                topics=['topic1', 'topic2'])
    eng.run()

**Example with filesystem transport:**

.. code:: python

    engine_conf = {
        'engine': 'worker-based',
        'exchange': 'test-exchange',
        'topics': ['topic1', 'topic2'],
        'transport': 'filesystem',
        'transport_options': {
            'data_folder_in': '/tmp/test',
            'data_folder_out': '/tmp/test',
        },
    }
    flow = lf.Flow('simple-linear').add(...)
    eng = taskflow.engines.load(flow, engine_conf=engine_conf)
    eng = taskflow.engines.load(flow, engine='worker-based',
                                exchange='test-exchange',
                                topics=['topic1', 'topic2'],
                                transport='filesystem',
                                transport_options={
                                    'data_folder_in': '/tmp/in',
                                    'data_folder_out': '/tmp/out',
                                })
    eng.run()
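
On the worker side, the matching process might be started along these lines (a
sketch; the module path in ``tasks`` is a placeholder and the authoritative
parameter list lives in the worker interface documentation below):

.. code:: python

    from taskflow.engines.worker_based import worker as w

    worker = w.Worker(url='amqp://guest:guest@localhost:5672//',
                      exchange='test-exchange',
                      topic='topic1',
                      tasks=['my_tasks:MyTask'])
    worker.run()
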
Additional supported keyword arguments:
@@ -333,7 +415,8 @@ Limitations
Interfaces
==========

.. automodule:: taskflow.engines.worker_based.worker
.. automodule:: taskflow.engines.worker_based.engine
.. automodule:: taskflow.engines.worker_based.proxy
.. automodule:: taskflow.engines.worker_based.executor
.. automodule:: taskflow.engines.worker_based.worker

.. _kombu: http://kombu.readthedocs.org/
@@ -1,16 +1,7 @@
[DEFAULT]

# The list of modules to copy from oslo-incubator.git
module=excutils
module=importutils
module=jsonutils
module=strutils
module=timeutils
module=uuidutils
module=network_utils

script=tools/run_cross_tests.sh

# The base module to hold the copy of openstack.common
base=taskflow


@@ -1,31 +0,0 @@
# This file lists dependencies that are used by different pluggable (optional)
# parts of TaskFlow, like engines or persistence backends. They are not
# strictly required by TaskFlow (aka you can use TaskFlow without them), so
# they don't go into one of the requirements.txt files.

# The order of packages is significant, because pip processes them in the order
# of appearance. Changing the order has an impact on the overall integration
# process, which may cause wedges in the gate later.

# Database (sqlalchemy) persistence:
SQLAlchemy>=0.7.8,<=0.9.99
alembic>=0.4.1

# Database (sqlalchemy) persistence with MySQL:
MySQL-python

# NOTE(imelnikov): pyMySQL should be here, but for now it's commented out
# because of https://bugs.launchpad.net/openstack-ci/+bug/1280008
# pyMySQL

# Database (sqlalchemy) persistence with PostgreSQL:
psycopg2

# ZooKeeper backends
kazoo>=1.3.1

# Eventlet may be used with parallel engine:
eventlet>=0.13.0

# Needed for the worker-based engine:
kombu>=2.4.8
@@ -2,21 +2,29 @@
# of appearance. Changing the order has an impact on the overall integration
# process, which may cause wedges in the gate later.

# See: https://bugs.launchpad.net/pbr/+bug/1384919 for why this is here...
pbr>=0.6,!=0.7,<1.0

# Packages needed for using this library.
anyjson>=0.3.3
iso8601>=0.1.9

# Only needed on python 2.6
ordereddict

# Python 2->3 compatibility library.
six>=1.7.0

# Very nice graph library
networkx>=1.8
Babel>=1.3

# Used for backend storage engine loading.
stevedore>=0.14
stevedore>=1.1.0 # Apache-2.0

# Backport for concurrent.futures which exists in 3.2+
futures>=2.1.6

# Used for structured input validation
jsonschema>=2.0.0,<3.0.0
# For pretty printing state-machine tables
PrettyTable>=0.7,<0.8

# For common utilities
oslo.utils>=1.2.0 # Apache-2.0
oslo.serialization>=1.2.0 # Apache-2.0

@@ -2,17 +2,23 @@
# of appearance. Changing the order has an impact on the overall integration
# process, which may cause wedges in the gate later.

# See: https://bugs.launchpad.net/pbr/+bug/1384919 for why this is here...
pbr>=0.6,!=0.7,<1.0

# Packages needed for using this library.
anyjson>=0.3.3
iso8601>=0.1.9

# Python 2->3 compatibility library.
six>=1.7.0

# Very nice graph library
networkx>=1.8
Babel>=1.3

# Used for backend storage engine loading.
stevedore>=0.14
stevedore>=1.1.0 # Apache-2.0

# Used for structured input validation
jsonschema>=2.0.0,<3.0.0
# For pretty printing state-machine tables
PrettyTable>=0.7,<0.8

# For common utilities
oslo.utils>=1.2.0 # Apache-2.0
oslo.serialization>=1.2.0 # Apache-2.0
15
setup.cfg
@@ -6,11 +6,8 @@ description-file =
author = Taskflow Developers
author-email = taskflow-dev@lists.launchpad.net
home-page = https://launchpad.net/taskflow
keywords = reliable recoverable execution
           tasks flows workflows jobs
           persistence states
           asynchronous parallel threads
           dataflow openstack
keywords = reliable,tasks,execution,parallel,dataflow,workflows,distributed
requires-python = >=2.6
classifier =
    Development Status :: 4 - Beta
    Environment :: OpenStack
@@ -24,6 +21,7 @@ classifier =
    Programming Language :: Python :: 2.7
    Programming Language :: Python :: 3
    Programming Language :: Python :: 3.3
    Programming Language :: Python :: 3.4
    Topic :: Software Development :: Libraries
    Topic :: System :: Distributed Computing

@@ -49,10 +47,11 @@ taskflow.persistence =
    zookeeper = taskflow.persistence.backends.impl_zookeeper:ZkBackend

taskflow.engines =
    default = taskflow.engines.action_engine.engine:SingleThreadedActionEngine
    serial = taskflow.engines.action_engine.engine:SingleThreadedActionEngine
    parallel = taskflow.engines.action_engine.engine:MultiThreadedActionEngine
    default = taskflow.engines.action_engine.engine:SerialActionEngine
    serial = taskflow.engines.action_engine.engine:SerialActionEngine
    parallel = taskflow.engines.action_engine.engine:ParallelActionEngine
    worker-based = taskflow.engines.worker_based.engine:WorkerBasedActionEngine
    workers = taskflow.engines.worker_based.engine:WorkerBasedActionEngine

[nosetests]
cover-erase = true
@@ -15,15 +15,11 @@
# License for the specific language governing permissions and limitations
# under the License.

import logging

from oslo_utils import reflection
import six

from taskflow import exceptions
from taskflow.utils import misc
from taskflow.utils import reflection

LOG = logging.getLogger(__name__)


def _save_as_to_mapping(save_as):
@@ -73,7 +69,8 @@ def _build_rebind_dict(args, rebind_args):
    elif isinstance(rebind_args, dict):
        return rebind_args
    else:
        raise TypeError('Invalid rebind value: %s' % rebind_args)
        raise TypeError("Invalid rebind value '%s' (%s)"
                        % (rebind_args, type(rebind_args)))


def _build_arg_mapping(atom_name, reqs, rebind_args, function, do_infer,
@@ -125,7 +122,7 @@ class Atom(object):
                   with this atom. It can be useful in resuming older versions
                   of atoms. Standard major, minor versioning concepts
                   should apply.
    :ivar save_as: An *immutable* output ``resource`` name dict this atom
    :ivar save_as: An *immutable* output ``resource`` name dictionary this atom
                   produces that other atoms may depend on this atom providing.
                   The format is output index (or key when a dictionary
                   is returned from the execute method) to stored argument
@@ -136,11 +133,19 @@ class Atom(object):
                  the names that this atom expects (in a way this is like
                  remapping a namespace of another atom into the namespace
                  of this atom).
    :ivar inject: An *immutable* input_name => value dictionary which specifies
                  any initial inputs that should be automatically injected into
                  the atoms scope before the atom execution commences (this
                  allows for providing atom *local* values that do not need to
                  be provided by other atoms).
    :param name: Meaningful name for this atom, should be something that is
                 distinguishable and understandable for notification,
                 debugging, storing and any other similar purposes.
    :param provides: A set, string or list of items that
                     this will be providing (or could provide) to others, used
                     to correlate and associate the thing/s this atom
                     produces, if it produces anything at all.
    :param inject: An *immutable* input_name => value dictionary which
                   specifies any initial inputs that should be automatically
                   injected into the atoms scope before the atom execution
                   commences (this allows for providing atom *local* values that
                   do not need to be provided by other atoms/dependents).
    :ivar inject: See parameter ``inject``.
    """
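
    # Illustrative sketch (hypothetical ``Adder`` task, not part of this
    # module): the ``inject`` parameter documented above supplies atom
    # *local* inputs, e.g.
    #
    #     class Adder(task.Task):
    #         def execute(self, x, y):
    #             return x + y
    #
    #     Adder(inject={'y': 2})  # 'y' now comes from the atom itself
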
    def __init__(self, name=None, provides=None, inject=None):

@@ -17,7 +17,7 @@ import threading

import six

import taskflow.engines
from taskflow import engines
from taskflow import exceptions as excp
from taskflow.utils import lock_utils

@@ -34,10 +34,15 @@ class Conductor(object):
    period of time will finish up the prior failed conductor's work.
    """

    def __init__(self, name, jobboard, engine_conf, persistence):
    def __init__(self, name, jobboard, persistence,
                 engine=None, engine_options=None):
        self._name = name
        self._jobboard = jobboard
        self._engine_conf = engine_conf
        self._engine = engine
        if not engine_options:
            self._engine_options = {}
        else:
            self._engine_options = engine_options.copy()
        self._persistence = persistence
        self._lock = threading.RLock()

@@ -83,10 +88,10 @@ class Conductor(object):
            store = dict(job.details["store"])
        else:
            store = {}
        return taskflow.engines.load_from_detail(flow_detail,
                                                 store=store,
                                                 engine_conf=self._engine_conf,
                                                 backend=self._persistence)
        return engines.load_from_detail(flow_detail, store=store,
                                        engine=self._engine,
                                        backend=self._persistence,
                                        **self._engine_options)

    @lock_utils.locked
    def connect(self):
@@ -108,9 +113,10 @@ class Conductor(object):
        """Dispatches a claimed job for work completion.

        Accepts a single (already claimed) job and causes it to be run in
        an engine. Returns a boolean that signifies whether the job should
        be consumed. The job is consumed upon completion (unless False is
        returned which will signify the job should be abandoned instead).
        an engine. Returns a future object that represents the work to be
        completed sometime in the future. The future should return a single
        boolean from its result() method. This boolean determines whether the
        job will be consumed (true) or whether it should be abandoned (false).

        :param job: A job instance that has already been claimed by the
                    jobboard.
@@ -12,16 +12,16 @@
# License for the specific language governing permissions and limitations
# under the License.

import logging
import threading

import six

from taskflow.conductors import base
from taskflow import exceptions as excp
from taskflow.listeners import logging as logging_listener
from taskflow import logging
from taskflow.types import timing as tt
from taskflow.utils import async_utils
from taskflow.utils import lock_utils
from taskflow.utils import threading_utils

LOG = logging.getLogger(__name__)
WAIT_TIMEOUT = 0.5
@@ -50,11 +50,11 @@ class SingleThreadedConductor(base.Conductor):
    upon the jobboard capabilities to automatically abandon these jobs.
    """

    def __init__(self, name, jobboard, engine_conf, persistence,
                 wait_timeout=None):
        super(SingleThreadedConductor, self).__init__(name, jobboard,
                                                      engine_conf,
                                                      persistence)
    def __init__(self, name, jobboard, persistence,
                 engine=None, engine_options=None, wait_timeout=None):
        super(SingleThreadedConductor, self).__init__(
            name, jobboard, persistence,
            engine=engine, engine_options=engine_options)
        if wait_timeout is None:
            wait_timeout = WAIT_TIMEOUT
        if isinstance(wait_timeout, (int, float) + six.string_types):
@@ -63,7 +63,7 @@ class SingleThreadedConductor(base.Conductor):
            self._wait_timeout = wait_timeout
        else:
            raise ValueError("Invalid timeout literal: %s" % (wait_timeout))
        self._dead = threading.Event()
        self._dead = threading_utils.Event()

    @lock_utils.locked
    def stop(self, timeout=None):
@@ -80,8 +80,7 @@ class SingleThreadedConductor(base.Conductor):
        be honored in the future) and False will be returned indicating this.
        """
        self._wait_timeout.interrupt()
        self._dead.wait(timeout)
        return self._dead.is_set()
        return self._dead.wait(timeout)

    @property
    def dispatching(self):
@@ -116,7 +115,7 @@ class SingleThreadedConductor(base.Conductor):
                     job, exc_info=True)
        else:
            LOG.info("Job completed successfully: %s", job)
        return consume
        return async_utils.make_completed_future(consume)

    def run(self):
        self._dead.clear()
@@ -136,12 +135,13 @@ class SingleThreadedConductor(base.Conductor):
                    continue
                consume = False
                try:
                    consume = self._dispatch_job(job)
                    f = self._dispatch_job(job)
                except Exception:
                    LOG.warn("Job dispatching failed: %s", job,
                             exc_info=True)
                else:
                    dispatched += 1
                    consume = f.result()
                try:
                    if consume:
                        self._jobboard.consume(job, self._name)
42
taskflow/engines/action_engine/actions/base.py
Normal file
@@ -0,0 +1,42 @@
# -*- coding: utf-8 -*-

# Copyright (C) 2015 Yahoo! Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.

import abc

import six

from taskflow import states


#: Sentinel used to represent no-result (none can be a valid result...)
NO_RESULT = object()

#: States that are expected to/may have a result to save...
SAVE_RESULT_STATES = (states.SUCCESS, states.FAILURE)


@six.add_metaclass(abc.ABCMeta)
class Action(object):
    """An action that handles executing, state changes, ... of atoms."""

    def __init__(self, storage, notifier, walker_factory):
        self._storage = storage
        self._notifier = notifier
        self._walker_factory = walker_factory

    @abc.abstractmethod
    def handles(self, atom):
        """Checks if this action handles the provided atom."""
130
taskflow/engines/action_engine/actions/retry.py
Normal file
@@ -0,0 +1,130 @@
# -*- coding: utf-8 -*-

# Copyright (C) 2012-2013 Yahoo! Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.

from taskflow.engines.action_engine.actions import base
from taskflow.engines.action_engine import executor as ex
from taskflow import logging
from taskflow import retry as retry_atom
from taskflow import states
from taskflow.types import failure
from taskflow.types import futures

LOG = logging.getLogger(__name__)


def _execute_retry(retry, arguments):
    try:
        result = retry.execute(**arguments)
    except Exception:
        result = failure.Failure()
    return (ex.EXECUTED, result)


def _revert_retry(retry, arguments):
    try:
        result = retry.revert(**arguments)
    except Exception:
        result = failure.Failure()
    return (ex.REVERTED, result)


class RetryAction(base.Action):
    """An action that handles executing, state changes, ... of retry atoms."""

    def __init__(self, storage, notifier, walker_factory):
        super(RetryAction, self).__init__(storage, notifier, walker_factory)
        self._executor = futures.SynchronousExecutor()

    @staticmethod
    def handles(atom):
        return isinstance(atom, retry_atom.Retry)

    def _get_retry_args(self, retry, addons=None):
        scope_walker = self._walker_factory(retry)
        arguments = self._storage.fetch_mapped_args(retry.rebind,
                                                    atom_name=retry.name,
                                                    scope_walker=scope_walker)
        history = self._storage.get_retry_history(retry.name)
        arguments[retry_atom.EXECUTE_REVERT_HISTORY] = history
        if addons:
            arguments.update(addons)
        return arguments

    def change_state(self, retry, state, result=base.NO_RESULT):
        old_state = self._storage.get_atom_state(retry.name)
        if state in base.SAVE_RESULT_STATES:
            save_result = None
            if result is not base.NO_RESULT:
                save_result = result
            self._storage.save(retry.name, save_result, state)
        elif state == states.REVERTED:
            self._storage.cleanup_retry_history(retry.name, state)
        else:
            if state == old_state:
                # NOTE(imelnikov): nothing really changed, so we should not
                # write anything to storage and run notifications
                return
            self._storage.set_atom_state(retry.name, state)
        retry_uuid = self._storage.get_atom_uuid(retry.name)
        details = {
            'retry_name': retry.name,
            'retry_uuid': retry_uuid,
            'old_state': old_state,
        }
        if result is not base.NO_RESULT:
            details['result'] = result
        self._notifier.notify(state, details)

    def execute(self, retry):

        def _on_done_callback(fut):
            result = fut.result()[-1]
            if isinstance(result, failure.Failure):
                self.change_state(retry, states.FAILURE, result=result)
            else:
                self.change_state(retry, states.SUCCESS, result=result)

        self.change_state(retry, states.RUNNING)
        fut = self._executor.submit(_execute_retry, retry,
                                    self._get_retry_args(retry))
        fut.add_done_callback(_on_done_callback)
        fut.atom = retry
        return fut

    def revert(self, retry):

        def _on_done_callback(fut):
            result = fut.result()[-1]
            if isinstance(result, failure.Failure):
                self.change_state(retry, states.FAILURE)
            else:
                self.change_state(retry, states.REVERTED)

        self.change_state(retry, states.REVERTING)
        arg_addons = {
            retry_atom.REVERT_FLOW_FAILURES: self._storage.get_failures(),
        }
        fut = self._executor.submit(_revert_retry, retry,
                                    self._get_retry_args(retry,
                                                         addons=arg_addons))
        fut.add_done_callback(_on_done_callback)
        fut.atom = retry
        return fut

    def on_failure(self, retry, atom, last_failure):
        self._storage.save_retry_failure(retry.name, atom.name, last_failure)
        arguments = self._get_retry_args(retry)
        return retry.on_failure(**arguments)
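Both module-level helpers above return an ``(event, result)`` pair, which is why the done callbacks in ``execute()`` and ``revert()`` read ``fut.result()[-1]`` to extract just the result. A small sketch of that convention (the ``OneShot`` controller and empty history are made up; importing a private helper is for illustration only)::

    from taskflow.engines.action_engine.actions.retry import _execute_retry
    from taskflow import retry as retry_atom

    class OneShot(retry_atom.Retry):
        """Hypothetical controller used only to show the tuple shape."""

        def execute(self, history, *args, **kwargs):
            return 1  # pretend this is attempt number one

        def on_failure(self, history, *args, **kwargs):
            return retry_atom.REVERT

    event, result = _execute_retry(OneShot(), {'history': []})
    assert event == 'executed' and result == 1
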
150
taskflow/engines/action_engine/actions/task.py
Normal file
@@ -0,0 +1,150 @@
# -*- coding: utf-8 -*-

# Copyright (C) 2012-2013 Yahoo! Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.

import functools

from taskflow.engines.action_engine.actions import base
from taskflow import logging
from taskflow import states
from taskflow import task as task_atom
from taskflow.types import failure

LOG = logging.getLogger(__name__)


class TaskAction(base.Action):
    """An action that handles scheduling, state changes, ... of task atoms."""

    def __init__(self, storage, notifier, walker_factory, task_executor):
        super(TaskAction, self).__init__(storage, notifier, walker_factory)
        self._task_executor = task_executor

    @staticmethod
    def handles(atom):
        return isinstance(atom, task_atom.BaseTask)

    def _is_identity_transition(self, old_state, state, task, progress):
        if state in base.SAVE_RESULT_STATES:
            # saving result is never an identity transition
            return False
        if state != old_state:
            # changing state is not an identity transition by definition
            return False
        # NOTE(imelnikov): last thing to check is that the progress has
        # changed, which means progress is not None and is different from
        # what is stored in the database.
        if progress is None:
            return False
        old_progress = self._storage.get_task_progress(task.name)
        if old_progress != progress:
            return False
        return True

    def change_state(self, task, state,
                     result=base.NO_RESULT, progress=None):
        old_state = self._storage.get_atom_state(task.name)
        if self._is_identity_transition(old_state, state, task, progress):
            # NOTE(imelnikov): ignore identity transitions in order
            # to avoid extra write to storage backend and, what's
            # more important, extra notifications
            return
        if state in base.SAVE_RESULT_STATES:
            save_result = None
            if result is not base.NO_RESULT:
                save_result = result
            self._storage.save(task.name, save_result, state)
        else:
            self._storage.set_atom_state(task.name, state)
        if progress is not None:
            self._storage.set_task_progress(task.name, progress)
        task_uuid = self._storage.get_atom_uuid(task.name)
        details = {
            'task_name': task.name,
            'task_uuid': task_uuid,
            'old_state': old_state,
        }
        if result is not base.NO_RESULT:
            details['result'] = result
        self._notifier.notify(state, details)
        if progress is not None:
            task.update_progress(progress)

    def _on_update_progress(self, task, event_type, details):
        """Should be called when a task updates its progress."""
        try:
            progress = details.pop('progress')
        except KeyError:
            pass
        else:
            try:
                self._storage.set_task_progress(task.name, progress,
                                                details=details)
            except Exception:
                # Update progress callbacks should never fail, so capture and
                # log the emitted exception instead of raising it.
                LOG.exception("Failed setting task progress for %s to %0.3f",
                              task, progress)

    def schedule_execution(self, task):
        self.change_state(task, states.RUNNING, progress=0.0)
        scope_walker = self._walker_factory(task)
        arguments = self._storage.fetch_mapped_args(task.rebind,
                                                    atom_name=task.name,
                                                    scope_walker=scope_walker)
        if task.notifier.can_be_registered(task_atom.EVENT_UPDATE_PROGRESS):
            progress_callback = functools.partial(self._on_update_progress,
                                                  task)
        else:
            progress_callback = None
        task_uuid = self._storage.get_atom_uuid(task.name)
        return self._task_executor.execute_task(
            task, task_uuid, arguments,
            progress_callback=progress_callback)

    def complete_execution(self, task, result):
        if isinstance(result, failure.Failure):
            self.change_state(task, states.FAILURE, result=result)
        else:
            self.change_state(task, states.SUCCESS,
                              result=result, progress=1.0)

    def schedule_reversion(self, task):
        self.change_state(task, states.REVERTING, progress=0.0)
        scope_walker = self._walker_factory(task)
        arguments = self._storage.fetch_mapped_args(task.rebind,
                                                    atom_name=task.name,
                                                    scope_walker=scope_walker)
        task_uuid = self._storage.get_atom_uuid(task.name)
        task_result = self._storage.get(task.name)
        failures = self._storage.get_failures()
        if task.notifier.can_be_registered(task_atom.EVENT_UPDATE_PROGRESS):
            progress_callback = functools.partial(self._on_update_progress,
                                                  task)
        else:
            progress_callback = None
        future = self._task_executor.revert_task(
            task, task_uuid, arguments, task_result, failures,
            progress_callback=progress_callback)
        return future

    def complete_reversion(self, task, result):
        if isinstance(result, failure.Failure):
            self.change_state(task, states.FAILURE)
        else:
            self.change_state(task, states.REVERTED, progress=1.0)

    def wait_for_any(self, fs, timeout):
        return self._task_executor.wait_for_any(fs, timeout)
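The progress plumbing above only attaches ``_on_update_progress`` when the task's notifier can register ``EVENT_UPDATE_PROGRESS``. A sketch of a task that emits such events (the task itself is hypothetical; ``update_progress()`` is the standard taskflow task API)::

    from taskflow import task

    class CopyChunks(task.Task):
        """Hypothetical task reporting progress in five steps."""

        def execute(self):
            total = 5
            for i in range(total):
                # Each call emits EVENT_UPDATE_PROGRESS, which TaskAction
                # persists via storage.set_task_progress() above.
                self.update_progress((i + 1) / float(total))
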
@@ -14,211 +14,397 @@
# License for the specific language governing permissions and limitations
# under the License.

import logging
import collections
import threading

from taskflow import exceptions as exc
from taskflow import flow
from taskflow import logging
from taskflow import retry
from taskflow import task
from taskflow.types import graph as gr
from taskflow.types import tree as tr
from taskflow.utils import lock_utils
from taskflow.utils import misc

LOG = logging.getLogger(__name__)

_RETRY_EDGE_DATA = {
    flow.LINK_RETRY: True,
}
_EDGE_INVARIANTS = (flow.LINK_INVARIANT, flow.LINK_MANUAL, flow.LINK_RETRY)
_EDGE_REASONS = flow.LINK_REASONS


class Compilation(object):
    """The result of a compiler's compile() is this *immutable* object.
    """The result of a compiler's compile() is this *immutable* object."""

    For now it is just an execution graph but in the future it will grow to
    include more methods & properties that help the various runtime units
    execute in a more optimal & featureful manner.
    """
    def __init__(self, execution_graph):
    def __init__(self, execution_graph, hierarchy):
        self._execution_graph = execution_graph
        self._hierarchy = hierarchy

    @property
    def execution_graph(self):
        """The execution ordering of atoms (as a graph structure)."""
        return self._execution_graph

    @property
    def hierarchy(self):
        """The hierarchy of patterns (as a tree structure)."""
        return self._hierarchy


def _add_update_edges(graph, nodes_from, nodes_to, attr_dict=None):
    """Adds/updates edges from nodes to other nodes in the specified graph.

    It will connect the 'nodes_from' to the 'nodes_to' if an edge currently
    does *not* exist (if it does already exist then the edge's attributes
    are just updated instead). When an edge is created the provided edge
    attributes dictionary will be applied to the new edge between these two
    nodes.
    """
    # NOTE(harlowja): give each edge its own attr copy so that if it's
    # later modified the same copy isn't modified elsewhere...
    for u in nodes_from:
        for v in nodes_to:
            if not graph.has_edge(u, v):
                if attr_dict:
                    graph.add_edge(u, v, attr_dict=attr_dict.copy())
                else:
                    graph.add_edge(u, v)
            else:
                # Just update the attr_dict (if any).
                if attr_dict:
                    graph.add_edge(u, v, attr_dict=attr_dict.copy())


class Linker(object):
    """Compiler helper that adds pattern(s) constraints onto a graph."""

    @staticmethod
    def _is_not_empty(graph):
        # Returns true if the given graph is *not* empty...
        return graph.number_of_nodes() > 0

    @staticmethod
    def _find_first_decomposed(node, priors,
                               decomposed_members, decomposed_filter):
        # How this works: traverse backwards and find only the predecessor
        # items that are actually connected to this entity, and avoid any
        # linkage that is not directly connected. This is guaranteed to be
        # valid since we always iter_links() over predecessors before
        # successors in all currently known patterns; a queue is used here
        # since it is possible for a node to have 2+ different predecessors so
        # we must search back through all of them in a reverse BFS order...
        #
        # Returns the first decomposed graph of those nodes (including the
        # passed in node) that passes the provided filter
        # function (returns none if none match).
        frontier = collections.deque([node])
        # NOTE(harlowja): None is in this initial set since the first prior in
        # the priors list has None as its predecessor (which we don't want to
        # look for a decomposed member of).
        visited = set([None])
        while frontier:
            node = frontier.popleft()
            if node in visited:
                continue
            node_graph = decomposed_members[node]
            if decomposed_filter(node_graph):
                return node_graph
            visited.add(node)
            # TODO(harlowja): optimize this more to avoid searching through
            # things already searched...
            for (u, v) in reversed(priors):
                if node == v:
                    # Queue its predecessor to be searched in the future...
                    frontier.append(u)
        else:
            return None

    def apply_constraints(self, graph, flow, decomposed_members):
        # This list is used to track the links that have been previously
        # iterated over, so that when we are trying to find an entry to
        # connect to we iterate backwards through this list, finding
        # connected nodes to the current target (let's call it v) and find
        # the first (u_n, or u_n - 1, u_n - 2...) that was decomposed into
        # a non-empty graph. We also retain all predecessors of v so that we
        # can correctly locate u_n - 1 if u_n turns out to have decomposed into
        # an empty graph (and so on).
        priors = []
        # NOTE(harlowja): u, v are flows/tasks (also graph terminology since
        # we are compiling things down into a flattened graph), the meaning
        # of this link iteration via iter_links() is that u -> v (with the
        # provided dictionary attributes, if any).
        for (u, v, attr_dict) in flow.iter_links():
            if not priors:
                priors.append((None, u))
            v_g = decomposed_members[v]
            if not v_g.number_of_nodes():
                priors.append((u, v))
                continue
            invariant = any(attr_dict.get(k) for k in _EDGE_INVARIANTS)
            if not invariant:
                # This is a symbol *only* dependency, connect
                # corresponding providers and consumers to allow the consumer
                # to be executed immediately after the provider finishes (this
                # is an optimization for these types of dependencies...)
                u_g = decomposed_members[u]
                if not u_g.number_of_nodes():
                    # This must always exist, but in case it somehow doesn't...
                    raise exc.CompilationFailure(
                        "Non-invariant link being created from '%s' ->"
                        " '%s' even though the target '%s' was found to be"
                        " decomposed into an empty graph" % (v, u, u))
                for u in u_g.nodes_iter():
                    for v in v_g.nodes_iter():
                        depends_on = u.provides & v.requires
                        if depends_on:
                            _add_update_edges(graph,
                                              [u], [v],
                                              attr_dict={
                                                  _EDGE_REASONS: depends_on,
                                              })
            else:
                # Connect nodes with no predecessors in v to nodes with no
                # successors in the *first* non-empty predecessor of v (thus
                # maintaining the edge dependency).
                match = self._find_first_decomposed(u, priors,
                                                    decomposed_members,
                                                    self._is_not_empty)
                if match is not None:
                    _add_update_edges(graph,
                                      match.no_successors_iter(),
                                      list(v_g.no_predecessors_iter()),
                                      attr_dict=attr_dict)
            priors.append((u, v))


class PatternCompiler(object):
    """Compiles patterns & atoms into a compilation unit.
    """Compiles a pattern (or task) into a compilation unit.

    NOTE(harlowja): during this pattern translation process any nested flows
    will be converted into their equivalent subgraphs. This currently implies
    that contained atoms in those nested flows, post-translation will no longer
    be associated with their previously containing flow but instead will lose
    this identity and what will remain is the logical constraints that their
    contained flow mandated. In the future this may be changed so that this
    association is not lost via the compilation process (since it can be
    useful to retain this relationship).
    Let's dive into the basic idea for how this works:

    The compiler here is provided a 'root' object via its __init__ method,
    this object could be a task, or a flow (one of the supported patterns),
    the end-goal is to produce a :py:class:`.Compilation` object as the result
    with the needed components. If this is not possible a
    :py:class:`~.taskflow.exceptions.CompilationFailure` will be raised (or
    in the case where an unknown type is being requested to compile
    a ``TypeError`` will be raised).

    The complexity of this comes into play when the 'root' is a flow that
    itself contains other nested flows (and so-on); to compile this object and
    its contained objects into a graph that *preserves* the constraints the
    pattern mandates we have to go through a recursive algorithm that creates
    subgraphs for each nesting level, and then on the way back up through
    the recursion (now with a decomposed mapping from contained patterns or
    atoms to their corresponding subgraph) we have to then connect the
    subgraphs (and the atom(s) therein) that were decomposed for a pattern
    correctly into a new graph (using a :py:class:`.Linker` object to ensure
    the pattern mandated constraints are retained) and then return to the
    caller (by which point one graph has been created with all contained
    atoms in the pattern/nested patterns mandated ordering).

    Also maintained in the :py:class:`.Compilation` object is a hierarchy of
    the nesting of items (which is also built up during the above mentioned
    recursion, via a much simpler algorithm); this is typically used later to
    determine the prior atoms of a given atom when looking up values that can
    be provided to that atom for execution (see the scopes.py file for how this
    works). Note that although you *could* think that the graph itself could be
    used for this, which in some ways it can (for limited usage) the hierarchy
    retains the nested structure (which is useful for scoping analysis/lookup)
    to be able to provide back an iterator that gives back the scopes visible
    at each level (the graph does not have this information once flattened).

    Let's take an example:

    Given the pattern ``f(a(b, c), d)`` where ``f`` is a
    :py:class:`~taskflow.patterns.linear_flow.Flow` with items ``a(b, c)``
    where ``a`` is a :py:class:`~taskflow.patterns.linear_flow.Flow` composed
    of tasks ``(b, c)`` and task ``d``.

    The algorithm that will be performed (mirroring the above described logic)
    will go through the following steps (the tree hierarchy building is left
    out as that is more obvious)::

        Compiling f
        - Decomposing flow f with no parent (must be the root)
          - Compiling a
            - Decomposing flow a with parent f
            - Compiling b
              - Decomposing task b with parent a
              - Decomposed b into:
                Name: b
                Nodes: 1
                  - b
                Edges: 0
            - Compiling c
              - Decomposing task c with parent a
              - Decomposed c into:
                Name: c
                Nodes: 1
                  - c
                Edges: 0
            - Relinking decomposed b -> decomposed c
            - Decomposed a into:
              Name: a
              Nodes: 2
                - b
                - c
              Edges: 1
                b -> c ({'invariant': True})
          - Compiling d
            - Decomposing task d with parent f
            - Decomposed d into:
              Name: d
              Nodes: 1
                - d
              Edges: 0
          - Relinking decomposed a -> decomposed d
        - Decomposed f into:
          Name: f
          Nodes: 3
            - c
            - b
            - d
          Edges: 2
            c -> d ({'invariant': True})
            b -> c ({'invariant': True})
    """
    def compile(self, root):
        graph = _Flattener(root).flatten()
        if graph.number_of_nodes() == 0:
            # Try to get a name attribute, otherwise just use the object
            # string representation directly if that attribute does not exist.
            name = getattr(root, 'name', root)
            raise exc.Empty("Root container '%s' (%s) is empty."
                            % (name, type(root)))
        return Compilation(graph)


_RETRY_EDGE_DATA = {
    'retry': True,
}


class _Flattener(object):
    """Flattens a root item (task/flow) into an execution graph."""

    def __init__(self, root, freeze=True):
        self._root = root
        self._graph = None
        self._history = set()
        self._freeze = bool(freeze)
        self._linker = Linker()
        self._freeze = freeze
        self._lock = threading.Lock()
        self._compilation = None

    def _add_new_edges(self, graph, nodes_from, nodes_to, edge_attrs):
        """Adds new edges from nodes to other nodes in the specified graph.

        It will connect the nodes_from to the nodes_to if an edge currently
        does *not* exist. When an edge is created the provided edge attributes
        will be applied to the new edge between these two nodes.
        """
        nodes_to = list(nodes_to)
        for u in nodes_from:
            for v in nodes_to:
                if not graph.has_edge(u, v):
                    # NOTE(harlowja): give each edge its own attr copy so that
                    # if it's later modified the same copy isn't modified.
                    graph.add_edge(u, v, attr_dict=edge_attrs.copy())

    def _flatten(self, item):
        functor = self._find_flattener(item)
        if not functor:
            raise TypeError("Unknown type requested to flatten: %s (%s)"
                            % (item, type(item)))
    def _flatten(self, item, parent):
        """Flattens an item (pattern, task) into a graph + tree node."""
        functor = self._find_flattener(item, parent)
        self._pre_item_flatten(item)
        graph = functor(item)
        self._post_item_flatten(item, graph)
        return graph
        graph, node = functor(item, parent)
        self._post_item_flatten(item, graph, node)
        return graph, node

    def _find_flattener(self, item):
    def _find_flattener(self, item, parent):
        """Locates the flattening function to use to flatten the given item."""
        if isinstance(item, flow.Flow):
            return self._flatten_flow
        elif isinstance(item, task.BaseTask):
            return self._flatten_task
        elif isinstance(item, retry.Retry):
            if len(self._history) == 1:
                raise TypeError("Retry controller: %s (%s) must only be used"
            if parent is None:
                raise TypeError("Retry controller '%s' (%s) must only be used"
                                " as a flow constructor parameter and not as a"
                                " root component" % (item, type(item)))
            else:
                # TODO(harlowja): we should raise this type error earlier
                # instead of later since we should do this same check on add()
                # calls, this makes the error more visible (instead of waiting
                # until compile time).
                raise TypeError("Retry controller: %s (%s) must only be used"
                raise TypeError("Retry controller '%s' (%s) must only be used"
                                " as a flow constructor parameter and not as a"
                                " flow added component" % (item, type(item)))
        else:
            return None
            raise TypeError("Unknown item '%s' (%s) requested to flatten"
                            % (item, type(item)))

    def _connect_retry(self, retry, graph):
        graph.add_node(retry)

        # All graph nodes that have no predecessors should depend on its retry
        nodes_to = [n for n in graph.no_predecessors_iter() if n != retry]
        self._add_new_edges(graph, [retry], nodes_to, _RETRY_EDGE_DATA)
        # All nodes that have no predecessors should depend on this retry.
        nodes_to = [n for n in graph.no_predecessors_iter() if n is not retry]
        if nodes_to:
            _add_update_edges(graph, [retry], nodes_to,
                              attr_dict=_RETRY_EDGE_DATA)

        # Add a link to the retry for each node of the subgraph that has no
        # parent retry.
        # Add association for each node of graph that has no existing retry.
        for n in graph.nodes_iter():
            if n != retry and 'retry' not in graph.node[n]:
                graph.node[n]['retry'] = retry
            if n is not retry and flow.LINK_RETRY not in graph.node[n]:
                graph.node[n][flow.LINK_RETRY] = retry

    def _flatten_task(self, task):
    def _flatten_task(self, task, parent):
        """Flattens an individual task."""
        graph = gr.DiGraph(name=task.name)
        graph.add_node(task)
        return graph
        node = tr.Node(task)
        if parent is not None:
            parent.add(node)
        return graph, node

    def _flatten_flow(self, flow):
        """Flattens a graph flow."""
    def _decompose_flow(self, flow, parent):
        """Decomposes a flow into a graph, tree node + decomposed subgraphs."""
        graph = gr.DiGraph(name=flow.name)

        # Flatten all nodes into a single subgraph per node.
        subgraph_map = {}
        node = tr.Node(flow)
        if parent is not None:
            parent.add(node)
        if flow.retry is not None:
            node.add(tr.Node(flow.retry))
        decomposed_members = {}
        for item in flow:
            subgraph = self._flatten(item)
            subgraph_map[item] = subgraph
            graph = gr.merge_graphs([graph, subgraph])

        # Reconnect all node edges to their corresponding subgraphs.
        for (u, v, attrs) in flow.iter_links():
            u_g = subgraph_map[u]
            v_g = subgraph_map[v]
            if any(attrs.get(k) for k in ('invariant', 'manual', 'retry')):
                # Connect nodes with no predecessors in v to nodes with
                # no successors in u (thus maintaining the edge dependency).
                self._add_new_edges(graph,
                                    u_g.no_successors_iter(),
                                    v_g.no_predecessors_iter(),
                                    edge_attrs=attrs)
            else:
                # This is a dependency-only edge, connect corresponding
                # providers and consumers.
                for provider in u_g:
                    for consumer in v_g:
                        reasons = provider.provides & consumer.requires
                        if reasons:
                            graph.add_edge(provider, consumer, reasons=reasons)
            subgraph, _subnode = self._flatten(item, node)
            decomposed_members[item] = subgraph
            if subgraph.number_of_nodes():
                graph = gr.merge_graphs([graph, subgraph])
        return graph, node, decomposed_members

    def _flatten_flow(self, flow, parent):
        """Flattens a flow."""
        graph, node, decomposed_members = self._decompose_flow(flow, parent)
        self._linker.apply_constraints(graph, flow, decomposed_members)
        if flow.retry is not None:
            self._connect_retry(flow.retry, graph)
        return graph
        return graph, node

    def _pre_item_flatten(self, item):
        """Called before an item is flattened; any pre-flattening actions."""
        if id(item) in self._history:
            raise ValueError("Already flattened item: %s (%s), recursive"
                             " flattening not supported" % (item, id(item)))
        self._history.add(id(item))
        if item in self._history:
            raise ValueError("Already flattened item '%s' (%s), recursive"
                             " flattening is not supported" % (item,
                                                               type(item)))
        self._history.add(item)

    def _post_item_flatten(self, item, graph):
        """Called after an item is flattened; any post-flattening actions."""
    def _post_item_flatten(self, item, graph, node):
        """Called after an item is flattened; doing post-flattening actions."""

    def _pre_flatten(self):
        """Called before the flattening of the item starts."""
        """Called before the flattening of the root starts."""
        self._history.clear()

    def _post_flatten(self, graph):
        """Called after the flattening of the item finishes successfully."""
    def _post_flatten(self, graph, node):
        """Called after the flattening of the root finishes successfully."""
        dup_names = misc.get_duplicate_keys(graph.nodes_iter(),
                                            key=lambda node: node.name)
        if dup_names:
            dup_names = ', '.join(sorted(dup_names))
            raise exc.Duplicate("Atoms with duplicate names "
                                "found: %s" % (dup_names))
            raise exc.Duplicate(
                "Atoms with duplicate names found: %s" % (sorted(dup_names)))
        if graph.number_of_nodes() == 0:
            raise exc.Empty("Root container '%s' (%s) is empty"
                            % (self._root, type(self._root)))
        self._history.clear()
        # NOTE(harlowja): this one can be expensive to calculate (especially
        # the cycle detection), so only do it if we know debugging is enabled
        # the cycle detection), so only do it if we know BLATHER is enabled
        # and not under all cases.
        if LOG.isEnabledFor(logging.DEBUG):
            LOG.debug("Translated '%s' into a graph:", self._root)
        if LOG.isEnabledFor(logging.BLATHER):
            LOG.blather("Translated '%s'", self._root)
            LOG.blather("Graph:")
            for line in graph.pformat().splitlines():
                # Indent it so that it's slightly offset from the above line.
                LOG.debug("  %s", line)
                LOG.blather("  %s", line)
            LOG.blather("Hierarchy:")
            for line in node.pformat().splitlines():
                # Indent it so that it's slightly offset from the above line.
                LOG.blather("  %s", line)

    def flatten(self):
        """Flattens an item (a task or flow) into a single execution graph."""
        if self._graph is not None:
            return self._graph
        self._pre_flatten()
        graph = self._flatten(self._root)
        self._post_flatten(graph)
        self._graph = graph
        if self._freeze:
            self._graph.freeze()
        return self._graph
    @lock_utils.locked
    def compile(self):
        """Compiles the contained item into a compiled equivalent."""
        if self._compilation is None:
            self._pre_flatten()
            graph, node = self._flatten(self._root, None)
            self._post_flatten(graph, node)
            if self._freeze:
                graph.freeze()
                node.freeze()
            self._compilation = Compilation(graph, node)
        return self._compilation

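Tying the above together, a sketch of compiling the documented ``f(a(b, c), d)`` example through the new interface, where the root is passed to ``__init__`` and ``compile()`` takes no arguments (the ``_Noop`` task is made up)::

    from taskflow.engines.action_engine import compiler
    from taskflow.patterns import linear_flow
    from taskflow import task

    class _Noop(task.Task):
        def execute(self):
            pass

    f = linear_flow.Flow('f').add(
        linear_flow.Flow('a').add(_Noop(name='b'), _Noop(name='c')),
        _Noop(name='d'))
    compilation = compiler.PatternCompiler(f).compile()
    # Three atoms (b, c, d) ordered by the linear-flow invariant edges...
    print(compilation.execution_graph.number_of_nodes())
    # ...plus the retained flow/atom nesting as a tree.
    print(compilation.hierarchy.pformat())
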
114
taskflow/engines/action_engine/completer.py
Normal file
@@ -0,0 +1,114 @@
# -*- coding: utf-8 -*-

# Copyright (C) 2014 Yahoo! Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.

from taskflow.engines.action_engine import executor as ex
from taskflow import retry as retry_atom
from taskflow import states as st
from taskflow import task as task_atom
from taskflow.types import failure


class Completer(object):
    """Completes atoms using actions to complete them."""

    def __init__(self, runtime):
        self._runtime = runtime
        self._analyzer = runtime.analyzer
        self._retry_action = runtime.retry_action
        self._storage = runtime.storage
        self._task_action = runtime.task_action

    def _complete_task(self, task, event, result):
        """Completes the given task, processes task failure."""
        if event == ex.EXECUTED:
            self._task_action.complete_execution(task, result)
        else:
            self._task_action.complete_reversion(task, result)

    def resume(self):
        """Resumes nodes in the contained graph.

        This is done to allow any previously completed or failed nodes to
        be analyzed, their results processed and any potential nodes affected
        to be adjusted as needed.

        This should return a set of nodes which should be the initial set of
        nodes that were previously not finished (due to a RUNNING or REVERTING
        attempt not previously finishing).
        """
        for node in self._analyzer.iterate_all_nodes():
            if self._analyzer.get_state(node) == st.FAILURE:
                self._process_atom_failure(node, self._storage.get(node.name))
        for retry in self._analyzer.iterate_retries(st.RETRYING):
            self._runtime.retry_subflow(retry)
        unfinished_nodes = set()
        for node in self._analyzer.iterate_all_nodes():
            if self._analyzer.get_state(node) in (st.RUNNING, st.REVERTING):
                unfinished_nodes.add(node)
        return unfinished_nodes

    def complete(self, node, event, result):
        """Performs post-execution completion of a node.

        Returns whether the result should be saved into an accumulator of
        failures or whether this should not be done.
        """
        if isinstance(node, task_atom.BaseTask):
            self._complete_task(node, event, result)
        if isinstance(result, failure.Failure):
            if event == ex.EXECUTED:
                self._process_atom_failure(node, result)
            else:
                return True
        return False

    def _process_atom_failure(self, atom, failure):
        """Processes atom failure & applies resolution strategies.

        On atom failure this will find the atom's associated retry controller
        and ask that controller for the strategy to perform to resolve the
        failure. After getting a resolution strategy decision this method will
        then adjust the other needed atoms' intentions, and states, ... so that
        the failure can be worked around.
        """
        retry = self._analyzer.find_atom_retry(atom)
        if retry is not None:
            # Ask retry controller what to do in case of failure
            action = self._retry_action.on_failure(retry, atom, failure)
            if action == retry_atom.RETRY:
                # Prepare just the surrounding subflow for revert to be later
                # retried...
                self._storage.set_atom_intention(retry.name, st.RETRY)
                self._runtime.reset_subgraph(retry, state=None,
                                             intention=st.REVERT)
            elif action == retry_atom.REVERT:
                # Ask parent checkpoint.
                self._process_atom_failure(retry, failure)
            elif action == retry_atom.REVERT_ALL:
                # Prepare the whole flow for revert.
                self._revert_all()
            else:
                raise ValueError("Unknown atom failure resolution"
                                 " action '%s'" % action)
        else:
            # Prepare the whole flow for revert.
            self._revert_all()

    def _revert_all(self):
        """Attempts to set all nodes to the REVERT intention."""
        self._runtime.reset_nodes(self._analyzer.iterate_all_nodes(),
                                  state=None, intention=st.REVERT)
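``_process_atom_failure()`` above relies on the retry-controller contract: ``on_failure()`` returns one of the ``RETRY``, ``REVERT`` or ``REVERT_ALL`` constants. A hypothetical controller showing that contract::

    from taskflow import retry

    class GiveUpEverything(retry.Retry):
        """Hypothetical controller that always reverts the whole flow."""

        def execute(self, history, *args, **kwargs):
            return None

        def on_failure(self, history, *args, **kwargs):
            # Completer._process_atom_failure() above maps this constant
            # to _revert_all().
            return retry.REVERT_ALL
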
@@ -14,21 +14,24 @@
# License for the specific language governing permissions and limitations
# under the License.

import collections
import contextlib
import threading

from concurrent import futures
from oslo_utils import excutils
import six

from taskflow.engines.action_engine import compiler
from taskflow.engines.action_engine import executor
from taskflow.engines.action_engine import runtime
from taskflow.engines import base
from taskflow import exceptions as exc
from taskflow.openstack.common import excutils
from taskflow import retry
from taskflow import states
from taskflow import storage as atom_storage
from taskflow.types import failure
from taskflow.utils import lock_utils
from taskflow.utils import misc
from taskflow.utils import reflection


@contextlib.contextmanager
@@ -41,7 +44,7 @@ def _start_stop(executor):
    executor.stop()


class ActionEngine(base.EngineBase):
class ActionEngine(base.Engine):
    """Generic action-based engine.

    This engine compiles the flow (and any subflows) into a compilation unit
@@ -57,10 +60,9 @@ class ActionEngine(base.EngineBase):
    the tasks and flow being ran can go through.
    """
    _compiler_factory = compiler.PatternCompiler
    _task_executor_factory = executor.SerialTaskExecutor

    def __init__(self, flow, flow_detail, backend, conf):
        super(ActionEngine, self).__init__(flow, flow_detail, backend, conf)
    def __init__(self, flow, flow_detail, backend, options):
        super(ActionEngine, self).__init__(flow, flow_detail, backend, options)
        self._runtime = None
        self._compiled = False
        self._compilation = None
@@ -68,9 +70,6 @@ class ActionEngine(base.EngineBase):
        self._state_lock = threading.RLock()
        self._storage_ensured = False

    def __str__(self):
        return "%s: %s" % (reflection.get_class_name(self), id(self))

    def suspend(self):
        if not self._compiled:
            raise exc.InvalidState("Can not suspend an engine"
@@ -129,7 +128,7 @@ class ActionEngine(base.EngineBase):
        closed = False
        for (last_state, failures) in runner.run_iter(timeout=timeout):
            if failures:
                misc.Failure.reraise_if_any(failures)
                failure.Failure.reraise_if_any(failures)
            if closed:
                continue
            try:
@@ -152,7 +151,7 @@ class ActionEngine(base.EngineBase):
            self._change_state(last_state)
            if last_state not in [states.SUSPENDED, states.SUCCESS]:
                failures = self.storage.get_failures()
                misc.Failure.reraise_if_any(failures.values())
                failure.Failure.reraise_if_any(failures.values())

    def _change_state(self, state):
        with self._state_lock:
@@ -169,19 +168,11 @@ class ActionEngine(base.EngineBase):
            self.notifier.notify(state, details)

    def _ensure_storage(self):
        # NOTE(harlowja): signal to the tasks that exist that we are about to
        # resume, if they have a previous state, they will now transition to
        # a resuming state (and then to suspended).
        self._change_state(states.RESUMING)  # does nothing in PENDING state
        """Ensure all contained atoms exist in the storage unit."""
        for node in self._compilation.execution_graph.nodes_iter():
            version = misc.get_version_string(node)
            if isinstance(node, retry.Retry):
                self.storage.ensure_retry(node.name, version, node.save_as)
            else:
                self.storage.ensure_task(node.name, version, node.save_as)
            self.storage.ensure_atom(node)
            if node.inject:
                self.storage.inject_atom_args(node.name, node.inject)
        self._change_state(states.SUSPENDED)  # does nothing in PENDING state

    @lock_utils.locked
    def prepare(self):
@@ -189,7 +180,12 @@ class ActionEngine(base.EngineBase):
            raise exc.InvalidState("Can not prepare an engine"
                                   " which has not been compiled")
        if not self._storage_ensured:
            # Set our own state to resuming -> (ensure atoms exist
            # in storage) -> suspended in the storage unit and notify any
            # attached listeners of these changes.
            self._change_state(states.RESUMING)
            self._ensure_storage()
            self._change_state(states.SUSPENDED)
            self._storage_ensured = True
        # At this point we can check to ensure all dependencies are either
        # flow/task provided or storage provided, if there are still missing
@@ -204,42 +200,162 @@ class ActionEngine(base.EngineBase):
            self._runtime.reset_all()
            self._change_state(states.PENDING)

    @misc.cachedproperty
    def _task_executor(self):
        return self._task_executor_factory()

    @misc.cachedproperty
    def _compiler(self):
        return self._compiler_factory()
        return self._compiler_factory(self._flow)

    @lock_utils.locked
    def compile(self):
        if self._compiled:
            return
        self._compilation = self._compiler.compile(self._flow)
        self._compilation = self._compiler.compile()
        self._runtime = runtime.Runtime(self._compilation,
                                        self.storage,
                                        self.task_notifier,
                                        self.atom_notifier,
                                        self._task_executor)
        self._compiled = True


class SingleThreadedActionEngine(ActionEngine):
class SerialActionEngine(ActionEngine):
    """Engine that runs tasks in a serial manner."""
    _storage_factory = atom_storage.SingleThreadedStorage

    def __init__(self, flow, flow_detail, backend, options):
        super(SerialActionEngine, self).__init__(flow, flow_detail,
                                                 backend, options)
        self._task_executor = executor.SerialTaskExecutor()


class _ExecutorTypeMatch(collections.namedtuple('_ExecutorTypeMatch',
                                                ['types', 'executor_cls'])):
    def matches(self, executor):
        return isinstance(executor, self.types)


class _ExecutorTextMatch(collections.namedtuple('_ExecutorTextMatch',
                                                ['strings', 'executor_cls'])):
    def matches(self, text):
        return text.lower() in self.strings


class ParallelActionEngine(ActionEngine):
    """Engine that runs tasks in a parallel manner.

    Supported keyword arguments:

    * ``executor``: an object that implements a :pep:`3148` compatible executor
      interface; it will be used for scheduling tasks. The following
      types are applicable (other unknown types passed will cause a type
      error to be raised).

      ========================= ===============================================
      Type provided             Executor used
      ========================= ===============================================
      |cft|.ThreadPoolExecutor  :class:`~.executor.ParallelThreadTaskExecutor`
      |cfp|.ProcessPoolExecutor :class:`~.executor.ParallelProcessTaskExecutor`
      |cf|._base.Executor       :class:`~.executor.ParallelThreadTaskExecutor`
      ========================= ===============================================

    * ``executor``: a string that will be used to select a :pep:`3148`
      compatible executor; it will be used for scheduling tasks. The following
      strings are applicable (other unknown strings passed will cause a value
      error to be raised).

      =========================== ===============================================
      String (case insensitive)   Executor used
      =========================== ===============================================
      ``process``                 :class:`~.executor.ParallelProcessTaskExecutor`
      ``processes``               :class:`~.executor.ParallelProcessTaskExecutor`
      ``thread``                  :class:`~.executor.ParallelThreadTaskExecutor`
      ``threaded``                :class:`~.executor.ParallelThreadTaskExecutor`
      ``threads``                 :class:`~.executor.ParallelThreadTaskExecutor`
      =========================== ===============================================

    .. |cfp| replace:: concurrent.futures.process
    .. |cft| replace:: concurrent.futures.thread
    .. |cf| replace:: concurrent.futures
    """

class MultiThreadedActionEngine(ActionEngine):
    """Engine that runs tasks in a parallel manner."""
    _storage_factory = atom_storage.MultiThreadedStorage

    def _task_executor_factory(self):
        return executor.ParallelTaskExecutor(executor=self._executor,
                                             max_workers=self._max_workers)
    # One of these types should match when an object (non-string) is provided
    # for the 'executor' option.
    #
    # NOTE(harlowja): the reason we use the library/built-in futures is to
    # allow for instances of that to be detected and handled correctly, instead
    # of forcing everyone to use our derivatives...
    _executor_cls_matchers = [
        _ExecutorTypeMatch((futures.ThreadPoolExecutor,),
                           executor.ParallelThreadTaskExecutor),
        _ExecutorTypeMatch((futures.ProcessPoolExecutor,),
                           executor.ParallelProcessTaskExecutor),
        _ExecutorTypeMatch((futures.Executor,),
                           executor.ParallelThreadTaskExecutor),
    ]

    def __init__(self, flow, flow_detail, backend, conf,
                 executor=None, max_workers=None):
        super(MultiThreadedActionEngine, self).__init__(
            flow, flow_detail, backend, conf)
        self._executor = executor
        self._max_workers = max_workers
    # One of these should match when a string/text is provided for the
    # 'executor' option (a mixed case equivalent is allowed since the match
    # will be lower-cased before checking).
    _executor_str_matchers = [
        _ExecutorTextMatch(frozenset(['processes', 'process']),
                           executor.ParallelProcessTaskExecutor),
        _ExecutorTextMatch(frozenset(['thread', 'threads', 'threaded']),
                           executor.ParallelThreadTaskExecutor),
    ]

    # Used when no executor is provided (either a string or object)...
    _default_executor_cls = executor.ParallelThreadTaskExecutor

    def __init__(self, flow, flow_detail, backend, options):
        super(ParallelActionEngine, self).__init__(flow, flow_detail,
                                                   backend, options)
        # This ensures that any provided executor will be validated before
        # we get too far in the compilation/execution pipeline...
        self._task_executor = self._fetch_task_executor(self._options)

    @classmethod
    def _fetch_task_executor(cls, options):
        kwargs = {}
        executor_cls = cls._default_executor_cls
        # Match the desired executor to a class that will work with it...
        desired_executor = options.get('executor')
        if isinstance(desired_executor, six.string_types):
            matched_executor_cls = None
            for m in cls._executor_str_matchers:
                if m.matches(desired_executor):
                    matched_executor_cls = m.executor_cls
                    break
            if matched_executor_cls is None:
                expected = set()
                for m in cls._executor_str_matchers:
                    expected.update(m.strings)
                raise ValueError("Unknown executor string '%s' expected"
                                 " one of %s (or mixed case equivalent)"
                                 % (desired_executor, list(expected)))
            else:
                executor_cls = matched_executor_cls
        elif desired_executor is not None:
            matched_executor_cls = None
            for m in cls._executor_cls_matchers:
                if m.matches(desired_executor):
                    matched_executor_cls = m.executor_cls
                    break
            if matched_executor_cls is None:
                expected = set()
                for m in cls._executor_cls_matchers:
                    expected.update(m.types)
                raise TypeError("Unknown executor '%s' (%s) expected an"
                                " instance of %s" % (desired_executor,
                                                     type(desired_executor),
                                                     list(expected)))
            else:
                executor_cls = matched_executor_cls
                kwargs['executor'] = desired_executor
        for k in getattr(executor_cls, 'OPTIONS', []):
            if k == 'executor':
                continue
            try:
                kwargs[k] = options[k]
            except KeyError:
                pass
        return executor_cls(**kwargs)

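A usage sketch of the option handling above (``taskflow.engines.load`` is the usual loading helper; the ``flow`` variable is assumed to exist). Both spellings end up in ``_fetch_task_executor()``::

    from concurrent import futures

    import taskflow.engines

    # Select by string (matched case-insensitively via _ExecutorTextMatch)...
    engine = taskflow.engines.load(flow, engine='parallel',
                                   executor='threads')

    # ...or hand over a PEP 3148 executor object directly (matched by type
    # via _ExecutorTypeMatch).
    engine = taskflow.engines.load(
        flow, engine='parallel',
        executor=futures.ThreadPoolExecutor(max_workers=2))
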
@@ -15,52 +15,315 @@
# under the License.

import abc
import collections
from multiprocessing import managers
import os
import pickle

from concurrent import futures
from oslo_utils import excutils
from oslo_utils import reflection
from oslo_utils import timeutils
from oslo_utils import uuidutils
import six
from six.moves import queue as compat_queue

from taskflow import logging
from taskflow import task as task_atom
from taskflow.types import failure
from taskflow.types import futures
from taskflow.types import notifier
from taskflow.types import timing
from taskflow.utils import async_utils
from taskflow.utils import misc
from taskflow.utils import threading_utils

# Execution and reversion events.
EXECUTED = 'executed'
REVERTED = 'reverted'

# See http://bugs.python.org/issue1457119 for why this is so complex...
_PICKLE_ERRORS = [pickle.PickleError, TypeError]
try:
    import cPickle as _cPickle
    _PICKLE_ERRORS.append(_cPickle.PickleError)
except ImportError:
    pass
_PICKLE_ERRORS = tuple(_PICKLE_ERRORS)
_SEND_ERRORS = (IOError, EOFError)
_UPDATE_PROGRESS = task_atom.EVENT_UPDATE_PROGRESS

def _execute_task(task, arguments, progress_callback):
    with task.autobind('update_progress', progress_callback):
# Message types/kinds sent from worker/child processes...
_KIND_COMPLETE_ME = 'complete_me'
_KIND_EVENT = 'event'

LOG = logging.getLogger(__name__)


def _execute_task(task, arguments, progress_callback=None):
    with notifier.register_deregister(task.notifier,
                                      _UPDATE_PROGRESS,
                                      callback=progress_callback):
        try:
            task.pre_execute()
            result = task.execute(**arguments)
        except Exception:
            # NOTE(imelnikov): wrap current exception with Failure
            # object and return it.
            result = misc.Failure()
            result = failure.Failure()
        finally:
            task.post_execute()
    return (task, EXECUTED, result)
    return (EXECUTED, result)


def _revert_task(task, arguments, result, failures, progress_callback):
    kwargs = arguments.copy()
    kwargs['result'] = result
    kwargs['flow_failures'] = failures
    with task.autobind('update_progress', progress_callback):
def _revert_task(task, arguments, result, failures, progress_callback=None):
    arguments = arguments.copy()
    arguments[task_atom.REVERT_RESULT] = result
    arguments[task_atom.REVERT_FLOW_FAILURES] = failures
    with notifier.register_deregister(task.notifier,
                                      _UPDATE_PROGRESS,
                                      callback=progress_callback):
        try:
            task.pre_revert()
            result = task.revert(**kwargs)
            result = task.revert(**arguments)
        except Exception:
            # NOTE(imelnikov): wrap current exception with Failure
            # object and return it.
            result = misc.Failure()
            result = failure.Failure()
        finally:
            task.post_revert()
    return (task, REVERTED, result)
    return (REVERTED, result)
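A tiny sketch of the ``register_deregister()`` context manager used by both helpers above; the callback stays attached only for the duration of the block (the event name and callback are made up)::

    from taskflow.types import notifier

    def on_progress(event_type, details):
        print(event_type, details)

    n = notifier.Notifier()
    with notifier.register_deregister(n, 'update_progress',
                                      callback=on_progress):
        n.notify('update_progress', {'progress': 0.5})  # callback fires
    n.notify('update_progress', {'progress': 1.0})  # callback is gone again
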
|
||||
|
||||
class _ViewableSyncManager(managers.SyncManager):
|
||||
"""Manager that exposes its state as methods."""
|
||||
|
||||
def is_shutdown(self):
|
||||
return self._state.value == managers.State.SHUTDOWN
|
||||
|
||||
def is_running(self):
|
||||
return self._state.value == managers.State.STARTED
|
||||
|
||||
|
||||
class _Channel(object):
|
||||
"""Helper wrapper around a multiprocessing queue used by a worker."""
|
||||
|
||||
def __init__(self, queue, identity):
|
||||
self._queue = queue
|
||||
self._identity = identity
|
||||
self._sent_messages = collections.defaultdict(int)
|
||||
self._pid = None
|
||||
|
||||
@property
|
||||
def sent_messages(self):
|
||||
return self._sent_messages
|
||||
|
||||
def put(self, message):
|
||||
# NOTE(harlowja): this is done in late in execution to ensure that this
|
||||
# happens in the child process and not the parent process (where the
|
||||
# constructor is called).
|
||||
if self._pid is None:
|
||||
self._pid = os.getpid()
|
||||
message.update({
|
||||
'sent_on': timeutils.utcnow(),
|
||||
'sender': {
|
||||
'pid': self._pid,
|
||||
'id': self._identity,
|
||||
},
|
||||
})
|
||||
if 'body' not in message:
|
||||
message['body'] = {}
|
||||
try:
|
||||
self._queue.put(message)
|
||||
except _PICKLE_ERRORS:
|
||||
LOG.warn("Failed serializing message %s", message, exc_info=True)
|
||||
return False
|
||||
except _SEND_ERRORS:
|
||||
LOG.warn("Failed sending message %s", message, exc_info=True)
|
||||
return False
|
||||
else:
|
||||
self._sent_messages[message['kind']] += 1
|
||||
return True
|
||||
|
||||
|
||||
class _WaitWorkItem(object):
|
||||
"""The piece of work that will executed by a process executor.
|
||||
|
||||
This will call the target function, then wait until the tasks emitted
|
||||
events/items have been depleted before offically being finished.
|
||||
|
||||
NOTE(harlowja): this is done so that the task function will *not* return
|
||||
until all of its notifications have been proxied back to its originating
|
||||
task. If we didn't do this then the executor would see this task as done
|
||||
and then potentially start tasks that are successors of the task that just
|
||||
finished even though notifications are still left to be sent from the
|
||||
previously finished task...
|
||||
"""
|
||||
|
||||
def __init__(self, channel, barrier,
|
||||
func, task, *args, **kwargs):
|
||||
self._channel = channel
|
||||
self._barrier = barrier
|
||||
self._func = func
|
||||
self._task = task
|
||||
self._args = args
|
||||
self._kwargs = kwargs
|
||||
|
||||
def _on_finish(self):
|
||||
sent_events = self._channel.sent_messages.get(_KIND_EVENT, 0)
|
||||
if sent_events:
|
||||
message = {
|
||||
'created_on': timeutils.utcnow(),
|
||||
'kind': _KIND_COMPLETE_ME,
|
||||
}
|
||||
if self._channel.put(message):
|
||||
watch = timing.StopWatch()
|
||||
watch.start()
|
||||
self._barrier.wait()
|
||||
LOG.blather("Waited %s seconds until task '%s' %s emitted"
|
||||
" notifications were depleted", watch.elapsed(),
|
||||
self._task, sent_events)
|
||||
|
||||
def __call__(self):
|
||||
args = self._args
|
||||
kwargs = self._kwargs
|
||||
try:
|
||||
return self._func(self._task, *args, **kwargs)
|
||||
finally:
|
||||
self._on_finish()
|
||||
|
||||
|
||||
class _EventSender(object):
|
||||
"""Sends event information from a child worker process to its creator."""
|
||||
|
||||
def __init__(self, channel):
|
||||
self._channel = channel
|
||||
|
||||
def __call__(self, event_type, details):
|
||||
message = {
|
||||
'created_on': timeutils.utcnow(),
|
||||
'kind': _KIND_EVENT,
|
||||
'body': {
|
||||
'event_type': event_type,
|
||||
'details': details,
|
||||
},
|
||||
}
|
||||
self._channel.put(message)
|
||||
|
||||
|
||||
class _Target(object):
|
||||
"""An immutable helper object that represents a target of a message."""
|
||||
|
||||
def __init__(self, task, barrier, identity):
|
||||
self.task = task
|
||||
self.barrier = barrier
|
||||
self.identity = identity
|
||||
# Counters used to track how many message 'kinds' were proxied...
|
||||
self.dispatched = collections.defaultdict(int)
|
||||
|
||||
def __repr__(self):
|
||||
return "<%s at 0x%x targeting '%s' with identity '%s'>" % (
|
||||
reflection.get_class_name(self), id(self),
|
||||
self.task, self.identity)
|
||||
|
||||
|
||||
class _Dispatcher(object):
|
||||
"""Dispatches messages received from child worker processes."""
|
||||
|
||||
# When the run() method is busy (typically in a thread) we want to set
|
||||
# these so that the thread can know how long to sleep when there is no
|
||||
# active work to dispatch.
|
||||
_SPIN_PERIODICITY = 0.01
|
||||
|
||||
def __init__(self, dispatch_periodicity=None):
|
||||
if dispatch_periodicity is None:
|
||||
dispatch_periodicity = self._SPIN_PERIODICITY
|
||||
if dispatch_periodicity <= 0:
|
||||
raise ValueError("Provided dispatch periodicity must be greater"
|
||||
" than zero and not '%s'" % dispatch_periodicity)
|
||||
self._targets = {}
|
||||
self._dead = threading_utils.Event()
|
||||
self._dispatch_periodicity = dispatch_periodicity
|
||||
self._stop_when_empty = False
|
||||
|
||||
def register(self, identity, target):
|
||||
self._targets[identity] = target
|
||||
|
||||
def deregister(self, identity):
|
||||
try:
|
||||
target = self._targets.pop(identity)
|
||||
except KeyError:
|
||||
pass
|
||||
else:
|
||||
# Just in case, set the barrier to unblock any worker...
|
||||
target.barrier.set()
|
||||
if LOG.isEnabledFor(logging.BLATHER):
|
||||
LOG.blather("Dispatched %s messages %s to target '%s' during"
|
||||
" the lifetime of its existence in the dispatcher",
|
||||
sum(six.itervalues(target.dispatched)),
|
||||
dict(target.dispatched), target)
|
||||
|
||||
def reset(self):
|
||||
self._stop_when_empty = False
|
||||
self._dead.clear()
|
||||
if self._targets:
|
||||
leftover = set(six.iterkeys(self._targets))
|
||||
while leftover:
|
||||
self.deregister(leftover.pop())
|
||||
|
||||
def interrupt(self):
|
||||
self._stop_when_empty = True
|
||||
self._dead.set()
|
||||
|
||||
def _dispatch(self, message):
|
||||
if LOG.isEnabledFor(logging.BLATHER):
|
||||
LOG.blather("Dispatching message %s (it took %s seconds"
|
||||
" for it to arrive for processing after being"
|
||||
" sent)", message,
|
||||
timeutils.delta_seconds(message['sent_on'],
|
||||
timeutils.utcnow()))
|
||||
try:
|
||||
kind = message['kind']
|
||||
sender = message['sender']
|
||||
body = message['body']
|
||||
except (KeyError, ValueError, TypeError):
|
||||
LOG.warn("Badly formatted message %s received", message,
|
||||
exc_info=True)
|
||||
return
|
||||
target = self._targets.get(sender['id'])
|
||||
if target is None:
|
||||
# Must have been removed...
|
||||
return
|
||||
if kind == _KIND_COMPLETE_ME:
|
||||
target.dispatched[kind] += 1
|
||||
target.barrier.set()
|
||||
elif kind == _KIND_EVENT:
|
||||
task = target.task
|
||||
target.dispatched[kind] += 1
|
||||
task.notifier.notify(body['event_type'], body['details'])
|
||||
else:
|
||||
LOG.warn("Unknown message '%s' found in message from sender"
|
||||
" %s to target '%s'", kind, sender, target)
|
||||
|
||||
def run(self, queue):
|
||||
watch = timing.StopWatch(duration=self._dispatch_periodicity)
|
||||
while (not self._dead.is_set() or
|
||||
(self._stop_when_empty and self._targets)):
|
||||
watch.restart()
|
||||
leftover = watch.leftover()
|
||||
while leftover:
|
||||
try:
|
||||
message = queue.get(timeout=leftover)
|
||||
except compat_queue.Empty:
|
||||
break
|
||||
else:
|
||||
self._dispatch(message)
|
||||
leftover = watch.leftover()
|
||||
leftover = watch.leftover()
|
||||
if leftover:
|
||||
self._dead.wait(leftover)
|
||||
|
||||
|
||||
@six.add_metaclass(abc.ABCMeta)
|
||||
class TaskExecutorBase(object):
|
||||
class TaskExecutor(object):
|
||||
"""Executes and reverts tasks.
|
||||
|
||||
This class takes a task and its arguments and executes or reverts it.
|
||||
@@ -69,7 +332,8 @@ class TaskExecutorBase(object):
|
||||
"""
|
||||
|
||||
@abc.abstractmethod
|
||||
def execute_task(self, task, task_uuid, arguments, progress_callback=None):
|
||||
def execute_task(self, task, task_uuid, arguments,
|
||||
progress_callback=None):
|
||||
"""Schedules task execution."""
|
||||
|
||||
@abc.abstractmethod
|
||||
@@ -77,9 +341,9 @@ class TaskExecutorBase(object):
|
||||
progress_callback=None):
|
||||
"""Schedules task reversion."""
|
||||
|
||||
@abc.abstractmethod
|
||||
def wait_for_any(self, fs, timeout=None):
|
||||
"""Wait for futures returned by this executor to complete."""
|
||||
return async_utils.wait_for_any(fs, timeout=timeout)
|
||||
|
||||
def start(self):
|
||||
"""Prepare to execute tasks."""
|
||||
@@ -90,58 +354,221 @@ class TaskExecutorBase(object):
|
||||
pass
|
||||
|
||||
|
||||
class SerialTaskExecutor(TaskExecutorBase):
|
||||
"""Execute task one after another."""
|
||||
class SerialTaskExecutor(TaskExecutor):
|
||||
"""Executes tasks one after another."""
|
||||
|
||||
def __init__(self):
|
||||
self._executor = futures.SynchronousExecutor()
|
||||
|
||||
def start(self):
|
||||
self._executor.restart()
|
||||
|
||||
def stop(self):
|
||||
self._executor.shutdown()
|
||||
|
||||
def execute_task(self, task, task_uuid, arguments, progress_callback=None):
|
||||
return async_utils.make_completed_future(
|
||||
_execute_task(task, arguments, progress_callback))
|
||||
fut = self._executor.submit(_execute_task,
|
||||
task, arguments,
|
||||
progress_callback=progress_callback)
|
||||
fut.atom = task
|
||||
return fut
|
||||
|
||||
def revert_task(self, task, task_uuid, arguments, result, failures,
|
||||
progress_callback=None):
|
||||
return async_utils.make_completed_future(
|
||||
_revert_task(task, arguments, result,
|
||||
failures, progress_callback))
|
||||
|
||||
def wait_for_any(self, fs, timeout=None):
|
||||
# NOTE(imelnikov): this executor returns only done futures.
|
||||
return (fs, set())
|
||||
fut = self._executor.submit(_revert_task,
|
||||
task, arguments, result, failures,
|
||||
progress_callback=progress_callback)
|
||||
fut.atom = task
|
||||
return fut
|
||||
|
||||
|
||||
class ParallelTaskExecutor(TaskExecutorBase):
|
||||
class ParallelTaskExecutor(TaskExecutor):
|
||||
"""Executes tasks in parallel.
|
||||
|
||||
Submits tasks to an executor which should provide an interface similar
|
||||
to concurrent.futures.Executor.
|
||||
"""
|
||||
|
||||
#: Options this executor supports (passed in from engine options).
|
||||
OPTIONS = frozenset(['max_workers'])
|
||||
|
||||
def __init__(self, executor=None, max_workers=None):
|
||||
self._executor = executor
|
||||
self._max_workers = max_workers
|
||||
self._create_executor = executor is None
|
||||
self._own_executor = executor is None
|
||||
|
||||
@abc.abstractmethod
|
||||
def _create_executor(self, max_workers=None):
|
||||
"""Called when an executor has not been provided to make one."""
|
||||
|
||||
def _submit_task(self, func, task, *args, **kwargs):
|
||||
fut = self._executor.submit(func, task, *args, **kwargs)
|
||||
fut.atom = task
|
||||
return fut
|
||||
|
||||
def execute_task(self, task, task_uuid, arguments, progress_callback=None):
|
||||
return self._executor.submit(
|
||||
_execute_task, task, arguments, progress_callback)
|
||||
return self._submit_task(_execute_task, task, arguments,
|
||||
progress_callback=progress_callback)
|
||||
|
||||
def revert_task(self, task, task_uuid, arguments, result, failures,
|
||||
progress_callback=None):
|
||||
return self._executor.submit(
|
||||
_revert_task, task,
|
||||
arguments, result, failures, progress_callback)
|
||||
|
||||
def wait_for_any(self, fs, timeout=None):
|
||||
return async_utils.wait_for_any(fs, timeout)
|
||||
return self._submit_task(_revert_task, task, arguments, result,
|
||||
failures, progress_callback=progress_callback)
|
||||
|
||||
def start(self):
|
||||
if self._create_executor:
|
||||
if self._own_executor:
|
||||
if self._max_workers is not None:
|
||||
max_workers = self._max_workers
|
||||
else:
|
||||
max_workers = threading_utils.get_optimal_thread_count()
|
||||
self._executor = futures.ThreadPoolExecutor(max_workers)
|
||||
self._executor = self._create_executor(max_workers=max_workers)
|
||||
|
||||
def stop(self):
|
||||
if self._create_executor:
|
||||
if self._own_executor:
|
||||
self._executor.shutdown(wait=True)
|
||||
self._executor = None
|
||||
|
||||
|
||||
class ParallelThreadTaskExecutor(ParallelTaskExecutor):
|
||||
"""Executes tasks in parallel using a thread pool executor."""
|
||||
|
||||
def _create_executor(self, max_workers=None):
|
||||
return futures.ThreadPoolExecutor(max_workers=max_workers)
|
||||
|
||||
|
||||
class ParallelProcessTaskExecutor(ParallelTaskExecutor):
|
||||
"""Executes tasks in parallel using a process pool executor.
|
||||
|
||||
NOTE(harlowja): this executor executes tasks in external processes, so that
|
||||
implies that tasks that are sent to that external process are pickleable
|
||||
since this is how the multiprocessing works (sending pickled objects back
|
||||
and forth) and that the bound handlers (for progress updating in
|
||||
particular) are proxied correctly from that external process to the one
|
||||
that is alive in the parent process to ensure that callbacks registered in
|
||||
the parent are executed on events in the child.
|
||||
"""
|
||||
|
||||
#: Options this executor supports (passed in from engine options).
|
||||
OPTIONS = frozenset(['max_workers', 'dispatch_periodicity'])
|
||||
|
||||
def __init__(self, executor=None, max_workers=None,
|
||||
dispatch_periodicity=None):
|
||||
super(ParallelProcessTaskExecutor, self).__init__(
|
||||
executor=executor, max_workers=max_workers)
|
||||
self._manager = _ViewableSyncManager()
|
||||
self._dispatcher = _Dispatcher(
|
||||
dispatch_periodicity=dispatch_periodicity)
|
||||
# Only created after starting...
|
||||
self._worker = None
|
||||
self._queue = None
|
||||
|
||||
def _create_executor(self, max_workers=None):
|
||||
return futures.ProcessPoolExecutor(max_workers=max_workers)
|
||||
|
||||
def start(self):
|
||||
if threading_utils.is_alive(self._worker):
|
||||
raise RuntimeError("Worker thread must be stopped via stop()"
|
||||
" before starting/restarting")
|
||||
super(ParallelProcessTaskExecutor, self).start()
|
||||
# These don't seem restartable; make a new one...
|
||||
if self._manager.is_shutdown():
|
||||
self._manager = _ViewableSyncManager()
|
||||
if not self._manager.is_running():
|
||||
self._manager.start()
|
||||
self._dispatcher.reset()
|
||||
self._queue = self._manager.Queue()
|
||||
self._worker = threading_utils.daemon_thread(self._dispatcher.run,
|
||||
self._queue)
|
||||
self._worker.start()
|
||||
|
||||
def stop(self):
|
||||
self._dispatcher.interrupt()
|
||||
super(ParallelProcessTaskExecutor, self).stop()
|
||||
if threading_utils.is_alive(self._worker):
|
||||
self._worker.join()
|
||||
self._worker = None
|
||||
self._queue = None
|
||||
self._dispatcher.reset()
|
||||
self._manager.shutdown()
|
||||
self._manager.join()
|
||||
|
||||
def _rebind_task(self, task, clone, channel, progress_callback=None):
|
||||
# Creates and binds proxies for all events the task could receive
# so that when the clone runs in another process this task
# can receive the same notifications (thus making it look like
# the notifications are transparently happening in this process).
|
||||
needed = set()
|
||||
for (event_type, listeners) in task.notifier.listeners_iter():
|
||||
if listeners:
|
||||
needed.add(event_type)
|
||||
if progress_callback is not None:
|
||||
needed.add(_UPDATE_PROGRESS)
|
||||
if needed:
|
||||
sender = _EventSender(channel)
|
||||
for event_type in needed:
|
||||
clone.notifier.register(event_type, sender)
|
||||
|
||||
def _submit_task(self, func, task, *args, **kwargs):
|
||||
"""Submit a function to run the given task (with given args/kwargs).
|
||||
|
||||
NOTE(harlowja): Adjust all events to be proxies instead since we want
|
||||
those callbacks to be activated in this process, not in the child,
|
||||
also since typically callbacks are functors (or callables) we can
|
||||
not pickle those in the first place...
|
||||
|
||||
To make sure people understand how this works, the following is a
|
||||
lengthy description of what is going on here, read at will:
|
||||
|
||||
So to ensure that we are proxying task triggered events that occur
|
||||
in the executed subprocess (which will be created and used by the
|
||||
thing using the multiprocessing based executor) we need to establish
|
||||
a link between that process and this process that ensures that when a
|
||||
event is triggered in that task in that process that a corresponding
|
||||
event is triggered on the original task that was requested to be ran
|
||||
in this process.
|
||||
|
||||
To accomplish this we have to create a copy of the task (without
|
||||
any listeners) and then reattach a new set of listeners that will
|
||||
now instead of calling the desired listeners just place messages
|
||||
for this process (a dispatcher thread that is created in this class)
|
||||
to dispatch to the original task (using a common queue + per task
|
||||
sender identity/target that is used and associated to know which task
|
||||
to proxy back too, since it is possible that there many be *many*
|
||||
subprocess running at the same time, each running a different task
|
||||
and using the same common queue to submit messages back to).
|
||||
|
||||
Once the subprocess task has finished execution, the executor will
|
||||
then trigger a callback that will remove the task + target from the
|
||||
dispatcher (which will stop any further proxying back to the original
|
||||
task).
|
||||
"""
|
||||
progress_callback = kwargs.pop('progress_callback', None)
|
||||
clone = task.copy(retain_listeners=False)
|
||||
identity = uuidutils.generate_uuid()
|
||||
target = _Target(task, self._manager.Event(), identity)
|
||||
channel = _Channel(self._queue, identity)
|
||||
self._rebind_task(task, clone, channel,
|
||||
progress_callback=progress_callback)
|
||||
|
||||
def register():
|
||||
if progress_callback is not None:
|
||||
task.notifier.register(_UPDATE_PROGRESS, progress_callback)
|
||||
self._dispatcher.register(identity, target)
|
||||
|
||||
def deregister():
|
||||
if progress_callback is not None:
|
||||
task.notifier.deregister(_UPDATE_PROGRESS, progress_callback)
|
||||
self._dispatcher.deregister(identity)
|
||||
|
||||
register()
|
||||
work = _WaitWorkItem(channel, target.barrier,
|
||||
func, clone, *args, **kwargs)
|
||||
try:
|
||||
fut = self._executor.submit(work)
|
||||
except RuntimeError:
|
||||
with excutils.save_and_reraise_exception():
|
||||
deregister()
|
||||
|
||||
fut.atom = task
|
||||
fut.add_done_callback(lambda fut: deregister())
|
||||
return fut
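A stripped-down, runnable sketch of the queue-based proxying described above (the names are illustrative, not the taskflow implementation): the child process pushes an event onto a shared manager queue and the parent drains the queue and dispatches each message to the callback registered for that sender:

import multiprocessing

from concurrent import futures

def child_side(events, identity):
    # Stand-in for the cloned task emitting a notification via its proxy sender.
    events.put({'sender': {'id': identity},
                'kind': 'event',
                'body': {'event_type': 'update_progress',
                         'details': {'progress': 1.0}}})
    return 'done'

def on_event(body):
    print('proxied back to parent: %s' % (body,))

if __name__ == '__main__':
    manager = multiprocessing.Manager()
    events = manager.Queue()
    callbacks = {'task-1': on_event}
    with futures.ProcessPoolExecutor(max_workers=1) as pool:
        fut = pool.submit(child_side, events, 'task-1')
        print(fut.result())
    while not events.empty():
        message = events.get()
        callbacks[message['sender']['id']](message['body'])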
|
||||
|
||||
@@ -1,86 +0,0 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright (C) 2012-2013 Yahoo! Inc. All Rights Reserved.
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License"); you may
|
||||
# not use this file except in compliance with the License. You may obtain
|
||||
# a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
|
||||
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
|
||||
# License for the specific language governing permissions and limitations
|
||||
# under the License.
|
||||
|
||||
import logging
|
||||
|
||||
from taskflow.engines.action_engine import executor as ex
|
||||
from taskflow import states
|
||||
from taskflow.utils import async_utils
|
||||
from taskflow.utils import misc
|
||||
|
||||
LOG = logging.getLogger(__name__)
|
||||
|
||||
SAVE_RESULT_STATES = (states.SUCCESS, states.FAILURE)
|
||||
|
||||
|
||||
class RetryAction(object):
|
||||
def __init__(self, storage, notifier):
|
||||
self._storage = storage
|
||||
self._notifier = notifier
|
||||
|
||||
def _get_retry_args(self, retry):
|
||||
kwargs = self._storage.fetch_mapped_args(retry.rebind,
|
||||
atom_name=retry.name)
|
||||
kwargs['history'] = self._storage.get_retry_history(retry.name)
|
||||
return kwargs
|
||||
|
||||
def change_state(self, retry, state, result=None):
|
||||
if state in SAVE_RESULT_STATES:
|
||||
self._storage.save(retry.name, result, state)
|
||||
elif state == states.REVERTED:
|
||||
self._storage.cleanup_retry_history(retry.name, state)
|
||||
else:
|
||||
old_state = self._storage.get_atom_state(retry.name)
|
||||
if state == old_state:
|
||||
# NOTE(imelnikov): nothing really changed, so we should not
|
||||
# write anything to storage and run notifications
|
||||
return
|
||||
self._storage.set_atom_state(retry.name, state)
|
||||
retry_uuid = self._storage.get_atom_uuid(retry.name)
|
||||
details = dict(retry_name=retry.name,
|
||||
retry_uuid=retry_uuid,
|
||||
result=result)
|
||||
self._notifier.notify(state, details)
|
||||
|
||||
def execute(self, retry):
|
||||
self.change_state(retry, states.RUNNING)
|
||||
kwargs = self._get_retry_args(retry)
|
||||
try:
|
||||
result = retry.execute(**kwargs)
|
||||
except Exception:
|
||||
result = misc.Failure()
|
||||
self.change_state(retry, states.FAILURE, result=result)
|
||||
else:
|
||||
self.change_state(retry, states.SUCCESS, result=result)
|
||||
return async_utils.make_completed_future((retry, ex.EXECUTED, result))
|
||||
|
||||
def revert(self, retry):
|
||||
self.change_state(retry, states.REVERTING)
|
||||
kwargs = self._get_retry_args(retry)
|
||||
kwargs['flow_failures'] = self._storage.get_failures()
|
||||
try:
|
||||
result = retry.revert(**kwargs)
|
||||
except Exception:
|
||||
result = misc.Failure()
|
||||
self.change_state(retry, states.FAILURE)
|
||||
else:
|
||||
self.change_state(retry, states.REVERTED)
|
||||
return async_utils.make_completed_future((retry, ex.REVERTED, result))
|
||||
|
||||
def on_failure(self, retry, atom, last_failure):
|
||||
self._storage.save_retry_failure(retry.name, atom.name, last_failure)
|
||||
kwargs = self._get_retry_args(retry)
|
||||
return retry.on_failure(**kwargs)
|
||||
@@ -14,11 +14,10 @@
|
||||
# License for the specific language governing permissions and limitations
|
||||
# under the License.
|
||||
|
||||
import logging
|
||||
|
||||
from taskflow import logging
|
||||
from taskflow import states as st
|
||||
from taskflow.types import failure
|
||||
from taskflow.types import fsm
|
||||
from taskflow.utils import misc
|
||||
|
||||
# Waiting state timeout (in seconds).
|
||||
_WAITING_TIMEOUT = 60
|
||||
@@ -28,6 +27,17 @@ _UNDEFINED = 'UNDEFINED'
|
||||
_GAME_OVER = 'GAME_OVER'
|
||||
_META_STATES = (_GAME_OVER, _UNDEFINED)
|
||||
|
||||
# Event name constants the state machine uses.
|
||||
_SCHEDULE = 'schedule_next'
|
||||
_WAIT = 'wait_finished'
|
||||
_ANALYZE = 'examine_finished'
|
||||
_FINISH = 'completed'
|
||||
_FAILED = 'failed'
|
||||
_SUSPENDED = 'suspended'
|
||||
_SUCCESS = 'success'
|
||||
_REVERTED = 'reverted'
|
||||
_START = 'start'
|
||||
|
||||
LOG = logging.getLogger(__name__)
|
||||
|
||||
|
||||
@@ -46,25 +56,25 @@ class _MachineBuilder(object):
|
||||
|
||||
NOTE(harlowja): the machine states that this builder will create are::
|
||||
|
||||
+--------------+-----------+------------+----------+---------+
|
||||
| Start | Event | End | On Enter | On Exit |
|
||||
+--------------+-----------+------------+----------+---------+
|
||||
| ANALYZING | finished | GAME_OVER | on_enter | on_exit |
|
||||
| ANALYZING | schedule | SCHEDULING | on_enter | on_exit |
|
||||
| ANALYZING | wait | WAITING | on_enter | on_exit |
|
||||
| FAILURE[$] | | | | |
|
||||
| GAME_OVER | failed | FAILURE | on_enter | on_exit |
|
||||
| GAME_OVER | reverted | REVERTED | on_enter | on_exit |
|
||||
| GAME_OVER | success | SUCCESS | on_enter | on_exit |
|
||||
| GAME_OVER | suspended | SUSPENDED | on_enter | on_exit |
|
||||
| RESUMING | schedule | SCHEDULING | on_enter | on_exit |
|
||||
| REVERTED[$] | | | | |
|
||||
| SCHEDULING | wait | WAITING | on_enter | on_exit |
|
||||
| SUCCESS[$] | | | | |
|
||||
| SUSPENDED[$] | | | | |
|
||||
| UNDEFINED[^] | start | RESUMING | on_enter | on_exit |
|
||||
| WAITING | analyze | ANALYZING | on_enter | on_exit |
|
||||
+--------------+-----------+------------+----------+---------+
|
||||
+--------------+------------------+------------+----------+---------+
| Start        | Event            | End        | On Enter | On Exit |
+--------------+------------------+------------+----------+---------+
| ANALYZING    | completed        | GAME_OVER  |          |         |
| ANALYZING    | schedule_next    | SCHEDULING |          |         |
| ANALYZING    | wait_finished    | WAITING    |          |         |
| FAILURE[$]   |                  |            |          |         |
| GAME_OVER    | failed           | FAILURE    |          |         |
| GAME_OVER    | reverted         | REVERTED   |          |         |
| GAME_OVER    | success          | SUCCESS    |          |         |
| GAME_OVER    | suspended        | SUSPENDED  |          |         |
| RESUMING     | schedule_next    | SCHEDULING |          |         |
| REVERTED[$]  |                  |            |          |         |
| SCHEDULING   | wait_finished    | WAITING    |          |         |
| SUCCESS[$]   |                  |            |          |         |
| SUSPENDED[$] |                  |            |          |         |
| UNDEFINED[^] | start            | RESUMING   |          |         |
| WAITING      | examine_finished | ANALYZING  |          |         |
+--------------+------------------+------------+----------+---------+
|
||||
|
||||
Between any of these yielded states (minus ``GAME_OVER`` and ``UNDEFINED``)
|
||||
if the engine has been suspended or the engine has failed (due to a
|
||||
@@ -89,21 +99,34 @@ class _MachineBuilder(object):
|
||||
timeout = _WAITING_TIMEOUT
|
||||
|
||||
def resume(old_state, new_state, event):
|
||||
# This reaction function just updates the state machine's memory
|
||||
# to include any nodes that need to be executed (from a previous
|
||||
# attempt, which may be empty if never ran before) and any nodes
|
||||
# that are now ready to be run.
|
||||
memory.next_nodes.update(self._completer.resume())
|
||||
memory.next_nodes.update(self._analyzer.get_next_nodes())
|
||||
return 'schedule'
|
||||
return _SCHEDULE
|
||||
|
||||
def game_over(old_state, new_state, event):
|
||||
# This reaction function is mainly an intermediary delegation
|
||||
# function that analyzes the current memory and transitions to
|
||||
# the appropriate handler that will deal with the memory values,
|
||||
# it is *always* called before the final state is entered.
|
||||
if memory.failures:
|
||||
return 'failed'
|
||||
return _FAILED
|
||||
if self._analyzer.get_next_nodes():
|
||||
return 'suspended'
|
||||
return _SUSPENDED
|
||||
elif self._analyzer.is_success():
|
||||
return 'success'
|
||||
return _SUCCESS
|
||||
else:
|
||||
return 'reverted'
|
||||
return _REVERTED
|
||||
|
||||
def schedule(old_state, new_state, event):
|
||||
# This reaction function starts to schedule the memory's next
|
||||
# nodes (iff the engine is still runnable, which it may not be
|
||||
# if the user of this engine has requested the engine/storage
|
||||
# that holds this information to stop or suspend); handles failures
|
||||
# that occur during this process safely...
|
||||
if self.runnable() and memory.next_nodes:
|
||||
not_done, failures = self._scheduler.schedule(
|
||||
memory.next_nodes)
|
||||
@@ -112,7 +135,7 @@ class _MachineBuilder(object):
|
||||
if failures:
|
||||
memory.failures.extend(failures)
|
||||
memory.next_nodes.clear()
|
||||
return 'wait'
|
||||
return _WAIT
|
||||
|
||||
def wait(old_state, new_state, event):
|
||||
# TODO(harlowja): maybe we should start doing 'yield from' this
|
||||
@@ -123,33 +146,55 @@ class _MachineBuilder(object):
|
||||
timeout)
|
||||
memory.done.update(done)
|
||||
memory.not_done = not_done
|
||||
return 'analyze'
|
||||
return _ANALYZE
|
||||
|
||||
def analyze(old_state, new_state, event):
|
||||
# This reaction function is responsible for analyzing all nodes
|
||||
# that have finished executing and completing them and figuring
|
||||
# out what nodes are now ready to be run (and then triggering those
|
||||
# nodes to be scheduled in the future); handles failures that
|
||||
# occur during this process safely...
|
||||
next_nodes = set()
|
||||
while memory.done:
|
||||
fut = memory.done.pop()
|
||||
node = fut.atom
|
||||
try:
|
||||
node, event, result = fut.result()
|
||||
event, result = fut.result()
|
||||
retain = self._completer.complete(node, event, result)
|
||||
if retain and isinstance(result, misc.Failure):
|
||||
memory.failures.append(result)
|
||||
if isinstance(result, failure.Failure):
|
||||
if retain:
|
||||
memory.failures.append(result)
|
||||
else:
|
||||
# NOTE(harlowja): avoid making any
|
||||
# intention request to storage unless we are
|
||||
# sure we are in DEBUG enabled logging (otherwise
|
||||
# we will call this all the time even when DEBUG
|
||||
# is not enabled, which would suck...)
|
||||
if LOG.isEnabledFor(logging.DEBUG):
|
||||
intention = self._storage.get_atom_intention(
|
||||
node.name)
|
||||
LOG.debug("Discarding failure '%s' (in"
|
||||
" response to event '%s') under"
|
||||
" completion units request during"
|
||||
" completion of node '%s' (intention"
|
||||
" is to %s)", result, event,
|
||||
node, intention)
|
||||
except Exception:
|
||||
memory.failures.append(misc.Failure())
|
||||
memory.failures.append(failure.Failure())
|
||||
else:
|
||||
try:
|
||||
more_nodes = self._analyzer.get_next_nodes(node)
|
||||
except Exception:
|
||||
memory.failures.append(misc.Failure())
|
||||
memory.failures.append(failure.Failure())
|
||||
else:
|
||||
next_nodes.update(more_nodes)
|
||||
if self.runnable() and next_nodes and not memory.failures:
|
||||
memory.next_nodes.update(next_nodes)
|
||||
return 'schedule'
|
||||
return _SCHEDULE
|
||||
elif memory.not_done:
|
||||
return 'wait'
|
||||
return _WAIT
|
||||
else:
|
||||
return 'finished'
|
||||
return _FINISH
|
||||
|
||||
def on_exit(old_state, event):
|
||||
LOG.debug("Exiting old state '%s' in response to event '%s'",
|
||||
@@ -178,24 +223,25 @@ class _MachineBuilder(object):
|
||||
m.add_state(st.WAITING, **watchers)
|
||||
m.add_state(st.FAILURE, terminal=True, **watchers)
|
||||
|
||||
m.add_transition(_GAME_OVER, st.REVERTED, 'reverted')
|
||||
m.add_transition(_GAME_OVER, st.SUCCESS, 'success')
|
||||
m.add_transition(_GAME_OVER, st.SUSPENDED, 'suspended')
|
||||
m.add_transition(_GAME_OVER, st.FAILURE, 'failed')
|
||||
m.add_transition(_UNDEFINED, st.RESUMING, 'start')
|
||||
m.add_transition(st.ANALYZING, _GAME_OVER, 'finished')
|
||||
m.add_transition(st.ANALYZING, st.SCHEDULING, 'schedule')
|
||||
m.add_transition(st.ANALYZING, st.WAITING, 'wait')
|
||||
m.add_transition(st.RESUMING, st.SCHEDULING, 'schedule')
|
||||
m.add_transition(st.SCHEDULING, st.WAITING, 'wait')
|
||||
m.add_transition(st.WAITING, st.ANALYZING, 'analyze')
|
||||
m.add_transition(_GAME_OVER, st.REVERTED, _REVERTED)
|
||||
m.add_transition(_GAME_OVER, st.SUCCESS, _SUCCESS)
|
||||
m.add_transition(_GAME_OVER, st.SUSPENDED, _SUSPENDED)
|
||||
m.add_transition(_GAME_OVER, st.FAILURE, _FAILED)
|
||||
m.add_transition(_UNDEFINED, st.RESUMING, _START)
|
||||
m.add_transition(st.ANALYZING, _GAME_OVER, _FINISH)
|
||||
m.add_transition(st.ANALYZING, st.SCHEDULING, _SCHEDULE)
|
||||
m.add_transition(st.ANALYZING, st.WAITING, _WAIT)
|
||||
m.add_transition(st.RESUMING, st.SCHEDULING, _SCHEDULE)
|
||||
m.add_transition(st.SCHEDULING, st.WAITING, _WAIT)
|
||||
m.add_transition(st.WAITING, st.ANALYZING, _ANALYZE)
|
||||
|
||||
m.add_reaction(_GAME_OVER, 'finished', game_over)
|
||||
m.add_reaction(st.ANALYZING, 'analyze', analyze)
|
||||
m.add_reaction(st.RESUMING, 'start', resume)
|
||||
m.add_reaction(st.SCHEDULING, 'schedule', schedule)
|
||||
m.add_reaction(st.WAITING, 'wait', wait)
|
||||
m.add_reaction(_GAME_OVER, _FINISH, game_over)
|
||||
m.add_reaction(st.ANALYZING, _ANALYZE, analyze)
|
||||
m.add_reaction(st.RESUMING, _START, resume)
|
||||
m.add_reaction(st.SCHEDULING, _SCHEDULE, schedule)
|
||||
m.add_reaction(st.WAITING, _WAIT, wait)
|
||||
|
||||
m.freeze()
|
||||
return (m, memory)
|
||||
|
||||
|
||||
@@ -230,7 +276,7 @@ class Runner(object):
|
||||
def run_iter(self, timeout=None):
|
||||
"""Runs the nodes using a built state machine."""
|
||||
machine, memory = self.builder.build(timeout=timeout)
|
||||
for (_prior_state, new_state) in machine.run_iter('start'):
|
||||
for (_prior_state, new_state) in machine.run_iter(_START):
|
||||
# NOTE(harlowja): skip over meta-states.
|
||||
if new_state not in _META_STATES:
|
||||
if new_state == st.FAILURE:
|
||||
|
||||
@@ -14,15 +14,14 @@
|
||||
# License for the specific language governing permissions and limitations
|
||||
# under the License.
|
||||
|
||||
from taskflow.engines.action_engine import analyzer as ca
|
||||
from taskflow.engines.action_engine import executor as ex
|
||||
from taskflow.engines.action_engine import retry_action as ra
|
||||
from taskflow.engines.action_engine.actions import retry as ra
|
||||
from taskflow.engines.action_engine.actions import task as ta
|
||||
from taskflow.engines.action_engine import analyzer as an
|
||||
from taskflow.engines.action_engine import completer as co
|
||||
from taskflow.engines.action_engine import runner as ru
|
||||
from taskflow.engines.action_engine import task_action as ta
|
||||
from taskflow import exceptions as excp
|
||||
from taskflow import retry as retry_atom
|
||||
from taskflow.engines.action_engine import scheduler as sched
|
||||
from taskflow.engines.action_engine import scopes as sc
|
||||
from taskflow import states as st
|
||||
from taskflow import task as task_atom
|
||||
from taskflow.utils import misc
|
||||
|
||||
|
||||
@@ -34,11 +33,12 @@ class Runtime(object):
|
||||
action engine to run to completion.
|
||||
"""
|
||||
|
||||
def __init__(self, compilation, storage, task_notifier, task_executor):
|
||||
self._task_notifier = task_notifier
|
||||
def __init__(self, compilation, storage, atom_notifier, task_executor):
|
||||
self._atom_notifier = atom_notifier
|
||||
self._task_executor = task_executor
|
||||
self._storage = storage
|
||||
self._compilation = compilation
|
||||
self._scopes = {}
|
||||
|
||||
@property
|
||||
def compilation(self):
|
||||
@@ -50,7 +50,7 @@ class Runtime(object):
|
||||
|
||||
@misc.cachedproperty
|
||||
def analyzer(self):
|
||||
return ca.Analyzer(self._compilation, self._storage)
|
||||
return an.Analyzer(self._compilation, self._storage)
|
||||
|
||||
@misc.cachedproperty
|
||||
def runner(self):
|
||||
@@ -58,30 +58,47 @@ class Runtime(object):
|
||||
|
||||
@misc.cachedproperty
|
||||
def completer(self):
|
||||
return Completer(self)
|
||||
return co.Completer(self)
|
||||
|
||||
@misc.cachedproperty
|
||||
def scheduler(self):
|
||||
return Scheduler(self)
|
||||
return sched.Scheduler(self)
|
||||
|
||||
@misc.cachedproperty
|
||||
def retry_action(self):
|
||||
return ra.RetryAction(self.storage, self._task_notifier)
|
||||
return ra.RetryAction(self._storage, self._atom_notifier,
|
||||
self._fetch_scopes_for)
|
||||
|
||||
@misc.cachedproperty
|
||||
def task_action(self):
|
||||
return ta.TaskAction(self.storage, self._task_executor,
|
||||
self._task_notifier)
|
||||
return ta.TaskAction(self._storage,
|
||||
self._atom_notifier, self._fetch_scopes_for,
|
||||
self._task_executor)
|
||||
|
||||
def _fetch_scopes_for(self, atom):
|
||||
"""Fetches a tuple of the visible scopes for the given atom."""
|
||||
try:
|
||||
return self._scopes[atom]
|
||||
except KeyError:
|
||||
walker = sc.ScopeWalker(self.compilation, atom,
|
||||
names_only=True)
|
||||
visible_to = tuple(walker)
|
||||
self._scopes[atom] = visible_to
|
||||
return visible_to
|
||||
|
||||
# Various helper methods used by the runtime components; not for public
|
||||
# consumption...
|
||||
|
||||
def reset_nodes(self, nodes, state=st.PENDING, intention=st.EXECUTE):
|
||||
for node in nodes:
|
||||
if state:
|
||||
if isinstance(node, task_atom.BaseTask):
|
||||
self.task_action.change_state(node, state, progress=0.0)
|
||||
elif isinstance(node, retry_atom.Retry):
|
||||
if self.task_action.handles(node):
|
||||
self.task_action.change_state(node, state,
|
||||
progress=0.0)
|
||||
elif self.retry_action.handles(node):
|
||||
self.retry_action.change_state(node, state)
|
||||
else:
|
||||
raise TypeError("Unknown how to reset node %s, %s"
|
||||
raise TypeError("Unknown how to reset atom '%s' (%s)"
|
||||
% (node, type(node)))
|
||||
if intention:
|
||||
self.storage.set_atom_intention(node.name, intention)
|
||||
@@ -94,174 +111,6 @@ class Runtime(object):
|
||||
self.reset_nodes(self.analyzer.iterate_subgraph(node),
|
||||
state=state, intention=intention)
|
||||
|
||||
|
||||
# Various helper methods used by completer and scheduler.
|
||||
def _retry_subflow(retry, runtime):
|
||||
runtime.storage.set_atom_intention(retry.name, st.EXECUTE)
|
||||
runtime.reset_subgraph(retry)
|
||||
|
||||
|
||||
class Completer(object):
|
||||
"""Completes atoms using actions to complete them."""
|
||||
|
||||
def __init__(self, runtime):
|
||||
self._analyzer = runtime.analyzer
|
||||
self._retry_action = runtime.retry_action
|
||||
self._runtime = runtime
|
||||
self._storage = runtime.storage
|
||||
self._task_action = runtime.task_action
|
||||
|
||||
def _complete_task(self, task, event, result):
|
||||
"""Completes the given task, processes task failure."""
|
||||
if event == ex.EXECUTED:
|
||||
self._task_action.complete_execution(task, result)
|
||||
else:
|
||||
self._task_action.complete_reversion(task, result)
|
||||
|
||||
def resume(self):
|
||||
"""Resumes nodes in the contained graph.
|
||||
|
||||
This is done to allow any previously completed or failed nodes to
be analyzed, their results processed and any potentially affected nodes
to be adjusted as needed.
|
||||
|
||||
This should return a set of nodes which should be the initial set of
|
||||
nodes that were previously not finished (due to a RUNNING or REVERTING
|
||||
attempt not previously finishing).
|
||||
"""
|
||||
for node in self._analyzer.iterate_all_nodes():
|
||||
if self._analyzer.get_state(node) == st.FAILURE:
|
||||
self._process_atom_failure(node, self._storage.get(node.name))
|
||||
for retry in self._analyzer.iterate_retries(st.RETRYING):
|
||||
_retry_subflow(retry, self._runtime)
|
||||
unfinished_nodes = set()
|
||||
for node in self._analyzer.iterate_all_nodes():
|
||||
if self._analyzer.get_state(node) in (st.RUNNING, st.REVERTING):
|
||||
unfinished_nodes.add(node)
|
||||
return unfinished_nodes
|
||||
|
||||
def complete(self, node, event, result):
|
||||
"""Performs post-execution completion of a node.
|
||||
|
||||
Returns whether the result should be saved into an accumulator of
|
||||
failures or whether this should not be done.
|
||||
"""
|
||||
if isinstance(node, task_atom.BaseTask):
|
||||
self._complete_task(node, event, result)
|
||||
if isinstance(result, misc.Failure):
|
||||
if event == ex.EXECUTED:
|
||||
self._process_atom_failure(node, result)
|
||||
else:
|
||||
return True
|
||||
return False
|
||||
|
||||
def _process_atom_failure(self, atom, failure):
|
||||
"""Processes atom failure & applies resolution strategies.
|
||||
|
||||
On atom failure this will find the atom's associated retry controller
and ask that controller for the strategy to perform to resolve the
failure. After getting a resolution strategy decision this method will
then adjust the needed other atoms' intentions and states so that
the failure can be worked around.
|
||||
"""
|
||||
retry = self._analyzer.find_atom_retry(atom)
|
||||
if retry:
|
||||
# Ask retry controller what to do in case of failure
|
||||
action = self._retry_action.on_failure(retry, atom, failure)
|
||||
if action == retry_atom.RETRY:
|
||||
# Prepare subflow for revert
|
||||
self._storage.set_atom_intention(retry.name, st.RETRY)
|
||||
self._runtime.reset_subgraph(retry, state=None,
|
||||
intention=st.REVERT)
|
||||
elif action == retry_atom.REVERT:
|
||||
# Ask parent checkpoint
|
||||
self._process_atom_failure(retry, failure)
|
||||
elif action == retry_atom.REVERT_ALL:
|
||||
# Prepare all flow for revert
|
||||
self._revert_all()
|
||||
else:
|
||||
# Prepare all flow for revert
|
||||
self._revert_all()
|
||||
|
||||
def _revert_all(self):
|
||||
"""Attempts to set all nodes to the REVERT intention."""
|
||||
self._runtime.reset_nodes(self._analyzer.iterate_all_nodes(),
|
||||
state=None, intention=st.REVERT)
|
||||
|
||||
|
||||
class Scheduler(object):
|
||||
"""Schedules atoms using actions to schedule."""
|
||||
|
||||
def __init__(self, runtime):
|
||||
self._analyzer = runtime.analyzer
|
||||
self._retry_action = runtime.retry_action
|
||||
self._runtime = runtime
|
||||
self._storage = runtime.storage
|
||||
self._task_action = runtime.task_action
|
||||
|
||||
def _schedule_node(self, node):
|
||||
"""Schedule a single node for execution."""
|
||||
# TODO(harlowja): we need to rework this so that we aren't doing type
|
||||
# checking here; type checking usually means something isn't done right
|
||||
# and usually will limit extensibility in the future.
|
||||
if isinstance(node, task_atom.BaseTask):
|
||||
return self._schedule_task(node)
|
||||
elif isinstance(node, retry_atom.Retry):
|
||||
return self._schedule_retry(node)
|
||||
else:
|
||||
raise TypeError("Unknown how to schedule node %s, %s"
|
||||
% (node, type(node)))
|
||||
|
||||
def _schedule_retry(self, retry):
|
||||
"""Schedules the given retry atom for *future* completion.
|
||||
|
||||
Depending on the atom's stored intention this may schedule the retry
|
||||
atom for reversion or execution.
|
||||
"""
|
||||
intention = self._storage.get_atom_intention(retry.name)
|
||||
if intention == st.EXECUTE:
|
||||
return self._retry_action.execute(retry)
|
||||
elif intention == st.REVERT:
|
||||
return self._retry_action.revert(retry)
|
||||
elif intention == st.RETRY:
|
||||
self._retry_action.change_state(retry, st.RETRYING)
|
||||
_retry_subflow(retry, self._runtime)
|
||||
return self._retry_action.execute(retry)
|
||||
else:
|
||||
raise excp.ExecutionFailure("Unknown how to schedule retry with"
|
||||
" intention: %s" % intention)
|
||||
|
||||
def _schedule_task(self, task):
|
||||
"""Schedules the given task atom for *future* completion.
|
||||
|
||||
Depending on the atom's stored intention this may schedule the task
|
||||
atom for reversion or execution.
|
||||
"""
|
||||
intention = self._storage.get_atom_intention(task.name)
|
||||
if intention == st.EXECUTE:
|
||||
return self._task_action.schedule_execution(task)
|
||||
elif intention == st.REVERT:
|
||||
return self._task_action.schedule_reversion(task)
|
||||
else:
|
||||
raise excp.ExecutionFailure("Unknown how to schedule task with"
|
||||
" intention: %s" % intention)
|
||||
|
||||
def schedule(self, nodes):
|
||||
"""Schedules the provided nodes for *future* completion.
|
||||
|
||||
This method should schedule a future for each node provided and return
|
||||
a set of those futures to be waited on (or used for other similar
|
||||
purposes). It should also return any failure objects that represented
|
||||
scheduling failures that may have occurred during this scheduling
|
||||
process.
|
||||
"""
|
||||
futures = set()
|
||||
for node in nodes:
|
||||
try:
|
||||
futures.add(self._schedule_node(node))
|
||||
except Exception:
|
||||
# Immediately stop scheduling future work so that we can
|
||||
# exit execution early (rather than later) if a single task
|
||||
# fails to schedule correctly.
|
||||
return (futures, [misc.Failure()])
|
||||
return (futures, [])
|
||||
def retry_subflow(self, retry):
|
||||
self.storage.set_atom_intention(retry.name, st.EXECUTE)
|
||||
self.reset_subgraph(retry)
|
||||
|
||||
115
taskflow/engines/action_engine/scheduler.py
Normal file
@@ -0,0 +1,115 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright (C) 2014 Yahoo! Inc. All Rights Reserved.
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License"); you may
|
||||
# not use this file except in compliance with the License. You may obtain
|
||||
# a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
|
||||
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
|
||||
# License for the specific language governing permissions and limitations
|
||||
# under the License.
|
||||
|
||||
from taskflow import exceptions as excp
|
||||
from taskflow import retry as retry_atom
|
||||
from taskflow import states as st
|
||||
from taskflow import task as task_atom
|
||||
from taskflow.types import failure
|
||||
|
||||
|
||||
class _RetryScheduler(object):
|
||||
def __init__(self, runtime):
|
||||
self._runtime = runtime
|
||||
self._retry_action = runtime.retry_action
|
||||
self._storage = runtime.storage
|
||||
|
||||
@staticmethod
|
||||
def handles(atom):
|
||||
return isinstance(atom, retry_atom.Retry)
|
||||
|
||||
def schedule(self, retry):
|
||||
"""Schedules the given retry atom for *future* completion.
|
||||
|
||||
Depending on the atom's stored intention this may schedule the retry
|
||||
atom for reversion or execution.
|
||||
"""
|
||||
intention = self._storage.get_atom_intention(retry.name)
|
||||
if intention == st.EXECUTE:
|
||||
return self._retry_action.execute(retry)
|
||||
elif intention == st.REVERT:
|
||||
return self._retry_action.revert(retry)
|
||||
elif intention == st.RETRY:
|
||||
self._retry_action.change_state(retry, st.RETRYING)
|
||||
self._runtime.retry_subflow(retry)
|
||||
return self._retry_action.execute(retry)
|
||||
else:
|
||||
raise excp.ExecutionFailure("Unknown how to schedule retry with"
|
||||
" intention: %s" % intention)
|
||||
|
||||
|
||||
class _TaskScheduler(object):
|
||||
def __init__(self, runtime):
|
||||
self._storage = runtime.storage
|
||||
self._task_action = runtime.task_action
|
||||
|
||||
@staticmethod
|
||||
def handles(atom):
|
||||
return isinstance(atom, task_atom.BaseTask)
|
||||
|
||||
def schedule(self, task):
|
||||
"""Schedules the given task atom for *future* completion.
|
||||
|
||||
Depending on the atom's stored intention this may schedule the task
|
||||
atom for reversion or execution.
|
||||
"""
|
||||
intention = self._storage.get_atom_intention(task.name)
|
||||
if intention == st.EXECUTE:
|
||||
return self._task_action.schedule_execution(task)
|
||||
elif intention == st.REVERT:
|
||||
return self._task_action.schedule_reversion(task)
|
||||
else:
|
||||
raise excp.ExecutionFailure("Unknown how to schedule task with"
|
||||
" intention: %s" % intention)
|
||||
|
||||
|
||||
class Scheduler(object):
|
||||
"""Schedules atoms using actions to schedule."""
|
||||
|
||||
def __init__(self, runtime):
|
||||
self._schedulers = [
|
||||
_RetryScheduler(runtime),
|
||||
_TaskScheduler(runtime),
|
||||
]
|
||||
|
||||
def _schedule_node(self, node):
|
||||
"""Schedule a single node for execution."""
|
||||
for sched in self._schedulers:
|
||||
if sched.handles(node):
|
||||
return sched.schedule(node)
|
||||
else:
|
||||
raise TypeError("Unknown how to schedule '%s' (%s)"
|
||||
% (node, type(node)))
|
||||
|
||||
def schedule(self, nodes):
|
||||
"""Schedules the provided nodes for *future* completion.
|
||||
|
||||
This method should schedule a future for each node provided and return
|
||||
a set of those futures to be waited on (or used for other similar
|
||||
purposes). It should also return any failure objects that represented
|
||||
scheduling failures that may have occurred during this scheduling
|
||||
process.
|
||||
"""
|
||||
futures = set()
|
||||
for node in nodes:
|
||||
try:
|
||||
futures.add(self._schedule_node(node))
|
||||
except Exception:
|
||||
# Immediately stop scheduling future work so that we can
|
||||
# exit execution early (rather than later) if a single task
|
||||
# fails to schedule correctly.
|
||||
return (futures, [failure.Failure()])
|
||||
return (futures, [])
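A self-contained sketch (plain ``concurrent.futures``, not taskflow) of the schedule-then-wait pattern this scheduler and the runner use together: submit a batch of futures, then repeatedly wait for whichever completes first and examine its result:

from concurrent import futures

def work(x):
    return x * x

with futures.ThreadPoolExecutor(max_workers=2) as pool:
    not_done = set(pool.submit(work, i) for i in range(4))
    while not_done:
        done, not_done = futures.wait(not_done,
                                      return_when=futures.FIRST_COMPLETED)
        for fut in done:
            print(fut.result())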
|
||||
113
taskflow/engines/action_engine/scopes.py
Normal file
@@ -0,0 +1,113 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright (C) 2014 Yahoo! Inc. All Rights Reserved.
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License"); you may
|
||||
# not use this file except in compliance with the License. You may obtain
|
||||
# a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
|
||||
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
|
||||
# License for the specific language governing permissions and limitations
|
||||
# under the License.
|
||||
|
||||
from taskflow import atom as atom_type
|
||||
from taskflow import flow as flow_type
|
||||
from taskflow import logging
|
||||
|
||||
LOG = logging.getLogger(__name__)
|
||||
|
||||
|
||||
def _extract_atoms(node, idx=-1):
|
||||
# Always go left to right, since right to left is the pattern order
|
||||
# and we want to go backwards and not forwards through that ordering...
|
||||
if idx == -1:
|
||||
children_iter = node.reverse_iter()
|
||||
else:
|
||||
children_iter = reversed(node[0:idx])
|
||||
atoms = []
|
||||
for child in children_iter:
|
||||
if isinstance(child.item, flow_type.Flow):
|
||||
atoms.extend(_extract_atoms(child))
|
||||
elif isinstance(child.item, atom_type.Atom):
|
||||
atoms.append(child.item)
|
||||
else:
|
||||
raise TypeError(
|
||||
"Unknown extraction item '%s' (%s)" % (child.item,
|
||||
type(child.item)))
|
||||
return atoms
|
||||
|
||||
|
||||
class ScopeWalker(object):
|
||||
"""Walks through the scopes of a atom using a engines compilation.
|
||||
|
||||
This will walk the visible scopes that are accessible for the given
|
||||
atom, which can be used by some external entity in some meaningful way,
|
||||
for example to find dependent values...
|
||||
"""
|
||||
|
||||
def __init__(self, compilation, atom, names_only=False):
|
||||
self._node = compilation.hierarchy.find(atom)
|
||||
if self._node is None:
|
||||
raise ValueError("Unable to find atom '%s' in compilation"
|
||||
" hierarchy" % atom)
|
||||
self._atom = atom
|
||||
self._graph = compilation.execution_graph
|
||||
self._names_only = names_only
|
||||
|
||||
def __iter__(self):
|
||||
"""Iterates over the visible scopes.
|
||||
|
||||
How this works is the following:
|
||||
|
||||
We find all the possible predecessors of the given atom; this is useful
since we know they occurred before this atom, but it doesn't tell us
the corresponding scope *level* that each predecessor was created in,
so we need to find this information.
|
||||
|
||||
For that information we consult the location of the atom ``Y`` in the
|
||||
node hierarchy. We look up in reverse order the parent ``X`` of ``Y``
|
||||
and traverse backwards from the index in the parent where ``Y``
|
||||
occurred, all children in ``X`` that we encounter in this backwards
|
||||
search (if a child is a flow itself, its atom contents will be
|
||||
expanded) will be assumed to be at the same scope. This is then a
|
||||
*potential* single scope, to make an *actual* scope we remove the items
|
||||
from the *potential* scope that are not predecessors of ``Y`` to form
|
||||
the *actual* scope.
|
||||
|
||||
Then for additional scopes we continue up the tree, by finding the
|
||||
parent of ``X`` (let's call it ``Z``) and perform the same operation,
|
||||
going through the children in a reverse manner from the index in
|
||||
parent ``Z`` where ``X`` was located. This forms another *potential*
|
||||
scope which we provide back as an *actual* scope after reducing the
|
||||
potential set by the predecessors of ``Y``. We then repeat this process
|
||||
until we no longer have any parent nodes (aka have reached the top of
|
||||
the tree) or we run out of predecessors.
|
||||
"""
|
||||
predecessors = set(self._graph.bfs_predecessors_iter(self._atom))
|
||||
last = self._node
|
||||
for parent in self._node.path_iter(include_self=False):
|
||||
if not predecessors:
|
||||
break
|
||||
last_idx = parent.index(last.item)
|
||||
visible = []
|
||||
for a in _extract_atoms(parent, idx=last_idx):
|
||||
if a in predecessors:
|
||||
predecessors.remove(a)
|
||||
if not self._names_only:
|
||||
visible.append(a)
|
||||
else:
|
||||
visible.append(a.name)
|
||||
if LOG.isEnabledFor(logging.BLATHER):
|
||||
if not self._names_only:
|
||||
visible_names = [a.name for a in visible]
|
||||
else:
|
||||
visible_names = visible
|
||||
LOG.blather("Scope visible to '%s' (limited by parent '%s'"
|
||||
" index < %s) is: %s", self._atom,
|
||||
parent.item.name, last_idx, visible_names)
|
||||
yield visible
|
||||
last = parent
|
||||
@@ -1,116 +0,0 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright (C) 2012-2013 Yahoo! Inc. All Rights Reserved.
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License"); you may
|
||||
# not use this file except in compliance with the License. You may obtain
|
||||
# a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
|
||||
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
|
||||
# License for the specific language governing permissions and limitations
|
||||
# under the License.
|
||||
|
||||
import logging
|
||||
|
||||
from taskflow import states
|
||||
from taskflow.utils import misc
|
||||
|
||||
LOG = logging.getLogger(__name__)
|
||||
|
||||
SAVE_RESULT_STATES = (states.SUCCESS, states.FAILURE)
|
||||
|
||||
|
||||
class TaskAction(object):
|
||||
|
||||
def __init__(self, storage, task_executor, notifier):
|
||||
self._storage = storage
|
||||
self._task_executor = task_executor
|
||||
self._notifier = notifier
|
||||
|
||||
def _is_identity_transition(self, state, task, progress):
|
||||
if state in SAVE_RESULT_STATES:
|
||||
# saving result is never identity transition
|
||||
return False
|
||||
old_state = self._storage.get_atom_state(task.name)
|
||||
if state != old_state:
|
||||
# changing state is not identity transition by definition
|
||||
return False
|
||||
# NOTE(imelnikov): last thing to check is that the progress has
|
||||
# changed, which means progress is not None and is different from
|
||||
# what is stored in the database.
|
||||
if progress is None:
|
||||
return False
|
||||
old_progress = self._storage.get_task_progress(task.name)
|
||||
if old_progress != progress:
|
||||
return False
|
||||
return True
|
||||
|
||||
def change_state(self, task, state, result=None, progress=None):
|
||||
if self._is_identity_transition(state, task, progress):
|
||||
# NOTE(imelnikov): ignore identity transitions in order
|
||||
# to avoid extra write to storage backend and, what's
|
||||
# more important, extra notifications
|
||||
return
|
||||
if state in SAVE_RESULT_STATES:
|
||||
self._storage.save(task.name, result, state)
|
||||
else:
|
||||
self._storage.set_atom_state(task.name, state)
|
||||
if progress is not None:
|
||||
self._storage.set_task_progress(task.name, progress)
|
||||
task_uuid = self._storage.get_atom_uuid(task.name)
|
||||
details = dict(task_name=task.name,
|
||||
task_uuid=task_uuid,
|
||||
result=result)
|
||||
self._notifier.notify(state, details)
|
||||
if progress is not None:
|
||||
task.update_progress(progress)
|
||||
|
||||
def _on_update_progress(self, task, event_data, progress, **kwargs):
|
||||
"""Should be called when task updates its progress."""
|
||||
try:
|
||||
self._storage.set_task_progress(task.name, progress, kwargs)
|
||||
except Exception:
|
||||
# Update progress callbacks should never fail, so capture and log
|
||||
# the emitted exception instead of raising it.
|
||||
LOG.exception("Failed setting task progress for %s to %0.3f",
|
||||
task, progress)
|
||||
|
||||
def schedule_execution(self, task):
|
||||
self.change_state(task, states.RUNNING, progress=0.0)
|
||||
kwargs = self._storage.fetch_mapped_args(task.rebind,
|
||||
atom_name=task.name)
|
||||
task_uuid = self._storage.get_atom_uuid(task.name)
|
||||
return self._task_executor.execute_task(task, task_uuid, kwargs,
|
||||
self._on_update_progress)
|
||||
|
||||
def complete_execution(self, task, result):
|
||||
if isinstance(result, misc.Failure):
|
||||
self.change_state(task, states.FAILURE, result=result)
|
||||
else:
|
||||
self.change_state(task, states.SUCCESS,
|
||||
result=result, progress=1.0)
|
||||
|
||||
def schedule_reversion(self, task):
|
||||
self.change_state(task, states.REVERTING, progress=0.0)
|
||||
kwargs = self._storage.fetch_mapped_args(task.rebind,
|
||||
atom_name=task.name)
|
||||
task_uuid = self._storage.get_atom_uuid(task.name)
|
||||
task_result = self._storage.get(task.name)
|
||||
failures = self._storage.get_failures()
|
||||
future = self._task_executor.revert_task(task, task_uuid, kwargs,
|
||||
task_result, failures,
|
||||
self._on_update_progress)
|
||||
return future
|
||||
|
||||
def complete_reversion(self, task, rev_result):
|
||||
if isinstance(rev_result, misc.Failure):
|
||||
self.change_state(task, states.FAILURE)
|
||||
else:
|
||||
self.change_state(task, states.REVERTED, progress=1.0)
|
||||
|
||||
def wait_for_any(self, fs, timeout):
|
||||
return self._task_executor.wait_for_any(fs, timeout)
|
||||
@@ -19,29 +19,57 @@ import abc
|
||||
|
||||
import six
|
||||
|
||||
from taskflow.types import notifier
|
||||
from taskflow.utils import deprecation
|
||||
from taskflow.utils import misc
|
||||
|
||||
|
||||
@six.add_metaclass(abc.ABCMeta)
|
||||
class EngineBase(object):
|
||||
class Engine(object):
|
||||
"""Base for all engines implementations.
|
||||
|
||||
:ivar notifier: A notification object that will dispatch events that
|
||||
occur related to the flow the engine contains.
|
||||
:ivar task_notifier: A notification object that will dispatch events that
|
||||
occur related to the tasks the engine contains.
|
||||
occur related to the tasks the engine
|
||||
contains (deprecated).
|
||||
:ivar atom_notifier: A notification object that will dispatch events that
|
||||
occur related to the atoms the engine contains.
|
||||
"""
|
||||
|
||||
def __init__(self, flow, flow_detail, backend, conf):
|
||||
def __init__(self, flow, flow_detail, backend, options):
|
||||
self._flow = flow
|
||||
self._flow_detail = flow_detail
|
||||
self._backend = backend
|
||||
if not conf:
|
||||
self._conf = {}
|
||||
if not options:
|
||||
self._options = {}
|
||||
else:
|
||||
self._conf = dict(conf)
|
||||
self.notifier = misc.Notifier()
|
||||
self.task_notifier = misc.Notifier()
|
||||
self._options = dict(options)
|
||||
self._notifier = notifier.Notifier()
|
||||
self._atom_notifier = notifier.Notifier()
|
||||
|
||||
@property
|
||||
def notifier(self):
|
||||
"""The flow notifier."""
|
||||
return self._notifier
|
||||
|
||||
@property
|
||||
@deprecation.moved_property('atom_notifier', version="0.6",
|
||||
removal_version="?")
|
||||
def task_notifier(self):
|
||||
"""The task notifier."""
|
||||
return self._atom_notifier
|
||||
|
||||
@property
|
||||
def atom_notifier(self):
|
||||
"""The atom notifier."""
|
||||
return self._atom_notifier
|
||||
|
||||
@property
|
||||
def options(self):
|
||||
"""The options that were passed to this engine on construction."""
|
||||
return self._options
|
||||
|
||||
@misc.cachedproperty
|
||||
def storage(self):
|
||||
@@ -85,3 +113,10 @@ class EngineBase(object):
|
||||
not currently be preempted) and move the engine into a suspend state
|
||||
which can then later be resumed from.
|
||||
"""
|
||||
|
||||
|
||||
# TODO(harlowja): remove in 0.7 or later...
|
||||
EngineBase = deprecation.moved_inheritable_class(Engine,
|
||||
'EngineBase', __name__,
|
||||
version="0.6",
|
||||
removal_version="?")
|
||||
|
||||
@@ -15,21 +15,93 @@
# under the License.

import contextlib
import itertools
import traceback

from oslo_utils import importutils
from oslo_utils import reflection
import six
import stevedore.driver

from taskflow import exceptions as exc
from taskflow.openstack.common import importutils
from taskflow import logging
from taskflow.persistence import backends as p_backends
from taskflow.utils import deprecation
from taskflow.utils import misc
from taskflow.utils import persistence_utils as p_utils
from taskflow.utils import reflection

LOG = logging.getLogger(__name__)

# NOTE(imelnikov): this is the entrypoint namespace, not the module namespace.
ENGINES_NAMESPACE = 'taskflow.engines'

# The default entrypoint engine type looked for when it is not provided.
ENGINE_DEFAULT = 'default'

# TODO(harlowja): only used during the deprecation cycle, remove it once
# ``_extract_engine_compat`` is also gone...
_FILE_NAMES = [__file__]
if six.PY2:
    # Due to a bug in py2.x the __file__ may point to the pyc file & since
    # we are using the traceback module and that module only shows py files
    # we have to do a slight adjustment to ensure we match correctly...
    #
    # This is addressed in https://www.python.org/dev/peps/pep-3147/#file
    if __file__.endswith("pyc"):
        _FILE_NAMES.append(__file__[0:-1])
_FILE_NAMES = tuple(_FILE_NAMES)


def _extract_engine(**kwargs):
    """Extracts the engine kind and any associated options."""

    def _compat_extract(**kwargs):
        options = {}
        kind = kwargs.pop('engine', None)
        engine_conf = kwargs.pop('engine_conf', None)
        if engine_conf is not None:
            if isinstance(engine_conf, six.string_types):
                kind = engine_conf
            else:
                options.update(engine_conf)
                kind = options.pop('engine', None)
        if not kind:
            kind = ENGINE_DEFAULT
        # See if it's a URI and if so, extract any further options...
        try:
            uri = misc.parse_uri(kind)
        except (TypeError, ValueError):
            pass
        else:
            kind = uri.scheme
            options = misc.merge_uri(uri, options.copy())
        # Merge in any leftover **kwargs into the options, this makes it so
        # that the provided **kwargs override any URI or engine_conf specific
        # options.
        options.update(kwargs)
        return (kind, options)

    engine_conf = kwargs.get('engine_conf', None)
    if engine_conf is not None:
        # Figure out where our code ends and the calling code begins (this is
        # needed since this code is called from two functions in this module,
        # which means the stack level will vary by one depending on that).
        finder = itertools.takewhile(
            lambda frame: frame[0] in _FILE_NAMES,
            reversed(traceback.extract_stack(limit=3)))
        stacklevel = sum(1 for _frame in finder)
        decorator = deprecation.renamed_kwarg('engine_conf', 'engine',
                                              version="0.6",
                                              removal_version="?",
                                              # Three is added on since the
                                              # decorator adds three of its own
                                              # stack levels that we need to
                                              # hop out of...
                                              stacklevel=stacklevel + 3)
        return decorator(_compat_extract)(**kwargs)
    else:
        return _compat_extract(**kwargs)

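To make the extraction concrete, here is how the helper above could behave for a couple of hypothetical inputs (assuming ``misc.parse_uri``/``misc.merge_uri`` behave as their names suggest; the option names and values are illustrative)::

    # Plain kind plus keyword options:
    kind, options = _extract_engine(engine='serial', some_option=2)
    # kind == 'serial'; options == {'some_option': 2}

    # URI form; the scheme picks the engine kind and any query
    # parameters are merged into the options (arriving as strings):
    kind, options = _extract_engine(engine='parallel://?timeout=5')
    # kind == 'parallel'; options == {'timeout': '5'}
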
def _fetch_factory(factory_name):
    try:
@@ -56,49 +128,43 @@ def _fetch_validate_factory(flow_factory):


def load(flow, store=None, flow_detail=None, book=None,
         engine_conf=None, backend=None, namespace=ENGINES_NAMESPACE,
         **kwargs):
         engine_conf=None, backend=None,
         namespace=ENGINES_NAMESPACE, engine=ENGINE_DEFAULT, **kwargs):
    """Load a flow into an engine.

    This function creates and prepares engine to run the
    flow. All that is left is to run the engine with 'run()' method.
    This function creates and prepares an engine to run the provided flow. All
    that is left after this returns is to run the engine with the
    engines ``run()`` method.

    Which engine to load is specified in 'engine_conf' parameter. It
    can be a string that names engine type or a dictionary which holds
    engine type (with 'engine' key) and additional engine-specific
    configuration.
    Which engine to load is specified via the ``engine`` parameter. It
    can be a string that names the engine type to use, or a string that
    is a URI with a scheme that names the engine type to use and further
    options contained in the URI's host, port, and query parameters...

    Which storage backend to use is defined by backend parameter. It
    Which storage backend to use is defined by the backend parameter. It
    can be backend itself, or a dictionary that is passed to
    taskflow.persistence.backends.fetch to obtain backend.
    ``taskflow.persistence.backends.fetch()`` to obtain a viable backend.

    :param flow: flow to load
    :param store: dict -- data to put to storage to satisfy flow requirements
    :param flow_detail: FlowDetail that holds the state of the flow (if one is
        not provided then one will be created for you in the provided backend)
    :param book: LogBook to create flow detail in if flow_detail is None
    :param engine_conf: engine type and configuration configuration
    :param backend: storage backend to use or configuration
    :param namespace: driver namespace for stevedore (default is fine
        if you don't know what is it)
    :param engine_conf: engine type or URI and options (**deprecated**)
    :param backend: storage backend to use or configuration that defines it
    :param namespace: driver namespace for stevedore (or empty for default)
    :param engine: string engine type or URI string with scheme that contains
                   the engine type and any URI specific components that will
                   become part of the engine options.
    :param kwargs: arbitrary keyword arguments passed as options (merged with
                   any extracted ``engine`` and ``engine_conf`` options),
                   typically used for any engine specific options that do not
                   fit as any of the existing arguments.
    :returns: engine
    """

    if engine_conf is None:
        engine_conf = {'engine': 'default'}

    # NOTE(imelnikov): this allows simpler syntax.
    if isinstance(engine_conf, six.string_types):
        engine_conf = {'engine': engine_conf}

    engine_name = engine_conf['engine']
    try:
        pieces = misc.parse_uri(engine_name)
    except (TypeError, ValueError):
        pass
    else:
        engine_name = pieces['scheme']
        engine_conf = misc.merge_uri(pieces, engine_conf.copy())
    kind, options = _extract_engine(engine_conf=engine_conf,
                                    engine=engine, **kwargs)

    if isinstance(backend, dict):
        backend = p_backends.fetch(backend)
@@ -107,15 +173,15 @@ def load(flow, store=None, flow_detail=None, book=None,
        flow_detail = p_utils.create_flow_detail(flow, book=book,
                                                 backend=backend)

    LOG.debug('Looking for %r engine driver in %r', kind, namespace)
    try:
        mgr = stevedore.driver.DriverManager(
            namespace, engine_name,
            namespace, kind,
            invoke_on_load=True,
            invoke_args=(flow, flow_detail, backend, engine_conf),
            invoke_kwds=kwargs)
            invoke_args=(flow, flow_detail, backend, options))
        engine = mgr.driver
    except RuntimeError as e:
        raise exc.NotFound("Could not find engine %s" % (engine_name), e)
        raise exc.NotFound("Could not find engine '%s'" % (kind), e)
    else:
        if store:
            engine.storage.inject(store)
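A brief usage sketch of the reworked ``load()`` (the flow and task here are made up for illustration)::

    from taskflow import engines
    from taskflow import task
    from taskflow.patterns import linear_flow

    class Hello(task.Task):
        def execute(self, x):
            print("hello %s" % x)

    flow = linear_flow.Flow('demo').add(Hello())
    engine = engines.load(flow, engine='serial', store={'x': 1})
    engine.run()
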
@@ -123,35 +189,20 @@ def load(flow, store=None, flow_detail=None, book=None,


def run(flow, store=None, flow_detail=None, book=None,
        engine_conf=None, backend=None, namespace=ENGINES_NAMESPACE, **kwargs):
        engine_conf=None, backend=None, namespace=ENGINES_NAMESPACE,
        engine=ENGINE_DEFAULT, **kwargs):
    """Run the flow.

    This function load the flow into engine (with 'load' function)
    and runs the engine.
    This function loads the flow into an engine (with the :func:`load() <load>`
    function) and runs the engine.

    Which engine to load is specified in 'engine_conf' parameter. It
    can be a string that names engine type or a dictionary which holds
    engine type (with 'engine' key) and additional engine-specific
    configuration.
    The arguments are interpreted as for :func:`load() <load>`.

    Which storage backend to use is defined by backend parameter. It
    can be backend itself, or a dictionary that is passed to
    taskflow.persistence.backends.fetch to obtain backend.

    :param flow: flow to run
    :param store: dict -- data to put to storage to satisfy flow requirements
    :param flow_detail: FlowDetail that holds the state of the flow (if one is
        not provided then one will be created for you in the provided backend)
    :param book: LogBook to create flow detail in if flow_detail is None
    :param engine_conf: engine type and configuration configuration
    :param backend: storage backend to use or configuration
    :param namespace: driver namespace for stevedore (default is fine
        if you don't know what is it)
    :returns: dictionary of all named task results (see Storage.fetch_all)
    :returns: dictionary of all named results (see ``storage.fetch_all()``)
    """
    engine = load(flow, store=store, flow_detail=flow_detail, book=book,
                  engine_conf=engine_conf, backend=backend,
                  namespace=namespace, **kwargs)
                  namespace=namespace, engine=engine, **kwargs)
    engine.run()
    return engine.storage.fetch_all()

@@ -196,23 +247,21 @@ def save_factory_details(flow_detail,

def load_from_factory(flow_factory, factory_args=None, factory_kwargs=None,
                      store=None, book=None, engine_conf=None, backend=None,
                      namespace=ENGINES_NAMESPACE, **kwargs):
                      namespace=ENGINES_NAMESPACE, engine=ENGINE_DEFAULT,
                      **kwargs):
    """Loads a flow from a factory function into an engine.

    Gets flow factory function (or name of it) and creates flow with
    it. Then, flow is loaded into engine with load(), and factory
    function fully qualified name is saved to flow metadata so that
    it can be later resumed with resume.
    it. Then, the flow is loaded into an engine with the :func:`load() <load>`
    function, and the factory function fully qualified name is saved to flow
    metadata so that it can be later resumed.

    :param flow_factory: function or string: function that creates the flow
    :param factory_args: list or tuple of factory positional arguments
    :param factory_kwargs: dict of factory keyword arguments
    :param store: dict -- data to put to storage to satisfy flow requirements
    :param book: LogBook to create flow detail in
    :param engine_conf: engine type and configuration configuration
    :param backend: storage backend to use or configuration
    :param namespace: driver namespace for stevedore (default is fine
        if you don't know what is it)

    Further arguments are interpreted as for :func:`load() <load>`.

    :returns: engine
    """

@@ -230,7 +279,7 @@ def load_from_factory(flow_factory, factory_args=None, factory_kwargs=None,
                                             backend=backend)
    return load(flow=flow, store=store, flow_detail=flow_detail, book=book,
                engine_conf=engine_conf, backend=backend, namespace=namespace,
                **kwargs)
                engine=engine, **kwargs)


def flow_from_detail(flow_detail):
@@ -261,21 +310,21 @@ def flow_from_detail(flow_detail):


def load_from_detail(flow_detail, store=None, engine_conf=None, backend=None,
                     namespace=ENGINES_NAMESPACE, **kwargs):
                     namespace=ENGINES_NAMESPACE, engine=ENGINE_DEFAULT,
                     **kwargs):
    """Reloads an engine previously saved.

    This reloads the flow using the flow_from_detail() function and then calls
    into the load() function to create an engine from that flow.
    This reloads the flow using the
    :func:`flow_from_detail() <flow_from_detail>` function and then calls
    into the :func:`load() <load>` function to create an engine from that flow.

    :param flow_detail: FlowDetail that holds state of the flow to load
    :param store: dict -- data to put to storage to satisfy flow requirements
    :param engine_conf: engine type and configuration configuration
    :param backend: storage backend to use or configuration
    :param namespace: driver namespace for stevedore (default is fine
        if you don't know what is it)

    Further arguments are interpreted as for :func:`load() <load>`.

    :returns: engine
    """
    flow = flow_from_detail(flow_detail)
    return load(flow, flow_detail=flow_detail,
                store=store, engine_conf=engine_conf, backend=backend,
                namespace=namespace, **kwargs)
                namespace=namespace, engine=engine, **kwargs)

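And the factory path, sketched with a hypothetical factory (the factory's fully qualified name ends up in the flow detail metadata, which is what later enables ``load_from_detail()``); ``Hello`` and ``linear_flow`` are as in the earlier sketch::

    def make_flow():
        return linear_flow.Flow('demo').add(Hello())

    engine = engines.load_from_factory(make_flow, store={'x': 1},
                                       engine='serial')
    engine.run()
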
@@ -1,48 +0,0 @@
# -*- coding: utf-8 -*-

# Copyright (C) 2014 Yahoo! Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.

import random

import six

from taskflow.engines.worker_based import protocol as pr
from taskflow.types import cache as base


class RequestsCache(base.ExpiringCache):
    """Represents a thread-safe requests cache."""

    def get_waiting_requests(self, tasks):
        """Get list of waiting requests by tasks."""
        waiting_requests = []
        with self._lock.read_lock():
            for request in six.itervalues(self._data):
                if request.state == pr.WAITING and request.task_cls in tasks:
                    waiting_requests.append(request)
        return waiting_requests


class WorkersCache(base.ExpiringCache):
    """Represents a thread-safe workers cache."""

    def get_topic_by_task(self, task):
        """Get topic for a given task."""
        available_topics = []
        with self._lock.read_lock():
            for topic, tasks in six.iteritems(self._data):
                if task in tasks:
                    available_topics.append(topic)
        return random.choice(available_topics) if available_topics else None
@@ -14,12 +14,11 @@
# License for the specific language governing permissions and limitations
# under the License.

import logging

from kombu import exceptions as kombu_exc
import six

from taskflow import exceptions as excp
from taskflow import logging
from taskflow.utils import kombu_utils as ku

LOG = logging.getLogger(__name__)

@@ -27,31 +26,55 @@ LOG = logging.getLogger(__name__)
class TypeDispatcher(object):
    """Receives messages and dispatches to type specific handlers."""

    def __init__(self, type_handlers):
        self._handlers = dict(type_handlers)
        self._requeue_filters = []
    def __init__(self, type_handlers=None, requeue_filters=None):
        if type_handlers is not None:
            self._type_handlers = dict(type_handlers)
        else:
            self._type_handlers = {}
        if requeue_filters is not None:
            self._requeue_filters = list(requeue_filters)
        else:
            self._requeue_filters = []

    def add_requeue_filter(self, callback):
        """Add a callback that can *request* message requeuing.
    @property
    def type_handlers(self):
        """Dictionary of message type -> callback to handle that message.

        The callback will be activated before the message has been acked and
        it can be used to instruct the dispatcher to requeue the message
        instead of processing it.
        The callback(s) will be activated by looking for a message
        property 'type' and locating a callback in this dictionary that maps
        to that type; if one is found it is expected to be a callback that
        accepts two positional parameters; the first being the message data
        and the second being the message object. If a callback is not found
        then the message is rejected and it will be up to the underlying
        message transport to determine what this means/implies...
        """
        assert six.callable(callback), "Callback must be callable"
        self._requeue_filters.append(callback)
        return self._type_handlers

    @property
    def requeue_filters(self):
        """List of filters (callbacks) to request a message to be requeued.

        The callback(s) will be activated before the message has been acked and
        it can be used to instruct the dispatcher to requeue the message
        instead of processing it. The callback, when called, will be provided
        two positional parameters; the first being the message data and the
        second being the message object. Using these provided parameters the
        filter should return a truthy object if the message should be requeued
        and a falsey object if it should not.
        """
        return self._requeue_filters

    def _collect_requeue_votes(self, data, message):
        # Returns how many of the filters asked for the message to be requeued.
        requeue_votes = 0
        for f in self._requeue_filters:
        for i, cb in enumerate(self._requeue_filters):
            try:
                if f(data, message):
                if cb(data, message):
                    requeue_votes += 1
            except Exception:
                LOG.exception("Failed calling requeue filter to determine"
                              " if message %r should be requeued.",
                              message.delivery_tag)
                LOG.exception("Failed calling requeue filter %s '%s' to"
                              " determine if message %r should be requeued.",
                              i + 1, cb, message.delivery_tag)
        return requeue_votes

    def _requeue_log_error(self, message, errors):
@@ -66,15 +89,15 @@ class TypeDispatcher(object):
            LOG.critical("Couldn't requeue %r, reason:%r",
                         message.delivery_tag, exc, exc_info=True)
        else:
            LOG.debug("AMQP message %r requeued.", message.delivery_tag)
            LOG.debug("Message '%s' was requeued.", ku.DelayedPretty(message))

    def _process_message(self, data, message, message_type):
        handler = self._handlers.get(message_type)
        handler = self._type_handlers.get(message_type)
        if handler is None:
            message.reject_log_error(logger=LOG,
                                     errors=(kombu_exc.MessageStateError,))
            LOG.warning("Unexpected message type: '%s' in message"
                        " %r", message_type, message.delivery_tag)
                        " '%s'", message_type, ku.DelayedPretty(message))
        else:
            if isinstance(handler, (tuple, list)):
                handler, validator = handler
@@ -83,20 +106,23 @@ class TypeDispatcher(object):
                except excp.InvalidFormat as e:
                    message.reject_log_error(
                        logger=LOG, errors=(kombu_exc.MessageStateError,))
                    LOG.warn("Message: %r, '%s' was rejected due to it being"
                    LOG.warn("Message '%s' (%s) was rejected due to it being"
                             " in an invalid format: %s",
                             message.delivery_tag, message_type, e)
                             ku.DelayedPretty(message), message_type, e)
                    return
            message.ack_log_error(logger=LOG,
                                  errors=(kombu_exc.MessageStateError,))
            if message.acknowledged:
                LOG.debug("AMQP message %r acknowledged.",
                          message.delivery_tag)
                LOG.debug("Message '%s' was acknowledged.",
                          ku.DelayedPretty(message))
                handler(data, message)
            else:
                message.reject_log_error(logger=LOG,
                                         errors=(kombu_exc.MessageStateError,))

    def on_message(self, data, message):
        """This method is called on incoming messages."""
        LOG.debug("Got message: %r", message.delivery_tag)
        LOG.debug("Received message '%s'", ku.DelayedPretty(message))
        if self._collect_requeue_votes(data, message):
            self._requeue_log_error(message,
                                    errors=(kombu_exc.MessageStateError,))
@@ -107,6 +133,6 @@ class TypeDispatcher(object):
            message.reject_log_error(
                logger=LOG, errors=(kombu_exc.MessageStateError,))
            LOG.warning("The 'type' message property is missing"
                        " in message %r", message.delivery_tag)
                        " in message '%s'", ku.DelayedPretty(message))
        else:
            self._process_message(data, message, message_type)

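The constructor-driven wiring above might be used like this (the handler body and the ``ready`` flag are illustrative, not from this change)::

    def on_request(data, message):
        # First parameter is the decoded message data, second the raw
        # kombu message object.
        print("request received: %s" % data)

    dispatcher = TypeDispatcher(
        type_handlers={pr.REQUEST: [on_request, pr.Request.validate]},
        # Requeue (instead of processing) until some external flag
        # says we are ready...
        requeue_filters=[lambda data, message: not ready])
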
@@ -14,8 +14,9 @@
# License for the specific language governing permissions and limitations
# under the License.

from oslo_utils import reflection

from taskflow.engines.action_engine import executor
from taskflow.utils import reflection


class Endpoint(object):
@@ -33,18 +34,16 @@ class Endpoint(object):
    def name(self):
        return self._task_cls_name

    def _get_task(self, name=None):
    def generate(self, name=None):
        # NOTE(skudriashev): Note that task is created here with the `name`
        # argument passed to its constructor. This will be a problem when
        # task's constructor requires any other arguments.
        return self._task_cls(name=name)

    def execute(self, task_name, **kwargs):
        task, event, result = self._executor.execute_task(
            self._get_task(task_name), **kwargs).result()
    def execute(self, task, **kwargs):
        event, result = self._executor.execute_task(task, **kwargs).result()
        return result

    def revert(self, task_name, **kwargs):
        task, event, result = self._executor.revert_task(
            self._get_task(task_name), **kwargs).result()
    def revert(self, task, **kwargs):
        event, result = self._executor.revert_task(task, **kwargs).result()
        return result

@@ -16,13 +16,14 @@

from taskflow.engines.action_engine import engine
from taskflow.engines.worker_based import executor
from taskflow.engines.worker_based import protocol as pr
from taskflow import storage as t_storage


class WorkerBasedActionEngine(engine.ActionEngine):
    """Worker based action engine.

    Specific backend configuration:
    Specific backend options (extracted from provided engine options):

    :param exchange: broker exchange exchange name in which executor / worker
                     communication is performed
@@ -30,24 +31,48 @@ class WorkerBasedActionEngine(engine.ActionEngine):
    :param topics: list of workers topics to communicate with (this will also
                   be learned by listening to the notifications that workers
                   emit).
    :keyword transport: transport to be used (e.g. amqp, memory, etc.)
    :keyword transport_options: transport specific options
    :param transport: transport to be used (e.g. amqp, memory, etc.)
    :param transition_timeout: numeric value (or None for infinite) to wait
                               for submitted remote requests to transition out
                               of the (PENDING, WAITING) request states. When
                               expired the associated task the request was made
                               for will have its result become a
                               `RequestTimeout` exception instead of its
                               normally returned value (or raised exception).
    :param transport_options: transport specific options (see:
                              http://kombu.readthedocs.org/ for what these
                              options imply and are expected to be)
    :param retry_options: retry specific options
                          (see: :py:attr:`~.proxy.Proxy.DEFAULT_RETRY_OPTIONS`)
    """

    _storage_factory = t_storage.SingleThreadedStorage

    def _task_executor_factory(self):
        if self._executor is not None:
            return self._executor
        return executor.WorkerTaskExecutor(
            uuid=self._flow_detail.uuid,
            url=self._conf.get('url'),
            exchange=self._conf.get('exchange', 'default'),
            topics=self._conf.get('topics', []),
            transport=self._conf.get('transport'),
            transport_options=self._conf.get('transport_options'))
    def __init__(self, flow, flow_detail, backend, options):
        super(WorkerBasedActionEngine, self).__init__(flow, flow_detail,
                                                      backend, options)
        # This ensures that any provided executor will be validated before
        # we get to far in the compilation/execution pipeline...
        self._task_executor = self._fetch_task_executor(self._options,
                                                        self._flow_detail)

    def __init__(self, flow, flow_detail, backend, conf, **kwargs):
        super(WorkerBasedActionEngine, self).__init__(
            flow, flow_detail, backend, conf)
        self._executor = kwargs.get('executor')
    @classmethod
    def _fetch_task_executor(cls, options, flow_detail):
        try:
            e = options['executor']
            if not isinstance(e, executor.WorkerTaskExecutor):
                raise TypeError("Expected an instance of type '%s' instead of"
                                " type '%s' for 'executor' option"
                                % (executor.WorkerTaskExecutor, type(e)))
            return e
        except KeyError:
            return executor.WorkerTaskExecutor(
                uuid=flow_detail.uuid,
                url=options.get('url'),
                exchange=options.get('exchange', 'default'),
                retry_options=options.get('retry_options'),
                topics=options.get('topics', []),
                transport=options.get('transport'),
                transport_options=options.get('transport_options'),
                transition_timeout=options.get('transition_timeout',
                                               pr.REQUEST_TIMEOUT))

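Put together, these engine options might be supplied when loading a flow (the broker URL, exchange, and topics below are placeholders, and ``flow`` is assumed to exist)::

    engine = engines.load(
        flow,
        engine='worker-based',
        url='amqp://guest:guest@localhost:5672//',
        exchange='test-exchange',
        topics=['worker-1', 'worker-2'],
        transition_timeout=60)
    engine.run()
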
@@ -15,119 +15,94 @@
# under the License.

import functools
import logging
import threading

from oslo_utils import timeutils

from taskflow.engines.action_engine import executor
from taskflow.engines.worker_based import cache
from taskflow.engines.worker_based import protocol as pr
from taskflow.engines.worker_based import proxy
from taskflow.engines.worker_based import types as wt
from taskflow import exceptions as exc
from taskflow.openstack.common import timeutils
from taskflow.types import timing as tt
from taskflow.utils import async_utils
from taskflow import logging
from taskflow import task as task_atom
from taskflow.types import periodic
from taskflow.utils import kombu_utils as ku
from taskflow.utils import misc
from taskflow.utils import reflection
from taskflow.utils import threading_utils as tu

LOG = logging.getLogger(__name__)


def _is_alive(thread):
    if not thread:
        return False
    return thread.is_alive()


class PeriodicWorker(object):
    """Calls a set of functions when activated periodically.

    NOTE(harlowja): the provided timeout object determines the periodicity.
    """
    def __init__(self, timeout, functors):
        self._timeout = timeout
        self._functors = []
        for f in functors:
            self._functors.append((f, reflection.get_callable_name(f)))

    def start(self):
        while not self._timeout.is_stopped():
            for (f, f_name) in self._functors:
                LOG.debug("Calling periodic function '%s'", f_name)
                try:
                    f()
                except Exception:
                    LOG.warn("Failed to call periodic function '%s'", f_name,
                             exc_info=True)
            self._timeout.wait()

    def stop(self):
        self._timeout.interrupt()

    def reset(self):
        self._timeout.reset()


class WorkerTaskExecutor(executor.TaskExecutorBase):
class WorkerTaskExecutor(executor.TaskExecutor):
    """Executes tasks on remote workers."""

    def __init__(self, uuid, exchange, topics, **kwargs):
    def __init__(self, uuid, exchange, topics,
                 transition_timeout=pr.REQUEST_TIMEOUT,
                 url=None, transport=None, transport_options=None,
                 retry_options=None):
        self._uuid = uuid
        self._topics = topics
        self._requests_cache = cache.RequestsCache()
        self._workers_cache = cache.WorkersCache()
        self._workers_arrival = threading.Condition()
        handlers = {
            pr.NOTIFY: [
                self._process_notify,
                functools.partial(pr.Notify.validate, response=True),
            ],
        self._requests_cache = wt.RequestsCache()
        self._transition_timeout = transition_timeout
        type_handlers = {
            pr.RESPONSE: [
                self._process_response,
                pr.Response.validate,
            ],
        }
        self._proxy = proxy.Proxy(uuid, exchange, handlers,
                                  self._on_wait, **kwargs)
        self._proxy_thread = None
        self._periodic = PeriodicWorker(tt.Timeout(pr.NOTIFY_PERIOD),
                                        [self._notify_topics])
        self._periodic_thread = None
        self._proxy = proxy.Proxy(uuid, exchange,
                                  type_handlers=type_handlers,
                                  on_wait=self._on_wait, url=url,
                                  transport=transport,
                                  transport_options=transport_options,
                                  retry_options=retry_options)
        # NOTE(harlowja): This is the most simplest finder impl. that
        # doesn't have external dependencies (outside of what this engine
        # already requires); it though does create periodic 'polling' traffic
        # to workers to 'learn' of the tasks they can perform (and requires
        # pre-existing knowledge of the topics those workers are on to gather
        # and update this information).
        self._finder = wt.ProxyWorkerFinder(uuid, self._proxy, topics)
        self._finder.on_worker = self._on_worker
        self._helpers = tu.ThreadBundle()
        self._helpers.bind(lambda: tu.daemon_thread(self._proxy.start),
                           after_start=lambda t: self._proxy.wait(),
                           before_join=lambda t: self._proxy.stop())
        p_worker = periodic.PeriodicWorker.create([self._finder])
        if p_worker:
            self._helpers.bind(lambda: tu.daemon_thread(p_worker.start),
                               before_join=lambda t: p_worker.stop(),
                               after_join=lambda t: p_worker.reset(),
                               before_start=lambda t: p_worker.reset())

    def _process_notify(self, notify, message):
        """Process notify message from remote side."""
        LOG.debug("Start processing notify message.")
        topic = notify['topic']
        tasks = notify['tasks']

        # add worker info to the cache
        self._workers_arrival.acquire()
        try:
            self._workers_cache[topic] = tasks
            self._workers_arrival.notify_all()
        finally:
            self._workers_arrival.release()

        # publish waiting requests
        for request in self._requests_cache.get_waiting_requests(tasks):
    def _on_worker(self, worker):
        """Process new worker that has arrived (and fire off any work)."""
        for request in self._requests_cache.get_waiting_requests(worker):
            if request.transition_and_log_error(pr.PENDING, logger=LOG):
                self._publish_request(request, topic)
                self._publish_request(request, worker)

    def _process_response(self, response, message):
        """Process response from remote side."""
        LOG.debug("Start processing response message.")
        LOG.debug("Started processing response message '%s'",
                  ku.DelayedPretty(message))
        try:
            task_uuid = message.properties['correlation_id']
        except KeyError:
            LOG.warning("The 'correlation_id' message property is missing.")
            LOG.warning("The 'correlation_id' message property is"
                        " missing in message '%s'",
                        ku.DelayedPretty(message))
        else:
            request = self._requests_cache.get(task_uuid)
            if request is not None:
                response = pr.Response.from_dict(response)
                LOG.debug("Response with state '%s' received for '%s'",
                          response.state, request)
                if response.state == pr.RUNNING:
                    request.transition_and_log_error(pr.RUNNING, logger=LOG)
                elif response.state == pr.PROGRESS:
                    request.on_progress(**response.data)
                elif response.state == pr.EVENT:
                    # Proxy the event + details to the task/request notifier...
                    event_type = response.data['event_type']
                    details = response.data['details']
                    request.notifier.notify(event_type, details)
                elif response.state in (pr.FAILURE, pr.SUCCESS):
                    moved = request.transition_and_log_error(response.state,
                                                             logger=LOG)
@@ -139,10 +114,10 @@ class WorkerTaskExecutor(executor.TaskExecutorBase):
                        del self._requests_cache[request.uuid]
                    request.set_result(**response.data)
                else:
                    LOG.warning("Unexpected response status: '%s'",
                    LOG.warning("Unexpected response status '%s'",
                                response.state)
            else:
                LOG.debug("Request with id='%s' not found.", task_uuid)
                LOG.debug("Request with id='%s' not found", task_uuid)

    @staticmethod
    def _handle_expired_request(request):
@@ -163,67 +138,76 @@ class WorkerTaskExecutor(executor.TaskExecutorBase):
                " seconds for it to transition out of (%s) states"
                % (request, request_age, ", ".join(pr.WAITING_STATES)))
        except exc.RequestTimeout:
            with misc.capture_failure() as fail:
                LOG.debug(fail.exception_str)
                request.set_result(fail)
            with misc.capture_failure() as failure:
                LOG.debug(failure.exception_str)
                request.set_result(failure)

    def _on_wait(self):
        """This function is called cyclically between draining events."""
        self._requests_cache.cleanup(self._handle_expired_request)

    def _submit_task(self, task, task_uuid, action, arguments,
                     progress_callback, timeout=pr.REQUEST_TIMEOUT, **kwargs):
                     progress_callback=None, **kwargs):
        """Submit task request to a worker."""
        request = pr.Request(task, task_uuid, action, arguments,
                             progress_callback, timeout, **kwargs)
                             self._transition_timeout, **kwargs)

        # Get task's topic and publish request if topic was found.
        topic = self._workers_cache.get_topic_by_task(request.task_cls)
        if topic is not None:
        # Register the callback, so that we can proxy the progress correctly.
        if (progress_callback is not None and
                request.notifier.can_be_registered(
                    task_atom.EVENT_UPDATE_PROGRESS)):
            request.notifier.register(task_atom.EVENT_UPDATE_PROGRESS,
                                      progress_callback)
            cleaner = functools.partial(request.notifier.deregister,
                                        task_atom.EVENT_UPDATE_PROGRESS,
                                        progress_callback)
            request.result.add_done_callback(lambda fut: cleaner())

        # Get task's worker and publish request if worker was found.
        worker = self._finder.get_worker_for_task(task)
        if worker is not None:
            # NOTE(skudriashev): Make sure request is set to the PENDING state
            # before putting it into the requests cache to prevent the notify
            # processing thread get list of waiting requests and publish it
            # before it is published here, so it wouldn't be published twice.
            if request.transition_and_log_error(pr.PENDING, logger=LOG):
                self._requests_cache[request.uuid] = request
                self._publish_request(request, topic)
                self._publish_request(request, worker)
        else:
            LOG.debug("Delaying submission of '%s', no currently known"
                      " worker/s available to process it", request)
            self._requests_cache[request.uuid] = request

        return request.result

    def _publish_request(self, request, topic):
    def _publish_request(self, request, worker):
        """Publish request to a given topic."""
        LOG.debug("Submitting execution of '%s' to worker '%s' (expecting"
                  " response identified by reply_to=%s and"
                  " correlation_id=%s)", request, worker, self._uuid,
                  request.uuid)
        try:
            self._proxy.publish(msg=request,
                                routing_key=topic,
            self._proxy.publish(request, worker.topic,
                                reply_to=self._uuid,
                                correlation_id=request.uuid)
        except Exception:
            with misc.capture_failure() as failure:
                LOG.exception("Failed to submit the '%s' request.", request)
                LOG.critical("Failed to submit '%s' (transitioning it to"
                             " %s)", request, pr.FAILURE, exc_info=True)
                if request.transition_and_log_error(pr.FAILURE, logger=LOG):
                    del self._requests_cache[request.uuid]
                    request.set_result(failure)

    def _notify_topics(self):
        """Cyclically called to publish notify message to each topic."""
        self._proxy.publish(pr.Notify(), self._topics, reply_to=self._uuid)

    def execute_task(self, task, task_uuid, arguments,
                     progress_callback=None):
        return self._submit_task(task, task_uuid, pr.EXECUTE, arguments,
                                 progress_callback)
                                 progress_callback=progress_callback)

    def revert_task(self, task, task_uuid, arguments, result, failures,
                    progress_callback=None):
        return self._submit_task(task, task_uuid, pr.REVERT, arguments,
                                 progress_callback, result=result,
                                 failures=failures)

    def wait_for_any(self, fs, timeout=None):
        """Wait for futures returned by this executor to complete."""
        return async_utils.wait_for_any(fs, timeout)
                                 progress_callback=progress_callback,
                                 result=result, failures=failures)

    def wait_for_workers(self, workers=1, timeout=None):
        """Waits for geq workers to notify they are ready to do work.
@@ -234,42 +218,15 @@ class WorkerTaskExecutor(executor.TaskExecutorBase):
        return how many workers are still needed, otherwise it will
        return zero.
        """
        if workers <= 0:
            raise ValueError("Worker amount must be greater than zero")
        w = None
        if timeout is not None:
            w = tt.StopWatch(timeout).start()
        self._workers_arrival.acquire()
        try:
            while len(self._workers_cache) < workers:
                if w is not None and w.expired():
                    return workers - len(self._workers_cache)
                timeout = None
                if w is not None:
                    timeout = w.leftover()
                self._workers_arrival.wait(timeout)
            return 0
        finally:
            self._workers_arrival.release()
        return self._finder.wait_for_workers(workers=workers,
                                             timeout=timeout)

    def start(self):
        """Starts proxy thread and associated topic notification thread."""
        if not _is_alive(self._proxy_thread):
            self._proxy_thread = tu.daemon_thread(self._proxy.start)
            self._proxy_thread.start()
            self._proxy.wait()
        if not _is_alive(self._periodic_thread):
            self._periodic.reset()
            self._periodic_thread = tu.daemon_thread(self._periodic.start)
            self._periodic_thread.start()
        self._helpers.start()

    def stop(self):
        """Stops proxy thread and associated topic notification thread."""
        if self._periodic_thread is not None:
            self._periodic.stop()
            self._periodic_thread.join()
            self._periodic_thread = None
        if self._proxy_thread is not None:
            self._proxy.stop()
            self._proxy_thread.join()
            self._proxy_thread = None
        self._helpers.stop()
        self._requests_cache.clear(self._handle_expired_request)
        self._finder.clear()

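Because worker discovery is now polled via the finder, waiting for at least one worker before submitting work remains a good pattern; a sketch using the constructor signature shown above (uuid, exchange, topics, and broker URL are placeholders)::

    ex = WorkerTaskExecutor(uuid='some-flow-uuid', exchange='test-exchange',
                            topics=['worker-1'],
                            url='amqp://guest:guest@localhost:5672//')
    ex.start()
    missing = ex.wait_for_workers(workers=1, timeout=10)
    if missing:
        print("%s worker(s) have still not announced themselves" % missing)
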
@@ -15,21 +15,21 @@
# under the License.

import abc
import logging
import threading

from concurrent import futures
import jsonschema
from jsonschema import exceptions as schema_exc
from oslo_utils import reflection
from oslo_utils import timeutils
import six

from taskflow.engines.action_engine import executor
from taskflow import exceptions as excp
from taskflow.openstack.common import timeutils
from taskflow import logging
from taskflow.types import failure as ft
from taskflow.types import timing as tt
from taskflow.utils import lock_utils
from taskflow.utils import misc
from taskflow.utils import reflection

# NOTE(skudriashev): This is protocol states and events, which are not
# related to task states.
@@ -38,14 +38,14 @@ PENDING = 'PENDING'
RUNNING = 'RUNNING'
SUCCESS = 'SUCCESS'
FAILURE = 'FAILURE'
PROGRESS = 'PROGRESS'
EVENT = 'EVENT'

# During these states the expiry is active (once out of these states the expiry
# no longer matters, since we have no way of knowing how long a task will run
# for).
WAITING_STATES = (WAITING, PENDING)

_ALL_STATES = (WAITING, PENDING, RUNNING, SUCCESS, FAILURE, PROGRESS)
_ALL_STATES = (WAITING, PENDING, RUNNING, SUCCESS, FAILURE, EVENT)
_STOP_TIMER_STATES = (RUNNING, SUCCESS, FAILURE)

# Transitions that a request state can go through.
@@ -121,12 +121,16 @@ class Message(object):

class Notify(Message):
    """Represents notify message type."""

    #: String constant representing this message type.
    TYPE = NOTIFY

    # NOTE(harlowja): the executor (the entity who initially requests a worker
    # to send back a notification response) schema is different than the
    # worker response schema (that's why there are two schemas here).
    _RESPONSE_SCHEMA = {

    #: Expected notify *response* message schema (in json schema format).
    RESPONSE_SCHEMA = {
        "type": "object",
        'properties': {
            'topic': {
@@ -142,7 +146,9 @@ class Notify(Message):
        "required": ["topic", 'tasks'],
        "additionalProperties": False,
    }
    _SENDER_SCHEMA = {

    #: Expected *sender* request message schema (in json schema format).
    SENDER_SCHEMA = {
        "type": "object",
        "additionalProperties": False,
    }
@@ -156,9 +162,9 @@ class Notify(Message):
    @classmethod
    def validate(cls, data, response):
        if response:
            schema = cls._RESPONSE_SCHEMA
            schema = cls.RESPONSE_SCHEMA
        else:
            schema = cls._SENDER_SCHEMA
            schema = cls.SENDER_SCHEMA
        try:
            jsonschema.validate(data, schema, types=_SCHEMA_TYPES)
        except schema_exc.ValidationError as e:
@@ -180,8 +186,11 @@ class Request(Message):
    states.
    """

    #: String constant representing this message type.
    TYPE = REQUEST
    _SCHEMA = {

    #: Expected message schema (in json schema format).
    SCHEMA = {
        "type": "object",
        'properties': {
            # These two are typically only sent on revert actions (that is
@@ -219,29 +228,36 @@ class Request(Message):
        'required': ['task_cls', 'task_name', 'task_version', 'action'],
    }

    def __init__(self, task, uuid, action, arguments, progress_callback,
                 timeout, **kwargs):
    def __init__(self, task, uuid, action, arguments, timeout, **kwargs):
        self._task = task
        self._task_cls = reflection.get_class_name(task)
        self._uuid = uuid
        self._action = action
        self._event = ACTION_TO_EVENT[action]
        self._arguments = arguments
        self._progress_callback = progress_callback
        self._kwargs = kwargs
        self._watch = tt.StopWatch(duration=timeout).start()
        self._state = WAITING
        self._lock = threading.Lock()
        self._created_on = timeutils.utcnow()
        self.result = futures.Future()
        self._result = futures.Future()
        self._result.atom = task
        self._notifier = task.notifier

    @property
    def result(self):
        return self._result

    @property
    def notifier(self):
        return self._notifier

    @property
    def uuid(self):
        return self._uuid

    @property
    def task_cls(self):
        return self._task_cls
    def task(self):
        return self._task

    @property
    def state(self):
@@ -270,15 +286,19 @@ class Request(Message):
        """Return json-serializable request.

        To convert requests that have failed due to some exception this will
        convert all `misc.Failure` objects into dictionaries (which will then
        be reconstituted by the receiver).
        convert all `failure.Failure` objects into dictionaries (which will
        then be reconstituted by the receiver).
        """
        request = dict(task_cls=self._task_cls, task_name=self._task.name,
                       task_version=self._task.version, action=self._action,
                       arguments=self._arguments)
        request = {
            'task_cls': reflection.get_class_name(self._task),
            'task_name': self._task.name,
            'task_version': self._task.version,
            'action': self._action,
            'arguments': self._arguments,
        }
        if 'result' in self._kwargs:
            result = self._kwargs['result']
            if isinstance(result, misc.Failure):
            if isinstance(result, ft.Failure):
                request['result'] = ('failure', result.to_dict())
            else:
                request['result'] = ('success', result)
@@ -290,10 +310,7 @@ class Request(Message):
        return request

    def set_result(self, result):
        self.result.set_result((self._task, self._event, result))

    def on_progress(self, event_data, progress):
        self._progress_callback(self._task, event_data, progress)
        self.result.set_result((self._event, result))

    def transition_and_log_error(self, new_state, logger=None):
        """Transitions *and* logs an error if that transitioning raises.
@@ -341,7 +358,7 @@ class Request(Message):
    @classmethod
    def validate(cls, data):
        try:
            jsonschema.validate(data, cls._SCHEMA, types=_SCHEMA_TYPES)
            jsonschema.validate(data, cls.SCHEMA, types=_SCHEMA_TYPES)
        except schema_exc.ValidationError as e:
            raise excp.InvalidFormat("%s message response data not of the"
                                     " expected format: %s"
@@ -350,8 +367,12 @@ class Request(Message):

class Response(Message):
    """Represents response message type."""

    #: String constant representing this message type.
    TYPE = RESPONSE
    _SCHEMA = {

    #: Expected message schema (in json schema format).
    SCHEMA = {
        "type": "object",
        'properties': {
            'state': {
@@ -361,7 +382,7 @@ class Response(Message):
            'data': {
                "anyOf": [
                    {
                        "$ref": "#/definitions/progress",
                        "$ref": "#/definitions/event",
                    },
                    {
                        "$ref": "#/definitions/completion",
@@ -375,17 +396,17 @@ class Response(Message):
        "required": ["state", 'data'],
        "additionalProperties": False,
        "definitions": {
            "progress": {
            "event": {
                "type": "object",
                "properties": {
                    'progress': {
                        'type': 'number',
                    'event_type': {
                        'type': 'string',
                    },
                    'event_data': {
                    'details': {
                        'type': 'object',
                    },
                },
                "required": ["progress", 'event_data'],
                "required": ["event_type", 'details'],
                "additionalProperties": False,
            },
            # Used when sending *only* request state changes (and no data is
@@ -417,7 +438,7 @@ class Response(Message):
        state = data['state']
        data = data['data']
        if state == FAILURE and 'result' in data:
            data['result'] = misc.Failure.from_dict(data['result'])
            data['result'] = ft.Failure.from_dict(data['result'])
        return cls(state, **data)

    @property
@@ -434,7 +455,7 @@ class Response(Message):
    @classmethod
    def validate(cls, data):
        try:
            jsonschema.validate(data, cls._SCHEMA, types=_SCHEMA_TYPES)
            jsonschema.validate(data, cls.SCHEMA, types=_SCHEMA_TYPES)
        except schema_exc.ValidationError as e:
            raise excp.InvalidFormat("%s message response data not of the"
                                     " expected format: %s"

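Under the renamed ``event`` definition, an event response now carries an event type and a details object instead of a raw progress number; a message body that validates against the new schema would look like (values illustrative)::

    {
        'state': 'EVENT',
        'data': {
            'event_type': 'update_progress',
            'details': {'progress': 0.5},
        },
    }
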
@@ -14,15 +14,15 @@
|
||||
# License for the specific language governing permissions and limitations
|
||||
# under the License.
|
||||
|
||||
import logging
|
||||
import socket
|
||||
import threading
|
||||
import collections
|
||||
|
||||
import kombu
|
||||
from kombu import exceptions as kombu_exceptions
|
||||
import six
|
||||
|
||||
from taskflow.engines.worker_based import dispatcher
|
||||
from taskflow.utils import misc
|
||||
from taskflow import logging
|
||||
from taskflow.utils import threading_utils
|
||||
|
||||
LOG = logging.getLogger(__name__)
|
||||
|
||||
@@ -30,102 +30,197 @@ LOG = logging.getLogger(__name__)
|
||||
# the socket can get "stuck", and is a best practice for Kombu consumers.
|
||||
DRAIN_EVENTS_PERIOD = 1
|
||||
|
||||
# Helper objects returned when requested to get connection details, used
|
||||
# instead of returning the raw results from the kombu connection objects
|
||||
# themselves so that a person can not mutate those objects (which would be
|
||||
# bad).
|
||||
_ConnectionDetails = collections.namedtuple('_ConnectionDetails',
|
||||
['uri', 'transport'])
|
||||
_TransportDetails = collections.namedtuple('_TransportDetails',
|
||||
['options', 'driver_type',
|
||||
'driver_name', 'driver_version'])
|
||||
|
||||
|
||||
class Proxy(object):
|
||||
"""A proxy processes messages from/to the named exchange."""
|
||||
"""A proxy processes messages from/to the named exchange.
|
||||
|
||||
def __init__(self, topic, exchange_name, type_handlers, on_wait=None,
|
||||
**kwargs):
|
||||
For **internal** usage only (not for public consumption).
|
||||
"""
|
||||
|
||||
DEFAULT_RETRY_OPTIONS = {
|
||||
# The number of seconds we start sleeping for.
|
||||
'interval_start': 1,
|
||||
# How many seconds added to the interval for each retry.
|
||||
'interval_step': 1,
|
||||
# Maximum number of seconds to sleep between each retry.
|
||||
'interval_max': 1,
|
||||
# Maximum number of times to retry.
|
||||
'max_retries': 3,
|
||||
}
|
||||
"""Settings used (by default) to reconnect under transient failures.
|
||||
|
||||
See: http://kombu.readthedocs.org/ (and connection ``ensure_options``) for
|
||||
what these values imply/mean...
|
||||
"""
|
||||
|
||||
# This is the only provided option that should be an int, the others
|
||||
# are allowed to be floats; used when we check that the user-provided
|
||||
# value is valid...
|
||||
_RETRY_INT_OPTS = frozenset(['max_retries'])
|
||||
|
||||
def __init__(self, topic, exchange,
|
||||
type_handlers=None, on_wait=None, url=None,
|
||||
transport=None, transport_options=None,
|
||||
retry_options=None):
|
||||
self._topic = topic
|
||||
self._exchange_name = exchange_name
|
||||
self._exchange_name = exchange
|
||||
self._on_wait = on_wait
|
||||
self._running = threading.Event()
|
||||
self._dispatcher = dispatcher.TypeDispatcher(type_handlers)
|
||||
self._dispatcher.add_requeue_filter(
|
||||
self._running = threading_utils.Event()
|
||||
self._dispatcher = dispatcher.TypeDispatcher(
|
||||
# NOTE(skudriashev): Process all incoming messages only if proxy is
|
||||
# running, otherwise requeue them.
|
||||
lambda data, message: not self.is_running)
|
||||
requeue_filters=[lambda data, message: not self.is_running],
|
||||
type_handlers=type_handlers)
|
||||
|
||||
url = kwargs.get('url')
|
||||
transport = kwargs.get('transport')
|
||||
transport_opts = kwargs.get('transport_options')
|
||||
ensure_options = self.DEFAULT_RETRY_OPTIONS.copy()
|
||||
if retry_options is not None:
|
||||
# Override the defaults with any user provided values...
|
||||
for k in set(six.iterkeys(ensure_options)):
|
||||
if k in retry_options:
|
||||
# Ensure that the right type is passed in...
|
||||
val = retry_options[k]
|
||||
if k in self._RETRY_INT_OPTS:
|
||||
tmp_val = int(val)
|
||||
else:
|
||||
tmp_val = float(val)
|
||||
if tmp_val < 0:
|
||||
raise ValueError("Expected value greater or equal to"
|
||||
" zero for 'retry_options' %s; got"
|
||||
" %s instead" % (k, val))
|
||||
ensure_options[k] = tmp_val
|
||||
self._ensure_options = ensure_options
|
||||
|
||||
self._drain_events_timeout = DRAIN_EVENTS_PERIOD
|
||||
if transport == 'memory' and transport_opts:
|
||||
polling_interval = transport_opts.get('polling_interval')
|
||||
if transport == 'memory' and transport_options:
|
||||
polling_interval = transport_options.get('polling_interval')
|
||||
if polling_interval is not None:
|
||||
self._drain_events_timeout = polling_interval
|
||||
|
||||
# create connection
|
||||
self._conn = kombu.Connection(url, transport=transport,
|
||||
transport_options=transport_opts)
|
||||
transport_options=transport_options)
|
||||
|
||||
# create exchange
|
||||
self._exchange = kombu.Exchange(name=self._exchange_name,
|
||||
durable=False,
|
||||
auto_delete=True)
|
||||
durable=False, auto_delete=True)
|
||||
|
||||
@property
|
||||
def dispatcher(self):
|
||||
"""Dispatcher internally used to dispatch message(s) that match."""
|
||||
return self._dispatcher
|
||||
|
||||
@property
|
||||
def connection_details(self):
|
||||
"""Details about the connection (read-only)."""
|
||||
# The kombu drivers seem to use 'N/A' when they don't have a version...
|
||||
        driver_version = self._conn.transport.driver_version()
        if driver_version and driver_version.lower() == 'n/a':
            driver_version = None
        return misc.AttrDict(
        if self._conn.transport_options:
            transport_options = self._conn.transport_options.copy()
        else:
            transport_options = {}
        transport = _TransportDetails(
            options=transport_options,
            driver_type=self._conn.transport.driver_type,
            driver_name=self._conn.transport.driver_name,
            driver_version=driver_version)
        return _ConnectionDetails(
            uri=self._conn.as_uri(include_password=False),
            transport=misc.AttrDict(
                options=dict(self._conn.transport_options),
                driver_type=self._conn.transport.driver_type,
                driver_name=self._conn.transport.driver_name,
                driver_version=driver_version))
            transport=transport)

    @property
    def is_running(self):
        """Return whether the proxy is running."""
        return self._running.is_set()

    def _make_queue(self, name, exchange, **kwargs):
        """Make named queue for the given exchange."""
        return kombu.Queue(name="%s_%s" % (self._exchange_name, name),
                           exchange=exchange,
                           routing_key=name,
                           durable=False,
                           auto_delete=True,
                           **kwargs)
    def _make_queue(self, routing_key, exchange, channel=None):
        """Make a named queue for the given exchange."""
        queue_name = "%s_%s" % (self._exchange_name, routing_key)
        return kombu.Queue(name=queue_name,
                           routing_key=routing_key, durable=False,
                           exchange=exchange, auto_delete=True,
                           channel=channel)

    def publish(self, msg, routing_key, **kwargs):
    def publish(self, msg, routing_key, reply_to=None, correlation_id=None):
        """Publish message to the named exchange with given routing key."""
        LOG.debug("Sending %s", msg)
        if isinstance(routing_key, six.string_types):
            routing_keys = [routing_key]
        else:
            routing_keys = routing_key
        with kombu.producers[self._conn].acquire(block=True) as producer:
            for routing_key in routing_keys:
                queue = self._make_queue(routing_key, self._exchange)
                producer.publish(body=msg.to_dict(),
                                 routing_key=routing_key,
                                 exchange=self._exchange,
                                 declare=[queue],
                                 type=msg.TYPE,
                                 **kwargs)

        # Filter out any empty keys...
        routing_keys = [r_k for r_k in routing_keys if r_k]
        if not routing_keys:
            LOG.warn("No routing key/s specified; unable to send '%s'"
                     " to any target queue on exchange '%s'", msg,
                     self._exchange_name)
            return

        def _publish(producer, routing_key):
            queue = self._make_queue(routing_key, self._exchange)
            producer.publish(body=msg.to_dict(),
                             routing_key=routing_key,
                             exchange=self._exchange,
                             declare=[queue],
                             type=msg.TYPE,
                             reply_to=reply_to,
                             correlation_id=correlation_id)

        def _publish_errback(exc, interval):
            LOG.exception('Publishing error: %s', exc)
            LOG.info('Retry triggering in %s seconds', interval)

        LOG.debug("Sending '%s' message using routing keys %s",
                  msg, routing_keys)
        with kombu.connections[self._conn].acquire(block=True) as conn:
            with conn.Producer() as producer:
                ensure_kwargs = self._ensure_options.copy()
                ensure_kwargs['errback'] = _publish_errback
                safe_publish = conn.ensure(producer, _publish, **ensure_kwargs)
                for routing_key in routing_keys:
                    safe_publish(producer, routing_key)

    def start(self):
        """Start proxy."""

        def _drain(conn, timeout):
            try:
                conn.drain_events(timeout=timeout)
            except kombu_exceptions.TimeoutError:
                pass

        def _drain_errback(exc, interval):
            LOG.exception('Draining error: %s', exc)
            LOG.info('Retry triggering in %s seconds', interval)

        LOG.info("Starting to consume from the '%s' exchange.",
                 self._exchange_name)
        with kombu.connections[self._conn].acquire(block=True) as conn:
            queue = self._make_queue(self._topic, self._exchange, channel=conn)
            with conn.Consumer(queues=queue,
                               callbacks=[self._dispatcher.on_message]):
            callbacks = [self._dispatcher.on_message]
            with conn.Consumer(queues=queue, callbacks=callbacks) as consumer:
                ensure_kwargs = self._ensure_options.copy()
                ensure_kwargs['errback'] = _drain_errback
                safe_drain = conn.ensure(consumer, _drain, **ensure_kwargs)
                self._running.set()
                while self.is_running:
                    try:
                        conn.drain_events(timeout=self._drain_events_timeout)
                    except socket.timeout:
                        pass
                    if self._on_wait is not None:
                        self._on_wait()
                try:
                    while self._running.is_set():
                        safe_drain(conn, self._drain_events_timeout)
                        if self._on_wait is not None:
                            self._on_wait()
                finally:
                    self._running.clear()

    def wait(self):
        """Wait until proxy is started."""

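The publish and consume paths above lean on kombu's connection pools and its
ensure() retry wrapper. A minimal standalone sketch of that pattern, using
kombu's in-memory transport (the exchange/queue names here are placeholders
for illustration, not taskflow's):

    import kombu

    def errback(exc, interval):
        # kombu calls this before sleeping between retries.
        print("Publish failed (%s); retrying in %s seconds" % (exc, interval))

    conn = kombu.Connection("memory://")
    exchange = kombu.Exchange("demo", type="direct")
    queue = kombu.Queue("demo", exchange, routing_key="demo")
    with kombu.connections[conn].acquire(block=True) as c:
        with c.Producer() as producer:
            # Wrap publish so broken connections/channels are revived and
            # the call is retried (same idea as safe_publish above).
            safe_publish = c.ensure(producer, producer.publish,
                                    errback=errback, max_retries=3)
            safe_publish({"hello": "world"}, exchange=exchange,
                         routing_key="demo", declare=[queue])
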
@@ -15,12 +15,15 @@
# under the License.

import functools
import logging

import six

from taskflow.engines.worker_based import protocol as pr
from taskflow.engines.worker_based import proxy
from taskflow import logging
from taskflow.types import failure as ft
from taskflow.types import notifier as nt
from taskflow.utils import kombu_utils as ku
from taskflow.utils import misc

LOG = logging.getLogger(__name__)
@@ -43,8 +46,10 @@ def delayed(executor):
class Server(object):
    """Server implementation that waits for incoming task requests."""

    def __init__(self, topic, exchange, executor, endpoints, **kwargs):
        handlers = {
    def __init__(self, topic, exchange, executor, endpoints,
                 url=None, transport=None, transport_options=None,
                 retry_options=None):
        type_handlers = {
            pr.NOTIFY: [
                delayed(executor)(self._process_notify),
                functools.partial(pr.Notify.validate, response=False),
@@ -54,10 +59,12 @@ class Server(object):
                pr.Request.validate,
            ],
        }
        self._proxy = proxy.Proxy(topic, exchange, handlers,
                                  on_wait=None, **kwargs)
        self._proxy = proxy.Proxy(topic, exchange,
                                  type_handlers=type_handlers,
                                  url=url, transport=transport,
                                  transport_options=transport_options,
                                  retry_options=retry_options)
        self._topic = topic
        self._executor = executor
        self._endpoints = dict([(endpoint.name, endpoint)
                                for endpoint in endpoints])

@@ -70,21 +77,26 @@ class Server(object):
                       failures=None, **kwargs):
        """Parse request before it can be further processed.

        All `misc.Failure` objects that have been converted to dict on the
        remote side will now be converted back to `misc.Failure` objects.
        All `failure.Failure` objects that have been converted to dict on the
        remote side will now be converted back to `failure.Failure` objects.
        """
        action_args = dict(arguments=arguments, task_name=task_name)
        # These arguments will eventually be given to the task executor
        # so they need to be in a format it will accept (and using keyword
        # argument names that it accepts)...
        arguments = {
            'arguments': arguments,
        }
        if result is not None:
            data_type, data = result
            if data_type == 'failure':
                action_args['result'] = misc.Failure.from_dict(data)
                arguments['result'] = ft.Failure.from_dict(data)
            else:
                action_args['result'] = data
                arguments['result'] = data
        if failures is not None:
            action_args['failures'] = {}
            for k, v in failures.items():
                action_args['failures'][k] = misc.Failure.from_dict(v)
        return task_cls, action, action_args
            arguments['failures'] = {}
            for key, data in six.iteritems(failures):
                arguments['failures'][key] = ft.Failure.from_dict(data)
        return (task_cls, task_name, action, arguments)

    @staticmethod
    def _parse_message(message):
@@ -100,62 +112,84 @@ class Server(object):
            except KeyError:
                raise ValueError("The '%s' message property is missing" %
                                 prop)

        return properties

    def _reply(self, reply_to, task_uuid, state=pr.FAILURE, **kwargs):
        """Send reply to the `reply_to` queue."""
    def _reply(self, capture, reply_to, task_uuid, state=pr.FAILURE, **kwargs):
        """Send a reply to the `reply_to` queue with the given information.

        Can capture failures to publish and if capturing will log associated
        critical errors on behalf of the caller, and then returns whether the
        publish worked out or did not.
        """
        response = pr.Response(state, **kwargs)
        published = False
        try:
            self._proxy.publish(response, reply_to, correlation_id=task_uuid)
            published = True
        except Exception:
            LOG.exception("Failed to send reply")
            if not capture:
                raise
            LOG.critical("Failed to send reply to '%s' for task '%s' with"
                         " response %s", reply_to, task_uuid, response,
                         exc_info=True)
        return published

    def _on_update_progress(self, reply_to, task_uuid, task, event_data,
                            progress):
        """Send task update progress notification."""
        self._reply(reply_to, task_uuid, pr.PROGRESS, event_data=event_data,
                    progress=progress)
    def _on_event(self, reply_to, task_uuid, event_type, details):
        """Send out a task event notification."""
        # NOTE(harlowja): the executor that will trigger this using the
        # task notification/listener mechanism will handle logging if this
        # fails, so that's why capture=False is used here.
        self._reply(False, reply_to, task_uuid, pr.EVENT,
                    event_type=event_type, details=details)

    def _process_notify(self, notify, message):
        """Process notify message and reply back."""
        LOG.debug("Start processing notify message.")
        LOG.debug("Started processing notify message '%s'",
                  ku.DelayedPretty(message))
        try:
            reply_to = message.properties['reply_to']
        except Exception:
            LOG.exception("The 'reply_to' message property is missing.")
        except KeyError:
            LOG.warn("The 'reply_to' message property is missing"
                     " in received notify message '%s'",
                     ku.DelayedPretty(message), exc_info=True)
        else:
            self._proxy.publish(
                msg=pr.Notify(topic=self._topic, tasks=self._endpoints.keys()),
                routing_key=reply_to
            )
            response = pr.Notify(topic=self._topic,
                                 tasks=self._endpoints.keys())
            try:
                self._proxy.publish(response, routing_key=reply_to)
            except Exception:
                LOG.critical("Failed to send reply to '%s' with notify"
                             " response '%s'", reply_to, response,
                             exc_info=True)

    def _process_request(self, request, message):
        """Process request message and reply back."""
        # NOTE(skudriashev): parse broker message first to get the `reply_to`
        # and the `task_uuid` parameters to have the possibility to reply back.
        LOG.debug("Start processing request message.")
        LOG.debug("Started processing request message '%s'",
                  ku.DelayedPretty(message))
        try:
            # NOTE(skudriashev): parse broker message first to get
            # the `reply_to` and the `task_uuid` parameters to have
            # the possibility to reply back (if we can't parse, we can't
            # respond in the first place...).
            reply_to, task_uuid = self._parse_message(message)
        except ValueError:
            LOG.exception("Failed to parse broker message")
            LOG.warn("Failed to parse request attributes from message '%s'",
                     ku.DelayedPretty(message), exc_info=True)
            return
        else:
            # prepare task progress callback
            progress_callback = functools.partial(
                self._on_update_progress, reply_to, task_uuid)
            # prepare reply callback
            reply_callback = functools.partial(
                self._reply, reply_to, task_uuid)
            reply_callback = functools.partial(self._reply, True, reply_to,
                                               task_uuid)

        # parse request to get task name, action and action arguments
        try:
            task_cls, action, action_args = self._parse_request(**request)
            action_args.update(task_uuid=task_uuid,
                               progress_callback=progress_callback)
            bundle = self._parse_request(**request)
            task_cls, task_name, action, arguments = bundle
            arguments['task_uuid'] = task_uuid
        except ValueError:
            with misc.capture_failure() as failure:
                LOG.exception("Failed to parse request")
                LOG.warn("Failed to parse request contents from message '%s'",
                         ku.DelayedPretty(message), exc_info=True)
                reply_callback(result=failure.to_dict())
                return

@@ -164,22 +198,61 @@ class Server(object):
            endpoint = self._endpoints[task_cls]
        except KeyError:
            with misc.capture_failure() as failure:
                LOG.exception("The '%s' task endpoint does not exist",
                              task_cls)
                LOG.warn("The '%s' task endpoint does not exist, unable"
                         " to continue processing request message '%s'",
                         task_cls, ku.DelayedPretty(message), exc_info=True)
                reply_callback(result=failure.to_dict())
                return
        else:
            reply_callback(state=pr.RUNNING)
            try:
                handler = getattr(endpoint, action)
            except AttributeError:
                with misc.capture_failure() as failure:
                    LOG.warn("The '%s' handler does not exist on task endpoint"
                             " '%s', unable to continue processing request"
                             " message '%s'", action, endpoint,
                             ku.DelayedPretty(message), exc_info=True)
                    reply_callback(result=failure.to_dict())
                    return
            else:
                try:
                    task = endpoint.generate(name=task_name)
                except Exception:
                    with misc.capture_failure() as failure:
                        LOG.warn("The '%s' task '%s' generation for request"
                                 " message '%s' failed", endpoint, action,
                                 ku.DelayedPretty(message), exc_info=True)
                        reply_callback(result=failure.to_dict())
                        return
                else:
                    if not reply_callback(state=pr.RUNNING):
                        return

        # perform task action
        # associate *any* events this task emits with a proxy that will
        # emit them back to the engine... for handling at the engine side
        # of things...
        if task.notifier.can_be_registered(nt.Notifier.ANY):
            task.notifier.register(nt.Notifier.ANY,
                                   functools.partial(self._on_event,
                                                     reply_to, task_uuid))
        elif isinstance(task.notifier, nt.RestrictedNotifier):
            # only proxy the allowable events then...
            for event_type in task.notifier.events_iter():
                task.notifier.register(event_type,
                                       functools.partial(self._on_event,
                                                         reply_to, task_uuid))

        # perform the task action
        try:
            result = getattr(endpoint, action)(**action_args)
            result = handler(task, **arguments)
        except Exception:
            with misc.capture_failure() as failure:
                LOG.exception("The %s task execution failed", endpoint)
                LOG.warn("The '%s' endpoint '%s' execution for request"
                         " message '%s' failed", endpoint, action,
                         ku.DelayedPretty(message), exc_info=True)
                reply_callback(result=failure.to_dict())
        else:
            if isinstance(result, misc.Failure):
            if isinstance(result, ft.Failure):
                reply_callback(result=result.to_dict())
            else:
                reply_callback(state=pr.SUCCESS, result=result)

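The request parsing above round-trips failures through plain dictionaries so
they can cross the message bus. A minimal sketch of that round-trip, using
the same failure type this diff imports:

    from taskflow.types import failure as ft

    try:
        raise RuntimeError("boom")
    except RuntimeError:
        fail = ft.Failure()  # captures the active exception info

    data = fail.to_dict()  # json-friendly form, safe to publish
    rebuilt = ft.Failure.from_dict(data)  # what _parse_request() does
    print(rebuilt.exception_str)  # -> boom
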
234
taskflow/engines/worker_based/types.py
Normal file
@@ -0,0 +1,234 @@
# -*- coding: utf-8 -*-

# Copyright (C) 2014 Yahoo! Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.

import abc
import functools
import itertools
import random
import threading

from oslo_utils import reflection
import six

from taskflow.engines.worker_based import protocol as pr
from taskflow import logging
from taskflow.types import cache as base
from taskflow.types import periodic
from taskflow.types import timing as tt
from taskflow.utils import kombu_utils as ku

LOG = logging.getLogger(__name__)


class RequestsCache(base.ExpiringCache):
    """Represents a thread-safe requests cache."""

    def get_waiting_requests(self, worker):
        """Get list of waiting requests that the given worker can satisfy."""
        waiting_requests = []
        with self._lock:
            for request in six.itervalues(self._data):
                if request.state == pr.WAITING \
                        and worker.performs(request.task):
                    waiting_requests.append(request)
        return waiting_requests


# TODO(harlowja): this needs to be made better, once
# https://blueprints.launchpad.net/taskflow/+spec/wbe-worker-info is finally
# implemented we can go about using that instead.
class TopicWorker(object):
    """A (read-only) worker and its relevant information + useful methods."""

    _NO_IDENTITY = object()

    def __init__(self, topic, tasks, identity=_NO_IDENTITY):
        self.tasks = []
        for task in tasks:
            if not isinstance(task, six.string_types):
                task = reflection.get_class_name(task)
            self.tasks.append(task)
        self.topic = topic
        self.identity = identity

    def performs(self, task):
        if not isinstance(task, six.string_types):
            task = reflection.get_class_name(task)
        return task in self.tasks

    def __eq__(self, other):
        if not isinstance(other, TopicWorker):
            return NotImplemented
        if len(other.tasks) != len(self.tasks):
            return False
        if other.topic != self.topic:
            return False
        for task in other.tasks:
            if not self.performs(task):
                return False
        # If one of the identities equals _NO_IDENTITY, then allow it to
        # match...
        if self._NO_IDENTITY in (self.identity, other.identity):
            return True
        else:
            return other.identity == self.identity

    def __repr__(self):
        r = reflection.get_class_name(self, fully_qualified=False)
        if self.identity is not self._NO_IDENTITY:
            r += "(identity=%s, tasks=%s, topic=%s)" % (self.identity,
                                                        self.tasks, self.topic)
        else:
            r += "(identity=*, tasks=%s, topic=%s)" % (self.tasks, self.topic)
        return r


@six.add_metaclass(abc.ABCMeta)
class WorkerFinder(object):
    """Base class for worker finders..."""

    def __init__(self):
        self._cond = threading.Condition()
        self.on_worker = None

    @abc.abstractmethod
    def _total_workers(self):
        """Returns how many workers are known."""

    def wait_for_workers(self, workers=1, timeout=None):
        """Waits for geq workers to notify they are ready to do work.

        NOTE(harlowja): if a timeout is provided this function will wait
        until that timeout expires, if the amount of workers does not reach
        the desired amount of workers before the timeout expires then this
        will return how many workers are still needed, otherwise it will
        return zero.
        """
        if workers <= 0:
            raise ValueError("Worker amount must be greater than zero")
        watch = tt.StopWatch(duration=timeout)
        watch.start()
        with self._cond:
            while self._total_workers() < workers:
                if watch.expired():
                    return max(0, workers - self._total_workers())
                self._cond.wait(watch.leftover(return_none=True))
            return 0

    @staticmethod
    def _match_worker(task, available_workers):
        """Select a worker (from geq 1 workers) that can best perform the task.

        NOTE(harlowja): this method will be activated when there exists
        one or more potential workers that can perform a task, the
        arguments provided will be the potential workers located and the
        task that is being requested to perform and the result should be one
        of those workers using whatever best-fit algorithm is possible (or
        random at the least).
        """
        if len(available_workers) == 1:
            return available_workers[0]
        else:
            return random.choice(available_workers)

    @abc.abstractmethod
    def get_worker_for_task(self, task):
        """Gets a worker that can perform a given task."""

    def clear(self):
        pass


class ProxyWorkerFinder(WorkerFinder):
    """Requests and receives responses about workers topic+task details."""

    def __init__(self, uuid, proxy, topics):
        super(ProxyWorkerFinder, self).__init__()
        self._proxy = proxy
        self._topics = topics
        self._workers = {}
        self._uuid = uuid
        self._proxy.dispatcher.type_handlers.update({
            pr.NOTIFY: [
                self._process_response,
                functools.partial(pr.Notify.validate, response=True),
            ],
        })
        self._counter = itertools.count()

    def _next_worker(self, topic, tasks, temporary=False):
        if not temporary:
            return TopicWorker(topic, tasks,
                               identity=six.next(self._counter))
        else:
            return TopicWorker(topic, tasks)

    @periodic.periodic(pr.NOTIFY_PERIOD)
    def beat(self):
        """Cyclically called to publish notify message to each topic."""
        self._proxy.publish(pr.Notify(), self._topics, reply_to=self._uuid)

    def _total_workers(self):
        return len(self._workers)

    def _add(self, topic, tasks):
        """Adds/updates a worker for the topic for the given tasks."""
        try:
            worker = self._workers[topic]
            # Check if we already have an equivalent worker, if so just
            # return it...
            if worker == self._next_worker(topic, tasks, temporary=True):
                return (worker, False)
            # This *fall through* is done so that if someone is using an
            # active worker object that already exists that we just create
            # a new one; so that the existing object doesn't get
            # affected (workers objects are supposed to be immutable).
        except KeyError:
            pass
        worker = self._next_worker(topic, tasks)
        self._workers[topic] = worker
        return (worker, True)

    def _process_response(self, response, message):
        """Process notify message from remote side."""
        LOG.debug("Started processing notify message '%s'",
                  ku.DelayedPretty(message))
        topic = response['topic']
        tasks = response['tasks']
        with self._cond:
            worker, new_or_updated = self._add(topic, tasks)
            if new_or_updated:
                LOG.debug("Received notification about worker '%s' (%s"
                          " total workers are currently known)", worker,
                          self._total_workers())
                self._cond.notify_all()
        if self.on_worker is not None and new_or_updated:
            self.on_worker(worker)

    def clear(self):
        with self._cond:
            self._workers.clear()
            self._cond.notify_all()

    def get_worker_for_task(self, task):
        available_workers = []
        with self._cond:
            for worker in six.itervalues(self._workers):
                if worker.performs(task):
                    available_workers.append(worker)
        if available_workers:
            return self._match_worker(task, available_workers)
        else:
            return None

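A sketch of how wait_for_workers() behaves, using a trivial hypothetical
WorkerFinder subclass with a fixed worker count (real finders grow the count
as notify responses arrive):

    class StaticFinder(WorkerFinder):  # illustration only, not in taskflow
        def __init__(self, count):
            super(StaticFinder, self).__init__()
            self._count = count

        def _total_workers(self):
            return self._count

        def get_worker_for_task(self, task):
            return None

    f = StaticFinder(2)
    print(f.wait_for_workers(workers=1, timeout=0.1))  # -> 0 (enough known)
    print(f.wait_for_workers(workers=5, timeout=0.1))  # -> 3 (still needed)
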
@@ -14,19 +14,20 @@
# License for the specific language governing permissions and limitations
# under the License.

import logging
import os
import platform
import socket
import string
import sys

from concurrent import futures
from oslo_utils import reflection

from taskflow.engines.worker_based import endpoint
from taskflow.engines.worker_based import server
from taskflow import logging
from taskflow import task as t_task
from taskflow.utils import reflection
from taskflow.types import futures
from taskflow.utils import misc
from taskflow.utils import threading_utils as tu
from taskflow import version

@@ -69,46 +70,35 @@ class Worker(object):
    :param url: broker url
    :param exchange: broker exchange name
    :param topic: topic name under which the worker is started
    :param tasks: tasks list that the worker is capable of performing

    Tasks list items can be one of the following types:
    1. String:

        1.1 Python module name:

            > tasks=['taskflow.tests.utils']

        1.2. Task class (BaseTask subclass) name:

            > tasks=['taskflow.test.utils.DummyTask']

    2. Python module:

        > from taskflow.tests import utils
        > tasks=[utils]

    3. Task class (BaseTask subclass):

        > from taskflow.tests import utils
        > tasks=[utils.DummyTask]

    :param executor: custom executor object that is used for processing
                     requests in separate threads
    :keyword threads_count: threads count to be passed to the default executor
    :keyword transport: transport to be used (e.g. amqp, memory, etc.)
    :keyword transport_options: transport specific options
    :param tasks: task list that the worker is capable of performing, items
        in the list can be one of the following types; 1, a string naming the
        python module name to search for tasks in or the task class name; 2, a
        python module to search for tasks in; 3, a task class object that
        will be used to create tasks from.
    :param executor: custom executor object that can be used for processing
        requests in separate threads (if not provided one will be created)
    :param threads_count: threads count to be passed to the
                          default executor (used only if an executor is not
                          passed in)
    :param transport: transport to be used (e.g. amqp, memory, etc.)
    :param transport_options: transport specific options (see:
                              http://kombu.readthedocs.org/ for what these
                              options imply and are expected to be)
    :param retry_options: retry specific options
                          (see: :py:attr:`~.proxy.Proxy.DEFAULT_RETRY_OPTIONS`)
    """

    def __init__(self, exchange, topic, tasks, executor=None, **kwargs):
    def __init__(self, exchange, topic, tasks,
                 executor=None, threads_count=None, url=None,
                 transport=None, transport_options=None,
                 retry_options=None):
        self._topic = topic
        self._executor = executor
        self._owns_executor = False
        self._threads_count = -1
        if self._executor is None:
            if 'threads_count' in kwargs:
                self._threads_count = int(kwargs.pop('threads_count'))
                if self._threads_count <= 0:
                    raise ValueError("threads_count provided must be > 0")
            if threads_count is not None:
                self._threads_count = int(threads_count)
            else:
                self._threads_count = tu.get_optimal_thread_count()
            self._executor = futures.ThreadPoolExecutor(self._threads_count)
@@ -116,12 +106,15 @@ class Worker(object):
        self._endpoints = self._derive_endpoints(tasks)
        self._exchange = exchange
        self._server = server.Server(topic, exchange, self._executor,
                                     self._endpoints, **kwargs)
                                     self._endpoints, url=url,
                                     transport=transport,
                                     transport_options=transport_options,
                                     retry_options=retry_options)

    @staticmethod
    def _derive_endpoints(tasks):
        """Derive endpoints from list of strings, classes or packages."""
        derived_tasks = reflection.find_subclasses(tasks, t_task.BaseTask)
        derived_tasks = misc.find_subclasses(tasks, t_task.BaseTask)
        return [endpoint.Endpoint(task) for task in derived_tasks]

    def _generate_banner(self):
@@ -158,14 +151,23 @@ class Worker(object):
            pass
        tpl_params['platform'] = platform.platform()
        tpl_params['thread_id'] = tu.get_ident()
        return BANNER_TEMPLATE.substitute(BANNER_TEMPLATE.defaults,
                                          **tpl_params)
        banner = BANNER_TEMPLATE.substitute(BANNER_TEMPLATE.defaults,
                                            **tpl_params)
        # NOTE(harlowja): this is needed since the template in this file
        # will always have newlines that end with '\n' (even on different
        # platforms due to the way this source file is encoded) so we have
        # to do this little dance to make it platform neutral...
        return misc.fix_newlines(banner)

    def run(self, display_banner=True):
    def run(self, display_banner=True, banner_writer=None):
        """Runs the worker."""
        if display_banner:
            for line in self._generate_banner().splitlines():
                LOG.info(line)
            banner = self._generate_banner()
            if banner_writer is None:
                for line in banner.splitlines():
                    LOG.info(line)
            else:
                banner_writer(banner)
        self._server.start()

    def wait(self):

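Putting the new keyword-only construction to use; a minimal sketch (the
exchange/topic names and the memory:// transport url are placeholders, a
real deployment would point at a broker such as rabbitmq):

    import threading

    from taskflow.engines.worker_based import worker as wbe_worker

    w = wbe_worker.Worker("test-exchange", "test-topic",
                          ["taskflow.tests.utils"],  # module scanned for tasks
                          url="memory://", threads_count=2)
    t = threading.Thread(target=w.run)  # run() blocks, so run it in a thread
    t.daemon = True
    t.start()
    w.wait()  # assumed (per wait() above) to block until the worker is up
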
93
taskflow/examples/alphabet_soup.py
Normal file
@@ -0,0 +1,93 @@
# -*- coding: utf-8 -*-

# Copyright (C) 2014 Yahoo! Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.

import fractions
import functools
import logging
import os
import string
import sys
import time

logging.basicConfig(level=logging.ERROR)

self_dir = os.path.abspath(os.path.dirname(__file__))
top_dir = os.path.abspath(os.path.join(os.path.dirname(__file__),
                                       os.pardir,
                                       os.pardir))
sys.path.insert(0, top_dir)
sys.path.insert(0, self_dir)

from taskflow import engines
from taskflow import exceptions
from taskflow.patterns import linear_flow
from taskflow import task


# In this example we show how a simple linear set of tasks can be executed
# using local processes (and not threads or remote workers) with minimal (if
# any) modification to those tasks to make them safe to run in this mode.
#
# This is useful since it allows further scaling up your workflows when thread
# execution starts to become a bottleneck (which it can start to be due to the
# GIL in python). It also offers an intermediary scalable runner that can be
# used when the scale and/or setup of remote workers is not desirable.


def progress_printer(task, event_type, details):
    # This callback, attached to each task will be called in the local
    # process (not the child processes)...
    progress = details.pop('progress')
    progress = int(progress * 100.0)
    print("Task '%s' reached %d%% completion" % (task.name, progress))


class AlphabetTask(task.Task):
    # Second delay between each progress part.
    _DELAY = 0.1

    # This task will run in X main stages (each with a different progress
    # report that will be delivered back to the running process...). The
    # initial 0% and 100% are triggered automatically by the engine when
    # a task is started and finished (so that's why those are not emitted
    # here).
    _PROGRESS_PARTS = [fractions.Fraction("%s/5" % x) for x in range(1, 5)]

    def execute(self):
        for p in self._PROGRESS_PARTS:
            self.update_progress(p)
            time.sleep(self._DELAY)


print("Constructing...")
soup = linear_flow.Flow("alphabet-soup")
for letter in string.ascii_lowercase:
    abc = AlphabetTask(letter)
    abc.notifier.register(task.EVENT_UPDATE_PROGRESS,
                          functools.partial(progress_printer, abc))
    soup.add(abc)
try:
    print("Loading...")
    e = engines.load(soup, engine='parallel', executor='processes')
    print("Compiling...")
    e.compile()
    print("Preparing...")
    e.prepare()
    print("Running...")
    e.run()
    print("Done...")
except exceptions.NotImplementedError as e:
    print(e)

@@ -31,6 +31,9 @@ import taskflow.engines
from taskflow.patterns import graph_flow as gf
from taskflow.patterns import linear_flow as lf
from taskflow import task
from taskflow.types import notifier

ANY = notifier.Notifier.ANY

import example_utils as eu  # noqa

@@ -160,11 +163,11 @@ spec = {

engine = taskflow.engines.load(flow, store={'spec': spec.copy()})

# This registers all (*) state transitions to trigger a call to the flow_watch
# function for flow state transitions, and registers the same all (*) state
# transitions for task state transitions.
engine.notifier.register('*', flow_watch)
engine.task_notifier.register('*', task_watch)
# This registers all (ANY) state transitions to trigger a call to the
# flow_watch function for flow state transitions, and registers the
# same all (ANY) state transitions for task state transitions.
engine.notifier.register(ANY, flow_watch)
engine.task_notifier.register(ANY, task_watch)

eu.print_wrapped("Building a car")
engine.run()
@@ -176,8 +179,8 @@ engine.run()
spec['doors'] = 5

engine = taskflow.engines.load(flow, store={'spec': spec.copy()})
engine.notifier.register('*', flow_watch)
engine.task_notifier.register('*', task_watch)
engine.notifier.register(ANY, flow_watch)
engine.task_notifier.register(ANY, task_watch)

eu.print_wrapped("Building a wrong car that doesn't match specification")
try:

@@ -93,5 +93,5 @@ flow = lf.Flow('root').add(
# The result here will be all results (from all tasks) which is stored in an
# in-memory storage location that backs this engine since it is not configured
# with persistence storage.
result = taskflow.engines.run(flow, engine_conf='parallel')
result = taskflow.engines.run(flow, engine='parallel')
print(result)

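The change just above is part of a pattern repeated throughout these
examples: the engine_conf dictionary/string is retired in favor of explicit
keyword arguments. A before/after sketch (flow construction elided):

    import taskflow.engines

    # 0.6.x style (removed):
    #   result = taskflow.engines.run(flow, engine_conf='parallel')
    # 0.7.0 style:
    result = taskflow.engines.run(flow, engine='parallel')
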
@@ -28,11 +28,12 @@ top_dir = os.path.abspath(os.path.join(os.path.dirname(__file__),
                                       os.pardir))
sys.path.insert(0, top_dir)

from oslo_utils import reflection

from taskflow import engines
from taskflow.listeners import printing
from taskflow.patterns import unordered_flow as uf
from taskflow import task
from taskflow.utils import reflection

# INTRO: This example shows how unordered_flow can be used to create a large
# number of fake volumes in parallel (or serially, depending on a constant that
@@ -64,13 +65,9 @@ VOLUME_COUNT = 5
# time difference that this causes.
SERIAL = False
if SERIAL:
    engine_conf = {
        'engine': 'serial',
    }
    engine = 'serial'
else:
    engine_conf = {
        'engine': 'parallel',
    }
    engine = 'parallel'


class VolumeCreator(task.Task):
@@ -106,7 +103,7 @@ for i in range(0, VOLUME_COUNT):

# Show how much time the overall engine loading and running takes.
with show_time(name=flow.name.title()):
    eng = engines.load(flow, engine_conf=engine_conf)
    eng = engines.load(flow, engine=engine)
    # This context manager automatically adds (and automatically removes) a
    # helpful set of state transition notification printing helper utilities
    # that show you exactly what transitions the engine is going through

@@ -39,14 +39,14 @@ from taskflow.listeners import base
from taskflow.patterns import linear_flow as lf
from taskflow import states
from taskflow import task
from taskflow.utils import misc
from taskflow.types import notifier


class PokeFutureListener(base.ListenerBase):
    def __init__(self, engine, future, task_name):
        super(PokeFutureListener, self).__init__(
            engine,
            task_listen_for=(misc.Notifier.ANY,),
            task_listen_for=(notifier.Notifier.ANY,),
            flow_listen_for=[])
        self._future = future
        self._task_name = task_name
@@ -74,7 +74,7 @@ class Bye(task.Task):

def return_from_flow(pool):
    wf = lf.Flow("root").add(Hi("hi"), Bye("bye"))
    eng = taskflow.engines.load(wf, engine_conf='serial')
    eng = taskflow.engines.load(wf, engine='serial')
    f = futures.Future()
    watcher = PokeFutureListener(eng, f, 'hi')
    watcher.register()

56
taskflow/examples/echo_listener.py
Normal file
@@ -0,0 +1,56 @@
# -*- coding: utf-8 -*-

# Copyright (C) 2015 Yahoo! Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.

import logging
import os
import sys

logging.basicConfig(level=logging.DEBUG)

top_dir = os.path.abspath(os.path.join(os.path.dirname(__file__),
                                       os.pardir,
                                       os.pardir))
sys.path.insert(0, top_dir)

from taskflow import engines
from taskflow.listeners import logging as logging_listener
from taskflow.patterns import linear_flow as lf
from taskflow import task

# INTRO: This example walks through a miniature workflow which will do a
# simple echo operation; during this execution a listener is associated with
# the engine to receive all notifications about what the flow has performed,
# this example dumps that output to the stdout for viewing (at debug level
# to show all the information which is possible).


class Echo(task.Task):
    def execute(self):
        print(self.name)


# Generate the work to be done (but don't do it yet).
wf = lf.Flow('abc')
wf.add(Echo('a'))
wf.add(Echo('b'))
wf.add(Echo('c'))

# This will associate the listener with the engine (the listener
# will automatically register for notifications with the engine and deregister
# when the context is exited).
e = engines.load(wf)
with logging_listener.DynamicLoggingListener(e):
    e.run()

@@ -27,10 +27,10 @@ top_dir = os.path.abspath(os.path.join(os.path.dirname(__file__),
                                       os.pardir))
sys.path.insert(0, top_dir)

from oslo_utils import uuidutils

from taskflow import engines
from taskflow.listeners import printing
from taskflow.openstack.common import uuidutils
from taskflow.patterns import graph_flow as gf
from taskflow.patterns import linear_flow as lf
from taskflow import task
@@ -148,6 +148,12 @@ class DeclareSuccess(task.Task):
        print("All data processed and sent to %s" % (sent_to))


class DummyUser(object):
    def __init__(self, user, id_):
        self.user = user
        self.id = id_


# Resources (db handles and similar) of course can *not* be persisted so we
# need to make sure that we pass this resource fetcher to the tasks constructor
# so that the tasks have access to any needed resources (the resources are
@@ -168,9 +174,9 @@ flow.add(sub_flow)
# prepopulating this allows the tasks that depend on the 'request' variable
# to start processing (in this case this is the ExtractInputRequest task).
store = {
    'request': misc.AttrDict(user="bob", id="1.35"),
    'request': DummyUser(user="bob", id_="1.35"),
}
eng = engines.load(flow, engine_conf='serial', store=store)
eng = engines.load(flow, engine='serial', store=store)

# This context manager automatically adds (and automatically removes) a
# helpful set of state transition notification printing helper utilities

@@ -81,11 +81,11 @@ store = {
}

result = taskflow.engines.run(
    flow, engine_conf='serial', store=store)
    flow, engine='serial', store=store)

print("Single threaded engine result %s" % result)

result = taskflow.engines.run(
    flow, engine_conf='parallel', store=store)
    flow, engine='parallel', store=store)

print("Multi threaded engine result %s" % result)

105
taskflow/examples/hello_world.py
Normal file
@@ -0,0 +1,105 @@
# -*- coding: utf-8 -*-

# Copyright (C) 2014 Yahoo! Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.

import logging
import os
import sys

logging.basicConfig(level=logging.ERROR)

top_dir = os.path.abspath(os.path.join(os.path.dirname(__file__),
                                       os.pardir,
                                       os.pardir))
sys.path.insert(0, top_dir)

from taskflow import engines
from taskflow.patterns import linear_flow as lf
from taskflow.patterns import unordered_flow as uf
from taskflow import task
from taskflow.types import futures
from taskflow.utils import eventlet_utils


# INTRO: This is the de facto hello world equivalent for taskflow; it shows
# how an overly simplistic workflow can be created that runs using different
# engines using different styles of execution (all can be used to run in
# parallel if a workflow is provided that is parallelizable).

class PrinterTask(task.Task):
    def __init__(self, name, show_name=True, inject=None):
        super(PrinterTask, self).__init__(name, inject=inject)
        self._show_name = show_name

    def execute(self, output):
        if self._show_name:
            print("%s: %s" % (self.name, output))
        else:
            print(output)


# This will be the work that we want done, which for this example is just to
# print 'hello world' (like a song) using different tasks and different
# execution models.
song = lf.Flow("beats")

# Unordered flows, when run, can be run in parallel; and a chorus is everyone
# singing at once of course!
hi_chorus = uf.Flow('hello')
world_chorus = uf.Flow('world')
for (name, hello, world) in [('bob', 'hello', 'world'),
                             ('joe', 'hellooo', 'worllllld'),
                             ('sue', "helloooooo!", 'wooorllld!')]:
    hi_chorus.add(PrinterTask("%s@hello" % name,
                              # This will show up to the execute() method of
                              # the task as the argument named 'output' (which
                              # will allow us to print the character we want).
                              inject={'output': hello}))
    world_chorus.add(PrinterTask("%s@world" % name,
                                 inject={'output': world}))

# The composition starts with the conductor and then runs in sequence with
# the chorus running in parallel, but no matter what the 'hello' chorus must
# always run before the 'world' chorus (otherwise the world will fall apart).
song.add(PrinterTask("conductor@begin",
                     show_name=False, inject={'output': "*ding*"}),
         hi_chorus,
         world_chorus,
         PrinterTask("conductor@end",
                     show_name=False, inject={'output': "*dong*"}))

# Run in parallel using eventlet green threads...
if eventlet_utils.EVENTLET_AVAILABLE:
    with futures.GreenThreadPoolExecutor() as executor:
        e = engines.load(song, executor=executor, engine='parallel')
        e.run()


# Run in parallel using real threads...
with futures.ThreadPoolExecutor(max_workers=1) as executor:
    e = engines.load(song, executor=executor, engine='parallel')
    e.run()


# Run in parallel using external processes...
with futures.ProcessPoolExecutor(max_workers=1) as executor:
    e = engines.load(song, executor=executor, engine='parallel')
    e.run()


# Run serially (aka, if the workflow could have been run in parallel, it will
# not be when run in this mode)...
e = engines.load(song, engine='serial')
e.run()

177
taskflow/examples/jobboard_produce_consume_colors.py
Normal file
@@ -0,0 +1,177 @@
# -*- coding: utf-8 -*-

# Copyright (C) 2014 Yahoo! Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.

import collections
import contextlib
import logging
import os
import random
import sys
import threading
import time

logging.basicConfig(level=logging.ERROR)

top_dir = os.path.abspath(os.path.join(os.path.dirname(__file__),
                                       os.pardir,
                                       os.pardir))
sys.path.insert(0, top_dir)

from six.moves import range as compat_range
from zake import fake_client

from taskflow import exceptions as excp
from taskflow.jobs import backends
from taskflow.utils import threading_utils

# In this example we show how a jobboard can be used to post work for other
# entities to work on. This example creates a set of jobs using producer
# threads (typically this would be split across many machines) and then has
# other worker threads with their own jobboards select work using given
# filters [red/blue] and then perform that work (and consume or abandon
# the job after it has been completed or failed).

# Things to note:
# - No persistence layer is used (or logbook), just the job details are used
#   to determine if a job should be selected by a worker or not.
# - This example runs in a single process (this is expected to be atypical
#   but this example shows that it can be done if needed, for testing...)
# - The iterjobs(), claim(), consume()/abandon() worker workflow.
# - The post() producer workflow.

SHARED_CONF = {
    'path': "/taskflow/jobs",
    'board': 'zookeeper',
}

# How many workers and producers of work will be created (as threads).
PRODUCERS = 3
WORKERS = 5

# How many units of work each producer will create.
PRODUCER_UNITS = 10

# How many units of work are expected to be produced (used so workers can
# know when to stop running and shutdown, typically this would not be a
# fixed value but we have to limit this example's execution time to be less
# than infinity).
EXPECTED_UNITS = PRODUCER_UNITS * PRODUCERS

# Delay between producing/consuming more work.
WORKER_DELAY, PRODUCER_DELAY = (0.5, 0.5)

# To ensure threads don't trample other threads output.
STDOUT_LOCK = threading.Lock()


def dispatch_work(job):
    # This is where the job's contained work *would* be done
    time.sleep(1.0)


def safe_print(name, message, prefix=""):
    with STDOUT_LOCK:
        if prefix:
            print("%s %s: %s" % (prefix, name, message))
        else:
            print("%s: %s" % (name, message))


def worker(ident, client, consumed):
    # Create a personal board (using the same client so that it works in
    # the same process) and start looking for jobs on the board that we want
    # to perform.
    name = "W-%s" % (ident)
    safe_print(name, "started")
    claimed_jobs = 0
    consumed_jobs = 0
    abandoned_jobs = 0
    with backends.backend(name, SHARED_CONF.copy(), client=client) as board:
        while len(consumed) != EXPECTED_UNITS:
            favorite_color = random.choice(['blue', 'red'])
            for job in board.iterjobs(ensure_fresh=True, only_unclaimed=True):
                # See if we should even bother with it...
                if job.details.get('color') != favorite_color:
                    continue
                safe_print(name, "'%s' [attempting claim]" % (job))
                try:
                    board.claim(job, name)
                    claimed_jobs += 1
                    safe_print(name, "'%s' [claimed]" % (job))
                except (excp.NotFound, excp.UnclaimableJob):
                    safe_print(name, "'%s' [claim unsuccessful]" % (job))
                else:
                    try:
                        dispatch_work(job)
                        board.consume(job, name)
                        safe_print(name, "'%s' [consumed]" % (job))
                        consumed_jobs += 1
                        consumed.append(job)
                    except Exception:
                        board.abandon(job, name)
                        abandoned_jobs += 1
                        safe_print(name, "'%s' [abandoned]" % (job))
            time.sleep(WORKER_DELAY)
    safe_print(name,
               "finished (claimed %s jobs, consumed %s jobs,"
               " abandoned %s jobs)" % (claimed_jobs, consumed_jobs,
                                        abandoned_jobs), prefix=">>>")


def producer(ident, client):
    # Create a personal board (using the same client so that it works in
    # the same process) and start posting jobs on the board that we want
    # some entity to perform.
    name = "P-%s" % (ident)
    safe_print(name, "started")
    with backends.backend(name, SHARED_CONF.copy(), client=client) as board:
        for i in compat_range(0, PRODUCER_UNITS):
            job_name = "%s-%s" % (name, i)
            details = {
                'color': random.choice(['red', 'blue']),
            }
            job = board.post(job_name, book=None, details=details)
            safe_print(name, "'%s' [posted]" % (job))
            time.sleep(PRODUCER_DELAY)
    safe_print(name, "finished", prefix=">>>")


def main():
    with contextlib.closing(fake_client.FakeClient()) as c:
        created = []
        for i in compat_range(0, PRODUCERS):
            p = threading_utils.daemon_thread(producer, i + 1, c)
            created.append(p)
            p.start()
        consumed = collections.deque()
        for i in compat_range(0, WORKERS):
            w = threading_utils.daemon_thread(worker, i + 1, c, consumed)
            created.append(w)
            w.start()
        while created:
            t = created.pop()
            t.join()
        # At the end there should be nothing leftover, let's verify that.
        board = backends.fetch('verifier', SHARED_CONF.copy(), client=c)
        board.connect()
        with contextlib.closing(board):
            if board.job_count != 0 or len(consumed) != EXPECTED_UNITS:
                return 1
            return 0


if __name__ == "__main__":
    sys.exit(main())

129
taskflow/examples/parallel_table_multiply.py
Normal file
@@ -0,0 +1,129 @@
# -*- coding: utf-8 -*-

# Copyright (C) 2014 Yahoo! Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.

import csv
import logging
import os
import random
import sys

logging.basicConfig(level=logging.ERROR)

top_dir = os.path.abspath(os.path.join(os.path.dirname(__file__),
                                       os.pardir,
                                       os.pardir))
sys.path.insert(0, top_dir)

from six.moves import range as compat_range

from taskflow import engines
from taskflow.patterns import unordered_flow as uf
from taskflow import task
from taskflow.types import futures
from taskflow.utils import eventlet_utils

# INTRO: This example walks through a miniature workflow which does a parallel
# table modification where each row in the table gets adjusted by a thread, or
# green thread (if eventlet is available) in parallel and then the result
# is reformed into a new table and some verifications are performed on it
# to ensure everything went as expected.


MULTIPLER = 10


class RowMultiplier(task.Task):
    """Performs a modification of an input row, creating an output row."""

    def __init__(self, name, index, row, multiplier):
        super(RowMultiplier, self).__init__(name=name)
        self.index = index
        self.multiplier = multiplier
        self.row = row

    def execute(self):
        return [r * self.multiplier for r in self.row]


def make_flow(table):
    # This creation will allow for parallel computation (since the flow here
    # is specifically unordered; and when things are unordered they have
    # no dependencies and when things have no dependencies they can just be
    # ran at the same time, limited in concurrency by the executor or max
    # workers of that executor...)
    f = uf.Flow("root")
    for i, row in enumerate(table):
        f.add(RowMultiplier("m-%s" % i, i, row, MULTIPLER))
    # NOTE(harlowja): at this point nothing has ran, the above is just
    # defining what should be done (but not actually doing it) and
    # associating any ordering dependencies that should be enforced (the
    # flow pattern used forces this), the engine in the later main()
    # function will actually perform this work...
    return f


def main():
    if len(sys.argv) == 2:
        tbl = []
        with open(sys.argv[1], 'rb') as fh:
            reader = csv.reader(fh)
            for row in reader:
                tbl.append([float(r) if r else 0.0 for r in row])
    else:
        # Make some random table out of thin air...
        tbl = []
        cols = random.randint(1, 100)
        rows = random.randint(1, 100)
        for _i in compat_range(0, rows):
            row = []
            for _j in compat_range(0, cols):
                row.append(random.random())
            tbl.append(row)

    # Generate the work to be done.
    f = make_flow(tbl)

    # Now run it (using the specified executor)...
    if eventlet_utils.EVENTLET_AVAILABLE:
        executor = futures.GreenThreadPoolExecutor(max_workers=5)
    else:
        executor = futures.ThreadPoolExecutor(max_workers=5)
    try:
        e = engines.load(f, engine='parallel', executor=executor)
        for st in e.run_iter():
            print(st)
    finally:
        executor.shutdown()

    # Find the old rows and put them into place...
    #
    # TODO(harlowja): probably easier just to sort instead of search...
    computed_tbl = []
    for i in compat_range(0, len(tbl)):
        for t in f:
            if t.index == i:
                computed_tbl.append(e.storage.get(t.name))

    # Do some basic validation (which causes the return code of this process
    # to be different if things were not as expected...)
    if len(computed_tbl) != len(tbl):
        return 1
    else:
        return 0


if __name__ == "__main__":
    sys.exit(main())

@@ -91,20 +91,15 @@ else:
    blowup = True

with eu.get_backend(backend_uri) as backend:
    # Now we can run.
    engine_config = {
        'backend': backend,
        'engine_conf': 'serial',
        'book': logbook.LogBook("my-test"),
    }

    # Make a flow that will blowup if the file doesn't exist previously, if it
    # did exist, assume we won't blowup (and therefore this shows the undo
    # and redo that a flow will go through).
    book = logbook.LogBook("my-test")
    flow = make_flow(blowup=blowup)
    eu.print_wrapped("Running")
    try:
        eng = engines.load(flow, **engine_config)
        eng = engines.load(flow, engine='serial',
                           backend=backend, book=book)
        eng.run()
        if not blowup:
            eu.rm_path(persist_path)
@@ -115,4 +110,4 @@ with eu.get_backend(backend_uri) as backend:
        traceback.print_exc(file=sys.stdout)

    eu.print_wrapped("Book contents")
    print(p_utils.pformat(engine_config['book']))
    print(p_utils.pformat(book))

@@ -48,7 +48,7 @@ def _exec(cmd, add_env=None):
                            stdout=subprocess.PIPE,
                            stderr=sys.stderr)

    stdout, stderr = proc.communicate()
    stdout, _stderr = proc.communicate()
    rc = proc.returncode
    if rc != 0:
        raise RuntimeError("Could not run %s [%s]", cmd, rc)

@@ -31,13 +31,15 @@ top_dir = os.path.abspath(os.path.join(os.path.dirname(__file__),
sys.path.insert(0, top_dir)
sys.path.insert(0, self_dir)

from oslo_utils import uuidutils

from taskflow import engines
from taskflow import exceptions as exc
from taskflow.openstack.common import uuidutils
from taskflow.patterns import graph_flow as gf
from taskflow.patterns import linear_flow as lf
from taskflow import task
from taskflow.utils import eventlet_utils as e_utils
from taskflow.types import futures
from taskflow.utils import eventlet_utils
from taskflow.utils import persistence_utils as p_utils

import example_utils as eu  # noqa
@@ -141,7 +143,7 @@ class AllocateIP(task.Task):

    def execute(self, vm_spec):
        ips = []
        for i in range(0, vm_spec.get('ips', 0)):
        for _i in range(0, vm_spec.get('ips', 0)):
            ips.append("192.168.0.%s" % (random.randint(1, 254)))
        return ips

@@ -235,11 +237,9 @@ with eu.get_backend() as backend:
    flow_id = None

    # Set up how we want our engine to run: serial, parallel...
    engine_conf = {
        'engine': 'parallel',
    }
    if e_utils.EVENTLET_AVAILABLE:
        engine_conf['executor'] = e_utils.GreenExecutor(5)
    executor = None
    if eventlet_utils.EVENTLET_AVAILABLE:
        executor = futures.GreenThreadPoolExecutor(5)

    # Create/fetch a logbook that will track the workflow's work.
    book = None
@@ -255,15 +255,15 @@ with eu.get_backend() as backend:
        book = p_utils.temporary_log_book(backend)
        engine = engines.load_from_factory(create_flow,
                                           backend=backend, book=book,
                                           engine_conf=engine_conf)
                                           engine='parallel',
                                           executor=executor)
        print("!! Your tracking id is: '%s+%s'" % (book.uuid,
                                                   engine.storage.flow_uuid))
        print("!! Please submit this on later runs for tracking purposes")
    else:
        # Attempt to load from a previously partially completed flow.
        engine = engines.load_from_detail(flow_detail,
                                          backend=backend,
                                          engine_conf=engine_conf)
        engine = engines.load_from_detail(flow_detail, backend=backend,
                                          engine='parallel', executor=executor)

    # Make me my vm please!
    eu.print_wrapped('Running')

@@ -143,13 +143,9 @@ with example_utils.get_backend() as backend:
    flow_detail = find_flow_detail(backend, book_id, flow_id)

    # Load and run.
    engine_conf = {
        'engine': 'serial',
    }
    engine = engines.load(flow,
                          flow_detail=flow_detail,
                          backend=backend,
                          engine_conf=engine_conf)
                          backend=backend, engine='serial')
    engine.run()

# How to use.

@@ -30,9 +30,9 @@ sys.path.insert(0, top_dir)
sys.path.insert(0, self_dir)


from taskflow.engines.action_engine import engine
from taskflow import engines
from taskflow.patterns import linear_flow as lf
from taskflow.persistence.backends import impl_memory
from taskflow.persistence import backends as persistence_backends
from taskflow import task
from taskflow.utils import persistence_utils

@@ -73,18 +73,15 @@ flows = []
for i in range(0, flow_count):
    f = make_alphabet_flow(i + 1)
    flows.append(make_alphabet_flow(i + 1))
be = impl_memory.MemoryBackend({})
be = persistence_backends.fetch(conf={'connection': 'memory'})
book = persistence_utils.temporary_log_book(be)
engines = []
engine_iters = []
for f in flows:
    fd = persistence_utils.create_flow_detail(f, book, be)
    e = engine.SingleThreadedActionEngine(f, fd, be, {})
    e = engines.load(f, flow_detail=fd, backend=be, book=book)
    e.compile()
    e.storage.inject({'A': 'A'})
    e.prepare()
    engines.append(e)
engine_iters = []
for e in engines:
    engine_iters.append(e.run_iter())
while engine_iters:
    for it in list(engine_iters):

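The hunk above cuts off just as the cooperative round-robin over engine_iters begins; a minimal sketch of how such a loop is typically completed (an illustrative assumption, not text from this merge) looks like:

    # Sketch only: advance each engine one state at a time and drop any
    # iterator that has finished (assumes engine_iters holds the
    # generators returned by e.run_iter() above, and six is imported).
    while engine_iters:
        for it in list(engine_iters):
            try:
                print(six.next(it))
            except StopIteration:
                engine_iters.remove(it)
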
@@ -27,9 +27,9 @@ top_dir = os.path.abspath(os.path.join(os.path.dirname(__file__),
sys.path.insert(0, top_dir)
sys.path.insert(0, self_dir)

from taskflow.engines.action_engine import engine
from taskflow import engines
from taskflow.patterns import linear_flow as lf
from taskflow.persistence.backends import impl_memory
from taskflow.persistence import backends as persistence_backends
from taskflow import task
from taskflow.utils import persistence_utils

@@ -48,10 +48,10 @@ f = lf.Flow("counter")
for i in range(0, 10):
    f.add(EchoNameTask("echo_%s" % (i + 1)))

be = impl_memory.MemoryBackend()
be = persistence_backends.fetch(conf={'connection': 'memory'})
book = persistence_utils.temporary_log_book(be)
fd = persistence_utils.create_flow_detail(f, book, be)
e = engine.SingleThreadedActionEngine(f, fd, be, {})
e = engines.load(f, flow_detail=fd, backend=be, book=book)
e.compile()
e.prepare()


@@ -28,6 +28,9 @@ sys.path.insert(0, top_dir)
import taskflow.engines
from taskflow.patterns import linear_flow as lf
from taskflow import task
from taskflow.types import notifier

ANY = notifier.Notifier.ANY

# INTRO: In this example we create two tasks (this time as functions instead
# of task subclasses as in the simple_linear.py example), each of which ~calls~
@@ -92,8 +95,8 @@ engine = taskflow.engines.load(flow, store={
# notification objects that an engine exposes. The usage of a '*' (kleene star)
# here means that we want to be notified on all state changes; if you want to
# restrict to a specific state change, just register that instead.
engine.notifier.register('*', flow_watch)
engine.task_notifier.register('*', task_watch)
engine.notifier.register(ANY, flow_watch)
engine.task_notifier.register(ANY, task_watch)

# And now run!
engine.run()

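The flow_watch and task_watch callbacks registered above are defined elsewhere in that example file; for illustration, a notification callback only needs to accept the emitted state plus a details mapping, roughly along these lines (an assumed sketch, not part of this change):

    # Assumed shape of notification callbacks registered on an engine.
    def flow_watch(state, details):
        print('Flow => %s' % state)

    def task_watch(state, details):
        print('Task %s => %s' % (details.get('task_name'), state))
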
9
taskflow/examples/simple_linear_pass.out.txt
Normal file
@@ -0,0 +1,9 @@
Constructing...
Loading...
Compiling...
Preparing...
Running...
Executing 'a'
Executing 'b'
Got input 'a'
Done...
68
taskflow/examples/simple_linear_pass.py
Normal file
@@ -0,0 +1,68 @@
# -*- coding: utf-8 -*-

# Copyright (C) 2014 Yahoo! Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.

import logging
import os
import sys

logging.basicConfig(level=logging.ERROR)

self_dir = os.path.abspath(os.path.dirname(__file__))
top_dir = os.path.abspath(os.path.join(os.path.dirname(__file__),
                                       os.pardir,
                                       os.pardir))
sys.path.insert(0, top_dir)
sys.path.insert(0, self_dir)

from taskflow import engines
from taskflow.patterns import linear_flow
from taskflow import task

# INTRO: This example shows how a task (in a linear/serial workflow) can
# produce an output that can then be consumed/used by a downstream task.


class TaskA(task.Task):
    default_provides = 'a'

    def execute(self):
        print("Executing '%s'" % (self.name))
        return 'a'


class TaskB(task.Task):
    def execute(self, a):
        print("Executing '%s'" % (self.name))
        print("Got input '%s'" % (a))


print("Constructing...")
wf = linear_flow.Flow("pass-from-to")
wf.add(TaskA('a'), TaskB('b'))

print("Loading...")
e = engines.load(wf)

print("Compiling...")
e.compile()

print("Preparing...")
e.prepare()

print("Running...")
e.run()

print("Done...")
115
taskflow/examples/simple_map_reduce.py
Normal file
@@ -0,0 +1,115 @@
# -*- coding: utf-8 -*-

# Copyright (C) 2014 Yahoo! Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.

import logging
import os
import sys

logging.basicConfig(level=logging.ERROR)

self_dir = os.path.abspath(os.path.dirname(__file__))
top_dir = os.path.abspath(os.path.join(os.path.dirname(__file__),
                                       os.pardir,
                                       os.pardir))
sys.path.insert(0, top_dir)
sys.path.insert(0, self_dir)

# INTRO: this example shows a simplistic map/reduce implementation where
# a set of mapper(s) will sum a series of input numbers (in parallel) and
# return their individual summed result. A reducer will then use those
# produced values and perform a final summation and this result will then be
# printed (and verified to ensure the calculation was as expected).

import six

from taskflow import engines
from taskflow.patterns import linear_flow
from taskflow.patterns import unordered_flow
from taskflow import task


class SumMapper(task.Task):
    def execute(self, inputs):
        # Sums some set of provided inputs.
        return sum(inputs)


class TotalReducer(task.Task):
    def execute(self, *args, **kwargs):
        # Reduces all mapped summed outputs into a single value.
        total = 0
        for (k, v) in six.iteritems(kwargs):
            # If any other kwargs were passed in, we don't want to use those
            # in the calculation of the total...
            if k.startswith('reduction_'):
                total += v
        return total


def chunk_iter(chunk_size, upperbound):
    """Yields back chunk size pieces from zero to upperbound - 1 (any
    final partial chunk is discarded)."""
    chunk = []
    for i in range(0, upperbound):
        chunk.append(i)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []


# Upper bound of numbers to sum for example purposes...
UPPER_BOUND = 10000

# How many mappers we want to have.
SPLIT = 10

# How big of a chunk we want to give each mapper.
CHUNK_SIZE = UPPER_BOUND // SPLIT

# This will be the workflow we will compose and run.
w = linear_flow.Flow("root")

# The mappers will run in parallel.
store = {}
provided = []
mappers = unordered_flow.Flow('map')
for i, chunk in enumerate(chunk_iter(CHUNK_SIZE, UPPER_BOUND)):
    mapper_name = 'mapper_%s' % i
    # Give that mapper some information to compute.
    store[mapper_name] = chunk
    # The reducer uses all of the outputs of the mappers, so it needs
    # to be recorded that it needs access to them (under a specific name).
    provided.append("reduction_%s" % i)
    mappers.add(SumMapper(name=mapper_name,
                          rebind={'inputs': mapper_name},
                          provides=provided[-1]))
w.add(mappers)

# The reducer will run last (after all the mappers).
w.add(TotalReducer('reducer', requires=provided))

# Now go!
e = engines.load(w, engine='parallel', store=store, max_workers=4)
print("Running a parallel engine with options: %s" % e.options)
e.run()

# Now get the result the reducer created.
total = e.storage.get('reducer')
print("Calculated result = %s" % total)

# Calculate it manually to verify that it worked...
calc_total = sum(range(0, UPPER_BOUND))
if calc_total != total:
    sys.exit(1)
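For reference, with the constants above each of the 10 mappers receives a 1000-number chunk, so the reducer's expected output is sum(range(0, 10000)) = 49995000; the final comparison exits with a non-zero status if the engine-computed total disagrees with that manual calculation.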
59
taskflow/examples/timing_listener.py
Normal file
@@ -0,0 +1,59 @@
# -*- coding: utf-8 -*-

# Copyright (C) 2014 Yahoo! Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.

import logging
import os
import random
import sys
import time

logging.basicConfig(level=logging.ERROR)

top_dir = os.path.abspath(os.path.join(os.path.dirname(__file__),
                                       os.pardir,
                                       os.pardir))
sys.path.insert(0, top_dir)

from taskflow import engines
from taskflow.listeners import timing
from taskflow.patterns import linear_flow as lf
from taskflow import task

# INTRO: in this example we will attach a listener to an engine and have
# tasks with variable run times run, showing how the listener will print
# out how long those tasks took (when they started and when they finished).
#
# This shows how timing metrics can be gathered (or attached onto an engine)
# after a workflow has been constructed, making it easy to gather metrics
# dynamically for situations where this kind of information is applicable (or
# even adding this information on at a later point in the future when your
# application starts to slow down).


class VariableTask(task.Task):
    def __init__(self, name):
        super(VariableTask, self).__init__(name)
        self._sleepy_time = random.random()

    def execute(self):
        time.sleep(self._sleepy_time)


f = lf.Flow('root')
f.add(VariableTask('a'), VariableTask('b'), VariableTask('c'))
e = engines.load(f)
with timing.PrintingTimingListener(e):
    e.run()
150
taskflow/examples/wbe_event_sender.py
Normal file
@@ -0,0 +1,150 @@
# -*- coding: utf-8 -*-

# Copyright (C) 2014 Yahoo! Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.

import logging
import os
import string
import sys
import time

top_dir = os.path.abspath(os.path.join(os.path.dirname(__file__),
                                       os.pardir,
                                       os.pardir))
sys.path.insert(0, top_dir)

from six.moves import range as compat_range

from taskflow import engines
from taskflow.engines.worker_based import worker
from taskflow.patterns import linear_flow as lf
from taskflow import task
from taskflow.types import notifier
from taskflow.utils import threading_utils

ANY = notifier.Notifier.ANY

# INTRO: This example shows how to use a remote worker's event notification
# attribute to proxy back task event notifications to the controlling process.
#
# In this case a simple set of events is triggered by a worker running a
# task (simulated to be remote by using a kombu memory transport and threads).
# Those events that the 'remote worker' produces will then be proxied back to
# the task that the engine is running 'remotely', and then they will be emitted
# back to the original callbacks that exist in the originating engine
# process/thread. This creates a one-way *notification* channel that can
# transparently be used in-process, outside-of-process using remote workers and
# so-on, and that allows a task to signal to its controlling process some sort
# of action that has occurred that the task may need to tell others about (for
# example to trigger some type of response when the task reaches 50% done...).


def event_receiver(event_type, details):
    """This is the callback that (in this example) doesn't do much..."""
    print("Received event '%s'" % event_type)
    print("Details = %s" % details)


class EventReporter(task.Task):
    """This is the task that will be running 'remotely' (not really remote)."""

    EVENTS = tuple(string.ascii_uppercase)
    EVENT_DELAY = 0.1

    def execute(self):
        for i, e in enumerate(self.EVENTS):
            details = {
                'leftover': self.EVENTS[i:],
            }
            self.notifier.notify(e, details)
            time.sleep(self.EVENT_DELAY)


BASE_SHARED_CONF = {
    'exchange': 'taskflow',
    'transport': 'memory',
    'transport_options': {
        'polling_interval': 0.1,
    },
}

# Until https://github.com/celery/kombu/issues/398 is resolved it is not
# recommended to run many worker threads in this example due to the types
# of errors mentioned in that issue.
MEMORY_WORKERS = 1
WORKER_CONF = {
    'tasks': [
        # Used to locate which tasks we can run (we don't want to allow
        # arbitrary code/tasks to be run by any worker since that would
        # open up a variety of vulnerabilities).
        '%s:EventReporter' % (__name__),
    ],
}


def run(engine_options):
    reporter = EventReporter()
    reporter.notifier.register(ANY, event_receiver)
    flow = lf.Flow('event-reporter').add(reporter)
    eng = engines.load(flow, engine='worker-based', **engine_options)
    eng.run()


if __name__ == "__main__":
    logging.basicConfig(level=logging.ERROR)

    # Setup our transport configuration and merge it into the worker and
    # engine configuration so that both of those objects use it correctly.
    worker_conf = dict(WORKER_CONF)
    worker_conf.update(BASE_SHARED_CONF)
    engine_options = dict(BASE_SHARED_CONF)
    workers = []

    # These topics will be used to request worker information on; those
    # workers will respond with their capabilities which the executing engine
    # will use to match pending tasks to a matched worker, this will cause
    # the task to be sent for execution, and the engine will wait until it
    # is finished (a response is received) and then the engine will either
    # continue with other tasks, do some retry/failure resolution logic or
    # stop (and potentially re-raise the remote worker's failure)...
    worker_topics = []

    try:
        # Create a set of worker threads to simulate actual remote workers...
        print('Running %s workers.' % (MEMORY_WORKERS))
        for i in compat_range(0, MEMORY_WORKERS):
            # Give each one its own unique topic name so that they can
            # correctly communicate with the engine (they will all share the
            # same exchange).
            worker_conf['topic'] = 'worker-%s' % (i + 1)
            worker_topics.append(worker_conf['topic'])
            w = worker.Worker(**worker_conf)
            runner = threading_utils.daemon_thread(w.run)
            runner.start()
            w.wait()
            workers.append((runner, w.stop))

        # Now use those workers to do something.
        print('Executing some work.')
        engine_options['topics'] = worker_topics
        result = run(engine_options)
        print('Execution finished.')
    finally:
        # And cleanup.
        print('Stopping workers.')
        while workers:
            r, stopper = workers.pop()
            stopper()
            r.join()
6
taskflow/examples/wbe_mandelbrot.out.txt
Normal file
@@ -0,0 +1,6 @@
Calculating your mandelbrot fractal of size 512x512.
Running 2 workers.
Execution finished.
Stopping workers.
Writing image...
Gathered 262144 results that represent a mandelbrot image (using 8 chunks that are computed jointly by 2 workers).
253
taskflow/examples/wbe_mandelbrot.py
Normal file
@@ -0,0 +1,253 @@
# -*- coding: utf-8 -*-

# Copyright (C) 2014 Yahoo! Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.

import logging
import math
import os
import sys

top_dir = os.path.abspath(os.path.join(os.path.dirname(__file__),
                                       os.pardir,
                                       os.pardir))
sys.path.insert(0, top_dir)

from six.moves import range as compat_range

from taskflow import engines
from taskflow.engines.worker_based import worker
from taskflow.patterns import unordered_flow as uf
from taskflow import task
from taskflow.utils import threading_utils

# INTRO: This example walks through a workflow that will in parallel compute
# a mandelbrot result set (using X 'remote' workers) and then combine their
# results together to form a final mandelbrot fractal image. It shows a usage
# of taskflow to perform a well-known embarrassingly parallel problem that has
# the added benefit of also being an elegant visualization.
#
# NOTE(harlowja): this example simulates the expected larger number of workers
# by using a set of threads (which in this example simulate the remote workers
# that would typically be running on other external machines).
#
# NOTE(harlowja): to have it produce an image run (after installing pillow):
#
# $ python taskflow/examples/wbe_mandelbrot.py output.png

BASE_SHARED_CONF = {
    'exchange': 'taskflow',
}
WORKERS = 2
WORKER_CONF = {
    # These are the tasks the worker can execute; they *must* be importable;
    # typically this list is used to restrict what workers may execute to
    # a smaller set of *allowed* tasks that are known to be safe (one would
    # not want to allow all python code to be executed).
    'tasks': [
        '%s:MandelCalculator' % (__name__),
    ],
}
ENGINE_CONF = {
    'engine': 'worker-based',
}

# Mandelbrot & image settings...
IMAGE_SIZE = (512, 512)
CHUNK_COUNT = 8
MAX_ITERATIONS = 25


class MandelCalculator(task.Task):
    def execute(self, image_config, mandelbrot_config, chunk):
        """Returns the number of iterations before the computation "escapes".

        Given the real and imaginary parts of a complex number, determine if it
        is a candidate for membership in the mandelbrot set given a fixed
        number of iterations.
        """

        # Parts borrowed from (credit to mark harris and benoît mandelbrot).
        #
        # http://nbviewer.ipython.org/gist/harrism/f5707335f40af9463c43
        def mandelbrot(x, y, max_iters):
            c = complex(x, y)
            z = 0.0j
            for i in compat_range(max_iters):
                z = z * z + c
                if (z.real * z.real + z.imag * z.imag) >= 4:
                    return i
            return max_iters

        min_x, max_x, min_y, max_y, max_iters = mandelbrot_config
        height, width = image_config['size']
        pixel_size_x = (max_x - min_x) / width
        pixel_size_y = (max_y - min_y) / height
        block = []
        for y in compat_range(chunk[0], chunk[1]):
            row = []
            imag = min_y + y * pixel_size_y
            for x in compat_range(0, width):
                real = min_x + x * pixel_size_x
                row.append(mandelbrot(real, imag, max_iters))
            block.append(row)
        return block


def calculate(engine_conf):
    # Subdivide the work into X pieces, then request each worker to calculate
    # one of those chunks and then later we will write these chunks out to
    # an image bitmap file.

    # An unordered flow is used here since the mandelbrot calculation is an
    # example of an embarrassingly parallel computation that we can scatter
    # across as many workers as possible.
    flow = uf.Flow("mandelbrot")

    # These symbols will be automatically given to tasks as input to their
    # execute method, in this case these are constants used in the mandelbrot
    # calculation.
    store = {
        'mandelbrot_config': [-2.0, 1.0, -1.0, 1.0, MAX_ITERATIONS],
        'image_config': {
            'size': IMAGE_SIZE,
        }
    }

    # We need the task names to be in the right order so that we can extract
    # the final results in the right order (we don't care about the order when
    # executing).
    task_names = []

    # Compose our workflow.
    height, _width = IMAGE_SIZE
    chunk_size = int(math.ceil(height / float(CHUNK_COUNT)))
    for i in compat_range(0, CHUNK_COUNT):
        chunk_name = 'chunk_%s' % i
        task_name = "calculation_%s" % i
        # Break the calculation up into chunk size pieces.
        rows = [i * chunk_size, i * chunk_size + chunk_size]
        flow.add(
            MandelCalculator(task_name,
                             # This ensures the storage symbol with name
                             # 'chunk_name' is sent into the task's local
                             # symbol 'chunk'. This is how we give each
                             # calculator its own correct sequence of rows
                             # to work on.
                             rebind={'chunk': chunk_name}))
        store[chunk_name] = rows
        task_names.append(task_name)

    # Now execute it.
    eng = engines.load(flow, store=store, engine_conf=engine_conf)
    eng.run()

    # Gather all the results and order them for further processing.
    gather = []
    for name in task_names:
        gather.extend(eng.storage.get(name))
    points = []
    for y, row in enumerate(gather):
        for x, color in enumerate(row):
            points.append(((x, y), color))
    return points


def write_image(results, output_filename=None):
    print("Gathered %s results that represent a mandelbrot"
          " image (using %s chunks that are computed jointly"
          " by %s workers)." % (len(results), CHUNK_COUNT, WORKERS))
    if not output_filename:
        return

    # Pillow (the PIL fork) saves us from writing our own image writer...
    try:
        from PIL import Image
    except ImportError as e:
        # To currently get this (may change in the future),
        # $ pip install Pillow
        raise RuntimeError("Pillow is required to write image files: %s" % e)

    # Limit to 255, find the max and normalize to that...
    color_max = 0
    for _point, color in results:
        color_max = max(color, color_max)

    # Use gray scale since we don't really have other colors.
    img = Image.new('L', IMAGE_SIZE, "black")
    pixels = img.load()
    for (x, y), color in results:
        if color_max == 0:
            color = 0
        else:
            color = int((float(color) / color_max) * 255.0)
        pixels[x, y] = color
    img.save(output_filename)


def create_fractal():
    logging.basicConfig(level=logging.ERROR)

    # Setup our transport configuration and merge it into the worker and
    # engine configuration so that both of those use it correctly.
    shared_conf = dict(BASE_SHARED_CONF)
    shared_conf.update({
        'transport': 'memory',
        'transport_options': {
            'polling_interval': 0.1,
        },
    })

    if len(sys.argv) >= 2:
        output_filename = sys.argv[1]
    else:
        output_filename = None

    worker_conf = dict(WORKER_CONF)
    worker_conf.update(shared_conf)
    engine_conf = dict(ENGINE_CONF)
    engine_conf.update(shared_conf)
    workers = []
    worker_topics = []

    print('Calculating your mandelbrot fractal of size %sx%s.' % IMAGE_SIZE)
    try:
        # Create a set of workers to simulate actual remote workers.
        print('Running %s workers.' % (WORKERS))
        for i in compat_range(0, WORKERS):
            worker_conf['topic'] = 'calculator_%s' % (i + 1)
            worker_topics.append(worker_conf['topic'])
            w = worker.Worker(**worker_conf)
            runner = threading_utils.daemon_thread(w.run)
            runner.start()
            w.wait()
            workers.append((runner, w.stop))

        # Now use those workers to do something.
        engine_conf['topics'] = worker_topics
        results = calculate(engine_conf)
        print('Execution finished.')
    finally:
        # And cleanup.
        print('Stopping workers.')
        while workers:
            r, stopper = workers.pop()
            stopper()
            r.join()
    print("Writing image...")
    write_image(results, output_filename=output_filename)


if __name__ == "__main__":
    create_fractal()
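For reference, with the settings above each of the 8 chunks covers ceil(512 / 8) = 64 rows of 512 pixels, so the gathered result set holds 8 * 64 * 512 = 262144 points, which is exactly the count recorded in wbe_mandelbrot.out.txt above.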
@@ -19,7 +19,6 @@ import logging
import os
import sys
import tempfile
import threading

top_dir = os.path.abspath(os.path.join(os.path.dirname(__file__),
                                       os.pardir,
@@ -30,6 +29,7 @@ from taskflow import engines
from taskflow.engines.worker_based import worker
from taskflow.patterns import linear_flow as lf
from taskflow.tests import utils
from taskflow.utils import threading_utils

import example_utils  # noqa

@@ -53,7 +53,12 @@ USE_FILESYSTEM = False
BASE_SHARED_CONF = {
    'exchange': 'taskflow',
}
WORKERS = 2

# Until https://github.com/celery/kombu/issues/398 is resolved it is not
# recommended to run many worker threads in this example due to the types
# of errors mentioned in that issue.
MEMORY_WORKERS = 2
FILE_WORKERS = 1
WORKER_CONF = {
    # These are the tasks the worker can execute; they *must* be importable;
    # typically this list is used to restrict what workers may execute to
@@ -64,19 +69,16 @@ WORKER_CONF = {
        'taskflow.tests.utils:TaskMultiArgOneReturn'
    ],
}
ENGINE_CONF = {
    'engine': 'worker-based',
}


def run(engine_conf):
def run(engine_options):
    flow = lf.Flow('simple-linear').add(
        utils.TaskOneArgOneReturn(provides='result1'),
        utils.TaskMultiArgOneReturn(provides='result2')
    )
    eng = engines.load(flow,
                       store=dict(x=111, y=222, z=333),
                       engine_conf=engine_conf)
                       engine='worker-based', **engine_options)
    eng.run()
    return eng.storage.fetch_all()

@@ -90,6 +92,7 @@ if __name__ == "__main__":

    tmp_path = None
    if USE_FILESYSTEM:
        worker_count = FILE_WORKERS
        tmp_path = tempfile.mkdtemp(prefix='wbe-example-')
        shared_conf.update({
            'transport': 'filesystem',
@@ -100,6 +103,7 @@ if __name__ == "__main__":
            },
        })
    else:
        worker_count = MEMORY_WORKERS
        shared_conf.update({
            'transport': 'memory',
            'transport_options': {
@@ -108,28 +112,26 @@ if __name__ == "__main__":
        })
    worker_conf = dict(WORKER_CONF)
    worker_conf.update(shared_conf)
    engine_conf = dict(ENGINE_CONF)
    engine_conf.update(shared_conf)
    engine_options = dict(shared_conf)
    workers = []
    worker_topics = []

    try:
        # Create a set of workers to simulate actual remote workers.
        print('Running %s workers.' % (WORKERS))
        for i in range(0, WORKERS):
        print('Running %s workers.' % (worker_count))
        for i in range(0, worker_count):
            worker_conf['topic'] = 'worker-%s' % (i + 1)
            worker_topics.append(worker_conf['topic'])
            w = worker.Worker(**worker_conf)
            runner = threading.Thread(target=w.run)
            runner.daemon = True
            runner = threading_utils.daemon_thread(w.run)
            runner.start()
            w.wait()
            workers.append((runner, w.stop))

        # Now use those workers to do something.
        print('Executing some work.')
        engine_conf['topics'] = worker_topics
        result = run(engine_conf)
        engine_options['topics'] = worker_topics
        result = run(engine_options)
        print('Execution finished.')
        # This is done so that the test examples can work correctly
        # even when the keys change order (which will happen in various

@@ -33,7 +33,7 @@ from taskflow import exceptions
from taskflow.patterns import unordered_flow as uf
from taskflow import task
from taskflow.tests import utils
from taskflow.utils import misc
from taskflow.types import failure

import example_utils as eu  # noqa

@@ -93,18 +93,18 @@ def run(**store):
    try:
        with utils.wrap_all_failures():
            taskflow.engines.run(flow, store=store,
                                 engine_conf='parallel')
                                 engine='parallel')
    except exceptions.WrappedFailure as ex:
        unknown_failures = []
        for failure in ex:
            if failure.check(FirstException):
                print("Got FirstException: %s" % failure.exception_str)
            elif failure.check(SecondException):
                print("Got SecondException: %s" % failure.exception_str)
        for a_failure in ex:
            if a_failure.check(FirstException):
                print("Got FirstException: %s" % a_failure.exception_str)
            elif a_failure.check(SecondException):
                print("Got SecondException: %s" % a_failure.exception_str)
            else:
                print("Unknown failure: %s" % failure)
                unknown_failures.append(failure)
        misc.Failure.reraise_if_any(unknown_failures)
                print("Unknown failure: %s" % a_failure)
                unknown_failures.append(a_failure)
        failure.Failure.reraise_if_any(unknown_failures)


eu.print_wrapped("Raise and catch first exception only")

@@ -14,6 +14,7 @@
# License for the specific language governing permissions and limitations
# under the License.

import os
import traceback

import six
@@ -25,6 +26,14 @@ class TaskFlowException(Exception):
    NOTE(harlowja): in later versions of python we can likely remove the need
    to have a cause here as PY3+ have implemented PEP 3134 which handles
    chaining in a much more elegant manner.

    :param message: the exception message, typically some string that is
                    useful for consumers to view when debugging or analyzing
                    failures.
    :param cause: the cause of the exception being raised, when provided this
                  should itself be an exception instance, this is useful for
                  creating a chain of exceptions for versions of python where
                  this is not yet implemented/supported natively.
    """
    def __init__(self, message, cause=None):
        super(TaskFlowException, self).__init__(message)
@@ -38,21 +47,31 @@ class TaskFlowException(Exception):
        """Pretty formats a taskflow exception + any connected causes."""
        if indent < 0:
            raise ValueError("indent must be greater than or equal to zero")
        return os.linesep.join(self._pformat(self, [], 0,
                                             indent=indent,
                                             indent_text=indent_text))

        def _format(excp, indent_by):
            lines = []
            for line in traceback.format_exception_only(type(excp), excp):
                # We'll add our own newlines at the end of formatting.
                if line.endswith("\n"):
                    line = line[0:-1]
                lines.append((indent_text * indent_by) + line)
            try:
                lines.extend(_format(excp.cause, indent_by + indent))
            except AttributeError:
                pass
            return lines

        return "\n".join(_format(self, 0))
    @classmethod
    def _pformat(cls, excp, lines, current_indent, indent=2, indent_text=" "):
        line_prefix = indent_text * current_indent
        for line in traceback.format_exception_only(type(excp), excp):
            # We'll add our own newlines at the end of formatting.
            #
            # NOTE(harlowja): the reason we don't search for os.linesep is
            # that the traceback module seems to only use '\n' (for some
            # reason).
            if line.endswith("\n"):
                line = line[0:-1]
            lines.append(line_prefix + line)
        try:
            cause = excp.cause
        except AttributeError:
            pass
        else:
            if cause is not None:
                cls._pformat(cause, lines, current_indent + indent,
                             indent=indent, indent_text=indent_text)
        return lines


# Errors related to storage or operations on storage units.
@@ -98,8 +117,21 @@ class DependencyFailure(TaskFlowException):
    """Raised when some type of dependency problem occurs."""


class AmbiguousDependency(DependencyFailure):
    """Raised when some type of ambiguous dependency problem occurs."""


class MissingDependencies(DependencyFailure):
    """Raised when an entity has dependencies that can not be satisfied."""
    """Raised when an entity has dependencies that can not be satisfied.

    :param who: the entity that caused the missing dependency to be triggered.
    :param requirements: the dependencies which were not satisfied.

    Further arguments are interpreted as in
    :py:class:`~taskflow.exceptions.TaskFlowException`.
    """

    #: Exception message template used when creating an actual message.
    MESSAGE_TPL = ("%(who)s requires %(requirements)s but no other entity"
                   " produces said requirements")

@@ -109,6 +141,10 @@ class MissingDependencies(DependencyFailure):
        self.missing_requirements = requirements


class CompilationFailure(TaskFlowException):
    """Raised when some type of compilation issue is found."""


class IncompatibleVersion(TaskFlowException):
    """Raised when some type of version incompatibility is found."""

@@ -135,13 +171,30 @@ class InvalidFormat(TaskFlowException):

# Others.

class WrappedFailure(Exception):
    """Wraps one or several failures.
class NotImplementedError(NotImplementedError):
    """Exception for when some functionality really isn't implemented.

    When exception cannot be re-raised (for example, because
    the value and traceback is lost in serialization) or
    there are several exceptions, we wrap corresponding Failure
    objects into this exception class.
    This is typically useful when the library itself needs to distinguish
    internal features not being made available from user features not being
    made available/implemented (and to avoid misinterpreting the two).
    """


class WrappedFailure(Exception):
    """Wraps one or several failure objects.

    When exception/s cannot be re-raised (for example, because the value and
    traceback are lost in serialization) or there are several exceptions active
    at the same time (due to more than one thread raising exceptions), we will
    wrap the corresponding failure objects into this exception class and
    *may* reraise this exception type to allow users to handle the contained
    failures/causes as they see fit...

    See the failure class documentation for a more comprehensive set of reasons
    why this object *may* be reraised instead of the original exception.

    :param causes: the :py:class:`~taskflow.types.failure.Failure` objects
                   that caused this exception to be raised.
    """

    def __init__(self, causes):
@@ -163,12 +216,14 @@ class WrappedFailure(Exception):
        return len(self._causes)

    def check(self, *exc_classes):
        """Check if any of exc_classes caused (part of) the failure.
        """Check if any of exception classes caused the failure/s.

        Arguments of this method can be exception types or type names
        (strings). If any of wrapped failures were caused by exception
        of given type, the corresponding argument is returned. Else,
        None is returned.
        :param exc_classes: exception types/exception type names to
                            search for.

        If any of the contained failures were caused by an exception of a
        given type, the corresponding argument that matched is returned. If
        not then none is returned.
        """
        if not exc_classes:
            return None
@@ -184,7 +239,10 @@ class WrappedFailure(Exception):


def exception_message(exc):
    """Return the string representation of exception."""
    """Return the string representation of exception.

    :param exc: exception object to get a string representation of.
    """
    # NOTE(imelnikov): Dealing with non-ascii data in python is difficult:
    # https://bugs.launchpad.net/taskflow/+bug/1275895
    # https://bugs.launchpad.net/taskflow/+bug/1276053

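To make the new cause chaining and pformat behavior above concrete, a small hypothetical usage (assumed for illustration, not part of this change) might look like:

    from taskflow import exceptions as exc

    # Build a two-level cause chain and pretty print it; each cause is
    # rendered on its own line, indented under the exception that wraps it.
    try:
        raise IOError("disk unavailable")
    except IOError as e:
        wrapped = exc.TaskFlowException("saving failed", cause=e)
        print(wrapped.pformat(indent=2))
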
@@ -16,9 +16,21 @@

import abc

from oslo_utils import reflection
import six

from taskflow.utils import reflection
# Link metadata keys that have inherent/special meaning.
#
# This key denotes the link is an invariant that ensures the order is
# correctly preserved.
LINK_INVARIANT = 'invariant'
# This key denotes the link is manually/user-specified.
LINK_MANUAL = 'manual'
# This key denotes the link was created when resolving/compiling retries.
LINK_RETRY = 'retry'
# This key denotes the link was created due to symbol constraints and the
# value will be a set of names that the constraint ensures are satisfied.
LINK_REASONS = 'reasons'


@six.add_metaclass(abc.ABCMeta)
@@ -34,10 +46,7 @@ class Flow(object):

    NOTE(harlowja): if a flow is placed in another flow as a subflow, a desired
    way to compose flows together, then it is valid and permissible that during
    execution the subflow & parent flow may be flattened into a new flow. Since
    a flow is just a 'structuring' concept this is typically a behavior that
    should not be worried about (as it is not visible to the user), but it is
    worth mentioning here.
    compilation the subflow & parent flow *may* be flattened into a new flow.
    """

    def __init__(self, name, retry=None):
@@ -45,7 +54,7 @@ class Flow(object):
        self._retry = retry
        # NOTE(akarpinska): if retry doesn't have a name,
        # the name of its owner will be assigned
        if self._retry and self._retry.name is None:
        if self._retry is not None and self._retry.name is None:
            self._retry.name = self.name + "_retry"

    @property
@@ -93,27 +102,14 @@ class Flow(object):

    @property
    def provides(self):
        """Set of result names provided by the flow.

        Includes names of all the outputs provided by atoms of this flow.
        """
        """Set of symbol names provided by the flow."""
        provides = set()
        if self._retry:
        if self._retry is not None:
            provides.update(self._retry.provides)
        for subflow in self:
            provides.update(subflow.provides)
        return provides
        for item in self:
            provides.update(item.provides)
        return frozenset(provides)

    @property
    @abc.abstractproperty
    def requires(self):
        """Set of argument names required by the flow.

        Includes names of all the inputs required by atoms of this
        flow, but not provided within the flow itself.
        """
        requires = set()
        if self._retry:
            requires.update(self._retry.requires)
        for subflow in self:
            requires.update(subflow.requires)
        return requires - self.provides
        """Set of *unsatisfied* symbol names required by the flow."""

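A short sketch of the reworked provides/requires semantics (reusing the TaskA/TaskB tasks from simple_linear_pass.py above; the printed values shown in comments are an assumption for illustration):

    # TaskA provides 'a' and TaskB requires 'a', so the flow satisfies
    # that symbol internally and reports no unsatisfied requirements.
    wf = linear_flow.Flow("pass-from-to")
    wf.add(TaskA('a'), TaskB('b'))
    print(wf.provides)  # frozenset with 'a'
    print(wf.requires)  # empty set; 'a' is produced within the flow
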
@@ -15,12 +15,12 @@
# under the License.

import contextlib
import logging

import six
from stevedore import driver

from taskflow import exceptions as exc
from taskflow import logging
from taskflow.utils import misc


@@ -55,12 +55,12 @@ def fetch(name, conf, namespace=BACKEND_NAMESPACE, **kwargs):
        conf = {'board': conf}
    board = conf['board']
    try:
        pieces = misc.parse_uri(board)
        uri = misc.parse_uri(board)
    except (TypeError, ValueError):
        pass
    else:
        board = pieces['scheme']
        conf = misc.merge_uri(pieces, conf.copy())
        board = uri.scheme
        conf = misc.merge_uri(uri, conf.copy())
    LOG.debug('Looking for %r jobboard driver in %r', board, namespace)
    try:
        mgr = driver.DriverManager(namespace, board,

@@ -17,21 +17,20 @@
|
||||
import collections
|
||||
import contextlib
|
||||
import functools
|
||||
import logging
|
||||
import threading
|
||||
|
||||
from concurrent import futures
|
||||
from kazoo import exceptions as k_exceptions
|
||||
from kazoo.protocol import paths as k_paths
|
||||
from kazoo.recipe import watchers
|
||||
from oslo_serialization import jsonutils
|
||||
from oslo_utils import excutils
|
||||
from oslo_utils import uuidutils
|
||||
import six
|
||||
|
||||
from taskflow import exceptions as excp
|
||||
from taskflow.jobs import job as base_job
|
||||
from taskflow.jobs import jobboard
|
||||
from taskflow.openstack.common import excutils
|
||||
from taskflow.openstack.common import jsonutils
|
||||
from taskflow.openstack.common import uuidutils
|
||||
from taskflow.jobs import base
|
||||
from taskflow import logging
|
||||
from taskflow import states
|
||||
from taskflow.types import timing as tt
|
||||
from taskflow.utils import kazoo_utils
|
||||
@@ -62,7 +61,9 @@ def _check_who(who):
|
||||
raise ValueError("Job applicant must be non-empty")
|
||||
|
||||
|
||||
class ZookeeperJob(base_job.Job):
|
||||
class ZookeeperJob(base.Job):
|
||||
"""A zookeeper job."""
|
||||
|
||||
def __init__(self, name, board, client, backend, path,
|
||||
uuid=None, details=None, book=None, book_data=None,
|
||||
created_on=None):
|
||||
@@ -77,10 +78,13 @@ class ZookeeperJob(base_job.Job):
|
||||
if all((self._book, self._book_data)):
|
||||
raise ValueError("Only one of 'book_data' or 'book'"
|
||||
" can be provided")
|
||||
self._path = path
|
||||
self._path = k_paths.normpath(path)
|
||||
self._lock_path = path + LOCK_POSTFIX
|
||||
self._created_on = created_on
|
||||
self._node_not_found = False
|
||||
basename = k_paths.basename(self._path)
|
||||
self._root = self._path[0:-len(basename)]
|
||||
self._sequence = int(basename[len(JOB_PREFIX):])
|
||||
|
||||
@property
|
||||
def lock_path(self):
|
||||
@@ -90,6 +94,16 @@ class ZookeeperJob(base_job.Job):
|
||||
def path(self):
|
||||
return self._path
|
||||
|
||||
@property
|
||||
def sequence(self):
|
||||
"""Sequence number of the current job."""
|
||||
return self._sequence
|
||||
|
||||
@property
|
||||
def root(self):
|
||||
"""The parent path of the job in zookeeper."""
|
||||
return self._root
|
||||
|
||||
def _get_node_attr(self, path, attr_name, trans_func=None):
|
||||
try:
|
||||
_data, node_stat = self._client.get(path)
|
||||
@@ -104,7 +118,7 @@ class ZookeeperJob(base_job.Job):
|
||||
% (attr_name, self.uuid, self.path, path), e)
|
||||
except self._client.handler.timeout_exception as e:
|
||||
raise excp.JobFailure("Can not fetch the %r attribute"
|
||||
" of job %s (%s), connection timed out"
|
||||
" of job %s (%s), operation timed out"
|
||||
% (attr_name, self.uuid, self.path), e)
|
||||
except k_exceptions.SessionExpiredError as e:
|
||||
raise excp.JobFailure("Can not fetch the %r attribute"
|
||||
@@ -172,7 +186,7 @@ class ZookeeperJob(base_job.Job):
|
||||
" session expired" % (self.uuid), e)
|
||||
except self._client.handler.timeout_exception as e:
|
||||
raise excp.JobFailure("Can not fetch the state of %s,"
|
||||
" connection timed out" % (self.uuid), e)
|
||||
" operation timed out" % (self.uuid), e)
|
||||
except k_exceptions.KazooException as e:
|
||||
raise excp.JobFailure("Can not fetch the state of %s, internal"
|
||||
" error" % (self.uuid), e)
|
||||
@@ -186,8 +200,11 @@ class ZookeeperJob(base_job.Job):
|
||||
return states.UNCLAIMED
|
||||
return states.CLAIMED
|
||||
|
||||
def __cmp__(self, other):
|
||||
return cmp(self.path, other.path)
|
||||
def __lt__(self, other):
|
||||
if self.root == other.root:
|
||||
return self.sequence < other.sequence
|
||||
else:
|
||||
return self.root < other.root
|
||||
|
||||
def __hash__(self):
|
||||
return hash(self.path)
|
||||
@@ -218,14 +235,14 @@ class ZookeeperJob(base_job.Job):
|
||||
|
||||
|
||||
class ZookeeperJobBoardIterator(six.Iterator):
|
||||
"""Iterator over a zookeeper jobboard.
|
||||
"""Iterator over a zookeeper jobboard that iterates over potential jobs.
|
||||
|
||||
It supports the following attributes/constructor arguments:
|
||||
|
||||
* ensure_fresh: boolean that requests that during every fetch of a new
|
||||
* ``ensure_fresh``: boolean that requests that during every fetch of a new
|
||||
set of jobs this will cause the iterator to force the backend to
|
||||
refresh (ensuring that the jobboard has the most recent job listings).
|
||||
* only_unclaimed: boolean that indicates whether to only iterate
|
||||
* ``only_unclaimed``: boolean that indicates whether to only iterate
|
||||
over unclaimed jobs.
|
||||
"""
|
||||
|
||||
@@ -274,7 +291,30 @@ class ZookeeperJobBoardIterator(six.Iterator):
|
||||
return job
|
||||
|
||||
|
||||
class ZookeeperJobBoard(jobboard.NotifyingJobBoard):
|
||||
class ZookeeperJobBoard(base.NotifyingJobBoard):
|
||||
"""A jobboard backend by zookeeper.
|
||||
|
||||
Powered by the `kazoo <http://kazoo.readthedocs.org/>`_ library.
|
||||
|
||||
This jobboard creates *sequenced* persistent znodes in a directory in
|
||||
zookeeper (that directory defaults ``/taskflow/jobs``) and uses zookeeper
|
||||
watches to notify other jobboards that the job which was posted using the
|
||||
:meth:`.post` method (this creates a znode with contents/details in json)
|
||||
The users of those jobboard(s) (potentially on disjoint sets of machines)
|
||||
can then iterate over the available jobs and decide if they want to attempt
|
||||
to claim one of the jobs they have iterated over. If so they will then
|
||||
attempt to contact zookeeper and will attempt to create a ephemeral znode
|
||||
using the name of the persistent znode + ".lock" as a postfix. If the
|
||||
entity trying to use the jobboard to :meth:`.claim` the job is able to
|
||||
create a ephemeral znode with that name then it will be allowed (and
|
||||
expected) to perform whatever *work* the contents of that job that it
|
||||
locked described. Once finished the ephemeral znode and persistent znode
|
||||
may be deleted (if successfully completed) in a single transcation or if
|
||||
not successfull (or the entity that claimed the znode dies) the ephemeral
|
||||
znode will be released (either manually by using :meth:`.abandon` or
|
||||
automatically by zookeeper the ephemeral is deemed to be lost).
|
||||
"""
|
||||
|
||||
def __init__(self, name, conf,
|
||||
client=None, persistence=None, emit_notifications=True):
|
||||
super(ZookeeperJobBoard, self).__init__(name, conf)
|
||||
@@ -298,8 +338,7 @@ class ZookeeperJobBoard(jobboard.NotifyingJobBoard):
|
||||
self._persistence = persistence
|
||||
# Misc. internal details
|
||||
self._known_jobs = {}
|
||||
self._job_lock = threading.RLock()
|
||||
self._job_cond = threading.Condition(self._job_lock)
|
||||
self._job_cond = threading.Condition()
|
||||
self._open_close_lock = threading.RLock()
|
||||
self._client.add_listener(self._state_change_listener)
|
||||
self._bad_paths = frozenset([path])
|
||||
@@ -312,8 +351,12 @@ class ZookeeperJobBoard(jobboard.NotifyingJobBoard):
|
||||
|
||||
def _emit(self, state, details):
|
||||
# Submit the work to the executor to avoid blocking the kazoo queue.
|
||||
if self._worker is not None:
|
||||
try:
|
||||
self._worker.submit(self.notifier.notify, state, details)
|
||||
except (AttributeError, RuntimeError):
|
||||
# Notification thread is shutdown or non-existent, either case we
|
||||
# just want to skip submitting a notification...
|
||||
pass
|
||||
|
||||
@property
|
||||
def path(self):
|
||||
@@ -321,20 +364,19 @@ class ZookeeperJobBoard(jobboard.NotifyingJobBoard):
|
||||
|
||||
@property
|
||||
def job_count(self):
|
||||
with self._job_lock:
|
||||
return len(self._known_jobs)
|
||||
return len(self._known_jobs)
|
||||
|
||||
def _fetch_jobs(self, ensure_fresh=False):
|
||||
if ensure_fresh:
|
||||
self._force_refresh()
|
||||
with self._job_lock:
|
||||
with self._job_cond:
|
||||
return sorted(six.itervalues(self._known_jobs))
|
||||
|
||||
def _force_refresh(self):
|
||||
try:
|
||||
children = self._client.get_children(self.path)
|
||||
except self._client.handler.timeout_exception as e:
|
||||
raise excp.JobFailure("Refreshing failure, connection timed out",
|
||||
raise excp.JobFailure("Refreshing failure, operation timed out",
|
||||
e)
|
||||
except k_exceptions.SessionExpiredError as e:
|
||||
raise excp.JobFailure("Refreshing failure, session expired", e)
|
||||
@@ -351,11 +393,13 @@ class ZookeeperJobBoard(jobboard.NotifyingJobBoard):
|
||||
ensure_fresh=ensure_fresh)
|
||||
|
||||
def _remove_job(self, path):
|
||||
LOG.debug("Removing job that was at path: %s", path)
|
||||
with self._job_lock:
|
||||
if path not in self._known_jobs:
|
||||
return
|
||||
with self._job_cond:
|
||||
job = self._known_jobs.pop(path, None)
|
||||
if job is not None:
|
||||
self._emit(jobboard.REMOVAL, details={'job': job})
|
||||
LOG.debug("Removed job that was at path '%s'", path)
|
||||
self._emit(base.REMOVAL, details={'job': job})
|
||||
|
||||
def _process_child(self, path, request):
|
||||
"""Receives the result of a child data fetch request."""
|
||||
@@ -368,7 +412,7 @@ class ZookeeperJobBoard(jobboard.NotifyingJobBoard):
|
||||
LOG.warn("Incorrectly formatted job data found at path: %s",
|
||||
path, exc_info=True)
|
||||
except self._client.handler.timeout_exception:
|
||||
LOG.warn("Connection timed out fetching job data from path: %s",
|
||||
LOG.warn("Operation timed out fetching job data from path: %s",
|
||||
path, exc_info=True)
|
||||
except k_exceptions.SessionExpiredError:
|
||||
LOG.warn("Session expired fetching job data from path: %s", path,
|
||||
@@ -380,8 +424,10 @@ class ZookeeperJobBoard(jobboard.NotifyingJobBoard):
|
||||
LOG.warn("Internal error fetching job data from path: %s",
|
||||
path, exc_info=True)
|
||||
else:
|
||||
self._job_cond.acquire()
|
||||
try:
|
||||
with self._job_cond:
|
||||
# Now we can officially check if someone already placed this
# job's information into the known job set (if it's already
# existing then just leave it alone).
|
||||
if path not in self._known_jobs:
|
||||
job = ZookeeperJob(job_data['name'], self,
|
||||
self._client, self._persistence, path,
|
||||
@@ -391,10 +437,8 @@ class ZookeeperJobBoard(jobboard.NotifyingJobBoard):
|
||||
created_on=created_on)
|
||||
self._known_jobs[path] = job
|
||||
self._job_cond.notify_all()
|
||||
finally:
|
||||
self._job_cond.release()
|
||||
if job is not None:
|
||||
self._emit(jobboard.POSTED, details={'job': job})
|
||||
self._emit(base.POSTED, details={'job': job})
|
||||
|
||||
def _on_job_posting(self, children, delayed=True):
|
||||
LOG.debug("Got children %s under path %s", children, self.path)
|
||||
@@ -405,32 +449,43 @@ class ZookeeperJobBoard(jobboard.NotifyingJobBoard):
|
||||
continue
|
||||
child_paths.append(k_paths.join(self.path, c))
|
||||
|
||||
# Remove jobs that we know about but which are no longer children
|
||||
with self._job_lock:
|
||||
removals = set()
|
||||
for path, _job in six.iteritems(self._known_jobs):
|
||||
# Figure out what we really should be investigating and what we
|
||||
# shouldn't (remove jobs that exist in our local version, but don't
|
||||
# exist in the children anymore) and accumulate all paths that we
|
||||
# need to trigger population of (without holding the job lock).
|
||||
investigate_paths = []
|
||||
pending_removals = []
|
||||
with self._job_cond:
|
||||
for path in six.iterkeys(self._known_jobs):
|
||||
if path not in child_paths:
|
||||
removals.add(path)
|
||||
for path in removals:
|
||||
self._remove_job(path)
|
||||
|
||||
# Ensure that we have a job record for each new job that has appeared
|
||||
pending_removals.append(path)
|
||||
for path in child_paths:
|
||||
if path in self._bad_paths:
|
||||
continue
|
||||
with self._job_lock:
|
||||
if path not in self._known_jobs:
|
||||
# Fire off the request to populate this job asynchronously.
|
||||
#
|
||||
# This method is *usually* called from a asynchronous
|
||||
# handler so it's better to exit from this quickly to
|
||||
# allow other asynchronous handlers to be executed.
|
||||
request = self._client.get_async(path)
|
||||
child_proc = functools.partial(self._process_child, path)
|
||||
if delayed:
|
||||
request.rawlink(child_proc)
|
||||
else:
|
||||
child_proc(request)
|
||||
# This pre-check will *not* guarantee that we will not already
|
||||
# have the job (if it's being populated elsewhere) but it will
|
||||
# reduce the amount of duplicated requests in general; later when
|
||||
# the job information has been populated we will ensure that we
|
||||
# are not adding duplicates into the currently known jobs...
|
||||
if path in self._known_jobs:
|
||||
continue
|
||||
if path not in investigate_paths:
|
||||
investigate_paths.append(path)
|
||||
if pending_removals:
|
||||
with self._job_cond:
|
||||
for path in pending_removals:
|
||||
self._remove_job(path)
|
||||
for path in investigate_paths:
|
||||
# Fire off the request to populate this job.
|
||||
#
|
||||
# This method is *usually* called from an asynchronous handler so
# it's better to exit from this quickly to allow other asynchronous
# handlers to be executed.
|
||||
request = self._client.get_async(path)
|
||||
if delayed:
|
||||
request.rawlink(functools.partial(self._process_child, path))
|
||||
else:
|
||||
self._process_child(path, request)
|
||||
|
||||
def post(self, name, book=None, details=None):
|
||||
|
||||
@@ -466,13 +521,10 @@ class ZookeeperJobBoard(jobboard.NotifyingJobBoard):
|
||||
self._persistence, job_path,
|
||||
book=book, details=details,
|
||||
uuid=job_uuid)
|
||||
self._job_cond.acquire()
|
||||
try:
|
||||
with self._job_cond:
|
||||
self._known_jobs[job_path] = job
|
||||
self._job_cond.notify_all()
|
||||
finally:
|
||||
self._job_cond.release()
|
||||
self._emit(jobboard.POSTED, details={'job': job})
|
||||
self._emit(base.POSTED, details={'job': job})
|
||||
return job
|
||||
|
||||
def claim(self, job, who):
|
||||
@@ -531,14 +583,13 @@ class ZookeeperJobBoard(jobboard.NotifyingJobBoard):
|
||||
if not job_path:
|
||||
raise ValueError("Unable to check if %r is a known path"
|
||||
% (job_path))
|
||||
with self._job_lock:
|
||||
if job_path not in self._known_jobs:
|
||||
fail_msg_tpl += ", unknown job"
|
||||
raise excp.NotFound(fail_msg_tpl % (job_uuid))
|
||||
if job_path not in self._known_jobs:
|
||||
fail_msg_tpl += ", unknown job"
|
||||
raise excp.NotFound(fail_msg_tpl % (job_uuid))
|
||||
try:
|
||||
yield
|
||||
except self._client.handler.timeout_exception as e:
|
||||
fail_msg_tpl += ", connection timed out"
|
||||
fail_msg_tpl += ", operation timed out"
|
||||
raise excp.JobFailure(fail_msg_tpl % (job_uuid), e)
|
||||
except k_exceptions.SessionExpiredError as e:
|
||||
fail_msg_tpl += ", session expired"
|
||||
@@ -610,14 +661,12 @@ class ZookeeperJobBoard(jobboard.NotifyingJobBoard):
|
||||
|
||||
def wait(self, timeout=None):
|
||||
# Wait until timeout expires (or forever) for jobs to appear.
|
||||
watch = None
|
||||
if timeout is not None:
|
||||
watch = tt.StopWatch(duration=float(timeout)).start()
|
||||
self._job_cond.acquire()
|
||||
try:
|
||||
watch = tt.StopWatch(duration=timeout)
|
||||
watch.start()
|
||||
with self._job_cond:
|
||||
while True:
|
||||
if not self._known_jobs:
|
||||
if watch is not None and watch.expired():
|
||||
if watch.expired():
|
||||
raise excp.NotFound("Expired waiting for jobs to"
|
||||
" arrive; waited %s seconds"
|
||||
% watch.elapsed())
|
||||
@@ -626,17 +675,12 @@ class ZookeeperJobBoard(jobboard.NotifyingJobBoard):
|
||||
# when we acquire the condition that there will actually
|
||||
# be jobs (especially if we are spuriously awaken), so we
|
||||
# must recalculate the amount of time we really have left.
|
||||
timeout = None
|
||||
if watch is not None:
|
||||
timeout = watch.leftover()
|
||||
self._job_cond.wait(timeout)
|
||||
self._job_cond.wait(watch.leftover(return_none=True))
|
||||
else:
|
||||
it = ZookeeperJobBoardIterator(self)
|
||||
it._jobs.extend(self._fetch_jobs())
|
||||
it._fetched = True
|
||||
return it
|
||||
finally:
|
||||
self._job_cond.release()
|
||||
|
||||
@property
|
||||
def connected(self):
|
||||
@@ -651,7 +695,7 @@ class ZookeeperJobBoard(jobboard.NotifyingJobBoard):
|
||||
LOG.debug("Shutting down the notifier")
|
||||
self._worker.shutdown()
|
||||
self._worker = None
|
||||
with self._job_lock:
|
||||
with self._job_cond:
|
||||
self._known_jobs.clear()
|
||||
LOG.debug("Stopped & cleared local state")
|
||||
|
||||
|
||||
@@ -17,9 +17,100 @@
|
||||
|
||||
import abc
|
||||
|
||||
from oslo_utils import uuidutils
|
||||
import six
|
||||
|
||||
from taskflow.utils import misc
|
||||
from taskflow.types import notifier
|
||||
|
||||
|
||||
@six.add_metaclass(abc.ABCMeta)
|
||||
class Job(object):
|
||||
"""A abstraction that represents a named and trackable unit of work.
|
||||
|
||||
A job connects a logbook, a owner, last modified and created on dates and
|
||||
any associated state that the job has. Since it is a connector to a
|
||||
logbook, which are each associated with a set of factories that can create
|
||||
set of flows, it is the current top-level container for a piece of work
|
||||
that can be owned by an entity (typically that entity will read those
|
||||
logbooks and run any contained flows).
|
||||
|
||||
Only one entity will be allowed to own and operate on the flows contained
|
||||
in a job at a given time (for the foreseeable future).
|
||||
|
||||
NOTE(harlowja): It is the object that will be transferred to another
|
||||
entity on failure so that the contained flows ownership can be
|
||||
transferred to the secondary entity/owner for resumption, continuation,
|
||||
reverting...
|
||||
"""
|
||||
|
||||
def __init__(self, name, uuid=None, details=None):
|
||||
if uuid:
|
||||
self._uuid = uuid
|
||||
else:
|
||||
self._uuid = uuidutils.generate_uuid()
|
||||
self._name = name
|
||||
if not details:
|
||||
details = {}
|
||||
self._details = details
|
||||
|
||||
@abc.abstractproperty
|
||||
def last_modified(self):
|
||||
"""The datetime the job was last modified."""
|
||||
pass
|
||||
|
||||
@abc.abstractproperty
|
||||
def created_on(self):
|
||||
"""The datetime the job was created on."""
|
||||
pass
|
||||
|
||||
@abc.abstractproperty
|
||||
def board(self):
|
||||
"""The board this job was posted on or was created from."""
|
||||
|
||||
@abc.abstractproperty
|
||||
def state(self):
|
||||
"""The current state of this job."""
|
||||
|
||||
@abc.abstractproperty
|
||||
def book(self):
|
||||
"""Logbook associated with this job.
|
||||
|
||||
If no logbook is associated with this job, this property is None.
|
||||
"""
|
||||
|
||||
@abc.abstractproperty
|
||||
def book_uuid(self):
|
||||
"""UUID of logbook associated with this job.
|
||||
|
||||
If no logbook is associated with this job, this property is None.
|
||||
"""
|
||||
|
||||
@abc.abstractproperty
|
||||
def book_name(self):
|
||||
"""Name of logbook associated with this job.
|
||||
|
||||
If no logbook is associated with this job, this property is None.
|
||||
"""
|
||||
|
||||
@property
|
||||
def uuid(self):
|
||||
"""The uuid of this job."""
|
||||
return self._uuid
|
||||
|
||||
@property
|
||||
def details(self):
|
||||
"""A dictionary of any details associated with this job."""
|
||||
return self._details
|
||||
|
||||
@property
|
||||
def name(self):
|
||||
"""The non-uniquely identifying name of this job."""
|
||||
return self._name
|
||||
|
||||
def __str__(self):
|
||||
"""Pretty formats the job into something *more* meaningful."""
|
||||
return "%s %s (%s): %s" % (type(self).__name__,
|
||||
self.name, self.uuid, self.details)
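A minimal sketch (not part of this change) of what a concrete ``Job`` subclass could look like, given only the abstract properties shown above; the ``InMemoryJob`` name and its fixed ``'UNCLAIMED'`` state are illustrative placeholders, not taskflow API.

import datetime

class InMemoryJob(Job):
    """Illustrative in-memory job; real implementations back these
    properties with a store (for example zookeeper, as above)."""

    def __init__(self, name, board, uuid=None, details=None):
        super(InMemoryJob, self).__init__(name, uuid=uuid, details=details)
        self._board = board
        self._created_on = datetime.datetime.utcnow()

    @property
    def last_modified(self):
        return self._created_on

    @property
    def created_on(self):
        return self._created_on

    @property
    def board(self):
        return self._board

    @property
    def state(self):
        # Placeholder; real jobboards derive this from claim ownership.
        return 'UNCLAIMED'

    @property
    def book(self):
        return None

    @property
    def book_uuid(self):
        return None

    @property
    def book_name(self):
        return None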
|
||||
|
||||
|
||||
@six.add_metaclass(abc.ABCMeta)
|
||||
@@ -203,4 +294,4 @@ class NotifyingJobBoard(JobBoard):
|
||||
"""
|
||||
def __init__(self, name, conf):
|
||||
super(NotifyingJobBoard, self).__init__(name, conf)
|
||||
self.notifier = misc.Notifier()
|
||||
self.notifier = notifier.Notifier()
|
||||
@@ -1,112 +0,0 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright (C) 2013 Rackspace Hosting Inc. All Rights Reserved.
|
||||
# Copyright (C) 2013 Yahoo! Inc. All Rights Reserved.
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License"); you may
|
||||
# not use this file except in compliance with the License. You may obtain
|
||||
# a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
|
||||
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
|
||||
# License for the specific language governing permissions and limitations
|
||||
# under the License.
|
||||
|
||||
import abc
|
||||
|
||||
import six
|
||||
|
||||
from taskflow.openstack.common import uuidutils
|
||||
|
||||
|
||||
@six.add_metaclass(abc.ABCMeta)
|
||||
class Job(object):
|
||||
"""A abstraction that represents a named and trackable unit of work.
|
||||
|
||||
A job connects a logbook, a owner, last modified and created on dates and
|
||||
any associated state that the job has. Since it is a connector to a
|
||||
logbook, which are each associated with a set of factories that can create
|
||||
set of flows, it is the current top-level container for a piece of work
|
||||
that can be owned by an entity (typically that entity will read those
|
||||
logbooks and run any contained flows).
|
||||
|
||||
Only one entity will be allowed to own and operate on the flows contained
|
||||
in a job at a given time (for the foreseeable future).
|
||||
|
||||
NOTE(harlowja): It is the object that will be transferred to another
|
||||
entity on failure so that the contained flows ownership can be
|
||||
transferred to the secondary entity/owner for resumption, continuation,
|
||||
reverting...
|
||||
"""
|
||||
|
||||
def __init__(self, name, uuid=None, details=None):
|
||||
if uuid:
|
||||
self._uuid = uuid
|
||||
else:
|
||||
self._uuid = uuidutils.generate_uuid()
|
||||
self._name = name
|
||||
if not details:
|
||||
details = {}
|
||||
self._details = details
|
||||
|
||||
@abc.abstractproperty
|
||||
def last_modified(self):
|
||||
"""The datetime the job was last modified."""
|
||||
pass
|
||||
|
||||
@abc.abstractproperty
|
||||
def created_on(self):
|
||||
"""The datetime the job was created on."""
|
||||
pass
|
||||
|
||||
@abc.abstractproperty
|
||||
def board(self):
|
||||
"""The board this job was posted on or was created from."""
|
||||
|
||||
@abc.abstractproperty
|
||||
def state(self):
|
||||
"""The current state of this job."""
|
||||
|
||||
@abc.abstractproperty
|
||||
def book(self):
|
||||
"""Logbook associated with this job.
|
||||
|
||||
If no logbook is associated with this job, this property is None.
|
||||
"""
|
||||
|
||||
@abc.abstractproperty
|
||||
def book_uuid(self):
|
||||
"""UUID of logbook associated with this job.
|
||||
|
||||
If no logbook is associated with this job, this property is None.
|
||||
"""
|
||||
|
||||
@abc.abstractproperty
|
||||
def book_name(self):
|
||||
"""Name of logbook associated with this job.
|
||||
|
||||
If no logbook is associated with this job, this property is None.
|
||||
"""
|
||||
|
||||
@property
|
||||
def uuid(self):
|
||||
"""The uuid of this job."""
|
||||
return self._uuid
|
||||
|
||||
@property
|
||||
def details(self):
|
||||
"""A dictionary of any details associated with this job."""
|
||||
return self._details
|
||||
|
||||
@property
|
||||
def name(self):
|
||||
"""The non-uniquely identifying name of this job."""
|
||||
return self._name
|
||||
|
||||
def __str__(self):
|
||||
"""Pretty formats the job into something *more* meaningful."""
|
||||
return "%s %s (%s): %s" % (type(self).__name__,
|
||||
self.name, self.uuid, self.details)
|
||||
@@ -17,13 +17,15 @@
|
||||
from __future__ import absolute_import
|
||||
|
||||
import abc
|
||||
import logging
|
||||
|
||||
from oslo_utils import excutils
|
||||
import six
|
||||
|
||||
from taskflow.openstack.common import excutils
|
||||
from taskflow import logging
|
||||
from taskflow import states
|
||||
from taskflow.utils import misc
|
||||
from taskflow.types import failure
|
||||
from taskflow.types import notifier
|
||||
from taskflow.utils import deprecation
|
||||
|
||||
LOG = logging.getLogger(__name__)
|
||||
|
||||
@@ -31,33 +33,84 @@ LOG = logging.getLogger(__name__)
|
||||
# do not produce results.
|
||||
FINISH_STATES = (states.FAILURE, states.SUCCESS)
|
||||
|
||||
# What is listened for by default...
|
||||
DEFAULT_LISTEN_FOR = (notifier.Notifier.ANY,)
|
||||
|
||||
class ListenerBase(object):
|
||||
|
||||
def _task_matcher(details):
|
||||
"""Matches task details emitted."""
|
||||
if not details:
|
||||
return False
|
||||
if 'task_name' in details and 'task_uuid' in details:
|
||||
return True
|
||||
return False
|
||||
|
||||
|
||||
def _retry_matcher(details):
|
||||
"""Matches retry details emitted."""
|
||||
if not details:
|
||||
return False
|
||||
if 'retry_name' in details and 'retry_uuid' in details:
|
||||
return True
|
||||
return False
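For clarity, a small illustration (not part of this change) of how the matchers above classify the details dictionaries an engine emits; the literal dictionaries are made-up examples.

assert _task_matcher({'task_name': 'fetch', 'task_uuid': 'uuid-1'})
assert not _task_matcher({'retry_name': 'again', 'retry_uuid': 'uuid-2'})
assert _retry_matcher({'retry_name': 'again', 'retry_uuid': 'uuid-2'})
assert not _retry_matcher(None)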
|
||||
|
||||
|
||||
def _bulk_deregister(notifier, registered, details_filter=None):
|
||||
"""Bulk deregisters callbacks associated with many states."""
|
||||
while registered:
|
||||
state, cb = registered.pop()
|
||||
notifier.deregister(state, cb,
|
||||
details_filter=details_filter)
|
||||
|
||||
|
||||
def _bulk_register(watch_states, notifier, cb, details_filter=None):
|
||||
"""Bulk registers a callback associated with many states."""
|
||||
registered = []
|
||||
try:
|
||||
for state in watch_states:
|
||||
if not notifier.is_registered(state, cb,
|
||||
details_filter=details_filter):
|
||||
notifier.register(state, cb,
|
||||
details_filter=details_filter)
|
||||
registered.append((state, cb))
|
||||
except ValueError:
|
||||
with excutils.save_and_reraise_exception():
|
||||
_bulk_deregister(notifier, registered,
|
||||
details_filter=details_filter)
|
||||
else:
|
||||
return registered
|
||||
|
||||
|
||||
class Listener(object):
|
||||
"""Base class for listeners.
|
||||
|
||||
A listener can be attached to an engine to do various actions on flow and
|
||||
task state transitions. It implements context manager protocol to be able
|
||||
to register and unregister with a given engine automatically when a context
|
||||
is entered and when it is exited.
|
||||
atom state transitions. It implements the context manager protocol to be
|
||||
able to register and unregister with a given engine automatically when a
|
||||
context is entered and when it is exited.
|
||||
|
||||
To implement a listener, derive from this class and override
|
||||
``_flow_receiver`` and/or ``_task_receiver`` methods (in this class,
|
||||
they do nothing).
|
||||
``_flow_receiver`` and/or ``_task_receiver`` and/or ``_retry_receiver``
|
||||
methods (in this class, they do nothing).
|
||||
"""
|
||||
|
||||
def __init__(self, engine,
|
||||
task_listen_for=(misc.Notifier.ANY,),
|
||||
flow_listen_for=(misc.Notifier.ANY,)):
|
||||
task_listen_for=DEFAULT_LISTEN_FOR,
|
||||
flow_listen_for=DEFAULT_LISTEN_FOR,
|
||||
retry_listen_for=DEFAULT_LISTEN_FOR):
|
||||
if not task_listen_for:
|
||||
task_listen_for = []
|
||||
if not retry_listen_for:
|
||||
retry_listen_for = []
|
||||
if not flow_listen_for:
|
||||
flow_listen_for = []
|
||||
self._listen_for = {
|
||||
'task': list(task_listen_for),
|
||||
'retry': list(retry_listen_for),
|
||||
'flow': list(flow_listen_for),
|
||||
}
|
||||
self._engine = engine
|
||||
self._registered = False
|
||||
self._registered = {}
|
||||
|
||||
def _flow_receiver(self, state, details):
|
||||
pass
|
||||
@@ -65,46 +118,42 @@ class ListenerBase(object):
|
||||
def _task_receiver(self, state, details):
|
||||
pass
|
||||
|
||||
def _retry_receiver(self, state, details):
|
||||
pass
|
||||
|
||||
def deregister(self):
|
||||
if not self._registered:
|
||||
return
|
||||
|
||||
def _deregister(watch_states, notifier, cb):
|
||||
for s in watch_states:
|
||||
notifier.deregister(s, cb)
|
||||
|
||||
_deregister(self._listen_for['task'], self._engine.task_notifier,
|
||||
self._task_receiver)
|
||||
_deregister(self._listen_for['flow'], self._engine.notifier,
|
||||
self._flow_receiver)
|
||||
|
||||
self._registered = False
|
||||
if 'task' in self._registered:
|
||||
_bulk_deregister(self._engine.atom_notifier,
|
||||
self._registered['task'],
|
||||
details_filter=_task_matcher)
|
||||
del self._registered['task']
|
||||
if 'retry' in self._registered:
|
||||
_bulk_deregister(self._engine.atom_notifier,
|
||||
self._registered['retry'],
|
||||
details_filter=_retry_matcher)
|
||||
del self._registered['retry']
|
||||
if 'flow' in self._registered:
|
||||
_bulk_deregister(self._engine.notifier,
|
||||
self._registered['flow'])
|
||||
del self._registered['flow']
|
||||
|
||||
def register(self):
|
||||
if self._registered:
|
||||
return
|
||||
|
||||
def _register(watch_states, notifier, cb):
|
||||
registered = []
|
||||
try:
|
||||
for s in watch_states:
|
||||
if not notifier.is_registered(s, cb):
|
||||
notifier.register(s, cb)
|
||||
registered.append((s, cb))
|
||||
except ValueError:
|
||||
with excutils.save_and_reraise_exception():
|
||||
for (s, cb) in registered:
|
||||
notifier.deregister(s, cb)
|
||||
|
||||
_register(self._listen_for['task'], self._engine.task_notifier,
|
||||
self._task_receiver)
|
||||
_register(self._listen_for['flow'], self._engine.notifier,
|
||||
self._flow_receiver)
|
||||
|
||||
self._registered = True
|
||||
if 'task' not in self._registered:
|
||||
self._registered['task'] = _bulk_register(
|
||||
self._listen_for['task'], self._engine.atom_notifier,
|
||||
self._task_receiver, details_filter=_task_matcher)
|
||||
if 'retry' not in self._registered:
|
||||
self._registered['retry'] = _bulk_register(
|
||||
self._listen_for['retry'], self._engine.atom_notifier,
|
||||
self._retry_receiver, details_filter=_retry_matcher)
|
||||
if 'flow' not in self._registered:
|
||||
self._registered['flow'] = _bulk_register(
|
||||
self._listen_for['flow'], self._engine.notifier,
|
||||
self._flow_receiver)
|
||||
|
||||
def __enter__(self):
|
||||
self.register()
|
||||
return self
|
||||
|
||||
def __exit__(self, type, value, tb):
|
||||
try:
|
||||
@@ -115,42 +164,63 @@ class ListenerBase(object):
|
||||
self._engine, exc_info=True)
|
||||
|
||||
|
||||
# TODO(harlowja): remove in 0.7 or later...
|
||||
ListenerBase = deprecation.moved_inheritable_class(Listener,
|
||||
'ListenerBase', __name__,
|
||||
version="0.6",
|
||||
removal_version="?")
|
||||
|
||||
|
||||
@six.add_metaclass(abc.ABCMeta)
|
||||
class LoggingBase(ListenerBase):
|
||||
"""Abstract base class for logging listeners.
|
||||
class DumpingListener(Listener):
|
||||
"""Abstract base class for dumping listeners.
|
||||
|
||||
This provides a simple listener that can be attached to an engine which can
|
||||
be derived from to log task and/or flow state transitions to some logging
|
||||
be derived from to dump task and/or flow state transitions to some target
|
||||
backend.
|
||||
|
||||
To implement your own logging listener derive form this class and
|
||||
override ``_log`` method.
|
||||
To implement your own dumping listener derive from this class and
|
||||
override the ``_dump`` method.
|
||||
"""
|
||||
|
||||
@abc.abstractmethod
|
||||
def _log(self, message, *args, **kwargs):
|
||||
raise NotImplementedError()
|
||||
def _dump(self, message, *args, **kwargs):
|
||||
"""Dumps the provided *templated* message to some output."""
|
||||
|
||||
def _flow_receiver(self, state, details):
|
||||
self._log("%s has moved flow '%s' (%s) into state '%s'",
|
||||
self._engine, details['flow_name'],
|
||||
details['flow_uuid'], state)
|
||||
self._dump("%s has moved flow '%s' (%s) into state '%s'"
|
||||
" from state '%s'", self._engine, details['flow_name'],
|
||||
details['flow_uuid'], state, details['old_state'])
|
||||
|
||||
def _task_receiver(self, state, details):
|
||||
if state in FINISH_STATES:
|
||||
result = details.get('result')
|
||||
exc_info = None
|
||||
was_failure = False
|
||||
if isinstance(result, misc.Failure):
|
||||
if isinstance(result, failure.Failure):
|
||||
if result.exc_info:
|
||||
exc_info = tuple(result.exc_info)
|
||||
was_failure = True
|
||||
self._log("%s has moved task '%s' (%s) into state '%s'"
|
||||
" with result '%s' (failure=%s)",
|
||||
self._engine, details['task_name'],
|
||||
details['task_uuid'], state, result, was_failure,
|
||||
exc_info=exc_info)
|
||||
self._dump("%s has moved task '%s' (%s) into state '%s'"
|
||||
" from state '%s' with result '%s' (failure=%s)",
|
||||
self._engine, details['task_name'],
|
||||
details['task_uuid'], state, details['old_state'],
|
||||
result, was_failure, exc_info=exc_info)
|
||||
else:
|
||||
self._log("%s has moved task '%s' (%s) into state '%s'",
|
||||
self._engine, details['task_name'],
|
||||
details['task_uuid'], state)
|
||||
self._dump("%s has moved task '%s' (%s) into state '%s'"
|
||||
" from state '%s'", self._engine, details['task_name'],
|
||||
details['task_uuid'], state, details['old_state'])
|
||||
|
||||
|
||||
# TODO(harlowja): remove in 0.7 or later...
|
||||
class LoggingBase(deprecation.moved_inheritable_class(DumpingListener,
|
||||
'LoggingBase', __name__,
|
||||
version="0.6",
|
||||
removal_version="?")):
|
||||
|
||||
def _dump(self, message, *args, **kwargs):
|
||||
self._log(message, *args, **kwargs)
|
||||
|
||||
@abc.abstractmethod
|
||||
def _log(self, message, *args, **kwargs):
|
||||
"""Logs the provided *templated* message to some output."""
|
||||
|
||||
102
taskflow/listeners/claims.py
Normal file
@@ -0,0 +1,102 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright (C) 2014 Yahoo! Inc. All Rights Reserved.
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License"); you may
|
||||
# not use this file except in compliance with the License. You may obtain
|
||||
# a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
|
||||
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
|
||||
# License for the specific language governing permissions and limitations
|
||||
# under the License.
|
||||
|
||||
from __future__ import absolute_import
|
||||
|
||||
import logging
|
||||
import os
|
||||
|
||||
import six
|
||||
|
||||
from taskflow import exceptions
|
||||
from taskflow.listeners import base
|
||||
from taskflow import states
|
||||
|
||||
LOG = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class CheckingClaimListener(base.Listener):
|
||||
"""Listener that interacts [engine, job, jobboard]; ensures claim is valid.
|
||||
|
||||
This listener (or a derivative) can be associated with an engine's
notification system after the job has been claimed (so that the job's work
can be worked on by that engine). This listener (once associated) will
|
||||
check that the job is still claimed *whenever* the engine notifies of a
|
||||
task or flow state change. If the job is not claimed when a state change
|
||||
occurs, an associated handler (or the default) will be activated to
|
||||
determine how to react to this *hopefully* exceptional case.
|
||||
|
||||
NOTE(harlowja): this may create more traffic than desired to the
|
||||
jobboard backend (zookeeper or other), since the amount of state change
|
||||
per task and flow is non-zero (and checking during each state change will
|
||||
result in quite a few calls to that management system to check the job's
|
||||
claim status); this could be later optimized to check less (or only check
|
||||
on a smaller set of states)
|
||||
|
||||
NOTE(harlowja): if a custom ``on_job_loss`` callback is provided it must
|
||||
accept three positional arguments, the first being the current engine being
|
||||
run, the second being the 'task/flow' state and the third being the details
|
||||
that were sent from the engine to listeners for inspection.
|
||||
"""
|
||||
|
||||
def __init__(self, engine, job, board, owner, on_job_loss=None):
|
||||
super(CheckingClaimListener, self).__init__(engine)
|
||||
self._job = job
|
||||
self._board = board
|
||||
self._owner = owner
|
||||
if on_job_loss is None:
|
||||
self._on_job_loss = self._suspend_engine_on_loss
|
||||
else:
|
||||
if not six.callable(on_job_loss):
|
||||
raise ValueError("Custom 'on_job_loss' handler must be"
|
||||
" callable")
|
||||
self._on_job_loss = on_job_loss
|
||||
|
||||
def _suspend_engine_on_loss(self, engine, state, details):
|
||||
"""The default strategy for handling claims being lost."""
|
||||
try:
|
||||
engine.suspend()
|
||||
except exceptions.TaskFlowException as e:
|
||||
LOG.warn("Failed suspending engine '%s', (previously owned by"
|
||||
" '%s'):%s%s", engine, self._owner, os.linesep,
|
||||
e.pformat())
|
||||
|
||||
def _flow_receiver(self, state, details):
|
||||
self._claim_checker(state, details)
|
||||
|
||||
def _task_receiver(self, state, details):
|
||||
self._claim_checker(state, details)
|
||||
|
||||
def _has_been_lost(self):
|
||||
try:
|
||||
job_state = self._job.state
|
||||
job_owner = self._board.find_owner(self._job)
|
||||
except (exceptions.NotFound, exceptions.JobFailure):
|
||||
return True
|
||||
else:
|
||||
if job_state == states.UNCLAIMED or self._owner != job_owner:
|
||||
return True
|
||||
else:
|
||||
return False
|
||||
|
||||
def _claim_checker(self, state, details):
|
||||
if not self._has_been_lost():
|
||||
LOG.debug("Job '%s' is still claimed (actively owned by '%s')",
|
||||
self._job, self._owner)
|
||||
else:
|
||||
LOG.warn("Job '%s' has lost its claim (previously owned by '%s')",
|
||||
self._job, self._owner)
|
||||
self._on_job_loss(self._engine, state, details)
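A hedged usage sketch (not part of this change) showing the custom ``on_job_loss`` handler contract described above; ``engine``, ``job``, ``board`` and the owner string are assumed to come from an earlier jobboard claim.

def abandon_on_loss(engine, state, details):
    # Same three positional arguments the default handler receives.
    print("Claim lost while handling state '%s'; suspending engine" % state)
    engine.suspend()

# with CheckingClaimListener(engine, job, board, 'worker-1',
#                            on_job_loss=abandon_on_loss):
#     engine.run()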
|
||||
@@ -16,34 +16,187 @@
|
||||
|
||||
from __future__ import absolute_import
|
||||
|
||||
import logging
|
||||
import logging as logging_base
|
||||
import os
|
||||
import sys
|
||||
|
||||
from taskflow.listeners import base
|
||||
from taskflow import logging
|
||||
from taskflow import states
|
||||
from taskflow.types import failure
|
||||
from taskflow.utils import misc
|
||||
|
||||
LOG = logging.getLogger(__name__)
|
||||
|
||||
if sys.version_info[0:2] == (2, 6):
|
||||
_PY26 = True
|
||||
else:
|
||||
_PY26 = False
|
||||
|
||||
class LoggingListener(base.LoggingBase):
|
||||
|
||||
# Fixes this for python 2.6, which was missing the ``isEnabledFor`` method
# when a logger adapter is being used/provided; this will no longer be
# needed once we can just support python 2.7+ (which fixed the lack of this
# method on adapters).
|
||||
def _isEnabledFor(logger, level):
|
||||
if _PY26 and isinstance(logger, logging_base.LoggerAdapter):
|
||||
return logger.logger.isEnabledFor(level)
|
||||
return logger.isEnabledFor(level)
|
||||
|
||||
|
||||
class LoggingListener(base.DumpingListener):
|
||||
"""Listener that logs notifications it receives.
|
||||
|
||||
It listens for task and flow notifications and writes those
|
||||
notifications to provided logger, or logger of its module
|
||||
(``taskflow.listeners.logging``) if none provided. Log level
|
||||
can also be configured, ``logging.DEBUG`` is used by default.
|
||||
It listens for task and flow notifications and writes those notifications
|
||||
to a provided logger, or logger of its module
|
||||
(``taskflow.listeners.logging``) if none is provided (and no class
|
||||
attribute is overridden). The log level can also be
|
||||
configured, ``logging.DEBUG`` is used by default when none is provided.
|
||||
"""
|
||||
|
||||
#: Default logger to use if one is not provided on construction.
|
||||
_LOGGER = None
|
||||
|
||||
def __init__(self, engine,
|
||||
task_listen_for=(misc.Notifier.ANY,),
|
||||
flow_listen_for=(misc.Notifier.ANY,),
|
||||
task_listen_for=base.DEFAULT_LISTEN_FOR,
|
||||
flow_listen_for=base.DEFAULT_LISTEN_FOR,
|
||||
retry_listen_for=base.DEFAULT_LISTEN_FOR,
|
||||
log=None,
|
||||
level=logging.DEBUG):
|
||||
super(LoggingListener, self).__init__(engine,
|
||||
task_listen_for=task_listen_for,
|
||||
flow_listen_for=flow_listen_for)
|
||||
self._logger = log
|
||||
if not self._logger:
|
||||
self._logger = LOG
|
||||
super(LoggingListener, self).__init__(
|
||||
engine, task_listen_for=task_listen_for,
|
||||
flow_listen_for=flow_listen_for, retry_listen_for=retry_listen_for)
|
||||
self._logger = misc.pick_first_not_none(log, self._LOGGER, LOG)
|
||||
self._level = level
|
||||
|
||||
def _log(self, message, *args, **kwargs):
|
||||
def _dump(self, message, *args, **kwargs):
|
||||
self._logger.log(self._level, message, *args, **kwargs)
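A hedged usage sketch (not part of this change): attaching ``LoggingListener`` with an explicit logger and level; ``engine`` is assumed to be an already loaded taskflow engine.

import logging as std_logging

progress_log = std_logging.getLogger('my.flow.progress')

# with LoggingListener(engine, log=progress_log, level=std_logging.INFO):
#     engine.run()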
|
||||
|
||||
|
||||
class DynamicLoggingListener(base.Listener):
|
||||
"""Listener that logs notifications it receives.
|
||||
|
||||
It listens for task and flow notifications and writes those notifications
|
||||
to a provided logger, or logger of its module
|
||||
(``taskflow.listeners.logging``) if none is provided (and no class
|
||||
attribute is overridden). The log level can *slightly* be configured
|
||||
and ``logging.DEBUG`` or ``logging.WARNING`` (unless overriden via a
|
||||
constructor parameter) will be selected automatically based on the
|
||||
execution state and results produced.
|
||||
|
||||
The following flow states cause ``logging.WARNING`` (or provided
|
||||
level) to be used:
|
||||
|
||||
* ``states.FAILURE``
|
||||
* ``states.REVERTED``
|
||||
|
||||
The following task states cause ``logging.WARNING`` (or provided level)
|
||||
to be used:
|
||||
|
||||
* ``states.FAILURE``
|
||||
* ``states.RETRYING``
|
||||
* ``states.REVERTING``
|
||||
|
||||
When a task produces a :py:class:`~taskflow.types.failure.Failure` object
|
||||
as its result (typically this happens when a task raises an exception) this
|
||||
will **always** switch the logger to use ``logging.WARNING`` (if the
|
||||
failure object contains an ``exc_info`` tuple this will also be logged to
|
||||
provide a meaningful traceback).
|
||||
"""
|
||||
|
||||
#: Default logger to use if one is not provided on construction.
|
||||
_LOGGER = None
|
||||
|
||||
def __init__(self, engine,
|
||||
task_listen_for=base.DEFAULT_LISTEN_FOR,
|
||||
flow_listen_for=base.DEFAULT_LISTEN_FOR,
|
||||
retry_listen_for=base.DEFAULT_LISTEN_FOR,
|
||||
log=None, failure_level=logging.WARNING,
|
||||
level=logging.DEBUG):
|
||||
super(DynamicLoggingListener, self).__init__(
|
||||
engine, task_listen_for=task_listen_for,
|
||||
flow_listen_for=flow_listen_for, retry_listen_for=retry_listen_for)
|
||||
self._failure_level = failure_level
|
||||
self._level = level
|
||||
self._task_log_levels = {
|
||||
states.FAILURE: self._failure_level,
|
||||
states.REVERTED: self._failure_level,
|
||||
states.RETRYING: self._failure_level,
|
||||
}
|
||||
self._flow_log_levels = {
|
||||
states.FAILURE: self._failure_level,
|
||||
states.REVERTED: self._failure_level,
|
||||
}
|
||||
self._logger = misc.pick_first_not_none(log, self._LOGGER, LOG)
|
||||
|
||||
@staticmethod
|
||||
def _format_failure(fail):
|
||||
"""Returns a (exc_info, exc_details) tuple about the failure.
|
||||
|
||||
The ``exc_info`` tuple should be a standard three element
|
||||
(exctype, value, traceback) tuple that will be used for further
|
||||
logging. If a non-empty string is returned for ``exc_details`` it
|
||||
should contain any string info about the failure (with any specific
|
||||
details the ``exc_info`` may not have/contain). If the ``exc_info``
|
||||
tuple is returned as ``None`` then it will cause the logging
|
||||
system to avoid outputting any traceback information (read
|
||||
the python documentation on the logger interaction with ``exc_info``
|
||||
to learn more).
|
||||
"""
|
||||
if fail.exc_info:
|
||||
exc_info = fail.exc_info
|
||||
exc_details = ''
|
||||
else:
|
||||
# When a remote failure occurs (or somehow the failure
|
||||
# object lost its traceback), we will not have a valid
|
||||
# exc_info that can be used but we *should* have a string
|
||||
# version that we can use instead...
|
||||
exc_info = None
|
||||
exc_details = "%s%s" % (os.linesep, fail.pformat(traceback=True))
|
||||
return (exc_info, exc_details)
|
||||
|
||||
def _flow_receiver(self, state, details):
|
||||
"""Gets called on flow state changes."""
|
||||
level = self._flow_log_levels.get(state, self._level)
|
||||
self._logger.log(level, "Flow '%s' (%s) transitioned into state '%s'"
|
||||
" from state '%s'", details['flow_name'],
|
||||
details['flow_uuid'], state, details.get('old_state'))
|
||||
|
||||
def _task_receiver(self, state, details):
|
||||
"""Gets called on task state changes."""
|
||||
if 'result' in details and state in base.FINISH_STATES:
|
||||
# If the task failed, it's useful to show the exception traceback
|
||||
# and any other available exception information.
|
||||
result = details.get('result')
|
||||
if isinstance(result, failure.Failure):
|
||||
exc_info, exc_details = self._format_failure(result)
|
||||
self._logger.log(self._failure_level,
|
||||
"Task '%s' (%s) transitioned into state"
|
||||
" '%s' from state '%s'%s",
|
||||
details['task_name'], details['task_uuid'],
|
||||
state, details['old_state'], exc_details,
|
||||
exc_info=exc_info)
|
||||
else:
|
||||
# Otherwise, depending on the enabled logging level/state we
|
||||
# will show or hide results that the task may have produced
|
||||
# during execution.
|
||||
level = self._task_log_levels.get(state, self._level)
|
||||
if (_isEnabledFor(self._logger, self._level)
|
||||
or state == states.FAILURE):
|
||||
self._logger.log(level, "Task '%s' (%s) transitioned into"
|
||||
" state '%s' from state '%s' with"
|
||||
" result '%s'", details['task_name'],
|
||||
details['task_uuid'], state,
|
||||
details['old_state'], result)
|
||||
else:
|
||||
self._logger.log(level, "Task '%s' (%s) transitioned into"
|
||||
" state '%s' from state '%s'",
|
||||
details['task_name'],
|
||||
details['task_uuid'], state,
|
||||
details['old_state'])
|
||||
else:
|
||||
# Just an intermediary state, carry on!
|
||||
level = self._task_log_levels.get(state, self._level)
|
||||
self._logger.log(level, "Task '%s' (%s) transitioned into state"
|
||||
" '%s' from state '%s'", details['task_name'],
|
||||
details['task_uuid'], state, details['old_state'])
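A small sketch (not part of this change) of the ``_LOGGER`` class attribute hook used above, so every instance defaults to a specific logger unless ``log=`` is passed; the subclass name is illustrative.

import logging as std_logging

class ServiceLoggingListener(DynamicLoggingListener):
    _LOGGER = std_logging.getLogger('my.service.taskflow')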
|
||||
|
||||
@@ -20,24 +20,24 @@ import sys
|
||||
import traceback
|
||||
|
||||
from taskflow.listeners import base
|
||||
from taskflow.utils import misc
|
||||
|
||||
|
||||
class PrintingListener(base.LoggingBase):
|
||||
class PrintingListener(base.DumpingListener):
|
||||
"""Writes the task and flow notifications messages to stdout or stderr."""
|
||||
def __init__(self, engine,
|
||||
task_listen_for=(misc.Notifier.ANY,),
|
||||
flow_listen_for=(misc.Notifier.ANY,),
|
||||
task_listen_for=base.DEFAULT_LISTEN_FOR,
|
||||
flow_listen_for=base.DEFAULT_LISTEN_FOR,
|
||||
retry_listen_for=base.DEFAULT_LISTEN_FOR,
|
||||
stderr=False):
|
||||
super(PrintingListener, self).__init__(engine,
|
||||
task_listen_for=task_listen_for,
|
||||
flow_listen_for=flow_listen_for)
|
||||
super(PrintingListener, self).__init__(
|
||||
engine, task_listen_for=task_listen_for,
|
||||
flow_listen_for=flow_listen_for, retry_listen_for=retry_listen_for)
|
||||
if stderr:
|
||||
self._file = sys.stderr
|
||||
else:
|
||||
self._file = sys.stdout
|
||||
|
||||
def _log(self, message, *args, **kwargs):
|
||||
def _dump(self, message, *args, **kwargs):
|
||||
print(message % args, file=self._file)
|
||||
exc_info = kwargs.get('exc_info')
|
||||
if exc_info is not None:
|
||||
|
||||
@@ -16,22 +16,30 @@
|
||||
|
||||
from __future__ import absolute_import
|
||||
|
||||
import logging
|
||||
import itertools
|
||||
|
||||
from taskflow import exceptions as exc
|
||||
from taskflow.listeners import base
|
||||
from taskflow import logging
|
||||
from taskflow import states
|
||||
from taskflow.types import timing as tt
|
||||
|
||||
STARTING_STATES = (states.RUNNING, states.REVERTING)
|
||||
FINISHED_STATES = base.FINISH_STATES + (states.REVERTED,)
|
||||
WATCH_STATES = frozenset(FINISHED_STATES + STARTING_STATES +
|
||||
(states.PENDING,))
|
||||
STARTING_STATES = frozenset((states.RUNNING, states.REVERTING))
|
||||
FINISHED_STATES = frozenset((base.FINISH_STATES + (states.REVERTED,)))
|
||||
WATCH_STATES = frozenset(itertools.chain(FINISHED_STATES, STARTING_STATES,
|
||||
[states.PENDING]))
|
||||
|
||||
LOG = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class TimingListener(base.ListenerBase):
|
||||
# TODO(harlowja): get rid of this when we can just support python 3.x and use
|
||||
# its print function directly instead of having to wrap it in a helper function
|
||||
# due to how python 2.x print is a language built-in and not a function...
|
||||
def _printer(message):
|
||||
print(message)
|
||||
|
||||
|
||||
class TimingListener(base.Listener):
|
||||
"""Listener that captures task duration.
|
||||
|
||||
It records how long a task took to execute (or fail)
|
||||
@@ -46,11 +54,17 @@ class TimingListener(base.ListenerBase):
|
||||
|
||||
def deregister(self):
|
||||
super(TimingListener, self).deregister()
|
||||
# There should be none that still exist at deregistering time, so log a
|
||||
# warning if there were any that somehow still got left behind...
|
||||
leftover_timers = len(self._timers)
|
||||
if leftover_timers:
|
||||
LOG.warn("%s task(s) did not enter %s states", leftover_timers,
|
||||
FINISHED_STATES)
|
||||
self._timers.clear()
|
||||
|
||||
def _record_ending(self, timer, task_name):
|
||||
meta_update = {
|
||||
'duration': float(timer.elapsed()),
|
||||
'duration': timer.elapsed(),
|
||||
}
|
||||
try:
|
||||
# Don't let storage failures throw exceptions in a listener method.
|
||||
@@ -66,5 +80,28 @@ class TimingListener(base.ListenerBase):
|
||||
elif state in STARTING_STATES:
|
||||
self._timers[task_name] = tt.StopWatch().start()
|
||||
elif state in FINISHED_STATES:
|
||||
if task_name in self._timers:
|
||||
self._record_ending(self._timers[task_name], task_name)
|
||||
timer = self._timers.pop(task_name, None)
|
||||
if timer is not None:
|
||||
timer.stop()
|
||||
self._record_ending(timer, task_name)
|
||||
|
||||
|
||||
class PrintingTimingListener(TimingListener):
|
||||
"""Listener that prints the start & stop timing as well as recording it."""
|
||||
|
||||
def __init__(self, engine, printer=None):
|
||||
super(PrintingTimingListener, self).__init__(engine)
|
||||
if printer is None:
|
||||
self._printer = _printer
|
||||
else:
|
||||
self._printer = printer
|
||||
|
||||
def _record_ending(self, timer, task_name):
|
||||
super(PrintingTimingListener, self)._record_ending(timer, task_name)
|
||||
self._printer("It took task '%s' %0.2f seconds to"
|
||||
" finish." % (task_name, timer.elapsed()))
|
||||
|
||||
def _task_receiver(self, state, details):
|
||||
super(PrintingTimingListener, self)._task_receiver(state, details)
|
||||
if state in STARTING_STATES:
|
||||
self._printer("'%s' task started." % (details['task_name']))
|
||||
|
||||