Add specific scoping documentation

Adds information into the arguments and result
docs about how scoping lookup works and what it
implies.

Change-Id: I810874dce042ec43fe9e704d6689215e19d67c9c
Joshua Harlow
2015-02-24 16:19:04 -08:00
parent 0a97fb96b5
commit 6da46b71d9
3 changed files with 66 additions and 30 deletions


@@ -346,6 +346,47 @@ failures have occurred then the engine will have finished and if so desired the
:doc:`persistence <persistence>` can be used to cleanup any details that were
saved for this execution.
Scoping
=======
When creating flows it is also important to understand the lookup
strategy (typically known as `scope`_ resolution) that the engine you
are using will apply internally. For example, when a task ``A`` provides
result 'a', a task ``B`` after ``A`` provides a different result 'a', and a
task ``C`` after both ``A`` and ``B`` requires 'a' to run, which one will
be selected?
Default strategy
----------------
When an engine is executing it internally interacts with the
:py:class:`~taskflow.storage.Storage` class,
and that class interacts with a
:py:class:`~taskflow.engines.action_engine.scopes.ScopeWalker` instance.
The :py:class:`~taskflow.storage.Storage` class uses the following
lookup order to resolve (or fail to resolve) an atom's requirement lookup/request:
#. Injected atom specific arguments.
#. Transient injected arguments.
#. Non-transient injected arguments.
#. First scope visited provider that produces the named result; note that
if multiple providers are found in the same scope the *first* (the scope
walker's yielded ordering defines what *first* means) that produced that
result *and* can be extracted without raising an error is selected as the
provider of the requested requirement.
#. Fails with :py:class:`~taskflow.exceptions.NotFound` if unresolved at this
point (the ``cause`` attribute of this exception may have more details on
why the lookup failed).
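The lookup order above can be sketched in plain Python. This is a minimal illustration and not the actual ``Storage`` implementation: the ``resolve`` function, its parameters, and the local ``NotFound`` class are hypothetical stand-ins, and the scope ordering is passed in by hand rather than produced by a real ``ScopeWalker``. A provider that lacks the requested name is simply skipped, which loosely models "can be extracted without raising an error".

```python
class NotFound(Exception):
    """Hypothetical stand-in for taskflow.exceptions.NotFound."""


def resolve(name, atom_injected, transient, persistent, scopes, results):
    """Resolve one named requirement using the documented lookup order.

    ``scopes`` is a list of scopes (nearest first); each scope is a list of
    provider names in the order the walker is assumed to have yielded them.
    ``results`` maps a provider name to the dict of results it produced.
    """
    # 1. Injected atom specific arguments.
    if name in atom_injected:
        return atom_injected[name]
    # 2. Transient injected arguments.
    if name in transient:
        return transient[name]
    # 3. Non-transient injected arguments.
    if name in persistent:
        return persistent[name]
    # 4. First visited scope with a provider that actually produced the name;
    #    within one scope the first such provider wins.
    for scope in scopes:
        for provider in scope:
            produced = results.get(provider, {})
            if name in produced:
                return produced[name]
    # 5. Nothing matched, so the lookup fails.
    raise NotFound("Unable to find result %r" % name)


# Tasks A and B both provide 'a'; task C runs after both and requires 'a'.
# Assumption: the walker yields B before A because it searches backwards.
results = {"A": {"a": 1}, "B": {"a": 2}}
chosen = resolve("a", {}, {}, {}, [["B", "A"]], results)
injected = resolve("a", {"a": 99}, {}, {}, [["B", "A"]], results)
```

Under these assumptions ``C`` receives the value produced by ``B``, unless an injected argument named 'a' exists, in which case the injected value takes priority over any provider.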
.. note::
To examine this information when debugging it is recommended to
enable the ``BLATHER`` logging level (level 5). At this level the storage
and scope code/layers will log what is being searched for and what is
being found.
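As a rough sketch, level 5 logging could be enabled with the standard ``logging`` module as shown below. The logger name ``"taskflow"`` is an assumption (it relies on the library naming its loggers after its modules), and the level name is registered here by hand only so that the output is labelled readably.

```python
import logging

BLATHER = 5  # numeric level stated in the note above

# Register a readable name for level 5 (assumption: done manually here).
logging.addLevelName(BLATHER, "BLATHER")

# Send records at level 5 and above to the default stream handler.
logging.basicConfig(level=BLATHER)

# Assumed logger name; child loggers inherit this level unless overridden.
logging.getLogger("taskflow").setLevel(BLATHER)
```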
.. _scope: http://en.wikipedia.org/wiki/Scope_%28computer_science%29
Interfaces
==========
@@ -362,7 +403,8 @@ Implementations
.. automodule:: taskflow.engines.action_engine.runner
.. automodule:: taskflow.engines.action_engine.runtime
.. automodule:: taskflow.engines.action_engine.scheduler
.. automodule:: taskflow.engines.action_engine.scopes
.. autoclass:: taskflow.engines.action_engine.scopes.ScopeWalker
:special-members: __iter__
Hierarchy
=========


@@ -44,6 +44,8 @@ def _extract_atoms(node, idx=-1):
class ScopeWalker(object):
"""Walks through the scopes of an atom using an engine's compilation.
NOTE(harlowja): for internal usage only.
This will walk the visible scopes that are accessible for the given
atom, which can be used by some external entity in some meaningful way,
for example to find dependent values...
@@ -63,29 +65,35 @@ class ScopeWalker(object):
How this works is the following:
We find all the possible predecessors of the given atom, this is useful
since we know they occurred before this atom but it doesn't tell us
the corresponding scope *level* that each predecessor was created in,
so we need to find this information.
We first grab all the predecessors of the given atom (let's call it
``Y``) by using the :py:class:`~.compiler.Compilation` execution
graph (and doing a reverse breadth-first expansion to gather its
predecessors), this is useful since we know they *always* will
exist (and execute) before this atom but it does not tell us the
corresponding scope *level* (flow, nested flow...) that each
predecessor was created in, so we need to find this information.
For that information we consult the location of the atom ``Y`` in the
node hierarchy. We lookup in a reverse order the parent ``X`` of ``Y``
and traverse backwards from the index in the parent where ``Y``
occurred, all children in ``X`` that we encounter in this backwards
search (if a child is a flow itself, its atom contents will be
expanded) will be assumed to be at the same scope. This is then a
*potential* single scope, to make an *actual* scope we remove the items
from the *potential* scope that are not predecessors of ``Y`` to form
the *actual* scope.
:py:class:`~.compiler.Compilation` hierarchy/tree. We lookup in a
reverse order the parent ``X`` of ``Y`` and traverse backwards from
the index in the parent where ``Y`` exists to all siblings (and
children of those siblings) in ``X`` that we encounter in this
backwards search (if a sibling is a flow itself, its atom(s)
will be recursively expanded and included). This collection will
then be assumed to be at the same scope. This is what is called
a *potential* single scope, to make an *actual* scope we remove the
items from the *potential* scope that are **not** predecessors
of ``Y`` to form the *actual* scope which we then yield back.
Then for additional scopes we continue up the tree, by finding the
parent of ``X`` (let's call it ``Z``) and perform the same operation,
going through the children in a reverse manner from the index in
parent ``Z`` where ``X`` was located. This forms another *potential*
scope which we provide back as an *actual* scope after reducing the
potential set by the predecessors of ``Y``. We then repeat this process
until we no longer have any parent nodes (aka have reached the top of
the tree) or we run out of predecessors.
potential set to only include predecessors previously gathered. We
then repeat this process until we no longer have any parent
nodes (aka we have reached the top of the tree) or we run out of
predecessors.
"""
predecessors = set(self._graph.bfs_predecessors_iter(self._atom))
last = self._node
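The walk described in the docstring above can be approximated with a small standalone sketch. It is not the real ``ScopeWalker``: the ``Node`` class, ``expand`` and ``walk_scopes`` are hypothetical names, the predecessor set is supplied by hand instead of coming from the execution graph, and the reverse ordering used when expanding nested flows is an assumption.

```python
class Node:
    """Hypothetical tree node; a node with children models a flow."""

    def __init__(self, item, children=()):
        self.item = item
        self.parent = None
        self.children = list(children)
        for child in self.children:
            child.parent = self


def expand(node):
    """Return the atoms under ``node``, visiting children in reverse."""
    if not node.children:
        return [node.item]  # a leaf is treated as a single atom
    atoms = []
    for child in reversed(node.children):
        atoms.extend(expand(child))
    return atoms


def walk_scopes(node, predecessors):
    """Yield one *actual* scope per tree level, nearest level first."""
    remaining = set(predecessors)
    last, parent = node, node.parent
    while parent is not None and remaining:
        idx = parent.children.index(last)
        # Potential scope: earlier siblings (expanded), searched backwards.
        potential = []
        for sibling in reversed(parent.children[:idx]):
            potential.extend(expand(sibling))
        # Actual scope: keep only atoms known to be predecessors.
        actual = [atom for atom in potential if atom in remaining]
        remaining.difference_update(actual)
        yield actual
        last, parent = parent, parent.parent


# root = [A, inner=[B, C], sub=[D, Y]]; we walk the scopes visible to Y.
a, b, c, d, y = (Node(name) for name in "ABCDY")
inner = Node("inner", [b, c])
sub = Node("sub", [d, y])
root = Node("root", [a, inner, sub])
scopes = list(walk_scopes(y, {"A", "B", "C", "D"}))
```

In this example the nearest scope contains only ``D`` (the earlier sibling of ``Y`` inside ``sub``), and the next scope up contains the atoms found by searching backwards through the children of ``root``.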


@@ -673,24 +673,10 @@ class Storage(object):
with self._lock.read_lock():
if optional_args is None:
optional_args = []
if atom_name and atom_name not in self._atom_name_to_uuid:
raise exceptions.NotFound("Unknown atom name: %s" % atom_name)
if not args_mapping:
return {}
# The order of lookup is the following:
#
# 1. Injected atom specific arguments.
# 2. Transient injected arguments.
# 3. Non-transient injected arguments.
# 4. First scope visited group that produces the named result.
# a). The first of that group that actually provided the name
# result is selected (if group size is greater than one).
#
# Otherwise: blowup! (this will also happen if reading or
# extracting an expected result fails, since it is better to fail
# on lookup than provide invalid data from the wrong provider)
if atom_name:
injected_args = self._injected_args.get(atom_name, {})
else: