Merge tag '1.15.0' into debian/liberty
taskflow 1.15.0 release
24
README.rst
@@ -1,6 +1,14 @@
TaskFlow
========

.. image:: https://img.shields.io/pypi/v/taskflow.svg
    :target: https://pypi.python.org/pypi/taskflow/
    :alt: Latest Version

.. image:: https://img.shields.io/pypi/dm/taskflow.svg
    :target: https://pypi.python.org/pypi/taskflow/
    :alt: Downloads

A library to do [jobs, tasks, flows] in a highly available, easy to understand
and declarative manner (and more!) to be used with OpenStack and other
projects.
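
A minimal sketch of that declarative style (the task, flow and argument names
here are illustrative only)::

    import taskflow.engines
    from taskflow.patterns import linear_flow
    from taskflow import task


    class CallJim(task.Task):
        def execute(self, jim_number):
            print("Calling jim %s." % jim_number)


    class CallJoe(task.Task):
        def execute(self, joe_number):
            print("Calling joe %s." % joe_number)


    # Declare *what* should happen (a linear flow of two tasks); the engine
    # picked by taskflow decides *how* to run it (serially, in parallel, ...).
    flow = linear_flow.Flow('simple-phone-calls').add(CallJim(), CallJoe())
    taskflow.engines.run(flow, store={'jim_number': 555, 'joe_number': 444})
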
@@ -22,18 +30,16 @@ Requirements
~~~~~~~~~~~~

Because this project has many optional (pluggable) parts like persistence
backends and engines, we decided to split our requirements into three
backends and engines, we decided to split our requirements into two
parts: - things that are absolutely required (you can't use the project
without them) are put into ``requirements-pyN.txt`` (``N`` being the
Python *major* version number used to install the package). The requirements
without them) are put into ``requirements.txt``. The requirements
that are required by some optional part of this project (you can use the
project without them) are put into our ``tox.ini`` file (so that we can still
test the optional functionality works as expected). If you want to use the
feature in question (`eventlet`_ or the worker based engine that
uses `kombu`_ or the `sqlalchemy`_ persistence backend or jobboards which
project without them) are put into our ``test-requirements.txt`` file (so
that we can still test the optional functionality works as expected). If
you want to use the feature in question (`eventlet`_ or the worker based engine
that uses `kombu`_ or the `sqlalchemy`_ persistence backend or jobboards which
have an implementation built using `kazoo`_ ...), you should add
that requirement(s) to your project or environment; - as usual, things that
required only for running tests are put into ``test-requirements.txt``.
that requirement(s) to your project or environment.

Tox.ini
~~~~~~~

BIN
doc/diagrams/area_of_influence.graffle.tgz
Normal file
@@ -74,8 +74,8 @@ ignored during inference (as these names have special meaning/usage in python).
... def execute(self, *args, **kwargs):
...     pass
...
>>> UniTask().requires
frozenset([])
>>> sorted(UniTask().requires)
[]

.. make vim sphinx highlighter* happy**

@@ -84,7 +84,7 @@ Rebinding
---------

**Why:** There are cases when the value you want to pass to a task/retry is
stored with a name other then the corresponding argument's name. That's when the
stored with a name other than the corresponding argument's name. That's when the
``rebind`` constructor parameter comes in handy. Using it the flow author
can instruct the engine to fetch a value from storage by one name, but pass it
to a task's/retry's ``execute`` method with another name. There are two possible
@@ -214,8 +214,8 @@ name of the value.
... def execute(self):
...     return 42
...
>>> TheAnswerReturningTask(provides='the_answer').provides
set(['the_answer'])
>>> sorted(TheAnswerReturningTask(provides='the_answer').provides)
['the_answer']

Returning a tuple
+++++++++++++++++
@@ -416,7 +416,7 @@ the following history (printed as a list)::
At this point (since the implementation returned ``RETRY``) the
|retry.execute| method will be called again and it will receive the same
history and it can then return a value that subsequent tasks can use to alter
there behavior.
their behavior.

If instead the |retry.execute| method itself raises an exception,
the |retry.revert| method of the implementation will be called and

@@ -23,9 +23,9 @@ values (requirements) and name outputs (provided values).
Task
=====

A :py:class:`task <taskflow.task.BaseTask>` (derived from an atom) is the
smallest possible unit of work that can have an execute & rollback sequence
associated with it. These task objects all derive
A :py:class:`task <taskflow.task.BaseTask>` (derived from an atom) is a
unit of work that can have an execute & rollback sequence associated with
it (they are *nearly* analogous to functions). These task objects all derive
from :py:class:`~taskflow.task.BaseTask` which defines what a task must
provide in terms of properties and methods.

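A small sketch of such a task (the class, method arguments and values shown
are purely illustrative)::

    from taskflow import task


    class MakeReservation(task.Task):
        def execute(self, room_id):
            # Do the unit of work; the return value is saved as this task's
            # result and may be consumed by later atoms.
            return {'room_id': room_id, 'reserved': True}

        def revert(self, room_id, result, **kwargs):
            # Roll the work back; ``result`` is whatever execute() returned
            # (or a failure object if it raised).
            pass
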
@@ -48,38 +48,30 @@ Retry
=====

A :py:class:`retry <taskflow.retry.Retry>` (derived from an atom) is a special
unit that handles errors, controls flow execution and can (for example) retry
other atoms with other parameters if needed. When an associated atom
fails, these retry units are *consulted* to determine what the resolution
method should be. The goal is that with this *consultation* the retry atom
will suggest a method for getting around the failure (perhaps by retrying,
reverting a single item, or reverting everything contained in the retries
associated scope).
unit of work that handles errors, controls flow execution and can (for
example) retry other atoms with other parameters if needed. When an associated
atom fails, these retry units are *consulted* to determine what the resolution
*strategy* should be. The goal is that with this consultation the retry atom
will suggest a *strategy* for getting around the failure (perhaps by retrying,
reverting a single atom, or reverting everything contained in the retries
associated `scope`_).

Currently derivatives of the :py:class:`retry <taskflow.retry.Retry>` base
class must provide a ``on_failure`` method to determine how a failure should
be handled.
class must provide a :py:func:`~taskflow.retry.Retry.on_failure` method to
determine how a failure should be handled. The current enumeration(s) that can
be returned from the :py:func:`~taskflow.retry.Retry.on_failure` method
are defined in an enumeration class described here:

The current enumeration set that can be returned from this method is:

* ``RETRY`` - retries the surrounding subflow (a retry object is associated
  with a flow, which is typically converted into a graph hierarchy at
  compilation time) again.

* ``REVERT`` - reverts only the surrounding subflow but *consult* the
  parent atom before doing this to determine if the parent retry object
  provides a different reconciliation strategy (retry atoms can be nested, this
  is possible since flows themselves can be nested).

* ``REVERT_ALL`` - completely reverts a whole flow.
.. autoclass:: taskflow.retry.Decision

To aid in the reconciliation process the
:py:class:`retry <taskflow.retry.Retry>` base class also mandates ``execute``
and ``revert`` methods (although subclasses are allowed to define these methods
as no-ops) that can be used by a retry atom to interact with the runtime
execution model (for example, to track the number of times it has been
called which is useful for the :py:class:`~taskflow.retry.ForEach` retry
subclass).
:py:class:`retry <taskflow.retry.Retry>` base class also mandates
:py:func:`~taskflow.retry.Retry.execute`
and :py:func:`~taskflow.retry.Retry.revert` methods (although subclasses
are allowed to define these methods as no-ops) that can be used by a retry
atom to interact with the runtime execution model (for example, to track the
number of times it has been called which is useful for
the :py:class:`~taskflow.retry.ForEach` retry subclass).

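For illustration, a sketch of a custom retry (not one of the provided
subclasses below) that allows two retries of its associated scope before
giving up and reverting it::

    from taskflow import retry


    class TwoStrikes(retry.Retry):

        def on_failure(self, history, *args, **kwargs):
            # ``history`` records the prior attempts (and their failures).
            if len(history) < 2:
                return retry.RETRY
            return retry.REVERT

        def execute(self, history, *args, **kwargs):
            # Expose the attempt number (1-based) to any interested atoms.
            return len(history) + 1

        def revert(self, history, *args, **kwargs):
            # Nothing to clean up in this sketch.
            pass
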
To avoid recreating common retry patterns the following retry
subclasses are provided:
@@ -94,8 +86,40 @@ subclasses are provided:
:py:class:`~taskflow.retry.ForEach` but extracts values from storage
instead of the :py:class:`~taskflow.retry.ForEach` constructor.

Examples
--------
.. _scope: http://en.wikipedia.org/wiki/Scope_%28computer_science%29

.. note::

    They are *similar* to exception handlers but are made to be *more* capable
    due to their ability to *dynamically* choose a reconciliation strategy,
    which allows for these atoms to influence subsequent execution(s) and the
    inputs any associated atoms require.

Area of influence
-----------------

Each retry atom is associated with a flow and it can *influence* how the
atoms (or nested flows) contained in that flow retry or revert (using
the previously mentioned patterns and decision enumerations):

*For example:*

.. image:: img/area_of_influence.svg
    :width: 325px
    :align: left
    :alt: Retry area of influence

In this diagram retry controller (1) will be consulted if task ``A``, ``B``
or ``C`` fail and retry controller (2) decides to delegate its retry decision
to retry controller (1). If retry controller (2) does **not** decide to
delegate its retry decision to retry controller (1) then retry
controller (1) will be oblivious of any decisions. If any of
task ``1``, ``2`` or ``3`` fail then only retry controller (1) will be
consulted to determine the strategy/pattern to apply to resolve their
associated failure.

Usage examples
--------------

.. testsetup::

@@ -167,7 +191,13 @@ Interfaces
|
||||
==========
|
||||
|
||||
.. automodule:: taskflow.task
|
||||
.. automodule:: taskflow.retry
|
||||
.. autoclass:: taskflow.retry.Retry
|
||||
.. autoclass:: taskflow.retry.History
|
||||
.. autoclass:: taskflow.retry.AlwaysRevert
|
||||
.. autoclass:: taskflow.retry.AlwaysRevertAll
|
||||
.. autoclass:: taskflow.retry.Times
|
||||
.. autoclass:: taskflow.retry.ForEach
|
||||
.. autoclass:: taskflow.retry.ParameterizedForEach
|
||||
|
||||
Hierarchy
|
||||
=========
|
||||
@@ -175,5 +205,10 @@ Hierarchy
|
||||
.. inheritance-diagram::
|
||||
taskflow.atom
|
||||
taskflow.task
|
||||
taskflow.retry
|
||||
taskflow.retry.Retry
|
||||
taskflow.retry.AlwaysRevert
|
||||
taskflow.retry.AlwaysRevertAll
|
||||
taskflow.retry.Times
|
||||
taskflow.retry.ForEach
|
||||
taskflow.retry.ParameterizedForEach
|
||||
:parts: 1
|
||||
|
||||
@@ -2,6 +2,10 @@
|
||||
Conductors
|
||||
----------
|
||||
|
||||
.. image:: img/conductor.png
|
||||
:width: 97px
|
||||
:alt: Conductor
|
||||
|
||||
Overview
|
||||
========
|
||||
|
||||
@@ -18,14 +22,14 @@ They are responsible for the following:
  tasks and flows to be executed).
* Dispatching the engine using the provided :doc:`persistence <persistence>`
  layer and engine configuration.
* Completing or abandoning the claimed job (depending on dispatching and
  execution outcome).
* Completing or abandoning the claimed :doc:`job <jobs>` (depending on
  dispatching and execution outcome).
* *Rinse and repeat*.

.. note::

    They are inspired by and have similar responsibilities
    as `railroad conductors`_.
    as `railroad conductors`_ or `musical conductors`_.

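A rough sketch of standing one up (the ``zookeeper`` backends and names used
here are just assumptions for the example; any supported jobboard and
persistence backend could be substituted)::

    from taskflow.conductors import backends as conductor_backends
    from taskflow.jobs import backends as job_backends
    from taskflow.persistence import backends as persistence_backends

    persistence = persistence_backends.fetch({'connection': 'zookeeper'})
    board = job_backends.fetch('my-board', {'board': 'zookeeper'},
                               persistence=persistence)
    board.connect()

    conductor = conductor_backends.fetch('blocking', 'conductor-1', board,
                                         persistence=persistence)
    # Blocks; claims posted jobs and dispatches engines to run them until
    # stop() is called (for example from another thread or signal handler).
    conductor.run()
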
Considerations
|
||||
==============
|
||||
@@ -53,28 +57,31 @@ claimable state.
|
||||
|
||||
#. Forcefully delete jobs that have been failing continuously after a given
|
||||
number of conductor attempts. This can be either done manually or
|
||||
automatically via scripts (or other associated monitoring).
|
||||
automatically via scripts (or other associated monitoring) or via
|
||||
the jobboards :py:func:`~taskflow.jobs.base.JobBoard.trash` method.
|
||||
#. Resolve the internal error's cause (storage backend failure, other...).
|
||||
#. Help implement `jobboard garbage binning`_.
|
||||
|
||||
.. _jobboard garbage binning: https://blueprints.launchpad.net/taskflow/+spec/jobboard-garbage-bin
|
||||
|
||||
Interfaces
|
||||
==========
|
||||
|
||||
.. automodule:: taskflow.conductors.base
|
||||
.. automodule:: taskflow.conductors.backends
|
||||
|
||||
Implementations
|
||||
===============
|
||||
|
||||
.. automodule:: taskflow.conductors.single_threaded
|
||||
Blocking
|
||||
--------
|
||||
|
||||
.. automodule:: taskflow.conductors.backends.impl_blocking
|
||||
|
||||
Hierarchy
|
||||
=========
|
||||
|
||||
.. inheritance-diagram::
|
||||
taskflow.conductors.base
|
||||
taskflow.conductors.single_threaded
|
||||
taskflow.conductors.backends.impl_blocking
|
||||
:parts: 1
|
||||
|
||||
.. _musical conductors: http://en.wikipedia.org/wiki/Conducting
|
||||
.. _railroad conductors: http://en.wikipedia.org/wiki/Conductor_%28transportation%29
|
||||
|
||||
@@ -17,11 +17,13 @@ and *ideal* is that deployers or developers of a service that use TaskFlow can
|
||||
select an engine that suits their setup best without modifying the code of
|
||||
said service.
|
||||
|
||||
Engines usually have different capabilities and configuration, but all of them
|
||||
**must** implement the same interface and preserve the semantics of patterns
|
||||
(e.g. parts of a :py:class:`.linear_flow.Flow`
|
||||
are run one after another, in order, even if the selected engine is *capable*
|
||||
of running tasks in parallel).
|
||||
.. note::
|
||||
|
||||
Engines usually have different capabilities and configuration, but all of
|
||||
them **must** implement the same interface and preserve the semantics of
|
||||
patterns (e.g. parts of a :py:class:`.linear_flow.Flow`
|
||||
are run one after another, in order, even if the selected
|
||||
engine is *capable* of running tasks in parallel).
|
||||
|
||||
Why they exist
|
||||
--------------
|
||||
@@ -29,7 +31,7 @@ Why they exist
|
||||
An engine being *the* core component which actually makes your flows progress
|
||||
is likely a new concept for many programmers so let's describe how it operates
|
||||
in more depth and some of the reasoning behind why it exists. This will
|
||||
hopefully make it more clear on there value add to the TaskFlow library user.
|
||||
hopefully make it more clear on their value add to the TaskFlow library user.
|
||||
|
||||
First though let us discuss something most are familiar already with; the
|
||||
difference between `declarative`_ and `imperative`_ programming models. The
|
||||
@@ -57,7 +59,7 @@ declarative model) allows for the following functionality to become possible:
|
||||
accomplished allows for a *natural* way of resuming by allowing the engine to
|
||||
track the current state and know at which point a workflow is in and how to
|
||||
get back into that state when resumption occurs.
|
||||
* Enhancing scalability: When a engine is responsible for executing your
|
||||
* Enhancing scalability: When an engine is responsible for executing your
|
||||
desired work it becomes possible to alter the *how* in the future by creating
|
||||
new types of execution backends (for example the `worker`_ model which does
|
||||
not execute locally). Without the decoupling of the *what* and the *how* it
|
||||
@@ -172,13 +174,13 @@ using your desired execution model.
|
||||
scalability by reducing thread/process creation and teardown as well as by
|
||||
reusing existing pools (which is a good practice in general).
|
||||
|
||||
.. note::
|
||||
.. warning::
|
||||
|
||||
Running tasks with a `process pool executor`_ is **experimentally**
|
||||
supported. This is mainly due to the `futures backport`_ and
|
||||
the `multiprocessing`_ module that exist in older versions of python not
|
||||
being as up to date (with important fixes such as :pybug:`4892`,
|
||||
:pybug:`6721`, :pybug:`9205`, :pybug:`11635`, :pybug:`16284`,
|
||||
:pybug:`6721`, :pybug:`9205`, :pybug:`16284`,
|
||||
:pybug:`22393` and others...) as the most recent python version (which
|
||||
themselves have a variety of ongoing/recent bugs).
|
||||
|
||||
@@ -203,7 +205,7 @@ For further information, please refer to the the following:
|
||||
How they run
|
||||
============
|
||||
|
||||
To provide a peek into the general process that a engine goes through when
|
||||
To provide a peek into the general process that an engine goes through when
|
||||
running let's break it apart a little and describe what one of the engine types
|
||||
does while executing (for this we will look into the
|
||||
:py:class:`~taskflow.engines.action_engine.engine.ActionEngine` engine type).
|
||||
@@ -221,39 +223,48 @@ are setup.
|
||||
Compiling
|
||||
---------
|
||||
|
||||
During this stage the flow will be converted into an internal graph
|
||||
representation using a
|
||||
:py:class:`~taskflow.engines.action_engine.compiler.Compiler` (the default
|
||||
implementation for patterns is the
|
||||
During this stage (see :py:func:`~taskflow.engines.base.Engine.compile`) the
|
||||
flow will be converted into an internal graph representation using a
|
||||
compiler (the default implementation for patterns is the
|
||||
:py:class:`~taskflow.engines.action_engine.compiler.PatternCompiler`). This
|
||||
class compiles/converts the flow objects and contained atoms into a
|
||||
`networkx`_ directed graph that contains the equivalent atoms defined in the
|
||||
flow and any nested flows & atoms as well as the constraints that are created
|
||||
by the application of the different flow patterns. This graph is then what will
|
||||
be analyzed & traversed during the engines execution. At this point a few
|
||||
helper objects are also created and saved to internal engine variables (these
objects help in execution of atoms, analyzing the graph and performing other
|
||||
internal engine activities). At the finishing of this stage a
|
||||
`networkx`_ directed graph (and tree structure) that contains the equivalent
|
||||
atoms defined in the flow and any nested flows & atoms as well as the
|
||||
constraints that are created by the application of the different flow
|
||||
patterns. This graph (and tree) are what will be analyzed & traversed during
|
||||
the engines execution. At this point a few helper objects are also created and
saved to internal engine variables (these objects help in execution of
|
||||
atoms, analyzing the graph and performing other internal engine
|
||||
activities). At the finishing of this stage a
|
||||
:py:class:`~taskflow.engines.action_engine.runtime.Runtime` object is created
|
||||
which contains references to all needed runtime components.
|
||||
which contains references to all needed runtime components and its
|
||||
:py:func:`~taskflow.engines.action_engine.runtime.Runtime.compile` is called
|
||||
to compile a cache of frequently used execution helper objects.
|
||||
|
||||
Preparation
|
||||
-----------
|
||||
|
||||
This stage starts by setting up the storage needed for all atoms in the
|
||||
previously created graph, ensuring that corresponding
|
||||
:py:class:`~taskflow.persistence.logbook.AtomDetail` (or subclass of) objects
|
||||
are created for each node in the graph. Once this is done final validation
|
||||
occurs on the requirements that are needed to start execution and what
|
||||
:py:class:`~taskflow.storage.Storage` provides. If there is any atom or flow
|
||||
requirements not satisfied then execution will not be allowed to continue.
|
||||
This stage (see :py:func:`~taskflow.engines.base.Engine.prepare`) starts by
|
||||
setting up the storage needed for all atoms in the compiled graph, ensuring
|
||||
that corresponding :py:class:`~taskflow.persistence.models.AtomDetail` (or
|
||||
subclass of) objects are created for each node in the graph.
|
||||
|
||||
Validation
----------

This stage (see :py:func:`~taskflow.engines.base.Engine.validate`) performs
any final validation of the compiled (and now storage prepared) engine. It
compares the requirements that are needed to start execution and
what is currently provided or will be produced in the future. If there are
*any* atom requirements that are not satisfied (no known current provider or
future producer is found) then execution will **not** be allowed to continue.

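The stages described so far can also be driven explicitly, as in the
following sketch (:py:func:`~taskflow.engines.base.Engine.run` will itself
perform any stage that has not yet been done)::

    import taskflow.engines
    from taskflow.patterns import linear_flow
    from taskflow import task


    class Echo(task.Task):
        def execute(self, spam):
            print(spam)


    flow = linear_flow.Flow('stages').add(Echo())
    engine = taskflow.engines.load(flow, store={'spam': 'eggs'})
    engine.compile()    # flow -> internal graph (and tree) representation
    engine.prepare()    # ensure storage & atom details exist
    engine.validate()   # check that all requirements can be satisfied
    engine.run()        # schedule, execute and complete the atoms
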
Execution
|
||||
---------
|
||||
|
||||
The graph (and helper objects) previously created are now used for guiding
|
||||
further execution. The flow is put into the ``RUNNING`` :doc:`state <states>`
|
||||
and a
|
||||
further execution (see :py:func:`~taskflow.engines.base.Engine.run`). The
|
||||
flow is put into the ``RUNNING`` :doc:`state <states>` and a
|
||||
:py:class:`~taskflow.engines.action_engine.runner.Runner` implementation
|
||||
object starts to take over and begins going through the stages listed
|
||||
below (for a more visual diagram/representation see
|
||||
@@ -262,10 +273,10 @@ the :ref:`engine state diagram <engine states>`).
|
||||
.. note::
|
||||
|
||||
The engine will respect the constraints imposed by the flow. For example,
|
||||
if Engine is executing a :py:class:`.linear_flow.Flow` then it is
|
||||
constrained by the dependency-graph which is linear in this case, and hence
|
||||
using a Parallel Engine may not yield any benefits if one is looking for
|
||||
concurrency.
|
||||
if an engine is executing a :py:class:`~taskflow.patterns.linear_flow.Flow`
|
||||
then it is constrained by the dependency graph which is linear in this
|
||||
case, and hence using a parallel engine may not yield any benefits if one
|
||||
is looking for concurrency.
|
||||
|
||||
Resumption
|
||||
^^^^^^^^^^
|
||||
@@ -282,7 +293,7 @@ for things like retry atom which can influence what a tasks intention should be
|
||||
:py:class:`~taskflow.engines.action_engine.analyzer.Analyzer` helper
|
||||
object which was designed to provide helper methods for this analysis). Once
|
||||
these intentions are determined and associated with each task (the intention is
|
||||
also stored in the :py:class:`~taskflow.persistence.logbook.AtomDetail` object)
|
||||
also stored in the :py:class:`~taskflow.persistence.models.AtomDetail` object)
|
||||
the :ref:`scheduling <scheduling>` stage starts.
|
||||
|
||||
.. _scheduling:
|
||||
@@ -292,7 +303,7 @@ Scheduling
|
||||
|
||||
This stage selects which atoms are eligible to run by using a
|
||||
:py:class:`~taskflow.engines.action_engine.scheduler.Scheduler` implementation
|
||||
(the default implementation looks at there intention, checking if predecessor
|
||||
(the default implementation looks at their intention, checking if predecessor
|
||||
atoms have run and so on, using a
|
||||
:py:class:`~taskflow.engines.action_engine.analyzer.Analyzer` helper
|
||||
object as needed) and submits those atoms to a previously provided compatible
|
||||
@@ -312,15 +323,15 @@ submitted to complete. Once one of the future objects completes (or fails) that
|
||||
atoms result will be examined and finalized using a
|
||||
:py:class:`~taskflow.engines.action_engine.completer.Completer` implementation.
|
||||
It typically will persist results to a provided persistence backend (saved
|
||||
into the corresponding :py:class:`~taskflow.persistence.logbook.AtomDetail`
|
||||
and :py:class:`~taskflow.persistence.logbook.FlowDetail` objects via the
|
||||
into the corresponding :py:class:`~taskflow.persistence.models.AtomDetail`
|
||||
and :py:class:`~taskflow.persistence.models.FlowDetail` objects via the
|
||||
:py:class:`~taskflow.storage.Storage` helper) and reflect
|
||||
the new state of the atom. At this point what typically happens falls into two
|
||||
categories, one for if that atom failed and one for if it did not. If the atom
|
||||
failed it may be set to a new intention such as ``RETRY`` or
|
||||
``REVERT`` (other atoms that were predecessors of this failing atom may also
|
||||
have their intention altered). Once this intention adjustment has happened a
|
||||
new round of :ref:`scheduling <scheduling>` occurs and this process repeats
|
||||
new round of :ref:`scheduling <scheduling>` occurs and this process repeats
|
||||
until the engine succeeds or fails (if the process running the engine dies the
|
||||
above stages will be restarted and resuming will occur).
|
||||
|
||||
@@ -328,8 +339,8 @@ above stages will be restarted and resuming will occur).
|
||||
|
||||
If the engine is suspended while the engine is going through the above
|
||||
stages this will stop any further scheduling stages from occurring and
|
||||
all currently executing atoms will be allowed to finish (and there results
|
||||
will be saved).
|
||||
all currently executing work will be allowed to finish (see
|
||||
:ref:`suspension <suspension>`).
|
||||
|
||||
Finishing
|
||||
---------
|
||||
@@ -346,6 +357,79 @@ failures have occurred then the engine will have finished and if so desired the
|
||||
:doc:`persistence <persistence>` can be used to cleanup any details that were
|
||||
saved for this execution.
|
||||
|
||||
Special cases
|
||||
=============
|
||||
|
||||
.. _suspension:
|
||||
|
||||
Suspension
|
||||
----------
|
||||
|
||||
Each engine implements a :py:func:`~taskflow.engines.base.Engine.suspend`
|
||||
method that can be used to *externally* (or in the future *internally*) request
|
||||
that the engine stop :ref:`scheduling <scheduling>` new work. By default what
|
||||
this performs is a transition of the flow state from ``RUNNING`` into a
|
||||
``SUSPENDING`` state (which will later transition into a ``SUSPENDED`` state).
|
||||
Since an engine may be remotely executing atoms (or locally executing them)
|
||||
and there is currently no preemption what occurs is that the engines
|
||||
:py:class:`~taskflow.engines.action_engine.runner.Runner` state machine will
|
||||
detect this transition into ``SUSPENDING`` has occurred and the state
|
||||
machine will avoid scheduling new work (it will though let active work
|
||||
continue). After the current work has finished the engine will
|
||||
transition from ``SUSPENDING`` into ``SUSPENDED`` and return from its
|
||||
:py:func:`~taskflow.engines.base.Engine.run` method.
|
||||
|
||||
|
||||
.. note::
|
||||
|
||||
When :py:func:`~taskflow.engines.base.Engine.run` is returned from at that
|
||||
point there *may* (but does not have to be, depending on what was active
|
||||
when :py:func:`~taskflow.engines.base.Engine.suspend` was called) be
|
||||
unfinished work in the flow that was not finished (but which can be
|
||||
resumed at a later point in time).
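A sketch of requesting a suspension from another thread (the task, flow and
sleep durations here are only for illustration)::

    import threading
    import time

    import taskflow.engines
    from taskflow.patterns import linear_flow
    from taskflow import task


    class Nap(task.Task):
        def execute(self):
            time.sleep(10)


    flow = linear_flow.Flow('naps').add(Nap('nap-1'), Nap('nap-2'))
    engine = taskflow.engines.load(flow)

    runner = threading.Thread(target=engine.run)
    runner.start()
    time.sleep(1)
    engine.suspend()   # no new atoms will be scheduled...
    runner.join()      # ...and run() returns once the active atom finishes
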
Scoping
|
||||
=======
|
||||
|
||||
During creation of flows it is also important to understand the lookup
|
||||
strategy (also typically known as `scope`_ resolution) that the engine you
|
||||
are using will internally use. For example when a task ``A`` provides
|
||||
result 'a' and a task ``B`` after ``A`` provides a different result 'a' and a
|
||||
task ``C`` after ``A`` and after ``B`` requires 'a' to run, which one will
|
||||
be selected?
|
||||
|
||||
Default strategy
|
||||
----------------
|
||||
|
||||
When an engine is executing it internally interacts with the
|
||||
:py:class:`~taskflow.storage.Storage` class
|
||||
and that class interacts with a
:py:class:`~taskflow.engines.action_engine.scopes.ScopeWalker` instance
and the :py:class:`~taskflow.storage.Storage` class uses the following
lookup order to find (or fail) an atom's requirement lookup/request:
|
||||
|
||||
#. Transient injected atom specific arguments.
#. Non-transient injected atom specific arguments.
#. Transient injected arguments (flow specific).
#. Non-transient injected arguments (flow specific).
#. First scope visited provider that produces the named result; note that
   if multiple providers are found in the same scope the *first* (the scope
   walkers yielded ordering defines what *first* means) that produced that
   result *and* can be extracted without raising an error is selected as the
   provider of the requested requirement.
#. Fails with :py:class:`~taskflow.exceptions.NotFound` if unresolved at this
   point (the ``cause`` attribute of this exception may have more details on
   why the lookup failed).

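Revisiting the earlier question (two providers of ``'a'`` followed by a
consumer of ``'a'``), a sketch of what this looks like in practice (the class
and flow names are illustrative)::

    import taskflow.engines
    from taskflow.patterns import linear_flow
    from taskflow import task


    class ProvideA(task.Task):
        default_provides = 'a'

        def execute(self):
            return self.name


    class ConsumeA(task.Task):
        def execute(self, a):
            print("Got 'a' from %s" % a)


    flow = linear_flow.Flow('scoping').add(
        ProvideA('A'),
        ProvideA('B'),
        ConsumeA('C'),
    )
    # Following the lookup order above, the first scope visited that provides
    # 'a' is the closest predecessor, so 'C' is expected to receive the value
    # produced by 'B' (and not the one produced by 'A').
    taskflow.engines.run(flow)
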
.. note::
|
||||
|
||||
    To examine this information when debugging it is recommended to
    enable the ``BLATHER`` logging level (level 5). At this level the storage
    and scope code/layers will log what is being searched for and what is
    being found.
|
||||
|
||||
.. _scope: http://en.wikipedia.org/wiki/Scope_%28computer_science%29
|
||||
|
||||
Interfaces
|
||||
==========
|
||||
|
||||
@@ -354,15 +438,27 @@ Interfaces
|
||||
Implementations
|
||||
===============
|
||||
|
||||
.. automodule:: taskflow.engines.action_engine.engine
|
||||
|
||||
Components
|
||||
----------
|
||||
|
||||
.. warning::
|
||||
|
||||
External usage of internal engine functions, components and modules should
|
||||
be kept to a **minimum** as they may be altered, refactored or moved to
|
||||
other locations **without** notice (and without the typical deprecation
|
||||
cycle).
|
||||
|
||||
.. automodule:: taskflow.engines.action_engine.analyzer
|
||||
.. automodule:: taskflow.engines.action_engine.compiler
|
||||
.. automodule:: taskflow.engines.action_engine.completer
|
||||
.. automodule:: taskflow.engines.action_engine.engine
|
||||
.. automodule:: taskflow.engines.action_engine.executor
|
||||
.. automodule:: taskflow.engines.action_engine.runner
|
||||
.. automodule:: taskflow.engines.action_engine.runtime
|
||||
.. automodule:: taskflow.engines.action_engine.scheduler
|
||||
.. automodule:: taskflow.engines.action_engine.scopes
|
||||
.. autoclass:: taskflow.engines.action_engine.scopes.ScopeWalker
|
||||
:special-members: __iter__
|
||||
|
||||
Hierarchy
|
||||
=========
|
||||
|
||||
@@ -34,6 +34,30 @@ Using listeners
|
||||
:linenos:
|
||||
:lines: 16-
|
||||
|
||||
Using listeners (to watch a phone call)
|
||||
=======================================
|
||||
|
||||
.. note::
|
||||
|
||||
Full source located at :example:`simple_linear_listening`.
|
||||
|
||||
.. literalinclude:: ../../taskflow/examples/simple_linear_listening.py
|
||||
:language: python
|
||||
:linenos:
|
||||
:lines: 16-
|
||||
|
||||
Dumping a in-memory backend
|
||||
===========================
|
||||
|
||||
.. note::
|
||||
|
||||
Full source located at :example:`dump_memory_backend`.
|
||||
|
||||
.. literalinclude:: ../../taskflow/examples/dump_memory_backend.py
|
||||
:language: python
|
||||
:linenos:
|
||||
:lines: 16-
|
||||
|
||||
Making phone calls
|
||||
==================
|
||||
|
||||
@@ -176,6 +200,18 @@ Summation mapper(s) and reducer (in parallel)
|
||||
:linenos:
|
||||
:lines: 16-
|
||||
|
||||
Sharing a thread pool executor (in parallel)
|
||||
============================================
|
||||
|
||||
.. note::
|
||||
|
||||
Full source located at :example:`share_engine_thread`
|
||||
|
||||
.. literalinclude:: ../../taskflow/examples/share_engine_thread.py
|
||||
:language: python
|
||||
:linenos:
|
||||
:lines: 16-
|
||||
|
||||
Storing & emitting a bill
|
||||
=========================
|
||||
|
||||
@@ -306,3 +342,28 @@ Jobboard producer/consumer (simple)
|
||||
:language: python
|
||||
:linenos:
|
||||
:lines: 16-
|
||||
|
||||
Conductor simulating a CI pipeline
|
||||
==================================
|
||||
|
||||
.. note::
|
||||
|
||||
Full source located at :example:`tox_conductor`
|
||||
|
||||
.. literalinclude:: ../../taskflow/examples/tox_conductor.py
|
||||
:language: python
|
||||
:linenos:
|
||||
:lines: 16-
|
||||
|
||||
|
||||
Conductor running 99 bottles of beer song requests
|
||||
==================================================
|
||||
|
||||
.. note::
|
||||
|
||||
Full source located at :example:`99_bottles`
|
||||
|
||||
.. literalinclude:: ../../taskflow/examples/99_bottles.py
|
||||
:language: python
|
||||
:linenos:
|
||||
:lines: 16-
|
||||
|
||||
2
doc/source/history.rst
Normal file
@@ -0,0 +1,2 @@
|
||||
.. include:: ../../ChangeLog
|
||||
|
||||
3
doc/source/img/area_of_influence.svg
Normal file
|
After Width: | Height: | Size: 7.1 KiB |
BIN
doc/source/img/conductor.png
Normal file
|
After Width: | Height: | Size: 9.4 KiB |
8
doc/source/img/job_states.svg
Normal file
|
After Width: | Height: | Size: 13 KiB |
|
Before Width: | Height: | Size: 20 KiB After Width: | Height: | Size: 22 KiB |
|
Before Width: | Height: | Size: 18 KiB After Width: | Height: | Size: 20 KiB |
@@ -14,7 +14,7 @@ Contents
|
||||
========
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 3
|
||||
:maxdepth: 2
|
||||
|
||||
atoms
|
||||
arguments_and_results
|
||||
@@ -29,6 +29,9 @@ Contents
|
||||
jobs
|
||||
conductors
|
||||
|
||||
Supplementary
|
||||
=============
|
||||
|
||||
Examples
|
||||
--------
|
||||
|
||||
@@ -62,7 +65,8 @@ TaskFlow into your project:
|
||||
|
||||
* Read over the `paradigm shifts`_ and engage the team in `IRC`_ (or via the
|
||||
`openstack-dev`_ mailing list) if these need more explanation (prefix
|
||||
``[TaskFlow]`` to your emails subject to get an even faster response).
|
||||
``[Oslo][TaskFlow]`` to your emails subject to get an even faster
|
||||
response).
|
||||
* Follow (or at least attempt to follow) some of the established
|
||||
`best practices`_ (feel free to add your own suggested best practices).
|
||||
* Keep in touch with the team (see above); we are all friendly and enjoy
|
||||
@@ -85,6 +89,29 @@ Miscellaneous
|
||||
types
|
||||
utils
|
||||
|
||||
Bookshelf
|
||||
---------
|
||||
|
||||
A useful collection of links, documents, papers, similar
|
||||
projects, frameworks and libraries.
|
||||
|
||||
.. note::
|
||||
|
||||
Please feel free to submit your own additions and/or changes.
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 1
|
||||
|
||||
shelf
|
||||
|
||||
History
|
||||
-------
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
|
||||
history
|
||||
|
||||
Indices and tables
|
||||
==================
|
||||
|
||||
|
||||
@@ -30,7 +30,7 @@ Definitions
|
||||
Jobs
|
||||
A :py:class:`job <taskflow.jobs.base.Job>` consists of a unique identifier,
|
||||
name, and a reference to a :py:class:`logbook
|
||||
<taskflow.persistence.logbook.LogBook>` which contains the details of the
|
||||
<taskflow.persistence.models.LogBook>` which contains the details of the
|
||||
work that has been or should be/will be completed to finish the work that has
|
||||
been created for that job.
|
||||
|
||||
@@ -43,7 +43,7 @@ Jobboards
|
||||
jobboards implement the same interface and semantics so that the backend
|
||||
usage is as transparent as possible. This allows deployers or developers of a
|
||||
service that uses TaskFlow to select a jobboard implementation that fits
|
||||
their setup (and there intended usage) best.
|
||||
their setup (and their intended usage) best.
|
||||
|
||||
High level architecture
|
||||
=======================
|
||||
@@ -62,7 +62,8 @@ Features
|
||||
the previously partially completed work or begin initial work to ensure
|
||||
that the workflow as a whole progresses (where progressing implies
|
||||
transitioning through the workflow :doc:`patterns <patterns>` and
|
||||
:doc:`atoms <atoms>` and completing their associated state transitions).
|
||||
:doc:`atoms <atoms>` and completing their associated
|
||||
:doc:`states <states>` transitions).
|
||||
|
||||
- Atomic transfer and single ownership
|
||||
|
||||
@@ -94,11 +95,12 @@ Features
|
||||
Usage
|
||||
=====
|
||||
|
||||
All engines are mere classes that implement same interface, and of course it is
|
||||
possible to import them and create their instances just like with any classes
|
||||
in Python. But the easier (and recommended) way for creating jobboards is by
|
||||
using the :py:meth:`fetch() <taskflow.jobs.backends.fetch>` function which uses
|
||||
entrypoints (internally using `stevedore`_) to fetch and configure your backend
|
||||
All jobboards are mere classes that implement the same interface, and of course
|
||||
it is possible to import them and create instances of them just like with any
|
||||
other class in Python. But the easier (and recommended) way for creating
|
||||
jobboards is by using the :py:meth:`fetch() <taskflow.jobs.backends.fetch>`
|
||||
function which uses entrypoints (internally using `stevedore`_) to fetch and
|
||||
configure your backend.
|
||||
|
||||
Using this function the typical creation of a jobboard (and an example posting
|
||||
of a job) might look like:
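
A rough sketch (assuming the zookeeper board type; the names, details and
connection values below are placeholders only)::

    from taskflow.jobs import backends as job_backends
    from taskflow.persistence import backends as persistence_backends

    persistence = persistence_backends.fetch({'connection': 'zookeeper'})
    board = job_backends.fetch('my-board', {'board': 'zookeeper'},
                               persistence=persistence)
    board.connect()
    try:
        # Post a job for some conductor to later claim, run and complete
        # (``details`` typically references the logbook/flow to work on).
        job = board.post('my-job', details={'flow_uuid': 'some-flow-uuid'})
        print(job)
    finally:
        board.close()
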
|
||||
@@ -200,13 +202,27 @@ Additional *configuration* parameters:
|
||||
* ``handler``: a class that provides ``kazoo.handlers``-like interface; it will
|
||||
be used internally by `kazoo`_ to perform asynchronous operations, useful
|
||||
when your program uses eventlet and you want to instruct kazoo to use an
|
||||
eventlet compatible handler (such as the `eventlet handler`_).
|
||||
eventlet compatible handler.
|
||||
|
||||
.. note::
|
||||
|
||||
See :py:class:`~taskflow.jobs.backends.impl_zookeeper.ZookeeperJobBoard`
|
||||
for implementation details.
|
||||
|
||||
Redis
|
||||
-----
|
||||
|
||||
**Board type**: ``'redis'``
|
||||
|
||||
Uses `redis`_ to provide the jobboard capabilities and semantics by using
|
||||
a redis hash datastructure and individual job ownership keys (that can
|
||||
optionally expire after a given amount of time).
|
||||
|
||||
.. note::
|
||||
|
||||
See :py:class:`~taskflow.jobs.backends.impl_redis.RedisJobBoard`
|
||||
for implementation details.
|
||||
|
||||
Considerations
|
||||
==============
|
||||
|
||||
@@ -218,7 +234,7 @@ Dual-engine jobs
|
||||
----------------
|
||||
|
||||
**What:** Since atoms and engines are not currently `preemptable`_ we can not
|
||||
force a engine (or the threads/remote workers... it is using to run) to stop
|
||||
force an engine (or the threads/remote workers... it is using to run) to stop
|
||||
working on an atom (it is generally bad behavior to force code to stop without
|
||||
its consent anyway) if it has already started working on an atom (short of
|
||||
doing a ``kill -9`` on the running interpreter). This could cause problems
|
||||
@@ -265,18 +281,27 @@ Interfaces
|
||||
Implementations
|
||||
===============
|
||||
|
||||
Zookeeper
|
||||
---------
|
||||
|
||||
.. automodule:: taskflow.jobs.backends.impl_zookeeper
|
||||
|
||||
Redis
|
||||
-----
|
||||
|
||||
.. automodule:: taskflow.jobs.backends.impl_redis
|
||||
|
||||
Hierarchy
|
||||
=========
|
||||
|
||||
.. inheritance-diagram::
|
||||
taskflow.jobs.base
|
||||
taskflow.jobs.backends.impl_redis
|
||||
taskflow.jobs.backends.impl_zookeeper
|
||||
:parts: 1
|
||||
|
||||
.. _paradigm shift: https://wiki.openstack.org/wiki/TaskFlow/Paradigm_shifts#Workflow_ownership_transfer
|
||||
.. _zookeeper: http://zookeeper.apache.org/
|
||||
.. _kazoo: http://kazoo.readthedocs.org/
|
||||
.. _eventlet handler: https://pypi.python.org/pypi/kazoo-eventlet-handler/
|
||||
.. _stevedore: http://stevedore.readthedocs.org/
|
||||
.. _redis: http://redis.io/
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
===========================
|
||||
---------------------------
|
||||
Notifications and listeners
|
||||
===========================
|
||||
---------------------------
|
||||
|
||||
.. testsetup::
|
||||
|
||||
@@ -10,13 +10,12 @@ Notifications and listeners
|
||||
from taskflow.types import notifier
|
||||
ANY = notifier.Notifier.ANY
|
||||
|
||||
--------
|
||||
Overview
|
||||
--------
|
||||
========
|
||||
|
||||
Engines provide a way to receive notification on task and flow state
|
||||
transitions, which is useful for monitoring, logging, metrics, debugging
|
||||
and plenty of other tasks.
|
||||
transitions (see :doc:`states <states>`), which is useful for
|
||||
monitoring, logging, metrics, debugging and plenty of other tasks.
|
||||
|
||||
To receive these notifications you should register a callback with
|
||||
an instance of the :py:class:`~taskflow.types.notifier.Notifier`
|
||||
@@ -27,9 +26,8 @@ TaskFlow also comes with a set of predefined :ref:`listeners <listeners>`, and
|
||||
provides means to write your own listeners, which can be more convenient than
|
||||
using raw callbacks.
|
||||
|
||||
--------------------------------------
|
||||
Receiving notifications with callbacks
|
||||
--------------------------------------
|
||||
======================================
|
||||
|
||||
Flow notifications
|
||||
------------------
|
||||
@@ -106,9 +104,8 @@ A basic example is:
|
||||
|
||||
.. _listeners:
|
||||
|
||||
---------
|
||||
Listeners
|
||||
---------
|
||||
=========
|
||||
|
||||
TaskFlow comes with a set of predefined listeners -- helper classes that can be
|
||||
used to do various actions on flow and/or tasks transitions. You can also
|
||||
@@ -147,28 +144,31 @@ For example, this is how you can use
|
||||
<taskflow.engines.action_engine.engine.SerialActionEngine object at ...> has moved task 'DogTalk' (...) into state 'SUCCESS' from state 'RUNNING' with result 'dog' (failure=False)
|
||||
<taskflow.engines.action_engine.engine.SerialActionEngine object at ...> has moved flow 'cat-dog' (...) into state 'SUCCESS' from state 'RUNNING'
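
In code, attaching such a listener typically looks like the following sketch
(the flow and task here are illustrative)::

    import taskflow.engines
    from taskflow.listeners import logging as logging_listeners
    from taskflow.patterns import linear_flow
    from taskflow import task


    class CatTalk(task.Task):
        def execute(self):
            return 'meow'


    engine = taskflow.engines.load(linear_flow.Flow('cat').add(CatTalk()))
    # Listeners are context managers: they register themselves with the
    # engine notifiers on entry and deregister on exit.
    with logging_listeners.DynamicLoggingListener(engine):
        engine.run()
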
|
||||
|
||||
Basic listener
|
||||
--------------
|
||||
Interfaces
|
||||
==========
|
||||
|
||||
.. autoclass:: taskflow.listeners.base.Listener
|
||||
.. automodule:: taskflow.listeners.base
|
||||
|
||||
Implementations
|
||||
===============
|
||||
|
||||
Printing and logging listeners
|
||||
------------------------------
|
||||
|
||||
.. autoclass:: taskflow.listeners.base.DumpingListener
|
||||
|
||||
.. autoclass:: taskflow.listeners.logging.LoggingListener
|
||||
|
||||
.. autoclass:: taskflow.listeners.logging.DynamicLoggingListener
|
||||
|
||||
.. autoclass:: taskflow.listeners.printing.PrintingListener
|
||||
|
||||
Timing listener
|
||||
---------------
|
||||
Timing listeners
|
||||
----------------
|
||||
|
||||
.. autoclass:: taskflow.listeners.timing.TimingListener
|
||||
.. autoclass:: taskflow.listeners.timing.DurationListener
|
||||
|
||||
.. autoclass:: taskflow.listeners.timing.PrintingTimingListener
|
||||
.. autoclass:: taskflow.listeners.timing.PrintingDurationListener
|
||||
|
||||
.. autoclass:: taskflow.listeners.timing.EventTimeListener
|
||||
|
||||
Claim listener
|
||||
--------------
|
||||
@@ -181,7 +181,7 @@ Capturing listener
|
||||
.. autoclass:: taskflow.listeners.capturing.CaptureListener
|
||||
|
||||
Hierarchy
|
||||
---------
|
||||
=========
|
||||
|
||||
.. inheritance-diagram::
|
||||
taskflow.listeners.base.DumpingListener
|
||||
@@ -191,6 +191,7 @@ Hierarchy
|
||||
taskflow.listeners.logging.DynamicLoggingListener
|
||||
taskflow.listeners.logging.LoggingListener
|
||||
taskflow.listeners.printing.PrintingListener
|
||||
taskflow.listeners.timing.PrintingTimingListener
|
||||
taskflow.listeners.timing.TimingListener
|
||||
taskflow.listeners.timing.PrintingDurationListener
|
||||
taskflow.listeners.timing.EventTimeListener
|
||||
taskflow.listeners.timing.DurationListener
|
||||
:parts: 1
|
||||
|
||||
@@ -40,38 +40,38 @@ On :doc:`engine <engines>` construction typically a backend (it can be
|
||||
optional) will be provided which satisfies the
|
||||
:py:class:`~taskflow.persistence.base.Backend` abstraction. Along with
|
||||
providing a backend object a
|
||||
:py:class:`~taskflow.persistence.logbook.FlowDetail` object will also be
|
||||
:py:class:`~taskflow.persistence.models.FlowDetail` object will also be
|
||||
created and provided (this object will contain the details about the flow to be
|
||||
ran) to the engine constructor (or associated :py:meth:`load()
|
||||
<taskflow.engines.helpers.load>` helper functions). Typically a
|
||||
:py:class:`~taskflow.persistence.logbook.FlowDetail` object is created from a
|
||||
:py:class:`~taskflow.persistence.logbook.LogBook` object (the book object acts
|
||||
as a type of container for :py:class:`~taskflow.persistence.logbook.FlowDetail`
|
||||
and :py:class:`~taskflow.persistence.logbook.AtomDetail` objects).
|
||||
:py:class:`~taskflow.persistence.models.FlowDetail` object is created from a
|
||||
:py:class:`~taskflow.persistence.models.LogBook` object (the book object acts
|
||||
as a type of container for :py:class:`~taskflow.persistence.models.FlowDetail`
|
||||
and :py:class:`~taskflow.persistence.models.AtomDetail` objects).
|
||||
|
||||
**Preparation**: Once an engine starts to run it will create a
|
||||
:py:class:`~taskflow.storage.Storage` object which will act as the engines
|
||||
interface to the underlying backend storage objects (it provides helper
|
||||
functions that are commonly used by the engine, avoiding repeating code when
|
||||
interacting with the provided
|
||||
:py:class:`~taskflow.persistence.logbook.FlowDetail` and
|
||||
:py:class:`~taskflow.persistence.models.FlowDetail` and
|
||||
:py:class:`~taskflow.persistence.base.Backend` objects). As an engine
|
||||
initializes it will extract (or create)
|
||||
:py:class:`~taskflow.persistence.logbook.AtomDetail` objects for each atom in
|
||||
:py:class:`~taskflow.persistence.models.AtomDetail` objects for each atom in
|
||||
the workflow the engine will be executing.
|
||||
|
||||
**Execution:** When an engine begins to execute (see :doc:`engine <engines>`
|
||||
for more of the details about how an engine goes about this process) it will
|
||||
examine any previously existing
|
||||
:py:class:`~taskflow.persistence.logbook.AtomDetail` objects to see if they can
|
||||
:py:class:`~taskflow.persistence.models.AtomDetail` objects to see if they can
|
||||
be used for resuming; see :doc:`resumption <resumption>` for more details on
|
||||
this subject. For atoms which have not finished (or did not finish correctly
|
||||
from a previous run) they will begin executing only after any dependent inputs
|
||||
are ready. This is done by analyzing the execution graph and looking at
|
||||
predecessor :py:class:`~taskflow.persistence.logbook.AtomDetail` outputs and
|
||||
predecessor :py:class:`~taskflow.persistence.models.AtomDetail` outputs and
|
||||
states (which may have been persisted in a past run). This will result in
|
||||
either using there previous information or by running those predecessors and
|
||||
saving their output to the :py:class:`~taskflow.persistence.logbook.FlowDetail`
|
||||
either using their previous information or by running those predecessors and
|
||||
saving their output to the :py:class:`~taskflow.persistence.models.FlowDetail`
|
||||
and :py:class:`~taskflow.persistence.base.Backend` objects. This
|
||||
execution, analysis and interaction with the storage objects continues (what is
|
||||
described here is a simplification of what really happens; which is quite a bit
|
||||
@@ -81,7 +81,7 @@ will have succeeded or failed in its attempt to run the workflow).
|
||||
**Post-execution:** Typically when an engine is done running the logbook would
|
||||
be discarded (to avoid creating a stockpile of useless data) and the backend
|
||||
storage would be told to delete any contents for a given execution. For certain
|
||||
use-cases though it may be advantageous to retain logbooks and there contents.
|
||||
use-cases though it may be advantageous to retain logbooks and their contents.
|
||||
|
||||
A few scenarios come to mind:
|
||||
|
||||
@@ -176,7 +176,7 @@ concept everyone is familiar with).
|
||||
See :py:class:`~taskflow.persistence.backends.impl_dir.DirBackend`
|
||||
for implementation details.
|
||||
|
||||
Sqlalchemy
|
||||
SQLAlchemy
|
||||
----------
|
||||
|
||||
**Connection**: ``'mysql'`` or ``'postgres'`` or ``'sqlite'``
|
||||
@@ -249,9 +249,13 @@ parent_uuid VARCHAR False
|
||||
``results`` will contain. This size limit will restrict how many prior
|
||||
failures a retry atom can contain. More information and a future fix
|
||||
will be posted to bug `1416088`_ (for the meantime try to ensure that
|
||||
your retry units history does not grow beyond ~80 prior results).
|
||||
your retry unit's history does not grow beyond ~80 prior results). This
|
||||
truncation can also be avoided by providing ``mysql_sql_mode`` as
|
||||
``traditional`` when selecting your mysql + sqlalchemy based
|
||||
backend (see the `mysql modes`_ documentation for what this implies).
|
||||
|
||||
.. _1416088: http://bugs.launchpad.net/taskflow/+bug/1416088
|
||||
.. _mysql modes: http://dev.mysql.com/doc/refman/5.0/en/sql-mode.html
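
A sketch of selecting such a backend with that option set (the connection
string and credentials are placeholders)::

    import contextlib

    from taskflow.persistence import backends as persistence_backends

    backend = persistence_backends.fetch({
        'connection': 'mysql://user:password@localhost/taskflow',
        'mysql_sql_mode': 'traditional',
    })
    with contextlib.closing(backend.get_connection()) as conn:
        conn.upgrade()   # create (or upgrade) the tables described above
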
|
||||
|
||||
Zookeeper
|
||||
---------
|
||||
@@ -279,14 +283,34 @@ Interfaces
|
||||
|
||||
.. automodule:: taskflow.persistence.backends
|
||||
.. automodule:: taskflow.persistence.base
|
||||
.. automodule:: taskflow.persistence.logbook
|
||||
.. automodule:: taskflow.persistence.path_based
|
||||
|
||||
Models
|
||||
======
|
||||
|
||||
.. automodule:: taskflow.persistence.models
|
||||
|
||||
Implementations
|
||||
===============
|
||||
|
||||
.. automodule:: taskflow.persistence.backends.impl_dir
|
||||
Memory
|
||||
------
|
||||
|
||||
.. automodule:: taskflow.persistence.backends.impl_memory
|
||||
|
||||
Files
|
||||
-----
|
||||
|
||||
.. automodule:: taskflow.persistence.backends.impl_dir
|
||||
|
||||
SQLAlchemy
|
||||
----------
|
||||
|
||||
.. automodule:: taskflow.persistence.backends.impl_sqlalchemy
|
||||
|
||||
Zookeeper
|
||||
---------
|
||||
|
||||
.. automodule:: taskflow.persistence.backends.impl_zookeeper
|
||||
|
||||
Storage
|
||||
|
||||
@@ -46,7 +46,7 @@ name serves a special purpose in the resumption process (as well as serving a
|
||||
useful purpose when running, allowing for atom identification in the
|
||||
:doc:`notification <notifications>` process). The reason for having names is
|
||||
that an atom in a flow needs to be somehow matched with (a potentially)
|
||||
existing :py:class:`~taskflow.persistence.logbook.AtomDetail` during engine
|
||||
existing :py:class:`~taskflow.persistence.models.AtomDetail` during engine
|
||||
resumption & subsequent running.
|
||||
|
||||
The match should be:
|
||||
@@ -71,9 +71,9 @@ Scenarios
|
||||
=========
|
||||
|
||||
When a new flow is loaded into an engine, there is no persisted data for it yet, so
|
||||
a corresponding :py:class:`~taskflow.persistence.logbook.FlowDetail` object
|
||||
a corresponding :py:class:`~taskflow.persistence.models.FlowDetail` object
|
||||
will be created, as well as a
|
||||
:py:class:`~taskflow.persistence.logbook.AtomDetail` object for each atom that
|
||||
:py:class:`~taskflow.persistence.models.AtomDetail` object for each atom that
|
||||
is contained in it. These will be immediately saved into the persistence
|
||||
backend that is configured. If no persistence backend is configured, then as
|
||||
expected nothing will be saved and the atoms and flow will be run in a
|
||||
@@ -94,7 +94,7 @@ When the factory function mentioned above returns the exact same the flow and
|
||||
atoms (no changes are performed).
|
||||
|
||||
**Runtime change:** Nothing should be done -- the engine will re-associate
|
||||
atoms with :py:class:`~taskflow.persistence.logbook.AtomDetail` objects by name
|
||||
atoms with :py:class:`~taskflow.persistence.models.AtomDetail` objects by name
|
||||
and then the engine resumes.
|
||||
|
||||
Atom was added
|
||||
@@ -105,7 +105,7 @@ in (for example for changing the runtime structure of what was previously ran
|
||||
in the first run).
|
||||
|
||||
**Runtime change:** By default when the engine resumes it will notice that a
|
||||
corresponding :py:class:`~taskflow.persistence.logbook.AtomDetail` does not
|
||||
corresponding :py:class:`~taskflow.persistence.models.AtomDetail` does not
|
||||
exist and one will be created and associated.
|
||||
|
||||
Atom was removed
|
||||
@@ -134,7 +134,7 @@ factory should replace this name where it was being used previously.
|
||||
exist when a new atom is added. In the future TaskFlow could make this easier
|
||||
by providing a ``upgrade()`` function that can be used to give users the
|
||||
ability to upgrade atoms before running (manual introspection & modification of
|
||||
a :py:class:`~taskflow.persistence.logbook.LogBook` can be done before engine
|
||||
a :py:class:`~taskflow.persistence.models.LogBook` can be done before engine
|
||||
loading and running to accomplish this in the meantime).
|
||||
|
||||
Atom was split in two atoms or merged
|
||||
@@ -150,7 +150,7 @@ exist when a new atom is added or removed. In the future TaskFlow could make
|
||||
this easier by providing a ``migrate()`` function that can be used to give
|
||||
users the ability to migrate atoms previous data before running (manual
|
||||
introspection & modification of a
|
||||
:py:class:`~taskflow.persistence.logbook.LogBook` can be done before engine
|
||||
:py:class:`~taskflow.persistence.models.LogBook` can be done before engine
|
||||
loading and running to accomplish this in the meantime).
|
||||
|
||||
Flow structure was changed
|
||||
|
||||
60
doc/source/shelf.rst
Normal file
@@ -0,0 +1,60 @@
|
||||
Libraries & frameworks
|
||||
----------------------
|
||||
|
||||
* `APScheduler`_ (Python)
|
||||
* `Async`_ (Python)
|
||||
* `Celery`_ (Python)
|
||||
* `Graffiti`_ (Python)
|
||||
* `JobLib`_ (Python)
|
||||
* `Luigi`_ (Python)
|
||||
* `Mesos`_ (C/C++)
|
||||
* `Papy`_ (Python)
|
||||
* `Parallel Python`_ (Python)
|
||||
* `RQ`_ (Python)
|
||||
* `Spiff`_ (Python)
|
||||
* `TBB Flow`_ (C/C++)
|
||||
|
||||
Languages
|
||||
---------
|
||||
|
||||
* `Ani`_
|
||||
* `Make`_
|
||||
* `Plaid`_
|
||||
|
||||
Services
|
||||
--------
|
||||
|
||||
* `Cloud Dataflow`_
|
||||
* `Mistral`_
|
||||
|
||||
Papers
|
||||
------
|
||||
|
||||
* `Advances in Dataflow Programming Languages`_
|
||||
|
||||
Related paradigms
|
||||
-----------------
|
||||
|
||||
* `Dataflow programming`_
|
||||
* `Programming paradigm(s)`_
|
||||
|
||||
.. _APScheduler: http://pythonhosted.org/APScheduler/
|
||||
.. _Async: http://pypi.python.org/pypi/async
|
||||
.. _Celery: http://www.celeryproject.org/
|
||||
.. _Graffiti: http://github.com/SegFaultAX/graffiti
|
||||
.. _JobLib: http://pythonhosted.org/joblib/index.html
|
||||
.. _Luigi: http://github.com/spotify/luigi
|
||||
.. _RQ: http://python-rq.org/
|
||||
.. _Mistral: http://wiki.openstack.org/wiki/Mistral
|
||||
.. _Mesos: http://mesos.apache.org/
|
||||
.. _Parallel Python: http://www.parallelpython.com/
|
||||
.. _Spiff: http://github.com/knipknap/SpiffWorkflow
|
||||
.. _Papy: http://code.google.com/p/papy/
|
||||
.. _Make: http://www.gnu.org/software/make/
|
||||
.. _Ani: http://code.google.com/p/anic/
|
||||
.. _Programming paradigm(s): http://en.wikipedia.org/wiki/Programming_paradigm
|
||||
.. _Plaid: http://www.cs.cmu.edu/~aldrich/plaid/
|
||||
.. _Advances in Dataflow Programming Languages: http://www.cs.ucf.edu/~dcm/Teaching/COT4810-Spring2011/Literature/DataFlowProgrammingLanguages.pdf
|
||||
.. _Cloud Dataflow: https://cloud.google.com/dataflow/
|
||||
.. _TBB Flow: https://www.threadingbuildingblocks.org/tutorial-intel-tbb-flow-graph
|
||||
.. _Dataflow programming: http://en.wikipedia.org/wiki/Dataflow_programming
|
||||
@@ -121,9 +121,14 @@ or if needed will wait for all of the atoms it depends on to complete.
|
||||
|
||||
.. note::
|
||||
|
||||
A engine running a task also transitions the task to the ``PENDING`` state
|
||||
An engine running a task also transitions the task to the ``PENDING`` state
|
||||
after it was reverted and its containing flow was restarted or retried.
|
||||
|
||||
|
||||
**IGNORE** - When a conditional decision has been made to skip (not
|
||||
execute) the task the engine will transition the task to
|
||||
the ``IGNORE`` state.
|
||||
|
||||
**RUNNING** - When an engine running the task starts to execute the task, the
|
||||
engine will transition the task to the ``RUNNING`` state, and the task will
|
||||
stay in this state until the tasks :py:meth:`~taskflow.task.BaseTask.execute`
|
||||
@@ -168,10 +173,14 @@ flow that the retry is associated with by consulting its

.. note::

A engine running a retry also transitions the retry to the ``PENDING`` state
An engine running a retry also transitions the retry to the ``PENDING`` state
after it was reverted and its associated flow was restarted or retried.

**RUNNING** - When a engine starts to execute the retry, the engine
**IGNORE** - When a conditional decision has been made to skip (not
execute) the retry the engine will transition the retry to
the ``IGNORE`` state.

**RUNNING** - When an engine starts to execute the retry, the engine
transitions the retry to the ``RUNNING`` state, and the retry stays in this
state until its :py:meth:`~taskflow.retry.Retry.execute` method returns.
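For orientation, a retry controller is attached to a flow so that, when one
of the flow's atoms fails and is reverted, the retry is consulted and (if it
decides to try again) moves through the ``RETRYING`` transition described in
the next hunk. A minimal sketch, with the flow and task names being
illustrative only::

    from taskflow import engines
    from taskflow import retry
    from taskflow import task
    from taskflow.patterns import linear_flow


    class FlakyTask(task.Task):
        def execute(self):
            raise IOError("transient failure")


    # Attempt the wrapped work up to three times before giving up.
    flow = linear_flow.Flow('demo', retry=retry.Times(3)).add(
        FlakyTask('flaky'))

    try:
        engines.load(flow).run()
    except Exception:
        # Once the retry controller gives up, the engine re-raises the
        # original failure.
        pass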
@@ -194,3 +203,26 @@ already in the ``FAILURE`` state then this is a no-op).

**RETRYING** - If flow that is associated with the current retry was failed and
reverted, the engine prepares the flow for the next run and transitions the
retry to the ``RETRYING`` state.

Jobs
====

.. image:: img/job_states.svg
:width: 500px
:align: center
:alt: Job state transitions

**UNCLAIMED** - A job (with details about what work is to be completed) has
been initially posted (by some posting entity) for work on by some other
entity (for example a :doc:`conductor <conductors>`). This can also be a state
that is entered when some owning entity has manually abandoned (or
lost ownership of) a previously claimed job.

**CLAIMED** - A job that is *actively* owned by some entity; typically that
ownership is tied to jobs persistent data via some ephemeral connection so
that the job ownership is lost (typically automatically or after some
timeout) if that ephemeral connection is lost.

**COMPLETE** - The work defined in the job has been finished by its owning
entity and the job can no longer be processed (and it *may* be removed at
some/any point in the future).
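These job states map onto a small number of jobboard calls. A rough sketch
(it assumes a ``jobboard`` object has already been created and connected,
for example via the zookeeper backend registered in the ``setup.cfg`` change
later in this diff; the job name and owner string are illustrative)::

    from taskflow import exceptions as excp

    # Posting entity: make work visible to others (job starts UNCLAIMED).
    job = jobboard.post('resize-vm-42', book=None)

    # Claiming entity (for example a conductor): take ownership.
    for job in jobboard.iterjobs():
        try:
            jobboard.claim(job, 'conductor-1')  # UNCLAIMED -> CLAIMED
        except (excp.UnclaimableJob, excp.NotFound):
            continue  # Some other entity beat us to it (or it finished).
        try:
            pass  # ... run the flow referenced by the job via an engine ...
        finally:
            # CLAIMED -> COMPLETE (call abandon() instead to release the
            # job back to UNCLAIMED so another entity can pick it up).
            jobboard.consume(job, 'conductor-1')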
@@ -29,11 +29,6 @@ FSM

.. automodule:: taskflow.types.fsm

Futures
=======

.. automodule:: taskflow.types.futures

Graph
=====

@@ -43,11 +38,12 @@ Notifier
========

.. automodule:: taskflow.types.notifier
:special-members: __call__

Periodic
========
Sets
====

.. automodule:: taskflow.types.periodic
.. automodule:: taskflow.types.sets
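The ``taskflow.types.sets`` module documented above is what the ``atom.py``
hunk later in this diff switches to (replacing ``frozenset`` usage) so that
requirement and provide ordering stays deterministic. A tiny sketch of the
type it exposes, with the element values being arbitrary examples::

    from taskflow.types import sets

    required = sets.OrderedSet(['image', 'flavor', 'image'])
    # Duplicates are dropped but insertion order is preserved, unlike a
    # plain builtin set().
    assert list(required) == ['image', 'flavor']

    # The usual set operations are available as well.
    leftover = required - set(['image'])
    assert list(leftover) == ['flavor']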
Table
=====

@@ -33,11 +33,6 @@ Kombu

.. automodule:: taskflow.utils.kombu_utils

Locks
~~~~~

.. automodule:: taskflow.utils.lock_utils

Miscellaneous
~~~~~~~~~~~~~

@@ -48,6 +43,16 @@ Persistence

.. automodule:: taskflow.utils.persistence_utils

Redis
~~~~~

.. automodule:: taskflow.utils.redis_utils

Schema
~~~~~~

.. automodule:: taskflow.utils.schema_utils

Threading
~~~~~~~~~

@@ -7,10 +7,9 @@ connected via `amqp`_ (or other supported `kombu`_ transports).

.. note::

This engine is under active development and is experimental but it is
usable and does work but is missing some features (please check the
`blueprint page`_ for known issues and plans) that will make it more
production ready.
This engine is under active development and is usable and **does** work
but is missing some features (please check the `blueprint page`_ for
known issues and plans) that will make it more production ready.

.. _blueprint page: https://blueprints.launchpad.net/taskflow?searchtext=wbe

@@ -18,8 +17,8 @@ Terminology
-----------

Client
Code or program or service that uses this library to define flows and
run them via engines.
Code or program or service (or user) that uses this library to define
flows and run them via engines.

Transport + protocol
Mechanism (and `protocol`_ on top of that mechanism) used to pass information
@@ -118,7 +117,7 @@ engine executor in the following manner:
4. The executor gets the task request confirmation from the worker and the task
request state changes from the ``PENDING`` to the ``RUNNING`` state. Once a
task request is in the ``RUNNING`` state it can't be timed-out (considering
that task execution process may take unpredictable time).
that the task execution process may take an unpredictable amount of time).
5. The executor gets the task execution result from the worker and passes it
back to the executor and worker-based engine to finish task processing (this
repeats for subsequent tasks).
@@ -129,7 +128,9 @@ engine executor in the following manner:
json-serializable (they contain references to tracebacks which are not
serializable), so they are converted to dicts before sending and converted
from dicts after receiving on both executor & worker sides (this
translation is lossy since the traceback won't be fully retained).
translation is lossy since the traceback can't be fully retained, due
to its contents containing internal interpreter references and
details).

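The dict conversion mentioned above goes through the
:py:class:`taskflow.types.failure.Failure` type; a small illustrative sketch
of that round trip (not tied to any particular worker setup)::

    from taskflow.types import failure

    try:
        raise ValueError("boom")
    except ValueError:
        # Capture the active exception (and its traceback) as a failure.
        fail = failure.Failure()

    # What actually crosses the wire between executor and worker.
    data = fail.to_dict()

    # On the other side the failure is rebuilt; the traceback text is
    # retained for printing, but the original traceback object is not.
    rebuilt = failure.Failure.from_dict(data)
    print(rebuilt.exception_str)
    print(rebuilt.pformat(traceback=True))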
Protocol
~~~~~~~~
@@ -406,16 +407,20 @@ Limitations
locally to avoid transport overhead for very simple tasks (currently it will
run even lightweight tasks remotely, which may be non-performant).
* Fault detection, currently when a worker acknowledges a task the engine will
wait for the task result indefinitely (a task could take a very long time to
finish). In the future there needs to be a way to limit the duration of a
remote workers execution (and track there liveness) and possibly spawn
the task on a secondary worker if a timeout is reached (aka the first worker
has died or has stopped responding).
wait for the task result indefinitely (a task may take an indeterminate
amount of time to finish). In the future there needs to be a way to limit
the duration of a remote workers execution (and track their liveness) and
possibly spawn the task on a secondary worker if a timeout is reached (aka
the first worker has died or has stopped responding).

Interfaces
==========
Implementations
===============

.. automodule:: taskflow.engines.worker_based.engine

Components
----------

.. automodule:: taskflow.engines.worker_based.proxy
.. automodule:: taskflow.engines.worker_based.worker

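For orientation, a rough sketch of how a standalone worker for this engine
is typically started. The exchange, topic, broker url and task path below
are placeholders, and the exact keyword names should be double checked
against the ``taskflow.engines.worker_based.worker`` documentation
referenced above; this is a sketch, not part of this change::

    from taskflow.engines.worker_based import worker as wbe_worker

    w = wbe_worker.Worker(
        exchange='taskflow-demo',
        topic='worker-1',
        # Importable task classes this worker is willing to execute.
        tasks=['myproject.tasks:ResizeVolume'],
        url='amqp://guest:guest@localhost:5672//')
    w.run()  # Blocks, serving task requests sent by worker-based engines.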
@@ -1,7 +1,4 @@
[DEFAULT]

# The list of modules to copy from oslo-incubator.git
script=tools/run_cross_tests.sh

# The base module to hold the copy of openstack.common
base=taskflow

2
pylintrc
@@ -12,7 +12,7 @@ variable-rgx=[a-z_][a-z0-9_]{0,30}$
argument-rgx=[a-z_][a-z0-9_]{1,30}$

# Method names should be at least 3 characters long
# and be lowecased with underscores
# and be lowercased with underscores
method-rgx=[a-z_][a-z0-9_]{2,50}$

# Don't require docstrings on tests.

@@ -1,30 +0,0 @@
# The order of packages is significant, because pip processes them in the order
# of appearance. Changing the order has an impact on the overall integration
# process, which may cause wedges in the gate later.

# See: https://bugs.launchpad.net/pbr/+bug/1384919 for why this is here...
pbr>=0.6,!=0.7,<1.0

# Packages needed for using this library.

# Only needed on python 2.6
ordereddict

# Python 2->3 compatibility library.
six>=1.7.0

# Very nice graph library
networkx>=1.8

# Used for backend storage engine loading.
stevedore>=1.1.0 # Apache-2.0

# Backport for concurrent.futures which exists in 3.2+
futures>=2.1.6

# Used for structured input validation
jsonschema>=2.0.0,<3.0.0

# For common utilities
oslo.utils>=1.2.0 # Apache-2.0
oslo.serialization>=1.2.0 # Apache-2.0
@@ -1,24 +0,0 @@
# The order of packages is significant, because pip processes them in the order
# of appearance. Changing the order has an impact on the overall integration
# process, which may cause wedges in the gate later.

# See: https://bugs.launchpad.net/pbr/+bug/1384919 for why this is here...
pbr>=0.6,!=0.7,<1.0

# Packages needed for using this library.

# Python 2->3 compatibility library.
six>=1.7.0

# Very nice graph library
networkx>=1.8

# Used for backend storage engine loading.
stevedore>=1.1.0 # Apache-2.0

# Used for structured input validation
jsonschema>=2.0.0,<3.0.0

# For common utilities
oslo.utils>=1.2.0 # Apache-2.0
oslo.serialization>=1.2.0 # Apache-2.0
48
requirements.txt
Normal file
@@ -0,0 +1,48 @@
# The order of packages is significant, because pip processes them in the order
# of appearance. Changing the order has an impact on the overall integration
# process, which may cause wedges in the gate later.

# See: https://bugs.launchpad.net/pbr/+bug/1384919 for why this is here...
pbr<2.0,>=0.11

# Packages needed for using this library.

# Python 2->3 compatibility library.
six>=1.9.0

# Enum library made for <= python 3.3
enum34;python_version=='2.7' or python_version=='2.6'

# For async and/or periodic work
futurist>=0.1.1 # Apache-2.0

# For reader/writer + interprocess locks.
fasteners>=0.7 # Apache-2.0

# Very nice graph library
networkx>=1.8

# For contextlib new additions/compatibility for <= python 3.3
contextlib2>=0.4.0 # PSF License

# Used for backend storage engine loading.
stevedore>=1.5.0 # Apache-2.0

# Backport for concurrent.futures which exists in 3.2+
futures>=3.0;python_version=='2.7' or python_version=='2.6'

# Backport for time.monotonic which is in 3.3+
monotonic>=0.1 # Apache-2.0

# Used for structured input validation
jsonschema!=2.5.0,<3.0.0,>=2.0.0

# For common utilities
oslo.utils>=1.6.0 # Apache-2.0
oslo.serialization>=1.4.0 # Apache-2.0

# For lru caches and such
cachetools>=1.0.0 # MIT License

# For deprecation of things
debtcollector>=0.3.0 # Apache-2.0
@@ -17,10 +17,8 @@ classifier =
Operating System :: POSIX :: Linux
Programming Language :: Python
Programming Language :: Python :: 2
Programming Language :: Python :: 2.6
Programming Language :: Python :: 2.7
Programming Language :: Python :: 3
Programming Language :: Python :: 3.3
Programming Language :: Python :: 3.4
Topic :: Software Development :: Libraries
Topic :: System :: Distributed Computing
@@ -36,6 +34,10 @@ packages =
[entry_points]
taskflow.jobboards =
zookeeper = taskflow.jobs.backends.impl_zookeeper:ZookeeperJobBoard
redis = taskflow.jobs.backends.impl_redis:RedisJobBoard

taskflow.conductors =
blocking = taskflow.conductors.backends.impl_blocking:BlockingConductor

taskflow.persistence =
dir = taskflow.persistence.backends.impl_dir:DirBackend

1
setup.py
@@ -1,4 +1,3 @@
#!/usr/bin/env python
# Copyright (c) 2013 Hewlett-Packard Development Company, L.P.
#
# Licensed under the Apache License, Version 2.0 (the "License");

191
taskflow/atom.py
@@ -16,14 +16,22 @@
|
||||
# under the License.
|
||||
|
||||
import abc
|
||||
import collections
|
||||
import itertools
|
||||
|
||||
from oslo_utils import reflection
|
||||
import six
|
||||
from six.moves import zip as compat_zip
|
||||
|
||||
from taskflow import exceptions
|
||||
from taskflow.types import sets
|
||||
from taskflow.utils import misc
|
||||
|
||||
|
||||
# Helper types tuples...
|
||||
_sequence_types = (list, tuple, collections.Sequence)
|
||||
_set_types = (set, collections.Set)
|
||||
|
||||
|
||||
def _save_as_to_mapping(save_as):
|
||||
"""Convert save_as to mapping name => index.
|
||||
|
||||
@@ -33,25 +41,27 @@ def _save_as_to_mapping(save_as):
|
||||
# outside of code so that it's more easily understandable, since what an
|
||||
# atom returns is pretty crucial for other later operations.
|
||||
if save_as is None:
|
||||
return {}
|
||||
return collections.OrderedDict()
|
||||
if isinstance(save_as, six.string_types):
|
||||
# NOTE(harlowja): this means that your atom will only return one item
|
||||
# instead of a dictionary-like object or a indexable object (like a
|
||||
# list or tuple).
|
||||
return {save_as: None}
|
||||
elif isinstance(save_as, (tuple, list)):
|
||||
return collections.OrderedDict([(save_as, None)])
|
||||
elif isinstance(save_as, _sequence_types):
|
||||
# NOTE(harlowja): this means that your atom will return a indexable
|
||||
# object, like a list or tuple and the results can be mapped by index
|
||||
# to that tuple/list that is returned for others to use.
|
||||
return dict((key, num) for num, key in enumerate(save_as))
|
||||
elif isinstance(save_as, set):
|
||||
return collections.OrderedDict((key, num)
|
||||
for num, key in enumerate(save_as))
|
||||
elif isinstance(save_as, _set_types):
|
||||
# NOTE(harlowja): in the case where a set is given we will not be
|
||||
# able to determine the numeric ordering in a reliable way (since it is
|
||||
# a unordered set) so the only way for us to easily map the result of
|
||||
# the atom will be via the key itself.
|
||||
return dict((key, key) for key in save_as)
|
||||
raise TypeError('Atom provides parameter '
|
||||
'should be str, set or tuple/list, not %r' % save_as)
|
||||
# able to determine the numeric ordering in a reliable way (since it
|
||||
# may be an unordered set) so the only way for us to easily map the
|
||||
# result of the atom will be via the key itself.
|
||||
return collections.OrderedDict((key, key) for key in save_as)
|
||||
else:
|
||||
raise TypeError('Atom provides parameter '
|
||||
'should be str, set or tuple/list, not %r' % save_as)
|
||||
|
||||
|
||||
def _build_rebind_dict(args, rebind_args):
|
||||
@@ -62,9 +72,9 @@ def _build_rebind_dict(args, rebind_args):
|
||||
new name onto the required name).
|
||||
"""
|
||||
if rebind_args is None:
|
||||
return {}
|
||||
return collections.OrderedDict()
|
||||
elif isinstance(rebind_args, (list, tuple)):
|
||||
rebind = dict(zip(args, rebind_args))
|
||||
rebind = collections.OrderedDict(compat_zip(args, rebind_args))
|
||||
if len(args) < len(rebind_args):
|
||||
rebind.update((a, a) for a in rebind_args[len(args):])
|
||||
return rebind
|
||||
@@ -85,11 +95,11 @@ def _build_arg_mapping(atom_name, reqs, rebind_args, function, do_infer,
|
||||
extra arguments (where applicable).
|
||||
"""
|
||||
|
||||
# build a list of required arguments based on function signature
|
||||
# Build a list of required arguments based on function signature.
|
||||
req_args = reflection.get_callable_args(function, required_only=True)
|
||||
all_args = reflection.get_callable_args(function, required_only=False)
|
||||
|
||||
# remove arguments that are part of ignore_list
|
||||
# Remove arguments that are part of ignore list.
|
||||
if ignore_list:
|
||||
for arg in ignore_list:
|
||||
if arg in req_args:
|
||||
@@ -97,65 +107,56 @@ def _build_arg_mapping(atom_name, reqs, rebind_args, function, do_infer,
|
||||
else:
|
||||
ignore_list = []
|
||||
|
||||
required = {}
|
||||
# add reqs to required mappings
|
||||
# Build the required names.
|
||||
required = collections.OrderedDict()
|
||||
|
||||
# Add required arguments to required mappings if inference is enabled.
|
||||
if do_infer:
|
||||
required.update((a, a) for a in req_args)
|
||||
|
||||
# Add additional manually provided requirements to required mappings.
|
||||
if reqs:
|
||||
if isinstance(reqs, six.string_types):
|
||||
required.update({reqs: reqs})
|
||||
else:
|
||||
required.update((a, a) for a in reqs)
|
||||
|
||||
# add req_args to required mappings if do_infer is set
|
||||
if do_infer:
|
||||
required.update((a, a) for a in req_args)
|
||||
|
||||
# update required mappings based on rebind_args
|
||||
# Update required mappings values based on rebinding of arguments names.
|
||||
required.update(_build_rebind_dict(req_args, rebind_args))
|
||||
|
||||
# Determine if there are optional arguments that we may or may not take.
|
||||
if do_infer:
|
||||
opt_args = set(all_args) - set(required) - set(ignore_list)
|
||||
optional = dict((a, a) for a in opt_args)
|
||||
opt_args = sets.OrderedSet(all_args)
|
||||
opt_args = opt_args - set(itertools.chain(six.iterkeys(required),
|
||||
iter(ignore_list)))
|
||||
optional = collections.OrderedDict((a, a) for a in opt_args)
|
||||
else:
|
||||
optional = {}
|
||||
optional = collections.OrderedDict()
|
||||
|
||||
# Check if we are given some extra arguments that we aren't able to accept.
|
||||
if not reflection.accepts_kwargs(function):
|
||||
extra_args = set(required) - set(all_args)
|
||||
extra_args = sets.OrderedSet(six.iterkeys(required))
|
||||
extra_args -= all_args
|
||||
if extra_args:
|
||||
extra_args_str = ', '.join(sorted(extra_args))
|
||||
raise ValueError('Extra arguments given to atom %s: %s'
|
||||
% (atom_name, extra_args_str))
|
||||
% (atom_name, list(extra_args)))
|
||||
|
||||
# NOTE(imelnikov): don't use set to preserve order in error message
|
||||
missing_args = [arg for arg in req_args if arg not in required]
|
||||
if missing_args:
|
||||
raise ValueError('Missing arguments for atom %s: %s'
|
||||
% (atom_name, ' ,'.join(missing_args)))
|
||||
% (atom_name, missing_args))
|
||||
return required, optional
|
||||
|
||||
|
||||
@six.add_metaclass(abc.ABCMeta)
|
||||
class Atom(object):
|
||||
"""An abstract flow atom that causes a flow to progress (in some manner).
|
||||
"""An unit of work that causes a flow to progress (in some manner).
|
||||
|
||||
An atom is a named object that operates with input flow data to perform
|
||||
An atom is a named object that operates with input data to perform
|
||||
some action that furthers the overall flows progress. It usually also
|
||||
produces some of its own named output as a result of this process.
|
||||
|
||||
:ivar version: An *immutable* version that associates version information
|
||||
with this atom. It can be useful in resuming older versions
|
||||
of atoms. Standard major, minor versioning concepts
|
||||
should apply.
|
||||
:ivar save_as: An *immutable* output ``resource`` name dictionary this atom
|
||||
produces that other atoms may depend on this atom providing.
|
||||
The format is output index (or key when a dictionary
|
||||
is returned from the execute method) to stored argument
|
||||
name.
|
||||
:ivar rebind: An *immutable* input ``resource`` mapping dictionary that
|
||||
can be used to alter the inputs given to this atom. It is
|
||||
typically used for mapping a prior atoms output into
|
||||
the names that this atom expects (in a way this is like
|
||||
remapping a namespace of another atom into the namespace
|
||||
of this atom).
|
||||
:param name: Meaningful name for this atom, should be something that is
|
||||
distinguishable and understandable for notification,
|
||||
debugging, storing and any other similar purposes.
|
||||
@@ -164,52 +165,61 @@ class Atom(object):
|
||||
to correlate and associate the thing/s this atom
|
||||
produces, if it produces anything at all.
|
||||
:param inject: An *immutable* input_name => value dictionary which
|
||||
specifies any initial inputs that should be automatically
|
||||
injected into the atoms scope before the atom execution
|
||||
commences (this allows for providing atom *local* values that
|
||||
do not need to be provided by other atoms/dependents).
|
||||
specifies any initial inputs that should be automatically
|
||||
injected into the atoms scope before the atom execution
|
||||
commences (this allows for providing atom *local* values
|
||||
that do not need to be provided by other atoms/dependents).
|
||||
:ivar version: An *immutable* version that associates version information
|
||||
with this atom. It can be useful in resuming older versions
|
||||
of atoms. Standard major, minor versioning concepts
|
||||
should apply.
|
||||
:ivar save_as: An *immutable* output ``resource`` name
|
||||
:py:class:`.OrderedDict` this atom produces that other
|
||||
atoms may depend on this atom providing. The format is
|
||||
output index (or key when a dictionary is returned from
|
||||
the execute method) to stored argument name.
|
||||
:ivar rebind: An *immutable* input ``resource`` :py:class:`.OrderedDict`
|
||||
that can be used to alter the inputs given to this atom. It
|
||||
is typically used for mapping a prior atoms output into
|
||||
the names that this atom expects (in a way this is like
|
||||
remapping a namespace of another atom into the namespace
|
||||
of this atom).
|
||||
:ivar inject: See parameter ``inject``.
|
||||
:ivar requires: Any inputs this atom requires to function (if applicable).
|
||||
NOTE(harlowja): there can be no intersection between what
|
||||
this atom requires and what it produces (since this would
|
||||
be an impossible dependency to satisfy).
|
||||
:ivar optional: Any inputs that are optional for this atom's execute
|
||||
method.
|
||||
|
||||
:ivar name: See parameter ``name``.
|
||||
:ivar requires: A :py:class:`~taskflow.types.sets.OrderedSet` of inputs
|
||||
this atom requires to function.
|
||||
:ivar optional: A :py:class:`~taskflow.types.sets.OrderedSet` of inputs
|
||||
that are optional for this atom to function.
|
||||
:ivar provides: A :py:class:`~taskflow.types.sets.OrderedSet` of outputs
|
||||
this atom produces.
|
||||
"""
|
||||
|
||||
def __init__(self, name=None, provides=None, inject=None):
|
||||
self._name = name
|
||||
self.save_as = _save_as_to_mapping(provides)
|
||||
self.name = name
|
||||
self.version = (1, 0)
|
||||
self.inject = inject
|
||||
self.requires = frozenset()
|
||||
self.optional = frozenset()
|
||||
self.save_as = _save_as_to_mapping(provides)
|
||||
self.requires = sets.OrderedSet()
|
||||
self.optional = sets.OrderedSet()
|
||||
self.provides = sets.OrderedSet(self.save_as)
|
||||
self.rebind = collections.OrderedDict()
|
||||
|
||||
def _build_arg_mapping(self, executor, requires=None, rebind=None,
|
||||
auto_extract=True, ignore_list=None):
|
||||
req_arg, opt_arg = _build_arg_mapping(self.name, requires, rebind,
|
||||
executor, auto_extract,
|
||||
ignore_list)
|
||||
|
||||
self.rebind = {}
|
||||
if opt_arg:
|
||||
self.rebind.update(opt_arg)
|
||||
if req_arg:
|
||||
self.rebind.update(req_arg)
|
||||
self.requires = frozenset(req_arg.values())
|
||||
self.optional = frozenset(opt_arg.values())
|
||||
required, optional = _build_arg_mapping(self.name, requires, rebind,
|
||||
executor, auto_extract,
|
||||
ignore_list=ignore_list)
|
||||
rebind = collections.OrderedDict()
|
||||
for (arg_name, bound_name) in itertools.chain(six.iteritems(required),
|
||||
six.iteritems(optional)):
|
||||
rebind.setdefault(arg_name, bound_name)
|
||||
self.rebind = rebind
|
||||
self.requires = sets.OrderedSet(six.itervalues(required))
|
||||
self.optional = sets.OrderedSet(six.itervalues(optional))
|
||||
if self.inject:
|
||||
inject_set = set(six.iterkeys(self.inject))
|
||||
self.requires -= inject_set
|
||||
self.optional -= inject_set
|
||||
|
||||
out_of_order = self.provides.intersection(self.requires)
|
||||
if out_of_order:
|
||||
raise exceptions.DependencyFailure(
|
||||
"Atom %(item)s provides %(oo)s that are required "
|
||||
"by this atom"
|
||||
% dict(item=self.name, oo=sorted(out_of_order)))
|
||||
inject_keys = frozenset(six.iterkeys(self.inject))
|
||||
self.requires -= inject_keys
|
||||
self.optional -= inject_keys
|
||||
|
||||
@abc.abstractmethod
|
||||
def execute(self, *args, **kwargs):
|
||||
@@ -219,23 +229,8 @@ class Atom(object):
|
||||
def revert(self, *args, **kwargs):
|
||||
"""Reverts this atom (undoing any :meth:`execute` side-effects)."""
|
||||
|
||||
@property
|
||||
def name(self):
|
||||
"""A non-unique name for this atom (human readable)."""
|
||||
return self._name
|
||||
|
||||
def __str__(self):
|
||||
return "%s==%s" % (self.name, misc.get_version_string(self))
|
||||
|
||||
def __repr__(self):
|
||||
return '<%s %s>' % (reflection.get_class_name(self), self)
|
||||
|
||||
@property
|
||||
def provides(self):
|
||||
"""Any outputs this atom produces.
|
||||
|
||||
NOTE(harlowja): there can be no intersection between what this atom
|
||||
requires and what it produces (since this would be an impossible
|
||||
dependency to satisfy).
|
||||
"""
|
||||
return set(self.save_as)
|
||||
|
||||
45
taskflow/conductors/backends/__init__.py
Normal file
@@ -0,0 +1,45 @@
# -*- coding: utf-8 -*-

# Copyright (C) 2014 Yahoo! Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.

import logging

import stevedore.driver

from taskflow import exceptions as exc

# NOTE(harlowja): this is the entrypoint namespace, not the module namespace.
CONDUCTOR_NAMESPACE = 'taskflow.conductors'

LOG = logging.getLogger(__name__)


def fetch(kind, name, jobboard, namespace=CONDUCTOR_NAMESPACE, **kwargs):
    """Fetch a conductor backend with the given options.

    This fetch method will look for the entrypoint 'kind' in the entrypoint
    namespace, and then attempt to instantiate that entrypoint using the
    provided name, jobboard and any board specific kwargs.
    """
    LOG.debug('Looking for %r conductor driver in %r', kind, namespace)
    try:
        mgr = stevedore.driver.DriverManager(
            namespace, kind,
            invoke_on_load=True,
            invoke_args=(name, jobboard),
            invoke_kwds=kwargs)
        return mgr.driver
    except RuntimeError as e:
        raise exc.NotFound("Could not find conductor %s" % (kind), e)
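For context, this new helper is typically used together with the
``blocking`` entrypoint registered in the ``setup.cfg`` change above; a
rough sketch, where the jobboard and persistence backend are assumed to have
been created elsewhere::

    from taskflow.conductors import backends as conductor_backends

    # Resolves the 'blocking' entrypoint and instantiates
    # BlockingConductor('conductor-1', jobboard, ...) with the extra kwargs.
    cond = conductor_backends.fetch(
        'blocking', 'conductor-1', jobboard,
        persistence=persistence_backend, wait_timeout=1.0)

    cond.connect()
    try:
        cond.run()  # Claim, dispatch and consume jobs until stopped.
    finally:
        cond.stop()
        cond.wait()
        cond.close()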
219
taskflow/conductors/backends/impl_blocking.py
Normal file
@@ -0,0 +1,219 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Licensed under the Apache License, Version 2.0 (the "License"); you may
|
||||
# not use this file except in compliance with the License. You may obtain
|
||||
# a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
|
||||
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
|
||||
# License for the specific language governing permissions and limitations
|
||||
# under the License.
|
||||
|
||||
import threading
|
||||
|
||||
try:
|
||||
from contextlib import ExitStack # noqa
|
||||
except ImportError:
|
||||
from contextlib2 import ExitStack # noqa
|
||||
|
||||
from debtcollector import removals
|
||||
from oslo_utils import excutils
|
||||
import six
|
||||
|
||||
from taskflow.conductors import base
|
||||
from taskflow import exceptions as excp
|
||||
from taskflow.listeners import logging as logging_listener
|
||||
from taskflow import logging
|
||||
from taskflow.types import timing as tt
|
||||
from taskflow.utils import async_utils
|
||||
|
||||
LOG = logging.getLogger(__name__)
|
||||
WAIT_TIMEOUT = 0.5
|
||||
NO_CONSUME_EXCEPTIONS = tuple([
|
||||
excp.ExecutionFailure,
|
||||
excp.StorageFailure,
|
||||
])
|
||||
|
||||
|
||||
class BlockingConductor(base.Conductor):
|
||||
"""A conductor that runs jobs in its own dispatching loop.
|
||||
|
||||
This conductor iterates over jobs in the provided jobboard (waiting for
|
||||
the given timeout if no jobs exist) and attempts to claim them, work on
|
||||
those jobs in its local thread (blocking further work from being claimed
|
||||
and consumed) and then consume those work units after completetion. This
|
||||
process will repeat until the conductor has been stopped or other critical
|
||||
error occurs.
|
||||
|
||||
NOTE(harlowja): consumption occurs even if a engine fails to run due to
|
||||
a task failure. This is only skipped when an execution failure or
|
||||
a storage failure occurs which are *usually* correctable by re-running on
|
||||
a different conductor (storage failures and execution failures may be
|
||||
transient issues that can be worked around by later execution). If a job
|
||||
after completing can not be consumed or abandoned the conductor relies
|
||||
upon the jobboard capabilities to automatically abandon these jobs.
|
||||
"""
|
||||
|
||||
START_FINISH_EVENTS_EMITTED = tuple([
|
||||
'compilation', 'preparation',
|
||||
'validation', 'running',
|
||||
])
|
||||
"""Events will be emitted for the start and finish of each engine
|
||||
activity defined above, the actual event name that can be registered
|
||||
to subscribe to will be ``${event}_start`` and ``${event}_end`` where
|
||||
the ``${event}`` in this pseudo-variable will be one of these events.
|
||||
"""
|
||||
|
||||
def __init__(self, name, jobboard,
|
||||
persistence=None, engine=None,
|
||||
engine_options=None, wait_timeout=None):
|
||||
super(BlockingConductor, self).__init__(
|
||||
name, jobboard, persistence=persistence,
|
||||
engine=engine, engine_options=engine_options)
|
||||
if wait_timeout is None:
|
||||
wait_timeout = WAIT_TIMEOUT
|
||||
if isinstance(wait_timeout, (int, float) + six.string_types):
|
||||
self._wait_timeout = tt.Timeout(float(wait_timeout))
|
||||
elif isinstance(wait_timeout, tt.Timeout):
|
||||
self._wait_timeout = wait_timeout
|
||||
else:
|
||||
raise ValueError("Invalid timeout literal: %s" % (wait_timeout))
|
||||
self._dead = threading.Event()
|
||||
|
||||
@removals.removed_kwarg('timeout', version="0.8", removal_version="2.0")
|
||||
def stop(self, timeout=None):
|
||||
"""Requests the conductor to stop dispatching.
|
||||
|
||||
This method can be used to request that a conductor stop its
|
||||
consumption & dispatching loop.
|
||||
|
||||
The method returns immediately regardless of whether the conductor has
|
||||
been stopped.
|
||||
|
||||
.. deprecated:: 0.8
|
||||
|
||||
The ``timeout`` parameter is **deprecated** and is present for
|
||||
backward compatibility **only**. In order to wait for the
|
||||
conductor to gracefully shut down, :py:meth:`wait` should be used
|
||||
instead.
|
||||
"""
|
||||
self._wait_timeout.interrupt()
|
||||
|
||||
@property
|
||||
def dispatching(self):
|
||||
return not self._dead.is_set()
|
||||
|
||||
def _listeners_from_job(self, job, engine):
|
||||
listeners = super(BlockingConductor, self)._listeners_from_job(job,
|
||||
engine)
|
||||
listeners.append(logging_listener.LoggingListener(engine, log=LOG))
|
||||
return listeners
|
||||
|
||||
def _dispatch_job(self, job):
|
||||
engine = self._engine_from_job(job)
|
||||
listeners = self._listeners_from_job(job, engine)
|
||||
with ExitStack() as stack:
|
||||
for listener in listeners:
|
||||
stack.enter_context(listener)
|
||||
LOG.debug("Dispatching engine for job '%s'", job)
|
||||
consume = True
|
||||
try:
|
||||
for stage_func, event_name in [(engine.compile, 'compilation'),
|
||||
(engine.prepare, 'preparation'),
|
||||
(engine.validate, 'validation'),
|
||||
(engine.run, 'running')]:
|
||||
self._notifier.notify("%s_start" % event_name, {
|
||||
'job': job,
|
||||
'engine': engine,
|
||||
'conductor': self,
|
||||
})
|
||||
stage_func()
|
||||
self._notifier.notify("%s_end" % event_name, {
|
||||
'job': job,
|
||||
'engine': engine,
|
||||
'conductor': self,
|
||||
})
|
||||
except excp.WrappedFailure as e:
|
||||
if all((f.check(*NO_CONSUME_EXCEPTIONS) for f in e)):
|
||||
consume = False
|
||||
if LOG.isEnabledFor(logging.WARNING):
|
||||
if consume:
|
||||
LOG.warn("Job execution failed (consumption being"
|
||||
" skipped): %s [%s failures]", job, len(e))
|
||||
else:
|
||||
LOG.warn("Job execution failed (consumption"
|
||||
" proceeding): %s [%s failures]", job, len(e))
|
||||
# Show the failure/s + traceback (if possible)...
|
||||
for i, f in enumerate(e):
|
||||
LOG.warn("%s. %s", i + 1, f.pformat(traceback=True))
|
||||
except NO_CONSUME_EXCEPTIONS:
|
||||
LOG.warn("Job execution failed (consumption being"
|
||||
" skipped): %s", job, exc_info=True)
|
||||
consume = False
|
||||
except Exception:
|
||||
LOG.warn("Job execution failed (consumption proceeding): %s",
|
||||
job, exc_info=True)
|
||||
else:
|
||||
LOG.info("Job completed successfully: %s", job)
|
||||
return async_utils.make_completed_future(consume)
|
||||
|
||||
def run(self):
|
||||
self._dead.clear()
|
||||
try:
|
||||
while True:
|
||||
if self._wait_timeout.is_stopped():
|
||||
break
|
||||
dispatched = 0
|
||||
for job in self._jobboard.iterjobs():
|
||||
if self._wait_timeout.is_stopped():
|
||||
break
|
||||
LOG.debug("Trying to claim job: %s", job)
|
||||
try:
|
||||
self._jobboard.claim(job, self._name)
|
||||
except (excp.UnclaimableJob, excp.NotFound):
|
||||
LOG.debug("Job already claimed or consumed: %s", job)
|
||||
continue
|
||||
consume = False
|
||||
try:
|
||||
f = self._dispatch_job(job)
|
||||
except KeyboardInterrupt:
|
||||
with excutils.save_and_reraise_exception():
|
||||
LOG.warn("Job dispatching interrupted: %s", job)
|
||||
except Exception:
|
||||
LOG.warn("Job dispatching failed: %s", job,
|
||||
exc_info=True)
|
||||
else:
|
||||
dispatched += 1
|
||||
consume = f.result()
|
||||
try:
|
||||
if consume:
|
||||
self._jobboard.consume(job, self._name)
|
||||
else:
|
||||
self._jobboard.abandon(job, self._name)
|
||||
except (excp.JobFailure, excp.NotFound):
|
||||
if consume:
|
||||
LOG.warn("Failed job consumption: %s", job,
|
||||
exc_info=True)
|
||||
else:
|
||||
LOG.warn("Failed job abandonment: %s", job,
|
||||
exc_info=True)
|
||||
if dispatched == 0 and not self._wait_timeout.is_stopped():
|
||||
self._wait_timeout.wait()
|
||||
finally:
|
||||
self._dead.set()
|
||||
|
||||
def wait(self, timeout=None):
|
||||
"""Waits for the conductor to gracefully exit.
|
||||
|
||||
This method waits for the conductor to gracefully exit. An optional
|
||||
timeout can be provided, which will cause the method to return
|
||||
within the specified timeout. If the timeout is reached, the returned
|
||||
value will be False.
|
||||
|
||||
:param timeout: Maximum number of seconds that the :meth:`wait` method
|
||||
should block for.
|
||||
"""
|
||||
return self._dead.wait(timeout)
|
||||
@@ -15,16 +15,17 @@
|
||||
import abc
|
||||
import threading
|
||||
|
||||
import fasteners
|
||||
import six
|
||||
|
||||
from taskflow import engines
|
||||
from taskflow import exceptions as excp
|
||||
from taskflow.utils import lock_utils
|
||||
from taskflow.types import notifier
|
||||
|
||||
|
||||
@six.add_metaclass(abc.ABCMeta)
|
||||
class Conductor(object):
|
||||
"""Conductors conduct jobs & assist in associated runtime interactions.
|
||||
"""Base for all conductor implementations.
|
||||
|
||||
Conductors act as entities which extract jobs from a jobboard, assign
|
||||
there work to some engine (using some desired configuration) and then wait
|
||||
@@ -34,8 +35,8 @@ class Conductor(object):
|
||||
period of time will finish up the prior failed conductors work.
|
||||
"""
|
||||
|
||||
def __init__(self, name, jobboard, persistence,
|
||||
engine=None, engine_options=None):
|
||||
def __init__(self, name, jobboard,
|
||||
persistence=None, engine=None, engine_options=None):
|
||||
self._name = name
|
||||
self._jobboard = jobboard
|
||||
self._engine = engine
|
||||
@@ -45,6 +46,18 @@ class Conductor(object):
|
||||
self._engine_options = engine_options.copy()
|
||||
self._persistence = persistence
|
||||
self._lock = threading.RLock()
|
||||
self._notifier = notifier.Notifier()
|
||||
|
||||
@property
|
||||
def notifier(self):
|
||||
"""The conductor actions (or other state changes) notifier.
|
||||
|
||||
NOTE(harlowja): different conductor implementations may emit
|
||||
different events + event details at different times, so refer to your
|
||||
conductor documentation to know exactly what can and what can not be
|
||||
subscribed to.
|
||||
"""
|
||||
return self._notifier
|
||||
|
||||
def _flow_detail_from_job(self, job):
|
||||
"""Extracts a flow detail from a job (via some manner).
|
||||
@@ -88,20 +101,36 @@ class Conductor(object):
|
||||
store = dict(job.details["store"])
|
||||
else:
|
||||
store = {}
|
||||
return engines.load_from_detail(flow_detail, store=store,
|
||||
engine=self._engine,
|
||||
backend=self._persistence,
|
||||
**self._engine_options)
|
||||
engine = engines.load_from_detail(flow_detail, store=store,
|
||||
engine=self._engine,
|
||||
backend=self._persistence,
|
||||
**self._engine_options)
|
||||
return engine
|
||||
|
||||
@lock_utils.locked
|
||||
def _listeners_from_job(self, job, engine):
|
||||
"""Returns a list of listeners to be attached to an engine.
|
||||
|
||||
This method should be overridden in order to attach listeners to
|
||||
engines. It will be called once for each job, and the list returned
|
||||
listeners will be added to the engine for this job.
|
||||
|
||||
:param job: A job instance that is about to be run in an engine.
|
||||
:param engine: The engine that listeners will be attached to.
|
||||
:returns: a list of (unregistered) listener instances.
|
||||
"""
|
||||
# TODO(dkrause): Create a standard way to pass listeners or
|
||||
# listener factories over the jobboard
|
||||
return []
|
||||
|
||||
@fasteners.locked
|
||||
def connect(self):
|
||||
"""Ensures the jobboard is connected (noop if it is already)."""
|
||||
if not self._jobboard.connected:
|
||||
self._jobboard.connect()
|
||||
|
||||
@lock_utils.locked
|
||||
@fasteners.locked
|
||||
def close(self):
|
||||
"""Closes the jobboard, disallowing further use."""
|
||||
"""Closes the contained jobboard, disallowing further use."""
|
||||
self._jobboard.close()
|
||||
|
||||
@abc.abstractmethod
|
||||
|
||||
@@ -1,5 +1,7 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright (C) 2015 Yahoo! Inc. All Rights Reserved.
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License"); you may
|
||||
# not use this file except in compliance with the License. You may obtain
|
||||
# a copy of the License at
|
||||
@@ -12,163 +14,18 @@
|
||||
# License for the specific language governing permissions and limitations
|
||||
# under the License.
|
||||
|
||||
import six
|
||||
from debtcollector import moves
|
||||
from debtcollector import removals
|
||||
|
||||
from taskflow.conductors import base
|
||||
from taskflow import exceptions as excp
|
||||
from taskflow.listeners import logging as logging_listener
|
||||
from taskflow import logging
|
||||
from taskflow.types import timing as tt
|
||||
from taskflow.utils import async_utils
|
||||
from taskflow.utils import deprecation
|
||||
from taskflow.utils import threading_utils
|
||||
from taskflow.conductors.backends import impl_blocking
|
||||
|
||||
LOG = logging.getLogger(__name__)
|
||||
WAIT_TIMEOUT = 0.5
|
||||
NO_CONSUME_EXCEPTIONS = tuple([
|
||||
excp.ExecutionFailure,
|
||||
excp.StorageFailure,
|
||||
])
|
||||
# TODO(harlowja): remove this module soon...
|
||||
removals.removed_module(__name__,
|
||||
replacement="the conductor entrypoints",
|
||||
version="0.8", removal_version="2.0",
|
||||
stacklevel=4)
|
||||
|
||||
|
||||
class SingleThreadedConductor(base.Conductor):
|
||||
"""A conductor that runs jobs in its own dispatching loop.
|
||||
|
||||
This conductor iterates over jobs in the provided jobboard (waiting for
|
||||
the given timeout if no jobs exist) and attempts to claim them, work on
|
||||
those jobs in its local thread (blocking further work from being claimed
|
||||
and consumed) and then consume those work units after completetion. This
|
||||
process will repeat until the conductor has been stopped or other critical
|
||||
error occurs.
|
||||
|
||||
NOTE(harlowja): consumption occurs even if a engine fails to run due to
|
||||
a task failure. This is only skipped when an execution failure or
|
||||
a storage failure occurs which are *usually* correctable by re-running on
|
||||
a different conductor (storage failures and execution failures may be
|
||||
transient issues that can be worked around by later execution). If a job
|
||||
after completing can not be consumed or abandoned the conductor relies
|
||||
upon the jobboard capabilities to automatically abandon these jobs.
|
||||
"""
|
||||
|
||||
def __init__(self, name, jobboard, persistence,
|
||||
engine=None, engine_options=None, wait_timeout=None):
|
||||
super(SingleThreadedConductor, self).__init__(
|
||||
name, jobboard, persistence,
|
||||
engine=engine, engine_options=engine_options)
|
||||
if wait_timeout is None:
|
||||
wait_timeout = WAIT_TIMEOUT
|
||||
if isinstance(wait_timeout, (int, float) + six.string_types):
|
||||
self._wait_timeout = tt.Timeout(float(wait_timeout))
|
||||
elif isinstance(wait_timeout, tt.Timeout):
|
||||
self._wait_timeout = wait_timeout
|
||||
else:
|
||||
raise ValueError("Invalid timeout literal: %s" % (wait_timeout))
|
||||
self._dead = threading_utils.Event()
|
||||
|
||||
@deprecation.removed_kwarg('timeout',
|
||||
version="0.8", removal_version="?")
|
||||
def stop(self, timeout=None):
|
||||
"""Requests the conductor to stop dispatching.
|
||||
|
||||
This method can be used to request that a conductor stop its
|
||||
consumption & dispatching loop.
|
||||
|
||||
The method returns immediately regardless of whether the conductor has
|
||||
been stopped.
|
||||
|
||||
:param timeout: This parameter is **deprecated** and is present for
|
||||
backward compatibility **only**. In order to wait for
|
||||
the conductor to gracefully shut down, :meth:`wait`
|
||||
should be used instead.
|
||||
"""
|
||||
self._wait_timeout.interrupt()
|
||||
|
||||
@property
|
||||
def dispatching(self):
|
||||
return not self._dead.is_set()
|
||||
|
||||
def _dispatch_job(self, job):
|
||||
engine = self._engine_from_job(job)
|
||||
consume = True
|
||||
with logging_listener.LoggingListener(engine, log=LOG):
|
||||
LOG.debug("Dispatching engine %s for job: %s", engine, job)
|
||||
try:
|
||||
engine.run()
|
||||
except excp.WrappedFailure as e:
|
||||
if all((f.check(*NO_CONSUME_EXCEPTIONS) for f in e)):
|
||||
consume = False
|
||||
if LOG.isEnabledFor(logging.WARNING):
|
||||
if consume:
|
||||
LOG.warn("Job execution failed (consumption being"
|
||||
" skipped): %s [%s failures]", job, len(e))
|
||||
else:
|
||||
LOG.warn("Job execution failed (consumption"
|
||||
" proceeding): %s [%s failures]", job, len(e))
|
||||
# Show the failure/s + traceback (if possible)...
|
||||
for i, f in enumerate(e):
|
||||
LOG.warn("%s. %s", i + 1, f.pformat(traceback=True))
|
||||
except NO_CONSUME_EXCEPTIONS:
|
||||
LOG.warn("Job execution failed (consumption being"
|
||||
" skipped): %s", job, exc_info=True)
|
||||
consume = False
|
||||
except Exception:
|
||||
LOG.warn("Job execution failed (consumption proceeding): %s",
|
||||
job, exc_info=True)
|
||||
else:
|
||||
LOG.info("Job completed successfully: %s", job)
|
||||
return async_utils.make_completed_future(consume)
|
||||
|
||||
def run(self):
|
||||
self._dead.clear()
|
||||
try:
|
||||
while True:
|
||||
if self._wait_timeout.is_stopped():
|
||||
break
|
||||
dispatched = 0
|
||||
for job in self._jobboard.iterjobs():
|
||||
if self._wait_timeout.is_stopped():
|
||||
break
|
||||
LOG.debug("Trying to claim job: %s", job)
|
||||
try:
|
||||
self._jobboard.claim(job, self._name)
|
||||
except (excp.UnclaimableJob, excp.NotFound):
|
||||
LOG.debug("Job already claimed or consumed: %s", job)
|
||||
continue
|
||||
consume = False
|
||||
try:
|
||||
f = self._dispatch_job(job)
|
||||
except Exception:
|
||||
LOG.warn("Job dispatching failed: %s", job,
|
||||
exc_info=True)
|
||||
else:
|
||||
dispatched += 1
|
||||
consume = f.result()
|
||||
try:
|
||||
if consume:
|
||||
self._jobboard.consume(job, self._name)
|
||||
else:
|
||||
self._jobboard.abandon(job, self._name)
|
||||
except (excp.JobFailure, excp.NotFound):
|
||||
if consume:
|
||||
LOG.warn("Failed job consumption: %s", job,
|
||||
exc_info=True)
|
||||
else:
|
||||
LOG.warn("Failed job abandonment: %s", job,
|
||||
exc_info=True)
|
||||
if dispatched == 0 and not self._wait_timeout.is_stopped():
|
||||
self._wait_timeout.wait()
|
||||
finally:
|
||||
self._dead.set()
|
||||
|
||||
def wait(self, timeout=None):
|
||||
"""Waits for the conductor to gracefully exit.
|
||||
|
||||
This method waits for the conductor to gracefully exit. An optional
|
||||
timeout can be provided, which will cause the method to return
|
||||
within the specified timeout. If the timeout is reached, the returned
|
||||
value will be False.
|
||||
|
||||
:param timeout: Maximum number of seconds that the :meth:`wait` method
|
||||
should block for.
|
||||
"""
|
||||
return self._dead.wait(timeout)
|
||||
# TODO(harlowja): remove this proxy/legacy class soon...
|
||||
SingleThreadedConductor = moves.moved_class(
|
||||
impl_blocking.BlockingConductor, 'SingleThreadedConductor',
|
||||
__name__, version="0.8", removal_version="?")
|
||||
|
||||
@@ -14,8 +14,16 @@
|
||||
# License for the specific language governing permissions and limitations
|
||||
# under the License.
|
||||
|
||||
from oslo_utils import eventletutils as _eventletutils
|
||||
|
||||
# promote helpers to this module namespace
|
||||
# Give a nice warning that if eventlet is being used these modules
|
||||
# are highly recommended to be patched (or otherwise bad things could
|
||||
# happen).
|
||||
_eventletutils.warn_eventlet_not_patched(
|
||||
expected_patched_modules=['time', 'thread'])
|
||||
|
||||
|
||||
# Promote helpers to this module namespace (for easy access).
|
||||
from taskflow.engines.helpers import flow_from_detail # noqa
|
||||
from taskflow.engines.helpers import load # noqa
|
||||
from taskflow.engines.helpers import load_from_detail # noqa
|
||||
|
||||
@@ -32,11 +32,6 @@ SAVE_RESULT_STATES = (states.SUCCESS, states.FAILURE)
|
||||
class Action(object):
|
||||
"""An action that handles executing, state changes, ... of atoms."""
|
||||
|
||||
def __init__(self, storage, notifier, walker_factory):
|
||||
def __init__(self, storage, notifier):
|
||||
self._storage = storage
|
||||
self._notifier = notifier
|
||||
self._walker_factory = walker_factory
|
||||
|
||||
@abc.abstractmethod
|
||||
def handles(self, atom):
|
||||
"""Checks if this action handles the provided atom."""
|
||||
|
||||
@@ -14,13 +14,14 @@
|
||||
# License for the specific language governing permissions and limitations
|
||||
# under the License.
|
||||
|
||||
import futurist
|
||||
|
||||
from taskflow.engines.action_engine.actions import base
|
||||
from taskflow.engines.action_engine import executor as ex
|
||||
from taskflow import logging
|
||||
from taskflow import retry as retry_atom
|
||||
from taskflow import states
|
||||
from taskflow.types import failure
|
||||
from taskflow.types import futures
|
||||
|
||||
LOG = logging.getLogger(__name__)
|
||||
|
||||
@@ -44,20 +45,14 @@ def _revert_retry(retry, arguments):
|
||||
class RetryAction(base.Action):
|
||||
"""An action that handles executing, state changes, ... of retry atoms."""
|
||||
|
||||
def __init__(self, storage, notifier, walker_factory):
|
||||
super(RetryAction, self).__init__(storage, notifier, walker_factory)
|
||||
self._executor = futures.SynchronousExecutor()
|
||||
|
||||
@staticmethod
|
||||
def handles(atom):
|
||||
return isinstance(atom, retry_atom.Retry)
|
||||
def __init__(self, storage, notifier):
|
||||
super(RetryAction, self).__init__(storage, notifier)
|
||||
self._executor = futurist.SynchronousExecutor()
|
||||
|
||||
def _get_retry_args(self, retry, addons=None):
|
||||
scope_walker = self._walker_factory(retry)
|
||||
arguments = self._storage.fetch_mapped_args(
|
||||
retry.rebind,
|
||||
atom_name=retry.name,
|
||||
scope_walker=scope_walker,
|
||||
optional_args=retry.optional
|
||||
)
|
||||
history = self._storage.get_retry_history(retry.name)
|
||||
|
||||
@@ -28,14 +28,10 @@ LOG = logging.getLogger(__name__)
|
||||
class TaskAction(base.Action):
|
||||
"""An action that handles scheduling, state changes, ... of task atoms."""
|
||||
|
||||
def __init__(self, storage, notifier, walker_factory, task_executor):
|
||||
super(TaskAction, self).__init__(storage, notifier, walker_factory)
|
||||
def __init__(self, storage, notifier, task_executor):
|
||||
super(TaskAction, self).__init__(storage, notifier)
|
||||
self._task_executor = task_executor
|
||||
|
||||
@staticmethod
|
||||
def handles(atom):
|
||||
return isinstance(atom, task_atom.BaseTask)
|
||||
|
||||
def _is_identity_transition(self, old_state, state, task, progress):
|
||||
if state in base.SAVE_RESULT_STATES:
|
||||
# saving result is never identity transition
|
||||
@@ -100,11 +96,9 @@ class TaskAction(base.Action):
|
||||
|
||||
def schedule_execution(self, task):
|
||||
self.change_state(task, states.RUNNING, progress=0.0)
|
||||
scope_walker = self._walker_factory(task)
|
||||
arguments = self._storage.fetch_mapped_args(
|
||||
task.rebind,
|
||||
atom_name=task.name,
|
||||
scope_walker=scope_walker,
|
||||
optional_args=task.optional
|
||||
)
|
||||
if task.notifier.can_be_registered(task_atom.EVENT_UPDATE_PROGRESS):
|
||||
@@ -126,11 +120,9 @@ class TaskAction(base.Action):
|
||||
|
||||
def schedule_reversion(self, task):
|
||||
self.change_state(task, states.REVERTING, progress=0.0)
|
||||
scope_walker = self._walker_factory(task)
|
||||
arguments = self._storage.fetch_mapped_args(
|
||||
task.rebind,
|
||||
atom_name=task.name,
|
||||
scope_walker=scope_walker,
|
||||
optional_args=task.optional
|
||||
)
|
||||
task_uuid = self._storage.get_atom_uuid(task.name)
|
||||
|
||||
@@ -14,6 +14,8 @@
|
||||
# License for the specific language governing permissions and limitations
|
||||
# under the License.
|
||||
|
||||
import itertools
|
||||
|
||||
from networkx.algorithms import traversal
|
||||
import six
|
||||
|
||||
@@ -21,6 +23,60 @@ from taskflow import retry as retry_atom
|
||||
from taskflow import states as st
|
||||
|
||||
|
||||
class IgnoreDecider(object):
|
||||
"""Checks any provided edge-deciders and determines if ok to run."""
|
||||
|
||||
def __init__(self, atom, edge_deciders):
|
||||
self._atom = atom
|
||||
self._edge_deciders = edge_deciders
|
||||
|
||||
def check(self, runtime):
|
||||
"""Returns bool of whether this decider should allow running."""
|
||||
results = {}
|
||||
for name in six.iterkeys(self._edge_deciders):
|
||||
results[name] = runtime.storage.get(name)
|
||||
for local_decider in six.itervalues(self._edge_deciders):
|
||||
if not local_decider(history=results):
|
||||
return False
|
||||
return True
|
||||
|
||||
def affect(self, runtime):
|
||||
"""If the :py:func:`~.check` returns false, affects associated atoms.
|
||||
|
||||
This will alter the associated atom + successor atoms by setting there
|
||||
state to ``IGNORE`` so that they are ignored in future runtime
|
||||
activities.
|
||||
"""
|
||||
successors_iter = runtime.analyzer.iterate_subgraph(self._atom)
|
||||
runtime.reset_nodes(itertools.chain([self._atom], successors_iter),
|
||||
state=st.IGNORE, intention=st.IGNORE)
|
||||
|
||||
def check_and_affect(self, runtime):
|
||||
"""Handles :py:func:`~.check` + :py:func:`~.affect` in right order."""
|
||||
proceed = self.check(runtime)
|
||||
if not proceed:
|
||||
self.affect(runtime)
|
||||
return proceed
|
||||
|
||||
|
||||
class NoOpDecider(object):
|
||||
"""No-op decider that says it is always ok to run & has no effect(s)."""
|
||||
|
||||
def check(self, runtime):
|
||||
"""Always good to go."""
|
||||
return True
|
||||
|
||||
def affect(self, runtime):
|
||||
"""Does nothing."""
|
||||
|
||||
def check_and_affect(self, runtime):
|
||||
"""Handles :py:func:`~.check` + :py:func:`~.affect` in right order.
|
||||
|
||||
Does nothing.
|
||||
"""
|
||||
return self.check(runtime)
|
||||
|
||||
|
||||
class Analyzer(object):
|
||||
"""Analyzes a compilation and aids in execution processes.
|
||||
|
||||
@@ -31,21 +87,25 @@ class Analyzer(object):
|
||||
the rest of the runtime system.
|
||||
"""
|
||||
|
||||
def __init__(self, compilation, storage):
|
||||
self._storage = storage
|
||||
self._graph = compilation.execution_graph
|
||||
def __init__(self, runtime):
|
||||
self._storage = runtime.storage
|
||||
self._execution_graph = runtime.compilation.execution_graph
|
||||
self._check_atom_transition = runtime.check_atom_transition
|
||||
self._fetch_edge_deciders = runtime.fetch_edge_deciders
|
||||
|
||||
def get_next_nodes(self, node=None):
|
||||
"""Get next nodes to run (originating from node or all nodes)."""
|
||||
if node is None:
|
||||
execute = self.browse_nodes_for_execute()
|
||||
revert = self.browse_nodes_for_revert()
|
||||
return execute + revert
|
||||
|
||||
state = self.get_state(node)
|
||||
intention = self._storage.get_atom_intention(node.name)
|
||||
if state == st.SUCCESS:
|
||||
if intention == st.REVERT:
|
||||
return [node]
|
||||
return [
|
||||
(node, NoOpDecider()),
|
||||
]
|
||||
elif intention == st.EXECUTE:
|
||||
return self.browse_nodes_for_execute(node)
|
||||
else:
|
||||
@@ -60,74 +120,90 @@ class Analyzer(object):
|
||||
def browse_nodes_for_execute(self, node=None):
|
||||
"""Browse next nodes to execute.
|
||||
|
||||
This returns a collection of nodes that are ready to be executed, if
|
||||
given a specific node it will only examine the successors of that node,
|
||||
otherwise it will examine the whole graph.
|
||||
This returns a collection of nodes that *may* be ready to be
|
||||
executed, if given a specific node it will only examine the successors
|
||||
of that node, otherwise it will examine the whole graph.
|
||||
"""
|
||||
if node:
|
||||
nodes = self._graph.successors(node)
|
||||
if node is not None:
|
||||
nodes = self._execution_graph.successors(node)
|
||||
else:
|
||||
nodes = self._graph.nodes_iter()
|
||||
|
||||
available_nodes = []
|
||||
nodes = self._execution_graph.nodes_iter()
|
||||
ready_nodes = []
|
||||
for node in nodes:
|
||||
if self._is_ready_for_execute(node):
|
||||
available_nodes.append(node)
|
||||
return available_nodes
|
||||
is_ready, late_decider = self._get_maybe_ready_for_execute(node)
|
||||
if is_ready:
|
||||
ready_nodes.append((node, late_decider))
|
||||
return ready_nodes
|
||||
|
||||
def browse_nodes_for_revert(self, node=None):
|
||||
"""Browse next nodes to revert.
|
||||
|
||||
This returns a collection of nodes that are ready to be be reverted, if
|
||||
given a specific node it will only examine the predecessors of that
|
||||
node, otherwise it will examine the whole graph.
|
||||
This returns a collection of nodes that *may* be ready to be be
|
||||
reverted, if given a specific node it will only examine the
|
||||
predecessors of that node, otherwise it will examine the whole
|
||||
graph.
|
||||
"""
|
||||
if node:
|
||||
nodes = self._graph.predecessors(node)
|
||||
if node is not None:
|
||||
nodes = self._execution_graph.predecessors(node)
|
||||
else:
|
||||
nodes = self._graph.nodes_iter()
|
||||
|
||||
available_nodes = []
|
||||
nodes = self._execution_graph.nodes_iter()
|
||||
ready_nodes = []
|
||||
for node in nodes:
|
||||
if self._is_ready_for_revert(node):
|
||||
available_nodes.append(node)
|
||||
return available_nodes
|
||||
is_ready, late_decider = self._get_maybe_ready_for_revert(node)
|
||||
if is_ready:
|
||||
ready_nodes.append((node, late_decider))
|
||||
return ready_nodes
|
||||
|
||||
def _is_ready_for_execute(self, task):
|
||||
"""Checks if task is ready to be executed."""
|
||||
state = self.get_state(task)
|
||||
intention = self._storage.get_atom_intention(task.name)
|
||||
transition = st.check_task_transition(state, st.RUNNING)
|
||||
def _get_maybe_ready_for_execute(self, atom):
|
||||
"""Returns if an atom is *likely* ready to be executed."""
|
||||
|
||||
state = self.get_state(atom)
|
||||
intention = self._storage.get_atom_intention(atom.name)
|
||||
transition = self._check_atom_transition(atom, state, st.RUNNING)
|
||||
if not transition or intention != st.EXECUTE:
|
||||
return False
|
||||
return (False, None)
|
||||
|
||||
task_names = []
|
||||
for prev_task in self._graph.predecessors(task):
|
||||
task_names.append(prev_task.name)
|
||||
predecessor_names = []
|
||||
for previous_atom in self._execution_graph.predecessors(atom):
|
||||
predecessor_names.append(previous_atom.name)
|
||||
|
||||
task_states = self._storage.get_atoms_states(task_names)
|
||||
return all(state == st.SUCCESS and intention == st.EXECUTE
|
||||
for state, intention in six.itervalues(task_states))
|
||||
predecessor_states = self._storage.get_atoms_states(predecessor_names)
|
||||
predecessor_states_iter = six.itervalues(predecessor_states)
|
||||
ok_to_run = all(state == st.SUCCESS and intention == st.EXECUTE
|
||||
for state, intention in predecessor_states_iter)
|
||||
|
||||
def _is_ready_for_revert(self, task):
|
||||
"""Checks if task is ready to be reverted."""
|
||||
state = self.get_state(task)
|
||||
intention = self._storage.get_atom_intention(task.name)
|
||||
transition = st.check_task_transition(state, st.REVERTING)
|
||||
if not ok_to_run:
|
||||
return (False, None)
|
||||
else:
|
||||
edge_deciders = self._fetch_edge_deciders(atom)
|
||||
return (True, IgnoreDecider(atom, edge_deciders))
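
The readiness test above reduces to "every predecessor finished successfully
and still intends to execute". A small, self-contained illustration of that
predicate (plain dicts stand in for the storage backend)::

    def predecessors_ok(predecessor_states):
        # Maps atom name -> (state, intention), mirroring the shape that
        # storage.get_atoms_states() returns.
        return all(state == 'SUCCESS' and intention == 'EXECUTE'
                   for state, intention in predecessor_states.values())

    print(predecessors_ok({'a': ('SUCCESS', 'EXECUTE')}))   # True
    print(predecessors_ok({'b': ('REVERTED', 'EXECUTE')}))  # False
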
|
||||
|
||||
def _get_maybe_ready_for_revert(self, atom):
|
||||
"""Returns if an atom is *likely* ready to be reverted."""
|
||||
|
||||
state = self.get_state(atom)
|
||||
intention = self._storage.get_atom_intention(atom.name)
|
||||
transition = self._check_atom_transition(atom, state, st.REVERTING)
|
||||
if not transition or intention not in (st.REVERT, st.RETRY):
|
||||
return False
|
||||
return (False, None)
|
||||
|
||||
task_names = []
|
||||
for prev_task in self._graph.successors(task):
|
||||
task_names.append(prev_task.name)
|
||||
predecessor_names = []
|
||||
for previous_atom in self._execution_graph.successors(atom):
|
||||
predecessor_names.append(previous_atom.name)
|
||||
|
||||
task_states = self._storage.get_atoms_states(task_names)
|
||||
return all(state in (st.PENDING, st.REVERTED)
|
||||
for state, intention in six.itervalues(task_states))
|
||||
predecessor_states = self._storage.get_atoms_states(predecessor_names)
|
||||
predecessor_states_iter = six.itervalues(predecessor_states)
|
||||
ok_to_run = all(state in (st.PENDING, st.REVERTED)
|
||||
for state, intention in predecessor_states_iter)
|
||||
|
||||
def iterate_subgraph(self, retry):
|
||||
"""Iterates a subgraph connected to given retry controller."""
|
||||
for _src, dst in traversal.dfs_edges(self._graph, retry):
|
||||
if not ok_to_run:
|
||||
return (False, None)
|
||||
else:
|
||||
return (True, NoOpDecider())
|
||||
|
||||
def iterate_subgraph(self, atom):
|
||||
"""Iterates a subgraph connected to given atom."""
|
||||
for _src, dst in traversal.dfs_edges(self._execution_graph, atom):
|
||||
yield dst
|
||||
|
||||
def iterate_retries(self, state=None):
|
||||
@@ -135,23 +211,30 @@ class Analyzer(object):
|
||||
|
||||
If no state is provided it will yield back all retry controllers.
|
||||
"""
|
||||
for node in self._graph.nodes_iter():
|
||||
for node in self._execution_graph.nodes_iter():
|
||||
if isinstance(node, retry_atom.Retry):
|
||||
if not state or self.get_state(node) == state:
|
||||
yield node
|
||||
|
||||
def iterate_all_nodes(self):
|
||||
for node in self._graph.nodes_iter():
|
||||
"""Yields back all nodes in the execution graph."""
|
||||
for node in self._execution_graph.nodes_iter():
|
||||
yield node
|
||||
|
||||
def find_atom_retry(self, atom):
|
||||
return self._graph.node[atom].get('retry')
|
||||
"""Returns the retry atom associated to the given atom (or none)."""
|
||||
return self._execution_graph.node[atom].get('retry')
|
||||
|
||||
def is_success(self):
|
||||
for node in self._graph.nodes_iter():
|
||||
if self.get_state(node) != st.SUCCESS:
|
||||
"""Checks if all nodes in the execution graph are in 'happy' state."""
|
||||
for atom in self.iterate_all_nodes():
|
||||
atom_state = self.get_state(atom)
|
||||
if atom_state == st.IGNORE:
|
||||
continue
|
||||
if atom_state != st.SUCCESS:
|
||||
return False
|
||||
return True
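
The ``is_success`` change above makes ``IGNORE`` atoms neutral rather than
counting them against overall success; roughly::

    def is_success(atom_states):
        # IGNORE atoms are skipped; everything else must be SUCCESS.
        return all(state == 'SUCCESS'
                   for state in atom_states if state != 'IGNORE')

    print(is_success(['SUCCESS', 'IGNORE', 'SUCCESS']))  # True
    print(is_success(['SUCCESS', 'REVERTED']))           # False
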
|
||||
|
||||
def get_state(self, node):
|
||||
return self._storage.get_atom_state(node.name)
|
||||
def get_state(self, atom):
|
||||
"""Gets the state of a given atom (from the backend storage unit)."""
|
||||
return self._storage.get_atom_state(atom.name)
|
||||
|
||||
@@ -17,14 +17,14 @@
|
||||
import collections
|
||||
import threading
|
||||
|
||||
import fasteners
|
||||
|
||||
from taskflow import exceptions as exc
|
||||
from taskflow import flow
|
||||
from taskflow import logging
|
||||
from taskflow import retry
|
||||
from taskflow import task
|
||||
from taskflow.types import graph as gr
|
||||
from taskflow.types import tree as tr
|
||||
from taskflow.utils import lock_utils
|
||||
from taskflow.utils import misc
|
||||
|
||||
LOG = logging.getLogger(__name__)
|
||||
@@ -158,13 +158,22 @@ class Linker(object):
|
||||
" decomposed into an empty graph" % (v, u, u))
|
||||
for u in u_g.nodes_iter():
|
||||
for v in v_g.nodes_iter():
|
||||
depends_on = u.provides & v.requires
# This is using the intersection() method vs the &
# operator since the latter doesn't work with frozen
# sets (when used in combination with ordered sets).
#
# If this is not done the following happens...
#
# TypeError: unsupported operand type(s)
# for &: 'frozenset' and 'OrderedSet'
depends_on = u.provides.intersection(v.requires)
if depends_on:
|
||||
edge_attrs = {
|
||||
_EDGE_REASONS: frozenset(depends_on),
|
||||
}
|
||||
_add_update_edges(graph,
|
||||
[u], [v],
|
||||
attr_dict={
|
||||
_EDGE_REASONS: depends_on,
|
||||
})
|
||||
attr_dict=edge_attrs)
|
||||
else:
|
||||
# Connect nodes with no predecessors in v to nodes with no
|
||||
# successors in the *first* non-empty predecessor of v (thus
|
||||
@@ -180,8 +189,84 @@ class Linker(object):
|
||||
priors.append((u, v))
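
The ``intersection()`` vs ``&`` comment earlier in this hunk is easy to
reproduce with any non-set iterable: the operator requires both operands to
be sets, while the method accepts any iterable::

    a = frozenset(['x', 'y'])
    print(a.intersection(['x']))   # works with any iterable
    try:
        a & ['x']                  # raises, just like with an OrderedSet
    except TypeError as e:
        print(e)                   # unsupported operand type(s) for &: ...
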
|
||||
|
||||
|
||||
class _TaskCompiler(object):
|
||||
"""Non-recursive compiler of tasks."""
|
||||
|
||||
@staticmethod
|
||||
def handles(obj):
|
||||
return isinstance(obj, task.BaseTask)
|
||||
|
||||
def compile(self, task, parent=None):
|
||||
graph = gr.DiGraph(name=task.name)
|
||||
graph.add_node(task)
|
||||
node = tr.Node(task)
|
||||
if parent is not None:
|
||||
parent.add(node)
|
||||
return graph, node
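
The task compiler above only wraps one atom in a single-node directed graph
and hangs a matching node on the hierarchy tree. A rough equivalent using
plain ``networkx`` (with a trivial stand-in for ``taskflow.types.tree.Node``,
which is not shown in this diff)::

    import networkx as nx

    class _Node(object):                      # simplified tree node
        def __init__(self, item):
            self.item, self.children = item, []

        def add(self, child):
            self.children.append(child)

    def compile_task(task_name, parent=None):
        graph = nx.DiGraph(name=task_name)    # single node, no edges
        graph.add_node(task_name)
        node = _Node(task_name)
        if parent is not None:
            parent.add(node)
        return graph, node
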
|
||||
|
||||
|
||||
class _FlowCompiler(object):
|
||||
"""Recursive compiler of flows."""
|
||||
|
||||
@staticmethod
|
||||
def handles(obj):
|
||||
return isinstance(obj, flow.Flow)
|
||||
|
||||
def __init__(self, deep_compiler_func, linker):
|
||||
self._deep_compiler_func = deep_compiler_func
|
||||
self._linker = linker
|
||||
|
||||
def _connect_retry(self, retry, graph):
|
||||
graph.add_node(retry)
|
||||
|
||||
# All nodes that have no predecessors should depend on this retry.
|
||||
nodes_to = [n for n in graph.no_predecessors_iter() if n is not retry]
|
||||
if nodes_to:
|
||||
_add_update_edges(graph, [retry], nodes_to,
|
||||
attr_dict=_RETRY_EDGE_DATA)
|
||||
|
||||
# Add association for each node of graph that has no existing retry.
|
||||
for n in graph.nodes_iter():
|
||||
if n is not retry and flow.LINK_RETRY not in graph.node[n]:
|
||||
graph.node[n][flow.LINK_RETRY] = retry
|
||||
|
||||
@staticmethod
|
||||
def _occurence_detector(to_graph, from_graph):
|
||||
return sum(1 for node in from_graph.nodes_iter()
|
||||
if node in to_graph)
|
||||
|
||||
def _decompose_flow(self, flow, parent=None):
|
||||
"""Decomposes a flow into a graph, tree node + decomposed subgraphs."""
|
||||
graph = gr.DiGraph(name=flow.name)
|
||||
node = tr.Node(flow)
|
||||
if parent is not None:
|
||||
parent.add(node)
|
||||
if flow.retry is not None:
|
||||
node.add(tr.Node(flow.retry))
|
||||
decomposed_members = {}
|
||||
for item in flow:
|
||||
subgraph, _subnode = self._deep_compiler_func(item, parent=node)
|
||||
decomposed_members[item] = subgraph
|
||||
if subgraph.number_of_nodes():
|
||||
graph = gr.merge_graphs(
|
||||
graph, subgraph,
|
||||
# We can specialize this to be simpler than the default
|
||||
# algorithm which creates overhead that we don't
|
||||
# need for our purposes...
|
||||
overlap_detector=self._occurence_detector)
|
||||
return graph, node, decomposed_members
|
||||
|
||||
def compile(self, flow, parent=None):
|
||||
graph, node, decomposed_members = self._decompose_flow(flow,
|
||||
parent=parent)
|
||||
self._linker.apply_constraints(graph, flow, decomposed_members)
|
||||
if flow.retry is not None:
|
||||
self._connect_retry(flow.retry, graph)
|
||||
return graph, node
|
||||
|
||||
|
||||
class PatternCompiler(object):
"""Compiles a pattern (or task) into a compilation unit.
"""Compiles a flow pattern (or task) into a compilation unit.

Let's dive into the basic idea for how this works:

@@ -189,9 +274,10 @@ class PatternCompiler(object):
this object could be a task, or a flow (one of the supported patterns),
the end-goal is to produce a :py:class:`.Compilation` object as the result
with the needed components. If this is not possible a
:py:class:`~.taskflow.exceptions.CompilationFailure` will be raised (or
in the case where an unknown type is being requested to compile
a ``TypeError`` will be raised).
:py:class:`~.taskflow.exceptions.CompilationFailure` will be raised.
In the case where an **unknown** type is being requested to compile
a ``TypeError`` will be raised and when a duplicate object (one that
has **already** been compiled) is encountered a ``ValueError`` is raised.

The complexity of this comes into play when the 'root' is a flow that
contains itself other nested flows (and so-on); to compile this object and
@@ -281,98 +367,40 @@ class PatternCompiler(object):
|
||||
self._freeze = freeze
|
||||
self._lock = threading.Lock()
|
||||
self._compilation = None
|
||||
self._matchers = [
|
||||
_FlowCompiler(self._compile, self._linker),
|
||||
_TaskCompiler(),
|
||||
]
|
||||
|
||||
def _flatten(self, item, parent):
|
||||
"""Flattens a item (pattern, task) into a graph + tree node."""
|
||||
functor = self._find_flattener(item, parent)
|
||||
self._pre_item_flatten(item)
|
||||
graph, node = functor(item, parent)
|
||||
self._post_item_flatten(item, graph, node)
|
||||
return graph, node
|
||||
|
||||
def _find_flattener(self, item, parent):
|
||||
"""Locates the flattening function to use to flatten the given item."""
|
||||
if isinstance(item, flow.Flow):
|
||||
return self._flatten_flow
|
||||
elif isinstance(item, task.BaseTask):
|
||||
return self._flatten_task
|
||||
elif isinstance(item, retry.Retry):
|
||||
if parent is None:
|
||||
raise TypeError("Retry controller '%s' (%s) must only be used"
|
||||
" as a flow constructor parameter and not as a"
|
||||
" root component" % (item, type(item)))
|
||||
else:
|
||||
raise TypeError("Retry controller '%s' (%s) must only be used"
|
||||
" as a flow constructor parameter and not as a"
|
||||
" flow added component" % (item, type(item)))
|
||||
def _compile(self, item, parent=None):
|
||||
"""Compiles a item (pattern, task) into a graph + tree node."""
|
||||
for m in self._matchers:
|
||||
if m.handles(item):
|
||||
self._pre_item_compile(item)
|
||||
graph, node = m.compile(item, parent=parent)
|
||||
self._post_item_compile(item, graph, node)
|
||||
return graph, node
|
||||
else:
|
||||
raise TypeError("Unknown item '%s' (%s) requested to flatten"
|
||||
raise TypeError("Unknown object '%s' (%s) requested to compile"
|
||||
% (item, type(item)))
|
||||
|
||||
def _connect_retry(self, retry, graph):
|
||||
graph.add_node(retry)
|
||||
|
||||
# All nodes that have no predecessors should depend on this retry.
|
||||
nodes_to = [n for n in graph.no_predecessors_iter() if n is not retry]
|
||||
if nodes_to:
|
||||
_add_update_edges(graph, [retry], nodes_to,
|
||||
attr_dict=_RETRY_EDGE_DATA)
|
||||
|
||||
# Add association for each node of graph that has no existing retry.
|
||||
for n in graph.nodes_iter():
|
||||
if n is not retry and flow.LINK_RETRY not in graph.node[n]:
|
||||
graph.node[n][flow.LINK_RETRY] = retry
|
||||
|
||||
def _flatten_task(self, task, parent):
|
||||
"""Flattens a individual task."""
|
||||
graph = gr.DiGraph(name=task.name)
|
||||
graph.add_node(task)
|
||||
node = tr.Node(task)
|
||||
if parent is not None:
|
||||
parent.add(node)
|
||||
return graph, node
|
||||
|
||||
def _decompose_flow(self, flow, parent):
|
||||
"""Decomposes a flow into a graph, tree node + decomposed subgraphs."""
|
||||
graph = gr.DiGraph(name=flow.name)
|
||||
node = tr.Node(flow)
|
||||
if parent is not None:
|
||||
parent.add(node)
|
||||
if flow.retry is not None:
|
||||
node.add(tr.Node(flow.retry))
|
||||
decomposed_members = {}
|
||||
for item in flow:
|
||||
subgraph, _subnode = self._flatten(item, node)
|
||||
decomposed_members[item] = subgraph
|
||||
if subgraph.number_of_nodes():
|
||||
graph = gr.merge_graphs([graph, subgraph])
|
||||
return graph, node, decomposed_members
|
||||
|
||||
def _flatten_flow(self, flow, parent):
|
||||
"""Flattens a flow."""
|
||||
graph, node, decomposed_members = self._decompose_flow(flow, parent)
|
||||
self._linker.apply_constraints(graph, flow, decomposed_members)
|
||||
if flow.retry is not None:
|
||||
self._connect_retry(flow.retry, graph)
|
||||
return graph, node
|
||||
|
||||
def _pre_item_flatten(self, item):
|
||||
"""Called before a item is flattened; any pre-flattening actions."""
|
||||
def _pre_item_compile(self, item):
|
||||
"""Called before a item is compiled; any pre-compilation actions."""
|
||||
if item in self._history:
|
||||
raise ValueError("Already flattened item '%s' (%s), recursive"
|
||||
" flattening is not supported" % (item,
|
||||
type(item)))
|
||||
raise ValueError("Already compiled item '%s' (%s), duplicate"
|
||||
" and/or recursive compiling is not"
|
||||
" supported" % (item, type(item)))
|
||||
self._history.add(item)
|
||||
|
||||
def _post_item_flatten(self, item, graph, node):
|
||||
"""Called after a item is flattened; doing post-flattening actions."""
|
||||
def _post_item_compile(self, item, graph, node):
|
||||
"""Called after a item is compiled; doing post-compilation actions."""
|
||||
|
||||
def _pre_flatten(self):
|
||||
"""Called before the flattening of the root starts."""
|
||||
def _pre_compile(self):
|
||||
"""Called before the compilation of the root starts."""
|
||||
self._history.clear()
|
||||
|
||||
def _post_flatten(self, graph, node):
|
||||
"""Called after the flattening of the root finishes successfully."""
|
||||
def _post_compile(self, graph, node):
|
||||
"""Called after the compilation of the root finishes successfully."""
|
||||
dup_names = misc.get_duplicate_keys(graph.nodes_iter(),
|
||||
key=lambda node: node.name)
|
||||
if dup_names:
|
||||
@@ -396,13 +424,13 @@ class PatternCompiler(object):
|
||||
# Indent it so that it's slightly offset from the above line.
|
||||
LOG.blather(" %s", line)
|
||||
|
||||
@lock_utils.locked
|
||||
@fasteners.locked
|
||||
def compile(self):
|
||||
"""Compiles the contained item into a compiled equivalent."""
|
||||
if self._compilation is None:
|
||||
self._pre_flatten()
|
||||
graph, node = self._flatten(self._root, None)
|
||||
self._post_flatten(graph, node)
|
||||
self._pre_compile()
|
||||
graph, node = self._compile(self._root, parent=None)
|
||||
self._post_compile(graph, node)
|
||||
if self._freeze:
|
||||
graph.freeze()
|
||||
node.freeze()
|
||||
|
||||
@@ -14,22 +14,102 @@
|
||||
# License for the specific language governing permissions and limitations
|
||||
# under the License.
|
||||
|
||||
import abc
|
||||
import weakref
|
||||
|
||||
from oslo_utils import reflection
|
||||
import six
|
||||
|
||||
from taskflow.engines.action_engine import executor as ex
|
||||
from taskflow import logging
|
||||
from taskflow import retry as retry_atom
|
||||
from taskflow import states as st
|
||||
from taskflow import task as task_atom
|
||||
from taskflow.types import failure
|
||||
|
||||
LOG = logging.getLogger(__name__)
|
||||
|
||||
|
||||
@six.add_metaclass(abc.ABCMeta)
|
||||
class Strategy(object):
|
||||
"""Failure resolution strategy base class."""
|
||||
|
||||
strategy = None
|
||||
|
||||
def __init__(self, runtime):
|
||||
self._runtime = runtime
|
||||
|
||||
@abc.abstractmethod
|
||||
def apply(self):
|
||||
"""Applies some algorithm to resolve some detected failure."""
|
||||
|
||||
def __str__(self):
|
||||
base = reflection.get_class_name(self, fully_qualified=False)
|
||||
if self.strategy is not None:
|
||||
strategy_name = self.strategy.name
|
||||
else:
|
||||
strategy_name = "???"
|
||||
return base + "(strategy=%s)" % (strategy_name)
|
||||
|
||||
|
||||
class RevertAndRetry(Strategy):
|
||||
"""Sets the *associated* subflow for revert to be later retried."""
|
||||
|
||||
strategy = retry_atom.RETRY
|
||||
|
||||
def __init__(self, runtime, retry):
|
||||
super(RevertAndRetry, self).__init__(runtime)
|
||||
self._retry = retry
|
||||
|
||||
def apply(self):
|
||||
tweaked = self._runtime.reset_nodes([self._retry], state=None,
|
||||
intention=st.RETRY)
|
||||
tweaked.extend(self._runtime.reset_subgraph(self._retry, state=None,
|
||||
intention=st.REVERT))
|
||||
return tweaked
|
||||
|
||||
|
||||
class RevertAll(Strategy):
|
||||
"""Sets *all* nodes/atoms to the ``REVERT`` intention."""
|
||||
|
||||
strategy = retry_atom.REVERT_ALL
|
||||
|
||||
def __init__(self, runtime):
|
||||
super(RevertAll, self).__init__(runtime)
|
||||
self._analyzer = runtime.analyzer
|
||||
|
||||
def apply(self):
|
||||
return self._runtime.reset_nodes(self._analyzer.iterate_all_nodes(),
|
||||
state=None, intention=st.REVERT)
|
||||
|
||||
|
||||
class Revert(Strategy):
|
||||
"""Sets atom and *associated* nodes to the ``REVERT`` intention."""
|
||||
|
||||
strategy = retry_atom.REVERT
|
||||
|
||||
def __init__(self, runtime, atom):
|
||||
super(Revert, self).__init__(runtime)
|
||||
self._atom = atom
|
||||
|
||||
def apply(self):
|
||||
tweaked = self._runtime.reset_nodes([self._atom], state=None,
|
||||
intention=st.REVERT)
|
||||
tweaked.extend(self._runtime.reset_subgraph(self._atom, state=None,
|
||||
intention=st.REVERT))
|
||||
return tweaked
|
||||
|
||||
|
||||
class Completer(object):
|
||||
"""Completes atoms using actions to complete them."""
|
||||
|
||||
def __init__(self, runtime):
|
||||
self._runtime = runtime
|
||||
self._runtime = weakref.proxy(runtime)
|
||||
self._analyzer = runtime.analyzer
|
||||
self._retry_action = runtime.retry_action
|
||||
self._storage = runtime.storage
|
||||
self._task_action = runtime.task_action
|
||||
self._undefined_resolver = RevertAll(self._runtime)
|
||||
|
||||
def _complete_task(self, task, event, result):
|
||||
"""Completes the given task, processes task failure."""
|
||||
@@ -75,6 +155,32 @@ class Completer(object):
|
||||
return True
|
||||
return False
|
||||
|
||||
def _determine_resolution(self, atom, failure):
|
||||
"""Determines which resolution strategy to activate/apply."""
|
||||
retry = self._analyzer.find_atom_retry(atom)
|
||||
if retry is not None:
|
||||
# Ask retry controller what to do in case of failure.
|
||||
strategy = self._retry_action.on_failure(retry, atom, failure)
|
||||
if strategy == retry_atom.RETRY:
|
||||
return RevertAndRetry(self._runtime, retry)
|
||||
elif strategy == retry_atom.REVERT:
|
||||
# Ask parent retry and figure out what to do...
|
||||
parent_resolver = self._determine_resolution(retry, failure)
|
||||
# Ok if the parent resolver says something not REVERT, and
|
||||
# it isn't just using the undefined resolver, assume the
|
||||
# parent knows best.
|
||||
if parent_resolver is not self._undefined_resolver:
|
||||
if parent_resolver.strategy != retry_atom.REVERT:
|
||||
return parent_resolver
|
||||
return Revert(self._runtime, retry)
|
||||
elif strategy == retry_atom.REVERT_ALL:
|
||||
return RevertAll(self._runtime)
|
||||
else:
|
||||
raise ValueError("Unknown atom failure resolution"
|
||||
" action/strategy '%s'" % strategy)
|
||||
else:
|
||||
return self._undefined_resolver
|
||||
|
||||
def _process_atom_failure(self, atom, failure):
|
||||
"""Processes atom failure & applies resolution strategies.
|
||||
|
||||
@@ -84,30 +190,15 @@ class Completer(object):
|
||||
then adjust the needed other atoms intentions, and states, ... so that
|
||||
the failure can be worked around.
|
||||
"""
|
||||
retry = self._analyzer.find_atom_retry(atom)
|
||||
if retry is not None:
|
||||
# Ask retry controller what to do in case of failure
|
||||
action = self._retry_action.on_failure(retry, atom, failure)
|
||||
if action == retry_atom.RETRY:
|
||||
# Prepare just the surrounding subflow for revert to be later
|
||||
# retried...
|
||||
self._storage.set_atom_intention(retry.name, st.RETRY)
|
||||
self._runtime.reset_subgraph(retry, state=None,
|
||||
intention=st.REVERT)
|
||||
elif action == retry_atom.REVERT:
|
||||
# Ask parent checkpoint.
|
||||
self._process_atom_failure(retry, failure)
|
||||
elif action == retry_atom.REVERT_ALL:
|
||||
# Prepare all flow for revert
|
||||
self._revert_all()
|
||||
else:
|
||||
raise ValueError("Unknown atom failure resolution"
|
||||
" action '%s'" % action)
|
||||
resolver = self._determine_resolution(atom, failure)
|
||||
LOG.debug("Applying resolver '%s' to resolve failure '%s'"
|
||||
" of atom '%s'", resolver, failure, atom)
|
||||
tweaked = resolver.apply()
|
||||
# Only show the tweaked node list when blather is on, otherwise
|
||||
# just show the amount/count of nodes tweaks...
|
||||
if LOG.isEnabledFor(logging.BLATHER):
|
||||
LOG.blather("Modified/tweaked %s nodes while applying"
|
||||
" resolver '%s'", tweaked, resolver)
|
||||
else:
|
||||
# Prepare all flow for revert
|
||||
self._revert_all()
|
||||
|
||||
def _revert_all(self):
|
||||
"""Attempts to set all nodes to the REVERT intention."""
|
||||
self._runtime.reset_nodes(self._analyzer.iterate_all_nodes(),
|
||||
state=None, intention=st.REVERT)
|
||||
LOG.debug("Modified/tweaked %s nodes while applying"
|
||||
" resolver '%s'", len(tweaked), resolver)
|
||||
|
||||
@@ -19,7 +19,10 @@ import contextlib
|
||||
import threading
|
||||
|
||||
from concurrent import futures
|
||||
import fasteners
|
||||
import networkx as nx
|
||||
from oslo_utils import excutils
|
||||
from oslo_utils import strutils
|
||||
import six
|
||||
|
||||
from taskflow.engines.action_engine import compiler
|
||||
@@ -27,11 +30,14 @@ from taskflow.engines.action_engine import executor
|
||||
from taskflow.engines.action_engine import runtime
|
||||
from taskflow.engines import base
|
||||
from taskflow import exceptions as exc
|
||||
from taskflow import logging
|
||||
from taskflow import states
|
||||
from taskflow import storage
|
||||
from taskflow.types import failure
|
||||
from taskflow.utils import lock_utils
|
||||
from taskflow.utils import misc
|
||||
|
||||
LOG = logging.getLogger(__name__)
|
||||
|
||||
|
||||
@contextlib.contextmanager
|
||||
def _start_stop(executor):
|
||||
@@ -60,6 +66,13 @@ class ActionEngine(base.Engine):
|
||||
"""
|
||||
_compiler_factory = compiler.PatternCompiler
|
||||
|
||||
NO_RERAISING_STATES = frozenset([states.SUSPENDED, states.SUCCESS])
|
||||
"""
|
||||
States that if the engine stops in will **not** cause any potential
|
||||
failures to be reraised. States **not** in this list will cause any
|
||||
failure/s that were captured (if any) to get reraised.
|
||||
"""
|
||||
|
||||
def __init__(self, flow, flow_detail, backend, options):
|
||||
super(ActionEngine, self).__init__(flow, flow_detail, backend, options)
|
||||
self._runtime = None
|
||||
@@ -69,10 +82,18 @@ class ActionEngine(base.Engine):
|
||||
self._state_lock = threading.RLock()
|
||||
self._storage_ensured = False
|
||||
|
||||
def _check(self, name, check_compiled, check_storage_ensured):
|
||||
"""Check (and raise) if the engine has not reached a certain stage."""
|
||||
if check_compiled and not self._compiled:
|
||||
raise exc.InvalidState("Can not %s an engine which"
|
||||
" has not been compiled" % name)
|
||||
if check_storage_ensured and not self._storage_ensured:
|
||||
raise exc.InvalidState("Can not %s an engine"
|
||||
" which has not has its storage"
|
||||
" populated" % name)
|
||||
|
||||
def suspend(self):
|
||||
if not self._compiled:
|
||||
raise exc.InvalidState("Can not suspend an engine"
|
||||
" which has not been compiled")
|
||||
self._check('suspend', True, False)
|
||||
self._change_state(states.SUSPENDING)
|
||||
|
||||
@property
|
||||
@@ -88,8 +109,31 @@ class ActionEngine(base.Engine):
|
||||
else:
|
||||
return None
|
||||
|
||||
@misc.cachedproperty
|
||||
def storage(self):
|
||||
"""The storage unit for this engine.
|
||||
|
||||
NOTE(harlowja): the atom argument lookup strategy will change for
|
||||
this storage unit after
|
||||
:py:func:`~taskflow.engines.base.Engine.compile` has
|
||||
completed (since **only** after compilation is the actual structure
|
||||
known). Before :py:func:`~taskflow.engines.base.Engine.compile`
|
||||
has completed the atom argument lookup strategy lookup will be
|
||||
restricted to injected arguments **only** (this will **not** reflect
|
||||
the actual runtime lookup strategy, which typically will be, but is
|
||||
not always different).
|
||||
"""
|
||||
def _scope_fetcher(atom_name):
|
||||
if self._compiled:
|
||||
return self._runtime.fetch_scopes_for(atom_name)
|
||||
else:
|
||||
return None
|
||||
return storage.Storage(self._flow_detail,
|
||||
backend=self._backend,
|
||||
scope_fetcher=_scope_fetcher)
|
||||
|
||||
def run(self):
|
||||
with lock_utils.try_lock(self._lock) as was_locked:
|
||||
with fasteners.try_lock(self._lock) as was_locked:
|
||||
if not was_locked:
|
||||
raise exc.ExecutionFailure("Engine currently locked, please"
|
||||
" try again later")
|
||||
@@ -119,6 +163,7 @@ class ActionEngine(base.Engine):
|
||||
"""
|
||||
self.compile()
|
||||
self.prepare()
|
||||
self.validate()
|
||||
runner = self._runtime.runner
|
||||
last_state = None
|
||||
with _start_stop(self._task_executor):
|
||||
@@ -148,7 +193,7 @@ class ActionEngine(base.Engine):
|
||||
ignorable_states = getattr(runner, 'ignorable_states', [])
|
||||
if last_state and last_state not in ignorable_states:
|
||||
self._change_state(last_state)
|
||||
if last_state not in [states.SUSPENDED, states.SUCCESS]:
|
||||
if last_state not in self.NO_RERAISING_STATES:
|
||||
failures = self.storage.get_failures()
|
||||
failure.Failure.reraise_if_any(failures.values())
|
||||
|
||||
@@ -168,16 +213,63 @@ class ActionEngine(base.Engine):
|
||||
|
||||
def _ensure_storage(self):
|
||||
"""Ensure all contained atoms exist in the storage unit."""
|
||||
transient = strutils.bool_from_string(
|
||||
self._options.get('inject_transient', True))
|
||||
self.storage.ensure_atoms(
|
||||
self._compilation.execution_graph.nodes_iter())
|
||||
for node in self._compilation.execution_graph.nodes_iter():
|
||||
self.storage.ensure_atom(node)
|
||||
if node.inject:
|
||||
self.storage.inject_atom_args(node.name, node.inject)
|
||||
self.storage.inject_atom_args(node.name,
|
||||
node.inject,
|
||||
transient=transient)
|
||||
|
||||
@lock_utils.locked
|
||||
@fasteners.locked
|
||||
def validate(self):
|
||||
self._check('validate', True, True)
|
||||
# At this point we can check to ensure all dependencies are either
|
||||
# flow/task provided or storage provided, if there are still missing
|
||||
# dependencies then this flow will fail at runtime (which we can avoid
|
||||
# by failing at validation time).
|
||||
execution_graph = self._compilation.execution_graph
|
||||
if LOG.isEnabledFor(logging.BLATHER):
|
||||
LOG.blather("Validating scoping and argument visibility for"
|
||||
" execution graph with %s nodes and %s edges with"
|
||||
" density %0.3f", execution_graph.number_of_nodes(),
|
||||
execution_graph.number_of_edges(),
|
||||
nx.density(execution_graph))
|
||||
missing = set()
|
||||
# Attempt to retain a chain of what was missing (so that the final
|
||||
# raised exception for the flow has the nodes that had missing
|
||||
# dependencies).
|
||||
last_cause = None
|
||||
last_node = None
|
||||
missing_nodes = 0
|
||||
fetch_func = self.storage.fetch_unsatisfied_args
|
||||
for node in execution_graph.nodes_iter():
|
||||
node_missing = fetch_func(node.name, node.rebind,
|
||||
optional_args=node.optional)
|
||||
if node_missing:
|
||||
cause = exc.MissingDependencies(node,
|
||||
sorted(node_missing),
|
||||
cause=last_cause)
|
||||
last_cause = cause
|
||||
last_node = node
|
||||
missing_nodes += 1
|
||||
missing.update(node_missing)
|
||||
if missing:
|
||||
# For when a task is provided (instead of a flow) and that
|
||||
# task is the only item in the graph and its missing deps, avoid
|
||||
# re-wrapping it in yet another exception...
|
||||
if missing_nodes == 1 and last_node is self._flow:
|
||||
raise last_cause
|
||||
else:
|
||||
raise exc.MissingDependencies(self._flow,
|
||||
sorted(missing),
|
||||
cause=last_cause)
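
``validate()`` chains each per-node failure into the next via ``cause`` so the
final exception can be walked back through every offending node. A simplified
stand-in for that chaining::

    class MissingDeps(Exception):              # simplified stand-in
        def __init__(self, who, missing, cause=None):
            super(MissingDeps, self).__init__(
                "%s requires %s" % (who, sorted(missing)))
            self.cause = cause

    last_cause = None
    for who, missing in [('task_a', {'x'}), ('task_b', {'y'})]:
        last_cause = MissingDeps(who, missing, cause=last_cause)
    # last_cause now links back through every node with unsatisfied args.
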
|
||||
|
||||
@fasteners.locked
|
||||
def prepare(self):
|
||||
if not self._compiled:
|
||||
raise exc.InvalidState("Can not prepare an engine"
|
||||
" which has not been compiled")
|
||||
self._check('prepare', True, False)
|
||||
if not self._storage_ensured:
|
||||
# Set our own state to resuming -> (ensure atoms exist
|
||||
# in storage) -> suspended in the storage unit and notify any
|
||||
@@ -186,14 +278,6 @@ class ActionEngine(base.Engine):
|
||||
self._ensure_storage()
|
||||
self._change_state(states.SUSPENDED)
|
||||
self._storage_ensured = True
|
||||
# At this point we can check to ensure all dependencies are either
|
||||
# flow/task provided or storage provided, if there are still missing
|
||||
# dependencies then this flow will fail at runtime (which we can avoid
|
||||
# by failing at preparation time).
|
||||
external_provides = set(self.storage.fetch_all().keys())
|
||||
missing = self._flow.requires - external_provides
|
||||
if missing:
|
||||
raise exc.MissingDependencies(self._flow, sorted(missing))
|
||||
# Reset everything back to pending (if we were previously reverted).
|
||||
if self.storage.get_flow_state() == states.REVERTED:
|
||||
self._runtime.reset_all()
|
||||
@@ -203,7 +287,7 @@ class ActionEngine(base.Engine):
|
||||
def _compiler(self):
|
||||
return self._compiler_factory(self._flow)
|
||||
|
||||
@lock_utils.locked
|
||||
@fasteners.locked
|
||||
def compile(self):
|
||||
if self._compiled:
|
||||
return
|
||||
@@ -212,6 +296,7 @@ class ActionEngine(base.Engine):
|
||||
self.storage,
|
||||
self.atom_notifier,
|
||||
self._task_executor)
|
||||
self._runtime.compile()
|
||||
self._compiled = True
|
||||
|
||||
|
||||
@@ -239,7 +324,7 @@ class _ExecutorTextMatch(collections.namedtuple('_ExecutorTextMatch',
|
||||
class ParallelActionEngine(ActionEngine):
|
||||
"""Engine that runs tasks in parallel manner.
|
||||
|
||||
Supported keyword arguments:
|
||||
Supported option keys:
|
||||
|
||||
* ``executor``: a object that implements a :pep:`3148` compatible executor
|
||||
interface; it will be used for scheduling tasks. The following
|
||||
@@ -279,7 +364,7 @@ String (case insensitive) Executor used
|
||||
#
|
||||
# NOTE(harlowja): the reason we use the library/built-in futures is to
|
||||
# allow for instances of that to be detected and handled correctly, instead
|
||||
# of forcing everyone to use our derivatives...
|
||||
# of forcing everyone to use our derivatives (futurist or other)...
|
||||
_executor_cls_matchers = [
|
||||
_ExecutorTypeMatch((futures.ThreadPoolExecutor,),
|
||||
executor.ParallelThreadTaskExecutor),
|
||||
|
||||
@@ -19,7 +19,9 @@ import collections
|
||||
from multiprocessing import managers
|
||||
import os
|
||||
import pickle
|
||||
import threading
|
||||
|
||||
import futurist
|
||||
from oslo_utils import excutils
|
||||
from oslo_utils import reflection
|
||||
from oslo_utils import timeutils
|
||||
@@ -30,9 +32,7 @@ from six.moves import queue as compat_queue
|
||||
from taskflow import logging
|
||||
from taskflow import task as task_atom
|
||||
from taskflow.types import failure
|
||||
from taskflow.types import futures
|
||||
from taskflow.types import notifier
|
||||
from taskflow.types import timing
|
||||
from taskflow.utils import async_utils
|
||||
from taskflow.utils import threading_utils
|
||||
|
||||
@@ -175,7 +175,7 @@ class _WaitWorkItem(object):
|
||||
'kind': _KIND_COMPLETE_ME,
|
||||
}
|
||||
if self._channel.put(message):
|
||||
watch = timing.StopWatch()
|
||||
watch = timeutils.StopWatch()
|
||||
watch.start()
|
||||
self._barrier.wait()
|
||||
LOG.blather("Waited %s seconds until task '%s' %s emitted"
|
||||
@@ -240,7 +240,7 @@ class _Dispatcher(object):
|
||||
raise ValueError("Provided dispatch periodicity must be greater"
|
||||
" than zero and not '%s'" % dispatch_periodicity)
|
||||
self._targets = {}
|
||||
self._dead = threading_utils.Event()
|
||||
self._dead = threading.Event()
|
||||
self._dispatch_periodicity = dispatch_periodicity
|
||||
self._stop_when_empty = False
|
||||
|
||||
@@ -304,7 +304,7 @@ class _Dispatcher(object):
|
||||
" %s to target '%s'", kind, sender, target)
|
||||
|
||||
def run(self, queue):
|
||||
watch = timing.StopWatch(duration=self._dispatch_periodicity)
|
||||
watch = timeutils.StopWatch(duration=self._dispatch_periodicity)
|
||||
while (not self._dead.is_set() or
|
||||
(self._stop_when_empty and self._targets)):
|
||||
watch.restart()
|
||||
@@ -347,18 +347,16 @@ class TaskExecutor(object):
|
||||
|
||||
def start(self):
|
||||
"""Prepare to execute tasks."""
|
||||
pass
|
||||
|
||||
def stop(self):
|
||||
"""Finalize task executor."""
|
||||
pass
|
||||
|
||||
|
||||
class SerialTaskExecutor(TaskExecutor):
|
||||
"""Executes tasks one after another."""
|
||||
|
||||
def __init__(self):
|
||||
self._executor = futures.SynchronousExecutor()
|
||||
self._executor = futurist.SynchronousExecutor()
|
||||
|
||||
def start(self):
|
||||
self._executor.restart()
|
||||
@@ -417,11 +415,8 @@ class ParallelTaskExecutor(TaskExecutor):
|
||||
|
||||
def start(self):
|
||||
if self._own_executor:
|
||||
if self._max_workers is not None:
|
||||
max_workers = self._max_workers
|
||||
else:
|
||||
max_workers = threading_utils.get_optimal_thread_count()
|
||||
self._executor = self._create_executor(max_workers=max_workers)
|
||||
self._executor = self._create_executor(
|
||||
max_workers=self._max_workers)
|
||||
|
||||
def stop(self):
|
||||
if self._own_executor:
|
||||
@@ -433,7 +428,7 @@ class ParallelThreadTaskExecutor(ParallelTaskExecutor):
|
||||
"""Executes tasks in parallel using a thread pool executor."""
|
||||
|
||||
def _create_executor(self, max_workers=None):
|
||||
return futures.ThreadPoolExecutor(max_workers=max_workers)
|
||||
return futurist.ThreadPoolExecutor(max_workers=max_workers)
|
||||
|
||||
|
||||
class ParallelProcessTaskExecutor(ParallelTaskExecutor):
|
||||
@@ -463,7 +458,7 @@ class ParallelProcessTaskExecutor(ParallelTaskExecutor):
|
||||
self._queue = None
|
||||
|
||||
def _create_executor(self, max_workers=None):
|
||||
return futures.ProcessPoolExecutor(max_workers=max_workers)
|
||||
return futurist.ProcessPoolExecutor(max_workers=max_workers)
|
||||
|
||||
def start(self):
|
||||
if threading_utils.is_alive(self._worker):
|
||||
|
||||
@@ -51,39 +51,50 @@ class _MachineMemory(object):
|
||||
self.done = set()
|
||||
|
||||
|
||||
class _MachineBuilder(object):
"""State machine *builder* that the runner uses.
class Runner(object):
"""State machine *builder* + *runner* that powers the engine components.

NOTE(harlowja): the machine states that this build will for are::
NOTE(harlowja): the machine (states and events that will trigger
transitions) that this builds is represented by the following
table::

+--------------+------------------+------------+----------+---------+
Start | Event | End | On Enter | On Exit
+--------------+------------------+------------+----------+---------+
ANALYZING | completed | GAME_OVER | |
ANALYZING | schedule_next | SCHEDULING | |
ANALYZING | wait_finished | WAITING | |
FAILURE[$] | | | |
GAME_OVER | failed | FAILURE | |
GAME_OVER | reverted | REVERTED | |
GAME_OVER | success | SUCCESS | |
GAME_OVER | suspended | SUSPENDED | |
RESUMING | schedule_next | SCHEDULING | |
REVERTED[$] | | | |
SCHEDULING | wait_finished | WAITING | |
SUCCESS[$] | | | |
SUSPENDED[$] | | | |
UNDEFINED[^] | start | RESUMING | |
WAITING | examine_finished | ANALYZING | |
+--------------+------------------+------------+----------+---------+

Between any of these yielded states (minus ``GAME_OVER`` and ``UNDEFINED``)
if the engine has been suspended or the engine has failed (due to a
non-resolveable task failure or scheduling failure) the machine will stop
executing new tasks (currently running tasks will be allowed to complete)
and this machines run loop will be broken.

NOTE(harlowja): If the runtimes scheduler component is able to schedule
tasks in parallel, this enables parallel running and/or reversion.
"""

# Informational states this action yields while running, not useful to
# have the engine record but useful to provide to end-users when doing
# execution iterations.
ignorable_states = (st.SCHEDULING, st.WAITING, st.RESUMING, st.ANALYZING)

def __init__(self, runtime, waiter):
self._runtime = runtime
self._analyzer = runtime.analyzer
self._completer = runtime.completer
self._scheduler = runtime.scheduler
@@ -91,20 +102,36 @@ class _MachineBuilder(object):
|
||||
self._waiter = waiter
|
||||
|
||||
def runnable(self):
|
||||
"""Checks if the storage says the flow is still runnable/running."""
|
||||
return self._storage.get_flow_state() == st.RUNNING
|
||||
|
||||
def build(self, timeout=None):
|
||||
"""Builds a state-machine (that can be/is used during running)."""
|
||||
|
||||
memory = _MachineMemory()
|
||||
if timeout is None:
|
||||
timeout = _WAITING_TIMEOUT
|
||||
|
||||
# Cache some local functions/methods...
|
||||
do_schedule = self._scheduler.schedule
|
||||
wait_for_any = self._waiter.wait_for_any
|
||||
do_complete = self._completer.complete
|
||||
|
||||
def iter_next_nodes(target_node=None):
|
||||
# Yields and filters and tweaks the next nodes to execute...
|
||||
maybe_nodes = self._analyzer.get_next_nodes(node=target_node)
|
||||
for node, late_decider in maybe_nodes:
|
||||
proceed = late_decider.check_and_affect(self._runtime)
|
||||
if proceed:
|
||||
yield node
|
||||
|
||||
def resume(old_state, new_state, event):
|
||||
# This reaction function just updates the state machines memory
|
||||
# to include any nodes that need to be executed (from a previous
|
||||
# attempt, which may be empty if never ran before) and any nodes
|
||||
# that are now ready to be ran.
|
||||
memory.next_nodes.update(self._completer.resume())
|
||||
memory.next_nodes.update(self._analyzer.get_next_nodes())
|
||||
memory.next_nodes.update(iter_next_nodes())
|
||||
return _SCHEDULE
|
||||
|
||||
def game_over(old_state, new_state, event):
|
||||
@@ -114,7 +141,7 @@ class _MachineBuilder(object):
|
||||
# it is *always* called before the final state is entered.
|
||||
if memory.failures:
|
||||
return _FAILED
|
||||
if self._analyzer.get_next_nodes():
|
||||
if any(1 for node in iter_next_nodes()):
|
||||
return _SUSPENDED
|
||||
elif self._analyzer.is_success():
|
||||
return _SUCCESS
|
||||
@@ -128,8 +155,7 @@ class _MachineBuilder(object):
|
||||
# that holds this information to stop or suspend); handles failures
|
||||
# that occur during this process safely...
|
||||
if self.runnable() and memory.next_nodes:
|
||||
not_done, failures = self._scheduler.schedule(
|
||||
memory.next_nodes)
|
||||
not_done, failures = do_schedule(memory.next_nodes)
|
||||
if not_done:
|
||||
memory.not_done.update(not_done)
|
||||
if failures:
|
||||
@@ -142,8 +168,7 @@ class _MachineBuilder(object):
|
||||
# call sometime in the future, or equivalent that will work in
|
||||
# py2 and py3.
|
||||
if memory.not_done:
|
||||
done, not_done = self._waiter.wait_for_any(memory.not_done,
|
||||
timeout)
|
||||
done, not_done = wait_for_any(memory.not_done, timeout)
|
||||
memory.done.update(done)
|
||||
memory.not_done = not_done
|
||||
return _ANALYZE
|
||||
@@ -160,7 +185,7 @@ class _MachineBuilder(object):
|
||||
node = fut.atom
|
||||
try:
|
||||
event, result = fut.result()
|
||||
retain = self._completer.complete(node, event, result)
|
||||
retain = do_complete(node, event, result)
|
||||
if isinstance(result, failure.Failure):
|
||||
if retain:
|
||||
memory.failures.append(result)
|
||||
@@ -183,7 +208,7 @@ class _MachineBuilder(object):
|
||||
memory.failures.append(failure.Failure())
|
||||
else:
|
||||
try:
|
||||
more_nodes = self._analyzer.get_next_nodes(node)
|
||||
more_nodes = set(iter_next_nodes(target_node=node))
|
||||
except Exception:
|
||||
memory.failures.append(failure.Failure())
|
||||
else:
|
||||
@@ -204,10 +229,10 @@ class _MachineBuilder(object):
|
||||
LOG.debug("Entering new state '%s' in response to event '%s'",
|
||||
new_state, event)
|
||||
|
||||
# NOTE(harlowja): when ran in debugging mode it is quite useful
|
||||
# NOTE(harlowja): when ran in blather mode it is quite useful
|
||||
# to track the various state transitions as they happen...
|
||||
watchers = {}
|
||||
if LOG.isEnabledFor(logging.DEBUG):
|
||||
if LOG.isEnabledFor(logging.BLATHER):
|
||||
watchers['on_exit'] = on_exit
|
||||
watchers['on_enter'] = on_enter
|
||||
|
||||
@@ -244,38 +269,9 @@ class _MachineBuilder(object):
|
||||
m.freeze()
|
||||
return (m, memory)
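
Everything above assembles a transition table plus reaction callbacks, freezes
the machine and hands it (with its memory) back to ``run_iter``. The same
control shape, reduced to a dictionary-driven toy that borrows the
states/events from the table (the real code uses a dedicated state-machine
type that is not shown in this diff)::

    TRANSITIONS = {
        ('UNDEFINED', 'start'): 'RESUMING',
        ('RESUMING', 'schedule_next'): 'SCHEDULING',
        ('SCHEDULING', 'wait_finished'): 'WAITING',
        ('WAITING', 'examine_finished'): 'ANALYZING',
        ('ANALYZING', 'completed'): 'GAME_OVER',
    }

    def run_iter(reactions, state='UNDEFINED', event='start'):
        # 'reactions' maps a state name to a callable producing the next event.
        while (state, event) in TRANSITIONS:
            state = TRANSITIONS[(state, event)]
            yield state
            reaction = reactions.get(state)
            if reaction is None:
                break
            event = reaction()
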
|
||||
|
||||
|
||||
class Runner(object):
|
||||
"""Runner that iterates while executing nodes using the given runtime.
|
||||
|
||||
This runner acts as the action engine run loop/state-machine, it resumes
|
||||
the workflow, schedules all task it can for execution using the runtimes
|
||||
scheduler and analyzer components, and than waits on returned futures and
|
||||
then activates the runtimes completion component to finish up those tasks
|
||||
and so on...
|
||||
|
||||
NOTE(harlowja): If the runtimes scheduler component is able to schedule
|
||||
tasks in parallel, this enables parallel running and/or reversion.
|
||||
"""
|
||||
|
||||
# Informational states this action yields while running, not useful to
|
||||
# have the engine record but useful to provide to end-users when doing
|
||||
# execution iterations.
|
||||
ignorable_states = (st.SCHEDULING, st.WAITING, st.RESUMING, st.ANALYZING)
|
||||
|
||||
def __init__(self, runtime, waiter):
|
||||
self._builder = _MachineBuilder(runtime, waiter)
|
||||
|
||||
@property
|
||||
def builder(self):
|
||||
return self._builder
|
||||
|
||||
def runnable(self):
|
||||
return self._builder.runnable()
|
||||
|
||||
def run_iter(self, timeout=None):
|
||||
"""Runs the nodes using a built state machine."""
|
||||
machine, memory = self.builder.build(timeout=timeout)
|
||||
"""Runs iteratively using a locally built state machine."""
|
||||
machine, memory = self.build(timeout=timeout)
|
||||
for (_prior_state, new_state) in machine.run_iter(_START):
|
||||
# NOTE(harlowja): skip over meta-states.
|
||||
if new_state not in _META_STATES:
|
||||
|
||||
@@ -14,6 +14,8 @@
|
||||
# License for the specific language governing permissions and limitations
|
||||
# under the License.
|
||||
|
||||
import functools
|
||||
|
||||
from taskflow.engines.action_engine.actions import retry as ra
|
||||
from taskflow.engines.action_engine.actions import task as ta
|
||||
from taskflow.engines.action_engine import analyzer as an
|
||||
@@ -21,7 +23,9 @@ from taskflow.engines.action_engine import completer as co
|
||||
from taskflow.engines.action_engine import runner as ru
|
||||
from taskflow.engines.action_engine import scheduler as sched
|
||||
from taskflow.engines.action_engine import scopes as sc
|
||||
from taskflow import flow as flow_type
|
||||
from taskflow import states as st
|
||||
from taskflow import task
|
||||
from taskflow.utils import misc
|
||||
|
||||
|
||||
@@ -38,7 +42,53 @@ class Runtime(object):
|
||||
self._task_executor = task_executor
|
||||
self._storage = storage
|
||||
self._compilation = compilation
|
||||
self._scopes = {}
|
||||
self._atom_cache = {}
|
||||
|
||||
def compile(self):
|
||||
"""Compiles & caches frequently used execution helper objects.
|
||||
|
||||
Build out a cache of commonly used item that are associated
|
||||
with the contained atoms (by name), and are useful to have for
|
||||
quick lookup on (for example, the change state handler function for
|
||||
each atom, the scope walker object for each atom, the task or retry
|
||||
specific scheduler and so-on).
|
||||
"""
|
||||
change_state_handlers = {
|
||||
'task': functools.partial(self.task_action.change_state,
|
||||
progress=0.0),
|
||||
'retry': self.retry_action.change_state,
|
||||
}
|
||||
schedulers = {
|
||||
'retry': self.retry_scheduler,
|
||||
'task': self.task_scheduler,
|
||||
}
|
||||
execution_graph = self._compilation.execution_graph
|
||||
for atom in self.analyzer.iterate_all_nodes():
|
||||
metadata = {}
|
||||
walker = sc.ScopeWalker(self.compilation, atom, names_only=True)
|
||||
if isinstance(atom, task.BaseTask):
|
||||
check_transition_handler = st.check_task_transition
|
||||
change_state_handler = change_state_handlers['task']
|
||||
scheduler = schedulers['task']
|
||||
else:
|
||||
check_transition_handler = st.check_retry_transition
|
||||
change_state_handler = change_state_handlers['retry']
|
||||
scheduler = schedulers['retry']
|
||||
edge_deciders = {}
|
||||
for previous_atom in execution_graph.predecessors(atom):
|
||||
# If there is any link function that says if this connection
|
||||
# is able to run (or should not) ensure we retain it and use
|
||||
# it later as needed.
|
||||
u_v_data = execution_graph.adj[previous_atom][atom]
|
||||
u_v_decider = u_v_data.get(flow_type.LINK_DECIDER)
|
||||
if u_v_decider is not None:
|
||||
edge_deciders[previous_atom.name] = u_v_decider
|
||||
metadata['scope_walker'] = walker
|
||||
metadata['check_transition_handler'] = check_transition_handler
|
||||
metadata['change_state_handler'] = change_state_handler
|
||||
metadata['scheduler'] = scheduler
|
||||
metadata['edge_deciders'] = edge_deciders
|
||||
self._atom_cache[atom.name] = metadata
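
``Runtime.compile()`` now precomputes a small per-atom metadata dictionary so
later lookups become plain dict reads instead of repeated ``isinstance``
checks. In miniature (names here are illustrative)::

    def build_atom_cache(atom_names, is_task):
        cache = {}
        for name in atom_names:
            cache[name] = {
                'kind': 'task' if is_task(name) else 'retry',
                # the real cache also stores the scope walker, transition
                # checker, change-state handler, scheduler and edge deciders
            }
        return cache

    cache = build_atom_cache(['boot', 'cleanup'], lambda name: name != 'cleanup')
    print(cache['boot']['kind'])      # task
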
|
||||
|
||||
@property
|
||||
def compilation(self):
|
||||
@@ -50,7 +100,7 @@ class Runtime(object):
|
||||
|
||||
@misc.cachedproperty
|
||||
def analyzer(self):
|
||||
return an.Analyzer(self._compilation, self._storage)
|
||||
return an.Analyzer(self)
|
||||
|
||||
@misc.cachedproperty
|
||||
def runner(self):
|
||||
@@ -64,53 +114,101 @@ class Runtime(object):
|
||||
def scheduler(self):
|
||||
return sched.Scheduler(self)
|
||||
|
||||
@misc.cachedproperty
|
||||
def task_scheduler(self):
|
||||
return sched.TaskScheduler(self)
|
||||
|
||||
@misc.cachedproperty
|
||||
def retry_scheduler(self):
|
||||
return sched.RetryScheduler(self)
|
||||
|
||||
@misc.cachedproperty
|
||||
def retry_action(self):
|
||||
return ra.RetryAction(self._storage, self._atom_notifier,
|
||||
self._fetch_scopes_for)
|
||||
return ra.RetryAction(self._storage,
|
||||
self._atom_notifier)
|
||||
|
||||
@misc.cachedproperty
|
||||
def task_action(self):
|
||||
return ta.TaskAction(self._storage,
|
||||
self._atom_notifier, self._fetch_scopes_for,
|
||||
self._atom_notifier,
|
||||
self._task_executor)
|
||||
|
||||
def _fetch_scopes_for(self, atom):
|
||||
"""Fetches a tuple of the visible scopes for the given atom."""
|
||||
def check_atom_transition(self, atom, current_state, target_state):
|
||||
"""Checks if the atom can transition to the provided target state."""
|
||||
# This does not check if the name exists (since this is only used
|
||||
# internally to the engine, and is not exposed to atoms that will
|
||||
# not exist and therefore doesn't need to handle that case).
|
||||
metadata = self._atom_cache[atom.name]
|
||||
check_transition_handler = metadata['check_transition_handler']
|
||||
return check_transition_handler(current_state, target_state)
|
||||
|
||||
def fetch_edge_deciders(self, atom):
|
||||
"""Fetches the edge deciders for the given atom."""
|
||||
# This does not check if the name exists (since this is only used
|
||||
# internally to the engine, and is not exposed to atoms that will
|
||||
# not exist and therefore doesn't need to handle that case).
|
||||
metadata = self._atom_cache[atom.name]
|
||||
return metadata['edge_deciders']
|
||||
|
||||
def fetch_scheduler(self, atom):
|
||||
"""Fetches the cached specific scheduler for the given atom."""
|
||||
# This does not check if the name exists (since this is only used
|
||||
# internally to the engine, and is not exposed to atoms that will
|
||||
# not exist and therefore doesn't need to handle that case).
|
||||
metadata = self._atom_cache[atom.name]
|
||||
return metadata['scheduler']
|
||||
|
||||
def fetch_scopes_for(self, atom_name):
|
||||
"""Fetches a walker of the visible scopes for the given atom."""
|
||||
try:
|
||||
return self._scopes[atom]
|
||||
metadata = self._atom_cache[atom_name]
|
||||
except KeyError:
|
||||
walker = sc.ScopeWalker(self.compilation, atom,
|
||||
names_only=True)
|
||||
visible_to = tuple(walker)
|
||||
self._scopes[atom] = visible_to
|
||||
return visible_to
|
||||
# This signals to the caller that there is no walker for whatever
|
||||
# atom name was given that doesn't really have any associated atom
|
||||
# known to be named with that name; this is done since the storage
|
||||
# layer will call into this layer to fetch a scope for a named
|
||||
# atom and users can provide random names that do not actually
|
||||
# exist...
|
||||
return None
|
||||
else:
|
||||
return metadata['scope_walker']
|
||||
|
||||
# Various helper methods used by the runtime components; not for public
|
||||
# consumption...
|
||||
|
||||
def reset_nodes(self, nodes, state=st.PENDING, intention=st.EXECUTE):
|
||||
for node in nodes:
|
||||
def reset_nodes(self, atoms, state=st.PENDING, intention=st.EXECUTE):
|
||||
"""Resets all the provided atoms to the given state and intention."""
|
||||
tweaked = []
|
||||
for atom in atoms:
|
||||
metadata = self._atom_cache[atom.name]
|
||||
if state or intention:
|
||||
tweaked.append((atom, state, intention))
|
||||
if state:
|
||||
if self.task_action.handles(node):
|
||||
self.task_action.change_state(node, state,
|
||||
progress=0.0)
|
||||
elif self.retry_action.handles(node):
|
||||
self.retry_action.change_state(node, state)
|
||||
else:
|
||||
raise TypeError("Unknown how to reset atom '%s' (%s)"
|
||||
% (node, type(node)))
|
||||
change_state_handler = metadata['change_state_handler']
|
||||
change_state_handler(atom, state)
|
||||
if intention:
|
||||
self.storage.set_atom_intention(node.name, intention)
|
||||
self.storage.set_atom_intention(atom.name, intention)
|
||||
return tweaked
|
||||
|
||||
def reset_all(self, state=st.PENDING, intention=st.EXECUTE):
|
||||
self.reset_nodes(self.analyzer.iterate_all_nodes(),
|
||||
state=state, intention=intention)
|
||||
"""Resets all atoms to the given state and intention."""
|
||||
return self.reset_nodes(self.analyzer.iterate_all_nodes(),
|
||||
state=state, intention=intention)
|
||||
|
||||
def reset_subgraph(self, node, state=st.PENDING, intention=st.EXECUTE):
|
||||
self.reset_nodes(self.analyzer.iterate_subgraph(node),
|
||||
state=state, intention=intention)
|
||||
def reset_subgraph(self, atom, state=st.PENDING, intention=st.EXECUTE):
|
||||
"""Resets a atoms subgraph to the given state and intention.
|
||||
|
||||
The subgraph is contained of all of the atoms successors.
|
||||
"""
|
||||
return self.reset_nodes(self.analyzer.iterate_subgraph(atom),
|
||||
state=state, intention=intention)
|
||||
|
||||
def retry_subflow(self, retry):
|
||||
"""Prepares a retrys + its subgraph for execution.
|
||||
|
||||
This sets the retrys intention to ``EXECUTE`` and resets all of its
|
||||
subgraph (its successors) to the ``PENDING`` state with an ``EXECUTE``
|
||||
intention.
|
||||
"""
|
||||
self.storage.set_atom_intention(retry.name, st.EXECUTE)
|
||||
self.reset_subgraph(retry)
|
||||
|
||||
@@ -14,23 +14,21 @@
|
||||
# License for the specific language governing permissions and limitations
|
||||
# under the License.
|
||||
|
||||
import weakref
|
||||
|
||||
from taskflow import exceptions as excp
|
||||
from taskflow import retry as retry_atom
|
||||
from taskflow import states as st
|
||||
from taskflow import task as task_atom
|
||||
from taskflow.types import failure
|
||||
|
||||
|
||||
class _RetryScheduler(object):
|
||||
class RetryScheduler(object):
|
||||
"""Schedules retry atoms."""
|
||||
|
||||
def __init__(self, runtime):
|
||||
self._runtime = runtime
|
||||
self._runtime = weakref.proxy(runtime)
|
||||
self._retry_action = runtime.retry_action
|
||||
self._storage = runtime.storage
|
||||
|
||||
@staticmethod
|
||||
def handles(atom):
|
||||
return isinstance(atom, retry_atom.Retry)
|
||||
|
||||
def schedule(self, retry):
|
||||
"""Schedules the given retry atom for *future* completion.
|
||||
|
||||
@@ -51,15 +49,13 @@ class _RetryScheduler(object):
|
||||
" intention: %s" % intention)
|
||||
|
||||
|
||||
class _TaskScheduler(object):
|
||||
class TaskScheduler(object):
|
||||
"""Schedules task atoms."""
|
||||
|
||||
def __init__(self, runtime):
|
||||
self._storage = runtime.storage
|
||||
self._task_action = runtime.task_action
|
||||
|
||||
@staticmethod
|
||||
def handles(atom):
|
||||
return isinstance(atom, task_atom.BaseTask)
|
||||
|
||||
def schedule(self, task):
|
||||
"""Schedules the given task atom for *future* completion.
|
||||
|
||||
@@ -77,39 +73,28 @@ class _TaskScheduler(object):
|
||||
|
||||
|
||||
class Scheduler(object):
|
||||
"""Schedules atoms using actions to schedule."""
|
||||
"""Safely schedules atoms using a runtime ``fetch_scheduler`` routine."""
|
||||
|
||||
def __init__(self, runtime):
|
||||
self._schedulers = [
|
||||
_RetryScheduler(runtime),
|
||||
_TaskScheduler(runtime),
|
||||
]
|
||||
self._fetch_scheduler = runtime.fetch_scheduler
|
||||
|
||||
def _schedule_node(self, node):
|
||||
"""Schedule a single node for execution."""
|
||||
for sched in self._schedulers:
|
||||
if sched.handles(node):
|
||||
return sched.schedule(node)
|
||||
else:
|
||||
raise TypeError("Unknown how to schedule '%s' (%s)"
|
||||
% (node, type(node)))
|
||||
def schedule(self, atoms):
|
||||
"""Schedules the provided atoms for *future* completion.
|
||||
|
||||
def schedule(self, nodes):
|
||||
"""Schedules the provided nodes for *future* completion.
|
||||
|
||||
This method should schedule a future for each node provided and return
|
||||
This method should schedule a future for each atom provided and return
|
||||
a set of those futures to be waited on (or used for other similar
|
||||
purposes). It should also return any failure objects that represented
|
||||
scheduling failures that may have occurred during this scheduling
|
||||
process.
|
||||
"""
|
||||
futures = set()
|
||||
for node in nodes:
|
||||
for atom in atoms:
|
||||
scheduler = self._fetch_scheduler(atom)
|
||||
try:
|
||||
futures.add(self._schedule_node(node))
|
||||
futures.add(scheduler.schedule(atom))
|
||||
except Exception:
|
||||
# Immediately stop scheduling future work so that we can
|
||||
# exit execution early (rather than later) if a single task
|
||||
# exit execution early (rather than later) if a single atom
|
||||
# fails to schedule correctly.
|
||||
return (futures, [failure.Failure()])
|
||||
return (futures, [])
|
||||
|
||||
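For illustration, a minimal sketch of the scheduling loop the new ``Scheduler.schedule`` performs: a per-atom scheduler is fetched from the runtime and any scheduling exception is captured as a single failure so execution can stop early. The free function below is a simplification for reading purposes; ``runtime`` and ``fetch_scheduler`` are assumed to behave as shown in the hunk above.

from taskflow.types import failure

def schedule(runtime, atoms):
    # Mirrors the pattern above: one future per atom, stop on the first error.
    futures = set()
    for atom in atoms:
        scheduler = runtime.fetch_scheduler(atom)   # picks a retry/task scheduler
        try:
            futures.add(scheduler.schedule(atom))
        except Exception:
            # Capture the in-flight exception as a Failure and bail out early.
            return (futures, [failure.Failure()])
    return (futures, [])
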
@@ -21,29 +21,30 @@ from taskflow import logging
|
||||
LOG = logging.getLogger(__name__)
|
||||
|
||||
|
||||
def _extract_atoms(node, idx=-1):
|
||||
def _extract_atoms_iter(node, idx=-1):
|
||||
# Always go left to right, since right to left is the pattern order
|
||||
# and we want to go backwards and not forwards through that ordering...
|
||||
if idx == -1:
|
||||
children_iter = node.reverse_iter()
|
||||
else:
|
||||
children_iter = reversed(node[0:idx])
|
||||
atoms = []
|
||||
for child in children_iter:
|
||||
if isinstance(child.item, flow_type.Flow):
|
||||
atoms.extend(_extract_atoms(child))
|
||||
for atom in _extract_atoms_iter(child):
|
||||
yield atom
|
||||
elif isinstance(child.item, atom_type.Atom):
|
||||
atoms.append(child.item)
|
||||
yield child.item
|
||||
else:
|
||||
raise TypeError(
|
||||
"Unknown extraction item '%s' (%s)" % (child.item,
|
||||
type(child.item)))
|
||||
return atoms
|
||||
|
||||
|
||||
class ScopeWalker(object):
|
||||
"""Walks through the scopes of a atom using a engines compilation.
|
||||
|
||||
NOTE(harlowja): for internal usage only.
|
||||
|
||||
This will walk the visible scopes that are accessible for the given
|
||||
atom, which can be used by some external entity in some meaningful way,
|
||||
for example to find dependent values...
|
||||
@@ -54,60 +55,80 @@ class ScopeWalker(object):
|
||||
if self._node is None:
|
||||
raise ValueError("Unable to find atom '%s' in compilation"
|
||||
" hierarchy" % atom)
|
||||
self._level_cache = {}
|
||||
self._atom = atom
|
||||
self._graph = compilation.execution_graph
|
||||
self._names_only = names_only
|
||||
self._predecessors = None
|
||||
|
||||
#: Function that extracts the *associated* atoms of a given tree node.
|
||||
_extract_atoms_iter = staticmethod(_extract_atoms_iter)
|
||||
|
||||
def __iter__(self):
|
||||
"""Iterates over the visible scopes.
|
||||
|
||||
How this works is the following:
|
||||
|
||||
We find all the possible predecessors of the given atom, this is useful
|
||||
since we know they occurred before this atom but it doesn't tell us
|
||||
the corresponding scope *level* that each predecessor was created in,
|
||||
so we need to find this information.
|
||||
We first grab all the predecessors of the given atom (let's call it
|
||||
``Y``) by using the :py:class:`~.compiler.Compilation` execution
|
||||
graph (and doing a reverse breadth-first expansion to gather its
|
||||
predecessors), this is useful since we know they *always* will
|
||||
exist (and execute) before this atom but it does not tell us the
|
||||
corresponding scope *level* (flow, nested flow...) that each
|
||||
predecessor was created in, so we need to find this information.
|
||||
|
||||
For that information we consult the location of the atom ``Y`` in the
|
||||
node hierarchy. We lookup in a reverse order the parent ``X`` of ``Y``
|
||||
and traverse backwards from the index in the parent where ``Y``
|
||||
occurred, all children in ``X`` that we encounter in this backwards
|
||||
search (if a child is a flow itself, its atom contents will be
|
||||
expanded) will be assumed to be at the same scope. This is then a
|
||||
*potential* single scope, to make an *actual* scope we remove the items
|
||||
from the *potential* scope that are not predecessors of ``Y`` to form
|
||||
the *actual* scope.
|
||||
:py:class:`~.compiler.Compilation` hierarchy/tree. We lookup in a
|
||||
reverse order the parent ``X`` of ``Y`` and traverse backwards from
|
||||
the index in the parent where ``Y`` exists to all siblings (and
|
||||
children of those siblings) in ``X`` that we encounter in this
|
||||
backwards search (if a sibling is a flow itself, its atom(s)
|
||||
will be recursively expanded and included). This collection will
|
||||
then be assumed to be at the same scope. This is what is called
|
||||
a *potential* single scope; to make an *actual* scope we remove the
|
||||
items from the *potential* scope that are **not** predecessors
|
||||
of ``Y`` to form the *actual* scope which we then yield back.
|
||||
|
||||
Then for additional scopes we continue up the tree, by finding the
|
||||
parent of ``X`` (let's call it ``Z``) and perform the same operation,
|
||||
going through the children in a reverse manner from the index in
|
||||
parent ``Z`` where ``X`` was located. This forms another *potential*
|
||||
scope which we provide back as an *actual* scope after reducing the
|
||||
potential set by the predecessors of ``Y``. We then repeat this process
|
||||
until we no longer have any parent nodes (aka have reached the top of
|
||||
the tree) or we run out of predecessors.
|
||||
potential set to only include predecessors previously gathered. We
|
||||
then repeat this process until we no longer have any parent
|
||||
nodes (aka we have reached the top of the tree) or we run out of
|
||||
predecessors.
|
||||
"""
|
||||
predecessors = set(self._graph.bfs_predecessors_iter(self._atom))
|
||||
if self._predecessors is None:
|
||||
pred_iter = self._graph.bfs_predecessors_iter(self._atom)
|
||||
self._predecessors = set(pred_iter)
|
||||
predecessors = self._predecessors.copy()
|
||||
last = self._node
|
||||
for parent in self._node.path_iter(include_self=False):
|
||||
for lvl, parent in enumerate(self._node.path_iter(include_self=False)):
|
||||
if not predecessors:
|
||||
break
|
||||
last_idx = parent.index(last.item)
|
||||
visible = []
|
||||
for a in _extract_atoms(parent, idx=last_idx):
|
||||
if a in predecessors:
|
||||
predecessors.remove(a)
|
||||
if not self._names_only:
|
||||
visible.append(a)
|
||||
else:
|
||||
visible.append(a.name)
|
||||
if LOG.isEnabledFor(logging.BLATHER):
|
||||
if not self._names_only:
|
||||
try:
|
||||
visible, removals = self._level_cache[lvl]
|
||||
predecessors = predecessors - removals
|
||||
except KeyError:
|
||||
visible = []
|
||||
removals = set()
|
||||
for atom in self._extract_atoms_iter(parent, idx=last_idx):
|
||||
if atom in predecessors:
|
||||
predecessors.remove(atom)
|
||||
removals.add(atom)
|
||||
visible.append(atom)
|
||||
if not predecessors:
|
||||
break
|
||||
self._level_cache[lvl] = (visible, removals)
|
||||
if LOG.isEnabledFor(logging.BLATHER):
|
||||
visible_names = [a.name for a in visible]
|
||||
else:
|
||||
visible_names = visible
|
||||
LOG.blather("Scope visible to '%s' (limited by parent '%s'"
|
||||
" index < %s) is: %s", self._atom,
|
||||
parent.item.name, last_idx, visible_names)
|
||||
yield visible
|
||||
LOG.blather("Scope visible to '%s' (limited by parent '%s'"
|
||||
" index < %s) is: %s", self._atom,
|
||||
parent.item.name, last_idx, visible_names)
|
||||
if self._names_only:
|
||||
yield [a.name for a in visible]
|
||||
else:
|
||||
yield visible
|
||||
last = parent
|
||||
|
||||
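A hedged usage sketch of the walker described above. The module path and constructor signature are assumptions inferred from the attributes used in the hunk (``compilation``, ``atom``, ``names_only``), and ``compilation``/``some_atom`` are placeholders for a real engine compilation and one of its atoms.

# Module path and signature assumed, not taken verbatim from this change.
from taskflow.engines.action_engine import scopes

walker = scopes.ScopeWalker(compilation, some_atom, names_only=True)
for level, visible_names in enumerate(walker):
    # Each yielded list is one *actual* scope: the predecessor atoms of
    # 'some_atom' visible at that nesting level, closest level first.
    print(level, visible_names)
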
@@ -17,12 +17,10 @@
|
||||
|
||||
import abc
|
||||
|
||||
from debtcollector import moves
|
||||
import six
|
||||
|
||||
from taskflow import storage
|
||||
from taskflow.types import notifier
|
||||
from taskflow.utils import deprecation
|
||||
from taskflow.utils import misc
|
||||
|
||||
|
||||
@six.add_metaclass(abc.ABCMeta)
|
||||
@@ -56,10 +54,18 @@ class Engine(object):
|
||||
return self._notifier
|
||||
|
||||
@property
|
||||
@deprecation.moved_property('atom_notifier', version="0.6",
|
||||
removal_version="?")
|
||||
@moves.moved_property('atom_notifier', version="0.6",
|
||||
removal_version="2.0")
|
||||
def task_notifier(self):
|
||||
"""The task notifier."""
|
||||
"""The task notifier.
|
||||
|
||||
.. deprecated:: 0.6
|
||||
|
||||
The property is **deprecated** and is present for
|
||||
backward compatibility **only**. In order to access this
|
||||
property going forward the :py:attr:`.atom_notifier` should
|
||||
be used instead.
|
||||
"""
|
||||
return self._atom_notifier
|
||||
|
||||
@property
|
||||
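Since ``task_notifier`` is now only a deprecated alias, callers should register on ``atom_notifier`` directly. A small hedged sketch, with an illustrative flow, task and callback:

from taskflow import engines
from taskflow import task
from taskflow.patterns import linear_flow as lf
from taskflow.types import notifier

class Noop(task.Task):
    def execute(self):
        pass

def on_atom_transition(state, details):
    # Called for every atom (task/retry) state transition.
    print("atom moved to", state)

engine = engines.load(lf.Flow("watched").add(Noop("noop")))
engine.atom_notifier.register(notifier.Notifier.ANY, on_atom_transition)
engine.run()
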
@@ -72,10 +78,9 @@ class Engine(object):
|
||||
"""The options that were passed to this engine on construction."""
|
||||
return self._options
|
||||
|
||||
@misc.cachedproperty
|
||||
@abc.abstractproperty
|
||||
def storage(self):
|
||||
"""The storage unit for this flow."""
|
||||
return storage.Storage(self._flow_detail, backend=self._backend)
|
||||
"""The storage unit for this engine."""
|
||||
|
||||
@abc.abstractmethod
|
||||
def compile(self):
|
||||
@@ -92,9 +97,18 @@ class Engine(object):
|
||||
"""Performs any pre-run, but post-compilation actions.
|
||||
|
||||
NOTE(harlowja): During preparation it is currently assumed that the
|
||||
underlying storage will be initialized, all final dependencies
|
||||
will be verified, the tasks will be reset and the engine will enter
|
||||
the PENDING state.
|
||||
underlying storage will be initialized, the atoms will be reset and
|
||||
the engine will enter the PENDING state.
|
||||
"""
|
||||
|
||||
@abc.abstractmethod
|
||||
def validate(self):
|
||||
"""Performs any pre-run, post-prepare validation actions.
|
||||
|
||||
NOTE(harlowja): During validation all final dependencies
|
||||
will be verified and ensured. This will by default check that all
|
||||
atoms have satisfiable requirements (satisfied by some other
|
||||
provider).
|
||||
"""
|
||||
|
||||
@abc.abstractmethod
|
||||
@@ -105,15 +119,13 @@ class Engine(object):
|
||||
def suspend(self):
|
||||
"""Attempts to suspend the engine.
|
||||
|
||||
If the engine is currently running tasks then this will attempt to
|
||||
suspend future work from being started (currently active tasks can
|
||||
If the engine is currently running atoms then this will attempt to
|
||||
suspend future work from being started (currently active atoms can
|
||||
not currently be preempted) and move the engine into a suspend state
|
||||
which can then later be resumed from.
|
||||
"""
|
||||
|
||||
|
||||
# TODO(harlowja): remove in 0.7 or later...
|
||||
EngineBase = deprecation.moved_inheritable_class(Engine,
|
||||
'EngineBase', __name__,
|
||||
version="0.6",
|
||||
removal_version="?")
|
||||
EngineBase = moves.moved_class(Engine, 'EngineBase', __name__,
|
||||
version="0.6", removal_version="2.0")
|
||||
|
||||
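The abstract methods above imply the usual engine lifecycle; here is a minimal sketch with an illustrative flow and task. Calling each step explicitly is optional (``run()`` is assumed to perform any skipped steps itself, which is an assumption and not a statement from this change):

from taskflow import engines
from taskflow import task
from taskflow.patterns import linear_flow as lf

class Noop(task.Task):
    def execute(self):
        pass

engine = engines.load(lf.Flow("demo").add(Noop("noop")))
engine.compile()    # build the compilation of the flow
engine.prepare()    # initialize storage, reset atoms, enter PENDING
engine.validate()   # new step: check atom requirements are satisfiable
engine.run()        # execute (suspend() can interrupt a running engine)
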
@@ -18,6 +18,7 @@ import contextlib
|
||||
import itertools
|
||||
import traceback
|
||||
|
||||
from debtcollector import renames
|
||||
from oslo_utils import importutils
|
||||
from oslo_utils import reflection
|
||||
import six
|
||||
@@ -26,7 +27,6 @@ import stevedore.driver
|
||||
from taskflow import exceptions as exc
|
||||
from taskflow import logging
|
||||
from taskflow.persistence import backends as p_backends
|
||||
from taskflow.utils import deprecation
|
||||
from taskflow.utils import misc
|
||||
from taskflow.utils import persistence_utils as p_utils
|
||||
|
||||
@@ -90,14 +90,14 @@ def _extract_engine(**kwargs):
|
||||
lambda frame: frame[0] in _FILE_NAMES,
|
||||
reversed(traceback.extract_stack(limit=3)))
|
||||
stacklevel = sum(1 for _frame in finder)
|
||||
decorator = deprecation.renamed_kwarg('engine_conf', 'engine',
|
||||
version="0.6",
|
||||
removal_version="?",
|
||||
# Three is added on since the
|
||||
# decorator adds three of its own
|
||||
# stack levels that we need to
|
||||
# hop out of...
|
||||
stacklevel=stacklevel + 3)
|
||||
decorator = renames.renamed_kwarg('engine_conf', 'engine',
|
||||
version="0.6",
|
||||
removal_version="2.0",
|
||||
# Three is added on since the
|
||||
# decorator adds three of its own
|
||||
# stack levels that we need to
|
||||
# hop out of...
|
||||
stacklevel=stacklevel + 3)
|
||||
return decorator(_compat_extract)(**kwargs)
|
||||
else:
|
||||
return _compat_extract(**kwargs)
|
||||
@@ -134,7 +134,7 @@ def load(flow, store=None, flow_detail=None, book=None,
|
||||
|
||||
This function creates and prepares an engine to run the provided flow. All
|
||||
that is left after this returns is to run the engine with the
|
||||
engines ``run()`` method.
|
||||
engine's :py:meth:`~taskflow.engines.base.Engine.run` method.
|
||||
|
||||
Which engine to load is specified via the ``engine`` parameter. It
|
||||
can be a string that names the engine type to use, or a string that
|
||||
@@ -143,7 +143,15 @@ def load(flow, store=None, flow_detail=None, book=None,
|
||||
|
||||
Which storage backend to use is defined by the backend parameter. It
|
||||
can be backend itself, or a dictionary that is passed to
|
||||
``taskflow.persistence.backends.fetch()`` to obtain a viable backend.
|
||||
:py:func:`~taskflow.persistence.backends.fetch` to obtain a
|
||||
viable backend.
|
||||
|
||||
.. deprecated:: 0.6
|
||||
|
||||
The ``engine_conf`` argument is **deprecated** and is present
|
||||
for backward compatibility **only**. In order to provide this
|
||||
argument going forward the ``engine`` string (or URI) argument
|
||||
should be used instead.
|
||||
|
||||
:param flow: flow to load
|
||||
:param store: dict -- data to put to storage to satisfy flow requirements
|
||||
@@ -198,7 +206,15 @@ def run(flow, store=None, flow_detail=None, book=None,
|
||||
|
||||
The arguments are interpreted as for :func:`load() <load>`.
|
||||
|
||||
:returns: dictionary of all named results (see ``storage.fetch_all()``)
|
||||
.. deprecated:: 0.6
|
||||
|
||||
The ``engine_conf`` argument is **deprecated** and is present
|
||||
for backward compatibility **only**. In order to provide this
|
||||
argument going forward the ``engine`` string (or URI) argument
|
||||
should be used instead.
|
||||
|
||||
:returns: dictionary of all named
|
||||
results (see :py:meth:`~.taskflow.storage.Storage.fetch_all`)
|
||||
"""
|
||||
engine = load(flow, store=store, flow_detail=flow_detail, book=book,
|
||||
engine_conf=engine_conf, backend=backend,
|
||||
@@ -262,6 +278,13 @@ def load_from_factory(flow_factory, factory_args=None, factory_kwargs=None,
|
||||
|
||||
Further arguments are interpreted as for :func:`load() <load>`.
|
||||
|
||||
.. deprecated:: 0.6
|
||||
|
||||
The ``engine_conf`` argument is **deprecated** and is present
|
||||
for backward compatibility **only**. In order to provide this
|
||||
argument going forward the ``engine`` string (or URI) argument
|
||||
should be used instead.
|
||||
|
||||
:returns: engine
|
||||
"""
|
||||
|
||||
@@ -322,6 +345,13 @@ def load_from_detail(flow_detail, store=None, engine_conf=None, backend=None,
|
||||
|
||||
Further arguments are interpreted as for :func:`load() <load>`.
|
||||
|
||||
.. deprecated:: 0.6
|
||||
|
||||
The ``engine_conf`` argument is **deprecated** and is present
|
||||
for backward compatibility **only**. In order to provide this
|
||||
argument going forward the ``engine`` string (or URI) argument
|
||||
should be used instead.
|
||||
|
||||
:returns: engine
|
||||
"""
|
||||
flow = flow_from_detail(flow_detail)
|
||||
|
||||
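In line with the deprecation notes above, a hedged sketch of selecting an engine by the ``engine`` string instead of the deprecated ``engine_conf`` keyword (the flow and task are illustrative):

import taskflow.engines
from taskflow import task
from taskflow.patterns import linear_flow as lf

class Hello(task.Task):
    def execute(self):
        return 'hello'

# 'engine' (a string or URI) replaces the deprecated 'engine_conf' keyword.
result = taskflow.engines.run(lf.Flow("greet").add(Hello(provides='greeting')),
                              engine='serial', store={})
print(result)   # dictionary of all named results, e.g. {'greeting': 'hello'}
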
@@ -23,6 +23,36 @@ from taskflow.utils import kombu_utils as ku
|
||||
LOG = logging.getLogger(__name__)
|
||||
|
||||
|
||||
class Handler(object):
|
||||
"""Component(s) that will be called on reception of messages."""
|
||||
|
||||
__slots__ = ['_process_message', '_validator']
|
||||
|
||||
def __init__(self, process_message, validator=None):
|
||||
self._process_message = process_message
|
||||
self._validator = validator
|
||||
|
||||
@property
|
||||
def process_message(self):
|
||||
"""Main callback that is called to process a received message.
|
||||
|
||||
This is only called after the format has been validated (using
|
||||
the ``validator`` callback if applicable) and only after the message
|
||||
has been acknowledged.
|
||||
"""
|
||||
return self._process_message
|
||||
|
||||
@property
|
||||
def validator(self):
|
||||
"""Optional callback that will be activated before processing.
|
||||
|
||||
This callback, if present, is expected to validate the message and
|
||||
raise :py:class:`~taskflow.exceptions.InvalidFormat` if the message
|
||||
is not valid.
|
||||
"""
|
||||
return self._validator
|
||||
|
||||
|
||||
class TypeDispatcher(object):
|
||||
"""Receives messages and dispatches to type specific handlers."""
|
||||
|
||||
@@ -99,10 +129,9 @@ class TypeDispatcher(object):
|
||||
LOG.warning("Unexpected message type: '%s' in message"
|
||||
" '%s'", message_type, ku.DelayedPretty(message))
|
||||
else:
|
||||
if isinstance(handler, (tuple, list)):
|
||||
handler, validator = handler
|
||||
if handler.validator is not None:
|
||||
try:
|
||||
validator(data)
|
||||
handler.validator(data)
|
||||
except excp.InvalidFormat as e:
|
||||
message.reject_log_error(
|
||||
logger=LOG, errors=(kombu_exc.MessageStateError,))
|
||||
@@ -115,7 +144,7 @@ class TypeDispatcher(object):
|
||||
if message.acknowledged:
|
||||
LOG.debug("Message '%s' was acknowledged.",
|
||||
ku.DelayedPretty(message))
|
||||
handler(data, message)
|
||||
handler.process_message(data, message)
|
||||
else:
|
||||
message.reject_log_error(logger=LOG,
|
||||
errors=(kombu_exc.MessageStateError,))
|
||||
|
||||
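A short sketch of registering the new ``Handler`` objects introduced above (the callback name is illustrative); this replaces the previous ``[callback, validator]`` list form that the dispatcher used to unpack:

from taskflow.engines.worker_based import dispatcher
from taskflow.engines.worker_based import protocol as pr

def on_response(data, message):
    # Only called after pr.Response.validate(data) passed and the
    # message was acknowledged.
    print("response received:", data)

type_handlers = {
    pr.RESPONSE: dispatcher.Handler(on_response,
                                    validator=pr.Response.validate),
}
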
@@ -36,8 +36,9 @@ class WorkerBasedActionEngine(engine.ActionEngine):
|
||||
of the (PENDING, WAITING) request states. When
|
||||
expired the associated task the request was made
|
||||
for will have its result become a
|
||||
`RequestTimeout` exception instead of its
|
||||
normally returned value (or raised exception).
|
||||
:py:class:`~taskflow.exceptions.RequestTimeout`
|
||||
exception instead of its normally returned
|
||||
value (or raised exception).
|
||||
:param transport_options: transport specific options (see:
|
||||
http://kombu.readthedocs.org/ for what these
|
||||
options imply and are expected to be)
|
||||
|
||||
@@ -16,16 +16,17 @@
|
||||
|
||||
import functools
|
||||
|
||||
from futurist import periodics
|
||||
from oslo_utils import timeutils
|
||||
|
||||
from taskflow.engines.action_engine import executor
|
||||
from taskflow.engines.worker_based import dispatcher
|
||||
from taskflow.engines.worker_based import protocol as pr
|
||||
from taskflow.engines.worker_based import proxy
|
||||
from taskflow.engines.worker_based import types as wt
|
||||
from taskflow import exceptions as exc
|
||||
from taskflow import logging
|
||||
from taskflow import task as task_atom
|
||||
from taskflow.types import periodic
|
||||
from taskflow.utils import kombu_utils as ku
|
||||
from taskflow.utils import misc
|
||||
from taskflow.utils import threading_utils as tu
|
||||
@@ -44,10 +45,8 @@ class WorkerTaskExecutor(executor.TaskExecutor):
|
||||
self._requests_cache = wt.RequestsCache()
|
||||
self._transition_timeout = transition_timeout
|
||||
type_handlers = {
|
||||
pr.RESPONSE: [
|
||||
self._process_response,
|
||||
pr.Response.validate,
|
||||
],
|
||||
pr.RESPONSE: dispatcher.Handler(self._process_response,
|
||||
validator=pr.Response.validate),
|
||||
}
|
||||
self._proxy = proxy.Proxy(uuid, exchange,
|
||||
type_handlers=type_handlers,
|
||||
@@ -68,7 +67,7 @@ class WorkerTaskExecutor(executor.TaskExecutor):
|
||||
self._helpers.bind(lambda: tu.daemon_thread(self._proxy.start),
|
||||
after_start=lambda t: self._proxy.wait(),
|
||||
before_join=lambda t: self._proxy.stop())
|
||||
p_worker = periodic.PeriodicWorker.create([self._finder])
|
||||
p_worker = periodics.PeriodicWorker.create([self._finder])
|
||||
if p_worker:
|
||||
self._helpers.bind(lambda: tu.daemon_thread(p_worker.start),
|
||||
before_join=lambda t: p_worker.stop(),
|
||||
|
||||
@@ -15,11 +15,11 @@
|
||||
# under the License.
|
||||
|
||||
import abc
|
||||
import collections
|
||||
import threading
|
||||
|
||||
from concurrent import futures
|
||||
import jsonschema
|
||||
from jsonschema import exceptions as schema_exc
|
||||
import fasteners
|
||||
import futurist
|
||||
from oslo_utils import reflection
|
||||
from oslo_utils import timeutils
|
||||
import six
|
||||
@@ -28,8 +28,7 @@ from taskflow.engines.action_engine import executor
|
||||
from taskflow import exceptions as excp
|
||||
from taskflow import logging
|
||||
from taskflow.types import failure as ft
|
||||
from taskflow.types import timing as tt
|
||||
from taskflow.utils import lock_utils
|
||||
from taskflow.utils import schema_utils as su
|
||||
|
||||
# NOTE(skudriashev): These are protocol states and events, which are not
|
||||
# related to task states.
|
||||
@@ -98,12 +97,6 @@ NOTIFY = 'NOTIFY'
|
||||
REQUEST = 'REQUEST'
|
||||
RESPONSE = 'RESPONSE'
|
||||
|
||||
# Special jsonschema validation types/adjustments.
|
||||
_SCHEMA_TYPES = {
|
||||
# See: https://github.com/Julian/jsonschema/issues/148
|
||||
'array': (list, tuple),
|
||||
}
|
||||
|
||||
LOG = logging.getLogger(__name__)
|
||||
|
||||
|
||||
@@ -112,7 +105,8 @@ class Message(object):
|
||||
"""Base class for all message types."""
|
||||
|
||||
def __str__(self):
|
||||
return "<%s> %s" % (self.TYPE, self.to_dict())
|
||||
cls_name = reflection.get_class_name(self, fully_qualified=False)
|
||||
return "<%s> %s" % (cls_name, self.to_dict())
|
||||
|
||||
@abc.abstractmethod
|
||||
def to_dict(self):
|
||||
@@ -166,16 +160,25 @@ class Notify(Message):
|
||||
else:
|
||||
schema = cls.SENDER_SCHEMA
|
||||
try:
|
||||
jsonschema.validate(data, schema, types=_SCHEMA_TYPES)
|
||||
except schema_exc.ValidationError as e:
|
||||
su.schema_validate(data, schema)
|
||||
except su.ValidationError as e:
|
||||
cls_name = reflection.get_class_name(cls, fully_qualified=False)
|
||||
if response:
|
||||
raise excp.InvalidFormat("%s message response data not of the"
|
||||
" expected format: %s"
|
||||
% (cls.TYPE, e.message), e)
|
||||
excp.raise_with_cause(excp.InvalidFormat,
|
||||
"%s message response data not of the"
|
||||
" expected format: %s" % (cls_name,
|
||||
e.message),
|
||||
cause=e)
|
||||
else:
|
||||
raise excp.InvalidFormat("%s message sender data not of the"
|
||||
" expected format: %s"
|
||||
% (cls.TYPE, e.message), e)
|
||||
excp.raise_with_cause(excp.InvalidFormat,
|
||||
"%s message sender data not of the"
|
||||
" expected format: %s" % (cls_name,
|
||||
e.message),
|
||||
cause=e)
|
||||
|
||||
|
||||
_WorkUnit = collections.namedtuple('_WorkUnit', ['task_cls', 'task_name',
|
||||
'action', 'arguments'])
|
||||
|
||||
|
||||
class Request(Message):
|
||||
@@ -235,11 +238,11 @@ class Request(Message):
|
||||
self._event = ACTION_TO_EVENT[action]
|
||||
self._arguments = arguments
|
||||
self._kwargs = kwargs
|
||||
self._watch = tt.StopWatch(duration=timeout).start()
|
||||
self._watch = timeutils.StopWatch(duration=timeout).start()
|
||||
self._state = WAITING
|
||||
self._lock = threading.Lock()
|
||||
self._created_on = timeutils.utcnow()
|
||||
self._result = futures.Future()
|
||||
self._result = futurist.Future()
|
||||
self._result.atom = task
|
||||
self._notifier = task.notifier
|
||||
|
||||
@@ -332,7 +335,7 @@ class Request(Message):
|
||||
new_state, exc_info=True)
|
||||
return moved
|
||||
|
||||
@lock_utils.locked
|
||||
@fasteners.locked
|
||||
def transition(self, new_state):
|
||||
"""Transitions the request to a new state.
|
||||
|
||||
@@ -358,11 +361,60 @@ class Request(Message):
|
||||
@classmethod
|
||||
def validate(cls, data):
|
||||
try:
|
||||
jsonschema.validate(data, cls.SCHEMA, types=_SCHEMA_TYPES)
|
||||
except schema_exc.ValidationError as e:
|
||||
raise excp.InvalidFormat("%s message response data not of the"
|
||||
" expected format: %s"
|
||||
% (cls.TYPE, e.message), e)
|
||||
su.schema_validate(data, cls.SCHEMA)
|
||||
except su.ValidationError as e:
|
||||
cls_name = reflection.get_class_name(cls, fully_qualified=False)
|
||||
excp.raise_with_cause(excp.InvalidFormat,
|
||||
"%s message response data not of the"
|
||||
" expected format: %s" % (cls_name,
|
||||
e.message),
|
||||
cause=e)
|
||||
else:
|
||||
# Validate all failure dictionaries that *may* be present...
|
||||
failures = []
|
||||
if 'failures' in data:
|
||||
failures.extend(six.itervalues(data['failures']))
|
||||
result = data.get('result')
|
||||
if result is not None:
|
||||
result_data_type, result_data = result
|
||||
if result_data_type == 'failure':
|
||||
failures.append(result_data)
|
||||
for fail_data in failures:
|
||||
ft.Failure.validate(fail_data)
|
||||
|
||||
@staticmethod
|
||||
def from_dict(data, task_uuid=None):
|
||||
"""Parses **validated** data into a work unit.
|
||||
|
||||
All :py:class:`~taskflow.types.failure.Failure` objects that have been
|
||||
converted to dict(s) on the remote side will now be converted back
|
||||
to :py:class:`~taskflow.types.failure.Failure` objects.
|
||||
"""
|
||||
task_cls = data['task_cls']
|
||||
task_name = data['task_name']
|
||||
action = data['action']
|
||||
arguments = data.get('arguments', {})
|
||||
result = data.get('result')
|
||||
failures = data.get('failures')
|
||||
# These arguments will eventually be given to the task executor
|
||||
# so they need to be in a format it will accept (and using keyword
|
||||
# argument names that it accepts)...
|
||||
arguments = {
|
||||
'arguments': arguments,
|
||||
}
|
||||
if task_uuid is not None:
|
||||
arguments['task_uuid'] = task_uuid
|
||||
if result is not None:
|
||||
result_data_type, result_data = result
|
||||
if result_data_type == 'failure':
|
||||
arguments['result'] = ft.Failure.from_dict(result_data)
|
||||
else:
|
||||
arguments['result'] = result_data
|
||||
if failures is not None:
|
||||
arguments['failures'] = {}
|
||||
for task, fail_data in six.iteritems(failures):
|
||||
arguments['failures'][task] = ft.Failure.from_dict(fail_data)
|
||||
return _WorkUnit(task_cls, task_name, action, arguments)
|
||||
|
||||
|
||||
class Response(Message):
|
||||
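A hedged sketch of the round trip the new ``validate``/``from_dict`` pair enables on the worker side. The payload keys mirror those read above, but the authoritative format is ``Request.SCHEMA``, so the values (and the uuid) are illustrative placeholders only:

from taskflow.engines.worker_based import protocol as pr

request = {
    'task_cls': 'my_pkg.tasks.EchoTask',   # hypothetical task class path
    'task_name': 'echo-1',
    'action': 'execute',
    'arguments': {'value': 42},
}
pr.Request.validate(request)   # raises exceptions.InvalidFormat when malformed
work = pr.Request.from_dict(request, task_uuid='a-task-uuid')  # placeholder uuid
# 'work' is a _WorkUnit named tuple; any serialized failures have been
# converted back into Failure objects inside work.arguments.
print(work.task_cls, work.action, work.arguments)
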
@@ -455,8 +507,15 @@ class Response(Message):
|
||||
@classmethod
|
||||
def validate(cls, data):
|
||||
try:
|
||||
jsonschema.validate(data, cls.SCHEMA, types=_SCHEMA_TYPES)
|
||||
except schema_exc.ValidationError as e:
|
||||
raise excp.InvalidFormat("%s message response data not of the"
|
||||
" expected format: %s"
|
||||
% (cls.TYPE, e.message), e)
|
||||
su.schema_validate(data, cls.SCHEMA)
|
||||
except su.ValidationError as e:
|
||||
cls_name = reflection.get_class_name(cls, fully_qualified=False)
|
||||
excp.raise_with_cause(excp.InvalidFormat,
|
||||
"%s message response data not of the"
|
||||
" expected format: %s" % (cls_name,
|
||||
e.message),
|
||||
cause=e)
|
||||
else:
|
||||
state = data['state']
|
||||
if state == FAILURE and 'result' in data:
|
||||
ft.Failure.validate(data['result'])
|
||||
|
||||
@@ -15,6 +15,7 @@
|
||||
# under the License.
|
||||
|
||||
import collections
|
||||
import threading
|
||||
|
||||
import kombu
|
||||
from kombu import exceptions as kombu_exceptions
|
||||
@@ -22,7 +23,6 @@ import six
|
||||
|
||||
from taskflow.engines.worker_based import dispatcher
|
||||
from taskflow import logging
|
||||
from taskflow.utils import threading_utils
|
||||
|
||||
LOG = logging.getLogger(__name__)
|
||||
|
||||
@@ -75,7 +75,7 @@ class Proxy(object):
|
||||
self._topic = topic
|
||||
self._exchange_name = exchange
|
||||
self._on_wait = on_wait
|
||||
self._running = threading_utils.Event()
|
||||
self._running = threading.Event()
|
||||
self._dispatcher = dispatcher.TypeDispatcher(
|
||||
# NOTE(skudriashev): Process all incoming messages only if proxy is
|
||||
# running, otherwise requeue them.
|
||||
|
||||
@@ -17,14 +17,14 @@
|
||||
import functools
|
||||
|
||||
from oslo_utils import reflection
|
||||
import six
|
||||
from oslo_utils import timeutils
|
||||
|
||||
from taskflow.engines.worker_based import dispatcher
|
||||
from taskflow.engines.worker_based import protocol as pr
|
||||
from taskflow.engines.worker_based import proxy
|
||||
from taskflow import logging
|
||||
from taskflow.types import failure as ft
|
||||
from taskflow.types import notifier as nt
|
||||
from taskflow.types import timing as tt
|
||||
from taskflow.utils import kombu_utils as ku
|
||||
from taskflow.utils import misc
|
||||
|
||||
@@ -38,14 +38,13 @@ class Server(object):
|
||||
url=None, transport=None, transport_options=None,
|
||||
retry_options=None):
|
||||
type_handlers = {
|
||||
pr.NOTIFY: [
|
||||
pr.NOTIFY: dispatcher.Handler(
|
||||
self._delayed_process(self._process_notify),
|
||||
functools.partial(pr.Notify.validate, response=False),
|
||||
],
|
||||
pr.REQUEST: [
|
||||
validator=functools.partial(pr.Notify.validate,
|
||||
response=False)),
|
||||
pr.REQUEST: dispatcher.Handler(
|
||||
self._delayed_process(self._process_request),
|
||||
pr.Request.validate,
|
||||
],
|
||||
validator=pr.Request.validate),
|
||||
}
|
||||
self._executor = executor
|
||||
self._proxy = proxy.Proxy(topic, exchange,
|
||||
@@ -77,7 +76,7 @@ class Server(object):
|
||||
def _on_receive(content, message):
|
||||
LOG.debug("Submitting message '%s' for execution in the"
|
||||
" future to '%s'", ku.DelayedPretty(message), func_name)
|
||||
watch = tt.StopWatch()
|
||||
watch = timeutils.StopWatch()
|
||||
watch.start()
|
||||
try:
|
||||
self._executor.submit(_on_run, watch, content, message)
|
||||
@@ -94,32 +93,6 @@ class Server(object):
|
||||
def connection_details(self):
|
||||
return self._proxy.connection_details
|
||||
|
||||
@staticmethod
|
||||
def _parse_request(task_cls, task_name, action, arguments, result=None,
|
||||
failures=None, **kwargs):
|
||||
"""Parse request before it can be further processed.
|
||||
|
||||
All `failure.Failure` objects that have been converted to dict on the
|
||||
remote side will now converted back to `failure.Failure` objects.
|
||||
"""
|
||||
# These arguments will eventually be given to the task executor
|
||||
# so they need to be in a format it will accept (and using keyword
|
||||
# argument names that it accepts)...
|
||||
arguments = {
|
||||
'arguments': arguments,
|
||||
}
|
||||
if result is not None:
|
||||
data_type, data = result
|
||||
if data_type == 'failure':
|
||||
arguments['result'] = ft.Failure.from_dict(data)
|
||||
else:
|
||||
arguments['result'] = data
|
||||
if failures is not None:
|
||||
arguments['failures'] = {}
|
||||
for key, data in six.iteritems(failures):
|
||||
arguments['failures'][key] = ft.Failure.from_dict(data)
|
||||
return (task_cls, task_name, action, arguments)
|
||||
|
||||
@staticmethod
|
||||
def _parse_message(message):
|
||||
"""Extracts required attributes out of the messages properties.
|
||||
@@ -199,11 +172,9 @@ class Server(object):
|
||||
reply_callback = functools.partial(self._reply, True, reply_to,
|
||||
task_uuid)
|
||||
|
||||
# parse request to get task name, action and action arguments
|
||||
# Parse the request to get the activity/work to perform.
|
||||
try:
|
||||
bundle = self._parse_request(**request)
|
||||
task_cls, task_name, action, arguments = bundle
|
||||
arguments['task_uuid'] = task_uuid
|
||||
work = pr.Request.from_dict(request, task_uuid=task_uuid)
|
||||
except ValueError:
|
||||
with misc.capture_failure() as failure:
|
||||
LOG.warn("Failed to parse request contents from message '%s'",
|
||||
@@ -211,34 +182,35 @@ class Server(object):
|
||||
reply_callback(result=failure.to_dict())
|
||||
return
|
||||
|
||||
# get task endpoint
|
||||
# Now fetch the task endpoint (and action handler on it).
|
||||
try:
|
||||
endpoint = self._endpoints[task_cls]
|
||||
endpoint = self._endpoints[work.task_cls]
|
||||
except KeyError:
|
||||
with misc.capture_failure() as failure:
|
||||
LOG.warn("The '%s' task endpoint does not exist, unable"
|
||||
" to continue processing request message '%s'",
|
||||
task_cls, ku.DelayedPretty(message), exc_info=True)
|
||||
work.task_cls, ku.DelayedPretty(message),
|
||||
exc_info=True)
|
||||
reply_callback(result=failure.to_dict())
|
||||
return
|
||||
else:
|
||||
try:
|
||||
handler = getattr(endpoint, action)
|
||||
handler = getattr(endpoint, work.action)
|
||||
except AttributeError:
|
||||
with misc.capture_failure() as failure:
|
||||
LOG.warn("The '%s' handler does not exist on task endpoint"
|
||||
" '%s', unable to continue processing request"
|
||||
" message '%s'", action, endpoint,
|
||||
" message '%s'", work.action, endpoint,
|
||||
ku.DelayedPretty(message), exc_info=True)
|
||||
reply_callback(result=failure.to_dict())
|
||||
return
|
||||
else:
|
||||
try:
|
||||
task = endpoint.generate(name=task_name)
|
||||
task = endpoint.generate(name=work.task_name)
|
||||
except Exception:
|
||||
with misc.capture_failure() as failure:
|
||||
LOG.warn("The '%s' task '%s' generation for request"
|
||||
" message '%s' failed", endpoint, action,
|
||||
" message '%s' failed", endpoint, work.action,
|
||||
ku.DelayedPretty(message), exc_info=True)
|
||||
reply_callback(result=failure.to_dict())
|
||||
return
|
||||
@@ -246,7 +218,7 @@ class Server(object):
|
||||
if not reply_callback(state=pr.RUNNING):
|
||||
return
|
||||
|
||||
# associate *any* events this task emits with a proxy that will
|
||||
# Associate *any* events this task emits with a proxy that will
|
||||
# emit them back to the engine... for handling at the engine side
|
||||
# of things...
|
||||
if task.notifier.can_be_registered(nt.Notifier.ANY):
|
||||
@@ -254,22 +226,23 @@ class Server(object):
|
||||
functools.partial(self._on_event,
|
||||
reply_to, task_uuid))
|
||||
elif isinstance(task.notifier, nt.RestrictedNotifier):
|
||||
# only proxy the allowable events then...
|
||||
# Only proxy the allowable events then...
|
||||
for event_type in task.notifier.events_iter():
|
||||
task.notifier.register(event_type,
|
||||
functools.partial(self._on_event,
|
||||
reply_to, task_uuid))
|
||||
|
||||
# perform the task action
|
||||
# Perform the task action.
|
||||
try:
|
||||
result = handler(task, **arguments)
|
||||
result = handler(task, **work.arguments)
|
||||
except Exception:
|
||||
with misc.capture_failure() as failure:
|
||||
LOG.warn("The '%s' endpoint '%s' execution for request"
|
||||
" message '%s' failed", endpoint, action,
|
||||
" message '%s' failed", endpoint, work.action,
|
||||
ku.DelayedPretty(message), exc_info=True)
|
||||
reply_callback(result=failure.to_dict())
|
||||
else:
|
||||
# And be done with it!
|
||||
if isinstance(result, ft.Failure):
|
||||
reply_callback(result=result.to_dict())
|
||||
else:
|
||||
|
||||
@@ -20,15 +20,16 @@ import itertools
|
||||
import random
|
||||
import threading
|
||||
|
||||
from futurist import periodics
|
||||
from oslo_utils import reflection
|
||||
from oslo_utils import timeutils
|
||||
import six
|
||||
|
||||
from taskflow.engines.worker_based import dispatcher
|
||||
from taskflow.engines.worker_based import protocol as pr
|
||||
from taskflow import logging
|
||||
from taskflow.types import cache as base
|
||||
from taskflow.types import notifier
|
||||
from taskflow.types import periodic
|
||||
from taskflow.types import timing as tt
|
||||
from taskflow.utils import kombu_utils as ku
|
||||
|
||||
LOG = logging.getLogger(__name__)
|
||||
@@ -122,7 +123,7 @@ class WorkerFinder(object):
|
||||
"""
|
||||
if workers <= 0:
|
||||
raise ValueError("Worker amount must be greater than zero")
|
||||
watch = tt.StopWatch(duration=timeout)
|
||||
watch = timeutils.StopWatch(duration=timeout)
|
||||
watch.start()
|
||||
with self._cond:
|
||||
while self._total_workers() < workers:
|
||||
@@ -165,10 +166,10 @@ class ProxyWorkerFinder(WorkerFinder):
|
||||
self._workers = {}
|
||||
self._uuid = uuid
|
||||
self._proxy.dispatcher.type_handlers.update({
|
||||
pr.NOTIFY: [
|
||||
pr.NOTIFY: dispatcher.Handler(
|
||||
self._process_response,
|
||||
functools.partial(pr.Notify.validate, response=True),
|
||||
],
|
||||
validator=functools.partial(pr.Notify.validate,
|
||||
response=True)),
|
||||
})
|
||||
self._counter = itertools.count()
|
||||
|
||||
@@ -179,7 +180,7 @@ class ProxyWorkerFinder(WorkerFinder):
|
||||
else:
|
||||
return TopicWorker(topic, tasks)
|
||||
|
||||
@periodic.periodic(pr.NOTIFY_PERIOD)
|
||||
@periodics.periodic(pr.NOTIFY_PERIOD, run_immediately=True)
|
||||
def beat(self):
|
||||
"""Cyclically called to publish notify message to each topic."""
|
||||
self._proxy.publish(pr.Notify(), self._topics, reply_to=self._uuid)
|
||||
|
||||
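A hedged sketch of the ``futurist.periodics`` pattern this change migrates to (replacing ``taskflow.types.periodic``); the interval and class are illustrative, while the decorator and worker calls mirror the ones used above:

import threading

from futurist import periodics

class Heartbeater(object):
    @periodics.periodic(2.0, run_immediately=True)
    def beat(self):
        print("beat")

worker = periodics.PeriodicWorker.create([Heartbeater()])
if worker:
    # Run the periodic loop in a background thread, as the executor does above.
    t = threading.Thread(target=worker.start)
    t.daemon = True
    t.start()
    # ... when shutting down: worker.stop() then t.join()
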
@@ -20,47 +20,17 @@ import socket
|
||||
import string
|
||||
import sys
|
||||
|
||||
import futurist
|
||||
from oslo_utils import reflection
|
||||
|
||||
from taskflow.engines.worker_based import endpoint
|
||||
from taskflow.engines.worker_based import server
|
||||
from taskflow import logging
|
||||
from taskflow import task as t_task
|
||||
from taskflow.types import futures
|
||||
from taskflow.utils import misc
|
||||
from taskflow.utils import threading_utils as tu
|
||||
from taskflow import version
|
||||
|
||||
BANNER_TEMPLATE = string.Template("""
|
||||
TaskFlow v${version} WBE worker.
|
||||
Connection details:
|
||||
Driver = $transport_driver
|
||||
Exchange = $exchange
|
||||
Topic = $topic
|
||||
Transport = $transport_type
|
||||
Uri = $connection_uri
|
||||
Powered by:
|
||||
Executor = $executor_type
|
||||
Thread count = $executor_thread_count
|
||||
Supported endpoints:$endpoints
|
||||
System details:
|
||||
Hostname = $hostname
|
||||
Pid = $pid
|
||||
Platform = $platform
|
||||
Python = $python
|
||||
Thread id = $thread_id
|
||||
""".strip())
|
||||
BANNER_TEMPLATE.defaults = {
|
||||
# These values may not be possible to fetch/known, default to unknown...
|
||||
'pid': '???',
|
||||
'hostname': '???',
|
||||
'executor_thread_count': '???',
|
||||
'endpoints': ' %s' % ([]),
|
||||
# These are static (avoid refetching...)
|
||||
'version': version.version_string(),
|
||||
'python': sys.version.split("\n", 1)[0].strip(),
|
||||
}
|
||||
|
||||
LOG = logging.getLogger(__name__)
|
||||
|
||||
|
||||
@@ -88,6 +58,39 @@ class Worker(object):
|
||||
(see: :py:attr:`~.proxy.Proxy.DEFAULT_RETRY_OPTIONS`)
|
||||
"""
|
||||
|
||||
BANNER_TEMPLATE = string.Template("""
|
||||
TaskFlow v${version} WBE worker.
|
||||
Connection details:
|
||||
Driver = $transport_driver
|
||||
Exchange = $exchange
|
||||
Topic = $topic
|
||||
Transport = $transport_type
|
||||
Uri = $connection_uri
|
||||
Powered by:
|
||||
Executor = $executor_type
|
||||
Thread count = $executor_thread_count
|
||||
Supported endpoints:$endpoints
|
||||
System details:
|
||||
Hostname = $hostname
|
||||
Pid = $pid
|
||||
Platform = $platform
|
||||
Python = $python
|
||||
Thread id = $thread_id
|
||||
""".strip())
|
||||
|
||||
# See: http://bugs.python.org/issue13173 for why we are doing this...
|
||||
BANNER_TEMPLATE.defaults = {
|
||||
# These values may not be possible to fetch/know; default
|
||||
# to ??? to represent that they are unknown...
|
||||
'pid': '???',
|
||||
'hostname': '???',
|
||||
'executor_thread_count': '???',
|
||||
'endpoints': ' %s' % ([]),
|
||||
# These are static (avoid refetching...)
|
||||
'version': version.version_string(),
|
||||
'python': sys.version.split("\n", 1)[0].strip(),
|
||||
}
|
||||
|
||||
def __init__(self, exchange, topic, tasks,
|
||||
executor=None, threads_count=None, url=None,
|
||||
transport=None, transport_options=None,
|
||||
@@ -95,13 +98,9 @@ class Worker(object):
|
||||
self._topic = topic
|
||||
self._executor = executor
|
||||
self._owns_executor = False
|
||||
self._threads_count = -1
|
||||
if self._executor is None:
|
||||
if threads_count is not None:
|
||||
self._threads_count = int(threads_count)
|
||||
else:
|
||||
self._threads_count = tu.get_optimal_thread_count()
|
||||
self._executor = futures.ThreadPoolExecutor(self._threads_count)
|
||||
self._executor = futurist.ThreadPoolExecutor(
|
||||
max_workers=threads_count)
|
||||
self._owns_executor = True
|
||||
self._endpoints = self._derive_endpoints(tasks)
|
||||
self._exchange = exchange
|
||||
@@ -119,7 +118,10 @@ class Worker(object):
|
||||
|
||||
def _generate_banner(self):
|
||||
"""Generates a banner that can be useful to display before running."""
|
||||
tpl_params = {}
|
||||
try:
|
||||
tpl_params = dict(self.BANNER_TEMPLATE.defaults)
|
||||
except AttributeError:
|
||||
tpl_params = {}
|
||||
connection_details = self._server.connection_details
|
||||
transport = connection_details.transport
|
||||
if transport.driver_version:
|
||||
@@ -133,8 +135,9 @@ class Worker(object):
|
||||
tpl_params['transport_type'] = transport.driver_type
|
||||
tpl_params['connection_uri'] = connection_details.uri
|
||||
tpl_params['executor_type'] = reflection.get_class_name(self._executor)
|
||||
if self._threads_count != -1:
|
||||
tpl_params['executor_thread_count'] = self._threads_count
|
||||
threads_count = getattr(self._executor, 'max_workers', None)
|
||||
if threads_count is not None:
|
||||
tpl_params['executor_thread_count'] = threads_count
|
||||
if self._endpoints:
|
||||
pretty_endpoints = []
|
||||
for ep in self._endpoints:
|
||||
@@ -151,8 +154,7 @@ class Worker(object):
|
||||
pass
|
||||
tpl_params['platform'] = platform.platform()
|
||||
tpl_params['thread_id'] = tu.get_ident()
|
||||
banner = BANNER_TEMPLATE.substitute(BANNER_TEMPLATE.defaults,
|
||||
**tpl_params)
|
||||
banner = self.BANNER_TEMPLATE.substitute(**tpl_params)
|
||||
# NOTE(harlowja): this is needed since the template in this file
|
||||
# will always have newlines that end with '\n' (even on different
|
||||
# platforms due to the way this source file is encoded) so we have
|
||||
|
||||
204
taskflow/examples/99_bottles.py
Normal file
@@ -0,0 +1,204 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright (C) 2015 Yahoo! Inc. All Rights Reserved.
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License"); you may
|
||||
# not use this file except in compliance with the License. You may obtain
|
||||
# a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
|
||||
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
|
||||
# License for the specific language governing permissions and limitations
|
||||
# under the License.
|
||||
|
||||
import contextlib
|
||||
import logging
|
||||
import os
|
||||
import sys
|
||||
import time
|
||||
|
||||
top_dir = os.path.abspath(os.path.join(os.path.dirname(__file__),
|
||||
os.pardir,
|
||||
os.pardir))
|
||||
sys.path.insert(0, top_dir)
|
||||
|
||||
from taskflow.conductors import backends as conductor_backends
|
||||
from taskflow import engines
|
||||
from taskflow.jobs import backends as job_backends
|
||||
from taskflow.patterns import linear_flow as lf
|
||||
from taskflow.persistence import backends as persistence_backends
|
||||
from taskflow.persistence import logbook
|
||||
from taskflow import task
|
||||
from taskflow.types import timing
|
||||
|
||||
from oslo_utils import uuidutils
|
||||
|
||||
# Instructions!
|
||||
#
|
||||
# 1. Install zookeeper (or change host listed below)
|
||||
# 2. Download this example, place in file '99_bottles.py'
|
||||
# 3. Run `python 99_bottles.py p` to place a song request onto the jobboard
|
||||
# 4. Run `python 99_bottles.py c` a few times (in different shells)
|
||||
# 5. On demand kill previously listed processes created in (4) and watch
|
||||
# the work resume on another process (and repeat)
|
||||
# 6. Keep enough workers alive to eventually finish the song (if desired).
|
||||
|
||||
ME = os.getpid()
|
||||
ZK_HOST = "localhost:2181"
|
||||
JB_CONF = {
|
||||
'hosts': ZK_HOST,
|
||||
'board': 'zookeeper',
|
||||
'path': '/taskflow/99-bottles-demo',
|
||||
}
|
||||
PERSISTENCE_URI = r"sqlite:////tmp/bottles.db"
|
||||
TAKE_DOWN_DELAY = 1.0
|
||||
PASS_AROUND_DELAY = 3.0
|
||||
HOW_MANY_BOTTLES = 99
|
||||
|
||||
|
||||
class TakeABottleDown(task.Task):
|
||||
def execute(self, bottles_left):
|
||||
sys.stdout.write('Take one down, ')
|
||||
sys.stdout.flush()
|
||||
time.sleep(TAKE_DOWN_DELAY)
|
||||
return bottles_left - 1
|
||||
|
||||
|
||||
class PassItAround(task.Task):
|
||||
def execute(self):
|
||||
sys.stdout.write('pass it around, ')
|
||||
sys.stdout.flush()
|
||||
time.sleep(PASS_AROUND_DELAY)
|
||||
|
||||
|
||||
class Conclusion(task.Task):
|
||||
def execute(self, bottles_left):
|
||||
sys.stdout.write('%s bottles of beer on the wall...\n' % bottles_left)
|
||||
sys.stdout.flush()
|
||||
|
||||
|
||||
def make_bottles(count):
|
||||
# This is the function that will be called to generate the workflow
|
||||
# and will also be called to regenerate it on resumption so that work
|
||||
# can continue from where it last left off...
|
||||
|
||||
s = lf.Flow("bottle-song")
|
||||
|
||||
take_bottle = TakeABottleDown("take-bottle-%s" % count,
|
||||
inject={'bottles_left': count},
|
||||
provides='bottles_left')
|
||||
pass_it = PassItAround("pass-%s-around" % count)
|
||||
next_bottles = Conclusion("next-bottles-%s" % (count - 1))
|
||||
s.add(take_bottle, pass_it, next_bottles)
|
||||
|
||||
for bottle in reversed(list(range(1, count))):
|
||||
take_bottle = TakeABottleDown("take-bottle-%s" % bottle,
|
||||
provides='bottles_left')
|
||||
pass_it = PassItAround("pass-%s-around" % bottle)
|
||||
next_bottles = Conclusion("next-bottles-%s" % (bottle - 1))
|
||||
s.add(take_bottle, pass_it, next_bottles)
|
||||
|
||||
return s
|
||||
|
||||
|
||||
def run_conductor():
|
||||
# This continuously consumes jobs until it is stopped via ctrl-c or other
|
||||
# kill signal...
|
||||
|
||||
event_watches = {}
|
||||
|
||||
# This will be triggered by the conductor doing various activities
|
||||
# with engines, and is quite nice to be able to see the various timing
|
||||
# segments (which is useful for debugging, or watching, or figuring out
|
||||
# where to optimize).
|
||||
def on_conductor_event(event, details):
|
||||
print("Event '%s' has been received..." % event)
|
||||
print("Details = %s" % details)
|
||||
if event.endswith("_start"):
|
||||
w = timing.StopWatch()
|
||||
w.start()
|
||||
base_event = event[0:-len("_start")]
|
||||
event_watches[base_event] = w
|
||||
if event.endswith("_end"):
|
||||
base_event = event[0:-len("_end")]
|
||||
try:
|
||||
w = event_watches.pop(base_event)
|
||||
w.stop()
|
||||
print("It took %0.3f seconds for event '%s' to finish"
|
||||
% (w.elapsed(), base_event))
|
||||
except KeyError:
|
||||
pass
|
||||
|
||||
print("Starting conductor with pid: %s" % ME)
|
||||
my_name = "conductor-%s" % ME
|
||||
persist_backend = persistence_backends.fetch(PERSISTENCE_URI)
|
||||
with contextlib.closing(persist_backend):
|
||||
with contextlib.closing(persist_backend.get_connection()) as conn:
|
||||
conn.upgrade()
|
||||
job_backend = job_backends.fetch(my_name, JB_CONF,
|
||||
persistence=persist_backend)
|
||||
job_backend.connect()
|
||||
with contextlib.closing(job_backend):
|
||||
cond = conductor_backends.fetch('blocking', my_name, job_backend,
|
||||
persistence=persist_backend)
|
||||
cond.notifier.register(cond.notifier.ANY, on_conductor_event)
|
||||
# Run forever, and kill -9 or ctrl-c me...
|
||||
try:
|
||||
cond.run()
|
||||
finally:
|
||||
cond.stop()
|
||||
cond.wait()
|
||||
|
||||
|
||||
def run_poster():
|
||||
# This just posts a single job and then ends...
|
||||
print("Starting poster with pid: %s" % ME)
|
||||
my_name = "poster-%s" % ME
|
||||
persist_backend = persistence_backends.fetch(PERSISTENCE_URI)
|
||||
with contextlib.closing(persist_backend):
|
||||
with contextlib.closing(persist_backend.get_connection()) as conn:
|
||||
conn.upgrade()
|
||||
job_backend = job_backends.fetch(my_name, JB_CONF,
|
||||
persistence=persist_backend)
|
||||
job_backend.connect()
|
||||
with contextlib.closing(job_backend):
|
||||
# Create information in the persistence backend about the
|
||||
# unit of work we want to complete and the factory that
|
||||
# can be called to create the tasks that the work unit needs
|
||||
# to be done.
|
||||
lb = logbook.LogBook("post-from-%s" % my_name)
|
||||
fd = logbook.FlowDetail("song-from-%s" % my_name,
|
||||
uuidutils.generate_uuid())
|
||||
lb.add(fd)
|
||||
with contextlib.closing(persist_backend.get_connection()) as conn:
|
||||
conn.save_logbook(lb)
|
||||
engines.save_factory_details(fd, make_bottles,
|
||||
[HOW_MANY_BOTTLES], {},
|
||||
backend=persist_backend)
|
||||
# Post, and be done with it!
|
||||
jb = job_backend.post("song-from-%s" % my_name, book=lb)
|
||||
print("Posted: %s" % jb)
|
||||
print("Goodbye...")
|
||||
|
||||
|
||||
def main():
|
||||
if len(sys.argv) == 1:
|
||||
sys.stderr.write("%s p|c\n" % os.path.basename(sys.argv[0]))
|
||||
elif sys.argv[1] in ('p', 'c'):
|
||||
if sys.argv[-1] == "v":
|
||||
logging.basicConfig(level=5)
|
||||
else:
|
||||
logging.basicConfig(level=logging.ERROR)
|
||||
if sys.argv[1] == 'p':
|
||||
run_poster()
|
||||
else:
|
||||
run_conductor()
|
||||
else:
|
||||
sys.stderr.write("%s p|c (v?)\n" % os.path.basename(sys.argv[0]))
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
main()
|
||||
@@ -38,7 +38,7 @@ from taskflow import task
|
||||
|
||||
|
||||
# In this example we show how a simple linear set of tasks can be executed
|
||||
# using local processes (and not threads or remote workers) with minimial (if
|
||||
# using local processes (and not threads or remote workers) with minimal (if
|
||||
# any) modification to those tasks to make them safe to run in this mode.
|
||||
#
|
||||
# This is useful since it allows further scaling up your workflows when thread
|
||||
|
||||
@@ -38,7 +38,7 @@ ANY = notifier.Notifier.ANY
|
||||
import example_utils as eu # noqa
|
||||
|
||||
|
||||
# INTRO: This examples shows how a graph flow and linear flow can be used
|
||||
# INTRO: This example shows how a graph flow and linear flow can be used
|
||||
# together to execute dependent & non-dependent tasks by going through the
|
||||
# steps required to build a simplistic car (an assembly line if you will). It
|
||||
# also shows how raw functions can be wrapped into a task object instead of
|
||||
@@ -167,7 +167,7 @@ engine = taskflow.engines.load(flow, store={'spec': spec.copy()})
|
||||
# flow_watch function for flow state transitions, and registers the
|
||||
# same all (ANY) state transitions for task state transitions.
|
||||
engine.notifier.register(ANY, flow_watch)
|
||||
engine.task_notifier.register(ANY, task_watch)
|
||||
engine.atom_notifier.register(ANY, task_watch)
|
||||
|
||||
eu.print_wrapped("Building a car")
|
||||
engine.run()
|
||||
@@ -180,7 +180,7 @@ spec['doors'] = 5
|
||||
|
||||
engine = taskflow.engines.load(flow, store={'spec': spec.copy()})
|
||||
engine.notifier.register(ANY, flow_watch)
|
||||
engine.task_notifier.register(ANY, task_watch)
|
||||
engine.atom_notifier.register(ANY, task_watch)
|
||||
|
||||
eu.print_wrapped("Building a wrong car that doesn't match specification")
|
||||
try:
|
||||
|
||||
@@ -30,7 +30,7 @@ from taskflow.patterns import linear_flow as lf
|
||||
from taskflow.patterns import unordered_flow as uf
|
||||
from taskflow import task
|
||||
|
||||
# INTRO: This examples shows how a linear flow and a unordered flow can be
|
||||
# INTRO: These examples show how a linear flow and an unordered flow can be
|
||||
# used together to execute calculations in parallel and then use the
|
||||
# result for the next task/s. The adder task is used for all calculations
|
||||
# and argument bindings are used to set correct parameters for each task.
|
||||
|
||||
@@ -35,7 +35,7 @@ from taskflow.listeners import printing
|
||||
from taskflow.patterns import unordered_flow as uf
|
||||
from taskflow import task
|
||||
|
||||
# INTRO: This examples shows how unordered_flow can be used to create a large
|
||||
# INTRO: These examples show how unordered_flow can be used to create a large
|
||||
# number of fake volumes in parallel (or serially, depending on a constant that
|
||||
# can be easily changed).
|
||||
|
||||
|
||||
78
taskflow/examples/dump_memory_backend.py
Normal file
@@ -0,0 +1,78 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright (C) 2015 Yahoo! Inc. All Rights Reserved.
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License"); you may
|
||||
# not use this file except in compliance with the License. You may obtain
|
||||
# a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
|
||||
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
|
||||
# License for the specific language governing permissions and limitations
|
||||
# under the License.
|
||||
|
||||
import logging
|
||||
import os
|
||||
import sys
|
||||
|
||||
logging.basicConfig(level=logging.ERROR)
|
||||
|
||||
self_dir = os.path.abspath(os.path.dirname(__file__))
|
||||
top_dir = os.path.abspath(os.path.join(os.path.dirname(__file__),
|
||||
os.pardir,
|
||||
os.pardir))
|
||||
sys.path.insert(0, top_dir)
|
||||
sys.path.insert(0, self_dir)
|
||||
|
||||
from taskflow import engines
|
||||
from taskflow.patterns import linear_flow as lf
|
||||
from taskflow.persistence import backends
|
||||
from taskflow import task
|
||||
from taskflow.utils import persistence_utils as pu
|
||||
|
||||
# INTRO: in this example we create a dummy flow with a dummy task, and run
|
||||
# it using an in-memory backend, and pre/post run we dump out the contents
|
||||
# of the in-memory backend's tree structure (which can be quite useful to
|
||||
# look at for debugging or other analysis).
|
||||
|
||||
|
||||
class PrintTask(task.Task):
|
||||
def execute(self):
|
||||
print("Running '%s'" % self.name)
|
||||
|
||||
|
||||
backend = backends.fetch({
|
||||
'connection': 'memory://',
|
||||
})
|
||||
book, flow_detail = pu.temporary_flow_detail(backend=backend)
|
||||
|
||||
# Make a little flow and run it...
|
||||
f = lf.Flow('root')
|
||||
for alpha in ['a', 'b', 'c']:
|
||||
f.add(PrintTask(alpha))
|
||||
|
||||
e = engines.load(f, flow_detail=flow_detail,
|
||||
book=book, backend=backend)
|
||||
e.compile()
|
||||
e.prepare()
|
||||
|
||||
print("----------")
|
||||
print("Before run")
|
||||
print("----------")
|
||||
print(backend.memory.pformat())
|
||||
print("----------")
|
||||
|
||||
e.run()
|
||||
|
||||
print("---------")
|
||||
print("After run")
|
||||
print("---------")
|
||||
for path in backend.memory.ls_r(backend.memory.root_path, absolute=True):
|
||||
value = backend.memory[path]
|
||||
if value:
|
||||
print("%s -> %s" % (path, value))
|
||||
else:
|
||||
print("%s" % (path))
|
||||
@@ -31,8 +31,8 @@ from taskflow.patterns import linear_flow as lf
|
||||
from taskflow import task
|
||||
|
||||
# INTRO: This example walks through a miniature workflow which will do a
|
||||
# simple echo operation; during this execution a listener is assocated with
|
||||
# the engine to recieve all notifications about what the flow has performed,
|
||||
# simple echo operation; during this execution a listener is associated with
|
||||
# the engine to receive all notifications about what the flow has performed,
|
||||
# this example dumps that output to the stdout for viewing (at debug level
|
||||
# to show all the information which is possible).
|
||||
|
||||
|
||||
@@ -36,8 +36,8 @@ from taskflow.patterns import linear_flow as lf
|
||||
from taskflow import task
|
||||
from taskflow.utils import misc
|
||||
|
||||
# INTRO: This example walks through a miniature workflow which simulates a
|
||||
# the reception of a API request, creation of a database entry, driver
|
||||
# INTRO: This example walks through a miniature workflow which simulates
|
||||
# the reception of an API request, creation of a database entry, driver
|
||||
# activation (which invokes a 'fake' webservice) and final completion.
|
||||
#
|
||||
# This example also shows how a function/object (in this case the url sending)
|
||||
|
||||
@@ -80,12 +80,37 @@ store = {
|
||||
"y5": 9,
|
||||
}
|
||||
|
||||
# These are the expected values that should be created.
|
||||
unexpected = 0
|
||||
expected = [
|
||||
('x1', 4),
|
||||
('x2', 12),
|
||||
('x3', 16),
|
||||
('x4', 21),
|
||||
('x5', 20),
|
||||
('x6', 41),
|
||||
('x7', 82),
|
||||
]
|
||||
|
||||
result = taskflow.engines.run(
|
||||
flow, engine='serial', store=store)
|
||||
|
||||
print("Single threaded engine result %s" % result)
|
||||
for (name, value) in expected:
|
||||
actual = result.get(name)
|
||||
if actual != value:
|
||||
sys.stderr.write("%s != %s\n" % (actual, value))
|
||||
unexpected += 1
|
||||
|
||||
result = taskflow.engines.run(
|
||||
flow, engine='parallel', store=store)
|
||||
|
||||
print("Multi threaded engine result %s" % result)
|
||||
for (name, value) in expected:
|
||||
actual = result.get(name)
|
||||
if actual != value:
|
||||
sys.stderr.write("%s != %s\n" % (actual, value))
|
||||
unexpected += 1
|
||||
|
||||
if unexpected:
|
||||
sys.exit(1)
|
||||
|
||||
@@ -25,16 +25,17 @@ top_dir = os.path.abspath(os.path.join(os.path.dirname(__file__),
|
||||
os.pardir))
|
||||
sys.path.insert(0, top_dir)
|
||||
|
||||
import futurist
|
||||
|
||||
from taskflow import engines
|
||||
from taskflow.patterns import linear_flow as lf
|
||||
from taskflow.patterns import unordered_flow as uf
|
||||
from taskflow import task
|
||||
from taskflow.types import futures
|
||||
from taskflow.utils import eventlet_utils
|
||||
|
||||
|
||||
# INTRO: This is the defacto hello world equivalent for taskflow; it shows how
|
||||
# a overly simplistic workflow can be created that runs using different
|
||||
# an overly simplistic workflow can be created that runs using different
|
||||
# engines using different styles of execution (all can be used to run in
|
||||
# parallel if a workflow is provided that is parallelizable).
|
||||
|
||||
@@ -82,19 +83,19 @@ song.add(PrinterTask("conductor@begin",
|
||||
|
||||
# Run in parallel using eventlet green threads...
|
||||
if eventlet_utils.EVENTLET_AVAILABLE:
|
||||
with futures.GreenThreadPoolExecutor() as executor:
|
||||
with futurist.GreenThreadPoolExecutor() as executor:
|
||||
e = engines.load(song, executor=executor, engine='parallel')
|
||||
e.run()
|
||||
|
||||
|
||||
# Run in parallel using real threads...
|
||||
with futures.ThreadPoolExecutor(max_workers=1) as executor:
|
||||
with futurist.ThreadPoolExecutor(max_workers=1) as executor:
|
||||
e = engines.load(song, executor=executor, engine='parallel')
|
||||
e.run()
|
||||
|
||||
|
||||
# Run in parallel using external processes...
|
||||
with futures.ProcessPoolExecutor(max_workers=1) as executor:
|
||||
with futurist.ProcessPoolExecutor(max_workers=1) as executor:
|
||||
e = engines.load(song, executor=executor, engine='parallel')
|
||||
e.run()
|
||||
|
||||
|
||||
@@ -1,171 +0,0 @@
|
||||
# -*- encoding: utf-8 -*-
|
||||
#
|
||||
# Copyright © 2013 eNovance <licensing@enovance.com>
|
||||
#
|
||||
# Authors: Dan Krause <dan@dankrause.net>
|
||||
# Cyril Roelandt <cyril.roelandt@enovance.com>
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License"); you may
|
||||
# not use this file except in compliance with the License. You may obtain
|
||||
# a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
|
||||
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
|
||||
# License for the specific language governing permissions and limitations
|
||||
# under the License.
|
||||
|
||||
# This example shows how to use the job board feature.
|
||||
#
|
||||
# Let's start by creating some jobs:
|
||||
# $ python job_board_no_test.py create my-board my-job '{}'
|
||||
# $ python job_board_no_test.py create my-board my-job '{"foo": "bar"}'
|
||||
# $ python job_board_no_test.py create my-board my-job '{"foo": "baz"}'
|
||||
# $ python job_board_no_test.py create my-board my-job '{"foo": "barbaz"}'
|
||||
#
|
||||
# Make sure they were registered:
|
||||
# $ python job_board_no_test.py list my-board
|
||||
# 7277181a-1f83-473d-8233-f361615bae9e - {}
|
||||
# 84a396e8-d02e-450d-8566-d93cb68550c0 - {u'foo': u'bar'}
|
||||
# 4d355d6a-2c72-44a2-a558-19ae52e8ae2c - {u'foo': u'baz'}
|
||||
# cd9aae2c-fd64-416d-8ba0-426fa8e3d59c - {u'foo': u'barbaz'}
|
||||
#
|
||||
# Perform one job:
|
||||
# $ python job_board_no_test.py consume my-board \
|
||||
# 84a396e8-d02e-450d-8566-d93cb68550c0
|
||||
# Performing job 84a396e8-d02e-450d-8566-d93cb68550c0 with args \
|
||||
# {u'foo': u'bar'}
|
||||
# $ python job_board_no_test.py list my-board
|
||||
# 7277181a-1f83-473d-8233-f361615bae9e - {}
|
||||
# 4d355d6a-2c72-44a2-a558-19ae52e8ae2c - {u'foo': u'baz'}
|
||||
# cd9aae2c-fd64-416d-8ba0-426fa8e3d59c - {u'foo': u'barbaz'}
|
||||
#
|
||||
# Delete a job:
|
||||
# $ python job_board_no_test.py delete my-board \
|
||||
# cd9aae2c-fd64-416d-8ba0-426fa8e3d59c
|
||||
# $ python job_board_no_test.py list my-board
|
||||
# 7277181a-1f83-473d-8233-f361615bae9e - {}
|
||||
# 4d355d6a-2c72-44a2-a558-19ae52e8ae2c - {u'foo': u'baz'}
|
||||
#
|
||||
# Delete all the remaining jobs
|
||||
# $ python job_board_no_test.py clear my-board
|
||||
# $ python job_board_no_test.py list my-board
|
||||
# $
|
||||
|
||||
import argparse
|
||||
import contextlib
|
||||
import json
|
||||
import os
|
||||
import sys
|
||||
import tempfile
|
||||
|
||||
import taskflow.jobs.backends as job_backends
|
||||
from taskflow.persistence import logbook
|
||||
|
||||
import example_utils # noqa
|
||||
|
||||
|
||||
@contextlib.contextmanager
|
||||
def jobboard(*args, **kwargs):
|
||||
jb = job_backends.fetch(*args, **kwargs)
|
||||
jb.connect()
|
||||
yield jb
|
||||
jb.close()
|
||||
|
||||
|
||||
conf = {
|
||||
'board': 'zookeeper',
|
||||
'hosts': ['127.0.0.1:2181']
|
||||
}
|
||||
|
||||
|
||||
def consume_job(args):
|
||||
def perform_job(job):
|
||||
print("Performing job %s with args %s" % (job.uuid, job.details))
|
||||
|
||||
with jobboard(args.board_name, conf) as jb:
|
||||
for job in jb.iterjobs(ensure_fresh=True):
|
||||
if job.uuid == args.job_uuid:
|
||||
jb.claim(job, "test-client")
|
||||
perform_job(job)
|
||||
jb.consume(job, "test-client")
|
||||
|
||||
|
||||
def clear_jobs(args):
|
||||
with jobboard(args.board_name, conf) as jb:
|
||||
for job in jb.iterjobs(ensure_fresh=True):
|
||||
jb.claim(job, "test-client")
|
||||
jb.consume(job, "test-client")
|
||||
|
||||
|
||||
def create_job(args):
|
||||
store = json.loads(args.details)
|
||||
book = logbook.LogBook(args.job_name)
|
||||
if example_utils.SQLALCHEMY_AVAILABLE:
|
||||
persist_path = os.path.join(tempfile.gettempdir(), "persisting.db")
|
||||
backend_uri = "sqlite:///%s" % (persist_path)
|
||||
else:
|
||||
persist_path = os.path.join(tempfile.gettempdir(), "persisting")
|
||||
backend_uri = "file:///%s" % (persist_path)
|
||||
with example_utils.get_backend(backend_uri) as backend:
|
||||
backend.get_connection().save_logbook(book)
|
||||
with jobboard(args.board_name, conf, persistence=backend) as jb:
|
||||
jb.post(args.job_name, book, details=store)
|
||||
|
||||
|
||||
def list_jobs(args):
|
||||
with jobboard(args.board_name, conf) as jb:
|
||||
for job in jb.iterjobs(ensure_fresh=True):
|
||||
print("%s - %s" % (job.uuid, job.details))
|
||||
|
||||
|
||||
def delete_job(args):
|
||||
with jobboard(args.board_name, conf) as jb:
|
||||
for job in jb.iterjobs(ensure_fresh=True):
|
||||
if job.uuid == args.job_uuid:
|
||||
jb.claim(job, "test-client")
|
||||
jb.consume(job, "test-client")
|
||||
|
||||
|
||||
def main(argv):
|
||||
parser = argparse.ArgumentParser()
|
||||
subparsers = parser.add_subparsers(title='subcommands',
|
||||
description='valid subcommands',
|
||||
help='additional help')
|
||||
|
||||
# Consume command
|
||||
parser_consume = subparsers.add_parser('consume')
|
||||
parser_consume.add_argument('board_name')
|
||||
parser_consume.add_argument('job_uuid')
|
||||
parser_consume.set_defaults(func=consume_job)
|
||||
|
||||
# Clear command
|
||||
parser_consume = subparsers.add_parser('clear')
|
||||
parser_consume.add_argument('board_name')
|
||||
parser_consume.set_defaults(func=clear_jobs)
|
||||
|
||||
# Create command
|
||||
parser_create = subparsers.add_parser('create')
|
||||
parser_create.add_argument('board_name')
|
||||
parser_create.add_argument('job_name')
|
||||
parser_create.add_argument('details')
|
||||
parser_create.set_defaults(func=create_job)
|
||||
|
||||
# Delete command
|
||||
parser_delete = subparsers.add_parser('delete')
|
||||
parser_delete.add_argument('board_name')
|
||||
parser_delete.add_argument('job_uuid')
|
||||
parser_delete.set_defaults(func=delete_job)
|
||||
|
||||
# List command
|
||||
parser_list = subparsers.add_parser('list')
|
||||
parser_list.add_argument('board_name')
|
||||
parser_list.set_defaults(func=list_jobs)
|
||||
|
||||
args = parser.parse_args(argv)
|
||||
args.func(args)
|
||||
|
||||
if __name__ == '__main__':
|
||||
main(sys.argv[1:])
|
||||
@@ -30,6 +30,7 @@ top_dir = os.path.abspath(os.path.join(os.path.dirname(__file__),
|
||||
os.pardir))
|
||||
sys.path.insert(0, top_dir)
|
||||
|
||||
import six
|
||||
from six.moves import range as compat_range
|
||||
from zake import fake_client
|
||||
|
||||
@@ -40,7 +41,7 @@ from taskflow.utils import threading_utils
|
||||
# In this example we show how a jobboard can be used to post work for other
|
||||
# entities to work on. This example creates a set of jobs using one producer
|
||||
# thread (typically this would be split across many machines) and then having
|
||||
# other worker threads with there own jobboards select work using a given
|
||||
# other worker threads with their own jobboards select work using a given
|
||||
# filters [red/blue] and then perform that work (and consuming or abandoning
|
||||
# the job after it has been completed or failed).
|
||||
|
||||
@@ -66,7 +67,7 @@ PRODUCER_UNITS = 10
|
||||
|
||||
# How many units of work are expected to be produced (used so workers can
|
||||
# know when to stop running and shutdown, typically this would not be
|
||||
# a value but we have to limit this examples execution time to be less than
|
||||
# a value but we have to limit this example's execution time to be less than
|
||||
# infinity).
|
||||
EXPECTED_UNITS = PRODUCER_UNITS * PRODUCERS
|
||||
|
||||
@@ -150,6 +151,14 @@ def producer(ident, client):
|
||||
|
||||
|
||||
def main():
|
||||
if six.PY3:
|
||||
# TODO(harlowja): Hack to make eventlet work right, remove when the
|
||||
# following is fixed: https://github.com/eventlet/eventlet/issues/230
|
||||
from taskflow.utils import eventlet_utils as _eu # noqa
|
||||
try:
|
||||
import eventlet as _eventlet # noqa
|
||||
except ImportError:
|
||||
pass
|
||||
with contextlib.closing(fake_client.FakeClient()) as c:
|
||||
created = []
|
||||
for i in compat_range(0, PRODUCERS):
|
||||
|
||||
@@ -27,12 +27,12 @@ top_dir = os.path.abspath(os.path.join(os.path.dirname(__file__),
|
||||
os.pardir))
|
||||
sys.path.insert(0, top_dir)
|
||||
|
||||
import futurist
|
||||
from six.moves import range as compat_range
|
||||
|
||||
from taskflow import engines
|
||||
from taskflow.patterns import unordered_flow as uf
|
||||
from taskflow import task
|
||||
from taskflow.types import futures
|
||||
from taskflow.utils import eventlet_utils
|
||||
|
||||
# INTRO: This example walks through a miniature workflow which does a parallel
|
||||
@@ -98,9 +98,9 @@ def main():
|
||||
|
||||
# Now run it (using the specified executor)...
|
||||
if eventlet_utils.EVENTLET_AVAILABLE:
|
||||
executor = futures.GreenThreadPoolExecutor(max_workers=5)
|
||||
executor = futurist.GreenThreadPoolExecutor(max_workers=5)
|
||||
else:
|
||||
executor = futures.ThreadPoolExecutor(max_workers=5)
|
||||
executor = futurist.ThreadPoolExecutor(max_workers=5)
|
||||
try:
|
||||
e = engines.load(f, engine='parallel', executor=executor)
|
||||
for st in e.run_iter():
|
||||
|
||||
@@ -31,7 +31,7 @@ sys.path.insert(0, self_dir)
|
||||
|
||||
from taskflow import engines
|
||||
from taskflow.patterns import linear_flow as lf
|
||||
from taskflow.persistence import logbook
|
||||
from taskflow.persistence import models
|
||||
from taskflow import task
|
||||
from taskflow.utils import persistence_utils as p_utils
|
||||
|
||||
@@ -68,15 +68,15 @@ class ByeTask(task.Task):
|
||||
print("Bye!")
|
||||
|
||||
|
||||
# This generates your flow structure (at this stage nothing is ran).
|
||||
# This generates your flow structure (at this stage nothing is run).
|
||||
def make_flow(blowup=False):
|
||||
flow = lf.Flow("hello-world")
|
||||
flow.add(HiTask(), ByeTask(blowup))
|
||||
return flow
|
||||
|
||||
|
||||
# Persist the flow and task state here, if the file/dir exists already blowup
|
||||
# if not don't blowup, this allows a user to see both the modes and to see
|
||||
# Persist the flow and task state here, if the file/dir exists already blow up
|
||||
# if not don't blow up, this allows a user to see both the modes and to see
|
||||
# what is stored in each case.
|
||||
if eu.SQLALCHEMY_AVAILABLE:
|
||||
persist_path = os.path.join(tempfile.gettempdir(), "persisting.db")
|
||||
@@ -91,10 +91,10 @@ else:
|
||||
blowup = True
|
||||
|
||||
with eu.get_backend(backend_uri) as backend:
|
||||
# Make a flow that will blowup if the file doesn't exist previously, if it
|
||||
# did exist, assume we won't blowup (and therefore this shows the undo
|
||||
# Make a flow that will blow up if the file didn't exist previously, if it
|
||||
# did exist, assume we won't blow up (and therefore this shows the undo
|
||||
# and redo that a flow will go through).
|
||||
book = logbook.LogBook("my-test")
|
||||
book = models.LogBook("my-test")
|
||||
flow = make_flow(blowup=blowup)
|
||||
eu.print_wrapped("Running")
|
||||
try:
|
||||
|
||||
@@ -31,6 +31,7 @@ top_dir = os.path.abspath(os.path.join(os.path.dirname(__file__),
|
||||
sys.path.insert(0, top_dir)
|
||||
sys.path.insert(0, self_dir)
|
||||
|
||||
import futurist
|
||||
from oslo_utils import uuidutils
|
||||
|
||||
from taskflow import engines
|
||||
@@ -38,13 +39,12 @@ from taskflow import exceptions as exc
|
||||
from taskflow.patterns import graph_flow as gf
|
||||
from taskflow.patterns import linear_flow as lf
|
||||
from taskflow import task
|
||||
from taskflow.types import futures
|
||||
from taskflow.utils import eventlet_utils
|
||||
from taskflow.utils import persistence_utils as p_utils
|
||||
|
||||
import example_utils as eu # noqa
|
||||
|
||||
# INTRO: This examples shows how a hierarchy of flows can be used to create a
|
||||
# INTRO: These examples show how a hierarchy of flows can be used to create a
|
||||
# vm in a reliable & resumable manner using taskflow + a miniature version of
|
||||
# what nova does while booting a vm.
|
||||
|
||||
@@ -239,7 +239,7 @@ with eu.get_backend() as backend:
|
||||
# Set up how we want our engine to run, serial, parallel...
|
||||
executor = None
|
||||
if eventlet_utils.EVENTLET_AVAILABLE:
|
||||
executor = futures.GreenThreadPoolExecutor(5)
|
||||
executor = futurist.GreenThreadPoolExecutor(5)
|
||||
|
||||
# Create/fetch a logbook that will track the workflows work.
|
||||
book = None
|
||||
|
||||
@@ -39,7 +39,7 @@ from taskflow.utils import persistence_utils as p_utils
|
||||
|
||||
import example_utils # noqa
|
||||
|
||||
# INTRO: This examples shows how a hierarchy of flows can be used to create a
|
||||
# INTRO: These examples show how a hierarchy of flows can be used to create a
|
||||
# pseudo-volume in a reliable & resumable manner using taskflow + a miniature
|
||||
# version of what cinder does while creating a volume (very miniature).
|
||||
|
||||
|
||||
@@ -32,7 +32,7 @@ from taskflow import task
|
||||
|
||||
# INTRO: In this example we create a retry controller that receives a phone
|
||||
# directory and tries different phone numbers. The next task tries to call Jim
|
||||
# using the given number. If if is not a Jim's number, the tasks raises an
|
||||
# using the given number. If it is not a Jim's number, the task raises an
|
||||
# exception and retry controller takes the next number from the phone
|
||||
# directory and retries the call.
|
||||
#
|
||||
|
||||
@@ -37,7 +37,7 @@ from taskflow import task
|
||||
from taskflow.utils import persistence_utils
|
||||
|
||||
|
||||
# INTRO: This examples shows how to run a set of engines at the same time, each
|
||||
# INTRO: This example shows how to run a set of engines at the same time, each
|
||||
# running in different engines using a single thread of control to iterate over
|
||||
# each engine (which causes that engine to advance to its next state during
|
||||
# each iteration).
|
||||
|
||||
@@ -33,10 +33,10 @@ from taskflow.persistence import backends as persistence_backends
|
||||
from taskflow import task
|
||||
from taskflow.utils import persistence_utils
|
||||
|
||||
# INTRO: This examples shows how to run a engine using the engine iteration
|
||||
# INTRO: These examples show how to run an engine using the engine iteration
|
||||
# capability, in between iterations other activities occur (in this case a
|
||||
# value is output to stdout); but more complicated actions can occur at the
|
||||
# boundary when a engine yields its current state back to the caller.
|
||||
# boundary when an engine yields its current state back to the caller.
|
||||
|
||||
|
||||
class EchoNameTask(task.Task):
|
||||
|
||||
81
taskflow/examples/share_engine_thread.py
Normal file
@@ -0,0 +1,81 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright (C) 2012-2013 Yahoo! Inc. All Rights Reserved.
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License"); you may
|
||||
# not use this file except in compliance with the License. You may obtain
|
||||
# a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
|
||||
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
|
||||
# License for the specific language governing permissions and limitations
|
||||
# under the License.
|
||||
|
||||
import logging
|
||||
import os
|
||||
import random
|
||||
import sys
|
||||
import time
|
||||
|
||||
logging.basicConfig(level=logging.ERROR)
|
||||
|
||||
top_dir = os.path.abspath(os.path.join(os.path.dirname(__file__),
|
||||
os.pardir,
|
||||
os.pardir))
|
||||
sys.path.insert(0, top_dir)
|
||||
|
||||
import futurist
|
||||
import six
|
||||
|
||||
from taskflow import engines
|
||||
from taskflow.patterns import unordered_flow as uf
|
||||
from taskflow import task
|
||||
from taskflow.utils import threading_utils as tu
|
||||
|
||||
# INTRO: in this example we create 2 dummy flow(s), each with 2 dummy task(s), and
|
||||
# run them using a shared thread pool executor to show how a single executor can
|
||||
# be used with more than one engine (sharing the execution thread pool between
|
||||
# them); this allows for saving resources and reusing threads in situations
|
||||
# where this is beneficial.
|
||||
|
||||
|
||||
class DelayedTask(task.Task):
|
||||
def __init__(self, name):
|
||||
super(DelayedTask, self).__init__(name=name)
|
||||
self._wait_for = random.random()
|
||||
|
||||
def execute(self):
|
||||
print("Running '%s' in thread '%s'" % (self.name, tu.get_ident()))
|
||||
time.sleep(self._wait_for)
|
||||
|
||||
|
||||
f1 = uf.Flow("f1")
|
||||
f1.add(DelayedTask("f1-1"))
|
||||
f1.add(DelayedTask("f1-2"))
|
||||
|
||||
f2 = uf.Flow("f2")
|
||||
f2.add(DelayedTask("f2-1"))
|
||||
f2.add(DelayedTask("f2-2"))
|
||||
|
||||
# Run them all using the same futures (thread-pool based) executor...
|
||||
with futurist.ThreadPoolExecutor() as ex:
|
||||
e1 = engines.load(f1, engine='parallel', executor=ex)
|
||||
e2 = engines.load(f2, engine='parallel', executor=ex)
|
||||
iters = [e1.run_iter(), e2.run_iter()]
|
||||
# Iterate over a copy (so we can remove from the source list).
|
||||
cloned_iters = list(iters)
|
||||
while iters:
|
||||
# Run a single 'step' of each iterator, forcing each engine to perform
|
||||
# some work, then yield, and repeat until each iterator is consumed
|
||||
# and there is no more engine work to be done.
|
||||
for it in cloned_iters:
|
||||
try:
|
||||
six.next(it)
|
||||
except StopIteration:
|
||||
try:
|
||||
iters.remove(it)
|
||||
except ValueError:
|
||||
pass
|
||||
@@ -41,8 +41,8 @@ from taskflow import task
|
||||
# taskflow provides via tasks and flows makes it possible for you to easily at
|
||||
# a later time hook in a persistence layer (and then gain the functionality
|
||||
# that offers) when you decide the complexity of adding that layer in
|
||||
# is 'worth it' for your applications usage pattern (which certain applications
|
||||
# may not need).
|
||||
# is 'worth it' for your application's usage pattern (which certain
|
||||
# applications may not need).
|
||||
|
||||
|
||||
class CallJim(task.Task):
|
||||
|
||||
@@ -37,7 +37,7 @@ ANY = notifier.Notifier.ANY
|
||||
# a given ~phone~ number (provided as a function input) in a linear fashion
|
||||
# (one after the other).
|
||||
#
|
||||
# For a workflow which is serial this shows a extremely simple way
|
||||
# For a workflow which is serial this shows an extremely simple way
|
||||
# of structuring your tasks (the code that does the work) into a linear
|
||||
# sequence (the flow) and then passing the work off to an engine, with some
|
||||
# initial data to be run in a reliable manner.
|
||||
@@ -92,11 +92,11 @@ engine = taskflow.engines.load(flow, store={
|
||||
})
|
||||
|
||||
# This is where we attach our callback functions to the 2 different
|
||||
# notification objects that a engine exposes. The usage of a '*' (kleene star)
|
||||
# notification objects that an engine exposes. The usage of a ANY (kleene star)
|
||||
# here means that we want to be notified on all state changes, if you want to
|
||||
# restrict to a specific state change, just register that instead.
|
||||
engine.notifier.register(ANY, flow_watch)
|
||||
engine.task_notifier.register(ANY, task_watch)
|
||||
engine.atom_notifier.register(ANY, task_watch)
|
||||
|
||||
# And now run!
|
||||
engine.run()
|
||||
|
||||
@@ -31,7 +31,7 @@ from taskflow import engines
|
||||
from taskflow.patterns import linear_flow
|
||||
from taskflow import task
|
||||
|
||||
# INTRO: This examples shows how a task (in a linear/serial workflow) can
|
||||
# INTRO: This example shows how a task (in a linear/serial workflow) can
|
||||
# produce an output that can be then consumed/used by a downstream task.
|
||||
|
||||
|
||||
|
||||
@@ -27,9 +27,9 @@ top_dir = os.path.abspath(os.path.join(os.path.dirname(__file__),
|
||||
sys.path.insert(0, top_dir)
|
||||
sys.path.insert(0, self_dir)
|
||||
|
||||
# INTRO: this examples shows a simplistic map/reduce implementation where
|
||||
# INTRO: These examples show a simplistic map/reduce implementation where
|
||||
# a set of mapper(s) will sum a series of input numbers (in parallel) and
|
||||
# return there individual summed result. A reducer will then use those
|
||||
# return their individual summed result. A reducer will then use those
|
||||
# produced values and perform a final summation and this result will then be
|
||||
# printed (and verified to ensure the calculation was as expected).
|
||||
|
||||
|
||||
75
taskflow/examples/switch_graph_flow.py
Normal file
@@ -0,0 +1,75 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright (C) 2014 Yahoo! Inc. All Rights Reserved.
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License"); you may
|
||||
# not use this file except in compliance with the License. You may obtain
|
||||
# a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
|
||||
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
|
||||
# License for the specific language governing permissions and limitations
|
||||
# under the License.
|
||||
|
||||
import logging
|
||||
import os
|
||||
import sys
|
||||
|
||||
logging.basicConfig(level=logging.ERROR)
|
||||
|
||||
top_dir = os.path.abspath(os.path.join(os.path.dirname(__file__),
|
||||
os.pardir,
|
||||
os.pardir))
|
||||
sys.path.insert(0, top_dir)
|
||||
|
||||
from taskflow import engines
|
||||
from taskflow.patterns import graph_flow as gf
|
||||
from taskflow.persistence import backends
|
||||
from taskflow import task
|
||||
from taskflow.utils import persistence_utils as pu
|
||||
|
||||
|
||||
class DummyTask(task.Task):
|
||||
def execute(self):
|
||||
print("Running %s" % self.name)
|
||||
|
||||
|
||||
def allow(history):
|
||||
print(history)
|
||||
return False
|
||||
|
||||
|
||||
r = gf.Flow("root")
|
||||
r_a = DummyTask('r-a')
|
||||
r_b = DummyTask('r-b')
|
||||
r.add(r_a, r_b)
|
||||
r.link(r_a, r_b, decider=allow)
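# NOTE: since allow() above always returns False, the decider is expected to
# prevent 'r-b' from being visited when the flow runs (only 'r-a' executes).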
|
||||
|
||||
backend = backends.fetch({
|
||||
'connection': 'memory://',
|
||||
})
|
||||
book, flow_detail = pu.temporary_flow_detail(backend=backend)
|
||||
|
||||
e = engines.load(r, flow_detail=flow_detail, book=book, backend=backend)
|
||||
e.compile()
|
||||
e.prepare()
|
||||
e.run()
|
||||
|
||||
|
||||
print("---------")
|
||||
print("After run")
|
||||
print("---------")
|
||||
entries = [os.path.join(backend.memory.root_path, child)
|
||||
for child in backend.memory.ls(backend.memory.root_path)]
|
||||
while entries:
|
||||
path = entries.pop()
|
||||
value = backend.memory[path]
|
||||
if value:
|
||||
print("%s -> %s" % (path, value))
|
||||
else:
|
||||
print("%s" % (path))
|
||||
entries.extend(os.path.join(path, child)
|
||||
for child in backend.memory.ls(path))
|
||||
@@ -36,7 +36,7 @@ from taskflow import task
|
||||
# and have variable run time tasks run and show how the listener will print
|
||||
# out how long those tasks took (when they started and when they finished).
|
||||
#
|
||||
# This shows how timing metrics can be gathered (or attached onto a engine)
|
||||
# This shows how timing metrics can be gathered (or attached onto an engine)
|
||||
# after a workflow has been constructed, making it easy to gather metrics
|
||||
# dynamically for situations where this kind of information is applicable (or
|
||||
# even adding this information on at a later point in the future when your
|
||||
@@ -55,5 +55,5 @@ class VariableTask(task.Task):
|
||||
f = lf.Flow('root')
|
||||
f.add(VariableTask('a'), VariableTask('b'), VariableTask('c'))
|
||||
e = engines.load(f)
|
||||
with timing.PrintingTimingListener(e):
|
||||
with timing.PrintingDurationListener(e):
|
||||
e.run()
|
||||
|
||||
243
taskflow/examples/tox_conductor.py
Normal file
@@ -0,0 +1,243 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright (C) 2014 Yahoo! Inc. All Rights Reserved.
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License"); you may
|
||||
# not use this file except in compliance with the License. You may obtain
|
||||
# a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
|
||||
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
|
||||
# License for the specific language governing permissions and limitations
|
||||
# under the License.
|
||||
|
||||
import contextlib
|
||||
import itertools
|
||||
import logging
|
||||
import os
|
||||
import shutil
|
||||
import socket
|
||||
import sys
|
||||
import tempfile
|
||||
import threading
|
||||
import time
|
||||
|
||||
logging.basicConfig(level=logging.ERROR)
|
||||
|
||||
top_dir = os.path.abspath(os.path.join(os.path.dirname(__file__),
|
||||
os.pardir,
|
||||
os.pardir))
|
||||
sys.path.insert(0, top_dir)
|
||||
|
||||
from oslo_utils import timeutils
|
||||
from oslo_utils import uuidutils
|
||||
import six
|
||||
from zake import fake_client
|
||||
|
||||
from taskflow.conductors import backends as conductors
|
||||
from taskflow import engines
|
||||
from taskflow.jobs import backends as boards
|
||||
from taskflow.patterns import linear_flow
|
||||
from taskflow.persistence import backends as persistence
|
||||
from taskflow.persistence import models
|
||||
from taskflow import task
|
||||
from taskflow.utils import threading_utils
|
||||
|
||||
# INTRO: This example shows how a worker/producer can post desired work (jobs)
|
||||
# to a jobboard and a conductor can consume that work (jobs) from that jobboard
|
||||
# and execute those jobs in a reliable & async manner (for example, if the
|
||||
# conductor were to crash then the job will be released back onto the jobboard
|
||||
# and another conductor can attempt to finish it, from wherever that job last
|
||||
# left off).
|
||||
#
|
||||
# In this example an in-memory jobboard (and in-memory storage) is created and
|
||||
# used that simulates how this would be done at a larger scale (it is an
|
||||
# example after all).
|
||||
|
||||
# Restrict how long this example runs for...
|
||||
RUN_TIME = 5
|
||||
REVIEW_CREATION_DELAY = 0.5
|
||||
SCAN_DELAY = 0.1
|
||||
NAME = "%s_%s" % (socket.getfqdn(), os.getpid())
|
||||
|
||||
# This won't really use zookeeper but will use a local version of it using
|
||||
# the zake library that mimics an actual zookeeper cluster using threads and
|
||||
# an in-memory data structure.
|
||||
JOBBOARD_CONF = {
|
||||
'board': 'zookeeper://localhost?path=/taskflow/tox/jobs',
|
||||
}
|
||||
|
||||
|
||||
class RunReview(task.Task):
|
||||
# A dummy task that clones the review and runs tox...
|
||||
|
||||
def _clone_review(self, review, temp_dir):
|
||||
print("Cloning review '%s' into %s" % (review['id'], temp_dir))
|
||||
|
||||
def _run_tox(self, temp_dir):
|
||||
print("Running tox in %s" % temp_dir)
|
||||
|
||||
def execute(self, review, temp_dir):
|
||||
self._clone_review(review, temp_dir)
|
||||
self._run_tox(temp_dir)
|
||||
|
||||
|
||||
class MakeTempDir(task.Task):
|
||||
# A task that creates and destroys a temporary dir (on failure).
|
||||
#
|
||||
# It provides the location of the temporary dir for other tasks to use
|
||||
# as they see fit.
|
||||
|
||||
default_provides = 'temp_dir'
|
||||
|
||||
def execute(self):
|
||||
return tempfile.mkdtemp()
|
||||
|
||||
def revert(self, *args, **kwargs):
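# NOTE: the engine passes back what execute() previously returned (here the
# created temporary directory path, if any) under the task.REVERT_RESULT key.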
|
||||
temp_dir = kwargs.get(task.REVERT_RESULT)
|
||||
if temp_dir:
|
||||
shutil.rmtree(temp_dir)
|
||||
|
||||
|
||||
class CleanResources(task.Task):
|
||||
# A task that cleans up any workflow resources.
|
||||
|
||||
def execute(self, temp_dir):
|
||||
print("Removing %s" % temp_dir)
|
||||
shutil.rmtree(temp_dir)
|
||||
|
||||
|
||||
def review_iter():
|
||||
"""Makes reviews (never-ending iterator/generator)."""
|
||||
review_id_gen = itertools.count(0)
|
||||
while True:
|
||||
review_id = six.next(review_id_gen)
|
||||
review = {
|
||||
'id': review_id,
|
||||
}
|
||||
yield review
|
||||
|
||||
|
||||
# It is important that this is at the module namespace level, since it must
|
||||
# be accessible from a conductor dispatching an engine, if it was a lambda
|
||||
# function for example, it would not be reimportable and the conductor would
|
||||
# be unable to reference it when creating the workflow to run.
|
||||
def create_review_workflow():
|
||||
"""Factory method used to create a review workflow to run."""
|
||||
f = linear_flow.Flow("tester")
|
||||
f.add(
|
||||
MakeTempDir(name="maker"),
|
||||
RunReview(name="runner"),
|
||||
CleanResources(name="cleaner")
|
||||
)
|
||||
return f
|
||||
|
||||
|
||||
def generate_reviewer(client, saver, name=NAME):
|
||||
"""Creates a review producer thread with the given name prefix."""
|
||||
real_name = "%s_reviewer" % name
|
||||
no_more = threading.Event()
|
||||
jb = boards.fetch(real_name, JOBBOARD_CONF,
|
||||
client=client, persistence=saver)
|
||||
|
||||
def make_save_book(saver, review_id):
|
||||
# Record what we want to happen (sometime in the future).
|
||||
book = models.LogBook("book_%s" % review_id)
|
||||
detail = models.FlowDetail("flow_%s" % review_id,
|
||||
uuidutils.generate_uuid())
|
||||
book.add(detail)
|
||||
# Associate the factory method we want to be called (in the future)
|
||||
# with the book, so that the conductor will be able to call into
|
||||
# that factory to retrieve the workflow objects that represent the
|
||||
# work.
|
||||
#
|
||||
# These args and kwargs *can* be used to save any specific parameters
|
||||
# into the factory when it is being called to create the workflow
|
||||
# objects (typically used to tell a factory how to create a unique
|
||||
# workflow that represents this review).
|
||||
factory_args = ()
|
||||
factory_kwargs = {}
|
||||
engines.save_factory_details(detail, create_review_workflow,
|
||||
factory_args, factory_kwargs)
|
||||
with contextlib.closing(saver.get_connection()) as conn:
|
||||
conn.save_logbook(book)
|
||||
return book
|
||||
|
||||
def run():
|
||||
"""Periodically publishes 'fake' reviews to analyze."""
|
||||
jb.connect()
|
||||
review_generator = review_iter()
|
||||
with contextlib.closing(jb):
|
||||
while not no_more.is_set():
|
||||
review = six.next(review_generator)
|
||||
details = {
|
||||
'store': {
|
||||
'review': review,
|
||||
},
|
||||
}
|
||||
job_name = "%s_%s" % (real_name, review['id'])
|
||||
print("Posting review '%s'" % review['id'])
|
||||
jb.post(job_name,
|
||||
book=make_save_book(saver, review['id']),
|
||||
details=details)
|
||||
time.sleep(REVIEW_CREATION_DELAY)
|
||||
|
||||
# Return the unstarted thread, and a callback that can be used
|
||||
# to shut down that thread (to avoid running forever).
|
||||
return (threading_utils.daemon_thread(target=run), no_more.set)
|
||||
|
||||
|
||||
def generate_conductor(client, saver, name=NAME):
|
||||
"""Creates a conductor thread with the given name prefix."""
|
||||
real_name = "%s_conductor" % name
|
||||
jb = boards.fetch(name, JOBBOARD_CONF,
|
||||
client=client, persistence=saver)
|
||||
conductor = conductors.fetch("blocking", real_name, jb,
|
||||
engine='parallel', wait_timeout=SCAN_DELAY)
|
||||
|
||||
def run():
|
||||
jb.connect()
|
||||
with contextlib.closing(jb):
|
||||
conductor.run()
|
||||
|
||||
# Return the unstarted thread, and a callback that can be used
|
||||
# to shut down that thread (to avoid running forever).
|
||||
return (threading_utils.daemon_thread(target=run), conductor.stop)
|
||||
|
||||
|
||||
def main():
|
||||
# Need to share the same backend, so that data can be shared...
|
||||
persistence_conf = {
|
||||
'connection': 'memory',
|
||||
}
|
||||
saver = persistence.fetch(persistence_conf)
|
||||
with contextlib.closing(saver.get_connection()) as conn:
|
||||
# This ensures that the needed backend setup/data directories/schema
|
||||
# upgrades and so on... exist before they are attempted to be used...
|
||||
conn.upgrade()
|
||||
fc1 = fake_client.FakeClient()
|
||||
# Done like this to share the same client storage location so the correct
|
||||
# zookeeper features work across clients...
|
||||
fc2 = fake_client.FakeClient(storage=fc1.storage)
|
||||
entities = [
|
||||
generate_reviewer(fc1, saver),
|
||||
generate_conductor(fc2, saver),
|
||||
]
|
||||
for t, stopper in entities:
|
||||
t.start()
|
||||
try:
|
||||
watch = timeutils.StopWatch(duration=RUN_TIME)
|
||||
watch.start()
|
||||
while not watch.expired():
|
||||
time.sleep(0.1)
|
||||
finally:
|
||||
for t, stopper in reversed(entities):
|
||||
stopper()
|
||||
t.join()
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
main()
|
||||
@@ -36,10 +36,10 @@ from taskflow.utils import threading_utils
|
||||
|
||||
ANY = notifier.Notifier.ANY
|
||||
|
||||
# INTRO: This examples shows how to use a remote workers event notification
|
||||
# INTRO: These examples show how to use a remote worker's event notification
|
||||
# attribute to proxy back task event notifications to the controlling process.
|
||||
#
|
||||
# In this case a simple set of events are triggered by a worker running a
|
||||
# In this case a simple set of events is triggered by a worker running a
|
||||
# task (simulated to be remote by using a kombu memory transport and threads).
|
||||
# Those events that the 'remote worker' produces will then be proxied back to
|
||||
# the task that the engine is running 'remotely', and then they will be emitted
|
||||
@@ -113,10 +113,10 @@ if __name__ == "__main__":
|
||||
workers = []
|
||||
|
||||
# These topics will be used to request worker information on; those
|
||||
# workers will respond with there capabilities which the executing engine
|
||||
# workers will respond with their capabilities which the executing engine
|
||||
# will use to match pending tasks to a matched worker, this will cause
|
||||
# the task to be sent for execution, and the engine will wait until it
|
||||
# is finished (a response is recieved) and then the engine will either
|
||||
# is finished (a response is received) and then the engine will either
|
||||
# continue with other tasks, do some retry/failure resolution logic or
|
||||
# stop (and potentially re-raise the remote workers failure)...
|
||||
worker_topics = []
|
||||
|
||||
@@ -111,11 +111,11 @@ def calculate(engine_conf):
|
||||
# an image bitmap file.
|
||||
|
||||
# An unordered flow is used here since the mandelbrot calculation is an
|
||||
# example of a embarrassingly parallel computation that we can scatter
|
||||
# example of an embarrassingly parallel computation that we can scatter
|
||||
# across as many workers as possible.
|
||||
flow = uf.Flow("mandelbrot")
|
||||
|
||||
# These symbols will be automatically given to tasks as input to there
|
||||
# These symbols will be automatically given to tasks as input to their
|
||||
# execute method, in this case these are constants used in the mandelbrot
|
||||
# calculation.
|
||||
store = {
|
||||
|
||||
@@ -17,15 +17,46 @@
|
||||
import os
|
||||
import traceback
|
||||
|
||||
from oslo_utils import excutils
|
||||
from oslo_utils import reflection
|
||||
import six
|
||||
|
||||
|
||||
def raise_with_cause(exc_cls, message, *args, **kwargs):
|
||||
"""Helper to raise + chain exceptions (when able) and associate a *cause*.
|
||||
|
||||
NOTE(harlowja): Since in py3.x exceptions can be chained (due to
|
||||
:pep:`3134`) we should try to raise the desired exception with the given
|
||||
*cause* (or extract a *cause* from the current stack if able) so that the
|
||||
exception formats nicely in old and new versions of python. Since py2.x
|
||||
does **not** support exception chaining (or formatting) our root exception
|
||||
class has a :py:meth:`~taskflow.exceptions.TaskFlowException.pformat`
|
||||
method that can be used to get *similar* information instead (and this
|
||||
function makes sure to retain the *cause* in that case as well so
|
||||
that the :py:meth:`~taskflow.exceptions.TaskFlowException.pformat` method
|
||||
shows them).
|
||||
|
||||
:param exc_cls: the :py:class:`~taskflow.exceptions.TaskFlowException`
|
||||
class to raise.
|
||||
:param message: the text/str message that will be passed to
|
||||
the exception's constructor as its first positional
|
||||
argument.
|
||||
:param args: any additional positional arguments to pass to the
|
||||
exception's constructor.
|
||||
:param kwargs: any additional keyword arguments to pass to the
|
||||
exception's constructor.
|
||||
"""
|
||||
if not issubclass(exc_cls, TaskFlowException):
|
||||
raise ValueError("Subclass of taskflow exception is required")
|
||||
excutils.raise_with_cause(exc_cls, message, *args, **kwargs)
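# A hedged usage sketch (connect_to_backend() below is a hypothetical call
# used only for illustration; JobFailure is one of the exception classes
# defined later in this module). This mirrors how the redis jobboard added
# in this change chains low-level errors into taskflow exceptions:
#
#   try:
#       connect_to_backend()
#   except IOError:
#       raise_with_cause(JobFailure, "Failed to connect to the backend")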
|
||||
|
||||
|
||||
class TaskFlowException(Exception):
|
||||
"""Base class for *most* exceptions emitted from this library.
|
||||
|
||||
NOTE(harlowja): in later versions of python we can likely remove the need
|
||||
to have a cause here as PY3+ have implemented PEP 3134 which handles
|
||||
chaining in a much more elegant manner.
|
||||
to have a ``cause`` here as PY3+ have implemented :pep:`3134` which
|
||||
handles chaining in a much more elegant manner.
|
||||
|
||||
:param message: the exception message, typically some string that is
|
||||
useful for consumers to view when debugging or analyzing
|
||||
@@ -43,35 +74,55 @@ class TaskFlowException(Exception):
|
||||
def cause(self):
|
||||
return self._cause
|
||||
|
||||
def pformat(self, indent=2, indent_text=" "):
|
||||
def __str__(self):
|
||||
return self.pformat()
|
||||
|
||||
def _get_message(self):
|
||||
# We must *not* call into the __str__ method as that will reactivate
|
||||
# the pformat method, which will end up badly (and doesn't look
|
||||
# pretty at all); so be careful...
|
||||
return self.args[0]
|
||||
|
||||
def pformat(self, indent=2, indent_text=" ", show_root_class=False):
|
||||
"""Pretty formats a taskflow exception + any connected causes."""
|
||||
if indent < 0:
|
||||
raise ValueError("indent must be greater than or equal to zero")
|
||||
return os.linesep.join(self._pformat(self, [], 0,
|
||||
indent=indent,
|
||||
indent_text=indent_text))
|
||||
|
||||
@classmethod
|
||||
def _pformat(cls, excp, lines, current_indent, indent=2, indent_text=" "):
|
||||
line_prefix = indent_text * current_indent
|
||||
for line in traceback.format_exception_only(type(excp), excp):
|
||||
# We'll add our own newlines on at the end of formatting.
|
||||
#
|
||||
# NOTE(harlowja): the reason we don't search for os.linesep is
|
||||
# that the traceback module seems to only use '\n' (for some
|
||||
# reason).
|
||||
if line.endswith("\n"):
|
||||
line = line[0:-1]
|
||||
lines.append(line_prefix + line)
|
||||
try:
|
||||
cause = excp.cause
|
||||
except AttributeError:
|
||||
pass
|
||||
else:
|
||||
if cause is not None:
|
||||
cls._pformat(cause, lines, current_indent + indent,
|
||||
indent=indent, indent_text=indent_text)
|
||||
return lines
|
||||
raise ValueError("Provided 'indent' must be greater than"
|
||||
" or equal to zero instead of %s" % indent)
|
||||
buf = six.StringIO()
|
||||
if show_root_class:
|
||||
buf.write(reflection.get_class_name(self, fully_qualified=False))
|
||||
buf.write(": ")
|
||||
buf.write(self._get_message())
|
||||
active_indent = indent
|
||||
next_up = self.cause
|
||||
seen = []
|
||||
while next_up is not None and next_up not in seen:
|
||||
seen.append(next_up)
|
||||
buf.write(os.linesep)
|
||||
if isinstance(next_up, TaskFlowException):
|
||||
buf.write(indent_text * active_indent)
|
||||
buf.write(reflection.get_class_name(next_up,
|
||||
fully_qualified=False))
|
||||
buf.write(": ")
|
||||
buf.write(next_up._get_message())
|
||||
else:
|
||||
lines = traceback.format_exception_only(type(next_up), next_up)
|
||||
for i, line in enumerate(lines):
|
||||
buf.write(indent_text * active_indent)
|
||||
if line.endswith("\n"):
|
||||
# We'll add our own newlines on...
|
||||
line = line[0:-1]
|
||||
buf.write(line)
|
||||
if i + 1 != len(lines):
|
||||
buf.write(os.linesep)
|
||||
if not isinstance(next_up, TaskFlowException):
|
||||
# Don't go deeper into non-taskflow exceptions... as we
|
||||
# don't know if their exception 'cause' attributes are even
|
||||
# useable objects...
|
||||
break
|
||||
active_indent += indent
|
||||
next_up = getattr(next_up, 'cause', None)
|
||||
return buf.getvalue()
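# Example of the resulting chained formatting (a sketch only; the exception
# types and messages are illustrative):
#
#   try:
#       try:
#           raise IOError("socket closed")
#       except IOError:
#           raise_with_cause(JobFailure, "Posting the job failed")
#   except TaskFlowException as e:
#       print(e.pformat(show_root_class=True))
#
# which prints the JobFailure line followed by the indented IOError cause.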
|
||||
|
||||
|
||||
# Errors related to storage or operations on storage units.
|
||||
|
||||
@@ -31,6 +31,9 @@ LINK_RETRY = 'retry'
|
||||
# This key denotes the link was created due to symbol constraints and the
|
||||
# value will be a set of names that the constraint ensures are satisfied.
|
||||
LINK_REASONS = 'reasons'
|
||||
#
|
||||
# This key denotes a callable that will determine if the target is visited.
|
||||
LINK_DECIDER = 'decider'
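# A hedged sketch of attaching a decider (mirrors the switch_graph_flow.py
# example earlier in this change; the callable receives a history of prior
# results and returns a boolean controlling whether the target is visited):
#
#   def allow(history):
#       return True
#
#   flow.link(task_a, task_b, decider=allow)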
|
||||
|
||||
|
||||
@six.add_metaclass(abc.ABCMeta)
|
||||
@@ -96,9 +99,8 @@ class Flow(object):
|
||||
"""
|
||||
|
||||
def __str__(self):
|
||||
lines = ["%s: %s" % (reflection.get_class_name(self), self.name)]
|
||||
lines.append("%s" % (len(self)))
|
||||
return "; ".join(lines)
|
||||
return "%s: %s(len=%d)" % (reflection.get_class_name(self),
|
||||
self.name, len(self))
|
||||
|
||||
@property
|
||||
def provides(self):
|
||||
|
||||
@@ -39,17 +39,17 @@ def fetch(name, conf, namespace=BACKEND_NAMESPACE, **kwargs):
|
||||
|
||||
NOTE(harlowja): to aid in making it easy to specify configuration and
|
||||
options to a board the configuration (which is typically just a dictionary)
|
||||
can also be a uri string that identifies the entrypoint name and any
|
||||
can also be a URI string that identifies the entrypoint name and any
|
||||
configuration specific to that board.
|
||||
|
||||
For example, given the following configuration uri:
|
||||
For example, given the following configuration URI::
|
||||
|
||||
zookeeper://<not-used>/?a=b&c=d
|
||||
zookeeper://<not-used>/?a=b&c=d
|
||||
|
||||
This will look for the entrypoint named 'zookeeper' and will provide
|
||||
a configuration object composed of the uris parameters, in this case that
|
||||
is {'a': 'b', 'c': 'd'} to the constructor of that board instance (also
|
||||
including the name specified).
|
||||
a configuration object composed of the URI's components, in this case that
|
||||
is ``{'a': 'b', 'c': 'd'}`` to the constructor of that board
|
||||
instance (also including the name specified).
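
For example (a hedged sketch; the host and path shown are placeholders, and
a reachable zookeeper (or a ``zake`` fake client passed via ``client=``) is
assumed)::

    jb = fetch('my-board', 'zookeeper://localhost?path=/taskflow/jobs')
    jb.connect()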
|
||||
"""
|
||||
if isinstance(conf, six.string_types):
|
||||
conf = {'board': conf}
|
||||
|
||||
957
taskflow/jobs/backends/impl_redis.py
Normal file
@@ -0,0 +1,957 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
|
||||
# Copyright (C) 2015 Yahoo! Inc. All Rights Reserved.
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License"); you may
|
||||
# not use this file except in compliance with the License. You may obtain
|
||||
# a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
|
||||
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
|
||||
# License for the specific language governing permissions and limitations
|
||||
# under the License.
|
||||
|
||||
import contextlib
|
||||
import datetime
|
||||
import string
|
||||
import threading
|
||||
import time
|
||||
|
||||
import fasteners
|
||||
import msgpack
|
||||
from oslo_serialization import msgpackutils
|
||||
from oslo_utils import strutils
|
||||
from oslo_utils import timeutils
|
||||
from oslo_utils import uuidutils
|
||||
from redis import exceptions as redis_exceptions
|
||||
import six
|
||||
from six.moves import range as compat_range
|
||||
|
||||
from taskflow import exceptions as exc
|
||||
from taskflow.jobs import base
|
||||
from taskflow import logging
|
||||
from taskflow import states
|
||||
from taskflow.utils import misc
|
||||
from taskflow.utils import redis_utils as ru
|
||||
|
||||
|
||||
LOG = logging.getLogger(__name__)
|
||||
|
||||
|
||||
@contextlib.contextmanager
|
||||
def _translate_failures():
|
||||
"""Translates common redis exceptions into taskflow exceptions."""
|
||||
try:
|
||||
yield
|
||||
except redis_exceptions.ConnectionError:
|
||||
exc.raise_with_cause(exc.JobFailure, "Failed to connect to redis")
|
||||
except redis_exceptions.TimeoutError:
|
||||
exc.raise_with_cause(exc.JobFailure,
|
||||
"Failed to communicate with redis, connection"
|
||||
" timed out")
|
||||
except redis_exceptions.RedisError:
|
||||
exc.raise_with_cause(exc.JobFailure,
|
||||
"Failed to communicate with redis,"
|
||||
" internal error")
|
||||
|
||||
|
||||
class RedisJob(base.Job):
|
||||
"""A redis job."""
|
||||
|
||||
def __init__(self, board, name, sequence, key,
|
||||
uuid=None, details=None,
|
||||
created_on=None, backend=None,
|
||||
book=None, book_data=None):
|
||||
super(RedisJob, self).__init__(board, name,
|
||||
uuid=uuid, details=details,
|
||||
backend=backend,
|
||||
book=book, book_data=book_data)
|
||||
self._created_on = created_on
|
||||
self._client = board._client
|
||||
self._redis_version = board._redis_version
|
||||
self._sequence = sequence
|
||||
self._key = key
|
||||
self._last_modified_key = board.join(key + board.LAST_MODIFIED_POSTFIX)
|
||||
self._owner_key = board.join(key + board.OWNED_POSTFIX)
|
||||
|
||||
@property
|
||||
def key(self):
|
||||
"""Key (in board listings/trash hash) the job data is stored under."""
|
||||
return self._key
|
||||
|
||||
@property
|
||||
def last_modified_key(self):
|
||||
"""Key the job last modified data is stored under."""
|
||||
return self._last_modified_key
|
||||
|
||||
@property
|
||||
def owner_key(self):
|
||||
"""Key the job claim + data of the owner is stored under."""
|
||||
return self._owner_key
|
||||
|
||||
@property
|
||||
def sequence(self):
|
||||
"""Sequence number of the current job."""
|
||||
return self._sequence
|
||||
|
||||
def expires_in(self):
|
||||
"""How many seconds until the claim expires.
|
||||
|
||||
Returns the number of seconds until the ownership entry expires or
|
||||
:attr:`~taskflow.utils.redis_utils.UnknownExpire.DOES_NOT_EXPIRE` or
|
||||
:attr:`~taskflow.utils.redis_utils.UnknownExpire.KEY_NOT_FOUND` if it
|
||||
does not expire or if the expiry can not be determined (perhaps the
|
||||
:attr:`.owner_key` expired at/before time of inquiry?).
|
||||
"""
|
||||
with _translate_failures():
|
||||
return ru.get_expiry(self._client, self._owner_key,
|
||||
prior_version=self._redis_version)
|
||||
|
||||
def extend_expiry(self, expiry):
|
||||
"""Extends the owner key (aka the claim) expiry for this job.
|
||||
|
||||
NOTE(harlowja): if the claim for this job did **not** previously
|
||||
have an expiry associated with it, calling this method will create
|
||||
one (and after that time elapses the claim on this job will cease
|
||||
to exist).
|
||||
|
||||
Returns ``True`` if the expiry request was performed
|
||||
otherwise ``False``.
|
||||
"""
|
||||
with _translate_failures():
|
||||
return ru.apply_expiry(self._client, self._owner_key, expiry,
|
||||
prior_version=self._redis_version)
|
||||
|
||||
def __lt__(self, other):
|
||||
if self.created_on == other.created_on:
|
||||
return self.sequence < other.sequence
|
||||
else:
|
||||
return self.created_on < other.created_on
|
||||
|
||||
@property
|
||||
def created_on(self):
|
||||
return self._created_on
|
||||
|
||||
@property
|
||||
def last_modified(self):
|
||||
with _translate_failures():
|
||||
raw_last_modified = self._client.get(self._last_modified_key)
|
||||
last_modified = None
|
||||
if raw_last_modified:
|
||||
last_modified = self._board._loads(
|
||||
raw_last_modified, root_types=(datetime.datetime,))
|
||||
# NOTE(harlowja): just in case this is somehow busted (due to time
|
||||
# sync issues/other), give back the most recent one (since redis
|
||||
# does not maintain clock information; we could have this happen
|
||||
# due to how clients who mutate jobs also send the time in).
|
||||
last_modified = max(last_modified, self._created_on)
|
||||
return last_modified
|
||||
|
||||
@property
|
||||
def state(self):
|
||||
listings_key = self._board.listings_key
|
||||
owner_key = self._owner_key
|
||||
listings_sub_key = self._key
|
||||
|
||||
def _do_fetch(p):
|
||||
# NOTE(harlowja): state of a job in redis is not set into any
|
||||
# explicit 'state' field, but is maintained by what nodes exist in
|
||||
# redis instead (i.e. if an owner key exists, then we know an owner
|
||||
# is active, if no job data exists and no owner, then we know that
|
||||
# the job is unclaimed, and so-on)...
|
||||
p.multi()
|
||||
p.hexists(listings_key, listings_sub_key)
|
||||
p.exists(owner_key)
|
||||
job_exists, owner_exists = p.execute()
|
||||
if not job_exists:
|
||||
if owner_exists:
|
||||
# This should **not** be possible due to lua code ordering
|
||||
# but let's log an INFO statement if it does happen (so
|
||||
# that it can be investigated)...
|
||||
LOG.info("Unexpected owner key found at '%s' when job"
|
||||
" key '%s[%s]' was not found", owner_key,
|
||||
listings_key, listings_sub_key)
|
||||
return states.COMPLETE
|
||||
else:
|
||||
if owner_exists:
|
||||
return states.CLAIMED
|
||||
else:
|
||||
return states.UNCLAIMED
|
||||
|
||||
with _translate_failures():
|
||||
return self._client.transaction(_do_fetch,
|
||||
listings_key, owner_key,
|
||||
value_from_callable=True)
|
||||
|
||||
def __str__(self):
|
||||
"""Pretty formats the job into something *more* meaningful."""
|
||||
tpl = "%s: %s (uuid=%s, owner_key=%s, sequence=%s, details=%s)"
|
||||
return tpl % (type(self).__name__,
|
||||
self.name, self.uuid, self.owner_key,
|
||||
self.sequence, self.details)
|
||||
|
||||
|
||||
class RedisJobBoard(base.JobBoard):
|
||||
"""A jobboard backed by `redis`_.
|
||||
|
||||
Powered by the `redis-py <http://redis-py.readthedocs.org/>`_ library.
|
||||
|
||||
This jobboard creates job entries by listing jobs in a redis `hash`_. This
|
||||
hash contains jobs that can be actively worked on by (and examined/claimed
|
||||
by) some set of eligible consumers. Job posting is typically performed
|
||||
using the :meth:`.post` method (this creates a hash entry with job
|
||||
contents/details encoded in `msgpack`_). The users of these
|
||||
jobboard(s) (potentially on disjoint sets of machines) can then
|
||||
iterate over the available jobs and decide if they want to attempt to
|
||||
claim one of the jobs they have iterated over. If so they will then
|
||||
attempt to contact redis and they will attempt to create a key in
|
||||
redis (using an embedded lua script to perform this atomically) to claim a
|
||||
desired job. If the entity trying to use the jobboard to :meth:`.claim`
|
||||
the job is able to create that lock/owner key then it will be
|
||||
allowed (and expected) to perform whatever *work* the contents of that
|
||||
job described. Once the claiming entity is finished the lock/owner key
|
||||
and the `hash`_ entry will be deleted (if successfully completed) in a
|
||||
single request (also using an embedded lua script to perform this
|
||||
atomically). If the claiming entity is not successful (or the entity
|
||||
that claimed the job dies) the lock/owner key can be released
|
||||
automatically (by **optional** usage of a claim expiry) or by
|
||||
using :meth:`.abandon` to manually abandon the job so that it can be
|
||||
consumed/worked on by others.
|
||||
|
||||
NOTE(harlowja): by default the :meth:`.claim` has no expiry (which
|
||||
means claims will be persistent, even under claiming entity failure). To
|
||||
ensure an expiry occurs, pass a numeric value for the ``expiry`` keyword
|
||||
argument to the :meth:`.claim` method that defines how many seconds the
|
||||
claim should be retained for. When an expiry is used ensure that the
|
||||
claim is kept alive while it is being worked on by using
|
||||
the :py:meth:`~.RedisJob.extend_expiry` method periodically.
|
||||
|
||||
.. _msgpack: http://msgpack.org/
|
||||
.. _redis: http://redis.io/
|
||||
.. _hash: http://redis.io/topics/data-types#hashes
|
||||
"""
|
||||
|
||||
CLIENT_CONF_TRANSFERS = tuple([
|
||||
# Host config...
|
||||
('host', str),
|
||||
('port', int),
|
||||
|
||||
# See: http://redis.io/commands/auth
|
||||
('password', str),
|
||||
|
||||
# Data encoding/decoding + error handling
|
||||
('encoding', str),
|
||||
('encoding_errors', str),
|
||||
|
||||
# Connection settings.
|
||||
('socket_timeout', float),
|
||||
('socket_connect_timeout', float),
|
||||
|
||||
# This one negates the usage of host, port, socket connection
|
||||
# settings as it doesn't use the same kind of underlying socket...
|
||||
('unix_socket_path', str),
|
||||
|
||||
# Do you want ssl?
|
||||
('ssl', strutils.bool_from_string),
|
||||
('ssl_keyfile', str),
|
||||
('ssl_certfile', str),
|
||||
('ssl_cert_reqs', str),
|
||||
('ssl_ca_certs', str),
|
||||
|
||||
# See: http://www.rediscookbook.org/multiple_databases.html
|
||||
('db', int),
|
||||
])
|
||||
"""
|
||||
Keys (and value type converters) that we allow to proxy from the jobboard
|
||||
configuration into the redis client (used to configure the redis client
|
||||
internals if no explicit client is provided via the ``client`` keyword
|
||||
argument).
|
||||
|
||||
See: http://redis-py.readthedocs.org/en/latest/#redis.Redis
|
||||
|
||||
See: https://github.com/andymccurdy/redis-py/blob/2.10.3/redis/client.py
|
||||
"""
|
||||
|
||||
#: Postfix (combined with job key) used to make a job's owner key.
|
||||
OWNED_POSTFIX = b".owned"
|
||||
|
||||
#: Postfix (combined with job key) used to make a job's last modified key.
|
||||
LAST_MODIFIED_POSTFIX = b".last_modified"
|
||||
|
||||
#: Default namespace for keys when none is provided.
|
||||
DEFAULT_NAMESPACE = b'taskflow'
|
||||
|
||||
MIN_REDIS_VERSION = (2, 6)
|
||||
"""
|
||||
Minimum redis version this backend requires.
|
||||
|
||||
This version is required since we need the built-in server-side lua
|
||||
scripting support that is included in 2.6 and newer.
|
||||
"""
|
||||
|
||||
NAMESPACE_SEP = b':'
|
||||
"""
|
||||
Separator that is used to combine a key with the namespace (to get
|
||||
the **actual** key that will be used).
|
||||
"""
|
||||
|
||||
KEY_PIECE_SEP = b'.'
|
||||
"""
|
||||
Separator that is used to combine a bunch of key pieces together (to get
|
||||
the **actual** key that will be used).
|
||||
"""
|
||||
|
||||
#: Expected lua response status field when call is ok.
|
||||
SCRIPT_STATUS_OK = "ok"
|
||||
|
||||
#: Expected lua response status field when call is **not** ok.
|
||||
SCRIPT_STATUS_ERROR = "error"
|
||||
|
||||
#: Expected lua script error response when the owner is not as expected.
|
||||
SCRIPT_NOT_EXPECTED_OWNER = "Not expected owner!"
|
||||
|
||||
#: Expected lua script error response when the owner is not findable.
|
||||
SCRIPT_UNKNOWN_OWNER = "Unknown owner!"
|
||||
|
||||
#: Expected lua script error response when the job is not findable.
|
||||
SCRIPT_UNKNOWN_JOB = "Unknown job!"
|
||||
|
||||
#: Expected lua script error response when the job is already claimed.
|
||||
SCRIPT_ALREADY_CLAIMED = "Job already claimed!"
|
||||
|
||||
SCRIPT_TEMPLATES = {
|
||||
'consume': """
|
||||
-- Extract *all* the variables (so we can easily know what they are)...
|
||||
local owner_key = KEYS[1]
|
||||
local listings_key = KEYS[2]
|
||||
local last_modified_key = KEYS[3]
|
||||
|
||||
local expected_owner = ARGV[1]
|
||||
local job_key = ARGV[2]
|
||||
local result = {}
|
||||
if redis.call("hexists", listings_key, job_key) == 1 then
|
||||
if redis.call("exists", owner_key) == 1 then
|
||||
local owner = redis.call("get", owner_key)
|
||||
if owner ~= expected_owner then
|
||||
result["status"] = "${error}"
|
||||
result["reason"] = "${not_expected_owner}"
|
||||
result["owner"] = owner
|
||||
else
|
||||
-- The order is important here, delete the owner first (and if
|
||||
-- that blows up, the job data will still exist so it can be
|
||||
-- worked on again, instead of the reverse)...
|
||||
redis.call("del", owner_key, last_modified_key)
|
||||
redis.call("hdel", listings_key, job_key)
|
||||
result["status"] = "${ok}"
|
||||
end
|
||||
else
|
||||
result["status"] = "${error}"
|
||||
result["reason"] = "${unknown_owner}"
|
||||
end
|
||||
else
|
||||
result["status"] = "${error}"
|
||||
result["reason"] = "${unknown_job}"
|
||||
end
|
||||
return cmsgpack.pack(result)
|
||||
""",
|
||||
'claim': """
|
||||
local function apply_ttl(key, ms_expiry)
|
||||
if ms_expiry ~= nil then
|
||||
redis.call("pexpire", key, ms_expiry)
|
||||
end
|
||||
end
|
||||
|
||||
-- Extract *all* the variables (so we can easily know what they are)...
|
||||
local owner_key = KEYS[1]
|
||||
local listings_key = KEYS[2]
|
||||
local last_modified_key = KEYS[3]
|
||||
|
||||
local expected_owner = ARGV[1]
|
||||
local job_key = ARGV[2]
|
||||
local last_modified_blob = ARGV[3]
|
||||
|
||||
-- If this is non-numeric (which it may be) this becomes nil
|
||||
local ms_expiry = nil
|
||||
if ARGV[4] ~= "none" then
|
||||
ms_expiry = tonumber(ARGV[4])
|
||||
end
|
||||
local result = {}
|
||||
if redis.call("hexists", listings_key, job_key) == 1 then
|
||||
if redis.call("exists", owner_key) == 1 then
|
||||
local owner = redis.call("get", owner_key)
|
||||
if owner == expected_owner then
|
||||
-- Owner is the same, leave it alone...
|
||||
redis.call("set", last_modified_key, last_modified_blob)
|
||||
apply_ttl(owner_key, ms_expiry)
|
||||
result["status"] = "${ok}"
|
||||
else
|
||||
result["status"] = "${error}"
|
||||
result["reason"] = "${already_claimed}"
|
||||
result["owner"] = owner
|
||||
end
|
||||
else
|
||||
redis.call("set", owner_key, expected_owner)
|
||||
redis.call("set", last_modified_key, last_modified_blob)
|
||||
apply_ttl(owner_key, ms_expiry)
|
||||
result["status"] = "${ok}"
|
||||
end
|
||||
else
|
||||
result["status"] = "${error}"
|
||||
result["reason"] = "${unknown_job}"
|
||||
end
|
||||
return cmsgpack.pack(result)
|
||||
""",
|
||||
'abandon': """
|
||||
-- Extract *all* the variables (so we can easily know what they are)...
|
||||
local owner_key = KEYS[1]
|
||||
local listings_key = KEYS[2]
|
||||
local last_modified_key = KEYS[3]
|
||||
|
||||
local expected_owner = ARGV[1]
|
||||
local job_key = ARGV[2]
|
||||
local last_modified_blob = ARGV[3]
|
||||
local result = {}
|
||||
if redis.call("hexists", listings_key, job_key) == 1 then
|
||||
if redis.call("exists", owner_key) == 1 then
|
||||
local owner = redis.call("get", owner_key)
|
||||
if owner ~= expected_owner then
|
||||
result["status"] = "${error}"
|
||||
result["reason"] = "${not_expected_owner}"
|
||||
result["owner"] = owner
|
||||
else
|
||||
redis.call("del", owner_key)
|
||||
redis.call("set", last_modified_key, last_modified_blob)
|
||||
result["status"] = "${ok}"
|
||||
end
|
||||
else
|
||||
result["status"] = "${error}"
|
||||
result["reason"] = "${unknown_owner}"
|
||||
end
|
||||
else
|
||||
result["status"] = "${error}"
|
||||
result["reason"] = "${unknown_job}"
|
||||
end
|
||||
return cmsgpack.pack(result)
|
||||
""",
|
||||
'trash': """
|
||||
-- Extract *all* the variables (so we can easily know what they are)...
|
||||
local owner_key = KEYS[1]
|
||||
local listings_key = KEYS[2]
|
||||
local last_modified_key = KEYS[3]
|
||||
local trash_listings_key = KEYS[4]
|
||||
|
||||
local expected_owner = ARGV[1]
|
||||
local job_key = ARGV[2]
|
||||
local last_modified_blob = ARGV[3]
|
||||
local result = {}
|
||||
if redis.call("hexists", listings_key, job_key) == 1 then
|
||||
local raw_posting = redis.call("hget", listings_key, job_key)
|
||||
if redis.call("exists", owner_key) == 1 then
|
||||
local owner = redis.call("get", owner_key)
|
||||
if owner ~= expected_owner then
|
||||
result["status"] = "${error}"
|
||||
result["reason"] = "${not_expected_owner}"
|
||||
result["owner"] = owner
|
||||
else
|
||||
-- This ordering is important (try to first move the value
|
||||
-- and only if that works do we try to do any deletions)...
|
||||
redis.call("hset", trash_listings_key, job_key, raw_posting)
|
||||
redis.call("set", last_modified_key, last_modified_blob)
|
||||
redis.call("del", owner_key)
|
||||
redis.call("hdel", listings_key, job_key)
|
||||
result["status"] = "${ok}"
|
||||
end
|
||||
else
|
||||
result["status"] = "${error}"
|
||||
result["reason"] = "${unknown_owner}"
|
||||
end
|
||||
else
|
||||
result["status"] = "${error}"
|
||||
result["reason"] = "${unknown_job}"
|
||||
end
|
||||
return cmsgpack.pack(result)
|
||||
""",
|
||||
}
|
||||
"""`Lua`_ **template** scripts that will be used by various methods (they
|
||||
are turned into real scripts and loaded on call into the :func:`.connect`
|
||||
method).
|
||||
|
||||
Some things to note:
|
||||
|
||||
- The lua script is ran serially, so when this runs no other command will
|
||||
be mutating the backend (and redis also ensures that no other script
|
||||
will be running) so atomicity of these scripts are guaranteed by redis.
|
||||
|
||||
- Transactions were considered (and even mostly implemented) but
|
||||
ultimately rejected since redis does not support rollbacks and
|
||||
transactions can **not** be interdependent (later operations can **not**
|
||||
depend on the results of earlier operations). Both of these issues limit
|
||||
our ability to correctly report errors (with useful messages) and to
|
||||
maintain consistency under failure/contention (due to the inability to
|
||||
rollback). A third and final blow to using transactions was to
|
||||
correctly use them we would have to set a watch on a *very* contentious
|
||||
key (the listings key) which would under load cause clients to retry more
|
||||
often then would be desired (this also increases network load, CPU
|
||||
cycles used, transactions failures triggered and so on).
|
||||
|
||||
- Partial transaction execution is possible due to pre/post ``EXEC``
|
||||
failures (and the lack of rollback makes this worse).
|
||||
|
||||
So overall after thinking, it seemed like having little lua scripts
|
||||
was not that bad (even if it is somewhat convoluted) due to the above and
|
||||
public mentioned issues with transactions. In general using lua scripts
|
||||
for this purpose seems to be somewhat common practice and it solves the
|
||||
issues that came up when transactions were considered & implemented.
|
||||
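
As a small illustration (a sketch of roughly what :func:`.connect` does
with these templates, not additional behaviour), a template is rendered
into real lua source with ``string.Template`` before being registered
with the redis client::

    import string

    params = {
        'ok': RedisJobBoard.SCRIPT_STATUS_OK,
        'error': RedisJobBoard.SCRIPT_STATUS_ERROR,
        'not_expected_owner': RedisJobBoard.SCRIPT_NOT_EXPECTED_OWNER,
        'unknown_owner': RedisJobBoard.SCRIPT_UNKNOWN_OWNER,
        'unknown_job': RedisJobBoard.SCRIPT_UNKNOWN_JOB,
        'already_claimed': RedisJobBoard.SCRIPT_ALREADY_CLAIMED,
    }
    raw_tpl = RedisJobBoard.SCRIPT_TEMPLATES['claim']
    lua_source = string.Template(raw_tpl).substitute(**params)
    # connect() then hands the rendered source to the redis client via
    # client.register_script(lua_source) and calls it later from claim().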

Some links about redis (and redis + lua) that may be useful to look over:

- `Atomicity of scripts`_
- `Scripting and transactions`_
- `Why redis does not support rollbacks`_
- `Intro to lua for redis programmers`_
- `Five key takeaways for developing with redis`_
- `Everything you always wanted to know about redis`_ (slides)

.. _Lua: http://www.lua.org/
.. _Atomicity of scripts: http://redis.io/commands/eval#atomicity-of-scripts
.. _Scripting and transactions: http://redis.io/topics/transactions#redis-scripting-and-transactions
.. _Why redis does not support rollbacks: http://redis.io/topics/transactions#why-redis-does-not-support-roll-backs
.. _Intro to lua for redis programmers: http://www.redisgreen.net/blog/intro-to-lua-for-redis-programmers
.. _Five key takeaways for developing with redis: https://redislabs.com/blog/5-key-takeaways-for-developing-with-redis
.. _Everything you always wanted to know about redis: http://www.slideshare.net/carlosabalde/everything-you-always-wanted-to-know-about-redis-but-were-afraid-to-ask
"""
|
||||
|
||||
@classmethod
|
||||
def _make_client(cls, conf):
|
||||
client_conf = {}
|
||||
for key, value_type_converter in cls.CLIENT_CONF_TRANSFERS:
|
||||
if key in conf:
|
||||
if value_type_converter is not None:
|
||||
client_conf[key] = value_type_converter(conf[key])
|
||||
else:
|
||||
client_conf[key] = conf[key]
|
||||
return ru.RedisClient(**client_conf)
|
||||
|
||||
def __init__(self, name, conf,
|
||||
client=None, persistence=None):
|
||||
super(RedisJobBoard, self).__init__(name, conf)
|
||||
self._closed = True
|
||||
if client is not None:
|
||||
self._client = client
|
||||
self._owns_client = False
|
||||
else:
|
||||
self._client = self._make_client(self._conf)
|
||||
# NOTE(harlowja): This client should not work until connected...
|
||||
self._client.close()
|
||||
self._owns_client = True
|
||||
self._namespace = self._conf.get('namespace', self.DEFAULT_NAMESPACE)
|
||||
self._open_close_lock = threading.RLock()
|
||||
# Redis server version connected to + scripts (populated on connect).
|
||||
self._redis_version = None
|
||||
self._scripts = {}
|
||||
# The backend to load the full logbooks from, since what is sent over
|
||||
# the data connection is only the logbook uuid and name, and not the
|
||||
# full logbook.
|
||||
self._persistence = persistence
|
||||
|
||||
def join(self, key_piece, *more_key_pieces):
|
||||
"""Create and return a namespaced key from many segments.
|
||||
|
||||
NOTE(harlowja): all pieces that are text/unicode are converted into
|
||||
their binary equivalent (if they are already binary no conversion
|
||||
takes place) before being joined (as redis expects binary keys and not
|
||||
unicode/text ones).
|
||||
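
For example (illustrative, given a board created with the default
namespace)::

    board.join(b"job", b"1234")   # -> b'taskflow:job.1234'
    board.join("trash")           # -> b'taskflow:trash'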
"""
|
||||
namespace_pieces = []
|
||||
if self._namespace is not None:
|
||||
namespace_pieces = [self._namespace, self.NAMESPACE_SEP]
|
||||
else:
|
||||
namespace_pieces = []
|
||||
key_pieces = [key_piece]
|
||||
if more_key_pieces:
|
||||
key_pieces.extend(more_key_pieces)
|
||||
for i in compat_range(0, len(namespace_pieces)):
|
||||
namespace_pieces[i] = misc.binary_encode(namespace_pieces[i])
|
||||
for i in compat_range(0, len(key_pieces)):
|
||||
key_pieces[i] = misc.binary_encode(key_pieces[i])
|
||||
namespace = b"".join(namespace_pieces)
|
||||
key = self.KEY_PIECE_SEP.join(key_pieces)
|
||||
return namespace + key
|
||||
|
||||
@property
|
||||
def namespace(self):
|
||||
"""The namespace all keys will be prefixed with (or none)."""
|
||||
return self._namespace
|
||||
|
||||
@misc.cachedproperty
|
||||
def trash_key(self):
|
||||
"""Key where a hash will be stored with trashed jobs in it."""
|
||||
return self.join(b"trash")
|
||||
|
||||
@misc.cachedproperty
|
||||
def sequence_key(self):
|
||||
"""Key where a integer will be stored (used to sequence jobs)."""
|
||||
return self.join(b"sequence")
|
||||
|
||||
@misc.cachedproperty
|
||||
def listings_key(self):
|
||||
"""Key where a hash will be stored with active jobs in it."""
|
||||
return self.join(b"listings")
|
||||
|
||||
@property
|
||||
def job_count(self):
|
||||
with _translate_failures():
|
||||
return self._client.hlen(self.listings_key)
|
||||
|
||||
@property
|
||||
def connected(self):
|
||||
return not self._closed
|
||||
|
||||
@fasteners.locked(lock='_open_close_lock')
|
||||
def connect(self):
|
||||
self.close()
|
||||
if self._owns_client:
|
||||
self._client = self._make_client(self._conf)
|
||||
with _translate_failures():
|
||||
# The client maintains a connection pool, so do a ping and
|
||||
# if that works then assume the connection works, which may or
|
||||
# may not be continuously maintained (if the server dies
|
||||
# at a later time, we will become aware of that when the next
|
||||
# op occurs).
|
||||
self._client.ping()
|
||||
is_new_enough, redis_version = ru.is_server_new_enough(
|
||||
self._client, self.MIN_REDIS_VERSION)
|
||||
if not is_new_enough:
|
||||
wanted_version = ".".join([str(p)
|
||||
for p in self.MIN_REDIS_VERSION])
|
||||
if redis_version:
|
||||
raise exc.JobFailure("Redis version %s or greater is"
|
||||
" required (version %s is to"
|
||||
" old)" % (wanted_version,
|
||||
redis_version))
|
||||
else:
|
||||
raise exc.JobFailure("Redis version %s or greater is"
|
||||
" required" % (wanted_version))
|
||||
else:
|
||||
self._redis_version = redis_version
|
||||
script_params = {
|
||||
# Status field values.
|
||||
'ok': self.SCRIPT_STATUS_OK,
|
||||
'error': self.SCRIPT_STATUS_ERROR,
|
||||
|
||||
# Known error reasons (when status field is error).
|
||||
'not_expected_owner': self.SCRIPT_NOT_EXPECTED_OWNER,
|
||||
'unknown_owner': self.SCRIPT_UNKNOWN_OWNER,
|
||||
'unknown_job': self.SCRIPT_UNKNOWN_JOB,
|
||||
'already_claimed': self.SCRIPT_ALREADY_CLAIMED,
|
||||
}
|
||||
prepared_scripts = {}
|
||||
for n, raw_script_tpl in six.iteritems(self.SCRIPT_TEMPLATES):
|
||||
script_tpl = string.Template(raw_script_tpl)
|
||||
script_blob = script_tpl.substitute(**script_params)
|
||||
script = self._client.register_script(script_blob)
|
||||
prepared_scripts[n] = script
|
||||
self._scripts.update(prepared_scripts)
|
||||
self._closed = False
|
||||
|
||||
@fasteners.locked(lock='_open_close_lock')
|
||||
def close(self):
|
||||
if self._owns_client:
|
||||
self._client.close()
|
||||
self._scripts.clear()
|
||||
self._redis_version = None
|
||||
self._closed = True
|
||||
|
||||
@staticmethod
|
||||
def _dumps(obj):
|
||||
try:
|
||||
return msgpackutils.dumps(obj)
|
||||
except (msgpack.PackException, ValueError):
|
||||
# TODO(harlowja): remove direct msgpack exception access when
|
||||
# oslo.utils provides easy access to the underlying msgpack
|
||||
# pack/unpack exceptions..
|
||||
exc.raise_with_cause(exc.JobFailure,
|
||||
"Failed to serialize object to"
|
||||
" msgpack blob")
|
||||
|
||||
@staticmethod
|
||||
def _loads(blob, root_types=(dict,)):
|
||||
try:
|
||||
return misc.decode_msgpack(blob, root_types=root_types)
|
||||
except (msgpack.UnpackException, ValueError):
|
||||
# TODO(harlowja): remove direct msgpack exception access when
|
||||
# oslo.utils provides easy access to the underlying msgpack
|
||||
# pack/unpack exceptions..
|
||||
exc.raise_with_cause(exc.JobFailure,
|
||||
"Failed to deserialize object from"
|
||||
" msgpack blob (of length %s)" % len(blob))
|
||||
|
||||
_decode_owner = staticmethod(misc.binary_decode)
|
||||
|
||||
_encode_owner = staticmethod(misc.binary_encode)
|
||||
|
||||
def find_owner(self, job):
|
||||
owner_key = self.join(job.key + self.OWNED_POSTFIX)
|
||||
with _translate_failures():
|
||||
raw_owner = self._client.get(owner_key)
|
||||
return self._decode_owner(raw_owner)
|
||||
|
||||
def post(self, name, book=None, details=None):
|
||||
job_uuid = uuidutils.generate_uuid()
|
||||
posting = base.format_posting(job_uuid, name,
|
||||
created_on=timeutils.utcnow(),
|
||||
book=book, details=details)
|
||||
with _translate_failures():
|
||||
sequence = self._client.incr(self.sequence_key)
|
||||
posting.update({
|
||||
'sequence': sequence,
|
||||
})
|
||||
with _translate_failures():
|
||||
raw_posting = self._dumps(posting)
|
||||
raw_job_uuid = six.b(job_uuid)
|
||||
was_posted = bool(self._client.hsetnx(self.listings_key,
|
||||
raw_job_uuid, raw_posting))
|
||||
if not was_posted:
|
||||
raise exc.JobFailure("New job located at '%s[%s]' could not"
|
||||
" be posted" % (self.listings_key,
|
||||
raw_job_uuid))
|
||||
else:
|
||||
return RedisJob(self, name, sequence, raw_job_uuid,
|
||||
uuid=job_uuid, details=details,
|
||||
created_on=posting['created_on'],
|
||||
book=book, book_data=posting.get('book'),
|
||||
backend=self._persistence)
|
||||
|
||||
def wait(self, timeout=None, initial_delay=0.005,
|
||||
max_delay=1.0, sleep_func=time.sleep):
|
||||
if initial_delay > max_delay:
|
||||
raise ValueError("Initial delay %s must be less than or equal"
|
||||
" to the provided max delay %s"
|
||||
% (initial_delay, max_delay))
|
||||
# This does a spin-loop that backs off by doubling the delay
|
||||
# up to the provided max-delay. In the future we could try having
|
||||
# a secondary client connected into redis pubsub and use that
|
||||
# instead, but for now this is simpler.
|
||||
w = timeutils.StopWatch(duration=timeout)
|
||||
w.start()
|
||||
delay = initial_delay
|
||||
while True:
|
||||
jc = self.job_count
|
||||
if jc > 0:
|
||||
it = self.iterjobs()
|
||||
return it
|
||||
else:
|
||||
if w.expired():
|
||||
raise exc.NotFound("Expired waiting for jobs to"
|
||||
" arrive; waited %s seconds"
|
||||
% w.elapsed())
|
||||
else:
|
||||
remaining = w.leftover(return_none=True)
|
||||
if remaining is not None:
|
||||
delay = min(delay * 2, remaining, max_delay)
|
||||
else:
|
||||
delay = min(delay * 2, max_delay)
|
||||
sleep_func(delay)
|
||||
|
||||
def iterjobs(self, only_unclaimed=False, ensure_fresh=False):
|
||||
with _translate_failures():
|
||||
raw_postings = self._client.hgetall(self.listings_key)
|
||||
postings = []
|
||||
for raw_job_key, raw_posting in six.iteritems(raw_postings):
|
||||
posting = self._loads(raw_posting)
|
||||
details = posting.get('details', {})
|
||||
job_uuid = posting['uuid']
|
||||
job = RedisJob(self, posting['name'], posting['sequence'],
|
||||
raw_job_key, uuid=job_uuid, details=details,
|
||||
created_on=posting['created_on'],
|
||||
book_data=posting.get('book'),
|
||||
backend=self._persistence)
|
||||
postings.append(job)
|
||||
postings = sorted(postings)
|
||||
for job in postings:
|
||||
if only_unclaimed:
|
||||
if job.state == states.UNCLAIMED:
|
||||
yield job
|
||||
else:
|
||||
yield job
|
||||
|
||||
@base.check_who
|
||||
def consume(self, job, who):
|
||||
script = self._get_script('consume')
|
||||
with _translate_failures():
|
||||
raw_who = self._encode_owner(who)
|
||||
raw_result = script(keys=[job.owner_key, self.listings_key,
|
||||
job.last_modified_key],
|
||||
args=[raw_who, job.key])
|
||||
result = self._loads(raw_result)
|
||||
status = result['status']
|
||||
if status != self.SCRIPT_STATUS_OK:
|
||||
reason = result.get('reason')
|
||||
if reason == self.SCRIPT_UNKNOWN_JOB:
|
||||
raise exc.NotFound("Job %s not found to be"
|
||||
" consumed" % (job.uuid))
|
||||
elif reason == self.SCRIPT_UNKNOWN_OWNER:
|
||||
raise exc.NotFound("Can not consume job %s"
|
||||
" which we can not determine"
|
||||
" the owner of" % (job.uuid))
|
||||
elif reason == self.SCRIPT_NOT_EXPECTED_OWNER:
|
||||
raw_owner = result.get('owner')
|
||||
if raw_owner:
|
||||
owner = self._decode_owner(raw_owner)
|
||||
raise exc.JobFailure("Can not consume job %s"
|
||||
" which is not owned by %s (it is"
|
||||
" actively owned by %s)"
|
||||
% (job.uuid, who, owner))
|
||||
else:
|
||||
raise exc.JobFailure("Can not consume job %s"
|
||||
" which is not owned by %s"
|
||||
% (job.uuid, who))
|
||||
else:
|
||||
raise exc.JobFailure("Failure to consume job %s,"
|
||||
" unknown internal error (reason=%s)"
|
||||
% (job.uuid, reason))
|
||||
|
||||
@base.check_who
|
||||
def claim(self, job, who, expiry=None):
|
||||
if expiry is None:
|
||||
# On the lua side none doesn't translate to nil so we have
|
||||
# to do this string conversion to make sure that we can tell
|
||||
# the difference.
|
||||
ms_expiry = "none"
|
||||
else:
|
||||
ms_expiry = int(expiry * 1000.0)
|
||||
if ms_expiry <= 0:
|
||||
raise ValueError("Provided expiry (when converted to"
|
||||
" milliseconds) must be greater"
|
||||
" than zero instead of %s" % (expiry))
|
||||
script = self._get_script('claim')
|
||||
with _translate_failures():
|
||||
raw_who = self._encode_owner(who)
|
||||
raw_result = script(keys=[job.owner_key, self.listings_key,
|
||||
job.last_modified_key],
|
||||
args=[raw_who, job.key,
|
||||
# NOTE(harlowja): we need to send this
|
||||
# in as a blob (even if it's not
|
||||
# set/used), since the format can not
|
||||
# currently be created in lua...
|
||||
self._dumps(timeutils.utcnow()),
|
||||
ms_expiry])
|
||||
result = self._loads(raw_result)
|
||||
status = result['status']
|
||||
if status != self.SCRIPT_STATUS_OK:
|
||||
reason = result.get('reason')
|
||||
if reason == self.SCRIPT_UNKNOWN_JOB:
|
||||
raise exc.NotFound("Job %s not found to be"
|
||||
" claimed" % (job.uuid))
|
||||
elif reason == self.SCRIPT_ALREADY_CLAIMED:
|
||||
raw_owner = result.get('owner')
|
||||
if raw_owner:
|
||||
owner = self._decode_owner(raw_owner)
|
||||
raise exc.UnclaimableJob("Job %s already"
|
||||
" claimed by %s"
|
||||
% (job.uuid, owner))
|
||||
else:
|
||||
raise exc.UnclaimableJob("Job %s already"
|
||||
" claimed" % (job.uuid))
|
||||
else:
|
||||
raise exc.JobFailure("Failure to claim job %s,"
|
||||
" unknown internal error (reason=%s)"
|
||||
% (job.uuid, reason))
|
||||
|
||||
@base.check_who
|
||||
def abandon(self, job, who):
|
||||
script = self._get_script('abandon')
|
||||
with _translate_failures():
|
||||
raw_who = self._encode_owner(who)
|
||||
raw_result = script(keys=[job.owner_key, self.listings_key,
|
||||
job.last_modified_key],
|
||||
args=[raw_who, job.key,
|
||||
self._dumps(timeutils.utcnow())])
|
||||
result = self._loads(raw_result)
|
||||
status = result.get('status')
|
||||
if status != self.SCRIPT_STATUS_OK:
|
||||
reason = result.get('reason')
|
||||
if reason == self.SCRIPT_UNKNOWN_JOB:
|
||||
raise exc.NotFound("Job %s not found to be"
|
||||
" abandoned" % (job.uuid))
|
||||
elif reason == self.SCRIPT_UNKNOWN_OWNER:
|
||||
raise exc.NotFound("Can not abandon job %s"
|
||||
" which we can not determine"
|
||||
" the owner of" % (job.uuid))
|
||||
elif reason == self.SCRIPT_NOT_EXPECTED_OWNER:
|
||||
raw_owner = result.get('owner')
|
||||
if raw_owner:
|
||||
owner = self._decode_owner(raw_owner)
|
||||
raise exc.JobFailure("Can not abandon job %s"
|
||||
" which is not owned by %s (it is"
|
||||
" actively owned by %s)"
|
||||
% (job.uuid, who, owner))
|
||||
else:
|
||||
raise exc.JobFailure("Can not abandon job %s"
|
||||
" which is not owned by %s"
|
||||
% (job.uuid, who))
|
||||
else:
|
||||
raise exc.JobFailure("Failure to abandon job %s,"
|
||||
" unknown internal"
|
||||
" error (status=%s, reason=%s)"
|
||||
% (job.uuid, status, reason))
|
||||
|
||||
def _get_script(self, name):
|
||||
try:
|
||||
return self._scripts[name]
|
||||
except KeyError:
|
||||
exc.raise_with_cause(exc.NotFound,
|
||||
"Can not access %s script (has this"
|
||||
" board been connected?)" % name)
|
||||
|
||||
@base.check_who
|
||||
def trash(self, job, who):
|
||||
script = self._get_script('trash')
|
||||
with _translate_failures():
|
||||
raw_who = self._encode_owner(who)
|
||||
raw_result = script(keys=[job.owner_key, self.listings_key,
|
||||
job.last_modified_key, self.trash_key],
|
||||
args=[raw_who, job.key,
|
||||
self._dumps(timeutils.utcnow())])
|
||||
result = self._loads(raw_result)
|
||||
status = result['status']
|
||||
if status != self.SCRIPT_STATUS_OK:
|
||||
reason = result.get('reason')
|
||||
if reason == self.SCRIPT_UNKNOWN_JOB:
|
||||
raise exc.NotFound("Job %s not found to be"
|
||||
" trashed" % (job.uuid))
|
||||
elif reason == self.SCRIPT_UNKNOWN_OWNER:
|
||||
raise exc.NotFound("Can not trash job %s"
|
||||
" which we can not determine"
|
||||
" the owner of" % (job.uuid))
|
||||
elif reason == self.SCRIPT_NOT_EXPECTED_OWNER:
|
||||
raw_owner = result.get('owner')
|
||||
if raw_owner:
|
||||
owner = self._decode_owner(raw_owner)
|
||||
raise exc.JobFailure("Can not trash job %s"
|
||||
" which is not owned by %s (it is"
|
||||
" actively owned by %s)"
|
||||
% (job.uuid, who, owner))
|
||||
else:
|
||||
raise exc.JobFailure("Can not trash job %s"
|
||||
" which is not owned by %s"
|
||||
% (job.uuid, who))
|
||||
else:
|
||||
raise exc.JobFailure("Failure to trash job %s,"
|
||||
" unknown internal error (reason=%s)"
|
||||
% (job.uuid, reason))
|
||||
@@ -17,14 +17,17 @@
|
||||
import collections
|
||||
import contextlib
|
||||
import functools
|
||||
import sys
|
||||
import threading
|
||||
|
||||
from concurrent import futures
|
||||
import fasteners
|
||||
import futurist
|
||||
from kazoo import exceptions as k_exceptions
|
||||
from kazoo.protocol import paths as k_paths
|
||||
from kazoo.recipe import watchers
|
||||
from oslo_serialization import jsonutils
|
||||
from oslo_utils import excutils
|
||||
from oslo_utils import timeutils
|
||||
from oslo_utils import uuidutils
|
||||
import six
|
||||
|
||||
@@ -32,66 +35,39 @@ from taskflow import exceptions as excp
|
||||
from taskflow.jobs import base
|
||||
from taskflow import logging
|
||||
from taskflow import states
|
||||
from taskflow.types import timing as tt
|
||||
from taskflow.utils import kazoo_utils
|
||||
from taskflow.utils import lock_utils
|
||||
from taskflow.utils import misc
|
||||
|
||||
LOG = logging.getLogger(__name__)
|
||||
|
||||
UNCLAIMED_JOB_STATES = (
|
||||
states.UNCLAIMED,
|
||||
)
|
||||
ALL_JOB_STATES = (
|
||||
states.UNCLAIMED,
|
||||
states.COMPLETE,
|
||||
states.CLAIMED,
|
||||
)
|
||||
|
||||
# Transaction support was added in 3.4.0
|
||||
MIN_ZK_VERSION = (3, 4, 0)
|
||||
LOCK_POSTFIX = ".lock"
|
||||
JOB_PREFIX = 'job'
|
||||
|
||||
|
||||
def _check_who(who):
|
||||
if not isinstance(who, six.string_types):
|
||||
raise TypeError("Job applicant must be a string type")
|
||||
if len(who) == 0:
|
||||
raise ValueError("Job applicant must be non-empty")
|
||||
|
||||
|
||||
class ZookeeperJob(base.Job):
|
||||
"""A zookeeper job."""
|
||||
|
||||
def __init__(self, name, board, client, backend, path,
|
||||
def __init__(self, board, name, client, path,
|
||||
uuid=None, details=None, book=None, book_data=None,
|
||||
created_on=None):
|
||||
super(ZookeeperJob, self).__init__(name, uuid=uuid, details=details)
|
||||
self._board = board
|
||||
self._book = book
|
||||
if not book_data:
|
||||
book_data = {}
|
||||
self._book_data = book_data
|
||||
created_on=None, backend=None):
|
||||
super(ZookeeperJob, self).__init__(board, name,
|
||||
uuid=uuid, details=details,
|
||||
backend=backend,
|
||||
book=book, book_data=book_data)
|
||||
self._client = client
|
||||
self._backend = backend
|
||||
if all((self._book, self._book_data)):
|
||||
raise ValueError("Only one of 'book_data' or 'book'"
|
||||
" can be provided")
|
||||
self._path = k_paths.normpath(path)
|
||||
self._lock_path = path + LOCK_POSTFIX
|
||||
self._lock_path = self._path + board.LOCK_POSTFIX
|
||||
self._created_on = created_on
|
||||
self._node_not_found = False
|
||||
basename = k_paths.basename(self._path)
|
||||
self._root = self._path[0:-len(basename)]
|
||||
self._sequence = int(basename[len(JOB_PREFIX):])
|
||||
self._sequence = int(basename[len(board.JOB_PREFIX):])
|
||||
|
||||
@property
|
||||
def lock_path(self):
|
||||
"""Path the job lock/claim and owner znode is stored."""
|
||||
return self._lock_path
|
||||
|
||||
@property
|
||||
def path(self):
|
||||
"""Path the job data znode is stored."""
|
||||
return self._path
|
||||
|
||||
@property
|
||||
@@ -112,22 +88,27 @@ class ZookeeperJob(base.Job):
|
||||
return trans_func(attr)
|
||||
else:
|
||||
return attr
|
||||
except k_exceptions.NoNodeError as e:
|
||||
raise excp.NotFound("Can not fetch the %r attribute"
|
||||
" of job %s (%s), path %s not found"
|
||||
% (attr_name, self.uuid, self.path, path), e)
|
||||
except self._client.handler.timeout_exception as e:
|
||||
raise excp.JobFailure("Can not fetch the %r attribute"
|
||||
" of job %s (%s), operation timed out"
|
||||
% (attr_name, self.uuid, self.path), e)
|
||||
except k_exceptions.SessionExpiredError as e:
|
||||
raise excp.JobFailure("Can not fetch the %r attribute"
|
||||
" of job %s (%s), session expired"
|
||||
% (attr_name, self.uuid, self.path), e)
|
||||
except (AttributeError, k_exceptions.KazooException) as e:
|
||||
raise excp.JobFailure("Can not fetch the %r attribute"
|
||||
" of job %s (%s), internal error" %
|
||||
(attr_name, self.uuid, self.path), e)
|
||||
except k_exceptions.NoNodeError:
|
||||
excp.raise_with_cause(
|
||||
excp.NotFound,
|
||||
"Can not fetch the %r attribute of job %s (%s),"
|
||||
" path %s not found" % (attr_name, self.uuid,
|
||||
self.path, path))
|
||||
except self._client.handler.timeout_exception:
|
||||
excp.raise_with_cause(
|
||||
excp.JobFailure,
|
||||
"Can not fetch the %r attribute of job %s (%s),"
|
||||
" operation timed out" % (attr_name, self.uuid, self.path))
|
||||
except k_exceptions.SessionExpiredError:
|
||||
excp.raise_with_cause(
|
||||
excp.JobFailure,
|
||||
"Can not fetch the %r attribute of job %s (%s),"
|
||||
" session expired" % (attr_name, self.uuid, self.path))
|
||||
except (AttributeError, k_exceptions.KazooException):
|
||||
excp.raise_with_cause(
|
||||
excp.JobFailure,
|
||||
"Can not fetch the %r attribute of job %s (%s),"
|
||||
" internal error" % (attr_name, self.uuid, self.path))
|
||||
|
||||
@property
|
||||
def last_modified(self):
|
||||
@@ -155,23 +136,6 @@ class ZookeeperJob(base.Job):
|
||||
self._node_not_found = True
|
||||
return self._created_on
|
||||
|
||||
@property
|
||||
def board(self):
|
||||
return self._board
|
||||
|
||||
def _load_book(self):
|
||||
book_uuid = self.book_uuid
|
||||
if self._backend is not None and book_uuid is not None:
|
||||
# TODO(harlowja): we are currently limited by assuming that the
|
||||
# job posted has the same backend as this loader (to start this
|
||||
# seems to be a ok assumption, and can be adjusted in the future
|
||||
# if we determine there is a use-case for multi-backend loaders,
|
||||
# aka a registry of loaders).
|
||||
with contextlib.closing(self._backend.get_connection()) as conn:
|
||||
return conn.get_logbook(book_uuid)
|
||||
# No backend to fetch from or no uuid specified
|
||||
return None
|
||||
|
||||
@property
|
||||
def state(self):
|
||||
owner = self.board.find_owner(self)
|
||||
@@ -181,15 +145,21 @@ class ZookeeperJob(base.Job):
|
||||
job_data = misc.decode_json(raw_data)
|
||||
except k_exceptions.NoNodeError:
|
||||
pass
|
||||
except k_exceptions.SessionExpiredError as e:
|
||||
raise excp.JobFailure("Can not fetch the state of %s,"
|
||||
" session expired" % (self.uuid), e)
|
||||
except self._client.handler.timeout_exception as e:
|
||||
raise excp.JobFailure("Can not fetch the state of %s,"
|
||||
" operation timed out" % (self.uuid), e)
|
||||
except k_exceptions.KazooException as e:
|
||||
raise excp.JobFailure("Can not fetch the state of %s, internal"
|
||||
" error" % (self.uuid), e)
|
||||
except k_exceptions.SessionExpiredError:
|
||||
excp.raise_with_cause(
|
||||
excp.JobFailure,
|
||||
"Can not fetch the state of %s,"
|
||||
" session expired" % (self.uuid))
|
||||
except self._client.handler.timeout_exception:
|
||||
excp.raise_with_cause(
|
||||
excp.JobFailure,
|
||||
"Can not fetch the state of %s,"
|
||||
" operation timed out" % (self.uuid))
|
||||
except k_exceptions.KazooException:
|
||||
excp.raise_with_cause(
|
||||
excp.JobFailure,
|
||||
"Can not fetch the state of %s,"
|
||||
" internal error" % (self.uuid))
|
||||
if not job_data:
|
||||
# No data means this job has been completed (the owner that we might have
|
||||
# fetched will not be able to be fetched again, since the job node
|
||||
@@ -209,30 +179,6 @@ class ZookeeperJob(base.Job):
|
||||
def __hash__(self):
|
||||
return hash(self.path)
|
||||
|
||||
@property
|
||||
def book(self):
|
||||
if self._book is None:
|
||||
self._book = self._load_book()
|
||||
return self._book
|
||||
|
||||
@property
|
||||
def book_uuid(self):
|
||||
if self._book:
|
||||
return self._book.uuid
|
||||
if self._book_data:
|
||||
return self._book_data.get('uuid')
|
||||
else:
|
||||
return None
|
||||
|
||||
@property
|
||||
def book_name(self):
|
||||
if self._book:
|
||||
return self._book.name
|
||||
if self._book_data:
|
||||
return self._book_data.get('name')
|
||||
else:
|
||||
return None
|
||||
|
||||
|
||||
class ZookeeperJobBoardIterator(six.Iterator):
|
||||
"""Iterator over a zookeeper jobboard that iterates over potential jobs.
|
||||
@@ -246,6 +192,16 @@ class ZookeeperJobBoardIterator(six.Iterator):
|
||||
over unclaimed jobs.
|
||||
"""
|
||||
|
||||
_UNCLAIMED_JOB_STATES = (
|
||||
states.UNCLAIMED,
|
||||
)
|
||||
|
||||
_JOB_STATES = (
|
||||
states.UNCLAIMED,
|
||||
states.COMPLETE,
|
||||
states.CLAIMED,
|
||||
)
|
||||
|
||||
def __init__(self, board, only_unclaimed=False, ensure_fresh=False):
|
||||
self._board = board
|
||||
self._jobs = collections.deque()
|
||||
@@ -255,6 +211,7 @@ class ZookeeperJobBoardIterator(six.Iterator):
|
||||
|
||||
@property
|
||||
def board(self):
|
||||
"""The board this iterator was created from."""
|
||||
return self._board
|
||||
|
||||
def __iter__(self):
|
||||
@@ -262,9 +219,9 @@ class ZookeeperJobBoardIterator(six.Iterator):
|
||||
|
||||
def _next_job(self):
|
||||
if self.only_unclaimed:
|
||||
allowed_states = UNCLAIMED_JOB_STATES
|
||||
allowed_states = self._UNCLAIMED_JOB_STATES
|
||||
else:
|
||||
allowed_states = ALL_JOB_STATES
|
||||
allowed_states = self._JOB_STATES
|
||||
job = None
|
||||
while self._jobs and job is None:
|
||||
maybe_job = self._jobs.popleft()
|
||||
@@ -292,29 +249,49 @@ class ZookeeperJobBoardIterator(six.Iterator):
|
||||
|
||||
|
||||
class ZookeeperJobBoard(base.NotifyingJobBoard):
|
||||
"""A jobboard backend by zookeeper.
|
||||
"""A jobboard backed by `zookeeper`_.
|
||||
|
||||
Powered by the `kazoo <http://kazoo.readthedocs.org/>`_ library.
|
||||
|
||||
This jobboard creates *sequenced* persistent znodes in a directory in
|
||||
zookeeper (that directory defaults ``/taskflow/jobs``) and uses zookeeper
|
||||
watches to notify other jobboards that the job which was posted using the
|
||||
:meth:`.post` method (this creates a znode with contents/details in json)
|
||||
The users of those jobboard(s) (potentially on disjoint sets of machines)
|
||||
can then iterate over the available jobs and decide if they want to attempt
|
||||
to claim one of the jobs they have iterated over. If so they will then
|
||||
attempt to contact zookeeper and will attempt to create a ephemeral znode
|
||||
using the name of the persistent znode + ".lock" as a postfix. If the
|
||||
entity trying to use the jobboard to :meth:`.claim` the job is able to
|
||||
create a ephemeral znode with that name then it will be allowed (and
|
||||
expected) to perform whatever *work* the contents of that job that it
|
||||
locked described. Once finished the ephemeral znode and persistent znode
|
||||
may be deleted (if successfully completed) in a single transcation or if
|
||||
not successfull (or the entity that claimed the znode dies) the ephemeral
|
||||
znode will be released (either manually by using :meth:`.abandon` or
|
||||
automatically by zookeeper the ephemeral is deemed to be lost).
|
||||
zookeeper and uses zookeeper watches to notify other jobboards of
|
||||
jobs which were posted using the :meth:`.post` method (this creates a
|
||||
znode with job contents/details encoded in `json`_). The users of these
|
||||
jobboard(s) (potentially on disjoint sets of machines) can then iterate
|
||||
over the available jobs and decide if they want
|
||||
to attempt to claim one of the jobs they have iterated over. If so they
|
||||
will then attempt to contact zookeeper and they will attempt to create an
|
||||
ephemeral znode using the name of the persistent znode + ".lock" as a
|
||||
postfix. If the entity trying to use the jobboard to :meth:`.claim` the
|
||||
job is able to create an ephemeral znode with that name then it will be
|
||||
allowed (and expected) to perform whatever *work* the contents of that
|
||||
job described. Once the claiming entity is finished the ephemeral znode
|
||||
and persistent znode will be deleted (if successfully completed) in a
|
||||
single transaction. If the claiming entity is not successful (or the
|
||||
entity that claimed the znode dies) the ephemeral znode will be
|
||||
released (either manually by using :meth:`.abandon` or automatically by
|
||||
zookeeper when the ephemeral node and associated session is deemed to
|
||||
have been lost).
|
||||
|
||||
.. _zookeeper: http://zookeeper.apache.org/
|
||||
.. _json: http://json.org/
|
||||
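
A minimal usage sketch (illustrative only; the module path, the
``hosts`` configuration key and the values shown are assumptions
rather than requirements)::

    from taskflow.jobs.backends import impl_zookeeper

    board = impl_zookeeper.ZookeeperJobBoard(
        'my-board', {'hosts': 'localhost:2181'})
    board.connect()
    job = board.post('resize-volume', details={'gigabytes': 10})
    board.claim(job, 'worker-1')
    # ... perform the work described by the job ...
    board.consume(job, 'worker-1')
    board.close()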
"""
|
||||
|
||||
#: Transaction support was added in 3.4.0 so we need at least that version.
|
||||
MIN_ZK_VERSION = (3, 4, 0)
|
||||
|
||||
#: Znode **postfix** that lock entries have.
|
||||
LOCK_POSTFIX = ".lock"
|
||||
|
||||
#: Znode child path created under root path that contains trashed jobs.
|
||||
TRASH_FOLDER = ".trash"
|
||||
|
||||
#: Znode **prefix** that job entries have.
|
||||
JOB_PREFIX = 'job'
|
||||
|
||||
#: Default znode path used for jobs (data, locks...).
|
||||
DEFAULT_PATH = "/taskflow/jobs"
|
||||
|
||||
def __init__(self, name, conf,
|
||||
client=None, persistence=None, emit_notifications=True):
|
||||
super(ZookeeperJobBoard, self).__init__(name, conf)
|
||||
@@ -324,17 +301,17 @@ class ZookeeperJobBoard(base.NotifyingJobBoard):
|
||||
else:
|
||||
self._client = kazoo_utils.make_client(self._conf)
|
||||
self._owned = True
|
||||
path = str(conf.get("path", "/taskflow/jobs"))
|
||||
path = str(conf.get("path", self.DEFAULT_PATH))
|
||||
if not path:
|
||||
raise ValueError("Empty zookeeper path is disallowed")
|
||||
if not k_paths.isabs(path):
|
||||
raise ValueError("Zookeeper path must be absolute")
|
||||
self._path = path
|
||||
# The backend to load the full logbooks from, since whats sent over
|
||||
# the zookeeper data connection is only the logbook uuid and name, and
|
||||
# not currently the full logbook (later when a zookeeper backend
|
||||
# appears we can likely optimize for that backend usage by directly
|
||||
# reading from the path where the data is stored, if we want).
|
||||
self._trash_path = self._path.replace(k_paths.basename(self._path),
|
||||
self.TRASH_FOLDER)
|
||||
# The backend to load the full logbooks from, since what is sent over
|
||||
# the data connection is only the logbook uuid and name, and not the
|
||||
# full logbook.
|
||||
self._persistence = persistence
|
||||
# Misc. internal details
|
||||
self._known_jobs = {}
|
||||
@@ -345,23 +322,34 @@ class ZookeeperJobBoard(base.NotifyingJobBoard):
|
||||
self._job_watcher = None
|
||||
# Since we use sequenced ids this will be the path that the sequences
|
||||
# are prefixed with, for example, job0000000001, job0000000002, ...
|
||||
self._job_base = k_paths.join(path, JOB_PREFIX)
|
||||
self._job_base = k_paths.join(path, self.JOB_PREFIX)
|
||||
self._worker = None
|
||||
self._emit_notifications = bool(emit_notifications)
|
||||
self._connected = False
|
||||
|
||||
def _emit(self, state, details):
|
||||
# Submit the work to the executor to avoid blocking the kazoo queue.
|
||||
# Submit the work to the executor to avoid blocking the kazoo threads
|
||||
# and queue(s)...
|
||||
worker = self._worker
|
||||
if worker is None:
|
||||
return
|
||||
try:
|
||||
self._worker.submit(self.notifier.notify, state, details)
|
||||
except (AttributeError, RuntimeError):
|
||||
# Notification thread is shutdown or non-existent, either case we
|
||||
# just want to skip submitting a notification...
|
||||
worker.submit(self.notifier.notify, state, details)
|
||||
except RuntimeError:
|
||||
# Notification thread is shutdown just skip submitting a
|
||||
# notification...
|
||||
pass
|
||||
|
||||
@property
|
||||
def path(self):
|
||||
"""Path where all job znodes will be stored."""
|
||||
return self._path
|
||||
|
||||
@property
|
||||
def trash_path(self):
|
||||
"""Path where all trashed job znodes will be stored."""
|
||||
return self._trash_path
|
||||
|
||||
@property
|
||||
def job_count(self):
|
||||
return len(self._known_jobs)
|
||||
@@ -375,15 +363,17 @@ class ZookeeperJobBoard(base.NotifyingJobBoard):
|
||||
def _force_refresh(self):
|
||||
try:
|
||||
children = self._client.get_children(self.path)
|
||||
except self._client.handler.timeout_exception as e:
|
||||
raise excp.JobFailure("Refreshing failure, operation timed out",
|
||||
e)
|
||||
except k_exceptions.SessionExpiredError as e:
|
||||
raise excp.JobFailure("Refreshing failure, session expired", e)
|
||||
except self._client.handler.timeout_exception:
|
||||
excp.raise_with_cause(excp.JobFailure,
|
||||
"Refreshing failure, operation timed out")
|
||||
except k_exceptions.SessionExpiredError:
|
||||
excp.raise_with_cause(excp.JobFailure,
|
||||
"Refreshing failure, session expired")
|
||||
except k_exceptions.NoNodeError:
|
||||
pass
|
||||
except k_exceptions.KazooException as e:
|
||||
raise excp.JobFailure("Refreshing failure, internal error", e)
|
||||
except k_exceptions.KazooException:
|
||||
excp.raise_with_cause(excp.JobFailure,
|
||||
"Refreshing failure, internal error")
|
||||
else:
|
||||
self._on_job_posting(children, delayed=False)
|
||||
|
||||
@@ -429,8 +419,9 @@ class ZookeeperJobBoard(base.NotifyingJobBoard):
|
||||
# jobs information into the known job set (if it's already
|
||||
# existing then just leave it alone).
|
||||
if path not in self._known_jobs:
|
||||
job = ZookeeperJob(job_data['name'], self,
|
||||
self._client, self._persistence, path,
|
||||
job = ZookeeperJob(self, job_data['name'],
|
||||
self._client, path,
|
||||
backend=self._persistence,
|
||||
uuid=job_data['uuid'],
|
||||
book_data=job_data.get("book"),
|
||||
details=job_data.get("details", {}),
|
||||
@@ -444,7 +435,8 @@ class ZookeeperJobBoard(base.NotifyingJobBoard):
|
||||
LOG.debug("Got children %s under path %s", children, self.path)
|
||||
child_paths = []
|
||||
for c in children:
|
||||
if c.endswith(LOCK_POSTFIX) or not c.startswith(JOB_PREFIX):
|
||||
if (c.endswith(self.LOCK_POSTFIX) or
|
||||
not c.startswith(self.JOB_PREFIX)):
|
||||
# Skip lock paths or non-job-paths (these are not valid jobs)
|
||||
continue
|
||||
child_paths.append(k_paths.join(self.path, c))
|
||||
@@ -488,45 +480,31 @@ class ZookeeperJobBoard(base.NotifyingJobBoard):
|
||||
self._process_child(path, request)
|
||||
|
||||
def post(self, name, book=None, details=None):
|
||||
|
||||
def format_posting(job_uuid):
|
||||
posting = {
|
||||
'uuid': job_uuid,
|
||||
'name': name,
|
||||
}
|
||||
if details:
|
||||
posting['details'] = details
|
||||
else:
|
||||
posting['details'] = {}
|
||||
if book is not None:
|
||||
posting['book'] = {
|
||||
'name': book.name,
|
||||
'uuid': book.uuid,
|
||||
}
|
||||
return posting
|
||||
|
||||
# NOTE(harlowja): Jobs are not ephemeral, they will persist until they
|
||||
# are consumed (this may change later, but seems safer to do this until
|
||||
# further notice).
|
||||
job_uuid = uuidutils.generate_uuid()
|
||||
job_posting = base.format_posting(job_uuid, name,
|
||||
book=book, details=details)
|
||||
raw_job_posting = misc.binary_encode(jsonutils.dumps(job_posting))
|
||||
with self._wrap(job_uuid, None,
|
||||
"Posting failure: %s", ensure_known=False):
|
||||
job_posting = format_posting(job_uuid)
|
||||
job_posting = misc.binary_encode(jsonutils.dumps(job_posting))
|
||||
fail_msg_tpl="Posting failure: %s",
|
||||
ensure_known=False):
|
||||
job_path = self._client.create(self._job_base,
|
||||
value=job_posting,
|
||||
value=raw_job_posting,
|
||||
sequence=True,
|
||||
ephemeral=False)
|
||||
job = ZookeeperJob(name, self, self._client,
|
||||
self._persistence, job_path,
|
||||
book=book, details=details,
|
||||
uuid=job_uuid)
|
||||
job = ZookeeperJob(self, name, self._client, job_path,
|
||||
backend=self._persistence,
|
||||
book=book, details=details, uuid=job_uuid,
|
||||
book_data=job_posting.get('book'))
|
||||
with self._job_cond:
|
||||
self._known_jobs[job_path] = job
|
||||
self._job_cond.notify_all()
|
||||
self._emit(base.POSTED, details={'job': job})
|
||||
return job
|
||||
|
||||
@base.check_who
|
||||
def claim(self, job, who):
|
||||
def _unclaimable_try_find_owner(cause):
|
||||
try:
|
||||
@@ -534,13 +512,14 @@ class ZookeeperJobBoard(base.NotifyingJobBoard):
|
||||
except Exception:
|
||||
owner = None
|
||||
if owner:
|
||||
msg = "Job %s already claimed by '%s'" % (job.uuid, owner)
|
||||
message = "Job %s already claimed by '%s'" % (job.uuid, owner)
|
||||
else:
|
||||
msg = "Job %s already claimed" % (job.uuid)
|
||||
return excp.UnclaimableJob(msg, cause)
|
||||
message = "Job %s already claimed" % (job.uuid)
|
||||
excp.raise_with_cause(excp.UnclaimableJob,
|
||||
message, cause=cause)
|
||||
|
||||
_check_who(who)
|
||||
with self._wrap(job.uuid, job.path, "Claiming failure: %s"):
|
||||
with self._wrap(job.uuid, job.path,
|
||||
fail_msg_tpl="Claiming failure: %s"):
|
||||
# NOTE(harlowja): post as json which will allow for future changes
|
||||
# more easily than a raw string/text.
|
||||
value = jsonutils.dumps({
|
||||
@@ -558,21 +537,23 @@ class ZookeeperJobBoard(base.NotifyingJobBoard):
|
||||
try:
|
||||
kazoo_utils.checked_commit(txn)
|
||||
except k_exceptions.NodeExistsError as e:
|
||||
raise _unclaimable_try_find_owner(e)
|
||||
_unclaimable_try_find_owner(e)
|
||||
except kazoo_utils.KazooTransactionException as e:
|
||||
if len(e.failures) < 2:
|
||||
raise
|
||||
else:
|
||||
if isinstance(e.failures[0], k_exceptions.NoNodeError):
|
||||
raise excp.NotFound(
|
||||
excp.raise_with_cause(
|
||||
excp.NotFound,
|
||||
"Job %s not found to be claimed" % job.uuid,
|
||||
e.failures[0])
|
||||
cause=e.failures[0])
|
||||
if isinstance(e.failures[1], k_exceptions.NodeExistsError):
|
||||
raise _unclaimable_try_find_owner(e.failures[1])
|
||||
_unclaimable_try_find_owner(e.failures[1])
|
||||
else:
|
||||
raise excp.UnclaimableJob(
|
||||
excp.raise_with_cause(
|
||||
excp.UnclaimableJob,
|
||||
"Job %s claim failed due to transaction"
|
||||
" not succeeding" % (job.uuid), e)
|
||||
" not succeeding" % (job.uuid), cause=e)
|
||||
|
||||
@contextlib.contextmanager
|
||||
def _wrap(self, job_uuid, job_path,
|
||||
@@ -588,21 +569,23 @@ class ZookeeperJobBoard(base.NotifyingJobBoard):
|
||||
raise excp.NotFound(fail_msg_tpl % (job_uuid))
|
||||
try:
|
||||
yield
|
||||
except self._client.handler.timeout_exception as e:
|
||||
except self._client.handler.timeout_exception:
|
||||
fail_msg_tpl += ", operation timed out"
|
||||
raise excp.JobFailure(fail_msg_tpl % (job_uuid), e)
|
||||
except k_exceptions.SessionExpiredError as e:
|
||||
excp.raise_with_cause(excp.JobFailure, fail_msg_tpl % (job_uuid))
|
||||
except k_exceptions.SessionExpiredError:
|
||||
fail_msg_tpl += ", session expired"
|
||||
raise excp.JobFailure(fail_msg_tpl % (job_uuid), e)
|
||||
excp.raise_with_cause(excp.JobFailure, fail_msg_tpl % (job_uuid))
|
||||
except k_exceptions.NoNodeError:
|
||||
fail_msg_tpl += ", unknown job"
|
||||
raise excp.NotFound(fail_msg_tpl % (job_uuid))
|
||||
except k_exceptions.KazooException as e:
|
||||
excp.raise_with_cause(excp.NotFound, fail_msg_tpl % (job_uuid))
|
||||
except k_exceptions.KazooException:
|
||||
fail_msg_tpl += ", internal error"
|
||||
raise excp.JobFailure(fail_msg_tpl % (job_uuid), e)
|
||||
excp.raise_with_cause(excp.JobFailure, fail_msg_tpl % (job_uuid))
|
||||
|
||||
def find_owner(self, job):
|
||||
with self._wrap(job.uuid, job.path, "Owner query failure: %s"):
|
||||
with self._wrap(job.uuid, job.path,
|
||||
fail_msg_tpl="Owner query failure: %s",
|
||||
ensure_known=False):
|
||||
try:
|
||||
self._client.sync(job.lock_path)
|
||||
raw_data, _lock_stat = self._client.get(job.lock_path)
|
||||
@@ -618,14 +601,16 @@ class ZookeeperJobBoard(base.NotifyingJobBoard):
|
||||
return (misc.decode_json(lock_data), lock_stat,
|
||||
misc.decode_json(job_data), job_stat)
|
||||
|
||||
@base.check_who
|
||||
def consume(self, job, who):
|
||||
_check_who(who)
|
||||
with self._wrap(job.uuid, job.path, "Consumption failure: %s"):
|
||||
with self._wrap(job.uuid, job.path,
|
||||
fail_msg_tpl="Consumption failure: %s"):
|
||||
try:
|
||||
owner_data = self._get_owner_and_data(job)
|
||||
lock_data, lock_stat, data, data_stat = owner_data
|
||||
except k_exceptions.NoNodeError:
|
||||
raise excp.JobFailure("Can not consume a job %s"
|
||||
excp.raise_with_cause(excp.NotFound,
|
||||
"Can not consume a job %s"
|
||||
" which we can not determine"
|
||||
" the owner of" % (job.uuid))
|
||||
if lock_data.get("owner") != who:
|
||||
@@ -638,14 +623,16 @@ class ZookeeperJobBoard(base.NotifyingJobBoard):
|
||||
kazoo_utils.checked_commit(txn)
|
||||
self._remove_job(job.path)
|
||||
|
||||
@base.check_who
|
||||
def abandon(self, job, who):
|
||||
_check_who(who)
|
||||
with self._wrap(job.uuid, job.path, "Abandonment failure: %s"):
|
||||
with self._wrap(job.uuid, job.path,
|
||||
fail_msg_tpl="Abandonment failure: %s"):
|
||||
try:
|
||||
owner_data = self._get_owner_and_data(job)
|
||||
lock_data, lock_stat, data, data_stat = owner_data
|
||||
except k_exceptions.NoNodeError:
|
||||
raise excp.JobFailure("Can not abandon a job %s"
|
||||
excp.raise_with_cause(excp.NotFound,
|
||||
"Can not abandon a job %s"
|
||||
" which we can not determine"
|
||||
" the owner of" % (job.uuid))
|
||||
if lock_data.get("owner") != who:
|
||||
@@ -656,12 +643,36 @@ class ZookeeperJobBoard(base.NotifyingJobBoard):
|
||||
txn.delete(job.lock_path, version=lock_stat.version)
|
||||
kazoo_utils.checked_commit(txn)
|
||||
|
||||
@base.check_who
|
||||
def trash(self, job, who):
|
||||
with self._wrap(job.uuid, job.path,
|
||||
fail_msg_tpl="Trash failure: %s"):
|
||||
try:
|
||||
owner_data = self._get_owner_and_data(job)
|
||||
lock_data, lock_stat, data, data_stat = owner_data
|
||||
except k_exceptions.NoNodeError:
|
||||
excp.raise_with_cause(excp.NotFound,
|
||||
"Can not trash a job %s"
|
||||
" which we can not determine"
|
||||
" the owner of" % (job.uuid))
|
||||
if lock_data.get("owner") != who:
|
||||
raise excp.JobFailure("Can not trash a job %s"
|
||||
" which is not owned by %s"
|
||||
% (job.uuid, who))
|
||||
trash_path = job.path.replace(self.path, self.trash_path)
|
||||
value = misc.binary_encode(jsonutils.dumps(data))
|
||||
txn = self._client.transaction()
|
||||
txn.create(trash_path, value=value)
|
||||
txn.delete(job.lock_path, version=lock_stat.version)
|
||||
txn.delete(job.path, version=data_stat.version)
|
||||
kazoo_utils.checked_commit(txn)
|
||||
|
||||
def _state_change_listener(self, state):
|
||||
LOG.debug("Kazoo client has changed to state: %s", state)
|
||||
|
||||
def wait(self, timeout=None):
|
||||
# Wait until timeout expires (or forever) for jobs to appear.
|
||||
watch = tt.StopWatch(duration=timeout)
|
||||
watch = timeutils.StopWatch(duration=timeout)
|
||||
watch.start()
|
||||
with self._job_cond:
|
||||
while True:
|
||||
@@ -684,9 +695,9 @@ class ZookeeperJobBoard(base.NotifyingJobBoard):
|
||||
|
||||
@property
|
||||
def connected(self):
|
||||
return self._client.connected
|
||||
return self._connected and self._client.connected
|
||||
|
||||
@lock_utils.locked(lock='_open_close_lock')
|
||||
@fasteners.locked(lock='_open_close_lock')
|
||||
def close(self):
|
||||
if self._owned:
|
||||
LOG.debug("Stopping client")
|
||||
@@ -698,8 +709,9 @@ class ZookeeperJobBoard(base.NotifyingJobBoard):
|
||||
with self._job_cond:
|
||||
self._known_jobs.clear()
|
||||
LOG.debug("Stopped & cleared local state")
|
||||
self._connected = False
|
||||
|
||||
@lock_utils.locked(lock='_open_close_lock')
|
||||
@fasteners.locked(lock='_open_close_lock')
|
||||
def connect(self, timeout=10.0):
|
||||
|
||||
def try_clean():
|
||||
@@ -717,25 +729,33 @@ class ZookeeperJobBoard(base.NotifyingJobBoard):
|
||||
timeout = float(timeout)
|
||||
self._client.start(timeout=timeout)
|
||||
except (self._client.handler.timeout_exception,
|
||||
k_exceptions.KazooException) as e:
|
||||
raise excp.JobFailure("Failed to connect to zookeeper", e)
|
||||
k_exceptions.KazooException):
|
||||
excp.raise_with_cause(excp.JobFailure,
|
||||
"Failed to connect to zookeeper")
|
||||
try:
|
||||
if self._conf.get('check_compatible', True):
|
||||
kazoo_utils.check_compatible(self._client, MIN_ZK_VERSION)
|
||||
kazoo_utils.check_compatible(self._client, self.MIN_ZK_VERSION)
|
||||
if self._worker is None and self._emit_notifications:
|
||||
self._worker = futures.ThreadPoolExecutor(max_workers=1)
|
||||
self._worker = futurist.ThreadPoolExecutor(max_workers=1)
|
||||
self._client.ensure_path(self.path)
|
||||
self._client.ensure_path(self.trash_path)
|
||||
if self._job_watcher is None:
|
||||
self._job_watcher = watchers.ChildrenWatch(
|
||||
self._client,
|
||||
self.path,
|
||||
func=self._on_job_posting,
|
||||
allow_session_lost=True)
|
||||
self._connected = True
|
||||
except excp.IncompatibleVersion:
|
||||
with excutils.save_and_reraise_exception():
|
||||
try_clean()
|
||||
except (self._client.handler.timeout_exception,
|
||||
k_exceptions.KazooException) as e:
|
||||
try_clean()
|
||||
raise excp.JobFailure("Failed to do post-connection"
|
||||
" initialization", e)
|
||||
k_exceptions.KazooException):
|
||||
exc_type, exc, exc_tb = sys.exc_info()
|
||||
try:
|
||||
try_clean()
|
||||
excp.raise_with_cause(excp.JobFailure,
|
||||
"Failed to do post-connection"
|
||||
" initialization", cause=exc)
|
||||
finally:
|
||||
del(exc_type, exc, exc_tb)
|
||||
|
||||
@@ -16,6 +16,7 @@
|
||||
# under the License.
|
||||
|
||||
import abc
|
||||
import contextlib
|
||||
|
||||
from oslo_utils import uuidutils
|
||||
import six
|
||||
@@ -43,7 +44,9 @@ class Job(object):
|
||||
reverting...
|
||||
"""
|
||||
|
||||
def __init__(self, name, uuid=None, details=None):
|
||||
def __init__(self, board, name,
|
||||
uuid=None, details=None, backend=None,
|
||||
book=None, book_data=None):
|
||||
if uuid:
|
||||
self._uuid = uuid
|
||||
else:
|
||||
@@ -52,45 +55,62 @@ class Job(object):
|
||||
if not details:
|
||||
details = {}
|
||||
self._details = details
|
||||
self._backend = backend
|
||||
self._board = board
|
||||
self._book = book
|
||||
if not book_data:
|
||||
book_data = {}
|
||||
self._book_data = book_data
|
||||
|
||||
@abc.abstractproperty
|
||||
def last_modified(self):
|
||||
"""The datetime the job was last modified."""
|
||||
pass
|
||||
|
||||
@abc.abstractproperty
|
||||
def created_on(self):
|
||||
"""The datetime the job was created on."""
|
||||
pass
|
||||
|
||||
@abc.abstractproperty
|
||||
@property
|
||||
def board(self):
|
||||
"""The board this job was posted on or was created from."""
|
||||
return self._board
|
||||
|
||||
@abc.abstractproperty
|
||||
def state(self):
|
||||
"""The current state of this job."""
|
||||
"""Access the current state of this job."""
|
||||
pass
|
||||
|
||||
@abc.abstractproperty
|
||||
@property
|
||||
def book(self):
|
||||
"""Logbook associated with this job.
|
||||
|
||||
If no logbook is associated with this job, this property is None.
|
||||
"""
|
||||
if self._book is None:
|
||||
self._book = self._load_book()
|
||||
return self._book
|
||||
|
||||
@abc.abstractproperty
|
||||
@property
|
||||
def book_uuid(self):
|
||||
"""UUID of logbook associated with this job.
|
||||
|
||||
If no logbook is associated with this job, this property is None.
|
||||
"""
|
||||
if self._book is not None:
|
||||
return self._book.uuid
|
||||
else:
|
||||
return self._book_data.get('uuid')
|
||||
|
||||
@abc.abstractproperty
|
||||
@property
|
||||
def book_name(self):
|
||||
"""Name of logbook associated with this job.
|
||||
|
||||
If no logbook is associated with this job, this property is None.
|
||||
"""
|
||||
if self._book is not None:
|
||||
return self._book.name
|
||||
else:
|
||||
return self._book_data.get('name')
|
||||
|
||||
@property
|
||||
def uuid(self):
|
||||
@@ -107,10 +127,24 @@ class Job(object):
|
||||
"""The non-uniquely identifying name of this job."""
|
||||
return self._name
|
||||
|
||||
def _load_book(self):
|
||||
book_uuid = self.book_uuid
|
||||
if self._backend is not None and book_uuid is not None:
|
||||
# TODO(harlowja): we are currently limited by assuming that the
|
||||
# job posted has the same backend as this loader (to start this
|
||||
# seems to be an ok assumption, and can be adjusted in the future
|
||||
# if we determine there is a use-case for multi-backend loaders,
|
||||
# aka a registry of loaders).
|
||||
with contextlib.closing(self._backend.get_connection()) as conn:
|
||||
return conn.get_logbook(book_uuid)
|
||||
# No backend to fetch from or no uuid specified
|
||||
return None
|
||||
|
||||
def __str__(self):
|
||||
"""Pretty formats the job into something *more* meaningful."""
|
||||
return "%s %s (%s): %s" % (type(self).__name__,
|
||||
self.name, self.uuid, self.details)
|
||||
return "%s: %s (uuid=%s, details=%s)" % (type(self).__name__,
|
||||
self.name, self.uuid,
|
||||
self.details)
|
||||
|
||||
|
||||
@six.add_metaclass(abc.ABCMeta)
|
||||
@@ -260,6 +294,25 @@ class JobBoard(object):
|
||||
this must be the same name that was used for claiming this job.
|
||||
"""
|
||||
|
||||
@abc.abstractmethod
|
||||
def trash(self, job, who):
|
||||
"""Trash the provided job.
|
||||
|
||||
Trashing a job signals to others that the job is broken and should not
|
||||
be reclaimed. This is provided as an option for users to be able to
|
||||
remove jobs from the board externally. The trashed job details should
|
||||
be kept around in an alternate location to be reviewed, if desired.
|
||||
|
||||
Only the entity that has claimed that job can trash a job. Any entity
|
||||
trashing an unclaimed job (or a job they do not own) will cause an
|
||||
exception.
|
||||
|
||||
:param job: a job on this jobboard that can be trashed (if it does
|
||||
not exist then a NotFound exception will be raised).
|
||||
:param who: string that names the entity performing the trashing,
|
||||
this must be the same name that was used for claiming this job.
|
||||
"""
|
||||
|
||||
@abc.abstractproperty
|
||||
def connected(self):
|
||||
"""Returns if this jobboard is connected."""
|
||||
@@ -295,3 +348,40 @@ class NotifyingJobBoard(JobBoard):
|
||||
def __init__(self, name, conf):
|
||||
super(NotifyingJobBoard, self).__init__(name, conf)
|
||||
self.notifier = notifier.Notifier()
|
||||
|
||||
|
||||
# Internal helpers for usage by board implementations...
|
||||
|
||||
def check_who(meth):
|
||||
|
||||
@six.wraps(meth)
|
||||
def wrapper(self, job, who, *args, **kwargs):
|
||||
if not isinstance(who, six.string_types):
|
||||
raise TypeError("Job applicant must be a string type")
|
||||
if len(who) == 0:
|
||||
raise ValueError("Job applicant must be non-empty")
|
||||
return meth(self, job, who, *args, **kwargs)
|
||||
|
||||
return wrapper
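A small sketch (the board implementation and its internal helper are hypothetical) showing how a concrete jobboard could apply this decorator so that ``who`` is validated before any real claiming work happens:

    class MyJobBoard(JobBoard):

        @check_who
        def claim(self, job, who):
            # ``who`` is guaranteed to be a non-empty string here.
            self._do_claim(job, who)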
|
||||
|
||||
|
||||
def format_posting(uuid, name, created_on=None, last_modified=None,
|
||||
details=None, book=None):
|
||||
posting = {
|
||||
'uuid': uuid,
|
||||
'name': name,
|
||||
}
|
||||
if created_on is not None:
|
||||
posting['created_on'] = created_on
|
||||
if last_modified is not None:
|
||||
posting['last_modified'] = last_modified
|
||||
if details:
|
||||
posting['details'] = details
|
||||
else:
|
||||
posting['details'] = {}
|
||||
if book is not None:
|
||||
posting['book'] = {
|
||||
'name': book.name,
|
||||
'uuid': book.uuid,
|
||||
}
|
||||
return posting
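Roughly what this helper yields (the values shown are illustrative only):

    format_posting('a1b2c3', 'convert-files', details={'frames': 1024})
    # -> {'uuid': 'a1b2c3', 'name': 'convert-files',
    #     'details': {'frames': 1024}}
    # Passing book= additionally embeds {'book': {'name': ..., 'uuid': ...}}.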
|
||||
|
||||
@@ -18,6 +18,7 @@ from __future__ import absolute_import
|
||||
|
||||
import abc
|
||||
|
||||
from debtcollector import moves
|
||||
from oslo_utils import excutils
|
||||
import six
|
||||
|
||||
@@ -25,7 +26,6 @@ from taskflow import logging
|
||||
from taskflow import states
|
||||
from taskflow.types import failure
|
||||
from taskflow.types import notifier
|
||||
from taskflow.utils import deprecation
|
||||
|
||||
LOG = logging.getLogger(__name__)
|
||||
|
||||
@@ -165,10 +165,8 @@ class Listener(object):
|
||||
|
||||
|
||||
# TODO(harlowja): remove in 0.7 or later...
|
||||
ListenerBase = deprecation.moved_inheritable_class(Listener,
|
||||
'ListenerBase', __name__,
|
||||
version="0.6",
|
||||
removal_version="?")
|
||||
ListenerBase = moves.moved_class(Listener, 'ListenerBase', __name__,
|
||||
version="0.6", removal_version="2.0")
|
||||
|
||||
|
||||
@six.add_metaclass(abc.ABCMeta)
|
||||
@@ -213,10 +211,18 @@ class DumpingListener(Listener):
|
||||
|
||||
|
||||
# TODO(harlowja): remove in 0.7 or later...
|
||||
class LoggingBase(deprecation.moved_inheritable_class(DumpingListener,
|
||||
'LoggingBase', __name__,
|
||||
version="0.6",
|
||||
removal_version="?")):
|
||||
class LoggingBase(moves.moved_class(DumpingListener,
|
||||
'LoggingBase', __name__,
|
||||
version="0.6", removal_version="2.0")):
|
||||
|
||||
"""Legacy logging base.
|
||||
|
||||
.. deprecated:: 0.6
|
||||
|
||||
This class is **deprecated** and is present for backward
|
||||
compatibility **only**, its replacement
|
||||
:py:class:`.DumpingListener` should be used going forward.
|
||||
"""
|
||||
|
||||
def _dump(self, message, *args, **kwargs):
|
||||
self._log(message, *args, **kwargs)
|
||||
|
||||
@@ -17,12 +17,15 @@
|
||||
from __future__ import absolute_import
|
||||
|
||||
import itertools
|
||||
import time
|
||||
|
||||
from debtcollector import moves
|
||||
from oslo_utils import timeutils
|
||||
|
||||
from taskflow import exceptions as exc
|
||||
from taskflow.listeners import base
|
||||
from taskflow import logging
|
||||
from taskflow import states
|
||||
from taskflow.types import timing as tt
|
||||
|
||||
STARTING_STATES = frozenset((states.RUNNING, states.REVERTING))
|
||||
FINISHED_STATES = frozenset((base.FINISH_STATES + (states.REVERTED,)))
|
||||
@@ -39,7 +42,7 @@ def _printer(message):
|
||||
print(message)
|
||||
|
||||
|
||||
class TimingListener(base.Listener):
|
||||
class DurationListener(base.Listener):
|
||||
"""Listener that captures task duration.
|
||||
|
||||
It records how long a task took to execute (or fail)
|
||||
@@ -47,13 +50,13 @@ class TimingListener(base.Listener):
|
||||
to task metadata with key ``'duration'``.
|
||||
"""
|
||||
def __init__(self, engine):
|
||||
super(TimingListener, self).__init__(engine,
|
||||
task_listen_for=WATCH_STATES,
|
||||
flow_listen_for=[])
|
||||
super(DurationListener, self).__init__(engine,
|
||||
task_listen_for=WATCH_STATES,
|
||||
flow_listen_for=[])
|
||||
self._timers = {}
|
||||
|
||||
def deregister(self):
|
||||
super(TimingListener, self).deregister()
|
||||
super(DurationListener, self).deregister()
|
||||
# There should be none that still exist at deregistering time, so log a
|
||||
# warning if there were any that somehow still got left behind...
|
||||
leftover_timers = len(self._timers)
|
||||
@@ -78,7 +81,7 @@ class TimingListener(base.Listener):
|
||||
if state == states.PENDING:
|
||||
self._timers.pop(task_name, None)
|
||||
elif state in STARTING_STATES:
|
||||
self._timers[task_name] = tt.StopWatch().start()
|
||||
self._timers[task_name] = timeutils.StopWatch().start()
|
||||
elif state in FINISHED_STATES:
|
||||
timer = self._timers.pop(task_name, None)
|
||||
if timer is not None:
|
||||
@@ -86,22 +89,76 @@ class TimingListener(base.Listener):
|
||||
self._record_ending(timer, task_name)
|
||||
|
||||
|
||||
class PrintingTimingListener(TimingListener):
|
||||
"""Listener that prints the start & stop timing as well as recording it."""
|
||||
TimingListener = moves.moved_class(DurationListener,
|
||||
'TimingListener', __name__,
|
||||
version="0.8", removal_version="2.0")
|
||||
|
||||
|
||||
class PrintingDurationListener(DurationListener):
|
||||
"""Listener that prints the duration as well as recording it."""
|
||||
|
||||
def __init__(self, engine, printer=None):
|
||||
super(PrintingTimingListener, self).__init__(engine)
|
||||
super(PrintingDurationListener, self).__init__(engine)
|
||||
if printer is None:
|
||||
self._printer = _printer
|
||||
else:
|
||||
self._printer = printer
|
||||
|
||||
def _record_ending(self, timer, task_name):
|
||||
super(PrintingTimingListener, self)._record_ending(timer, task_name)
|
||||
super(PrintingDurationListener, self)._record_ending(timer, task_name)
|
||||
self._printer("It took task '%s' %0.2f seconds to"
|
||||
" finish." % (task_name, timer.elapsed()))
|
||||
|
||||
def _task_receiver(self, state, details):
|
||||
super(PrintingTimingListener, self)._task_receiver(state, details)
|
||||
super(PrintingDurationListener, self)._task_receiver(state, details)
|
||||
if state in STARTING_STATES:
|
||||
self._printer("'%s' task started." % (details['task_name']))
|
||||
|
||||
|
||||
PrintingTimingListener = moves.moved_class(
|
||||
PrintingDurationListener, 'PrintingTimingListener', __name__,
|
||||
version="0.8", removal_version="2.0")
|
||||
|
||||
|
||||
class EventTimeListener(base.Listener):
|
||||
"""Listener that captures task, flow, and retry event timestamps.
|
||||
|
||||
It records when an event is received (using unix time) to
|
||||
storage. It saves the timestamps under keys (in atom or flow details
|
||||
metadata) of the format ``{event}-timestamp`` where ``event`` is the
|
||||
state/event name that has been received.
|
||||
|
||||
This information can be later extracted/examined to derive durations...
|
||||
"""
|
||||
|
||||
def __init__(self, engine,
|
||||
task_listen_for=base.DEFAULT_LISTEN_FOR,
|
||||
flow_listen_for=base.DEFAULT_LISTEN_FOR,
|
||||
retry_listen_for=base.DEFAULT_LISTEN_FOR):
|
||||
super(EventTimeListener, self).__init__(
|
||||
engine, task_listen_for=task_listen_for,
|
||||
flow_listen_for=flow_listen_for, retry_listen_for=retry_listen_for)
|
||||
|
||||
def _record_atom_event(self, state, atom_name):
|
||||
meta_update = {'%s-timestamp' % state: time.time()}
|
||||
try:
|
||||
# Don't let storage failures throw exceptions in a listener method.
|
||||
self._engine.storage.update_atom_metadata(atom_name, meta_update)
|
||||
except exc.StorageFailure:
|
||||
LOG.warn("Failure to store timestamp %s for atom %s",
|
||||
meta_update, atom_name, exc_info=True)
|
||||
|
||||
def _flow_receiver(self, state, details):
|
||||
meta_update = {'%s-timestamp' % state: time.time()}
|
||||
try:
|
||||
# Don't let storage failures throw exceptions in a listener method.
|
||||
self._engine.storage.update_flow_metadata(meta_update)
|
||||
except exc.StorageFailure:
|
||||
LOG.warn("Failure to store timestamp %s for flow %s",
|
||||
meta_update, details['flow_name'], exc_info=True)
|
||||
|
||||
def _task_receiver(self, state, details):
|
||||
self._record_atom_event(state, details['task_name'])
|
||||
|
||||
def _retry_receiver(self, state, details):
|
||||
self._record_atom_event(state, details['retry_name'])
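Because only raw timestamps are stored, durations can be derived later by simple subtraction; an illustrative example using made-up metadata:

    meta = {'RUNNING-timestamp': 1434567890.12,
            'SUCCESS-timestamp': 1434567892.62}
    duration = meta['SUCCESS-timestamp'] - meta['RUNNING-timestamp']
    print("Task took %0.2f seconds" % duration)  # -> Task took 2.50 seconds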
|
||||
|
||||
@@ -32,6 +32,7 @@ CRITICAL = logging.CRITICAL
|
||||
DEBUG = logging.DEBUG
|
||||
ERROR = logging.ERROR
|
||||
FATAL = logging.FATAL
|
||||
INFO = logging.INFO
|
||||
NOTSET = logging.NOTSET
|
||||
WARN = logging.WARN
|
||||
WARNING = logging.WARNING
|
||||
|
||||
@@ -16,22 +16,32 @@
|
||||
|
||||
import collections
|
||||
|
||||
import six
|
||||
|
||||
from taskflow import exceptions as exc
|
||||
from taskflow import flow
|
||||
from taskflow.types import graph as gr
|
||||
|
||||
|
||||
def _unsatisfied_requires(node, graph, *additional_provided):
|
||||
"""Extracts the unsatisified symbol requirements of a single node."""
|
||||
requires = set(node.requires)
|
||||
if not requires:
|
||||
return requires
|
||||
for provided in additional_provided:
|
||||
requires = requires - provided
|
||||
# This is using the difference() method vs the -
|
||||
# operator since the latter doesn't work with frozen
|
||||
# or regular sets (when used in combination with ordered
|
||||
# sets).
|
||||
#
|
||||
# If this is not done the following happens...
|
||||
#
|
||||
# TypeError: unsupported operand type(s)
|
||||
# for -: 'set' and 'OrderedSet'
|
||||
requires = requires.difference(provided)
|
||||
if not requires:
|
||||
return requires
|
||||
for pred in graph.bfs_predecessors_iter(node):
|
||||
requires = requires - pred.provides
|
||||
requires = requires.difference(pred.provides)
|
||||
if not requires:
|
||||
return requires
|
||||
return requires
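Why ``.difference()`` instead of the ``-`` operator (illustrative): the binary operator requires both operands to be real sets, while ``.difference()`` accepts any iterable (such as an ordered set type).

    s = {'a', 'b', 'c'}
    s.difference(['b', 'c'])  # -> {'a'} (works with any iterable)
    # s - ['b', 'c']          # TypeError: unsupported operand type(s) ...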
|
||||
@@ -55,16 +65,23 @@ class Flow(flow.Flow):
|
||||
self._graph = gr.DiGraph()
|
||||
self._graph.freeze()
|
||||
|
||||
def link(self, u, v):
|
||||
#: Extracts the unsatisfied symbol requirements of a single node.
|
||||
_unsatisfied_requires = staticmethod(_unsatisfied_requires)
|
||||
|
||||
def link(self, u, v, decider=None):
|
||||
"""Link existing node u as a runtime dependency of existing node v."""
|
||||
if not self._graph.has_node(u):
|
||||
raise ValueError('Item %s not found to link from' % (u))
|
||||
raise ValueError("Node '%s' not found to link from" % (u))
|
||||
if not self._graph.has_node(v):
|
||||
raise ValueError('Item %s not found to link to' % (v))
|
||||
self._swap(self._link(u, v, manual=True))
|
||||
raise ValueError("Node '%s' not found to link to" % (v))
|
||||
if decider is not None:
|
||||
if not six.callable(decider):
|
||||
raise ValueError("Decider boolean callback must be callable")
|
||||
self._swap(self._link(u, v, manual=True, decider=decider))
|
||||
return self
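A hedged sketch of the new ``decider`` support (the task classes, and the assumption that the callback receives a history mapping of upstream results, are illustrative): ``b`` stays linked after ``a`` but only runs when the callback returns a truthy value.

    from taskflow.patterns import graph_flow

    a = CheckTask('a', provides='ok')
    b = CleanupTask('b')

    flow = graph_flow.Flow('demo').add(a, b)
    flow.link(a, b, decider=lambda history: history.get('a') is True)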
|
||||
|
||||
def _link(self, u, v, graph=None, reason=None, manual=False):
|
||||
def _link(self, u, v, graph=None,
|
||||
reason=None, manual=False, decider=None):
|
||||
mutable_graph = True
|
||||
if graph is None:
|
||||
graph = self._graph
|
||||
@@ -74,6 +91,8 @@ class Flow(flow.Flow):
|
||||
attrs = graph.get_edge_data(u, v)
|
||||
if not attrs:
|
||||
attrs = {}
|
||||
if decider is not None:
|
||||
attrs[flow.LINK_DECIDER] = decider
|
||||
if manual:
|
||||
attrs[flow.LINK_MANUAL] = True
|
||||
if reason is not None:
|
||||
@@ -94,34 +113,38 @@ class Flow(flow.Flow):
|
||||
direct access to the underlying graph).
|
||||
"""
|
||||
if not graph.is_directed_acyclic():
|
||||
raise exc.DependencyFailure("No path through the items in the"
|
||||
raise exc.DependencyFailure("No path through the node(s) in the"
|
||||
" graph produces an ordering that"
|
||||
" will allow for logical"
|
||||
" edge traversal")
|
||||
self._graph = graph.freeze()
|
||||
|
||||
def add(self, *items, **kwargs):
|
||||
def add(self, *nodes, **kwargs):
|
||||
"""Adds a given task/tasks/flow/flows to this flow.
|
||||
|
||||
:param items: items to add to the flow
|
||||
:param nodes: node(s) to add to the flow
|
||||
:param kwargs: keyword arguments, the two keyword arguments
|
||||
currently processed are:
|
||||
|
||||
* ``resolve_requires`` a boolean that when true (the
|
||||
default) implies that when items are added their
|
||||
symbol requirements will be matched to existing items
|
||||
and links will be automatically made to those
|
||||
default) implies that when node(s) are added their
|
||||
symbol requirements will be matched to existing
|
||||
node(s) and links will be automatically made to those
|
||||
providers. If multiple possible providers exist
|
||||
then an AmbiguousDependency exception will be raised.
|
||||
* ``resolve_existing``, a boolean that when true (the
|
||||
default) implies that on addition of a new item that
|
||||
existing items will have their requirements scanned
|
||||
for symbols that this newly added item can provide.
|
||||
default) implies that on addition of a new node that
|
||||
existing node(s) will have their requirements scanned
|
||||
for symbols that this newly added node can provide.
|
||||
If a match is found a link is automatically created
|
||||
from the newly added item to the requiree.
|
||||
from the newly added node to the requiree.
|
||||
"""
|
||||
items = [i for i in items if not self._graph.has_node(i)]
|
||||
if not items:
|
||||
|
||||
# Let's try to avoid doing any work if we can; since the below code
|
||||
# after this filter can create more temporary graphs that aren't needed
|
||||
# if the nodes already exist...
|
||||
nodes = [i for i in nodes if not self._graph.has_node(i)]
|
||||
if not nodes:
|
||||
return self
|
||||
|
||||
# This syntax will *hopefully* be better in future versions of python.
|
||||
@@ -143,52 +166,52 @@ class Flow(flow.Flow):
|
||||
retry_provides.add(value)
|
||||
provided[value].append(self._retry)
|
||||
|
||||
for item in self._graph.nodes_iter():
|
||||
for value in _unsatisfied_requires(item, self._graph,
|
||||
retry_provides):
|
||||
required[value].append(item)
|
||||
for value in item.provides:
|
||||
provided[value].append(item)
|
||||
for node in self._graph.nodes_iter():
|
||||
for value in self._unsatisfied_requires(node, self._graph,
|
||||
retry_provides):
|
||||
required[value].append(node)
|
||||
for value in node.provides:
|
||||
provided[value].append(node)
|
||||
|
||||
# NOTE(harlowja): Add items and edges to a temporary copy of the
|
||||
# NOTE(harlowja): Add node(s) and edge(s) to a temporary copy of the
|
||||
# underlying graph and only if that is successful added to do we then
|
||||
# swap with the underlying graph.
|
||||
tmp_graph = gr.DiGraph(self._graph)
|
||||
for item in items:
|
||||
tmp_graph.add_node(item)
|
||||
for node in nodes:
|
||||
tmp_graph.add_node(node)
|
||||
|
||||
# Try to find a valid provider.
|
||||
if resolve_requires:
|
||||
for value in _unsatisfied_requires(item, tmp_graph,
|
||||
retry_provides):
|
||||
for value in self._unsatisfied_requires(node, tmp_graph,
|
||||
retry_provides):
|
||||
if value in provided:
|
||||
providers = provided[value]
|
||||
if len(providers) > 1:
|
||||
provider_names = [n.name for n in providers]
|
||||
raise exc.AmbiguousDependency(
|
||||
"Resolution error detected when"
|
||||
" adding %(item)s, multiple"
|
||||
" adding '%(node)s', multiple"
|
||||
" providers %(providers)s found for"
|
||||
" required symbol '%(value)s'"
|
||||
% dict(item=item.name,
|
||||
% dict(node=node.name,
|
||||
providers=sorted(provider_names),
|
||||
value=value))
|
||||
else:
|
||||
self._link(providers[0], item,
|
||||
self._link(providers[0], node,
|
||||
graph=tmp_graph, reason=value)
|
||||
else:
|
||||
required[value].append(item)
|
||||
required[value].append(node)
|
||||
|
||||
for value in item.provides:
|
||||
provided[value].append(item)
|
||||
for value in node.provides:
|
||||
provided[value].append(node)
|
||||
|
||||
# See if what we provide fulfills any existing requiree.
|
||||
if resolve_existing:
|
||||
for value in item.provides:
|
||||
for value in node.provides:
|
||||
if value in required:
|
||||
for requiree in list(required[value]):
|
||||
if requiree is not item:
|
||||
self._link(item, requiree,
|
||||
if requiree is not node:
|
||||
self._link(node, requiree,
|
||||
graph=tmp_graph, reason=value)
|
||||
required[value].remove(requiree)
|
||||
|
||||
@@ -222,8 +245,9 @@ class Flow(flow.Flow):
|
||||
requires.update(self._retry.requires)
|
||||
retry_provides.update(self._retry.provides)
|
||||
g = self._get_subgraph()
|
||||
for item in g.nodes_iter():
|
||||
requires.update(_unsatisfied_requires(item, g, retry_provides))
|
||||
for node in g.nodes_iter():
|
||||
requires.update(self._unsatisfied_requires(node, g,
|
||||
retry_provides))
|
||||
return frozenset(requires)
|
||||
|
||||
|
||||
@@ -239,36 +263,35 @@ class TargetedFlow(Flow):
|
||||
self._subgraph = None
|
||||
self._target = None
|
||||
|
||||
def set_target(self, target_item):
|
||||
def set_target(self, target_node):
|
||||
"""Set target for the flow.
|
||||
|
||||
Any items (tasks or subflows) not needed for the target
|
||||
item will not be executed.
|
||||
Any node(s) (tasks or subflows) not needed for the target
|
||||
node will not be executed.
|
||||
"""
|
||||
if not self._graph.has_node(target_item):
|
||||
raise ValueError('Item %s not found' % target_item)
|
||||
self._target = target_item
|
||||
if not self._graph.has_node(target_node):
|
||||
raise ValueError("Node '%s' not found" % target_node)
|
||||
self._target = target_node
|
||||
self._subgraph = None
|
||||
|
||||
def reset_target(self):
|
||||
"""Reset target for the flow.
|
||||
|
||||
All items of the flow will be executed.
|
||||
All node(s) of the flow will be executed.
|
||||
"""
|
||||
|
||||
self._target = None
|
||||
self._subgraph = None
|
||||
|
||||
def add(self, *items):
|
||||
def add(self, *nodes):
|
||||
"""Adds a given task/tasks/flow/flows to this flow."""
|
||||
super(TargetedFlow, self).add(*items)
|
||||
super(TargetedFlow, self).add(*nodes)
|
||||
# reset cached subgraph, in case it was affected
|
||||
self._subgraph = None
|
||||
return self
|
||||
|
||||
def link(self, u, v):
|
||||
def link(self, u, v, decider=None):
|
||||
"""Link existing node u as a runtime dependency of existing node v."""
|
||||
super(TargetedFlow, self).link(u, v)
|
||||
super(TargetedFlow, self).link(u, v, decider=decider)
|
||||
# reset cached subgraph, in case it was affected
|
||||
self._subgraph = None
|
||||
return self
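A small targeting sketch (the ``fetch``/``build``/``publish`` tasks are hypothetical): once a target is set, only the nodes required to reach it will be executed.

    from taskflow.patterns import graph_flow

    flow = graph_flow.TargetedFlow('demo').add(fetch, build, publish)
    flow.link(fetch, build)
    flow.link(build, publish)
    flow.set_target(build)  # 'publish' is not needed to reach 'build'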
|
||||
|
||||
@@ -16,6 +16,7 @@
|
||||
|
||||
import contextlib
|
||||
|
||||
import six
|
||||
from stevedore import driver
|
||||
|
||||
from taskflow import exceptions as exc
|
||||
@@ -38,18 +39,20 @@ def fetch(conf, namespace=BACKEND_NAMESPACE, **kwargs):
|
||||
|
||||
NOTE(harlowja): to aid in making it easy to specify configuration and
|
||||
options to a backend the configuration (which is typically just a dictionary)
|
||||
can also be a uri string that identifies the entrypoint name and any
|
||||
can also be a URI string that identifies the entrypoint name and any
|
||||
configuration specific to that backend.
|
||||
|
||||
For example, given the following configuration uri:
|
||||
For example, given the following configuration URI::
|
||||
|
||||
mysql://<not-used>/?a=b&c=d
|
||||
mysql://<not-used>/?a=b&c=d
|
||||
|
||||
This will look for the entrypoint named 'mysql' and will provide
|
||||
a configuration object composed of the uris parameters, in this case that
|
||||
is {'a': 'b', 'c': 'd'} to the constructor of that persistence backend
|
||||
a configuration object composed of the URI's components, in this case that
|
||||
is ``{'a': 'b', 'c': 'd'}`` to the constructor of that persistence backend
|
||||
instance.
|
||||
"""
|
||||
if isinstance(conf, six.string_types):
|
||||
conf = {'connection': conf}
|
||||
backend_name = conf['connection']
|
||||
try:
|
||||
uri = misc.parse_uri(backend_name)
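A hedged usage sketch of fetch(): a plain URI string is accepted and its scheme selects the entrypoint; the in-memory backend is used here so the example has no external dependencies.

    import contextlib

    from taskflow.persistence import backends

    backend = backends.fetch("memory://")
    with contextlib.closing(backend.get_connection()) as conn:
        conn.upgrade()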
|
||||
|
||||
@@ -15,33 +15,39 @@
|
||||
# License for the specific language governing permissions and limitations
|
||||
# under the License.
|
||||
|
||||
import contextlib
|
||||
import errno
|
||||
import os
|
||||
import shutil
|
||||
|
||||
import cachetools
|
||||
import fasteners
|
||||
from oslo_serialization import jsonutils
|
||||
import six
|
||||
|
||||
from taskflow import exceptions as exc
|
||||
from taskflow import logging
|
||||
from taskflow.persistence import base
|
||||
from taskflow.persistence import logbook
|
||||
from taskflow.utils import lock_utils
|
||||
from taskflow.persistence import path_based
|
||||
from taskflow.utils import misc
|
||||
|
||||
LOG = logging.getLogger(__name__)
|
||||
|
||||
@contextlib.contextmanager
|
||||
def _storagefailure_wrapper():
|
||||
try:
|
||||
yield
|
||||
except exc.TaskFlowException:
|
||||
raise
|
||||
except Exception as e:
|
||||
if isinstance(e, (IOError, OSError)) and e.errno == errno.ENOENT:
|
||||
exc.raise_with_cause(exc.NotFound,
|
||||
'Item not found: %s' % e.filename,
|
||||
cause=e)
|
||||
else:
|
||||
exc.raise_with_cause(exc.StorageFailure,
|
||||
"Storage backend internal error", cause=e)
|
||||
|
||||
|
||||
class DirBackend(base.Backend):
|
||||
class DirBackend(path_based.PathBasedBackend):
|
||||
"""A directory and file based backend.
|
||||
|
||||
This backend writes logbooks, flow details, and atom details to a provided
|
||||
base path on the local filesystem. It will create and store those objects
|
||||
in three key directories (one for logbooks, one for flow details and one
|
||||
for atom details). It creates those associated directories and then
|
||||
creates files inside those directories that represent the contents of those
|
||||
objects for later reading and writing.
|
||||
|
||||
This backend does *not* provide true transactional semantics. It does
|
||||
guarantee that there will be no interprocess race conditions when
|
||||
writing and reading by using a consistent hierarchy of file based locks.
|
||||
@@ -49,22 +55,33 @@ class DirBackend(base.Backend):
|
||||
Example configuration::
|
||||
|
||||
conf = {
|
||||
"path": "/tmp/taskflow",
|
||||
"path": "/tmp/taskflow", # save data to this root directory
|
||||
"max_cache_size": 1024, # keep up-to 1024 entries in memory
|
||||
}
|
||||
"""
|
||||
|
||||
DEFAULT_FILE_ENCODING = 'utf-8'
|
||||
"""
|
||||
Default encoding used when encoding text/unicode file contents into
|
||||
binary (and when decoding that binary back into text/unicode).
|
||||
"""
|
||||
|
||||
def __init__(self, conf):
|
||||
super(DirBackend, self).__init__(conf)
|
||||
self._path = os.path.abspath(conf['path'])
|
||||
self._lock_path = os.path.join(self._path, 'locks')
|
||||
self._file_cache = {}
|
||||
|
||||
@property
|
||||
def lock_path(self):
|
||||
return self._lock_path
|
||||
|
||||
@property
|
||||
def base_path(self):
|
||||
return self._path
|
||||
max_cache_size = self._conf.get('max_cache_size')
|
||||
if max_cache_size is not None:
|
||||
max_cache_size = int(max_cache_size)
|
||||
if max_cache_size < 1:
|
||||
raise ValueError("Maximum cache size must be greater than"
|
||||
" or equal to one")
|
||||
self.file_cache = cachetools.LRUCache(max_cache_size)
|
||||
else:
|
||||
self.file_cache = {}
|
||||
self.encoding = self._conf.get('encoding', self.DEFAULT_FILE_ENCODING)
|
||||
if not self._path:
|
||||
raise ValueError("Empty path is disallowed")
|
||||
self._path = os.path.abspath(self._path)
|
||||
self.lock = fasteners.ReaderWriterLock()
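A hedged configuration sketch for this backend (the path is illustrative): bound the in-memory file cache, pick an explicit encoding, and let upgrade() create the required directory layout.

    from taskflow.persistence.backends import impl_dir

    backend = impl_dir.DirBackend({
        'path': '/tmp/taskflow-demo',   # root directory for stored data
        'max_cache_size': 1024,         # keep up to 1024 cached entries
        'encoding': 'utf-8',            # on-disk text encoding (default)
    })
    backend.get_connection().upgrade()  # creates books/flows/atoms dirs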
|
||||
|
||||
def get_connection(self):
|
||||
return Connection(self)
|
||||
@@ -73,333 +90,77 @@ class DirBackend(base.Backend):
|
||||
pass
|
||||
|
||||
|
||||
class Connection(base.Connection):
|
||||
def __init__(self, backend):
|
||||
self._backend = backend
|
||||
self._file_cache = self._backend._file_cache
|
||||
self._flow_path = os.path.join(self._backend.base_path, 'flows')
|
||||
self._atom_path = os.path.join(self._backend.base_path, 'atoms')
|
||||
self._book_path = os.path.join(self._backend.base_path, 'books')
|
||||
|
||||
def validate(self):
|
||||
# Verify key paths exist.
|
||||
paths = [
|
||||
self._backend.base_path,
|
||||
self._backend.lock_path,
|
||||
self._flow_path,
|
||||
self._atom_path,
|
||||
self._book_path,
|
||||
]
|
||||
for p in paths:
|
||||
if not os.path.isdir(p):
|
||||
raise RuntimeError("Missing required directory: %s" % (p))
|
||||
|
||||
class Connection(path_based.PathBasedConnection):
|
||||
def _read_from(self, filename):
|
||||
# This is very similar to the oslo-incubator fileutils module, but
|
||||
# tweaked to not depend on a global cache, as well as tweaked to not
|
||||
# pull-in the oslo logging module (which is a huge pile of code).
|
||||
mtime = os.path.getmtime(filename)
|
||||
cache_info = self._file_cache.setdefault(filename, {})
|
||||
cache_info = self.backend.file_cache.setdefault(filename, {})
|
||||
if not cache_info or mtime > cache_info.get('mtime', 0):
|
||||
with open(filename, 'rb') as fp:
|
||||
cache_info['data'] = fp.read().decode('utf-8')
|
||||
cache_info['data'] = misc.binary_decode(
|
||||
fp.read(), encoding=self.backend.encoding)
|
||||
cache_info['mtime'] = mtime
|
||||
return cache_info['data']
|
||||
|
||||
def _write_to(self, filename, contents):
|
||||
if isinstance(contents, six.text_type):
|
||||
contents = contents.encode('utf-8')
|
||||
contents = misc.binary_encode(contents,
|
||||
encoding=self.backend.encoding)
|
||||
with open(filename, 'wb') as fp:
|
||||
fp.write(contents)
|
||||
self._file_cache.pop(filename, None)
|
||||
self.backend.file_cache.pop(filename, None)
|
||||
|
||||
def _run_with_process_lock(self, lock_name, functor, *args, **kwargs):
|
||||
lock_path = os.path.join(self.backend.lock_path, lock_name)
|
||||
with lock_utils.InterProcessLock(lock_path):
|
||||
@contextlib.contextmanager
|
||||
def _path_lock(self, path):
|
||||
lockfile = self._join_path(path, 'lock')
|
||||
with fasteners.InterProcessLock(lockfile) as lock:
|
||||
with _storagefailure_wrapper():
|
||||
yield lock
|
||||
|
||||
def _join_path(self, *parts):
|
||||
return os.path.join(*parts)
|
||||
|
||||
def _get_item(self, path):
|
||||
with self._path_lock(path):
|
||||
item_path = self._join_path(path, 'metadata')
|
||||
return misc.decode_json(self._read_from(item_path))
|
||||
|
||||
def _set_item(self, path, value, transaction):
|
||||
with self._path_lock(path):
|
||||
item_path = self._join_path(path, 'metadata')
|
||||
self._write_to(item_path, jsonutils.dumps(value))
|
||||
|
||||
def _del_tree(self, path, transaction):
|
||||
with self._path_lock(path):
|
||||
shutil.rmtree(path)
|
||||
|
||||
def _get_children(self, path):
|
||||
with _storagefailure_wrapper():
|
||||
return [link for link in os.listdir(path)
|
||||
if os.path.islink(self._join_path(path, link))]
|
||||
|
||||
def _ensure_path(self, path):
|
||||
with _storagefailure_wrapper():
|
||||
misc.ensure_tree(path)
|
||||
|
||||
def _create_link(self, src_path, dest_path, transaction):
|
||||
with _storagefailure_wrapper():
|
||||
try:
|
||||
return functor(*args, **kwargs)
|
||||
except exc.TaskFlowException:
|
||||
raise
|
||||
except Exception as e:
|
||||
LOG.exception("Failed running locking file based session")
|
||||
# NOTE(harlowja): trap all other errors as storage errors.
|
||||
raise exc.StorageFailure("Storage backend internal error", e)
|
||||
|
||||
def _get_logbooks(self):
|
||||
lb_uuids = []
|
||||
try:
|
||||
lb_uuids = [d for d in os.listdir(self._book_path)
|
||||
if os.path.isdir(os.path.join(self._book_path, d))]
|
||||
except EnvironmentError as e:
|
||||
if e.errno != errno.ENOENT:
|
||||
raise
|
||||
for lb_uuid in lb_uuids:
|
||||
try:
|
||||
yield self._get_logbook(lb_uuid)
|
||||
except exc.NotFound:
|
||||
pass
|
||||
|
||||
def get_logbooks(self):
|
||||
try:
|
||||
books = list(self._get_logbooks())
|
||||
except EnvironmentError as e:
|
||||
raise exc.StorageFailure("Unable to fetch logbooks", e)
|
||||
else:
|
||||
for b in books:
|
||||
yield b
|
||||
|
||||
@property
|
||||
def backend(self):
|
||||
return self._backend
|
||||
|
||||
def close(self):
|
||||
pass
|
||||
|
||||
def _save_atom_details(self, atom_detail, ignore_missing):
|
||||
# See if we have an existing atom detail to merge with.
|
||||
e_ad = None
|
||||
try:
|
||||
e_ad = self._get_atom_details(atom_detail.uuid, lock=False)
|
||||
except EnvironmentError:
|
||||
if not ignore_missing:
|
||||
raise exc.NotFound("No atom details found with id: %s"
|
||||
% atom_detail.uuid)
|
||||
if e_ad is not None:
|
||||
atom_detail = e_ad.merge(atom_detail)
|
||||
ad_path = os.path.join(self._atom_path, atom_detail.uuid)
|
||||
ad_data = base._format_atom(atom_detail)
|
||||
self._write_to(ad_path, jsonutils.dumps(ad_data))
|
||||
return atom_detail
|
||||
|
||||
def update_atom_details(self, atom_detail):
|
||||
return self._run_with_process_lock("atom",
|
||||
self._save_atom_details,
|
||||
atom_detail,
|
||||
ignore_missing=False)
|
||||
|
||||
def _get_atom_details(self, uuid, lock=True):
|
||||
|
||||
def _get():
|
||||
ad_path = os.path.join(self._atom_path, uuid)
|
||||
ad_data = misc.decode_json(self._read_from(ad_path))
|
||||
ad_cls = logbook.atom_detail_class(ad_data['type'])
|
||||
return ad_cls.from_dict(ad_data['atom'])
|
||||
|
||||
if lock:
|
||||
return self._run_with_process_lock('atom', _get)
|
||||
else:
|
||||
return _get()
|
||||
|
||||
def _get_flow_details(self, uuid, lock=True):
|
||||
|
||||
def _get():
|
||||
fd_path = os.path.join(self._flow_path, uuid)
|
||||
meta_path = os.path.join(fd_path, 'metadata')
|
||||
meta = misc.decode_json(self._read_from(meta_path))
|
||||
fd = logbook.FlowDetail.from_dict(meta)
|
||||
ad_to_load = []
|
||||
ad_path = os.path.join(fd_path, 'atoms')
|
||||
try:
|
||||
ad_to_load = [f for f in os.listdir(ad_path)
|
||||
if os.path.islink(os.path.join(ad_path, f))]
|
||||
except EnvironmentError as e:
|
||||
if e.errno != errno.ENOENT:
|
||||
raise
|
||||
for ad_uuid in ad_to_load:
|
||||
fd.add(self._get_atom_details(ad_uuid))
|
||||
return fd
|
||||
|
||||
if lock:
|
||||
return self._run_with_process_lock('flow', _get)
|
||||
else:
|
||||
return _get()
|
||||
|
||||
def _save_atoms_and_link(self, atom_details, local_atom_path):
|
||||
for atom_detail in atom_details:
|
||||
self._save_atom_details(atom_detail, ignore_missing=True)
|
||||
src_ad_path = os.path.join(self._atom_path, atom_detail.uuid)
|
||||
target_ad_path = os.path.join(local_atom_path, atom_detail.uuid)
|
||||
try:
|
||||
os.symlink(src_ad_path, target_ad_path)
|
||||
except EnvironmentError as e:
|
||||
os.symlink(src_path, dest_path)
|
||||
except OSError as e:
|
||||
if e.errno != errno.EEXIST:
|
||||
raise
|
||||
|
||||
def _save_flow_details(self, flow_detail, ignore_missing):
|
||||
# See if we have an existing flow detail to merge with.
|
||||
e_fd = None
|
||||
try:
|
||||
e_fd = self._get_flow_details(flow_detail.uuid, lock=False)
|
||||
except EnvironmentError:
|
||||
if not ignore_missing:
|
||||
raise exc.NotFound("No flow details found with id: %s"
|
||||
% flow_detail.uuid)
|
||||
if e_fd is not None:
|
||||
e_fd = e_fd.merge(flow_detail)
|
||||
for ad in flow_detail:
|
||||
if e_fd.find(ad.uuid) is None:
|
||||
e_fd.add(ad)
|
||||
flow_detail = e_fd
|
||||
flow_path = os.path.join(self._flow_path, flow_detail.uuid)
|
||||
misc.ensure_tree(flow_path)
|
||||
self._write_to(os.path.join(flow_path, 'metadata'),
|
||||
jsonutils.dumps(flow_detail.to_dict()))
|
||||
if len(flow_detail):
|
||||
atom_path = os.path.join(flow_path, 'atoms')
|
||||
misc.ensure_tree(atom_path)
|
||||
self._run_with_process_lock('atom',
|
||||
self._save_atoms_and_link,
|
||||
list(flow_detail), atom_path)
|
||||
return flow_detail
|
||||
@contextlib.contextmanager
|
||||
def _transaction(self):
|
||||
"""This just wraps a global write-lock."""
|
||||
lock = self.backend.lock.write_lock
|
||||
with lock():
|
||||
yield
|
||||
|
||||
def update_flow_details(self, flow_detail):
|
||||
return self._run_with_process_lock("flow",
|
||||
self._save_flow_details,
|
||||
flow_detail,
|
||||
ignore_missing=False)
|
||||
|
||||
def _save_flows_and_link(self, flow_details, local_flow_path):
|
||||
for flow_detail in flow_details:
|
||||
self._save_flow_details(flow_detail, ignore_missing=True)
|
||||
src_fd_path = os.path.join(self._flow_path, flow_detail.uuid)
|
||||
target_fd_path = os.path.join(local_flow_path, flow_detail.uuid)
|
||||
try:
|
||||
os.symlink(src_fd_path, target_fd_path)
|
||||
except EnvironmentError as e:
|
||||
if e.errno != errno.EEXIST:
|
||||
raise
|
||||
|
||||
def _save_logbook(self, book):
|
||||
# See if we have an existing logbook to merge with.
|
||||
e_lb = None
|
||||
try:
|
||||
e_lb = self._get_logbook(book.uuid)
|
||||
except exc.NotFound:
|
||||
pass
|
||||
if e_lb is not None:
|
||||
e_lb = e_lb.merge(book)
|
||||
for fd in book:
|
||||
if e_lb.find(fd.uuid) is None:
|
||||
e_lb.add(fd)
|
||||
book = e_lb
|
||||
book_path = os.path.join(self._book_path, book.uuid)
|
||||
misc.ensure_tree(book_path)
|
||||
self._write_to(os.path.join(book_path, 'metadata'),
|
||||
jsonutils.dumps(book.to_dict(marshal_time=True)))
|
||||
if len(book):
|
||||
flow_path = os.path.join(book_path, 'flows')
|
||||
misc.ensure_tree(flow_path)
|
||||
self._run_with_process_lock('flow',
|
||||
self._save_flows_and_link,
|
||||
list(book), flow_path)
|
||||
return book
|
||||
|
||||
def save_logbook(self, book):
|
||||
return self._run_with_process_lock("book",
|
||||
self._save_logbook, book)
|
||||
|
||||
def upgrade(self):
|
||||
|
||||
def _step_create():
|
||||
for path in (self._book_path, self._flow_path, self._atom_path):
|
||||
try:
|
||||
misc.ensure_tree(path)
|
||||
except EnvironmentError as e:
|
||||
raise exc.StorageFailure("Unable to create logbooks"
|
||||
" required child path %s" % path,
|
||||
e)
|
||||
|
||||
for path in (self._backend.base_path, self._backend.lock_path):
|
||||
try:
|
||||
misc.ensure_tree(path)
|
||||
except EnvironmentError as e:
|
||||
raise exc.StorageFailure("Unable to create logbooks required"
|
||||
" path %s" % path, e)
|
||||
|
||||
self._run_with_process_lock("init", _step_create)
|
||||
|
||||
def clear_all(self):
|
||||
|
||||
def _step_clear():
|
||||
for d in (self._book_path, self._flow_path, self._atom_path):
|
||||
if os.path.isdir(d):
|
||||
shutil.rmtree(d)
|
||||
|
||||
def _step_atom():
|
||||
self._run_with_process_lock("atom", _step_clear)
|
||||
|
||||
def _step_flow():
|
||||
self._run_with_process_lock("flow", _step_atom)
|
||||
|
||||
def _step_book():
|
||||
self._run_with_process_lock("book", _step_flow)
|
||||
|
||||
# Acquire all locks by going through this little hierarchy.
|
||||
self._run_with_process_lock("init", _step_book)
|
||||
|
||||
def destroy_logbook(self, book_uuid):
|
||||
|
||||
def _destroy_atoms(atom_details):
|
||||
for atom_detail in atom_details:
|
||||
atom_path = os.path.join(self._atom_path, atom_detail.uuid)
|
||||
try:
|
||||
shutil.rmtree(atom_path)
|
||||
except EnvironmentError as e:
|
||||
if e.errno != errno.ENOENT:
|
||||
raise exc.StorageFailure("Unable to remove atom"
|
||||
" directory %s" % atom_path,
|
||||
e)
|
||||
|
||||
def _destroy_flows(flow_details):
|
||||
for flow_detail in flow_details:
|
||||
flow_path = os.path.join(self._flow_path, flow_detail.uuid)
|
||||
self._run_with_process_lock("atom", _destroy_atoms,
|
||||
list(flow_detail))
|
||||
try:
|
||||
shutil.rmtree(flow_path)
|
||||
except EnvironmentError as e:
|
||||
if e.errno != errno.ENOENT:
|
||||
raise exc.StorageFailure("Unable to remove flow"
|
||||
" directory %s" % flow_path,
|
||||
e)
|
||||
|
||||
def _destroy_book():
|
||||
book = self._get_logbook(book_uuid)
|
||||
book_path = os.path.join(self._book_path, book.uuid)
|
||||
self._run_with_process_lock("flow", _destroy_flows, list(book))
|
||||
try:
|
||||
shutil.rmtree(book_path)
|
||||
except EnvironmentError as e:
|
||||
if e.errno != errno.ENOENT:
|
||||
raise exc.StorageFailure("Unable to remove book"
|
||||
" directory %s" % book_path, e)
|
||||
|
||||
# Acquire all locks by going through this little hierarchy.
|
||||
self._run_with_process_lock("book", _destroy_book)
|
||||
|
||||
def _get_logbook(self, book_uuid):
|
||||
book_path = os.path.join(self._book_path, book_uuid)
|
||||
meta_path = os.path.join(book_path, 'metadata')
|
||||
try:
|
||||
meta = misc.decode_json(self._read_from(meta_path))
|
||||
except EnvironmentError as e:
|
||||
if e.errno == errno.ENOENT:
|
||||
raise exc.NotFound("No logbook found with id: %s" % book_uuid)
|
||||
else:
|
||||
raise
|
||||
lb = logbook.LogBook.from_dict(meta, unmarshal_time=True)
|
||||
fd_path = os.path.join(book_path, 'flows')
|
||||
fd_uuids = []
|
||||
try:
|
||||
fd_uuids = [f for f in os.listdir(fd_path)
|
||||
if os.path.islink(os.path.join(fd_path, f))]
|
||||
except EnvironmentError as e:
|
||||
if e.errno != errno.ENOENT:
|
||||
raise
|
||||
for fd_uuid in fd_uuids:
|
||||
lb.add(self._get_flow_details(fd_uuid))
|
||||
return lb
|
||||
|
||||
def get_logbook(self, book_uuid):
|
||||
return self._run_with_process_lock("book",
|
||||
self._get_logbook, book_uuid)
|
||||
def validate(self):
|
||||
with _storagefailure_wrapper():
|
||||
for p in (self.flow_path, self.atom_path, self.book_path):
|
||||
if not os.path.isdir(p):
|
||||
raise RuntimeError("Missing required directory: %s" % (p))
|
||||
|
||||