deb-python-taskflow/doc/source/engines.rst
Joshua Harlow 964a37df9a Documentation tune-ups
Change-Id: Iac0ddd4948ab364e1905327978bf449b52294388
2014-04-26 19:54:17 -07:00

Engines

Overview

Engines are what actually run your atoms.

An engine takes a flow structure (described by patterns) and uses it to decide which atoms to run and when.

TaskFlow provides different implementations of engines. Some may be easier to use (i.e., require no additional infrastructure setup) and understand; others might require more complicated setup but provide better scalability. The intent is that deployers or developers of a service that uses TaskFlow can select the engine that best suits their setup without modifying the code of that service.

Engines usually differ in capabilities and configuration, but all of them must implement the same interface and preserve the semantics of patterns (e.g. the parts of a linear flow, taskflow.patterns.linear_flow.Flow, are run one after another, in order, even if the engine is capable of running tasks in parallel).
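To make the ordering guarantee concrete, here is a minimal standard-library sketch. It is not TaskFlow code: run_linear and make_step are hypothetical names used only for illustration. It shows how a runner that has a thread pool available can still execute the steps of a linear sequence strictly one after another, by waiting for each step to finish before submitting the next.

```python
from concurrent.futures import ThreadPoolExecutor


def run_linear(steps, executor):
    """Run callables strictly in order, even though a pool is available.

    Hypothetical helper for illustration; not part of TaskFlow.
    """
    results = []
    for step in steps:
        # Blocking on result() before submitting the next step preserves
        # the "one after another, in order" semantics of a linear flow.
        results.append(executor.submit(step).result())
    return results


order = []  # records the order in which steps actually ran


def make_step(name):
    def step():
        order.append(name)
        return name
    return step


with ThreadPoolExecutor(max_workers=4) as pool:
    finished = run_linear(
        [make_step("a"), make_step("b"), make_step("c")], pool)
```

A real engine is free to use its parallelism for patterns that allow it; the point of the sketch is only that the pattern, not the engine's capability, dictates the ordering.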

Creating

All engines are ordinary classes that implement the same interface, so you can import them and create instances just like any other Python class. The easier (and recommended) way to create an engine, however, is to use the engine helper functions. All of these functions are imported into the taskflow.engines module namespace, so typical usage looks like:

from taskflow import engines

...
flow = make_flow()
engine = engines.load(flow, engine_conf=my_conf, backend=my_persistence_conf)
engine.run()

taskflow.engines.helpers

Usage

To select which engine to use and to pass parameters to it, use the engine_conf parameter that every helper factory function accepts. It may be:

  • a string naming the engine type;
  • a dictionary holding the engine type under the key 'engine', plus any type-specific engine parameters.
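The two accepted forms can be reduced to a single dictionary shape. The sketch below is an assumption about how such normalization could look, not TaskFlow's actual implementation; normalize_engine_conf is a hypothetical name, and falling back to 'serial' when nothing is given simply mirrors the statement that the serial engine is the default.

```python
def normalize_engine_conf(engine_conf, default="serial"):
    """Return a dict with an 'engine' key for either accepted form.

    Hypothetical helper for illustration; not part of TaskFlow.
    """
    if engine_conf is None:
        return {"engine": default}
    if isinstance(engine_conf, str):
        # String form: the string itself names the engine type.
        return {"engine": engine_conf}
    # Dictionary form: copy it so the caller's dict is not mutated.
    conf = dict(engine_conf)
    conf.setdefault("engine", default)
    return conf


from_string = normalize_engine_conf("parallel")
from_dict = normalize_engine_conf({"engine": "parallel", "executor": None})
from_nothing = normalize_engine_conf(None)
```

Both from_string and from_dict select the same engine type; only the dictionary form leaves room for type-specific parameters such as executor.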

Single-Threaded

Engine type: 'serial'

Runs all tasks on a single thread: the same thread that engine.run() is called on. This engine is used by default.
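As a rough illustration of what "the same thread" means, the following standard-library sketch (run_serially is a hypothetical name, not a TaskFlow function) runs a few callables in turn and records which thread each one executed on.

```python
import threading


def run_serially(tasks):
    """Run each callable in turn on the calling thread.

    Hypothetical helper for illustration; not part of TaskFlow.
    """
    return [task() for task in tasks]


caller_thread = threading.get_ident()
# Each "task" simply reports the identifier of the thread it ran on.
task_threads = run_serially([threading.get_ident] * 3)
same_thread = all(ident == caller_thread for ident in task_threads)
```

Because no other thread is involved, a long-running task blocks the caller until it completes.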

Tip

If eventlet is used, this engine will not block other threads from running, because eventlet automatically creates a coroutine system (using greenthreads and monkey patching). See eventlet and greenlet for more details.

Parallel

Engine type: 'parallel'

The parallel engine schedules tasks onto different threads so that they can run in parallel.

Additional configuration parameters:

  • executor: an object that provides a concurrent.futures.Executor-like interface; it will be used for scheduling tasks. You can use instances of concurrent.futures.ThreadPoolExecutor or taskflow.utils.eventlet_utils.GreenExecutor (which internally uses eventlet and greenthread pools).

Tip

Sharing an executor between engine instances provides better scalability by reducing thread creation and teardown as well as by reusing existing pools (which is a good practice in general).
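A minimal sketch of the sharing idea follows, using only the standard library so it runs without TaskFlow installed. The variable names are illustrative, and the dictionaries follow the engine_conf dictionary form described under Usage; with TaskFlow available, each one would be passed as engine_conf to engines.load.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# One pool, created once and reused by every configuration below.
shared_executor = ThreadPoolExecutor(max_workers=4)

# Dictionary form of engine_conf: engine type plus the type-specific
# 'executor' parameter. Both configurations point at the same pool.
conf_for_flow_a = {"engine": "parallel", "executor": shared_executor}
conf_for_flow_b = {"engine": "parallel", "executor": shared_executor}

# Stand-in for work scheduled by two engines: 20 jobs go through the
# shared pool, yet at most max_workers threads are ever created.
futures = [
    shared_executor.submit(lambda: threading.current_thread().name)
    for _ in range(20)
]
thread_names = {future.result() for future in futures}

# The owner of the pool, not the individual engines, shuts it down.
shared_executor.shutdown(wait=True)
```

Creating the pool outside the engines also makes its lifetime explicit: it should be shut down only after every engine that uses it has finished.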

Note

Running tasks with concurrent.futures.ProcessPoolExecutor is not currently supported.

Worker-Based

Engine type: 'worker-based'

This engine schedules tasks to workers: separate processes dedicated to executing certain tasks, possibly running on other machines, connected via AMQP (or other supported kombu transports). See the wiki page for more details on how the worker-based engine operates.

Note

This engine is under active development and is experimental. It is usable and does work, but it is missing some features that would make it more production ready (please check the blueprint page for known issues and plans).

Interfaces

taskflow.engines.base

Hierarchy

  • taskflow.engines.base
  • taskflow.engines.action_engine.engine
  • taskflow.engines.worker_based.engine