Sometimes you'll get a string back from some action (like swift
get_object) and it will be in either YAML or JSON format. These
functions will allow you to parse those into a useful object.
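A minimal sketch of such a helper, assuming PyYAML is available and using the hypothetical name parse() (YAML is a superset of JSON, so the YAML parser serves as a fallback):

```python
import json

import yaml  # PyYAML


def parse(raw):
    """Parse a string that may be JSON or YAML into a Python object.

    Try JSON first (it is stricter), then fall back to YAML.
    """
    try:
        return json.loads(raw)
    except ValueError:
        pass
    try:
        return yaml.safe_load(raw)
    except yaml.YAMLError:
        raise ValueError("Input is neither valid JSON nor valid YAML")
```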
Change-Id: I375219f4b019319e1b3d756dca512f7f90cd097f
* Made scheduler delay configurable. It now consists of a fixed
part configured with the 'fixed_delay' property and a random
addition limited by the 'random_delay' config property.
Because of this, using loopingcall from oslo was replaced with
a regular loop in a separate thread because loopingcall
supports only fixed delays.
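The resulting delay computation can be sketched like this (fixed_delay and random_delay stand in for the config properties; the thread loop itself is heavily simplified relative to the real scheduler):

```python
import random
import threading
import time


def run_scheduler_loop(process_calls, fixed_delay=1.0, random_delay=0.0,
                       stop_event=None):
    """Run process_calls in a loop with a fixed plus random delay.

    A plain thread loop is used instead of oslo's loopingcall because
    loopingcall supports only fixed delays.
    """
    stop_event = stop_event or threading.Event()

    def _loop():
        while not stop_event.is_set():
            process_calls()
            # Fixed part plus a random addition in [0, random_delay].
            time.sleep(fixed_delay + random.uniform(0, random_delay))

    t = threading.Thread(target=_loop, daemon=True)
    t.start()
    return stop_event, t
```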
Closes-Bug: #1721733
Change-Id: I8f6a15be339e208755323afb18e4b58f886770c1
* When a subworkflow completes, it sends its result to the parent
workflow by using the scheduler (a delayed call), which operates
through the database and has a delay between iterations.
This patch optimizes this by reusing the already existing
decorator @action_queue.process to make RPC calls that convey
subworkflow results outside of a DB transaction, in a similar
way to how we schedule action runs after completion of a task.
The main reason for making this change is how Scheduler now
works in HA mode. In fact, it doesn't scale well because
every Scheduler instance keeps querying the DB for delayed calls
eligible for processing, and hence in an HA setup many Schedulers
often grab the same delayed calls and clash with each other,
causing DB deadlocks in MySQL. These deadlocks are caused simply
by the MySQL locking model (it's documented in their docs), so
we have to handle them. However, Scheduler still remains a
bottleneck in the system and it's better to reduce the load
on it as much as possible.
One more reason to make this change is that we don't solve
the problem of eliminating the possibility of losing RPC
messages (when a DB TX is committed but the RPC call is not
made yet) with Scheduler anyway. If we use Scheduler for scheduling
RPC calls, we just shift the place where the DB and MQ can get
out of sync to the Scheduler. In other words, it is a fundamental
problem of syncing two external data sources that can't be
naturally enrolled into one distributed transaction.
Based on our experience of running big workflows we concluded
that simplification of network protocols gives better results:
the fewer components we use for network communication, the
better. Eventually it increases performance, reduces the load
on the system, and also reduces the probability of having DB
and MQ out of sync.
We used to use Scheduler for running actions on executors too,
by scheduling RPC calls, but at some point we saw that it
reduced performance by 40-50% without bringing any real benefit
at this expense. On the contrary, Scheduler became an even worse
bottleneck because of this. So we decided to eliminate the
Scheduler from this chain and the system became noticeably
more performant and reliable. This patch now does the same
with delivering a subworkflow result.
I believe that when it comes to recovering from situations
where DB and MQ are out of sync, we need to come up with special
tools that assume some minimal human intervention
(although I think we can recover some things automatically).
Such a tool should just make it very obvious what's broken
and how to fix it, and make it convenient to fix (restart
a task/action, etc.).
* Processing the action queue now happens within a new greenthread
because otherwise the Mistral engine can get into a deadlock
by sending a request to itself while processing another one.
This can happen because we use blocking RPC, which is the only
option for now.
* Other small fixes
Change-Id: Ic3cf6c47bba215dc6a13944b0585cce59e4e88f9
Use the configured API host and port for the mistral URL, which
default to 0.0.0.0 and 8989.
Change-Id: I154b3dc174a9c40887729bb3f8866d5c2316cd12
Closes-Bug: 1709677
Turns on the 'confirmation' for message publishing in the
Kombu RPC client.
Fixes a race condition in the Kombu RPC client between the
reply queue being declared, the listener being started, and
the reply being sent by the server side.
Fixes the Kombu RPC server not resetting the sleep timer after
successful connection to the MQ service.
Change-Id: I0db1cb4c2de7f2c7415825b28e961076870038bf
Closes-Bug: 1718883
Currently we can only do CRUD actions on a cron trigger by name.
This patch follows the workflow implementation and re-encapsulates
the DB API so that users can manage a cron trigger by ID or name.
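The lookup pattern can be sketched like this (the helper names and the in-memory `triggers` list are illustrative assumptions, not Mistral's actual DB API):

```python
import uuid


def _is_uuid(value):
    """Return True if value parses as a UUID."""
    try:
        uuid.UUID(value)
        return True
    except (ValueError, AttributeError):
        return False


def get_cron_trigger(identifier, triggers):
    """Fetch a cron trigger by id if `identifier` looks like a UUID,
    otherwise by name. `triggers` stands in for the DB table."""
    key = "id" if _is_uuid(identifier) else "name"
    for trigger in triggers:
        if trigger[key] == identifier:
            return trigger
    return None
```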
Closes-Bug: 1684469
Change-Id: I9ff657b2604647e734b5539e9bd6a524a3a20efb
Evaluate action names dynamically, so a YAQL or Jinja expression
can be used.
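For example (a hypothetical workflow fragment; the expression yields the action name at run time):

```yaml
version: '2.0'

wf:
  input:
    - action_name: std.noop
  tasks:
    task1:
      # The action name is evaluated from an expression at run time.
      action: <% $.action_name %>
```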
Change-Id: I48761c215f0255976c330ffa34f27bb695c944a9
Implements: blueprint mistral-dynamic-actions
* It turns out that the behaviour of looping.RetryDecorator from
oslo.service is different from what was expected. See
https://bugs.launchpad.net/oslo.service/+bug/1718635 for
details. This patch replaces it with a similar decorator from
tenacity library.
Change-Id: I2d8a7f2b430e26991cc13a88ad33c1266c82d113
* Since scheduler has transactional logic we need to account for
cases when these transactions hit DB deadlocks. This is
possible simply due to the nature of MySQL locking, and it's
recommended to always design with that in mind. This patch
decomposes one big scheduler method
that processes delayed calls into smaller methods so that we
could apply @db_utils.retry_on_deadlock decorator to repeat
transactions if they fail because of a deadlock in MySQL.
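A minimal sketch of how such a decorator can work (DBDeadlockError and the retry count are assumptions; the real decorator lives in Mistral's db_utils):

```python
import functools


class DBDeadlockError(Exception):
    """Stand-in for the deadlock exception raised by the DB layer."""


def retry_on_deadlock(func, retry_count=10):
    """Re-run a transactional function if it fails with a DB deadlock.

    Each retry re-executes the whole (small) transaction, which is why
    the big scheduler method had to be decomposed first: the decorator
    must wrap a unit of work that is safe to repeat.
    """
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        for attempt in range(retry_count):
            try:
                return func(*args, **kwargs)
            except DBDeadlockError:
                if attempt == retry_count - 1:
                    raise
    return wrapper
```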
* Fixed task state update when we're assigning it to RUNNING.
In this case we can't set it to RUNNING if there are any paused
child executions.
* Fixed the test for cascaded pausing, which didn't account for
the non-atomicity of this operation.
Closes-Bug: #1715589
Change-Id: Iffa0fb540a5705c587d71d30af6ab913b26d3952
Currently our API doesn't return any project info in many resources,
such as execution, cron-trigger, workbook, etc. As an essential
property of those resources, returning project_id in resource
endpoints is really helpful for users processing those resources.
Closes-Bug: 1707573
Closes-Bug: 1694398
Change-Id: I866d20cf1f5129b6249140063aa0836f63626767
In the file doc/source/user/dsl_v2.rst, the JSON format of
the task publish result is not well formatted; this change
pretty-prints the JSON code block.
Change-Id: I783472c1d7c13cd7b6a464325cf7213abf3ac359
For sub-workflow executions this will be the ID of the initial workflow
execution. For example, given the following workflows:
workflows:
  wf1:
    tasks:
      task1:
        workflow: wf2

  wf2:
    tasks:
      task1:
        workflow: wf3

  wf3:
    tasks:
      task1:
        action: std.noop
When we start wf1, it calls wf2 which calls wf3. Currently, it is hard to
retrieve the full execution, including sub-workflow executions. This patch adds
the root_execution_id to the sub-workflows, so the executions for wf2 and wf3
have a root_execution_id that is the ID of the execution for wf1. The execution
for wf1 is the root execution, so root_execution_id is None. This basically
gives us a flat reference to the graph of workflow executions.
This change is useful as we can then expose it via the API, and it
makes it easier and more efficient to find a full execution, including
its sub-workflow executions.
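With such a flat reference, collecting a full execution tree becomes a single filter rather than a recursive walk. A sketch over hypothetical execution records carrying `id` and `root_execution_id` fields:

```python
def full_execution(executions, root_id):
    """Return the root execution plus every sub-workflow execution.

    `executions` is any iterable of records with `id` and
    `root_execution_id`; the root itself has root_execution_id None.
    """
    return [e for e in executions
            if e["id"] == root_id or e["root_execution_id"] == root_id]
```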
Implements: blueprint mistral-root-execution-id
Change-Id: I24638812caa2e48e3c071925db5e552b21e15d47
This patchset adds the __init__.py file to mistral/tests/unit/expressions
to enable those tests to run, and updates the Jinja tests to include
the created_at attribute.
Closes-Bug: 1716452
Change-Id: I5c7af1cd4a2777764a2a46adf85129e4cb90e4d8
In some cases resp.content may hold content that is not in JSON
format while resp.encoding is None. We need to handle this
situation properly and not pass None as an argument to decode().
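The guard can be sketched like this (get_json_or_text and the utf-8 fallback are assumptions for illustration, not the actual Mistral helper):

```python
import json


def get_json_or_text(content, encoding=None):
    """Decode an HTTP response body, tolerating a missing encoding.

    `content` is the raw bytes of the response; `encoding` may be
    None, so it must not be passed to bytes.decode() directly.
    """
    text = content.decode(encoding or "utf-8", errors="replace")
    try:
        return json.loads(text)
    except ValueError:
        # Not JSON; return the plain text instead of failing.
        return text
```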
Change-Id: Id87d650996f16b5ffab79d72413134a4c7fe9ca9
Closes-Bug: #1700608