Merge "Enhance XJob reliability spec"
This commit is contained in:
commit
fbcb71ed86
234
specs/ocata/enhance-xjob-reliability.rst
Normal file
234
specs/ocata/enhance-xjob-reliability.rst
Normal file
@ -0,0 +1,234 @@
=======================================
Enhance Reliability of Asynchronous Job
=======================================

Background
==========

Currently we use the cast method in our RPC client to trigger asynchronous
jobs in the XJob daemon. After one of the worker threads receives the RPC
message from the message broker, it registers the job in the database and
starts to run the handle function. The registration guarantees that an
asynchronous job will not be lost after it fails, since the failed job can
be redone. The detailed discussion of the asynchronous job process in the
XJob daemon is covered in our design document [1]_.

Though asynchronous jobs are correctly saved once worker threads get the RPC
message, we still run the risk of losing jobs. With the cast method, it's
only guaranteed that the message is received by the message broker; there is
no guarantee that the message is received by the message consumer, i.e., the
RPC server thread running in the XJob daemon. According to the RabbitMQ
documentation, undelivered messages will be lost if the RabbitMQ server
stops [2]_. Message persistence or publisher confirms [3]_ can be used to
increase reliability, but they sacrifice performance. On the other hand, we
cannot assume that message brokers other than RabbitMQ provide similar
persistence or confirmation functionality. Therefore, Tricircle itself
should handle the asynchronous job reliability problem as far as possible.
Since we already have a framework to register, run and redo asynchronous
jobs in the XJob daemon, we propose a cheaper way to improve reliability.

Proposal
========

One straightforward way to make sure that the RPC server has received the
RPC message is to use the call method. If the RPC client sends the RPC
request with the call method, it is blocked until the RPC server replies to
the message, so if something goes wrong before the reply, the RPC client
becomes aware of it. Of course we cannot make the RPC client wait too long,
so the RPC handlers on the RPC server side need to be simple and quick to
run. Thanks to the asynchronous job framework we already have, migrating
from the cast method to the call method is easy.

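For illustration, here is a minimal sketch of the difference between the two
methods in oslo.messaging terms; the ``XJobAPI`` wrapper, the ``run_job``
method name and the ``payload`` argument are hypothetical, not the actual
Tricircle interface::

    import oslo_messaging

    class XJobAPI(object):
        def __init__(self, transport, target):
            self.client = oslo_messaging.RPCClient(transport, target)

        def trigger_job_by_cast(self, ctxt, payload):
            # cast returns as soon as the broker accepts the message; the
            # client never learns whether any XJob worker actually got it
            self.client.cast(ctxt, 'run_job', payload=payload)

        def trigger_job_by_call(self, ctxt, payload):
            # call blocks until the RPC server replies, so a lost message
            # or a dead server surfaces as an error on the client side
            return self.client.call(ctxt, 'run_job', payload=payload)
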
Here is the flow of the current process::

  +--------+      +--------+          +---------+       +---------------+    +----------+
  |        |      |        |          |         |       |               |    |          |
  |  API   |      |  RPC   |          | Message |       |  RPC Server   |    | Database |
  | Server |      | client |          | Broker  |       | Handle Worker |    |          |
  |        |      |        |          |         |       |               |    |          |
  +---+----+      +---+----+          +----+----+       +-------+-------+    +----+-----+
      |               |                    |                    |                 |
      | call RPC API  |                    |                    |                 |
      +--------------->                    |                    |                 |
      |               | send cast message  |                    |                 |
      |               +-------------------->                    |                 |
      | call return   |                    | dispatch message   |                 |
      <---------------+                    +-------------------->                 |
      |               |                    |                    | register job    |
      |               |                    |                    +----------------->
      |               |                    |                    |                 |
      |               |                    |                    | obtain lock     |
      |               |                    |                    +----------------->
      |               |                    |                    |                 |
      |               |                    |                    | run job         |
      |               |                    |                    +----+            |
      |               |                    |                    |    |            |
      |               |                    |                    |    |            |
      |               |                    |                    <----+            |
      |               |                    |                    |                 |
      |               |                    |                    |                 |
      +               +                    +                    +                 +

We can just leave the **register job** phase in the RPC handler and put the
**obtain lock** and **run job** phases in a separate thread, so the RPC
handler is simple enough to be invoked with the call method. Here is the
proposed flow::

  +--------+      +--------+           +---------+       +---------------+   +----------+  +-------------+  +-------+
  |        |      |        |           |         |       |               |   |          |  |             |  |       |
  |  API   |      |  RPC   |           | Message |       |  RPC Server   |   | Database |  | RPC Server  |  |  Job  |
  | Server |      | client |           | Broker  |       | Handle Worker |   |          |  | Loop Worker |  | Queue |
  |        |      |        |           |         |       |               |   |          |  |             |  |       |
  +---+----+      +---+----+           +----+----+       +-------+-------+   +----+-----+  +------+------+  +---+---+
      |               |                     |                    |                |               |             |
      | call RPC API  |                     |                    |                |               |             |
      +--------------->                     |                    |                |               |             |
      |               | send call message   |                    |                |               |             |
      |               +--------------------->                    |                |               |             |
      |               |                     | dispatch message   |                |               |             |
      |               |                     +-------------------->                |               |             |
      |               |                     |                    | register job   |               |             |
      |               |                     |                    +---------------->               |             |
      |               |                     |                    |                |               |             |
      |               |                     |                    | job enqueue    |               |             |
      |               |                     |                    +---------------------------------------------->
      |               |                     |                    |                |               |             |
      |               |                     | reply message      |                |               | job dequeue |
      |               |                     <--------------------+                |               +------------->
      |               | send reply message  |                    |                | obtain lock   |             |
      |               <---------------------+                    |                <---------------+             |
      | call return   |                     |                    |                |               |             |
      <---------------+                     |                    |                |               | run job     |
      |               |                     |                    |                |               +----+        |
      |               |                     |                    |                |               |    |        |
      |               |                     |                    |                |               |    |        |
      |               |                     |                    |                |               <----+        |
      |               |                     |                    |                |               |             |
      +               +                     +                    +                +               +             +

In the above graph, **Loop Worker** is a newly introduced thread that does
the actual work. **Job Queue** is an eventlet queue [4]_ used to coordinate
the **Handle Worker**, which produces job entries, and the **Loop Worker**,
which consumes them. When accessing an empty queue, the **Loop Worker** is
blocked until some job entries are put into the queue. The **Loop Worker**
retrieves job entries from the job queue and then starts to run them. As in
the original flow, since multiple workers may get the same type of job for
the same resource at the same time, a worker needs to obtain the lock before
it can run the job. One problem occurs when the XJob daemon stops before it
finishes all the jobs in the job queue: all unfinished jobs are lost. To
solve this, we change the original periodical task that redoes failed jobs,
and let it also handle jobs which have been registered for a certain time
but haven't been started. In this way both failed jobs and "orphan" new jobs
can be picked up and redone.

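A minimal sketch of this coordination, with trivial stand-ins for the real
database and job helpers (``register_job``, ``obtain_lock`` and ``run_job``
are hypothetical names)::

    import eventlet
    from eventlet import queue

    job_queue = queue.Queue()

    # hypothetical stand-ins for the real database and job helpers
    def register_job(job):
        print('registered %s' % job)

    def obtain_lock(job):
        return True

    def run_job(job):
        print('running %s' % job)

    def handle_worker(job):
        # runs in the RPC handler, so it must stay quick: register the
        # job, enqueue it and return so the call can be replied at once
        register_job(job)
        job_queue.put(job)

    def loop_worker():
        while True:
            job = job_queue.get()  # green-blocks while the queue is empty
            if obtain_lock(job):
                run_job(job)

    eventlet.spawn(loop_worker)
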
You can see that the **Handle Worker** doesn't do much work; it just
consumes RPC messages, registers jobs and then puts job items in the job
queue. So one extreme solution is to register new jobs on the API server
side and start worker threads to retrieve jobs from the database and run
them. In this way, we can remove all the RPC processes and use the database
to coordinate. The drawback of this solution is that we don't dispatch jobs:
all the workers query jobs from the database, so there is a high probability
that some of the workers obtain the same job and a race occurs. In the first
solution, the message broker helps us dispatch messages, and thus dispatch
jobs.

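A sketch of what such a worker loop might look like, reusing the
hypothetical ``obtain_lock`` and ``run_job`` helpers from the sketch above
plus a hypothetical ``list_new_jobs`` query helper::

    import time

    POLL_INTERVAL = 5  # seconds, an assumed value

    def db_polling_worker():
        while True:
            # every worker in every XJob daemon runs this same query, so
            # several workers may fetch the same job; only one of them
            # wins the lock and the others' queries are wasted
            for job in list_new_jobs():
                if obtain_lock(job):
                    run_job(job)
            time.sleep(POLL_INTERVAL)
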
Considering that job dispatch is important, we can make some changes to the
second solution and move to a third one: we register new jobs on the API
server side as well, but we still use the cast method to trigger the
asynchronous job in the XJob daemon. Since job registration is done on the
API server side, we are not afraid of losing jobs when cast messages are
lost. If the API server fails to register the job, it returns a failure
response; if the registration succeeds, the job will eventually be done by
the XJob daemon. By using RPC, we dispatch jobs with the help of message
brokers. One thing which makes the cast method better than the call method
is that retrieving RPC messages and running job handles are done in the same
thread, so if one XJob daemon is busy handling jobs, RPC messages will not
be dispatched to it. With the call method, however, RPC messages are
retrieved by one thread (the **Handle Worker**) and job handles are run by
another thread (the **Loop Worker**), so an XJob daemon may accumulate many
jobs in its queue while it is already busy handling jobs. This solution has
the same problem as the call method solution: if cast messages are lost, the
new jobs are registered in the database but no XJob daemon is aware of them.
We solve it in the same way, using the periodical task to pick up these
"orphan" jobs. Here is the flow::

  +--------+      +--------+          +---------+       +---------------+    +----------+
  |        |      |        |          |         |       |               |    |          |
  |  API   |      |  RPC   |          | Message |       |  RPC Server   |    | Database |
  | Server |      | client |          | Broker  |       | Handle Worker |    |          |
  |        |      |        |          |         |       |               |    |          |
  +---+----+      +---+----+          +----+----+       +-------+-------+    +----+-----+
      |               |                    |                    |                 |
      | call RPC API  |                    |                    |                 |
      +--------------->                    |                    |                 |
      |               | register job       |                    |                 |
      |               +----------------------------------------------------------->
      |               |                    |                    |                 |
      |               | [if registration   |                    |                 |
      |               | succeeds]          |                    |                 |
      |               | send cast message  |                    |                 |
      |               +-------------------->                    |                 |
      | call return   |                    | dispatch message   |                 |
      <---------------+                    +-------------------->                 |
      |               |                    |                    | obtain lock     |
      |               |                    |                    +----------------->
      |               |                    |                    |                 |
      |               |                    |                    | run job         |
      |               |                    |                    +----+            |
      |               |                    |                    |    |            |
      |               |                    |                    |    |            |
      |               |                    |                    <----+            |
      |               |                    |                    |                 |
      |               |                    |                    |                 |
      +               +                    +                    +                 +

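A minimal sketch of the API server side of this flow; ``db_api``,
``xjob_client``, the job type and the field names are hypothetical::

    def create_resource(ctxt, resource):
        # register the job inside the API request handling, before any
        # RPC message is sent; if this fails, the API returns a failure
        # response to the user
        job = db_api.register_job(ctxt, job_type='configure_route',
                                  resource_id=resource['id'])
        # fire-and-forget: even if this cast message is lost, the job
        # record is already in the database and the periodical task will
        # pick it up and redo it later
        xjob_client.cast(ctxt, 'run_job', job_id=job['id'])
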
Discussion
==========

In this section we discuss the pros and cons of the above three solutions.

.. list-table:: **Solution Comparison**
   :header-rows: 1

   * - Solution
     - Pros
     - Cons
   * - API server uses call
     - no RPC message lost
     - downtime of unfinished jobs in the job queue when the XJob daemon
       stops, job dispatch not based on XJob daemon workload
   * - API server registers jobs + no RPC
     - no requirement on RPC (message broker), no downtime
     - no job dispatch, job conflicts cost time
   * - API server registers jobs + uses cast
     - job dispatch based on XJob daemon workload
     - downtime for jobs lost due to lost cast messages

Downtime means that after a job is dispatched to a worker, other workers
need to wait for a certain time to determine that the job has expired before
they can take it over.

Conclusion
==========

We decide to implement the third solution (API server registers jobs + uses
cast), since it improves asynchronous job reliability and at the same time
provides better workload dispatch.

Data Model Impact
=================

None

Dependencies
============

None

Documentation Impact
====================

None

References
==========

.. [1] https://docs.google.com/document/d/1zcxwl8xMEpxVCqLTce2-dUOtB-ObmzJTbV1uSQ6qTsY
.. [2] https://www.rabbitmq.com/tutorials/tutorial-two-python.html
.. [3] https://www.rabbitmq.com/confirms.html
.. [4] http://eventlet.net/doc/modules/queue.html