WIP etcd coordination
Introduce etcd and tooz as service coordination.

Change-Id: If2c228c4d2ebaf93d79c4cbf2cc39146f8f74086
Story: 2001842
Task: 30376
parent 110ec01268
commit cd6c8744df

specs/etcd-coordination.rst (new file, 162 lines)

@@ -0,0 +1,162 @@
..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

========================================
Incorporate ETCD as service coordination
========================================

https://storyboard.openstack.org/#!/story/2001842

This spec is part of the ironic-inspector HA work. To further split the
inspector service, this spec proposes introducing etcd as the base service
for coordination between the ironic-inspector API and conductor services.

Problem description
===================

Previous work logically split the single-process ironic-inspector into two
services, both running under ``oslo.service``: ``ironic_inspector`` and
``ironic-inspector-conductor``.

Before the two services can be split into separate executables, an existing
functional test issue must be addressed. The functional tests currently use
the fake messaging driver, which only works within a single process. We can
either add RabbitMQ support to the functional test environment or introduce
another messaging mechanism such as ``json-rpc``; the first option is not
desirable.

Even once the services are split, service coordination remains a challenge.
With multiple inspector conductor services, we need a way to prevent races
between concurrent operations on the same node, and to choose which inspector
conductor a request should be delivered to.


Proposed change
===============

As etcd is already a base service for the OpenStack platform, this spec
proposes adding ``python-etcd3`` and ``tooz`` as project requirements for
service coordination. ``tooz`` encapsulates several features such as group
management and locking. In tooz, group management for etcd is only
implemented for the etcd v3 API, so ``python-etcd3`` is required.

All proposed work is implemented with tooz interfaces. Each service creates a
coordinator and keeps heartbeating. An example workflow for the
ironic-inspector API service:

#. Create a coordinator with the hostname.
#. Create the group "ironic-inspector-service-group"; skip this step if the
   group already exists.
#. Upon an API request, query the group members, randomly pick one conductor,
   generate a topic from its hostname and send the RPC request.

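The API-side steps above can be sketched with tooz as follows. This is a
minimal illustration under stated assumptions, not the proposed
implementation: the helper names, the ``BASE_TOPIC`` value and the
``<base topic>.<hostname>`` topic layout are hypothetical, and the backend URL
is supplied by the caller. tooz is imported inside the function so that the
topic selection can be exercised without a running etcd.

```python
import random

GROUP = b"ironic-inspector-service-group"  # group name taken from the spec
BASE_TOPIC = "ironic-inspector-conductor"  # hypothetical base RPC topic


def start_api_coordinator(backend_url, hostname):
    """Create and start a coordinator, then make sure the group exists."""
    from tooz import coordination  # lazy import; requires tooz

    # tooz expects the member id as bytes.
    coordinator = coordination.get_coordinator(backend_url, hostname.encode())
    coordinator.start(start_heart=True)  # heartbeat in a background thread
    try:
        coordinator.create_group(GROUP).get()
    except coordination.GroupAlreadyExist:
        pass  # another service created the group first
    return coordinator


def pick_conductor_topic(coordinator, rng=random):
    """Return the RPC topic of one randomly chosen group member."""
    members = sorted(coordinator.get_members(GROUP).get())
    if not members:
        raise RuntimeError("no conductor has joined the group")
    member = rng.choice(members)
    # Assumed topic layout: "<base topic>.<conductor hostname>".
    return "%s.%s" % (BASE_TOPIC, member.decode())
```

In this sketch the API service creates the group but never joins it, so every
member returned by ``get_members`` is assumed to be a conductor.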

The example workflow for the ironic-inspector conductor service:

#. Create a coordinator with the hostname.
#. Join the group "ironic-inspector-service-group"; create it first if the
   group does not exist.
#. Leave the group explicitly when the service shuts down.

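A possible shape for the conductor side is a context manager that joins on
entry and leaves on exit. This is only a sketch: ``conductor_membership`` is a
hypothetical name, the coordinator is assumed to have been created with
``tooz.coordination.get_coordinator`` using the hostname as member id, and the
installed tooz is assumed to provide ``join_group_create`` (join, creating the
group first when it is missing).

```python
import contextlib

GROUP = b"ironic-inspector-service-group"  # group name taken from the spec


@contextlib.contextmanager
def conductor_membership(coordinator, group=GROUP):
    """Keep the conductor in the group for the duration of the block."""
    coordinator.start(start_heart=True)  # heartbeat in a background thread
    try:
        # Joins the group, creating it first if it does not exist yet.
        coordinator.join_group_create(group)
        try:
            yield coordinator
        finally:
            # Leave explicitly on shutdown instead of waiting for a timeout.
            coordinator.leave_group(group).get()
    finally:
        coordinator.stop()
```

The conductor would run its service loop inside
``with conductor_membership(coordinator):`` so that the group is left even
when the loop exits with an exception.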

ironic-inspector currently has no distributed locking support. This spec
introduces an abstract lock layer and implements distributed locking on top
of tooz.

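One way such a lock layer could look is shown below. The class names, the
per-node lock naming and the default prefix are assumptions made for
illustration; the spec does not fix them. ``ToozLock`` only relies on
``coordinator.get_lock(name)`` returning an object with ``acquire`` and
``release``, which is how tooz locks behave.

```python
import abc
import threading


class BaseLock(abc.ABC):
    """Interface shared by every lock backend (hypothetical name)."""

    @abc.abstractmethod
    def acquire(self, blocking=True):
        """Try to take the lock and return True on success."""

    @abc.abstractmethod
    def release(self):
        """Release a previously acquired lock."""

    def __enter__(self):
        self.acquire()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.release()
        return False  # never swallow exceptions


class InternalLock(BaseLock):
    """Process-local lock, sufficient for a single-process deployment."""

    def __init__(self):
        self._lock = threading.Lock()

    def acquire(self, blocking=True):
        return self._lock.acquire(blocking)

    def release(self):
        self._lock.release()


class ToozLock(BaseLock):
    """Distributed per-node lock backed by a tooz coordinator."""

    def __init__(self, coordinator, node_uuid, prefix="ironic_inspector.lock."):
        # Assumed naming scheme: "<lock prefix><node uuid>", encoded to bytes.
        self._lock = coordinator.get_lock((prefix + node_uuid).encode())

    def acquire(self, blocking=True):
        return self._lock.acquire(blocking=blocking)

    def release(self):
        self._lock.release()
```

Callers would then write ``with ToozLock(coordinator, node_uuid):`` around an
operation on a node, and the backend could be swapped without touching them.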


Alternatives
------------

Although using the database as the coordination source, as ironic does, is
entirely workable, an implementation based on tooz would be much lighter.
tooz also supports multiple backends, which allows more deployment options.

Data model impact
-----------------

None.

HTTP API impact
---------------

None.

Client (CLI) impact
-------------------

None.

Ironic python agent impact
--------------------------

None.

Ironic impact
-------------

None.

Performance and scalability impact
----------------------------------

No noticeable performance or scalability impact is expected before the
services are actually split.

Security impact
---------------

None.

Deployer impact
---------------

TODO(kaifeng): Add configuration options to support proposed work:

- etcd: host, port, ca_cert, cert_key, cert_cert, timeout, user, password,
  ?grpc_options?
- group name
- lock prefix

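As an illustration of how the etcd options listed above might be consumed,
the sketch below folds them into a tooz backend URL. ``etcd3_backend_url`` is
a hypothetical helper and the defaults are assumptions. The ``etcd3://``
scheme selects the tooz driver built on ``python-etcd3``; which query
parameters that driver honours depends on the tooz version and should be
checked against its documentation.

```python
from urllib.parse import urlencode


def etcd3_backend_url(host="localhost", port=2379, **options):
    """Build a tooz backend URL from etcd connection options.

    Options whose value is None are treated as unset and omitted.
    """
    query = urlencode(
        sorted((key, value) for key, value in options.items()
               if value is not None)
    )
    url = "etcd3://%s:%d" % (host, port)
    return url + "?" + query if query else url
```

For example, ``etcd3_backend_url("etcd.example.com", timeout=30)`` yields
``etcd3://etcd.example.com:2379?timeout=30``, which could then be passed to
``tooz.coordination.get_coordinator``.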

Developer impact
----------------

None.

Upgrades and Backwards Compatibility
------------------------------------

After this spec is implemented, etcd v3 will be a mandatory requirement for
the inspector service to work properly.

Implementation
==============

Assignee(s)
-----------

Primary assignee:
  kaifeng, kaifeng.w@gmail.com

Work Items
----------

Implement the proposed work.


Dependencies
============

``python-etcd3`` and ``tooz`` are required libraries.
An etcd v3 service must be running in the same cloud.

Testing
=======

This will be covered by unit tests and bifrost.

References
==========

https://docs.openstack.org/tooz/latest/user/index.html