Merge "Add conductor/node locality spec"
This commit is contained in:
commit
c6d0a57a04
257
specs/approved/conductor-node-locality-affinity.rst
Normal file
257
specs/approved/conductor-node-locality-affinity.rst
Normal file
@ -0,0 +1,257 @@
|
||||
..
|
||||
This work is licensed under a Creative Commons Attribution 3.0 Unported
|
||||
License.
|
||||
|
||||
http://creativecommons.org/licenses/by/3.0/legalcode
|
||||
|
||||
================================
|
||||
Conductor/node grouping affinity
|
||||
================================
|
||||
|
||||
https://storyboard.openstack.org/#!/story/2001795
|
||||
|
||||
This spec proposes adding a ``conductor_group`` property to nodes, which can be
|
||||
matched to one or more conductors configured with a matching
|
||||
``conductor_group`` configuration option, to restrict control of those nodes to
|
||||
these conductors.
|
||||
|
||||
|
||||
Problem description
|
||||
===================
|
||||
|
||||
Today, there is no way to control the conductor-to-node mapping. This is
|
||||
desirable for a number of reasons:
|
||||
|
||||
* An operator may have an Ironic deployment that spans multiple sites. Without
|
||||
control of this mapping, images may be pulled over WAN links. This causes
|
||||
slower deployments and may be less secure.
|
||||
|
||||
* Similarly, an operator may want to map nodes to conductors that are
|
||||
physically closer to the nodes in the same site, to reduce the number of
|
||||
network hops between the node and the conductor. A prime example of this
|
||||
would be to place a conductor in each rack, reducing the path to only go
|
||||
through the top-of-rack switch.
|
||||
|
||||
* A deployer may have multiple networks for out-of-band control, that must be
|
||||
completely isolated. This feature would allow isolating a conductor to a
|
||||
single out-of-band network.
|
||||
|
||||
* A deployer may have multiple physical networks that not all conductors are
|
||||
connected to. By configuring the mapping correctly, conductors can manage
|
||||
only the nodes which they can communicate with. This is described further
|
||||
in another RFE.[0]
|
||||
|
||||
|
||||
Proposed change
|
||||
===============
|
||||
|
||||
We propose adding a ``conductor_group`` configuration option for the conductor,
|
||||
which is a single arbitrary string specifying some grouping characteristic of
|
||||
the conductor.
|
||||
|
||||
We also propose a ``conductor_group`` field in the node record, which will be
|
||||
used to map a node to a conductor. This matching will be done case-insensitive,
|
||||
to make things a bit easier for operators.
|
||||
|
||||
A blank ``conductor_group`` field or config is the default. A conductor without
|
||||
a group can only manage nodes without a group, and a node without a group can
|
||||
only be managed by a conductor without a group.
|
||||
|
||||
The hash ring will need to be modified to take grouping into account, as
|
||||
described below in the `RPC API Impact`_ section.
|
||||
|
||||
Alternatives
|
||||
------------
|
||||
|
||||
Another RFE[1] proposes a complex system of hard and soft affinity, affinity
|
||||
and anti-affinity, and scoring of placement to a conductor with multiple tags.
|
||||
This is quite complex, and I don't believe we'll get it done in the short term.
|
||||
Completing this more basic work doesn't block this more complex work, and so
|
||||
we should take it one step at a time.
|
||||
|
||||
Data model impact
|
||||
-----------------
|
||||
|
||||
A ``conductor_group`` field will be added to the nodes table, as a
|
||||
``VARCHAR(255)``. This will have a default of ``""``, or the empty string.
|
||||
This string will be used in the hash ring calculation, so there's no sense in
|
||||
defaulting to ``NULL``.
|
||||
|
||||
A ``conductor_group`` field will also be added to the conductors table, also as
|
||||
a ``VARCHAR(255)``. This will also have a default of ``""``, or the empty
|
||||
string. This will be used to build the hash ring to look up where nodes should
|
||||
land.
|
||||
|
||||
State Machine Impact
|
||||
--------------------
|
||||
|
||||
None.
|
||||
|
||||
REST API impact
|
||||
---------------
|
||||
|
||||
The ``conductor_group`` field of the node will be added to the node object in
|
||||
the REST API, with a microversion as usual. It will be allowed in POST and
|
||||
PATCH requests. As with the database, it will be restricted to 255 characters.
|
||||
There must be a conductor in that group available, as the conductor services
|
||||
node creation and updates, and is selected via the hash ring.
|
||||
|
||||
It's worth noting that we would like to expose the grouping of conductors
|
||||
via the REST API eventually. However, the best way to do this isn't
|
||||
immediately clear, so we leave it outside the scope of this spec for now.
|
||||
Another RFE[3] proposes a service management API that may be a good fit.
|
||||
|
||||
Client (CLI) impact
|
||||
-------------------
|
||||
|
||||
"ironic" CLI
|
||||
~~~~~~~~~~~~
|
||||
None, it's deprecated.
|
||||
|
||||
"openstack baremetal" CLI
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
The ``conductor_group`` field for a node will be exposed in the client output,
|
||||
and added to the ``node create`` and ``node set`` commands.
|
||||
|
||||
RPC API impact
|
||||
--------------
|
||||
|
||||
This will affect which conductor is the destination for RPC calls corresponding
|
||||
to a given node, however won't have a direct effect on the RPC API itself.
|
||||
|
||||
The hash ring will change such that the internal keys for the hash ring will
|
||||
now be of the structure ``"$conductor_group:$drivername"``. A colon (``:``) is
|
||||
used as the separator between the two, to eliminate conflicts between
|
||||
conductor groups and drivers or hardware types. For example, an ``agent_ilo``
|
||||
key with no separator could mean a node with no group and the ``agent_ilo``
|
||||
driver, or it could mean a node with group ``agent_`` using the ``ilo``
|
||||
hardware type. To handle upgrades, hash ring keys will be built without
|
||||
the conductor group while the service is pinned to a version before this
|
||||
feature, and built with the conductor group when the service is unpinned or
|
||||
pinned to a version after this feature is implemented.
|
||||
|
||||
We handle upgrades by ignoring grouping for services which have a pin in the
|
||||
RPC version that is less than the release with this feature. Once everything
|
||||
is upgraded and unpinned, we begin using the grouping tags configured.
|
||||
|
||||
Operators should leave a sufficient number of conductors available without a
|
||||
grouping tag configured to run the cluster, until nodes can be configured
|
||||
with the grouping tag. Any nodes without a grouping tag will only be
|
||||
managed by conductors without a grouping tag.
|
||||
|
||||
Driver API impact
|
||||
-----------------
|
||||
|
||||
Hash ring generation and lookup will include the grouping tag, as specified
|
||||
above in the `RPC API Impact`_ section.
|
||||
|
||||
Nova driver impact
|
||||
------------------
|
||||
|
||||
This change is transparent to Nova.
|
||||
|
||||
Ramdisk impact
|
||||
--------------
|
||||
|
||||
None.
|
||||
|
||||
Security impact
|
||||
---------------
|
||||
|
||||
No direct impact; however this provides another mechanism for securing a
|
||||
deployment by enabling logical infrastructure segregation.
|
||||
|
||||
Other end user impact
|
||||
---------------------
|
||||
|
||||
None.
|
||||
|
||||
Scalability impact
|
||||
------------------
|
||||
|
||||
None.
|
||||
|
||||
Performance Impact
|
||||
------------------
|
||||
|
||||
None.
|
||||
|
||||
Other deployer impact
|
||||
---------------------
|
||||
|
||||
Deployers that wish to use this feature will need to manage the process of
|
||||
labeling conductors and nodes to enable it, which may be a non-trivial task.
|
||||
|
||||
Developer impact
|
||||
----------------
|
||||
|
||||
None.
|
||||
|
||||
|
||||
Implementation
|
||||
==============
|
||||
|
||||
Assignee(s)
|
||||
-----------
|
||||
|
||||
Primary assignee:
|
||||
jroll
|
||||
|
||||
Other contributors:
|
||||
dtantsur
|
||||
|
||||
Work Items
|
||||
----------
|
||||
|
||||
* Add database fields.
|
||||
|
||||
* Add conductor config and populate conductor DB field.
|
||||
|
||||
* Change the hash ring calculation, and bump the RPC API so that we can pin
|
||||
during upgrades.
|
||||
|
||||
* Add fields to the node and conductor objects.
|
||||
|
||||
* Make the REST API changes.
|
||||
|
||||
* Update the client library/CLI.
|
||||
|
||||
* Document the feature.
|
||||
|
||||
|
||||
Dependencies
|
||||
============
|
||||
|
||||
None.
|
||||
|
||||
|
||||
Testing
|
||||
=======
|
||||
|
||||
Unit tests should be sufficient, as that's how we test our hash ring now.
|
||||
It's difficult to test this with Tempest without exposing conductor grouping
|
||||
via the REST API.
|
||||
|
||||
Upgrades and Backwards Compatibility
|
||||
====================================
|
||||
|
||||
This is described in the `RPC API Impact`_ section.
|
||||
|
||||
|
||||
Documentation Impact
|
||||
====================
|
||||
|
||||
This should be documented in the install guide and admin guide.
|
||||
|
||||
|
||||
References
|
||||
==========
|
||||
|
||||
[0] https://storyboard.openstack.org/#!/story/1734876
|
||||
|
||||
[1] https://storyboard.openstack.org/#!/story/1739426
|
||||
|
||||
[2] Notes from the Rocky PTG session:
|
||||
https://etherpad.openstack.org/p/ironic-rocky-ptg-location-awareness
|
||||
|
||||
[3] https://storyboard.openstack.org/#!/story/1526759
|
1
specs/not-implemented/conductor-node-locality-affinity.rst
Symbolic link
1
specs/not-implemented/conductor-node-locality-affinity.rst
Symbolic link
@ -0,0 +1 @@
|
||||
../approved/conductor-node-locality-affinity.rst
|
Loading…
Reference in New Issue
Block a user