Active-active L3 Gateway with Multihoming

The aim of the RFE is to add multihoming support to routers along with
automatic management of ECMP for default routes and BFD next-hop
reachability verification.

https://bugs.launchpad.net/neutron/+bug/2002687

Related-Bug: #2002687
Change-Id: I95a0d5f1b7aef985df5625cd83222799db811f2b
Dmitrii Shcherbakov 2023-01-12 22:50:52 +03:00
parent 780a7ced21
commit 8ce21d40ed
..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

=========================================
Active-active L3 Gateway with Multihoming
=========================================
https://bugs.launchpad.net/neutron/+bug/2002687
Currently Neutron routers only support one external gateway port and ECMP
default routes can only be added manually as extra static routes. Likewise,
BFD is not configurable for ECMP routes. This specification provides an
extension to the existing Neutron API for configuring multiple external
gateways with automatic addition of ECMP default routes and BFD for those
routes. It also discusses the problem of scheduling multiple gateway ports per
router on different chassis. OVN is chosen as the primary backend.
Problem Description
===================
Some network designs include multiple L3 gateways to:
* Share the load across different gateways: both in terms of different
  OVN chassis hosting different gateway ports and sharing the processing
  load and upstream gateways handling parts of the north-south flows;
* Provide independent network paths for the north-south direction (e.g. via
  different ISPs) for resiliency without relying on the same L2.
Implementing multihoming at the instance level imposes an additional burden
on the end user of a cloud and adds support requirements for the guest OS,
whereas utilizing ECMP and BFD at the router side alleviates the need for
instance-side awareness of a more complex routing setup.
Adding more than one gateway port implies extending the existing data model
which was described in the `multiple external gateways spec`_. However, it left
adding additional gateway routes out of scope leaving this to future
improvements around dynamic routing. Also the focus of neutron-dynamic-routing
has so far been around advertising routes, not accepting new ones from the
external peers - so dynamic routing support like this is a very different
subject. However, manual addition of extra routes does not utilize the default
gateway IP information available from subnets in Neutron. This could be
addressed by implementing extra conditional behavior when more than one
gateway port is added to a router.
ECMP routes can result in black-holing of traffic should the next-hop of a
route become unreachable. `BFD`_ is a standard protocol adopted by IETF
for next-hop failure detection which can be used for route eviction. OVN
supports BFD `as of v21.03.0`_ with a data model that allows enabling
BFD on a per next-hop basis by associating BFD session information with routes.
However, BFD is not modeled at the Neutron level even if a backend supports it.
From the Neutron data model perspective, ECMP for routes is already a supported
concept since `ECMP support spec`_ got implemented in Wallaby (albeit the
spec focused on the L3-agent based implementation).
As for OVN and BFD, the OVN database state needs to be populated by Neutron
based on the data from the Neutron database, therefore, data model changes to
the Neutron DB are needed to represent the BFD session parameters.
Proposed Change
===============
DB Impact
---------
Core Model
^^^^^^^^^^
Currently there are two ways in which the router to gateway port relationship
is expressed in Neutron:
* The ``gw_port_id`` foreign key in the ``routers`` table which is set to a
  UUID of the router port that has a type of ``network:router_gateway``;
* The ``routerports`` table which was `added for referential integrity`_
  purposes and stores ``router_id`` to ``port_id`` mappings along with a
  redundant ``port_type``.
In terms of the representation of multiple gateway ports in the Neutron DB the
proposal follows the `multiple external gateways spec`_:
* Keep the ``routers``, ``routerports`` and ``ports`` tables as they are now
  but start storing multiple ``network:router_gateway`` ports per router in the
  ``routerports`` table;
* For backwards compatibility, store a single gateway port id in
  ``routers.gw_port_id`` (this port will be present both in the
  ``routerports`` table and in ``routers.gw_port_id``);
* Extend the ``neutron.db.models.l3.Router`` class with a new attribute
  ``gw_ports`` that will map to all relevant ``network:router_gateway`` ports
  stored in the ``routerports`` table.
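As a rough illustration of the proposed ``gw_ports`` attribute, the sketch below uses plain Python dataclasses rather than the actual SQLAlchemy models. The ``RouterPort`` and ``Router`` classes and the ``attached_ports`` field are simplified stand-ins invented for this example, and listing the ``gw_port_id`` port first is an assumption made for readability, not something this spec mandates.

```python
from dataclasses import dataclass, field
from typing import List, Optional

GATEWAY_PORT_TYPE = "network:router_gateway"


@dataclass
class RouterPort:
    """One simplified row of the ``routerports`` table."""
    router_id: str
    port_id: str
    port_type: str


@dataclass
class Router:
    """Simplified, hypothetical stand-in for ``neutron.db.models.l3.Router``."""
    id: str
    gw_port_id: Optional[str] = None  # single port kept for compatibility
    attached_ports: List[RouterPort] = field(default_factory=list)

    @property
    def gw_ports(self) -> List[str]:
        # Collect every gateway-type port recorded in routerports.
        ids = [p.port_id for p in self.attached_ports
               if p.port_type == GATEWAY_PORT_TYPE]
        # Assumption: show the compatibility port first. sorted() is stable,
        # so the remaining ports keep their original relative order.
        return sorted(ids, key=lambda port_id: port_id != self.gw_port_id)
```

With this shape, legacy code can keep reading ``gw_port_id`` while newer code iterates over ``gw_ports``.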
BFD and ECMP Route Behavior Modeling
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Each external gateway info dictionary of a router should contain additional
information about the policy around handling of default routes derived from the
subnets associated with the gateway ports of a router. This is needed so that
Neutron as a CMS has the required state to specify to OVN whether ECMP routes
are wanted and whether BFD needs to be enabled for them instead of just
passing those options through at the time of addition of an external gateway
to a router. Therefore, this information needs to be stored along with the
router.
The following additional columns are proposed for the ``routers`` table:
``enable_default_route_ecmp`` is a router-level policy on whether ECMP default
routes should be used or not (if the L3 service plugin supports them). In the
OVN case implying `ECMP symmetric reply`_ to be enabled when adding ECMP routes
can be a reasonable default.
``enable_default_route_bfd`` is a router-level policy on whether BFD should be
used for checking whether the next-hop of a default route is reachable (if
the L3 service plugin supports BFD).
Besides the handling of default routes derived from subnets, the BFD
information needs to be associated with every route object similar to how
it is done in the OVN data model itself. The reason behind this is that BFD
can be enabled or disabled on a per next-hop basis even if the destination
of a route is the same.
The `OVN modeling of BFD`_ serves as a basis for the data model changes for the
Neutron DB. An additional ``BFD`` table will be associated with the
``routerroutes`` table storing static routes much like the association of
`logical router routes with BFD records`_ is done.
OVN allows making a BFD session for a particular static route use a different
destination IP for checking reachability rather than the next hop of the static
route itself (by storing the BFD peer IP address in the ``dst_ip`` field in
the ``BFD`` table). This is a useful semantic for separating the control plane
and data plane and can be used for static routes of Neutron routers too.
However, when it comes to default routes inferred from gateway port to subnet
associations, Neutron's behavior should be to use the next hop of the default
route as a ``dst_ip``.
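The behavior described for inferred default routes can be sketched as follows. This is only an illustration under stated assumptions: the ``Gateway`` and ``DefaultRoute`` types and the ``default_routes`` helper are hypothetical names, and the rule that only the first gateway contributes a default route when ECMP is disabled follows the ``external_gateways`` ordering described in the API section of this spec.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Gateway:
    """Hypothetical view of a gateway port and its subnet's gateway IP."""
    port_id: str
    gateway_ip: str


@dataclass
class DefaultRoute:
    destination: str
    nexthop: str
    bfd_dst_ip: Optional[str]  # None when BFD is not requested


def default_routes(gateways: List[Gateway], enable_ecmp: bool,
                   enable_bfd: bool) -> List[DefaultRoute]:
    # With ECMP disabled only the first gateway provides the default route.
    selected = gateways if enable_ecmp else gateways[:1]
    routes = []
    for gw in selected:
        version = ipaddress.ip_address(gw.gateway_ip).version
        destination = "0.0.0.0/0" if version == 4 else "::/0"
        # For inferred default routes the BFD peer is the next hop itself.
        bfd_dst_ip = gw.gateway_ip if enable_bfd else None
        routes.append(DefaultRoute(destination, gw.gateway_ip, bfd_dst_ip))
    return routes
```

The sketch does not handle mixed IPv4 and IPv6 gateways separately when ECMP is disabled; a real implementation would need to decide how to treat each address family.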
OVN models the BFD table as follows (see `ovn-nb docs`_ for more information):
* ``dst_ip`` - a BFD peer IP address;
* ``min_tx`` - an integer specifying the minimum interval (milliseconds, >=1)
  that OVN would use when transmitting the BFD control packet (minus jitter);
* ``min_rx`` - an integer specifying the minimum interval (milliseconds)
  between the received BFD control packets that OVN is capable of supporting
  (minus the jitter applied by the sender). Can be set to 0 to state that
  BFD control packet transmission from the BFD peer is not desired;
* ``detect_mult`` - an integer (>= 1) specifying the detection time multiplier.
  The negotiated transmit interval, multiplied by this value, provides the
  Detection Time for the receiving system in Asynchronous mode;
* OVN also has an ``options`` field holding a map of string-string pairs,
  which is reserved for future use. A similar field or extra columns can be
  added later to the Neutron model to account for additional extensions.
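The constraints listed above can be captured in a small validation sketch. The ``BfdParams`` class and ``detection_time_ms`` helper are hypothetical names used only for illustration; they are not part of OVN or Neutron, and the detection time calculation simply restates the multiplication described for ``detect_mult``.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class BfdParams:
    """Hypothetical container mirroring the OVN BFD columns listed above."""
    dst_ip: str
    min_tx: int       # milliseconds, must be >= 1
    min_rx: int       # milliseconds, 0 means "do not send me control packets"
    detect_mult: int  # must be >= 1

    def __post_init__(self):
        ipaddress.ip_address(self.dst_ip)  # raises ValueError if malformed
        if self.min_tx < 1:
            raise ValueError("min_tx must be >= 1")
        if self.min_rx < 0:
            raise ValueError("min_rx must be >= 0")
        if self.detect_mult < 1:
            raise ValueError("detect_mult must be >= 1")


def detection_time_ms(negotiated_tx_interval_ms: int, detect_mult: int) -> int:
    # Detection Time = negotiated transmit interval * detection multiplier.
    return negotiated_tx_interval_ms * detect_mult
```

For example, a negotiated transmit interval of 300 ms with a multiplier of 3 gives a detection time of 900 ms.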
In Neutron each static route could optionally have a BFD record associated with
it with a BFD record ID acting as a foreign key in the ``routerroutes`` table.
This spec proposes adding only a small subset of those columns to the Neutron
database, without exposing a full BFD API (a future revision might implement
the `BFD support spec`_ to provide one). The aim is to let operators request
BFD with the default settings of a backend.
The proposed columns for the ``bfd_monitor`` table in this spec are:
+-------------------+---------+-------+------+---------------------------------------+
| Attribute         | Type    | Req   | CRUD | Description                           |
+===================+=========+=======+======+=======================================+
| id                | uuid-str| No    | R    | An id of a bfd_monitor                |
+-------------------+---------+-------+------+---------------------------------------+
| name              | String  | No    | CRU  | Human readable name for the           |
|                   |         |       |      | bfd_monitor (255 characters limit).   |
|                   |         |       |      | Does not have to be unique.           |
+-------------------+---------+-------+------+---------------------------------------+
| project_id        | String  | No    | R    | The owner of the bfd_monitor.         |
+-------------------+---------+-------+------+---------------------------------------+
| dst_ip            | String  | Yes   | CR   | The destination IP address to be      |
|                   |         |       |      | monitored. In case of a singlehop bfd |
|                   |         |       |      | this is the nexthop ip of the route,  |
|                   |         |       |      | for the general case (like multihop   |
|                   |         |       |      | bfd) this is an arbitrary IP (IPv4 or |
|                   |         |       |      | IPv6) that can serve as BFD neighbor. |
+-------------------+---------+-------+------+---------------------------------------+
| status            | String  | N/A   | R    | Shows if the BFD monitor was          |
|                   |         |       |      | successfully created in the backend,  |
|                   |         |       |      | but nothing about the session status, |
|                   |         |       |      | for that the session_status API       |
|                   |         |       |      | endpoint can be used.                 |
+-------------------+---------+-------+------+---------------------------------------+
REST API Changes
----------------
Router API
^^^^^^^^^^
The API changes augment the changes from the `multiple external gateways spec`_
but also include additional changes when it comes to default route ECMP and
BFD handling. Thus, a different name for an API extension is used in
this specification: ``external-gateway-multihoming`` and API changes are listed
here in full.
This extension adds new parameters to the ``external_gateway_info`` dictionary:
::

    {
        'network_id': {
            'type:uuid': None, 'required': True},
        'external_fixed_ips': {
            'type:fixed_ips': None,
            'required': False
        },
        'enable_snat': {
            'type:boolean': None, 'required': False,
            'convert_to': converters.convert_to_boolean
        },
        'enable_default_route_ecmp': {
            'type:boolean': None, 'required': False,
            'convert_to': converters.convert_to_boolean
        },
        'enable_default_route_bfd': {
            'type:boolean': None, 'required': False,
            'convert_to': converters.convert_to_boolean
        },
    }

``enable_default_route_ecmp`` is a router-level policy on whether ECMP default
routes should be used or not (if the L3 service plugin supports them).
``enable_default_route_bfd`` is a router-level policy on whether BFD should be
used for checking whether the next-hop of a default route is reachable (if
the L3 service plugin supports BFD).
A new router attribute is added as well called ``external_gateways`` which is a
list of ``external_gateway_info`` structures.
The first element of the ``external_gateways`` list is special for
compatibility purposes as it contains the same information as the
``external_gateway_info`` does. When ``enable_default_route_ecmp`` is set to
``false`` it also defines the default gateway placed into the router's routing
table (in the multi-segment network case, the placement of a gateway port
matters for inferring the default gateway based on the subnet used on a network
segment).
The order of the rest of the list is ignored.
Duplicates in the list (that is multiple external gateways with the same
``network_id``) are allowed: in that case multiple gateway ports will be
attached to the same network (this can be used to have the active-active setup
when external gateways are available on the same network).
Updating ``external_gateway_info`` also updates the first element of
``external_gateways`` and it leaves the rest of ``external_gateways``
unchanged. Setting ``external_gateway_info`` to an empty value also
resets ``external_gateways`` to the empty list.
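The compatibility rule between ``external_gateway_info`` and ``external_gateways`` can be sketched as a pure function. The function name and the use of plain dictionaries are assumptions made for illustration only; the actual plugin code would operate on database objects.

```python
from typing import Any, Dict, List, Optional

GatewayInfo = Dict[str, Any]


def apply_external_gateway_info(
        external_gateways: List[GatewayInfo],
        new_info: Optional[GatewayInfo]) -> List[GatewayInfo]:
    """Return the external_gateways list after external_gateway_info is set."""
    if not new_info:
        # An empty value clears every external gateway of the router.
        return []
    # The first element mirrors external_gateway_info; the rest is untouched.
    return [dict(new_info)] + [dict(gw) for gw in external_gateways[1:]]
```

For instance, updating a router that has gateways on ``net-a`` and ``net-b`` with an ``external_gateway_info`` pointing at ``net-c`` yields gateways on ``net-c`` and ``net-b``.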
The ``external_gateways`` attribute cannot be set in
``POST /v2.0/routers`` or ``PUT /v2.0/routers/{router_id}`` requests,
instead it can be managed via sub-methods:
* ``PUT /v2.0/routers/{router_id}/add_external_gateways``

  Accepts a list of ``external_gateway_info`` structures. Adding an
  external gateway to a network that already has one raises an error.
* ``PUT /v2.0/routers/{router_id}/update_external_gateways``

  Accepts a list of ``external_gateway_info`` structures. The external
  gateways to be updated are identified by the ``network_id`` values found
  in the PUT request. The ``external_fixed_ips`` and ``enable_snat``
  fields can be updated. The ``network_id`` field cannot be updated.
* ``PUT /v2.0/routers/{router_id}/remove_external_gateways``

  Accepts a list of potentially partial ``external_gateway_info``
  structures. Only the ``network_id`` field from the
  ``external_gateway_info`` structure is used. The ``external_fixed_ips``
  and ``enable_snat`` keys can be present but their values are ignored.
The add/update/remove PUT sub-methods respond with the whole router
object just as ``POST/PUT/GET /v2.0/routers``.
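A request to one of these sub-methods might be assembled as in the sketch below. The spec only states that the sub-methods accept a list of ``external_gateway_info`` structures, so the ``{"router": {"external_gateways": [...]}}`` wrapper and the helper name are assumptions made for illustration, not a confirmed wire format.

```python
import json
from typing import Any, Dict, List, Tuple


def build_add_external_gateways_request(
        router_id: str,
        gateways: List[Dict[str, Any]]) -> Tuple[str, str, str]:
    """Return (method, path, JSON body) for an add_external_gateways call."""
    path = f"/v2.0/routers/{router_id}/add_external_gateways"
    # Assumed body layout: the gateway list nested under a "router" key.
    body = {"router": {"external_gateways": gateways}}
    return "PUT", path, json.dumps(body, sort_keys=True)


method, path, body = build_add_external_gateways_request(
    "ROUTER_ID",  # placeholder, not a real router UUID
    [{"network_id": "NETWORK_ID", "enable_snat": False}],
)
```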
Extra routes API: ECMP
^^^^^^^^^^^^^^^^^^^^^^
As the `ECMP support spec`_ notes, there are no API changes to make to support
ECMP routes per se: multiple routes to the same destination and different
next-hops can already be specified when adding extra routes. However, that spec
focused on the agent-based implementation - part of the work to implement this
spec is to check whether the same is true for the OVN-based L3 implementation.
One parameter that is proposed to be added here is a boolean
``ecmp_symmetric_reply`` which will act as a way to populate the relevant new
column for each static route (it is modeled at the route level in OVN as well).
Extra routes API: BFD
^^^^^^^^^^^^^^^^^^^^^
In the absence of a full BFD API, users will have an option to specify a policy
on the routers (``enable_default_route_bfd``) or routes (by supplying an
additional option to the ``add_extraroutes`` API invocation called
``enable_bfd``). Neutron will manage ``bfd_monitor`` records related to those
options internally for now using route next hops as ``dst_ip`` fields.
Out of Scope
============
* BFD authentication as it is not implemented in the OVN BFD implementation
  while it is present in the protocol RFC itself. Therefore, the data model
  should be extensible to support this in the future;
* Solving the distributed SNAT problem.
  One direction is to use conntrack state synchronization between the gateway
  ports. Other ideas involve making smarter control plane choices about where
  this conntrack state should exist instead of distributing it
  everywhere - this can be done by ensuring that processing of flows is done
  locally to the instance but there are downsides to that as well which need
  to be considered more carefully;
* Accepting ECMP routes via dynamic routing protocols. The current aim is to
  utilize the default gateway information available in Neutron subnets to
  configure default gateway ECMP routes or to use the extra routes extension.
  This specification is a building block for the future support of dynamic
  routing;
* Modeling of route metrics. While there are cases where one default route
  could be preferred over the other for the same destination, neither Neutron
  nor OVN model this concept today;
* Implementation of BFD for the non-OVN L3 implementation based on Linux
  namespaces.
Asymmetric Routing and Distributed Routers
------------------------------------------
Conntrack can be used to prevent responses generated by instances from taking
a route different from the one the request arrived on when ECMP routes are
present. OVN has support for making the reply traffic take the symmetric path.
This can be configured by utilizing the `options column`_ in the logical router
static routes table in OVN which allows configuring `ECMP symmetric reply`_ by
setting the ``ecmp_symmetric_reply`` option to ``true``.
Routes in Neutron could have an ``ecmp_symmetric_reply`` option to specify a
policy on whether to enable `ECMP symmetric reply`_ depending on whether the L3
service plugin supports it or not.
However, the `commit introducing the feature`_ in OVN notes a limitation on its
use: it can only be used on gateway routers, not distributed routers that have
a gateway port due to the dependency of the ingress pipeline logic of the
logical router on the hypervisor-local CT state.
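The mapping from a Neutron-level ``ecmp_symmetric_reply`` flag to the OVN static route ``options`` map, together with the gateway-router limitation, could look roughly like this. The helper name and the choice to raise an error (rather than silently ignore the flag) on distributed routers are assumptions for illustration; the spec does not prescribe either behavior.

```python
from typing import Dict


def ovn_static_route_options(ecmp_symmetric_reply: bool,
                             is_gateway_router: bool) -> Dict[str, str]:
    """Build the options map for an OVN logical router static route."""
    if not ecmp_symmetric_reply:
        return {}
    if not is_gateway_router:
        # Assumed policy: reject the request instead of dropping the flag.
        raise ValueError(
            "ecmp_symmetric_reply is only usable on gateway routers")
    # OVN stores option values as strings.
    return {"ecmp_symmetric_reply": "true"}
```

Rejecting the combination early makes the limitation visible to the API user instead of producing a route that does not behave symmetrically.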
.. _multiple external gateways spec: https://specs.openstack.org/openstack/neutron-specs/specs/xena/multiple-external-gateways.html
.. _BFD: https://www.rfc-editor.org/rfc/rfc5880
.. _as of v21.03.0: https://github.com/ovn-org/ovn/commit/6e0a69ad4bcdf9e4cace5c73ef48ab06065e8519
.. _ECMP support spec: https://specs.openstack.org/openstack/neutron-specs/specs/wallaby/l3-router-support-ecmp.html
.. _added for referential integrity: https://opendev.org/openstack/neutron/commit/93012915a3445a8ac8a0b30b702df30febbbb728
.. _OVN modeling of BFD: https://github.com/ovn-org/ovn/blob/v22.12.0/ovn-nb.ovsschema#L612-L636
.. _logical router routes with BFD records: https://github.com/ovn-org/ovn/blob/v22.12.0/ovn-nb.ovsschema#L449-L452
.. _ovn-nb docs: https://www.ovn.org/support/dist-docs/ovn-nb.5.txt
.. _options column: https://github.com/ovn-org/ovn/blob/v22.12.0/ovn-nb.ovsschema#L453-L455
.. _ECMP symmetric reply: https://github.com/ovn-org/ovn/blob/v22.12.0/ovn-nb.xml#L3312-L3319
.. _commit introducing the feature: https://github.com/ovn-org/ovn/commit/4fdca656857d4a5caeec35ae813888cb9e403e5e
.. _BFD support spec: https://specs.openstack.org/openstack/neutron-specs/specs/xena/bfd_support.html