Add documentation about NB DB driver

This includes the option to use the OVN cluster for routing instead of the
kernel. It also updates the supportability matrix to better reflect the
current status, and slightly reorganizes the documentation structure.

Change-Id: If8fb9a42f74511e9f70a25d7c08dce99c20c3f10
The next sections highlight the options and features supported by each driver.

BGP Driver (SB)
---------------

+-----------------+-----------------------------------------------------+------------------------------------------+------------------------------------------+--------------------------+--------------------+-----------------------+-----------+
| Exposing Method | Description                                         | Expose with                              | Wired with                               | Expose Tenants           | Expose only GUA    | OVS-DPDK/HWOL Support | Supported |
+=================+=====================================================+==========================================+==========================================+==========================+====================+=======================+===========+
| Underlay        | Expose IPs on the default underlay network.         | Adding IP to dummy NIC isolated in a VRF | Ingress: ip rules, and ip routes on the  | Yes                      | Yes                | No                    | Yes       |
|                 |                                                     |                                          | routing table associated with OVS        |                          | (expose_ipv6_gua   |                       |           |
|                 |                                                     |                                          | Egress: OVS flow to change MAC           | (expose_tenant_networks) | _tenant_networks)  |                       |           |
+-----------------+-----------------------------------------------------+------------------------------------------+------------------------------------------+--------------------------+--------------------+-----------------------+-----------+

BGP Driver (NB)
---------------

OVN version 23.09 is required to expose tenant networks and ovn-lb, because
CR-LRP port chassis information in the NB DB is only available in that
version (https://bugzilla.redhat.com/show_bug.cgi?id=2107515).

The following table lists the various methods you can use to expose the
networks/IPs, how they expose the IPs and the tenant networks, and whether
OVS-DPDK and hardware offload (HWOL) is supported.

+-----------------+-----------------------------------------------------+------------------------------------------+------------------------------------------+--------------------------+-----------------------+---------------+
| Exposing Method | Description                                         | Expose with                              | Wired with                               | Expose Tenants or GUA    | OVS-DPDK/HWOL Support | Supported     |
+=================+=====================================================+==========================================+==========================================+==========================+=======================+===============+
| Underlay        | Expose IPs on the default underlay network.         | Adding IP to dummy NIC isolated in a VRF.| Ingress: ip rules, and ip routes on the  | Yes                      | No                    | Yes           |
|                 |                                                     |                                          | routing table associated to OVS          |                          |                       |               |
|                 |                                                     |                                          | Egress: OVS-flow to change MAC           | (expose_tenant_networks) |                       |               |
+-----------------+-----------------------------------------------------+------------------------------------------+------------------------------------------+--------------------------+-----------------------+---------------+
| L2VNI           | Extends the L2 segment on a given VNI.              | No need to expose it, automatic with the | Ingress: vxlan + bridge device           | N/A                      | No                    | No            |
|                 |                                                     | FRR configuration and the wiring.        | Egress: nothing                          |                          |                       |               |
+-----------------+-----------------------------------------------------+------------------------------------------+------------------------------------------+--------------------------+-----------------------+---------------+
| VRF             | Expose IPs on a given VRF (vni id).                 | Add IPs to dummy NIC associated to the   | Ingress: vxlan + bridge device           | Yes                      | No                    | No            |
|                 |                                                     | VRF device (lo_VNI_ID).                  | Egress: flow to redirect to VRF device   | (Not implemented)        |                       |               |
+-----------------+-----------------------------------------------------+------------------------------------------+------------------------------------------+--------------------------+-----------------------+---------------+
| Dynamic         | Mix of the previous. Depending on annotations it    | Mix of the previous three.               | Ingress: mix of all the above            | Depends on the method    | No                    | No            |
|                 | exposes IPs differently and on different VNIs.      |                                          | Egress: mix of all the above             | used                     |                       |               |
+-----------------+-----------------------------------------------------+------------------------------------------+------------------------------------------+--------------------------+-----------------------+---------------+
| OVN             | Make use of an extra OVN cluster (per node) instead | Adding IP to dummy NIC isolated in a VRF | Ingress: OVN routes, OVS flow (MAC tweak)| Yes                      | Yes                   | Yes. Only for |
|                 | of kernel routing -- exposing the IPs with BGP is   | (as it only supports the underlay        | Egress: OVN routes and policies,         | (Not implemented)        |                       | ipv4 and flat |
|                 | the same as before.                                 | option).                                 | and OVS flow (MAC tweak)                 |                          |                       | provider      |
|                 |                                                     |                                          |                                          |                          |                       | networks      |
+-----------------+-----------------------------------------------------+------------------------------------------+------------------------------------------+--------------------------+-----------------------+---------------+

BGP Stretched Driver (SB)
-------------------------

+-----------------+-----------------------------------------------------+------------------------------------------+------------------------------------------+----------------+--------------------+-----------------------+-----------+
| Exposing Method | Description                                         | Expose with                              | Wired with                               | Expose Tenants | Expose only GUA    | OVS-DPDK/HWOL Support | Supported |
+=================+=====================================================+==========================================+==========================================+================+====================+=======================+===========+
| Underlay        | Expose IPs on the default underlay network.         | Adding IP routes to default VRF table.   | Ingress: ip rules, and ip routes on the  | Yes            | No                 | No                    | Yes       |
|                 |                                                     |                                          | routing table associated to OVS          |                |                    |                       |           |
|                 |                                                     |                                          | Egress: OVS-flow to change MAC           |                |                    |                       |           |
+-----------------+-----------------------------------------------------+------------------------------------------+------------------------------------------+----------------+--------------------+-----------------------+-----------+

EVPN Driver (SB)
----------------

+-----------------+-----------------------------------------------------+------------------------------------------+------------------------------------------+----------------+--------------------+-----------------------+-----------+
| Exposing Method | Description                                         | Expose with                              | Wired with                               | Expose Tenants | Expose only GUA    | OVS-DPDK/HWOL Support | Supported |
+=================+=====================================================+==========================================+==========================================+================+====================+=======================+===========+
| VRF             | Expose IPs on a given VRF (vni id) -- requires      | Add IPs to dummy NIC associated to the   | Ingress: vxlan + bridge device           | Yes            | No                 | No                    | No        |
|                 | networking-bgpvpn or manual NB DB inputs.           | VRF device (lo_VNI_ID).                  | Egress: flow to redirect to VRF device   |                |                    |                       |           |
+-----------------+-----------------------------------------------------+------------------------------------------+------------------------------------------+----------------+--------------------+-----------------------+-----------+

Agent deployment
~~~~~~~~~~~~~~~~

The BGP mode (for both NB and SB drivers) exposes the VMs and LBs in provider
networks or with FIPs, as well as VMs on tenant networks if the
``expose_tenant_networks`` or ``expose_ipv6_gua_tenant_networks`` configuration
options are enabled.

The agent needs to be deployed on all the nodes where VMs can be created, as
well as on the networker nodes (i.e., where OVN router gateway ports can be
allocated):

- For VMs and Amphora load balancers on provider networks or with FIPs,
  the IP is exposed on the node where the VM (or amphora) is deployed.
  Therefore the agent needs to be running on the compute nodes.

- For VMs on tenant networks (with the ``expose_tenant_networks`` or
  ``expose_ipv6_gua_tenant_networks`` configuration options enabled), the
  agent needs to be running on the networker nodes. In OpenStack, with OVN
  networking, the N/S traffic to the tenant VMs (without FIPs) needs to go
  through the networker nodes, more specifically the one hosting the
  chassisredirect OVN port (cr-lrp), connecting the provider network to the
  OVN virtual router. Hence, the VM IPs are advertised through BGP on that
  node, and from there the traffic follows the normal path (the geneve
  tunnel) to the OpenStack compute node where the VM is located.

- Similarly, for OVN load balancers the IPs are exposed on the networker
  node. In this case the ARP request for the VIP is replied to by the OVN
  router gateway port, therefore the traffic needs to be injected into the
  OVN overlay at that point too. Therefore the agent needs to be running on
  the networker nodes for OVN load balancers.

As an example of how to start the OVN BGP Agent on the nodes, see the commands
below:

.. code-block:: ini

   $ python setup.py install
   $ cat bgp-agent.conf
   # sample configuration that can be adapted based on needs
   [DEFAULT]
   debug=True
   reconcile_interval=120
   expose_tenant_networks=True
   # expose_ipv6_gua_tenant_networks=True
   # for SB DB driver
   driver=ovn_bgp_driver
   # for NB DB driver
   #driver=nb_ovn_bgp_driver
   bgp_AS=64999
   bgp_nic=bgp-nic
   bgp_vrf=bgp-vrf
   bgp_vrf_table_id=10
   ovsdb_connection=tcp:127.0.0.1:6640
   address_scopes=2237917c7b12489a84de4ef384a2bcae

   [ovn]
   ovn_nb_connection = tcp:172.17.0.30:6641
   ovn_sb_connection = tcp:172.17.0.30:6642

   [agent]
   root_helper=sudo ovn-bgp-agent-rootwrap /etc/ovn-bgp-agent/rootwrap.conf
   root_helper_daemon=sudo ovn-bgp-agent-rootwrap-daemon /etc/ovn-bgp-agent/rootwrap.conf

   $ sudo bgp-agent --config-dir bgp-agent.conf
   Starting BGP Agent...
   Loaded chassis 51c8480f-c573-4c1c-b96e-582f9ca21e70.
   BGP Agent Started...
   Ensuring VRF configuration for advertising routes
   Configuring br-ex default rule and routing tables for each provider network
   Found routing table for br-ex with: ['201', 'br-ex']
   Sync current routes.
   Add BGP route for logical port with ip 172.24.4.226
   Add BGP route for FIP with ip 172.24.4.199
   Add BGP route for CR-LRP Port 172.24.4.221
   ....

.. note::

   If you only want to expose the IPv6 GUA tenant IPs, remove the option
   ``expose_tenant_networks`` and add ``expose_ipv6_gua_tenant_networks=True``
   instead.

.. note::

   If you want to filter the tenant networks to be exposed by some specific
   address scopes, add the list of address scopes to the ``address_scopes``
   option. If no filtering should be applied, just remove the line.

Note that the OVN BGP Agent operates under the following assumptions:

- A dynamic routing solution, in this case FRR, is deployed and
  advertises/withdraws routes added to/deleted from certain local interfaces,
  in this case the ones associated with the VRF created to that end. As only
  VM and load balancer IPs need to be advertised, FRR needs to be configured
  with the proper filtering so that only /32 (or /128 for IPv6) IPs are
  advertised. A sample config for FRR is:

  .. code-block:: ini

     frr version 7.5
     frr defaults traditional
     hostname cmp-1-0
     log file /var/log/frr/frr.log debugging
     log timestamp precision 3
     service integrated-vtysh-config
     line vty

     router bgp 64999
     bgp router-id 172.30.1.1
     bgp log-neighbor-changes
     bgp graceful-shutdown
     no bgp default ipv4-unicast
     no bgp ebgp-requires-policy

     neighbor uplink peer-group
     neighbor uplink remote-as internal
     neighbor uplink password foobar
     neighbor enp2s0 interface peer-group uplink
     neighbor enp3s0 interface peer-group uplink

     address-family ipv4 unicast
       redistribute connected
       neighbor uplink activate
       neighbor uplink allowas-in origin
       neighbor uplink prefix-list only-host-prefixes out
     exit-address-family

     address-family ipv6 unicast
       redistribute connected
       neighbor uplink activate
       neighbor uplink allowas-in origin
       neighbor uplink prefix-list only-host-prefixes out
     exit-address-family

     ip prefix-list only-default permit 0.0.0.0/0
     ip prefix-list only-host-prefixes permit 0.0.0.0/0 ge 32

     route-map rm-only-default permit 10
       match ip address prefix-list only-default
       set src 172.30.1.1

     ip protocol bgp route-map rm-only-default

     ipv6 prefix-list only-default permit ::/0
     ipv6 prefix-list only-host-prefixes permit ::/0 ge 128

     route-map rm-only-default permit 11
       match ipv6 address prefix-list only-default
       set src f00d:f00d:f00d:f00d:f00d:f00d:f00d:0004

     ipv6 protocol bgp route-map rm-only-default

     ip nht resolve-via-default

- The relevant provider OVS bridges are created and configured with a loopback
  IP address (e.g. 1.1.1.1/32 for IPv4), and proxy ARP/NDP is enabled on their
  kernel interface.
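
The last assumption can be sketched with commands like the following
(illustrative only; the bridge name ``br-ex`` and the loopback address are
assumptions, and the exact setup depends on the deployment tooling):

```shell
# Assign a loopback-style IP to the provider bridge (br-ex is an assumption)
ip addr add 1.1.1.1/32 dev br-ex

# Enable proxy ARP (IPv4) and proxy NDP (IPv6) on the bridge kernel interface
sysctl -w net.ipv4.conf.br-ex.proxy_arp=1
sysctl -w net.ipv6.conf.br-ex.proxy_ndp=1
```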

BGP Advertisement
+++++++++++++++++

The OVN BGP Agent (both SB and NB drivers) is in charge of triggering FRR
(an IP routing protocol suite for Linux which includes protocol daemons for
BGP, OSPF, and RIP, among others) to advertise/withdraw directly connected
routes via BGP. To do that, when the agent starts, it ensures that:

- The FRR local instance is reconfigured to leak routes for a new VRF. To do
  that it uses the ``vtysh`` shell. It connects to the existing FRR socket
  (``--vty_socket`` option) and executes the following commands, passing them
  through a file (``-c FILE_NAME`` option):

  .. code-block:: ini

     router bgp {{ bgp_as }}
     address-family ipv4 unicast
       import vrf {{ vrf_name }}
     exit-address-family

     address-family ipv6 unicast
       import vrf {{ vrf_name }}
     exit-address-family

     router bgp {{ bgp_as }} vrf {{ vrf_name }}
     bgp router-id {{ bgp_router_id }}
     address-family ipv4 unicast
       redistribute connected
     exit-address-family

     address-family ipv6 unicast
       redistribute connected
     exit-address-family

- There is a VRF created (the one leaked in the previous step), by default
  named ``bgp-vrf``.

- There is a dummy interface (by default named ``bgp-nic``), associated with
  the previously created VRF device.

- ARP/NDP is enabled at the OVS provider bridges by adding an IP to them.
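
The VRF and dummy interface checks above roughly correspond to this
kernel-side setup (a sketch using the default names and table id from the
configuration sample; the agent performs the equivalent steps
programmatically):

```shell
# VRF device the routes are leaked from (default name bgp-vrf, table 10)
ip link add bgp-vrf type vrf table 10
ip link set bgp-vrf up

# Dummy interface where VM/LB IPs will be added (default name bgp-nic)
ip link add bgp-nic type dummy
ip link set bgp-nic master bgp-vrf
ip link set bgp-nic up
```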

Then, to expose the VMs/LB IPs as they are created (or upon initialization or
re-sync), since the FRR configuration has the ``redistribute connected``
option enabled, the only action needed to expose an IP (or withdraw it) is to
add it to (or remove it from) the ``bgp-nic`` dummy interface. The agent then
relies on Zebra to do the BGP advertisement, as Zebra detects the
addition/deletion of the IP on the local interface and advertises/withdraws
the route:

.. code-block:: ini

   $ ip addr add IPv4/32 dev bgp-nic
   $ ip addr add IPv6/128 dev bgp-nic

.. note::

   As we also want to be able to expose VMs connected to tenant networks
   (when the ``expose_tenant_networks`` or ``expose_ipv6_gua_tenant_networks``
   configuration options are enabled), there is a need to expose the Neutron
   router gateway port (CR-LRP on OVN) so that the traffic to VMs on tenant
   networks is injected into the OVN overlay through the node that is hosting
   that port.

..
    This work is licensed under a Creative Commons Attribution 3.0 Unported
    License.

    http://creativecommons.org/licenses/by/3.0/legalcode

    Convention for heading levels in Neutron devref:
    =======  Heading 0 (reserved for the title in a document)
    -------  Heading 1
    ~~~~~~~  Heading 2
    +++++++  Heading 3
    '''''''  Heading 4
    (Avoid deeper levels because they do not render well.)


=======================================
OVN BGP Agent: Design of the BGP Driver
=======================================

Purpose
-------

The purpose of this document is to present the design decisions behind
the BGP Driver for the Networking OVN BGP agent.

The main purpose of adding support for BGP is to be able to expose Virtual
Machines (VMs) and Load Balancers (LBs) IPs through the BGP dynamic protocol
when they either have a Floating IP (FIP) associated or are booted/created
on a provider network -- also on tenant networks if a flag is enabled.
|
|
||||||
|
|
||||||
Overview
--------

With the increment of virtualized/containerized workloads, it is becoming more
and more common to use pure layer-3 Spine and Leaf network deployments in
datacenters. There are several benefits to this, such as reduced complexity at
scale, reduced failure domains, and limiting broadcast traffic, among others.

The OVN BGP Agent is a Python-based daemon that runs on each node
(e.g., OpenStack controllers and/or compute nodes). It connects to the OVN
SouthBound DataBase (OVN SB DB) to detect the specific events it needs to
react to, and then leverages FRR to expose the routes towards the VMs, and
kernel networking capabilities to redirect the traffic arriving at the nodes
to the OVN overlay.

.. note::

   Note it is only intended for the N/S traffic; the E/W traffic will work
   exactly the same as before, i.e., VMs are connected through geneve
   tunnels.

The agent provides a multi-driver implementation that allows you to configure
it for specific infrastructure running on top of OVN, for instance OpenStack
or Kubernetes/OpenShift.
This simple design allows the agent to implement different drivers, depending
on what OVN SB DB events are being watched (watcher examples at
``ovn_bgp_agent/drivers/openstack/watchers/``), and what actions are
triggered in reaction to them (driver examples at
``ovn_bgp_agent/drivers/openstack/XXXX_driver.py``, implementing the
``ovn_bgp_agent/drivers/driver_api.py``).

A driver implements the support for BGP capabilities. It ensures that both
VMs and LBs on provider networks or with Floating IPs associated can be
exposed through BGP. In addition, VMs on tenant networks can also be exposed
if the ``expose_tenant_networks`` configuration option is enabled.
To control what tenant networks are exposed another flag can be used:
``address_scopes``. If not set, all the tenant networks will be exposed, while
if it is configured with a (set of) address_scopes, only the tenant networks
whose address_scope matches will be exposed.
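
For example, restricting exposure to a single address scope could look like
this in ``bgp-agent.conf`` (the scope UUID below is only a placeholder):

```ini
[DEFAULT]
expose_tenant_networks=True
# only tenant networks in these address scopes are exposed
address_scopes=2237917c7b12489a84de4ef384a2bcae
```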

A common driver API is defined, exposing the following methods:

- ``expose_ip`` and ``withdraw_ip``: used to expose/withdraw IPs for local
  OVN ports.

- ``expose_remote_ip`` and ``withdraw_remote_ip``: used to expose/withdraw
  IPs through another node when the VM/Pod is running on a different node.
  For example, for VMs on tenant networks where the traffic needs to be
  injected through the OVN router gateway port.

- ``expose_subnet`` and ``withdraw_subnet``: used to expose/withdraw subnets
  through the local node.
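
As a sketch, the API just described could be captured with an abstract base
class like the following (illustrative only; the real ``driver_api.py`` may
differ in signatures and naming, and a real driver would drive FRR and kernel
networking instead of recording calls):

```python
import abc


class BGPDriverAPI(abc.ABC):
    """Sketch of the common driver API described above (names from the doc)."""

    @abc.abstractmethod
    def expose_ip(self, ip):
        """Advertise an IP bound to a local OVN port."""

    @abc.abstractmethod
    def withdraw_ip(self, ip):
        """Withdraw a previously exposed local IP."""

    @abc.abstractmethod
    def expose_remote_ip(self, ip):
        """Advertise an IP reachable through this node (e.g. via the cr-lrp)."""

    @abc.abstractmethod
    def withdraw_remote_ip(self, ip):
        """Withdraw a remotely exposed IP."""

    @abc.abstractmethod
    def expose_subnet(self, cidr):
        """Advertise a subnet routed through the local node."""

    @abc.abstractmethod
    def withdraw_subnet(self, cidr):
        """Withdraw an exposed subnet."""


class RecordingDriver(BGPDriverAPI):
    """Toy driver that only records what would be advertised."""

    def __init__(self):
        self.advertised = set()

    def expose_ip(self, ip):
        self.advertised.add(ip)

    def withdraw_ip(self, ip):
        self.advertised.discard(ip)

    def expose_remote_ip(self, ip):
        self.advertised.add(ip)

    def withdraw_remote_ip(self, ip):
        self.advertised.discard(ip)

    def expose_subnet(self, cidr):
        self.advertised.add(cidr)

    def withdraw_subnet(self, cidr):
        self.advertised.discard(cidr)
```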
|
|
||||||
|
|
||||||
|
|
||||||
Proposed Solution
|
|
||||||
-----------------
|
|
||||||
|
|
||||||
To support BGP functionality the OVN BGP Agent includes a driver
|
|
||||||
that performs the extra steps required for exposing the IPs through BGP on
|
|
||||||
the right nodes and steering the traffic to/from the node from/to the OVN
|
|
||||||
overlay. In order to configure which driver to use, one should set the
|
|
||||||
``driver`` configuration option in the ``bgp-agent.conf`` file.
|
|
||||||
|
|
||||||
This driver requires a watcher to react to the BGP-related events.
|
|
||||||
In this case, the BGP actions will be trigger by events related to
|
|
||||||
``Port_Binding`` and ``Load_Balancer`` OVN SB DB tables.
|
|
||||||
The information in those tables gets modified by actions related to VMs or LBs
|
|
||||||
creation/deletion, as well as FIPs association/disassociation to/from them.
|
|
||||||
|
|
||||||
Then, the agent performs some actions in order to ensure those VMs are
|
|
||||||
reachable through BGP:
|
|
||||||
|
|
||||||
- Traffic between nodes or BGP Advertisement: These are the actions needed to
|
|
||||||
expose the BGP routes and make sure all the nodes know how to reach the
|
|
||||||
VM/LB IP on the nodes.
|
|
||||||
|
|
||||||
- Traffic within a node or redirecting traffic to/from OVN overlay: These are
|
|
||||||
the actions needed to redirect the traffic to/from a VM to the OVN neutron
|
|
||||||
networks, when traffic reaches the node where the VM is or in their way
|
|
||||||
out of the node.
|
|
||||||
|
|
||||||
The code for the BGP driver is located at
|
|
||||||
``drivers/openstack/ovn_bgp_driver.py``, and its associated watcher can be
|
|
||||||
found at ``drivers/openstack/watchers/bgp_watcher.py``.
|
|
||||||
|
|
||||||
|
|
||||||
OVN SB DB Events
~~~~~~~~~~~~~~~~

The watcher associated with the BGP driver detects the relevant events on the
OVN SB DB and calls the driver functions to configure BGP and Linux kernel
networking accordingly.
The following events are watched and handled by the BGP watcher:

- VMs or LBs created/deleted on provider networks

- FIPs association/disassociation to VMs or LBs

- VMs or LBs created/deleted on tenant networks (if the
  ``expose_tenant_networks`` configuration option is enabled, or if
  ``expose_ipv6_gua_tenant_networks`` is enabled for only exposing
  IPv6 GUA ranges)

.. note::

   If the ``expose_tenant_networks`` flag is enabled, the status of
   ``expose_ipv6_gua_tenant_networks`` does not matter, as all the tenant
   IPs are advertised.


The BGP watcher detects OVN Southbound Database events at the ``Port_Binding``
and ``Load_Balancer`` tables. It creates new event classes named
``PortBindingChassisEvent`` and ``OVNLBEvent``, which all the events
watched for BGP use as their base (inherit from).

The specific defined events to react to are:

- ``PortBindingChassisCreatedEvent``: Detects when a port of type
  ``""`` (empty double-quotes), ``virtual``, or ``chassisredirect`` gets
  attached to the OVN chassis where the agent is running. This is the case for
  VM or amphora LB ports on the provider networks, VM or amphora LB ports on
  tenant networks with a FIP associated, and Neutron gateway router ports
  (CR-LRPs). It calls the ``expose_ip`` driver method to perform the needed
  actions to expose it.

- ``PortBindingChassisDeletedEvent``: Detects when a port of type
  ``""`` (empty double-quotes), ``virtual``, or ``chassisredirect`` gets
  detached from the OVN chassis where the agent is running. This is the case
  for VM or amphora LB ports on the provider networks, VM or amphora LB ports
  on tenant networks with a FIP associated, and Neutron gateway router ports
  (CR-LRPs). It calls the ``withdraw_ip`` driver method to perform the needed
  actions to withdraw the exposed BGP route.

- ``FIPSetEvent``: Detects when a patch port gets its ``nat_addresses`` field
  updated (e.g., an action related to FIP NATing). If so, and the associated
  VM port is on the local chassis, the event is processed by the agent: the
  required ip rule gets created and the IP is also exposed through BGP. It
  calls the ``expose_ip`` driver method, including the associated_port
  information, to perform the required actions.

- ``FIPUnsetEvent``: Same as the previous one, but when the ``nat_addresses``
  field gets an IP deleted. It calls the ``withdraw_ip`` driver method to
  perform the required actions.

- ``SubnetRouterAttachedEvent``: Detects when a patch port gets created.
  This means a subnet is attached to a router. In the
  ``expose_tenant_networks`` case, if the chassis is the one hosting the
  cr-lrp port for the router where the port is being created, then the event
  is processed by the agent and the needed actions (ip rules and routes, and
  ovs rules) for exposing the IPs on that network are performed. This event
  calls the driver_api ``expose_subnet``. The same happens if
  ``expose_ipv6_gua_tenant_networks`` is used, but then the IPs are only
  exposed if they are IPv6 global.

- ``SubnetRouterDetachedEvent``: Same as the previous one, but for the
  deletion of the port. It calls ``withdraw_subnet``.

- ``TenantPortCreateEvent``: Detects when a port of type ``""`` (empty
  double-quotes) or ``virtual`` gets updated. If that port is not on a
  provider network, and the chassis where the event is processed has the
  LogicalRouterPort for the network and the OVN router gateway port the
  network is connected to, then the event is processed and the actions to
  expose it through BGP are triggered. It calls ``expose_remote_ip``, as in
  this case the IPs are exposed through the node with the OVN router gateway
  port instead of the one where the VM is.

- ``TenantPortDeleteEvent``: Same as the previous one, but for the deletion
  of the port. It calls ``withdraw_remote_ip``.

- ``OVNLBMemberUpdateEvent``: This event is required to handle the OVN load
  balancers created on the provider networks. It detects when new datapaths
  are added/removed to/from the ``Load_Balancer`` entries. This happens when
  members are added/removed -- their respective datapaths are added into the
  ``Load_Balancer`` table entry. The event is only processed on the nodes
  with the relevant OVN router gateway ports, as that is where the traffic
  needs to be injected into the OVN overlay. It calls
  ``expose_ovn_lb_on_provider`` when the second datapath is added (the first
  one belongs to the VIP, i.e., the provider network, while the second one
  belongs to the load balancer member -- note all the load balancer members
  are expected to be connected through the same router to the provider
  network). And it calls ``withdraw_ovn_lb_on_provider`` when that member
  gets deleted (only one datapath left) or the event type is ROW_DELETE,
  meaning the whole load balancer is deleted.

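The shape these watcher events share can be sketched as follows. This is a simplified illustration only, not the agent's actual classes: the real events build on OVSDB row events from ovsdbapp and carry more logic, and ``RowEvent``, ``FakeAgent``-style helpers and the dict-based rows here are stand-ins for demonstration.

```python
# Simplified sketch of the BGP watcher event pattern (illustrative only).
class RowEvent:
    """Minimal stand-in for an OVSDB row-event base class."""
    def __init__(self, table, events):
        self.table = table
        self.events = events


class PortBindingChassisCreatedEvent(RowEvent):
    """Reacts when a VM-like port gets bound to the local chassis."""

    def __init__(self, agent):
        self.agent = agent
        super().__init__('Port_Binding', ('update',))

    def match_fn(self, row):
        # Only VM, virtual, or chassisredirect ports are of interest,
        # and only when they land on the chassis this agent runs on.
        if row.get('type') not in ('', 'virtual', 'chassisredirect'):
            return False
        return row.get('chassis') == self.agent.chassis

    def run(self, row):
        # Hand over to the driver to wire the kernel networking and
        # advertise the IP through BGP.
        self.agent.expose_ip(row)
```

A matching ``PortBindingChassisDeletedEvent`` would follow the same pattern but call ``withdraw_ip`` instead.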
Driver Logic
~~~~~~~~~~~~

The BGP driver is in charge of the networking configuration that ensures
VMs and LBs on provider networks or with FIPs can be reached through BGP
(N/S traffic). In addition, if the ``expose_tenant_networks`` flag is
enabled, VMs in tenant networks are reachable too -- although instead of
directly on the node where they are created, through one of the network
gateway chassis nodes. The same happens with
``expose_ipv6_gua_tenant_networks``, but only for IPv6 GUA ranges. In
addition, if the config option ``address_scopes`` is set, only the tenant
networks with a matching address scope are exposed.

To accomplish this, it needs to ensure that:

- VM and LB IPs can be advertised from a node where the traffic can be
  injected into the OVN overlay, in this case either the node hosting the VM
  or the node where the router gateway port is scheduled (see limitations
  subsection).

- Once the traffic reaches the specific node, the traffic is redirected to the
  OVN overlay by leveraging kernel networking.

BGP Advertisement
+++++++++++++++++

The OVN BGP Agent is in charge of triggering FRR (an IP routing protocol
suite for Linux which includes protocol daemons for BGP, OSPF and RIP,
among others) to advertise/withdraw directly connected routes via BGP.
To do that, when the agent starts, it ensures that:

- The FRR local instance is reconfigured to leak routes for a new VRF. To do
  that it uses the ``vtysh`` shell. It connects to the existing FRR socket
  (``--vty_socket`` option) and executes the next commands, passing them
  through a file (``-c FILE_NAME`` option):

  .. code-block:: ini

     LEAK_VRF_TEMPLATE = '''
     router bgp {{ bgp_as }}
       address-family ipv4 unicast
         import vrf {{ vrf_name }}
       exit-address-family

       address-family ipv6 unicast
         import vrf {{ vrf_name }}
       exit-address-family

     router bgp {{ bgp_as }} vrf {{ vrf_name }}
       bgp router-id {{ bgp_router_id }}
       address-family ipv4 unicast
         redistribute connected
       exit-address-family

       address-family ipv6 unicast
         redistribute connected
       exit-address-family

     '''

- There is a VRF created (the one leaked in the previous step), by default
  named ``bgp_vrf``.

- There is a dummy interface (by default named ``bgp-nic``) associated to
  the previously created VRF device.

- ARP/NDP is enabled at the OVS provider bridges by adding an IP to them.

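The template rendering and ``vtysh`` invocation above can be sketched as below. This is an illustrative sketch only: the real agent uses a proper template engine and its own socket path; the regex-based ``render`` helper, the ``/run/frr`` default, and the shortened template are assumptions made here for demonstration.

```python
import re

# Shortened version of the LEAK_VRF_TEMPLATE shown in the document,
# with {{ name }} placeholders as in the original.
LEAK_VRF_TEMPLATE = '''
router bgp {{ bgp_as }}
  address-family ipv4 unicast
    import vrf {{ vrf_name }}
  exit-address-family
'''


def render(template, **values):
    # Replace every {{ name }} placeholder with its configured value.
    return re.sub(r'\{\{\s*(\w+)\s*\}\}',
                  lambda m: str(values[m.group(1)]), template)


def vtysh_command(config_file, vty_socket='/run/frr'):
    # The rendered commands are handed to vtysh through a file, as the
    # document describes (--vty_socket and -c options).
    return ['vtysh', '--vty_socket', vty_socket, '-c', config_file]


frr_config = render(LEAK_VRF_TEMPLATE, bgp_as=64999, vrf_name='bgp_vrf')
```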
Then, to expose the VM/LB IPs as they are created (or upon
initialization or re-sync), since the FRR configuration has the
``redistribute connected`` option enabled, the only action needed to expose
an IP (or withdraw it) is to add it to (or remove it from) the ``bgp-nic``
dummy interface. The agent then relies on Zebra to do the BGP advertisement,
as Zebra detects the addition/deletion of the IP on the local interface and
advertises/withdraws the route:

.. code-block:: ini

   $ ip addr add IPv4/32 dev bgp-nic
   $ ip addr add IPv6/128 dev bgp-nic

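The expose/withdraw step above reduces to building one ``ip addr`` command per address. A minimal sketch, assuming a helper of our own invention (the agent itself talks to the kernel via netlink rather than shelling out):

```python
import ipaddress

# Illustrative helper (not part of the agent's API): build the "ip addr"
# command that exposes or withdraws an IP on the bgp-nic dummy interface,
# using /32 for IPv4 and /128 for IPv6 as described above.
def bgp_nic_command(ip, expose=True, device='bgp-nic'):
    addr = ipaddress.ip_address(ip)
    prefix_len = 32 if addr.version == 4 else 128
    action = 'add' if expose else 'del'
    return ['ip', 'addr', action, f'{addr}/{prefix_len}', 'dev', device]
```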
.. note::

   As we also want to be able to expose VMs connected to tenant networks
   (when the ``expose_tenant_networks`` or
   ``expose_ipv6_gua_tenant_networks`` configuration options are enabled),
   there is a need to expose the Neutron router gateway port (CR-LRP on OVN)
   so that the traffic to VMs on tenant networks is injected into the OVN
   overlay through the node that is hosting that port.


Traffic Redirection to/from OVN
+++++++++++++++++++++++++++++++

Once the VM/LB IP is exposed on a specific node (either the one hosting the
VM/LB or the one with the OVN router gateway port), the OVN BGP Agent is in
charge of configuring the Linux kernel networking and OVS so that the traffic
can be injected into the OVN overlay, and vice versa. To do that, when the
agent starts, it ensures that:

- ARP/NDP is enabled at the OVS provider bridges by adding an IP to them.

- There is a routing table associated to each OVS provider bridge
  (it adds an entry at /etc/iproute2/rt_tables).

- If the provider network is a VLAN network, a VLAN device connected
  to the bridge is created, and it has ARP and NDP enabled.

- Extra OVS flows at the OVS provider bridges are cleaned up.

Then, either upon events or due to (re)sync (regularly or during start up), it:

- Adds an IP rule to apply specific routing table routes,
  in this case the one associated to the OVS provider bridge:

  .. code-block:: ini

     $ ip rule
     0:      from all lookup local
     1000:   from all lookup [l3mdev-table]
     *32000: from all to IP lookup br-ex*   # br-ex is the OVS provider bridge
     *32000: from all to CIDR lookup br-ex* # for VMs in tenant networks
     32766:  from all lookup main
     32767:  from all lookup default

- Adds an IP route at the OVS provider bridge routing table so that the
  traffic is routed to the OVS provider bridge device:

  .. code-block:: ini

     $ ip route show table br-ex
     default dev br-ex scope link
     *CIDR via CR-LRP_IP dev br-ex*   # for VMs in tenant networks
     *CR-LRP_IP dev br-ex scope link* # for the VM in tenant network redirection
     *IP dev br-ex scope link*        # IPs on provider or FIPs

- Adds a static ARP entry for the OVN router gateway ports (CR-LRP) so that
  the traffic is steered to OVN via br-int -- this is because OVN does not
  reply to ARP requests outside its L2 network:

  .. code-block:: ini

     $ ip neigh
     ...
     CR-LRP_IP dev br-ex lladdr CR-LRP_MAC PERMANENT
     ...

- For IPv6, instead of the static ARP entry, an NDP proxy is added, for the
  same reason:

  .. code-block:: ini

     $ ip -6 neigh add proxy CR-LRP_IP dev br-ex

- Finally, to properly send the traffic from the OVN overlay out of the node
  via kernel networking, the OVN BGP Agent needs to add a new flow at the
  OVS provider bridges so that the destination MAC address is changed to the
  MAC address of the OVS provider bridge
  (``actions=mod_dl_dst:OVN_PROVIDER_BRIDGE_MAC,NORMAL``):

  .. code-block:: ini

     $ sudo ovs-ofctl dump-flows br-ex
     cookie=0x3e7, duration=77.949s, table=0, n_packets=0, n_bytes=0, priority=900,ip,in_port="patch-provnet-1" actions=mod_dl_dst:3a:f7:e9:54:e8:4d,NORMAL
     cookie=0x3e7, duration=77.937s, table=0, n_packets=0, n_bytes=0, priority=900,ipv6,in_port="patch-provnet-1" actions=mod_dl_dst:3a:f7:e9:54:e8:4d,NORMAL


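The per-IP wiring steps above can be made explicit as plain command construction. This is a sketch under stated assumptions: the agent programs the kernel via netlink (pyroute2) rather than building shell commands, and ``wiring_commands`` with its parameters is a hypothetical helper, not the agent's API.

```python
# Illustrative sketch of the per-bridge wiring described above: an ip rule
# pointing the exposed IP at the bridge's routing table, a route in that
# table towards the bridge device, and (when a CR-LRP is involved) a
# permanent ARP entry, since OVN does not answer ARP requests coming from
# outside its L2 network.
def wiring_commands(bridge, vm_ip, cr_lrp_ip=None, cr_lrp_mac=None):
    cmds = [
        ['ip', 'rule', 'add', 'to', vm_ip, 'table', bridge,
         'prio', '32000'],
        ['ip', 'route', 'add', vm_ip, 'dev', bridge, 'scope', 'link',
         'table', bridge],
    ]
    if cr_lrp_ip and cr_lrp_mac:
        cmds.append(['ip', 'neigh', 'replace', cr_lrp_ip, 'lladdr',
                     cr_lrp_mac, 'dev', bridge, 'nud', 'permanent'])
    return cmds
```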
Driver API
++++++++++

The BGP driver needs to implement the ``driver_api.py`` interface with the
following functions:

- ``expose_ip``: creates all the ip rules and routes, and ovs flows, needed
  to redirect the traffic to the OVN overlay. It also ensures FRR exposes
  the required IP through BGP.

- ``withdraw_ip``: removes the above configuration to withdraw the exposed IP.

- ``expose_subnet``: adds kernel networking configuration (ip rules and
  route) to ensure traffic can go from the node to the OVN overlay, and
  vice versa, for IPs within the tenant subnet CIDR.

- ``withdraw_subnet``: removes the above kernel networking configuration.

- ``expose_remote_ip``: BGP exposes VM tenant network IPs through the chassis
  hosting the OVN gateway port for the router where the VM is connected.
  It ensures traffic destined to the VM IP arrives at this node by exposing
  the IP through BGP locally. The previous steps in ``expose_subnet`` ensure
  the traffic is redirected to the OVN overlay once on the node.

- ``withdraw_remote_ip``: removes the above steps to stop advertising the IP
  through BGP from the node.

In addition, it also implements these two extra ones for the OVN load
balancers on the provider networks:

- ``expose_ovn_lb_on_provider``: adds kernel networking configuration to
  ensure traffic is forwarded from the node to the OVN overlay, as well as
  to expose the VIP through BGP.

- ``withdraw_ovn_lb_on_provider``: removes the above steps to stop
  advertising the load balancer VIP.

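The interface above can be summarized as an abstract base class. The method names come from the document; the signatures are illustrative assumptions, not the exact upstream ones:

```python
import abc

# Rough sketch of the driver_api interface described above.
class BGPDriverBase(abc.ABC):
    @abc.abstractmethod
    def expose_ip(self, ips, row):
        """Wire the node and expose an IP through BGP."""

    @abc.abstractmethod
    def withdraw_ip(self, ips, row):
        """Undo the wiring and withdraw the exposed IP."""

    @abc.abstractmethod
    def expose_subnet(self, subnet, row):
        """Wire kernel networking for a tenant subnet CIDR."""

    @abc.abstractmethod
    def withdraw_subnet(self, subnet, row):
        """Remove the subnet wiring."""

    @abc.abstractmethod
    def expose_remote_ip(self, ips):
        """Expose tenant IPs through the gateway-port chassis."""

    @abc.abstractmethod
    def withdraw_remote_ip(self, ips):
        """Stop advertising remote tenant IPs."""
```

A concrete driver such as the BGP driver subclasses this and fills in the kernel networking and FRR interactions.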
Agent deployment
~~~~~~~~~~~~~~~~

The BGP mode exposes the VMs and LBs in provider networks or with
FIPs, as well as VMs on tenant networks if the ``expose_tenant_networks`` or
``expose_ipv6_gua_tenant_networks`` configuration options are enabled.

The agent needs to be deployed on all the nodes where VMs can be created,
as well as on the networker nodes (i.e., where OVN router gateway ports can
be allocated):

- For VMs and Amphora load balancers on provider networks or with FIPs,
  the IP is exposed on the node where the VM (or amphora) is deployed.
  Therefore the agent needs to be running on the compute nodes.

- For VMs on tenant networks (with the ``expose_tenant_networks`` or
  ``expose_ipv6_gua_tenant_networks`` configuration options enabled), the
  agent needs to be running on the networker nodes. In OpenStack, with OVN
  networking, the N/S traffic to the tenant VMs (without FIPs) needs to go
  through the networker nodes, more specifically the one hosting the
  chassisredirect OVN port (cr-lrp), connecting the provider network to the
  OVN virtual router. Hence, the VM IPs are advertised through BGP on that
  node, and from there the traffic follows the normal path to the OpenStack
  compute node where the VM is located -- the Geneve tunnel.

- Similarly, for OVN load balancers the IPs are exposed on the networker
  node. In this case the ARP request for the VIP is replied to by the OVN
  router gateway port, therefore the traffic needs to be injected into the
  OVN overlay at that point too.
  Therefore the agent needs to be running on the networker nodes for OVN
  load balancers.

As an example of how to start the OVN BGP Agent on the nodes, see the commands
below:

.. code-block:: ini

   $ python setup.py install
   $ cat bgp-agent.conf
   # sample configuration that can be adapted based on needs
   [DEFAULT]
   debug=True
   reconcile_interval=120
   expose_tenant_networks=True
   # expose_ipv6_gua_tenant_networks=True
   driver=osp_bgp_driver
   address_scopes=2237917c7b12489a84de4ef384a2bcae

   $ sudo bgp-agent --config-dir bgp-agent.conf
   Starting BGP Agent...
   Loaded chassis 51c8480f-c573-4c1c-b96e-582f9ca21e70.
   BGP Agent Started...
   Ensuring VRF configuration for advertising routes
   Configuring br-ex default rule and routing tables for each provider network
   Found routing table for br-ex with: ['201', 'br-ex']
   Sync current routes.
   Add BGP route for logical port with ip 172.24.4.226
   Add BGP route for FIP with ip 172.24.4.199
   Add BGP route for CR-LRP Port 172.24.4.221
   ....

.. note::

   If you only want to expose the IPv6 GUA tenant IPs, then remove the option
   ``expose_tenant_networks`` and add ``expose_ipv6_gua_tenant_networks=True``
   instead.


.. note::

   If you want to filter the tenant networks to be exposed by some specific
   address scopes, add the list of address scopes to the
   ``address_scopes=XXX`` option. If no filtering should be applied, just
   remove the line.

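The address-scope filtering behaviour described in the note can be captured in a few lines. This is a hypothetical helper for illustration, not the agent's code: when no scopes are configured, every tenant subnet is exposed; otherwise only subnets whose scope is in the configured set.

```python
# Illustrative sketch of address-scope filtering (hypothetical helper).
def subnet_is_exposable(subnet_scope, configured_scopes=None):
    # No configured scopes means no filtering: expose everything.
    if not configured_scopes:
        return True
    return subnet_scope in configured_scopes
```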
Note that the OVN BGP Agent operates under the next assumptions:

- A dynamic routing solution, in this case FRR, is deployed and
  advertises/withdraws routes added to/deleted from certain local
  interfaces, in this case the ones associated to the VRF created to that
  end. As only VM and load balancer IPs need to be advertised, FRR needs to
  be configured with the proper filtering so that only /32 (or /128 for
  IPv6) IPs are advertised. A sample config for FRR is:

  .. code-block:: ini

     frr version 7.0
     frr defaults traditional
     hostname cmp-1-0
     log file /var/log/frr/frr.log debugging
     log timestamp precision 3
     service integrated-vtysh-config
     line vty

     router bgp 64999
       bgp router-id 172.30.1.1
       bgp log-neighbor-changes
       bgp graceful-shutdown
       no bgp default ipv4-unicast
       no bgp ebgp-requires-policy

       neighbor uplink peer-group
       neighbor uplink remote-as internal
       neighbor uplink password foobar
       neighbor enp2s0 interface peer-group uplink
       neighbor enp3s0 interface peer-group uplink

       address-family ipv4 unicast
         redistribute connected
         neighbor uplink activate
         neighbor uplink allowas-in origin
         neighbor uplink prefix-list only-host-prefixes out
       exit-address-family

       address-family ipv6 unicast
         redistribute connected
         neighbor uplink activate
         neighbor uplink allowas-in origin
         neighbor uplink prefix-list only-host-prefixes out
       exit-address-family

     ip prefix-list only-default permit 0.0.0.0/0
     ip prefix-list only-host-prefixes permit 0.0.0.0/0 ge 32

     route-map rm-only-default permit 10
       match ip address prefix-list only-default
       set src 172.30.1.1

     ip protocol bgp route-map rm-only-default

     ipv6 prefix-list only-default permit ::/0
     ipv6 prefix-list only-host-prefixes permit ::/0 ge 128

     route-map rm-only-default permit 11
       match ipv6 address prefix-list only-default
       set src f00d:f00d:f00d:f00d:f00d:f00d:f00d:0004

     ipv6 protocol bgp route-map rm-only-default

     ip nht resolve-via-default

- The relevant provider OVS bridges are created and configured with a
  loopback IP address (e.g. 1.1.1.1/32 for IPv4), and proxy ARP/NDP is
  enabled on their kernel interface. In the case of OpenStack this is done
  by TripleO directly.

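The ``only-host-prefixes`` filter above permits any prefix under ``0.0.0.0/0`` (or ``::/0``) whose length is at least 32 (or 128), i.e. host routes only. A small sketch of that matching rule, using the standard library (this mimics the prefix-list semantics for illustration; it is not an FRR API):

```python
import ipaddress

# Illustrative check of what "permit 0.0.0.0/0 ge 32" matches: any prefix
# inside the covering network whose prefix length is >= the "ge" bound.
def prefix_list_permits(prefix, covering='0.0.0.0/0', ge=32):
    net = ipaddress.ip_network(prefix)
    cover = ipaddress.ip_network(covering)
    return net.subnet_of(cover) and net.prefixlen >= ge
```

So the exposed /32 host routes pass the filter while aggregate tenant CIDRs do not.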
Limitations
-----------

The following limitations apply:

- There is no API to decide what to expose; all VMs/LBs on provider networks
  or with floating IPs associated with them will get exposed. For the VMs in
  the tenant networks, the flag ``address_scopes`` should be used for
  filtering which subnets to expose -- it should also be used to ensure
  there are no overlapping IPs.

- There is no support for overlapping CIDRs, so this must be avoided, e.g.,
  by using address scopes and subnet pools.

- Network traffic is steered by kernel routing (ip routes and rules),
  therefore OVS-DPDK, where the kernel space is skipped, is not supported.

- Network traffic is steered by kernel routing (ip routes and rules),
  therefore SR-IOV, where the hypervisor is skipped, is not supported.

- In OpenStack with OVN networking, the N/S traffic to the ovn-octavia VIPs
  on the provider networks, or to the FIPs associated with the VIPs on
  tenant networks, needs to go through the networker nodes (the ones hosting
  the Neutron router gateway ports, i.e., the chassisredirect cr-lrp ports,
  for the router connecting the load balancer members to the provider
  network). Therefore, the entry point into the OVN overlay needs to be one
  of those networker nodes, and consequently the VIPs (or FIPs to VIPs) are
  exposed through them. From those nodes the traffic will follow the normal
  tunneled path (Geneve tunnel) to the OpenStack compute node where the
  selected member is located.

@ -0,0 +1,78 @@
Traffic Redirection to/from OVN
+++++++++++++++++++++++++++++++

Besides the VM/LB IP being exposed on a specific node (either the one hosting
the VM/LB or the one with the OVN router gateway port), the OVN BGP Agent is
in charge of configuring the Linux kernel networking and OVS so that the
traffic can be injected into the OVN overlay, and vice versa. To do that,
when the agent starts, it ensures that:

- ARP/NDP is enabled on the OVS provider bridges by adding an IP to them.

- There is a routing table associated to each OVS provider bridge
  (it adds an entry at /etc/iproute2/rt_tables).

- If the provider network is a VLAN network, a VLAN device connected
  to the bridge is created, and it has ARP and NDP enabled.

- Extra OVS flows at the OVS provider bridges are cleaned up.

Then, either upon events or due to (re)sync (regularly or during start up), it:

- Adds an IP rule to apply specific routing table routes,
  in this case the one associated to the OVS provider bridge:

  .. code-block:: ini

     $ ip rule
     0:      from all lookup local
     1000:   from all lookup [l3mdev-table]
     *32000: from all to IP lookup br-ex*   # br-ex is the OVS provider bridge
     *32000: from all to CIDR lookup br-ex* # for VMs in tenant networks
     32766:  from all lookup main
     32767:  from all lookup default

- Adds an IP route at the OVS provider bridge routing table so that the
  traffic is routed to the OVS provider bridge device:

  .. code-block:: ini

     $ ip route show table br-ex
     default dev br-ex scope link
     *CIDR via CR-LRP_IP dev br-ex*   # for VMs in tenant networks
     *CR-LRP_IP dev br-ex scope link* # for the VM in tenant network redirection
     *IP dev br-ex scope link*        # IPs on provider or FIPs

- Adds a static ARP entry for the OVN router gateway ports (CR-LRP) so that
  the traffic is steered to OVN via br-int -- this is because OVN does not
  reply to ARP requests outside its L2 network:

  .. code-block:: ini

     $ ip neigh
     ...
     CR-LRP_IP dev br-ex lladdr CR-LRP_MAC PERMANENT
     ...

- For IPv6, instead of the static ARP entry, an NDP proxy is added, for the
  same reason:

  .. code-block:: ini

     $ ip -6 neigh add proxy CR-LRP_IP dev br-ex

- Finally, to properly send the traffic from the OVN overlay out of the node
  via kernel networking, the OVN BGP Agent needs to add a new flow at the
  OVS provider bridges so that the destination MAC address is changed to the
  MAC address of the OVS provider bridge
  (``actions=mod_dl_dst:OVN_PROVIDER_BRIDGE_MAC,NORMAL``):

  .. code-block:: ini

     $ sudo ovs-ofctl dump-flows br-ex
     cookie=0x3e7, duration=77.949s, table=0, n_packets=0, n_bytes=0, priority=900,ip,in_port="patch-provnet-1" actions=mod_dl_dst:3a:f7:e9:54:e8:4d,NORMAL
     cookie=0x3e7, duration=77.937s, table=0, n_packets=0, n_bytes=0, priority=900,ipv6,in_port="patch-provnet-1" actions=mod_dl_dst:3a:f7:e9:54:e8:4d,NORMAL

@ -0,0 +1,310 @@
|
||||||
|
.. _bgp_driver:
|
||||||
|
|
||||||
|
===================================================================
|
||||||
|
[SB DB] OVN BGP Agent: Design of the BGP Driver with kernel routing
|
||||||
|
===================================================================
|
||||||
|
|
||||||
|
Purpose
|
||||||
|
-------
|
||||||
|
|
||||||
|
The addition of a BGP driver enables the OVN BGP agent to expose virtual
|
||||||
|
machine (VMs) and load balancer (LBs) IP addresses through the BGP dynamic
|
||||||
|
protocol when these IP addresses are either associated with a floating IP
|
||||||
|
(FIP) or are booted or created on a provider network. The same functionality
|
||||||
|
is available on project networks, when a special flag is set.
|
||||||
|
|
||||||
|
This document presents the design decision behind the BGP Driver for the
|
||||||
|
Networking OVN BGP agent.
|
||||||
|
|
||||||
|
Overview
|
||||||
|
--------
|
||||||
|
|
||||||
|
With the growing popularity of virtualized and containerized workloads,
|
||||||
|
it is common to use pure Layer 3 spine and leaf network deployments in data
|
||||||
|
centers. The benefits of this practice reduce scaling complexities,
|
||||||
|
failure domains, and broadcast traffic limits.
|
||||||
|
|
||||||
|
The southbound OVN BGP agent is a Python-based daemon that runs on each
|
||||||
|
OpenStack Controller and Compute node.
|
||||||
|
The agent monitors the Open Virtual Network (OVN) southbound database
|
||||||
|
for certain VM and floating IP (FIP) events.
|
||||||
|
When these events occur, the agent notifies the FRR BGP daemon (bgpd)
|
||||||
|
to advertise the IP address or FIP associated with the VM.
|
||||||
|
The agent also triggers actions that route the external traffic to the OVN
|
||||||
|
overlay.
|
||||||
|
Because the agent uses a multi-driver implementation, you can configure the
|
||||||
|
agent for the specific infrastructure that runs on top of OVN, such as OSP or
|
||||||
|
Kubernetes and OpenShift.
|
||||||
|
|
||||||
|
.. note::
|
||||||
|
|
||||||
|
Note it is only intended for the N/S traffic, the E/W traffic will work
|
||||||
|
exactly the same as before, i.e., VMs are connected through geneve
|
||||||
|
tunnels.
|
||||||
|
|
||||||
|
|
||||||
|
This design simplicity enables the agent to implement different drivers,
|
||||||
|
depending on what OVN SB DB events are being watched (watchers examples at
|
||||||
|
``ovn_bgp_agent/drivers/openstack/watchers/``), and what actions are
|
||||||
|
triggered in reaction to them (drivers examples at
|
||||||
|
``ovn_bgp_agent/drivers/openstack/XXXX_driver.py``, implementing the
|
||||||
|
``ovn_bgp_agent/drivers/driver_api.py``).
|
||||||
|
|
||||||
|
A driver implements the support for BGP capabilities. It ensures that both VMs
|
||||||
|
and LBs on provider networks or associated floating IPs are exposed through BGP.
|
||||||
|
In addition, VMs on tenant networks can be also exposed
|
||||||
|
if the ``expose_tenant_network`` configuration option is enabled.
|
||||||
|
To control what tenant networks are exposed another flag can be used:
|
||||||
|
``address_scopes``. If not set, all the tenant networks will be exposed, while
|
||||||
|
if it is configured with a (set of) address_scopes, only the tenant networks
|
||||||
|
whose address_scope matches will be exposed.
|
||||||
|
|
||||||
|
A common driver API is defined exposing the these methods:
|
||||||
|
|
||||||
|
- ``expose_ip`` and ``withdraw_ip``: exposes or withdraws IPs for local
|
||||||
|
OVN ports.
|
||||||
|
|
||||||
|
- ``expose_remote_ip`` and ``withdraw_remote_ip``: exposes or withdraws IPs
|
||||||
|
through another node when the VM or pods are running on a different node.
|
||||||
|
For example, use for VMs on tenant networks where the traffic needs to be
|
||||||
|
injected through the OVN router gateway port.
|
||||||
|
|
||||||
|
- ``expose_subnet`` and ``withdraw_subnet``: exposes or withdraws subnets
|
||||||
|
through the local node.
|
||||||
|
|
||||||
|
|
||||||
|
Proposed Solution
|
||||||
|
-----------------
|
||||||
|
|
||||||
|
To support BGP functionality the OVN BGP Agent includes a driver
|
||||||
|
that performs the extra steps required for exposing the IPs through BGP on
|
||||||
|
the correct nodes and steering the traffic to/from the node from/to the OVN
|
||||||
|
overlay. To configure the OVN BGP agent to use the BGP driver set the
|
||||||
|
``driver`` configuration option in the ``bgp-agent.conf`` file to
|
||||||
|
``ovn_bgp_driver``.

The BGP driver requires a watcher to react to the BGP-related events.
In this case, BGP actions are triggered by events related to the
``Port_Binding`` and ``Load_Balancer`` OVN SB DB tables.
The information in these tables is modified when VMs and LBs are created and
deleted, and when FIPs for them are associated and disassociated.

Then, the agent performs some actions in order to ensure those VMs are
reachable through BGP:

- Traffic between nodes or BGP Advertisement: These are the actions needed to
  expose the BGP routes and make sure all the nodes know how to reach the
  VM/LB IP on the nodes.

- Traffic within a node or redirecting traffic to/from OVN overlay: These are
  the actions needed to redirect the traffic to/from a VM to the OVN Neutron
  networks, when traffic reaches the node where the VM is or on its way
  out of the node.

The code for the BGP driver is located at
``ovn_bgp_agent/drivers/openstack/ovn_bgp_driver.py``, and its associated
watcher can be found at
``ovn_bgp_agent/drivers/openstack/watchers/bgp_watcher.py``.


OVN SB DB Events
~~~~~~~~~~~~~~~~

The watcher associated with the BGP driver detects the relevant events on the
OVN SB DB and calls the driver functions to configure BGP and Linux kernel
networking accordingly.
The following events are watched and handled by the BGP watcher:

- VMs or LBs created/deleted on provider networks

- FIPs association/disassociation to VMs or LBs

- VMs or LBs created/deleted on tenant networks (if the
  ``expose_tenant_networks`` configuration option is enabled, or
  ``expose_ipv6_gua_tenant_networks`` for only exposing IPv6 GUA ranges)

.. note::

   If the ``expose_tenant_networks`` flag is enabled, the status of
   ``expose_ipv6_gua_tenant_networks`` does not matter, as all the tenant
   IPs are advertised.
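Whether an address falls in the IPv6 GUA range can be checked with the standard library, e.g. (a sketch; the agent's own helper may differ):

```python
import ipaddress

def is_ipv6_gua(ip):
    """Return True for IPv6 Global Unicast Addresses (2000::/3)."""
    addr = ipaddress.ip_address(ip)
    return addr.version == 6 and addr in ipaddress.ip_network("2000::/3")


print(is_ipv6_gua("2001:db8::1"))  # True (inside 2000::/3)
print(is_ipv6_gua("fe80::1"))      # False (link-local)
print(is_ipv6_gua("192.0.2.1"))    # False (IPv4)
```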


The watcher creates new event classes named
``PortBindingChassisEvent`` and ``OVNLBEvent``, which all the events
watched for BGP use as the base (inherit from).

The BGP watcher reacts to the following events:

- ``PortBindingChassisCreatedEvent``: Detects when a port of type
  ``""`` (empty double-quotes), ``virtual``, or ``chassisredirect`` gets
  attached to the OVN chassis where the agent is running. This is the case for
  VM or amphora LB ports on the provider networks, VM or amphora LB ports on
  tenant networks with a FIP associated, and neutron gateway router ports
  (CR-LRPs). It calls the ``expose_ip`` driver method to perform the needed
  actions to expose it.

- ``PortBindingChassisDeletedEvent``: Detects when a port of type
  ``""`` (empty double-quotes), ``virtual``, or ``chassisredirect`` gets
  detached from the OVN chassis where the agent is running. This is the case
  for VM or amphora LB ports on the provider networks, VM or amphora LB ports
  on tenant networks with a FIP associated, and neutron gateway router ports
  (CR-LRPs). It calls the ``withdraw_ip`` driver method to perform the needed
  actions to withdraw the exposed BGP route.

- ``FIPSetEvent``: Detects when a Port_Binding entry of type ``patch`` gets
  its ``nat_addresses`` field updated (e.g., an action related to FIP NATing).
  When that happens, and the associated VM port is on the local chassis, the
  event is processed by the agent and the required IP rule gets created and its
  IP is (BGP) exposed. It calls the ``expose_ip`` driver method, including
  the associated_port information, to perform the required actions.

- ``FIPUnsetEvent``: Same as the previous one, but when the ``nat_addresses``
  field gets an IP deleted. It calls the ``withdraw_ip`` driver method to
  perform the required actions.

- ``SubnetRouterAttachedEvent``: Detects when a Port_Binding entry of type
  ``patch`` gets created. This means a subnet is attached to a router.
  In the ``expose_tenant_networks``
  case, if the chassis is the one having the cr-lrp port for that router where
  the port is getting created, then the event is processed by the agent and the
  needed actions (ip rules and routes, and ovs rules) for exposing the IPs on
  that network are performed. This event calls the driver API
  ``expose_subnet``. The same happens if ``expose_ipv6_gua_tenant_networks``
  is used, but then the IPs are only exposed if they are IPv6 global.

- ``SubnetRouterDetachedEvent``: Same as ``SubnetRouterAttachedEvent``,
  but for the deletion of the port. It calls ``withdraw_subnet``.

- ``TenantPortCreateEvent``: Detects when a port of type ``""`` (empty
  double-quotes) or ``virtual`` gets updated. If that port is not on a
  provider network, and the chassis where the event is processed has the
  ``LogicalRouterPort`` for the network and the OVN router gateway port where
  the network is connected to, then the event is processed and the actions to
  expose it through BGP are triggered. It calls ``expose_remote_ip``
  because in this case the IPs are exposed through the node with the OVN router
  gateway port, instead of the node where the VM is located.

- ``TenantPortDeleteEvent``: Same as ``TenantPortCreateEvent``, but for
  the deletion of the port. It calls ``withdraw_remote_ip``.

- ``OVNLBMemberUpdateEvent``: This event is required to handle the OVN load
  balancers created on the provider networks. It detects when datapaths
  are added/removed to/from the ``Load_Balancer`` entries. This happens when
  members are added/removed, which triggers the addition/deletion of their
  datapaths in the ``Load_Balancer`` table entry.
  The event is only processed on the nodes with
  the relevant OVN router gateway ports, because that is where the traffic
  needs to be injected into the OVN overlay.
  ``OVNLBMemberUpdateEvent`` calls ``expose_ovn_lb_on_provider`` only when the
  second datapath is added. The first datapath belongs to the VIP on the
  provider network, while the second one belongs to the load balancer member.
  ``OVNLBMemberUpdateEvent`` calls ``withdraw_ovn_lb_on_provider`` when the
  second datapath is deleted, or the entire load balancer is deleted (event
  type is ``ROW_DELETE``).
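The chassis-event matching described in the bullets above can be sketched as follows (simplified; the real watcher classes receive OVSDB row objects and perform more checks):

```python
EXPOSABLE_PORT_TYPES = ("", "virtual", "chassisredirect")

def is_relevant_port(port_type, port_chassis, local_chassis):
    """Return True when a Port_Binding row should trigger expose/withdraw
    actions on this node: a VM, virtual, or cr-lrp port bound to the local
    chassis."""
    return port_type in EXPOSABLE_PORT_TYPES and port_chassis == local_chassis


print(is_relevant_port("", "chassis-1", "chassis-1"))                 # True
print(is_relevant_port("chassisredirect", "chassis-2", "chassis-1"))  # False
print(is_relevant_port("patch", "chassis-1", "chassis-1"))            # False
```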

.. note::

   All the load balancer members are expected to be connected through the same
   router to the provider network.
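The second-datapath rule for ``OVNLBMemberUpdateEvent`` can be mirrored in a small helper (hypothetical, for illustration only):

```python
def lb_action(old_datapath_count, new_datapath_count, row_deleted=False):
    """Map a Load_Balancer datapath change to the driver call, if any.

    The first datapath belongs to the VIP's provider network; the second
    one appears with the first member, which is when the VIP becomes
    reachable and must be exposed.
    """
    if row_deleted or (old_datapath_count >= 2 and new_datapath_count < 2):
        return "withdraw_ovn_lb_on_provider"
    if old_datapath_count < 2 and new_datapath_count >= 2:
        return "expose_ovn_lb_on_provider"
    return None


print(lb_action(1, 2))                    # expose_ovn_lb_on_provider
print(lb_action(2, 1))                    # withdraw_ovn_lb_on_provider
print(lb_action(2, 2, row_deleted=True))  # withdraw_ovn_lb_on_provider
```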


Driver Logic
~~~~~~~~~~~~

The BGP driver is in charge of the networking configuration ensuring that
VMs and LBs on provider networks or with FIPs can be reached through BGP
(N/S traffic). In addition, if the ``expose_tenant_networks`` flag is enabled,
VMs in tenant networks should be reachable too -- although instead of directly
on the node where they are created, through one of the network gateway chassis
nodes. The same happens with ``expose_ipv6_gua_tenant_networks`` but only for
IPv6 GUA ranges. In addition, if the config option ``address_scopes`` is set,
only the tenant networks with a matching ``address_scope`` will be exposed.

To accomplish the network configuration and advertisement, the driver ensures:

- VM and LB IPs can be advertised in a node where the traffic could be
  injected into the OVN overlay, in this case either the node hosting the VM
  or the node where the router gateway port is scheduled (see limitations
  subsection).

- Once the traffic reaches the specific node, the traffic is redirected to the
  OVN overlay by leveraging kernel networking.


.. include:: ../bgp_advertising.rst


.. include:: ../bgp_traffic_redirection.rst


Driver API
++++++++++

The BGP driver needs to implement the ``driver_api.py`` interface with the
following functions:

- ``expose_ip``: creates all the IP rules and routes, and OVS flows needed
  to redirect the traffic to the OVN overlay. It also ensures FRR exposes
  the required IP through BGP.

- ``withdraw_ip``: removes the above configuration to withdraw the exposed IP.

- ``expose_subnet``: adds kernel networking configuration (IP rules and route)
  to ensure traffic can go from the node to the OVN overlay, and vice versa,
  for IPs within the tenant subnet CIDR.

- ``withdraw_subnet``: removes the above kernel networking configuration.

- ``expose_remote_ip``: BGP exposes VM tenant network IPs through the chassis
  hosting the OVN gateway port for the router where the VM is connected.
  It ensures traffic destined to the VM IP arrives at this node by exposing
  the IP through BGP locally. The previous steps in ``expose_subnet`` ensure
  the traffic is redirected to the OVN overlay once on the node.

- ``withdraw_remote_ip``: removes the above steps to stop advertising the IP
  through BGP from the node.

The driver API implements these additional methods for OVN load balancers on
provider networks:

- ``expose_ovn_lb_on_provider``: adds kernel networking configuration to ensure
  traffic is forwarded from the node to the OVN overlay and to expose
  the VIP through BGP.

- ``withdraw_ovn_lb_on_provider``: removes the above steps to stop advertising
  the load balancer VIP.
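The interface described above can be pictured as an abstract base class along these lines (a simplified sketch of ``driver_api.py``; the real signatures take additional arguments such as port and IP details):

```python
import abc


class AgentDriverBase(abc.ABC):
    """Interface every OVN BGP agent driver is expected to implement."""

    @abc.abstractmethod
    def expose_ip(self, ips, row):
        """Advertise IPs of a local OVN port and wire them to the overlay."""

    @abc.abstractmethod
    def withdraw_ip(self, ips, row):
        """Stop advertising IPs and remove the wiring."""

    @abc.abstractmethod
    def expose_subnet(self, subnet, row):
        """Add IP rules/routes so a tenant subnet is reachable via this node."""

    @abc.abstractmethod
    def withdraw_subnet(self, subnet, row):
        """Remove the subnet wiring."""

    @abc.abstractmethod
    def expose_remote_ip(self, ips, row):
        """Advertise tenant IPs from the node holding the router gateway port."""

    @abc.abstractmethod
    def withdraw_remote_ip(self, ips, row):
        """Stop advertising tenant IPs from the gateway node."""
```

A concrete driver subclasses this and fills in the kernel networking and FRR steps; instantiating the base class directly raises ``TypeError``.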

.. include:: ../agent_deployment.rst


Limitations
-----------

The following limitations apply:

- There is no API to decide what to expose; all VMs/LBs on provider networks
  or with floating IPs associated with them will get exposed. For the VMs in
  the tenant networks, the flag ``address_scopes`` should be used to filter
  what subnets to expose -- which should also be used to ensure no
  overlapping IPs.

- There is no support for overlapping CIDRs, so this must be avoided, e.g., by
  using address scopes and subnet pools.

- Network traffic is steered by kernel routing (IP routes and rules), therefore
  OVS-DPDK, where the kernel space is skipped, is not supported.

- Network traffic is steered by kernel routing (IP routes and rules), therefore
  SR-IOV, where the hypervisor is skipped, is not supported.

- In OpenStack with OVN networking, the N/S traffic to the ovn-octavia VIPs on
  the provider networks or the FIPs associated with the VIPs on tenant networks
  needs to go through the networking nodes (the ones hosting the Neutron Router
  Gateway Ports, i.e., the chassisredirect cr-lrp ports, for the router
  connecting the load balancer members to the provider network). Therefore, the
  entry point into the OVN overlay needs to be one of those networking nodes,
  and consequently the VIPs (or FIPs to VIPs) are exposed through them. From
  those nodes the traffic follows the normal tunneled path (Geneve tunnel) to
  the OpenStack compute node where the selected member is located.
@ -12,9 +12,9 @@

''''''' Heading 4
(Avoid deeper levels because they do not render well.)

=========================================================
Design of OVN BGP Agent with EVPN Driver (kernel routing)
=========================================================

Purpose
-------

@ -96,7 +96,7 @@ watcher detects it).

The overall architecture and integration between the ``networking-bgpvpn``
and the ``networking-bgp-ovn`` agent are shown in the next figure:

.. image:: ../../../images/networking-bgpvpn_integration.png
   :alt: integration components
   :align: center
   :width: 100%

@ -409,7 +409,7 @@ The next figure shows the N/S traffic flow through the VRF to the VM,

including information regarding the OVS flows on the provider bridge (br-ex),
and the routes on the VRF routing table.

.. image:: ../../../images/evpn_traffic_flow.png
   :alt: integration components
   :align: center
   :width: 100%
@ -0,0 +1,12 @@

==========================
BGP Drivers Documentation
==========================

.. toctree::
   :maxdepth: 1

   bgp_mode_design
   nb_bgp_mode_design
   ovn_bgp_mode_design
   evpn_mode_design
   bgp_mode_stretched_l2_design
@ -0,0 +1,386 @@

.. _nb_bgp_driver:

======================================================================
[NB DB] NB OVN BGP Agent: Design of the BGP Driver with kernel routing
======================================================================

Purpose
-------

The addition of a BGP driver enables the OVN BGP agent to expose virtual
machine (VM) and load balancer (LB) IP addresses through the BGP dynamic
protocol when these IP addresses are either associated with a floating IP
(FIP) or booted or created on a provider network.
The same functionality is available on project networks, when a special
flag is set.

This document presents the design decisions behind the NB BGP Driver for
the Networking OVN BGP agent.

Overview
--------

With the growing popularity of virtualized and containerized workloads,
it is common to use pure Layer 3 spine and leaf network deployments in
data centers. The benefits of this practice include reduced scaling
complexity, smaller failure domains, and less broadcast traffic.

The northbound OVN BGP agent is a Python-based daemon that runs on each
OpenStack Controller and Compute node.
The agent monitors the Open Virtual Network (OVN) northbound database
for certain VM and floating IP (FIP) events.
When these events occur, the agent notifies the FRR BGP daemon (bgpd)
to advertise the IP address or FIP associated with the VM.
The agent also triggers actions that route the external traffic to the OVN
overlay.
Unlike its predecessor, the (southbound) OVN BGP agent, the northbound OVN BGP
agent uses the northbound database API, which is more stable than the
southbound database API because the former is isolated from internal changes
to core OVN.

.. note::

   The northbound OVN BGP agent driver is only intended for the N/S traffic;
   the E/W traffic will work exactly the same as before, i.e., VMs are
   connected through Geneve tunnels.


The agent provides a multi-driver implementation that allows you to configure
it for specific infrastructure running on top of OVN, for instance OpenStack
or Kubernetes/OpenShift.
This design simplicity enables the agent to implement different drivers,
depending on what OVN NB DB events are being watched (watcher examples at
``ovn_bgp_agent/drivers/openstack/watchers/``), and what actions are
triggered in reaction to them (driver examples at
``ovn_bgp_agent/drivers/openstack/XXXX_driver.py``, implementing the
``ovn_bgp_agent/drivers/driver_api.py``).

A driver implements the support for BGP capabilities. It ensures that both VMs
and LBs on provider networks or associated Floating IPs are exposed through
BGP. In addition, VMs on tenant networks can also be exposed
if the ``expose_tenant_networks`` configuration option is enabled.
To control what tenant networks are exposed, another flag can be used:
``address_scopes``. If not set, all the tenant networks will be exposed, while
if it is configured with a (set of) address scopes, only the tenant networks
whose address scope matches will be exposed.

A common driver API is defined exposing these methods:

- ``expose_ip`` and ``withdraw_ip``: exposes or withdraws IPs for local
  OVN ports.

- ``expose_remote_ip`` and ``withdraw_remote_ip``: exposes or withdraws IPs
  through another node when the VM or pods are running on a different node.
  For example, used for VMs on tenant networks where the traffic needs to be
  injected through the OVN router gateway port.

- ``expose_subnet`` and ``withdraw_subnet``: exposes or withdraws subnets
  through the local node.


Proposed Solution
-----------------

To support BGP functionality the NB OVN BGP Agent includes a new driver
that performs the steps required for exposing the IPs through BGP on
the correct nodes and steering the traffic to/from the node from/to the OVN
overlay.
To configure the OVN BGP agent to use the northbound OVN BGP driver, in the
``bgp-agent.conf`` file, set the value of ``driver`` to ``nb_ovn_bgp_driver``.

This driver requires a watcher to react to the BGP-related events.
In this case, BGP actions are triggered by events related to the
``Logical_Switch_Port``, ``Logical_Router_Port``, and ``Load_Balancer``
OVN NB DB tables.
The information in these tables is modified when VMs and LBs are created and
deleted, and when FIPs for them are associated and disassociated.

Then, the agent performs these actions to ensure the VMs are reachable through
BGP:

- Traffic between nodes or BGP Advertisement: These are the actions needed to
  expose the BGP routes and make sure all the nodes know how to reach the
  VM/LB IP on the nodes. This is exactly the same as in the initial OVN BGP
  driver (see :ref:`bgp_driver`).

- Traffic within a node or redirecting traffic to/from OVN overlay (wiring):
  These are the actions needed to redirect the traffic to/from a VM to the OVN
  Neutron networks, when traffic reaches the node where the VM is or on its
  way out of the node.

The code for the NB BGP driver is located at
``ovn_bgp_agent/drivers/openstack/nb_ovn_bgp_driver.py``, and its associated
watcher can be found at
``ovn_bgp_agent/drivers/openstack/watchers/nb_bgp_watcher.py``.

Note this driver also allows different ways of wiring the node to the OVN
overlay. These are configurable through the option ``exposing_method``, where
for now you can select:

- ``underlay``: using kernel routing (what we describe in this document), same
  as supported by the driver at :ref:`bgp_driver`.

- ``ovn``: using an extra OVN cluster per node to perform the routing at
  OVN/OVS level instead of the kernel, therefore enabling datapath acceleration
  (Hardware Offloading and OVS-DPDK). More information about this mechanism
  at :ref:`ovn_routing`.
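Putting the two options together, a minimal ``bgp-agent.conf`` fragment for this driver might look like the following (option names come from this document; values are illustrative):

```ini
[DEFAULT]
driver = nb_ovn_bgp_driver
# How the node is wired to the OVN overlay: kernel routing (default)
# or an extra per-node OVN cluster.
exposing_method = underlay
```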


OVN NB DB Events
~~~~~~~~~~~~~~~~

The watcher associated with the NB BGP driver detects the relevant events on
the OVN NB DB and calls the driver functions to configure BGP and Linux kernel
networking accordingly.

.. note::

   Linux kernel networking is used when the default ``exposing_method``
   (``underlay``) is used. If ``ovn`` is used instead, OVN routing is
   used instead of the kernel. For more details on this see
   :ref:`ovn_routing`.

The following events are watched and handled by the BGP watcher:

- VMs or LBs created/deleted on provider networks

- FIPs association/disassociation to VMs or LBs

- VMs or LBs created/deleted on tenant networks (if the
  ``expose_tenant_networks`` configuration option is enabled, or
  ``expose_ipv6_gua_tenant_networks`` for only exposing IPv6 GUA ranges)

.. note::

   If the ``expose_tenant_networks`` flag is enabled, the status of
   ``expose_ipv6_gua_tenant_networks`` does not matter, as all the tenant
   IPs are advertised.


The NB BGP watcher reacts to events on the following tables:

- ``Logical_Switch_Port``

- ``Logical_Router_Port``

- ``Load_Balancer``

Besides the previously existing ``OVNLBEvent`` class, the NB BGP watcher has
new event classes named ``LSPChassisEvent`` and ``LRPChassisEvent`` that
all the events watched for the NB BGP driver use as the base (inherit from).

The specific defined events to react to are:

- ``LogicalSwitchPortProviderCreateEvent``: Detects when a VM or an amphora LB
  port, i.e., a logical switch port of type ``""`` (empty double-quotes) or
  ``virtual``, comes up or gets attached to the OVN chassis where the agent is
  running. If the port is on a provider network, the driver calls the
  ``expose_ip`` driver method to perform the needed actions to expose the port
  (wire and advertise). If the port is on a tenant network, the driver
  dismisses the event.

- ``LogicalSwitchPortProviderDeleteEvent``: Detects when a VM or an amphora LB
  port, i.e., a logical switch port of type ``""`` (empty double-quotes) or
  ``virtual``, goes down or gets detached from the OVN chassis where the agent
  is running. If the port is on a provider network, the driver calls the
  ``withdraw_ip`` driver method to perform the needed actions to withdraw the
  port (withdraw and unwire). If the port is on a tenant network, the driver
  dismisses the event.

- ``LogicalSwitchPortFIPCreateEvent``: Similar to
  ``LogicalSwitchPortProviderCreateEvent`` but focusing on the changes on the
  FIP information on the Logical Switch Port external_ids.
  It calls the ``expose_fip`` driver method to perform the needed actions to
  expose the floating IP (wire and advertise).

- ``LogicalSwitchPortFIPDeleteEvent``: Same as the previous one but for
  withdrawing FIPs. In this case it is similar to
  ``LogicalSwitchPortProviderDeleteEvent`` but instead calls the
  ``withdraw_fip`` driver method to perform the needed actions to withdraw
  the floating IP (withdraw and unwire).

- ``LocalnetCreateDeleteEvent``: Detects creation/deletion of OVN localnet
  ports, which indicates the creation/deletion of provider networks. This
  triggers a resync (``sync`` method) action to perform the base configuration
  needed for the provider networks, such as OVS flows or ARP/NDP
  configurations.

- ``ChassisRedirectCreateEvent``: Similar to
  ``LogicalSwitchPortProviderCreateEvent`` but with the focus on logical router
  ports, such as the OVN gateway ports (cr-lrps), instead of logical switch
  ports. The driver calls ``expose_ip``, which performs additional steps to
  also expose IPs related to the cr-lrps, such as the ovn-lb or IPs in tenant
  networks. The watcher ``match`` checks the chassis information in the
  ``status`` field, which requires ovn23.09 or later.

- ``ChassisRedirectDeleteEvent``: Similar to
  ``LogicalSwitchPortProviderDeleteEvent`` but with the focus on logical router
  ports, such as the OVN gateway ports (cr-lrps), instead of logical switch
  ports. The driver calls ``withdraw_ip``, which performs additional steps to
  also withdraw IPs related to the cr-lrps, such as the ovn-lb or IPs in tenant
  networks. The watcher ``match`` checks the chassis information in the
  ``status`` field, which requires ovn23.09 or later.

- ``LogicalSwitchPortSubnetAttachEvent``: Detects Logical Switch Ports of type
  ``router`` (connecting a Logical Switch to a Logical Router) and checks if
  the associated router is associated to the local chassis, i.e., if the CR-LRP
  of the router is located in the local chassis. If that is the case, the
  ``expose_subnet`` driver method is called, which is in charge of the wiring
  needed for the IPs on that subnet (set of IP routes and rules).

- ``LogicalSwitchPortSubnetDetachEvent``: Similar to
  ``LogicalSwitchPortSubnetAttachEvent`` but for unwiring the subnet, so it
  calls the ``withdraw_subnet`` driver method.

- ``LogicalSwitchPortTenantCreateEvent``: Detects when a logical switch port
  of type ``""`` (empty double-quotes) or ``virtual`` gets updated, similar to
  ``LogicalSwitchPortProviderCreateEvent``. It checks if the network associated
  to the VM is exposed in the local chassis (meaning its cr-lrp is also local).
  If that is the case, it calls ``expose_remote_ip``, which manages the
  advertising of the IP -- there is no need for wiring, as that is done when
  the subnet is exposed by the ``LogicalSwitchPortSubnetAttachEvent`` event.

- ``LogicalSwitchPortTenantDeleteEvent``: Similar to
  ``LogicalSwitchPortTenantCreateEvent`` but for withdrawing IPs,
  calling ``withdraw_remote_ip``.

- ``OVNLBCreateEvent``: Detects ``Load_Balancer`` events and processes them
  only if the ``Load_Balancer`` entry has associated VIPs and the router is
  local to the chassis.
  If the VIP is added on a provider network, the driver calls
  ``expose_ovn_lb_vip`` to expose and wire the VIP.
  If the VIP is added on a tenant network, the driver calls
  ``expose_ovn_lb_vip`` to only expose the VIP.
  If a floating IP is added, then the driver calls ``expose_ovn_lb_fip`` to
  expose and wire the FIP.

- ``OVNLBDeleteEvent``: If the VIP is removed from a provider
  network, the driver calls ``withdraw_ovn_lb_vip`` to withdraw and unwire
  the VIP. If the VIP is removed from a tenant network,
  the driver calls ``withdraw_ovn_lb_vip`` to only withdraw the VIP.
  If a floating IP is removed, then the driver calls ``withdraw_ovn_lb_fip``
  to withdraw and unwire the FIP.
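The VIP/FIP branching described in the two load balancer events above can be summarized in a small decision helper (hypothetical, for illustration only):

```python
def ovn_lb_calls(on_provider_network, has_fip, deleting=False):
    """Return the driver calls for an OVN LB event as (call, wire?) tuples.

    VIPs on provider networks are wired and advertised; VIPs on tenant
    networks are only advertised (wiring happens via the cr-lrp); FIPs
    are always wired and advertised.
    """
    prefix = "withdraw" if deleting else "expose"
    calls = [(f"{prefix}_ovn_lb_vip", on_provider_network)]
    if has_fip:
        calls.append((f"{prefix}_ovn_lb_fip", True))
    return calls


print(ovn_lb_calls(True, False))
# [('expose_ovn_lb_vip', True)]
print(ovn_lb_calls(False, True, deleting=True))
# [('withdraw_ovn_lb_vip', False), ('withdraw_ovn_lb_fip', True)]
```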
|
||||||
|
|
||||||
|
|
||||||
|
Driver Logic
|
||||||
|
~~~~~~~~~~~~
|
||||||
|
|
||||||
|
The NB BGP driver is in charge of the networking configuration ensuring that
|
||||||
|
VMs and LBs on provider networks or with FIPs can be reached through BGP
|
||||||
|
(N/S traffic). In addition, if the ``expose_tenant_networks`` flag is enabled,
|
||||||
|
VMs in tenant networks should be reachable too -- although instead of directly
|
||||||
|
in the node they are created, through one of the network gateway chassis nodes.
|
||||||
|
The same happens with ``expose_ipv6_gua_tenant_networks`` but only for IPv6
|
||||||
|
GUA ranges. In addition, if the config option ``address_scopes`` is set, only
|
||||||
|
the tenant networks with matching corresponding ``address_scope`` will be
|
||||||
|
exposed.
|
||||||
|
|
||||||
|
.. note::
|
||||||
|
|
||||||
|
To be able to expose tenant networks a ovn version ovn23.09 or newer is
|
||||||
|
needed
|
||||||
|
|
||||||
|
To accomplish the network configuration and advertisement, the driver ensures:
|
||||||
|
|
||||||
|
- VM and LBs IPs can be advertised in a node where the traffic can be injected
|
||||||
|
into the OVN overlay: either in the node that hosts the VM or in the node
|
||||||
|
where the router gateway port is scheduled. (See the "limitations"
|
||||||
|
subsection.).
|
||||||
|
|
||||||
|
- After the traffic reaches the specific node, kernel networking redirects the
|
||||||
|
traffic to the OVN overlay, if the default ``underlay`` exposing method is
|
||||||
|
used.
|
||||||
|
|
||||||
|
|
||||||
|
.. include:: ../bgp_advertising.rst
|
||||||
|
|
||||||
|
|
||||||
|
.. include:: ../bgp_traffic_redirection.rst
|
||||||
|
|
||||||
|
|
||||||

Driver API
++++++++++

The NB BGP driver implements the ``driver_api.py`` interface with the
following functions:

- ``expose_ip``: creates all the IP rules, routes, and OVS flows needed
  to redirect the traffic to the OVN overlay. It also ensures that FRR
  exposes the required IP by using BGP.

- ``withdraw_ip``: removes the configuration (IP rules/routes, OVS flows)
  added by ``expose_ip`` to withdraw the exposed IP.

- ``expose_subnet``: adds kernel networking configuration (IP rules and
  routes) to ensure traffic can go from the node to the OVN overlay (and
  back) for IPs within the tenant subnet CIDR.

- ``withdraw_subnet``: removes the kernel networking configuration added
  by ``expose_subnet``.

- ``expose_remote_ip``: exposes, through BGP, VM tenant network IPs on
  the chassis hosting the OVN gateway port of the router the VM is
  connected to. It ensures traffic directed to the VM IP arrives at this
  node by exposing the IP through BGP locally. The previous steps in
  ``expose_subnet`` ensure the traffic is redirected to the OVN overlay
  once it arrives on the node.

- ``withdraw_remote_ip``: removes the configuration added by
  ``expose_remote_ip``.

In addition, the driver implements extra methods for FIPs and OVN load
balancers:

- ``expose_fip`` and ``withdraw_fip``: equivalent to ``expose_ip`` and
  ``withdraw_ip``, but for FIPs.

- ``expose_ovn_lb_vip``: adds kernel networking configuration to ensure
  traffic is forwarded from the node with the associated cr-lrp to the
  OVN overlay, and exposes the VIP through BGP on that node.

- ``withdraw_ovn_lb_vip``: reverts the above steps to stop advertising
  the load balancer VIP.

- ``expose_ovn_lb_fip`` and ``withdraw_ovn_lb_fip``: expose the FIPs
  associated with OVN load balancers. This is similar to
  ``expose_fip``/``withdraw_fip``, except that the IP must be exposed on
  the node with the cr-lrp of the router associated with the load
  balancer.
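As an illustration of the interface shape only (a hedged sketch: the real
``driver_api.py`` implementations receive OVN DB data and program kernel
rules, OVS flows and FRR), a minimal in-memory driver mimicking the
expose/withdraw pairs could look like:

```python
# Illustrative sketch only: the real NB driver programs IP rules/routes,
# OVS flows and FRR; here we just track which IPs would be advertised.
class FakeNBDriver:
    def __init__(self):
        self.exposed_ips = set()

    def expose_ip(self, ip):
        # Real driver: add IP rules/routes + OVS flows, then have FRR
        # advertise the IP via BGP.
        self.exposed_ips.add(ip)

    def withdraw_ip(self, ip):
        # Real driver: remove the configuration added by expose_ip.
        self.exposed_ips.discard(ip)

    def expose_fip(self, fip):
        # The FIP variant behaves like expose_ip, but for floating IPs.
        self.expose_ip(fip)

    def withdraw_fip(self, fip):
        self.withdraw_ip(fip)


driver = FakeNBDriver()
driver.expose_ip("172.16.0.10")
driver.expose_fip("172.16.0.20")
driver.withdraw_ip("172.16.0.10")
print(sorted(driver.exposed_ips))  # ['172.16.0.20']
```

Each ``withdraw_*`` call is the exact inverse of its ``expose_*``
counterpart, which is the property this toy model captures.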

.. include:: ../agent_deployment.rst


Limitations
-----------

The following limitations apply:

- OVN 23.09 or later is needed to support exposing tenant network IPs and
  OVN load balancers.

- There is no API to decide what to expose; all VMs/LBs on provider
  networks or with floating IPs associated with them are exposed. For VMs
  on tenant networks, use the ``address_scopes`` flag to filter which
  subnets to expose, which also prevents having overlapping IPs.

- The currently implemented exposing methods (``underlay`` and ``ovn``)
  do not support overlapping CIDRs, so this must be avoided, e.g., by
  using address scopes and subnet pools.
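The overlap restriction can be checked mechanically; a small sketch using
Python's standard ``ipaddress`` module (not part of the agent) shows how
two CIDRs could be tested for conflict before exposing them:

```python
import ipaddress

def cidrs_overlap(cidr_a: str, cidr_b: str) -> bool:
    """Return True if the two CIDRs share any addresses."""
    a = ipaddress.ip_network(cidr_a)
    b = ipaddress.ip_network(cidr_b)
    # Networks of different IP versions can never overlap.
    if a.version != b.version:
        return False
    return a.overlaps(b)

# Two subnets carved from the same pool do not overlap...
print(cidrs_overlap("10.0.0.0/24", "10.0.1.0/24"))   # False
# ...but a subnet always overlaps with its parent range.
print(cidrs_overlap("10.0.0.0/16", "10.0.1.0/24"))   # True
```

Neutron subnet pools enforce exactly this kind of non-overlap guarantee,
which is why they are the suggested workaround.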

- With the default exposing method (``underlay``) the network traffic is
  steered by kernel routing (ip routes and rules), so OVS-DPDK, where the
  kernel space is skipped, is not supported. With the ``ovn`` exposing
  method the routing is done at the OVN level, so this limitation does
  not exist. More details in :ref:`ovn_routing`.

- Likewise, with the default exposing method (``underlay``) SR-IOV, where
  the hypervisor is skipped, is not supported. With the ``ovn`` exposing
  method the routing is done at the OVN level, so this limitation does
  not exist. More details in :ref:`ovn_routing`.

- In OpenStack with OVN networking, the N/S traffic to the ovn-octavia
  VIPs on the provider networks, or to the FIPs associated with VIPs on
  tenant networks, needs to go through the networking nodes (the ones
  hosting the Neutron Router Gateway Ports, i.e., the chassisredirect
  cr-lrp ports, of the router connecting the load balancer members to the
  provider network). Therefore, the entry point into the OVN overlay
  needs to be one of those networking nodes, and consequently the VIPs
  (or FIPs to VIPs) are exposed through them. From those nodes the
  traffic follows the normal tunneled path (Geneve tunnel) to the
  OpenStack compute node where the selected member is located.
.. _ovn_routing:

===================================================================
[NB DB] NB OVN BGP Agent: Design of the BGP Driver with OVN routing
===================================================================

This is an extension of the NB OVN BGP driver which adds a new
``exposing_method``, named ``ovn``, to make use of OVN routing instead of
relying on kernel routing.

Purpose
-------

The addition of a BGP driver enables the OVN BGP agent to expose virtual
machine (VM) and load balancer (LB) IP addresses through the BGP dynamic
protocol when these IP addresses are either associated with a floating IP
(FIP) or belong to workloads booted or created on a provider network.
The same functionality is available on project networks, when a special
flag is set.

This document presents the design decisions behind the extensions to the
NB OVN BGP driver to support OVN routing instead of kernel routing, and
therefore enable datapath acceleration.

Overview
--------

The main goal is to make the BGP capabilities of the OVN BGP agent
compatible with OVS-DPDK and HWOL. To do that, what the OVN BGP agent
currently does with kernel networking -- redirecting traffic to/from the
OpenStack OVN overlay -- must be moved into OVN/OVS.

To accomplish this goal, the following is required:

- Ensure that incoming traffic gets redirected from the physical NICs to
  the OVS integration bridge (``br-int``) through one or more OVS
  provider bridges (``br-ex``) without using kernel routes and rules.

- Ensure that outgoing traffic gets redirected to the physical NICs
  without using the default kernel routes.

- Expose the IPs in the same way as before.

The third point is simple, as it is already being done, but for the first
two points OVN virtual routing capabilities are needed, ensuring the
traffic gets routed from the NICs to the OpenStack overlay and vice
versa.

Proposed Solution
-----------------

To avoid placing kernel networking in the middle of the datapath and
blocking acceleration, the proposed solution mandates locating a separate
OVN cluster on each node that manages the needed virtual infrastructure
between the OpenStack networking overlay and the physical network.
Because routing occurs at the OVN/OVS level, this proposal makes it
possible to support hardware offloading (HWOL) and OVS-DPDK.

The next figure shows the proposed cluster required to manage the OVN
virtual networking infrastructure on each node.

.. image:: ../../../images/ovn-cluster-overview.png
   :alt: OVN Routing integration
   :align: center
   :width: 100%

In a standard deployment ``br-int`` is directly connected to the OVS
external bridge (``br-ex``) where the physical NICs are attached.
By contrast, in the default BGP driver solution (see
:ref:`nb_bgp_driver`), the physical NICs are not directly attached to
``br-ex``; kernel networking (ip routes and ip rules) redirects the
traffic to ``br-ex``.
The OVN routing architecture proposes the following mapping:

- ``br-int`` connects to an external (from the OpenStack perspective) OVS
  bridge (``br-osp``).

- ``br-osp`` does not have any physical resources attached, just patch
  ports connecting it to ``br-int`` and ``br-bgp``.

- ``br-bgp`` is the integration bridge managed by the extra OVN cluster
  deployed per node. This is where the virtual OVN resources are created
  (routers and switches). It has mappings to ``br-osp`` and ``br-ex``
  (patch ports).

- ``br-ex`` keeps being the external bridge, where the physical NICs are
  attached (as in default environments without BGP). But instead of being
  directly connected to ``br-int``, it is connected to ``br-bgp``. Note
  that for ECMP purposes, each NIC is attached to a different ``br-ex``
  device (``br-ex`` and ``br-ex-2``).

The virtual OVN resources require the following:

- Logical Router (``bgp-router``): manages the routing that was
  previously done in the kernel networking layer between both networks
  (physical and OpenStack OVN overlay). It has two connections (i.e.,
  Logical Router Ports) towards the ``bgp-ex-X`` Logical Switches to add
  support for ECMP (only one switch is required, but there must be
  several in case of ECMP), and one connection to the ``bgp-osp`` Logical
  Switch to carry the traffic to/from the OpenStack networking overlay.

- Logical Switch (``bgp-ex``): is connected to ``bgp-router``, and has a
  localnet port to connect it to ``br-ex`` and therefore the physical
  NICs. There is one Logical Switch per NIC (``bgp-ex`` and
  ``bgp-ex-2``).

- Logical Switch (``bgp-osp``): is connected to ``bgp-router``, and has a
  localnet port to connect it to ``br-osp``, enabling it to send traffic
  to and from the OpenStack OVN overlay.
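Creating these resources amounts to a handful of ``ovn-nbctl`` calls. The
sketch below only *builds* the command strings rather than running them
(the helper name is hypothetical, and the exact port options the agent
sets may differ), to show the shape of that setup:

```python
# Sketch: build (but do not run) the ovn-nbctl commands that would create
# the per-node router and switches described above. Names follow the
# text; option details are illustrative.
def build_ovn_setup_cmds(ext_switches=("bgp-ex", "bgp-ex-2")):
    cmds = ["ovn-nbctl lr-add bgp-router"]
    for ls in (*ext_switches, "bgp-osp"):
        cmds.append(f"ovn-nbctl ls-add {ls}")
        # localnet port wiring each Logical Switch to its OVS bridge.
        cmds.append(
            f"ovn-nbctl lsp-add {ls} {ls}-localnet"
            f" -- lsp-set-type {ls}-localnet localnet"
            f" -- lsp-set-addresses {ls}-localnet unknown"
        )
    return cmds

cmds = build_ovn_setup_cmds()
print(len(cmds))  # 1 router + 2 commands per switch (3 switches) = 7
```

The Logical Router Ports and the routes/policies between router and
switches would be added on top of this skeleton.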

The following OVS flows are required on both OVS bridges:

- ``br-ex-X`` bridges: require a flow to ensure that only the traffic
  targeted at OpenStack provider networks is redirected to the OVN
  cluster.

  .. code-block:: ini

     cookie=0x3e7, duration=942003.114s, table=0, n_packets=1825, n_bytes=178850, priority=1000,ip,in_port=eth1,nw_dst=172.16.0.0/16 actions=mod_dl_dst:52:54:00:30:93:ea,output:"patch-bgp-ex-lo"

- ``br-osp`` bridge: requires a flow for each OpenStack provider network
  to replace the destination MAC with the one on the router port in the
  OVN cluster, so that traffic routed to the OVN cluster is properly
  managed.

  .. code-block:: ini

     cookie=0x3e7, duration=942011.971s, table=0, n_packets=8644, n_bytes=767152, priority=1000,ip,in_port="patch-provnet-0" actions=mod_dl_dst:40:44:00:00:00:06,NORMAL
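To make the structure of these entries explicit, a small parser (purely
illustrative, not part of the agent; real ``ovs-ofctl`` output parsing is
richer) can split such a flow into its match fields and actions:

```python
# Sketch: split a simplified OVS flow description into match fields and
# actions, to highlight the priority/match/actions structure used above.
def parse_flow(flow: str):
    match_part, actions = flow.split(" actions=")
    fields = {}
    for item in match_part.split(","):
        key, _, value = item.partition("=")
        fields[key] = value or True  # flag-style matches such as "ip"
    return fields, actions.split(",")

flow = ("priority=1000,ip,in_port=eth1,nw_dst=172.16.0.0/16 "
        "actions=mod_dl_dst:52:54:00:30:93:ea,output:p1")
fields, actions = parse_flow(flow)
print(fields["nw_dst"])   # 172.16.0.0/16
print(actions[0])         # mod_dl_dst:52:54:00:30:93:ea
```

Note how the ``nw_dst`` match implements the "only provider network
traffic" filter, and ``mod_dl_dst`` performs the MAC rewrite described
above.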


OVN NB DB Events
~~~~~~~~~~~~~~~~

The OVN northbound database events that the driver monitors are the same
as the ones for the NB DB driver with the ``underlay`` exposing mode.
See :ref:`nb_bgp_driver`. The main difference between the two drivers is
that the wiring actions are simplified for the OVN routing driver.
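The agent reacts to these NB DB changes through watcher events that pair
a match predicate with an action. A toy sketch (class and field names are
hypothetical; the real events subclass the agent's own base event class)
of that ``match_fn``/``run`` pattern:

```python
# Toy sketch of an OVSDB-watcher-style event: match_fn decides whether a
# row change is relevant; run performs the expose action. Hypothetical
# names -- not the agent's actual event classes.
class FakeLSPEvent:
    def __init__(self, exposed):
        self.exposed = exposed  # stand-in for the driver's expose_ip

    def match_fn(self, row):
        # Only react to ports that are up and carry addresses.
        return bool(row.get("up")) and bool(row.get("addresses"))

    def run(self, row):
        # OVN LSP addresses look like "MAC IP1 IP2 ..."; expose the IPs.
        for addr in row["addresses"].split()[1:]:
            self.exposed.append(addr)

exposed = []
event = FakeLSPEvent(exposed)
row = {"up": True, "addresses": "fa:16:3e:00:00:01 172.16.0.5"}
if event.match_fn(row):
    event.run(row)
print(exposed)  # ['172.16.0.5']
```

For the ``ovn`` exposing mode only the ``run`` side changes: the wiring
it performs targets the local OVN cluster instead of kernel networking.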


Driver Logic
~~~~~~~~~~~~

As with the other BGP drivers or ``exposing modes`` (:ref:`bgp_driver`,
:ref:`nb_bgp_driver`), the NB DB driver with the ``ovn`` exposing mode
enabled (i.e., relying on ``OVN routing`` instead of ``kernel
networking``) is in charge of exposing the IPs with BGP and of the
networking configuration to ensure that VMs and LBs on provider networks
or with FIPs can be reached through BGP (N/S traffic). Similarly, if the
``expose_tenant_networks`` flag is enabled, VMs in tenant networks should
be reachable too -- although instead of directly on the node where they
are created, through one of the network gateway chassis nodes. The same
happens with ``expose_ipv6_gua_tenant_networks``, but only for IPv6 GUA
ranges.
In addition, if the config option ``address_scopes`` is set, only the
tenant networks with a matching address scope will be exposed.
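The address-scope filtering reduces to a set-membership test. A hedged
sketch (the helper and the subnet dictionaries are made up; the agent's
actual option handling is richer):

```python
# Sketch: keep only subnets whose address scope is in the configured
# set. "address_scopes" mirrors the config option name.
def filter_by_address_scope(subnets, address_scopes):
    if not address_scopes:  # option unset: expose everything
        return list(subnets)
    return [s for s in subnets
            if s.get("address_scope") in address_scopes]

subnets = [
    {"cidr": "10.0.0.0/24", "address_scope": "bgp"},
    {"cidr": "10.0.1.0/24", "address_scope": "private"},
]
print(filter_by_address_scope(subnets, {"bgp"}))
# [{'cidr': '10.0.0.0/24', 'address_scope': 'bgp'}]
```

Using one dedicated address scope for everything meant to be exposed
keeps the exposed set explicit and non-overlapping.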

To accomplish this, the driver needs to configure the extra per-node OVN
cluster to ensure that:

- VM and LB IPs can be advertised from a node where the traffic can be
  injected into the OVN overlay through the extra OVN cluster (instead of
  kernel routing) -- either the node hosting the VM or the node where the
  router gateway port is scheduled.

- Once the traffic reaches the specific node, it is redirected to the OVN
  overlay by the extra per-node OVN cluster with the proper OVN
  configuration. To do this, the driver needs to create Logical Switches,
  Logical Routers and the routing configuration between them (routes and
  policies).

.. include:: ../bgp_advertising.rst

Traffic Redirection to/from OVN
+++++++++++++++++++++++++++++++

As explained before, the main idea of this exposing mode is to leverage
OVN routing instead of kernel routing. For the outgoing traffic the steps
are as follows:

- If the (OpenStack) OVN cluster knows the destination MAC, everything
  works as in a deployment without BGP or OVN cluster support (no ARP
  needed, the MAC is used directly). If the MAC is unknown but within the
  provider network range(s), the ARP request is replied to by the Logical
  Switch Port on the ``bgp-osp`` LS, thanks to ``arp_proxy`` being
  enabled on it. And if it is in a different range, the router replies
  because it has default routes to the outside.
  The flow at ``br-osp`` is in charge of replacing the destination MAC
  with the one on the Logical Router Port of the ``bgp-router`` LR.

- The previous step takes the traffic to the extra per-node OVN cluster,
  where the default (ECMP) routes are used to send the traffic to the
  external Logical Switch, and from there to the physical NICs attached
  to the external OVS bridge(s) (``br-ex``, ``br-ex-2``). For MACs known
  by OpenStack, instead of the default routes, a Logical Router Policy is
  applied so that traffic coming in through the internal LRP (the one
  connected to OpenStack) is forced out through the LRPs connected to the
  external LS.

And for the traffic coming in:

- The traffic hits the OVS flow added at the ``br-ex-X`` bridge(s), which
  redirects it to the per-node OVN cluster, replacing the destination MAC
  with the one of the related ``br-ex`` device -- the same MACs used for
  the OVN cluster's Logical Router Ports. This takes the traffic to the
  OVN router.

- After that, thanks to ``arp_proxy`` being enabled on the LSP on
  ``bgp-osp``, the traffic is redirected there. Due to a limitation in
  the ``arp_proxy`` functionality, an extra static MAC binding entry must
  be added to the cluster so that the VM MAC is used as the destination
  instead of the LSP's own MAC, which would lead to dropping the traffic
  in the LS pipeline.

  .. code-block:: ini

     _uuid               : 6e1626b3-832c-4ee6-9311-69ebc15cb14d
     ip                  : "172.16.201.219"
     logical_port        : bgp-router-openstack
     mac                 : "fa:16:3e:82:ee:19"
     override_dynamic_mac: true


Driver API
++++++++++

This is the very same as in the NB DB driver with the ``underlay``
exposing mode. See :ref:`nb_bgp_driver`.


Agent deployment
~~~~~~~~~~~~~~~~

The deployment is similar to the NB DB driver with the ``underlay``
exposing method, but with some extra configuration. See
:ref:`nb_bgp_driver` for the base.

The exposing method must be stated in the ``DEFAULT`` section, together
with the extra configuration for the local OVN cluster that performs the
routing, including the range of the provider networks to expose/handle:

.. code-block:: ini

   [DEFAULT]
   exposing_method=ovn

   [local_ovn_cluster]
   ovn_nb_connection=unix:/run/ovn/ovnnb_db.sock
   ovn_sb_connection=unix:/run/ovn/ovnsb_db.sock
   external_nics=eth1,eth2
   peer_ips=100.64.1.5,100.65.1.5
   provider_networks_pool_prefixes=172.16.0.0/16
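Options like ``external_nics`` and ``peer_ips`` are comma-separated
lists. A quick sketch with Python's standard ``configparser`` (only
illustrative -- the agent itself consumes its configuration through
oslo.config) shows how such a file is read and split:

```python
import configparser

# Illustrative only: parse a fragment of the sample config above and
# split the comma-separated list options.
SAMPLE = """
[DEFAULT]
exposing_method=ovn

[local_ovn_cluster]
external_nics=eth1,eth2
peer_ips=100.64.1.5,100.65.1.5
provider_networks_pool_prefixes=172.16.0.0/16
"""

cfg = configparser.ConfigParser()
cfg.read_string(SAMPLE)
nics = cfg["local_ovn_cluster"]["external_nics"].split(",")
print(cfg["DEFAULT"]["exposing_method"])  # ovn
print(nics)                               # ['eth1', 'eth2']
```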


Limitations
-----------

The following limitations apply:

- OVN 23.06 or later is needed.

- Tenant networks, subnets and OVN load balancers are not yet supported,
  and will require OVN 23.09 or later.

- IPv6 is not yet supported.

- ECMP does not work properly, as there is no support for BFD at the OVN
  cluster, which means that if one of the routes goes away the OVN
  cluster won't react to it and there will be traffic disruption.

- There is no support for overlapping CIDRs, so this must be avoided,
  e.g., by using address scopes and subnet pools.

@ -5,7 +5,7 @@

 .. toctree::
    :maxdepth: 2

-   bgp_mode_design
-   evpn_mode_design
-   bgp_mode_stretched_l2_design
-   bgp_supportability_matrix
+   drivers/index
+   agent_deployment
+   bgp_advertising
+   bgp_traffic_redirection
@ -10,10 +10,11 @@ Welcome to the documentation of OVN BGP Agent
|
||||||
Contents:
|
Contents:
|
||||||
|
|
||||||
.. toctree::
|
.. toctree::
|
||||||
:maxdepth: 2
|
:maxdepth: 3
|
||||||
|
|
||||||
readme
|
readme
|
||||||
contributor/index
|
contributor/index
|
||||||
|
bgp_supportability_matrix
|
||||||
|
|
||||||
Indices and tables
|
Indices and tables
|
||||||
==================
|
==================
|
||||||
|
|