===========================================
Cross Neutron VxLAN Networking in Tricircle
===========================================

Background
==========

Currently we only support VLAN as the cross-Neutron network type. For the
VLAN network type, the central plugin in Tricircle picks a physical network
and allocates a VLAN tag (or uses the one users specify), then before the
creation of the local network, the local plugin queries this provider network
information and creates the network based on it. Tricircle only guarantees
that instance packets sent out of hosts in different pods belonging to the
same VLAN network are tagged with the same VLAN ID. Deployers need to
carefully configure physical networks and switch ports to make sure that
packets can be transported correctly between physical devices.

For more flexible deployment, the VxLAN network type is a better choice.
Compared to the 12-bit VLAN ID, the 24-bit VxLAN ID can support a much larger
number of bridge networks and cross-Neutron L2 networks. With the MAC-in-UDP
encapsulation of VxLAN, hosts in different pods only need to be IP routable
to transport instance packets.

Proposal
========

There are some challenges in supporting cross-Neutron VxLAN networks.

1. How to keep the VxLAN ID identical for the same VxLAN network across Neutron servers

2. How to synchronize tunnel endpoint information between pods

3. How to trigger L2 agents to build tunnels based on this information

4. How to support different back-ends, like ODL or L2 gateway

The first challenge can be solved in the same way as for VLAN networks: we
allocate the VxLAN ID in the central plugin, and the local plugin uses the
same VxLAN ID to create the local network. For the second challenge, we
introduce a new table called "shadow_agents" in the Tricircle database, so
the central plugin can save the tunnel endpoint information collected from
one local Neutron server in this table and use it to populate the information
to other local Neutron servers when needed. Here is the schema of the table:

.. csv-table:: Shadow Agent Table
  :header: Field, Type, Nullable, Key, Default

  id, string, no, primary, null
  pod_id, string, no, , null
  host, string, no, unique, null
  type, string, no, unique, null
  tunnel_ip, string, no, , null
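
For illustration, the table could be modeled with SQLAlchemy roughly as
below. This is only a sketch, not the final model: the field lengths are
assumptions, and the two "unique" marks on host and type are read as one
composite unique constraint::

  import sqlalchemy as sa
  from sqlalchemy.ext import declarative

  Base = declarative.declarative_base()


  class ShadowAgent(Base):
      """Tunnel endpoint information mirrored from another pod."""
      __tablename__ = 'shadow_agents'
      __table_args__ = (sa.UniqueConstraint('host', 'type'),)

      id = sa.Column(sa.String(36), primary_key=True)
      pod_id = sa.Column(sa.String(36), nullable=False)
      host = sa.Column(sa.String(255), nullable=False)
      type = sa.Column(sa.String(36), nullable=False)
      tunnel_ip = sa.Column(sa.String(48), nullable=False)
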

**How to collect tunnel endpoint information**

When the host where a port will be located is determined, the local Neutron
server receives a port-update request containing the host ID in the body.
During the processing of this request, the local plugin can query the agent
information that contains the tunnel endpoint information from the local
Neutron database, using the host ID and the port VIF type, then send the
tunnel endpoint information to the central Neutron server by issuing a
port-update request with this information in the binding profile.
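
The idea is sketched below. This is not the actual local plugin code: the
binding profile keys are illustrative assumptions, the mapping from the port
VIF type to an agent type is glossed over, and central_client stands for a
neutronclient Client pointing at the central Neutron server::

  def report_tunnel_endpoint(central_client, port_id, host, local_agents,
                             agent_type='Open vSwitch agent'):
      """Forward the binding host's tunnel endpoint to central Neutron.

      local_agents is the list of agent dicts for the binding host, as
      the Neutron agent API returns them.
      """
      for agent in local_agents:
          if agent['host'] != host or agent['agent_type'] != agent_type:
              continue
          tunnel_ip = agent['configurations'].get('tunneling_ip')
          # Report host, agent type and tunnel endpoint to the central
          # Neutron server inside the port's binding profile.
          central_client.update_port(port_id, {'port': {
              'binding:profile': {'host': host,
                                  'type': agent_type,
                                  'tunnel_ip': tunnel_ip}}})
          return
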

**How to populate tunnel endpoint information**

When the tunnel endpoint information in one pod needs to be populated to
other pods, XJob will issue port-create requests to the corresponding local
Neutron servers with the tunnel endpoint information queried from the
Tricircle database in the bodies. After receiving such a request, the local
Neutron server saves the tunnel endpoint information by calling the real core
plugin's "create_or_update_agent" method. This method comes from the
neutron.db.agent_db.AgentDbMixin class, so plugins that support the "agent"
extension have this method. There is actually no such agent daemon running in
the target local Neutron server, but we insert a record for it in the
database so the local Neutron server will assume such an agent exists. That's
why we call it shadow agent.
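
A condensed sketch of the receiving side is shown below. The agent_state
layout and the profile keys are assumptions, and the exact signature of
create_or_update_agent differs slightly between Neutron releases::

  def ensure_shadow_agent(core_plugin, context, binding_profile):
      """Create or refresh a shadow agent record from a binding profile."""
      agent_state = {
          'agent_type': binding_profile['type'],  # e.g. 'Open vSwitch agent'
          'host': binding_profile['host'],        # host in the remote pod
          'binary': 'shadow_agent',               # no real daemon behind it
          'topic': 'N/A',
          'configurations': {
              'tunneling_ip': binding_profile['tunnel_ip'],
          },
      }
      # The local Neutron server now "sees" an agent on the remote host
      # even though none is running there.
      core_plugin.create_or_update_agent(context, agent_state)
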

The proposed solution for the third challenge is based on the shadow agent
and the L2 population mechanism. In the original Neutron process, when a
port's status is updated to active, the L2 population mechanism driver does
two things. First, the driver checks whether the updated port is the first
port on the target agent; if so, the driver collects the tunnel endpoint
information of the other ports in the same network and sends it to the target
agent via RPC. Second, the driver sends the tunnel endpoint information of
the updated port to the other agents hosting ports in the same network, also
via RPC. L2 agents then build the tunnels based on the information they
receive. To trigger the above processes to build tunnels across Neutron
servers, we further introduce the shadow port.

Let's say we have two instance ports: port1 is located in host1 in pod1 and
port2 is located in host2 in pod2. To make the L2 agent running in host1
build a tunnel to host2, we create a port with the same properties as port2
in pod1. As discussed above, the local Neutron server creates a shadow agent
during the processing of the port-create request, so the local Neutron server
in pod1 won't complain that host2 doesn't exist. To trigger the L2 population
process, we then update the port status to active, so the L2 agent in host1
receives the tunnel endpoint information of port2 and builds the tunnel. Port
status is a read-only property, so we can't directly update it via the
ReSTful API. Instead, we issue a port-update request with a special key in
the binding profile; after the local Neutron server receives such a request,
it pops the special key from the binding profile and updates the port status
to active. The XJob daemon will take the job to create and update shadow
ports.

Here is the flow of shadow agent and shadow port process::

+-------+ +---------+ +---------+
| | | | +---------+ | |
| Local | | Local | | | +----------+ +------+ | Local |
| Nova | | Neutron | | Central | | | | | | Neutron |
| Pod1 | | Pod1 | | Neutron | | Database | | XJob | | Pod2 |
| | | | | | | | | | | |
+---+---+ +---- ----+ +----+----+ +----+-----+ +--+---+ +----+----+
| | | | | |
| update port1 | | | | |
| [host id] | | | | |
+---------------> | | | |
| | update port1 | | | |
| | [agent info] | | | |
| +----------------> | | |
| | | save shadow | | |
| | | agent info | | |
| | +----------------> | |
| | | | | |
| | | trigger shadow | | |
| | | port setup job | | |
| | | for pod1 | | |
| | +---------------------------------> |
| | | | | query ports in |
| | | | | the same network |
| | | | +------------------>
| | | | | |
| | | | | return port2 |
| | | | <------------------+
| | | | query shadow | |
| | | | agent info | |
| | | | for port2 | |
| | | <----------------+ |
| | | | | |
| | | | create shadow | |
| | | | port for port2 | |
| <--------------------------------------------------+ |
| | | | | |
| | create shadow | | | |
| | agent and port | | | |
| +-----+ | | | |
| | | | | | |
| | | | | | |
| <-----+ | | | |
| | | | update shadow | |
| | | | port to active | |
| <--------------------------------------------------+ |
| | | | | |
| | L2 population | | | trigger shadow |
| +-----+ | | | port setup job |
| | | | | | for pod2 |
| | | | | +-----+ |
| <-----+ | | | | |
| | | | | | |
| | | | <-----+ |
| | | | | |
| | | | | |
+ + + + + +
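
The shadow port part of the flow can be sketched as follows. This is
illustrative only: the special key name ("force_up" here) and the copied
fields are assumptions, and local_client stands for a neutronclient Client
pointing at the target local Neutron server::

  def setup_shadow_port(local_client, remote_port, agent_info):
      """Create a shadow port for a port in another pod and activate it."""
      body = {'port': {
          'name': 'shadow_' + remote_port['id'],
          'network_id': remote_port['network_id'],
          'mac_address': remote_port['mac_address'],
          'fixed_ips': remote_port['fixed_ips'],
          'binding:host_id': agent_info['host'],
          # Tunnel endpoint of the remote host, so the receiving side can
          # create the corresponding shadow agent record.
          'binding:profile': {'host': agent_info['host'],
                              'type': agent_info['type'],
                              'tunnel_ip': agent_info['tunnel_ip']}}}
      shadow_port = local_client.create_port(body)['port']
      # Port status is read-only via the API, so activation is requested
      # with a special key that the local plugin pops from the profile.
      local_client.update_port(
          shadow_port['id'],
          {'port': {'binding:profile': {'force_up': True}}})
      return shadow_port
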

Bridge networks can support the VxLAN network type in the same way: we just
create shadow ports for the router interface and the router gateway. In the
above graph, the local Nova server updates the port with the host ID to
trigger the whole process. The L3 agent likewise updates the interface port
and the gateway port with the host ID, so a similar process is triggered to
create shadow ports for the router interface and the router gateway.

Currently the Neutron team is working on push notification [1]_: the Neutron
server will push resource data to agents, and agents cache this data and use
it to do the real job like configuring openvswitch, updating iptables and
configuring dnsmasq, so agents no longer need to retrieve resource data from
the Neutron server via RPC. Based on push notification, if tunnel endpoint
information is stored in the port object later and this information can be
updated via the ReSTful API, we can simplify the solution for challenges 3
and 4: we just need to create shadow ports containing tunnel endpoint
information, and this information will be pushed to agents, which use it to
create the necessary tunnels and flows.

**How to support different back-ends besides the ML2+OVS implementation**

We consider two typical back-ends that can support cross-Neutron VxLAN
networking: L2 gateway, and an SDN controller like ODL. For L2 gateway, we
consider only supporting static tunnel endpoint information as the first
step. The shadow agent and shadow port process is almost the same as in the
ML2+OVS implementation; the difference is that, for L2 gateway, the tunnel IP
of the shadow agent is set to the tunnel endpoint of the L2 gateway, so after
L2 population, L2 agents will create tunnels to the tunnel endpoint of the L2
gateway. For an SDN controller, we assume that the SDN controller has the
ability to manage tunnel endpoint information across Neutron servers, so
Tricircle only helps to allocate the VxLAN ID and keep it identical across
Neutron servers for one network; the shadow agent and shadow port process
will not be used in this case. However, if different SDN controllers are used
in different pods, it will be hard for each SDN controller to connect hosts
managed by other SDN controllers, since each SDN controller has its own
mechanism. This problem is discussed in this page [2]_. One possible solution
under Tricircle is the same as what L2 gateway does: we create shadow ports
that contain the L2 gateway tunnel endpoint information so the SDN controller
can build tunnels in its own way, and we then configure the L2 gateway in
each pod to forward the packets between L2 gateways. The L2 gateways
discussed here are mostly hardware based and can be controlled by the SDN
controller; the SDN controller uses an ML2 mechanism driver to receive the L2
network context and further control the L2 gateways for the network.

To distinguish the different back-ends, we will add a new configuration
option cross_pod_vxlan_mode whose valid values are "p2p", "l2gw" and "noop".
Mode "p2p" works for the ML2+OVS scenario; in this mode, shadow ports and
shadow agents containing host tunnel endpoint information are created. Mode
"l2gw" works for the L2 gateway scenario; in this mode, shadow ports and
shadow agents containing L2 gateway tunnel endpoint information are created.
For the SDN controller scenario, as discussed above, if the SDN controller
can manage tunnel endpoint information by itself, we only need the "noop"
mode, meaning that neither shadow ports nor shadow agents will be created;
if the SDN controller can manage a hardware L2 gateway, we can use the "l2gw"
mode.
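
For example, the option could be registered with oslo.config roughly as
below; the option group and the default value shown here are assumptions::

  from oslo_config import cfg

  tricircle_opts = [
      cfg.StrOpt('cross_pod_vxlan_mode',
                 default='p2p',
                 choices=['p2p', 'l2gw', 'noop'],
                 help='How to build VxLAN tunnels across pods: "p2p" builds '
                      'host-to-host tunnels via shadow agents and shadow '
                      'ports, "l2gw" points tunnels at an L2 gateway '
                      'endpoint, and "noop" leaves tunnel management to the '
                      'back-end, e.g. an SDN controller.'),
  ]

  cfg.CONF.register_opts(tricircle_opts, group='tricircle')
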

Data Model Impact
=================

New table "shadow_agents" is added.

Dependencies
============

None

Documentation Impact
====================

- Update configuration guide to introduce options for VxLAN network
- Update networking guide to discuss new scenarios with VxLAN network
- Add release note about cross-Neutron VxLAN networking support

References
==========

.. [1] https://blueprints.launchpad.net/neutron/+spec/push-notifications
.. [2] http://etherealmind.com/help-wanted-stitching-a-federated-sdn-on-openstack-with-evpn/