Initial VPP ML2 network driver version

This is the first OpenStack commit for a ML2 driver for VPP.  This
contains the driver and a VPP agent that uses the VPP API (via the
python API package) to control VPP, and the driver and agent
communicate with one another via an etcd instance.

This version is (loosely, subject to some tidying and getting it
past the Gerrit tests) based on

Change-Id: I8909ddc56b5c49ec2ee1b25acbff21716eee70f6
Co-Authored-By: Ian Wells <>
Co-Authored-By: Naveen Joy <>
Co-Authored-By: Feng Pan <>
Ian Wells 6 years ago
parent f69d3d16c4
commit b78c4a3354

.gitignore vendored

@ -0,0 +1,6 @@

@ -0,0 +1,3 @@
Feng Pan <>
Ian Wells <>
Naveen Joy <>

@ -0,0 +1,153 @@
CentOS 7 Setup Guide
This document describes the steps to set up a CentOS 7 single-host devstack
environment using networking-vpp.
Host Setup
#. Configure hugepage and iommu support:
``default_hugepagesz=2M hugepagesz=2M hugepages=2048 iommu=pt
VPP build and install
#. pull VPP source from git::
git clone
#. Build and install VPP::
cd vpp
make install-dep
make build-release
make pkg-rpm
#. Install VPP rpms. The rpms are located in the vpp/build-root directory
after the build is complete::
sudo yum install build-root/vpp*.rpm
#. Build and install VPP-PAPI. VPP-PAPI is VPP's python API used by
* Install python-devel package if it is not installed already::
sudo yum install -y python-devel
* Build and install::
make -Cbuild-root PLATFORM=vpp TAG=vpp_debug vpp-api-install
cd vpp-api/python
sudo python setup.py install
#. Configuring VPP
It may be desirable to change vpp cli's listening port to something other
than the default 5000, as it is used by keystone. This can be done by
adding line ``cli-listen localhost:5002`` in ``unix`` section of VPP
config file ``/etc/vpp/startup.conf``.
It is necessary to load pmd kernel module of choice (vfio-pci, igb_uio,
etc). igb_uio module can be found in dpdk build directory:
#. Starting VPP
VPP can be started by starting VPP service::
systemctl start vpp
To verify VPP has started correctly::
vppctl show interface
You should see your physical NIC listed in the interface list, in this
case GigabitEthernet2/5/0::
Name Idx State Counter Count
GigabitEthernet2/5/0 5 down
local0 0 down
pg/stream-0 1 down
pg/stream-1 2 down
pg/stream-2 3 down
pg/stream-3 4 down
More detailed instructions on building and installing VPP can be found at:,_install,_and_test_images#Build_A_VPP_Package
Upgrade qemu-kvm
#. Enable the CentOS EV repo
``yum install centos-release-qemu-ev``
#. Update packages. This will pick up new qemu packages from the EV repo.
``yum update``
#. Remove the qemu-system-x86 package if it's installed; this prevents
libvirt from identifying the QEMU version as 2.0
``yum remove qemu-system-x86``
Build and install qemu
If you would like to use qemu rather than qemu-kvm, you can build and
install qemu with the following steps:
tar xvf qemu-2.3.1.tar.bz2
cd qemu-2.3.1
sudo yum install gtk2-devel
./configure --enable-numa
sudo make install
Devstack Setup
General direction on how to download and set up devstack can be found at
Add the following to local.conf::
disable_service n-net q-agt
disable_service cinder c-sch c-api c-vol
disable_service tempest
enable_plugin networking-vpp
Note that ``VLAN_TRUNK_IF`` should be set to the interface name in VPP that you
want to use as your trunk interface.
VM creation
Note that hugepage support is required on guest VMs for vhostuser port
attachment. This can be done by creating a new flavor and booting the VM with
that flavor::
nova flavor-create m1.tiny.hugepage auto 512 0 1
nova flavor-key m1.tiny.hugepage set hw:mem_page_size=2048
nova boot --image cirros-0.3.4-x86_64-uec --flavor m1.tiny.hugepage --nic net-name=private myvm
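As a rough sizing aid (a sketch, not part of networking-vpp), the hugepage pool configured under Host Setup bounds how many such guests a host can run. The helper name ``max_guests`` and the ``reserved_mib`` allowance are hypothetical; the figures come from this guide (2048 pages of 2M each, a 512 MB flavor). ``hw:mem_page_size=2048`` is taken here to mean 2048 KiB, that is, 2M pages.

```python
def max_guests(hugepages, page_size_mib, flavor_ram_mib, reserved_mib=0):
    """Return how many guests of one flavor fit in the hugepage pool.

    reserved_mib is an assumed allowance for hugepages taken by other
    consumers of the pool (for example VPP itself); it is not a value
    taken from this guide.
    """
    pool_mib = hugepages * page_size_mib
    usable_mib = max(pool_mib - reserved_mib, 0)
    return usable_mib // flavor_ram_mib


# hugepages=2048 of 2M each gives a 4096 MiB pool; m1.tiny.hugepage uses 512 MiB.
print(max_guests(2048, 2, 512))        # 8
# With a hypothetical 1024 MiB set aside for other hugepage users:
print(max_guests(2048, 2, 512, 1024))  # 6
```

Because the guest memory is pinned, there is no overcommit to fall back on once the pool is exhausted.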

@ -0,0 +1,74 @@
* Tox fixes
* Remove outdated line from example config
* Use a journal table
* do not create vpp subintf if exists
* etcd support, part 1
* Added etcd comms channel
* Major updates
* More PEP8
* Fix doc file in setup.cfg
* Major updates
* tox cleanups (mainly long lines)
* typo
* Break notifcation of Nova into own function
* Correct doc file name
* update delete_network_on_host
* update network_on_host
* update vpp get_interface
* add tests
* added tests
* delete bridge on network delete
* create bridge-pool
* delete vpp bridge_domain on network delete
* support arbitrary physical network names for flat n/w binding
* create network
* create network msg
* prevent adding an existing bridge interface
* remove agent auto-cleanup
* devstack l3_agent prevent ovs cleanup fix
* vif_type fix
* dhcp fixes
* vpp agent logging
* DHCP fixes
* updated agent filtering for messaging
* updated agent messaging to include failure reporting
* remove local settings from devstack/settings
* comment
* delete port_postcommit
* add flat networking
* Change qemu user/group to be config options. Also added default values for those for ubuntu and redhat
* fix bind logger format string error
* fix unbind logger debug
* add host info to unbind data
* delete port url update to remove host
* delete port_postcommit with debug
* delete port_postcommit
* add debugs to
* Add support for defining settings in local.conf. We now first check if a setting is defined
* Add centos 7 guide doc
* Convert README to rst
* send bind call without queuing
* add hostname to ip lookup
* unicast message
* update agent list in devstack settings
* update agent to listen on
* add mech_vpp debugs
* agent list split on comma
* updated agent list in devstack settings
* More fixes on comments, debug, deleting
* changed socket group to libvirtd
* added missing unbind call to agent in mech_vpp
* added debug messages for bind and unbind requests
* updated devstack/settings vlan trunk interface
* fix bug attribute error
* test
* fixed return value error
* fixed bug for checking return value
* Added README with details of what work has been done to date
* updated vpp api interface
* First pass at agent, diags in driver
* Initial commit: mechdriver that talks to an as-yet missing agent
* Added .gitreview

@ -0,0 +1,175 @@
Apache License
Version 2.0, January 2004
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
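Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the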
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.

@ -0,0 +1,202 @@
ML2 Mechanism driver and small control plane for OpenVPP forwarder
This is a Neutron mechanism driver to bring the advantages of OpenVPP to
OpenStack deployments.
It's been written to be as simple and readable as possible while offering
either full Neutron functionality or a simple roadmap to it.
While the driver is not perfect, we're aiming for
- robustness in the face of failures (of one of several Neutron servers, of
agents, of the etcd nodes in the cluster gluing them together)
- simplicity
- testability - having failure cases covered is no good if you don't have
a means to test the code that protects you from them
As a general rule, everything is implemented in the simplest way,
for two reasons: one is that we get to see it working faster, and
the other is that it's much easier to replace a simple system with
a more complex one than it is to change a complex one. The current
design will change, but the one that's there at the moment is small
and easy to read, even if it makes you pull faces when you read it.
Your questions answered
How do I use it?
There's a devstack plugin. You can add this plugin to your local.conf and see it working.
You'll want to get VPP and the VPP Python bindings set up on the host before you do that.
I haven't written up the instructions yet but they're coming.
To get the best performance, this will use vhostuser sockets to talk to VMs, which means you
need a modern version of your OS (CentOS 7 and Ubuntu 16.04 look good). It also means that
you need to run your VMs with a special flavor that enables shared memory - basically, you
need to set up hugepages for your VMs, as that's the only supported way Nova does this
today. Because you're using pinned shared memory you are going to find you can't
overcommit memory on the target machine.
I've tested this on Ubuntu 16.04. Others have tried it with CentOS
7. You may need to disable libvirt security for qemu, as libvirt
doesn't play well with vhostuser sockets in its default setup.
CentOS testing is ongoing. An initial CentOS 7 guide can be found
at `<CENTOS_7-guide.rst>`_
What overlays does it support?
Today, it supports VLANs and flat networks.
How does it work?
VPP has physical interfaces nominated as physical networks. In
common with other Neutron drivers, each physical network can be
used as a flat or VLAN network, for either fully virtual tenant
networks or for provider networks. When a VLAN-overlay network is
needed on a host, we create a subinterface with the selected VLAN
(the typedriver chooses the VLAN, we don't do anything clever about
that). For all networks, the agent makes a new bridge domain in
VPP and puts the subinterface or host interface into it. Binding
a port involves putting the port into the same bridge domain.
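To make that concrete, here is a toy in-memory model of the bookkeeping described above. It makes no VPP calls; the class ``ToyForwarder``, its method names, and the ``interface.vlan`` subinterface naming are illustrative assumptions rather than the agent's real code. The starting bridge ID of 5678 matches the agent's ``next_bridge_id``.

```python
import itertools


class ToyForwarder:
    """In-memory stand-in for the agent's view of VPP (hypothetical)."""

    def __init__(self, physnets, first_bridge_id=5678):
        self.physnets = physnets    # physnet name -> VPP interface name
        self._bridge_ids = itertools.count(first_bridge_id)
        self.networks = {}          # (physnet, net_type, seg_id) -> bridge id
        self.bridges = {}           # bridge id -> set of member interfaces

    def network_on_host(self, physnet, net_type, seg_id=None):
        """Create the bridge domain for a network on first use."""
        key = (physnet, net_type, seg_id)
        if key in self.networks:
            return self.networks[key]
        trunk = self.physnets[physnet]
        if net_type == 'vlan':
            # One subinterface per VLAN on the trunk interface.
            uplink = '%s.%d' % (trunk, seg_id)
        elif net_type == 'flat':
            uplink = trunk
        else:
            raise ValueError('unsupported network type: %s' % net_type)
        bridge_id = next(self._bridge_ids)
        self.bridges[bridge_id] = {uplink}
        self.networks[key] = bridge_id
        return bridge_id

    def bind_port(self, port_name, physnet, net_type, seg_id=None):
        """Put the port into the same bridge domain as its network."""
        bridge_id = self.network_on_host(physnet, net_type, seg_id)
        self.bridges[bridge_id].add(port_name)
        return bridge_id


fwd = ToyForwarder({'physnet1': 'GigabitEthernet2/5/0'})
bd = fwd.bind_port('port-a', 'physnet1', 'vlan', 100)
fwd.bind_port('port-b', 'physnet1', 'vlan', 100)
print(bd, sorted(fwd.bridges[bd]))
# 5678 ['GigabitEthernet2/5/0.100', 'port-a', 'port-b']
```

Two ports on the same VLAN end up in one bridge domain alongside the VLAN subinterface, which is all the forwarding setup a port needs.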
How does it implement binding?
This mechanism driver takes two approaches. It doesn't do anything
at all until Neutron needs to deliver traffic to a compute host, so
the only thing it's really interested in is ports. Making a network
or a subnet doesn't do anything at all, so there are no entry points
for the network and subnet operations. And it mainly interests
itself in the process of binding: the bind calls determine if it
has work to do, and the port postcommit calls make sure that, when
a binding takes, the VPP forwarders in the system get appropriately
programmed to put the traffic where Nova expects it.
There are a number of calls that a mechanism driver can implement. The
precommit calls determine if a create, update or delete is acceptable and
stop it before it hits the database; they can also update additional
database tables. The postcommit calls allow you to act once the
network change is permanently recorded. These two calls, on ports,
are what we use.
In our case, we add a write to a journal table to the DB commit
from within the precommit. This is not committed if the commit
fails for other reasons.
The postcommit calls are where you should trigger actions based on an
update that's now been accepted and saved (you can't back down at
that point) - but the tenant is still waiting for an answer, so
it's wise to be quick. In our case, we kick a background thread
to push the journal log out, in order, to etcd. etcd then contains
the desired state of each host agent, and the agents monitor etcd
for changes relevant to them and update their state.
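The precommit/postcommit split can be sketched roughly as follows. This is a simplified model under stated assumptions, not the driver's code: a ``deque`` stands in for the journal table, a plain ``dict`` stands in for etcd, and the class name ``JournalPusher`` and the key layout are hypothetical.

```python
import threading
from collections import deque


class JournalPusher:
    """Toy model of 'journal in precommit, push in the background'."""

    def __init__(self, store):
        self.store = store            # dict standing in for etcd
        self.journal = deque()        # stands in for the journal table
        self.lock = threading.Lock()
        self.kick = threading.Event()
        self.stopping = False
        self.thread = threading.Thread(target=self._run, daemon=True)
        self.thread.start()

    def precommit(self, key, value):
        """Record the change; a value of None means 'delete this key'."""
        with self.lock:
            self.journal.append((key, value))

    def postcommit(self):
        """Return quickly: just wake the background thread."""
        self.kick.set()

    def _drain(self):
        # Push journal entries out strictly in the order they were written.
        while True:
            with self.lock:
                if not self.journal:
                    return
                key, value = self.journal.popleft()
            if value is None:
                self.store.pop(key, None)
            else:
                self.store[key] = value

    def _run(self):
        while True:
            self.kick.wait()
            self.kick.clear()
            self._drain()
            if self.stopping:
                self._drain()   # pick up anything written just before close()
                return

    def close(self):
        self.stopping = True
        self.kick.set()
        self.thread.join()


etcd = {}
pusher = JournalPusher(etcd)
pusher.precommit('/hosts/host1/ports/port-a', '{"network_type": "vlan"}')
pusher.postcommit()
pusher.close()
print(etcd)
```

The API thread only appends and sets an event, so the tenant's request is not held up by the push to the key-value store.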
To ensure binding is done correctly, we send Nova a notification
only when the agent has definitely created the structures in VPP
necessary for the port to work. This is generally a good idea but
for vhost-user connections it's particularly important as QEMU goes
into a funny state if you start it with vhost-user sockets that
don't connect immediately. This state tends to confuse libvirt and
nova; for that reason, we recommend you make VIF plugging failures
fatal with the relevant Nova config option, so that a VM is never
started with ports that haven't been properly bound and configured.
Additionally, there are some helper calls to determine if this
mechanism driver, in conjunction with the other ones on the system,
needs to do anything. In some cases it may not be responsible for the
port at all.
How does it talk to VPP?
This uses the Python module Ole Troan added to VPP to interface with the
forwarder. VPP has an admin channel, implemented as a couple of shared
memory queues, to exchange control messages with VPP. The Python bindings
are a very thin layer between that shared memory system and a set of Python
Note that VPP runs as root and so the shared memory buffers are protected
and need root credentials to access, so the agent also runs as root. It
rather inelegantly coredumps if it doesn't have root privileges.
What does it support?
For now, assume it moves packets to where they need to go. It also integrates
properly with ML2 L3, DHCP and Metadata functionality.
The main notable absence at this point is security groups.
What are you doing next?
Security groups - this requires some additional functionality in VPP to work,
so we're currently waiting on that to be committed upstream.
We're considering how to add TAP-as-a-Service functionality
to the system so that you can prove, to your own satisfaction, that
the networking is operating correctly and your app is broken :)
What else needs fixing?
There is a long list of items where you can help. If you want a slow
introduction to the code, read it! It's not very big and it has comments and
everything. Among those comments you'll find several TODO comments where we
have opinions about shortcuts that we took that need revisiting; if you want
a go at changing the code, those TODO statements are a really good place to start.
That aside, you could attempt to get VXLAN working, or you could
look at tidying up the VPP API in the OpenVPP codebase, or you could add a
working memory to the VPP agent (perhaps by adding user-data to the VPP API
so that the agent could annotate the ports with its own information).
Firewalling and security groups are a big area where it's lacking.
If you're moving packets around fast and you're using secure components in
your VMs they don't matter so much (and this is quite common in NFV scenarios)
but to make this useful for everything the driver needs to implement basic
anti-spoof firewalling, security groups, and also the allowed-address-pair
and portsecurity extensions so that security can be turned down when the
application needs something different. VPP has ACLs, but the VPP team are
looking at improving that functionality and we're currently waiting for the
next version of the code and a hopefully more convenient API to use.
If you do think of doing work on this, remember that when you change
a security group you might be changing the firewalling on lots of
ports - on lots of servers - all at the same time.
Per above, VPP's comms channel with control planes is privileged, and so is the
channel for making vhost-user connections (you need to know the credentials that
libvirt uses). If it weren't for those two things, the agent wouldn't need any
special system rights and could run as a normal user. This could be fixed (by
getting VPP to drop the privs on the shared memory and by using e.g. a setgid
directory to talk to VPP, respectively).
Why didn't you use the ML2 agent framework for this driver?
Neutron's agent framework is based on communicating via RabbitMQ. This can
lead to issues of scale when there are more than a few compute hosts involved,
and RabbitMQ is not as robust as it could be, plus RabbitMQ is trying to be a
fully reliable messaging system - all of which work against a robust and
scalable SDN control system.
We didn't want to start down that path, so instead we've taken a different
approach, that of a 'desired state' database with change listeners. etcd
stores the data of how the network should be and the agents try to achieve that (and also report
their status back via etcd). One nice feature of this is that anyone can
check how well the system is working - both sorts of update can be watched in
real time with the command
etcdctl watch --recursive --forever /
The driver and agents should deal with disconnections across the
board, and the agents know that they must resync themselves with
the desired state when they completely lose track of what's happening.
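A resync of that kind amounts to comparing the desired state with what the agent has actually programmed and fixing the differences. The sketch below is an illustration only: ``resync``, its arguments, and the port dictionaries are hypothetical names, and plain dicts stand in for etcd and for the agent's local state.

```python
def resync(desired, actual, bind, unbind):
    """Bring `actual` in line with `desired`.

    desired: port id -> properties, as read from the desired-state store
    actual:  port id -> properties the agent has programmed locally
    bind/unbind: callbacks that would program or clean up the forwarder
    """
    # Remove anything we have that is no longer wanted.
    for port in sorted(set(actual) - set(desired)):
        unbind(port)
        del actual[port]
    # Create or correct anything missing or out of date.
    for port, props in sorted(desired.items()):
        if actual.get(port) != props:
            bind(port, props)
            actual[port] = props


calls = []
desired = {'port-a': {'vlan': 100}, 'port-b': {'vlan': 200}}
actual = {'port-b': {'vlan': 300}, 'port-c': {'vlan': 100}}
resync(desired, actual,
       bind=lambda port, props: calls.append(('bind', port)),
       unbind=lambda port: calls.append(('unbind', port)))
print(calls)
# [('unbind', 'port-c'), ('bind', 'port-a'), ('bind', 'port-b')]
print(actual == desired)
# True
```

Running the same function again makes no further calls, which is the property you want from something that may be triggered after any disconnection.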

@ -0,0 +1 @@
[python: **.py]

@ -0,0 +1,144 @@
# plugin - DevStack dispatch script for vpp
vpp_debug() {
    if [ ! -z "$VPP_DEVSTACK_DEBUG" ] ; then
        "$@" || true # a debug command failing is not a failure
    fi
}
# For debugging purposes, highlight vpp sections
vpp_debug tput setab 1
# All machines using the VPP mechdriver and agent
function pre_install_vpp {
    :
}

function install_vpp {
    echo "Installing networking-vpp"
    setup_develop "$MECH_VPP_DIR"
}

function init_vpp {
    :
}

function configure_vpp {
    iniset /$Q_PLUGIN_CONF_FILE ml2_vpp agents $MECH_VPP_AGENTLIST
    iniset /$Q_PLUGIN_CONF_FILE ml2_vpp physnets $MECH_VPP_PHYSNETLIST
    if [ ! -z "$VXLAN_SRC_ADDR" ] ; then
        iniset /$Q_PLUGIN_CONF_FILE ml2_vpp vxlan_src_addr $VXLAN_SRC_ADDR
    fi
    if [ ! -z "$VXLAN_BCAST_ADDR" ] ; then
        iniset /$Q_PLUGIN_CONF_FILE ml2_vpp vxlan_bcast_addr $VXLAN_BCAST_ADDR
    fi
    if [ ! -z "$VXLAN_VRF" ] ; then
        iniset /$Q_PLUGIN_CONF_FILE ml2_vpp vxlan_vrf $VXLAN_VRF
    fi
    if [ ! -z "$QEMU_USER" ] ; then
        iniset /$Q_PLUGIN_CONF_FILE ml2_vpp qemu_user $QEMU_USER
    fi
    if [ ! -z "$QEMU_GROUP" ] ; then
        iniset /$Q_PLUGIN_CONF_FILE ml2_vpp qemu_group $QEMU_GROUP
    fi
}
function shut_vpp_down {
    :
}

# The VPP control plane element (we don't at this point start VPP itself TODO)
function pre_install_vpp_agent {
    :
}

function install_vpp_agent {
    :
}

function init_vpp_agent {
    # sudo for now, as it needs to connect to VPP and for that requires root privs
    # to share its shmem comms channel
    run_process $agent_service_name "sudo $VPP_CP_BINARY --config-file /$Q_PLUGIN_CONF_FILE"
}

function configure_vpp_agent {
    :
}

function shut_vpp_agent_down {
    stop_process $agent_service_name
}
agent_do() {
    # Run the given command only when the agent service is enabled
    if is_service_enabled "$agent_service_name"; then
        "$@"
    fi
}

if [[ "$1" == "stack" && "$2" == "pre-install" ]]; then
    # Set up system services
    echo_summary "Configuring system services $name"
    agent_do pre_install_vpp_agent

elif [[ "$1" == "stack" && "$2" == "install" ]]; then
    # Perform installation of service source
    echo_summary "Installing $name"
    agent_do install_vpp_agent

elif [[ "$1" == "stack" && "$2" == "post-config" ]]; then
    # Configure after the other layer 1 and 2 services have been configured
    echo_summary "Configuring $name"
    agent_do configure_vpp_agent

elif [[ "$1" == "stack" && "$2" == "extra" ]]; then
    # Initialize and start the service
    echo_summary "Initializing $name"
    agent_do init_vpp_agent
fi

if [[ "$1" == "unstack" ]]; then
    # Shut down services
    agent_do shut_vpp_agent_down
fi

if [[ "$1" == "clean" ]]; then
    # Remove state and transient data
    # Remember first calls
    # no-op
    :
fi

vpp_debug tput setab 9
function neutron_plugin_install_agent_packages {
    install_package bridge-utils
}

function neutron_plugin_configure_l3_agent {
    :
}

# We have opinions on the interface driver that should attach agents
function neutron_plugin_setup_interface_driver {
    local conf_file=$1
    iniset $conf_file DEFAULT interface_driver linuxbridge
}

@ -0,0 +1,12 @@
enable_service vpp-agent

@ -0,0 +1,75 @@
# -*- coding: utf-8 -*-
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
# implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import sys
sys.path.insert(0, os.path.abspath('../..'))
# -- General configuration ----------------------------------------------------
# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom ones.
extensions = [
# autodoc generation is a bit aggressive and a nuisance when doing heavy
# text edit cycles.
# execute "export SPHINX_DEBUG=1" in your terminal to disable
# The suffix of source filenames.
source_suffix = '.rst'
# The master toctree document.
master_doc = 'index'
# General information about the project.
project = u'networking-vpp'
copyright = u'2016, OpenStack Foundation'
# If true, '()' will be appended to :func: etc. cross-reference text.
add_function_parentheses = True
# If true, the current module name will be prepended to all description
# unit titles (such as .. function::).
add_module_names = True
# The name of the Pygments (syntax highlighting) style to use.
pygments_style = 'sphinx'
# -- Options for HTML output --------------------------------------------------
# The theme to use for HTML and HTML Help pages. Major themes that come with
# Sphinx are currently 'default' and 'sphinxdoc'.
# html_theme_path = ["."]
# html_theme = '_theme'
# html_static_path = ['static']
# Output file base name for HTML help builder.
htmlhelp_basename = '%sdoc' % project
# Grouping the document tree into LaTeX files. List of tuples
# (source start file, target name, title, author, documentclass
# [howto/manual]).
latex_documents = [
    ('index',
     '%s.tex' % project,
     u'%s Documentation' % project,
     u'OpenStack Foundation', 'manual'),
]
# Example configuration for intersphinx: refer to the Python standard library.
#intersphinx_mapping = {'': None}

@ -0,0 +1,4 @@
.. include:: ../../CONTRIBUTING.rst

@ -0,0 +1,35 @@
Licensed under the Apache License, Version 2.0 (the "License"); you may
not use this file except in compliance with the License. You may obtain
a copy of the License at
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
License for the specific language governing permissions and limitations
under the License.
Developer Guide
In the Developer Guide, you will find information on networking-vpp's lower
level design and implementation details.
.. toctree::
:maxdepth: 2
Indices and tables
* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`

@ -0,0 +1,33 @@
.. networking-vpp documentation master file, created by
sphinx-quickstart on Tue Jul 9 22:26:36 2013.
You can adapt this file completely to your liking, but it should at least
contain the root `toctree` directive.
Welcome to networking-vpp's documentation!
.. toctree::
:maxdepth: 2
Developer Docs
.. toctree::
:maxdepth: 1
Indices and tables
* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`

@ -0,0 +1,12 @@
At the command line::
$ pip install networking-vpp
Or, if you have virtualenvwrapper installed::
$ mkvirtualenv networking-vpp
$ pip install networking-vpp

@ -0,0 +1 @@
.. include:: ../../README.rst

@ -0,0 +1,564 @@
# Copyright (c) 2016 Cisco Systems, Inc.
# All Rights Reserved
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.
# This is a simple Flask application that provides REST APIs by which
# compute and network services can communicate, plus a REST API for
# debugging using a CLI client.
# Note that it does *NOT* at this point have a persistent database, so
# restarting this process will make Gluon forget about every port it's
# learned, which will not do your system much good (the data is in the
# global 'backends' and 'ports' objects). This is for simplicity of
# demonstration; we have a second codebase already defined that is
# written to OpenStack endpoint principles and includes its ORM, so
# that work was not repeated here where the aim was to get the APIs
# worked out. The two codebases will merge in the future.
import distro
import etcd
import json
import os
import re
import sys
from threading import Thread
import time
import traceback
import vpp
from networking_vpp import config_opts
from neutron.agent.linux import bridge_lib
from neutron.agent.linux import ip_lib
from neutron.common import constants as n_const
from oslo_config import cfg
from oslo_log import log as logging
LOG = logging.getLogger(__name__)
# config_opts is required to configure the options within it, but
# not referenced from here, so shut up tox:
assert config_opts
# This mirrors functionality in Neutron so that we're creating a name
# that Neutron can find for its agents.
def get_tap_name(uuid):
return n_const.TAP_DEVICE_PREFIX + uuid[0:11]
# This mirrors functionality in Nova so that we're creating a vhostuser
# name that it will be able to locate
def get_vhostuser_name(uuid):
return os.path.join(VHOSTUSER_DIR, uuid)
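# A minimal, standalone sketch of the two naming helpers above, so the
# resulting names can be inspected without Neutron installed. Assumptions:
# TAP_DEVICE_PREFIX mirrors Neutron's 'tap' prefix, and VHOSTUSER_DIR is a
# hypothetical placeholder because the real directory is not shown in this
# listing. The sample port UUID is made up.

```python
import os

TAP_DEVICE_PREFIX = 'tap'  # assumed to match n_const.TAP_DEVICE_PREFIX
VHOSTUSER_DIR = '/tmp'     # hypothetical; substitute the configured directory


def get_tap_name(uuid):
    # 'tap' plus the first 11 characters of the port UUID
    return TAP_DEVICE_PREFIX + uuid[0:11]


def get_vhostuser_name(uuid):
    # The vhost-user socket is named after the full port UUID
    return os.path.join(VHOSTUSER_DIR, uuid)


port_id = '4a1b2c3d-5e6f-7a8b-9c0d-1e2f3a4b5c6d'  # made-up example UUID
tap_name = get_tap_name(port_id)
sock_path = get_vhostuser_name(port_id)
```

# Truncating the UUID to 11 characters keeps the tap name at 14 characters,
# which fits within the Linux interface name limit.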
def get_distro_family():
if in ['rhel', 'centos', 'fedora']:
return 'redhat'
def get_qemu_default():
# Use a local name that does not shadow the imported 'distro' module
family = get_distro_family()
if family == 'redhat':
qemu_user = 'qemu'
qemu_group = 'qemu'
elif family == 'ubuntu':
qemu_user = 'libvirt-qemu'
qemu_group = 'libvirtd'
# let's just try libvirt-qemu for now; maybe we should instead
# print an error message and exit?
qemu_user = 'libvirt-qemu'
qemu_group = 'kvm'
return (qemu_user, qemu_group)
class VPPForwarder(object):
def __init__(self,
physnets, # physnet_name: interface-name
self.vpp = vpp.VPPInterface(LOG)
self.physnets = physnets
self.qemu_user = qemu_user
self.qemu_group = qemu_group
# This is the address we'll use if we plan on broadcasting
# vxlan packets
self.vxlan_bcast_addr = vxlan_bcast_addr
self.vxlan_src_addr = vxlan_src_addr
self.vxlan_vrf = vxlan_vrf
# Used as a unique number for bridge IDs
self.next_bridge_id = 5678
self.networks = {} # (physnet, type, ID): datastruct
self.interfaces = {} # uuid: if idx
def get_vpp_ifidx(self, if_name):
"""Return VPP's interface index value for the network interface"""
if self.vpp.get_interface(if_name):
return self.vpp.get_interface(if_name).sw_if_index
LOG.error("Error obtaining interface data from vpp "
"for interface:%s" % if_name)
return None
def get_interface(self, physnet):
return self.physnets.get(physnet, None)
def new_bridge_domain(self):
x = self.next_bridge_id
self.next_bridge_id += 1
return x
def network_on_host(self, physnet, net_type, seg_id=None):
"""Find or create a network of the type required"""
if (physnet, net_type, seg_id) not in self.networks:
self.create_network_on_host(physnet, net_type, seg_id)
return self.networks.get((physnet, net_type, seg_id), None)
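# The find-or-create pattern used by network_on_host() can be sketched in
# isolation as below. This is an illustration only: NetworkTable is a
# hypothetical class, it makes no VPP calls, and it stores just a subset of
# the fields the agent records. The starting bridge ID of 5678 is taken from
# the constructor above.

```python
import itertools


class NetworkTable(object):
    """Hypothetical stand-in for the agent's network bookkeeping."""

    def __init__(self, first_bridge_id=5678):
        # Monotonic counter used as a unique bridge domain ID
        self._bridge_ids = itertools.count(first_bridge_id)
        self.networks = {}  # (physnet, type, ID): datastruct

    def network_on_host(self, physnet, net_type, seg_id=None):
        """Find or create the record for a network on this host."""
        key = (physnet, net_type, seg_id)
        if key not in self.networks:
            self.networks[key] = {
                'bridge_domain_id': next(self._bridge_ids),
                'network_type': net_type,
                'segmentation_id': seg_id,
            }
        return self.networks[key]


table = NetworkTable()
first = table.network_on_host('physnet1', 'vlan', 100)
again = table.network_on_host('physnet1', 'vlan', 100)
other = table.network_on_host('physnet1', 'vlan', 200)
```

# Keying on the (physnet, type, segmentation ID) tuple means a repeated
# request for the same segment reuses its bridge domain instead of
# allocating a new one.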
def create_network_on_host(self, physnet, net_type, seg_id):
intf = self.get_interface(physnet)
if intf is None:
LOG.error("Error: no physnet found")
return None
ifidx = self.get_vpp_ifidx(intf)
# TODO(ijw): bridge domains have no distinguishing marks.
# VPP needs to allow us to name or label them so that we
# can find them when we restart. If we add an interface
# to two bridges that will likely not do as required
if net_type == 'flat':
if_upstream = ifidx
LOG.debug('Adding upstream interface-idx:%s-%s to bridge '
'for flat networking' % (intf, if_upstream))
elif net_type == 'vlan':
LOG.debug('Adding upstream VLAN interface %s.%s '
'to bridge for vlan networking' % (intf, seg_id))
if not self.vpp.get_interface('%s.%s' % (intf, seg_id)):
if_upstream = self.vpp.create_vlan_subif(ifidx,
if_upstream = self.get_vpp_ifidx('%s.%s' % (intf, seg_id))
# elif net_type == 'vxlan':
# # NB physnet not really used here
# if_upstream = \
# self.vpp.create_srcrep_vxlan_subif(self, self.vxlan_vrf,
# self.vxlan_src_addr,
# self.vxlan_bcast_addr,
# seg_id)
raise Exception('network type %s not supported' % net_type)
id = self.new_bridge_domain()
self.vpp.add_to_bridge(id, if_upstream)
self.networks[(physnet, net_type, seg_id)] = {
'bridge_domain_id': id,
'if_upstream': intf,
'if_upstream_idx': if_upstream,
'network_type': net_type,
'segmentation_id': seg_id,
def delete_network_on_host(self, physnet, net_type, seg_id=None):
net = self.networks.get((physnet, net_type, seg_id), None)
if net is not None:
# We leave the interface up. Other networks may be using it
LOG.error("Delete Network: network is unknown "
"to agent")
# stolen from LB driver
def _bridge_exists_and_ensure_up(self, bridge_name):
"""Check if the bridge exists and make sure it is up."""
br = ip_lib.IPDevice(bridge_name)
# If the device doesn't exist this will throw a RuntimeError
except RuntimeError:
return False
return True
def ensure_bridge(self, bridge_name):
"""Create a bridge unless it already exists."""
# _bridge_exists_and_ensure_up instead of device_exists is used here
# because there are cases where the bridge exists but it's not UP,
# for example:
# 1) A greenthread was executing this function and had not yet executed
# "ip link set bridge_name up" before eventlet switched to this
# thread running the same function
# 2) The Nova VIF driver was running concurrently and had just created
# the bridge, but had not yet put it UP
if not self._bridge_exists_and_ensure_up(bridge_name):
bridge_device = bridge_lib.BridgeDevice.addbr(bridge_name)
if bridge_device.setfd(0):
if bridge_device.disable_stp():
if bridge_device.disable_ipv6():
bridge_device = bridge_lib.BridgeDevice(bridge_name)
return bridge_device
# TODO(ijw): should be checking this all succeeded
# end theft
# TODO(njoy): make wait_time configurable
# TODO(ijw): needs to be one thread for all waits
def add_external_tap(self, device_name, bridge, bridge_name):
"""Add an externally created TAP device to the bridge
Wait for the external tap device to be created by the DHCP agent.
When the tap device is ready, add it to the bridge. Run as a thread
so the REST call can return before this code finishes.
"""
wait_time = 60
found = False
while wait_time > 0:
if ip_lib.device_exists(device_name):
LOG.debug('External tap device %s found!'
% device_name)
LOG.debug('Bridging tap interface %s on %s'
% (device_name, bridge_name))
if not bridge.owns_interface(device_name):
LOG.debug('Interface: %s is already added '
'to the bridge %s' %
(device_name, bridge_name))
found = True
wait_time -= 2
if not found:
LOG.error('Failed waiting for external tap device:%s'
% device_name)
def create_interface_on_host(self, if_type, uuid, mac):
if uuid in self.interfaces:
LOG.debug('port %s repeat binding request - ignored' % uuid)
LOG.debug('binding port %s as type %s' %
(uuid, if_type))
# TODO(ijw): naming not obviously consistent with
# Neutron's naming
name = uuid[0:11]
bridge_name = 'br-' + name
tap_name = 'tap' + name
if if_type == 'maketap' or if_type == 'plugtap':
if if_type == 'maketap':
iface_idx = self.vpp.create_tap(tap_name, mac)
props = {'name': tap_name}
int_tap_name = 'vpp' + name
props = {'bridge_name': bridge_name,
'ext_tap_name': tap_name,
'int_tap_name': int_tap_name}
LOG.debug('Creating tap interface %s with mac %s'
% (int_tap_name, mac))
iface_idx = self.vpp.create_tap(int_tap_name, mac)
# TODO(ijw): someone somewhere ought to be sorting
# the MTUs out
br = self.ensure_bridge(bridge_name)
# This is the external TAP device that will be
# created by an agent, say the DHCP agent later in
# time
t = Thread(target=self.add_external_tap,
args=(tap_name, br, bridge_name,))
# This is the device that we just created with VPP
if not br.owns_interface(int_tap_name):
elif if_type == 'vhostuser':
path = get_vhostuser_name(uuid)
iface_idx = self.vpp.create_vhostuser(path, mac,
props = {'path': path}
raise Exception('unsupported interface type')
props['bind_type'] = if_type
props['iface_idx'] = iface_idx
props['mac'] = mac
self.interfaces[uuid] = props
return self.interfaces[uuid]
def bind_interface_on_host(self, if_type, uuid, mac, physnet,
net_type, seg_id):
# TODO(najoy): Need to send a return value so the ML2 driver
# can raise an exception and prevent network creation (when
# network_on_host returns None)
net_data = self.network_on_host(physnet, net_type, seg_id)
net_br_idx = net_data['bridge_domain_id']
props = self.create_interface_on_host(if_type, uuid, mac)
iface_idx = props['iface_idx']
self.vpp.add_to_bridge(net_br_idx, iface_idx)
props['net_data'] = net_data
LOG.debug('Bound vpp interface with sw_idx:%s on '
'bridge domain:%s'
% (iface_idx, net_br_idx))
return props
def unbind_interface_on_host(self, uuid):
if uuid not in self.interfaces:
LOG.debug('unknown port %s unbinding request - ignored'
% uuid)
props = self.interfaces[uuid]
iface_idx = props['iface_idx']
LOG.debug('unbinding port %s, recorded as type %s'
% (uuid, props['bind_type']))
# We no longer need this interface. Specifically if it's
# a vhostuser interface it's annoying to have it around
# because the VM's memory (hugepages) will not be
# released. So, here, we destroy it.
if props['bind_type'] == 'vhostuser':
elif props['bind_type'] in ['maketap', 'plugtap']:
if props['bind_type'] == 'plugtap':
name = uuid[0:11]
bridge_name = 'br-' + name
bridge = bridge_lib.BridgeDevice(bridge_name)
if bridge.exists():
# These may fail, don't care much
if bridge.owns_interface(props['int_tap_name']):
if bridge.owns_interface(props['ext_tap_name']):
except Exception as exc:
LOG.error('Unknown port type %s during unbind'
% props['bind_type'])
# TODO(ijw): delete structures of newly unused networks with
# delete_network
LEADIN = '/networking-vpp' # TODO(ijw): make configurable?
class EtcdListener(object):
def __init__(self, host, etcd_client, vppf, physnets): = host
self.etcd_client = etcd_client
self.vppf = vppf
self.physnets = physnets
# We need certain directories to exist
self.mkdir(LEADIN + '/state/%s/ports' %
self.mkdir(LEADIN + '/nodes/%s/ports' %
def mkdir(self, path):
self.etcd_client.write(path, None, dir=True)
except etcd.EtcdNotFile:
# Thrown when the directory already exists, which is fine
def repop_interfaces(self):
# The vppf bits
def unbind(self, id):
def bind(self, id, binding_type, mac_address, physnet, network_type,
# args['binding_type'] in ('vhostuser', 'plugtap'):
return self.vppf.bind_interface_on_host(binding_type,
HEARTBEAT = 60 # seconds
def process_ops(self):
# TODO(ijw): needs to remember its last tick on reboot, or
# reconfigure from start (which means that VPP needs it
# storing, so it's lost on reboot of VPP)
physnets = self.physnets.keys()
for f in physnets:
self.etcd_client.write(LEADIN + '/state/%s/physnets/%s'
% (, f), 1)
tick = None
while True:
# The key that indicates to people that we're alive
# (not that they care)
self.etcd_client.write(LEADIN + '/state/%s/alive' %,
1, ttl=3 * self.HEARTBEAT)
LOG.error("ML2_VPP(%s): thread pausing"
% self.__class__.__name__)
rv = + "/nodes/%s/ports"
LOG.error('watch received %s on %s at tick %s',
rv.action, rv.key, rv.modifiedIndex)
tick = rv.modifiedIndex + 1
LOG.error("ML2_VPP(%s): thread active"
% self.__class__.__name__)
# Matches a port key, gets host and uuid
m = re.match(LEADIN + '/nodes/%s/ports/([^/]+)$' %,
if m:
port =
if rv.action == 'delete':
# Removing key == desire to unbind
LEADIN + '/state/%s/ports/%s'
% (, port))
except etcd.EtcdKeyNotFound:
# Gone is fine, if we didn't delete it
# it's no problem
# Create or update == bind
data = json.loads(rv.value)
props = self.bind(port,
self.etcd_client.write(LEADIN + '/state/%s/ports/%s'
% (, port),
LOG.warning('Unexpected key change in etcd port feedback')
except etcd.EtcdWatchTimedOut:
# This is normal
except Exception as e:
# format_exc() takes an optional limit, not the exception object
LOG.error('etcd threw exception %s' % traceback.format_exc())
# TODO(ijw): prevents tight crash loop, but adds
# latency
# Should be specific to etcd faults, should have
# sensible behaviour - Don't just kill the thread...
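# The key matching done in the watch loop above can be sketched on its own.
# parse_port_key is a hypothetical helper, not part of the agent; unlike the
# loop above it also passes the host name through re.escape(), which is an
# added precaution for host names containing regex metacharacters such as
# dots. The host and port names in the usage lines are made up.

```python
import re

LEADIN = '/networking-vpp'


def parse_port_key(key, host):
    """Return the port ID if key is a port key for host, else None."""
    pattern = LEADIN + '/nodes/%s/ports/([^/]+)$' % re.escape(host)
    m = re.match(pattern, key)
    return m.group(1) if m else None


# A direct child of the ports directory matches and yields the port ID
port = parse_port_key('/networking-vpp/nodes/compute1/ports/abc123',
                      'compute1')
# Deeper keys and keys for other hosts do not match
nested = parse_port_key('/networking-vpp/nodes/compute1/ports/abc123/x',
                        'compute1')
foreign = parse_port_key('/networking-vpp/nodes/compute2/ports/abc123',
                         'compute1')
```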
def main():
# If the user and/or group are specified in config file, we will use
# them as configured; otherwise we try to use defaults depending on
# distribution. Currently only Ubuntu and Red Hat family distributions
# are supported.
qemu_user = cfg.CONF.ml2_vpp.qemu_user
qemu_group = cfg.CONF.ml2_vpp.qemu_group
default_user, default_group = get_qemu_default()
if not qemu_user:
qemu_user = default_user
if not qemu_group:
qemu_group = default_group
physnet_list = cfg.CONF.ml2_vpp.physnets.replace(' ', '').split(',')
physnets = {}
for f in physnet_list:
(k, v) = f.split(':')
physnets[k] = v
vppf = VPPForwarder(physnets,
etcd_client = etcd.Client() # TODO(ijw): args
ops = EtcdListener(, etcd_client, vppf, physnets)
if __name__ == '__main__':
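# The physnet parsing in main() can be sketched as a standalone function.
# parse_physnets is a hypothetical helper, and the interface names in the
# usage line are made-up examples. It differs from the loop in main() in two
# labelled ways: empty items are skipped, and a malformed item raises a
# ValueError with a readable message rather than failing on tuple unpacking.

```python
def parse_physnets(spec):
    """Parse 'name:interface,name:interface' into a dict."""
    physnets = {}
    for item in spec.replace(' ', '').split(','):
        if not item:
            continue  # tolerate an empty string or a trailing comma
        # Split on the first colon only
        name, sep, interface = item.partition(':')
        if not sep or not name or not interface:
            raise ValueError('malformed physnet mapping: %r' % item)
        physnets[name] = interface
    return physnets


mapping = parse_physnets(
    'physnet1:GigabitEthernet2/2/0, physnet2:TenGigabitEthernet3/0/0')
```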

@ -0,0 +1,208 @@
# Copyright (c) 2016 Cisco Systems, Inc.
# All Rights Reserved
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.
import grp
import os
import pwd
import vpp_papi
def mac_to_bytes(mac):
return str(''.join(chr(int(x, base=16)) for x in mac.split(':')))
def fix_string(s):
return s.rstrip("\0").decode(encoding='ascii')
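# The two helpers above rely on Python 2 string semantics. A possible
# Python 3 equivalent is sketched below; this is an assumption about how
# they could be ported, not code from this change. The MAC address and
# interface name in the usage lines are made-up examples.

```python
def mac_to_bytes(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' to its 6 raw bytes."""
    return bytes(int(x, 16) for x in mac.split(':'))


def fix_string(s):
    """Strip trailing NUL padding from a fixed-width field and decode it."""
    return s.rstrip(b'\0').decode('ascii')


raw_mac = mac_to_bytes('fa:16:3e:00:00:01')
name = fix_string(b'tap0\x00\x00\x00')
```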
def _vpp_cb(*args, **kwargs):
# sw_interface_set_flags comes back when you delete interfaces
# print 'callback:', args, kwargs
# Sometimes a callback fires unexpectedly. We need to catch them
# because vpp_papi will traceback otherwise
class VPPInterface(object):
def _check_retval(self, t):
"""See if VPP returned OK.
VPP is very inconsistent in return codes, so for now this reports
a logged warning rather than flagging an error.
"""
self.LOG.debug("checking return value for object: %s" % str(t))
if t.retval != 0:
self.LOG.debug('FAIL? retval here is %s' % t.retval)
except AttributeError as e:
self.LOG.debug("Unexpected request format. Error: %s on %s"
% (e, t))
def get_interfaces(self):
t = vpp_papi.sw_interface_dump(0, b'ignored')
for interface in t:
if interface.vl_msg_id == vpp_papi.VL_API_SW_INTERFACE_DETAILS:
yield (fix_string(interface.interface_name), interface)
def get_interface(self, name):
for (ifname, f) in self.get_interfaces():
if ifname == name:
return f
def get_version(self):
t = vpp_papi.show_version()
return fix_string(t.version)
def create_tap(self, ifname, mac):
# (we don't like unicode in VPP hence str(ifname))
t = vpp_papi.tap_connect(False, # random MAC
False, # renumber - who knows, no doc
0) # customdevinstance - who knows, no doc
return t.sw_if_index # will be -1 on failure (e.g. 'already exists')
def delete_tap(self, idx):
# Err, I just got a sw_interface_set_flags here, not a delete tap?
# self._check_retval(t)
def create_vhostuser(self, ifpath, mac, qemu_user, qemu_group):'Creating %s as a port' % ifpath)