This is the first OpenStack commit for an ML2 driver for VPP. It contains the driver and a VPP agent that uses the VPP API (via the Python API package) to control VPP; the driver and agent communicate with one another via an etcd instance. This version is (loosely, subject to some tidying and getting it past the Gerrit tests) based on http://github.com/iawells/networking-vpp.

Change-Id: I8909ddc56b5c49ec2ee1b25acbff21716eee70f6
Co-Authored-By: Ian Wells <iawells@cisco.com>
Co-Authored-By: Naveen Joy <najoy@cisco.com>
Co-Authored-By: Feng Pan <fpan@redhat.com>

changes/72/370472/3
parent
f69d3d16c4
commit
b78c4a3354
@ -0,0 +1,6 @@
|
||||
.eggs
|
||||
build
|
||||
*.egg-info
|
||||
*.pyc
|
||||
.tox
|
||||
*~
|
@ -0,0 +1,3 @@
|
||||
Feng Pan <fpan@redhat.com>
|
||||
Ian Wells <iawells@cisco.com>
|
||||
Naveen Joy <naveen.joy@gmail.com>
|
@ -0,0 +1,153 @@
|
||||
====================
|
||||
CentOS 7 Setup Guide
|
||||
====================
|
||||
|
||||
This document describes steps to set up a CentOS 7 single host devstack
|
||||
environment using networking-vpp.
|
||||
|
||||
Host Setup
|
||||
~~~~~~~~~~
|
||||
|
||||
#. Configure hugepage and IOMMU support on the kernel command line:
|
||||
|
||||
``default_hugepagesz=2M hugepagesz=2M hugepages=2048 iommu=pt
|
||||
intel_iommu=on``
|
||||
|
||||
VPP build and install
|
||||
~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
#. Pull the VPP source from git::
|
||||
|
||||
git clone https://gerrit.fd.io/r/vpp
|
||||
|
||||
#. Build and install VPP::
|
||||
|
||||
cd vpp
|
||||
make install-dep
|
||||
make build-release
|
||||
make pkg-rpm
|
||||
|
||||
#. Install VPP rpms. The rpms are located in the vpp/build-root directory after
|
||||
build is complete::
|
||||
|
||||
sudo yum install build-root/vpp*.rpm
|
||||
|
||||
|
||||
#. Build and install VPP-PAPI. VPP-PAPI is VPP's python API used by
|
||||
networking-vpp.
|
||||
|
||||
* Install python-devel package if it is not installed already::
|
||||
|
||||
sudo yum install -y python-devel
|
||||
|
||||
* Build and install::
|
||||
|
||||
make -Cbuild-root PLATFORM=vpp TAG=vpp_debug vpp-api-install
|
||||
cd vpp-api/python
|
||||
sudo python setup.py install
|
||||
|
||||
#. Configuring VPP
|
||||
|
||||
It may be desirable to change the VPP CLI's listening port to something other
|
||||
than the default 5000, as it is used by keystone. This can be done by
|
||||
adding the line ``cli-listen localhost:5002`` to the ``unix`` section of the VPP
|
||||
config file ``/etc/vpp/startup.conf``.
|
||||
|
||||
It is necessary to load the PMD kernel module of your choice (vfio-pci, igb_uio,
|
||||
etc.). The igb_uio module can be found in the DPDK build directory:
|
||||
``build-root/install-vpp-native/dpdk/kmod/igb_uio.ko``
|
||||
|
||||
#. Starting VPP
|
||||
|
||||
VPP can be started by starting VPP service::
|
||||
|
||||
systemctl start vpp
|
||||
|
||||
To verify VPP has started correctly::
|
||||
|
||||
vppctl show interface
|
||||
|
||||
You should see your physical NIC listed in the interface list, in this
|
||||
case GigabitEthernet2/5/0::
|
||||
|
||||
Name Idx State Counter Count
|
||||
GigabitEthernet2/5/0 5 down
|
||||
local0 0 down
|
||||
pg/stream-0 1 down
|
||||
pg/stream-1 2 down
|
||||
pg/stream-2 3 down
|
||||
pg/stream-3 4 down
|
||||
|
||||
|
||||
More detailed instructions on building and installing VPP can be found at:
|
||||
https://wiki.fd.io/view/VPP/Build,_install,_and_test_images#Build_A_VPP_Package
|
||||
|
||||
Upgrade qemu-kvm
|
||||
~~~~~~~~~~~~~~~~
|
||||
|
||||
#. Enable the CentOS EV repo
|
||||
|
||||
``yum install centos-release-qemu-ev``
|
||||
|
||||
#. Update packages; this will pick up new qemu packages from the EV repo.
|
||||
|
||||
``yum update``
|
||||
|
||||
#. Remove the qemu-system-x86 package if it's installed; this will prevent
|
||||
libvirt from identifying the QEMU version as 2.0
|
||||
|
||||
``yum remove qemu-system-x86``
|
||||
|
||||
|
||||
Build and install qemu
|
||||
~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
If you would like to use qemu rather than qemu-kvm, you can build and
|
||||
install qemu with the following steps:
|
||||
|
||||
::
|
||||
|
||||
wget http://wiki.qemu-project.org/download/qemu-2.3.1.tar.bz2
|
||||
tar xvf qemu-2.3.1.tar.bz2
|
||||
cd qemu-2.3.1
|
||||
sudo yum install gtk2-devel
|
||||
./configure --enable-numa
|
||||
make
|
||||
sudo make install
|
||||
|
||||
Devstack Setup
|
||||
~~~~~~~~~~~~~~
|
||||
|
||||
General directions on how to download and set up devstack can be found at http://docs.openstack.org/developer/devstack/
|
||||
|
||||
Add the following to local.conf::
|
||||
|
||||
disable_service n-net q-agt
|
||||
disable_service cinder c-sch c-api c-vol
|
||||
disable_service tempest
|
||||
|
||||
enable_plugin networking-vpp https://github.com/iawells/networking-vpp.git
|
||||
ENABLED_SERVICES+=,q-svc,q-meta,q-dhcp
|
||||
Q_PLUGIN=ml2
|
||||
Q_ML2_TENANT_NETWORK_TYPE=vlan
|
||||
ML2_VLAN_RANGES=physnet:100:200
|
||||
Q_ML2_PLUGIN_EXT_DRIVERS=
|
||||
Q_ML2_PLUGIN_MECHANISM_DRIVERS=vpp
|
||||
Q_ML2_PLUGIN_TYPE_DRIVERS=vlan
|
||||
VLAN_TRUNK_IF='GigabitEthernet2/5/0'
|
||||
|
||||
Note that ``VLAN_TRUNK_IF`` should be set to the interface name in VPP that you
|
||||
want to use as your trunk interface.
|
||||
|
||||
VM creation
|
||||
~~~~~~~~~~~
|
||||
|
||||
Note that hugepage support is required on guest VMs for vhostuser port
|
||||
attachment; this can be done by creating a new flavor and booting the VM with
|
||||
the flavor::
|
||||
|
||||
nova flavor-create m1.tiny.hugepage auto 512 0 1
|
||||
nova flavor-key m1.tiny.hugepage set hw:mem_page_size=2048
|
||||
|
||||
nova boot --image cirros-0.3.4-x86_64-uec --flavor m1.tiny.hugepage --nic net-name=private myvm
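As a rough sanity check on the numbers used in this guide, the sketch below works out how much memory the kernel parameters reserve as hugepages and how many guests of the example flavor could fit in that pool. It assumes ``hw:mem_page_size`` is expressed in KiB and it ignores any hugepages consumed by VPP itself, so the guest count is an upper bound only. The function names are illustrative and not part of networking-vpp.

```python
# Hugepage arithmetic for the values used in this guide (illustrative only).

HOST_HUGEPAGES = 2048        # hugepages=2048 from the kernel command line
HOST_PAGE_SIZE_MIB = 2       # hugepagesz=2M
FLAVOR_RAM_MIB = 512         # RAM of the m1.tiny.hugepage flavor
FLAVOR_PAGE_SIZE_KIB = 2048  # hw:mem_page_size=2048 (assumed to be in KiB)


def host_pool_mib(pages, page_size_mib):
    """Total memory reserved as hugepages on the host, in MiB."""
    return pages * page_size_mib


def pages_per_guest(ram_mib, page_size_kib):
    """Number of hugepages one guest needs, rounding up."""
    ram_kib = ram_mib * 1024
    return -(-ram_kib // page_size_kib)  # ceiling division


def max_guests(pages, per_guest):
    """Upper bound on guests; ignores pages used by VPP or anything else."""
    return pages // per_guest


pool = host_pool_mib(HOST_HUGEPAGES, HOST_PAGE_SIZE_MIB)
per_guest = pages_per_guest(FLAVOR_RAM_MIB, FLAVOR_PAGE_SIZE_KIB)
print(pool, per_guest, max_guests(HOST_HUGEPAGES, per_guest))
```

With these values the pool is 4096 MiB and each guest takes 256 pages, so at most 8 such guests fit before anything else claims hugepages.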
|
||||
|
@ -0,0 +1,74 @@
|
||||
CHANGES
|
||||
=======
|
||||
|
||||
* Tox fixes
|
||||
* Remove outdated line from example config
|
||||
* Use a journal table
|
||||
* do not create vpp subintf if exists
|
||||
* etcd support, part 1
|
||||
* Added etcd comms channel
|
||||
* Major updates
|
||||
* More PEP8
|
||||
* Fix doc file in setup.cfg
|
||||
* Major updates
|
||||
* tox cleanups (mainly long lines)
|
||||
* typo
|
||||
* Break notifcation of Nova into own function
|
||||
* Correct doc file name
|
||||
* update delete_network_on_host
|
||||
* update network_on_host
|
||||
* update vpp get_interface
|
||||
* add tests
|
||||
* added tests
|
||||
* delete bridge on network delete
|
||||
* create bridge-pool
|
||||
* delete vpp bridge_domain on network delete
|
||||
* support arbitrary physical network names for flat n/w binding
|
||||
* create network
|
||||
* create network msg
|
||||
* prevent adding an existing bridge interface
|
||||
* remove agent auto-cleanup
|
||||
* devstack l3_agent prevent ovs cleanup fix
|
||||
* vif_type fix
|
||||
* dhcp fixes
|
||||
* vpp agent logging
|
||||
* DHCP fixes
|
||||
* updated agent filtering for messaging
|
||||
* updated agent messaging to include failure reporting
|
||||
* remove local settings from devstack/settings
|
||||
* comment
|
||||
* delete port_postcommit
|
||||
* add flat networking
|
||||
* Change qemu user/group to be config options. Also added default values for those for ubuntu and redhat
|
||||
* fix server.py bind logger format string error
|
||||
* fix server.py unbind logger debug
|
||||
* add host info to unbind data
|
||||
* delete port url update to remove host
|
||||
* delete port_postcommit with debug
|
||||
* delete port_postcommit
|
||||
* add debugs to vpp.py
|
||||
* Add support for defining settings in local.conf. We now first check if a setting is defined
|
||||
* Add centos 7 guide doc
|
||||
* Convert README to rst
|
||||
* send bind call without queuing
|
||||
* add hostname to ip lookup
|
||||
* unicast message
|
||||
* update agent list in devstack settings
|
||||
* update agent to listen on 0.0.0.0
|
||||
* add mech_vpp debugs
|
||||
* agent list split on comma
|
||||
* updated agent list in devstack settings
|
||||
* More fixes on comments, debug, deleting
|
||||
* changed socket group to libvirtd
|
||||
* added missing unbind call to agent in mech_vpp
|
||||
* added debug messages for bind and unbind requests
|
||||
* updated devstack/settings vlan trunk interface
|
||||
* fix bug attribute error
|
||||
* test
|
||||
* fixed return value error
|
||||
* fixed bug for checking return value
|
||||
* Added README with details of what work has been done to date
|
||||
* updated vpp api interface
|
||||
* First pass at agent, diags in driver
|
||||
* Initial commit: mechdriver that talks to an as-yet missing agent
|
||||
* Added .gitreview
|
@ -0,0 +1,175 @@
|
||||
|
||||
Apache License
|
||||
Version 2.0, January 2004
|
||||
http://www.apache.org/licenses/
|
||||
|
||||
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
|
||||
|
||||
1. Definitions.
|
||||
|
||||
"License" shall mean the terms and conditions for use, reproduction,
|
||||
and distribution as defined by Sections 1 through 9 of this document.
|
||||
|
||||
"Licensor" shall mean the copyright owner or entity authorized by
|
||||
the copyright owner that is granting the License.
|
||||
|
||||
"Legal Entity" shall mean the union of the acting entity and all
|
||||
other entities that control, are controlled by, or are under common
|
||||
control with that entity. For the purposes of this definition,
|
||||
"control" means (i) the power, direct or indirect, to cause the
|
||||
direction or management of such entity, whether by contract or
|
||||
otherwise, or (ii) ownership of fifty percent (50%) or more of the
|
||||
outstanding shares, or (iii) beneficial ownership of such entity.
|
||||
|
||||
"You" (or "Your") shall mean an individual or Legal Entity
|
||||
exercising permissions granted by this License.
|
||||
|
||||
"Source" form shall mean the preferred form for making modifications,
|
||||
including but not limited to software source code, documentation
|
||||
source, and configuration files.
|
||||
|
||||
"Object" form shall mean any form resulting from mechanical
|
||||
transformation or translation of a Source form, including but
|
||||
not limited to compiled object code, generated documentation,
|
||||
and conversions to other media types.
|
||||
|
||||
"Work" shall mean the work of authorship, whether in Source or
|
||||
Object form, made available under the License, as indicated by a
|
||||
copyright notice that is included in or attached to the work
|
||||
(an example is provided in the Appendix below).
|
||||
|
||||
"Derivative Works" shall mean any work, whether in Source or Object
|
||||
form, that is based on (or derived from) the Work and for which the
|
||||
editorial revisions, annotations, elaborations, or other modifications
|
||||
represent, as a whole, an original work of authorship. For the purposes
|
||||
of this License, Derivative Works shall not include works that remain
|
||||
separable from, or merely link (or bind by name) to the interfaces of,
|
||||
the Work and Derivative Works thereof.
|
||||
|
||||
"Contribution" shall mean any work of authorship, including
|
||||
the original version of the Work and any modifications or additions
|
||||
to that Work or Derivative Works thereof, that is intentionally
|
||||
submitted to Licensor for inclusion in the Work by the copyright owner
|
||||
or by an individual or Legal Entity authorized to submit on behalf of
|
||||
the copyright owner. For the purposes of this definition, "submitted"
|
||||
means any form of electronic, verbal, or written communication sent
|
||||
to the Licensor or its representatives, including but not limited to
|
||||
communication on electronic mailing lists, source code control systems,
|
||||
and issue tracking systems that are managed by, or on behalf of, the
|
||||
Licensor for the purpose of discussing and improving the Work, but
|
||||
excluding communication that is conspicuously marked or otherwise
|
||||
designated in writing by the copyright owner as "Not a Contribution."
|
||||
|
||||
"Contributor" shall mean Licensor and any individual or Legal Entity
|
||||
on behalf of whom a Contribution has been received by Licensor and
|
||||
subsequently incorporated within the Work.
|
||||
|
||||
2. Grant of Copyright License. Subject to the terms and conditions of
|
||||
this License, each Contributor hereby grants to You a perpetual,
|
||||
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
|
||||
copyright license to reproduce, prepare Derivative Works of,
|
||||
publicly display, publicly perform, sublicense, and distribute the
|
||||
Work and such Derivative Works in Source or Object form.
|
||||
|
||||
3. Grant of Patent License. Subject to the terms and conditions of
|
||||
this License, each Contributor hereby grants to You a perpetual,
|
||||
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
|
||||
(except as stated in this section) patent license to make, have made,
|
||||
use, offer to sell, sell, import, and otherwise transfer the Work,
|
||||
where such license applies only to those patent claims licensable
|
||||
by such Contributor that are necessarily infringed by their
|
||||
Contribution(s) alone or by combination of their Contribution(s)
|
||||
with the Work to which such Contribution(s) was submitted. If You
|
||||
institute patent litigation against any entity (including a
|
||||
cross-claim or counterclaim in a lawsuit) alleging that the Work
|
||||
or a Contribution incorporated within the Work constitutes direct
|
||||
or contributory patent infringement, then any patent licenses
|
||||
granted to You under this License for that Work shall terminate
|
||||
as of the date such litigation is filed.
|
||||
|
||||
4. Redistribution. You may reproduce and distribute copies of the
|
||||
Work or Derivative Works thereof in any medium, with or without
|
||||
modifications, and in Source or Object form, provided that You
|
||||
meet the following conditions:
|
||||
|
||||
(a) You must give any other recipients of the Work or
|
||||
Derivative Works a copy of this License; and
|
||||
|
||||
(b) You must cause any modified files to carry prominent notices
|
||||
stating that You changed the files; and
|
||||
|
||||
(c) You must retain, in the Source form of any Derivative Works
|
||||
that You distribute, all copyright, patent, trademark, and
|
||||
attribution notices from the Source form of the Work,
|
||||
excluding those notices that do not pertain to any part of
|
||||
the Derivative Works; and
|
||||
|
||||
(d) If the Work includes a "NOTICE" text file as part of its
|
||||
distribution, then any Derivative Works that You distribute must
|
||||
include a readable copy of the attribution notices contained
|
||||
within such NOTICE file, excluding those notices that do not
|
||||
pertain to any part of the Derivative Works, in at least one
|
||||
of the following places: within a NOTICE text file distributed
|
||||
as part of the Derivative Works; within the Source form or
|
||||
documentation, if provided along with the Derivative Works; or,
|
||||
within a display generated by the Derivative Works, if and
|
||||
wherever such third-party notices normally appear. The contents
|
||||
of the NOTICE file are for informational purposes only and
|
||||
do not modify the License. You may add Your own attribution
|
||||
notices within Derivative Works that You distribute, alongside
|
||||
or as an addendum to the NOTICE text from the Work, provided
|
||||
that such additional attribution notices cannot be construed
|
||||
as modifying the License.
|
||||
|
||||
You may add Your own copyright statement to Your modifications and
|
||||
may provide additional or different license terms and conditions
|
||||
for use, reproduction, or distribution of Your modifications, or
|
||||
for any such Derivative Works as a whole, provided Your use,
|
||||
reproduction, and distribution of the Work otherwise complies with
|
||||
the conditions stated in this License.
|
||||
|
||||
5. Submission of Contributions. Unless You explicitly state otherwise,
|
||||
any Contribution intentionally submitted for inclusion in the Work
|
||||
by You to the Licensor shall be under the terms and conditions of
|
||||
this License, without any additional terms or conditions.
|
||||
Notwithstanding the above, nothing herein shall supersede or modify
|
||||
the terms of any separate license agreement you may have executed
|
||||
with Licensor regarding such Contributions.
|
||||
|
||||
6. Trademarks. This License does not grant permission to use the trade
|
||||
names, trademarks, service marks, or product names of the Licensor,
|
||||
except as required for reasonable and customary use in describing the
|
||||
origin of the Work and reproducing the content of the NOTICE file.
|
||||
|
||||
7. Disclaimer of Warranty. Unless required by applicable law or
|
||||
agreed to in writing, Licensor provides the Work (and each
|
||||
Contributor provides its Contributions) on an "AS IS" BASIS,
|
||||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
|
||||
implied, including, without limitation, any warranties or conditions
|
||||
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
|
||||
PARTICULAR PURPOSE. You are solely responsible for determining the
|
||||
appropriateness of using or redistributing the Work and assume any
|
||||
risks associated with Your exercise of permissions under this License.
|
||||
|
||||
8. Limitation of Liability. In no event and under no legal theory,
|
||||
whether in tort (including negligence), contract, or otherwise,
|
||||
unless required by applicable law (such as deliberate and grossly
|
||||
negligent acts) or agreed to in writing, shall any Contributor be
|
||||
liable to You for damages, including any direct, indirect, special,
|
||||
incidental, or consequential damages of any character arising as a
|
||||
result of this License or out of the use or inability to use the
|
||||
Work (including but not limited to damages for loss of goodwill,
|
||||
work stoppage, computer failure or malfunction, or any and all
|
||||
other commercial damages or losses), even if such Contributor
|
||||
has been advised of the possibility of such damages.
|
||||
|
||||
9. Accepting Warranty or Additional Liability. While redistributing
|
||||
the Work or Derivative Works thereof, You may choose to offer,
|
||||
and charge a fee for, acceptance of support, warranty, indemnity,
|
||||
or other liability obligations and/or rights consistent with this
|
||||
License. However, in accepting such obligations, You may act only
|
||||
on Your own behalf and on Your sole responsibility, not on behalf
|
||||
of any other Contributor, and only if You agree to indemnify,
|
||||
defend, and hold each Contributor harmless for any liability
|
||||
incurred by, or claims asserted against, such Contributor by reason
|
||||
of your accepting any such warranty or additional liability.
|
@ -0,0 +1,202 @@
|
||||
==============
|
||||
networking-vpp
|
||||
==============
|
||||
|
||||
ML2 Mechanism driver and small control plane for OpenVPP forwarder
|
||||
|
||||
This is a Neutron mechanism driver to bring the advantages of OpenVPP to
|
||||
OpenStack deployments.
|
||||
|
||||
It's been written to be as simple and readable as possible while offering
|
||||
either full Neutron functionality or a simple roadmap to it.
|
||||
While the driver is not perfect, we're aiming for
|
||||
|
||||
- robustness in the face of failures (of one of several Neutron servers, of
|
||||
agents, of the etcd nodes in the cluster gluing them together)
|
||||
- simplicity
|
||||
- testability - having failure cases covered is no good if you don't have
|
||||
a means to test the code that protects you from them
|
||||
|
||||
As a general rule, everything is implemented in the simplest way,
|
||||
for two reasons: one is that we get to see it working faster, and
|
||||
the other is that it's much easier to replace a simple system with
|
||||
a more complex one than it is to change a complex one. The current
|
||||
design will change, but the one that's there at the moment is small
|
||||
and easy to read, even if it makes you pull faces when you read it.
|
||||
|
||||
Your questions answered
|
||||
~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
How do I use it?
|
||||
----------------
|
||||
|
||||
There's a devstack plugin. You can add this plugin to your local.conf and see it working.
|
||||
You'll want to get VPP and the VPP Python bindings set up on the host before you do that.
|
||||
I haven't written up the instructions yet but they're coming.
|
||||
|
||||
To get the best performance, this will use vhostuser sockets to talk to VMs, which means you
|
||||
need a modern version of your OS (CentOS 7 and Ubuntu 16.04 look good). It also means that
|
||||
you need to run your VMs with a special flavor that enables shared memory - basically, you
|
||||
need to set up hugepages for your VMs, as that's the only supported way Nova does this
|
||||
today. Because you're using pinned shared memory you are going to find you can't
|
||||
overcommit memory on the target machine.
|
||||
|
||||
I've tested this on Ubuntu 16.04. Others have tried it with CentOS
|
||||
7. You may need to disable libvirt security for qemu, as libvirt
|
||||
doesn't play well with vhostuser sockets in its default setup.
|
||||
CentOS testing is ongoing. An initial CentOS 7 guide can be found
|
||||
at `<CENTOS_7-guide.rst>`_
|
||||
|
||||
What overlays does it support?
|
||||
------------------------------
|
||||
|
||||
Today, it supports VLANs and flat networks.
|
||||
|
||||
How does it work?
|
||||
-----------------
|
||||
|
||||
VPP has physical interfaces nominated as the physical networks. In
|
||||
common with other Neutron drivers, each physical network can be
|
||||
used as a flat or VLAN network, for either fully virtual tenant
|
||||
networks or for provider networks. When a VLAN-overlay network is
|
||||
needed on a host, we create a subinterface with the selected VLAN
|
||||
(the typedriver chooses the VLAN, we don't do anything clever about
|
||||
that). For all networks, the agent makes a new bridge domain in
|
||||
VPP and puts the subinterface or host interface into it. Binding
|
||||
a port involves putting the port into the same bridge domain.
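To make the moving parts concrete, here is a toy in-memory model of the per-host state described above. It does not talk to VPP at all; ``FakeForwarder`` and its methods are hypothetical names invented for this sketch, and the use of the physical interface itself as the uplink for flat networks is an assumption.

```python
# Toy in-memory model of the per-host state described above.
# Nothing here talks to VPP; the class and method names are hypothetical.

class FakeForwarder:
    def __init__(self):
        self.bridge_domains = {}    # network id -> set of interface names
        self.subinterfaces = set()  # (physical interface, vlan) pairs

    def ensure_network(self, net_id, net_type, phys_if, vlan=None):
        """Create the bridge domain for a network if it is not there yet."""
        if net_id in self.bridge_domains:
            return
        if net_type == "vlan":
            # VLAN networks get a subinterface on the trunk interface.
            self.subinterfaces.add((phys_if, vlan))
            uplink = "%s.%d" % (phys_if, vlan)
        elif net_type == "flat":
            # Flat networks use the physical interface directly (assumed).
            uplink = phys_if
        else:
            raise ValueError("unsupported network type: %s" % net_type)
        self.bridge_domains[net_id] = {uplink}

    def bind_port(self, net_id, port_if):
        """Binding a port means adding it to the network's bridge domain."""
        self.bridge_domains[net_id].add(port_if)


fwd = FakeForwarder()
fwd.ensure_network("net-a", "vlan", "GigabitEthernet2/5/0", vlan=100)
fwd.bind_port("net-a", "vhost-port-1")
print(sorted(fwd.bridge_domains["net-a"]))
```

Calling ``ensure_network`` a second time for the same network is a no-op, which mirrors the idea that the bridge domain is created only when a network is first needed on a host.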
|
||||
|
||||
How does it implement binding?
|
||||
------------------------------
|
||||
|
||||
This mechanism driver takes two approaches. It doesn't do anything
|
||||
at all until Neutron needs to drop traffic on a compute host, so
|
||||
the only thing it's really interested in is ports. Making a network
|
||||
or a subnet doesn't do anything at all, so there are no entry points
|
||||
for the network and subnet operations. And it mainly interests
|
||||
itself in the process of binding: the bind calls determine if it
|
||||
has work to do, and the port postcommit calls make sure that, when
|
||||
a binding takes, the VPP forwarders in the system get appropriately
|
||||
programmed to put the traffic where Nova expects it.
|
||||
|
||||
There are a number of calls that a mechanism driver can implement. The
|
||||
precommit calls determine if a create, update or delete is acceptable and
|
||||
stop it before it hits the database; they can also update additional
|
||||
database tables. The postcommit calls allow you to act once the
|
||||
network change is permanently recorded. These two calls, on ports,
|
||||
are what we use.
|
||||
|
||||
In our case, we add a write to a journal table to the DB commit
|
||||
from within the precommit. This is not committed if the commit
|
||||
fails for other reasons.
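The point about the journal write sharing the fate of the commit can be shown with a small stand-alone sketch. It uses the standard library ``sqlite3`` module rather than Neutron's database layer, and the table names, column names and ``create_port`` helper are all made up for illustration; only the transactional behaviour is the thing being demonstrated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ports (id TEXT PRIMARY KEY, host TEXT)")
conn.execute("CREATE TABLE journal (seq INTEGER PRIMARY KEY AUTOINCREMENT, "
             "key TEXT, value TEXT)")
conn.commit()


def create_port(conn, port_id, host):
    """Write the port and its journal entry in a single transaction."""
    try:
        with conn:  # commits on success, rolls back on any exception
            # Journal write, standing in for the precommit hook.
            conn.execute("INSERT INTO journal (key, value) VALUES (?, ?)",
                         ("ports/" + port_id, host))
            # The main change; fails if the port id already exists.
            conn.execute("INSERT INTO ports (id, host) VALUES (?, ?)",
                         (port_id, host))
        return True
    except sqlite3.IntegrityError:
        return False


def journal_len(conn):
    """Number of journal rows that were actually committed."""
    return conn.execute("SELECT COUNT(*) FROM journal").fetchone()[0]


print(create_port(conn, "p1", "host-a"), journal_len(conn))
print(create_port(conn, "p1", "host-b"), journal_len(conn))
```

The second call fails on the duplicate port id, and the journal row written just before it is rolled back along with everything else, so the journal never describes a change that did not happen.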
|
||||
|
||||
The postcommit calls are where you should trigger actions based on an
|
||||
update that's now been accepted and saved (you can't back down at
|
||||
that point) - but the tenant is still waiting for an answer, so
|
||||
it's wise to be quick. In our case, we kick a background thread
|
||||
to push the journal log out, in order, to etcd. etcd then contains
|
||||
the desired state of each host agent, and the agents monitor etcd
|
||||
for changes relevant to them and update their state.
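A minimal sketch of that "kick a background thread" pattern follows. A plain dict stands in for etcd and a list stands in for the journal table, so none of this is the driver's real code; the key layout and every function name are assumptions chosen for the example.

```python
import threading

journal = []                  # stands in for the journal table, oldest first
journal_lock = threading.Lock()
fake_etcd = {}                # stands in for the etcd key space
pushed_order = []             # records the order in which keys were written
kick = threading.Event()      # set by postcommit to wake the worker
stop = threading.Event()


def precommit(key, value):
    """Record the change in the journal (part of the DB commit in reality)."""
    with journal_lock:
        journal.append((key, value))


def postcommit():
    """Return quickly: just wake the worker, which does the slow push."""
    kick.set()


def drain():
    """Push every pending journal entry to the fake etcd, oldest first."""
    while True:
        with journal_lock:
            if not journal:
                return
            key, value = journal.pop(0)
        fake_etcd[key] = value
        pushed_order.append(key)


def worker():
    while not stop.is_set():
        kick.wait(timeout=0.1)
        kick.clear()
        drain()
    drain()  # final pass so nothing journalled before shutdown is lost


thread = threading.Thread(target=worker)
thread.start()
for n in range(3):
    precommit("/ports/host-a/port-%d" % n, "bound")
    postcommit()
stop.set()
kick.set()
thread.join()
print(pushed_order)
```

Using a single worker is what keeps the pushes in journal order here; the API caller only ever pays for appending to the journal and setting an event.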
|
||||
|
||||
To ensure binding is done correctly, we send Nova a notification
|
||||
only when the agent has definitely created the structures in VPP
|
||||
necessary for the port to work. This is generally a good idea but
|
||||
for vhost-user connections it's particularly important as QEMU goes
|
||||
into a funny state if you start it with vhost-user sockets that
|
||||
don't connect immediately. This state tends to confuse libvirt and
|
||||
nova; for that reason, we recommend you make VIF plugging failures
|
||||
fatal with the relevant Nova config option, so that a VM is never
|
||||
started with ports that haven't been properly bound and configured.
|
||||
|
||||
Additionally, there are some helper calls to determine if this
|
||||
mechanism driver, in conjunction with the other ones on the system,
|
||||
needs to do anything. In some cases it may not be responsible for the
|
||||
port at all.
|
||||
|
||||
|
||||
How does it talk to VPP?
|
||||
------------------------
|
||||
|
||||
This uses the Python module Ole Troan added to VPP to interface with the
|
||||
forwarder. VPP has an admin channel, implemented as a couple of shared
|
||||
memory queues, to exchange control messages with VPP. The Python bindings
|
||||
are a very thin layer between that shared memory system and a set of Python
|
||||
APIs.
|
||||
|
||||
Note that VPP runs as root and so the shared memory buffers are protected
|
||||
and need root credentials to access, so the agent also runs as root. It
|
||||
rather inelegantly coredumps if it doesn't have root privileges.
|
||||
|
||||
What does it support?
|
||||
------------------------
|
||||
|
||||
For now, assume it moves packets to where they need to go. It also integrates
|
||||
properly with ML2 L3, DHCP and Metadata functionality.
|
||||
|
||||
The main notable absence at this point is security groups.
|
||||
|
||||
What are you doing next?
|
||||
------------------------
|
||||
|
||||
Security groups - this requires some additional functionality in VPP to work,
|
||||
so we're currently waiting on that to be committed upstream.
|
||||
|
||||
We're considering how to add TAP-as-a-Service functionality
|
||||
to the system so that you can prove, to your own satisfaction, that
|
||||
the networking is operating correctly and your app is broken :)
|
||||
|
||||
What else needs fixing?
|
||||
-----------------------
|
||||
|
||||
There is a long list of items where you can help. If you want a slow
|
||||
introduction to the code, read it! It's not very big and it has comments and
|
||||
everything. Among those comments you'll find several TODO comments where we
|
||||
have opinions about shortcuts that we took that need revisiting; if you want
|
||||
a go at changing the code, those TODO statements are a really good place to
|
||||
start.
|
||||
|
||||
That aside, you could attempt to get VXLAN working, or you could
|
||||
look at tidying up the VPP API in the OpenVPP codebase, or you could add a
|
||||
working memory to the VPP agent (perhaps by adding user-data to the VPP API
|
||||
so that the agent could annotate the ports with its own information).
|
||||
|
||||
Firewalling and security groups are a big area where it's lacking.
|
||||
If you're moving packets around fast and you're using secure components in
|
||||
your VMs they don't matter so much (and this is quite common in NFV scenarios)
|
||||
but to make this useful for everything the driver needs to implement basic
|
||||
anti-spoof firewalling, security groups, and also the allowed-address-pair
|
||||
and portsecurity extensions so that security can be turned down when the
|
||||
application needs something different. VPP has ACLs, but the VPP team are
|
||||
looking at improving that functionality and we're currently waiting for the
|
||||
next version of the code and a hopefully more convenient API to use.
|
||||
If you do think of doing work on this, remember that when you change
|
||||
a security group you might be changing the firewalling on lots of
|
||||
ports - on lots of servers - all at the same time.
|
||||
|
||||
Per above, VPP's comms channel with control planes is privileged, and so is the
|
||||
channel for making vhost-user connections (you need to know the credentials that
|
||||
libvirt uses). If it weren't for those two things, the agent wouldn't need any
|
||||
special system rights and could run as a normal user. This could be fixed (by
|
||||
getting VPP to drop the privs on the shared memory and by using e.g. a setgid
|
||||
directory to talk to VPP, respectively).
|
||||
|
||||
Why didn't you use the ML2 agent framework for this driver?
|
||||
-----------------------------------------------------------
|
||||
|
||||
Neutron's agent framework is based on communicating via RabbitMQ. This can
|
||||
lead to issues of scale when there are more than a few compute hosts involved,
|
||||
and RabbitMQ is not as robust as it could be, plus RabbitMQ is trying to be a
|
||||
fully reliable messaging system - all of which work against a robust and
|
||||
scalable SDN control system.
|
||||
|
||||
We didn't want to start down that path, so instead we've taken a different
|
||||
approach, that of a 'desired state' database with change listeners. etcd
|
||||
stores the data of how the network should be and the agents try to achieve that (and also report
|
||||
their status back via etcd). One nice feature of this is that anyone can
|
||||
check how well the system is working - both sorts of update can be watched in
|
||||
real time with the command
|
||||
|
||||
etcdctl watch --recursive --forever /
|
||||
|
||||
The driver and agents should deal with disconnections across the
|
||||
board, and the agents know that they must resync themselves with
|
||||
the desired state when they completely lose track of what's happening.
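The resync step amounts to comparing the desired state with what the forwarder currently has and working out the difference. The sketch below shows one way that comparison could look, assuming both sides can be expressed as a mapping from port id to configuration; ``plan_resync`` and the sample data are hypothetical and are not taken from the agent.

```python
def plan_resync(desired, actual):
    """Return (to_add, to_remove, to_update) to move actual to desired.

    Both arguments map a port id to its configuration dict.
    """
    to_add = sorted(k for k in desired if k not in actual)
    to_remove = sorted(k for k in actual if k not in desired)
    to_update = sorted(k for k in desired
                       if k in actual and desired[k] != actual[k])
    return to_add, to_remove, to_update


desired = {
    "port-1": {"network": "net-a", "vlan": 100},
    "port-2": {"network": "net-a", "vlan": 100},
    "port-3": {"network": "net-b", "vlan": 101},
}
actual = {
    "port-2": {"network": "net-a", "vlan": 100},
    "port-3": {"network": "net-b", "vlan": 102},  # stale VLAN
    "port-9": {"network": "net-z", "vlan": 150},  # no longer wanted
}
print(plan_resync(desired, actual))
```

Because the plan is computed from the two states rather than from a history of messages, an agent that has missed any number of updates can still converge on the right answer.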
|
@ -0,0 +1,144 @@
|
||||
# plugin - DevStack plugin.sh dispatch script for vpp
|
||||
|
||||
vpp_debug() {
|
||||
if [ ! -z "$VPP_DEVSTACK_DEBUG" ] ; then
|
||||
"$@" || true # a debug command failing is not a failure
|
||||
fi
|
||||
}
|
||||
|
||||
# For debugging purposes, highlight vpp sections
|
||||
vpp_debug tput setab 1
|
||||
|
||||
name=networking-vpp
|
||||
|
||||
# All machines using the VPP mechdriver and agent
|
||||
function pre_install_vpp {
|
||||
:
|
||||
}
|
||||
|
||||
function install_vpp {
|
||||
cd "$MECH_VPP_DIR"
|
||||
echo "Installing networking-vpp"
|
||||
setup_develop "$MECH_VPP_DIR"
|
||||
}
|
||||
|
||||
function init_vpp {
|
||||
:
|
||||
}
|
||||
|
||||
function configure_vpp {
|
||||
iniset /$Q_PLUGIN_CONF_FILE ml2_vpp agents $MECH_VPP_AGENTLIST
|
||||
iniset /$Q_PLUGIN_CONF_FILE ml2_vpp physnets $MECH_VPP_PHYSNETLIST
|
||||
|
||||
if [ ! -z "$VXLAN_SRC_ADDR" ] ; then
|
||||
iniset /$Q_PLUGIN_CONF_FILE ml2_vpp vxlan_src_addr $VXLAN_SRC_ADDR
|
||||
fi
|
||||
|
||||
if [ ! -z "$VXLAN_BCAST_ADDR" ] ; then
|
||||
iniset /$Q_PLUGIN_CONF_FILE ml2_vpp vxlan_bcast_addr $VXLAN_BCAST_ADDR
|
||||
fi
|
||||
|
||||
if [ ! -z "$VXLAN_VRF" ] ; then
|
||||
iniset /$Q_PLUGIN_CONF_FILE ml2_vpp vxlan_vrf $VXLAN_VRF
|
||||
fi
|
||||
|
||||
if [ ! -z "$QEMU_USER" ] ; then
|
||||
iniset /$Q_PLUGIN_CONF_FILE ml2_vpp qemu_user $QEMU_USER
|
||||
fi
|
||||
|
||||
if [ ! -z "$QEMU_GROUP" ] ; then
|
||||
iniset /$Q_PLUGIN_CONF_FILE ml2_vpp qemu_group $QEMU_GROUP
|
||||
fi
|
||||
}
|
||||
|
||||
function shut_vpp_down {
|
||||
:
|
||||
}
|
||||
|
||||
|
||||
# The VPP control plane element (we don't at this point start VPP itself TODO)
|
||||
|
||||
agent_service_name=vpp-agent
|
||||
|
||||
function pre_install_vpp_agent {
|
||||
:
|
||||
}
|
||||
|
||||
function install_vpp_agent {
|
||||
:
|
||||
}
|
||||
|
||||
function init_vpp_agent {
|
||||
# sudo for now, as it needs to connect to VPP and for that requires root privs
|
||||
# to share its shmem comms channel
|
||||
run_process $agent_service_name "sudo $VPP_CP_BINARY --config-file /$Q_PLUGIN_CONF_FILE"
|
||||
}
|
||||
|
||||
function configure_vpp_agent {
|
||||
:
|
||||
}
|
||||
|
||||
function shut_vpp_agent_down {
|
||||
stop_process $agent_service_name
|
||||
}
|
||||
|
||||
|
||||
|
||||
agent_do() {
|
||||
if is_service_enabled "$agent_service_name"; then
|
||||
"$@"
|
||||
fi
|
||||
}
|
||||
|
||||
if [[ "$1" == "stack" && "$2" == "pre-install" ]]; then
|
||||
# Set up system services
|
||||
echo_summary "Configuring system services $name"
|
||||
pre_install_vpp
|
||||
agent_do pre_install_vpp_agent
|
||||
|
||||
elif [[ "$1" == "stack" && "$2" == "install" ]]; then
|
||||
# Perform installation of service source
|
||||
echo_summary "Installing $name"
|
||||
install_vpp
|
||||
agent_do install_vpp_agent
|
||||
|
||||
elif [[ "$1" == "stack" && "$2" == "post-config" ]]; then
|
||||
# Configure after the other layer 1 and 2 services have been configured
|
||||
echo_summary "Configuring $name"
|
||||
configure_vpp
|
||||
agent_do configure_vpp_agent
|
||||
|
||||
elif [[ "$1" == "stack" && "$2" == "extra" ]]; then
|
||||
# Initialize and start the service
|
||||
echo_summary "Initializing $name"
|
||||
init_vpp
|
||||
agent_do init_vpp_agent
|
||||
fi
|
||||
|
||||
if [[ "$1" == "unstack" ]]; then
|
||||
# Shut down services
|
||||
shut_vpp_down
|
||||
agent_do shut_vpp_agent_down
|
||||
fi
|
||||
|
||||
if [[ "$1" == "clean" ]]; then
|
||||
# Remove state and transient data
|
||||
# Remember clean.sh first calls unstack.sh
|
||||
# no-op
|
||||
:
|
||||
fi
|
||||
vpp_debug tput setab 9
|
||||
|
||||
function neutron_plugin_install_agent_packages {
|
||||
install_package bridge-utils
|
||||
}
|
||||
|
||||
function neutron_plugin_configure_l3_agent {
|
||||
:
|
||||
}
|
||||
|
||||
# We have opinions on the interface driver that should attach agents
|
||||
function neutron_plugin_setup_interface_driver {
|
||||
local conf_file=$1
|
||||
iniset $conf_file DEFAULT interface_driver linuxbridge
|
||||
}
|
@ -0,0 +1,12 @@
|
||||
enable_service vpp-agent
|
||||
|
||||
|
||||
MECH_VPP_DIR="$DEST/networking-vpp"
|
||||
MECH_VPP_BIN_DIR=$(get_python_exec_prefix)
|
||||
VPP_CP_BINARY="$MECH_VPP_BIN_DIR/vpp-agent"
|
||||
|
||||
MECH_VPP_PHYSNETLIST=${MECH_VPP_PHYSNETLIST:-physnet:GigabitEthernet2/2/0}
|
||||
|
||||
VXLAN_SRC_ADDR=${VXLAN_SRC_ADDR:-}
|
||||
VXLAN_BCAST_ADDR=${VXLAN_BCAST_ADDR:-}
|
||||
VXLAN_VRF=${VXLAN_VRF:-1}
|
@ -0,0 +1,75 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
# Licensed under the Apache License, Version 2.0 (the "License");
|
||||
# you may not use this file except in compliance with the License.
|
||||
# You may obtain a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS,
|
||||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
|
||||
# implied.
|
||||
# See the License for the specific language governing permissions and
|
||||
# limitations under the License.
|
||||
|
||||
import os
|
||||
import sys
|
||||
|
||||
sys.path.insert(0, os.path.abspath('../..'))
|
||||
# -- General configuration ----------------------------------------------------
|
||||
|
||||
# Add any Sphinx extension module names here, as strings. They can be
|
||||
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom ones.
|
||||
extensions = [
|
||||
'sphinx.ext.autodoc',
|
||||
#'sphinx.ext.intersphinx',
|
||||
'oslosphinx'
|
||||
]
|
||||
|
||||
# autodoc generation is a bit aggressive and a nuisance when doing heavy
|
||||
# text edit cycles.
|
||||
# execute "export SPHINX_DEBUG=1" in your terminal to disable
|
||||
|
||||
# The suffix of source filenames.
|
||||
source_suffix = '.rst'
|
||||
|
||||
# The master toctree document.
|
||||
master_doc = 'index'
|
||||
|
||||
# General information about the project.
|
||||
project = u'networking-vpp'
|
||||
copyright = u'2016, OpenStack Foundation'
|
||||
|
||||
# If true, '()' will be appended to :func: etc. cross-reference text.
|
||||
add_function_parentheses = True
|
||||
|
||||
# If true, the current module name will be prepended to all description
|
||||
# unit titles (such as .. function::).
|
||||
add_module_names = True
|
||||
|
||||
# The name of the Pygments (syntax highlighting) style to use.
|
||||
pygments_style = 'sphinx'
|
||||
|
||||
# -- Options for HTML output --------------------------------------------------
|
||||
|
||||
# The theme to use for HTML and HTML Help pages. Major themes that come with
|
||||
# Sphinx are currently 'default' and 'sphinxdoc'.
|
||||
# html_theme_path = ["."]
|
||||
# html_theme = '_theme'
|
||||
# html_static_path = ['static']
|
||||
|
||||
# Output file base name for HTML help builder.
|
||||
htmlhelp_basename = '%sdoc' % project
|
||||
|
||||
# Grouping the document tree into LaTeX files. List of tuples
|
||||
# (source start file, target name, title, author, documentclass
|
||||
# [howto/manual]).
|
||||
latex_documents = [
|
||||
('index',
|
||||
'%s.tex' % project,
|
||||
u'%s Documentation' % project,
|
||||
u'OpenStack Foundation', 'manual'),
|
||||
]
|
||||
|
||||
# Example configuration for intersphinx: refer to the Python standard library.
|
||||
#intersphinx_mapping = {'http://docs.python.org/': None}
|
@ -0,0 +1,4 @@
|
||||
============
|
||||
Contributing
|
||||
============
|
||||
.. include:: ../../CONTRIBUTING.rst
|
@ -0,0 +1,35 @@
|
||||
..
|
||||
Licensed under the Apache License, Version 2.0 (the "License"); you may
|
||||
not use this file except in compliance with the License. You may obtain
|
||||
a copy of the License at
|
||||
|
||||
http://www.apache.org/licenses/LICENSE-2.0
|
||||
|
||||
Unless required by applicable law or agreed to in writing, software
|
||||
distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
|
||||
WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
|
||||
License for the specific language governing permissions and limitations
|
||||
under the License.
|
||||
|
||||
|
||||
Developer Guide
|
||||
===============
|
||||
|
||||
In the Developer Guide, you will find information on networking-vpp's lower
|
||||
level design and implementation details.
|
||||
|
||||
|
||||
Contents:
|
||||
--------------------------------
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
|
||||
|
||||
|
||||
Indices and tables
|
||||
==================
|
||||
|
||||
* :ref:`genindex`
|
||||
* :ref:`modindex`
|
||||
* :ref:`search`
|
||||
|
@ -0,0 +1,33 @@
|
||||
.. networking-vpp documentation master file, created by
|
||||
sphinx-quickstart on Tue Jul 9 22:26:36 2013.
|
||||
You can adapt this file completely to your liking, but it should at least
|
||||
contain the root `toctree` directive.
|
||||
|
||||
Welcome to networking-vpp's documentation!
|
||||
========================================================
|
||||
|
||||
Contents:
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
|
||||
readme
|
||||
installation
|
||||
contributing
|
||||
|
||||
Developer Docs
|
||||
==============
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 1
|
||||
|
||||
devref/index
|
||||
|
||||
|
||||
Indices and tables
|
||||
==================
|
||||
|
||||
* :ref:`genindex`
|
||||
* :ref:`modindex`
|
||||
* :ref:`search`
|
||||
|
@ -0,0 +1,12 @@
|
||||
============
|
||||
Installation
|
||||
============
|
||||
|
||||
At the command line::
|
||||
|
||||
$ pip install networking-vpp
|
||||
|
||||
Or, if you have virtualenvwrapper installed::
|
||||
|
||||
$ mkvirtualenv networking-vpp
|
||||
$ pip install networking-vpp
|
@ -0,0 +1 @@
|
||||
.. include:: ../../README.rst
|
@ -0,0 +1,2 @@
|
||||
[ml2_vpp]
|
||||
|
@ -0,0 +1,564 @@
|
||||
# Copyright (c) 2016 Cisco Systems, Inc.
|
||||
# All Rights Reserved
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License"); you may
|
||||
# not use this file except in compliance with the License. You may obtain
|
||||
# a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
|
||||
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
|
||||
# License for the specific language governing permissions and limitations
|
||||
# under the License.
|
||||
|
||||
|
||||
# This is the VPP agent. It watches an etcd instance for port binding
# requests from the ML2 mechanism driver, programs VPP to match using
# the VPP Python API, and writes the resulting port state back to etcd.

# Note that it does *NOT* at this point keep persistent state, so
# restarting this process will make it forget every network and port
# it has learned (the data is held in memory in the VPPForwarder
# 'networks' and 'interfaces' dicts).
|
||||
|
||||
import distro
|
||||
import etcd
|
||||
import json
|
||||
import os
|
||||
import re
|
||||
import sys
|
||||
from threading import Thread
|
||||
import time
|
||||
import traceback
|
||||
import vpp
|
||||
|
||||
from networking_vpp import config_opts
|
||||
from neutron.agent.linux import bridge_lib
|
||||
from neutron.agent.linux import ip_lib
|
||||
from neutron.common import constants as n_const
|
||||
from oslo_config import cfg
|
||||
from oslo_log import log as logging
|
||||
|
||||
LOG = logging.getLogger(__name__)
|
||||
|
||||
|
||||
# config_opts is required to configure the options within it, but
|
||||
# not referenced from here, so shut up tox:
|
||||
assert config_opts
|
||||
|
||||
######################################################################
|
||||
|
||||
# This mirrors functionality in Neutron so that we're creating a name
|
||||
# that Neutron can find for its agents.
|
||||
|
||||
DEV_NAME_PREFIX = n_const.TAP_DEVICE_PREFIX
|
||||
|
||||
|
||||
def get_tap_name(uuid):
|
||||
return n_const.TAP_DEVICE_PREFIX + uuid[0:11]
|
||||
|
||||
# This mirrors functionality in Nova so that we're creating a vhostuser
|
||||
# name that it will be able to locate
|
||||
|
||||
VHOSTUSER_DIR = '/tmp'
|
||||
|
||||
|
||||
def get_vhostuser_name(uuid):
|
||||
return os.path.join(VHOSTUSER_DIR, uuid)
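The two naming helpers above can be tried in isolation. This is a minimal sketch, assuming Neutron's `TAP_DEVICE_PREFIX` is the string `'tap'` (hard-coded here so the snippet runs without Neutron installed). `IFNAMSIZ_MAX` and the sample port UUID are made up for illustration.

```python
import os

# Assumption: Neutron's TAP_DEVICE_PREFIX is 'tap'; hard-coded so this
# sketch has no Neutron dependency.
TAP_DEVICE_PREFIX = 'tap'
VHOSTUSER_DIR = '/tmp'
IFNAMSIZ_MAX = 15  # longest usable Linux interface name (illustrative constant)


def get_tap_name(uuid):
    # 3-character prefix + first 11 characters of the port UUID = 14 characters
    return TAP_DEVICE_PREFIX + uuid[0:11]


def get_vhostuser_name(uuid):
    # The vhostuser socket path is the full port UUID under VHOSTUSER_DIR
    return os.path.join(VHOSTUSER_DIR, uuid)


port_id = '0a1b2c3d-4e5f-6789-abcd-ef0123456789'  # made-up port UUID
print(get_tap_name(port_id))
print(get_vhostuser_name(port_id))
```

Truncating the UUID keeps the TAP name within the kernel's interface name limit, while the vhostuser path can carry the whole UUID because it is a filesystem path rather than an interface name.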
|
||||
|
||||
|
||||
def get_distro_family():
|
||||
if distro.id() in ['rhel', 'centos', 'fedora']:
|
||||
return 'redhat'
|
||||
else:
|
||||
return distro.id()
|
||||
|
||||
|
||||
def get_qemu_default():
|
||||
distro = get_distro_family()
|
||||
if distro == 'redhat':
|
||||
qemu_user = 'qemu'
|
||||
qemu_group = 'qemu'
|
||||
elif distro == 'ubuntu':
|
||||
qemu_user = 'libvirt-qemu'
|
||||
qemu_group = 'libvirtd'
|
||||
else:
|
||||
# let's just try libvirt-qemu for now, maybe we should instead
|
||||
# print error message and exit?
|
||||
qemu_user = 'libvirt-qemu'
|
||||
qemu_group = 'kvm'
|
||||
|
||||
return (qemu_user, qemu_group)
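The distribution defaults chosen by `get_qemu_default()` can also be read as a lookup table. This is a minimal sketch, assuming the distribution id is passed in as a string rather than read from the `distro` package. `QEMU_DEFAULTS`, `FALLBACK`, `REDHAT_IDS`, `distro_family` and `qemu_defaults` are illustrative names that are not part of this commit; the user and group values are the ones used above.

```python
# Hypothetical table-driven variant of get_qemu_default(); the values are
# copied from the function above, the names are illustrative only.
REDHAT_IDS = ('rhel', 'centos', 'fedora')

QEMU_DEFAULTS = {
    'redhat': ('qemu', 'qemu'),
    'ubuntu': ('libvirt-qemu', 'libvirtd'),
}

# Used for any distribution family without an explicit entry
FALLBACK = ('libvirt-qemu', 'kvm')


def distro_family(distro_id):
    # Collapse the Red Hat derivatives into one family name
    return 'redhat' if distro_id in REDHAT_IDS else distro_id


def qemu_defaults(distro_id):
    # Return (user, group) for the given distribution id
    return QEMU_DEFAULTS.get(distro_family(distro_id), FALLBACK)


for name in ('centos', 'ubuntu', 'debian'):
    print(name, qemu_defaults(name))
```

Taking the id as a parameter keeps the mapping testable on any host, whatever distribution the test itself runs on.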
|
||||
|
||||
|
||||
######################################################################
|
||||
|
||||
|
||||
class VPPForwarder(object):
|
||||
|
||||
def __init__(self,
|
||||
physnets, # physnet_name: interface-name
|
||||
vxlan_src_addr=None,
|
||||
vxlan_bcast_addr=None,
|
||||
vxlan_vrf=None,
|
||||
qemu_user=None,
|
||||
qemu_group=None):
|
||||
self.vpp = vpp.VPPInterface(LOG)
|
||||
|
||||
self.physnets = physnets
|
||||
|
||||
self.qemu_user = qemu_user
|
||||
self.qemu_group = qemu_group
|
||||
|
||||
# This is the address we'll use if we plan on broadcasting
|
||||
# vxlan packets
|
||||
self.vxlan_bcast_addr = vxlan_bcast_addr
|
||||
self.vxlan_src_addr = vxlan_src_addr
|
||||
self.vxlan_vrf = vxlan_vrf
|
||||
# Used as a unique number for bridge IDs
|
||||
self.next_bridge_id = 5678
|
||||
|
||||
self.networks = {} # (physnet, type, ID): datastruct
|
||||
self.interfaces = {} # uuid: if idx
|
||||
|
||||
def get_vpp_ifidx(self, if_name):
|
||||
"""Return VPP's interface index value for the network interface"""
|
||||
if self.vpp.get_interface(if_name):
|
||||
return self.vpp.get_interface(if_name).sw_if_index
|
||||
else:
|
||||
LOG.error("Error obtaining interface data from vpp "
|
||||
"for interface:%s" % if_name)
|
||||
return None
|
||||
|
||||
def get_interface(self, physnet):
|
||||
return self.physnets.get(physnet, None)
|
||||
|
||||
def new_bridge_domain(self):
|
||||
x = self.next_bridge_id
|
||||
self.vpp.create_bridge_domain(x)
|
||||
self.next_bridge_id += 1
|
||||
return x
|
||||
|
||||
def network_on_host(self, physnet, net_type, seg_id=None):
|
||||
"""Find or create a network of the type required"""
|
||||
|
||||
if (physnet, net_type, seg_id) not in self.networks:
|
||||
self.create_network_on_host(physnet, net_type, seg_id)
|
||||
return self.networks.get((physnet, net_type, seg_id), None)
|
||||
|
||||
def create_network_on_host(self, physnet, net_type, seg_id):
|
||||
intf = self.get_interface(physnet)
|
||||
if intf is None:
|
||||
LOG.error("Error: no physnet found")
|
||||
return None
|
||||
|
||||
ifidx = self.get_vpp_ifidx(intf)
|
||||
|
||||
# TODO(ijw): bridge domains have no distinguishing marks.
|
||||
# VPP needs to allow us to name or label them so that we
|
||||
# can find them when we restart. If we add an interface
|
||||
# to two bridges that will likely not do as required
|
||||
|
||||
if net_type == 'flat':
|
||||
if_upstream = ifidx
|
||||
|
||||
LOG.debug('Adding upstream interface-idx:%s-%s to bridge '
|
||||
'for flat networking' % (intf, if_upstream))
|
||||
|
||||
elif net_type == 'vlan':
|
||||
self.vpp.ifup(ifidx)
|
||||
|
||||
LOG.debug('Adding upstream VLAN interface %s.%s '
|
||||
'to bridge for vlan networking' % (intf, seg_id))
|
||||
if not self.vpp.get_interface('%s.%s' % (intf, seg_id)):
|
||||
if_upstream = self.vpp.create_vlan_subif(ifidx,
|
||||
seg_id)
|
||||
else:
|
||||
if_upstream = self.get_vpp_ifidx('%s.%s' % (intf, seg_id))
|
||||
# elif net_type == 'vxlan':
|
||||
# # NB physnet not really used here
|
||||
# if_upstream = \
|
||||
# self.vpp.create_srcrep_vxlan_subif(self, self.vxlan_vrf,
|
||||
# self.vxlan_src_addr,
|
||||
# self.vxlan_bcast_addr,
|
||||
# seg_id)
|
||||
else:
|
||||
raise Exception('network type %s not supported' % net_type)
|
||||
|
||||
self.vpp.ifup(if_upstream)
|
||||
|
||||
id = self.new_bridge_domain()
|
||||
|
||||
self.vpp.add_to_bridge(id, if_upstream)
|
||||
self.networks[(physnet, net_type, seg_id)] = {
|
||||
'bridge_domain_id': id,
|
||||
'if_upstream': intf,
|
||||
'if_upstream_idx': if_upstream,
|
||||
'network_type': net_type,
|
||||
'segmentation_id': seg_id,
|
||||
}
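The find-or-create behaviour of `network_on_host()` and the bridge domain numbering can be sketched without a running VPP. This is a simplified illustration, not the agent's code: `FakeVPP` and `NetworkCache` are hypothetical stand-ins, and the upstream interface handling is left out entirely.

```python
class FakeVPP(object):
    """Hypothetical stand-in for the VPP wrapper; records created bridge domains."""

    def __init__(self):
        self.bridge_domains = []

    def create_bridge_domain(self, bd_id):
        self.bridge_domains.append(bd_id)


class NetworkCache(object):
    """Illustrative reduction of VPPForwarder's network bookkeeping."""

    def __init__(self, vpp):
        self.vpp = vpp
        self.next_bridge_id = 5678  # same starting value as the agent
        self.networks = {}          # (physnet, type, seg_id): data

    def new_bridge_domain(self):
        bd_id = self.next_bridge_id
        self.vpp.create_bridge_domain(bd_id)
        self.next_bridge_id += 1
        return bd_id

    def network_on_host(self, physnet, net_type, seg_id=None):
        # Create the network on first use, then hand back the cached entry
        key = (physnet, net_type, seg_id)
        if key not in self.networks:
            self.networks[key] = {
                'bridge_domain_id': self.new_bridge_domain(),
                'network_type': net_type,
                'segmentation_id': seg_id,
            }
        return self.networks[key]


cache = NetworkCache(FakeVPP())
first = cache.network_on_host('physnet', 'vlan', 100)
again = cache.network_on_host('physnet', 'vlan', 100)
print(first is again, first['bridge_domain_id'])
```

Because the cache lives only in memory, a restarted agent starts numbering from 5678 again, which is the restart problem the TODO in `create_network_on_host()` describes.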
|
||||
|
||||
def delete_network_on_host(self, physnet, net_type, seg_id=None):
|
||||
net = self.networks.get((physnet, net_type, seg_id), None)
|
||||
if net is not None:
|
||||
|
||||
self.vpp.delete_bridge_domain(net['bridge_domain_id'])
|
||||
|
||||
# We leave the interface up. Other networks may be using it
|
||||
else:
|
||||
LOG.error("Delete Network: network is unknown "
|
||||
"to agent")
|
||||
|
||||
########################################
|
||||
# stolen from LB driver
|
||||
def _bridge_exists_and_ensure_up(self, bridge_name):
|
||||
"""Check if the bridge exists and make sure it is up."""
|
||||
br = ip_lib.IPDevice(bridge_name)
|
||||
br.set_log_fail_as_error(False)
|
||||
try:
|
||||
# If the device doesn't exist this will throw a RuntimeError
|
||||
br.link.set_up()
|
||||
except RuntimeError:
|
||||
return False
|
||||
return True
|
||||
|
||||
def ensure_bridge(self, bridge_name):
|
||||
"""Create a bridge unless it already exists."""
|
||||
# _bridge_exists_and_ensure_up instead of device_exists is used here
|
||||
# because there are cases where the bridge exists but it's not UP,
|
||||
# for example:
|
||||
# 1) A greenthread was executing this function and had not yet executed
|
||||
# "ip link set bridge_name up" before eventlet switched to this
|
||||
# thread running the same function
|
||||
# 2) The Nova VIF driver was running concurrently and had just created
|
||||
# the bridge, but had not yet put it UP
|
||||
if not self._bridge_exists_and_ensure_up(bridge_name):
|
||||
bridge_device = bridge_lib.BridgeDevice.addbr(bridge_name)
|
||||
if bridge_device.setfd(0):
|
||||
return
|
||||
if bridge_device.disable_stp():
|
||||
return
|
||||
if bridge_device.disable_ipv6():
|
||||
return
|
||||
if bridge_device.link.set_up():
|
||||
return
|
||||
else:
|
||||
bridge_device = bridge_lib.BridgeDevice(bridge_name)
|
||||
return bridge_device
|
||||
|
||||
# TODO(ijw): should be checking this all succeeded
|
||||
|
||||
# end theft
|
||||
########################################
|
||||
|
||||
# TODO(njoy): make wait_time configurable
|
||||
# TODO(ijw): needs to be one thread for all waits
|
||||
def add_external_tap(self, device_name, bridge, bridge_name):
|
||||
"""Add an externally created TAP device to the bridge
|
||||
|
||||
Wait for the external tap device to be created by the DHCP agent.
|
||||
When the tap device is ready, add it to the bridge. Run as a thread
|
||||
so REST call can return before this code completes its
|
||||
execution.
|
||||
|
||||
"""
|
||||
wait_time = 60
|
||||
found = False
|
||||
while wait_time > 0:
|
||||
if ip_lib.device_exists(device_name):
|
||||
LOG.debug('External tap device %s found!'
|
||||
% device_name)
|
||||
LOG.debug('Bridging tap interface %s on %s'
|
||||
% (device_name, bridge_name))
|
||||
if not bridge.owns_interface(device_name):
|
||||
bridge.addif(device_name)
|
||||
else:
|
||||
LOG.debug('Interface: %s is already added '
|
||||
'to the bridge %s' %
|
||||
(device_name, bridge_name))
|
||||
found = True
|
||||
break
|
||||
else:
|
||||
time.sleep(2)
|
||||
wait_time -= 2
|
||||
if not found:
|
||||
LOG.error('Failed waiting for external tap device:%s'
|
||||
% device_name)
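The polling loop in `add_external_tap()` follows a common wait-with-timeout pattern. A minimal sketch of that pattern, assuming a hypothetical helper `wait_for` with an injectable `sleep` function so that it can be exercised without real delays; in the agent the predicate would be the `ip_lib.device_exists(device_name)` check.

```python
import time


def wait_for(predicate, timeout=60, interval=2, sleep=time.sleep):
    """Poll predicate() every `interval` seconds for up to `timeout` seconds.

    Returns True as soon as the predicate holds, False if time runs out.
    Hypothetical helper; defaults mirror the 60s/2s values used above.
    """
    remaining = timeout
    while remaining > 0:
        if predicate():
            return True
        sleep(interval)
        remaining -= interval
    return False


# Usage with a fake sleep that only records the requested delays
delays = []
attempts = {'count': 0}


def device_ready():
    # Pretend the device appears on the third check
    attempts['count'] += 1
    return attempts['count'] >= 3


print(wait_for(device_ready, timeout=10, interval=2, sleep=delays.append))
print(delays)
```

Passing the timeout and interval as parameters is one way to address the "make wait_time configurable" TODO above.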
|
||||
|
||||
def create_interface_on_host(self, if_type, uuid, mac):
|
||||
if uuid in self.interfaces:
|
||||
LOG.debug('port %s repeat binding request - ignored' % uuid)
|
||||
else:
|
||||
LOG.debug('binding port %s as type %s' %
|
||||
(uuid, if_type))
|
||||
|
||||
# TODO(ijw): naming not obviously consistent with
|
||||
# Neutron's naming
|
||||
name = uuid[0:11]
|
||||
bridge_name = 'br-' + name
|
||||
tap_name = 'tap' + name
|
||||
|
||||
if if_type == 'maketap' or if_type == 'plugtap':
|
||||
if if_type == 'maketap':
|
||||
iface_idx = self.vpp.create_tap(tap_name, mac)
|
||||
props = {'name': tap_name}
|
||||
else:
|
||||
int_tap_name = 'vpp' + name
|
||||
|
||||
props = {'bridge_name': bridge_name,
|
||||
'ext_tap_name': tap_name,
|
||||
'int_tap_name': int_tap_name}
|
||||
|
||||
LOG.debug('Creating tap interface %s with mac %s'
|
||||
% (int_tap_name, mac))
|
||||
iface_idx = self.vpp.create_tap(int_tap_name, mac)
|
||||
# TODO(ijw): someone somewhere ought to be sorting
|
||||
# the MTUs out
|
||||
br = self.ensure_bridge(bridge_name)
|
||||
# This is the external TAP device that will be
|
||||
# created by an agent, say the DHCP agent later in
|
||||
# time
|
||||
t = Thread(target=self.add_external_tap,
|
||||
args=(tap_name, br, bridge_name,))
|
||||
t.start()
|
||||
# This is the device that we just created with VPP
|
||||
if not br.owns_interface(int_tap_name):
|
||||
br.addif(int_tap_name)
|
||||
elif if_type == 'vhostuser':
|
||||
path = get_vhostuser_name(uuid)
|
||||
iface_idx = self.vpp.create_vhostuser(path, mac,
|
||||
self.qemu_user,
|
||||
self.qemu_group)
|
||||
props = {'path': path}
|
||||
else:
|
||||
raise Exception('unsupported interface type')
|
||||
props['bind_type'] = if_type
|
||||
props['iface_idx'] = iface_idx
|
||||
props['mac'] = mac
|
||||
self.interfaces[uuid] = props
|
||||
return self.interfaces[uuid]
|
||||
|
||||
def bind_interface_on_host(self, if_type, uuid, mac, physnet,
|
||||
net_type, seg_id):
|
||||
# TODO(najoy): Need to send a return value so the ML2 driver
|
||||
# can raise an exception and prevent network creation (when
|
||||
# network_on_host returns None)
|
||||
|
||||
net_data = self.network_on_host(physnet, net_type, seg_id)
|
||||
net_br_idx = net_data['bridge_domain_id']
|
||||
props = self.create_interface_on_host(if_type, uuid, mac)
|
||||
iface_idx = props['iface_idx']
|
||||
self.vpp.ifup(iface_idx)
|
||||
self.vpp.add_to_bridge(net_br_idx, iface_idx)
|
||||
props['net_data'] = net_data
|
||||
LOG.debug('Bound vpp interface with sw_idx:%s on '
|
||||
'bridge domain:%s'
|
||||
% (iface_idx, net_br_idx))
|
||||
return props
|
||||
|
||||
def unbind_interface_on_host(self, uuid):
|
||||
if uuid not in self.interfaces:
|
||||
LOG.debug('unknown port %s unbinding request - ignored'
|
||||
% uuid)
|
||||
else:
|
||||
props = self.interfaces[uuid]
|
||||
iface_idx = props['iface_idx']
|
||||
|
||||
LOG.debug('unbinding port %s, recorded as type %s'
|
||||
% (uuid, props['bind_type']))
|
||||
|
||||
# We no longer need this interface. Specifically if it's
|
||||
# a vhostuser interface it's annoying to have it around
|
||||
# because the VM's memory (hugepages) will not be
|
||||
# released. So, here, we destroy it.
|
||||
|
||||
if props['bind_type'] == 'vhostuser':
|
||||
self.vpp.delete_vhostuser(iface_idx)
|
||||
elif props['bind_type'] in ['maketap', 'plugtap']:
|
||||
self.vpp.delete_tap(iface_idx)
|
||||
if props['bind_type'] == 'plugtap':
|
||||
name = uuid[0:11]
|
||||
bridge_name = 'br-' + name
|
||||
bridge = bridge_lib.BridgeDevice(bridge_name)
|
||||
if bridge.exists():
|
||||
# These may fail, don't care much
|
||||
try:
|
||||
if bridge.owns_interface(props['int_tap_name']):
|
||||
bridge.delif(props['int_tap_name'])
|
||||
if bridge.owns_interface(props['ext_tap_name']):
|
||||
bridge.delif(props['ext_tap_name'])
|
||||
bridge.link.set_down()
|
||||
bridge.delbr()
|
||||
except Exception as exc:
|
||||
LOG.debug(exc)
|
||||
else:
|
||||
LOG.error('Unknown port type %s during unbind'
|
||||
% props['bind_type'])
|
||||
|
||||
# TODO(ijw): delete structures of newly unused networks with
|
||||
# delete_network
|
||||
|
||||
|
||||
######################################################################
|
||||
|
||||
LEADIN = '/networking-vpp' # TODO(ijw): make configurable?
|
||||
|
||||
|
||||
class EtcdListener(object):
|
||||
def __init__(self, host, etcd_client, vppf, physnets):
|
||||
self.host = host
|
||||
self.etcd_client = etcd_client
|
||||
self.vppf = vppf
|
||||
self.physnets = physnets
|
||||
|
||||
# We need certain directories to exist
|
||||
self.mkdir(LEADIN + '/state/%s/ports' % self.host)
|
||||
self.mkdir(LEADIN + '/nodes/%s/ports' % self.host)
|
||||
|
||||
def mkdir(self, path):
|
||||
try:
|
||||
self.etcd_client.write(path, None, dir=True)
|
||||
except etcd.EtcdNotFile:
|
||||
# Thrown when the directory already exists, which is fine
|
||||
pass
|
||||
|
||||
def repop_interfaces(self):
|
||||
pass
|
||||
|
||||
# The vppf bits
|
||||
|
||||
def unbind(self, id):
|
||||
self.vppf.unbind_interface_on_host(id)
|
||||
|
||||
def bind(self, id, binding_type, mac_address, physnet, network_type,
|
||||
segmentation_id):
|
||||
# args['binding_type'] in ('vhostuser', 'plugtap'):
|
||||
return self.vppf.bind_interface_on_host(binding_type,
|
||||
id,
|
||||
mac_address,
|
||||
physnet,
|
||||
network_type,
|
||||
segmentation_id)
|
||||
|
||||
HEARTBEAT = 60 # seconds
|
||||
|
||||
def process_ops(self):
|
||||
# TODO(ijw): needs to remember its last tick on reboot, or
|
||||
# reconfigure from start (which means that VPP needs it
|
||||
# storing, so it's lost on reboot of VPP)
|
||||
physnets = self.physnets.keys()
|
||||
for f in physnets:
|
||||
self.etcd_client.write(LEADIN + '/state/%s/physnets/%s'
|
||||
% (self.host, f), 1)
|
||||
|
||||
tick = None
|
||||
while True:
|
||||
|
||||
# The key that indicates to people that we're alive
|
||||
# (not that they care)
|
||||
self.etcd_client.write(LEADIN + '/state/%s/alive' % self.host,
|
||||
1, ttl=3 * self.HEARTBEAT)
|
||||
|
||||
try:
|
||||
LOG.debug("ML2_VPP(%s): thread pausing"
|
||||
% self.__class__.__name__)
|
||||
rv = self.etcd_client.watch(LEADIN + "/nodes/%s/ports"
|
||||
% self.host,
|
||||
recursive=True,
|
||||
index=tick,
|
||||
timeout=self.HEARTBEAT)
|
||||
LOG.debug('watch received %s on %s at tick %s',
|
||||
rv.action, rv.key, rv.modifiedIndex)
|
||||
tick = rv.modifiedIndex + 1
|
||||
LOG.debug("ML2_VPP(%s): thread active"
|
||||
% self.__class__.__name__)
|
||||
|
||||
# Matches a port key, gets host and uuid
|
||||
m = re.match(LEADIN + '/nodes/%s/ports/([^/]+)$' % self.host,
|
||||
rv.key)
|
||||
|
||||
if m:
|
||||
port = m.group(1)
|
||||
|
||||
if rv.action == 'delete':
|
||||
# Removing key == desire to unbind
|
||||
self.unbind(port)
|
||||
try:
|
||||
self.etcd_client.delete(
|
||||
LEADIN + '/state/%s/ports/%s'
|
||||
% (self.host, port))
|
||||
except etcd.EtcdKeyNotFound:
|
||||
# Gone is fine, if we didn't delete it
|
||||
# it's no problem
|
||||
pass
|
||||
else:
|
||||
# Create or update == bind
|
||||
data = json.loads(rv.value)
|
||||
props = self.bind(port,
|
||||
data['binding_type'],
|
||||
data['mac_address'],
|
||||
data['physnet'],
|
||||
data['network_type'],
|
||||
data['segmentation_id'])
|
||||
self.etcd_client.write(LEADIN + '/state/%s/ports/%s'
|
||||
% (self.host, port),
|
||||
json.dumps(props))
|
||||
|
||||
else:
|
||||
LOG.warning('Unexpected key change in etcd port feedback')
|
||||
|
||||
except etcd.EtcdWatchTimedOut:
|
||||
# This is normal
|
||||
pass
|
||||
except Exception as e:
|
||||
LOG.error('etcd threw exception %s' % traceback.format_exc())
|
||||
|
||||
# TODO(ijw): prevents tight crash loop, but adds
|
||||
# latency
|
||||
time.sleep(1)
|
||||
|
||||
# Should be specific to etcd faults, should have
|
||||
# sensible behaviour - Don't just kill the thread...
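The key handling in `process_ops()` can be illustrated without an etcd server. The sketch below shows how a watched key maps to a port ID and to the matching state key. `port_from_key` and `state_key` are hypothetical helper names, and the use of `re.escape()` on the hostname is an addition of this sketch; the agent interpolates the hostname into the pattern unescaped.

```python
import re

LEADIN = '/networking-vpp'


def port_from_key(host, key):
    """Return the port ID if `key` is a direct child of this host's ports dir."""
    # re.escape() is an addition in this sketch, guarding against regex
    # metacharacters (such as '.') in the hostname.
    pattern = LEADIN + '/nodes/%s/ports/([^/]+)$' % re.escape(host)
    m = re.match(pattern, key)
    return m.group(1) if m else None


def state_key(host, port):
    """Key under which the agent reports the bound port's properties."""
    return LEADIN + '/state/%s/ports/%s' % (host, port)


print(port_from_key('compute-1', '/networking-vpp/nodes/compute-1/ports/abc123'))
print(state_key('compute-1', 'abc123'))
```

A key that belongs to another host, or that sits deeper than one level below `ports`, yields `None`, which corresponds to the "Unexpected key change" branch above.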
|
||||
|
||||
|
||||
def main():
|
||||
cfg.CONF(sys.argv[1:])
|
||||
|
||||
# If the user and/or group are specified in config file, we will use
|
||||
# them as configured; otherwise we try to use defaults depending on
|
||||
# distribution. Currently only Ubuntu and Red Hat are supported.
|
||||
qemu_user = cfg.CONF.ml2_vpp.qemu_user
|
||||
qemu_group = cfg.CONF.ml2_vpp.qemu_group
|
||||
default_user, default_group = get_qemu_default()
|
||||
if not qemu_user:
|
||||
qemu_user = default_user
|
||||
if not qemu_group:
|
||||
qemu_group = default_group
|
||||
|
||||
physnet_list = cfg.CONF.ml2_vpp.physnets.replace(' ', '').split(',')
|
||||
physnets = {}
|
||||
for f in physnet_list:
|
||||
(k, v) = f.split(':')
|
||||
physnets[k] = v
|
||||
|
||||
vppf = VPPForwarder(physnets,
|
||||
vxlan_src_addr=cfg.CONF.ml2_vpp.vxlan_src_addr,
|
||||
vxlan_bcast_addr=cfg.CONF.ml2_vpp.vxlan_bcast_addr,
|
||||
vxlan_vrf=cfg.CONF.ml2_vpp.vxlan_vrf,
|
||||
qemu_user=qemu_user,
|
||||
qemu_group=qemu_group)
|
||||
|
||||
etcd_client = etcd.Client() # TODO(ijw): args
|
||||
|
||||
ops = EtcdListener(cfg.CONF.host, etcd_client, vppf, physnets)
|
||||
|
||||
ops.process_ops()
|
||||
|
||||
if __name__ == '__main__':
|
||||
main()
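The `physnets` option parsed in `main()` is a comma-separated list of `name:interface` pairs. A minimal sketch of that parsing with explicit validation, assuming a hypothetical helper `parse_physnets`; unlike the tuple unpacking in `main()`, it reports which entry is malformed and skips empty entries.

```python
def parse_physnets(spec):
    """Parse 'physnet1:IfA, physnet2:IfB' into {'physnet1': 'IfA', ...}.

    Hypothetical helper; the agent itself does this inline in main().
    """
    physnets = {}
    for entry in spec.replace(' ', '').split(','):
        if not entry:
            continue  # tolerate an empty string or a trailing comma
        name, sep, interface = entry.partition(':')
        if not sep or not name or not interface:
            raise ValueError('bad physnet entry %r' % entry)
        physnets[name] = interface
    return physnets


print(parse_physnets('physnet:GigabitEthernet2/2/0'))
```

The example value is the default `MECH_VPP_PHYSNETLIST` from the devstack settings file in this commit.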
|
@ -0,0 +1,208 @@
|
||||
# Copyright (c) 2016 Cisco Systems, Inc.
|
||||
# All Rights Reserved
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License"); you may
|
||||
# not use this file except in compliance with the License. You may obtain
|
||||
# a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
|
||||
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
|
||||
# License for the specific language governing permissions and limitations
|
||||
# under the License.
|
||||
|
||||
|
||||
import grp
|
||||
import os
|
||||
import pwd
|
||||
import vpp_papi
|
||||
|
||||
|
||||
def mac_to_bytes(mac):
|
||||
return str(''.join(chr(int(x, base=16)) for x in mac.split(':')))
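The `str`/`chr` construction above yields a byte string only on Python 2. A possible variant that returns the same six bytes on both Python 2 and Python 3 is sketched here, assuming every octet is written with two hex digits; this is an alternative, not the code in this commit.

```python
import binascii


def mac_to_bytes(mac):
    # 'fa:16:3e:00:00:01' -> six raw bytes; assumes two hex digits per octet
    return binascii.unhexlify(mac.replace(':', ''))


print(repr(mac_to_bytes('fa:16:3e:00:00:01')))
```

Unlike the `int(x, base=16)` version, this rejects single-digit octets such as `a:b:c:d:e:f`.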
|
||||
|
||||
|
||||
def fix_string(s):
|
||||
return s.rstrip("\0").decode(encoding='ascii')
|
||||
|
||||
|
||||
def _vpp_cb(*args, **kwargs):
|
||||
# sw_interface_set_flags comes back when you delete interfaces
|
||||
# print 'callback:', args, kwargs
|
||||
pass
|
||||
|
||||
|
||||
# Sometimes a callback fires unexpectedly. We need to catch them
|
||||
# because vpp_papi will traceback otherwise
|
||||
vpp_papi.register_event_callback(_vpp_cb)
|
||||
|
||||
|
||||
class VPPInterface(object):
|
||||
|
||||
def _check_retval(self, t):
|
||||
"""See if VPP returned OK.
|
||||
|
||||
VPP is very inconsistent in return codes, so for now this reports
|
||||
a logged warning rather than flagging an error.
|
||||
"""
|
||||
|
||||
try:
|
||||
self.LOG.debug("checking return value for object: %s" % str(t))
|
||||
if t.retval != 0:
|
||||
self.LOG.debug('FAIL? retval here is %s' % t.retval)
|
||||
except AttributeError as e:
|
||||
self.LOG.debug("Unexpected request format. Error: %s on %s"
|
||||
% (e, t))
|
||||
|
||||
def get_interfaces(self):
|
||||
t = vpp_papi.sw_interface_dump(0, b'ignored')
|
||||
|
||||
for interface in t:
|
||||
if interface.vl_msg_id == vpp_papi.VL_API_SW_INTERFACE_DETAILS:
|
||||
yield (fix_string(interface.interface_name), interface)
|
||||
|
||||
def get_interface(self, name):
|
||||
for (ifname, f) in self.get_interfaces():
|
||||
if ifname == name:
|
||||
return f
|
||||
|
||||
def get_version(self):
|
||||
t = vpp_papi.show_version()
|
||||
|
||||
self._check_retval(t)
|
||||
|
||||
return fix_string(t.version)
|
||||
|
||||
########################################
|
||||
|
||||
def create_tap(self, ifname, mac):
|
||||
# (we don't like unicode in VPP hence str(ifname))
|
||||
t = vpp_papi.tap_connect(False, # random MAC
|
||||
str(ifname),
|
||||
mac_to_bytes(mac),
|
||||
False, # renumber - who knows, no doc
|
||||
0) # customdevinstance - who knows, no doc
|
||||
|
||||
self._check_retval(t)
|
||||
|
||||
return t.sw_if_index # will be -1 on failure (e.g. 'already exists')
|
||||
|
||||
def delete_tap(self, idx):
|
||||
vpp_papi.tap_delete(idx)
|
||||
|
||||
# Err, I just got a sw_interface_set_flags here, not a delete tap?
|
||||
# self._check_retval(t)
|
||||
|
||||
#############################
|
||||
|
||||
def create_vhostuser(self, ifpath, mac, qemu_user, qemu_group):
|
||||
self.LOG.info('Creating %s as a port' % ifpath)
|
||||
|