[docs] Add Open vSwitch Hardware Offloading (ASAP^2) Deployment Scenario
This patch adds documentation that can assist an operator in deploying
ASAP^2-accelerated[1] Open vSwitch - AKA Open vSwitch Hardware
Offloading[2]. This feature requires Mellanox ConnectX-4 Lx or ConnectX-5
NICs, and can be deployed using existing OVS and SR-IOV-related tasks in
OpenStack-Ansible.

[1] http://www.mellanox.com/page/asap2?mtag=asap2
[2] https://docs.openstack.org/neutron/latest/admin/config-ovs-offload.html

Co-Authored-By: James Denton <james.denton@outlook.com>
Change-Id: I0c93aca73f2b809ff1e525f75496982942ea3785
parent e61d4fd7c7
commit 3fe2242707
317
doc/source/app-openvswitch-asap.rst
Normal file
@ -0,0 +1,317 @@
============================================================
Scenario - Using Open vSwitch w/ ASAP :sup:`2` (Direct Mode)
============================================================

Overview
~~~~~~~~

With appropriate hardware, operators can choose to utilize
ASAP :sup:`2`-accelerated Open vSwitch instead of unaccelerated Open vSwitch
for the Neutron virtual network infrastructure. ASAP :sup:`2` technology
offloads packet processing onto hardware built into the NIC rather than using
the CPU of the host. It requires careful consideration and planning before
implementing. This document outlines how to set it up in your environment.

.. note::

   ASAP :sup:`2` is a proprietary feature provided with certain Mellanox NICs,
   including the ConnectX-4 Lx and ConnectX-5. Future support is not
   guaranteed. This feature is considered *EXPERIMENTAL* and should not
   be used for production workloads. There is no guarantee of upgradability
   or backwards compatibility.

.. note::

   Hardware offloading is not compatible with the ``openvswitch`` firewall
   driver. To ensure flows are offloaded, port security must be disabled.
   Disabling port security is discussed later in this document.

Recommended reading
~~~~~~~~~~~~~~~~~~~

This guide is a variation of the standard Open vSwitch and SR-IOV deployment
guides available at:

* `<https://docs.openstack.org/openstack-ansible-os_neutron/latest/app-openvswitch.html>`_

* `<https://docs.openstack.org/openstack-ansible-os_neutron/latest/configure-network-services.html#sr-iov-support-optional>`_

The following resources may also be helpful:

* `<https://docs.openstack.org/neutron/latest/admin/config-sriov.html>`_

* `<https://docs.openstack.org/neutron/latest/admin/config-ovs-offload.html>`_

* `<https://www.mellanox.com/related-docs/prod_software/ASAP2_Hardware_Offloading_for_vSwitches_User_Manual_v4.4.pdf>`_

Prerequisites
~~~~~~~~~~~~~

To enable SR-IOV and PCI passthrough capabilities on a Linux platform,
ensure that VT-d/VT-x are enabled for Intel processors and AMD-V/AMD-Vi are
enabled for AMD processors. Such features are typically enabled in the BIOS.

On an Intel platform, the following kernel parameters are required and can be
added to the GRUB configuration:

.. code-block:: console

   GRUB_CMDLINE_LINUX="... iommu=pt intel_iommu=on"

On an AMD platform, use these parameters instead:

.. code-block:: console

   GRUB_CMDLINE_LINUX="... iommu=pt amd_iommu=on"

Update GRUB and reboot the host(s).

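After the reboot, it can be useful to confirm that the parameters actually
reached the running kernel. The following is a minimal sketch, assuming a
POSIX shell; ``check_iommu_cmdline`` is a hypothetical helper name introduced
here for illustration and is not part of any tool referenced in this guide.
It only inspects the command line string it is given and does not prove that
the IOMMU is active in hardware.

.. code-block:: shell

   #!/bin/sh
   # Hypothetical helper: report whether a kernel command line carries the
   # IOMMU parameters listed above (iommu=pt plus the vendor-specific switch).
   check_iommu_cmdline() {
       cmdline=" $1 "   # pad with spaces so whole words can be matched
       case "$cmdline" in
           *" iommu=pt "*) ;;
           *) echo "missing iommu=pt"; return 1 ;;
       esac
       case "$cmdline" in
           *" intel_iommu=on "*|*" amd_iommu=on "*) echo "ok" ;;
           *) echo "missing intel_iommu=on or amd_iommu=on"; return 1 ;;
       esac
   }

   # Typical use on a running host:
   #   check_iommu_cmdline "$(cat /proc/cmdline)"
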
SR-IOV provides virtual functions (VFs) that can be presented to instances as
network interfaces and are used in lieu of tuntap interfaces. Configuration
of VFs is outside the scope of this guide. The following links may be helpful:

* `<https://community.mellanox.com/s/article/getting-started-with-mellanox-asap-2>`_

* `<https://community.mellanox.com/s/article/howto-configure-sr-iov-for-connectx-4-connectx-5-with-kvm--ethernet-x>`_

Deployment
~~~~~~~~~~

Configure your networking according to the Open vSwitch implementation docs:

* `Scenario - Using Open vSwitch
  <https://docs.openstack.org/openstack-ansible-os_neutron/latest/app-openvswitch.html>`_

.. note::

   At this time, only a single (non-bonded) interface is supported.

An example provider network configuration is shown below:

.. code-block:: console

   - network:
       container_bridge: "br-provider"
       container_type: "veth"
       type: "vlan"
       range: "700:709"
       net_name: "physnet1"
       network_interface: "ens4f0"
       group_binds:
         - neutron_openvswitch_agent

Add a ``nova_pci_passthrough_whitelist`` entry to ``user_variables.yml``, where
``devname`` is the name of the interface connected to the provider bridge and
``physical_network`` is the name of the provider network.

.. code-block:: console

   nova_pci_passthrough_whitelist: '{"devname":"ens4f0","physical_network":"physnet1"}'

.. note::

   In the respective network block configured in ``openstack_user_config.yml``,
   ``devname`` corresponds to ``network_interface`` and ``physical_network``
   corresponds to ``net_name``.

To enable the ``openvswitch`` firewall driver rather than the default
``iptables_hybrid`` firewall driver, add the following overrides to
``user_variables.yml``:

.. code-block:: console

   neutron_ml2_conf_ini_overrides:
     securitygroup:
       firewall_driver: openvswitch
   neutron_openvswitch_agent_ini_overrides:
     securitygroup:
       firewall_driver: openvswitch

.. note::

   Hardware-offloaded flows are **not** activated for ports utilizing security
   groups or port security. Be sure to disable port security *and* security
   groups on individual ports or networks when hardware offloading is required.

Once the OpenStack cluster is configured, start the OpenStack deployment as
listed in the OpenStack-Ansible Install guide by running all playbooks in
sequence on the deployment host.

Post-Deployment
~~~~~~~~~~~~~~~

Once the deployment is complete, create the VFs that will be used for SR-IOV.
In this example, the physical function (PF) is ``ens4f0``. It will
simultaneously be connected to the Neutron provider bridge ``br-provider``.

1. On each compute node, determine the maximum number of VFs a PF can support:

   .. code-block:: console

      # cat /sys/class/net/ens4f0/device/sriov_totalvfs

   .. note::

      To adjust ``sriov_totalvfs``, refer to the Mellanox documentation.

2. On each compute node, create the VFs:

   .. code-block:: console

      # echo '8' > /sys/class/net/ens4f0/device/sriov_numvfs

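The two steps above can be combined so that a request for more VFs than the
PF advertises is rejected before anything is written. This is a sketch only,
assuming a POSIX shell; ``set_numvfs`` and its optional third argument (an
alternate sysfs root, useful for trying the function without hardware) are
hypothetical names introduced for illustration.

.. code-block:: shell

   #!/bin/sh
   # Hypothetical helper: write sriov_numvfs only if the requested count
   # does not exceed sriov_totalvfs for the given PF.
   set_numvfs() {
       pf="$1"
       want="$2"
       root="${3:-/sys}"   # assumption: overridable sysfs root for testing
       dev="$root/class/net/$pf/device"
       total=$(cat "$dev/sriov_totalvfs") || return 1
       if [ "$want" -gt "$total" ]; then
           echo "requested $want VFs, but $pf supports at most $total" >&2
           return 1
       fi
       echo "$want" > "$dev/sriov_numvfs"
   }

   # Example, run as root on a compute node:
   #   set_numvfs ens4f0 8
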
Configure Open vSwitch hardware offloading
------------------------------------------

1. Unbind the VFs from the Mellanox driver:

   .. code-block:: console

      # for vf in `ls -ld /sys/class/net/ens4f0/device/virt* | cut -f 11 -d ' ' | cut -b 4-`
        do
          echo $vf > /sys/bus/pci/drivers/mlx5_core/unbind
        done

2. Enable the switch in the NIC:

   .. code-block:: console

      # PCI_ADDR=`grep PCI_SLOT_NAME /sys/class/net/ens4f0/device/uevent | sed 's:.*PCI_SLOT_NAME=::'`
      # devlink dev eswitch set pci/$PCI_ADDR mode switchdev

3. Enable hardware offload filters with TC:

   .. code-block:: console

      # ethtool -K ens4f0 hw-tc-offload on

4. Rebind the VFs to the Mellanox driver:

   .. code-block:: console

      # for vf in `ls -ld /sys/class/net/ens4f0/device/virt* | cut -f 11 -d ' ' | cut -b 4-`
        do
          echo $vf > /sys/bus/pci/drivers/mlx5_core/bind
        done

5. Enable hardware offloading in OVS:

   .. code-block:: console

      # ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
      # ovs-vsctl set Open_vSwitch . other_config:max-idle=30000

6. Restart Open vSwitch:

   .. code-block:: console

      # systemctl restart openvswitch-switch

7. Restart the Open vSwitch agent:

   .. code-block:: console

      # systemctl restart neutron-openvswitch-agent

8. Restart the Nova compute service:

   .. code-block:: console

      # systemctl restart nova-compute

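The loops in steps 1 and 4 find each VF's PCI address by parsing ``ls``
output, which depends on the exact column layout. As a sketch of an
alternative, assuming a POSIX shell with ``readlink`` available, the
``virtfn*`` symlinks under the PF can be resolved directly.
``list_vf_addrs`` and its optional sysfs-root argument are hypothetical
names introduced for illustration.

.. code-block:: shell

   #!/bin/sh
   # Hypothetical helper: print the PCI address of every VF under a PF by
   # resolving the virtfn* symlinks instead of parsing ls output.
   list_vf_addrs() {
       pf="$1"
       root="${2:-/sys}"   # assumption: overridable sysfs root for testing
       for link in "$root/class/net/$pf/device"/virtfn*; do
           [ -L "$link" ] || continue       # no VFs: the glob stays literal
           basename "$(readlink "$link")"   # ../0000:21:00.2 -> 0000:21:00.2
       done
   }

   # Example, run as root; intended to match the unbind loop in step 1:
   #   for vf in $(list_vf_addrs ens4f0); do
   #       echo "$vf" > /sys/bus/pci/drivers/mlx5_core/unbind
   #   done

The same function can feed the rebind loop in step 4 by writing to
``bind`` instead of ``unbind``.
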
.. warning::

   Changes to ``sriov_numvfs`` and to the built-in NIC switch mode do not
   persist across reboots and must be reapplied every time the server is
   started.

Verify operation
~~~~~~~~~~~~~~~~

To verify operation of hardware-offloaded Open vSwitch, you must create
a virtual machine instance using an image with the proper network drivers.

The following images are known to contain working drivers:

* `Fedora 24 <http://www.mellanox.com/repository/solutions/openstack/images/fedora_24_ofed_4.0-2.0.0.1.qcow2>`_

* `Ubuntu 18.04 LTS (Bionic) <https://cloud-images.ubuntu.com/bionic/current/bionic-server-cloudimg-amd64.img>`_

* `CentOS 7 (1901) <https://cloud.centos.org/centos/7/images/CentOS-7-x86_64-GenericCloud-1901.qcow2>`_

Before creating an instance, a Neutron port must be created that has the
following characteristics:

:code:`--vnic-type direct --binding-profile '{"capabilities": ["switchdev"]}'`

To ensure flows are offloaded, disable port security with the
``--disable-port-security`` argument.

An example of the full command can be seen here:

.. code-block:: console

   # openstack port create \
     --network <network> \
     --vnic-type direct --binding-profile '{"capabilities": ["switchdev"]}' \
     --disable-port-security \
     <name>

The port can then be attached to the instance at boot. Once booted, the port
will be updated to reflect the PCI address of the corresponding virtual
function:

.. code-block:: console

   root@aio1-utility-container-8c0b0916:~# openstack port show -c binding_profile testport2
   +-----------------+------------------------------------------------------------------------------------------------------------------+
   | Field           | Value                                                                                                            |
   +-----------------+------------------------------------------------------------------------------------------------------------------+
   | binding_profile | capabilities='[u'switchdev']', pci_slot='0000:21:00.6', pci_vendor_info='15b3:1016', physical_network='physnet1' |
   +-----------------+------------------------------------------------------------------------------------------------------------------+

Observing traffic
-----------------

From the compute node, perform a packet capture on the representor port
that corresponds to the virtual function attached to the instance. In this
example, the interface is ``eth1``.

.. code-block:: console

   root@compute1:~# tcpdump -nnn -i eth1 icmp
   tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
   listening on eth1, link-type EN10MB (Ethernet), capture size 262144 bytes

Perform a ping from another host and observe the traffic at the representor
port:

.. code-block:: console

   root@infra2:~# ping 192.168.88.151 -c5
   PING 192.168.88.151 (192.168.88.151) 56(84) bytes of data.
   64 bytes from 192.168.88.151: icmp_seq=1 ttl=64 time=48.3 ms
   64 bytes from 192.168.88.151: icmp_seq=2 ttl=64 time=1.52 ms
   64 bytes from 192.168.88.151: icmp_seq=3 ttl=64 time=0.586 ms
   64 bytes from 192.168.88.151: icmp_seq=4 ttl=64 time=0.688 ms
   64 bytes from 192.168.88.151: icmp_seq=5 ttl=64 time=0.775 ms

   --- 192.168.88.151 ping statistics ---
   5 packets transmitted, 5 received, 0% packet loss, time 4045ms
   rtt min/avg/max/mdev = 0.586/10.381/48.335/18.979 ms

   root@compute1:~# tcpdump -nnn -i eth1 icmp
   tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
   listening on eth1, link-type EN10MB (Ethernet), capture size 262144 bytes
   19:51:09.684957 IP 192.168.88.254 > 192.168.88.151: ICMP echo request, id 11168, seq 1, length 64
   19:51:09.685448 IP 192.168.88.151 > 192.168.88.254: ICMP echo reply, id 11168, seq 1, length 64

When offloading is handled in the NIC, only the first packet(s) of the
flow will be visible in the packet capture.

The following command can be used to dump flows in the kernel datapath:

:code:`# ovs-dpctl dump-flows type=ovs`

The following command can be used to dump flows that are offloaded:

:code:`# ovs-dpctl dump-flows type=offloaded`

@ -7,6 +7,7 @@ Neutron role for OpenStack-Ansible

   configure-network-services.rst
   app-openvswitch.rst
   app-openvswitch-asap.rst
   app-ovn.rst
   app-nuage.rst
   app-calico.rst