Tunnel based mirroring (ERSPAN, GRE) for Tap-as-a-service

https://bugs.launchpad.net/neutron/+bug/2015471

Mirroring is a widely used technique for analysing the traffic of switch ports. The Tap-as-a-service project was created to allow admins to mirror the traffic of one Neutron port to another Neutron port.

Mirroring can also be done by encapsulating the traffic in a tunnel, such as GRE (Generic Routing Encapsulation) or ERSPAN (Encapsulated Remote Switch Port Analyzer). ERSPAN was first widely used in Cisco switches, and GRE is a widely used tunneling protocol.

The ERSPAN protocol has 3 versions, of which the last two have been adopted: version 1 and version 2. An alternative numbering uses TYPE I, II and III, where TYPE II is version 1 and TYPE III is version 2. For more details see the ERSPAN draft from Cisco.

Since OVS 2.10 it is possible to use both ERSPAN v1 and v2 with OVS (see OVS basic configuration, OVS protocol header fields, and the erspan NEWS update commit).

Since OVN v22.12.0 it is possible to create mirrors with OVN (see OVN 22.12 nbctl man page, and OVN commit that introduced mirroring).

Note

OVN only supports ERSPAN v1. With OVN it is also possible to create a plain GRE type mirror.

This specification proposes an extension to the current tap-as-a-service (TAAS) API that allows users to create ERSPAN or GRE mirrors from Neutron ports to a remote IP. It also proposes the necessary backend changes to the current OVS driver of taas, and a new OVN driver that uses the ERSPAN or GRE mirroring of OVN.

Problem Description

Mirroring traffic can be useful in many situations for operators, for example to debug network issues.

Tap-as-a-service provides a solution for traffic mirroring by allowing operators to create tap-flows and mirror their traffic to the related tap-service. Each tap-flow and tap-service is attached to a Neutron port: the port attached to the tap-flow is the source of the mirrored traffic and the port of the tap-service is the destination of the mirroring. There is an N-1 relation between tap-flows and tap-services. This model mirrors traffic from one Neutron port to another Neutron port over a Neutron network.

Operators also need to mirror traffic from the cloud (from Neutron ports) to an analyser outside of the cloud. An ERSPAN or GRE mirror is a good solution for this need.

Use Cases

  • As an operator I want to mirror the traffic from a Neutron port to a network analyser that can be outside of my cloud.
  • As an operator I want to mirror the traffic from a Neutron port to a Floating IP.
  • As an operator I want to mirror the traffic of a Neutron port to a dedicated infra network, to avoid overloading the tenant networks.
  • As an operator I want to use the extra headers ERSPAN provides (e.g. original VLAN, original CoS in case of version 2 ERSPAN).

Proposed Change

The proposal is to use the built-in ERSPAN and GRE mirroring features of OVS and OVN. To use ERSPAN or GRE with OVS, a port is added to an OVS bridge with type=erspan for ERSPAN or type=gre for GRE.
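As a rough illustration of the port creation described above, the following sketch builds an ovs-vsctl command line for a mirror port. The helper name `ovs_mirror_port_cmd` is hypothetical. The ERSPAN options follow the OVS example output shown later in this specification (erspan_ver, erspan_idx, remote_ip); mapping tunnel_id to the GRE key for plain GRE is an assumption, not something this specification fixes.

```python
def ovs_mirror_port_cmd(bridge, port_name, mirror_type, remote_ip, tunnel_id):
    """Build an ovs-vsctl argv that adds a tunnel port used for mirroring.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    if mirror_type not in ("erspan", "gre"):
        raise ValueError("unsupported mirror_type: %s" % mirror_type)
    cmd = [
        "ovs-vsctl", "add-port", bridge, port_name,
        "--", "set", "Interface", port_name,
        "type=%s" % mirror_type,
        "options:remote_ip=%s" % remote_ip,
    ]
    if mirror_type == "erspan":
        # ERSPAN version 1, with tunnel_id used as the index, as in the
        # example output of this specification.
        cmd += ["options:erspan_ver=1",
                "options:erspan_idx=%d" % int(tunnel_id)]
    else:
        # Assumption: for plain GRE the tunnel_id is carried in the GRE key.
        cmd.append("options:key=%d" % int(tunnel_id))
    return cmd
```

Returning an argument list instead of a shell string keeps the sketch free of quoting concerns; a real driver would more likely talk to ovsdb directly.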

As ERSPAN is a modification of GRE with some extra ERSPAN specific headers (see ERSPAN draft from Cisco), OVS creates the tunnel from the previously created OVS port to the destination IP. This means that the mirrored traffic is encapsulated in an ERSPAN/GRE tunnel.

Example Wireshark dump of an ICMP echo reply:

Frame 1: 148 bytes on wire (1184 bits), 148 bytes captured (1184 bits)
Ethernet II, Src: RealtekU_79:ff:db (52:54:00:79:ff:db), Dst: RealtekU_91:2f:52 (52:54:00:91:2f:52)
Internet Protocol Version 4, Src: 100.109.0.84, Dst: 100.109.0.142
Generic Routing Encapsulation (ERSPAN)
Encapsulated Remote Switch Packet ANalysis Type II
Ethernet II, Src: fa:16:3e:d5:4b:c1 (fa:16:3e:d5:4b:c1), Dst: fa:16:3e:80:ed:09 (fa:16:3e:80:ed:09)
Internet Protocol Version 4, Src: 10.0.0.39, Dst: 10.0.0.47
Internet Control Message Protocol

This means that the source IP of the tunnel is the IP of the host on which the ERSPAN port is created (100.109.0.84 in the test environment used for this example). In the above example the 2 inner IPs (10.0.0.47 and 10.0.0.39) are the fixed IPs of 2 OpenStack ports (VMs).

The outer protocol source IP (100.109.0.84) is the IP of the host on which the mirror port is created, so it is outside of the cloud and under the control of the admin.

The outer protocol destination IP (100.109.0.142) is in this case also outside of the cloud and of OpenStack Neutron's control; it belongs to another host on which tcpdump can be run.

REST API impact

The current API model of taas uses two high level objects: tap-services and tap-flows. A tap-flow represents the source of the mirrored traffic, and a tap-service represents the destination of the mirrored traffic. Multiple tap-flows can be attached to one tap-service. For details please check the tap-as-a-service API reference. Both a tap-flow and a tap-service reference a Neutron port: the traffic on that port is the source of the mirror (in case of a tap-flow), or the port is the destination of the mirror (in case of a tap-service).

Warning

Tap-as-a-service can only mirror traffic that is allowed by security groups!

In the ERSPAN implementation of OVS and OVN the source is a bridge port (in case of OVS) or a logical switch port (in case of OVN) and the destination is represented only by an IP address, so the above API model is not a good fit.

The proposal is to introduce a new high level API for ERSPAN or GRE mirroring: tap_mirror.

This solution keeps the current API clean and not overloaded, and makes it easier for operators to expect the right behaviour after API operations.

The proposed API is admin only, to avoid the overloading of infrastructure networks by tenants.

The suggested API request:

  • POST /v2.0/taas/tap_mirrors

    Create a tap mirror that mirrors traffic from a Neutron port to an external IP:

    {
        "tap_mirror": {
            "name": "mirror-traffic-of-server-a0",
            "description": "Mirror the traffic from server-a0",
            "direction": "IN"|"OUT"|"BOTH",
            "port_id": "1a1a5a96-e8cb-11ed-9678-9b663820b519",
            "tunnel_id": "1",
            "remote_ip": "172.31.1.1",
            "mirror_type": "erspan"|"gre"
        }
    }
  • port_id is the source of the mirroring; it must be a Neutron port.

Note

Only VM ports can be used as the source of mirroring.

  • remote_ip: The IP of the remote end of the tunnel.
  • The tunnel_id field is the identifier of the ERSPAN or GRE session between the source and destination.

Note

There is a big difference between the GRE and ERSPAN ID sizes: the GRE key is 32 bits wide, but the ERSPAN session ID is only 10 bits wide. This must be documented and validated on the API.
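The validation mentioned in this note could be sketched roughly as follows. The function name `validate_tunnel_id` is hypothetical, and treating 0 as a valid identifier is an assumption; only the bit widths (32 for the GRE key, 10 for the ERSPAN session ID) come from the note.

```python
# Identifier widths as stated in the note.
GRE_KEY_BITS = 32
ERSPAN_SESSION_ID_BITS = 10

MAX_TUNNEL_ID = {
    "gre": 2 ** GRE_KEY_BITS - 1,               # 4294967295
    "erspan": 2 ** ERSPAN_SESSION_ID_BITS - 1,  # 1023
}


def validate_tunnel_id(mirror_type, tunnel_id):
    """Return tunnel_id as an int, or raise ValueError if it does not fit.

    Hypothetical helper; the lower bound of 0 is an assumption.
    """
    if mirror_type not in MAX_TUNNEL_ID:
        raise ValueError("unknown mirror_type: %r" % (mirror_type,))
    # The request example passes tunnel_id as a string, so convert first.
    value = int(tunnel_id)
    max_id = MAX_TUNNEL_ID[mirror_type]
    if not 0 <= value <= max_id:
        raise ValueError(
            "tunnel_id %d is out of range 0..%d for %s"
            % (value, max_id, mirror_type))
    return value
```

With these limits a tunnel_id of 1024 is accepted for a gre mirror but rejected for an erspan mirror.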

Note

This API proposal keeps the current taas API's N-1 relationship between source and destination. Multiple source ports' traffic can be mirrored to one destination IP.

  • mirror_type selects between ERSPAN and GRE.
  • direction is the direction of the traffic to be mirrored on the port. The current tap-as-a-service API allows the operator to select the direction (IN, OUT or BOTH) when the tap-flow is created. This specification proposes to keep the current direction options in the new API. Meaning of the directions:
    • IN: the traffic towards the port, and into the VM attached to it (ingress traffic).
    • OUT: traffic from the port out of the VM attached to the port (egress traffic).
    • BOTH: mirror both ingress (IN), and egress (OUT) traffic of the port.

Warning

It is not possible with OVN to create tunnels (GRE or ERSPAN) with the same tunnel_id to the same remote_ip from the same port. Because of this, if the user chooses BOTH for direction, 2 tunnels must be created with different tunnel_ids.

This must be made visible to the user on the API: a GET on the tap_mirror shows that 2 tunnel_ids are used for this specific tap_mirror. This also means that the receiving side of the mirror must be prepared for the egress and ingress directions being encapsulated in tunnels with different tunnel_ids.

Note

Both GRE and ERSPAN handle fragmentation, so if the size of a mirrored packet together with the extra headers is bigger than the MTU of the interface, the packet will be sent fragmented in the tunnel.

Note

  • For GRE type mirroring an 8 octet extra header is added on top of the IP headers.
  • For ERSPAN 8 octets are added for GRE, 8 octets are added for ERSPAN and an extra trailing 4 byte CRC is added, so in total 20 octets of extra headers are added in this case.
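The overhead figures above can be turned into a small worked calculation. This is a sketch under stated assumptions: the outer header is assumed to be a 20 octet IPv4 header without options, the mirrored frame length includes its own Ethernet header, and the names used here are illustrative only.

```python
# Extra octets added on top of the outer IP header, from the note.
TUNNEL_OVERHEAD = {
    "gre": 8,             # GRE header
    "erspan": 8 + 8 + 4,  # GRE header + ERSPAN header + trailing CRC
}

# Assumption: outer IPv4 header without options.
OUTER_IPV4_HEADER = 20


def outer_packet_size(mirror_type, inner_frame_len):
    """Size of the outer IP packet that carries one mirrored frame."""
    return OUTER_IPV4_HEADER + TUNNEL_OVERHEAD[mirror_type] + inner_frame_len


def needs_fragmentation(mirror_type, inner_frame_len, mtu=1500):
    """True if the encapsulated frame does not fit into the given MTU."""
    return outer_packet_size(mirror_type, inner_frame_len) > mtu
```

Under these assumptions a full-size mirrored frame of 1514 octets (1500 octet payload plus Ethernet header) always exceeds a 1500 octet MTU on the tunnel side, so mirroring bulk traffic over a link with the same MTU as the source network implies fragmentation.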

The proposed API definition:

from neutron_lib import constants

direction_enum = ['IN', 'OUT', 'BOTH']
mirror_types_list = ['erspan', 'gre']

RESOURCE_ATTRIBUTE_MAP = {
    'tap_mirror': {
        'id': {
            'allow_post': False, 'allow_put': False,
            'validate': {'type:uuid': None}, 'is_visible': True,
            'primary_key': True},
        'name': {
            'allow_post': True, 'allow_put': True,
            'validate': {'type:string': None},
            'is_visible': True, 'default': ''},
        'description': {
            'allow_post': True, 'allow_put': True,
            'validate': {'type:string': None},
            'is_visible': True, 'default': ''},
        'port_id': {
            'allow_post': True, 'allow_put': False,
            'validate': {'type:uuid': None},
            'enforce_policy': True, 'is_visible': True},
        'direction': {
            'allow_post': True, 'allow_put': False,
            'validate': {'type:values': direction_enum},
            'is_visible': True},
        'remote_ip': {
            'allow_post': True, 'allow_put': False,
            'validate': {'type:ip_address': None},
            'is_visible': True},
        'tunnel_id': {
            'allow_post': True, 'allow_put': False,
            'validate': {'type:integer': None},
            'is_visible': True, 'default': constants.ATTR_NOT_SPECIFIED},
        'mirror_type': {
            'allow_post': True, 'allow_put': False,
            'validate': {'type:values': mirror_types_list},
            'is_visible': True},
    }
}

DB Impact

To persist the new tap_mirror in the DB, a new table tap_mirrors is needed:

from alembic import op
from neutron_lib.db import constants as db_const
import sqlalchemy as sa

op.create_table(
    'tap_mirrors',
    sa.Column('id', sa.String(length=36), primary_key=True,
              nullable=False),
    sa.Column('project_id', sa.String(length=255), nullable=True),
    sa.Column('name', sa.String(length=255), nullable=True),
    sa.Column('description', sa.String(length=1024), nullable=True),
    sa.Column('port_id', sa.String(36), nullable=False),
    # Column type is an assumption; it stores one of IN, OUT or BOTH.
    sa.Column('direction', sa.String(length=8), nullable=True),
    sa.Column('remote_ip', sa.String(db_const.IP_ADDR_FIELD_SIZE)),
    sa.Column('mirror_type', sa.String(36), nullable=False),
)

To handle the used tunnel_ids a new table tap_tunnel_ids is necessary. This table will represent the tunnel_ids used by the mirror:

op.create_table(
    'tap_tunnel_ids',
    sa.Column('id', sa.String(length=36), primary_key=True,
              nullable=False),
    sa.Column('tunnel_value', sa.String(length=36), nullable=False),
    sa.Column('tap_mirror_id', sa.String(length=36), nullable=False),
)

If the user creates a tap_mirror with direction BOTH, 2 tap_tunnel_ids entries will be allocated for the tap_mirror, and both will be visible on the API.

This also means that TapMirror and TapTunnelId DB models will be added.
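How the tunnel_ids of a tap_mirror could be allocated is sketched below. This specification does not define how the second tunnel_id is chosen for direction BOTH, so picking the next free value above the requested one is purely an assumption, as are the function name `allocate_tunnel_ids` and the idea of passing in the set of values already used towards the same remote_ip.

```python
def allocate_tunnel_ids(direction, requested_id, used_ids, max_id):
    """Return the list of tunnel_ids to store for a new tap_mirror.

    Hypothetical helper. used_ids is the set of tunnel_ids already in
    use; max_id is the largest identifier the mirror type allows.
    """
    if not 0 <= requested_id <= max_id:
        raise ValueError("tunnel_id %d is out of range" % requested_id)
    if requested_id in used_ids:
        raise ValueError("tunnel_id %d is already in use" % requested_id)
    allocated = [requested_id]
    if direction == "BOTH":
        # Assumption: the second direction gets the next free identifier.
        second = next(
            (candidate
             for candidate in range(requested_id + 1, max_id + 1)
             if candidate not in used_ids),
            None)
        if second is None:
            raise ValueError("no free tunnel_id for the second direction")
        allocated.append(second)
    return allocated
```

Each returned value would correspond to one row in tap_tunnel_ids that references the tap_mirror.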

OVN driver for mirroring

OVN creates only version 1 ERSPAN ports. End-to-end, from the API call to the backend changes, this will look something like the following (using GRE is very similar: the type of the OVN mirror will be gre, and the OVS port type will be gre):

$ # REST API operation
$ curl -g -i -X POST http://<host_ip:9696>/networking/v2.0/taas/tap_mirrors \
  -d '{"tap_mirror": {"name": "mirror1", "port_id": "54c4b09f-8b3d-4685-b66d-ce22c67956a9",
                      "direction": "OUT", "remote_ip": "100.109.0.142", "tunnel_id": "42",
                      "mirror_type": "erspan"}}'

$ # backend changes
$ sudo ovn-nbctl mirror-list
mirror_out_297b12c0-e9a5-11ed-9f90-07946c615270:
  Type     :  erspan
  Sink     :  100.109.0.142
  Filter   :  from-lport
  Index/Key:  42

$ sudo ovs-vsctl show
    Bridge br-int
        ...
        Port ovn-my_mirror2
            Interface ovn-mirror_out_297b12c0-e9a5-11ed-9f90-07946c615270
                type: erspan
                options: {erspan_idx="42", erspan_ver="1", key="2", remote_ip="100.109.0.142"}

With OVN, mirroring both the ingress and egress traffic of the source port requires 2 mirrors (as an OVN mirror can have only from-lport or to-lport as direction), both attached to the port (logical-switch-port): one with filter=from-lport and one with filter=to-lport.

Note

OVN chose ERSPAN version 1, which is directionless by the protocol description, but a direction can be selected as a filter when the mirror is created with ovn-nbctl or via ovsdb (see the mirror table in ovn-nb.ovsschema).

So if the user creates a tap_mirror with direction IN the filter will be to-lport, if OUT the filter will be from-lport, and in case of BOTH 2 mirrors will be created, one with to-lport and one with from-lport.

The above means that in case of mirroring both ingress and egress traffic tap-as-a-service will create 2 ERSPAN or GRE ports on br-int for each tap_mirror.

Different tunnel_ids will be used for the 2 traffic directions, for details see REST API impact.
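The mapping from the API direction to OVN mirrors described above can be sketched as follows. The function name `ovn_mirrors_for` and the returned dictionary layout are hypothetical; the mirror naming follows the mirror_out_<id> pattern visible in the ovn-nbctl output of this specification, and using mirror_in_<id> for the to-lport mirror is an assumption.

```python
# Direction of the tap_mirror API mapped to OVN mirror filters.
DIRECTION_TO_FILTERS = {
    "IN": ("to-lport",),
    "OUT": ("from-lport",),
    "BOTH": ("to-lport", "from-lport"),
}

# Assumption: name prefixes modelled on the example output.
FILTER_TO_PREFIX = {"to-lport": "mirror_in", "from-lport": "mirror_out"}


def ovn_mirrors_for(mirror_id, direction, mirror_type, remote_ip, tunnel_ids):
    """Describe the OVN mirrors needed for one tap_mirror.

    tunnel_ids must hold one identifier per mirror, because the two
    directions cannot share a tunnel_id towards the same remote_ip.
    """
    filters = DIRECTION_TO_FILTERS[direction]
    if len(tunnel_ids) != len(filters):
        raise ValueError(
            "direction %s needs %d tunnel_id(s)" % (direction, len(filters)))
    return [
        {
            "name": "%s_%s" % (FILTER_TO_PREFIX[flt], mirror_id),
            "type": mirror_type,
            "sink": remote_ip,
            "filter": flt,
            "index": tunnel_id,
        }
        for flt, tunnel_id in zip(filters, tunnel_ids)
    ]
```

Each returned dictionary corresponds to one entry of the ovn-nbctl mirror-list output (Type, Sink, Filter, Index/Key) and would be attached to the logical switch port of the source Neutron port.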

OVS driver changes

To keep the 2 drivers consistent, this specification proposes to use GRE and ERSPAN version 1 for the OVS driver as well.

The end-to-end call will look like this:

$ # REST API operation
$ curl -g -i -X POST http://<host_ip:9696>/networking/v2.0/taas/tap_mirrors \
  -d '{"tap_mirror": {"name": "mirror1", "port_id": "54c4b09f-8b3d-4685-b66d-ce22c67956a9",
                      "direction": "IN", "remote_ip": "100.109.0.142", "tunnel_id": "42",
                      "mirror_type": "erspan"}}'

$ # Backend changes
$ sudo ovs-vsctl show
    Bridge br-tap
        ...
        Port mirror_in_ed6046d
            Interface mirror_in_ed6046d
                type: erspan
                options: {erspan_idx="42", erspan_ver="1", remote_ip="100.109.0.142"}

$ sudo ovs-ofctl dump-flows  br-tap
   ...
   ... priority=20,dl_dst=fa:16:3e:d3:3a:d1 actions=output:"mirror_in_ed6046d"

For the details on how direction BOTH will be handled see OVN driver for mirroring.

The difference for the OVS driver will be that 2 OVS ports will be created, with 2 different erspan_id/tunnel_id values (see the section REST API impact for how this is visible on the API). For the two directions 2 different flows will be installed on br-tap, with different output ports in the action field:

$ # Direction IN
$ sudo ovs-ofctl dump-flows  br-tap
...
... priority=20,dl_dst=fa:16:3e:d3:3a:d1 actions=output:"mirror_in_ed6046d"

$ # Direction OUT
$ sudo ovs-ofctl dump-flows  br-tap
...
... priority=20,dl_src=fa:16:3e:d3:3a:d1 actions=output:"mirror_out_ed6046d"
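The per-direction flows shown above could be generated along these lines. The helper name `tap_mirror_flows` is hypothetical, and the strings only reproduce the match and action parts visible in the dump-flows output of this specification; a real driver would install the flows through its OpenFlow layer rather than build text.

```python
def tap_mirror_flows(port_mac, mirror_suffix, direction):
    """Return the br-tap flow descriptions for one tap_mirror.

    IN matches on the destination MAC of the mirrored port, OUT on its
    source MAC; BOTH installs both flows, each with its own output port.
    """
    if direction not in ("IN", "OUT", "BOTH"):
        raise ValueError("unknown direction: %r" % (direction,))
    flows = []
    if direction in ("IN", "BOTH"):
        flows.append(
            'priority=20,dl_dst=%s actions=output:"mirror_in_%s"'
            % (port_mac, mirror_suffix))
    if direction in ("OUT", "BOTH"):
        flows.append(
            'priority=20,dl_src=%s actions=output:"mirror_out_%s"'
            % (port_mac, mirror_suffix))
    return flows
```

For direction BOTH the two output ports are the two tunnel ports, each created with its own tunnel_id.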

Out of Scope

This specification does not propose to make the OVN driver fully compatible with the current OVS or SRIOV driver. The proposed OVN driver will implement only ERSPAN and GRE mirroring.

Making the OVN driver fully feature compatible with the current OVS or SRIOV driver can be part of a future specification.

Implementation

Assignee(s)

Work Items

  • Add new REST API extension for tap-as-a-service, neutron-lib and tap-as-a-service changes.
  • Change tap-as-a-service db schema accordingly.
  • Adapt ovsdbapp to make it possible to manipulate both ovsdb and ovn-northd and create mirrors.
  • Change OVS driver.
  • Create a new OVN tap-as-a-service driver that supports only ERSPAN and GRE mirroring.
  • Update the documentation.
  • Implement the necessary tests.
    • end-to-end test in tempest can be done using Floating IPs.
  • Adapt OpenStackSDK and the necessary CLI code.
  • Adapt Heat to make it possible to create ERSPAN mirrors.

References