diff --git a/doc/source/contributor/internals/openvswitch_firewall.rst b/doc/source/contributor/internals/openvswitch_firewall.rst index 9261fa0df68..779df4d7143 100644 --- a/doc/source/contributor/internals/openvswitch_firewall.rst +++ b/doc/source/contributor/internals/openvswitch_firewall.rst @@ -30,10 +30,45 @@ Class ``SGPortMap`` was created to keep state consistent, and maps from ports to security groups and vice-versa. Every port and security group is represented by its own object encapsulating the necessary information. -Note: Open vSwitch firewall driver uses register 5 for marking flow -related to port and register 6 which defines network and is used for conntrack -zones. +.. note:: + Open vSwitch firewall driver uses ``register 5`` for identifying the port + related to the flow and ``register 6`` which identifies the network, used + in particular for conntrack zones. + +Ingress/Egress Terminology +-------------------------- + +In this document, the terms ``ingress`` and ``egress`` are relative to +a VM instance connected to OVS (or a netns connected to OVS): + +* ``ingress`` applies to traffic that will ultimately go into a VM (or into + a netns), assuming it is not dropped + +* ``egress`` applies to traffic coming from a VM (or from a netns) + +:: + + . . + _______|\ _______|\ + \ ingress \ \ ingress \ + /_______ / /_______ / + |/ .-----------------. |/ + ' | | ' + | |-----------( netns interface ) + ( non-VM, non-netns )---| OVS | + ( interface: phy, patch ) | |------------( VM interface ) + . | | . + /|________ '-----------------' /|________ + / egress / / egress / + \ ________\ \ ________\ + \| \| + ' ' + +Note that these terms are used differently in OVS code and documentation, where +they are relative to the OVS bridge, with ``ingress`` applying to traffic as +it comes into the OVS bridge, and ``egress`` applying to traffic as it leaves +the OVS bridge. Firewall API calls ------------------ @@ -139,39 +174,39 @@ Rules example with explanation: ------------------------------- The following example presents two ports on the same host. They have different -security groups and there is icmp traffic allowed from first security group to +security groups and there is ICMP traffic allowed from first security group to the second security group. Ports have following attributes: :: Port 1 - plugged to the port 1 in OVS bridge - - ip address: 192.168.0.1 - - mac address: fa:16:3e:a4:22:10 - - security group 1: can send icmp packets out + - IP address: 192.168.0.1 + - MAC address: fa:16:3e:a4:22:10 + - security group 1: can send ICMP packets out - allowed address pair: 10.0.0.1/32, fa:16:3e:8c:84:13 Port 2 - plugged to the port 2 in OVS bridge - - ip address: 192.168.0.2 - - mac address: fa:16:3e:24:57:c7 + - IP address: 192.168.0.2 + - MAC address: fa:16:3e:24:57:c7 - security group 2: - - can receive icmp packets from security group 1 - - can receive tcp packets from security group 1 - - can receive tcp packets to port 80 from security group 2 - - can receive ip packets from security group 3 + - can receive ICMP packets from security group 1 + - can receive TCP packets from security group 1 + - can receive TCP packets to port 80 from security group 2 + - can receive IP packets from security group 3 - allowed address pair: 10.1.0.0/24, fa:16:3e:8c:84:14 -``table 0`` contains a low priority rule to continue packets processing in -``table 60`` aka TRANSIENT table. ``table 0`` is left for use to other +|table_0| contains a low priority rule to continue packets processing in +|table_60| aka TRANSIENT table. |table_0| is left for use to other features that take precedence over firewall, e.g. DVR. The only requirement is -that after feature is done with its processing, it needs to pass packets for -processing to the TRANSIENT table. This TRANSIENT table distinguishes the -traffic to ingress or egress and loads to ``register 5`` value identifying port -traffic. -Egress flow is determined by switch port number and ingress flow is determined -by destination mac address. ``register 6`` contains port tag to isolate -connections into separate conntrack zones. +that after such a feature is done with its processing, it needs to pass packets +for processing to the TRANSIENT table. This TRANSIENT table distinguishes the +ingress traffic from the egress traffic and loads into ``register 5`` a value +identifying the port (for egress traffic based on the switch port number, and +for ingress traffic based on the network id and destination MAC address); +``register 6`` contains a value identifying the network (which is also the +OVSDB port tag) to isolate connections into separate conntrack zones. :: @@ -183,16 +218,17 @@ connections into separate conntrack zones. table=60, priority=90,dl_vlan=0x284,dl_dst=fa:16:3e:8c:84:14 actions=load:0x2->NXM_NX_REG5[],load:0x284->NXM_NX_REG6[],resubmit(,81) table=60, priority=0 actions=NORMAL -Following ``table 71`` implements arp spoofing protection, ip spoofing -protection, allows traffic for obtaining ip addresses (dhcp, dhcpv6, slaac, -ndp) for egress traffic and allows arp replies. Also identifies not tracked -connections which are processed later with information obtained from +The following table, |table_71| implements ARP spoofing protection, IP spoofing +protection, allows traffic related to IP address allocations (dhcp, dhcpv6, +slaac, ndp) for egress traffic, and allows ARP replies. Also identifies not +tracked connections which are processed later with information obtained from conntrack. Notice the ``zone=NXM_NX_REG6[0..15]`` in ``actions`` when obtaining information from conntrack. It says every port has its own conntrack zone -defined by value in ``register 6``. It's there to avoid accepting established -traffic that belongs to different port with same conntrack parameters. +defined by the value in ``register 6`` (OVSDB port tag identifying the network). +It's there to avoid accepting established traffic that belongs to different +port with same conntrack parameters. -The very first rule in ``table 71`` is a rule removing conntrack information +The very first rule in |table_71| is a rule removing conntrack information for a use-case where Neutron logical port is placed directly to the hypervisor. In such case kernel does conntrack lookup before packet reaches Open vSwitch bridge. Tracked packets are sent back for processing by the same table after @@ -218,7 +254,7 @@ solicitation and neighbour advertisement. table=71, priority=95,icmp6,reg5=0x2,in_port=2,icmp_type=135 actions=resubmit(,94) table=71, priority=95,icmp6,reg5=0x2,in_port=2,icmp_type=136 actions=resubmit(,94) -Following rules implement arp spoofing protection +Following rules implement ARP spoofing protection :: @@ -241,7 +277,7 @@ instances. table=71, priority=70,udp,reg5=0x2,in_port=2,tp_src=67,tp_dst=68 actions=resubmit(,93) table=71, priority=70,udp6,reg5=0x2,in_port=2,tp_src=547,tp_dst=546 actions=resubmit(,93) -Flowing rules obtain conntrack information for valid ip and mac address +Flowing rules obtain conntrack information for valid IP and MAC address combinations. All other packets are dropped. :: @@ -257,14 +293,14 @@ combinations. All other packets are dropped. table=71, priority=0 actions=drop -``table 72`` accepts only established or related connections, and implements -rules defined by the security group. As this egress connection might also be an +|table_72| accepts only established or related connections, and implements +rules defined by security groups. As this egress connection might also be an ingress connection for some other port, it's not switched yet but eventually -processed by ingress pipeline. +processed by the ingress pipeline. -All established or new connections defined by security group rule are +All established or new connections defined by security group rules are ``accepted``, which will be explained later. All invalid packets are dropped. -In case below we allow all icmp egress traffic. +In the case below we allow all ICMP egress traffic. :: @@ -273,10 +309,10 @@ In case below we allow all icmp egress traffic. table=72, priority=50,ct_state=+inv+trk actions=resubmit(,93) -Important on the flows below is the ``ct_mark=0x1``. Such value have flows that -were marked as not existing anymore by rule introduced later. Those are -typically connections that were allowed by some security group rule and the -rule was removed. +Important on the flows below is the ``ct_mark=0x1``. Flows that +were marked as not existing anymore by rule introduced later will value this +value. Those are typically connections that were allowed by some security group +rule and the rule was removed. :: @@ -293,7 +329,7 @@ allowed. table=72, priority=50,ct_state=-new-est+rel-inv,ct_zone=644,ct_mark=0,reg5=0x1 actions=resubmit(,94) table=72, priority=50,ct_state=-new-est+rel-inv,ct_zone=644,ct_mark=0,reg5=0x2 actions=resubmit(,94) -In the following flows are marked established connections that weren't matched +In the following, flows are marked established connections that weren't matched in the previous flows, which means they don't have accepting security group rule anymore. @@ -305,10 +341,10 @@ rule anymore. table=72, priority=40,ct_state=+est,reg5=0x2 actions=ct(commit,zone=NXM_NX_REG6[0..15],exec(load:0x1->NXM_NX_CT_MARK[])) table=72, priority=0 actions=drop -In following ``table 73`` are all detected ingress connections sent to ingress +In following |table_73| are all detected ingress connections sent to ingress pipeline. Since the connection was already accepted by egress pipeline, all remaining egress connections are sent to normal flood'n'learn switching -(table 94). +in |table_94|. :: @@ -322,8 +358,8 @@ remaining egress connections are sent to normal flood'n'learn switching table=73, priority=80,reg5=0x2 actions=resubmit(,94) table=73, priority=0 actions=drop -``table 81`` is similar to ``table 71``, allows basic ingress traffic for -obtaining ip address and arp queries. Note that vlan tag must be removed by +|table_81| is similar to |table_71|, allows basic ingress traffic for +obtaining IP address and ARP queries. Note that vlan tag must be removed by adding ``strip_vlan`` to actions list, prior to injecting packet directly to port. Not tracked packets are sent to obtain conntrack information. @@ -353,11 +389,11 @@ port. Not tracked packets are sent to obtain conntrack information. table=81, priority=80,ct_state=+trk,reg5=0x2 actions=resubmit(,82) table=81, priority=0 actions=drop -Similarly to ``table 72``, ``table 82`` accepts established and related -connections. In this case we allow all icmp traffic coming from +Similarly to |table_72|, |table_82| accepts established and related +connections. In this case we allow all ICMP traffic coming from ``security group 1`` which is in this case only ``port 1``. -The first four flows match on the ip addresses, and the -next two flows match on the icmp protocol. +The first four flows match on the IP addresses, and the +next two flows match on the ICMP protocol. These six flows define conjunction flows, and the next two define actions for them. @@ -404,15 +440,15 @@ the security group 2. These flows are from the following security group rules, :: - - can receive tcp packets from security group 1 - - can receive tcp packets to port 80 from security group 2 + - can receive TCP packets from security group 1 + - can receive TCP packets to port 80 from security group 2 and these rules have been processed by ``merge_port_ranges`` into: :: - - can receive tcp packets to port != 80 from security group 1 - - can receive tcp packets to port 80 from security group 1 or 2 + - can receive TCP packets to port != 80 from security group 1 + - can receive TCP packets to port 80 from security group 1 or 2 before translating to flows so that there is only one matching flow even when the TCP destination port is 80. @@ -429,10 +465,10 @@ flows, but the corresponding security group rules have different remote group IDs. Unlike the above TCP example, there's no convenient way of expressing ``protocol != TCP`` or ``icmp_code != 1``. So the OVS firewall uses a different priority than the previous TCP flows so -as not to mix up them. +as not to mix them up. The mechanism for dropping connections that are not allowed anymore is the -same as in ``table 72``. +same as in |table_72|. :: @@ -449,40 +485,53 @@ same as in ``table 72``. table=82, priority=0 actions=drop -Note: Conntrack zones on a single node are now based on network to which port is -plugged in. That makes a difference between traffic on hypervisor only and -east-west traffic. For example, if port has a VIP that was migrated to a port on -different node, then new port won't contain conntrack information about previous -traffic that happened with VIP. +.. note:: -Using OpenFlow in conjunction with OVS firewall ------------------------------------------------ + Conntrack zones on a single node are now based on the network to which + a port is plugged in. That makes a difference between traffic on hypervisor + only and east-west traffic. For example, if a port has a VIP that was + migrated to a port on a different node, then the new port won't contain + conntrack information about previous traffic that happened with VIP. -There are three tables where packets are sent once they get through the OVS -firewall pipeline. The tables can be used by other mechanisms using OpenFlow -that are supposed to work with the OVS firewall. Packets sent to ``table 91`` -are considered accepted by the egress pipeline, and will be processed so that -they are forwarded to their destination by being submitted to a NORMAL action -that results in Ethernet flood/learn processing. Note that ``table 91`` merely -resubmits to ``table 94``that contains the actual NORMAL action; this allows -to have``table 91`` be a single places where the NORMAL action can be overriden -by other components (currently used by ``networking-bagpipe`` driver for -``networking-bgpvpn``). +OVS firewall integration points +------------------------------- -Packets sent to ``table 92`` were processed by the ingress filtering pipeline. -As packets from the ingress filtering pipeline were injected to its -destination, ``table 92`` receives copies of those packets and therefore -default action is ``drop``. Finally, packets sent to ``table 93`` were filtered -by the firewall and should be dropped. Default action is ``drop`` in this -table. +There are three tables where packets are sent once after going through the OVS +firewall pipeline. The tables can be used by other mechanisms that are supposed +to work with the OVS firewall, typically L2 agent extensions. -In regard to the performance perspective, please note that only the first -accepted packet of each connection session will go to ``table 91`` and -``table 92``. +Egress pipeline +~~~~~~~~~~~~~~~ +Packets are sent to |table_91| and |table_94| when they are considered accepted +by the egress pipeline, and they will be processed so that they are forwarded +to their destination by being submitted to a NORMAL action, that results in +Ethernet flood/learn processing. + +Two tables are used to differentiate between the first packets of a connection +and the following packets. This was introduced for performance reasons to +allow the logging extension to only log the first packets of a connection. +Only the first accepted packet of each connection session will go to |table_91| +and the following ones will go to |table_94|. + +Note that |table_91| merely resubmits to |table_94| that contains the actual +NORMAL action; this allows to have a single place where the NORMAL action can +be overridden by other components (currently used by ``networking-bagpipe`` +driver for ``networking-bgpvpn``). + +Ingress pipeline +~~~~~~~~~~~~~~~~ + +The first packet of each connection accepted by the ingress pipeline is sent +to |table_92|. The default action in this table is DROP because at this point +the packets have already been delivered to their destination port. This +integration point is essentially provided for the logging extension. + +Packets are sent to |table_93| if processing by the ingress filtering +concluded that they should be dropped. Upgrade path from iptables hybrid driver -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +---------------------------------------- During an upgrade, the agent will need to re-plug each instance's tap device into the integration bridge while trying to not break existing connections. One @@ -505,3 +554,15 @@ firewall rules and finally remove the drop rules for the instance. use the OVS firewall, and instances from other nodes can be live-migrated to it. Once the first node is evacuated, its firewall driver can be then be switched to the OVS driver. + +.. |table_0| replace:: ``table 0`` (LOCAL_SWITCHING) +.. |table_60| replace:: ``table 60`` (TRANSIENT) +.. |table_71| replace:: ``table 71`` (BASE_EGRESS) +.. |table_72| replace:: ``table 72`` (RULES_EGRESS) +.. |table_73| replace:: ``table 73`` (ACCEPT_OR_INGRESS) +.. |table_81| replace:: ``table 81`` (BASE_INGRESS) +.. |table_82| replace:: ``table 82`` (RULES_INGRESS) +.. |table_91| replace:: ``table 91`` (ACCEPTED_EGRESS_TRAFFIC) +.. |table_92| replace:: ``table 92`` (ACCEPTED_INGRESS_TRAFFIC) +.. |table_93| replace:: ``table 93`` (DROPPED_TRAFFIC) +.. |table_94| replace:: ``table 94`` (ACCEPTED_EGRESS_TRAFFIC_NORMAL) diff --git a/neutron/plugins/ml2/drivers/openvswitch/agent/common/constants.py b/neutron/plugins/ml2/drivers/openvswitch/agent/common/constants.py index f7b83757d3c..06ce66c1f02 100644 --- a/neutron/plugins/ml2/drivers/openvswitch/agent/common/constants.py +++ b/neutron/plugins/ml2/drivers/openvswitch/agent/common/constants.py @@ -75,7 +75,6 @@ OVS_FIREWALL_TABLES = ( ACCEPTED_EGRESS_TRAFFIC_TABLE = 91 ACCEPTED_INGRESS_TRAFFIC_TABLE = 92 DROPPED_TRAFFIC_TABLE = 93 - ACCEPTED_EGRESS_TRAFFIC_NORMAL_TABLE = 94 # --- Tunnel bridge (tun_br)