Browse Source

Nested CNI: Remove interfaces on DEL requests

Historically we haven't been doing any actions on CNI DEL request when
running in nested mode because all the interfaces we created will be
gone when netns will get deleted. There is an issue with that approach:

1. We get ADD request for pod A in netns X and correctly bind it using
   VLAN ID V.
2. We get DEL request for same pod A and netns X. We don't do anything.
3. Netns is not yet deleted…
4. Another ADD request comes for pod A but now with netns Y (normal K8s
   behavior).
   a) We try binding it, but there is a veth pair with VLAN ID V already
   on the host namespace, so we end up with VLAN ID conflict.
   b) Binding fails with EEXIST error.

The problem with this hypothesis is that the EEXIST error seems to be
persistent and in theory it should disappear after the netns gets
deleted but maybe there's a longer delay in the netns deletion.

This commit solves this problem by making sure we always clean up all
interfaces on DEL requests.

Change-Id: I01b66dcd5c7c0e8193150a0a1b20878aaa8ec9e3
Closes-Bug: 1854928
changes/29/711229/2
Michał Dulko 2 years ago
parent
commit
a0a6e65bb0
  1. 10
      kuryr_kubernetes/cni/binding/nested.py
  2. 7
      kuryr_kubernetes/tests/unit/cni/test_binding.py

10
kuryr_kubernetes/cni/binding/nested.py

@ -110,9 +110,13 @@ class NestedDriver(health.HealthHandler, b_base.BaseBindingDriver,
iface.up()
def disconnect(self, vif, ifname, netns, container_id):
# NOTE(vikasc): device will get deleted with container namespace, so
# nothing to be done here.
pass
# NOTE(dulek): Interfaces should get deleted with the netns, but it may
# happen that kubelet or crio will call new CNI ADD before
# the old netns is deleted. This might result in VLAN ID
# conflict. In oder to protect from that let's remove the
# netns ifaces here anyway.
with b_base.get_ipdb(netns) as c_ipdb:
self._remove_ifaces(c_ipdb, (vif.vif_name, ifname), netns)
class VlanDriver(NestedDriver):

7
kuryr_kubernetes/tests/unit/cni/test_binding.py

@ -97,8 +97,13 @@ class TestDriverMixin(test_base.TestCase):
if report:
report.assert_called_once()
@mock.patch('kuryr_kubernetes.cni.binding.base.get_ipdb')
@mock.patch('os_vif.unplug')
def _test_disconnect(self, m_vif_unplug, report=None):
def _test_disconnect(self, m_vif_unplug, m_get_ipdb, report=None):
def get_ipdb(netns=None):
return self.ipdbs[netns]
m_get_ipdb.side_effect = get_ipdb
base.disconnect(self.vif, self.instance_info, self.ifname, self.netns,
report)
m_vif_unplug.assert_called_once_with(self.vif, self.instance_info)

Loading…
Cancel
Save