33 Commits

Author SHA1 Message Date
Michał Dulko
0176b4a98c CNI: Watch for deleted pods
It can happen that we get the CNI request, but the pod gets deleted
before kuryr-controller is able to create a KuryrPort for it. If
kuryr-daemon only watches for KuryrPort events, it will not notice that
and will wait until the timeout, which doesn't play well with some K8s
tests.

This commit adds a separate Service that will watch on Pod events and if
Pod gets deleted we'll make sure to put a sentinel value (None) into the
registry so that the thread waiting for the KuryrPort to appear there
will know that it has to stop and raise an error.

Closes-Bug: 1963678
Change-Id: I52fc1805ec47f24c1da88fd13e79e928a3693419
2022-03-08 12:28:48 +01:00
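
A minimal sketch of the sentinel idea from 0176b4a98c, assuming the
registry behaves like a dict guarded by a condition variable. Registry,
PodDeletedError and wait_for are hypothetical names used only for
illustration and are not taken from the Kuryr code base.

```python
import threading
import time


class PodDeletedError(Exception):
    """Hypothetical error: the pod vanished before it could be wired."""


class Registry:
    """Toy stand-in for the shared registry (an assumption, not Kuryr's)."""

    def __init__(self):
        self._data = {}
        self._cond = threading.Condition()

    def put(self, key, value):
        # The pod watcher would call put(key, None) on a Pod DELETED event.
        with self._cond:
            self._data[key] = value
            self._cond.notify_all()

    def wait_for(self, key, timeout):
        deadline = time.monotonic() + timeout
        with self._cond:
            while key not in self._data:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    raise TimeoutError("no KuryrPort for %s" % key)
                self._cond.wait(remaining)
            value = self._data[key]
        if value is None:
            # Sentinel: stop waiting right away instead of timing out.
            raise PodDeletedError("pod %s was deleted" % key)
        return value
```

With this shape the waiting thread fails fast on deletion and only hits
the timeout when neither a KuryrPort nor the sentinel ever shows up.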
Michał Dulko
d5f5db7005 CNI: Use K8S_POD_UID passed from CRI
Recent versions of cri-o and containerd are passing K8S_POD_UID as a CNI
argument, alongside K8S_POD_NAMESPACE and K8S_POD_NAME. As the latter
two variables cannot be used to safely identify a pod in the API
(StatefulSet recreates pods with the same name), we were prone to race
conditions in the CNI code that we could only work around. The end
effect was mostly IP conflicts.

Now that the UID argument is passed, we're able to compare the UID from
the request with the one in the API to make sure we're wiring the
correct pod. This commit implements that by making sure to move the
check to the code actually waiting for the pod to appear in the
registry. In case of K8S_POD_UID missing from the CNI request, API call
to retrieve Pod is used as a fallback.

We also know that this check doesn't work for static pods, so the CRD
and controller needed to be updated to record in the KuryrPort spec
whether the pod is static, so that we can skip the check for static
pods without the need to fetch the Pod from the API.

Closes-Bug: 1963677
Change-Id: I5ef6a8212c535e90dee049a579c1483644d56db8
2022-03-08 12:28:48 +01:00
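
The UID check from d5f5db7005 could look roughly like the sketch below.
It assumes CNI_ARGS arrives as a semicolon-separated KEY=VALUE string
and that the KuryrPort spec carries the pod UID and a static flag. The
field names podUid and podStatic, and the fetch_pod_uid callback standing
in for the API fallback, are assumptions made for this example.

```python
def parse_cni_args(cni_args):
    """Parse a string such as 'K8S_POD_NAME=a;K8S_POD_UID=b' into a dict."""
    pairs = (item.split("=", 1) for item in cni_args.split(";") if "=" in item)
    return dict(pairs)


def is_expected_pod(kuryrport, cni_args, fetch_pod_uid):
    """Return True if the KuryrPort belongs to the pod from the request."""
    spec = kuryrport.get("spec", {})
    if spec.get("podStatic", False):
        # The UID check is skipped for static pods, no API call needed.
        return True
    uid = parse_cni_args(cni_args).get("K8S_POD_UID")
    if uid is None:
        # Older runtimes do not pass the UID: fall back to asking the API.
        uid = fetch_pod_uid()
    return spec.get("podUid") == uid
```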
Roman Dobosz
7369dc2f67 Add events for Network Policy related activities.
Besides that, added ownerReferences to the KuryrNetworkPolicy, so that
we don't need to query the Kubernetes API for the uid of the
NetworkPolicy object. A side effect of that field is that the
KuryrNetworkPolicy will be removed alongside the NetworkPolicy, but
that's acceptable, since we remove it anyway after NetworkPolicy
removal.

Change-Id: Ia9fb5cac516bc042c20897d8527afdfb8661b42b
2021-12-16 14:16:08 +01:00
Zuul
b4699f697e Merge "Add events for Services" 2021-12-15 21:28:45 +00:00
Michał Dulko
ca719a4009 Add events for Services
This commit implements adding Events for various things that may happen
when Kuryr handles a Service - either incidents or just informative
messages helpful to the user.

Adding an Event requires the uid of the object, and in most of
KuryrLoadBalancerHandler we don't have the Service representation, so
this commit populates ownerReferences of the KuryrLoadBalancer object
with a reference to the Service. This has a major side effect that the
KLB will be garbage collected if the corresponding Service doesn't
exist, which is probably a good thing (and we manage that ourselves
using finalizers anyway).

Another set of refactorings is to remove KLB creation from
EndpointsHandler in order to stop it fighting with ServiceHandler over
creation - EndpointsHandler cannot add ownerReference as it has no uid
of the Service.

Other refactorings related to the error messages are also included.

Change-Id: I36b4d62e6fc7ace00909e9b1aa7681f6d59ca455
2021-12-13 17:05:46 +01:00
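
As an illustration of the ownerReferences approach in ca719a4009, the
sketch below builds a KuryrLoadBalancer manifest that points back at its
Service. The helper names and the apiVersion value are assumptions. The
four ownerReferences fields used (apiVersion, kind, name, uid) are the
ones Kubernetes requires.

```python
def owner_reference(service):
    """Build an ownerReferences entry pointing at the given Service dict."""
    meta = service["metadata"]
    return {
        "apiVersion": service["apiVersion"],
        "kind": service["kind"],
        "name": meta["name"],
        "uid": meta["uid"],
    }


def build_klb(service):
    """Hypothetical helper: a KLB manifest owned by the Service."""
    meta = service["metadata"]
    return {
        "apiVersion": "openstack.org/v1",  # assumed CRD group/version
        "kind": "KuryrLoadBalancer",
        "metadata": {
            "name": meta["name"],
            "namespace": meta["namespace"],
            "ownerReferences": [owner_reference(service)],
        },
        "spec": {},
    }
```

Any handler that later holds only the KLB can then read the Service uid
from metadata.ownerReferences when it needs to attach an Event.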
Michał Dulko
1933ab1142 CNI: Improve logging of timeout errors
There are 2 places we can time out in CNI:
1. KuryrPort hasn't been created for a pod we got an ADD request for.
2. Ports we plugged for a pod aren't moving to ACTIVE state.

This commit improves logging in these two cases by making sure an
exception with a meaningful message is raised when such a timeout
occurs. The messages include clues on where to look for the root cause
next - kuryr-controller in case of 1 and Neutron in case of 2.

I'm also disabling putting kuryr-cni into unhealthy state in these
cases. Those errors cannot be cleared by a restart of the pod.

Change-Id: I5b881b58fe7d6dfed66a7bb6e3473b5b7939854d
2021-12-10 12:37:15 +01:00
Michał Dulko
87981d0652 Do not start kuryr-daemon when worker_num <= 1
We've discovered that running kuryr-daemon with [cni_daemon]worker_num=1
breaks pyroute2.IPDB's ability to correctly close threads, leading to a
process leak. This commit makes sure kuryr-daemon will fail to start
when worker_num <= 1.

This required a few more changes in order to make sure that when any
kuryr-daemon subservice dies, kuryr-daemon will shut down too.

Change-Id: I41afc6fa67abfff62d2f0017db508051a1e7edf4
2021-11-05 14:25:22 +01:00
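
The startup guard described in 87981d0652 amounts to a simple validation
step. The function below is a hypothetical sketch of such a check, not
the actual kuryr-daemon code. Only the option name [cni_daemon]worker_num
comes from the commit message.

```python
def check_worker_num(worker_num):
    """Refuse to continue when the worker count is too low."""
    if worker_num <= 1:
        raise ValueError(
            "[cni_daemon]worker_num must be greater than 1, got %d"
            % worker_num)
    return worker_num
```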
Michał Dulko
a17e5502c2 Show error messages when resources are stuck
If an Octavia loadbalancer is stuck in PENDING_UPDATE state or Neutron
port is DOWN despite being plugged there's not much Kuryr can do. For
such cases we need to clearly message the user that the error they're
seeing is caused by OpenStack service misbehaving and not Kuryr.

This commit does so by making sure in such cases we raise a distinct
version of ResourceNotReady exception.

Change-Id: I2dd1e8989caf004b3dee0cb51780a45ce8d9353c
Closes-Bug: 1918711
2021-06-17 17:30:17 +02:00
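
One way to raise a "distinct version" of ResourceNotReady, as a17e5502c2
describes, is a subclass that names the misbehaving service. The class
below is a sketch under that assumption. OpenStackResourceStuck is a
made-up name and the message wording is illustrative.

```python
class ResourceNotReady(Exception):
    """Simplified stand-in for the existing exception."""

    def __init__(self, resource):
        super().__init__("Resource not ready: %r" % (resource,))


class OpenStackResourceStuck(ResourceNotReady):
    """Hypothetical subclass used when an OpenStack resource is stuck."""

    def __init__(self, resource, service):
        super().__init__(resource)
        self.service = service

    def __str__(self):
        return ("%s. This is most likely caused by %s misbehaving, "
                "not by Kuryr." % (super().__str__(), self.service))
```

Because it is still a ResourceNotReady, existing retry logic keeps
working while the log message tells the user where to look.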
Sunday Mgbogu
d0331abc38 Include proper log when Kuryr cannot reach Octavia API
This is not the proper way of informing the user that Octavia returns
503. We should have a clear message, or we'll start getting bug reports.

Closes-bug: 1918708

Change-Id: I871c3998edb5b1d594067b60e908c453ad122dde
2021-04-21 07:48:18 +01:00
Zuul
fba2c4dfa1 Merge "Civilize logging vol 2" 2020-09-22 10:51:05 +00:00
Roman Dobosz
5c855d9611 Added new K8sFieldValueForbidden exception.
During tests it turned out that we didn't catch Forbidden exceptions,
since no forbidden HTTP code was sent by the Kubernetes API. This patch
introduces a new exception, K8sFieldValueForbidden, which is raised when
the K8s API returns 422 Unprocessable Entity.

Also, the remove_finalizer method now takes care of objects in the
terminating state.

Closes-Bug: 1895124
Change-Id: If4ac93190db3a56ee6b94ca122bfd2e95c29ffb9
2020-09-21 10:42:34 +00:00
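
The status-code mapping from 5c855d9611 can be pictured as follows. This
is a sketch only: the base class name and the raise_for_status helper
are assumptions, and only K8sFieldValueForbidden and the 422 status come
from the commit message.

```python
from http import HTTPStatus


class K8sClientException(Exception):
    """Assumed base class for client errors."""


class K8sFieldValueForbidden(K8sClientException):
    """Raised when the API answers 422 Unprocessable Entity."""


def raise_for_status(status_code, body=""):
    """Hypothetical helper translating HTTP status codes to exceptions."""
    if status_code == HTTPStatus.UNPROCESSABLE_ENTITY:  # 422
        raise K8sFieldValueForbidden(body)
    if status_code >= 400:
        raise K8sClientException(body)
```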
Maysa Macedo
7894021931 Clean lb crd status upon Load Balancer removal
When a lb transitions to ERROR or the IP on the Service
spec differs from the lb VIP, the lb is released and
the CRD doesn't get updated, causing Not Found exceptions
when handling the creation of other load balancer
resources. This commit fixes the issue by ensuring the
cleanup of the status field happens upon lb release.
Also, it adds protection in case we still get a
nonexistent lb on the CRD.

Closes-Bug: 1894758
Change-Id: I484ece6a7b52b51d878f724bd4fad0494eb759d6
2020-09-19 14:57:25 +00:00
Michał Dulko
9743f6b3c9 Civilize logging vol 2
This is another attempt at getting the useless tracebacks out of logs
for any level higher than debug. In particular:

* All pool logs regarding "Kuryr-controller not yet ready to *" are now
  on debug level.
* The ugly ResourceNotReady raised from _populate_pool is now suppressed
  if method is run from eventlet.spawn(), which prevents that exception
  being logged by eventlet.
* ResourceNotReady will only print namespace/name or name, not the full
  resource representation.
* ConnectionResetError is suppressed on Retry handler level just as any
  other K8sClientError.

Change-Id: Ic6e6ee556f36ef4fe3429e8e1e4a2ddc7e8251dc
2020-09-17 12:15:39 +02:00
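
The "namespace/name or name" formatting mentioned in 9743f6b3c9 could be
done by a small helper like the one below. resource_label is a
hypothetical name, and the sketch assumes resources are plain dicts with
a metadata section (or already a string).

```python
def resource_label(resource):
    """Return 'namespace/name' or just 'name' for a K8s resource dict."""
    if isinstance(resource, str):
        return resource
    meta = resource.get("metadata", {})
    name = meta.get("name", "<unknown>")
    namespace = meta.get("namespace")
    return "%s/%s" % (namespace, name) if namespace else name
```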
Roman Dobosz
24915ad66c Add finalizer for the pod as soon as possible.
We observed a situation where a pod was created, triggered KuryrPort
CRD creation and was removed just before the KuryrPort on_present event
was handled, resulting in errors from the KuryrPort side.

The idea is to set the finalizer on the pod as quickly as possible, so
that it won't disappear before we correctly set up the CRD.

Change-Id: Iac82bb05a465e94e47356c3c873e11f00e5d0cd9
2020-08-11 12:55:45 +02:00
Michał Dulko
b6c89debdd Implement add_finalizer and remove_finalizer
This commit adds helper client methods that will aid in working with
finalizers of the CRDs and other resources. Also, the one place where we
already remove a finalizer is updated to use the methods.

Change-Id: I665e03f80102a08b2c3ec412a4417c3a32f9384b
2020-07-28 19:01:22 +02:00
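
To show what add_finalizer and remove_finalizer from b6c89debdd have to
do to an object, here is a sketch that edits metadata.finalizers on an
in-memory dict. This is an assumption-laden simplification: the real
client methods talk to the Kubernetes API, and their signatures may
differ from the ones used here.

```python
def add_finalizer(obj, finalizer):
    """Add the finalizer if missing; return True when the object changed."""
    finalizers = obj["metadata"].setdefault("finalizers", [])
    if finalizer in finalizers:
        return False
    finalizers.append(finalizer)
    return True


def remove_finalizer(obj, finalizer):
    """Remove the finalizer if present; return True when the object changed."""
    finalizers = obj["metadata"].get("finalizers", [])
    if finalizer not in finalizers:
        return False
    finalizers.remove(finalizer)
    return True
```

Both helpers are idempotent, so calling them repeatedly for the same
event is harmless.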
Luis Tomas Bolivar
780c4dfa09 Namespace event handling through KuryrNet CRD
This patch moves the namespace handling to be more aligned
with the k8s style.

Depends-on: If0aaf748d13027b3d660aa0f74c4f6653e911250

Change-Id: Ia2811d743f6c4791321b05977118d0b4276787b5
2020-03-13 12:30:07 +01:00
Michał Dulko
574f5eab4b Nested: Detect MTU mismatch
Apparently it is possible to override Neutron's MTU setting through DHCP
agent. This may lead to a situation where the node (VM) network has a
different MTU than the pod network. In such a case setting the pod
network's MTU on a Pod's veth pair will fail due to MTU mismatch.

This commit makes sure we detect such a situation early and produce a
log message with a hint about the root cause.

Change-Id: Ib694950c77ac7c3fd480f579b627dc79bfceac85
Closes-Bug: 1863212
2020-02-21 12:02:58 +01:00
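
An early MTU check in the spirit of 574f5eab4b might look like the
sketch below. It assumes the failure happens when the pod network's MTU
is larger than the MTU of the node interface it is stacked on.
MTUMismatchError and check_mtu are hypothetical names.

```python
class MTUMismatchError(Exception):
    """Hypothetical error carrying a hint about the root cause."""


def check_mtu(node_mtu, pod_network_mtu):
    """Fail early, with a hint, before trying to configure the interface."""
    if pod_network_mtu > node_mtu:
        raise MTUMismatchError(
            "Pod network MTU (%d) is larger than node interface MTU (%d). "
            "Check whether the MTU is being overridden, e.g. through the "
            "DHCP agent." % (pod_network_mtu, node_mtu))
```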
Maysa Macedo
0814ccaac6 Remove openshift routes(Ingress) support
Route pods from openshift can be used instead and the code
is not being used/maintained.

Change-Id: I76448752ba07f4b30dbfa783c2ae99d46e730eaf
2020-02-01 16:09:28 +00:00
Peng Liu
70ee5ad132 Implement NPWG multi-vif driver
This patch creates an NPWG multi-vif driver which can parse the
Pod annotations and CRD defined in the Network Plumbing Working
Group CRD spec.

Implements: blueprint kuryr-npwg-spec-support
Change-Id: I9ee9643b468a5fe453541b9cf1acf31ca872a313
2018-08-09 17:31:21 +08:00
Yossi Boaron
4ab102afa8 OCP-Router: Ingress controller support
This is the second patch of the Ingress Controller capability.

In order for the K8S Ingress and OpenShift Route resources to work,
the cluster must have an Ingress Controller running.

This patch extends LBaaS driver to support L7 load balancing and
verifies, retrieves and stores the L7 router LB (pre-created by admin or
Devstack) details.
The OCP-route and K8S-endpoint handlers (implemented in next patch) will
query the ingress controller for the L7 router details.

Partially Implements: blueprint openshift-router-support

Change-Id: Id55169f6c9c1c607b2aa54c92711dfbd04a9e39d
2018-06-15 14:34:57 +00:00
Luis Tomas Bolivar
d5d4ef1f9d Add namespace subnet driver for namespace creation
This patch adds a new subnet driver that creates a new network
for each created k8s namespace. It makes use of K8s CRDs to store
the information about the network resources created for each
namespace.

Partially Implements: blueprint network-namespace

Change-Id: I7988e1da7a9ed57f29c85ddcd99bb2c87808010e
2018-05-25 08:57:42 +02:00
Zuul
5cf852da91 Merge "Services: Rollback openstack resources in case of annotation failure" 2018-03-14 15:13:58 +00:00
Luis Tomas Bolivar
a83eba5fa1 Add multi pools support
This patch adds support for nodes with different vif drivers as
well as different pool drivers for each vif driver type.

Closes-Bug: 1747406
Change-Id: I842fd4b513a5f325d598d677e5008f9ea51adab9
2018-03-07 13:06:56 +01:00
Yossi Boaron
2e6c7eaae7 Services: Rollback openstack resources in case of annotation failure
Upon K8S service creation the LBaaS handler creates all LB resources
in Neutron (LB, Listener, Pool, etc.) and stores them on the K8S
resource using an annotation.
When the K8S service is deleted, the LBaaS handler retrieves the LB
resource details from the annotation and releases them in Neutron.

This patch handles the case in which K8S service resource was deleted
before LBaaS handler stored openstack resource details.

Closes-Bug: 1748890

Change-Id: Iea806d32c99cd3cf51a832b576ff4054fc522bd3
2018-02-25 09:37:57 +02:00
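
The rollback behaviour from 2e6c7eaae7 follows a common
create/record/undo pattern. The sketch below shows that pattern with
three caller-supplied callables. All names are hypothetical and no
actual LBaaS handler code is reproduced.

```python
def create_with_rollback(create, annotate, release):
    """Create resources, record them, and undo the creation on failure."""
    resources = create()
    try:
        annotate(resources)
    except Exception:
        # The Service may be gone already; do not leak OpenStack resources.
        release(resources)
        raise
    return resources
```

For example, if annotate fails because the Service was deleted in the
meantime, release is called with the same resources before the error
propagates.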
Gary Loughnane
04b17e4a06 Add MACVLAN based interfaces for nested containers
Currently nested containers can only be run by using trunk support and
vlan based interfaces. This patch introduces the additional option of
MACVLAN slave interfaces for pods running in VMs.

This patch includes both a new VIF driver on the controller side and the
binding driver for the CNI plugin.

Implements: blueprint macvlan-pod-in-vm
Depends-On: Ib71204d2d14d3d4f15beada701094e37d89d7801
Co-Authored-By: Marco Chiappero <marco.chiappero@intel.com>
Change-Id: I03c536bb0057bba0a5eb4d1c135baa8ab625e400
2017-06-12 13:14:12 +01:00
Marco Chiappero
d458322e4b Refactor the class hierarchy of controller drivers
In order to better organize nested drivers (VLAN and MACVLAN),
refactor the class hierarchy of VIF drivers, providing better locations
for shared code. In particular:

- add an additional abstract class named NestedPodVIFDriver for nested
drivers to share common code, to accommodate the upcoming MACVLAN
driver
- rename GenericPodVIFDriver to NeutronPodVIFDriver (all the drivers are
Neutron specific)

This change is part of the MACVLAN based pod-in-VM spec and should be
applied before any following MACVLAN related patches.

Implements: blueprint
https://blueprints.launchpad.net/kuryr-kubernetes/+spec/macvlan-pod-in-vm

Change-Id: Ib71204d2d14d3d4f15beada701094e37d89d7801
Signed-off-by: Marco Chiappero <marco.chiappero@intel.com>
2017-06-05 17:33:11 +01:00
shihanzhang
1d35146a46 Remove log translations
Log messages are no longer being translated. This removes all use of
the _LE, _LI, and _LW translation markers to simplify logging and to
avoid confusion with new contributions.

See:
http://lists.openstack.org/pipermail/openstack-i18n/2016-November/002574.html
http://lists.openstack.org/pipermail/openstack-dev/2017-March/113365.html

Change-Id: If4735fc3ac1803585efd90657539e540d157a59a
2017-03-28 15:13:49 +08:00
vikaschoudhary16
dc65eb1cbd Add support for nested pods with Vlan trunk port
Enable support for pods running in Nova vms.

I will be pushing a patch with devstack plugin changes.

Reference: https://review.openstack.org/#/c/411116/1/doc/source/devref/howto_binding_drivers.rst
Change-Id: Ib2aed7a0d1fa705f17a62d0fa4e272f19212e39e
Partially-Implements: blueprint binding-drivers-porting
2017-01-18 16:57:32 +05:30
Ilya Chukhnakov
fa03953aff Experimental CNI & VIFBridge binding
This patch provides an experimental CNI driver. Its primary purpose
is to enable development of other components (e.g. functional tests,
service/LBaaSv2 support). It is expected to be replaced with a daemon
to configure VIF and connect it to the pods and a small lightweight
client to serve as the CNI driver called by Kubernetes.

NOTE: unit tests are not provided as part of this patch as it is yet
unclear what parts of it will be reused in daemon-based
implementation.

Change-Id: Iacc8439dd3aee910d542e48ed013d6d3f354786e
Partially-Implements: blueprint kuryr-k8s-integration
2016-12-05 18:05:22 +00:00
Ilya Chukhnakov
d6dd891bef Generic VIF controller driver
This patch introduces a driver that manages normal Neutron ports to
provide VIFs for Kubernetes Pods.

Change-Id: Ice32e96e107f7b7331caca3b79c488532710b4a2
Partially-Implements: blueprint kuryr-k8s-integration
2016-11-22 18:34:10 +00:00
Ilya Chukhnakov
634290839a Port-to-VIF os-vif translator for hybrid OVS case
This patch introduces Port-to-VIF translation to 'os_vif_util' and
implements a translator that supports hybrid OpenVSwitch plugging
case.

Change-Id: I9f5c36fa32b51da8cccf377455b096270f23a782
Partially-Implements: blueprint kuryr-k8s-integration
2016-11-22 15:01:06 +03:00
Ilya Chukhnakov
5f6c9a574e Retry handler
This patch adds the Retry handler that can be used as part of the
event handling pipeline to retry failed handlers.

Change-Id: Ia86790de8efa6a3ef5b677a70ffbd2d8201f9d95
Partially-Implements: blueprint kuryr-k8s-integration
2016-10-31 10:53:07 +00:00
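
A retry wrapper for an event-handling pipeline, as introduced in
5f6c9a574e, can be sketched as a callable that wraps another handler.
The constructor parameters, the exponential backoff and the injectable
sleep function are assumptions chosen for this example and may not
match the actual Retry handler.

```python
import time


class Retry:
    """Call the wrapped handler again when it raises a listed exception."""

    def __init__(self, handler, exceptions=Exception, attempts=3,
                 interval=1.0, sleep=time.sleep):
        self._handler = handler
        self._exceptions = exceptions
        self._attempts = attempts
        self._interval = interval
        self._sleep = sleep  # injectable so tests do not really sleep

    def __call__(self, event):
        for attempt in range(1, self._attempts + 1):
            try:
                return self._handler(event)
            except self._exceptions:
                if attempt == self._attempts:
                    raise  # out of attempts, let the caller see the error
                # Exponential backoff: interval, 2 * interval, 4 * interval...
                self._sleep(self._interval * 2 ** (attempt - 1))
```

Since the wrapper is itself a callable taking an event, it can be placed
anywhere in a chain of handlers without the inner handler knowing.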
Ilya Chukhnakov
d68a97fe47 K8s and Neutron clients support
Adds basic K8s client implementation and CONF-based singletons for
both Neutron and K8s clients.

The K8s client added by this patch should be considered a temporary
solution that only implements the necessary parts to let us move
forward with kuryr-kubernetes. Eventually it will be replaced by either
[1] or [2].

The problem with [1] is that it does not yet support the streaming API
that we need for WATCH. And [2] is outside of the OSt umbrella, so [1]
is preferred over [2] unless [2] makes it into global-requirements.txt.

[1] https://github.com/openstack/python-k8sclient
[2] https://pypi.python.org/pypi/pykube

NOTE: Removed py3-related code from config and top-level __init__.
      How to properly deal with that code is TBD.

Change-Id: Ib4eb410eaf9725c296fcdddd8857eb24b8929915
Partially-Implements: blueprint kuryr-k8s-integration
2016-10-03 16:07:03 +00:00