Because Kubernetes will stop propagating the metadata.selfLink field in
release 1.20 and will remove it in release 1.21, we need to adapt and
calculate the selfLink equivalent ourselves.
This patch introduces a new function that returns the path for the
provided resource.
We also need to deal with lists of resources, since for some reason CR
items do carry apiVersion and kind, whereas for core resources the
apiVersion is only set on the top-level list object, the kind is
something like *List, and the objects within 'items' have neither
apiVersion nor kind.
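
A minimal sketch of the idea, not the code from this patch: the function
names (resource_path, annotate_list_items), the naive pluralization and
the IRREGULAR_PLURALS table are assumptions made for illustration only.

```python
# Hypothetical helper names; only the URL layout (/api/<version> for core
# resources, /apis/<group>/<version> for everything else) is standard K8s.
IRREGULAR_PLURALS = {
    'Endpoints': 'endpoints',
    'NetworkPolicy': 'networkpolicies',
    'Ingress': 'ingresses',
}


def resource_path(obj):
    """Build a selfLink-like path from apiVersion, kind and metadata."""
    api_version = obj['apiVersion']
    kind = obj['kind']
    # Core resources have a bare version ("v1"), others "group/version".
    prefix = '/apis/' if '/' in api_version else '/api/'
    # Assumption: lowercase kind plus "s" unless listed as irregular.
    plural = IRREGULAR_PLURALS.get(kind, kind.lower() + 's')
    meta = obj['metadata']
    parts = [prefix + api_version]
    namespace = meta.get('namespace')
    if namespace:
        parts += ['namespaces', namespace]
    parts += [plural, meta['name']]
    return '/'.join(parts)


def annotate_list_items(res_list):
    """Copy apiVersion/kind from a *List object onto items lacking them."""
    list_kind = res_list['kind']
    item_kind = list_kind
    if list_kind.endswith('List'):
        item_kind = list_kind[:-len('List')]
    items = []
    for item in res_list.get('items', []):
        item = dict(item)  # shallow copy, the input list is not modified
        item.setdefault('apiVersion', res_list['apiVersion'])
        item.setdefault('kind', item_kind)
        items.append(item)
    return items
```

Items returned by annotate_list_items() can then be passed to
resource_path() regardless of whether they came from a core list or a CR
list.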
Implements: blueprint selflink
Change-Id: I721a46ea0379382f7eb2e13c59bd193314f37e7f
Kuryr-Kubernetes relies on watching resources in the K8s API using an
HTTP stream served by kube-apiserver. In such a distributed system this
is sometimes unstable and e.g. etcd issues can cause some events to be
omitted. To protect the controller from such situations, this patch
makes sure that a full list of resources is periodically fetched and
injected as events into the handlers.
We should probably do the same for the kuryr-daemon watcher, but that
case is less problematic as it'll be restarted in the event of ADD
requests timing out.
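
The periodic resync described above could look roughly like the sketch
below. It is an illustration under assumptions, not the patch itself:
the function names, the fetch_list/handler callables and the choice of
'MODIFIED' as the injected event type are all hypothetical.

```python
import threading


def list_to_events(res_list, event_type='MODIFIED'):
    """Wrap every item of a list response in a watch-style event dict."""
    return [{'type': event_type, 'object': item}
            for item in res_list.get('items', [])]


def resync_once(fetch_list, handler):
    """Fetch the full list once and feed each item to the handler."""
    events = list_to_events(fetch_list())
    for event in events:
        handler(event)
    return len(events)


def resync_periodically(fetch_list, handler, interval, stop_event):
    """Repeat resync_once() every `interval` seconds until stopped.

    stop_event is a threading.Event; wait() returns True once it is set,
    which ends the loop.
    """
    while not stop_event.wait(interval):
        resync_once(fetch_list, handler)
```

Handlers fed this way must tolerate seeing the same object repeatedly,
because every resync replays resources that may already have been
processed.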
Change-Id: I67874d086043071de072420df9ea5e86b3f2582e
We don't need the exit_on_stop option anymore as we always set it to
True. This commit removes it to simplify the code.
Change-Id: I258709095727d87ede66ad6846c6e56ebdabec91
Currently, if the number of neutron resources requested reaches
the quota, kuryr-controller is marked as unhealthy and restarted.
In order to avoid the constant restart of the pod, this patch adds
a new readiness check that verifies whether the resources used by
the enabled handlers are over quota.
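
A rough sketch of such a check, under assumptions: the function names
are hypothetical, and quota_details is assumed to be a dict mapping a
resource name to its 'limit' and 'used' counters, with a negative limit
meaning "unlimited". How the counters are obtained from Neutron is left
out.

```python
def over_quota_resources(quota_details, resources):
    """Return the names from `resources` that have no quota left."""
    exhausted = []
    for name in resources:
        info = quota_details.get(name)
        if info is None:
            continue  # no quota information for this resource
        limit = info['limit']
        if limit < 0:
            continue  # assumption: negative limit means unlimited
        if info['used'] >= limit:
            exhausted.append(name)
    return exhausted


def is_ready(quota_details, resources):
    """Readiness is reported only when no watched resource is exhausted."""
    return not over_quota_resources(quota_details, resources)
```

Reporting "not ready" instead of "unhealthy" keeps the pod running while
the quota is exhausted, rather than restarting it in a loop.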
Closes-Bug: 1804310
Change-Id: If4d42f866d2d64cae63736f4c206bedca039258b
We were assuming that watcher threads would clean up after themselves,
i.e. would remove their paths from the Watcher._watching dict on
Watcher._stop_watch(). It turns out _stop_watch() kills the threads the
hard way using thread.stop(). This means that paths are never removed
from the Watcher._watching dict, so on restart (i.e. Watcher.start())
the method concludes that every path is already being watched and does
nothing.
This commit fixes that by cleaning up Watcher._watching dict in
Watcher._stop_watch() method.
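
The effect of the fix can be illustrated with the simplified model
below. SketchWatcher and its spawn argument are hypothetical stand-ins,
not the real Watcher class; only the _watching bookkeeping mirrors what
the commit message describes.

```python
class SketchWatcher:
    """Toy model of a watcher tracking one thread per watched path."""

    def __init__(self, spawn):
        # spawn(path) must return an object with a stop() method.
        self._spawn = spawn
        self._resources = set()
        self._watching = {}
        self._running = False

    def add(self, path):
        self._resources.add(path)
        if self._running and path not in self._watching:
            self._start_watch(path)

    def start(self):
        self._running = True
        # Only paths missing from _watching are started, so stale entries
        # left behind by a previous stop would block a restart.
        for path in self._resources - set(self._watching):
            self._start_watch(path)

    def stop(self):
        self._running = False
        for path in list(self._watching):
            self._stop_watch(path)

    def _start_watch(self, path):
        self._watching[path] = self._spawn(path)

    def _stop_watch(self, path):
        # The fix: drop the entry here instead of relying on the killed
        # thread to clean up after itself.
        thread = self._watching.pop(path, None)
        if thread is not None:
            thread.stop()
```

With the pop() in _stop_watch(), a stop() followed by start() spawns the
watch threads again.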
Closes-Bug: 1790912
Change-Id: I17baaab1769ca5882f0b8edf496f92ac39507969
This commit makes sure the exception is logged with a traceback when an
error happens in the Watcher (e.g. when processing an event).
Change-Id: I54909622dae5e05e63576e4b504650b6c3fcb5c5
In case all the watchers (in the CNI case the pod watcher only) have
gracefully exited, continuing the process only serves to give a false
appearance of things working. At the same time, it prevents the
containerized deployment orchestrator from realizing that the Kuryr pod
is not functional so it does not restart it.
This fix allows environments without health probes, where all watchers
have gracefully exited, to be restarted by k8s/ocp and eventually work
again should the issue that caused the graceful exits be solved.
Change-Id: Id70978e06d980bc0ffa08bcee02d78bef9dcbeb8
Closes-Bug: #1776676
Signed-off-by: Antoni Segura Puimedon <antonisp@celebdor.com>
We had a false positive here and it was raising and swallowing some
exceptions. Fixing it exposed the fact that we were not really removing
the resource from the watch so it would loop forever trying to find more
events instead of finishing.
Change-Id: I96bc2d6ac7951c1cbc78c74e810577fc35587d39
Closes-Bug: #1777155
Closes-Bug: #1777632
Signed-off-by: Antoni Segura Puimedon <antonisp@celebdor.com>
This commit implements an initial version of high availability support
in kuryr-controller: Active/Passive mode. In this mode only one instance
of the controller processes the resources while the other ones are in
standby mode. If the current leader dies, one of the standbys takes over
the leader role and starts processing resources.
Please note that as leader election is based on Kubernetes mechanisms,
this is only supported when kuryr-controller is run as a Pod on a
Kubernetes cluster.
Implements: bp high-availability
Change-Id: I2c6c9315612d64158fb9f8284e0abb065aca7208
This patch adds liveness checks for watcher and handlers, without passing the
manager's reference to modules that probably should not be aware of it.
Related-Bug: #1705429
Change-Id: I0192756c556b13f98302a57acedce269c278e260