38 Commits

Author SHA1 Message Date
Zuul
1708114fb1 Merge "Move vifs to 'status' in the KuryrPort CRD." 2020-08-19 08:24:26 +00:00
Roman Dobosz
1aa6753d80 Move vifs to 'status' in the KuryrPort CRD.
In the newly added CRD, KuryrPort, we noticed that the vifs key, which
is currently under the 'spec' object, is really something that should be
represented as the CRD status.

In this patch we propose to move the vifs data under the status key.

Depends-On: I2cb66e25534e44b79f660b10498086aa88ad805c
Change-Id: I71385799775f9f9cc928e4d39a0fd443c98b53c6
2020-08-12 17:39:45 +02:00
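As a rough illustration of the change described in the commit above, a KuryrPort object would carry its VIF data under 'status' rather than 'spec'. The field names and values below are illustrative assumptions, not the actual CRD schema.

```python
# Hypothetical sketch of a KuryrPort custom resource after the change:
# the VIF data lives under 'status' instead of 'spec'. Field names are
# illustrative only, not the exact schema used by Kuryr.
kuryrport = {
    'apiVersion': 'openstack.org/v1',
    'kind': 'KuryrPort',
    'metadata': {'name': 'my-pod', 'namespace': 'default'},
    'spec': {
        'podUid': '1234-abcd',       # identity of the pod this port belongs to
        'podNodeName': 'worker-0',
    },
    'status': {
        'vifs': {                    # moved here from 'spec'
            'eth0': {
                'default': True,
                'vif': {'id': 'neutron-port-uuid', 'active': True},
            },
        },
    },
}
```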
Michał Dulko
d80e1bff99 Support upgrading LBaaSState annotation to KLB CRD
On upgrade from a version that uses annotations on Endpoints and Services
objects to save information about created Octavia resources to a version
where that information lives in the KuryrLoadBalancer CRD, we need to
make sure that data is converted. Otherwise we can end up with duplicated
load balancers.

This commit makes sure data is converted before we try processing any
Service or Endpoints resource that has annotations.

Change-Id: I01ee5cedc7af8bd02283d065cd9b6f4a94f79888
2020-08-10 16:51:32 +00:00
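A minimal sketch of what such a one-time conversion could look like, assuming a helper that reads the old LBaaSState annotation and a client able to create KuryrLoadBalancer objects; the annotation key, client methods and CRD layout are assumptions, not the actual Kuryr code.

```python
import json

LBAAS_STATE_ANNOTATION = 'openstack.org/kuryr-lbaas-state'  # assumed key


def convert_annotation_to_crd(k8s_client, endpoints):
    """Sketch: turn an old LBaaSState annotation into a KuryrLoadBalancer CRD.

    k8s_client and the CRD layout are hypothetical stand-ins for the real
    Kuryr helpers.
    """
    annotations = endpoints['metadata'].get('annotations', {})
    state = annotations.get(LBAAS_STATE_ANNOTATION)
    if not state:
        return  # nothing to convert

    klb = {
        'apiVersion': 'openstack.org/v1',
        'kind': 'KuryrLoadBalancer',
        'metadata': {'name': endpoints['metadata']['name'],
                     'namespace': endpoints['metadata']['namespace']},
        # The Octavia resources recorded in the annotation become the CRD
        # status, so the controller does not recreate (and duplicate) them.
        'status': json.loads(state),
    }
    k8s_client.create_crd_object(klb)
    k8s_client.remove_annotation(endpoints, LBAAS_STATE_ANNOTATION)
```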
scavnicka
f71ae55476 Update loadbalancer CRD with service spec and rely on CRD
This commit adds support for creating load balancers, listeners,
members, and pools using the CRD; it also fills in the status
field of the CRD.

Depends-On: https://review.opendev.org/#/c/743214/
Change-Id: I42f90c836397b0d71969642d6ba31bfb49786a43
2020-07-30 21:56:43 +00:00
Roman Dobosz
a458fa6894 Pod annotations to KuryrPort CRD.
Until now we were using pod annotations to store information about the
state of the VIFs associated with a pod. This has its own issues and is
prone to inconsistency in case of controller failures.

In this patch we propose a new CRD called KuryrPort for storing the
information about VIFs.

Depends-On: If639b63dcf660ed709623c8d5f788026619c895c
Change-Id: I1e76ea949120f819dcab6d07714522a576e426f2
2020-07-29 23:50:17 +02:00
Michał Dulko
9db38c85b2 Tweak exponential backoff
This commit attempts to tweak and simplify the exponential backoff that
we use by making the default interval 1 instead of 3 (so that it won't
rise as fast), capping the default maximum wait at 60 seconds (so that we
won't wait e.g. more than 2 minutes as a backoff for a pod to become
active) and introducing a small jitter instead of the fully random
choice of time that we had.

Change-Id: Iaf7abb1a82d213ba0aeeec5b5b17760b1622c549
2020-07-03 16:56:52 +02:00
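An approximate sketch of the backoff behaviour described above (default interval 1, cap at 60 seconds, small jitter); the real helper lives in Kuryr's utils module and may differ in detail.

```python
import random
import time

DEFAULT_INTERVAL = 1      # was 3 before this change
DEFAULT_MAX_BACKOFF = 60  # never sleep longer than a minute per attempt


def exponential_sleep(deadline, attempt, interval=DEFAULT_INTERVAL):
    """Sleep for an exponentially growing, slightly jittered interval.

    A simplified approximation of the behaviour described in the commit
    above, not the exact Kuryr implementation.
    """
    seconds_left = deadline - time.time()
    if seconds_left <= 0:
        return 0

    to_sleep = interval * 2 ** attempt
    to_sleep = min(to_sleep, DEFAULT_MAX_BACKOFF, seconds_left)
    # Small jitter instead of a fully random choice of sleep time.
    to_sleep += random.uniform(0, to_sleep * 0.2)
    time.sleep(to_sleep)
    return to_sleep
```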
Michał Dulko
d8892d2e72 Civilize logging
Our logs are awful and this commit attempts to fix some issues with
them:
* Make sure we always indicate why some readiness or liveness probe
  fails.
* Suppress INFO logs from werkzeug (so that we don't see every probe
  call at INFO level).
* Remove logging of successful probe checks.
* Make watcher restart logs less scary and include more cases.
* Add backoff to watcher restarts so that we don't spam logs when K8s
  API is briefly unavailable.
* Add warnings for low quotas.
* Suppress some long logs on K8s healthz failures - we don't need full
  message from K8s printed twice.

I also refactored CNI and controller health probes servers to make sure
they're not duplicating code.

Change-Id: Ia3db4863af8f28cfbaf2317042c8631cc63d9745
2020-07-03 15:09:52 +02:00
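For example, silencing werkzeug's per-request INFO logs for the probe servers can be done with plain Python logging along these lines; a generic sketch, not the exact Kuryr code.

```python
import logging

# Raise werkzeug's log level so every probe HTTP request is no longer
# printed at INFO level; only warnings and errors get through.
logging.getLogger('werkzeug').setLevel(logging.WARNING)
```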
Luis Tomas Bolivar
eeee83d0f3 Add IPv6 support to namespace subnet driver
Change-Id: If3bd633b36694dedaf65cb14287e9b9519958de8
2020-03-17 13:30:16 +00:00
Zuul
e461600ffa Merge "Ensure LB sg rules use IPv6 when enabled" 2020-03-12 18:15:06 +00:00
Maysa Macedo
7fb7d96c21 Ensure LB sg rules use IPv6 when enabled
When IPv6 and Network Policy are enabled, we must ensure the
amphora SG is updated with SG rules using IPv6.

Implements: blueprint kuryr-ipv6-support

Change-Id: Id89b6c02e85d7faa75be6182c9d82ee7f32ff909
2020-03-10 19:13:42 +00:00
ITD27M01
9cdd1c8112 Ensures accurate quota calculation during the readiness checks
Current deployments of the OpenShift platform with Kuryr CNI
on real OpenStack installations (multi-project environments)
are crashing because kuryr-controller cannot reach the
READY state.

This is due to inaccurate quota calculations in the readiness
process and unscalable fetching of objects from the Neutron API
to count them and compare against the limits.

This commit ensures accurate quota calculation for the installation
project during the readiness checks and removes the heavy
Neutron API calls. It dramatically speeds up the readiness checks.

Change-Id: Ia5e90d6bd5a8d30d0596508abd541e1508dc23ec
Closes-Bug: 1864327
2020-02-25 16:58:02 +03:00
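A hedged sketch of the idea: ask Neutron for per-resource quota usage directly instead of listing and counting every object. The get_quota(..., details=True) call mirrors the openstacksdk wrapper around Neutron's quota-details API, but the exact call and fields Kuryr uses may differ.

```python
def has_quota_headroom(os_net, project_id,
                       resources=('ports', 'security_groups')):
    """Sketch: readiness passes only if the project still has quota headroom.

    Uses the Neutron quota-details data (used/limit per resource) instead of
    listing every object and counting it. os_net is assumed to be an
    openstacksdk network proxy; the exact call Kuryr makes may differ.
    """
    quota = os_net.get_quota(project_id, details=True)
    for resource in resources:
        detail = getattr(quota, resource)
        # A negative limit means "unlimited" in Neutron.
        if detail['limit'] >= 0 and detail['used'] >= detail['limit']:
            return False
    return True
```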
Michał Dulko
1045bcb02a Bump hacking to newer version
Our hacking module is ancient and makes Python 3.6's f-strings fail the
PEP8 check. This commit bumps hacking to a newer version and fixes the
violations it found.

Change-Id: If8769f7657676d71bcf84c08108e728836071425
2020-02-21 12:02:58 +01:00
Gary Loughnane
edc6597fe2 Add DPDK support for nested pods
Add DPDK support for nested K8s pods. The patch includes a new VIF driver
on the controller and a new CNI binding driver.

This patch introduces a dependency on os-vif 1.12.0, since that release
provides the new VIF type.

Change-Id: I6be9110192f524325e24fb97d905faff86d0cfef
Implements: blueprint nested-dpdk-support
Co-Authored-By: Kural Ramakrishnan <kuralamudhan.ramakrishnan@intel.com>
Co-Authored-By: Marco Chiappero <marco.chiappero@intel.com>
Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
Signed-off-by: Danil Golov <d.golov@samsung.com>
2020-02-04 10:59:45 +03:00
Roman Dobosz
1b97158c89 Move from Neutron client to OpenStackSDK.
OpenStack SDK is a framework which provides a consistent and complete
interface to an OpenStack cloud. As we already use it for the Octavia
integration, in this patch we propose to use it for the other parts of
Kuryr as well.

Implements: blueprint switch-to-openstacksdk
Change-Id: Ia87bf68bdade3e6f7f752d013f4bdb29bfa5d444
2019-10-23 14:08:18 +02:00
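As a rough before/after illustration of the switch (connection setup and calls are simplified and the network ID is a placeholder; this is not the exact Kuryr code):

```python
import openstack

# Before: python-neutronclient style (roughly)
#   neutron = neutron_client.Client(session=session)
#   port = neutron.create_port({'port': {'network_id': net_id}})['port']

# After: the same operation through openstacksdk's network proxy.
conn = openstack.connect(cloud='envvars')              # credentials from the environment
port = conn.network.create_port(network_id='NET_ID')   # NET_ID is a placeholder
print(port.id, port.status)
```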
Luis Tomas Bolivar
998be3bbda Avoid race between Retries and Deletion actions
This patch set increases the timeout to wait for resources to be
created/deleted. This is needed to better support spikes without
restarting the kuryr-controller. This patch also ensures that
future retry events do not affect the kuryr-controller if
they are retried once the related resources are already deleted,
i.e., the on_delete event was executed before one of the retries.

Closes-Bug: 1847753
Change-Id: I725ba22f0babf496af219a37e42e1c33b247308a
2019-10-16 18:25:01 +02:00
Yash Gupta
4c3e338273 Reuse utils.get_lbaas_spec in lb handler
LoadBalancerHandler._get_lbaas_spec is identical to
utils.get_lbaas_spec, which is used by LBaaSSpecHandler, so we can reuse
the function from utils instead of duplicating code.

Note that the passed k8s object is an Endpoints resource in the case of
LoadBalancerHandler and a Service in the case of LBaaSSpecHandler (but
the same annotation is used in both cases, so a common function is
enough).

Change-Id: I124109f79bcdefcc4948eb35b4bbb4a9ca87c43b
Signed-off-by: Yash Gupta <y.gupta@samsung.com>
2019-09-03 14:30:03 +09:00
Maysa Macedo
73cac914c2 Ensure controller is only restarted after the event times out
It can take a while for a pod to have annotations and a hostIP
defined. It is also possible that a pod is deleted before
the state is set, causing a NotFound k8s exception. Lastly,
a service might be missing the lbaas_spec annotation, causing
the event handler on the endpoints to crash. All these scenarios
can be avoided by raising a resource-not-ready exception, which
allows the operation to be retried.

Change-Id: I5476cd4261a6118dbb388d7238e83169439ffe0d
2019-08-28 13:22:23 +02:00
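The pattern described above, sketched loosely: when the data the handler needs is not there yet, it raises a 'resource not ready' exception so the retry machinery re-runs the event later. Class and field names are illustrative.

```python
class ResourceNotReady(Exception):
    """Raised when a K8s object is not yet in a state we can handle."""


def on_pod_event(pod):
    # Sketch only: if the data we need is not there yet, ask to be retried
    # instead of crashing the whole handler.
    annotations = pod['metadata'].get('annotations', {})
    host_ip = pod.get('status', {}).get('hostIP')
    if not annotations or not host_ip:
        raise ResourceNotReady(pod['metadata']['name'])
    # ... normal processing continues here ...
```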
Zuul
9ac6613af9 Merge "Fix interval ignoring by exponential sleep" 2019-05-29 14:47:03 +00:00
Luis Tomas Bolivar
3f9c80e6e6 Populate pools upon namespace creation
When the namespace subnet driver is used, a new subnet is created for
each new namespace. As pools are created per subnet, this patch
ensures that new ports are created for each pool for the new subnet
in the nested case.

Note this feature depends on using resource tagging to filter out
trunk ports in case multiple clusters are deployed on the same OpenStack
project or when other trunks are present. Otherwise it will consider
all the existing trunks, whether or not they belong to the
Kubernetes cluster.

NOTE: this applies only to the nested case, where pooling shows the
greatest improvement as ports are already ACTIVE.

Change-Id: Id014cf49da8d4cbe0c1795e47765fcf2f0684c09
2019-05-29 09:26:49 +02:00
Ilya Maximets
c725f82f1a Fix interval ignoring by exponential sleep
'exponential_sleep' always uses DEFAULT_INTERVAL instead of the
'interval' passed in its arguments.

Change-Id: I04779bbf3b3b2af5090b1b44bb6e72599dba7081
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
2019-05-20 15:59:51 +03:00
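A simplified illustration of the bug and the fix, under the assumption that the backoff is computed as interval * 2 ** attempt; not the actual function body.

```python
DEFAULT_INTERVAL = 3


def backoff_time(attempt, interval=DEFAULT_INTERVAL):
    # Before the fix: DEFAULT_INTERVAL * 2 ** attempt -- the 'interval'
    # argument was silently ignored. After the fix the caller's value is used.
    return interval * 2 ** attempt
```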
Maysa Macedo
ae1d1dd51a Fix LBaaS SG rules update
The LBaaS SG update is failing when the pods selected by the selector
in the rule block are removed after the pod on which the policy is
enforced is removed. This commit fixes the issue by changing from an
LBaaSServiceSpec object to LBaaSLoadBalancer, which is the object
type expected by the '_apply_members_security_groups' function.

Change-Id: I17f2f632e02bc0f46ccc7434173acce68aef957b
Closes-Bug: 1823022
2019-04-08 17:15:45 +00:00
Luis Tomas Bolivar
dfa9a392f1 Add support for svc with text targetPorts
This patch adds support for services that define the targetPort
as text (a port name) pointing to the same port number as the
exposed port defined on the svc.

Closes-Bug: 1818969
Change-Id: I7f957d292f7c4a43b759292e5bd04c4db704c4c4
2019-03-15 15:15:40 +01:00
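Loosely, resolving a named targetPort means looking up the container port with that name; a generic sketch over a plain pod dict, with assumed field names rather than the actual Kuryr helper.

```python
def resolve_target_port(pod, target_port):
    """Sketch: map a named targetPort to its numeric container port.

    Works on a plain pod dict as returned by the K8s API; simplified and
    not the actual Kuryr helper.
    """
    if isinstance(target_port, int):
        return target_port
    for container in pod['spec']['containers']:
        for port in container.get('ports', []):
            if port.get('name') == target_port:
                return port['containerPort']
    raise ValueError(f'Unknown targetPort name: {target_port}')
```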
Maysa Macedo
ba89bd027f Fix LBaaS sg rules update on deployment scale
When a service is created with a Network Policy applied and
deployments are scaled up or down, the LBaaS SG rules should be
updated accordingly. Right now, the LBaaS/Service does not react to
deployment scaling.
This commit fixes the issue by ensuring that the LBaaS SG is updated
on pod events.

Also, when Pods, Network Policies and SVCs are created together it might
happen that the LBaaS SG remains with the default SG rules, even though
the policy is being enforced. This commit ensures the right SG rules
are applied to an LBaaS regardless of the order in which the k8s
resources are created. This happens by setting the LBaaS Spec annotation
whenever a request to update the SG rules has been made and retrieving
the Spec again whenever an LBaaS member is created.

Change-Id: I1c54d17a5fcff5387ffae2b132f5036ee9bf07ca
Closes-Bug: 1816015
2019-03-04 15:57:47 +00:00
Maysa Macedo
70692f86a4 Ensure NP changes are applied to services
When a Network Policy is changed, services must also be updated,
deleting the rules that no longer match the NP and creating the
needed ones.

Closes-Bug: #1811242

Partially Implements: blueprint k8s-network-policies

Change-Id: I800477d08fd1f46c2a94d3653496f8f1188a3844
2019-01-24 13:26:47 +01:00
Luis Tomas Bolivar
b200d368cd Add Network Policy support to services
This patch adds support for Network Policy on services. It
applies pods' security groups onto the services in front of them.
It makes the following assumptions:
- All the pods pointed to by one svc have the same labels, thus the same
SGs are enforced
- Only the SG rules that have the same protocol and direction as the
listener being created are copied
- A default rule is added to the NP to enable traffic from the services
subnet CIDR

Partially Implements: blueprint k8s-network-policies
Change-Id: Ibd4b51ff40b69af26ab7e7b81d18e63abddf775b
2019-01-08 06:35:55 -05:00
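A rough sketch of the second assumption above (only rules with the listener's protocol and direction are copied); the rule dicts below are simplified Neutron-style security group rules, not Kuryr's actual objects.

```python
def rules_for_listener(pod_sg_rules, listener_protocol):
    """Sketch: keep only ingress rules matching the listener's protocol.

    pod_sg_rules is assumed to be a list of Neutron-style security group
    rule dicts; this is an illustration, not the Kuryr implementation.
    """
    return [
        rule for rule in pod_sg_rules
        if rule.get('direction') == 'ingress'
        and (rule.get('protocol') or '').lower() == listener_protocol.lower()
    ]


# Example: only the TCP ingress rule would be copied to a TCP listener.
rules = [
    {'direction': 'ingress', 'protocol': 'tcp', 'port_range_min': 8080},
    {'direction': 'egress', 'protocol': 'tcp'},
    {'direction': 'ingress', 'protocol': 'udp', 'port_range_min': 53},
]
print(rules_for_listener(rules, 'TCP'))
```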
Luis Tomas Bolivar
05cdb9cd4f Ensure controller healthchecks pass without CRDs
This patch ensures the controller healthchecks do not set the
controller as not Ready due to missing CRDs when deploying without the
namespace and/or policy handlers. In that case the CRDs are not needed.

Closes-Bug: 1808966
Change-Id: I685f9a47605da86504619983848b8ef73d71b332
2018-12-21 12:55:55 +00:00
Zuul
859bedd65a Merge "Work out situation with KUBERNETES_NODE_NAME" 2018-11-29 11:17:10 +00:00
Maysa Macedo
b215aae146 Add quota readiness check to controller
Currently, if the number of Neutron resources requested reaches
the quota, kuryr-controller is marked as unhealthy and restarted.
In order to avoid constant restarts of the pod, this patch adds
a new readiness check that verifies whether the resources used by
the enabled handlers are over quota.

Closes-Bug: 1804310
Change-Id: If4d42f866d2d64cae63736f4c206bedca039258b
2018-11-21 10:11:55 +00:00
Michał Dulko
65e05c3b6d Work out situation with KUBERNETES_NODE_NAME
This reverts commit f0cde86ee68027bd66597f2e4b8db4e10fa81e0b, as it
turns out that the variable is actually used by kuryr-daemon, and fixes
the way kuryr-controller detects its identity to match how
leader-elector does it.

Change-Id: I95c2d3e1760a938d40d57a99fb87b6f02ca7f64a
Closes-Bug: 1798835
2018-11-20 17:23:13 +01:00
Danil Golov
8e60dcc4aa Add SR-IOV pod vif driver
This commit adds an SR-IOV driver and a new type of VIF to handle SR-IOV
requests. This driver can work as the primary and only driver, but only
once Kubernetes fully supports the CNI specification.

For now this driver can work together with a multi-VIF driver, e.g.
NPWGMultiVIFDriver (see doc/source/installation/multi_vif_with_npwg_spec.rst).

This driver also relies on the Kubernetes SR-IOV device plugin.

This commit also adds the 'default_physnet_subnets' setting, which should
contain a mapping of physnets to Neutron subnet IDs; it is needed to
determine a VIF's physnet (the subnet ID comes from the annotation).

To get details how to create pods with sriov interfaces see
doc/source/installation/sriov.rst

Target bp: kuryr-kubernetes-sriov-support
Change-Id: I45c5f1a7fb423ee68731d0ae85f7171e33d0aeeb
Signed-off-by: Danil Golov <d.golov@partner.samsung.com>
Signed-off-by: Vladimir Kuramshin <v.kuramshin@samsung.com>
Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
2018-09-18 10:19:43 +03:00
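For illustration, a physnet-to-subnet mapping such as 'default_physnet_subnets' could be registered as an oslo.config dict option along these lines; the option group and defaults here are assumptions, not the exact Kuryr definition.

```python
from oslo_config import cfg

# Hypothetical registration of a physnet -> Neutron subnet ID mapping.
sriov_opts = [
    cfg.DictOpt('default_physnet_subnets',
                default={},
                help='Mapping of physnet names to Neutron subnet IDs, '
                     'e.g. physnet2:<subnet-uuid>.'),
]

cfg.CONF.register_opts(sriov_opts, group='sriov')

# With kuryr.conf containing
#   [sriov]
#   default_physnet_subnets = physnet2:<subnet-uuid>,physnet3:<subnet-uuid>
# the mapping is then available as a plain dict:
#   cfg.CONF.sriov.default_physnet_subnets['physnet2']
```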
Michał Dulko
b895427011 Fix compatibility with old Pod annotation format
We've changed the Pod annotation format in Rocky. To support upgrading
Kuryr we need to keep compatibility with the old format. This commit
implements that.

We should also think about creating a tool that will convert all the
annotations in an existing setup.

Change-Id: I88e1b318d58d0d90138e347503928da41518a888
Closes-Bug: 1782366
2018-08-16 17:28:55 +02:00
Peng Liu
5fa529efa4 Move function get_subnet to kuryr_kubernetes.utils
Since the function _get_subnet is widely used by different components,
this patch moves it to kuryr_kubernetes.utils as part of the common
utilities.

Change-Id: I9a80fb55f5c02274fb50c4c92eb3514ccb42830e
2018-08-13 07:06:52 -04:00
Michał Dulko
e416b2492a kuryr-controller A/P HA
This commit implements the initial version of high-availability support
in kuryr-controller: Active/Passive mode. In this mode only one instance
of the controller processes resources while the other ones remain in
standby mode. If the current leader dies, one of the standbys takes over
the leader role and starts processing resources.

Please note that as leader election is based on Kubernetes mechanisms,
this is only supported when kuryr-controller runs as a Pod on a
Kubernetes cluster.

Implements: bp high-availability

Change-Id: I2c6c9315612d64158fb9f8284e0abb065aca7208
2018-06-14 10:25:34 +02:00
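Roughly, with the Kubernetes leader-elector sidecar pattern the controller can ask a local HTTP endpoint who currently holds the lock and only process resources when that is itself; the port and payload below follow the common sidecar convention and are assumptions, not Kuryr's exact implementation.

```python
import json
import socket
import urllib.request

LEADER_ELECTOR_URL = 'http://localhost:4040'  # common sidecar default (assumed)


def is_leader():
    """Sketch: ask the leader-elector sidecar who currently holds the lock.

    In a pod the hostname equals the pod name, which is also how the
    leader-elector sidecar reports the current holder. This illustrates the
    pattern only, not Kuryr's exact code.
    """
    with urllib.request.urlopen(LEADER_ELECTOR_URL) as response:
        holder = json.loads(response.read())['name']
    return holder == socket.gethostname()
```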
Eunsoo Park
58e6b1914c Watcher restarts watching resources in failure
The kuryr-kubernetes watcher watches k8s resources and triggers the
registered pipeline.

This patch handles restarting the watch when the watch thread has failed.

Change-Id: I27a719a326dc37f97c46b88d0c171d0f12ded605
Closes-Bug: 1739776
Related-Bug: 1705429
Signed-off-by: Eunsoo Park <esevan.park@gmail.com>
2018-03-19 17:12:40 +09:00
Luis Tomas Bolivar
a83eba5fa1 Add multi pools support
This patch adds support for nodes with different vif drivers as
well as different pool drivers for each vif driver type.

Closes-Bug: 1747406
Change-Id: I842fd4b513a5f325d598d677e5008f9ea51adab9
2018-03-07 13:06:56 +01:00
Luis Tomas Bolivar
7061f4abac Make CNI Registry Plugin namespace aware
Since with the k8sCNIRegistryPlugin the watching covers the
complete node instead of being per pod and namespace, we need
to make the registry information account for the namespace
where the pod is created, to differentiate between containers
running on the same node with the same name but in a different
namespace.

Related-Bug: 1731486
Change-Id: I26e1dec6ae613c5316a45f93563c4a015df59441
2018-03-02 16:09:57 +01:00
Michał Dulko
18db649943 Support kuryr-daemon when running containerized
This commit implements kuryr-daemon support when
KURYR_K8S_CONTAINERIZED_DEPLOYMENT=True. It's done by:

* The CNI docker image installs the Kuryr-Kubernetes pip package and adds
  execution of kuryr-daemon to the entrypoint script.
* The host's /proc and /var/run/openvswitch are mounted into the CNI
  container.
* Code is changed to use /host_proc instead of /proc when in a container
  (it's impossible to mount the host's /proc into the container's /proc).

Implements: blueprint cni-split-exec-daemon

Change-Id: I9155a2cba28f578cee129a4c40066209f7ab543d
2017-12-13 11:45:22 +01:00
Jaume Devesa
82dce858cf Add asyncio eventloop.
This commit introduces the asyncio event loop as well as the base
abstract class for defining watchers.

It is a very simple approach (it does not reschedule watchers if
they fail), but it lets you see the proposed watcher hierarchy
as well as the methods that a watcher has to implement (see the pod
module).

Partial-Implements: blueprint kuryr-k8s-integration
Co-Authored-By: Taku Fukushima <f.tac.mac@gmail.com>
Co-Authored-By: Antoni Segura Puimedon
Change-Id: I91975dd197213c1a6b0e171c1ae218a547722eeb
2016-09-01 14:47:47 +02:00
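In spirit, an asyncio-based watcher hierarchy looks something like the sketch below: an abstract base class that concrete watchers (e.g. a pod watcher) subclass, with the event loop driving them concurrently. This is a generic modern-Python illustration, not the code added by the commit.

```python
import abc
import asyncio


class AbstractWatcher(abc.ABC):
    """Base class a resource watcher is expected to subclass (sketch)."""

    @abc.abstractmethod
    async def watch(self):
        """Stream events for one K8s resource type and handle them."""


class PodWatcher(AbstractWatcher):
    async def watch(self):
        while True:
            # A real watcher would stream events from the K8s API here;
            # like the original proposal, failures are not rescheduled.
            await asyncio.sleep(1)
            print('pod event tick')


async def main():
    # The event loop runs every registered watcher concurrently.
    await asyncio.gather(*(w.watch() for w in [PodWatcher()]))


if __name__ == '__main__':
    asyncio.run(main())
```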