Recent versions of cri-o and containerd pass K8S_POD_UID as a CNI
argument, alongside K8S_POD_NAMESPACE and K8S_POD_NAME. As the latter
two variables cannot be used to safely identify a pod in the API (a
StatefulSet recreates pods with the same name), we were prone to race
conditions in the CNI code that we could only work around. The end
effect was mostly IP conflicts.
Now that the UID argument is passed, we're able to compare the UID
from the request with the one in the API to make sure we're wiring the
correct pod. This commit implements that by moving the check to the
code actually waiting for the pod to appear in the registry. If
K8S_POD_UID is missing from the CNI request, an API call retrieving
the Pod is used as a fallback.
We also know that this check doesn't work for static pods, so the CRD
and controller needed to be updated to include information on whether
the pod is static in the KuryrPort spec, so that we can skip the check
for static pods without the need to fetch the Pod from the API.
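For illustration, a minimal sketch of the UID comparison, assuming
cni_args is a dict of the CNI request arguments and pod is the object
found in the registry (names are hypothetical, not the exact Kuryr
internals):

    def pod_uid_matches(pod, cni_args):
        requested_uid = cni_args.get('K8S_POD_UID')
        if requested_uid is None:
            # Older runtimes don't pass the UID; the caller falls back
            # to fetching the Pod from the API instead.
            return True
        # StatefulSets recreate pods under the same namespace/name, so
        # only the UID identifies the pod unambiguously.
        return pod['metadata']['uid'] == requested_uid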
Closes-Bug: 1963677
Change-Id: I5ef6a8212c535e90dee049a579c1483644d56db8
It is possible to add info about which component added an Event and
this patch makes sure we do it. This should show up in `kubectl
describe`, directly attributing each event to Kuryr.
Change-Id: If954d62010b43e8a92ac2cb3140fc434e5d477a0
Some minor wording fixes, making sure we're not logging errors
happening when we try to create events on Namespace termination, and a
safeguard in case somebody puts garbage into the ownerReferences we
depend on.
Change-Id: Ia2d6f41c0b9f969ea01ae78a16036cd4715de703
Also added a function for getting an object out of ownerReferences, or
by querying the Kubernetes API to get the original object out of a
CRD.
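A rough sketch of what such a helper can look like, assuming a thin
K8s client exposing a get(path) method (names and paths illustrative,
not the actual Kuryr API):

    def get_referenced_object(k8s, crd):
        # Prefer resolving the owner from the ownerReferences metadata.
        namespace = crd['metadata']['namespace']
        for ref in crd['metadata'].get('ownerReferences', []):
            if ref['kind'] == 'Pod':
                return k8s.get(f"/api/v1/namespaces/{namespace}"
                               f"/pods/{ref['name']}")
        # Fall back to querying the API by the CRD's own name, which
        # matches the name of the original object.
        name = crd['metadata']['name']
        return k8s.get(f"/api/v1/namespaces/{namespace}/pods/{name}")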
Change-Id: I17e1672a46496bc6a51f3d28fbfbd1adea9ca249
Ensures Namespaces are only handled when there is a Pod on the pod
network present in them, to avoid creating Networks and Subnets that
aren't really used.
Depends-On: https://review.opendev.org/c/openstack/kuryr-tempest-plugin/+/813404
Change-Id: Idbc1e2868c54eb1151d2498a0324731fe099592e
When a Pod has status Completed it has finished its job, so it's
expected that Kuryr will recycle the Neutron ports associated with it
to be used by other Pods. However, Kuryr was considering Completed
Pods as not scheduled and the event was skipped.
This commit fixes the issue by ensuring that the recycling of
Completed Pods' ports is done before checking for scheduled Pods.
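A sketch of the reordered handler logic (helper names hypothetical):
the Completed check runs before the scheduling short-circuit that used
to skip such Pods.

    def on_present(pod):
        if pod['status'].get('phase') == 'Succeeded':
            # 'Succeeded' is what kubectl shows as Completed: release
            # the Pod's Neutron port so other Pods can reuse it.
            release_pod_ports(pod)
            return
        if not pod['spec'].get('nodeName'):
            # Not scheduled yet, nothing to wire.
            return
        wire_pod(pod)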
Closes-Bug: #1945680
Change-Id: I23e7901fb016d59fa76762d1f24fee47974e72df
Includes a new Gauge metric that records the number of members of
load balancers considered critical. The metric is labeled with the
load balancer name and the pool name, and its value is the number of
members. Also includes an Enum with the current state of the lb.
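A sketch of the two metrics with prometheus_client (metric names,
labels and states are illustrative; the real ones are defined in the
exporter code):

    from prometheus_client import Enum, Gauge

    critical_lb_members = Gauge(
        'kuryr_critical_lb_members',
        'Number of members in pools of critical load balancers',
        ['lb_name', 'pool_name'])
    critical_lb_state = Enum(
        'kuryr_critical_lb_state',
        'Current provisioning state of a critical load balancer',
        ['lb_name'],
        states=['ACTIVE', 'PENDING_UPDATE', 'ERROR'])

    critical_lb_members.labels('default/kubernetes', 'https-pool').set(2)
    critical_lb_state.labels('default/kubernetes').state('ACTIVE')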
Change-Id: Id89bb48d86588f4d2a28ab91963e0b84843cbd6f
This change adds help strings for cache parameters, so that
descriptions of these parameters are included in the .conf file
generated by oslo-config-generator.
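For illustration, the pattern with oslo.config looks roughly like this
(option names, defaults and group are hypothetical):

    from oslo_config import cfg

    cache_opts = [
        cfg.BoolOpt('enabled',
                    default=True,
                    help='Enable caching of K8s API resources.'),
        cfg.IntOpt('cache_time',
                   default=120,
                   help='TTL, in seconds, of cached entries.'),
    ]
    # oslo-config-generator renders each help= string as the comment
    # above the option in the generated .conf file.
    cfg.CONF.register_opts(cache_opts, group='cache_defaults')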
Change-Id: I9d7082fc42b5dbd67f7214810719158d93f5f88d
Due to a coding error I not only switched "used" with "limit" in our
quota check messages but also made it impossible for the check to
fail.
This commit fixes it.
Closes-Bug: 1927241
Change-Id: I6e6a396d0e0467ec424bb403064a19cb4f1a586e
For the simple case, when an operator wants to open connections to all
the namespaces in the cluster, i.e.:
  kind: NetworkPolicy
  apiVersion: networking.k8s.io/v1
  metadata:
    name: networkpolicy-example
  spec:
    podSelector: {}
    policyTypes:
    - Egress
    - Ingress
    egress:
    - to:
      - namespaceSelector: {}
there was a false assumption that we need to open it without any
restriction, while the truth is that all we need to do is to open
egress traffic to all the namespaces within the cluster.
Change-Id: Ibea039fa9c3b46b83e99237ce2ceb03f02d50727
Closes-Bug: 1915008
We mostly assumed that trunk ports are only used by Kuryr in an
OpenStack env, but sometimes that's not true. This commit adds some
checks to make sure we list trunk ports in a smarter way (checking if
they match the worker_nodes_subnets) and operate on them in a safer
way (checking if they even have IPs).
Change-Id: I3257e263b53bb9f38946ca9cff6a1be5448dec00
Closes-Bug: 1914631
In order to support OpenShift's ability to run its nodes in various
OpenStack subnets in a dynamic way, this commit introduces the
OpenShiftNodesSubnets driver and the MachineHandler. The idea is that
the MachineHandler is responsible for watching the OpenShift Machine
objects and calling the driver, and the driver then saves and serves
the list of current worker nodes subnets.
Change-Id: Iae3a5d011abaeab4aa97d6aa7153227c6f85b93c
This commit deprecates `[pod_vif_nested]worker_nodes_subnet` in favor
of `[pod_vif_nested]worker_nodes_subnets`, which accepts a list
instead. All the code using the deprecated option is updated to expect
a list and iterate over the possible nodes subnets.
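A sketch of the oslo.config deprecation pattern (default and help text
illustrative): the new list option transparently picks up values set
under the old name.

    from oslo_config import cfg

    opts = [
        cfg.ListOpt('worker_nodes_subnets',
                    default=[],
                    help='Subnets used by the K8s worker nodes.',
                    deprecated_opts=[
                        cfg.DeprecatedOpt('worker_nodes_subnet',
                                          group='pod_vif_nested'),
                    ]),
    ]
    cfg.CONF.register_opts(opts, group='pod_vif_nested')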
Change-Id: I7671fb06863d58b58905bec43555d8f21626f640
Because Kubernetes will stop propagating the metadata.selfLink field
in release 1.20 and it will be removed in release 1.21, we need to
adapt and calculate the selfLink equivalent by ourselves.
In this patch a new function is introduced which returns the path for
a provided resource.
Also, we need to deal with lists of resources, since for some reason
CRs do have information regarding apiVersion and kind, while for core
resources the apiVersion is only on top of the list object, the kind
is something like *List, and the objects within 'items' have neither
apiVersion nor kind.
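A simplified sketch of computing the selfLink-equivalent path (helper
name hypothetical, pluralization naive; the real code has to cover the
irregular kinds and list handling described above):

    def get_res_link(resource):
        api_version = resource['apiVersion']
        # Core resources ("v1") live under /api, group resources
        # ("networking.k8s.io/v1") under /apis.
        prefix = '/apis' if '/' in api_version else '/api'
        path = f"{prefix}/{api_version}"
        namespace = resource['metadata'].get('namespace')
        if namespace:
            path += f"/namespaces/{namespace}"
        # Naive plural of the kind; irregular kinds need special cases.
        plural = resource['kind'].lower() + 's'
        return f"{path}/{plural}/{resource['metadata']['name']}"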
Implements: blueprint selflink
Change-Id: I721a46ea0379382f7eb2e13c59bd193314f37e7f
The targetRef field is optional for the Endpoints object, and when
services without selectors are created that field is likely not to be
specified. This commit ensures Kuryr properly wires Endpoints without
targetRef.
Change-Id: Ib43e88aafd9e0907556a0e740990a6acbd173fb0
When a lb transitions to ERROR or the IP on the Service spec differs
from the lb VIP, the lb is released but the CRD doesn't get updated,
causing Not Found exceptions when handling the creation of other load
balancer resources. This commit fixes the issue by ensuring the
cleanup of the status field happens upon lb release.
It also adds protection in case we still get a nonexistent lb on the
CRD.
Closes-Bug: 1894758
Change-Id: I484ece6a7b52b51d878f724bd4fad0494eb759d6
This is another attempt at getting the useless tracebacks out of logs
for any level higher than debug. In particular:
* All pool logs regarding "Kuryr-controller not yet ready to *" are
now on debug level.
* The ugly ResourceNotReady raised from _populate_pool is now
suppressed if the method is run from eventlet.spawn(), which prevents
that exception being logged by eventlet.
* ResourceNotReady will only print namespace/name or name, not the full
resource representation.
* ConnectionResetError is suppressed at the Retry handler level just
as any other K8sClientError.
Change-Id: Ic6e6ee556f36ef4fe3429e8e1e4a2ddc7e8251dc
In the newly added CRD, KuryrPort, we noticed that the vifs key, which
is now under the 'spec' object, is rather a thing that could be
represented as the CRD status.
In this patch we propose to move the vifs data under the status key.
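For illustration, the resulting shape of the resource as the dict
Kuryr handles (field names and values abbreviated and illustrative):

    kuryrport = {
        'apiVersion': 'openstack.org/v1',
        'kind': 'KuryrPort',
        'metadata': {'name': 'my-pod', 'namespace': 'default'},
        'spec': {
            'podUid': 'c2f0...',  # truncated for the example
        },
        'status': {
            # The VIF data lives under status now, not under spec.
            'vifs': {'eth0': {'default': True, 'vif': {}}},
        },
    }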
Depends-On: I2cb66e25534e44b79f660b10498086aa88ad805c
Change-Id: I71385799775f9f9cc928e4d39a0fd443c98b53c6
On upgrade from a version using annotations on Endpoints and Services
objects to save information about created Octavia resources, to a
version where that information lives in the KuryrLoadBalancer CRD, we
need to make sure that data is converted. Otherwise we can end up with
doubled load balancers.
This commit makes sure data is converted before we try processing any
Service or Endpoints resource that has annotations.
Change-Id: I01ee5cedc7af8bd02283d065cd9b6f4a94f79888
This commit adds support for creation of load balancers, listeners,
members and pools using the CRD; it also fills the status field in the
CRD.
Depends-On: https://review.opendev.org/#/c/743214/
Change-Id: I42f90c836397b0d71969642d6ba31bfb49786a43
Until now, we were using pod annotations to store information
regarding the state of the VIFs associated with a pod. This alone has
its own issues and is prone to inconsistency in case of controller
failures.
In this patch we propose a new CRD called KuryrPort for storing the
information about VIFs.
Depends-On: If639b63dcf660ed709623c8d5f788026619c895c
Change-Id: I1e76ea949120f819dcab6d07714522a576e426f2
This commit attempts to tweak and simplify the exponential backoff
that we use by making the default interval 1 instead of 3 (so that it
won't grow that fast), capping the default maximum wait at 60 seconds
(so that we won't wait e.g. more than 2 minutes as a backoff while
waiting for a pod to become active) and introducing a small jitter
instead of the fully random choice of time that we had.
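A sketch of the resulting backoff under the constants described above
(the jitter fraction is illustrative):

    import random
    import time

    DEFAULT_INTERVAL = 1   # was 3
    MAX_INTERVAL = 60      # hard cap on a single wait

    def backoff_sleep(attempt):
        # Exponential growth from the base interval, capped at the max.
        interval = min(DEFAULT_INTERVAL * 2 ** attempt, MAX_INTERVAL)
        # Small jitter instead of a fully random wait.
        interval *= random.uniform(0.8, 1.2)
        time.sleep(interval)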
Change-Id: Iaf7abb1a82d213ba0aeeec5b5b17760b1622c549
Our logs are awful and this commit attempts to fix some issues with
them:
* Make sure we always indicate why some readiness or liveness probe
fails.
* Suppress INFO logs from werkzeug (so that we don't see every probe
call on INFO level).
* Remove logging of successful probe checks.
* Make watcher restart logs less scary and include more cases.
* Add backoff to watcher restarts so that we don't spam logs when K8s
API is briefly unavailable.
* Add warnings for low quotas.
* Suppress some long logs on K8s healthz failures - we don't need the
full message from K8s printed twice.
I also refactored the CNI and controller health probe servers to make
sure they're not duplicating code.
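For instance, the werkzeug suppression boils down to raising the level
of its logger (a sketch using the standard logging module):

    import logging

    # Per-request access lines from werkzeug are emitted at INFO, so
    # bumping its logger to WARNING hides them without losing errors.
    logging.getLogger('werkzeug').setLevel(logging.WARNING)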
Change-Id: Ia3db4863af8f28cfbaf2317042c8631cc63d9745
When IPv6 and Network Policy are enabled we must ensure the amphora SG
is updated with SG rules using IPv6.
Implements: blueprint kuryr-ipv6-support
Change-Id: Id89b6c02e85d7faa75be6182c9d82ee7f32ff909
Current deployments of the OpenShift platform with Kuryr CNI on real
OpenStack installations (multi-project environments) are crashing
because kuryr-controller cannot come to the READY state.
This is due to inaccurate quota calculations in the readiness process
and unscalable fetching of objects from the Neutron API to count them
and compare with the limits.
This commit ensures accurate quota calculation for the installation
project during the readiness checks and removes the heavy Neutron API
calls. It dramatically speeds up readiness checks.
Change-Id: Ia5e90d6bd5a8d30d0596508abd541e1508dc23ec
Closes-Bug: 1864327
Our hacking module is ancient and makes Python 3.6's f-strings fail
the PEP8 check. This commit bumps hacking to a newer version and fixes
the violations found by it.
Change-Id: If8769f7657676d71bcf84c08108e728836071425
Add DPDK support for nested K8s pods. The patch includes a new VIF
driver on the controller and a new CNI binding driver.
This patch introduces a dependency on os-vif 1.12.0, since it
introduces a new VIF type.
Change-Id: I6be9110192f524325e24fb97d905faff86d0cfef
Implements: blueprint nested-dpdk-support
Co-Authored-By: Kural Ramakrishnan <kuralamudhan.ramakrishnan@intel.com>
Co-Authored-By: Marco Chiappero <marco.chiappero@intel.com>
Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
Signed-off-by: Danil Golov <d.golov@samsung.com>
OpenStack SDK is a framework which provides a consistent and complete
interface for OpenStack clouds. As we already use it for the Octavia
integration, in this patch we propose to use it also for other parts
of Kuryr.
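Typical usage looks like this (the cloud name and filter are
illustrative; Kuryr builds the connection from its own configuration
rather than a clouds.yaml entry):

    import openstack

    conn = openstack.connect(cloud='devstack-admin')

    # One connection object serves both Neutron and Octavia, instead
    # of separate per-service clients:
    for port in conn.network.ports(device_owner='trunk:subport'):
        print(port.id, port.fixed_ips)
    for lb in conn.load_balancer.load_balancers():
        print(lb.name, lb.provisioning_status)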
Implements: blueprint switch-to-openstacksdk
Change-Id: Ia87bf68bdade3e6f7f752d013f4bdb29bfa5d444
This patch set increases the timeout to wait for resources to be
created/deleted. This is needed to better support spikes without
restarting the kuryr-controller. This patch also ensures that future
retry events are not affecting the kuryr-controller if they are
retried once the related resources are already deleted, i.e., the
on_delete event was executed before one of the retries.
Closes-Bug: 1847753
Change-Id: I725ba22f0babf496af219a37e42e1c33b247308a
LoadBalancerHandler._get_lbaas_spec is identical to
utils.get_lbaas_spec, which is used by LBaaSSpecHandler, so we can
reuse the function from utils instead of duplicating code.
Note that the passed k8s object is an Endpoints in the case of
LoadBalancerHandler and a Service in the case of LBaaSSpecHandler (but
the same annotation is used in both cases, so a common function should
be enough).
Change-Id: I124109f79bcdefcc4948eb35b4bbb4a9ca87c43b
Signed-off-by: Yash Gupta <y.gupta@samsung.com>
It can take a while for a pod to have annotations and a hostIP
defined. It is also possible that a pod is deleted before the state is
set, causing a NotFound k8s exception. Lastly, a service might be
missing the lbaas_spec annotation, causing the event handled on the
endpoints to crash. All these scenarios can be avoided by raising a
resource-not-ready exception, which allows the operation to be
retried.
Change-Id: I5476cd4261a6118dbb388d7238e83169439ffe0d
When the namespace subnet driver is used, a new subnet is created for
each new namespace. As pools are created per subnet, this patch
ensures that new ports are created for each pool on the new subnet in
the nested case.
Note this feature depends on using resource tagging to filter out
trunk ports in case of multiple clusters deployed on the same
OpenStack project or when other trunks are present. Otherwise it will
consider all the existing trunks, no matter whether they belong to the
Kubernetes cluster or not.
NOTE: this is only for the nested case, where pooling shows the
greatest improvement as ports are already ACTIVE.
Change-Id: Id014cf49da8d4cbe0c1795e47765fcf2f0684c09
The LBaaS SG update is failing when the pods selected by the selector
in the rule block are removed after the pod on which the policy is
enforced is removed. This commit fixes the issue by changing from the
LBaaSServiceSpec object to LBaaSLoadBalancer, which is the object type
expected by the '_apply_members_security_groups' function.
Change-Id: I17f2f632e02bc0f46ccc7434173acce68aef957b
Closes-Bug: 1823022
This patch adds support for services that define the targetPort with
text (a port name), pointing to the same port number as the defined
exposed port on the svc.
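A simplified sketch of the resolution (helper name hypothetical): a
textual targetPort is matched against the named container ports of the
backing pods.

    def resolve_target_port(service_port, pod):
        target = service_port['targetPort']
        if isinstance(target, int):
            return target
        for container in pod['spec']['containers']:
            for port in container.get('ports', []):
                if port.get('name') == target:
                    return port['containerPort']
        raise ValueError(f'no container port named {target!r}')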
Closes-Bug: 1818969
Change-Id: I7f957d292f7c4a43b759292e5bd04c4db704c4c4
When a service is created with a Network Policy applied and
deployments are scaled up or down, the LBaaS SG rules should be
updated accordingly. Right now, the LBaaS/Service does not react to
deployment scaling.
This commit fixes the issue by ensuring that the LBaaS SG is updated
on pod events.
Also, when Pods, Network Policies and SVCs are created together it
might happen that the LBaaS SG remains with the default SG rules, even
though the policy is being enforced. This commit ensures the right SG
rules are applied on a LBaaS regardless of the order of k8s resource
creation.
This happens by setting the LBaaS Spec annotation whenever a request
to update the SG rules has been made and retrieving the Spec again
whenever a LBaaS member is created.
Change-Id: I1c54d17a5fcff5387ffae2b132f5036ee9bf07ca
Closes-Bug: 1816015
When a Network Policy is changed, services must also be updated,
deleting the unnecessary rules that no longer match the NP and
creating the needed ones.
Closes-Bug: #1811242
Partially Implements: blueprint k8s-network-policies
Change-Id: I800477d08fd1f46c2a94d3653496f8f1188a3844
This patch adds support for Network Policy on services. It
applies pods' security groups onto the services in front of them.
It makes the following assumptions:
- All the pods pointed to by one svc have the same labels, thus the
same SGs being enforced
- Only copies the SG rules that have the same protocol and direction
as the listener being created
- Adds a default rule to the NP to enable traffic from the services
subnet CIDR
Partially Implements: blueprint k8s-network-policies
Change-Id: Ibd4b51ff40b69af26ab7e7b81d18e63abddf775b