promenade/charts/proxy/values.yaml
Mark Burnett 69cb269230 Make K8S proxy health check more aggressive
In K8S version 1.10, the proxy can sometimes get stuck believing that
some services do not have any endpoints.  This seems to be triggered by
network instability.  The proxy does not appear to recover on its own,
but bouncing the pod fixes the issue.

This change adds a naive means of detecting and recovering from this
(`iptables-save | grep 'has no endpoints'` in the liveness probe) that
may occasionally have false positives.  As such, the liveness probe is
configured very conservatively to avoid triggering CrashLoopBackoff in
the event of a false positive.
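
For reference, a minimal sketch of how such a probe might be wired into
the daemonset (the exact template in this change may differ; whitelist
filtering is elided here and the thresholds mirror the defaults in this
file):

    livenessProbe:
      exec:
        command:
          - sh
          - -c
          # Probe fails when kube-proxy has programmed a rule flagging a
          # service as having no endpoints; grep exits 0 on a match, so
          # the result is inverted.
          - "! iptables-save | grep 'has no endpoints'"
      failureThreshold: 10
      initialDelaySeconds: 15
      periodSeconds: 35
      successThreshold: 1
      timeoutSeconds: 10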

Finally, there is a whitelist feature to help avoid false positives for
services that are known to legitimately have empty endpoints during the
course of normal operation (e.g. Patroni might manage such an endpoint
list).
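
As an illustration only (the override values and whitelist entry below
are assumptions, not part of this change), a site-level override could
whitelist such a service and further relax the probe:

    livenessProbe:
      config:
        # Check even less frequently at sites prone to false positives.
        periodSeconds: 60
      whitelist:
        # Endpoint list managed externally (e.g. by Patroni); empty
        # endpoints here are expected and must not fail the probe.
        - postgres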

Change-Id: I29a770fab70b1fb79db59ef5408f40b2af1c01f9
2018-09-05 13:46:03 -05:00

# Copyright 2017 AT&T Intellectual Property. All other rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
manifests:
  daemonset_proxy: true
  rbac: true

pod:
  lifecycle:
    upgrades:
      daemonsets:
        pod_replacement_strategy: RollingUpdate
        proxy:
          enabled: true
          min_ready_seconds: 0
          max_unavailable: 1
    termination_grace_period:
      proxy:
        timeout: 30
  resources:
    enabled: false
    proxy:
      requests:
        memory: "128Mi"
        cpu: "100m"
      limits:
        memory: "1024Mi"
        cpu: "2000m"

images:
  tags:
    proxy: gcr.io/google_containers/hyperkube-amd64:v1.10.2
  pull_policy: "IfNotPresent"

command_prefix:
  - /proxy
  - --proxy-mode=iptables
  - --cluster-cidr=10.97.0.0/16

network:
  kubernetes_netloc: 10.96.0.1

kube_service:
  host: 127.0.0.1
  port: 6553

livenessProbe:
  config:
    # NOTE(mark-burnett): To avoid cascading failure modes, it is
    # important that these values are configured to avoid the possibility
    # of CrashLoopBackoff for this pod. Otherwise, a small non-impacting
    # issue could disable kube-proxy for the entire site.
    failureThreshold: 10
    initialDelaySeconds: 15
    periodSeconds: 35
    successThreshold: 1
    timeoutSeconds: 10
  whitelist:
    # - postgres