Make DNS pod autoscale
DNS is a critical service in the Kubernetes world, though it is not part
of Kubernetes itself. It should therefore run with more than one replica,
spread across different nodes, for high availability. Otherwise, services
running on the Kubernetes cluster will break if the node hosting the DNS
pod goes down. Likewise, during a cluster upgrade, services will break
while the node hosting the DNS pod is being replaced. There is plenty of
discussion about this; please refer to [1], [2] and [3].
[1] https://github.com/kubernetes/kubeadm/issues/128
[2] https://github.com/kubernetes/kubernetes/issues/40063
[3] https://github.com/kubernetes/kops/issues/2693
Closes-Bug: #1757554
Change-Id: Ic64569d4bdcf367955398d5badef70e7afe33bbb
(cherry picked from commit 54a4ac9f84)
@@ -1242,6 +1242,27 @@ _`ingress_controller_role`

    kubectl label node <node-name> role=ingress

DNS
---

CoreDNS is a critical service in a Kubernetes cluster for service discovery.
To provide high availability for the CoreDNS pod, Magnum now supports
autoscaling CoreDNS using `cluster-proportional-autoscaler
<https://github.com/kubernetes-incubator/cluster-proportional-autoscaler>`_.
With cluster-proportional-autoscaler, the replicas of the CoreDNS pod are
autoscaled based on the number of nodes and cores in the cluster, to prevent
a single point of failure.

The scaling parameters and data points are provided to the autoscaler via a
ConfigMap, and the autoscaler refreshes its parameters table every poll
interval to stay up to date with the latest desired scaling parameters. Using
a ConfigMap means users can make on-the-fly changes (including changing the
control mode) without rebuilding or restarting the scaler containers/pods.
Please refer to `Autoscale the DNS Service in a Cluster
<https://kubernetes.io/docs/tasks/administer-cluster/dns-horizontal-autoscaling/#tuning-autoscaling-parameters>`_
for more info.
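As a rough illustration of the default ``linear`` control mode: the replica
count is the larger of ``ceil(cores / coresPerReplica)`` and
``ceil(nodes / nodesPerReplica)``, with a floor of 2 on multi-node clusters
when ``preventSinglePointFailure`` is set. The helper function below is a
hypothetical sketch of that arithmetic, not the autoscaler's actual code:

```shell
# Sketch of the "linear" control mode arithmetic (illustrative only).
linear_replicas() {
    local cores=$1 nodes=$2
    local cores_per_replica=256 nodes_per_replica=16
    # ceil division: (a + b - 1) / b
    local by_cores=$(( (cores + cores_per_replica - 1) / cores_per_replica ))
    local by_nodes=$(( (nodes + nodes_per_replica - 1) / nodes_per_replica ))
    local replicas=$(( by_cores > by_nodes ? by_cores : by_nodes ))
    # preventSinglePointFailure: at least 2 replicas on multi-node clusters
    if [ "$nodes" -gt 1 ] && [ "$replicas" -lt 2 ]; then replicas=2; fi
    echo "$replicas"
}

linear_replicas 12 3     # 3 nodes x 4 cores  -> 2 (single-point-failure floor)
linear_replicas 320 40   # 40 nodes x 8 cores -> 3 (nodesPerReplica dominates)
```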

Swarm
=====

@@ -2,7 +2,9 @@
. /etc/sysconfig/heat-params

-_prefix=${CONTAINER_INFRA_PREFIX:-docker.io/coredns/}
+_dns_prefix=${CONTAINER_INFRA_PREFIX:-docker.io/coredns/}
+_autoscaler_prefix=${CONTAINER_INFRA_PREFIX:-docker.io/googlecontainer/}

CORE_DNS=/etc/kubernetes/manifests/kube-coredns.yaml
[ -f ${CORE_DNS} ] || {
    echo "Writing File: $CORE_DNS"
@@ -93,7 +95,7 @@ spec:
        operator: "Exists"
      containers:
      - name: coredns
-        image: ${_prefix}coredns:1.0.1
+        image: ${_dns_prefix}coredns:1.0.1
        imagePullPolicy: Always
        args: [ "-conf", "/etc/coredns/Corefile" ]
        volumeMounts:
@@ -150,6 +152,96 @@ spec:
  - name: metrics
    port: 9153
    protocol: TCP
---
kind: ServiceAccount
apiVersion: v1
metadata:
  name: kube-dns-autoscaler
  namespace: kube-system
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: system:kube-dns-autoscaler
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
rules:
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["list"]
  - apiGroups: [""]
    resources: ["replicationcontrollers/scale"]
    verbs: ["get", "update"]
  - apiGroups: ["extensions"]
    resources: ["deployments/scale", "replicasets/scale"]
    verbs: ["get", "update"]
  # Remove the configmaps rule once below issue is fixed:
  # kubernetes-incubator/cluster-proportional-autoscaler#16
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["get", "create"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: system:kube-dns-autoscaler
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
subjects:
  - kind: ServiceAccount
    name: kube-dns-autoscaler
    namespace: kube-system
roleRef:
  kind: ClusterRole
  name: system:kube-dns-autoscaler
  apiGroup: rbac.authorization.k8s.io

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kube-dns-autoscaler
  namespace: kube-system
  labels:
    k8s-app: kube-dns-autoscaler
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  selector:
    matchLabels:
      k8s-app: kube-dns-autoscaler
  template:
    metadata:
      labels:
        k8s-app: kube-dns-autoscaler
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      priorityClassName: system-cluster-critical
      containers:
      - name: autoscaler
        image: ${_autoscaler_prefix}cluster-proportional-autoscaler-amd64:1.1.2
        resources:
          requests:
            cpu: "20m"
            memory: "10Mi"
        command:
        - /cluster-proportional-autoscaler
        - --namespace=kube-system
        - --configmap=kube-dns-autoscaler
        # Should keep target in sync with the coredns deployment name above
        - --target=Deployment/coredns
        # When the cluster uses large nodes (with more cores), "coresPerReplica" should dominate.
        # If using small nodes, "nodesPerReplica" should dominate.
        - --default-params={"linear":{"coresPerReplica":256,"nodesPerReplica":16,"preventSinglePointFailure":true}}
        - --logtostderr=true
        - --v=2
      tolerations:
      - key: "CriticalAddonsOnly"
        operator: "Exists"
      serviceAccountName: kube-dns-autoscaler
EOF
}

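The ``--default-params`` flag is only a fallback: the autoscaler creates the
``kube-dns-autoscaler`` ConfigMap on first run if it does not exist, then
watches it, so scaling behaviour can be tuned live without restarting the pod.
The following fragment is an illustration of what that ConfigMap looks like
(the values mirror the defaults above; it is not part of this patch):

```yaml
# Illustrative only; edit live with:
#   kubectl edit configmap kube-dns-autoscaler -n kube-system
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-dns-autoscaler
  namespace: kube-system
data:
  # Replacing the "linear" key with "ladder" switches the control mode on the fly.
  linear: |-
    {
      "coresPerReplica": 256,
      "nodesPerReplica": 16,
      "preventSinglePointFailure": true
    }
```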
@@ -0,0 +1,8 @@
---
issues:
  - |
    Currently, the number of replicas of the CoreDNS pod is hardcoded to 1,
    which is not a reasonable number for such a critical service. Without
    DNS, most workloads running on the Kubernetes cluster would break.
    Magnum now autoscales the CoreDNS pod based on the number of nodes and
    cores in the cluster.