[k8s] Monitoring with Prometheus and Grafana
Take advantage of the cAdvisor that Kubernetes deploys by default and deploy the rest of the monitoring stack on top of it: node-exporter, Prometheus and Grafana. Node-exporter is run as a regular pod through a manifest, while Prometheus and Grafana are deployments with 1 replica. Prometheus natively supports Kubernetes, so the discovery of the nodes and other k8s components is configured directly in the Prometheus configuration.

Change-Id: If2cab996b9458580a55b5212ab298c909622e7f3
Partially-Implements: blueprint container-monitoring
parent 3a9e8cfb40
commit 248e45f75c
@ -33,6 +33,7 @@ Contents
|
||||
#. `Storage`_
|
||||
#. `Image Management`_
|
||||
#. `Notification`_
|
||||
#. `Container Monitoring`_
|
||||
|
||||
===========
|
||||
Terminology
|
||||
@ -304,7 +305,11 @@ the table are linked to more details elsewhere in the user guide.
|
||||
+---------------------------------------+--------------------+---------------+
| `admission_control_list`_             | see below          | see below     |
+---------------------------------------+--------------------+---------------+
| `prometheus_monitoring`_              | - true             | false         |
|                                       | - false            |               |
+---------------------------------------+--------------------+---------------+
| `grafana_admin_passwd`_               | (any string)       | "admin"       |
+---------------------------------------+--------------------+---------------+
|
||||
|
||||
=======
|
||||
Cluster
|
||||
@ -2719,3 +2724,69 @@ created. This example can be applied for any ``create``, ``update`` or
|
||||
"publisher_id": "magnum.host1234",
|
||||
"timestamp": "2016-05-20 15:03:45.960280"
|
||||
}
|
||||
|
||||
|
||||
====================
Container Monitoring
====================

The offered monitoring stack relies on the following set of containers and
services:

- cAdvisor
- Node Exporter
- Prometheus
- Grafana
|
||||
|
||||
To set up this monitoring stack, users are given two configurable labels in
the Magnum cluster template's definition:

_`prometheus_monitoring`
  This label accepts a boolean value. If *True*, the monitoring stack will be
  set up. By default *prometheus_monitoring = False*.

_`grafana_admin_passwd`
  This label lets users set their own password for the Grafana *admin* user.
  It expects a string value. By default it is set to *admin*.
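
As an illustration of how the two labels described above might be supplied, the sketch below builds the comma-separated ``key=value`` string that is typically passed to the client's ``--labels`` option. The helper name ``build_labels_arg`` and the sample password are hypothetical; only the label names and their defaults come from the text above.

```python
# Sketch only: build_labels_arg is a hypothetical helper, not part of Magnum.
DEFAULT_LABELS = {
    "prometheus_monitoring": "False",  # documented default
    "grafana_admin_passwd": "admin",   # documented default
}


def build_labels_arg(overrides=None):
    """Return a 'key=value,key=value' string for the two monitoring labels."""
    labels = dict(DEFAULT_LABELS)
    labels.update(overrides or {})
    unknown = set(labels) - set(DEFAULT_LABELS)
    if unknown:
        raise ValueError("unsupported monitoring labels: %s" % sorted(unknown))
    # Sort the keys so the resulting string is stable and reproducible.
    return ",".join("%s=%s" % (key, labels[key]) for key in sorted(labels))


if __name__ == "__main__":
    # "s3cret" is a made-up password used purely for illustration.
    print(build_labels_arg({"prometheus_monitoring": "True",
                            "grafana_admin_passwd": "s3cret"}))
```

Leaving both labels out keeps the defaults, so the monitoring stack stays disabled unless *prometheus_monitoring* is explicitly set to *True*.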
|
||||
|
||||
|
||||
Container Monitoring in Kubernetes
----------------------------------

By default, all Kubernetes clusters already contain *cAdvisor* integrated
with the *Kubelet* binary. Its container monitoring data can be accessed on
a per-node basis through *http://NODE_IP:4194*.
|
||||
|
||||
Node Exporter is part of the above-mentioned monitoring stack, as it can be
used to export machine metrics. This functionality also works at the node
level, which means that when `prometheus_monitoring`_ is *True*, the
Kubernetes nodes will be populated with an additional manifest under
*/etc/kubernetes/manifests*. Node Exporter is then automatically picked up
and launched as a regular Kubernetes pod.
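
To give an idea of what the machine metrics exported by Node Exporter look like, the sketch below parses a small sample in the Prometheus text exposition format. It is a simplified illustration: it assumes sample lines carry no trailing timestamp, the helper name ``parse_metrics`` is hypothetical, and the sample values are made up.

```python
# Sketch only: parses a tiny subset of the Prometheus text exposition format.
def parse_metrics(text):
    """Map 'name' or 'name{labels}' to a float for each sample line."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # Assumption: no timestamp, so the value is the last field on the line.
        series, _, value = line.rpartition(" ")
        samples[series.strip()] = float(value)
    return samples


# Made-up sample in the style of Node Exporter output.
SAMPLE = """\
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.25
node_cpu{cpu="cpu0",mode="idle"} 1234.5
"""

if __name__ == "__main__":
    for series, value in sorted(parse_metrics(SAMPLE).items()):
        print(series, value)
```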
|
||||
|
||||
To aggregate and complement all the existing monitoring metrics and add a
built-in visualization layer, Prometheus is used. It is launched by the
Kubernetes master node(s) as a *Deployment* with one replica, exposed
through a *Service*, and it relies on a *ConfigMap* where the Prometheus
configuration (prometheus.yml) is defined. This configuration uses
Prometheus' native support for service discovery in Kubernetes clusters,
*kubernetes_sd_configs*. The respective manifests can be found in
*/srv/kubernetes/monitoring/* on the master nodes and, once the service is
up and running, the Prometheus UI can be accessed through port 9090.
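
The Prometheus configuration added by this change rewrites each discovered node address from the Kubelet port 10250 to 10255 (for cAdvisor data) or 9100 (for Node Exporter). The sketch below mimics that *replace* relabel rule in plain Python; the function name ``rewrite_address`` and the sample IP address are hypothetical, and this is an approximation of the behaviour rather than Prometheus' own implementation.

```python
import re

# Sketch only: mimics a Prometheus "replace" relabel rule on __address__.
KUBELET_PORT_RE = re.compile(r"(.*):10250")


def rewrite_address(address, new_port):
    """Swap the Kubelet port 10250 for new_port; leave other addresses alone."""
    # Prometheus anchors relabel regexes, so the whole value must match.
    match = KUBELET_PORT_RE.fullmatch(address)
    if match is None:
        return address
    return "%s:%d" % (match.group(1), new_port)


if __name__ == "__main__":
    # 10.0.0.5 is a made-up node address used purely for illustration.
    print(rewrite_address("10.0.0.5:10250", 10255))  # cAdvisor job
    print(rewrite_address("10.0.0.5:10250", 9100))   # Node Exporter job
```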
|
||||
|
||||
Finally, for custom plotting and enhanced metric aggregation and
visualization, Prometheus can be integrated with Grafana, which natively
supports Prometheus data sources. Grafana is also deployed as a *Deployment*
with one replica, exposed through a *Service*. The default user is
*admin* and the password is set according to `grafana_admin_passwd`_.
A default Grafana dashboard is also provided with this installation,
taken from the official `Grafana dashboards' repository
<https://grafana.net/dashboards>`_. The Prometheus data
source is automatically added to Grafana once it is up and running, pointing
to port 9090 of the Prometheus service in *proxy* access mode. The respective
manifests can also be found in */srv/kubernetes/monitoring/* on the master
nodes and, once the service is running, the Grafana dashboards can be
accessed through port 3000.
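
For reference, the data source registration described above boils down to posting a small JSON document to Grafana. The sketch below rebuilds that document with the field values used by the enable-monitoring script in this change; the helper name ``datasource_payload`` and the sample IP address are hypothetical, and no request is actually sent.

```python
import json


# Sketch only: mirrors the JSON body that the enable-monitoring script posts.
def datasource_payload(prometheus_host, port=9090):
    """Return the JSON document describing the Prometheus data source."""
    return json.dumps({
        "name": "k8sPrometheus",
        "isDefault": True,
        "type": "prometheus",
        "url": "http://%s:%d" % (prometheus_host, port),
        "access": "proxy",
    }, sort_keys=True)


if __name__ == "__main__":
    # 10.254.0.10 is a made-up cluster IP used purely for illustration.
    print(datasource_payload("10.254.0.10"))
```

The data source name matters because the default dashboard is rewritten by the same script to reference *k8sPrometheus*.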
|
||||
|
||||
Both Prometheus and Grafana are set up by a single *systemd* service
called *kube-enable-monitoring*.
|
||||
|
@ -0,0 +1,139 @@
|
||||
#!/bin/bash
|
||||
|
||||
. /etc/sysconfig/heat-params
|
||||
|
||||
if [ "$(echo $PROMETHEUS_MONITORING | tr '[:upper:]' '[:lower:]')" = "false" ]; then
|
||||
exit 0
|
||||
fi
|
||||
|
||||
function writeFile {
    # $1 is filename
    # $2 is file content

    [ -f ${1} ] || {
        echo "Writing File: $1"
        mkdir -p $(dirname ${1})
        cat << EOF > ${1}
$2
EOF
    }
}
|
||||
|
||||
KUBE_MON_BIN=/usr/local/bin/kube-enable-monitoring
|
||||
KUBE_MON_SERVICE=/etc/systemd/system/kube-enable-monitoring.service
|
||||
GRAFANA_DEF_DASHBOARDS="/var/lib/grafana/dashboards"
|
||||
GRAFANA_DEF_DASHBOARD_FILE=$GRAFANA_DEF_DASHBOARDS"/default.json"
|
||||
|
||||
# Write the binary for enable-monitoring
|
||||
KUBE_MON_BIN_CONTENT='''#!/bin/sh
|
||||
until curl -sf "http://127.0.0.1:8080/healthz"
|
||||
do
|
||||
echo "Waiting for Kubernetes API..."
|
||||
sleep 5
|
||||
done
|
||||
|
||||
# Check if all resources exist already before creating them
|
||||
# Check if configmap Prometheus exists
|
||||
kubectl get configmap prometheus -n kube-system
|
||||
if [ "$?" != "0" ] && \
|
||||
[ -f "/srv/kubernetes/monitoring/prometheusConfigMap.yaml" ]; then
|
||||
kubectl create -f /srv/kubernetes/monitoring/prometheusConfigMap.yaml
|
||||
fi
|
||||
|
||||
# Check if deployment and service Prometheus exist.
# Each lookup is tested on its own: PIPESTATUS is a bash-only array and is
# overwritten as soon as the first test command runs.
if ! kubectl get service prometheus -n kube-system && \
   ! kubectl get deployment prometheus -n kube-system && \
   [ -f "/srv/kubernetes/monitoring/prometheusService.yaml" ]; then
    kubectl create -f /srv/kubernetes/monitoring/prometheusService.yaml
fi
|
||||
|
||||
# Check if configmap graf-dash exists
|
||||
kubectl get configmap graf-dash -n kube-system
|
||||
if [ "$?" != "0" ] && \
|
||||
[ -f '''$GRAFANA_DEF_DASHBOARD_FILE''' ]; then
|
||||
kubectl create configmap graf-dash --from-file='''$GRAFANA_DEF_DASHBOARD_FILE''' -n kube-system
|
||||
fi
|
||||
|
||||
# Check if deployment and service Grafana exist.
# Each lookup is tested on its own: PIPESTATUS is a bash-only array and is
# overwritten as soon as the first test command runs.
if ! kubectl get service grafana -n kube-system && \
   ! kubectl get deployment grafana -n kube-system && \
   [ -f "/srv/kubernetes/monitoring/grafanaService.yaml" ]; then
    kubectl create -f /srv/kubernetes/monitoring/grafanaService.yaml
fi
|
||||
|
||||
# Wait for Grafana pod and then inject data source
|
||||
while true
|
||||
do
|
||||
echo "Waiting for Grafana pod to be up and Running"
|
||||
if [ "$(kubectl get po -n kube-system -l name=grafana -o jsonpath={..phase})" = "Running" ]; then
|
||||
break
|
||||
fi
|
||||
sleep 2
|
||||
done
|
||||
|
||||
# Which node is running Grafana
|
||||
NODE_IP=`kubectl get po -n kube-system -o jsonpath={.items[0].status.hostIP} -l name=grafana`
|
||||
PROM_SERVICE_IP=`kubectl get svc prometheus --namespace kube-system -o jsonpath={..clusterIP}`
|
||||
|
||||
# The Grafana pod might be running but the app might still be initiating
|
||||
echo "Check if Grafana is ready..."
|
||||
curl --user admin:$ADMIN_PASSWD -X GET http://$NODE_IP:3000/api/datasources/1
|
||||
until [ $? -eq 0 ]
|
||||
do
|
||||
sleep 2
|
||||
curl --user admin:$ADMIN_PASSWD -X GET http://$NODE_IP:3000/api/datasources/1
|
||||
done
|
||||
|
||||
# Inject Prometheus datasource into Grafana
|
||||
while true
|
||||
do
|
||||
INJECT=`curl --user admin:$ADMIN_PASSWD -X POST \
|
||||
-H "Content-Type: application/json;charset=UTF-8" \
|
||||
--data-binary '''"'"'''{"name":"k8sPrometheus","isDefault":true,
|
||||
"type":"prometheus","url":"http://'''"'"'''$PROM_SERVICE_IP'''"'"''':9090","access":"proxy"}'''"'"'''\
|
||||
"http://$NODE_IP:3000/api/datasources/"`
|
||||
|
||||
if [[ "$INJECT" = *"Datasource added"* ]]; then
|
||||
echo "Prometheus datasource injected into Grafana"
|
||||
break
|
||||
fi
|
||||
echo "Trying to inject Prometheus datasource into Grafana - "$INJECT
|
||||
done
|
||||
'''
|
||||
writeFile $KUBE_MON_BIN "$KUBE_MON_BIN_CONTENT"
|
||||
|
||||
|
||||
# Write the monitoring service
|
||||
KUBE_MON_SERVICE_CONTENT='''[Unit]
|
||||
Requires=kubelet.service
|
||||
|
||||
[Service]
|
||||
Type=oneshot
|
||||
Environment=HOME=/root
|
||||
EnvironmentFile=-/etc/kubernetes/config
|
||||
ExecStart='''${KUBE_MON_BIN}'''
|
||||
|
||||
[Install]
|
||||
WantedBy=multi-user.target
|
||||
'''
|
||||
writeFile $KUBE_MON_SERVICE "$KUBE_MON_SERVICE_CONTENT"
|
||||
|
||||
chown root:root ${KUBE_MON_BIN}
|
||||
chmod 0755 ${KUBE_MON_BIN}
|
||||
|
||||
chown root:root ${KUBE_MON_SERVICE}
|
||||
chmod 0644 ${KUBE_MON_SERVICE}
|
||||
|
||||
# Download the default JSON Grafana dashboard
|
||||
# Not a crucial step, so allow it to fail
|
||||
# TODO: this JSON should be passed into the minions as gzip in cloud-init
|
||||
GRAFANA_DASHB_URL="https://grafana.net/api/dashboards/1621/revisions/1/download"
|
||||
mkdir -p $GRAFANA_DEF_DASHBOARDS
|
||||
curl $GRAFANA_DASHB_URL -o $GRAFANA_DEF_DASHBOARD_FILE || echo "Failed to fetch default Grafana dashboard"
|
||||
if [ -f $GRAFANA_DEF_DASHBOARD_FILE ]; then
|
||||
sed -i -- 's|${DS_PROMETHEUS}|k8sPrometheus|g' $GRAFANA_DEF_DASHBOARD_FILE
|
||||
fi
|
||||
|
||||
# Launch the monitoring service
|
||||
systemctl enable kube-enable-monitoring
|
||||
systemctl start --no-block kube-enable-monitoring
|
@ -0,0 +1,27 @@
|
||||
#!/bin/sh
|
||||
|
||||
. /etc/sysconfig/heat-params
|
||||
|
||||
if [ "$(echo $PROMETHEUS_MONITORING | tr '[:upper:]' '[:lower:]')" = "false" ]; then
|
||||
exit 0
|
||||
fi
|
||||
|
||||
# Write node-exporter manifest as a regular pod
|
||||
cat > /etc/kubernetes/manifests/node-exporter.yaml << EOF
apiVersion: v1
kind: Pod
metadata:
  name: node-exporter
  namespace: kube-system
  annotations:
    prometheus.io/scrape: "true"
  labels:
    app: node-exporter
spec:
  containers:
  - name: node-exporter
    image: prom/node-exporter
    ports:
    - containerPort: 9100
      hostPort: 9100
EOF
|
@ -0,0 +1,67 @@
|
||||
#cloud-config
merge_how: dict(recurse_array)+list(append)
write_files:
  - path: /srv/kubernetes/monitoring/grafanaService.yaml
    owner: "root:root"
    permissions: "0644"
    content: |
      apiVersion: v1
      kind: Service
      metadata:
        labels:
          name: node
          role: service
        name: grafana
        namespace: kube-system
      spec:
        type: "NodePort"
        ports:
        - port: 3000
          targetPort: 3000
          nodePort: 30603
        selector:
          grafana: "true"
      ---
      apiVersion: extensions/v1beta1
      kind: Deployment
      metadata:
        name: grafana
        namespace: kube-system
      spec:
        replicas: 1
        template:
          metadata:
            labels:
              name: grafana
              grafana: "true"
              role: db
          spec:
            containers:
            - image: grafana/grafana
              imagePullPolicy: Always
              name: grafana
              env:
              - name: GF_SECURITY_ADMIN_PASSWORD
                value: $ADMIN_PASSWD
              - name: GF_DASHBOARDS_JSON_ENABLED
                value: "true"
              - name: GF_DASHBOARDS_JSON_PATH
                value: /var/lib/grafana/dashboards
              resources:
                # keep request = limit to keep this container in guaranteed class
                limits:
                  cpu: 100m
                  memory: 200Mi
                requests:
                  cpu: 100m
                  memory: 200Mi
              volumeMounts:
              - name: default-dashboard
                mountPath: /var/lib/grafana/dashboards
              ports:
              - containerPort: 3000
                hostPort: 3000
            volumes:
            - name: default-dashboard
              configMap:
                name: graf-dash
|
@ -5,6 +5,7 @@ write_files:
|
||||
owner: "root:root"
|
||||
permissions: "0600"
|
||||
content: |
|
||||
PROMETHEUS_MONITORING="$PROMETHEUS_MONITORING"
|
||||
KUBE_API_PUBLIC_ADDRESS="$KUBE_API_PUBLIC_ADDRESS"
|
||||
KUBE_API_PRIVATE_ADDRESS="$KUBE_API_PRIVATE_ADDRESS"
|
||||
KUBE_API_PORT="$KUBE_API_PORT"
|
||||
|
@ -5,6 +5,7 @@ write_files:
|
||||
owner: "root:root"
|
||||
permissions: "0600"
|
||||
content: |
|
||||
PROMETHEUS_MONITORING="$PROMETHEUS_MONITORING"
|
||||
KUBE_ALLOW_PRIV="$KUBE_ALLOW_PRIV"
|
||||
KUBE_MASTER_IP="$KUBE_MASTER_IP"
|
||||
KUBE_API_PORT="$KUBE_API_PORT"
|
||||
|
@ -0,0 +1,82 @@
|
||||
#cloud-config
merge_how: dict(recurse_array)+list(append)
write_files:
  - path: /srv/kubernetes/monitoring/prometheusConfigMap.yaml
    owner: "root:root"
    permissions: "0644"
    content: |
      apiVersion: v1
      kind: ConfigMap
      metadata:
        name: prometheus
        namespace: kube-system
      data:
        prometheus.yml: |
          global:
            scrape_interval: 10s
            scrape_timeout: 10s
            evaluation_interval: 10s

          scrape_configs:
          - job_name: 'kubernetes-nodes-cadvisor'
            tls_config:
              ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
            bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
            kubernetes_sd_configs:
              - role: node
            relabel_configs:
              - action: labelmap
                regex: __meta_kubernetes_node_label_(.+)
              - source_labels: [__meta_kubernetes_role]
                action: replace
                target_label: kubernetes_role
              - source_labels: [__address__]
                regex: '(.*):10250'
                replacement: '${1}:10255'
                target_label: __address__
            metric_relabel_configs:
              - action: replace
                source_labels: [id]
                regex: '^/machine\.slice/machine-rkt\\x2d([^\\]+)\\.+/([^/]+)\.service$'
                target_label: rkt_container_name
                replacement: '${2}-${1}'
              - action: replace
                source_labels: [id]
                regex: '^/system\.slice/(.+)\.service$'
                target_label: systemd_service_name
                replacement: '${1}'

          - job_name: 'kubernetes-apiserver-cadvisor'
            tls_config:
              insecure_skip_verify: true
              ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
            bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
            kubernetes_sd_configs:
              - role: endpoints
            relabel_configs:
              - action: labelmap
                regex: __meta_kubernetes_node_label_(.+)
              - source_labels: [__meta_kubernetes_role]
                action: replace
                target_label: kubernetes_role
              - source_labels: [__address__]
                regex: '(.*):10250'
                replacement: '${1}:10255'
                target_label: __address__

          - job_name: 'kubernetes-node-exporter'
            tls_config:
              ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
            bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
            kubernetes_sd_configs:
              - role: node
            relabel_configs:
              - action: labelmap
                regex: __meta_kubernetes_node_label_(.+)
              - source_labels: [__meta_kubernetes_role]
                action: replace
                target_label: kubernetes_role
              - source_labels: [__address__]
                regex: '(.*):10250'
                replacement: '${1}:9100'
                target_label: __address__
|
@ -0,0 +1,60 @@
|
||||
#cloud-config
merge_how: dict(recurse_array)+list(append)
write_files:
  - path: /srv/kubernetes/monitoring/prometheusService.yaml
    owner: "root:root"
    permissions: "0644"
    content: |
      apiVersion: v1
      kind: Service
      metadata:
        annotations:
          prometheus.io/scrape: 'true'
        labels:
          name: prometheus
        name: prometheus
        namespace: kube-system
      spec:
        selector:
          app: prometheus
        type: NodePort
        ports:
        - name: prometheus
          protocol: TCP
          port: 9090
          nodePort: 30900
      ---
      apiVersion: extensions/v1beta1
      kind: Deployment
      metadata:
        name: prometheus
        namespace: kube-system
      spec:
        replicas: 1
        selector:
          matchLabels:
            app: prometheus
        template:
          metadata:
            name: prometheus
            labels:
              app: prometheus
          spec:
            containers:
            - name: prometheus
              image: prom/prometheus
              args:
                - '-storage.local.retention=6h'
                - '-storage.local.memory-chunks=500000'
                - '-config.file=/etc/prometheus/prometheus.yml'
              ports:
              - name: web
                containerPort: 9090
                hostPort: 9090
              volumeMounts:
              - name: config-volume
                mountPath: /etc/prometheus
            volumes:
            - name: config-volume
              configMap:
                name: prometheus
|
@ -109,7 +109,9 @@ class K8sTemplateDefinition(template_def.BaseTemplateDefinition):
|
||||
 'flannel_network_subnetlen',
 'system_pods_initial_delay',
 'system_pods_timeout',
-'admission_control_list']
+'admission_control_list',
+'prometheus_monitoring',
+'grafana_admin_passwd']

 for label in label_list:
     extra_params[label] = cluster_template.labels.get(label)
|
||||
|
@ -40,6 +40,19 @@ parameters:
|
||||
default: m1.small
|
||||
description: flavor to use when booting the server for minions
|
||||
|
||||
prometheus_monitoring:
|
||||
type: boolean
|
||||
default: false
|
||||
description: >
|
||||
whether or not to have the grafana-prometheus-cadvisor monitoring setup
|
||||
|
||||
grafana_admin_passwd:
|
||||
type: string
|
||||
default: admin
|
||||
hidden: true
|
||||
description: >
|
||||
admin user password for the Grafana monitoring interface
|
||||
|
||||
dns_nameserver:
|
||||
type: string
|
||||
description: address of a DNS nameserver reachable in your environment
|
||||
@ -417,6 +430,8 @@ resources:
|
||||
resource_def:
|
||||
type: kubemaster.yaml
|
||||
properties:
|
||||
prometheus_monitoring: {get_param: prometheus_monitoring}
|
||||
grafana_admin_passwd: {get_param: grafana_admin_passwd}
|
||||
api_public_address: {get_attr: [api_lb, floating_address]}
|
||||
api_private_address: {get_attr: [api_lb, address]}
|
||||
ssh_key_name: {get_param: ssh_key_name}
|
||||
@ -474,6 +489,7 @@ resources:
|
||||
resource_def:
|
||||
type: kubeminion.yaml
|
||||
properties:
|
||||
prometheus_monitoring: {get_param: prometheus_monitoring}
|
||||
ssh_key_name: {get_param: ssh_key_name}
|
||||
server_image: {get_param: server_image}
|
||||
minion_flavor: {get_param: minion_flavor}
|
||||
|
@ -105,6 +105,17 @@ parameters:
|
||||
type: string
|
||||
description: endpoint to retrieve TLS certs from
|
||||
|
||||
prometheus_monitoring:
|
||||
type: boolean
|
||||
description: >
|
||||
whether or not to have prometheus and grafana deployed
|
||||
|
||||
grafana_admin_passwd:
|
||||
type: string
|
||||
hidden: true
|
||||
description: >
|
||||
admin user password for the Grafana monitoring interface
|
||||
|
||||
api_public_address:
|
||||
type: string
|
||||
description: Public IP address of the Kubernetes master server.
|
||||
@ -238,6 +249,7 @@ resources:
|
||||
str_replace:
|
||||
template: {get_file: ../../common/templates/kubernetes/fragments/write-heat-params-master.yaml}
|
||||
params:
|
||||
"$PROMETHEUS_MONITORING": {get_param: prometheus_monitoring}
|
||||
"$KUBE_API_PUBLIC_ADDRESS": {get_attr: [api_address_switch, public_ip]}
|
||||
"$KUBE_API_PRIVATE_ADDRESS": {get_attr: [api_address_switch, private_ip]}
|
||||
"$KUBE_API_PORT": {get_param: kubernetes_port}
|
||||
@ -314,6 +326,39 @@ resources:
|
||||
group: ungrouped
|
||||
config: {get_file: ../../common/templates/kubernetes/fragments/write-network-config.sh}
|
||||
|
||||
write_prometheus_configmap:
|
||||
type: OS::Heat::SoftwareConfig
|
||||
properties:
|
||||
group: ungrouped
|
||||
config: {get_file: ../../common/templates/kubernetes/fragments/write-prometheus-configmap.yaml}
|
||||
|
||||
|
||||
write_prometheus_service:
|
||||
type: OS::Heat::SoftwareConfig
|
||||
properties:
|
||||
group: ungrouped
|
||||
config: {get_file: ../../common/templates/kubernetes/fragments/write-prometheus-service.yaml}
|
||||
|
||||
write_grafana_service:
|
||||
type: OS::Heat::SoftwareConfig
|
||||
properties:
|
||||
group: ungrouped
|
||||
config:
|
||||
str_replace:
|
||||
template: {get_file: ../../common/templates/kubernetes/fragments/write-grafana-service.yaml}
|
||||
params:
|
||||
"$ADMIN_PASSWD": {get_param: grafana_admin_passwd}
|
||||
|
||||
enable_monitoring:
|
||||
type: OS::Heat::SoftwareConfig
|
||||
properties:
|
||||
group: ungrouped
|
||||
config:
|
||||
str_replace:
|
||||
template: {get_file: ../../common/templates/kubernetes/fragments/enable-monitoring.sh}
|
||||
params:
|
||||
"$ADMIN_PASSWD": {get_param: grafana_admin_passwd}
|
||||
|
||||
network_config_service:
|
||||
type: OS::Heat::SoftwareConfig
|
||||
properties:
|
||||
@ -394,6 +439,9 @@ resources:
|
||||
- config: {get_resource: add_proxy}
|
||||
- config: {get_resource: enable_services}
|
||||
- config: {get_resource: write_network_config}
|
||||
- config: {get_resource: write_prometheus_configmap}
|
||||
- config: {get_resource: write_prometheus_service}
|
||||
- config: {get_resource: write_grafana_service}
|
||||
- config: {get_resource: network_config_service}
|
||||
- config: {get_resource: network_service}
|
||||
- config: {get_resource: kube_system_namespace_service}
|
||||
@ -401,6 +449,7 @@ resources:
|
||||
- config: {get_resource: enable_kube_proxy}
|
||||
- config: {get_resource: kube_ui_service}
|
||||
- config: {get_resource: kube_examples}
|
||||
- config: {get_resource: enable_monitoring}
|
||||
- config: {get_resource: master_wc_notify}
|
||||
|
||||
######################################################################
|
||||
|
@ -61,6 +61,11 @@ parameters:
|
||||
type: string
|
||||
description: endpoint to retrieve TLS certs from
|
||||
|
||||
prometheus_monitoring:
|
||||
type: boolean
|
||||
description: >
|
||||
whether or not to have the node-exporter running on the node
|
||||
|
||||
kube_master_ip:
|
||||
type: string
|
||||
description: IP address of the Kubernetes master server.
|
||||
@ -220,6 +225,7 @@ resources:
|
||||
str_replace:
|
||||
template: {get_file: ../../common/templates/kubernetes/fragments/write-heat-params.yaml}
|
||||
params:
|
||||
$PROMETHEUS_MONITORING: {get_param: prometheus_monitoring}
|
||||
$KUBE_ALLOW_PRIV: {get_param: kube_allow_priv}
|
||||
$KUBE_MASTER_IP: {get_param: kube_master_ip}
|
||||
$KUBE_API_PORT: {get_param: kubernetes_port}
|
||||
@ -321,6 +327,12 @@ resources:
|
||||
group: ungrouped
|
||||
config: {get_file: ../../common/templates/kubernetes/fragments/enable-kube-proxy-minion.sh}
|
||||
|
||||
enable_node_exporter:
|
||||
type: OS::Heat::SoftwareConfig
|
||||
properties:
|
||||
group: ungrouped
|
||||
config: {get_file: ../../common/templates/kubernetes/fragments/enable-node-exporter.sh}
|
||||
|
||||
minion_wc_notify:
|
||||
type: OS::Heat::SoftwareConfig
|
||||
properties:
|
||||
@ -361,6 +373,7 @@ resources:
|
||||
- config: {get_resource: add_proxy}
|
||||
- config: {get_resource: enable_services}
|
||||
- config: {get_resource: enable_kube_proxy}
|
||||
- config: {get_resource: enable_node_exporter}
|
||||
- config: {get_resource: enable_docker_registry}
|
||||
- config: {get_resource: minion_wc_notify}
|
||||
|
||||
|
File diff suppressed because it is too large
@ -43,6 +43,19 @@ parameters:
|
||||
default: baremetal
|
||||
description: flavor to use when booting the server
|
||||
|
||||
prometheus_monitoring:
|
||||
type: boolean
|
||||
default: false
|
||||
description: >
|
||||
whether or not to have the grafana-prometheus-cadvisor monitoring setup
|
||||
|
||||
grafana_admin_passwd:
|
||||
type: string
|
||||
default: admin
|
||||
hidden: true
|
||||
description: >
|
||||
admin user password for the Grafana monitoring interface
|
||||
|
||||
dns_nameserver:
|
||||
type: string
|
||||
description: address of a dns nameserver reachable in your environment
|
||||
@ -405,6 +418,8 @@ resources:
|
||||
resource_def:
|
||||
type: kubemaster.yaml
|
||||
properties:
|
||||
prometheus_monitoring: {get_param: prometheus_monitoring}
|
||||
grafana_admin_passwd: {get_param: grafana_admin_passwd}
|
||||
api_public_address: {get_attr: [api_lb, floating_address]}
|
||||
api_private_address: {get_attr: [api_lb, address]}
|
||||
ssh_key_name: {get_param: ssh_key_name}
|
||||
@ -491,6 +506,7 @@ resources:
|
||||
kubeminion_software_configs:
|
||||
type: kubeminion_software_configs.yaml
|
||||
properties:
|
||||
prometheus_monitoring: {get_param: prometheus_monitoring}
|
||||
network_driver: {get_param: network_driver}
|
||||
kube_master_ip: {get_attr: [api_address_lb_switch, private_ip]}
|
||||
etcd_server_ip: {get_attr: [etcd_address_lb_switch, private_ip]}
|
||||
|
@ -105,6 +105,17 @@ parameters:
|
||||
type: string
|
||||
description: endpoint to retrieve TLS certs from
|
||||
|
||||
prometheus_monitoring:
|
||||
type: boolean
|
||||
description: >
|
||||
whether or not to have prometheus and grafana deployed
|
||||
|
||||
grafana_admin_passwd:
|
||||
type: string
|
||||
hidden: true
|
||||
description: >
|
||||
admin user password for the Grafana monitoring interface
|
||||
|
||||
api_public_address:
|
||||
type: string
|
||||
description: Public IP address of the Kubernetes master server.
|
||||
@ -232,6 +243,7 @@ resources:
|
||||
str_replace:
|
||||
template: {get_file: ../../common/templates/kubernetes/fragments/write-heat-params-master.yaml}
|
||||
params:
|
||||
"$PROMETHEUS_MONITORING": {get_param: prometheus_monitoring}
|
||||
"$KUBE_API_PUBLIC_ADDRESS": {get_attr: [api_address_switch, public_ip]}
|
||||
"$KUBE_API_PRIVATE_ADDRESS": {get_attr: [api_address_switch, private_ip]}
|
||||
"$KUBE_API_PORT": {get_param: kubernetes_port}
|
||||
@ -307,6 +319,39 @@ resources:
|
||||
group: ungrouped
|
||||
config: {get_file: ../../common/templates/kubernetes/fragments/write-network-config.sh}
|
||||
|
||||
write_prometheus_configmap:
|
||||
type: OS::Heat::SoftwareConfig
|
||||
properties:
|
||||
group: ungrouped
|
||||
config: {get_file: ../../common/templates/kubernetes/fragments/write-prometheus-configmap.yaml}
|
||||
|
||||
|
||||
write_prometheus_service:
|
||||
type: OS::Heat::SoftwareConfig
|
||||
properties:
|
||||
group: ungrouped
|
||||
config: {get_file: ../../common/templates/kubernetes/fragments/write-prometheus-service.yaml}
|
||||
|
||||
write_grafana_service:
|
||||
type: OS::Heat::SoftwareConfig
|
||||
properties:
|
||||
group: ungrouped
|
||||
config:
|
||||
str_replace:
|
||||
template: {get_file: ../../common/templates/kubernetes/fragments/write-grafana-service.yaml}
|
||||
params:
|
||||
"$ADMIN_PASSWD": {get_param: grafana_admin_passwd}
|
||||
|
||||
enable_monitoring:
|
||||
type: OS::Heat::SoftwareConfig
|
||||
properties:
|
||||
group: ungrouped
|
||||
config:
|
||||
str_replace:
|
||||
template: {get_file: ../../common/templates/kubernetes/fragments/enable-monitoring.sh}
|
||||
params:
|
||||
"$ADMIN_PASSWD": {get_param: grafana_admin_passwd}
|
||||
|
||||
network_config_service:
|
||||
type: OS::Heat::SoftwareConfig
|
||||
properties:
|
||||
@ -387,6 +432,9 @@ resources:
|
||||
- config: {get_resource: add_proxy}
|
||||
- config: {get_resource: enable_services}
|
||||
- config: {get_resource: write_network_config}
|
||||
- config: {get_resource: write_prometheus_configmap}
|
||||
- config: {get_resource: write_prometheus_service}
|
||||
- config: {get_resource: write_grafana_service}
|
||||
- config: {get_resource: network_config_service}
|
||||
- config: {get_resource: network_service}
|
||||
- config: {get_resource: kube_system_namespace_service}
|
||||
@ -394,6 +442,7 @@ resources:
|
||||
- config: {get_resource: enable_kube_proxy}
|
||||
- config: {get_resource: kube_ui_service}
|
||||
- config: {get_resource: kube_examples}
|
||||
- config: {get_resource: enable_monitoring}
|
||||
- config: {get_resource: master_wc_notify}
|
||||
|
||||
######################################################################
|
||||
|
@ -43,6 +43,11 @@ parameters:
|
||||
type: string
|
||||
description: endpoint to retrieve TLS certs from
|
||||
|
||||
prometheus_monitoring:
|
||||
type: boolean
|
||||
description: >
|
||||
whether or not to have the node-exporter running on the node
|
||||
|
||||
kube_master_ip:
|
||||
type: string
|
||||
description: IP address of the Kubernetes master server.
|
||||
@ -176,6 +181,7 @@ resources:
|
||||
str_replace:
|
||||
template: {get_file: ../../common/templates/kubernetes/fragments/write-heat-params.yaml}
|
||||
params:
|
||||
$PROMETHEUS_MONITORING: {get_param: prometheus_monitoring}
|
||||
$KUBE_ALLOW_PRIV: {get_param: kube_allow_priv}
|
||||
$KUBE_MASTER_IP: {get_param: kube_master_ip}
|
||||
$KUBE_API_PORT: {get_param: kubernetes_port}
|
||||
@ -276,6 +282,12 @@ resources:
|
||||
group: ungrouped
|
||||
config: {get_file: ../../common/templates/kubernetes/fragments/enable-kube-proxy-minion.sh}
|
||||
|
||||
enable_node_exporter:
|
||||
type: OS::Heat::SoftwareConfig
|
||||
properties:
|
||||
group: ungrouped
|
||||
config: {get_file: ../../common/templates/kubernetes/fragments/enable-node-exporter.sh}
|
||||
|
||||
minion_wc_notify:
|
||||
type: OS::Heat::SoftwareConfig
|
||||
properties:
|
||||
@ -316,6 +328,7 @@ resources:
|
||||
- config: {get_resource: add_proxy}
|
||||
- config: {get_resource: enable_services}
|
||||
- config: {get_resource: enable_kube_proxy}
|
||||
- config: {get_resource: enable_node_exporter}
|
||||
- config: {get_resource: enable_docker_registry}
|
||||
- config: {get_resource: minion_wc_notify}
|
||||
|
||||
|
@ -51,7 +51,9 @@ class TestClusterConductorWithK8s(base.TestCase):
|
||||
 'flannel_backend': 'vxlan',
 'system_pods_initial_delay': '15',
 'system_pods_timeout': '1',
-'admission_control_list': 'fake_list'},
+'admission_control_list': 'fake_list',
+'prometheus_monitoring': 'False',
+'grafana_admin_passwd': 'fake_pwd'},
 'tls_disabled': False,
 'server_type': 'vm',
 'registry_enabled': False,
|
||||
@ -149,7 +151,9 @@ class TestClusterConductorWithK8s(base.TestCase):
|
||||
 'flannel_backend': 'vxlan',
 'system_pods_initial_delay': '15',
 'system_pods_timeout': '1',
-'admission_control_list': 'fake_list'},
+'admission_control_list': 'fake_list',
+'prometheus_monitoring': 'False',
+'grafana_admin_passwd': 'fake_pwd'},
 'http_proxy': 'http_proxy',
 'https_proxy': 'https_proxy',
 'no_proxy': 'no_proxy',
|
||||
@ -180,6 +184,8 @@ class TestClusterConductorWithK8s(base.TestCase):
|
||||
'system_pods_initial_delay': '15',
|
||||
'system_pods_timeout': '1',
|
||||
'admission_control_list': 'fake_list',
|
||||
'prometheus_monitoring': 'False',
|
||||
'grafana_admin_passwd': 'fake_pwd',
|
||||
'http_proxy': 'http_proxy',
|
||||
'https_proxy': 'https_proxy',
|
||||
'no_proxy': 'no_proxy',
|
||||
@@ -261,6 +267,8 @@ class TestClusterConductorWithK8s(base.TestCase):
 'system_pods_initial_delay': '15',
 'system_pods_timeout': '1',
 'admission_control_list': 'fake_list',
+'prometheus_monitoring': 'False',
+'grafana_admin_passwd': 'fake_pwd',
 'http_proxy': 'http_proxy',
 'https_proxy': 'https_proxy',
 'magnum_url': 'http://127.0.0.1:9511/v1',
@@ -344,6 +352,8 @@ class TestClusterConductorWithK8s(base.TestCase):
 'system_pods_initial_delay': '15',
 'system_pods_timeout': '1',
 'admission_control_list': 'fake_list',
+'prometheus_monitoring': 'False',
+'grafana_admin_passwd': 'fake_pwd',
 'insecure_registry_url': '10.0.0.1:5000',
 'kube_version': 'fake-version',
 'magnum_url': 'http://127.0.0.1:9511/v1',
@@ -419,6 +429,8 @@ class TestClusterConductorWithK8s(base.TestCase):
 'system_pods_initial_delay': '15',
 'system_pods_timeout': '1',
 'admission_control_list': 'fake_list',
+'prometheus_monitoring': 'False',
+'grafana_admin_passwd': 'fake_pwd',
 'tls_disabled': False,
 'registry_enabled': False,
 'trustee_domain_id': self.mock_keystone.trustee_domain_id,
@@ -486,6 +498,8 @@ class TestClusterConductorWithK8s(base.TestCase):
 'system_pods_initial_delay': '15',
 'system_pods_timeout': '1',
 'admission_control_list': 'fake_list',
+'prometheus_monitoring': 'False',
+'grafana_admin_passwd': 'fake_pwd',
 'tls_disabled': False,
 'registry_enabled': False,
 'trustee_domain_id': self.mock_keystone.trustee_domain_id,
@@ -679,6 +693,8 @@ class TestClusterConductorWithK8s(base.TestCase):
 'system_pods_initial_delay': '15',
 'system_pods_timeout': '1',
 'admission_control_list': 'fake_list',
+'prometheus_monitoring': 'False',
+'grafana_admin_passwd': 'fake_pwd',
 'tenant_name': 'fake_tenant',
 'username': 'fake_user',
 'cluster_uuid': self.cluster_dict['uuid'],
@@ -260,6 +260,10 @@ class AtomicK8sTemplateDefinitionTestCase(BaseTemplateDefinitionTestCase):
 'system_pods_timeout')
 admission_control_list = mock_cluster_template.labels.get(
 'admission_control_list')
+prometheus_monitoring = mock_cluster_template.labels.get(
+'prometheus_monitoring')
+grafana_admin_passwd = mock_cluster_template.labels.get(
+'grafana_admin_passwd')

 k8s_def = k8sa_tdef.AtomicK8sTemplateDefinition()
@@ -275,6 +279,8 @@ class AtomicK8sTemplateDefinitionTestCase(BaseTemplateDefinitionTestCase):
 'system_pods_initial_delay': system_pods_initial_delay,
 'system_pods_timeout': system_pods_timeout,
 'admission_control_list': admission_control_list,
+'prometheus_monitoring': prometheus_monitoring,
+'grafana_admin_passwd': grafana_admin_passwd,
 'username': 'fake_user',
 'tenant_name': 'fake_tenant',
 'magnum_url': mock_osc.magnum_url.return_value,
@@ -325,6 +331,10 @@ class AtomicK8sTemplateDefinitionTestCase(BaseTemplateDefinitionTestCase):
 'system_pods_timeout')
 admission_control_list = mock_cluster_template.labels.get(
 'admission_control_list')
+prometheus_monitoring = mock_cluster_template.labels.get(
+'prometheus_monitoring')
+grafana_admin_passwd = mock_cluster_template.labels.get(
+'grafana_admin_passwd')

 k8s_def = k8sa_tdef.AtomicK8sTemplateDefinition()
@@ -340,6 +350,8 @@ class AtomicK8sTemplateDefinitionTestCase(BaseTemplateDefinitionTestCase):
 'system_pods_initial_delay': system_pods_initial_delay,
 'system_pods_timeout': system_pods_timeout,
 'admission_control_list': admission_control_list,
+'prometheus_monitoring': prometheus_monitoring,
+'grafana_admin_passwd': grafana_admin_passwd,
 'username': 'fake_user',
 'tenant_name': 'fake_tenant',
 'magnum_url': mock_osc.magnum_url.return_value,
@@ -0,0 +1,8 @@
+---
+features:
+- |
+Includes a monitoring stack based on cAdvisor, node-exporter, Prometheus
+and Grafana. Users can enable this stack through the label
+prometheus_monitoring. Prometheus scrapes metrics from the Kubernetes
+cluster and then serves them to Grafana through Grafana's Prometheus
+data source. Upon completion, a default Grafana dashboard is provided.
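
Reviewer note: the tests above read both new labels with ``labels.get(...)``, which returns ``None`` when a label is unset. The following is a minimal sketch, not code from this commit, of how the two labels might be resolved against the defaults listed in the user guide table (``false`` and ``"admin"``). The names ``MONITORING_LABEL_DEFAULTS``, ``resolve_monitoring_labels`` and ``monitoring_enabled`` are hypothetical, and the case-insensitive handling of ``true`` is an assumption.

```python
# Hypothetical helper for illustration only; it is not part of the commit.
# Defaults are taken from the labels table in the user guide.
MONITORING_LABEL_DEFAULTS = {
    "prometheus_monitoring": "false",
    "grafana_admin_passwd": "admin",
}


def resolve_monitoring_labels(labels):
    """Return both monitoring labels, using the documented default when unset."""
    resolved = {}
    for name, default in MONITORING_LABEL_DEFAULTS.items():
        value = labels.get(name)
        resolved[name] = default if value is None else value
    return resolved


def monitoring_enabled(labels):
    """Assumption: the stack is enabled only when the label spells 'true'."""
    value = resolve_monitoring_labels(labels)["prometheus_monitoring"]
    return str(value).strip().lower() == "true"


if __name__ == "__main__":
    # Label values mirror the fixtures used in the tests in this diff.
    fixture = {"prometheus_monitoring": "False", "grafana_admin_passwd": "fake_pwd"}
    print(resolve_monitoring_labels(fixture))
    print(monitoring_enabled(fixture))
```

Keeping the defaults in one mapping would let the documentation table and the template parameters be checked against a single source, though the commit itself may handle defaults in the Heat templates instead.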