.. _Methodology_for_Containerized_Openstack_Monitoring:

***************************************************
Methodology for Containerized Openstack Monitoring
***************************************************

:Abstract:

  This document describes a Containerized Openstack monitoring solution that
  provides a scalable and comprehensive architecture and captures all crucial
  performance metrics on each layer of the stack.
Containerized Openstack Monitoring Architecture
===============================================

This part of the documentation describes the required performance metrics on
each distinguished Containerized Openstack layer.

Containerized Openstack comprises three layers on which the Monitoring System
should be able to query all necessary counters:

- OS layer
- Kubernetes layer
- Openstack layer

Monitoring instruments are logically divided into two groups:

- Monitoring Server Side
- Node Client Side

Operating System Layer
----------------------

We used Ubuntu Xenial on top of bare-metal servers on both the server and the
node side.
Baremetal hardware description
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

We deployed everything on a 200-server environment with the following hardware
characteristics:

.. table::
+-------+----------------+------------------------+
|server |vendor,model |HP,DL380 Gen9 |
+-------+----------------+------------------------+
|CPU |vendor,model |Intel,E5-2680 v3 |
| +----------------+------------------------+
| |processor_count |2 |
| +----------------+------------------------+
| |core_count |12 |
| +----------------+------------------------+
| |frequency_MHz |2500 |
+-------+----------------+------------------------+
|RAM |vendor,model |HP,752369-081 |
| +----------------+------------------------+
| |amount_MB |262144 |
+-------+----------------+------------------------+
|NETWORK|interface_name |p1p1 |
| +----------------+------------------------+
| |vendor,model |Intel,X710 Dual Port |
| +----------------+------------------------+
| |bandwidth |10G |
+-------+----------------+------------------------+
|STORAGE|dev_name |/dev/sda |
| +----------------+------------------------+
| |vendor,model | | raid10 - HP P840 |
| | | | 12 disks EH0600JEDHE |
| +----------------+------------------------+
| |SSD/HDD |HDD |
| +----------------+------------------------+
|       |size            | 3.6TB                  |
+-------+----------------+------------------------+
Operating system configuration
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Bare-metal nodes were provisioned with Cobbler using our in-house preseed
scripts. The OS versions we used:

.. table:: Operating system versions
+--------------------+-----------------------------------------+
|Software |Version |
+--------------------+-----------------------------------------+
|Ubuntu |Ubuntu 16.04.1 LTS |
+--------------------+-----------------------------------------+
|Kernel |4.4.0-47-generic |
+--------------------+-----------------------------------------+

You can find the /etc folder contents from one of the typical systems we used:

:download:`etc_tarball <configs/node1.tar.gz>`
Required system metrics
^^^^^^^^^^^^^^^^^^^^^^^
At this layer we must monitor the following list of processes:

.. table::
+------------------------+-----------------------------------------+
|List of processes |Mariadb |
| +-----------------------------------------+
| |Rabbitmq |
|                        +-----------------------------------------+
| |Keystone |
| +-----------------------------------------+
| |Glance |
| +-----------------------------------------+
| |Cinder |
| +-----------------------------------------+
| |Nova |
| +-----------------------------------------+
| |Neutron |
| +-----------------------------------------+
| |Openvswitch |
| +-----------------------------------------+
| |Kubernetes |
+------------------------+-----------------------------------------+
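
A quick liveness check for these components on a node might look like the
following sketch (the process name patterns here are assumptions and depend on
the actual deployment):

.. code:: bash

    #!/bin/bash
    # Minimal liveness check for the monitored control plane processes.
    # The name patterns below are assumptions specific to a deployment.
    for p in mysqld beam.smp keystone glance-api cinder-volume \
             nova-api neutron-server ovs-vswitchd kubelet; do
        if pgrep -f "$p" > /dev/null; then
            echo "$p: running"
        else
            echo "$p: not found"
        fi
    done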

In addition, the following list of metrics must be collected:

.. table::
+------------------------+-----------------------------------------+
|Node load average |1min |
| +-----------------------------------------+
| |5min |
|                        +-----------------------------------------+
| |15min |
+------------------------+-----------------------------------------+
|Global process stats |Running |
| +-----------------------------------------+
| |Stopped |
|                        +-----------------------------------------+
| |Waiting |
+------------------------+-----------------------------------------+
|Global CPU Usage | Steal |
| +-----------------------------------------+
| | Wait |
| +-----------------------------------------+
| | User |
| +-----------------------------------------+
| | System |
| +-----------------------------------------+
| | Interrupt |
| +-----------------------------------------+
| | Nice |
| +-----------------------------------------+
| | Idle |
+------------------------+-----------------------------------------+
|Per CPU Usage | User |
| +-----------------------------------------+
| | System |
+------------------------+-----------------------------------------+
|Global memory usage |bandwidth |
| +-----------------------------------------+
| |Cached |
| +-----------------------------------------+
| |Buffered |
| +-----------------------------------------+
| |Free |
| +-----------------------------------------+
| |Used |
| +-----------------------------------------+
| |Total |
+------------------------+-----------------------------------------+
|Numa monitoring |Numa_hit |
|For each node +-----------------------------------------+
| |Numa_miss |
|                        +-----------------------------------------+
| |Numa_foreign |
| +-----------------------------------------+
| |Local_node |
| +-----------------------------------------+
| |Other_node |
+------------------------+-----------------------------------------+
|Numa monitoring |Huge |
|For each pid +-----------------------------------------+
| |Heap |
|                        +-----------------------------------------+
| |Stack |
| +-----------------------------------------+
| |Private |
+------------------------+-----------------------------------------+
|Global IOSTAT \+ |Merge reads /s |
|Per device IOSTAT +-----------------------------------------+
| |Merge write /s |
| +-----------------------------------------+
| |read/s |
| +-----------------------------------------+
| |write/s |
| +-----------------------------------------+
| |Read transfer |
| +-----------------------------------------+
| |Write transfer |
| +-----------------------------------------+
| |Read latency |
| +-----------------------------------------+
| |Write latency |
| +-----------------------------------------+
| |Queue size |
| +-----------------------------------------+
| |Await |
+------------------------+-----------------------------------------+
|Network per interface |Octets /s (in, out) |
| +-----------------------------------------+
| |Packet /s (in, out) |
|                        +-----------------------------------------+
| |Dropped /s |
+------------------------+-----------------------------------------+
|Other system metrics |Entropy |
| +-----------------------------------------+
| |DF per device |
+------------------------+-----------------------------------------+
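
Most of these counters come straight from /proc and standard utilities; for
example, load average and the per-node NUMA statistics can be polled with a
sketch like this (assuming the numactl/numastat tooling is installed):

.. code:: bash

    #!/bin/bash
    # Node load average (1, 5 and 15 minutes) straight from /proc.
    read load1 load5 load15 _ < /proc/loadavg
    echo "load1=$load1 load5=$load5 load15=$load15"

    # Per-node NUMA hit/miss counters, as tabulated above.
    numastat | grep -E 'numa_hit|numa_miss|numa_foreign|local_node|other_node'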
Kubernetes Layer
----------------
`Kargo`_ from `Fuel-CCP-installer`_ was our main tool to deploy K8S
on top of provisioned systems (monitored nodes).
Kargo sets up Kubernetes in the following way:
- masters: Calico, Kubernetes API services
- nodes: Calico, Kubernetes minion services
- etcd: etcd service
Kargo deployment parameters
^^^^^^^^^^^^^^^^^^^^^^^^^^^

You can find the Kargo deployment script in the `Kargo deployment script`_
section. The following deployment parameters were used:

.. code:: yaml

    docker_options: "--insecure-registry 172.20.8.35:5000 -D"
    upstream_dns_servers: [172.20.8.34, 8.8.4.4]
    nameservers: [172.20.8.34, 8.8.4.4]
    kube_service_addresses: 10.224.0.0/12
    kube_pods_subnet: 10.240.0.0/12
    kube_network_node_prefix: 22
    kube_apiserver_insecure_bind_address: "0.0.0.0"
    dns_replicas: 3
    dns_cpu_limit: "100m"
    dns_memory_limit: "512Mi"
    dns_cpu_requests: "70m"
    dns_memory_requests: "70Mi"
    deploy_netchecker: false
.. table::
+----------------------+-----------------------------------------+
|Software |Version |
+----------------------+-----------------------------------------+
|`Fuel-CCP-Installer`_ |6fd81252cb2d2c804f388337aa67d4403700f094 |
| | |
+----------------------+-----------------------------------------+
|`Kargo`_ |2c23027794d7851ee31363c5b6594180741ee923 |
+----------------------+-----------------------------------------+
Required K8S metrics
^^^^^^^^^^^^^^^^^^^^

Here we should get K8S health metrics and ETCD performance metrics:

.. table::
+------------------------+-----------------------------------------+
|ETCD performance metrics|members count / states |
| +-----------------------------------------+
| |numbers of keys in a cluster |
|                        +-----------------------------------------+
| |Size of data set |
| +-----------------------------------------+
| |Avg. latency from leader to followers |
| +-----------------------------------------+
| |Bandwidth rate, send/receive |
| +-----------------------------------------+
| |Create store success/fail |
| +-----------------------------------------+
| |Get success/fail |
| +-----------------------------------------+
| |Set success/fail |
| +-----------------------------------------+
| |Package rate, send/receive |
| +-----------------------------------------+
| |Expire count |
| +-----------------------------------------+
| |Update success/fail |
| +-----------------------------------------+
| |Compare-and-swap success/fail |
| +-----------------------------------------+
| |Watchers |
| +-----------------------------------------+
| |Delete success/fail |
| +-----------------------------------------+
| |Compare-and-delete success/fail |
| +-----------------------------------------+
| |Append req, send/ receive |
+------------------------+-----------------------------------------+
|K8S health metrics      |Number of nodes in each state            |
| +-----------------------------------------+
| |Total number of namespaces |
| +-----------------------------------------+
| |Total number of PODs per cluster,node,ns |
| +-----------------------------------------+
|                        |Total number of services                 |
| +-----------------------------------------+
| |Endpoints in each service |
| +-----------------------------------------+
| |Number of API service instances |
| +-----------------------------------------+
| |Number of controller instances |
| +-----------------------------------------+
| |Number of scheduler instances |
| +-----------------------------------------+
| |Cluster resources, scheduler view |
+------------------------+-----------------------------------------+
|K8S API log analysis |Number of responses (per each HTTP code) |
| +-----------------------------------------+
| |Response Time |
+------------------------+-----------------------------------------+
For the last two metrics we should utilize a log collector to store and parse
all log records within the K8S environment.
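
Both groups of counters are exposed over HTTP, so they can be sanity-checked
before being wired into the monitoring pipeline (a sketch assuming default
ports, the etcd v2 API, and kubectl access on the node):

.. code:: bash

    #!/bin/bash
    # ETCD statistics: member state and raw store operation counters.
    curl -s http://127.0.0.1:2379/v2/stats/self  | python -m json.tool
    curl -s http://127.0.0.1:2379/v2/stats/store | python -m json.tool

    # K8S health metrics: nodes per state, namespaces, PODs and services.
    kubectl get nodes --no-headers | awk '{print $2}' | sort | uniq -c
    kubectl get namespaces --no-headers | wc -l
    kubectl get pods --all-namespaces --no-headers | wc -l
    kubectl get services --all-namespaces --no-headers | wc -l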
Openstack Layer
---------------

CCP stands for "Containerized Control Plane". CCP aims to build, run and manage
production-ready OpenStack containers on top of a Kubernetes cluster.
.. table::
+--------------------+-----------------------------------------+
|Software |Version |
+--------------------+-----------------------------------------+
|`Fuel-CCP`_ |8570d0e0e512bd16f8449f0a10b1e3900fd09b2d |
+--------------------+-----------------------------------------+
CCP configuration
^^^^^^^^^^^^^^^^^

CCP was deployed on top of the 200-node K8S cluster in the following
configuration:

.. code-block:: yaml

    node[1-3]: Kubernetes
    node([4-6])$:        # 4-6
      roles:
        - controller
        - openvswitch
    node[7-9]$:          # 7-9
      roles:
        - rabbitmq
    node10$:             # 10
      roles:
        - galera
    node11$:             # 11
      roles:
        - heat
    node(1[2-9])$:       # 12-19
      roles:
        - compute
        - openvswitch
    node[2-9][0-9]$:     # 20-99
      roles:
        - compute
        - openvswitch
    node(1[0-9][0-9])$:  # 100-199
      roles:
        - compute
        - openvswitch
    node200$:
      roles:
        - backup

CCP Openstack services list ( `versions.yaml`_ ):

.. code-block:: yaml

    openstack/cinder:
      git_ref: stable/newton
      git_url: https://github.com/openstack/cinder.git
    openstack/glance:
      git_ref: stable/newton
      git_url: https://github.com/openstack/glance.git
    openstack/heat:
      git_ref: stable/newton
      git_url: https://github.com/openstack/heat.git
    openstack/horizon:
      git_ref: stable/newton
      git_url: https://github.com/openstack/horizon.git
    openstack/keystone:
      git_ref: stable/newton
      git_url: https://github.com/openstack/keystone.git
    openstack/neutron:
      git_ref: stable/newton
      git_url: https://github.com/openstack/neutron.git
    openstack/nova:
      git_ref: stable/newton
      git_url: https://github.com/openstack/nova.git
    openstack/requirements:
      git_ref: stable/newton
      git_url: https://git.openstack.org/openstack/requirements.git
    openstack/sahara-dashboard:
      git_ref: stable/newton
      git_url: https://git.openstack.org/openstack/sahara-dashboard.git

`K8S Ingress Resources`_ rules were enabled during CCP deployment to expose
Openstack service endpoints to the external routable network.
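
As an illustration only, an Ingress rule for a single service endpoint might
look like the following sketch (the hostname, namespace and port here are
hypothetical; the real rules are generated during CCP deployment):

.. code:: bash

    # Hypothetical Ingress rule for the Horizon endpoint
    # (extensions/v1beta1 was the Ingress API of that K8S generation).
    kubectl apply -f - <<EOF
    apiVersion: extensions/v1beta1
    kind: Ingress
    metadata:
      name: horizon
      namespace: ccp
    spec:
      rules:
      - host: horizon.external.example.com
        http:
          paths:
          - backend:
              serviceName: horizon
              servicePort: 80
    EOF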

See the CCP deployment script and configuration files in the
`CCP deployment and configuration files`_ section.
Required Openstack-related metrics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

At this layer we should get Openstack environment metrics, API and resource
utilization metrics.

.. table::
+------------------------+-----------------------------------------+
|Openstack metrics |Total number of controller nodes |
| +-----------------------------------------+
| |Total number of services |
|                        +-----------------------------------------+
| |Total number of compute nodes |
| +-----------------------------------------+
| |Total number of nodes |
| +-----------------------------------------+
| |Total number of VMs |
| +-----------------------------------------+
| |Number of VMs per tenant, per node |
| +-----------------------------------------+
| |Resource utilization per project,service |
| +-----------------------------------------+
| |Total number of tenants |
| +-----------------------------------------+
| |API request time |
| +-----------------------------------------+
| |Mean time to spawn VM |
+------------------------+-----------------------------------------+
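
Metrics such as API request time and mean time to spawn a VM are measured
actively; a minimal way to sample the duration of a single API call (assuming
a configured openstack CLI client) is:

.. code:: bash

    #!/bin/bash
    # Sample the wall-clock duration of one Nova API request, in milliseconds.
    start=$(date +%s%N)
    openstack server list > /dev/null
    end=$(date +%s%N)
    echo "nova_list_ms=$(( (end - start) / 1000000 ))"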
Implementation
==============

This part of the documentation describes the Monitoring System implementation.
Here is the software list that we chose to accomplish all required tasks:

.. table::
+-----------------------------------------+-----------------------------------------+
|Monitoring Node Server Side |Monitored Node Client Side |
+--------------------+--------------------+--------------------+--------------------+
|Metrics server |Log storage |Metrics agent |Log collector |
| | | | |
+--------------------+--------------------+--------------------+--------------------+
| `Prometheus`_ \+ | `ElasticSearch`_ |`Telegraf`_ | `Heka`_ |
| `Grafana`_ | \+ `Kibana`_ | | |
+--------------------+--------------------+--------------------+--------------------+
Server Side Software
--------------------
Prometheus
^^^^^^^^^^
.. table::
+--------------------+-----------------------------------------+
|Software |Version |
+--------------------+-----------------------------------------+
|`Prometheus GitHub`_|7e369b9318a4d5d97a004586a99f10fa51a46b26 |
+--------------------+-----------------------------------------+

Due to the high load rate we faced an issue with Prometheus performance once
the metrics count reached 15 million. We therefore split the Prometheus setup
into two standalone nodes. The first node polls API metrics from K8S-related
services that are natively available at the `/metrics` URI and exposed by the
K8S API and ETCD API by default. The second node stores all other metrics that
are collected and calculated locally on the environment servers via Telegraf.

The Prometheus node deployment scripts and configuration files can be found in
the `Prometheus deployment and configuration files`_ section.
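
The resulting split can be pictured as two scrape configurations, roughly as
in the sketch below (addresses and the Telegraf listen port are placeholders;
the real jobs live in the configuration files referenced above):

.. code:: bash

    # Node 1: scrape the natively exposed /metrics endpoints.
    cat > prometheus-kuber.yml <<EOF
    scrape_configs:
      - job_name: kubernetes-apiserver
        static_configs:
          - targets: ['<k8s-master>:8080']
      - job_name: etcd
        static_configs:
          - targets: ['<etcd-node>:2379']
    EOF

    # Node 2: scrape the Telegraf agents exposing locally calculated metrics.
    cat > prometheus-system.yml <<EOF
    scrape_configs:
      - job_name: telegraf
        static_configs:
          - targets: ['<node1>:9126', '<node2>:9126']
    EOF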
Grafana
^^^^^^^
.. table::
+--------------------+-----------------------------------------+
|Software |Version |
+--------------------+-----------------------------------------+
|`Grafana`_ |v4.0.1 |
+--------------------+-----------------------------------------+

Grafana was used as the metrics visualizer, with a separate dashboard built for
each group of metrics:

- System nodes metrics
- Kubernetes metrics
- ETCD metrics
- Openstack metrics

You can find their settings in the `Grafana dashboards configuration`_
section.

Grafana server deployment script:

.. code-block:: bash

    #!/bin/bash
    ansible-playbook -i ./hosts ./deploy-graf-prom.yaml --tags "grafana"

It uses the same YAML configuration file `deploy-graf-prom.yaml`_ from the
`Prometheus deployment and configuration files`_ section.
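
Once Grafana is up, a datasource pointing at Prometheus can be registered
through the Grafana HTTP API before the dashboards are imported (a sketch
with placeholder credentials and addresses):

.. code:: bash

    # Register the Prometheus datasource via the Grafana HTTP API.
    curl -s -X POST http://admin:admin@<grafana-host>:3000/api/datasources \
        -H 'Content-Type: application/json' \
        -d '{"name": "prometheus", "type": "prometheus",
             "url": "http://<prometheus-host>:9090", "access": "proxy"}'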
ElasticSearch
^^^^^^^^^^^^^
.. table::
+--------------------+-----------------------------------------+
|Software |Version |
+--------------------+-----------------------------------------+
|`ElasticSearch`_ |2.4.2 |
+--------------------+-----------------------------------------+

ElasticSearch is a well-known and proven log storage, and we used it as a
standalone node for collecting Kubernetes API logs and all other logs from
containers across the environment.
For appropriate performance on the 200-node lab we increased `ES_HEAP_SIZE`
from the default 1G to 10G in the /etc/default/elasticsearch configuration
file.
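
The change amounts to a one-line edit on the log storage node (a sketch,
assuming the stock Debian packaging of ElasticSearch 2.x):

.. code:: bash

    # Raise the ElasticSearch JVM heap from the default 1G to 10G.
    sudo sed -i 's/^#\?ES_HEAP_SIZE=.*/ES_HEAP_SIZE=10g/' /etc/default/elasticsearch
    sudo service elasticsearch restart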

ElasticSearch and the Kibana dashboard were installed with the
`deploy_elasticsearch_kibana.sh`_ deployment script.
Kibana
^^^^^^
.. table::
+--------------------+-----------------------------------------+
|Software |Version |
+--------------------+-----------------------------------------+
|`Kibana`_ |4.5.4 |
+--------------------+-----------------------------------------+

We used Kibana as the main visualization tool for ElasticSearch. It let us
build charts based on K8S API log analysis. Kibana was installed on a separate
node with a single dashboard representing the K8S API response time graph.
Dashboard settings:
:download:`Kibana_dashboard.json <configs/dashboards/Kibana_dashboard.json>`
Client Side Software
--------------------
Telegraf
^^^^^^^^
.. table::
+--------------------+-----------------------------------------+
|Software |Version |
+--------------------+-----------------------------------------+
|`Telegraf`_ |v1.0.0-beta2-235-gbc14ac5 |
| |git: openstack_stats |
| |bc14ac5b9475a59504b463ad8f82ed810feed3ec |
+--------------------+-----------------------------------------+

Telegraf was chosen as the client-side metrics agent. It provides multiple ways
to poll and calculate metrics from a variety of sources. Owing to its
plugin-driven nature, it takes data from different inputs and exposes the
calculated metrics in Prometheus format. We used a forked version of Telegraf
with custom patches to be able to utilize a custom Openstack input plugin:

- `GitHub Telegraf Fork`_
- `Go SDK for OpenStack`_

The following automation scripts and configuration files were used to start
the Telegraf agent across the environment nodes:
`Telegraf deployment and configuration files`_

Below you can see which plugins were used to obtain metrics.

Standard Plugins
""""""""""""""""

.. code:: bash

    inputs.cpu
    inputs.disk
    inputs.diskio
    inputs.kernel
    inputs.mem
    inputs.processes
    inputs.swap
    inputs.system
    inputs.kernel_vmstat
    inputs.net
    inputs.netstat
    inputs.exec
Openstack input plugin
""""""""""""""""""""""

The custom `inputs.openstack` plugin was used to gather most of the required
Openstack-related metrics.

Settings:

.. code:: ini

    interval = '40s'
    identity_endpoint = "http://keystone.ccp.svc.cluster.local:5000/v3"
    domain = "default"
    project = "admin"
    username = "admin"
    password = "password"
`System.exec` plugin
""""""""""""""""""""

The `system.exec` plugin was used to trigger scripts that poll and calculate
all non-standard metrics.

Common settings:

.. code:: ini

    interval = "15s"
    timeout = "30s"
    data_format = "influx"

Commands:

.. code:: bash

    "/opt/telegraf/bin/list_openstack_processes.sh"
    "/opt/telegraf/bin/per_process_cpu_usage.sh"
    "/opt/telegraf/bin/numa_stat_per_pid.sh"
    "/opt/telegraf/bin/iostat_per_device.sh"
    "/opt/telegraf/bin/memory_bandwidth.sh"
    "/opt/telegraf/bin/network_tcp_queue.sh"
    "/opt/telegraf/bin/etcd_get_metrics.sh"
    "/opt/telegraf/bin/k8s_get_metrics.sh"
    "/opt/telegraf/bin/vmtime.sh"
    "/opt/telegraf/bin/osapitime.sh"

You can see the full Telegraf configuration file and its custom input scripts
in the `Telegraf deployment and configuration files`_ section.
Heka
^^^^
.. table::
+--------------------+-----------------------------------------+
|Software |Version |
+--------------------+-----------------------------------------+
|`Heka`_ |0.10.0 |
+--------------------+-----------------------------------------+

We chose Heka as the log collecting agent for its wide variety of inputs
(including the ability to feed data from the Docker socket), its filters
(custom shorthand sandbox filters in the Lua language) and its ability to
encode data for ElasticSearch.
With the Heka agent started across the environment servers we were able to
send container logs to the ElasticSearch server. With a custom Lua filter we
extracted K8S API data and converted it into an appropriate format to
visualize API timing counters (Average Response Time).
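
On each node the agent then runs against its generated configuration, along
the lines of the following (the path is an assumption; see the deployment
files referenced below):

.. code:: bash

    # Start the Heka daemon against the generated configuration directory.
    hekad -config=/etc/heka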

The Heka deployment scripts and the configuration file with the custom Lua
filter are in the `Heka deployment and configuration`_ section.
Applications
============
Kargo deployment script
-----------------------
deploy_k8s_using_kargo.sh
^^^^^^^^^^^^^^^^^^^^^^^^^
.. literalinclude:: configs/deploy_k8s_using_kargo.sh
:language: bash
CCP deployment and configuration files
--------------------------------------
deploy-ccp.sh
^^^^^^^^^^^^^
.. literalinclude:: configs/ccp/deploy-ccp.sh
:language: bash
ccp.yaml
^^^^^^^^
.. literalinclude:: configs/ccp/ccp.yaml
:language: yaml
configs.yaml
^^^^^^^^^^^^
.. literalinclude:: configs/ccp/configs.yaml
:language: yaml
topology.yaml
^^^^^^^^^^^^^
.. literalinclude:: configs/ccp/topology.yaml
:language: yaml
repos.yaml
^^^^^^^^^^
.. literalinclude:: configs/ccp/repos.yaml
:language: yaml
versions.yaml
^^^^^^^^^^^^^
.. literalinclude:: configs/ccp/versions.yaml
:language: yaml
Prometheus deployment and configuration files
---------------------------------------------
Deployment scripts
^^^^^^^^^^^^^^^^^^
deploy_prometheus.sh
""""""""""""""""""""
.. literalinclude:: configs/prometheus-grafana-telegraf/deploy_prometheus.sh
:language: bash
deploy-graf-prom.yaml
"""""""""""""""""""""
.. literalinclude:: configs/prometheus-grafana-telegraf/deploy-graf-prom.yaml
:language: yaml
docker_prometheus.yaml
""""""""""""""""""""""
.. literalinclude:: configs/prometheus-grafana-telegraf/docker_prometheus.yaml
:language: yaml
deploy_etcd_collect.sh
""""""""""""""""""""""
.. literalinclude:: configs/prometheus-grafana-telegraf/deploy_etcd_collect.sh
:language: bash
Configuration files
^^^^^^^^^^^^^^^^^^^
prometheus-kuber.yml.j2
"""""""""""""""""""""""
.. literalinclude:: configs/prometheus-grafana-telegraf/prometheus/prometheus-kuber.yml.j2
   :language: yaml
prometheus-system.yml.j2
""""""""""""""""""""""""
.. literalinclude:: configs/prometheus-grafana-telegraf/prometheus/prometheus-system.yml.j2
   :language: yaml
targets.yml.j2
""""""""""""""
.. literalinclude:: configs/prometheus-grafana-telegraf/prometheus/targets.yml.j2
   :language: yaml
Grafana dashboards configuration
--------------------------------
:download:`Systems_nodes_statistics.json <configs/dashboards/Systems_nodes_statistics.json>`
:download:`Kubernetes_statistics.json <configs/dashboards/Kubernetes_statistics.json>`
:download:`ETCD.json <configs/dashboards/ETCD.json>`
:download:`OpenStack.json <configs/dashboards/OpenStack.json>`
ElasticSearch deployment script
-------------------------------
deploy_elasticsearch_kibana.sh
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. literalinclude:: configs/elasticsearch-heka/deploy_elasticsearch_kibana.sh
:language: bash
Telegraf deployment and configuration files
-------------------------------------------
deploy_telegraf.sh
^^^^^^^^^^^^^^^^^^
.. literalinclude:: configs/prometheus-grafana-telegraf/deploy_telegraf.sh
:language: bash
deploy-telegraf.yaml
^^^^^^^^^^^^^^^^^^^^
.. literalinclude:: configs/prometheus-grafana-telegraf/deploy-telegraf.yaml
:language: yaml
Telegraf system
^^^^^^^^^^^^^^^
telegraf-sys.conf
"""""""""""""""""
.. literalinclude:: configs/prometheus-grafana-telegraf/telegraf/telegraf-sys.conf
:language: bash
Telegraf openstack
^^^^^^^^^^^^^^^^^^
telegraf-openstack.conf.j2
""""""""""""""""""""""""""
.. literalinclude:: configs/prometheus-grafana-telegraf/telegraf/telegraf-openstack.conf.j2
:language: bash
Telegraf inputs scripts
^^^^^^^^^^^^^^^^^^^^^^^
list_openstack_processes.sh
"""""""""""""""""""""""""""
.. literalinclude:: configs/prometheus-grafana-telegraf/telegraf/scripts/list_openstack_processes.sh
:language: bash
per_process_cpu_usage.sh
""""""""""""""""""""""""
.. literalinclude:: configs/prometheus-grafana-telegraf/telegraf/scripts/per_process_cpu_usage.sh
:language: bash
numa_stat_per_pid.sh
""""""""""""""""""""
.. literalinclude:: configs/prometheus-grafana-telegraf/telegraf/scripts/numa_stat_per_pid.sh
:language: bash
iostat_per_device.sh
""""""""""""""""""""
.. literalinclude:: configs/prometheus-grafana-telegraf/telegraf/scripts/iostat_per_device.sh
:language: bash
memory_bandwidth.sh
"""""""""""""""""""
.. literalinclude:: configs/prometheus-grafana-telegraf/telegraf/scripts/memory_bandwidth.sh
:language: bash
network_tcp_queue.sh
""""""""""""""""""""
.. literalinclude:: configs/prometheus-grafana-telegraf/telegraf/scripts/network_tcp_queue.sh
:language: bash
etcd_get_metrics.sh
"""""""""""""""""""
.. literalinclude:: configs/prometheus-grafana-telegraf/telegraf/scripts/etcd_get_metrics.sh
:language: bash
k8s_get_metrics.sh
""""""""""""""""""
.. literalinclude:: configs/prometheus-grafana-telegraf/telegraf/scripts/k8s_get_metrics.sh
:language: bash
vmtime.sh
"""""""""
.. literalinclude:: configs/prometheus-grafana-telegraf/telegraf/scripts/vmtime.sh
:language: bash
osapitime.sh
""""""""""""
.. literalinclude:: configs/prometheus-grafana-telegraf/telegraf/scripts/osapitime.sh
:language: bash
Heka deployment and configuration
---------------------------------
Deployment
^^^^^^^^^^
deploy_heka.sh
""""""""""""""
.. literalinclude:: configs/elasticsearch-heka/deploy_heka.sh
:language: bash
deploy-heka.yaml
""""""""""""""""
.. literalinclude:: configs/elasticsearch-heka/deploy-heka.yaml
:language: yaml
Configuration
^^^^^^^^^^^^^
00-hekad.toml.j2
""""""""""""""""
.. literalinclude:: configs/elasticsearch-heka/heka/00-hekad.toml.j2
   :language: ini
kubeapi_to_int.lua.j2
"""""""""""""""""""""
.. literalinclude:: configs/elasticsearch-heka/heka/kubeapi_to_int.lua.j2
   :language: lua
.. references:
.. _Fuel-CCP-Installer: https://github.com/openstack/fuel-ccp-installer
.. _Kargo: https://github.com/kubernetes-incubator/kargo.git
.. _Fuel-CCP: https://github.com/openstack/fuel-ccp
.. _Prometheus: https://prometheus.io/
.. _Prometheus GitHub: https://github.com/prometheus/prometheus
.. _Grafana: http://grafana.org/
.. _ElasticSearch: https://www.elastic.co/products/elasticsearch
.. _Kibana: https://www.elastic.co/products/kibana
.. _Telegraf: https://www.influxdata.com/time-series-platform/telegraf/
.. _GitHub Telegraf Fork: https://github.com/spjmurray/telegraf/tree/openstack_stats/plugins/inputs/openstack
.. _Go SDK for OpenStack: https://github.com/rackspace/gophercloud/
.. _Heka: https://hekad.readthedocs.io/en/v0.10.0/
.. _K8S Ingress Resources: http://kubernetes.io/docs/user-guide/ingress/