bp tripleo-opstools-centralized-logging Change-Id: I4a26517732acb55f4e3a4f7347354116f257493b
Enable deployment of centralized logging
https://blueprints.launchpad.net/tripleo/+spec/tripleo-opstools-centralized-logging
TripleO should deploy an out-of-the-box centralized logging solution to serve the overcloud.
Problem Description
With a complex distributed system like OpenStack, identifying and diagnosing a problem may require tracking a transaction across many different systems and many different logfiles. In the absence of a centralized logging solution, this process is frustrating to both new and experienced operators and can make even simple problems hard to diagnose.
Proposed Change
We will deploy the Fluentd service in log-collecting mode as a composable service on all nodes in the overcloud stack when the environment is configured to do so. Each composable service will have its own Fluentd source configuration.
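As a rough illustration of what a per-service source configuration might contain, the sketch below renders one Fluentd `<source>` stanza per log file using the standard `tail` input plugin. It is only a sketch under stated assumptions: the `LogSource` model, the tags, the log paths, and the position-file directory are hypothetical and not taken from this spec, and the `<parse>` section follows Fluentd v1 syntax, which may differ from the Fluentd version actually deployed.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class LogSource:
    """One log file a composable service wants collected (hypothetical model)."""
    tag: str   # Fluentd tag attached to every record read from this file
    path: str  # log file to tail


def render_source(src: LogSource, pos_dir: str = "/var/cache/fluentd") -> str:
    """Render a Fluentd `tail` <source> stanza for a single log file.

    pos_dir is an assumed location for the position files that let
    Fluentd resume reading where it left off after a restart.
    """
    lines = [
        "<source>",
        "  @type tail",
        f"  path {src.path}",
        f"  pos_file {pos_dir}/{src.tag}.pos",
        f"  tag {src.tag}",
        "  <parse>",
        "    @type none",  # forward raw lines; parsing is left to the aggregator
        "  </parse>",
        "</source>",
    ]
    return "\n".join(lines)


def render_all(sources: Iterable[LogSource]) -> str:
    """Join the stanzas of all services into one configuration fragment."""
    return "\n\n".join(render_source(s) for s in sources)


# Hypothetical services and paths, for illustration only.
SOURCES = [
    LogSource(tag="openstack.nova.api", path="/var/log/nova/nova-api.log"),
    LogSource(tag="openstack.keystone", path="/var/log/keystone/keystone.log"),
]

if __name__ == "__main__":
    print(render_all(SOURCES))
```

Keeping one stanza per service mirrors the composable-service idea: a service that is not deployed on a node contributes no source configuration there.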
To receive these messages, we will deploy a centralized logging system running Kibana, Elasticsearch and Fluentd on dedicated nodes to provide log aggregation and analysis. This will be deployed, using composable roles, in a dedicated Heat stack that is separate from the overcloud stack.
We will also support sending messages to an external Fluentd instance not deployed by TripleO.
Summary of use cases
- Elasticsearch, Kibana and Fluentd log relay/transformer deployed in a dedicated Heat stack separate from the overcloud stack; Fluentd log collector deployed on each overcloud node
- ElasticSearch, Kibana and Fluentd log relay/transformer deployed in external infrastructure; Fluentd log collector deployed on each overcloud node
Alternatives
None
Security Impact
Data collected from the logs of OpenStack services can contain sensitive information:
- Communication between the fluentd agent and the log aggregator should be protected with SSL.
- Access to the Kibana UI must have at least basic HTTP authentication, and client access should be via SSL.
- ElasticSearch should only allow connections over localhost.
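The first point above, protecting collector-to-aggregator traffic with SSL, could look roughly like the following on the collector side. This is a minimal sketch that assumes Fluentd's built-in `forward` output with `transport tls` (Fluentd v1 syntax); the spec does not say which plugin or options the implementation uses. The host name and certificate path are hypothetical, and 24224 is simply Fluentd's default forward port.

```python
from typing import Optional


def render_forward_match(
    host: str,
    port: int = 24224,
    ca_cert_path: Optional[str] = None,
    pattern: str = "**",
) -> str:
    """Render a Fluentd <match> stanza that forwards records over TLS.

    host/port identify the log aggregator; ca_cert_path, if given, is the
    CA certificate used to verify the aggregator's certificate.
    """
    if not host:
        raise ValueError("aggregator host must not be empty")
    if not 0 < port < 65536:
        raise ValueError(f"invalid port: {port}")

    lines = [
        f"<match {pattern}>",
        "  @type forward",
        "  transport tls",             # encrypt traffic to the aggregator
        "  tls_verify_hostname true",  # reject certificates for other hosts
    ]
    if ca_cert_path is not None:
        lines.append(f"  tls_cert_path {ca_cert_path}")
    lines += [
        "  <server>",
        f"    host {host}",
        f"    port {port}",
        "  </server>",
        "</match>",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical aggregator address and CA certificate location.
    print(render_forward_match("logging.example.com",
                               ca_cert_path="/etc/fluentd/ca_cert.pem"))
```

The same stanza would serve both use cases in this spec, since a TripleO-deployed aggregator and an external Fluentd instance differ only in the host and certificate supplied.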
Other End User Impact
None
Performance Impact
Additional resources will be required for running Fluentd on overcloud nodes. Log traffic from the overcloud nodes to the log aggregator will consume some bandwidth.
Other Deployer Impact
- Fluentd will be deployed on all overcloud nodes.
- New parameters for configuring the Fluentd collector.
- New parameters for configuring the log aggregator (Fluentd, ElasticSearch, and Kibana).
Developer Impact
Support for the new node type should be implemented for tripleo-quickstart.
Implementation
Assignee(s)
- Martin Mágr <mmagr@redhat.com>
- Lars Kellogg-Stedman <lars@redhat.com>
Work Items
- puppet-tripleo profile for fluentd service
- tripleo-heat-templates composable role for Fluentd collector deployment
- tripleo-heat-templates composable role for Fluentd aggregator deployment
- tripleo-heat-templates composable role for ElasticSearch deployment
- tripleo-heat-templates composable role for Kibana deployment
- Support for logging node in tripleo-quickstart
Dependencies
- Puppet module for Fluentd: konstantin-fluentd [1]
- Puppet module for ElasticSearch: elasticsearch-elasticsearch [2]
- Puppet module for Kibana (tbd)
- CentOS Opstools SIG package repository
Testing
Fluentd client deployment will be tested by current TripleO CI as soon as the patch is merged. Because the centralized logging features will not be enabled by default, we may need to introduce specific tests for these features.
Documentation Impact
The process of creating the new node type and the new options will have to be documented.
References
[1] https://forge.puppet.com/srf/fluentd
[2] https://forge.puppet.com/elasticsearch/elasticsearch