Fix Directory Check in Agent

Remove name, pattern and recursive from the directory check so that
it only checks the size of the target directory
(not file size)

Add a directory detection plugin to automatically generate the
directory.yaml file

Change-Id: I10fde0a4ed43f62c310045bccccb3a66e07542e1
Kaiyan Sheng 2016-02-23 17:25:43 -07:00
parent 3d1782d6b1
commit 9650aafe35
7 changed files with 122 additions and 64 deletions

View File

@ -73,10 +73,10 @@ Main:
check_freq: {args.check_frequency}
# Threshold value for warning on collection time of each check (in seconds)
sub_collection_warn: {args.sub_collection_warn}
sub_collection_warn: 5
# Collector restart interval (in hours)
collector_restart_interval: {args.collector_restart_interval}
collector_restart_interval: 24
# Change port the Agent is listening to
# listen_port: 17123

View File

@ -1,25 +1,22 @@
# (C) Copyright 2015 Hewlett Packard Enterprise Development Company LP
# (C) Copyright 2015-2016 Hewlett Packard Enterprise Development Company LP
init_config:
instances:
# This config is for the Directory Check which is used to report metrics
# for the files in a given directory
# for the size of a given directory
#
# NOTE: This check is NOT currently supported on Windows systems
#
# For each instance, the 'directory' parameter is required, all others are optional.
# For each instance, the 'directory' parameter is required
#
# WARNING: Ensure the user account running the Agent (typically dd-agent) has read
# access to the monitored directory and files.
# WARNING: Ensure the user account running the Agent (typically mon-agent)
# has read access to the monitored directory and files.
#
# Instances take the following parameters:
# "directory" - string, the directory to monitor. Required
# "name" - string, tag metrics with specified name. defaults to the "directory"
# "pattern" - string, the `fnmatch` pattern to use when reading the "directory"'s files. default "*"
# "recursive" - boolean, when true the stats will recurse into directories. default False
- directory: "/path/to/directory"
name: "tag_name"
pattern: "*.log"
recursive: True
- built_by: Directory
directory: /path/to/directory_1
- built_by: Directory
directory: /path/to/directory_2

View File

@ -20,6 +20,7 @@
- [Host Alive Checks](#host-alive-checks)
- [Process Checks](#process-checks)
- [File Size Checks](#file-size-checks)
- [Directory Checks](#directory-checks)
- [Http Endpoint Checks](#http-endpoint-checks)
- [Http Metrics](#http-metrics)
- [MySQL Checks](#mysql-checks)
@ -519,6 +520,7 @@ The host alive checks return the following metrics
Also in the case of an error the value_meta contains an error message.
## Process Checks
Process checks can be performed to both verify that a set of named processes are running on the local system and collect/send system level metrics on those processes. The YAML file `process.yaml` contains the list of processes that are checked.
@ -602,6 +604,30 @@ The file_size checks return the following metrics:
| file.size_bytes | file_name, directory_name, hostname, service |
## Directory Checks
This section describes the directory check that can be performed by the Agent. Directory checks gather the total size of all the files under a specific directory. A YAML file (directory.yaml) contains the list of directory names to check. A Python script (directory.py) checks each directory in turn to gather stats. Note: for sparse files, the directory check uses the resident size instead of the actual size.
Similar to other checks, the configuration is done in YAML and consists of two keys: init_config and instances. The former is not used by the directory check, while the latter contains one or more directory names to check. The directory check sums the size of all the files under the given directory recursively.
Sample config:
```
init_config: null
instances:
- built_by: Directory
directory: /var/log/monasca/agent
- built_by: Directory
directory: /etc/monasca/agent
```
The directory checks return the following metrics:
| Metric Name | Dimensions |
| ----------- | ---------- |
| directory.size_bytes | path, hostname, service |
| directory.files_count | path, hostname, service |
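
The recursive sum behind these two metrics can be illustrated with a minimal sketch. It assumes only the Python standard library; the function name `directory_size` is hypothetical and is not part of the Agent, whose real logic lives in directory.py.

```python
import os


def directory_size(directory_name):
    """Return (total_bytes, file_count) for every file under directory_name.

    Hypothetical helper for illustration only; not the Agent's API.
    """
    total_bytes = 0
    file_count = 0
    for root, _dirs, files in os.walk(directory_name):
        for filename in files:
            path = os.path.join(root, filename)
            try:
                file_stat = os.stat(path)
            except OSError:
                # Skip files that disappeared or cannot be stat'ed
                continue
            total_bytes += file_stat.st_size
            file_count += 1
    return total_bytes, file_count
```

In this sketch the first value corresponds to what `directory.size_bytes` would report and the second to `directory.files_count`.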
## Http Endpoint Checks
This section describes the http endpoint check that can be performed by the Agent. Http endpoint checks are checks that perform simple up/down checks on services, such as HTTP/REST APIs. An agent, given a list of URLs, can dispatch an http request and report to the API success/failure as a metric.

View File

@ -1,85 +1,76 @@
# (C) Copyright 2015 Hewlett Packard Enterprise Development Company LP
# (C) Copyright 2015-2016 Hewlett Packard Enterprise Development Company LP
from fnmatch import fnmatch
from monasca_agent.collector.checks import AgentCheck
from os.path import abspath
from os.path import exists
from os.path import join
from os import access
from os import stat
from os import walk
import time
from monasca_agent.collector.checks import AgentCheck
from os import X_OK
import logging
log = logging.getLogger(__name__)
class DirectoryCheck(AgentCheck):
"""This check is for monitoring and reporting metrics on the files for a provided directory
"""This check is for monitoring and reporting metrics on the provided directory
WARNING: the user/group that dd-agent runs as must have access to stat the files in the desired directory
WARNING: the user/group that mon-agent runs as must have access to stat
the files in the desired directory
Config options:
"directory" - string, the directory to gather stats for. required
"name" - string, the name to use when tagging the metrics. defaults to the "directory"
"pattern" - string, the `fnmatch` pattern to use when reading the "directory"'s files. default "*"
"recursive" - boolean, when true the stats will recurse into directories. default False
"""
def check(self, instance):
if "directory" not in instance:
raise Exception('DirectoryCheck: missing "directory" in config')
error_message = 'DirectoryCheck: missing "directory" in config'
log.error(error_message)
raise Exception(error_message)
directory = instance["directory"]
abs_directory = abspath(directory)
name = instance.get("name") or directory
pattern = instance.get("pattern") or "*"
recursive = instance.get("recursive") or False
if not exists(abs_directory):
raise Exception("DirectoryCheck: the directory (%s) does not exist" % abs_directory)
error_message = "DirectoryCheck: the directory (%s) does not " \
"exist" % abs_directory
log.error(error_message)
raise Exception(error_message)
dimensions = self._set_dimensions({"name": name}, instance)
self._get_stats(abs_directory, name, pattern, recursive, dimensions)
dimensions = self._set_dimensions({"path": directory}, instance)
self._get_stats(abs_directory, dimensions)
def _get_stats(self, directory, name, pattern, recursive, dimensions):
def _get_stats(self, directory_name, dimensions):
directory_bytes = 0
directory_files = 0
for root, dirs, files in walk(directory):
for root, dirs, files in walk(directory_name):
for directory in dirs:
directory_root = join(root, directory)
if not access(directory_root, X_OK):
log.warn("DirectoryCheck: could not access directory {}".
format(directory_root))
for filename in files:
# check if it passes our filter
if not fnmatch(filename, pattern):
continue
filename = join(root, filename)
try:
file_stat = stat(filename)
except OSError as ose:
self.warning("DirectoryCheck: could not stat file %s - %s" % (filename, ose))
log.warn("DirectoryCheck: could not stat file %s - %s" %
(filename, ose))
else:
directory_files += 1
directory_bytes += file_stat.st_size
# file specific metrics
self.histogram(
"system.disk.directory.file.bytes",
file_stat.st_size,
dimensions=dimensions)
self.histogram(
"system.disk.directory.file.modified_sec_ago",
time.time() -
file_stat.st_mtime,
dimensions=dimensions)
self.histogram(
"system.disk.directory.file.created_sec_ago",
time.time() -
file_stat.st_ctime,
dimensions=dimensions)
# os.walk gives us all sub-directories and their files
# if we do not want to do this recursively and just want
# the top level directory we gave it, then break
if not recursive:
break
# number of files
self.gauge("system.disk.directory.files", directory_files, dimensions=dimensions)
self.gauge("directory.files_count", directory_files,
dimensions=dimensions)
# total file size
self.gauge("system.disk.directory.bytes", directory_bytes, dimensions=dimensions)
self.gauge("directory.size_bytes", directory_bytes,
dimensions=dimensions)
log.debug("DirectoryCheck: Directory {0} size {1} bytes with {2} "
"files in it.".format(directory_name, directory_bytes,
directory_files))
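
The check above adds up `file_stat.st_size` for each file. For sparse files this value can differ from the space actually allocated on disk. The sketch below, which assumes a Unix-like system where `st_blocks` is available, shows how the two numbers can be compared. The helper names are hypothetical and are not part of the Agent.

```python
import os


def apparent_and_allocated(path):
    """Return (apparent_bytes, allocated_bytes) for a file.

    st_size is the logical length of the file. st_blocks counts
    512-byte blocks allocated on disk (Unix-like systems only).
    """
    file_stat = os.stat(path)
    return file_stat.st_size, file_stat.st_blocks * 512


def make_sparse_file(path, size):
    """Create a file of the given length without writing any data.

    Whether the result is really sparse depends on the filesystem.
    """
    with open(path, "wb") as handle:
        handle.truncate(size)
```

On a filesystem that supports holes, `apparent_and_allocated` typically returns an allocated size much smaller than the apparent size for a file created with `make_sparse_file`.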

View File

@ -0,0 +1,22 @@
# (C) Copyright 2016 Hewlett Packard Enterprise Development Company LP
import monasca_setup.detection
class Directory(monasca_setup.detection.ServicePlugin):
"""Setup configuration to monitor directory size.
directory_names example:
'directory_names': ['/path/to/directory_1',
'/path/to/directory_2',
...]
"""
def __init__(self, template_dir, overwrite=True, args=None):
service_params = {
'args': args,
'template_dir': template_dir,
'overwrite': overwrite,
'service_name': 'directory-service',
'directory_names': []}
super(Directory, self).__init__(service_params)

View File

@ -9,6 +9,7 @@ from plugin import Plugin
from monasca_setup import agent_config
from monasca_setup.detection.utils import find_process_cmdline
from monasca_setup.detection.utils import service_api_check
from monasca_setup.detection.utils import watch_directory
from monasca_setup.detection.utils import watch_file_size
from monasca_setup.detection.utils import watch_process
@ -27,8 +28,9 @@ class ServicePlugin(Plugin):
self.service_name = kwargs['service_name']
self.process_names = kwargs.get('process_names')
self.file_dirs_names = kwargs.get('file_dirs_names')
self.directory_names = kwargs.get('directory_names')
self.service_api_url = kwargs.get('service_api_url')
self.search_pattern = kwargs['search_pattern']
self.search_pattern = kwargs.get('search_pattern')
overwrite = kwargs['overwrite']
template_dir = kwargs['template_dir'],
if 'args' in kwargs:
@ -39,12 +41,13 @@ class ServicePlugin(Plugin):
# dict {'service_api_url':'url'}
args_dict = dict(
[item.split('=') for item in args.split()])
# Allow args to override all of these parameters
if 'process_names' in args_dict:
self.process_names = args_dict['process_names'].split(',')
if 'file_dirs_names' in args_dict:
self.file_dirs_names = args_dict['file_dirs_names']
if 'directory_names' in args_dict:
self.directory_names = args_dict['directory_names'].split(',')
if 'service_api_url' in args_dict:
self.service_api_url = args_dict['service_api_url']
if 'search_pattern' in args_dict:
@ -71,6 +74,8 @@ class ServicePlugin(Plugin):
self.available = True
if self.file_dirs_names:
self.available = True
if self.directory_names:
self.available = True
def build_config(self):
"""Build the config as a Plugins object and return.
@ -101,6 +106,12 @@ class ServicePlugin(Plugin):
config.merge(watch_file_size(file_dir, file_names,
file_recursive))
if self.directory_names:
for dir_name in self.directory_names:
log.info("\tMonitoring the size of directory {0}.".format(
dir_name))
config.merge(watch_directory(dir_name))
# Skip the http_check if disable_http_check is set
if self.args is not None and self.args.get('disable_http_check', False):
self.service_api_url = None
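
As a rough illustration of how the `directory_names` argument is parsed, the sketch below reuses the `dict([item.split('=') ...])` expression from the hunk above on a sample argument string. The function name `parse_detection_args` and the sample paths are assumptions made for the example.

```python
def parse_detection_args(args):
    """Split 'key=value key2=value2' into a dict.

    Mirrors the expression used in ServicePlugin; the function name
    itself is hypothetical.
    """
    return dict([item.split('=') for item in args.split()])


args_dict = parse_detection_args(
    "directory_names=/var/log/monasca/agent,/etc/monasca/agent")

directory_names = None
if 'directory_names' in args_dict:
    # Multiple directories are separated by commas
    directory_names = args_dict['directory_names'].split(',')
```

Because the string is split on whitespace and then on `=`, a path that contains a space or an equals sign cannot be passed this way.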

View File

@ -123,8 +123,19 @@ def watch_file_size(directory_name, file_names, file_recursive):
return config
def service_api_check(name, url, pattern, use_keystone=True,
service=None, component=None):
def watch_directory(directory_name):
"""Takes a directory name and returns a Plugins object with the config set.
"""
config = agent_config.Plugins()
parameters = {'directory': directory_name}
config['directory'] = {'init_config': None,
'instances': [parameters]}
return config
def service_api_check(name, url, pattern,
use_keystone=True, service=None, component=None):
"""Setup a service api to be watched by the http_check plugin.
"""
config = agent_config.Plugins()
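
The structure that `watch_directory` builds can be sketched with plain dictionaries standing in for `agent_config.Plugins`. The merge behaviour shown here (appending instances for a plugin that is already configured) is an assumption made for illustration, not a statement about the real `Plugins.merge`.

```python
def watch_directory_sketch(directory_name):
    """Build the same nested structure as watch_directory, using a plain dict."""
    parameters = {'directory': directory_name}
    return {'directory': {'init_config': None,
                          'instances': [parameters]}}


def merge_sketch(config, other):
    """Hypothetical merge: append instances when the plugin already exists."""
    for plugin, plugin_config in other.items():
        if plugin in config:
            config[plugin]['instances'].extend(plugin_config['instances'])
        else:
            config[plugin] = plugin_config
    return config


config = {}
for dir_name in ['/path/to/directory_1', '/path/to/directory_2']:
    merge_sketch(config, watch_directory_sketch(dir_name))
```

Under these assumptions, `config['directory']['instances']` ends up with one entry per directory, which matches the shape of the sample directory.yaml.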