process.py detection plugin that accepts JSON string or YAML config

JSON was used to support names that contain special characters.

New json argument type for monasca-setup (inspired by review from Alexis Lee, https://review.openstack.org/339023/).

Note: The json argument would help solve a problem with file_size.py (https://bugs.launchpad.net/bugs/1625966), but file_size.py would still need modifications to accept json kwargs in its __init__ method.

Change-Id: Id56a81d8f424be079a683d95c59c3e2a7d6b20d5
2016-09-23 14:46:59 -06:00
parent 1544d4cb49
commit 2e4913f4b6
6 changed files with 451 additions and 23 deletions
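
The motivation given in the commit message can be illustrated with a small sketch. It assumes, as a simplification, that a `-a` style argument string is split on whitespace and `=` into a dictionary; `parse_kv_args` is a hypothetical stand-in for that behaviour, not the actual monasca-setup code. Under that assumption a value containing a space breaks the key=value form, while the same value is unambiguous inside a JSON string.

```python
import json


def parse_kv_args(arg_string):
    """Hypothetical stand-in: turn 'a=1 b=2' into {'a': '1', 'b': '2'}."""
    return dict(item.split('=') for item in arg_string.split())


# Simple values work with the key=value form.
simple = parse_kv_args("hostname=ping.me type=ping")

# A value containing a space is split into a fragment without '=',
# so building the dictionary fails.
try:
    parse_kv_args("process_names=my process")
    kv_failed = False
except ValueError:
    kv_failed = True

# The same value is unambiguous inside a JSON string.
parsed = json.loads('{"process_config": [{"process_names": ["my process", "a=b"]}]}')
names = parsed["process_config"][0]["process_names"]
```

The process names used here are made up for illustration only.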
--- a/docs/Agent.md
+++ b/docs/Agent.md
@@ -107,6 +107,7 @@ All parameters require a '--' before the parameter such as '--verbose'. Run `mon
 | skip_detection_plugins | Skip provided space separated list of detection plugins. | system |
 | overwrite | This is an optional parameter to overwrite the plugin configuration.  Use this if you don't want to keep the original configuration.  If this parameter is not specified, the configuration will be appended to the existing configuration, possibly creating duplicate checks.  **NOTE:** The agent config file, agent.yaml, will always be overwritten, even if this parameter is not specified. | |
 | detection_args | Some detection plugins can be passed arguments. This is a string that will be passed to the detection plugins. | "hostname=ping.me" |
 | detection_args_json | A JSON string can be passed to the detection plugin. | '{"process_config":[{"process_names":["monasca-api","monasca-notification"],"dimensions":{"service":"monitoring"}}]}' |
 | max_measurement_buffer_size | Integer value for the maximum number of measurements to buffer locally while unable to connect to the monasca-api. If the queue exceeds this value, measurements will be dropped in batches. A value of '-1' indicates no limit | 100000 |
 | backlog_send_rate | Integer value of how many batches of buffered measurements to send each time the forwarder flushes data | 1000 |
@@ -252,4 +253,4 @@ If there is some problem with multiple plugins that end up blocking the entire t
 Some of the plugins have their own thread pools to handle asynchronous checks. The collector thread pool is separate and has no special interaction with those thread pools.
 # License
-(C) Copyright 2015 Hewlett Packard Enterprise Development Company LP
+(C) Copyright 2015-2016 Hewlett Packard Enterprise Development LP
--- a/docs/Plugins.md
+++ b/docs/Plugins.md
@@ -321,6 +321,7 @@ These are the detection plugins included with the Monasca Agent.  See [Customiza
 | ovsvapp | ServicePlugin |
 | postfix | Plugin |
 | powerdns | Plugin |
 | process | Plugin |
 | rabbitmq | Plugin |
 | supervisord | Plugin |
 | swift | ServicePlugin |
@@ -1319,24 +1320,89 @@ Each process entry consists of one primary key: name. Either search_string or us
 To obtain process metrics beyond process.pid_count, which only shows that the process is up and running, the configuration option detailed must be set to true.
 Sample monasca-setup:
 Monitor by process_names:
 ```
-init_config:
+monasca-setup -d ProcessCheck -json \
-
+         '{"process_config":[{"process_names":["monasca-notification","monasca-api"],"dimensions":{"service":"monitoring"}}]}'
 ```
 Monitor by process_username:
 ```
 monasca-setup -d ProcessCheck -json \
         '{"process_config":[{"process_username":"dbadmin","dimensions":{"service":"monitoring","component":"vertica"}}]}'
 ```
 Multiple entries in one call:
 ```
 monasca-setup -d ProcessCheck -json \
         '{"process_config":[{"process_names":["monasca-notification","monasca-api"],"dimensions":{"service":"monitoring"}},
                             {"process_names":["elasticsearch"],"dimensions":{"service":"logging"}},
                             {"process_username":"dbadmin","dimensions":{"service":"monitoring","component":"vertica"}}]}'
 ```
 Using a yaml config file:
 ```
 monasca-setup -d ProcessCheck -a "conf_file_path=/home/stack/myprocess.yaml"
 ```
 Example yaml input file format for process check by process names:
 ```
 ---
 process_config:
 - process_names:
  - monasca-notification
  - monasca-api
  dimensions:
    service: monitoring
 ```
 Example yaml input file format for multiple process_names entries:
 ```
 ---
 process_config:
 - process_names:
  - monasca-notification
  - monasca-api
  dimensions:
    service: monitoring
 - process_names:
  - elasticsearch
  dimensions:
    service: logging
 - process_names:
  - monasca-thresh
  exact_match: 'true'
  dimensions:
    service: monitoring
    component: thresh
 ```
 Sample successfully built process.yaml:
 ```
 init_config: null
 instances:
- - name: ssh
+- built_by: ProcessCheck
-   search_string: ['ssh', 'sshd']
+  detailed: true
  dimensions:
    component: monasca-api
    service: monitoring
  exact_match: false
  name: monasca-api
  search_string:
  - monasca-api
- - name: mysql
+- built_by: ProcessCheck
-   search_string: ['mysql']
+  detailed: true
-   exact_match: True
+  dimensions:
    component: monasca-notification
    service: monitoring
  exact_match: false
  name: monasca-notification
  search_string:
  - monasca-notification
- - name: kafka
+- built_by: ProcessCheck
-   search_string: ['kafka']
+  detailed: true
-   detailed: true
+  dimensions:
-
+    component: vertica
- - name: monasca_agent
+    service: monitoring
-   username: mon-agent
+  name: vertica
-   detailed: true
+  username: dbadmin
 ```
 The process checks return the following metrics (if detailed is set to true; otherwise only process.pid_count is returned):
--- a/monasca_setup/detection/plugins/process.py
+++ b/monasca_setup/detection/plugins/process.py
@@ -0,0 +1,183 @@
 # (C) Copyright 2016 Hewlett Packard Enterprise Development LP
 import json
 import logging
 import yaml
 import monasca_setup.agent_config
 import monasca_setup.detection
 from monasca_setup.detection.utils import find_process_cmdline
 log = logging.getLogger(__name__)
 class ProcessCheck(monasca_setup.detection.Plugin):
    """Setup a process check according to the passed in JSON string or YAML config file path.
       A process can be monitored by process_names or by process_username, or by both if
       the process_config list contains both dictionary entries. Pass in the dictionary containing process_names
       when watching process by name.  Pass in the dictionary containing process_username and dimensions with
       component when watching process by username. Watching by process_username is useful for groups of processes
       that are owned by a specific user. For process monitoring by process_username the component dimension
       is required since it is used to initialize the instance name in process.yaml.
       service and component dimensions are recommended to distinguish multiple components per service.  When
       monitoring by process_names, the component dimension defaults to the process name if it is not supplied.
       exact_match is optional and defaults to false, meaning the search string may appear anywhere within the process name.
       Set exact_match to true if the process_names search string must match the process name exactly.
       Pass in a YAML config file path:
       monasca-setup -d ProcessCheck -a "conf_file_path=/home/stack/myprocess.yaml"
       or
       Pass in a JSON string command line argument:
       Using monasca-setup, you can pass in a json string with arguments --detection_args_json, or the shortcut -json.
       Monitor by process_names:
       monasca-setup -d ProcessCheck -json \
         '{"process_config":[{"process_names":["monasca-notification","monasca-api"],"dimensions":{"service":"monitoring"}}]}'
       Specify one or more dictionary entries in the process_config list:
       monasca-setup -d ProcessCheck -json \
         '{"process_config":[
            {"process_names":["monasca-notification","monasca-api"],"dimensions":{"service":"monitoring"}},
            {"process_names":["elasticsearch"],"dimensions":{"service":"logging"},"exact_match":"true"},
            {"process_names":["monasca-thresh"],"dimensions":{"service":"monitoring","component":"thresh"}}]}'
       Monitor by process_username:
       monasca-setup -d ProcessCheck -json \
         '{"process_config":[{"process_username":"dbadmin","dimensions":{"service":"monitoring","component":"vertica"}}]}'
       Can specify monitoring by both process_username and process_names:
       monasca-setup -d ProcessCheck -json \
         '{"process_config":[{"process_names":["monasca-api"],"dimensions":{"service":"monitoring"}},
                             {"process_username":"mon-api","dimensions":{"service":"monitoring","component":"monasca-api"}}]}'
    """
    def __init__(self, template_dir, overwrite=False, args=None, **kwargs):
        self.process_config = []
        self.valid_process_names = []
        self.valid_usernames = []
        if 'process_config' in kwargs:
            self.process_config = kwargs['process_config']
        super(ProcessCheck, self).__init__(template_dir, overwrite, args)
    def _get_config(self):
        self.conf_file_path = None
        if self.args:
            self.conf_file_path = self.args.get('conf_file_path', None)
        if self.conf_file_path:
            self._read_config(self.conf_file_path)
    def _read_config(self, config_file):
        log.info("\tUsing parameters from config file: {}".format(config_file))
        with open(config_file) as data_file:
            try:
                data = yaml.safe_load(data_file)
                if isinstance(data, dict) and 'process_config' in data:
                    self.process_config = data['process_config']
                else:
                    log.error("\tInvalid format yaml file, missing key: process_config")
            except yaml.YAMLError as e:
                exception_msg = ("Could not read config file. Invalid yaml format detected. {0}.".format(e))
                raise Exception(exception_msg)
    def _detect(self):
        """Run detection, set self.available True if the service is detected.
        """
        self._get_config()
        for process_item in self.process_config:
            if 'dimensions' not in process_item:
                process_item['dimensions'] = {}
            if 'process_names' in process_item:
                found_process_names = []
                not_found_process_names = []
                for process_name in process_item['process_names']:
                    if find_process_cmdline(process_name) is not None:
                        found_process_names.append(process_name)
                    else:
                        not_found_process_names.append(process_name)
                # monitoring by process_names
                if not_found_process_names:
                    log.info("\tDid not discover process_name(s): {0}.".format(",".join(not_found_process_names)))
                if found_process_names:
                    process_item['found_process_names'] = found_process_names
                    if 'exact_match' in process_item:
                        if isinstance(process_item['exact_match'], basestring):
                            process_item['exact_match'] = (process_item['exact_match'].lower() == 'true')
                    else:
                        process_item['exact_match'] = False
                    self.valid_process_names.append(process_item)
            if 'process_username' in process_item:
                if 'component' in process_item['dimensions']:
                    self.valid_usernames.append(process_item)
                else:
                    log.error("\tMissing required component dimension, when monitoring by "
                              "process_username: {}".format(process_item['process_username']))
        if self.valid_process_names or self.valid_usernames:
            self.available = True
    def _monitor_by_process_name(self, process_name, exact_match=False, detailed=True, dimensions=None):
        config = monasca_setup.agent_config.Plugins()
        instance = {'name': process_name,
                    'detailed': detailed,
                    'exact_match': exact_match,
                    'search_string': [process_name],
                    'dimensions': {}}
        # default component to process name if not given
        if dimensions:
            instance['dimensions'].update(dimensions)
            if 'component' not in dimensions:
                instance['dimensions']['component'] = process_name
        else:
            instance['dimensions']['component'] = process_name
        config['process'] = {'init_config': None, 'instances': [instance]}
        return config
    def _monitor_by_process_username(self, process_username, detailed=True, dimensions=None):
        config = monasca_setup.agent_config.Plugins()
        instance = {'name': dimensions['component'],
                    'detailed': detailed,
                    'username': process_username,
                    'dimensions': {}}
        if dimensions:
            instance['dimensions'].update(dimensions)
        config['process'] = {'init_config': None, 'instances': [instance]}
        return config
    def build_config(self):
        """Build the config as a Plugins object and return.
        """
        config = monasca_setup.agent_config.Plugins()
        # Watch by process_names
        for process in self.valid_process_names:
            log.info("\tMonitoring by process_name(s): {0} "
                     "for dimensions: {1}.".format(",".join(process['found_process_names']),
                                                   json.dumps(process['dimensions'])))
            for process_name in process['found_process_names']:
                config.merge(self._monitor_by_process_name(process_name=process_name,
                                                           dimensions=process['dimensions'],
                                                           exact_match=process['exact_match']))
        # Watch by process_username
        for process in self.valid_usernames:
            log.info("\tMonitoring by process_username: {0} "
                     "for dimensions: {1}.".format(process['process_username'], json.dumps(process['dimensions'])))
            config.merge(self._monitor_by_process_username(process_username=process['process_username'],
                                                           dimensions=process['dimensions']))
        return config
    def dependencies_installed(self):
        """Return True if dependencies are installed.
        """
        return True
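
As a reading aid, the mapping from one `process_config` entry to `process.yaml` instances can be sketched in isolation. This is a simplified Python 3 sketch, not the plugin itself: `normalize_exact_match` and `build_instances` are hypothetical names, the lookup of running processes is skipped, and a missing component raises an error here instead of being logged.

```python
def normalize_exact_match(value):
    """Accept a bool or a string such as 'true'/'True' and return a bool."""
    if isinstance(value, str):
        return value.lower() == 'true'
    return bool(value)


def build_instances(entry):
    """Build process.yaml style instances from one process_config entry."""
    dimensions = entry.get('dimensions', {})
    exact_match = normalize_exact_match(entry.get('exact_match', False))
    instances = []
    # One instance per process name; component defaults to the name.
    for name in entry.get('process_names', []):
        dims = dict(dimensions)
        dims.setdefault('component', name)
        instances.append({'name': name,
                          'detailed': True,
                          'exact_match': exact_match,
                          'search_string': [name],
                          'dimensions': dims})
    # Monitoring by username needs a component, which becomes the instance name.
    if 'process_username' in entry:
        if 'component' not in dimensions:
            raise ValueError('component dimension is required with process_username')
        instances.append({'name': dimensions['component'],
                          'detailed': True,
                          'username': entry['process_username'],
                          'dimensions': dict(dimensions)})
    return instances


by_name = build_instances({'process_names': ['monasca-api'],
                           'dimensions': {'service': 'monitoring'},
                           'exact_match': 'true'})
by_user = build_instances({'process_username': 'dbadmin',
                           'dimensions': {'service': 'monitoring',
                                          'component': 'vertica'}})
```

Copying the dimensions per instance keeps the default component of one process name from leaking into the next.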
--- a/monasca_setup/detection/utils.py
+++ b/monasca_setup/detection/utils.py
@@ -1,4 +1,4 @@
-# (C) Copyright 2015-2016 Hewlett Packard Enterprise Development Company LP
+# (C) Copyright 2015-2016 Hewlett Packard Enterprise Development LP
 """ Util functions to assist in detection.
 """
@@ -39,7 +39,8 @@ def find_process_cmdline(search_string):
    """
    for process in psutil.process_iter():
        try:
-            if search_string in ' '.join(process.cmdline()):
+            if (search_string in ' '.join(process.cmdline()) and
               'monasca-setup' not in ' '.join(process.cmdline())):
                return process
        except psutil.NoSuchProcess:
            continue
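
The extra condition presumably guards against a false positive: when process names are passed on the command line, the running `monasca-setup` invocation contains the search string in its own arguments. The sketch below illustrates that effect without psutil; `FakeProcess`, `find_in_cmdlines`, and the paths are hypothetical and only model `cmdline()`.

```python
class FakeProcess(object):
    """Hypothetical stand-in for a psutil process; only cmdline() is modelled."""

    def __init__(self, argv):
        self._argv = argv

    def cmdline(self):
        return self._argv


def find_in_cmdlines(search_string, processes):
    """Return the first process whose command line contains search_string,
    skipping monasca-setup itself."""
    for process in processes:
        cmdline = ' '.join(process.cmdline())
        if search_string in cmdline and 'monasca-setup' not in cmdline:
            return process
    return None


# Only the setup command is running; its JSON argument mentions monasca-api.
setup_only = [FakeProcess(['monasca-setup', '-d', 'ProcessCheck', '-json',
                           '{"process_config":[{"process_names":["monasca-api"]}]}'])]
# The real service is running as well.
with_service = setup_only + [FakeProcess(['/usr/bin/python',
                                          '/opt/monasca/bin/monasca-api'])]

no_match = find_in_cmdlines('monasca-api', setup_only)
match = find_in_cmdlines('monasca-api', with_service)
```

Joining the command line once per process also avoids building the same string twice.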
--- a/monasca_setup/main.py
+++ b/monasca_setup/main.py
@@ -6,6 +6,7 @@
 import argparse
 from glob import glob
 import json
 import logging
 import os
 import pwd
@@ -73,6 +74,7 @@ def main(argv=None):
    else:
        # Run detection for all the plugins, halting on any failures if plugins were specified in the arguments
        detected_config = plugin_detection(plugins, args.template_dir, args.detection_args,
                                           args.detection_args_json,
                                           skip_failed=(args.detection_plugins is None))
        if detected_config is None:
            return 1  # Indicates detection problem, skip remaining steps and give non-zero exit code
@@ -223,8 +225,11 @@ def parse_arguments(parser):
                             "This assumes the base config has already run.")
    parser.add_argument('--skip_detection_plugins', nargs='*',
                        help="Skip detection for all plugins in this space separated list.")
-    parser.add_argument('-a', '--detection_args', help="A string of arguments that will be passed to detection " +
+    detection_args_group = parser.add_mutually_exclusive_group()
-                                                       "plugins. Only certain detection plugins use arguments.")
+    detection_args_group.add_argument('-a', '--detection_args', help="A string of arguments that will be passed to detection " +
                                      "plugins. Only certain detection plugins use arguments.")
    detection_args_group.add_argument('-json', '--detection_args_json',
                                      help="A JSON string that will be passed to detection plugins that parse JSON.")
    parser.add_argument('--check_frequency', help="How often to run metric collection in seconds",
                        type=validate_positive, default=30)
    parser.add_argument('--num_collector_threads', help="Number of Threads to use in Collector " +
@@ -283,7 +288,7 @@ def parse_arguments(parser):
    return parser.parse_args()
-def plugin_detection(plugins, template_dir, detection_args, skip_failed=True, remove=False):
+def plugin_detection(plugins, template_dir, detection_args, detection_args_json, skip_failed=True, remove=False):
    """Runs the detection step for each plugin in the list and returns the complete detected agent config.
    :param plugins: A list of detection plugin classes
    :param template_dir: Location of plugin configuration templates
@@ -292,9 +297,14 @@ def plugin_detection(plugins, template_dir, detection_args, skip_failed=True, re
    :return: An agent_config instance representing the total configuration from all detection plugins run.
    """
    plugin_config = agent_config.Plugins()
    if detection_args_json:
        json_data = json.loads(detection_args_json)
    for detect_class in plugins:
        # todo add option to install dependencies
-        detect = detect_class(template_dir, False, detection_args)
+        if detection_args_json:
            detect = detect_class(template_dir, False, **json_data)
        else:
            detect = detect_class(template_dir, False, detection_args)
        if detect.available:
            new_config = detect.build_config_with_name()
            if not remove:
@@ -321,9 +331,9 @@ def remove_config(args, plugin_names):
    detected_plugins = utils.discover_plugins(CUSTOM_PLUGIN_PATH)
    plugins = utils.select_plugins(args.detection_plugins, detected_plugins)
-    if args.detection_args is not None:
+    if (args.detection_args or args.detection_args_json):
        detected_config = plugin_detection(
-            plugins, args.template_dir, args.detection_args,
+            plugins, args.template_dir, args.detection_args, args.detection_args_json,
            skip_failed=(args.detection_plugins is None), remove=True)
    for file_path in existing_config_files:
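
The flow added to main.py (a mutually exclusive argument group, `json.loads`, then keyword expansion into the plugin constructor) can be tried in isolation with the standard library. `ExamplePlugin`, `build_parser`, `make_plugin`, and the `example-setup` program name are hypothetical; only the option names are taken from the diff.

```python
import argparse
import json


class ExamplePlugin(object):
    """Hypothetical plugin: top-level JSON keys arrive as keyword arguments."""

    def __init__(self, template_dir, overwrite=False, args=None, **kwargs):
        self.args = args
        self.process_config = kwargs.get('process_config', [])


def build_parser():
    parser = argparse.ArgumentParser(prog='example-setup')
    # Only one of the two argument styles may be given per invocation.
    group = parser.add_mutually_exclusive_group()
    group.add_argument('-a', '--detection_args')
    group.add_argument('-json', '--detection_args_json')
    return parser


def make_plugin(argv):
    ns = build_parser().parse_args(argv)
    if ns.detection_args_json:
        # Each top-level JSON key becomes a keyword argument.
        return ExamplePlugin('templates', False, **json.loads(ns.detection_args_json))
    return ExamplePlugin('templates', False, ns.detection_args)


json_plugin = make_plugin(['-json',
                           '{"process_config": [{"process_names": ["monasca-api"]}]}'])
string_plugin = make_plugin(['-a', 'hostname=ping.me'])
```

One consequence of the keyword expansion is that a plugin whose `__init__` does not accept `**kwargs` would raise a `TypeError` when given a JSON argument, which matches the commit message's note about file_size.py needing changes.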
--- a/tests/detection/test_process_check.py
+++ b/tests/detection/test_process_check.py
@@ -0,0 +1,167 @@
 # (C) Copyright 2016 Hewlett Packard Enterprise Development LP
 import contextlib
 import logging
 import os
 import psutil
 import tempfile
 import unittest
 from mock import patch
 from monasca_setup.detection.plugins.process import ProcessCheck
 LOG = logging.getLogger('monasca_setup.detection.plugins.process')
 class PSUtilGetProc(object):
    cmdLine = ['monasca-api']
    def as_dict(self):
        return {'name': 'monasca-api',
                'cmdline': PSUtilGetProc.cmdLine}
    def cmdline(self):
        return self.cmdLine
 class TestProcessCheck(unittest.TestCase):
    def setUp(self):
        unittest.TestCase.setUp(self)
        with patch.object(ProcessCheck, '_detect') as mock_detect:
            self.proc_plugin = ProcessCheck('temp_dir')
            self.assertTrue(mock_detect.called)
    def _detect(self,
                proc_plugin,
                config_is_file=False,
                by_process_name=True):
        proc_plugin.available = False
        psutil_mock = PSUtilGetProc()
        process_iter_patch = patch.object(psutil, 'process_iter',
                                          return_value=[psutil_mock])
        isfile_patch = patch.object(os.path, 'isfile',
                                    return_value=config_is_file)
        with contextlib.nested(process_iter_patch,
                               isfile_patch) as (
                mock_process_iter, mock_isfile):
            proc_plugin._detect()
            if by_process_name:
                self.assertTrue(mock_process_iter.called)
            self.assertFalse(mock_isfile.called)
    def test_detect_process_not_found(self):
        PSUtilGetProc.cmdLine = []
        self.proc_plugin.process_config = [{'process_names': ['monasca-api'], 'dimensions': {'service': 'monitoring'}}]
        self._detect(self.proc_plugin)
        self.assertFalse(self.proc_plugin.available)
    def test_detect_process_found(self):
        self.proc_plugin.process_config = [{'process_names': ['monasca-api'], 'dimensions': {'service': 'monitoring'}}]
        self._detect(self.proc_plugin)
        self.assertTrue(self.proc_plugin.available)
    def test_missing_arg(self):
        # monitor by process_username requires component
        self.proc_plugin.process_config = [{'process_username': 'dbadmin', 'dimensions': {'service': 'monitoring'}}]
        self._detect(self.proc_plugin, by_process_name=False)
        self.assertFalse(self.proc_plugin.available)
    def test_detect_build_config_process_name(self):
        self.proc_plugin.process_config = [{'process_names': ['monasca-api'], 'dimensions': {'service': 'monitoring'}}]
        self._detect(self.proc_plugin)
        result = self.proc_plugin.build_config()
        self.assertEqual(result['process']['instances'][0]['name'],
                         'monasca-api')
        self.assertEqual(result['process']['instances'][0]['detailed'],
                         True)
        self.assertEqual(result['process']['instances'][0]['exact_match'],
                         False)
        self.assertEqual(result['process']['instances'][0]['dimensions']['service'],
                         'monitoring')
        self.assertEqual(result['process']['instances'][0]['dimensions']['component'],
                         'monasca-api')
        self.assertEqual(result['process']['instances'][0]['search_string'][0],
                         'monasca-api')
    def test_detect_build_config_process_name_exact_match_true(self):
        self.proc_plugin.process_config = [
            {'process_names': ['monasca-api'], 'dimensions': {'service': 'monitoring'}, 'exact_match': True}]
        self._detect(self.proc_plugin)
        result = self.proc_plugin.build_config()
        self.assertEqual(result['process']['instances'][0]['name'],
                         'monasca-api')
        self.assertEqual(result['process']['instances'][0]['detailed'],
                         True)
        self.assertEqual(result['process']['instances'][0]['exact_match'],
                         True)
        self.assertEqual(result['process']['instances'][0]['dimensions']['service'],
                         'monitoring')
        self.assertEqual(result['process']['instances'][0]['dimensions']['component'],
                         'monasca-api')
        self.assertEqual(result['process']['instances'][0]['search_string'][0],
                         'monasca-api')
    def test_build_config_process_names(self):
        self.proc_plugin.valid_process_names = [
            {'process_names': ['monasca-api'],
             'dimensions': {'service': 'monitoring'},
             'found_process_names': ['monasca-api'],
             'exact_match': False},
            {'process_names': ['monasca-thresh'],
             'dimensions': {'service': 'monitoring'},
             'found_process_names': ['monasca-thresh'],
             'exact_match': False}]
        result = self.proc_plugin.build_config()
        self.assertEqual(result['process']['instances'][0]['name'],
                         'monasca-api')
        self.assertEqual(result['process']['instances'][0]['detailed'],
                         True)
        self.assertEqual(result['process']['instances'][0]['exact_match'],
                         False)
        self.assertEqual(result['process']['instances'][0]['dimensions']['service'],
                         'monitoring')
        self.assertEqual(result['process']['instances'][0]['dimensions']['component'],
                         'monasca-api')
        self.assertEqual(result['process']['instances'][0]['search_string'][0],
                         'monasca-api')
        self.assertEqual(result['process']['instances'][1]['name'],
                         'monasca-thresh')
        self.assertEqual(result['process']['instances'][1]['dimensions']['component'],
                         'monasca-thresh')
    def test_detect_build_config_process_username(self):
        self.proc_plugin.process_config = \
            [{'process_username': 'dbadmin', 'dimensions': {'service': 'monitoring', 'component': 'vertica'}}]
        self.proc_plugin._detect()
        result = self.proc_plugin.build_config()
        self.assertEqual(result['process']['instances'][0]['name'],
                         'vertica')
        self.assertEqual(result['process']['instances'][0]['detailed'],
                         True)
        self.assertEqual(result['process']['instances'][0]['dimensions']['service'],
                         'monitoring')
        self.assertEqual(result['process']['instances'][0]['dimensions']['component'],
                         'vertica')
    def test_input_yaml_file(self):
        # note: The previous tests will cover all yaml data variations, since the data is translated into a single dictionary.
        fd, temp_path = tempfile.mkstemp(suffix='.yaml')
        os.write(fd, '---\nprocess_config:\n- process_username: dbadmin\n  dimensions:\n    '
                 'service: monitoring\n    component: vertica\n')
        self.proc_plugin.args = {'conf_file_path': temp_path}
        self.proc_plugin._detect()
        result = self.proc_plugin.build_config()
        self.assertEqual(result['process']['instances'][0]['name'],
                         'vertica')
        self.assertEqual(result['process']['instances'][0]['detailed'],
                         True)
        self.assertEqual(result['process']['instances'][0]['dimensions']['service'],
                         'monitoring')
        self.assertEqual(result['process']['instances'][0]['dimensions']['component'],
                         'vertica')
        os.close(fd)
        os.remove(temp_path)