The mock third party library was needed for mock support in py2
runtimes. Since we now only support py36 and later, we can use the
standard lib unittest.mock module instead.
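In practice the change is just the import, e.g.:
    from unittest import mock  # instead of: import mock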
Change-Id: Ia3f2c8abc87cf5551d3469d616790e8e9d567bce
The json_plugin tests read the localhost name differently from how
the hostname is fetched in the Monasca Agent check. This can
cause these tests to fail since the hostnames may differ. For example,
only one may contain .home as a suffix. In this change we avoid reading
the hostname from the system running the tests.
Story: 2007743
Task: 39921
Change-Id: Ifddba6aa9350f722a741a28a152ed9bc3e0b7da6
When running with Py3 we compare a byte string to a unicode string
when parsing StatsD metrics. This patch adds some unit tests to
reproduce the bug and decodes the bytestring to make the existing
comparisons valid under Py3. When backporting to Train we can use
Oslo encodeutils. Clearly we could have more unit tests, but
this makes a start.
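A minimal sketch of the decode step, assuming a counter packet read from
the UDP socket (metric name and value are illustrative):
    # Under Py3 the socket payload is bytes; decode it before any
    # comparisons against str literals in the parser.
    packet = b'web.requests:1|c'
    metric = packet.decode('utf-8')
    name, rest = metric.split(':', 1)
    value, metric_type = rest.split('|', 1)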
Change-Id: I6341f96f5c186428d2d829cabf618a6f84f40ce2
Story: 2007684
Task: 39796
The remove_config() function only removes an exact match for
the configured instance.
This change will allow removing plugin configuration when the full
config is not known.
Use case: A compute node has been removed, so any host_alive
ping checks that are configured for it should be removed. But at
the time of removal the list of target_hostnames to match is
not known.
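A hedged sketch of how this might be driven from monasca-setup once the
change is in place (plugin name, arguments and the -r/--remove flag are
shown as I understand them; treat them as illustrative):
    monasca-setup -d hostalive -a 'hostname=compute-01 type=ping' -r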
Change-Id: I8050e1eed68d7b64f7a968b061afa69fe2e86d72
Story: 2004539
Task: 28287
Now that we no longer support py27, we can use the standard library
unittest.mock module instead of the third party mock lib.
Change-Id: I1dca4b2c7eccf1b19482dde60b88a132935b48b8
Signed-off-by: Sean McGinnis <sean.mcginnis@gmail.com>
Since the Luminous release of Ceph, the plugin no longer exports metrics
such as object storage daemon stats, placement groups and pool stats.
Check for the installed version of the Ceph command and parse results
according to version.
Include test data for Jewel and Luminous Ceph clusters.
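A rough sketch of the kind of version probe involved; the check's actual
implementation may differ:
    import re
    import subprocess

    # e.g. "ceph version 12.2.13 (...) luminous (stable)"
    out = subprocess.check_output(['ceph', 'version']).decode('utf-8')
    match = re.search(r'ceph version (\d+)\.(\d+)', out)
    # Jewel is 10.x, Luminous is 12.x
    is_luminous_or_newer = bool(match) and int(match.group(1)) >= 12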
Story: 2005032
Task: 29515
Change-Id: I0aef0db25f49545c715b07880edd57135e3beafe
Co-Authored-By: Bharat Kunwar <bharat@stackhpc.com>
Co-Authored-By: Doug Szumski <doug@stackhpc.com>
Currently we don't have any capability to monitor internal TLS/SSL
certificates, i.e. SSL certificates used by MySQL for replication, RabbitMQ for
distribution, etc. The cert_check plugin is not adequate for this purpose
because it can only check certificates exposed over HTTPS endpoints. Furthermore,
checking these internal certificates over the network is cumbersome
because the agent plugin would have to speak specific protocols.
This patch adds a cert_file_check plugin to detect the certificate expiry
(in days from now) for the given X.509 certificate file in PEM format.
Similar to the cert_check plugin, this plugin emits a metric
'cert_file.cert_expire_days' which contains the number of days from now
until the given certificate expires. If the certificate has already expired,
this will be a negative number.
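A hypothetical conf.d entry illustrating the intent (field names are
assumptions, not the authoritative schema):
    init_config: null
    instances:
      - name: mysql-replication-cert
        cert_file: /etc/mysql/ssl/server-cert.pem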
Change-Id: Id95cc7115823f972e234417223ab5906b57447cc
Story: 2006753
A powerful metric to watch for a Swift cluster is the
number of handoff partitions on a drive on a storage node.
A build-up of handoff partitions on a particular server could
indicate a disk problem or a bottleneck somewhere in the cluster,
or suggest a good time to rebalance the ring (as you'd want to
do that when existing backend data movement is at a minimum).
It therefore makes a great visualisation of the health of
a cluster.
That's what this check plugin does. Each check instance takes
the following values:
ring: <path to a Swift ring file>
devices: <path to the directory of mountpoints>
granularity: <either server or device>
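For example (paths are illustrative):
    init_config: null
    instances:
      - ring: /etc/swift/object.ring.gz
        devices: /srv/node
        granularity: server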
To be able to determine primary vs handoff partitions on a drive,
the Swift ring needs to be consulted. If a storage node stores
more than one ring, an instance should be defined for each.
You give swift a bunch of disks. These disks are placed in what
swift calls the 'devices' location. That is a directory where a
mount point for each mounted swift drive is located.
Finally, you can decide on the granularity, which defaults to
`server` if not defined. Only 2 metrics are created from this
check:
swift.partitions.primary_count
swift.partitions.handoff_count
In addition to the hostname dimension, a ring dimension will also be
set, allowing the handoff vs primary partitions of each ring to be graphed.
When the granularity is set to device, an additional
dimension is added to the metric: the device name (the name of
the device's mount point). This allows the graphing and monitoring
of each device in a server if a finer granularity is required.
Because we need to consult the Swift ring there is a runtime
requirement on the Python swift module being installed, but
it isn't required for the unit tests. Making it a runtime
dependency means that when the check is loaded it will log an error
and then exit if it can't import the swift module.
This is the second of two Swift check plugins I've been working on.
For more details see my blog post[1]
[1] - https://oliver.net.au/?p=358
Change-Id: Ie91add9af39f2ab0e5b575390c0c6355563c0bfc
Swift outputs a lot of StatsD metrics that you can point directly
at monasca-agent. However, there is another Swift endpoint,
recon, that is used to gather more metrics.
The Swift recon (or reconnaissance) API is a REST endpoint that each of
the storage node servers makes available. It can be hit either
manually or via the swift-recon tool.
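For example, hitting it manually against an object server (port and
metric path are illustrative; adjust for your deployment):
    curl http://storage-node:6000/recon/diskusage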
This patch adds a check plugin that hits the recon REST API
and sends metrics to monasca.
This is the first of two Swift check plugins I'm working on.
For more details see my blog post[1]
[1] - https://oliver.net.au/?p=358
Change-Id: I503d74936f6f37fb261c1592845968319695475a
Even though there was a py36 test enabled in the gate, the tox.ini
configuration was not actually invoking the unit tests. This
change sets up the environment to allow tests to run.
As a result, a number of Python3 errors are uncovered and fixed.
Notably:
* Python 3 does not have contextlib.nested, so the nested context
  managers are rewritten without it
* file() is not in Python 3, so use io.open() instead
* Use six.assertCountEqual(self, ...) in tests
* safe_decode:
  subprocess.check_output returns bytes, while the default text type
  is str. safe_decode does the right thing by making sure strings are
  not bytes on both Python 2 and Python 3 (see the sketch after this
  list).
* No ASCII encoding:
  Python 3 defaults to UTF-8 encoding, which is a superset of ASCII
  (the default for Python 2).
* test_json_plugin.py:
  the file is opened in binary (wb) mode, so Python expects the
  string as bytes.
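A minimal sketch of the safe_decode concern; the call shown is
illustrative, and the patch routes this through a safe_decode helper so
the same code works on Python 2 and 3:
    import subprocess

    # check_output returns bytes under Python 3; decode before
    # comparing against str.
    out = subprocess.check_output(['hostname'])
    hostname = out.decode('utf-8').strip()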
Some of the refactoring should be revisited after we drop Python 2
support.
Change-Id: I62b46a2509c39201ca015ca7c269b2ea70c376c8
Story: 2005047
Task: 29547
In standard system locations, check for the client.admin key
for each detected Ceph cluster and conditionally suppress
Ceph agent checks that require it if it is not found.
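Roughly, and assuming the standard Ceph layout (the detection code may
differ):
    import os

    cluster = 'ceph'
    keyring = '/etc/ceph/{}.client.admin.keyring'.format(cluster)
    admin_key_present = os.path.isfile(keyring)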
Change-Id: If3a28ceb5cdde40749d077ad465054eba37c848c
Story: 2005172
To properly support Keystone V3, we must also properly convey the
domain information to the underlying Keystone client.
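For illustration, the kind of credentials this enables in agent.yaml
(key names follow the usual Keystone V3 conventions and are assumptions,
not the exact schema):
    Api:
      keystone_url: http://keystone.example.com:5000/v3
      username: monasca-agent
      password: secret
      project_name: monasca
      user_domain_name: Default
      project_domain_name: Default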
Story: 2005045
Task: 29542
Change-Id: I57f233578132a3689a2182c53483d8110f15bcea
To properly support Keystone V3, we must also properly convey the
domain information to the underlying Keystone client.
Change-Id: I725e107e418d15b65aabf24f8c05403f952f94d5
story: 2005018
Task: 29497
It is possible that the monasca-setup configuration process informs the
user about errors related to setting up other services even though the
system is working correctly. This change reclassifies some log messages
as INFO or WARNING depending on the type of message.
Story: 2004970
Task: 29425
Change-Id: Idb8101fea6e7c5c357d72d77b3b264db4cce8527
To properly support Keystone V3, we must also properly convey
the domain information to the Keystone authentication plugins.
Change-Id: I8c6539cf692e090290cfdf104eb22530a625aadb
story: 2004655
Use the six library to get monasca-agent to work with
python2.7 and python3.
Story: 2004148
Task: 27621
Change-Id: I0de315967dd5a745741fda0c53ce8cc85cda8cc5
Signed-off-by: Chuck Short <chucks@redhat.com>
Static class variables of the FakeProcesses class were used in all test
methods. These variables were used in an uncoordinated way, causing tests
to fail occasionally, depending on the grouping and ordering of the
tests in the worker processes.
The code has been refactored and instance variables are now used in all
test methods.
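A sketch of the shape of the refactor (attribute names are illustrative):
    class FakeProcesses(object):
        # Per-instance state instead of class-level (static) attributes,
        # so grouped or reordered test runs no longer share state.
        def __init__(self):
            self.name = 'monasca-agent'
            self.pid = 1234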
Change-Id: Ic87a00883dc6cff128809f24164d95eb8c98ae91
Story: 2002848
Task: 22797
monasca_setup.detection.utils.load_oslo_configuration() should ignore
options passed in `from_cmd` that are not oslo.config built-in options. We are
only interested in the options --config-file and --config-dir.
Change-Id: I72dc53d8ee6dc7b0784e6931a19c461cdb322851
Story: 2001303
Task: 5854
For k8s we are updating the pod owner
functionality.
Kubernetes 1.6 and newer supports an owner
reference instead of parsing the pod name. This patch
takes advantage of that.
For Prometheus we are adding the ability to look
up the pod owner based on a pod dimension. This is
key when using kube-state-metrics so we don't
have dead alarms.
Also includes a minor fix in the k8s plugin for memory.
The patch also includes changes to our agent tests, as
we now allow () in our new dimension keys and values
and our tests were not updated for this change yet.
Change-Id: I17ea7f42d4b23534221675309c31feeafa75d20c
Ceph commands the Ceph check runs to query cluster status need to access
/etc/ceph/ceph.client.admin.keyring. To that end one can either add
monasca-agent to the ceph group or run monasca-agent as root. This
commit adds the use_sudo configuration option that runs ceph commands
using sudo. This is useful if your agent has sudo rights anyway due to
other plugins that require it (e.g. postfix).
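For example (the cluster_name key is illustrative):
    init_config: null
    instances:
      - cluster_name: ceph
        use_sudo: true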
Change-Id: I24075359f7090f02577cd22a1b3badcbe7041302
The following commit makes enhancements to the
Keystone handling inside monasca-agent:
* using a generic password approach that abstracts from the underlying
  keystone version, thus allowing the agent to be used seamlessly with
  either v2.0 or v3. The only relevant part is the set of parameters
  that one needs to supply to either monasca-reconfigure or the agent.yaml
  configuration file
* using keystone discovery - the agent will no longer
  enforce a particular keystone version but will allow keystoneauth
  to pick the best match for the given environment
Extra:
* the extracted methods get_session and get_client utilize the approach
  presented above (a minimal sketch follows this list) and can be used
  outside of monasca_agent.common.keystone inside checks or detection plugins
* imports now import only modules instead of specific objects
* removed some redundant methods
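A minimal sketch of the generic-password / discovery approach with
keystoneauth (values are placeholders, not the agent's exact code):
    from keystoneauth1 import session
    from keystoneauth1.identity import generic

    # generic.Password discovers whether the endpoint speaks v2.0 or v3
    # and picks the matching plugin, so no version is hard-coded.
    auth = generic.Password(
        auth_url='http://keystone.example.com:5000',
        username='monasca-agent',
        password='secret',
        project_name='monasca',
        user_domain_name='Default',
        project_domain_name='Default')
    sess = session.Session(auth=auth)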
Story: 2000995
Task: 4191
Needed-By: I579f6bcd5975a32af2a255be41c9b6c4043fa1dc
Needed-By: Ifee5b88ccb632222310aafb1081ecb9c9d085150
Change-Id: Iec97e50089ed31ae7ad8244b37cec128817871a5
oslo.config is better at figuring out the placement of config files
and resolving config in case of multiple files.
Change-Id: I0aec0d4cd5ecd18059a39bbae17cee5d9056fd53
Story: 2000999
Task: 4181
This change adds a ceph plugin to collect metrics regarding
ceph clusters. Includes documentation, detection and tests.
Change-Id: Ia4b61e751f0f8087fc9ab3adff889734b8afc2d6
List of changes:
* using oslo_config to get nova configuration
* adjusted _detect body to match changes done
recently for other plugins
Extra:
* added a utility method to load the oslo configuration
for any OpenStack project using it (a minimal sketch follows this list)
* removed json, time from required dependencies (core python libs)
* removed libvirt inspector from required dependencies (part of
an agent itself)
* removed netaddr from required dependencies (part of agent's
requirements)
* overall, tried to introduce some order into the libvirt
code
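A minimal sketch of loading another service's oslo.config options (the
file path is illustrative):
    from oslo_config import cfg

    conf = cfg.ConfigOpts()
    # Only the built-in --config-file/--config-dir options matter here.
    conf(args=['--config-file', '/etc/nova/nova.conf'], project='nova')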
Story: 2000999
Task: 4623
Story: 2001054
Task: 4655
Change-Id: Iaac56cf96f710659908d23dc55831be7dac30e0a
Improve Zookeeper autodetection of options like hostname and port
number. Discover where the Zookeeper configuration file is and read
the options from it.
Change-Id: I8fb8a13e1ec5c1c488398497d217fc9b4bb4e7c6
Story: 2001043
Task: 4600
The node name of influxdb and influxdb-relay can be defined in
the configuration file of influxdb-relay.
This change makes it possible to specify the node name via the
"detection_args" option, using "influxdb_node" or "influxdb_relay_node".
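A hedged example of passing these via monasca-setup (the detection
plugin name and node value are assumptions):
    monasca-setup -d influxdbrelay -a 'influxdb_relay_node=relay-01'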
Change-Id: I5f394893798c5c3bab87bbf7065d5c5f3776514d
Added autodetection of InfluxDB.
The plugin configures process and http_check
monitoring.
Added a note that this auto-plugin
can be extended further to retrieve
internal metrics of InfluxDB.
Also, marked MonInfluxDB as deprecated in
favour of the InfluxDB plugin.
Change-Id: I9a435482bbe7da4aedd06b1678331cf83ccc4587
`as_dict` tries to retrieve various process parameters from the OS which
are time-consuming to obtain and not required for the filtering being
performed.
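One way to avoid that overhead is to restrict the attributes requested,
roughly (the attribute list is illustrative):
    import psutil

    # Fetch only what the filter actually inspects instead of every
    # (expensive) process property.
    for proc in psutil.process_iter():
        info = proc.as_dict(attrs=['name', 'cmdline', 'exe'])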
Change-Id: If02f6703b17b02e36797aa23417ea55fdaea89e2
Story: 2001004
Task: 4187
Plugin that connects to the Kubernetes API to gather metrics
about the Kubernetes environment.
Taken from original review https://review.openstack.org/#/c/391559/
Change-Id: Ifff9285e9a2ac06d59383b986619ee62c59c712e
This improves the likelihood that the collector daemon will exit during
auto restart. Also removed the gevent requirement and usage so only
eventlet is used. The conflict between gevent and eventlet caused
multiprocessing worker join and sys.exit to hang forever.
Change-Id: I60f980f4a74eafb709e51c0f520a81e2d72cb609
Use a white list in init_config to control which metrics
to report. If there is no white_list section in the init_config,
then report all the metrics.
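For example (metric names are placeholders):
    init_config:
      white_list:
        - example.metric.one
        - example.metric.two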
Change-Id: Ia5d9bed47748af83bbc27575f992449584364479