Create a tool for recreating missing metric_id in Cassandra

In some rare cases a metrics row in Cassandra can have no value for
metric_id or created_at, even though updated_at and the other
required columns are still present.

This tool is for recreating the metric_id from the other required
columns.

An additional 'persister-check-missing-metric-id.py' tool is
provided which can be run to see if there are missing metric_id
values that need to be recreated.

Please see the README.rst for usage directions.

Story: 2005305
Task: 30611

Change-Id: I0593558407c8c773d728bbd035dde91310b59be3
(cherry picked from commit 09af9bff91)
Joseph Davis 2019-04-19 14:41:33 -07:00
parent 105c2e1b29
commit 0aaa7d357f
4 changed files with 448 additions and 0 deletions


@ -0,0 +1,88 @@
persister-recreate-metric-id
============================

In some rare cases, it is possible to have metric rows in the Cassandra
database which do not have a metric_id. Due to the nature of TSDBs,
it is valid to have sparse data, but versions of the Monasca API up
through Rocky do not handle this well and produce an ugly ERROR.

For further reading - https://storyboard.openstack.org/#!/story/2005305

This tool runs through the metrics table in Cassandra, identifies rows
that are missing a metric_id, and uses an UPDATE operation to recreate
the metric_id based on other values. The metric_id is calculated from
a hash of the region, tenant_id, metric_name and dimensions, so it can
be recreated.

All effort has been made to ensure this is a safe process, and it
should be safe to run the tool multiple times. However, it is provided
AS IS and you should use it at your own risk.

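
As a rough illustration of the recreation described above, the following sketch mirrors the hash computation used in `persister-recreate-metric-id.py`: region, tenant_id, metric_name and dimensions are joined with NUL characters and hashed with SHA-1. The function name `recreate_metric_id` and all sample values are hypothetical and exist only for this example.

```python
import hashlib


def recreate_metric_id(region, tenant_id, metric_name, dimensions):
    """Return the 20-byte metric_id for one metric row.

    Hypothetical helper for illustration; it follows the hash string
    layout used in persister-recreate-metric-id.py.
    """
    # Fields are joined with NUL characters, as in the tool.
    hash_string = '%s\0%s\0%s\0%s' % (region, tenant_id, metric_name,
                                      '\0'.join(dimensions))
    hex_digest = hashlib.sha1(hash_string.encode('utf8')).hexdigest()
    # The tool binds the id as raw bytes, so convert the hex digest.
    return bytearray.fromhex(hex_digest)


# Sample values are made up for the example.
sample_id = recreate_metric_id('RegionOne', 'tenant-a', 'cpu.idle_perc',
                               ['hostname\tnode1', 'service\tmonitoring'])
print(len(sample_id))  # a SHA-1 digest is always 20 bytes
```

Because the id depends only on these four inputs, computing it again for the same row gives the same bytes, which is consistent with the statement that the tool can be run more than once.
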
Usage
=====

Steps to use this tool:

- Log in to one node where monasca-persister is deployed.

- Identify the installation path of monasca-persister. This may be a
  virtual environment such as
  `/opt/stack/venv/monasca-<version>/lib/python2.7/site-packages/monasca_persister`
  or, as in devstack, `/opt/stack/monasca-persister/monasca_persister/`.

- Identify the existing configuration for monasca-persister. If using a
  Java deployment, it may be in `/opt/stack/service/monasca/etc/persister-config.yml`
  or, in devstack, `/etc/monasca/persister.conf`.

- Copy and modify the config template file.

::

    cp persister-recreate.ini /opt/stack/service/monasca/etc/persister-recreate.ini
    vi /opt/stack/service/monasca/etc/persister-recreate.ini

- Copy the values from the monasca-persister config into the new .ini,
  particularly the password. In some cases, the single IP for the
  management network of one of the Cassandra nodes may need to be given,
  rather than the list of hostnames as specified in the .yml.

- Copy the `persister-recreate-metric-id.py` and `persister-check-missing-metric-id.py`
  files into place with the monasca-persister code.

::

    cp persister-*-metric-id.py /opt/stack/venv/monasca-<version>/lib/python2.7/site-packages/monasca_persister

- Ensure the `mon-persister` user has permission to access
  `persister-recreate.ini` and both `persister-*-metric-id.py` scripts.

- Invoke the check tool to generate a log of rows needing repair.

::

    sudo -u mon-persister /opt/stack/venv/monasca-<version>/bin/python /opt/stack/venv/monasca-<version>/lib/python2.7/site-packages/monasca_persister/persister-check-missing-metric-id.py --config-file /opt/stack/service/monasca/etc/persister-recreate.ini

- Review the logged output. If the output is as expected, invoke
  the `persister-recreate-metric-id.py` tool to repair the rows.

::

    sudo -u mon-persister /opt/stack/venv/monasca-<version>/bin/python /opt/stack/venv/monasca-<version>/lib/python2.7/site-packages/monasca_persister/persister-recreate-metric-id.py --config-file /opt/stack/service/monasca/etc/persister-recreate.ini

- Once the repair has been verified as successful, the configuration file
  may be deleted.

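
The check and repair steps above can be pictured with a small in-memory sketch. It assumes rows shaped like the columns both tools select (region, tenant_id, metric_name, dimensions, dimension_names, created_at, metric_id, updated_at). The `Row` namedtuple, the helper names, the 45-day retention and the sample data are all hypothetical stand-ins for real Cassandra query results and configuration; no database is contacted.

```python
import collections
import hashlib

# Hypothetical stand-in for a row returned by the Cassandra driver.
Row = collections.namedtuple(
    'Row', ['region', 'tenant_id', 'metric_name', 'dimensions',
            'dimension_names', 'created_at', 'metric_id', 'updated_at'])


def find_rows_missing_metric_id(rows):
    """Return the rows whose metric_id is unset (the 'check' step)."""
    return [row for row in rows if row.metric_id is None]


def build_repair_params(row, retention_days):
    """Build the values the 'repair' step would bind for one row."""
    # Fall back to updated_at when created_at is also missing.
    created_at = row.created_at
    if created_at is None and row.updated_at is not None:
        created_at = row.updated_at
    hash_string = '%s\0%s\0%s\0%s' % (row.region, row.tenant_id,
                                      row.metric_name,
                                      '\0'.join(row.dimensions))
    metric_id = bytearray.fromhex(
        hashlib.sha1(hash_string.encode('utf8')).hexdigest())
    # The tool converts the retention policy from days to seconds.
    ttl_seconds = retention_days * 24 * 3600
    return (ttl_seconds, metric_id, created_at, row.updated_at, row.region,
            row.tenant_id, row.metric_name, row.dimensions,
            row.dimension_names)


# Made-up sample data: one healthy row, one row needing repair.
rows = [
    Row('RegionOne', 'tenant-a', 'cpu.idle_perc', ['hostname\tnode1'],
        ['hostname'], 1555700000, b'\x01' * 20, 1555700100),
    Row('RegionOne', 'tenant-a', 'mem.free_mb', ['hostname\tnode1'],
        ['hostname'], None, None, 1555700200),
]

missing = find_rows_missing_metric_id(rows)
print('%d of %d rows are missing metric_id' % (len(missing), len(rows)))
for row in missing:
    params = build_repair_params(row, retention_days=45)
    print(row.metric_name, 'created_at ->', params[2])
```

Running the check first and reviewing its log before the repair, as the steps describe, corresponds to inspecting `missing` before any parameters are bound and executed.
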
License
=======

Copyright (c) 2019 SUSE LLC

Licensed under the Apache License, Version 2.0 (the “License”); you may
not use this file except in compliance with the License. You may obtain
a copy of the License at

::

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an “AS IS” BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.


@ -0,0 +1,156 @@
# (C) Copyright 2019 SUSE LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
# implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Persister check for missing metric_id tool

This tool is designed to report the rare instances where a metric_id
has been removed from a row in Cassandra. That can cause issues
when monasca-api retrieves the metric and tries to decode it.
This tool only reports; use persister-recreate-metric-id.py to repair
the rows.

Configure this tool by copying the Monasca Persister settings from
/opt/stack/service/monasca/etc/persister-config.yml into a config
.ini file (see template).

Start the tool as a stand-alone process by running
'sudo -u mon-persister <venv python> \
   <venv path>/site-packages/monasca_persister/persister-check-missing-metric-id.py \
   --config-file <config file>'

When done, you may delete the config .ini file.

Template for .ini file (suggested /opt/stack/service/monasca/etc/persister-recreate.ini)

[DEFAULT]
debug = False

[repositories]
metrics_driver = monasca_persister.repositories.cassandra.metrics_repository: \
    MetricCassandraRepository

[cassandra]
# Comma separated list of Cassandra node IP addresses (list value)
contact_points = <single ip address for mgmt network on this node>

# Cassandra port number (integer value)
port = 9042

# Keyspace name where metrics are stored (string value)
#keyspace = monasca

# Cassandra user name (string value)
user = mon_persister

# Cassandra password (string value)
password = <password from persister-config.yml>

"""

import sys

from oslo_log import log

from monasca_persister import config
from monasca_persister.repositories.cassandra import connection_util

LOG = log.getLogger(__name__)

METRIC_ALL_CQL = ('select region, tenant_id, metric_name, dimensions, '
                  'dimension_names, created_at, metric_id, updated_at '
                  'from metrics')


def usage():
    usage = """Monasca Persister Check for Missing metric_id Tool

    Used to find rows missing a metric_id, which in rare cases may be
    deleted from a row. The metric_id is a hash of other fields, and thus
    can be recreated. Note this tool is only for use with Cassandra
    storage installs.
    Please see the included README.rst for more details about creating an
    appropriate configuration file.

    To get a report of what rows are missing values (execute as mon-persister user):
        persister-check-missing-metric-id.py --config-file <path>/persister-recreate.ini
    """
    print(usage)


def main():
    """persister check for missing metric_id tool."""

    config.parse_args()

    try:
        LOG.info('Starting check of metric_id consistency.')

        # Connection setup
        # rocky style - note that we don't deliver pike style
        _cluster = connection_util.create_cluster()
        _session = connection_util.create_session(_cluster)

        metric_all_stmt = _session.prepare(METRIC_ALL_CQL)

        rows = _session.execute(metric_all_stmt)

        # if rows:
        #     LOG.info('First - {}'.format(rows[0]))
        #     # LOG.info('First name {} and id {}'.format(
        #     #     rows[0].metric_name, rows[0].metric_id))  # metric_id can't be logged raw

        # Bit of a misnomer - "null" is not in the cassandra db
        missing_value_rows = []
        for row in rows:
            if row.metric_id is None:
                LOG.info('Row with missing metric_id - {}'.format(row))
                missing_value_rows.append(row)

                # check created_at
                if row.created_at is None and row.updated_at is not None:
                    LOG.info("Metric created_at was also None.")
                    # TODO(joadavis) update the updated_at timestamp to now

                # recreate metric id
                # copied from metrics_repository.py
                hash_string = '%s\0%s\0%s\0%s' % (row.region, row.tenant_id,
                                                  row.metric_name,
                                                  '\0'.join(row.dimensions))
                # metric_id = hashlib.sha1(hash_string.encode('utf8')).hexdigest()
                # id_bytes = bytearray.fromhex(metric_id)

                # Note: this logs the pre-hash input string, not the digest.
                LOG.info("Recreated hash for metric id: {}".format(hash_string))
                # LOG.info("new id_bytes {}".format(id_bytes))  # can't unicode decode for logging

        # LOG.info("of {} rows there are {} missing metric_id".format(len(rows), len(null_rows)))
        if len(missing_value_rows) > 0:
            LOG.warning("--> There were {} rows missing metric_id.".format(
                len(missing_value_rows)))
            LOG.warning("    Those rows have NOT been updated.\n"
                        "    Please run the persister-recreate-metric-id "
                        "tool to repair the rows.")
        else:
            LOG.info("No missing metric_ids were found, no changes made.")

        LOG.info('Done with metric_id consistency check.')

        return 0

    except Exception:
        LOG.exception('Error! Exiting.')
        # Return a non-zero status so callers can detect the failure.
        return 1


if __name__ == "__main__":
    sys.exit(main())


@ -0,0 +1,182 @@
# (C) Copyright 2019 SUSE LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
# implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Persister Recreate metric_id

This tool is designed to 'fix' the rare instance when a metric_id
has been removed from a row in Cassandra. That can cause issues
when monasca-api retrieves the metric and tries to decode it.

Configure this tool by copying the Monasca Persister settings from
/opt/stack/service/monasca/etc/persister-config.yml into a config
.ini file (see template).

Start the tool as a stand-alone process by running
'sudo -u mon-persister <venv python> \
   <venv path>/site-packages/monasca_persister/persister-recreate-metric-id.py \
   --config-file <config file>'

When done, you may delete the config .ini file.

Template for .ini file (suggested /opt/stack/service/monasca/etc/persister-recreate.ini)

[DEFAULT]
debug = False

[repositories]
metrics_driver = monasca_persister.repositories.cassandra.metrics_repository: \
    MetricCassandraRepository

[cassandra]
# Comma separated list of Cassandra node IP addresses (list value)
contact_points = <single ip address for mgmt network on this node>

# Cassandra port number (integer value)
port = 9042

# Keyspace name where metrics are stored (string value)
#keyspace = monasca

# Cassandra user name (string value)
user = mon_persister

# Cassandra password (string value)
password = <password from persister-config.yml>

"""

import hashlib
import sys

from oslo_config import cfg
from oslo_log import log

from monasca_persister import config
from monasca_persister.repositories.cassandra import connection_util
from monasca_persister.repositories.cassandra.metrics_repository import METRICS_INSERT_CQL

LOG = log.getLogger(__name__)

METRIC_ALL_CQL = ('select region, tenant_id, metric_name, dimensions, '
                  'dimension_names, created_at, metric_id, updated_at '
                  'from metrics')


def usage():
    usage = """Monasca Persister Recreate metric_id Tool

    Used to recreate a metric_id, which in rare cases may be deleted
    from a row. The metric_id is a hash of other fields, and thus can be
    recreated. Note this tool is only for use with Cassandra storage installs.
    Please see the included README.rst for more details about creating an
    appropriate configuration file.

    persister-recreate-metric-id [-h] --config-file <ini>
        -h --help                  Prints this
        --config-file <ini>        (Required) Configuration file as described in README.rst

    Example
    To repair rows (execute as mon-persister user):
        persister-recreate-metric-id.py --config-file <path>/persister-recreate.ini
    """
    print(usage)


def main():
    """persister recreate metric_id tool."""

    config.parse_args()
    conf = cfg.CONF

    try:
        LOG.info('Starting check and repair of metric_id consistency.')

        # Connection setup
        # rocky style - note that we don't deliver pike style
        _cluster = connection_util.create_cluster()
        _session = connection_util.create_session(_cluster)
        # retention_policy is given in days; the TTL is bound in seconds
        _retention = conf.cassandra.retention_policy * 24 * 3600

        metric_all_stmt = _session.prepare(METRIC_ALL_CQL)
        metric_repair_stmt = _session.prepare(METRICS_INSERT_CQL)

        rows = _session.execute(metric_all_stmt)

        # if rows:
        #     LOG.info('First - {}'.format(rows[0]))
        #     # LOG.info('First name {} and id {}'.format(
        #     #     rows[0].metric_name, rows[0].metric_id))  # metric_id can't be logged raw

        # Bit of a misnomer - "null" is not in the cassandra db
        missing_value_rows = []
        for row in rows:
            if row.metric_id is None:
                LOG.info('Row with missing metric_id - {}'.format(row))
                missing_value_rows.append(row)

                # check created_at
                fixed_created_at = row.created_at
                if row.created_at is None and row.updated_at is not None:
                    LOG.info("Metric created_at was also None, repairing.")
                    fixed_created_at = row.updated_at
                    # TODO(joadavis) update the updated_at timestamp to now

                # recreate metric id
                # copied from metrics_repository.py
                hash_string = '%s\0%s\0%s\0%s' % (row.region, row.tenant_id,
                                                  row.metric_name,
                                                  '\0'.join(row.dimensions))
                metric_id = hashlib.sha1(hash_string.encode('utf8')).hexdigest()
                id_bytes = bytearray.fromhex(metric_id)

                # Note: this logs the pre-hash input string, not the digest.
                LOG.info("Recreated hash for metric id: {}".format(hash_string))
                # LOG.info("new id_bytes {}".format(id_bytes))  # can't unicode decode for logging

                # execute cql
                metric_repair_bound_stmt = metric_repair_stmt.bind((_retention,
                                                                    id_bytes,
                                                                    fixed_created_at,
                                                                    row.updated_at,
                                                                    row.region,
                                                                    row.tenant_id,
                                                                    row.metric_name,
                                                                    row.dimensions,
                                                                    row.dimension_names))

                _session.execute(metric_repair_bound_stmt)

        # LOG.info("of {} rows there are {} missing metric_id".format(len(rows), len(null_rows)))
        if len(missing_value_rows) > 0:
            LOG.warning("--> There were {} rows missing metric_id.".format(
                len(missing_value_rows)))
            LOG.warning("    Those rows have been updated.")
        else:
            LOG.info("No missing metric_ids were found, no changes made.")

        LOG.info('Done with metric_id consistency check and repair.')

        return 0

    except Exception:
        LOG.exception('Error! Exiting.')
        # Return a non-zero status so callers can detect the failure.
        return 1


if __name__ == "__main__":
    sys.exit(main())


@ -0,0 +1,22 @@
[DEFAULT]
debug = False

[repositories]
metrics_driver = monasca_persister.repositories.cassandra.metrics_repository:MetricCassandraRepository

[cassandra]
# Comma separated list of Cassandra node IP addresses (list value)
contact_points = <single ip address for mgmt network on this node>

# Cassandra port number (integer value)
port = 9042

# Keyspace name where metrics are stored (string value)
#keyspace = monasca

# Cassandra user name (string value)
user = mon_persister

# Cassandra password (string value)
password = <password from persister-config.yml>
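
Before running either tool it may help to confirm that the angle-bracket placeholders in this template have been replaced. The sketch below uses Python 3's standard `configparser` module as a stand-in for the oslo.config parsing the tools actually perform, so it is only an approximation. The function name `find_unfilled_placeholders`, the sample address and the embedded sample text are hypothetical.

```python
import configparser
import re

# Made-up sample: contact_points is filled in, password is not.
SAMPLE_INI = """
[DEFAULT]
debug = False

[cassandra]
contact_points = 192.0.2.10
port = 9042
user = mon_persister
password = <password from persister-config.yml>
"""

# A value still wrapped in angle brackets is treated as unfilled.
PLACEHOLDER = re.compile(r'^<.*>$')


def find_unfilled_placeholders(ini_text):
    """Return sorted (section, option) pairs that still hold a placeholder."""
    # Interpolation is disabled so a '%' in a password is read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    unfilled = []
    for section in parser.sections():
        for option, value in parser.items(section):
            if PLACEHOLDER.match(value.strip()):
                unfilled.append((section, option))
    return sorted(unfilled)


print(find_unfilled_placeholders(SAMPLE_INI))
```

An empty list suggests every placeholder was replaced; it says nothing about whether the values themselves are correct.
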