(armada) Chart Time Metrics
Change-Id: I121d8fcf050a83cbcf01a14c1543d11a0b04ea2a
specs/approved/armada_time_metrics.rst (new file)
..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

=======================================
Time Performance Metrics for Each Chart
=======================================

Add time performance metrics for charts, including deployment time, upgrade
time, wait time, test time, and time consumed for docs or resources, where
applicable.

Problem description
===================

Armada currently records no time metrics for chart deployments, upgrades,
tests, or other actions. As a result, the time needed to deploy an
environment is unknown, which can constrain deployment or upgrade windows
for charts. Adding per-chart time metrics allows for better predictability
of deployments and upgrades and shows when charts are not acting as
intended.

Use Cases
---------

Knowing how long a chart takes to deploy or upgrade can streamline future
deployments and upgrades. It makes chart deployment and upgrade times
predictable and helps uncover inconsistencies within those operations, often
pinpointing which chart(s) are causing errors.

Proposed change
===============

Add time metrics to the `ChartBuilder`, `ChartDeploy`, and `ChartDelete`
classes. The timer will use the built-in Python `time` library, and the
results will be written to the logs for use or analysis.

These metrics include the full deployment, upgrade, wait, install, and delete
times for charts managed through Armada. They will be logged in `HH:MM:SS`
format, together with the chart name and the action performed.

TODO: Relevant data and info about said data?

Example:

chart_deploy.py::

    def execute(self, chart, cg_test_all_charts, prefix, known_releases):
        namespace = chart.get('namespace')
        release = chart.get('release')
        release_name = r.release_prefixer(prefix, release)
        LOG.info('Processing Chart, release=%s', release_name)

        # Deploy or upgrade
        start_time = time.time()
        ...
        LOG.info('Chart deployment/update completed in %s',
                 time.strftime('%H:%M:%S',
                               time.gmtime(time.time() - start_time)))

        # Wait
        start_time = time.time()
        timer = int(round(deadline - time.time()))
        chart_wait.wait(timer)
        LOG.info('Chart wait completed in %s',
                 time.strftime('%H:%M:%S',
                               time.gmtime(time.time() - start_time)))

        # Test
        start_time = time.time()
        just_deployed = ('install' in result) or ('upgrade' in result)
        ...
        if run_test:
            self._test_chart(release_name, test_handler)
        LOG.info('Chart test completed in %s',
                 time.strftime('%H:%M:%S',
                               time.gmtime(time.time() - start_time)))
        ...

Alternatives
------------

A simplistic alternative is to merely log timestamps for each action that
occurs on a chart. While almost the same as the proposed change, it records
only start and end points rather than an elapsed time.

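A minimal sketch of this timestamp-only approach (the `time.sleep` call is
just a stand-in for the actual chart action)::

    import logging
    import time

    logging.basicConfig(level=logging.INFO)
    LOG = logging.getLogger(__name__)

    # Timestamp-only logging: start and end points, no elapsed time.
    LOG.info('Chart deploy started at %s',
             time.strftime('%H:%M:%S', time.gmtime()))
    time.sleep(1)  # stand-in for the actual deploy action
    LOG.info('Chart deploy finished at %s',
             time.strftime('%H:%M:%S', time.gmtime()))
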
Another alternative is to use the `datetime` library instead of the `time`
library. It allows for very similar functionality in getting the elapsed
time for chart deployment, update, wait, test, etc. It takes slightly more
effort to convert the `timedelta` object, produced by subtracting two
`datetime` objects, into a string format to put into the log.

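A minimal sketch of the `datetime` approach (again with a `time.sleep`
stand-in for the chart action; since `timedelta` has no `strftime`, the
`HH:MM:SS` string is built by hand)::

    import logging
    import time
    from datetime import datetime, timezone

    logging.basicConfig(level=logging.INFO)
    LOG = logging.getLogger(__name__)

    start = datetime.now(timezone.utc)
    time.sleep(1)  # stand-in for the deploy/upgrade/wait/test action
    elapsed = datetime.now(timezone.utc) - start  # a timedelta object

    # Convert the timedelta into an HH:MM:SS string for the log.
    total_seconds = int(elapsed.total_seconds())
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    LOG.info('Chart action completed in %02d:%02d:%02d',
             hours, minutes, seconds)
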
A third alternative is to use the Prometheus metrics available through
OpenStack-Helm. The Prometheus config file currently has it scrape the
cAdvisor endpoint to retrieve metrics. These metrics could be used to show
the starting time of chart deployments based on their containers. The
'container_start_time_seconds' metric shows the epoch timestamp for the
container the chart is running in, which can be converted to a normal
timestamp. To grab the scraped metrics, an HTTP request such as the
following can be used::

    curl http://127.0.0.1:9090/metrics

This returns the metrics in plain text for analysis.

Unfortunately, these metrics do not include anything that would easily show
when a chart has finished. One possibility would be to grab the next chart's
'container_start_time_seconds' timestamp and compare it to the previous one,
thus getting a rough estimate of the time performance of a chart deployment,
as sketched below. However, for upgrade, wait, and test times, it may prove
too complex to get accurate data from the Prometheus scraped metrics.

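A sketch of that comparison, using the standard Prometheus HTTP query API
(the container names and the 'name' label are illustrative; actual labels
depend on the cAdvisor configuration)::

    import requests

    PROM = 'http://127.0.0.1:9090'

    def container_start(name):
        # Query the latest container_start_time_seconds sample for a
        # container by name.
        resp = requests.get(PROM + '/api/v1/query', params={
            'query': 'container_start_time_seconds{name="%s"}' % name})
        result = resp.json()['data']['result']
        return float(result[0]['value'][1]) if result else None

    # Rough deploy-time estimate: next chart's container start time minus
    # this chart's container start time.
    this_start = container_start('mariadb')       # illustrative name
    next_start = container_start('keystone-api')  # illustrative name
    if this_start is not None and next_start is not None:
        print('Rough deploy time: %d seconds' % int(next_start - this_start))
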
Security Impact
---------------

None

Notifications Impact
--------------------

An extra notification displaying the deployment or upgrade time.

Other End User Impact
---------------------

None

Performance Impact
------------------

None

Other Deployer Impact
---------------------

None

Implementation
==============

Assignee(s)
-----------

Work Items
----------

Dependencies
============

None

Documentation Impact
====================

None

References
==========

TODO