archive_policy: lighten the default archive policies

This reduces the CPU consumption of the default 'low' and 'medium'
archive policies by two thirds and one third respectively, by removing
two definitions and one definition. Also make it clearer in the
documentation that using low and medium is faster in terms of CPU, as
the consumption depends on the number of definitions in an archive
policy.

Change-Id: Iaba3b2ef88858ad777147d2859180d9a27658f1c
This commit is contained in:
Julien Danjou 2016-12-27 18:32:04 +01:00
parent c28c65d30d
commit 246b7873b4
4 changed files with 74 additions and 32 deletions


@ -85,14 +85,14 @@ same "one year, one minute aggregations" resolution, the space used will go up
to a maximum of 6 × 4.1 MiB = 24.6 MiB.
How to set the archive policy and granularity
---------------------------------------------
How to define archive policies
------------------------------
In Gnocchi, the archive policy is expressed in number of points. If your
archive policy defines a policy of 10 points with a granularity of 1 second,
the time series archive will keep up to 10 seconds, each representing an
aggregation over 1 second. This means the time series will at maximum retain 10
seconds of data (sometimes a bit more) between the more recent point and the
In Gnocchi, the archive policy definitions are expressed in number of points.
If your archive policy defines a policy of 10 points with a granularity of 1
second, the time series archive will keep up to 10 seconds, each representing
an aggregation over 1 second. This means the time series will at maximum retain
10 seconds of data (sometimes a bit more) between the most recent point and the
oldest point. That does not mean it will be 10 consecutive seconds: there might
be a gap if data is fed irregularly.
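The relationship described above (points × granularity = maximum retained timespan) can be sketched as follows; this is an illustration of the arithmetic, not Gnocchi's actual code:

```python
# Sketch of the rule described above: an archive definition of N points
# at a given granularity retains at most N * granularity seconds of
# aggregated data between the newest and the oldest point.

def retained_timespan(points, granularity_seconds):
    """Maximum timespan (in seconds) an archive definition can cover."""
    return points * granularity_seconds

# 10 points with a granularity of 1 second retain at most 10 seconds.
print(retained_timespan(points=10, granularity_seconds=1))  # 10
```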
@ -112,6 +112,12 @@ This would represent 6125 points × 9 = 54 KiB per aggregation method. If
you use the 8 standard aggregation method, your metric will take up to 8 × 54
KiB = 432 KiB of disk space.
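The size estimate above can be checked with the documentation's own rule of thumb of up to 9 bytes per point (an upper bound used for estimation, not Gnocchi's exact on-disk format):

```python
# Worst-case size estimate, following the documentation's rule of
# thumb of up to 9 bytes per stored point.

POINTS = 6125
BYTES_PER_POINT = 9          # upper-bound estimate from the docs
AGGREGATION_METHODS = 8      # the 8 standard aggregation methods

per_method_kib = round(POINTS * BYTES_PER_POINT / 1024)  # 54 KiB
total_kib = per_method_kib * AGGREGATION_METHODS         # 432 KiB
print(per_method_kib, total_kib)
```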
Be aware that the more definitions you set in an archive policy, the more CPU
it will consume. Therefore, creating an archive policy with 2 definitions
(e.g. 1 second granularity for 1 day and 1 minute granularity for 1 month)
will consume twice as much CPU as just one definition (e.g. just 1 second
granularity for 1 day).
Default archive policies
------------------------
@ -119,19 +125,16 @@ By default, 3 archive policies are created using the default archive policy
list (listed in `default_aggregation_methods`, i.e. mean, min, max, sum, std,
count):
- low (maximum estimated size per metric: 5 KiB)
- low (maximum estimated size per metric: 406 KiB)
* 5 minutes granularity over 1 hour
* 1 hour granularity over 1 day
* 1 day granularity over 1 month
* 5 minutes granularity over 30 days
- medium (maximum estimated size per metric: 139 KiB)
- medium (maximum estimated size per metric: 887 KiB)
* 1 minute granularity over 1 day
* 1 hour granularity over 1 week
* 1 day granularity over 1 year
* 1 minute granularity over 7 days
* 1 hour granularity over 365 days
- high (maximum estimated size per metric: 1 578 KiB)
- high (maximum estimated size per metric: 1 057 KiB)
* 1 second granularity over 1 hour
* 1 minute granularity over 1 week


@ -211,22 +211,19 @@ class ArchivePolicyItem(dict):
DEFAULT_ARCHIVE_POLICIES = {
'low': ArchivePolicy(
"low", 0, [
# 5 minutes resolution for an hour
ArchivePolicyItem(granularity=300, points=12),
# 1 hour resolution for a day
ArchivePolicyItem(granularity=3600, points=24),
# 1 day resolution for a month
ArchivePolicyItem(granularity=3600 * 24, points=30),
# 5 minutes resolution for 30 days
ArchivePolicyItem(granularity=300,
timespan=30 * 24 * 60 * 60),
],
),
'medium': ArchivePolicy(
"medium", 0, [
# 1 minute resolution for a day
ArchivePolicyItem(granularity=60, points=60 * 24),
# 1 hour resolution for a week
ArchivePolicyItem(granularity=3600, points=7 * 24),
# 1 day resolution for a year
ArchivePolicyItem(granularity=3600 * 24, points=365),
# 1 minute resolution for 7 days
ArchivePolicyItem(granularity=60,
timespan=7 * 24 * 60 * 60),
# 1 hour resolution for 365 days
ArchivePolicyItem(granularity=3600,
timespan=365 * 24 * 60 * 60),
],
),
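The new definitions above pass `timespan=` instead of `points=`. A minimal sketch of that conversion (assuming `ArchivePolicyItem` derives the point count as timespan divided by granularity; the real class lives in gnocchi/archive_policy.py) shows the two forms are equivalent:

```python
# Hypothetical sketch of the timespan-based definition style; not the
# actual ArchivePolicyItem implementation.

class ArchivePolicyItemSketch:
    def __init__(self, granularity, points=None, timespan=None):
        if points is None:
            # Derive the number of points from the requested timespan.
            points = timespan // granularity
        self.granularity = granularity
        self.points = points

# "1 minute resolution for 7 days", expressed both ways:
by_timespan = ArchivePolicyItemSketch(granularity=60,
                                      timespan=7 * 24 * 60 * 60)
by_points = ArchivePolicyItemSketch(granularity=60, points=10080)
print(by_timespan.points == by_points.points)  # True
```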
'high': ArchivePolicy(


@ -179,12 +179,50 @@ class TestCase(base.BaseTestCase):
ARCHIVE_POLICIES = {
'no_granularity_match': archive_policy.ArchivePolicy(
"no_granularity_match",
0,
[
0, [
# 2 second resolution for a day
archive_policy.ArchivePolicyItem(
granularity=2, points=3600 * 24),
],
],
),
'low': archive_policy.ArchivePolicy(
"low", 0, [
# 5 minutes resolution for an hour
archive_policy.ArchivePolicyItem(
granularity=300, points=12),
# 1 hour resolution for a day
archive_policy.ArchivePolicyItem(
granularity=3600, points=24),
# 1 day resolution for a month
archive_policy.ArchivePolicyItem(
granularity=3600 * 24, points=30),
],
),
'medium': archive_policy.ArchivePolicy(
"medium", 0, [
# 1 minute resolution for a day
archive_policy.ArchivePolicyItem(
granularity=60, points=60 * 24),
# 1 hour resolution for a week
archive_policy.ArchivePolicyItem(
granularity=3600, points=7 * 24),
# 1 day resolution for a year
archive_policy.ArchivePolicyItem(
granularity=3600 * 24, points=365),
],
),
'high': archive_policy.ArchivePolicy(
"high", 0, [
# 1 second resolution for an hour
archive_policy.ArchivePolicyItem(
granularity=1, points=3600),
# 1 minute resolution for a week
archive_policy.ArchivePolicyItem(
granularity=60, points=60 * 24 * 7),
# 1 hour resolution for a year
archive_policy.ArchivePolicyItem(
granularity=3600, points=365 * 24),
],
),
}
@ -238,7 +276,6 @@ class TestCase(base.BaseTestCase):
self.coord.stop()
self.archive_policies = self.ARCHIVE_POLICIES.copy()
self.archive_policies.update(archive_policy.DEFAULT_ARCHIVE_POLICIES)
for name, ap in six.iteritems(self.archive_policies):
# Create basic archive policies
try:


@ -0,0 +1,5 @@
---
other:
  - The default archive policies "low" and "medium" now store less data
    than they used to. They use only 1 and 2 archive policy definitions
    respectively, which speeds up their processing by 66% and 33%.
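The speedup figures in the note follow from the commit's premise that processing cost scales linearly with the number of definitions in a policy:

```python
# If CPU cost is proportional to the number of definitions, the
# reduction from the old defaults (3 definitions each) is:

old_definitions = 3
new_low, new_medium = 1, 2

low_reduction = (old_definitions - new_low) / old_definitions        # 2/3
medium_reduction = (old_definitions - new_medium) / old_definitions  # 1/3
print(round(low_reduction * 100))     # 67, i.e. ~66%
print(round(medium_reduction * 100))  # 33
```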