Apply timer trigger jitter to project-branches

Currently the timer trigger accepts an optional "jitter" specification
which can delay the start of a pipeline timer trigger by up to a
certain number of seconds.  It applies uniformly to every project-branch
that participates in the pipeline.  For example, if a periodic pipeline
with nova and glance is configured to trigger at midnight, and has
a jitter of 30 seconds, then the master and stable branches of nova and
glance will all be enqueued at the same time (perhaps 00:00:17).

This provides some utility: if other systems are configured to do
things around midnight, this pipeline may not join a thundering herd
with them.  Likewise, if there are many periodic pipelines configured
for midnight (perhaps across different tenants, or just with slightly
different purposes), they won't form a thundering herd with each other.

But to the extent that jobs within a given pipeline might want to avoid
a thundering herd with other similar jobs in the same pipeline, it offers
no relief.  While Zuul itself may be able to handle the load (especially
since multiple schedulers allow other pipelines to continue to operate),
these jobs may interact with remote systems which would appreciate not
being DoS'd.

To alleviate this, we change the jitter from applying to the pipeline
as a whole to individual project-branches.  To be clear, it is still the
case that the pipeline has only a single configured trigger time (this
change does not allow projects to configure their own triggers).  But
instead of firing a single event for the entire pipeline, we will fire
a unique event for every project-branch in that pipeline, and these
events will have the jitter applied to them individually.  So in our
example above, nova@master might fire at 00:00:05, nova@stable/zulu
may fire at 00:00:07, glance@master at 00:00:13, etc.
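
Sketched outside of Zuul with APScheduler directly (the project and
branch names are placeholders), the new approach amounts to one
scheduler job per project-branch:

  from apscheduler.schedulers.background import BackgroundScheduler
  from apscheduler.triggers.cron import CronTrigger

  def fire(project, branch):
      # In Zuul this would construct and enqueue a TimerTriggerEvent;
      # here we just print to show when each job fires.
      print(f"trigger fired for {project}@{branch}")

  scheduler = BackgroundScheduler()
  scheduler.start()

  for project, branch in [('nova', 'master'),
                          ('nova', 'stable/zulu'),
                          ('glance', 'master')]:
      # One job (and therefore one independently computed jitter) per
      # project-branch, rather than a single job for the pipeline.
      scheduler.add_job(fire,
                        trigger=CronTrigger(hour=0, minute=0, second=0,
                                            jitter=30),
                        args=(project, branch),
                        misfire_grace_time=None)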

This behavior is similar enough in spirit to the current behavior that
we can consider it a minor implementation change, and it doesn't require
any new configuration options, feature flags, deprecation notice, etc.

The documentation is updated to describe the new behavior, as well as
to correct an error in the description of jitter (it only delays
events; it does not advance them).
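
For reference, a 7-part timespec such as "0 0 * * * * 60" maps onto
an APScheduler cron trigger roughly as follows (a sketch of the idea,
not necessarily Zuul's exact parsing code):

  from apscheduler.triggers.cron import CronTrigger

  def trigger_from_timespec(timespec):
      # 5 required parts: minute hour day-of-month month day-of-week;
      # optional 6th part: seconds; optional 7th part: jitter.
      parts = timespec.split()
      minute, hour, dom, month, dow = parts[:5]
      second = parts[5] if len(parts) > 5 else None
      jitter = int(parts[6]) if len(parts) > 6 else None
      return CronTrigger(month=month, day=dom, day_of_week=dow,
                         hour=hour, minute=minute, second=second,
                         jitter=jitter)

  # Fires at midnight, delayed randomly by up to 60 seconds.
  trigger = trigger_from_timespec("0 0 * * * * 60")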

We currently add a single job to APScheduler for every timer triggered
pipeline in every tenant (so the number of jobs is the sum of the
periodic pipelines in every tenant).  OpenDev for example may have on
the order of 20 APScheduler jobs.  With the new approach, we will
enqueue a job for each project-branch in a periodic pipeline.  For a
system like OpenDev, that could potentially be thousands of jobs.
In reality, based on current configuration and pipeline participation,
it should be 176.

Even though it will result in the same number of Zuul trigger events,
there is overhead to having more APScheduler jobs.  To characterize
this, I performed a benchmark where I added a certain number of
APScheduler jobs with the same trigger time (and no jitter) and
recorded the amount of time needed to add the jobs and also, once the
jobs began firing, the elapsed time from the first to the last job.
This should characterize the additional overhead the scheduler will
encounter with this change.

Time needed to add jobs to APScheduler (seconds)
1:       0.00014448165893554688
10:      0.0009338855743408203
100:     0.00925445556640625
1000:    0.09204769134521484
10000:   0.9236903190612793
100000:  11.758053541183472
1000000: 223.83168983459473

Time to run jobs (last-first in seconds)
1:       2.384185791015625e-06
10:      0.006863832473754883
100:     0.09936022758483887
1000:    0.22670435905456543
10000:   1.517075777053833
100000:  19.97287678718567
1000000: 399.24730825424194
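
The benchmark script itself is not part of this change, but it was
along these lines (a reconstruction; N and the lead time are
illustrative):

  import datetime
  import time

  from apscheduler.schedulers.background import BackgroundScheduler
  from apscheduler.triggers.date import DateTrigger

  N = 10000
  fire_times = []

  def record():
      fire_times.append(time.time())

  scheduler = BackgroundScheduler()
  scheduler.start()

  # Every job gets the same trigger time (and no jitter), far enough
  # in the future that we finish adding jobs before any fire.
  run_date = datetime.datetime.now() + datetime.timedelta(seconds=60)

  start = time.time()
  for _ in range(N):
      scheduler.add_job(record, trigger=DateTrigger(run_date=run_date),
                        misfire_grace_time=None)
  print("time to add jobs:", time.time() - start)

  # Wait for every job to fire, then report the first-to-last spread.
  while len(fire_times) < N:
      time.sleep(1)
  print("time to run jobs:", max(fire_times) - min(fire_times))
  scheduler.shutdown()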

Given that this operates primarily at the tenant level (when a tenant
reconfiguration happens, jobs need to be removed and added), I think
it makes sense to consider up to 10,000 jobs a reasonable high end.
It looks like we can go a little past that (between 10,000 and 100,000)
while still seeing something like a linear increase.  As we approach
1,000,000 jobs the growth starts looking more polynomial and I would
not consider the performance acceptable.  But 100,000 is already an
unlikely number, so I think this level of performance is okay within
the likely range of job counts.

The default executor used by APScheduler is a standard Python
ThreadPoolExecutor with a maximum of 10 simultaneous workers.  This
will cause us to fire up to 10 Zuul events simultaneously (whereas
before, we were only likely to fire simultaneous events if multiple
tenants had identical pipeline timer triggers).  This could result in
more load on the connection sources and the change cache as the
branch tips are updated.  It seems plausible that 10 simultaneous
events are something the sources and ZK can handle.  If not, we can
reduce the granularity of the lock we use to prevent updating the
same project at the same time (to perhaps a single lock for all
projects), or construct the APScheduler executor with a lower
max_workers value.
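
For that last option, constructing the scheduler with a smaller
thread pool would look something like this (a sketch; the value 5 is
only illustrative):

  from apscheduler.executors.pool import ThreadPoolExecutor
  from apscheduler.schedulers.background import BackgroundScheduler

  # Replace the default 10-worker thread pool so that fewer timer
  # callbacks (and thus fewer branch-tip refreshes) run concurrently.
  scheduler = BackgroundScheduler(
      executors={'default': ThreadPoolExecutor(max_workers=5)})
  scheduler.start()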

Change-Id: I27fc23763da81273eb135e14cd1d0bd95964fd16
commit 49abc4255e (parent 78b14ec3c1)
Author: James E. Blair
Date: 2022-07-06 10:56:30 -07:00
3 changed files with 75 additions and 56 deletions

@@ -14,9 +14,9 @@ Timers don't require a special connection or driver. Instead they can
 simply be used by listing ``timer`` as the trigger.
 
 This trigger will run based on a cron-style time specification. It
-will enqueue an event into its pipeline for every project defined in
-the configuration. Any job associated with the pipeline will run in
-response to that event.
+will enqueue an event into its pipeline for every project and branch
+defined in the configuration. Any job associated with the pipeline
+will run in response to that event.
 
 .. attr:: pipeline.trigger.timer
 
@@ -27,9 +27,9 @@ response to that event.
 
       The time specification in cron syntax. Only the 5 part syntax
       is supported, not the symbolic names. Example: ``0 0 * * *``
-      runs at midnight. The first weekday is Monday.
-      An optional 6th part specifies seconds. The optional 7th part
-      specifies a jitter in seconds. This advances or delays the
-      trigger randomly, limited by the specified value.
-      Example ``0 0 * * * * 60`` runs at midnight with a +/- 60
-      seconds jitter.
+      runs at midnight. The first weekday is Monday. An optional 6th
+      part specifies seconds. The optional 7th part specifies a
+      jitter in seconds. This delays the trigger randomly, limited by
+      the specified value. Example ``0 0 * * * * 60`` runs at
+      midnight or randomly up to 60 seconds later. The jitter is
+      applied individually to each project-branch combination.

@@ -0,0 +1,7 @@
+---
+features:
+  - |
+    Pipeline timer triggers with jitter now apply the jitter to each
+    project-branch individually (instead of to the pipeline as a
+    whole).  This can reduce the thundering herd effect on external
+    systems for periodic pipelines with many similar jobs.

@@ -130,6 +130,24 @@ class TimerDriver(Driver, TriggerInterface):
                                 pipeline.name)
                             continue
 
+                    self._addJobsInner(tenant, pipeline, trigger, timespec,
+                                       jobs)
+
+    def _addJobsInner(self, tenant, pipeline, trigger, timespec, jobs):
+        # jobs is a list that we mutate
+        for project_name, pcs in tenant.layout.project_configs.items():
+            # timer operates on branch heads and doesn't need
+            # speculative layouts to decide if it should be
+            # enqueued or not. So it can be decided on cached
+            # data if it needs to run or not.
+            pcst = tenant.layout.getAllProjectConfigs(project_name)
+            if not [True for pc in pcst if pipeline.name in pc.pipelines]:
+                continue
+            (trusted, project) = tenant.getProject(project_name)
+            try:
+                for branch in project.source.getProjectBranches(
+                        project, tenant):
                     # The 'misfire_grace_time' argument is set to None to
                     # disable checking if the job missed its run time window.
                     # This ensures we don't miss a trigger when the job is
@@ -137,11 +155,17 @@ class TimerDriver(Driver, TriggerInterface):
                     # delays are not a problem for our trigger use-case.
                     job = self.apsched.add_job(
                         self._onTrigger, trigger=trigger,
-                        args=(tenant, pipeline.name, timespec,),
+                        args=(tenant, pipeline.name, project_name,
+                              branch, timespec,),
                         misfire_grace_time=None)
                     jobs.append(job)
+            except Exception:
+                self.log.exception("Unable to create APScheduler job for "
+                                   "%s %s %s",
+                                   tenant, pipeline, project)
 
-    def _onTrigger(self, tenant, pipeline_name, timespec):
+    def _onTrigger(self, tenant, pipeline_name, project_name, branch,
+                   timespec):
         if not self.election_won:
             return
@@ -150,55 +174,43 @@ class TimerDriver(Driver, TriggerInterface):
             return
 
         try:
-            self._dispatchEvent(tenant, pipeline_name, timespec)
+            self._dispatchEvent(tenant, pipeline_name, project_name,
+                                branch, timespec)
         except Exception:
             self.stop_event.set()
             self.log.exception("Error when dispatching timer event")
 
-    def _dispatchEvent(self, tenant, pipeline_name, timespec):
-        self.log.debug('Got trigger for tenant %s and pipeline %s with '
-                       'timespec %s', tenant.name, pipeline_name, timespec)
-        for project_name, pcs in tenant.layout.project_configs.items():
-            try:
-                # timer operates on branch heads and doesn't need
-                # speculative layouts to decide if it should be
-                # enqueued or not. So it can be decided on cached
-                # data if it needs to run or not.
-                pcst = tenant.layout.getAllProjectConfigs(project_name)
-                if not [True for pc in pcst if pipeline_name in pc.pipelines]:
-                    continue
-                (trusted, project) = tenant.getProject(project_name)
-                for branch in project.source.getProjectBranches(
-                        project, tenant):
-                    try:
-                        event = TimerTriggerEvent()
-                        event.type = 'timer'
-                        event.timespec = timespec
-                        event.forced_pipeline = pipeline_name
-                        event.project_hostname = project.canonical_hostname
-                        event.project_name = project.name
-                        event.ref = 'refs/heads/%s' % branch
-                        event.branch = branch
-                        event.zuul_event_id = str(uuid4().hex)
-                        event.timestamp = time.time()
-                        # Refresh the branch in order to update the item in the
-                        # change cache.
-                        change_key = project.source.getChangeKey(event)
-                        with self.project_update_locks[project.canonical_name]:
-                            project.source.getChange(change_key, refresh=True,
-                                                     event=event)
-                        log = get_annotated_logger(self.log, event)
-                        log.debug("Adding event")
-                        self.sched.addTriggerEvent(self.name, event)
-                    except Exception:
-                        self.log.exception("Error dispatching timer event for "
-                                           "project %s branch %s",
-                                           project, branch)
-            except Exception:
-                self.log.exception("Error dispatching timer event for "
-                                   "project %s",
-                                   project)
+    def _dispatchEvent(self, tenant, pipeline_name, project_name,
+                       branch, timespec):
+        self.log.debug('Got trigger for tenant %s and pipeline %s '
+                       'project %s branch %s with timespec %s',
+                       tenant.name, pipeline_name, project_name,
+                       branch, timespec)
+        try:
+            (trusted, project) = tenant.getProject(project_name)
+            event = TimerTriggerEvent()
+            event.type = 'timer'
+            event.timespec = timespec
+            event.forced_pipeline = pipeline_name
+            event.project_hostname = project.canonical_hostname
+            event.project_name = project.name
+            event.ref = 'refs/heads/%s' % branch
+            event.branch = branch
+            event.zuul_event_id = str(uuid4().hex)
+            event.timestamp = time.time()
+            # Refresh the branch in order to update the item in the
+            # change cache.
+            change_key = project.source.getChangeKey(event)
+            with self.project_update_locks[project.canonical_name]:
+                project.source.getChange(change_key, refresh=True,
+                                         event=event)
+            log = get_annotated_logger(self.log, event)
+            log.debug("Adding event")
+            self.sched.addTriggerEvent(self.name, event)
+        except Exception:
+            self.log.exception("Error dispatching timer event for "
+                               "tenant %s project %s branch %s",
+                               tenant, project_name, branch)
 
     def stop(self):
         self.stopped = True