Maintain shared memory after fork in Python >=3.7

Python 3.7 adds a gc.freeze() call that moves all currently allocated
objects to a 'permanent' garbage collector generation that is never
garbage collected:

https://docs.python.org/3.7/library/gc.html?highlight=gc#gc.freeze

By calling this prior to fork()ing off worker processes, we ensure that
existing pages will largely remain in shared memory (i.e. there will be
only one copy shared across all worker processes and the parent).
Otherwise, the mark-and-sweep action of the garbage collector causes
writes to a substantial proportion of the pages, resulting in each process
having its own copy.
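
For illustration only (not part of the commit), a minimal sketch of the pattern on a
POSIX system; the 'config' dict and the worker count are invented for the example:

    import gc
    import os

    # Build up the parent's long-lived state before forking (a stand-in for the
    # imported modules and configuration a real service accumulates).
    config = {'option_%d' % i: i for i in range(100000)}

    # Python 3.7+ only, hence the guard: move every existing object into the
    # permanent generation so collector passes stop touching its bookkeeping.
    if hasattr(gc, 'freeze'):
        gc.freeze()

    for _ in range(4):
        if os.fork() == 0:
            # Child: the cyclic collector skips frozen objects, so its passes no
            # longer dirty these pages. (Reference-count updates on individual
            # objects can still copy the pages they live in.)
            print(os.getpid(), len(config))
            os._exit(0)

    for _ in range(4):
        os.wait()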

This may result in some otherwise-collectable objects (i.e. objects that
are no longer reachable but that participate in reference cycles) remaining
in memory permanently; however, in almost all cases it is preferable to
leave them allocated rather than to free up gaps in existing pages that
workers would then fill with newly allocated objects, again causing the
pages to be copied.
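
As a rough illustration of that trade-off (again not part of the commit; the
Node class is invented), gc exposes get_freeze_count() and unfreeze()
alongside freeze():

    import gc

    class Node(object):
        pass

    # An unreachable reference cycle that the collector could normally reclaim.
    a, b = Node(), Node()
    a.other, b.other = b, a
    del a, b

    gc.freeze()
    print(gc.get_freeze_count())   # everything, cycle included, is now permanent
    print(gc.collect())            # frozen objects are never examined

    gc.unfreeze()                  # move them back into the oldest generation
    print(gc.collect())            # the cycle is found and freed after all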

Change-Id: I0f420f171669094233fe1ca1aae60c94cd0db65c
Author: Zane Bitter
Date:   2018-01-02 10:58:53 -05:00
Parent: 97044dc5ae
Commit: e753070ca2
1 changed file with 8 additions and 0 deletions

@@ -21,6 +21,7 @@ import abc
 import collections
 import copy
 import errno
+import gc
 import io
 import logging
 import os
@@ -528,6 +529,13 @@ class ProcessLauncher(object):
         _check_service_base(service)
         wrap = ServiceWrapper(service, workers)
 
+        # Hide existing objects from the garbage collector, so that most
+        # existing pages will remain in shared memory rather than being
+        # duplicated between subprocesses in the GC mark-and-sweep. (Requires
+        # Python 3.7 or later.)
+        if hasattr(gc, 'freeze'):
+            gc.freeze()
+
         LOG.info('Starting %d workers', wrap.workers)
         while self.running and len(wrap.children) < wrap.workers:
             self._start_child(wrap)
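
For context, the guarded freeze above runs inside ProcessLauncher.launch_service()
just before the workers are forked. A hedged usage sketch, assuming the usual
oslo_service API; the Worker class and the worker count are invented:

    from oslo_config import cfg
    from oslo_service import service


    class Worker(service.Service):
        """Hypothetical worker; real projects subclass oslo_service's Service."""

        def start(self):
            super(Worker, self).start()
            # per-worker setup and work would go here


    launcher = service.ProcessLauncher(cfg.CONF)
    # On Python >= 3.7, launch_service() freezes the parent's heap before
    # forking the children, so state built up so far stays in shared pages.
    launcher.launch_service(Worker(), workers=4)
    launcher.wait()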