c0484c9d7c
When Linux runs out of memory and activates the OOM killer, it scores processes based on how much memory they are using[1]. If a job triggers an OOM by causing ansible-playbook to use a lot of RAM, normally we would expect the OOM killer to kill Ansible. However, if the executor is busy, it may be using a lot of RAM as well, and its score may exceed the score of the smaller Ansible process. Nonetheless, we would still rather kill the Ansible process. This change adjusts the score for the bubblewrap and Ansible processes so that their score is increased by an amount equal to about 20% of system RAM. This effectively means that as long as the executor uses less than 20% of system RAM, it is guaranteed to score lower than Ansible (and it will likely continue to score lower for some significant amount over that as well, depending on how much RAM Ansible is using). We read the executor's oom_score_adj when we initialize the bwrap driver and add 200 to it in order to accommodate the situation where the executor has its own oom_score_adj. We always want the bwrap children to have a higher score than the executor. The choom program adjusts the OOM score for the command that it executes, and this is inherited by child processes. So we adjust bwrap and expect ansible-playbook to inherit it. It is also possible to adjust the score of the executor process lower (so the executor could be made less likely to be a target), but that requires root privileges, so it is not implemented in this change. [1] https://lxr.linux.no/#linux+v6.7.1/mm/oom_kill.c#L201 Change-Id: I3a3d116cf68b84b8a6f9ec13808d1d2c2008008f |
||
---|---|---|
.. | ||
__init__.py |
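
The score adjustment described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the code from this change: the function names (`read_oom_score_adj`, `child_oom_score_adj`, `wrap_with_choom`) are hypothetical, and the clamping to the kernel's -1000..1000 range and the fallback to 0 when the file cannot be read are assumptions. The value 200 comes from the commit message; it corresponds to roughly 20% of RAM because `oom_score_adj` is expressed in thousandths of total memory.

```python
OOM_SCORE_ADJ_MIN = -1000  # kernel lower bound for oom_score_adj
OOM_SCORE_ADJ_MAX = 1000   # kernel upper bound for oom_score_adj
CHILD_OFFSET = 200         # offset named in the commit message (~20% of RAM)


def read_oom_score_adj(path="/proc/self/oom_score_adj"):
    """Return the calling process's oom_score_adj.

    Falls back to 0 if the file is missing or unparsable (an assumption
    made for this sketch, e.g. on non-Linux systems).
    """
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return 0


def child_oom_score_adj(executor_adj, offset=CHILD_OFFSET):
    """Executor adjustment plus the offset, clamped to the kernel range."""
    return max(OOM_SCORE_ADJ_MIN, min(OOM_SCORE_ADJ_MAX, executor_adj + offset))


def wrap_with_choom(command, executor_adj):
    """Prefix a command list with choom.

    choom sets the adjustment for the command it executes, and child
    processes (here, ansible-playbook under bwrap) inherit it.
    """
    adj = child_oom_score_adj(executor_adj)
    return ["choom", "-n", str(adj), "--"] + list(command)


if __name__ == "__main__":
    # Hypothetical bwrap invocation, shown only to illustrate the wrapping.
    print(wrap_with_choom(["bwrap", "--version"], read_oom_score_adj()))
```

Raising a process's own score this way needs no special privileges, whereas lowering the executor's score would, which matches the trade-off noted in the commit message.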