Updated openstack/openstack

Project: openstack-infra/devstack-gate  d9bb4f1915b1acebc475e2ff911312631b192513

Set standard "swappiness"

One thing that remained somewhat of a mystery after recent issues with
the OOM killer & centos7 was why we saw the problem only on RAX and
not hpcloud.

In going through the final telemetry sent from a dying host [1] it is
clear there is swap available, but it is not being used at all.
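
As a rough illustration of that check, here is a minimal sketch that
computes how much swap is in use from a meminfo-style file. The
function name swap_used_kb is hypothetical and is not part of
devstack-gate; the SwapTotal and SwapFree fields are the standard
ones in /proc/meminfo.

```shell
#!/bin/bash
# Hypothetical helper, not part of devstack-gate.
# Prints swap in use (kB) as SwapTotal - SwapFree, read from a
# meminfo-style file (defaults to /proc/meminfo).
swap_used_kb() {
    local meminfo=${1:-/proc/meminfo}
    awk '/^SwapTotal:/ {total=$2}
         /^SwapFree:/  {free=$2}
         END {print total - free}' "$meminfo"
}
```

A result of 0 on a host that is running out of memory would match
what the telemetry in [1] shows: swap configured, but untouched.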

Closer inspection shows that there is one crucial difference between
the HP & RAX centos7 images: the swappiness level, which is set to 0
on RAX and 30 on HP [2].

This is discussed in various places ([3] is a good post).  The upshot
is that a swappiness of 0 (swap only when needed) can lead to certain
processes triggering the OOM killer despite some swap being
available.  This matches what we saw on the console, which was
usually mysql getting itself killed (unfortunately I don't believe
any console logs gathered during the issue are still available).

Swap should really be a last resort for our CI testing.  Using
cloud-based IO as memory is not going to be performant.  This change
proposes turning swappiness down low for all platforms after we set
up the swap space; this should ensure its availability for extreme
circumstances but not overuse.
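
A minimal sketch of what setting the value after swap setup could look
like follows. The function name set_swappiness, the DRY_RUN switch and
the default of 10 are assumptions for illustration only; this message
does not state the value chosen. vm.swappiness and "sysctl -w" are
the standard Linux interfaces.

```shell
#!/bin/bash
# Hypothetical target; the real value used by the change is not
# stated in this message.
SWAPPINESS_TARGET=${SWAPPINESS_TARGET:-10}

# Hypothetical helper: set vm.swappiness on the running kernel.
# With DRY_RUN=1 the command is only printed, not executed.
set_swappiness() {
    local target=${1:-$SWAPPINESS_TARGET}
    if [[ "${DRY_RUN:-0}" == "1" ]]; then
        echo "sudo sysctl -w vm.swappiness=${target}"
    else
        sudo sysctl -w "vm.swappiness=${target}"
    fi
}
```

Calling set_swappiness once the swap space is active would apply the
same value on every platform; the current value can be read back from
/proc/sys/vm/swappiness.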

I have related changes to enable better oversight of memory usage of
jobs in the works [4].

[1] http://paste.openstack.org/show/196769/
[2] https://etherpad.openstack.org/p/oom-in-rax-centos7-CI-job
[3] http://www.percona.com/blog/2014/04/28/oom-relation-vm-swappiness0-new-kernel/
[4] https://review.openstack.org/#/c/171919/

Change-Id: I09974be88cc590cf9ffbd16db58c19131c9532aa
Jenkins
2015-04-15 16:05:45 +00:00
committed by Gerrit Code Review
parent 165008ae17
commit 93b48c5204