integ/base/systemd/centos/patches/914-sd-event-update-state-at-the-end-in-event_source_ena.patch
Li Zhou ccfeeef59d systemd: Prevent excessive /proc/1/mountinfo reparsing
Backport the patches for this issue:
https://bugzilla.redhat.com/show_bug.cgi?id=1819868

We met such an issue:
When testing a large number of pods (> 230), occasionally observed a
number of issues related to systemd process:
    systemd ran continually 90-100% cpu usage
    systemd memory usage started increasing rapidly (20GB/hour)
    systemctl commands would always timeout (Failed to get properties:
        Connection timed out)
    sm services failed and can't recover: open-ldap,
        registry-token-server, docker-distribution, etcd
    new pods can't start, and got stuck in state ContainerCreating

Those patches work to prevent excessive /proc/1/mountinfo reparsing.
It has been verified that those patches can improve this performance
greatly.

16 commits are listed in sequence (from [1] to [16]) at below link
for the issue:
https://github.com/systemd-rhel/rhel-8/pull/154/commits

[16](10)core: prevent excessive /proc/self/mountinfo parsing
[15][Dropped-6]test: add ratelimiting test
[14](9)sd-event: add ability to ratelimit event sources
[13](8)sd-event: increase n_enabled_child_sources just once
[12](7)sd-event: update state at the end in event_source_enable
[11](6)sd-event: remove earliest_index/latest_index into common part of
event source objects
[10][Dropped-5]sd-event: follow coding style with naming return
parameter
[9] [Dropped-4]sd-event: ref event loop while in sd_event_prepare() ot
sd_event_run()
[8] (5)sd-event: refuse running default event loops in any other thread
than the one they are default for
[7] [Dropped-3]sd-event: let's suffix last_run/last_log with "_usec"
[6] [Dropped-2]sd-event: fix delays assert brain-o (#17790)
[5] (4)sd-event: split out code to add/remove timer event sources to
earliest/latest prioq
[4] (3)sd-event: split clock data allocation out of sd_event_add_time()
[3] [Dropped-1]sd-event: mention that two debug logged events are
ignored
[2] (2)sd-event: split out enable and disable codepaths from
sd_event_source_set_enabled()
[1] (1)sd-event: split out helper functions for reshuffling prioqs

I ported 10 of them back (from (1) to (10)) to fix this issue
and dropped the other 6 (from [Dropped-1] to [Dropped-6]) for those
reasons:
[Dropped-1]Only changes error log.
[Dropped-2]Fixes a bug introduced in a commit which doesn't exist in
this version.
[Dropped-3]Only changes vars' names and there is no functional change.
[Dropped-4]More commits are needed for merging it, while I don't see
any help on adding the rate-limiting ability.
[Dropped-5]Change coding style for a function which isn't really used
by anyone.
[Dropped-6]Add test cases.

Closes-Bug: #1924686
Signed-off-by: Li Zhou <li.zhou@windriver.com>
Change-Id: Ia4c8f162cb1a47b40d1b26cf4d604976b97e92d6
2021-04-22 22:09:33 -04:00

126 lines
4.0 KiB
Diff

From ad89da1e00919c510596dac78741c98052b1e2f7 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Zbigniew=20J=C4=99drzejewski-Szmek?= <zbyszek@in.waw.pl>
Date: Tue, 10 Nov 2020 10:38:37 +0100
Subject: [PATCH 14/20] sd-event: update state at the end in
event_source_enable
Coverity in CID#1435966 was complaining that s->enabled is not "restored" in
all cases. But the code was actually correct, since it should only be
"restored" in the error paths. But let's still make this prettier by not setting
the state before all operations that may fail are done.
We need to set .enabled for the prioq reshuffling operations, so move those down.
No functional change intended.
(cherry picked from commit d2eafe61ca07f8300dc741a0491a914213fa2b6b)
Related: #1819868
[commit deb9e6ad3a1d7cfbc3b53d1e74cda6ae398a90fd from
https://github.com/systemd-rhel/rhel-8/]
Signed-off-by: Li Zhou <li.zhou@windriver.com>
---
src/libsystemd/sd-event/sd-event.c | 51 +++++++++++++++++-------------
1 file changed, 29 insertions(+), 22 deletions(-)
diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c
index 82cb9ad..3ff15a2 100644
--- a/src/libsystemd/sd-event/sd-event.c
+++ b/src/libsystemd/sd-event/sd-event.c
@@ -1691,11 +1691,11 @@ static int event_source_disable(sd_event_source *s) {
return 0;
}
-static int event_source_enable(sd_event_source *s, int m) {
+static int event_source_enable(sd_event_source *s, int enable) {
int r;
assert(s);
- assert(IN_SET(m, SD_EVENT_ON, SD_EVENT_ONESHOT));
+ assert(IN_SET(enable, SD_EVENT_ON, SD_EVENT_ONESHOT));
assert(s->enabled == SD_EVENT_OFF);
/* Unset the pending flag when this event source is enabled */
@@ -1705,31 +1705,16 @@ static int event_source_enable(sd_event_source *s, int m) {
return r;
}
- s->enabled = m;
-
switch (s->type) {
-
case SOURCE_IO:
- r = source_io_register(s, m, s->io.events);
- if (r < 0) {
- s->enabled = SD_EVENT_OFF;
+ r = source_io_register(s, enable, s->io.events);
+ if (r < 0)
return r;
- }
-
- break;
-
- case SOURCE_TIME_REALTIME:
- case SOURCE_TIME_BOOTTIME:
- case SOURCE_TIME_MONOTONIC:
- case SOURCE_TIME_REALTIME_ALARM:
- case SOURCE_TIME_BOOTTIME_ALARM:
- event_source_time_prioq_reshuffle(s);
break;
case SOURCE_SIGNAL:
r = event_make_signal_data(s->event, s->signal.sig, NULL);
if (r < 0) {
- s->enabled = SD_EVENT_OFF;
event_gc_signal_data(s->event, &s->priority, s->signal.sig);
return r;
}
@@ -1750,10 +1735,12 @@ static int event_source_enable(sd_event_source *s, int m) {
break;
+ case SOURCE_TIME_REALTIME:
+ case SOURCE_TIME_BOOTTIME:
+ case SOURCE_TIME_MONOTONIC:
+ case SOURCE_TIME_REALTIME_ALARM:
+ case SOURCE_TIME_BOOTTIME_ALARM:
case SOURCE_EXIT:
- prioq_reshuffle(s->event->exit, s, &s->exit.prioq_index);
- break;
-
case SOURCE_DEFER:
case SOURCE_POST:
break;
@@ -1762,6 +1749,26 @@ static int event_source_enable(sd_event_source *s, int m) {
assert_not_reached("Wut? I shouldn't exist.");
}
+ s->enabled = enable;
+
+ /* Non-failing operations below */
+ switch (s->type) {
+ case SOURCE_TIME_REALTIME:
+ case SOURCE_TIME_BOOTTIME:
+ case SOURCE_TIME_MONOTONIC:
+ case SOURCE_TIME_REALTIME_ALARM:
+ case SOURCE_TIME_BOOTTIME_ALARM:
+ event_source_time_prioq_reshuffle(s);
+ break;
+
+ case SOURCE_EXIT:
+ prioq_reshuffle(s->event->exit, s, &s->exit.prioq_index);
+ break;
+
+ default:
+ break;
+ }
+
return 0;
}
--
2.17.1