systemd: Prevent excessive /proc/1/mountinfo reparsing

Backport the patches for this issue:
https://bugzilla.redhat.com/show_bug.cgi?id=1819868

We met such an issue:
When testing a large number of pods (> 230), occasionally observed a
number of issues related to systemd process:
    systemd ran continually 90-100% cpu usage
    systemd memory usage started increasing rapidly (20GB/hour)
    systemctl commands would always timeout (Failed to get properties:
        Connection timed out)
    sm services failed and can't recover: open-ldap,
        registry-token-server, docker-distribution, etcd
    new pods can't start, and got stuck in state ContainerCreating

Those patches work to prevent excessive /proc/1/mountinfo reparsing.
It has been verified that those patches can improve this performance
greatly.

16 commits are listed in sequence (from [1] to [16]) at below link
for the issue:
https://github.com/systemd-rhel/rhel-8/pull/154/commits

[16](10)core: prevent excessive /proc/self/mountinfo parsing
[15][Dropped-6]test: add ratelimiting test
[14](9)sd-event: add ability to ratelimit event sources
[13](8)sd-event: increase n_enabled_child_sources just once
[12](7)sd-event: update state at the end in event_source_enable
[11](6)sd-event: remove earliest_index/latest_index into common part of
event source objects
[10][Dropped-5]sd-event: follow coding style with naming return
parameter
[9] [Dropped-4]sd-event: ref event loop while in sd_event_prepare() ot
sd_event_run()
[8] (5)sd-event: refuse running default event loops in any other thread
than the one they are default for
[7] [Dropped-3]sd-event: let's suffix last_run/last_log with "_usec"
[6] [Dropped-2]sd-event: fix delays assert brain-o (#17790)
[5] (4)sd-event: split out code to add/remove timer event sources to
earliest/latest prioq
[4] (3)sd-event: split clock data allocation out of sd_event_add_time()
[3] [Dropped-1]sd-event: mention that two debug logged events are
ignored
[2] (2)sd-event: split out enable and disable codepaths from
sd_event_source_set_enabled()
[1] (1)sd-event: split out helper functions for reshuffling prioqs

I ported 10 of them back (from (1) to (10)) to fix this issue
and dropped the other 6 (from [Dropped-1] to [Dropped-6]) for those
reasons:
[Dropped-1]Only changes error log.
[Dropped-2]Fixes a bug introduced in a commit which doesn't exist in
this version.
[Dropped-3]Only changes vars' names and there is no functional change.
[Dropped-4]More commits are needed for merging it, while I don't see
any help on adding the rate-limiting ability.
[Dropped-5]Change coding style for a function which isn't really used
by anyone.
[Dropped-6]Add test cases.

Closes-Bug: #1924686
Signed-off-by: Li Zhou <li.zhou@windriver.com>
Change-Id: Ia4c8f162cb1a47b40d1b26cf4d604976b97e92d6
This commit is contained in:
Li Zhou 2021-04-12 02:15:25 -04:00
parent 43ffd243ca
commit ccfeeef59d
22 changed files with 3445 additions and 12 deletions

View File

@ -1,21 +1,19 @@
From 3c0e59a677c921f60f27002a27eb5f4776475e44 Mon Sep 17 00:00:00 2001 From 91090adc8d4c774796d36f7563eea224569a9b0f Mon Sep 17 00:00:00 2001
Message-Id: <3c0e59a677c921f60f27002a27eb5f4776475e44.1574265913.git.Jim.Somerville@windriver.com> From: Li Zhou <li.zhou@windriver.com>
In-Reply-To: <eeb3e979288cb8c14d8546d12a27da4c88fbb0e4.1574265913.git.Jim.Somerville@windriver.com> Date: Wed, 21 Apr 2021 13:59:22 +0800
References: <eeb3e979288cb8c14d8546d12a27da4c88fbb0e4.1574265913.git.Jim.Somerville@windriver.com> Subject: [PATCH] Add STX patches
From: Jim Somerville <Jim.Somerville@windriver.com>
Date: Wed, 20 Nov 2019 10:59:45 -0500
Subject: [PATCH 3/3] Add STX patches
Signed-off-by: Jim Somerville <Jim.Somerville@windriver.com> Signed-off-by: Jim Somerville <Jim.Somerville@windriver.com>
Signed-off-by: Li Zhou <li.zhou@windriver.com>
--- ---
SPECS/systemd.spec | 5 +++++ SPECS/systemd.spec | 58 ++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 5 insertions(+) 1 file changed, 58 insertions(+)
diff --git a/SPECS/systemd.spec b/SPECS/systemd.spec diff --git a/SPECS/systemd.spec b/SPECS/systemd.spec
index 4c83150..e1e98bb 100644 index 4c83150..72de7d6 100644
--- a/SPECS/systemd.spec --- a/SPECS/systemd.spec
+++ b/SPECS/systemd.spec +++ b/SPECS/systemd.spec
@@ -786,6 +786,11 @@ Patch0744: 0744-selinux-don-t-log-SELINUX_INFO-and-SELINUX_WARNING-m.patch @@ -786,6 +786,64 @@ Patch0744: 0744-selinux-don-t-log-SELINUX_INFO-and-SELINUX_WARNING-m.patch
Patch0745: 0745-fix-mis-merge.patch Patch0745: 0745-fix-mis-merge.patch
Patch0746: 0746-fs-util-chase_symlinks-prevent-double-free.patch Patch0746: 0746-fs-util-chase_symlinks-prevent-double-free.patch
@ -23,10 +21,63 @@ index 4c83150..e1e98bb 100644
+Patch0801: 801-inject-millisec-in-syslog-date.patch +Patch0801: 801-inject-millisec-in-syslog-date.patch
+Patch0802: 802-fix-build-error-for-unused-variable.patch +Patch0802: 802-fix-build-error-for-unused-variable.patch
+Patch0803: 803-Fix-compile-failure-due-to-deprecated-value.patch +Patch0803: 803-Fix-compile-failure-due-to-deprecated-value.patch
+
+# This cluster of patches relates to fixing redhat bug #1819868
+# "systemd excessively reads mountinfo and udev in dense container environments"
+
+# Below patches are added for merging patch (1)
+Patch0901: 901-sd-event-don-t-touch-fd-s-accross-forks.patch
+Patch0902: 902-sd-event-make-sure-RT-signals-are-not-dropped.patch
+# Patch (1) for solving #1819868
+Patch0903: 903-sd-event-split-out-helper-functions-for-reshuffling-.patch
+
+# Below patches are added for merging patch (2)
+Patch0904: 904-sd-event-drop-pending-events-when-we-turn-off-on-an-.patch
+Patch0905: 905-sd-event-fix-call-to-event_make_signal_data.patch
+Patch0906: 906-sd-event-make-sure-to-create-a-signal-queue-for-the-.patch
+# Patch (2) for solving #1819868
+Patch0907: 907-sd-event-split-out-enable-and-disable-codepaths-from.patch
+
+# Below patch is added for merging patch (3)
+Patch0908: 908-sd-event-use-prioq_ensure_allocated-where-possible.patch
+# Patch (3) for solving #1819868
+Patch0909: 909-sd-event-split-clock-data-allocation-out-of-sd_event.patch
+
+# Patch (4) for solving #1819868
+Patch0910: 910-sd-event-split-out-code-to-add-remove-timer-event-so.patch
+
+# Below patch is added for merging patch (5)
+Patch0911: 911-sd-event-rename-PASSIVE-PREPARED-to-INITIAL-ARMED.patch
+# Patch (5) for solving #1819868
+Patch0912: 912-sd-event-refuse-running-default-event-loops-in-any-o.patch
+
+# Patch (6) for solving #1819868
+Patch0913: 913-sd-event-remove-earliest_index-latest_index-into-com.patch
+
+# Patch (7) for solving #1819868
+Patch0914: 914-sd-event-update-state-at-the-end-in-event_source_ena.patch
+
+# Patch (8) for solving #1819868
+Patch0915: 915-sd-event-increase-n_enabled_child_sources-just-once.patch
+
+# Below patches are added for merging patch (9)
+Patch0916: 916-sd-event-don-t-provide-priority-stability.patch
+Patch0917: 917-sd-event-when-determining-the-last-allowed-time-a-ti.patch
+Patch0918: 918-sd-event-permit-a-USEC_INFINITY-timeout-as-an-altern.patch
+# Patch (9) for solving #1819868
+Patch0919: 919-sd-event-add-ability-to-ratelimit-event-sources.patch
+
+# Patch (10) for solving #1819868
+Patch0920: 920-core-prevent-excessive-proc-self-mountinfo-parsing.patch
+
+# This patch fixes build issues related to the above patches. Our goal is to keep
+# upstream patches as unmodified as possible to facilitate maintaining them, so instead
+# of individually changing them for compilation, we just have one patch at the end to do it.
+Patch0921: 921-systemd-Fix-compiling-errors-when-merging-1819868.patch
+ +
Patch9999: 9999-Update-kernel-install-script-by-backporting-fedora-p.patch Patch9999: 9999-Update-kernel-install-script-by-backporting-fedora-p.patch
%global num_patches %{lua: c=0; for i,p in ipairs(patches) do c=c+1; end; print(c);} %global num_patches %{lua: c=0; for i,p in ipairs(patches) do c=c+1; end; print(c);}
-- --
1.8.3.1 2.17.1

View File

@ -0,0 +1,54 @@
From 5de71cb7d887a569bfb987efdceda493338990bf Mon Sep 17 00:00:00 2001
From: Tom Gundersen <teg@jklm.no>
Date: Thu, 4 Jun 2015 16:54:45 +0200
Subject: [PATCH 01/20] sd-event: don't touch fd's accross forks
We protect most of the API from use accross forks, but we still allow both
sd_event and sd_event_source objects to be unref'ed. This would cause
problems as it would unregister sources from the underlying eventfd, hence
also affecting the original instance in the parent process.
This fixes the issue by not touching the fds on unref when done accross a fork,
but still free the memory.
This fixes a regression introduced by
"udevd: move main-loop to sd-event": 693d371d30fee
where the worker processes were disabling the inotify event source in the
main daemon.
[commit f68067348f58cd08d8f4f5325ce22f9a9d2c2140 from
https://github.com/systemd-rhel/rhel-8/]
Signed-off-by: Li Zhou <li.zhou@windriver.com>
---
src/libsystemd/sd-event/sd-event.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c
index 9d48e5a..a84bfbb 100644
--- a/src/libsystemd/sd-event/sd-event.c
+++ b/src/libsystemd/sd-event/sd-event.c
@@ -474,6 +474,9 @@ static int source_io_unregister(sd_event_source *s) {
assert(s);
assert(s->type == SOURCE_IO);
+ if (event_pid_changed(s->event))
+ return 0;
+
if (!s->io.registered)
return 0;
@@ -604,6 +607,9 @@ static int event_update_signal_fd(sd_event *e) {
assert(e);
+ if (event_pid_changed(e))
+ return 0;
+
add_to_epoll = e->signal_fd < 0;
r = signalfd(e->signal_fd, &e->sigset, SFD_NONBLOCK|SFD_CLOEXEC);
--
2.17.1

View File

@ -0,0 +1,815 @@
From 2976f3b959bef0e6f0a1f4d55d998c5d60e56b0d Mon Sep 17 00:00:00 2001
From: Lennart Poettering <lennart@poettering.net>
Date: Thu, 3 Sep 2015 20:13:09 +0200
Subject: [PATCH 02/20] sd-event: make sure RT signals are not dropped
RT signals operate in a queue, and we should be careful to never merge
two queued signals into one. Hence, makes sure we only ever dequeue a
single signal at a time and leave the remaining ones queued in the
signalfd. In order to implement correct priorities for the signals
introduce one signalfd per priority, so that we only process the highest
priority signal at a time.
[commit 9da4cb2be260ed123f2676cb85cb350c527b1492 from
https://github.com/systemd-rhel/rhel-8/]
Signed-off-by: Li Zhou <li.zhou@windriver.com>
---
src/libsystemd/sd-event/sd-event.c | 430 ++++++++++++++++++---------
src/libsystemd/sd-event/test-event.c | 66 +++-
2 files changed, 357 insertions(+), 139 deletions(-)
diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c
index a84bfbb..26ef3ea 100644
--- a/src/libsystemd/sd-event/sd-event.c
+++ b/src/libsystemd/sd-event/sd-event.c
@@ -56,9 +56,22 @@ typedef enum EventSourceType {
_SOURCE_EVENT_SOURCE_TYPE_INVALID = -1
} EventSourceType;
+/* All objects we use in epoll events start with this value, so that
+ * we know how to dispatch it */
+typedef enum WakeupType {
+ WAKEUP_NONE,
+ WAKEUP_EVENT_SOURCE,
+ WAKEUP_CLOCK_DATA,
+ WAKEUP_SIGNAL_DATA,
+ _WAKEUP_TYPE_MAX,
+ _WAKEUP_TYPE_INVALID = -1,
+} WakeupType;
+
#define EVENT_SOURCE_IS_TIME(t) IN_SET((t), SOURCE_TIME_REALTIME, SOURCE_TIME_BOOTTIME, SOURCE_TIME_MONOTONIC, SOURCE_TIME_REALTIME_ALARM, SOURCE_TIME_BOOTTIME_ALARM)
struct sd_event_source {
+ WakeupType wakeup;
+
unsigned n_ref;
sd_event *event;
@@ -120,6 +133,7 @@ struct sd_event_source {
};
struct clock_data {
+ WakeupType wakeup;
int fd;
/* For all clocks we maintain two priority queues each, one
@@ -136,11 +150,23 @@ struct clock_data {
bool needs_rearm:1;
};
+struct signal_data {
+ WakeupType wakeup;
+
+ /* For each priority we maintain one signal fd, so that we
+ * only have to dequeue a single event per priority at a
+ * time. */
+
+ int fd;
+ int64_t priority;
+ sigset_t sigset;
+ sd_event_source *current;
+};
+
struct sd_event {
unsigned n_ref;
int epoll_fd;
- int signal_fd;
int watchdog_fd;
Prioq *pending;
@@ -157,8 +183,8 @@ struct sd_event {
usec_t perturb;
- sigset_t sigset;
- sd_event_source **signal_sources;
+ sd_event_source **signal_sources; /* indexed by signal number */
+ Hashmap *signal_data; /* indexed by priority */
Hashmap *child_sources;
unsigned n_enabled_child_sources;
@@ -355,6 +381,7 @@ static int exit_prioq_compare(const void *a, const void *b) {
static void free_clock_data(struct clock_data *d) {
assert(d);
+ assert(d->wakeup == WAKEUP_CLOCK_DATA);
safe_close(d->fd);
prioq_free(d->earliest);
@@ -378,7 +405,6 @@ static void event_free(sd_event *e) {
*(e->default_event_ptr) = NULL;
safe_close(e->epoll_fd);
- safe_close(e->signal_fd);
safe_close(e->watchdog_fd);
free_clock_data(&e->realtime);
@@ -392,6 +418,7 @@ static void event_free(sd_event *e) {
prioq_free(e->exit);
free(e->signal_sources);
+ hashmap_free(e->signal_data);
hashmap_free(e->child_sources);
set_free(e->post_sources);
@@ -409,13 +436,12 @@ _public_ int sd_event_new(sd_event** ret) {
return -ENOMEM;
e->n_ref = 1;
- e->signal_fd = e->watchdog_fd = e->epoll_fd = e->realtime.fd = e->boottime.fd = e->monotonic.fd = e->realtime_alarm.fd = e->boottime_alarm.fd = -1;
+ e->watchdog_fd = e->epoll_fd = e->realtime.fd = e->boottime.fd = e->monotonic.fd = e->realtime_alarm.fd = e->boottime_alarm.fd = -1;
e->realtime.next = e->boottime.next = e->monotonic.next = e->realtime_alarm.next = e->boottime_alarm.next = USEC_INFINITY;
+ e->realtime.wakeup = e->boottime.wakeup = e->monotonic.wakeup = e->realtime_alarm.wakeup = e->boottime_alarm.wakeup = WAKEUP_CLOCK_DATA;
e->original_pid = getpid();
e->perturb = USEC_INFINITY;
- assert_se(sigemptyset(&e->sigset) == 0);
-
e->pending = prioq_new(pending_prioq_compare);
if (!e->pending) {
r = -ENOMEM;
@@ -510,7 +536,6 @@ static int source_io_register(
r = epoll_ctl(s->event->epoll_fd, EPOLL_CTL_MOD, s->io.fd, &ev);
else
r = epoll_ctl(s->event->epoll_fd, EPOLL_CTL_ADD, s->io.fd, &ev);
-
if (r < 0)
return -errno;
@@ -592,45 +617,171 @@ static struct clock_data* event_get_clock_data(sd_event *e, EventSourceType t) {
}
}
-static bool need_signal(sd_event *e, int signal) {
- return (e->signal_sources && e->signal_sources[signal] &&
- e->signal_sources[signal]->enabled != SD_EVENT_OFF)
- ||
- (signal == SIGCHLD &&
- e->n_enabled_child_sources > 0);
-}
+static int event_make_signal_data(
+ sd_event *e,
+ int sig,
+ struct signal_data **ret) {
-static int event_update_signal_fd(sd_event *e) {
struct epoll_event ev = {};
- bool add_to_epoll;
+ struct signal_data *d;
+ bool added = false;
+ sigset_t ss_copy;
+ int64_t priority;
int r;
assert(e);
if (event_pid_changed(e))
- return 0;
+ return -ECHILD;
- add_to_epoll = e->signal_fd < 0;
+ if (e->signal_sources && e->signal_sources[sig])
+ priority = e->signal_sources[sig]->priority;
+ else
+ priority = 0;
- r = signalfd(e->signal_fd, &e->sigset, SFD_NONBLOCK|SFD_CLOEXEC);
- if (r < 0)
- return -errno;
+ d = hashmap_get(e->signal_data, &priority);
+ if (d) {
+ if (sigismember(&d->sigset, sig) > 0) {
+ if (ret)
+ *ret = d;
+ return 0;
+ }
+ } else {
+ r = hashmap_ensure_allocated(&e->signal_data, &uint64_hash_ops);
+ if (r < 0)
+ return r;
+
+ d = new0(struct signal_data, 1);
+ if (!d)
+ return -ENOMEM;
+
+ d->wakeup = WAKEUP_SIGNAL_DATA;
+ d->fd = -1;
+ d->priority = priority;
+
+ r = hashmap_put(e->signal_data, &d->priority, d);
+ if (r < 0)
+ return r;
- e->signal_fd = r;
+ added = true;
+ }
+
+ ss_copy = d->sigset;
+ assert_se(sigaddset(&ss_copy, sig) >= 0);
+
+ r = signalfd(d->fd, &ss_copy, SFD_NONBLOCK|SFD_CLOEXEC);
+ if (r < 0) {
+ r = -errno;
+ goto fail;
+ }
+
+ d->sigset = ss_copy;
- if (!add_to_epoll)
+ if (d->fd >= 0) {
+ if (ret)
+ *ret = d;
return 0;
+ }
+
+ d->fd = r;
ev.events = EPOLLIN;
- ev.data.ptr = INT_TO_PTR(SOURCE_SIGNAL);
+ ev.data.ptr = d;
- r = epoll_ctl(e->epoll_fd, EPOLL_CTL_ADD, e->signal_fd, &ev);
- if (r < 0) {
- e->signal_fd = safe_close(e->signal_fd);
- return -errno;
+ r = epoll_ctl(e->epoll_fd, EPOLL_CTL_ADD, d->fd, &ev);
+ if (r < 0) {
+ r = -errno;
+ goto fail;
}
+ if (ret)
+ *ret = d;
+
return 0;
+
+fail:
+ if (added) {
+ d->fd = safe_close(d->fd);
+ hashmap_remove(e->signal_data, &d->priority);
+ free(d);
+ }
+
+ return r;
+}
+
+static void event_unmask_signal_data(sd_event *e, struct signal_data *d, int sig) {
+ assert(e);
+ assert(d);
+
+ /* Turns off the specified signal in the signal data
+ * object. If the signal mask of the object becomes empty that
+ * way removes it. */
+
+ if (sigismember(&d->sigset, sig) == 0)
+ return;
+
+ assert_se(sigdelset(&d->sigset, sig) >= 0);
+
+ if (sigisemptyset(&d->sigset)) {
+
+ /* If all the mask is all-zero we can get rid of the structure */
+ hashmap_remove(e->signal_data, &d->priority);
+ assert(!d->current);
+ safe_close(d->fd);
+ free(d);
+ return;
+ }
+
+ assert(d->fd >= 0);
+
+ if (signalfd(d->fd, &d->sigset, SFD_NONBLOCK|SFD_CLOEXEC) < 0)
+ log_debug_errno(errno, "Failed to unset signal bit, ignoring: %m");
+}
+
+static void event_gc_signal_data(sd_event *e, const int64_t *priority, int sig) {
+ struct signal_data *d;
+ static const int64_t zero_priority = 0;
+
+ assert(e);
+
+ /* Rechecks if the specified signal is still something we are
+ * interested in. If not, we'll unmask it, and possibly drop
+ * the signalfd for it. */
+
+ if (sig == SIGCHLD &&
+ e->n_enabled_child_sources > 0)
+ return;
+
+ if (e->signal_sources &&
+ e->signal_sources[sig] &&
+ e->signal_sources[sig]->enabled != SD_EVENT_OFF)
+ return;
+
+ /*
+ * The specified signal might be enabled in three different queues:
+ *
+ * 1) the one that belongs to the priority passed (if it is non-NULL)
+ * 2) the one that belongs to the priority of the event source of the signal (if there is one)
+ * 3) the 0 priority (to cover the SIGCHLD case)
+ *
+ * Hence, let's remove it from all three here.
+ */
+
+ if (priority) {
+ d = hashmap_get(e->signal_data, priority);
+ if (d)
+ event_unmask_signal_data(e, d, sig);
+ }
+
+ if (e->signal_sources && e->signal_sources[sig]) {
+ d = hashmap_get(e->signal_data, &e->signal_sources[sig]->priority);
+ if (d)
+ event_unmask_signal_data(e, d, sig);
+ }
+
+ d = hashmap_get(e->signal_data, &zero_priority);
+ if (d)
+ event_unmask_signal_data(e, d, sig);
}
static void source_disconnect(sd_event_source *s) {
@@ -669,17 +820,11 @@ static void source_disconnect(sd_event_source *s) {
case SOURCE_SIGNAL:
if (s->signal.sig > 0) {
+
if (s->event->signal_sources)
s->event->signal_sources[s->signal.sig] = NULL;
- /* If the signal was on and now it is off... */
- if (s->enabled != SD_EVENT_OFF && !need_signal(s->event, s->signal.sig)) {
- assert_se(sigdelset(&s->event->sigset, s->signal.sig) == 0);
-
- (void) event_update_signal_fd(s->event);
- /* If disabling failed, we might get a spurious event,
- * but otherwise nothing bad should happen. */
- }
+ event_gc_signal_data(s->event, &s->priority, s->signal.sig);
}
break;
@@ -689,18 +834,10 @@ static void source_disconnect(sd_event_source *s) {
if (s->enabled != SD_EVENT_OFF) {
assert(s->event->n_enabled_child_sources > 0);
s->event->n_enabled_child_sources--;
-
- /* We know the signal was on, if it is off now... */
- if (!need_signal(s->event, SIGCHLD)) {
- assert_se(sigdelset(&s->event->sigset, SIGCHLD) == 0);
-
- (void) event_update_signal_fd(s->event);
- /* If disabling failed, we might get a spurious event,
- * but otherwise nothing bad should happen. */
- }
}
- hashmap_remove(s->event->child_sources, INT_TO_PTR(s->child.pid));
+ (void) hashmap_remove(s->event->child_sources, INT_TO_PTR(s->child.pid));
+ event_gc_signal_data(s->event, &s->priority, SIGCHLD);
}
break;
@@ -779,6 +916,14 @@ static int source_set_pending(sd_event_source *s, bool b) {
d->needs_rearm = true;
}
+ if (s->type == SOURCE_SIGNAL && !b) {
+ struct signal_data *d;
+
+ d = hashmap_get(s->event->signal_data, &s->priority);
+ if (d && d->current == s)
+ d->current = NULL;
+ }
+
return 0;
}
@@ -828,6 +973,7 @@ _public_ int sd_event_add_io(
if (!s)
return -ENOMEM;
+ s->wakeup = WAKEUP_EVENT_SOURCE;
s->io.fd = fd;
s->io.events = events;
s->io.callback = callback;
@@ -884,7 +1030,7 @@ static int event_setup_timer_fd(
return -errno;
ev.events = EPOLLIN;
- ev.data.ptr = INT_TO_PTR(clock_to_event_source_type(clock));
+ ev.data.ptr = d;
r = epoll_ctl(e->epoll_fd, EPOLL_CTL_ADD, fd, &ev);
if (r < 0) {
@@ -994,9 +1140,9 @@ _public_ int sd_event_add_signal(
void *userdata) {
sd_event_source *s;
+ struct signal_data *d;
sigset_t ss;
int r;
- bool previous;
assert_return(e, -EINVAL);
assert_return(sig > 0, -EINVAL);
@@ -1021,8 +1167,6 @@ _public_ int sd_event_add_signal(
} else if (e->signal_sources[sig])
return -EBUSY;
- previous = need_signal(e, sig);
-
s = source_new(e, !ret, SOURCE_SIGNAL);
if (!s)
return -ENOMEM;
@@ -1034,14 +1178,10 @@ _public_ int sd_event_add_signal(
e->signal_sources[sig] = s;
- if (!previous) {
- assert_se(sigaddset(&e->sigset, sig) == 0);
-
- r = event_update_signal_fd(e);
- if (r < 0) {
- source_free(s);
- return r;
- }
+ r = event_make_signal_data(e, sig, &d);
+ if (r < 0) {
+ source_free(s);
+ return r;
}
/* Use the signal name as description for the event source by default */
@@ -1063,7 +1203,6 @@ _public_ int sd_event_add_child(
sd_event_source *s;
int r;
- bool previous;
assert_return(e, -EINVAL);
assert_return(pid > 1, -EINVAL);
@@ -1080,8 +1219,6 @@ _public_ int sd_event_add_child(
if (hashmap_contains(e->child_sources, INT_TO_PTR(pid)))
return -EBUSY;
- previous = need_signal(e, SIGCHLD);
-
s = source_new(e, !ret, SOURCE_CHILD);
if (!s)
return -ENOMEM;
@@ -1100,14 +1237,11 @@ _public_ int sd_event_add_child(
e->n_enabled_child_sources ++;
- if (!previous) {
- assert_se(sigaddset(&e->sigset, SIGCHLD) == 0);
-
- r = event_update_signal_fd(e);
- if (r < 0) {
- source_free(s);
- return r;
- }
+ r = event_make_signal_data(e, SIGCHLD, NULL);
+ if (r < 0) {
+ e->n_enabled_child_sources--;
+ source_free(s);
+ return r;
}
e->need_process_child = true;
@@ -1407,6 +1541,8 @@ _public_ int sd_event_source_get_priority(sd_event_source *s, int64_t *priority)
}
_public_ int sd_event_source_set_priority(sd_event_source *s, int64_t priority) {
+ int r;
+
assert_return(s, -EINVAL);
assert_return(s->event->state != SD_EVENT_FINISHED, -ESTALE);
assert_return(!event_pid_changed(s->event), -ECHILD);
@@ -1414,7 +1550,25 @@ _public_ int sd_event_source_set_priority(sd_event_source *s, int64_t priority)
if (s->priority == priority)
return 0;
- s->priority = priority;
+ if (s->type == SOURCE_SIGNAL && s->enabled != SD_EVENT_OFF) {
+ struct signal_data *old, *d;
+
+ /* Move us from the signalfd belonging to the old
+ * priority to the signalfd of the new priority */
+
+ assert_se(old = hashmap_get(s->event->signal_data, &s->priority));
+
+ s->priority = priority;
+
+ r = event_make_signal_data(s->event, s->signal.sig, &d);
+ if (r < 0) {
+ s->priority = old->priority;
+ return r;
+ }
+
+ event_unmask_signal_data(s->event, old, s->signal.sig);
+ } else
+ s->priority = priority;
if (s->pending)
prioq_reshuffle(s->event->pending, s, &s->pending_index);
@@ -1482,34 +1636,18 @@ _public_ int sd_event_source_set_enabled(sd_event_source *s, int m) {
}
case SOURCE_SIGNAL:
- assert(need_signal(s->event, s->signal.sig));
-
s->enabled = m;
- if (!need_signal(s->event, s->signal.sig)) {
- assert_se(sigdelset(&s->event->sigset, s->signal.sig) == 0);
-
- (void) event_update_signal_fd(s->event);
- /* If disabling failed, we might get a spurious event,
- * but otherwise nothing bad should happen. */
- }
-
+ event_gc_signal_data(s->event, &s->priority, s->signal.sig);
break;
case SOURCE_CHILD:
- assert(need_signal(s->event, SIGCHLD));
-
s->enabled = m;
assert(s->event->n_enabled_child_sources > 0);
s->event->n_enabled_child_sources--;
- if (!need_signal(s->event, SIGCHLD)) {
- assert_se(sigdelset(&s->event->sigset, SIGCHLD) == 0);
-
- (void) event_update_signal_fd(s->event);
- }
-
+ event_gc_signal_data(s->event, &s->priority, SIGCHLD);
break;
case SOURCE_EXIT:
@@ -1555,37 +1693,33 @@ _public_ int sd_event_source_set_enabled(sd_event_source *s, int m) {
}
case SOURCE_SIGNAL:
- /* Check status before enabling. */
- if (!need_signal(s->event, s->signal.sig)) {
- assert_se(sigaddset(&s->event->sigset, s->signal.sig) == 0);
-
- r = event_update_signal_fd(s->event);
- if (r < 0) {
- s->enabled = SD_EVENT_OFF;
- return r;
- }
- }
s->enabled = m;
+
+ r = event_make_signal_data(s->event, s->signal.sig, NULL);
+ if (r < 0) {
+ s->enabled = SD_EVENT_OFF;
+ event_gc_signal_data(s->event, &s->priority, s->signal.sig);
+ return r;
+ }
+
break;
case SOURCE_CHILD:
- /* Check status before enabling. */
- if (s->enabled == SD_EVENT_OFF) {
- if (!need_signal(s->event, SIGCHLD)) {
- assert_se(sigaddset(&s->event->sigset, s->signal.sig) == 0);
-
- r = event_update_signal_fd(s->event);
- if (r < 0) {
- s->enabled = SD_EVENT_OFF;
- return r;
- }
- }
+ if (s->enabled == SD_EVENT_OFF)
s->event->n_enabled_child_sources++;
- }
s->enabled = m;
+
+ r = event_make_signal_data(s->event, s->signal.sig, SIGCHLD);
+ if (r < 0) {
+ s->enabled = SD_EVENT_OFF;
+ s->event->n_enabled_child_sources--;
+ event_gc_signal_data(s->event, &s->priority, SIGCHLD);
+ return r;
+ }
+
break;
case SOURCE_EXIT:
@@ -2029,20 +2163,35 @@ static int process_child(sd_event *e) {
return 0;
}
-static int process_signal(sd_event *e, uint32_t events) {
+static int process_signal(sd_event *e, struct signal_data *d, uint32_t events) {
bool read_one = false;
int r;
assert(e);
-
assert_return(events == EPOLLIN, -EIO);
+ /* If there's a signal queued on this priority and SIGCHLD is
+ on this priority too, then make sure to recheck the
+ children we watch. This is because we only ever dequeue
+ the first signal per priority, and if we dequeue one, and
+ SIGCHLD might be enqueued later we wouldn't know, but we
+ might have higher priority children we care about hence we
+ need to check that explicitly. */
+
+ if (sigismember(&d->sigset, SIGCHLD))
+ e->need_process_child = true;
+
+ /* If there's already an event source pending for this
+ * priority we don't read another */
+ if (d->current)
+ return 0;
+
for (;;) {
struct signalfd_siginfo si;
ssize_t n;
sd_event_source *s = NULL;
- n = read(e->signal_fd, &si, sizeof(si));
+ n = read(d->fd, &si, sizeof(si));
if (n < 0) {
if (errno == EAGAIN || errno == EINTR)
return read_one;
@@ -2057,24 +2206,21 @@ static int process_signal(sd_event *e, uint32_t events) {
read_one = true;
- if (si.ssi_signo == SIGCHLD) {
- r = process_child(e);
- if (r < 0)
- return r;
- if (r > 0)
- continue;
- }
-
if (e->signal_sources)
s = e->signal_sources[si.ssi_signo];
-
if (!s)
continue;
+ if (s->pending)
+ continue;
s->signal.siginfo = si;
+ d->current = s;
+
r = source_set_pending(s, true);
if (r < 0)
return r;
+
+ return 1;
}
}
@@ -2393,23 +2539,31 @@ _public_ int sd_event_wait(sd_event *e, uint64_t timeout) {
for (i = 0; i < m; i++) {
- if (ev_queue[i].data.ptr == INT_TO_PTR(SOURCE_TIME_REALTIME))
- r = flush_timer(e, e->realtime.fd, ev_queue[i].events, &e->realtime.next);
- else if (ev_queue[i].data.ptr == INT_TO_PTR(SOURCE_TIME_BOOTTIME))
- r = flush_timer(e, e->boottime.fd, ev_queue[i].events, &e->boottime.next);
- else if (ev_queue[i].data.ptr == INT_TO_PTR(SOURCE_TIME_MONOTONIC))
- r = flush_timer(e, e->monotonic.fd, ev_queue[i].events, &e->monotonic.next);
- else if (ev_queue[i].data.ptr == INT_TO_PTR(SOURCE_TIME_REALTIME_ALARM))
- r = flush_timer(e, e->realtime_alarm.fd, ev_queue[i].events, &e->realtime_alarm.next);
- else if (ev_queue[i].data.ptr == INT_TO_PTR(SOURCE_TIME_BOOTTIME_ALARM))
- r = flush_timer(e, e->boottime_alarm.fd, ev_queue[i].events, &e->boottime_alarm.next);
- else if (ev_queue[i].data.ptr == INT_TO_PTR(SOURCE_SIGNAL))
- r = process_signal(e, ev_queue[i].events);
- else if (ev_queue[i].data.ptr == INT_TO_PTR(SOURCE_WATCHDOG))
+ if (ev_queue[i].data.ptr == INT_TO_PTR(SOURCE_WATCHDOG))
r = flush_timer(e, e->watchdog_fd, ev_queue[i].events, NULL);
- else
- r = process_io(e, ev_queue[i].data.ptr, ev_queue[i].events);
+ else {
+ WakeupType *t = ev_queue[i].data.ptr;
+
+ switch (*t) {
+
+ case WAKEUP_EVENT_SOURCE:
+ r = process_io(e, ev_queue[i].data.ptr, ev_queue[i].events);
+ break;
+ case WAKEUP_CLOCK_DATA: {
+ struct clock_data *d = ev_queue[i].data.ptr;
+ r = flush_timer(e, d->fd, ev_queue[i].events, &d->next);
+ break;
+ }
+
+ case WAKEUP_SIGNAL_DATA:
+ r = process_signal(e, ev_queue[i].data.ptr, ev_queue[i].events);
+ break;
+
+ default:
+ assert_not_reached("Invalid wake-up pointer");
+ }
+ }
if (r < 0)
goto finish;
}
diff --git a/src/libsystemd/sd-event/test-event.c b/src/libsystemd/sd-event/test-event.c
index 721700b..6bb1420 100644
--- a/src/libsystemd/sd-event/test-event.c
+++ b/src/libsystemd/sd-event/test-event.c
@@ -160,7 +160,7 @@ static int exit_handler(sd_event_source *s, void *userdata) {
return 3;
}
-int main(int argc, char *argv[]) {
+static void test_basic(void) {
sd_event *e = NULL;
sd_event_source *w = NULL, *x = NULL, *y = NULL, *z = NULL, *q = NULL, *t = NULL;
static const char ch = 'x';
@@ -248,6 +248,70 @@ int main(int argc, char *argv[]) {
safe_close_pair(b);
safe_close_pair(d);
safe_close_pair(k);
+}
+
+static int last_rtqueue_sigval = 0;
+static int n_rtqueue = 0;
+
+static int rtqueue_handler(sd_event_source *s, const struct signalfd_siginfo *si, void *userdata) {
+ last_rtqueue_sigval = si->ssi_int;
+ n_rtqueue ++;
+ return 0;
+}
+
+static void test_rtqueue(void) {
+ sd_event_source *u = NULL, *v = NULL, *s = NULL;
+ sd_event *e = NULL;
+
+ assert_se(sd_event_default(&e) >= 0);
+
+ assert_se(sigprocmask_many(SIG_BLOCK, NULL, SIGRTMIN+2, SIGRTMIN+3, SIGUSR2, -1) >= 0);
+ assert_se(sd_event_add_signal(e, &u, SIGRTMIN+2, rtqueue_handler, NULL) >= 0);
+ assert_se(sd_event_add_signal(e, &v, SIGRTMIN+3, rtqueue_handler, NULL) >= 0);
+ assert_se(sd_event_add_signal(e, &s, SIGUSR2, rtqueue_handler, NULL) >= 0);
+
+ assert_se(sd_event_source_set_priority(v, -10) >= 0);
+
+ assert(sigqueue(getpid(), SIGRTMIN+2, (union sigval) { .sival_int = 1 }) >= 0);
+ assert(sigqueue(getpid(), SIGRTMIN+3, (union sigval) { .sival_int = 2 }) >= 0);
+ assert(sigqueue(getpid(), SIGUSR2, (union sigval) { .sival_int = 3 }) >= 0);
+ assert(sigqueue(getpid(), SIGRTMIN+3, (union sigval) { .sival_int = 4 }) >= 0);
+ assert(sigqueue(getpid(), SIGUSR2, (union sigval) { .sival_int = 5 }) >= 0);
+
+ assert_se(n_rtqueue == 0);
+ assert_se(last_rtqueue_sigval == 0);
+
+ assert_se(sd_event_run(e, (uint64_t) -1) >= 1);
+ assert_se(n_rtqueue == 1);
+ assert_se(last_rtqueue_sigval == 2); /* first SIGRTMIN+3 */
+
+ assert_se(sd_event_run(e, (uint64_t) -1) >= 1);
+ assert_se(n_rtqueue == 2);
+ assert_se(last_rtqueue_sigval == 4); /* second SIGRTMIN+3 */
+
+ assert_se(sd_event_run(e, (uint64_t) -1) >= 1);
+ assert_se(n_rtqueue == 3);
+ assert_se(last_rtqueue_sigval == 3); /* first SIGUSR2 */
+
+ assert_se(sd_event_run(e, (uint64_t) -1) >= 1);
+ assert_se(n_rtqueue == 4);
+ assert_se(last_rtqueue_sigval == 1); /* SIGRTMIN+2 */
+
+ assert_se(sd_event_run(e, 0) == 0); /* the other SIGUSR2 is dropped, because the first one was still queued */
+ assert_se(n_rtqueue == 4);
+ assert_se(last_rtqueue_sigval == 1);
+
+ sd_event_source_unref(u);
+ sd_event_source_unref(v);
+ sd_event_source_unref(s);
+
+ sd_event_unref(e);
+}
+
+int main(int argc, char *argv[]) {
+
+ test_basic();
+ test_rtqueue();
return 0;
}
--
2.17.1

View File

@ -0,0 +1,216 @@
From ea762f1c0206c99d2ba4d3cba41cadf70311a3cc Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Michal=20Sekleta=CC=81r?= <msekleta@redhat.com>
Date: Fri, 23 Oct 2020 18:29:27 +0200
Subject: [PATCH 03/20] sd-event: split out helper functions for reshuffling
prioqs
We typically don't just reshuffle a single prioq at once, but always
two. Let's add two helper functions that do this, and reuse them
everywhere.
(Note that this drops one minor optimization:
sd_event_source_set_time_accuracy() previously only reshuffled the
"latest" prioq, since changing the accuracy has no effect on the
earliest time of an event source, just the latest time an event source
can run. This optimization is removed to simplify things, given that
it's not really worth the effort as prioq_reshuffle() on properly
ordered prioqs has practically zero cost O(1)).
(Slightly generalized, commented and split out of #17284 by Lennart)
(cherry picked from commit e1951c16a8fbe5b0b9ecc08f4f835a806059d28f)
Related: #1819868
[commit 4ce10f8e41a85a56ad9b805442eb1149ece7c82a from
https://github.com/systemd-rhel/rhel-8/]
Signed-off-by: Li Zhou <li.zhou@windriver.com>
---
src/libsystemd/sd-event/sd-event.c | 96 ++++++++++++------------------
1 file changed, 38 insertions(+), 58 deletions(-)
diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c
index 26ef3ea..eb3182f 100644
--- a/src/libsystemd/sd-event/sd-event.c
+++ b/src/libsystemd/sd-event/sd-event.c
@@ -784,6 +784,33 @@ static void event_gc_signal_data(sd_event *e, const int64_t *priority, int sig)
event_unmask_signal_data(e, d, sig);
}
+static void event_source_pp_prioq_reshuffle(sd_event_source *s) {
+ assert(s);
+
+ /* Reshuffles the pending + prepare prioqs. Called whenever the dispatch order changes, i.e. when
+ * they are enabled/disabled or marked pending and such. */
+
+ if (s->pending)
+ prioq_reshuffle(s->event->pending, s, &s->pending_index);
+
+ if (s->prepare)
+ prioq_reshuffle(s->event->prepare, s, &s->prepare_index);
+}
+
+static void event_source_time_prioq_reshuffle(sd_event_source *s) {
+ struct clock_data *d;
+
+ assert(s);
+ assert(EVENT_SOURCE_IS_TIME(s->type));
+
+ /* Called whenever the event source's timer ordering properties changed, i.e. time, accuracy,
+ * pending, enable state. Makes sure the two prioq's are ordered properly again. */
+ assert_se(d = event_get_clock_data(s->event, s->type));
+ prioq_reshuffle(d->earliest, s, &s->time.earliest_index);
+ prioq_reshuffle(d->latest, s, &s->time.latest_index);
+ d->needs_rearm = true;
+}
+
static void source_disconnect(sd_event_source *s) {
sd_event *event;
@@ -905,16 +932,8 @@ static int source_set_pending(sd_event_source *s, bool b) {
} else
assert_se(prioq_remove(s->event->pending, s, &s->pending_index));
- if (EVENT_SOURCE_IS_TIME(s->type)) {
- struct clock_data *d;
-
- d = event_get_clock_data(s->event, s->type);
- assert(d);
-
- prioq_reshuffle(d->earliest, s, &s->time.earliest_index);
- prioq_reshuffle(d->latest, s, &s->time.latest_index);
- d->needs_rearm = true;
- }
+ if (EVENT_SOURCE_IS_TIME(s->type))
+ event_source_time_prioq_reshuffle(s);
if (s->type == SOURCE_SIGNAL && !b) {
struct signal_data *d;
@@ -1570,11 +1589,7 @@ _public_ int sd_event_source_set_priority(sd_event_source *s, int64_t priority)
} else
s->priority = priority;
- if (s->pending)
- prioq_reshuffle(s->event->pending, s, &s->pending_index);
-
- if (s->prepare)
- prioq_reshuffle(s->event->prepare, s, &s->prepare_index);
+ event_source_pp_prioq_reshuffle(s);
if (s->type == SOURCE_EXIT)
prioq_reshuffle(s->event->exit, s, &s->exit.prioq_index);
@@ -1622,18 +1637,10 @@ _public_ int sd_event_source_set_enabled(sd_event_source *s, int m) {
case SOURCE_TIME_BOOTTIME:
case SOURCE_TIME_MONOTONIC:
case SOURCE_TIME_REALTIME_ALARM:
- case SOURCE_TIME_BOOTTIME_ALARM: {
- struct clock_data *d;
-
+ case SOURCE_TIME_BOOTTIME_ALARM:
s->enabled = m;
- d = event_get_clock_data(s->event, s->type);
- assert(d);
-
- prioq_reshuffle(d->earliest, s, &s->time.earliest_index);
- prioq_reshuffle(d->latest, s, &s->time.latest_index);
- d->needs_rearm = true;
+ event_source_time_prioq_reshuffle(s);
break;
- }
case SOURCE_SIGNAL:
s->enabled = m;
@@ -1679,18 +1686,10 @@ _public_ int sd_event_source_set_enabled(sd_event_source *s, int m) {
case SOURCE_TIME_BOOTTIME:
case SOURCE_TIME_MONOTONIC:
case SOURCE_TIME_REALTIME_ALARM:
- case SOURCE_TIME_BOOTTIME_ALARM: {
- struct clock_data *d;
-
+ case SOURCE_TIME_BOOTTIME_ALARM:
s->enabled = m;
- d = event_get_clock_data(s->event, s->type);
- assert(d);
-
- prioq_reshuffle(d->earliest, s, &s->time.earliest_index);
- prioq_reshuffle(d->latest, s, &s->time.latest_index);
- d->needs_rearm = true;
+ event_source_time_prioq_reshuffle(s);
break;
- }
case SOURCE_SIGNAL:
@@ -1737,11 +1736,7 @@ _public_ int sd_event_source_set_enabled(sd_event_source *s, int m) {
}
}
- if (s->pending)
- prioq_reshuffle(s->event->pending, s, &s->pending_index);
-
- if (s->prepare)
- prioq_reshuffle(s->event->prepare, s, &s->prepare_index);
+ event_source_pp_prioq_reshuffle(s);
return 0;
}
@@ -1757,7 +1752,6 @@ _public_ int sd_event_source_get_time(sd_event_source *s, uint64_t *usec) {
}
_public_ int sd_event_source_set_time(sd_event_source *s, uint64_t usec) {
- struct clock_data *d;
assert_return(s, -EINVAL);
assert_return(usec != (uint64_t) -1, -EINVAL);
@@ -1769,13 +1763,7 @@ _public_ int sd_event_source_set_time(sd_event_source *s, uint64_t usec) {
source_set_pending(s, false);
- d = event_get_clock_data(s->event, s->type);
- assert(d);
-
- prioq_reshuffle(d->earliest, s, &s->time.earliest_index);
- prioq_reshuffle(d->latest, s, &s->time.latest_index);
- d->needs_rearm = true;
-
+ event_source_time_prioq_reshuffle(s);
return 0;
}
@@ -1790,7 +1778,6 @@ _public_ int sd_event_source_get_time_accuracy(sd_event_source *s, uint64_t *use
}
_public_ int sd_event_source_set_time_accuracy(sd_event_source *s, uint64_t usec) {
- struct clock_data *d;
assert_return(s, -EINVAL);
assert_return(usec != (uint64_t) -1, -EINVAL);
@@ -1805,12 +1792,7 @@ _public_ int sd_event_source_set_time_accuracy(sd_event_source *s, uint64_t usec
source_set_pending(s, false);
- d = event_get_clock_data(s->event, s->type);
- assert(d);
-
- prioq_reshuffle(d->latest, s, &s->time.latest_index);
- d->needs_rearm = true;
-
+ event_source_time_prioq_reshuffle(s);
return 0;
}
@@ -2088,9 +2070,7 @@ static int process_timer(
if (r < 0)
return r;
- prioq_reshuffle(d->earliest, s, &s->time.earliest_index);
- prioq_reshuffle(d->latest, s, &s->time.latest_index);
- d->needs_rearm = true;
+ event_source_time_prioq_reshuffle(s);
}
return 0;
--
2.17.1

View File

@ -0,0 +1,50 @@
From 76969d09522ca2ab58bc157eb9ce357af5677f3a Mon Sep 17 00:00:00 2001
From: Lennart Poettering <lennart@poettering.net>
Date: Fri, 25 May 2018 17:06:39 +0200
Subject: [PATCH 04/20] sd-event: drop pending events when we turn off/on an
event source
[commit ac989a783a31df95e6c0ce2a90a8d2e1abe73592 from
https://github.com/systemd-rhel/rhel-8/]
Signed-off-by: Li Zhou <li.zhou@windriver.com>
---
src/libsystemd/sd-event/sd-event.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)
diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c
index eb3182f..6e93059 100644
--- a/src/libsystemd/sd-event/sd-event.c
+++ b/src/libsystemd/sd-event/sd-event.c
@@ -1623,6 +1623,13 @@ _public_ int sd_event_source_set_enabled(sd_event_source *s, int m) {
if (m == SD_EVENT_OFF) {
+ /* Unset the pending flag when this event source is disabled */
+ if (!IN_SET(s->type, SOURCE_DEFER, SOURCE_EXIT)) {
+ r = source_set_pending(s, false);
+ if (r < 0)
+ return r;
+ }
+
switch (s->type) {
case SOURCE_IO:
@@ -1672,6 +1679,14 @@ _public_ int sd_event_source_set_enabled(sd_event_source *s, int m) {
}
} else {
+
+ /* Unset the pending flag when this event source is enabled */
+ if (s->enabled == SD_EVENT_OFF && !IN_SET(s->type, SOURCE_DEFER, SOURCE_EXIT)) {
+ r = source_set_pending(s, false);
+ if (r < 0)
+ return r;
+ }
+
switch (s->type) {
case SOURCE_IO:
--
2.17.1

View File

@ -0,0 +1,31 @@
From 7380d2cca8bda0f8c821645f8a5ddb8ac47aec46 Mon Sep 17 00:00:00 2001
From: Thomas Hindoe Paaboel Andersen <phomes@gmail.com>
Date: Sun, 6 Sep 2015 22:06:45 +0200
Subject: [PATCH 05/20] sd-event: fix call to event_make_signal_data
This looks like a typo from commit 9da4cb2b where it was added.
[commit b8a50a99a6e158a5b3ceacf0764dbe9f42558f3e from
https://github.com/systemd-rhel/rhel-8/]
Signed-off-by: Li Zhou <li.zhou@windriver.com>
---
src/libsystemd/sd-event/sd-event.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c
index 6e93059..7c33dcd 100644
--- a/src/libsystemd/sd-event/sd-event.c
+++ b/src/libsystemd/sd-event/sd-event.c
@@ -1726,7 +1726,7 @@ _public_ int sd_event_source_set_enabled(sd_event_source *s, int m) {
s->enabled = m;
- r = event_make_signal_data(s->event, s->signal.sig, SIGCHLD);
+ r = event_make_signal_data(s->event, s->signal.sig, NULL);
if (r < 0) {
s->enabled = SD_EVENT_OFF;
s->event->n_enabled_child_sources--;
--
2.17.1

View File

@ -0,0 +1,37 @@
From 0a2519a5ab04e775115c90039d30bdc576a79c06 Mon Sep 17 00:00:00 2001
From: Lennart Poettering <lennart@poettering.net>
Date: Mon, 7 Sep 2015 00:31:24 +0200
Subject: [PATCH 06/20] sd-event: make sure to create a signal queue for the
right signal
We should never access the "signal" part of the event source unless the
event source is actually for a signal. In this case it's a child pid
handler however, hence make sure to use the right signal.
This is a fix for PR #1177, which in turn was a fix for
9da4cb2be260ed123f2676cb85cb350c527b1492.
[commit 10edebf6cd69cfbe0d38dbaf5478264fbb60a51e from
https://github.com/systemd-rhel/rhel-8/]
Signed-off-by: Li Zhou <li.zhou@windriver.com>
---
src/libsystemd/sd-event/sd-event.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c
index 7c33dcd..2f5ff23 100644
--- a/src/libsystemd/sd-event/sd-event.c
+++ b/src/libsystemd/sd-event/sd-event.c
@@ -1726,7 +1726,7 @@ _public_ int sd_event_source_set_enabled(sd_event_source *s, int m) {
s->enabled = m;
- r = event_make_signal_data(s->event, s->signal.sig, NULL);
+ r = event_make_signal_data(s->event, SIGCHLD, NULL);
if (r < 0) {
s->enabled = SD_EVENT_OFF;
s->event->n_enabled_child_sources--;
--
2.17.1

View File

@ -0,0 +1,315 @@
From 477bbfd4f5012613144c5ba5517aa8de1f300da6 Mon Sep 17 00:00:00 2001
From: Lennart Poettering <lennart@poettering.net>
Date: Fri, 23 Oct 2020 21:21:58 +0200
Subject: [PATCH 07/20] sd-event: split out enable and disable codepaths from
sd_event_source_set_enabled()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
So far half of sd_event_source_set_enabled() was doing enabling, the
other half was doing disabling. Let's split that into two separate
calls.
(This also adds a new shortcut to sd_event_source_set_enabled(): if the
caller toggles between "ON" and "ONESHOT" we'll now shortcut this, since
the event source is already enabled in that case and shall remain
enabled.)
This heavily borrows and is inspired from Michal Sekletár's #17284
refactoring.
(cherry picked from commit ddfde737b546c17e54182028153aa7f7e78804e3)
Related: #1819868
[commit d7ad6ad123200f562081ff09f7bed3c6d969ac0a from
https://github.com/systemd-rhel/rhel-8/
LZ: Dropped SOURCE_INOTIFY related parts because it hasn't been added
in this systemd version.]
Signed-off-by: Li Zhou <li.zhou@windriver.com>
---
src/libsystemd/sd-event/sd-event.c | 224 +++++++++++++++--------------
1 file changed, 118 insertions(+), 106 deletions(-)
diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c
index 2f5ff23..2e07478 100644
--- a/src/libsystemd/sd-event/sd-event.c
+++ b/src/libsystemd/sd-event/sd-event.c
@@ -1606,153 +1606,165 @@ _public_ int sd_event_source_get_enabled(sd_event_source *s, int *m) {
return 0;
}
-_public_ int sd_event_source_set_enabled(sd_event_source *s, int m) {
+static int event_source_disable(sd_event_source *s) {
int r;
- assert_return(s, -EINVAL);
- assert_return(m == SD_EVENT_OFF || m == SD_EVENT_ON || m == SD_EVENT_ONESHOT, -EINVAL);
- assert_return(!event_pid_changed(s->event), -ECHILD);
+ assert(s);
+ assert(s->enabled != SD_EVENT_OFF);
- /* If we are dead anyway, we are fine with turning off
- * sources, but everything else needs to fail. */
- if (s->event->state == SD_EVENT_FINISHED)
- return m == SD_EVENT_OFF ? 0 : -ESTALE;
+ /* Unset the pending flag when this event source is disabled */
+ if (!IN_SET(s->type, SOURCE_DEFER, SOURCE_EXIT)) {
+ r = source_set_pending(s, false);
+ if (r < 0)
+ return r;
+ }
- if (s->enabled == m)
- return 0;
+ s->enabled = SD_EVENT_OFF;
- if (m == SD_EVENT_OFF) {
+ switch (s->type) {
- /* Unset the pending flag when this event source is disabled */
- if (!IN_SET(s->type, SOURCE_DEFER, SOURCE_EXIT)) {
- r = source_set_pending(s, false);
- if (r < 0)
- return r;
- }
+ case SOURCE_IO:
+ source_io_unregister(s);
+ break;
- switch (s->type) {
+ case SOURCE_TIME_REALTIME:
+ case SOURCE_TIME_BOOTTIME:
+ case SOURCE_TIME_MONOTONIC:
+ case SOURCE_TIME_REALTIME_ALARM:
+ case SOURCE_TIME_BOOTTIME_ALARM:
+ event_source_time_prioq_reshuffle(s);
+ break;
- case SOURCE_IO:
- r = source_io_unregister(s);
- if (r < 0)
- return r;
+ case SOURCE_SIGNAL:
+ event_gc_signal_data(s->event, &s->priority, s->signal.sig);
+ break;
- s->enabled = m;
- break;
+ case SOURCE_CHILD:
+ assert(s->event->n_enabled_child_sources > 0);
+ s->event->n_enabled_child_sources--;
- case SOURCE_TIME_REALTIME:
- case SOURCE_TIME_BOOTTIME:
- case SOURCE_TIME_MONOTONIC:
- case SOURCE_TIME_REALTIME_ALARM:
- case SOURCE_TIME_BOOTTIME_ALARM:
- s->enabled = m;
- event_source_time_prioq_reshuffle(s);
- break;
+ event_gc_signal_data(s->event, &s->priority, SIGCHLD);
+ break;
- case SOURCE_SIGNAL:
- s->enabled = m;
+ case SOURCE_EXIT:
+ prioq_reshuffle(s->event->exit, s, &s->exit.prioq_index);
+ break;
- event_gc_signal_data(s->event, &s->priority, s->signal.sig);
- break;
+ case SOURCE_DEFER:
+ case SOURCE_POST:
+ break;
- case SOURCE_CHILD:
- s->enabled = m;
+ default:
+ assert_not_reached("Wut? I shouldn't exist.");
+ }
- assert(s->event->n_enabled_child_sources > 0);
- s->event->n_enabled_child_sources--;
+ return 0;
+}
- event_gc_signal_data(s->event, &s->priority, SIGCHLD);
- break;
+static int event_source_enable(sd_event_source *s, int m) {
+ int r;
- case SOURCE_EXIT:
- s->enabled = m;
- prioq_reshuffle(s->event->exit, s, &s->exit.prioq_index);
- break;
+ assert(s);
+ assert(IN_SET(m, SD_EVENT_ON, SD_EVENT_ONESHOT));
+ assert(s->enabled == SD_EVENT_OFF);
- case SOURCE_DEFER:
- case SOURCE_POST:
- s->enabled = m;
- break;
+ /* Unset the pending flag when this event source is enabled */
+ if (!IN_SET(s->type, SOURCE_DEFER, SOURCE_EXIT)) {
+ r = source_set_pending(s, false);
+ if (r < 0)
+ return r;
+ }
- default:
- assert_not_reached("Wut? I shouldn't exist.");
- }
+ s->enabled = m;
- } else {
+ switch (s->type) {
- /* Unset the pending flag when this event source is enabled */
- if (s->enabled == SD_EVENT_OFF && !IN_SET(s->type, SOURCE_DEFER, SOURCE_EXIT)) {
- r = source_set_pending(s, false);
- if (r < 0)
- return r;
+ case SOURCE_IO:
+ r = source_io_register(s, m, s->io.events);
+ if (r < 0) {
+ s->enabled = SD_EVENT_OFF;
+ return r;
}
- switch (s->type) {
+ break;
- case SOURCE_IO:
- r = source_io_register(s, m, s->io.events);
- if (r < 0)
- return r;
+ case SOURCE_TIME_REALTIME:
+ case SOURCE_TIME_BOOTTIME:
+ case SOURCE_TIME_MONOTONIC:
+ case SOURCE_TIME_REALTIME_ALARM:
+ case SOURCE_TIME_BOOTTIME_ALARM:
+ event_source_time_prioq_reshuffle(s);
+ break;
- s->enabled = m;
- break;
+ case SOURCE_SIGNAL:
+ r = event_make_signal_data(s->event, s->signal.sig, NULL);
+ if (r < 0) {
+ s->enabled = SD_EVENT_OFF;
+ event_gc_signal_data(s->event, &s->priority, s->signal.sig);
+ return r;
+ }
- case SOURCE_TIME_REALTIME:
- case SOURCE_TIME_BOOTTIME:
- case SOURCE_TIME_MONOTONIC:
- case SOURCE_TIME_REALTIME_ALARM:
- case SOURCE_TIME_BOOTTIME_ALARM:
- s->enabled = m;
- event_source_time_prioq_reshuffle(s);
- break;
+ break;
- case SOURCE_SIGNAL:
+ case SOURCE_CHILD:
+ s->event->n_enabled_child_sources++;
- s->enabled = m;
+ r = event_make_signal_data(s->event, SIGCHLD, NULL);
+ if (r < 0) {
+ s->enabled = SD_EVENT_OFF;
+ s->event->n_enabled_child_sources--;
+ event_gc_signal_data(s->event, &s->priority, SIGCHLD);
+ return r;
+ }
- r = event_make_signal_data(s->event, s->signal.sig, NULL);
- if (r < 0) {
- s->enabled = SD_EVENT_OFF;
- event_gc_signal_data(s->event, &s->priority, s->signal.sig);
- return r;
- }
- break;
+ break;
- case SOURCE_CHILD:
+ case SOURCE_EXIT:
+ prioq_reshuffle(s->event->exit, s, &s->exit.prioq_index);
+ break;
- if (s->enabled == SD_EVENT_OFF)
- s->event->n_enabled_child_sources++;
+ case SOURCE_DEFER:
+ case SOURCE_POST:
+ break;
- s->enabled = m;
+ default:
+ assert_not_reached("Wut? I shouldn't exist.");
+ }
- r = event_make_signal_data(s->event, SIGCHLD, NULL);
- if (r < 0) {
- s->enabled = SD_EVENT_OFF;
- s->event->n_enabled_child_sources--;
- event_gc_signal_data(s->event, &s->priority, SIGCHLD);
- return r;
- }
+ return 0;
+}
- break;
+_public_ int sd_event_source_set_enabled(sd_event_source *s, int m) {
+ int r;
- case SOURCE_EXIT:
- s->enabled = m;
- prioq_reshuffle(s->event->exit, s, &s->exit.prioq_index);
- break;
+ assert_return(s, -EINVAL);
+ assert_return(IN_SET(m, SD_EVENT_OFF, SD_EVENT_ON, SD_EVENT_ONESHOT), -EINVAL);
+ assert_return(!event_pid_changed(s->event), -ECHILD);
- case SOURCE_DEFER:
- case SOURCE_POST:
- s->enabled = m;
- break;
+ /* If we are dead anyway, we are fine with turning off sources, but everything else needs to fail. */
+ if (s->event->state == SD_EVENT_FINISHED)
+ return m == SD_EVENT_OFF ? 0 : -ESTALE;
- default:
- assert_not_reached("Wut? I shouldn't exist.");
+ if (s->enabled == m) /* No change? */
+ return 0;
+
+ if (m == SD_EVENT_OFF)
+ r = event_source_disable(s);
+ else {
+ if (s->enabled != SD_EVENT_OFF) {
+ /* Switching from "on" to "oneshot" or back? If that's the case, we can take a shortcut, the
+ * event source is already enabled after all. */
+ s->enabled = m;
+ return 0;
}
+
+ r = event_source_enable(s, m);
}
+ if (r < 0)
+ return r;
event_source_pp_prioq_reshuffle(s);
-
return 0;
}
--
2.17.1

View File

@ -0,0 +1,73 @@
From 5e365321f3006d44f57bb27ff9de96ca01c1104a Mon Sep 17 00:00:00 2001
From: Evgeny Vereshchagin <evvers@ya.ru>
Date: Sun, 22 Nov 2015 06:41:31 +0000
Subject: [PATCH 08/20] sd-event: use prioq_ensure_allocated where possible
[commit c983e776c4e7e2ea6e1990123d215e639deb353b from
https://github.com/systemd-rhel/rhel-8/]
Signed-off-by: Li Zhou <li.zhou@windriver.com>
---
src/libsystemd/sd-event/sd-event.c | 30 +++++++++++-------------------
1 file changed, 11 insertions(+), 19 deletions(-)
diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c
index 2e07478..7074520 100644
--- a/src/libsystemd/sd-event/sd-event.c
+++ b/src/libsystemd/sd-event/sd-event.c
@@ -442,11 +442,9 @@ _public_ int sd_event_new(sd_event** ret) {
e->original_pid = getpid();
e->perturb = USEC_INFINITY;
- e->pending = prioq_new(pending_prioq_compare);
- if (!e->pending) {
- r = -ENOMEM;
+ r = prioq_ensure_allocated(&e->pending, pending_prioq_compare);
+ if (r < 0)
goto fail;
- }
e->epoll_fd = epoll_create1(EPOLL_CLOEXEC);
if (e->epoll_fd < 0) {
@@ -1096,17 +1094,13 @@ _public_ int sd_event_add_time(
d = event_get_clock_data(e, type);
assert(d);
- if (!d->earliest) {
- d->earliest = prioq_new(earliest_time_prioq_compare);
- if (!d->earliest)
- return -ENOMEM;
- }
+ r = prioq_ensure_allocated(&d->earliest, earliest_time_prioq_compare);
+ if (r < 0)
+ return r;
- if (!d->latest) {
- d->latest = prioq_new(latest_time_prioq_compare);
- if (!d->latest)
- return -ENOMEM;
- }
+ r = prioq_ensure_allocated(&d->latest, latest_time_prioq_compare);
+ if (r < 0)
+ return r;
if (d->fd < 0) {
r = event_setup_timer_fd(e, d, clock);
@@ -1357,11 +1351,9 @@ _public_ int sd_event_add_exit(
assert_return(e->state != SD_EVENT_FINISHED, -ESTALE);
assert_return(!event_pid_changed(e), -ECHILD);
- if (!e->exit) {
- e->exit = prioq_new(exit_prioq_compare);
- if (!e->exit)
- return -ENOMEM;
- }
+ r = prioq_ensure_allocated(&e->exit, exit_prioq_compare);
+ if (r < 0)
+ return r;
s = source_new(e, !ret, SOURCE_EXIT);
if (!s)
--
2.17.1

View File

@ -0,0 +1,79 @@
From 77b772bce846db28dc447420fd380a51eadcde15 Mon Sep 17 00:00:00 2001
From: Lennart Poettering <lennart@poettering.net>
Date: Mon, 23 Nov 2020 11:40:24 +0100
Subject: [PATCH 09/20] sd-event: split clock data allocation out of
sd_event_add_time()
Just some simple refactoring, that will make things easier for us later.
But it looks better this way even without the later function reuse.
(cherry picked from commit 41c63f36c3352af8bebf03b6181f5d866431d0af)
Related: #1819868
[commit 6cc0022115afbac9ac66c456b140601d90271687 from
https://github.com/systemd-rhel/rhel-8/]
Signed-off-by: Li Zhou <li.zhou@windriver.com>
---
src/libsystemd/sd-event/sd-event.c | 34 ++++++++++++++++++++----------
1 file changed, 23 insertions(+), 11 deletions(-)
diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c
index 7074520..8e6536f 100644
--- a/src/libsystemd/sd-event/sd-event.c
+++ b/src/libsystemd/sd-event/sd-event.c
@@ -1065,6 +1065,28 @@ static int time_exit_callback(sd_event_source *s, uint64_t usec, void *userdata)
return sd_event_exit(sd_event_source_get_event(s), PTR_TO_INT(userdata));
}
+static int setup_clock_data(sd_event *e, struct clock_data *d, clockid_t clock) {
+ int r;
+
+ assert(d);
+
+ if (d->fd < 0) {
+ r = event_setup_timer_fd(e, d, clock);
+ if (r < 0)
+ return r;
+ }
+
+ r = prioq_ensure_allocated(&d->earliest, earliest_time_prioq_compare);
+ if (r < 0)
+ return r;
+
+ r = prioq_ensure_allocated(&d->latest, latest_time_prioq_compare);
+ if (r < 0)
+ return r;
+
+ return 0;
+}
+
_public_ int sd_event_add_time(
sd_event *e,
sd_event_source **ret,
@@ -1094,20 +1116,10 @@ _public_ int sd_event_add_time(
d = event_get_clock_data(e, type);
assert(d);
- r = prioq_ensure_allocated(&d->earliest, earliest_time_prioq_compare);
- if (r < 0)
- return r;
-
- r = prioq_ensure_allocated(&d->latest, latest_time_prioq_compare);
+ r = setup_clock_data(e, d, clock);
if (r < 0)
return r;
- if (d->fd < 0) {
- r = event_setup_timer_fd(e, d, clock);
- if (r < 0)
- return r;
- }
-
s = source_new(e, !ret, type);
if (!s)
return -ENOMEM;
--
2.17.1

View File

@ -0,0 +1,120 @@
From dad1d000b493f98f4f5eaf4bfa34c8617f41970f Mon Sep 17 00:00:00 2001
From: Lennart Poettering <lennart@poettering.net>
Date: Mon, 23 Nov 2020 15:25:35 +0100
Subject: [PATCH 10/20] sd-event: split out code to add/remove timer event
sources to earliest/latest prioq
Just some refactoring that makes code prettier, and will come handy
later, because we can reuse these functions at more places.
(cherry picked from commit 1e45e3fecc303e7ae9946220c742f69675e99c34)
Related: #1819868
[commit 88b2618e4de850060a1c5c22b049e6de0578fbb5 from
https://github.com/systemd-rhel/rhel-8/]
Signed-off-by: Li Zhou <li.zhou@windriver.com>
---
src/libsystemd/sd-event/sd-event.c | 57 +++++++++++++++++++++---------
1 file changed, 41 insertions(+), 16 deletions(-)
diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c
index 8e6536f..e0e0eaa 100644
--- a/src/libsystemd/sd-event/sd-event.c
+++ b/src/libsystemd/sd-event/sd-event.c
@@ -809,6 +809,19 @@ static void event_source_time_prioq_reshuffle(sd_event_source *s) {
d->needs_rearm = true;
}
+static void event_source_time_prioq_remove(
+ sd_event_source *s,
+ struct clock_data *d) {
+
+ assert(s);
+ assert(d);
+
+ prioq_remove(d->earliest, s, &s->time.earliest_index);
+ prioq_remove(d->latest, s, &s->time.latest_index);
+ s->time.earliest_index = s->time.latest_index = PRIOQ_IDX_NULL;
+ d->needs_rearm = true;
+}
+
static void source_disconnect(sd_event_source *s) {
sd_event *event;
@@ -833,13 +846,8 @@ static void source_disconnect(sd_event_source *s) {
case SOURCE_TIME_REALTIME_ALARM:
case SOURCE_TIME_BOOTTIME_ALARM: {
struct clock_data *d;
-
- d = event_get_clock_data(s->event, s->type);
- assert(d);
-
- prioq_remove(d->earliest, s, &s->time.earliest_index);
- prioq_remove(d->latest, s, &s->time.latest_index);
- d->needs_rearm = true;
+ assert_se(d = event_get_clock_data(s->event, s->type));
+ event_source_time_prioq_remove(s, d);
break;
}
@@ -1087,6 +1095,30 @@ static int setup_clock_data(sd_event *e, struct clock_data *d, clockid_t clock)
return 0;
}
+static int event_source_time_prioq_put(
+ sd_event_source *s,
+ struct clock_data *d) {
+
+ int r;
+
+ assert(s);
+ assert(d);
+
+ r = prioq_put(d->earliest, s, &s->time.earliest_index);
+ if (r < 0)
+ return r;
+
+ r = prioq_put(d->latest, s, &s->time.latest_index);
+ if (r < 0) {
+ assert_se(prioq_remove(d->earliest, s, &s->time.earliest_index) > 0);
+ s->time.earliest_index = PRIOQ_IDX_NULL;
+ return r;
+ }
+
+ d->needs_rearm = true;
+ return 0;
+}
+
_public_ int sd_event_add_time(
sd_event *e,
sd_event_source **ret,
@@ -1113,8 +1145,7 @@ _public_ int sd_event_add_time(
type = clock_to_event_source_type(clock);
assert_return(type >= 0, -ENOTSUP);
- d = event_get_clock_data(e, type);
- assert(d);
+ assert_se(d = event_get_clock_data(e, type));
r = setup_clock_data(e, d, clock);
if (r < 0)
@@ -1131,13 +1162,7 @@ _public_ int sd_event_add_time(
s->userdata = userdata;
s->enabled = SD_EVENT_ONESHOT;
- d->needs_rearm = true;
-
- r = prioq_put(d->earliest, s, &s->time.earliest_index);
- if (r < 0)
- goto fail;
-
- r = prioq_put(d->latest, s, &s->time.latest_index);
+ r = event_source_time_prioq_put(s, d);
if (r < 0)
goto fail;
--
2.17.1

View File

@ -0,0 +1,126 @@
From 6dc0338be9020eebcbfafe078a46bc7be8e4a2ff Mon Sep 17 00:00:00 2001
From: Tom Gundersen <teg@jklm.no>
Date: Sat, 14 Mar 2015 11:47:35 +0100
Subject: [PATCH 11/20] sd-event: rename PASSIVE/PREPARED to INITIAL/ARMED
[commit 2b0c9ef7352dae53ee746c32033999c1346633b3 from
https://github.com/systemd-rhel/rhel-8/]
Signed-off-by: Li Zhou <li.zhou@windriver.com>
---
src/libsystemd/sd-event/sd-event.c | 22 +++++++++++-----------
src/systemd/sd-event.h | 4 ++--
2 files changed, 13 insertions(+), 13 deletions(-)
diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c
index e0e0eaa..299312a 100644
--- a/src/libsystemd/sd-event/sd-event.c
+++ b/src/libsystemd/sd-event/sd-event.c
@@ -2423,7 +2423,7 @@ static int dispatch_exit(sd_event *e) {
r = source_dispatch(p);
- e->state = SD_EVENT_PASSIVE;
+ e->state = SD_EVENT_INITIAL;
sd_event_unref(e);
return r;
@@ -2492,7 +2492,7 @@ _public_ int sd_event_prepare(sd_event *e) {
assert_return(e, -EINVAL);
assert_return(!event_pid_changed(e), -ECHILD);
assert_return(e->state != SD_EVENT_FINISHED, -ESTALE);
- assert_return(e->state == SD_EVENT_PASSIVE, -EBUSY);
+ assert_return(e->state == SD_EVENT_INITIAL, -EBUSY);
if (e->exit_requested)
goto pending;
@@ -2526,15 +2526,15 @@ _public_ int sd_event_prepare(sd_event *e) {
if (event_next_pending(e) || e->need_process_child)
goto pending;
- e->state = SD_EVENT_PREPARED;
+ e->state = SD_EVENT_ARMED;
return 0;
pending:
- e->state = SD_EVENT_PREPARED;
+ e->state = SD_EVENT_ARMED;
r = sd_event_wait(e, 0);
if (r == 0)
- e->state = SD_EVENT_PREPARED;
+ e->state = SD_EVENT_ARMED;
return r;
}
@@ -2547,7 +2547,7 @@ _public_ int sd_event_wait(sd_event *e, uint64_t timeout) {
assert_return(e, -EINVAL);
assert_return(!event_pid_changed(e), -ECHILD);
assert_return(e->state != SD_EVENT_FINISHED, -ESTALE);
- assert_return(e->state == SD_EVENT_PREPARED, -EBUSY);
+ assert_return(e->state == SD_EVENT_ARMED, -EBUSY);
if (e->exit_requested) {
e->state = SD_EVENT_PENDING;
@@ -2643,7 +2643,7 @@ _public_ int sd_event_wait(sd_event *e, uint64_t timeout) {
r = 0;
finish:
- e->state = SD_EVENT_PASSIVE;
+ e->state = SD_EVENT_INITIAL;
return r;
}
@@ -2666,14 +2666,14 @@ _public_ int sd_event_dispatch(sd_event *e) {
e->state = SD_EVENT_RUNNING;
r = source_dispatch(p);
- e->state = SD_EVENT_PASSIVE;
+ e->state = SD_EVENT_INITIAL;
sd_event_unref(e);
return r;
}
- e->state = SD_EVENT_PASSIVE;
+ e->state = SD_EVENT_INITIAL;
return 1;
}
@@ -2684,7 +2684,7 @@ _public_ int sd_event_run(sd_event *e, uint64_t timeout) {
assert_return(e, -EINVAL);
assert_return(!event_pid_changed(e), -ECHILD);
assert_return(e->state != SD_EVENT_FINISHED, -ESTALE);
- assert_return(e->state == SD_EVENT_PASSIVE, -EBUSY);
+ assert_return(e->state == SD_EVENT_INITIAL, -EBUSY);
r = sd_event_prepare(e);
if (r > 0)
@@ -2704,7 +2704,7 @@ _public_ int sd_event_loop(sd_event *e) {
assert_return(e, -EINVAL);
assert_return(!event_pid_changed(e), -ECHILD);
- assert_return(e->state == SD_EVENT_PASSIVE, -EBUSY);
+ assert_return(e->state == SD_EVENT_INITIAL, -EBUSY);
sd_event_ref(e);
diff --git a/src/systemd/sd-event.h b/src/systemd/sd-event.h
index 4957f3a..ffde7c8 100644
--- a/src/systemd/sd-event.h
+++ b/src/systemd/sd-event.h
@@ -51,8 +51,8 @@ enum {
};
enum {
- SD_EVENT_PASSIVE,
- SD_EVENT_PREPARED,
+ SD_EVENT_INITIAL,
+ SD_EVENT_ARMED,
SD_EVENT_PENDING,
SD_EVENT_RUNNING,
SD_EVENT_EXITING,
--
2.17.1

View File

@ -0,0 +1,39 @@
From 01c94571660c44c415ba8bcba62176f45bf84be5 Mon Sep 17 00:00:00 2001
From: Lennart Poettering <lennart@poettering.net>
Date: Wed, 30 Oct 2019 20:26:50 +0100
Subject: [PATCH 12/20] sd-event: refuse running default event loops in any
other thread than the one they are default for
(cherry picked from commit e544601536ac13a288d7476f4400c7b0f22b7ea1)
Related: #1819868
[commit 4c5fdbde7e745126f31542a70b45cc4faec094d2 from
https://github.com/systemd-rhel/rhel-8/
LZ: Dropped the part that won't affect code to simplify the merging.]
Signed-off-by: Li Zhou <li.zhou@windriver.com>
---
src/libsystemd/sd-event/sd-event.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c
index 299312a..a2f7868 100644
--- a/src/libsystemd/sd-event/sd-event.c
+++ b/src/libsystemd/sd-event/sd-event.c
@@ -2494,6 +2494,11 @@ _public_ int sd_event_prepare(sd_event *e) {
assert_return(e->state != SD_EVENT_FINISHED, -ESTALE);
assert_return(e->state == SD_EVENT_INITIAL, -EBUSY);
+ /* Let's check that if we are a default event loop we are executed in the correct thread. We only do
+ * this check here once, since gettid() is typically not cached, and thus want to minimize
+ * syscalls */
+ assert_return(!e->default_event_ptr || e->tid == gettid(), -EREMOTEIO);
+
if (e->exit_requested)
goto pending;
--
2.17.1

View File

@ -0,0 +1,106 @@
From f72ca8a711fc406dc52f18c7dbc3bfc5397b26ea Mon Sep 17 00:00:00 2001
From: Lennart Poettering <lennart@poettering.net>
Date: Mon, 23 Nov 2020 17:49:27 +0100
Subject: [PATCH 13/20] sd-event: remove earliest_index/latest_index into
common part of event source objects
So far we used these fields to organize the earliest/latest timer event
priority queue. In a follow-up commit we want to introduce ratelimiting
to event sources, at which point we want any kind of event source to be
able to trigger time wakeups, and hence they all need to be included in
the earliest/latest prioqs. Thus, in preparation let's make this
generic.
No change in behaviour, just some shifting around of struct members from
the type-specific to the generic part.
(cherry picked from commit f41315fceb5208c496145cda2d6c865a5458ce44)
Related: #1819868
[commit 97f599bf57fdaee688ae5750e9b2b2587e2b597a from
https://github.com/systemd-rhel/rhel-8/]
Signed-off-by: Li Zhou <li.zhou@windriver.com>
---
src/libsystemd/sd-event/sd-event.c | 25 +++++++++++++------------
1 file changed, 13 insertions(+), 12 deletions(-)
diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c
index a2f7868..82cb9ad 100644
--- a/src/libsystemd/sd-event/sd-event.c
+++ b/src/libsystemd/sd-event/sd-event.c
@@ -94,6 +94,9 @@ struct sd_event_source {
LIST_FIELDS(sd_event_source, sources);
+ unsigned earliest_index;
+ unsigned latest_index;
+
union {
struct {
sd_event_io_handler_t callback;
@@ -105,8 +108,6 @@ struct sd_event_source {
struct {
sd_event_time_handler_t callback;
usec_t next, accuracy;
- unsigned earliest_index;
- unsigned latest_index;
} time;
struct {
sd_event_signal_handler_t callback;
@@ -804,8 +805,8 @@ static void event_source_time_prioq_reshuffle(sd_event_source *s) {
/* Called whenever the event source's timer ordering properties changed, i.e. time, accuracy,
* pending, enable state. Makes sure the two prioq's are ordered properly again. */
assert_se(d = event_get_clock_data(s->event, s->type));
- prioq_reshuffle(d->earliest, s, &s->time.earliest_index);
- prioq_reshuffle(d->latest, s, &s->time.latest_index);
+ prioq_reshuffle(d->earliest, s, &s->earliest_index);
+ prioq_reshuffle(d->latest, s, &s->latest_index);
d->needs_rearm = true;
}
@@ -816,9 +817,9 @@ static void event_source_time_prioq_remove(
assert(s);
assert(d);
- prioq_remove(d->earliest, s, &s->time.earliest_index);
- prioq_remove(d->latest, s, &s->time.latest_index);
- s->time.earliest_index = s->time.latest_index = PRIOQ_IDX_NULL;
+ prioq_remove(d->earliest, s, &s->earliest_index);
+ prioq_remove(d->latest, s, &s->latest_index);
+ s->earliest_index = s->latest_index = PRIOQ_IDX_NULL;
d->needs_rearm = true;
}
@@ -1104,14 +1105,14 @@ static int event_source_time_prioq_put(
assert(s);
assert(d);
- r = prioq_put(d->earliest, s, &s->time.earliest_index);
+ r = prioq_put(d->earliest, s, &s->earliest_index);
if (r < 0)
return r;
- r = prioq_put(d->latest, s, &s->time.latest_index);
+ r = prioq_put(d->latest, s, &s->latest_index);
if (r < 0) {
- assert_se(prioq_remove(d->earliest, s, &s->time.earliest_index) > 0);
- s->time.earliest_index = PRIOQ_IDX_NULL;
+ assert_se(prioq_remove(d->earliest, s, &s->earliest_index) > 0);
+ s->earliest_index = PRIOQ_IDX_NULL;
return r;
}
@@ -1158,7 +1159,7 @@ _public_ int sd_event_add_time(
s->time.next = usec;
s->time.accuracy = accuracy == 0 ? DEFAULT_ACCURACY_USEC : accuracy;
s->time.callback = callback;
- s->time.earliest_index = s->time.latest_index = PRIOQ_IDX_NULL;
+ s->earliest_index = s->latest_index = PRIOQ_IDX_NULL;
s->userdata = userdata;
s->enabled = SD_EVENT_ONESHOT;
--
2.17.1

View File

@ -0,0 +1,125 @@
From ad89da1e00919c510596dac78741c98052b1e2f7 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Zbigniew=20J=C4=99drzejewski-Szmek?= <zbyszek@in.waw.pl>
Date: Tue, 10 Nov 2020 10:38:37 +0100
Subject: [PATCH 14/20] sd-event: update state at the end in
event_source_enable
Coverity in CID#1435966 was complaining that s->enabled is not "restored" in
all cases. But the code was actually correct, since it should only be
"restored" in the error paths. But let's still make this prettier by not setting
the state before all operations that may fail are done.
We need to set .enabled for the prioq reshuffling operations, so move those down.
No functional change intended.
(cherry picked from commit d2eafe61ca07f8300dc741a0491a914213fa2b6b)
Related: #1819868
[commit deb9e6ad3a1d7cfbc3b53d1e74cda6ae398a90fd from
https://github.com/systemd-rhel/rhel-8/]
Signed-off-by: Li Zhou <li.zhou@windriver.com>
---
src/libsystemd/sd-event/sd-event.c | 51 +++++++++++++++++-------------
1 file changed, 29 insertions(+), 22 deletions(-)
diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c
index 82cb9ad..3ff15a2 100644
--- a/src/libsystemd/sd-event/sd-event.c
+++ b/src/libsystemd/sd-event/sd-event.c
@@ -1691,11 +1691,11 @@ static int event_source_disable(sd_event_source *s) {
return 0;
}
-static int event_source_enable(sd_event_source *s, int m) {
+static int event_source_enable(sd_event_source *s, int enable) {
int r;
assert(s);
- assert(IN_SET(m, SD_EVENT_ON, SD_EVENT_ONESHOT));
+ assert(IN_SET(enable, SD_EVENT_ON, SD_EVENT_ONESHOT));
assert(s->enabled == SD_EVENT_OFF);
/* Unset the pending flag when this event source is enabled */
@@ -1705,31 +1705,16 @@ static int event_source_enable(sd_event_source *s, int m) {
return r;
}
- s->enabled = m;
-
switch (s->type) {
-
case SOURCE_IO:
- r = source_io_register(s, m, s->io.events);
- if (r < 0) {
- s->enabled = SD_EVENT_OFF;
+ r = source_io_register(s, enable, s->io.events);
+ if (r < 0)
return r;
- }
-
- break;
-
- case SOURCE_TIME_REALTIME:
- case SOURCE_TIME_BOOTTIME:
- case SOURCE_TIME_MONOTONIC:
- case SOURCE_TIME_REALTIME_ALARM:
- case SOURCE_TIME_BOOTTIME_ALARM:
- event_source_time_prioq_reshuffle(s);
break;
case SOURCE_SIGNAL:
r = event_make_signal_data(s->event, s->signal.sig, NULL);
if (r < 0) {
- s->enabled = SD_EVENT_OFF;
event_gc_signal_data(s->event, &s->priority, s->signal.sig);
return r;
}
@@ -1750,10 +1735,12 @@ static int event_source_enable(sd_event_source *s, int m) {
break;
+ case SOURCE_TIME_REALTIME:
+ case SOURCE_TIME_BOOTTIME:
+ case SOURCE_TIME_MONOTONIC:
+ case SOURCE_TIME_REALTIME_ALARM:
+ case SOURCE_TIME_BOOTTIME_ALARM:
case SOURCE_EXIT:
- prioq_reshuffle(s->event->exit, s, &s->exit.prioq_index);
- break;
-
case SOURCE_DEFER:
case SOURCE_POST:
break;
@@ -1762,6 +1749,26 @@ static int event_source_enable(sd_event_source *s, int m) {
assert_not_reached("Wut? I shouldn't exist.");
}
+ s->enabled = enable;
+
+ /* Non-failing operations below */
+ switch (s->type) {
+ case SOURCE_TIME_REALTIME:
+ case SOURCE_TIME_BOOTTIME:
+ case SOURCE_TIME_MONOTONIC:
+ case SOURCE_TIME_REALTIME_ALARM:
+ case SOURCE_TIME_BOOTTIME_ALARM:
+ event_source_time_prioq_reshuffle(s);
+ break;
+
+ case SOURCE_EXIT:
+ prioq_reshuffle(s->event->exit, s, &s->exit.prioq_index);
+ break;
+
+ default:
+ break;
+ }
+
return 0;
}
--
2.17.1

View File

@ -0,0 +1,44 @@
From 04e2ffb437b301963804e6d199be1196d1b4307b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Zbigniew=20J=C4=99drzejewski-Szmek?= <zbyszek@in.waw.pl>
Date: Tue, 10 Nov 2020 12:57:34 +0100
Subject: [PATCH 15/20] sd-event: increase n_enabled_child_sources just once
Neither source_child_pidfd_register() nor event_make_signal_data() look at
n_enabled_child_sources.
(cherry picked from commit ac9f2640cb9c107b43f47bba7e068d3b92b5337b)
Related: #1819868
[commit 188465c472996b426a1f22a9fc46d031b722c3b4 from
https://github.com/systemd-rhel/rhel-8/]
Signed-off-by: Li Zhou <li.zhou@windriver.com>
---
src/libsystemd/sd-event/sd-event.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c
index 3ff15a2..e34fd0b 100644
--- a/src/libsystemd/sd-event/sd-event.c
+++ b/src/libsystemd/sd-event/sd-event.c
@@ -1722,8 +1722,6 @@ static int event_source_enable(sd_event_source *s, int enable) {
break;
case SOURCE_CHILD:
- s->event->n_enabled_child_sources++;
-
r = event_make_signal_data(s->event, SIGCHLD, NULL);
if (r < 0) {
s->enabled = SD_EVENT_OFF;
@@ -1732,6 +1730,7 @@ static int event_source_enable(sd_event_source *s, int enable) {
return r;
}
+ s->event->n_enabled_child_sources++;
break;
--
2.17.1

View File

@ -0,0 +1,97 @@
From 2d07173304abd3f1d3fae5e0f01bf5874b1f04db Mon Sep 17 00:00:00 2001
From: David Herrmann <dh.herrmann@gmail.com>
Date: Tue, 29 Sep 2015 20:56:17 +0200
Subject: [PATCH 16/20] sd-event: don't provide priority stability
Currently, we guarantee that if two event-sources with the same priority
fire at the same time, they're always dispatched in the same order. While
this might sound nice in theory, there's is little benefit in providing
stability on that level. We have no control over the order the events are
reported, hence, we cannot guarantee that we get notified about both at
the same time.
By dropping the stability guarantee, we loose roughly 10% Heap swaps in
the prioq on a desktop cold-boot. Krzysztof Kotlenga even reported up to
20% on his tests. This sounds worth optimizing, so drop the stability
guarantee.
[commit 6fe869c251790a0e3cef5b243169dda363723f49 from
https://github.com/systemd-rhel/rhel-8/]
Signed-off-by: Li Zhou <li.zhou@windriver.com>
---
src/libsystemd/sd-event/sd-event.c | 30 ------------------------------
1 file changed, 30 deletions(-)
diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c
index e34fd0b..6304991 100644
--- a/src/libsystemd/sd-event/sd-event.c
+++ b/src/libsystemd/sd-event/sd-event.c
@@ -243,12 +243,6 @@ static int pending_prioq_compare(const void *a, const void *b) {
if (x->pending_iteration > y->pending_iteration)
return 1;
- /* Stability for the rest */
- if (x < y)
- return -1;
- if (x > y)
- return 1;
-
return 0;
}
@@ -278,12 +272,6 @@ static int prepare_prioq_compare(const void *a, const void *b) {
if (x->priority > y->priority)
return 1;
- /* Stability for the rest */
- if (x < y)
- return -1;
- if (x > y)
- return 1;
-
return 0;
}
@@ -311,12 +299,6 @@ static int earliest_time_prioq_compare(const void *a, const void *b) {
if (x->time.next > y->time.next)
return 1;
- /* Stability for the rest */
- if (x < y)
- return -1;
- if (x > y)
- return 1;
-
return 0;
}
@@ -344,12 +326,6 @@ static int latest_time_prioq_compare(const void *a, const void *b) {
if (x->time.next + x->time.accuracy > y->time.next + y->time.accuracy)
return 1;
- /* Stability for the rest */
- if (x < y)
- return -1;
- if (x > y)
- return 1;
-
return 0;
}
@@ -371,12 +347,6 @@ static int exit_prioq_compare(const void *a, const void *b) {
if (x->priority > y->priority)
return 1;
- /* Stability for the rest */
- if (x < y)
- return -1;
- if (x > y)
- return 1;
-
return 0;
}
--
2.17.1

View File

@ -0,0 +1,53 @@
From cf0a396c411c78d0d477d2226f89884df207aec2 Mon Sep 17 00:00:00 2001
From: Lennart Poettering <lennart@poettering.net>
Date: Mon, 1 Feb 2016 00:19:14 +0100
Subject: [PATCH 17/20] sd-event: when determining the last allowed time a time
event may elapse, deal with overflows
[commit 1bce0ffa66f329bd50d8bfaa943a755caa65b269 from
https://github.com/systemd-rhel/rhel-8/]
Signed-off-by: Li Zhou <li.zhou@windriver.com>
---
src/libsystemd/sd-event/sd-event.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c
index 6304991..63f77ac 100644
--- a/src/libsystemd/sd-event/sd-event.c
+++ b/src/libsystemd/sd-event/sd-event.c
@@ -302,6 +302,10 @@ static int earliest_time_prioq_compare(const void *a, const void *b) {
return 0;
}
+static usec_t time_event_source_latest(const sd_event_source *s) {
+ return usec_add(s->time.next, s->time.accuracy);
+}
+
static int latest_time_prioq_compare(const void *a, const void *b) {
const sd_event_source *x = a, *y = b;
@@ -321,9 +325,9 @@ static int latest_time_prioq_compare(const void *a, const void *b) {
return 1;
/* Order by time */
- if (x->time.next + x->time.accuracy < y->time.next + y->time.accuracy)
+ if (time_event_source_latest(x) < time_event_source_latest(y))
return -1;
- if (x->time.next + x->time.accuracy > y->time.next + y->time.accuracy)
+ if (time_event_source_latest(x) > time_event_source_latest(y))
return 1;
return 0;
@@ -2014,7 +2018,7 @@ static int event_arm_timer(
b = prioq_peek(d->latest);
assert_se(b && b->enabled != SD_EVENT_OFF);
- t = sleep_between(e, a->time.next, b->time.next + b->time.accuracy);
+ t = sleep_between(e, a->time.next, time_event_source_latest(b));
if (d->next == t)
return 0;
--
2.17.1

View File

@ -0,0 +1,60 @@
From c0521bcf58da1857a2077cd3b3abc330bab33598 Mon Sep 17 00:00:00 2001
From: Lennart Poettering <lennart@poettering.net>
Date: Mon, 1 Feb 2016 00:20:18 +0100
Subject: [PATCH 18/20] sd-event: permit a USEC_INFINITY timeout as an
alternative to a disabling an event source
This should simplify handling of time events in clients and is in-line with the USEC_INFINITY macro we already have.
This way setting a timeout to 0 indicates "elapse immediately", and a timeout of USEC_INFINITY "elapse never".
[commit 393003e1debf7c7f75beaacbd532b92c3e3dc729 from
https://github.com/systemd-rhel/rhel-8/
LZ: Dropped the part that won't affect code to simplify the merging.]
Signed-off-by: Li Zhou <li.zhou@windriver.com>
---
src/libsystemd/sd-event/sd-event.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c
index 63f77ac..69dd02b 100644
--- a/src/libsystemd/sd-event/sd-event.c
+++ b/src/libsystemd/sd-event/sd-event.c
@@ -1109,7 +1109,6 @@ _public_ int sd_event_add_time(
int r;
assert_return(e, -EINVAL);
- assert_return(usec != (uint64_t) -1, -EINVAL);
assert_return(accuracy != (uint64_t) -1, -EINVAL);
assert_return(e->state != SD_EVENT_FINISHED, -ESTALE);
assert_return(!event_pid_changed(e), -ECHILD);
@@ -1791,7 +1790,6 @@ _public_ int sd_event_source_get_time(sd_event_source *s, uint64_t *usec) {
_public_ int sd_event_source_set_time(sd_event_source *s, uint64_t usec) {
assert_return(s, -EINVAL);
- assert_return(usec != (uint64_t) -1, -EINVAL);
assert_return(EVENT_SOURCE_IS_TIME(s->type), -EDOM);
assert_return(s->event->state != SD_EVENT_FINISHED, -ESTALE);
assert_return(!event_pid_changed(s->event), -ECHILD);
@@ -1909,6 +1907,8 @@ static usec_t sleep_between(sd_event *e, usec_t a, usec_t b) {
if (a <= 0)
return 0;
+ if (a >= USEC_INFINITY)
+ return USEC_INFINITY;
if (b <= a + 1)
return a;
@@ -1998,7 +1998,7 @@ static int event_arm_timer(
d->needs_rearm = false;
a = prioq_peek(d->earliest);
- if (!a || a->enabled == SD_EVENT_OFF) {
+ if (!a || a->enabled == SD_EVENT_OFF || a->time.next == USEC_INFINITY) {
if (d->fd < 0)
return 0;
--
2.17.1

View File

@ -0,0 +1,841 @@
From 69266c451910d2b57313b2fe7561e07cd5400d27 Mon Sep 17 00:00:00 2001
From: Lennart Poettering <lennart@poettering.net>
Date: Mon, 23 Nov 2020 18:02:40 +0100
Subject: [PATCH 19/20] sd-event: add ability to ratelimit event sources
Let's a concept of "rate limiting" to event sources: if specific event
sources fire too often in some time interval temporarily take them
offline, and take them back online once the interval passed.
This is a simple scheme of avoiding starvation of event sources if some
event source fires too often.
This introduces the new conceptual states of "offline" and "online" for
event sources: an event source is "online" only when enabled *and* not
ratelimited, and offline in all other cases. An event source that is
online hence has its fds registered in the epoll, its signals in the
signalfd and so on.
(cherry picked from commit b6d5481b3d9f7c9b1198ab54b54326ec73e855bf)
Related: #1819868
[commit 395eb7753a9772f505102fbbe3ba3261b57abbe9 from
https://github.com/systemd-rhel/rhel-8/
LZ: Moved the changes in libsystemd.sym to libsystemd.sym.m4 from the
file changing history; patch ratelimit.h in its old path; dropped
SOURCE_INOTIFY related parts in sd-event.c because it hasn't been
added in this systemd version.]
Signed-off-by: Li Zhou <li.zhou@windriver.com>
---
src/libsystemd/libsystemd.sym.m4 | 7 +
src/libsystemd/sd-event/sd-event.c | 427 +++++++++++++++++++++++------
src/shared/ratelimit.h | 8 +
src/systemd/sd-event.h | 3 +
4 files changed, 365 insertions(+), 80 deletions(-)
diff --git a/src/libsystemd/libsystemd.sym.m4 b/src/libsystemd/libsystemd.sym.m4
index b1c2b43..ceb5d7f 100644
--- a/src/libsystemd/libsystemd.sym.m4
+++ b/src/libsystemd/libsystemd.sym.m4
@@ -169,6 +169,13 @@ global:
sd_journal_has_persistent_files;
} LIBSYSTEMD_219;
+LIBSYSTEMD_248 {
+global:
+ sd_event_source_set_ratelimit;
+ sd_event_source_get_ratelimit;
+ sd_event_source_is_ratelimited;
+} LIBSYSTEMD_229;
+
m4_ifdef(`ENABLE_KDBUS',
LIBSYSTEMD_FUTURE {
global:
diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c
index 69dd02b..a3ade40 100644
--- a/src/libsystemd/sd-event/sd-event.c
+++ b/src/libsystemd/sd-event/sd-event.c
@@ -32,6 +32,7 @@
#include "util.h"
#include "time-util.h"
#include "missing.h"
+#include "ratelimit.h"
#include "set.h"
#include "list.h"
@@ -67,7 +68,24 @@ typedef enum WakeupType {
_WAKEUP_TYPE_INVALID = -1,
} WakeupType;
-#define EVENT_SOURCE_IS_TIME(t) IN_SET((t), SOURCE_TIME_REALTIME, SOURCE_TIME_BOOTTIME, SOURCE_TIME_MONOTONIC, SOURCE_TIME_REALTIME_ALARM, SOURCE_TIME_BOOTTIME_ALARM)
+#define EVENT_SOURCE_IS_TIME(t) \
+ IN_SET((t), \
+ SOURCE_TIME_REALTIME, \
+ SOURCE_TIME_BOOTTIME, \
+ SOURCE_TIME_MONOTONIC, \
+ SOURCE_TIME_REALTIME_ALARM, \
+ SOURCE_TIME_BOOTTIME_ALARM)
+
+#define EVENT_SOURCE_CAN_RATE_LIMIT(t) \
+ IN_SET((t), \
+ SOURCE_IO, \
+ SOURCE_TIME_REALTIME, \
+ SOURCE_TIME_BOOTTIME, \
+ SOURCE_TIME_MONOTONIC, \
+ SOURCE_TIME_REALTIME_ALARM, \
+ SOURCE_TIME_BOOTTIME_ALARM, \
+ SOURCE_SIGNAL, \
+ SOURCE_DEFER)
struct sd_event_source {
WakeupType wakeup;
@@ -85,6 +103,7 @@ struct sd_event_source {
bool pending:1;
bool dispatching:1;
bool floating:1;
+ bool ratelimited:1;
int64_t priority;
unsigned pending_index;
@@ -94,6 +113,10 @@ struct sd_event_source {
LIST_FIELDS(sd_event_source, sources);
+ RateLimit rate_limit;
+
+ /* These are primarily fields relevant for time event sources, but since any event source can
+ * effectively become one when rate-limited, this is part of the common fields. */
unsigned earliest_index;
unsigned latest_index;
@@ -188,7 +211,7 @@ struct sd_event {
Hashmap *signal_data; /* indexed by priority */
Hashmap *child_sources;
- unsigned n_enabled_child_sources;
+ unsigned n_online_child_sources;
Set *post_sources;
@@ -219,8 +242,19 @@ struct sd_event {
static void source_disconnect(sd_event_source *s);
+static bool event_source_is_online(sd_event_source *s) {
+ assert(s);
+ return s->enabled != SD_EVENT_OFF && !s->ratelimited;
+}
+
+static bool event_source_is_offline(sd_event_source *s) {
+ assert(s);
+ return s->enabled == SD_EVENT_OFF || s->ratelimited;
+}
+
static int pending_prioq_compare(const void *a, const void *b) {
const sd_event_source *x = a, *y = b;
+ int r;
assert(x->pending);
assert(y->pending);
@@ -231,23 +265,23 @@ static int pending_prioq_compare(const void *a, const void *b) {
if (x->enabled == SD_EVENT_OFF && y->enabled != SD_EVENT_OFF)
return 1;
+ /* Non rate-limited ones first. */
+ r = CMP(!!x->ratelimited, !!y->ratelimited);
+ if (r != 0)
+ return r;
+
/* Lower priority values first */
- if (x->priority < y->priority)
- return -1;
- if (x->priority > y->priority)
- return 1;
+ r = CMP(x->priority, y->priority);
+ if (r != 0)
+ return r;
/* Older entries first */
- if (x->pending_iteration < y->pending_iteration)
- return -1;
- if (x->pending_iteration > y->pending_iteration)
- return 1;
-
- return 0;
+ return CMP(x->pending_iteration, y->pending_iteration);
}
static int prepare_prioq_compare(const void *a, const void *b) {
const sd_event_source *x = a, *y = b;
+ int r;
assert(x->prepare);
assert(y->prepare);
@@ -258,29 +292,46 @@ static int prepare_prioq_compare(const void *a, const void *b) {
if (x->enabled == SD_EVENT_OFF && y->enabled != SD_EVENT_OFF)
return 1;
+ /* Non rate-limited ones first. */
+ r = CMP(!!x->ratelimited, !!y->ratelimited);
+ if (r != 0)
+ return r;
+
/* Move most recently prepared ones last, so that we can stop
* preparing as soon as we hit one that has already been
* prepared in the current iteration */
- if (x->prepare_iteration < y->prepare_iteration)
- return -1;
- if (x->prepare_iteration > y->prepare_iteration)
- return 1;
+ r = CMP(x->prepare_iteration, y->prepare_iteration);
+ if (r != 0)
+ return r;
/* Lower priority values first */
- if (x->priority < y->priority)
- return -1;
- if (x->priority > y->priority)
- return 1;
+ return CMP(x->priority, y->priority);
+}
- return 0;
+static usec_t time_event_source_next(const sd_event_source *s) {
+ assert(s);
+
+ /* We have two kinds of event sources that have elapsation times associated with them: the actual
+ * time based ones and the ones for which a ratelimit can be in effect (where we want to be notified
+ * once the ratelimit time window ends). Let's return the next elapsing time depending on what we are
+ * looking at here. */
+
+ if (s->ratelimited) { /* If rate-limited the next elapsation is when the ratelimit time window ends */
+ assert(s->rate_limit.begin != 0);
+ assert(s->rate_limit.interval != 0);
+ return usec_add(s->rate_limit.begin, s->rate_limit.interval);
+ }
+
+ /* Otherwise this must be a time event source, if not ratelimited */
+ if (EVENT_SOURCE_IS_TIME(s->type))
+ return s->time.next;
+
+ return USEC_INFINITY;
}
static int earliest_time_prioq_compare(const void *a, const void *b) {
const sd_event_source *x = a, *y = b;
- assert(EVENT_SOURCE_IS_TIME(x->type));
- assert(x->type == y->type);
-
/* Enabled ones first */
if (x->enabled != SD_EVENT_OFF && y->enabled == SD_EVENT_OFF)
return -1;
@@ -294,24 +345,30 @@ static int earliest_time_prioq_compare(const void *a, const void *b) {
return 1;
/* Order by time */
- if (x->time.next < y->time.next)
- return -1;
- if (x->time.next > y->time.next)
- return 1;
-
- return 0;
+ return CMP(time_event_source_next(x), time_event_source_next(y));
}
static usec_t time_event_source_latest(const sd_event_source *s) {
- return usec_add(s->time.next, s->time.accuracy);
+ assert(s);
+
+ if (s->ratelimited) { /* For ratelimited stuff the earliest and the latest time shall actually be the
+ * same, as we should avoid adding additional inaccuracy on an inaccuracy time
+ * window */
+ assert(s->rate_limit.begin != 0);
+ assert(s->rate_limit.interval != 0);
+ return usec_add(s->rate_limit.begin, s->rate_limit.interval);
+ }
+
+ /* Must be a time event source, if not ratelimited */
+ if (EVENT_SOURCE_IS_TIME(s->type))
+ return usec_add(s->time.next, s->time.accuracy);
+
+ return USEC_INFINITY;
}
static int latest_time_prioq_compare(const void *a, const void *b) {
const sd_event_source *x = a, *y = b;
- assert(EVENT_SOURCE_IS_TIME(x->type));
- assert(x->type == y->type);
-
/* Enabled ones first */
if (x->enabled != SD_EVENT_OFF && y->enabled == SD_EVENT_OFF)
return -1;
@@ -722,12 +779,12 @@ static void event_gc_signal_data(sd_event *e, const int64_t *priority, int sig)
* the signalfd for it. */
if (sig == SIGCHLD &&
- e->n_enabled_child_sources > 0)
+ e->n_online_child_sources > 0)
return;
if (e->signal_sources &&
e->signal_sources[sig] &&
- e->signal_sources[sig]->enabled != SD_EVENT_OFF)
+ event_source_is_online(e->signal_sources[sig]))
return;
/*
@@ -774,11 +831,17 @@ static void event_source_time_prioq_reshuffle(sd_event_source *s) {
struct clock_data *d;
assert(s);
- assert(EVENT_SOURCE_IS_TIME(s->type));
/* Called whenever the event source's timer ordering properties changed, i.e. time, accuracy,
* pending, enable state. Makes sure the two prioq's are ordered properly again. */
- assert_se(d = event_get_clock_data(s->event, s->type));
+
+ if (s->ratelimited)
+ d = &s->event->monotonic;
+ else {
+ assert(EVENT_SOURCE_IS_TIME(s->type));
+ assert_se(d = event_get_clock_data(s->event, s->type));
+ }
+
prioq_reshuffle(d->earliest, s, &s->earliest_index);
prioq_reshuffle(d->latest, s, &s->latest_index);
d->needs_rearm = true;
@@ -819,12 +882,18 @@ static void source_disconnect(sd_event_source *s) {
case SOURCE_TIME_BOOTTIME:
case SOURCE_TIME_MONOTONIC:
case SOURCE_TIME_REALTIME_ALARM:
- case SOURCE_TIME_BOOTTIME_ALARM: {
- struct clock_data *d;
- assert_se(d = event_get_clock_data(s->event, s->type));
- event_source_time_prioq_remove(s, d);
+ case SOURCE_TIME_BOOTTIME_ALARM:
+ /* Only remove this event source from the time event source here if it is not ratelimited. If
+ * it is ratelimited, we'll remove it below, separately. Why? Because the clock used might
+ * differ: ratelimiting always uses CLOCK_MONOTONIC, but timer events might use any clock */
+
+ if (!s->ratelimited) {
+ struct clock_data *d;
+ assert_se(d = event_get_clock_data(s->event, s->type));
+ event_source_time_prioq_remove(s, d);
+ }
+
break;
- }
case SOURCE_SIGNAL:
if (s->signal.sig > 0) {
@@ -839,9 +908,9 @@ static void source_disconnect(sd_event_source *s) {
case SOURCE_CHILD:
if (s->child.pid > 0) {
- if (s->enabled != SD_EVENT_OFF) {
- assert(s->event->n_enabled_child_sources > 0);
- s->event->n_enabled_child_sources--;
+ if (event_source_is_online(s)) {
+ assert(s->event->n_online_child_sources > 0);
+ s->event->n_online_child_sources--;
}
(void) hashmap_remove(s->event->child_sources, INT_TO_PTR(s->child.pid));
@@ -872,6 +941,9 @@ static void source_disconnect(sd_event_source *s) {
if (s->prepare)
prioq_remove(s->event->prepare, s, &s->prepare_index);
+ if (s->ratelimited)
+ event_source_time_prioq_remove(s, &s->event->monotonic);
+
event = s->event;
s->type = _SOURCE_EVENT_SOURCE_TYPE_INVALID;
@@ -1259,11 +1331,11 @@ _public_ int sd_event_add_child(
return r;
}
- e->n_enabled_child_sources ++;
+ e->n_online_child_sources++;
r = event_make_signal_data(e, SIGCHLD, NULL);
if (r < 0) {
- e->n_enabled_child_sources--;
+ e->n_online_child_sources--;
source_free(s);
return r;
}
@@ -1476,7 +1548,7 @@ _public_ int sd_event_source_set_io_fd(sd_event_source *s, int fd) {
if (s->io.fd == fd)
return 0;
- if (s->enabled == SD_EVENT_OFF) {
+ if (event_source_is_offline(s)) {
s->io.fd = fd;
s->io.registered = false;
} else {
@@ -1524,7 +1596,7 @@ _public_ int sd_event_source_set_io_events(sd_event_source *s, uint32_t events)
if (s->io.events == events && !(events & EPOLLET))
return 0;
- if (s->enabled != SD_EVENT_OFF) {
+ if (event_source_is_online(s)) {
r = source_io_register(s, s->enabled, events);
if (r < 0)
return r;
@@ -1572,7 +1644,7 @@ _public_ int sd_event_source_set_priority(sd_event_source *s, int64_t priority)
if (s->priority == priority)
return 0;
- if (s->type == SOURCE_SIGNAL && s->enabled != SD_EVENT_OFF) {
+ if (s->type == SOURCE_SIGNAL && event_source_is_online(s)) {
struct signal_data *old, *d;
/* Move us from the signalfd belonging to the old
@@ -1609,20 +1681,29 @@ _public_ int sd_event_source_get_enabled(sd_event_source *s, int *m) {
return 0;
}
-static int event_source_disable(sd_event_source *s) {
+static int event_source_offline(
+ sd_event_source *s,
+ int enabled,
+ bool ratelimited) {
+
+ bool was_offline;
int r;
assert(s);
- assert(s->enabled != SD_EVENT_OFF);
+ assert(enabled == SD_EVENT_OFF || ratelimited);
/* Unset the pending flag when this event source is disabled */
- if (!IN_SET(s->type, SOURCE_DEFER, SOURCE_EXIT)) {
+ if (s->enabled != SD_EVENT_OFF &&
+ enabled == SD_EVENT_OFF &&
+ !IN_SET(s->type, SOURCE_DEFER, SOURCE_EXIT)) {
r = source_set_pending(s, false);
if (r < 0)
return r;
}
- s->enabled = SD_EVENT_OFF;
+ was_offline = event_source_is_offline(s);
+ s->enabled = enabled;
+ s->ratelimited = ratelimited;
switch (s->type) {
@@ -1643,8 +1724,10 @@ static int event_source_disable(sd_event_source *s) {
break;
case SOURCE_CHILD:
- assert(s->event->n_enabled_child_sources > 0);
- s->event->n_enabled_child_sources--;
+ if (!was_offline) {
+ assert(s->event->n_online_child_sources > 0);
+ s->event->n_online_child_sources--;
+ }
event_gc_signal_data(s->event, &s->priority, SIGCHLD);
break;
@@ -1661,26 +1744,42 @@ static int event_source_disable(sd_event_source *s) {
assert_not_reached("Wut? I shouldn't exist.");
}
- return 0;
+ return 1;
}
-static int event_source_enable(sd_event_source *s, int enable) {
+static int event_source_online(
+ sd_event_source *s,
+ int enabled,
+ bool ratelimited) {
+
+ bool was_online;
int r;
assert(s);
- assert(IN_SET(enable, SD_EVENT_ON, SD_EVENT_ONESHOT));
- assert(s->enabled == SD_EVENT_OFF);
+ assert(enabled != SD_EVENT_OFF || !ratelimited);
/* Unset the pending flag when this event source is enabled */
- if (!IN_SET(s->type, SOURCE_DEFER, SOURCE_EXIT)) {
+ if (s->enabled == SD_EVENT_OFF &&
+ enabled != SD_EVENT_OFF &&
+ !IN_SET(s->type, SOURCE_DEFER, SOURCE_EXIT)) {
r = source_set_pending(s, false);
if (r < 0)
return r;
}
+ /* Are we really ready for onlining? */
+ if (enabled == SD_EVENT_OFF || ratelimited) {
+ /* Nope, we are not ready for onlining, then just update the precise state and exit */
+ s->enabled = enabled;
+ s->ratelimited = ratelimited;
+ return 0;
+ }
+
+ was_online = event_source_is_online(s);
+
switch (s->type) {
case SOURCE_IO:
- r = source_io_register(s, enable, s->io.events);
+ r = source_io_register(s, enabled, s->io.events);
if (r < 0)
return r;
break;
@@ -1698,13 +1797,13 @@ static int event_source_enable(sd_event_source *s, int enable) {
r = event_make_signal_data(s->event, SIGCHLD, NULL);
if (r < 0) {
s->enabled = SD_EVENT_OFF;
- s->event->n_enabled_child_sources--;
+ s->event->n_online_child_sources--;
event_gc_signal_data(s->event, &s->priority, SIGCHLD);
return r;
}
- s->event->n_enabled_child_sources++;
-
+ if (!was_online)
+ s->event->n_online_child_sources++;
break;
case SOURCE_TIME_REALTIME:
@@ -1721,7 +1820,8 @@ static int event_source_enable(sd_event_source *s, int enable) {
assert_not_reached("Wut? I shouldn't exist.");
}
- s->enabled = enable;
+ s->enabled = enabled;
+ s->ratelimited = ratelimited;
/* Non-failing operations below */
switch (s->type) {
@@ -1741,7 +1841,7 @@ static int event_source_enable(sd_event_source *s, int enable) {
break;
}
- return 0;
+ return 1;
}
_public_ int sd_event_source_set_enabled(sd_event_source *s, int m) {
@@ -1759,7 +1859,7 @@ _public_ int sd_event_source_set_enabled(sd_event_source *s, int m) {
return 0;
if (m == SD_EVENT_OFF)
- r = event_source_disable(s);
+ r = event_source_offline(s, m, s->ratelimited);
else {
if (s->enabled != SD_EVENT_OFF) {
/* Switching from "on" to "oneshot" or back? If that's the case, we can take a shortcut, the
@@ -1768,7 +1868,7 @@ _public_ int sd_event_source_set_enabled(sd_event_source *s, int m) {
return 0;
}
- r = event_source_enable(s, m);
+ r = event_source_online(s, m, s->ratelimited);
}
if (r < 0)
return r;
@@ -1900,6 +2000,96 @@ _public_ void *sd_event_source_set_userdata(sd_event_source *s, void *userdata)
return ret;
}
+static int event_source_enter_ratelimited(sd_event_source *s) {
+ int r;
+
+ assert(s);
+
+ /* When an event source becomes ratelimited, we place it in the CLOCK_MONOTONIC priority queue, with
+ * the end of the rate limit time window, much as if it was a timer event source. */
+
+ if (s->ratelimited)
+ return 0; /* Already ratelimited, this is a NOP hence */
+
+ /* Make sure we can install a CLOCK_MONOTONIC event further down. */
+ r = setup_clock_data(s->event, &s->event->monotonic, CLOCK_MONOTONIC);
+ if (r < 0)
+ return r;
+
+ /* Timer event sources are already using the earliest/latest queues for the timer scheduling. Let's
+ * first remove them from the prioq appropriate for their own clock, so that we can use the prioq
+ * fields of the event source then for adding it to the CLOCK_MONOTONIC prioq instead. */
+ if (EVENT_SOURCE_IS_TIME(s->type))
+ event_source_time_prioq_remove(s, event_get_clock_data(s->event, s->type));
+
+ /* Now, let's add the event source to the monotonic clock instead */
+ r = event_source_time_prioq_put(s, &s->event->monotonic);
+ if (r < 0)
+ goto fail;
+
+ /* And let's take the event source officially offline */
+ r = event_source_offline(s, s->enabled, /* ratelimited= */ true);
+ if (r < 0) {
+ event_source_time_prioq_remove(s, &s->event->monotonic);
+ goto fail;
+ }
+
+ event_source_pp_prioq_reshuffle(s);
+
+ log_debug("Event source %p (%s) entered rate limit state.", s, strna(s->description));
+ return 0;
+
+fail:
+ /* Reinstall time event sources in the priority queue as before. This shouldn't fail, since the queue
+ * space for it should already be allocated. */
+ if (EVENT_SOURCE_IS_TIME(s->type))
+ assert_se(event_source_time_prioq_put(s, event_get_clock_data(s->event, s->type)) >= 0);
+
+ return r;
+}
+
+static int event_source_leave_ratelimit(sd_event_source *s) {
+ int r;
+
+ assert(s);
+
+ if (!s->ratelimited)
+ return 0;
+
+ /* Let's take the event source out of the monotonic prioq first. */
+ event_source_time_prioq_remove(s, &s->event->monotonic);
+
+ /* Let's then add the event source to its native clock prioq again — if this is a timer event source */
+ if (EVENT_SOURCE_IS_TIME(s->type)) {
+ r = event_source_time_prioq_put(s, event_get_clock_data(s->event, s->type));
+ if (r < 0)
+ goto fail;
+ }
+
+ /* Let's try to take it online again. */
+ r = event_source_online(s, s->enabled, /* ratelimited= */ false);
+ if (r < 0) {
+ /* Do something roughly sensible when this failed: undo the two prioq ops above */
+ if (EVENT_SOURCE_IS_TIME(s->type))
+ event_source_time_prioq_remove(s, event_get_clock_data(s->event, s->type));
+
+ goto fail;
+ }
+
+ event_source_pp_prioq_reshuffle(s);
+ ratelimit_reset(&s->rate_limit);
+
+ log_debug("Event source %p (%s) left rate limit state.", s, strna(s->description));
+ return 0;
+
+fail:
+ /* Do something somewhat reasonable when we cannot move an event sources out of ratelimited mode:
+ * simply put it back in it, maybe we can then process it more successfully next iteration. */
+ assert_se(event_source_time_prioq_put(s, &s->event->monotonic) >= 0);
+
+ return r;
+}
+
static usec_t sleep_between(sd_event *e, usec_t a, usec_t b) {
usec_t c;
assert(e);
@@ -1998,7 +2188,7 @@ static int event_arm_timer(
d->needs_rearm = false;
a = prioq_peek(d->earliest);
- if (!a || a->enabled == SD_EVENT_OFF || a->time.next == USEC_INFINITY) {
+ if (!a || a->enabled == SD_EVENT_OFF || time_event_source_next(a) == USEC_INFINITY) {
if (d->fd < 0)
return 0;
@@ -2018,7 +2208,7 @@ static int event_arm_timer(
b = prioq_peek(d->latest);
assert_se(b && b->enabled != SD_EVENT_OFF);
- t = sleep_between(e, a->time.next, time_event_source_latest(b));
+ t = sleep_between(e, time_event_source_next(a), time_event_source_latest(b));
if (d->next == t)
return 0;
@@ -2097,10 +2287,22 @@ static int process_timer(
for (;;) {
s = prioq_peek(d->earliest);
- if (!s ||
- s->time.next > n ||
- s->enabled == SD_EVENT_OFF ||
- s->pending)
+ if (!s || time_event_source_next(s) > n)
+ break;
+
+ if (s->ratelimited) {
+ /* This is an event sources whose ratelimit window has ended. Let's turn it on
+ * again. */
+ assert(s->ratelimited);
+
+ r = event_source_leave_ratelimit(s);
+ if (r < 0)
+ return r;
+
+ continue;
+ }
+
+ if (s->enabled == SD_EVENT_OFF || s->pending)
break;
r = source_set_pending(s, true);
@@ -2146,7 +2348,7 @@ static int process_child(sd_event *e) {
if (s->pending)
continue;
- if (s->enabled == SD_EVENT_OFF)
+ if (event_source_is_offline(s))
continue;
zero(s->child.siginfo);
@@ -2242,11 +2444,26 @@ static int process_signal(sd_event *e, struct signal_data *d, uint32_t events) {
}
static int source_dispatch(sd_event_source *s) {
+ _cleanup_(sd_event_unrefp) sd_event *saved_event = NULL;
int r = 0;
assert(s);
assert(s->pending || s->type == SOURCE_EXIT);
+ /* Similar, store a reference to the event loop object, so that we can still access it after the
+ * callback might have invalidated/disconnected the event source. */
+ saved_event = sd_event_ref(s->event);
+
+ /* Check if we hit the ratelimit for this event source, if so, let's disable it. */
+ assert(!s->ratelimited);
+ if (!ratelimit_below(&s->rate_limit)) {
+ r = event_source_enter_ratelimited(s);
+ if (r < 0)
+ return r;
+
+ return 1;
+ }
+
if (s->type != SOURCE_DEFER && s->type != SOURCE_EXIT) {
r = source_set_pending(s, false);
if (r < 0)
@@ -2356,7 +2573,7 @@ static int event_prepare(sd_event *e) {
sd_event_source *s;
s = prioq_peek(e->prepare);
- if (!s || s->prepare_iteration == e->iteration || s->enabled == SD_EVENT_OFF)
+ if (!s || s->prepare_iteration == e->iteration || event_source_is_offline(s))
break;
s->prepare_iteration = e->iteration;
@@ -2393,7 +2610,7 @@ static int dispatch_exit(sd_event *e) {
assert(e);
p = prioq_peek(e->exit);
- if (!p || p->enabled == SD_EVENT_OFF) {
+ if (!p || event_source_is_offline(p)) {
e->state = SD_EVENT_FINISHED;
return 0;
}
@@ -2419,7 +2636,7 @@ static sd_event_source* event_next_pending(sd_event *e) {
if (!p)
return NULL;
- if (p->enabled == SD_EVENT_OFF)
+ if (event_source_is_offline(p))
return NULL;
return p;
@@ -2879,3 +3096,53 @@ _public_ int sd_event_get_iteration(sd_event *e, uint64_t *ret) {
*ret = e->iteration;
return 0;
}
+
+_public_ int sd_event_source_set_ratelimit(sd_event_source *s, uint64_t interval, unsigned burst) {
+ int r;
+
+ assert_return(s, -EINVAL);
+
+ /* Turning on ratelimiting on event source types that don't support it, is a loggable offense. Doing
+ * so is a programming error. */
+ assert_return(EVENT_SOURCE_CAN_RATE_LIMIT(s->type), -EDOM);
+
+ /* When ratelimiting is configured we'll always reset the rate limit state first and start fresh,
+ * non-ratelimited. */
+ r = event_source_leave_ratelimit(s);
+ if (r < 0)
+ return r;
+
+ RATELIMIT_INIT(s->rate_limit, interval, burst);
+ return 0;
+}
+
+_public_ int sd_event_source_get_ratelimit(sd_event_source *s, uint64_t *ret_interval, unsigned *ret_burst) {
+ assert_return(s, -EINVAL);
+
+ /* Querying whether an event source has ratelimiting configured is not a loggable offsense, hence
+ * don't use assert_return(). Unlike turning on ratelimiting it's not really a programming error */
+ if (!EVENT_SOURCE_CAN_RATE_LIMIT(s->type))
+ return -EDOM;
+
+ if (!ratelimit_configured(&s->rate_limit))
+ return -ENOEXEC;
+
+ if (ret_interval)
+ *ret_interval = s->rate_limit.interval;
+ if (ret_burst)
+ *ret_burst = s->rate_limit.burst;
+
+ return 0;
+}
+
+_public_ int sd_event_source_is_ratelimited(sd_event_source *s) {
+ assert_return(s, -EINVAL);
+
+ if (!EVENT_SOURCE_CAN_RATE_LIMIT(s->type))
+ return false;
+
+ if (!ratelimit_configured(&s->rate_limit))
+ return false;
+
+ return s->ratelimited;
+}
diff --git a/src/shared/ratelimit.h b/src/shared/ratelimit.h
index 58efca7..434089e 100644
--- a/src/shared/ratelimit.h
+++ b/src/shared/ratelimit.h
@@ -55,3 +55,11 @@ typedef struct RateLimit {
} while (false)
bool ratelimit_test(RateLimit *r);
+
+static inline void ratelimit_reset(RateLimit *rl) {
+ rl->num = rl->begin = 0;
+}
+
+static inline bool ratelimit_configured(RateLimit *rl) {
+ return rl->interval > 0 && rl->burst > 0;
+}
diff --git a/src/systemd/sd-event.h b/src/systemd/sd-event.h
index ffde7c8..f297c6a 100644
--- a/src/systemd/sd-event.h
+++ b/src/systemd/sd-event.h
@@ -130,6 +130,9 @@ int sd_event_source_set_time_accuracy(sd_event_source *s, uint64_t usec);
int sd_event_source_get_time_clock(sd_event_source *s, clockid_t *clock);
int sd_event_source_get_signal(sd_event_source *s);
int sd_event_source_get_child_pid(sd_event_source *s, pid_t *pid);
+int sd_event_source_set_ratelimit(sd_event_source *s, uint64_t interval_usec, unsigned burst);
+int sd_event_source_get_ratelimit(sd_event_source *s, uint64_t *ret_interval_usec, unsigned *ret_burst);
+int sd_event_source_is_ratelimited(sd_event_source *s);
_SD_END_DECLARATIONS;
--
2.17.1

View File

@ -0,0 +1,37 @@
From dc3e079395816ce251c4794992f1816a61c1215d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Michal=20Sekleta=CC=81r?= <msekleta@redhat.com>
Date: Thu, 9 Jul 2020 18:16:44 +0200
Subject: [PATCH 20/20] core: prevent excessive /proc/self/mountinfo parsing
(cherry picked from commit d586f642fd90e3bb378f7b6d3e3a64a753e51756)
Resolves: #1819868
[commit 51737206afaa10d902c86ec9b5ec97cf425039c2 from
https://github.com/systemd-rhel/rhel-8/]
Signed-off-by: Li Zhou <li.zhou@windriver.com>
---
src/core/mount.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/src/core/mount.c b/src/core/mount.c
index c7aed23..48427b7 100644
--- a/src/core/mount.c
+++ b/src/core/mount.c
@@ -1692,6 +1692,12 @@ static int mount_enumerate(Manager *m) {
r = sd_event_source_set_priority(m->mount_utab_event_source, -10);
if (r < 0)
goto fail;
+
+ r = sd_event_source_set_ratelimit(m->mount_event_source, 1 * USEC_PER_SEC, 5);
+ if (r < 0) {
+ log_error_errno(r, "Failed to enable rate limit for mount events: %m");
+ goto fail;
+ }
}
r = mount_load_proc_self_mountinfo(m, false);
--
2.17.1

View File

@ -0,0 +1,64 @@
From 15ac2f7ffd502cdc6f4ba47d0dd70fc39c48d8d7 Mon Sep 17 00:00:00 2001
From: Li Zhou <li.zhou@windriver.com>
Date: Wed, 31 Mar 2021 16:08:18 +0800
Subject: [PATCH 21/21] systemd: Fix compiling errors when merging #1819868
A series of patches are merged in for the issue:
https://bugzilla.redhat.com/show_bug.cgi?id=1819868
This commit is for fixing the compiling errors caused by context
conflict.
Signed-off-by: Li Zhou <li.zhou@windriver.com>
---
src/libsystemd/sd-event/sd-event.c | 25 ++++++++++++++++++++++++-
1 file changed, 24 insertions(+), 1 deletion(-)
diff --git a/src/libsystemd/sd-event/sd-event.c b/src/libsystemd/sd-event/sd-event.c
index 9dc1a27..282b38f 100644
--- a/src/libsystemd/sd-event/sd-event.c
+++ b/src/libsystemd/sd-event/sd-event.c
@@ -37,9 +37,32 @@
#include "list.h"
#include "sd-event.h"
+#include "event-util.h"
#define DEFAULT_ACCURACY_USEC (250 * USEC_PER_MSEC)
+#define CMP(a, b) __CMP(UNIQ, (a), UNIQ, (b))
+#define __CMP(aq, a, bq, b) \
+ ({ \
+ const typeof(a) UNIQ_T(A, aq) = (a); \
+ const typeof(b) UNIQ_T(B, bq) = (b); \
+ UNIQ_T(A, aq) < UNIQ_T(B, bq) ? -1 : \
+ UNIQ_T(A, aq) > UNIQ_T(B, bq) ? 1 : 0; \
+ })
+
+static inline usec_t usec_add(usec_t a, usec_t b) {
+ usec_t c;
+
+ /* Adds two time values, and makes sure USEC_INFINITY as input results as USEC_INFINITY in output, and doesn't
+ * overflow. */
+
+ c = a + b;
+ if (c < a || c < b) /* overflow check */
+ return USEC_INFINITY;
+
+ return c;
+}
+
typedef enum EventSourceType {
SOURCE_IO,
SOURCE_TIME_REALTIME,
@@ -2456,7 +2479,7 @@ static int source_dispatch(sd_event_source *s) {
/* Check if we hit the ratelimit for this event source, if so, let's disable it. */
assert(!s->ratelimited);
- if (!ratelimit_below(&s->rate_limit)) {
+ if (!ratelimit_test(&s->rate_limit)) {
r = event_source_enter_ratelimited(s);
if (r < 0)
return r;
--
2.17.1