/*
 * Copyright (C) Internet Systems Consortium, Inc. ("ISC")
 *
 * SPDX-License-Identifier: MPL-2.0
 *
 * This Source Code Form is subject to the terms of the Mozilla Public
 * License, v. 2.0. If a copy of the MPL was not distributed with this
 * file, you can obtain one at https://mozilla.org/MPL/2.0/.
 *
 * See the COPYRIGHT file distributed with this work for additional
 * information regarding copyright ownership.
 */

/*! \file */

#include <inttypes.h>
#include <stdlib.h>

#include <isc/mem.h>
#include <isc/once.h>
#include <isc/thread.h>
#include <isc/util.h>
#include <isc/uv.h>

#include "trampoline_p.h"

#define ISC__TRAMPOLINE_UNUSED 0

struct isc__trampoline {
	int tid; /* const */
	uintptr_t self;
	isc_threadfunc_t start;
	isc_threadarg_t arg;
	void *jemalloc_enforce_init;
};

/*
 * We can't use the isc_mem API here, because this code runs too early
 * and the isc_mem_debugging flags can be changed later, causing a
 * mismatch between the flags used for isc_mem_get() and isc_mem_put().
 *
 * Since this is a single allocation at library load and a single
 * deallocation at library unload, using the standard allocator without
 * tracking is fine for this limited purpose.
 *
 * We can't use the isc_mutex API either, because we track whether
 * mutexes are properly destroyed, and we intentionally leak the static
 * mutex here without destroying it, to prevent a data race between the
 * library destructor and a thread that is still being created.
 */

static uv_mutex_t isc__trampoline_lock;
static isc__trampoline_t **trampolines;
thread_local size_t isc_tid_v = SIZE_MAX;
static size_t isc__trampoline_min = 1;
static size_t isc__trampoline_max = 65;

static isc__trampoline_t *
isc__trampoline_new(int tid, isc_threadfunc_t start, isc_threadarg_t arg) {
	isc__trampoline_t *trampoline = calloc(1, sizeof(*trampoline));
	RUNTIME_CHECK(trampoline != NULL);

	*trampoline = (isc__trampoline_t){
		.tid = tid,
		.start = start,
		.arg = arg,
		.self = ISC__TRAMPOLINE_UNUSED,
	};

	return (trampoline);
}

void
isc__trampoline_initialize(void) {
	uv_mutex_init(&isc__trampoline_lock);

	trampolines = calloc(isc__trampoline_max, sizeof(trampolines[0]));
	RUNTIME_CHECK(trampolines != NULL);

	/* Reserve trampoline slot 0 for the main thread */
	trampolines[0] = isc__trampoline_new(0, NULL, NULL);
	isc_tid_v = trampolines[0]->tid;
	trampolines[0]->self = isc_thread_self();

	/* Initialize the other trampolines */
	for (size_t i = 1; i < isc__trampoline_max; i++) {
		trampolines[i] = NULL;
	}
	isc__trampoline_min = 1;
}

void
isc__trampoline_shutdown(void) {
	/*
	 * When the program using the library exits abruptly and the
	 * library gets unloaded, there might be some existing trampolines
	 * from unjoined threads. We intentionally ignore those and don't
	 * check whether all trampolines have been cleared before exiting,
	 * so we leak a small amount of resources here, including the lock.
	 */
	free(trampolines[0]);
}

isc__trampoline_t *
isc__trampoline_get(isc_threadfunc_t start, isc_threadarg_t arg) {
	isc__trampoline_t **tmp = NULL;
	isc__trampoline_t *trampoline = NULL;

	uv_mutex_lock(&isc__trampoline_lock);
again:
	for (size_t i = isc__trampoline_min; i < isc__trampoline_max; i++) {
		if (trampolines[i] == NULL) {
			trampoline = isc__trampoline_new(i, start, arg);
			trampolines[i] = trampoline;
			isc__trampoline_min = i + 1;
			goto done;
		}
	}

	/* No free slot found: double the table and retry */
	tmp = calloc(2 * isc__trampoline_max, sizeof(trampolines[0]));
	RUNTIME_CHECK(tmp != NULL);
	for (size_t i = 0; i < isc__trampoline_max; i++) {
		tmp[i] = trampolines[i];
	}
	for (size_t i = isc__trampoline_max; i < 2 * isc__trampoline_max; i++) {
		tmp[i] = NULL;
	}
	free(trampolines);
	trampolines = tmp;
	isc__trampoline_max = isc__trampoline_max * 2;
	goto again;
done:
	INSIST(trampoline != NULL);
	uv_mutex_unlock(&isc__trampoline_lock);

	return (trampoline);
}
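
/*
 * The slot-search-then-double strategy above can be illustrated with a
 * minimal stand-alone sketch.  The names below (slot_get, slots, slot_min,
 * slot_max) are hypothetical and not part of the libisc API; the point is
 * the cached-minimum scan plus table doubling that isc__trampoline_get()
 * uses.
 */

```c
#include <assert.h>
#include <stdlib.h>

static void **slots = NULL;
static size_t slot_min = 1; /* slot 0 is reserved, as for the main thread */
static size_t slot_max = 4;

static size_t
slot_get(void *value) {
	if (slots == NULL) {
		slots = calloc(slot_max, sizeof(slots[0]));
		assert(slots != NULL);
	}
again:
	/* Scan from the cached minimum for the first free slot. */
	for (size_t i = slot_min; i < slot_max; i++) {
		if (slots[i] == NULL) {
			slots[i] = value;
			slot_min = i + 1; /* the next search starts here */
			return (i);
		}
	}
	/* Table full: double it, preserving the existing entries. */
	void **tmp = calloc(2 * slot_max, sizeof(slots[0]));
	assert(tmp != NULL);
	for (size_t i = 0; i < slot_max; i++) {
		tmp[i] = slots[i];
	}
	free(slots);
	slots = tmp;
	slot_max *= 2;
	goto again;
}
```

With an initial capacity of 4 and slot 0 reserved, the first three calls
hand out slots 1-3; the fourth call finds the table full, doubles it to 8,
and returns slot 4.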

void
isc__trampoline_detach(isc__trampoline_t *trampoline) {
	uv_mutex_lock(&isc__trampoline_lock);
	REQUIRE(trampoline->self == isc_thread_self());
	REQUIRE(trampoline->tid > 0);
	REQUIRE((size_t)trampoline->tid < isc__trampoline_max);
	REQUIRE(trampolines[trampoline->tid] == trampoline);

	trampolines[trampoline->tid] = NULL;

	if (isc__trampoline_min > (size_t)trampoline->tid) {
		isc__trampoline_min = trampoline->tid;
	}

	free(trampoline->jemalloc_enforce_init);
	free(trampoline);

	uv_mutex_unlock(&isc__trampoline_lock);
	return;
}

void
isc__trampoline_attach(isc__trampoline_t *trampoline) {
	uv_mutex_lock(&isc__trampoline_lock);
	REQUIRE(trampoline->self == ISC__TRAMPOLINE_UNUSED);
	REQUIRE(trampoline->tid > 0);
	REQUIRE((size_t)trampoline->tid < isc__trampoline_max);
	REQUIRE(trampolines[trampoline->tid] == trampoline);

	/* Initialize the trampoline */
	isc_tid_v = trampoline->tid;
	trampoline->self = isc_thread_self();

	/*
	 * Ensure every thread starts with a malloc() call, to prevent
	 * memory bloat caused by a jemalloc quirk.  The dummy allocation
	 * is never used for anything, but free() must not be called on it
	 * immediately: an optimizing compiler could otherwise strip away
	 * the malloc() + free() pair altogether, defeating the workaround.
	 */
	trampoline->jemalloc_enforce_init = malloc(8);
	uv_mutex_unlock(&isc__trampoline_lock);
}

isc_threadresult_t
isc__trampoline_run(isc_threadarg_t arg) {
	isc__trampoline_t *trampoline = (isc__trampoline_t *)arg;
	isc_threadresult_t result;

	isc__trampoline_attach(trampoline);

	/* Run the main function */
	result = (trampoline->start)(trampoline->arg);

	isc__trampoline_detach(trampoline);

	return (result);
}
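
/*
 * The trampoline pattern as a whole — pack the real start routine and its
 * argument into a heap object, and have a wrapper set per-thread state
 * before chaining to it — can be sketched with plain pthreads.  All names
 * below (demo_*) are hypothetical; this is not the libisc API, just the
 * same shape as isc__trampoline_run() with attach/detach inlined.
 */

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

static _Thread_local int demo_tid = -1;

struct demo_trampoline {
	int tid;
	void *(*start)(void *);
	void *arg;
};

/* The wrapper passed to pthread_create(): attach, run, detach. */
static void *
demo_trampoline_run(void *arg) {
	struct demo_trampoline *t = arg;
	demo_tid = t->tid;		 /* attach: publish the thread id */
	void *result = t->start(t->arg); /* run the real start routine */
	free(t);			 /* detach: release the trampoline */
	return (result);
}

/* A worker that observes the tid set up by the trampoline. */
static void *
demo_worker(void *arg) {
	(void)arg;
	return ((void *)(intptr_t)demo_tid);
}

static intptr_t
demo_spawn(int tid) {
	struct demo_trampoline *t = malloc(sizeof(*t));
	assert(t != NULL);
	*t = (struct demo_trampoline){
		.tid = tid,
		.start = demo_worker,
		.arg = NULL,
	};

	pthread_t thread;
	void *result = NULL;
	pthread_create(&thread, NULL, demo_trampoline_run, t);
	pthread_join(thread, &result);
	return ((intptr_t)result);
}
```

The worker sees the tid installed by the wrapper, while the spawning
thread's own thread-local value is untouched.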