diff --git a/doc/dev/qsbr.md b/doc/dev/qsbr.md
deleted file mode 100644
index 7880a56bd4..0000000000
--- a/doc/dev/qsbr.md
+++ /dev/null
@@ -1,397 +0,0 @@
-
-
-QSBR: quiescent state based reclamation
-=======================================
-
-QSBR is a safe memory reclamation (SMR) algorithm for lock-free data
-structures such as a qp-trie. (See `doc/dev/qp.md`.)
-
-When an object is unlinked from a lock-free data structure, it
-cannot be `free()`ed immediately, because there can still be readers
-accessing the object via an old version of the data structure. SMR
-algorithms determine when it is safe to reclaim memory after it has
-been unlinked.
-
-
-Introductions and overviews
----------------------------
-
-There is a terse overview in `include/isc/qsbr.h`.
-
-Jeff Preshing has a nice introduction to QSBR.
-
-At the end of this note is a copy of a blog post about writing BIND's
-`isc_qsbr`.
-
-[Paul McKenney's web page][paulmck] has links to his book on
-concurrent programming, the [Userspace RCU library][urcu], and more.
-McKenney invented RCU and QSBR. RCU is the Linux kernel's machinery
-for lock-free data structures and safe memory reclamation, based on
-QSBR.
-
-[paulmck]: http://www.rdrop.com/~paulmck/
-[urcu]: https://liburcu.org/
-
-
-Example code
-------------
-
-If you are implementing a lock-free data structure that needs safe
-memory reclamation, here's a guide to using `isc_qsbr`, based on how
-QSBR is used by `dns_qp`.
-
-### registration
-
-When the program starts up you need to register a global callback
-function that will reclaim unused memory. You can do so using an
-ISC_CONSTRUCTOR function that runs automatically at startup.
-
-    static void
-    qp_qsbr_register(void) ISC_CONSTRUCTOR;
-    static void
-    qp_qsbr_register(void) {
-            isc_qsbr_register(qp_qsbr_reclaimer);
-    }
-
-### work list
-
-Your module will need somewhere that your callback can find the work
-it needs to do. The qp-trie has an atomic list of `dns_qpmulti_t`
-objects for this purpose.
-
-    /* a global variable */
-    static ISC_ASTACK(dns_qpmulti_t) qsbr_work;
-
-The reason for using global variables is so that we don't need to
-allocate a thunk every time we have memory reclamation work to do.
-
-### read-only access
-
-You should design your data structure so that it has a single atomic
-root pointer referring to its current version. A lock-free reader
-_must_ run in an `isc_loop` callback. It gains access to the data
-structure by taking a copy of this pointer:
-
-    qp_node_t *reader = atomic_load_acquire(&multi->reader);
-
-During an `isc_loop` callback, a reader should keep using the same
-pointer to get a consistent view of the data structure. If it reloads
-the pointer, it can get a different version changed by concurrent
-writers.
-
-A reader _must_ stop using the root pointer and any interior pointers
-obtained via the root pointer before it returns to the `isc_loop`.
-
-### modifications and writes
-
-All changes to the data structure must be copy-on-write (aka
-read-copy-update) so that concurrent readers are not disturbed.
-
-When a new version of the data structure has been prepared, it is
-committed by overwriting the atomic root pointer,
-
-    atomic_store_release(&multi->reader, reader); /* COMMIT */
-
-### scheduling cleanup
-
-After committing a change, your data structure may have memory that
-will become free once concurrent readers have stopped accessing it.
-To reclaim the memory when it is safe, use code like:
-
-    isc_qsbr_phase_t phase = isc_qsbr_phase(multi->loopmgr);
-    if (defer_chunk_reclamation(qp, phase)) {
-            ISC_ASTACK_ADD(qsbr_work, multi, cleanup);
-            isc_qsbr_activate(multi->loopmgr, phase);
-    }
-
- * First, get the current QSBR phase.
-
- * Second, mark free memory with the phase number. The qp-trie scans
-   its chunks and marks those that will become free, and returns
-   `true` if there is cleanup work to do.
-
- * If so, the qp-trie is added to the work list. (`ISC_ASTACK_ADD()`
-   is idempotent.)
-
- * Finally, QSBR is informed that there is work to do.
-
-In other cases it might not make sense to scan the data structure
-after committing, and instead you might make note of which memory to
-clean up while making changes before you know what the phase will be.
-You can then have per-phase work lists, like:
-
-    static ISC_ASTACK(my_work_t) qsbr_work[ISC_QSBR_PHASES];
-
-    isc_qsbr_phase_t phase = isc_qsbr_phase(loopmgr);
-    ISC_ASTACK_ADD(qsbr_work[phase], cleanup_work, link);
-    isc_qsbr_activate(loopmgr, phase);
-
-In general, there will be several (maybe many) write operations during
-a grace period. Your lock-free data structure should collect its
-reclamation work from all these writes into a batch per phase, i.e.
-per grace period.
-
-### reclaiming
-
-Inside the reclaimer callback, we iterate over the work list and clean
-up each item on it. If there is more cleanup work to do in another
-phase, we put the qp-trie back on the work list for another go.
-
-    static void
-    qp_qsbr_reclaimer(isc_qsbr_phase_t phase) {
-            ISC_STACK(dns_qpmulti_t) drain = ISC_ASTACK_TO_STACK(qsbr_work);
-            while (!ISC_STACK_EMPTY(drain)) {
-                    dns_qpmulti_t *multi = ISC_STACK_POP(drain, cleanup);
-                    INSIST(QPMULTI_VALID(multi));
-                    LOCK(&multi->mutex);
-                    if (reclaim_chunks(&multi->writer, phase)) {
-                            /* more to do next time */
-                            ISC_ASTACK_ADD(qsbr_work, multi, cleanup);
-                    }
-                    UNLOCK(&multi->mutex);
-            }
-    }
-
-### reclaim marks
-
-In the qp-trie data structure, each chunk has some metadata which
-includes a bitfield for the reclaim phase:
-
-    isc_qsbr_phase_t phase : ISC_QSBR_PHASE_BITS;
-
-We use a bitfield so that all the metadata fits in a single word.
-
-
-------------------------------------------------------------------------
-
-Safe memory reclamation for BIND
-================================
-
-At the end of October 2022, I _finally_ got [my multithreaded
-qp-trie][qp-gc] working! It could be built with two different
-concurrency control mechanisms:
-
- * A reader/writer lock
-
-   This has poor read-side scalability, because every thread is
-   hammering on the same shared location. But its write performance
-   is reasonably good: concurrent readers don't slow it down too much.
-
- * [`liburcu`, userland read-copy-update][urcu]
-
-   RCU has a fast and scalable read side, nice! But on the write side
-   I used `synchronize_rcu()`, which is blocking and rather slow, so
-   my write performance was terrible.
-
-OK, but I want the best of both worlds! To fix it, I needed to change
-the qp-trie code to use safe memory reclamation more effectively:
-instead of blocking inside `synchronize_rcu()` before cleaning up, use
-`call_rcu()` to clean up asynchronously. I expect I'll write about the
-qp-trie changes another time.
-
-Another issue is that I want the best of both worlds _by default_,
-but `liburcu` is [LGPL][] and we don't want BIND to depend on
-code whose licence demands more from our users than the [MPL][].
-
-[qp-gc]: https://dotat.at/@/2021-06-23-page-based-gc-for-qp-trie-rcu.html
-[LGPL]: https://opensource.org/licenses/LGPL-2.1
-[MPL]: https://opensource.org/licenses/MPL-2.0
-
-So I set out to write my own safe memory reclamation support code.
-
-
-lock freedom
-------------
-
-In a [multithreaded qp-trie][qp-gc], there can be many concurrent
-readers, but there can be only one writer at a time and modifications
-are strictly serialized. When I have got it working properly, readers
-are completely wait-free, unaffected by other readers, and almost
-unaffected by writers. Writers need to get a mutex to ensure there is
-only one at a time, but once the mutex is acquired, a writer is not
-obstructed by readers.
-
-The way this works is that readers use an atomic load to get a pointer
-to the root of the current version of the trie. Readers can make
-multiple queries using this root pointer and the results will be
-consistent wrt that particular version, regardless of what changes
-writers might be making concurrently. Writers do not affect readers
-because all changes are made by copy-on-write. When a writer is ready
-to commit a new version of the trie, it uses an atomic store to flip
-the root pointer.
-
-
-safe memory reclamation
------------------------
-
-We can't copy-on-write indefinitely: we need to reclaim the memory
-used by old versions of the trie. And we must do so "safely", i.e.
-without `free()`ing memory that readers are still using.
-
-So, before `free()`ing memory, a writer must wait for a _"grace
-period"_, which is a jargon term meaning "until readers are not using
-the old version". There are a bunch of algorithms for determining when
-a grace period is over, with varying amounts of over-approximation,
-CPU overhead, and memory backlog.
-
-The [RCU][urcu] function `synchronize_rcu()` is slow because it blocks
-waiting for a grace period; the `call_rcu()` function runs a callback
-asynchronously after a grace period has passed. I wanted to avoid
-blocking my writers, so I needed to implement something like
-`call_rcu()`.
-
-
-aversions
----------
-
-When I started trying to work out how to do safe memory reclamation,
-it all seemed quite intimidating. But as I learned more, I found that
-my circumstances make it easier than it appeared at first.
-
-The [`liburcu`][urcu] homepage has a long list of supported CPU
-architectures and operating systems. Do I have to care about those
-details too? No! The RCU code dates back to before the age of
-standardized concurrent memory models, so the RCU developers had to
-invent their own atomic primitives and correctness rules. Twenty-ish
-years later the state of the art has advanced, so I can use
-`<stdatomic.h>` without having to re-do it like `liburcu`.
-
-You can also choose between several algorithms implemented by
-[`liburcu`][urcu], involving questions about kernel support, specially
-reserved signals, and intrusiveness in application code. But while I
-was working out how to schedule asynchronous memory reclamation work,
-I realised that BIND is already well-suited to the fastest flavour of
-RCU, called "QSBR".
-
-
-QSBR
-----
-
-QSBR stands for "quiescent state based reclamation". A _"quiescent
-state"_ is a fancy name for a point when a thread is not accessing a
-lock-free data structure, and does not retain any root pointers or
-interior pointers.
-
-When a thread has passed through a quiescent state, it no longer has
-access to older versions of the data structures. When _all_ threads
-have passed through quiescent states, then nothing in the program has
-access to old versions. This is how QSBR detects grace periods: after
-a writer commits a new version, it waits for all threads to pass
-through quiescent states, and therefore a grace period has definitely
-elapsed, and so it is then safe to reclaim the old version's memory.
-
-QSBR is fast because readers do not need to explicitly mark the
-critical section surrounding the atomic load that I mentioned earlier.
-Threads just need to pass through a quiescent state frequently enough
-that there isn't a huge build-up of unreclaimed memory.
-
-Inside an operating system kernel (RCU's native environment), a
-context switch provides a natural quiescent state. In a userland
-application, you need to find a good place to call
-`rcu_quiescent_state()`. You could call it every time you have
-finished using a root pointer, but marking a quiescent state is not
-completely free, so there are probably more efficient ways.
-
-
-`libuv`
--------
-
-BIND is multithreaded, and (basically) each thread runs an event loop.
-Recent versions of BIND use [`libuv`][uv] for the event loops.
-
-A lot of things started falling into place when I realised that the
-`libuv` event loop gives BIND a [natural quiescent state][uv-loop]:
-when the event callbacks have finished running, and `libuv` is about
-to call `select()` or `poll()` or whatever, we can mark a quiescent
-state. We can require that event-handling functions do not stash root
-pointers in the heap, but only use them via local variables, so we
-know that old versions are inaccessible after the callback returns.
-
-My design marks a quiescent state once per loop, so on a busy server
-where each loop has lots to do, the cost of marking a quiescent state
-is amortized across several I/O events.
-
-[uv]: http://libuv.org/
-[uv-loop]: http://docs.libuv.org/en/v1.x/design.html#the-i-o-loop
-
-
-fuzzy barrier
--------------
-
-So, how do we mark a quiescent state? Using a _"fuzzy barrier"_.
-
-When a thread reaches a normal barrier, it blocks until all the other
-threads have reached the barrier, after which exactly one of the
-threads can enter a protected section of code, and the others are
-unblocked and can proceed as normal.
-
-When a thread encounters a fuzzy barrier, it never blocks. It either
-proceeds immediately as normal, or if it is the last thread to reach
-the barrier, it enters the protected code.
-
-RCU does not actually use a fuzzy barrier as I have described it. Like
-a fuzzy barrier, each thread keeps track of whether it has passed
-through a quiescent state in the current grace period, without
-blocking; but unlike a fuzzy barrier, no thread is diverted to the
-protected code. Instead, code that wants to enter a protected section
-uses the blocking `synchronize_rcu()` function.
-
-
-EBR-ish
--------
-
-As in the paper ["performance of memory reclamation for lockless
-synchronization"][HMBW], my implementation of QSBR uses a fuzzy
-barrier designed for another safe memory reclamation algorithm, EBR,
-epoch based reclamation. (EBR was invented here in Cambridge by [Keir
-Fraser][tr579].)
-
-[HMBW]: http://csng.cs.toronto.edu/publication_files/0000/0159/jpdc07.pdf
-[tr579]: https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-579.html
-
-Actually, my fuzzy barrier is slightly different to EBR's. In EBR, the
-fuzzy barrier is used every time the program enters a critical
-section. (In qp-trie terms, that would be every time a reader fetches
-a root pointer.) So it is vital that EBR's barrier avoids mutating
-shared state, because that would wreck multithreaded performance.
-
-Because BIND will only pass through the fuzzy barrier when it is about
-to use a blocking system call, my version mutates shared state more
-frequently (typically, once per CPU per grace period, instead of once
-per grace period). If this turns out to be a problem, it won't be too
-hard to make it work more like EBR.
-
-More trivially, I'm using the term "phase" instead of "epoch", because
-it's nothing to do with the unix epoch, because there are three
-phases, and because I can talk about phase transitions and threads
-being out of phase with each other.
-
-
-coda
-----
-
-While reading various RCU-related papers, I was amused by ["user-level
-implementations of read-copy update"][DMSDW], which says:
-
-> BIND, a major domain-name server used for Internet domain-name
-> resolution, is facing scalability issues. Since domain names
-> are read often but rarely updated, using user-level RCU might be
-> beneficial.
-
-Yes, I think it might :-)
-
-[DMSDW]: https://www.efficios.com/publications/
diff --git a/lib/isc/Makefile.am b/lib/isc/Makefile.am
index 7d74517464..061fa24e7c 100644
--- a/lib/isc/Makefile.am
+++ b/lib/isc/Makefile.am
@@ -66,7 +66,6 @@ libisc_la_HEADERS = \
include/isc/pause.h \
include/isc/portset.h \
include/isc/quota.h \
- include/isc/qsbr.h \
include/isc/radix.h \
include/isc/random.h \
include/isc/ratelimiter.h \
@@ -174,7 +173,6 @@ libisc_la_SOURCES = \
picohttpparser.h \
portset.c \
quota.c \
- qsbr.c \
radix.c \
random.c \
ratelimiter.c \
diff --git a/lib/isc/include/isc/loop.h b/lib/isc/include/isc/loop.h
index 93330e60d5..de94e25a22 100644
--- a/lib/isc/include/isc/loop.h
+++ b/lib/isc/include/isc/loop.h
@@ -69,17 +69,6 @@ isc_loopmgr_run(isc_loopmgr_t *loopmgr);
*\li 'loopmgr' is a valid loop manager.
*/
-void
-isc_loopmgr_wakeup(isc_loopmgr_t *loopmgr);
-/*%<
- * Send no-op events to wake up all running loops in 'loopmgr' except
- * the current one.
- *
- * Requires:
- *\li 'loopmgr' is a valid loop manager.
- *\li We are in a running loop.
- */
-
void
isc_loopmgr_pause(isc_loopmgr_t *loopmgr);
/*%<
diff --git a/lib/isc/include/isc/qsbr.h b/lib/isc/include/isc/qsbr.h
deleted file mode 100644
index 242c6ac45b..0000000000
--- a/lib/isc/include/isc/qsbr.h
+++ /dev/null
@@ -1,282 +0,0 @@
-/*
- * Copyright (C) Internet Systems Consortium, Inc. ("ISC")
- *
- * SPDX-License-Identifier: MPL-2.0
- *
- * This Source Code Form is subject to the terms of the Mozilla Public
- * License, v. 2.0. If a copy of the MPL was not distributed with this
- * file, you can obtain one at https://mozilla.org/MPL/2.0/.
- *
- * See the COPYRIGHT file distributed with this work for additional
- * information regarding copyright ownership.
- */
-
-#pragma once
-
-#include
-#include
-#include
-#include
-
-/*
- * Quiescent state based reclamation
- * =================================
- *
- * QSBR is a safe memory reclamation algorithm for lock-free data
- * structures such as a qp-trie.
- *
- * When an object is unlinked from a lock-free data structure, it
- * cannot be free()d immediately, because there can still be readers
- * accessing the object via an old version of the data structure. SMR
- * algorithms determine when it is safe to reclaim memory after it has
- * been unlinked.
- *
- * With QSBR, reading a data structure is wait-free. All that is
- * required is an atomic load to get the data structure's current
- * root; there is no need to explicitly mark any read-side critical
- * section.
- *
- * QSBR is used by RCU (read-copy-update) in the Linux kernel. BIND's
- * implementation also uses some ideas from EBR (epoch-based reclamation).
- * The following summary is based on the overview in the paper
- * "performance of memory reclamation for lockless synchronization",
- * (http://csng.cs.toronto.edu/publication_files/0000/0159/jpdc07.pdf).
- *
- * Aside: This QSBR implementation is somewhat different from the one
- * in liburcu, described in the paper "user-level implementations of
- * read-copy update", (https://www.efficios.com/publications/), which
- * contains the amusing comment:
- *
- * BIND, a major domain-name server used for Internet domain-name
- * resolution, is facing scalability issues. Since domain names
- * are read often but rarely updated, using user-level RCU might
- * be beneficial.
- *
- * A "quiescent state" is a point when a thread is not accessing any
- * lock-free data structure. After passing through a quiescent state,
- * a thread can no longer access versions of a data structure that
- * were replaced before that point. In BIND, we use a point in the
- * event loop (a uv_prepare_t callback) to identify a quiescent state.
- *
- * Aside: a prepare handle runs its callbacks before the loop sleeps,
- * which reduces reclaim latency (unlike a check handle) and it does
- * not affect timeout calculations (unlike an idle handle).
- *
- * A "grace period" is any time interval such that after the end of
- * the grace period, all objects removed before the start of the grace
- * period can safely be reclaimed. Different SMR algorithms detect
- * grace periods with varying degrees of tightness or looseness.
- *
- * QSBR uses quiescent states to detect grace periods: a grace period
- * is a time interval in which every thread passes through a quiescent
- * state. (This is a safe over-estimate.) A "fuzzy barrier" is used to
- * find out when all threads have passed through a quiescent state.
- *
- * NOTE: In BIND this means that code which is not running in an event
- * loop thread (such as an isc_work / uv_work_t callback) must use
- * locking (not lock-free) data structure accessors.
- *
- * Because a quiescent state happens once per event loop, a grace
- * period takes roughly the same amount of time as the slowest event
- * loop in each cycle.
- *
- * Similar to the paper linked above, this QSBR implementation uses a
- * variant of the EBR fuzzy barrier. Like EBR, each grace period is
- * numbered with a "phase", which cycles round 1,2,3,1,2,3,... (Phases
- * are called epochs in EBR, but I think "phase" is a better metaphor.)
- * When entering the fuzzy barrier, each thread updates its local phase
- * to match the global phase, keeping a global count of the number of
- * threads still to pass. When this count reaches zero, it is the end of
- * the grace period; the global phase is updated and reclamation is
- * triggered.
- *
- * Note that threads are usually slightly out-of-phase wrt the global
- * grace period. At any particular point in time, there will be some
- * threads in the current global phase, and some in the previous
- * global phase. EBR has three phases because that is the minimum
- * number that leaves one phase unoccupied by readers. Any objects that
- * were detached from the data structure in the third phase can be
- * reclaimed after the start of the current phase, because a grace
- * period (the previous phase) has elapsed since the objects were
- * detached.
- *
- * A phase number can be used by a lock-free data structure (such as a
- * qp-trie) to record when an object was detached. QSBR calls the data
- * structure's reclaimer function, passing a phase number indicating
- * that objects detached in that phase can now be reclaimed.
- *
- * In general, there will be several (maybe many) write operations
- * during a grace period. The lock-free data structures that use QSBR
- * will collect their reclamation work from all these writes into a
- * batch per phase, i.e. per grace period.
- *
- * There is some example code in `doc/dev/qsbr.md`, with pointers to
- * less terse introductions to QSBR and other overview material.
- */
-
-#define ISC_QSBR_PHASE_BITS 2
-
-typedef unsigned int isc_qsbr_phase_t;
-/*%<
- * A grace period phase number. It can be stored in a bitfield of size
- * ISC_QSBR_PHASE_BITS. You can use zero to indicate "no phase".
- * (Don't assume the maximum is three: We might want to increase the
- * number of phases so that there is more than one unoccupied phase.
- * This would allow concurrent reclamation of objects released in
- * multiple unoccupied phases.)
- */
-
-typedef void
-isc_qsbreclaimer_t(isc_qsbr_phase_t phase);
-/*%<
- * The type of memory reclaimer callback functions.
- *
- * The `phase` identifies which objects are to be reclaimed.
- *
- * An isc_qsbreclaimer_t can call isc_qsbr_activate() if it could not
- * reclaim everything and needs to be called again.
- */
-
-typedef struct isc_qsbr_registered {
- ISC_SLINK(struct isc_qsbr_registered) link;
- isc_qsbreclaimer_t *func;
-} isc_qsbr_registered_t;
-/*%<
- * Each reclaimer callback has a static `isc_qsbr_registered_t` object
- * so that QSBR can find it.
- */
-
-void
-isc__qsbr_register(isc_qsbr_registered_t *reg);
-/*%<
- * Requires:
- * \li reg->link is not linked
- * \li reg->func is not NULL
- */
-
-#define isc_qsbr_register(cb)                                 \
-	do {                                                  \
-		static isc_qsbr_registered_t registration = { \
-			.link = ISC_SLINK_INITIALIZER,        \
-			.func = cb,                           \
-		};                                            \
-		isc__qsbr_register(&registration);            \
-	} while (0)
-/*%<
- * Register a callback function with QSBR. This macro should be used
- * inside an `ISC_CONSTRUCTOR` function. There should be one callback
- * for each lock-free data structure implementation, which is able to
- * reclaim all the unused memory across all instances of its data
- * structure.
- */
-
-isc_qsbr_phase_t
-isc_qsbr_phase(isc_loopmgr_t *loopmgr);
-/*%<
- * Get the current phase, to use for marking detached objects.
- *
- * To commit a write that requires cleanup, the ordering must be:
- *
- * - Use atomic_store_release() to commit the data structure's new
- * root pointer; release ordering ensures that the interior changes
- * are written before the root pointer.
- *
- * - Call isc_qsbr_phase() to get the phase to be used for marking
- * objects to reclaim. This must happen after the commit, to ensure
- * there is at least one grace period between commit and cleanup.
- *
- * - Pass the same phase to isc_qsbr_activate() so that the reclaimer
- * will be called after a grace period has passed.
- */
-
-void
-isc_qsbr_activate(isc_loopmgr_t *loopmgr, isc_qsbr_phase_t phase);
-/*%<
- * Tell QSBR that objects have been detached and will need reclaiming
- * after a grace period.
- */
-
-/***********************************************************************
- *
- * private parts
- */
-
-/*
- * Accessors and constructors for the `grace` variable.
- * It contains two bit fields:
- *
- * - the global phase in the lower ISC_QSBR_PHASE_BITS
- *
- * - a thread counter in the upper bits
- */
-
-#define ISC_QSBR_ONE_THREAD (1 << ISC_QSBR_PHASE_BITS)
-#define ISC_QSBR_PHASE_MAX (ISC_QSBR_ONE_THREAD - 1)
-
-#define ISC_QSBR_GRACE_PHASE(grace) (grace & ISC_QSBR_PHASE_MAX)
-#define ISC_QSBR_GRACE_THREADS(grace) (grace >> ISC_QSBR_PHASE_BITS)
-#define ISC_QSBR_GRACE(threads, phase) \
- ((threads << ISC_QSBR_PHASE_BITS) | phase)
-
-typedef struct isc_qsbr {
- /*
- * The `grace` variable keeps track of the current grace period.
- * When the phase changes, the thread counter is set to the number of
- * threads that need to observe the new phase before the grace period
- * can end.
- *
- * The thread counter is an add-on to the usual EBR fuzzy barrier.
- * Counting threads through the barrier adds multi-thread update
- * contention, and in EBR the fuzzy barrier runs frequently enough
- * (on every access) that it's important to minimize its cost. With
- * QSBR, the fuzzy barrier runs less frequently (roughly, per loop,
- * instead of per-callback) so contention is less of a concern. The
- * thread counter helps to reduce reclaim latency, because unlike EBR
- * we don't probabilistically check, we know deterministically when
- * all threads have changed phase.
- */
- atomic_uint_fast32_t grace;
-
- /*
- * A flag for each phase indicating that there will be work to
- * do, so we don't invoke the reclaim machinery unnecessarily.
- * Set by `isc_qsbr_activate()` and cleared before the reclaimer
- * functions are invoked (so they can re-set their flag if
- * necessary).
- */
- atomic_uint_fast32_t activated;
-
- /*
- * The time of the last phase transition (isc_nanosecs_t). Used
- * to ensure that grace periods do not last forever. We use
- * `isc_time_monotonic()` because we need the same time in all
- * threads. (`uv_now()` is different in different threads.)
- */
- atomic_uint_fast64_t transition_time;
-
-} isc_qsbr_t;
-
-/*
- * When we start there is no worker thread yet, so the thread
- * count is equal to the number of loops. The global phase starts
- * off at one (it must always be nonzero).
- */
-#define ISC_QSBR_INITIALIZER(nloops) \
- (isc_qsbr_t) { \
- .grace = ISC_QSBR_GRACE(nloops, 1), \
- .transition_time = isc_time_monotonic(), \
- }
-
-/*
- * For use by tests that need to explicitly drive QSBR phase transitions.
- */
-void
-isc__qsbr_quiescent_state(isc_loop_t *loop);
-
-/*
- * Used by the loopmgr
- */
-void
-isc__qsbr_quiescent_cb(uv_prepare_t *handle);
-void
-isc__qsbr_destroy(isc_loopmgr_t *loopmgr);
diff --git a/lib/isc/loop.c b/lib/isc/loop.c
index 1cef7d35ad..41c7f656f0 100644
--- a/lib/isc/loop.c
+++ b/lib/isc/loop.c
@@ -26,14 +26,12 @@
#include
#include
#include
-#include
#include
#include
#include
#include
#include
#include
-#include
#include
#include
#include
@@ -151,7 +149,6 @@ destroy_cb(uv_async_t *handle) {
uv_close(&loop->run_trigger, isc__job_close);
uv_close(&loop->destroy_trigger, NULL);
uv_close(&loop->pause_trigger, NULL);
- uv_close(&loop->wakeup_trigger, NULL);
uv_close(&loop->quiescent, NULL);
uv_walk(&loop->loop, loop_walk_cb, (char *)"destroy_cb");
@@ -162,8 +159,6 @@ shutdown_cb(uv_async_t *handle) {
isc_loop_t *loop = uv_handle_get_data(handle);
isc_loopmgr_t *loopmgr = loop->loopmgr;
- loop->shuttingdown = true;
-
/* Make sure, we can't be called again */
uv_close(&loop->shutdown_trigger, shutdown_trigger_close_cb);
@@ -185,12 +180,6 @@ shutdown_cb(uv_async_t *handle) {
UV_RUNTIME_CHECK(uv_async_send, r);
}
-static void
-wakeup_cb(uv_async_t *handle) {
- /* we only woke up to make the loop take a spin */
- UNUSED(handle);
-}
-
static void
loop_init(isc_loop_t *loop, isc_loopmgr_t *loopmgr, uint32_t tid) {
*loop = (isc_loop_t){
@@ -226,9 +215,6 @@ loop_init(isc_loop_t *loop, isc_loopmgr_t *loopmgr, uint32_t tid) {
UV_RUNTIME_CHECK(uv_async_init, r);
uv_handle_set_data(&loop->destroy_trigger, loop);
- r = uv_async_init(&loop->loop, &loop->wakeup_trigger, wakeup_cb);
- UV_RUNTIME_CHECK(uv_async_init, r);
-
r = uv_prepare_init(&loop->loop, &loop->quiescent);
UV_RUNTIME_CHECK(uv_prepare_init, r);
uv_handle_set_data(&loop->quiescent, loop);
@@ -245,7 +231,7 @@ loop_init(isc_loop_t *loop, isc_loopmgr_t *loopmgr, uint32_t tid) {
static void
quiescent_cb(uv_prepare_t *handle) {
- isc__qsbr_quiescent_cb(handle);
+ UNUSED(handle);
#if defined(RCU_QSBR)
/* safe memory reclamation */
@@ -340,7 +326,6 @@ isc_loopmgr_create(isc_mem_t *mctx, uint32_t nloops, isc_loopmgr_t **loopmgrp) {
loopmgr = isc_mem_get(mctx, sizeof(*loopmgr));
*loopmgr = (isc_loopmgr_t){
.nloops = nloops,
- .qsbr = ISC_QSBR_INITIALIZER(nloops),
};
isc_mem_attach(mctx, &loopmgr->mctx);
@@ -465,22 +450,6 @@ isc_loopmgr_run(isc_loopmgr_t *loopmgr) {
isc_thread_main(loop_thread, &loopmgr->loops[0]);
}
-void
-isc_loopmgr_wakeup(isc_loopmgr_t *loopmgr) {
- REQUIRE(VALID_LOOPMGR(loopmgr));
-
- for (size_t i = 0; i < loopmgr->nloops; i++) {
- isc_loop_t *loop = &loopmgr->loops[i];
-
- /* Skip current loop */
- if (i == isc_tid()) {
- continue;
- }
-
- uv_async_send(&loop->wakeup_trigger);
- }
-}
-
void
isc_loopmgr_pause(isc_loopmgr_t *loopmgr) {
REQUIRE(VALID_LOOPMGR(loopmgr));
diff --git a/lib/isc/loop_p.h b/lib/isc/loop_p.h
index b9bef13b19..185406d0d7 100644
--- a/lib/isc/loop_p.h
+++ b/lib/isc/loop_p.h
@@ -21,7 +21,6 @@
#include
#include
#include
-#include
#include
#include
#include
@@ -76,9 +75,7 @@ struct isc_loop {
uv_async_t destroy_trigger;
/* safe memory reclamation */
- uv_async_t wakeup_trigger;
uv_prepare_t quiescent;
- isc_qsbr_phase_t qsbr_phase;
};
/*
@@ -113,9 +110,6 @@ struct isc_loopmgr {
/* per-thread objects */
isc_loop_t *loops;
-
- /* safe memory reclamation */
- isc_qsbr_t qsbr;
};
/*
diff --git a/lib/isc/qsbr.c b/lib/isc/qsbr.c
deleted file mode 100644
index c122770c14..0000000000
--- a/lib/isc/qsbr.c
+++ /dev/null
@@ -1,393 +0,0 @@
-/*
- * Copyright (C) Internet Systems Consortium, Inc. ("ISC")
- *
- * SPDX-License-Identifier: MPL-2.0
- *
- * This Source Code Form is subject to the terms of the Mozilla Public
- * License, v. 2.0. If a copy of the MPL was not distributed with this
- * file, you can obtain one at https://mozilla.org/MPL/2.0/.
- *
- * See the COPYRIGHT file distributed with this work for additional
- * information regarding copyright ownership.
- */
-
-#include
-#include
-#include
-#include
-#include
-#include
-#include
-#include
-#include
-
-#include "loop_p.h"
-
-#define MAX_GRACE_PERIOD_NS (53 * NS_PER_MS)
-
-#if 0
-#define TRACE(fmt, ...) \
- isc_log_write(isc_lctx, ISC_LOGCATEGORY_GENERAL, ISC_LOGMODULE_OTHER, \
- ISC_LOG_DEBUG(7), "%s:%u:%s():t%u: " fmt, __FILE__, \
- __LINE__, __func__, isc_tid(), ##__VA_ARGS__)
-#else
-#define TRACE(...)
-#endif
-
-static ISC_STACK(isc_qsbr_registered_t) qsbreclaimers = ISC_STACK_INITIALIZER;
-
-static void
-reclaim_cb(void *arg);
-static void
-reclaimed_cb(void *arg);
-
-/**********************************************************************/
-
-/*
- * 3,2,1,3,2,1,...
- */
-static isc_qsbr_phase_t
-change_phase(isc_qsbr_phase_t phase) {
- return (--phase > 0 ? phase : ISC_QSBR_PHASE_MAX);
-}
-
-/*
- * For marking or checking that a phase has cleanup work to do.
- */
-static unsigned int
-active_bit(isc_qsbr_phase_t phase) {
- return (1 << phase);
-}
-
-/*
- * Extract the global phase from the grace period state.
- */
-static isc_qsbr_phase_t
-global_phase(isc_qsbr_t *qsbr, memory_order m_o) {
- uint32_t grace = atomic_load_explicit(&qsbr->grace, m_o);
- return (ISC_QSBR_GRACE_PHASE(grace));
-}
-
-/*
- * Record that the current thread has passed the barrier.
- * Returns true if more threads still need to pass.
- *
- * ATOMIC: acquire-release, to ensure that this is not reordered wrt
- * read-only accesses to lock-free data structures. This implements the
- * ordering requirements of a quiescent state.
- */
-static bool
-fuzzy_barrier_not_yet(isc_qsbr_t *qsbr) {
- uint32_t grace = atomic_fetch_sub_acq_rel(&qsbr->grace,
- ISC_QSBR_ONE_THREAD);
- uint32_t threads = ISC_QSBR_GRACE_THREADS(grace);
- return (threads > 1);
-}
-
-/*
- * Ungracefully drive all cleanup work to completion.
- *
- * ATOMIC: everything is relaxed, because we assume that concurrent
- * readers have already finished. `reclaim_cb()` uses the `activated`
- * flags to ensure it is OK that threads will race to complete the
- * cleanup.
- */
-static void
-qsbr_shutdown(isc_loopmgr_t *loopmgr) {
- isc_qsbr_t *qsbr = &loopmgr->qsbr;
- isc_qsbr_phase_t phase = global_phase(qsbr, memory_order_relaxed);
- uint32_t threads = isc_loopmgr_nloops(loopmgr);
- uint32_t grace;
-
- while (atomic_load_relaxed(&qsbr->activated) != 0) {
- reclaim_cb(loopmgr);
- phase = change_phase(phase);
- grace = ISC_QSBR_GRACE(threads, phase);
- atomic_store_relaxed(&qsbr->grace, grace);
- }
-}
-
-/*
- * On a quiet server that does not have enough network traffic to keep
- * all its threads spinning, grace periods might extend indefinitely.
- * So check if we have been waiting an unreasonably long time since
- * the last phase change. If so, send a no-op async request to every
- * thread to make them all cycle through a quiescent state.
- */
-static void
-maybe_wakeup(isc_loop_t *loop) {
- isc_loopmgr_t *loopmgr = loop->loopmgr;
- isc_qsbr_t *qsbr = &loopmgr->qsbr;
-
- /*
- * ATOMIC: relaxed is OK here because we don't use any values guarded
- * by the `activated` flags.
- */
- if (atomic_load_relaxed(&qsbr->activated) == 0) {
- return;
- }
- if (loop->shuttingdown) {
- qsbr_shutdown(loopmgr);
- return;
- }
-
- /*
- * ATOMIC: relaxed, because the `transition_time` doesn't guard any
- * other values, just the isc_loopmgr_wakeup() call below.
- */
- atomic_uint_fast64_t *qsbr_ttp = &qsbr->transition_time;
- isc_nanosecs_t now = isc_time_monotonic();
- isc_nanosecs_t start = atomic_load_relaxed(qsbr_ttp);
- if (now < start + MAX_GRACE_PERIOD_NS) {
- return;
- }
-
- /*
- * To stop other threads from also invoking `isc_loopmgr_wakeup()`,
- * we try to push the timer into the future (expecting that it will
- * not trigger again), and quit if someone else got there first.
- * ATOMIC: relaxed, as before; strong, because there is no retry loop.
- */
- if (!atomic_compare_exchange_strong_relaxed(qsbr_ttp, &start, now)) {
- return;
- }
-
- TRACE("long grace period of %llu ns, waking up other threads",
- (unsigned long long)(now - start));
-
- isc_loopmgr_wakeup(loopmgr);
-}
-
-/*
- * Callers use the fuzzy barrier to ensure only one thread can enter
- * this function at a time.
- *
- * Phase transitions happen at roughly the same frequency that IO
- * event loops cycle, limited by the slowest loop in each cycle.
- */
-static void
-phase_transition(isc_loop_t *loop, isc_qsbr_phase_t current_phase) {
- isc_loopmgr_t *loopmgr = loop->loopmgr;
- isc_qsbr_t *qsbr = &loopmgr->qsbr;
-
- if (loop->shuttingdown) {
- qsbr_shutdown(loopmgr);
- return;
- }
-
- /*
- * After we change phase, threads will be in either the `current_phase`
- * or the `next_phase`. We will reclaim memory from the `third_phase`.
- *
- * ATOMIC: relaxed is OK here because the necessary synchronization
- * happens in `reclaim_cb()`.
- */
- isc_qsbr_phase_t next_phase = change_phase(current_phase);
- isc_qsbr_phase_t third_phase = change_phase(next_phase);
- bool activated = atomic_load_relaxed(&qsbr->activated) &
- active_bit(third_phase);
-
- /*
- * Reset the wakeup timer, and log the length of the grace period.
- * ATOMIC: relaxed, per the commentary in `maybe_wakeup()`.
- */
- atomic_uint_fast64_t *qsbr_tt = &qsbr->transition_time;
- isc_nanosecs_t now = isc_time_monotonic();
- isc_nanosecs_t start = atomic_exchange_relaxed(qsbr_tt, now);
- TRACE("phase %u -> %u after grace period of %f ms", current_phase,
- next_phase, (double)(now - start) / NS_PER_MS);
-	UNUSED(start); /* in case TRACE() expands to nothing */
-
- /*
- * Work out the threads counter for this grace period.
- *
- * We need to add one for any reclamation worker thread, to
- * prevent us from changing phase before the work is done. If
- * we change too early, any newly detached objects will be
- * marked with the same phase as the running reclaimer, which
- * might lead to them being free()d too soon.
- */
- uint32_t threads = isc_loopmgr_nloops(loopmgr) + (activated ? 1 : 0);
-
- /*
- * Start the new grace period.
- *
- * ATOMIC: release, to pair with the load-acquire in `reclaim_cb()`
- * which is spawned in a separate worker thread.
- */
- uint32_t grace = ISC_QSBR_GRACE(threads, next_phase);
- atomic_store_release(&qsbr->grace, grace);
-
- if (activated) {
- isc_work_enqueue(loop, reclaim_cb, reclaimed_cb, loopmgr);
- }
-}
-
-/*
- * This function is called once per cycle of each IO event loop by the
- * `uv_prepare` callback below.
- */
-void
-isc__qsbr_quiescent_state(isc_loop_t *loop) {
- isc_loopmgr_t *loopmgr = loop->loopmgr;
- isc_qsbr_t *qsbr = &loopmgr->qsbr;
-
- /*
- * ATOMIC: relaxed. If we are in phase then we don't need to
- * synchronize; if we are not then this thread's presence in
- * the thread counter will prevent the phase from changing
- * before we get to the fuzzy barrier.
- */
- isc_qsbr_phase_t phase = global_phase(qsbr, memory_order_relaxed);
- if (loop->qsbr_phase == phase) {
- maybe_wakeup(loop);
- return;
- }
-
- /*
- * Enter the current phase and count us out of the previous phase.
- */
- loop->qsbr_phase = phase;
- if (fuzzy_barrier_not_yet(qsbr)) {
- maybe_wakeup(loop);
- return;
- }
-
-	/*
-	 * We were the last thread to enter the current phase, so the
-	 * grace period is up. No other thread can reach this point.
-	 */
- phase_transition(loop, phase);
-}
-
-void
-isc__qsbr_quiescent_cb(uv_prepare_t *handle) {
- isc_loop_t *loop = uv_handle_get_data((uv_handle_t *)handle);
- isc__qsbr_quiescent_state(loop);
-}
-
-static void
-reclaimed_cb(void *arg) {
- /* we are back on a loop thread */
- isc_loopmgr_t *loopmgr = arg;
- isc_qsbr_t *qsbr = &loopmgr->qsbr;
- isc_loop_t *loop = CURRENT_LOOP(loopmgr);
-
- /*
- * Remove the reclaimers from the thread count, so that the
- * next grace period can start.
- */
- if (fuzzy_barrier_not_yet(qsbr)) {
- return;
- }
-
- /*
-	 * The reclaim worker was the last to be counted out: every
-	 * other thread has already passed through a quiescent state.
- *
- * We expect loop->qsbr_phase == global_phase() at this point,
- * except during shutdown when the phase shifts rapidly. Also,
- * the current loop might not have received the shutdown
- * message yet, so it seems easiest to omit the assertion.
- *
- * ATOMIC: relaxed, the fuzzy barrier already synchronized.
- */
- TRACE("reclaimers overran");
- phase_transition(loop, global_phase(qsbr, memory_order_relaxed));
-}
-
-static void
-reclaim_cb(void *arg) {
- /* we are on a work thread not a loop thread */
- isc_loopmgr_t *loopmgr = arg;
- isc_qsbr_t *qsbr = &loopmgr->qsbr;
-
- /*
- * The global phase has just been bumped by a `phase_transition()`
- * and it cannot change again until the grace period is up, which
- * cannot happen until we have finished working.
- *
- * ATOMIC: acquire, to pair with the release in `phase_transition()`.
- *
- * The phase we are to clean up is 2 before the current phase,
- * which is the same as the one after the current phase (mod 3).
- */
- isc_qsbr_phase_t cur_phase = global_phase(qsbr, memory_order_acquire);
- isc_qsbr_phase_t third_phase = change_phase(cur_phase);
- unsigned int third_bit = active_bit(third_phase);
-
- /*
- * If any reclaimers need to be called again later, they can use
- * `isc_qsbr_activate()`, so we need to clear the bit first.
- *
- * ATOMIC: acquire, so that `isc_qsbr_activate()` happens before
- * the callbacks are invoked.
- */
- uint32_t activated = atomic_fetch_and_explicit(
- &qsbr->activated, ~third_bit, memory_order_acquire);
-
- /* this can happen when we are racing to clean up on shutdown */
- if ((activated & third_bit) == 0) {
- return;
- }
-
- isc_qsbr_registered_t *reclaimer = ISC_STACK_TOP(qsbreclaimers);
- while (reclaimer != NULL) {
- reclaimer->func(third_phase);
- reclaimer = ISC_SLINK_NEXT(reclaimer, link);
- }
-}
-
-void
-isc__qsbr_register(isc_qsbr_registered_t *reclaimer) {
- REQUIRE(reclaimer->func != NULL);
- ISC_STACK_PUSH(qsbreclaimers, reclaimer, link);
-}
-
-/*
- * ATOMIC: This function needs to ensure that the global phase is read
- * after a write has committed. Acquire/release ordering is not sufficient
- * for ordering between separate atomics (the data structure's root pointer
- * and the global phase), so it must be sequentially consistent.
- *
- * In general, the phases up to and including the next phase transition
- * look like:
- *
- * 1. local phase
- * 2. global phase
- * 3. next phase
- * 1. third phase
- *
- * i.e. some threads are still one behind the global phase, on the same
- * phase that will be cleaned up immediately after the phase transition.
- *
- * This function is called just after a write commits. It's likely that
- * some threads on the global phase (2) are using a version of the data
- * structure from before the write, and they can continue using it while
- * the straggler threads (1) catch up and cause a phase transition.
- *
- * The writer can be one of the straggler threads. If it incorrectly marks
- * cleanup work with its local phase (1), memory will be reclaimed
- * immediately after the next phase transition (when the third phase is
- * also 1), which could be almost immediately when the writer returns to
- * the event loop. This will cause a use-after-free for existing readers
- * (in phase 2).
- *
- * More straightforwardly, we need to be able to queue up reclaim work from
- * a thread that isn't running a loop, which also means this function has
- * to return the global phase.
- */
-isc_qsbr_phase_t
-isc_qsbr_phase(isc_loopmgr_t *loopmgr) {
- isc_qsbr_t *qsbr = &loopmgr->qsbr;
- return (global_phase(qsbr, memory_order_seq_cst));
-}
-
-void
-isc_qsbr_activate(isc_loopmgr_t *loopmgr, isc_qsbr_phase_t phase) {
- /*
- * ATOMIC: release ordering ensures that writing the cleanup lists
- * happens before the callback is invoked from a worker thread.
- */
- atomic_fetch_or_release(&loopmgr->qsbr.activated, active_bit(phase));
-}