2
0
mirror of https://gitlab.isc.org/isc-projects/bind9 synced 2025-08-22 18:19:42 +00:00

41 Commits

Author SHA1 Message Date
Ondřej Surý
b8d00e2e18
Change the loopmgr to be singleton
All the applications built on top of the loop manager were required to
create just a single instance of the loop manager.  Refactor the loop
manager to not expose this instance to the callers and keep the loop
manager object internal to the isc_loop compilation unit.

This significantly simplifies a number of data structures and calls to
the isc_loop API.
2025-07-23 22:44:16 +02:00
Ondřej Surý
1032681af0 Convert the isc/tid.h to use own signed integer isc_tid_t type
Change the internal type used for isc_tid unit to isc_tid_t to hide the
specific integer type being used for the 'tid'.  Internally, the signed
integer type is being used.  This allows us to have negatively indexed
arrays that works both for threads with assigned tid and the threads
with unassigned tid.  This should be used only in specific situations.
2025-06-28 13:32:12 +02:00
Ondřej Surý
4e79e9baae
Give every memory context a name
Instead of giving the memory context names with an explicit call to
isc_mem_setname(), add the name to isc_mem_create() call to have all the
memory contexts an unconditional name.
2025-05-29 05:46:46 +02:00
Ondřej Surý
30d4939382
Move the call_rcu_thread explicit create and shutdown to isc_loop
When isc__thread_initialize() is called from a library constructor, it
could be called before we fork the main process.  This happens with
named, and then we have the call_rcu_thread attached to the pre-fork
process and not the post-fork process, which means that the initial
process will never shutdown, because there's noone to tell it so.

Move the isc__thread_initialize() and isc__thread_shutdown() to the
isc_loop unit where we call it before creating the extra thread and
after joining all the extra threads respectively.
2025-04-16 12:30:14 +02:00
Ondřej Surý
2aa70fff76
Remove unused isc_mutexblock and isc_condition units
The isc_mutexblock and isc_condition units were no longer in use and
were removed.
2025-03-01 07:33:09 +01:00
Pavel Březina
67e21d94d4 mark loop as shuttingdown earlier in shutdown_cb
`shutdown_trigger_close_cb` is not called in the main loop since
queued events in the `loop->async_trigger`, including loop teardown
(shutdown_server) are processed first, before the `uv_close` callback
is executed..

In order to pass the information to the queued events, it is necessary
to set the flag earlier in the process and not wait for the `uv_close`
callback to trigger.
2024-12-10 19:18:49 +00:00
Ondřej Surý
0258850f20
Remove redundant parentheses from the return statement 2024-11-19 12:27:22 +01:00
Nicki Křížek
842abe9fbf Revert "Double the number of threadpool threads"
This reverts commit 6857df20a40f4e05f465a7a3f5d24eeedce8fc6c.
2024-09-20 14:31:25 +02:00
Nicki Křížek
377831a290 Merge tag 'v9.21.1' 2024-09-18 18:02:41 +02:00
Ondřej Surý
6370e9b311 Add isc_helper API that adds 1:1 thread for each loop
Add an extra thread that can be used to offload operations that would
affect latency, but are not long-running tasks; those are handled by
isc_work API.

Each isc_loop now has matching isc_helper thread that also built on top
of uv_loop.  In fact, it matches most of the isc_loop functionality, but
only the `isc_helper_run()` asynchronous call is exposed.
2024-09-12 12:09:45 +00:00
Nicki Křížek
6857df20a4 Double the number of threadpool threads
Introduce this temporary workaround to reduce the impact of long-running
tasks in offload threads which can block the resolution of queries.
2024-09-06 14:15:21 +02:00
Ondřej Surý
8506102216 Remove logging context (isc_log_t) from the public namespace
Now that the logging uses single global context, remove the isc_log_t
from the public namespace.
2024-08-20 12:50:39 +00:00
Mark Andrews
88c48dde5e Stop processing catalog zone changes when shutting down
Abandon catz_addmodzone_cb  and catz_delzone_cb processing if the
loop is shutting down.
2024-05-09 08:17:44 +10:00
Evan Hunt
63659e2e3a
complete removal of isc_loop_current()
isc_loop() can now take its place.

This also requires changes to the test harness - instead of running the
setup and teardown outside of th main loop, we now schedule the setup
and teardown to run on the loop (via isc_loop_setup() and
isc_loop_teardown()) - this is needed because the new the isc_loop()
call has to be run on the active event loop, but previously the
isc_loop_current() (and the variants like isc_loop_main()) would work
even outside of the loop because it needed just isc_tid() to work, but
not the full loop (which was mainly true for the main thread).
2024-04-02 10:35:56 +02:00
Evan Hunt
c47fa689d4
use a thread-local variable to get the current running loop
if we had a method to get the running loop, similar to how
isc_tid() gets the current thread ID, we can simplify loop
and loopmgr initialization.

remove most uses of isc_loop_current() in favor of isc_loop().
in some places where that was the only reason to pass loopmgr,
remove loopmgr from the function parameters.
2024-04-02 10:35:56 +02:00
Ondřej Surý
15096aefdf
Make the dns_validator validations asynchronous and limit it
Instead of running all the cryptographic validation in a tight loop,
spread it out into multiple event loop "ticks", but moving every single
validation into own isc_async_run() asynchronous event.  Move the
cryptographic operations - both verification and DNSKEY selection - to
the offloaded threads (isc_work_enqueue), this further limits the time
we spend doing expensive operations on the event loops that should be
fast.

Limit the impact of invalid or malicious RRSets that contain crafted
records causing the dns_validator to do many validations per single
fetch by adding a cap on the maximum number of validations and maximum
number of validation failures that can happen before the resolving
fails.
2024-02-01 21:45:06 +01:00
Ondřej Surý
89fcb6f897
Apply the isc_mem_cget semantic patch 2023-08-31 22:08:35 +02:00
Ondřej Surý
4ca64c1799
Pin dns_request to the associated loop
When dns_request was canceled via dns_requestmgr_shutdown() the cancel
event would be propagated on different loop (loop 0) than the loop where
request was created on.  In turn this would propagate down to isc_netmgr
where we require all the events to be called from the matching isc_loop.

Pin the dns_requests to the loops and ensure that all the events are
called on the associated loop.  This in turn allows us to remove the
hashed locks on the requests and change the single .requests list to be
a per-loop list for the request accounting.

Additionally, do some extra cleanup because some race condititions are
now not possible as all events on the dns_request are serialized.
2023-07-28 09:01:22 +02:00
Evan Hunt
e37d02905c
add isc_loop_now() to get consistent time
isc_loop_now() is a front-end to uv_now(), returning the start
time of the current event loop tick.
2023-07-19 15:32:21 +02:00
Ondřej Surý
5bd9343c4e
Remove the explicit call_rcu thread creating and destruction
The free_all_cpu_call_rcu_data() call can consume hundreds of
milliseconds on shutdown.  Don't try to be smart and let the RCU library
handle this internally.
2023-06-27 07:59:00 +02:00
Tony Finch
c319ccd4c9 Fixes for liburcu-qsbr
Move registration and deregistration of the main thread from
`isc_loopmgr_run()` into `isc__initialize()` / `isc__shutdown()`:
liburcu-qsbr fails an assertion if we try to use it from an
unregistered thread, and we need to be able to use it when the
event loops are not running.

Use `rcu_assign_pointer()` and `rcu_dereference()` in qp-trie
transactions so that they properly mark threads as online. The
RCU-protected pointer is no longer declared atomic because
liburcu does not (yet) use standard C atomics.

Fix the definition of `isc_qsbr_rcu_dereference()` to return
the referenced value, and to call the right function inside
liburcu.

Change the thread sanitizer suppressions to match any variant of
`rcu_*_barrier()`
2023-05-15 20:49:42 +00:00
Tony Finch
afae41aa40
Check the return value from uv_async_send()
An omission pointed out by the following report from Coverity:

    /lib/isc/loop.c: 483 in isc_loopmgr_pause()
    >>>     CID 455002:  Error handling issues  (CHECKED_RETURN)
    >>>     Calling "uv_async_send" without checking return value (as is done elsewhere 5 out of 6 times).
    483     		uv_async_send(&loop->pause_trigger);
2023-05-15 18:52:04 +01:00
Tony Finch
f11cc83142
Use per-CPU RCU helper threads
Create and free per-CPU helper threads from the main thread and tell
thread sanitizer to suppress leaking threads. (We are not leaking
threads ourselves and we can safely ignore the Userspace-RCU thread
leaks.)
2023-05-12 20:48:31 +01:00
Tony Finch
05ca11e122
Remove isc_qsbr (we are using liburcu instead)
This commit breaks the qp-trie code.
2023-05-12 20:48:31 +01:00
Ondřej Surý
7b1d985de2
Change the isc_async API to use cds_wfcqueue internally
The isc_async API was using lock-free stack (where enqueue operation was
not wait-free).  Change the isc_async to use cds_wfcqueue internally -
enqueue and splice (move the queue members from one list to another) is
nonblocking and wait-free.
2023-05-12 14:16:25 +02:00
Tony Finch
7d1ceaf35d
Move per-thread RCU setup into isc_thread
All the per-loop `libuv` setup remains in `isc_loop`, but the per-thread
RCU setup is moved to `isc_thread` alongside the other per-thread setup.
This avoids repeating the per-thread setup for `call_rcu()` helpers,
and explains a little better why some parts of the per-thread setup
is missing for `call_rcu()` helpers.

This also removes the per-loop `call_rcu()` helpers as we refactored the
isc__random_initialize() in the previous commit.
2023-04-27 12:38:53 +02:00
Ondřej Surý
65021dbf52
Move the isc_random API initialization to the thread_local variable
Instead of writing complicated wrappers for every thread, move the
initialization back to isc_random unit and check whether the random seed
was initialized with a thread_local variable.

Ensure that isc_entropy_get() returns a non-zero seed.

This avoids problems with thread sanitizer tests getting stuck in an
infinite loop.
2023-04-27 12:38:53 +02:00
Tony Finch
e0248bf60f
Simplify isc_thread a little
Remove the `isc_threadarg_t` and `isc_threadresult_t`
typedefs which were unhelpful disguises for `void *`,
and free the dummy jemalloc allocation sooner.
2023-04-27 12:38:53 +02:00
Ondřej Surý
c2c907d728
Improve the Userspace RCU integration
This commit allows BIND 9 to be compiled with different flavours of
Userspace RCU, and improves the integration between Userspace RCU and
our event loop:

- In the RCU QSBR, the thread is put offline when polling and online
  when rcu_dereference, rcu_assign_pointer (or friends) are called.

- In other RCU modes, we check that we are not reading when reaching the
  quiescent callback in the event loop.

- We register the thread before uv_work_run() callback is called and
  after it has finished.  The rcu_(un)register_thread() has a large
  overhead, but that's fine in this case.
2023-04-27 12:38:53 +02:00
Tony Finch
bc2389b828 Add per-thread sharded histograms for heavy loads
Although an `isc_histo_t` is thread-safe, it can suffer
from cache contention under heavy load. To avoid this,
an `isc_histomulti_t` contains a histogram per thread,
so updates are local and low-contention.
2023-04-03 12:08:05 +01:00
Ondřej Surý
1844590ad9
Refactor isc_job_run to not-make any allocations
Change the isc_job_run() to not-make any allocations.  The caller must
make sure that it allocates isc_job_t - usually as part of the argument
passed to the callback.

For simple jobs, using isc_async_run() is advised as it allocates its
own separate isc_job_t.
2023-03-30 16:00:52 +02:00
Tony Finch
9b7aa536ba QSBR: safe memory reclamation for lock-free data structures
This "quiescent state based reclamation" module provides support for
the qp-trie module in dns/qp. It is a replacement for liburcu, written
without reference to the urcu source code, and in fact it works in a
significantly different way.

A few specifics of BIND make this variant of QSBR somewhat simpler:

  * We can require that wait-free access to a qp-trie only happens in
    an isc_loop callback. The loop provides a natural quiescent state,
    after the callbacks are done, when no qp-trie access occurs.

  * We can dispense with any API like rcu_synchronize(). In practice,
    it takes far too long to wait for a grace period to elapse for each
    write to a data structure.

  * We use the idea of "phases" (aka epochs or eras) from EBR to
    reduce the amount of bookkeeping needed to track memory that is no
    longer needed, knowing that the qp-trie does most of that work
    already.

I considered hazard pointers for safe memory reclamation. They have
more read-side overhead (updating the hazard pointers) and it wasn't
clear to me how to nicely schedule the cleanup work. Another
alternative, epoch-based reclamation, is designed for fine-grained
lock-free updates, so it needs some rethinking to work well with the
heavily read-biased design of the qp-trie. QSBR has the fastest read
side of the basic SMR algorithms (with no barriers), and fits well
into a libuv loop. More recent hybrid SMR algorithms do not appear to
have enough benefits to justify the extra complexity.
2023-02-23 15:57:53 +00:00
Ondřej Surý
6eb1340d1b Use atomic stack for async job queue
Previously, the async job queue would use a locked-list (ISC_LIST).
With introduction of atomic stack (that has to be drained at once), we
could use it to remove some contention between the threads and simplify
the async queue.

Fortunately, the reverse order still works for us - instead of append
and tail/prev operation on the list, we are now using prepend and
head/next operation on the atomic stack.
2023-02-22 16:13:37 +00:00
Evan Hunt
a52b17d39b
remove isc_task completely
as there is no further use of isc_task in BIND, this commit removes
it, along with isc_taskmgr, isc_event, and all other related types.

functions that accepted taskmgr as a parameter have been cleaned up.
as a result of this change, some functions can no longer fail, so
they've been changed to type void, and their callers have been
updated accordingly.

the tasks table has been removed from the statistics channel and
the stats version has been updated. dns_dyndbctx has been changed
to reference the loopmgr instead of taskmgr, and DNS_DYNDB_VERSION
has been udpated as well.
2023-02-16 18:35:32 +01:00
Evan Hunt
f58e7c28cd
switch to using isc_loopmgr_pause() instead of task exclusive
change functions using isc_taskmgr_beginexclusive() to use
isc_loopmgr_pause() instead.

also, removed an unnecessary use of exclusive mode in
named_server_tcptimeouts().

most functions that were implemented as task events because they needed
to be running in a task to use exclusive mode have now been changed
into loop callbacks instead. (the exception is catz, which is being
changed in a separate commit because it's a particularly complex change.)
2023-02-16 17:51:55 +01:00
Tony Finch
6927a30926 Remove do-nothing header <isc/print.h>
This one really truly did nothing. No lines added!
2023-02-15 16:44:47 +00:00
Ondřej Surý
6613f89c62 Enhance the isc_loop unit to allow reference count tracking
Use ISC_REFCOUNT_TRACE_{IMPL,DECL} to allow better isc_loop reference
tracking - use `#define ISC_LOOP_TRACE 1` in <isc/loop.h> to enable.
2023-01-05 12:33:15 +00:00
Ondřej Surý
9d2f22e666
Properly name the loop->mctx
The per loop memory context were unnamed, properly name them as
'loop<tid>'.
2022-11-08 13:32:13 +01:00
Evan Hunt
dc878e3098 isc_async_run() runs events in reverse order
when more than one event was scheduled in the isc_aysnc queue,
they were executed in reverse order. we need to pull events
off the back of queue instead the front, so that uv_loop will
run them in the right order.

note that isc_job_run() has the same behavior, because it calls
uv_idle_start() directly. in that case we just document it so
it'll be less surprising in the future.
2022-10-31 05:43:45 -07:00
Tony Finch
a34a2784b1 De-duplicate some calls to strerror_r()
Specifically, when reporting an unexpected or fatal error.
2022-10-17 11:58:26 +01:00
Ondřej Surý
84c90e223f
New event loop handling API
This commit introduces new APIs for applications and signal handling,
intended to replace isc_app for applications built on top of libisc.

* isc_app will be replaced with isc_loopmgr, which handles the
  starting and stopping of applications. In isc_loopmgr, the main
  thread is not blocked, but is part of the working thread set.
  The loop manager will start a number of threads, each with a
  uv_loop event loop running. Setup and teardown functions can be
  assigned which will run when the loop starts and stops, and
  jobs can be scheduled to run in the meantime. When
  isc_loopmgr_shutdown() is run from any the loops, all loops
  will shut down and the application can terminate.

* signal handling will now be handled with a separate isc_signal unit.
  isc_loopmgr only handles SIGTERM and SIGINT for application
  termination, but the application may install additional signal
  handlers, such as SIGHUP as a signal to reload configuration.

* new job running primitives, isc_job and isc_async, have been added.
  Both units schedule callbacks (specifying a callback function and
  argument) on an event loop. The difference is that isc_job unit is
  unlocked and not thread-safe, so it can be used to efficiently
  run jobs in the same thread, while isc_async is thread-safe and
  uses locking, so it can be used to pass jobs from one thread to
  another.

* isc_tid will be used to track the thread ID in isc_loop worker
  threads.

* unit tests have been added for the new APIs.
2022-08-25 12:24:29 +02:00