2
0
mirror of https://gitlab.isc.org/isc-projects/bind9 synced 2025-08-25 03:27:18 +00:00

235 Commits

Author SHA1 Message Date
Ondřej Surý
b9cb29076f Log when starting and ending task exclusive mode
The task exclusive mode stops all processing (tasks and networking IO)
except the designated exclusive task events.  This has impact on the
operation of the server.  Add log messages indicating when we start the
exclusive mode, and when we end exclusive task mode.
2022-02-10 21:09:06 +01:00
Ondřej Surý
58bd26b6cf Update the copyright information in all files in the repository
This commit converts the license handling to adhere to the REUSE
specification.  It specifically:

1. Adds used licnses to LICENSES/ directory

2. Add "isc" template for adding the copyright boilerplate

3. Changes all source files to include copyright and SPDX license
   header, this includes all the C sources, documentation, zone files,
   configuration files.  There are notes in the doc/dev/copyrights file
   on how to add correct headers to the new files.

4. Handle the rest that can't be modified via .reuse/dep5 file.  The
   binary (or otherwise unmodifiable) files could have license places
   next to them in <foo>.license file, but this would lead to cluttered
   repository and most of the files handled in the .reuse/dep5 file are
   system test files.
2022-01-11 09:05:02 +01:00
Ondřej Surý
e705f213ca Remove taskmgr->excl_lock, fix the locking for taskmgr->exiting
While doing code review, it was found that the taskmgr->exiting is set
under taskmgr->lock, but accessed under taskmgr->excl_lock in the
isc_task_beginexclusive().

Additionally, before the change that moved running the tasks to the
netmgr, the task_ready() subrouting of isc_task_detach() would lock
mgr->lock, requiring the mgr->excl to be protected mgr->excl_lock
to prevent deadlock in the code.  After !4918 has been merged, this is
no longer true, and we can remove taskmgr->excl_lock and use
taskmgr->lock in its stead.

Solve both issues by removing the taskmgr->excl_lock and exclusively use
taskmgr->lock to protect both taskmgr->excl and taskmgr->exiting which
now doesn't need to be atomic_bool, because it's always accessed from
within the locked section.
2022-01-05 16:44:57 +01:00
Ondřej Surý
f9d90159b8 On shutdown, return ISC_R_SHUTTINGDOWN from isc_taskmgr_excltask()
The isc_taskmgr_excltask() would return ISC_R_NOTFOUND either when the
exclusive task was not set (yet) or when the taskmgr is shutting down
and the exclusive task has been already cleared.

Distinguish between the two states and return ISC_R_SHUTTINGDOWN when
the taskmgr is being shut down instead of ISC_R_NOTFOUND.
2022-01-05 13:41:12 +01:00
Ondřej Surý
29c2e52484 The isc/platform.h header has been completely removed
The isc/platform.h header was left empty which things either already
moved to config.h or to appropriate headers.  This is just the final
cleanup commit.
2021-07-06 05:33:48 +00:00
Mark Andrews
234ad2d075 Lock access to task->threadid 2021-06-15 00:01:58 +00:00
Ondřej Surý
7670f98377 Add isc_task_getnetmgr() function
Add a function to pull the attached netmgr from inside the executed
task.  This is needed for any task that needs to call the netmgr API.
2021-05-31 14:52:05 +02:00
Mark Andrews
715a2c7fc1 Add missing initialisations
configuring with --enable-mutex-atomics flagged these incorrectly
initialised variables on systems where pthread_mutex_init doesn't
just zero out the structure.
2021-05-26 08:15:08 +00:00
Ondřej Surý
4db5e30177 Run shutdown events with the task's existing threadid
Previously, task->threadid was reassigned to 0 while shutting
down, which caused an assertion.
2021-05-24 20:02:20 +02:00
Ondřej Surý
0be7ea78be Reduce the number of client tasks and bind them to netmgr queues
Since a client object is bound to a netmgr handle, each client
will always be processed by the same netmgr worker, so we can
simplify the code by binding client->task to the same thread as
the client. Since ns__client_request() now runs in the same event
loop as client->task events, is no longer necessary to pause the
task manager before launching them.

Also removed some functions in isc_task that were not used.
2021-05-24 20:02:20 +02:00
Ondřej Surý
e623c12757 Destroy reference to taskmgr after all tasks are done
We were clearing the pointer to taskmgr as soon as isc_taskmgr_destroy()
would be called and before all tasks were finished.  Unfortunately, some
tasks would use global named_g_taskmgr objects from inside the events
and this would cause either a data race or NULL pointer dereference.

This commit fixes the data race by moving the destruction of the
referenced pointer to the time after all tasks are finished.
2021-05-10 12:13:27 -07:00
Ondřej Surý
6c57a6cc3d Add isc_taskmgr_detach when task is created while shutting down
When taskmgr is shutting down, the creating the task would attach
to the taskmgr, but don't detach on error condition.
2021-05-10 11:39:51 +02:00
Ondřej Surý
365c6a9851 ensure interlocked netmgr events run on worker[0]
Network manager events that require interlock (pause, resume, listen)
are now always executed in the same worker thread, mgr->workers[0],
to prevent races.

"stoplistening" events no longer require interlock.
2021-05-07 14:28:32 -07:00
Ondřej Surý
4c8f6ebeb1 Use barriers for netmgr synchronization
The netmgr listening, stoplistening, pausing and resuming functions
now use barriers for synchronization, which makes the code much simpler.

isc/barrier.h defines isc_barrier macros as a front-end for uv_barrier
on platforms where that works, and pthread_barrier where it doesn't
(including TSAN builds).
2021-05-07 14:28:32 -07:00
Evan Hunt
5c08f97791 only run tasks as privileged if taskmgr is in privileged mode
all zone loading tasks have the privileged flag, but we only want
them to run as privileged tasks when the server is being initialized;
if we privilege them the rest of the time, the server may hang for a
long time after a reload/reconfig. so now we call isc_taskmgr_setmode()
to turn privileged execution mode on or off in the task manager.

isc_task_privileged() returns true if the task's privilege flag is
set *and* the taskmgr is in privileged execution mode. this is used
to determine in which netmgr event queue the task should be run.
2021-05-07 14:28:30 -07:00
Ondřej Surý
dacf586e18 Make the netmgr queue processing quantized
There was a theoretical possibility of clogging up the queue processing
with an endless loop where currently processing netievent would schedule
new netievent that would get processed immediately.  This wasn't such a
problem when only netmgr netievents were processed, but with the
addition of the tasks, there are at least two situation where this could
happen:

 1. In lib/dns/zone.c:setnsec3param() the task would get re-enqueued
    when the zone was not yet fully loaded.

 2. Tasks have internal quantum for maximum number of isc_events to be
    processed, when the task quantum is reached, the task would get
    rescheduled and then immediately processed by the netmgr queue
    processing.

As the isc_queue doesn't have a mechanism to atomically move the queue,
this commit adds a mechanism to quantize the queue, so enqueueing new
netievents will never stop processing other uv_loop_t events.
The default quantum size is 128.

Since the queue used in the network manager allows items to be enqueued
more than once, tasks are now reference-counted around task_ready()
and task_run(). task_ready() now has a public API wrapper,
isc_task_ready(), that the netmgr can use to reschedule processing
of a task if the quantum has been reached.

Incidental changes: Cleaned up some unused fields left in isc_task_t
and isc_taskmgr_t after the last refactoring, and changed atomic
flags to atomic_bools for easier manipulation.
2021-05-07 14:28:30 -07:00
Ondřej Surý
b5bf58b419 Destroy netmgr before destroying taskmgr
With taskmgr running on top of netmgr, the ordering of how the tasks and
netmgr shutdown interacts was wrong as previously isc_taskmgr_destroy()
was waiting until all tasks were properly shutdown and detached.  This
responsibility was moved to netmgr, so we now need to do the following:

  1. shutdown all the tasks - this schedules all shutdown events onto
     the netmgr queue

  2. shutdown the netmgr - this also makes sure all the tasks and
     events are properly executed

  3. Shutdown the taskmgr - this now waits for all the tasks to finish
     running before returning

  4. Shutdown the netmgr - this call waits for all the netmgr netievents
     to finish before returning

This solves the race when the taskmgr object would be destroyed before
all the tasks were finished running in the netmgr loops.
2021-05-07 14:28:30 -07:00
Ondřej Surý
a011d42211 Add new isc_managers API to simplify <*>mgr create/destroy
Previously, netmgr, taskmgr, timermgr and socketmgr all had their own
isc_<*>mgr_create() and isc_<*>mgr_destroy() functions.  The new
isc_managers_create() and isc_managers_destroy() fold all four into a
single function and makes sure the objects are created and destroy in
correct order.

Especially now, when taskmgr runs on top of netmgr, the correct order is
important and when the code was duplicated at many places it's easy to
make mistake.

The former isc_<*>mgr_create() and isc_<*>mgr_destroy() functions were
made private and a single call to isc_managers_create() and
isc_managers_destroy() is required at the program startup / shutdown.
2021-05-07 10:19:05 -07:00
Ondřej Surý
b540722bc3 Refactor taskmgr to run on top of netmgr
This commit changes the taskmgr to run the individual tasks on the
netmgr internal workers.  While an effort has been put into keeping the
taskmgr interface intact, couple of changes have been made:

 * The taskmgr has no concept of universal privileged mode - rather the
   tasks are either privileged or unprivileged (normal).  The privileged
   tasks are run as a first thing when the netmgr is unpaused.  There
   are now four different queues in in the netmgr:

   1. priority queue - netievent on the priority queue are run even when
      the taskmgr enter exclusive mode and netmgr is paused.  This is
      needed to properly start listening on the interfaces, free
      resources and resume.

   2. privileged task queue - only privileged tasks are queued here and
      this is the first queue that gets processed when network manager
      is unpaused using isc_nm_resume().  All netmgr workers need to
      clean the privileged task queue before they all proceed normal
      operation.  Both task queues are processed when the workers are
      finished.

   3. task queue - only (traditional) task are scheduled here and this
      queue along with privileged task queues are process when the
      netmgr workers are finishing.  This is needed to process the task
      shutdown events.

   4. normal queue - this is the queue with netmgr events, e.g. reading,
      sending, callbacks and pretty much everything is processed here.

 * The isc_taskmgr_create() now requires initialized netmgr (isc_nm_t)
   object.

 * The isc_nm_destroy() function now waits for indefinite time, but it
   will print out the active objects when in tracing mode
   (-DNETMGR_TRACE=1 and -DNETMGR_TRACE_VERBOSE=1), the netmgr has been
   made a little bit more asynchronous and it might take longer time to
   shutdown all the active networking connections.

 * Previously, the isc_nm_stoplistening() was a synchronous operation.
   This has been changed and the isc_nm_stoplistening() just schedules
   the child sockets to stop listening and exits.  This was needed to
   prevent a deadlock as the the (traditional) tasks are now executed on
   the netmgr threads.

 * The socket selection logic in isc__nm_udp_send() was flawed, but
   fortunatelly, it was broken, so we never hit the problem where we
   created uvreq_t on a socket from nmhandle_t, but then a different
   socket could be picked up and then we were trying to run the send
   callback on a socket that had different threadid than currently
   running.
2021-04-20 23:22:28 +02:00
Ondřej Surý
16fe0d1f41 Cleanup the public vs private ISCAPI remnants
Since all the libraries are internal now, just cleanup the ISCAPI remnants
in isc_socket, isc_task and isc_timer APIs.  This means, there's one less
layer as following changes have been done:

 * struct isc_socket and struct isc_socketmgr have been removed
 * struct isc__socket and struct isc__socketmgr have been renamed
   to struct isc_socket and struct isc_socketmgr
 * struct isc_task and struct isc_taskmgr have been removed
 * struct isc__task and struct isc__taskmgr have been renamed
   to struct isc_task and struct isc_taskmgr
 * struct isc_timer and struct isc_timermgr have been removed
 * struct isc__timer and struct isc__timermgr have been renamed
   to struct isc_timer and struct isc_timermgr
 * All the associated code that dealt with typing isc_<foo>
   to isc__<foo> and back has been removed.
2021-04-19 13:18:24 +02:00
Ondřej Surý
3388ef36b3 Cleanup the isc_<*>mgr_createinc() constructors
Previously, the taskmgr, timermgr and socketmgr had a constructor
variant, that would create the mgr on top of existing appctx.  This was
no longer true and isc_<*>mgr was just calling isc_<*>mgr_create()
directly without any extra code.

This commit just cleans up the extra function.
2021-04-19 10:22:56 +02:00
Ondřej Surý
a0181056a8 Change the isc_thread_self() return type to uintptr_t
The pthread_self(), thrd_current() or GetCurrentThreadId() could
actually be a pointer, so we should rather convert the value into
uintptr_t instead of unsigned long.
2021-02-25 16:21:10 +01:00
Evan Hunt
dcee985b7f update all copyright headers to eliminate the typo 2020-09-14 16:20:40 -07:00
Witold Kręcicki
a8807d9a7b Add missing isc_mutex_destroy and isc_conditional_destroy calls.
While harmless on Linux, missing isc_{mutex,conditional}_destroy
causes a memory leak on *BSD. Missing calls were added.
2020-05-29 19:18:58 +00:00
Evan Hunt
68a1c9d679 change 'expr == true' to 'expr' in conditionals 2020-05-25 16:09:57 -07:00
Evan Hunt
ba0313e649 fix spelling errors reported by Fossies. 2020-02-21 15:05:08 +11:00
Witold Kręcicki
23bd04d2f1 Make isc_task_pause/isc_task_unpause thread safe.
isc_task_pause/unpause were inherently thread-unsafe - a task
could be paused only once by one thread, if the task was running
while we paused it it led to races. Fix it by making sure that
the task will pause if requested to, and by using a 'pause reference
counter' to count task pause requests - a task will be unpaused
iff all threads unpause it.

Don't remove from queue when pausing task - we lock the queue lock
(expensive), while it's unlikely that the task will be running -
and we'll remove it anyway in dispatcher
2020-02-18 09:22:04 +00:00
Ondřej Surý
5777c44ad0 Reformat using the new rules 2020-02-14 09:31:05 +01:00
Evan Hunt
e851ed0bb5 apply the modified style 2020-02-13 15:05:06 -08:00
Ondřej Surý
056e133c4c Use clang-tidy to add curly braces around one-line statements
The command used to reformat the files in this commit was:

./util/run-clang-tidy \
	-clang-tidy-binary clang-tidy-11
	-clang-apply-replacements-binary clang-apply-replacements-11 \
	-checks=-*,readability-braces-around-statements \
	-j 9 \
	-fix \
	-format \
	-style=file \
	-quiet
clang-format -i --style=format $(git ls-files '*.c' '*.h')
uncrustify -c .uncrustify.cfg --replace --no-backup $(git ls-files '*.c' '*.h')
clang-format -i --style=format $(git ls-files '*.c' '*.h')
2020-02-13 22:07:21 +01:00
Ondřej Surý
f50b1e0685 Use clang-format to reformat the source files 2020-02-12 15:04:17 +01:00
Ondřej Surý
bc1d4c9cb4 Clear the pointer to destroyed object early using the semantic patch
Also disable the semantic patch as the code needs tweaks here and there because
some destroy functions might not destroy the object and return early if the
object is still in use.
2020-02-09 18:00:17 -08:00
Witold Kręcicki
e1c4a69197 Fix a race in taskmgr between worker and task pausing/unpausing.
To reproduce the race - create a task, send two events to it, first one
must take some time. Then, from the outside, pause(), unpause() and detach()
the task.
When the long-running event is processed by the task it is in
task_state_running state. When we called pause() the state changed to
task_state_paused, on unpause we checked that there are events in the task
queue, changed the state to task_state_ready and enqueued the task on the
workers readyq. We then detach the task.
The dispatch() is done with processing the event, it processes the second
event in the queue, and then shuts down the task and frees it (as it's not
referenced anymore). Dispatcher then takes the, already freed, task from
the queue where it was wrongly put, causing an use-after free and,
subsequently, either an assertion failure or a segmentation fault.
The probability of this happening is very slim, yet it might happen under a
very high load, more probably on a recursive resolver than on an
authoritative.
The fix introduces a new 'task_state_pausing' state - to which tasks
are moved if they're being paused while still running. They are moved
to task_state_paused state when dispatcher is done with them, and
if we unpause a task in paused state it's moved back to task_state_running
and not requeued.
2020-01-21 10:06:19 +01:00
Ondřej Surý
fbf9856f43 Add isc_refcount_destroy() as appropriate 2020-01-14 13:12:13 +01:00
Ondřej Surý
5746172da3 Convert task flags to C11 atomics 2019-12-13 07:10:25 +01:00
Witold Kręcicki
5ce4b04b50 If a task is running and we call isc_task_pause it can
be implicitly unpaused when we switch from 'running' to
'idle' state. Fix it by not switching to 'idle' when paused.
2019-11-13 12:32:17 +00:00
Evan Hunt
59c64fa4bd add isc_task_pause() and isc_task_unpause() functions
This allows a task to be temporary disabled so that objects won't be
processed simultaneously by libuv events and isc_task events. When a
task is paused, currently running events may complete, but no further
event will added to the run queue will be executed until the task is
unpaused.
2019-11-07 11:55:37 -08:00
Evan Hunt
36ee430327 optionally associate a netmgr with a task manager when creating
When a task manager is created, we can now specify an `isc_nm`
object to associate with it; thereafter when the task manager is
placed into exclusive mode, the network manager will be paused.
2019-11-07 11:55:37 -08:00
Witold Kręcicki
70397f9d92 netmgr: libuv-based network manager
This is a replacement for the existing isc_socket and isc_socketmgr
implementation. It uses libuv for asynchronous network communication;
"networker" objects will be distributed across worker threads reading
incoming packets and sending them for processing.

UDP listener sockets automatically create an array of "child" sockets
so each worker can listen separately.

TCP sockets are shared amongst worker threads.

A TCPDNS socket is a wrapper around a TCP socket, which handles the
the two-byte length field at the beginning of DNS messages over TCP.

(Other wrapper socket types can be implemented in the future to handle
DNS over TLS, DNS over HTTPS, etc.)
2019-11-07 11:55:37 -08:00
Ondřej Surý
c662969da1 lib/isc/task.c: Fix invalid order of DbC checks that could cause dereference before NULL check 2019-10-03 09:04:27 +02:00
Ondřej Surý
50e109d659 isc_event_allocate() cannot fail, remove the fail handling blocks
isc_event_allocate() calls isc_mem_get() to allocate the event structure.  As
isc_mem_get() cannot fail softly (e.g. it never returns NULL), the
isc_event_allocate() cannot return NULL, hence we remove the (ret == NULL)
handling blocks using the semantic patch from the previous commit.
2019-08-30 08:55:34 +02:00
Ondřej Surý
46919579bb Make isc_thread_join() assert internally on failure
Previously isc_thread_join() would return ISC_R_UNEXPECTED on a failure to
create new thread.  All such occurences were caught and wrapped into assert
function at higher level.  The function was simplified to assert directly in the
isc_thread_join() function and all caller level assertions were removed.
2019-07-31 11:56:58 +02:00
Ondřej Surý
d6a60f2905 Make isc_thread_create() assert internally on failure
Previously isc_thread_create() would return ISC_R_UNEXPECTED on a failure to
create new thread.  All such occurences were caught and wrapped into assert
function at higher level.  The function was simplified to assert directly in the
isc_thread_create() function and all caller level assertions were removed.
2019-07-31 11:56:58 +02:00
Ondřej Surý
ae83801e2b Remove blocks checking whether isc_mem_get() failed using the coccinelle 2019-07-23 15:32:35 -04:00
Witold Kręcicki
e56cc07f50 Fix a few broken atomics initializations 2019-07-09 16:11:14 +02:00
Ondřej Surý
c0511688b5 lib/isc/task.c: use isc_refcount_t 2019-07-09 16:11:14 +02:00
Witold Kręcicki
5aeb99786e Properly initialize all atomic variables 2019-07-09 16:09:36 +02:00
Witold Kręcicki
b56948743a lib/isc/task: use isc_refcount_t 2019-07-09 16:09:36 +02:00
Ondřej Surý
e3e6888946 Make the usage of json-c objects opaque to the caller
The json-c have previously leaked into the global namespace leading
to forced -I<include_path> for every compilation unit using isc/xml.h
header.  This MR fixes the usage making the caller object opaque.
2019-06-25 12:04:20 +02:00
Ondřej Surý
0771dd3be8 Make the usage of libxml2 opaque to the caller
The libxml2 have previously leaked into the global namespace leading
to forced -I<include_path> for every compilation unit using isc/xml.h
header.  This MR fixes the usage making the caller object opaque.
2019-06-25 12:01:32 +02:00