isc_task_pause/unpause were inherently thread-unsafe - a task
could be paused only once by one thread, if the task was running
while we paused it it led to races. Fix it by making sure that
the task will pause if requested to, and by using a 'pause reference
counter' to count task pause requests - a task will be unpaused
iff all threads unpause it.
Don't remove from queue when pausing task - we lock the queue lock
(expensive), while it's unlikely that the task will be running -
and we'll remove it anyway in dispatcher
Also disable the semantic patch as the code needs tweaks here and there because
some destroy functions might not destroy the object and return early if the
object is still in use.
To reproduce the race - create a task, send two events to it, first one
must take some time. Then, from the outside, pause(), unpause() and detach()
the task.
When the long-running event is processed by the task it is in
task_state_running state. When we called pause() the state changed to
task_state_paused, on unpause we checked that there are events in the task
queue, changed the state to task_state_ready and enqueued the task on the
workers readyq. We then detach the task.
The dispatch() is done with processing the event, it processes the second
event in the queue, and then shuts down the task and frees it (as it's not
referenced anymore). Dispatcher then takes the, already freed, task from
the queue where it was wrongly put, causing an use-after free and,
subsequently, either an assertion failure or a segmentation fault.
The probability of this happening is very slim, yet it might happen under a
very high load, more probably on a recursive resolver than on an
authoritative.
The fix introduces a new 'task_state_pausing' state - to which tasks
are moved if they're being paused while still running. They are moved
to task_state_paused state when dispatcher is done with them, and
if we unpause a task in paused state it's moved back to task_state_running
and not requeued.
This allows a task to be temporary disabled so that objects won't be
processed simultaneously by libuv events and isc_task events. When a
task is paused, currently running events may complete, but no further
event will added to the run queue will be executed until the task is
unpaused.
When a task manager is created, we can now specify an `isc_nm`
object to associate with it; thereafter when the task manager is
placed into exclusive mode, the network manager will be paused.
This is a replacement for the existing isc_socket and isc_socketmgr
implementation. It uses libuv for asynchronous network communication;
"networker" objects will be distributed across worker threads reading
incoming packets and sending them for processing.
UDP listener sockets automatically create an array of "child" sockets
so each worker can listen separately.
TCP sockets are shared amongst worker threads.
A TCPDNS socket is a wrapper around a TCP socket, which handles the
the two-byte length field at the beginning of DNS messages over TCP.
(Other wrapper socket types can be implemented in the future to handle
DNS over TLS, DNS over HTTPS, etc.)
isc_event_allocate() calls isc_mem_get() to allocate the event structure. As
isc_mem_get() cannot fail softly (e.g. it never returns NULL), the
isc_event_allocate() cannot return NULL, hence we remove the (ret == NULL)
handling blocks using the semantic patch from the previous commit.
Previously isc_thread_join() would return ISC_R_UNEXPECTED on a failure to
create new thread. All such occurences were caught and wrapped into assert
function at higher level. The function was simplified to assert directly in the
isc_thread_join() function and all caller level assertions were removed.
Previously isc_thread_create() would return ISC_R_UNEXPECTED on a failure to
create new thread. All such occurences were caught and wrapped into assert
function at higher level. The function was simplified to assert directly in the
isc_thread_create() function and all caller level assertions were removed.
The json-c have previously leaked into the global namespace leading
to forced -I<include_path> for every compilation unit using isc/xml.h
header. This MR fixes the usage making the caller object opaque.
The libxml2 have previously leaked into the global namespace leading
to forced -I<include_path> for every compilation unit using isc/xml.h
header. This MR fixes the usage making the caller object opaque.
Compiling with -O3 triggers the following warnings with GCC 9.1:
task.c: In function ‘isc_taskmgr_create’:
task.c:1384:43: warning: ‘%04u’ directive output may be truncated writing between 4 and 10 bytes into a region of size 6 [-Wformat-truncation=]
1384 | snprintf(name, sizeof(name), "isc-worker%04u", i);
| ^~~~
task.c:1384:32: note: directive argument in the range [0, 4294967294]
1384 | snprintf(name, sizeof(name), "isc-worker%04u", i);
| ^~~~~~~~~~~~~~~~
task.c:1384:3: note: ‘snprintf’ output between 15 and 21 bytes into a destination of size 16
1384 | snprintf(name, sizeof(name), "isc-worker%04u", i);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
private_test.c: In function ‘private_nsec3_totext_test’:
private_test.c:110:9: warning: array subscript 4 is outside array bounds of ‘uint32_t[1]’ {aka ‘unsigned int[1]’} [-Warray-bounds]
110 | while (*sp == '\0' && slen > 0) {
| ^~~
private_test.c:103:11: note: while referencing ‘salt’
103 | uint32_t salt;
| ^~~~
Prevent these warnings from being triggered by increasing the size of
the relevant array (task.c) and reordering conditions (private_test.c).
This work cleans up the API which includes couple of things:
1. Make the isc_appctx_t type fully opaque
2. Protect all access to the isc_app_t members via stdatomics
3. sigwait() is part of POSIX.1, remove dead non-sigwait code
4. Remove unused code: isc_appctx_set{taskmgr,sockmgr,timermgr}
If we know that we'll have a task pool doing specific thing it's better
to use this knowledge and bind tasks to task queues, this behaves better
than randomly choosing the task queue.
- use bound resolver tasks - we have a pool of tasks doing resolutions,
we can spread the load evenly using isc_task_create_bound
- quantum set universally to 25