This change makes the zone table lock-free for reads. Previously, the
zone table used a red-black tree, which is not thread safe, so the hot
read path acquired both the per-view mutex and the per-zonetable
rwlock. (The double locking was to fix to cleanup races on shutdown.)
One visible difference is that zones are not necessarily shut down
promptly: it depends on when the qp-trie garbage collector cleans up
the zone table. The `catz` system test checks several times that zones
have been deleted; the test now checks for zones to be removed from
the server configuration, instead of being fully shut down. The catz
test does not churn through enough zones to trigger a gc, so the zones
are not fully detached until the server exits.
After this change, it is still possible to improve the way we handle
changes to the zone table, for instance, batching changes, or better
compaction heuristics.
When fatal is called we may be holding memory allocated by OpenSSL.
This may result in the reference count for the FIPS provider not
going to zero and the shared library not being unloaded during
OPENSSL_cleanup. When the shared library is ultimately unloaded,
when all remaining dynamically loaded libraries are freed, we have
already destroyed the memory context we where using to track memory
leaks / late frees resulting in INSIST being called.
Disable triggering the INSIST when fatal has being called.
The isc_time_now() and isc_time_now_hires() were used inconsistently
through the code - either with status check, or without status check,
or via TIME_NOW() macro with RUNTIME_CHECK() on failure.
Refactor the isc_time_now() and isc_time_now_hires() to always fail when
getting current time has failed, and return the isc_time_t value as
return value instead of passing the pointer to result in the argument.
the 'dispatchmgr' member of the resolver object is used by both
the dns_resolver and dns_request modules, and may in the future
be used by others such as dns_xfrin. it doesn't make sense for it
to live in the resolver object; this commit moves it into dns_view.
as there is no further use of isc_task in BIND, this commit removes
it, along with isc_taskmgr, isc_event, and all other related types.
functions that accepted taskmgr as a parameter have been cleaned up.
as a result of this change, some functions can no longer fail, so
they've been changed to type void, and their callers have been
updated accordingly.
the tasks table has been removed from the statistics channel and
the stats version has been updated. dns_dyndbctx has been changed
to reference the loopmgr instead of taskmgr, and DNS_DYNDB_VERSION
has been udpated as well.
dns_request_create() and _createraw() now take a 'loop' parameter
and run the callback event on the specified loop.
as the task manager is no longer used, it has been removed from
the dns_requestmgr structure. the dns_resolver_taskmgr() function
is also no longer used and has been removed.
* rbt node chains were sized to allow for bitstring labels, so they
had 256 levels; but in the absence of bistrings, 128 is enough.
* dns_byaddr_createptrname() had a redundant options argument,
and a very outdated doc comment.
* A number of comments referred to bitstring labels in a way that is
no longer helpful. (A few informative comments remain.)
DSCP has not been fully working since the network manager was
introduced in 9.16, and has been completely broken since 9.18.
This seems to have caused very few difficulties for anyone,
so we have now marked it as obsolete and removed the
implementation.
To ensure that old config files don't fail, the code to parse
dscp key-value pairs is still present, but a warning is logged
that the feature is obsolete and should not be used. Nothing is
done with configured values, and there is no longer any
range checking.
The various factors like NS_PER_MS are now defined in a single place
and the names are no longer inconsistent. I chose the _PER_SEC names
rather than _PER_S because it is slightly more clear in isolation;
but the smaller units are always NS, US, and MS.
This change prepares ground for sending DNS requests using DoT,
which, in particular, will be used for forwarding dynamic updates
to TLS-enabled primaries.
dns_request_create() was a front-end to dns_request_createvia() that
was only used by test binaries. dns_request_createvia() has been
renamed to dns_request_create(), and the test programs that formerly
used dns_request_create() have been updated to use the new parameters.
YAML strings should be quoted if they contain colon characters.
Since IPv6 addresses do, we now quote the query_address and
response_address strings in all YAML output.
Previously:
* applications were using isc_app as the base unit for running the
application and signal handling.
* networking was handled in the netmgr layer, which would start a
number of threads, each with a uv_loop event loop.
* task/event handling was done in the isc_task unit, which used
netmgr event loops to run the isc_event calls.
In this refactoring:
* the network manager now uses isc_loop instead of maintaining its
own worker threads and event loops.
* the taskmgr that manages isc_task instances now also uses isc_loopmgr,
and every isc_task runs on a specific isc_loop bound to the specific
thread.
* applications have been updated as necessary to use the new API.
* new ISC_LOOP_TEST macros have been added to enable unit tests to
run isc_loop event loops. unit tests have been updated to use this
where needed.
* isc_timer was rewritten using the uv_timer, and isc_timermgr_t was
completely removed; isc_timer objects are now directly created on the
isc_loop event loops.
* the isc_timer API has been simplified. the "inactive" timer type has
been removed; timers are now stopped by calling isc_timer_stop()
instead of resetting to inactive.
* isc_manager now creates a loop manager rather than a timer manager.
* modules and applications using isc_timer have been updated to use the
new API.
Previously, tasks could be created either unbound or bound to a specific
thread (worker loop). The unbound tasks would be assigned to a random
thread every time isc_task_send() was called. Because there's no logic
that would assign the task to the least busy worker, this just creates
unpredictability. Instead of random assignment, bind all the previously
unbound tasks to worker 0, which is guaranteed to exist.
The dns_message_gettempname(), dns_message_gettemprdata(),
dns_message_gettemprdataset(), and dns_message_gettemprdatalist() always
succeeds because the memory allocation cannot fail now. Change the API
to return void and cleanup all the use of aforementioned functions.
After removing the isc_task_onshutdown(), the isc_task_shutdown() and
isc_task_destroy() became obsolete.
Remove calls to isc_task_shutdown() and replace the calls to
isc_task_destroy() with isc_task_detach().
Simplify the internal logic to destroy the task when the last reference
is removed.
There are a couple of problems with dns_request_createvia(): a UDP
retry count of zero means unlimited retries (it should mean no
retries), and the overall request timeout is not enforced. The
combination of these bugs means that requests can be retried forever.
This change alters calls to dns_request_createvia() to avoid the
infinite retry bug by providing an explicit retry count. Previously,
the calls specified infinite retries and relied on the limit implied
by the overall request timeout and the UDP timeout (which did not work
because the overall timeout is not enforced). The `udpretries`
argument is also changed to be the number of retries; previously, zero
was interpreted as infinity because of an underflow to UINT_MAX, which
appeared to be a mistake. And `mdig` is updated to match the change in
retry accounting.
The bug could be triggered by zone maintenance queries, including
NOTIFY messages, DS parental checks, refresh SOA queries and stub zone
nameserver lookups. It could also occur with `nsupdate -r 0`.
(But `mdig` had its own code to avoid the bug.)
C11 has builtin support for _Noreturn function specifier with
convenience noreturn macro defined in <stdnoreturn.h> header.
Replace ISC_NORETURN macro by C11 noreturn with fallback to
__attribute__((noreturn)) if the C11 support is not complete.
Previously, the unreachable code paths would have to be tagged with:
INSIST(0);
ISC_UNREACHABLE();
There was also older parts of the code that used comment annotation:
/* NOTREACHED */
Unify the handling of unreachable code paths to just use:
UNREACHABLE();
The UNREACHABLE() macro now asserts when reached and also uses
__builtin_unreachable(); when such builtin is available in the compiler.
This commit converts the license handling to adhere to the REUSE
specification. It specifically:
1. Adds used licnses to LICENSES/ directory
2. Add "isc" template for adding the copyright boilerplate
3. Changes all source files to include copyright and SPDX license
header, this includes all the C sources, documentation, zone files,
configuration files. There are notes in the doc/dev/copyrights file
on how to add correct headers to the new files.
4. Handle the rest that can't be modified via .reuse/dep5 file. The
binary (or otherwise unmodifiable) files could have license places
next to them in <foo>.license file, but this would lead to cluttered
repository and most of the files handled in the .reuse/dep5 file are
system test files.
Remove the dynamic registration of result codes. Convert isc_result_t
from unsigned + #defines into 32-bit enum type in grand unified
<isc/result.h> header. Keep the existing values of the result codes
even at the expense of the description and identifier tables being
unnecessary large.
Additionally, add couple of:
switch (result) {
[...]
default:
break;
}
statements where compiler now complains about missing enum values in the
switch statement.
- startrecv() and getnext() have been rewritten.
- Don't set TCP flag when connecting a UDP dispatch.
- Prevent TCP connections from trying to connect twice.
- dns_dispatch_gettcp() can now find a matching TCP dispatch that has
not yet fully connected, and attach to it. when the connection is
completed, the connect callbacks are run for all of the pending
entries.
- An atomic 'state' variable is now used for connection state instead of
attributes.
- When dns_dispatch_cancel() is called on a TCP dispatch entry, only
that one entry is canceled. the dispatch itself should not be shut
down until there are no dispatch entries left associated with it.
- Other incidental cleanup, including removing DNS_DISPATCHATTR_IPV4 and
_IPV6 (they were being set in the dispatch attributes but never used),
cleaning up dns_requestmgr_create(), and renaming dns_dispatch_read()
to the more descriptive dns_dispatch_resume().
- Responses received by the dispatch are no longer sent to the caller
via a task event, but via a netmgr-style recv callback. the 'action'
parameter to dns_dispatch_addresponse() is now called 'response' and
is called directly from udp_recv() or tcp_recv() when a valid response
has been received.
- All references to isc_task and isc_taskmgr have been removed from
dispatch functions.
- All references to dns_dispatchevent_t have been removed and the type
has been deleted.
- Added a task to the resolver response context, to be used for fctx
events.
- When the caller cancels an operation, the response handler will be
called with ISC_R_CANCELED; it can abort immediately since the caller
will presumably have taken care of cleanup already.
- Cleaned up attach/detach in resquery and request.
Since every dispsock was associated with a dispentry anyway (though not
always vice versa), the members of dispsock have been combined into
dispentry, which is now reference-counted. dispentry objects are now
attached before connecting and detached afterward to prevent races
between the connect callback and dns_dispatch_removeresponse().
Dispatch and dispatchmgr objects are now reference counted as well, and
the shutdown process has been simplified. reference counting of
resquery and request objects has also been cleaned up significantly.
dns_dispatch_cancel() now flags a dispentry as having been canceled, so
that if the connect callback runs after cancellation, it will not
initiate a read.
The isblackholed() function has been simplified.
The flow of operations in dispatch is changing and will now be similar
for both UDP and TCP queries:
1) Call dns_dispatch_addresponse() to assign a query ID and register
that we'll be listening for a response with that ID soon. the
parameters for this function include callback functions to inform the
caller when the socket is connected and when the message has been
sent, as well as a task action that will be sent when the response
arrives. (later this could become a netmgr callback, but at this
stage to minimize disruption to the calling code, we continue to use
isc_task for the response event.) on successful completion of this
function, a dispatch entry object will be instantiated.
2) Call dns_dispatch_connect() on the dispatch entry. this runs
isc_nm_udpconnect() or isc_nm_tcpdnsconnect(), as needed, and begins
listening for responses. the caller is informed via a callback
function when the connection is established.
3) Call dns_dispatch_send() on the dispatch entry. this runs
isc_nm_send() to send a request.
4) Call dns_dispatch_removeresponse() to terminate listening and close
the connection.
Implementation comments below:
- As we will be using netmgr buffers now. code to send the length in
TCP queries has also been removed as that is handled by the netmgr.
- TCP dispatches can be used by multiple simultaneous queries, so
dns_dispatch_connect() now checks whether the dispatch is already
connected before calling isc_nm_tcpdnsconnect() again.
- Running dns_dispatch_getnext() from a non-network thread caused a
crash due to assertions in the netmgr read functions that appear to be
unnecessary now. the assertions have been removed.
- fctx->nqueries was formerly incremented when the connection was
successful, but is now incremented when the query is started and
decremented if the connection fails.
- It's no longer necessary for each dispatch to have a pool of tasks, so
there's now a single task per dispatch.
- Dispatch code to avoid UDP ports already in use has been removed.
- dns_resolver and dns_request have been modified to use netmgr callback
functions instead of task events. some additional changes were needed
to handle shutdown processing correctly.
- Timeout processing is not yet fully converted to use netmgr timeouts.
- Fixed a lock order cycle reported by TSAN (view -> zone-> adb -> view)
by by calling dns_zt functions without holding the view lock.
- Many dispatch attributes can be set implicitly instead of being passed
in. we can infer whether to set DNS_DISPATCHATTR_TCP or _UDP from
whether we're calling dns_dispatch_createtcp() or _createudp(). we
can also infer DNS_DISPATCHATTR_IPV4 or _IPV6 from the addresses or
the socket that were passed in.
- We no longer use dup'd sockets in UDP dispatches, so the 'dup_socket'
parameter has been removed from dns_dispatch_createudp(), along with
the code implementing it. also removed isc_socket_dup() since it no
longer has any callers.
- The 'buffersize' parameter was ignored and has now been removed;
buffersize is now fixed at 4096.
- Maxbuffers and maxrequests don't need to be passed in on every call to
dns_dispatch_createtcp() and _createudp().
In all current uses, the value for mgr->maxbuffers will either be
raised once from its default of 20000 to 32768, or else left
alone. (passing in a value lower than 20000 does not lower it.) there
isn't enough difference between these values for there to be any need
to configure this.
The value for disp->maxrequests controls both the quota of concurrent
requests for a dispatch and also the size of the dispatch socket
memory pool. it's not clear that this quota is necessary at all. the
memory pool size currently starts at 32768, but is sometimes lowered
to 4096, which is definitely unnecessary.
This commit sets both values permanently to 32768.
- Previously TCP dispatches allocated their own separate QID table,
which didn't incorporate a port table. this commit removes
per-dispatch QID tables and shares the same table between all
dispatches. since dispatches are created for each TCP socket, this may
speed up the dispatch allocation process. there may be a slight
increase in lock contention since all dispatches are sharing a single
QID table, but since TCP sockets are used less often than UDP
sockets (which were already sharing a QID table), it should not be a
substantial change.
- The dispatch port table was being used to determine whether a port was
already in use; if so, then a UDP socket would be bound with
REUSEADDR. this commit removes the port table, and always binds UDP
sockets that way.
Currently the netmgr doesn't support unconnected, shared UDP sockets, so
there's no reason to retain that functionality in the dispatcher prior
to porting to the netmgr.
In this commit, the DNS_DISPATCHATTR_EXCLUSIVE attribute has been
removed as it is now non-optional; UDP dispatches are alwasy exclusive.
Code implementing non-exclusive UDP dispatches has been removed.
dns_dispatch_getentrysocket() now always returns the dispsocket for UDP
dispatches and the dispatch socket for TCP dispatches.
There is no longer any need to search for existing dispatches from
dns_dispatch_getudp(), so the 'mask' option has been removed, and the
function renamed to the more descriptive dns_dispatch_createudp().
The DNS_REQUESTOPT_SHARE flag was added when client-side pipelining of
TCP queries was implemented. there was no need to make it optional;
forcing it to be in effect for all requests simplfiies the code.
- UDP buffersize is now established when creating dispatch manager
and is always set to 4096.
- Set up the default port range in dispatchmgr before setting the magic
number.
- Magic is not set until dispatchmgr is fully created.
dns_message_gettempname() now returns a pointer to an initialized
name associated with a dns_fixedname_t object. it is no longer
necessary to allocate a buffer for temporary names associated with
the message object.
Previously, netmgr, taskmgr, timermgr and socketmgr all had their own
isc_<*>mgr_create() and isc_<*>mgr_destroy() functions. The new
isc_managers_create() and isc_managers_destroy() fold all four into a
single function and makes sure the objects are created and destroy in
correct order.
Especially now, when taskmgr runs on top of netmgr, the correct order is
important and when the code was duplicated at many places it's easy to
make mistake.
The former isc_<*>mgr_create() and isc_<*>mgr_destroy() functions were
made private and a single call to isc_managers_create() and
isc_managers_destroy() is required at the program startup / shutdown.
This commit changes the taskmgr to run the individual tasks on the
netmgr internal workers. While an effort has been put into keeping the
taskmgr interface intact, couple of changes have been made:
* The taskmgr has no concept of universal privileged mode - rather the
tasks are either privileged or unprivileged (normal). The privileged
tasks are run as a first thing when the netmgr is unpaused. There
are now four different queues in in the netmgr:
1. priority queue - netievent on the priority queue are run even when
the taskmgr enter exclusive mode and netmgr is paused. This is
needed to properly start listening on the interfaces, free
resources and resume.
2. privileged task queue - only privileged tasks are queued here and
this is the first queue that gets processed when network manager
is unpaused using isc_nm_resume(). All netmgr workers need to
clean the privileged task queue before they all proceed normal
operation. Both task queues are processed when the workers are
finished.
3. task queue - only (traditional) task are scheduled here and this
queue along with privileged task queues are process when the
netmgr workers are finishing. This is needed to process the task
shutdown events.
4. normal queue - this is the queue with netmgr events, e.g. reading,
sending, callbacks and pretty much everything is processed here.
* The isc_taskmgr_create() now requires initialized netmgr (isc_nm_t)
object.
* The isc_nm_destroy() function now waits for indefinite time, but it
will print out the active objects when in tracing mode
(-DNETMGR_TRACE=1 and -DNETMGR_TRACE_VERBOSE=1), the netmgr has been
made a little bit more asynchronous and it might take longer time to
shutdown all the active networking connections.
* Previously, the isc_nm_stoplistening() was a synchronous operation.
This has been changed and the isc_nm_stoplistening() just schedules
the child sockets to stop listening and exits. This was needed to
prevent a deadlock as the the (traditional) tasks are now executed on
the netmgr threads.
* The socket selection logic in isc__nm_udp_send() was flawed, but
fortunatelly, it was broken, so we never hit the problem where we
created uvreq_t on a socket from nmhandle_t, but then a different
socket could be picked up and then we were trying to run the send
callback on a socket that had different threadid than currently
running.
Add a +burst option to mdig so that we have a second to setup the
mdig calls then they run at the start of the next second.
RRL uses 'queries in a second' as a approximation to
'queries per second'. Getting the bursts of traffic to all happen in
the same second should prevent false negatives in the system test.
We now have a second to setup the traffic in. Then the traffic should
be sent at the start of the next second. If that still fails we
should move to +burst=<now+2> (further extend mdig) instead of the
implicit <now+1> as the trigger second.
The dns_message_create() function cannot soft fail (as all memory
allocations either succeed or cause abort), so we change the function to
return void and cleanup the calls.