BIND 9 plugins are installed using Automake's pkglib_LTLIBRARIES stanza,
which causes the relevant shared objects to be placed in the
$(libdir)/@PACKAGE@/ directory, where @PACKAGE@ is expanded to the
lowercase form of the first argument passed to AC_INIT(), i.e. "bind".
Meanwhile, NAMED_PLUGINDIR - the preprocessor macro that the
ns_plugin_expandpath() function uses for determining the absolute path
to a plugin for which only a filename has been provided (rather than a
path) - is set to $(libdir)/named. This discrepancy breaks loading
plugins using just their filenames. Fix the issue (and also prevent it
from reoccurring) by setting NAMED_PLUGINDIR to $(pkglibdir).
For some applications, it's useful to not listen on full battery of
threads. Add workers argument to all isc_nm_listen*() functions and
convenience ISC_NM_LISTEN_ONE and ISC_NM_LISTEN_ALL macros.
This commit makes use of isc_nmsocket_set_tlsctx(). Now, instead of
recreating TLS-enabled listeners (including the underlying TCP
listener sockets), only the TLS context in use is replaced.
Now that the dns_aclenv_t has now properly rwlocked .localhost and
.localnets member, we can remove the task exclusive mode use from the
ns_interfacemgr. Some light related cleanup has been also done.
In order to modify the .localhost and .localnets members of the
dns_aclenv, all other processing on the netmgr loops needed to be
stopped using the task exclusive mode. Add the isc_rwlock to the
dns_aclenv, so any modifications to the .localhost and .localnets can be
done under the write lock.
Instead of passing the number of worker to the dns_zonemgr manually,
get the number of nm threads using the new isc_nm_getnworkers() call.
Additionally, remove the isc_pool API and manage the array of memory
context, zonetasks and loadtasks directly in the zonemgr.
After switching to per-thread resources in the zonemgr, the performance
was decreased because the memory context, zonetask and loadtask was
picked from the pool at random.
Pin the zone to single threadid (.tid) and align the memory context,
zonetask and loadtask to be the same, this sets the hard affinity of the
zone to the netmgr thread.
Previously, the zonemgr created 1 task per 100 zones and 1 memory
context per 1000 zones (with minimum 10 tasks and 2 memory contexts) to
reduce the contention between threads.
Instead of reducing the contention by having many resources, create a
per-nm_thread memory context, loadtask and zonetask and spread the zones
between just per-thread resources.
Note: this commit alone does decrease performance when loading the zone
by couple seconds (in case of 1M zone) and thus there's more work in
this whole MR fixing the performance.
Ensure the update zone name is mentioned in the NOTAUTH error message
in the server log, so that it is easier to track down problematic
update clients. There are two cases: either the update zone is
unrelated to any of the server's zones (previously no zone was
mentioned); or the update zone is a subdomain of one or more of the
server's zones (previously the name of the irrelevant parent zone was
misleadingly logged).
Closes#3209
The .lock, .exiting and .excl members were not using for anything else
than starting task exclusive mode, setting .exiting to true and ending
exclusive mode.
Remove all the stray members and dead code eliminating the task
exclusive mode use from ns_clientmgr.
In couple places, we have missed INSIST(0) or ISC_UNREACHABLE()
replacement on some branches with UNREACHABLE(). Replace all
ISC_UNREACHABLE() or INSIST(0) calls with UNREACHABLE().
This commit adds support for Strict/Mutual TLS into BIND. It does so
by implementing the backing code for 'hostname' and 'ca-file' options
of the 'tls' statement. The commit also updates the documentation
accordingly.
This commit adds support for keeping CA certificates stores associated
with TLS contexts. The intention is to keep one reusable store per a
set of related TLS contexts.
The way the ns_client_t .shuttingdown member was practically dead code.
The .shuttingdown would be set to true only in ns__client_put() function
meaning that we have detached from all ns_client_t .*handles and the
ns_client_t object being freed:
client->magic = 0;
client->shuttingdown = true;
[...]
isc_mem_put(manager->ctx, client, sizeof(*client))
Meanwhile the ns_client_t object is accessed like this:
isc_nmhandle_detach(&client->fetchhandle);
client->query.attributes &= ~NS_QUERYATTR_RECURSING;
client->state = NS_CLIENTSTATE_WORKING;
qctx_init(client, &devent, 0, &qctx);
client_shuttingdown = ns_client_shuttingdown(client);
if (fetch_canceled || fetch_answered || client_shuttingdown) {
[...]
}
Even if the isc_nmhandle_detach(...) was the last handle detach, it
would mean that immediatelly, after calling the isc_nmhandle_detach(),
we would be causing use-after-free, because the ns_client_t is
immediatelly destroyed after setting .shuttingdown to true.
The similar code in the query_hookresume() already noticed this:
/*
* This event is running under a client task, so it's safe to detach
* the fetch handle. And it should be done before resuming query
* processing below, since that may trigger another recursion or
* asynchronous hook event.
*/
Previously, it was possible to assign a bit of memory space in the
nmhandle to store the client data. This was complicated and prevents
further refactoring of isc_nmhandle_t caching (future work).
Instead of caching the data in the nmhandle, allocate the hot-path
ns_client_t objects from per-thread clientmgr memory context and just
assign it to the isc_nmhandle_t via isc_nmhandle_set().
The ns_client_t is always attached to ns_clientmgr_t which has
associated memory context, server context, task and threadid. Use those
directly from the ns_clientmgr_t instead of attaching it to an extra
copy in ns_client_t to make the ns_client_t more sleek and lean.
Additionally, remove some stray ns_client_t struct members that were not
used anywhere.
Historically, the inline keyword was a strong suggestion to the compiler
that it should inline the function marked inline. As compilers became
better at optimising, this functionality has receded, and using inline
as a suggestion to inline a function is obsolete. The compiler will
happily ignore it and inline something else entirely if it finds that's
a better optimisation.
Therefore, remove all the occurences of the inline keyword with static
functions inside single compilation unit and leave the decision whether
to inline a function or not entirely on the compiler
NOTE: We keep the usage the inline keyword when the purpose is to change
the linkage behaviour.
C11 has builtin support for _Noreturn function specifier with
convenience noreturn macro defined in <stdnoreturn.h> header.
Replace ISC_NORETURN macro by C11 noreturn with fallback to
__attribute__((noreturn)) if the C11 support is not complete.
Previously, the unreachable code paths would have to be tagged with:
INSIST(0);
ISC_UNREACHABLE();
There was also older parts of the code that used comment annotation:
/* NOTREACHED */
Unify the handling of unreachable code paths to just use:
UNREACHABLE();
The UNREACHABLE() macro now asserts when reached and also uses
__builtin_unreachable(); when such builtin is available in the compiler.
Gcc 7+ and Clang 10+ have implemented __attribute__((fallthrough)) which
is explicit version of the /* FALLTHROUGH */ comment we are currently
using.
Add and apply FALLTHROUGH macro that uses the attribute if available,
but does nothing on older compilers.
In one case (lib/dns/zone.c), using the macro revealed that we were
using the /* FALLTHROUGH */ comment in wrong place, remove that comment.
Instead of passing the "workers" variable back and forth along with
passing the single isc_nm_t instance, add isc_nm_getnworkers() function
that returns the number of netmgr threads are running.
Change the ns_interfacemgr and ns_taskmgr to utilize the newly acquired
knowledge.
When max-transfer-*-out timeouts were reintroduced, the log message
about starting the timer was errorneously left as ISC_LOG_ERROR.
Change the log level of said message to ISC_LOG_DEBUG(1).
The C17 standard deprecated ATOMIC_VAR_INIT() macro (see [1]). Follow
the suite and remove the ATOMIC_VAR_INIT() usage in favor of simple
assignment of the value as this is what all supported stdatomic.h
implementations do anyway:
* MacOSX.plaform: #define ATOMIC_VAR_INIT(__v) {__v}
* Gcc stdatomic.h: #define ATOMIC_VAR_INIT(VALUE) (VALUE)
1. http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1138r0.pdf
Commit aab691d51266f552a7923db32686fb9398b1d255 did not fix all possible
scenarios in which the ns_statscounter_recursclients counter underflows.
The solution implemented therein can be ineffective e.g. when CNAME
chaining happens with prefetching enabled.
Here is an example recursive resolution scenario in which the
ns_statscounter_recursclients counter can underflow with the current
logic in effect:
1. Query processing starts, the answer is not found in the cache, so
recursion is started. The NS_CLIENTATTR_RECURSING attribute is set.
ns_statscounter_recursclients is incremented (Δ = +1).
2. Recursion completes, returning a CNAME. client->recursionquota is
non-NULL, so the NS_CLIENTATTR_RECURSING attribute remains set.
ns_statscounter_recursclients is decremented (Δ = 0).
3. Query processing restarts.
4. The current QNAME (the target of the CNAME from step 2) is found in
the cache, with a TTL low enough to trigger a prefetch.
5. query_prefetch() attaches to client->recursionquota.
ns_statscounter_recursclients is not incremented because
query_prefetch() does not do that (Δ = 0).
6. Query processing restarts.
7. The current QNAME (the target of the CNAME from step 4) is not found
in the cache, so recursion is started. client->recursionquota is
already attached to (since step 5) and the NS_CLIENTATTR_RECURSING
attribute is set (since step 1), so ns_statscounter_recursclients is
not incremented (Δ = 0).
8. The prefetch from step 5 completes. client->recursionquota is
detached from in prefetch_done(). ns_statscounter_recursclients is
not decremented because prefetch_done() does not do that (Δ = 0).
9. Recursion for the current QNAME completes. client->recursionquota
is already detached from, i.e. set to NULL (since step 8), and the
NS_CLIENTATTR_RECURSING attribute is set (since step 1), so
ns_statscounter_recursclients is decremented (Δ = -1).
Another possible scenario is that after step 7, recursion for the target
of the CNAME from step 4 completes before the prefetch for the CNAME
itself. fetch_callback() then notices that client->recursionquota is
non-NULL and decrements ns_statscounter_recursclients, even though
client->recursionquota was attached to by query_prefetch() and therefore
not accompanied by an incrementation of ns_statscounter_recursclients.
The net result is also an underflow.
Instead of trying to properly handle all possible orderings of events
set into motion by normal recursion and prefetch-triggered recursion,
adjust ns_statscounter_recursclients whenever the recursive clients
quota is successfully attached to or detached from. Remove the
NS_CLIENTATTR_RECURSING attribute altogether as its only purpose is made
obsolete by this change.
The keep-response-order option has been obsoleted, and in this commit,
remove the keep-response-order ACL map rendering the option no-op, the
call the isc_nm_sequential() and the now unused isc_nm_sequential()
function itself.
While refactoring the libns to use the new network manager, the
max-transfer-*-out options were not implemented and they were turned
non-operational.
Reimplement the max-transfer-idle-out functionality using the write
timer and max-transfer-time-out using the new isc_nm_timer API.
While refactoring the lib/ns/xfrout.c, it was discovered that .shutdown
and .shutdown_arg members of ns_client_t structure are unused.
Remove the unused members and associated code that was using in it in
the ns_xfrout.
When invalid DNS message is received, there was a handling mechanism for
DoH that would be called to return proper HTTP response.
Reuse this mechanism and reset the TCP connection when the client is
blackholed, DNS message is completely bogus or the ns_client receives
response instead of query.
this brings DNS_CLIENTINFO_VERSION into line with the subscription
branch so that fixes applied to clientinfo processing can also be
applied to the main branch without diverging.
Some operating systems (OpenBSD and DragonFly BSD) don't restrict the
IPv6 sockets to sending and receiving IPv6 packets only. Explicitly
enable the IPV6_V6ONLY socket option on the IPv6 sockets to prevent
failures from using the IPv4-mapped IPv6 address.
This commit converts the license handling to adhere to the REUSE
specification. It specifically:
1. Adds used licnses to LICENSES/ directory
2. Add "isc" template for adding the copyright boilerplate
3. Changes all source files to include copyright and SPDX license
header, this includes all the C sources, documentation, zone files,
configuration files. There are notes in the doc/dev/copyrights file
on how to add correct headers to the new files.
4. Handle the rest that can't be modified via .reuse/dep5 file. The
binary (or otherwise unmodifiable) files could have license places
next to them in <foo>.license file, but this would lead to cluttered
repository and most of the files handled in the .reuse/dep5 file are
system test files.
Using the TLS context cache for server-side contexts could reduce the
number of contexts to initialise in the configurations when e.g. the
same 'tls' entry is used in multiple 'listen-on' statements for the
same DNS transport, binding to multiple IP addresses.
In such a case, only one TLS context will be created, instead of a
context per IP address, which could reduce the initialisation time, as
initialising even a non-ephemeral TLS context introduces some delay,
which can be *visually* noticeable by log activity.
Also, this change lays down a foundation for Mutual TLS (when the
server validates a client certificate, additionally to a client
validating the server), as the TLS context cache can be extended to
store additional data required for validation (like intermediates CA
chain).
Additionally to the above, the change ensures that the contexts are
not being changed after initialisation, as such a practice is frowned
upon. Previously we would set the supported ALPN tags within
isc_nm_listenhttp() and isc_nm_listentlsdns(). We do not do that for
client-side contexts, so that appears to be an overlook. Now we set
the supported ALPN tags right after server-side contexts creation,
similarly how we do for client-side ones.
reference counting of ns_interface objects has not been used
since the clientmgr cleanup in #2433, and it no longer really
makes sense now - when we want to destroy an interface on a
rescan, we want it to be destroyed, not kept active by some
other caller. so ns_interface_attach() has been removed,
ns_interface_detach() has been replaced with a static
interface_destroy(), and do_scan() has been simplified
accordingly.
previously, if "listen-on-v6" was set to "none", then every
time a scan saw an IPv6 address it would appear to be a new
one. this commit retains all known interfaces in a list
and sets a flag in the ones that are listening, so that
configured interfaces that have been seen before will be
recognized as such.
as an incidental fix, the ns__interfacemgr_getif() and _nextif()
functions have been removed since they were never used.
This commit modifies the NetLink handling code in such a way
that the contents of the messages we are interested in is checked
for the local addresses changes only. This helps to avoid spurious
interface re-scans.
The 'route_recv' log messages are also reduced from DEBUG(3) to
DEBUG(9).
The memory context created in the clientmgr context was missing a name,
so it was nameless in the memory context statistics.
Set the clientmgr memory context name to "clientmgr".
The 850e9e59bf8c29f895a981211c72c0b3c294bcfd commit intended to recreate
the HTTPS and TLS interfaces during reconfiguration, but they are being
recreated also during regular interface re-scans.
Make sure the HTTPS and TLS interfaces are being recreated only during
reconfiguration.
For DoH and DoT listeners, a reconfiguration event triggers a creation
of a new 'SSL_CTX' TLS context, and a destruction of the old one.
The network manager, though, keeps using the old context which causes
errors.
During interface scanning, when a matching existing interface is found,
reuse it only when it doesn't have a TLS context, otherwise shut it down
and recreate with a new TLS context.
The old code rejected NSEC that proved the wildcard name existed
(exists). The new code rejects NSEC that prove that the wildcard
name exists and that the type exists (exists && data) but accept
NSEC that prove the wildcard name exists.
query_synthnxdomain (renamed query_synthnxdomainnodata) already
took the NSEC records and added the correct records to the message
body for NXDOMAIN or NODATA responses with the above change. The
only additional change needed was to ensure the correct RCODE is
set.
dns_nsec_noexistnodata now checks that RRSIG and NSEC are
present in the type map. Both types should be present in
a correctly constructed NSEC record. This check is in
addition to similar checks in resolver.c and validator.c.
This commit completes the integration of the new, extended ACL syntax
featuring 'port' and 'transport' options.
The runtime presentation and ACL loading code are extended to allow
the syntax to be used beyond the 'allow-transfer' option (e.g. in
'acl' definitions and other 'allow-*' options) and can be used to
ultimately extend the ACL support with transport-only
ACLs (e.g. 'transport-acl tls-acl port 853 transport tls'). But, due
to fundamental nature of such a change, it has not been completed as a
part of 9.17.X release series due to it being close to 9.18 stable
release status. That means that we do not have enough time to fully
test it.
The complete integration is planned as a part of 9.19.X release
series.
The code was manually verified to work as expected by temporarily
enabling the extended syntax for 'acl' statements and 'allow-query'
options, including ACL merging, negated ACLs.
This commit adds an isc_nm_socket_type() function which can be used to
obtain a handle's socket type.
This change obsoletes isc_nm_is_tlsdns_handle() and
isc_nm_is_http_handle(). However, it was decided to keep the latter as
we eventually might end up supporting multiple HTTP versions.
Add a new parameter to 'ns_client_t' to store potential extended DNS
error. Reset when the client request ends, or is put back.
Add defines for all well-known info-codes.
Update the number of DNS_EDNSOPTIONS that we are willing to set.
Create a new function to set the extended error for a client reply.
Change 5756 (GL #2854) introduced build errors when using
'configure --disable-doh'. To fix this, isc_nm_is_http_handle() is
now defined in all builds, not just builds that have DoH enabled.
Missing code comments were added both for that function and for
isc_nm_is_tlsdns_handle().
This commit makes BIND set the "max-age" value of the "Cache-Control"
HTTP header to the minimal TTL from the Answer section for positive
answers, as RFC 8484 advises in section 5.1.
We calculate the minimal TTL as a side effect of rendering the
response DNS message, so it does not change the code flow much, nor
should it have any measurable negative impact on the performance.
For negative answers, the "max-age" value is set using the TTL and
SOA-minimum values from an SOA record in the Authority section.
it was possible for the route socket's udp_recv() callback to fire
after the interfacemgr was detached, causing an assertion failure.
this has now been fixed by referencing the interfacemgr when setting up
the route socket, and dereferencing it when shutting it down.
Some of the libns unit tests override the isc_nmhandle_attach() and
_detach() functions. This causes a failure in ns_interface_create()
if a route socket is being used, so we add a parameter to disable it.