Instead of passing the number of worker to the dns_zonemgr manually,
get the number of nm threads using the new isc_nm_getnworkers() call.
Additionally, remove the isc_pool API and manage the array of memory
context, zonetasks and loadtasks directly in the zonemgr.
After switching to per-thread resources in the zonemgr, the performance
was decreased because the memory context, zonetask and loadtask was
picked from the pool at random.
Pin the zone to single threadid (.tid) and align the memory context,
zonetask and loadtask to be the same, this sets the hard affinity of the
zone to the netmgr thread.
Previously, the zonemgr created 1 task per 100 zones and 1 memory
context per 1000 zones (with minimum 10 tasks and 2 memory contexts) to
reduce the contention between threads.
Instead of reducing the contention by having many resources, create a
per-nm_thread memory context, loadtask and zonetask and spread the zones
between just per-thread resources.
Note: this commit alone does decrease performance when loading the zone
by couple seconds (in case of 1M zone) and thus there's more work in
this whole MR fixing the performance.
Previously, the zone timer was not stopped before detaching the timer.
This could lead to a data race where the timer post_event() could fire
before the timer was detached, but then the event would be executed
after the zone was already destroyed.
This was not noticed before because the timing or the ordering of the
actions were different, but it was causing assertion failures in the
libns tests now.
Properly stop the zone timer before detaching the timer object from the
dns_zone.
When we are loading the zones, set the quantum to UINT_MAX, which makes
task_run process all tasks at once. After the zone loading is finished
the quantum will be dropped to 1 to not block server when we are loading
new zones after reconfiguration.
the value of 'i' in generate could overflow when adding 'step' to
it in the 'for' loop. Use an unsigned int for 'i' which will give
an additional bit and prevent the overflow. The inputs are both
less than 2^31 and and the result will be less than 2^32-1.
The ADB previously used separate reference counters for internal
and external references, plus additional counters for ABD find
and namehook objects, and used all these counters to coordinate
its shutdown process, which was a multi-stage affair involving
a sequence of control events.
It also used a complex interlocking set of static functions for
referencing, deferencing, linking, unlinking, and cleaning up various
internal objects; these functions returned boolean values to their
callers to indicate what additional processing was needed.
The changes in the previous two commits destabilized this fragile
system in a way that was difficult to recover from, so in this commit
we refactor all of it. The dns_adb and dns_adbentry objects now use
conventional attach and detach functions for reference counting, and
the shutdown process is much more straightforward. Instead of
handling shutdown asynchronously, we can just destroy the ADB when
references reach zero
In addition, ADB locking has been simplified. Instead of a
single `find_{name,entry}_and_lock()` function which searches for
a name or entry's hash bucket, locks it, and then searches for the
name or entry in the bucket, we now use one function to find the
bucket (leaving it to the caller to do the locking) and another
find the name or entry. Instead of locking the entire ADB when
modifying hash tables, we now use read-write locks around the
specific hash table. The only remaining need for adb->lock
is when modifying the `whenshutdown` list.
Comments throughout the module have been improved.
Replace adb->{names,entries} and related arrays (indexed by hashed
bucket) with a isc_ht hash tables storing the new struct
adb{name,entry}bucket_t that wraps all the variables that were
originally stored in arrays indexed by "bucket" number stored directly
in the struct dns_adb.
Previously, the task exclusive mode has been used to grow the internal
arrays used to store the named and entries objects. The isc_ht hash
tables are now protected by the isc_rwlock instead and thus the usage of
the task exclusive mode has been removed from the dns_adb.
Co-authored-by: Ondřej Surý <ondrej@isc.org>
the use of "result" as a variable name for a boolean return value
was confusing; all 'result' variables that are not isc_result_t
have been renamed to 'ret'.
The static function print_dns_name() was a duplicate of
dns_name_print(), so it has been replaced with that.
Changed INSIST to REQUIRE where appropriate, and added NULL
initialization for pointer variables.
The use of isc_appctx_t in dns_client was used to wait for
dns_client_startresolve() to finish the processing (the resolve_done()
task callback).
This has been replaced with standard bool+cond+lock combination removing
the need of isc_appctx_t altogether.
Previously, the isc_ht API would always take the key as a literal input
to the hashing function. Change the isc_ht_init() function to take an
'options' argument, in which ISC_HT_CASE_SENSITIVE or _INSENSITIVE can
be specified, to determine whether to use case-sensitive hashing in
isc_hash32() when hashing the key.
Fibonacci hashing was implemented in four separate places (rbt.c,
rbtdb.c, resolver.c, zone.c). This commit combines them into a single
implementation. The hash_32() function is now replaced with
isc_hash_bits32().
In couple places, we have missed INSIST(0) or ISC_UNREACHABLE()
replacement on some branches with UNREACHABLE(). Replace all
ISC_UNREACHABLE() or INSIST(0) calls with UNREACHABLE().
This commit adds support for Strict/Mutual TLS into BIND. It does so
by implementing the backing code for 'hostname' and 'ca-file' options
of the 'tls' statement. The commit also updates the documentation
accordingly.
This commit adds support for keeping CA certificates stores associated
with TLS contexts. The intention is to keep one reusable store per a
set of related TLS contexts.
There is a possible code path of using the uninitialized `bname`
character array while logging an error message.
Initialize the `bname` buffer earlier in the function.
Also, change the initialization routine to use a helper function.
A successful call to `dns_rdata_tostruct()` expects an accompanying
call to `dns_rdata_freestruct()` to free up any memory that could have
been allocated during the first call.
In catz.c there are several places where `dns_rdata_freestruct()` call
is skipped.
Add the missing cleanup routines.
Previously, it was possible to assign a bit of memory space in the
nmhandle to store the client data. This was complicated and prevents
further refactoring of isc_nmhandle_t caching (future work).
Instead of caching the data in the nmhandle, allocate the hot-path
ns_client_t objects from per-thread clientmgr memory context and just
assign it to the isc_nmhandle_t via isc_nmhandle_set().
Some ancient versions of clang reported uninitialized memory use false
positive (see https://bugs.llvm.org/show_bug.cgi?id=14461). Since clang
4.0.1 has been long obsoleted, just remove the workarounds.
Historically, the inline keyword was a strong suggestion to the compiler
that it should inline the function marked inline. As compilers became
better at optimising, this functionality has receded, and using inline
as a suggestion to inline a function is obsolete. The compiler will
happily ignore it and inline something else entirely if it finds that's
a better optimisation.
Therefore, remove all the occurences of the inline keyword with static
functions inside single compilation unit and leave the decision whether
to inline a function or not entirely on the compiler
NOTE: We keep the usage the inline keyword when the purpose is to change
the linkage behaviour.
Previously, the unreachable code paths would have to be tagged with:
INSIST(0);
ISC_UNREACHABLE();
There was also older parts of the code that used comment annotation:
/* NOTREACHED */
Unify the handling of unreachable code paths to just use:
UNREACHABLE();
The UNREACHABLE() macro now asserts when reached and also uses
__builtin_unreachable(); when such builtin is available in the compiler.
Gcc 7+ and Clang 10+ have implemented __attribute__((fallthrough)) which
is explicit version of the /* FALLTHROUGH */ comment we are currently
using.
Add and apply FALLTHROUGH macro that uses the attribute if available,
but does nothing on older compilers.
In one case (lib/dns/zone.c), using the macro revealed that we were
using the /* FALLTHROUGH */ comment in wrong place, remove that comment.
In the GSS-TSIG verification code there was an alarming
variable-length array whose size came off the network, from the
signature in the request. It turned out to be safe, because the caller
had previously checked that the signature had a reasonable size.
However, the safety checks are in the generic TSIG implementation, and
the risky VLA usage was in the GSS-specific code, and they are
separated by the DST indirection layer, so it wasn't immediately
obvious that the risky VLA was in fact safe.
In fact this risky VLA was completely unnecessary, because the GSS
signature can be verified in place without being copied to the stack,
like the message covered by the signature. The `REGION_TO_GBUFFER()`
macro backwardly assigns the region in its left argument to the GSS
buffer in its right argument; this is just a pointer and length
conversion, without copying any data. The `gss_verify_mic()` call uses
both message and signature GSS buffers in a read-only manner.
The clang-format-15 has new option InsertBraces that could add missing
branches around single line statements. Use that to our advantage
without switching to not-yet-released LLVM version to add missing braces
in couple of places.
The fetch can be in the shutting down state when resume_dslookup() is
trying to operate on it.
This is also a security issue, because a malicious actor can set up a
name server which delays certain queries in such a way that the fetch
will time out and shut down, which will cause named to crash.
Add a check to see if the fetch has the shutting down attribute set,
and cancel any further operations on it in such case.
A similar bug had been fixed earlier for the resume_qmin() function,
see [GL #966].
This is an optimisation as we can skip a lot of pointless work when we
know there is a DNAME there.
When we have a partial match and a DNAME above the QNAME, the closest
encloser has the same owner as the DNAME, will have the DNAME bit set
in the type map, and we wouldn't use it as we would return the
DNAME + RRSIG(DNAME) instead.
So there is no point in looking for it nor in attempting to check that
it is valid for the QNAME.
'setup_delegation' depends on 'foundname' being the value returned
by 'dns_rbt_findnode' in the cache and 'find_coveringnsec' was
modifying 'foundname' when a covering NSEC was not found.
When caching glue, we need to ensure that there is no closer
source of truth for the name. If the owner name for the glue
record would be answered by a locally configured zone, do not
cache.
When caching additional and glue data *not* from a forwarder, we must
check that there is no "forward only" clause covering the owner name
that would take precedence. Such names would normally be allowed by
baliwick rules, but a "forward only" zone introduces a new baliwick
scope.
If we are using a fowarder, in addition to checking that names to
be cached are subdomains of the forwarded namespace, we must also
check that there are no subsidiary forwarded namespaces which would
take precedence. To be safe, we don't cache any responses if the
forwarding configuration has changed since the query was sent.
Change the isc_interval_t implementation from separate data type and
separate implementation to be shim implementation on top of isc_time_t.
The distinction between isc_interval_t and isc_time_t has been kept
because they are semantically different - isc_interval_t is relative and
isc_time_t is absolute, but this allows isc_time_t and isc_interval_t to
be freely interchangeable, f.e. this:
isc_time_t *t1;
isc_interval_t *interval;
isc_time_t *t2;
isc_interval_set(interval, isc_time_seconds(t2), isc_time_nanoseconds(t2);;
isc_time_subtract(t1, interval, t2);
isc_interval_set(interval, isc_time_seconds(t2), isc_time_nanoseconds(t2));
to just:
isc_time_t *t1;
isc_interval_t *interval;
isc_time_t *t2;
isc_time_subtract(t1, t2, interval);
without introducing a whole set of new functions.
There were two places where expires argument (absolute isc_time_t value)
was being used. Both places has been converted to use relative interval
argument in preparation of simplification and refactoring of isc_timer
API.
The isc_timer_create() function was a bit conflated. It could have been
used to create a timer and start it at the same time. As there was a
single place where this was done before (see the previous commit for
nta.c), this was cleaned up and the isc_timer_create() function was
changed to only create new timer.
In nta.c, it was the only place where the active timer was created
directly instead of first creating inactive timer and then starting it
with isc_timer_reset().
Change the code to create inactive timer first, so we can refactor the
isc_timer_create() function.
The C17 standard deprecated ATOMIC_VAR_INIT() macro (see [1]). Follow
the suite and remove the ATOMIC_VAR_INIT() usage in favor of simple
assignment of the value as this is what all supported stdatomic.h
implementations do anyway:
* MacOSX.plaform: #define ATOMIC_VAR_INIT(__v) {__v}
* Gcc stdatomic.h: #define ATOMIC_VAR_INIT(VALUE) (VALUE)
1. http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1138r0.pdf
Previously, the function(s) in the commit subject could fail for various
reasons - mostly allocation failures, or other functions returning
different return code than ISC_R_SUCCESS. Now, the aforementioned
function(s) cannot ever fail and they would always return ISC_R_SUCCESS.
Change the function(s) to return void and remove the extra checks in
the code that uses them.
Previously, the function(s) in the commit subject could fail for various
reasons - mostly allocation failures, or other functions returning
different return code than ISC_R_SUCCESS. Now, the aforementioned
function(s) cannot ever fail and they would always return ISC_R_SUCCESS.
Change the function(s) to return void and remove the extra checks in
the code that uses them.
Previously, the function(s) in the commit subject could fail for various
reasons - mostly allocation failures, or other functions returning
different return code than ISC_R_SUCCESS. Now, the aforementioned
function(s) cannot ever fail and they would always return ISC_R_SUCCESS.
Change the function(s) to return void and remove the extra checks in
the code that uses them.
When get_dispatch() returns an error code, the dns_request_createraw()
function jumps to the `cleanup` label, which will leave a previous
attachment to the `request` pointer unattached.
Fix the issue by jumping to the `detach` label instead.
Formerly, the gen.h header contained a compatibility layer between Win32
and POSIX platforms. Since we have already dropped the Win32 build, we
can merged gen.h into gen.c as the header file is not used elsewhere.
The order in which the netievents are processed on the network manager
loop is not guaranteed. Therefore the recv/read callback can come
earlier than the send/write callback.
The dns_request API wasn't ready for this reordering and it was
destroying the dns_request_t object before the send callback has been
called.
Add additional attach/detach in the req_send()/req_senddone() functions
to make sure we don't destroy the dns_request_t while it's still being
references by asynchronous call.
BIND unconditionally uses shims for BN_GENCB_new(), BN_GENCB_free(),
and BN_GENCB_get_arg() for all LibreSSL versions and, correctly, for
OpenSSL <1.1.0 versions.
This breaks LibreSSL compilation starting with LibreSSL 3.5.0.
Use autoconf check instead to check whether the family of the functions
are available.
By default C promotes short unsigned values to signed int which
leads to undefined behaviour when the value is shifted by too much.
Force unsigned arithmetic to be perform by explicitly casting to a
unsigned type.