There are a couple of problems with dns_request_createvia(): a UDP
retry count of zero means unlimited retries (it should mean no
retries), and the overall request timeout is not enforced. The
combination of these bugs means that requests can be retried forever.
This change alters calls to dns_request_createvia() to avoid the
infinite retry bug by providing an explicit retry count. Previously,
the calls specified infinite retries and relied on the limit implied
by the overall request timeout and the UDP timeout (which did not work
because the overall timeout is not enforced). The `udpretries`
argument is also changed to be the number of retries; previously, zero
was interpreted as infinity because of an underflow to UINT_MAX, which
appeared to be a mistake. And `mdig` is updated to match the change in
retry accounting.
The bug could be triggered by zone maintenance queries, including
NOTIFY messages, DS parental checks, refresh SOA queries and stub zone
nameserver lookups. It could also occur with `nsupdate -r 0`.
(But `mdig` had its own code to avoid the bug.)
The reference counting and isc_timer_attach()/isc_timer_detach()
semantic are actually misleading because it cannot be used under normal
conditions. The usual conditions under which is timer used uses the
object where timer is used as argument to the "timer" itself. This
means that when the caller is using `isc_timer_detach()` it needs the
timer to stop and the isc_timer_detach() does that only if this would be
the last reference. Unfortunately, this also means that if the timer is
attached elsewhere and the timer is fired it will most likely be
use-after-free, because the object used in the timer no longer exists.
Remove the reference counting from the isc_timer unit, remove
isc_timer_attach() function and rename isc_timer_detach() to
isc_timer_destroy() to better reflect how the API needs to be used.
The only caveat is that the already executed event must be destroyed
before the isc_timer_destroy() is called because the timer is no longet
attached to .ev_destroy_arg.
Previously, the task privileged mode has been used only when the named
was starting up and loading the zones from the disk as the "first" thing
to do. The privileged task was setup with quantum == 2, which made the
taskmgr/netmgr spin around the privileged queue processing two events at
the time.
The same effect can be achieved by setting the quantum to UINT_MAX (e.g.
practically unlimited) for the loadzone task, hence the privileged task
mode was removed in favor of just processing all the events on the
loadzone task in a single task_run().
Instead of passing the number of worker to the dns_zonemgr manually,
get the number of nm threads using the new isc_nm_getnworkers() call.
Additionally, remove the isc_pool API and manage the array of memory
context, zonetasks and loadtasks directly in the zonemgr.
After switching to per-thread resources in the zonemgr, the performance
was decreased because the memory context, zonetask and loadtask was
picked from the pool at random.
Pin the zone to single threadid (.tid) and align the memory context,
zonetask and loadtask to be the same, this sets the hard affinity of the
zone to the netmgr thread.
Previously, the zonemgr created 1 task per 100 zones and 1 memory
context per 1000 zones (with minimum 10 tasks and 2 memory contexts) to
reduce the contention between threads.
Instead of reducing the contention by having many resources, create a
per-nm_thread memory context, loadtask and zonetask and spread the zones
between just per-thread resources.
Note: this commit alone does decrease performance when loading the zone
by couple seconds (in case of 1M zone) and thus there's more work in
this whole MR fixing the performance.
Previously, the zone timer was not stopped before detaching the timer.
This could lead to a data race where the timer post_event() could fire
before the timer was detached, but then the event would be executed
after the zone was already destroyed.
This was not noticed before because the timing or the ordering of the
actions were different, but it was causing assertion failures in the
libns tests now.
Properly stop the zone timer before detaching the timer object from the
dns_zone.
When we are loading the zones, set the quantum to UINT_MAX, which makes
task_run process all tasks at once. After the zone loading is finished
the quantum will be dropped to 1 to not block server when we are loading
new zones after reconfiguration.
Fibonacci hashing was implemented in four separate places (rbt.c,
rbtdb.c, resolver.c, zone.c). This commit combines them into a single
implementation. The hash_32() function is now replaced with
isc_hash_bits32().
Historically, the inline keyword was a strong suggestion to the compiler
that it should inline the function marked inline. As compilers became
better at optimising, this functionality has receded, and using inline
as a suggestion to inline a function is obsolete. The compiler will
happily ignore it and inline something else entirely if it finds that's
a better optimisation.
Therefore, remove all the occurences of the inline keyword with static
functions inside single compilation unit and leave the decision whether
to inline a function or not entirely on the compiler
NOTE: We keep the usage the inline keyword when the purpose is to change
the linkage behaviour.
Previously, the unreachable code paths would have to be tagged with:
INSIST(0);
ISC_UNREACHABLE();
There was also older parts of the code that used comment annotation:
/* NOTREACHED */
Unify the handling of unreachable code paths to just use:
UNREACHABLE();
The UNREACHABLE() macro now asserts when reached and also uses
__builtin_unreachable(); when such builtin is available in the compiler.
Gcc 7+ and Clang 10+ have implemented __attribute__((fallthrough)) which
is explicit version of the /* FALLTHROUGH */ comment we are currently
using.
Add and apply FALLTHROUGH macro that uses the attribute if available,
but does nothing on older compilers.
In one case (lib/dns/zone.c), using the macro revealed that we were
using the /* FALLTHROUGH */ comment in wrong place, remove that comment.
Change the isc_interval_t implementation from separate data type and
separate implementation to be shim implementation on top of isc_time_t.
The distinction between isc_interval_t and isc_time_t has been kept
because they are semantically different - isc_interval_t is relative and
isc_time_t is absolute, but this allows isc_time_t and isc_interval_t to
be freely interchangeable, f.e. this:
isc_time_t *t1;
isc_interval_t *interval;
isc_time_t *t2;
isc_interval_set(interval, isc_time_seconds(t2), isc_time_nanoseconds(t2);;
isc_time_subtract(t1, interval, t2);
isc_interval_set(interval, isc_time_seconds(t2), isc_time_nanoseconds(t2));
to just:
isc_time_t *t1;
isc_interval_t *interval;
isc_time_t *t2;
isc_time_subtract(t1, t2, interval);
without introducing a whole set of new functions.
There were two places where expires argument (absolute isc_time_t value)
was being used. Both places has been converted to use relative interval
argument in preparation of simplification and refactoring of isc_timer
API.
The isc_timer_create() function was a bit conflated. It could have been
used to create a timer and start it at the same time. As there was a
single place where this was done before (see the previous commit for
nta.c), this was cleaned up and the isc_timer_create() function was
changed to only create new timer.
When a zone is being configured with a new view, the catalog zones
structure will also be linked to that view. Later on, in case of some
error, should the zone be reverted to the previous view, the link
between the catalog zones structure and the view won't be reverted.
Change the dns_zone_setviewrevert() function so it calls
dns_zone_catz_enable() during a zone revert, which will reset the
link between `catzs` and view.
Separate the locked parts of dns_zone_catz_enable() and
dns_zone_catz_disable() functions into static functions. This will
let us perform those tasks from the other parts of the module while
the zone is locked, avoiding one pair of additional unlocking and
locking operations.
When the named is shutting down, the zone event callbacks could
re-schedule the stub and refresh events leading to assertion failure.
Handle the ISC_R_SHUTTINGDOWN event state gracefully by bailing out.
In 2000, old BIND instances (BIND 8?) would return FORMERR if the SOA is
included in the NOTIFY.
Remove the workaround that detected the state and resent the NOTIFY
without SOA record.
This commit converts the license handling to adhere to the REUSE
specification. It specifically:
1. Adds used licnses to LICENSES/ directory
2. Add "isc" template for adding the copyright boilerplate
3. Changes all source files to include copyright and SPDX license
header, this includes all the C sources, documentation, zone files,
configuration files. There are notes in the doc/dev/copyrights file
on how to add correct headers to the new files.
4. Handle the rest that can't be modified via .reuse/dep5 file. The
binary (or otherwise unmodifiable) files could have license places
next to them in <foo>.license file, but this would lead to cluttered
repository and most of the files handled in the .reuse/dep5 file are
system test files.
In some cases we want to keep expired signatures. For example, if the
KSK is offline, we don't want to fall back to signing with the ZSK.
We could remove the signatures, but in any case we end up with a broken
zone.
The change made for GL #763 prevented the behavior to sign the DNSKEY
RRset with the ZSK if the KSK was offline (and signatures were expired).
The change causes the definition of "having both keys": if one key is
offline, we still consider having both keys, so we don't fallback
signing with the ZSK if KSK is offline.
That change also works the other way, if the ZSK is offline, we don't
fallback signing with the KSK.
This commit fixes that, so we only fallback signing zone RRsets with
the KSK, not signing key RRsets with the ZSK.
BIND can log this warning:
zone example.ch/IN (signed): Key example.ch/ECDSAP256SHA256/56340
missing or inactive and has no replacement: retaining signatures.
This log can happen when BIND tries to remove signatures because the
are about to expire or to be resigned. These RRsets may be signed with
the KSK if the ZSK files has been removed from disk. When we have
created a new ZSK we can replace the signatures creeated by the KSK
with signatures from the new ZSK.
It complains about the KSK being missing or inactive, but actually it
takes the key id from the RRSIG.
The warning is logged if BIND detects the private ZSK file is missing.
The warning is logged even if we were able to delete the signature.
With the change from this commit it only logs this warning if it is not
okay to delete the signature.
When the signed version of an inline-signed zone is dumped to disk, the
serial number of the unsigned version of the zone is stored in the
raw-format header so that the contents of the signed zone can be
resynchronized after named restart if the unsigned zone file is modified
while named is not running.
In order for the serial number of the unsigned zone to be determined
during the dump, zone->raw must be set to a non-NULL value. This should
always be the case as long as the signed version of the zone is used for
anything by named.
However, a scenario exists in which the signed version of the zone has
zone->raw set to NULL while it is being dumped:
1. Zone dump is requested; zone_dump() is invoked.
2. Another zone dump is already in progress, so the dump gets deferred
until I/O is available (see zonemgr_getio()).
3. The last external reference to the zone is released.
zone_shutdown() gets queued to the zone's task.
4. I/O becomes available for zone dumping. zone_gotwritehandle() gets
queued to the zone's task.
5. The zone's task runs zone_shutdown(). zone->raw gets set to NULL.
6. The zone's task runs zone_gotwritehandle(). zone->raw is determined
to be NULL, causing the serial number of the unsigned version of the
zone to be omitted from the raw-format dump of the signed zone file.
Note that the naïve solution - deferring the dns_zone_detach() call for
zone->raw until zone_free() gets called for the secure version of the
zone - does not work because it leads to a chicken-and-egg problem when
the inline-signed zone is about to get freed: the raw zone holds a weak
reference to the secure zone and that reference does not get released
until the reference count for the raw zone reaches zero, which in turn
would not happen until all weak references to the secure zone were
released.
Defer detaching from zone->raw in zone_shutdown() if the zone is in the
process of being dumped to disk. Ensure zone->raw gets detached from
after the dump is finished if detaching gets deferred. Prevent zone
dumping from being requeued upon failure if the zone is in the process
of being cleaned up as it opens up possibilities for the zone->raw
reference to leak, triggering a shutdown hang.
When the signed version of an inline-signed zone is dumped to disk, the
serial number of the unsigned version of the zone is written in the
raw-format header so that the contents of the signed zone can be
resynchronized after named restart if the unsigned zone file is
modified while named is not running (see RT #26676).
In order for the serial number of the unsigned zone to be determined
during the dump, zone->raw must be set to a non-NULL value. This
should always be the case as long as the signed version of the zone is
used for anything by named.
However, under certain circumstances the zone->raw could be set to NULL
while the zone is being dumped.
Defer detaching from zone->raw in zone_shutdown() if the zone is in the
process of being dumped to disk.
This commit enables client-side TLS contexts re-use for zone transfers
over TLS. That, in turn, makes it possible to use the internal session
cache associated with the contexts, allowing the TLS connections to be
established faster and requiring fewer resources by not going through
the full TLS handshake procedure.
Previously that would recreate the context on every connection, making
TLS session resumption impossible.
Also, this change lays down a foundation for Strict TLS (when the
client validates a server certificate), as the TLS context cache can
be extended to store additional data required for validation (like
intermediates CA chain).
dns_db_nodecount can now be used to get counts from the auxilary
rbt databases. The existing node count is returned by
tree=dns_dbtree_main. The nsec and nsec3 node counts by dns_dbtree_nsec
and dns_dbtree_nsec3 respectively.
The following scenario triggers a "named" crash:
1. Configure a catalog zone.
2. Start "named".
3. Comment out the "catalog-zone" clause.
4. Run `rndc reconfig`.
5. Uncomment the "catalog-zone" clause.
6. Run `rndc reconfig` again.
Implement the required cleanup of the in-memory catalog zone during
the first `rndc reconfig`, so that the second `rndc reconfig` could
find it in an expected state.
It was found, that the original commit adding the setmodtime() was
incompletely squashed and there was double check for
DNS_ZONEFLG_NEEDDUMP instead of check for DNS_ZONEFLG_NEEDDUMP and
DNS_ZONEFLG_DUMPING.
Change the duplicate check to DNS_ZONEFLG_DUMPING.
With isc_mem_get() and dns_name_dup() no longer being able to fail, some
functions can now only return ISC_R_SUCCESS. Change the return type to
void for the following function(s):
* dns_zone_setprimaries()
* dns_zone_setparentals()
* dns_zone_setparentals()
* dns_zone_setalsonotify()
Replace some "master/slave" terminology in the code with the preferred
"primary/secondary" keywords. This also changes user output such as
log messages, and fixes a typo ("seconary") in cfg_test.c.
There are still some references to "master" and "slave" for various
reasons:
- The old syntax can still be used as a synonym.
- The master syntax is kept when it refers to master files and formats.
- This commit replaces mainly keywords that are local. If "master" or
"slave" is used in for example a structure that is all over the
place, it is considered out of scope for the moment.
Remove the dynamic registration of result codes. Convert isc_result_t
from unsigned + #defines into 32-bit enum type in grand unified
<isc/result.h> header. Keep the existing values of the result codes
even at the expense of the description and identifier tables being
unnecessary large.
Additionally, add couple of:
switch (result) {
[...]
default:
break;
}
statements where compiler now complains about missing enum values in the
switch statement.
- Responses received by the dispatch are no longer sent to the caller
via a task event, but via a netmgr-style recv callback. the 'action'
parameter to dns_dispatch_addresponse() is now called 'response' and
is called directly from udp_recv() or tcp_recv() when a valid response
has been received.
- All references to isc_task and isc_taskmgr have been removed from
dispatch functions.
- All references to dns_dispatchevent_t have been removed and the type
has been deleted.
- Added a task to the resolver response context, to be used for fctx
events.
- When the caller cancels an operation, the response handler will be
called with ISC_R_CANCELED; it can abort immediately since the caller
will presumably have taken care of cleanup already.
- Cleaned up attach/detach in resquery and request.
Since every dispsock was associated with a dispentry anyway (though not
always vice versa), the members of dispsock have been combined into
dispentry, which is now reference-counted. dispentry objects are now
attached before connecting and detached afterward to prevent races
between the connect callback and dns_dispatch_removeresponse().
Dispatch and dispatchmgr objects are now reference counted as well, and
the shutdown process has been simplified. reference counting of
resquery and request objects has also been cleaned up significantly.
dns_dispatch_cancel() now flags a dispentry as having been canceled, so
that if the connect callback runs after cancellation, it will not
initiate a read.
The isblackholed() function has been simplified.
- UDP buffersize is now established when creating dispatch manager
and is always set to 4096.
- Set up the default port range in dispatchmgr before setting the magic
number.
- Magic is not set until dispatchmgr is fully created.
zone.c:integrity_checks() acquires a read lock while iterating the
zone database, and calls zone_check_mx() which acquires another
read lock. If another thread tries to acquire a write lock in the
meantime, it can deadlock. Calling dns_dbiterator_pause() to release
the first read lock prevents this.
Clear the key slots for dnssec-sign statistics for keys that are
removed. This way, the number of slots will stabilize to the maximum
key usage in a zone and will not grow every time a key rollover is
triggered.
After a reload, if the zone hasn't changed, this will log a
DNS_R_UNCHANGED error. This should not be at error level because it
happens on every reload.