The previous commit failed some tests because we expect that if a
fetch fails and we have stale candidates in cache, the
stale-refresh-time window is started. This means that if we hit a stale
entry in cache and answering stale data is allowed, we don't bother
resolving it again for as long we are within the stale-refresh-time
window.
This is useful for two reasons:
- If we failed to fetch the RRset that we are looking for, we are not
hammering the authoritative servers.
- Successor clients don't need to wait for stale-answer-client-timeout
to get their DNS response, only the first one to query will take
the latency penalty.
The latter is not useful when stale-answer-client-timeout is 0 though.
So this exception code only to make sure we don't try to refresh the
RRset again if it failed to do so recently.
Refreshing a stale RRset is similar to prefetching an RRset, so
reuse the existing code. When refreshing an RRset we need to clear
all db options related to serve-stale so that stale RRsets in cache
are ignored during the refresh.
We no longer need to set the "nodetach" flag, because the refresh
fetch is now a "fetch and forget". So we can detach from the client
in the query_send().
This code will break some serve-stale test cases, this will be fixed
in the successor commit.
TODO: add explanation why the serve-stale test cases fail.
Formerly, the isc_hash32() would have to change the key in a local copy
to make it case insensitive. Change the isc_siphash24() and
isc_halfsiphash24() functions to lowercase the input directly when
reading it from the memory and converting the uint8_t * array to
64-bit (respectively 32-bit numbers).
dohpath is specfied in draft-ietf-add-svcb-dns and has a value
of 7. It must be a relative path (start with a /), be encoded
as UTF8 and contain the variable dns ({?dns}).
On systems with signed rlim_t the old code calculated its maximum
value by shifting 1 into the sign bit, which is undefined behaviour.
Avoid the bug by using an unsigned shift.
When the TCP test is run on the busy server, the server might take a
while to wind the server down because it might still be processing all
that 300k invalid XFR requests.
Increate the rncd wait time to 120 seconds, the SIGTERM time to 300
seconds, and reduce the time to wait for ans servers from 1200 second
to just 120 seconds.
The dns__nta_shutdown() could be run from different threads and it was
accessing nta->timer unlocked. Don't check and stop the timer from
dns__nta_shutdown() directly, but leave it for the async callback.
Because the dns_zonemgr_create() was run before the loopmgr was started,
the isc_ratelimiter API was more complicated that it had to be. Move
the dns_zonemgr_create() to run_server() task which is run on the main
loop, and simplify the isc_ratelimiter API implementation.
The isc_timer is now created in the isc_ratelimiter_create() and
starting the timer is now separate async task as is destroying the timer
in case it's not launched from the loop it was created on. The
ratelimiter tick now doesn't have to create and destroy timer logic and
just stops the timer when there's no more work to do.
This should also solve all the races that were causing the
isc_ratelimiter to be left dangling because the timer was stopped before
the last reference would be detached.
Access to the source tree is not available with oss_fuzz. Have
fuzz/dns_message_checksig build and populate a key directory for
the fuzzer to use. This contains a key pair and a zone file which
has the public key from the key pair. Clean it up on shutdown.
Unlike standard free(), isc_mem_free() is not a no-op when passed a
NULL pointer. For size accounting purposes it calls sallocx(), which
crashes when passed a NULL pointer. To get more helpful diagnostics,
REQUIRE() that the pointer is not NULL so that when the programmer
makes a mistake they get a backtrace that shows what went wrong.
Extra care must be taken when executing the callbacks to prevent the
deadlocks on the caller's side. Add a paragraph that addresses when we
can and when we cannot call the callbacks directly.
The isc__nm_udp_send() callback would be called synchronously when
shutting down or when the socket has been closed. This could lead to
double locking in the calling code and thus those callbacks needs to be
called asynchronously.
Add several test cases in the 'upforwd' system test to make sure
that different scenarios of Dynamic DNS update forwarding are
tested, in particular when both the original and forwarded requests
are over Do53, or DoT, or they use different transports.
Now that the 'dns_request' supports using TLS transport, implement
dynamic update forwarding using DoT when the primary server is
configured to use a TLS transport.
Previously, when using such configuration, the dynamic update forwarding
feature was broken.
There's a known memory leak in the engine_pkcs11 at the time of writing
this and it interferes with the named ability to check for memory leaks
in the OpenSSL memory context by default.
Add an autoconf option to explicitly enable the memory leak detection,
and use it in the CI except for pkcs11 enabled builds. When this gets
fixed in the engine_pkc11, the option can be enabled by default.