RPZ rewrites called dns_db_findext() without passing through the
client database options; as as result, if the client set CD=1,
DNS_DBFIND_PENDINGOK was not used as it should have been, and
cache lookups failed, resulting in failure of the rewrite.
sizeof(dns_name_t) did not change but the boolean attributes are now
separated as one-bit structure members. This allows debuggers to
pretty-print dns_name_t attributes without any special hacks, plus we
got rid of manual bit manipulation code.
The previous commit failed some tests because we expect that if a
fetch fails and we have stale candidates in cache, the
stale-refresh-time window is started. This means that if we hit a stale
entry in cache and answering stale data is allowed, we don't bother
resolving it again for as long we are within the stale-refresh-time
window.
This is useful for two reasons:
- If we failed to fetch the RRset that we are looking for, we are not
hammering the authoritative servers.
- Successor clients don't need to wait for stale-answer-client-timeout
to get their DNS response, only the first one to query will take
the latency penalty.
The latter is not useful when stale-answer-client-timeout is 0 though.
So this exception code only to make sure we don't try to refresh the
RRset again if it failed to do so recently.
Refreshing a stale RRset is similar to prefetching an RRset, so
reuse the existing code. When refreshing an RRset we need to clear
all db options related to serve-stale so that stale RRsets in cache
are ignored during the refresh.
We no longer need to set the "nodetach" flag, because the refresh
fetch is now a "fetch and forget". So we can detach from the client
in the query_send().
This code will break some serve-stale test cases, this will be fixed
in the successor commit.
TODO: add explanation why the serve-stale test cases fail.
Don't attempt to resolve DNS responses for intermediate results. This
may create multiple refreshes and can cause a crash.
One scenario is where for the query there is a CNAME and canonical
answer in cache that are both stale. This will trigger a refresh of
the RRsets because we encountered stale data and we prioritized it over
the lookup. It will trigger a refresh of both RRsets. When we start
recursing, it will detect a recursion loop because the recursion
parameters will eventually be the same. In 'dns_resolver_destroyfetch'
the sanity check fails, one of the callers did not get its event back
before trying to destroy the fetch.
Move the call to 'query_refresh_rrset' to 'ns_query_done', so that it
is only called once per client request.
Another scenario is where for the query there is a stale CNAME in the
cache that points to a record that is also in cache but not stale. This
will trigger a refresh of the RRset (because we encountered stale data
and we prioritized it over the lookup).
We mark RRsets that we add to the message with
DNS_RDATASETATTR_STALE_ADDED to prevent adding a duplicate RRset when
a stale lookup and a normal lookup conflict with each other. However,
the other non-stale RRset when following a CNAME chain will be added to
the message without setting that attribute, because it is not stale.
This is a variant of the bug in #2594. The fix covered the same crash
but for stale-answer-client-timeout > 0.
Fix this by clearing all RRsets from the message before refreshing.
This requires the refresh to happen after the query is send back to
the client.
It is possible to bypass Response Rate Limiting (RRL)
`responses-per-second` limitation using specially crafted wildcard
names, because the current implementation, when encountering a found
DNS name generated from a wildcard record, just strips the leftmost
label of the name before making a key for the bucket.
While that technique helps with limiting random requests like
<random>.example.com (because all those requests will be accounted
as belonging to a bucket constructed from "example.com" name), it does
not help with random names like subdomain.<random>.example.com.
The best solution would have been to strip not just the leftmost
label, but as many labels as necessary until reaching the suffix part
of the wildcard record from which the found name is generated, however,
we do not have that information readily available in the context of RRL
processing code.
Fix the issue by interpreting all valid wildcard domain names as
the zone's origin name concatenated to the "*" name, so they all will
be put into the same bucket.
Clean up dns_rdatalist_tordataset() and dns_rdatalist_fromrdataset()
functions by making them return void, because they cannot fail.
Clean up other functions that subsequently cannot fail.
When checking if we should enable serve-stale, add an early out case
when the result is an error signalling a duplicate query or a query
that would be dropped.
Call dns_view_sfd_find to find the namespace to be used to verify
the covering NSEC records returned for the given QNAME. Check that
the NSEC owner names are within that namespace.
The "glue-cache" option was marked as deprecated by commit
5ae33351f286feb25a965bf3c9e6b122ab495342 (first released in BIND 9.17.6,
back in October 2020), so now obsolete that option, removing all code
and documentation related to it.
Note: this causes the glue cache feature to be permanently enabled, not
disabled.
The only values that the isc_quota_attach() function (called from
check_recursionquota() via recursionquotatype_attach_soft()) can
currently return are: ISC_R_SUCCESS, ISC_R_SOFTQUOTA, and ISC_R_QUOTA.
Instead of just propagating any other (unexpected) error up the call
stack, assert immediately, so that if the isc_quota_* API gets updated
in the future to return values currently matching the "default"
statement, check_recursionquota() can be promptly updated to handle such
new return values as desired.
Reduce code duplication in check_recursionquota() by extracting its
parts responsible for logging to a separate helper function.
Remove result text from the "no more recursive clients" message because
it always says "quota reached" (as the relevant branch is only evaluated
when 'result' is set to ISC_R_QUOTA) and therefore brings no additional
value.
When the client->recursionquota pointer was overloaded by different
features, each of those features had to be aware of that fact and handle
any updates of that pointer gracefully. Example: prefetch code
initiates recursion, attaching to client->recursionquota, then query
processing restarts due to a CNAME being encountered, then that CNAME is
not found in the cache, so another recursion is triggered, but
client->recursionquota is already attached to; even though it is not
CNAME chaining code that attached to that pointer, that code still has
to handle such a situation gracefully.
However, all features that can initiate recursion have now been updated
to use separate slots in the 'recursions' array, so keeping the old
checks in place means masking future programming errors that could
otherwise be caught - and should be caught because each feature needs to
properly maintain its own quota reference.
Remove outdated recursion quota pointer checks to enable the assertions
in isc_quota_*() functions to detect programming errors in code paths
that can start recursion. Remove an outdated comment to prevent
confusion.
Some functions fail to detach from the recursion quota if an error
occurs while initiating recursion. This causes the recursive client
counter to be off. Add missing recursionquota_detach() calls, reworking
cleanup code where appropriate.
Similarly to how different code paths reused common client handle
pointers and fetch references despite being logically unrelated, they
also reuse client->recursionquota, the field in which a reference to the
recursion quota is stored. This unnecessarily forces all code using
that field to be aware of the fact that it is overloaded by different
features.
Overloading client->recursionquota also causes inconsistent behavior.
For example, if prefetch code triggers recursion and then delegation
handling code also triggers recursion, only one of these code paths will
be able to attach to the recursion quota, but both recursions will be
started anyway. In other words, each code path only checks whether the
recursion quota has not been exceeded if the quota has not yet been
attached to by another code path. This behavior theoretically allows
the configured recursion quota to be slightly exceeded; while it is not
expected to be a real-world operational issue, it is still confusing and
should therefore be fixed.
Extend the structures comprising the 'recursions' array with a new field
holding a pointer to the recursion quota that a given recursion process
attached to. Update all code paths using client->recursionquota so that
they use the appropriate slot in the 'recursions' array. Drop the
'recursionquota' field from ns_client_t.
Previously, multiple code paths reused client->query.fetch, so it was
enough for ns_query_cancel() to issue a single call to
dns_resolver_cancelfetch() with that fetch as an argument. Now, since
each slot in the 'recursions' array can hold a reference to a separate
resolver fetch, ns_query_cancel() needs to handle all of them, so that
all recursion callbacks get a chance to clean up the associated
resources when a query is canceled.
Async hooks are the last feature using the client->fetchhandle and
client->query.fetch pointers. Update ns_query_hookasync() and
query_hookresume() so that they use a dedicated slot in the 'recursions'
array. Note that async hooks are still not expected to initiate
recursion if one was already started by a prior ns_query_recurse() call,
so the REQUIRE assertion in ns_query_hookasync() needs to check the
RECTYPE_NORMAL slot rather than the RECTYPE_HOOK one.
With prefetch and RPZ code updated to use separate slots in the
'recursions' array, the code responsible for starting recursion in
ns_query_recurse() and resuming query handling in fetch_callback()
should follow suit, so that it does not need to explicitly cooperate
with other code paths that may initiate recursion.
Replace:
- client->fetchhandle with HANDLE_RECTYPE_NORMAL(client)
- client->query.fetch with FETCH_RECTYPE_NORMAL(client)
Also update other functions using client->fetchhandle and
client->query.fetch (ns_query_cancel(), query_usestale()) so that those
two fields can shortly be dropped altogether.
Both prefetch code and RPZ code ignore recursion results (caching the
response notwithstanding). RPZ code has been (ab)using that fact since
commit 08e36aa5a5c7697a839f83831fccf8fb3f792848 by employing
prefetch_done() as the fetch completion callback. This is only
seemingly a simplification as it makes the code harder to follow ("why
is prefetch code used for handling RPZ-triggered recursion?").
Turn prefetch_done() into a new function whose name clearly conveys its
purpose. Add a parameter to its prototype in order to allow callers to
specify which slot in the 'recursions' array it should use. Reintroduce
prefetch_done() as a wrapper for that function. Add rpzfetch_done(), an
RPZ-exclusive wrapper for that function (using a distinct recursion
type).
Since each slot in the 'recursions' array needs to be initialized before
getting cleaned up when recursion completes, rework fetch_and_forget()
so that it takes recursion type rather than extra fetch options as the
last parameter and make it use the requested slot in the 'recursions'
array rather than a fixed slot (RECTYPE_PREFETCH) for all callers. This
makes fetch_and_forget() a logical complement of cleanup_after_fetch().
Collectively, these changes make prefetch and RPZ code logically
separate (except for reusing client->recursionquota, which will be
refactored later).
Replace:
- client->prefetchhandle with HANDLE_RECTYPE_PREFETCH(client)
- client->query.prefetch with FETCH_RECTYPE_PREFETCH(client)
This is preparatory work for separating prefetch code from RPZ code.
Initialize client->query using a compound literal in order to make the
ns_query_init() function shorter and more readable. This also prevents
the need to explicitly initialize any newly added fields in the future.
query_prefetch() and query_rpzfetch() contain a lot of duplicated code.
Extract the common bits into a separate function whose name clearly
suggests its purpose.
Add a set of new helper functions for attaching to the recursion quota
in order to reduce code duplication and to ensure that the recursive
clients counter is always adjusted properly. Since some callers
(query_prefetch(), query_rpzfetch()) treat exceeding the soft quota as
an error while others (check_recursionquota()) do not, also add two
wrapper functions whose names help convey their purpose, in order to
improve code readability.
Add a new helper function for detaching from the recursion quota in
order to reduce code duplication and to ensure that detaching from that
quota is always accompanied by decreasing the recursive clients counter.
Commit 9ffb4a7ba11fae64a6ce2dd6390cd334372b7ab7 causes Clang Static
Analyzer to flag a potential NULL dereference in query_nxdomain():
query.c:9394:26: warning: Dereference of null pointer [core.NullDereference]
if (!qctx->nxrewrite || qctx->rpz_st->m.rpz->addsoa) {
^~~~~~~~~~~~~~~~~~~
1 warning generated.
The warning above is for qctx->rpz_st potentially being a NULL pointer
when query_nxdomain() is called from query_resume(). This is a false
positive because none of the database lookup result codes currently
causing query_nxdomain() to be called (DNS_R_EMPTYWILD, DNS_R_NXDOMAIN)
can be returned by a database lookup following a recursive resolution
attempt. Add a NULL check nevertheless in order to future-proof the
code and silence Clang Static Analyzer.
ns_client_getnamebuf() cannot fail (i.e. return NULL) since commit
e31cc1eeb436095490c7caa120de148df82ecd6c. Remove redundant NULL checks
performed on the pointer returned by ns_client_getnamebuf().
ns_client_newname() cannot fail (i.e. return NULL) since commit
2ce0de699528c8d505adfde37a916b1742e5562f (though it was only made more
apparent by commit 33ba0057a7c44d4e5d63f7f55e1823279e996a19). Remove
redundant NULL checks performed on the pointer returned by
ns_client_newname().
ns_client_newrdataset() cannot fail (i.e. return NULL) since commit
efb385ecdcfd3213b3bb739a3dcb9e431690e559 (though it was only made more
apparent by commit 33ba0057a7c44d4e5d63f7f55e1823279e996a19). Remove
redundant NULL checks performed on the pointer returned by
ns_client_newrdataset().
The dns_message_gettempname(), dns_message_gettemprdata(),
dns_message_gettemprdataset(), and dns_message_gettemprdatalist() always
succeeds because the memory allocation cannot fail now. Change the API
to return void and cleanup all the use of aforementioned functions.
The ns_statscounter_recursclients counter was previously only
incremented or decremented if client->recursionquota was non-NULL.
This was harmless, because that value should always be non-NULL if
recursion is enabled, but it made the code slightly confusing.
Add DNS extended errors 3 (Stale Answer) and 19 (Stale NXDOMAIN Answer)
to responses. Add extra text with the reason why the stale answer was
returned.
To test, we need to change the configuration such that for the first
set of tests the stale-refresh-time window does not interfer with the
expected extended errors.
In order to modify the .localhost and .localnets members of the
dns_aclenv, all other processing on the netmgr loops needed to be
stopped using the task exclusive mode. Add the isc_rwlock to the
dns_aclenv, so any modifications to the .localhost and .localnets can be
done under the write lock.
In couple places, we have missed INSIST(0) or ISC_UNREACHABLE()
replacement on some branches with UNREACHABLE(). Replace all
ISC_UNREACHABLE() or INSIST(0) calls with UNREACHABLE().
The way the ns_client_t .shuttingdown member was practically dead code.
The .shuttingdown would be set to true only in ns__client_put() function
meaning that we have detached from all ns_client_t .*handles and the
ns_client_t object being freed:
client->magic = 0;
client->shuttingdown = true;
[...]
isc_mem_put(manager->ctx, client, sizeof(*client))
Meanwhile the ns_client_t object is accessed like this:
isc_nmhandle_detach(&client->fetchhandle);
client->query.attributes &= ~NS_QUERYATTR_RECURSING;
client->state = NS_CLIENTSTATE_WORKING;
qctx_init(client, &devent, 0, &qctx);
client_shuttingdown = ns_client_shuttingdown(client);
if (fetch_canceled || fetch_answered || client_shuttingdown) {
[...]
}
Even if the isc_nmhandle_detach(...) was the last handle detach, it
would mean that immediatelly, after calling the isc_nmhandle_detach(),
we would be causing use-after-free, because the ns_client_t is
immediatelly destroyed after setting .shuttingdown to true.
The similar code in the query_hookresume() already noticed this:
/*
* This event is running under a client task, so it's safe to detach
* the fetch handle. And it should be done before resuming query
* processing below, since that may trigger another recursion or
* asynchronous hook event.
*/
The ns_client_t is always attached to ns_clientmgr_t which has
associated memory context, server context, task and threadid. Use those
directly from the ns_clientmgr_t instead of attaching it to an extra
copy in ns_client_t to make the ns_client_t more sleek and lean.
Additionally, remove some stray ns_client_t struct members that were not
used anywhere.
Historically, the inline keyword was a strong suggestion to the compiler
that it should inline the function marked inline. As compilers became
better at optimising, this functionality has receded, and using inline
as a suggestion to inline a function is obsolete. The compiler will
happily ignore it and inline something else entirely if it finds that's
a better optimisation.
Therefore, remove all the occurences of the inline keyword with static
functions inside single compilation unit and leave the decision whether
to inline a function or not entirely on the compiler
NOTE: We keep the usage the inline keyword when the purpose is to change
the linkage behaviour.
Previously, the unreachable code paths would have to be tagged with:
INSIST(0);
ISC_UNREACHABLE();
There was also older parts of the code that used comment annotation:
/* NOTREACHED */
Unify the handling of unreachable code paths to just use:
UNREACHABLE();
The UNREACHABLE() macro now asserts when reached and also uses
__builtin_unreachable(); when such builtin is available in the compiler.
Gcc 7+ and Clang 10+ have implemented __attribute__((fallthrough)) which
is explicit version of the /* FALLTHROUGH */ comment we are currently
using.
Add and apply FALLTHROUGH macro that uses the attribute if available,
but does nothing on older compilers.
In one case (lib/dns/zone.c), using the macro revealed that we were
using the /* FALLTHROUGH */ comment in wrong place, remove that comment.
Commit aab691d51266f552a7923db32686fb9398b1d255 did not fix all possible
scenarios in which the ns_statscounter_recursclients counter underflows.
The solution implemented therein can be ineffective e.g. when CNAME
chaining happens with prefetching enabled.
Here is an example recursive resolution scenario in which the
ns_statscounter_recursclients counter can underflow with the current
logic in effect:
1. Query processing starts, the answer is not found in the cache, so
recursion is started. The NS_CLIENTATTR_RECURSING attribute is set.
ns_statscounter_recursclients is incremented (Δ = +1).
2. Recursion completes, returning a CNAME. client->recursionquota is
non-NULL, so the NS_CLIENTATTR_RECURSING attribute remains set.
ns_statscounter_recursclients is decremented (Δ = 0).
3. Query processing restarts.
4. The current QNAME (the target of the CNAME from step 2) is found in
the cache, with a TTL low enough to trigger a prefetch.
5. query_prefetch() attaches to client->recursionquota.
ns_statscounter_recursclients is not incremented because
query_prefetch() does not do that (Δ = 0).
6. Query processing restarts.
7. The current QNAME (the target of the CNAME from step 4) is not found
in the cache, so recursion is started. client->recursionquota is
already attached to (since step 5) and the NS_CLIENTATTR_RECURSING
attribute is set (since step 1), so ns_statscounter_recursclients is
not incremented (Δ = 0).
8. The prefetch from step 5 completes. client->recursionquota is
detached from in prefetch_done(). ns_statscounter_recursclients is
not decremented because prefetch_done() does not do that (Δ = 0).
9. Recursion for the current QNAME completes. client->recursionquota
is already detached from, i.e. set to NULL (since step 8), and the
NS_CLIENTATTR_RECURSING attribute is set (since step 1), so
ns_statscounter_recursclients is decremented (Δ = -1).
Another possible scenario is that after step 7, recursion for the target
of the CNAME from step 4 completes before the prefetch for the CNAME
itself. fetch_callback() then notices that client->recursionquota is
non-NULL and decrements ns_statscounter_recursclients, even though
client->recursionquota was attached to by query_prefetch() and therefore
not accompanied by an incrementation of ns_statscounter_recursclients.
The net result is also an underflow.
Instead of trying to properly handle all possible orderings of events
set into motion by normal recursion and prefetch-triggered recursion,
adjust ns_statscounter_recursclients whenever the recursive clients
quota is successfully attached to or detached from. Remove the
NS_CLIENTATTR_RECURSING attribute altogether as its only purpose is made
obsolete by this change.