If named is configured to perform DNSSEC validation and also forwards
all queries ("forward only;") to validating resolvers, negative trust
anchors do not work properly because the CD bit is not set in queries
sent to the forwarders. As a result, instead of retrieving bogus DNSSEC
material and making validation decisions based on its configuration,
named is only receiving SERVFAIL responses to queries for bogus data.
Fix by ensuring the CD bit is always set in queries sent to forwarders
if the query name is covered by an NTA.
When sending an udp query (resquery_send) we first issue an asynchronous
isc_socket_connect and increment query->connects, then isc_socket_sendto2
and increment query->sends.
If we happen to cancel this query (fctx_cancelquery) we need to cancel
all operations we might have issued on this socket. If we are under very high
load the callback from isc_socket_connect (resquery_udpconnected) might have
not yet been fired. In this case we only cancel the CONNECT event on socket,
and ignore the SEND that's waiting there (as there is an `else if`).
Then we call dns_dispatch_removeresponse which kills the dispatcher socket
and calls isc_socket_close - but if system is under very high load, the send
we issued earlier might still not be complete - which triggers an assertion
because we're trying to close a socket that's still in use.
The fix is to always check if we have incomplete sends on the socket and cancel
them if we do.
If we try to fetch a record from cache and need to look into
hints database we assume that the resolver is not primed and
start dns_resolver_prime(). Priming query is supposed to return
NSes for "." in ANSWER section and glue records for them in
ADDITIONAL section, so that we can fill that info in 'regular'
cache and not use hints db anymore.
However, if we're using a forwarder the priming query goes through
it, and if it's configured to return minimal answers we won't get
the addresses of root servers in ADDITIONAL section. Since the
only records for root servers we have are in hints database we'll
try to prime the resolver with every single query.
This patch adds a DNS_FETCHOPT_NOFORWARD flag which avoids using
forwarders if possible (that is if we have forward-first policy).
Using this flag on priming fetch fixes the problem as we get the
proper glue. With forward-only policy the problem is non-existent,
as we'll never ask for root server addresses because we'll never
have a need to query them.
Also added a test to confirm priming queries are not forwarded.
go back to regular resolution. When this happens the fetch timer is
already running, and we might end up in a situation where we we create
a fetch for qname-minimized query and after that the timer is triggered
and the query is retried (fctx_try) - which causes relaunching of
qname-minimization fetch - and since we already have a qmin fetch
for this fctx - assertion failure.
This fix stops the timer when doing qname minimization - qmin fetch
internal timer should take care of all the possible timeouts.
Since following a delegation resets most fetch context state, address
marks (FCTX_ADDRINFO_MARK) set inside lib/dns/resolver.c are not
preserved when a delegation is followed. This is fine for full
recursive resolution but when named is configured with "forward first;"
and one of the specified forwarders times out, triggering a fallback to
full recursive resolution, that forwarder should no longer be consulted
at each delegation point subsequently reached within a given fetch
context.
Add a new badnstype_t enum value, badns_forwarder, and use it to mark a
forwarder as bad when it times out in a "forward first;" configuration.
Since the bad server list is not cleaned when a fetch context follows a
delegation, this prevents a forwarder from being queried again after
falling back to full recursive resolution. Yet, as each fetch context
maintains its own list of bad servers, this change does not cause a
forwarder timeout to prevent that forwarder from being used by other
fetch contexts.
If we know that we'll have a task pool doing specific thing it's better
to use this knowledge and bind tasks to task queues, this behaves better
than randomly choosing the task queue.
- use bound resolver tasks - we have a pool of tasks doing resolutions,
we can spread the load evenly using isc_task_create_bound
- quantum set universally to 25
Sometimes it is useful to set a 'floor' on the TTL for records
to be cached. Some sites like to use ridiculously low TTLs for
some reason, and that often is not compatible with slow links.
Signed-off-by: Michael Milligan <milli@acmeps.com>
Signed-off-by: LaMont Jones <lamont@debian.org>
At the beginning of qname minimization we get fctx->finds filled with what's
in the cache at this point, in worst case root servers. After doing full
run querying for NSes at different levels we need to clean it and refill
it with proper values from cache.
As part of resquery_response() refactoring [1], a goto statement was
replaced [2] with a call to a new function - originally called
rctx_delegation(), now folded into rctx_answer_none() - extracted from
existing code. However, one call site of that refactored function does
not reset the "result" variable, causing a referral with a non-empty
ANSWER section to be inadvertently treated as an error, which prevents
resolution of names reliant on servers sending such responses. Fix by
resetting the "result" variable to ISC_R_SUCCESS when a response
containing a non-empty ANSWER section can be treated as a delegation.
[1] see RT #45362
[2] see commit e1380a16741a3b4a57e54d7a9ce09dd12691522f
- make qname-minimization option tristate {strict,relaxed,disabled}
- go straight for the record if we hit NXDOMAIN in relaxed mode
- go straight for the record after 3 labels without new delegation or 7 labels total
- use start of fetch (and not time of response) as 'now' time for querying cache for
zonecut when following delegation.