2
0
mirror of https://gitlab.isc.org/isc-projects/bind9 synced 2025-08-28 21:17:54 +00:00

981 Commits

Author SHA1 Message Date
Ondřej Surý
f50b1e0685 Use clang-format to reformat the source files 2020-02-12 15:04:17 +01:00
Ondřej Surý
bc1d4c9cb4 Clear the pointer to destroyed object early using the semantic patch
Also disable the semantic patch as the code needs tweaks here and there because
some destroy functions might not destroy the object and return early if the
object is still in use.
2020-02-09 18:00:17 -08:00
Witold Kręcicki
d708370db4 Fix atomics usage for mutexatomics 2020-02-08 12:34:19 -08:00
Ondřej Surý
a9bd6f6ea6 Fix comparison between type uint16_t and wider type size_t in a loop
Found by LGTM.com (see below for description), and while it should not
happen as EDNS OPT RDLEN is uint16_t, the fix is easy.  A little bit
of cleanup is included too.

> In a loop condition, comparison of a value of a narrow type with a value
> of a wide type may result in unexpected behavior if the wider value is
> sufficiently large (or small). This is because the narrower value may
> overflow. This can lead to an infinite loop.
2020-02-05 01:41:13 +00:00
Ondřej Surý
7dfc092f06 Use C11 atomics for nfctx, kill unused dns_resolver_nrunning() 2020-01-14 13:12:13 +01:00
Mark Andrews
62abb6aa82 make resolver->zspill atomic to prevent potential deadlock 2019-12-12 08:26:59 +00:00
Mark Andrews
13aaeaa06f Note bucket lock requirements and move REQUIRE inside locked section. 2019-12-10 22:16:15 +00:00
Mark Andrews
5589748eca lock access to fctx->nqueries 2019-12-10 22:16:15 +00:00
Mark Andrews
912ce87479 Make fctx->attributes atomic.
FCTX_ATTR_SHUTTINGDOWN needs to be set and tested while holding the node
lock but the rest of the attributes don't as they are task locked. Making
fctx->attributes atomic allows both behaviours without races.
2019-12-03 08:58:53 +11:00
Mark Andrews
9ca6ad6311 Assign fctx->client when fctx is created rather when the join happens.
This prevents races on fctx->client whenever a new fetch joins a existing
fetch (by calling fctx_join) as it is now invariant for the active life of
fctx.
2019-12-02 06:01:46 +00:00
Ondřej Surý
edd97cddc1 Refactor dns_name_dup() usage using the semantic patch 2019-11-29 14:00:37 +01:00
Ondřej Surý
a5189eefa5 lib/dns/resolver.c: Call dns_adb_endudpfetch() only for UDP queries
The dns_adb_beginudpfetch() is called only for UDP queries, but
the dns_adb_endudpfetch() is called for all queries, including
TCP.  This messages the quota counting in adb.c.
2019-11-19 02:53:56 +08:00
Michał Kępień
fce3c93ea2 Prevent TCP failures from affecting EDNS stats
EDNS mechanisms only apply to DNS over UDP.  Thus, errors encountered
while sending DNS queries over TCP must not influence EDNS timeout
statistics.
2019-10-31 09:54:05 +01:00
Michał Kępień
6cd115994e Prevent query loops for misbehaving servers
If a TCP connection fails while attempting to send a query to a server,
the fetch context will be restarted without marking the target server as
a bad one.  If this happens for a server which:

  - was already marked with the DNS_FETCHOPT_EDNS512 flag,
  - responds to EDNS queries with the UDP payload size set to 512 bytes,
  - does not send response packets larger than 512 bytes,

and the response for the query being sent is larger than 512 byes, then
named will pointlessly alternate between sending UDP queries with EDNS
UDP payload size set to 512 bytes (which are responded to with truncated
answers) and TCP connections until the fetch context retry limit is
reached.  Prevent such query loops by marking the server as bad for a
given fetch context if the advertised EDNS UDP payload size for that
server gets reduced to 512 bytes and it is impossible to reach it using
TCP.
2019-10-31 08:48:35 +01:00
Mark Andrews
622bef6aec reset fctx->qmindcname and fctx->qminname after processing a delegation 2019-10-01 22:09:04 -07:00
Evan Hunt
488cb4da10 SERVFAIL if a prior qmin fetch has not been canceled when a new one starts 2019-10-01 20:41:53 -07:00
Ondřej Surý
c2dad0dcb2 Replace RUNTIME_CHECK(dns_name_copy(..., NULL)) with dns_name_copynf()
Use the semantic patch from the previous commit to replace all the calls to
dns_name_copy() with NULL as third argument with dns_name_copynf().
2019-10-01 10:43:26 +10:00
Ondřej Surý
5efa29e03a The final round of adding RUNTIME_CHECK() around dns_name_copy() calls
This commit was done by hand to add the RUNTIME_CHECK() around stray
dns_name_copy() calls with NULL as third argument.  This covers the edge cases
that doesn't make sense to write a semantic patch since the usage pattern was
unique or almost unique.
2019-10-01 10:43:26 +10:00
Ondřej Surý
89b269b0d2 Add RUNTIME_CHECK() around result = dns_name_copy(..., NULL) calls
This second commit uses second semantic patch to replace the calls to
dns_name_copy() with NULL as third argument where the result was stored in a
isc_result_t variable.  As the dns_name_copy(..., NULL) cannot fail gracefully
when the third argument is NULL, it was just a bunch of dead code.

Couple of manual tweaks (removing dead labels and unused variables) were
manually applied on top of the semantic patch.
2019-10-01 10:43:26 +10:00
Ondřej Surý
35bd7e4da0 Add RUNTIME_CHECK() around plain dns_name_copy(..., NULL) calls using spatch
This commit add RUNTIME_CHECK() around all simple dns_name_copy() calls where
the third argument is NULL using the semantic patch from the previous commit.
2019-10-01 10:43:26 +10:00
Mark Andrews
c3bcb4d47a Remove potential use after free (fctx) in rctx_resend. 2019-09-13 12:44:12 +10:00
Ondřej Surý
4957255d13 Use the semantic patch to change the usage isc_mem_create() to new API 2019-09-12 09:26:09 +02:00
Ondřej Surý
50e109d659 isc_event_allocate() cannot fail, remove the fail handling blocks
isc_event_allocate() calls isc_mem_get() to allocate the event structure.  As
isc_mem_get() cannot fail softly (e.g. it never returns NULL), the
isc_event_allocate() cannot return NULL, hence we remove the (ret == NULL)
handling blocks using the semantic patch from the previous commit.
2019-08-30 08:55:34 +02:00
Evan Hunt
1d86b202ad remove DLV-related library code 2019-08-09 09:15:10 -07:00
Mark Andrews
57a328d67e Store the DS and RRSIG(DS) with trust dns_trust_pending_answer
so that the validator can validate the records as part of validating
the current request.
2019-08-02 15:09:42 +10:00
Ondřej Surý
f0c6aef542 Cleanup stray goto labels from removing isc_mem_allocate/strdup checking blocks 2019-07-23 15:32:36 -04:00
Ondřej Surý
9bdc24a9fd Use coccinelle to cleanup the failure handling blocks from isc_mem_strdup 2019-07-23 15:32:36 -04:00
Ondřej Surý
ae83801e2b Remove blocks checking whether isc_mem_get() failed using the coccinelle 2019-07-23 15:32:35 -04:00
Michał Kępień
ca528766d6 Restore locking in resume_dslookup()
Commit 9da902a201b6d0e1bdbac0af067a59bb0a489c9c removed locking around
the fctx_decreference() call inside resume_dslookup().  This allows
fctx_unlink() to be called without the bucket lock being held, which
must never happen.  Ensure the bucket lock is held by resume_dslookup()
before it calls fctx_decreference().
2019-07-23 11:43:46 +02:00
Ondřej Surý
a4141fcf98 Restore more locking in the lib/dns/resolver.c code
1. Restore locking in the fctx_decreference() code, because the insides of the
   function needs to be protected when fctx->references drops to 0.

2. Restore locking in the dns_resolver_attach() code, because two variables are
   accessed at the same time and there's slight chance of data race.
2019-07-22 09:03:27 -04:00
Ondřej Surý
317e36d47e Restore locking in dns_resolver_shutdown and dns_resolver_attach
Although the struct dns_resolver.exiting member is protected by stdatomics, we
actually need to wait for whole dns_resolver_shutdown() to finish before
destroying the resolver object.  Otherwise, there would be a data race and some
fctx objects might not be destroyed yet at the time we tear down the
dns_resolver object.
2019-07-22 08:17:36 -04:00
Ondřej Surý
a912f31398 Add new default siphash24 cookie algorithm, but keep AES as legacy
This commit changes the BIND cookie algorithms to match
draft-sury-toorop-dnsop-server-cookies-00.  Namely, it changes the Client Cookie
algorithm to use SipHash 2-4, adds the new Server Cookie algorithm using SipHash
2-4, and changes the default for the Server Cookie algorithm to be siphash24.

Add siphash24 cookie algorithm, and make it keep legacy aes as
2019-07-21 15:16:28 -04:00
Witold Kręcicki
afa81ee4e4 Remove all cookie algorithms but AES, which was used as a default, for legacy purposes. 2019-07-21 10:08:14 -04:00
Witold Kręcicki
e56cc07f50 Fix a few broken atomics initializations 2019-07-09 16:11:14 +02:00
Ondřej Surý
9da902a201 lib/dns/resolver.c: use isc_refcount_t and atomics 2019-07-09 16:09:36 +02:00
Evan Hunt
6d6e94bee7 fixup! Use experimental "_ A" minimization in relaxed mode. 2019-05-30 14:06:56 -07:00
Witold Kręcicki
ae52c2117e Use experimental "_ A" minimization in relaxed mode.
qname minimization, even in relaxed mode, can fail on
some very broken domains. In relaxed mode, instead of
asking for "foo.bar NS" ask for "_.foo.bar A" to either
get a delegation or NXDOMAIN. It will require more queries
than regular mode for proper NXDOMAINs.
2019-05-30 14:06:55 -07:00
Witold Kręcicki
2691e729f0 Don't SERVFAIL on lame delegations when doing minimization in relaxed mode.
qname minimization in relaxed mode should fall back to regular
resolution in case of failure.
2019-05-30 12:38:18 -07:00
Michał Kępień
5e80488270 Make NTAs work with validating forwarders
If named is configured to perform DNSSEC validation and also forwards
all queries ("forward only;") to validating resolvers, negative trust
anchors do not work properly because the CD bit is not set in queries
sent to the forwarders.  As a result, instead of retrieving bogus DNSSEC
material and making validation decisions based on its configuration,
named is only receiving SERVFAIL responses to queries for bogus data.
Fix by ensuring the CD bit is always set in queries sent to forwarders
if the query name is covered by an NTA.
2019-05-09 19:55:35 -07:00
Witold Kręcicki
7043c6eaf5 When resuming from qname-minimization increase fetches-per-zone counters for the 'new' zone 2019-04-23 10:16:09 -04:00
Witold Kręcicki
7c960e89ea In resume_qmin check if the fetch context is already shutting down - if so, try to destroy it, don't continue 2019-03-29 14:30:40 +01:00
Mark Andrews
d8d04edfba assert hevent->rdataset is non NULL 2019-03-14 12:47:53 -07:00
Witold Kręcicki
56183a3917 Fix a race in fctx_cancelquery.
When sending an udp query (resquery_send) we first issue an asynchronous
isc_socket_connect and increment query->connects, then isc_socket_sendto2
and increment query->sends.
If we happen to cancel this query (fctx_cancelquery) we need to cancel
all operations we might have issued on this socket. If we are under very high
load the callback from isc_socket_connect (resquery_udpconnected) might have
not yet been fired. In this case we only cancel the CONNECT event on socket,
and ignore the SEND that's waiting there (as there is an `else if`).
Then we call dns_dispatch_removeresponse which kills the dispatcher socket
and calls isc_socket_close - but if system is under very high load, the send
we issued earlier might still not be complete - which triggers an assertion
because we're trying to close a socket that's still in use.

The fix is to always check if we have incomplete sends on the socket and cancel
them if we do.
2019-03-12 18:42:35 +01:00
Ondřej Surý
78d0cb0a7d Use coccinelle to remove explicit '#include <config.h>' from the source files 2019-03-08 15:15:05 +01:00
Witold Kręcicki
b49310ac06 If possible don't use forwarders when priming the resolver.
If we try to fetch a record from cache and need to look into
hints database we assume that the resolver is not primed and
start dns_resolver_prime(). Priming query is supposed to return
NSes for "." in ANSWER section and glue records for them in
ADDITIONAL section, so that we can fill that info in 'regular'
cache and not use hints db anymore.
However, if we're using a forwarder the priming query goes through
it, and if it's configured to return minimal answers we won't get
the addresses of root servers in ADDITIONAL section. Since the
only records for root servers we have are in hints database we'll
try to prime the resolver with every single query.

This patch adds a DNS_FETCHOPT_NOFORWARD flag which avoids using
forwarders if possible (that is if we have forward-first policy).
Using this flag on priming fetch fixes the problem as we get the
proper glue. With forward-only policy the problem is non-existent,
as we'll never ask for root server addresses because we'll never
have a need to query them.

Also added a test to confirm priming queries are not forwarded.
2019-01-16 17:41:13 -05:00
Witold Kręcicki
cfa2804e5a When a forwarder fails and we're not in a forward-only mode we
go back to regular resolution. When this happens the fetch timer is
already running, and we might end up in a situation where we we create
a fetch for qname-minimized query and after that the timer is triggered
and the query is retried (fctx_try) - which causes relaunching of
qname-minimization fetch - and since we already have a qmin fetch
for this fctx - assertion failure.

This fix stops the timer when doing qname minimization - qmin fetch
internal timer should take care of all the possible timeouts.
2019-01-16 11:09:30 -08:00
Mark Andrews
dadb924be7 adjust timeout to allow for ECN negotiation failures 2019-01-15 17:10:41 -08:00
Michał Kępień
33350626f9 Track forwarder timeouts in fetch contexts
Since following a delegation resets most fetch context state, address
marks (FCTX_ADDRINFO_MARK) set inside lib/dns/resolver.c are not
preserved when a delegation is followed.  This is fine for full
recursive resolution but when named is configured with "forward first;"
and one of the specified forwarders times out, triggering a fallback to
full recursive resolution, that forwarder should no longer be consulted
at each delegation point subsequently reached within a given fetch
context.

Add a new badnstype_t enum value, badns_forwarder, and use it to mark a
forwarder as bad when it times out in a "forward first;" configuration.
Since the bad server list is not cleaned when a fetch context follows a
delegation, this prevents a forwarder from being queried again after
falling back to full recursive resolution.  Yet, as each fetch context
maintains its own list of bad servers, this change does not cause a
forwarder timeout to prevent that forwarder from being used by other
fetch contexts.
2019-01-08 08:29:54 +01:00
Witold Kręcicki
d5793ecca2 - isc_task_create_bound - create a task bound to specific task queue
If we know that we'll have a task pool doing specific thing it's better
  to use this knowledge and bind tasks to task queues, this behaves better
  than randomly choosing the task queue.

- use bound resolver tasks - we have a pool of tasks doing resolutions,
  we can spread the load evenly using isc_task_create_bound

- quantum set universally to 25
2018-11-23 04:34:02 -05:00
Witold Kręcicki
929ea7c2c4 - Make isc_mutex_destroy return void
- Make isc_mutexblock_init/destroy return void
- Minor cleanups
2018-11-22 11:52:08 +00:00