2
0
mirror of https://gitlab.isc.org/isc-projects/bind9 synced 2025-08-29 05:28:00 +00:00

994 Commits

Author SHA1 Message Date
Michał Kępień
d7583e7926 Restore semantic meaning of DNS_FETCHOPT_EDNS512
When the DNS_FETCHOPT_EDNS512 flag was first introduced [1], it enforced
advertising a 512-byte UDP buffer size in an outgoing query.  Ever since
EDNS processing code got updated [2], that flag has still been set upon
detection of certain query timeout patterns, but it has no longer been
affecting the calculations of the advertised UDP buffer size in outgoing
queries.  Restore original semantic meaning of DNS_FETCHOPT_EDNS512 by
ensuring the advertised UDP buffer size is set to 512 bytes when that
flag is set.  Update existing comments and add new ones to improve code
readability.

[1] see commit 08c90261660649ca7d92065f6f13a61ec5a9a86d
[2] see commit 8e15d5eb3a000f1341e6bea0ddbc28d6dd2a0591
2020-05-25 14:34:56 +02:00
Michał Kępień
949d9a3ea4 Remove fctx->reason and a misleading log message
The following message:

    success resolving '<name>' (in '<domain>'?) after reducing the advertised EDNS UDP packet size to 512 octets

can currently be logged even if the EDNS UDP buffer size advertised in
queries sent to a given server had already been set to 512 octets before
the fetch context was created (e.g. due to the server responding
intermittently).  In other words, this log message may be misleading as
lowering the advertised EDNS UDP buffer size may not be the actual cause
of <name> being successfully resolved.  Remove the log message in
question to prevent confusion.

As this log message is the only existing user of the "reason" field in
struct fetchctx, remove that field as well, along with all the code
related to it.
2020-05-25 14:34:56 +02:00
Mark Andrews
266faa3399 Count queries to the root and TLD servers as well 2020-05-19 12:30:29 +02:00
Mark Andrews
3ee5ea2fdb Reduce the number of fetches we make when looking up addresses
If there are more that 5 NS record for a zone only perform a
maximum of 4 address lookups for all the name servers.  This
limits the amount of remote lookup performed for server
addresses at each level for a given query.
2020-05-19 12:30:29 +02:00
Ondřej Surý
978c7b2e89 Complete rewrite the BIND 9 build system
The rewrite of BIND 9 build system is a large work and cannot be reasonable
split into separate merge requests.  Addition of the automake has a positive
effect on the readability and maintainability of the build system as it is more
declarative, it allows conditional and we are able to drop all of the custom
make code that BIND 9 developed over the years to overcome the deficiencies of
autoconf + custom Makefile.in files.

This squashed commit contains following changes:

- conversion (or rather fresh rewrite) of all Makefile.in files to Makefile.am
  by using automake

- the libtool is now properly integrated with automake (the way we used it
  was rather hackish as the only official way how to use libtool is via
  automake

- the dynamic module loading was rewritten from a custom patchwork to libtool's
  libltdl (which includes the patchwork to support module loading on different
  systems internally)

- conversion of the unit test executor from kyua to automake parallel driver

- conversion of the system test executor from custom make/shell to automake
  parallel driver

- The GSSAPI has been refactored, the custom SPNEGO on the basis that
  all major KRB5/GSSAPI (mit-krb5, heimdal and Windows) implementations
  support SPNEGO mechanism.

- The various defunct tests from bin/tests have been removed:
  bin/tests/optional and bin/tests/pkcs11

- The text files generated from the MD files have been removed, the
  MarkDown has been designed to be readable by both humans and computers

- The xsl header is now generated by a simple sed command instead of
  perl helper

- The <irs/platform.h> header has been removed

- cleanups of configure.ac script to make it more simpler, addition of multiple
  macros (there's still work to be done though)

- the tarball can now be prepared with `make dist`

- the system tests are partially able to run in oot build

Here's a list of unfinished work that needs to be completed in subsequent merge
requests:

- `make distcheck` doesn't yet work (because of system tests oot run is not yet
  finished)

- documentation is not yet built, there's a different merge request with docbook
  to sphinx-build rst conversion that needs to be rebased and adapted on top of
  the automake

- msvc build is non functional yet and we need to decide whether we will just
  cross-compile bind9 using mingw-w64 or fix the msvc build

- contributed dlz modules are not included neither in the autoconf nor automake
2020-04-21 14:19:48 +02:00
Diego Fronza
cf7b0de1eb Fixed rebinding protection bug when using forwarder setups
BIND wasn't honoring option "deny-answer-aliases" when configured to
forward queries.

Before the fix it was possible for nameservers listed in "forwarders"
option to return CNAME answers pointing to unrelated domains of the
original query, which could be used as a vector for rebinding attacks.

The fix ensures that BIND apply filters even if configured as a forwarder
instance.

(cherry picked from commit af6a4de3d5ad6c1967173facf366e6c86b3ffc28)
2020-04-08 09:37:33 +02:00
Evan Hunt
ba0313e649 fix spelling errors reported by Fossies. 2020-02-21 15:05:08 +11:00
Diego Fronza
45543da802 Fixed disposing of resolver->references in destroy() function 2020-02-14 14:28:31 -03:00
Diego Fronza
e7b36924e2 Fixed potential-lock-inversion
This commit simplifies a bit the lock management within dns_resolver_prime()
and prime_done() functions by means of turning resolver's attribute
"priming" into an atomic_bool and by creating only one dependent object on the
lock "primelock", namely the "primefetch" attribute.

By having the attribute "priming" as an atomic type, it save us from having to
use a lock just to test if priming is on or off for the given resolver context
object, within "dns_resolver_prime" function.

The "primelock" lock is still necessary, since dns_resolver_prime() function
internally calls dns_resolver_createfetch(), and whenever this function
succeeds it registers an event in the task manager which could be called by
another thread, namely the "prime_done" function, and this function is
responsible for disposing the "primefetch" attribute in the resolver object,
also for resetting "priming" attribute to false.

It is important that the invariant "priming == false AND primefetch == NULL"
remains constant, so that any thread calling "dns_resolver_prime" knows for sure
that if the "priming" attribute is false, "primefetch" attribute should also be
NULL, so a new fetch context could be created to fulfill this purpose, and
assigned to "primefetch" attribute under the lock protection.

To honor the explanation above, dns_resolver_prime is implemented as follow:
	1. Atomically checks the attribute "priming" for the given resolver context.
	2. If "priming" is false, assumes that "primefetch" is NULL (this is
           ensured by the "prime_done" implementation), acquire "primelock"
	   lock and create a new fetch context, update "primefetch" pointer to
	   point to the newly allocated fetch context.
	3. If "priming" is true, assumes that the job is already in progress,
	   no locks are acquired, nothing else to do.

To keep the previous invariant consistent, "prime_done" is implemented as follow:
	1. Acquire "primefetch" lock.
	2. Keep a reference to the current "primefetch" object;
	3. Reset "primefetch" attribute to NULL.
	4. Release "primefetch" lock.
	5. Atomically update "priming" attribute to false.
	6. Destroy the "primefetch" object by using the temporary reference.

This ensures that if "priming" is false, "primefetch" was already reset to NULL.

It doesn't make any difference in having the "priming" attribute not protected
by a lock, since the visible state of this variable would depend on the calling
order of the functions "dns_resolver_prime" and "prime_done".

As an example, suppose that instead of using an atomic for the "priming" attribute
we employed a lock to protect it.
Now suppose that "prime_done" function is called by Thread A, it is then preempted
before acquiring the lock, thus not reseting "priming" to false.
In parallel to that suppose that a Thread B is scheduled and that it calls
"dns_resolver_prime()", it then acquires the lock and check that "priming" is true,
thus it will consider that this resolver object is already priming and it won't do
any more job.
Conversely if the lock order was acquired in the other direction, Thread B would check
that "priming" is false (since prime_done acquired the lock first and set "priming" to false)
and it would initiate a priming fetch for this resolver.

An atomic variable wouldn't change this behavior, since it would behave exactly the
same, depending on the function call order, with the exception that it would avoid
having to use a lock.

There should be no side effects resulting from this change, since the previous
implementation employed use of the more general resolver's "lock" mutex, which
is used in far more contexts, but in the specifics of the "dns_resolver_prime"
and "prime_done" it was only used to protect "primefetch" and "priming" attributes,
which are not used in any of the other critical sections protected by the same lock,
thus having zero dependency on those variables.
2020-02-14 14:28:31 -03:00
Ondřej Surý
5777c44ad0 Reformat using the new rules 2020-02-14 09:31:05 +01:00
Evan Hunt
e851ed0bb5 apply the modified style 2020-02-13 15:05:06 -08:00
Ondřej Surý
056e133c4c Use clang-tidy to add curly braces around one-line statements
The command used to reformat the files in this commit was:

./util/run-clang-tidy \
	-clang-tidy-binary clang-tidy-11
	-clang-apply-replacements-binary clang-apply-replacements-11 \
	-checks=-*,readability-braces-around-statements \
	-j 9 \
	-fix \
	-format \
	-style=file \
	-quiet
clang-format -i --style=format $(git ls-files '*.c' '*.h')
uncrustify -c .uncrustify.cfg --replace --no-backup $(git ls-files '*.c' '*.h')
clang-format -i --style=format $(git ls-files '*.c' '*.h')
2020-02-13 22:07:21 +01:00
Ondřej Surý
36c6105e4f Use coccinelle to add braces to nested single line statement
Both clang-tidy and uncrustify chokes on statement like this:

for (...)
	if (...)
		break;

This commit uses a very simple semantic patch (below) to add braces around such
statements.

Semantic patch used:

@@
statement S;
expression E;
@@

while (...)
- if (E) S
+ { if (E) { S } }

@@
statement S;
expression E;
@@

for (...;...;...)
- if (E) S
+ { if (E) { S } }

@@
statement S;
expression E;
@@

if (...)
- if (E) S
+ { if (E) { S } }
2020-02-13 21:58:55 +01:00
Ondřej Surý
f50b1e0685 Use clang-format to reformat the source files 2020-02-12 15:04:17 +01:00
Ondřej Surý
bc1d4c9cb4 Clear the pointer to destroyed object early using the semantic patch
Also disable the semantic patch as the code needs tweaks here and there because
some destroy functions might not destroy the object and return early if the
object is still in use.
2020-02-09 18:00:17 -08:00
Witold Kręcicki
d708370db4 Fix atomics usage for mutexatomics 2020-02-08 12:34:19 -08:00
Ondřej Surý
a9bd6f6ea6 Fix comparison between type uint16_t and wider type size_t in a loop
Found by LGTM.com (see below for description), and while it should not
happen as EDNS OPT RDLEN is uint16_t, the fix is easy.  A little bit
of cleanup is included too.

> In a loop condition, comparison of a value of a narrow type with a value
> of a wide type may result in unexpected behavior if the wider value is
> sufficiently large (or small). This is because the narrower value may
> overflow. This can lead to an infinite loop.
2020-02-05 01:41:13 +00:00
Ondřej Surý
7dfc092f06 Use C11 atomics for nfctx, kill unused dns_resolver_nrunning() 2020-01-14 13:12:13 +01:00
Mark Andrews
62abb6aa82 make resolver->zspill atomic to prevent potential deadlock 2019-12-12 08:26:59 +00:00
Mark Andrews
13aaeaa06f Note bucket lock requirements and move REQUIRE inside locked section. 2019-12-10 22:16:15 +00:00
Mark Andrews
5589748eca lock access to fctx->nqueries 2019-12-10 22:16:15 +00:00
Mark Andrews
912ce87479 Make fctx->attributes atomic.
FCTX_ATTR_SHUTTINGDOWN needs to be set and tested while holding the node
lock but the rest of the attributes don't as they are task locked. Making
fctx->attributes atomic allows both behaviours without races.
2019-12-03 08:58:53 +11:00
Mark Andrews
9ca6ad6311 Assign fctx->client when fctx is created rather when the join happens.
This prevents races on fctx->client whenever a new fetch joins a existing
fetch (by calling fctx_join) as it is now invariant for the active life of
fctx.
2019-12-02 06:01:46 +00:00
Ondřej Surý
edd97cddc1 Refactor dns_name_dup() usage using the semantic patch 2019-11-29 14:00:37 +01:00
Ondřej Surý
a5189eefa5 lib/dns/resolver.c: Call dns_adb_endudpfetch() only for UDP queries
The dns_adb_beginudpfetch() is called only for UDP queries, but
the dns_adb_endudpfetch() is called for all queries, including
TCP.  This messages the quota counting in adb.c.
2019-11-19 02:53:56 +08:00
Michał Kępień
fce3c93ea2 Prevent TCP failures from affecting EDNS stats
EDNS mechanisms only apply to DNS over UDP.  Thus, errors encountered
while sending DNS queries over TCP must not influence EDNS timeout
statistics.
2019-10-31 09:54:05 +01:00
Michał Kępień
6cd115994e Prevent query loops for misbehaving servers
If a TCP connection fails while attempting to send a query to a server,
the fetch context will be restarted without marking the target server as
a bad one.  If this happens for a server which:

  - was already marked with the DNS_FETCHOPT_EDNS512 flag,
  - responds to EDNS queries with the UDP payload size set to 512 bytes,
  - does not send response packets larger than 512 bytes,

and the response for the query being sent is larger than 512 byes, then
named will pointlessly alternate between sending UDP queries with EDNS
UDP payload size set to 512 bytes (which are responded to with truncated
answers) and TCP connections until the fetch context retry limit is
reached.  Prevent such query loops by marking the server as bad for a
given fetch context if the advertised EDNS UDP payload size for that
server gets reduced to 512 bytes and it is impossible to reach it using
TCP.
2019-10-31 08:48:35 +01:00
Mark Andrews
622bef6aec reset fctx->qmindcname and fctx->qminname after processing a delegation 2019-10-01 22:09:04 -07:00
Evan Hunt
488cb4da10 SERVFAIL if a prior qmin fetch has not been canceled when a new one starts 2019-10-01 20:41:53 -07:00
Ondřej Surý
c2dad0dcb2 Replace RUNTIME_CHECK(dns_name_copy(..., NULL)) with dns_name_copynf()
Use the semantic patch from the previous commit to replace all the calls to
dns_name_copy() with NULL as third argument with dns_name_copynf().
2019-10-01 10:43:26 +10:00
Ondřej Surý
5efa29e03a The final round of adding RUNTIME_CHECK() around dns_name_copy() calls
This commit was done by hand to add the RUNTIME_CHECK() around stray
dns_name_copy() calls with NULL as third argument.  This covers the edge cases
that doesn't make sense to write a semantic patch since the usage pattern was
unique or almost unique.
2019-10-01 10:43:26 +10:00
Ondřej Surý
89b269b0d2 Add RUNTIME_CHECK() around result = dns_name_copy(..., NULL) calls
This second commit uses second semantic patch to replace the calls to
dns_name_copy() with NULL as third argument where the result was stored in a
isc_result_t variable.  As the dns_name_copy(..., NULL) cannot fail gracefully
when the third argument is NULL, it was just a bunch of dead code.

Couple of manual tweaks (removing dead labels and unused variables) were
manually applied on top of the semantic patch.
2019-10-01 10:43:26 +10:00
Ondřej Surý
35bd7e4da0 Add RUNTIME_CHECK() around plain dns_name_copy(..., NULL) calls using spatch
This commit add RUNTIME_CHECK() around all simple dns_name_copy() calls where
the third argument is NULL using the semantic patch from the previous commit.
2019-10-01 10:43:26 +10:00
Mark Andrews
c3bcb4d47a Remove potential use after free (fctx) in rctx_resend. 2019-09-13 12:44:12 +10:00
Ondřej Surý
4957255d13 Use the semantic patch to change the usage isc_mem_create() to new API 2019-09-12 09:26:09 +02:00
Ondřej Surý
50e109d659 isc_event_allocate() cannot fail, remove the fail handling blocks
isc_event_allocate() calls isc_mem_get() to allocate the event structure.  As
isc_mem_get() cannot fail softly (e.g. it never returns NULL), the
isc_event_allocate() cannot return NULL, hence we remove the (ret == NULL)
handling blocks using the semantic patch from the previous commit.
2019-08-30 08:55:34 +02:00
Evan Hunt
1d86b202ad remove DLV-related library code 2019-08-09 09:15:10 -07:00
Mark Andrews
57a328d67e Store the DS and RRSIG(DS) with trust dns_trust_pending_answer
so that the validator can validate the records as part of validating
the current request.
2019-08-02 15:09:42 +10:00
Ondřej Surý
f0c6aef542 Cleanup stray goto labels from removing isc_mem_allocate/strdup checking blocks 2019-07-23 15:32:36 -04:00
Ondřej Surý
9bdc24a9fd Use coccinelle to cleanup the failure handling blocks from isc_mem_strdup 2019-07-23 15:32:36 -04:00
Ondřej Surý
ae83801e2b Remove blocks checking whether isc_mem_get() failed using the coccinelle 2019-07-23 15:32:35 -04:00
Michał Kępień
ca528766d6 Restore locking in resume_dslookup()
Commit 9da902a201b6d0e1bdbac0af067a59bb0a489c9c removed locking around
the fctx_decreference() call inside resume_dslookup().  This allows
fctx_unlink() to be called without the bucket lock being held, which
must never happen.  Ensure the bucket lock is held by resume_dslookup()
before it calls fctx_decreference().
2019-07-23 11:43:46 +02:00
Ondřej Surý
a4141fcf98 Restore more locking in the lib/dns/resolver.c code
1. Restore locking in the fctx_decreference() code, because the insides of the
   function needs to be protected when fctx->references drops to 0.

2. Restore locking in the dns_resolver_attach() code, because two variables are
   accessed at the same time and there's slight chance of data race.
2019-07-22 09:03:27 -04:00
Ondřej Surý
317e36d47e Restore locking in dns_resolver_shutdown and dns_resolver_attach
Although the struct dns_resolver.exiting member is protected by stdatomics, we
actually need to wait for whole dns_resolver_shutdown() to finish before
destroying the resolver object.  Otherwise, there would be a data race and some
fctx objects might not be destroyed yet at the time we tear down the
dns_resolver object.
2019-07-22 08:17:36 -04:00
Ondřej Surý
a912f31398 Add new default siphash24 cookie algorithm, but keep AES as legacy
This commit changes the BIND cookie algorithms to match
draft-sury-toorop-dnsop-server-cookies-00.  Namely, it changes the Client Cookie
algorithm to use SipHash 2-4, adds the new Server Cookie algorithm using SipHash
2-4, and changes the default for the Server Cookie algorithm to be siphash24.

Add siphash24 cookie algorithm, and make it keep legacy aes as
2019-07-21 15:16:28 -04:00
Witold Kręcicki
afa81ee4e4 Remove all cookie algorithms but AES, which was used as a default, for legacy purposes. 2019-07-21 10:08:14 -04:00
Witold Kręcicki
e56cc07f50 Fix a few broken atomics initializations 2019-07-09 16:11:14 +02:00
Ondřej Surý
9da902a201 lib/dns/resolver.c: use isc_refcount_t and atomics 2019-07-09 16:09:36 +02:00
Evan Hunt
6d6e94bee7 fixup! Use experimental "_ A" minimization in relaxed mode. 2019-05-30 14:06:56 -07:00
Witold Kręcicki
ae52c2117e Use experimental "_ A" minimization in relaxed mode.
qname minimization, even in relaxed mode, can fail on
some very broken domains. In relaxed mode, instead of
asking for "foo.bar NS" ask for "_.foo.bar A" to either
get a delegation or NXDOMAIN. It will require more queries
than regular mode for proper NXDOMAINs.
2019-05-30 14:06:55 -07:00