2
0
mirror of https://gitlab.isc.org/isc-projects/bind9 synced 2025-08-30 05:57:52 +00:00

364 Commits

Author SHA1 Message Date
Ondřej Surý
7e002d89b4 Fix the data race when shutting down dns_adb
When dns_adb is shutting down, first the adb->shutting_down flag is set
and then task is created that runs shutdown_stage2() that sets the
shutdown flag on names and entries.  However, when dns_adb_createfind()
is called, only the individual shutdown flags are being checked, and the
global adb->shutting_down flag was not checked.  Because of that it was
possible for a different thread to slip in and create new find between
the dns_adb_shutdown() and dns_adb_detach(), but before the
shutdown_stage2() task is complete.  This was detected by
ThreadSanitizer as data race because the zonetable might have been
already detached by dns_view shutdown process and simultaneously
accessed by dns_adb_createfind().

This commit converts the adb->shutting_down to atomic_bool to prevent
the global adb lock when creating the find.
2021-11-22 11:09:21 +01:00
Evan Hunt
93f5bc893e incidental cleanup
The NAME_FETCH_A and NAME_FETCH_AAAA macros were meant to be
boolean, indicating whether the pointers were set or not, while
the NAME_FETCH_V4 and NAME_FETCH_V6 macros were meant to return
the pointer values. The latter were only used as booleans, so
they've been removed in favor of the former.

Also did some style cleanup and removed an unreachable code block.
2021-10-21 01:39:30 -07:00
Evan Hunt
18cc459e05 Incidental cleanup
- there are several allocation functions in adb.c that can no
  longer return NULL.
- a macro in rbt.c was never used.
2021-10-18 14:35:50 -07:00
Ondřej Surý
2e3a2eecfe Make isc_result a static enum
Remove the dynamic registration of result codes.  Convert isc_result_t
from unsigned + #defines into 32-bit enum type in grand unified
<isc/result.h> header.  Keep the existing values of the result codes
even at the expense of the description and identifier tables being
unnecessary large.

Additionally, add couple of:

    switch (result) {
    [...]
    default:
        break;
    }

statements where compiler now complains about missing enum values in the
switch statement.
2021-10-06 11:22:20 +02:00
Evan Hunt
08ce69a0ea Rewrite dns_resolver and dns_request to use netmgr timeouts
- The `timeout_action` parameter to dns_dispatch_addresponse() been
  replaced with a netmgr callback that is called when a dispatch read
  times out.  this callback may optionally reset the read timer and
  resume reading.

- Added a function to convert isc_interval to milliseconds; this is used
  to translate fctx->interval into a value that can be passed to
  dns_dispatch_addresponse() as the timeout.

- Note that netmgr timeouts are accurate to the millisecond, so code to
  check whether a timeout has been reached cannot rely on microsecond
  accuracy.

- If serve-stale is configured, then a timeout received by the resolver
  may trigger it to return stale data, and then resume waiting for the
  read timeout. this is no longer based on a separate stale timer.

- The code for canceling requests in request.c has been altered so that
  it can run asynchronously.

- TCP timeout events apply to the dispatch, which may be shared by
  multiple queries.  since in the event of a timeout we have no query ID
  to use to identify the resp we wanted, we now just send the timeout to
  the oldest query that was pending.

- There was some additional refactoring in the resolver: combining
  fctx_join() and fctx_try_events() into one function to reduce code
  duplication, and using fixednames in fetchctx and fetchevent.

- Incidental fix: new_adbaddrinfo() can't return NULL anymore, so the
  code can be simplified.
2021-10-02 11:39:56 -07:00
Ondřej Surý
22db2705cd Use static storage for isc_mem water_t
On the isc_mem water change the old water_t structure could be used
after free.  Instead of introducing reference counting on the hot-path
we are going to introduce additional constraints on the
isc_mem_setwater.  Once it's set for the first time, the additional
calls have to be made with the same water and water_arg arguments.
2021-07-22 11:51:46 +02:00
Ondřej Surý
9c3bebc26f Properly disable the "water" in isc_mem
The proper way how to disable the water limit in the isc_mem context is
to call:

    isc_mem_setwater(ctx, NULL, NULL, 0, 0);

this ensures that the old water callback is called with ISC_MEM_LOWATER
if the callback was called with ISC_MEM_HIWATER before.

Historically, there were some places where the limits were disabled by
calling:

    isc_mem_setwater(ctx, water, water_arg, 0, 0);

which would also call the old callback, but it also causes the water_t
to be allocated and extra check to be executed because water callback is
not NULL.

This commits unifies the calls to disable water to the preferred form.
2021-07-09 15:58:02 +02:00
Ondřej Surý
f487c6948b Replace locked mempools with memory contexts
Current mempools are kind of hybrid structures - they serve two
purposes:

 1. mempool with a lock is basically static sized allocator with
    pre-allocated free items

 2. mempool without a lock is a doubly-linked list of preallocated items

The first kind of usage could be easily replaced with jemalloc small
sized arena objects and thread-local caches.

The second usage not-so-much and we need to keep this (in
libdns:message.c) for performance reasons.
2021-07-09 15:58:02 +02:00
Evan Hunt
b0aadaac8e rename dns_name_copynf() to dns_name_copy()
dns_name_copy() is now the standard name-copying function.
2021-05-22 00:37:27 -07:00
Ondřej Surý
bb990030d3 Simplify the EDNS buffer size logic for DNS Flag Day 2020
The DNS Flag Day 2020 aims to remove the IP fragmentation problem from
the UDP DNS communication.  In this commit, we implement the required
changes and simplify the logic for picking the EDNS Buffer Size.

1. The defaults for `edns-udp-size`, `max-udp-size` and
   `nocookie-udp-size` have been changed to `1232` (the value picked by
   DNS Flag Day 2020).

2. The probing heuristics that would try 512->4096->1432->1232 buffer
   sizes has been removed and the resolver will always use just the
   `edns-udp-size` value.

3. Instead of just disabling the PMTUD mechanism on the UDP sockets, we
   now set IP_DONTFRAG (IPV6_DONTFRAG) flag.  That means that the UDP
   packets won't get ever fragmented.  If the ICMP packets are lost the
   UDP will just timeout and eventually be retried over TCP.
2020-10-05 16:21:21 +02:00
Evan Hunt
dcee985b7f update all copyright headers to eliminate the typo 2020-09-14 16:20:40 -07:00
Ondřej Surý
e24bc324b4 Fix the rbt hashtable and grow it when setting max-cache-size
There were several problems with rbt hashtable implementation:

1. Our internal hashing function returns uint64_t value, but it was
   silently truncated to unsigned int in dns_name_hash() and
   dns_name_fullhash() functions.  As the SipHash 2-4 higher bits are
   more random, we need to use the upper half of the return value.

2. The hashtable implementation in rbt.c was using modulo to pick the
   slot number for the hash table.  This has several problems because
   modulo is: a) slow, b) oblivious to patterns in the input data.  This
   could lead to very uneven distribution of the hashed data in the
   hashtable.  Combined with the single-linked lists we use, it could
   really hog-down the lookup and removal of the nodes from the rbt
   tree[a].  The Fibonacci Hashing is much better fit for the hashtable
   function here.  For longer description, read "Fibonacci Hashing: The
   Optimization that the World Forgot"[b] or just look at the Linux
   kernel.  Also this will make Diego very happy :).

3. The hashtable would rehash every time the number of nodes in the rbt
   tree would exceed 3 * (hashtable size).  The overcommit will make the
   uneven distribution in the hashtable even worse, but the main problem
   lies in the rehashing - every time the database grows beyond the
   limit, each subsequent rehashing will be much slower.  The mitigation
   here is letting the rbt know how big the cache can grown and
   pre-allocate the hashtable to be big enough to actually never need to
   rehash.  This will consume more memory at the start, but since the
   size of the hashtable is capped to `1 << 32` (e.g. 4 mio entries), it
   will only consume maximum of 32GB of memory for hashtable in the
   worst case (and max-cache-size would need to be set to more than
   4TB).  Calling the dns_db_adjusthashsize() will also cap the maximum
   size of the hashtable to the pre-computed number of bits, so it won't
   try to consume more gigabytes of memory than available for the
   database.

   FIXME: What is the average size of the rbt node that gets hashed?  I
   chose the pagesize (4k) as initial value to precompute the size of
   the hashtable, but the value is based on feeling and not any real
   data.

For future work, there are more places where we use result of the hash
value modulo some small number and that would benefit from Fibonacci
Hashing to get better distribution.

Notes:
a. A doubly linked list should be used here to speedup the removal of
   the entries from the hashtable.
b. https://probablydance.com/2018/06/16/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo/
2020-07-21 08:44:26 +02:00
Evan Hunt
57e54c46e4 change "expr == false" to "!expr" in conditionals 2020-05-25 16:09:57 -07:00
Mark Andrews
3ee5ea2fdb Reduce the number of fetches we make when looking up addresses
If there are more that 5 NS record for a zone only perform a
maximum of 4 address lookups for all the name servers.  This
limits the amount of remote lookup performed for server
addresses at each level for a given query.
2020-05-19 12:30:29 +02:00
Ondřej Surý
60c632ab91 Workaround MSVC warning C4477
Due to a way the stdatomic.h shim is implemented on Windows, the MSVC
always things that the outside type is the largest - atomic_(u)int_fast64_t.
This can lead to false positives as this one:

  lib\dns\adb.c(3678): warning C4477: 'fprintf' : format string '%u' requires an argument of type 'unsigned int', but variadic argument 2 has type 'unsigned __int64'

We workaround the issue by loading the value in a scoped local variable
with correct type first.
2020-04-15 12:47:42 +02:00
Ondřej Surý
5777c44ad0 Reformat using the new rules 2020-02-14 09:31:05 +01:00
Evan Hunt
e851ed0bb5 apply the modified style 2020-02-13 15:05:06 -08:00
Ondřej Surý
056e133c4c Use clang-tidy to add curly braces around one-line statements
The command used to reformat the files in this commit was:

./util/run-clang-tidy \
	-clang-tidy-binary clang-tidy-11
	-clang-apply-replacements-binary clang-apply-replacements-11 \
	-checks=-*,readability-braces-around-statements \
	-j 9 \
	-fix \
	-format \
	-style=file \
	-quiet
clang-format -i --style=format $(git ls-files '*.c' '*.h')
uncrustify -c .uncrustify.cfg --replace --no-backup $(git ls-files '*.c' '*.h')
clang-format -i --style=format $(git ls-files '*.c' '*.h')
2020-02-13 22:07:21 +01:00
Ondřej Surý
36c6105e4f Use coccinelle to add braces to nested single line statement
Both clang-tidy and uncrustify chokes on statement like this:

for (...)
	if (...)
		break;

This commit uses a very simple semantic patch (below) to add braces around such
statements.

Semantic patch used:

@@
statement S;
expression E;
@@

while (...)
- if (E) S
+ { if (E) { S } }

@@
statement S;
expression E;
@@

for (...;...;...)
- if (E) S
+ { if (E) { S } }

@@
statement S;
expression E;
@@

if (...)
- if (E) S
+ { if (E) { S } }
2020-02-13 21:58:55 +01:00
Ondřej Surý
f50b1e0685 Use clang-format to reformat the source files 2020-02-12 15:04:17 +01:00
Ondřej Surý
bc1d4c9cb4 Clear the pointer to destroyed object early using the semantic patch
Also disable the semantic patch as the code needs tweaks here and there because
some destroy functions might not destroy the object and return early if the
object is still in use.
2020-02-09 18:00:17 -08:00
Ondřej Surý
5eb3f71a3e Refactor the isc_mempool_create() usage using the semantic patch
The isc_mempool_create() function now cannot fail with ISC_R_MEMORY.
This commit removes all the checks on the return code using the semantic
patch from previous commit, as isc_mempool_create() now returns void.
2020-02-03 08:27:16 +01:00
Ondřej Surý
edd97cddc1 Refactor dns_name_dup() usage using the semantic patch 2019-11-29 14:00:37 +01:00
Ondřej Surý
21eab267df Fix missing adb->{e,i}refcnt locking 2019-11-26 13:07:12 +01:00
Witold Kręcicki
bad5a523c2 lib/dns/adb.c: Use atomics for adb quota values and reference counting 2019-11-26 13:07:12 +01:00
Witold Kręcicki
7d93371581 lib/dns/adb.c: don't use more than 64 lock simultaneously when run under TSAN
- TSAN can't handle more than 64 locks in one thread, lock ADB bucket-by-bucket
   in TSAN mode. This means that the dump won't be consistent but it's good
   enough for testing

 - Use proper order when unlocking adb->namelocks and adb->entrylocks when
   dumping ADB.
2019-11-18 06:51:30 +01:00
Ondřej Surý
56ef09c3a1 Describe the polynomial backoff curve used in the quota adjustment 2019-11-05 09:48:15 +01:00
Ondřej Surý
c2dad0dcb2 Replace RUNTIME_CHECK(dns_name_copy(..., NULL)) with dns_name_copynf()
Use the semantic patch from the previous commit to replace all the calls to
dns_name_copy() with NULL as third argument with dns_name_copynf().
2019-10-01 10:43:26 +10:00
Ondřej Surý
89b269b0d2 Add RUNTIME_CHECK() around result = dns_name_copy(..., NULL) calls
This second commit uses second semantic patch to replace the calls to
dns_name_copy() with NULL as third argument where the result was stored in a
isc_result_t variable.  As the dns_name_copy(..., NULL) cannot fail gracefully
when the third argument is NULL, it was just a bunch of dead code.

Couple of manual tweaks (removing dead labels and unused variables) were
manually applied on top of the semantic patch.
2019-10-01 10:43:26 +10:00
Ondřej Surý
ae83801e2b Remove blocks checking whether isc_mem_get() failed using the coccinelle 2019-07-23 15:32:35 -04:00
Ondřej Surý
78d0cb0a7d Use coccinelle to remove explicit '#include <config.h>' from the source files 2019-03-08 15:15:05 +01:00
Evan Hunt
427e9ca357 clear AD flag when altering response messages
- the AD flag was not being cleared correctly when filtering
- enabled dnssec valdiation in the filter-aaaa test to confirm this
  works correctly now
2018-12-06 10:29:11 -08:00
Witold Kręcicki
929ea7c2c4 - Make isc_mutex_destroy return void
- Make isc_mutexblock_init/destroy return void
- Minor cleanups
2018-11-22 11:52:08 +00:00
Ondřej Surý
2f3eee5a4f isc_mutex_init returns 'void' 2018-11-22 11:51:49 +00:00
Ondřej Surý
b2b43fd235 Turn (int & flag) into (int & flag) != 0 when implicitly typed to bool 2018-11-08 12:21:53 +07:00
Mark Andrews
23766ff690 checkpoint 2018-10-23 12:15:04 +00:00
Witold Kręcicki
86246c7431 Initialize adbname->client properly; check for loops 2018-10-23 12:15:04 +00:00
Witold Kręcicki
f2af336dc4 Fix looping issues 2018-10-23 12:15:04 +00:00
Witold Kręcicki
70a1ba20ec QNAME miminimization should create a separate fetch context for each fetch -
this makes the cache more efficient and eliminates duplicates queries.
2018-10-23 12:15:04 +00:00
Ondřej Surý
994e656977 Replace custom isc_boolean_t with C standard bool type 2018-08-08 09:37:30 +02:00
Ondřej Surý
cb6a185c69 Replace custom isc_u?intNN_t types with C99 u?intNN_t types 2018-08-08 09:37:28 +02:00
Evan Hunt
c8015eb33b style nits (mostly line length) 2018-06-12 09:18:47 +02:00
Evan Hunt
2ea47c7f34 rename test to qmin; add it to conf.sh.in and Makefile.in; fix copyrights 2018-06-12 09:18:47 +02:00
Witold Kręcicki
dd7bb617be - qname minimization:
- make qname-minimization option tristate {strict,relaxed,disabled}
 - go straight for the record if we hit NXDOMAIN in relaxed mode
 - go straight for the record after 3 labels without new delegation or 7 labels total

- use start of fetch (and not time of response) as 'now' time for querying cache for
  zonecut when following delegation.
2018-06-12 09:18:46 +02:00
Ondřej Surý
99ba29bc52 Change isc_random() to be just PRNG, and add isc_nonce_buf() that uses CSPRNG
This commit reverts the previous change to use system provided
entropy, as (SYS_)getrandom is very slow on Linux because it is
a syscall.

The change introduced in this commit adds a new call isc_nonce_buf
that uses CSPRNG from cryptographic library provider to generate
secure data that can be and must be used for generating nonces.
Example usage would be DNS cookies.

The isc_random() API has been changed to use fast PRNG that is not
cryptographically secure, but runs entirely in user space.  Two
contestants have been considered xoroshiro family of the functions
by Villa&Blackman and PCG by O'Neill.  After a consideration the
xoshiro128starstar function has been used as uint32_t random number
provider because it is very fast and has good enough properties
for our usage pattern.

The other change introduced in the commit is the more extensive usage
of isc_random_uniform in places where the usage pattern was
isc_random() % n to prevent modulo bias.  For usage patterns where
only 16 or 8 bits are needed (DNS Message ID), the isc_random()
functions has been renamed to isc_random32(), and isc_random16() and
isc_random8() functions have been introduced by &-ing the
isc_random32() output with 0xffff and 0xff.  Please note that the
functions that uses stripped down bit count doesn't pass our
NIST SP 800-22 based random test.
2018-05-29 22:58:21 +02:00
Ondřej Surý
3a4f820d62 Replace all random functions with isc_random, isc_random_buf and isc_random_uniform API.
The three functions has been modeled after the arc4random family of
functions, and they will always return random bytes.

The isc_random family of functions internally use these CSPRNG (if available):

1. getrandom() libc call (might be available on Linux and Solaris)
2. SYS_getrandom syscall (might be available on Linux, detected at runtime)
3. arc4random(), arc4random_buf() and arc4random_uniform() (available on BSDs and Mac OS X)
4. crypto library function:
4a. RAND_bytes in case OpenSSL
4b. pkcs_C_GenerateRandom() in case PKCS#11 library
2018-05-16 09:54:35 +02:00
Michał Kępień
4df4a8e731 Use dns_fixedname_initname() where possible
Replace dns_fixedname_init() calls followed by dns_fixedname_name()
calls with calls to dns_fixedname_initname() where it is possible
without affecting current behavior and/or performance.

This patch was mostly prepared using Coccinelle and the following
semantic patch:

    @@
    expression fixedname, name;
    @@
    -	dns_fixedname_init(&fixedname);
    	...
    -	name = dns_fixedname_name(&fixedname);
    +	name = dns_fixedname_initname(&fixedname);

The resulting set of changes was then manually reviewed to exclude false
positives and apply minor tweaks.

It is likely that more occurrences of this pattern can be refactored in
an identical way.  This commit only takes care of the low-hanging fruit.
2018-04-09 12:14:16 +02:00
Witold Kręcicki
d54d482af0 libdns refactoring: get rid of multiple versions of dns_view_find, dns_view_findzonecut and dns_view_flushcache 2018-04-06 08:04:41 +02:00
Witold Kręcicki
42ee8c853a libdns refactoring: get rid of 3 versions of dns_resolver_createfetch 2018-04-06 08:04:41 +02:00
Witold Kręcicki
f0a07b7546 libdns refactoring: get rid of two versions of dns_adb_createfind and dns_adb_probesize 2018-04-06 08:04:41 +02:00