Add support for a "tls" key/value pair for zone primaries, referencing
either a "tls" configuration statement or "ephemeral". If set to use
TLS, zones will send SOA and AXFR/IXFR queries over a TLS channel.
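A hypothetical named.conf excerpt illustrating the intended usage; the zone
name, addresses, file paths, and the particular options inside the "tls"
block are made up for illustration:

    tls upstream-tls {
            key-file "/etc/bind/xfr.key";
            cert-file "/etc/bind/xfr.crt";
    };

    zone "example.com" {
            type secondary;
            primaries { 192.0.2.1 tls upstream-tls; 192.0.2.2 tls ephemeral; };
            file "example.db";
    };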
Hold a weak reference to the view so that it can't go away while
nta is performing its lookups. Cancel nta timers once all external
references to the view have gone, to prevent them from triggering new work.
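A minimal sketch of the lifetime rule described above, assuming the existing
dns_view_weakattach()/dns_view_weakdetach() API; the structure and the other
function names here are illustrative only, not the actual nta code:

    #include <dns/view.h>

    /* Illustrative container; the real nta code keeps its own state. */
    struct nta_lookup {
            dns_view_t *view; /* weak reference held while the lookup runs */
            /* ... fetch handle, timer, etc. ... */
    };

    static void
    nta_lookup_start(dns_view_t *view, struct nta_lookup *lookup) {
            lookup->view = NULL;
            dns_view_weakattach(view, &lookup->view); /* view can't go away now */
            /* ... start the asynchronous lookup ... */
    }

    static void
    nta_lookup_done(struct nta_lookup *lookup) {
            /* ... process the result ... */
            dns_view_weakdetach(&lookup->view); /* drop the weak reference */
    }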
Created an isc_refcount_decrement_expect macro to conditionally test
the return value and ensure it is in the expected range. Converted
unchecked isc_refcount_decrement calls to isc_refcount_decrement_expect,
and converted INSIST(isc_refcount_decrement()...) constructs to
isc_refcount_decrement_expect as well.
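A minimal sketch of what such a macro can look like, assuming
isc_refcount_decrement() returns the counter value held before the
decrement; the authoritative definition lives in <isc/refcount.h>:

    #include <stdint.h>
    #include <isc/refcount.h>
    #include <isc/util.h> /* INSIST() */

    /*
     * The second argument is a relational expression applied to the value
     * the counter had before decrementing, e.g.:
     *
     *      isc_refcount_decrement_expect(&obj->references, > 1);
     *      isc_refcount_decrement_expect(&obj->references, == 1);
     */
    #define isc_refcount_decrement_expect(rcp, relation)                \
            do {                                                        \
                    uint_fast32_t _refs = isc_refcount_decrement(rcp);  \
                    INSIST(_refs relation);                             \
            } while (0)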
There were several problems with the rbt hashtable implementation:
1. Our internal hashing function returns a uint64_t value, but it was
silently truncated to an unsigned int in the dns_name_hash() and
dns_name_fullhash() functions. As the higher bits of the SipHash 2-4
output are more random, we need to use the upper half of the return
value.
2. The hashtable implementation in rbt.c was using modulo to pick the
slot number for the hash table. This has several problems because
modulo is: a) slow, b) oblivious to patterns in the input data. This
could lead to a very uneven distribution of the hashed data in the
hashtable. Combined with the singly-linked lists we use, it could
really bog down lookup and removal of nodes from the rbt tree[a].
Fibonacci Hashing is a much better fit for the hashtable function here
(a minimal sketch follows the notes below). For a longer description,
read "Fibonacci Hashing: The Optimization that the World Forgot"[b] or
just look at the Linux kernel. Also this will make Diego very happy :).
3. The hashtable would rehash every time the number of nodes in the rbt
tree exceeded 3 * (hashtable size). The overcommit makes the uneven
distribution in the hashtable even worse, but the main problem lies in
the rehashing - every time the database grows beyond the limit, each
subsequent rehashing will be much slower. The mitigation here is
letting the rbt know how big the cache can grow and pre-allocating the
hashtable to be big enough that it never actually needs to rehash.
This will consume more memory at the start, but since the size of the
hashtable is capped at `1 << 32` (i.e. roughly 4 billion entries), it
will only consume a maximum of 32GB of memory for the hashtable in the
worst case (and max-cache-size would need to be set to more than
4TB). Calling dns_db_adjusthashsize() will also cap the maximum size
of the hashtable to the pre-computed number of bits, so it won't try
to consume more gigabytes of memory than are available for the
database.
FIXME: What is the average size of the rbt node that gets hashed? I
chose the page size (4k) as the initial value to precompute the size of
the hashtable, but the value is based on a gut feeling and not on any
real data.
For future work, there are more places where we use the hash value
modulo some small number; those would benefit from Fibonacci Hashing
to get a better distribution.
Notes:
a. A doubly-linked list should be used here to speed up the removal of
the entries from the hashtable.
b. https://probablydance.com/2018/06/16/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo/
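As promised above, a minimal sketch of the Fibonacci Hashing slot
computation, assuming a 64-bit hash value and a power-of-two table of
size (1 << hashbits); the constant is 2^64 divided by the golden ratio:

    #include <stdint.h>

    #define GOLDEN_RATIO_64 UINT64_C(0x9E3779B97F4A7C15) /* 2^64 / phi */

    /*
     * Multiply by the golden-ratio constant and keep the *top* hashbits
     * bits.  Unlike a plain modulo, this mixes all of the input bits into
     * the slot number, so patterned keys still spread evenly across the
     * table; it is also just a multiply and a shift, with no division.
     */
    static inline uint32_t
    hash_to_slot(uint64_t hashval, unsigned int hashbits) {
            return (uint32_t)((hashval * GOLDEN_RATIO_64) >> (64 - hashbits));
    }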
Also disable the semantic patch, as the code needs tweaks here and there:
some destroy functions might not destroy the object, returning early if
the object is still in use.
The dns_view_findzonecut() function in view.c wasn't correctly handling
classes other than IN (CHAOS, HESIOD, etc.) whenever the name being
looked up wasn't in the cache or in any of the view's configured zone
databases. That resulted in a NULL fname being used in resolver.c:4900,
which in turn triggered an abort.
This function is used by dns_view_untrust() to handle revoked keys, so
it will still be needed after the keytable/validator refactoring is
complete, even though the keytable will be storing DS trust anchors
instead of keys. To simplify the way it's called, it now takes a DNSKEY
rdata struct instead of a DST key.
This second commit uses a second semantic patch to replace the calls to
dns_name_copy() with NULL as the third argument where the result was
stored in an isc_result_t variable. As dns_name_copy(..., NULL) cannot
fail gracefully when the third argument is NULL, the result checks were
just dead code. A couple of manual tweaks (removing dead labels and
unused variables) were applied on top of the semantic patch.
This commit adds RUNTIME_CHECK() around all simple dns_name_copy() calls
where the third argument is NULL, using the semantic patch from the
previous commit.
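A minimal sketch of both transformations, assuming a target name with a
dedicated buffer (which is what makes the NULL-third-argument form unable
to fail); the helper name here is hypothetical:

    #include <dns/name.h>
    #include <isc/result.h>
    #include <isc/util.h> /* RUNTIME_CHECK() */

    static void
    copy_owner_name(dns_name_t *src, dns_name_t *dst) {
            /*
             * Before: the result was stored and checked, but with a NULL
             * third argument the copy cannot fail, so the check was dead
             * code:
             *
             *      result = dns_name_copy(src, dst, NULL);
             *      if (result != ISC_R_SUCCESS)
             *              goto cleanup;
             *
             * After: the simple call is wrapped in RUNTIME_CHECK() instead.
             */
            RUNTIME_CHECK(dns_name_copy(src, dst, NULL) == ISC_R_SUCCESS);
    }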
so that cleanup can all be done in dns_view_weakattach().
(cherry picked from commit be8af3afb711a04ac140b1debce3b950922d714f)
(cherry picked from commit e3946327035832ec03c064410e5c6881841f8f6f)
There's a deadlock in the BIND 9 code where (dns_view_t){ .lock } and
(dns_resolver_t){ .buckets[i].lock } get locked in different orders. When
view->weakrefs is converted to reference counting, we can reduce the
locking in dns_view_weakdetach() to only the case where it's the last
reference to the dns_view_t object.
(cherry picked from commit a7c9a52c89146490a0e797df2119026a268f294c)
(cherry picked from commit 232140edae6b80f8344db234cda28f8456269a6e)
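A minimal sketch of the shape dns_view_weakdetach() can take once
view->weakrefs is reference counted (simplified, not the actual
implementation): the view lock is only taken on the last-reference path,
which removes the ordinary-path locking that clashed with the resolver
bucket locks.

    #include <dns/view.h>
    #include <isc/refcount.h>
    #include <isc/util.h> /* LOCK()/UNLOCK() */

    void
    dns_view_weakdetach(dns_view_t **viewp) {
            dns_view_t *view = *viewp;
            *viewp = NULL;

            /* isc_refcount_decrement() returns the pre-decrement value. */
            if (isc_refcount_decrement(&view->weakrefs) == 1) {
                    /* Last weak reference: only now is the lock needed. */
                    LOCK(&view->lock);
                    /* ... final cleanup before the view is destroyed ... */
                    UNLOCK(&view->lock);
            }
    }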
If named is configured to perform DNSSEC validation and also forwards
all queries ("forward only;") to validating resolvers, negative trust
anchors do not work properly because the CD bit is not set in queries
sent to the forwarders. As a result, instead of retrieving bogus DNSSEC
material and making validation decisions based on its configuration,
named is only receiving SERVFAIL responses to queries for bogus data.
Fix by ensuring the CD bit is always set in queries sent to forwarders
if the query name is covered by an NTA.
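A rough sketch of the fix described above, using the existing
dns_view_ntacovers() check; the helper name and arguments are illustrative,
not the actual resolver.c code:

    #include <dns/message.h> /* DNS_MESSAGEFLAG_CD */
    #include <dns/view.h>    /* dns_view_ntacovers() */
    #include <isc/stdtime.h>

    static void
    set_cd_for_nta(dns_view_t *view, isc_stdtime_t now, dns_name_t *qname,
                   dns_name_t *anchor, dns_message_t *query) {
            /*
             * If the name being fetched is covered by a negative trust
             * anchor, set CD so the forwarder returns the (possibly bogus)
             * DNSSEC material instead of SERVFAIL, letting named make its
             * own validation decision.
             */
            if (dns_view_ntacovers(view, now, qname, anchor)) {
                    query->flags |= DNS_MESSAGEFLAG_CD;
            }
    }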
- "hook" is now used only for hook points and hook actions
- the "hook" statement in named.conf is now "plugin"
- ns_module and ns_modlist are now ns_plugin and ns_plugins
- ns_module_load is renamed ns_plugin_register
- the mandatory functions in plugin modules (hook_register,
hook_check, hook_version, hook_destroy) have been renamed (see the
sketch after this list)
- use a per-view module list instead of global hook_modules
- create an 'instance' pointer when registering modules, store it in
the module structure, and use it as action_data when calling
hook functions - this enables multiple module instances to be set
up in parallel
- also some nomenclature changes and cleanup
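A rough sketch of the renamed mandatory entry points, assuming they follow
the same hook-to-plugin pattern as the other renames above; the argument
lists are trimmed to the essentials and the authoritative prototypes live
in the ns/hooks.h header:

    #include <isc/result.h>

    int
    plugin_version(void);                   /* was hook_version() */

    isc_result_t
    plugin_register(const char *parameters, /* ..., */ void **instp);
                                            /* was hook_register(); fills in
                                             * the per-instance pointer used
                                             * as action_data */

    isc_result_t
    plugin_check(const char *parameters /* , ... */);
                                            /* was hook_check() */

    void
    plugin_destroy(void **instp);           /* was hook_destroy() */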
- make some cfg-parsing functions global so they can be run
from filter-aaaa.so
- add filter-aaaa options to the hook module's parser
- mark filter-aaaa options in named.conf as obsolete, remove
from named and checkconf, and update the filter-aaaa test not to
use checkconf anymore
- remove filter-aaaa-related struct members from dns_view
- allow multiple "hook" statements at global or view level
- add "optional bracketed text" type for optional parameter list
- load the hook module from a specified path rather than a hardcoded
path (an example statement is sketched after this list)
- add a hooktable pointer (and a callback for freeing it) to the
view structure
- change the hooktable functions so they no longer update ns__hook_table
by default, and modify PROCESS_HOOK so it uses the view hooktable, if
set, rather than ns__hook_table. (ns__hook_table is retained for
use by unit tests.)
- update the filter-aaaa system test to load filter-aaaa.so
- add a prereq script to check for dlopen support before running
the filter-aaaa system test
not yet done:
- configuration parameters are not being passed to the filter-aaaa
module; the filter-aaaa ACL and filter-aaaa-on-{v4,v6} settings are
still stored in dns_view
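A hypothetical named.conf excerpt showing the intent, spelled with the
post-rename "plugin" keyword; per the "not yet done" note above, the
bracketed parameters are illustrative only and are not yet passed through
to the module:

    view "external" {
            plugin query "/usr/local/lib/bind/filter-aaaa.so" {
                    filter-aaaa-on-v4 yes;
                    filter-aaaa { 192.0.2.0/24; };
            };
    };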
This properly orders clearing the about-to-be-freed pointer and calls
isc_refcount_destroy() as early as possible, making it possible to place a
proper memory barrier when cleaning up the reference counting.
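A minimal sketch of the ordering, using a hypothetical object type: the
caller's pointer is cleared first, and isc_refcount_destroy() runs as early
as possible on the last-reference path, before the object's teardown:

    #include <isc/refcount.h>
    #include <isc/util.h> /* REQUIRE() */

    typedef struct foo {
            isc_refcount_t references;
            /* ... other members ... */
    } foo_t;

    void
    foo_detach(foo_t **foop) {
            foo_t *foo;

            REQUIRE(foop != NULL && *foop != NULL);

            foo = *foop;
            *foop = NULL; /* clear the caller's pointer first */

            if (isc_refcount_decrement(&foo->references) == 1) {
                    /* Barrier as early as possible, before teardown. */
                    isc_refcount_destroy(&foo->references);
                    /* ... free foo's resources and foo itself ... */
            }
    }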