Per pspacek, currently qp and qpcache logs are too verbose and enabled at a
level too low compared to how often the logging is useful.
This commit increases the logging level, while keeping it configurable
via a define.
The RAD/agent domain is a functionality from RFC 9567 that provides
a suffix for reporting error messages. On every query context reset,
we need to check if a RAD is configured and, if so, copy it.
Since we allow the RAD to be changed by reconfiguring the zone,
this access is currently protected by a mutex, which causes contention.
This commit replaces the mutex with RCU to reduce contention. The
change results in a 3% performance improvement in the 1M delegation
test.
We need to turn off clang-format to preserve the brackets as
'attribute' can be an expression and we need it to be evaluated
first.
Similarly we need the entire result to be evaluated independent of
the adjoining code.
This happens because old key is purged by one zone view, then the other
is freaking out about it.
Keys that are unused or being purged should not be taken into account
when verifying key files are available.
The keyring is maintained per zone. So in one zone, a key in the
keyring is being purged. The corresponding key file is removed.
The key maintenance is done for the other zone view. The key in that
keyring is not yet set to purge, but its corresponding key file is
removed. This leads to "some keys are missing" log errors.
We should not check the purge variable at this point, but the
current time and purge-keys duration.
This commit fixes this erroneous logic.
Use the existing RSASHA256 and RSASHA512 implementation to provide
working PRIVATEOID example implementations. We are using the OID
values normally associated with RSASHA256 (1.2.840.113549.1.1.11)
and RSASHA512 (1.2.840.113549.1.1.13).
Add support for proposed DS digest types that encode the private
algorithm identifier at the start of the DS digest as is done for
DNSKEY and RRSIG. This allows a DS record to identify the specific
DNSSEC algorithm, rather than a set of algorithms, when the algorithm
field is set to PRIVATEDNS or PRIVATEOID.
- dns_zone_cdscheck() has been extended to extract the key algorithms
from DNSKEY data when the CDS algorithm is PRIVATEOID or PRIVATEDNS.
- dns_zone_signwithkey() has been extended to support signing with
PRIVATEDNS and PRIVATEOID algorithms. The signing record (type 65534)
added at the zone apex to indicate the current state of automatic zone
signing can now contain an additional two-byte field for the DST
algorithm value, when the DNS secalg value isn't enough information.
dns_resolver_algorithm_supported() has been extended so in addition to
an algorithm number, it can also take a pointer to an RRSIG signature
field in which key information is encoded.
DST algorithm and DNSSEC algorithm values are not necessarily the same
anymore: if the DNSSEC algorithm value is PRIVATEOID or PRIVATEDNS, then
the DST algorithm will be mapped to something else. The conversion is
now done correctly where necessary.
The algorithm values PRIVATEDNS and PRIVATEOID are placeholders,
signifying that the actual algorithm identifier is encoded into the
key data. Keys using this mechanism are now supported.
- The algorithm values PRIVATEDNS and PRIVATEOID cannot be used to
build a key file name; dst_key_buildfilename() will assert if
they are used.
- The DST key values for private algorithms are higher than 255.
Since DST_ALG_MAXALG now exceeds 256, algorithm arrays that were
previously hardcoded to size 256 have been resized.
- New mnemonic/text conversion functions have been added.
dst_algorithm_{fromtext,totext,format} can handle algorithm
identifiers encoded in PRIVATEDNS and PRIVATEOID keys, as well
as the traditional algorithm identifiers. (Note: The existing
dns_secalg_{fromtext,totext,format} functions are similar, but
do *not* support PRIVATEDNS and PRIVATEOID. In most cases, the
new functions have taken the place of the old ones, but in a few
cases the old version is still appropriate.)
- dns_private{oid,dns}_{fromtext,totext,format} converts between
DST algorithm values and the mnemonic strings for algorithms
implemented using PRIVATEDNS or PRIVATEOID. (E.g., "RSASHA256OID").
- dst_algorithm_tosecalg() returns the DNSSEC algorithm identifier
that applies for a given DST algorithm. For PRIVATEDNS- or
PRIVATEOID- based algorithms, the result will be PRIVATEDNS or
PRIVATEOID, respectively.
- dst_algorithm_fromprivatedns() and dst_algorithm_fromprivateoid()
return the DST algorithm identifier for an encoded algorithm in
wire format, represented as in DNS name or an object identifier,
respectively.
- dst_algorithm_fromdata() is a front-end for the above; it extracts
the private algorithm identifier encoded at the begining of a
block of key or signature data, and returns the matching DST
algorithm number.
- dst_key_fromdns() and dst_key_frombuffer() now work with keys
that have PRIVATEDNS and PRIVATEOID algorithm identifiers at the
beginning.
The header file dns/rdatastruct.h was not being rebuilt when the
rdata type header files where modified.
Removed proforma.c from the list. It is a starting point for new
types.
The "keyopts" field of the dns_zone object was added to support
"auto-dnssec"; at that time the "options" field already had most of
its 32 bits in use by other flags, so it made sense to add a new
field.
Since then, "options" has been widened to 64 bits, and "auto-dnssec"
has been obsoleted and removed. Most of the DNS_ZONEKEY flags are no
longer needed. The one that still seems useful (_FULLSIGN) has been
moved into DNS_ZONEOPT and the rest have been removed, along with
"keyopts" and its setter/getter functions.
Meson is a modern build system that has seen a rise in adoption and some
version of it is available in almost every platform supported.
Compared to automake, meson has the following advantages:
* Meson provides a significant boost to the build and configuration time
by better exploiting parallelism.
* Meson is subjectively considered to be better in readability.
These merits alone justify experimenting with meson as a way of
improving development time and ergonomics. However, there are some
compromises to ensure the transition goes relatively smooth:
* The system tests currently rely on various files within the source
directory. Changing this requirement is a non-trivial task that can't
be currently justified. Currently the last compiled build directory
writes into the source tree which is in turn used by pytest.
* The minimum version supported has been fixed at 0.61. Increasing this
value will require choosing a baseline of distributions that can
package with meson. On the contrary, there will likely be an attempt
to decrease this value to ensure almost universal support for building
BIND 9 with meson.
If the name is fully lowercase, we don't need to access the case bitmap
in order to set the case. Therefore, we can check for the FULLYLOWERCASE
flag using only atomic operations, and skip a lock in the hot path,
provided we clear the FULLYLOWERCASE flag before changing the case
bitmap.
The cache for unreachable primaries was added to BIND 9 in 2006 via
1372e172d0e0b08996376b782a9041d1e3542489. It features a 10-slot LRU
array with 600 seconds (10 minutes) fixed delay. During this time, any
primary with a hiccup would be blocked for the whole block duration
(unless overwritten by a different entry).
As this design is not very flexible (i.e. the fixed delay and the fixed
amount of the slots), redesign it based on the badcache.c module, which
was implemented earlier for a similar mechanism.
The differences between the new code and the badcache module were large
enough to create a new module instead of trying to make the badcache
module universal, which could complicate the implementation.
The new design implements an exponential backoff for entries which are
added again soon after expiring, i.e. the next expiration happens in
double the amount of time of the previous expiration, but in no more
time than the defined maximum value.
The initial and the maximum expiration values are hard-coded, but, if
required, it should be trivial to implement configurable knobs.
Special tokens can now be specified in a zone "file" option
in order to generate the filename parametrically. The first
instead of "$name" in the "file" option is replaced with the
zone origin, the first instance of "$type" is replaced with the
zone type (i.e., primary, secondary, etc), and the first instance
of "$view" is replaced with the view name..
This simplifies the creation of zones using initial-file templates.
For example:
$ rndc addzone <zonename> \
{ type primary; file "$name.db"; initial-file "template.db"
When loading a primary zone for the first time, if the zonefile
does not exist but an "initial-file" option has been set, then a
new file will be copied into place from the path specified by
"initial-file".
This can be used to simplify the process of adding new zones. For
instance, a template zonefile could be used by running:
$ rndc addzone example.com \
'{ type primary; file "example.db"; initial-file "template.db"; };'
Since 70b1777d8aef75da1b184fe8155dc818ce66628a was commited the OSS-Fuzz
build was broken because the `chunk_get_raw()` was not updated in the
`FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION`-enabled area. Add the `size`
argument to the fuzzing version of the `chunk_get_raw()` function.
Instead of giving the memory pools names with an explicit call to
isc_mempool_setname(), add the name to isc_mempool_create() call to have
all the memory pools an unconditional name.
Instead of giving the memory context names with an explicit call to
isc_mem_setname(), add the name to isc_mem_create() call to have all the
memory contexts an unconditional name.
The memory context for isc_managers and dst_api units had no name and
that was causing trouble with the statistics channel output. Set the
name for the two memory context that were missing a proper name.
After b171cacf4f0123ba96bef6eedfc92dfb608db6b7, a zone object can
remain in the memory for a while, until garbage collection is run.
Setting the DNS_ZONEFLG_EXITING flag should prevent the zone
maintenance function from running while it's in that state.
Otherwise, a secondary zone could initiate a zone transfer after
it had been deleted.
When request manager shuts down, it also shuts down all its ongoing
requests. Currently it calls their callback functions with a
ISC_R_SHUTTINGDOWN result code for the request. Since a request
manager can shutdown not only during named shutdown but also during
named reconfiguration, instead of sending ISC_R_SHUTTINGDOWN result
code send a ISC_R_CANCELED code to avoid confusion and errors with
the expectation that a ISC_R_SHUTTINGDOWN result code can only be
received during actual shutdown of named.
All the callback functions which are passed to either the
dns_request_create() or the dns_request_createraw() functions have
been analyzed to confirm that they can process both the
ISC_R_SHUTTINGDOWN and ISC_R_CANCELED result codes. Changes were
made where it was necessary.
When the zone.c:refresh_callback() callback function is called during
a SOA request before a zone transfer, it can receive a
ISC_R_SHUTTINGDOWN result for the sent request when named is shutting
down, and in that case it just destroys the request and finishes the
ongoing transfer, without clearing the DNS_ZONEFLG_REFRESH flag of the
zone. This is alright when named is going to shutdown, but currently
the callback can get a ISC_R_SHUTTINGDOWN result also when named is
reconfigured during the ongoibg SOA request. In that case, leaving the
DNS_ZONEFLG_REFRESH flag set results in the zone never being able
to refresh again, because any new attempts will be caneled while
the flag is set. Clear the DNS_ZONEFLG_REFRESH flag on the 'exiting'
error path of the callback function.
replace the pattern `for (result = dns_rdataset_first(x); result ==
ISC_R_SUCCES; result = dns_rdataset_next(x)` with a new
`DNS_RDATASET_FOREACH` macro throughout BIND.
the import_rdataset() function can't return any value other
than ISC_R_SUCCESS, so it's been changed to void and its callers
don't rely on its return value any longer.
the comments for some calls in the dns_message API specified
requirements which were not actually enforced in the functions.
in most cases, this has now been corrected by adding the missing
REQUIREs. in one case, the comment was incorrect and has been
revised.
previously, ISC_LIST_FOREACH and ISC_LIST_FOREACH_SAFE were
two separate macros, with the _SAFE version allowing entries
to be unlinked during the loop. ISC_LIST_FOREACH is now also
safe, and the separate _SAFE macro has been removed.
similarly, the ISC_LIST_FOREACH_REV macro is now safe, and
ISC_LIST_FOREACH_REV_SAFE has also been removed.
Before implementing adaptive chunk sizing, it was necessary to ensure
that a chunk could hold up to 48 twigs, but the new logic will size-up
new chunks to ensure that the current allocation can succeed.
We exploit the new logic in two ways:
- We make the minimum chunk size smaller than the old limit of 2^6,
reducing memory consumption.
- We make the maximum chunk size larger, as it has been observed that
it improves resolver performance.
qp-tries allocate their nodes (twigs) in chunks to reduce allocator
pressure and improve memory locality. The choice of chunk size presents
a tradeoff: larger chunks benefit qp-tries with many values (as seen
in large zones and resolvers) but waste memory in smaller use cases.
Previously, our fixed chunk size of 2^10 twigs meant that even an
empty qp-trie would consume 12KB of memory, while reducing this size
would negatively impact resolver performance.
This commit implements an adaptive chunking strategy that:
- Tracks the size of the most recently allocated chunk.
- Doubles the chunk size for each new allocation until reaching a
predefined maximum.
This approach effectively balances memory efficiency for small tries
while maintaining the performance benefits of larger chunk sizes for
bigger data structures.
This commit also splits the callback freeing qpmultis into two
phases, one that frees the underlying qptree, and one that reclaims
the qpmulti memory. In order to prevent races between the qpmulti
destructor and chunk garbage collection jobs, the second phase is
protected by reference counting.
The `max-rsa-exponent-size` could limit the exponents of the RSA
public keys during the DNSSEC verification. Instead of providing
a cryptic (not cryptographic) knob, hardcode the max exponent to
be 4096 (the theoretical maximum for DNSSEC).
The DST API has been cleaned up, duplicate functions has been squashed
into single call (verify and verify2 functions), and couple of unused
functions have been completely removed (createctx2, computesecret,
paramcompare, and cleanup).
This new option sets the delay, in seconds, to wait before sending
a set of NOTIFY messages for a zone. Whenever a NOTIFY message is
ready to be sent, sending will be deferred for this duration.
In a previous change, the "algorithm" value passed to
dns_tsigkey_create() was changed from a DNS name to an integer;
the name was then chosen from a table of known algorithms. A
side effect of this change was that a query using an unknown TSIG
algorithm was no longer handled correctly, and could trigger an
assertion failure. This has been corrected.
The dns_tsigkey struct now stores the signing algorithm
as dst_algorithm_t value 'alg' instead of as a dns_name,
but retains an 'algname' field, which is used only when the
algorithm is DST_ALG_UNKNOWN. This allows the name of the
unrecognized algorithm name to be returned in a BADKEY
response.
Previously all kinds of TCP timeouts had a single getter and setter
functions. Separate each timeout to its own getter/setter functions,
because in majority of cases only one is required at a time, and it's
not optimal expanding those functions every time a new timeout value
is implemented.
The checkds_send_toaddr() function uses hardcoded timeout values
for both UDP and TCP, however, with TCP named has configurable
timeout values. Slightly refactor the timeouts calculation part
and use the configured 'tcp-initial-timeout' value as the connect
timeout.