DLV is long gone, so we can remove design documentation around DLV,
related command line options (that were already a hard failure),
and some DLV related test remnants.
QPDB is now a default implementation for both cache and zone. Remove
the venerable RBTDB database implementation, so we can fast-track the
changes to the database without having to implement the design changes
to both QPDB and RBTDB and this allows us to be more aggressive when
refactoring the database design.
Update the ARM and DNSSEC guide, removing references to 'auto-dnssec',
replacing them with 'dnssec-policy' if needed.
The section "Alternative Ways" of signing has to be refactored, since
we now only focus on one alternative way, that is manual signing.
When shutting down TCP sockets, the read callback calling logic was
flawed, it would call either one less callback or one extra. Fix the
logic in the way:
1. When isc_nm_read() has been called but isc_nm_read_stop() hasn't on
the handle, the read callback will be called with ISC_R_CANCELED to
cancel active reading from the socket/handle.
2. When isc_nm_read() has been called and isc_nm_read_stop() has been
called on the on the handle, the read callback will be called with
ISC_R_SHUTTINGDOWN to signal that the dormant (not-reading) socket
is being shut down.
3. The .reading and .recv_read flags are little bit tricky. The
.reading flag indicates if the outer layer is reading the data (that
would be uv_tcp_t for TCP and isc_nmsocket_t (TCP) for TLSStream),
the .recv_read flag indicates whether somebody is interested in the
data read from the socket.
Usually, you would expect that the .reading should be false when
.recv_read is false, but it gets even more tricky with TLSStream as
the TLS protocol might need to read from the socket even when sending
data.
Fix the usage of the .recv_read and .reading flags in the TLSStream
to their true meaning - which mostly consist of using .recv_read
everywhere and then wrapping isc_nm_read() and isc_nm_read_stop()
with the .reading flag.
4. The TLS failed read helper has been modified to resemble the TCP code
as much as possible, clearing and re-setting the .recv_read flag in
the TCP timeout code has been fixed and .recv_read is now cleared
when isc_nm_read_stop() has been called on the streaming socket.
5. The use of Network Manager in the named_controlconf, isccc_ccmsg, and
isc_httpd units have been greatly simplified due to the improved design.
6. More unit tests for TCP and TLS testing the shutdown conditions have
been added.
Co-authored-by: Ondřej Surý <ondrej@isc.org>
Co-authored-by: Artem Boldariev <artem@isc.org>
The first working multi-threaded qp-trie was stuck with an unpleasant
trade-off:
* Use `isc_rwlock`, which has acceptable write performance, but
terrible read scalability because the qp-trie made all accesses
through a single lock.
* Use `liburcu`, which has great read scalability, but terrible
write performance, because I was relying on `rcu_synchronize()`
which is rather slow. And `liburcu` is LGPL.
To get the best of both worlds, we need our own scalable read side,
which we now have with `isc_qsbr`. And we need to modify the write
side so that it is not blocked by readers.
Better write performance requires an async cleanup function like
`call_rcu()`, instead of the blocking `rcu_synchronize()`. (There
is no blocking cleanup in `isc_qsbr`, because I have concluded
that it would be an attractive nuisance.)
Until now, all my multithreading qp-trie designs have been based
around two versions, read-only and mutable. This is too few to
work with asynchronous cleanup. The bare minimum (as in epoch
based reclamation) is three, but it makes more sense to support an
arbitrary number. Doing multi-version support "properly" makes
fewer assumptions about how safe memory reclamation works, and it
makes snapshots and rollbacks simpler.
To avoid making the memory management even more complicated, I
have introduced a new kind of "packed reader node" to anchor the
root of a version of the trie. This is simpler because it re-uses
the existing chunk lifetime logic - see the discussion under
"packed reader nodes" in `qp_p.h`.
I have also made the chunk lifetime logic simpler. The idea of a
"generation" is gone; instead, chunks are either mutable or
immutable. And the QSBR phase number is used to indicate when a
chunk can be reclaimed.
Instead of the `shared_base` flag (which was basically a one-bit
reference count, with a two version limit) the base array now has a
refcount, which replaces the confusing ad-hoc lifetime logic with
something more familiar and systematic.
A qp-trie is a kind of radix tree that is particularly well-suited to
DNS servers. I invented the qp-trie in 2015, based on Dan Bernstein's
crit-bit trees and Phil Bagwell's HAMT. https://dotat.at/prog/qp/
This code incorporates some new ideas that I prototyped using
NLnet Labs NSD in 2020 (optimizations for DNS names as keys)
and 2021 (custom allocator and garbage collector).
https://dotat.at/cgi/git/nsd.git
The BIND version of my qp-trie code has a number of improvements
compared to the prototype developed for NSD.
* The main omission in the prototype was the very sketchy outline of
how locking might work. Now the locking has been implemented,
using a reader/writer lock and a mutex. However, it is designed to
benefit from liburcu if that is available.
* The prototype was designed for two-version concurrency, one
version for readers and one for the writer. The new code supports
multiversion concurrency, to provide a basis for BIND's dbversion
machinery, so that updates are not blocked by long-running zone
transfers.
* There are now two kinds of transaction that modify the trie: an
`update` aims to support many very small zones without wasting
memory; a `write` avoids unnecessary allocation to help the
performance of many small changes to the cache.
* There is also a single-threaded interface for situations where
concurrent access is not necessary.
* The API makes better use of types to make it more clear which
operations are permitted when.
* The lookup table used to convert a DNS name to a qp-trie key is
now initialized by a run-time constructor instead of a programmer
using copy-and-paste. Key conversion is more flexible, so the
qp-trie can be used with keys other than DNS names.
* There has been much refactoring and re-arranging things to improve
the terminology and order of presentation in the code, and the
internal documentation has been moved from a comment into a file
of its own.
Some of the required functionality has been stripped out, to be
brought back later after the basics are known to work.
* Garbage collector performance statistics are missing.
* Fancy searches are missing, such as longest match and
nearest match.
* Iteration is missing.
* Search for update is missing, for cases where the caller needs to
know if the value object is mutable or not.
DSCP has not been fully working since the network manager was
introduced in 9.16, and has been completely broken since 9.18.
This seems to have caused very few difficulties for anyone,
so we have now marked it as obsolete and removed the
implementation.
To ensure that old config files don't fail, the code to parse
dscp key-value pairs is still present, but a warning is logged
that the feature is obsolete and should not be used. Nothing is
done with configured values, and there is no longer any
range checking.
Previously:
* applications were using isc_app as the base unit for running the
application and signal handling.
* networking was handled in the netmgr layer, which would start a
number of threads, each with a uv_loop event loop.
* task/event handling was done in the isc_task unit, which used
netmgr event loops to run the isc_event calls.
In this refactoring:
* the network manager now uses isc_loop instead of maintaining its
own worker threads and event loops.
* the taskmgr that manages isc_task instances now also uses isc_loopmgr,
and every isc_task runs on a specific isc_loop bound to the specific
thread.
* applications have been updated as necessary to use the new API.
* new ISC_LOOP_TEST macros have been added to enable unit tests to
run isc_loop event loops. unit tests have been updated to use this
where needed.
These notes describe the initial compression design for BIND 9 in
1998/1999, when the IETF had some over-optimistic plans for using EDNS
to change the wire format of domain names. (Another example was
bitstring labels for IPv6 reverse DNS.) By the end of 2000 the EDNS
name compression schemes had been abandoned, and BIND 9's compression
code was rewritten to use a hash table.
There is nothing left of the implementation described here, and the
API functions are better described in `compress.h`, so these notes are
more misleading than helpful. Those who are interested in the past can
look at the version control history.
After removing the isc_task_onshutdown(), the isc_task_shutdown() and
isc_task_destroy() became obsolete.
Remove calls to isc_task_shutdown() and replace the calls to
isc_task_destroy() with isc_task_detach().
Simplify the internal logic to destroy the task when the last reference
is removed.
The isc_task_onshutdown() was used to post event that should be run when
the task is being shutdown. This could happen explicitly in the
isc_test_shutdown() call or implicitly when we detach the last reference
to the task and there are no more events posted on the task.
This whole task onshutdown mechanism just makes things more complicated,
and it's easier to post the "shutdown" events when we are shutting down
explicitly and the existing code already always knows when it should
shutdown the task that's being used to execute the onshutdown events.
Replace the isc_task_onshutdown() calls with explicit calls to execute
the shutdown tasks.
This commit converts the license handling to adhere to the REUSE
specification. It specifically:
1. Adds used licnses to LICENSES/ directory
2. Add "isc" template for adding the copyright boilerplate
3. Changes all source files to include copyright and SPDX license
header, this includes all the C sources, documentation, zone files,
configuration files. There are notes in the doc/dev/copyrights file
on how to add correct headers to the new files.
4. Handle the rest that can't be modified via .reuse/dep5 file. The
binary (or otherwise unmodifiable) files could have license places
next to them in <foo>.license file, but this would lead to cluttered
repository and most of the files handled in the .reuse/dep5 file are
system test files.
Replace some "master/slave" terminology in the code with the preferred
"primary/secondary" keywords. This also changes user output such as
log messages, and fixes a typo ("seconary") in cfg_test.c.
There are still some references to "master" and "slave" for various
reasons:
- The old syntax can still be used as a synonym.
- The master syntax is kept when it refers to master files and formats.
- This commit replaces mainly keywords that are local. If "master" or
"slave" is used in for example a structure that is all over the
place, it is considered out of scope for the moment.
Remove the dynamic registration of result codes. Convert isc_result_t
from unsigned + #defines into 32-bit enum type in grand unified
<isc/result.h> header. Keep the existing values of the result codes
even at the expense of the description and identifier tables being
unnecessary large.
Additionally, add couple of:
switch (result) {
[...]
default:
break;
}
statements where compiler now complains about missing enum values in the
switch statement.
The Windows support has been completely removed from the source tree
and BIND 9 now no longer supports native compilation on Windows.
We might consider reviewing mingw-w64 port if contributed by external
party, but no development efforts will be put into making BIND 9 compile
and run on Windows again.
Add a new option 'purge-keys' to 'dnssec-policy' that will purge key
files for deleted keys. The option determines how long key files
should be retained prior to removing the corresponding files from
disk.
If set to 0, the option is disabled and 'named' will not remove key
files from disk.
With the introduction of dnssec-policy, the aforementioned tools were
either rendered obsolete, or they will be replaced with dnssec-policy
based tools. Remove the tools and the requirement to have Python
installed. Python 3 is still being used for tests, so keep the autoconf
test, but make it much simpler.
The isc_mem API now crashes on memory allocation failure, and this is
the next commit in series to cleanup the code that could fail before,
but cannot fail now, e.g. isc_result_t return type has been changed to
void for the isc_log API functions that could only return ISC_R_SUCCESS.
The keyword 'unlimited' can be used instead of PT0S which means the
same but is more comprehensible for users.
Also fix some redundant "none" parameters in the kasp test.
This commit introduces the initial `dnssec-policy` configuration
statement. It has an initial set of options to deal with signature
and key maintenance.
Add some checks to ensure that dnssec-policy is configured at the
right locations, and that policies referenced to in zone statements
actually exist.
Add some checks that when a user adds the new `dnssec-policy`
configuration, it will no longer contain existing DNSSEC
configuration options. Specifically: `inline-signing`,
`auto-dnssec`, `dnssec-dnskey-kskonly`, `dnssec-secure-to-insecure`,
`update-check-ksk`, `dnssec-update-mode`, `dnskey-sig-validity`,
and `sig-validity-interval`.
Test a good kasp configuration, and some bad configurations.
Previously, the isc_mem_create() and isc_mem_createx() functions took `max_size`
and `target_size` as first two arguments. Those values were never used in the
BIND 9 code. The refactoring removes those arguments and let BIND 9 always use
the default values.
Previously, the isc_mem_create() and isc_mem_createx() functions could have
failed because of failed memory allocation. As this was no longer true and the
functions have always returned ISC_R_SUCCESS, the have been refactored to return
void.
Rather than overloading dns_zone_slave and discerning between a slave
zone and a mirror zone using a zone option, define a separate enum
value, dns_zone_mirror, to be used exclusively by mirror zones. Update
code handling slave zones to ensure it also handles mirror zones where
applicable.
4708. [cleanup] Legacy Windows builds (i.e. for XP and earlier)
are no longer supported. [RT #45186]
4707. [func] The lightweight resolver daemon and library (lwresd
and liblwres) have been removed. [RT #45186]
4706. [func] Code implementing name server query processing has
been moved from bin/named to a new library "libns".
Functions remaining in bin/named are now prefixed
with "named_" rather than "ns_". This will make it
easier to write unit tests for name server code, or
link name server functionality into new tools.
[RT #45186]
3535. [func] Add support for setting Differentiated Services Code
Point (DSCP) values in named. Most configuration
options which take a "port" option (e.g.,
listen-on, forwarders, also-notify, masters,
notify-source, etc) can now also take a "dscp"
option specifying a code point for use with
outgoing traffic, if supported by the underlying
OS. [RT #27596]