Since October 2019 I have had complaints from `dnssec-cds` reporting
that the signatures on some of my test zones had expired. These were
zones signed by BIND 9.15 or 9.17, with a DNSKEY TTL of 24h and
`sig-validity-interval 10 8`.
This is the same setup we have used for our production zones since
2015, which is intended to re-sign the zones every 2 days, keeping
at least 8 days signature validity. The SOA expire interval is 7
days, so even in the presence of zone transfer problems, no-one
should ever see expired signatures. (These timers are a bit too
tight to be completely correct, because I should have increased
the expiry timers when I increased the DNSKEY TTLs from 1h to 24h.
But that should only matter when zone transfers are broken, which
was not the case for the error reports that led to this patch.)
For example, this morning my test zone contained:
dev.dns.cam.ac.uk. 86400 IN RRSIG DNSKEY 13 5 86400 (
20200701221418 20200621213022 ...)
But one of my resolvers had cached:
dev.dns.cam.ac.uk. 21424 IN RRSIG DNSKEY 13 5 86400 (
20200622063022 20200612061136 ...)
This TTL was captured at 20200622105807 so the resolver cached the
RRset 64976 seconds previously (18h02m56s), at 20200621165511
only about 12h before expiry.
The other symptom of this error was incorrect `resign` times in
the output from `rndc zonestatus`.
For example, I have configured a test zone
zone fast.dotat.at {
file "../u/z/fast.dotat.at";
type primary;
auto-dnssec maintain;
sig-validity-interval 500 499;
};
The zone is reset to a minimal zone containing only SOA and NS
records, and when `named` starts it loads and signs the zone. After
that, `rndc zonestatus` reports:
next resign node: fast.dotat.at/NS
next resign time: Fri, 28 May 2021 12:48:47 GMT
The resign time should be within the next 24h, but instead it is
near the signature expiry time, which the RRSIG(NS) says is
20210618074847. (Note 499 hours is a bit more than 20 days.)
May/June 2021 is less than 500 days from now because expiry time
jitter is applied to the NS records.
Using this test I bisected this bug to 09990672d which contained a
mistake leading to the resigning interval always being calculated in
hours, when days are expected.
This bug only occurs for configurations that use the two-argument form
of `sig-validity-interval`.
When we're shutting the system down via "rndc stop" or "rndc halt",
or reconfiguring the control channel, there are potential shutdown
races between the server task and network manager. These are adressed by:
- purging any pending command tasks when shutting down the control channel
- adding an extra handle reference before the command handler to
ensure the handle can't be deleted out from under us before calling
command_respond()
- using an isc_task to execute all rndc functions makes it relatively
simple for them to acquire task exclusive mode when needed
- control_recvmessage() has been separated into two functions,
control_recvmessage() and control_respond(). the respond function
can be called immediately from control_recvmessage() when processing
a nonce, or it can be called after returning from the task event
that ran the rndc command function.
- updated libisccc to use netmgr events
- updated rndc to use isc_nm_tcpconnect() to establish connections
- updated control channel to use isc_nm_listentcp()
open issues:
- the control channel timeout was previously 60 seconds, but it is now
overridden by the TCP idle timeout setting, which defaults to 30
seconds. we should add a function that sets the timeout value for
a specific listener socket, instead of always using the global value
set in the netmgr. (for the moment, since 30 seconds is a reasonable
timeout for the control channel, I'm not prioritizing this.)
- the netmgr currently has no support for UNIX-domain sockets; until
this is addressed, it will not be possible to configure rndc to use
them. we will need to either fix this or document the change in
behavior.
When "rndc reconfig" is run, named first configures a fresh set of views
and then tears down the old views. Consider what happens for a single
view with LMDB enabled; "envA" is the pointer to the LMDB environment
used by the original/old version of the view, "envB" is the pointer to
the same LMDB environment used by the new version of that view:
1. mdb_env_open(envA) is called when the view is first created.
2. "rndc reconfig" is called.
3. mdb_env_open(envB) is called for the new instance of the view.
4. mdb_env_close(envA) is called for the old instance of the view.
This seems to have worked so far. However, an upstream change [1] in
LMDB which will be part of its 0.9.26 release prevents the above
sequence of calls from working as intended because the locktable mutexes
will now get destroyed by the mdb_env_close() call in step 4 above,
causing any subsequent mdb_txn_begin() calls to fail (because all of the
above steps are happening within a single named process).
Preventing the above scenario from happening would require either
redesigning the way we use LMDB in BIND, which is not something we can
easily backport, or redesigning the way BIND carries out its
reconfiguration process, which would be an even more severe change.
To work around the problem, set MDB_NOLOCK when calling mdb_env_open()
to stop LMDB from controlling concurrent access to the database and do
the necessary locking in named instead. Reuse the view->new_zone_lock
mutex for this purpose to prevent the need for modifying struct dns_view
(which would necessitate library API version bumps). Drop use of
MDB_NOTLS as it is made redundant by MDB_NOLOCK: MDB_NOTLS only affects
where LMDB reader locktable slots are stored while MDB_NOLOCK prevents
the reader locktable from being used altogether.
[1] 2fd44e3251
BUFSIZ (512 bytes on Windows) may not be enough to fit the status of a
DNSSEC policy and three DNSSEC keys.
Set the size of the relevant buffer to a hardcoded value of 4096 bytes,
which should be enough for most scenarios.
it is now an error to have two primaries lists with the same
name. this is true regardless of whether the "primaries" or
"masters" keywords were used to define them.
as "type primary" is preferred over "type master" now, it makes
sense to make "primaries" available as a synonym too.
added a correctness check to ensure "primaries" and "masters"
cannot both be used in the same zone.
Due to lack of synchronization, whenever named was being requested to
stop using rndc, controlconf.c module could be trying to access an already
released pointer through named_g_server->interfacemgr in a separate
thread.
The race could only be triggered if named was being shutdown and more
rndc connections were ocurring at the same time.
This fix correctly checks if the server is shutting down before opening
a new rndc connection.
Implement the 'rndc dnssec -status' command that will output
some information about the key states, such as which policy is
used for the zone, what keys are in use, and when rollover is
scheduled.
Add loose testing in the kasp system test, the actual times are
already tested via key file inspection.
Add the code and documentation required to provide DNSSEC signing
status through rndc. This does not yet show any useful information,
just provide the command that will output some dummy string.
enough buffer space. Also change named_os_uname() prototype so
that it is now returning (const char *) rather than (char *). If
uname() is not supported on a UNIX build prepopulate unamebuf[]
with "unknown architecture".
these keywords were added to the parser as synonyms for "master"
and "slave" but were never hooked in to the configuration of named,
so they were ignored. this has been fixed and the option is now
checked for correctness.
- clone keynode->dsset rather than return a pointer so that thread
use is independent of each other.
- hold a reference to the dsset (keynode) so it can't be deleted
while in use.
- create a new keynode when removing DS records so that dangling
pointers to the deleted records will not occur.
- use a rwlock when accessing the rdatalist to prevent instabilities
when DS records are added.
Make various adjustments necessary to enable "make dist" to build a BIND
source tarball whose contents are complete enough to build binaries, run
unit & system tests, and generate documentation on Unix systems.
Known outstanding issues:
- "make distcheck" does not work yet.
- Tests do not work for out-of-tree source-tarball-based builds.
- Source tarballs are not complete enough for building on Windows.
All of the above will be addressed in due course.
Originally, the default value for max-stale-ttl was 1 week, which could
and in some scenarios lead to cache exhaustion on a busy resolvers.
Picking the default value will always be juggling between value that's
useful (e.g. keeping the already cached records after they have already
expired and the upstream name servers are down) and not bloating the
cache too much (e.g. keeping everything for a very long time). The new
default reflects what we think is a reasonable to time to react on both
sides (upstream authoritative and downstream recursive).
The ARM and the manpages have been converted into Sphinx documentation
format.
Sphinx uses reStructuredText as its markup language, and many of its
strengths come from the power and straightforwardness of
reStructuredText and its parsing and translating suite, the Docutils.
The introduction of netmgr doubled the number of threads from which
dnstap data may be logged: previously, it could only happen from within
taskmgr worker threads; with netmgr, it can happen both from taskmgr
worker threads and from network threads. Since the argument passed to
fstrm_iothr_options_set_num_input_queues() was not updated to reflect
this change, some calls to fstrm_iothr_get_input_queue() can now return
NULL, effectively preventing some dnstap data from being logged.
Whether this bug is triggered or not depends on thread scheduling order
and packet distribution between network threads, but will almost
certainly be triggered on any recursive resolver sooner or later. Fix
by requesting the correct number of dnstap input queues to be allocated.
named_os_openfile was being called with switch_user set to true
unconditionally leading to log messages about being unable to
switch user identity from named when regenerating the key.