With adding this state to the statistics channel, it can now show
the zone transfer in this state instead of as "Pending" when the
zone.c module is performing a refresh SOA request, before actually
starting the transfer process. This will help to understand
whether the process is waiting because of the rate limiter (i.e.
"Pending"), or the rate limiter is passed and it is now waiting for
the refresh SOA query to complete or time out.
Add a new field in the incoming zone transfers section of the
statistics channel to show the transport used for the SOA request.
When the transfer is started beginning from the XFRST_SOAQUERY state,
it means that the SOA query will be performed by xfrin itself, using
the same transport. Otherwise, it means that the SOA query was already
performed by other means (e.g. by zone.c:soa_query()), and, in that
case, we use the SOA query transport type information passed by the
'soa_transport_type' argument, when the xfrin object was created.
Refactor isc_hashmap to allow custom matching functions. This allows us
to have better tailored keys that don't require fixed uint8_t arrays,
but can be composed of more fields from the stored data structure.
The 'result' variable should be reset to ISC_R_NOTFOUND again,
because otherwise a log message could be logged about not being
able to get the TLS configuration based on on the 'result' value
from the previous calls to get the TSIG key.
For secondary, mirror and redirect zones the expiry time is set
from the zone file's modification time on restart. As zone dumping
take time, set the modification time of the zone file to the expire
time less the expire interval.
The SET_IF_NOT_NULL() macro avoids a fair amount of tedious boilerplate,
checking pointer parameters to see if they're non-NULL and updating
them if they are. The macro was already in the dns_zone unit, and this
commit moves it to the <isc/util.h> header.
I have included a Coccinelle semantic patch to use SET_IF_NOT_NULL()
where appropriate. The patch needs an #include in `openssl_shim.c`
in order to work.
The updatenotify mechanism in dns_db relied on unlocked ISC_LIST for
adding and removing the "listeners". The mechanism relied on the
exclusive mode - it should have been updated only during reconfiguration
of the server. This turned not to be true anymore in the dns_catz - the
updatenotify list could have been updated during offloaded work as the
offloaded threads are not subject to the exclusive mode.
Change the update_listeners to be cds_lfht (lock-free hash-table), and
slightly refactor how register and unregister the callbacks - the calls
are now idempotent (the register call already was and the return value
of the unregister function was mostly ignored by the callers).
When dns_request was canceled via dns_requestmgr_shutdown() the cancel
event would be propagated on different loop (loop 0) than the loop where
request was created on. In turn this would propagate down to isc_netmgr
where we require all the events to be called from the matching isc_loop.
Pin the dns_requests to the loops and ensure that all the events are
called on the associated loop. This in turn allows us to remove the
hashed locks on the requests and change the single .requests list to be
a per-loop list for the request accounting.
Additionally, do some extra cleanup because some race condititions are
now not possible as all events on the dns_request are serialized.
When stub_glue_response() is called, the associated data is stored in
newly allocated struct stub_glue_request. The allocated structure is
never freed in the callback, thus we leak a little bit of memory.
The stub_request_nameserver_address() used 'request' as name for
struct stub_glue_request leading to confusion between 'request'
(stub_glue_request) and 'request->request' (dns_request_t).
Unify the name to 'sgr' already used in struct stub_glue_response().
The isc_stats_create() can no longer return anything else than
ISC_R_SUCCESS. Refactor isc_stats_create() and its variants in libdns,
libns and named to just return void.
1. Change the _new, _add and _copy functions to return the new object
instead of returning 'void' (or always ISC_R_SUCCESS)
2. Cleanup the isc_ht_find() + isc_ht_add() usage - the code is always
locked with catzs->lock (mutex), so when isc_ht_find() returns
ISC_R_NOTFOUND, the isc_ht_add() must always succeed.
3. Instead of returning direct iterator for the catalog zone entries,
add dns_catz_zone_for_each_entry2() function that calls callback
for each catalog zone entry and passes two extra arguments to the
callback. This will allow changing the internal storage for the
catalog zone entries.
4. Cleanup the naming - dns_catz_<fn>_<obj> -> dns_catz_<obj>_<fn>, as an
example dns_catz_new_zone() gets renamed to dns_catz_zone_new().
These two configuration options worked in conjunction with 'auto-dnssec'
to determine KSK usage, and thus are now obsoleted.
However, in the code we keep KSK processing so that when a zone is
reconfigured from using 'dnssec-policy' immediately to 'none' (without
going through 'insecure'), the zone is not immediately made bogus.
Add one more test case for going straight to none, now with a dynamic
zone (no inline-signing).
Since external DNSKEY records may exist in the unsigned version of the
zone (for example DNSKEY records from other providers), handle these
RRsets also when copying non DNSSEC records from the unsigned zone
database to the signed version.
to reduce the amount of common code that will need to be shared
between the separated cache and zone database implementations,
clean up unused portions of dns_db.
the methods dns_db_dump(), dns_db_isdnssec(), dns_db_printnode(),
dns_db_resigned(), dns_db_expirenode() and dns_db_overmem() were
either never called or were only implemented as nonoperational stub
functions: they have now been removed.
dns_db_nodefullname() was only used in one place, which turned out
to be unnecessary, so it has also been removed.
dns_db_ispersistent() and dns_db_transfernode() are used, but only
the default implementation in db.c was ever actually called. since
they were never overridden by database methods, there's no need to
retain methods for them.
in rbtdb.c, beginload() and endload() methods are no longer defined for
the cache database, because that was never used (except in a few unit
tests which can easily be modified to use the zone implementation
instead). issecure() is also no longer defined for the cache database,
as the cache is always insecure and the default implementation of
dns_db_issecure() returns false.
for similar reasons, hashsize() is no longer defined for zone databases.
implementation functions that are shared between zone and cache are now
prepended with 'dns__rbtdb_' so they can become nonstatic.
serve_stale_ttl is now a common member of dns_db.
The dns_zone_catz_enable_db() and dns_zone_catz_disable_db()
functions can race with similar operations in the catz module
because there is no synchronization between the threads.
Add catz functions which use the view's catalog zones' lock
when registering/unregistering the database update notify callback,
and use those functions in the dns_zone module, instead of doing it
directly.
view->adb may be referenced while the view is shutting down as the
zone uses a weak reference to the view and examines view->adb but
dns_view_detach call dns_adb_detach to clear view->adb.
since it is not necessary to find partial matches when looking
up names in a TSIG keyring, we can use a hash table instead of
an RBT to store them.
the tsigkey object now stores the key name as a dns_fixedname
rather than allocating memory for it.
the `name` parameter to dns_tsigkeyring_add() has been removed;
it was unneeded since the tsigkey object already contains a copy
of the name.
the opportunistic cleanup_ring() function has been removed;
it was only slowing down lookups.
This commit ensures that access to the TLS context cache within zone
manager is properly synchronised.
Previously there was a possibility for it to get unexpectedly
NULLified for a brief moment by a call to
dns_zonemgr_set_tlsctx_cache() from one thread, while being accessed
from another (e.g. from got_transfer_quota()). This behaviour could
lead to server abort()ing on configuration reload (under very rare
circumstances).
That behaviour has been fixed.
The following code block repeats quite often:
if (rdata.type == dns_rdatatype_dnskey ||
rdata.type == dns_rdatatype_cdnskey ||
rdata.type == dns_rdatatype_cds)
Introduce a new function to reduce the repetition.
The raw zone is not supposed to be signed. DNSKEY records in a raw zone
should not trigger zone signing. The update code needs to be able to
identify when it is working on a raw zone. Add dns_zone_israw() and
dns_zone_issecure() enable it to do this. Also, we need to check the
case for 'auto-dnssec maintain'.
For inline-signing zones, sometimes kasp was not detected because
the function was called on the raw (unsigned) version of the zone,
but the kasp is only set on the secure (signed) version of the zone.
Fix the dns_zone_getkasp() function to check whether the zone
structure is inline_raw(), and if so, use the kasp from the
secure version.
In zone.c we can access the kasp pointer directly.
When synchronizing the journal or database from the unsigned version of
the zone to the secure version of the zone, allow DNSKEY records to be
synced, because these may be added by the user with the sole intent to
publish the record (not used for signing). This may be the case for
example in the multisigner model 2 (RFC 8901).
Additional code needs to be added to ensure that we do not remove DNSKEY
records that are under our control. Keys under our control are keys that
are used for signing the zone and thus that we have key files for.
Same counts for CDNSKEY and CDS (records that are derived from keys).
Shutdown and cleanup of zones is more asynchronous with the qp-trie
zone table. As a result it's possible that some activity is delayed
until after a zone has been released from its zonemanager.
Previously, the dns_zone code was not very strict in the way it
refers to the loop it is running on: The loop pointer was stashed when
dns_zonemgr_managezone() was called and never cleared. Now, zones
properly attach to and detach from their loops.
The zone timer depends on its loop. The shutdown crashes occurred
when asynchronous calls tried to modify the zone timer after
dns_zonemgr_releasezone() has been called and the loop was
invalidated. In these cases the attempt to set the timer is now
ignored, with a debug log message.
The zone_resigninc() function does not check the validity of
'zone->db', which can crash named if the zone was unloaded earlier,
for example with "rndc delete".
Check that 'zone->db' is not 'NULL' before attaching to it, like
it is done in zone_sign() and zone_nsec3chain() functions, which
can similarly be called by zone maintenance.
When dns_request_create() failed in notify_send_toaddr(), sending the
notify would silently fail. When notify_done() failed, the error would
be logged on the DEBUG(2) level.
This commit remedies the situation by:
* Promoting several messages related to notifies to INFO level and add
a "success" log message at the INFO level
* Adding a TCP fallback - when sending the notify over UDP fails, named
will retry sending notify over TCP and log the information on the
NOTICE level
* When sending the notify over TCP fails, it will be logged on the
WARNING level
Closes: #4001, #4002
If the zone forwards are canceled from dns_zonemgr_shutdown(), the
forward_cancel() would get called from the main loop, which is wrong.
It needs to be called from the matching zone->loop.
Run the dns_request_cancel() via isc_async_run() on the loop associated
with the zone instead of calling the dns_request_cancel() directly from
the main loop.
This change makes the zone table lock-free for reads. Previously, the
zone table used a red-black tree, which is not thread safe, so the hot
read path acquired both the per-view mutex and the per-zonetable
rwlock. (The double locking was to fix to cleanup races on shutdown.)
One visible difference is that zones are not necessarily shut down
promptly: it depends on when the qp-trie garbage collector cleans up
the zone table. The `catz` system test checks several times that zones
have been deleted; the test now checks for zones to be removed from
the server configuration, instead of being fully shut down. The catz
test does not churn through enough zones to trigger a gc, so the zones
are not fully detached until the server exits.
After this change, it is still possible to improve the way we handle
changes to the zone table, for instance, batching changes, or better
compaction heuristics.
use the ISC_REFCOUNT implementation for dns_zone_attach() and
_detach(). (this applies only to external zone references, not
to dns_zone_iattach() and dns_zone_idetach().)
use dns_zone_ref() where previously a dummy zone object had been
used to increment the reference count.
Implement the new feature, automatic parental-agents. This is enabled
with 'checkds yes'.
When set to 'yes', instead of querying the explicit configured
parental agents, look up the parental agents by resolving the parent
NS records. The found parent NS RRset is considered to be the list
of parental agents that should be queried during a KSK rollover,
looking up the DS RRset corresponding to the key signing keys.
For each NS record, look up the addresses in the ADB. These addresses
will be used to send the DS requests. Count the number of servers and
keep track of how many good DS responses were seen.
Add a new configuration option to set how the checkds method should
work. Acceptable values are 'yes', 'no', and 'explicit'.
When set to 'yes', the checkds method is to lookup the parental agents
by querying the NS records of the parent zone.
When set to 'no', no checkds method is enabled. Users should run
the 'rndc checkds' command to signal that DS records are published and
withdrawn.
When set to 'explicit', the parental agents are explicitly configured
with the 'parental-agents' configuration option.