- there are now two functions for getting rdataslab size:
dns_rdataslab_size() is for full slabs and dns_rdataslab_sizeraw()
for raw slabs. there is no longer a need for a reservelen parameter.
- dns_rdataslab_count() also no longer takes a reservelen parameter.
(currently it's never used for raw slabs, so there is no _countraw()
function.)
- dns_rdataslab_rdatasize() has been removed, because
dns_rdataslab_sizeraw() can do the same thing.
- dns_rdataslab_merge() and dns_rdataslab_subtract() both take
slabheader parameters instead of character buffers, and the
reservelen parameter has been removed.
if both rdataslabs being compared have zero length, return true.
also, since these functions are only ever called on slabheaders
with sizeof(dns_slabheader_t) as the reserve length, we can
simplify the API: remove the reservelen argument, and pass the
slabs as type dns_slabheader_t * instead of unsigned char *.
The dns_slabheader object uses the 'next' pointer for two purposes.
In the first header for any given type, 'next' points to the first
header for the next type. But 'down' points to the next header of
the same type, and in that record, 'next' points back up.
This design made the code confusing to read. We now use a union
so that the 'next' pointer can also be called 'up'.
Database versions are not used in cache databases. Some places in
qpcache.c required the version argument to be NULL; others marked it
as UNUSED. Unify all cases to require version to be NULL.
Add new related_headers() function that simplifies the code
flow in qpcache_findrdataset(). Also use check_stale_header() function
to remove code duplication.
Add a helper function both_headers() that unifies the slabheader
matching for simple type: it returns true when both the type and
the matching RRSIG have been found.
The new maybe_update_headers() function unifies the LRU updates to the
slabheaders that was scattered all over the place. More calls to update
headers after bindrdatasets() were also added for completeness.
This removes code duplication between the dual bindrdataset() calls. It
also unifies the handling as there were small differences between the
calls: one variant was checking for !NEGATIVE(found) condition and one
wasn't, and it is technically ok to do the check for all variants.
The check_stale_header() function now updates header_prev directly
so it doesn't have to be handled in the outer loop; it's always
set to the correct value of the previous header in the chain.
some code was left in the cache database implementation after
it was separated from the zone database, and can be cleaned up
and refactored now:
- the DNS_SLABHEADERATTR_IGNORE flag is never set in the cache
- support for loading the cache from was removed, but the add()
function still had a 'loading' flag that's always false
- two different macros were used for checking the
DNS_SLABHEADERATTR_NONEXISTENT flag - EXISTS() and NONEXISTENT().
it's clearer to just use EXISTS().
- the cache doesn't support versions, so it isn't necessary to
walk down the 'down' pointer chain when iterating through the
cache or looking for a header to update. 'down' now only points
to records that are deleted from the cache but have not yet been
purged from memory. this allows us to simplify both the iterator
and the add() function.
in some places there were checks for failures of dns_qp_insert()
after dns_qp_getname(). such failures could only happen if another
thread inserted a node between the two calls, and that can't happen
because the calls are serialized with dns_qpmulti_write(). we can
simplify the code and just add an INSIST.
When converting SVCB records to text representation use named
SvcParamKeys values unless backward-compatible mode is activated,
in which case the values which were not defined initially in
RFC9460 and were added later (see [1]) are converted to opaque
"keyN" syntax, like, for example, "key7" instead of "dohpath".
[1] https://www.iana.org/assignments/dns-svcb/dns-svcb.xhtml
Co-authored-by: sdomi <ja@sdomi.pl>
If a deferred validation on data that was originally queried with
CD=1 fails, we now repeat the query, since the zone data may have
changed in the meantime.
When a query is made with CD=1, we store the result in the
cache marked pending so that it can be validated later, at
which time it will either be accepted as an answer or removed
from the cache as invalid. Deferred validation was not
attempted when there were no cached RRSIGs for DNSKEY and
DS. We now complete the deferred validation in this scenario.
prio_type was being used in the wrong place to optimize cname_and_other.
We have to first exclude and accepted types and we also have to
determine that the record exists before we can check if we are at
a point where a later CNAME cannot appear.
Instead of using on hash of the name modulo number of the buckets,
assign the locknum randomly with isc_random_uniform(). This makes
the locknum assignment aligned with qpcache and allows the bucket
number to be non-prime in the future.
Reduce the number of qpzone_ref() and qpzone_unref() calls in
qpzone_detachnode() by relying on the call_rcu to delay
the destruction of the lock buckets.
Instead of having many node_lock_count * sizeof(<member>) arrays, pack
all the members into a qpzone_bucket_t that is cacheline aligned and have
a single array of those.
Instead of having many node_lock_count * sizeof(<member>) arrays, pack
all the members into a qpcache_bucket_t struct that is cacheline aligned
and have a single array of those.
Additionaly, make both the head and the tail of isc_queue_t padded, not
just the head, to prevent false sharing of the lock-free structure with
the lock that follows it.
In #1870, the expiration time of ANCIENT records were printed, but
actually the ancient records are very short lived, and the information
carries a little value.
Instead of printing the expiration of ANCIENT records, print the
expiration time of STALE records.
The original .ttl field was actually used as TTL in the dns_qpzone unit.
Restore the field by adding it to union with the .expire struct member
and cleanup all the code that added or subtracted 'now' from the ttl
field as that was misleading as 'now' would be always 0 for qpzone
database.
The find_coveringnsec() was getting the 'now' from two sources -
search->now and separate now argument. Things like this are ticking
bombs, remove the extra 'now' argument and use single source of 'now'.
When the mark_ancient() helper function was introduced, couple of places
with duplicate (or almost duplicate) code was missed. Move the
mark_ancient() function closer to the top of the file, and correctly use
it in places that mark the header as ANCIENT.
If we know that the header has ZEROTTL set, the server should never send
stale records for it and the TTL should never be anything else than 0.
The comment was already there, but the code was not matching the
comment.
The old name was misleading as it never meant time-to-live, e.g. number
of seconds from now when the header should expire. The true meaning was
an expiration time e.g. now + ttl. This was the original design bug
that caused the slip when we assigned header->ttl to rdataset->ttl.
Because the name was matching, nobody has questioned the correctness of
the code both during the MR review and during the numerous re-reviews
when we were searching for the cause of the 54 year TTL.
When the header has been marked as ANCIENT, but the ttl hasn't been
reset (this happens in couple of places), the rdataset TTL would be
set to the header timestamp instead to a reasonable TTL value.
Since this header has been already expired (ANCIENT is set), set the
rdataset TTL to 0 and don't reuse this field to print the expiration
time when dumping the cache. Instead of printing the time, we now
just print 'expired (awaiting cleanup'.
When there are parent and child zones on the same server, the DNSKEY
lookup was failing as the pending record we are validating is needed
to fetch the DNSKEY records. This change allows that to happen.
The caller is already setting STARTATZONE when the name being looked
up is a subdomain of the current domain.
The address lookups from ADB were not being validated, allowing
spoofed responses to be accepted and used for other lookups.
Validate the answers except when CD=1 is set in the triggering
request. Separate ADB names looked up with CD=1 from those without
CD=1, to prevent the use of unvalidated answers in the normal lookup
case (CD=0). Set the TTL on unvalidated (pending) responses to
ADB_CACHE_MINIMUM when adding them to the ADB.
the search for the deepest known zone cut in the cache could
improperly reject a node containing stale data, even if the
NS rdataset wasn't the data that was stale.
this change also improves the efficiency of the search by
stopping it when both NS and RRSIG(NS) have been found.
Change the names of the node reference counting functions
and add comments to make the mechanism easier to understand:
- newref() and decref() are now called qpcnode_acquire()/
qpznode_acquire() and qpcnode_release()/qpznode_release()
respectively; this reflects the fact that they modify both
the internal and external reference counters for a node.
- qpcnode_newref() and qpznode_newref() are now called
qpcnode_erefs_increment() and qpznode_erefs_increment(), and
qpcnode_decref() and qpznode_decref() are now called
qpcnode_erefs_decrement() and qpznode_erefs_decrement(),
to reflect that they only increase and decrease the node's
external reference counters, not internal.
This removes the db_nodelock_t structure and changes the node_locks
array to be composed only of isc_rwlock_t pointers. The .reference
member has been moved to qpdb->references in addition to
common.references that's external to dns_db API users. The .exiting
members has been completely removed as it has no use when the reference
counting is used correctly.
The origin_node in qpcache was always NULL, so we can remove the
getoriginode() function and origin_node pointer as the
dns_db_getoriginnode() correctly returns ISC_R_NOTFOUND when the
function is not implemented.
Cleanup the pattern in the decref() functions in both qpcache.c and
qpzone.c, so it follows the similar patter as we already have in
newref() function.
In order to avoid to loop to find the next position to store an EDE in
a dns_edectx_t, add a "nextede" state which holds the next available
position.
Also, in order ot avoid to loop to find if an EDE is already existing in
a dns_edectx_t, and avoid a duplicate, use a bitmap to immediately know
if the EDE is there or not.
Those both changes applies for adding or copying EDE.
Also make the direction of dns_ede_copy more explicit/avoid errors by
making "edectx_from" a const pointer.
Migrate tests cases in client_test code which were exclusively testing
code which is now all wrapped inside ede compilation unit. Those are
testing maximum number of EDE, duplicate EDE as well as truncation of
text of an EDE.
Also add coverage for the copy of EDE from an edectx to another one, as
well as checking the assertion of the maximum EDE info code which can be
used.
Instead of mixing the dns_resolver and dns_validator units directly with
the EDE code, split-out the dns_ede functionality into own separate
compilation unit and hide the implementation details behind abstraction.
Additionally, the EDE codes are directly copied into the ns_client
buffers by passing the EDE context to dns_resolver_createfetch().
This makes the dns_ede implementation simpler to use, although sligtly
more complicated on the inside.
Co-authored-by: Colin Vidal <colin@isc.org>
Co-authored-by: Ondřej Surý <ondrej@isc.org>
Extended DNS error 22 (No reachable authority) was previously detected
when `fctx_expired` fired. It turns out this function is used as a
"safety net" and the timeout detection should be caught earlier.
It was working though, because of another issue fixed by !9927. Since
this change, the recursive request timed out detection occurs before
`fctx_expired` so EDE 22 is not added to the response message anymore.
The fix of the problem is to add the EDE 22 code in two situations:
- When the dispatch code timed out (rctx_timedout) the resolver code
checks various properties to figure out if it needs to make another
fetch attempt. One of the paramters if the fetch expiration time. If
it expires, the whole recursion is canceled, so it now adds the EDE 22
code.
- If the fetch expiration time doesn't expires in the case above (and
other parameters allows it) a new fetch attempt is made (fctx_query).
But before the new request is actually made, the fetch expiration time
is re-checked. It might then has elapsed, and the whole recursion is
canceled. So it now also adds the EDE 22 code here as well.
Add support for EDE codes 1 (Unsupported DNSKEY Algorithm) and 2
(Unsupported DS Digest Type) which might occurs during DNSSEC
validation in case of unsupported DNSKEY algorithm or DS digest type.
Because DNSSEC internally kicks off various fetches, we need to copy
all encountered extended errors from fetch responses to the fetch
context. Upon an event, the errors from the fetch context are copied
to the client response.
the isc_mem allocation functions can no longer fail; as a result,
ISC_R_NOMEMORY is now rarely used: only when an external library
such as libjson-c or libfstrm could return NULL. (even in
these cases, arguably we should assert rather than returning
ISC_R_NOMEMORY.)
code and comments that mentioned ISC_R_NOMEMORY have been
cleaned up, and the following functions have been changed to
type void, since (in most cases) the only value they could
return was ISC_R_SUCCESS:
- dns_dns64_create()
- dns_dyndb_create()
- dns_ipkeylist_resize()
- dns_kasp_create()
- dns_kasp_key_create()
- dns_keystore_create()
- dns_order_create()
- dns_order_add()
- dns_peerlist_new()
- dns_tkeyctx_create()
- dns_view_create()
- dns_zone_setorigin()
- dns_zone_setfile()
- dns_zone_setstream()
- dns_zone_getdbtype()
- dns_zone_setjournal()
- dns_zone_setkeydirectory()
- isc_lex_openstream()
- isc_portset_create()
- isc_symtab_create()
(the exception is dns_view_create(), which could have returned
other error codes in the event of a crypto library failure when
calling isc_file_sanitize(), but that should be a RUNTIME_CHECK
anyway.)
Track inside the dns_dnsseckey structure whether we have seen the
private key, or if this key only has a public key file.
If the key only has a public key file, or a DNSKEY reference in the
zone, mark the key 'pubkey'. In dnssec-signzone, if the key only
has a public key available, consider the key to be offline. Any
signatures that should be refreshed for which the key is not available,
retain the signature.
So in the code, 'expired' becomes 'refresh', and the new 'expired'
is only used to determine whether we need to keep the signature if
the corresponding key is not available (retaining the signature if
it is not expired).
In the 'keysthatsigned' function, we can remove:
- key->force_publish = false;
- key->force_sign = false;
because they are redundant ('dns_dnsseckey_create' already sets these
values to false).