an assertion could be triggered in the QPDB cache if a DNAME
was found above a queried NS, because the 'foundname' value was
not correctly updated to point to the zone cut.
the same mistake existed in qpzone and has been fixed there as well.
If there are no more previous leaves, it means the queried name
precedes the entire range of names in the database, so we should just
move the iterator one step back and return, instead of continuing our
search for the predecessor.
This is similar to a bug fixed in an earlier commit:
ea9a8cb392ff59438a911485742b220d40f24d6f
every node of a QP database contains a copy of the nodename,
which is used as the key for the QP-trie. previously, the name
was stored as a dns_fixedname object, which has room for up to
255 characters. we can reduce the space consumed by dynamically
allocating a dns_name object that's just long enough for the name
to be stored.
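
A rough sketch of the space saving, using simplified stand-in types rather
than the real dns_name and dns_fixedname structures. The type and function
names below (fixed_node_t, sized_node_t, sized_node_create, sized_node_size)
and the use of a flexible array member are assumptions for illustration only;
the actual change allocates a dns_name object.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_NAME_LEN 255 /* worst-case name length */

/* Old layout (simplified): every node reserves the worst case. */
typedef struct {
	unsigned char name[MAX_NAME_LEN];
	size_t namelen;
} fixed_node_t;

/* New layout (simplified): the name storage is sized to fit. */
typedef struct {
	size_t namelen;
	unsigned char name[]; /* flexible array member */
} sized_node_t;

/* Bytes consumed by a node whose name is 'len' bytes long. */
static size_t
sized_node_size(size_t len) {
	return sizeof(sized_node_t) + len;
}

/* Allocate a node holding exactly 'len' bytes of name data. */
static sized_node_t *
sized_node_create(const unsigned char *name, size_t len) {
	assert(name != NULL && len > 0 && len <= MAX_NAME_LEN);
	sized_node_t *node = malloc(sized_node_size(len));
	if (node == NULL) {
		return NULL;
	}
	node->namelen = len;
	memcpy(node->name, name, len);
	return node;
}
```

For a short name, the node shrinks from a fixed 255-byte buffer to a header
plus the bytes actually needed.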
dns_qpiter_next() was called without checking the return value. If
we cannot move the iterator forward, there is no use in calling the
step() function.
/lib/dns/qpzone.c: 2804 in activeempty()
2798 * of the name we were searching for. Step the iterator
2799 * forward, then step() will continue forward until it
2800 * finds a node with active data. If that node is a
2801 * subdomain of the one we were looking for, then we're
2802 * at an active empty nonterminal node.
2803 */
>>> CID 487882: Error handling issues (CHECKED_RETURN)
>>> Calling "dns_qpiter_next" without checking return value (as is done elsewhere 26 out of 27 times).
2804 dns_qpiter_next(it, NULL, NULL, NULL);
2805 return (step(search, it, FORWARD, next) &&
2806 dns_name_issubdomain(next, current));
2807 }
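
The shape of the fix can be sketched with a toy iterator. Everything here
(iter_t, iter_next, step, active_after, and the array of "active" flags) is a
hypothetical stand-in, not the qpzone code; the point is only that step() is
skipped when the iterator could not be advanced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { R_SUCCESS, R_NOMORE } result_t;

/* Hypothetical iterator over an array; true means "node has active data". */
typedef struct {
	const bool *active;
	size_t count;
	size_t pos;
} iter_t;

/* Move forward one node; R_NOMORE if there is nothing after 'pos'. */
static result_t
iter_next(iter_t *it) {
	assert(it != NULL);
	if (it->pos + 1 >= it->count) {
		return R_NOMORE;
	}
	it->pos++;
	return R_SUCCESS;
}

/* Starting at the current node, move forward until active data is found. */
static bool
step(iter_t *it) {
	do {
		if (it->active[it->pos]) {
			return true;
		}
	} while (iter_next(it) == R_SUCCESS);
	return false;
}

/* Is there a node with active data strictly after the current one? */
static bool
active_after(iter_t *it) {
	if (iter_next(it) != R_SUCCESS) {
		/* Could not move forward: calling step() would be useless. */
		return false;
	}
	return step(it);
}
```

Without the check, step() would start from the node the iterator was already
on and could report it as a later node.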
dns_db_addrdataset() enforces a requirement that version can only
be NULL for a cache database. code that checks for zone semantics
and version == NULL can never be reached.
qpzone does not support cache semantics, so dns_db_addrdataset(),
_deleterdataset() and _subtractrdataset() can't be run with
version == NULL; there's no need to check for it.
we can also clean up free_qpdb() a bit since current_version
is always non-NULL.
The Depends relation refers to types of rollovers in which a certain
record type is going to be swapped. Specifically, the Depends relation
says there should be no dependency on the predecessor key (the set
Dep(x, T) must be empty).
But if the key is phased out (all its states are in HIDDEN), there is
no longer a dependency. Since the relationship is still maintained
(Predecessor and Successor metadata), the keymgr_dep function still
returned true. In other words, the set Dep(x, T) is not considered
empty.
This slows down key rollovers, only retiring keys when the successor
key has been fully propagated.
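
A heavily simplified sketch of the corrected rule, assuming a key record with
four state slots and a successor id. The types and helpers (simple_key_t,
key_phased_out, key_has_dependency) are hypothetical and do not mirror the
real keymgr_dep() signature; only the state names come from the key manager.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { HIDDEN, RUMOURED, OMNIPRESENT, UNRETENTIVE } keystate_t;

#define NUM_STATES 4 /* e.g. DNSKEY, ZRRSIG, KRRSIG, DS */

/* Hypothetical, simplified key record. */
typedef struct {
	unsigned int id;
	unsigned int successor; /* 0 if no Successor metadata is set */
	keystate_t states[NUM_STATES];
} simple_key_t;

/* A key is phased out once every one of its states is HIDDEN. */
static bool
key_phased_out(const simple_key_t *key) {
	assert(key != NULL);
	for (size_t i = 0; i < NUM_STATES; i++) {
		if (key->states[i] != HIDDEN) {
			return false;
		}
	}
	return true;
}

/* Does a successor still depend on this (predecessor) key? */
static bool
key_has_dependency(const simple_key_t *key) {
	if (key->successor == 0) {
		return false; /* no relation recorded at all */
	}
	/* The relation metadata remains, but a phased-out key no longer counts. */
	return !key_phased_out(key);
}
```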
When there is a secure chain of trust with a KSK that is not actively
signing the DNSKEY RRset, the code for validating the DNSKEY RRset
against the DS RRset could potentially skip DS records, thinking the
chain of trust is broken while there is a valid DS with corresponding
DNSKEY record present.
This is because we pass the result ISC_R_NOMORE on when we are done
checking for signatures, but then treat it as "no more DS records".
Changing the return value to something else (DNS_R_NOVALIDSIG seems the
most appropriate here) fixes the issue.
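
The confusion between the two meanings of "no more" can be sketched as below.
This is a toy model, not the validator code: ds_t, check_sigs, validate, and
the 'exhausted' parameter (which selects the code returned when signatures
run out) are all assumptions made so both behaviours can be compared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { R_SUCCESS, R_NOMORE, R_NOVALIDSIG } result_t;

/* Hypothetical DS record: does a signature by the matching key validate? */
typedef struct {
	bool sig_valid;
} ds_t;

/*
 * Check the signatures for the key matching one DS record. 'exhausted' is
 * the result handed back when we run out of signatures without a match.
 */
static result_t
check_sigs(const ds_t *ds, result_t exhausted) {
	assert(ds != NULL);
	return ds->sig_valid ? R_SUCCESS : exhausted;
}

/* Walk the DS records until one of them leads to a valid signature. */
static result_t
validate(const ds_t *dslist, size_t n, result_t exhausted) {
	for (size_t i = 0; i < n; i++) {
		result_t result = check_sigs(&dslist[i], exhausted);
		if (result == R_SUCCESS) {
			return R_SUCCESS;
		}
		if (result == R_NOMORE) {
			break; /* interpreted as "no more DS records" */
		}
		/* Any other failure: try the next DS record. */
	}
	return R_NOVALIDSIG;
}
```

With R_NOMORE as the exhausted code, the outer loop stops at the first DS
whose key has no valid signature; with a distinct code, it carries on to the
remaining DS records.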
the code in qpdb.c was previously shared by qp-cachedb.c and
qp-zonedb.c. since qp-zonedb.c no longer exists, it's not necessary
to keep these separate any longer. the two files have been merged,
and functions that were previously globally accessible have been
changed to static and renamed.
now that "qpzone" databases are available for use in zones, we no
longer need to retain the zone semantics in the "qp" database.
all zone-specific code has been removed from QPDB, and "configure
--with-zonedb" once again takes two values, rbt and qp.
some database API methods that are never used with a cache have
been removed from qpdb.c and qp-cachedb.c; these include newversion,
closeversion, subtractrdataset, and nodefullname.
because dns_qpmulti_commit() can be time consuming, it's inefficient
to open and commit a qpmulti transaction for each rdataset being loaded
into a database. we can improve load time by opening a qpmulti
transaction before adding a group of rdatasets and then committing it
afterward.
this commit adds 'setup' and 'commit' functions to dns_rdatacallbacks_t,
which can be called before and after the loops in which 'add' is
called in dns_master_load() and axfr_apply().
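
A minimal sketch of the batching pattern, assuming a toy database that only
counts commits. The types and functions here (db_t, callbacks_t, db_setup,
db_add, db_commit, load_batch) are hypothetical; callbacks_t is only loosely
modelled on dns_rdatacallbacks_t and does not reproduce its real fields.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a database that counts transactions. */
typedef struct {
	int commits; /* number of (expensive) commits performed */
	int records; /* number of records added */
	int open;    /* nonzero while a transaction is open */
} db_t;

/* Hypothetical callback table with optional setup/commit hooks. */
typedef struct {
	void (*setup)(db_t *db);
	void (*add)(db_t *db, int record);
	void (*commit)(db_t *db);
} callbacks_t;

static void
db_setup(db_t *db) {
	assert(!db->open);
	db->open = 1;
}

static void
db_add(db_t *db, int record) {
	(void)record;
	assert(db->open); /* adds must happen inside a transaction */
	db->records++;
}

static void
db_commit(db_t *db) {
	assert(db->open);
	db->open = 0;
	db->commits++;
}

/* Load a group of records inside a single transaction. */
static void
load_batch(db_t *db, const callbacks_t *cb, const int *records, size_t n) {
	if (cb->setup != NULL) {
		cb->setup(db);
	}
	for (size_t i = 0; i < n; i++) {
		cb->add(db, records[i]);
	}
	if (cb->commit != NULL) {
		cb->commit(db);
	}
}
```

Loading four records this way costs one commit rather than four.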
when copying the non-dnssec records in receive_secure_db(),
use DNS_DB_NONSEC3 so we don't accidentally create nodes in
the main tree for NSEC3 records. this was a long-standing error
in the code, but was harmless in the RBTDB.
QP database node data is not reference counted the same way RBT nodes
were: in the RBT, node->references could be zero if the node was in the
tree but was not in use by any caller, whereas in the QP trie, the
database itself uses reference counting of nodes internally.
this caused some subtle errors. in RBTDB, when the newref() function was
called and the node reference count was zero, the node lock reference
counter would also be incremented. in the QP trie, this can never
happen - because as long as the node is in the database its reference
count cannot be zero - and so the node lock reference counter was never
incremented.
this has been addressed by maintaining a separate "erefs" counter for
external references to the node. this is the same approach used in the
"qpdb-lite" database in commit e91fbd8dea.
while troubleshooting this issue, some compile errors were discovered
when building with DNS_DB_NODETRACE; those have also been fixed.
use the dns_qpmulti-based "qpzone" by default throughout BIND,
instead of the existing dns_qp-based "qp", when creating zone
databases. (cache databases still use "qp".)
the "--with-zonedb" option has been updated in configure.ac to permit
the use of both "qp" and "qpzone" databases.
in zone.c there was a test that prevented any database type other than
"qp" from hosting an RPZ. this was outdated, and has been removed.
previously, an RCU critical section was held open for the duration
of a snapshot. this should not be necessary, as the snapshot makes
local copies of QP trie metadata, and it causes problems when a
DB iterator is held open between two loop events. we now call
rcu_read_unlock() after setting up the snapshot.
finish importing the database API methods from RBTDB to qpzone:
issecure, nodecount, getnsec3parameters, findnsec3node, setsigningtime,
getsigningtime, getsize, setgluecachestats, locknode, unlocknode, and
addglue.
add database API methods needed to apply updates to an existing zone
database (newversion, addrdataset, subtractrdataset and deleterdataset).
it is now possible to apply journals to zone databases after loading, so
named-checkzone -J works correctly.
add database API method implementations needed to iterate and dump
a qpzone database to a file (createiterator, allrdatasets and
attachversion, plus dbiterator and rdatasetiter methods).
named-checkzone -D can now dump the contents of most zones,
but zone cuts are not correctly detected.
add database API methods needed for loading rdatasets into memory
(currentversion, beginload, endload), plus the methods used by
zone_postload() for zone consistency checks (getoriginnode, find,
findnode, findrdataset, attachnode, detachnode, deletedata).
the QP trie doesn't support the find callback mechanism available
in dns_rbt_findnode() which allows examination of intermediate nodes
while searching, so the detection of wildcard and delegation nodes
is now done by scanning QP chains after calling dns_qp_lookup().
Note that the lookup in previous_closest_nsec() cannot return
ISC_R_NOTFOUND. In RBTDB, we checked for this return value and
overwrote the result with ISC_R_NOMORE if it occurred. In the
qpzone implementation, we insist that this return value cannot happen.
dns_qp_lookup() would only return ISC_R_NOTFOUND if we asked for a
name outside the zone's authoritative domain, and we never do that
when looking up a predecessor NSEC record.
named-checkzone is now able to load a zone and check it for errors,
but cannot dump it.
The dns_cache_flush() function drops the old database and creates a new
one, but it forgets to pass on the loop that runs the node pruning and
cleaning of the rbtdb when flushing it next time. This causes the cleaning
to skip the parent nodes (with .down == NULL), leading to increased
memory usage over time until the database is unable to keep up and just
stays overmem all the time.
Reconstruct the variant of the prune_tree() parent cleaning to consider
all eligible parents in a single loop as we were doing before all the
changes that led to this commit.
Update code comments so that they more precisely describe what the
relevant bits of code actually do.
by default, QPDB is the database used by named and all tools and
unit tests. the old default of RBTDB can now be restored by using
"configure --with-zonedb=rbt --with-cachedb=rbt".
some tests have been fixed so they will work correctly with either
database.
CHANGES and release notes have been updated to reflect this change.
When running the resolver benchmark pipeline, a crash occurred:
https://gitlab.isc.org/isc-projects/bind9-shotgun-ci/-/pipelines/163946
In the code we do a lookup, which fails (meaning there is no node
with the lookup name), so we create the node and insert it, and the
insertion fails. But dns_qp_insert can only return ISC_R_SUCCESS or
ISC_R_EXISTS, so the node must have been inserted in between. This is
a race condition bug.
The first lookup only requires a read lock; if the lookup fails,
the lock gets upgraded to a write lock and we insert the missing data.
To fix the race condition bug, we need to do a lookup again after we
have upgraded the lock to make sure it wasn't inserted in the mean
time.
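
The pattern can be sketched without real locks. In this toy model the lock
operations are shown as comments, and an optional 'on_upgrade' hook simulates
another thread running while the lock is being upgraded. table_t and all
functions here are hypothetical stand-ins for the QP trie and its API.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { R_SUCCESS, R_EXISTS, R_NOTFOUND } result_t;

#define TABLE_MAX 8

/* Hypothetical stand-in for the trie: a small unordered key table. */
typedef struct {
	int keys[TABLE_MAX];
	size_t count;
} table_t;

static result_t
table_lookup(const table_t *t, int key) {
	for (size_t i = 0; i < t->count; i++) {
		if (t->keys[i] == key) {
			return R_SUCCESS;
		}
	}
	return R_NOTFOUND;
}

/* Like the insert described above: only R_SUCCESS or R_EXISTS. */
static result_t
table_insert(table_t *t, int key) {
	if (table_lookup(t, key) == R_SUCCESS) {
		return R_EXISTS;
	}
	assert(t->count < TABLE_MAX);
	t->keys[t->count++] = key;
	return R_SUCCESS;
}

/* Simulates another thread inserting the same key during the upgrade. */
static void
racing_insert(table_t *t, int key) {
	(void)table_insert(t, key);
}

static result_t
find_or_create(table_t *t, int key, void (*on_upgrade)(table_t *, int)) {
	/* (read lock held) */
	if (table_lookup(t, key) == R_SUCCESS) {
		return R_SUCCESS;
	}
	/* (upgrade to write lock; other threads may get in first) */
	if (on_upgrade != NULL) {
		on_upgrade(t, key);
	}
	/* (write lock held) look again before inserting */
	if (table_lookup(t, key) == R_SUCCESS) {
		return R_SUCCESS;
	}
	result_t result = table_insert(t, key);
	assert(result == R_SUCCESS); /* cannot be R_EXISTS any more */
	return result;
}
```

The second lookup is what makes the final assertion safe: any insert that
slipped in during the upgrade is noticed before we try to insert ourselves.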
the dyndb test requires a mechanism to retrieve the name associated
with a database node, and since the database no longer uses RBT for
its underlying storage, dns_rbt_fullnamefromnode() doesn't work.
this has been addressed by adding dns_db_nodefullname() to the database API.
If the iterator is paused, the tree is unlocked and may change.
In an RBT tree it's always possible to resume iteration as long
as a valid node pointer was still held, but now that the underlying
database structure is a QP trie, the iterator needs to be initialized
based on the existing structure of the trie or it will return
inconsistent results. We now call dns_qp_lookup() to reinitialize
the QP iterator whenever dbiterator_next() or dbiterator_prev() is
called on a paused iterator.
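
A sketch of why the re-seek is needed, using a sorted array of strings in
place of the trie and a linear scan in place of dns_qp_lookup(). The types
and functions (tree_t, dbiter_t, iter_pause, iter_resume, iter_next) are
hypothetical, and plain strcmp() ordering stands in for DNS canonical order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the trie: a sorted array of names. */
typedef struct {
	const char **names;
	size_t count;
} tree_t;

typedef struct {
	const tree_t *tree;
	size_t pos;          /* only trustworthy while not paused */
	bool paused;
	const char *current; /* name the iterator was on when paused */
} dbiter_t;

/* Remember the current name; the tree may change after this. */
static void
iter_pause(dbiter_t *it) {
	assert(it->pos < it->tree->count);
	it->current = it->tree->names[it->pos];
	it->paused = true;
}

/*
 * Reposition by name rather than by stale position. Returns true if the
 * remembered name is still present; otherwise 'pos' is left on its successor.
 */
static bool
iter_resume(dbiter_t *it) {
	const tree_t *t = it->tree;
	size_t i = 0;
	assert(it->paused && it->current != NULL);
	while (i < t->count && strcmp(t->names[i], it->current) < 0) {
		i++;
	}
	it->pos = i;
	it->paused = false;
	return i < t->count && strcmp(t->names[i], it->current) == 0;
}

/* Return the name following the current one, or NULL at the end. */
static const char *
iter_next(dbiter_t *it) {
	const tree_t *t = it->tree;
	if (it->paused && !iter_resume(it)) {
		/* Old name is gone; we are already on its successor. */
		return it->pos < t->count ? t->names[it->pos] : NULL;
	}
	if (it->pos + 1 >= t->count) {
		return NULL;
	}
	return t->names[++it->pos];
}
```

If names are added or removed while the iterator is paused, a saved array
position would point at the wrong entry; looking the name up again keeps the
iteration order consistent.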
QP database node data is not reference counted the same way RBT nodes
were: in the RBT, node->references could be zero if the node was in the
tree but was not in use by any caller, whereas in the QP trie, the
database itself uses reference counting of nodes internally.
this caused some subtle errors. in RBTDB, when the newref() function was
called and the node reference count was zero, the node lock reference
counter would also be incremented. in the QP trie, this can never
happen - because as long as the node is in the database its reference
count cannot be zero - and so the node lock reference counter was never
incremented.
reference counting will probably need to be refactored in more detail
later; the node lock reference count may not be needed at all. but
for now, as a temporary measure, we add a third reference counter,
'erefs' (external references), to the dns_qpdata structure. this is
counted separately from the main reference counter, and should match
the node reference count as it would have been in RBTDB.
this change revealed a number of places where the node reference counter
was being incremented on behalf of a caller without newref() being
called; those were cleaned up as well.
This is an adaptation of commit 3dd686261d2c4bcd15a96ebfea10baffa277732b
Fix reference counting: unreference nodes that are successfully inserted
in the tree, detach created nodes, and cleanup the interior data in
dns_qpdata_destroy().
The name will be stored inside the node now so we can just copy it.
These are leftovers, most of the namefromnode code has been replaced
already in previous commits.
The qp approach pulled apart the chain and iterator into two separate
things. Replace the rbtnodechain with qpchain and qpiter. Most of the
time we are interested in the iterator only; the rbtnodechain was
mainly used as an iterator to get the previous and next name in the
DNS canonical order.
Since dns_qpiter_prev() and dns_qpiter_next() store the name, origin,
and node in the provided parameters, often there is no need to call
a current() function anymore.
Getting the first or last item from the iterator is done by
re-initializing the iterator and then calling dns_qpiter_next() or
dns_qpiter_prev() respectively.
The dbiterator no longer needs to maintain a chain, only an iterator.
All dns_qp_lookup() calls assume it is okay to find empty data, so
we don't need to do anything special for the DNS_RBTFIND_EMPTYDATA.
You can pass a callback function to dns_rbt_findnode(), something that
qp does not support. Instead, call the function afterwards. This has
the drawback that we do more lookup work if there was a zonecut.
With dns_qp_lookup() we also don't pass any options. In this case,
when DNS_RBTFIND_NOEXACT was set, we adapt the result after the lookup.
Replace dns_rbt_deletenode calls with dns_qp_deletename. For removing
the name from the nsec tree, we no longer first have to find it: we can
just remove the key (retrieved by name).
Replace dns_rbt_addnode calls with dns_qp_insert. With QP, it sometimes
makes more sense to first look up the name and see if there is an
existing node (rather than create new data, insert, find out a node
already exists, and destroy the data again). This is done with
dns_qp_getname(), which is more lightweight than dns_qp_lookup(),
since we are only interested in whether there is already a leaf node
for this name.