mirror of https://gitlab.isc.org/isc-projects/bind9 synced 2025-08-28 21:17:54 +00:00

53 Commits

Author SHA1 Message Date
Ondřej Surý
091d738c72 Convert all categories and modules into static lists
Replace the complicated mechanism that could (in theory) be used by
external libraries to register new categories and modules with
statically defined lists in <isc/log.h>.  This is similar to what we
have done for <isc/result.h> result codes.  All the libraries are now
internal to BIND 9, so we don't need to provide a mechanism to register
extra categories and modules.
2024-08-20 12:50:39 +00:00
Ondřej Surý
8506102216 Remove logging context (isc_log_t) from the public namespace
Now that logging uses a single global context, remove isc_log_t
from the public namespace.
2024-08-20 12:50:39 +00:00
Matthijs Mekking
f882101265 Rewrite qp fix_iterator()
The fix_iterator() function had a lot of bugs in it and while fixing
them, the number of corner cases and the complexity of the function
got out of hand. Rewrite the function with the following modifications:

The function now requires that the iterator is pointing to a leaf node.
This removes the cases we have to deal with when the iterator was left on a
dead branch.

From the leaf node, pop up the iterator stack until we encounter the
branch where the offset point is before the point where the search key
differs. This will bring us to the right branch, or at the first
unmatched node, in which case we pop up to the parent branch. From
there it is easier to retrieve the predecessor.

Once we are at the right branch, all we have to do is find the right
twig (which is either the twig for the character at the position where
the search key differs, or the previous twig) and walk down from there
to the greatest leaf or, in case there is no good twig, get the
previous twig from the successor and get the greatest leaf from there.

If there is no previous twig to select in this branch, because every
leaf from this branch node is greater than the one we wanted, we need
to pop up the stack again and resume at the parent branch. This is
achieved by calling prevleaf().
2024-05-16 09:49:41 +00:00
Matthijs Mekking
8b8c16d7a4 Get anyleaf when qp lookup is on a dead end branch
Move the fix_iterator out of the loop and only call it when we found
a leaf node. This leaf node may be the wrong leaf node, but fix_iterator
should correct that.

Also, when we don't need to set the iterator, just get any leaf. We
only need to have a leaf for the qpkey_compare and the end result does
not matter if compare was against an ancestor leaf or any leaf below
that point.
2024-05-16 09:49:41 +00:00
Evan Hunt
b6815de316 Fix QP chain on partial match
When searching for a requested name in dns_qp_lookup(), we may add
a leaf node to the QP chain, then subsequently determine that the
branch we were on was a dead end. When that happens, the chain can be
left holding a pointer to a node that is *not* an ancestor of the
requested name.

We correct for this by unwinding any chain links with an offset
value greater than or equal to that of the node we found.
2024-05-14 12:58:46 -07:00
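The unwinding step described in b6815de316 can be pictured with a small model. This is only a sketch under stated assumptions, not the code in qp.c: the `sketch_link_t` and `sketch_chain_t` types, the fixed capacity, and the use of an `int` in place of a node pointer are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAXCHAIN 16 /* hypothetical capacity, for illustration only */

/* Hypothetical stand-in for one link in a lookup chain. */
typedef struct sketch_link {
	int node;      /* stand-in for a pointer to a trie node */
	size_t offset; /* key offset at which the node was recorded */
} sketch_link_t;

typedef struct sketch_chain {
	size_t len;
	sketch_link_t links[SKETCH_MAXCHAIN];
} sketch_chain_t;

/* Append a link; the caller must leave room for it. */
static void
sketch_chain_push(sketch_chain_t *chain, int node, size_t offset) {
	assert(chain->len < SKETCH_MAXCHAIN);
	chain->links[chain->len++] = (sketch_link_t){ node, offset };
}

/*
 * Drop trailing links whose offset is greater than or equal to
 * 'offset', so that only links recorded strictly earlier in the
 * key remain in the chain.
 */
static void
sketch_chain_unwind(sketch_chain_t *chain, size_t offset) {
	while (chain->len > 0 &&
	       chain->links[chain->len - 1].offset >= offset)
	{
		chain->len--;
	}
}
```

With links recorded at offsets 1, 4 and 7, unwinding to offset 4 leaves only the link at offset 1.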
Matthijs Mekking
91de4f6490 Refactor fix_iterator
The code below the if/else construction could only be run if the 'if'
code path was taken. Move the code into the 'if' code block so that
it is easier to read.
2024-05-14 12:58:46 -07:00
Evan Hunt
f81bf6bafd handle QP lookups involving escaped characters better
in QP keys, characters that are not common in DNS names are
encoded as two-octet sequences. this caused a glitch in iterator
positioning when some lookups failed.

consider the case where we're searching for "\009" (represented
in a QP key as {0x03, 0x0c}) and a branch exists for "\000"
(represented as {0x03, 0x03}). we match on the 0x03, and continue
to search down. at the point where we find we have no match,
we need to pop back up to the branch before the 0x03 - which may
be multiple levels up the stack - before we position the iterator.
2024-05-01 00:36:51 -07:00
Evan Hunt
237123e500 simplify code by removing return values where possible
fix_iterator() and related functions are quite difficult to read.
perhaps it would be a little clearer if we didn't assign values
to variables that won't subsequently be used, or unnecessarily
pop the stack and then push the same value back onto it.

also, in dns_qp_lookup() we previously called fix_iterator(),
removed the leaf from the top of the iterator stack, and then
added it back on. this would be clearer if we just push the leaf
onto the stack when we need to, but leave the stack alone when
it's already complete.
2024-04-25 10:29:07 -07:00
Evan Hunt
66dbff596b clean up fix_iterator() arguments
the value passed as 'start' was redundant; it's always the same
as the current top of the iterator stack.
2024-04-25 10:29:07 -07:00
Evan Hunt
2dff926624 yet another fix_iterator() bug
under some circumstances it was possible for the iterator to
be set to the first leaf in a set of twigs, when it should have
been set to the last.

a unit test has been added to test this scenario. if there is a
tree containing the following values: {".", "abb.", "abc."}, and
we query for "acb.", previously the iterator would be positioned at
"abb." instead of "abc.".

the tree structure is:
    branch (offset 1, ".")
      branch (offset 3, ".ab")
        leaf (".abb")
        leaf (".abc")

we find the branch with offset 3 (indicating that its twigs differ
from each other in the third position of the label, "abB" vs "abC").
but the search key differs from the found keys at position 2
("aC" vs "aB").  we look up the bit value in position 3 of the
search key ("B"), and incorrectly follow it onto the wrong twig
("abB").

to correct for this, we need to check for the case where the search
key is greater than the found key in a position earlier than the
branch offset. if it is, then we need to pop from the current leaf
to its parent, and get the greatest leaf from there.

a further change is needed to ensure that we don't do this twice;
when we've moved to a new leaf and the point of difference between
it and the search key is even earlier than before, then we're definitely
at a predecessor node and there's no need to continue the loop.
2024-04-25 10:29:07 -07:00
Mark Andrews
bf70d4840c dns_qpkey_toname failed to reset name correctly
This could lead to a mismatch between name->length and the rest
of the name structure.
2024-04-18 00:17:48 +00:00
Matthijs Mekking
77d4bb9751 Fix fix_iterator hang
If there are no more previous leaves, it means the queried name
precedes the entire range of names in the database, so we should just
move the iterator one step back and return, instead of continuing our
search for the predecessor.

This is similar to an earlier bug fixed in an earlier commit:

    ea9a8cb392ff59438a911485742b220d40f24d6f
2024-03-25 10:40:23 +01:00
Evan Hunt
2222728a4f release RCU in dns_qpmulti_snapshot()
previously, an RCU critical section was held open for the duration
of a snapshot. this should not be necessary, as the snapshot makes
local copies of QP trie metadata, and it causes problems when a
DB iterator is held open between two loop events.  we now call
rcu_read_unlock() after setting up the snapshot.
2024-03-08 15:36:56 -08:00
Evan Hunt
ea9a8cb392 prevent an infinite loop in fix_iterator()
it was possible for fix_iterator() to get stuck in a loop while
trying to find the predecessor of a missing node. this has been
fixed and a regression test has been added.
2023-12-21 09:18:30 -08:00
Evan Hunt
84f79cd164 fix_iterator() could produce incoherent iterator stacks
the fix_iterator() function moves an iterator so that it points
to the predecessor of the searched-for name when that name doesn't
exist in the database. the tests only checked the correctness of
the top of the stack, however, and missed some cases where interior
branches in the stack could be missing or duplicated. in these
cases, the iterator would produce inconsistent results when walked.

the predecessors test case in qp_test has been updated to walk
each iterator to the end and ensure that the expected number of
nodes are found.
2023-12-21 09:18:30 -08:00
Matthijs Mekking
21867f200a Refactor getpred code
Move the code to find the predecessor into one function, as the two
code paths are quite similar: in both cases we first need to find the
immediate predecessor/successor, then we need to find the immediate
predecessor if the iterator is not already pointing at it.
2023-12-11 21:01:29 +00:00
Matthijs Mekking
ab8a0c4b5a and fix yet another dns_qp_lookup() iterator bug
This one is similar to the bug when searching for a key, reaching a
dead-end branch that doesn't match, because the branch offset point
is after the point where the search key differs.

This fixes the case where we are multiple levels deep. In other
words, we had more than one match *after* the point where the
search key differs.

For example, consider the following qp-trie:

branch: "[e]", "[m]":
 - leaf: "a.b.c.d.e"
 - branch: "moo[g]", "moo[k]", "moo[n]":
   - leaf: "moog"
   - branch: "mook[e]", "mook[o]"
     - leaf: "mooker"
     - leaf: "mooko"
   - leaf: "moon"

If searching for a key "monky", we would reach the branch with
twigs "moo[k]" and "moo[n]". The key matches on the 'k' on offset=4,
and reaches the branch with twigs "mook[e]" and "mook[o]". This time
we cannot find a twig that matches our key at offset=5, there is no
twig for 'y'. The closest name we found was "mooker".

Note that while on a branch node, the lookup cannot detect that it is on
a dead branch, because the key is not stored in branch nodes.

In the previous code we considered "mooker" to be the successor of
"monky" and so we needed to find the predecessor of "mooker" to find the
predecessor for "monky". However, since the search key already differed
before entering this branch, this is not enough. We would be left with
"moog" as the predecessor of "monky", while in this example "a.b.c.d.e"
is the actual predecessor.

Instead, we need to go up a level, find the predecessor and check
again if we are on the right branch, and repeat the process until we
are.

Unit tests to cover the scenario are now added.
2023-12-11 21:01:29 +00:00
Matthijs Mekking
276bdcf5cf and fix another dns_qp_lookup() iterator bug
There was yet another edge case in which an iterator could be
positioned at the wrong node after dns_qp_lookup(). When searching for
a key, it's possible to reach a leaf that matches at the given offset,
but because the offset point is *after* the point where the search key
differs from the leaf's contents, we are now at the wrong leaf.

In other words, the fix made in the previous commit for dead-end branches
must also be applied to matched leaves.

For example, if searching for the key "monpop", we could reach a branch
containing "moop" and "moor". the branch offset point - i.e., the point
after which the branch's leaves differ from each other - is the
fourth character ("p" or "r"). The search key matches the fourth
character "p", and takes that twig to the next node (which can be
a branch for names starting with "moop", or could be a leaf node for
"moop").

The old code failed to detect this condition, and would have
incorrectly left the iterator pointing at some successor, and not
at the predecessor of "moop".

To find the right predecessor in this case, we need to get to the
previous branch and get the previous from there.

This has been fixed and the unit test now includes several new
scenarios for testing search names that match and unmatch on the
offset but have a different character before the offset.
2023-12-11 21:01:29 +00:00
Evan Hunt
60a33ae6bb fix another dns_qp_lookup() iterator bug
there was another edge case in which an iterator could be positioned at
the wrong node after dns_qp_lookup().  when searching for a key, it's
possible to reach a dead-end branch that doesn't match, because the
branch offset point is *after* the point where the search key differs
from the branch's contents.

for example, if searching for the key "mop", we could reach a branch
containing "moon" and "moor". the branch offset point - i.e., the
point after which the branch's leaves differ from each other - is the
fourth character ("n" or "r"). however, both leaves differ from the
search key at position *three* ("o" or "p"). the old code failed to
detect this condition, and would have incorrectly left the iterator
pointing at some lower value and not at "moor".

this has been fixed and the unit test now includes this scenario.
2023-12-06 11:03:30 -08:00
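The condition at the heart of 60a33ae6bb (and of the two follow-up fixes above it) is that the search key diverges from the stored names *before* the branch offset. A minimal sketch of that check follows. It assumes plain C strings rather than real QP keys and uses 1-based positions to match the wording of the commit message; `sketch_first_diff()` and `sketch_diverged_before()` are hypothetical helpers, not functions from qp.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return the 1-based position of the first character at which 'a'
 * and 'b' differ, or 0 if the two strings are identical.
 */
static size_t
sketch_first_diff(const char *a, const char *b) {
	size_t i = 0;

	assert(a != NULL && b != NULL);
	while (a[i] != '\0' && a[i] == b[i]) {
		i++;
	}
	if (a[i] == b[i]) {
		return 0; /* both strings ended together */
	}
	return i + 1;
}

/*
 * True if 'search' already differs from 'leaf' at a position before
 * 'branch_offset' (also 1-based), i.e. the twig chosen at this
 * branch cannot be trusted to lead toward the search key.
 */
static bool
sketch_diverged_before(const char *search, const char *leaf,
		       size_t branch_offset) {
	size_t diff = sketch_first_diff(search, leaf);

	return diff != 0 && diff < branch_offset;
}
```

For the "mop" example, `sketch_first_diff("mop", "moon")` is 3 while the branch offset is 4, so `sketch_diverged_before("mop", "moon", 4)` is true; the same holds for "monpop" against "moop".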
Evan Hunt
8612902476 fix dns_qp_lookup() iterator bug
in some cases it was possible for the iterator to be positioned in the
wrong place by dns_qp_lookup(). previously, when a leaf node was found
which matched the search key at its parent branch's offset point, but
did not match after that point, the code incorrectly assumed the leaf
it had found was a successor to the searched-for name, and stepped the
iterator back to find a predecessor.  however, it was possible for the
non-matching leaf to be the predecessor, in which case stepping the
iterator back was wrong.

(for example: a branch contains "aba" and "abcd", and we are searching
for "abcde". we step down to the twig matching the letter "c" in
position 3. "abcd" is the predecessor of "abcde", so the iterator is
already correctly positioned, but because the twig was an exact match,
we would have moved it back one step to "aba".)

this previously went unnoticed due to a mistake in the qp_test unit
test, which had the wrong expected result for the test case that should
have detected the error. both the code and the test have been fixed.
2023-12-06 11:03:30 -08:00
Evan Hunt
947bc0a432 add an iterator argument to dns_qp_lookup()
the 'predecessor' argument to dns_qp_lookup() turns out not to
be sufficient for our needs: the predecessor node in a QP database
could have become "empty" (for the current version) because of an
update or because cache data expired, and in that case the caller
would have to iterate more than one step back to find the predecessor
node that it needs.

it may also be necessary for a caller to iterate forward, in
order to determine whether a node has any children.

for both of these reasons, we now replace the 'predecessor'
argument with an 'iter' argument. if set, this points to memory
with enough space for a dns_qpiter object.

when an exact match is found by the lookup, the iterator will be
pointing to the matching node. if not, it will be pointing to the
lexical predecessor of the name that was searched for.

a dns_qpiter_current() method has been added for examining
the current value of the iterator without moving it in either
direction.
2023-12-06 11:03:30 -08:00
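The iterator contract introduced in 947bc0a432 (exact match, else lexical predecessor) can be modelled on a sorted array. This is an illustrative sketch only: it uses strcmp() ordering on plain strings rather than QP key ordering, and `sketch_lookup()` and `SKETCH_NOPRED` are hypothetical names. What the real iterator does when no predecessor exists is not modelled here; the sketch simply returns a sentinel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_NOPRED ((size_t)-1) /* hypothetical "no predecessor" marker */

/*
 * Search 'names' (sorted ascending by strcmp(), 'count' entries) for
 * 'key'. On an exact match, set *exact to true and return its index.
 * Otherwise set *exact to false and return the index of the greatest
 * entry that sorts before 'key', or SKETCH_NOPRED if there is none.
 */
static size_t
sketch_lookup(const char *const *names, size_t count, const char *key,
	      bool *exact) {
	size_t lo = 0, hi = count;

	assert(names != NULL && key != NULL && exact != NULL);

	/* Lower bound: the first index whose entry is not less than 'key'. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		if (strcmp(names[mid], key) < 0) {
			lo = mid + 1;
		} else {
			hi = mid;
		}
	}

	*exact = (lo < count && strcmp(names[lo], key) == 0);
	if (*exact) {
		return lo;
	}
	return (lo == 0) ? SKETCH_NOPRED : lo - 1;
}
```

A model like this can serve as a brute-force oracle in tests: with the names "aba" and "abcd", a search for "abcde" is not exact and lands on "abcd", which is the position the iterator is expected to report.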
Evan Hunt
03183baa6d Prevent a possible race in dns_qpmulti_query() and _snapshot()
The `.reader` member of dns_qpmulti_t was accessed without RCU
protection; reader_open() calls rcu_dereference() on it, and this
call needs to be inside an RCU critical section.

A similar problem was identified in the dns_qpmulti_snapshot() - the
RCU critical section was completely missing.

These are relics of isc_qsbr: in QSBR mode, rcu_read_lock() and
rcu_read_unlock() are no-ops and the whole event loop is a critical section.
2023-10-26 00:32:22 -07:00
Evan Hunt
3a206da456 check chain length is nonzero before examining last entry
It was possible to reach add_link() without visiting an
intermediate node first, and the check for a duplicate entry
could then cause a crash.

Credit to OSS-Fuzz for discovering this error.
2023-10-12 11:31:32 -07:00
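The guard described in 3a206da456, together with the duplicate check mentioned in 8f6a3f47db below, can be sketched as follows. This is a simplified model and not the add_link() in qp.c: the `sketch_nodes_t` type, its capacity, and the boolean return value are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAXLINKS 16 /* hypothetical capacity, for illustration only */

typedef struct sketch_nodes {
	size_t len;
	const void *node[SKETCH_MAXLINKS];
} sketch_nodes_t;

/*
 * Append 'n' to the chain unless it is already the last entry.
 * Returns true if a link was added. The length test comes first so
 * that node[len - 1] is never read when the chain is empty.
 */
static bool
sketch_add_link(sketch_nodes_t *chain, const void *n) {
	if (chain->len != 0 && chain->node[chain->len - 1] == n) {
		return false; /* duplicate of the last link; nothing to do */
	}
	assert(chain->len < SKETCH_MAXLINKS);
	chain->node[chain->len++] = n;
	return true;
}
```

Because a repeated call with the same node is a no-op, a caller can invoke it again "just in case" after detecting a partial match.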
Evan Hunt
8f6a3f47db fix a QP chain bug
depending on how the QP trie is traversed during a lookup, it is
possible for a search to terminate on a leaf which is a partial
match, without that leaf being added to the chain. to ensure the
chain is correct in this case, when a partial match condition is
detected via qpkey_compare(), we will call add_link() again, just
in case.  (add_link() will check for a duplicated node, so it will
be harmless if it was already done.)
2023-10-09 13:29:02 -07:00
Evan Hunt
03016902dd rename dns_qp_findname_ancestor() to dns_qp_lookup()
I am weary of typing so long a name. (plus, the name has become slightly
misleading now that the DNS_QPFIND_NOEXACT option no longer exists.)
2023-09-28 00:32:44 -07:00
Evan Hunt
6231fd66af rename QP-related types to use standard BIND nomenclature
changed type names in QP trie code to match the usual convention:
 - qp_node_t -> dns_qpnode_t
 - qp_ref_t -> dns_qpref_t
 - qp_shift_t -> dns_qpshift_t
 - qp_weight_t -> dns_qpweight_t
 - qp_chunk_t -> dns_qpchunk_t
 - qp_cell_t -> dns_qpcell_t
2023-09-28 00:32:39 -07:00
Evan Hunt
4e3e61806c get predecessor name in dns_qp_findname_ancestor()
dns_qp_findname_ancestor() now takes an optional 'predecessor'
parameter, which if non-NULL is updated to contain the DNSSEC
predecessor of the name searched for. this is done by constructing
an iterator stack while carrying out the search, so it can be used
to step backward if needed.
2023-09-28 00:32:37 -07:00
Evan Hunt
606232b8d5 remove DNS_QPFIND_NOEXACT
since dns_qp_findname_ancestor() can now return a chain object, it is no
longer necessary to provide a _NOEXACT search option. if we want to look
up the closest ancestor of a name, we can just do a normal search, and
if successful, retrieve the second-to-last node from the QP chain.

this makes ancestor lookups slightly more complicated for the caller,
but allows us to simplify the code in dns_qp_findname_ancestor(), making
it easier to ensure correctness.  this was a fairly rare use case:
outside of unit tests, DNS_QPFIND_NOEXACT was only used in the zone
table, which has now been updated to use the QP chain.  the equivalent
RBT feature is only used by the resolver for cache lookups of 'atparent'
types (i.e., DS records).
2023-09-28 00:30:57 -07:00
Evan Hunt
3bf23fadb0 improvements to the QP iterator
- make iterators reversible: refactor dns_qpiter_next() and add a new
  dns_qpiter_prev() function to support iterating both forwards and
  backwards through a QP trie.
- added a 'name' parameter to dns_qpiter_next() (as well as _prev())
  to make it easier to retrieve the nodename while iterating, without
  having to construct it from pointer value data.
2023-09-28 00:30:51 -07:00
Evan Hunt
7f0242b8c7 tidy the helper functions for retrieving twigs
- the helper functions for accessing twigs beneath a branch
  (branch_twig_pos(), branch_twig_ptr(), etc) were somewhat confusing
  to read, since several of them were implemented by calling other
  helper functions. they now all show what they're really doing.
- branch_twigs_vector() has been renamed to simply branch_twigs().
- revised some unrelated comments in qp_p.h for clarity.
2023-09-28 00:30:47 -07:00
Evan Hunt
7f766ba7c4 add a node chain traversal mechanism
dns_qp_findname_ancestor() now takes an optional 'chain' parameter;
if set, the dns_qpchain object it points to will be updated with an
array of pointers to the populated nodes between the tree root and the
requested name. the number of nodes in the chain can then be accessed
using dns_qpchain_length() and the individual nodes using
dns_qpchain_node().
2023-09-28 00:30:43 -07:00
Evan Hunt
29cf7dceb7 modify dns_qp_findname_ancestor() to return found name
add a 'foundname' parameter to dns_qp_findname_ancestor(),
and use it to set the found name in dns_nametree.

this required adding a dns_qpkey_toname() function; that was
done by moving qp_test_keytoname() from the test library to qp.c.
added some more test cases and fixed bugs with the handling of
relative and empty names.
2023-09-28 07:01:13 +00:00
Evan Hunt
06216f4f90 rename dns_qp_findname_parent() to _findname_ancestor()
this function finds the closest matching ancestor, but the function
name could be read to imply that it returns the direct parent node;
this commit suggests a slightly less misleading name.
2023-08-15 14:24:46 +02:00
Tony Finch
b38c71961d Improve qp-trie leaf return values
Make the `pval_r` and `ival_r` out arguments optional.

Add `pval_r` and `ival_r` out arguments to `dns_qp_deletekey()`
and `dns_qp_deletename()`, to return the deleted leaf.
2023-08-15 14:24:39 +02:00
Ondřej Surý
b6b0d81a36 Cleanup the __tsan_acquire/__tsan_release
With ThreadSanitizer support added to the Userspace RCU, we no longer
need to wrap the call_rcu and caa_container_of with
__tsan_{acquire,release} hints.  Remove the direct calls to
__tsan_{acquire,release} and the isc_urcu_{container,cleanup} macros.
2023-07-28 08:59:08 +02:00
Tony Finch
b754c6628f Acquire qpmulti->mutex during destruction
Thread sanitizer warns that parts of the qp-trie are accessed
both with and without the mutex; the unlocked accesses happen during
destruction, so they should be benign, but there's no harm locking
anyway to convince tsan it is clean.

Also, ensure .tsan-suppress and .tsan-suppress-extra are in sync.
2023-05-20 07:26:21 +00:00
Tony Finch
c319ccd4c9 Fixes for liburcu-qsbr
Move registration and deregistration of the main thread from
`isc_loopmgr_run()` into `isc__initialize()` / `isc__shutdown()`:
liburcu-qsbr fails an assertion if we try to use it from an
unregistered thread, and we need to be able to use it when the
event loops are not running.

Use `rcu_assign_pointer()` and `rcu_dereference()` in qp-trie
transactions so that they properly mark threads as online. The
RCU-protected pointer is no longer declared atomic because
liburcu does not (yet) use standard C atomics.

Fix the definition of `isc_qsbr_rcu_dereference()` to return
the referenced value, and to call the right function inside
liburcu.

Change the thread sanitizer suppressions to match any variant of
`rcu_*_barrier()`
2023-05-15 20:49:42 +00:00
Tony Finch
c377e0a9e3 Help thread sanitizer to cope with liburcu
All the places the qp-trie code was using `call_rcu()` needed
`__tsan_release()` and `__tsan_acquire()` annotations, so
add a couple of wrappers to encapsulate this pattern.

With these wrappers, the tests run almost clean under thread
sanitizer. The remaining problems are due to `rcu_barrier()`
which can be suppressed using `.tsan-suppress`. It does not
suppress the whole of `liburcu`, because we would like thread
sanitizer to detect problems in `call_rcu()` callbacks, which
are called from `liburcu`.

The CI jobs have been updated to use `.tsan-suppress` by
default, except for a special-case job that needs the
additional suppressions in `.tsan-suppress-extra`.

We might be able to get rid of some of this after liburcu gains
support for thread sanitizer.

Note: the `rcu_barrier()` suppression is not entirely effective:
tsan sometimes reports races that originate inside `rcu_barrier()`
but tsan has discarded the stack so it does not have the
information required to suppress the report. These "races" can
be made much easier to reproduce by adding `atexit_sleep_ms=1000`
to `TSAN_OPTIONS`. The problem with tsan's short memory can be
addressed by increasing `history_size`: when it is large enough
(6 or 7) the `rcu_barrier()` stack usually survives long enough
for suppression to work.
2023-05-12 20:48:31 +01:00
Tony Finch
6217e434b5 Refactor the core qp-trie code to use liburcu
A `dns_qpmulti_t` no longer needs to know about its loopmgr. We no
longer keep a linked list of `dns_qpmulti_t` that have reclamation
work, and we no longer mark chunks with the phase in which they are to
be reclaimed. Instead, empty chunks are listed in an array in a
`qp_rcu_t`, which is passed to call_rcu().
2023-05-12 20:48:31 +01:00
Tony Finch
4f97a679f0 A macro for the size of a struct with a flexible array member
It can be fairly long-winded to allocate space for a struct with a
flexible array member: in general we need the size of the struct, the
size of the member, and the number of elements. Wrap them all up in a
STRUCT_FLEX_SIZE() macro, and use the new macro for the flexible
arrays in isc_ht and dns_qp.
2023-05-12 20:48:31 +01:00
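The idea behind the macro in 4f97a679f0 can be sketched in a few lines. The macro below is written for illustration under the name `FLEX_SIZE_SKETCH` and may differ from the actual STRUCT_FLEX_SIZE() definition in BIND; the `intvec_t` type and `intvec_new()` are likewise hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Size of the struct that 'pointer' points to, plus 'count' elements
 * of its flexible array member 'member'. sizeof does not evaluate
 * its operand here, so 'pointer' may be NULL or uninitialized.
 */
#define FLEX_SIZE_SKETCH(pointer, member, count) \
	(sizeof(*(pointer)) + sizeof(*(pointer)->member) * (count))

/* Hypothetical example type with a flexible array member. */
typedef struct intvec {
	size_t len;
	int items[];
} intvec_t;

/* Allocate a zeroed vector with room for 'count' items. */
static intvec_t *
intvec_new(size_t count) {
	intvec_t *v = NULL;

	v = calloc(1, FLEX_SIZE_SKETCH(v, items, count));
	assert(v != NULL);
	v->len = count;
	return v;
}
```

The call site names the pointer, the member and the element count once each, instead of spelling out two sizeof expressions and a multiplication by hand.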
Tony Finch
b3e35fd120 A few qp-trie cleanups
Revert refcount debug tracing (commit a8b29f0365), there are better
ways to do it.

Use the dns_qpmethods_t typedef where appropriate.

Some stylistic improvements.
2023-04-05 12:35:04 +01:00
Tony Finch
39f38754e2 Compact more in dns_qp_compact(DNS_QPGC_ALL)
Commit 0858514ae8 enriched dns_qp_compact() to give callers more
control over how thoroughly the trie should be compacted.

In the DNS_QPGC_ALL case, if the trie is small it might be compacted
to a new position in the same memory chunk. In this situation it will
still be holding references to old leaf objects which have been
removed from the trie but will not be completely detached until the
chunk containing the references is freed.

This change resets the qp-trie allocator to a fresh chunk before a
DNS_QPGC_ALL compaction, so all the old memory chunks will be
evacuated and old leaf objects can be detached sooner.
2023-04-05 12:35:04 +01:00
Tony Finch
44c80c4ae1 Support for off-loop read-only qp-trie transactions
It is sometimes necessary to access a qp-trie outside an isc_loop,
such as in tests or an isc_work callback. The best option was to use
a `dns_qpmulti_write()` transaction, but that has overheads that are
not necessary for read-only access, such as committing a new version
of the trie even when nothing changed.

So this commit adds a `dns_qpmulti_read()` transaction, which is
nearly as lightweight as a query transaction, but it takes the mutex
like a write transaction.
2023-04-05 12:35:04 +01:00
Tony Finch
fa1b57ee6e Support for finding the longest parent domain in a qp-trie
This is the first of the "fancy" searches that know how the DNS
namespace maps on to the structure of a qp-trie. For example, it will
find the closest enclosing zone in the zone tree.
2023-04-05 12:35:04 +01:00
Tony Finch
8a3a216f40 Support for iterating over the leaves in a qp-trie
The iterator object records a path through the trie, in a similar
manner to the existing dns_rbtnodechain.
2023-04-05 12:35:04 +01:00
Tony Finch
3c333d02a0 More dns_qpkey_t safety checks
My original idea had been that the core qp-trie code would be mostly
independent of the storage for keys, so I did not make it check at run
time that key lengths are sensible. However, the qp-trie search
routines need to get keys out of leaf objects, for which they provide
storage on the stack, which is particularly dangerous for unchecked
buffer overflows. So this change checks that key lengths are in bounds
at the API boundary between the qp-trie code and the rest of BIND, and
there is no more pretence that keys might be longer.
2023-04-03 15:10:47 +00:00
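The hazard described in 3c333d02a0 is a key of unchecked length being written into fixed-size stack storage. A minimal sketch of a bounds check at that boundary follows. The 512-byte limit, the `sketch_key_t` type and the `sketch_key_copy()` helper are assumptions made for illustration; BIND may enforce the limit differently, for example with assertion-style checks rather than a return value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAXKEY 512 /* hypothetical upper bound on key length */

/* Fixed-size key storage, suitable for placing on the stack. */
typedef uint8_t sketch_key_t[SKETCH_MAXKEY];

/*
 * Copy 'len' bytes of key material into 'dst'. Returns false and
 * leaves 'dst' untouched when 'len' would overflow the buffer.
 */
static bool
sketch_key_copy(sketch_key_t dst, const uint8_t *src, size_t len) {
	assert(dst != NULL && src != NULL);
	if (len > SKETCH_MAXKEY) {
		return false;
	}
	memmove(dst, src, len);
	return true;
}
```

Checking once where keys enter the trie code means the search routines can use the fixed-size buffer without repeating the test.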
Tony Finch
0858514ae8 Improve qp-trie compaction in write transactions
In general, it's better to do one thorough compaction when a batch of
work is complete, which is the way that `update` transactions work.
Conversely, `write` transactions are designed so that lots of little
transactions are not too inefficient, but they need explicit
compaction. This changes `dns_qp_compact()` so that it is easier to
compact any time that makes sense, if there isn't a better way to
schedule compaction. And `dns_qpmulti_commit()` only recycles garbage
when there is enough to make it worthwhile.
2023-02-27 13:47:57 +00:00
Tony Finch
7dcde5d2fc Make the qp-trie stats logging quieter
Only log when useful work was done
2023-02-27 13:47:57 +00:00
Tony Finch
4b5ec07bb7 Refactor qp-trie to use QSBR
The first working multi-threaded qp-trie was stuck with an unpleasant
trade-off:

  * Use `isc_rwlock`, which has acceptable write performance, but
    terrible read scalability because the qp-trie made all accesses
    through a single lock.

  * Use `liburcu`, which has great read scalability, but terrible
    write performance, because I was relying on `rcu_synchronize()`
    which is rather slow. And `liburcu` is LGPL.

To get the best of both worlds, we need our own scalable read side,
which we now have with `isc_qsbr`. And we need to modify the write
side so that it is not blocked by readers.

Better write performance requires an async cleanup function like
`call_rcu()`, instead of the blocking `rcu_synchronize()`. (There
is no blocking cleanup in `isc_qsbr`, because I have concluded
that it would be an attractive nuisance.)

Until now, all my multithreading qp-trie designs have been based
around two versions, read-only and mutable. This is too few to
work with asynchronous cleanup. The bare minimum (as in epoch
based reclamation) is three, but it makes more sense to support an
arbitrary number. Doing multi-version support "properly" makes
fewer assumptions about how safe memory reclamation works, and it
makes snapshots and rollbacks simpler.

To avoid making the memory management even more complicated, I
have introduced a new kind of "packed reader node" to anchor the
root of a version of the trie. This is simpler because it re-uses
the existing chunk lifetime logic - see the discussion under
"packed reader nodes" in `qp_p.h`.

I have also made the chunk lifetime logic simpler. The idea of a
"generation" is gone; instead, chunks are either mutable or
immutable. And the QSBR phase number is used to indicate when a
chunk can be reclaimed.

Instead of the `shared_base` flag (which was basically a one-bit
reference count, with a two version limit) the base array now has a
refcount, which replaces the confusing ad-hoc lifetime logic with
something more familiar and systematic.
2023-02-27 13:47:55 +00:00
Tony Finch
549854f63b Some minor qp-trie improvements
Adjust the dns_qp_memusage() and dns_qp_compact() functions
to be more informative and flexible about handling fragmentation.

Avoid wasting space in runt chunks.

Switch from twigs_mutable() to cells_immutable() because that is the
sense we usually want.

Drop the redundant evacuate() function and rename evacuate_twigs() to
evacuate(). Move some chunk test functions closer to their point of
use.

Clarify compact_recursive(). Some small cleanups to comments.

Use isc_time_monotonic() for qp-trie timing stats.

Use #define constants to control debug logging.

Set up DNS name label offsets in dns_qpkey_fromname() so it is easier
to use in cases where the name is not fully hydrated.
2023-02-27 13:47:25 +00:00