2
0
mirror of https://gitlab.isc.org/isc-projects/bind9 synced 2025-08-31 14:35:26 +00:00
Commit Graph

100 Commits

Author SHA1 Message Date
Alessio Podda
3271f5fda4 Do not skip cleanup for origin nodes in qpzone
Per @each, skipping cleanup of (|nsec_|nsec3_)origin nodes in
qpznode_release in qpzone.c is a residual from RBTDB, but it is
unnecessary or at most a performance optimization with QP.

Remove it to make it further changes easier to qpznode_release easier.
2025-08-19 14:18:19 +02:00
Ondřej Surý
42496f3f4a Use ControlStatementsExceptControlMacros for SpaceBeforeParens
> Put a space before opening parentheses only after control statement
> keywords (for/if/while...) except this option doesn’t apply to ForEach
> and If macros. This is useful in projects where ForEach/If macros are
> treated as function calls instead of control statements.
2025-08-19 07:58:33 +02:00
Evan Hunt
727fb9a011 replace dns_slabheader_raw() with a flexible array member
we can use header->raw instead of dns_slabheader_raw().
2025-08-18 12:36:47 +02:00
Evan Hunt
712ef31a0c use get_uint16() to read count and rdlen
use the same macro defned for rdataslab.c to get count and
length values from raw slabs in qpzone.c.
2025-08-18 12:36:47 +02:00
Ondřej Surý
d7801aec71 Move SIEVE-LRU to dns_slabtop_t structure
As the qpcache has only one active header at the time, we can move the
SIEVE-LRU members from dns_slabheader_t to dns_slabtop_t structure thus
saving a little bit of memory in each slabheader and using it only once
per type.
2025-08-18 12:36:47 +02:00
Ondřej Surý
f4d8841f0d Split the top level slab header hierarchy and the headers
The code that combines the top-level hierarchy (per-typepair) and
individual slab headers (per-version) saves a little bit of memory, but
makes the code convoluted, hard to read and hard to modify.  Change the
top level hierarchy to be of different type with individual slabheaders
"hanging" from the per-typepair dns_slabtop_t structure.

This change makes the future enhancements (changing the top level data
structure for faster lookups; coupling type + sig(type) into single
slabtop) much easier.
2025-08-18 12:36:47 +02:00
Ondřej Surý
2f81952658 Pass 'mctx' instead of 'db' to dns_slabheader_new()
The slabheader doesn't directly attach or link to 'db' anymore.  Pass
only the memory context needed to create the slab header to make the
lack of relation ship more prominent.

Also don't call dns_slabheader_reset() from dns_slabheader_new(), it has
no added value.
2025-08-17 21:56:25 -07:00
Ondřej Surý
eba76df247 Move the slabheader attribute helpers to private header
The slabheader.c, qpzone.c and qpcache.c had couple of shared macros
that were copied and paste between the units.  Move these common
attributes access macros into private header, so these can be shared
among the three compilation units.
2025-08-15 07:35:14 +02:00
Ondřej Surý
8c06d627b3 Unify the NONEXISTENT() macro in qpzone to EXISTS()
In the dns_qpcache unit, we use EXISTS() macro, but in the dns_qpzone
there's a NONEXISTENT() macro for the same slabheader attribute.  Unify
the macro to be also EXISTS() in dns_qpzone.
2025-08-15 07:35:14 +02:00
Ondřej Surý
d555cb9704 The nodefullname doesn't need a read lock to access .name
The qpznode->name is constant - assigned when the node is created
and it is immutable, so there's no reason to have it locked at all.
2025-08-15 07:29:02 +02:00
Ondřej Surý
74fe3db37c Rename DNS_SIGTYPE() to DNS_SIGTYPEPAIR()
The DNS_SIGTYPE() macro "returns" dns_typepair_t, rename it to make this
fact more obvious and also to match DNS_TYPEPAIR() macro naming.
2025-08-15 07:22:52 +02:00
Ondřej Surý
6e2ca5e0d7 Remove the negative type logic from qpcache
Previously, when a negative header was stored in the cache, it would be
stored in the dns_typepair_t as .type = 0, .covers = <negative type>.
When searching the cache internally, we would have to look for both
positive and negative typepair and the slabheader .down list could be a
mix of positive and negative types.

Remove the extra representation of the negative type and simply use the
negative attribute on the slabheader.  Other units (namely dns_ncache)
can still insert the (0, type) negative rdatasets into the cache, but
internally, those will be converted into (type, 0) slabheaders, and vice
versa - when binding the rdatasets, the negative (type, 0) slabheader
will be converted to (0, type) rdataset.  Simple DNS_TYPEPAIR() helper
macro was added to simplify converting single rdatatype to typepair
value.

As a side-effect, the search logic in all places can exit early if
there's a negative header for the type we are looking for, f.e. when
searching for the zone cut, we don't have to walk through all the
slabheaders, if there's a stored negative slabheader.
2025-08-15 07:22:52 +02:00
Ondřej Surý
3445362918 Add dns_rdatatype_isnsec() helper function
Replace the checks for both NSEC and NSEC3 with a single helper
function.
2025-08-15 07:22:52 +02:00
Ondřej Surý
59d1326175 Use dns_rdatatype_none more consistently
Use dns_rdatatype_none instead of plain '0' for dns_rdatatype_t and
dns_typepair_t manipulation.  While plain '0' is technically ok, it
doesn't carry the required semantic meaning, and using the named
dns_rdatatype_none constant makes the code more readable.
2025-08-15 07:22:52 +02:00
Ondřej Surý
76c027e949 Disallow TYPE0 to be queried or inserted into the database
The RR type 0 is a reserved type for SIG[1] resource record.  It should
not be ever inserted into the database nor queried.  Add a special
handling to bail out quickly with DNS_R_DISALLOWED when inserting and
ISC_R_NOTFOUND when looking up TYPE0.  This is also prerequisite for
stricter checks in the follow-up commit.

1. https://www.rfc-editor.org/rfc/rfc2535#section-4.1.8.1
2025-08-15 07:22:52 +02:00
Ondřej Surý
101b1e5a57 Unify the dns_typepair_t variable naming and usage
The dns_typepair_t and dns_rdatatype_t variables were both named 'type'
in multiple places.  Rename all dns_typepair_t variables to include word
'pair' in the variable name to make sure that the distinction between
the two types is more clear.
2025-08-15 07:22:51 +02:00
Alessio Podda
a05db4196f Remove unused dns_slabheader_reset argument
As a part of the previous refactor, the db argument of
dns_slabheader_reset is now unused, and can be removed.
2025-08-07 11:39:38 -07:00
Alessio Podda
ae6a34cbda Decouple database and node lifetimes by adding node-specific vtables
All databases in the codebase follow the same structure: a database is
an associative container from DNS names to nodes, and each node is an
associative container from RR types to RR data.

Each database implementation (qpzone, qpcache, sdlz, builtin, dyndb) has
its own corresponding node type (qpznode, qpcnode, etc). However, some
code needs to work with nodes generically regardless of their specific
type - for example, to acquire locks, manage references, or
register/unregister slabs from the heap.

Currently, these generic node operations are implemented as methods in
the database vtable, which creates problematic coupling between database
and node lifetimes. If a node outlives its parent database, the node
destructor will destroy all RR data, and each RR data destructor will
try to unregister from heaps by calling a virtual function from the
database vtable. Since the database was already freed, this causes a
crash.

This commit breaks the coupling by standardizing the layout of all
database nodes, adding a dedicated vtable for node operations, and
moving node-specific methods from the database vtable to the node
vtable.
2025-08-07 11:39:38 -07:00
Matthijs Mekking
c5a14f263f Add namespace to new_qp(c|z)node
Is there a time when new_qp(c|z)node() would not be followed by
assignment of the namespace? No, so let's add the assignment to the
function that creates the node.
2025-07-10 13:52:59 +00:00
Matthijs Mekking
df6763fd2a Rename DNS_DB_NSEC_ constants to DNS_DBNAMESPACE_
Naming is hard exercise.
2025-07-10 13:52:59 +00:00
Matthijs Mekking
a7021a3a51 Rename dns_qp_lookup2 back to dns_qp_lookup
Now that we have to code working, rename 'dns_qp_lookup2' back to
'dns_qp_lookup' and adjust all remaining 'dns_qp_lookup' occurrences
to take a space=0 parameter.
2025-07-10 13:52:59 +00:00
Matthijs Mekking
e052e14b40 Change denial type to enum
For now we only allow DNS_DB_NSEC_* values so it makes sense to change
the type to an enum.

Rename 'denial' to the more intuitive 'space', indicating the namespace
of the keyvalue pair.
2025-07-10 13:52:59 +00:00
Matthijs Mekking
61f8886fc3 Fix the dbiterator to assume only one qp-trie
The dbiterator can take three modes: full, nsec3only and nonsec3.
Previously, in full mode the dbiterator requires special logic to jump
from one qp-trie to the other. Now everything is in one trie, other
special logic is needed.

The qp-trie is now sorted in such a way that all the normal nodes come
first, followed by NSEC nodes, and finally the NSEC3 nodes. NSEC nodes
are empty nodes and need to be skipped when iterating.

We add an additional auxiliary node to the trie, an NSEC origin, so
we can easily find the point in the trie where we need to continue
iterating.
2025-07-10 13:52:59 +00:00
Matthijs Mekking
16a1c5a623 Prepend qpkey with denial byte
In preparation to merge the three qp tries (tree, nsec, nsec3) into
one, add the piece of information into the qpkey. This is the most
significant bit of information, so prepend the denial type to the qpkey.

This means we need to pass on the denial type when constructing the
qpkey from a name, or doing a lookup.

Reuse the the DNS_DB_NSEC_* values. Most qp tries in the code we just
pass on 0 (nta, rpz, zt, etc.), because there is no need for denial of
existence, but for qpzone and qpcache we must pass the right value.

Change the code, so that node->nsec no longer can have the value
DNS_DB_NSEC_HAS_NSEC, instead track this in a new attribute 'havensec'.

Since we use node->nsec to convert names to keys, the value MUST be set
before inserting the node into the qp-trie.

Update the fuzzing and unit tests accordingly. This only adds a few
extra test cases, more are needed.

In the qp_test.c we can remove test code for empty keys as this is
no longer possible.
2025-07-10 13:52:59 +00:00
Petr Špaček
0a5a25729c Remove unused DNS_RDATASET_COUNT
Albeit technically not unused, it was always defined as 0 and thus did
nothing.

Related: #4666
2025-07-10 11:17:19 +02:00
Petr Špaček
ba861f23f2 Remove unused DNS_RDATASET_ORDER
Related: #4666
2025-07-10 11:17:19 +02:00
Petr Špaček
750d8a61b6 Convert DNS_RDATASETATTR_ bitfield manipulation to struct of bools
RRset ordering is now an enum inside struct rdataset attributes. This
was done to keep size to of the structure to its original value before
this MR.

I expect zero performance impact but it should be easier to deal with
attributes in debuggers and language servers.
2025-07-10 11:17:19 +02:00
Alessio Podda
25daa047d4 Replace per-zone lock buckets with global buckets
Qpzone employs a locking strategy where rwlocks are grouped into
buckets, and each zone gets 17 buckets.
This strategy is suboptimal in two ways:
 - If named is serving a single zone or a zone is the majority of the
   traffic, this strategy pretty much guarantees contention when using
   more than a dozen threads.
 - If named is serving many small zones, it causes substantial memory
   usage.

This commit switches the locking to a global table initialized at start
time. This should have three effects:
 - Performance should improve in the single zone case, since now we are
   selecting from a bigger pool of locks.
 - Memory consumption should go down significantly in the many zone
   cases.
 - Performance should not degrade substantially in the many zone cases.
   The reason for this is that, while we could have substantially more
   zones than locks, we can query/edit only O(num threads) at the same
   time. So by making the global table much bigger than the expected
   number of threads, we can limit contention.
2025-07-09 15:27:38 +02:00
Alessio Podda
0b1785ec10 Extract the resigning heap into a separate struct
In the current implementation, the resigning heap is part of the zone
database. This leads to a cycle, as the database has a reference to its
nodes, but each node needs a reference to the database.

This MR splits the resigning heap into its own separate struct, in order
to help breaking the cycle.
2025-07-09 12:33:18 +02:00
Alessio Podda
c2a84bb17a Abstract bucket lock selection logic
Recovering the node lock from a pointer to the header and a pointer to
the db is a common operation. This commit abstracts it away into a
function, so that the node lock selection logic may be modified more
easily.
2025-07-09 12:33:18 +02:00
Evan Hunt
8487e43ad9 make all ISC_LIST_FOREACH calls safe
previously, ISC_LIST_FOREACH and ISC_LIST_FOREACH_SAFE were
two separate macros, with the _SAFE version allowing entries
to be unlinked during the loop. ISC_LIST_FOREACH is now also
safe, and the separate _SAFE macro has been removed.

similarly, the ISC_LIST_FOREACH_REV macro is now safe, and
ISC_LIST_FOREACH_REV_SAFE has also been removed.
2025-05-23 13:09:10 -07:00
alessio
4017a40b1d Remove zero initialization of large buffers
Profiles show that an high amount of CPU time spent in memset.
By removing zero initalization of certain large buffers we improve
performance in certain authoritative workloads.
2025-04-02 16:24:31 +02:00
Evan Hunt
522ca7bb54 switch to ISC_LIST_FOREACH everywhere
the pattern `for (x = ISC_LIST_HEAD(...); x != NULL; ISC_LIST_NEXT(...)`
has been changed to `ISC_LIST_FOREACH` throughout BIND, except in a few
cases where the change would be excessively complex.

in most cases this was a straightforward change. in some places,
however, the list element variable was referenced after the loop
ended, and the code was refactored to avoid this necessity.

also, because `ISC_LIST_FOREACH` uses typeof(list.head) to declare
the list elements, compilation failures can occur if the list object
has a `const` qualifier.  some `const` qualifiers have been removed
from function parameters to avoid this problem, and where that was not
possible, `UNCONST` was used.
2025-03-31 13:45:10 -07:00
Evan Hunt
341b962665 move dns_zonekey_iszonekey() to dns_dnssec module
dns_zonekey_iszonekey() was the only function defined in the
dns_zonekey module, and was only called from one place. it
makes more sense to group this with dns_dnssec functions.
2025-03-20 18:22:58 +00:00
Evan Hunt
24eaff7adc qpzone.c:step() could ignore rollbacks
the step() function (used for stepping to the prececessor or
successor of a database node) could overlook a node because
there was an rdataset marked IGNORE because it had been rolled
back, covering an active rdataset under it.
2025-03-14 23:19:17 +00:00
Ondřej Surý
1e4695510a Revert "fix: dev: Delete dead nodes when committing a new version"
This reverts commit 67255da4b3, reversing
changes made to 74c9ff384e.
2025-03-05 17:46:54 +01:00
Ondřej Surý
c5075a9a61 Remove convenience list macros from isc/util.h
The short convenience list macros were used very sparingly and
inconsistenly in the code base.  As the consistency is prefered over
the convenience, all shortened list macro were removed in favor of
their ISC_LIST API targets.
2025-03-01 07:33:40 +01:00
Evan Hunt
a6986f6837 remove 'target' parameter from dns_name_concatenate()
the target buffer passed to dns_name_concatenate() was never
used (except for one place in dig, where it wasn't actually
needed, and has already been removed in a prior commit).
we can safely remove the parameter.
2025-02-25 12:53:25 -08:00
Ondřej Surý
04c2c2cbc8 Simplify dns_name_init()
Remove the now-unused offsets parameter from dns_name_init().
2025-02-25 12:17:34 +01:00
Ondřej Surý
08e966df82 Remove offsets from the dns_name and dns_fixedname structures
The offsets were meant to speed-up the repeated dns_name operations, but
it was experimentally proven that there's actually no real-world
benefit.  Remove the offsets and labels fields from the dns_name and the
static offsets fields to save 128 bytes from the fixedname in favor of
calculating labels and offsets only when needed.
2025-02-25 12:17:34 +01:00
Ondřej Surý
d1ef6a93c1 Acquire the database reference before possibly last node release
Acquire the database refernce in the detachnode() to prevent the last
reference to be release while the NODE_LOCK being locked.  The NODE_LOCK
is locked/unlocked inside the RCU critical section, thus it is most
probably this should not pose a problem as the database uses call_rcu
memory reclamation, but this it is still safer to acquire the reference
before releasing the node.
2025-02-24 20:05:56 +01:00
Evan Hunt
ed83455c81 dns_slabheader_fromrdataset() -> dns_rdataset_getheader()
The function name dns_slabheader_fromrdataset() was too similar
to dns_rdataslab_fromrdataset(). Instead, we now have an rdataset
method 'getheader' which is implemented for slab-type rdatasets.

A new NOHEADER rdataset attribute is set for rdatasets using
raw slabs (i.e., noqname and closest encloser proofs); when
called on rdatasets with that flag set, dns_rdataset_getheader()
returns NULL.
2025-02-19 14:58:32 -08:00
Evan Hunt
82edec67a5 initialize header in dns_rdataslab_fromrdataset()
when dns_rdataslab_fromrdataset() is run, in addition to
allocating space for a slab header, it also partially
initializes it, setting the type match rdataset->type and
rdataset->covers, the trust to rdataset->trust, and the TTL to
rdataset->ttl.
2025-02-19 14:58:32 -08:00
Evan Hunt
b4bde9bef4 clarify dns_rdataslab_fromrdataset()
there are now two functions for creating an rdataslab from an
rdataset: dns_rdataslab_fromrdataset() creates a full slab (including
space for a slab header), and dns_rdataslab_raw_fromrdataset() creates
a raw slab.
2025-02-19 14:58:32 -08:00
Evan Hunt
6908d1f9be more rdataslab refactoring
- there are now two functions for getting rdataslab size:
  dns_rdataslab_size() is for full slabs and dns_rdataslab_sizeraw()
  for raw slabs. there is no longer a need for a reservelen parameter.
- dns_rdataslab_count() also no longer takes a reservelen parameter.
  (currently it's never used for raw slabs, so there is no _countraw()
  function.)
- dns_rdataslab_rdatasize() has been removed, because
  dns_rdataslab_sizeraw() can do the same thing.
- dns_rdataslab_merge() and dns_rdataslab_subtract() both take
  slabheader parameters instead of character buffers, and the
  reservelen parameter has been removed.
2025-02-19 14:58:32 -08:00
Ondřej Surý
15fe68e50d Add .up pointer to slabheader
The dns_slabheader object uses the 'next' pointer for two purposes.
In the first header for any given type, 'next' points to the first
header for the next type. But 'down' points to the next header of
the same type, and in that record, 'next' points back up.

This design made the code confusing to read.  We now use a union
so that the 'next' pointer can also be called 'up'.
2025-02-19 14:58:32 -08:00
Evan Hunt
e58ce19cf2 when committing a new qpzone version, delete dead nodes
if all data has been deleted from a node in the qpzone
database, delete the node too.
2025-02-18 14:22:38 -08:00
Evan Hunt
fffa150df3 fix dns_qp_insert() checks in qpzone
in some places there were checks for failures of dns_qp_insert()
after dns_qp_getname(). such failures could only happen if another
thread inserted a node between the two calls, and that can't happen
because the calls are serialized with dns_qpmulti_write(). we can
simplify the code and just add an INSIST.
2025-02-17 12:21:50 -08:00
Mark Andrews
5e49a9e4ae Fix "CNAME and other data" detection
prio_type was being used in the wrong place to optimize cname_and_other.
We have to first exclude and accepted types and we also have to
determine that the record exists before we can check if we are at
a point where a later CNAME cannot appear.
2025-02-14 01:51:38 +00:00
Ondřej Surý
732fc338a9 Switch the locknum generation for qpznode to random
Instead of using on hash of the name modulo number of the buckets,
assign the locknum randomly with isc_random_uniform().  This makes
the locknum assignment aligned with qpcache and allows the bucket
number to be non-prime in the future.
2025-02-04 22:50:49 +01:00