Tony Finch <fanf@isc.org>
, spring 2022
My aim in writing these notes is to get a better idea of what data structures exist in BIND, and what they are used for. I have some questions to consider in the process:
-
How well-matched are the data structures to their uses? Are any of them over-powered, such as using an ordered structure when an unordered one will do?
-
Are there cases where a custom structure is overloaded with multiple purposes? Might it be simpler to use one or more standard data structures instead?
-
What are the concurrency requirements? And again, how well-matched are they? Are there cases where BIND uses a general-purpose multi-threaded structure in a single-threaded manner?
-
Where are the opportunities for using copy-on-write to reduce concurrency interlocks? What are the obstacles?
The reason for all this is to get some foundational knowledge as a basis for refactoring parts of BIND, especially the rbtdb, with the aim of making the code easier to understand, and hopefully to improve performance.
disclaimers
My personal interest is the work I have done over the last several years on my "qp-trie" data structure, and the qp-trie variants that have been tuned to work well with DNS names as keys, and for copy-on-write concurrency. I prototyped these variants in my fork of NSD, because it is simpler than BIND, and it already has a choice between a red-black tree and a radix tree, so adding a qp-trie was not too hard.
I am writing these notes with a personal point of view because they are primarily a learning exercise for myself, though I hope they might also be useful to others. (It is easier not to worry about mistakes or inaccuracies if I am not trying to present this as the consensus of the BIND developers.) And there is a date at the top because there's a fair chance this will become a historical document rather than being maintained as a current one.
isc_astack
Fixed-size array stack
used by
netmgr
for object caches
notes
Locked.
Ondřej's notes
Will be gone after the netmgr refactoring and replaced with per-thread object pools/lists
isc_heap
Priority queue
used by
-
dns_rbtdb
to find nodes by TTL; there are multiple heaps split based on node->locknum like a few other parts of rbtdb. -
dns_zoneverify
for checking consistency of NSEC3 chains. -
isc_timer
, the canonical use for a heap.
notes
Single-threaded.
There's a min-heap in libuv, implemented as a binary tree. The
standard algorithms (as in isc_heap
) use an array which should have
better performance.
Ondřej's notes
The usage in isc_timer has been removed in the loopmgr branch
isc_ht
Hash table
used by
-
bin/plugins/filter-a
andbin/plugins/filter-aaaa
for finding the filter state given aquery_ctx_t
pointer. (case-sensitive) -
dns_adb
for looking up nameservers by name (case-insensitive) or by address (case-insensitive???). -
dns_catz
(which leaks intobin/named/server
) for the list of catalog zones in a view, and the entries in a catalog zone. These hash tables are unexpectedly case-sensitive.This doesn't matter for the entries table, because the key is a label from a domain name stored in the rbtdb, which is forced to lower case (rbtdb stores the case bits separately from the name), and names of catalog zone entries are compared case-insensitively via
dna_zt_t
.For the zones table,
catz
goes wrong if thezone{}
clause does not state the zone's name in all lower-case; the case of the zone name in thecatalog-zones{}
clause does not matter. Initializing the hash table as case-insensitive is not enough to fix this bug. -
dns_rpz
in the zone update functions, case-sensitive but the domain names are forced to lower case. -
isc_tls
for TLS context caches. (case-sensitive)
notes
Single-threaded.
Case-sensitive and case-insensitive variants, chosen at run-time. The
case-insensitive isc_hash
functions squash the input to lower case before
hasing, instead of using a specialized single-pass case-insensitive hash. It
would be simpler to make the callers handle case-insensitivity since there
are so few of them.
Uses chaining, so it is heavy on indirection and heavy on allocation. Modern hash tables use open addressing.
Ondřej's notes
There's work to do on isc_ht
:
-
make header-only variant of the hashtable, so we can unify the usage across the codebase (e.g. no extra allocations when storing item)
-
there's a branch, where I implemented robin-hood hashing, but I never finished because of 1), the branch is ondrej/robin-hood-unordered-node-map
-
the resizing was added only recently, we still miss shrinking (should be easy enough to do)
dns_adb
was recently refactored, it used a custom hashtable before
isc_radix
Radix tree for IP address prefix matching
used by
dns_acl
dns_iptable
notes
Single-threaded.
Slightly unusual because it is first-match rather than best-match.
Tests one bit at a time so there's an opportunity to improve with
popcount
tricks.
The nodes appear to be larger than necessary; is this a problem? How big do ACLs get?
Uses reserved identifiers for static functions, which is undefined behaviour.
isc_symtab
Another hash table, with another hash function.
used by
isccfg/parser
bind9/check
notes
Single-threaded.
Ought to be unnecessary. Might be worth cleaning up as part of some
general parser refactoring (isc_lex
could probably be made faster
using re2c
).
Ondřej's notes
Using ragel is on my TODO list for too long...
isccc
This library (as a whole) is the control channel implementation used
by rndc
and named
.
It has implementations of LISP-style S-expressions and association lists https://en.wikipedia.org/wiki/Greenspun%27s_tenth_rule
There is yet another hash table in isccc/symtab
, and isccc/base64
is also redundant.