2
0
mirror of https://gitlab.isc.org/isc-projects/bind9 synced 2025-08-22 01:59:26 +00:00
Clone
16
Data structure survey
Tony Finch edited this page 2022-04-08 12:01:07 +01:00

Tony Finch <fanf@isc.org>, spring 2022

My aim in writing these notes is to get a better idea of what data structures exist in BIND, and what they are used for. I have some questions to consider in the process:

  • How well-matched are the data structures to their uses? Are any of them over-powered, such as using an ordered structure when an unordered one will do?

  • Are there cases where a custom structure is overloaded with multiple purposes? Might it be simpler to use one or more standard data structures instead?

  • What are the concurrency requirements? And again, how well-matched are they? Are there cases where BIND uses a general-purpose multi-threaded structure in a single-threaded manner?

  • Where are the opportunities for using copy-on-write to reduce concurrency interlocks? What are the obstacles?

The reason for all this is to get some foundational knowledge as a basis for refactoring parts of BIND, especially the rbtdb, with the aim of making the code easier to understand, and hopefully to improve performance.

disclaimers

My personal interest is the work I have done over the last several years on my "qp-trie" data structure, and the qp-trie variants that have been tuned to work well with DNS names as keys, and for copy-on-write concurrency. I prototyped these variants in my fork of NSD, because it is simpler than BIND, and it already has a choice between a red-black tree and a radix tree, so adding a qp-trie was not too hard.

I am writing these notes with a personal point of view because they are primarily a learning exercise for myself, though I hope they might also be useful to others. (It is easier not to worry about mistakes or inaccuracies if I am not trying to present this as the consensus of the BIND developers.) And there is a date at the top because there's a fair chance this will become a historical document rather than being maintained as a current one.

isc_astack

Fixed-size array stack

used by

  • netmgr for object caches

notes

Locked.

Ondřej's notes

Will be gone after the netmgr refactoring and replaced with per-thread object pools/lists

isc_heap

Priority queue

used by

  • dns_rbtdb to find nodes by TTL; there are multiple heaps split based on node->locknum like a few other parts of rbtdb.

  • dns_zoneverify for checking consistency of NSEC3 chains.

  • isc_timer, the canonical use for a heap.

notes

Single-threaded.

There's a min-heap in libuv, implemented as a binary tree. The standard algorithms (as in isc_heap) use an array which should have better performance.

Ondřej's notes

The usage in isc_timer has been removed in the loopmgr branch

isc_ht

Hash table

used by

  • bin/plugins/filter-a and bin/plugins/filter-aaaa for finding the filter state given a query_ctx_t pointer. (case-sensitive)

  • dns_adb for looking up nameservers by name (case-insensitive) or by address (case-insensitive???).

  • dns_catz (which leaks into bin/named/server) for the list of catalog zones in a view, and the entries in a catalog zone. These hash tables are unexpectedly case-sensitive.

    This doesn't matter for the entries table, because the key is a label from a domain name stored in the rbtdb, which is forced to lower case (rbtdb stores the case bits separately from the name), and names of catalog zone entries are compared case-insensitively via dna_zt_t.

    For the zones table, catz goes wrong if the zone{} clause does not state the zone's name in all lower-case; the case of the zone name in the catalog-zones{} clause does not matter. Initializing the hash table as case-insensitive is not enough to fix this bug.

  • dns_rpz in the zone update functions, case-sensitive but the domain names are forced to lower case.

  • isc_tls for TLS context caches. (case-sensitive)

notes

Single-threaded.

Case-sensitive and case-insensitive variants, chosen at run-time. The case-insensitive isc_hash functions squash the input to lower case before hasing, instead of using a specialized single-pass case-insensitive hash. It would be simpler to make the callers handle case-insensitivity since there are so few of them.

Uses chaining, so it is heavy on indirection and heavy on allocation. Modern hash tables use open addressing.

Ondřej's notes

There's work to do on isc_ht:

  1. make header-only variant of the hashtable, so we can unify the usage across the codebase (e.g. no extra allocations when storing item)

  2. there's a branch, where I implemented robin-hood hashing, but I never finished because of 1), the branch is ondrej/robin-hood-unordered-node-map

  3. the resizing was added only recently, we still miss shrinking (should be easy enough to do)

dns_adb was recently refactored, it used a custom hashtable before

isc_radix

Radix tree for IP address prefix matching

used by

  • dns_acl
  • dns_iptable

notes

Single-threaded.

Slightly unusual because it is first-match rather than best-match.

Tests one bit at a time so there's an opportunity to improve with popcount tricks.

The nodes appear to be larger than necessary; is this a problem? How big do ACLs get?

Uses reserved identifiers for static functions, which is undefined behaviour.

isc_symtab

Another hash table, with another hash function.

used by

  • isccfg/parser
  • bind9/check

notes

Single-threaded.

Ought to be unnecessary. Might be worth cleaning up as part of some general parser refactoring (isc_lex could probably be made faster using re2c).

Ondřej's notes

Using ragel is on my TODO list for too long...

isccc

This library (as a whole) is the control channel implementation used by rndc and named.

It has implementations of LISP-style S-expressions and association lists https://en.wikipedia.org/wiki/Greenspun%27s_tenth_rule

There is yet another hash table in isccc/symtab, and isccc/base64 is also redundant.