BIND 9.19 Planning: Refactoring

Back to agenda: https://gitlab.isc.org/isc-projects/bind9/-/wikis/BIND-9.19-Plan

Attendees: Ondrej, Michał K., Michal N., Petr, Aram, Artem, Evan, Matthijs, Vicky.

Initial notes per refactoring topic.

Thread pinning

Ondrej - "Pin" more code to threads to remove locking (general direction)

pspacek - You probably mean pin more data to threads, right? Sharing data is the expensive part, I believe.
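
As a rough illustration of the direction (a minimal C sketch, not BIND code; all names are invented): when each worker thread exclusively owns its slice of the data, the hot path needs no locks at all, and shared state is only touched when the per-thread slices are aggregated.

    /* Sketch of "pin data to threads": each worker owns one padded slot
     * and updates it without locks; a reader aggregates the slots only
     * when a total is needed. */
    #include <pthread.h>
    #include <stdint.h>
    #include <stdio.h>

    #define NWORKERS 4

    struct slot {
            uint64_t queries;
            char pad[64 - sizeof(uint64_t)]; /* avoid false sharing */
    };

    static struct slot slots[NWORKERS];

    static void *
    worker(void *arg) {
            size_t id = (size_t)arg;
            for (int i = 0; i < 1000000; i++) {
                    slots[id].queries++; /* owned by this thread: no lock */
            }
            return NULL;
    }

    int
    main(void) {
            pthread_t tids[NWORKERS];
            uint64_t total = 0;
            for (size_t i = 0; i < NWORKERS; i++) {
                    pthread_create(&tids[i], NULL, worker, (void *)i);
            }
            for (size_t i = 0; i < NWORKERS; i++) {
                    pthread_join(tids[i], NULL);
                    total += slots[i].queries; /* aggregate only on read */
            }
            printf("total queries: %llu\n", (unsigned long long)total);
            return 0;
    }

The same pattern generalizes from counters to any data that can be partitioned per thread.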

BIND statistics system overhaul

https://gitlab.isc.org/isc-projects/bind9/-/issues/38

We might need to change how the statistics counters for connections are implemented. There are a couple of obvious issues:

Currently, there is no way to introduce statistics counters for multilayered protocols (TLS stream, HTTP/2); these are counted as TCP instead of having their own counters. We could probably implement this by instructing the underlying transports to suppress their statistics updates and updating the counters manually at the upper layer. Consider the most extreme case, HTTP/2 over TLS, which involves three layers: the HTTP/2 code, the TLS code, and the TCP code. Here we update the dedicated statistics counters in the HTTP/2 code, and the TLS code gets instructed to suppress its statistics updates; the TLS code, in turn, instructs the TCP code to suppress its statistics updates.
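
A minimal sketch of that suppression scheme, using hypothetical transport structs (this is not the actual netmgr API): the top layer owns the counters and tells each lower layer to skip its own accounting.

    /* Hypothetical layered transports: the top layer counts, and every
     * layer below it is flagged to suppress its own statistics. */
    #include <stdbool.h>
    #include <stdio.h>

    typedef enum { TCP, TLS, HTTP2 } layer_t;

    typedef struct transport transport_t;
    struct transport {
            layer_t layer;
            bool suppress_stats;     /* set by the layer above */
            transport_t *lower;      /* TCP has no lower layer */
            unsigned int recv_count; /* per-layer counter */
    };

    static void
    record_recv(transport_t *t) {
            /* Count here only if no layer above claimed the statistics. */
            if (!t->suppress_stats) {
                    t->recv_count++;
            }
            if (t->lower != NULL) {
                    record_recv(t->lower);
            }
    }

    int
    main(void) {
            transport_t tcp = { TCP, false, NULL, 0 };
            transport_t tls = { TLS, false, &tcp, 0 };
            transport_t h2 = { HTTP2, false, &tls, 0 };

            /* HTTP/2 takes over accounting: TLS and TCP are suppressed. */
            tls.suppress_stats = true;
            tcp.suppress_stats = true;

            record_recv(&h2);
            printf("h2=%u tls=%u tcp=%u\n", h2.recv_count, tls.recv_count,
                   tcp.recv_count); /* prints h2=1 tls=0 tcp=0 */
            return 0;
    }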

We need to take into account multi-stream protocols (the ones that multiplex connections), i.e. those that support multiple virtual connections over one transport connection. Currently this means HTTP/2, but with DNS-over-QUIC and HTTP/3 on the horizon, we need to be prepared. At the very least, we need to add counters for the total number of open sub-streams and for the number of total and idle transport connections. The idea is to provide counters that give enough feedback for setting http-listener-clients and http-streams-per-connection (and similar options when support for new transports is added).
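
A sketch of the kind of counters meant here, with invented names (these are not existing BIND statistics): open sub-streams are tracked across all connections, and a connection counts as idle while it carries no active stream.

    /* Hypothetical multiplexed-transport counters; single-threaded for
     * illustration (real counters would need atomics or per-loop
     * aggregation). */
    #include <stdio.h>

    static unsigned int total_conns;  /* open transport connections */
    static unsigned int idle_conns;   /* connections with no active stream */
    static unsigned int open_streams; /* active sub-streams, all conns */

    typedef struct {
            unsigned int streams; /* active streams on this connection */
    } conn_t;

    static void
    conn_open(conn_t *c) {
            c->streams = 0;
            total_conns++;
            idle_conns++; /* a fresh connection starts idle */
    }

    static void
    stream_open(conn_t *c) {
            if (c->streams++ == 0) {
                    idle_conns--; /* first stream: no longer idle */
            }
            open_streams++;
    }

    static void
    stream_close(conn_t *c) {
            if (--c->streams == 0) {
                    idle_conns++; /* last stream gone: idle again */
            }
            open_streams--;
    }

    int
    main(void) {
            conn_t c;
            conn_open(&c);
            stream_open(&c);
            stream_open(&c);
            stream_close(&c);
            printf("conns=%u idle=%u streams=%u\n", total_conns,
                   idle_conns, open_streams); /* conns=1 idle=0 streams=1 */
            return 0;
    }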

What's the problem?

  • Statistics are undocumented. It is hard to figure out what statistics are useful to identify operational problems.
  • Statistics are inaccurate (according to Support).

Ondrej - statistics and debug counters are not the same.

Petr - Instrumentation should be low cost.
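
One common way to keep instrumentation cheap, shown as a sketch rather than the actual isc_stats implementation: relaxed atomic increments on the hot path, so each event costs a single atomic add with no lock and no full barrier.

    /* Low-cost instrumentation sketch: relaxed atomics on the hot path,
     * with ordering only mattering when the counter is read out. */
    #include <inttypes.h>
    #include <stdatomic.h>
    #include <stdio.h>

    static _Atomic uint64_t udp_queries;

    static void
    on_udp_query(void) {
            /* No lock, no full barrier: one relaxed add per event. */
            atomic_fetch_add_explicit(&udp_queries, 1,
                                      memory_order_relaxed);
    }

    int
    main(void) {
            for (int i = 0; i < 5; i++) {
                    on_udp_query();
            }
            printf("udp queries: %" PRIu64 "\n",
                   atomic_load_explicit(&udp_queries,
                                        memory_order_relaxed));
            return 0;
    }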

Vicky - users were dissatisfied with statistics the last time we surveyed them

Ondrej - Can we "use" Carsten to come up with statistics requirements? Yes.

Examples:

  • Synth-from-dnssec: it should be visible that it helped.
  • Cache-hit ratio (Stork requirements).
  • Recursive clients (hard to retrieve).
  • There is no way to tell how many DoH or DoT connections you have.

Fix the API

Three different sets of problems:

  1. What are we counting? Are we calculating correctly?
  2. How does the information get into the statistics?
  3. How do we get it out? Do we need more granularity, and/or do we need to change the XML stylesheet?

Get rid of the existing HTTP server and use the DoH one?

Artem - not a good idea, because it does not support HTTP/1.

Evan - Replace the ancient HTTP server with a good existing HTTP library.

Address database (ADB) refactoring

Currently, Support is getting a lot of questions about the ADB.

Per-uv-loop timers

Remove the separate timer thread.
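
A minimal libuv sketch of the per-loop idea (the callback and counter are invented for illustration): each event loop owns its own uv_timer_t, so timer callbacks run on the thread that owns the loop's data and no dedicated timer thread is needed.

    /* Per-uv-loop timer: the timer is bound to a loop at init time, so
     * its callback always fires on that loop's thread. */
    #include <stdio.h>
    #include <uv.h>

    static void
    tick(uv_timer_t *timer) {
            int *count = timer->data;
            printf("tick %d on this loop's thread\n", ++(*count));
            if (*count == 3) {
                    uv_timer_stop(timer);
                    uv_close((uv_handle_t *)timer, NULL); /* release it */
            }
    }

    int
    main(void) {
            uv_loop_t *loop = uv_default_loop();
            uv_timer_t timer;
            int count = 0;

            uv_timer_init(loop, &timer); /* bind the timer to this loop */
            timer.data = &count;
            uv_timer_start(&timer, tick, 10 /* ms */, 10 /* repeat */);

            uv_run(loop, UV_RUN_DEFAULT); /* returns after uv_close() */
            return 0;
    }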

Zone transfers

  • Zone locking problems
  • Channel reuse missing
  • Address/test/harden IXFR/AXFR
    • This could also just be in scope for the db rewrite. Oh goodie, the db rewrite needed more things in scope. :) At the very least, the tests written to validate the db rewrite may cover this.

Zone management refactoring

Refactor lib/dns/zone.c

Evan - low-hanging fruit: move dns_zonemgr to its own file.

Catalog Zones v2 (draft 04)

Already in progress. Also discussed briefly during "New Features in 9.19" meeting slot.

Aram - What happens if a catalog zone is working and a new version arrives that is a valid DNS zone, but no longer a valid catalog zone?

Petr - discuss in the WG or in the open-source MM.

Aram - per the OARC MM, the Knot DNS developers will ignore the new version and use the old version.

Serve-stale

Michał K. - "want" is a strong word, but... serve-stale? are we keeping it? the industry seems to like it?

Michał K. - the reason "the industry seemed to like it" was that at the time of the Dyn outage in 2016 some people claimed they were unaffected/less impacted because they had something like this. In reality, the only account I can remember really clamoring for it was Infoblox, who heard from some of their customers that they "needed" it.

Matthijs - Not a big fan, but throwing it away seems a waste of effort. And some users do like it.

Matthijs - Should we refactor serve-stale to avoid concurrent lookups (a stale lookup happening during a resolver fetch)?

Maybe there is a completely different feature that could be useful for this purpose?

Evan - serve-stale was introduced because of reflection attacks.

Michał K. - Implementing serve-stale could have helped reduce the load on resolvers under "attack" from clients re-querying after SERVFAIL.

RBTDB/RBT Refactoring

For more discussion on this topic, see Refactoring (RBTDB).

Modernize design with better data structures (not those from the 90s).

Reduce contention between threads: can we remove a lot of the locking (possibly dependent on the dispatch uplift)?

Refactor RBTDB or replace?

How can we manage/plan the db replacement for 2022 better than we did the network refactoring? (see https://gitlab.isc.org/isc-projects/bind9/-/wikis/BIND-9.19-Planning:-Netmgr-evaluation)

  • It is important to have 2 people working together (one person needs someone to bounce ideas off of, to help with getting unstuck).
  • Brainstorm ideas to reduce risk, including test strategies, milestones/checkpoints, maintain a log of 'what has been tried and rejected'.
  • In the case of the db replacement, it doesn't have to be either/or; we can add a db option.

Goals for db replacement:

  • better management of aging cache data (cache cleaning)
  • Do we have any reason to think our existing db is a performance bottleneck?
    • Michał K.: ISTR some tests were performed which indicated that RBTDB is not a performance bottleneck, but I cannot find that write-up :(
    • Evan: I remember it too - both jinmei and mukund benchmarked it and found that RBTDB could handle 10x or more the transactions per second that BIND as a whole could, so other things were higher priority. But I don't have the data.
  • improve test coverage on db (ref: "W/w" bug)
  • a separate db for zone data vs. cache data would limit risk (so consider introducing a new db for just cache data first and retaining the current db for zone data)
  • additional reason: nobody understands all the details of RBTDB, which results in unexpected breakage
    • should this be our top priority?
  • Artem - if we are to adapt an existing solution, we should consider tooling around it
  • goal: make it easier for people to make changes in RBTDB
  • goal: reduce memory usage. we are wasting a lot of memory now (this sounds like an argument that would make sense to an operator).
  • goal: reduce the impact of updates and IXFR/AXFR processing; maybe provide more control over them?
  • RBTDB is complicated INSIDE, but it has a well-defined external interface, so damage should be easier to contain.
  • Assorted ideas: maybe we should store RRSIGs together with their "source" rdatasets. The for loops going through all the RRsets on a node, searching for an RRSIG, seem like an unnecessary complication to me (pspacek). (See the sketch after this list.)
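
A sketch of that last idea, using a hypothetical node layout rather than the current RBTDB structures: pairing each rdataset with its covering RRSIG turns the per-node scan into a single pointer dereference.

    /* Hypothetical layout: each rdataset carries a pointer to the RRSIG
     * rdataset that covers it, instead of the RRSIG being found by
     * scanning every rdataset on the node. */
    #include <stddef.h>
    #include <stdio.h>

    typedef struct rdataset {
            unsigned int type;            /* e.g. A, AAAA, NS, ... */
            struct rdataset *sigrdataset; /* covering RRSIG, or NULL */
            struct rdataset *next;        /* next rdataset on the node */
    } rdataset_t;

    static rdataset_t *
    find_rrsig(rdataset_t *rds) {
            /* O(1) lookup instead of a loop over the whole node. */
            return rds->sigrdataset;
    }

    int
    main(void) {
            rdataset_t sig_a = { 46 /* RRSIG */, NULL, NULL };
            rdataset_t a = { 1 /* A */, &sig_a, NULL };
            printf("RRSIG found: %s\n",
                   find_rrsig(&a) != NULL ? "yes" : "no");
            return 0;
    }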