The various factors like NS_PER_MS are now defined in a single place
and the names are no longer inconsistent. I chose the _PER_SEC names
rather than _PER_S because they are slightly clearer in isolation,
but the smaller units are always NS, US, and MS.
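A minimal sketch of the naming pattern described above. Only NS_PER_MS
and the _PER_SEC / NS, US, MS conventions come from this message; the
other macro names and the ms_to_ns() helper are assumptions made for
illustration.

```c
#include <stdint.h>

/* Conversion factors, each defined once and built from the smaller ones. */
#define MS_PER_SEC 1000
#define US_PER_MS  1000
#define NS_PER_US  1000
#define US_PER_SEC (US_PER_MS * MS_PER_SEC)
#define NS_PER_MS  (NS_PER_US * US_PER_MS)
#define NS_PER_SEC (NS_PER_MS * MS_PER_SEC)

/* Hypothetical helper: convert a millisecond count to nanoseconds. */
static inline uint64_t
ms_to_ns(uint64_t ms) {
	return ms * NS_PER_MS;
}
```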
Extract the tlss values if present from the ipkeylist entry and add
the resulting tls setting to the constructed configuration for the
primary.
When comparing catalog zone entries for reuse also check the
masters.tlss values for equality.
The 'nupdates' field was originally used to track whether a client
was ready to shut down, along with other similar counters nreads,
nrecvs, naccepts and nsends. This is now tracked differently, but
nupdates was overlooked when the other counters were removed.
Remove code that triggers key and denial of existence management
operations. Dynamic update should no longer be used to do DNSSEC
maintenance (other than creating signatures for the new zone
contents, of course).
The aim is to do less work per byte:
* Check the bounds for each label, instead of checking the
bounds for each character.
* Instead of copying one character at a time from the wire to
the name, copy entire runs of sequential labels using memmove()
to make the most of its fast loop.
* To remember where the name ends, we only need to set the end
marker when we see a compression pointer or when we reach the
root label. There is no need to check if we jumped back and
conditionally update the counter for every character.
* To parse a compression pointer, we no longer take a diversion
around the outer loop in between reading the upper byte of the
pointer and the lower byte.
* The parser state machine is now implicit in the instruction
pointer, instead of being an explicit variable. Similarly,
when we reach the root label we break directly out of the loop
instead of setting a second state machine variable.
* DNS_NAME_DOWNCASE is never used with dns_name_fromwire() so
that option is no longer supported.
I have removed this comment which dated from January 1999 when
dns_name_fromwire() was first introduced:
/*
* Note: The following code is not optimized for speed, but
* rather for correctness. Speed will be addressed in the future.
*/
No functional change, apart from removing support for the unused
DNS_NAME_DOWNCASE option. The new code is about 2x faster than the
old code: best case 11x faster, worst case 1.4x faster.
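The approach in the list above can be sketched as follows. This is a
simplified illustration, not the dns_name_fromwire() code: the function
name, signature, and return convention are assumptions, and it skips
everything the real function does beyond decompression (offsets, name
attributes, buffer handling).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_NAME 255 /* maximum length of an uncompressed name */

/*
 * Copy the name at msg[start] into 'out', following compression
 * pointers. Returns the uncompressed length, or 0 on error.
 * '*consumed' is set to the number of bytes the name occupies at
 * 'start' (it is left untouched on error).
 */
static size_t
name_fromwire(const uint8_t *msg, size_t msglen, size_t start,
	      uint8_t out[MAX_NAME], size_t *consumed) {
	size_t pos = start;   /* read cursor */
	size_t run = start;   /* start of the current run of labels */
	size_t limit = start; /* pointers must target an offset below this */
	size_t outlen = 0;
	int jumped = 0;

	assert(msg != NULL && out != NULL && consumed != NULL);

	for (;;) {
		if (pos >= msglen) {
			return 0;
		}
		uint8_t len = msg[pos];
		if (len < 64) {
			/* Ordinary label: one bounds check per label. */
			size_t next = pos + 1 + len;
			if (next > msglen ||
			    outlen + (next - run) > MAX_NAME) {
				return 0;
			}
			pos = next;
			if (len == 0) {
				/* Root label: copy the final run and stop. */
				memmove(out + outlen, msg + run, pos - run);
				outlen += pos - run;
				if (!jumped) {
					*consumed = pos - start;
				}
				return outlen;
			}
		} else if (len >= 192) {
			/* Compression pointer: read both bytes at once. */
			if (pos + 2 > msglen) {
				return 0;
			}
			size_t target = ((size_t)(len & 0x3f) << 8) |
					msg[pos + 1];
			if (target >= limit) {
				return 0; /* must point backwards: no loops */
			}
			/* Copy the run of labels that preceded the pointer. */
			memmove(out + outlen, msg + run, pos - run);
			outlen += pos - run;
			if (!jumped) {
				/* The name ends at the first pointer. */
				*consumed = pos + 2 - start;
				jumped = 1;
			}
			limit = target;
			pos = run = target;
		} else {
			return 0; /* reserved label types */
		}
	}
}
```

The end marker ('*consumed') is written in exactly two places, at the
first pointer or at the root label, and the copy happens once per run of
labels rather than once per character.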
When using dual-stack-servers the covering namespace to check whether
answers are in scope or not should be fctx->domain. To do this we need
to be able to distinguish forwarding due to forwarders clauses from
forwarding due to dual-stack-servers. A new flag
FCTX_ADDRINFO_DUALSTACK has been added to signal this.
It was possible to set operating system limits (RLIMIT_DATA,
RLIMIT_STACK, RLIMIT_CORE and RLIMIT_NOFILE) from named.conf. It's
better to leave these untouched, as setting them is the responsibility
of the operating system and/or supervisor.
Deprecate the configuration options and remove them in a future BIND 9
release.
The small/large tuning has been completely removed from the code,
except for a last remnant of dead code in ns_interfacemgr. Remove the
dead code and the configure option.
There were a number of places where the zone table should have been
locked, but wasn't, when dns_zt_apply was called.
Added an isc_rwlocktype_t type parameter to dns_zt_apply and adjusted
all calls to use it. Removed locks in callers.
Although the RFC says that NSEC3PARAM is not intended to be cached by
resolvers, and thus a TTL of 0 is the most logical choice, a zero-TTL
RRset can be abused by bad actors.
Change the default to SOA MINIMUM.
The call to dns_view_flushcache() is done under exclusive mode, but we
still need to check if view->adb is still attached before calling
dns_adb_flush() because the shutdown might have already been
initiated. This is most likely only a theoretical problem on shutdown,
because there's either no way to initiate a cache flush when shutting
down, or only a very slim window during named shutdown in which
`rndc flush` would have to arrive.
When starting priming from dns_view_find(), the dns_view shutdown could
be initiated by a different thread, detaching from the resolver. Use
dns_view_getresolver() to attach to the resolver under view->lock, so we
don't try to call dns_resolver_prime() with a NULL pointer.
There are more accesses to view->resolver (and also view->adb and
view->requestmgr, which suffer from the same problem) in the dns_view
module, but they are all done in exclusive mode or under view->lock.
Firefox 90+ apparently sends more than 10 headers, so we need to raise
the limit. Bump it to 100 just to be on the safe side; this is for
internal use only anyway.
Replace the use of the isc_ht API with the isc_hashmap API in the
dns_resolver implementation. This requires extending the fctxbucket_t
structure to include the key size and a copy of the key, because the
isc_hashmap API needs the raw key when resizing the hashmap table.
Replace the use of the isc_ht API with the isc_hashmap API in the
dns_adb database implementation. This requires extending the
dns_adbnamebucket_t and dns_adbentrybucket_t structures to include the
key size and a copy of the key, because the isc_hashmap API needs the
raw key when resizing the hashmap table.
Add a new isc_hashmap API that differs from the current isc_ht API in
several aspects:
1. It implements Robin Hood hashing, which is an open-addressing hash
table algorithm (i.e. no linked lists).
2. No memory allocations: the array to store the nodes is made of
isc_hashmap_node_t structures instead of just pointers, so there's
only an allocation on resize.
3. The key is not copied into the hashmap node and must also be stored
externally, either as part of the stored value or in any other
location that's valid as long as the value is stored in the hashmap.
This makes isc_hashmap_t a little less universal because of the key
storage requirements, but the inserts and deletes are faster because
they don't require memory allocation on isc_hashmap_add() and memory
deallocation on isc_hashmap_delete().
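To illustrate the three properties above, here is a small Robin Hood
table. It is only a sketch under simplifying assumptions: the names
(map_t, node_t, map_add() and so on) are hypothetical and are not the
isc_hashmap API, the table has a fixed size and never resizes, keys are
C strings, and a NULL value marks an empty slot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TABLE_SIZE  16               /* power of two; never resized here */
#define MAX_ENTRIES (TABLE_SIZE - 1) /* keep one slot empty so probes end */

typedef struct {
	const char *key; /* not copied: the caller keeps the key alive */
	void *value;     /* NULL marks an empty slot */
	uint32_t hash;
	uint32_t psl;    /* probe sequence length: distance from ideal slot */
} node_t;

typedef struct {
	node_t nodes[TABLE_SIZE]; /* nodes stored inline, not as pointers */
	size_t count;
} map_t;

static uint32_t
hash_str(const char *s) { /* FNV-1a */
	uint32_t h = 2166136261u;
	while (*s != '\0') {
		h ^= (uint8_t)*s++;
		h *= 16777619u;
	}
	return h;
}

static node_t *
map_find(map_t *m, const char *key) {
	assert(m != NULL && key != NULL);
	uint32_t h = hash_str(key);
	size_t pos = h & (TABLE_SIZE - 1);
	for (uint32_t psl = 0; psl < TABLE_SIZE; psl++) {
		node_t *n = &m->nodes[pos];
		if (n->value == NULL || n->psl < psl) {
			return NULL; /* the key would have been stored here */
		}
		if (n->hash == h && strcmp(n->key, key) == 0) {
			return n;
		}
		pos = (pos + 1) & (TABLE_SIZE - 1);
	}
	return NULL;
}

static bool
map_add(map_t *m, const char *key, void *value) {
	if (value == NULL || m->count >= MAX_ENTRIES ||
	    map_find(m, key) != NULL) {
		return false;
	}
	node_t cur = { .key = key, .value = value, .hash = hash_str(key) };
	size_t pos = cur.hash & (TABLE_SIZE - 1);
	for (;;) {
		node_t *n = &m->nodes[pos];
		if (n->value == NULL) {
			*n = cur;
			m->count++;
			return true;
		}
		if (n->psl < cur.psl) {
			/* Take the slot from the entry closer to its home. */
			node_t tmp = *n;
			*n = cur;
			cur = tmp;
		}
		cur.psl++;
		pos = (pos + 1) & (TABLE_SIZE - 1);
	}
}

static bool
map_delete(map_t *m, const char *key) {
	node_t *n = map_find(m, key);
	if (n == NULL) {
		return false;
	}
	size_t pos = (size_t)(n - m->nodes);
	for (;;) {
		/* Backward shift: pull displaced followers one slot closer. */
		size_t next = (pos + 1) & (TABLE_SIZE - 1);
		node_t *nn = &m->nodes[next];
		if (nn->value == NULL || nn->psl == 0) {
			break;
		}
		m->nodes[pos] = *nn;
		m->nodes[pos].psl--;
		pos = next;
	}
	m->nodes[pos] = (node_t){ 0 };
	m->count--;
	return true;
}
```

Neither map_add() nor map_delete() touches the allocator, and the key
pointer is stored as-is, which is why the caller has to keep the key
valid for as long as the entry is in the table.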
Previously, the tree read lock could be upgraded to a write lock in
decrement_reference() and then downgraded back to read lock in
dereference_iter_node(). When the use of isc_rwlock_downgrade() was
removed, the downgrade was changed to a simple unlock+lock. This allows
some delete operations to sneak in and delete nodes that the iterator
expects to be in place.
Expand decrement_reference() so the caller can indicate whether the
tree read lock should be upgraded, and disallow the upgrade when
calling from dereference_iter_node(), so there will be no need to
release the lock afterward.
zone_refreshkeys() could run before zone_shutdown(), but after the
last .erefs had been "detached", causing an assertion failure when
calling dns_zone_attach(). Remove the use of .erefs
(dns_zone_attach/detach) and replace it with .irefs and additional
checks in the callbacks for whether the zone is exiting.
There was an exception for dnssec-policy that allowed DNSSEC in the
unsigned version of the zone. This however causes a crash if the
zone switches from dynamic to inline-signing in the case of NSEC3,
because we are now trying to add an NSEC3 record to a non-NSEC3 node.
This is because BIND expects none of the records in the unsigned
version of the zone to be NSEC3.
Remove the exception for dnssec-policy when copying non-DNSSEC
records, but do allow for DNSKEY as this may be a published DNSKEY
from a different provider.
The dead nodes might get reactivated while the db iterator walks the
version of the tree, so we can't clean up the dead nodes while the db
version is open. Restore the previous behaviour that cleaned up the
dead nodes when we are closing the version.
While using mutrace, the pthread_rwlock based isc_rwlock implementation
would all be tracked in the rwlock.c unit, losing all useful information
as all rwlocks would be traced in a single place. Rewrite the
pthread_rwlock based implementation to be header-only macros, so we can
use mutrace to properly track the rwlock contention without heavily
patching mutrace to understand the libisc synchronization primitives.
Instead of doing the PTHREAD_RUNTIME_CHECK in the header, move it
to the pthread_rwlock implementation functions. The internal isc_rwlock
actually cannot fail, so the checks in the header were useless anyway.
The dns_rbtdb unit already tracks the state of the node and tree rwlocks
during the top level function and passes the states of the locks to the
called functions.
Add the tree locking family of macros modeled after node locking macros,
and expand both to track the state of the lock in an external variable.
Additionally, in developer mode, add a precondition to the macros that
the lock is in the required state - this should cause an assertion
failure on double locking instead of the thread getting stuck.
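A rough sketch of the state-tracking idea, assuming nothing about the
actual macros in dns_rbtdb: the macro names, the lockstate_t enum, and
the fakelock_t stand-in (used here so the example needs no threading
library) are all hypothetical.

```c
#include <assert.h>

typedef enum { LOCK_NONE = 0, LOCK_READ, LOCK_WRITE } lockstate_t;

/* Stand-in for a real rwlock; it only counts holders. */
typedef struct {
	int readers;
	int writers;
} fakelock_t;

/* Each macro checks the expected state, locks, and records the new state. */
#define TREE_RDLOCK(lock, statep)               \
	do {                                    \
		assert(*(statep) == LOCK_NONE); \
		(lock)->readers++;              \
		*(statep) = LOCK_READ;          \
	} while (0)

#define TREE_WRLOCK(lock, statep)               \
	do {                                    \
		assert(*(statep) == LOCK_NONE); \
		(lock)->writers++;              \
		*(statep) = LOCK_WRITE;         \
	} while (0)

#define TREE_UNLOCK(lock, statep)               \
	do {                                    \
		assert(*(statep) != LOCK_NONE); \
		if (*(statep) == LOCK_READ) {   \
			(lock)->readers--;      \
		} else {                        \
			(lock)->writers--;      \
		}                               \
		*(statep) = LOCK_NONE;          \
	} while (0)
```

With the state kept in a variable owned by the caller, a second
TREE_RDLOCK() on an already held lock trips the assertion immediately
rather than blocking.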
The only place where isc_rwlock_downgrade was being used was
decrement_reference(), where the code relocks the node rwlock for
writing and then tries to upgrade the tree lock. When returning
from the function it tries to restore the locks to their previous
state, which is nice, but kind of moot, because at every use of
decrement_reference() the node lock is immediately or almost
immediately unlocked, and the same holds for the tree lock.
Instead of trying to restore the node and tree locks to their initial
state, decrement_reference() now returns the state of the locks, so
the caller can then use the right unlock operation (read or write).
Only when the tree lock was originally unlocked does
decrement_reference() unlock the tree lock before returning to the
caller.
When named starts it creates an empty KEYDATA record in the managed-keys
zone as a placeholder, then schedules a key refresh. If key refresh
fails for some reason (e.g. connectivity problems), named will load the
placeholder key into secroots as a trusted key during the next startup,
which will break the chain of trust, and named will never recover from
that state until managed-keys.bind and managed-keys.bind.jnl files are
manually deleted before (re)starting named again.
Before calling load_secroots(), check that we are not dealing with a
placeholder.
Because dns_resolver_createfetch() locks the view, it was necessary
to unlock the zone in zone_refreshkeys() before calling it in order
to maintain the lock order, and relock afterward. This permitted a race
with dns_zone_synckeyzone().
This commit moves the call to dns_resolver_createfetch() into a separate
function which is called asynchronously after the zone has been
unlocked.
The keyfetch object now attaches to the zone to ensure that
it won't be shut down before the asynchronous call completes.
This necessitated refactoring dns_zone_detach() so it always runs
unlocked. For managed zones it now schedules zone_shutdown() to
run asynchronously, and for unmanaged zones, it requires the last
dns_zone_detach() to be run without loopmgr running.
When more than one event was scheduled in the isc_async queue,
they were executed in reverse order. We need to pull events
off the back of the queue instead of the front, so that uv_loop will
run them in the right order.
Note that isc_job_run() has the same behavior, because it calls
uv_idle_start() directly. In that case we just document it so
it'll be less surprising in the future.
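The ordering problem can be shown with a generic doubly linked queue.
This is an illustration only and assumes nothing about how the
isc_async queue is actually built: job_t, queue_t and the three
functions are hypothetical names. New jobs are pushed on the front, so
draining from the front replays them newest first, while draining from
the back replays them in submission order.

```c
#include <assert.h>
#include <stddef.h>

typedef struct job {
	int id;
	struct job *prev, *next;
} job_t;

typedef struct {
	job_t *head, *tail;
} queue_t;

/* Producers add new jobs at the front of the queue. */
static void
queue_push_front(queue_t *q, job_t *j) {
	assert(q != NULL && j != NULL);
	j->prev = NULL;
	j->next = q->head;
	if (q->head != NULL) {
		q->head->prev = j;
	} else {
		q->tail = j;
	}
	q->head = j;
}

/* Draining from the back yields jobs oldest first (submission order). */
static job_t *
queue_pop_back(queue_t *q) {
	job_t *j = q->tail;
	if (j == NULL) {
		return NULL;
	}
	q->tail = j->prev;
	if (q->tail != NULL) {
		q->tail->next = NULL;
	} else {
		q->head = NULL;
	}
	j->prev = j->next = NULL;
	return j;
}

/* Draining from the front yields jobs newest first (reverse order). */
static job_t *
queue_pop_front(queue_t *q) {
	job_t *j = q->head;
	if (j == NULL) {
		return NULL;
	}
	q->head = j->next;
	if (q->head != NULL) {
		q->head->prev = NULL;
	} else {
		q->tail = NULL;
	}
	j->prev = j->next = NULL;
	return j;
}
```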
Instead of doing incremental zone loading with a fixed quantum of 100
loaded lines per event, move the zone loading process to the offloaded
libuv threads using the isc_work_enqueue() API.
This has the advantage that the thread scheduling is given back to the
operating system that understands blocking operations, and the zone
loading operation doesn't block the networking threads directly.
Incremental file loads now use loopmgr events instead of task events.
The dns_master_loadstreaminc(), _loadbufferinc(), _loadlexer() and
_loadlexerinc() functions were not used in BIND, and have been removed.
dns_rdata_checksvcb performs data entry checks on SVCB records.
In particular, it checks that a _dns SVCB record has an 'alpn'
parameter and, if that 'alpn' parameter indicates HTTP is in use,
that 'dohpath' is present.
The wrong tls configuration was picked here. It should be of the
primary that is selected by forward->which, not zone->curprimary.
This bug may cause BIND to select the wrong primary when retrieving
the TLS settings, or cause a crash in case the wrongly selected primary
has no TLS settings.
The netievent handler for isc_nmsocket_set_tlsctx() was inadvertently
ifdef'd out when BIND was built with --disable-doh, resulting in an
assertion failure on startup when DoT was configured.
The ARM states that the "eligibility" TTL is the smallest original TTL
value that is accepted for a record to be eligible for prefetching,
but the code that implements the condition doesn't behave that way
in the edge case where the TTL is equal to the configured
eligibility value.
Fix the code to check that the TTL is greater than or equal to the
configured eligibility value, instead of just greater than it.
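The boundary condition amounts to the following comparison. The
function name and parameters are hypothetical and only illustrate the
check; they are not taken from the resolver code.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * 'eligibility' is the smallest original TTL accepted for prefetching,
 * so a record whose TTL equals it must qualify (">=", not ">").
 */
static bool
prefetch_eligible(uint32_t original_ttl, uint32_t eligibility) {
	return original_ttl >= eligibility;
}
```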
For UDP queries, after calling dns_adb_beginudpfetch() in fctx_query(),
make sure that dns_adb_endudpfetch() is also called on the error path,
in order to adjust the quota back.
It is currently possible that dns_adb_endudpfetch() is not
called in fctx_cancelquery() for a UDP query, which results
in quotas not being adjusted back.
Always call dns_adb_endudpfetch() for UDP queries.
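The pairing rule from the two fixes above, sketched with a plain
counter. Everything here is hypothetical (quota_t, the helper names,
and the 'fail' flag that simulates an error); it only shows that every
"begin" on the UDP path needs a matching "end" on both the error path
and the cancel path.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
	unsigned int active; /* UDP fetches currently counted */
} quota_t;

static void
begin_udp_fetch(quota_t *q) {
	q->active++;
}

static void
end_udp_fetch(quota_t *q) {
	assert(q->active > 0);
	q->active--;
}

/* Returns true if the query was sent; on failure the quota is released. */
static bool
send_query(quota_t *q, bool is_udp, bool fail) {
	if (is_udp) {
		begin_udp_fetch(q);
	}
	if (fail) {
		goto cleanup;
	}
	return true; /* still in flight; released on completion or cancel */

cleanup:
	if (is_udp) {
		end_udp_fetch(q); /* error path: adjust the quota back */
	}
	return false;
}

/* Cancelling a UDP query always releases its quota slot. */
static void
cancel_query(quota_t *q, bool is_udp) {
	if (is_udp) {
		end_udp_fetch(q);
	}
}
```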