The notify_send_toaddr() function uses hardcoded timeout values
for both UDP and TCP, even though named has configurable timeout
values for TCP. Slightly refactor the timeout calculation and use
the configured 'tcp-initial-timeout' value as the connect timeout.
The new 'tcp-primaries-timeout' configuration option works the same way
as the existing 'tcp-initial-timeout' option, but applies only to the
TCP connections made to the primary servers, so that the timeout value
can be set separately for them. The default is 15 seconds.
Also, while adapting zone.c's code to support the new option, lightly
refactor the way UDP timeouts are calculated by using definitions
instead of hardcoded values.
Write Python-based tests for the many test cases from the kasp system test that share the same pattern.
Merge branch 'matthijs-pytest-rewrite-kasp-system-test-3' into 'main'
See merge request isc-projects/bind9!10268
For 'keystore.kasp', a setting 'key-directories' is used. If set, it
is expected to be a list of two directories: the first is where the KSKs
are stored, the second is the ZSK key directory. This
may be expanded in the future to test more complex key storage cases.
The 'rumoured.kasp' zone is weird: the key timings can never match
those key states. But it is a regression test for an early bug,
so we convert it, but skip the expected key times check.
These test cases follow the same pattern as many others, but all require
some additional checks. These are set in "additional-tests".
The "zsk-missing.autosign" zone is handled specially, as it expects the
KSK to sign the SOA RRset (because the ZSK is unavailable).
The kasp/ns3/setup.sh script is updated so the SyncPublish is not set
(named will initialize it correctly). For the test zones that have
missing private key files we do need to set the expected key timing
metadata.
Remove the counterparts for the newly added test from the kasp shell
tests script.
The check_signatures code was initially created to be suitable for
the ksr system test, to test the Offline KSK feature. For that, a
key is expected to be signing if the current time is between
the timing metadata Active and Retired.
With dnssec-policy, the key timing metadata is only indicative; the key
states determine the actual signing behavior.
Update the check_signatures function so that by default the signing
is derived from the key states (ksigning and zsigning). Add an
argument 'offline_ksk'; if set, make sure that zsigning is set
if the current time is between the Active and Retired timing metadata,
and for ksigning just use the timing metadata (as the key is offline,
we cannot check the key states).
Another (upcoming) test case is where key files are missing. When the
ZSK private key file is missing, the KSK takes over. Add an argument
'zsk_missing'; when set to True, the expected zone signing (zsigning)
is reversed.
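A minimal Python sketch of how the expected signing could be derived under
the three modes described above. The KeyView class, its fields, and
expected_signing() are hypothetical stand-ins and not the isctest API; in
particular, treating 'offline_ksk' as purely timing-based for both key roles
is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class KeyView:
    """Hypothetical, simplified view of a key (not the isctest Key class)."""

    is_ksk: bool
    is_zsk: bool
    ksigning: bool  # signing expectation derived from the key states
    zsigning: bool  # signing expectation derived from the key states
    active: Optional[int] = None  # Active timing metadata, epoch seconds
    retired: Optional[int] = None  # Retired timing metadata, epoch seconds


def in_active_window(key: KeyView, now: int) -> bool:
    """True if 'now' lies between the Active and Retired timing metadata."""
    if key.active is None or now < key.active:
        return False
    return key.retired is None or now < key.retired


def expected_signing(
    key: KeyView, now: int, offline_ksk: bool = False, zsk_missing: bool = False
) -> Tuple[bool, bool]:
    """Return the (ksigning, zsigning) expectation for one key."""
    if offline_ksk:
        # Assumption: the key states are not consulted, only the timings.
        window = in_active_window(key, now)
        ksigning = key.is_ksk and window
        zsigning = key.is_zsk and window
    else:
        # Default: the key states decide.
        ksigning = key.ksigning
        zsigning = key.zsigning
    if zsk_missing:
        # The KSK takes over zone signing, so the expectation is reversed.
        zsigning = not zsigning
    return ksigning, zsigning
```

With this sketch, a KSK whose states say "signs DNSKEY only" is expected to
also sign zone data once 'zsk_missing' is set.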
The zone 'pregenerated.kasp' is a case where there already exist more
keys than required. For this we set the 'pregenerated' setting. This
will change the 'keydir_to_keylist' function behavior: Only keys in use
are considered. A key is in use if all of the states are either
undefined, or set to 'hidden'.
The 'some-keys.kasp' zone is similar to 'pregenerated.kasp', except
only some keys have been pregenerated.
Write python-based tests for the many test cases from the kasp system
test. These test cases all follow the same pattern:
- Wait until the zone is signed.
- Check the keys from the key-directory against expected properties.
- Set the expected key timings derived from when the key was created.
- Check the key timing metadata against expected timings.
- Check the 'rndc dnssec -status' output.
- Check the apex is signed correctly.
- Check a subdomain is signed correctly.
- Verify that the zone is DNSSEC correct.
Remove the counterparts for the newly added test from the kasp shell
tests script.
The 'qpnode->nsec' structure member isn't protected by a lock and
there's a data race between the reading and writing parts in the
qpcache_addrdataset() function. Use a node read lock for accessing
'qpnode->nsec' in qpcache_addrdataset(). Add an additional
'qpnode->nsec != DNS_DB_NSEC_HAS_NSEC' check under a write lock
to be sure that no other competing thread changed it in the time
when the read lock is unlocked and a write lock is not acquired
yet.
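The check-then-recheck pattern described above can be sketched in Python.
This is an illustration only, not the qpcache code: Python's standard
library has no read-write lock, so a single threading.Lock stands in for
both the read and the write lock, and Node, HAS_NSEC and mark_has_nsec()
are hypothetical names.

```python
import threading

NORMAL = 0
HAS_NSEC = 1  # stand-in for DNS_DB_NSEC_HAS_NSEC


class Node:
    """Hypothetical node with a lock-protected 'nsec' member."""

    def __init__(self) -> None:
        self.lock = threading.Lock()  # stands in for the node rwlock
        self.nsec = NORMAL
        self.updates = 0  # counts how often the write path actually ran


def mark_has_nsec(node: Node) -> bool:
    """Set node.nsec to HAS_NSEC exactly once, even with competing threads."""
    # First check, under the ("read") lock.
    with node.lock:
        needs_update = node.nsec != HAS_NSEC
    if not needs_update:
        return False
    # The lock is released here, so another thread may update the node
    # before we re-acquire it. That is why the check is repeated below.
    with node.lock:
        if node.nsec != HAS_NSEC:
            node.nsec = HAS_NSEC
            node.updates += 1
            return True
    return False
```

However many threads call mark_has_nsec() on the same node, only one of
them performs the update; the others see the new value in the second check.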
Closes #5285
Merge branch '5285-data-race-in-qpcache_addrdataset' into 'main'
See merge request isc-projects/bind9!10397
When the ``stale-answer-client-timeout 0`` option was enabled, it could be ignored
when resolving a zone which is a delegation of an authoritative zone belonging
to the resolver. This has been fixed.
Closes #5275
Merge branch '5275-stale-answer-client-timeout-0-and-delegation-fix' into 'main'
See merge request isc-projects/bind9!10381
Add a new test which gets an answer for a delegated zone, then
checks whether the 'stale-answer-client-timeout 0' mode (i.e. the
'stalefirst' mode) works for it.
When 'stale-answer-client-timeout' is 0, named is allowed to return
a stale answer immediately, while also initiating a new query to get
the real answer. This mode is activated in ns__query_start() by setting
the 'qctx->options.stalefirst' option to 'true' before calling the
query_lookup() function, but not when the zone is known to be
authoritative to the server. When the zone is authoritative, and
query_lookup() finds out that the requested name is a delegation,
then before proceeding with the query, named tries to look it up
in the cache first. The issue is that named doesn't consider
enabling 'qctx->options.stalefirst' in this case, and so the
'stale-answer-client-timeout 0' setting doesn't work for those
delegated zones: instead of immediately returning the stale answer
(if it exists), named tries to resolve it.
Fix this issue by enabling 'qctx->options.stalefirst' in the
query_zone_delegation() function just before named looks up the name
in the cache using a new query_lookup() call. Also, if nothing was
found in the cache, don't initiate another query_lookup() from inside
query_notfound(), and let query_notfound() do its work, i.e. it will
call query_delegation() for further processing.
`dig` was producing invalid YAML when displaying some EDNS options. This has been corrected.
Several other improvements have been made to the display of EDNS option data:
- We now use the correct name for the UPDATE-LEASE option, which was previously displayed as "UL", and split it into separate LEASE and LEASE-KEY components in YAML mode.
- Human-readable durations are now displayed as comments in YAML mode so as not to interfere with machine parsing.
- KEY-TAG options are now displayed as an array of integers in YAML mode.
- EDNS COOKIE options are displayed as separate CLIENT and SERVER components, and cookie STATUS is a retrievable variable in YAML mode.
Closes #5014
Merge branch '5014-improve-edns-yaml-processing' into 'main'
See merge request isc-projects/bind9!9695
Split the YAML display of the EDNS COOKIE option into CLIENT and SERVER
parts. The STATUS of the EDNS COOKIE in the reply is now a YAML element
rather than a comment.
The official EDNS option name for "UL" is "UPDATE-LEASE". We now
emit "UPDATE-LEASE" instead of "UL" when printing messages, but
"UL" has been retained as an alias on the command line.
Update leases consist of 1 or 2 values, LEASE and KEY-LEASE. These
components are now emitted separately so they can be easily extracted
from YAML output. Tests have been added to check YAML correctness.
When rendering text, such as domain names or the EXTRA-TEXT
field of the EDE option, backslashes and quotation marks must
be escaped to ensure that the emitted message is valid YAML.
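The escaping rule can be illustrated with a short Python sketch. This is not
the code used by dig; yaml_quote() is a hypothetical helper that only handles
the two characters named above and assumes the text is emitted as a
double-quoted YAML scalar.

```python
def yaml_quote(text: str) -> str:
    """Return 'text' as a double-quoted YAML scalar.

    Backslashes are escaped first, so that the backslashes introduced
    when escaping quotation marks are not escaped a second time.
    """
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


if __name__ == "__main__":
    # An EXTRA-TEXT style value containing quotation marks.
    print(yaml_quote('say "hi"'))  # prints "say \"hi\""
```

The order of the two replacements matters: escaping quotation marks first
would produce backslashes that the second step then doubles.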
The CHAIN and REPORT-CHANNEL EDNS options are both domain names, so they
can be combined. The CLIENT-TAG and SERVER-TAG EDNS options are both
16-bit integers, so they can be combined.
Apple broke custom memory allocation functions in the system-wide libxml2 starting with macOS Sequoia 15.4. Usage of the custom memory allocation functions has been disabled on macOS.
Closes #5268
Merge branch '5268-disable-libxml2-memory-management-on-macos' into 'main'
See merge request isc-projects/bind9!10374
The custom allocation API for libxml2 is deprecated starting in macOS
Sequoia 15.4, iOS 18.4, tvOS 18.4, and visionOS 2.4.
Disable the memory function override for libxml2 when
LIBXML_HAS_DEPRECATED_MEMORY_ALLOCATION_FUNCTIONS is defined, as Apple
broke the system-wide libxml2 starting with macOS Sequoia 15.4.
Convert the first batch of tests from `kasp/tests.sh` to `kasp/tests_kasp.py`.
Merge branch 'matthijs-pytest-rewrite-kasp-system-test-2' into 'main'
See merge request isc-projects/bind9!10253
isctest.util was not imported, so file_contents_contain could not be
found. Also rename verify_keys to check_keys because it asserts inside
isctest.run.retry_with_timeout.
This converts a special characters test case, a max-zone-ttl error
check, and two cases of insecure zones.
We no longer assert for having more than one DNSKEY and/or RRSIG
records. If the zone is insecure, this is no longer always true. And
we already check for the expected number of records in the
check_dnskeys/check_signatures functions.
This commit deals with converting the dynamic zone test cases to
pytest. The tests for 'inline-signing.kasp' are similar to the default
case, so these are added to 'test_kasp_default'.
Unfortunately, I need to add sleep calls in between freezing, updating,
and thawing a zone. Without them, the intermittent failures are too
frequent.
This commit deals with converting the test cases related to the default
dnssec-policy.
This requires a new method 'check_update_is_signed'. This method will
be used in future tests as well, and checks if an expected record is
in the zone and is properly signed.
Remove the counterparts for the newly added test from the kasp shell
tests script.
Convert the first couple of tests from 'kasp/tests.sh' to
'kasp/tests_kasp.py', those are test cases related to 'dnssec-keygen'
and 'dnssec-settime'.
For this, we also add a new KeyProperties method,
'policy_to_properties', that takes a list of strings which represent
the keys according to the dnssec-policy and the expected key states.
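To illustrate the idea of describing keys as strings, here is a small Python
sketch. The string layout used below (role, lifetime, algorithm, size,
followed by name:value state pairs) and the parse_key_spec() function are
assumptions made for the example; they are not taken from the
'policy_to_properties' implementation.

```python
from typing import Dict, Union

Properties = Dict[str, Union[str, int, Dict[str, str]]]


def parse_key_spec(spec: str) -> Properties:
    """Parse one hypothetical key description string.

    Assumed layout: "<role> <lifetime> <algorithm> <size> [state:value ...]"
    """
    tokens = spec.split()
    if len(tokens) < 4:
        raise ValueError(f"incomplete key description: {spec!r}")
    role, lifetime, algorithm, size = tokens[:4]
    # Everything after the fourth token is an expected key state.
    states = dict(token.split(":", 1) for token in tokens[4:])
    return {
        "role": role,
        "lifetime": int(lifetime),
        "algorithm": int(algorithm),
        "size": int(size),
        "states": states,
    }


if __name__ == "__main__":
    for line in ["ksk 0 13 256 goal:omnipresent", "zsk 0 13 256"]:
        print(parse_key_spec(line))
```

Keeping the expectations in compact strings lets each test case list its
keys in one place, next to the policy it exercises.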
The pthread-based implementation of the isc_rwlock_tryupgrade()
function always returns ISC_R_LOCKBUSY. Fix the test by adding
conditional checks.
Closes #5287
Merge branch '5287-pthread-rwlock-tryupgrade-test-fix' into 'main'
See merge request isc-projects/bind9!10398
v9.21.7 was released and the job now passes.
Merge branch 'nicki/ci-re-enable-cross-version-config-tests' into 'main'
See merge request isc-projects/bind9!10402
When isc__thread_initialize() is called from a library constructor, it
could be called before we fork the main process. This happens with
named, and then we have the call_rcu_thread attached to the pre-fork
process and not the post-fork process, which means that the initial
process will never shut down, because there's no one to tell it to.
Move the isc__thread_initialize() and isc__thread_shutdown() to the
isc_loop unit where we call it before creating the extra thread and
after joining all the extra threads respectively.
Closes #5281
Merge branch '5281-move-call_rcu-thread-ctor-dtor-to-main-thread' into 'main'
See merge request isc-projects/bind9!10394
The QPDB_VIRTUAL value was introduced to allow clients (presumably
ns_clients) that have been running for some time to access the cached
data that was valid at the time of their inception. The default value
of 5 minutes is way longer than the lifetime of the ns_client object, as
the resolver will give up after 2 minutes.
Reduce the value to 10 seconds to honour the original intent
more closely, but still allow some leeway for clients that started some
time in the past.
Our measurements show that even setting this value to 0 has no
statistically significant effect, thus the value of 10 seconds should be
on the safe side.
Merge branch 'ondrej/reduce-QPDB_VIRTUAL' into 'main'
See merge request isc-projects/bind9!10309
The *DB_VIRTUAL value was introduced to allow clients (presumably
ns_clients) that have been running for some time to access the cached
data that was valid at the time of their inception. The default value
of 5 minutes is way longer than the lifetime of the ns_client object, as
the resolver will give up after 2 minutes.
Reduce the value to 10 seconds to honour the original intent
more closely, but still allow some leeway for clients that started some
time in the past.
Our measurements show that even setting this value to 0 has no
statistically significant effect, thus the value of 10 seconds should be
on the safe side.
`python-jinja2` is now required to run system tests.
Related #4938
Merge branch 'nicki/replace-setup-sh-files-with-jinja2-templates' into 'main'
See merge request isc-projects/bind9!9588
Many of the system tests now use the jinja2 template engine. Adding jinja2
as a hard dependency is preferable to potentially silently skipping
many system tests.
These setup.sh scripts only do templating and copying files. Both of
these can be replaced with either jinja templates or plain files.
Since each test invocation creates its own temporary directory, copying
files to ensure a "clean" state is no longer necessary.
In cases where named writes some content to the files, a jinja template
can be used instead of a plain file to avoid an artifact check which
would detect a change to a git-tracked file.