When resolve.c was moved from lib/samples to bin/tests/system, the
resolve.vcxproj.in would still contain old paths to the directory
root. This commit adds one more ..\ to match the directory depth.
Additionally, fixup the path in BINDInstall.vcxproj.in to be
bin/tests/system and not bin/tests/samples.
When setnsec3param() is schedule from zone_postload() there's no
guarantee that `zone->db` is not `NULL` yet. Thus when the
setnsec3param() is called, we need to check for `zone->db` existence and
reschedule the task, because calling `rss_post()` on a zone with empty
`.db` ends up with no-op (the function just returns).
Previously, the taskmgr, timermgr and socketmgr had a constructor
variant, that would create the mgr on top of existing appctx. This was
no longer true and isc_<*>mgr was just calling isc_<*>mgr_create()
directly without any extra code.
This commit just cleans up the extra function.
"resolve" is used by the resolver system tests, and I'm not
certain whether delv exercises the same code, so rather than
remove it, I moved it to bin/tests/system.
sample code for export libraries is no longer needed and
this code is not used for any internal tests. also, sample-gai.c
had already been removed but there were some dangling references.
the libdns client API is no longer being maintained for
external use, we can remove the code that isn't being used
internally, as well as the related tests.
Too much logic was cramped inside the dns_journal_rollforward() that
made it harder to follow. The dns_journal_rollforward() was refactored
to work over already opened journal and some of the previous logic was
moved to new static zone_journal_rollforward() that separates the
journal "rollforward" logic from the "zone" logic.
when dns_journal_rollforward returned ISC_R_RECOVERABLE the distintion
between 'up to date' and 'success' was lost, as a consequence
zone_needdump() was called writing out the zone file when it shouldn't
have been. This change restores that distintion. Adjust system
test to reflect visible changes.
It fixes a corner case which was causing dig to print annoying
messages like:
14-Apr-2021 18:48:37.099 SSL error in BIO: 1 TLS error (errno:
0). Arguments: received_data: (nil), send_data: (nil), finish: false
even when all the data was properly processed.
Before this fix underlying TCP sockets could remain opened for longer
than it is actually required, causing unit tests to fail with lots of
ISC_R_TOOMANYOPENFILES errors.
The change also enables graceful SSL shutdown (before that it would
happen only in the case when isc_nm_cancelread() were called).
This commit merges TLS tests into the common Network Manager unit
tests suite and extends the unit test framework to include support for
additional "ping-pong" style tests where all data could be sent via
lesser number of connections (the behaviour of the old test
suite). The tests for TCP and TLS were extended to make use of the new
mode, as this mode better translates to how the code is used in DoH.
Both TLS and TCP tests now share most of the unit tests' code, as they
are expected to function similarly from a users's perspective anyway.
Additionally to the above, the TLS test suite was extended to include
TLS tests using the connections quota facility.
Due to the lack of "match-clients" clauses in ns4/named2.conf.in, the
same view is incorrectly chosen for all queries received by ns4 in the
"keymgr2kasp" system test. This causes only one version of the
"view-rsasha256.kasp" zone to actually be checked. Add "match-clients"
clauses to ns4/named2.conf.in to ensure the test really checks what it
claims to.
Use identical view names ("ext", "int") in ns4/named.conf.in and
ns4/named2.conf.in so that it is easier to quickly identify the
differences between these two files.
Update tests.sh to account for the above changes. Also fix a copy-paste
error in a comment to prevent confusion.
The test case for a zone with a missing include file was wrong for two
reasons:
1. It was loading the wrong file (master5 instead of master6)
2. It did actually not set the $ret variable to 1 if the test failed
(it should default to ret=1 and clear the variable if the
appropriate log is found).
Add a test case for inline-signing for a zone with an $INCLUDE
statement. There is already a test for a missing include file, this
one adds a test for a zone with an include file that does exist.
Test if the record in the included file is loaded.
The draft says that the NSEC(3) TTL must have the same TTL value
as the minimum of the SOA MINIMUM field and the SOA TTL. This was
always the intended behaviour.
Update the zone structure to also track the SOA TTL. Whenever we
use the MINIMUM value to determine the NSEC(3) TTL, use the minimum
of MINIMUM and SOA TTL instead.
There is no specific test for this, however two tests need adjusting
because otherwise they failed: They were testing for NSEC3 records
including the TTL. Update these checks to use 600 (the SOA TTL),
rather than 3600 (the SOA MINIMUM).
It is more intuitive to have the countdown 'max-stale-ttl' as the
RRset TTL, instead of 0 TTL. This information was already available
in a comment "; stale (will be retained for x more seconds", but
Support suggested to put it in the TTL field instead.
Before binding an RRset, check the time and see if this record is
stale (or perhaps even ancient). Marking a header stale or ancient
happens only when looking up an RRset in cache, but binding an RRset
can also happen on other occasions (for example when dumping the
database).
Check the time and compare it to the header. If according to the
time the entry is stale, but not ancient, set the STALE attribute.
If according to the time is ancient, set the ANCIENT attribute.
We could mark the header stale or ancient here, but that requires
locking, so that's why we only compare the current time against
the rdh_ttl.
Adjust the test to check the dump-db before querying for data. In the
dumped file the entry should be marked as stale, despite no cache
lookup happened since the initial query.
When introducing change 5149, "rndc dumpdb" started to print a line
above a stale RRset, indicating how long the data will be retained.
At that time, I thought it should also be possible to load
a cache from file. But if a TTL has a value of 0 (because it is stale),
stale entries wouldn't be loaded from file. So, I added the
'max-stale-ttl' to TTL values, and adjusted the $DATE accordingly.
Since we actually don't have a "load cache from file" feature, this
is premature and is causing confusion at operators. This commit
changes the 'max-stale-ttl' adjustments.
A check in the serve-stale system test is added for a non-stale
RRset (longttl.example) to make sure the TTL in cache is sensible.
Also, the comment above stale RRsets could have nonsensical
values. A possible reason why this may happen is when the RRset was
marked a stale but the 'max-stale-ttl' has passed (and is actually an
RRset awaiting cleanup). This would lead to the "will be retained"
value to be negative (but since it is stored in an uint32_t, you would
get a nonsensical value (e.g. 4294362497).
To mitigate against this, we now also check if the header is not
ancient. In addition we check if the stale_ttl would be negative, and
if so we set it to 0. Most likely this will not happen because the
header would already have been marked ancient, but there is a possible
race condition where the 'rdh_ttl + serve_stale_ttl' has passed,
but the header has not been checked for staleness.
When system tests are run on Windows, they are assigned port ranges that
are 100 ports wide and start from port number 5000. This is a different
port assignment method than the one used on Unix systems. Drop the "-p"
command line option from bin/tests/system/run.sh invocations used for
starting system tests on Windows to unify the port assignment method
used across all operating systems.
The get_ports.sh script is used for determining the range of ports a
given system test should use. It first determines the start of the port
range to return (the base port); it can either be specified explicitly
by the caller or chosen randomly. Subsequent ports are picked
sequentially, starting from the base port. To ensure no single port is
used by multiple tests, a state file (get_ports.state) containing the
last assigned port is maintained by the script. Concurrent access to
the state file is protected by a lock file (get_ports.lock); if one
instance of the script holds the lock file while another instance tries
to acquire it, the latter retries its attempt to acquire the lock file
after sleeping for 1 second; this retry process can be repeated up to 10
times before the script returns an error.
There are some problems with this approach:
- the sleep period in case of failure to acquire the lock file is
fixed, which leads to a "thundering herd" type of problem, where
(depending on how processes are scheduled by the operating system)
multiple system tests try to acquire the lock file at the same time
and subsequently sleep for 1 second, only for the same situation to
likely happen the next time around,
- the lock file is being locked and then unlocked for every single
port assignment made, not just once for the entire range of ports a
system test should use; in other words, the lock file is currently
locked and unlocked 13 times per system test; this increases the
odds of the "thundering herd" problem described above preventing a
system test from getting one or more ports assigned before the
maximum retry count is reached (assuming multiple system tests are
run in parallel); it also enables the range of ports used by a given
system test to be non-sequential (which is a rather cosmetic issue,
but one that can make log interpretation harder than necessary when
test failures are diagnosed),
- both issues described above cause unnecessary delays when multiple
system tests are started in parallel (due to high lock file
contention among the system tests being started),
- maintaining a state file requires ensuring proper locking, which
complicates the script's source code.
Rework the get_ports.sh script so that it assigns non-overlapping port
ranges to its callers without using a state file or a lock file:
- add a new command line switch, "-t", which takes the name of the
system test to assign ports for,
- ensure every instance of get_ports.sh knows how many ports all
system tests which form the test suite are going to need in total
(based on the number of subdirectories found in bin/tests/system/),
- in order to ensure all instances of get_ports.sh work on the same
global port range (so that no port range collisions happen), a
stable (throughout the expected run time of a single system test
suite) base port selection method is used instead of the random one;
specifically, the base port, unless specified explicitly using the
"-p" command line switch, is derived from the number of hours which
passed since the Unix Epoch time,
- use the name of the system test to assign ports for (passed via the
new "-t" command line switch) as a unique index into the global
system test range, to ensure all system tests use disjoint port
ranges.
The fromhex.pl script needs to be copied from the source directory to
the build directory before any test is run, otherwise the out-of-tree
fails to find it. Given that the script is used only in system test,
move it to bin/tests/system/.