mirror of https://gitlab.isc.org/isc-projects/bind9 synced 2025-08-31 06:25:31 +00:00
Commit Graph

33779 Commits

Author SHA1 Message Date
Matthijs Mekking
8556c7f854 Merge branch '2594-servestale-staleonly-recursion-race' into 'main'
Serve-stale "staleonly" recursion race condition

See merge request isc-projects/bind9!4859
2021-04-02 11:26:57 +00:00
Matthijs Mekking
3d3a6415f7 If RPZ config'd, bail stale-answer-client-timeout
When we are recursing, RPZ processing is not allowed. But when we are
performing a lookup due to "stale-answer-client-timeout", we are still
recursing. This effectively means that RPZ processing is disabled on
such a lookup.

In this case, bail the "stale-answer-client-timeout" lookup and wait
for recursion to complete, as we can't perform the RPZ rewrite
rules reliably.
2021-04-02 10:02:40 +02:00
Matthijs Mekking
839df94190 Rename "staleonly"
The dboption DNS_DBFIND_STALEONLY caused confusion because it implies
we are looking for stale data **only** and ignore any active RRsets in
the cache. Rename it to DNS_DBFIND_STALETIMEOUT as it is more clear
the option is related to a lookup due to "stale-answer-client-timeout".

Rename other usages of "staleonly", instead use "lookup due to...".
Also rename related function and variable names.
2021-04-02 10:02:40 +02:00
Matthijs Mekking
3f81d79ffb Restore the RECURSIONOK attribute after staleonly
When doing a staleonly lookup we don't want to fall back to recursion.
After all, there are obviously problems with recursion, otherwise we
wouldn't do a staleonly lookup.

When resuming from recursion however, we should restore the
RECURSIONOK flag, allowing future required lookups for this client
to recurse.
2021-04-02 10:02:40 +02:00
Matthijs Mekking
aaed7f9d8c Remove result exception on staleonly lookup
When implementing "stale-answer-client-timeout", we decided that
we should only return positive answers prematurely to clients. A
negative response is not useful, and in that case it is better to
wait for the recursion to complete.

To do so, we check the result and if it is not ISC_R_SUCCESS, we
decide that it is not good enough. However, there are more return
codes that could lead to a positive answer (e.g. CNAME chains).

This commit removes the exception and now uses the same logic that
other stale lookups use to determine if we found a useful stale
answer (stale_found == true).

This means we can simplify two test cases in the serve-stale system
test: nodata.example is no longer treated differently than data.example.
2021-04-02 10:02:40 +02:00
Matthijs Mekking
e44bcc6f53 Add notes and changes for [#2594]
Pretty newsworthy.
2021-04-02 10:02:40 +02:00
Matthijs Mekking
3d5429f61f Remove INSIST on NS_QUERYATTR_ANSWERED
The NS_QUERYATTR_ANSWERED attribute is to prevent sending a response
twice. Without the attribute, this may happen if a staleonly lookup
found a useful answer and sends a response to the client, and later
recursion ends and also tries to send a response.

The attribute was also used to mask adding a duplicate RRset. This is
considered harmful. When we created a response to the client with a
stale only lookup (regardless of whether we actually have sent the response),
we should clear the rdatasets that were added during that lookup.

Mark such rdatasets with a new attribute,
DNS_RDATASETATTR_STALE_ADDED. Set a query attribute
NS_QUERYATTR_STALEOK if we may have added rdatasets during a stale
only lookup. Before creating a response on a normal lookup, check if
we can expect rdatasets to have been added during a staleonly lookup.
If so, clear the rdatasets from the message with the attribute
DNS_RDATASETATTR_STALE_ADDED set.
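
The mark-and-clear scheme described above could be sketched roughly as
follows. This is only an illustration, not the code from this commit: the
struct layouts, the flag values, and the function names
add_stale_rdataset() and clear_stale_rdatasets() are all hypothetical
stand-ins for the real BIND types and attributes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for DNS_RDATASETATTR_STALE_ADDED and
 * NS_QUERYATTR_STALEOK; the real definitions live in BIND. */
#define RDATASETATTR_STALE_ADDED 0x01u
#define QUERYATTR_STALEOK        0x01u

typedef struct {
	unsigned int attributes;
	bool in_message; /* true while the rdataset is part of the response */
} rdataset_t;

typedef struct {
	unsigned int attributes;
	rdataset_t *rdatasets;
	size_t count;
} query_t;

/* Add an rdataset to the response during a stale-only lookup and tag it. */
static void
add_stale_rdataset(query_t *query, size_t idx) {
	assert(query != NULL && idx < query->count);
	query->rdatasets[idx].attributes |= RDATASETATTR_STALE_ADDED;
	query->rdatasets[idx].in_message = true;
	query->attributes |= QUERYATTR_STALEOK;
}

/* Before building a response on a normal lookup: drop everything that a
 * previous stale-only lookup added. Returns the number of rdatasets removed. */
static size_t
clear_stale_rdatasets(query_t *query) {
	size_t removed = 0;

	assert(query != NULL);
	if ((query->attributes & QUERYATTR_STALEOK) == 0) {
		return 0; /* no stale-only lookup touched this message */
	}
	for (size_t i = 0; i < query->count; i++) {
		rdataset_t *r = &query->rdatasets[i];
		if (r->in_message &&
		    (r->attributes & RDATASETATTR_STALE_ADDED) != 0) {
			r->in_message = false;
			r->attributes &= ~RDATASETATTR_STALE_ADDED;
			removed++;
		}
	}
	query->attributes &= ~QUERYATTR_STALEOK;
	return removed;
}
```

Tagging individual rdatasets, rather than relying on a single "answered"
flag, lets the normal lookup remove exactly what the stale lookup added
and nothing else.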
2021-04-02 09:15:07 +02:00
Matthijs Mekking
48b0dc159b Simplify when to detach the client
With stale-answer-client-timeout, we may send a response to the client,
but we may want to hold on to the network manager handle, because
recursion is going on in the background, or we need to refresh a
stale RRset.

Simplify the setting of 'nodetach':
* During a staleonly lookup we should not detach the nmhandle, so just
  set it prior to 'query_lookup()'.
* During a staleonly "stalefirst" lookup set the 'nodetach' to true
  if we are going to refresh the RRset.

Now there is no longer the need to clear the 'nodetach' if we go
through the "dbfind_stale", "stale_refresh_window", or "stale_only"
paths.
2021-04-02 09:14:09 +02:00
Matthijs Mekking
92f7a67892 Refactor stale lookups, ignore active RRsets
When doing a staleonly lookup, ignore active RRsets from cache. If we
don't, we may add a duplicate RRset to the message, and hit an
assertion failure in query.c because adding the duplicate RRset to the
ANSWER section failed.

This can happen on a race condition. When a client query is received,
the recursion is started. When 'stale-answer-client-timeout' triggers
around the same time the recursion completes, the following sequence
of events may happen:
1. Queue the "try stale" fetch_callback() event to the client task.
2. Add the RRsets from the authoritative response to the cache.
3. Queue the "fetch complete" fetch_callback() event to the client task.
4. Execute the "try stale" fetch_callback(), which retrieves the
   just-inserted RRset from the database.
5. In "ns_query_done()" we are still recursing, but the "staleonly"
   query attribute has already been cleared. In other words, the
   query will resume when recursion ends (it already has ended but is
   still on the task queue).
6. Execute the "fetch complete" fetch_callback(). It finds the answer
   from recursion in the cache again and tries to add the duplicate to
   the answer section.

This commit changes the logic for finding stale answers in the cache,
such that on "stale_only" lookups actually only stale RRsets are
considered. It refactors the code so that code paths for "dbfind_stale",
"stale_refresh_window", and "stale_only" are more clear.

First we call some generic code that applies in all three cases,
formatting the domain name for logging purposes, increment the
trystale stats, and check if we actually found stale data that we can
use.

The "dbfind_stale" lookup will return SERVFAIL if we didn't find a
usable answer, otherwise we will continue with the lookup
(query_gotanswer()). This is no different from before the introduction of
"stale-answer-client-timeout" and "stale-refresh-time".

The "stale_refresh_window" lookup is similar to the "dbfind_stale"
lookup: return SERVFAIL if we didn't find a usable answer, otherwise
continue with the lookup (query_gotanswer()).

Finally the "stale_only" lookup.

If the "stale_only" lookup was triggered because of an actual client
timeout (stale-answer-client-timeout > 0), and if database lookup
returned a stale usable RRset, trigger a response to the client.
Otherwise return and wait until the recursion completes (or the
resolver query times out).

If the "stale_only" lookup is a "stale-answer-client-timeout 0" lookup,
we prefer stale data over a lookup. In this case if there was no stale
data, or the data was not a positive answer, retry the lookup with the
stale options cleared, a.k.a. a normal lookup. Otherwise, continue
with the lookup (query_gotanswer()) and refresh the stale RRset. This
will trigger a response to the client, but will not detach the handle
because a fetch will be created to refresh the RRset.
2021-04-02 09:14:09 +02:00
Matthijs Mekking
fee164243f Keep track of allow client detach
The stale-answer-client-timeout feature introduced a dependency on
when a client may be detached from the handle. The dboption
DNS_DBFIND_STALEONLY was reused to track this attribute. This overloads
the meaning of this database option, and actually introduced a bug
because the option was checked in other places. In particular, in
'ns_query_done()' there is a check for 'RECURSING(qctx->client) &&
(!QUERY_STALEONLY(&qctx->client->query) || ...' and the condition is
satisfied because recursion has not completed yet and
DNS_DBFIND_STALEONLY is already cleared by that time (in
query_lookup()), because we found a useful answer and we should detach
the client from the handle after sending the response.

Add a new boolean to the client structure to keep track of whether
detaching the client from the handle is allowed. It is only disallowed
if we are in a staleonly lookup and we didn't find a useful answer.
2021-04-02 09:14:09 +02:00
Artem Boldariev
e7fe606020 Merge branch 'artem/tls-tests-and-fixes' into 'main'
TLS transport code refactoring and unit tests

See merge request isc-projects/bind9!4851
2021-04-01 15:41:52 +00:00
Artem Boldariev
fa062162a7 Fix crash (regression) in DIG when handling non-DoH responses
This commit fixes a crash in dig when it encounters an unexpected header
value. The bug was introduced at some point late in the last DoH
development cycle. It also refactors the relevant code a little bit to
ensure better incoming data validation for client-side DoH
connections.
2021-04-01 17:31:29 +03:00
Artem Boldariev
11ed7aac5d TLS code refactoring, fixes and unit-tests
This commit fixes numerous stability issues with TLS transport code as
well as adds unit tests for it.
2021-04-01 17:31:29 +03:00
Ondřej Surý
01cd310407 Merge branch '2607-remove-custom-spnego' into 'main'
Remove custom ISC SPNEGO implementation

Closes #2607

See merge request isc-projects/bind9!4856
2021-04-01 14:14:00 +00:00
Ondřej Surý
66bd47a129 Add CHANGES and release note for GL #2607
2021-04-01 16:08:19 +02:00
Mark Andrews
1febea6d7c Merge branch '2538-bind-9-17-build-process-leaving-files-in-unexpected-locations' into 'main'
Resolve "BIND 9.17 build process leaving files in unexpected locations?"

Closes #2538

See merge request isc-projects/bind9!4757
2021-04-01 09:34:17 +00:00
Mark Andrews
35e8f56b49 Test dynamic libraries should not be installed
Tag the libraries with check_ to prevent them being installed
by "make install".  Additionally make check requires .so to be
created which requires .lai files to be constructed which, in
turn, requires -rpath <dir> as part of "linking" the .la file.
2021-04-01 19:11:54 +11:00
Michal Nowak
b34fd6d4f2 Merge branch 'mnowak/web-run-gcc-tarball-ci-job' into 'main'
Run gcc:tarball CI job in web-triggered pipelines

See merge request isc-projects/bind9!4850
2021-03-31 14:37:15 +00:00
Michal Nowak
4d5d3b75da Run gcc:tarball CI job in web-triggered pipelines
The gcc:tarball CI job may identify problems with tarballs created by
"make dist" of the tarball-create CI job. Enabling the gcc:tarball CI
job in web-triggered pipelines provides developers with a test vector.
2021-03-31 16:35:59 +02:00
Michał Kępień
aaac9345eb Merge branch 'michal/include-all-pre-generated-man-pages-in-make-dist' into 'main'
Include all pre-generated man pages in "make dist"

See merge request isc-projects/bind9!4838
2021-03-29 11:08:00 +00:00
Michał Kępień
490e5cb1f1 Include all pre-generated man pages in "make dist"
Some man pages (e.g. dnstap-read.1, named-nzd2nzf.1) should only be
installed conditionally (when the relevant features are enabled in a
given BIND 9 build).  This is achieved using Automake conditionals.
However, while all source reStructuredText files are included in
tarballs produced by "make dist" (distribution tarballs) as they should
be, the list of pre-generated man pages included in distribution
tarballs incorrectly depends on the ./configure switches used for the
build for which "make dist" is run.  Meanwhile, distribution tarballs
should always contain all the files necessary to build any flavor of
BIND 9.

Here is an example scenario which fails to work as intended:

    autoreconf -i
    ./configure --disable-maintainer-mode
    make dist
    tar --extract --file bind-9.17.11.tar.xz
    cd bind-9.17.11
    ./configure --disable-maintainer-mode --enable-dnstap
    make

Fix by always including pre-generated versions of all conditionally
installed man pages in EXTRA_DIST.  While this may cause some of them to
appear in EXTRA_DIST more than once (depending on the ./configure
switches used for the build for which "make dist" is run), it seems to
not be a problem for Automake.
2021-03-29 13:06:39 +02:00
Mark Andrews
99ff8f285c Merge branch '2597-make-calling-generic-rdata-methods-consistent' into 'main'
Resolve "Make calling generic rdata methods consistent"

Closes #2597

See merge request isc-projects/bind9!4834
2021-03-26 22:27:51 +00:00
Mark Andrews
a88d3963e2 Make calling generic rdata methods consistent
Add matching macros to pass arguments from called methods
to generic methods.  This will reduce the amount of work
required when extending methods.

Also clean up unnecessary UNUSED declarations.
2021-03-26 22:04:42 +00:00
Ondřej Surý
19b69e9a3b Merge branch 'bind-dyndb-ldap-v9.16.13' into 'main'
Do not require config.h to use isc/util.h

See merge request isc-projects/bind9!4840
2021-03-26 18:43:18 +00:00
Petr Mensik
81eb3396bf Do not require config.h to use isc/util.h
util.h requires ISC_CONSTRUCTOR definition, which depends on config.h
inclusion. It does not include it from isc/util.h (or any other header).
Using isc/util.h fails hard when isc/util.h is used without including
bind's config.h.

Move the check to c file, where ISC_CONSTRUCTOR is used. Ensure config.h
is included there.
2021-03-26 11:41:22 +01:00
Diego dos Santos Fronza
f38069cdf8 Merge branch '2490-dig-tcp-does-not-honor-tries-1-nor-retry-0' into 'main'
Resolve "dig +tcp does not honor +tries=1 nor +retry=0"

Closes #2490

See merge request isc-projects/bind9!4682
2021-03-25 17:30:24 +00:00
Diego Fronza
04537633a7 Add CHANGES note for [GL #2490]
2021-03-25 14:12:16 -03:00
Diego Fronza
3b98c4d311 Update dig's man page
Adjusted man page entries for +tries and +retry options to reflect the
fact that now those options apply to TCP as well.
2021-03-25 14:08:40 -03:00
Diego Fronza
4f82cc41cc Added tests for tries=1 and retry=0 on TCP EOF
Added tests to ensure that dig won't retry sending a query over tcp
(+tcp) when a TCP connection is closed prematurely (EOF is read) if
either +tries=1 or +retry=0 is specified on the command line.
2021-03-25 14:08:40 -03:00
Diego Fronza
e680896003 Adjusted dig system tests
Now that premature EOF on tcp connections takes +tries and +retry into
account, the dig system tests handling TCP EOF with +tries=1 were
expecting dig to do a second attempt in handling the tcp query, which
doesn't happen anymore.

To make the test work as expected +tries value was adjusted to 2, to
make it behave as before after the new update on dig.
2021-03-25 14:08:40 -03:00
Diego Fronza
78f6ead480 Don't retry +tcp queries on failure if tries=1 or retries=0
Before this commit, a premature EOF (connection closed) on tcp queries
was causing dig to automatically attempt to send the query again, even
if +tries=1 or +retry=0 was provided on the command line.

This commit fixes the problem by taking into account the no. of retries
specified by the user when processing a premature EOF on tcp
connections.
2021-03-25 14:08:39 -03:00
Michał Kępień
8bb1547208 Merge branch 'matthijs-configure-kaspsh' into 'main'
Configure kasp.sh

See merge request isc-projects/bind9!4836
2021-03-24 09:07:33 +00:00
Matthijs Mekking
93ed215065 Add kasp.sh to run.sh.in script
Add kasp.sh to the list of scripts copied from the source directory to
the build directory before any test is run. This will fix
the out-of-tree test failures introduced in commit
ecb073bdd6 on the 'main' branch.
2021-03-24 08:55:24 +01:00
Matthijs Mekking
c2c5701dfe Merge branch '2488-refresh-keys-after-rndc-rollover' into 'main'
Rekey immediately after rndc checkds/rollover

Closes #2488

See merge request isc-projects/bind9!4813
2021-03-22 13:35:12 +00:00
Matthijs Mekking
82d667e1d5 Fix some intermittent kasp failures
When calling "rndc dnssec -checkds", it may take some milliseconds
before the appropriate changes have been written to the state file.
Add retry_quiet mechanisms to allow the write operation to finish.

Also retry_quiet the check for the next key event. A "rndc dnssec"
command may trigger a zone_rekey event and this will write out
a new "next key event" log line, but it may take a bit longer than
expected in the tests.
2021-03-22 11:58:26 +01:00
Matthijs Mekking
82f72ae249 Rekey immediately after rndc checkds/rollover
Call 'dns_zone_rekey' after a 'rndc dnssec -checkds' or 'rndc dnssec
-rollover' command is received, because such a command may influence
the next key event. Updating the keys immediately avoids unnecessary
rollover delays.

The kasp system test no longer needs to call 'rndc loadkeys' after
a 'rndc dnssec -checkds' or 'rndc dnssec -rollover' command.
2021-03-22 11:58:26 +01:00
Matthijs Mekking
28923bc695 Merge branch '2517-cds-dnskey-delete-records-prevent-loading-unsigned-zone' into 'main'
Resolve "CDS and CDNSKEY DELETE records prevent (re-)loading unsigned zone"

Closes #2517

See merge request isc-projects/bind9!4810
2021-03-22 10:06:45 +00:00
Matthijs Mekking
841e90c6fc Add CHANGES and notes for [#2517]
2021-03-22 10:31:23 +01:00
Matthijs Mekking
6f31f62d69 Delete CDS/CDNSKEY records when zone is unsigned
CDS/CDNSKEY DELETE records are only useful if they are signed,
otherwise the parent cannot verify these RRsets anyway. So once the DS
has been removed (and signaled to BIND), we can remove the DNSKEY and
RRSIG records, and at this point we can also remove the CDS/CDNSKEY
records.
2021-03-22 10:30:59 +01:00
Matthijs Mekking
f211c7c2a1 Allow CDS/CDNSKEY DELETE records in unsigned zone
While not useful, having a CDS/CDNSKEY DELETE record in an unsigned
zone is not an error and "named-checkzone" should not complain.
2021-03-22 10:25:30 +01:00
Matthijs Mekking
052ec16a44 Merge branch 'matthijs-test-keymgr2kasp' into 'main'
Test migrating to dnssec-policy

Closes #2544

See merge request isc-projects/bind9!4758
2021-03-22 09:09:06 +00:00
Matthijs Mekking
d5531df79a Retry quiet check keys
Change the 'check_keys' function to try three times. Some intermittent
kasp test failures are because we are inspecting the key files
before the actual change has happened. The 'retry_quiet' approach allows
for a bit more time to let the write operation finish.
2021-03-22 09:50:05 +01:00
Matthijs Mekking
923c2a07bf Update copyrights for keymgr2kasp
This MR introduces a new system test 'keymgr2kasp' to test
migration to 'dnssec-policy'. It moves some existing tests from
the 'kasp' system test to here.

Also a common script 'kasp.sh', to be used in kasp specific tests,
is introduced.
2021-03-22 09:50:05 +01:00
Matthijs Mekking
27e7d5f698 Fix keymgr key init bug
The 'keymgr_key_init()' function initializes key states if they have
not been set previously. It looks at the key timing metadata and
determines using the given times whether a state should be set to
RUMOURED or OMNIPRESENT.

However, the DNSKEY and ZRRSIG states were mixed up: When looking
at the Activate timing metadata we should set the ZRRSIG state, and
when looking at the Published timing metadata we should set the
DNSKEY state.
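
The corrected mapping could look roughly like the sketch below. It is not
the body of 'keymgr_key_init()': the struct, the function names, the
HIDDEN state for unset or future times, and the single "interval"
threshold per record type are assumptions made for illustration, and the
sketch sets the states unconditionally rather than only when they were
not set previously.

```c
#include <assert.h>
#include <time.h>

typedef enum {
	STATE_HIDDEN, /* assumption: used when the time is unset or in the future */
	STATE_RUMOURED,
	STATE_OMNIPRESENT
} key_state_t;

/* Hypothetical, reduced view of a key's timing metadata and states. */
typedef struct {
	time_t publish;  /* Published time, 0 if unset */
	time_t activate; /* Activate time, 0 if unset */
	key_state_t dnskey;
	key_state_t zrrsig;
} kasp_key_t;

/* A record that has been visible for at least 'interval' seconds is
 * treated as OMNIPRESENT, a more recent one as RUMOURED. */
static key_state_t
state_from_time(time_t when, time_t now, time_t interval) {
	if (when == 0 || when > now) {
		return STATE_HIDDEN;
	}
	return (now - when >= interval) ? STATE_OMNIPRESENT : STATE_RUMOURED;
}

static void
key_init_states(kasp_key_t *key, time_t now, time_t dnskey_interval,
		time_t zrrsig_interval) {
	assert(key != NULL);
	/* Published drives the DNSKEY state ... */
	key->dnskey = state_from_time(key->publish, now, dnskey_interval);
	/* ... and Activate drives the ZRRSIG state, not the other way round. */
	key->zrrsig = state_from_time(key->activate, now, zrrsig_interval);
}
```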
2021-03-22 09:50:05 +01:00
Matthijs Mekking
c40c1ebcb1 Test keymgr2kasp state from timing metadata
Add two test zones that migrate to dnssec-policy. Test if the key
states are set accordingly given the timing metadata.

The rumoured.kasp zone has its Publish/Active/SyncPublish times set
not too long ago so the key states should be set to RUMOURED. The
omnipresent.kasp zone has its Publish/Active/SyncPublish times set
long enough to set the key states to OMNIPRESENT.

Slightly change the init_migration_keys function to set the
key lifetime to "none" (legacy keys don't have lifetime). Then in the
test case set the expected key lifetime explicitly.
2021-03-22 09:50:05 +01:00
Matthijs Mekking
f6fa254256 Editorial commit keymgr2kasp test
This commit is somewhat editorial as it neither introduces anything
new nor fixes anything.

The layout in keymgr2kasp/tests.sh has been changed, with the
intention to make more clear where a test scenario ends and begins.

The publication time of some ZSKs has been changed. It makes a more
clear distinction between publication time and activation time.
2021-03-22 09:50:05 +01:00
Matthijs Mekking
ecb073bdd6 Introduce kasp.sh
Add a script similar to conf.sh to include common functions and
variables for testing KASP. Currently used in kasp, keymgr2kasp, and
nsec3.
2021-03-22 09:50:05 +01:00
Matthijs Mekking
5389172111 Move kasp migration tests to different directory
The kasp system test was getting pretty large, and more tests are on
the way. Time to split up. Move tests that are related to migrating
to dnssec-policy to a separate directory 'keymgr2kasp'.
2021-03-22 09:50:05 +01:00
Michał Kępień
ea26306eba Merge branch '1946-man-page-fixes' into 'main'
Man page fixes

See merge request isc-projects/bind9!4817
2021-03-22 08:39:38 +00:00
Michał Kępień
185a1a5643 Install man page for named-compilezone
The named-checkzone tool can also be invoked as named-compilezone.  Make
sure a man page is installed for that alias.  Move and rename the
"man_named-checkzone" label to prevent a Sphinx duplicate label warning
from being raised (see commit 84862e96c1
for more information).
2021-03-22 09:36:48 +01:00