mirror of https://gitlab.isc.org/isc-projects/bind9 synced 2025-08-30 22:15:20 +00:00
Commit Graph

42280 Commits

Author SHA1 Message Date
Michal Nowak
47d64f944f Rename assert_custom_named_is_alive to named_alive
(cherry picked from commit 38e751d9ac)
2025-02-13 18:47:42 +00:00
Michal Nowak
0d75e15d4d Rewrite nzd2nzf system test to pytest
(cherry picked from commit 7c499d1689)
2025-02-13 18:47:42 +00:00
Michal Nowak
de9e94889e [9.20] chg: test: Rewrite names system test to pytest
Backport of MR !8759

Merge branch 'backport-mnowak/pytest_rewrite_names-9.20' into 'bind-9.20'

See merge request isc-projects/bind9!10096
2025-02-13 18:34:58 +00:00
Michal Nowak
b04d28f1ef Rewrite names system test to pytest
dnspython 2.7.0 or newer is needed because of wire().

(cherry picked from commit 5250ad8720)
2025-02-13 17:49:26 +00:00
Michal Nowak
97e7e9ff21 [9.20] chg: test: Generate TSAN unit stress tests
This is a complement to the already present system test "stress" test.

Backport of MR !9474

Merge branch 'backport-mnowak/generate-tsan-unit-stress-tests-9.20' into 'bind-9.20'

See merge request isc-projects/bind9!10094
2025-02-13 17:42:26 +00:00
Michal Nowak
b4aab2cfa3 Generate TSAN unit stress tests
(cherry picked from commit a03c4b4cf9)
2025-02-13 16:43:31 +00:00
Andoni Duarte
881b9e2417 chg: doc: Set up version for BIND 9.20.7
Merge branch 'andoni/set-up-version-for-bind-9.20.7' into 'bind-9.20'

See merge request isc-projects/bind9!10092
2025-02-13 16:10:37 +00:00
Andoni Duarte Pintado
33988a1600 Update BIND version to 9.20.7-dev
2025-02-13 15:55:19 +01:00
Michal Nowak
bc3957136d [9.20] fix: ci: Do not evaluate $CI_PROJECT_DIR in generate-stress-test-configs.py
GitLab CI Runner's $builds_dir variable is set to "/builds" by default.
For technical reasons, the FreeBSD Runners, which use the "instance"
executor, set the path differently.

The value of $CI_PROJECT_DIR is based on $builds_dir, so if the
generate-stress-test-configs.py script generates jobs with
$CI_PROJECT_DIR (or variables like $INSTALL_PATH that are based on it)
evaluated, it is calcified to whatever was the value in the particular
environment, disregarding the FreeBSD "instance" executor specifics in
the child pipeline.

Instead of evaluating $CI_PROJECT_DIR in the script, evaluate it in the
runtime environment.

Backport of MR !10075

Merge branch 'backport-mnowak/fix-CI_PROJECT_DIR-variable-evaluation-9.20' into 'bind-9.20'

See merge request isc-projects/bind9!10077
2025-02-05 15:30:30 +00:00
Michal Nowak
aa68ffeabd Do not evaluate $CI_PROJECT_DIR in generate-stress-test-configs.py
GitLab CI Runner's $builds_dir variable is set to "/builds" by default.
For technical reasons, the FreeBSD Runners, which use the "instance"
executor, set the path differently.

The value of $CI_PROJECT_DIR is based on $builds_dir, so if the
generate-stress-test-configs.py script generates jobs with
$CI_PROJECT_DIR (or variables like $INSTALL_PATH that are based on it)
evaluated, it is calcified to whatever was the value in the particular
environment, disregarding the FreeBSD "instance" executor specifics in
the child pipeline.

Instead of evaluating $CI_PROJECT_DIR in the script, evaluate it in the
runtime environment.

(cherry picked from commit dab7d28b09)
2025-02-05 15:04:45 +00:00
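A minimal sketch of the distinction this commit describes, not the actual contents of generate-stress-test-configs.py. The function names, the job layout, and the `/.local` suffix are hypothetical. The first variant bakes the generator's own value of CI_PROJECT_DIR into the child-pipeline job; the second emits the variable reference literally, so the runner that executes the job expands it.

```python
import os


def render_job_evaluated(name: str) -> dict:
    """Expand CI_PROJECT_DIR at generation time (the problematic variant)."""
    # The value is frozen to whatever the generating environment had.
    project_dir = os.environ.get("CI_PROJECT_DIR", "/builds/example")
    return {name: {"variables": {"INSTALL_PATH": f"{project_dir}/.local"}}}


def render_job_deferred(name: str) -> dict:
    """Emit the reference literally; the job's own runner expands it later."""
    return {name: {"variables": {"INSTALL_PATH": "$CI_PROJECT_DIR/.local"}}}
```

With the deferred variant, a runner whose builds directory differs from the generator's still gets a path that is valid in its own environment.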
Nicki Křížek
38c51c8401 [9.20] new: usr: add a rndc command to toggle jemalloc profiling
The new command is `rndc memprof`. The memory profiling status is also
reported inside `rndc status`. The status also shows whether named can
toggle memory profiling or not and if the server is built with jemalloc.

Closes #4759

Backport of MR !9370

Merge branch 'backport-4759-add-a-trigger-to-dump-jeprof-data-or-memory-statistics-9.20' into 'bind-9.20'

See merge request isc-projects/bind9!10000
2025-02-05 10:14:25 +00:00
Aydın Mercan
dde251b773 add a rndc command to toggle jemalloc profiling
The new command is `rndc memprof`. The memory profiling status is also
reported inside `rndc status`. The status also shows whether named can
toggle memory profiling or not and if the server is built with jemalloc.

(cherry picked from commit b495e9918e)
2025-02-05 10:40:05 +01:00
Ondřej Surý
5c27e9cdda [9.20] fix: dev: Reduce false sharing in dns_qpcache and dns_qpzone
Instead of having many node_lock_count * sizeof(<member>) arrays, pack all
the members into a qpcache_bucket_t that is cacheline aligned to prevent
false sharing between RWLocks.

Backport of MR !10072

Merge branch 'backport-ondrej/prevent-nodelock-false-sharing-9.20' into 'bind-9.20'

See merge request isc-projects/bind9!10074
2025-02-04 23:17:37 +00:00
Ondřej Surý
db2bce1c6f Switch the locknum generation for qpznode to random
Instead of using a hash of the name modulo the number of buckets,
assign the locknum randomly with isc_random_uniform().  This makes
the locknum assignment aligned with qpcache and allows the bucket
number to be non-prime in the future.

(cherry picked from commit 732fc338a9)
2025-02-04 23:28:53 +01:00
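A rough Python illustration of the two bucket-selection strategies contrasted in this commit. It is only a sketch: the real code is C, `random.randrange()` stands in for isc_random_uniform(), and `zlib.crc32()` stands in for whatever name hash was used before. Both function names are hypothetical.

```python
import random
import zlib


def locknum_from_hash(name: bytes, buckets: int) -> int:
    """Old approach: derive the bucket from a hash of the node name."""
    return zlib.crc32(name) % buckets


def locknum_random(buckets: int) -> int:
    """New approach: pick a bucket uniformly at random in [0, buckets)."""
    return random.randrange(buckets)
```

The random variant does not depend on how the hash values interact with the bucket count, which is consistent with the commit's remark that the bucket number may become non-prime in the future.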
Ondřej Surý
d4e8a92977 Rely on call_rcu() to destroy the qpzone outside of locks
Reduce the number of qpzone_ref() and qpzone_unref() calls in
qpzone_detachnode() by relying on the call_rcu to delay
the destruction of the lock buckets.

(cherry picked from commit 1fa5219fdf)
2025-02-04 23:28:53 +01:00
Ondřej Surý
c6c03a6b11 Reduce false sharing in dns_qpzone
Instead of having many node_lock_count * sizeof(<member>) arrays, pack
all the members into a qpzone_bucket_t that is cacheline aligned and have
a single array of those.

(cherry picked from commit 6dcc398726)
2025-02-04 23:28:50 +01:00
Ondřej Surý
a9f4e3369a Reduce false sharing in dns_qpcache
Instead of having many node_lock_count * sizeof(<member>) arrays, pack
all the members into a qpcache_bucket_t struct that is cacheline aligned
and have a single array of those.

Additionally, make both the head and the tail of isc_queue_t padded, not
just the head, to prevent false sharing of the lock-free structure with
the lock that follows it.

(cherry picked from commit c602d76c1f)
2025-02-04 23:27:28 +01:00
Ondřej Surý
b5cce0f597 [9.20] new: usr: Print the expiration time of the stale records
Print the expiration time of the stale RRsets in the cache dump.

Backport of MR !10057

Merge branch 'backport-ondrej/print-expiration-time-of-stale-records-9.20' into 'bind-9.20'

See merge request isc-projects/bind9!10061
2025-02-04 17:08:20 +00:00
Ondřej Surý
8229d9cdfa Print the expiration time of the stale records (not ancient)
In #1870, the expiration time of ANCIENT records was printed, but
ancient records are very short-lived, so that information
carries little value.

Instead of printing the expiration of ANCIENT records, print the
expiration time of STALE records.

(cherry picked from commit 355fc48472)
2025-02-04 18:07:59 +01:00
Michal Nowak
9fbc273417 [9.20] chg: test: Rewrite stub system test to pytest
Backport of MR !9190

Merge branch 'backport-mnowak/pytest_rewrite_stub-9.20' into 'bind-9.20'

See merge request isc-projects/bind9!10070
2025-02-04 13:25:41 +00:00
Michal Nowak
fb7d78a258 Rewrite stub system test to pytest
(cherry picked from commit 1069eb1969)
2025-02-04 13:24:54 +00:00
Michal Nowak
1047797100 Add isctest.check.notauth()
(cherry picked from commit b19fb37080)
2025-02-04 13:24:54 +00:00
Nicki Křížek
b5ecd7416c Allow to use an arbitrary numeric identifier for NamedInstance
In some cases, the numeric identifier doesn't correspond to the
directory name (i.e. `resolver` server in `shutdown` test, which is
supposed to be 10.53.0.3). These are typically servers that shouldn't be
auto-started by the runner, thus avoiding the typical `*ns<X>` name.

Support these servers by allowing a fallback initialization with custom
numeric identifier in case it can't be parsed from the directory name.

(cherry picked from commit a24f71bae4)
2025-02-04 13:24:54 +00:00
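The fallback described in this commit can be sketched as follows. This is an illustration only, not the isctest implementation: `parse_identifier` and its `fallback` parameter are hypothetical names, and the `*ns<X>` pattern is taken from the commit message.

```python
import re
from typing import Optional


def parse_identifier(dirname: str, fallback: Optional[int] = None) -> int:
    """Return the numeric identifier for a server directory.

    Directories named like `*ns<X>` yield X; anything else needs an
    explicitly supplied fallback identifier.
    """
    match = re.fullmatch(r".*ns(\d+)", dirname)
    if match:
        return int(match.group(1))
    if fallback is None:
        raise ValueError(f"cannot parse identifier from {dirname!r}")
    return fallback
```

For example, `parse_identifier("ns3")` parses the identifier from the name, while `parse_identifier("resolver", fallback=3)` covers the `resolver` server from the `shutdown` test mentioned above.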
Nicki Křížek
0e412834e9 Add start/stop wrappers to control NamedInstance
The start()/stop() functions can be used in the pytests in the same way
as start_server and stop_server functions were used in shell tests. Note
that the servers obtained through the servers fixture are still started
and stopped by the test runner at the start and end of the test. This
makes these functions mostly useful for restarting the server(s)
mid-test.

(cherry picked from commit 37699ad84b)
2025-02-04 13:24:54 +00:00
Nicki Křížek
184160ac36 Move shell and perl util functions to isctest.run
Previously, these functions have been provided as fixtures. This was
limiting re-use, because it wasn't possible to call these outside of
tests / other fixtures without passing these utility functions around.
Move them into isctest.run package instead.

(cherry picked from commit b6d645410c)
2025-02-04 13:24:54 +00:00
Michal Nowak
f76cacae35 [9.20] fix: ci: Suppress the leak detection in __xmlDefaultBufferSize
Closes #5157

Backport of MR !10067

Merge branch 'backport-5157-suppress-lsan-libxml2-leak-9.20' into 'bind-9.20'

See merge request isc-projects/bind9!10068
2025-02-04 13:24:11 +00:00
Michal Nowak
0e14319ef6 Suppress the leak detection in __xmlDefaultBufferSize
(cherry picked from commit ca859563aa)
2025-02-04 12:37:53 +00:00
Mark Andrews
16b57388ab [9.20] fix: test: Fix 'ans' servers so they respond with consistent answers to NS queries at QNAME.
The ANS servers were not written to handle NS queries at the QNAME, resulting in gratuitous protocol errors that will break tests when NS requests are made for the QNAME: e.g., NXDOMAIN for NS vs. data for the expected type, or CNAME not being returned for all query types.

Prerequisite for !9155 

Closes #5062

Backport of MR !9786

Merge branch 'backport-5062-fix-ans-servers-ns-at-qname-9.20' into 'bind-9.20'

See merge request isc-projects/bind9!10064
2025-02-04 04:14:10 +00:00
Mark Andrews
2a8bf4f6bb Fix gratuitous DNS protocol errors in the ANS servers
The ANS servers were not written to handle NS queries at the
QNAME, resulting in gratuitous protocol errors that will break tests
when NS requests are made for the QNAME.

(cherry picked from commit 0680eb6f64)
2025-02-04 02:37:34 +00:00
Ondřej Surý
9a4df4caac [9.20] fix: usr: Recently expired records could be returned with timestamp in future
Under rare circumstances, the RRSet that expired at the time of
the query could be returned with TTL far in the future.  This
has been fixed.

As a side effect, the expiration time of expired RRSets is no
longer printed out in the cache dump.

Closes #5094

Backport of MR !10048

Merge branch 'backport-5094-fix-timestamp-in-ttl-9.20' into 'bind-9.20'

See merge request isc-projects/bind9!10059
2025-02-03 15:05:48 +00:00
Ondřej Surý
302aca809d Expand the usage of mark_ancient() helper functions
When the mark_ancient() helper function was introduced, a couple of places
with duplicate (or almost duplicate) code were missed.  Move the
mark_ancient() function closer to the top of the file, and correctly use
it in places that mark the header as ANCIENT.

(cherry picked from commit 58179e6a19)
2025-02-03 15:53:34 +01:00
Ondřej Surý
4b114838de Add better ZEROTTL handling in bindrdataset()
If we know that the header has ZEROTTL set, the server should never send
stale records for it and the TTL should never be anything else than 0.
The comment was already there, but the code was not matching the
comment.

(cherry picked from commit cfee6aa565)
2025-02-03 15:53:34 +01:00
Ondřej Surý
b32512a232 In cache, set rdataset TTL to 0 when the header is not active
When the header has been marked as ANCIENT, but the TTL hasn't been
reset (this happens in a couple of places), the rdataset TTL would be
set to the header timestamp instead of a reasonable TTL value.

Since this header has already expired (ANCIENT is set), set the
rdataset TTL to 0 and don't reuse this field to print the expiration
time when dumping the cache.  Instead of printing the time, we now
just print 'expired (awaiting cleanup)'.

(cherry picked from commit 1bbb57f81b)
2025-02-03 15:53:34 +01:00
Ondřej Surý
619f163e68 [9.20] fix: dev: Fix the cache findzonecut() implementation
The search for the deepest known zone cut in the cache could improperly reject a node if it contained any stale data, regardless of whether it was the NS RRset that was stale.

Closes #5155

Backport of MR !10047

Merge branch 'backport-5155-fix-findzonecut-9.20' into 'bind-9.20'

See merge request isc-projects/bind9!10050
2025-02-02 22:06:52 +00:00
Evan Hunt
1e818d368f fix the cache findzonecut implementation
the search for the deepest known zone cut in the cache could
improperly reject a node containing stale data, even if the
NS rdataset wasn't the data that was stale.

this change also improves the efficiency of the search by
stopping it when both NS and RRSIG(NS) have been found.

(cherry picked from commit 1f095b902c)
2025-02-02 20:01:52 +01:00
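A simplified model of the search behaviour this commit describes, written in Python rather than the C of the actual cache code. The `Header` type and `find_ns()` are hypothetical. Stale headers are skipped one at a time instead of disqualifying the whole node, and the loop stops as soon as both NS and RRSIG(NS) have been found.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass
class Header:
    rrtype: str  # e.g. "NS", "RRSIG(NS)", "A"
    stale: bool


def find_ns(headers: Iterable[Header]) -> Tuple[Optional[Header], Optional[Header]]:
    """Return the usable (NS, RRSIG(NS)) headers of one node, if any."""
    ns = sig = None
    for header in headers:
        if header.stale:
            continue  # ignore only this header, not the whole node
        if header.rrtype == "NS":
            ns = header
        elif header.rrtype == "RRSIG(NS)":
            sig = header
        if ns is not None and sig is not None:
            break  # nothing further to learn from this node
    return ns, sig
```

In this model a node with a stale A record and a fresh NS RRset still yields a zone cut, whereas a node whose NS RRset itself is stale does not.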
Petr Špaček
51d5d0aae2 [9.20] fix: ci: Do not trigger post-merge jobs for cross-project pushes
Backport of MR !10029
Backport of MR !10042

Merge branch 'backport-pspacek/no-cross-project-after-merge-jobs-9.20' into 'bind-9.20'

See merge request isc-projects/bind9!10043
2025-01-31 14:07:50 +00:00
Petr Špaček
86f7822c81 Fix shell escaping in post-merge before_script
Fixup for commit 6014060774
"Do not trigger post-merge jobs for cross-project pushes".

Related: isc-projects/bind9!10029
(cherry picked from commit 6276e0b23b)
2025-01-31 14:52:04 +01:00
Petr Špaček
80f330ab60 Do not trigger post-merge jobs for cross-project pushes
We need to avoid double-triggering of post-merge jobs in the following
scenario:

 1. A private MR gets merged into the private BIND 9 repository.

 2. This merge operation triggers a "push" pipeline in the private
    repository, which correctly runs post-merge jobs, e.g. to set MR
    metadata in the private project.

 3. When a release is published, a script is run to change the
    automatically assigned milestone value ("Not released yet") to
    something else.

 4. Shortly afterwards, the result of the merge from step 1 is merged
    back into a maintenance branch in the public repository.

 5. The push operation triggers another "push" pipeline, this time in
    the public project.

At this point there are two problems:

  - If the script is dumb (like it currently is), it will extract the
    merge request ID from the merge commit description and change the
    milestone for a merge request in the wrong project namespace.

  - Even if the script was fixed to extract and use the correct GitLab
    project reference, it would reset the milestone for the merge
    request in the private repository back to "Not released yet" - while
    the milestone set in step 3 should be retained.

An alternative would be to change the order of operations so that
post-release milestoning happens at a later stage, while also fixing the
script to correctly follow cross-project references, but that approach
seems more fragile than simply failing on all cross-project pushes.  The
rule to enforce is: each project should only take care of its own
post-merge tasks.

(cherry picked from commit 6014060774)
2025-01-31 14:49:31 +01:00
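The rule "each project should only take care of its own post-merge tasks" could be checked along these lines. This is a sketch under assumptions, not the script used in BIND 9 CI: the helper names are hypothetical, and the only input format assumed is the "See merge request <project>!<id>" line that appears in the merge commit descriptions in this log.

```python
import re
from typing import Optional, Tuple

# Matches e.g. "See merge request isc-projects/bind9!10043".
MR_REF = re.compile(
    r"^See merge request (?P<project>\S+)!(?P<iid>\d+)$", re.MULTILINE
)


def merge_request_ref(description: str) -> Optional[Tuple[str, int]]:
    """Extract (project path, merge request ID) from a merge commit."""
    match = MR_REF.search(description)
    if match is None:
        return None
    return match.group("project"), int(match.group("iid"))


def is_own_merge(description: str, current_project: str) -> bool:
    """True only if the merge request belongs to the current project."""
    ref = merge_request_ref(description)
    return ref is not None and ref[0] == current_project
```

A post-merge job could call `is_own_merge()` with the project path of the pipeline it runs in and fail early on a mismatch, which corresponds to "simply failing on all cross-project pushes".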
Michał Kępień
9d87acf959 [9.20] chg: ci: Use default cloning depth for the Danger CI job
With shallow fetching working reliably in pygit2 1.17.0+, there is no
longer any need for GitLab CI runners to clone the BIND 9 repository
with a fixed depth of 1000 during every "danger" CI job as Hazard is now
able to fetch remote refs with an arbitrary depth, controlled by the
HAZARD_FETCH_DEPTH environment variable.  The latter can be defined via
GitLab project's CI settings and adjusted as needed over time, without
the need to update .gitlab-ci.yml every time its value needs to be
changed.

Backport of MR !9946

Merge branch 'backport-michal/use-default-cloning-depth-for-the-danger-ci-job-9.20' into 'bind-9.20'

See merge request isc-projects/bind9!10037
2025-01-31 09:32:53 +00:00
Michał Kępień
3cb6c224f2 Use default cloning depth for the Danger CI job
With shallow fetching working reliably in pygit2 1.17.0+, there is no
longer any need for GitLab CI runners to clone the BIND 9 repository
with a fixed depth of 1000 during every "danger" CI job as Hazard is now
able to fetch remote refs with an arbitrary depth, controlled by the
HAZARD_FETCH_DEPTH environment variable.  The latter can be defined via
GitLab project's CI settings and adjusted as needed over time, without
the need to update .gitlab-ci.yml every time its value needs to be
changed.

(cherry picked from commit e39e7afc16)
2025-01-31 09:30:42 +00:00
Ondřej Surý
3244f7848f [9.20] chg: dev: Refactor reference counting in both QPDB and RBTDB
Clean up the pattern in the newref() and decref() functions in QP and RBTDB databases.  Replace the `db_nodelock_t` structure with plain reference counting for every active database node in QPDB.

Related to #5134

Backport of MR !10006

Merge branch 'backport-5134-refactor-decref-functions-in-qpdb-9.20' into 'bind-9.20'

See merge request isc-projects/bind9!10035
2025-01-31 05:48:14 +00:00
Ondřej Surý
857225aeb6 Clarify reference counting in RBTDB database
Change the names of the node reference counting functions
and add comments to make the mechanism easier to understand:

- dns__rbtdb_newref() and dns__rbtdb_decref() are now called
  dns__rbtnode_acquire() and dns__rbtnode_release()
  respectively; this reflects the fact that they modify both
  the internal and external reference counters for a node.

- rbtnode_newref() and rbtnode_decref() are now called
  rbtnode_erefs_increment() and rbtnode_erefs_decrement(),
  to reflect that they only increase and decrease the node's
  external reference counters, not internal.
2025-01-31 06:07:48 +01:00
Ondřej Surý
9c45de9473 Refactor node reference counting in rbtdb.c
Refactor the newref() and decref() functions in rbtdb.c so that they
follow a pattern similar to the one we already have
for QPDB.
2025-01-31 05:52:13 +01:00
Evan Hunt
5300eebc9e Clarify reference counting in QP databases
Change the names of the node reference counting functions
and add comments to make the mechanism easier to understand:

- newref() and decref() are now called qpcnode_acquire()/
  qpznode_acquire() and qpcnode_release()/qpznode_release()
  respectively; this reflects the fact that they modify both
  the internal and external reference counters for a node.

- qpcnode_newref() and qpznode_newref() are now called
  qpcnode_erefs_increment() and qpznode_erefs_increment(), and
  qpcnode_decref() and qpznode_decref() are now called
  qpcnode_erefs_decrement() and qpznode_erefs_decrement(),
  to reflect that they only increase and decrease the node's
  external reference counters, not internal.

(cherry picked from commit d4f791793e)
2025-01-31 05:52:13 +01:00
Ondřej Surý
7dab6cdfbc Remove db_nodelock_t in favor of reference counted qpdb
This removes the db_nodelock_t structure and changes the node_locks
array to be composed only of isc_rwlock_t pointers.  The .reference
member has been moved to qpdb->references in addition to
common.references that's external to dns_db API users.  The .exiting
member has been completely removed as it has no use when the reference
counting is used correctly.

(cherry picked from commit 431513d8b3)
2025-01-31 05:49:36 +01:00
Ondřej Surý
082a54cc5d Remove origin_node from qpcache
The origin_node in qpcache was always NULL, so we can remove the
getoriginnode() function and origin_node pointer as the
dns_db_getoriginnode() correctly returns ISC_R_NOTFOUND when the
function is not implemented.

(cherry picked from commit 36a26bfa1a)
2025-01-31 05:49:23 +01:00
Ondřej Surý
d1d444d2ab Refactor decref() in both qpcache.c and qpzone.c
Clean up the pattern in the decref() functions in both qpcache.c and
qpzone.c, so that it follows a pattern similar to the one we already have in
the newref() function.

(cherry picked from commit 814b87da64)
2025-01-31 05:49:12 +01:00
Colin Vidal
8662424442 [9.20] fix: dev: fix EDE 22 time out detection
Extended DNS Error 22 (No Reachable Authority) was previously detected when `fctx_expired` fired. It turns out this function is used as a "safety net", and the timeout should be detected earlier.

It worked nonetheless because of another issue, which was fixed by !9927. Since that fix, the recursive request timeout is detected before `fctx_expired` fires, making it impossible to raise the EDE 22 error.

This fixes the problem by triggering EDE 22 in the part of the code that detects the (TCP or UDP) timeout and decides to cancel the whole fetch (i.e. there is no other server left to contact).

Note that this is not targeting users (no release note) because there are no released versions of BIND between !9927 and this change, so a release note would be confusing.

Closes #5137

Backport of MR !9985

Merge branch 'backport-5137-ede22-9.20' into 'bind-9.20'

See merge request isc-projects/bind9!10001
2025-01-30 15:39:18 +00:00
Colin Vidal
588924bbb5 update serve-stale test to support EDE 22
When EDE 3 (stale answer) was added the serve-stale tests were checking
for those exclusively, i.e. grepping for no "EDE" in the dig output when
no stale answer was expected.

However, some stale tests disable stale answers and make the
authoritative server unresponsive, effectively triggering a timed-out
request and thus an EDE 22. Update those tests so they still test the
absence of EDE 3, but also the presence of EDE 22.

(cherry picked from commit 27f3b8950a)
2025-01-30 14:43:25 +00:00
Colin Vidal
edd6f0eb35 add new EDE 22 system tests
This redoes a previously existing EDE 22 system test and adds
another one to make sure that timeout detection also works over UDP
when the resolver is contacting the authoritative server (the existing
test was using TCP to contact the authoritative servers).

(cherry picked from commit 7cb8a028fe)
2025-01-30 14:43:25 +00:00