The order in which the fetch context hash table rwlock and the
individual fetch context lock were acquired was reversed when
calling the release_fctx() function. This caused a problem when
iterating the hash table; the ordering has been corrected so that
the hash table rwlock is now always locked on the outside and the
fctx lock is the interior lock.
(cherry picked from commit cf078fadebcf73184a64cf46d28c3f40b54f1867)
In the next commit, we need to know whether the timer has been started
or stopped. Add isc_timer_running() function that returns true if the
timer has been started.
(cherry picked from commit b9e3cd5d2a75f962a1e88cbe676cf875a796543d)
With RPZ in use, `named` could terminate unexpectedly because of a race condition when a reconfiguration command was received using `rndc`. This has been fixed.
Closes#5146
Backport of MR !10079
Merge branch 'backport-5146-rpz-reconfig-bug-fix-9.20' into 'bind-9.20'
See merge request isc-projects/bind9!10144
After a reconfiguration, the old view can be left without a valid
'rpzs' member: when the RPZ is not changed during the named
reconfiguration, 'rpzs' "migrates" from the old view into the new
view. A resuming query can therefore find that 'qctx->view->rpzs'
is NULL, which query_resume() currently doesn't expect to happen
while it is recursing and 'qctx->rpz_st' is not NULL.
Fix the issue by adding a NULL check. To avoid splitting the log
message into two different messages depending on whether
'qctx->view->rpzs' is NULL, change the message to not log the RPZ
policy's "version", which is just a runtime counter and is most
likely not very useful to users.
(cherry picked from commit 3ea2fbc238e0d933b9f87dfd8fdab9233d978e33)
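A minimal sketch of the added guard, with stand-in types in place of the real qctx/view structures (names are illustrative, not the actual query-handling code):

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-ins; only the NULL check matters here. */
typedef struct {
	void *rpzs; /* may be NULL after a reconfiguration */
} view_t;

typedef struct {
	view_t *view;
	void *rpz_st; /* per-query RPZ recursion state */
} qctx_t;

/*
 * Returns true when RPZ state may be consulted on resume. After a
 * reconfiguration the old view's 'rpzs' can have migrated to the new
 * view, so 'view->rpzs' may legitimately be NULL even while the query
 * still carries a non-NULL 'rpz_st'; that case must not be treated as
 * a fatal inconsistency.
 */
static bool
rpz_state_usable(const qctx_t *qctx) {
	return qctx->rpz_st != NULL && qctx->view->rpzs != NULL;
}
```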
Previously, when parsing responses, named incorrectly rejected responses without matching RRSIG records for NSEC/DS/NSEC3 records in the authority section. This rejection, if appropriate, should have been left for the validator to determine and has been fixed.
Closes#5185
Backport of MR !10125
Merge branch 'backport-5185-remove-rrsig-check-from-dns_message_parse-9.20' into 'bind-9.20'
See merge request isc-projects/bind9!10142
This scenario should succeed, but it didn't, because the message
was rejected at the message-parsing stage.
(cherry picked from commit 4271d93f00909fad74d694121da970b1a633c495)
Checking whether the authority section is properly signed should
be left to the validator. Checking in getsection (dns_message_parse)
was far too early and resulted in resolution failures for lookups
that should otherwise have succeeded.
(cherry picked from commit 83159d0a545be2845f08386f5dffdc2ac3721ba5)
Previously, a hard-coded limit of at most two key or message
verification checks was enforced when checking a message's
SIG(0) signature. This was done to protect against possible
DoS attacks. The logic behind choosing the number 2 was that more
than a single key should only be required during key rotations, and
in that case two keys are enough. But it later became apparent that
there are other use cases where even more keys are required; see
issue #5050 in GitLab.
This change introduces two new configuration options for the views,
`sig0key-checks-limit` and `sig0message-checks-limit`, which define how
many keys are allowed to be checked to find a matching key, and how
many message verifications are allowed to take place once a matching
key has been found. The latter protects against expensive cryptographic
operations when there are keys with colliding tags and algorithm
numbers, with a default of 2; the former protects against the somewhat
less expensive key parsing operations and defaults to 16.
Closes#5050
Backport of MR !9967
Merge branch 'backport-5050-sig0-let-considering-more-than-two-keys-9.20' into 'bind-9.20'
See merge request isc-projects/bind9!10141
Previously, a hard-coded limit of at most two key or message
verification checks was enforced when checking a message's
SIG(0) signature. This was done to protect against possible
DoS attacks. The logic behind choosing the number two was that more
than one key should only be required during key rotations, and
in that case two keys are enough. But it later became apparent that
there are other use cases where even more keys are required; see
issue #5050 in GitLab.
This change introduces two new configuration options for the views,
sig0key-checks-limit and sig0message-checks-limit, which define how
many keys are allowed to be checked to find a matching key, and how
many message verifications are allowed to take place once a matching
key has been found. The latter protects against expensive cryptographic
operations when there are keys with colliding tags and algorithm
numbers, with a default of 2; the former protects against the somewhat
less expensive key parsing operations and defaults to 16.
(cherry picked from commit 716b9360452bcebd89d48faa895375fa3360b9ed)
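As a sketch, the two options might appear in named.conf like this (the values shown are the documented defaults; the zone and key names are placeholders):

```
view "external" {
	/* How many keys may be examined to find a matching SIG(0) key. */
	sig0key-checks-limit 16;
	/* How many cryptographic verifications may be attempted once a
	 * matching key (possibly with a colliding tag/algorithm) is found. */
	sig0message-checks-limit 2;
};
```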
Running jobs that were entered into the isc_quota queue is the
responsibility of the isc_quota_release() function: when releasing
a previously acquired quota, it checks whether the queue is empty
and, if not, runs a job from the queue without touching the
'quota->used' counter. This mechanism is susceptible to a hangup of
a newly queued job: if the last quota is released between the time
the decision is made to queue the job (because used >= max) and the
time it is actually queued, there are no more quotas left to be
released (unless one arrives in the future), so the newly entered
job is stuck in the queue.
Fix the issue by adding checks in both isc_quota_release() and
isc_quota_acquire_cb() to make sure that the described hangup does
not happen. Also see the code comments.
Closes#4965
Backport of MR !10082
Merge branch 'backport-4965-isc_quota-bug-fix-9.20' into 'bind-9.20'
See merge request isc-projects/bind9!10139
Running jobs that were entered into the isc_quota queue is the
responsibility of the isc_quota_release() function: when releasing
a previously acquired quota, it checks whether the queue is empty
and, if not, runs a job from the queue without touching the
'quota->used' counter. This mechanism is susceptible to a hangup of
a newly queued job: if the last quota is released between the time
the decision is made to queue the job (because used >= max) and the
time it is actually queued, there are no more quotas left to be
released (unless one arrives in the future), so the newly entered
job is stuck in the queue.
Fix the incorrect memory ordering for 'quota->used': relaxed
ordering does not ensure that data modifications made by one thread
are visible to other threads.
Add checks in both isc_quota_release() and isc_quota_acquire_cb()
to make sure that the described hangup does not happen. Also see
the code comments.
(cherry picked from commit c6529891bb36acedb7b1d94aa3ae4ffd580479b7)
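A simplified sketch of the acquire path with acquire/release ordering (the fix replaced relaxed ordering); the real logic lives in lib/isc/quota.c and additionally manages the waiting-job queue, which is elided here:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Reduced model of isc_quota; field names are illustrative. */
typedef struct {
	atomic_uint used;
	unsigned int max;
} quota_t;

/*
 * Try to take a quota slot. Acquire/release ordering (not relaxed)
 * guarantees that a release performed by one thread is visible to a
 * thread deciding whether it must queue.
 */
static bool
quota_acquire(quota_t *q) {
	unsigned int used =
		atomic_load_explicit(&q->used, memory_order_acquire);
	do {
		if (used >= q->max) {
			/* Caller queues the job, then must re-check in
			 * case the last quota was released in between
			 * (the hangup this change prevents). */
			return false;
		}
	} while (!atomic_compare_exchange_weak_explicit(
		&q->used, &used, used + 1, memory_order_acq_rel,
		memory_order_acquire));
	return true;
}

static void
quota_release(quota_t *q) {
	atomic_fetch_sub_explicit(&q->used, 1, memory_order_acq_rel);
	/* In the real code: if the wait queue is non-empty, hand the
	 * quota to a queued job here instead of decrementing 'used'. */
}
```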
A new option, 'min-transfer-rate-in <bytes> <minutes>', has been
added to the view and zone configurations. It can abort incoming
zone transfers that are running very slowly, for example due to
network-related issues. The default value is 10240 bytes in
5 minutes.
Closes#3914
Backport of MR !9098
Merge branch 'backport-3914-detect-and-restart-stalled-zone-transfers-9.20' into 'bind-9.20'
See merge request isc-projects/bind9!10137
Expose the average transfer rate (in bytes per second) during the
last full 'min-transfer-rate-in <bytes> <minutes>' interval.
If no such interval has passed yet, the overall average rate is
reported instead.
(cherry picked from commit c701b590e447301e7eca52ba3644fe60b633fa18)
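The reporting rule can be sketched as a small helper (function and parameter names are illustrative, not the actual statistics code):

```c
/*
 * Report bytes per second over the last completed measurement
 * interval when one exists; otherwise fall back to the overall
 * average since the transfer started.
 */
static unsigned long
xfer_rate_bps(unsigned long interval_bytes, unsigned long interval_secs,
	      unsigned long total_bytes, unsigned long total_secs) {
	if (interval_secs > 0) {
		/* A full interval has elapsed: use its rate. */
		return interval_bytes / interval_secs;
	}
	if (total_secs == 0) {
		return 0; /* nothing measurable yet */
	}
	return total_bytes / total_secs; /* overall average */
}
```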
Add a new big zone, run a zone transfer in slow mode, and check
whether the zone transfer gets canceled because 100000 bytes are
not transferred in 5 seconds (as it's running in slow mode).
(cherry picked from commit b9c6aa24f8c0c52d33fee11da28ec7316d6f4ed2)
This new option sets a minimum transfer rate for an incoming zone
transfer; transfers that, for network-related or other reasons, run
more slowly than this rate are aborted.
(cherry picked from commit 91ea1562030d1efd58e1e035fcfde3d8962a1f70)
The cache has been updated so that if new data is rejected - for example, because there was already existing data at a higher trust level - then its covering RRSIG will also be rejected.
Closes#5132
Backport of MR !9999
Merge branch 'backport-5132-improve-cd-behavior-9.20' into 'bind-9.20'
See merge request isc-projects/bind9!10134
Add a zone with different NS RRsets in the parent and child,
and test resolver and forwarder behavior with and without +CD.
(cherry picked from commit e4652a0444a514773686e75752ad5c65daa753d5)
Add a new dns_rdataset_equals() function to check whether two
rdatasets are equal in DNSSEC terms.
When an rdataset being cached is rejected because its trust
level is lower than the existing rdataset, we now check to see
whether the rejected data was identical to the existing data.
This allows us to cache a potentially useful RRSIG when handling
CD=1 queries, while still rejecting RRSIGs that would definitely
have resulted in a validation failure.
(cherry picked from commit 6aba56ae89cde535fcc6fbee0366c843cdf47845)
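The caching decision can be sketched like this; 'identical' stands in for a dns_rdataset_equals() comparison in DNSSEC terms, and the trust values are abstract levels, not the actual dns_trust_t constants:

```c
#include <stdbool.h>

/*
 * Decide whether to keep the covering RRSIG when a new rdataset
 * arrives. If the new data wins (or ties) on trust, it is cached and
 * its RRSIG with it. If it loses, the RRSIG is kept only when the
 * rejected rdataset is identical to the cached one: then the
 * signature can still serve CD=1 queries, while an RRSIG over
 * different data, which would be guaranteed to fail validation, is
 * still rejected.
 */
static bool
keep_covering_rrsig(unsigned int new_trust, unsigned int cached_trust,
		    bool identical) {
	if (new_trust >= cached_trust) {
		return true;
	}
	return identical;
}
```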
The value returned by http_send_outgoing() is not used anywhere, so
make it not return anything (void). This is probably an omission
from older times.
(cherry picked from commit 2adabe835a290c021948a43a4c2c25ce2806aef2)
When handling outgoing data, there were a couple of rarely executed
code paths that did not take into account that the callback MUST be
called.
This could lead to memory leaks and consequent shutdown hangs.
(cherry picked from commit 8b8f4d500d9c1d41d95d34a79c8935823978114c)
This commit changes how the number of active HTTP streams is
calculated, allowing it to scale with the configured maximum number
of streams per connection instead of being effectively capped at
STREAM_CLIENTS_PER_CONN.
The original limit was intended to define the pipelining limit for
TCP/DoT, but it proved too restrictive for DoH, which works quite
differently and implements pipelining at the protocol level by
multiplexing multiple streams. That renders each stream effectively
a separate connection from the point of view of the rest of the
codebase.
(cherry picked from commit a22bc2d7d4974d730e4a7267d2f85e74db53c688)
Previously, the amount of incoming data to process was limited
based solely on the presence of uncompleted send requests. That
worked, but it was found to severely degrade performance in certain
cases, as was revealed during extended testing.
Now we keep track of how much data is in flight (or ready to be in
flight) and limit the amount of incoming data processed when the
amount of in-flight data surpasses a given threshold, similarly to
what is done in the other transports.
(cherry picked from commit 05e8a508188116e8e367aaf5e68575c8cebb4207)
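A sketch of the read-side backpressure keyed on bytes in flight rather than on the mere presence of unfinished sends; the threshold value and all names are illustrative assumptions:

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative high-water mark for outgoing data. */
enum { SEND_HIWAT = 64 * 1024 };

typedef struct {
	size_t inflight; /* bytes queued or in flight to the peer */
} session_t;

/* Bookkeeping hooks called when sends are queued and completed. */
static void
send_queued(session_t *s, size_t n) {
	s->inflight += n;
}

static void
send_done(session_t *s, size_t n) {
	s->inflight -= n;
}

/*
 * Incoming data is processed only while the outgoing backlog is
 * below the high-water mark; a pending send no longer stalls reads
 * by itself.
 */
static bool
can_read(const session_t *s) {
	return s->inflight < SEND_HIWAT;
}
```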
In the qpzone implementation of `dns_db_closeversion()`, if there are changed nodes that have no remaining data, delete them.
Closes#5169
Backport of MR !10089
Merge branch 'backport-5169-qpzone-delete-dead-nodes-9.20' into 'bind-9.20'
See merge request isc-projects/bind9!10124
If all data has been deleted from a node in the qpzone
database, delete the node too.
(cherry picked from commit e58ce19cf22fd090e09918b842d8ec7f3772c77c)
The host command was extended to also query for the HTTPS RR type by default.
Backport of MR !8642
Merge branch 'backport-feature/main/host-rr-https-9.20' into 'bind-9.20'
See merge request isc-projects/bind9!10123
Unless a type is explicitly specified on the host command line,
perform a fourth query, for the HTTPS RR type. This record is
expected to become more common, and some systems already query it
for every name.
(cherry picked from commit 82069a57006066190dbbb1d30b271c25b7a74552)
Backport of !10017.
Merge branch '5145-artem-fix-wrong-logging-severity-in-do_nsfetch-9.20' into 'bind-9.20'
See merge request isc-projects/bind9!10118
Remove code in the QP zone database to handle failures of `dns_qp_insert()` which can't actually happen.
Closes#5171
Backport of MR !10088
Merge branch 'backport-5171-qpzone-insert-checks-9.20' into 'bind-9.20'
See merge request isc-projects/bind9!10114
In some places there were checks for failures of dns_qp_insert()
after dns_qp_getname(). Such failures could only happen if another
thread inserted a node between the two calls, and that can't happen
because the calls are serialized with dns_qpmulti_write(). We can
simplify the code and just add an INSIST.
(cherry picked from commit fffa150df3ffb4c6e31a8e33463f709cd70ede7b)
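The simplified pattern can be modeled with a toy lookup table standing in for the QP trie (all names and the INSIST macro are stand-ins, not the real dns_qp API):

```c
#include <assert.h>
#include <string.h>

#define INSIST(cond) assert(cond)

/* Toy "trie": a flat table of names. */
enum { MAXNODES = 8 };
static const char *names[MAXNODES];
static int nnodes;

/* Lookup: 0 = found, -1 = not found (mimicking a result code). */
static int
qp_getname(const char *name) {
	for (int i = 0; i < nnodes; i++) {
		if (strcmp(names[i], name) == 0) {
			return 0;
		}
	}
	return -1;
}

/* Insert: 0 = success, -1 = already present or full. */
static int
qp_insert(const char *name) {
	if (qp_getname(name) == 0 || nnodes == MAXNODES) {
		return -1;
	}
	names[nnodes++] = name;
	return 0;
}

/*
 * Because writers are serialized (dns_qpmulti_write() in the real
 * code), no other thread can insert 'name' between the lookup and
 * the insert, so the insert cannot fail and the error-handling
 * branch collapses into an INSIST.
 */
static void
add_node(const char *name) {
	if (qp_getname(name) != 0) {
		INSIST(qp_insert(name) == 0);
	}
}
```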
When processing a query with the "checking disabled" bit set (CD=1), `named` stores the unvalidated result in the cache, marked "pending". When the same query is sent with CD=0, the cached data is validated, and either accepted as an answer, or ejected from the cache as invalid. This deferred validation was not attempted for DS and DNSKEY records if they had no cached signatures, causing spurious validation failures. We now complete the deferred validation in this scenario.
Also, if deferred validation fails, we now re-query the data to find out whether the zone has been corrected since the invalid data was cached.
Closes#5066
Backport of MR !10104
Merge branch 'backport-5066-fix-strip-dnssec-rrsigs-9.20' into 'bind-9.20'
See merge request isc-projects/bind9!10105
If a deferred validation on data that was originally queried with
CD=1 fails, we now repeat the query, since the zone data may have
changed in the meantime.
(cherry picked from commit 04b1484ed8308baede372e642d1ed7c05c523a94)
When a query is made with CD=1, we store the result in the
cache marked pending so that it can be validated later, at
which time it will either be accepted as an answer or removed
from the cache as invalid. Deferred validation was not
attempted when there were no cached RRSIGs for DNSKEY and
DS. We now complete the deferred validation in this scenario.
(cherry picked from commit 8b900d180886ca333d94c87c782619dbedc775b5)
An incorrect optimization caused "CNAME and other data" errors not to be detected if certain types were at the same node as a CNAME. This has been fixed.
Closes#5150
Backport of MR !10033
Merge branch 'backport-5150-cname-and-other-data-check-not-applied-to-all-types-9.20' into 'bind-9.20'
See merge request isc-projects/bind9!10100
prio_type was being used in the wrong place to optimize the
cname_and_other check. We have to first exclude the accepted types,
and we also have to determine that the record exists, before we can
check whether we are at a point where a later CNAME cannot appear.
(cherry picked from commit 5e49a9e4ae8d0a78fb5ac0c7b683de9a29b6b848)
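The corrected ordering of the checks can be sketched as follows. This is simplified relative to the real database logic; the type numbers are the standard RR type codes (CNAME=5, KEY=25, RRSIG=46, NSEC=47), and the set of types accepted alongside a CNAME is an assumption for illustration:

```c
#include <stdbool.h>

/*
 * Decide whether a node exhibits "CNAME and other data".
 * 1. First exclude the types that may legally coexist with a CNAME.
 * 2. Then require that the record actually exists at this node.
 * 3. Only then does the presence of a CNAME make it an error.
 */
static bool
cname_and_other(unsigned int type, bool record_exists,
		bool cname_exists) {
	if (type == 5 || type == 25 || type == 46 || type == 47) {
		return false; /* CNAME itself, KEY, RRSIG, NSEC: allowed */
	}
	if (!record_exists) {
		return false; /* nothing conflicting is present */
	}
	return cname_exists;
}
```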