2
0
mirror of https://gitlab.isc.org/isc-projects/bind9 synced 2025-08-29 05:28:00 +00:00

41385 Commits

Author SHA1 Message Date
Nicki Křížek
b2cfdba565 Merge branch '4480-sig0-can-be-used-to-exhaust-cpu-resources-v6' into 'v9.20.0-release'
[CVE-2024-1975] Mitigate SIG(0) CPU resources exhaustion attack vectors

See merge request isc-private/bind9!689
2024-06-10 15:44:34 +00:00
Petr Špaček
9370acd3a7
Require local KEYs for SIG(0) verification
This is additional hardening. There is no known use-case for KEY RRs
from DNS cache and it potentially allows attackers to put weird keys
into cache.
2024-06-10 17:36:45 +02:00
Aram Sargsyan
d69fab1530
Mark SIG(0) quota settings as experimantal
A different solution in the future might be adopted depending
on feedback and other new information, so it makes sense to mark
these options as EXPERIMENTAL until we have more data.
2024-06-10 17:36:45 +02:00
Aram Sargsyan
54ddd848fe
Avoid running get_matching_view() asynchronously on an error path
Also create a new ns_client_async_reset() static function to decrease
code duplication.
2024-06-10 17:35:40 +02:00
Aram Sargsyan
a2b61c0a65
Test that named checks maximum two keys for SIG(0)-signed messages
Send three updates with three different keys, and expect that one
of them should fail.

Also retain more artifacts for neighboring nsupdate calls.
2024-06-10 17:35:39 +02:00
Aram Sargsyan
be482311de
Add a release note for [GL #4480] 2024-06-10 17:35:36 +02:00
Aram Sargsyan
3bb9241bec
Add a CHANGES note for [GL #4480] 2024-06-10 17:34:09 +02:00
Aram Sargsyan
7ca9bd6014
Limit the number of keys for SIG(0) message verification
Check at most two KEY RRs agains a SIG(0) signature. This should
limit potential abuse and at the same time allow key rollover.
2024-06-10 17:33:11 +02:00
Aram Sargsyan
70ff4a3f85
Run resolver message signature checking asynchronously 2024-06-10 17:33:11 +02:00
Aram Sargsyan
ad489c44df
Remove sig0checks-quota-maxwait-ms support
Waiting for a quota to appear complicates things and wastes
rosources on timer management. Just answer with REFUSE if
there is no quota.
2024-06-10 17:33:11 +02:00
Aram Sargsyan
f0cde05e06
Implement asynchronous view matching for SIG(0)-signed queries
View matching on an incoming query checks the query's signature,
which can be a CPU-heavy task for a SIG(0)-signed message. Implement
an asynchronous mode of the view matching function which uses the
offloaded signature checking facilities, and use it for the incoming
queries.
2024-06-10 17:33:10 +02:00
Aram Sargsyan
710bf9b938
Implement asynchronous message signature verification
Add support for using the offload threadpool to perform message
signature verifications. This should allow check SIG(0)-signed
messages without affecting the worker threads.
2024-06-10 17:33:10 +02:00
Aram Sargsyan
7f013ad05d
Remove dns_message_rechecksig()
This is a tiny helper function which is used only once and can be
replaced with two function calls instead. Removing this makes
supporting asynchronous signature checking less complicated.
2024-06-10 17:33:10 +02:00
Aram Sargsyan
bbc866d0cb
Document the SIG(0) signature checking quota options
Add documentation entries for the 'sig0checks-quota',
'sig0checks-quota-maxwait-ms', and 'sig0checks-quota-exempt'
optoins.
2024-06-10 17:33:10 +02:00
Aram Sargsyan
c7f79a0353
Add a quota for SIG(0) signature checks
In order to protect from a malicious DNS client that sends many
queries with a SIG(0)-signed message, add a quota of simultaneously
running SIG(0) checks.

This protection can only help when named is using more than one worker
threads. For example, if named is running with the '-n 4' option, and
'sig0checks-quota 2;' is used, then named will make sure to not use
more than 2 workers for the SIG(0) signature checks in parallel, thus
leaving the other workers to serve the remaining clients which do not
use SIG(0)-signed messages.

That limitation is going to change when SIG(0) signature checks are
offloaded to "slow" threads in a future commit.

The 'sig0checks-quota-exempt' ACL option can be used to exempt certain
clients from the quota requirements using their IP or network addresses.

The 'sig0checks-quota-maxwait-ms' option is used to define a maximum
amount of time for named to wait for a quota to appear. If during that
time no new quota becomes available, named will answer to the client
with DNS_R_REFUSED.
2024-06-10 17:33:08 +02:00
Nicki Křížek
24e8cc7b38 Merge branch '3405-security-limit-the-number-of-resource-records-in-rrset' into 'v9.20.0-release'
Limit the number of RRs in RRSets

See merge request isc-private/bind9!694
2024-06-10 15:01:48 +00:00
Evan Hunt
1bf7795b38
Add CHANGES and release note for [GL #3403] 2024-06-10 16:57:29 +02:00
Matthijs Mekking
c1ac8b6ad0
Log rekey failure as error if too many records
By default we log a rekey failure on debug level. We should probably
change the log level to error. We make an exception for when the zone
is not loaded yet, it often happens at startup that a rekey is
run before the zone is fully loaded.
2024-06-10 16:55:12 +02:00
Matthijs Mekking
82635e56d8
Log error when update fails
The new "too many records" error can make an update fail without the
error being logged. This commit fixes that.
2024-06-10 16:55:12 +02:00
Evan Hunt
7dd6b47ace
fix a memory leak that could occur when signing
when signatures were not added because of too many types already
existing at a node, the diff was not being cleaned up; this led to
a memory leak being reported at shutdown.
2024-06-10 16:55:12 +02:00
Matthijs Mekking
4e46453035
Add new test cases with DNSSEC signing
kasp-max-types-per-name (named2.conf.in):
An unsigned zone with RR type count on a name right below the
configured limit. Then sign the zone using KASP. Adding a RRSIG would
push it over the RR type limit per name. Signing should fail, but
the server should not crash, nor end up in infinite resign-attempt loop.

kasp-max-records-per-type-dnskey (named1.conf.in):
Test with low max-record-per-rrset limit and a DNSSEC policy requiring
more than the limit. Signing should fail.

kasp-max-types-per-name (named1.conf.in):
Each RRSIG(covered type) is counted as an individual RR type. Test the
corner case where a signed zone, which is just below the limit-1,
adds a new type - doing so would trigger signing for the new type and
thus increase the number of "types" by 2, pushing it over the limit
again.
2024-06-10 16:55:11 +02:00
Matthijs Mekking
15ecd2cce6
Check if restart works 2024-06-10 16:55:11 +02:00
Matthijs Mekking
ef9d5cf552
Switch to inline-signing no 2024-06-10 16:55:11 +02:00
Matthijs Mekking
6297e0d7a9
Add test cases that use DNSSEC signing
Add two new masterformat tests that use signing. In the case of
'under-limit-kasp', the signing will keep the number of records in the
RRset under the limit. In the case of 'on-limit-kasp', the signing
will push the number of records in the RRset over the limit, because
of the added RRSIG record.
2024-06-10 16:55:11 +02:00
Petr Špaček
b2afc83040
Remove duplicated empty zone files 2024-06-10 16:55:11 +02:00
Petr Špaček
d85f516f5b
masterformat: rename zone names to reflect intended meaning 2024-06-10 16:55:10 +02:00
Petr Špaček
124e220579
Test owner name rename: a b c d e -> <number>-txt 2024-06-10 16:55:10 +02:00
Petr Špaček
c080e510ab
Test variable rename i->_attempt 2024-06-10 16:55:10 +02:00
Petr Špaček
35faf81680
Test variable rename a->rrcount 2024-06-10 16:55:10 +02:00
Ondřej Surý
ccde4911ca
Add test for not-loading many RRsets per name on a secondary
This tests makes sure the zone with many RRsets per name is not loaded
via XFR on the secondary server.
2024-06-10 16:55:10 +02:00
Ondřej Surý
86aa4674ab
Add a test for not caching large number of RRsets
Send a recursive query for a large number of RRsets, which should
fail when using the default max-types-per-name setting of 100, but
succeed when the cap is disabled.
2024-06-10 16:55:10 +02:00
Ondřej Surý
52b3d86ef0
Add a limit to the number of RR types for single name
Previously, the number of RR types for a single owner name was limited
only by the maximum number of the types (64k).  As the data structure
that holds the RR types for the database node is just a linked list, and
there are places where we just walk through the whole list (again and
again), adding a large number of RR types for a single owner named with
would slow down processing of such name (database node).

Add a configurable limit to cap the number of the RR types for a single
owner.  This is enforced at the database (rbtdb, qpzone, qpcache) level
and configured with new max-types-per-name configuration option that
can be configured globally, per-view and per-zone.
2024-06-10 16:55:09 +02:00
Evan Hunt
3dc4388f4a
Add a test for not caching large RRset
Send a recursive query for a large (2500 record) RRset, which should
fail when using the default max-records-per-type setting of 100, but
succeed when the cap is disabled.
2024-06-10 16:55:09 +02:00
Ondřej Surý
5d4e57b914
Add test for not-loading and not-transfering huge RRSets
Add two new masterformat tests - the 'huge' zone fits within the ns1
limit and loads on the primary ns1 server, but must not transfer to the
ns2 secondary, and the 'uber' zone should not even load on the primary
ns1 server.
2024-06-10 16:55:09 +02:00
Ondřej Surý
32af7299eb
Add a limit to the number of RRs in RRSets
Previously, the number of RRs in the RRSets were internally unlimited.
As the data structure that holds the RRs is just a linked list, and
there are places where we just walk through all of the RRs, adding an
RRSet with huge number of RRs inside would slow down processing of said
RRSets.

Add a configurable limit to cap the number of the RRs in a single RRSet.
This is enforced at the database (rbtdb, qpzone, qpcache) level and
configured with new max-records-per-type configuration option that can
be configured globally, per-view and per-zone.
2024-06-10 16:55:07 +02:00
Nicki Křížek
0b44383c5b Merge branch '4481-security-tcp-flood' into 'v9.20.0-release'
[CVE-2024-0760] Throttle reading from TCP if the sends are not getting through

See merge request isc-private/bind9!639
2024-06-10 14:53:12 +00:00
Ondřej Surý
1002f920f6
Add CHANGES and release note for [GL #4481] 2024-06-10 16:49:56 +02:00
Ondřej Surý
e28266bfbc
Remove the extra memory context with own arena for sending
The changes in this MR prevent the memory used for sending the outgoing
TCP requests to spike so much.  That strictly remove the extra need for
own memory context, and thus since we generally prefer simplicity,
remove the extra memory context with own jemalloc arenas just for the
outgoing send buffers.
2024-06-10 16:48:54 +02:00
Ondřej Surý
4c2ac25a95
Limit the number of DNS message processed from a single TCP read
The single TCP read can create as much as 64k divided by the minimum
size of the DNS message.  This can clog the processing thread and trash
the memory allocator because we need to do as much as ~20k allocations in
a single UV loop tick.

Limit the number of the DNS messages processed in a single UV loop tick
to just single DNS message and limit the number of the outstanding DNS
messages back to 23.  This effectively limits the number of pipelined
DNS messages to that number (this is the limit we already had before).
2024-06-10 16:48:54 +02:00
Ondřej Surý
452a2e6348
Replace the tcp_buffers memory pool with static per-loop buffer
As a single thread can process only one TCP send at the time, we don't
really need a memory pool for the TCP buffers, but it's enough to have
a single per-loop (client manager) static buffer that's being used to
assemble the DNS message and then it gets copied into own sending
buffer.

In the future, this should get optimized by exposing the uv_try API
from the network manager, and first try to send the message directly
and allocate the sending buffer only if we need to send the data
asynchronously.
2024-06-10 16:48:53 +02:00
Aram Sargsyan
982eab7de0
ns_client: reuse TCP send buffers
Constantly allocating, reallocating and deallocating 64K TCP send
buffers by 'ns_client' instances takes too much CPU time.

There is an existing mechanism to reuse the ns_clent_t structure
associated with the handle using 'isc_nmhandle_getdata/_setdata'
(see ns_client_request()), but it doesn't work with TCP, because
every time ns_client_request() is called it gets a new handle even
for the same TCP connection, see the comments in
streamdns_on_complete_dnsmessage().

To solve the problem, we introduce an array of available (unused)
TCP buffers stored in ns_clientmgr_t structure so that a 'client'
working via TCP can have a chance to reuse one (if there is one)
instead of allocating a new one every time.
2024-06-10 16:48:53 +02:00
Ondřej Surý
4e7c4af17f
Throttle reading from TCP if the sends are not getting through
When TCP client would not read the DNS message sent to them, the TCP
sends inside named would accumulate and cause degradation of the
service.  Throttle the reading from the TCP socket when we accumulate
enough DNS data to be sent.  Currently this is limited in a way that a
single largest possible DNS message can fit into the buffer.
2024-06-10 16:48:52 +02:00
Nicki Křížek
d3609b742d Merge branch '4473-fix-doh-intermittent-crash' into 'v9.20.0-release'
DoH:  Avoid potential data races in our DoH implementation related to to HTTP/2 session object management and endpoints set object management

See merge request isc-private/bind9!614
2024-06-10 14:45:42 +00:00
Artem Boldariev
cdb5ae35e8
Modify release notes [GL #4473]
Mention that an intermittent BIND process termination in DoH code has
been fixed.
2024-06-10 16:41:00 +02:00
Artem Boldariev
a51ffa58d7
Modify CHANGES [GL #4473]
Mention that an intermittent BIND process termination in DoH code has
been fixed.
2024-06-10 16:40:56 +02:00
Artem Boldariev
d80dfbf745
Keep the endpoints set reference within an HTTP/2 socket
This commit ensures that an HTTP endpoints set reference is stored in
a socket object associated with an HTTP/2 stream instead of
referencing the global set stored inside a listener.

This helps to prevent an issue like follows:

1. BIND is configured to serve DoH clients;
2. A client is connected and one or more HTTP/2 stream is
created. Internal pointers are now pointing to the data on the
associated HTTP endpoints set;
3. BIND is reconfigured - the new endpoints set object is created and
promoted to all listeners;
4. The old pointers to the HTTP endpoints set data are now invalid.

Instead referencing a global object that is updated on
re-configurations we now store a local reference which prevents the
endpoints set objects to go out of scope prematurely.
2024-06-10 16:40:12 +02:00
Artem Boldariev
c41fb499b9
DoH: avoid potential use after free for HTTP/2 session objects
It was reported that HTTP/2 session might get closed or even deleted
before all async. processing has been completed.

This commit addresses that: now we are avoiding using the object when
we do not need it or specifically check if the pointers used are not
'NULL' and by ensuring that there is at least one reference to the
session object while we are doing incoming data processing.

This commit makes the code more resilient to such issues in the
future.
2024-06-10 16:40:10 +02:00
Nicki Křížek
07a5e7a921 Merge branch 'nicki/add-placeholder-for-4661' into 'main'
Add a CHANGES placeholder for [GL #4661]

See merge request isc-projects/bind9!9097
2024-06-10 14:16:46 +00:00
Nicki Křížek
4fe6a6bdc0
Add a CHANGES placeholder for [GL #4661] 2024-06-10 16:14:25 +02:00
Evan Hunt
05823eb1b0 Merge branch '4728-allow-transfer-none' into 'main'
change allow-transfer default to "none"

Closes #4728

See merge request isc-projects/bind9!9046
2024-06-05 21:50:47 +00:00