mirror of https://gitlab.isc.org/isc-projects/bind9 synced 2025-08-30 22:15:20 +00:00
Commit Graph

35845 Commits

Author SHA1 Message Date
Matthijs Mekking
d6d107d804 Save keyfromlabel error output
Save the error output from pkcs11-tool and dnssec-keyfromlabel in the
engine_pkcs11 system test.
2022-03-21 10:11:02 +01:00
Ondřej Surý
8268526294 Merge branch 'ondrej/add-isc_nm_getnworkers' into 'main'
Make netmgr the authority on number of threads running

See merge request isc-projects/bind9!5999
2022-03-18 21:21:47 +00:00
Ondřej Surý
d70daa29f7 Make netmgr the authority on number of threads running
Instead of passing the "workers" variable back and forth along with
passing the single isc_nm_t instance, add isc_nm_getnworkers() function
that returns the number of netmgr threads that are running.

Change the ns_interfacemgr and ns_taskmgr to utilize the newly acquired
knowledge.
2022-03-18 21:53:28 +01:00
Tony Finch
4761213e80 Merge branch '3201-no-vla' into 'main'
Avoid using C99 variable length arrays

Closes #3201

See merge request isc-projects/bind9!5956
2022-03-18 16:02:46 +00:00
Tony Finch
599c1d2a6b Avoid using C99 variable length arrays
From an attacker's point of view, a VLA declaration is essentially a
primitive for performing arbitrary arithmetic on the stack pointer. If
the attacker can control the size of a VLA they have a very powerful
tool for causing memory corruption.

To mitigate this kind of attack, and the more general class of stack
clash vulnerabilities, C compilers insert extra code when allocating a
VLA to probe the growing stack one page at a time. If these probes hit
the stack guard page, the program will crash.

From the point of view of a C programmer, there are a few things to
consider about VLAs:

  * If it is important to handle allocation failures in a controlled
    manner, don't use VLAs. You can use VLAs if it is OK for
    unreasonable inputs to cause an uncontrolled crash.

  * If the VLA is known to be smaller than some known fixed size,
    use a fixed size array and a run-time check to ensure it is large
    enough. This will be more efficient than the compiler's stack
    probes that need to cope with arbitrary-size VLAs.

  * If the VLA might be large, allocate it on the heap. The heap
    allocator can allocate multiple pages in one shot, whereas the
    stack clash probes work one page at a time.
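
The second and third points above can be sketched as follows. This is only an illustration, not code from BIND: `SMALL_MAX`, `sum_small()` and `sum_large()` are hypothetical names, and the byte-summing work is a placeholder for whatever the temporary buffer is really needed for.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical upper bound, chosen only for this illustration. */
#define SMALL_MAX 64

/*
 * Known-small input: a fixed-size array plus a run-time check takes the
 * place of a VLA such as "unsigned char tmp[len];".
 */
static bool
sum_small(const unsigned char *src, size_t len, unsigned int *sum) {
	unsigned char tmp[SMALL_MAX];

	assert(src != NULL && sum != NULL);
	if (len > sizeof(tmp)) {
		return (false); /* controlled failure instead of a crash */
	}
	memcpy(tmp, src, len);
	*sum = 0;
	for (size_t i = 0; i < len; i++) {
		*sum += tmp[i];
	}
	return (true);
}

/*
 * Possibly-large input: allocate on the heap, where an allocation
 * failure can be detected and reported to the caller.
 */
static bool
sum_large(const unsigned char *src, size_t len, unsigned int *sum) {
	unsigned char *tmp = malloc(len > 0 ? len : 1);

	assert(src != NULL && sum != NULL);
	if (tmp == NULL) {
		return (false);
	}
	memcpy(tmp, src, len);
	*sum = 0;
	for (size_t i = 0; i < len; i++) {
		*sum += tmp[i];
	}
	free(tmp);
	return (true);
}
```

In both variants an oversized or unsatisfiable request becomes an ordinary `false` return that the caller can handle.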

Most of the existing uses of VLAs in BIND are in test code where they
are benign, but there was one instance in `named`, in the GSS-TSIG
verification code, which has now been removed.

This commit adjusts the style guide and the C compiler flags to allow
VLAs in test code but not elsewhere.
2022-03-18 15:11:48 +00:00
Tony Finch
eeead1cfe7 Remove a redundant variable-length array
In the GSS-TSIG verification code there was an alarming
variable-length array whose size came off the network, from the
signature in the request. It turned out to be safe, because the caller
had previously checked that the signature had a reasonable size.
However, the safety checks are in the generic TSIG implementation, and
the risky VLA usage was in the GSS-specific code, and they are
separated by the DST indirection layer, so it wasn't immediately
obvious that the risky VLA was in fact safe.

In fact this risky VLA was completely unnecessary, because the GSS
signature can be verified in place without being copied to the stack,
like the message covered by the signature. The `REGION_TO_GBUFFER()`
macro backwardly assigns the region in its left argument to the GSS
buffer in its right argument; this is just a pointer and length
conversion, without copying any data. The `gss_verify_mic()` call uses
both message and signature GSS buffers in a read-only manner.
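
A pointer-and-length conversion of this kind can be sketched as below. The types `region_t` and `gbuffer_t` and the macro name `SKETCH_REGION_TO_GBUFFER` are stand-ins invented for this illustration; they only mimic the shape of an ISC region and a GSS buffer and are not the definitions used in BIND or in the GSS-API headers.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for an ISC region: a base pointer and a length. */
typedef struct {
	unsigned char *base;
	unsigned int length;
} region_t;

/* Stand-in for a GSS buffer descriptor: a length and a value pointer. */
typedef struct {
	size_t length;
	void *value;
} gbuffer_t;

/*
 * Assign the region on the left to the buffer on the right.  Only the
 * pointer and the length are copied; the bytes themselves stay in place.
 */
#define SKETCH_REGION_TO_GBUFFER(r, gb)   \
	do {                              \
		(gb).length = (r).length; \
		(gb).value = (r).base;    \
	} while (0)

/* Helper showing that the buffer aliases the region's memory. */
static int
shares_memory(const region_t *r, const gbuffer_t *gb) {
	assert(r != NULL && gb != NULL);
	return (gb->value == (void *)r->base && gb->length == r->length);
}
```

Because the buffer aliases the original bytes, this is only appropriate when the consumer treats both buffers as read-only, which is the property the commit message relies on.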
2022-03-18 15:06:31 +00:00
Arаm Sаrgsyаn
ed22d12f10 Merge branch '3205-dig-tcp-next-server-on-connection-error-crash' into 'main'
Fix dig error when trying the next server after a TCP connection failure

Closes #3205

See merge request isc-projects/bind9!5976
2022-03-18 10:55:23 +00:00
Aram Sargsyan
ced79790b3 Add CHANGES note for [GL #3205] 2022-03-18 10:29:08 +00:00
Aram Sargsyan
03697f1bcc Add various dig/host tests for TCP/UDP socket error handling cases
Rework the "ans8" server in the "digdelv" system test to support various
modes of operations using a control channel.

The supported modes are:

1. `silent` (do not respond)
2. `close` (UDP: same as `silent`; TCP: also close the connection)
3. `servfail` (always respond with `SERVFAIL`)
4. `unstable` (constantly switch between `silent` and `servfail`)

Add multiple tests to check the handling of both TCP and UDP socket
error scenarios in dig/host.
2022-03-18 10:28:19 +00:00
Aram Sargsyan
0fb4fc1897 Fix dig error when trying the next server after a TCP connection failure
When encountering a TCP connection error while trying to initiate a
connection to a server, dig erroneously cancels the lookup even when
there are other server(s) to try, which results in an assertion failure.

Cancel the lookup only when there are no more queries left in the
lookup's queries list (i.e. `next` is NULL).
2022-03-18 10:28:19 +00:00
Arаm Sаrgsyаn
85870ad9ee Merge branch '3128-dig-does-not-recover-from-a-isc_nm_udpconnect-failure' into 'main'
After dig request errors, try to use other servers when they exist

Closes #3128

See merge request isc-projects/bind9!5967
2022-03-18 10:24:46 +00:00
Aram Sargsyan
b3a058e7bb Add CHANGES entry for [GL #3128] 2022-03-18 09:12:23 +00:00
Aram Sargsyan
e8a64d0cbe Add digdelv system test to check that dig tries other servers on error
Add a test to check whether dig tries the next query/server after
a connection error.

Add a test to check whether dig tries the next query/server after
one or more (default is 3) connection/request timeouts.
2022-03-18 09:12:23 +00:00
Aram Sargsyan
bc203d6082 After dig request errors, try to use other servers when they exist
When a query times out or hits another type of socket error, dig does
not try to perform the lookup using the other servers which exist in
the lookup's queries list.

After the configured number of timeout retries, or after a socket error,
check if there are other queries/servers in the lookup's queries list,
and start the next one if it exists, instead of unconditionally failing.
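
The decision described above can be sketched as a small pure function. This is a simplified model, not dig's code: `query_t`, `action_t` and `after_failure()` are hypothetical names, and the real logic lives in callbacks with considerably more state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for one query (one server) in a lookup's queries list. */
typedef struct query {
	struct query *next;
} query_t;

typedef enum {
	RETRY_SAME_SERVER, /* timeout, and retries remain */
	START_NEXT_QUERY,  /* give up on this server, try the next one */
	FAIL_LOOKUP        /* nothing left to try */
} action_t;

/*
 * Decide what to do after query 'q' failed.  'timed_out' distinguishes a
 * timeout from any other socket error; 'tries_left' is the number of
 * timeout retries still allowed for this server.
 */
static action_t
after_failure(const query_t *q, bool timed_out, unsigned int tries_left) {
	assert(q != NULL);
	if (timed_out && tries_left > 0) {
		return (RETRY_SAME_SERVER);
	}
	if (q->next != NULL) {
		return (START_NEXT_QUERY);
	}
	return (FAIL_LOOKUP);
}
```

The lookup fails only on the last branch, when the failed query has no successor in the list.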
2022-03-18 09:12:23 +00:00
Arаm Sаrgsyаn
da0d85d748 Merge branch '3020-dighost-servfail-bug' into 'main'
When resending a UDP request, insert the query to the lookup's list

Closes #3020

See merge request isc-projects/bind9!5954
2022-03-18 09:02:40 +00:00
Aram Sargsyan
3ec5d2d6ed Add digdelv system test to check timed-out result followed by a SERVFAIL
This test ensures that `dig` retries with another attempt after a
timed-out request, and that it does not crash when the retried
request returns a SERVFAIL result. See [GL #3020] for the latter
issue.
2022-03-18 08:24:39 +00:00
Aram Sargsyan
e353700189 Add CHANGES note for [GL #3020] 2022-03-18 08:24:38 +00:00
Aram Sargsyan
a962475948 When resending a UDP request, insert the query to the lookup's list
When a query times out, and `dig` (or `host`) creates a new query
to resend the request, it is being prepended to the lookup's queries
list, which can cause a confusion later, making `dig` (or `host`)
believe that there is another new query in the list, but that is
actually the old one, which was timed out. That mistake will result
in an assertion failure.

That can happen, in particular, when after a timed out request,
the retried request returns a SERVFAIL result, and the recursion
is enabled, and `+nofail` option was used with `dig` (that is the
default behavior in `host`, unless the `-s` option is provided).

Fix the problem by inserting the query just after the current,
timed-out query, instead of prepending to the list.
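
The difference between the two insertion points can be illustrated with a plain singly linked list. The types and function names below are made up for this sketch; dig keeps its queries in an ISC list, which is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for a query and for the lookup that owns the queries list. */
typedef struct query {
	int id;
	struct query *next;
} query_t;

typedef struct lookup {
	query_t *head;
	query_t *current;
} lookup_t;

/* Old behaviour: the resent query becomes the new head of the list. */
static void
prepend(lookup_t *l, query_t *q) {
	q->next = l->head;
	l->head = q;
}

/* Fixed behaviour: the resent query goes right after the current one. */
static void
insert_after_current(lookup_t *l, query_t *q) {
	assert(l->current != NULL);
	q->next = l->current->next;
	l->current->next = q;
}
```

With `prepend()`, the retry's `next` pointer leads back to the query that has just timed out, so a later "is there another query to try?" check finds the stale one. With `insert_after_current()`, the retry's successor is whatever followed the timed-out query, or `NULL` if there was nothing.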

Before calling start_udp() detach `l->current_query`, like it is
done in another place in the function.

Slightly update a couple of debug messages to make them more
consistent.
2022-03-18 08:23:53 +00:00
Aram Sargsyan
e888c62fbd Fix an issue in dig when retrying with the next server after SERVFAIL
After a query results in a SERVFAIL result, and there is another
registered query in the lookup's queries list, `dig` starts the next
query to try another server, but it also reports doing so whenever the
current query is at the head of the list, even if there is no other
query in the list to try.

Use the same condition for both decisions, and after starting the next
query, jump to the "detach_query" label instead of "next_lookup",
because there is no need to start the next lookup after we just started
a query in the current lookup.
2022-03-18 08:23:53 +00:00
Ondřej Surý
99906df09e Merge branch '3208-fix-xfrout-maxtimer-timer-log-message-log-level' into 'main'
Change xfer-out timer message log level to DEBUG(1)

Closes #3208

See merge request isc-projects/bind9!5995
2022-03-17 20:34:30 +00:00
Ondřej Surý
8f6e4dfa15 Change xfer-out timer message log level to DEBUG(1)
When max-transfer-*-out timeouts were reintroduced, the log message
about starting the timer was erroneously left as ISC_LOG_ERROR.
Change the log level of said message to ISC_LOG_DEBUG(1).
2022-03-17 21:28:29 +01:00
Ondřej Surý
4c008d20e6 Merge branch 'ondrej/add-missing-braces-clang-format-15' into 'main'
Add couple missing braces around single-line statements

See merge request isc-projects/bind9!5968
2022-03-17 17:50:42 +00:00
Ondřej Surý
ff22498849 Add couple missing braces around single-line statements
clang-format-15 has a new option, InsertBraces, that can add missing
braces around single-line statements.  Use that to our advantage,
without switching to a not-yet-released LLVM version, to add missing
braces in a couple of places.
2022-03-17 18:27:45 +01:00
Ondřej Surý
8495edc31d Merge branch '3212-implement-incremental-rehashing-for-isc_ht-hashtables' into 'main'
Implement incremental hash table resizing in isc_ht

Closes #3212

See merge request isc-projects/bind9!5983
2022-03-17 07:35:00 +00:00
Ondřej Surý
5ccb28d6d8 Add CHANGES note for [GL #3212] 2022-03-17 08:16:24 +01:00
Ondřej Surý
cd52953f8a Update the isc_ht unit test to also test rehashing
As incremental rehashing has been added to isc_ht implementation, we
need to test whether the rehashing works.

Update the isc_ht unit test to test:

 * preinitialized hash table large enough to hold all the elements
 * smallest hash table that fully grows to hold all the elements
 * partially preinitialized hash table that grows
 * iterating while rehashing is in progress
2022-03-17 08:16:24 +01:00
Ondřej Surý
e42cb1f198 Implement incremental hash table resizing in isc_ht
Previously, incremental hash table resizing was implemented for the
dns_rbt_t hash table implementation.  Using that as a base, also
implement incremental hash table resizing for isc_ht API
hashtables:

 1. During the resize, allocate the new hash table, but keep the old
    table unchanged.
 2. In each lookup, delete, or iterator operation, check both tables.
 3. Perform insertion operations only in the new table.
 4. At each insertion also move <r> elements from the old table to
    the new table.
 5. When all elements are removed from the old table, deallocate it.

To ensure that the old table is completely copied over before the new
table itself needs to be enlarged, it is necessary to increase the
size of the table by a factor of at least (<r> + 1)/<r> during resizing.

In our implementation <r> is equal to 1.

The downside of this approach is that the old table and the new table
could stay in memory for longer when there are no new insertions into
the hash table for prolonged periods of time as the incremental
rehashing happens only during the insertions.
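
The five steps above, with <r> = 1 and a growth factor of 2, can be sketched as a small chained hash table. This is an illustrative model only: every name below is invented for the sketch, keys are plain integers assumed to be distinct, and deletion, iteration and cleanup are left out. It does not reproduce the isc_ht code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct node {
	uint32_t key;
	struct node *next;
} node_t;

typedef struct {
	node_t **table[2]; /* table[cur] is new; table[!cur] is old or NULL */
	size_t size[2];    /* bucket counts */
	size_t count;      /* stored keys */
	size_t hiter;      /* next old-table bucket to migrate */
	int cur;
} ht_t;

static size_t
bucket(uint32_t key, size_t size) {
	return ((size_t)(key * UINT32_C(2654435761)) % size);
}

static bool
ht_rehashing(const ht_t *ht) {
	return (ht->table[!ht->cur] != NULL);
}

static void
ht_init(ht_t *ht, size_t size) {
	assert(size > 0);
	*ht = (ht_t){ .size = { size, 0 } };
	ht->table[0] = calloc(size, sizeof(node_t *));
	assert(ht->table[0] != NULL);
}

/* Step 2: while rehashing, a key may live in either table. */
static bool
ht_find(const ht_t *ht, uint32_t key) {
	for (int i = 0; i < 2; i++) {
		if (ht->table[i] == NULL) {
			continue;
		}
		node_t *n = ht->table[i][bucket(key, ht->size[i])];
		for (; n != NULL; n = n->next) {
			if (n->key == key) {
				return (true);
			}
		}
	}
	return (false);
}

/* Steps 4 and 5: move one element; free the old table once it is empty. */
static void
migrate_one(ht_t *ht) {
	int old = !ht->cur;
	bool moved = false;

	while (ht->hiter < ht->size[old]) {
		node_t *n = ht->table[old][ht->hiter];
		if (n == NULL) {
			ht->hiter++;
		} else if (moved) {
			return; /* the rest waits for later insertions */
		} else {
			size_t b = bucket(n->key, ht->size[ht->cur]);
			ht->table[old][ht->hiter] = n->next;
			n->next = ht->table[ht->cur][b];
			ht->table[ht->cur][b] = n;
			moved = true;
		}
	}
	free(ht->table[old]);
	ht->table[old] = NULL;
}

/* Steps 1, 3 and 4: start a resize if needed, migrate, insert into new. */
static void
ht_insert(ht_t *ht, uint32_t key) {
	if (!ht_rehashing(ht) && ht->count >= ht->size[ht->cur]) {
		int fresh = !ht->cur;
		ht->size[fresh] = ht->size[ht->cur] * 2;
		ht->table[fresh] = calloc(ht->size[fresh], sizeof(node_t *));
		assert(ht->table[fresh] != NULL);
		ht->cur = fresh;
		ht->hiter = 0;
	}
	if (ht_rehashing(ht)) {
		migrate_one(ht);
	}
	node_t *n = malloc(sizeof(*n));
	assert(n != NULL);
	size_t b = bucket(key, ht->size[ht->cur]);
	n->key = key;
	n->next = ht->table[ht->cur][b];
	ht->table[ht->cur][b] = n;
	ht->count++;
}
```

In this model a resize starts when the table holds as many keys as it has buckets. Doubling leaves room for exactly that many further insertions, and each of them migrates one key, so the old table is drained just before the new one fills up.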
2022-03-17 08:16:24 +01:00
Michał Kępień
7ba3a06935 Merge branch '3129-check-fetch-shutting-down-in-resume_dslookup' into 'main'
[CVE-2022-0667] Check if the fetch is shutting down in resume_dslookup()

See merge request isc-projects/bind9!5989
2022-03-16 22:05:26 +00:00
Michał Kępień
71dd44339f Merge branch '3158-confidential-issue-only-set-foundname-on-success' into 'main'
[CVE-2022-0635] DNAME lookups can trigger INSIST when synth-from-dnssec is enabled

See merge request isc-projects/bind9!5988
2022-03-16 21:42:28 +00:00
Michał Kępień
ae7fa0a308 Merge branch '3112-ensure-correct-ordering-in-isc__nm_process_sock_buffer' into 'main'
[CVE-2022-0396] Resolve #3112 TCP sockets stuck in CLOSE_WAIT

Closes #3112

See merge request isc-projects/bind9!5987
2022-03-16 21:36:53 +00:00
Michał Kępień
9c27a3b0e2 Merge branch '2950-confidential-cache-acceptance-rules' into 'main'
[CVE-2021-25220] prevent cache poisoning from forwarder responses

See merge request isc-projects/bind9!5986
2022-03-16 21:30:34 +00:00
Aram Sargsyan
9241363f36 Add CHANGES and release note for [GL #3129] 2022-03-16 22:11:49 +01:00
Mark Andrews
c9f28777f6 Add CHANGES and release note for [GL #3158] 2022-03-16 22:11:49 +01:00
Ondřej Surý
dcb6a0c4f8 Add CHANGES and release note for [GL #3112] 2022-03-16 22:11:49 +01:00
Petr Špaček
51546e8892 Add Release Note for [GL #2950] 2022-03-16 22:11:49 +01:00
Aram Sargsyan
f0f3370e14 Check if the fetch is shutting down in resume_dslookup()
The fetch can be in the shutting down state when resume_dslookup() is
trying to operate on it.

This is also a security issue, because a malicious actor can set up a
name server which delays certain queries in such a way that the fetch
will time out and shut down, which will cause named to crash.

Add a check to see if the fetch has the shutting down attribute set,
and cancel any further operations on it in such case.

A similar bug had been fixed earlier for the resume_qmin() function,
see [GL #966].
2022-03-16 22:11:49 +01:00
Mark Andrews
9fcc028f5c Skip calling find_coveringnsec if we found a DNAME
This is an optimisation as we can skip a lot of pointless work when we
know there is a DNAME there.

When we have a partial match and a DNAME above the QNAME, the closest
encloser has the same owner as the DNAME, will have the DNAME bit set
in the type map, and we wouldn't use it as we would return the
DNAME + RRSIG(DNAME) instead.

So there is no point in looking for it nor in attempting to check that
it is valid for the QNAME.
2022-03-16 22:11:49 +01:00
Ondřej Surý
bfa4b9c141 Run .closehandle_cb asynchronously in nmhandle_detach_cb()
When sock->closehandle_cb is set, we need to run nmhandle_detach_cb()
asynchronously to ensure correct order of multiple packets processing in
the isc__nm_process_sock_buffer().  When not run asynchronously, it
would cause:

  a) out-of-order processing of the return codes from processbuffer();

  b) stack growth because the next TCP DNS message read callback will
     be called from within the current TCP DNS message read callback.

The sock->closehandle_cb is set to isc__nm_resume_processing() for TCP
sockets which calls isc__nm_process_sock_buffer().  If the read callback
(called from isc__nm_process_sock_buffer()->processbuffer()) doesn't
attach to the nmhandle (f.e. because it wants to drop the processing or
we send the response directly via uv_try_write()), the
isc__nm_resume_processing() (via .closehandle_cb) would call
isc__nm_process_sock_buffer() recursively.

The below shortened code path shows how the stack can grow:

 1: ns__client_request(handle, ...);
 2: isc_nm_tcpdns_sequential(handle);
 3: ns_query_start(client, handle);
 4:   query_lookup(qctx);
 5:     query_send(qctx->client);
 6:       isc__nmhandle_detach(&client->reqhandle);
 7:         nmhandle_detach_cb(&handle);
 8:           sock->closehandle_cb(sock); // isc__nm_resume_processing
 9:             isc__nm_process_sock_buffer(sock);
10:               processbuffer(sock); // isc__nm_tcpdns_processbuffer
11:                 isc_nmhandle_attach(req->handle, &handle);
12:                 isc__nm_readcb(sock, req, ISC_R_SUCCESS);
13:                   isc__nm_async_readcb(NULL, ...);
14:                     uvreq->cb.recv(...); // ns__client_request

Instead, if 'sock->closehandle_cb' is set, we need to detach the
handle asynchronously in 'isc__nmhandle_detach', so that line 8 in
the code flow above does not start this recursion. This ensures the
correct order when processing multiple packets in the function
'isc__nm_process_sock_buffer()' and prevents the stack growth.
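
The effect of deferring the callback can be shown with a toy model that counts call depth. Nothing below is netmgr code: the functions only borrow the names from the description, a single flag stands in for the event loop's queue, and `run()` is a hypothetical driver.

```c
#include <assert.h>
#include <stdbool.h>

static int depth, max_depth, pending;
static bool async_mode, resume_queued;

static void process_sock_buffer(void);

/* Stand-in for the close-handle callback that resumes processing. */
static void
closehandle_cb(void) {
	if (async_mode) {
		resume_queued = true; /* let the event loop call us later */
	} else {
		process_sock_buffer(); /* direct call: recursion */
	}
}

/* Handle one buffered message; its handler detaches the handle at once. */
static void
process_sock_buffer(void) {
	if (pending == 0) {
		return;
	}
	depth++;
	if (depth > max_depth) {
		max_depth = depth;
	}
	pending--;
	closehandle_cb();
	depth--;
}

/* Process 'messages' buffered messages; return the deepest nesting seen. */
static int
run(int messages, bool async) {
	assert(messages >= 0);
	depth = max_depth = 0;
	pending = messages;
	async_mode = async;
	resume_queued = false;
	process_sock_buffer();
	while (resume_queued) { /* toy event loop */
		resume_queued = false;
		process_sock_buffer();
	}
	return (max_depth);
}
```

With the direct call, the nesting depth equals the number of buffered messages. With the deferred call, each message is handled from the top of the loop and the depth stays at one.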

When not run asynchronously, the out-of-order processing leaves the
first TCP socket open until all requests on the stream have been
processed.

If the pipelining is disabled on the TCP via `keep-response-order`
configuration option, named would keep the first socket in lingering
CLOSE_WAIT state when the client sends an incomplete packet and then
closes the connection from the client side.
2022-03-16 22:11:49 +01:00
Petr Špaček
612f277877 Add CHANGES note for [GL #2950] 2022-03-16 22:11:49 +01:00
Mark Andrews
5c271f91e1 Only update foundname if returning DNS_R_COVERINGNSEC
'setup_delegation' depends on 'foundname' being the value returned
by 'dns_rbt_findnode' in the cache and 'find_coveringnsec' was
modifying 'foundname' when a covering NSEC was not found.
2022-03-16 22:11:49 +01:00
Mark Andrews
fe1bbba259 Look for zones deeper than the current domain or forward name
When caching glue, we need to ensure that there is no closer
source of truth for the name. If the owner name for the glue
record would be answered by a locally configured zone, do not
cache.
2022-03-16 22:11:49 +01:00
Mark Andrews
c289913e5c Check cached names for possible "forward only" clause
When caching additional and glue data *not* from a forwarder, we must
check that there is no "forward only" clause covering the owner name
that would take precedence.  Such names would normally be allowed by
bailiwick rules, but a "forward only" zone introduces a new bailiwick
scope.
2022-03-16 22:11:49 +01:00
Mark Andrews
7e37b5e379 Check that the forward declaration is unchanged and not overridden
If we are using a forwarder, in addition to checking that names to
be cached are subdomains of the forwarded namespace, we must also
check that there are no subsidiary forwarded namespaces which would
take precedence. To be safe, we don't cache any responses if the
forwarding configuration has changed since the query was sent.
2022-03-16 22:11:49 +01:00
Mark Andrews
5dc3b25d03 Add additional name checks when using a forwarder
When using a forwarder, check that the owner names of response
records are within the bailiwick of the forwarded name space.
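
A bailiwick check of this kind is essentially an "is subdomain of" test on label boundaries. The sketch below works on presentation-format strings, assuming there is no trailing dot and no escaped dots; `in_bailiwick()` is a hypothetical helper, not the function BIND uses, which operates on wire-format names.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true if 'owner' equals 'zone' or is a subdomain of it.
 * Comparison is case-insensitive, and the match must begin at a label
 * boundary so that "badexample.com" is not accepted for "example.com".
 */
static bool
in_bailiwick(const char *owner, const char *zone) {
	size_t olen, zlen;
	const char *tail;

	assert(owner != NULL && zone != NULL);
	olen = strlen(owner);
	zlen = strlen(zone);
	if (zlen == 0) {
		return (true); /* the root covers every name */
	}
	if (zlen > olen) {
		return (false);
	}
	tail = owner + (olen - zlen);
	for (size_t i = 0; i < zlen; i++) {
		if (tolower((unsigned char)tail[i]) !=
		    tolower((unsigned char)zone[i]))
		{
			return (false);
		}
	}
	return (olen == zlen || tail[-1] == '.');
}
```

A response record whose owner name fails this test for the forwarded name space would not be cached.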
2022-03-16 22:11:49 +01:00
Matthijs Mekking
fd8dd9841d Merge branch '3185-follow-up-fix-zone-documentation' into 'main'
Fix zone named.conf man page documentation

Closes #3185

See merge request isc-projects/bind9!5977
2022-03-15 13:14:25 +00:00
Matthijs Mekking
01b125ff05 Fix named.conf man page documentation
Commit 4ca74eee49 updated the zone grammar
such that the zone statement is printed with the valid options per
zone type.

This commit is a follow-up, putting back the ZONE heading and adding
a note that these zone statements may also be put inside the view
statement.

It is tricky to actually print the zone statements inside
the view statement, and so we decided that we would add a note to say
that this is possible.
2022-03-15 14:13:45 +01:00
Ondřej Surý
13b20ef411 Merge branch '3202-cleanup-isc_timer-API' into 'main'
Refactor and simplify isc_timer API

See merge request isc-projects/bind9!5966
2022-03-14 21:13:24 +00:00
Ondřej Surý
7f91f1ecaa Add CHANGES note for [GL #3202] 2022-03-14 13:00:05 -07:00
Ondřej Surý
79b5ccbf34 Implement isc_interval_t on top of isc_time_t
Change the isc_interval_t implementation from a separate data type and
separate implementation to a shim implementation on top of isc_time_t.
The distinction between isc_interval_t and isc_time_t has been kept
because they are semantically different - isc_interval_t is relative and
isc_time_t is absolute, but this allows isc_time_t and isc_interval_t to
be freely interchangeable, f.e. this:

    isc_time_t *t1;
    isc_interval_t *interval;
    isc_time_t *t2;

    isc_interval_set(interval, isc_time_seconds(t2), isc_time_nanoseconds(t2));
    isc_time_subtract(t1, interval, t2);
    isc_interval_set(interval, isc_time_seconds(t2), isc_time_nanoseconds(t2));

to just:

    isc_time_t *t1;
    isc_interval_t *interval;
    isc_time_t *t2;

    isc_time_subtract(t1, t2, interval);

without introducing a whole set of new functions.
2022-03-14 13:00:05 -07:00
Ondřej Surý
e6ca2a651f Refactor isc_timer_reset() use with semantic patch
Add and apply semantic patch to remove expires argument from the
isc_timer_reset() calls through the codebase.
2022-03-14 13:00:05 -07:00