2
0
mirror of https://gitlab.isc.org/isc-projects/bind9 synced 2025-08-29 13:38:26 +00:00

565 Commits

Author SHA1 Message Date
Ondřej Surý
359faf2ff7
Convert isc_astack usage in netmgr to mempool and ISC_LIST
Change the per-socket inactive uvreq cache (implemented as isc_astack)
to per-worker memory pool.

Change the per-socket inactive nmhandle cache (implemented as
isc_astack) to unlocked per-socket ISC_LIST.
2023-01-10 20:31:24 +01:00
Ondřej Surý
5bbba0d1a1
Simplify tracing the reference counting in isc_netmgr
Always track the per-worker sockets in the .active_sockets field in the
isc__networker_t struct and always track the per-socket handles in the
.active_handles field ian the isc_nmsocket_t struct.
2023-01-10 19:57:39 +01:00
Evan Hunt
916ea26ead remove nonfunctional DSCP implementation
DSCP has not been fully working since the network manager was
introduced in 9.16, and has been completely broken since 9.18.
This seems to have caused very few difficulties for anyone,
so we have now marked it as obsolete and removed the
implementation.

To ensure that old config files don't fail, the code to parse
dscp key-value pairs is still present, but a warning is logged
that the feature is obsolete and should not be used. Nothing is
done with configured values, and there is no longer any
range checking.
2023-01-09 12:15:21 -08:00
Evan Hunt
9c577e10c3 use separate barriers for "stop" and "listen" operations
On some platforms, when a synchronizing barrier is cleared, one
thread can progress while other threads are still in the process
of releasing the barrier. If a barrier is reused by the progressing
thread during this window, it can cause a deadlock. This can occur if,
for example, we stop listening immediately after we start, because the
stop and listen functions both use socket->barrier.  This has been
addressed by using separate barrier objects for stop and listen.
2023-01-07 16:30:21 -08:00
Ondřej Surý
6553927d27
Enforce strong thread-affinity on StreamDNS sockets
Add a check that the isc__nm_streamdns_read(), isc__nm_streamdns_send(),
and isc__nm_streamdns_close() are being called from the matching thread.
2023-01-05 09:43:09 +01:00
Artem Boldariev
fbf1546fb8 TLS: use isc_buffer_t for send requests
This commit replaces ad-hoc code for send requests buffer management
within TLS with the one based on isc_buffer_t.

Previous version of the code was trying to use pre-allocated small
buffers to avoid extra allocations. The code would allocate a larger
dynamic buffer when needed. There is no need to have ad-hoc code for
this, as isc_buffer_t now provides this functionality internally.

Additionally to the above, the old version of the code lacked any
logic to reuse the dynamically allocated buffers. Now, as we do not
manage memory buffers, but isc_buffer_t objects, we can implement this
strategy. It can be in particular helpful for longer lasting
connections, as in this case the buffer will adjust itself to the size
of the messages being transferred. That is, it is in particular useful
for XoT, as Stream DNS happen to order send requests in such a way
that the send request will get reused.
2022-12-30 19:56:25 +02:00
Ondřej Surý
6cb6373b5a Convert Stream DNS to use isc_buffer API
Drop the whole isc_dnsbuffer API and use new improved isc_buffer API
that provides same functionality as the isc_dnsbuffer unit now.
2022-12-20 22:13:53 +02:00
Artem Boldariev
0a7e83feea StreamDNS: Use isc__nm_senddns() to send DNS messages
This commit modifies the Stream DNS message so that it uses the
optimised code path (isc__nm_senddns()) for sending DNS messages over
the underlying transport. This way we avoid allocating any
intermediate memory buffers needed to render a DNS message with its
length pre-pended ahead of the contents (TCP DNS message format).
2022-12-20 22:13:53 +02:00
Artem Boldariev
cb6f3dc3c8 TLS: isc__nm_senddns() support
This commit adds support for isc_nm_senddns() to the generic TLS code.
2022-12-20 22:13:53 +02:00
Artem Boldariev
ad876a65af Add isc__nm_senddns()
The new internal function works in the same way as isc_nm_send()
except that it sends a DNS message size ahead of the DNS message
data (the format used in DNS over TCP).

The intention is to provide a fast path for sending DNS messages over
streams protocols - that is, without allocating any intermediate
memory buffers.
2022-12-20 22:13:53 +02:00
Artem Boldariev
56732ac2a0 TLS: try to avoid allocating send request objects
This commit optimises TLS send request object allocation to enable
send request object reuse, somewhat reducing pressure on the memory
manager. It is especially helpful in the case when Stream DNS uses the
TLS implementation as the transport.
2022-12-20 22:13:53 +02:00
Artem Boldariev
4277eeeb9c Remove TLS DNS transport (and parts common with TCP DNS)
This commit removes TLS DNS transport superseded by Stream DNS.
2022-12-20 22:13:53 +02:00
Artem Boldariev
e5649710d3 Remove TCP DNS transport
This commit removes TCP DNS transport superseded by Stream DNS.
2022-12-20 22:13:53 +02:00
Artem Boldariev
4524bf4083 Make isc_nm_tlssocket non-optional
This commit unties generic TLS code (isc_nm_tlssocket) from DoH, so
that it will be available regardless of the fact if BIND was built
with DNS over HTTP support or not.
2022-12-20 22:13:53 +02:00
Artem Boldariev
efe4267044 DoH: use isc_nmhandle_set_tcp_nodelay()
This commit replaces ad-hoc code for disabling Nagle's algorithm with
a call to isc_nmhandle_set_tcp_nodelay().
2022-12-20 22:13:53 +02:00
Artem Boldariev
e89575ddce StreamDNS: opportunistically disable Nagle's algorithm
This commit ensures that Stream DNS code attempts to disable Nagle's
algorithm regardless of underlying stream transport (TCP or TLS), as
we are not interested in trading latency for throughout when dealing
with DNS messages.
2022-12-20 22:13:53 +02:00
Artem Boldariev
05cfb27b80 Disable Nagle's algorithm for TLS connections by default
This commit ensures that Nagle's algorithm is disabled by default for
TLS connections on best effort basis, just like other networking
software (e.g. NGINX) does, as, in the case of TLS, we are not
interested in trading latency for throughput, rather vice versa.

We attempt to disable it as early as we can, right after TCP
connections establishment, as an attempt to speed up handshake
handling.
2022-12-20 22:13:53 +02:00
Artem Boldariev
371b02f37a TCP: make it possible to set Nagle's algorithms state via handle
This commit adds ability to turn the Nagle's algorithm on or off via
connections handle. It adds the isc_nmhandle_set_tcp_nodelay()
function as the public interface for this functionality.
2022-12-20 22:13:53 +02:00
Artem Boldariev
4606384345 Extend isc__nm_socket_tcp_nodelay() to accept value
This makes it possible to both enable and disable Nagle's algorithm
for a TCP socket descriptor, before the change it was possible only to
disable it.
2022-12-20 22:13:53 +02:00
Artem Boldariev
f395cd4b3e Add isc_nm_streamdnssocket (aka Stream DNS)
This commit adds an initial implementation of isc_nm_streamdnssocket
transport: a unified transport for DNS over stream protocols messages,
which is capable of replacing both TCP DNS and TLS DNS
transports. Currently, the interface it provides is a unified set of
interfaces provided by both of the transports it attempts to replace.

The transport is built around "isc_dnsbuffer_t" and
"isc_dnsstream_assembler_t" objects and attempts to minimise both the
number of memory allocations during network transfers as well as
memory usage.
2022-12-20 22:13:51 +02:00
Artem Boldariev
c0c59b55ab TLS: add an internal function isc__nmhandle_get_selected_alpn()
The added function provides the interface for getting an ALPN tag
negotiated during TLS connection establishment.

The new function can be used by higher level transports.
2022-12-20 21:24:44 +02:00
Artem Boldariev
15e626f1ca TLS: add manual read timer control mode
This commit adds manual read timer control mode, similarly to TCP.
This way the read timer can be controlled manually using:

* isc__nmsocket_timer_start();
* isc__nmsocket_timer_stop();
* isc__nmsocket_timer_restart().

The change is required to make it possible to implement more
sophisticated read timer control policies in DNS transports, built on
top of TLS.
2022-12-20 21:24:44 +02:00
Artem Boldariev
9aabd55725 TCP: add manual read timer control mode
This commit adds a manual read timer control mode to the TCP
code (adding isc__nmhandle_set_manual_timer() as the interface to it).

Manual read timer control mode suppresses read timer restarting the
read timer when receiving any amount of data. This way the read timer
can be controlled manually using:

* isc__nmsocket_timer_start();
* isc__nmsocket_timer_stop();
* isc__nmsocket_timer_restart().

The change is required to make it possible to implement more
sophisticated read timer control policies in DNS transports, built on
top of TCP.
2022-12-20 21:24:44 +02:00
Artem Boldariev
f4760358f8 TLS: expose the ability to (re)start and stop underlying read timer
This commit adds implementation of isc__nmsocket_timer_restart() and
isc__nmsocket_timer_stop() for generic TLS code in order to make its
interface more compatible with that of TCP.
2022-12-20 21:24:44 +02:00
Artem Boldariev
f18a9b3743 TLS: add isc__nmsocket_timer_running() support
This commit adds isc__nmsocket_timer_running() support to the generic
TLS code in order to make it more compatible with TCP.
2022-12-20 21:24:44 +02:00
Artem Boldariev
c0808532e1 TLS: isc_nm_bad_request() and isc__nmsocket_reset() support
This commit adds implementations of isc_nm_bad_request() and
isc__nmsocket_reset() to the generic TLS stream code in order to make
it more compatible with TCP code.
2022-12-20 21:24:44 +02:00
Ondřej Surý
6bd2b34180
Enable auto-reallocation for all isc_buffer_allocate() buffers
When isc_buffer_t buffer is created with isc_buffer_allocate() assume
that we want it to always auto-reallocate instead of having an extra
call to enable auto-reallocation.
2022-12-20 19:13:48 +01:00
Ondřej Surý
52307f8116
Add internal logging functions to the netmgr
Add internal logging functions isc__netmgr_log, isc__nmsocket_log(), and
isc__nmhandle_log() that can be used to add logging messages to the
netmgr, and change all direct use of isc_log_write() to use those
logging functions to properly prefix them with netmgr, nmsocket and
nmsocket+nmhandle.
2022-12-14 19:34:48 +01:00
Artem Boldariev
bed5e2bb08 TLS: check for sock->recv_cb when handling received data
This commit adds a check if 'sock->recv_cb' might have been nullified
during the call to 'sock->recv_cb'. That could happen, e.g. by an
indirect call to 'isc_nmhandle_close()' from within the callback when
wrapping up.

In this case, let's close the TLS connection.
2022-12-02 13:20:37 +02:00
Artem Boldariev
8b7e123528 DoH: Avoid accessing non-atomic listener socket flags when accepting
This commit ensures that the non-atomic flags inside a DoH listener
socket object (and associated worker) are accessed when doing accept
for a connection only from within the context of the dedicated thread,
but not other worker threads.

The purpose of this commit is to avoid TSAN errors during
isc__nmsocket_closing() calls. It is a continuation of
4b5559cd8f41e67bf03e92d9d0338dee9d549b48.
2022-12-02 12:16:12 +02:00
Artem Boldariev
4d0c226375 TLS: Avoid accessing non-atomic listener socket flags during HS
This commit ensures that the non-atomic flags inside a TLS listener
socket object (and associated worker) are accessed when doing
handshake for a connection only from within the context of the
dedicated thread, but not other worker threads.

The purpose of this commit is to avoid TSAN errors during
isc__nmsocket_closing() calls. It is a continuation of
4b5559cd8f41e67bf03e92d9d0338dee9d549b48.
2022-12-02 12:16:12 +02:00
Artem Boldariev
4b5559cd8f TLS: Avoid accessing listener socket flags from other threads
This commit ensures that the flags inside a TLS listener socket
object (and associated worker) are accessed when accepting a
connection only from within the context of the dedicated thread, but
not other worker threads.
2022-12-01 21:07:49 +02:00
Ondřej Surý
e3c628d562
Honour single read per client isc_nm_read() call in the TLSDNS
The TLSDNS transport was not honouring the single read callback for
TLSDNS client.  It would call the read callbacks repeatedly in case the
single TLS read would result in multiple DNS messages in the decoded
buffer.
2022-12-01 18:31:05 +01:00
Artem Boldariev
2bfc079946 TLS stream: always handle send callbacks asynchronously
This commit ensures that send callbacks are always called from within
the context of its worker thread even in the case of
shuttigdown/inactive socket, just like TCP transport does and with
which TLS attempts to be as compatible as possible.
2022-11-30 18:09:52 +02:00
Artem Boldariev
ef659365ce TLS Stream: use ISC_R_CANCELLED error when shutting down
This commit changes ISC_R_NOTCONNECTED error code to ISC_R_CANCELLED
when attempting to start reading data on the shutting down socket in
order to make its behaviour compatible with that of TCP and not break
the common code in the unit tests.
2022-11-30 18:09:52 +02:00
Artem Boldariev
fb9955a372 TLS Stream: fix isc_nm_read_stop() and reading flags handling
It turned out that after the latest Network Manager refactoring
'sock->reading' flag was not processed correctly. Due to this
isc_nm_read_stop() might not work as expected because reading from the
underlying TCP socket could have been resume in 'tls_do_bio()'
regardless of the 'sock->reading' value.

This bug did not seem to cause problems with DoH, so it was not
noticed, but Stream DNS has more strict expectations regarding the
underlying transport.

Additionally to the above, the 'sock->recv_read' flag was completely
ignored and corresponding logic was completely unimplemented. That did
not allow to implement one fine detail compared to TCP: once reading
is started, it could be satisfied by one datum reading.

This commit fixes the issues above.
2022-11-30 18:09:52 +02:00
Artem Boldariev
9b1c8c03fd TCP: use uv_try_write() to optimise sends
This commit make TCP code use uv_try_write() on best effort basis,
just like TCP DNS and TLS DNS code does.

This optimisation was added in
'caa5b6548a11da6ca772d6f7e10db3a164a18f8d' but, similar change was
mistakenly omitted for generic TCP code. This commit fixes that.
2022-11-29 13:41:10 +02:00
Michal Nowak
afdb41a5aa
Update sources to Clang 15 formatting 2022-11-29 08:54:34 +01:00
Ondřej Surý
f3004da3a5
Make the netmgr send callback to be asynchronous only when needed
Previously, the send callback would be synchronous only on success.  Add
an option (similar to what other callbacks have) to decide whether we
need the asynchronous send callback on a higher level.

On a general level, we need the asynchronous callbacks to happen only
when we are invoking the callback from the public API.  If the path to
the callback went through the libuv callback or netmgr callback, we are
already on asynchronous path, and there's no need to make the call to
the callback asynchronous again.

For the send callback, this means we need the asynchronous path for
failure paths inside the isc_nm_send() (which calls isc__nm_udp_send(),
isc__nm_tcp_send(), etc...) - all other invocations of the send callback
could be synchronous, because those are called from the respective libuv
send callbacks.
2022-11-25 15:46:25 +01:00
Ondřej Surý
5ca49942a3
Make the netmgr read callback to be asynchronous only when needed
Previously, the read callback would be synchronous only on success or
timeout.  Add an option (similar to what other callbacks have) to decide
whether we need the asynchronous read callback on a higher level.

On a general level, we need the asynchronous callbacks to happen only
when we are invoking the callback from the public API.  If the path to
the callback went through the libuv callback or netmgr callback, we are
already on asynchronous path, and there's no need to make the call to
the callback asynchronous again.

For the read callback, this means we need the asynchronous path for
failure paths inside the isc_nm_read() (which calls isc__nm_udp_read(),
isc__nm_tcp_read(), etc...) - all other invocations of the read callback
could be synchronous, because those are called from the respective libuv
or netmgr read callbacks.
2022-11-25 15:46:15 +01:00
Evan Hunt
67c0128ebb Fix an error when building with --disable-doh
The netievent handler for isc_nmsocket_set_tlsctx() was inadvertently
ifdef'd out when BIND was built with --disable-doh, resulting in an
assertion failure on startup when DoT was configured.
2022-10-24 13:54:39 -07:00
Artem Boldariev
09dcc914b4 TLS Stream: handle successful TLS handshake after listener shutdown
It was possible that accept callback can be called after listener
shutdown. In such a case the callback pointer equals NULL, leading to
segmentation fault. This commit fixes that.
2022-10-18 18:30:24 +03:00
Artem Boldariev
5ab2c0ebb3 Synchronise stop listening operation for multi-layer transports
This commit introduces a primitive isc__nmsocket_stop() which performs
shutting down on a multilayered socket ensuring the proper order of
the operations.

The shared data within the socket object can be destroyed after the
call completed, as it is guaranteed to not be used from within the
context of other worker threads.
2022-10-18 12:06:00 +03:00
Tony Finch
a34a2784b1 De-duplicate some calls to strerror_r()
Specifically, when reporting an unexpected or fatal error.
2022-10-17 11:58:26 +01:00
Artem Boldariev
d62eb206f7 Fix isc_nmsocket_set_tlsctx()
During loop manager refactoring isc_nmsocket_set_tlsctx() was not
properly adapted. The function is expected to broadcast the new TLS
context for every worker, but this behaviour was accidentally broken.
2022-10-14 23:06:31 +03:00
Ondřej Surý
b6b7a6886a
Don't set load-balancing socket option on the UDP connect sockets
The isc_nm_udpconnect() erroneously set the reuse port with
load-balancing on the outgoing connected UDP sockets.  This socket
option makes only sense for the listening sockets.  Don't set the
load-balancing reuse port option on the outgoing UDP sockets.
2022-10-12 15:36:25 +02:00
Artem Boldariev
eaebb92f3e TLS DNS: fix certificate verification error message reporting
This commit fixes TLS DNS verification error message reporting which
we probably broke during one of the recent networking code
refactorings.

This prevent e.g. dig from producing useful error messages related to
TLS certificates verification.
2022-10-12 16:24:04 +03:00
Artem Boldariev
6789b88d25 TLS: clear error queue before doing IO or calling SSL_get_error()
Ensure that TLS error is empty before calling SSL_get_error() or doing
SSL I/O so that the result will not get affected by prior error
statuses.

In particular, the improper error handling led to intermittent unit
test failure and, thus, could be responsible for some of the system
test failures and other intermittent TLS-related issues.

See here for more details:

https://www.openssl.org/docs/man3.0/man3/SSL_get_error.html

In particular, it mentions the following:

> The current thread's error queue must be empty before the TLS/SSL
> I/O operation is attempted, or SSL_get_error() will not work
> reliably.

As we use the result of SSL_get_error() to decide on I/O operations,
we need to ensure that it works reliably by cleaning the error queue.

TLS DNS: empty error queue before attempting I/O
2022-10-12 16:24:04 +03:00
Aram Sargsyan
be95ba0119 Remove a superfluous check of sock->fd against -1
The check is left from when tcp_connect_direct() called isc__nm_socket()
and it was uncertain whether it had succeeded, but now isc__nm_socket()
is called before tcp_connect_direct(), so sock->fd cannot be -1.

    *** CID 357292:    (REVERSE_NEGATIVE)
    /lib/isc/netmgr/tcp.c: 309 in isc_nm_tcpconnect()
    303
    304     	atomic_store(&sock->active, true);
    305
    306     	result = tcp_connect_direct(sock, req);
    307     	if (result != ISC_R_SUCCESS) {
    308     		atomic_store(&sock->active, false);
    >>>     CID 357292:    (REVERSE_NEGATIVE)
    >>>     You might be using variable "sock->fd" before verifying that it is >= 0.
    309     		if (sock->fd != (uv_os_sock_t)(-1)) {
    310     			isc__nm_tcp_close(sock);
    311     		}
    312     		isc__nm_connectcb(sock, req, result, true);
    313     	}
    314
2022-10-12 08:21:35 +00:00
Ondřej Surý
c1d26b53eb
Add and use semantic patch to replace isc_mem_get/allocate+memset
Add new semantic patch to replace the straightfoward uses of:

  ptr = isc_mem_{get,allocate}(..., size);
  memset(ptr, 0, size);

with the new API call:

  ptr = isc_mem_{get,allocate}x(..., size, ISC_MEM_ZERO);
2022-10-05 16:44:05 +02:00