2
0
mirror of https://gitlab.isc.org/isc-projects/bind9 synced 2025-08-28 13:08:06 +00:00

252 Commits

Author SHA1 Message Date
Artem Boldariev
ad876a65af Add isc__nm_senddns()
The new internal function works in the same way as isc_nm_send()
except that it sends a DNS message size ahead of the DNS message
data (the format used in DNS over TCP).

The intention is to provide a fast path for sending DNS messages over
streams protocols - that is, without allocating any intermediate
memory buffers.
2022-12-20 22:13:53 +02:00
Artem Boldariev
4277eeeb9c Remove TLS DNS transport (and parts common with TCP DNS)
This commit removes TLS DNS transport superseded by Stream DNS.
2022-12-20 22:13:53 +02:00
Artem Boldariev
e5649710d3 Remove TCP DNS transport
This commit removes TCP DNS transport superseded by Stream DNS.
2022-12-20 22:13:53 +02:00
Artem Boldariev
4524bf4083 Make isc_nm_tlssocket non-optional
This commit unties generic TLS code (isc_nm_tlssocket) from DoH, so
that it will be available regardless of the fact if BIND was built
with DNS over HTTP support or not.
2022-12-20 22:13:53 +02:00
Artem Boldariev
05cfb27b80 Disable Nagle's algorithm for TLS connections by default
This commit ensures that Nagle's algorithm is disabled by default for
TLS connections on best effort basis, just like other networking
software (e.g. NGINX) does, as, in the case of TLS, we are not
interested in trading latency for throughput, rather vice versa.

We attempt to disable it as early as we can, right after TCP
connections establishment, as an attempt to speed up handshake
handling.
2022-12-20 22:13:53 +02:00
Artem Boldariev
371b02f37a TCP: make it possible to set Nagle's algorithms state via handle
This commit adds ability to turn the Nagle's algorithm on or off via
connections handle. It adds the isc_nmhandle_set_tcp_nodelay()
function as the public interface for this functionality.
2022-12-20 22:13:53 +02:00
Artem Boldariev
f395cd4b3e Add isc_nm_streamdnssocket (aka Stream DNS)
This commit adds an initial implementation of isc_nm_streamdnssocket
transport: a unified transport for DNS over stream protocols messages,
which is capable of replacing both TCP DNS and TLS DNS
transports. Currently, the interface it provides is a unified set of
interfaces provided by both of the transports it attempts to replace.

The transport is built around "isc_dnsbuffer_t" and
"isc_dnsstream_assembler_t" objects and attempts to minimise both the
number of memory allocations during network transfers as well as
memory usage.
2022-12-20 22:13:51 +02:00
Artem Boldariev
c0c59b55ab TLS: add an internal function isc__nmhandle_get_selected_alpn()
The added function provides the interface for getting an ALPN tag
negotiated during TLS connection establishment.

The new function can be used by higher level transports.
2022-12-20 21:24:44 +02:00
Artem Boldariev
15e626f1ca TLS: add manual read timer control mode
This commit adds manual read timer control mode, similarly to TCP.
This way the read timer can be controlled manually using:

* isc__nmsocket_timer_start();
* isc__nmsocket_timer_stop();
* isc__nmsocket_timer_restart().

The change is required to make it possible to implement more
sophisticated read timer control policies in DNS transports, built on
top of TLS.
2022-12-20 21:24:44 +02:00
Artem Boldariev
9aabd55725 TCP: add manual read timer control mode
This commit adds a manual read timer control mode to the TCP
code (adding isc__nmhandle_set_manual_timer() as the interface to it).

Manual read timer control mode suppresses read timer restarting the
read timer when receiving any amount of data. This way the read timer
can be controlled manually using:

* isc__nmsocket_timer_start();
* isc__nmsocket_timer_stop();
* isc__nmsocket_timer_restart().

The change is required to make it possible to implement more
sophisticated read timer control policies in DNS transports, built on
top of TCP.
2022-12-20 21:24:44 +02:00
Artem Boldariev
f4760358f8 TLS: expose the ability to (re)start and stop underlying read timer
This commit adds implementation of isc__nmsocket_timer_restart() and
isc__nmsocket_timer_stop() for generic TLS code in order to make its
interface more compatible with that of TCP.
2022-12-20 21:24:44 +02:00
Artem Boldariev
f18a9b3743 TLS: add isc__nmsocket_timer_running() support
This commit adds isc__nmsocket_timer_running() support to the generic
TLS code in order to make it more compatible with TCP.
2022-12-20 21:24:44 +02:00
Artem Boldariev
c0808532e1 TLS: isc_nm_bad_request() and isc__nmsocket_reset() support
This commit adds implementations of isc_nm_bad_request() and
isc__nmsocket_reset() to the generic TLS stream code in order to make
it more compatible with TCP code.
2022-12-20 21:24:44 +02:00
Ondřej Surý
52307f8116
Add internal logging functions to the netmgr
Add internal logging functions isc__netmgr_log, isc__nmsocket_log(), and
isc__nmhandle_log() that can be used to add logging messages to the
netmgr, and change all direct use of isc_log_write() to use those
logging functions to properly prefix them with netmgr, nmsocket and
nmsocket+nmhandle.
2022-12-14 19:34:48 +01:00
Michal Nowak
afdb41a5aa
Update sources to Clang 15 formatting 2022-11-29 08:54:34 +01:00
Ondřej Surý
f3004da3a5
Make the netmgr send callback to be asynchronous only when needed
Previously, the send callback would be synchronous only on success.  Add
an option (similar to what other callbacks have) to decide whether we
need the asynchronous send callback on a higher level.

On a general level, we need the asynchronous callbacks to happen only
when we are invoking the callback from the public API.  If the path to
the callback went through the libuv callback or netmgr callback, we are
already on asynchronous path, and there's no need to make the call to
the callback asynchronous again.

For the send callback, this means we need the asynchronous path for
failure paths inside the isc_nm_send() (which calls isc__nm_udp_send(),
isc__nm_tcp_send(), etc...) - all other invocations of the send callback
could be synchronous, because those are called from the respective libuv
send callbacks.
2022-11-25 15:46:25 +01:00
Ondřej Surý
5ca49942a3
Make the netmgr read callback to be asynchronous only when needed
Previously, the read callback would be synchronous only on success or
timeout.  Add an option (similar to what other callbacks have) to decide
whether we need the asynchronous read callback on a higher level.

On a general level, we need the asynchronous callbacks to happen only
when we are invoking the callback from the public API.  If the path to
the callback went through the libuv callback or netmgr callback, we are
already on asynchronous path, and there's no need to make the call to
the callback asynchronous again.

For the read callback, this means we need the asynchronous path for
failure paths inside the isc_nm_read() (which calls isc__nm_udp_read(),
isc__nm_tcp_read(), etc...) - all other invocations of the read callback
could be synchronous, because those are called from the respective libuv
or netmgr read callbacks.
2022-11-25 15:46:15 +01:00
Evan Hunt
67c0128ebb Fix an error when building with --disable-doh
The netievent handler for isc_nmsocket_set_tlsctx() was inadvertently
ifdef'd out when BIND was built with --disable-doh, resulting in an
assertion failure on startup when DoT was configured.
2022-10-24 13:54:39 -07:00
Artem Boldariev
5ab2c0ebb3 Synchronise stop listening operation for multi-layer transports
This commit introduces a primitive isc__nmsocket_stop() which performs
shutting down on a multilayered socket ensuring the proper order of
the operations.

The shared data within the socket object can be destroyed after the
call completed, as it is guaranteed to not be used from within the
context of other worker threads.
2022-10-18 12:06:00 +03:00
Tony Finch
a34a2784b1 De-duplicate some calls to strerror_r()
Specifically, when reporting an unexpected or fatal error.
2022-10-17 11:58:26 +01:00
Artem Boldariev
d62eb206f7 Fix isc_nmsocket_set_tlsctx()
During loop manager refactoring isc_nmsocket_set_tlsctx() was not
properly adapted. The function is expected to broadcast the new TLS
context for every worker, but this behaviour was accidentally broken.
2022-10-14 23:06:31 +03:00
Ondřej Surý
fffd444440
Cleanup the asychronous code in the stream implementations
After the loopmgr work has been merged, we can now cleanup the TCP and
TLS protocols a little bit, because there are stronger guarantees that
the sockets will be kept on the respective loops/threads.  We only need
asynchronous call for listening sockets (start, stop) and reading from
the TCP (because the isc_nm_read() might be called from read callback
again.

This commit does the following changes (they are intertwined together):

1. Cleanup most of the asynchronous events in the TCP code, and add
   comments for the events that needs to be kept asynchronous.

2. Remove isc_nm_resumeread() from the netmgr API, and replace
   isc_nm_resumeread() calls with existing isc_nm_read() calls.

3. Remove isc_nm_pauseread() from the netmgr API, and replace
   isc_nm_pauseread() calls with a new isc_nm_read_stop() call.

4. Disable the isc_nm_cancelread() for the streaming protocols, only the
   datagram-like protocols can use isc_nm_cancelread().

5. Add isc_nmhandle_close() that can be used to shutdown the socket
  earlier than after the last detach.  Formerly, the socket would be
  closed only after all reading and sending would be finished and the
  last reference would be detached.  The new isc_nmhandle_close() can
  be used to close the underlying socket earlier, so all the other
  asynchronous calls would call their respective callbacks immediately.

Co-authored-by: Ondřej Surý <ondrej@isc.org>
Co-authored-by: Artem Boldariev <artem@isc.org>
2022-09-22 14:51:15 +02:00
Ondřej Surý
b1026dd4c1
Add missing isc_refcount_destroy() for isc__nmsocket_t
The destructor for the isc__nmsocket_t was missing call to the
isc_refcount_destroy() on the reference counter, which might lead to
spurious ThreadSanitizer data race warnings if we ever change the
acquire-release memory order in the isc_refcount_decrement().
2022-09-19 14:38:56 +02:00
Ondřej Surý
eac8bc5c1a
Prevent unexpected UDP client read callbacks
The network manager UDP code was misinterpreting when the libuv called
the udp_recv_cb with nrecv == 0 and addr == NULL -> this doesn't really
mean that the "stream" has ended, but the libuv indicates that the
receive buffer can be freed.  This could lead to assertion failure in
the code that calls isc_nm_read() from the network manager read callback
due to the extra spurious callbacks.

Properly handle the extra callback calls from the libuv in the client
read callback, and refactor the UDP isc_nm_read() implementation to be
synchronous, so no datagram is lost between the time that we stop the
reading from the UDP socket and we restart it again in the asychronous
udpread event.

Add a unit test that tests the isc_nm_read() call from the read
callback to receive two datagrams.
2022-09-19 12:20:41 +02:00
Ondřej Surý
718e92c31a
Clear the callbacks when isc_nm_stoplistening() is called
When we are closing the listening sockets, there's a time window in
which the TCP connection could be accepted although the respective
stoplistening function has already returned to control to the caller.
Clear the accept callback function early, so it doesn't get called when
we are not interested in the incoming connections anymore.
2022-08-26 09:09:25 +02:00
Ondřej Surý
b69e783164
Update netmgr, tasks, and applications to use isc_loopmgr
Previously:

* applications were using isc_app as the base unit for running the
  application and signal handling.

* networking was handled in the netmgr layer, which would start a
  number of threads, each with a uv_loop event loop.

* task/event handling was done in the isc_task unit, which used
  netmgr event loops to run the isc_event calls.

In this refactoring:

* the network manager now uses isc_loop instead of maintaining its
  own worker threads and event loops.

* the taskmgr that manages isc_task instances now also uses isc_loopmgr,
  and every isc_task runs on a specific isc_loop bound to the specific
  thread.

* applications have been updated as necessary to use the new API.

* new ISC_LOOP_TEST macros have been added to enable unit tests to
  run isc_loop event loops. unit tests have been updated to use this
  where needed.
2022-08-26 09:09:24 +02:00
Ondřej Surý
3e10d3b45f Cleanup the STATID_CONNECT and STATID_CONNECTFAIL stat counters
The STATID_CONNECT and STATID_CONNECTFAIL statistics were used
incorrectly. The STATID_CONNECT was incremented twice (once in
the *_connect_direct() and once in the callback) and STATID_CONNECTFAIL
would not be incremented at all if the failure happened in the callback.

Closes: #3452
2022-07-14 14:34:53 +02:00
Ondřej Surý
a280855f7b Handle the transient TCP connect() failures on FreeBSD
On FreeBSD (and perhaps other *BSD) systems, the TCP connect() call (via
uv_tcp_connect()) can fail with transient UV_EADDRINUSE error.  The UDP
code already handles this by trying three times (is a charm) before
giving up.  Add a code for the TCP, TCPDNS and TLSDNS layers to also try
three times before giving up by calling uv_tcp_connect() from the
callback two more time on UV_EADDRINUSE error.

Additionally, stop the timer only if we succeed or on hard error via
isc__nm_failed_connect_cb().
2022-07-14 14:20:10 +02:00
Artem Boldariev
237ce05b89 TLS: Implement isc_nmhandle_setwritetimeout()
This commit adds a proper implementation of
isc_nmhandle_setwritetimeout() for TLS connections. Now it passes the
value to the underlying TCP handle.
2022-07-12 14:40:22 +03:00
Evan Hunt
a499794984 REQUIRE should not have side effects
it's a style violation to have REQUIRE or INSIST contain code that
must run for the server to work. this was being done with some
atomic_compare_exchange calls. these have been cleaned up.  uses
of atomic_compare_exchange in assertions have been replaced with
a new macro atomic_compare_exchange_enforced, which uses RUNTIME_CHECK
to ensure that the exchange was successful.
2022-07-05 12:22:55 -07:00
Artem Boldariev
d2e13ddf22 Update the set of HTTP endpoints on reconfiguration
This commit ensures that on reconfiguration the set of HTTP
endpoints (=paths) is being updated within HTTP listeners.
2022-06-28 15:42:38 +03:00
Artem Boldariev
e72962d5f1 Update max concurrent streams limit in HTTP listeners on reconfig
This commit ensures that HTTP listeners concurrent streams limit gets
updated properly on reconfiguration.
2022-06-28 15:42:38 +03:00
Ondřej Surý
b432d5d3bc Gracefully handle uv_read_start() failures
Under specific rare timing circumstances the uv_read_start() could
fail with UV_EINVAL when the connection is reset between the connect (or
accept) and the uv_read_start() call on the nmworker loop.  Handle such
situation gracefully by propagating the errors from uv_read_start() into
upper layers, so the socket can be internally closed().
2022-06-14 11:33:02 +02:00
Artem Boldariev
245f7cec2e Avoid aborting when uv_timer_start() is used on a closing socket
In such a case it will return UV_EINVAL (-EINVAL), leading to
aborting, as the code expects the function to succeed.
2022-05-20 20:18:40 +03:00
Artem Boldariev
90bc13a5d5 TLS stream/DoH: implement TLS client session resumption
This commit extends TLS stream code and DoH code with TLS client
session resumption support implemented on top of the TLS client
session cache.
2022-05-20 20:17:45 +03:00
Ondřej Surý
b43812692d Move netmgr/uv-compat.h to <isc/uv.h>
As we are going to use libuv outside of the netmgr, we need the shims to
be readily available for the rest of the codebase.

Move the "netmgr/uv-compat.h" to <isc/uv.h> and netmgr/uv-compat.c to
uv.c, and as a rule of thumb, the users of libuv should include
<isc/uv.h> instead of <uv.h> directly.

Additionally, merge netmgr/uverr2result.c into uv.c and rename the
single function from isc__nm_uverr2result() to isc_uverr2result().
2022-05-03 10:02:19 +02:00
Ondřej Surý
24c3879675 Move socket related functions to netmgr/socket.c
Move the netmgr socket related functions from netmgr/netmgr.c and
netmgr/uv-compat.c to netmgr/socket.c, so they are all present all in
the same place.  Adjust the names of couple interal functions
accordingly.
2022-05-03 09:52:49 +02:00
Artem Boldariev
978f97dcdd TLSDNS: call send callbacks after only the data was sent
This commit ensures that write callbacks are getting called only after
the data has been sent via the network.

Without this fix, a situation could appear when a write callback could
get called before the actual encrypted data would have been sent to
the network. Instead, it would get called right after it would have
been passed to the OpenSSL (i.e. encrypted).

Most likely, the issue does not reveal itself often because the
callback call was asynchronous, so in most cases it should have been
called after the data has been sent, but that was not guaranteed by
the code logic.

Also, this commit removes one memory allocation (netievent) from a hot
path, as there is no need to call this callback asynchronously
anymore.
2022-04-27 17:44:23 +03:00
Ondřej Surý
eb8f2974b1 Abort when libuv at runtime mismatches libuv at compile time
When we compile with libuv that has some capabilities via flags passed
to f.e. uv_udp_listen() or uv_udp_bind(), the call with such flags would
fail with invalid arguments when older libuv version is linked at the
runtime that doesn't understand the flag that was available at the
compile time.

Enforce minimal libuv version when flags have been available at the
compile time, but are not available at the runtime.  This check is less
strict than enforcing the runtime libuv version to be same or higher
than compile time libuv version.
2022-04-26 11:40:40 +02:00
Artem Boldariev
df317184eb Add isc_nmsocket_set_tlsctx()
This commit adds isc_nmsocket_set_tlsctx() - an asynchronous function
that replaces the TLS context within a given TLS-enabled listener
socket object. It is based on the newly added reference counting
functionality.

The intention of adding this function is to add functionality to
replace a TLS context without recreating the whole socket object,
including the underlying TCP listener socket, as a BIND process might
not have enough permissions to re-create it fully on reconfiguration.
2022-04-06 18:45:57 +03:00
Artem Boldariev
9256026d18 Use isc_tlsctx_attach() in TLS DNS code
This commit adds proper reference counting for TLS contexts into
generic TLS DNS (DoT) code.
2022-04-06 18:45:57 +03:00
Ondřej Surý
142c63dda8 Enable the load-balance-sockets configuration
Previously, HAVE_SO_REUSEPORT_LB has been defined only in the private
netmgr-int.h header file, making the configuration of load balanced
sockets inoperable.

Move the missing HAVE_SO_REUSEPORT_LB define the isc/netmgr.h and add
missing isc_nm_getloadbalancesockets() implementation.
2022-04-05 01:30:58 +02:00
Ondřej Surý
85c6e797aa Add option to configure load balance sockets
Previously, the option to enable kernel load balancing of the sockets
was always enabled when supported by the operating system (SO_REUSEPORT
on Linux and SO_REUSEPORT_LB on FreeBSD).

It was reported that in scenarios where the networking threads are also
responsible for processing long-running tasks (like RPZ processing, CATZ
processing or large zone transfers), this could lead to intermitten
brownouts for some clients, because the thread assigned by the operating
system might be busy.  In such scenarious, the overall performance would
be better served by threads competing over the sockets because the idle
threads can pick up the incoming traffic.

Add new configuration option (`load-balance-sockets`) to allow enabling
or disabling the load balancing of the sockets.
2022-04-04 23:10:04 +02:00
Ondřej Surý
30e0fd942b Remove task privileged mode
Previously, the task privileged mode has been used only when the named
was starting up and loading the zones from the disk as the "first" thing
to do.  The privileged task was setup with quantum == 2, which made the
taskmgr/netmgr spin around the privileged queue processing two events at
the time.

The same effect can be achieved by setting the quantum to UINT_MAX (e.g.
practically unlimited) for the loadzone task, hence the privileged task
mode was removed in favor of just processing all the events on the
loadzone task in a single task_run().
2022-04-01 23:55:26 +02:00
Ondřej Surý
4dceab142d Consistenly use UNREACHABLE() instead of ISC_UNREACHABLE()
In couple places, we have missed INSIST(0) or ISC_UNREACHABLE()
replacement on some branches with UNREACHABLE().  Replace all
ISC_UNREACHABLE() or INSIST(0) calls with UNREACHABLE().
2022-03-28 23:26:08 +02:00
Artem Boldariev
783663db80 Add ISC_R_TLSBADPEERCERT error code to the TLS related code
This commit adds support for ISC_R_TLSBADPEERCERT error code, which is
supposed to be used to signal for TLS peer certificates verification
in dig and other code.

The support for this error code is added to our TLS and TLS DNS
implementations.

This commit also adds isc_nm_verify_tls_peer_result_string() function
which is supposed to be used to get a textual description of the
reason for getting a ISC_R_TLSBADPEERCERT error.
2022-03-28 15:32:30 +03:00
Ondřej Surý
9de10cd153 Remove extrahandle size from netmgr
Previously, it was possible to assign a bit of memory space in the
nmhandle to store the client data.  This was complicated and prevents
further refactoring of isc_nmhandle_t caching (future work).

Instead of caching the data in the nmhandle, allocate the hot-path
ns_client_t objects from per-thread clientmgr memory context and just
assign it to the isc_nmhandle_t via isc_nmhandle_set().
2022-03-25 10:38:35 +01:00
Ondřej Surý
584f0d7a7e Simplify way we tag unreachable code with only ISC_UNREACHABLE()
Previously, the unreachable code paths would have to be tagged with:

    INSIST(0);
    ISC_UNREACHABLE();

There was also older parts of the code that used comment annotation:

    /* NOTREACHED */

Unify the handling of unreachable code paths to just use:

    UNREACHABLE();

The UNREACHABLE() macro now asserts when reached and also uses
__builtin_unreachable(); when such builtin is available in the compiler.
2022-03-25 08:33:43 +01:00
Ondřej Surý
fe7ce629f4 Add FALLTHROUGH macro for __attribute__((fallthrough))
Gcc 7+ and Clang 10+ have implemented __attribute__((fallthrough)) which
is explicit version of the /* FALLTHROUGH */ comment we are currently
using.

Add and apply FALLTHROUGH macro that uses the attribute if available,
but does nothing on older compilers.

In one case (lib/dns/zone.c), using the macro revealed that we were
using the /* FALLTHROUGH */ comment in wrong place, remove that comment.
2022-03-25 08:33:43 +01:00
Ondřej Surý
d70daa29f7 Make netmgr the authority on number of threads running
Instead of passing the "workers" variable back and forth along with
passing the single isc_nm_t instance, add isc_nm_getnworkers() function
that returns the number of netmgr threads are running.

Change the ns_interfacemgr and ns_taskmgr to utilize the newly acquired
knowledge.
2022-03-18 21:53:28 +01:00