2
0
mirror of https://gitlab.isc.org/isc-projects/bind9 synced 2025-08-28 13:08:06 +00:00

253 Commits

Author SHA1 Message Date
Artem Boldariev
5781ff3a93 Drop expired but not accepted TCP connections
This commit ensures that we are not attempting to accept an expired
TCP connection as we are not interested in any data that could have
been accumulated in its internal buffers. Now we just drop them for
good.
2024-07-03 15:03:02 +03:00
Artem Boldariev
55b1a093ea
Do not un-throttle TCP connections on isc_nm_read()
Due to omission it was possible to un-throttle a TCP connection
previously throttled due to the peer not reading back data we are
sending.

In particular, that affected DoH code, but it could also affect other
transports (the current or future ones) that pause/resume reading
according to its internal state.
2024-06-12 13:44:37 +03:00
Ondřej Surý
4c2ac25a95
Limit the number of DNS message processed from a single TCP read
The single TCP read can create as much as 64k divided by the minimum
size of the DNS message.  This can clog the processing thread and trash
the memory allocator because we need to do as much as ~20k allocations in
a single UV loop tick.

Limit the number of the DNS messages processed in a single UV loop tick
to just single DNS message and limit the number of the outstanding DNS
messages back to 23.  This effectively limits the number of pipelined
DNS messages to that number (this is the limit we already had before).
2024-06-10 16:48:54 +02:00
Ondřej Surý
4e7c4af17f
Throttle reading from TCP if the sends are not getting through
When TCP client would not read the DNS message sent to them, the TCP
sends inside named would accumulate and cause degradation of the
service.  Throttle the reading from the TCP socket when we accumulate
enough DNS data to be sent.  Currently this is limited in a way that a
single largest possible DNS message can fit into the buffer.
2024-06-10 16:48:52 +02:00
Artem Boldariev
d80dfbf745
Keep the endpoints set reference within an HTTP/2 socket
This commit ensures that an HTTP endpoints set reference is stored in
a socket object associated with an HTTP/2 stream instead of
referencing the global set stored inside a listener.

This helps to prevent an issue like follows:

1. BIND is configured to serve DoH clients;
2. A client is connected and one or more HTTP/2 stream is
created. Internal pointers are now pointing to the data on the
associated HTTP endpoints set;
3. BIND is reconfigured - the new endpoints set object is created and
promoted to all listeners;
4. The old pointers to the HTTP endpoints set data are now invalid.

Instead referencing a global object that is updated on
re-configurations we now store a local reference which prevents the
endpoints set objects to go out of scope prematurely.
2024-06-10 16:40:12 +02:00
Ondřej Surý
15329d471e
Add memory pools for isc_nmsocket_t structures
To reduce memory pressure, we can add light per-loop (netmgr worker)
memory pools for isc_nmsocket_t structures.  This will help in
situations where there's a lot of churn creating and destroying the
nmsockets.
2024-02-08 15:13:47 +01:00
Ondřej Surý
750bd364b5
Reduce the isc_nmsocket_t size from 1840 to 1208 bytes
Embedding isc_nmsocket_h2_t directly inside isc_nmsocket_t had increased
the size of isc_nmsocket_t to 1840 bytes.  Making the isc_nmsocket_h2_t
to be a pointer to the structure and allocated on demand allows us to
reduce the size to 1208 bytes.  While there are still some possible
reductions in the isc_nmsocket_t (embedded tlsstream, streamdns
structures), this was the far biggest drop in the memory usage.
2024-02-08 15:13:47 +01:00
Ondřej Surý
eada7b6e13
Reduce struct isc__nm_uvreq size from 1560 to 560 bytes
The uv_req union member of struct isc__nm_uvreq contained libuv request
types that we don't use.  Turns out that uv_getnameinfo_t is 1000 bytes
big and unnecessarily enlarged the whole structure.  Remove all the
unused members from the uv_req union.
2024-02-08 15:13:47 +01:00
Ondřej Surý
cb1d2e57e9
Remove unused mutex from netmgr
The netmgr->lock was dead code, remove it.
2024-02-07 20:54:05 +01:00
Aydın Mercan
2690dc48d3
Expose the TCP client count in statistics channel
The statistics channel does not expose the current number of TCP clients
connected, only the highwater. Therefore, users did not have an easy
means to collect statistics about TCP clients served over time. This
information could only be measured as a seperate mechanism via rndc by
looking at the TCP quota filled.

In order to expose the exact current count of connected TCP clients
(tracked by the "tcp-clients" quota) as a statistics counter, an
extra, dedicated Network Manager callback would need to be
implemented for that purpose (a counterpart of ns__client_tcpconn()
that would be run when a TCP connection is torn down), which is
inefficient. Instead, track the number of currently-connected TCP
clients separately for IPv4 and IPv6, as Network Manager statistics.
2024-01-17 11:11:12 +03:00
Artem Boldariev
3c45dd59cb Add a utility function to dump all active sockets on a NM instance
Add the new isc__nm_dump_active_manager() function that can be used
for debugging purposes: it dumps all active sockets withing the
network manager instance.
2023-12-06 15:15:25 +02:00
Artem Boldariev
4a88fc9d5b PROXYv2 over UDP transport
This commit adds a new transport that supports PROXYv2 over UDP. It is
built on top of PROXYv2 handling code (just like PROXY Stream). It
works by processing and stripping the PROXYv2 headers at the beginning
of a datagram (when accepting a datagram) or by placing a PROXYv2
header to the beginning of an outgoing datagram.

The transport is built in such a way that incoming datagrams are being
handled with minimal memory allocations and copying.
2023-12-06 15:15:25 +02:00
Artem Boldariev
3d1b6c48ab Add PROXY over TLS support to PROXY Stream
This commit makes it possible to use PROXY Stream not only over TCP,
but also over TLS. That is, now PROXY Stream can work in two modes as
far as TLS is involved:

1. PROXY over (plain) TCP - PROXYv2 headers are sent unencrypted before
TLS handshake messages. That is the main mode as described in the
PROXY protocol specification (as it is clearly stated there), and most
of the software expects PROXYv2 support to be implemented that
way (e.g. HAProxy);

2. PROXY over (encrypted) TLS - PROXYv2 headers are sent after the TLS
handshake has happened. For example, this mode is being used (only ?)
by "dnsdist". As far as I can see, that is, in fact, a deviation from
the spec, but I can certainly see how PROXYv2 could end up being
implemented this way elsewhere.
2023-12-06 15:15:24 +02:00
Artem Boldariev
eccc3fe0a0 Add PROXYv2 support to DNS over HTTP(S) transport
This commit extends DNS over HTTP(S) transport with PROXYv2 support.
2023-12-06 15:15:24 +02:00
Artem Boldariev
d119d666b3 PROXY Stream transport
This commit adds a new stream-based transport with an interface
compatible with TCP. The transport is built on top of TCP transport
and the new PROXYv2 handling code. Despite being built on top of TCP,
it can be easily extended to work on top of any TCP-like stream-based
transport. The intention of having this transport is to add PROXYv2
support into all existing stream-based DNS transport (DNS over TCP,
DNS over TLS, DNS over HTTP) by making the work on top of this new
transport.

The idea behind the transport is simple after accepting the connection
or connecting to a remote server it enters PROXYv2 handling mode: that
is, it either attempts to read (when accepting the connection) or send
(when establishing a connection) a PROXYv2 header. After that it works
like a mere wrapper on top of the underlying stream-based
transport (TCP).
2023-12-06 15:15:24 +02:00
Ondřej Surý
f36e118b9a
Limit the number of inactive handles kept for reuse
Instead of growing and never shrinking the list of the inactive
handles (to be reused mostly on the UDP connections), limit the number
of maximum number of inactive handles kept to 64.  Instead of caching
the inactive handles for all listening sockets, enable the caching on on
UDP listening sockets.  For TCP, the handles were cached for each
accepted socket thus reusing the handles only for long-standing TCP
connections, but not reusing the handles across different TCP streams.
2023-08-21 16:34:30 +02:00
Ondřej Surý
3b10814569
Fix the streaming read callback shutdown logic
When shutting down TCP sockets, the read callback calling logic was
flawed, it would call either one less callback or one extra.  Fix the
logic in the way:

1. When isc_nm_read() has been called but isc_nm_read_stop() hasn't on
   the handle, the read callback will be called with ISC_R_CANCELED to
   cancel active reading from the socket/handle.

2. When isc_nm_read() has been called and isc_nm_read_stop() has been
   called on the on the handle, the read callback will be called with
   ISC_R_SHUTTINGDOWN to signal that the dormant (not-reading) socket
   is being shut down.

3. The .reading and .recv_read flags are little bit tricky.  The
   .reading flag indicates if the outer layer is reading the data (that
   would be uv_tcp_t for TCP and isc_nmsocket_t (TCP) for TLSStream),
   the .recv_read flag indicates whether somebody is interested in the
   data read from the socket.

   Usually, you would expect that the .reading should be false when
   .recv_read is false, but it gets even more tricky with TLSStream as
   the TLS protocol might need to read from the socket even when sending
   data.

   Fix the usage of the .recv_read and .reading flags in the TLSStream
   to their true meaning - which mostly consist of using .recv_read
   everywhere and then wrapping isc_nm_read() and isc_nm_read_stop()
   with the .reading flag.

4. The TLS failed read helper has been modified to resemble the TCP code
   as much as possible, clearing and re-setting the .recv_read flag in
   the TCP timeout code has been fixed and .recv_read is now cleared
   when isc_nm_read_stop() has been called on the streaming socket.

5. The use of Network Manager in the named_controlconf, isccc_ccmsg, and
   isc_httpd units have been greatly simplified due to the improved design.

6. More unit tests for TCP and TLS testing the shutdown conditions have
   been added.

Co-authored-by: Ondřej Surý <ondrej@isc.org>
Co-authored-by: Artem Boldariev <artem@isc.org>
2023-04-20 12:58:32 +02:00
Ondřej Surý
f677cf6b73
Remove unused netmgr->worker->sendbuf
By inspecting the code, it was discovered that .sendbuf member of the
isc__nm_networker_t was unused and just consuming ~64k per worker.
Remove the member and the association allocation/deallocation.
2023-04-14 16:20:14 +02:00
Ondřej Surý
1715cad685
Refactor the isc_quota code and fix the quota in TCP accept code
In e18541287231b721c9cdb7e492697a2a80fd83fc, the TCP accept quota code
became broken in a subtle way - the quota would get initialized on the
first accept for the server socket and then deleted from the server
socket, so it would never get applied again.

Properly fixing this required a bigger refactoring of the isc_quota API
code to make it much simpler.  The new code decouples the ownership of
the quota and acquiring/releasing the quota limit.

After (during) the refactoring it became more clear that we need to use
the callback from the child side of the accepted connection, and not the
server side.
2023-04-12 14:10:37 +02:00
Ondřej Surý
3adba8ce23
Use isc_job_run() for reading from StreamDNS socket
Change the reading in the StreamDNS code to use isc_job_run() instead of
using isc_async_run() for less allocations and more streamlined
execution.
2023-04-12 14:10:37 +02:00
Ondřej Surý
74cbf523b3
Run closehandle_cb on run queue instead of async queue
Instead of using isc_async_run() when closing StreamDNS handle, add
isc_job_t member to the isc_nmhandle_t structure and use isc_job_run()
to avoid allocation/deallocation on the StreamDNS hot-path.
2023-04-12 14:10:37 +02:00
Ondřej Surý
2c0a9575d7
Replace __attribute__((unused)) with ISC_ATTR_UNUSED attribute macro
Instead of marking the unused entities with UNUSED(x) macro in the
function body, use a `ISC_ATTR_UNUSED` attribute macro that expans to
C23 [[maybe_unused]] or __attribute__((__unused__)) as fallback.
2023-03-30 23:29:25 +02:00
Ondřej Surý
2846888c57
Attach the accept "client" socket to .listener member of the socket
When accepting a TCP connection in the higher layers (tlsstream,
streamdns, and http) attach to the socket the connection was accepted
on, and use this socket instead of the parent listening socket.

This has an advantage - accessing the sock->listener now doesn't break
the thread boundaries, so we can properly check whether the socket is
being closed without requiring .closing member to be atomic_bool.
2023-03-30 16:10:08 +02:00
Ondřej Surý
45365adb32
Convert sock->active to non-atomic variable, cleanup rchildren
The last atomic_bool variable sock->active was converted to non-atomic
bool by properly handling the listening socket case where we were
checking parent socket instead of children sockets.

This is no longer necessary as we properly set the .active to false on
the children sockets.

Additionally, cleanup the .rchildren - the atomic variable was used for
mutex+condition to block until all children were listening, but that's
now being handled by a barrier.

Finally, just remove dead .self and .active_child_connections members of
the netmgr socket.
2023-03-30 16:10:08 +02:00
Ondřej Surý
e1a4572fd6
Refactor the use of atomics in netmgr
Now that everything runs on their own loop and we don't cross the thread
boundaries (with few exceptions), most of the atomic_bool variables used
to track the socket state have been unatomicized because they are always
accessed from the matching thread.

The remaining few have been relaxed: a) the sock->active is now using
acquire/release memory ordering; b) the various global limits are now
using relaxed memory ordering - we don't really care about the
synchronization for those.
2023-03-30 16:10:08 +02:00
Ondřej Surý
1844590ad9
Refactor isc_job_run to not-make any allocations
Change the isc_job_run() to not-make any allocations.  The caller must
make sure that it allocates isc_job_t - usually as part of the argument
passed to the callback.

For simple jobs, using isc_async_run() is advised as it allocates its
own separate isc_job_t.
2023-03-30 16:00:52 +02:00
Ondřej Surý
639d5065a3
Refactor the isc__nm_uvreq_t to have idle callback
Change the isc__nm_uvreq_t to have the idle callback as a separate
member as we always need to use it to properly close the uvreq.

Slightly refactor uvreq_put and uvreq_get to remove the unneeded
arguments - in uvreq_get(), we always use sock->worker, and in
uvreq_put, we always use req->sock, so there's not reason to pass those
extra arguments.
2023-03-29 21:16:44 +02:00
Ondřej Surý
476198f26c
Use uv_idle API for calling asynchronous connect/read/send callback
Instead of using isc_job_run() that's quite heavy as it allocates memory
for every new job, add uv_idle_t to uvreq union, and use uv_idle API
directly to execute the connect/read/send callback without any
additional allocations.
2023-03-29 21:16:44 +02:00
Evan Hunt
4ad95e0567 add ns_interface_create()
add a public function ns_interface_create() allowing the caller
to set up a listening interface directly without having to set
up listen-on and scan network interfaces.
2023-03-28 12:38:28 -07:00
Ondřej Surý
a2e4a6883f
Remove the netievent remnants
After removing all functional netievents, remove what has been left from
the netievents.  This also includes leftovers from previous refactorings.
2023-03-24 07:58:53 +01:00
Ondřej Surý
6b107c3fbc
Convert stopping generic socket children to to isc_async callback
Simplify the stopping of the generic socket children by using the
isc_async API from the loopmgr instead of using the asychronous
netievent mechanism in the netmgr.
2023-03-24 07:58:53 +01:00
Ondřej Surý
744e93b70d
Convert setting of the TLS contexts to to isc_async callback
Simplify the setting of the TLS contexts by using the isc_async API
from the loopmgr instead of using the asychronous netievent mechanism in
the netmgr.
2023-03-24 07:58:53 +01:00
Ondřej Surý
7ddc49d66a
Convert canceling StreamDNS socket to to isc_async callback
Simplify the canceling of the StreamDNS socket by using the isc_async API
from the loopmgr instead of using the asychronous netievent mechanism in
the netmgr.
2023-03-24 07:58:53 +01:00
Ondřej Surý
2185dc75f0
Convert reading from StreamDNS socket to to isc_async callback
Simplify the reading from the StreamDNS socket by using the isc_async API
from the loopmgr instead of using the asychronous netievent mechanism in
the netmgr.
2023-03-24 07:58:52 +01:00
Ondřej Surý
4a4bd68777
Convert setting of the DoH endpoints to to isc_async callback
Simplify the setting of the DoH endpoints by using the isc_async API
from the loopmgr instead of using the asychronous netievent mechanism in
the netmgr.
2023-03-24 07:58:52 +01:00
Ondřej Surý
115160de73
Convert sending on the DoH socket to to isc_async callback
Simplify the sending on the DoH socket by using the isc_async API
from the loopmgr instead of using the asychronous netievent mechanism in
the netmgr.
2023-03-24 07:58:52 +01:00
Ondřej Surý
a321d3f419
Convert closing the DoH socket to to isc_async callback
Simplify the closing the DoH socket by using the isc_async API
from the loopmgr instead of using the asychronous netievent mechanism in
the netmgr.
2023-03-24 07:58:52 +01:00
Ondřej Surý
8c48c51f71
Convert doing the TLS IO to to isc_async callback
Simplify the doing the TLS IO by using the isc_async API
from the loopmgr instead of using the asychronous netievent mechanism in
the netmgr.
2023-03-24 07:58:52 +01:00
Ondřej Surý
3d4d099ac8
Cleanup already defunct tlsconnect netievent
The netievent used for TLS connect was already defunct, just cleanup the
cruft.
2023-03-24 07:58:52 +01:00
Ondřej Surý
35b4ef0a08
Convert sending on the TLS socket to to isc_async callback
Simplify the sending on the TLS socket by using the isc_async API
from the loopmgr instead of using the asychronous netievent mechanism in
the netmgr.
2023-03-24 07:58:52 +01:00
Ondřej Surý
4f27b14cd1
Convert closing the TLS socket to to isc_async callback
Simplify the closing the TLS socket by using the isc_async API
from the loopmgr instead of using the asychronous netievent mechanism in
the netmgr.
2023-03-24 07:58:52 +01:00
Ondřej Surý
e185412872
Convert accepting new TCP connection to to isc_async callback
Simplify the acception the new TCP connection by using the isc_async API
from the loopmgr instead of using the asychronous netievent mechanism in
the netmgr.
2023-03-24 07:58:52 +01:00
Ondřej Surý
1baffb6ff5
Convert canceling UDP socket to to isc_async callback
Simplify the canceling of the UDP socket by using the isc_async API
from the loopmgr instead of using the asychronous netievent mechanism in
the netmgr.
2023-03-24 07:58:52 +01:00
Ondřej Surý
4419848efd
Convert stopping TCP children to to isc_async callback
Simplify the stopping of the TCP children by using the isc_async API
from the loopmgr instead of using the asychronous netievent mechanism in
the netmgr.
2023-03-24 07:58:52 +01:00
Ondřej Surý
e1524f2b4e
Convert starting TCP children to to isc_async callback
Simplify the starting of the TCP children by using the isc_async API
from the loopmgr instead of using the asychronous netievent mechanism in
the netmgr.
2023-03-24 07:58:52 +01:00
Ondřej Surý
8cb4cfd9db
Convert stopping UDP children to to isc_async callback
Simplify the stopping of the UDP children by using the isc_async API
from the loopmgr instead of using the asychronous netievent mechanism in
the netmgr.
2023-03-24 07:58:52 +01:00
Ondřej Surý
b25dd5eaf5
Convert starting UDP children to to isc_async callback
Simplify the starting of the UDP children by using the isc_async API
from the loopmgr instead of using the asychronous netievent mechanism in
the netmgr.
2023-03-24 07:58:52 +01:00
Ondřej Surý
5a43be0775
Simplify netmgr active handles accounting
The active handles accounting was both using atomic counter and ISC_LIST
to keep track of active handles.  Remove the atomic counter that was in
use before the ISC_LIST was added for better tracking of the handles
attached to the socket.
2023-03-24 07:58:52 +01:00
Ondřej Surý
96cff4fc51
Convert netmgr handle detach to synchronous callback
Instead of calling isc__nmhandle_detach calling
nmhandle_detach_cb() asynchronously when there's closehandle_cb
initialized, convert the closehandle_cb to use isc_job, and make the
isc__nmhandle_detach() to be fully synchronous.
2023-03-24 07:58:52 +01:00
Ondřej Surý
237f4af152 Convert netmgr connect, read and send callbacks to isc_job
The netmgr connect, read and send callbacks can now only be executed on
the same loop, convert it from asynchronous netievent queue event to
more direct isc_job.
2023-03-23 22:33:40 -07:00