2
0
mirror of https://gitlab.isc.org/isc-projects/bind9 synced 2025-08-23 10:39:16 +00:00

112 Commits

Author SHA1 Message Date
Ondřej Surý
8715be1e4b Use UV_RUNTIME_CHECK() as appropriate
Replace the RUNTIME_CHECK() calls for libuv API calls with
UV_RUNTIME_CHECK() to get more detailed error message when
something fails and should not.
2022-02-16 11:16:57 +01:00
Ondřej Surý
2ae84702ad Add log message when hard quota is reached in TCP accept
When isc_quota_attach_cb() API returns ISC_R_QUOTA (meaning hard quota
was reached) the accept_connection() would return without logging a
message about quota reached.

Change the connection callback to log the quota reached message.
2022-02-01 21:00:05 +01:00
Ondřej Surý
b5e086257d Explicitly enable IPV6_V6ONLY on the netmgr sockets
Some operating systems (OpenBSD and DragonFly BSD) don't restrict the
IPv6 sockets to sending and receiving IPv6 packets only.  Explicitly
enable the IPV6_V6ONLY socket option on the IPv6 sockets to prevent
failures from using the IPv4-mapped IPv6 address.
2022-01-17 22:16:27 +01:00
Ondřej Surý
0a4e91ee47 Revert "Always enqueue isc__nm_tcp_resumeread()"
The commit itself is harmless, but at the same time it is also useless,
so we are reverting it.

This reverts commit 11c869a3d53eafa4083b404e6b6686a120919c26.
2022-01-13 19:06:39 +01:00
Ondřej Surý
7370725008 Fix the UDP recvmmsg support
Previously, the netmgr/udp.c tried to detect the recvmmsg detection in
libuv with #ifdef UV_UDP_<foo> preprocessor macros.  However, because
the UV_UDP_<foo> are not preprocessor macros, but enum members, the
detection didn't work.  Because the detection didn't work, the code
didn't have access to the information when we received the final chunk
of the recvmmsg and tried to free the uvbuf every time.  Fortunately,
the isc__nm_free_uvbuf() had a kludge that detected attempt to free in
the middle of the receive buffer, so the code worked.

However, libuv 1.37.0 changed the way the recvmmsg was enabled from
implicit to explicit, and we checked for yet another enum member
presence with preprocessor macro, so in fact libuv recvmmsg support was
never enabled with libuv >= 1.37.0.

This commit changes to the preprocessor macros to autoconf checks for
declaration, so the detection now works again.  On top of that, it's now
possible to cleanup the alloc_cb and free_uvbuf functions because now,
the information whether we can or cannot free the buffer is available to
us.
2022-01-13 19:06:39 +01:00
Ondřej Surý
58bd26b6cf Update the copyright information in all files in the repository
This commit converts the license handling to adhere to the REUSE
specification.  It specifically:

1. Adds used licnses to LICENSES/ directory

2. Add "isc" template for adding the copyright boilerplate

3. Changes all source files to include copyright and SPDX license
   header, this includes all the C sources, documentation, zone files,
   configuration files.  There are notes in the doc/dev/copyrights file
   on how to add correct headers to the new files.

4. Handle the rest that can't be modified via .reuse/dep5 file.  The
   binary (or otherwise unmodifiable) files could have license places
   next to them in <foo>.license file, but this would lead to cluttered
   repository and most of the files handled in the .reuse/dep5 file are
   system test files.
2022-01-11 09:05:02 +01:00
Ondřej Surý
11c869a3d5 Always enqueue isc__nm_tcp_resumeread()
The isc__nm_tcp_resumeread() was using maybe_enqueue function to enqueue
netmgr event which could case the read callback to be executed
immediately if there was enough data waiting in the TCP queue.

If such thing would happen, the read callback would be called before the
previous read callback was finished and the worker receive buffer would
be still marked "in use" causing a assertion failure.

This would affect only raw TCP channels, e.g. rndc and http statistics.
2022-01-06 10:34:04 -08:00
Michał Kępień
ea89ab80ae Fix error codes passed to connection callbacks
Commit 9ee60e7a17bf34c7ef7f4d79e6a00ca45444ec8c erroneously introduced
duplicate conditions to several existing conditional statements
responsible for determining error codes passed to connection callbacks
upon failure.  Fix the affected expressions to ensure connection
callbacks are invoked with:

  - the ISC_R_SHUTTINGDOWN error code when a global netmgr shutdown is
    in progress,

  - the ISC_R_CANCELED error code when a specific operation has been
    canceled.

This does not fix any known bugs, it only adjusts the changes introduced
by commit 9ee60e7a17bf34c7ef7f4d79e6a00ca45444ec8c so that they match
its original intent.
2021-12-28 15:09:50 +01:00
Ondřej Surý
20ac73eb22 Improve the logging on failed TCP accept
Previously, when TCP accept failed, we have logged a message with
ISC_LOG_ERROR level.  One common case, how this could happen is that the
client hits TCP client quota and is put on hold and when resumed, the
client has already given up and closed the TCP connection.  In such
case, the named would log:

    TCP connection failed: socket is not connected

This message was quite confusing because it actually doesn't say that
it's related to the accepting the TCP connection and also it logs
everything on the ISC_LOG_ERROR level.

Change the log message to "Accepting TCP connection failed" and for
specific error states lower the severity of the log message to
ISC_LOG_INFO.
2021-12-02 13:50:00 +01:00
Evan Hunt
8d6bf826c6 netmgr: refactor isc__nm_incstats() and isc__nm_decstats()
After support for route/netlink sockets is merged, not all sockets
will have stats counters associated with them, so it's now necessary
to check whether socket stats exist before incrementing or decrementing
them. rather than relying on the caller for this, we now just pass the
socket and an index, and the correct stats counter will be updated if
it exists.
2021-10-15 00:40:37 -07:00
Ondřej Surý
9ee60e7a17 netmgr fixes needed for dispatch
- The read timer must always be stopped when reading stops.

- Read callbacks can now call isc_nm_read() again in TCP, TCPDNS and
  TLSDNS; previously this caused an assertion.

- The wrong failure code could be sent after a UDP recv failure because
  the if statements were in the wrong order. the check for a NULL
  address needs to be after the check for an error code, otherwise the
  result will always be set to ISC_R_EOF.

- When aborting a read or connect because the netmgr is shutting down,
  use ISC_R_SHUTTINGDOWN. (ISC_R_CANCELED is now reserved for when the
  read has been canceled by the caller.)

- A new function isc_nmhandle_timer_running() has been added enabling a
  callback to check whether the timer has been reset after processing a
  timeout.

- Incidental netmgr fix: always use isc__nm_closing() instead of
  referencing sock->mgr->closing directly

- Corrected a few comments that used outdated function names.
2021-10-02 11:39:56 -07:00
Ondřej Surý
440fb3d225 Completely remove BIND 9 Windows support
The Windows support has been completely removed from the source tree
and BIND 9 now no longer supports native compilation on Windows.

We might consider reviewing mingw-w64 port if contributed by external
party, but no development efforts will be put into making BIND 9 compile
and run on Windows again.
2021-06-09 14:35:14 +02:00
Ondřej Surý
50270de8a0 Refactor the interface handling in the netmgr
The isc_nmiface_t type was holding just a single isc_sockaddr_t,
so we got rid of the datatype and use plain isc_sockaddr_t in place
where isc_nmiface_t was used before.  This means less type-casting and
shorter path to access isc_sockaddr_t members.

At the same time, instead of keeping the reference to the isc_sockaddr_t
that was passed to us when we start listening, we will keep a local
copy. This prevents the data race on destruction of the ns_interface_t
objects where pending nmsockets could reference the sockaddr of already
destroyed ns_interface_t object.
2021-05-26 09:43:12 +02:00
Ondřej Surý
4509089419 Add configuration option to set send/recv buffers on the nm sockets
This commit adds a new configuration option to set the receive and send
buffer sizes on the TCP and UDP netmgr sockets.  The default is `0`
which doesn't set any value and just uses the value set by the operating
system.

There's no magic value here - set it too small and the performance will
drop, set it too large, the buffers can fill-up with queries that have
already timeouted on the client side and nobody is interested for the
answer and this would just make the server clog up even more by making
it produce useless work.

The `netstat -su` can be used on POSIX systems to monitor the receive
and send buffer errors.
2021-05-17 08:47:09 +02:00
Ondřej Surý
365c6a9851 ensure interlocked netmgr events run on worker[0]
Network manager events that require interlock (pause, resume, listen)
are now always executed in the same worker thread, mgr->workers[0],
to prevent races.

"stoplistening" events no longer require interlock.
2021-05-07 14:28:32 -07:00
Ondřej Surý
4c8f6ebeb1 Use barriers for netmgr synchronization
The netmgr listening, stoplistening, pausing and resuming functions
now use barriers for synchronization, which makes the code much simpler.

isc/barrier.h defines isc_barrier macros as a front-end for uv_barrier
on platforms where that works, and pthread_barrier where it doesn't
(including TSAN builds).
2021-05-07 14:28:32 -07:00
Artem Boldariev
8510c5cd59 Always call TCP connect callback from within a worker context
This change ensures that a TCP connect callback is called from within
the context of a worker thread in case of a low-level error when
descriptors cannot be created (e.g. when there are too many open file
descriptors).
2021-05-07 15:47:24 +03:00
Evan Hunt
c90da99180 fix TCP timeout recovery
removed an unnecessary assert in the failed_read_cb() function.
also renamed to isc__nm_tcp_failed_read_cb() to match the practice
in other modules.
2021-04-22 12:08:04 -07:00
Evan Hunt
bcf5b2a675 run read callbacks synchronously on timeout
when running read callbacks, if the event result is not ISC_R_SUCCESS,
the callback is always run asynchronously. this is a problem on timeout,
because there's no chance to reset the timer before the socket has
already been destroyed. this commit allows read callbacks to run
synchronously for both ISC_R_SUCCESS and ISC_R_TIMEDOUT result codes.
2021-04-22 12:08:04 -07:00
Ondřej Surý
b540722bc3 Refactor taskmgr to run on top of netmgr
This commit changes the taskmgr to run the individual tasks on the
netmgr internal workers.  While an effort has been put into keeping the
taskmgr interface intact, couple of changes have been made:

 * The taskmgr has no concept of universal privileged mode - rather the
   tasks are either privileged or unprivileged (normal).  The privileged
   tasks are run as a first thing when the netmgr is unpaused.  There
   are now four different queues in in the netmgr:

   1. priority queue - netievent on the priority queue are run even when
      the taskmgr enter exclusive mode and netmgr is paused.  This is
      needed to properly start listening on the interfaces, free
      resources and resume.

   2. privileged task queue - only privileged tasks are queued here and
      this is the first queue that gets processed when network manager
      is unpaused using isc_nm_resume().  All netmgr workers need to
      clean the privileged task queue before they all proceed normal
      operation.  Both task queues are processed when the workers are
      finished.

   3. task queue - only (traditional) task are scheduled here and this
      queue along with privileged task queues are process when the
      netmgr workers are finishing.  This is needed to process the task
      shutdown events.

   4. normal queue - this is the queue with netmgr events, e.g. reading,
      sending, callbacks and pretty much everything is processed here.

 * The isc_taskmgr_create() now requires initialized netmgr (isc_nm_t)
   object.

 * The isc_nm_destroy() function now waits for indefinite time, but it
   will print out the active objects when in tracing mode
   (-DNETMGR_TRACE=1 and -DNETMGR_TRACE_VERBOSE=1), the netmgr has been
   made a little bit more asynchronous and it might take longer time to
   shutdown all the active networking connections.

 * Previously, the isc_nm_stoplistening() was a synchronous operation.
   This has been changed and the isc_nm_stoplistening() just schedules
   the child sockets to stop listening and exits.  This was needed to
   prevent a deadlock as the the (traditional) tasks are now executed on
   the netmgr threads.

 * The socket selection logic in isc__nm_udp_send() was flawed, but
   fortunatelly, it was broken, so we never hit the problem where we
   created uvreq_t on a socket from nmhandle_t, but then a different
   socket could be picked up and then we were trying to run the send
   callback on a socket that had different threadid than currently
   running.
2021-04-20 23:22:28 +02:00
Ondřej Surý
72ef5f465d Refactor async callbacks and fix the double tlsdnsconnect callback
The isc_nm_tlsdnsconnect() call could end up with two connect callbacks
called when the timeout fired and the TCP connection was aborted,
but the TLS handshake was not complete yet.  isc__nm_connecttimeout_cb()
forgot to clean up sock->tls.pending_req when the connect callback was
called with ISC_R_TIMEDOUT, leading to a second callback running later.

A new argument has been added to the isc__nm_*_failed_connect_cb and
isc__nm_*_failed_read_cb functions, to indicate whether the callback
needs to run asynchronously or not.
2021-04-07 15:36:59 +02:00
Ondřej Surý
86f4872dd6 isc_nm_*connect() always return via callback
The isc_nm_*connect() functions were refactored to always return the
connection status via the connect callback instead of sometimes returning
the hard failure directly (for example, when the socket could not be
created, or when the network manager was shutting down).

This commit changes the connect functions in all the network manager
modules, and also makes the necessary refactoring changes in places
where the connect functions are called.
2021-04-07 15:36:59 +02:00
Ondřej Surý
ca12e25bb0 Use generic functions for reading and timers in TCP
The TCP module has been updated to use the generic functions from
netmgr.c instead of its own local copies.  This brings the module
mostly up to par with the TCPDNS and TLSDNS modules.
2021-04-07 15:36:59 +02:00
Ondřej Surý
7df8c7061c Fix and clean up handling of connect callbacks
Serveral problems were discovered and fixed after the change in
the connection timeout in the previous commits:

  * In TLSDNS, the connection callback was not called at all under some
    circumstances when the TCP connection had been established, but the
    TLS handshake hadn't been completed yet.  Additional checks have
    been put in place so that tls_cycle() will end early when the
    nmsocket is invalidated by the isc__nm_tlsdns_shutdown() call.

  * In TCP, TCPDNS and TLSDNS, new connections would be established
    even when the network manager was shutting down.  The new
    call isc__nm_closing() has been added and is used to bail out
    early even before uv_tcp_connect() is attempted.
2021-04-07 15:36:59 +02:00
Ondřej Surý
5a87c7372c Make it possible to recover from connect timeouts
Similarly to the read timeout, it's now possible to recover from
ISC_R_TIMEDOUT event by restarting the timer from the connect callback.

The change here also fixes platforms that missing the socket() options
to set the TCP connection timeout, by moving the timeout code into user
space.  On platforms that support setting the connect timeout via a
socket option, the timeout has been hardcoded to 2 minutes (the maximum
value of tcp-initial-timeout).
2021-04-07 15:36:58 +02:00
Ondřej Surý
33c00c281f Make it possible to recover from read timeouts
Previously, when the client timed out on read, the client socket would
be automatically closed and destroyed when the nmhandle was detached.
This commit changes the logic so that it's possible for the callback to
recover from the ISC_R_TIMEDOUT event by restarting the timer. This is
done by calling isc_nmhandle_settimeout(), which prevents the timeout
handling code from destroying the socket; instead, it continues to wait
for data.

One specific use case for multiple timeouts is serve-stale - the client
socket could be created with shorter timeout (as specified with
stale-answer-client-timeout), so we can serve the requestor with stale
answer, but keep the original query running for a longer time.
2021-04-07 15:36:58 +02:00
Ondřej Surý
caa5b6548a Fix TCPDNS and TLSDNS timers
After the TCPDNS refactoring the initial and idle timers were broken and
only the tcp-initial-timeout was always applied on the whole TCP
connection.

This broke any TCP connection that took longer than tcp-initial-timeout,
most often this would affect large zone AXFRs.

This commit changes the timeout logic in this way:

  * On TCP connection accept the tcp-initial-timeout is applied
    and the timer is started
  * When we are processing and/or sending any DNS message the timer is
    stopped
  * When we stop processing all DNS messages, the tcp-idle-timeout
    is applied and the timer is started again
2021-03-18 16:37:57 +01:00
Evan Hunt
88752b1121 refactor outgoing HTTP connection support
- style, cleanup, and removal of unnecessary code.
- combined isc_nm_http_add_endpoint() and isc_nm_http_add_doh_endpoint()
  into one function, renamed isc_http_endpoint().
- moved isc_nm_http_connect_send_request() into doh_test.c as a helper
  function; remove it from the public API.
- renamed isc_http2 and isc_nm_http2 types and functions to just isc_http
  and isc_nm_http, for consistency with other existing names.
- shortened a number of long names.
- the caller is now responsible for determining the peer address.
  in isc_nm_httpconnect(); this eliminates the need to parse the URI
  and the dependency on an external resolver.
- the caller is also now responsible for creating the SSL client context,
  for consistency with isc_nm_tlsdnsconnect().
- added setter functions for HTTP/2 ALPN. instead of setting up ALPN in
  isc_tlsctx_createclient(), we now have a function
  isc_tlsctx_enable_http2client_alpn() that can be run from
  isc_nm_httpconnect().
- refactored isc_nm_httprequest() into separate read and send functions.
  isc_nm_send() or isc_nm_read() is called on an http socket, it will
  be stored until a corresponding isc_nm_read() or _send() arrives; when
  we have both halves of the pair the HTTP request will be initiated.
- isc_nm_httprequest() is renamed isc__nm_http_request() for use as an
  internal helper function by the DoH unit test. (eventually doh_test
  should be rewritten to use read and send, and this function should
  be removed.)
- added implementations of isc__nm_tls_settimeout() and
  isc__nm_http_settimeout().
- increased NGHTTP2 header block length for client connections to 128K.
- use isc_mem_t for internal memory allocations inside nghttp2, to
  help track memory leaks.
- send "Cache-Control" header in requests and responses. (note:
  currently we try to bypass HTTP caching proxies, but ideally we should
  interact with them: https://tools.ietf.org/html/rfc8484#section-5.1)
2021-03-05 13:29:26 +02:00
Artem Boldariev
08da09bc76 Initial support for DNS-over-HTTP(S)
This commit completes the support for DNS-over-HTTP(S) built on top of
nghttp2 and plugs it into the BIND. Support for both GET and POST
requests is present, as required by RFC8484.

Both encrypted (via TLS) and unencrypted HTTP/2 connections are
supported. The latter are mostly there for debugging/troubleshooting
purposes and for the means of encryption offloading to third-party
software (as might be desirable in some environments to simplify TLS
certificates management).
2021-02-03 12:06:17 +01:00
Witold Kręcicki
7a96081360 nghttp2-based HTTP layer in netmgr
This commit includes work-in-progress implementation of
DNS-over-HTTP(S).

Server-side code remains mostly untested, and there is only support
for POST requests.
2021-02-03 12:06:17 +01:00
Ondřej Surý
5caf33feda Fix HAVE_SO_REUSEPORT_LB macro name definition
A typo in macro definition caused the load-balanced sockets to be
disabled even on platforms with existing support for load-balanced
sockets.
2020-12-04 14:45:22 +01:00
Ondřej Surý
87c5867202 Use sock->nchildren instead of mgr->nworkers when initializing NM
On Windows, we were limiting the number of listening children to just 1,
but we were then iterating on mgr->nworkers.  That lead to scheduling
more async_*listen() than actually allocated and out-of-bound read-write
operation on the heap.
2020-12-03 18:03:25 +01:00
Ondřej Surý
151852f428 Fix datarace when UDP/TCP connect fails and we are in nmthread
When we were in nmthread, the isc__nm_async_<proto>connect() function
executes in the same thread as the isc__nm_<proto>connect() and on a
failure, it would block indefinitely because the failure branch was
setting sock->active to false before the condition around the wait had a
chance to skip the WAIT().

This also fixes the zero system test being stuck on FreeBSD 11, so we
re-enable the test in the commit.
2020-12-03 13:56:34 +01:00
Ondřej Surý
1d066e4bc5 Distribute queries among threads even on platforms without lb sockets
On platforms without load-balancing socket all the queries would be
handle by a single thread.  Currently, the support for load-balanced
sockets is present in Linux with SO_REUSEPORT and FreeBSD 12 with
SO_REUSEPORT_LB.

This commit adds workaround for such platforms that:

1. setups single shared listening socket for all listening nmthreads for
   UDP, TCP and TCPDNS netmgr transports

2. Calls uv_udp_bind/uv_tcp_bind on the underlying socket just once and
   for rest of the nmthreads only copy the internal libuv flags (should
   be just UV_HANDLE_BOUND and optionally UV_HANDLE_IPV6).

3. start reading on UDP socket or listening on TCP socket

The load distribution among the nmthreads is uneven, but it's still
better than utilizing just one thread for processing all the incoming
queries
2020-12-03 09:20:33 +01:00
Ondřej Surý
2e1dd56d0b Fix the data race in accessing the isc_nm_t timers
The following TSAN report about accessing the mgr timers (mgr->init,
mgr->idle, mgr->keepalive and mgr->advertised) has been fixed in this
commit:

    ==================
    WARNING: ThreadSanitizer: data race (pid=2746)
    Read of size 4 at 0x7b440008a948 by thread T18:
    #0 isc__nm_tcpdns_read /home/ondrej/Projects/bind9/lib/isc/netmgr/tcpdns.c:849:25 (libisc.so.1706+0x2ba0f)
    #1 isc_nm_read /home/ondrej/Projects/bind9/lib/isc/netmgr/netmgr.c:1679:3 (libisc.so.1706+0x22258)
    #2 tcpdns_connect_connect_cb /home/ondrej/Projects/bind9/lib/isc/tests/tcpdns_test.c:363:2 (tcpdns_test+0x4bc5fb)
    #3 isc__nm_async_connectcb /home/ondrej/Projects/bind9/lib/isc/netmgr/netmgr.c:1816:2 (libisc.so.1706+0x228c9)
    #4 isc__nm_connectcb /home/ondrej/Projects/bind9/lib/isc/netmgr/netmgr.c:1791:3 (libisc.so.1706+0x22713)
    #5 tcpdns_connect_cb /home/ondrej/Projects/bind9/lib/isc/netmgr/tcpdns.c:343:2 (libisc.so.1706+0x2d89d)
    #6 uv__stream_connect /home/ondrej/Projects/tsan/libuv/src/unix/stream.c:1381:5 (libuv.so.1+0x27c18)
    #7 uv__stream_io /home/ondrej/Projects/tsan/libuv/src/unix/stream.c:1298:5 (libuv.so.1+0x25977)
    #8 uv__io_poll /home/ondrej/Projects/tsan/libuv/src/unix/linux-core.c:462:11 (libuv.so.1+0x2e795)
    #9 uv_run /home/ondrej/Projects/tsan/libuv/src/unix/core.c:385:5 (libuv.so.1+0x158ec)
    #10 nm_thread /home/ondrej/Projects/bind9/lib/isc/netmgr/netmgr.c:530:11 (libisc.so.1706+0x1c94a)

    Previous write of size 4 at 0x7b440008a948 by main thread:
    #0 isc_nm_settimeouts /home/ondrej/Projects/bind9/lib/isc/netmgr/netmgr.c:490:12 (libisc.so.1706+0x1dda5)
    #1 tcpdns_recv_two /home/ondrej/Projects/bind9/lib/isc/tests/tcpdns_test.c:601:2 (tcpdns_test+0x4bad0e)
    #2 cmocka_run_one_test_or_fixture <null> (libcmocka.so.0+0x70be)
    #3 __libc_start_main /build/glibc-vjB4T1/glibc-2.28/csu/../csu/libc-start.c:308:16 (libc.so.6+0x2409a)

    Location is heap block of size 281 at 0x7b440008a840 allocated by main thread:
    #0 malloc <null> (tcpdns_test+0x42864b)
    #1 default_memalloc /home/ondrej/Projects/bind9/lib/isc/mem.c:713:8 (libisc.so.1706+0x6d261)
    #2 mem_get /home/ondrej/Projects/bind9/lib/isc/mem.c:622:8 (libisc.so.1706+0x69b9c)
    #3 isc___mem_get /home/ondrej/Projects/bind9/lib/isc/mem.c:1044:9 (libisc.so.1706+0x6d379)
    #4 isc__mem_get /home/ondrej/Projects/bind9/lib/isc/mem.c:2432:10 (libisc.so.1706+0x6889e)
    #5 isc_nm_start /home/ondrej/Projects/bind9/lib/isc/netmgr/netmgr.c:203:8 (libisc.so.1706+0x1c219)
    #6 nm_setup /home/ondrej/Projects/bind9/lib/isc/tests/tcpdns_test.c:244:11 (tcpdns_test+0x4baaa4)
    #7 cmocka_run_one_test_or_fixture <null> (libcmocka.so.0+0x70fd)
    #8 __libc_start_main /build/glibc-vjB4T1/glibc-2.28/csu/../csu/libc-start.c:308:16 (libc.so.6+0x2409a)

    Thread T18 'isc-net-0000' (tid=3513, running) created by main thread at:
    #0 pthread_create <null> (tcpdns_test+0x429e7b)
    #1 isc_thread_create /home/ondrej/Projects/bind9/lib/isc/pthreads/thread.c:73:8 (libisc.so.1706+0x8476a)
    #2 isc_nm_start /home/ondrej/Projects/bind9/lib/isc/netmgr/netmgr.c:271:3 (libisc.so.1706+0x1c66a)
    #3 nm_setup /home/ondrej/Projects/bind9/lib/isc/tests/tcpdns_test.c:244:11 (tcpdns_test+0x4baaa4)
    #4 cmocka_run_one_test_or_fixture <null> (libcmocka.so.0+0x70fd)
    #5 __libc_start_main /build/glibc-vjB4T1/glibc-2.28/csu/../csu/libc-start.c:308:16 (libc.so.6+0x2409a)

    SUMMARY: ThreadSanitizer: data race /home/ondrej/Projects/bind9/lib/isc/netmgr/tcpdns.c:849:25 in isc__nm_tcpdns_read
    ==================
    ThreadSanitizer: reported 1 warnings
2020-12-02 10:14:31 +01:00
Ondřej Surý
634bdfb16d Refactor netmgr and add more unit tests
This is a part of the works that intends to make the netmgr stable,
testable, maintainable and tested.  It contains a numerous changes to
the netmgr code and unfortunately, it was not possible to split this
into smaller chunks as the work here needs to be committed as a complete
works.

NOTE: There's a quite a lot of duplicated code between udp.c, tcp.c and
tcpdns.c and it should be a subject to refactoring in the future.

The changes that are included in this commit are listed here
(extensively, but not exclusively):

* The netmgr_test unit test was split into individual tests (udp_test,
  tcp_test, tcpdns_test and newly added tcp_quota_test)

* The udp_test and tcp_test has been extended to allow programatic
  failures from the libuv API.  Unfortunately, we can't use cmocka
  mock() and will_return(), so we emulate the behaviour with #define and
  including the netmgr/{udp,tcp}.c source file directly.

* The netievents that we put on the nm queue have variable number of
  members, out of these the isc_nmsocket_t and isc_nmhandle_t always
  needs to be attached before enqueueing the netievent_<foo> and
  detached after we have called the isc_nm_async_<foo> to ensure that
  the socket (handle) doesn't disappear between scheduling the event and
  actually executing the event.

* Cancelling the in-flight TCP connection using libuv requires to call
  uv_close() on the original uv_tcp_t handle which just breaks too many
  assumptions we have in the netmgr code.  Instead of using uv_timer for
  TCP connection timeouts, we use platform specific socket option.

* Fix the synchronization between {nm,async}_{listentcp,tcpconnect}

  When isc_nm_listentcp() or isc_nm_tcpconnect() is called it was
  waiting for socket to either end up with error (that path was fine) or
  to be listening or connected using condition variable and mutex.

  Several things could happen:

    0. everything is ok

    1. the waiting thread would miss the SIGNAL() - because the enqueued
       event would be processed faster than we could start WAIT()ing.
       In case the operation would end up with error, it would be ok, as
       the error variable would be unchanged.

    2. the waiting thread miss the sock->{connected,listening} = `true`
       would be set to `false` in the tcp_{listen,connect}close_cb() as
       the connection would be so short lived that the socket would be
       closed before we could even start WAIT()ing

* The tcpdns has been converted to using libuv directly.  Previously,
  the tcpdns protocol used tcp protocol from netmgr, this proved to be
  very complicated to understand, fix and make changes to.  The new
  tcpdns protocol is modeled in a similar way how tcp netmgr protocol.
  Closes: #2194, #2283, #2318, #2266, #2034, #1920

* The tcp and tcpdns is now not using isc_uv_import/isc_uv_export to
  pass accepted TCP sockets between netthreads, but instead (similar to
  UDP) uses per netthread uv_loop listener.  This greatly reduces the
  complexity as the socket is always run in the associated nm and uv
  loops, and we are also not touching the libuv internals.

  There's an unfortunate side effect though, the new code requires
  support for load-balanced sockets from the operating system for both
  UDP and TCP (see #2137).  If the operating system doesn't support the
  load balanced sockets (either SO_REUSEPORT on Linux or SO_REUSEPORT_LB
  on FreeBSD 12+), the number of netthreads is limited to 1.

* The netmgr has now two debugging #ifdefs:

  1. Already existing NETMGR_TRACE prints any dangling nmsockets and
     nmhandles before triggering assertion failure.  This options would
     reduce performance when enabled, but in theory, it could be enabled
     on low-performance systems.

  2. New NETMGR_TRACE_VERBOSE option has been added that enables
     extensive netmgr logging that allows the software engineer to
     precisely track any attach/detach operations on the nmsockets and
     nmhandles.  This is not suitable for any kind of production
     machine, only for debugging.

* The tlsdns netmgr protocol has been split from the tcpdns and it still
  uses the old method of stacking the netmgr boxes on top of each other.
  We will have to refactor the tlsdns netmgr protocol to use the same
  approach - build the stack using only libuv and openssl.

* Limit but not assert the tcp buffer size in tcp_alloc_cb
  Closes: #2061
2020-12-01 16:47:07 +01:00
Ondřej Surý
a49d88568f Turn all the callback to be always asynchronous
When calling the high level netmgr functions, the callback would be
sometimes called synchronously if we catch the failure directly, or
asynchronously if it happens later.  The synchronous call to the
callback could create deadlocks as the caller would not expect the
failed callback to be executed directly.
2020-11-11 22:15:40 +01:00
Evan Hunt
4be63c5b00 add isc_nmhandle_settimeout() function
this function sets the read timeout for the socket associated
with a netmgr handle and, if the timer is running, resets it.
for TCPDNS sockets it also sets the read timeout and resets the
timer on the outer TCP socket.
2020-11-07 20:49:53 +01:00
Mark Andrews
0073cb7356 Incorrect result code passed to failed_connect_cb
*** CID 312970:  Incorrect expression  (COPY_PASTE_ERROR) /lib/isc/netmgr/tcp.c: 282 in tcp_connect_cb()
    276     	}
    277
    278     	isc__nm_incstats(sock->mgr, sock->statsindex[STATID_CONNECT]);
    279     	r = uv_tcp_getpeername(&sock->uv_handle.tcp, (struct sockaddr *)&ss,
    280     			       &(int){ sizeof(ss) });
    281     	if (r != 0) {
    >>>     CID 312970:  Incorrect expression  (COPY_PASTE_ERROR)
    >>>     "status" in "isc___nm_uverr2result(status, true, "netmgr/tcp.c", 282U)" looks like a copy-paste error.
    282     		failed_connect_cb(sock, req, isc__nm_uverr2result(status));
    283     		return;
    284     	}
    285
    286     	atomic_store(&sock->connecting, false);
    287
2020-11-04 21:58:05 +00:00
Ondřej Surý
c14c1fdd2c Put up additional safe guards to not use inactive/closed tcpdns socket
When we are operating on the tcpdns socket, we need to double check
whether the socket or its outerhandle or its listener or its mgr is
still active and when not, bail out early.
2020-11-02 20:58:00 +01:00
Evan Hunt
8fcad58ea6 check return value from uv_tcp_getpeername() when connecting
if we can't determine the peer, the connect should fail.
2020-10-30 11:11:54 +01:00
Evan Hunt
26a3a22895 set REUSEPORT and REUSEADDR on TCP sockets if needed
When binding a TCP socket, if bind() fails with EADDRINUSE,
try again with REUSEPORT/REUSEADDR (or the equivalent options).
2020-10-30 11:11:54 +01:00
Ondřej Surý
ed3ab63f74 Fix more races between connect and shutdown
There were more races that could happen while connecting to a
socket while closing or shutting down the same socket.  This
commit introduces a .closing flag to guard the socket from
being closed twice.
2020-10-30 11:11:54 +01:00
Ondřej Surý
6cfadf9db0 Fix a race between isc__nm_async_shutdown() and new sends/reads
There was a data race where a new event could be scheduled after
isc__nm_async_shutdown() had cleaned up all the dangling UDP/TCP
sockets from the loop.
2020-10-30 11:11:54 +01:00
Ondřej Surý
cdccac4993 Fix netmgr read/connect timeout issues
- don't bother closing sockets that are already closing.
- UDP read timeout timer was not stopped after reading.
- improve handling of TCP connection failures.
2020-10-30 11:11:54 +01:00
Evan Hunt
5dcdc00b93 add netmgr functions to support outgoing DNS queries
- isc_nm_tcpdnsconnect() sets up up an outgoing TCP DNS connection.
- isc_nm_tcpconnect(), _udpconnect() and _tcpdnsconnect() now take a
  timeout argument to ensure connections time out and are correctly
  cleaned up on failure.
- isc_nm_read() now supports UDP; it reads a single datagram and then
  stops until the next time it's called.
- isc_nm_cancelread() now runs asynchronously to prevent assertion
  failure if reading is interrupted by a non-network thread (e.g.
  a timeout).
- isc_nm_cancelread() can now apply to UDP sockets.
- added shim code to support UDP connection in versions of libuv
  prior to 1.27, when uv_udp_connect() was added

all these functions will be used to support outgoing queries in dig,
xfrin, dispatch, etc.
2020-10-30 11:11:54 +01:00
Ondřej Surý
8797e5efd5 Fix the data race when read-writing sock->active by using cmpxchg 2020-10-22 11:46:58 -07:00
Ondřej Surý
5ef71c420f Ignore and don't log ISC_R_NOTCONNECTED from uv_accept()
When client disconnects before the connection can be accepted, the named
would log a spurious log message:

    error: Accepting TCP connection failed: socket is not connected

We now ignore the ISC_R_NOTCONNECTED result code and log only other
errors
2020-10-22 11:37:16 -07:00
Ondřej Surý
f7c82e406e Fix the isc_nm_closedown() to actually close the pending connections
1. The isc__nm_tcp_send() and isc__nm_tcp_read() was not checking
   whether the socket was still alive and scheduling reads/sends on
   closed socket.

2. The isc_nm_read(), isc_nm_send() and isc_nm_resumeread() have been
   changed to always return the error conditions via the callbacks, so
   they always succeed.  This applies to all protocols (UDP, TCP and
   TCPDNS).
2020-10-22 11:37:16 -07:00
Ondřej Surý
6af08d1ca6 Fix the way tcp_send_direct() is used
There were two problems how tcp_send_direct() was used:

1. The tcp_send_direct() can return ISC_R_CANCELED (or translated error
   from uv_tcp_send()), but the isc__nm_async_tcpsend() wasn't checking
   the error code and not releasing the uvreq in case of an error.

2. In isc__nm_tcp_send(), when the TCP send is already in the right
   netthread, it uses tcp_send_direct() to send the TCP packet right
   away.  When that happened the uvreq was not freed, and the error code
   was returned to the caller.  We need to return ISC_R_SUCCESS and
   rather use the callback to report an error in such case.
2020-10-22 11:37:16 -07:00