mir/bind - bind - Mike's Git repositories

mir/bind

mirror of https://gitlab.isc.org/isc-projects/bind9 synced 2025-08-24 02:58:38 +00:00

Author	SHA1	Message	Date
Ondřej Surý	b432d5d3bc	Gracefully handle uv_read_start() failures Under specific rare timing circumstances the uv_read_start() could fail with UV_EINVAL when the connection is reset between the connect (or accept) and the uv_read_start() call on the nmworker loop. Handle such situation gracefully by propagating the errors from uv_read_start() into upper layers, so the socket can be internally closed().	2022-06-14 11:33:02 +02:00
Ondřej Surý	b43812692d	Move netmgr/uv-compat.h to <isc/uv.h> As we are going to use libuv outside of the netmgr, we need the shims to be readily available for the rest of the codebase. Move the "netmgr/uv-compat.h" to <isc/uv.h> and netmgr/uv-compat.c to uv.c, and as a rule of thumb, the users of libuv should include <isc/uv.h> instead of <uv.h> directly. Additionally, merge netmgr/uverr2result.c into uv.c and rename the single function from isc__nm_uverr2result() to isc_uverr2result().	2022-05-03 10:02:19 +02:00
Ondřej Surý	24c3879675	Move socket related functions to netmgr/socket.c Move the netmgr socket related functions from netmgr/netmgr.c and netmgr/uv-compat.c to netmgr/socket.c, so they are all present all in the same place. Adjust the names of couple interal functions accordingly.	2022-05-03 09:52:49 +02:00
Ondřej Surý	407b37c3f2	Set IP(V6)_RECVERR on connect UDP sockets (via libuv) The connect()ed UDP socket provides feedback on a variety of ICMP errors (eg port unreachable) which bind can then use to decide what to do with errors (report them to the client, try again with a different nameserver etc). However, Linux's implementation does not report what it considers "transient" conditions, which is defined as Destination host Unreachable, Destination network unreachable, Source Route Failed and Message Too Big. Explicitly enable IP_RECVERR / IPV6_RECVERR (via libuv uv_udp_bind() flag) to learn about ICMP destination network/host unreachable.	2022-04-26 12:22:18 +02:00
Ondřej Surý	f55a4d3e55	Allow listening on less than nworkers threads For some applications, it's useful to not listen on full battery of threads. Add workers argument to all isc_nm_listen*() functions and convenience ISC_NM_LISTEN_ONE and ISC_NM_LISTEN_ALL macros.	2022-04-19 11:08:13 +02:00
Ondřej Surý	85c6e797aa	Add option to configure load balance sockets Previously, the option to enable kernel load balancing of the sockets was always enabled when supported by the operating system (SO_REUSEPORT on Linux and SO_REUSEPORT_LB on FreeBSD). It was reported that in scenarios where the networking threads are also responsible for processing long-running tasks (like RPZ processing, CATZ processing or large zone transfers), this could lead to intermitten brownouts for some clients, because the thread assigned by the operating system might be busy. In such scenarious, the overall performance would be better served by threads competing over the sockets because the idle threads can pick up the incoming traffic. Add new configuration option (`load-balance-sockets`) to allow enabling or disabling the load balancing of the sockets.	2022-04-04 23:10:04 +02:00
Ondřej Surý	9de10cd153	Remove extrahandle size from netmgr Previously, it was possible to assign a bit of memory space in the nmhandle to store the client data. This was complicated and prevents further refactoring of isc_nmhandle_t caching (future work). Instead of caching the data in the nmhandle, allocate the hot-path ns_client_t objects from per-thread clientmgr memory context and just assign it to the isc_nmhandle_t via isc_nmhandle_set().	2022-03-25 10:38:35 +01:00
Ondřej Surý	584f0d7a7e	Simplify way we tag unreachable code with only ISC_UNREACHABLE() Previously, the unreachable code paths would have to be tagged with: INSIST(0); ISC_UNREACHABLE(); There was also older parts of the code that used comment annotation: /* NOTREACHED */ Unify the handling of unreachable code paths to just use: UNREACHABLE(); The UNREACHABLE() macro now asserts when reached and also uses __builtin_unreachable(); when such builtin is available in the compiler.	2022-03-25 08:33:43 +01:00
Ondřej Surý	a761aa59e3	Change single write timer to per-send timers Previously, there was a single per-socket write timer that would get restarted for every new write. This turned out to be insufficient because the other side could keep reseting the timer, and never reading back the responses. Change the single write timer to per-send timer which would in turn reset the TCP connection on the first send timeout.	2022-03-11 09:56:57 +01:00
Ondřej Surý	5d34a14f22	Set minimum MTU (1280) on IPv6 sockets The IPV6_USE_MIN_MTU socket option directs the IP layer to limit the IPv6 packet size to the minimum required supported MTU from the base IPv6 specification, i.e. 1280 bytes. Many implementations of TCP running over IPv6 neglect to check the IPV6_USE_MIN_MTU value when performing MSS negotiation and when constructing a TCP segment despite MSS being defined to be the MTU less the IP and TCP header sizes (60 bytes for IPv6). This leads to oversized IPv6 packets being sent resulting in unintended Path Maximum Transport Unit Discovery (PMTUD) being performed and to fragmented IPv6 packets being sent. Add and use a function to set socket option to limit the MTU on IPv6 sockets to the minimum MTU (1280) both for UDP and TCP.	2022-03-08 10:27:05 +01:00
Ondřej Surý	408b362169	Add TCP, TCPDNS and TLSDNS write timer When the outgoing TCP write buffers are full because the other party is not reading the data, the uv_write() could wait indefinitely on the uv_loop and never calling the callback. Add a new write timer that uses the `tcp-idle-timeout` value to interrupt the TCP connection when we are not able to send data for defined period of time.	2022-02-17 09:06:58 +01:00
Ondřej Surý	45a73c113f	Rename sock->timer to sock->read_timer Before adding the write timer, we have to remove the generic sock->timer to sock->read_timer. We don't touch the function names to limit the impact of the refactoring.	2022-02-17 09:06:58 +01:00
Ondřej Surý	8715be1e4b	Use UV_RUNTIME_CHECK() as appropriate Replace the RUNTIME_CHECK() calls for libuv API calls with UV_RUNTIME_CHECK() to get more detailed error message when something fails and should not.	2022-02-16 11:16:57 +01:00
Ondřej Surý	b5e086257d	Explicitly enable IPV6_V6ONLY on the netmgr sockets Some operating systems (OpenBSD and DragonFly BSD) don't restrict the IPv6 sockets to sending and receiving IPv6 packets only. Explicitly enable the IPV6_V6ONLY socket option on the IPv6 sockets to prevent failures from using the IPv4-mapped IPv6 address.	2022-01-17 22:16:27 +01:00
Ondřej Surý	7370725008	Fix the UDP recvmmsg support Previously, the netmgr/udp.c tried to detect the recvmmsg detection in libuv with #ifdef UV_UDP_<foo> preprocessor macros. However, because the UV_UDP_<foo> are not preprocessor macros, but enum members, the detection didn't work. Because the detection didn't work, the code didn't have access to the information when we received the final chunk of the recvmmsg and tried to free the uvbuf every time. Fortunately, the isc__nm_free_uvbuf() had a kludge that detected attempt to free in the middle of the receive buffer, so the code worked. However, libuv 1.37.0 changed the way the recvmmsg was enabled from implicit to explicit, and we checked for yet another enum member presence with preprocessor macro, so in fact libuv recvmmsg support was never enabled with libuv >= 1.37.0. This commit changes to the preprocessor macros to autoconf checks for declaration, so the detection now works again. On top of that, it's now possible to cleanup the alloc_cb and free_uvbuf functions because now, the information whether we can or cannot free the buffer is available to us.	2022-01-13 19:06:39 +01:00
Ondřej Surý	58bd26b6cf	Update the copyright information in all files in the repository This commit converts the license handling to adhere to the REUSE specification. It specifically: 1. Adds used licnses to LICENSES/ directory 2. Add "isc" template for adding the copyright boilerplate 3. Changes all source files to include copyright and SPDX license header, this includes all the C sources, documentation, zone files, configuration files. There are notes in the doc/dev/copyrights file on how to add correct headers to the new files. 4. Handle the rest that can't be modified via .reuse/dep5 file. The binary (or otherwise unmodifiable) files could have license places next to them in <foo>.license file, but this would lead to cluttered repository and most of the files handled in the .reuse/dep5 file are system test files.	2022-01-11 09:05:02 +01:00
Evan Hunt	8c51a32e5c	netmgr: add isc_nm_routeconnect() isc_nm_routeconnect() opens a route/netlink socket, then calls a connect callback, much like isc_nm_udpconnect(), with a handle that can then be monitored for network changes. Internally the socket is treated as a UDP socket, since route/netlink sockets follow the datagram contract.	2021-10-15 00:56:58 -07:00
Evan Hunt	8d6bf826c6	netmgr: refactor isc__nm_incstats() and isc__nm_decstats() After support for route/netlink sockets is merged, not all sockets will have stats counters associated with them, so it's now necessary to check whether socket stats exist before incrementing or decrementing them. rather than relying on the caller for this, we now just pass the socket and an index, and the correct stats counter will be updated if it exists.	2021-10-15 00:40:37 -07:00
Ondřej Surý	9ee60e7a17	netmgr fixes needed for dispatch - The read timer must always be stopped when reading stops. - Read callbacks can now call isc_nm_read() again in TCP, TCPDNS and TLSDNS; previously this caused an assertion. - The wrong failure code could be sent after a UDP recv failure because the if statements were in the wrong order. the check for a NULL address needs to be after the check for an error code, otherwise the result will always be set to ISC_R_EOF. - When aborting a read or connect because the netmgr is shutting down, use ISC_R_SHUTTINGDOWN. (ISC_R_CANCELED is now reserved for when the read has been canceled by the caller.) - A new function isc_nmhandle_timer_running() has been added enabling a callback to check whether the timer has been reset after processing a timeout. - Incidental netmgr fix: always use isc__nm_closing() instead of referencing sock->mgr->closing directly - Corrected a few comments that used outdated function names.	2021-10-02 11:39:56 -07:00
Ondřej Surý	87d5c8ab7c	Disable the Path MTU Discover on UDP Sockets Instead of disabling the fragmentation on the UDP sockets, we now disable the Path MTU Discovery by setting IP(V6)_MTU_DISCOVER socket option to IP_PMTUDISC_OMIT on Linux and disabling IP(V6)_DONTFRAG socket option on FreeBSD. This option sets DF=0 in the IP header and also ignores the Path MTU Discovery. As additional mitigation on Linux, we recommend setting net.ipv4.ip_no_pmtu_disc to Mode 3: Mode 3 is a hardend pmtu discover mode. The kernel will only accept fragmentation-needed errors if the underlying protocol can verify them besides a plain socket lookup. Current protocols for which pmtu events will be honored are TCP, SCTP and DCCP as they verify e.g. the sequence number or the association. This mode should not be enabled globally but is only intended to secure e.g. name servers in namespaces where TCP path mtu must still work but path MTU information of other protocols should be discarded. If enabled globally this mode could break other protocols.	2021-08-19 07:12:33 +02:00
Ondřej Surý	440fb3d225	Completely remove BIND 9 Windows support The Windows support has been completely removed from the source tree and BIND 9 now no longer supports native compilation on Windows. We might consider reviewing mingw-w64 port if contributed by external party, but no development efforts will be put into making BIND 9 compile and run on Windows again.	2021-06-09 14:35:14 +02:00
Ondřej Surý	67afea6cfc	Cleanup the remaining of HAVE_UV_<func> macros While cleaning up the usage of HAVE_UV_<func> macros, we forgot to cleanup the HAVE_UV_UDP_CONNECT in the actual code and HAVE_UV_TRANSLATE_SYS_ERROR and this was causing Windows build to fail on uv_udp_send() because the socket was already connected and we were falsely assuming that it was not. The platforms with autoconf support were not affected, because we were still checking for the functions from the configure.	2021-06-02 11:23:36 +02:00
Ondřej Surý	50270de8a0	Refactor the interface handling in the netmgr The isc_nmiface_t type was holding just a single isc_sockaddr_t, so we got rid of the datatype and use plain isc_sockaddr_t in place where isc_nmiface_t was used before. This means less type-casting and shorter path to access isc_sockaddr_t members. At the same time, instead of keeping the reference to the isc_sockaddr_t that was passed to us when we start listening, we will keep a local copy. This prevents the data race on destruction of the ns_interface_t objects where pending nmsockets could reference the sockaddr of already destroyed ns_interface_t object.	2021-05-26 09:43:12 +02:00
Ondřej Surý	4509089419	Add configuration option to set send/recv buffers on the nm sockets This commit adds a new configuration option to set the receive and send buffer sizes on the TCP and UDP netmgr sockets. The default is `0` which doesn't set any value and just uses the value set by the operating system. There's no magic value here - set it too small and the performance will drop, set it too large, the buffers can fill-up with queries that have already timeouted on the client side and nobody is interested for the answer and this would just make the server clog up even more by making it produce useless work. The `netstat -su` can be used on POSIX systems to monitor the receive and send buffer errors.	2021-05-17 08:47:09 +02:00
Ondřej Surý	cd413234f7	Fix the outgoing UDP socket selection on Windows The outgoing UDP socket selection would pick unintialized children socket on Windows, because we have more netmgr workers than we have listening sockets. This commit fixes the selection by keeping the outgoing socket the same, so it's always run on existing socket.	2021-05-13 15:04:48 +02:00
Ondřej Surý	365c6a9851	ensure interlocked netmgr events run on worker[0] Network manager events that require interlock (pause, resume, listen) are now always executed in the same worker thread, mgr->workers[0], to prevent races. "stoplistening" events no longer require interlock.	2021-05-07 14:28:32 -07:00
Ondřej Surý	4c8f6ebeb1	Use barriers for netmgr synchronization The netmgr listening, stoplistening, pausing and resuming functions now use barriers for synchronization, which makes the code much simpler. isc/barrier.h defines isc_barrier macros as a front-end for uv_barrier on platforms where that works, and pthread_barrier where it doesn't (including TSAN builds).	2021-05-07 14:28:32 -07:00
Ondřej Surý	29a208aaf7	Fix crash when allocating UDP socket fails on OpenBSD When socket() call fails, the UDP connect code would call the connectcb with empty req->handle. This has been fixed.	2021-05-07 14:28:30 -07:00
Evan Hunt	bcf5b2a675	run read callbacks synchronously on timeout when running read callbacks, if the event result is not ISC_R_SUCCESS, the callback is always run asynchronously. this is a problem on timeout, because there's no chance to reset the timer before the socket has already been destroyed. this commit allows read callbacks to run synchronously for both ISC_R_SUCCESS and ISC_R_TIMEDOUT result codes.	2021-04-22 12:08:04 -07:00
Ondřej Surý	b540722bc3	Refactor taskmgr to run on top of netmgr This commit changes the taskmgr to run the individual tasks on the netmgr internal workers. While an effort has been put into keeping the taskmgr interface intact, couple of changes have been made: * The taskmgr has no concept of universal privileged mode - rather the tasks are either privileged or unprivileged (normal). The privileged tasks are run as a first thing when the netmgr is unpaused. There are now four different queues in in the netmgr: 1. priority queue - netievent on the priority queue are run even when the taskmgr enter exclusive mode and netmgr is paused. This is needed to properly start listening on the interfaces, free resources and resume. 2. privileged task queue - only privileged tasks are queued here and this is the first queue that gets processed when network manager is unpaused using isc_nm_resume(). All netmgr workers need to clean the privileged task queue before they all proceed normal operation. Both task queues are processed when the workers are finished. 3. task queue - only (traditional) task are scheduled here and this queue along with privileged task queues are process when the netmgr workers are finishing. This is needed to process the task shutdown events. 4. normal queue - this is the queue with netmgr events, e.g. reading, sending, callbacks and pretty much everything is processed here. * The isc_taskmgr_create() now requires initialized netmgr (isc_nm_t) object. * The isc_nm_destroy() function now waits for indefinite time, but it will print out the active objects when in tracing mode (-DNETMGR_TRACE=1 and -DNETMGR_TRACE_VERBOSE=1), the netmgr has been made a little bit more asynchronous and it might take longer time to shutdown all the active networking connections. * Previously, the isc_nm_stoplistening() was a synchronous operation. This has been changed and the isc_nm_stoplistening() just schedules the child sockets to stop listening and exits. This was needed to prevent a deadlock as the the (traditional) tasks are now executed on the netmgr threads. * The socket selection logic in isc__nm_udp_send() was flawed, but fortunatelly, it was broken, so we never hit the problem where we created uvreq_t on a socket from nmhandle_t, but then a different socket could be picked up and then we were trying to run the send callback on a socket that had different threadid than currently running.	2021-04-20 23:22:28 +02:00
Ondřej Surý	72ef5f465d	Refactor async callbacks and fix the double tlsdnsconnect callback The isc_nm_tlsdnsconnect() call could end up with two connect callbacks called when the timeout fired and the TCP connection was aborted, but the TLS handshake was not complete yet. isc__nm_connecttimeout_cb() forgot to clean up sock->tls.pending_req when the connect callback was called with ISC_R_TIMEDOUT, leading to a second callback running later. A new argument has been added to the isc__nm__failed_connect_cb and isc__nm__failed_read_cb functions, to indicate whether the callback needs to run asynchronously or not.	2021-04-07 15:36:59 +02:00
Ondřej Surý	86f4872dd6	isc_nm_connect() always return via callback The isc_nm_connect() functions were refactored to always return the connection status via the connect callback instead of sometimes returning the hard failure directly (for example, when the socket could not be created, or when the network manager was shutting down). This commit changes the connect functions in all the network manager modules, and also makes the necessary refactoring changes in places where the connect functions are called.	2021-04-07 15:36:59 +02:00
Evan Hunt	a70cd026df	move UDP connect retries from dig into isc_nm_udpconnect() dig previously ran isc_nm_udpconnect() three times before giving up, to work around a freebsd bug that caused connect() to return a spurious transient EADDRINUSE. this commit moves the retry code into the network manager itself, so that isc_nm_udpconnect() no longer needs to return a result code.	2021-04-07 15:36:59 +02:00
Ondřej Surý	7df8c7061c	Fix and clean up handling of connect callbacks Serveral problems were discovered and fixed after the change in the connection timeout in the previous commits: * In TLSDNS, the connection callback was not called at all under some circumstances when the TCP connection had been established, but the TLS handshake hadn't been completed yet. Additional checks have been put in place so that tls_cycle() will end early when the nmsocket is invalidated by the isc__nm_tlsdns_shutdown() call. * In TCP, TCPDNS and TLSDNS, new connections would be established even when the network manager was shutting down. The new call isc__nm_closing() has been added and is used to bail out early even before uv_tcp_connect() is attempted.	2021-04-07 15:36:59 +02:00
Ondřej Surý	5a87c7372c	Make it possible to recover from connect timeouts Similarly to the read timeout, it's now possible to recover from ISC_R_TIMEDOUT event by restarting the timer from the connect callback. The change here also fixes platforms that missing the socket() options to set the TCP connection timeout, by moving the timeout code into user space. On platforms that support setting the connect timeout via a socket option, the timeout has been hardcoded to 2 minutes (the maximum value of tcp-initial-timeout).	2021-04-07 15:36:58 +02:00
Ondřej Surý	1ef232f93d	Merge the common parts between udp, tcpdns and tlsdns protocol The udp, tcpdns and tlsdns contained lot of cut&paste code or code that was very similar making the stack harder to maintain as any change to one would have to be copied to the the other protocols. In this commit, we merge the common parts into the common functions under isc__nm_<foo> namespace and just keep the little differences based on the socket type.	2021-03-18 16:37:57 +01:00
Ondřej Surý	caa5b6548a	Fix TCPDNS and TLSDNS timers After the TCPDNS refactoring the initial and idle timers were broken and only the tcp-initial-timeout was always applied on the whole TCP connection. This broke any TCP connection that took longer than tcp-initial-timeout, most often this would affect large zone AXFRs. This commit changes the timeout logic in this way: * On TCP connection accept the tcp-initial-timeout is applied and the timer is started * When we are processing and/or sending any DNS message the timer is stopped * When we stop processing all DNS messages, the tcp-idle-timeout is applied and the timer is started again	2021-03-18 16:37:57 +01:00
Witold Kręcicki	7a96081360	nghttp2-based HTTP layer in netmgr This commit includes work-in-progress implementation of DNS-over-HTTP(S). Server-side code remains mostly untested, and there is only support for POST requests.	2021-02-03 12:06:17 +01:00
Ondřej Surý	5caf33feda	Fix HAVE_SO_REUSEPORT_LB macro name definition A typo in macro definition caused the load-balanced sockets to be disabled even on platforms with existing support for load-balanced sockets.	2020-12-04 14:45:22 +01:00
Ondřej Surý	87c5867202	Use sock->nchildren instead of mgr->nworkers when initializing NM On Windows, we were limiting the number of listening children to just 1, but we were then iterating on mgr->nworkers. That lead to scheduling more async_*listen() than actually allocated and out-of-bound read-write operation on the heap.	2020-12-03 18:03:25 +01:00
Ondřej Surý	151852f428	Fix datarace when UDP/TCP connect fails and we are in nmthread When we were in nmthread, the isc__nm_async_<proto>connect() function executes in the same thread as the isc__nm_<proto>connect() and on a failure, it would block indefinitely because the failure branch was setting sock->active to false before the condition around the wait had a chance to skip the WAIT(). This also fixes the zero system test being stuck on FreeBSD 11, so we re-enable the test in the commit.	2020-12-03 13:56:34 +01:00
Ondřej Surý	1d066e4bc5	Distribute queries among threads even on platforms without lb sockets On platforms without load-balancing socket all the queries would be handle by a single thread. Currently, the support for load-balanced sockets is present in Linux with SO_REUSEPORT and FreeBSD 12 with SO_REUSEPORT_LB. This commit adds workaround for such platforms that: 1. setups single shared listening socket for all listening nmthreads for UDP, TCP and TCPDNS netmgr transports 2. Calls uv_udp_bind/uv_tcp_bind on the underlying socket just once and for rest of the nmthreads only copy the internal libuv flags (should be just UV_HANDLE_BOUND and optionally UV_HANDLE_IPV6). 3. start reading on UDP socket or listening on TCP socket The load distribution among the nmthreads is uneven, but it's still better than utilizing just one thread for processing all the incoming queries	2020-12-03 09:20:33 +01:00
Ondřej Surý	d6d2fbe0e9	Avoid netievent allocations when the callbacks can be called directly After turning the users callbacks to be asynchronous, there was a visible performance drop. This commit prevents the unnecessary allocations while keeping the code paths same for both asynchronous and synchronous calls. The same change was done to the isc__nm_udp_{read,send} as those two functions are in the hot path.	2020-12-02 09:45:05 +01:00
Ondřej Surý	634bdfb16d	Refactor netmgr and add more unit tests This is a part of the works that intends to make the netmgr stable, testable, maintainable and tested. It contains a numerous changes to the netmgr code and unfortunately, it was not possible to split this into smaller chunks as the work here needs to be committed as a complete works. NOTE: There's a quite a lot of duplicated code between udp.c, tcp.c and tcpdns.c and it should be a subject to refactoring in the future. The changes that are included in this commit are listed here (extensively, but not exclusively): * The netmgr_test unit test was split into individual tests (udp_test, tcp_test, tcpdns_test and newly added tcp_quota_test) * The udp_test and tcp_test has been extended to allow programatic failures from the libuv API. Unfortunately, we can't use cmocka mock() and will_return(), so we emulate the behaviour with #define and including the netmgr/{udp,tcp}.c source file directly. * The netievents that we put on the nm queue have variable number of members, out of these the isc_nmsocket_t and isc_nmhandle_t always needs to be attached before enqueueing the netievent_<foo> and detached after we have called the isc_nm_async_<foo> to ensure that the socket (handle) doesn't disappear between scheduling the event and actually executing the event. * Cancelling the in-flight TCP connection using libuv requires to call uv_close() on the original uv_tcp_t handle which just breaks too many assumptions we have in the netmgr code. Instead of using uv_timer for TCP connection timeouts, we use platform specific socket option. * Fix the synchronization between {nm,async}_{listentcp,tcpconnect} When isc_nm_listentcp() or isc_nm_tcpconnect() is called it was waiting for socket to either end up with error (that path was fine) or to be listening or connected using condition variable and mutex. Several things could happen: 0. everything is ok 1. the waiting thread would miss the SIGNAL() - because the enqueued event would be processed faster than we could start WAIT()ing. In case the operation would end up with error, it would be ok, as the error variable would be unchanged. 2. the waiting thread miss the sock->{connected,listening} = `true` would be set to `false` in the tcp_{listen,connect}close_cb() as the connection would be so short lived that the socket would be closed before we could even start WAIT()ing * The tcpdns has been converted to using libuv directly. Previously, the tcpdns protocol used tcp protocol from netmgr, this proved to be very complicated to understand, fix and make changes to. The new tcpdns protocol is modeled in a similar way how tcp netmgr protocol. Closes: #2194, #2283, #2318, #2266, #2034, #1920 * The tcp and tcpdns is now not using isc_uv_import/isc_uv_export to pass accepted TCP sockets between netthreads, but instead (similar to UDP) uses per netthread uv_loop listener. This greatly reduces the complexity as the socket is always run in the associated nm and uv loops, and we are also not touching the libuv internals. There's an unfortunate side effect though, the new code requires support for load-balanced sockets from the operating system for both UDP and TCP (see #2137). If the operating system doesn't support the load balanced sockets (either SO_REUSEPORT on Linux or SO_REUSEPORT_LB on FreeBSD 12+), the number of netthreads is limited to 1. * The netmgr has now two debugging #ifdefs: 1. Already existing NETMGR_TRACE prints any dangling nmsockets and nmhandles before triggering assertion failure. This options would reduce performance when enabled, but in theory, it could be enabled on low-performance systems. 2. New NETMGR_TRACE_VERBOSE option has been added that enables extensive netmgr logging that allows the software engineer to precisely track any attach/detach operations on the nmsockets and nmhandles. This is not suitable for any kind of production machine, only for debugging. * The tlsdns netmgr protocol has been split from the tcpdns and it still uses the old method of stacking the netmgr boxes on top of each other. We will have to refactor the tlsdns netmgr protocol to use the same approach - build the stack using only libuv and openssl. * Limit but not assert the tcp buffer size in tcp_alloc_cb Closes: #2061	2020-12-01 16:47:07 +01:00
Ondřej Surý	a49d88568f	Turn all the callback to be always asynchronous When calling the high level netmgr functions, the callback would be sometimes called synchronously if we catch the failure directly, or asynchronously if it happens later. The synchronous call to the callback could create deadlocks as the caller would not expect the failed callback to be executed directly.	2020-11-11 22:15:40 +01:00
Ondřej Surý	8af7f81d6c	netmgr: Don't crash if socket() returns an error in udpconnect socket() call can return an error - e.g. EMFILE, so we need to handle this nicely and not crash. Additionally wrap the socket() call inside a platform independent helper function as the Socket data type on Windows is unsigned integer: > This means, for example, that checking for errors when the socket and > accept functions return should not be done by comparing the return > value with –1, or seeing if the value is negative (both common and > legal approaches in UNIX). Instead, an application should use the > manifest constant INVALID_SOCKET as defined in the Winsock2.h header > file.	2020-11-08 13:36:12 -08:00
Ondřej Surý	050258bda4	netmgr: Always load the result from async socket Because we use result earlier for setting the loadbalancing on the socket, we could be left with a ISC_R_NOTIMPLEMENTED value stored in the variable and when the UDP connection would succeed, we would errorneously return this value instead of ISC_R_SUCCESS.	2020-11-07 21:12:08 +01:00
Evan Hunt	ea2b04c361	dig: use new netmgr timeout mechanism use isc_nmhandle_settimeout() to set read/recv timeouts, and get rid of connect_timeout() and related functions in dighost.c.	2020-11-07 20:49:53 +01:00
Evan Hunt	4be63c5b00	add isc_nmhandle_settimeout() function this function sets the read timeout for the socket associated with a netmgr handle and, if the timer is running, resets it. for TCPDNS sockets it also sets the read timeout and resets the timer on the outer TCP socket.	2020-11-07 20:49:53 +01:00
Ondřej Surý	ed3ab63f74	Fix more races between connect and shutdown There were more races that could happen while connecting to a socket while closing or shutting down the same socket. This commit introduces a .closing flag to guard the socket from being closed twice.	2020-10-30 11:11:54 +01:00

1 2

100 Commits