When shutting down, calling the isc__nm_failed_connect_cb() was delayed
until the connect callback would be called. It turned out that the
connect callback might not get called at all when the socket is being
shut down. Call the failed_connect_cb() directly in the
tlsdns_shutdown() instead of waiting for the connect callback to call it.
After a partial write the tls.senddata buffer would be rearranged to
contain only the data tha wasn't sent and the len part would be made
shorter, which would lead to attempt to free only part of a socket's
tls.senddata buffer.
The tlsdns_cycle() might call uv_write() to write data to the socket,
when this happens and the socket is shutdown before the callback
completes, the uvreq structure was not freed because the callback would
be called with non-zero status code.
The udp, tcpdns and tlsdns contained lot of cut&paste code or code that
was very similar making the stack harder to maintain as any change to
one would have to be copied to the the other protocols.
In this commit, we merge the common parts into the common functions
under isc__nm_<foo> namespace and just keep the little differences based
on the socket type.
After the TCPDNS refactoring the initial and idle timers were broken and
only the tcp-initial-timeout was always applied on the whole TCP
connection.
This broke any TCP connection that took longer than tcp-initial-timeout,
most often this would affect large zone AXFRs.
This commit changes the timeout logic in this way:
* On TCP connection accept the tcp-initial-timeout is applied
and the timer is started
* When we are processing and/or sending any DNS message the timer is
stopped
* When we stop processing all DNS messages, the tcp-idle-timeout
is applied and the timer is started again
- style, cleanup, and removal of unnecessary code.
- combined isc_nm_http_add_endpoint() and isc_nm_http_add_doh_endpoint()
into one function, renamed isc_http_endpoint().
- moved isc_nm_http_connect_send_request() into doh_test.c as a helper
function; remove it from the public API.
- renamed isc_http2 and isc_nm_http2 types and functions to just isc_http
and isc_nm_http, for consistency with other existing names.
- shortened a number of long names.
- the caller is now responsible for determining the peer address.
in isc_nm_httpconnect(); this eliminates the need to parse the URI
and the dependency on an external resolver.
- the caller is also now responsible for creating the SSL client context,
for consistency with isc_nm_tlsdnsconnect().
- added setter functions for HTTP/2 ALPN. instead of setting up ALPN in
isc_tlsctx_createclient(), we now have a function
isc_tlsctx_enable_http2client_alpn() that can be run from
isc_nm_httpconnect().
- refactored isc_nm_httprequest() into separate read and send functions.
isc_nm_send() or isc_nm_read() is called on an http socket, it will
be stored until a corresponding isc_nm_read() or _send() arrives; when
we have both halves of the pair the HTTP request will be initiated.
- isc_nm_httprequest() is renamed isc__nm_http_request() for use as an
internal helper function by the DoH unit test. (eventually doh_test
should be rewritten to use read and send, and this function should
be removed.)
- added implementations of isc__nm_tls_settimeout() and
isc__nm_http_settimeout().
- increased NGHTTP2 header block length for client connections to 128K.
- use isc_mem_t for internal memory allocations inside nghttp2, to
help track memory leaks.
- send "Cache-Control" header in requests and responses. (note:
currently we try to bypass HTTP caching proxies, but ideally we should
interact with them: https://tools.ietf.org/html/rfc8484#section-5.1)
Instead of calling isc_tls_initialize()/isc_tls_destroy() explicitly use
gcc/clang attributes on POSIX and DLLMain on Windows to initialize and
shutdown OpenSSL library.
This resolves the issue when isc_nm_create() / isc_nm_destroy() was
called multiple times and it would call OpenSSL library destructors from
isc_nm_destroy().
At the same time, since we now have introduced the ctor/dtor for libisc,
this commit moves the isc_mem API initialization (the list of the
contexts) and changes the isc_mem_checkdestroyed() to schedule the
checking of memory context on library unload instead of executing the
code immediately.
Although harmless, the memmove() in tlsdns and tcpdns was guarded by a
current message length variable that was always bigger than 0 instead of
correct current buffer length remainder variable.
* Following the example set in 634bdfb16d8, the tlsdns netmgr
module now uses libuv and SSL primitives directly, rather than
opening a TLS socket which opens a TCP socket, as the previous
model was difficult to debug. Closes#2335.
* Remove the netmgr tls layer (we will have to re-add it for DoH)
* Add isc_tls API to wrap the OpenSSL SSL_CTX object into libisc
library; move the OpenSSL initialization/deinitialization from dstapi
needed for OpenSSL 1.0.x to the isc_tls_{initialize,destroy}()
* Add couple of new shims needed for OpenSSL 1.0.x
* When LibreSSL is used, require at least version 2.7.0 that
has the best OpenSSL 1.1.x compatibility and auto init/deinit
* Enforce OpenSSL 1.1.x usage on Windows
* Added a TLSDNS unit test and implemented a simple TLSDNS echo
server and client.
This is a part of the works that intends to make the netmgr stable,
testable, maintainable and tested. It contains a numerous changes to
the netmgr code and unfortunately, it was not possible to split this
into smaller chunks as the work here needs to be committed as a complete
works.
NOTE: There's a quite a lot of duplicated code between udp.c, tcp.c and
tcpdns.c and it should be a subject to refactoring in the future.
The changes that are included in this commit are listed here
(extensively, but not exclusively):
* The netmgr_test unit test was split into individual tests (udp_test,
tcp_test, tcpdns_test and newly added tcp_quota_test)
* The udp_test and tcp_test has been extended to allow programatic
failures from the libuv API. Unfortunately, we can't use cmocka
mock() and will_return(), so we emulate the behaviour with #define and
including the netmgr/{udp,tcp}.c source file directly.
* The netievents that we put on the nm queue have variable number of
members, out of these the isc_nmsocket_t and isc_nmhandle_t always
needs to be attached before enqueueing the netievent_<foo> and
detached after we have called the isc_nm_async_<foo> to ensure that
the socket (handle) doesn't disappear between scheduling the event and
actually executing the event.
* Cancelling the in-flight TCP connection using libuv requires to call
uv_close() on the original uv_tcp_t handle which just breaks too many
assumptions we have in the netmgr code. Instead of using uv_timer for
TCP connection timeouts, we use platform specific socket option.
* Fix the synchronization between {nm,async}_{listentcp,tcpconnect}
When isc_nm_listentcp() or isc_nm_tcpconnect() is called it was
waiting for socket to either end up with error (that path was fine) or
to be listening or connected using condition variable and mutex.
Several things could happen:
0. everything is ok
1. the waiting thread would miss the SIGNAL() - because the enqueued
event would be processed faster than we could start WAIT()ing.
In case the operation would end up with error, it would be ok, as
the error variable would be unchanged.
2. the waiting thread miss the sock->{connected,listening} = `true`
would be set to `false` in the tcp_{listen,connect}close_cb() as
the connection would be so short lived that the socket would be
closed before we could even start WAIT()ing
* The tcpdns has been converted to using libuv directly. Previously,
the tcpdns protocol used tcp protocol from netmgr, this proved to be
very complicated to understand, fix and make changes to. The new
tcpdns protocol is modeled in a similar way how tcp netmgr protocol.
Closes: #2194, #2283, #2318, #2266, #2034, #1920
* The tcp and tcpdns is now not using isc_uv_import/isc_uv_export to
pass accepted TCP sockets between netthreads, but instead (similar to
UDP) uses per netthread uv_loop listener. This greatly reduces the
complexity as the socket is always run in the associated nm and uv
loops, and we are also not touching the libuv internals.
There's an unfortunate side effect though, the new code requires
support for load-balanced sockets from the operating system for both
UDP and TCP (see #2137). If the operating system doesn't support the
load balanced sockets (either SO_REUSEPORT on Linux or SO_REUSEPORT_LB
on FreeBSD 12+), the number of netthreads is limited to 1.
* The netmgr has now two debugging #ifdefs:
1. Already existing NETMGR_TRACE prints any dangling nmsockets and
nmhandles before triggering assertion failure. This options would
reduce performance when enabled, but in theory, it could be enabled
on low-performance systems.
2. New NETMGR_TRACE_VERBOSE option has been added that enables
extensive netmgr logging that allows the software engineer to
precisely track any attach/detach operations on the nmsockets and
nmhandles. This is not suitable for any kind of production
machine, only for debugging.
* The tlsdns netmgr protocol has been split from the tcpdns and it still
uses the old method of stacking the netmgr boxes on top of each other.
We will have to refactor the tlsdns netmgr protocol to use the same
approach - build the stack using only libuv and openssl.
* Limit but not assert the tcp buffer size in tcp_alloc_cb
Closes: #2061