If isc_app_run() gets interrupted by a signal, the global 'rndc_task'
variable may already be detached from (set to NULL) by the time the
outstanding netmgr callbacks are run. This triggers an assertion
failure in isc_task_shutdown(). However, explicitly calling
isc_task_shutdown() from rndc code is redundant because it does not use
isc_task_onshutdown() and the task_shutdown() function gets
automatically called anyway when the task manager gets destroyed (after
isc_app_run() returns). Remove the redundant isc_task_shutdown() calls
to prevent crashes after receiving a signal.
rndc_recvdone() is not treating the ISC_R_CANCELED result code as a
request to stop data processing, which may cause a crash when trying to
dereference ccmsg->buffer. Fix by ensuring ISC_R_CANCELED results in an
early exit from rndc_recvdone().
Make sure the logic for handling ISC_R_CANCELED in rndc_recvnonce()
matches the one present in rndc_recvdone() to ensure consistent behavior
between these two sibling functions.
Sometimes the serving a query or two might fail in the test due to the
listeners not being reinitialised on time. This commit makes the test
suite to wait for reconfiguration message in the log file to detect
the time when the reconfiguration request completed.
gnutls-cli is tricky to script around as it immediately closes the
server connection when its standard input is closed. This prevents
simple shell-based I/O redirection from being used for capturing the DNS
response sent over a TLS connection and the workarounds for this issue
employ non-standard utilities like "timeout".
Instead of resorting to clever shell hacks, reimplement the relevant
check in Python. Exit immediately upon receiving a valid DNS response
or when gnutls-cli exits in order to decrease the test's run time.
Employ dnspython to avoid the need for storing DNS queries in binary
files and to improve test readability. Capture more diagnostic output
to facilitate troubleshooting. Use a pytest fixture instead of an
Autoconf macro to keep test requirements localized.
Some operating systems (OpenBSD and DragonFly BSD) don't restrict the
IPv6 sockets to sending and receiving IPv6 packets only. Explicitly
enable the IPV6_V6ONLY socket option on the IPv6 sockets to prevent
failures from using the IPv4-mapped IPv6 address.
The server_send_error_response() function is supposed to be used only
in case of failures and never in case of legitimate requests. Ensure
that ISC_HTTP_ERROR_SUCCESS is never passed there by mistake.
1. 10 seconds is an unfortunate pick because that reintroduces the
problem described in commit 5307bf64 (for an earlier check).
Change the +tries=3 +timeout=10 to +tries=2 +time=15, so that we
minimize the risk of dig missing any responses sent by the server in
the first 15 seconds while also increasing our chances of the
response arriving in time on machines under heavy load and allowing
it a single retry in case things go awry.
2. The comment about TCP above was misleading: as painfully proven by
GitLab CI, using TCP is no guarantee of receiving a response in a
timely manner. It may help a bit, but it is certainly not a 100%
reliable solution.
Change the dig invocation to just use UDP like in the two prior
tests for consistency (and revise that comment accordingly).
The resolver system tests was exhibiting often intermitten failures,
increase the timeout from default 5 second to 10 seconds to give the dig
more leeway for providing an answer.
The detection of MUSL libc via autoconf $host turned out to be
not reliable.
Convert the autoconf check from $host detection to actually detect
the padding used in the struct msghdr.
The commit itself is harmless, but at the same time it is also useless,
so we are reverting it.
This reverts commit 11c869a3d53eafa4083b404e6b6686a120919c26.
The Linux kernel diverts from the POSIX specification for two members of
struct msghdr making them size_t sized (instead of int and socklen_t).
In glibc, the developers have decided to use that. However, the MUSL
developers used padding for the struct and kept the members defined
according to the POSIX.
This creates a problem, because libuv doesn't use recvmmsg() library
call where the padding members are correctly zeroed and instead calls
the syscall directly, the struct msghdr is passed to the kernel with
enormous values in those two members (because of the random junk in the
padding members) and the syscall thus fail with EMSGSIZE.
Disable udp recvmmsg support on systems with MUSL libc until the libuv
starts zeroing the struct msghdr before passing it to the syscall.
Previously, the netmgr/udp.c tried to detect the recvmmsg detection in
libuv with #ifdef UV_UDP_<foo> preprocessor macros. However, because
the UV_UDP_<foo> are not preprocessor macros, but enum members, the
detection didn't work. Because the detection didn't work, the code
didn't have access to the information when we received the final chunk
of the recvmmsg and tried to free the uvbuf every time. Fortunately,
the isc__nm_free_uvbuf() had a kludge that detected attempt to free in
the middle of the receive buffer, so the code worked.
However, libuv 1.37.0 changed the way the recvmmsg was enabled from
implicit to explicit, and we checked for yet another enum member
presence with preprocessor macro, so in fact libuv recvmmsg support was
never enabled with libuv >= 1.37.0.
This commit changes to the preprocessor macros to autoconf checks for
declaration, so the detection now works again. On top of that, it's now
possible to cleanup the alloc_cb and free_uvbuf functions because now,
the information whether we can or cannot free the buffer is available to
us.
When the named is shutting down, the zone event callbacks could
re-schedule the stub and refresh events leading to assertion failure.
Handle the ISC_R_SHUTTINGDOWN event state gracefully by bailing out.