mir/bind

mirror of https://gitlab.isc.org/isc-projects/bind9 synced 2025-08-29 05:28:00 +00:00

Author	SHA1	Message	Date
Artem Boldariev	4c5b36780b	Fix flawed DoH unit tests logic This commit fixes some logical mistakes in DoH unit tests logic, causing them either to fail or not to do what they are intended to do.	2021-05-07 15:47:24 +03:00
Matthijs Mekking	66f2cd228d	Use isdigit instead of checking character range When looking for key files, we could use isdigit rather than checking if the character is within the range [0-9]. Use (unsigned char) cast to ensure the value is representable in the unsigned char type (as suggested by the isdigit manpage). Change " & 0xff" occurrences to the recommended (unsigned char) type cast.	2021-05-05 19:15:33 +02:00
Ondřej Surý	dfd56b84f5	Add support for generating backtraces on Windows This commit adds support for generating backtraces on Windows and refactors the isc_backtrace API to match the Linux/BSD API (without the isc_ prefix) * isc_backtrace_gettrace() was renamed to isc_backtrace(), the third argument was removed and the return type was changed to int * isc_backtrace_symbols() was added * isc_backtrace_symbols_fd() was added and used as appropriate	2021-05-03 20:31:52 +02:00
Ondřej Surý	37c0d196e3	Use uv_sleep in the netmgr code libuv has provided uv_sleep(unsigned int msec) in its API since 1.34.0. Use that in the netmgr code and define a usleep-based shim for libuv < 1.34.0.	2021-05-03 20:22:54 +02:00
Ondřej Surý	c37ff5d188	Add nanosleep and usleep Windows shims This commit adds POSIX nanosleep() and usleep() shim implementation for Windows to help implementors use less #ifdef _WIN32 in the code.	2021-05-03 20:22:54 +02:00
Ondřej Surý	cd54bbbd9a	Add trampoline around iocompletionport_createthreads() On Windows, iocompletionport_createthreads() didn't use isc_thread_create() to create new threads for processing IO, but just a simple CreateThread() call that completely circumvents the isc_trampoline mechanism used to initialize the global isc_tid_v. This led to a segmentation fault in the isc_hp API because '-1' isn't a valid index into the hazard pointer array. This commit changes iocompletionport_createthreads() to use isc_thread_create() instead of CreateThread() to properly initialize isc_tid_v.	2021-05-03 20:21:15 +02:00
Diego Fronza	7729844150	Address comparison of integers with different signedness	2021-05-03 06:54:30 +00:00
Diego Fronza	54aa60eef8	Add malloc attribute to memory allocation functions The malloc attribute allows the compiler to do some optimizations on functions that behave like malloc/calloc, like assuming that the returned pointer does not alias other pointers.	2021-04-26 11:32:17 -03:00
Diego Fronza	efb9c540cd	Removed unnecessary check (mpctx->items == NULL) There is no possibility for mpctx->items to be NULL at the point where the code was removed, since we enforce that fillcount > 0: if mpctx->items == NULL when isc_mempool_get is called, we allocate fillcount more items and add them to the mpctx->items list.	2021-04-26 11:32:17 -03:00
Artem Boldariev	62033110b9	Use a constant for timeouts in soft-timeout tests It makes it easier to change the value should the need arise.	2021-04-23 10:01:42 -07:00
Evan Hunt	7f367b0c7f	use the correct handle when calling the read callback when calling isc_nm_read() on an HTTP socket, the read callback was being run with the incorrect handle. this has been corrected.	2021-04-23 10:01:42 -07:00
Evan Hunt	f0d75ee7c3	fix DOH timeout recovery as with TLS, the destruction of a client stream on failed read needs to be conditional: if we reached failed_read_cb() as a result of a timeout on a timer which has subsequently been reset, the stream must not be closed.	2021-04-23 10:01:42 -07:00
Evan Hunt	b258df8562	add HTTP timeout recovery test NOTE: this test currently fails	2021-04-22 12:40:04 -07:00
Evan Hunt	23ec011298	fix TLS timeout recovery the destruction of the socket in tls_failed_read_cb() needs to be conditional; if reached due to a timeout on a timer that has subsequently been reset, the socket must not be destroyed.	2021-04-22 12:08:04 -07:00
Evan Hunt	c90da99180	fix TCP timeout recovery removed an unnecessary assert in the failed_read_cb() function. also renamed to isc__nm_tcp_failed_read_cb() to match the practice in other modules.	2021-04-22 12:08:04 -07:00
Evan Hunt	25ef0547a9	add TCP and TLS timeout recovery tests NOTE: currently these tests fail	2021-04-22 12:08:04 -07:00
Evan Hunt	52f256f9ae	add TCPDNS and TLSDNS timeout recovery tests this is similar in structure to the UDP timeout recovery test. this commit adds a new mechanism to the netmgr test allowing the listen socket to accept incoming TCP connections but never send a response. this forces the client to time out on read.	2021-04-22 12:08:04 -07:00
Evan Hunt	bcf5b2a675	run read callbacks synchronously on timeout when running read callbacks, if the event result is not ISC_R_SUCCESS, the callback is always run asynchronously. this is a problem on timeout, because there's no chance to reset the timer before the socket has already been destroyed. this commit allows read callbacks to run synchronously for both ISC_R_SUCCESS and ISC_R_TIMEDOUT result codes.	2021-04-22 12:08:04 -07:00
Evan Hunt	609975ad20	add a UDP timeout recovery test this test sets up a server socket that listens for UDP connections but never responds. the client will always time out; it should retry five times before giving up.	2021-04-22 12:08:04 -07:00
Evan Hunt	1f41d59a5e	allow client read callback to be assignable allow netmgr client tests to choose the function that will be used as a read callback, without having to write a different connect callback handler.	2021-04-22 12:08:04 -07:00
Ondřej Surý	b540722bc3	Refactor taskmgr to run on top of netmgr This commit changes the taskmgr to run the individual tasks on the netmgr internal workers. While an effort has been put into keeping the taskmgr interface intact, a couple of changes have been made: * The taskmgr has no concept of universal privileged mode - rather the tasks are either privileged or unprivileged (normal). The privileged tasks are run as a first thing when the netmgr is unpaused. There are now four different queues in the netmgr: 1. priority queue - netievents on the priority queue are run even when the taskmgr enters exclusive mode and netmgr is paused. This is needed to properly start listening on the interfaces, free resources and resume. 2. privileged task queue - only privileged tasks are queued here and this is the first queue that gets processed when network manager is unpaused using isc_nm_resume(). All netmgr workers need to clean the privileged task queue before they all proceed to normal operation. Both task queues are processed when the workers are finished. 3. task queue - only (traditional) tasks are scheduled here and this queue along with the privileged task queue is processed when the netmgr workers are finishing. This is needed to process the task shutdown events. 4. normal queue - this is the queue with netmgr events, e.g. reading, sending, callbacks and pretty much everything is processed here. * The isc_taskmgr_create() now requires an initialized netmgr (isc_nm_t) object. * The isc_nm_destroy() function now waits for an indefinite time, but it will print out the active objects when in tracing mode (-DNETMGR_TRACE=1 and -DNETMGR_TRACE_VERBOSE=1). The netmgr has been made a little bit more asynchronous and it might take a longer time to shut down all the active networking connections. * Previously, the isc_nm_stoplistening() was a synchronous operation. This has been changed and the isc_nm_stoplistening() just schedules the child sockets to stop listening and exits. This was needed to prevent a deadlock as the (traditional) tasks are now executed on the netmgr threads. * The socket selection logic in isc__nm_udp_send() was flawed, but fortunately, it was broken, so we never hit the problem where we created uvreq_t on a socket from nmhandle_t, but then a different socket could be picked up and then we were trying to run the send callback on a socket that had a different threadid than the currently running one.	2021-04-20 23:22:28 +02:00
Ondřej Surý	16fe0d1f41	Cleanup the public vs private ISCAPI remnants Since all the libraries are internal now, just cleanup the ISCAPI remnants in isc_socket, isc_task and isc_timer APIs. This means, there's one less layer as following changes have been done: * struct isc_socket and struct isc_socketmgr have been removed * struct isc__socket and struct isc__socketmgr have been renamed to struct isc_socket and struct isc_socketmgr * struct isc_task and struct isc_taskmgr have been removed * struct isc__task and struct isc__taskmgr have been renamed to struct isc_task and struct isc_taskmgr * struct isc_timer and struct isc_timermgr have been removed * struct isc__timer and struct isc__timermgr have been renamed to struct isc_timer and struct isc_timermgr * All the associated code that dealt with typing isc_<foo> to isc__<foo> and back has been removed.	2021-04-19 13:18:24 +02:00
Ondřej Surý	3388ef36b3	Cleanup the isc_<*>mgr_createinc() constructors Previously, the taskmgr, timermgr and socketmgr had a constructor variant that would create the mgr on top of existing appctx. This was no longer true and isc_<*>mgr was just calling isc_<*>mgr_create() directly without any extra code. This commit just cleans up the extra function.	2021-04-19 10:22:56 +02:00
Artem Boldariev	66432dcd65	Handle a situation when SSL shutdown messages were sent and received It fixes a corner case which was causing dig to print annoying messages like: 14-Apr-2021 18:48:37.099 SSL error in BIO: 1 TLS error (errno: 0). Arguments: received_data: (nil), send_data: (nil), finish: false even when all the data was properly processed.	2021-04-15 15:49:36 +03:00
Artem Boldariev	513cdb52ec	TLS: try to close TCP socket descriptor earlier when possible Before this fix underlying TCP sockets could remain open for longer than is actually required, causing unit tests to fail with lots of ISC_R_TOOMANYOPENFILES errors. The change also enables graceful SSL shutdown (before that it would happen only in the case when isc_nm_cancelread() was called).	2021-04-15 15:49:36 +03:00
Ondřej Surý	202b1d372d	Merge the tls_test.c into netmgr_test.c and extend the tests suite This commit merges TLS tests into the common Network Manager unit tests suite and extends the unit test framework to include support for additional "ping-pong" style tests where all data could be sent via a smaller number of connections (the behaviour of the old test suite). The tests for TCP and TLS were extended to make use of the new mode, as this mode better translates to how the code is used in DoH. Both TLS and TCP tests now share most of the unit tests' code, as they are expected to function similarly from a user's perspective anyway. In addition to the above, the TLS test suite was extended to include TLS tests using the connections quota facility.	2021-04-15 15:49:36 +03:00
Artem Boldariev	8da12738f1	Use T_CONNECT timeout constant for TCP tests (instead of 1 ms) The netmgr_test would be failing on heavily loaded systems because the connection timeout was set to 1 ms. Use the global constant instead.	2021-04-07 15:37:10 +02:00
Ondřej Surý	72ef5f465d	Refactor async callbacks and fix the double tlsdnsconnect callback The isc_nm_tlsdnsconnect() call could end up with two connect callbacks called when the timeout fired and the TCP connection was aborted, but the TLS handshake was not complete yet. isc__nm_connecttimeout_cb() forgot to clean up sock->tls.pending_req when the connect callback was called with ISC_R_TIMEDOUT, leading to a second callback running later. A new argument has been added to the isc__nm_*_failed_connect_cb and isc__nm_*_failed_read_cb functions, to indicate whether the callback needs to run asynchronously or not.	2021-04-07 15:36:59 +02:00
Ondřej Surý	58e75e3ce5	Skip long tls_tests in the CI We already skip most of the recv_send tests in CI because they are too timing-related to be run in overloaded environment. This commit adds a similar change to tls_test before we merge tls_test into netmgr_test.	2021-04-07 15:36:59 +02:00
Artem Boldariev	340235c855	Prevent short TLS tests from hanging in case of errors The tests in tls_test.c could hang in the event of a connect error. This commit allows the tests to bail out when such an error occurs.	2021-04-07 15:36:59 +02:00
Evan Hunt	426c40c96d	rearrange nm_teardown() to check correctness after shutting down if a test failed at the beginning of nm_teardown(), the function would abort before isc_nm_destroy() or isc_tlsctx_free() were reached; we would then abort when nm_setup() was run for the next test case. rearranging the teardown function prevents this problem.	2021-04-07 15:36:59 +02:00
Ondřej Surý	86f4872dd6	isc_nm_connect() always return via callback The isc_nm_connect() functions were refactored to always return the connection status via the connect callback instead of sometimes returning the hard failure directly (for example, when the socket could not be created, or when the network manager was shutting down). This commit changes the connect functions in all the network manager modules, and also makes the necessary refactoring changes in places where the connect functions are called.	2021-04-07 15:36:59 +02:00
Evan Hunt	a70cd026df	move UDP connect retries from dig into isc_nm_udpconnect() dig previously ran isc_nm_udpconnect() three times before giving up, to work around a freebsd bug that caused connect() to return a spurious transient EADDRINUSE. this commit moves the retry code into the network manager itself, so that isc_nm_udpconnect() no longer needs to return a result code.	2021-04-07 15:36:59 +02:00
Ondřej Surý	ca12e25bb0	Use generic functions for reading and timers in TCP The TCP module has been updated to use the generic functions from netmgr.c instead of its own local copies. This brings the module mostly up to par with the TCPDNS and TLSDNS modules.	2021-04-07 15:36:59 +02:00
Ondřej Surý	7df8c7061c	Fix and clean up handling of connect callbacks Several problems were discovered and fixed after the change in the connection timeout in the previous commits: * In TLSDNS, the connection callback was not called at all under some circumstances when the TCP connection had been established, but the TLS handshake hadn't been completed yet. Additional checks have been put in place so that tls_cycle() will end early when the nmsocket is invalidated by the isc__nm_tlsdns_shutdown() call. * In TCP, TCPDNS and TLSDNS, new connections would be established even when the network manager was shutting down. The new call isc__nm_closing() has been added and is used to bail out early even before uv_tcp_connect() is attempted.	2021-04-07 15:36:59 +02:00
Ondřej Surý	5a87c7372c	Make it possible to recover from connect timeouts Similarly to the read timeout, it's now possible to recover from ISC_R_TIMEDOUT event by restarting the timer from the connect callback. The change here also fixes platforms that are missing the socket() options to set the TCP connection timeout, by moving the timeout code into user space. On platforms that support setting the connect timeout via a socket option, the timeout has been hardcoded to 2 minutes (the maximum value of tcp-initial-timeout).	2021-04-07 15:36:58 +02:00
Ondřej Surý	33c00c281f	Make it possible to recover from read timeouts Previously, when the client timed out on read, the client socket would be automatically closed and destroyed when the nmhandle was detached. This commit changes the logic so that it's possible for the callback to recover from the ISC_R_TIMEDOUT event by restarting the timer. This is done by calling isc_nmhandle_settimeout(), which prevents the timeout handling code from destroying the socket; instead, it continues to wait for data. One specific use case for multiple timeouts is serve-stale - the client socket could be created with shorter timeout (as specified with stale-answer-client-timeout), so we can serve the requestor with stale answer, but keep the original query running for a longer time.	2021-04-07 15:36:58 +02:00
Ondřej Surý	0aad979175	Disable netmgr tests only when running under CI The full netmgr test suite is unstable when run in CI due to various timing issues. Previously, we enabled the full test suite only when CI_ENABLE_ALL_TESTS environment variable was set, but that went against original intent of running the full suite when an individual developer would run it locally. This change disables the full test suite only when running in the CI and the CI_ENABLE_ALL_TESTS is not set.	2021-04-07 15:36:58 +02:00
Artem Boldariev	ee10948e2d	Remove dead code which was supposed to handle TLS shutdowns nicely Fixes Coverity issue CID 330954 (See #2612).	2021-04-07 11:21:08 +03:00
Artem Boldariev	e6062210c7	Handle buggy situations with SSL_ERROR_SYSCALL See "BUGS" section at: https://www.openssl.org/docs/man1.1.1/man3/SSL_get_error.html It is mentioned there that when TLS status equals SSL_ERROR_SYSCALL AND errno == 0 it means that underlying transport layer returned EOF prematurely. However, we are managing the transport ourselves, so we should just resume reading from the TCP socket. It seems that this case has been handled properly on modern versions of OpenSSL. That being said, the situation goes in line with the manual: it is briefly mentioned there that SSL_ERROR_SYSCALL might be returned not only in a case of low-level errors (like system call failures).	2021-04-07 11:21:08 +03:00
Artem Boldariev	fa062162a7	Fix crash (regression) in DIG when handling non-DoH responses This commit fixes a crash in dig when it encounters an unexpected header value. The bug was introduced at some point late in the last DoH development cycle. It also refactors the relevant code a little bit to ensure better incoming data validation for client-side DoH connections.	2021-04-01 17:31:29 +03:00
Artem Boldariev	11ed7aac5d	TLS code refactoring, fixes and unit-tests This commit fixes numerous stability issues with TLS transport code as well as adds unit tests for it.	2021-04-01 17:31:29 +03:00
Petr Mensik	81eb3396bf	Do not require config.h to use isc/util.h util.h requires ISC_CONSTRUCTOR definition, which depends on config.h inclusion. It does not include it from isc/util.h (or any other header). Using isc/util.h fails hard when isc/util.h is used without including bind's config.h. Move the check to c file, where ISC_CONSTRUCTOR is used. Ensure config.h is included there.	2021-03-26 11:41:22 +01:00
Patrick McLean	ebced74b19	Add isc_time_now_hires function to get current time with high resolution The current isc_time_now uses CLOCK_REALTIME_COARSE which only updates on a timer tick. This clock is generally fine for millisecond accuracy, but on servers with 100hz clocks, this clock is nowhere near accurate enough for microsecond accuracy. This commit adds a new isc_time_now_hires function that uses CLOCK_REALTIME, which gives the current time, though it is somewhat expensive to call. When microsecond accuracy is required, it may be required to use extra resources for higher accuracy.	2021-03-20 11:25:55 -07:00
Ondřej Surý	d016ea745f	Fix compilation with NETMGR_TRACE(_VERBOSE) enabled on non-Linux When NETMGR_TRACE(_VERBOSE) is enabled, the build would fail on some non-Linux non-glibc platforms because: * Use <stdint.h> print macros because uint_fast32_t is not always unsigned long * The header <execinfo.h> is not available on non-glibc, thus commit adds dummy backtrace() and backtrace_symbols_fd() functions for platforms without HAVE_BACKTRACE	2021-03-19 16:25:28 +01:00
Ondřej Surý	42e4e3b843	Improve reliability of the netmgr unit tests The netmgr unit tests were designed to push the system limits to maximum by sending as many queries as possible in the busy loop from multiple threads. This mostly works with UDP, but in the stateful protocol where establishing the connection takes more time, it failed quite often in the CI. On FreeBSD, this happened more often, because the socket() call would fail spuriously making the problem even worse. This commit does several things to improve reliability: * return value of isc_nm_<proto>connect() is always checked and retried when scheduling the connection fails * The busy while loop has been slowed down with usleep(1000); so the netmgr threads could schedule the work and get executed. * The isc_thread_yield() was replaced with usleep(1000); also to allow the other threads to do any work. * Instead of waiting on just one variable, we wait for multiple variables to reach the final value * We are wrapping the netmgr operations (connects, reads, writes, accepts) with reference counting and waiting for all the callbacks to be accounted for. This has two effects: a) the isc_nm_t is always clean of active sockets and handles when destroyed, so it will prevent the spurious INSIST(references == 1) from isc_nm_destroy() b) the unit test now ensures that all the callbacks are always called when they should be called, so any stuck test means that there was a missing callback call and it is always a real bug These changes allow us to remove the workaround that would not run certain tests on systems without port load-balancing.	2021-03-19 16:25:28 +01:00
Ondřej Surý	e4e0e9e3c1	Call isc__nm_tlsdns_failed_read on tls_error to cleanup the socket In tls_error(), we now call isc__nm_tlsdns_failed_read() instead of just stopping timer and reading from the socket. This allows us to properly cleanup any pending operation on the socket.	2021-03-19 15:28:52 +01:00
Ondřej Surý	e4b0730387	Call the isc__nm_failed_connect_cb() early when shutting down When shutting down, calling the isc__nm_failed_connect_cb() was delayed until the connect callback would be called. It turned out that the connect callback might not get called at all when the socket is being shut down. Call the failed_connect_cb() directly in the tlsdns_shutdown() instead of waiting for the connect callback to call it.	2021-03-18 14:31:15 -07:00
Ondřej Surý	73c574e553	Fix typo in processbuffer() - tcpdns vs tlsdns The processbuffer() would call isc__nm_tcpdns_processbuffer() instead of isc__nm_tlsdns_processbuffer() for the isc_nm_tlsdnssocket type of socket.	2021-03-18 21:35:13 +01:00
Ondřej Surý	1d64d4cde8	Fix memory accounting bug in TLSDNS After a partial write the tls.senddata buffer would be rearranged to contain only the data that wasn't sent and the len part would be made shorter, which would lead to an attempt to free only part of a socket's tls.senddata buffer.	2021-03-18 18:14:38 +01:00
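
The isdigit change described in commit 66f2cd228d can be illustrated with a minimal sketch. The helper name count_leading_digits() is hypothetical and is not taken from the BIND sources; it only shows why the (unsigned char) cast matters: isdigit() requires a value representable as unsigned char (or EOF), and a plain char may be negative on platforms where char is signed.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Count the decimal digits at the start of a string.
 * Hypothetical helper for illustration; not part of BIND.
 */
size_t
count_leading_digits(const char *s) {
	size_t n = 0;

	assert(s != NULL);

	/*
	 * Cast to unsigned char before calling isdigit(): passing a
	 * negative plain char is undefined behaviour.
	 */
	while (isdigit((unsigned char)s[n])) {
		n++;
	}

	return n;
}
```

The cast replaces the older "& 0xff" masking idiom mentioned in the commit message; both keep the argument in range, but the cast is the form recommended by the isdigit manual page.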


4339 Commits
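
As a rough illustration of the malloc attribute mentioned in commit 54aa60eef8, the sketch below declares an allocator wrapper with the GCC/Clang attribute. The macro ATTR_MALLOC and the function xalloc() are hypothetical names chosen for this example and do not reflect the actual BIND declarations.

```c
#include <assert.h>
#include <stdlib.h>

/* GCC and Clang understand the attribute; other compilers get an empty macro. */
#if defined(__GNUC__) || defined(__clang__)
#define ATTR_MALLOC __attribute__((malloc))
#else
#define ATTR_MALLOC
#endif

/*
 * Hypothetical allocator wrapper. The attribute tells the compiler that
 * the returned pointer does not alias any other live pointer.
 */
void *
xalloc(size_t size) ATTR_MALLOC;

void *
xalloc(size_t size) {
	void *ptr;

	assert(size > 0);

	ptr = malloc(size);
	if (ptr == NULL) {
		/* Treat allocation failure as fatal in this sketch. */
		abort();
	}

	return ptr;
}
```

The attribute is only valid on functions that really return fresh, unaliased storage; it would be incorrect on a realloc-style function that may return its argument.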