mir/bind

mirror of https://gitlab.isc.org/isc-projects/bind9 synced 2025-08-30 14:07:59 +00:00

Author	SHA1	Message	Date
Ondřej Surý	0aad979175	Disable netmgr tests only when running under CI The full netmgr test suite is unstable when run in CI due to various timing issues. Previously, we enabled the full test suite only when the CI_ENABLE_ALL_TESTS environment variable was set, but that went against the original intent of running the full suite when an individual developer runs it locally. This change disables the full test suite only when running in the CI and CI_ENABLE_ALL_TESTS is not set.	2021-04-07 15:36:58 +02:00
Artem Boldariev	ee10948e2d	Remove dead code which was supposed to handle TLS shutdowns nicely Fixes Coverity issue CID 330954 (See #2612).	2021-04-07 11:21:08 +03:00
Artem Boldariev	e6062210c7	Handle buggy situations with SSL_ERROR_SYSCALL See "BUGS" section at: https://www.openssl.org/docs/man1.1.1/man3/SSL_get_error.html It is mentioned there that when TLS status equals SSL_ERROR_SYSCALL AND errno == 0 it means that underlying transport layer returned EOF prematurely. However, we are managing the transport ourselves, so we should just resume reading from the TCP socket. It seems that this case has been handled properly on modern versions of OpenSSL. That being said, the situation goes in line with the manual: it is briefly mentioned there that SSL_ERROR_SYSCALL might be returned not only in a case of low-level errors (like system call failures).	2021-04-07 11:21:08 +03:00
Artem Boldariev	fa062162a7	Fix crash (regression) in DIG when handling non-DoH responses This commit fixes crash in dig when it encounters non-expected header value. The bug was introduced at some point late in the last DoH development cycle. Also, refactors the relevant code a little bit to ensure better incoming data validation for client-side DoH connections.	2021-04-01 17:31:29 +03:00
Artem Boldariev	11ed7aac5d	TLS code refactoring, fixes and unit-tests This commit fixes numerous stability issues with TLS transport code as well as adds unit tests for it.	2021-04-01 17:31:29 +03:00
Petr Mensik	81eb3396bf	Do not require config.h to use isc/util.h util.h requires ISC_CONSTRUCTOR definition, which depends on config.h inclusion. It does not include it from isc/util.h (or any other header). Using isc/util.h fails hard when isc/util.h is used without including bind's config.h. Move the check to c file, where ISC_CONSTRUCTOR is used. Ensure config.h is included there.	2021-03-26 11:41:22 +01:00
Patrick McLean	ebced74b19	Add isc_time_now_hires function to get current time with high resolution The current isc_time_now uses CLOCK_REALTIME_COARSE which only updates on a timer tick. This clock is generally fine for millisecond accuracy, but on servers with 100hz clocks, this clock is nowhere near accurate enough for microsecond accuracy. This commit adds a new isc_time_now_hires function that uses CLOCK_REALTIME, which gives the current time, though it is somewhat expensive to call. When microsecond accuracy is required, it may be required to use extra resources for higher accuracy.	2021-03-20 11:25:55 -07:00
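The trade-off described in the commit above can be sketched with plain POSIX calls. This is a minimal illustration, not the BIND implementation: the function names `now_coarse_ns()` and `now_hires_ns()` are hypothetical, and `CLOCK_REALTIME_COARSE` is assumed to be a Linux-specific clock, so the sketch falls back to `CLOCK_REALTIME` where it is missing.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: read a clock and return nanoseconds, or -1 on error. */
static int64_t
read_clock_ns(clockid_t clock) {
	struct timespec ts;

	if (clock_gettime(clock, &ts) != 0) {
		return (-1);
	}
	return ((int64_t)ts.tv_sec * 1000000000LL + (int64_t)ts.tv_nsec);
}

/* Cheap timestamp that only advances on a timer tick (where available). */
int64_t
now_coarse_ns(void) {
#ifdef CLOCK_REALTIME_COARSE
	return (read_clock_ns(CLOCK_REALTIME_COARSE));
#else
	/* Assumption: no coarse clock on this platform, use the precise one. */
	return (read_clock_ns(CLOCK_REALTIME));
#endif
}

/* More expensive timestamp with the full resolution of CLOCK_REALTIME. */
int64_t
now_hires_ns(void) {
	return (read_clock_ns(CLOCK_REALTIME));
}
```

A caller that only needs millisecond accuracy would use the coarse variant, and reserve the high-resolution variant for measurements where microseconds matter.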
Ondřej Surý	d016ea745f	Fix compilation with NETMGR_TRACE(_VERBOSE) enabled on non-Linux When NETMGR_TRACE(_VERBOSE) is enabled, the build would fail on some non-Linux non-glibc platforms because: * Use <stdint.h> print macros because uint_fast32_t is not always unsigned long * The header <execinfo.h> is not available on non-glibc, thus commit adds dummy backtrace() and backtrace_symbols_fd() functions for platforms without HAVE_BACKTRACE	2021-03-19 16:25:28 +01:00
Ondřej Surý	42e4e3b843	Improve reliability of the netmgr unit tests The netmgr unit tests were designed to push the system limits to the maximum by sending as many queries as possible in the busy loop from multiple threads. This mostly works with UDP, but in the stateful protocol where establishing the connection takes more time, it failed quite often in the CI. On FreeBSD, this happened more often, because the socket() call would fail spuriously, making the problem even worse. This commit does several things to improve reliability: * return value of isc_nm_<proto>connect() is always checked and retried when scheduling the connection fails * The busy while loop has been slowed down with usleep(1000); so the netmgr threads could schedule the work and get executed. * The isc_thread_yield() was replaced with usleep(1000); also to allow the other threads to do any work. * Instead of waiting on just one variable, we wait for multiple variables to reach the final value * We are wrapping the netmgr operations (connects, reads, writes, accepts) with reference counting and waiting for all the callbacks to be accounted for. This has two effects: a) the isc_nm_t is always clean of active sockets and handles when destroyed, so it will prevent the spurious INSIST(references == 1) from isc_nm_destroy() b) the unit test now ensures that all the callbacks are always called when they should be called, so any stuck test means that there was a missing callback call and it is always a real bug These changes allow us to remove the workaround that would not run certain tests on systems without port load-balancing.	2021-03-19 16:25:28 +01:00
Ondřej Surý	e4e0e9e3c1	Call isc__nm_tlsdns_failed_read on tls_error to cleanup the socket In tls_error(), we now call isc__nm_tlsdns_failed_read() instead of just stopping timer and reading from the socket. This allows us to properly cleanup any pending operation on the socket.	2021-03-19 15:28:52 +01:00
Ondřej Surý	e4b0730387	Call the isc__nm_failed_connect_cb() early when shutting down When shutting down, calling the isc__nm_failed_connect_cb() was delayed until the connect callback would be called. It turned out that the connect callback might not get called at all when the socket is being shut down. Call the failed_connect_cb() directly in the tlsdns_shutdown() instead of waiting for the connect callback to call it.	2021-03-18 14:31:15 -07:00
Ondřej Surý	73c574e553	Fix typo in processbuffer() - tcpdns vs tlsdns The processbuffer() would call isc__nm_tcpdns_processbuffer() instead of isc__nm_tlsdns_processbuffer() for the isc_nm_tlsdnssocket type of socket.	2021-03-18 21:35:13 +01:00
Ondřej Surý	1d64d4cde8	Fix memory accounting bug in TLSDNS After a partial write the tls.senddata buffer would be rearranged to contain only the data that wasn't sent and the len part would be made shorter, which would lead to an attempt to free only part of a socket's tls.senddata buffer.	2021-03-18 18:14:38 +01:00
Ondřej Surý	5cc406a920	Fix dangling uvreq when data is sent from tlsdns_cycle() The tlsdns_cycle() might call uv_write() to write data to the socket, when this happens and the socket is shutdown before the callback completes, the uvreq structure was not freed because the callback would be called with non-zero status code.	2021-03-18 17:58:56 +01:00
Ondřej Surý	36ddefacb4	Change the isc_nm_(get|set)timeouts() to work with milliseconds RFC 7828 specifies the keepalive interval to be 16-bit, specified in units of 100 milliseconds, and the configuration options tcp-*-timeouts follow suit. The units of 100 milliseconds are very unintuitive and while we can't change the configuration and presentation format, we should not follow this weird unit in the API. This commit changes the isc_nm_(get|set)timeouts() functions to work with milliseconds and convert the values to milliseconds before passing them to the function, not just internally.	2021-03-18 16:37:57 +01:00
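The unit conversion the commit above refers to is simple arithmetic; a minimal sketch follows. The helper name `timeout_units_to_ms()` is hypothetical and is not part of the BIND API. It only shows the conversion a caller would perform before handing a value to a millisecond-based function.

```c
#include <stdint.h>

/*
 * Hypothetical helper: convert a 16-bit timeout expressed in units of
 * 100 milliseconds (the RFC 7828 / configuration format) to milliseconds.
 * The result is widened to 32 bits because 65535 * 100 overflows 16 bits.
 */
uint32_t
timeout_units_to_ms(uint16_t units) {
	return ((uint32_t)units * 100);
}
```

For example, a configured value of 300 (30 seconds in 100 ms units) becomes 30000 milliseconds.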
Ondřej Surý	1ef232f93d	Merge the common parts between udp, tcpdns and tlsdns protocol The udp, tcpdns and tlsdns contained a lot of cut&paste code or code that was very similar, making the stack harder to maintain as any change to one would have to be copied to the other protocols. In this commit, we merge the common parts into the common functions under isc__nm_<foo> namespace and just keep the little differences based on the socket type.	2021-03-18 16:37:57 +01:00
Ondřej Surý	caa5b6548a	Fix TCPDNS and TLSDNS timers After the TCPDNS refactoring the initial and idle timers were broken and only the tcp-initial-timeout was always applied on the whole TCP connection. This broke any TCP connection that took longer than tcp-initial-timeout, most often this would affect large zone AXFRs. This commit changes the timeout logic in this way: * On TCP connection accept the tcp-initial-timeout is applied and the timer is started * When we are processing and/or sending any DNS message the timer is stopped * When we stop processing all DNS messages, the tcp-idle-timeout is applied and the timer is started again	2021-03-18 16:37:57 +01:00
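The three timer rules listed in the commit above can be modelled as a small state machine. This is only a sketch under stated assumptions: the type `conn_timer_t` and the three functions are hypothetical names, and the "timer" is reduced to a single field that records which timeout is currently armed (0 meaning stopped) rather than a real libuv timer.

```c
#include <stdint.h>

/* Hypothetical model of a per-connection timer; not the netmgr types. */
typedef struct {
	uint32_t initial_ms;  /* tcp-initial-timeout, in milliseconds */
	uint32_t idle_ms;     /* tcp-idle-timeout, in milliseconds */
	uint32_t armed_ms;    /* timeout currently armed, 0 when stopped */
	unsigned int active;  /* DNS messages being processed or sent */
} conn_timer_t;

/* On accept: apply the initial timeout and start the timer. */
void
conn_accept(conn_timer_t *t, uint32_t initial_ms, uint32_t idle_ms) {
	t->initial_ms = initial_ms;
	t->idle_ms = idle_ms;
	t->active = 0;
	t->armed_ms = initial_ms;
}

/* While any DNS message is processed or sent: the timer is stopped. */
void
conn_message_begin(conn_timer_t *t) {
	t->active++;
	t->armed_ms = 0;
}

/* When the last message is done: apply the idle timeout and restart. */
void
conn_message_end(conn_timer_t *t) {
	if (t->active > 0 && --t->active == 0) {
		t->armed_ms = t->idle_ms;
	}
}
```

With this model a long zone transfer keeps `active` above zero, so neither timeout can fire in the middle of it.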
Mark Andrews	a9f883cbc2	Stop using deprecated calls in lib/isc/tls.c from Rosen Penev @neheb	2021-03-17 20:05:47 +00:00
Artem Boldariev	75363dcb7c	Load full certificate chain from a certificate chain file This commit fixes loading the certificate chain files so that the full chain could be sent to the clients which require that for verification. Before that fix only the topmost certificate would be loaded from the chain and sent to clients, preventing some of them from performing certificate validation (e.g. Windows 10 DoH client).	2021-03-16 11:49:04 +02:00
Mark Andrews	99bd0c346f	cast (char) to (unsigned char) when calling is*()	2021-03-15 14:18:03 +11:00
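The reason for the cast in the commit above is that the `is*()` functions from `<ctype.h>` take an `int` whose value must be representable as `unsigned char` (or be `EOF`); passing a plain `char` with the high bit set is undefined behaviour where `char` is signed. A minimal sketch, using a hypothetical `count_alpha()` helper:

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: count alphabetic bytes in a NUL-terminated string. */
size_t
count_alpha(const char *s) {
	size_t n = 0;

	for (; *s != '\0'; s++) {
		/* Cast first: a negative char must never reach isalpha(). */
		if (isalpha((unsigned char)*s)) {
			n++;
		}
	}
	return (n);
}
```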
Artem Boldariev	7a59fb8207	Disable Nagle's algorithm for HTTP/2 connections It is advisable to disable Nagle's algorithm for HTTP/2 connections because multiple HTTP/2 streams could be multiplexed over one transport connection. Thus, delays when delivering small packets could bring down performance for the whole session. HTTP/2 is meant to be used this way.	2021-03-05 18:09:42 +02:00
Artem Boldariev	66d20cf28b	Fix deadlock in isc_nm_tlsconnect() when called from within the context of a network thread, isc_nm_tlsconnect() hangs. it is waiting for the socket's result code to be updated, but that update is supposed to happen asynchronously in the network thread, and if we're already blocking in the network thread, it can never occur. we can kluge around this by setting the socket result code early; this works for most clients (including "dig"), but it causes inconsistent behaviors that manifest as test failures in the DoH unit test. so we kluged around it even more by setting the socket result code early only when running in the network thread. we need a better solution for this problem, but this will do for now.	2021-03-05 18:09:22 +02:00
Artem Boldariev	ca9a15e3bc	DoH: call send callbacks after data was actually sent	2021-03-05 13:29:32 +02:00
Artem Boldariev	71668437d4	Put sane limitations in place to handle bad requests gracefully This commit makes the server-side code polite. It fixes the error handling code on the server side and fixes returning error code in responses (there was a nasty bug which could potentially crash the server). Also, in this commit we limit max size POST request data to 96K, max processed data size in headers to 128K (should be enough to handle any GET requests). If these limits are surpassed, server will terminate the request with RST_STREAM without responding with error code. Otherwise it politely responds with error code. This commit also limits number of concurrent HTTP/2 streams per transport connection on server to 100 (as nghttp2 advises by default). Ideally, these parameters should be configurable both globally and per every HTTP endpoint description in the configuration file, but for now putting sane limits should be enough.	2021-03-05 13:29:32 +02:00
Evan Hunt	88752b1121	refactor outgoing HTTP connection support - style, cleanup, and removal of unnecessary code. - combined isc_nm_http_add_endpoint() and isc_nm_http_add_doh_endpoint() into one function, renamed isc_http_endpoint(). - moved isc_nm_http_connect_send_request() into doh_test.c as a helper function; remove it from the public API. - renamed isc_http2 and isc_nm_http2 types and functions to just isc_http and isc_nm_http, for consistency with other existing names. - shortened a number of long names. - the caller is now responsible for determining the peer address in isc_nm_httpconnect(); this eliminates the need to parse the URI and the dependency on an external resolver. - the caller is also now responsible for creating the SSL client context, for consistency with isc_nm_tlsdnsconnect(). - added setter functions for HTTP/2 ALPN. instead of setting up ALPN in isc_tlsctx_createclient(), we now have a function isc_tlsctx_enable_http2client_alpn() that can be run from isc_nm_httpconnect(). - refactored isc_nm_httprequest() into separate read and send functions. when isc_nm_send() or isc_nm_read() is called on an http socket, it will be stored until a corresponding isc_nm_read() or _send() arrives; when we have both halves of the pair the HTTP request will be initiated. - isc_nm_httprequest() is renamed isc__nm_http_request() for use as an internal helper function by the DoH unit test. (eventually doh_test should be rewritten to use read and send, and this function should be removed.) - added implementations of isc__nm_tls_settimeout() and isc__nm_http_settimeout(). - increased NGHTTP2 header block length for client connections to 128K. - use isc_mem_t for internal memory allocations inside nghttp2, to help track memory leaks. - send "Cache-Control" header in requests and responses. (note: currently we try to bypass HTTP caching proxies, but ideally we should interact with them: https://tools.ietf.org/html/rfc8484#section-5.1)	2021-03-05 13:29:26 +02:00
Ondřej Surý	a55bdb28f9	Assigning uint64_t from buffer might be misaligned in netmgr tests Resolve possible 8-byte unaligned access when assigning the magic value from the received buffer.	2021-03-04 15:02:24 +01:00
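A common way to avoid the unaligned access described in the commit above is to copy the bytes with `memcpy()` instead of dereferencing a cast pointer. The sketch below illustrates that general technique; the source does not say which approach the commit took, and `load_u64()` is a hypothetical name.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: read a 64-bit value from an arbitrary byte offset.
 * Dereferencing (const uint64_t *)buf would require 8-byte alignment;
 * memcpy() has no such requirement and compilers typically turn it into
 * a single load where the platform allows it.
 */
uint64_t
load_u64(const unsigned char *buf) {
	uint64_t value;

	memcpy(&value, buf, sizeof(value));
	return (value);
}
```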
Ondřej Surý	d3bb3ae64f	Fix comparison between signed and unsigned integer expressions Simple typecast to size_t should be enough to silence the warning on ARMv7, even though the code is in fact correct, because the readlen is checked for being < 0 in the block before the warning.	2021-03-04 11:21:43 +01:00
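The pattern the commit above describes, checking for a negative value first and only then casting to `size_t`, can be sketched as follows. The function `read_filled()` and its parameters are hypothetical; the actual comparison in the netmgr code is not shown in the source.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

/*
 * Hypothetical helper: report whether a read returned exactly the
 * expected number of bytes. The negative case is handled first, so the
 * cast to size_t below cannot change the value; it only silences the
 * signed/unsigned comparison warning.
 */
bool
read_filled(ssize_t readlen, size_t expected) {
	if (readlen < 0) {
		return (false);
	}
	return ((size_t)readlen == expected);
}
```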
Ondřej Surý	a50f5d0cf5	Call isc__initialize()/isc__shutdown() from win32 DllMain Call the libisc isc__initialize() constructor and isc__shutdown() destructor from DllMain instead of having duplicate code between those and DllMain() code.	2021-03-01 14:24:57 +01:00
Ondřej Surý	888bdfc1ff	Add mempool get/put tracking with AddressSanitizer When AddressSanitizer is in use, disable the internal mempool implementation and redirect the isc_mempool_get to isc_mem_get (and similarly for isc_mempool_put). This is the method recommended by the AddressSanitizer authors for tracking allocations and deallocations instead of custom poison/unpoison code (see https://github.com/google/sanitizers/wiki/AddressSanitizerManualPoisoning).	2021-02-26 10:05:42 -08:00
Ondřej Surý	a0181056a8	Change the isc_thread_self() return type to uintptr_t The pthread_self(), thrd_current() or GetCurrentThreadId() could actually be a pointer, so we should rather convert the value into uintptr_t instead of unsigned long.	2021-02-25 16:21:10 +01:00
Ondřej Surý	bea333f7c9	Use globally assigned thread_id in the isc_hp API Convert the isc_hp API to use the globally available isc_tid_v instead of locally defined tid_v. This should solve most of the problems on machines with a large number of cores / CPUs.	2021-02-25 16:21:10 +01:00
Ondřej Surý	cbbecfcc82	Add isc_trampoline API to have simple accounting around threads The current isc_hp API uses internal tid_v variable that gets incremented for each new thread using hazard pointers. This tid_v variable is then used as an index to global shared table with hazard pointers state. Since the tid_v is only incremented and never decremented, the table could overflow very quickly if we create a set of threads for a short period of time, they finish the work and cease to exist. Then we create an identical set of threads and so on and so on. This is not a problem for a normal `named` operation as the set of threads is stable, but the problematic places are the unit tests where we test network manager or other APIs (task, timer) that create threads. This commit adds a thin wrapper around any function called from isc_thread_create() that adds unique-but-reusable small digit thread id that can be used as index to e.g. hazard pointer tables. The trampoline wrapper ensures that the thread ids will be reused, so the highest thread_id number doesn't grow indefinitely when threads are created and destroyed and then created again. This fixes the hazard pointer table overflow on machines with many cores. [GL #2396]	2021-02-25 16:21:10 +01:00
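The "unique-but-reusable" thread id idea from the commit above can be illustrated with a tiny slot allocator. This is a sketch under loud assumptions: `tid_acquire()`, `tid_release()` and `MAX_THREADS` are hypothetical, the table size is arbitrary, and the code is deliberately not thread-safe (a real wrapper would need a lock or atomics around the table).

```c
#include <stdbool.h>

#define MAX_THREADS 8 /* hypothetical table size, for illustration only */

static bool slot_used[MAX_THREADS];

/* Return the lowest free id, or -1 when every slot is taken. */
int
tid_acquire(void) {
	for (int i = 0; i < MAX_THREADS; i++) {
		if (!slot_used[i]) {
			slot_used[i] = true;
			return (i);
		}
	}
	return (-1);
}

/* Give an id back so that the next thread can reuse it. */
void
tid_release(int tid) {
	if (tid >= 0 && tid < MAX_THREADS) {
		slot_used[tid] = false;
	}
}
```

Because released ids are handed out again, the highest id in use is bounded by the number of threads alive at the same time, not by the number of threads ever created.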
Mark Andrews	3ac53daa06	Address unbalanced lock/unlock Also address race between reading and testing mpctx->allocated and incrementing mpctx->allocated.	2021-02-25 13:08:07 +11:00
Ondřej Surý	c5887c4312	Disable safe-guard assertion in DLL_THREAD_ATTACH/DLL_THREAD_DETACH The BIND 9 libraries on Windows define DllMain(), an optional entry point into a dynamic-link library (DLL). When the system starts or terminates a process or thread, it calls the entry-point function for each loaded DLL using the first thread of the process. When the DLL is being loaded into the virtual address space of the current process as a result of the process starting up, we make a call to DisableThreadLibraryCalls() which should disable the DLL_THREAD_ATTACH and DLL_THREAD_DETACH notifications for the specified dynamic-link library (DLL). This seems not to be the case because we never check the return value of the DisableThreadLibraryCalls() call, and it could in fact fail. The DisableThreadLibraryCalls() function fails if the DLL specified by hModule has active static thread local storage, or if hModule is an invalid module handle. In this commit, we remove the safe-guard assertion put in place for the DLL_THREAD_ATTACH and DLL_THREAD_DETACH events and we just ignore them. BIND 9 doesn't create/destroy enough threads for it actually to make any difference, and in fact we do use static thread local storage in the code.	2021-02-24 08:31:42 +01:00
Ondřej Surý	f53e7ed12c	Include lib/isc/tls_p.h in release tarballs The addition of lib/isc/tls_p.h to the source tree was not accounted for in the relevant variable in lib/isc/Makefile.am and thus the former file is not being included in release tarballs prepared using "make dist". Fix by tweaking the libisc_la_SOURCES list in lib/isc/Makefile.am accordingly.	2021-02-19 13:25:18 +01:00
Ondřej Surý	494d0da522	Use library constructor/destructor to initialize OpenSSL Instead of calling isc_tls_initialize()/isc_tls_destroy() explicitly use gcc/clang attributes on POSIX and DLLMain on Windows to initialize and shutdown OpenSSL library. This resolves the issue when isc_nm_create() / isc_nm_destroy() was called multiple times and it would call OpenSSL library destructors from isc_nm_destroy(). At the same time, since we now have introduced the ctor/dtor for libisc, this commit moves the isc_mem API initialization (the list of the contexts) and changes the isc_mem_checkdestroyed() to schedule the checking of memory context on library unload instead of executing the code immediately.	2021-02-18 19:33:54 +01:00
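On POSIX, the "gcc/clang attributes" mentioned in the commit above are `__attribute__((constructor))` and `__attribute__((destructor))`. The sketch below shows only that mechanism; the names `lib_initialize()`, `lib_shutdown()` and `lib_is_initialized()` are hypothetical stand-ins, and no OpenSSL calls are made.

```c
#include <stdbool.h>

static bool lib_initialized = false;

/* Runs automatically before main() (or when the shared library is loaded). */
__attribute__((constructor)) static void
lib_initialize(void) {
	lib_initialized = true;
}

/* Runs automatically at exit (or when the shared library is unloaded). */
__attribute__((destructor)) static void
lib_shutdown(void) {
	lib_initialized = false;
}

/* Callers never initialize explicitly; they can only observe the state. */
bool
lib_is_initialized(void) {
	return (lib_initialized);
}
```

Because initialization is tied to library load and unload rather than to an object's lifetime, creating and destroying several managers in one process no longer tears down shared state in between.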
Ondřej Surý	4bde4f050b	Disable calling DllMain() on thread creation/destruction Disables the DLL_THREAD_ATTACH and DLL_THREAD_DETACH notifications for the specified dynamic-link library (DLL). This can reduce the size of the working set for some applications.	2021-02-18 19:33:54 +01:00
Ondřej Surý	f225462055	Fix the invalid condition variable Although harmless, the memmove() in tlsdns and tcpdns was guarded by a current message length variable that was always bigger than 0 instead of correct current buffer length remainder variable.	2021-02-18 19:33:54 +01:00
Ondřej Surý	4775e9f256	Move most of the OpenSSL initialization to isc_tls Since we now require both libcrypto and libssl to be initialized for netmgr, we move all the OpenSSL initialization code except the engine initialization to isc_tls API. The isc_tls_initialize() and isc_tls_destroy() have been made idempotent, so they can be called multiple times. However, once isc_tls_destroy() has been called, isc_tls_initialize() cannot be called again.	2021-02-18 19:33:54 +01:00
Ondřej Surý	ff47b47f1a	Remove overrun checking code from memory allocator The ISC_MEM_CHECKOVERRUN would add a canary byte at the end of every allocation and check that the canary byte hasn't been changed at free time. The AddressSanitizer and valgrind memory checks surpass simple checks like this, so there's no need to actually keep the code inside the allocator.	2021-02-18 19:33:54 +01:00
Ondřej Surý	549e5b693a	Modify the way we benchmark mem_{get,put} Previously, the mem_{get,put} benchmark would pass the allocation size as thread_create argument. This has now been changed, so the allocation size is stored and decremented (divided) in an atomic variable and the thread create routine is given a memory context. This will allow writing tests where each thread is given a different memory context, and doing the same for mempool benchmarking.	2021-02-18 19:33:54 +01:00
Ondřej Surý	f34f943b16	Disable memory debugging features in non-developer build The two memory debugging features: ISC_MEM_DEFAULTFILL (ISC_MEMFLAG_FILL) and ISC_MEM_TRACKLINES were always enabled in all builds and the former was only disabled in `named`. This commit disables those two features in non-developer builds to make the memory allocator significantly faster.	2021-02-18 19:33:54 +01:00
Ondřej Surý	c9fe12443f	Make the mempool names unconditional The named memory pools were default and always compiled-in. Remove the extra complexity by removing the #define and #ifdefs around the code.	2021-02-18 19:33:54 +01:00
Ondřej Surý	b09106e93a	Make the memory and mempool counters to be stdatomic types This is yet another step into unlocking some parts of the memory contexts. All the regularly updated variables have been turned into atomic types, so we can later remove the locks when updating various counters. Also unlock as much code as possible without breaking anything.	2021-02-18 19:33:51 +01:00
Ondřej Surý	0f44139145	Bump the maximum number of hazard pointers in tests On a 24-core machine, the tests would crash because we would run out of the hazard pointers. We now adjust the number of hazard pointers to be in the <128,256> interval based on the number of available cores. Note: This is just a band-aid and needs a proper fix.	2021-02-18 19:32:55 +01:00
Ondřej Surý	7de846977b	Remove the extra level of indirection via isc_memmethods_t Previously, the applications using libisc would be able to override the internal memory methods with own implementation. This was no longer possible, but the extra level of indirection was not removed. This commit removes the extra level of indirection for the memory methods and the default_memalloc() and default_memfree().	2021-02-18 19:32:55 +01:00
Ondřej Surý	55ace5d3aa	Remove the internal memory allocator The internal memory allocator had extra code to keep a list of blocks for small size allocations. This would help to reduce the interactions with the system malloc as the memory would be already allocated from the system, but there's an extra cost associated with that - all the allocations/deallocations must be locked, effectively eliminating any optimizations in the system allocator targeted at multi-threaded applications. While the isc_mem API is still using locks pretty heavily, this is a first step into reducing the memory allocation/deallocation contention.	2021-02-18 19:32:02 +01:00
Ondřej Surý	66eefac78c	Rollback setting IP_DONTFRAG option on the UDP sockets In DNS Flag Day 2020, the development branch started setting the IP_DONTFRAG option on the UDP sockets. It turned out that this code was incomplete, leading to dropping the outgoing UDP packets. Hence this commit rolls back this setting until we have a proper fix that would send back an empty response with TC flag set.	2021-02-17 08:09:56 +01:00
Michal Nowak	c286341703	Use SKIPPED_TEST_EXIT_CODE consistently Commit `fa505bfb0e` omitted two unit tests while introducing the SKIPPED_TEST_EXIT_CODE preprocessor macro. Fix the outliers to make use of SKIPPED_TEST_EXIT_CODE consistent across all unit tests. Also make sure lib/dns/tests/dnstap_test returns an exit code that indicates a skipped test when dnstap is not enabled.	2021-02-16 13:41:50 +01:00
Ondřej Surý	d1448a4c2a	Move the <isc/readline.h> header to bin/dig/readline.h The <isc/readline.h> header provided a compatibility shim to use when other non-GNU readline libraries are in use. The two places where the readline library is used are nslookup and nsupdate, so the header file has been moved to bin/dig directory and it's directly included from bin/nsupdate. This also conceals any readline headers exposed from the libisc headers.	2021-02-16 01:04:46 +00:00

4452 Commits