mir/bind - bind - Mike's Git repositories

mir/bind

mirror of https://gitlab.isc.org/isc-projects/bind9 synced 2025-08-28 13:08:06 +00:00

Author	SHA1	Message	Date
Evan Hunt	a3ba95116e	Handle UDP send errors when sending DNS message larger than MTU When the fragmentation is disabled on UDP sockets, the uv_udp_send() call can fail with UV_EMSGSIZE for messages larger than path MTU. Previously, this error would end with just discarding the response. In this commit, a proper handling of such case is added and on such error, a new DNS response with truncated bit set is generated and sent to the client. This change allows us to disable the fragmentation on the UDP sockets again.	2021-06-23 17:41:34 +02:00
Ondřej Surý	ec86759401	Replace netmgr per-protocol sequential function with a common one Previously, each protocol (TCPDNS, TLSDNS) has specified own function to disable pipelining on the connection. An oversight would lead to assertion failure when opcode is not query over non-TCPDNS protocol because the isc_nm_tcpdns_sequential() function would be called over non-TCPDNS socket. This commit removes the per-protocol functions and refactors the code to have and use common isc_nm_sequential() function that would either disable the pipelining on the socket or would handle the request in per specific manner. Currently it ignores the call for HTTP sockets and causes assertion failure for protocols where it doesn't make sense to call the function at all.	2021-06-22 17:21:44 +03:00
Ondřej Surý	440fb3d225	Completely remove BIND 9 Windows support The Windows support has been completely removed from the source tree and BIND 9 now no longer supports native compilation on Windows. We might consider reviewing mingw-w64 port if contributed by external party, but no development efforts will be put into making BIND 9 compile and run on Windows again.	2021-06-09 14:35:14 +02:00
Mark Andrews	715a2c7fc1	Add missing initialisations configuring with --enable-mutex-atomics flagged these incorrectly initialised variables on systems where pthread_mutex_init doesn't just zero out the structure.	2021-05-26 08:15:08 +00:00
Ondřej Surý	28b65d8256	Reduce the number of clientmgr objects created Previously, as a way of reducing the contention between threads a clientmgr object would be created for each interface/IP address. We tasks being more strictly bound to netmgr workers, this is no longer needed and we can just create clientmgr object per worker queue (ncpus). Each clientmgr object than would have a single task and single memory context.	2021-05-24 20:44:54 +02:00
Ondřej Surý	0be7ea78be	Reduce the number of client tasks and bind them to netmgr queues Since a client object is bound to a netmgr handle, each client will always be processed by the same netmgr worker, so we can simplify the code by binding client->task to the same thread as the client. Since ns__client_request() now runs in the same event loop as client->task events, is no longer necessary to pause the task manager before launching them. Also removed some functions in isc_task that were not used.	2021-05-24 20:02:20 +02:00
Ondřej Surý	c07f8c5a43	Reduce the number of tasks in the clientmgr We now use one task per CPU per dispatchmgr (that's still a lot).	2021-05-24 20:02:20 +02:00
Ondřej Surý	0719f032e1	Reduce the number of mctx created in clientmgr The number of memory contexts created in the clientmgr was enormous. It could easily create thousands of memory contexts because the formula was: nprotocols * ncpus * ninterfaces * CLIENT_NMCTXS_PERCPU (8) The original goal was to reduce the contention when allocating the memory, but after a while nobody noticed that the amount of memory context allocated would not reduce contention at all. This commit removes the whole mctxpool and just uses the mctx from clientmgr as the contention will be reduced directly in the allocator.	2021-05-24 20:02:20 +02:00
Evan Hunt	e31cc1eeb4	use a fixedname buffer in dns_message_gettempname() dns_message_gettempname() now returns a pointer to an initialized name associated with a dns_fixedname_t object. it is no longer necessary to allocate a buffer for temporary names associated with the message object.	2021-05-20 20:41:29 +02:00
Ondřej Surý	b5bf58b419	Destroy netmgr before destroying taskmgr With taskmgr running on top of netmgr, the ordering of how the tasks and netmgr shutdown interacts was wrong as previously isc_taskmgr_destroy() was waiting until all tasks were properly shutdown and detached. This responsibility was moved to netmgr, so we now need to do the following: 1. shutdown all the tasks - this schedules all shutdown events onto the netmgr queue 2. shutdown the netmgr - this also makes sure all the tasks and events are properly executed 3. Shutdown the taskmgr - this now waits for all the tasks to finish running before returning 4. Shutdown the netmgr - this call waits for all the netmgr netievents to finish before returning This solves the race when the taskmgr object would be destroyed before all the tasks were finished running in the netmgr loops.	2021-05-07 14:28:30 -07:00
Ondřej Surý	36ddefacb4	Change the isc_nm_(get\|set)timeouts() to work with milliseconds The RFC7828 specifies the keepalive interval to be 16-bit, specified in units of 100 milliseconds and the configuration options tcp-*-timeouts are following the suit. The units of 100 milliseconds are very unintuitive and while we can't change the configuration and presentation format, we should not follow this weird unit in the API. This commit changes the isc_nm_(get\|set)timeouts() functions to work with milliseconds and convert the values to milliseconds before passing them to the function, not just internally.	2021-03-18 16:37:57 +01:00
Ondřej Surý	0f44139145	Bump the maximum number of hazard pointers in tests On 24-core machine, the tests would crash because we would run out of the hazard pointers. We now adjust the number of hazard pointers to be in the <128,256> interval based on the number of available cores. Note: This is just a band-aid and needs a proper fix.	2021-02-18 19:32:55 +01:00
Diego Fronza	e219422575	Allow stale data to be used before name resolution This commit allows stale RRset to be used (if available) for responding a query, before an attempt to refresh an expired, or otherwise resolve an unavailable RRset in cache is made. For that to work, a value of zero must be specified for stale-answer-client-timeout statement. To better understand the logic implemented, there are three flags being used during database lookup and other parts of code that must be understood: . DNS_DBFIND_STALEOK: This flag is set when BIND fails to refresh a RRset due to timeout (resolver-query-timeout), its intent is to try to look for stale data in cache as a fallback, but only if stale answers are enabled in configuration. This flag is also used to activate stale-refresh-time window, since it is the only way the database knows that a resolution has failed. . DNS_DBFIND_STALEENABLED: This flag is used as a hint to the database that it may use stale data. It is always set during query lookup if stale answers are enabled, but only effectively used during stale-refresh-time window. Also during this window, the resolver will not try to resolve the query, in other words no attempt to refresh the data in cache is made when the stale-refresh-time window is active. . DNS_DBFIND_STALEONLY: This new introduced flag is used when we want stale data from the database, but not due to a failure in resolution, it also doesn't require stale-refresh-time window timer to be active. As long as there is a stale RRset available, it should be returned. It is mainly used in two situations: 1. When stale-answer-client-timeout timer is triggered: in that case we want to know if there is stale data available to answer the client. 2. When stale-answer-client-timeout value is set to zero: in that case, we also want to know if there is some stale RRset available to promptly answer the client. We must also discern between three situations that may happen when resolving a query after the addition of stale-answer-client-timeout statement, and how to handle them: 1. Are we running query_lookup() due to stale-answer-client-timeout timer being triggered? In this case, we look for stale data, making use of DNS_DBFIND_STALEONLY flag. If a stale RRset is available then respond the client with the data found, mark this query as answered (query attribute NS_QUERYATTR_ANSWERED), so when the fetch completes the client won't be answered twice. We must also take care of not detaching from the client, as a fetch will still be running in background, this is handled by the following snippet: if (!QUERY_STALEONLY(&client->query)) { isc_nmhandle_detach(&client->reqhandle); } Which basically tests if DNS_DBFIND_STALEONLY flag is set, which means we are here due to a stale-answer-client-timeout timer expiration. 2. Are we running query_lookup() due to resolver-query-timeout being triggered? In this case, DNS_DBFIND_STALEOK flag will be set and an attempt to look for stale data will be made. As already explained, this flag is algo used to activate stale-refresh-time window, as it means that we failed to refresh a RRset due to timeout. It is ok in this situation to detach from the client, as the fetch is already completed. 3. Are we running query_lookup() during the first time, looking for a RRset in cache and stale-answer-client-timeout value is set to zero? In this case, if stale answers are enabled (probably), we must do an initial database lookup with DNS_DBFIND_STALEONLY flag set, to indicate to the database that we want stale data. If we find an active RRset, proceed as normal, answer the client and the query is done. If we find a stale RRset we respond to the client and mark the query as answered, but don't detach from the client yet as an attempt in refreshing the RRset will still be made by means of the new introduced function 'query_resolve'. If no active or stale RRset is available, begin resolution as usual.	2021-01-25 10:47:14 -03:00
Diego Fronza	171a5b7542	Add stale-answer-client-timeout option The general logic behind the addition of this new feature works as folows: When a client query arrives, the basic path (query.c / ns_query_recurse) was to create a fetch, waiting for completion in fetch_callback. With the introduction of stale-answer-client-timeout, a new event of type DNS_EVENT_TRYSTALE may invoke fetch_callback, whenever stale answers are enabled and the fetch took longer than stale-answer-client-timeout to complete. When an event of type DNS_EVENT_TRYSTALE triggers fetch_callback, we must ensure that the folowing happens: 1. Setup a new query context with the sole purpose of looking up for stale RRset only data, for that matters a new flag was added 'DNS_DBFIND_STALEONLY' used in database lookups. . If a stale RRset is found, mark the original client query as answered (with a new query attribute named NS_QUERYATTR_ANSWERED), so when the fetch completion event is received later, we avoid answering the client twice. . If a stale RRset is not found, cleanup and wait for the normal fetch completion event. 2. In ns_query_done, we must change this part: /* * If we're recursing then just return; the query will * resume when recursion ends. */ if (RECURSING(qctx->client)) { return (qctx->result); } To this: if (RECURSING(qctx->client) && !QUERY_STALEONLY(qctx->client)) { return (qctx->result); } Otherwise we would not proceed to answer the client if it happened that a stale answer was found when looking up for stale only data. When an event of type DNS_EVENT_FETCHDONE triggers fetch_callback, we proceed as before, resuming query, updating stats, etc, but a few exceptions had to be added, most important of which are two: 1. Before answering the client (ns_client_send), check if the query wasn't already answered before. 2. Before detaching a client, e.g. isc_nmhandle_detach(&client->reqhandle), ensure that this is the fetch completion event, and not the one triggered due to stale-answer-client-timeout, so a correct call would be: if (!QUERY_STALEONLY(client)) { isc_nmhandle_detach(&client->reqhandle); } Other than these notes, comments were added in code in attempt to make these updates easier to follow.	2021-01-25 10:47:14 -03:00
Ondřej Surý	634bdfb16d	Refactor netmgr and add more unit tests This is a part of the works that intends to make the netmgr stable, testable, maintainable and tested. It contains a numerous changes to the netmgr code and unfortunately, it was not possible to split this into smaller chunks as the work here needs to be committed as a complete works. NOTE: There's a quite a lot of duplicated code between udp.c, tcp.c and tcpdns.c and it should be a subject to refactoring in the future. The changes that are included in this commit are listed here (extensively, but not exclusively): * The netmgr_test unit test was split into individual tests (udp_test, tcp_test, tcpdns_test and newly added tcp_quota_test) * The udp_test and tcp_test has been extended to allow programatic failures from the libuv API. Unfortunately, we can't use cmocka mock() and will_return(), so we emulate the behaviour with #define and including the netmgr/{udp,tcp}.c source file directly. * The netievents that we put on the nm queue have variable number of members, out of these the isc_nmsocket_t and isc_nmhandle_t always needs to be attached before enqueueing the netievent_<foo> and detached after we have called the isc_nm_async_<foo> to ensure that the socket (handle) doesn't disappear between scheduling the event and actually executing the event. * Cancelling the in-flight TCP connection using libuv requires to call uv_close() on the original uv_tcp_t handle which just breaks too many assumptions we have in the netmgr code. Instead of using uv_timer for TCP connection timeouts, we use platform specific socket option. * Fix the synchronization between {nm,async}_{listentcp,tcpconnect} When isc_nm_listentcp() or isc_nm_tcpconnect() is called it was waiting for socket to either end up with error (that path was fine) or to be listening or connected using condition variable and mutex. Several things could happen: 0. everything is ok 1. the waiting thread would miss the SIGNAL() - because the enqueued event would be processed faster than we could start WAIT()ing. In case the operation would end up with error, it would be ok, as the error variable would be unchanged. 2. the waiting thread miss the sock->{connected,listening} = `true` would be set to `false` in the tcp_{listen,connect}close_cb() as the connection would be so short lived that the socket would be closed before we could even start WAIT()ing * The tcpdns has been converted to using libuv directly. Previously, the tcpdns protocol used tcp protocol from netmgr, this proved to be very complicated to understand, fix and make changes to. The new tcpdns protocol is modeled in a similar way how tcp netmgr protocol. Closes: #2194, #2283, #2318, #2266, #2034, #1920 * The tcp and tcpdns is now not using isc_uv_import/isc_uv_export to pass accepted TCP sockets between netthreads, but instead (similar to UDP) uses per netthread uv_loop listener. This greatly reduces the complexity as the socket is always run in the associated nm and uv loops, and we are also not touching the libuv internals. There's an unfortunate side effect though, the new code requires support for load-balanced sockets from the operating system for both UDP and TCP (see #2137). If the operating system doesn't support the load balanced sockets (either SO_REUSEPORT on Linux or SO_REUSEPORT_LB on FreeBSD 12+), the number of netthreads is limited to 1. * The netmgr has now two debugging #ifdefs: 1. Already existing NETMGR_TRACE prints any dangling nmsockets and nmhandles before triggering assertion failure. This options would reduce performance when enabled, but in theory, it could be enabled on low-performance systems. 2. New NETMGR_TRACE_VERBOSE option has been added that enables extensive netmgr logging that allows the software engineer to precisely track any attach/detach operations on the nmsockets and nmhandles. This is not suitable for any kind of production machine, only for debugging. * The tlsdns netmgr protocol has been split from the tcpdns and it still uses the old method of stacking the netmgr boxes on top of each other. We will have to refactor the tlsdns netmgr protocol to use the same approach - build the stack using only libuv and openssl. * Limit but not assert the tcp buffer size in tcp_alloc_cb Closes: #2061	2020-12-01 16:47:07 +01:00
Witold Kręcicki	f68fe9ff14	Fix a startup/shutdown crash in ns_clientmgr_create	2020-11-10 14:17:05 +01:00
Mark Andrews	b09727a765	Implement DNSTAP support in ns_client_sendraw() ns_client_sendraw() is currently only used to relay UPDATE responses back to the client. dns_dt_send() is called with this assumption.	2020-11-10 06:15:46 +00:00
Ondřej Surý	f7c82e406e	Fix the isc_nm_closedown() to actually close the pending connections 1. The isc__nm_tcp_send() and isc__nm_tcp_read() was not checking whether the socket was still alive and scheduling reads/sends on closed socket. 2. The isc_nm_read(), isc_nm_send() and isc_nm_resumeread() have been changed to always return the error conditions via the callbacks, so they always succeed. This applies to all protocols (UDP, TCP and TCPDNS).	2020-10-22 11:37:16 -07:00
Ondřej Surý	33eefe9f85	The dns_message_create() cannot fail, change the return to void The dns_message_create() function cannot soft fail (as all memory allocations either succeed or cause abort), so we change the function to return void and cleanup the calls.	2020-09-29 08:22:08 +02:00
Diego Fronza	12d6d13100	Refactored dns_message_t for using attach/detach semantics This commit will be used as a base for the next code updates in order to have a better control of dns_message_t objects' lifetime.	2020-09-29 08:22:08 +02:00
Evan Hunt	dcee985b7f	update all copyright headers to eliminate the typo	2020-09-14 16:20:40 -07:00
Evan Hunt	57b4dde974	change from isc_nmhandle_ref/unref to isc_nmhandle attach/detach Attaching and detaching handle pointers will make it easier to determine where and why reference counting errors have occurred. A handle needs to be referenced more than once when multiple asynchronous operations are in flight, so callers must now maintain multiple handle pointers for each pending operation. For example, ns_client objects now contain: - reqhandle: held while waiting for a request callback (query, notify, update) - sendhandle: held while waiting for a send callback - fetchhandle: held while waiting for a recursive fetch to complete - updatehandle: held while waiting for an update-forwarding task to complete control channel connection objects now contain: - readhandle: held while waiting for a read callback - sendhandle: held while waiting for a send callback - cmdhandle: held while an rndc command is running httpd connections contain: - readhandle: held while waiting for a read callback - sendhandle: held while waiting for a send callback	2020-09-11 12:17:57 -07:00
Mark Andrews	20488d6ad3	Map DNS_R_BADTSIG to FORMERR Now that the log message has been printed set the result code to DNS_R_FORMERR. We don't do this via dns_result_torcode() as we don't want upstream errors to produce FORMERR if that processing end with DNS_R_BADTSIG.	2020-08-04 12:20:37 +00:00
Evan Hunt	23c7373d68	restore "blackhole" functionality the blackhole ACL was accidentally disabled with respect to client queries during the netmgr conversion. in order to make this work for TCP, it was necessary to add a return code to the accept callback functions passed to isc_nm_listentcp() and isc_nm_listentcpdns().	2020-06-30 17:29:09 -07:00
Mark Andrews	abe2c84b1d	Suppress cppcheck warnings: cppcheck-suppress objectIndex cppcheck-suppress nullPointerRedundantCheck	2020-06-25 12:04:36 +10:00
Evan Hunt	75c985c07f	change the signature of recv callbacks to include a result code this will allow recv event handlers to distinguish between cases in which the region is NULL because of error, shutdown, or cancelation.	2020-06-19 12:33:26 -07:00
Mark Andrews	924e141a15	Adjust NS_CLIENT_TCP_BUFFER_SIZE and cleanup client_allocsendbuf NS_CLIENT_TCP_BUFFER_SIZE was 2 byte too large following the move to netmgr add associated changes to lib/ns/client.c and as a result an INSIST could be trigger if the DNS message being constructed had a checkpoint stage that fell in those two extra bytes. Adjusted NS_CLIENT_TCP_BUFFER_SIZE and cleaned up client_allocsendbuf now that the previously reserved 2 bytes are no longer used.	2020-06-18 09:59:19 +02:00
Evan Hunt	249184e03e	add a quick-and-dirty method of debugging a single query when built with "configure --enable-singletrace", named will produce detailed query logging at the highest debug level for any query with query ID zero. this enables monitoring of the progress of a single query by specifying the QID using "dig +qid=0". the "client" logging category should be set to a low severity level to suppress logging of other queries. (the chance of another query using QID=0 at the same time is only 1 in 2^16.) "--enable-singletrace" turns on "--enable-querytrace" as well, so if the logging severity is not lowered, all other queries will be logged verbosely as well. compiling with either of these options will impair query performance; they should only be turned on when testing or troubleshooting.	2020-05-26 00:47:18 -07:00
Evan Hunt	57e54c46e4	change "expr == false" to "!expr" in conditionals	2020-05-25 16:09:57 -07:00
Evan Hunt	68a1c9d679	change 'expr == true' to 'expr' in conditionals	2020-05-25 16:09:57 -07:00
Ondřej Surý	78886d4bed	Fix the statistic counter underflow in ns_client_t In case of normal fetch, the .recursionquota is attached and ns_statscounter_recursclients is incremented when the fetch is created. Then the .recursionquota is detached and the counter decremented in the fetch_callback(). In case of prefetch or rpzfetch, the quota is attached, but the counter is not incremented. When we reach the soft-quota, the function returns early but don't detach from the quota, and it gets destroyed during the ns_client_endrequest(), so no memory was leaked. But because the ns_statscounter_recursclients is only incremented during the normal fetch the counter would be incorrectly decremented on two occassions: 1) When we reached the softquota, because the quota was not properly detached 2) When the prefetch or rpzfetch was cancelled mid-flight and the callback function was never called.	2020-04-03 19:41:46 +02:00
Witold Kręcicki	df3dbdff81	Destroy query in killoldestquery under a lock. Fixes a race between ns_client_killoldestquery and ns_client_endrequest - killoldestquery takes a client from `recursing` list while endrequest destroys client object, then killoldestquery works on a destroyed client object. Prevent it by holding reclist lock while cancelling query.	2020-03-05 08:13:50 +00:00
Evan Hunt	0b76d8a490	comments	2020-02-28 08:46:16 +01:00
Witold Kręcicki	4b6a064972	Don't define NS_CLIENT_TRACE by default	2020-02-28 08:46:16 +01:00
Witold Kręcicki	8c6c07286f	Remove some stale fields from ns_client_t; make sendbuf allocated on heap	2020-02-28 08:46:16 +01:00
Witold Kręcicki	938b61405b	Don't check if the client is on recursing list (requires locking) if it's not RECURSING	2020-02-28 08:46:16 +01:00
Witold Kręcicki	b0888ff039	Don't issue ns_client_endrequest on a NS_CLIENTSTATE_READY client. Fix a potential assertion failure on shutdown in ns__client_endrequest. Scenario: 1. We are shutting down, interface->clientmgr is gone. 2. We receive a packet, it gets through ns__client_request 3. mgr == NULL, return 4. isc_nmhandle_detach calls ns_client_reset_cb 5. ns_client_reset_cb calls ns_client_endrequest 6. INSIST(client->state == NS_CLIENTSTATE_WORKING \|\| client->state == NS_CLIENTSTATE_RECURSING) is not met - we haven't started processing this packet so client->state == NS_CLIENTSTATE_READY. As a solution - don't do anything in ns_client_reset_cb if the client is still in READY state.	2020-02-26 12:15:01 +00:00
Witold Kręcicki	952f7b503d	Use thread-friendly mctxpool and taskpool in ns_client. Make ns_client mctxpool more thread-friendly by sharding it by netmgr threadid, use task pool also sharded by thread id to avoid lock contention.	2020-02-18 10:31:13 +01:00
Ondřej Surý	5777c44ad0	Reformat using the new rules	2020-02-14 09:31:05 +01:00
Evan Hunt	e851ed0bb5	apply the modified style	2020-02-13 15:05:06 -08:00
Ondřej Surý	056e133c4c	Use clang-tidy to add curly braces around one-line statements The command used to reformat the files in this commit was: ./util/run-clang-tidy \ -clang-tidy-binary clang-tidy-11 -clang-apply-replacements-binary clang-apply-replacements-11 \ -checks=-,readability-braces-around-statements \ -j 9 \ -fix \ -format \ -style=file \ -quiet clang-format -i --style=format $(git ls-files '.c' '.h') uncrustify -c .uncrustify.cfg --replace --no-backup $(git ls-files '.c' '.h') clang-format -i --style=format $(git ls-files '.c' '*.h')	2020-02-13 22:07:21 +01:00
Ondřej Surý	f50b1e0685	Use clang-format to reformat the source files	2020-02-12 15:04:17 +01:00
Ondřej Surý	bc1d4c9cb4	Clear the pointer to destroyed object early using the semantic patch Also disable the semantic patch as the code needs tweaks here and there because some destroy functions might not destroy the object and return early if the object is still in use.	2020-02-09 18:00:17 -08:00
Ondřej Surý	41fe9b7a14	Formatting issues found by local coccinelle run	2020-02-08 03:12:09 -08:00
Ondřej Surý	c73e5866c4	Refactor the isc_buffer_allocate() usage using the semantic patch The isc_buffer_allocate() function now cannot fail with ISC_R_MEMORY. This commit removes all the checks on the return code using the semantic patch from previous commit, as isc_buffer_allocate() now returns void.	2020-02-03 08:29:00 +01:00
Witold Kręcicki	8d6dc8613a	clean up some handle/client reference counting errors in error cases. We weren't consistent about who should unreference the handle in case of network error. Make it consistent so that it's always the client code responsibility to unreference the handle - either in the callback or right away if send function failed and the callback will never be called.	2020-01-20 22:28:36 +01:00
Witold Kręcicki	dcc0835a3a	cleanup properly if we fail to initialize ns_client structure If taskmgr is shutting down ns_client_setup will fail to create a task for the newly created client, we weren't cleaning up already created/attached things (memory context, server, clientmgr).	2020-01-20 22:28:36 +01:00
Witold Kręcicki	eda4300bbb	netmgr: have a single source of truth for tcpdns callback We pass interface as an opaque argument to tcpdns listening socket. If we stop listening on an interface but still have in-flight connections the opaque 'interface' is not properly reference counted, and we might hit a dead memory. We put just a single source of truth in a listening socket and make the child sockets use that instead of copying the value from listening socket. We clean the callback when we stop listening.	2020-01-15 17:22:13 +01:00
Ondřej Surý	7c3e342935	Use isc_refcount_increment0() where appropriate	2020-01-14 13:12:13 +01:00
Ondřej Surý	fbf9856f43	Add isc_refcount_destroy() as appropriate	2020-01-14 13:12:13 +01:00

1 2 3

121 Commits