mir/bind - bind - Mike's Git repositories

mirror of https://gitlab.isc.org/isc-projects/bind9 synced 2025-08-30 05:57:52 +00:00

Author	SHA1	Message	Date
Aram Sargsyan	70ad94257d	Implement tcp-primaries-timeout The new 'tcp-primaries-timeout' configuration option works the same way as the existing 'tcp-initial-timeout' option, but applies only to the TCP connections made to the primary servers, so that the timeout value can be set separately for them. The default is 15 seconds. Also, while accommodating zone.c's code to support the new option, make a light refactoring with the way UDP timeouts are calculated by using definitions instead of hardcoded values.	2025-04-23 17:03:05 +00:00
Evan Hunt	ad7f744115	use ISC_LIST_FOREACH in more places use the ISC_LIST_FOREACH pattern in places where lists had been iterated using a different pattern from the typical `for` loop: for example, `while (!ISC_LIST_EMPTY(...))` or `while ((e = ISC_LIST_HEAD(...)) != NULL)`.	2025-03-31 13:45:14 -07:00
Evan Hunt	522ca7bb54	switch to ISC_LIST_FOREACH everywhere the pattern `for (x = ISC_LIST_HEAD(...); x != NULL; ISC_LIST_NEXT(...)` has been changed to `ISC_LIST_FOREACH` throughout BIND, except in a few cases where the change would be excessively complex. in most cases this was a straightforward change. in some places, however, the list element variable was referenced after the loop ended, and the code was refactored to avoid this necessity. also, because `ISC_LIST_FOREACH` uses typeof(list.head) to declare the list elements, compilation failures can occur if the list object has a `const` qualifier. some `const` qualifiers have been removed from function parameters to avoid this problem, and where that was not possible, `UNCONST` was used.	2025-03-31 13:45:10 -07:00
Artem Boldariev	eaad0aefe6	DoH: Bump the active streams processing limit This commit bumps the total number of active streams (= the opened streams for which a request is received, but response is not ready) to 60% of the total streams limit. The previous limit turned out to be too tight as revealed by longer (≥1h) runs of "stress:long:rpz:doh+udp:linux:*" tests.	2025-03-03 11:32:29 +02:00
Artem Boldariev	217a1ebd79	DoH: remove obsolete INSIST() check The check, while not active by default, is not valid since the commit 8b8f4d500d9c1d41d95d34a79c8935823978114c. See 'if (total == 0) { ...' below branch to understand why.	2025-03-03 11:32:11 +02:00
Artem Boldariev	c5f7968856	DoH: Flush HTTP write buffer on an outgoing DNS message Previously, the code would try to avoid sending any data regardless of what it is unless: a) The flush limit is reached; b) There are no sends in flight. This strategy is used to avoid too numerous send requests with little amount of data. However, it has been proven to be too aggressive and, in fact, harms performance in some cases (e.g., on longer (≥1h) runs of "stress:long:rpz:doh+udp:linux:"). Now, additionally to the listed cases, we also: c) Flush the buffer and perform a send operation when there is an outgoing DNS message passed to the code (which is indicated by the presence of a send callback). That helps improve performance for "stress:long:rpz:doh+udp:linux:" tests.	2025-03-03 11:32:11 +02:00
Artem Boldariev	0e1b02868a	DoH: Limit the number of delayed IO processing requests Previously, a function for continuing IO processing on the next UV tick was introduced (http_do_bio_async()). The intention behind this function was to ensure that http_do_bio() is eventually called at least once in the future. However, the current implementation allows queueing multiple such delayed requests needlessly. There is currently no need for these excessive requests as http_do_bio() can requeue them if needed. At the same time, each such request can lead to a memory allocation, particularly in BIND 9.18. This commit ensures that the number of enqueued delayed IO processing requests never exceeds one in order to avoid potentially bombarding IO threads with the delayed requests needlessly.	2025-03-03 11:32:11 +02:00
Artem Boldariev	0956fb9b9e	DoH: Simplify http_do_bio() This commit significantly simplifies the code flow in the http_do_bio() function, which is responsible for processing incoming and outgoing HTTP/2 data. It seems that the way it was structured before was indirectly caused by the presence of the missing callback calls bug, fixed in 8b8f4d500d9c1d41d95d34a79c8935823978114c. The change introduced by this commit is known to remove a bottleneck and allows reproducible and measurable performance improvement for long runs (>= 1h) of "stress:long:rpz:doh+udp:linux:*" tests. Additionally, it fixes a similar issue with potentially missing send callback calls processing and hardens the code against use-after-free errors related to the session object (they can potentially occur).	2025-03-03 11:32:11 +02:00
Ondřej Surý	c5075a9a61	Remove convenience list macros from isc/util.h The short convenience list macros were used very sparingly and inconsistently in the code base. As consistency is preferred over convenience, all shortened list macros were removed in favor of their ISC_LIST API targets.	2025-03-01 07:33:40 +01:00
Ondřej Surý	2aa70fff76	Remove unused isc_mutexblock and isc_condition units The isc_mutexblock and isc_condition units were no longer in use and were removed.	2025-03-01 07:33:09 +01:00
Artem Boldariev	2adabe835a	DoH: http_send_outgoing() return value is not used The value returned by http_send_outgoing() is not used anywhere, so we make it not return anything (void). Probably it is an omission from older times.	2025-02-19 17:52:36 +02:00
Artem Boldariev	8b8f4d500d	DoH: Fix missing send callback calls When handling outgoing data, there were a couple of rarely executed code paths that would not take into account that the callback MUST be called. It could lead to potential memory leaks and consequent shutdown hangs.	2025-02-19 17:52:36 +02:00
Artem Boldariev	a22bc2d7d4	DoH: change how the active streams number is calculated This commit changes how the number of active HTTP streams is calculated and allows it to scale with the maximum number of streams per connection, instead of effectively capping it at STREAM_CLIENTS_PER_CONN. The original limit is intended to define the pipelining limit for TCP/DoT. However, it appeared to be too restrictive for DoH, which works quite differently and implements pipelining at the protocol level by multiplexing multiple streams. That makes each stream effectively a separate connection from the point of view of the rest of the codebase.	2025-02-19 17:52:36 +02:00
Artem Boldariev	05e8a50818	DoH: Track the amount of in flight outgoing data Previously we would limit the amount of incoming data to process based solely on the presence of uncompleted send requests. That worked; however, it was found to severely degrade performance in certain cases, as was revealed during extended testing. Now we keep track of how much data is in flight (or ready to be in flight) and limit the amount of processed incoming data when the amount of in-flight data surpasses the given threshold, similar to what we do in other transports.	2025-02-19 17:52:36 +02:00
Artem Boldariev	937b5f8349	DoH: reduce excessive bad request logging We started using isc_nm_bad_request() more actively throughout codebase. In the case of HTTP/2 it can lead to a large count of useless "Bad Request" messages in the BIND log, as often we attempt to send such request over effectively finished HTTP/2 sessions. This commit fixes that.	2025-01-15 14:09:17 +00:00
Artem Boldariev	4ae4e255cf	Do not stop timer in isc_nm_read_stop() in manual timer mode A call to isc_nm_read_stop() would always stop the read timer, even in manual timer control mode, which was added with StreamDNS in mind. That looks like an omission that happened due to how timers are controlled in StreamDNS, where we always stop the timer before pausing reading anyway (see streamdns_on_complete_dnsmessage()). That would not work well for HTTP, though, where we might want to pause reading without stopping the timer when we want to split incoming data into multiple chunks to be processed independently. I suppose that it happened due to NM refactoring in the middle of StreamDNS development (at the time isc_nm_cancelread() and isc_nm_pauseread() were removed), as the StreamDNS code seems to be written as if timers are not stopped during a call to isc_nm_read_stop().	2025-01-15 14:09:17 +00:00
Artem Boldariev	609a41517b	DoH: introduce manual read timer control This commit introduces manual read timer control as used by StreamDNS and its underlying transports. Before that, DoH code would rely on the timer control provided by TCP, which would reset the timer any time some data arrived. Now, the timer is restarted only when a full DNS message is processed in line with other DNS transports. That change is required because we should not stop the timer when reading from the network is paused due to throttling. We need a way to drop timed-out clients, particularly those who refuse to read the data we send.	2025-01-15 14:09:17 +00:00
Artem Boldariev	3425e4b1d0	DoH: flooding clients detection This commit adds logic to make the code better protected against clients that send valid HTTP/2 data that is useless from a DNS server perspective. Firstly, it adds logic that protects against clients who send too little useful (=DNS) data. We achieve that by adding a check that eventually detects such clients with an unfavorable useful-to-processed data ratio after the initial grace period. The grace period is limited to processing 128 KiB of data, which should be enough for sending the largest possible DNS message in a GET request and then some. This is the main safety belt that would detect even flooding clients that initially behave well in order to fool the server's checks. Secondly, in addition to the above, we introduce additional checks to detect outright misbehaving clients earlier: the code will treat clients that open too many streams (50) without sending any data for processing as flooding ones; clients that managed to send 1.5 KiB of data without opening a single stream or submitting at least some DNS data will be treated as flooding ones. Of course, the behaviour described above is nothing but heuristic checks, so they can never be perfect. At the same time, they should be reasonable enough not to drop any valid clients, relatively easy to implement, and have negligible computational overhead.	2025-01-15 14:09:17 +00:00
Artem Boldariev	9846f395ad	DoH: process data chunk by chunk instead of all at once Initially, our DNS-over-HTTP(S) implementation would try to process as much incoming data from the network as possible. However, that might be undesirable as we might create too many streams (each effectively backed by a ns_client_t object). That is too forgiving as it might overwhelm the server and trash its memory allocator, causing high CPU and memory usage. Instead of doing that, we resort to processing incoming data using a chunk-by-chunk processing strategy. That is, we split data into small chunks (currently 256 bytes) and process each of them asynchronously. However, we can process more than one chunk at once (up to 4 currently), given that the number of HTTP/2 streams has not increased while processing a chunk. That alone is not enough, though. In addition to the above, we should limit the number of active streams: these streams for which we have received a request and started processing it (the ones for which a read callback was called), as it is perfectly fine to have more opened streams than active ones. In the case we have reached or surpassed the limit of active streams, we stop reading AND processing the data from the remote peer. The number of active streams is effectively decreased only when responses associated with the active streams are sent to the remote peer. Overall, this strategy is very similar to the one used for other stream-based DNS transports like TCP and TLS.	2025-01-15 14:09:17 +00:00
Michał Kępień	d6f9785ac6	Enable extraction of exact local socket addresses Extracting the exact address that each wildcard/TCP socket is bound to locally requires issuing the getsockname() system call, which libuv exposes via its uv__getsockname() functions. This is only required for detailed logging and comes at a noticeable performance cost, so it should not happen by default. However, it is useful for debugging certain problems (e.g. cryptic system test failures), so a convenient way of enabling that behavior should exist. Update isc_nmhandle_localaddr() so that it calls uv__getsockname() when the ISC_SOCKET_DETAILS preprocessor macro is set at compile time. Ensure proper handling of sockets that wrap other sockets. Set the new ISC_SOCKET_DETAILS macro by default when --enable-developer is passed to ./configure. This enables detailed logging in the system tests run in GitLab CI without affecting performance in non-development BIND 9 builds. Note that setting the ISC_SOCKET_DETAILS preprocessor macro at compile time enables all callers of isc_nmhandle_localaddr() to extract the exact address of a given local socket, which results e.g. in dnstap captures containing more accurate information. Mention the new preprocessor macro in the section of the ARM that discusses why exact socket addresses may not be logged by default.	2024-12-29 12:32:05 +01:00
Artem Boldariev	6691a1530d	TLS SNI - add low level support for SNI to the networking code This commit adds support for setting SNI hostnames in outgoing connections over TLS. Most of the changes are related to either adapting the code to accept an extra argument in *connect() functions or a couple of changes to the TLS Stream to actually make use of the new SNI hostname information.	2024-12-26 17:23:12 +02:00
Matthijs Mekking	aa24b77d8b	Fix nsupdate hang when processing a large update The root cause is the fix for CVE-2024-0760 (part 3), which resets the TCP connection on a failed send. Specifically commit 4b7c61381f186e20a476c35032a871295ebbd385 stops reading on the socket because the TCP connection is throttling. When the tcpdns_send_cb callback thinks about restarting reading on the socket, this fails because the socket is a client socket. And nsupdate is a client and is using the same netmgr code. This commit removes the requirement that the socket must be a server socket, allowing reading on the socket again after being throttled.	2024-12-05 15:40:48 +01:00
Artem Boldariev	300f05110d	Extended TCP accept()/close() logging This commit adds extra log messages issued when accepting or closing a TCP connection (provided that debugging logging level >=99 is enabled).	2024-11-27 21:14:08 +02:00
Aydın Mercan	d987e2d745	add separate query counters for new protocols Add query counters for DoT, DoH, unencrypted DoH and their proxied counterparts. The protocols don't increment TCP/UDP counters anymore since they aren't the same as plain DNS-over-53.	2024-11-25 13:07:29 +03:00
Ondřej Surý	1a19ce39db	Remove redundant semicolons after the closing braces of functions	2024-11-19 12:27:22 +01:00
Ondřej Surý	0258850f20	Remove redundant parentheses from the return statement	2024-11-19 12:27:22 +01:00
Evan Hunt	5ea1f6390d	corrected code style errors - add missing brackets around one-line statements - add parentheses around return values	2024-10-18 19:31:27 +00:00
Ondřej Surý	eec30c33c2	Don't enable SO_REUSEADDR on outgoing UDP sockets Currently, the outgoing UDP sockets have enabled SO_REUSEADDR (SO_REUSEPORT on BSDs) which allows multiple UDP sockets to bind to the same address+port. There's one caveat though - only a single (the last one) socket is going to receive all the incoming traffic. This in turn could lead to incoming DNS message matching to invalid dns_dispatch and getting dropped. Disable setting the SO_REUSEADDR on the outgoing UDP sockets. This needs to be done explicitly because `uv_udp_open()` silently enables the option on the socket.	2024-10-02 12:15:53 +00:00
Ondřej Surý	b576c4c977	Limit the outgoing UDP send queue size If the operating system UDP queue gets full and the outgoing UDP sending starts to be delayed, BIND 9 could exhibit memory spikes as it tries to enqueue all the outgoing UDP messages. As those are not going to be delivered anyway (as we argued when we stopped enlarging the operating system send and receive buffers), try to send the UDP messages directly using `uv_udp_try_send()` and if that fails, drop the outgoing UDP message.	2024-09-17 14:02:03 +00:00
alessio	8b8149cdd2	Do not set SO_INCOMING_CPU We currently set SO_INCOMING_CPU incorrectly, and testing by Ondrej shows that fixing the issue and setting affinities is worse than letting the kernel schedule threads without constraints. So we should not set SO_INCOMING_CPU anymore.	2024-09-16 12:18:22 +00:00
Ondřej Surý	7b756350f5	Use clang-format-19 to update formatting This is purely result of running: git-clang-format-19 --binary clang-format-19 origin/main	2024-08-22 09:21:55 +02:00
Ondřej Surý	679e90a57d	Add isc_log_createandusechannel() function to simplify usage The new isc_log_createandusechannel() function combines the isc_log_createchannel() and isc_log_usechannel() calls into a single call that cannot fail, so it can be used in places where we know failure is impossible, which simplifies the error handling.	2024-08-20 12:50:39 +00:00
Ondřej Surý	8506102216	Remove logging context (isc_log_t) from the public namespace Now that the logging uses single global context, remove the isc_log_t from the public namespace.	2024-08-20 12:50:39 +00:00
Ondřej Surý	684f3eb8e6	Attach/detach to the listening child socket when accepting TLS When a TLS (TLSstream) connection was accepted, the child listening socket was not attached to sock->server and thus it could have been freed before all the accepted connections were actually closed. In turn, this would cause us to call isc_tls_free() too soon, causing cascading errors in pending SSL_read_ex() calls in the accepted connections. Properly attach and detach the child listening socket when accepting and closing the server connections.	2024-08-07 14:17:43 +00:00
Ondřej Surý	827a153d99	Remove superfluous memset() in isc_nmsocket_init() The tlsstream part of the isc_nmsocket_t gets initialized via a designated initializer and doesn't need the extra memset() later; just remove it.	2024-08-05 07:32:12 +00:00
Artem Boldariev	5781ff3a93	Drop expired but not accepted TCP connections This commit ensures that we are not attempting to accept an expired TCP connection as we are not interested in any data that could have been accumulated in its internal buffers. Now we just drop them for good.	2024-07-03 15:03:02 +03:00
Ondřej Surý	bc3e713317	Throttle the reading when writes are asynchronous Be more aggressive when throttling the reading - when we can't send the outgoing TCP synchronously with uv_try_write(), we start throttling the reading immediately instead of waiting for the send buffers to fill up. This should not affect behaved clients that read the data from the TCP on the other end.	2024-07-03 08:45:39 +02:00
Artem Boldariev	55b1a093ea	Do not un-throttle TCP connections on isc_nm_read() Due to omission it was possible to un-throttle a TCP connection previously throttled due to the peer not reading back data we are sending. In particular, that affected DoH code, but it could also affect other transports (the current or future ones) that pause/resume reading according to its internal state.	2024-06-12 13:44:37 +03:00
Ondřej Surý	4c2ac25a95	Limit the number of DNS message processed from a single TCP read The single TCP read can create as much as 64k divided by the minimum size of the DNS message. This can clog the processing thread and trash the memory allocator because we need to do as much as ~20k allocations in a single UV loop tick. Limit the number of the DNS messages processed in a single UV loop tick to just single DNS message and limit the number of the outstanding DNS messages back to 23. This effectively limits the number of pipelined DNS messages to that number (this is the limit we already had before).	2024-06-10 16:48:54 +02:00
Ondřej Surý	4e7c4af17f	Throttle reading from TCP if the sends are not getting through When TCP client would not read the DNS message sent to them, the TCP sends inside named would accumulate and cause degradation of the service. Throttle the reading from the TCP socket when we accumulate enough DNS data to be sent. Currently this is limited in a way that a single largest possible DNS message can fit into the buffer.	2024-06-10 16:48:52 +02:00
Artem Boldariev	d80dfbf745	Keep the endpoints set reference within an HTTP/2 socket This commit ensures that an HTTP endpoints set reference is stored in a socket object associated with an HTTP/2 stream instead of referencing the global set stored inside a listener. This helps to prevent an issue like the following: 1. BIND is configured to serve DoH clients; 2. A client is connected and one or more HTTP/2 streams are created. Internal pointers are now pointing to the data on the associated HTTP endpoints set; 3. BIND is reconfigured - the new endpoints set object is created and promoted to all listeners; 4. The old pointers to the HTTP endpoints set data are now invalid. Instead of referencing a global object that is updated on re-configurations, we now store a local reference, which prevents the endpoints set objects from going out of scope prematurely.	2024-06-10 16:40:12 +02:00
Artem Boldariev	c41fb499b9	DoH: avoid potential use after free for HTTP/2 session objects It was reported that HTTP/2 session might get closed or even deleted before all async. processing has been completed. This commit addresses that: now we are avoiding using the object when we do not need it or specifically check if the pointers used are not 'NULL' and by ensuring that there is at least one reference to the session object while we are doing incoming data processing. This commit makes the code more resilient to such issues in the future.	2024-06-10 16:40:10 +02:00
Matthijs Mekking	c40e5c8653	Call reset_shutdown if uv_tcp_close_reset failed If uv_tcp_close_reset() returns an error code, this means the reset_shutdown callback has not been issued, so do it now.	2024-06-03 10:14:47 +02:00
Matthijs Mekking	5b94bb2129	Do not runtime check uv_tcp_close_reset When we reset a TCP connection by sending a RST packet, do not bother requiring the result is a success code.	2024-06-03 10:14:47 +02:00
Mark Andrews	b7de2c7cb9	Clang-format header file changes	2024-05-17 16:03:21 -07:00
Mark Andrews	dd57db2274	Remove duplicate unreachable code block This was accidentally left in during the development of !8299.	2024-02-12 15:18:46 +11:00
Ondřej Surý	15329d471e	Add memory pools for isc_nmsocket_t structures To reduce memory pressure, we can add light per-loop (netmgr worker) memory pools for isc_nmsocket_t structures. This will help in situations where there's a lot of churn creating and destroying the nmsockets.	2024-02-08 15:13:47 +01:00
Ondřej Surý	750bd364b5	Reduce the isc_nmsocket_t size from 1840 to 1208 bytes Embedding isc_nmsocket_h2_t directly inside isc_nmsocket_t had increased the size of isc_nmsocket_t to 1840 bytes. Making the isc_nmsocket_h2_t to be a pointer to the structure and allocated on demand allows us to reduce the size to 1208 bytes. While there are still some possible reductions in the isc_nmsocket_t (embedded tlsstream, streamdns structures), this was the far biggest drop in the memory usage.	2024-02-08 15:13:47 +01:00
Ondřej Surý	eada7b6e13	Reduce struct isc__nm_uvreq size from 1560 to 560 bytes The uv_req union member of struct isc__nm_uvreq contained libuv request types that we don't use. Turns out that uv_getnameinfo_t is 1000 bytes big and unnecessarily enlarged the whole structure. Remove all the unused members from the uv_req union.	2024-02-08 15:13:47 +01:00
Ondřej Surý	cb1d2e57e9	Remove unused mutex from netmgr The netmgr->lock was dead code, remove it.	2024-02-07 20:54:05 +01:00
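
Commit 522ca7bb54 above notes that ISC_LIST_FOREACH declares the loop variable with typeof(list.head), which is why a const-qualified list object can break compilation: the loop variable inherits the const and can no longer be advanced. The sketch below illustrates that pattern only. The DEMO_* macros, item_t, and demo_sum() are hypothetical stand-ins written for this note, not BIND's actual definitions, and caching the next pointer is an assumption about how such a macro is usually made safe, not a statement about BIND's code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; NOT BIND's ISC_LIST macros. */
#define DEMO_LIST(type) struct { type *head, *tail; }
#define DEMO_LINK(type) struct { type *prev, *next; }

/*
 * Declare the loop variable from the type of (list).head. If `list` were
 * const-qualified, `elt` would be a const pointer and the `elt = ...`
 * step below would fail to compile. The next pointer is cached up front
 * so the body may unlink the current element.
 */
#define DEMO_LIST_FOREACH(list, elt, link)                            \
	for (__typeof__((list).head) elt = (list).head,               \
	     elt##_next = (elt != NULL) ? elt->link.next : NULL;      \
	     elt != NULL;                                             \
	     elt = elt##_next,                                        \
	     elt##_next = (elt != NULL) ? elt->link.next : NULL)

typedef struct item item_t;
struct item {
	int value;
	DEMO_LINK(item_t) link;
};
typedef DEMO_LIST(item_t) itemlist_t;

/* Append an element to the tail of the list. */
static void
demo_append(itemlist_t *list, item_t *it) {
	it->link.prev = list->tail;
	it->link.next = NULL;
	if (list->tail != NULL) {
		list->tail->link.next = it;
	} else {
		list->head = it;
	}
	list->tail = it;
}

/* Build a three-element list and sum it with the FOREACH macro. */
static int
demo_sum(void) {
	item_t a = { .value = 1 }, b = { .value = 2 }, c = { .value = 3 };
	itemlist_t list = { NULL, NULL };
	int sum = 0;

	demo_append(&list, &a);
	demo_append(&list, &b);
	demo_append(&list, &c);

	DEMO_LIST_FOREACH(list, it, link) {
		sum += it->value;
	}
	return sum;
}
```

Calling demo_sum() walks the three elements in insertion order. Because the loop variable is scoped to the for statement, code that needs the element after the loop has to copy it out first, which matches the refactoring the commit message describes.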

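The flooding heuristics described in commit 3425e4b1d0 can be read as three checks on per-session counters. The following is a minimal sketch under stated assumptions: the 128 KiB grace period, the 50-stream limit, and the 1.5 KiB limit come from the commit message, while the struct, the function name, the 20% ratio threshold, and the exact comparison operators are invented for illustration and do not reflect the real implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_GRACE_BYTES	((size_t)128 * 1024) /* from the commit message */
#define DEMO_MAX_IDLE_STREAMS	((size_t)50)	     /* from the commit message */
#define DEMO_MAX_IDLE_BYTES	((size_t)1536)	     /* 1.5 KiB, from the commit message */
#define DEMO_MIN_USEFUL_PERCENT ((size_t)20)	     /* hypothetical threshold */

/* Hypothetical per-session counters. */
typedef struct {
	size_t processed;      /* all HTTP/2 bytes processed so far */
	size_t useful;	       /* bytes that carried DNS data */
	size_t streams_opened; /* streams opened by the client */
} demo_stats_t;

static bool
demo_is_flooding(const demo_stats_t *s) {
	/* Early check: many streams opened, but no DNS data at all. */
	if (s->streams_opened >= DEMO_MAX_IDLE_STREAMS && s->useful == 0) {
		return true;
	}
	/* Early check: 1.5 KiB received with no stream or no DNS data. */
	if (s->processed >= DEMO_MAX_IDLE_BYTES &&
	    (s->streams_opened == 0 || s->useful == 0))
	{
		return true;
	}
	/* Main check: after the grace period, require a minimum ratio. */
	if (s->processed > DEMO_GRACE_BYTES &&
	    s->useful * 100 < s->processed * DEMO_MIN_USEFUL_PERCENT)
	{
		return true;
	}
	return false;
}
```

The ratio check only applies once more than 128 KiB has been processed, so a short-lived session with little DNS payload is not penalized, while the two early checks catch clients that never send anything useful.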

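Commit 9846f395ad describes processing incoming DoH data in 256-byte chunks, up to 4 chunks per pass, stopping early when a chunk opens a new HTTP/2 stream or when the active streams limit is reached. One pass of that strategy might look roughly like the sketch below. The chunk size and chunk count come from the commit message; demo_session_t, demo_feed(), and the convention that a 0x01 byte opens a stream are hypothetical simplifications standing in for the real HTTP/2 parser.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_CHUNK_SIZE		 ((size_t)256) /* from the commit message */
#define DEMO_MAX_CHUNKS_PER_PASS 4	       /* from the commit message */

/* Hypothetical session state. */
typedef struct {
	size_t nstreams;	   /* streams opened so far */
	size_t active_streams;	   /* streams with a request in progress */
	size_t max_active_streams; /* limit on active streams */
} demo_session_t;

/* Stand-in for the HTTP/2 parser: each 0x01 byte "opens" a stream. */
static void
demo_feed(demo_session_t *s, const unsigned char *data, size_t len) {
	for (size_t i = 0; i < len; i++) {
		if (data[i] == 0x01) {
			s->nstreams++;
		}
	}
}

/*
 * Process one pass over the buffer and return the number of bytes
 * consumed. The caller would schedule the remainder asynchronously.
 */
static size_t
demo_process_pass(demo_session_t *s, const unsigned char *buf, size_t len) {
	size_t consumed = 0;

	for (int i = 0; i < DEMO_MAX_CHUNKS_PER_PASS && consumed < len; i++) {
		/* At or over the active streams limit: stop processing. */
		if (s->active_streams >= s->max_active_streams) {
			break;
		}

		size_t n = len - consumed;
		if (n > DEMO_CHUNK_SIZE) {
			n = DEMO_CHUNK_SIZE;
		}

		size_t before = s->nstreams;
		demo_feed(s, buf + consumed, n);
		consumed += n;

		/* A new stream appeared: yield before the next chunk. */
		if (s->nstreams > before) {
			break;
		}
	}
	return consumed;
}
```

With a buffer that opens no streams, a pass consumes at most 4 x 256 = 1024 bytes; a stream opened in the first chunk ends the pass after 256 bytes, which bounds how many streams a single pass can create.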
600 Commits