2
0
mirror of https://gitlab.isc.org/isc-projects/bind9 synced 2025-08-30 22:15:20 +00:00
Commit Graph

4917 Commits

Author SHA1 Message Date
Ondřej Surý
f0feaa3305 Remove isc_task_sendto(anddetach) functions
The only place where isc_task_sendto() was used was in dns_resolver
unit, where the "sendto" part was actually no-op, because dns_resolver
uses bound tasks.  Remove the isc_task_sendto() and
isc_task_sendtoanddetach() functions in favor of using bound tasks
create with isc_task_create_bound().

Additionally, cache the number of running netmgr threads (nworkers)
locally to reduce the number of function calls.
2022-04-19 14:24:36 +02:00
Ondřej Surý
1eeb4c1121 Remove isc_event_constallocate()
The isc_event_constallocate() function was not used anywhere, thus
remove the isc_event_constallocate() macro, declaration and definition.
2022-04-19 13:46:26 +02:00
Ondřej Surý
f55a4d3e55 Allow listening on less than nworkers threads
For some applications, it's useful to not listen on full battery of
threads.  Add workers argument to all isc_nm_listen*() functions and
convenience ISC_NM_LISTEN_ONE and ISC_NM_LISTEN_ALL macros.
2022-04-19 11:08:13 +02:00
Artem Boldariev
df317184eb Add isc_nmsocket_set_tlsctx()
This commit adds isc_nmsocket_set_tlsctx() - an asynchronous function
that replaces the TLS context within a given TLS-enabled listener
socket object. It is based on the newly added reference counting
functionality.

The intention of adding this function is to add functionality to
replace a TLS context without recreating the whole socket object,
including the underlying TCP listener socket, as a BIND process might
not have enough permissions to re-create it fully on reconfiguration.
2022-04-06 18:45:57 +03:00
Artem Boldariev
25609156a5 Maintain a per-thread TLS ctx reference in TLS stream code
This commit changes the generic TLS stream code to maintain a
per-worker thread TLS context reference.
2022-04-06 18:45:57 +03:00
Artem Boldariev
9256026d18 Use isc_tlsctx_attach() in TLS DNS code
This commit adds proper reference counting for TLS contexts into
generic TLS DNS (DoT) code.
2022-04-06 18:45:57 +03:00
Artem Boldariev
b52d46612f Use isc_tlsctx_attach() in TLS stream code
This commit adds proper reference counting for TLS contexts into
generic TLS stream code.
2022-04-06 18:45:57 +03:00
Artem Boldariev
a7a482c1b1 Add isc_tlsctx_attach()
The implementation is done on top of the reference counting
functionality found in OpenSSL/LibreSSL, which allows for avoiding
wrapping the object.

Adding this function allows using reference counting for TLS contexts
in BIND 9's codebase.
2022-04-06 18:45:57 +03:00
Mark Andrews
98718b3b4b Unlink the timer event before trying to purge it
as far as I can determine the order of operations is not important.

    *** CID 351372:  Concurrent data access violations  (ATOMICITY)
    /lib/isc/timer.c: 227 in timer_purge()
    221     		LOCK(&timer->lock);
    222     		if (!purged) {
    223     			/*
    224     			 * The event has already been executed, but not
    225     			 * yet destroyed.
    226     			 */
    >>>     CID 351372:  Concurrent data access violations  (ATOMICITY)
    >>>     Using an unreliable value of "event" inside the second locked section. If the data that "event" depends on was changed by another thread, this use might be incorrect.
    227     			timerevent_unlink(timer, event);
    228     		}
    229     	}
    230     }
    231
    232     void
2022-04-06 07:33:41 +00:00
Artem Boldariev
f0ac4c47b0 Change X509_STORE_up_ref() shim return value
X509_STORE_up_ref() must return 1 on success, while the previous
implementation would return the references count. This commit fixes
that.
2022-04-05 15:03:27 +03:00
Ondřej Surý
7868d8145b Rename shutdown() to test_shutdown() in timer_test.c
The shutdown() is part of standard library (POSIX-1), don't use such
name in the timer_test.c, but rather rename it to test_shutdown().
2022-04-05 01:49:04 +02:00
Ondřej Surý
142c63dda8 Enable the load-balance-sockets configuration
Previously, HAVE_SO_REUSEPORT_LB has been defined only in the private
netmgr-int.h header file, making the configuration of load balanced
sockets inoperable.

Move the missing HAVE_SO_REUSEPORT_LB define the isc/netmgr.h and add
missing isc_nm_getloadbalancesockets() implementation.
2022-04-05 01:30:58 +02:00
Ondřej Surý
85c6e797aa Add option to configure load balance sockets
Previously, the option to enable kernel load balancing of the sockets
was always enabled when supported by the operating system (SO_REUSEPORT
on Linux and SO_REUSEPORT_LB on FreeBSD).

It was reported that in scenarios where the networking threads are also
responsible for processing long-running tasks (like RPZ processing, CATZ
processing or large zone transfers), this could lead to intermitten
brownouts for some clients, because the thread assigned by the operating
system might be busy.  In such scenarious, the overall performance would
be better served by threads competing over the sockets because the idle
threads can pick up the incoming traffic.

Add new configuration option (`load-balance-sockets`) to allow enabling
or disabling the load balancing of the sockets.
2022-04-04 23:10:04 +02:00
Ondřej Surý
f106d0ed2b Run the RPZ update as offloaded work
Previously, the RPZ updates ran quantized on the main nm_worker loops.
As the quantum was set to 1024, this might lead to service
interruptions when large RPZ update was processed.

Change the RPZ update process to run as the offloaded work.  The update
and cleanup loops were refactored to do as little locking of the
maintenance lock as possible for the shortest periods of time and the db
iterator is being paused for every iteration, so we don't hold the rbtdb
tree lock for prolonged periods of time.
2022-04-04 21:20:05 +02:00
Ondřej Surý
ae01ec2823 Don't use reference counting in isc_timer unit
The reference counting and isc_timer_attach()/isc_timer_detach()
semantic are actually misleading because it cannot be used under normal
conditions.  The usual conditions under which is timer used uses the
object where timer is used as argument to the "timer" itself.  This
means that when the caller is using `isc_timer_detach()` it needs the
timer to stop and the isc_timer_detach() does that only if this would be
the last reference.  Unfortunately, this also means that if the timer is
attached elsewhere and the timer is fired it will most likely be
use-after-free, because the object used in the timer no longer exists.

Remove the reference counting from the isc_timer unit, remove
isc_timer_attach() function and rename isc_timer_detach() to
isc_timer_destroy() to better reflect how the API needs to be used.

The only caveat is that the already executed event must be destroyed
before the isc_timer_destroy() is called because the timer is no longet
attached to .ev_destroy_arg.
2022-04-02 01:23:15 +02:00
Ondřej Surý
30e0fd942b Remove task privileged mode
Previously, the task privileged mode has been used only when the named
was starting up and loading the zones from the disk as the "first" thing
to do.  The privileged task was setup with quantum == 2, which made the
taskmgr/netmgr spin around the privileged queue processing two events at
the time.

The same effect can be achieved by setting the quantum to UINT_MAX (e.g.
practically unlimited) for the loadzone task, hence the privileged task
mode was removed in favor of just processing all the events on the
loadzone task in a single task_run().
2022-04-01 23:55:26 +02:00
Ondřej Surý
62a72211aa Remove isc_pool API
Since the last user of the isc_pool API is gone, remove the whole
isc_pool API.
2022-04-01 23:50:34 +02:00
Ondřej Surý
2bc7303af2 Use isc_nm_getnworkers to manage zone resources
Instead of passing the number of worker to the dns_zonemgr manually,
get the number of nm threads using the new isc_nm_getnworkers() call.

Additionally, remove the isc_pool API and manage the array of memory
context, zonetasks and loadtasks directly in the zonemgr.
2022-04-01 23:50:34 +02:00
Ondřej Surý
2707d0eeb7 Set hard thread affinity for each zone
After switching to per-thread resources in the zonemgr, the performance
was decreased because the memory context, zonetask and loadtask was
picked from the pool at random.

Pin the zone to single threadid (.tid) and align the memory context,
zonetask and loadtask to be the same, this sets the hard affinity of the
zone to the netmgr thread.
2022-04-01 23:50:34 +02:00
Ondřej Surý
a94678ff77 Create per-thread task and memory context for zonemgr
Previously, the zonemgr created 1 task per 100 zones and 1 memory
context per 1000 zones (with minimum 10 tasks and 2 memory contexts) to
reduce the contention between threads.

Instead of reducing the contention by having many resources, create a
per-nm_thread memory context, loadtask and zonetask and spread the zones
between just per-thread resources.

Note: this commit alone does decrease performance when loading the zone
by couple seconds (in case of 1M zone) and thus there's more work in
this whole MR fixing the performance.
2022-04-01 23:50:34 +02:00
Ondřej Surý
15ea6f002f Add isc_task_setquantum() and use it for post-init zone loading
Add isc_task_setquantum() function that modifies quantum for the future
isc_task_run() invocations.

NOTE: The current isc_task_run() caches the task->quantum into a local
variable and therefore the current event loop is not affected by any
quantum change.
2022-04-01 23:45:23 +02:00
Ondřej Surý
c17eee034b Remove isc_task_purge() and isc_task_purgerange()
The isc_task_purge() and isc_task_purgerange() were now unused, so sweep
the task.c file.  Additionally remove unused ISC_EVENTATTR_NOPURGE event
attribute.
2022-04-01 23:45:23 +02:00
Ondřej Surý
48b2a5df97 Keep the list of scheduled events on the timer
Instead of searching for the events to purge, keep the list of scheduled
events on the timer list and purge the events that we have scheduled.
2022-04-01 23:45:23 +02:00
Ondřej Surý
17aed2f895 Repair isc_task_purgeevent(), clean isc_task_unsend{,range}()
The isc_task_purgerange() was walking through all events on the task to
find a matching task.  Instead use the ISC_LINK_LINKED to find whether
the event is active.

Cleanup the related isc_task_unsend() and isc_task_unsendrange()
functions that were not used anywhere.
2022-04-01 23:45:23 +02:00
Ondřej Surý
b84c9b2608 Turn isc_hash_bits32() into static online function
Adding extra val & 0xffff in the isc_hash_bits32() macros in the hotpath
has significantly reduced the performance.  Turn the macro into static
inline function matching the previous hash_32() function used to compute
hashval matching the hashtable->bits.
2022-04-01 23:04:24 +02:00
Artem Boldariev
3edf7a9fe7 Implement shim for SSL_CTX_set1_cert_store() (affects Debian 9)
This commit implements a shim for SSL_CTX_set1_cert_store() for
OpenSSL/LibreSSL versions where it is not available.
2022-04-01 16:33:43 +03:00
Ondřej Surý
b05a991ad0 Make isc_ht optionally case insensitive
Previously, the isc_ht API would always take the key as a literal input
to the hashing function.  Change the isc_ht_init() function to take an
'options' argument, in which ISC_HT_CASE_SENSITIVE or _INSENSITIVE can
be specified, to determine whether to use case-sensitive hashing in
isc_hash32() when hashing the key.
2022-03-28 15:02:18 -07:00
Evan Hunt
e9ef3defa4 consolidate fibonacci hashing in one place
Fibonacci hashing was implemented in four separate places (rbt.c,
rbtdb.c, resolver.c, zone.c). This commit combines them into a single
implementation. The hash_32() function is now replaced with
isc_hash_bits32().
2022-03-28 14:44:21 -07:00
Ondřej Surý
4dceab142d Consistenly use UNREACHABLE() instead of ISC_UNREACHABLE()
In couple places, we have missed INSIST(0) or ISC_UNREACHABLE()
replacement on some branches with UNREACHABLE().  Replace all
ISC_UNREACHABLE() or INSIST(0) calls with UNREACHABLE().
2022-03-28 23:26:08 +02:00
Artem Boldariev
783663db80 Add ISC_R_TLSBADPEERCERT error code to the TLS related code
This commit adds support for ISC_R_TLSBADPEERCERT error code, which is
supposed to be used to signal for TLS peer certificates verification
in dig and other code.

The support for this error code is added to our TLS and TLS DNS
implementations.

This commit also adds isc_nm_verify_tls_peer_result_string() function
which is supposed to be used to get a textual description of the
reason for getting a ISC_R_TLSBADPEERCERT error.
2022-03-28 15:32:30 +03:00
Artem Boldariev
71cf8fa5ac Extend TLS context cache with CA certificates store
This commit adds support for keeping CA certificates stores associated
with TLS contexts. The intention is to keep one reusable store per a
set of related TLS contexts.
2022-03-28 15:31:22 +03:00
Artem Boldariev
c49a81e27d Add foundational functions to implement Strict/Mutual TLS
This commit adds a set of functions that can be used to implement
Strict and Mutual TLS:

* isc_tlsctx_load_client_ca_names();
* isc_tlsctx_load_certificate();
* isc_tls_verify_peer_result_string();
* isc_tlsctx_enable_peer_verification().
2022-03-28 15:31:22 +03:00
Artem Boldariev
32783d36c2 Add utility functions to manipulate X509 certificate stores
This commit adds a set of high-level utility functions to manipulate
the certificate stores. The stores are needed to implement TLS
certificates verification efficiently.
2022-03-28 15:31:22 +03:00
Ondřej Surý
9de10cd153 Remove extrahandle size from netmgr
Previously, it was possible to assign a bit of memory space in the
nmhandle to store the client data.  This was complicated and prevents
further refactoring of isc_nmhandle_t caching (future work).

Instead of caching the data in the nmhandle, allocate the hot-path
ns_client_t objects from per-thread clientmgr memory context and just
assign it to the isc_nmhandle_t via isc_nmhandle_set().
2022-03-25 10:38:35 +01:00
Ondřej Surý
20f0936cf2 Remove use of the inline keyword used as suggestion to compiler
Historically, the inline keyword was a strong suggestion to the compiler
that it should inline the function marked inline.  As compilers became
better at optimising, this functionality has receded, and using inline
as a suggestion to inline a function is obsolete.  The compiler will
happily ignore it and inline something else entirely if it finds that's
a better optimisation.

Therefore, remove all the occurences of the inline keyword with static
functions inside single compilation unit and leave the decision whether
to inline a function or not entirely on the compiler

NOTE: We keep the usage the inline keyword when the purpose is to change
the linkage behaviour.
2022-03-25 08:33:43 +01:00
Ondřej Surý
04d0b70ba2 Replace ISC_NORETURN with C11's noreturn
C11 has builtin support for _Noreturn function specifier with
convenience noreturn macro defined in <stdnoreturn.h> header.

Replace ISC_NORETURN macro by C11 noreturn with fallback to
__attribute__((noreturn)) if the C11 support is not complete.
2022-03-25 08:33:43 +01:00
Ondřej Surý
584f0d7a7e Simplify way we tag unreachable code with only ISC_UNREACHABLE()
Previously, the unreachable code paths would have to be tagged with:

    INSIST(0);
    ISC_UNREACHABLE();

There was also older parts of the code that used comment annotation:

    /* NOTREACHED */

Unify the handling of unreachable code paths to just use:

    UNREACHABLE();

The UNREACHABLE() macro now asserts when reached and also uses
__builtin_unreachable(); when such builtin is available in the compiler.
2022-03-25 08:33:43 +01:00
Ondřej Surý
fe7ce629f4 Add FALLTHROUGH macro for __attribute__((fallthrough))
Gcc 7+ and Clang 10+ have implemented __attribute__((fallthrough)) which
is explicit version of the /* FALLTHROUGH */ comment we are currently
using.

Add and apply FALLTHROUGH macro that uses the attribute if available,
but does nothing on older compilers.

In one case (lib/dns/zone.c), using the macro revealed that we were
using the /* FALLTHROUGH */ comment in wrong place, remove that comment.
2022-03-25 08:33:43 +01:00
Ondřej Surý
d70daa29f7 Make netmgr the authority on number of threads running
Instead of passing the "workers" variable back and forth along with
passing the single isc_nm_t instance, add isc_nm_getnworkers() function
that returns the number of netmgr threads are running.

Change the ns_interfacemgr and ns_taskmgr to utilize the newly acquired
knowledge.
2022-03-18 21:53:28 +01:00
Ondřej Surý
ff22498849 Add couple missing braces around single-line statements
The clang-format-15 has new option InsertBraces that could add missing
branches around single line statements.  Use that to our advantage
without switching to not-yet-released LLVM version to add missing braces
in couple of places.
2022-03-17 18:27:45 +01:00
Ondřej Surý
cd52953f8a Update the isc_ht unit test to also tesh rehashing
As incremental rehashing has been added to isc_ht implementation, we
need to test whether the rehashing works.

Update the isc_ht unit test to test:

 * preinitialized hash table large enough to hold all the elements
 * smallest hash table that fully grows to hold all the elements
 * partially preinitialized hash table that grows
 * iterating while rehashing is in progress
2022-03-17 08:16:24 +01:00
Ondřej Surý
e42cb1f198 Implement incremental hash table resizing in isc_ht
Previously, an incremental hash table resizing was implemented for the
dns_rbt_t hash table implementation.  Using that as a base, also
implement the incremental hash table resizing also for isc_ht API
hashtables:

 1. During the resize, allocate the new hash table, but keep the old
    table unchanged.
 2. In each lookup, delete, or iterator operation, check both tables.
 3. Perform insertion operations only in the new table.
 4. At each insertion also move <r> elements from the old table to
    the new table.
 5. When all elements are removed from the old table, deallocate it.

To ensure that the old table is completely copied over before the new
table itself needs to be enlarged, it is necessary to increase the
size of the table by a factor of at least (<r> + 1)/<r> during resizing.

In our implementation <r> is equal to 1.

The downside of this approach is that the old table and the new table
could stay in memory for longer when there are no new insertions into
the hash table for prolonged periods of time as the incremental
rehashing happens only during the insertions.
2022-03-17 08:16:24 +01:00
Ondřej Surý
bfa4b9c141 Run .closehandle_cb asynchrounosly in nmhandle_detach_cb()
When sock->closehandle_cb is set, we need to run nmhandle_detach_cb()
asynchronously to ensure correct order of multiple packets processing in
the isc__nm_process_sock_buffer().  When not run asynchronously, it
would cause:

  a) out-of-order processing of the return codes from processbuffer();

  b) stack growth because the next TCP DNS message read callback will
     be called from within the current TCP DNS message read callback.

The sock->closehandle_cb is set to isc__nm_resume_processing() for TCP
sockets which calls isc__nm_process_sock_buffer().  If the read callback
(called from isc__nm_process_sock_buffer()->processbuffer()) doesn't
attach to the nmhandle (f.e. because it wants to drop the processing or
we send the response directly via uv_try_write()), the
isc__nm_resume_processing() (via .closehandle_cb) would call
isc__nm_process_sock_buffer() recursively.

The below shortened code path shows how the stack can grow:

 1: ns__client_request(handle, ...);
 2: isc_nm_tcpdns_sequential(handle);
 3: ns_query_start(client, handle);
 4:   query_lookup(qctx);
 5:     query_send(qctcx->client);
 6:       isc__nmhandle_detach(&client->reqhandle);
 7:         nmhandle_detach_cb(&handle);
 8:           sock->closehandle_cb(sock); // isc__nm_resume_processing
 9:             isc__nm_process_sock_buffer(sock);
10:               processbuffer(sock); // isc__nm_tcpdns_processbuffer
11:                 isc_nmhandle_attach(req->handle, &handle);
12:                 isc__nm_readcb(sock, req, ISC_R_SUCCESS);
13:                   isc__nm_async_readcb(NULL, ...);
14:                     uvreq->cb.recv(...); // ns__client_request

Instead, if 'sock->closehandle_cb' is set, we need to run detach the
handle asynchroniously in 'isc__nmhandle_detach', so that on line 8 in
the code flow above does not start this recursion. This ensures the
correct order when processing multiple packets in the function
'isc__nm_process_sock_buffer()' and prevents the stack growth.

When not run asynchronously, the out-of-order processing leaves the
first TCP socket open until all requests on the stream have been
processed.

If the pipelining is disabled on the TCP via `keep-response-order`
configuration option, named would keep the first socket in lingering
CLOSE_WAIT state when the client sends an incomplete packet and then
closes the connection from the client side.
2022-03-16 22:11:49 +01:00
Ondřej Surý
79b5ccbf34 Implement isc_interval_t on top of isc_time_t
Change the isc_interval_t implementation from separate data type and
separate implementation to be shim implementation on top of isc_time_t.
The distinction between isc_interval_t and isc_time_t has been kept
because they are semantically different - isc_interval_t is relative and
isc_time_t is absolute, but this allows isc_time_t and isc_interval_t to
be freely interchangeable, f.e. this:

    isc_time_t *t1;
    isc_interval_t *interval;
    isc_time_t *t2;

    isc_interval_set(interval, isc_time_seconds(t2), isc_time_nanoseconds(t2);;
    isc_time_subtract(t1, interval, t2);
    isc_interval_set(interval, isc_time_seconds(t2), isc_time_nanoseconds(t2));

to just:

    isc_time_t *t1;
    isc_interval_t *interval;
    isc_time_t *t2;

    isc_time_subtract(t1, t2, interval);

without introducing a whole set of new functions.
2022-03-14 13:00:05 -07:00
Ondřej Surý
e6ca2a651f Refactor isc_timer_reset() use with semantic patch
Add and apply semantic patch to remove expires argument from the
isc_timer_reset() calls through the codebase.
2022-03-14 13:00:05 -07:00
Ondřej Surý
6437bcc488 Remove expires argument from isc_timer API
The isc_timer_reset() now works only with intervals for once timers.

This makes the API almost 1:1 compatible with the libuv timers making
the further refactoring possible.
2022-03-14 13:00:05 -07:00
Ondřej Surý
27850a5ad2 Change isc_timer_reset() usage to never use expires argument
There were two places where expires argument (absolute isc_time_t value)
was being used.  Both places has been converted to use relative interval
argument in preparation of simplification and refactoring of isc_timer
API.
2022-03-14 13:00:05 -07:00
Ondřej Surý
c259cecc90 Refactor isc_timer_create() to just create timer
The isc_timer_create() function was a bit conflated.  It could have been
used to create a timer and start it at the same time.  As there was a
single place where this was done before (see the previous commit for
nta.c), this was cleaned up and the isc_timer_create() function was
changed to only create new timer.
2022-03-14 13:00:05 -07:00
Ondřej Surý
8fbb42c49c Remove "a temporary hack, 'rndc timerpoke'"
In 2002, "a temporary hack, 'rndc timerpoke'" was added.  It's time
for it to go, so it was removed.
2022-03-14 13:00:05 -07:00
Ondřej Surý
f4751a91f7 Remove unused isc_timer_touch() function
The isc_timer_touch() was unused, just remove it.
2022-03-14 13:00:05 -07:00