mirror of https://gitlab.isc.org/isc-projects/bind9 synced 2025-08-29 05:28:00 +00:00

4917 Commits

Author SHA1 Message Date
Evan Hunt
f58e7c28cd
switch to using isc_loopmgr_pause() instead of task exclusive
change functions using isc_taskmgr_beginexclusive() to use
isc_loopmgr_pause() instead.

also, removed an unnecessary use of exclusive mode in
named_server_tcptimeouts().

most functions that were implemented as task events because they needed
to be running in a task to use exclusive mode have now been changed
into loop callbacks instead. (the exception is catz, which is being
changed in a separate commit because it's a particularly complex change.)
2023-02-16 17:51:55 +01:00
Tony Finch
f9c725d7d4 Remove do-nothing header <isc/stat.h>
Use <sys/stat.h> instead
2023-02-15 16:44:47 +00:00
Tony Finch
6927a30926 Remove do-nothing header <isc/print.h>
This one really truly did nothing. No lines added!
2023-02-15 16:44:47 +00:00
Tony Finch
c7615bc28d Remove do-nothing header <isc/offset.h>
And replace all uses of isc_offset_t with standard off_t
2023-02-15 16:44:47 +00:00
Tony Finch
bed09c1676 Remove do-nothing header <isc/netdb.h>
Not needed since we dropped Windows support
2023-02-15 16:44:47 +00:00
Tony Finch
b0893ae09a Explain <isc/strerr.h> a little more
The purpose of the `strerror_r()` wrapper was not obvious.
2023-02-15 16:44:09 +00:00
Tony Finch
75f7a85a39 Deprecate <isc/deprecated.h>
We refactor more freely these days.
2023-02-15 15:36:20 +00:00
Ondřej Surý
6ffda5920e
Add the reader-writer synchronization with modified C-RW-WP
This changes the internal isc_rwlock implementation to:

  Irina Calciu, Dave Dice, Yossi Lev, Victor Luchangco, Virendra
  J. Marathe, and Nir Shavit.  2013.  NUMA-aware reader-writer locks.
  SIGPLAN Not. 48, 8 (August 2013), 157–166.
  DOI:https://doi.org/10.1145/2517327.24425

(The full article available from:
  http://mcg.cs.tau.ac.il/papers/ppopp2013-rwlocks.pdf)

The implementation is based on the Writer-Preference Lock (C-RW-WP)
variant (see the 3.4 section of the paper for the rationale).

The implemented algorithm has been modified for simplicity and for usage
patterns in rbtdb.c.

The changes compared to the original algorithm:

  * We haven't implemented the cohort locks because that would require
    knowledge of NUMA nodes; instead, a simple atomic_bool is used as
    the synchronization point for the writer lock.

  * The per-thread reader counters are not being used - this would
    require the internal thread id (isc_tid_v) to be always initialized,
    even in the utilities; the change has a slight performance penalty,
    so we might revisit this change in the future.  However, this change
    also saves a lot of memory, because cache-line aligned counters were
    used, so on a 32-core machine, the rwlock would be 4096+ bytes big.

  * The readers use a writer_barrier that is raised after a while when
    the reader lock can't be acquired, to prevent reader starvation.

  * Separate ingress and egress reader counters are used to reduce both
    inter- and intra-thread contention.
2023-02-15 09:30:04 +01:00
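
The handshake described in 6ffda5920e (an atomic_bool for the writer plus separate ingress and egress reader counters) can be sketched roughly as follows. This is a simplified illustration, not the isc_rwlock code: every name here (`sketch_rwlock_t`, `sketch_tryrdlock()`, and so on) is hypothetical, only try-lock operations are shown, the writer_barrier is left out, and default sequentially consistent atomics are assumed for clarity.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sketch type; not the real isc_rwlock_t layout. */
typedef struct {
    atomic_uint_fast32_t readers_ingress; /* readers that entered */
    atomic_uint_fast32_t readers_egress;  /* readers that left */
    atomic_bool writers_lock;             /* set while a writer owns the lock */
} sketch_rwlock_t;

static inline void
sketch_rwlock_init(sketch_rwlock_t *l) {
    atomic_init(&l->readers_ingress, 0);
    atomic_init(&l->readers_egress, 0);
    atomic_init(&l->writers_lock, false);
}

static inline bool
sketch_tryrdlock(sketch_rwlock_t *l) {
    /* Writer preference: do not enter while a writer is present. */
    if (atomic_load(&l->writers_lock)) {
        return false;
    }
    atomic_fetch_add(&l->readers_ingress, 1);
    /* A writer may have arrived in between; if so, leave again. */
    if (atomic_load(&l->writers_lock)) {
        atomic_fetch_add(&l->readers_egress, 1);
        return false;
    }
    return true;
}

static inline void
sketch_rdunlock(sketch_rwlock_t *l) {
    atomic_fetch_add(&l->readers_egress, 1);
}

static inline bool
sketch_trywrlock(sketch_rwlock_t *l) {
    bool expected = false;
    if (!atomic_compare_exchange_strong(&l->writers_lock, &expected, true)) {
        return false; /* another writer owns the lock */
    }
    /* Read egress before ingress; the other order could miss a
     * reader that is still inside. */
    uint_fast32_t egress = atomic_load(&l->readers_egress);
    uint_fast32_t ingress = atomic_load(&l->readers_ingress);
    if (egress != ingress) {
        atomic_store(&l->writers_lock, false); /* readers still inside */
        return false;
    }
    return true;
}

static inline void
sketch_wrunlock(sketch_rwlock_t *l) {
    atomic_store(&l->writers_lock, false);
}
```

A blocking lock would wrap these try operations in a spin loop. The sketch keeps both reader counters in one struct, whereas the commit message notes that the paper's per-thread counters were dropped to save memory.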
Ondřej Surý
28fe8104ee
Add isc_hashmap_find() DbC check for valuep
This adds a DbC check, so we don't pass non-NULL memory for valuep to
the isc_hashmap_find() function.
2023-02-15 09:30:04 +01:00
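
The kind of precondition described in 28fe8104ee might look like the sketch below. It assumes the check is "valuep is either NULL or points to a NULL pointer" and uses plain `assert()` in place of BIND's DbC macros; the linear lookup, `sketch_entry_t` and `sketch_find()` are hypothetical stand-ins, not the isc_hashmap API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical key/value pair used only for this illustration. */
typedef struct {
    const char *key;
    void *value;
} sketch_entry_t;

static inline bool
sketch_find(const sketch_entry_t *entries, size_t count, const char *key,
            void **valuep) {
    assert(entries != NULL || count == 0);
    assert(key != NULL);
    /* Design-by-contract check: the caller must not hand in a slot
     * that already holds a pointer, because it would be overwritten. */
    assert(valuep == NULL || *valuep == NULL);

    for (size_t i = 0; i < count; i++) {
        if (strcmp(entries[i].key, key) == 0) {
            if (valuep != NULL) {
                *valuep = entries[i].value;
            }
            return true;
        }
    }
    return false;
}
```

Passing `valuep == NULL` stays legal in this sketch, so a caller can test for presence without retrieving the value.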
Tony Finch
436b76bb17 Improve the spinloop pause / yield hint
Unfortunately, C still lacks a standard function for pause (x86,
sparc) or yield (arm) instructions, for use in spin lock or CAS loops.
BIND has its own based on vendor intrinsics or inline asm.

Previously, it was buried in the `isc_rwlock` implementation. This
commit renames `isc_rwlock_pause()` to `isc_pause()` and moves
it into <isc/pause.h>.

This commit also fixes the configure script so that it detects ARM
yield support on systems that identify as `aarch*` instead of `arm*`.

On 64-bit ARM systems we now use the ISB (instruction synchronization
barrier) instruction in preference to yield. The ISB instruction
pauses the CPU for longer, several nanoseconds, which is more like the
x86 pause instruction. There are more details in a Rust pull request,
which also refers to MySQL making the same change:
https://github.com/rust-lang/rust/pull/84725
2023-02-14 17:13:24 +00:00
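
A spin-wait hint of the kind 436b76bb17 describes could be sketched as below. This is not the contents of <isc/pause.h>: it assumes GCC or Clang inline asm, and `sketch_pause()`, `sketch_spin_lock()` and `sketch_spin_unlock()` are hypothetical names. The instruction choice follows the commit message (pause on x86, ISB on 64-bit ARM, yield on 32-bit ARM), with a no-op fallback elsewhere.

```c
#include <assert.h>
#include <stdatomic.h>

/* Pick a CPU-specific "I am spinning" hint; fall back to a no-op. */
#if defined(__x86_64__) || defined(__i386__)
#define sketch_pause() __asm__ __volatile__("pause" ::: "memory")
#elif defined(__aarch64__)
#define sketch_pause() __asm__ __volatile__("isb" ::: "memory")
#elif defined(__arm__)
#define sketch_pause() __asm__ __volatile__("yield" ::: "memory")
#else
#define sketch_pause() ((void)0)
#endif

/* Minimal test-and-set spin lock that uses the hint while waiting. */
static inline void
sketch_spin_lock(atomic_flag *flag) {
    while (atomic_flag_test_and_set_explicit(flag, memory_order_acquire)) {
        sketch_pause();
    }
}

static inline void
sketch_spin_unlock(atomic_flag *flag) {
    atomic_flag_clear_explicit(flag, memory_order_release);
}
```

The hint does not change correctness; it only tells the CPU that the loop is a busy-wait.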
Evan Hunt
3a1bb8dac8 remove some unused functions
removed some functions that are no longer used and unlikely to
be resurrected, and also some that were only used to support Windows
and can now be replaced with generic versions.
2023-02-13 11:50:59 -08:00
Evan Hunt
935879ed11 remove isc_bind9 variable
isc_bind9 was a global bool used to indicate whether the library
was being used internally by BIND or by an external caller. external
use is no longer supported, but the variable was retained for use
by dyndb, which needed it only when being built without libtool.
building without libtool is *also* no longer supported, so the variable
can go away.
2023-02-09 18:00:13 +00:00
Ondřej Surý
d4d57f16c3 Sync compile-time & run-time libuv requirements
Bump the minimum libuv version required at runtime so that it matches
the compile-time requirements.
2023-02-09 15:04:52 +01:00
Ondřej Surý
735d09bffe Enforce version drift limits for libuv
libuv support for receiving multiple UDP messages in a single system
call (recvmmsg()) has been tweaked several times between libuv versions
1.35.0 and 1.40.0.  Mixing and matching libuv versions within that span
may lead to assertion failures and is therefore considered harmful, so
try to limit potential damage by preventing users from mixing libuv
versions with distinct sets of recvmmsg()-related flags.
2023-02-09 15:04:52 +01:00
Ondřej Surý
251f411fc3 Avoid libuv 1.35 and 1.36 that have broken recvmmsg implementation
The implementation of UDP recvmmsg in libuv 1.35 and 1.36 is
incomplete and could cause an assertion failure under certain
circumstances.

Modify the configure and runtime checks to report a fatal error when
trying to compile or run with the affected versions.
2023-02-09 15:04:52 +01:00
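
The runtime side of a check like the one in 251f411fc3 could be sketched as below. libuv reports its version as a packed integer (major, minor and patch in successive bytes, as returned by `uv_version()`), so rejecting the 1.35 and 1.36 series reduces to a range comparison. `SKETCH_UV_VERSION()` and `sketch_uv_version_is_broken()` are hypothetical helpers, not BIND or libuv names, and the sketch takes the version as a parameter so it does not need to link against libuv.

```c
#include <assert.h>
#include <stdbool.h>

/* Pack a version the way libuv does: 0xMMmmpp. */
#define SKETCH_UV_VERSION(major, minor, patch)                     \
    (((unsigned int)(major) << 16) | ((unsigned int)(minor) << 8) | \
     (unsigned int)(patch))

/* True for any 1.35.x or 1.36.x release. */
static inline bool
sketch_uv_version_is_broken(unsigned int version) {
    return version >= SKETCH_UV_VERSION(1, 35, 0) &&
           version < SKETCH_UV_VERSION(1, 37, 0);
}
```

A caller would pass the value of `uv_version()` at startup and report a fatal error when the function returns true.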
Ondřej Surý
baced007af
Require C11 Atomic Operations via <stdatomic.h>
Make the C11 Atomic Operations mandatory and drop the GCC __atomic
builtin shims.
2023-02-08 21:33:23 +01:00
Ondřej Surý
1c456c0284
Require C11 thread_local keyword and <threads.h> header
Change the autoconf check to require C11 <threads.h> header and
thread_local keyword.
2023-02-08 21:33:23 +01:00
Tony Finch
ff63b53ff4 Add isc_time_monotonic()
This is to simplify measurements of how long things take.
2023-02-06 12:14:51 +00:00
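
A monotonic timestamp helper in the spirit of ff63b53ff4 might look like the sketch below. It assumes POSIX `clock_gettime()` with `CLOCK_MONOTONIC` and a nanosecond return value; `sketch_time_monotonic_ns()` is a hypothetical name, and the real isc_time_monotonic() signature is not shown in this log.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Nanoseconds from an arbitrary fixed point; never goes backwards. */
static inline uint64_t
sketch_time_monotonic_ns(void) {
    struct timespec ts;
    int r = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(r == 0);
    (void)r; /* silence the unused warning when NDEBUG is defined */
    return (uint64_t)ts.tv_sec * UINT64_C(1000000000) +
           (uint64_t)ts.tv_nsec;
}
```

Measuring how long something takes is then the difference between two calls, which is unaffected by wall-clock adjustments.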
Tony Finch
b8e71f9580 Fix ISC_MEM_ZERO on allocators with malloc_usable_size()
ISC_MEM_ZERO requires great care to use when the space returned by
the allocator is larger than the requested space, and when memory is
reallocated. You must ensure that _every_ call to allocate or
reallocate a particular block of memory uses ISC_MEM_ZERO, to ensure
that the extra space is zeroed as expected. (When ISC_MEMFLAG_FILL
is set, the extra space will definitely be non-zero.)

When BIND is built without jemalloc, ISC_MEM_ZERO is implemented in
`jemalloc_shim.h`. This had a bug on systems that have malloc_size()
or malloc_usable_size(): memory was only zeroed up to the requested
size, not the allocated size. When an oversized allocation was
returned, and subsequently reallocated larger, memory between the
original requested size and the original allocated size could
contain unexpected nonzero junk. The realloc call does not know the
original requested size and only zeroes from the original allocated
size onwards.

After this change, `jemalloc_shim.h` always zeroes up to the
allocated size, not the requested size.
2023-02-06 11:21:12 +00:00
Evan Hunt
7fd78344e0 refactor isc_ratelimiter to use loop callbacks
the rate limiter now uses loop callbacks rather than task events.
the API for isc_ratelimiter_enqueue() has been changed; we now pass
in a loop, a callback function and a callback argument, and
receive back a rate limiter event object (isc_rlevent_t). it
is no longer necessary for the caller to allocate the event.

the callback argument needs to include a pointer to the rlevent
object so that it can be freed using isc_rlevent_free(), or by
dequeueing.
2023-01-31 21:41:19 -08:00
Ondřej Surý
3d674ccc1d Restore Malloced memory counter as InUse alias + little cleanups
This restores the Malloced memory counter; it is now always equal to
the InUse counter.  This is only for backwards compatibility reasons;
there is no separate counter.

The commit also cleans up little things like a structure with a single
item (summary.inuse), and shuts up a wrong cppcheck warning (the
notorious NULL check after assignment).
2023-01-24 17:57:16 +00:00
Ondřej Surý
474279e5f1 Remove ContextSize memory counter
Again, this was an internal allocator counter, now it's useless.
2023-01-24 17:57:16 +00:00
Ondřej Surý
863b2b8bf3 Make all the inuse memory counter atomic operations relaxed
Instead of enforcing stronger synchronization between threads, make all
the atomic operations relaxed.  We are not really interested in exact
numbers at all times - the single place where we need the exact number
is when the memory context is being destroyed.  Even when there's an
overmem counter, we don't care about exact ordering or exact number.
2023-01-24 17:57:16 +00:00
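
The relaxed-ordering approach from 863b2b8bf3 can be illustrated with the sketch below. It assumes a single `inuse` counter updated with `memory_order_relaxed`; `sketch_memctx_t` and the `sketch_mem_*()` functions are hypothetical and much simpler than isc_mem.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical memory context holding only the usage counter. */
typedef struct {
    atomic_size_t inuse;
} sketch_memctx_t;

static inline void
sketch_mem_init(sketch_memctx_t *ctx) {
    atomic_init(&ctx->inuse, 0);
}

/* Account for an allocation; no ordering with other memory is needed. */
static inline void
sketch_mem_get(sketch_memctx_t *ctx, size_t size) {
    atomic_fetch_add_explicit(&ctx->inuse, size, memory_order_relaxed);
}

/* Account for a release; the previous value must cover the size. */
static inline void
sketch_mem_put(sketch_memctx_t *ctx, size_t size) {
    size_t prev = atomic_fetch_sub_explicit(&ctx->inuse, size,
                                            memory_order_relaxed);
    assert(prev >= size);
    (void)prev; /* silence the unused warning when NDEBUG is defined */
}

/* Snapshot of the counter; may be slightly stale under contention. */
static inline size_t
sketch_mem_inuse(sketch_memctx_t *ctx) {
    return atomic_load_explicit(&ctx->inuse, memory_order_relaxed);
}
```

Relaxed operations are still atomic, so no update is lost. Only the ordering relative to other memory accesses is given up, which matches the commit's point that an exact number matters only when the context is destroyed.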
Ondřej Surý
a08e2d37ed Cleanup the ptr argument from mem_putstats()
The ptr argument was unneeded and unused.
2023-01-24 17:57:16 +00:00
Ondřej Surý
699736b7bb Remove the Lost memory counter
The Lost memory counter would count the memory "lost" by external
libraries.  There's really no such thing, as `named` requires the memory
contexts to be clean on destroy.
2023-01-24 17:57:16 +00:00
Ondřej Surý
7588cd5cb1 Remove stats buckets memory counters
The stats buckets were again more useful for the internal allocator,
because we would see the individual "block" caches that the allocations
would fall into.  Remove the stats buckets; if needed, we can pull more
detailed statistics out of jemalloc.
2023-01-24 17:57:16 +00:00
Ondřej Surý
1ea8894626 Remove the 'totalgets' memory counter
The totalgets falls into the same category as other "total" and "max"
numbers - it's just a big number with no meaning to the end user.
2023-01-24 17:57:16 +00:00
Ondřej Surý
3d4e41d076 Remove the total memory counter
The total memory counter had again little or no meaning when we removed
the internal memory allocator.  It was just a monotonic counter that
would add up the allocation sizes but never subtract anything, so
it would be just a "big number".
2023-01-24 17:57:16 +00:00
Ondřej Surý
91e349433f Remove maxinuse memory counter
The maxinuse memory counter indicated the highest amount of
memory allocated in the past. Checking and updating this high-
water mark value every time memory was allocated had an impact
on server performance, so it has been removed. Memory size can
be monitored more efficiently via an external tool logging RSS.
2023-01-24 17:57:16 +00:00
Ondřej Surý
971df0b4ed Remove malloced and maxmalloced memory counter
The malloced and maxmalloced memory counters were mostly useless since
we removed the internal allocator blocks - they would only differ from
inuse by the memory context size itself.
2023-01-24 17:57:16 +00:00
Ondřej Surý
7d8aa63026 Make {increment,decrement}_malloced() return void
The return value was only used in a single place and only for
decrement_malloced() and we can easily replace that with atomic_load().
2023-01-24 17:57:16 +00:00
Evan Hunt
a2d773fb98 Refactor dnssec-signzone to use loop callbacks
Use isc_job_run() instead of isc_task_send() for dnssec-signzone
worker threads.

Also fix the issue where the additional assignwork() would be run only
from the main thread, effectively serializing all the signing.
2023-01-21 23:39:09 -08:00
Evan Hunt
301f8b23e1 complete change of NETMGR_TRACE to ISC_NETMGR_TRACE
some references to the old ifdef were still in place.
2023-01-20 12:46:34 -08:00
Mark Andrews
b74dd2e8c2 Use INSIST rather than REQUIRE to meet DbC usage rules
2023-01-20 11:05:24 +11:00
Mark Andrews
08c39736a9 isc_nm_listentcp: treat socket failures gracefully
The old code didn't handle race conditions and errors on systems
with non-load-balancing sockets gracefully.  Look for an error on
any child socket and, if found, close all the child sockets and return
an error.
2023-01-20 11:05:24 +11:00
Mark Andrews
624f5a0dae isc_nm_listenudp: treat socket failures gracefully
The old code didn't handle race conditions and errors on systems
with non-load-balancing sockets gracefully.  Look for an error on
any child socket and, if found, close all the child sockets and return
an error.
2023-01-20 11:05:24 +11:00
Artem Boldariev
942569a1bb Fix building BIND on DragonFly BSD (on both older and newer versions)
This commit ensures that BIND and supplementary tools still can be
built on newer versions of DragonFly BSD. It used to be the case, but
somewhere between versions 6.2 and 6.4 the OS developers rearranged
headers and moved some function definitions around.

Before that, the fact that it worked was more of a coincidence; this
time we at least looked at the related man pages included with the
OS.

No in-depth testing has been done on this OS as we do not really
support this platform, so it is more like a goodwill act. We can,
however, use this platform for testing purposes, too. Also, we know
that the OS users do use BIND, as it is included in its ports
directory.

Builds with './configure' and './configure --without-jemalloc' have
been fixed and are known to work at the time the commit is made.
2023-01-20 00:19:12 +02:00
Aram Sargsyan
41dc48bfd7 Refactor isc_nm_xfr_allowed()
Return 'isc_result_t' type value instead of 'bool' to indicate
the actual failure. Rename the function to something not suggesting
a boolean type result. Make changes in the places where the API
function is being used to check for the result code instead of
a boolean value.
2023-01-19 10:24:08 +00:00
Ondřej Surý
5abbcdadaf
Use thread_local EVP_MD in isc_iterated_hash()
Cherry-pick small fixup commit from 9.18/9.16 branches needed for
thread-safety.  This fixup commit is not needed for 9.19+ because of
reworked application setup, but it decouples isc_iterated_hash and
isc_md units and keeps all the branches in sync.
2023-01-18 23:33:43 +01:00
Ondřej Surý
f3753d591f Use thread_local EVP_MD_CTX in isc_iterated_hash()
As this code is on the hot path (NSEC3), this introduces an additional
optimization of the EVP_MD API - instead of calling EVP_MD_CTX_new() on
every call to isc_iterated_hash(), we create two thread_local objects
for each thread - a basectx and mdctx, initialize basectx once and then
use EVP_MD_CTX_copy_ex() to flip the initialized state into mdctx.  This
saves us a couple more valuable microseconds from the isc_iterated_hash()
call.
2023-01-18 19:36:21 +01:00
Ondřej Surý
25db8d0103 Use OpenSSL 1.x SHA_CTX API in isc_iterated_hash()
If the OpenSSL SHA1_{Init,Update,Final} API is still available, use it.
The API has been deprecated in OpenSSL 3.0, but it is significantly
faster than EVP_MD API, so make an exception here and keep using it
until we can't.
2023-01-18 19:36:17 +01:00
Ondřej Surý
36654df732 Use OpenSSL EVP_MD API directly in isc_iterated_hash()
Instead of going through another layer, use OpenSSL EVP_MD API directly
in the isc_iterated_hash() implementation.  This shaves off a couple of
microseconds in the microbenchmark.
2023-01-18 18:32:57 +01:00
Ondřej Surý
e6bfb8e456 Avoid implicit algorithm fetch for OpenSSL EVP_MD family
The implicit algorithm fetch causes lock contention and a significant
slowdown for small input buffers.  For more details, see:

https://github.com/openssl/openssl/issues/19612

Instead of using EVP_DigestInit_ex(), initialize empty MD_CTX objects
for each algorithm and use EVP_MD_CTX_copy_ex() to initialize MD_CTX
from a static copy.  Additionally, avoid implicit algorithm fetching by
using EVP_MD_fetch() for OpenSSL 3.0.
2023-01-18 18:32:57 +01:00
Tony Finch
290899661d Fix a typo in the NS_PER_ macros
Milliseconds and microseconds were swapped.
2023-01-16 20:33:57 +00:00
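
A swap like the one fixed in 290899661d is easy to guard against at compile time. The sketch below assumes three nanosecond conversion constants and checks their ratios with C11 `static_assert`; the `SKETCH_` prefixed names and `sketch_ms_to_ns()` are hypothetical, not the macros from the BIND headers.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NS_PER_SEC UINT64_C(1000000000) /* nanoseconds per second */
#define SKETCH_NS_PER_MS  UINT64_C(1000000)    /* per millisecond */
#define SKETCH_NS_PER_US  UINT64_C(1000)       /* per microsecond */

/* Catch swapped milli/micro definitions at compile time. */
static_assert(SKETCH_NS_PER_SEC == 1000 * SKETCH_NS_PER_MS,
              "1 s must be 1000 ms");
static_assert(SKETCH_NS_PER_MS == 1000 * SKETCH_NS_PER_US,
              "1 ms must be 1000 us");

static inline uint64_t
sketch_ms_to_ns(uint64_t ms) {
    return ms * SKETCH_NS_PER_MS;
}
```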
Ondřej Surý
d07c4a98da Prefer the pthread_barrier implementation over uv_barrier
Prefer the pthread_barrier implementation over the uv_barrier
implementation on platforms where it is available.  This also solves the
problem with thread sanitizer builds on macOS, which doesn't have
pthread barriers.
2023-01-11 09:51:02 +01:00
Ondřej Surý
d06602f036
Get rid of locking during UDP and TCP listen
We already have a synchronization mechanism when starting the UDP and
TCP listener children - barriers.  Change how we start the first-born
child (tid == 0), so we don't have to race for sock->parent->result and
sock->parent->fd.
2023-01-11 07:17:46 +01:00
Ondřej Surý
10f884a5b8
Remove unused isc_astack unit
The isc_astack unit is now unused, so just remove it.
2023-01-10 20:31:24 +01:00
Ondřej Surý
359faf2ff7
Convert isc_astack usage in netmgr to mempool and ISC_LIST
Change the per-socket inactive uvreq cache (implemented as isc_astack)
to per-worker memory pool.

Change the per-socket inactive nmhandle cache (implemented as
isc_astack) to unlocked per-socket ISC_LIST.
2023-01-10 20:31:24 +01:00
Ondřej Surý
5bbba0d1a1
Simplify tracing the reference counting in isc_netmgr
Always track the per-worker sockets in the .active_sockets field in the
isc__networker_t struct and always track the per-socket handles in the
.active_handles field in the isc_nmsocket_t struct.
2023-01-10 19:57:39 +01:00
Mark Andrews
349c23dbb7 Accept 'in=NULL' with 'inlen=0' in isc_{half}siphash24
Arithmetic on NULL pointers is undefined.  Avoid arithmetic operations
when 'in' is NULL and require 'in' to be non-NULL if 'inlen' is not zero.
2023-01-10 17:52:56 +11:00
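
The guard described in 349c23dbb7 can be illustrated with the sketch below. It is not SipHash: `sketch_checksum()` is a hypothetical toy checksum, used only to show the pattern of accepting `in == NULL` together with `inlen == 0` while never doing pointer arithmetic on a NULL pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static inline uint32_t
sketch_checksum(uint32_t seed, const uint8_t *in, size_t inlen) {
    /* NULL is allowed only for an empty input. */
    assert(in != NULL || inlen == 0);

    uint32_t h = seed;
    if (in != NULL) {
        /* Pointer arithmetic happens only on a non-NULL pointer. */
        const uint8_t *end = in + inlen;
        for (; in != end; in++) {
            h = h * 31 + *in;
        }
    }
    return h;
}
```

An empty input still yields a well-defined result (the seed in this sketch), whether it is passed as NULL or as a valid pointer with zero length.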