Instead of juggling different logging context, use one single logging
context that gets initialized in the libisc constructor and destroyed in
the libisc destructor.
The application is still responsible for creating the logging
configuration before using the isc_log API.
This patch is first in the series in a way that it is transparent for
the users of the isc_log API as the isc_log_create() and
isc_log_destroy() are now thin shims that emulate the previous
functionality, but it isc_log_create() will always return internal
isc__lctx pointer and isc_log_destroy() will actually not destroy the
internal isc__lctx context.
Signed-off-by: Ondřej Surý <ondrej@isc.org>
Instead of directly using the result of dirfd() in the unlinkat() call,
check whether the returned file descriptor is actually valid. That
doesn't really change the logic as the unlinkat() would fail with
invalid descriptor anyway, but this is cleaner and will report the right
error returned directly by dirfd() instead of EBADF from unlinkat().
The clang-scan 19 has reported that we are ignoring errno after the call
to rewind(). As we don't really care about the result, just silence the
error, the whole code will be removed in the development version anyway
as it is not needed.
The contexpr introduced in C23 standard makes perfect sense to be used
instead of preprocessor macros - the symbols are kept, etc. Define
ISC_CONSTEXPR to be `constexpr` for C23 and `static const` for the older
C standards. Use the newly introduced macro for the NS_PER_SEC and
friends time constants.
New version of clang (19) has introduced a stricter checks when mixing
integer (and float types) with enums. In this case, we used enum {}
as C17 doesn't have constexpr yet. Change the time conversion constants
to be static const unsigned int instead of enum values.
Check if 'lctx->logconfig' is NULL before using it in isc_log_doit(),
because it's possible that isc_log_destroy() was already called, e.g.
when a 'call_rcu' function wants to log a message during shutdown.
When iterating through the old internal hashmap table, skip all the
nodes that have been already migrated to the new table. We know that
all positions with index less than .hiter are NULL.
When the round robin hashing reorders the map entries on deletion, we
were adjusting the iterator table size only when the reordering was
happening at the internal table boundary. The iterator table size had
to be reduced by one to prevent seeing the entry that resized on
position [0] twice because it migrated to [iter->size - 1] position.
However, the same thing could happen when the same entry migrates a
second time from [iter->size - 1] to [iter->size - 2] position (and so
on) because the check that we are manipulating the entry just in the [0]
position was insufficient. Instead of checking the position [pos == 0],
we now check that the [pos % iter->size == 0], thus ignoring all the
entries that might have moved back to the end of the internal table.
Since the enable_fips_mode() now resides inside the isc_tls unit, BIND 9
would fail to compile when FIPS mode was enabled as the DST subsystem
logging functions were missing.
Move the crypto library logging functions from the openssl_link unit to
isc_tls unit and enhance it, so it can now be used from both places
keeping the old dst__openssl_toresult* macros alive.
Instead of calling dst_lib_init() and dst_lib_destroy() explicitly by
all the programs, create a separate memory context for the DST subsystem
and use the library constructor and destructor to initialize the DST
internals.
When the SSL object was destroyed, it would invalidate all SSL_SESSION
objects including the cached, but not yet used, TLS session objects.
Properly disassociate the SSL object from the SSL_SESSION before we
store it in the TLS session cache, so we can later destroy it without
invalidating the cached TLS sessions.
Co-authored-by: Ondřej Surý <ondrej@isc.org>
Co-authored-by: Artem Boldariev <artem@isc.org>
Co-authored-by: Aram Sargsyan <aram@isc.org>
When TLS connection (TLSstream) connection was accepted, the children
listening socket was not attached to sock->server and thus it could have
been freed before all the accepted connections were actually closed.
In turn, this would cause us to call isc_tls_free() too soon - causing
cascade errors in pending SSL_read_ex() in the accepted connections.
Properly attach and detach the children listening socket when accepting
and closing the server connections.
Since the support for OpenSSL Engines has been removed, we can now also
remove the checks for OPENSSL_API_LEVEL; The OpenSSL 3.x APIs will be
used when compiling with OpenSSL 3.x, and OpenSSL 1.1.xx APIs will be
used only when OpenSSL 1.1.x is used.
The OpenSSL 1.x Engines support has been deprecated in the OpenSSL 3.x
and is going to be removed. Remove the OpenSSL Engine support in favor
of OpenSSL Providers.
The previous work in this area was led by the belief that we might be
calling call_rcu() from within call_rcu() callbacks. After carefully
checking all the current callback, it became evident that this is not
the case and the problem isn't enough rcu_barrier() calls, but something
entirely else.
Call the rcu_barrier() just once as that's enough and the multiple
rcu_barrier() calls will not hide the real problem anymore, so we can
find it.
Since the minimal OpenSSL version is now OpenSSL 1.1.1, remove all kind
of OpenSSL shims and checks for functions that are now always present in
the OpenSSL libraries.
Co-authored-by: Ondřej Surý <ondrej@isc.org>
Co-authored-by: Aydın Mercan <aydin@isc.org>
When putting the 48-bit number into a fixed-size buffer that's exactly 6
bytes, the assertion failure would occur as the 48-bit number is
internally represented as 64-bit number and the code was checking if
there is enough space for `sizeof(val)`. This causes assertion failure
when otherwise valid TSIG signature has a bad timing information.
Specify the size of the argument explicitly, so the 48-bit number
doesn't require 8-byte long buffer.
The PTHREAD_MUTEX_ADAPTIVE_NP and PTHREAD_MUTEX_ERRORCHECK_NP are
usually not defines, but enum values, so simple preprocessor check
doesn't work.
Check for PTHREAD_MUTEX_ADAPTIVE_NP from the autoconf AS_COMPILE_IFELSE
block and define HAVE_PTHREAD_MUTEX_ADAPTIVE_NP. This should enable
adaptive mutex on Linux and FreeBSD.
As PTHREAD_MUTEX_ERRORCHECK actually comes from POSIX and Linux glibc
does define it when compatibility macros are being set, we can just use
PTHREAD_MUTEX_ERRORCHECK instead of PTHREAD_MUTEX_ERRORCHECK_NP.
On a 32 bit machine casting to size_t can still lead to an overflow.
Cast to uint64_t. Also detect all possible negative values for
pages and pagesize to silence warning about possible negative value.
39#if defined(_SC_PHYS_PAGES) && defined(_SC_PAGESIZE)
1. tainted_data_return: Called function sysconf(_SC_PHYS_PAGES),
and a possible return value may be less than zero.
2. assign: Assigning: pages = sysconf(_SC_PHYS_PAGES).
40 long pages = sysconf(_SC_PHYS_PAGES);
41 long pagesize = sysconf(_SC_PAGESIZE);
42
3. Condition pages == -1, taking false branch.
4. Condition pagesize == -1, taking false branch.
43 if (pages == -1 || pagesize == -1) {
44 return (0);
45 }
46
5. overflow: The expression (size_t)pages * pagesize might be negative,
but is used in a context that treats it as unsigned.
CID 498034: (#1 of 1): Overflowed return value (INTEGER_OVERFLOW)
6. return_overflow: (size_t)pages * pagesize, which might have underflowed,
is returned from the function.
47 return ((size_t)pages * pagesize);
48#endif /* if defined(_SC_PHYS_PAGES) && defined(_SC_PAGESIZE) */
This commit ensures that we are not attempting to accept an expired
TCP connection as we are not interested in any data that could have
been accumulated in its internal buffers. Now we just drop them for
good.
Be more aggressive when throttling the reading - when we can't send the
outgoing TCP synchronously with uv_try_write(), we start throttling the
reading immediately instead of waiting for the send buffers to fill up.
This should not affect behaved clients that read the data from the TCP
on the other end.
Due to omission it was possible to un-throttle a TCP connection
previously throttled due to the peer not reading back data we are
sending.
In particular, that affected DoH code, but it could also affect other
transports (the current or future ones) that pause/resume reading
according to its internal state.
The single TCP read can create as much as 64k divided by the minimum
size of the DNS message. This can clog the processing thread and trash
the memory allocator because we need to do as much as ~20k allocations in
a single UV loop tick.
Limit the number of the DNS messages processed in a single UV loop tick
to just single DNS message and limit the number of the outstanding DNS
messages back to 23. This effectively limits the number of pipelined
DNS messages to that number (this is the limit we already had before).
When TCP client would not read the DNS message sent to them, the TCP
sends inside named would accumulate and cause degradation of the
service. Throttle the reading from the TCP socket when we accumulate
enough DNS data to be sent. Currently this is limited in a way that a
single largest possible DNS message can fit into the buffer.
This commit ensures that an HTTP endpoints set reference is stored in
a socket object associated with an HTTP/2 stream instead of
referencing the global set stored inside a listener.
This helps to prevent an issue like follows:
1. BIND is configured to serve DoH clients;
2. A client is connected and one or more HTTP/2 stream is
created. Internal pointers are now pointing to the data on the
associated HTTP endpoints set;
3. BIND is reconfigured - the new endpoints set object is created and
promoted to all listeners;
4. The old pointers to the HTTP endpoints set data are now invalid.
Instead referencing a global object that is updated on
re-configurations we now store a local reference which prevents the
endpoints set objects to go out of scope prematurely.
It was reported that HTTP/2 session might get closed or even deleted
before all async. processing has been completed.
This commit addresses that: now we are avoiding using the object when
we do not need it or specifically check if the pointers used are not
'NULL' and by ensuring that there is at least one reference to the
session object while we are doing incoming data processing.
This commit makes the code more resilient to such issues in the
future.
Add an isc_queue implementation that hides the gory details of cds_wfcq
into more neat API. The same caveats as with cds_wfcq.
TODO: Add documentation to the API.
The detach function declaration in `ISC__REFCOUNT_TRACE_DECL` had an
returned an accidental implicit int. While not allowed since C99, it
became an error by default in GCC 14.
`ISC_REFCOUNT_TRACE_IMPL` and `ISC_REFCOUNT_STATIC_TRACE_IMPL` expanded
into the wrong macros, trying to declare it again with the wrong number
of parameters.
An assertion failure would be triggered when sending the TCP data ends
after the TCP reading gets closed. Implement proper reference counting
for the isc_httpd object.
Returning the value allows for better high-water tracking without
running into edge cases like the following:
0. The counter is at value X
1. Increment the value (X+1)
2. The value is decreased multiple times in another threads (X+1-Y)
3. Get the value (X+1-Y)
4. Update-if-greater misses the X+1 value which should have been the
high-water
Prepare the statistics channel data in the offloaded worker thread, so
the networking thread is not blocked by the process gathering data from
various data structures. Only the netmgr send is then run on the
networkin thread when all the data is already there.
isc_loop() can now take its place.
This also requires changes to the test harness - instead of running the
setup and teardown outside of th main loop, we now schedule the setup
and teardown to run on the loop (via isc_loop_setup() and
isc_loop_teardown()) - this is needed because the new the isc_loop()
call has to be run on the active event loop, but previously the
isc_loop_current() (and the variants like isc_loop_main()) would work
even outside of the loop because it needed just isc_tid() to work, but
not the full loop (which was mainly true for the main thread).
if we had a method to get the running loop, similar to how
isc_tid() gets the current thread ID, we can simplify loop
and loopmgr initialization.
remove most uses of isc_loop_current() in favor of isc_loop().
in some places where that was the only reason to pass loopmgr,
remove loopmgr from the function parameters.