2
0
mirror of https://gitlab.isc.org/isc-projects/bind9 synced 2025-08-28 21:17:54 +00:00

4339 Commits

Author SHA1 Message Date
Ondrej Sury
6eca4b402e Use max_align_t for memory sizeinfo alignment on OpenBSD
On OpenBSD and more generally on platforms without either jemalloc or
malloc_(usable_)size, we need to increase the alignment for the memory
to sizeof(max_align_t) as with plain sizeof(void *), the compiled code
would be crashing when accessing the returned memory.
2021-07-13 13:48:33 +02:00
Ondrej Sury
23751fe252 Cache the isc_os_ncpu() result
It was discovered that on some platforms (f.e. Alpine Linux with MUSL)
the result of isc_os_ncpus() call differ when called before and after we
drop privileges.  This commit changes the isc_os_ncpus() call to cache
the result from the first call and thus always return the same value
during the runtime of the named.  The first call to isc_os_ncpus() is
made as soon as possible on the library initalization.
2021-07-13 09:12:04 +02:00
Ondřej Surý
ce03015d48 Remove nonnull attribute from isc_mem_{get,allocate,reallocate}
The isc_mem_get(), isc_mem_allocate() and isc_mem_reallocate() can
return NULL ptr in case where the allocation size is NULL.  Remove the
nonnull attribute from the functions' declarations.

This stems from the following definition in the C11 standard:

> If the size of the space requested is zero, the behavior is
> implementation-defined: either a null pointer is returned, or the
> behavior is as if the size were some nonzero value, except that the
> returned pointer shall not be used to access an object.

In this case, we return NULL as it's easier to detect errors when
accessing pointer from zero-sized allocation which should obviously
never happen.
2021-07-12 10:02:18 +02:00
Ondřej Surý
d1a9e549b1 Fix the real allocation size in OpenBSD rallocx shim
In the rallocx() shim for OpenBSD (that's the only platform that doesn't
have malloc_size() or malloc_usable_size() equivalent), the newly
allocated size was missing the extra size_t member for storing the
allocation size leading to size_t sized overflow at the end of the
reallocated memory chunk.
2021-07-12 08:43:14 +02:00
Mark Andrews
3945c289bb Reset errcnt at the start of each subtest 2021-07-12 03:47:11 +00:00
Mark Andrews
ce5207699d Fix unchecked return of isc_rwlock_lock and isc_rwlock_unlock
(cherry picked from commit bcaf23dd2799e30cfee1b5d39dec0fbc8e5f7193)
2021-07-12 13:26:29 +10:00
Ondřej Surý
29a285a67d Revert the allocate/free -> get/put change from jemalloc change
In the jemalloc merge request, we missed the fact that ah_frees and ah_handles
are reallocated which is not compatible with using isc_mem_get() for allocation
and isc_mem_put() for deallocation.  This commit reverts that part and restores
use of isc_mem_allocate() and isc_mem_free().
2021-07-09 18:19:57 +02:00
Artem Boldariev
3673abc53c Use restrict and const in isc_mempool_t
This commit makes add restrict and const modifiers to some variables
to aid compiler to do its optimizations.
2021-07-09 15:58:02 +02:00
Artem Boldariev
c11a401add Do not use atomic variables in isc_mempool_t
As now mempool objects intended to be used in a thread-local manner,
there is no point in using atomic here.
2021-07-09 15:58:02 +02:00
Ondřej Surý
63b06571b9 Use isc_mem_get() and isc_mem_put() in isc_mem_total test
Previously, the isc_mem_allocate() and isc_mem_free() would be used for
isc_mem_total test, but since we now use the real allocation
size (sallocx, malloc_size, malloc_usable_size) to track the allocation
size, it's impossible to get the test value right.  Changing the test to
use isc_mem_get() and isc_mem_put() will use the exact size provided, so
the test would work again on all the platforms even when jemalloc is not
being used.
2021-07-09 15:58:02 +02:00
Ondřej Surý
6f162e8aa4 Rewrite isc_mem water to use single atomic exchange operation
This commit refactors the water mechanism in the isc_mem API to use
single pointer to a water_t structure that can be swapped with
atomic_exchange operation instead of having four different
values (water, water_arg, hi_water, lo_water) in the flat namespace.

This reduces the need for locking and prevents a race when water and
water_arg could be desynchronized.
2021-07-09 15:58:02 +02:00
Ondřej Surý
798333d456 Allow size == 0 in isc_mem_{get,allocate,reallocate}
Calls to jemalloc extended API with size == 0 ends up in undefined
behaviour.  This commit makes the isc_mem_get() and friends calls
more POSIX aligned:

  If size is 0, either a null pointer or a unique pointer that can be
  successfully passed to free() shall be returned.

We picked the easier route (which have been already supported in the old
code) and return NULL on calls to the API where size == 0.
2021-07-09 15:58:02 +02:00
Ondřej Surý
e20cc41e56 Use system allocator when jemalloc is unavailable
This commit adds support for systems where the jemalloc library is not
available as a package, here's the quick summary:

  * On Linux - the jemalloc is usually available as a package, if
    configured --without-jemalloc, the shim would be used around
    malloc(), free(), realloc() and malloc_usable_size()

  * On macOS - the jemalloc is available from homebrew or macports, if
    configured --without-jemalloc, the shim would be used around
    malloc(), free(), realloc() and malloc_size()

  * On FreeBSD - the jemalloc is *the* system allocator, we just need
    to check for <malloc_np.h> header to get access to non-standard API

  * On NetBSD - the jemalloc is *the* system allocator, we just need to
    check for <jemalloc/jemalloc.h> header to get access to non-standard
    API

  * On a system hostile to users and developers (read OpenBSD) - the
    jemalloc API is emulated by using ((size_t *)ptr)[-1] field to hold
    the size information.  The OpenBSD developers care only for
    themselves, so why should we care about speed on OpenBSD?
2021-07-09 15:58:02 +02:00
Ondřej Surý
e754360170 Remove atomic thread synchronization from the memory hot-path
This commit refactors the hi/lo-water related code to remove contention
on the hot path in the memory allocator.
2021-07-09 15:58:02 +02:00
Ondřej Surý
efb385ecdc Clean up isc_mempool API
- isc_mempool_get() can no longer fail; when there are no more objects
  in the pool, more are always allocated. checking for NULL return is
  no longer necessary.
- the isc_mempool_setmaxalloc() and isc_mempool_getmaxalloc() functions
  are no longer used and have been removed.
2021-07-09 15:58:02 +02:00
Ondřej Surý
f487c6948b Replace locked mempools with memory contexts
Current mempools are kind of hybrid structures - they serve two
purposes:

 1. mempool with a lock is basically static sized allocator with
    pre-allocated free items

 2. mempool without a lock is a doubly-linked list of preallocated items

The first kind of usage could be easily replaced with jemalloc small
sized arena objects and thread-local caches.

The second usage not-so-much and we need to keep this (in
libdns:message.c) for performance reasons.
2021-07-09 15:58:02 +02:00
Ondřej Surý
fd3ceec475 Add debug tracing capability to isc_mempool_create/destroy
Previously, we only had capability to trace the mempool gets and puts,
but for debugging, it's sometimes also important to keep track how many
and where do the memory pools get created and destroyed.  This commit
adds such tracking capability.
2021-07-09 15:58:02 +02:00
Ondřej Surý
5ab05d1696 Replace isc_mem_allocate() usage with isc_mem_get() in netmgr.c
The isc_mem_allocate() comes with additional cost because of the memory
tracking.  In this commit, we replace the usage with isc_mem_get()
because we track the allocated sizes anyway, so it's possible to also
replace isc_mem_free() with isc_mem_put().
2021-07-09 15:58:02 +02:00
Ondřej Surý
fcc6814776 Replace internal memory calls with non-standard jemalloc API
The jemalloc non-standard API fits nicely with our memory contexts, so
just rewrite the memory context internals to use the non-public API.

There's just one caveat - since we no longer track the size of the
allocation for isc_mem_allocate/isc_mem_free combination, we need to use
sallocx() to get real allocation size in both allocator and deallocator
because otherwise the sizes would not match.
2021-07-09 15:58:02 +02:00
Ondřej Surý
4b3d0c6600 Remove ISC_MEM_DEBUGSIZE and ISC_MEM_DEBUGRECORD
The ISC_MEM_DEBUGSIZE and ISC_MEM_DEBUGCTX did sanity checks on matching
size and memory context on the memory returned to the allocator.  Those
will no longer needed when most of the allocator will be replaced with
jemalloc.
2021-07-09 15:58:02 +02:00
Ondřej Surý
692fd2a216 Remove default_memalloc and default_memfree
Now that we have xmalloc:true enabled, we can remove our xmalloc-like
wrappers around malloc and free.
2021-07-09 15:58:02 +02:00
Ondřej Surý
5184384efd Add recommended jemalloc configuration for our load
There's global variable called `malloc_conf` that can be used to
configure jemalloc behaviour at the program startup.  We use following
configuration:

  * xmalloc:true - abort-on-out-of-memory enabled.

  * background_thread:true - Enable internal background worker threads
    to handle purging asynchronously.

  * metadata_thp:auto - allow jemalloc to use transparent huge page
    (THP) for internal metadata initially, but may begin to do so when
    metadata usage reaches certain level.

  * dirty_decay_ms:30000 - Approximate time in milliseconds from the
    creation of a set of unused dirty pages until an equivalent set of
    unused dirty pages is purged and/or reused.

  * muzzy_decay_ms:30000 - Approximate time in milliseconds from the
    creation of a set of unused muzzy pages until an equivalent set of
    unused muzzy pages is purged and/or reused.

More information about the specific meaning can be found in the jemalloc
manpage or online at http://jemalloc.net/jemalloc.3.html
2021-07-09 15:58:02 +02:00
Ondřej Surý
7f1c525625 Compile with jemalloc to reduce memory allocator contention
The jemalloc allocator is scalable high performance allocator, this is
the first in the series of commits that will add jemalloc as a memory
allocator for BIND 9.

This commit adds configure.ac check and Makefile modifications to use
jemalloc as BIND 9 allocator.
2021-07-09 15:58:02 +02:00
Ondřej Surý
63924968d1 Add debug tracing capability to isc_mem_create/isc_mem_destroy
Previously, we only had capability to trace the memory gets and puts,
but for debugging, it's sometimes also important to keep track how many
and where do the memory contexts get created and destroyed.  This commit
adds such tracking capability.
2021-07-09 15:58:02 +02:00
Artem Boldariev
c6d0e3d3a7 Return HTTP status code for small/malformed requests
This commit makes BIND return HTTP status codes for malformed or too
small requests.

DNS request processing code would ignore such requests. Such an
approach works well for other DNS transport but does not make much
sense for HTTP, not allowing it to complete the request/response
sequence.

Suppose execution has reached the point where DNS message handling
code has been called. In that case, it means that the HTTP request has
been successfully processed, and, thus, we are expected to respond to
it either with a message containing some DNS payload or at least to
return an error status code. This commit ensures that BIND behaves
this way.
2021-07-09 16:37:08 +03:00
Artem Boldariev
fedff2cd6c Return "Bad Request" (400) in a case of Base64 decoding error
This error code fits better than the more generic "Internal Server
Error" (500) which implies that the problem is on the server.

Also, do not end the whole HTTP/2 session on a bad request.
2021-07-09 16:26:46 +03:00
Artem Boldariev
1792740075 Ignore an "Accept" HTTP header value
We were too strict regarding the value and presence of "Accept" HTTP
header, slightly breaking compatibility with the specification.

According to RFC8484 client SHOULD add "Accept" header to the requests
but MUST be able to handle "application/dns-message" media type
regardless of the value of the header. That basically suggests we
ignore its value.

Besides, verifying the value of the "Accept" header is a bit tricky
because it could contain multiple media types, thus requiring proper
parsing. That is doable but does not provide us with any benefits.

Among other things, not verifying the value also fixes compatibility
with clients, which could advertise multiple media types as supported,
which we should accept. For example, it is possible for a perfectly
valid request to contain "application/dns-message", "application/*",
and "*/*" in the "Accept" header value. Still, we would treat such a
request as invalid.
2021-07-09 16:26:46 +03:00
Artem Boldariev
7b6945fb60 Fix BIND hanging when browsers end HTTP/2 streams prematurely
The commit fixes BIND hanging when browsers end HTTP/2 streams
prematurely (for example, by sending RST_STREAM). It ensures that
isc__nmsocket_prep_destroy() will be called for an HTTP/2 stream,
allowing it to be properly disposed.

The problem was impossible to reproduce using dig or DoH benchmarking
software (e.g. flamethrower) because these do not tend to end HTTP/2
streams prematurely.
2021-07-09 15:42:44 +03:00
Artem Boldariev
094fcc10e7 Move the code which calls server read callback into a separate func
This commit moves the code which calls server read callback into a
separate function to avoid code repetition.
2021-07-09 15:42:44 +03:00
Ondřej Surý
2bb454182b Make the DNS over HTTPS support optional
This commit adds two new autoconf options `--enable-doh` (enabled by
default) and `--with-libnghttp2` (mandatory when DoH is enabled).

When DoH support is disabled the library is not linked-in and support
for http(s) protocol is disabled in the netmgr, named and dig.
2021-07-07 09:50:53 +02:00
Ondřej Surý
29c2e52484 The isc/platform.h header has been completely removed
The isc/platform.h header was left empty which things either already
moved to config.h or to appropriate headers.  This is just the final
cleanup commit.
2021-07-06 05:33:48 +00:00
Ondřej Surý
bf4a0e26dc Move NAME_MAX and PATH_MAX from isc/platform.h to isc/dir.h
The last remaining defines needed for platforms without NAME_MAX and
PATH_MAX (I'm looking at you, GNU Hurd) were moved to isc/dir.h where
it's prevalently used.
2021-07-06 05:33:48 +00:00
Ondřej Surý
4da0c49e80 Move ISC_STRERRORSIZE to isc/strerr.h header
The ISC_STRERRORSIZE was defined in isc/platform.h header as the
value was different between Windows and POSIX platforms.  Now that
Windows is gone, move the define to where it belongs.
2021-07-06 05:33:48 +00:00
Ondřej Surý
d881e30b0a Remove LIB<*>_EXTERNAL_DATA defines
After Windows has been removed, the LIB<*>_EXTERNAL_DATA defines
were just dummy leftovers.  Remove them.
2021-07-06 05:33:48 +00:00
Ondřej Surý
e59a359929 Move the include Makefile.tests to the bottom of Makefile.am(s)
The Makefile.tests was modifying global AM_CFLAGS and LDADD and could
accidentally pull /usr/include to be listed before the internal
libraries, which is known to cause problems if the headers from the
previous version of BIND 9 has been installed on the build machine.
2021-06-24 15:33:52 +02:00
Ondřej Surý
b941411072 Disable IP fragmentation on the UDP sockets
In DNS Flag Day 2020, we started setting the DF (Don't Fragment socket
option on the UDP sockets.  It turned out, that this code was incomplete
leading to dropping the outgoing UDP packets.

This has been now remedied, so it is possible to disable the
fragmentation on the UDP sockets again as the sending error is now
handled by sending back an empty response with TC (truncated) bit set.

This reverts commit 66eefac78c92b64b6689a1655cc677a2b1d13496.
2021-06-23 17:41:34 +02:00
Evan Hunt
a3ba95116e Handle UDP send errors when sending DNS message larger than MTU
When the fragmentation is disabled on UDP sockets, the uv_udp_send()
call can fail with UV_EMSGSIZE for messages larger than path MTU.
Previously, this error would end with just discarding the response.  In
this commit, a proper handling of such case is added and on such error,
a new DNS response with truncated bit set is generated and sent to the
client.

This change allows us to disable the fragmentation on the UDP
sockets again.
2021-06-23 17:41:34 +02:00
Ondřej Surý
ec86759401 Replace netmgr per-protocol sequential function with a common one
Previously, each protocol (TCPDNS, TLSDNS) has specified own function to
disable pipelining on the connection.  An oversight would lead to
assertion failure when opcode is not query over non-TCPDNS protocol
because the isc_nm_tcpdns_sequential() function would be called over
non-TCPDNS socket.  This commit removes the per-protocol functions and
refactors the code to have and use common isc_nm_sequential() function
that would either disable the pipelining on the socket or would handle
the request in per specific manner.  Currently it ignores the call for
HTTP sockets and causes assertion failure for protocols where it doesn't
make sense to call the function at all.
2021-06-22 17:21:44 +03:00
Ondřej Surý
54c389dbc0 Drop support for clang atomic and gcc __sync builtins
The requirements for BIND 9.17+ now requires C11 support from the
compiler, so we can safely drop most of the stdatomic.h shims from
lib/isc/unix/include/stdatomic.h.

This commit removes support for clang atomic builtins (clang >= 3.6.0
includes stdatomic.h header) and for Gcc __sync builtins.

The only compatibility shim that remains is support for __atomic
builtins for Gcc >= 4.7.0 since CentOS 7 still includes only Gcc 4.8.1
and the proper stdatomic.h header was only introduced in Gcc >= 4.9.
2021-06-17 09:51:04 +02:00
Ondřej Surý
4677bb28d1 Remove atomics emulated by a mutex-locked variable
Mutex atomics were intended to be used as a debugging tool only
and it has already served its purpose and it's not needed anymore.
2021-06-17 09:51:04 +02:00
Artem Boldariev
dc356bb196 Fix ASAN error in DoH (passing NULL to memmove())
The warning was produced by an ASAN build:

runtime error: null pointer passed as argument 2, which is declared to
never be null

This commit fixes it by checking if nghttp2_session_mem_send() has
actually returned anything.
2021-06-16 17:46:10 +03:00
Mark Andrews
234ad2d075 Lock access to task->threadid 2021-06-15 00:01:58 +00:00
Artem Boldariev
ccd2267b1c Set sock->iface and sock->peer properly for layered connection types
This change sets the mentioned fields properly and gets rid of klusges
added in the times when we were keeping pointers to isc_sockaddr_t
instead of copies. Among other things it helps to avoid a situation
when garbage instead of an address appears in dig output.
2021-06-14 11:37:36 +03:00
Artem Boldariev
b84fa122ce Make BIND refuse to serve XFRs over DoH
We cannot use DoH for zone transfers.  According to RFC8484 a DoH
request contains exactly one DNS message (see Section 6: Definition of
the "application/dns-message" Media Type,
https://datatracker.ietf.org/doc/html/rfc8484#section-6).  This makes
DoH unsuitable for zone transfers as often (and usually!) these need
more than one DNS message, especially for larger zones.

As zone transfers over DoH are not (yet) standardised, nor discussed
in RFC8484, the best thing we can do is to return "not implemented."

Technically DoH can be used to transfer small zones which fit in one
message, but that is not enough for the generic case.

Also, this commit makes the server-side DoH code ensure that no
multiple responses could be attempted to be sent over one HTTP/2
stream. In HTTP/2 one stream is mapped to one request/response
transaction. Now the write callback will be called with failure error
code in such a case.
2021-06-14 11:37:36 +03:00
Artem Boldariev
009752cab0 Pass an HTTP handle to the read callback when finishing a stream
This commit fixes a leftover from an earlier version of the client-side
DoH code when the underlying transport handle was used directly.
2021-06-14 11:37:36 +03:00
Artem Boldariev
d5d20cebb2 Fix a crash in the client-side DoH code (header processing callback)
Support a situation in header processing callback when client side
code could receive a belated response or part of it. That could
happen when the HTTP/2 session was already closed, but there were some
response data from server in flight. Other client-side nghttp2
callbacks code already handled this case.

The bug became apparent after HTTP/2 write buffering was supported,
leading to rare unit test failures.
2021-06-14 11:37:33 +03:00
Artem Boldariev
2dfc0d9afc Nullify connect.cstream in time and keep track of all client streams
This commit ensures that sock->h2.connect.cstream gets nullified when
the object in question is deleted. This fixes a nasty crash in dig
exposed when receiving large responses leading to double free()ing.

Also, it refactors how the client-side code keeps track of client
streams (hopefully) preventing from similar errors appearing in the
future.
2021-06-14 11:37:29 +03:00
Artem Boldariev
5b507c1136 Fix BIND to serve large HTTP responses
This commit makes NM code to report HTTP as a stream protocol. This
makes it possible to handle large responses properly. Like:

dig +https @127.0.0.1 A cmts1-dhcp.longlines.com
2021-06-14 11:37:17 +03:00
Ondřej Surý
b3de93e54c Update the source code formatting using clang-format-12
clang-format now tries to keep the type-cast on the same line as the
variable.  Update the formatting.
2021-06-13 08:46:28 +02:00
Ondřej Surý
440fb3d225 Completely remove BIND 9 Windows support
The Windows support has been completely removed from the source tree
and BIND 9 now no longer supports native compilation on Windows.

We might consider reviewing mingw-w64 port if contributed by external
party, but no development efforts will be put into making BIND 9 compile
and run on Windows again.
2021-06-09 14:35:14 +02:00