2
0
mirror of https://gitlab.isc.org/isc-projects/bind9 synced 2025-08-31 06:25:31 +00:00
Commit Graph

613 Commits

Author SHA1 Message Date
Michał Kępień
ccee3507c2 Check recursions at the end of request processing
ns_client_endrequest() currently contains code that looks for
outstanding quota references and cleans them up if necessary.  This
approach masks programming errors because ns_client_endrequest() is only
called from ns__client_reset_cb(), which in turn is only called when all
references to the client's netmgr handle are released, which in turn
only happens after all recursion completion callbacks are invoked
(because isc_nmhandle_attach() is called before every call to
dns_resolver_createfetch() in lib/ns/query.c and the completion callback
is expected to detach from the handle), which in turn is expected to
happen for all recursions attempts, even those that get canceled.

Furthermore, declaring the prototype of ns_client_endrequest() at the
top of lib/ns/client.c is redundant because the definition of that
function is placed before its first use in that file.  Remove the
redundant function prototype.

Finally, remove INSIST assertions ensuring quota pointers are NULL in
ns__client_reset_cb() because the latter calls ns_client_endrequest() a
few lines earlier.
2022-06-14 13:13:32 +02:00
Michał Kępień
e09b36f2cc Adjust recursion quota when starting a fetch fails
Some functions fail to detach from the recursion quota if an error
occurs while initiating recursion.  This causes the recursive client
counter to be off.  Add missing recursionquota_detach() calls, reworking
cleanup code where appropriate.
2022-06-14 13:13:32 +02:00
Michał Kępień
172e15f7ad Attach to separate recursion quota pointers
Similarly to how different code paths reused common client handle
pointers and fetch references despite being logically unrelated, they
also reuse client->recursionquota, the field in which a reference to the
recursion quota is stored.  This unnecessarily forces all code using
that field to be aware of the fact that it is overloaded by different
features.

Overloading client->recursionquota also causes inconsistent behavior.
For example, if prefetch code triggers recursion and then delegation
handling code also triggers recursion, only one of these code paths will
be able to attach to the recursion quota, but both recursions will be
started anyway.  In other words, each code path only checks whether the
recursion quota has not been exceeded if the quota has not yet been
attached to by another code path.  This behavior theoretically allows
the configured recursion quota to be slightly exceeded; while it is not
expected to be a real-world operational issue, it is still confusing and
should therefore be fixed.

Extend the structures comprising the 'recursions' array with a new field
holding a pointer to the recursion quota that a given recursion process
attached to.  Update all code paths using client->recursionquota so that
they use the appropriate slot in the 'recursions' array.  Drop the
'recursionquota' field from ns_client_t.
2022-06-14 13:13:32 +02:00
Michał Kępień
95e703121d Ensure ns_query_cancel() handles all recursions
Previously, multiple code paths reused client->query.fetch, so it was
enough for ns_query_cancel() to issue a single call to
dns_resolver_cancelfetch() with that fetch as an argument.  Now, since
each slot in the 'recursions' array can hold a reference to a separate
resolver fetch, ns_query_cancel() needs to handle all of them, so that
all recursion callbacks get a chance to clean up the associated
resources when a query is canceled.
2022-06-14 13:13:32 +02:00
Michał Kępień
9e187b893d Drop the 'fetchhandle' and 'fetch' fields
Drop the 'fetchhandle' field from ns_client_t as all code using it has
been migrated to use the recursion-type-specific HANDLE_RECTYPE_*()
macros.

Drop the 'fetch' field from ns_query_t as all code using it has been
migrated to use the recursion-type-specific FETCH_RECTYPE_*() macros.
2022-06-14 13:13:32 +02:00
Michał Kępień
e0be643f50 Make async hooks code use the 'recursions' array
Async hooks are the last feature using the client->fetchhandle and
client->query.fetch pointers.  Update ns_query_hookasync() and
query_hookresume() so that they use a dedicated slot in the 'recursions'
array.  Note that async hooks are still not expected to initiate
recursion if one was already started by a prior ns_query_recurse() call,
so the REQUIRE assertion in ns_query_hookasync() needs to check the
RECTYPE_NORMAL slot rather than the RECTYPE_HOOK one.
2022-06-14 13:13:32 +02:00
Michał Kępień
af6fcf5641 Make resolver glue code use the 'recursions' array
With prefetch and RPZ code updated to use separate slots in the
'recursions' array, the code responsible for starting recursion in
ns_query_recurse() and resuming query handling in fetch_callback()
should follow suit, so that it does not need to explicitly cooperate
with other code paths that may initiate recursion.

Replace:

  - client->fetchhandle with HANDLE_RECTYPE_NORMAL(client)
  - client->query.fetch with FETCH_RECTYPE_NORMAL(client)

Also update other functions using client->fetchhandle and
client->query.fetch (ns_query_cancel(), query_usestale()) so that those
two fields can shortly be dropped altogether.
2022-06-14 13:13:32 +02:00
Michał Kępień
9eaddf2e4f Separate prefetch handling from RPZ fetch handling
Both prefetch code and RPZ code ignore recursion results (caching the
response notwithstanding).  RPZ code has been (ab)using that fact since
commit 08e36aa5a5 by employing
prefetch_done() as the fetch completion callback.  This is only
seemingly a simplification as it makes the code harder to follow ("why
is prefetch code used for handling RPZ-triggered recursion?").

Turn prefetch_done() into a new function whose name clearly conveys its
purpose.  Add a parameter to its prototype in order to allow callers to
specify which slot in the 'recursions' array it should use.  Reintroduce
prefetch_done() as a wrapper for that function.  Add rpzfetch_done(), an
RPZ-exclusive wrapper for that function (using a distinct recursion
type).

Since each slot in the 'recursions' array needs to be initialized before
getting cleaned up when recursion completes, rework fetch_and_forget()
so that it takes recursion type rather than extra fetch options as the
last parameter and make it use the requested slot in the 'recursions'
array rather than a fixed slot (RECTYPE_PREFETCH) for all callers.  This
makes fetch_and_forget() a logical complement of cleanup_after_fetch().

Collectively, these changes make prefetch and RPZ code logically
separate (except for reusing client->recursionquota, which will be
refactored later).
2022-06-14 13:13:32 +02:00
Michał Kępień
30ace0663d Make prefetch code use the 'recursions' array
Replace:

  - client->prefetchhandle with HANDLE_RECTYPE_PREFETCH(client)
  - client->query.prefetch with FETCH_RECTYPE_PREFETCH(client)

This is preparatory work for separating prefetch code from RPZ code.
2022-06-14 13:13:32 +02:00
Michał Kępień
0fd787c8b8 Enable ns_query_t to track multiple recursions
When a client waits for a prefetch- or RPZ-triggered recursion to
complete, its netmgr handle is attached to client->prefetchhandle and a
reference to the resolver fetch is stored in client->query.prefetch.
Both of these features use the same fields mentioned above.  This makes
the code fragile and hard to follow as its logically distinct parts
become intertwined for no obvious reason.

Furthermore, storing pointers related to a specific recursion process in
two different structures makes their purpose harder to grasp than it has
to be.

To alleviate the problem, extend ns_query_t with an array of structures
containing recursion-related pointers.  Each feature able to initiate
recursion is supposed to use its own slot in that array, allowing
logically unrelated code paths to be untangled.  Prefetch and RPZ will
be the first users of that array.

Define helper macros for accessing specific recursion-related pointers
in order to improve code readability.
2022-06-14 13:13:32 +02:00
Michał Kępień
76070fbf33 Simplify client->query initialization
Initialize client->query using a compound literal in order to make the
ns_query_init() function shorter and more readable.  This also prevents
the need to explicitly initialize any newly added fields in the future.
2022-06-14 13:13:32 +02:00
Michał Kępień
525d2875ec Use common code to start prefetches & RPZ fetches
query_prefetch() and query_rpzfetch() contain a lot of duplicated code.
Extract the common bits into a separate function whose name clearly
suggests its purpose.
2022-06-14 13:13:32 +02:00
Ondřej Surý
4232a281c8 Add recursionquota_attach*()
Add a set of new helper functions for attaching to the recursion quota
in order to reduce code duplication and to ensure that the recursive
clients counter is always adjusted properly.  Since some callers
(query_prefetch(), query_rpzfetch()) treat exceeding the soft quota as
an error while others (check_recursionquota()) do not, also add two
wrapper functions whose names help convey their purpose, in order to
improve code readability.
2022-06-14 13:13:32 +02:00
Ondřej Surý
70254724e7 Add recursionquota_detach()
Add a new helper function for detaching from the recursion quota in
order to reduce code duplication and to ensure that detaching from that
quota is always accompanied by decreasing the recursive clients counter.
2022-06-14 13:13:32 +02:00
Michał Kępień
07592d1315 Check for NULL before dereferencing qctx->rpz_st
Commit 9ffb4a7ba1 causes Clang Static
Analyzer to flag a potential NULL dereference in query_nxdomain():

    query.c:9394:26: warning: Dereference of null pointer [core.NullDereference]
            if (!qctx->nxrewrite || qctx->rpz_st->m.rpz->addsoa) {
                                    ^~~~~~~~~~~~~~~~~~~
    1 warning generated.

The warning above is for qctx->rpz_st potentially being a NULL pointer
when query_nxdomain() is called from query_resume().  This is a false
positive because none of the database lookup result codes currently
causing query_nxdomain() to be called (DNS_R_EMPTYWILD, DNS_R_NXDOMAIN)
can be returned by a database lookup following a recursive resolution
attempt.  Add a NULL check nevertheless in order to future-proof the
code and silence Clang Static Analyzer.
2022-06-13 14:03:16 +02:00
Michał Kępień
39fd8efbb7 Remove NULL checks for ns_client_getnamebuf()
ns_client_getnamebuf() cannot fail (i.e. return NULL) since commit
e31cc1eeb4.  Remove redundant NULL checks
performed on the pointer returned by ns_client_getnamebuf().
2022-06-10 14:30:23 +02:00
Michał Kępień
a229236019 Remove NULL checks for ns_client_newname()
ns_client_newname() cannot fail (i.e. return NULL) since commit
2ce0de6995 (though it was only made more
apparent by commit 33ba0057a7).  Remove
redundant NULL checks performed on the pointer returned by
ns_client_newname().
2022-06-10 14:30:23 +02:00
Michał Kępień
9ffb4a7ba1 Remove NULL checks for ns_client_newrdataset()
ns_client_newrdataset() cannot fail (i.e. return NULL) since commit
efb385ecdc (though it was only made more
apparent by commit 33ba0057a7).  Remove
redundant NULL checks performed on the pointer returned by
ns_client_newrdataset().
2022-06-10 14:30:23 +02:00
Tony Finch
129a522d88 There can no longer be multiple compression methods
The aim is to get rid of the obsolete term "GLOBAL14" and instead just
refer to DNS name compression.

This is mostly mechanically renaming

from	dns_(de)compress_(get|set)methods()
to	dns_(de)compress_(get|set)permitted()

and replacing the related enum by a simple flag, because compression
is either on or off.
2022-06-01 13:00:40 +01:00
Tony Finch
e37b782c1a DNS name compression does not depend on the EDNS version
There was a proposal in the late 1990s that it might, but it turned
out to be unworkable. See RFC 6891, Extension Mechanisms for
DNS (EDNS(0)), section 5, Extended Label Types.

The remnants of the code that supported this in BIND are redundant.
2022-06-01 13:00:40 +01:00
Ondřej Surý
2c3b2dabe9 Move all the unit tests to /tests/<libname>/
The unit tests are now using a common base, which means that
lib/dns/tests/ code now has to include lib/isc/include/isc/test.h and
link with lib/isc/test.c and lib/ns/tests has to include both libisc and
libdns parts.

Instead of cross-linking code between the directories, move the
/lib/<foo>/test.c to /tests/<foo>.c and /lib/<foo>/include/<foo>test.h
to /tests/include/tests/<foo>.h and create a single libtest.la
convenience library in /tests/.

At the same time, move the /lib/<foo>/tests/ to /tests/<foo>/ (but keep
it symlinked to the old location) and adjust paths accordingly.  In few
places, we are now using absolute paths instead of relative paths,
because the directory level has changed.  By moving the directories
under the /tests/ directory, the test-related code is kept in a single
place and we can avoid referencing files between libns->libdns->libisc
which is unhealthy because they live in a separate Makefile-space.

In the future, the /bin/tests/ should be merged to /tests/ and symlink
kept, and the /fuzz/ directory moved to /tests/fuzz/.
2022-05-28 14:53:02 -07:00
Ondřej Surý
63fe9312ff Give the unit tests a big overhaul
The unit tests contain a lot of duplicated code and here's an attempt
to reduce code duplication.

This commit does several things:

1. Remove #ifdef HAVE_CMOCKA - we already solve this with automake
   conditionals.

2. Create a set of ISC_TEST_* and ISC_*_TEST_ macros to wrap the test
   implementations, test lists, and the main test routine, so we don't
   have to repeat this all over again.  The macros were modeled after
   libuv test suite but adapted to cmocka as the test driver.

   A simple example of a unit test would be:

    ISC_RUN_TEST_IMPL(test1) { assert_true(true); }

    ISC_TEST_LIST_START
    ISC_TEST_ENTRY(test1)
    ISC_TEST_LIST_END

    ISC_TEST_MAIN (Discussion: Should this be ISC_TEST_RUN ?)

   For more complicated examples including group setup and teardown
   functions, and per-test setup and teardown functions.

3. The macros prefix the test functions and cmocka entries, so the name
   of the test can now match the tested function name, and we don't have
   to append `_test` because `run_test_` is automatically prepended to
   the main test function, and `setup_test_` and `teardown_test_` is
   prepended to setup and teardown function.

4. Update all the unit tests to use the new syntax and fix a few bits
   here and there.

5. In the future, we can separate the test declarations and test
   implementations which are going to greatly help with uncluttering the
   bigger unit tests like doh_test and netmgr_test, because the test
   implementations are not declared static (see `ISC_RUN_TEST_DECLARE`
   and `ISC_RUN_TEST_IMPL` for more details.

NOTE: This heavily relies on preprocessor macros, but the result greatly
outweighs all the negatives of using the macros.  There's less
duplicated code, the tests are more uniform and the implementation can
be more flexible.
2022-05-28 14:52:56 -07:00
Ondřej Surý
1fe391fd40 Make all tasks to be bound to a thread
Previously, tasks could be created either unbound or bound to a specific
thread (worker loop).  The unbound tasks would be assigned to a random
thread every time isc_task_send() was called.  Because there's no logic
that would assign the task to the least busy worker, this just creates
unpredictability.  Instead of random assignment, bind all the previously
unbound tasks to worker 0, which is guaranteed to exist.
2022-05-25 16:04:51 +02:00
Artem Boldariev
b58c4b8462 Disable periodic interface re-scans on modern platforms
This commit disables periodic interface re-scans timer on Linux where
a kernel-based dynamic interface mechanisms make it a thing of the
past in most cases.
2022-05-24 15:26:35 +03:00
Artem Boldariev
987892d113 Extend TLS context cache with TLS client session cache
This commit extends TLS context cache with TLS client session cache so
that an associated session cache can be stored alongside the TLS
context within the context cache.
2022-05-20 20:13:20 +03:00
Ondřej Surý
33ba0057a7 Cleanup dns_message_gettemp*() functions - they cannot fail
The dns_message_gettempname(), dns_message_gettemprdata(),
dns_message_gettemprdataset(), and dns_message_gettemprdatalist() always
succeeds because the memory allocation cannot fail now.  Change the API
to return void and cleanup all the use of aforementioned functions.
2022-05-17 12:39:25 +02:00
Evan Hunt
0201eab655 Cleanup: always count ns_statscounter_recursclients
The ns_statscounter_recursclients counter was previously only
incremented or decremented if client->recursionquota was non-NULL.
This was harmless, because that value should always be non-NULL if
recursion is enabled, but it made the code slightly confusing.
2022-05-13 21:47:27 -07:00
Ondřej Surý
0582478c96 Remove isc_task_destroy() and isc_task_shutdown()
After removing the isc_task_onshutdown(), the isc_task_shutdown() and
isc_task_destroy() became obsolete.

Remove calls to isc_task_shutdown() and replace the calls to
isc_task_destroy() with isc_task_detach().

Simplify the internal logic to destroy the task when the last reference
is removed.
2022-05-12 14:55:49 +02:00
Ondřej Surý
2235edabcf Remove isc_task_onshutdown()
The isc_task_onshutdown() was used to post event that should be run when
the task is being shutdown.  This could happen explicitly in the
isc_test_shutdown() call or implicitly when we detach the last reference
to the task and there are no more events posted on the task.

This whole task onshutdown mechanism just makes things more complicated,
and it's easier to post the "shutdown" events when we are shutting down
explicitly and the existing code already always knows when it should
shutdown the task that's being used to execute the onshutdown events.

Replace the isc_task_onshutdown() calls with explicit calls to execute
the shutdown tasks.
2022-05-12 13:45:34 +02:00
Mark Andrews
8fb72012e3 Check the cache as well when glue NS are returned processing RPZ 2022-05-04 23:30:32 +10:00
Mark Andrews
07c828531c Process learned records as well as glue 2022-05-04 23:30:32 +10:00
Mark Andrews
cf97c61f48 Process the delegating NS RRset when checking rpz rules 2022-05-04 23:30:32 +10:00
Ondřej Surý
b43812692d Move netmgr/uv-compat.h to <isc/uv.h>
As we are going to use libuv outside of the netmgr, we need the shims to
be readily available for the rest of the codebase.

Move the "netmgr/uv-compat.h" to <isc/uv.h> and netmgr/uv-compat.c to
uv.c, and as a rule of thumb, the users of libuv should include
<isc/uv.h> instead of <uv.h> directly.

Additionally, merge netmgr/uverr2result.c into uv.c and rename the
single function from isc__nm_uverr2result() to isc_uverr2result().
2022-05-03 10:02:19 +02:00
Tony Finch
66b3cb9732 Remove several superfluous newlines in log messages 2022-05-02 23:49:38 +01:00
Matthijs Mekking
c66b9abc0b Add stale answer extended errors
Add DNS extended errors 3 (Stale Answer) and 19 (Stale NXDOMAIN Answer)
to responses. Add extra text with the reason why the stale answer was
returned.

To test, we need to change the configuration such that for the first
set of tests the stale-refresh-time window does not interfer with the
expected extended errors.
2022-04-28 09:58:25 +02:00
Evan Hunt
5c4cf3fcc4 prevent a deadlock in the shutdown system test
The shutdown test sends 'rdnc status' commands in parallel with
'rndc stop' A new rndc connection arriving will reference the ACL
environment to see whether the client is allowed to connect.
Commit c0995bc380 added a mutex lock to ns_interfacemgr_getaclenv(),
but if the new connection arrives while the interfaces are being
purged during shutdown, that lock is already being held. If the
the connection event slips in ahead of one of the netmgr's "stop
listening" events on a worker thread, a deadlock can occur.

The fix is not to hold the interfacemgr lock while shutting down
interfaces; only while actually traversing the interface list to
identify interfaces needing shutdown.
2022-04-27 23:25:57 -07:00
Ondřej Surý
9ae34a04e8 The route socket and its storage was detached while still reading
The interfacemgr and the .route was being detached while the network
manager had pending read from the socket.  Instead of detaching from the
socket, we need to cancel the read which in turn will detach the route
socket and the associated interfacemgr.
2022-04-25 17:19:33 +02:00
Michał Kępień
5065c4686e Fix loading plugins using just their filenames
BIND 9 plugins are installed using Automake's pkglib_LTLIBRARIES stanza,
which causes the relevant shared objects to be placed in the
$(libdir)/@PACKAGE@/ directory, where @PACKAGE@ is expanded to the
lowercase form of the first argument passed to AC_INIT(), i.e. "bind".
Meanwhile, NAMED_PLUGINDIR - the preprocessor macro that the
ns_plugin_expandpath() function uses for determining the absolute path
to a plugin for which only a filename has been provided (rather than a
path) - is set to $(libdir)/named.  This discrepancy breaks loading
plugins using just their filenames.  Fix the issue (and also prevent it
from reoccurring) by setting NAMED_PLUGINDIR to $(pkglibdir).
2022-04-22 13:27:12 +02:00
Ondřej Surý
f55a4d3e55 Allow listening on less than nworkers threads
For some applications, it's useful to not listen on full battery of
threads.  Add workers argument to all isc_nm_listen*() functions and
convenience ISC_NM_LISTEN_ONE and ISC_NM_LISTEN_ALL macros.
2022-04-19 11:08:13 +02:00
Artem Boldariev
77b2db8246 Replace listener TLS contexts on reconfiguration
This commit makes use of isc_nmsocket_set_tlsctx(). Now, instead of
recreating TLS-enabled listeners (including the underlying TCP
listener sockets), only the TLS context in use is replaced.
2022-04-06 18:45:57 +03:00
Ondřej Surý
c0995bc380 Remove exclusive mode from ns_interfacemgr
Now that the dns_aclenv_t has now properly rwlocked .localhost and
.localnets member, we can remove the task exclusive mode use from the
ns_interfacemgr.  Some light related cleanup has been also done.
2022-04-04 19:27:00 +02:00
Ondřej Surý
8138a595d9 Add isc_rwlock around dns_aclenv .localhost and .localnets member
In order to modify the .localhost and .localnets members of the
dns_aclenv, all other processing on the netmgr loops needed to be
stopped using the task exclusive mode.  Add the isc_rwlock to the
dns_aclenv, so any modifications to the .localhost and .localnets can be
done under the write lock.
2022-04-04 19:27:00 +02:00
Ondřej Surý
2bc7303af2 Use isc_nm_getnworkers to manage zone resources
Instead of passing the number of worker to the dns_zonemgr manually,
get the number of nm threads using the new isc_nm_getnworkers() call.

Additionally, remove the isc_pool API and manage the array of memory
context, zonetasks and loadtasks directly in the zonemgr.
2022-04-01 23:50:34 +02:00
Ondřej Surý
2707d0eeb7 Set hard thread affinity for each zone
After switching to per-thread resources in the zonemgr, the performance
was decreased because the memory context, zonetask and loadtask was
picked from the pool at random.

Pin the zone to single threadid (.tid) and align the memory context,
zonetask and loadtask to be the same, this sets the hard affinity of the
zone to the netmgr thread.
2022-04-01 23:50:34 +02:00
Ondřej Surý
a94678ff77 Create per-thread task and memory context for zonemgr
Previously, the zonemgr created 1 task per 100 zones and 1 memory
context per 1000 zones (with minimum 10 tasks and 2 memory contexts) to
reduce the contention between threads.

Instead of reducing the contention by having many resources, create a
per-nm_thread memory context, loadtask and zonetask and spread the zones
between just per-thread resources.

Note: this commit alone does decrease performance when loading the zone
by couple seconds (in case of 1M zone) and thus there's more work in
this whole MR fixing the performance.
2022-04-01 23:50:34 +02:00
Tony Finch
84c4eb02e7 Log "not authoritative for update zone" more clearly
Ensure the update zone name is mentioned in the NOTAUTH error message
in the server log, so that it is easier to track down problematic
update clients. There are two cases: either the update zone is
unrelated to any of the server's zones (previously no zone was
mentioned); or the update zone is a subdomain of one or more of the
server's zones (previously the name of the irrelevant parent zone was
misleadingly logged).

Closes #3209
2022-03-30 12:50:30 +01:00
Ondřej Surý
4f74e1010e Remove task exclusive mode from ns_clientmgr
The .lock, .exiting and .excl members were not using for anything else
than starting task exclusive mode, setting .exiting to true and ending
exclusive mode.

Remove all the stray members and dead code eliminating the task
exclusive mode use from ns_clientmgr.
2022-03-30 12:41:55 +02:00
Ondřej Surý
4dceab142d Consistenly use UNREACHABLE() instead of ISC_UNREACHABLE()
In couple places, we have missed INSIST(0) or ISC_UNREACHABLE()
replacement on some branches with UNREACHABLE().  Replace all
ISC_UNREACHABLE() or INSIST(0) calls with UNREACHABLE().
2022-03-28 23:26:08 +02:00
Artem Boldariev
57f0251713 Add support for Strict/Mutual TLS into BIND
This commit adds support for Strict/Mutual TLS into BIND. It does so
by implementing the backing code for 'hostname' and 'ca-file' options
of the 'tls' statement. The commit also updates the documentation
accordingly.
2022-03-28 16:22:53 +03:00
Artem Boldariev
71cf8fa5ac Extend TLS context cache with CA certificates store
This commit adds support for keeping CA certificates stores associated
with TLS contexts. The intention is to keep one reusable store per a
set of related TLS contexts.
2022-03-28 15:31:22 +03:00