2
0
mirror of https://gitlab.isc.org/isc-projects/bind9 synced 2025-08-30 14:07:59 +00:00
Commit Graph

1569 Commits

Author SHA1 Message Date
Evan Hunt
a27860ba57 mark "cache-file" as ancient and remove all code implementing it
"cache-file" was already documented as intended for testing
purposes only and not to be used, so we can remove it without
waiting.  this commit marks the option as "ancient", and
removes all the documentation and implementing code, including
dns_cache_setfilename() and dns_cache_dump().

it also removes the documentation for the '-x cachefile`
parameter to named, which had already been removed, but the man
page was not updated at the time.
2021-09-16 00:19:02 -07:00
Evan Hunt
7bf61a6d7f use named_g_httpport correctly when creating listeners
when the default http port was set on the command line, it was
not used correctly by listeners. [GL #2902]
2021-09-14 20:22:13 +00:00
Ondřej Surý
8ac1d4e0da Remove the code to adjust listening interfaces for *-source-v6
Previously, named would run with a configuration
where *-source-v6 (notify-source-v6, transfer-source-v6 and
query-source-v6) address and port could be simultaneously used for
listening.  This is no longer true for BIND 9.16+ and the code that
would do interface adjustments would unexpectedly disable listening on
TCP for such interfaces.

This commit removes the code that would adjust listening interfaces
for addresses/ports configured in *-source-v6 option.
2021-09-14 14:51:03 +02:00
Aram Sargsyan
ae53919154 Add synonym configuration options for catalog zones
This commit adds 'primaries' and 'default-primaries' catalog zones
configuration options synonyms for 'masters' and 'default-masters'
respectively.
2021-09-09 21:54:10 +00:00
Artem Boldariev
42868c6f3e Fix building BIND without DoH support
The body of the listenelt_http() function was not properly wrapped in
ifdef ... endif, leading to build failures with DoH disabled.
2021-08-31 15:20:05 +02:00
Evan Hunt
916760ae46 rename dns_zone_master and dns_zone_slave
dns_zone_master and dns_zone_slave are renamed as dns_zone_primary
and dns_zone_secondary.
2021-08-30 11:06:12 -07:00
Artem Boldariev
db1ba15ff2 Replace multiple /dns-query constants with a global one
This commit replaces the constants defining /dns-query, the default
DoH endpoint, with a global definition.
2021-08-30 10:32:17 +03:00
Mark Andrews
0b83636648 Also delete journal file 2021-08-25 14:44:54 +10:00
Artem Boldariev
590e8e0b86 Make max number of HTTP/2 streams configurable
This commit makes number of concurrent HTTP/2 streams per connection
configurable as a mean to fight DDoS attacks. As soon as the limit is
reached, BIND terminates the whole session.

The commit adds a global configuration
option (http-streams-per-connection) which can be overridden in an
http <name> {...} statement like follows:

http local-http-server {
    ...
    streams-per-connection 100;
    ...
};

For now the default value is 100, which should be enough (e.g. NGINX
uses 128, but it is a full-featured WEB-server). When using lower
numbers (e.g. ~70), it is possible to hit the limit with
e.g. flamethrower.
2021-07-16 11:50:22 +03:00
Artem Boldariev
03a557a9bb Add (http-)listener-clients option (DoH quota mechanism)
This commit adds support for http-listener-clients global options as
well as ability to override the default in an HTTP server description,
like:

http local-http-server {
    ...
    listener-clients 100;
    ...
};

This way we have ability to specify per-listener active connections
quota globally and then override it when required. This is exactly
what AT&T requested us: they wanted a functionality to specify quota
globally and then override it for specific IPs. This change
functionality makes such a configuration possible.

It makes sense: for example, one could have different quotas for
internal and external clients. Or, for example, one could use BIND's
internal ability to serve encrypted DoH with some sane quota value for
internal clients, while having un-encrypted DoH listener without quota
to put BIND behind a load balancer doing TLS offloading for external
clients.

Moreover, the code no more shares the quota with TCP, which makes
little sense anyway (see tcp-clients option), because of the nature of
interaction of DoH clients: they tend to keep idle opened connections
for longer periods of time, preventing the TCP and TLS client from
being served. Thus, the need to have a separate, generally larger,
quota for them.

Also, the change makes any option within "http <name> { ... };"
statement optional, making it easier to override only required default
options.

By default, the DoH connections are limited to 300 per listener. I
hope that it is a good initial guesstimate.
2021-07-16 11:50:20 +03:00
Mark Andrews
ac0fc3c2de Add DBC REQUIRE to check that 'text' is non NULL
for all control channel commands.  This should silence
gcc-10-analyzer reporting NULL pointer dereference of 'text'.
2021-07-12 03:55:37 +00:00
Ondřej Surý
7cbfbc8faa Clean up the dns_dispatch_getudp API
Cleanup unused parts of dns_dispatch_getudp API, remove
dns_dispatch_getudp_dup() function and related code.
2021-07-09 15:58:02 +02:00
Ondřej Surý
2bb454182b Make the DNS over HTTPS support optional
This commit adds two new autoconf options `--enable-doh` (enabled by
default) and `--with-libnghttp2` (mandatory when DoH is enabled).

When DoH support is disabled the library is not linked-in and support
for http(s) protocol is disabled in the netmgr, named and dig.
2021-07-07 09:50:53 +02:00
Ondřej Surý
29c2e52484 The isc/platform.h header has been completely removed
The isc/platform.h header was left empty which things either already
moved to config.h or to appropriate headers.  This is just the final
cleanup commit.
2021-07-06 05:33:48 +00:00
Matthijs Mekking
40331a20c4 Add helpful function 'dns_zone_getdnsseckeys'
This code gathers DNSSEC keys from key files and from the DNSKEY RRset.
It is used for the 'rndc dnssec -status' command, but will also be
needed for "checkds". Turn it into a function.
2021-06-30 17:28:48 +02:00
Matthijs Mekking
6040c71478 Make "primaries" config parsing generic
Make the code to parse "primaries" configuration more generic so
it can be reused for "parental-agents".
2021-06-30 17:28:48 +02:00
Petr Špaček
9290d9752d fix tcp-send-buffer, udp-receive-buffer, udp-send-buffer limits 2021-06-28 11:16:00 +02:00
Michał Kępień
86541b39d3 Use minimal-sized caches for non-recursive views
Currently the implicit default for the "max-cache-size" option is "90%".
As this option is inherited by all configured views, using multiple
views can lead to memory exhaustion over time due to overcommitment.
The "max-cache-size 90%;" default also causes cache RBT hash tables to
be preallocated for every configured view, which does not really make
sense for views which do not allow recursion.

To limit this problem's potential for causing operational issues, use a
minimal-sized cache for views which do not allow recursion and do not
have "max-cache-size" explicitly set (either in global configuration or
in view configuration).

For configurations which include multiple views allowing recursion,
adjusting "max-cache-size" appropriately is still left to the operator.
2021-06-22 15:28:31 +02:00
Ondřej Surý
8a5c62de83 Refactor zone dumping code to use netmgr async threadpools
Previously, dumping the zones to the files were quantized, so it doesn't
slow down network IO processing.  With the introduction of network
manager asynchronous threadpools, we can move the IO intensive work to
use that API and we don't have to quantize the work anymore as it the
file IO won't block anything except other zone dumping processes.
2021-05-31 14:52:05 +02:00
Ondřej Surý
28b65d8256 Reduce the number of clientmgr objects created
Previously, as a way of reducing the contention between threads a
clientmgr object would be created for each interface/IP address.

We tasks being more strictly bound to netmgr workers, this is no longer
needed and we can just create clientmgr object per worker queue (ncpus).

Each clientmgr object than would have a single task and single memory
context.
2021-05-24 20:44:54 +02:00
Evan Hunt
b0aadaac8e rename dns_name_copynf() to dns_name_copy()
dns_name_copy() is now the standard name-copying function.
2021-05-22 00:37:27 -07:00
Matthijs Mekking
252a1ae0a1 Lock kasp when looking for zone keys
We should also lock kasp when reading key files, because at the same
time the zone in another view may be updating the key file.
2021-05-20 09:15:43 +02:00
Ondřej Surý
4509089419 Add configuration option to set send/recv buffers on the nm sockets
This commit adds a new configuration option to set the receive and send
buffer sizes on the TCP and UDP netmgr sockets.  The default is `0`
which doesn't set any value and just uses the value set by the operating
system.

There's no magic value here - set it too small and the performance will
drop, set it too large, the buffers can fill-up with queries that have
already timeouted on the client side and nobody is interested for the
answer and this would just make the server clog up even more by making
it produce useless work.

The `netstat -su` can be used on POSIX systems to monitor the receive
and send buffer errors.
2021-05-17 08:47:09 +02:00
Evan Hunt
220ada9422 reset taskmgr mode immediately after returning from zone load
all privileged tasks are complete by the time we return from
isc_task_endexclusive(), so it makes sense to reset the taskmgr
mode to non-privileged right then.
2021-05-10 12:26:27 -07:00
Ondřej Surý
365c6a9851 ensure interlocked netmgr events run on worker[0]
Network manager events that require interlock (pause, resume, listen)
are now always executed in the same worker thread, mgr->workers[0],
to prevent races.

"stoplistening" events no longer require interlock.
2021-05-07 14:28:32 -07:00
Evan Hunt
5c08f97791 only run tasks as privileged if taskmgr is in privileged mode
all zone loading tasks have the privileged flag, but we only want
them to run as privileged tasks when the server is being initialized;
if we privilege them the rest of the time, the server may hang for a
long time after a reload/reconfig. so now we call isc_taskmgr_setmode()
to turn privileged execution mode on or off in the task manager.

isc_task_privileged() returns true if the task's privilege flag is
set *and* the taskmgr is in privileged execution mode. this is used
to determine in which netmgr event queue the task should be run.
2021-05-07 14:28:30 -07:00
Ondřej Surý
a011d42211 Add new isc_managers API to simplify <*>mgr create/destroy
Previously, netmgr, taskmgr, timermgr and socketmgr all had their own
isc_<*>mgr_create() and isc_<*>mgr_destroy() functions.  The new
isc_managers_create() and isc_managers_destroy() fold all four into a
single function and makes sure the objects are created and destroy in
correct order.

Especially now, when taskmgr runs on top of netmgr, the correct order is
important and when the code was duplicated at many places it's easy to
make mistake.

The former isc_<*>mgr_create() and isc_<*>mgr_destroy() functions were
made private and a single call to isc_managers_create() and
isc_managers_destroy() is required at the program startup / shutdown.
2021-05-07 10:19:05 -07:00
Matthijs Mekking
b3a5859a9b rndc dnssec -status should include offline keys
The rndc command 'dnssec -status' only considered keys from
'dns_dnssec_findmatchingkeys' which only includes keys with accessible
private keys. Change it so that offline keys are also listed in the
status.
2021-05-05 11:13:19 +02:00
Matthijs Mekking
2710d9a11d Add built-in dnssec-policy "insecure"
Add a new built-in policy "insecure", to be used to gracefully unsign
a zone. Previously you could just remove the 'dnssec-policy'
configuration from your zone statement, or remove it.

The built-in policy "none" (or not configured) now actually means
no DNSSEC maintenance for the corresponding zone. So if you
immediately reconfigure your zone from whatever policy to "none",
your zone will temporarily be seen as bogus by validating resolvers.

This means we can remove the functions 'dns_zone_use_kasp()' and
'dns_zone_secure_to_insecure()' again. We also no longer have to
check for the existence of key state files to figure out if a zone
is transitioning to insecure.
2021-04-30 11:18:38 +02:00
Mark Andrews
29126500d2 Reduce nsec3 max iterations to 150 2021-04-29 17:18:26 +10:00
Diego Fronza
9298dcebbd Fix deadlock between rndc addzone/delzone/modzone
It follows a description of the steps that were leading to the deadlock:

1. `do_addzone` calls `isc_task_beginexclusive`.

2. `isc_task_beginexclusive` waits for (N_WORKERS - 1) halted tasks,
   this blocks waiting for those (no. workers -1) workers to halt.
...
isc_task_beginexclusive(isc_task_t *task0) {
    ...
	while (manager->halted + 1 < manager->workers) {
		wake_all_queues(manager);
		WAIT(&manager->halt_cond, &manager->halt_lock);
	}
```

3. It is possible that in `task.c / dispatch()` a worker is running a
   task event, if that event blocks it will not allow this worker to
   halt.

4. `do_addzone` acquires `LOCK(&view->new_zone_lock);`,

5. `rmzone` event is called from some worker's `dispatch()`, `rmzone`
   blocks waiting for the same lock.

6. `do_addzone` calls `isc_task_beginexclusive`.

7. Deadlock triggered, since:
	- `rmzone` is wating for the lock.
	- `isc_task_beginexclusive` is waiting for (no. workers - 1) to
	   be halted
	- since `rmzone` event is blocked it won't allow the worker to halt.

To fix this, we updated do_addzone code to call isc_task_beginexclusive
before the lock is acquired, we postpone locking to the nearest required
place, same for isc_task_beginexclusive.

The same could happen with rndc modzone, so that was addressed as well.
2021-04-22 15:45:55 +00:00
Ondřej Surý
b540722bc3 Refactor taskmgr to run on top of netmgr
This commit changes the taskmgr to run the individual tasks on the
netmgr internal workers.  While an effort has been put into keeping the
taskmgr interface intact, couple of changes have been made:

 * The taskmgr has no concept of universal privileged mode - rather the
   tasks are either privileged or unprivileged (normal).  The privileged
   tasks are run as a first thing when the netmgr is unpaused.  There
   are now four different queues in in the netmgr:

   1. priority queue - netievent on the priority queue are run even when
      the taskmgr enter exclusive mode and netmgr is paused.  This is
      needed to properly start listening on the interfaces, free
      resources and resume.

   2. privileged task queue - only privileged tasks are queued here and
      this is the first queue that gets processed when network manager
      is unpaused using isc_nm_resume().  All netmgr workers need to
      clean the privileged task queue before they all proceed normal
      operation.  Both task queues are processed when the workers are
      finished.

   3. task queue - only (traditional) task are scheduled here and this
      queue along with privileged task queues are process when the
      netmgr workers are finishing.  This is needed to process the task
      shutdown events.

   4. normal queue - this is the queue with netmgr events, e.g. reading,
      sending, callbacks and pretty much everything is processed here.

 * The isc_taskmgr_create() now requires initialized netmgr (isc_nm_t)
   object.

 * The isc_nm_destroy() function now waits for indefinite time, but it
   will print out the active objects when in tracing mode
   (-DNETMGR_TRACE=1 and -DNETMGR_TRACE_VERBOSE=1), the netmgr has been
   made a little bit more asynchronous and it might take longer time to
   shutdown all the active networking connections.

 * Previously, the isc_nm_stoplistening() was a synchronous operation.
   This has been changed and the isc_nm_stoplistening() just schedules
   the child sockets to stop listening and exits.  This was needed to
   prevent a deadlock as the the (traditional) tasks are now executed on
   the netmgr threads.

 * The socket selection logic in isc__nm_udp_send() was flawed, but
   fortunatelly, it was broken, so we never hit the problem where we
   created uvreq_t on a socket from nmhandle_t, but then a different
   socket could be picked up and then we were trying to run the send
   callback on a socket that had different threadid than currently
   running.
2021-04-20 23:22:28 +02:00
Matthijs Mekking
82f72ae249 Rekey immediately after rndc checkds/rollover
Call 'dns_zone_rekey' after a 'rndc dnssec -checkds' or 'rndc dnssec
-rollover' command is received, because such a command may influence
the next key event. Updating the keys immediately avoids unnecessary
rollover delays.

The kasp system test no longer needs to call 'rndc loadkeys' after
a 'rndc dnssec -checkds' or 'rndc dnssec -rollover' command.
2021-03-22 11:58:26 +01:00
Ondřej Surý
36ddefacb4 Change the isc_nm_(get|set)timeouts() to work with milliseconds
The RFC7828 specifies the keepalive interval to be 16-bit, specified in
units of 100 milliseconds and the configuration options tcp-*-timeouts
are following the suit.  The units of 100 milliseconds are very
unintuitive and while we can't change the configuration and presentation
format, we should not follow this weird unit in the API.

This commit changes the isc_nm_(get|set)timeouts() functions to work
with milliseconds and convert the values to milliseconds before passing
them to the function, not just internally.
2021-03-18 16:37:57 +01:00
Ondřej Surý
0f44139145 Bump the maximum number of hazard pointers in tests
On 24-core machine, the tests would crash because we would run out of
the hazard pointers.  We now adjust the number of hazard pointers to be
in the <128,256> interval based on the number of available cores.

Note: This is just a band-aid and needs a proper fix.
2021-02-18 19:32:55 +01:00
Evan Hunt
2b2e1a02bd allow configuration of "default" http endpoint
specifying "http default" in a listen-on statement sets up
the default "/dns-query" endpoint. tests and documentation
have been updated.
2021-02-16 16:24:35 -08:00
Evan Hunt
957052eea5 move listen-on correctness checks into check.c
errors in listen-on and listen-on-v6 can now be detected
by named-checkconf.
2021-02-16 16:24:35 -08:00
Evan Hunt
fd763d7223 enable listen-on parameters to be specified in any order
updated the parser to allow the "port", "tls" and "http"
paramters to "listen-on" and "listen-on-v6" to be specified in any
order. previously the parser would throw an error if any other order
was used than port, tls, http.
2021-02-16 16:24:35 -08:00
Evan Hunt
07f525bae5 require "tls none" for unencrypted HTTP listeners
unencrypted DoH connections may be used in some operational
environments where encryption is handled by a reverse proxy,
but it's going to be relatively rare, so we shouldn't make it
easy to do by mistake.  this commit changes the syntax for
listen-on and listen-on-v6 so that if "http" is specified, "tls"
must also be specified; for unencrypted listeners, "tls none"
can be used.
2021-02-16 16:24:35 -08:00
Ondřej Surý
23c3bcc711 Stop including dnstap headers from <dns/dnstap.h>
The <fstrm.h> and <protobuf-c/protobuf-c.h> headers are only directly
included where used and we stopped exposing those headers from libdns
headers.
2021-02-16 01:04:46 +00:00
Diego Fronza
30729c7013 Fix dangling references to outdated views after reconfig
This commit fix a leak which was happening every time an inline-signed
zone was added to the configuration, followed by a rndc reconfig.

During the reconfig process, the secure version of every inline-signed
zone was "moved" to a new view upon a reconfig and it "took the raw
version along", but only once the secure version was freed (at shutdown)
was prev_view for the raw version detached from, causing the old view to
be released as well.

This caused dangling references to be kept for the previous view, thus
keeping all resources used by that view in memory.
2021-02-15 11:15:20 -03:00
Evan Hunt
fe99484e14 support "tls ephemeral" with https 2021-02-03 12:06:17 +01:00
Evan Hunt
aa9d51c494 tls and http configuration code was unnecessarily complex
removed the isc_cfg_http_t and isc_cfg_tls_t structures
and the functions that loaded and accessed them; this can
be done using normal config parser functions.
2021-02-03 12:06:17 +01:00
Artem Boldariev
08da09bc76 Initial support for DNS-over-HTTP(S)
This commit completes the support for DNS-over-HTTP(S) built on top of
nghttp2 and plugs it into the BIND. Support for both GET and POST
requests is present, as required by RFC8484.

Both encrypted (via TLS) and unencrypted HTTP/2 connections are
supported. The latter are mostly there for debugging/troubleshooting
purposes and for the means of encryption offloading to third-party
software (as might be desirable in some environments to simplify TLS
certificates management).
2021-02-03 12:06:17 +01:00
Ondřej Surý
e488309da7 implement xfrin via XoT
Add support for a "tls" key/value pair for zone primaries, referencing
either a "tls" configuration statement or "ephemeral". If set to use
TLS, zones will send SOA and AXFR/IXFR queries over a TLS channel.
2021-01-29 12:07:38 +01:00
Mark Andrews
2b3fcd7156 Pass an afg_aclconfctx_t structure to cfg_acl_fromconfig
in named_zone_inlinesigning.  A NULL pointer does not work.
2021-01-28 01:54:59 +00:00
Mark Andrews
dd3520ae41 Improve the diagnostic 'rndc retransfer' error message 2021-01-28 08:43:03 +11:00
Diego Fronza
0ad6f594f6 Added option for disabling stale-answer-client-timeout
This commit allows to specify "disabled" or "off" in
stale-answer-client-timeout statement. The logic to support this
behavior will be added in the subsequent commits.

This commit also ensures an upper bound to stale-answer-client-timeout
which equals to one second less than 'resolver-query-timeout'.
2021-01-25 10:47:14 -03:00
Diego Fronza
171a5b7542 Add stale-answer-client-timeout option
The general logic behind the addition of this new feature works as
folows:

When a client query arrives, the basic path (query.c / ns_query_recurse)
was to create a fetch, waiting for completion in fetch_callback.

With the introduction of stale-answer-client-timeout, a new event of
type DNS_EVENT_TRYSTALE may invoke fetch_callback, whenever stale
answers are enabled and the fetch took longer than
stale-answer-client-timeout to complete.

When an event of type DNS_EVENT_TRYSTALE triggers fetch_callback, we
must ensure that the folowing happens:

1. Setup a new query context with the sole purpose of looking up for
   stale RRset only data, for that matters a new flag was added
   'DNS_DBFIND_STALEONLY' used in database lookups.

    . If a stale RRset is found, mark the original client query as
      answered (with a new query attribute named NS_QUERYATTR_ANSWERED),
      so when the fetch completion event is received later, we avoid
      answering the client twice.

    . If a stale RRset is not found, cleanup and wait for the normal
      fetch completion event.

2. In ns_query_done, we must change this part:
	/*
	 * If we're recursing then just return; the query will
	 * resume when recursion ends.
	 */
	if (RECURSING(qctx->client)) {
		return (qctx->result);
	}

   To this:

	if (RECURSING(qctx->client) && !QUERY_STALEONLY(qctx->client)) {
		return (qctx->result);
	}

   Otherwise we would not proceed to answer the client if it happened
   that a stale answer was found when looking up for stale only data.

When an event of type DNS_EVENT_FETCHDONE triggers fetch_callback, we
proceed as before, resuming query, updating stats, etc, but a few
exceptions had to be added, most important of which are two:

1. Before answering the client (ns_client_send), check if the query
   wasn't already answered before.

2. Before detaching a client, e.g.
   isc_nmhandle_detach(&client->reqhandle), ensure that this is the
   fetch completion event, and not the one triggered due to
   stale-answer-client-timeout, so a correct call would be:
   if (!QUERY_STALEONLY(client)) {
        isc_nmhandle_detach(&client->reqhandle);
   }

Other than these notes, comments were added in code in attempt to make
these updates easier to follow.
2021-01-25 10:47:14 -03:00
Matthijs Mekking
cf420b2af0 Treat dnssec-policy "none" as a builtin zone
Configure "none" as a builtin policy. Change the 'cfg_kasp_fromconfig'
api so that the 'name' will determine what policy needs to be
configured.

When transitioning a zone from secure to insecure, there will be
cases when a zone with no DNSSEC policy (dnssec-policy none) should
be using KASP. When there are key state files available, this is an
indication that the zone once was DNSSEC signed but is reconfigured
to become insecure.

If we would not run the keymgr, named would abruptly remove the
DNSSEC records from the zone, making the zone bogus. Therefore,
change the code such that a zone will use kasp if there is a valid
dnssec-policy configured, or if there are state files available.
2020-12-23 09:02:11 +01:00