2
0
mirror of https://gitlab.isc.org/isc-projects/bind9 synced 2025-08-28 21:17:54 +00:00

243 Commits

Author SHA1 Message Date
Ondřej Surý
50270de8a0 Refactor the interface handling in the netmgr
The isc_nmiface_t type was holding just a single isc_sockaddr_t,
so we got rid of the datatype and use plain isc_sockaddr_t in place
where isc_nmiface_t was used before.  This means less type-casting and
shorter path to access isc_sockaddr_t members.

At the same time, instead of keeping the reference to the isc_sockaddr_t
that was passed to us when we start listening, we will keep a local
copy. This prevents the data race on destruction of the ns_interface_t
objects where pending nmsockets could reference the sockaddr of already
destroyed ns_interface_t object.
2021-05-26 09:43:12 +02:00
Evan Hunt
e31cc1eeb4 use a fixedname buffer in dns_message_gettempname()
dns_message_gettempname() now returns a pointer to an initialized
name associated with a dns_fixedname_t object. it is no longer
necessary to allocate a buffer for temporary names associated with
the message object.
2021-05-20 20:41:29 +02:00
Mark Andrews
e86508708d Check that the first and last SOA of an AXFR are consistent 2021-05-13 03:36:50 +00:00
Mark Andrews
01209dfa49 Check SOA owner names in zone transfers
An IXFR containing SOA records with owner names different than the
transferred zone's origin can result in named serving a version of that
zone without an SOA record at the apex.  This causes a RUNTIME_CHECK
assertion failure the next time such a zone is refreshed.  Fix by
immediately rejecting a zone transfer (either an incremental or
non-incremental one) upon detecting an SOA record not placed at the apex
of the transferred zone.
2021-04-29 10:30:00 +02:00
Ondřej Surý
6cf6de55bc Prevent the double xfrin_fail() call
When we are reading from the xfrin socket, and the transfer would be
shutdown, the shutdown function would call `xfrin_fail()` which in turns
calls `xfrin_cancelio()` that causes the read callback to be invoked
with `ISC_R_CANCELED` status code and that caused yet another
`xfrin_fail()` call.

The fix here is to ensure the `xfrin_fail()` would be run only once
properly using better synchronization on xfr->shuttingdown flag.
2021-04-20 14:12:26 +02:00
Ondřej Surý
86f4872dd6 isc_nm_*connect() always return via callback
The isc_nm_*connect() functions were refactored to always return the
connection status via the connect callback instead of sometimes returning
the hard failure directly (for example, when the socket could not be
created, or when the network manager was shutting down).

This commit changes the connect functions in all the network manager
modules, and also makes the necessary refactoring changes in places
where the connect functions are called.
2021-04-07 15:36:59 +02:00
Ondřej Surý
e488309da7 implement xfrin via XoT
Add support for a "tls" key/value pair for zone primaries, referencing
either a "tls" configuration statement or "ephemeral". If set to use
TLS, zones will send SOA and AXFR/IXFR queries over a TLS channel.
2021-01-29 12:07:38 +01:00
Ondřej Surý
634bdfb16d Refactor netmgr and add more unit tests
This is a part of the works that intends to make the netmgr stable,
testable, maintainable and tested.  It contains a numerous changes to
the netmgr code and unfortunately, it was not possible to split this
into smaller chunks as the work here needs to be committed as a complete
works.

NOTE: There's a quite a lot of duplicated code between udp.c, tcp.c and
tcpdns.c and it should be a subject to refactoring in the future.

The changes that are included in this commit are listed here
(extensively, but not exclusively):

* The netmgr_test unit test was split into individual tests (udp_test,
  tcp_test, tcpdns_test and newly added tcp_quota_test)

* The udp_test and tcp_test has been extended to allow programatic
  failures from the libuv API.  Unfortunately, we can't use cmocka
  mock() and will_return(), so we emulate the behaviour with #define and
  including the netmgr/{udp,tcp}.c source file directly.

* The netievents that we put on the nm queue have variable number of
  members, out of these the isc_nmsocket_t and isc_nmhandle_t always
  needs to be attached before enqueueing the netievent_<foo> and
  detached after we have called the isc_nm_async_<foo> to ensure that
  the socket (handle) doesn't disappear between scheduling the event and
  actually executing the event.

* Cancelling the in-flight TCP connection using libuv requires to call
  uv_close() on the original uv_tcp_t handle which just breaks too many
  assumptions we have in the netmgr code.  Instead of using uv_timer for
  TCP connection timeouts, we use platform specific socket option.

* Fix the synchronization between {nm,async}_{listentcp,tcpconnect}

  When isc_nm_listentcp() or isc_nm_tcpconnect() is called it was
  waiting for socket to either end up with error (that path was fine) or
  to be listening or connected using condition variable and mutex.

  Several things could happen:

    0. everything is ok

    1. the waiting thread would miss the SIGNAL() - because the enqueued
       event would be processed faster than we could start WAIT()ing.
       In case the operation would end up with error, it would be ok, as
       the error variable would be unchanged.

    2. the waiting thread miss the sock->{connected,listening} = `true`
       would be set to `false` in the tcp_{listen,connect}close_cb() as
       the connection would be so short lived that the socket would be
       closed before we could even start WAIT()ing

* The tcpdns has been converted to using libuv directly.  Previously,
  the tcpdns protocol used tcp protocol from netmgr, this proved to be
  very complicated to understand, fix and make changes to.  The new
  tcpdns protocol is modeled in a similar way how tcp netmgr protocol.
  Closes: #2194, #2283, #2318, #2266, #2034, #1920

* The tcp and tcpdns is now not using isc_uv_import/isc_uv_export to
  pass accepted TCP sockets between netthreads, but instead (similar to
  UDP) uses per netthread uv_loop listener.  This greatly reduces the
  complexity as the socket is always run in the associated nm and uv
  loops, and we are also not touching the libuv internals.

  There's an unfortunate side effect though, the new code requires
  support for load-balanced sockets from the operating system for both
  UDP and TCP (see #2137).  If the operating system doesn't support the
  load balanced sockets (either SO_REUSEPORT on Linux or SO_REUSEPORT_LB
  on FreeBSD 12+), the number of netthreads is limited to 1.

* The netmgr has now two debugging #ifdefs:

  1. Already existing NETMGR_TRACE prints any dangling nmsockets and
     nmhandles before triggering assertion failure.  This options would
     reduce performance when enabled, but in theory, it could be enabled
     on low-performance systems.

  2. New NETMGR_TRACE_VERBOSE option has been added that enables
     extensive netmgr logging that allows the software engineer to
     precisely track any attach/detach operations on the nmsockets and
     nmhandles.  This is not suitable for any kind of production
     machine, only for debugging.

* The tlsdns netmgr protocol has been split from the tcpdns and it still
  uses the old method of stacking the netmgr boxes on top of each other.
  We will have to refactor the tlsdns netmgr protocol to use the same
  approach - build the stack using only libuv and openssl.

* Limit but not assert the tcp buffer size in tcp_alloc_cb
  Closes: #2061
2020-12-01 16:47:07 +01:00
Evan Hunt
e011521ef1 address some possible shutdown races in xfrin
there were two failures during observed in testing, both occurring
when 'rndc halt' was run rather than 'rndc stop' - the latter dumps
zone contents to disk and presumably introduced enough delay to
prevent the races:

- a failure when the zone was shut down and called dns_xfrin_detach()
  before the xfrin had finished connecting; the connect timeout
  terminated without detaching its handle
- a failure when the tcpdns socket timer fired after the outerhandle
  had already been cleared.

this commit incidentally addresses a failure observed in mutexatomic
due to a variable having been initialized incorrectly.
2020-11-09 12:33:37 -08:00
Ondřej Surý
934d6c6f92 Refactor the xfrin reference counting
Previously, the xfrin object relied on four different reference counters
(`refs`, `connects`, `sends`, `recvs`) and destroyed the xfrin object
only if all of them were zero.  This commit reduces the reference
counting only to the `references` (renamed from `refs`) counter.  We
keep the existing `connects`, `sends` and `recvs` as safe guards, but
they are not formally needed.
2020-11-09 14:50:48 +01:00
Evan Hunt
1170a52f48 remove isc_task from xfrin
since the network manager is now handling timeouts, xfrin doesn't
need an isc_task object.

it may be necessary to revert this later if we find that it's
important for zone_xfrdone() to be executed in the zone task context.
currently things seem to be working well without that, though.
2020-11-09 13:45:43 +01:00
Evan Hunt
a8d28881d1 remove isc_timer from xfrin
the network manager can now handle timeouts, so it isn't
necessary for xfrin to use isc_timer for the purpose any
longer.
2020-11-09 13:45:43 +01:00
Evan Hunt
49d53a4aa9 use netmgr for xfrin
Use isc_nm_tcpdnsconnect() in xfrin.c for zone transfers.
2020-11-09 13:45:43 +01:00
Ondřej Surý
33eefe9f85 The dns_message_create() cannot fail, change the return to void
The dns_message_create() function cannot soft fail (as all memory
allocations either succeed or cause abort), so we change the function to
return void and cleanup the calls.
2020-09-29 08:22:08 +02:00
Diego Fronza
12d6d13100 Refactored dns_message_t for using attach/detach semantics
This commit will be used as a base for the next code updates in order
to have a better control of dns_message_t objects' lifetime.
2020-09-29 08:22:08 +02:00
Evan Hunt
dcee985b7f update all copyright headers to eliminate the typo 2020-09-14 16:20:40 -07:00
Mark Andrews
fd96a41868 Verify the question section when transfering in.
There was a case where an primary server sent a response
on the wrong TCP connection and failure to check the question
section resulted in a truncated zone being served.
2020-06-04 16:10:41 +02:00
Evan Hunt
57e54c46e4 change "expr == false" to "!expr" in conditionals 2020-05-25 16:09:57 -07:00
Mark Andrews
33eee6572a Reject AXFR streams where the message id is not consistent. 2020-04-20 18:24:12 +10:00
Evan Hunt
89615c2ab5 add serial number to "transfer ended" log messages 2020-03-05 17:20:16 -08:00
Evan Hunt
ba0313e649 fix spelling errors reported by Fossies. 2020-02-21 15:05:08 +11:00
Ondřej Surý
5777c44ad0 Reformat using the new rules 2020-02-14 09:31:05 +01:00
Evan Hunt
e851ed0bb5 apply the modified style 2020-02-13 15:05:06 -08:00
Ondřej Surý
056e133c4c Use clang-tidy to add curly braces around one-line statements
The command used to reformat the files in this commit was:

./util/run-clang-tidy \
	-clang-tidy-binary clang-tidy-11
	-clang-apply-replacements-binary clang-apply-replacements-11 \
	-checks=-*,readability-braces-around-statements \
	-j 9 \
	-fix \
	-format \
	-style=file \
	-quiet
clang-format -i --style=format $(git ls-files '*.c' '*.h')
uncrustify -c .uncrustify.cfg --replace --no-backup $(git ls-files '*.c' '*.h')
clang-format -i --style=format $(git ls-files '*.c' '*.h')
2020-02-13 22:07:21 +01:00
Ondřej Surý
f50b1e0685 Use clang-format to reformat the source files 2020-02-12 15:04:17 +01:00
Ondřej Surý
bc1d4c9cb4 Clear the pointer to destroyed object early using the semantic patch
Also disable the semantic patch as the code needs tweaks here and there because
some destroy functions might not destroy the object and return early if the
object is still in use.
2020-02-09 18:00:17 -08:00
Ondřej Surý
edd97cddc1 Refactor dns_name_dup() usage using the semantic patch 2019-11-29 14:00:37 +01:00
Ondřej Surý
ae83801e2b Remove blocks checking whether isc_mem_get() failed using the coccinelle 2019-07-23 15:32:35 -04:00
Ondřej Surý
78d0cb0a7d Use coccinelle to remove explicit '#include <config.h>' from the source files 2019-03-08 15:15:05 +01:00
Michał Kępień
9c611dd999 Prevent races when waiting for log messages
The "mirror" system test checks whether log messages announcing a mirror
zone coming into effect are emitted properly.  However, the helper
functions responsible for waiting for zone transfers and zone loading to
complete do not wait for these exact log messages, but rather for other
ones preceding them, which introduces a possibility of false positives.

This problem cannot be addressed by just changing the log message to
look for because the test still needs to discern between transferring a
zone and loading a zone.

Add two new log messages at debug level 99 (which is what named
instances used in system tests are configured with) that are to be
emitted after the log messages announcing a mirror zone coming into
effect.  Tweak the aforementioned helper functions to only return once
the log messages they originally looked for are followed by the newly
added log messages.  This reliably prevents races when looking for
"mirror zone is now in use" log messages and also enables a workaround
previously put into place in the "mirror" system test to be reverted.
2019-02-14 10:41:56 +01:00
Michał Kępień
1c97ace7dc Log a message when a transferred mirror zone comes into effect
Log a message when a mirror zone is successfully transferred and
verified, but only if no database for that zone was yet loaded at the
time the transfer was initiated.

This could have been implemented in a simpler manner, e.g. by modifying
zone_replacedb(), but (due to the calling order of the functions
involved in finalizing a zone transfer) that would cause the resulting
logs to suggest that a mirror zone comes into effect before its transfer
is finished, which would be confusing given the nature of mirror zones
and the fact that no message is logged upon successful mirror zone
verification.

Once the dns_zone_replacedb() call in axfr_finalize() is made, it
becomes impossible to determine whether the transferred zone had a
database attached before the transfer was started.  Thus, that check is
instead performed when the transfer context is first created and the
result of this check is passed around in a field of the transfer context
structure.  If it turns out to be desired, the relevant log message is
then emitted just before the transfer context is freed.

Taking this approach means that the log message added by this commit is
not timed precisely, i.e. mirror zone data may be used before this
message is logged.  However, that can only be fixed by logging the
message inside zone_replacedb(), which causes arguably more dire issues
discussed above.

dns_zone_isloaded() is not used to double-check that transferred zone
data was correctly loaded since the 'shutdown_result' field of the zone
transfer context will not be set to ISC_R_SUCCESS unless axfr_finalize()
succeeds (and that in turn will not happen unless dns_zone_replacedb()
succeeds).
2019-01-16 10:33:02 -08:00
Ondřej Surý
23fff6c569 Hint the compiler with ISC_UNREACHABLE(); that code after INSIST(0); cannot be reached 2018-11-08 12:22:17 +07:00
Ondřej Surý
994e656977 Replace custom isc_boolean_t with C standard bool type 2018-08-08 09:37:30 +02:00
Ondřej Surý
cb6a185c69 Replace custom isc_u?intNN_t types with C99 u?intNN_t types 2018-08-08 09:37:28 +02:00
Ondřej Surý
64fe6bbaf2 Replace ISC_PRINT_QUADFORMAT with inttypes.h format constants 2018-08-08 09:36:44 +02:00
Michał Kępień
6439a76c6d Verify mirror zone IXFRs
Update ixfr_commit() so that all incoming versions of a mirror zone
transferred using IXFR are verified before being used.
2018-06-28 13:38:39 +02:00
Michał Kępień
d86f1d00ad Verify mirror zone AXFRs
Update axfr_commit() so that all incoming versions of a mirror zone
transferred using AXFR are verified before being used.  If zone
verification fails, discard the received version of the zone, wait until
the next refresh and retry.
2018-06-28 13:38:39 +02:00
Ondřej Surý
99ba29bc52 Change isc_random() to be just PRNG, and add isc_nonce_buf() that uses CSPRNG
This commit reverts the previous change to use system provided
entropy, as (SYS_)getrandom is very slow on Linux because it is
a syscall.

The change introduced in this commit adds a new call isc_nonce_buf
that uses CSPRNG from cryptographic library provider to generate
secure data that can be and must be used for generating nonces.
Example usage would be DNS cookies.

The isc_random() API has been changed to use fast PRNG that is not
cryptographically secure, but runs entirely in user space.  Two
contestants have been considered xoroshiro family of the functions
by Villa&Blackman and PCG by O'Neill.  After a consideration the
xoshiro128starstar function has been used as uint32_t random number
provider because it is very fast and has good enough properties
for our usage pattern.

The other change introduced in the commit is the more extensive usage
of isc_random_uniform in places where the usage pattern was
isc_random() % n to prevent modulo bias.  For usage patterns where
only 16 or 8 bits are needed (DNS Message ID), the isc_random()
functions has been renamed to isc_random32(), and isc_random16() and
isc_random8() functions have been introduced by &-ing the
isc_random32() output with 0xffff and 0xff.  Please note that the
functions that uses stripped down bit count doesn't pass our
NIST SP 800-22 based random test.
2018-05-29 22:58:21 +02:00
Ondřej Surý
3a4f820d62 Replace all random functions with isc_random, isc_random_buf and isc_random_uniform API.
The three functions has been modeled after the arc4random family of
functions, and they will always return random bytes.

The isc_random family of functions internally use these CSPRNG (if available):

1. getrandom() libc call (might be available on Linux and Solaris)
2. SYS_getrandom syscall (might be available on Linux, detected at runtime)
3. arc4random(), arc4random_buf() and arc4random_uniform() (available on BSDs and Mac OS X)
4. crypto library function:
4a. RAND_bytes in case OpenSSL
4b. pkcs_C_GenerateRandom() in case PKCS#11 library
2018-05-16 09:54:35 +02:00
Ondřej Surý
55a10b7acd Remove $Id markers, Principal Author and Reviewed tags from the full source tree 2018-05-11 13:17:46 +02:00
Witold Kręcicki
c8aa1ee9e6 libdns refactoring: get rid of multiple versions of dns_dt_create, dns_view_setcache, dns_zt_apply, dns_message_logfmtpacket, dns_message_logpacket, dns_ssutable_checkrules and dns_ttl_totext 2018-04-06 08:04:41 +02:00
Witold Kręcicki
702c022016 libdns refactoring: get rid of multiple versions of dns_xfrin_create, dst_key_generate, dst_lib_init and dst_context_create 2018-04-06 08:04:41 +02:00
Ondřej Surý
843d389661 Update license headers to not include years in copyright in all applicable files 2018-02-23 10:12:02 +01:00
Evan Hunt
114f95089c [master] cleanup strcat/strcpy
4722.	[cleanup]	Clean up uses of strcpy() and strcat() in favor of
			strlcpy() and strlcat() for safety. [RT #45981]
2017-09-13 00:14:37 -07:00
Tinderbox User
f4eb664ce3 update copyright notice / whitespace 2017-08-09 23:47:50 +00:00
Evan Hunt
cdacec1dcb [master] silence gcc 7 warnings
4673.	[port]		Silence GCC 7 warnings. [RT #45592]
2017-08-09 00:17:44 -07:00
Mark Andrews
52e2aab392 4546. [func] Extend the use of const declarations. [RT #43379] 2016-12-30 15:45:08 +11:00
Mark Andrews
5f8412a4cb 4504. [security] Allow the maximum number of records in a zone to
be specified.  This provides a control for issues
                        raised in CVE-2016-6170. [RT #42143]
2016-11-02 17:31:27 +11:00
Mark Andrews
0c27b3fe77 4401. [misc] Change LICENSE to MPL 2.0. 2016-06-27 14:56:38 +10:00
Evan Hunt
6c2a76b3e2 [master] copyrights, win32 definitions 2016-05-26 12:36:17 -07:00