mirror of https://gitlab.isc.org/isc-projects/bind9 synced 2025-08-31 14:35:26 +00:00
Commit Graph

33888 Commits

Author SHA1 Message Date
Ondřej Surý
d34672796c Merge branch '2313-set-RCVBUF-SNDBUF' into 'main'
Resolve "Set reasonable values to SO_RCVBUF and SO_SNDBUF"

Closes #2313

See merge request isc-projects/bind9!4460
2021-05-17 07:42:37 +00:00
Ondřej Surý
3733b4f101 Add CHANGES and release note for GL #2313 2021-05-17 08:47:09 +02:00
Ondřej Surý
4509089419 Add configuration option to set send/recv buffers on the nm sockets
This commit adds a new configuration option to set the receive and send
buffer sizes on the TCP and UDP netmgr sockets.  The default is `0`
which doesn't set any value and just uses the value set by the operating
system.

There's no magic value here: set it too small and performance will
drop; set it too large and the buffers can fill up with queries that have
already timed out on the client side, so nobody is interested in the
answer, and this just makes the server clog up even more by making
it produce useless work.

`netstat -su` can be used on POSIX systems to monitor the receive
and send buffer errors.
2021-05-17 08:47:09 +02:00
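
A minimal sketch of what setting these buffers amounts to at the socket level, assuming plain POSIX sockets rather than the netmgr's libuv wrappers. The helper name `set_socket_buffers` is hypothetical and not part of BIND; a size of 0 leaves the operating system default untouched, mirroring the default described in the commit message above.

```c
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not BIND's API): apply receive/send buffer sizes
 * to a socket.  A size of 0 means "keep the operating system default".
 * Returns 0 on success, -1 if setsockopt() fails.
 */
int
set_socket_buffers(int fd, int recv_size, int send_size) {
    if (recv_size > 0 &&
        setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &recv_size,
                   sizeof(recv_size)) != 0)
    {
        return -1;
    }
    if (send_size > 0 &&
        setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &send_size,
                   sizeof(send_size)) != 0)
    {
        return -1;
    }
    return 0;
}
```

The kernel may adjust or cap the requested size, so reading the value back with `getsockopt()` is the only way to see what was actually applied.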
Michal Nowak
089bfe20f9 Merge branch '2386-check-correct-copyright-dates-in-man-pages' into 'main'
Set copyright year to the current year

Closes #2386

See merge request isc-projects/bind9!4869
2021-05-14 12:46:33 +00:00
Michal Nowak
7eb44b05c5 Set copyright year to the current year
To ensure that a release with outdated copyright year is not produced at
the beginning of a year, set copyright year to the current year.
2021-05-14 14:21:58 +02:00
Michal Nowak
6f9ac0e997 Merge branch 'mnowak/add-unit-gcc-out-of-tree-ci-job' into 'main'
Add unit:gcc:out-of-tree CI job

See merge request isc-projects/bind9!4740
2021-05-14 12:21:06 +00:00
Michal Nowak
741fdd4fe1 Add unit:gcc:out-of-tree CI job
Also extract the workspace save-and-retrieve logic to YAML anchors.
2021-05-14 13:22:09 +02:00
Michal Nowak
c628f2c71b Make masterXX.data.in reachable by out-of-tree builds
Unit test run for out-of-tree builds used to fail to find
masterXX.data.in files:

    /usr/bin/perl -w /builds/mnowak/bind9/lib/dns/tests/mkraw.pl < testdata/master/master12.data.in > testdata/master/master12.data
    /bin/bash: testdata/master/master12.data.in: No such file or directory
    make[4]: *** [Makefile:1910: testdata/master/master12.data] Error 1
2021-05-14 13:22:09 +02:00
Ondřej Surý
1c5de1aa43 Merge branch 'ondrej/fix-outgoing-udp-socket-selection-on-windows' into 'main'
Fix the outgoing UDP socket selection on Windows

See merge request isc-projects/bind9!5021
2021-05-13 14:05:22 +00:00
Ondřej Surý
cd413234f7 Fix the outgoing UDP socket selection on Windows
The outgoing UDP socket selection would pick an uninitialized child
socket on Windows, because we have more netmgr workers than we have
listening sockets.  This commit fixes the selection by keeping the
outgoing socket the same, so it's always run on an existing socket.
2021-05-13 15:04:48 +02:00
Artem Boldariev
d7689d8dbc Merge branch 'artem-flamethrower-fixes' into 'main'
DoH flamethrower fixes

See merge request isc-projects/bind9!5019
2021-05-13 10:01:26 +00:00
Artem Boldariev
bab9309231 Fix DoH unit tests logic
This commit fixes logic bugs in the DoH test suite revealed by making DoH
not call nghttp2_session_terminate_session() in server-side code.
2021-05-13 10:42:25 +03:00
Artem Boldariev
6816a741ca Fix crash in TLS caused by improper handling of shutdown messages
The problem was found when flamethrower was accidentally run in DoT
mode against DoH port.
2021-05-13 10:42:25 +03:00
Artem Boldariev
1947f6372d Limit the number of active concurrent HTTP/2 streams
The initial intent was to limit the number of concurrent streams to
100, but due to an error when reading the documentation
it was set to the maximum possible number of streams per session.

This could lead to security issues, e.g. a remote attacker could have
taken down the BIND instance by creating lots of sessions via a small
number of transport connections. This commit fixes that.
2021-05-13 10:42:25 +03:00
Artem Boldariev
d80d1b0dd9 Do not allow empty DoH endpoints to be added
It was possible to specify an empty DoH endpoint in BIND's configuration
file: that was an error, and we should not allow doing so.
2021-05-13 10:42:25 +03:00
Artem Boldariev
9155a87528 Do not call nghttp2_session_terminate_session() in server-side code
We should not call nghttp2_session_terminate_session() in server-side
code after all of the active HTTP/2 streams are processed. The
underlying transport connection is expected to remain open at least
for some time in this case for new HTTP/2 requests to arrive. That is
what flamethrower was expecting and it makes perfect sense from the
HTTP/2 perspective.
2021-05-13 10:42:25 +03:00
Mark Andrews
4d888458ab Merge branch '2528-check-soa-rdata' into 'main'
Check SOA rdata for consistency in AXFR.

Closes #2528

See merge request isc-projects/bind9!5014
2021-05-13 05:17:39 +00:00
Evan Hunt
4d94f82232 system test
Attempt a zone transfer with mismatched SOA records.
2021-05-13 03:36:50 +00:00
Mark Andrews
7e54d8d2cb Add CHANGES entry for [GL #2528] 2021-05-13 03:36:50 +00:00
Mark Andrews
e86508708d Check that the first and last SOA of an AXFR are consistent 2021-05-13 03:36:50 +00:00
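
A rough sketch of the consistency check named in the commit subject above, assuming a simplified in-memory SOA structure. The type `soa_t` and the function `soa_equal` are hypothetical illustrations; the real code operates on BIND's rdata structures, which are not shown here.

```c
#include <stdbool.h>
#include <stdint.h>
#include <strings.h>

/* Hypothetical, simplified SOA rdata (names kept as plain strings). */
typedef struct soa {
    const char *mname; /* primary name server */
    const char *rname; /* responsible mailbox */
    uint32_t serial;
    uint32_t refresh;
    uint32_t retry;
    uint32_t expire;
    uint32_t minimum;
} soa_t;

/*
 * An AXFR stream starts and ends with the zone's SOA record; the
 * transfer is only consistent if both copies carry the same rdata.
 * Domain names compare case-insensitively.
 */
bool
soa_equal(const soa_t *first, const soa_t *last) {
    return strcasecmp(first->mname, last->mname) == 0 &&
           strcasecmp(first->rname, last->rname) == 0 &&
           first->serial == last->serial &&
           first->refresh == last->refresh &&
           first->retry == last->retry &&
           first->expire == last->expire &&
           first->minimum == last->minimum;
}
```

A receiver would call `soa_equal()` once the closing SOA arrives and reject the transfer when it returns false.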
Mark Andrews
72da25f31f Merge branch '2656-resolver-system-test-fails-on-macos' into 'main'
Resolve "resolver system test fails on macOS"

Closes #2656

See merge request isc-projects/bind9!4947
2021-05-12 03:40:52 +00:00
Mark Andrews
a83afc10f9 Add missing call to isc_app_ctxstart 2021-05-12 03:01:15 +00:00
Ondřej Surý
0860ed6f5b Merge branch 'marka/add-missing-isc_condition_init' into 'main'
initalise sock->cond

See merge request isc-projects/bind9!5013
2021-05-11 13:03:13 +00:00
Mark Andrews
0f6ae9000a initalise sock->cond 2021-05-11 14:06:26 +02:00
Ondřej Surý
4efd1e2ac8 Merge branch 'ondrej/increase-netmgr-quantum' into 'main'
Bump the netmgr quantum to 1024

See merge request isc-projects/bind9!5009
2021-05-10 20:04:10 +00:00
Ondřej Surý
3713a38689 Bump the netmgr quantum to 1024
During stress testing, it was discovered that the default netmgr
quantum of 128 is not enough and there was a performance drop for TCP on
FreeBSD.  Bumping the default quantum to 1024 solves the performance
issue and is still enough to prevent endless loops.
2021-05-10 21:32:31 +02:00
Evan Hunt
ee6e540004 Merge branch 'each-taskmgr-setmode' into 'main'
reset taskmgr immediately after loading zones

See merge request isc-projects/bind9!5010
2021-05-10 19:32:00 +00:00
Evan Hunt
220ada9422 reset taskmgr mode immediately after returning from zone load
all privileged tasks are complete by the time we return from
isc_task_endexclusive(), so it makes sense to reset the taskmgr
mode to non-privileged right then.
2021-05-10 12:26:27 -07:00
Ondřej Surý
1639bcb59e Merge branch 'ondrej/dereference-taskmgr-after-all-tasks-are-done' into 'main'
Destroy reference to taskmgr after all tasks are done

See merge request isc-projects/bind9!5008
2021-05-10 19:24:53 +00:00
Ondřej Surý
e623c12757 Destroy reference to taskmgr after all tasks are done
We were clearing the pointer to taskmgr as soon as isc_taskmgr_destroy()
would be called and before all tasks were finished.  Unfortunately, some
tasks would use global named_g_taskmgr objects from inside the events
and this would cause either a data race or NULL pointer dereference.

This commit fixes the data race by moving the destruction of the
referenced pointer to the time after all tasks are finished.
2021-05-10 12:13:27 -07:00
Ondřej Surý
d3ebd19e23 Merge branch 'ondrej/fix-missing-isc_taskmgr_detach-on-exiting' into 'main'
Add isc_taskmgr_detach when task is created while shutting down

See merge request isc-projects/bind9!5006
2021-05-10 11:33:46 +00:00
Ondřej Surý
6c57a6cc3d Add isc_taskmgr_detach when task is created while shutting down
When the taskmgr is shutting down, creating a task would attach
to the taskmgr, but not detach on the error condition.
2021-05-10 11:39:51 +02:00
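
The leak pattern described in the commit above can be sketched as follows. This is an illustration with hypothetical types and names (`manager_t`, `task_create`), assuming a simple non-atomic reference counter; it is not the actual isc_task code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the task manager and a task. */
typedef struct manager {
    int references;
    bool exiting;
} manager_t;

typedef struct task {
    manager_t *manager;
} task_t;

void
manager_attach(manager_t *source, manager_t **target) {
    source->references++;
    *target = source;
}

void
manager_detach(manager_t **managerp) {
    (*managerp)->references--;
    *managerp = NULL;
}

/* Returns 0 on success, -1 if the manager is shutting down. */
int
task_create(manager_t *manager, task_t *task) {
    manager_attach(manager, &task->manager);
    if (manager->exiting) {
        /* Error path: release the reference taken above. */
        manager_detach(&task->manager);
        return -1;
    }
    return 0;
}
```

Without the `manager_detach()` call on the error path, the reference count would never drop back and the manager could not be destroyed.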
Evan Hunt
0e92060833 Merge branch '2654-create-isc_managers-api' into 'main'
Destroy netmgr before destroying taskmgr

Closes #2654

See merge request isc-projects/bind9!4983
2021-05-07 21:37:01 +00:00
Evan Hunt
19431b1c83 CHANGES 2021-05-07 14:28:33 -07:00
Ondřej Surý
0133096c88 improvements to socket_test
- be more strict, but patient, waiting for event completion.
- use an atomic pointer for the socket to silence TSAN warnings.
2021-05-07 14:28:33 -07:00
Ondřej Surý
365c6a9851 ensure interlocked netmgr events run on worker[0]
Network manager events that require interlock (pause, resume, listen)
are now always executed in the same worker thread, mgr->workers[0],
to prevent races.

"stoplistening" events no longer require interlock.
2021-05-07 14:28:32 -07:00
Evan Hunt
c44423127d fix shutdown deadlocks
- ensure isc_nm_pause() and isc_nm_resume() work the same whether
  run from inside or outside of the netmgr.
- promote 'stop' events to the priority event level so they can
  run while the netmgr is pausing or paused.
- when pausing, drain the priority queue before acquiring an
  interlock; this prevents a deadlock when another thread is waiting
  for us to complete a task.
- release interlock after pausing, reacquire it when resuming, so
  that stop events can happen.

some incidental changes:
- use a function to enqueue pause and resume events (this was part of a
  different change attempt that didn't work out; I kept it because I
  thought it was more readable).
- make mgr->nworkers a signed int to remove some annoying integer casts.
2021-05-07 14:28:32 -07:00
Ondřej Surý
4c8f6ebeb1 Use barriers for netmgr synchronization
The netmgr listening, stoplistening, pausing and resuming functions
now use barriers for synchronization, which makes the code much simpler.

isc/barrier.h defines isc_barrier macros as a front-end for uv_barrier
on platforms where that works, and pthread_barrier where it doesn't
(including TSAN builds).
2021-05-07 14:28:32 -07:00
Ondřej Surý
2eae7813b6 Run isc__nm_http_stoplistening() synchronously in netmgr
When isc__nm_http_stoplistening() is run from inside the netmgr, we need
to make sure it's run synchronously.  This commit is just a band-aid
though, as the desired behavior for isc_nm_stoplistening() is not always
the same:

  1. When run by an outside user of the interface, the call must be
     synchronous, e.g. the calling code expects the call to really stop
     listening on the interfaces.

  2. But if there's a call from listen<proto> when listening fails,
     that needs to be scheduled to run asynchronously, because
     isc_nm_listen<proto> is being run in a paused (interlocked)
     netmgr thread and we could get stuck.

The proper solution would be to make isc_nm_stoplistening()
behave like uv_close(), i.e., to have a proper callback.
2021-05-07 14:28:32 -07:00
Evan Hunt
5c08f97791 only run tasks as privileged if taskmgr is in privileged mode
all zone loading tasks have the privileged flag, but we only want
them to run as privileged tasks when the server is being initialized;
if we privilege them the rest of the time, the server may hang for a
long time after a reload/reconfig. so now we call isc_taskmgr_setmode()
to turn privileged execution mode on or off in the task manager.

isc_task_privileged() returns true if the task's privilege flag is
set *and* the taskmgr is in privileged execution mode. this is used
to determine in which netmgr event queue the task should be run.
2021-05-07 14:28:30 -07:00
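
The rule stated in the commit above (a task counts as privileged only when its flag is set and the task manager is in privileged mode) can be sketched as below. The types and the function name `task_privileged` are hypothetical simplifications, not the real isc_task structures.

```c
#include <stdbool.h>

/* Hypothetical execution modes of the task manager. */
typedef enum { MODE_NORMAL, MODE_PRIVILEGED } taskmgr_mode_t;

typedef struct taskmgr {
    taskmgr_mode_t mode;
} taskmgr_t;

typedef struct task {
    taskmgr_t *manager;
    bool privileged; /* the task's own privilege flag */
} task_t;

/*
 * True only if the task's flag is set *and* the manager is currently
 * in privileged execution mode.
 */
bool
task_privileged(const task_t *task) {
    return task->privileged && task->manager->mode == MODE_PRIVILEGED;
}
```

With this check, a zone loading task keeps its flag permanently but is only treated as privileged while the manager is in privileged mode.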
Ondřej Surý
29a208aaf7 Fix crash when allocating UDP socket fails on OpenBSD
When the socket() call failed, the UDP connect code would call the connectcb
with an empty req->handle.  This has been fixed.
2021-05-07 14:28:30 -07:00
Ondřej Surý
0b491913df Don't clear dig lookup if it was already cleared
This works around a couple of races where the current_lookup would
already be detached while dig was shutting down but still processing the
pending reads.
2021-05-07 14:28:30 -07:00
Ondřej Surý
2836bc1854 Fix wrong query accounting in the connect function in dighost.c
The start_udp() function didn't properly attach to the query and thus
a callback with ISC_R_CANCELED would end with wrong accounting on the
query object.

Usually, this doesn't happen because underlying libuv API
uv_udp_connect() is synchronous, but isc_nm_udpconnect() could return
ISC_R_CANCELED in case it's called while the netmgr is shutting down.
2021-05-07 14:28:30 -07:00
Ondřej Surý
dacf586e18 Make the netmgr queue processing quantized
There was a theoretical possibility of clogging up the queue processing
with an endless loop where the netievent currently being processed would
schedule a new netievent that would get processed immediately.  This wasn't
such a problem when only netmgr netievents were processed, but with the
addition of the tasks, there are at least two situations where this could
happen:

 1. In lib/dns/zone.c:setnsec3param() the task would get re-enqueued
    when the zone was not yet fully loaded.

 2. Tasks have internal quantum for maximum number of isc_events to be
    processed, when the task quantum is reached, the task would get
    rescheduled and then immediately processed by the netmgr queue
    processing.

As the isc_queue doesn't have a mechanism to atomically move the queue,
this commit adds a mechanism to quantize the queue, so enqueueing new
netievents will never stop processing other uv_loop_t events.
The default quantum size is 128.

Since the queue used in the network manager allows items to be enqueued
more than once, tasks are now reference-counted around task_ready()
and task_run(). task_ready() now has a public API wrapper,
isc_task_ready(), that the netmgr can use to reschedule processing
of a task if the quantum has been reached.

Incidental changes: Cleaned up some unused fields left in isc_task_t
and isc_taskmgr_t after the last refactoring, and changed atomic
flags to atomic_bools for easier manipulation.
2021-05-07 14:28:30 -07:00
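
The idea of quantized queue processing from the commit above can be sketched with a toy FIFO. Everything here (`queue_t`, `process_queue`, `requeue_handler`, the fixed capacity) is a hypothetical illustration and assumes a single-threaded array queue, not the lock-free isc_queue.

```c
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_CAPACITY 4096

/* Hypothetical toy FIFO: items live in items[head..tail). */
typedef struct queue {
    int items[QUEUE_CAPACITY];
    size_t head;
    size_t tail;
} queue_t;

typedef void (*handler_t)(queue_t *queue, int item);

bool
queue_push(queue_t *queue, int item) {
    if (queue->tail == QUEUE_CAPACITY) {
        return false;
    }
    queue->items[queue->tail++] = item;
    return true;
}

/*
 * Process at most 'quantum' items, even if handlers keep enqueueing
 * new ones.  Returns true if items remain, so the caller can
 * reschedule processing after other event loop work has run.
 */
bool
process_queue(queue_t *queue, size_t quantum, handler_t handler) {
    size_t processed = 0;

    while (queue->head < queue->tail && processed < quantum) {
        int item = queue->items[queue->head++];
        handler(queue, item);
        processed++;
    }
    return queue->head < queue->tail;
}

/* A handler that always schedules a follow-up item. */
void
requeue_handler(queue_t *queue, int item) {
    (void)queue_push(queue, item + 1);
}
```

With `requeue_handler`, an unbounded loop would never drain the queue; the quantum makes each pass return after a fixed number of items.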
Ondřej Surý
b5bf58b419 Destroy netmgr before destroying taskmgr
With taskmgr running on top of netmgr, the order in which the tasks and
the netmgr shut down was wrong, as previously isc_taskmgr_destroy()
waited until all tasks were properly shut down and detached.  This
responsibility was moved to netmgr, so we now need to do the following:

  1. Shut down all the tasks - this schedules all shutdown events onto
     the netmgr queue

  2. Shut down the netmgr - this also makes sure all the tasks and
     events are properly executed

  3. Shut down the taskmgr - this now waits for all the tasks to finish
     running before returning

  4. Shut down the netmgr - this call waits for all the netmgr netievents
     to finish before returning

This solves the race when the taskmgr object would be destroyed before
all the tasks were finished running in the netmgr loops.
2021-05-07 14:28:30 -07:00
Ondřej Surý
a011d42211 Add new isc_managers API to simplify <*>mgr create/destroy
Previously, netmgr, taskmgr, timermgr and socketmgr all had their own
isc_<*>mgr_create() and isc_<*>mgr_destroy() functions.  The new
isc_managers_create() and isc_managers_destroy() fold all four into a
single function and make sure the objects are created and destroyed in
the correct order.

Especially now, when taskmgr runs on top of netmgr, the correct order is
important, and when the code was duplicated in many places it was easy to
make a mistake.

The former isc_<*>mgr_create() and isc_<*>mgr_destroy() functions were
made private and a single call to isc_managers_create() and
isc_managers_destroy() is required at the program startup / shutdown.
2021-05-07 10:19:05 -07:00
Artem Boldariev
f23afce683 Merge branch 'artem/doh-tests-fix' into 'main'
Fix flawed DoH unit tests logic and some corner cases in the DoH code. Fix doh_test failure on FreeBSD 13.0

Closes #2632

See merge request isc-projects/bind9!5005
2021-05-07 13:25:56 +00:00
Artem Boldariev
8c0ea01f34 DoH: close active server streams when finishing session
Under some circumstances, a server-side session could be finished
while there were still active HTTP/2 streams. This would lead to
isc_nm_httpsocket object leaks.

This commit fixes this behaviour and refactors failed_read_cb()
to allow better code reuse.
2021-05-07 15:47:24 +03:00
Artem Boldariev
a9e97f28b7 Fix crash in client side DoH code
This commit fixes a situation in which a cstream object could get unlinked
from the list as a result of a cstream->read_cb call. Thus, unlinking
it after the call could crash the program.
2021-05-07 15:47:24 +03:00
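
One way to avoid the hazard described in the commit above is to unlink the object before invoking its callback. The sketch below assumes a hand-rolled doubly linked list with hypothetical names (`stream_t`, `stream_finish`); the commit message does not say which approach the actual fix takes.

```c
#include <stddef.h>

typedef struct stream stream_t;

typedef struct stream_list {
    stream_t *head;
} stream_list_t;

struct stream {
    stream_list_t *list; /* NULL while the stream is not linked */
    stream_t *prev;
    stream_t *next;
    void (*read_cb)(stream_t *stream);
};

void
stream_link(stream_list_t *list, stream_t *stream) {
    stream->prev = NULL;
    stream->next = list->head;
    if (list->head != NULL) {
        list->head->prev = stream;
    }
    list->head = stream;
    stream->list = list;
}

void
stream_unlink(stream_t *stream) {
    if (stream->list == NULL) {
        return; /* already unlinked */
    }
    if (stream->prev != NULL) {
        stream->prev->next = stream->next;
    } else {
        stream->list->head = stream->next;
    }
    if (stream->next != NULL) {
        stream->next->prev = stream->prev;
    }
    stream->prev = stream->next = NULL;
    stream->list = NULL;
}

/* Example callback that unlinks the stream it is called for. */
void
closing_read_cb(stream_t *stream) {
    stream_unlink(stream);
}

void
stream_finish(stream_t *stream) {
    void (*cb)(stream_t *) = stream->read_cb;

    /* Unlink first; the callback may unlink or release the stream. */
    stream_unlink(stream);
    if (cb != NULL) {
        cb(stream);
    }
}
```

Because `stream_finish()` never touches the list after the callback returns, it does not matter whether the callback already removed the stream.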
Artem Boldariev
cd178043d9 Make some TLS tests actually use quota
A directive to check quota was missing from some of the TLS tests
which were supposed to test TLS code with quotas.
2021-05-07 15:47:24 +03:00