2
0
mirror of https://gitlab.isc.org/isc-projects/bind9 synced 2025-08-30 05:57:52 +00:00

35420 Commits

Author SHA1 Message Date
Ondřej Surý
6269fce0fe Use isc_mem_get_aligned() for isc_queue and cleanup max_threads
The isc_queue_new() was using dirty tricks to allocate the head and tail
members of the struct aligned to the cacheline.  We can now use
isc_mem_get_aligned() to allocate the structure to the cacheline
directly.

Use ISC_OS_CACHELINE_SIZE (64) instead of arbitrary ALIGNMENT (128), one
cacheline size is enough to prevent false sharing.

Cleanup the unused max_threads variable - there was actually no limit on
the maximum number of threads.  This was changed a while ago.
2022-01-05 17:10:58 +01:00
Ondřej Surý
c84eb55049 Reduce the memory used by hazard pointers
The hazard pointers implementation was bit of frivolous with memory
usage allocating memory based on maximum constants rather than on the
usage.

Make the retired list bit use exactly the memory needed for specified
number of hazard pointers.  This reduced the memory used by hazard
pointers to one quarter in our specific case because we only use single
HP in the queue implementation (as opposed to allocating memory for
HP_MAX_HPS = 4).

Previously, the alignment to prevent false sharing was double the
cacheline size.  This was copied from the ConcurrencyFreaks
implementation, but one cacheline size is enough to prevent false
sharing, so we are using this now to save few bits of memory.

The top level hazard pointers and retired list arrays are now not
aligned to the cacheline size - they are read-only for the whole
life-time of the isc_hp object.  Only hp (hazard pointer) and
rl (retired list) array members are allocated aligned to the cacheline
size to avoid false sharing between threads.

Cleanup HP_MAX_HPS and HP_THRESHOLD_R constants from the paper, because
we don't use them in the code.  HP_THRESHOLD_R was 0, so the check
whether the retired list size was smaller than the value was basically a
dead code.
2022-01-05 17:10:58 +01:00
Ondřej Surý
c917a2ca88 Add isc_mem_*_aligned() function that works with aligned memory
There are some situations where having aligned allocations would be
useful, so we don't have to play tricks with padding the data to the
cacheline sizes.

Add isc_mem_{get,put,reget,putanddetach}_aligned() functions that has
alignment and size as last argument mimicking the POSIX posix_memalign()
functions on systems with jemalloc (see the documentation on
MALLOX_ALIGN() for more details).  On systems without jemalloc, those
functions are same as non-aligned variants.
2022-01-05 17:10:56 +01:00
Ondřej Surý
4f78f9d72a Add #define ISC_OS_CACHELINE_SIZE 64
Add library ctor and dtor for isc_os compilation unit which initializes
the numbers of the CPUs and also checks whether L1 cacheline size is
really 64 if the sysconf() call is available.
2022-01-05 17:07:35 +01:00
Ondřej Surý
55aa182ae6 Merge branch '2979-lock-view-while-accessing-its-zone-table' into 'main'
Lock view while accessing its zone table

Closes #2979

See merge request isc-projects/bind9!5676
2022-01-05 15:58:02 +00:00
Ondřej Surý
7624dc3ee4 Merge branch 'ondrej/fix-taskmgr-exiting-access' into 'main'
Fixup code related to the taskmgr shutdown

See merge request isc-projects/bind9!5690
2022-01-05 15:56:31 +00:00
Ondřej Surý
ab5b2ef43c Add isc_refcount_destroy() for dns_zt reference counters
The zt_destroy() function was missing isc_refcount_destroy() on the two
reference counters.  The isc_refcount_destroy() adds proper memory
ordering on destroy and also ensures that the reference counters have
been zeroed before destroying the object.
2022-01-05 16:56:16 +01:00
Ondřej Surý
f326d45135 Lock view while accessing its zone table
Commit 308bc46a59302c88ecff11d4831475ecfa8b8fb0 introduced a change to
the view_flushanddetach() function which makes the latter access
view->zonetable without holding view->lock.  As confirmed by TSAN, this
enables races between threads for view->zonetable accesses.

Swap the view->zonetable pointer under view lock and then detach the
local swapped dns_zt_t later when the view lock is already unlocked.

This commit also changes the dns_zt interfaces, so the setting the
zonetable "flush" flag is separate operation to dns_zt_detach,
e.g. instead of doing:

    if (view->flush) {
        dns_zt_flushanddetach(&zt);
    } else {
        dns_zt_detach(&zt);
    }

the code is now:

    if (view->flush) {
        dns_zt_flush(zt);
    }
    dns_zt_detach(&zt);

making the code more consistent with how we handle flushing and
detaching dns_zone_t pointers from the view.
2022-01-05 16:56:16 +01:00
Ondřej Surý
e705f213ca Remove taskmgr->excl_lock, fix the locking for taskmgr->exiting
While doing code review, it was found that the taskmgr->exiting is set
under taskmgr->lock, but accessed under taskmgr->excl_lock in the
isc_task_beginexclusive().

Additionally, before the change that moved running the tasks to the
netmgr, the task_ready() subrouting of isc_task_detach() would lock
mgr->lock, requiring the mgr->excl to be protected mgr->excl_lock
to prevent deadlock in the code.  After !4918 has been merged, this is
no longer true, and we can remove taskmgr->excl_lock and use
taskmgr->lock in its stead.

Solve both issues by removing the taskmgr->excl_lock and exclusively use
taskmgr->lock to protect both taskmgr->excl and taskmgr->exiting which
now doesn't need to be atomic_bool, because it's always accessed from
within the locked section.
2022-01-05 16:44:57 +01:00
Ondřej Surý
f9d90159b8 On shutdown, return ISC_R_SHUTTINGDOWN from isc_taskmgr_excltask()
The isc_taskmgr_excltask() would return ISC_R_NOTFOUND either when the
exclusive task was not set (yet) or when the taskmgr is shutting down
and the exclusive task has been already cleared.

Distinguish between the two states and return ISC_R_SHUTTINGDOWN when
the taskmgr is being shut down instead of ISC_R_NOTFOUND.
2022-01-05 13:41:12 +01:00
Ondřej Surý
b2c9543a6e Merge branch '3074-catz-excl-task' into 'main'
Prevent a shutdown race in catz_create_chg_task()

Closes #3074

See merge request isc-projects/bind9!5687
2022-01-05 12:37:27 +00:00
Evan Hunt
81c09b005b Add CHANGES note for [GL #3074] 2022-01-05 13:15:40 +01:00
Evan Hunt
973ac1d891 Prevent a shutdown race in catz_create_chg_task()
If a catz event is scheduled while the task manager was being
shut down, task-exclusive mode is unavailable. This needs to be
handled as an error rather than triggering an assertion.
2022-01-05 12:48:40 +01:00
Matthijs Mekking
c2aeda6c99 Merge branch '3023-auto-dnssec-documentation-bug' into 'main'
Update auto-dnssec documentation

Closes #3023

See merge request isc-projects/bind9!5598
2022-01-05 11:26:14 +00:00
Matthijs Mekking
447fa2a816 Add CHANGES for #3023 2022-01-05 11:48:50 +01:00
Matthijs Mekking
aac39647f3 Update auto-dnssec documentation
Explain that 'auto-dnssec' may only be activated at zone level.
2022-01-05 11:48:26 +01:00
Ondřej Surý
a71be346c4 Merge branch '3071-signed-version-of-an-inline-signed-zone-may-be-dumped-without-unsigned-serial-number' into 'main'
Do not detach raw zone until dumping is complete

Closes #3071

See merge request isc-projects/bind9!5680
2022-01-05 09:32:25 +00:00
Ondřej Surý
4d71a3b309 Add CHANGES and release note for [GL #3071] 2022-01-05 10:29:15 +01:00
Michał Kępień
cc1d4e1aa6 Ensure the correct ordering zone_shutdown() vs zone_gotwritehandle()
When the signed version of an inline-signed zone is dumped to disk, the
serial number of the unsigned version of the zone is written in the
raw-format header so that the contents of the signed zone can be
resynchronized after named restart if the unsigned zone file is
modified while named is not running (see RT #26676).

In order for the serial number of the unsigned zone to be determined
during the dump, zone->raw must be set to a non-NULL value.  This
should always be the case as long as the signed version of the zone is
used for anything by named.

However, under certain circumstances the zone->raw could be set to NULL
while the zone is being dumped.

Defer detaching from zone->raw in zone_shutdown() if the zone is in the
process of being dumped to disk.
2022-01-05 10:27:55 +01:00
Evan Hunt
99af3fbeda Merge branch '3075-fix-tlsctx-detach' into 'main'
Ensure that cache pointer is set to NULL by isc_tlsctx_cache_detach()

Closes #3075

See merge request isc-projects/bind9!5686
2022-01-05 07:07:47 +00:00
Evan Hunt
61c160c4a5 Clean up isc_tlsctx_cache_detach()
For consistency with similar functions, rename `pcache` to `cachep`,
call a separate destroy function when references reach 0, and add
a missing call to isc_refcount_destroy().
2022-01-04 23:07:12 -08:00
Evan Hunt
f5074c0c8e Ensure that cache pointer is set to NULL by isc_tlsctx_cache_detach()
If the reference count was higher than 1, detaching a tlsctx cache
didn't clear the pointer, which could trigger an assertion later.
2022-01-04 11:48:25 -08:00
Michał Kępień
a1db2347d4 Merge branch '3032-include-isc-logo-in-source-tarballs' into 'main'
Include doc/arm/isc-logo.pdf in source tarballs

Closes #3032

See merge request isc-projects/bind9!5678
2022-01-04 13:43:07 +00:00
Michał Kępień
62be4f6b0e Include doc/arm/isc-logo.pdf in source tarballs
The doc/arm/conf.py Sphinx configuration file specifies
doc/arm/isc-logo.pdf as the logo to use in the PDF files produced.
Since doc/arm/isc-logo.pdf is not currently included in source tarballs
produced using "make dist", attempting to build documentation in PDF
format using a source tarball results in the following error being
raised:

    Sphinx error:
    logo file 'isc-logo.pdf' does not exist

Ensure doc/arm/isc-logo.pdf is included in source tarballs produced
using "make dist", so that the BIND 9 ARM can be successfully built in
PDF format using just the source tarball.
2022-01-04 14:37:52 +01:00
Michał Kępień
0bca8f0b2a Add a tarball-based documentation-building job
The existing "docs" GitLab CI job operates on a Git repository rather
than a source tarball.  This prevents it from detecting issues caused by
files missing from source tarballs.  Add a new GitLab CI job similar to
the "docs" one, but using a source tarball rather than a Git repository.
Extract YAML bits used by multiple job definitions into anchors to avoid
code duplication.  Drop the "allow_failure: false" key in the process as
it is the implicit default for non-manual jobs.  Replace the
"artifacts:paths" key with "artifacts:untracked" in order to include all
untracked files in the artifact archive for each documentation-building
job; this allows tarball-based artifacts to be properly captured and
also facilitates troubleshooting failed jobs.
2022-01-04 14:37:52 +01:00
Mark Andrews
1515d39f8c Merge branch '3065-memory-leak-on-duplicately-named-dnssec-policy' into 'main'
Resolve "memory leak on duplicately named dnssec-policy"

Closes #3065

See merge request isc-projects/bind9!5669
2022-01-03 21:45:01 +00:00
Mark Andrews
6de041f19c Add CHANGES for [GL #3065] 2022-01-03 11:49:27 -08:00
Mark Andrews
b8845454c8 Report duplicate dnssec-policy names
Duplicate dnssec-policy names were detected as an error condition
but were not logged.
2022-01-03 11:48:26 -08:00
Mark Andrews
694440e614 Address memory leak when processing dnssec-policy clauses
A kasp structure was not detached when looking to see if there
was an existing kasp structure with the same name, causing memory
to be leaked.  Fixed by calling dns_kasp_detach() to release the
reference.
2022-01-03 11:47:33 -08:00
Michal Nowak
441b251207 Merge branch 'mnowak/drop-xmllint-check-from-misc-ci-job' into 'main'
Drop xmllint check from misc CI job

See merge request isc-projects/bind9!5684
2022-01-03 15:47:00 +00:00
Michal Nowak
1f64be2811
Drop xmllint check from misc CI job
There are no XML or docbook files in the "main" source tree to be
checked and the xmllint command just prints out a usage message.
2022-01-03 15:51:36 +01:00
Michal Nowak
a6bde9612b Merge branch 'mnowak/year-2022' into 'main'
Update copyrights to 2022

See merge request isc-projects/bind9!5681
2022-01-03 14:50:33 +00:00
Michal Nowak
befd654e00
Update copyrights to 2022 2022-01-03 10:53:28 +01:00
Michał Kępień
ae7ba926d4 Merge branch '2782-set-version-and-release-variables-in-conf.py' into 'main'
Set version and release variables in conf.py

Closes #2782

See merge request isc-projects/bind9!5205
2021-12-29 09:02:10 +00:00
Michał Kępień
e67cdb390a Clarify use of the "today" Sphinx variable
Add a comment explaining the purpose of setting the "today" variable in
Sphinx invocations to prevent confusion caused by the absence of that
variable from reStructuredText sources.

Drop the -A command-line option from the sphinx-build invocation for
EPUB output as "today" is already set in the ALLSPHINXOPTS variable.
2021-12-29 09:58:48 +01:00
Michał Kępień
38d251e11b Set version and release variables in conf.py
Some Sphinx variables used in the ARM are only set in Makefile.docs.
This works fine when building the ARM using "make", but does not work
with Read the Docs, which only looks at conf.py files.

Since Read the Docs does not run ./configure, renaming conf.py to
conf.py.in and using Autoconf output variables is not a feasible
solution.

Instead, extend doc/arm/conf.py with some Python code which processes
configure.ac using regular expressions and sets the relevant Sphinx
variables accordingly.  As this solution also works fine when building
the ARM using "make", drop the relevant -D options from the list of
sphinx-build options used for building the ARM in Makefile.docs.

Note that the man_SPHINXOPTS counterparts of the removed -D switches are
left intact because doc/man/conf.py is a separate Sphinx project which
is only processed using "make" and duplicating the Python code added to
doc/arm/conf.py by this commit would be inelegant.
2021-12-29 09:58:48 +01:00
Artem Boldariev
3addc36533 Merge branch 'artem-tlsctx-caching' into 'main'
Add TLS context cache

Closes #3067

See merge request isc-projects/bind9!5672
2021-12-29 08:58:10 +00:00
Artem Boldariev
cb330c432d Add a CHANGES entry [GL !5672]
Mention that TLS contexts reuse was implemented.
2021-12-29 10:25:16 +02:00
Artem Boldariev
64f7c55662 Use the TLS context cache for client-side contexts (XoT)
This commit enables client-side TLS contexts re-use for zone transfers
over TLS. That, in turn, makes it possible to use the internal session
cache associated with the contexts, allowing the TLS connections to be
established faster and requiring fewer resources by not going through
the full TLS handshake procedure.

Previously that would recreate the context on every connection, making
TLS session resumption impossible.

Also, this change lays down a foundation for Strict TLS (when the
client validates a server certificate), as the TLS context cache can
be extended to store additional data required for validation (like
intermediates CA chain).
2021-12-29 10:25:15 +02:00
Artem Boldariev
5b7d4341fe Use the TLS context cache for server-side contexts
Using the TLS context cache for server-side contexts could reduce the
number of contexts to initialise in the configurations when e.g. the
same 'tls' entry is used in multiple 'listen-on' statements for the
same DNS transport, binding to multiple IP addresses.

In such a case, only one TLS context will be created, instead of a
context per IP address, which could reduce the initialisation time, as
initialising even a non-ephemeral TLS context introduces some delay,
which can be *visually* noticeable by log activity.

Also, this change lays down a foundation for Mutual TLS (when the
server validates a client certificate, additionally to a client
validating the server), as the TLS context cache can be extended to
store additional data required for validation (like intermediates CA
chain).

Additionally to the above, the change ensures that the contexts are
not being changed after initialisation, as such a practice is frowned
upon. Previously we would set the supported ALPN tags within
isc_nm_listenhttp() and isc_nm_listentlsdns(). We do not do that for
client-side contexts, so that appears to be an overlook. Now we set
the supported ALPN tags right after server-side contexts creation,
similarly how we do for client-side ones.
2021-12-29 10:25:14 +02:00
Artem Boldariev
eb37d967c2 Add TLS context cache
This commit adds a TLS context object cache implementation. The
intention of having this object is manyfold:

- In the case of client-side contexts: allow reusing the previously
created contexts to employ the context-specific TLS session resumption
cache. That will enable XoT connection to be reestablished faster and
with fewer resources by not going through the full TLS handshake
procedure.

- In the case of server-side contexts: reduce the number of contexts
created on startup. That could reduce startup time in a case when
there are many "listen-on" statements referring to a smaller amount of
`tls` statements, especially when "ephemeral" certificates are
involved.

- The long-term goal is to provide in-memory storage for additional
data associated with the certificates, like runtime
representation (X509_STORE) of intermediate CA-certificates bundle for
Strict TLS/Mutual TLS ("ca-file").
2021-12-29 10:25:11 +02:00
Michał Kępień
c6dffa3e09 Merge branch 'michal/fix-error-codes-passed-to-connection-callbacks' into 'main'
Fix error codes passed to connection callbacks

See merge request isc-projects/bind9!5675
2021-12-28 15:14:11 +00:00
Michał Kępień
ea89ab80ae Fix error codes passed to connection callbacks
Commit 9ee60e7a17bf34c7ef7f4d79e6a00ca45444ec8c erroneously introduced
duplicate conditions to several existing conditional statements
responsible for determining error codes passed to connection callbacks
upon failure.  Fix the affected expressions to ensure connection
callbacks are invoked with:

  - the ISC_R_SHUTTINGDOWN error code when a global netmgr shutdown is
    in progress,

  - the ISC_R_CANCELED error code when a specific operation has been
    canceled.

This does not fix any known bugs, it only adjusts the changes introduced
by commit 9ee60e7a17bf34c7ef7f4d79e6a00ca45444ec8c so that they match
its original intent.
2021-12-28 15:09:50 +01:00
Michał Kępień
cb22ed0492 Merge branch '3068-fix-rare-control-channel-socket-reference-leak' into 'main'
Fix rare control channel socket reference leak

Closes #3068

See merge request isc-projects/bind9!5673
2021-12-28 12:42:45 +00:00
Michał Kępień
fc678b19d9 Fix rare control channel socket reference leak
Commit 9ee60e7a17bf34c7ef7f4d79e6a00ca45444ec8c enabled netmgr shutdown
to cause read callbacks for active control channel sockets to be invoked
with the ISC_R_SHUTTINGDOWN result code.  However, control channel code
only recognizes ISC_R_CANCELED as an indicator of an in-progress netmgr
shutdown (which was correct before the above commit).  This discrepancy
enables the following scenario to happen in rare cases:

 1. A control channel request is received and responded to.  libuv
    manages to write the response to the TCP socket, but the completion
    callback (control_senddone()) is yet to be invoked.

 2. Server shutdown is initiated.  All TCP sockets are shut down, which
    i.a. causes control_recvmessage() to be invoked with the
    ISC_R_SHUTTINGDOWN result code.  As the result code is not
    ISC_R_CANCELED, control_recvmessage() does not set
    listener->controls->shuttingdown to 'true'.

 3. control_senddone() is called with the ISC_R_SUCCESS result code.  As
    neither listener->controls->shuttingdown is 'true' nor is the result
    code ISC_R_CANCELED, reading is resumed on the control channel
    socket.  However, this read can never be completed because the read
    callback on that socket was cleared when the TCP socket was shut
    down.  This causes a reference on the socket's handle to be held
    indefinitely, leading to a hang upon shutdown.

Ensure listener->controls->shuttingdown is also set to 'true' when
control_recvmessage() is invoked with the ISC_R_SHUTTINGDOWN result
code.  This ensures the send completion callback does not resume reading
after the control channel socket is shut down.
2021-12-28 08:36:01 +01:00
Michal Nowak
ab52c99843 Merge branch 'mnowak/make-debian-11-bullseye-base-image' into 'main'
Make bullseye the base image

See merge request isc-projects/bind9!5367
2021-12-23 14:41:45 +00:00
Michal Nowak
4d7e343813
Use /dev/urandom as BIND 9.11 randomness source
This prevents resolver timeouts for the reference (BIND 9.11) servers
used in respdiff tests run on Debian 11 "bullseye".
2021-12-23 11:37:59 +01:00
Michal Nowak
910d595fbc
Make bullseye the base image
"buster" jobs are now only going to be run in scheduled pipelines.

"--without-gssapi" ./configure option of "bullseye" before it became
the base image is dropped from "bullseye"-the-base-image because it
reduces gcov coverage by 0.38 % (651 lines) and is used in Debian 9
"stretch".
2021-12-23 11:37:59 +01:00
Mark Andrews
3959776b02 Merge branch '3041-decide-what-to-do-with-reject-000-and-other-obscure-options-for-synth-from-dnssec-feature' into 'main'
remove reject-000 and broken-nsec options (related to synth-from-dnssec feature)

Closes #3041

See merge request isc-projects/bind9!5621
2021-12-23 05:14:50 +00:00
Mark Andrews
dc8595936c remove broken-nsec and reject-000-label options 2021-12-23 15:13:46 +11:00