IBM power architecture has L1 cache line size equal to 128. Take
advantage of that on that architecture, do not force more common value
of 64. When it is possible to detect higher value, use that value
instead. Keep the default to be 64.
This commit converts the license handling to adhere to the REUSE
specification. It specifically:
1. Adds used licnses to LICENSES/ directory
2. Add "isc" template for adding the copyright boilerplate
3. Changes all source files to include copyright and SPDX license
header, this includes all the C sources, documentation, zone files,
configuration files. There are notes in the doc/dev/copyrights file
on how to add correct headers to the new files.
4. Handle the rest that can't be modified via .reuse/dep5 file. The
binary (or otherwise unmodifiable) files could have license places
next to them in <foo>.license file, but this would lead to cluttered
repository and most of the files handled in the .reuse/dep5 file are
system test files.
The hazard pointers implementation was bit of frivolous with memory
usage allocating memory based on maximum constants rather than on the
usage.
Make the retired list bit use exactly the memory needed for specified
number of hazard pointers. This reduced the memory used by hazard
pointers to one quarter in our specific case because we only use single
HP in the queue implementation (as opposed to allocating memory for
HP_MAX_HPS = 4).
Previously, the alignment to prevent false sharing was double the
cacheline size. This was copied from the ConcurrencyFreaks
implementation, but one cacheline size is enough to prevent false
sharing, so we are using this now to save few bits of memory.
The top level hazard pointers and retired list arrays are now not
aligned to the cacheline size - they are read-only for the whole
life-time of the isc_hp object. Only hp (hazard pointer) and
rl (retired list) array members are allocated aligned to the cacheline
size to avoid false sharing between threads.
Cleanup HP_MAX_HPS and HP_THRESHOLD_R constants from the paper, because
we don't use them in the code. HP_THRESHOLD_R was 0, so the check
whether the retired list size was smaller than the value was basically a
dead code.
Previously, the isc_hp_init() could not lower the value of
isc__hp_max_threads, but because of a mistake the isc__hp_max_threads
would be set to HP_MAX_THREADS (e.g. 128 threads) thus it would be
always set to 128. This would result in increased memory usage even
when small number of workers were in use.
Change the default value of isc__hp_max_threads to be 1.
Additionally, enforce the max_hps value in isc_hp_new() to be smaller or
equal to HP_MAX_HPS. The only user is isc_queue which uses just 1
hazard pointer, so it's only theoretical issue.
Previously, netmgr, taskmgr, timermgr and socketmgr all had their own
isc_<*>mgr_create() and isc_<*>mgr_destroy() functions. The new
isc_managers_create() and isc_managers_destroy() fold all four into a
single function and makes sure the objects are created and destroy in
correct order.
Especially now, when taskmgr runs on top of netmgr, the correct order is
important and when the code was duplicated at many places it's easy to
make mistake.
The former isc_<*>mgr_create() and isc_<*>mgr_destroy() functions were
made private and a single call to isc_managers_create() and
isc_managers_destroy() is required at the program startup / shutdown.
Convert the isc_hp API to use the globally available isc_tid_v instead
of locally defined tid_v. This should solve most of the problems on
machines with many number of cores / CPUs.
The rewrite of BIND 9 build system is a large work and cannot be reasonable
split into separate merge requests. Addition of the automake has a positive
effect on the readability and maintainability of the build system as it is more
declarative, it allows conditional and we are able to drop all of the custom
make code that BIND 9 developed over the years to overcome the deficiencies of
autoconf + custom Makefile.in files.
This squashed commit contains following changes:
- conversion (or rather fresh rewrite) of all Makefile.in files to Makefile.am
by using automake
- the libtool is now properly integrated with automake (the way we used it
was rather hackish as the only official way how to use libtool is via
automake
- the dynamic module loading was rewritten from a custom patchwork to libtool's
libltdl (which includes the patchwork to support module loading on different
systems internally)
- conversion of the unit test executor from kyua to automake parallel driver
- conversion of the system test executor from custom make/shell to automake
parallel driver
- The GSSAPI has been refactored, the custom SPNEGO on the basis that
all major KRB5/GSSAPI (mit-krb5, heimdal and Windows) implementations
support SPNEGO mechanism.
- The various defunct tests from bin/tests have been removed:
bin/tests/optional and bin/tests/pkcs11
- The text files generated from the MD files have been removed, the
MarkDown has been designed to be readable by both humans and computers
- The xsl header is now generated by a simple sed command instead of
perl helper
- The <irs/platform.h> header has been removed
- cleanups of configure.ac script to make it more simpler, addition of multiple
macros (there's still work to be done though)
- the tarball can now be prepared with `make dist`
- the system tests are partially able to run in oot build
Here's a list of unfinished work that needs to be completed in subsequent merge
requests:
- `make distcheck` doesn't yet work (because of system tests oot run is not yet
finished)
- documentation is not yet built, there's a different merge request with docbook
to sphinx-build rst conversion that needs to be rebased and adapted on top of
the automake
- msvc build is non functional yet and we need to decide whether we will just
cross-compile bind9 using mingw-w64 or fix the msvc build
- contributed dlz modules are not included neither in the autoconf nor automake
hp implementation requires an object for each thread accessing
a hazard pointer. previous implementation had a hardcoded
HP_MAX_THREAD value of 128, which failed on machines with lots of
CPU cores (named uses 3n threads). We make isc__hp_max_threads
configurable at startup, with the value set to 4*named_g_cpus.
It's also important for this value not to be too big as we do
linear searches on a list.
The new ISC_THREAD_LOCAL macro unifies usage of platform dependent
Thread Local Storage definition thread_local vs __thread vs
__declspec(thread) to a single macro.
The commit also unifies the required level of support for TLS as for
some parts of the code it was mandatory and for some parts of the code
it wasn't.