The rewrite of BIND 9 build system is a large work and cannot be reasonable
split into separate merge requests. Addition of the automake has a positive
effect on the readability and maintainability of the build system as it is more
declarative, it allows conditional and we are able to drop all of the custom
make code that BIND 9 developed over the years to overcome the deficiencies of
autoconf + custom Makefile.in files.
This squashed commit contains following changes:
- conversion (or rather fresh rewrite) of all Makefile.in files to Makefile.am
by using automake
- the libtool is now properly integrated with automake (the way we used it
was rather hackish as the only official way how to use libtool is via
automake
- the dynamic module loading was rewritten from a custom patchwork to libtool's
libltdl (which includes the patchwork to support module loading on different
systems internally)
- conversion of the unit test executor from kyua to automake parallel driver
- conversion of the system test executor from custom make/shell to automake
parallel driver
- The GSSAPI has been refactored, the custom SPNEGO on the basis that
all major KRB5/GSSAPI (mit-krb5, heimdal and Windows) implementations
support SPNEGO mechanism.
- The various defunct tests from bin/tests have been removed:
bin/tests/optional and bin/tests/pkcs11
- The text files generated from the MD files have been removed, the
MarkDown has been designed to be readable by both humans and computers
- The xsl header is now generated by a simple sed command instead of
perl helper
- The <irs/platform.h> header has been removed
- cleanups of configure.ac script to make it more simpler, addition of multiple
macros (there's still work to be done though)
- the tarball can now be prepared with `make dist`
- the system tests are partially able to run in oot build
Here's a list of unfinished work that needs to be completed in subsequent merge
requests:
- `make distcheck` doesn't yet work (because of system tests oot run is not yet
finished)
- documentation is not yet built, there's a different merge request with docbook
to sphinx-build rst conversion that needs to be rebased and adapted on top of
the automake
- msvc build is non functional yet and we need to decide whether we will just
cross-compile bind9 using mingw-w64 or fix the msvc build
- contributed dlz modules are not included neither in the autoconf nor automake
The isc_mem API now crashes on memory allocation failure, and this is
the next commit in series to cleanup the code that could fail before,
but cannot fail now, e.g. isc_result_t return type has been changed to
void for the isc_log API functions that could only return ISC_R_SUCCESS.
The isc_buffer_allocate() function now cannot fail with ISC_R_MEMORY.
This commit removes all the checks on the return code using the semantic
patch from previous commit, as isc_buffer_allocate() now returns void.
The isc_mempool_create() function now cannot fail with ISC_R_MEMORY.
This commit removes all the checks on the return code using the semantic
patch from previous commit, as isc_mempool_create() now returns void.
Previously, the dns_name API used isc_thread_key API for TLS, which is
fairly complicated and requires initialization of memory contexts, etc.
This part of code was refactored to use a ISC_THREAD_LOCAL pointer which
greatly simplifies the whole code related to storing TLS variables.
When a task manager is created, we can now specify an `isc_nm`
object to associate with it; thereafter when the task manager is
placed into exclusive mode, the network manager will be paused.
The coccinellery repository provides many little semantic patches to fix common
problems in the code. The number of semantic patches in the coccinellery
repository is high and most of the semantic patches apply only for Linux, so it
doesn't make sense to run them on regular basis as the processing takes a lot of
time.
The list of issue found in BIND 9, by no means complete, includes:
- double assignment to a variable
- `continue` at the end of the loop
- double checks for `NULL`
- useless checks for `NULL` (cannot be `NULL`, because of earlier return)
- using `0` instead of `NULL`
- useless extra condition (`if (foo) return; if (!foo) { ...; }`)
- removing & in front of static functions passed as arguments
This commit was done by hand to add the RUNTIME_CHECK() around stray
dns_name_copy() calls with NULL as third argument. This covers the edge cases
that doesn't make sense to write a semantic patch since the usage pattern was
unique or almost unique.
This commit add RUNTIME_CHECK() around all simple dns_name_copy() calls where
the third argument is NULL using the semantic patch from the previous commit.
It is possible dig used ACE encoded name in locale, which does not
support converting it to unicode. Instead of fatal error, fallback to
ACE name on output.
isc_event_allocate() calls isc_mem_get() to allocate the event structure. As
isc_mem_get() cannot fail softly (e.g. it never returns NULL), the
isc_event_allocate() cannot return NULL, hence we remove the (ret == NULL)
handling blocks using the semantic patch from the previous commit.
dig retries a TCP query when a server closes the connection prematurely.
However, dig's exit code remains unaffected even if the second attempt
to get a response also fails with the same error for the same lookup,
which should not be the case. Ensure the exit code is updated
appropriately when a retry triggered by a TCP EOF condition fails.
When a query times out after a socket is created and associated with a
given dig_query_t structure, calling isc_socket_cancel() causes
connect_done() to be run, which in turn takes care of all necessary
cleanups. However, certain errors (e.g. get_address() returning
ISC_R_FAMILYNOSUPPORT) may prevent a TCP socket from being created in
the first place. Since force_timeout() may be used in code handling
such errors, connect_timeout() needs to properly clean up a TCP query
which is not associated with any socket. Call clear_query() from
connect_timeout() after attempting to send a TCP query to the next
available server if the timed out query does not have a socket
associated with it, in order to prevent dig from hanging indefinitely
due to the dig_query_t structure not being detached from its parent
dig_lookup_t structure.
When a query times out and another server is available for querying
within the same lookup, the timeout handler - connect_timeout() - is
responsible for sending the query to the next server. Extract the
relevant part of connect_timeout() to a separate function in order to
improve code readability.
Before commit c2ec022f5784a2ff844f7d062c2022197dc4ad09, using the "-b"
command line switch for dig did not disable use of the other address
family than the one to which the address supplied to that option
belonged to. Thus, bind9_getaddresses() could e.g. prepare an
isc_sockaddr_t structure for an IPv6 address when an IPv4 address has
been passed to the "-b" command line option. To avoid attempting the
impossible (e.g. querying an IPv6 address from a socket bound to an IPv4
address), a certain code block in send_tcp_connect() checked whether the
address family of the server to be queried was the same as the address
family of the socket set up for sending that query; if there was a
mismatch, that particular server address was skipped.
Commit c2ec022f5784a2ff844f7d062c2022197dc4ad09 made
bind9_getaddresses() fail upon an address family mismatch between the
address the hostname passed to it resolved to and the address supplied
to the "-b" command line option. Such failures were fatal to dig back
then.
Commit 7f658603910358db7ee27ffb9783096250afab62 made
bind9_getaddresses() failures non-fatal, but also ensured that a
get_address() failure in send_tcp_connect() still causes the given query
address to be skipped (and also made such failures trigger an early
return from send_tcp_connect()).
Summing up, the code block handling address family mismatches in
send_tcp_connect() has been redundant since commit
c2ec022f5784a2ff844f7d062c2022197dc4ad09. Remove it.
In BIND 9.11 and earlier, dig and similar tools used liblwres for
parsing /etc/resolv.conf. After getting a list of servers from
liblwres, a tool would check the address family of each server found and
reject those unusable. When the resulting list of usable servers was
empty, localhost addresses were queried as a fallback.
When liblwres was removed in BIND 9.12, dig and similar tools were
updated to parse /etc/resolv.conf using libirs instead. As part of that
process, the localhost fallback was removed from bin/dig/dighost.c since
the localhost fallback built into libirs was deemed to be sufficient.
However, libirs only falls back to localhost if it does not find any
name servers at all; if it does find any valid nameserver entry in
/etc/resolv.conf, it just returns it to the caller because it is
oblivious to whether the caller supports IPv4 and/or IPv6 or not. The
code in bin/dig/dighost.c subsequently filters the returned list of
servers in get_server_list() according to the requested address family
restrictions. This may result in none of the addresses returned by
libirs being usable, in which case a tool will attempt to work with an
empty server list, causing a hang and subsequently a crash upon user
interruption.
Restore the localhost fallback in bin/dig/dighost.c to prevent the
aforementioned hangs and crashes and ensure recent BIND versions behave
identically to the older ones in the circumstances described above.
If a tool using the routines defined in bin/dig/dighost.c is sent an
interruption signal around the time a connection timeout is scheduled to
fire, connect_timeout() may be executed after destroy_libs() detaches
from the global task (setting 'global_task' to NULL), which results in a
crash upon a UDP retry due to bringup_timer() attempting to create a
timer with 'task' set to NULL. Fix by preventing connect_timeout() from
attempting a retry when shutdown is in progress.
While implementing the new unit testing framework cmocka, it was found that the
BIND 9 code doesn't compile when assertions are disabled or replaced with any
function (such as mock_assert() from cmocka unit testing framework) that's not
directly recognized as assertion by the compiler.
This made the compiler to complain about blocks of code that was recognized as
unreachable before, but now it isn't.
The changes in this commit include:
* assigns default values to couple of local variables,
* moves some return statements around INSIST assertions,
* adds __builtin_unreachable(); annotations after some INSIST assertions,
* fixes one broken assertion (= instead of ==)