The malloc attribute allows the compiler to do some optimizations on
functions that behave like malloc/calloc, such as assuming that the
returned pointer does not alias other pointers.
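As an illustration, a minimal sketch of how such an annotation can look with GCC or Clang. The wrapper name xalloc_zeroed and the ATTR_MALLOC macro are hypothetical and are not taken from the BIND 9 sources.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical macro: expands to the attribute only on GCC/Clang. */
#if defined(__GNUC__) || defined(__clang__)
#define ATTR_MALLOC __attribute__((malloc))
#else
#define ATTR_MALLOC
#endif

/*
 * Hypothetical calloc-like wrapper.  The attribute promises that the
 * returned pointer does not alias any other pointer that is valid when
 * the function returns, which the optimizer may rely on.
 */
ATTR_MALLOC void *
xalloc_zeroed(size_t size);

void *
xalloc_zeroed(size_t size) {
	assert(size > 0);

	void *ptr = malloc(size);
	if (ptr == NULL) {
		abort(); /* this sketch treats allocation failure as fatal */
	}
	memset(ptr, 0, size);
	return ptr;
}
```

The attribute is only valid on functions that really return fresh storage; putting it on a function that can return a pointer into an existing object would let the compiler make wrong assumptions.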
There is no possibility for mpctx->items to be NULL at the point where
the code was removed. Since we enforce that fillcount > 0, if
mpctx->items == NULL when isc_mempool_get is called, we allocate
fillcount more items and add them to the mpctx->items list.
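The reasoning can be sketched with a simplified free-list pool. This is not the isc_mempool implementation; pool_t, item_t, pool_init and pool_get are hypothetical names used only to show why the list cannot be empty after a refill.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the mempool structures. */
typedef struct item {
	struct item *next;
} item_t;

typedef struct pool {
	item_t *items;          /* free list */
	unsigned int fillcount; /* items added per refill, always > 0 */
	size_t size;            /* size of each item */
} pool_t;

void
pool_init(pool_t *pool, size_t size, unsigned int fillcount) {
	assert(fillcount > 0); /* the invariant the argument relies on */
	pool->items = NULL;
	pool->fillcount = fillcount;
	pool->size = size < sizeof(item_t) ? sizeof(item_t) : size;
}

void *
pool_get(pool_t *pool) {
	if (pool->items == NULL) {
		/* Refill: fillcount > 0, so at least one item is added. */
		for (unsigned int i = 0; i < pool->fillcount; i++) {
			item_t *item = malloc(pool->size);
			if (item == NULL) {
				abort();
			}
			item->next = pool->items;
			pool->items = item;
		}
	}

	/* The list cannot be empty here, so no NULL check is needed. */
	item_t *item = pool->items;
	pool->items = item->next;
	return item;
}
```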
The _query_detach function was incorrectly unlinking the query object from
the lookup->q query list. This made it impossible to follow a failed
query lookup with the next one in the list (possibly using a separate
resolver), as the link to the next query in the list was dissolved.
Fix by unlinking the node only when the query object is about to be
destroyed, i.e. when there are no more references to the object.
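A minimal sketch of the corrected pattern, using a hypothetical singly linked list and plain (non-atomic) reference counter rather than the real dig data structures: the node stays on the list until the last reference is dropped.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model of a query on a lookup's query list. */
typedef struct query {
	struct query *next;
	unsigned int references;
} query_t;

typedef struct lookup {
	query_t *head;
} lookup_t;

static void
unlink_query(lookup_t *lookup, query_t *query) {
	query_t **pp = &lookup->head;

	while (*pp != NULL && *pp != query) {
		pp = &(*pp)->next;
	}
	if (*pp == query) {
		*pp = query->next;
		query->next = NULL;
	}
}

void
query_detach(lookup_t *lookup, query_t **queryp) {
	query_t *query = *queryp;

	*queryp = NULL;
	assert(query->references > 0);

	if (--query->references == 0) {
		/* Last reference: only now is it safe to unlink. */
		unlink_query(lookup, query);
		/* The real code would destroy the object here. */
	}
}
```

While other references remain, query->next is still intact, so a failed lookup can move on to the next query in the list.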
When the keymgr needs to create new keys, it is possible it needs to
create multiple keys. The keymgr checks for keyid conflicts with
already existing keys, but it should also check against the keys that it
just created.
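The check can be sketched as follows. The function names and the use of plain uint16_t arrays are assumptions made for illustration; the actual keymgr works on key objects, not bare IDs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

static bool
id_in_use(const uint16_t *ids, size_t count, uint16_t id) {
	assert(ids != NULL || count == 0);

	for (size_t i = 0; i < count; i++) {
		if (ids[i] == id) {
			return true;
		}
	}
	return false;
}

/*
 * Hypothetical helper: a candidate key ID conflicts if it matches a key
 * that already existed or a key created earlier in the same run.
 */
bool
keyid_conflicts(const uint16_t *existing, size_t nexisting,
		const uint16_t *created, size_t ncreated, uint16_t id) {
	return id_in_use(existing, nexisting, id) ||
	       id_in_use(created, ncreated, id);
}
```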
GitLab CI pipelines do not currently include a Linux job that would have
GSSAPI support disabled. Add the "--without-gssapi" option to the
./configure invocation on Debian 9 to address that deficiency and also
to continuously test that build-time switch.
If "tkey-gssapi-credential" is set in the configuration and GSSAPI
support is not available, named will refuse to start. As the test
system framework does not support starting named instances
conditionally, ensure that "tkey-gssapi-credential" is only present in
named.conf if GSSAPI support is available.
as with TLS, the destruction of a client stream on failed read
needs to be conditional: if we reached failed_read_cb() as a
result of a timeout on a timer which has subsequently been
reset, the stream must not be closed.
Add a check to the "dnssec" system test which ensures that RRSIG(SOA)
RRsets present anywhere else than at the zone apex are automatically
removed after a zone containing such RRsets is loaded.
If there happens to be a RRSIG(SOA) that is not at the zone apex
for any reason it should not be considered as a stopping condition
for incremental zone signing.
the destruction of the socket in tls_failed_read_cb() needs to be
conditional; if reached due to a timeout on a timer that has
subsequently been reset, the socket must not be destroyed.
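A minimal sketch of the conditional destruction, assuming a simplified socket with a timer_running flag. The types, result codes and callbacks below are hypothetical stand-ins, not the netmgr definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real result codes are ISC_R_* values. */
typedef enum { R_SUCCESS, R_TIMEDOUT, R_EOF } result_t;

typedef struct sock {
	bool timer_running; /* true while a read timer is armed */
	bool destroyed;
} sock_t;

typedef void (*read_cb_t)(sock_t *sock, result_t result);

/* Demo callback: the caller keeps waiting, so it re-arms the timer. */
void
cb_reset_timer(sock_t *sock, result_t result) {
	if (result == R_TIMEDOUT) {
		sock->timer_running = true;
	}
}

/* Demo callback: the caller gives up and leaves the timer stopped. */
void
cb_give_up(sock_t *sock, result_t result) {
	(void)sock;
	(void)result;
}

void
failed_read(sock_t *sock, result_t result, read_cb_t cb) {
	assert(sock != NULL && cb != NULL);
	assert(result != R_SUCCESS);

	sock->timer_running = false; /* the timer that fired is stopped */
	cb(sock, result);

	if (result == R_TIMEDOUT && sock->timer_running) {
		return; /* the callback reset the timer: keep the socket */
	}
	sock->destroyed = true;
}
```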
this is similar in structure to the UDP timeout recovery test.
this commit adds a new mechanism to the netmgr test allowing the
listen socket to accept incoming TCP connections but never send
a response. this forces the client to time out on read.
when running read callbacks, if the event result is not ISC_R_SUCCESS,
the callback is always run asynchronously. this is a problem on timeout,
because there's no chance to reset the timer before the socket has
already been destroyed. this commit allows read callbacks to run
synchronously for both ISC_R_SUCCESS and ISC_R_TIMEDOUT result codes.
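the dispatch decision can be sketched as below. the res_t codes, the pending_t slot and run_read_cb() are hypothetical; the real netmgr enqueues an event on a worker rather than filling a single slot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for ISC_R_SUCCESS, ISC_R_TIMEDOUT, etc. */
typedef enum { RES_SUCCESS, RES_TIMEDOUT, RES_CANCELED } res_t;

typedef void (*readcb_fn)(void *arg, res_t result);

/* One deferred callback; stands in for an asynchronous event queue. */
typedef struct pending {
	readcb_fn cb;
	void *arg;
	res_t result;
	bool queued;
} pending_t;

/* Demo callback: counts how many times it has been invoked. */
void
count_cb(void *arg, res_t result) {
	(void)result;
	(*(int *)arg)++;
}

/* Returns true if the callback ran synchronously. */
bool
run_read_cb(pending_t *slot, readcb_fn cb, void *arg, res_t result) {
	assert(slot != NULL && cb != NULL);

	if (result == RES_SUCCESS || result == RES_TIMEDOUT) {
		/* Run now, so a timeout handler can reset the timer. */
		cb(arg, result);
		return true;
	}

	/* Any other result is deferred to run asynchronously. */
	slot->cb = cb;
	slot->arg = arg;
	slot->result = result;
	slot->queued = true;
	return false;
}
```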
this test sets up a server socket that listens for UDP connections
but never responds. the client will always time out; it should retry
five times before giving up.
The test spawns 4 parallel workers that keep adding, modifying and
deleting zones, while the main thread repeatedly checks whether rndc
status responds within a reasonable period.
While environment and timing issues may affect the test, in most
test cases the deadlock that was taking place before the fix used to
trigger in less than 7 seconds on a machine with at least 2 cores.
The following is a description of the steps that led to the deadlock:
1. `do_addzone` calls `isc_task_beginexclusive`.
2. `isc_task_beginexclusive` waits for (N_WORKERS - 1) halted tasks;
it blocks until those (no. workers - 1) workers halt.
```
isc_task_beginexclusive(isc_task_t *task0) {
	...
	while (manager->halted + 1 < manager->workers) {
		wake_all_queues(manager);
		WAIT(&manager->halt_cond, &manager->halt_lock);
	}
```
3. It is possible that in `task.c / dispatch()` a worker is running a
task event; if that event blocks, it will not allow this worker to
halt.
4. `do_addzone` acquires `LOCK(&view->new_zone_lock);`.
5. `rmzone` event is called from some worker's `dispatch()`; `rmzone`
blocks waiting for the same lock.
6. `do_addzone` calls `isc_task_beginexclusive`.
7. Deadlock triggered, since:
- `rmzone` is waiting for the lock.
- `isc_task_beginexclusive` is waiting for (no. workers - 1) to
be halted.
- Since the `rmzone` event is blocked, it won't allow the worker to halt.
To fix this, we updated the do_addzone code to call
isc_task_beginexclusive before the lock is acquired. Both the locking
and the isc_task_beginexclusive call are postponed to the nearest place
where they are required.
The same could happen with rndc modzone, so that was addressed as well.
Four named instances in the "nsupdate" system test have GSS-TSIG support
enabled. All of them currently use "tkey-gssapi-keytab". Configure two
of them with "tkey-gssapi-credential" to test that option.
As "tkey-gssapi-keytab" and "tkey-gssapi-credential" both provide the
same functionality, no test modifications are required. The difference
between the two options is that the value of "tkey-gssapi-keytab" is an
explicit path to the keytab file to acquire credentials from, while the
value of "tkey-gssapi-credential" is the name of the principal whose
credentials should be used; those credentials are looked up in the
keytab file expected by the Kerberos library, i.e. /etc/krb5.keytab by
default. The path to the default keytab file can be overridden by
setting the KRB5_KTNAME environment variable. Utilize that variable to
use existing keytab files with the "tkey-gssapi-credential" option.
The KRB5_KTNAME environment variable should not interfere with the
"tkey-gssapi-keytab" option. Nevertheless, rename one of the keytab
files used with "tkey-gssapi-keytab" to something other than the contents
of the KRB5_KTNAME environment variable in order to make sure that both
"tkey-gssapi-keytab" and "tkey-gssapi-credential" are actually tested.
This mostly removes stuff that's either deprecated, obsolete or not used
at all:
* Update the minimal autoconf version to 2.69
* AC_PROG_CC_C99 is deprecated, just use AC_PROG_CC as we require C11
anyway
* AC_HEADER_TIME is deprecated, both <sys/time.h> and <time.h> can be
included at the same time, and we don't use the macros that
AC_HEADER_TIME defines anywhere
* AC_HEADER_STDC checks for ISO C90 and we require at least C11
* Replace AC_TRY_*([]) with AC_*_IFELSE([AC_LANG_PROGRAM()])
* Update m4/ax_check_openssl.m4 from serial 10 to serial 11
* Update m4/ax_gcc_func_attribute.m4 from serial 10 to serial 13
* Update m4/ax_pthread.m4 from serial 24 to serial 30
* Add early AC_CANONICAL_TARGET call to prevent warning from AX_PTHREAD
This commit changes the taskmgr to run the individual tasks on the
netmgr internal workers. While an effort has been put into keeping the
taskmgr interface intact, a couple of changes have been made:
* The taskmgr has no concept of universal privileged mode - rather the
tasks are either privileged or unprivileged (normal). The privileged
tasks are run first when the netmgr is unpaused. There
are now four different queues in the netmgr:
1. priority queue - netievents on the priority queue are run even when
the taskmgr enters exclusive mode and the netmgr is paused. This is
needed to properly start listening on the interfaces, free
resources and resume.
2. privileged task queue - only privileged tasks are queued here and
this is the first queue that gets processed when the network manager
is unpaused using isc_nm_resume(). All netmgr workers need to
clean the privileged task queue before they all proceed to normal
operation. Both task queues are processed when the workers are
finished.
3. task queue - only (traditional) tasks are scheduled here and this
queue, along with the privileged task queue, is processed when the
netmgr workers are finishing. This is needed to process the task
shutdown events.
4. normal queue - this is the queue with netmgr events, e.g. reading,
sending, callbacks, and pretty much everything else is processed here.
* The isc_taskmgr_create() now requires an initialized netmgr (isc_nm_t)
object.
* The isc_nm_destroy() function now waits indefinitely, but it
will print out the active objects when in tracing mode
(-DNETMGR_TRACE=1 and -DNETMGR_TRACE_VERBOSE=1). The netmgr has been
made a little bit more asynchronous and it might take longer to
shut down all the active networking connections.
* Previously, the isc_nm_stoplistening() was a synchronous operation.
This has been changed and the isc_nm_stoplistening() just schedules
the child sockets to stop listening and exits. This was needed to
prevent a deadlock as the (traditional) tasks are now executed on
the netmgr threads.
* The socket selection logic in isc__nm_udp_send() was flawed, but
fortunately, it was broken, so we never hit the problem where we
created uvreq_t on a socket from nmhandle_t, but then a different
socket could be picked up and then we were trying to run the send
callback on a socket that had a different threadid than the currently
running thread.
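The queue rules stated in the list above can be summarized in a small sketch. The enum and function names are hypothetical, and only the three situations the list describes explicitly are modeled; which queues are drained during normal operation is deliberately left out.

```c
#include <assert.h>

/* Hypothetical bit flags, one per netmgr queue described above. */
enum {
	Q_PRIORITY = 1 << 0,
	Q_PRIVILEGED = 1 << 1,
	Q_TASK = 1 << 2,
	Q_NORMAL = 1 << 3
};

/* Hypothetical worker phases taken from the description. */
typedef enum {
	PHASE_PAUSED,   /* taskmgr in exclusive mode, netmgr paused */
	PHASE_RESUMING, /* right after isc_nm_resume() */
	PHASE_FINISHING /* netmgr workers are shutting down */
} phase_t;

/* Returns the set of queues the description says are processed. */
unsigned int
queues_for_phase(phase_t phase) {
	switch (phase) {
	case PHASE_PAUSED:
		return Q_PRIORITY; /* only priority events still run */
	case PHASE_RESUMING:
		return Q_PRIVILEGED; /* drained before normal operation */
	case PHASE_FINISHING:
		return Q_PRIVILEGED | Q_TASK; /* task shutdown events */
	}
	assert(0 && "unknown phase");
	return 0;
}
```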
When we are reading from the xfrin socket and the transfer is
shut down, the shutdown function calls `xfrin_fail()`, which in turn
calls `xfrin_cancelio()`. That causes the read callback to be invoked
with the `ISC_R_CANCELED` status code, which caused yet another
`xfrin_fail()` call.
The fix here is to ensure that `xfrin_fail()` is run only once,
using better synchronization on the xfr->shuttingdown flag.
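One common way to get "run only once" behavior from a flag is an atomic compare-and-exchange. The sketch below assumes that approach and uses C11 atomics with hypothetical names; the commit itself only says that synchronization on the flag was improved.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified transfer object. */
typedef struct xfr {
	atomic_bool shuttingdown;
	int fail_runs; /* demo counter: how often the failure path ran */
} xfr_t;

/* Returns true only for the single call that wins the flag. */
bool
xfr_fail_once(xfr_t *xfr) {
	bool expected = false;

	assert(xfr != NULL);

	/* Atomically flip false -> true; later callers see true and bail. */
	if (!atomic_compare_exchange_strong(&xfr->shuttingdown, &expected,
					    true)) {
		return false; /* already shutting down */
	}

	/* Cancelling I/O here may re-enter this function; that is now safe. */
	xfr->fail_runs++;
	return true;
}
```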
Since all the libraries are internal now, just clean up the ISCAPI remnants
in the isc_socket, isc_task and isc_timer APIs. This means there's one less
layer, as the following changes have been made:
* struct isc_socket and struct isc_socketmgr have been removed
* struct isc__socket and struct isc__socketmgr have been renamed
to struct isc_socket and struct isc_socketmgr
* struct isc_task and struct isc_taskmgr have been removed
* struct isc__task and struct isc__taskmgr have been renamed
to struct isc_task and struct isc_taskmgr
* struct isc_timer and struct isc_timermgr have been removed
* struct isc__timer and struct isc__timermgr have been renamed
to struct isc_timer and struct isc_timermgr
* All the associated code that dealt with typing isc_<foo>
to isc__<foo> and back has been removed.