This change makes the zone table lock-free for reads. Previously, the
zone table used a red-black tree, which is not thread safe, so the hot
read path acquired both the per-view mutex and the per-zonetable
rwlock. (The double locking was to fix to cleanup races on shutdown.)
One visible difference is that zones are not necessarily shut down
promptly: it depends on when the qp-trie garbage collector cleans up
the zone table. The `catz` system test checks several times that zones
have been deleted; the test now checks for zones to be removed from
the server configuration, instead of being fully shut down. The catz
test does not churn through enough zones to trigger a gc, so the zones
are not fully detached until the server exits.
After this change, it is still possible to improve the way we handle
changes to the zone table, for instance, batching changes, or better
compaction heuristics.
Instead of explicitly adding a reference to catzs (catalog zones) when
calling the update callback, attach the catzs to the catz (catalog zone)
object to keep it referenced for the whole time the catz exists.
The isc_time_now() and isc_time_now_hires() were used inconsistently
through the code - either with status check, or without status check,
or via TIME_NOW() macro with RUNTIME_CHECK() on failure.
Refactor the isc_time_now() and isc_time_now_hires() to always fail when
getting current time has failed, and return the isc_time_t value as
return value instead of passing the pointer to result in the argument.
The dns__catz_update_cb() function was earlier updated (see
d2ecff3c4a0d961041b860515858d258d40462d7) to use a separate
'dns_db_t' object ('catz->updb' instead of 'catz->db') to
avoid a race between the 'dns__catz_update_cb()' and
'dns_catz_dbupdate_callback()' functions, but the 'REQUIRE'
check there still checks the validity of the 'catz->db' object.
Fix the omission.
This should delay the catalog zone from being destroyed during
shutdown, if the update process is still running.
Doing this should not introduce significant shutdown delays, as
the update function constantly checks the 'shuttingdown' flag
and cancels the process if it is set.
As it is done in the RPZ module, use 'db' and 'dbversion' for the
database we are going to update to, and 'updb' and 'updbversion' for
the database we are working on.
Doing this should avoid a race between the 'dns__catz_update_cb()' and
'dns_catz_dbupdate_callback()' functions.
When detaching from the previous version of the database, make sure
that the update-notify callback is unregistered, otherwise there is
an INSIST check which can generate an assertion failure in free_rbtdb(),
which checks that there are no outstanding update listeners in the list.
There is a similar code already in place for RPZ.
The dbiterator read-locks the whole zone and it stayed locked during
whole processing time when catz is being read. Pause the iterator, so
the updates to catz zone are not being blocked while processing the catz
update.
Instead of holding the catzs->lock the whole time we process the catz
update, only hold it for hash table lookup and then release it. This
should unblock any other threads that might be processing updates to
catzs triggered by extra incoming transfer.
Offload catalog zone processing so that the network manager threads
are not interrupted by a large catalog zone update.
Introduce a new 'updaterunning' state alongside with 'updatepending',
like it is done in the RPZ module.
Note that the dns__catz_update_cb() function currently holds the
catzs->lock during the whole process, which is far from being optimal,
but the issue is going to be addressed separately.
This change should make sure that catalog zone update processing
doesn't happen when the catalog zone is being shut down. This
should help avoid races when offloading the catalog zone updates
in the follow-up commit.
* Change 'dns_catz_new_zones()' function's prototype (the order of the
arguments) to synchronize it with the similar function in rpz.c.
* Rename 'refs' to 'references' in preparation of ISC_REFCOUNT_*
macros usage for reference tracking.
* Unify dns_catz_zone_t naming to catz, and dns_catz_zones_t naming to
catzs, following the logic of similar changes in rpz.c.
* Use C compound literals for structure initialization.
* Synchronize the "new zone version came too soon" log message with the
one in rpz.c.
* Use more of 'sizeof(*ptr)' style instead of the 'sizeof(type_t)' style
expressions when allocating or freeing memory for 'ptr'.
as there is no further use of isc_task in BIND, this commit removes
it, along with isc_taskmgr, isc_event, and all other related types.
functions that accepted taskmgr as a parameter have been cleaned up.
as a result of this change, some functions can no longer fail, so
they've been changed to type void, and their callers have been
updated accordingly.
the tasks table has been removed from the statistics channel and
the stats version has been updated. dns_dyndbctx has been changed
to reference the loopmgr instead of taskmgr, and DNS_DYNDB_VERSION
has been udpated as well.
DSCP has not been fully working since the network manager was
introduced in 9.16, and has been completely broken since 9.18.
This seems to have caused very few difficulties for anyone,
so we have now marked it as obsolete and removed the
implementation.
To ensure that old config files don't fail, the code to parse
dscp key-value pairs is still present, but a warning is logged
that the feature is obsolete and should not be used. Nothing is
done with configured values, and there is no longer any
range checking.
When isc_buffer_t buffer is created with isc_buffer_allocate() assume
that we want it to always auto-reallocate instead of having an extra
call to enable auto-reallocation.
The isc_buffer_putdecint() could be easily replaced with
isc_buffer_printf() with just a small overhead of calling vsnprintf()
twice instead once. This is not on a hot-path (dns_catz unit), so we
can ignore the overhead and instead have less single-use code in favor
of using reusable more generic function.
The isc_buffer_reserve() would be passed a reference to the buffer
pointer, which was unnecessary as the pointer would never be changed
in the current implementation. Remove the extra dereference.
The dns_catz_update_from_db() function prints serial number as a signed
number (with "%d" in the format string), but the `vers` variable's type
is 'uint32_t'. This breaks serials bigger than 2^31.
Use PRIu32 instead of "d" in the format string.
Extract the tlss values if present from the ipkeylist entry and add
the resulting tls setting to the constructed configuration for the
primary.
When comparing catalog zone entries for reuse also check the
masters.tlss values for equality.
Add new semantic patch to replace the straightfoward uses of:
ptr = isc_mem_{get,allocate}(..., size);
memset(ptr, 0, size);
with the new API call:
ptr = isc_mem_{get,allocate}x(..., size, ISC_MEM_ZERO);
When looking for changes in a catalog zone member zone we need to
also check if the TSIG key name associated with a primary server
has be added, removed or changed.
Instead of creating the catalog zone deferred update timer when creating
the catalog zone object, create it on demand on the current loop and
destroy it as soon as the timer has finished its job. There's a
side-effect - the processing of the catalog zone update is now done on
the current loop - previously, it was always on the main loop.
Previously:
* applications were using isc_app as the base unit for running the
application and signal handling.
* networking was handled in the netmgr layer, which would start a
number of threads, each with a uv_loop event loop.
* task/event handling was done in the isc_task unit, which used
netmgr event loops to run the isc_event calls.
In this refactoring:
* the network manager now uses isc_loop instead of maintaining its
own worker threads and event loops.
* the taskmgr that manages isc_task instances now also uses isc_loopmgr,
and every isc_task runs on a specific isc_loop bound to the specific
thread.
* applications have been updated as necessary to use the new API.
* new ISC_LOOP_TEST macros have been added to enable unit tests to
run isc_loop event loops. unit tests have been updated to use this
where needed.
* isc_timer was rewritten using the uv_timer, and isc_timermgr_t was
completely removed; isc_timer objects are now directly created on the
isc_loop event loops.
* the isc_timer API has been simplified. the "inactive" timer type has
been removed; timers are now stopped by calling isc_timer_stop()
instead of resetting to inactive.
* isc_manager now creates a loop manager rather than a timer manager.
* modules and applications using isc_timer have been updated to use the
new API.
When processing a catalog zone update, skip processing records with
DNSSEC-related and ZONEMD types, because we are not interested in them
in the context of a catalog zone, and processing them will fail and
produce an unnecessary warning message.
Previously, tasks could be created either unbound or bound to a specific
thread (worker loop). The unbound tasks would be assigned to a random
thread every time isc_task_send() was called. Because there's no logic
that would assign the task to the least busy worker, this just creates
unpredictability. Instead of random assignment, bind all the previously
unbound tasks to worker 0, which is guaranteed to exist.
After removing the isc_task_onshutdown(), the isc_task_shutdown() and
isc_task_destroy() became obsolete.
Remove calls to isc_task_shutdown() and replace the calls to
isc_task_destroy() with isc_task_detach().
Simplify the internal logic to destroy the task when the last reference
is removed.
The DNS catalog zones draft version 5 document requires that catalog
zones consumers must reset the member zone's internal zone state when
its unique label changes (either within the same catalog zone or
during change of ownership performed using the "coo" property).
BIND already behaves like that, and, in fact, doesn't support keeping
the zone state during change of ownership even if the unique label
has been kept the same, because BIND always removes the member zone
and adds it back during unique label renaming or change of ownership.
Document the described behavior and add a log message to inform when
unique label renaming occurs.
Add a system test case with unique label renaming.
We check the `rdclass` to be of type IN in `dns_catz_update_process()`
function, and all the other static functions where similar checks exist
are called after (and in the result of) that function being called,
so they are effectively redundant.
The DNS catalog zones draft version 5 document describes various
situations when a catalog zones must be considered as "broken" and
not be processed.
Implement those checks in catz.c and add corresponding system tests.
Catalog zones change of ownership is special mechanism to facilitate
controlled migration of a member zone from one catalog to another.
It is implemented using catalog zones property named "coo" and is
documented in DNS catalog zones draft version 5 document.
Implement the feature using a new hash table in the catalog zone
structure, which holds the added "coo" properties for the catalog zone
(containing the target catalog zone's name), and the key for the hash
table being the member zone's name for which the "coo" property is being
created.
Change some log messages to have consistent zone name quoting types.
Update the ARM with change of ownership documentation and usage
examples.
Add tests which check newly the added features.