Due to a typo in the code, ADB entries were unlinked from their entry
buckets during shutdown if they had a nonzero reference count. They
were only supposed to be unlinked if the reference count was exactly
one (that being the reference held by the bucket itself).
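The difference between the two conditions can be sketched in a few lines.
This is only a model: the `Entry` class, the function names and the
bucket-as-list representation are hypothetical and are not taken from the
ADB code; the exact form of the typo is not stated above.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical stand-in for an ADB entry."""
    name: str
    refcount: int  # includes the reference held by the bucket itself


def buggy_should_unlink(entry: Entry) -> bool:
    # Modelled faulty condition: any nonzero count triggers unlinking,
    # even when other holders still reference the entry.
    return entry.refcount != 0


def fixed_should_unlink(entry: Entry) -> bool:
    # Intended condition: only the bucket's own reference remains.
    return entry.refcount == 1


def shutdown_bucket(bucket, should_unlink):
    """Split a bucket into (kept, unlinked) entries for a given condition."""
    kept = [e for e in bucket if not should_unlink(e)]
    unlinked = [e for e in bucket if should_unlink(e)]
    return kept, unlinked
```

With one idle entry (count 1) and one busy entry (count 3), the faulty
condition unlinks both, while the intended one unlinks only the idle entry.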
Implement TCP support in the `ans11` Python-based DNS server.
Implement a control command channel in `ans11` to support an optional
silent mode of operation, which, when enabled, will ignore incoming
queries.
In the added check, make `ans11` the NS server of
"a.root-servers.nil." for `ns3`, so it uses `ans11` (in silent mode)
for the regular (non-forwarded) name resolutions.
This will trigger the "hung fetch" scenario, which was causing `named`
to crash.
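For readers unfamiliar with what TCP support involves in a test server,
here is a minimal sketch of the two pieces described above. DNS over TCP
prefixes each message with a two-byte, network-order length field; the
helpers below handle only that framing. The `SilentSwitch` class and its
"silent on"/"silent off" command strings are hypothetical and do not
reflect the actual `ans11` control protocol.

```python
import struct


def frame_tcp_message(wire: bytes) -> bytes:
    """Prefix a DNS message with its 2-byte big-endian length."""
    if len(wire) > 0xFFFF:
        raise ValueError("DNS message too large for TCP framing")
    return struct.pack("!H", len(wire)) + wire


def split_tcp_messages(buffer: bytes):
    """Return (complete_messages, leftover_bytes) from a receive buffer."""
    messages = []
    offset = 0
    while len(buffer) - offset >= 2:
        (length,) = struct.unpack_from("!H", buffer, offset)
        if len(buffer) - offset - 2 < length:
            break  # message not fully received yet
        start = offset + 2
        messages.append(buffer[start:start + length])
        offset = start + length
    return messages, buffer[offset:]


class SilentSwitch:
    """Hypothetical control-channel state: drop queries while silent."""

    def __init__(self) -> None:
        self.silent = False

    def handle_control(self, command: str) -> None:
        # Command strings are assumptions made for this sketch only.
        if command == "silent on":
            self.silent = True
        elif command == "silent off":
            self.silent = False

    def should_answer(self) -> bool:
        return not self.silent
```

A server loop would call `should_answer()` before replying; while silent,
it reads the query and sends nothing back, which leaves the resolver's
fetch pending.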
- Check that an NS in an authority section returned from a forwarder
which is above the name in a configured "forward first" or "forward
only" zone (i.e., net/NS in a response from a forwarder configured for
local.net) is not cached.
- Test that a DNAME for a parent domain will not be cached when sent
in a response from a forwarder configured to answer for a child.
- Check that glue is rejected if its name falls below that of a zone
configured locally.
- Check that extra out-of-bailiwick data in the answer section is
not cached (this was already working correctly, but was not explicitly
tested before).
The mctx, zonetask and loadtask pools were being destroyed in the
shutdown function, where in theory a dangling zone could still be
attached to them.
Move the isc_mem_put() on the pools to the destroy() function.
There are a couple of files that modify the behaviour of named when
started via bin/tests/system/start.pl. Add those files as CC-1.0 to
.reuse/dep5 as they are just empty placeholders.
Add a test case to check for lingering TCP sockets stuck in the
CLOSE_WAIT state. This can happen if a client sends some garbage after
its first query.
The system test runs the reproducer script and then sends another TCP
query to the resolver. The resolver is configured to allow one TCP
client only. If BIND has its TCP socket stuck in CLOSE_WAIT, it does
not have the resources available to answer the second query.
Note: A better test would be to check, for example with netstat, that
the named daemon does not have a TCP socket stuck in CLOSE_WAIT. When
running this test locally you can examine named with netstat manually.
But since netstat is platform-specific, it is not a good candidate for
use in a system test.
If you, if you could return, don't let it burn.
Do you have to let it linger?
- Cranberries
This allows GitLab to show a nice summary for individual tests/test
directories and to expose the results in the GitLab API for consumption
elsewhere.
A catch: As of GitLab 14.7.7, the detailed results are stored
only in artifacts and thus expire. All consumers (including the API) need
to be "fast enough" to get the data before they disappear.
This also forces us to always store the artifacts instead of storing them
only on failure.
There are a couple of problems with dns_request_createvia(): a UDP
retry count of zero means unlimited retries (it should mean no
retries), and the overall request timeout is not enforced. The
combination of these bugs means that requests can be retried forever.
This change alters calls to dns_request_createvia() to avoid the
infinite retry bug by providing an explicit retry count. Previously,
the calls specified infinite retries and relied on the limit implied
by the overall request timeout and the UDP timeout (which did not work
because the overall timeout is not enforced). The `udpretries`
argument is also changed to be the number of retries; previously, zero
was interpreted as infinity because of an underflow to UINT_MAX, which
appeared to be a mistake. And `mdig` is updated to match the change in
retry accounting.
The bug could be triggered by zone maintenance queries, including
NOTIFY messages, DS parental checks, refresh SOA queries and stub zone
nameserver lookups. It could also occur with `nsupdate -r 0`.
(But `mdig` had its own code to avoid the bug.)
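The underflow can be illustrated with a small model. This is a sketch
under stated assumptions, not the dns_request code: it assumes the old
logic decremented an unsigned 32-bit counter before comparing it with
zero, and it uses a hypothetical `cap` parameter so that the "infinite"
case terminates.

```python
UINT_MAX = 0xFFFFFFFF


def attempts_with_underflow(udpretries: int, cap: int = 10) -> int:
    """Model of the old accounting: decrement, then compare with zero.

    Returns the number of UDP sends, capped at `cap` so the
    unlimited case can be observed without looping forever.
    """
    counter = udpretries & UINT_MAX
    sent = 1  # the initial transmission
    while sent < cap:
        counter = (counter - 1) & UINT_MAX  # 0 - 1 wraps to UINT_MAX
        if counter == 0:
            break
        sent += 1
    return sent


def attempts_fixed(udpretries: int, cap: int = 10) -> int:
    """Model of the new accounting: one send plus `udpretries` retries."""
    return min(1 + udpretries, cap)
```

In this model, `attempts_with_underflow(0)` runs until the cap is reached,
whereas `attempts_fixed(0)` sends exactly once. It also shows why callers
had to be adjusted: under the old accounting a value of 3 produced three
sends in total, while under the new accounting it produces four.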
Implement reference counting for TLS contexts, Resolve #3122: DoT stops working after "rndc reconfigure" when running named as non-root
Closes #3122
See merge request isc-projects/bind9!6087
This commit makes use of isc_nmsocket_set_tlsctx(). Now, instead of
recreating TLS-enabled listeners (including the underlying TCP
listener sockets), only the TLS context in use is replaced.
This commit adds isc_nmsocket_set_tlsctx() - an asynchronous function
that replaces the TLS context within a given TLS-enabled listener
socket object. It is based on the newly added reference counting
functionality.
The function makes it possible to replace a TLS context without
recreating the whole socket object, including the underlying TCP
listener socket, as a BIND process might not have enough permissions
to re-create it fully on reconfiguration.
The implementation is done on top of the reference counting
functionality found in OpenSSL/LibreSSL, which avoids the need to
wrap the object.
Adding this function allows using reference counting for TLS contexts
in BIND 9's codebase.
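The interplay between reference counting and replacing the context on a
live listener can be sketched as follows. All class and method names here
are hypothetical, the sketch is synchronous (the real
isc_nmsocket_set_tlsctx() is described above as asynchronous), and it
does not touch OpenSSL at all; it only models who holds a reference when.

```python
class RefCountedContext:
    """Hypothetical stand-in for a reference-counted TLS context."""

    def __init__(self, label: str) -> None:
        self.label = label
        self.refs = 1  # the creator's reference
        self.freed = False

    def attach(self) -> "RefCountedContext":
        assert not self.freed, "attach after free"
        self.refs += 1
        return self

    def detach(self) -> None:
        assert self.refs > 0, "detach without reference"
        self.refs -= 1
        if self.refs == 0:
            self.freed = True  # the last holder releases the context


class Connection:
    """An accepted connection keeps the context it was created with."""

    def __init__(self, ctx: RefCountedContext) -> None:
        self.ctx = ctx.attach()

    def close(self) -> None:
        self.ctx.detach()


class Listener:
    """A listener whose context can be swapped without re-creating it."""

    def __init__(self, ctx: RefCountedContext) -> None:
        self.ctx = ctx.attach()

    def accept(self) -> Connection:
        return Connection(self.ctx)

    def set_tlsctx(self, new_ctx: RefCountedContext) -> None:
        old = self.ctx
        self.ctx = new_ctx.attach()  # take the new reference first
        old.detach()                 # then drop the old one
```

Connections accepted before the swap keep the old context alive until
they close; connections accepted afterwards use the new one, and the
listener object itself is never re-created.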
After some back and forth, it was decided to match the configuration
option with unbound ("so-reuseport"), PowerDNS ("reuseport") and/or
nginx ("reuseport").
As far as I can determine, the order of operations is not important.
*** CID 351372: Concurrent data access violations (ATOMICITY)
/lib/isc/timer.c: 227 in timer_purge()
221 LOCK(&timer->lock);
222 if (!purged) {
223 /*
224 * The event has already been executed, but not
225 * yet destroyed.
226 */
>>> CID 351372: Concurrent data access violations (ATOMICITY)
>>> Using an unreliable value of "event" inside the second locked section. If the data that "event" depends on was changed by another thread, this use might be incorrect.
227 timerevent_unlink(timer, event);
228 }
229 }
230 }
231
232 void
*** CID 351371: Null pointer dereferences (REVERSE_INULL)
/lib/dns/adb.c: 2615 in dns_adb_createfind()
2609 /*
2610 * Copy out error flags from the name structure into the find.
2611 */
2612 find->result_v4 = find_err_map[adbname->fetch_err];
2613 find->result_v6 = find_err_map[adbname->fetch6_err];
2614
>>> CID 351371: Null pointer dereferences (REVERSE_INULL)
>>> Null-checking "find" suggests that it may be null, but it has already been dereferenced on all paths leading to the check.
2615 if (find != NULL) {
2616 if (want_event) {
2617 INSIST((find->flags & DNS_ADBFIND_ADDRESSMASK) != 0);
2618 isc_task_attach(task, &(isc_task_t *){ NULL });
2619 find->event.ev_sender = task;
2620 find->event.ev_action = action;
The error code path handling the `ISC_R_CANCELED` code lacks a
`clear_current_lookup()` call, without which dig hangs indefinitely
when handling the error.
Add the missing call to account for all references to the lookup so
it can be destroyed.