The SOA lookup for edns512 could succeed if the negative response
for ns.edns512/AAAA completed before all the edns512/SOA query
attempts are made. The ns.edns512/AAAA lookup returns tc=1 and
the SOA record is cached after processing the NODATA response.
Lookup a TXT record at edns512 and look it up instead of the
SOA record.
Removed 'checking that TCP failures do not influence EDNS statistics
in the ADB' as it is no longer appropriate.
there were two failures during observed in testing, both occurring
when 'rndc halt' was run rather than 'rndc stop' - the latter dumps
zone contents to disk and presumably introduced enough delay to
prevent the races:
- a failure when the zone was shut down and called dns_xfrin_detach()
before the xfrin had finished connecting; the connect timeout
terminated without detaching its handle
- a failure when the tcpdns socket timer fired after the outerhandle
had already been cleared.
this commit incidentally addresses a failure observed in mutexatomic
due to a variable having been initialized incorrectly.
This commit extends the perl Configure script to also check for libssl
in addition to libcrypto and change the vcxproj source files to link
with both libcrypto and libssl.
Previously, the xfrin object relied on four different reference counters
(`refs`, `connects`, `sends`, `recvs`) and destroyed the xfrin object
only if all of them were zero. This commit reduces the reference
counting only to the `references` (renamed from `refs`) counter. We
keep the existing `connects`, `sends` and `recvs` as safe guards, but
they are not formally needed.
since the network manager is now handling timeouts, xfrin doesn't
need an isc_task object.
it may be necessary to revert this later if we find that it's
important for zone_xfrdone() to be executed in the zone task context.
currently things seem to be working well without that, though.
socket() call can return an error - e.g. EMFILE, so we need to handle
this nicely and not crash.
Additionally wrap the socket() call inside a platform independent helper
function as the Socket data type on Windows is unsigned integer:
> This means, for example, that checking for errors when the socket and
> accept functions return should not be done by comparing the return
> value with –1, or seeing if the value is negative (both common and
> legal approaches in UNIX). Instead, an application should use the
> manifest constant INVALID_SOCKET as defined in the Winsock2.h header
> file.
The recv_done() callback had many exit paths with different conditions,
and every path had it's own set of destructors. The refactored code now
has unified exit path with descriptive goto labels matching the intent:
- cancel_lookup
- next_lookup
- detach_query
- keep_query
The only exception to the rule is check_for_more_data() path, where the
part of the query gets reused, so the query->readhandle and query gets
detached on it's own, and by going to the keep_query, we are just
skipping calling the destructors again.
Because we use result earlier for setting the loadbalancing on the
socket, we could be left with a ISC_R_NOTIMPLEMENTED value stored in the
variable and when the UDP connection would succeed, we would
errorneously return this value instead of ISC_R_SUCCESS.
FreeBSD sometimes returns spurious errors in UDP connect() attempts,
so we try a few times before giving up. However, each failed attempt
triggers a call to udp_ready() in dighost.c, and that was causing
the query object to be detached prematurely.
Sometimes, the dig_lookup_t could be destroyed before the final
send_done() callback was be called, leading to dereferencing an
already freed dig_lookup_t object. By making the dig_lookup_t
reference counted, we are ensuring that it won't be freed until
the last reference (from dig_query_t .lookup) is released.
one of the tests in the resolver system test depends on dig
getting no response to its first two query attempts, and SERVFAIL
on the third after resolution times out.
using a 5-second retry timer in dig means the SERVFAIL response
could occur while dig is discarding the second query and preparing
to send the third. in this case the server's response could be
missed. shortening the retry interval to 4 seconds ensures that
dig has already sent the third query when the SERVFAIL response
arrives.
also, the serve-stale system test could fail due to a race in which
it timed out after waiting ten seconds for a file to be written, and
the dig timeout was just a bit longer. this is addressed by extending
the dig timeout to 11 seconds for this test.
this function sets the read timeout for the socket associated
with a netmgr handle and, if the timer is running, resets it.
for TCPDNS sockets it also sets the read timeout and resets the
timer on the outer TCP socket.
because dig now uses the netmgr, printing of response messages
happens in a different thread than setup. the IDN output filtering
procedure, which set using dns_name_settotextfilter(), is stored as
thread-local data, and so if it's set during setup, it won't be
accessible when printing. we now set it immediately before printing,
in the same thread, and clear it immedately afterward.
The network manager does not support returning UDP datagrams to
clients from unexpected sources; it is therefore not possible for
dig to accept them. The "+[no]unexpected" option has therefore
been removed from the dig command and its documentation.
As of libuv 1.36.0, CMake is the only supported build method for libuv
on Windows. Account for that fact by adjusting the relevant paths and
DLL file names used in the win32utils/Configure script. Update
Windows-specific documentation accordingly.
Our GitLab Runner Custom executor scripts now use the "image" key for
determining the Windows Docker image to use for a given CI job. Update
.gitlab-ci.yml to reflect that change.
In order for a "fast-expire/IN: response-policy zone expired" message to
be logged in ns3/named.run, the "fast-expire" zone must first be
transferred in by that server. However, with unfavorable timing, ns3
may be stopped before it manages to fetch the "fast-expire" zone from
ns5 and after the latter has been reconfigured to no longer serve that
zone. In such a case, the "rpz" system test will report a false
positive for the relevant check. Prevent that from happening by
ensuring ns3 manages to transfer the "fast-expire" zone before getting
shut down.