mir/bind

mirror of https://gitlab.isc.org/isc-projects/bind9 synced 2025-08-31 14:35:26 +00:00

Author	SHA1	Message	Date
Petr Špaček	ba861f23f2	Remove unused DNS_RDATASET_ORDER Related: #4666	2025-07-10 11:17:19 +02:00
Petr Špaček	ae600b0a95	Remove unused DNS_RDATASET_FIXED There was no way to define this in the build system. Related: #4666	2025-07-10 11:17:19 +02:00
Petr Špaček	750d8a61b6	Convert DNS_RDATASETATTR_ bitfield manipulation to struct of bools RRset ordering is now an enum inside struct rdataset attributes. This was done to keep the size of the structure at its original value before this MR. I expect zero performance impact, but it should be easier to deal with attributes in debuggers and language servers.	2025-07-10 11:17:19 +02:00
Arаm Sаrgsyаn	338bd67a10	fix: usr: Log dropped or slipped responses in the query-errors category Responses which were dropped or slipped because of RRL (Response Rate Limiting) were logged in the ``rate-limit`` category instead of the ``query-errors`` category, as documented in ARM. This has been fixed. Closes #5388 Merge branch '5388-rrl-log-category-fix' into 'main' See merge request isc-projects/bind9!10676	2025-07-10 08:56:09 +00:00
Aram Sargsyan	27e7961479	Log dropped or slipped responses in the query-errors category As mentioned in the comments block before the changed code block, the dropped or slipped responses should be logged in the query category (or rather query-errors category as done in lib/ns/client.c), so that requests are not silently lost. Also fix a couple of errors/typos in the code comments.	2025-07-10 08:20:17 +00:00
Alessio Podda	1d71e3b507	chg: dev: Improve efficiency of ns_client_t reset The ns_client_t struct is reset and zeroed out on every query, but some fields (query, message, manager) are preserved. We observe two things: - The sendbuf field is going to be overwritten anyway, there's no need to zero it out. - The fields are copied out when the struct is zero-ed out, and then copied back in. For the query field (which is 896 bytes) this is very inefficient. This commit makes the reset more efficient by avoiding the unnecessary zeroing and copying. Merge branch 'alessio/experimental-ns-client-noinit' into 'main' See merge request isc-projects/bind9!10463	2025-07-10 05:53:23 +00:00
Alessio Podda	e84704bd55	Improve efficiency of ns_client_t reset The ns_client_t struct is reset and zeroed out on every query, but some fields (query, message, manager) are preserved. We observe two things: - The sendbuf field is going to be overwritten anyway, there's no need to zero it out. - The fields are copied out when the struct is zeroed out, and then copied back in. For the query field (which is 896 bytes) this is very inefficient. This commit makes the reset more efficient by avoiding the unnecessary zeroing and copying.	2025-07-10 07:19:47 +02:00
Ondřej Surý	0c15da33e8	chg: dev: Increase the scalability in the ADB This MR reduces lock contention and increases scalability in the ADB by: a) Using the SIEVE algorithm instead of classical LRU; b) Replacing rwlocked isc_hashmap with RCU cds_lfht table; c) Replacing the single LRU table per-object with per-loop LRU tables per-object. Merge branch 'ondrej/use-urcu-lfht-for-ADB-tables' into 'main' See merge request isc-projects/bind9!10645	2025-07-09 23:19:56 +02:00
Ondřej Surý	031a3e65f8	Add doc/dev/LRU.md with per-loop LRU description Several compilation units now use per-loop LRU lists, add basic developers documentation on the design.	2025-07-09 21:54:49 +02:00
Ondřej Surý	cdeb8d1c14	Use cds_lfht for lock-free hashtables in dns_adb Replace the read-write locked isc_hashmap with lock-free cds_lfht hashtable and replace the singular LRU tables for ADB names and entries with per-thread LRU tables. These changes made it possible to remove all the read-write locking on the names and entries tables.	2025-07-09 21:22:48 +02:00
Ondřej Surý	cca4b26d31	Use regular reference counting macro for isc_nm_t structure Instead of having hand-crafted attach/detach/destroy functions, replace them with the standard ISC_REFCOUNT macro. This also has the advantage that a delayed netmgr detach (from dns_dispatch) now doesn't cause an assertion failure. This can happen with delayed (call_rcu) shutdown of dns_adb.	2025-07-09 21:22:48 +02:00
Ondřej Surý	51d7efbfb4	Print the memory context when printing overmem limits When printing the memory context going into or out of the overmem condition, also print the memory context name for easier debugging.	2025-07-09 21:22:48 +02:00
Ondřej Surý	7682bc21a9	Rewrite dns_adb LRU to SIEVE The dns_adb cleaning is a little bit muddled as it mixes the "TTL" based cleaning (.expire_v4 and .expire_v6 for adbname, .expires for adbentry) with overmem cleaning. Rewrite the LRU based cleaning to use the SIEVE algorithm and to be overmem cleaning only, with a requirement to always clean up at least twice the size of the newly added entry.	2025-07-09 21:22:47 +02:00
Alessio Podda	e0d1d936de	chg: dev: Replace per-zone lock buckets with global buckets Qpzone employs a locking strategy where rwlocks are grouped into buckets, and each zone gets 17 buckets. This strategy is suboptimal in two ways: - If named is serving a single zone or a zone is the majority of the traffic, this strategy pretty much guarantees contention when using more than a dozen threads. - If named is serving many small zones, it causes substantial memory usage. This commit switches the locking to a global table initialized at start time. This should have three effects: - Performance should improve in the single zone case, since now we are selecting from a bigger pool of locks. - Memory consumption should go down significantly in the many zone cases. - Performance should not degrade substantially in the many zone cases. The reason for this is that, while we could have substantially more zones than locks, we can query/edit only O(num threads) at the same time. So by making the global table much bigger than the expected number of threads, we can limit contention. Merge branch 'alessio/global-qpzone-lock-table' into 'main' See merge request isc-projects/bind9!10446	2025-07-09 14:17:02 +00:00
Alessio Podda	25daa047d4	Replace per-zone lock buckets with global buckets Qpzone employs a locking strategy where rwlocks are grouped into buckets, and each zone gets 17 buckets. This strategy is suboptimal in two ways: - If named is serving a single zone or a zone is the majority of the traffic, this strategy pretty much guarantees contention when using more than a dozen threads. - If named is serving many small zones, it causes substantial memory usage. This commit switches the locking to a global table initialized at start time. This should have three effects: - Performance should improve in the single zone case, since now we are selecting from a bigger pool of locks. - Memory consumption should go down significantly in the many zone cases. - Performance should not degrade substantially in the many zone cases. The reason for this is that, while we could have substantially more zones than locks, we can query/edit only O(num threads) at the same time. So by making the global table much bigger than the expected number of threads, we can limit contention.	2025-07-09 15:27:38 +02:00
Alessio Podda	512f1d3005	chg: dev: Extract the resigning heap into a separate struct In the current implementation, the resigning heap is part of the zone database. This leads to a cycle, as the database has a reference to its nodes, but each node needs a reference to the database. This MR splits the resigning heap into its own separate struct to help break the cycle. Merge branch 'alessio/split-qpzone-heap-from-qpdb' into 'main' See merge request isc-projects/bind9!10706	2025-07-09 11:05:52 +00:00
Alessio Podda	0b1785ec10	Extract the resigning heap into a separate struct In the current implementation, the resigning heap is part of the zone database. This leads to a cycle, as the database has a reference to its nodes, but each node needs a reference to the database. This MR splits the resigning heap into its own separate struct to help break the cycle.	2025-07-09 12:33:18 +02:00
Alessio Podda	c2a84bb17a	Abstract bucket lock selection logic Recovering the node lock from a pointer to the header and a pointer to the db is a common operation. This commit abstracts it away into a function, so that the node lock selection logic may be modified more easily.	2025-07-09 12:33:18 +02:00
Mark Andrews	720fa14670	fix: dev: Fix a possible crash when adding a zone while recursing A query for a zone that was not yet loaded may yield an unexpected result such as a CNAME or DNAME, triggering an assertion failure. This has been fixed. Closes #5357 Merge branch '5357-resume-qmin-cname' into 'main' See merge request isc-projects/bind9!10562	2025-07-09 10:55:28 +10:00
Petr Menšík	d2c6966232	Add a few extra WANT_QUERYTRACE logs into resume_qmin Optionally print a few more details that are not passed to the event when dns_view_findzonecut returns an unexpected result. The result would be visible later in foundevent, but the found fname would be lost, so print it into the log.	2025-07-09 10:13:29 +10:00
Petr Mensik	2fd3da54f9	Handle CNAME and DNAME in resume_qmin in a special way When an authoritative zone is loaded while a query minimization query for the same zone is already pending, the query might receive unexpected result codes. Normally DNS_R_CNAME would proceed to query_cname after processing sent events, but dns_view_findzonecut does not fill the CNAME target into event->foundevent. The usual lookup via query_lookup would always have that filled. Ideally we would restart the query with an unmodified search name if an unexpected change from recursing to a local zone cut were detected. Until dns_view_findzonecut is modified to export the zone/cache source of the cut, at least fail queries which went into an unexpected state.	2025-07-09 10:13:29 +10:00
Michal Nowak	f0ca86be7c	new: ci: Add AlmaLinux 10 Merge branch 'mnowak/add-almalinux-10' into 'main' See merge request isc-projects/bind9!10682	2025-07-08 15:59:27 +02:00
Michal Nowak	7c5c16ea6b	Do not add AlmaLinux 9 unit and system test in MR pipelines	2025-07-08 14:51:47 +02:00
Michal Nowak	42367082cc	Add AlmaLinux 10	2025-07-08 14:51:47 +02:00
Michał Kępień	28226f979a	fix: pkg: Fix named-makejournal man page installation The man page for :iscman:`named-makejournal` was erroneously not installed when building from a source tarball. This has been fixed. See #5379 Merge branch '5379-fix-named-makejournal-man-page-installation' into 'main' See merge request isc-projects/bind9!10709	2025-07-08 14:13:33 +02:00
Aydın Mercan	ccae13b482	Add missing files for meson built manpages These manual entries still get built and installed but get excluded from meson's rebuild detection.	2025-07-08 13:44:03 +03:00
Michał Kępień	caa0451e28	Fix named-makejournal man page installation The man page for named-makejournal is erroneously not installed when building from a source tarball. Add that man page to the appropriate lists in the build system so that it is installed both when building from a Git repository and from a source tarball.	2025-07-08 13:44:03 +03:00
Michal Nowak	8936237ef6	fix: ci: Ensure PYTHON is set for every parse_tsan.py invocation System tests' after_script missed the PYTHON environment variable setup. $ find -name 'tsan.*' -exec "$PYTHON" util/parse_tsan.py {} \; find: '': No such file or directory Merge branch 'mnowak/fix-parse_tsan-invocation' into 'main' See merge request isc-projects/bind9!10683	2025-07-08 12:21:47 +02:00
Michal Nowak	8f858c4f03	Ensure PYTHON is set for every parse_tsan.py invocation System tests' after_script missed the PYTHON environment variable setup. $ find -name 'tsan.*' -exec "$PYTHON" util/parse_tsan.py {} \; find: '': No such file or directory	2025-07-08 11:05:00 +02:00
Ondřej Surý	754d17590e	fix: usr: Clean enough memory when adding new ADB names/entries under memory pressure The ADB memory cleaning is opportunistic even when we are under memory pressure (in the overmem condition). Split the opportunistic LRU cleaning and overmem cleaning and make the overmem cleaning always clean up double the size of the newly allocated adbname/adbentry to ensure we never allocate more memory than the assigned limit. Merge branch 'ondrej/enforce-memory-cleanup-in-ADB-when-overmem' into 'main' See merge request isc-projects/bind9!10637	2025-07-08 09:49:30 +02:00
Ondřej Surý	eb0ffa0d5f	When overmem, clean enough memory when adding new ADB names/entries The purge_stale_names()/purge_stale_entries() is opportunistic even when we are under memory pressure (overmem). Split the opportunistic LRU cleaning and overmem cleaning. This makes the stale purging much simpler as we don't have to try that hard and makes the overmem cleaning always clean up double the amount of the newly allocated ADB name/entry.	2025-07-08 05:56:19 +02:00
Mark Andrews	8420adf218	chg: usr: use native shared library extension Use the native shared library extension when building loadable libraries. For most platforms this is ".so" but for Darwin it is ".dylib". Closes #5375 Merge branch '5375-use-native-shared-library-extension' into 'main' See merge request isc-projects/bind9!10588	2025-07-08 01:24:40 +10:00
Mark Andrews	28a8933690	Use native shared library extension For most platforms this is ".so" but for Darwin it is ".dylib".	2025-07-07 23:39:44 +10:00
Nicki Křížek	02d9fbfe26	chg: test: Improve system test stability Tweak various system tests which have been unstable in the past weeks. Closes #5406 Merge branch 'nicki/improve-system-test-stability' into 'main' See merge request isc-projects/bind9!10690	2025-07-07 14:04:10 +02:00
Nicki Křížek	b98660e93e	Remove unstable check from digdelv test The code which checks for both IPv4 and IPv6 mixed usage is inherently unstable, since the address family is chosen randomly for each connection. Closes #5406	2025-07-07 13:29:15 +02:00
Nicki Křížek	4c487c811d	Use pytest.mark.flaky as the flaky marker It's possible to use pytest.mark.flaky, which achieves the exact same thing as our custom-defined isctest.mark.flaky: it attempts to rerun the test on failure, but only if the flaky package is available.	2025-07-07 13:29:15 +02:00
Nicki Křížek	126a59cef2	Mark secondary.kasp test case as flaky on freebsd13 The test_kasp_case[secondary.kasp] can sometimes fail on freebsd13. It appears the test gets stuck on some operation which should be very quick, but for some reason takes at least a few seconds, causing the cb_ixfr_is_signed() function to time out. In one of the cases I investigated, it wasn't a query/response that caused a timeout, but rather some operation in between. The test attempts to read from a keyfile/statefile, but I see no reason why that should block. In any case, try to increase the timeout for the verification, as that shouldn't hurt. Also allow the test to be re-run on freebsd13, as it's likely to be caused by some odd behaviour on that platform -- the issue doesn't appear anywhere else.	2025-07-07 13:29:15 +02:00
Nicki Křížek	34867e1693	Allow dnstap system test rerun on freebsd13 The check "unix socket message counts" sometimes fails with "dnstap output file smaller than expected". This only happens on freebsd13 and can't be reproduced easily. There was an attempt to decrease the required file size in the past, but apparently, the issue can still occur.	2025-07-07 13:29:15 +02:00
Nicki Křížek	1e0df480c7	Mark the serve_stale system test as flaky The serve_stale test has some inherent instabilities affecting many different checks. While the failure rate isn't too high (about four failures in the past three weeks of nightlies), it gets ignored, because the test has been unstable for a very long time.	2025-07-07 13:29:15 +02:00
Nicki Křížek	6755d741e4	Remove token deletion check in keyfromlabel test This removes a leftover check which should've been removed in a prior change (see #5244). The softhsm2 failures when attempting to delete the token should be ignored.	2025-07-07 13:29:15 +02:00
Nicki Křížek	87ab198b73	Use proper wait in rndc test Previously, the one-second sleep was unreliable, as it didn't properly indicate that the rndc reconfig has been processed. The "test 'rndc reconfig' with a broken config" check would sometimes fail under TSAN in CI, because the previous rndc reconfig was still ongoing, and the subsequent rndc reconfig was ignored.	2025-07-07 13:29:15 +02:00
Nicki Křížek	66f6f4bba9	Allow reruns for test_json and test_xml tests These tests have been unstable under TSAN in the past, but it appears that the same failure mode can happen outside of TSAN tests as well. These tests have produced 12 failures combined in the past three weeks in nightlies.	2025-07-07 13:29:02 +02:00
Nicki Křížek	ae932eefc5	Increase test reruns for fetchlimit The fetchlimit test has failed 8 times in the nightly CI over the past three weeks. That makes the overall failure rate somewhere around 1 %, which isn't a lot, but is still annoying when lots of testing is going on.	2025-07-07 13:29:02 +02:00
Mark Andrews	a9575a4154	fix: test: rndc test: second 'rndc reconfig' happens too soon Rndc test "test 'rndc reconfig' with a broken config" was failing intermittently. Wait for 'running' to be logged rather than just using 'sleep 1' before calling 'rndc reconfig' a second time to get the expected error message rather than 'reconfig request ignored: already running'. Closes #5408 Merge branch '5408-rndc-test-second-rndc-reconfig-happens-too-soon' into 'main' See merge request isc-projects/bind9!10687	2025-07-07 12:21:58 +10:00
Mark Andrews	8b7bbda2f1	rndc test: second 'rndc reconfig' happens too soon Rndc test "test 'rndc reconfig' with a broken config" was failing intermittently. Wait for 'running' to be logged rather than just using 'sleep 1' before calling 'rndc reconfig' a second time to get the expected error message rather than 'reconfig request ignored: already running'.	2025-07-07 11:42:10 +10:00
Štěpán Balážik	7dcc654f2c	chg: test: Disable DNSSEC validation instead of enabling it with empty TAs in system tests There are many system tests where we set `dnssec-validation yes;` only to also set `trust-anchors { };` which effectively disables the validation. This MR replaces this convoluted setup with just `dnssec-validation no;`. Merge branch 'stepan/empty-trust-anchors-in-system-tests' into 'main' See merge request isc-projects/bind9!10684	2025-07-06 16:54:41 +00:00
Štěpán Balážik	01d1ad7988	Disable DNSSEC validation instead of enabling it with empty TAs in tests There are many system tests where we set `dnssec-validation yes;` only to also set `trust-anchors { };` which effectively disables the validation. This commit replaces this convoluted setup with just `dnssec-validation no;`.	2025-07-06 14:18:10 +00:00
Štěpán Balážik	67916aafad	new: ci: Run an additional respdiff job for merge requests and schedules On MRs it uses the merge target as the reference. In schedules it uses the latest released version for this branch as the reference. This MR lays the groundwork for using respdiff on non-standard configurations (like ECS) in the public repo, see https://gitlab.isc.org/isc-private/bind9/-/merge_requests/807#note_573140. To reduce the future hassle when maintaining the -S version, most of the work (including an added job, so we know that it actually works) is done here. Merge branch 'stepan/respdiff-against-merge-target-or-last-release' into 'main' See merge request isc-projects/bind9!10664	2025-07-06 13:18:53 +00:00
Štěpán Balážik	9a6e8b9190	Run an additional respdiff job for merge requests and schedules On MRs it uses the merge target as the reference. In schedules it uses the latest released version for this branch as the reference.	2025-07-06 13:18:42 +00:00
Mark Andrews	571d318466	fix: dev: Separate out adbname type flags There are three adbname flags that are used to identify different types of adbname lookups when hashing rather than using multiple hash tables. Separate these to their own structure element as these need to be able to be read without locking the adbname structure. Closes #5404 Merge branch '5404-seperate-out-adbname-type-flags' into 'main' See merge request isc-projects/bind9!10677	2025-07-06 23:09:13 +10:00
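
Commit 7682bc21a9 in the listing above replaces LRU-based ADB cleaning with the SIEVE algorithm. For readers unfamiliar with SIEVE, the following is a minimal illustrative sketch in Python, not the BIND code (which is C and evicts by memory size under overmem rather than by entry count). The class and method names (`SieveCache`, `get`, `put`) and the count-based `capacity` are assumptions made for this example only.

```python
class _Node:
    """One cache entry in a doubly linked FIFO list."""
    __slots__ = ("key", "value", "visited", "prev", "next")

    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.visited = False  # set on a hit, cleared by the sweeping hand
        self.prev = None      # neighbour toward the head (newer entries)
        self.next = None      # neighbour toward the tail (older entries)


class SieveCache:
    """Hypothetical count-bounded cache using SIEVE eviction."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.table = {}
        self.head = None  # newest entry
        self.tail = None  # oldest entry
        self.hand = None  # where the next eviction sweep resumes

    def get(self, key):
        node = self.table.get(key)
        if node is None:
            return None
        node.visited = True  # a hit only flips a bit; the list is not reordered
        return node.value

    def put(self, key, value):
        node = self.table.get(key)
        if node is not None:
            node.value = value
            node.visited = True
            return
        if len(self.table) >= self.capacity:
            self._evict()
        node = _Node(key, value)
        node.next = self.head
        if self.head is not None:
            self.head.prev = node
        self.head = node
        if self.tail is None:
            self.tail = node
        self.table[key] = node

    def _evict(self):
        # Sweep from the hand (or the tail) toward the head, giving visited
        # entries a second chance, and evict the first unvisited entry.
        node = self.hand or self.tail
        while node.visited:
            node.visited = False
            node = node.prev or self.tail  # wrap around past the head
        self.hand = node.prev  # None means "restart at the tail next time"
        if node.prev is not None:
            node.prev.next = node.next
        else:
            self.head = node.next
        if node.next is not None:
            node.next.prev = node.prev
        else:
            self.tail = node.prev
        del self.table[node.key]
```

Unlike LRU, a cache hit in this sketch does not move the entry within the list, so lookups need no list manipulation; only insertions and evictions touch the list structure.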

43403 Commits
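
Commit 25daa047d4 in the listing above replaces the 17 lock buckets per zone with one global table created at start time. The sketch below illustrates the idea in Python under stated assumptions: it is not the qpzone code, `threading.Lock` stands in for the rwlocks, the table size `GLOBAL_BUCKETS` and the hashing of a `(zone_id, node_id)` pair are hypothetical, and only the figure of 17 buckets per zone comes from the commit message.

```python
import threading

ZONE_BUCKETS = 17      # per-zone bucket count quoted in the commit message
GLOBAL_BUCKETS = 1024  # hypothetical size, chosen well above a typical thread count

# One table of locks allocated once at start-up and shared by every zone.
GLOBAL_LOCKS = tuple(threading.Lock() for _ in range(GLOBAL_BUCKETS))


def bucket_index(zone_id, node_id, buckets=GLOBAL_BUCKETS):
    """Map a (zone, node) pair onto a bucket; both ids are illustrative integers."""
    return hash((zone_id, node_id)) % buckets


def node_lock(zone_id, node_id):
    """Return the shared lock guarding the given node."""
    return GLOBAL_LOCKS[bucket_index(zone_id, node_id)]


def lock_count(num_zones, per_zone):
    """Number of locks allocated under each scheme."""
    return ZONE_BUCKETS * num_zones if per_zone else GLOBAL_BUCKETS
```

With these assumed numbers, a single busy zone spreads its nodes over 1024 locks instead of 17, while a server with 100000 zones allocates 1024 locks instead of 1700000.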