Is there a time when new_qp(c|z)node() would not be followed by
assignment of the namespace? No, so let's add the assignment to the
function that creates the node.
Now that we have the code working, rename 'dns_qp_lookup2' back to
'dns_qp_lookup' and adjust all remaining 'dns_qp_lookup' occurrences
to take a space=0 parameter.
For now we only allow DNS_DB_NSEC_* values so it makes sense to change
the type to an enum.
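For illustration, the type could look like this (a sketch: the typedef
name is made up; only the DNS_DB_NSEC_* names come from the existing
defines):

    typedef enum {
            DNS_DB_NSEC_NORMAL,   /* an ordinary node */
            DNS_DB_NSEC_HAS_NSEC, /* a node with an associated NSEC record */
            DNS_DB_NSEC_NSEC,     /* a node in the NSEC namespace */
            DNS_DB_NSEC_NSEC3,    /* a node in the NSEC3 namespace */
    } dns_dbnsec_t;

An enum makes the invalid states unrepresentable and lets the compiler
warn about unhandled cases in switch statements.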
Rename 'denial' to the more intuitive 'space', indicating the namespace
of the key-value pair.
In preparation to merge the three qp tries (tree, nsec, nsec3) into
one, add the namespace information to the qpkey. This is the most
significant piece of information, so prepend the denial type to the
qpkey.
This means we need to pass on the denial type when constructing the
qpkey from a name, or doing a lookup.
Reuse the DNS_DB_NSEC_* values. For most qp tries in the code (nta,
rpz, zt, etc.) we just pass 0, because there is no need for denial of
existence, but for qpzone and qpcache we must pass the right value.
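As a rough sketch of the key construction (names and types here are
illustrative, not the real qp API):

    #include <stdint.h>
    #include <string.h>

    /* Prepending the namespace guarantees that keys from different
     * namespaces can never collide once the three tries are merged. */
    static size_t
    qpkey_make(uint8_t *key, uint8_t space, const uint8_t *namekey,
               size_t namelen) {
            key[0] = space; /* the most significant element comes first */
            memcpy(key + 1, namekey, namelen);
            return namelen + 1;
    }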
Change the code so that node->nsec can no longer have the value
DNS_DB_NSEC_HAS_NSEC; instead, track this in a new attribute 'havensec'.
Since we use node->nsec to convert names to keys, the value MUST be set
before inserting the node into the qp-trie.
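A possible node layout (the field packing here is purely illustrative):

    struct qpnode {
            unsigned int nsec : 2;     /* namespace; immutable after
                                        * insertion, because it is baked
                                        * into the key */
            unsigned int havensec : 1; /* the name has an NSEC record;
                                        * this can change after insertion */
            /* ... */
    };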
Update the fuzzing and unit tests accordingly. This only adds a few
extra test cases; more are needed.
In qp_test.c we can remove the test code for empty keys, as these are
no longer possible.
RRset ordering is now an enum inside the struct rdataset attributes.
This was done to keep the size of the structure at its original value
from before this MR.
I expect zero performance impact but it should be easier to deal with
attributes in debuggers and language servers.
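The shape of the change, as a sketch (the neighbouring flag names are
invented for illustration):

    #include <stdbool.h>

    struct rdataset_attributes {
            bool questionok : 1; /* invented example flags */
            bool rendered : 1;
            unsigned int order : 2; /* the RRset ordering enum, packed
                                     * into two bits so the structure
                                     * size stays unchanged */
    };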
Per pspacek, currently qp and qpcache logs are too verbose and enabled at a
level too low compared to how often the logging is useful.
This commit increases the logging level, while keeping it configurable
via a define.
qp-tries allocate their nodes (twigs) in chunks to reduce allocator
pressure and improve memory locality. The choice of chunk size presents
a tradeoff: larger chunks benefit qp-tries with many values (as seen
in large zones and resolvers) but waste memory in smaller use cases.
Previously, our fixed chunk size of 2^10 twigs meant that even an
empty qp-trie would consume 12KB of memory, while reducing this size
would negatively impact resolver performance.
This commit implements an adaptive chunking strategy that:
- Tracks the size of the most recently allocated chunk.
- Doubles the chunk size for each new allocation until reaching a
predefined maximum.
This approach effectively balances memory efficiency for small tries
while maintaining the performance benefits of larger chunk sizes for
bigger data structures.
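A minimal sketch of the growth rule, under assumed names and bounds:

    /* Assumed bounds; the real minimum and maximum may differ. */
    #define QP_CHUNK_MIN (1U << 5)
    #define QP_CHUNK_MAX (1U << 10) /* the previous fixed size */

    static unsigned int
    next_chunk_size(unsigned int last_size) {
            if (last_size == 0) { /* first chunk of a fresh qp-trie */
                    return QP_CHUNK_MIN;
            }
            return (last_size < QP_CHUNK_MAX) ? last_size * 2
                                              : QP_CHUNK_MAX;
    }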
This commit also splits the callback freeing qpmultis into two
phases, one that frees the underlying qptree, and one that reclaims
the qpmulti memory. In order to prevent races between the qpmulti
destructor and chunk garbage collection jobs, the second phase is
protected by reference counting.
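The two-phase teardown can be sketched like this (names and layout
assumed, not the real qp structures):

    #include <stdatomic.h>
    #include <stdlib.h>

    struct qpmulti {
            atomic_uint references; /* held by the destructor and by
                                     * every pending chunk-GC job */
            /* ... the underlying qptree, chunk bookkeeping ... */
    };

    static void
    qpmulti_unref(struct qpmulti *multi) {
            /* Phase one (freeing the qptree) has already run; phase two
             * reclaims the qpmulti only after the last reference drops,
             * so GC jobs cannot race with the destructor. */
            if (atomic_fetch_sub(&multi->references, 1) == 1) {
                    free(multi);
            }
    }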
The 'qpnode->nsec' structure member isn't protected by a lock and
there's a data race between the reading and writing parts in the
qpcache_addrdataset() function. Use a node read lock for accessing
'qpnode->nsec' in qpcache_addrdataset(). Add an additional
'qpnode->nsec != DNS_DB_NSEC_HAS_NSEC' check under the write lock
to be sure that no competing thread changed it in the window between
releasing the read lock and acquiring the write lock.
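A minimal sketch of the pattern with a plain rwlock (the real code uses
BIND's node lock macros; the types here are stand-ins):

    #include <pthread.h>
    #include <stdbool.h>

    struct node {
            pthread_rwlock_t lock;
            int nsec; /* stand-in for qpnode->nsec */
    };
    #define HAS_NSEC 1 /* stand-in for DNS_DB_NSEC_HAS_NSEC */

    static void
    maybe_set_nsec(struct node *n) {
            pthread_rwlock_rdlock(&n->lock);
            bool missing = (n->nsec != HAS_NSEC);
            pthread_rwlock_unlock(&n->lock);

            if (missing) {
                    pthread_rwlock_wrlock(&n->lock);
                    /* Recheck: a competing writer may have won the race
                     * while no lock was held. */
                    if (n->nsec != HAS_NSEC) {
                            n->nsec = HAS_NSEC;
                    }
                    pthread_rwlock_unlock(&n->lock);
            }
    }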
The *DB_VIRTUAL value was introduced to allow clients (presumably
ns_clients) that have been running for some time to access the cached
data that was valid at the time of their inception. The default value
of 5 minutes is far longer than the longevity of the ns_client object,
as the resolver gives up after 2 minutes.
Reduce the value to 10 seconds to honour the original intent more
closely, while still allowing some leeway for clients that started some
time in the past.
Our measurements show that even setting this value to 0 has no
statistically significant effect, thus the value of 10 seconds should be
on the safe side.
Profiles show that a high amount of CPU time is spent in memset.
By removing the zero-initialization of certain large buffers we improve
performance in certain authoritative workloads.
This is the core implementation of the SIEVE algorithm described in the
following paper:
Zhang, Yazhuo, Juncheng Yang, Yao Yue, Ymir Vigfusson, and K. V.
Rashmi. “SIEVE Is Simpler than LRU: An Efficient Turn-Key Eviction
Algorithm for Web Caches,” NSDI '24, available online from
https://junchengyang.com/publication/nsdi24-SIEVE.pdf
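In outline, SIEVE keeps a FIFO queue with a moving "hand": a cache hit
only sets a visited bit, and eviction sweeps the hand from tail toward
head, clearing visited bits and removing the first unvisited entry. A
toy sketch of that outline (not the qpcache data structures):

    #include <stdbool.h>
    #include <stddef.h>

    struct entry {
            struct entry *prev, *next; /* prev points toward the head */
            bool visited;
    };

    struct sieve {
            struct entry *head, *tail; /* head = newest insertion */
            struct entry *hand; /* eviction cursor, kept across calls */
    };

    static void
    on_hit(struct entry *e) {
            e->visited = true; /* the only work on the hot path */
    }

    static void
    insert(struct sieve *s, struct entry *e) {
            /* new entries always enter at the head, unvisited */
            e->visited = false;
            e->prev = NULL;
            e->next = s->head;
            if (s->head != NULL) {
                    s->head->prev = e;
            } else {
                    s->tail = e;
            }
            s->head = e;
    }

    static struct entry *
    evict(struct sieve *s) { /* caller ensures the queue is not empty */
            struct entry *e = (s->hand != NULL) ? s->hand : s->tail;
            while (e->visited) {
                    e->visited = false; /* second chance */
                    e = (e->prev != NULL) ? e->prev : s->tail; /* wrap */
            }
            s->hand = e->prev; /* NULL means: restart from the tail */
            if (e->prev != NULL) {
                    e->prev->next = e->next;
            } else {
                    s->head = e->next;
            }
            if (e->next != NULL) {
                    e->next->prev = e->prev;
            } else {
                    s->tail = e->prev;
            }
            return e;
    }

Note that hits never reorder the list, which is what makes SIEVE
attractive for a cache under concurrent read load compared to LRU.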
In qpcache, there were two places that tried to upgrade the lock: in
clean_stale_header(), the code would try to upgrade the lock and clean
up the header, and in qpzonode_release(), the tree lock would optionally
be upgraded so that the node could be cleaned up directly if empty.
These optimizations are not needed and they have no effect on the
performance.
dns_zonekey_iszonekey() was the only function defined in the
dns_zonekey module, and was only called from one place. It
makes more sense to group this with the dns_dnssec functions.
The qpcache_findzonecut() function accepts two "foundnames",
'foundname' and 'dcname', of which 'dcname' could be NULL. Originally,
when 'dcname' was NULL, it would be set to 'foundname'. Then code like
this was present:

    result = find_deepest_zonecut(&search, node, nodep, foundname,
                                  rdataset,
                                  sigrdataset DNS__DB_FLARG_PASS);
    dns_name_copy(foundname, dcname);

which basically means that we were copying the .ndata over itself for
no apparent reason. Clean up the dcname vs foundname usage.
Co-authored-by: Evan Hunt <each@isc.org>
Co-authored-by: Ondřej Surý <ondrej@isc.org>
The offsets were meant to speed up repeated dns_name operations, but
it was experimentally proven that there's actually no real-world
benefit. Remove the offsets and labels fields from dns_name, and the
static offsets fields, to save 128 bytes in the fixedname, in favor of
calculating labels and offsets only when needed.
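For illustration, the labels and offsets can be recomputed on demand
from the wire-format name (a sketch, not the actual dns_name
internals):

    #include <stddef.h>

    /* Walk the length-prefixed labels, recording each label's offset
     * if the caller provided a buffer; returns the label count. */
    static unsigned int
    count_labels(const unsigned char *ndata, size_t length,
                 unsigned char *offsets) {
            unsigned int labels = 0;
            size_t pos = 0;
            while (pos < length) {
                    if (offsets != NULL) {
                            offsets[labels] = (unsigned char)pos;
                    }
                    labels++;
                    if (ndata[pos] == 0) {
                            break; /* the root label ends the name */
                    }
                    pos += ndata[pos] + 1; /* length byte + label data */
            }
            return labels;
    }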
Acquire a database reference in detachnode() to prevent the last
reference from being released while the NODE_LOCK is held. The
NODE_LOCK is locked/unlocked inside the RCU critical section, so this
most probably would not pose a problem, as the database uses call_rcu
memory reclamation, but it is still safer to acquire the reference
before releasing the node.
The "raw" version of the header was used for the noqname and the closest
proofs to save around 152 bytes of the dns_slabheader_t while bringing
an additional complexity. Remove the raw version of the dns_slabheader
API at the slight expense of having unused dns_slabheader_t data sitting
in front of the proofs.
when dns_rdataslab_fromrdataset() is run, in addition to
allocating space for a slab header, it also partially
initializes it, setting the type to match rdataset->type and
rdataset->covers, the trust to rdataset->trust, and the TTL to
rdataset->ttl.
- there are now two functions for creating an rdataslab from an
rdataset: dns_rdataslab_fromrdataset() creates a full slab (including
space for a slab header), and dns_rdataslab_raw_fromrdataset() creates
a raw slab.
- there are now two functions for getting rdataslab size:
dns_rdataslab_size() is for full slabs and dns_rdataslab_sizeraw()
for raw slabs. there is no longer a need for a reservelen parameter.
- dns_rdataslab_count() also no longer takes a reservelen parameter.
(currently it's never used for raw slabs, so there is no _countraw()
function.)
- dns_rdataslab_rdatasize() has been removed, because
dns_rdataslab_sizeraw() can do the same thing.
- dns_rdataslab_merge() and dns_rdataslab_subtract() both take
slabheader parameters instead of character buffers, and the
reservelen parameter has been removed.
if both rdataslabs being compared have zero length, return true.
also, since these functions are only ever called on slabheaders
with sizeof(dns_slabheader_t) as the reserve length, we can
simplify the API: remove the reservelen argument, and pass the
slabs as type dns_slabheader_t * instead of unsigned char *.
Database versions are not used in cache databases. Some places in
qpcache.c required the version argument to be NULL; others marked it
as UNUSED. Unify all cases to require version to be NULL.
Add a new related_headers() function that simplifies the code
flow in qpcache_findrdataset(). Also use check_stale_header() function
to remove code duplication.
Add a helper function both_headers() that unifies the slabheader
matching for a simple type: it returns true when both the type and
the matching RRSIG have been found.
The new maybe_update_headers() function unifies the LRU updates to the
slabheaders, which were scattered all over the place. More calls to
update the headers after bindrdatasets() were also added for
completeness.
This removes code duplication between the dual bindrdataset() calls.
It also unifies the handling, as there were small differences between
the calls: one variant was checking for the !NEGATIVE(found) condition
and one wasn't, and it is technically OK to do the check for all
variants.
The check_stale_header() function now updates header_prev directly
so it doesn't have to be handled in the outer loop; it's always
set to the correct value of the previous header in the chain.
some code was left in the cache database implementation after
it was separated from the zone database, and can be cleaned up
and refactored now:
- the DNS_SLABHEADERATTR_IGNORE flag is never set in the cache
- support for loading the cache from a file was removed, but the add()
function still had a 'loading' flag that's always false
- two different macros were used for checking the
DNS_SLABHEADERATTR_NONEXISTENT flag - EXISTS() and NONEXISTENT().
it's clearer to just use EXISTS().
- the cache doesn't support versions, so it isn't necessary to
walk down the 'down' pointer chain when iterating through the
cache or looking for a header to update. 'down' now only points
to records that are deleted from the cache but have not yet been
purged from memory. this allows us to simplify both the iterator
and the add() function.
Reduce the number of qpzone_ref() and qpzone_unref() calls in
qpzone_detachnode() by relying on the call_rcu to delay
the destruction of the lock buckets.
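The shape of the change, sketched with liburcu (the struct here is
illustrative; call_rcu() and caa_container_of() are the real liburcu
API):

    #include <stdlib.h>
    #include <urcu.h>

    struct buckets { /* illustrative stand-in for the lock buckets */
            struct rcu_head rcu_head;
            /* ... */
    };

    static void
    buckets_free_rcu(struct rcu_head *head) {
            free(caa_container_of(head, struct buckets, rcu_head));
    }

    /* Deferring the teardown past the RCU grace period means the
     * detach path doesn't need to pin the qpzone around each use: */
    /* call_rcu(&b->rcu_head, buckets_free_rcu); */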
Instead of having many node_lock_count * sizeof(<member>) arrays, pack
all the members into a qpcache_bucket_t struct that is cacheline
aligned and have a single array of those.
Additionally, make both the head and the tail of isc_queue_t padded,
not just the head, to prevent false sharing of the lock-free structure
with the lock that follows it.
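A sketch of the layout (member names assumed; isc_rwlock_t replaced by
a plain rwlock for self-containment):

    #include <pthread.h>
    #include <stdalign.h>
    #include <stddef.h>

    /* One cacheline-aligned struct per bucket replaces several parallel
     * node_lock_count-sized arrays; with 64-byte alignment, two buckets
     * never share a cacheline. */
    typedef struct qpcache_bucket {
            alignas(64) pthread_rwlock_t lock;
            size_t references;
            /* ... the other per-bucket members ... */
    } qpcache_bucket_t;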
In #1870, the expiration time of ANCIENT records was printed, but
ancient records are actually very short-lived, and the information
carries little value.
Instead of printing the expiration of ANCIENT records, print the
expiration time of STALE records.
The find_coveringnsec() function was getting 'now' from two sources:
search->now and a separate 'now' argument. Things like this are ticking
bombs; remove the extra 'now' argument and use a single source of
'now'.
When the mark_ancient() helper function was introduced, a couple of
places with duplicate (or almost duplicate) code were missed. Move the
mark_ancient() function closer to the top of the file, and use it
correctly in the places that mark the header as ANCIENT.
If we know that the header has ZEROTTL set, the server should never
send stale records for it and the TTL should never be anything other
than 0. The comment was already there, but the code did not match the
comment.
The old name was misleading, as it never meant time-to-live, i.e. the
number of seconds from now when the header should expire. The true
meaning was an expiration time, i.e. now + ttl. This was the original
design bug that caused the slip when we assigned header->ttl to
rdataset->ttl. Because the name matched, nobody questioned the
correctness of the code, either during the MR review or during the
numerous re-reviews when we were searching for the cause of the 54-year
TTL.
When the header has been marked as ANCIENT but the ttl hasn't been
reset (this happens in a couple of places), the rdataset TTL would be
set to the header timestamp instead of a reasonable TTL value.
Since such a header has already expired (ANCIENT is set), set the
rdataset TTL to 0 and don't reuse this field to print the expiration
time when dumping the cache. Instead of printing the time, we now
just print 'expired (awaiting cleanup)'.
the search for the deepest known zone cut in the cache could
improperly reject a node containing stale data, even if the
NS rdataset wasn't the data that was stale.
this change also improves the efficiency of the search by
stopping it when both NS and RRSIG(NS) have been found.
Change the names of the node reference counting functions
and add comments to make the mechanism easier to understand:
- newref() and decref() are now called qpcnode_acquire()/
qpznode_acquire() and qpcnode_release()/qpznode_release()
respectively; this reflects the fact that they modify both
the internal and external reference counters for a node.
- qpcnode_newref() and qpznode_newref() are now called
qpcnode_erefs_increment() and qpznode_erefs_increment(), and
qpcnode_decref() and qpznode_decref() are now called
qpcnode_erefs_decrement() and qpznode_erefs_decrement(),
to reflect that they only increase and decrease the node's
external reference counters, not internal.
This removes the db_nodelock_t structure and changes the node_locks
array to be composed only of isc_rwlock_t pointers. The .reference
member has been moved to qpdb->references, in addition to
common.references, which is external to dns_db API users. The .exiting
member has been completely removed, as it has no use when the reference
counting is used correctly.
The origin_node in qpcache was always NULL, so we can remove the
getoriginnode() function and the origin_node pointer, as
dns_db_getoriginnode() correctly returns ISC_R_NOTFOUND when the
function is not implemented.
Clean up the pattern in the decref() functions in both qpcache.c and
qpzone.c, so that it follows a pattern similar to what we already have
in the newref() function.
Previously, this function always acquired a node write lock if it
might need node cleanup in case the reference count dropped to 0. In
fact, the lock is unnecessary if the reference count is larger than 1,
and this can be optimized as an "easy" case. This optimization could
even be "necessary": in some extreme cases, many worker threads could
repeatedly acquire and release the reference on the same node,
resulting in severe lock contention for nothing (as the ref wouldn't
drop to 0 in most cases). This change prevents a noticeable performance
drop, such as query timeouts, in such cases.
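The "easy" case can be sketched with C11 atomics (illustrative; the
real code wraps its reference counters differently):

    #include <stdatomic.h>
    #include <stdbool.h>

    /* Returns true if the reference was released without a lock.
     * Returns false when this thread may drop the last reference;
     * the caller must then take the node write lock and recheck. */
    static bool
    decref_fast(atomic_uint *refs) {
            unsigned int cur = atomic_load(refs);
            while (cur > 1) {
                    /* CAS so we never go below 1 without the lock */
                    if (atomic_compare_exchange_weak(refs, &cur,
                                                     cur - 1)) {
                            return true;
                    }
            }
            return false;
    }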
Co-authored-by: JINMEI Tatuya <jtatuya@infoblox.com>
Co-authored-by: Ondřej Surý <ondrej@isc.org>