diff --git a/doc/draft/draft-ietf-dnsop-respsize-05.txt b/doc/draft/draft-ietf-dnsop-respsize-05.txt deleted file mode 100644 index b9615be6f8..0000000000 --- a/doc/draft/draft-ietf-dnsop-respsize-05.txt +++ /dev/null @@ -1,640 +0,0 @@ - - - - - - - DNSOP Working Group Paul Vixie, ISC - INTERNET-DRAFT Akira Kato, WIDE - August 2006 - - DNS Referral Response Size Issues - - Status of this Memo - By submitting this Internet-Draft, each author represents that any - applicable patent or other IPR claims of which he or she is aware - have been or will be disclosed, and any of which he or she becomes - aware will be disclosed, in accordance with Section 6 of BCP 79. - - Internet-Drafts are working documents of the Internet Engineering - Task Force (IETF), its areas, and its working groups. Note that - other groups may also distribute working documents as Internet- - Drafts. - - Internet-Drafts are draft documents valid for a maximum of six months - and may be updated, replaced, or obsoleted by other documents at any - time. It is inappropriate to use Internet-Drafts as reference - material or to cite them other than as "work in progress." - - The list of current Internet-Drafts can be accessed at - http://www.ietf.org/ietf/1id-abstracts.txt - - The list of Internet-Draft Shadow Directories can be accessed at - http://www.ietf.org/shadow.html. - - Copyright Notice - - Copyright (C) The Internet Society (2006). All Rights Reserved. - - - - - Abstract - - With a mandated default minimum maximum message size of 512 octets, - the DNS protocol presents some special problems for zones wishing to - expose a moderate or high number of authority servers (NS RRs). This - document explains the operational issues caused by, or related to - this response size limit, and suggests ways to optimize the use of - this limited space. Guidance is offered to DNS server implementors - and to DNS zone operators. - - - - - Expires January 2007 [Page 1] - - INTERNET-DRAFT August 2006 RESPSIZE - - - 1 - Introduction and Overview - - 1.1. The DNS standard (see [RFC1035 4.2.1]) limits message size to 512 - octets. Even though this limitation was due to the required minimum IP - reassembly limit for IPv4, it became a hard DNS protocol limit and is - not implicitly relaxed by changes in transport, for example to IPv6. - - 1.2. The EDNS0 protocol extension (see [RFC2671 2.3, 4.5]) permits - larger responses by mutual agreement of the requester and responder. - The 512 octet message size limit will remain in practical effect until - there is widespread deployment of EDNS0 in DNS resolvers on the - Internet. - - 1.3. Since DNS responses include a copy of the request, the space - available for response data is somewhat less than the full 512 octets. - Negative responses are quite small, but for positive and delegation - responses, every octet must be carefully and sparingly allocated. This - document specifically addresses delegation response sizes. - - 2 - Delegation Details - - 2.1. RELEVANT PROTOCOL ELEMENTS - - 2.1.1. A delegation response will include the following elements: - - Header Section: fixed length (12 octets) - Question Section: original query (name, class, type) - Answer Section: (empty) - Authority Section: NS RRset (nameserver names) - Additional Section: A and AAAA RRsets (nameserver addresses) - - 2.1.2. If the total response size exceeds 512 octets, and if the data - that does not fit was "required", then the TC bit will be set - (indicating truncation). This will usually cause the requester to retry - using TCP, depending on what information was desired and what - information was omitted. For example, truncation in the authority - section is of no interest to a stub resolver who only plans to consume - the answer section. If a retry using TCP is needed, the total cost of - the transaction is much higher. See [RFC1123 6.1.3.2] for details on - the requirement that UDP be attempted before falling back to TCP. - - 2.1.3. RRsets are never sent partially unless TC bit set to indicate - truncation. When TC bit is set, the final apparent RRset in the final - non-empty section must be considered "possibly damaged" (see [RFC1035 - 6.2], [RFC2181 9]). - - - - Expires January 2007 [Page 2] - - INTERNET-DRAFT August 2006 RESPSIZE - - - 2.1.4. With or without truncation, the glue present in the additional - data section should be considered "possibly incomplete", and requesters - should be prepared to re-query for any damaged or missing RRsets. Note - that truncation of the additional data section might not be signalled - via the TC bit since additional data is often optional. - - 2.1.5. DNS label compression allows a domain name to be instantiated - only once per DNS message, and then referenced with a two-octet - "pointer" from other locations in that same DNS message (see [RFC1035 - 4.1.4]). If all nameserver names in a message share a common parent - (for example, all ending in ".ROOT-SERVERS.NET"), then more space will - be available for incompressable data (such as nameserver addresses). - - 2.1.6. The query name can be as long as 255 characters of presentation - data, which can be up to 256 octets of network data. In this worst case - scenario, the question section will be 260 octets in size, which would - leave only 240 octets for the authority and additional sections (after - deducting 12 octets for the fixed length header.) - - 2.2. ADVICE TO ZONE OWNERS - - 2.2.1. Average and maximum question section sizes can be predicted by - the zone owner, since they will know what names actually exist, and can - measure which ones are queried for most often. Note that if the zone - contains any wildcards, it is possible for maximum length queries to - require positive responses, but that it is reasonable to expect - truncation and TCP retry in that case. For cost and performance - reasons, the majority of requests should be satisfied without truncation - or TCP retry. - - 2.2.2. Some queries to non-existing names can be large, but this is not - a problem because negative responses need not contain any answer, - authority or additional records. See [RFC2308 2.1] for more information - about the format of negative responses. - - 2.2.3. The minimum useful number of name servers is two, for redundancy - (see [RFC1034 4.1]). A zone's name servers should be reachable by all - IP transport protocols (e.g., IPv4 and IPv6) in common use. - - 2.2.4. The best case is no truncation at all. This is because many - requesters will retry using TCP by reflex, or will automatically re- - query for RRsets that are possibly truncated, without considering - whether the omitted data was actually necessary. - - - - - - Expires January 2007 [Page 3] - - INTERNET-DRAFT August 2006 RESPSIZE - - - 2.3. ADVICE TO SERVER IMPLEMENTORS - - 2.3.1. In case of multi-homed name servers, it is advantageous to - include an address record from each of several name servers before - including several address records for any one name server. If address - records for more than one transport (for example, A and AAAA) are - available, then it is advantageous to include records of both types - early on, before the message is full. - - 2.3.2. Each added NS RR for a zone will add between 16 and 44 octets to - every non-truncated referral or negative response from the zone's - authority servers (16 octets for an NS RR, 16 octets for an A RR, and 28 - octets for an AAAA RR), in addition to whatever space is taken by the - nameserver name (NS NSDNAME as well as A or AAAA owner name). - - 2.3.3. While DNS distinguishes between necessary and optional resource - records, this distinction is according to protocol elements necessary to - signify facts, and takes no official notice of protocol content - necessary to ensure correct operation. For example, a nameserver name - that is in or below the zone cut being described by a delegation is - "necessary content," since there is no way to reach that zone unless the - parent zone's delegation includes "glue records" describing that name - server's addresses. - - 2.3.4. It is also necessary to distinguish between "explicit truncation" - where a message could not contain enough records to convey its intended - meaning, and so the TC bit has been set, and "silent truncation", where - the message was not large enough to contain some records which were "not - required", and so the TC bit was not set. - - 2.3.5. A delegation response should prioritize glue records as follows. - - first - All glue RRsets for one name server whose name is in or below the - zone being delegated, or which has multiple address RRsets (currently - A and AAAA), or preferably both; - - second - Alternate between adding all glue RRsets for any name servers whose - names are in or below the zone being delegated, and all glue RRsets - for any name servers who have multiple address RRsets (currently A - and AAAA); - - thence - All other glue RRsets, in any order. - - - - Expires January 2007 [Page 4] - - INTERNET-DRAFT August 2006 RESPSIZE - - - Whenever there are multiple candidates for a position in this priority - scheme, one should be chosen on a round-robin or fully random basis. - - The goal of this priority scheme is to offer "necessary" glue first, - avoiding silent truncation for this glue if possible. - - 2.3.6. If any "necessary content" is silently truncated, then it is - advisable that the TC bit be set in order to force a TCP retry, rather - than have the zone be unreachable. Note that a parent server's proper - response to a query for in-child glue or below-child glue is a referral - rather than an answer, and that this referral MUST be able to contain - the in-child or below-child glue, and that in outlying cases, only EDNS - or TCP will be large enough to contain that data. - - 3 - Analysis - - 3.1. An instrumented protocol trace of a best case delegation response - follows. Note that 13 servers are named, and 13 addresses are given. - This query was artificially designed to exactly reach the 512 octet - limit. - - ;; flags: qr rd; QUERY: 1, ANS: 0, AUTH: 13, ADDIT: 13 - ;; QUERY SECTION: - ;; [23456789.123456789.123456789.\ - 123456789.123456789.123456789.com A IN] ;; @80 - - ;; AUTHORITY SECTION: - com. 86400 NS E.GTLD-SERVERS.NET. ;; @112 - com. 86400 NS F.GTLD-SERVERS.NET. ;; @128 - com. 86400 NS G.GTLD-SERVERS.NET. ;; @144 - com. 86400 NS H.GTLD-SERVERS.NET. ;; @160 - com. 86400 NS I.GTLD-SERVERS.NET. ;; @176 - com. 86400 NS J.GTLD-SERVERS.NET. ;; @192 - com. 86400 NS K.GTLD-SERVERS.NET. ;; @208 - com. 86400 NS L.GTLD-SERVERS.NET. ;; @224 - com. 86400 NS M.GTLD-SERVERS.NET. ;; @240 - com. 86400 NS A.GTLD-SERVERS.NET. ;; @256 - com. 86400 NS B.GTLD-SERVERS.NET. ;; @272 - com. 86400 NS C.GTLD-SERVERS.NET. ;; @288 - com. 86400 NS D.GTLD-SERVERS.NET. ;; @304 - - - - - - - - - Expires January 2007 [Page 5] - - INTERNET-DRAFT August 2006 RESPSIZE - - - ;; ADDITIONAL SECTION: - A.GTLD-SERVERS.NET. 86400 A 192.5.6.30 ;; @320 - B.GTLD-SERVERS.NET. 86400 A 192.33.14.30 ;; @336 - C.GTLD-SERVERS.NET. 86400 A 192.26.92.30 ;; @352 - D.GTLD-SERVERS.NET. 86400 A 192.31.80.30 ;; @368 - E.GTLD-SERVERS.NET. 86400 A 192.12.94.30 ;; @384 - F.GTLD-SERVERS.NET. 86400 A 192.35.51.30 ;; @400 - G.GTLD-SERVERS.NET. 86400 A 192.42.93.30 ;; @416 - H.GTLD-SERVERS.NET. 86400 A 192.54.112.30 ;; @432 - I.GTLD-SERVERS.NET. 86400 A 192.43.172.30 ;; @448 - J.GTLD-SERVERS.NET. 86400 A 192.48.79.30 ;; @464 - K.GTLD-SERVERS.NET. 86400 A 192.52.178.30 ;; @480 - L.GTLD-SERVERS.NET. 86400 A 192.41.162.30 ;; @496 - M.GTLD-SERVERS.NET. 86400 A 192.55.83.30 ;; @512 - - ;; MSG SIZE sent: 80 rcvd: 512 - - 3.2. For longer query names, the number of address records supplied will - be lower. Furthermore, it is only by using a common parent name (which - is GTLD-SERVERS.NET in this example) that all 13 addresses are able to - fit, due to the use of DNS compression pointers in the last 12 - occurances of the parent domain name. The following output from a - response simulator demonstrates these properties. - - % perl respsize.pl a.dns.br b.dns.br c.dns.br d.dns.br - a.dns.br requires 10 bytes - b.dns.br requires 4 bytes - c.dns.br requires 4 bytes - d.dns.br requires 4 bytes - # of NS: 4 - For maximum size query (255 byte): - only A is considered: # of A is 4 (green) - A and AAAA are considered: # of A+AAAA is 3 (yellow) - preferred-glue A is assumed: # of A is 4, # of AAAA is 3 (yellow) - For average size query (64 byte): - only A is considered: # of A is 4 (green) - A and AAAA are considered: # of A+AAAA is 4 (green) - preferred-glue A is assumed: # of A is 4, # of AAAA is 4 (green) - - - - - - - - - - - Expires January 2007 [Page 6] - - INTERNET-DRAFT August 2006 RESPSIZE - - - % perl respsize.pl ns-ext.isc.org ns.psg.com ns.ripe.net ns.eu.int - ns-ext.isc.org requires 16 bytes - ns.psg.com requires 12 bytes - ns.ripe.net requires 13 bytes - ns.eu.int requires 11 bytes - # of NS: 4 - For maximum size query (255 byte): - only A is considered: # of A is 4 (green) - A and AAAA are considered: # of A+AAAA is 3 (yellow) - preferred-glue A is assumed: # of A is 4, # of AAAA is 2 (yellow) - For average size query (64 byte): - only A is considered: # of A is 4 (green) - A and AAAA are considered: # of A+AAAA is 4 (green) - preferred-glue A is assumed: # of A is 4, # of AAAA is 4 (green) - - (Note: The response simulator program is shown in Section 5.) - - Here we use the term "green" if all address records could fit, or - "yellow" if two or more could fit, or "orange" if only one could fit, or - "red" if no address record could fit. It's clear that without a common - parent for nameserver names, much space would be lost. For these - examples we use an average/common name size of 15 octets, befitting our - assumption of GTLD-SERVERS.NET as our common parent name. - - We're assuming a medium query name size of 64 since that is the typical - size seen in trace data at the time of this writing. If - Internationalized Domain Name (IDN) or any other technology which - results in larger query names be deployed significantly in advance of - EDNS, then new measurements and new estimates will have to be made. - - 4 - Conclusions - - 4.1. The current practice of giving all nameserver names a common parent - (such as GTLD-SERVERS.NET or ROOT-SERVERS.NET) saves space in DNS - responses and allows for more nameservers to be enumerated than would - otherwise be possible, since the common parent domain name only appears - once in a DNS message and is referred to via "compression pointers" - thereafter. - - 4.2. If all nameserver names for a zone share a common parent, then it - is operationally advisable to make all servers for the zone thus served - also be authoritative for the zone of that common parent. For example, - the root name servers (?.ROOT-SERVERS.NET) can answer authoritatively - for the ROOT-SERVERS.NET. This is to ensure that the zone's servers - always have the zone's nameservers' glue available when delegating, and - - - - Expires January 2007 [Page 7] - - INTERNET-DRAFT August 2006 RESPSIZE - - - will be able to respond with answers rather than referrals if a - requester who wants that glue comes back asking for it. In this case - the name server will likely be a "stealth server" -- authoritative but - unadvertised in the glue zone's NS RRset. See [RFC1996 2] for more - information about stealth servers. - - 4.3. Thirteen (13) is the effective maximum number of nameserver names - usable traditional (non-extended) DNS, assuming a common parent domain - name, and given that implicit referral response truncation is - undesirable in the average case. - - 4.4. Multi-homing of name servers within a protocol family is - inadvisable since the necessary glue RRsets (A or AAAA) are atomically - indivisible, and will be larger than a single resource record. Larger - RRsets are more likely to lead to or encounter truncation. - - 4.5. Multi-homing of name servers across protocol families is less - likely to lead to or encounter truncation, partly because multiprotocol - clients are more likely to speak EDNS which can use a larger response - size limit, and partly because the resource records (A and AAAA) are in - different RRsets and are therefore divisible from each other. - - 4.6. Name server names which are at or below the zone they serve are - more sensitive to referral response truncation, and glue records for - them should be considered "less optional" than other glue records, in - the assembly of referral responses. - - 4.7. If a zone is served by thirteen (13) name servers having a common - parent name (such as ?.ROOT-SERVERS.NET) and each such name server has a - single address record in some protocol family (e.g., an A RR), then all - thirteen name servers or any subset thereof could multi-home in a second - protocol family by adding a second address record (e.g., an AAAA RR) - without reducing the reachability of the zone thus served. - - 5 - Source Code - - #!/usr/bin/perl - # - # SYNOPSIS - # repsize.pl [ -z zone ] fqdn_ns1 fqdn_ns2 ... - # if all queries are assumed to have a same zone suffix, - # such as "jp" in JP TLD servers, specify it in -z option - # - use strict; - use Getopt::Std; - - - - Expires January 2007 [Page 8] - - INTERNET-DRAFT August 2006 RESPSIZE - - - my ($sz_msg) = (512); - my ($sz_header, $sz_ptr, $sz_rr_a, $sz_rr_aaaa) = (12, 2, 16, 28); - my ($sz_type, $sz_class, $sz_ttl, $sz_rdlen) = (2, 2, 4, 2); - my (%namedb, $name, $nssect, %opts, $optz); - my $n_ns = 0; - - getopt('z', %opts); - if (defined($opts{'z'})) { - server_name_len($opts{'z'}); # just register it - } - - foreach $name (@ARGV) { - my $len; - $n_ns++; - $len = server_name_len($name); - print "$name requires $len bytes\n"; - $nssect += $sz_ptr + $sz_type + $sz_class + $sz_ttl - + $sz_rdlen + $len; - } - print "# of NS: $n_ns\n"; - arsect(255, $nssect, $n_ns, "maximum"); - arsect(64, $nssect, $n_ns, "average"); - - sub server_name_len { - my ($name) = @_; - my (@labels, $len, $n, $suffix); - - $name =~ tr/A-Z/a-z/; - @labels = split(/\./, $name); - $len = length(join('.', @labels)) + 2; - for ($n = 0; $#labels >= 0; $n++, shift @labels) { - $suffix = join('.', @labels); - return length($name) - length($suffix) + $sz_ptr - if (defined($namedb{$suffix})); - $namedb{$suffix} = 1; - } - return $len; - } - - sub arsect { - my ($sz_query, $nssect, $n_ns, $cond) = @_; - my ($space, $n_a, $n_a_aaaa, $n_p_aaaa, $ansect); - $ansect = $sz_query + 1 + $sz_type + $sz_class; - $space = $sz_msg - $sz_header - $ansect - $nssect; - $n_a = atmost(int($space / $sz_rr_a), $n_ns); - - - - Expires January 2007 [Page 9] - - INTERNET-DRAFT August 2006 RESPSIZE - - - $n_a_aaaa = atmost(int($space - / ($sz_rr_a + $sz_rr_aaaa)), $n_ns); - $n_p_aaaa = atmost(int(($space - $sz_rr_a * $n_ns) - / $sz_rr_aaaa), $n_ns); - printf "For %s size query (%d byte):\n", $cond, $sz_query; - printf " only A is considered: "; - printf "# of A is %d (%s)\n", $n_a, &judge($n_a, $n_ns); - printf " A and AAAA are considered: "; - printf "# of A+AAAA is %d (%s)\n", - $n_a_aaaa, &judge($n_a_aaaa, $n_ns); - printf " preferred-glue A is assumed: "; - printf "# of A is %d, # of AAAA is %d (%s)\n", - $n_a, $n_p_aaaa, &judge($n_p_aaaa, $n_ns); - } - - sub judge { - my ($n, $n_ns) = @_; - return "green" if ($n >= $n_ns); - return "yellow" if ($n >= 2); - return "orange" if ($n == 1); - return "red"; - } - - sub atmost { - my ($a, $b) = @_; - return 0 if ($a < 0); - return $b if ($a > $b); - return $a; - } - - 6 - Security Considerations - - The recommendations contained in this document have no known security - implications. - - 7 - IANA Considerations - - This document does not call for changes or additions to any IANA - registry. - - 8 - Acknowledgement - - The authors thank Peter Koch, Rob Austein, and Joe Abley for their - valuable comments and suggestions. - - - - - Expires January 2007 [Page 10] - - INTERNET-DRAFT August 2006 RESPSIZE - - - This work was supported by the US National Science Foundation (research - grant SCI-0427144) and DNS-OARC. - - 9 - References - - [RFC1034] Mockapetris, P.V., "Domain names - Concepts and Facilities", - RFC1034, November 1987. - - [RFC1035] Mockapetris, P.V., "Domain names - Implementation and - Specification", RFC1035, November 1987. - - [RFC1123] Braden, R., Ed., "Requirements for Internet Hosts - - Application and Support", RFC1123, October 1989. - - [RFC1996] Vixie, P., "A Mechanism for Prompt Notification of Zone - Changes (DNS NOTIFY)", RFC1996, August 1996. - - [RFC2308] Andrews, M., "Negative Caching of DNS Queries (DNS NCACHE)", - RFC2308, March 1998. - - [RFC2181] Elz, R., Bush, R., "Clarifications to the DNS Specification", - RFC2181, July 1997. - - [RFC2671] Vixie, P., "Extension Mechanisms for DNS (EDNS0)", RFC2671, - August 1999. - - 10 - Authors' Addresses - - Paul Vixie - Internet Systems Consortium, Inc. - 950 Charter Street - Redwood City, CA 94063 - +1 650 423 1301 - vixie@isc.org - - Akira Kato - University of Tokyo, Information Technology Center - 2-11-16 Yayoi Bunkyo - Tokyo 113-8658, JAPAN - +81 3 5841 2750 - kato@wide.ad.jp - - - - - - - - Expires January 2007 [Page 11] - - INTERNET-DRAFT August 2006 RESPSIZE - - - Full Copyright Statement - - Copyright (C) The Internet Society (2006). - - This document is subject to the rights, licenses and restrictions - contained in BCP 78, and except as set forth therein, the authors retain - all their rights. - - This document and the information contained herein are provided on an - "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR - IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET - ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, - INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE - INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED - WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. - - Intellectual Property - - The IETF takes no position regarding the validity or scope of any - Intellectual Property Rights or other rights that might be claimed to - pertain to the implementation or use of the technology described in this - document or the extent to which any license under such rights might or - might not be available; nor does it represent that it has made any - independent effort to identify any such rights. Information on the - procedures with respect to rights in RFC documents can be found in BCP - 78 and BCP 79. - - Copies of IPR disclosures made to the IETF Secretariat and any - assurances of licenses to be made available, or the result of an attempt - made to obtain a general license or permission for the use of such - proprietary rights by implementers or users of this specification can be - obtained from the IETF on-line IPR repository at - http://www.ietf.org/ipr. - - The IETF invites any interested party to bring to its attention any - copyrights, patents or patent applications, or other proprietary rights - that may cover technology that may be required to implement this - standard. Please address the information to the IETF at - ietf-ipr@ietf.org. - - Acknowledgement - - Funding for the RFC Editor function is provided by the IETF - Administrative Support Activity (IASA). - - - - - Expires January 2007 [Page 12] - -