From 9309cf30c072ec762753bd7c56a09366dbc1bee8 Mon Sep 17 00:00:00 2001 From: David Lawrence Date: Wed, 29 Nov 2000 04:27:35 +0000 Subject: [PATCH] The role of the DNS, as viewed by the IAB chair --- doc/draft/draft-klensin-dns-role-00.txt | 571 ++++++++++++++++++++++++ 1 file changed, 571 insertions(+) create mode 100644 doc/draft/draft-klensin-dns-role-00.txt diff --git a/doc/draft/draft-klensin-dns-role-00.txt b/doc/draft/draft-klensin-dns-role-00.txt new file mode 100644 index 0000000000..d708ec5fc0 --- /dev/null +++ b/doc/draft/draft-klensin-dns-role-00.txt @@ -0,0 +1,571 @@ +INTERNET-DRAFT John C. Klensin +November 10, 2000 +Expires May 2001 + + + Role of the Domain Name System + draft-klensin-dns-role-00.txt + +Status of this Memo + +This document is an Internet-Draft and is in full conformance with +all provisions of Section 10 of RFC2026. + +Internet-Drafts are working documents of the Internet Engineering +Task Force (IETF), its areas, and its working groups. Note that +other groups may also distribute working documents as Internet-Drafts. + +Internet-Drafts are draft documents valid for a maximum of six months +and may be updated, replaced, or obsoleted by other documents at any +time. It is inappropriate to use Internet-Drafts as reference +material or to cite them other than as "work in progress." + +The list of current Internet-Drafts can be accessed at +http://www.ietf.org/ietf/1id-abstracts.txt + +The list of Internet-Draft Shadow Directories can be accessed at +http://www.ietf.org/shadow.html. + +This document represents a summary of the personal opinions of the +author on the subject covered and is not intended to evolve into a +standard of any kind. + +Copyright Notice + +Copyright (C) The Internet Society (2000). All Rights Reserved. + + + +0. Abstract + +The original function and purpose of the DNS is reviewed, and +contrasted with some of the functions into which it is being forced +today and some of the newer demands being placed upon it or suggested +for it. A framework for an alternative to placing these additional +stresses on the DNS is then outlined. This document and that +framework are not a proposed solution, only a strong suggestion that +the time has come to begin thinking more broadly about the problems +we are encountering and possible approaches to solving them. + + +1. History + +Several of the comments that follow are somewhat revisionist. Good +design and engineering often requires a level of intuition by the +designers about things that will be necessary in the future; the +reasons for some of these design decisions are not made explicit at +the time because no one is able to articulate them. The discussion +below reconstructs some of the decisions about the Internet's primary +namespace in the light of subsequent development and experience. In +addition, the historical reasons for particular decisions about the +Internet were often severely underdocumented contemporaneously and, +not surprisingly, different participants have different recollections +about what happened and what was considered important. Consequently, +the quasi-historical story below is just one story. There may be +(indeed, almost certainly are) other stories about how we got to +where we are today, but they probably don't, of themselves, +invalidate the inferences and conclusions. + +1.1 Context for DNS development + +During the entire life of the ARPANET and nearly the first decade or +so of operation of the Internet, the list of host names and their +mapping to and from addresses was maintained in a frequently-updated +"host table" [RFC625, 811, 952]. This table was just a list in an +agreed-upon format; sites were expected to frequently obtain copies +of, and install, new versions. The host tables themselves were +introduced to + + * Eliminate the requirement for people to remember host numbers + (addresses). Despite apparent experience to the contrary in the + conventional telephone system, numeric numbering systems, including + the numeric host number strategy, did not (and do not) work well for + more than a (large) handful of hosts. + + * Provide stability when addresses changed. Since addresses --to + some degree in the ARPANET and more importantly in the contemporary + Internet-- are a function of network topology and routing, they + often had to be changed when connectivity or topology changed. The + names could be kept stable even as addresses changed. + + * Some hosts (so-called "multihomed" ones) needed multiple + addresses to reflect different types of connectivity and topology. + Again, the names were very useful for avoiding the requirement that + would otherwise exist for users and other hosts to track these + multiple host numbers and addresses. +Toward the end of that long (in network time) period, the community +concluded that the host table model did not scale adequately and that +it would not adequately support new service variations. A working +group was created, and the DNS was the result of that effort. The +role of the DNS was to preserve the capabilities of the host table +arrangements (especially unique, unambiguous, host names), provide +for addition of additional services (e.g., the special record types +for electronic mail routing which rather quickly followed +introduction of the DNS), and to do so on the base of a robust, +hierarchical, distributed, name lookup system. That system also +permitted distribution of name administration, rather than requiring +that each host be entered into a single, central, table by a central +administration. + +1.2 Review of the DNS + +The DNS was designed primarily to identify network resources. +Although there was speculation about including, e.g., personal names +and email addresses, it was not designed primarily to identify +people, brands, etc. At the same time, the system was designed with +the flexibility to accomodate new data types and structures through +the addition of new record types to the initial "INternet" class. +Since the appropriate identifiers and content of those future +extensions could not be anticipated, the design provided that these +fields could contain any (binary) information, not just the +restricted text forms of the host table. + +However, the DNS as-used is intimately tied to the applications and +application protocols that utilize it, often at a fairly low level. + +In particular, despite the ability of the protocols and data +structures themselves to accomodate any binary representation, DNS +names as used are historically not [even] ASCII, but a very +restricted subset of it, a subset that derives primarily from the +original host table naming rules. Selection of that subset was +driven in part by human factors considerations, including a desire to +eliminate possible ambiguities in an international context. Hence +character codes that had international variations in interpretation +were excluded, the underscore character and case distinctions were +eliminated as being confusing (in the underscore's case, with the +hyphen character) when written or read by people, and so on. These +considerations appear to be very similar to those that resulted in +similarly restricted character sets being used as protocol elements +in many ITU and ISO protocols (cf. X.9, X.29). + +Another assumption was that there would be a high ratio of physical +hosts to second level domains and, more generally, that the system +would be deeply hierarchical, with most systems (and names) at the +third level or below and a large ratio of names representing physical +hosts to total names. There are domains that follow this model: many +university and corporate domains use fairly deep hierarchies, as do a +few country code TLDs (".US" is an excellent example). However, the +RIPE hostcount list is now showing a count of SOA records that is +approaching (and may have passed) the number of distinct hosts. +While recent experience has shown that the DNS is robust enough +--given contemporary machines as servers and current bandwidth +norms-- to be able to continue to operate reasonably well when those +historical assumptions are not met (e.g., with a huge, flat, +structure under ".COM"), it is still useful to remember that the +system could have been designed to work optimally with a flat +structure (and very large zones) rather than a deeply hierarchical +one, and was not. + +Similarly, despite some early speculation about entering people's +names and email addresses into the DNS directly, with the sole +exception (at least in the "IN" class) of one field of the SOA +record, electronic mail addresses in the Internet have preserved the +original, pre-DNS, "user at location" conceptual format rather than a +flatter one. Location, in that instance, is a reference to a host. + +Both the DNS architecture itself and the two-level provisions for +email and similar functions (e.g., see the finger protocol), also +anticipated a relatively high ratio of users to actual hosts. It was +never clear that the DNS was intended to, or could, scale to the +order of magnitude of number of users (or, more recently, products or +document objects), rather than that of physical hosts. + +Like the host table before it, the DNS has provided criticial +uniqueness for names and universal accessibility to them as part of +overall "single internet" and "end to end" models (cf [RFC2826]). +However, there are many signs that, as new uses evolve and original +assmumptions are abused, the system is being stretched to, or beyond, +its practical limits. + +1.3 The web and user-visible domain names + +>From the standpoint of the integrity of the domain name system --and +scaling of the Internet, including optimal accessibility to content-- +the design decision to use "A record" domain names, rather than some +system of indirection, has proven to be a serious mistake in several +respects. Convenience of typing, and the desire to make domain names +out of easily-remembered product names, has led to a flattening of +the DNS, with many people now perceiving that second-level names +under COM (or in some countries, second- or third-level names under +the relevant ccTLD) are all that is meaningful (this perception has +been reinforced by some domain name registrars who have been anxious +to "sell" additional names). And, of course, the perception that one +needs a top-level domain per product, rather than a (usually +organizational) collection of network resources has led to a rapid +acceleration in the number of names being registered, a phenonenum +that has clearly benefited registrars charging on a per-name basis, +"cybersquatters", and others in the business of "selling" names, but +has not obviously benefitted the Internet as a whole. + +The emphasis on second-level domain names has also created a problem +for the trademark community. Since the Internet is international, +and names are being populated in a flat and unqualified space, +similarly-named entities are in conflict even if there would +ordinarily be no chance of confusing them in the marketplace. The +problem appears to be unsolvable except by a choice between draconian +measures --possibly including significant changes to the underlying +legislation and conventions-- and a situation in which the "rights" +to a name are typically not settled using the subtle and traditional +product (or industry) type and geopolitical scope rules of the +trademark system but by depending largely on main force, e.g., the +organization with the greatest resources to invest in defending (or +attacking) names will ultimately win out. The latter raises not only +important issues of equity, but the risk of backlash as the numerous +small players are forced to relinquish names they find attractive and +to adopt less-desirable naming conventions. + +Independent of these sociopolitical problems, content distribution +issues have made it clear that it should be possible for an +organization to have copies of data it wishes to make available +distributed around the network, with a user who asks for the +information by name getting the topologically-closest copy. This is +not possible with simple, as-designed, use of the DNS: DNS names +identify target resources or, in the case of email "MX" records, a +preferentially-ordered list of resources "closest" to a target (not +the source/user). Several technologies (and, in some cases, +corresponding business models) have arisen to work around these +problems, including intercepting and altering DNS requests so as to +point to other locations, +While additional implications are still being discovered and +seriously evaluated, it appears, not surprisingly, that rewriting DNS +names in the middle of the network, or trying to give them different +values or interpretations depending on the topological location of +the user trying to resolve the name interferes with end-to-end +applications in the general case. These problems occur even if the +rewriting machinery is accompanied by additional workarounds for +particular applications: security associations and applications that +need to identify "the same host" as the applications for which these +tools have been designed often run into one problem or another. + + +1.4 A pessimistic history of the evolution of Internet applications +protocols. + +At the applications level, few of the protocols in active, widespread +use on the Internet reflect the either contemporary knowledge in +computer science or human factors or experience accumulated through +deployment and use. Instead, protocols tend to be deployed at a +just-past-prototype level, typically including the types of expedient +compromises typical with prototypes. If they prove useful, the +nature of the network permit very rapid dissemination (i.e., they +fill a vacuum, even if one that no one previously knew existed). +But, once the vacuum is filled, the installed base provides its own +inertia: unless the design is so seriously faulty as to prevent +effective use (or there is a widely-perceived sense of impending +disaster unless the protocol is replaced), future developments must +maintain backward compatibility and workarounds for problematic +characteristics rather than benefiting from redesign in the light of +experience. Applications that are "almost good enough" prevent +development and deployment of high-quality replacements. + +2. Signs of DNS overloading + +Parts of the historical discussion above identify areas in which it +is becoming clear that the DNS is becoming overloaded (semantically +if not in the mechanical ability to resolve names). While we seem to +still be well within the "just about good enough" range -- current +mechanisms and proposals to deal with these problems are all focused +on patching or working around limitations within the DNS rather than +dramatic rethinking -- the number of these issues that are arising +at the same time may argue for rethinging mechanisms and +relationships, not just more patches and kludges. For example: + +o While technical approaches such as larger and higher-powered servers +and more bandwidth, and legal/political mechanisms such as dispute +resolution policies have arguably kept the problems from becoming +critical, the DNS has not proven adequately responsive to business +and individual needs to describe or identify things (such as product +names and names of individuals) other than strict network resources. + +o While stacks have been modified to better handle multiple addresses +on a physical interface and some protocols have been extended to +include DNS names for determining context, the DNS doesn't deal +especially well with high-multiple names per host (needed for web +hosting facilities with multiple domains on a server). + +o Efforts to add names deriving from languages or character sets +based on other than simple ASCII and English-like names (see below), +or even to utilize complex company or product names without the use +of hierarchy have created apparent requirements names (labels) that +are over 63 octets long. This requirement will undoubtedly increase +over time; while there are workarounds to accomodate longer names, +they impose their own restrictions and cause their own problems. + +o Increasing commercialization of the Internet, and visibility of +domain names that are assumed to match names of companies or +products, has turned the DNS and DNS names into a trademark +battleground. The traditional trademark system in (at least) most +countries makes careful distinctions about fields of applicability. +When the space is flattened, without differentiators by either +geography or industry sector, not only are there likely conflicts +between "Joe's Pizza" (of Boston) and "Joe's Pizza" (of San Francisco) +but between both and "Joe's Auto Repair" (of Los Angeles): all three +would like to control "Joes.com" and may claim trademark rights to do +so, even though conflict or confusion would not occcur with +traditional trademarks. + +o Many organizations wish to have different web sites under the same +URL and domain name. Sometimes this is to create local variations +--the Widget Company might want to present different material to a UK +user relative to a US one-- and sometimes it is to provide higher +performance by supplying information from the server topologically +closest to the user. Arguably, the name resolution mechanism should +provide information about multiple sites that can provide information +associated with the same name and sufficient attributes associated +with each of those sites to permit applications to make sensible +choices, or should accept client-site attributes and utilize them in +the search process. +o Many existing and proposed systems for "finding things on the +Internet" require a true search capability in which near matches can +be reported to the user and queries may be slightly ambiguous or +fuzzy. The DNS can accomodate only one set of (quite rigid) matching +rules. Current proposals to permit different rules in different +localities help to identify the problem, but, if applied directly to +the DNS either don't provide the level of flexibility that would be +desirable or tend to isolate different parts of the Internet from +each other (or both). Fuzzy or ambiguous searches are desirable for +(at least) resolution of business names that might have spelling +variations and for names that can be resolved into different sets of +glyphs depending on context. This goes beyond "mere" +canonicalization differences (different ways of representing the same +character) and into such relationships as the use of different +alphabets for the same language, Kanji-Hiragana relationships, etc. + +o The historical DNS and applications that make assumptions about how +it works impose significant risk (or forces technical kludges and +consequent odd restrictions), when one considers adding mechanisms +for use with various multi-character-set and multilingual +"internationalization" systems. Cf RFC 2825. + +o In order to provide proper functionality to the Internet, the DNS +must have a single unique root (see RFC 2826 for a discussion of this +issue). There are many desires for local treatment of names or +character sets that cannot be accomodated without either multiple +roots (e.g., a separate root for multilingual names) or mechanisms +that would have similar effects in terms of Internet fragmentation +and isolation. + +In each of these cases, it is, or might be, possible to devise ways +to trick the DNS system into supporting mechanisms that were not +designed into it. Several ingenious solutions have been proposed in +many of these areas already, and some have been successfully deployed +into the marketplace. + +Several of the above problems are addressed well by a good directory +system (supported by the LDAP protocol or otherwise) or searching +environment (such as common web search engines) although not by the +DNS. Given the difficulty of deploying new applications discussed +above, an important question is whether the kludges are bad enough, +or will scale up to bad enough, that new solutions are needed and can +be deployed. + + +3. The directory story. + +3.1 Overview +The constraints of the DNS argue for introducing an intermediate +protocol mechanism, referred to here as a "directory layer". +Directory layer proposals would use a two-stage lookup, not unlike +several of the IDN proposals, but would do the first lookup in a +directory system, rather than in the DNS itself. This would permit +us to relax several constraints and produce a more comprehensive +system. + +Ultimately, many of the issues with domain names arise as the result +of people attempting to use the DNS as a directory. While there +hasn't been enough pressure/demand to justify a change to date, it +has already been quite clear that, as a directory system, the DNS is +a good deal less than ideal. This document suggests that there +actually is a requirement for a directory system, and that the right +solution to a directory requirement is a directory, not a series of +DNS patches, kludges, or workarounds. + +In particular... + +* A directory system would not require imposition of particular +length limits on names. + +* A directory system could permit explicit association of attributes +of, e.g., language and country, with a name, without having to +utilize trick encodings to incorporate that information in DNS labels +(or creating artificial hierarchy for doing so). + +* There is considerable experience in doing fuzzy and "sonex" +(similar-sounding) matching in directory systems. Moreover, it is +plausible to think about different matching rules for different areas +and sets of names so that these can be adapted to local cultural +requirements. Specifically, it might be possible to have a single +form of a name in a directory, but to have great flexibility about +what queries matched that name (and even have different variations in +different areas). Of course, the more flexibility one provides, the +greater the possibility of real or imagined trademark conflicts, but +we would have the opportunity to design a directory structure that +dealt with those issues in an intelligent way, while DNS constraints +arguably make a general and equitable DNS-only solution impossible. + +* If a directory system is used to translate to DNS names, and then +DNS names are looked up in the normal fashion, it may be possible to +relax several of the constraints that have been traditional (and +perhaps necessary) with the DNS. For example, reverse-mapping of +addresses to directory names may not be a requirement, since the DNS +name(s) would (continue to) uniquely identify the host. + +* Solutions to multilingual transcription problems that are common in +"normal life" (e.g., two-sided business cards to be sure that a +recipient trying to contact a person can access romanized spellings +and numbers when the original language may not be comprehensible to +that recipient) can be easily handled in a directory system by +inserting both sets of entries. + +* One can easily imagine a directory system that would return, not a +single name, but a set of names paired with network-locational +information or other context-establishing attributes. This type of +information might be of considerable use in resolving the "nearest +(or best) server for a particular named resource" problems that are a +significant concern for organizations hosting web and other sites +that are accessed from a wide range of locations and subnets. + +* Names bound to countries and languages might help to manage +trademark realities, while use of the DNS in trademark-significant +areas tends to require worldwide "flattening" of the trademark +system. +3.2 Some details and comments. + +As several proposals have noted, almost any i18n proposal for names +that are in, or map into, the DNS will require changing DNS resolver +API calls ("gethostbyname" or equivalent or adding some +pre-resolution preparation mechanism) in almost all Internet +applications -- whether to cause the API to take a different +character set, to accept or return more arguments with qualifying or +identifying information, or otherwise. Once applications must be +opened to make such changes, it is a relatively small matter to +switch from calling into the DNS to calling a directory service and +then the DNS (in many situations, both actions could be accomplished +in a single API call). + +A directory approach can be consistent both with "flat" stories and +multi-attribute ones. The DNS requires strict hierarchies, limiting +its ability to handle differentiation among names by their +properties. By contrast, modern directories can utilize +independently-searched attributes and other structured schema to +provide flexibilities not present in a strictly hierarchical system. + +There is a strong argument for a single directory structure (implying +a need for mechanisms for registration, delegation, etc.). But it is +not a strict requirement, especially if in-depth case analysis and +design work leads to the conclusion that reverse-mapping to directory +names is not a requirement (see section 4). + +While the discussion above includes very general comments about +attributes, it appears that only a very small number of attributes +would be needed. The list would almost certainly include country and +language for IDN purposes and might require "charset" if we cannot +agree on a character set and encoding. Trademark issues might +motivate "commercial" and "non-commercial" (or other) attributes if +they would be helpful in bypassing trademark problems. And applications to +resource location might argue for a few other attributes (as outlined +above). + +4. The Controversies + +4.1. One directory or many + +As suggested in some of the text above, it is an open question as to +whether the needs of the community would be best served by a single +directory with universal applicability, a single directory but +locally-tailored search (and, most important, matching) functions, or +multiple, locally-determined, directories. Each has its attractions. +Any but the first would essentially prevent reverse-mapping +(determination of the user-visible name of the host or resource from +target information such as an address or DNS name). But reverse +mapping has become less useful over the years --at least to users-- +as we have assigned more and more names per host address. +Locally-tailored search and mappings would permit national variations +on interpretation of which strings matched which other ones, an +arrangement that is especially important when different localities +apply different rules to, e.g., matching of characters with and +without diacriticals. But, of course, this implies that a URL may +evaluate properly or not depending on either settings on a client +machine or the network connectivity of the user, which is not, in +general, a desirable situation. + +And, of course, completely separate directories would permit +translation and transliteration functions to be embedded in the +directory, given much of the Internet a different appearance +depending on which directory was chosen. The attractions of this are +obvious, but, unless things were very carefully designed to preserve +uniqueness and precise identities at the right points (which may or +may not be possible), such a system would have many of the +difficulties associated with multiple roots. + +4.2 Why not a proposal? + +As this document has gone through various preliminary drafts and +reviews, the question has been raised as to whether it should contain +a specific proposal: a specific directory mechanism, schema, and so +on. It deliberately does not take that step. It has been difficult +to get directory systems deployed in significant ways in the Internet +infrastructure, partially because we have a surplus of options. +There are also some approaches that could be used to implement the +general concepts described here, such as the Common Name Resolution +Protocol [RFC2972], which some would not consider directory protocols +at all. Consequently, it appeared better to present the general +concepts and arguments here and leave the specifics to other sources, +documents, and proposals. + +5. Security Considerations + +The set of proposals implied by this document suggests an interesting +set of security issues (i.e., nothing important is ever easy). A +directory system used for this purpose would presumably need to be as +carefully protected against unauthorized changes as the DNS itself. +There also might be new opportunities for problems in the two-layer +arrangement; but those problems are not more severe than a two-stage +lookup in the DNS. + + +6. References + +RFC 625 On-line hostnames service. M.D. Kudlick, E.J. Feinler. +Mar-07-1974. + +RFC 811 Hostnames Server. K. Harrenstien, V. White, E.J. Feinler. +Mar-01-1982. + +RFC 952 DoD Internet host table specification. K. Harrenstien, M.K. +Stahl, E.J. Feinler. Oct-01-1985. + +RFC 882 Domain names: Concepts and facilities. P.V. Mockapetris. +Nov-01-1983. + +RFC 883 Domain names: Implementation specification. P.V. Mockapetris. +Nov-01-1983. + +RFC 1035 Domain names - implementation and specification. P.V. +Mockapetris. Nov-01-1987. + +RFC 1591 Domain Name System Structure and Delegation. J. Postel. +March 1994. + +RFC 2825 A Tangled Web: Issues of I18N, Domain Names, and the Other +Internet protocols. IAB, L. Daigle, ed.. May 2000. + +RFC 2826 IAB Technical Comment on the Unique DNS Root. IAB. May 2000. + +RFC 2972 Context and Goals for Common Name Resolution. N. Popp, M. +Mealling, L. Masinter, K. Sollins. October 2000. + +ITU Recommendation X.9 + +ITU Recommendation X.25 + + +7. Culprit address + +John Klensin +AT&T Labs +99 Bedford Street +Boston, MA 02111 +klensin@research.att.com + +Expires May 2001