Network Working Group                                        P. Stickler
Internet-Draft                                                       NRC
Expires: July 17, 2002                                  January 16, 2002


   An Extended Class Taxonomy for Uniform Resource Identifier Schemes
                    draft-pstickler-uri-taxonomy-00

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on July 17, 2002.

Copyright Notice

   Copyright (C) The Internet Society (2002).  All Rights Reserved.

Abstract

   This document defines a taxonomy of URI classes which extends the
   set of classes defined in RFC 2396 [15].

Stickler                 Expires July 17, 2002                  [Page 1]
Internet-Draft            Extended URI Taxonomy             January 2002

Table of Contents

   1.   Introduction
   2.   Uniform Resource Primitive (URP)
   2.1  Uniform Resource Term (URT)
   2.2  Uniform Resource Value (URV)
   3.   Basis for Distinction between URI Classes
   4.   Security Considerations
   5.   A (Re)Classification of Select URI Schemes
        References
        Author's Address
        Full Copyright Statement

1. Introduction

   RFC 2396 [15] defines two subclasses of URI: Uniform Resource
   Locators (URLs) and Uniform Resource Names (URN) as follows:

      "A URI can be further classified as a locator, a name, or both.
      The term "Uniform Resource Locator" (URL) refers to the subset of
      URI that identify resources via a representation of their primary
      access mechanism (e.g., their network "location"), rather than
      identifying the resource by name or by some other attribute(s) of
      that resource.  The term "Uniform Resource Name" (URN) refers to
      the subset of URI that are required to remain globally unique and
      persistent even when the resource ceases to exist or becomes
      unavailable."

   Thus, the URI taxonomy defined by RFC 2396 is as follows:

                               URI
                                |
                        -----------------
                        |               |
                       URL             URN

   A recent publication by the W3C [8] provides a useful discussion
   regarding the role and purpose of such classifications, and their
   significance to software applications as a formal partitioning of
   the URI space.

   Per the terminology offered by this recent publication, this
   document does not strictly subscribe to either the "classical" or
   "contemporary" view, but serves as a useful resource for both.

   From the classical perspective, one may view the extended taxonomy
   defined herein as a -formal- partitioning of URI schemes such that
   each scheme is a member of one and only one URI class and an
   application may infer semantics or characteristics about URIs of a
   given scheme based on the defined semantics or characteristics of
   the URI class to which it belongs.
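   Under the classical view, the assignment of schemes to classes
   behaves like a function: each scheme maps to exactly one class.  The
   following is a minimal sketch in Python, not a normative definition.
   The table name SCHEME_CLASS, its sample entries, and the function
   name uri_class are assumptions made for illustration only.

```python
# Illustrative only: a classical-view partition in which every known
# scheme maps to exactly one URI class. The table contents are sample
# assumptions, not a registry defined by this document.
SCHEME_CLASS = {
    "http": "URL",
    "ftp": "URL",
    "mailto": "URL",
    "urn": "URN",
}


def uri_class(uri: str) -> str:
    """Return the URI class inferred from the scheme of `uri`.

    Raises ValueError if the URI has no scheme or the scheme has not
    been assigned a class in SCHEME_CLASS.
    """
    scheme, sep, _rest = uri.partition(":")
    if not sep or not scheme:
        raise ValueError(f"no scheme in {uri!r}")
    try:
        # Scheme names are compared case-insensitively.
        return SCHEME_CLASS[scheme.lower()]
    except KeyError:
        raise ValueError(f"unclassified scheme {scheme!r}") from None
```

   Because the table is a plain mapping, a scheme cannot be assigned to
   two classes at once, which mirrors the "one and only one" property
   described above.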
   From the contemporary perspective, one may view the extended
   taxonomy defined herein as an -informal- partitioning of URI schemes
   facilitating communication between humans only, allowing the shared
   semantics and characteristics of similar URI schemes to be more
   effectively expressed, but requiring that an application deal with
   each URI scheme in isolation, regardless of what URI class the
   scheme may be considered a member of.

   In addition to URLs and URNs, this document defines a further class
   of URI: Uniform Resource Primitives (URP), which is itself divided
   into two subclasses: Uniform Resource Terms (URT) and Uniform
   Resource Values (URV).  This yields the following extended URI
   taxonomy:

                               URI
                                |
                        -----------------
                        |       |       |
                       URL     URP     URN
                                |
                            ---------
                            |       |
                           URT     URV

2. Uniform Resource Primitive (URP)

   URPs constitute a class of URIs which we may describe as "fully
   resolved" or "non-dereferenceable", such that they are
   self-contained and do not represent any other resource -- hence the
   term 'primitive'.  The entire resource is embodied in the actual URI
   form.

   In contrast to most other URI schemes, which serve as mechanisms of
   indirection, the purpose of URPs is primarily the attribution or
   classification of other resources.  One may make statements about
   URPs (e.g. via RDF), as one may about any resource, but URPs do not
   resolve or dereference to any other resource.

   URPs constitute "WYSIWYG" URIs, i.e., "what you see (in the URI) is
   what you get".  This quality should become clear following the
   discussion of URTs and URVs below.

   Because a URP does not resolve to any further data than what is
   embodied in the URI itself, the concept of a fragment has no meaning
   in the context of a URP; therefore, URPs do not allow fragment
   identifiers and there is no such thing as a "URP Reference".
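   The prohibition on fragment identifiers can be checked mechanically.
   The following is a minimal sketch in Python under stated
   assumptions: the set URP_SCHEMES and the function name check_urp are
   hypothetical, and the two sample entries are drawn from the URT and
   URV scheme examples given in this document.

```python
# Assumption: a small sample of schemes treated as URPs. A real
# application would populate this from whatever classification it uses.
URP_SCHEMES = {"voc", "data"}


def check_urp(uri: str) -> str:
    """Return `uri` unchanged if it is an acceptable URP.

    Raises ValueError if the scheme is not a known URP scheme or if
    the URI carries a fragment identifier ('#...'), which URPs do not
    allow.
    """
    scheme, sep, _rest = uri.partition(":")
    if not sep or scheme.lower() not in URP_SCHEMES:
        raise ValueError(f"not a known URP scheme: {uri!r}")
    # In generic URI syntax an unescaped '#' always introduces a
    # fragment identifier, so its presence is sufficient to reject.
    if "#" in uri:
        raise ValueError(f"URPs do not allow fragment identifiers: {uri!r}")
    return uri
```

   For example, check_urp("voc://abc.com/widgets/hk28") returns its
   argument, whereas the same URI with "#part" appended is rejected.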
2.1 Uniform Resource Term (URT)

   A URT represents a member of a finite, controlled and explicitly
   enumerated vocabulary of terms, possibly constituting and organized
   as a taxonomy.

   All URTs are defined by the same authority defining the scheme, or a
   proxy agency authorized by that authority, and the creating
   authority is (directly or indirectly) the same as for the scheme
   itself.

   Examples of URT schemes:

      voc: [11]

   Potential URT defined vocabularies:

      ISO 639 language codes
      ISO 3166 country codes
      Thesaurus of Geographic Names (TGN) [16]
      Upper Cyc Ontology [17]
      WordNet [18]

   and all vocabularies for essentially all abstract ontologies, XML
   and RDF schemata, and classificatory code sets.

2.2 Uniform Resource Value (URV)

   A URV is a lexically defined data structure.

   URVs have the unique quality that they may be defined by agents
   other than the authority defining the scheme, so long as they are
   defined in accordance with the lexical constraints defined for the
   scheme; and unlike all other classes of URI schemes, the creating
   authority is not discernible from the URI itself.

   Examples of URV schemes:

      data: [5]
      uri: [9]
      qname: [12]
      xmlns: [13]
      tdl: [10]

   The key difference between URTs and URVs is that all values of a URT
   are defined (directly or indirectly) by a single, central authority,
   whereas URVs may be defined by anyone, so long as they conform to
   the lexical constraints defined for the scheme.

3. Basis for Distinction between URI Classes

   Although historically the distinction between URL and URN has been
   based primarily on the concepts of 'location' versus 'name', and
   although such concepts and the distinctions based on those concepts
   are still meaningful and useful, an alternative basis for
   differentiation between URI classes is adopted for this extended
   taxonomy.
   This extended taxonomy distinguishes between terminal URI classes
   according to whether the URI is dereferenceable or
   non-dereferenceable, and whether the URI itself denotes the agency
   employed in:

   1.  the definition of the URI scheme or subscheme/namespace

   2.  the creation of the URI

   3.  the interpretation of the URI

   For the purposes of this distinction, an -agency- corresponds to a
   human, organization, or computer system which is the primary
   authority and/or actor involved in any of these three roles.  An
   agency (particularly an authority) may be denoted only implicitly in
   the URI by right of owning or controlling a given web authority
   denoted explicitly by the URI.

   This gives us the following matrix, indicating where agencies are
   denoted (either implicitly or explicitly) in the URI itself:

                          URL  URN  |  URT  URV
                        ------------|------------
      Definition           *    *   |   *    *
      Creation             *    *   |   *
      Interpretation       *        |

   Where '*' indicates denotation and the vertical bar separates
   dereferenceable schemes on the left from non-dereferenceable schemes
   on the right.
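   The matrix above can also be expressed as a small lookup structure.
   The following is a minimal sketch in Python; the names Role,
   DENOTED, DEREFERENCEABLE, and denotes are illustrative assumptions
   and are not defined by this document.

```python
from enum import Enum


class Role(Enum):
    """The three agency roles considered by the taxonomy."""
    DEFINITION = "definition"
    CREATION = "creation"
    INTERPRETATION = "interpretation"


# Which agency roles each terminal URI class denotes, transcribed
# from the matrix ('*' entries).
DENOTED = {
    "URL": {Role.DEFINITION, Role.CREATION, Role.INTERPRETATION},
    "URN": {Role.DEFINITION, Role.CREATION},
    "URT": {Role.DEFINITION, Role.CREATION},
    "URV": {Role.DEFINITION},
}

# Left of the vertical bar: dereferenceable; right of it: not.
DEREFERENCEABLE = {"URL": True, "URN": True, "URT": False, "URV": False}


def denotes(uri_class: str, role: Role) -> bool:
    """Return True if URIs of `uri_class` denote the agency for `role`."""
    return role in DENOTED[uri_class]
```

   Note that URN and URT denote the same set of roles in this
   structure; they differ only in dereferenceability, which is why both
   axes are needed to separate the four terminal classes.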
   Thus, given the HTTP URL "http://xyz.foo.com/bar", we have the
   following agencies denoted:

      Definition:      the owner of the 'http:' URI scheme
      Creation:        the owner of the subdomain 'xyz.foo.com'
      Interpretation:  the computer system residing at 'xyz.foo.com'

   Given the URN "urn:foo:bar", we have the following agencies denoted:

      Definition:      the owner of the URN Namespace 'urn:foo'
      Creation:        the owner of the URN Namespace 'urn:foo'
      Interpretation:  (undefined)

   Given the URT "voc://abc.com/widgets/hk28", we have the following
   agencies denoted:

      Definition:      the owner of the URT vocabulary
                       'voc://abc.com/widgets'
      Creation:        the owner of the URT vocabulary
                       'voc://abc.com/widgets'
      Interpretation:  (undefined)

   Given the URV "xyz:integer:42", we have the following agencies
   denoted:

      Definition:      the owner of the URV scheme 'xyz:integer'
      Creation:        (undefined)
      Interpretation:  (undefined)

   Thus, given the above basis of distinction, the primary difference
   between URLs and URNs is that URLs denote both the agency which
   mints the URI and the agency which is expected to interpret the URI,
   whereas URNs denote only the agency minting the URI and leave the
   agency of interpretation (dereferencing) unspecified.  This results
   in a more persistent identifier, because a URN does not depend for
   its interpretation on the continued validity of the location or
   identity of any given computer system.

   While it may be argued that a URL may be interpreted by an agency
   other than that denoted in the URI itself, and thus that the above
   distinction is not meaningful, it is reasonable to consider such
   interpretation as contrary to the wishes of the creator of the URI,
   since the creator chose a URL rather than a URN as the class of
   identifier in the first place.

4. Security Considerations

   This document raises no known security issues.

5. A (Re)Classification of Select URI Schemes

   The following chart suggests one possible organization of a number
   of currently defined URI Schemes in terms of the extended taxonomy
   defined in this document:

                                         |--- http:   [1]
                                         |
                                         |--- ftp:    [1]
             |----------------- URL ---|
             |                           |--- mailto: [1]
             |                           |
             |                           |--- rtsp:   [2]
             |
             |
             |                           |--- urn:    [3]
             |----------------- URN ---|
             |                           |--- hrn:    [7]
      URI ---|
             |
             |                           |--- tag:    [4]
             |           |----- URT ---|
             |           |               |--- voc:    [11]
             |           |
             |           |
             |           |               |--- data:   [5]
             |--- URP ---|               |
                         |               |--- uri:    [9]
                         |               |
                         |               |--- qname:  [12]
                         |               |
                         |----- URV ---|--- xmlns:  [13]
                                         |
                                         |--- auth:   [14]
                                         |
                                         |--- tdl:    [10]
                                         |
                                         |--- esl:    [6]

   Note: not all of the URI schemes in the diagram above are yet fully
   registered URI schemes, though all are either registered or in the
   process of registration.

References

   [1]   Berners-Lee, T., Masinter, L. and M. McCahill, "Uniform
         Resource Locators (URL)", RFC 1738, December 1994.
   [2]   Schulzrinne, H., Rao, A. and R. Lanphier, "Real Time Streaming
         Protocol (RTSP)", RFC 2326, April 1998.
   [3]   Moats, R., "URN Syntax", RFC 2141, May 1997.
   [4]   Kindberg, T. and S. Hawke, "The 'tag' URI scheme and URN
         namespace", September 2001.
   [5]   Masinter, L., "The "data" URL scheme", RFC 2397, August 1998.
   [6]   Palmer, S., "The "esl" URI scheme", September 2001.
   [7]   Stickler, P., "The 'hrn:' URI Scheme for Hierarchical Resource
         Names", January 2002.
   [8]   W3C/IETF, "URIs, URLs, and URNs: Clarifications and
         Recommendations 1.0", September 2001.
   [9]   Stickler, P., "The 'uri:' URI Scheme for URI Reification",
         January 2002.
   [10]  Stickler, P., "The 'tdl:' URI Scheme for Typed Data Literals",
         January 2002.
   [11]  Stickler, P., "The 'voc:' URI Scheme for Vocabulary Terms and
         Codes", January 2002.
   [12]  Stickler, P., "The 'qname:' URI Scheme for XML Namespace
         Qualified Names", January 2002.
   [13]  Stickler, P., "The 'xmlns:' URI Scheme for XML Namespace
         Declarations", January 2002.
   [14]  Stickler, P., "The 'auth:' URI Scheme for Hierarchical
         Authority Identifiers", January 2002.
   [15]  Berners-Lee, T., Fielding, R. and L. Masinter, "Uniform
         Resource Identifiers (URI): Generic Syntax", RFC 2396, August
         1998.
   [16]
   [17]
   [18]

Author's Address

   Patrick Stickler
   Nokia Research Center
   Visiokatu 1
   Tampere 33720
   FI

   EMail: patrick.stickler@nokia.com

Full Copyright Statement

   Copyright (C) The Internet Society (2002).  All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph
   are included on all such copies and derivative works.  However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgement

   Funding for the RFC Editor function is currently provided by the
   Internet Society.