Descriptive Name Services for Large Internets

This thesis addresses the challenge of locating people, resources, and other objects in the global Internet. As the Internet grows beyond a million hosts in tens of thousands of organizations, it is increasingly difficult to locate any particular object. Hierarchical name services are frustrating because users must guess the unique names of objects or navigate the name space to find information. Descriptive (i.e., relational) name services promise that users can locate resources simply by describing resource attributes. This thesis makes that promise real by providing fast query processing in large internets. The key to speed in descriptive query processing is constraining the search space with two new techniques: an active catalog and meta-data caching. The active catalog constrains the search space for a query by returning a list of data repositories where the answer to the query is likely to be found. Its components are distributed indices that isolate queries to parts of the network, and algorithms that limit the search space using semantic, syntactic, or structural constraints. Meta-data caching improves performance by keeping frequently used characterizations of the search space close to the user, thus reducing active catalog communication and processing costs. Together, these techniques improve query performance by contacting only the data repositories likely to hold actual responses, which yields acceptable search times. The new techniques are integrated with existing data caching techniques through a single mechanism, called a referral. Referrals describe the conditions for using active catalog components or for re-using meta-data or data cache entries. Our prototype descriptive name service, called Nomenclator, employs these techniques. In one measurement study, Nomenclator consistently improved the performance of descriptive queries in X.500. Another measurement study shows how Nomenclator uses a small investment of network bandwidth and server resources to improve the response time for a wide range of query sizes. An analytical modeling study shows that Nomenclator can amortize this investment over many queries to provide an overall reduction in system load and, as a consequence, better scaling and response time.
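
The interplay between the active catalog, meta-data caching, and referrals can be illustrated with a small sketch. This is not Nomenclator's implementation: the class names (`Referral`, `ActiveCatalog`, `MetaDataCache`), the single attribute-equals-value condition, and the repository names are all hypothetical, chosen only to show how a cached referral might let later queries skip the catalog and contact only the repositories likely to hold answers.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Referral:
    """Hypothetical referral: a reuse condition plus the repositories to contact."""
    attribute: str        # attribute the condition constrains, e.g. "org"
    value: str            # value the query must request for that attribute
    repositories: tuple   # repositories likely to hold answers

    def applies_to(self, query: dict) -> bool:
        # The referral may be used only when the query satisfies its condition.
        return query.get(self.attribute) == self.value


@dataclass
class ActiveCatalog:
    """Stand-in for a remote catalog; counts how often it is contacted."""
    referrals: list
    lookups: int = 0

    def constrain(self, query: dict):
        self.lookups += 1
        for referral in self.referrals:
            if referral.applies_to(query):
                return referral
        return None


@dataclass
class MetaDataCache:
    """Keeps referrals near the user so later queries can skip the catalog."""
    catalog: ActiveCatalog
    cached: list = field(default_factory=list)

    def repositories_for(self, query: dict, all_repositories: tuple) -> list:
        for referral in self.cached:
            if referral.applies_to(query):
                return list(referral.repositories)   # cache hit, no catalog traffic
        referral = self.catalog.constrain(query)
        if referral is None:
            return list(all_repositories)            # no constraint known: search everywhere
        self.cached.append(referral)
        return list(referral.repositories)


ALL_REPOSITORIES = ("repo-a", "repo-b", "repo-c")
catalog = ActiveCatalog([Referral("org", "example.edu", ("repo-b",))])
cache = MetaDataCache(catalog)

first = cache.repositories_for({"org": "example.edu", "name": "Smith"}, ALL_REPOSITORIES)
second = cache.repositories_for({"org": "example.edu", "name": "Jones"}, ALL_REPOSITORIES)
```

In this toy setup both queries are directed to the single matching repository, and the second one is answered from the cached referral without contacting the catalog again. That reuse is the kind of amortization over many queries that the abstract describes.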
