Open digital libraries

Digital Libraries (DLs) are software systems specifically designed to assist users in information-seeking activities. Stemming from the intersection of library sciences and computer networking, traditional DL systems impose library philosophies of structure and management on the sprawling collections of data that are made possible through the Internet. DLs evolve to keep pace with innovation on the Internet, so there is little standardization in the architecture of such systems. However, in attempting to provide users with the highest possible level of service for the minimum possible effort, many systems work collaboratively with others, e.g., meta-search engines. This type of system interoperability is encouraged by the emergence of simple data transfer protocols such as the Open Archives Initiative's Protocol for Metadata Harvesting (OAI-PMH). Open Digital Libraries are an extension of the work of the OAI. This dissertation proposes that the philosophy and approach adopted by the OAI can easily be extended to support inter-component interaction within a componentized DL. In particular, DLs can be built by connecting small components that communicate through a family of lightweight protocols, using XML as the data interchange mechanism. To test the feasibility of this approach, a set of protocols was designed based on a generalization of the work of the OAI. Components adhering to these protocols were implemented and integrated into production and research DLs. These systems were then evaluated for simplicity, reusability, and performance. On the whole, this study has shown that applying the fundamental concepts of the OAI protocol to the task of DL component design and implementation is a promising approach. Further, it has shown the feasibility of building componentized DL systems using techniques that are a precursor to the Web Services approach to system design.
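
As an illustration of the lightweight, XML-based style of interaction described above, the following minimal Python sketch shows how a component might encode an OAI-PMH request as an HTTP GET URL and read record identifiers from an XML response. It is not taken from the dissertation: the function names `build_request` and `list_identifiers`, the base URL `http://example.org/oai`, and the sample response are hypothetical, and only the `verb` and `metadataPrefix` parameters, the `ListIdentifiers` verb, and the XML namespace come from OAI-PMH version 2.0.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespace used by OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def build_request(base_url, verb, **args):
    """Encode a request as an HTTP GET URL (hypothetical helper)."""
    params = {"verb": verb}
    params.update(args)
    return base_url + "?" + urlencode(params)


def list_identifiers(response_xml):
    """Extract the identifier of every record header in a response."""
    root = ET.fromstring(response_xml)
    return [
        header.findtext("oai:identifier", namespaces=OAI_NS)
        for header in root.iterfind(".//oai:header", OAI_NS)
    ]


# Hand-written sample response, for illustration only.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListIdentifiers>
    <header>
      <identifier>oai:example.org:1</identifier>
      <datestamp>2002-01-01</datestamp>
    </header>
    <header>
      <identifier>oai:example.org:2</identifier>
      <datestamp>2002-01-02</datestamp>
    </header>
  </ListIdentifiers>
</OAI-PMH>"""

url = build_request("http://example.org/oai", "ListIdentifiers",
                    metadataPrefix="oai_dc")
identifiers = list_identifiers(SAMPLE_RESPONSE)
```

In this sketch the request is a plain URL and the response is plain XML, so any component that can issue an HTTP GET and parse XML could take part in the exchange; this simplicity is the property the componentized approach relies on.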
