MARIAN: Flexible Interoperability for Federated Digital Libraries

Federated digital libraries are composed of distributed, autonomous, and often heterogeneous information services, yet they provide users with a transparent, integrated view of the collected information. In this paper we discuss a federated system for the Networked Digital Library of Theses and Dissertations (NDLTD), an international consortium of universities, libraries, and other supporting institutions focused on electronic theses and dissertations (ETDs). Federation requires dealing flexibly with differences among systems, ontologies, and data formats while respecting the autonomy of information sources. Our solution adapts the object-oriented digital library system MARIAN to serve as mediation middleware for the federated NDLTD collection. Components of the solution include: 1) the use and integration of several harvesting techniques; 2) an architecture based on object-oriented ontologies of search modules and metadata; 3) reconciliation of diversity within the harvested data, which is joined into a single collection view for the user; and 4) an integrated framework for addressing such questions as data quality, flexible and efficient search, and scalability.
