An Integrated Digital Library Server with OAI and Self-Organizing Capabilities

The Open Archives Initiative (OAI) is an experimental initiative for the interoperability of Digital Libraries (DLs) based on metadata harvesting. Its goal is to develop and promote interoperability solutions that facilitate the efficient dissemination of content. At present, however, several challenging issues, such as incorrect, poor-quality, and inconsistent metadata, still have to be solved before a variety of high-quality services can be built. In this paper we propose an integrated DL system with OAI and self-organizing capabilities. The system provides two value-added services, cross-archive searching and interactive concept browsing, for organizing, exploring, and searching a collection of harvested metadata to satisfy users’ information needs. We also propose a multi-layered Self-Organizing Map (SOM) algorithm that builds a subject-specific concept hierarchy from two input vector sets constructed by indexing the harvested metadata collection. Using the concept hierarchy, we can also automatically classify the harvested metadata collection for the purpose of selective harvesting.
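
To make the self-organizing idea concrete, the following is a minimal sketch of a plain single-layer SOM trained on toy term vectors. It is not the multi-layered algorithm proposed in the paper, whose details are not given in this abstract. The function names (`train_som`, `best_matching_unit`), the 2x2 map size, the learning-rate and neighborhood schedules, and the binary toy vectors are all assumptions chosen for illustration.

```python
import math
import random


def best_matching_unit(weights, vector):
    """Return the grid position whose weight vector is closest to `vector`."""
    return min(
        weights,
        key=lambda node: sum((w - x) ** 2 for w, x in zip(weights[node], vector)),
    )


def train_som(vectors, rows=2, cols=2, epochs=50, lr0=0.5, seed=0):
    """Train a small single-layer SOM (illustrative, hypothetical parameters).

    Returns a dict mapping (row, col) grid positions to weight vectors.
    """
    rng = random.Random(seed)
    dim = len(vectors[0])
    # Random initial weights in [0, 1) for every map node.
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighborhood radius
        for v in vectors:
            bmu = best_matching_unit(weights, v)
            for node, w in weights.items():
                # Gaussian neighborhood around the best-matching unit.
                d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma * sigma))
                for i in range(dim):
                    w[i] += lr * h * (v[i] - w[i])
    return weights


# Toy binary term vectors over a hypothetical 4-term vocabulary;
# each row stands for one harvested metadata record.
records = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]
som = train_som(records)
for rec in records:
    print(rec, "->", best_matching_unit(som, rec))
```

Records that share terms tend to land on the same or neighboring map nodes, which is the property a concept-browsing or classification service would build on. A hierarchical variant would repeat this kind of training on the records assigned to each node.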
