Improving Library Searches Using Word-Correlation Factors and Folksonomies

Maria Soledad Pera
Department of Computer Science
Master of Science

Libraries, private and public, offer valuable resources to their patrons; however, formulating library queries that retrieve relevant results can be difficult. When searching a library catalog, patrons often do not know the exact keywords to include in a query so that it matches the rigid subject terms (chosen by the Library of Congress) or the terms in other fields of a desired catalog record. These improperly formulated queries often translate into a high percentage of failed searches that retrieve irrelevant results or no results at all. This explains why frustrated library patrons nowadays perform their searches on Web search engines first and query the library catalog only after obtaining initial information, such as titles, subject areas, or authors. This searching strategy is evidence of the failure of today's library systems.

To solve this problem, we propose an enhanced library system, called EnLibS, which improves searches on library catalogs by allowing partial, similarity matching between (i) tags that ordinary users define at a folksonomy site to describe the content of books and (ii) keywords in a library query. The proposed library system allows patrons to post a query Q with commonly used words and ranks the retrieved results according to their degrees of resemblance to Q. Experimental results show that EnLibS (i) reduces the number of queries that retrieve no results, (ii) obtains high precision in retrieving relevant results and high accuracy in ranking them, and (iii) achieves a processing time comparable to that of existing library catalog search engines.
