Knowledge-based clustering scheme for collection management and retrieval of library books

Abstract: We propose a knowledge-based clustering scheme for grouping books in a library. The grouping relies on domain knowledge in the form of the ACM CR (Computing Reviews) category hierarchy. A new knowledge-based similarity measure is defined and used to cluster the books. The proposed scheme helps overcome several problems associated with existing book collection management and document retrieval systems. More specifically, it can be used for: (1) helping the user select an appropriate collection of books in a library that covers the topics of interest; (2) assigning a classification number to a new book; (3) designing a more appropriate and uniform classification scheme for books; and (4) comparing libraries based on their collections. Initial experiments with the proposed clustering scheme on a collection of one hundred books have given encouraging results.
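
The abstract does not spell out the similarity measure, so the following is only a minimal sketch of how a hierarchy-based similarity could drive the grouping of books. Everything in it is an assumption made for illustration: the shared-prefix scoring, the best-match averaging, the single-link grouping, the function names, the threshold, and the assignment of ACM CR codes to the hypothetical books. It is not the measure defined in the paper.

```python
from itertools import combinations


def category_similarity(a, b):
    """Shared-prefix depth of two dotted category codes, scaled to [0, 1].

    Assumption: codes such as "H.3.3" encode a path in the hierarchy,
    so a longer common prefix means the categories are closer.
    """
    pa, pb = a.split("."), b.split(".")
    shared = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        shared += 1
    return shared / max(len(pa), len(pb))


def book_similarity(cats_a, cats_b):
    """Symmetrised average of each category's best match in the other book."""
    def directed(src, dst):
        return sum(max(category_similarity(s, d) for d in dst) for s in src) / len(src)
    return 0.5 * (directed(cats_a, cats_b) + directed(cats_b, cats_a))


def cluster(books, threshold):
    """Single-link grouping: two books are linked when similarity >= threshold."""
    parent = {title: title for title in books}

    def find(title):
        # Union-find lookup with path halving.
        while parent[title] != title:
            parent[title] = parent[parent[title]]
            title = parent[title]
        return title

    for a, b in combinations(books, 2):
        if book_similarity(books[a], books[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for title in books:
        groups.setdefault(find(title), []).append(title)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical books, each labelled with illustrative category codes.
books = {
    "book_a": ["H.3.3", "I.5.3"],
    "book_b": ["H.3.3", "H.3.1"],
    "book_c": ["D.3.2"],
    "book_d": ["D.3.2", "D.3.3"],
}

clusters = cluster(books, threshold=0.5)
```

With these assumed labels, books that share a subtree of the hierarchy end up in the same group, while books from unrelated top-level categories score zero and stay apart. A new book could be placed in the same way, by computing its similarity to the existing groups.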
