A simple relevancy-ranking strategy for an interface to Boolean OPACs

A relevancy‐ranking algorithm for a natural language interface to Boolean online public access catalogs (OPACs) was formulated and compared with the one currently used in the E‐Referencer, a knowledge‐based search interface being developed by the authors. The algorithm makes use of seven well‐known ranking criteria: breadth of match, section weighting, proximity of query words, variant word forms (stemming), document frequency, term frequency and document length. It converts a natural language query into a series of increasingly broad Boolean search statements. In a small experiment with ten subjects, in which the algorithm was simulated by hand, it obtained a mean overall precision of 0.42 and a mean average precision of 0.62, a 27 percent improvement in precision and a 41 percent improvement in average precision over the E‐Referencer. The usefulness of each step in the algorithm was analyzed, and suggestions are made for improving the algorithm.
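The idea of turning a natural language query into increasingly broad Boolean statements can be illustrated with a minimal sketch. It is not the authors' procedure: it models only the breadth-of-match criterion (all query words first, then progressively fewer), and the stopword list, the function names, and the `search` callable are all assumptions made for illustration. Section weighting, proximity, stemming and the frequency-based criteria are not modeled.

```python
from itertools import combinations

# Hypothetical stopword list; the abstract does not specify one.
STOPWORDS = {"a", "an", "the", "of", "in", "on", "for", "and", "or", "to", "about"}


def content_words(query):
    """Lowercase the query, strip punctuation, drop stopwords and duplicates."""
    words = []
    for token in query.lower().split():
        token = token.strip(".,;:?!\"'()")
        if token and token not in STOPWORDS and token not in words:
            words.append(token)
    return words


def format_statement(groups):
    """Join word groups into one Boolean statement (AND inside, OR between)."""
    if len(groups) == 1:
        return " AND ".join(groups[0])
    parts = []
    for group in groups:
        text = " AND ".join(group)
        parts.append(f"({text})" if len(group) > 1 else text)
    return " OR ".join(parts)


def broadening_statements(query):
    """Return Boolean statements ordered from narrowest to broadest.

    The first statement requires every content word; each later statement
    requires one word fewer, ending with a plain OR of all words.
    """
    words = content_words(query)
    statements = []
    for size in range(len(words), 0, -1):
        groups = list(combinations(words, size))
        statements.append(format_statement(groups))
    return statements


def rank_records(statements, search):
    """Rank record IDs by the first (narrowest) statement that retrieves them.

    `search` is a hypothetical callable that sends one Boolean statement to
    the catalog and returns a list of record IDs.
    """
    ranked, seen = [], set()
    for statement in statements:
        for record_id in search(statement):
            if record_id not in seen:
                seen.add(record_id)
                ranked.append(record_id)
    return ranked
```

For the query "History of solar energy", `broadening_statements` yields three statements: all three content words joined by AND, then the OR of every two-word AND pair, then the OR of the single words. Records retrieved by an earlier statement are placed ahead of those that only a broader statement retrieves, which gives a ranked list from a system that natively supports only Boolean retrieval.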
