The use of phrases and structured queries in information retrieval

Both phrases and Boolean queries have a long history in information retrieval, particularly in commercial systems. In previous work, Boolean queries have been used as a source of phrases for a statistical retrieval model. This work, like the majority of research on phrases, resulted in little improvement in retrieval effectiveness. In this paper, we describe an approach where phrases identified in natural language queries are used to build structured queries for a probabilistic retrieval model. Our results show that using phrases in this way can improve performance, and that phrases automatically extracted from a natural language query perform nearly as well as manually selected phrases.
