Efficient query expansion

Hundreds of millions of users each day search the web and other repositories to meet their information needs. However, queries can fail to find documents due to a mismatch in terminology. Query expansion seeks to address this problem by automatically addi
