Variations on language modeling for information retrieval

Search engine technology builds on theoretical and empirical research results in the area of information retrieval (IR). This dissertation makes a contribution to the field of language modeling (LM) for IR, which views both queries and documents as instances of a unigram language model and defines the matching function between a query and each document as the probability that the query terms are generated by the document language model. The work described is concerned with three research issues.
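
The matching function described above can be illustrated with a minimal sketch. The abstract does not say how the document language model is estimated, so this example assumes linear interpolation of the document model with a collection-wide model (often called Jelinek-Mercer smoothing). The function name `query_log_likelihood`, the weight `lam`, and the toy documents are all hypothetical and are not taken from the dissertation.

```python
import math
from collections import Counter


def query_log_likelihood(query, doc, collection_counts, collection_len, lam=0.5):
    """Return log P(query | doc) under a smoothed unigram document model.

    Assumption (not from the source): the document model is linearly
    interpolated with a collection model, weighted by `lam`.
    """
    doc_counts = Counter(doc)
    doc_len = len(doc)
    score = 0.0
    for term in query:
        # Maximum-likelihood estimate of the term in the document.
        p_doc = doc_counts[term] / doc_len if doc_len else 0.0
        # Background estimate of the term in the whole collection.
        p_col = collection_counts[term] / collection_len
        p = lam * p_doc + (1.0 - lam) * p_col
        if p == 0.0:
            # The term occurs nowhere in the collection.
            return float("-inf")
        score += math.log(p)
    return score


# Toy collection (hypothetical data, for illustration only).
docs = {
    "d1": "language models for information retrieval".split(),
    "d2": "statistical machine translation of language".split(),
}
collection_counts = Counter(t for d in docs.values() for t in d)
collection_len = sum(collection_counts.values())

query = "information retrieval".split()
scores = {
    name: query_log_likelihood(query, terms, collection_counts, collection_len)
    for name, terms in docs.items()
}
# Rank documents by decreasing query likelihood.
ranking = sorted(scores, key=scores.get, reverse=True)
```

Working in log space avoids numerical underflow when queries are long. The interpolation keeps a document from receiving zero probability merely because it lacks one query term.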
