Automatic Indexing and Abstracting of Document Texts

Preface.
Acknowledgements.

Part I: The Indexing and Abstracting Environment.
1. The Need for Indexing and Abstracting Texts.
2. The Attributes of Text.
3. Text Representations and Their Use.

Part II: Methods of Automatic Indexing and Abstracting.
4. Automatic Indexing: The Selection of Natural Language Index Terms.
5. Automatic Indexing: The Assignment of Controlled Language Index Terms.
6. Automatic Abstracting: The Creation of Text Summaries.

Part III: Applications.
7. Text Structuring and Categorization When Summarizing Legal Cases.
8. Clustering of Paragraphs When Summarizing Legal Cases.
9. The Creation of Highlight Abstracts of Magazine Articles.
10. The Assignment of Subject Descriptors to Magazine Articles.

Summary and Future Prospects.
References.
Subject Index.
