REXTOR: A System for Generating Relations from Natural Language

This paper argues that a finite-state language model with a ternary expression representation is currently the most practical and suitable bridge between natural language processing and information retrieval. Although finite-state grammars are, in theory, computationally inadequate for natural language, they are very cost-effective in time and space and adequate for practical purposes. The ternary expressions that we use are not only linguistically motivated but also amenable to rapid large-scale indexing. REXTOR (Relations EXtracTOR) is an implementation of this model: within one uniform framework, the system provides two separate grammars, one for extracting arbitrary patterns of text and one for building ternary expressions from them. These content representations serve as the input to our ternary expression indexer. This approach to natural language information retrieval promises to significantly raise the performance of current systems.
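
The two-grammar idea can be illustrated with a minimal sketch. It is not REXTOR's actual rule syntax: the `word/TAG` input format, the tag names, the `NP` and `SVO` patterns, the `extract` function, and the `is` relation for adjectives are all assumptions made for illustration. A regular expression plays the role of the extraction grammar, and a small function plays the role of the grammar that builds ternary expressions from each match.

```python
import re

# Extraction rule (assumed): a noun phrase is an optional determiner,
# any number of adjectives, and a head noun. Input tokens look like "word/TAG".
NP = r"(?:\w+/DT )?((?:\w+/JJ )*)(\w+)/NN"

# Sentence-level pattern: NP, past-tense verb, NP. This is still a regular
# expression, so matching needs only finite-state machinery.
SVO = re.compile(NP + r" (\w+)/VBD " + NP)


def extract(tagged):
    """Return (subject, relation, object) triples found in a tagged string."""
    triples = []
    for match in SVO.finditer(tagged):
        subj_adjs, subj, verb, obj_adjs, obj = match.groups()
        # Relation rule (assumed): the verb relates the two head nouns.
        triples.append((subj, verb, obj))
        # Relation rule (assumed): each adjective yields a separate triple.
        for adj in re.findall(r"(\w+)/JJ", subj_adjs):
            triples.append((subj, "is", adj))
        for adj in re.findall(r"(\w+)/JJ", obj_adjs):
            triples.append((obj, "is", adj))
    return triples


print(extract("the/DT large/JJ dog/NN chased/VBD the/DT cat/NN"))
```

Because each result is a flat three-element tuple, a set of such triples could be indexed by any of its three positions, which is the property the abstract points to when it calls ternary expressions amenable to large-scale indexing.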
