A Parallel System for Textual Inference

This paper presents a possible solution to the text inference problem: extracting information that is implied by a text but not stated in it. Text inference is central to natural language applications such as information extraction and dissemination, text understanding, summarization, and translation. Our solution takes advantage of a semantic English dictionary available in electronic form that provides the basis for the development of a large linguistic knowledge base. The inference algorithm consists of a set of highly parallel search methods that, when applied to the knowledge base, find contexts in which sentences are interpreted. These contexts reveal information relevant to the text. Implementation, results, and parallelism analysis are discussed.

This paper addresses the issue of parallelism in a class of problems that is largely unexplored, yet of growing importance. Text inference refers to the problem of extracting information that is not stated directly in a text, but is implied. This may be achieved by reasoning about a text, making logical judgments on the basis of circumstantial evidence from a large knowledge base that contains knowledge about the world. A related but much simpler problem is information retrieval, where the goal is the recognition of facts, events, and properties that are explicitly stated in the text. Although information retrieval systems that process millions of sentences per minute with an accuracy close to that of humans have been built [25], large-scale inference has not yet been automated. The major obstacles that need to be resolved are: (1) building knowledge bases large enough to capture world knowledge, (2) finding a knowledge representation scheme suited to common sense reasoning, and (3) developing inference methods and control mechanisms able to provide relevant inferences at speeds comparable to humans. In this paper we present a parallel inference system that operates on a very large linguistic knowledge base.
The system is scalable both in size and accuracy and is highly parallel. The novelty of this work derives from our use of an extended linguistic knowledge base for the English language called WordNet, and an inference algorithm that consists of a set of parallel search procedures over the linguistic semantic network (i.e., the knowledge base). WordNet is being developed at Princeton by a group led by Miller [17]. Text inference is of great importance, especially today when there are many newspapers, books and other …

S. Harabagiu is with the … and reference IEEECS Log Number D96261.
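The idea of parallel search procedures over a semantic network can be illustrated with a minimal sketch. It is not the authors' algorithm: the tiny network, its relation names, and the functions `propagate` and `find_contexts` are hypothetical and are not taken from WordNet. The sketch assumes that a "context" can be approximated by a concept reached from both input words, together with the two connecting paths. The searches are independent of one another, so they are submitted to a thread pool. In CPython the threads only illustrate that independence and give no real speedup.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor

# Toy semantic network (hypothetical; relation names are illustrative only).
# Each concept maps to a list of (relation, target concept) edges.
NETWORK = {
    "hungry": [("causes", "eat")],
    "eat": [("is_a", "consume"), ("has_object", "food")],
    "restaurant": [("is_a", "building"), ("location_of", "eat")],
    "food": [("is_a", "substance")],
    "consume": [],
    "building": [],
    "substance": [],
}


def propagate(source, max_depth=3):
    """Breadth-first search from `source`.

    Returns a dict mapping every concept reached within `max_depth`
    edges to the path (concepts interleaved with relations) that led to it.
    """
    paths = {source: [source]}
    frontier = deque([(source, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for relation, target in NETWORK.get(node, []):
            if target not in paths:  # keep only the first (shortest) path
                paths[target] = paths[node] + [relation, target]
                frontier.append((target, depth + 1))
    return paths


def find_contexts(word_a, word_b, max_depth=3):
    """Run one search per word concurrently and return the concepts
    reached by both, shortest combined path first."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        paths_a, paths_b = pool.map(
            lambda word: propagate(word, max_depth), [word_a, word_b]
        )
    meeting = sorted(
        set(paths_a) & set(paths_b),
        key=lambda c: (len(paths_a[c]) + len(paths_b[c]), c),
    )
    return [(c, paths_a[c], paths_b[c]) for c in meeting]


if __name__ == "__main__":
    for concept, path_a, path_b in find_contexts("hungry", "restaurant"):
        print(concept, path_a, path_b)
```

In this toy network the words "hungry" and "restaurant" first meet at the concept "eat", which is the kind of unstated link (people go to restaurants to eat when hungry) that text inference aims to recover.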

[1] Sanda M. Harabagiu, et al. A parallel algorithm for text inference, 1996, Proceedings of International Conference on Parallel Processing.

[2] Kenneth Ward Church, et al. Word Association Norms, Mutual Information, and Lexicography, 1989, ACL.

[3] Janyce Wiebe, et al. Word-Sense Disambiguation Using Decomposable Models, 1994, ACL.

[4] Hwee Tou Ng, et al. Integrating Multiple Knowledge Sources to Disambiguate Word Sense: An Exemplar-Based Approach, 1996, ACL.

[5] Robert P. Goldman, et al. A Semantics for Probabilistic Quantifier-Free First-Order Languages, with Particular Application to Story Understanding, 1989, IJCAI.

[6] Eric Brill, et al. A Simple Rule-Based Part of Speech Tagger, 1992, HLT.

[7] Scott E. Fahlman, et al. NETL: A System for Representing and Using Real-World Knowledge, 1979, CL.

[8] R. Wilensky. Planning and Understanding: A Computational Approach to Human Reasoning, 1983.

[9] Sanda M. Harabagiu, et al. Wordnet-based inference of textual context, cohesion and coherence, 1997.

[10] Charles J. Fillmore, et al. The Case for Case, 1967.

[11] Beth M. Sundheim, et al. TIPSTER/MUC-5 Information Extraction System Evaluation, 1993, TIPSTER.

[12] J. Jenkins, et al. Word association norms, 1964.

[13] Eugene Charniak, et al. A Neat Theory of Marker Passing, 1986, AAAI.

[14] Susan McRoy, et al. Using Multiple Knowledge Sources for Word Sense Discrimination, 1992, Comput. Linguistics.

[15] David L. Waltz, et al. Trading MIPS and memory for knowledge engineering, 1992, CACM.

[16] Dan I. Moldovan, et al. SNAP: A Marker-Propagation Architecture for Knowledge Processing, 1992, IEEE Trans. Parallel Distributed Syst.

[17] Robert F. Simmons, et al. Truly Parallel Understanding of Text, 1990, AAAI.

[18] Yorick Wilks, et al. Book Reviews: Electric Words: Dictionaries, Computers, and Meanings, 1996, CL.

[19] David Yarowsky, et al. Unsupervised Word Sense Disambiguation Rivaling Supervised Methods, 1995, ACL.

[20] Adwait Ratnaparkhi, et al. A Maximum Entropy Model for Part-Of-Speech Tagging, 1996, EMNLP.

[21] Beatrice Santorini, et al. Building a Large Annotated Corpus of English: The Penn Treebank, 1993, CL.

[22] Peter Norvig, et al. Marker Passing as a Weak Method for Text Inferencing, 1989, Cogn. Sci.

[23] Roger C. Schank, et al. Language and Memory, 1986, Cogn. Sci.

[24] Paul R. Cohen, et al. Beyond ISA: Structures for Plausible Inference In Semantic Networks, 1988, AAAI.

[25] Eugene Charniak, et al. Passing Markers: A Theory of Contextual Influence in Language Comprehension, 1983.

[26] James Alexander Hendler, et al. Integrating Marker Passing and Problem Solving: A Spreading Activation Approach To Improved Choice in Planning, 1987.