Open Knowledge Extraction through Compositional Language Processing

We present results for a system designed to perform Open Knowledge Extraction, based on a tradition of compositional language processing, as applied to a large collection of text derived from the Web. Evaluation through manual assessment shows that well-formed propositions of reasonable quality, representing general world knowledge, given in a logical form potentially usable for inference, may be extracted in high volume from arbitrary input sentences. We compare these results with those obtained in recent work on Open Information Extraction, indicating with some examples the quite different kinds of output obtained by the two approaches. Finally, we observe that portions of the extracted knowledge are comparable to results of recent work on class attribute extraction.
