Extraction of Entailed Semantic Relations Through Syntax-Based Comma Resolution

This paper studies textual inference by investigating comma structures, highly frequent elements whose major role in the extraction of semantic relations has not hitherto been recognized. We introduce the problem of comma resolution, defined as understanding the role of commas and extracting the relations they imply. We show the importance of the problem using examples from Textual Entailment tasks and present A Sentence Transformation Rule Learner (ASTRL), a machine learning algorithm that uses a syntactic analysis of the sentence to learn sentence transformation rules, which can then be used to extract relations. We manually annotated a corpus, identifying comma structures and the relations they entail, and experimented with both gold-standard parses and parses produced by a leading statistical parser, obtaining F-scores of 80.2% and 70.4%, respectively.
