Multilingual Open Information Extraction

Open Information Extraction (OIE) is a recent unsupervised strategy for extracting large numbers of basic propositions (verb-based triples) from massive text corpora, and it scales to Web-size document collections. We propose a multilingual rule-based OIE method that takes dependency parses in the CoNLL-X format as input, identifies the argument structures within those parses, and extracts a set of basic propositions from each argument structure. The method requires no training data and, in experimental evaluation, obtains higher recall and higher precision than existing approaches that rely on training data. Experiments were performed in three languages: English, Portuguese, and Spanish.
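To make the pipeline described above concrete, the following is a minimal Python sketch of the general idea, not the method proposed in the paper: it reads one sentence in the ten-column, tab-separated CoNLL-X layout, treats each verb together with its subject and object dependents as a very simplified argument structure, and emits one (subject, verb, object) triple per subject-object pair. The names Token, parse_conllx, subtree_text and extract_triples are hypothetical, and the dependency labels (SBJ, OBJ, nsubj, dobj) and the "coarse tag starts with V" test are assumptions that depend on the parser and tagset in use.

```python
from typing import List, NamedTuple, Sequence, Tuple


class Token(NamedTuple):
    """The CoNLL-X columns this sketch needs (ID, FORM, CPOSTAG, HEAD, DEPREL)."""
    id: int
    form: str
    cpos: str
    head: int
    deprel: str


def parse_conllx(block: str) -> List[Token]:
    """Parse one sentence given as tab-separated CoNLL-X lines."""
    tokens = []
    for line in block.strip().splitlines():
        cols = line.split("\t")
        if len(cols) < 8:  # skip malformed or empty lines
            continue
        tokens.append(Token(int(cols[0]), cols[1], cols[3], int(cols[6]), cols[7]))
    return tokens


def subtree_text(tokens: Sequence[Token], root_id: int) -> str:
    """Return the surface string of the subtree headed by root_id, in word order."""
    ids = {root_id}
    changed = True
    while changed:  # repeatedly add tokens whose head is already in the subtree
        changed = False
        for tok in tokens:
            if tok.head in ids and tok.id not in ids:
                ids.add(tok.id)
                changed = True
    return " ".join(tok.form for tok in tokens if tok.id in ids)


def extract_triples(
    tokens: Sequence[Token],
    subj_labels: Tuple[str, ...] = ("SBJ", "nsubj"),  # assumed label names
    obj_labels: Tuple[str, ...] = ("OBJ", "dobj"),    # assumed label names
) -> List[Tuple[str, str, str]]:
    """Emit (subject, verb, object) triples for every verb with both arguments."""
    triples = []
    for verb in tokens:
        if not verb.cpos.upper().startswith("V"):  # assumed verb-tag convention
            continue
        subjects = [t for t in tokens if t.head == verb.id and t.deprel in subj_labels]
        objects = [t for t in tokens if t.head == verb.id and t.deprel in obj_labels]
        for subj in subjects:
            for obj in objects:
                triples.append(
                    (subtree_text(tokens, subj.id), verb.form, subtree_text(tokens, obj.id))
                )
    return triples


# Illustrative sentence; the rows are written with spaces and joined with tabs.
SAMPLE = "\n".join(
    "\t".join(row.split())
    for row in [
        "1 The the DT DT _ 2 NMOD _ _",
        "2 parser parser NN NN _ 3 SBJ _ _",
        "3 extracts extract VB VBZ _ 0 ROOT _ _",
        "4 basic basic JJ JJ _ 5 NMOD _ _",
        "5 propositions proposition NN NNS _ 3 OBJ _ _",
        "6 . . . . _ 3 P _ _",
    ]
)

if __name__ == "__main__":
    for triple in extract_triples(parse_conllx(SAMPLE)):
        print(triple)
```

Because the sketch relies only on the dependency structure and a small set of labels, supporting another language would in principle mean supplying a parser for that language and adjusting the label and tag assumptions, which is in the spirit of a rule-based approach that needs no training data of its own.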
