MILIE: Modular & Iterative Multilingual Open Information Extraction

Open Information Extraction (OpenIE) is the task of extracting (subject, predicate, object) triples from natural language sentences. Current OpenIE systems extract all triple slots independently. In contrast, we explore the hypothesis that it may be beneficial to extract triple slots iteratively: first extract the easy slots, then extract the difficult ones conditioned on the easy slots, thereby achieving a better overall extraction. Based on this hypothesis, we propose a neural OpenIE system, MILIE, that operates in an iterative fashion. Due to its iterative nature, the system is also modular: it is possible to seamlessly integrate rule-based extraction systems with a neural end-to-end system, allowing rule-based systems to supply extraction slots that MILIE can leverage for extracting the remaining slots. We confirm our hypothesis empirically: MILIE outperforms state-of-the-art (SOTA) systems on multiple languages ranging from Chinese to Arabic. Additionally, we are the first to provide an OpenIE test dataset for Arabic and Galician.
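
The iterative, modular idea described above can be illustrated with a minimal sketch. This is not MILIE's implementation: the function `iterative_extract`, the slot order, the `VERBS` list, and the three toy rule-based extractors are all hypothetical stand-ins chosen for illustration, where a real system would use neural slot predictors. The sketch only shows the control flow: slots are filled one at a time, each extractor sees the slots already filled, and an external system may supply some slots up front.

```python
from typing import Callable, Dict, List, Optional

Slots = Dict[str, str]
# An extractor sees the sentence tokens and the slots filled so far.
Extractor = Callable[[List[str], Slots], Optional[str]]


def iterative_extract(
    tokens: List[str],
    order: List[str],
    extractors: Dict[str, Extractor],
    seed: Optional[Slots] = None,
) -> Slots:
    """Fill triple slots one at a time, conditioning on earlier slots.

    `seed` holds slots supplied by an external (e.g. rule-based) system;
    those slots are kept as-is and only the remaining ones are extracted.
    """
    slots: Slots = dict(seed or {})
    for name in order:
        if name in slots:
            continue  # already supplied by the seed
        value = extractors[name](tokens, slots)
        if value is not None:
            slots[name] = value
    return slots


# Toy extractors (assumptions for illustration only).
VERBS = {"discovered", "founded", "wrote"}


def toy_predicate(tokens: List[str], slots: Slots) -> Optional[str]:
    # "Easy" slot: pick the first token found in a small verb list.
    return next((t for t in tokens if t in VERBS), None)


def toy_subject(tokens: List[str], slots: Slots) -> Optional[str]:
    # Conditioned on the predicate: take everything before it.
    pred = slots.get("predicate")
    if pred is None or pred not in tokens:
        return None
    return " ".join(tokens[: tokens.index(pred)]) or None


def toy_object(tokens: List[str], slots: Slots) -> Optional[str]:
    # Conditioned on the predicate: take everything after it.
    pred = slots.get("predicate")
    if pred is None or pred not in tokens:
        return None
    return " ".join(tokens[tokens.index(pred) + 1 :]) or None


EXTRACTORS: Dict[str, Extractor] = {
    "predicate": toy_predicate,
    "subject": toy_subject,
    "object": toy_object,
}
ORDER = ["predicate", "subject", "object"]

tokens = "Marie Curie discovered polonium".split()
triple = iterative_extract(tokens, ORDER, EXTRACTORS)
seeded = iterative_extract(tokens, ORDER, EXTRACTORS, seed={"predicate": "discovered"})
```

In this sketch, passing `seed={"predicate": "discovered"}` plays the role of a rule-based system supplying one slot, after which the remaining slots are extracted conditioned on it. The predicate-first order is an arbitrary choice for the toy example, not a claim about which order works best.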
