When Lexicon-Grammar Meets Open Information Extraction: a Computational Experiment for Italian Sentences

In this work we present an experiment in building an Open Information Extraction (OIE) system for the Italian language. We propose a system that relies wholly on linguistic structures and on a small set of verbal behavior patterns, defined by combining theoretical linguistic knowledge with corpus-based statistical information. Starting from elementary one-verb sentences, the system identifies elementary tuples and then all their permutations, preserving overall well-formedness (grammaticality) and trying to preserve semantic coherence (acceptability). Although the work focuses only on Italian, the approach can be extended to other languages, since it is based essentially on linguistic resources and on a representative corpus for the language under consideration.
