Identifying motifs for evaluating open knowledge extraction on the Web

Open Knowledge Extraction (OKE) is the process of extracting knowledge from text and representing it in a formalized, machine-readable format by means of unsupervised, open-domain, and abstractive techniques. Despite the growing presence of tools for reusing NLP results as linked data (LD), there is still a lack of established practices and benchmarks for evaluating OKE results tailored to LD. In this paper, we propose to address this issue by constructing RDF graph banks based on the definition of logical patterns called OKE Motifs. We demonstrate the usage and extraction techniques of motifs using FRED, a broad-coverage OKE tool for the Semantic Web. Finally, we use the identified motifs as empirical data for assessing the quality of OKE results, and show how they can be extended through a use case: an application in the Semantic Sentiment Analysis domain.
