Semantic knowledge extraction from research documents

In this paper, we design a knowledge-support software system that extracts sentences and keywords from a large-scale document database. The system includes a semantic representation scheme for natural language processing of the document database. Documents, originally in PDF form, are pre-processed and broken into triple-store data. The semantic representation is a hypergraph consisting of collections of binary relations, or `triples'. Following rules based on the user's interests, the system identifies sentences and words of interest. The relationships among the extracted sentences are visualized as a network graph. A user can introduce new rules to extract additional knowledge from the database or from a single paper. As a practical case study, we choose a set of research papers related to IoT. By applying several rules concerning the authors' indicated keywords as well as the system's specified discourse words, significant knowledge is extracted from the papers.
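
The pipeline described above (sentences stored as triples, a user rule selecting sentences of interest, and links between the selected sentences) might be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the predicate names `hasText` and `hasWord`, the helper functions, and the sample sentences are all assumptions made for the example, and the PDF pre-processing step is omitted.

```python
from collections import namedtuple
from itertools import combinations

# Hypothetical triple layout: (subject, predicate, object).
Triple = namedtuple("Triple", ["subject", "predicate", "object"])


def sentences_to_triples(doc_id, sentences):
    """Encode each sentence and each of its words as triples."""
    triples = []
    for i, text in enumerate(sentences):
        sid = f"{doc_id}/s{i}"
        triples.append(Triple(sid, "hasText", text))
        # Naive tokenization (assumption): lowercase, drop periods and commas.
        for word in text.lower().replace(".", "").replace(",", "").split():
            triples.append(Triple(sid, "hasWord", word))
    return triples


def apply_rule(triples, terms):
    """Rule: select sentences that contain any of the given terms.

    Returns a dict mapping sentence id to the set of matched terms.
    """
    terms = {t.lower() for t in terms}
    hits = {}
    for t in triples:
        if t.predicate == "hasWord" and t.object in terms:
            hits.setdefault(t.subject, set()).add(t.object)
    return hits


def link_sentences(hits):
    """Connect two selected sentences when they share a matched term."""
    edges = []
    for a, b in combinations(sorted(hits), 2):
        shared = hits[a] & hits[b]
        if shared:
            edges.append((a, b, sorted(shared)))
    return edges


# Made-up sample sentences, for illustration only.
sentences = [
    "RFID tags enable IoT healthcare monitoring.",
    "However, privacy remains a challenge for IoT.",
    "The weather was pleasant.",
]
triples = sentences_to_triples("paper1", sentences)
# One author keyword ("IoT") and one discourse word ("however").
hits = apply_rule(triples, ["IoT", "however"])
edges = link_sentences(hits)
```

In this sketch a new rule is simply another list of terms passed to `apply_rule`, and the resulting `edges` list could be handed to a graph-drawing tool to produce the network view.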
