A semantic-based knowledge fusion model for solution-oriented information network development: a case study in intrusion detection field

Building information networks with semantic-based techniques, to avoid tedious manual work and achieve high efficiency, has been a long-term goal in information management. A great volume of research has focused on developing large-scale information networks for general domains, in pursuit of comprehensive and complete information. However, constructing customised information networks that contain subject-specific knowledge has been neglected, although such research can potentially return high value in both theoretical and practical terms. In this paper, a new type of network, the solution-oriented information network, is introduced. Its nodes are research problems and proposed techniques, and its edges are the relationships between them. A lightweight Semantic-based Knowledge Fusion Model (SKFM) is proposed that leverages Natural Language Processing (NLP) and crowdsourcing to construct these networks from academic papers (knowledge) indexed in Scopus. SKFM relies on NLP for its automatic components, while crowdsourcing is initiated when uncertain cases arise. Applying NLP yields a semi-automatic knowledge fusion method that saves effort and time in extracting information from academic papers. Leveraging human input in uncertain cases ensures that the essential concepts for developing the information networks are extracted reliably and connected correctly. SKFM makes a theoretical contribution as a lightweight knowledge extraction and reconstruction framework, and offers practical value by surfacing the solutions proposed in academic papers for the corresponding research issues in subject-specific areas. Experiments have been conducted and show promising results: in the research field of intrusion detection, information on attack types and proposed solutions has been extracted and integrated in graphical form with high accuracy and efficiency.
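The division of labour described in the abstract, with automatic extraction for confident cases and human review for uncertain ones, can be illustrated with a minimal sketch. This is not the authors' implementation. The names `Extraction`, `SolutionNetwork`, `fuse` and `apply_crowd_decisions`, the numeric confidence score, and the 0.8 threshold are all assumptions made for illustration, and the sample problem-technique pairs are made up.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Extraction:
    """One candidate (problem, technique) pair taken from a paper.

    The confidence score is a hypothetical stand-in for whatever
    certainty signal the NLP step produces.
    """
    problem: str
    technique: str
    confidence: float


@dataclass
class SolutionNetwork:
    """Problems and techniques as nodes, joined by 'addressed by' edges."""
    edges: dict = field(default_factory=dict)  # problem -> set of techniques

    def link(self, problem, technique):
        self.edges.setdefault(problem, set()).add(technique)

    def techniques_for(self, problem):
        return sorted(self.edges.get(problem, set()))


def fuse(extractions, threshold=0.8):
    """Link confident pairs directly; queue uncertain ones for human review."""
    network = SolutionNetwork()
    review_queue = []
    for item in extractions:
        if item.confidence >= threshold:
            network.link(item.problem, item.technique)
        else:
            review_queue.append(item)
    return network, review_queue


def apply_crowd_decisions(network, review_queue, decisions):
    """Add queued pairs that reviewers accepted (decisions: Extraction -> bool)."""
    for item in review_queue:
        if decisions.get(item, False):
            network.link(item.problem, item.technique)


if __name__ == "__main__":
    # Illustrative, made-up candidates and scores.
    candidates = [
        Extraction("DoS attack", "Bayesian game", 0.93),
        Extraction("packet dropping attack", "fuzzy logic", 0.55),
    ]
    net, queue = fuse(candidates)
    apply_crowd_decisions(net, queue, {queue[0]: True})
    print(net.edges)
```

A single threshold is the simplest way to split cases between the automatic path and the crowdsourced path. How SKFM actually decides that a case is uncertain is not stated in the abstract.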
