Incremental Theory Closure Reasoning for Large Scale Knowledge Graphs

Knowledge graphs (KGs) have become increasingly important in the field of cybersecurity. However, they usually contain a large amount of implicit semantic information, which must be made explicit through semantic inference before it is useful to analysts and automated systems. In this paper, we propose KGRL-Incre, an incremental reasoning algorithm for the scenario in which the instances of a KG are expanded with only a small set of triples. KGRL-Incre incrementally updates the previous reasoning result, avoiding a full re-reasoning over the expanded KG. The main contributions of our approach are (i) irrelevant-triple filtering algorithms, which reduce the amount of data that needs to be processed, and (ii) a delay reasoning strategy, which limits the number of time-consuming iterations while still preserving the relative completeness of the final result. Extensive experiments show that, in the target scenario, KGRL-Incre significantly reduces time consumption compared with the expanding and reasoning approach, that is, expanding the KG and then reasoning over it in full.
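
To make the idea of incrementally updating a previous reasoning result concrete, the following is a minimal sketch and not the KGRL-Incre algorithm itself. It assumes only two RDFS-style rules (subclass transitivity and type inheritance) and uses a generic delta-driven (semi-naive) fixpoint, in which every new derivation must use at least one newly added or newly derived triple. The names `derive` and `incremental_closure` and the sample triples are hypothetical; the irrelevant-triple filtering and delay reasoning strategy described above are not modeled.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]
SUB = "rdfs:subClassOf"
TYPE = "rdf:type"


def derive(delta: Set[Triple], known: Set[Triple]) -> Set[Triple]:
    """Apply two illustrative rules; one premise always comes from delta."""
    out: Set[Triple] = set()
    for (s, p, o) in delta:
        for (s2, p2, o2) in known:
            if p == SUB and p2 == SUB:
                # (a sub b), (b sub c) -> (a sub c), delta on either side
                if o == s2:
                    out.add((s, SUB, o2))
                if o2 == s:
                    out.add((s2, SUB, o))
            if p == TYPE and p2 == SUB and o == s2:
                # (x type a), (a sub b) -> (x type b), new instance triple
                out.add((s, TYPE, o2))
            if p == SUB and p2 == TYPE and o2 == s:
                # same rule, but the subclass triple is the new one
                out.add((s2, TYPE, o))
    return out


def incremental_closure(closure: Set[Triple], added: Set[Triple]) -> Set[Triple]:
    """Extend an existing closure with added triples without recomputing it."""
    result = set(closure)
    delta = set(added) - result
    while delta:
        result |= delta
        # Only triples not already known seed the next round.
        delta = derive(delta, result) - result
    return result


# Hypothetical example: a previously computed closure plus one new instance.
old_closure = incremental_closure(set(), {
    ("Trojan", SUB, "Malware"),
    ("Malware", SUB, "Threat"),
})
updated = incremental_closure(old_closure, {("sample-1", TYPE, "Trojan")})
```

In this sketch, the second call touches only the triples reachable from the new instance triple, whereas a full re-reasoning would rederive the subclass closure as well. A real OWL 2 RL reasoner would cover many more rules and would index triples rather than scan `known` in a nested loop.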
