Implementing scalable structured machine learning for big data in the SAKE project

Exploring and analyzing large amounts of machine-generated data requires innovative approaches. We propose a combination of Semantic Web and Machine Learning techniques to facilitate this analysis. First, data is collected and converted to RDF according to a schema expressed in the Web Ontology Language (OWL). Several components can then continue working with the data to interlink, label, augment, or classify it. The size of the data poses new challenges to existing solutions, which we address in this contribution by transitioning from in-memory processing to database-backed processing.
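As a rough illustration of the pipeline described above, the following minimal sketch converts flat machine events into RDF-style triples and queries them through a database instead of an in-memory data structure. It is not the SAKE implementation: SQLite stands in for whatever store the project uses, and the namespace `http://example.org/sake#`, the `Event` class, the property names, and all function names are assumptions made for this example.

```python
import sqlite3

# Hypothetical namespace; the real SAKE schema is not given in the text.
NS = "http://example.org/sake#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def record_to_triples(record_id, record):
    """Turn one machine event (a flat dict) into (subject, predicate, object) triples."""
    subject = f"{NS}event/{record_id}"
    triples = [(subject, RDF_TYPE, f"{NS}Event")]
    for key, value in sorted(record.items()):
        triples.append((subject, f"{NS}{key}", str(value)))
    return triples


def open_store(path=":memory:"):
    """Open a triple table; pass a file path to keep the data on disk."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS triples (s TEXT, p TEXT, o TEXT)")
    conn.execute("CREATE INDEX IF NOT EXISTS idx_po ON triples (p, o)")
    return conn


def add_records(conn, records):
    """Insert the triples of each record without building one large in-memory graph."""
    for record_id, record in records:
        conn.executemany(
            "INSERT INTO triples VALUES (?, ?, ?)",
            record_to_triples(record_id, record),
        )
    conn.commit()


def subjects_with(conn, prop, value):
    """Let the database find matching subjects instead of scanning in Python."""
    rows = conn.execute(
        "SELECT s FROM triples WHERE p = ? AND o = ? ORDER BY s",
        (NS + prop, str(value)),
    )
    return [row[0] for row in rows]


records = [
    (1, {"machine": "M1", "status": "error"}),
    (2, {"machine": "M2", "status": "ok"}),
    (3, {"machine": "M1", "status": "ok"}),
]
conn = open_store()
add_records(conn, records)
error_events = subjects_with(conn, "status", "error")
print(error_events)  # ['http://example.org/sake#event/1']
```

The point of the sketch is the division of labor: the learning or labeling component only asks the store for the subjects that satisfy a condition, so the full data set never has to fit in the memory of the analysis process.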
