Semi-supervised online structure learning for composite event recognition

Online structure learning approaches, such as those stemming from statistical relational learning, enable the discovery of complex relations in noisy data streams. However, these methods assume the existence of fully labelled training data, which is unrealistic for most real-world applications. We present a novel approach for completing the supervision of a semi-supervised structure learning task. We incorporate graph-cut minimisation, a technique that derives labels for unlabelled data based on their distance to labelled data. To adapt graph-cut minimisation to first-order logic, we employ a structural distance that measures the distance between sets of logical atoms. Labelling is performed online (in a single pass) by means of a caching mechanism and the Hoeffding bound, a statistical tool for approximating globally optimal decisions from locally optimal ones. We evaluate our approach on the task of composite event recognition, using a benchmark dataset for human activity recognition as well as a real dataset for maritime monitoring. The evaluation suggests that our approach can effectively complete the missing labels and, in turn, improve the accuracy of the underlying structure learning system.
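To make the role of the Hoeffding bound concrete, the following is a minimal sketch of the standard bound and of the decision rule commonly used with it in stream mining. It is not the authors' implementation: the function names `hoeffding_bound` and `can_decide`, the parameter names, and the "best versus second best" rule are assumptions made for illustration, and the abstract does not specify how the bound is combined with the caching mechanism.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Return epsilon such that, with probability at least 1 - delta, the true
    mean of a random variable with the given range lies within epsilon of the
    mean observed over n independent samples."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def can_decide(best_mean: float, second_mean: float,
               value_range: float, delta: float, n: int) -> bool:
    """Hypothetical decision rule (an assumption, not taken from the paper):
    commit to the best candidate once its observed advantage over the
    runner-up exceeds the Hoeffding bound for the samples seen so far."""
    return (best_mean - second_mean) > hoeffding_bound(value_range, delta, n)


if __name__ == "__main__":
    # The bound tightens as more examples are observed in the stream.
    for n in (10, 100, 1000):
        print(n, hoeffding_bound(value_range=1.0, delta=0.05, n=n))
```

The bound shrinks proportionally to the inverse square root of the number of samples, which is what allows a single-pass learner to commit to a choice made on the data seen so far while keeping the probability of disagreeing with the choice made on all data below `delta`.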
