Contrasting logical sequences in multi-relational learning

Abstract. In this paper, we present BeamSouL, a sequence miner that finds sequences of logical atoms. The algorithm uses a levelwise hybrid search strategy to find a subset of the contrasting logical sequences in a SeqLog database. The hybrid strategy runs an exhaustive search in its first phase, followed by a beam search. In the beam search phase, the algorithm uses the confidence metric to select the top k sequential patterns to be specialized at the next level. Moreover, we develop a first-order logic classification framework that uses a predicate invention technique to include the BeamSouL findings in the learning process. We evaluate our proposals on four multi-relational databases. The results are promising: BeamSouL can be more than one order of magnitude faster than the baseline, and it can find long, highly discriminative contrasting sequences.
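The hybrid levelwise strategy described in the abstract can be illustrated with a small sketch. This is not the BeamSouL implementation: it works on plain symbol sequences rather than logical atoms (so there is no unification or SeqLog subsumption), it assumes confidence is the fraction of covered sequences that belong to the positive class, and every name and parameter (`hybrid_search`, `exhaustive_levels`, `k`, `max_len`, `min_support`) is hypothetical.

```python
from typing import Dict, List, Tuple

Pattern = Tuple[str, ...]


def is_subsequence(pattern: Pattern, sequence: Tuple[str, ...]) -> bool:
    """True if the pattern's items occur in the sequence in order (gaps allowed)."""
    remaining = iter(sequence)
    return all(item in remaining for item in pattern)


def confidence(pattern: Pattern,
               positives: List[Tuple[str, ...]],
               negatives: List[Tuple[str, ...]]) -> float:
    """Assumed definition: covered positives / all covered sequences."""
    pos = sum(is_subsequence(pattern, s) for s in positives)
    neg = sum(is_subsequence(pattern, s) for s in negatives)
    covered = pos + neg
    return pos / covered if covered else 0.0


def hybrid_search(positives: List[Tuple[str, ...]],
                  negatives: List[Tuple[str, ...]],
                  exhaustive_levels: int = 2,
                  k: int = 2,
                  max_len: int = 4,
                  min_support: int = 1) -> Dict[Pattern, float]:
    """Levelwise search: exhaustive for the first levels, beam search afterwards."""
    alphabet = sorted({item for s in positives + negatives for item in s})
    level: List[Pattern] = [()]          # patterns to specialize at the next level
    found: Dict[Pattern, float] = {}     # every scored pattern and its confidence

    for length in range(1, max_len + 1):
        # Specialize each kept pattern by appending one item.
        candidates = [
            p + (item,)
            for p in level
            for item in alphabet
            if sum(is_subsequence(p + (item,), s) for s in positives) >= min_support
        ]
        if not candidates:
            break
        scored = {c: confidence(c, positives, negatives) for c in candidates}
        found.update(scored)

        if length <= exhaustive_levels:
            # Exhaustive phase: keep every frequent candidate.
            level = candidates
        else:
            # Beam phase: keep only the k most confident candidates.
            level = sorted(candidates, key=lambda c: (-scored[c], c))[:k]
    return found
```

Calling `hybrid_search(positives, negatives)` on two lists of symbol tuples returns a dictionary from each scored pattern to its confidence. The beam width `k` trades coverage for speed: a small `k` prunes most of the candidates at each level, which is where the speed-up over a fully exhaustive search would come from in a sketch like this one.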
