On Horn Axiomatizations for Sequential Data

We propose a notion of deterministic association rules for ordered data. We prove that these rules are formally justified by a purely logical characterization: a natural notion of empirical Horn approximation for ordered data that involves background Horn conditions, which ensure that the resulting propositional theory is consistent with the ordered context. The main proof relies on a concept lattice model in the framework of Formal Concept Analysis, adapted to ordered contexts. We also discuss a general method for mining these rules that can be easily incorporated into any algorithm for mining closed sequences, several of which already exist in the literature.
