Symmetry-Based Pruning in Itemset Mining

In this paper, we show how symmetries, a fundamental structural property, can be used to prune the search space in itemset mining problems. Our approach dynamically integrates symmetries into APRIORI-like algorithms to prune the set of candidate patterns. More precisely, symmetries can be applied to a given itemset to deduce other itemsets that share its properties. We also show that our symmetry-based pruning approach extends to the general pattern mining framework of Mannila and Toivonen. Experimental results highlight the usefulness and efficiency of the approach.
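To make the idea concrete, here is a minimal sketch rather than the authors' algorithm. It assumes a symmetry is a permutation of items that maps the transaction database onto itself, so an itemset and its image under the permutation have the same support. The function names (`is_symmetry`, `apriori_with_symmetry`), the toy database, and the choice to take the permutation as an input instead of detecting it are all illustrative assumptions.

```python
from collections import Counter
from itertools import combinations


def apply(perm, itemset):
    """Image of an itemset under an item permutation (unlisted items are fixed)."""
    return frozenset(perm.get(i, i) for i in itemset)


def is_symmetry(perm, transactions):
    """Assumed definition: perm is a bijection on items that leaves the
    multiset of transactions unchanged."""
    if sorted(perm.keys()) != sorted(perm.values()):
        return False
    original = Counter(frozenset(t) for t in transactions)
    permuted = Counter(apply(perm, t) for t in transactions)
    return original == permuted


def support(itemset, transactions):
    return sum(1 for t in transactions if itemset <= t)


def apriori_with_symmetry(transactions, min_support, perm):
    """Level-wise (Apriori-style) search that counts support once per orbit
    of the given symmetry and reuses it for the symmetric itemsets."""
    transactions = [frozenset(t) for t in transactions]
    if not is_symmetry(perm, transactions):
        raise ValueError("perm is not a symmetry of the database")

    known = {}      # itemset -> support, including supports deduced by symmetry
    frequent = {}   # frequent itemset -> support
    counted = 0     # number of real database scans
    level = [frozenset([i]) for i in sorted(set().union(*transactions))]

    while level:
        survivors = []
        for cand in level:
            if cand not in known:
                s = support(cand, transactions)
                counted += 1
                known[cand] = s
                # Propagate the support along the orbit of cand under perm.
                image = apply(perm, cand)
                while image != cand:
                    known[image] = s
                    image = apply(perm, image)
            if known[cand] >= min_support:
                frequent[cand] = known[cand]
                survivors.append(cand)

        # Standard Apriori join and prune: keep a (k+1)-itemset only if
        # all of its k-subsets are frequent.
        k = len(level[0]) + 1
        next_level = set()
        for a, b in combinations(survivors, 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in frequent for sub in combinations(union, k - 1)
            ):
                next_level.add(union)
        level = sorted(next_level, key=sorted)

    return frequent, counted


# Toy database in which swapping items "a" and "b" is a symmetry.
db = [{"a", "c"}, {"b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
swap_ab = {"a": "b", "b": "a"}
frequent, counted = apriori_with_symmetry(db, min_support=2, perm=swap_ab)
```

On this toy database the search evaluates six candidates but scans the database for only four of them; the supports of `{b}` and `{b, c}` are deduced from `{a}` and `{a, c}`. Detecting symmetries automatically is a separate problem that this sketch does not address.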
