Walk 'n' Merge: A Scalable Algorithm for Boolean Tensor Factorization

Tensors are becoming increasingly common in data mining, and consequently tensor factorizations are becoming more important tools for data miners. When the data is binary, it is natural to ask whether we can factorize it into binary factors while ensuring that the reconstructed tensor is also binary. Such factorizations, called Boolean tensor factorizations, can provide improved interpretability and find Boolean structure that is hard to express using normal factorizations. Unfortunately, the algorithms for computing Boolean tensor factorizations usually do not scale well. In this paper, we present a novel algorithm for finding Boolean CP and Tucker decompositions of large and sparse binary tensors. In our experimental evaluation, we show that our algorithm can handle large tensors and accurately reconstruct the latent Boolean structure.
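
To make concrete what it means for the reconstructed tensor to stay binary, the following is a minimal sketch of Boolean CP reconstruction for a 3-way tensor. It is not the Walk 'n' Merge algorithm itself. The function name `boolean_cp_reconstruct` and the small factor matrices are hypothetical and chosen only for illustration. Each cell of the reconstructed tensor is the logical OR, over rank-1 components, of the AND of the three corresponding factor entries, so 1 + 1 = 1.

```python
def boolean_cp_reconstruct(A, B, C):
    """Boolean CP reconstruction (illustrative helper, not from the paper).

    X[i][j][k] = OR over r of (A[i][r] AND B[j][r] AND C[k][r]),
    where A, B, C are binary factor matrices given as lists of rows.
    """
    rank = len(A[0])
    return [
        [
            [
                int(any(A[i][r] and B[j][r] and C[k][r] for r in range(rank)))
                for k in range(len(C))
            ]
            for j in range(len(B))
        ]
        for i in range(len(A))
    ]


def cover_count(A, B, C, i, j, k):
    """Number of rank-1 components covering cell (i, j, k).

    This is the value a normal (arithmetic) CP reconstruction would give.
    """
    return sum(A[i][r] * B[j][r] * C[k][r] for r in range(len(A[0])))


# Hypothetical rank-2 binary factors whose rank-1 blocks overlap.
A = [[1, 0], [1, 1], [0, 1]]
B = [[1, 0], [1, 1]]
C = [[1, 1], [0, 1]]

X = boolean_cp_reconstruct(A, B, C)

# Cell (1, 1, 0) is covered by both components: the arithmetic sum is 2,
# but the Boolean reconstruction stays binary.
print(cover_count(A, B, C, 1, 1, 0), X[1][1][0])
```

In this sketch the overlapping cell shows the difference between the two models: an arithmetic CP reconstruction leaves the binary domain wherever components overlap, while the Boolean one does not.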
