The Boolean column and column-row matrix decompositions

Matrix decompositions are used for many data mining purposes. One of these purposes is to find a concise but interpretable representation of a given data matrix. Different decomposition formulations have been proposed for this task, many of which assume a certain property of the input data (e.g., nonnegativity) and aim to preserve that property in the decomposition. In this paper we propose new decomposition formulations for binary matrices, namely the Boolean CX and CUR decompositions. They are natural combinations of two previously presented decomposition formulations. We also consider two subproblems of these decompositions and present a rigorous theoretical study of them. We give algorithms for the decompositions and for the subproblems, and study their performance through extensive experimental evaluation. We show that even simple algorithms can give accurate and intuitive decompositions of real data, demonstrating the power and usefulness of the proposed decompositions.
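To make the idea of a Boolean CX decomposition concrete, the following minimal sketch assumes the usual reading of the terms: C consists of actual columns of the binary data matrix A, and A is approximated by the Boolean product of C and a binary matrix X, where the Boolean product replaces addition with logical OR. The function names (`boolean_product`, `greedy_mixing_matrix`), the greedy heuristic, and the toy matrix are illustrative assumptions, not the algorithms studied in the paper.

```python
import numpy as np

def boolean_product(C, X):
    """Boolean matrix product: entry (i, j) is 1 if C[i, k] = X[k, j] = 1 for some k."""
    return (C.astype(int) @ X.astype(int) > 0).astype(int)

def greedy_mixing_matrix(A, C):
    """Illustrative heuristic (an assumption, not the paper's method):
    for each column of A, keep adding the column of C that most reduces
    the number of mismatched entries, and stop when nothing helps."""
    k, m = C.shape[1], A.shape[1]
    X = np.zeros((k, m), dtype=int)
    for j in range(m):
        target = A[:, j]
        covered = np.zeros_like(target)
        while True:
            best_gain, best_c = 0, None
            for c in range(k):
                if X[c, j]:
                    continue
                candidate = covered | C[:, c]
                # Gain = errors before adding column c minus errors after.
                gain = np.sum(covered != target) - np.sum(candidate != target)
                if gain > best_gain:
                    best_gain, best_c = gain, c
            if best_c is None:
                break
            X[best_c, j] = 1
            covered = covered | C[:, best_c]
    return X

# Toy data: the middle column of A is the OR of the two outer columns.
A = np.array([[1, 1, 0],
              [1, 1, 1],
              [0, 1, 1]])
C = A[:, [0, 2]]                 # CX: C is made of columns of A itself
X = greedy_mixing_matrix(A, C)
error = int(np.sum(boolean_product(C, X) != A))
print(X)
print("reconstruction error:", error)
```

Because C is built from columns of the data, each factor can be read directly as an observed column, which is the interpretability argument behind column-based decompositions. A CUR variant would additionally select actual rows of A for a matrix R; that step is not sketched here.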
