Robust Discrete Matrix Completion

Most existing matrix completion methods seek the global structure of a matrix in the real-number domain, so their predictions are ill-suited to applications whose data are inherently discrete. Such applications need an additional post-processing step that maps the predictions to discrete values using either heuristic thresholds or complicated mappings, and this ad hoc step is inefficient and impractical. In this paper, we propose a novel robust discrete matrix completion algorithm that draws its predictions directly from a collection of user-specified discrete values by introducing a new discrete constraint into the matrix completion model. Our method achieves high prediction accuracy, very close to the best results that competing methods obtain with tuned threshold values. We solve the resulting difficult integer programming problem by incorporating the augmented Lagrangian method, which greatly accelerates the convergence of our method and provides asymptotic convergence in theory. We apply the proposed discrete matrix completion model to three real-world applications, and all empirical results demonstrate the effectiveness of our method.
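To make the post-processing step criticized above concrete, the following is a minimal sketch of mapping real-valued predictions onto a user-specified set of discrete values by nearest-value rounding. It illustrates the ad hoc baseline only, not the proposed algorithm; the function name `project_to_discrete`, the example matrix, and the rating scale 1 to 5 are assumptions introduced for illustration.

```python
import numpy as np

def project_to_discrete(X, values):
    """Map every entry of X to the nearest value in `values` (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    # Distance from each entry to each allowed value;
    # the result has shape X.shape + (len(values),).
    dist = np.abs(np.asarray(X, dtype=float)[..., None] - values)
    # Pick, per entry, the allowed value with the smallest distance.
    return values[np.argmin(dist, axis=-1)]

# Assumed real-valued output of some matrix completion method.
X_real = np.array([[1.2, 4.6],
                   [2.4, 3.49]])
X_disc = project_to_discrete(X_real, [1, 2, 3, 4, 5])
print(X_disc)
```

Rounding of this kind treats each entry independently after the fact, which is the separation between completion and discretization that a discrete constraint inside the model is meant to avoid.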
