Fast and Accurate PSD Matrix Estimation by Row Reduction

Fast and accurate estimation of missing relations among objects, such as similarities, distances, and kernel values, is an important technique for many data mining tasks, because such relational information is needed in applications in fields such as economics, psychology, and social network analysis. Although several approaches have been proposed in recent years, their practical trade-off between computational cost and accuracy is unsatisfactory for some classes of relation estimation. The objective of this paper is to formalize the problem of quickly and accurately estimating missing relations among objects from the other, known relations, and to propose two techniques for this problem, called “PSD Estimation” and “Row Reduction”. The estimation exploits a property of relation matrices called positive semi-definiteness (PSD), together with a structural assumption on the known relations in the matrix. An evaluation on artificial and real-world data sets demonstrates the superior performance of our approach in both efficiency and accuracy.

Key words: similarity, positive semi-definite (PSD) matrix, PSD estimation, row reduction, incomplete Cholesky decomposition
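To make concrete how positive semi-definiteness constrains missing entries, the following is a minimal Python sketch of generic PSD matrix completion by alternating projection (eigenvalue clipping). It is an illustrative assumption, not the paper’s PSD Estimation / Row Reduction algorithm; the function name complete_psd and the toy kernel are hypothetical.

```python
import numpy as np

def complete_psd(K, mask, n_iter=200):
    """Fill missing entries of a partial similarity matrix by alternating
    projection: project onto the PSD cone, then restore the known entries.
    NOTE: a generic PSD-completion heuristic for illustration only, not the
    paper's "PSD Estimation with Row Reduction" method.
    K    : (n, n) symmetric array; missing entries may hold any placeholder
    mask : (n, n) boolean array, True where the entry is known
    """
    X = K.copy()
    for _ in range(n_iter):
        # Project onto the PSD cone: symmetrize, clip negative eigenvalues.
        w, V = np.linalg.eigh((X + X.T) / 2.0)
        X = (V * np.clip(w, 0.0, None)) @ V.T
        # Re-impose the observed entries.
        X[mask] = K[mask]
    return X

# Toy example: a 4x4 RBF kernel with one held-out pair (0, 3).
rng = np.random.default_rng(0)
P = rng.standard_normal((4, 2))
K_true = np.exp(-np.sum((P[:, None] - P[None]) ** 2, axis=-1))
mask = np.ones((4, 4), dtype=bool)
mask[0, 3] = mask[3, 0] = False
K_hat = complete_psd(np.where(mask, K_true, 0.0), mask)
print(K_true[0, 3], K_hat[0, 3])  # held-out value vs. PSD-based estimate
```

Because an RBF kernel matrix is positive semi-definite, the projection step pulls the placeholder entry toward a value consistent with the observed rows; the printout compares the estimate against the held-out true value.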
