Clustering Based Fast Low-Rank Approximation for Large-Scale Graph

As a fundamental data structure, the graph has been widely used in machine learning, data mining, and computer vision. However, graph-based analyses such as kernel methods, spectral clustering, and manifold learning can reach a time complexity of $O(m^3)$, where $m$ is the data size, so the problem becomes intractable when the data is large-scale. Recently, low-rank matrix approximation has drawn considerable attention, since it can extract the essential components responsible for most of a matrix's action. Nonetheless, such approximations inevitably ignore the structural information embedded in massive data. In this paper, we argue that vector quantization can better reveal the intrinsic structure of large-scale data, and that both intra- and inter-cluster matrices should be exploited to boost the accuracy of low-rank matrix approximation. By accounting for both intra- and inter-cluster relationships, we achieve a better trade-off across different kinds of graphs. Extensive experiments demonstrate that the proposed framework not only retains low time complexity but also performs comparably with the state of the art.
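To make the clustered low-rank idea concrete, below is a minimal NumPy sketch of one plausible instantiation: the data are partitioned by k-means (vector quantization), a low-rank eigenbasis is computed for each intra-cluster (diagonal) block of the affinity matrix, and the stacked bases are coupled through a small core matrix so that the inter-cluster (off-diagonal) blocks are approximated as well. The function name `clustered_low_rank` and all parameter names are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np
from sklearn.cluster import KMeans

def clustered_low_rank(G, X, n_clusters=4, rank=10):
    """Hedged sketch of a clustered low-rank approximation.

    G : (m, m) symmetric affinity/kernel matrix
    X : (m, d) data, used only to obtain clusters via k-means
    Returns V, D such that G_hat = V @ D @ V.T, where V stacks
    per-cluster eigenbases (intra-cluster structure) and the core
    D = V.T @ G @ V also captures the inter-cluster blocks.
    """
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(X)
    index_sets, bases = [], []
    for c in range(n_clusters):
        idx = np.where(labels == c)[0]
        index_sets.append(idx)
        # Rank-r eigenbasis of the intra-cluster (diagonal) block.
        block = G[np.ix_(idx, idx)]
        r = min(rank, len(idx))
        w, U = np.linalg.eigh(block)     # eigenvalues in ascending order
        bases.append(U[:, -r:])          # keep the top-r eigenvectors
    # Assemble the block-diagonal basis V.
    m = G.shape[0]
    total_rank = sum(B.shape[1] for B in bases)
    V = np.zeros((m, total_rank))
    col = 0
    for idx, B in zip(index_sets, bases):
        V[idx, col:col + B.shape[1]] = B
        col += B.shape[1]
    # The core matrix couples the cluster bases, so inter-cluster
    # blocks are approximated too, not just the diagonal ones.
    D = V.T @ G @ V
    return V, D

# Illustrative usage: approximate an RBF kernel on random data.
X = np.random.randn(1000, 5)
G = np.exp(-0.5 * ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
V, D = clustered_low_rank(G, X, n_clusters=8, rank=5)
rel_err = np.linalg.norm(G - V @ D @ V.T) / np.linalg.norm(G)
```

Under these assumptions, storage drops from $O(m^2)$ to $O(mrk)$ for the basis plus a small $(rk) \times (rk)$ core, and the per-cluster eigendecompositions cost roughly $O(m^3/k^2)$ rather than a full $O(m^3)$ factorization. The sketch forms the dense $G$ for clarity; a scalable implementation would assemble only the blocks of $G$ it actually needs.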
