On the Low-Rank Approximation of Data on the Unit Sphere

In various applications, data in multidimensional space are normalized to unit length. This paper considers the problem of best fitting given points on the m-dimensional unit sphere S^{m-1} by k-dimensional great circles, with k much less than m. The task is cast as an algebraically constrained low-rank matrix approximation problem. Using the fidelity of the low-rank approximation to the original data as the cost function, this paper offers an analytic expression of the projected gradient which, on the one hand, furnishes the first-order optimality condition and, on the other hand, can be used as a numerical means for solving this problem.
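To make the constraint set concrete, the following is a minimal sketch, not the projected gradient method of the paper. It assumes the data points are stored as unit-length rows of a matrix `A`, and it builds a feasible point only: it truncates the SVD to rank k and then rescales each row back to unit length. Row rescaling is a left multiplication by a diagonal matrix, so the rank does not increase. The function name `feasible_low_rank_on_sphere` and the two quality measures (Frobenius error and mean row-wise cosine) are illustrative assumptions, not definitions taken from the paper.

```python
import numpy as np


def feasible_low_rank_on_sphere(A, k):
    """Return a matrix of rank <= k whose rows have unit length.

    Baseline only: truncated SVD followed by row normalization.
    The result satisfies both constraints but is not claimed to be optimal.
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    Z = (U[:, :k] * s[:k]) @ Vt[:k, :]           # best rank-k fit, rows off the sphere
    norms = np.linalg.norm(Z, axis=1, keepdims=True)
    if np.any(norms == 0):
        raise ValueError("a row collapsed to zero; it cannot be put back on the sphere")
    return Z / norms                              # diagonal rescaling keeps rank <= k


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((50, 8))
    A /= np.linalg.norm(A, axis=1, keepdims=True)  # put the data on the unit sphere

    Z = feasible_low_rank_on_sphere(A, k=3)
    print("rank of Z:", np.linalg.matrix_rank(Z))
    print("Frobenius error:", np.linalg.norm(A - Z))
    # Rows of A and Z are unit vectors, so the row-wise inner product is a cosine.
    print("mean cosine between rows:", np.mean(np.sum(A * Z, axis=1)))
```

A point produced this way could serve as a starting guess for an iterative method that respects the same two constraints.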
