Sampling-based algorithms for dimension reduction

Can one compute a low-dimensional representation of given data by looking only at a small sample of it, chosen cleverly on the fly? Motivated by this question, we consider the problem of low-rank matrix approximation: given a matrix A ∈ R^(m×n), one wants to compute a rank-k matrix (where k ≪ min{m, n}) nearest to A in the Frobenius norm (also known as the Hilbert-Schmidt norm). We prove that, using a sample of roughly O(k/ε) rows of A, one can compute, with high probability, a (1 + ε)-approximation to the nearest rank-k matrix. This gives an algorithm for low-rank approximation with an improved error guarantee (compared to the additive ε‖A‖_F² guarantee known earlier from the work of Frieze, Kannan, and Vempala) and running time O(Mk/ε), where M is the number of non-zero entries of A. The proof is based on two sampling techniques, called adaptive sampling and volume sampling, together with some linear-algebraic tools.

Low-rank matrix approximation under the Frobenius norm is equivalent to finding a low-dimensional subspace that minimizes the sum of squared distances to given points. The general subspace approximation problem asks for a low-dimensional subspace that minimizes the sum of p-th powers of distances (for p ≥ 1) to given points. We generalize our sampling techniques and prove similar sampling-based dimension reduction results for subspace approximation; in this case, however, the proof is geometric.
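To make the adaptive sampling technique concrete, here is a minimal NumPy sketch. It is only an illustrative, dense-matrix rendering under stated assumptions, not the pass-efficient O(Mk/ε)-time algorithm of the thesis: the function name adaptive_low_rank and the parameters rounds and samples_per_round are invented for this example, and the volume sampling component is omitted. In each round, rows of A are sampled with probability proportional to their squared distance from the span of the rows chosen so far; at the end, the best rank-k approximation of A within that span is computed.

import numpy as np

def _row_space_basis(X, tol=1e-10):
    # Orthonormal basis (as columns) for the row space of X, via SVD,
    # discarding directions with negligible singular values.
    U, s, Wt = np.linalg.svd(X, full_matrices=False)
    r = int((s > tol * s[0]).sum()) if s.size else 0
    return Wt[:r].T

def adaptive_low_rank(A, k, rounds=3, samples_per_round=None, seed=0):
    # Hypothetical sketch of adaptive row sampling; dense and unoptimized.
    rng = np.random.default_rng(seed)
    m, n = A.shape
    t = samples_per_round or k            # rows to draw per round
    picked = np.array([], dtype=int)      # indices of sampled rows
    V = np.zeros((n, 0))                  # orthonormal basis of sampled span
    for _ in range(rounds):
        E = A - (A @ V) @ V.T             # residual outside the current span
        p = (E ** 2).sum(axis=1)          # squared row distances to the span
        if p.sum() == 0:                  # every row already lies in the span
            break
        # Adaptive step: sample proportionally to squared residual norms.
        new = rng.choice(m, size=t, p=p / p.sum())
        picked = np.union1d(picked, new)
        V = _row_space_basis(A[picked])
    # Best rank-k approximation of A restricted to the sampled span,
    # computed from an SVD of the much smaller coordinate matrix A V.
    B = A @ V
    U, s, Wt = np.linalg.svd(B, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Wt[:k] @ V.T

Sampling by squared residual norms, rather than by the squared row norms of A alone, is what makes the error shrink with every round; this is the intuition behind the multiplicative (1 + ε) guarantee, in contrast to the additive ε‖A‖_F² guarantee of plain length-squared sampling.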
Thesis co-supervisor: Santosh S. Vempala
Title: Associate Professor of Applied Mathematics

Thesis co-supervisor: Daniel A. Spielman
Title: Professor of Computer Science, Yale University

To dear aai and baba

Acknowledgments

I am greatly indebted to my advisors Daniel Spielman and Santosh Vempala for their guidance, encouragement, and patience during all these years. Both of them showed tremendous faith in me even when my progress was slow. In Spring 2005, Santosh taught a wonderful course on Spectral Algorithms and Representations; most of this thesis germinated from questions that he posed in class.

I thank my collaborators Luis Rademacher, Grant Wang, and Kasturi Varadarajan, whose contributions have been pivotal in my research. Luis has been more than just a collaborator; I thank him for all the conversations, lunches, dinners, squash, sailing, and travel that we enjoyed together.

I thank Piotr Indyk, Daniel Kleitman, and Peter Shor for agreeing to be on my thesis committee. (In the end, it was Peter Shor who attended my thesis defense and signed the thesis approval form.)

My five years at MIT, including a semester spent at Georgia Tech, were made memorable by the people around me: teachers, colleagues, and friends. I thank my professors Daniel Kleitman, Madhu Sudan, Michel Goemans, and Rom Pinchasi for the wonderful courses they taught. I thank Subhash Khot and Prahladh Harsha for playing the role of mentors from time to time. I thank my roommate Shashi; my occasional cooking and chit-chat buddies Ajay, Pavithra, Kripa, Sreekar, Punya, Shubhangi, ...; my friends from CMI Tejaswi, Krishna, Baskar, Debajyoti, Sourav, ...; my friends from the skit group Vikram, Raghavendran, Pranava, ...; and many others who made my everyday life enjoyable.

Words are insufficient to express my gratitude towards my music teacher, Warren Senders, who introduced me to the hell of frustration as well as the heaven of enlightenment called Hindustani music!

I thank Jaikumar Radhakrishnan, Ramesh Hariharan, Narayan Kumar, K. V. Subrahmanyam, and Meena Mahajan, my professors during my undergraduate years, for my upbringing in Theory. I thank my teachers from school and from math olympiad camps for cultivating my interest in mathematics. I thank my relatives (cousins, aunts, and uncles) and friends for asking me every year, "How many more?", thereby providing the external force required by Newton's first law of graduation! Above all, I thank my grandparents and my extended family, who instilled in me a liking for an academic career. I thank my parents for always giving me the freedom to choose my own path. I couldn't have reached this far if they hadn't walked me through my very early steps. This thesis is dedicated to them.

[1] Kasturi R. Varadarajan et al. Sampling-based dimension reduction for subspace approximation, 2007, STOC '07.

[2] Amos Fiat et al. Coresets for Weighted Facilities and Their Applications, 2006, FOCS '06.

[3] Kasturi R. Varadarajan et al. Efficient Subspace Approximation Algorithms, 2007, Discrete & Computational Geometry.

[4] Santosh S. Vempala et al. Adaptive Sampling and Fast Low-Rank Matrix Approximation, 2006, APPROX-RANDOM.

[5] Alan M. Frieze et al. Fast Monte Carlo algorithms for finding low-rank approximations, 2004, J. ACM.

[6] Santosh S. Vempala. The Random Projection Method, 2005, DIMACS Series in Discrete Mathematics and Theoretical Computer Science.

[7] Sariel Har-Peled et al. High-Dimensional Shape Fitting in Linear Time, 2003, SCG '03.

[8] Dimitris Achlioptas et al. Fast computation of low rank matrix approximations, 2001, STOC '01.

[9] Dimitrios Gunopulos et al. Automatic subspace clustering of high dimensional data for data mining applications, 1998, SIGMOD '98.

[10] Sariel Har-Peled. How to get close to the median shape, 2007, Comput. Geom.

[11] Pankaj K. Agarwal et al. Approximation Algorithms for k-Line Center, 2002, ESA.

[12] Petros Drineas et al. Fast Monte Carlo Algorithms for Matrices II: Computing a Low-Rank Approximation to a Matrix, 2004.

[13] Santosh S. Vempala et al. Matrix approximation and projective clustering via volume sampling, 2006, SODA '06.

[14] Sariel Har-Peled et al. Projective clustering in high dimensions using core-sets, 2002, SCG '02.

[15] Tamás Sarlós. Improved Approximation Algorithms for Large Matrices via Random Projections, 2006, FOCS '06.

[16] Madhu Sudan et al. Hardness of approximating the minimum distance of a linear code, 1999, IEEE Trans. Inf. Theory.

[17] Alan M. Frieze et al. Clustering Large Graphs via the Singular Value Decomposition, 2004, Machine Learning.

[18] Nabil H. Mustafa et al. k-means projective clustering, 2004, PODS.