A Distributed and Secure Algorithm for Computing Dominant SVD Based on Projection Splitting

In this paper, we propose and study a distributed and secure algorithm for computing dominant (or truncated) singular value decompositions (SVDs) of large, distributed data matrices. We consider the scenario where each node privately holds a subset of columns and exchanges only "safe" information with other nodes in a collaborative effort to calculate a dominant SVD of the whole matrix. Within the framework of the alternating direction method of multipliers (ADMM), we propose a novel formulation that builds consensus by equalizing the subspaces spanned by the splitting variables instead of equalizing the variables themselves. This technique greatly relaxes feasibility restrictions and significantly accelerates convergence, while still yielding simple subproblems. We design several algorithmic features, including a low-rank multiplier formula and mechanisms for controlling the accuracy of subproblem solutions, to increase the algorithm's computational efficiency and reduce its communication overhead. More importantly, unlike most existing distributed or parallelized algorithms, our algorithm preserves the privacy of locally held data; that is, no node can recover the data stored at another node from the information exchanged during communication. We present convergence analysis results, including a worst-case complexity estimate, and extensive experimental results indicating that the proposed algorithm, while safeguarding data privacy, has strong potential to deliver cutting-edge performance, especially when communication costs are relatively high.
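The difference between equalizing variables and equalizing the subspaces they span can be illustrated with a small NumPy sketch. This is not the ADMM algorithm proposed in the paper and it makes no privacy claim; it substitutes a plain subspace iteration over column blocks, and every name in it (`local_product`, `subspace_iteration`, `projector`, the synthetic matrix) is a hypothetical chosen for illustration. It shows two things under those assumptions: the dominant left singular subspace can be assembled from per-node products, and two bases of the same subspace can differ as matrices while their projectors X Xᵀ agree.

```python
import numpy as np

def local_product(A_i, X):
    # Contribution of one node: A_i A_i^T X, formed from its own column block only.
    return A_i @ (A_i.T @ X)

def subspace_iteration(blocks, k, iters=200, seed=0):
    # Plain (non-ADMM) subspace iteration on A A^T = sum_i A_i A_i^T.
    rng = np.random.default_rng(seed)
    n = blocks[0].shape[0]
    X, _ = np.linalg.qr(rng.standard_normal((n, k)))
    for _ in range(iters):
        Y = sum(local_product(A_i, X) for A_i in blocks)  # aggregate node products
        X, _ = np.linalg.qr(Y)                            # re-orthonormalize
    return X

def projector(X):
    # Orthogonal projector onto span(X) for X with orthonormal columns.
    return X @ X.T

# Synthetic 30 x 40 matrix with a clear gap after the third singular value.
rng = np.random.default_rng(1)
n, m, k = 30, 40, 3
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
V, _ = np.linalg.qr(rng.standard_normal((m, n)))
s = np.concatenate([[10.0, 9.0, 8.0], np.linspace(1.0, 0.1, n - 3)])
A = U @ np.diag(s) @ V.T

# Each of 4 hypothetical nodes holds a subset of the columns.
blocks = np.array_split(A, 4, axis=1)
X = subspace_iteration(blocks, k)

# Compare against a centralized SVD through projectors, not through the bases.
U_ref = np.linalg.svd(A, full_matrices=False)[0][:, :k]
subspace_gap = np.linalg.norm(projector(X) - projector(U_ref))

# Rotating the basis changes the variable but not the subspace it spans.
Q, _ = np.linalg.qr(rng.standard_normal((k, k)))
X_rot = X @ Q
variable_gap = np.linalg.norm(X - X_rot)
projector_gap = np.linalg.norm(projector(X) - projector(X_rot))
```

Here `variable_gap` is far from zero while `projector_gap` is at rounding level, which is why a consensus constraint on projectors admits many more feasible points than a constraint forcing the local variables themselves to be equal.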
