Paper information - Stochastic Approximation for Online Tensorial Independent Component Analysis


Independent component analysis (ICA) has been a popular dimension reduction tool in statistical machine learning and signal processing. In this paper, we present a convergence analysis for an online tensorial ICA algorithm, by viewing the problem as a nonconvex stochastic approximation problem. For estimating one component, we provide a dynamics-based analysis to prove that our online tensorial ICA algorithm with a specific choice of stepsize achieves a sharp finite-sample error bound. In particular, under a mild assumption on the data-generating distribution and a scaling condition such that d/T is sufficiently small up to a polylogarithmic factor of data dimension d and sample size T, a sharp finite-sample error bound of Õ(√(d/T)) can be obtained.
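To make the one-component setting concrete, here is a minimal sketch of an online update of this kind. The abstract does not spell out the update rule, so everything below is an assumption for illustration: the data are taken to be already whitened (orthogonal mixing matrix), the sources are Rademacher (kurtosis below 3, hence `kurtosis_sign=-1.0`), the update is a projected stochastic gradient step on the fourth-moment contrast, and the stepsize is a fixed constant rather than the specific choice analyzed in the paper. The names `make_rademacher_data` and `online_tensorial_ica` are hypothetical.

```python
import numpy as np


def make_rademacher_data(d, T, seed):
    """Hypothetical data model: x_t = A z_t with A orthogonal and z_t in {-1, +1}^d."""
    rng = np.random.default_rng(seed)
    # Random orthogonal mixing matrix from the QR factorization of a Gaussian matrix.
    A, _ = np.linalg.qr(rng.standard_normal((d, d)))
    Z = rng.choice([-1.0, 1.0], size=(T, d))
    X = Z @ A.T  # row t is the observation x_t = A z_t
    return A, X


def online_tensorial_ica(samples, eta, u0, kurtosis_sign=-1.0):
    """One pass of projected stochastic gradient steps on the unit sphere.

    Each sample x moves u along kurtosis_sign * (u.x)^3 * x, which is (up to a
    constant) the stochastic gradient of kurtosis_sign * E[(u.x)^4], and u is
    then renormalized to unit length.
    """
    u = u0 / np.linalg.norm(u0)
    for x in samples:
        u = u + eta * kurtosis_sign * (u @ x) ** 3 * x
        u = u / np.linalg.norm(u)
    return u


def run_demo(d=5, T=20000, eta=0.01, seed=0):
    """Run the sketch from a random start and report overlaps with the columns of A."""
    A, X = make_rademacher_data(d, T, seed)
    u0 = np.random.default_rng(seed + 1).standard_normal(d)
    u = online_tensorial_ica(X, eta, u0)
    overlaps = np.abs(A.T @ u)  # |<a_i, u>| for each independent component a_i
    return u, overlaps


if __name__ == "__main__":
    u, overlaps = run_demo()
    print("overlap with each component:", np.round(overlaps, 3))
```

An overlap close to 1 for a single column of `A` indicates that the iterate has aligned with one independent component. With a constant stepsize the final error stays at a noise floor set by `eta`, so this sketch is not expected to reproduce the Õ(√(d/T)) rate, which in the paper depends on the specific stepsize choice.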

Michael I. Jordan | Chris Junchi Li | C. J. Li
