Stochastic Approximation and Memory-Limited Subspace Tracking for Poisson Streaming Data

Poisson count data are ubiquitous in applications such as optical imaging, social networks, and traffic monitoring, where observations are typically modeled by a Poisson distribution and arrive in a streaming fashion. This calls for techniques that efficiently extract and track the useful information embedded in such streams. We consider the problem of recovering and tracking the underlying Poisson rate, where the rate vectors are assumed to lie in an unknown low-dimensional subspace, from streaming Poisson data with possibly missing entries. Recovery of the underlying subspace is posed as an expected loss minimization problem under nonnegativity constraints, where the loss function is a penalized Poisson log-likelihood. We propose a stochastic approximation (SA) algorithm that can be implemented in an online manner. Two theoretical results are established regarding its convergence: the SA algorithm is guaranteed to converge almost surely to the same point as the original expected loss minimization problem, and the estimate converges to a local minimum. To further reduce the memory requirement and handle missing data, the SA algorithm is modified by lower bounding the log-likelihood function with a decomposable surrogate, yielding a memory-limited implementation that does not store historical data. Numerical experiments demonstrate the superior performance of the proposed algorithms compared with existing methods, and the memory-limited SA algorithm is shown empirically to match the performance of the original SA algorithm at a much lower memory cost.
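To make the setup concrete, the following is a minimal sketch (not the paper's exact algorithm) of online stochastic-approximation updates for a nonnegative subspace under a Poisson log-likelihood loss: each streamed count vector `y` is fit with nonnegative coefficients `v` by projected gradient, and the subspace matrix `U` then takes one projected stochastic gradient step with a diminishing step size. All names, dimensions, and step sizes here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions: ambient dimension p, subspace rank r.
p, r = 20, 3

# Ground-truth nonnegative subspace, used only to simulate the Poisson stream.
U_true = rng.uniform(0.5, 1.5, size=(p, r))

def neg_loglik(rate, y):
    """Poisson negative log-likelihood (dropping the constant log(y!) term)."""
    return float(np.sum(rate - y * np.log(rate + 1e-12)))

def fit_coeff(U, y, n_steps=50, lr=0.01):
    """Nonnegative coefficients v for one sample via projected gradient descent."""
    v = np.full(U.shape[1], 1.0)
    for _ in range(n_steps):
        rate = U @ v
        grad = U.T @ (1.0 - y / (rate + 1e-12))   # gradient of neg log-lik in v
        v = np.maximum(v - lr * grad, 1e-8)       # project onto nonneg orthant
    return v

# SA loop: one projected stochastic gradient step on U per streamed sample.
U = rng.uniform(0.5, 1.5, size=(p, r))
losses = []
for t in range(300):
    v_true = rng.uniform(0.5, 1.5, size=r)
    y = rng.poisson(U_true @ v_true).astype(float)  # streamed Poisson counts
    v = fit_coeff(U, y)
    rate = U @ v
    losses.append(neg_loglik(rate, y))
    grad_U = np.outer(1.0 - y / (rate + 1e-12), v)  # gradient of neg log-lik in U
    step = 0.5 / np.sqrt(t + 1)                     # diminishing SA step size
    U = np.maximum(U - step * grad_U, 1e-8)         # keep U nonnegative

print(U.shape, np.all(U > 0), np.isfinite(losses).all())
```

Because each update touches only the current sample, the loop uses constant memory in the stream length, which is the regime the memory-limited variant targets; handling missing entries would additionally mask the gradient to the observed coordinates.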
