Accelerated Sparse Bayesian Learning via Screening Test and Its Applications

In high-dimensional settings, sparse structures are critical for efficiency in terms of memory and computational complexity. For a linear system, directly finding the sparsest solution given an over-complete dictionary of features is typically NP-hard, so approximate methods must be considered instead. In this paper, the approximate method we choose is sparse Bayesian learning, which, as an empirical Bayesian approach, uses a parameterized prior to encourage sparsity in the solution, in contrast to methods with fixed priors such as LASSO. A screening test, in turn, aims to quickly identify a subset of features whose coefficients are guaranteed to be zero in the optimal solution; these features can then be safely removed from the complete dictionary to obtain a smaller, more easily solved problem. We then solve the smaller problem and recover the solution of the original problem by padding the smaller solution with zeros. The performance of the proposed method is examined on various data sets and applications.
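
The screen, solve, and pad steps described above can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does not specify the screening rule or the sparse Bayesian learning updates, so the `keep` mask is assumed to come from some screening test, and a plain least-squares routine stands in for the solver. The names `solve_with_screening` and `least_squares` are hypothetical and used only for illustration.

```python
import numpy as np

def solve_with_screening(A, y, keep, solver):
    """Solve the reduced problem on the kept columns, then pad with zeros.

    A      : (m, n) dictionary of features
    y      : (m,) observation vector
    keep   : boolean mask of length n; False marks features that a
             screening test (assumed, not implemented here) has rejected
    solver : callable (A_reduced, y) -> coefficients for the kept columns
    """
    keep = np.asarray(keep, dtype=bool)
    x_reduced = solver(A[:, keep], y)      # smaller, cheaper problem
    x_full = np.zeros(A.shape[1], dtype=x_reduced.dtype)
    x_full[keep] = x_reduced               # rejected features stay at zero
    return x_full

def least_squares(A, y):
    # Stand-in solver (assumption); a sparse Bayesian learning routine
    # would take its place in the method the abstract describes.
    return np.linalg.lstsq(A, y, rcond=None)[0]

# Toy data: 8 features, only features 1 and 4 are active.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 8))
x_true = np.zeros(8)
x_true[[1, 4]] = [2.0, -3.0]
y = A @ x_true

# Hypothetical screening outcome: features 1, 4, and 6 survive.
keep = np.zeros(8, dtype=bool)
keep[[1, 4, 6]] = True

x_hat = solve_with_screening(A, y, keep, least_squares)
```

Passing the solver as an argument keeps the padding logic independent of how the reduced problem is solved, so the same wrapper would apply to any solver that accepts a dictionary and an observation vector.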
