On degeneracy control in overcomplete ICA

Understanding the effects of degeneracy control mechanisms when learning overcomplete representations is crucial for applying Independent Components Analysis (ICA) in machine learning and theoretical neuroscience. Several approaches to degeneracy control have been proposed that can learn non-degenerate complete representations; however, some of these methods can fall into bad local minima when extended to overcomplete ICA. Furthermore, they may have unintended side effects on the distribution of learned basis elements, which may lead to a biased exploration of the data manifold. In this work, we identify and theoretically analyze the cause of these failures and propose a framework that can be used to evaluate arbitrary degeneracy control mechanisms. We evaluate different methods for degeneracy control in overcomplete ICA and suggest two novel approaches, one of which can learn highly orthonormal bases. Finally, we compare all methods on the task of estimating an overcomplete basis on natural images.
