Online learning and generalization of parts-based image representations by non-negative sparse autoencoders

We present an efficient online learning scheme for non-negative sparse coding in autoencoder neural networks. It combines a novel synaptic decay rule that ensures non-negative weights with an intrinsic self-adaptation rule that optimizes the sparseness of the non-negative encoding. We show that non-negativity constrains the solution space such that overfitting is prevented and very similar encodings are found regardless of network initialization and size. We benchmark the method on real-world datasets of handwritten digits and faces. The autoencoder yields sparser encodings and lower reconstruction errors than related offline algorithms based on matrix factorization. It generalizes to new inputs both accurately and without costly computation, unlike classical matrix factorization approaches, which must solve a new optimization problem for each input.
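The two ingredients described above can be illustrated with a minimal online training step: a gradient update on the reconstruction error with the weights kept non-negative, plus an intrinsic plasticity rule (in the style of Triesch) that adapts the slope and bias of each logistic hidden neuron toward a sparse, exponential-like activity distribution. This is a hedged sketch, not the paper's exact algorithm: the dimensions, learning rates, the simplified tied-weight gradient, and the use of plain clipping in place of the paper's asymmetric synaptic decay rule are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions and hyperparameters (not taken from the paper).
n_in, n_hidden = 64, 16
eta_w, eta_ip, mu = 0.01, 0.001, 0.2  # weight / intrinsic rates; mu = target mean activity

W = np.abs(rng.normal(0.0, 0.05, (n_hidden, n_in)))  # non-negative initialization
a = np.ones(n_hidden)   # slopes of the parameterized logistic activations
b = np.zeros(n_hidden)  # biases of the parameterized logistic activations

def train_step(x):
    """One online update on a single input pattern x (non-negative)."""
    global W, a, b
    # Encode with parameterized logistic neurons.
    s = W @ x
    h = 1.0 / (1.0 + np.exp(-(a * s + b)))
    # Decode with tied weights; linear output layer.
    x_hat = W.T @ h
    e = x_hat - x
    # Simplified online gradient step on the squared reconstruction error
    # (the dependence of h on W is ignored here for brevity).
    W -= eta_w * np.outer(h, e)
    # Keep weights non-negative: simple clipping as a stand-in for the
    # paper's asymmetric synaptic decay rule.
    np.maximum(W, 0.0, out=W)
    # Intrinsic plasticity: drive hidden activities toward a sparse
    # exponential distribution with mean mu (Triesch-style gradient rule).
    t = 1.0 - (2.0 + 1.0 / mu) * h + (h * h) / mu
    b += eta_ip * t
    a += eta_ip * (1.0 / a + s * t)
    return x_hat
```

Running `train_step` repeatedly on a stream of inputs keeps the weight matrix non-negative by construction while the intrinsic rule pushes the mean hidden activity toward `mu`, which is what makes the encoding both parts-based and sparse in this setup.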
