Deep Learning of Part-Based Representation of Data Using Sparse Autoencoders With Nonnegativity Constraints

We demonstrate a new deep learning autoencoder network, trained by a nonnegativity-constrained algorithm (the nonnegativity-constrained autoencoder), that learns features forming a part-based representation of data. The learning algorithm works by penalizing negative weights during training. The algorithm is assessed by its ability to decompose data into parts, and its prediction performance is tested on three standard image data sets and one text data set. The results indicate that the nonnegativity constraint forces the autoencoder to learn features that amount to a part-based representation of the data, while improving sparsity and reconstruction quality compared with the traditional sparse autoencoder and nonnegative matrix factorization. It is also shown that this newly acquired representation improves the prediction performance of a deep neural network.
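To make the penalty concrete, below is a minimal NumPy sketch in the spirit of the method: a one-hidden-layer autoencoder objective combining reconstruction error, a KL-divergence sparsity penalty on the mean hidden activations, and a quadratic penalty applied only to negative weights. The hyperparameter names and values (`rho`, `beta`, `alpha`) are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def ncae_loss(W1, b1, W2, b2, X, rho=0.05, beta=3.0, alpha=0.003):
    """Sketch of a nonnegativity-constrained sparse autoencoder loss.

    X : (n_samples, n_features) input batch
    W1, b1 : encoder weights and biases; W2, b2 : decoder weights and biases
    rho, beta, alpha : illustrative sparsity target and penalty strengths
    """
    H = sigmoid(X @ W1 + b1)           # hidden (encoder) activations
    X_hat = sigmoid(H @ W2 + b2)       # reconstruction of the input

    # average squared reconstruction error
    mse = 0.5 * np.mean(np.sum((X_hat - X) ** 2, axis=1))

    # KL-divergence sparsity penalty on the mean activation of each hidden unit
    rho_hat = np.clip(H.mean(axis=0), 1e-8, 1 - 1e-8)
    kl = np.sum(rho * np.log(rho / rho_hat)
                + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))

    # quadratic penalty on negative weights only; nonnegative weights are free
    neg_penalty = sum(np.sum(np.minimum(W, 0.0) ** 2) for W in (W1, W2))

    return mse + beta * kl + (alpha / 2.0) * neg_penalty
```

Minimizing this objective with any gradient-based optimizer pushes negative weights toward zero without hard-clipping them, which is the mechanism that yields the nonnegative, part-like features described above.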
